Context of the most frequent words in a corpus in Python: answers

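【Question title】: Context most frequent words corpus in python
【Posted】: 2015-06-15 01:07:05
【Question】:

After using the following def to find the 10 most frequent words in a corpus (using Python), I have to compare the contexts of these 10 words across different subcategories of that corpus.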

def meest_freq(mycorpus):
    # Python 2 code. stopList is assumed to be defined elsewhere as a collection of stop words.
    import string
    from collections import defaultdict
    woorden = mycorpus.words()
    # Lowercase, drop stop words, then strip punctuation characters
    zonderhoofdletters = [word.lower() for word in woorden]
    filtered = [word for word in zonderhoofdletters if word not in stopList]
    no_punct = [s.translate(None, string.punctuation) for s in filtered]
    # Count occurrences of each word
    D = defaultdict(int)
    for word in no_punct:
        D[word] += 1
    popular_words = sorted(D, key=D.get, reverse=True)
    # Indexing starts at 0, so the ten most frequent words are popular_words[0:10]
    top10 = popular_words[:10]
    print "De 10 meest frequente woorden zijn: ", ", ".join(top10[:9]), "en", top10[9]
    return popular_words

I want to use the following code:

def context(cat):
    words = popular_words[:10]
    context = words.concordance()
    print context

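Unfortunately, I keep getting "AttributeError: 'str' object has no attribute 'concordance'". Does anyone know why I can't use the result of my first block of code in the second def? I thought that by using a return statement it should work.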

【Comments】:

You aren't actually capturing the return value of the function. You have to use words = meest_freq(yourcorpus)[:10]
It's from nltk. We saw it in class.

Tags: python nltk

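【Solution 1】:

Does anyone know why I can't use the result of my first block of code in the second def? I thought that by using a return statement it should work.

Because functions don't return variables, they return values.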

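The popular_words you use in context does not come from meest_freq; it comes from a global variable defined somewhere else. Inside meest_freq, popular_words is local. This follows from the rule: if you assign to a name inside a function, it is a local name, unless you say otherwise with a global statement. In context there is no assignment to popular_words, so Python looks for a global with that name. That global contains something you don't expect it to contain, probably because you were testing the functions in the interpreter (maybe it was left over from testing and fixing an earlier version of the function...).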
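The scoping rule can be reproduced in a few lines. This is only a minimal sketch in Python 3 syntax (the question itself uses Python 2); the names meest_freq_demo and context_demo and the "leftover" string are made up for illustration and stand in for whatever stale global happened to exist in the asker's session.

```python
# A stale global, e.g. left over from an earlier interpreter session (hypothetical value)
popular_words = "leftover"

def meest_freq_demo(words):
    # Assigning to popular_words inside the function creates a *local* name;
    # the global of the same name is not touched.
    popular_words = sorted(set(words))
    return popular_words

def context_demo():
    # No assignment to popular_words here, so Python looks up the global.
    return popular_words[:10]

result = meest_freq_demo(["b", "a", "b"])
print(result)            # the returned value, captured by the caller
print(context_demo())    # still the stale global string
print(hasattr(context_demo(), "concordance"))  # a str has no concordance attribute
```

Slicing a string yields a string, which would be consistent with the 'str' object in the reported AttributeError.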

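Please don't try to use globals for this. You have already correctly learned that the way to get information out of a function is through its return value. The counterpart is that the way to get information into a function is to pass it in as a parameter. In the same way that meest_freq knows about the corpus (because you passed it in as mycorpus), context should know about the popular words.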

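Somewhere you must have code that calls both of these functions. That code should take the value returned from meest_freq and pass it to context, in the same way that it passes the corpus to meest_freq.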
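A rough sketch of that calling code, in Python 3 syntax, might look as follows. The helper names most_frequent and context, their parameters, and the sample sentence are assumptions for illustration only; in particular, the body of context here just collects token positions and is not NLTK's concordance.

```python
from collections import Counter

def most_frequent(words, n=10):
    """Return the n most frequent words, most frequent first."""
    return [word for word, _ in Counter(words).most_common(n)]

def context(words, popular_words):
    """Map each popular word to the token positions where it occurs."""
    return {
        w: [i for i, tok in enumerate(words) if tok == w]
        for w in popular_words
    }

tokens = "de kat zit op de mat en de hond zit".split()

top = most_frequent(tokens, n=2)   # the return value is captured by the caller...
positions = context(tokens, top)   # ...and passed on as an argument

print(top)
print(positions)
```

Neither function relies on a global: everything each one needs arrives through its parameters, and everything it produces leaves through its return value.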

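Alternatively, if you pass the corpus to context, you could make the call there. Because of your naming, it is hard to tell what the right way to organize things would be; I don't know what cat is supposed to mean, or what context has to do with anything, or what concordance means in this setting.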
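The second arrangement, where context receives the corpus and calls the frequency function itself, could be sketched like this in Python 3 syntax. TinyCorpus is a hypothetical stand-in that only offers a words() method, and the keyword-in-context logic is hand-written for illustration; with NLTK one would presumably wrap the tokens in nltk.Text and call its concordance method instead.

```python
from collections import Counter

class TinyCorpus:
    """Hypothetical stand-in for a corpus reader: only provides words()."""
    def __init__(self, text):
        self._tokens = text.lower().split()

    def words(self):
        return list(self._tokens)

def most_frequent(corpus, n=10):
    """Return the n most frequent words of the corpus."""
    return [w for w, _ in Counter(corpus.words()).most_common(n)]

def context(corpus, n=10, window=1):
    """Map each of the n most frequent words to its keyword-in-context snippets."""
    tokens = corpus.words()
    popular = most_frequent(corpus, n)  # the call happens inside context
    snippets = {}
    for word in popular:
        snippets[word] = [
            " ".join(tokens[max(0, i - window): i + window + 1])
            for i, tok in enumerate(tokens)
            if tok == word
        ]
    return snippets

corpus = TinyCorpus("de kat zit op de mat en de hond zit")
print(context(corpus, n=1))
```

Which of the two arrangements fits better depends on whether other code also needs the list of popular words; if it does, computing it once in the caller and passing it along avoids repeating the count.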
