Sorting words by their frequency in Python (answers)

【Question title】: Sorting words by their freq in python
【Posted】: 2013-03-31 22:02:37
【Question】:

import sys
def candidateWord():
   filePath = "sample.txt"
   file = open(filePath,'r')
   word_count = {}
   for line in sys.stdin.readlines():
         for word in line.split():
            #words = word.lower()
            words = word.strip('!,.?1234567890-=@#$%^&*()_+').lower()
            word_count[words] = word_count.get(words,0) + 1

         for key in word_count.keys():
            #sorted(word, key = str,lower)
            print (str(key)+' '+str(word_count[key]))

candidateWord()

How can I sort the words in the text file by frequency, building on the code I already have?

The text file (sample.txt) contains the following: How are you How are you I am good. HBHJKOLDSA How

My desired output is:

how 3
am 2
are 2
i 2
you 2
good 1
hbhjkoldsa 1

I am using Python 3.

【Question comments】:

Tags: python list sorting python-3.x

【Solution 1】:

Use collections.Counter:

from collections import Counter

with open("sample.txt", 'r') as f:
    text = f.read()

words = [w.strip('!,.?1234567890-=@#$%^&*()_+') for w in text.lower().split()]

counter = Counter(words)

print(counter.most_common())
# On Python 3.7+, where words with equal counts keep first-seen order:
# [('how', 3), ('are', 2), ('you', 2), ('i', 1), ('am', 1), ('good', 1), ('hbhjkoldsa', 1)]

To print it in the format you want:

print("\n".join("{} {}".format(*p) for p in counter.most_common()))

Using your own code, sorted by (frequency descending, word ascending):

for key, value in sorted(word_count.items(), key=lambda p: (-p[1], p[0])):
    print("{} {}".format(key, value))

The Counter result can be sorted the same way: just replace word_count.items() with counter.most_common().
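
A minimal sketch of that substitution, assuming the sample sentence from the question is held in a string rather than read from sample.txt. The helper name sorted_word_counts and the constant STRIP_CHARS are hypothetical, and skipping tokens that strip down to an empty string is an added assumption, not part of the original answer:

```python
from collections import Counter

# Characters trimmed from both ends of each token, copied from the question.
STRIP_CHARS = '!,.?1234567890-=@#$%^&*()_+'


def sorted_word_counts(text):
    """Return (word, count) pairs: highest count first, ties in alphabetical order."""
    words = (w.strip(STRIP_CHARS) for w in text.lower().split())
    # Assumption: drop tokens that consisted only of punctuation or digits.
    counter = Counter(w for w in words if w)
    return sorted(counter.most_common(), key=lambda p: (-p[1], p[0]))


if __name__ == "__main__":
    sample = "How are you How are you I am good. HBHJKOLDSA How"
    for word, count in sorted_word_counts(sample):
        print("{} {}".format(word, count))
```

Negating the count in the sort key gives descending frequency while leaving the word itself in ascending order, so no second sorting pass is needed.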

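【Comments】:

Is it possible to use collections.Counter with a file rather than just text?
Sure. Why do you open a file and then read from stdin?
Here, I replaced text with the contents of the file.
So if I replace everything I have with yours, I'm done?
Yes, but if this is an assignment, you should probably implement your own Counter-like procedure?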
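
For the assignment case raised in the last comment, a Counter-like procedure can be built on a plain dict, much as the question's code already does. This is only a sketch: the names count_words and by_frequency are hypothetical, the input is a list of lines rather than sys.stdin, and ignoring tokens that strip down to nothing is an added assumption:

```python
# Characters trimmed from both ends of each token, copied from the question.
STRIP_CHARS = '!,.?1234567890-=@#$%^&*()_+'


def count_words(lines):
    """Count words in an iterable of lines using a plain dict."""
    word_count = {}
    for line in lines:
        for token in line.split():
            word = token.strip(STRIP_CHARS).lower()
            if word:  # assumption: ignore tokens that strip down to nothing
                word_count[word] = word_count.get(word, 0) + 1
    return word_count


def by_frequency(word_count):
    """Return (word, count) pairs: highest count first, ties in alphabetical order."""
    return sorted(word_count.items(), key=lambda p: (-p[1], p[0]))


if __name__ == "__main__":
    lines = ["How are you How are you", "I am good. HBHJKOLDSA How"]
    # Counting finishes before anything is printed, so each word appears once.
    for word, count in by_frequency(count_words(lines)):
        print("{} {}".format(word, count))
```

Because iterating over an open file yields its lines, count_words can also be given the file object from `with open("sample.txt") as f:` directly, which avoids opening a file and then reading from stdin.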