Counting Letter Frequency in a String (Python) [duplicate]: Answers

【Question title】: Counting Letter Frequency in a String (Python) [duplicate]
【Posted】: 2017-04-20 11:49:03
【Question】:

I am trying to count the number of occurrences of each letter in a word:

word = input("Enter a word")

Alphabet=['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']

for i in range(0,26):
    print(word.count(Alphabet[i]))

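This currently prints the number of occurrences of every letter of the alphabet, including letters that do not occur at all.

How can I list the letters vertically with the frequency next to each one, for example like this?

word="Hello"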

H 1

E 1

L 2

O 1

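【Comments】:

A 30-second search shows that you can use collections.Counter.
This looks like a homework question, so you may want to read these guidelines on how to ask this kind of question on SO. I'll post an answer shortly.

Tags: python frequency-analysis

【Solution 1】: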

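from collections import Counter

word = "Hello"
counts = Counter(word)  # Counter({'l': 2, 'H': 1, 'e': 1, 'o': 1})
# Iterate over the distinct letters so a repeated letter is printed only once.
for letter, count in counts.items():
    print(letter, count)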

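Try using Counter, which creates a dictionary containing the frequency of every item in a collection.

Otherwise, you could keep your current code and print conditionally, only when word.count(Alphabet[i]) is greater than 0, but that would be slower.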
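
The conditional-print variant mentioned above could look roughly like the following sketch. The helper name letter_counts is an assumption made for illustration and is not part of the original answer; it also lowercases the word first so that an input such as "Hello" matches the lowercase alphabet.

```python
import string


def letter_counts(word):
    """Return (letter, count) pairs for the letters that occur in word.

    Hypothetical helper: scans the whole alphabet and keeps non-zero counts.
    """
    lowered = word.lower()
    pairs = []
    for letter in string.ascii_lowercase:
        n = lowered.count(letter)  # one full pass over the word per letter
        if n > 0:
            pairs.append((letter.upper(), n))
    return pairs


for letter, n in letter_counts("Hello"):
    print(letter, n)
```

This scans the word once per letter of the alphabet, which is why it tends to be slower than a single Counter pass. The letters come out in alphabetical order rather than in order of appearance.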

【Comments】:

【Solution 2】:

def char_frequency(str1):
    dict = {}
    for n in str1:
        keys = dict.keys()
        if n in keys:
            dict[n] += 1
        else:
            dict[n] = 1
    return dict
print(char_frequency('google.com'))

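【Comments】:

Thank you for this code snippet, which may provide some limited short-term help. A proper explanation would greatly improve its long-term value by showing why this is a good solution to the problem, and would make it more useful to future readers with similar questions. Please edit your answer to add some explanation, including the assumptions you have made.
It simply iterates over the string and creates a key in the dictionary for each newly seen element, or increments its value by 1 if the element has already been seen.
Using dict.keys() just for the in test is pointless.
Note that you should not name your dictionary dict, because that shadows the built-in dict (the dictionary constructor!)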
ctrl c & ctrl v. from here ---> w3resource.com/python-exercises/string/…
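
Taking the comments above into account, a cleaned-up variant might look like the sketch below. The function name char_frequency comes from the answer; the names text, counts, and ch are chosen here for illustration.

```python
def char_frequency(text):
    """Count how often each character occurs in text."""
    counts = {}  # named 'counts' rather than 'dict' to avoid shadowing the built-in
    for ch in text:
        if ch in counts:  # membership test directly on the dict, no .keys() needed
            counts[ch] += 1
        else:
            counts[ch] = 1
    return counts


print(char_frequency("google.com"))
```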

【Solution 3】:

As Pythonista said, this is a job for collections.Counter:

from collections import Counter
print(Counter('cats on wheels'))

This prints (the order among letters with equal counts may differ by Python version):

Counter({'s': 2, ' ': 2, 'e': 2, 't': 1, 'n': 1, 'l': 1, 'a': 1, 'c': 1, 'w': 1, 'h': 1, 'o': 1})
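
To get from a Counter to the vertical "letter count" listing the question asks for, something like the following sketch could be used. The helper name format_frequencies is hypothetical, and the order of the lines relies on dictionaries preserving insertion order (Python 3.7+), which is an assumption about the reader's interpreter.

```python
from collections import Counter


def format_frequencies(word):
    """Return one 'LETTER count' line per distinct letter in word."""
    # Uppercase each letter so 'H' and 'h' are counted together,
    # and skip anything that is not a letter (spaces, digits, ...).
    counts = Counter(ch.upper() for ch in word if ch.isalpha())
    return [f"{letter} {n}" for letter, n in counts.items()]


for line in format_frequencies("Hello"):
    print(line)
```

For "Hello" this lists H, E, L, and O in order of first appearance, with L counted twice.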

【Comments】:

【Solution 4】:

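s = input()
t = s.lower()  # count case-insensitively

seen = set()  # lowercase characters that have already been reported
for i in range(len(s)):
    if t[i] in seen:
        continue  # skip repeats so each character is printed only once
    seen.add(t[i])
    b = t.count(t[i])
    print("{} -- {}".format(s[i], b))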

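【Comments】:

【Solution 5】:

Following up on what LMc said, your code is already very close to working. You just need to post-process the result set to remove the "uninteresting" output. Here is one way to make your code work: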

#!/usr/bin/env python3
word = input("Enter a word: ")

Alphabet = [
    'a','b','c','d','e','f','g','h','i','j','k','l','m',
    'n','o','p','q','r','s','t','u','v','w','x','y','z'
]

# Keep only the letters that actually occur in the word.
hits = [
    (letter, word.count(letter))
    for letter in Alphabet
    if word.count(letter)
]

for letter, frequency in hits:
    print(letter.upper(), frequency)

That said, the solutions using collections.Counter are more elegant/Pythonic.

【Comments】:

【Solution 6】:

A simple solution that needs no libraries:

string = input()
f = {}
for i in string:
  f[i] = f.get(i,0) + 1
print(f)

Here is a link about get(): https://docs.quantifiedcode.com/python-anti-patterns/correctness/not_using_get_to_return_a_default_value_from_a_dictionary.html
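
As a small illustration of why get() works in the loop above, here is a sketch of its behaviour for a present and a missing key. The dictionary contents are made up for the example.

```python
f = {"a": 2}  # illustrative starting counts

present = f.get("a", 0)  # key exists: the stored value is returned
missing = f.get("b", 0)  # key absent: the default is returned, no KeyError

print(present, missing)

# get() only reads; the missing key is not added to the dictionary.
print("b" in f)
```

Because get() never raises for a missing key, f[i] = f.get(i, 0) + 1 handles both the first and every later occurrence of a character with one line.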

【Comments】:

【Solution 7】:

If you want to avoid libraries and built-in helper functions, the following code may help:

s = "aaabbc"  # Sample string
dict_counter = {}  # Empty dict for holding characters
                   # as keys and count as values
for char in s:  # Traversing the whole string
                # character by character
    if not dict_counter or char not in dict_counter.keys(): # Checking whether the dict is
                                                            # empty or contains the character
        dict_counter.update({char: 1}) # If not then adding the
                                       # character to dict with count = 1
    elif char in dict_counter.keys(): # If the character is already
                                      # in the dict then update count
        dict_counter[char] += 1
for key, val in dict_counter.items(): # Looping over each key and
                                      # value pair for printing
    print(key, val)

Output:

a 3
b 2
c 1

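【Comments】:

【Solution 8】:

For future reference: when you have a list containing all the words you want, say wordlist, printing the ones that start with 'a' is quite simple: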

for numbers in range(len(wordlist)):
    if wordlist[numbers][0] == 'a':
        print(wordlist[numbers])

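【Comments】:

【Solution 9】:

Another approach is to drop the duplicate characters and iterate only over the unique ones (using set()), then count the occurrences of each unique character (using str.count()):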

def char_count(string):
    freq = {}
    for char in set(string):
        freq[char] = string.count(char)
    return freq


if __name__ == "__main__":
    s = "HelloWorldHello"
    print(char_count(s))
    # Output: {'e': 2, 'o': 3, 'W': 1, 'r': 1, 'd': 1, 'l': 5, 'H': 2}

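【Comments】:

【Solution 10】:

It may make sense to include every letter of the alphabet. For example, if you are interested in computing the cosine distance between the letter distributions of words, you usually need all the letters.

You can use this function: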

import string
from collections import Counter

def character_distribution_of_string(pass_string):
    # Count every character once, then report each lowercase letter (0 if absent).
    chars_in_string = Counter(pass_string)
    return {letter: chars_in_string[letter] for letter in string.ascii_lowercase}

Usage:

character_distribution_of_string("This is a string that I want to know about")

Full character distribution:

{'a': 4,
 'b': 1,
 'c': 0,
 'd': 0,
 'e': 0,
 'f': 0,
 'g': 1,
 'h': 2,
 'i': 3,
 'j': 0,
 'k': 1,
 'l': 0,
 'm': 0,
 'n': 3,
 'o': 3,
 'p': 0,
 'q': 0,
 'r': 1,
 's': 3,
 't': 6,
 'u': 1,
 'v': 0,
 'w': 2,
 'x': 0,
 'y': 0,
 'z': 0}

You can easily extract the character vector:

list(character_distribution_of_string("This is a string that I want to know about").values())

which gives:

[4, 1, 0, 0, 0, 0, 1, 2, 3, 0, 1, 0, 0, 3, 3, 0, 0, 1, 3, 6, 1, 0, 2, 0, 0, 0]
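
As a rough sketch of the use case mentioned at the start of this answer, two such 26-element vectors could be compared with cosine similarity. The helper names letter_vector and cosine_similarity are assumptions for illustration and not part of the original answer; returning 0.0 for an all-zero vector is a convention chosen here.

```python
import math
import string
from collections import Counter


def letter_vector(text):
    """Return counts of 'a'..'z' in text (case-sensitive, like the answer's function)."""
    counts = Counter(text)
    return [counts[letter] for letter in string.ascii_lowercase]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


v1 = letter_vector("this is a string")
v2 = letter_vector("this is another string")
print(cosine_similarity(v1, v2))
```

Keeping the zero entries matters here: both vectors need the same length and the same letter at each position, otherwise the element-wise products would pair up unrelated letters.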

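【Comments】:

【Solution 11】:

Initialize an empty dictionary and iterate over each character of the word. If the current character is already in the dictionary, increment its value by 1; otherwise, set its value to 1.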

word="Hello"
characters={}
for character in word:
    if character in characters:
        characters[character] += 1
    else:
        characters[character] =  1
print(characters)

【Comments】:

【Solution 12】:

import string
word = input("Enter a word:  ")
word = word.lower()

Alphabet=list(string.ascii_lowercase)
res = []

for i in range(0,26): 
    res.append(word.count(Alphabet[i]))

for i in range (0,26):
    if res[i] != 0:
        print(str(Alphabet[i].upper()) + " " + str(res[i]))

【Comments】:

【Solution 13】:

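def string(n):
    a = list()
    n = n.replace(" ", "")
    for i in n:
        c = n.count(i)
        a.append(i)
        a.append(c)
    # Pair up the flat [char, count, char, count, ...] list into a dict.
    y = dict(zip(*[iter(a)] * 2))
    print(y)

string("Lets hope for better life")
# Output: {'L': 1, 'e': 5, 't': 3, 's': 1, 'h': 1, 'o': 2, 'p': 1, 'f': 2, 'r': 2, 'b': 1, 'l': 1, 'i': 1}
(Notice that the output contains two L's, one uppercase and one lowercase. If you want them counted together, see the code below.)

The resulting dictionary lists each character only once, with spaces removed. If you want to count uppercase and lowercase together: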

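def string(n):
    n = n.lower()  # n.upper() would work just as well
    a = list()
    n = n.replace(" ", "")
    for i in n:
        c = n.count(i)
        a.append(i)
        a.append(c)
    # Pair up the flat [char, count, char, count, ...] list into a dict.
    y = dict(zip(*[iter(a)] * 2))
    print(y)

string("Lets hope for better life")
# Output: {'l': 2, 'e': 5, 't': 3, 's': 1, 'h': 1, 'o': 2, 'p': 1, 'f': 2, 'r': 2, 'b': 1, 'i': 1}

【Comments】: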