Dictionary returns only last key value pairs inside for loop

【Question title】: Dictionary returns only last key value pairs inside for loop
【Posted】: 2019-07-10 17:58:20
【Question description】:

I have a list of strings:

A = [
    'philadelphia court excessive disappointed court hope hope',
    'hope hope jurisdiction obscures acquittal court',
    'mention hope maryland signal held mention problem internal reform life bolster level grievance'
    ]

And another list:

B = ['court', 'hope', 'mention', 'life', 'bolster', 'internal', 'level']

I want to create a dictionary based on how many times each word from list B occurs in each string of list A. Something like:

C = [
        {'court':2,'hope':2,'mention':0,'life':0,'bolster':0,'internal':0,'level':0},
        {'court':1,'hope':2,'mention':0,'life':0,'bolster':0,'internal':0,'level':0},
        {'court':0,'hope':1,'mention':2,'life':1,'bolster':1,'internal':1,'level':1}
    ]

What I tried:

dic={}
for i in A:
    t=i.split()
    for j in B:
        dic[j]=t.count(j)

However, it returns only the dictionary for the last string:

print (dic)

{'court': 0,
 'hope': 1,
 'mention': 2,
 'life': 1,
 'bolster': 1,
 'internal': 1,
 'level': 1}

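【Comments】:

"I want to create a dictionary" is not actually accurate; you are trying to create a list of dictionaries. So you need to append each dictionary to a list. Also pay attention to where dic is initialized. Please check my answer.
You could improve your code slightly by using collections.Counter rather than counting things explicitly yourself.

Tags: python python-3.x list dictionary

【Solution 1】:

Rather than creating a list of dictionaries as in your example output, you are creating only a single dictionary (and overwriting the word counts each time you check a phrase). You could use re.findall to count the occurrences of each word in each phrase. This has the benefit of not failing if any of your phrases contain words followed by punctuation, such as "hope?".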

import re

words = ['court', 'hope', 'mention', 'life', 'bolster', 'internal', 'level']
phrases = ['philadelphia court excessive disappointed court hope hope','hope hope jurisdiction obscures acquittal court','mention hope maryland signal held mention problem internal reform life bolster level grievance']

counts = [{w: len(re.findall(r'\b{}\b'.format(w), p)) for w in words} for p in phrases]

print(counts)
# [{'court': 2, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0}, {'court': 1, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0}, {'court': 0, 'hope': 1, 'mention': 2, 'life': 1, 'bolster': 1, 'internal': 1, 'level': 1}]
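To illustrate the punctuation point, here is a small sketch comparing the two counting approaches. The sample phrase is made up for illustration, and re.escape is added only as a precaution in case a word ever contains regex metacharacters (an assumption; the words in the question do not need it).

```python
import re

words = ['court', 'hope']
# Hypothetical phrase with punctuation attached to some words.
phrase = 'hope? the court, we hope.'

# split() keeps punctuation attached, so 'hope?' and 'hope.' never equal 'hope'.
split_counts = {w: phrase.split().count(w) for w in words}

# \b matches at word boundaries, so trailing punctuation does not matter.
regex_counts = {w: len(re.findall(r'\b{}\b'.format(re.escape(w)), phrase))
                for w in words}

print(split_counts)  # {'court': 0, 'hope': 0}
print(regex_counts)  # {'court': 1, 'hope': 2}
```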

【Comments】:

【Solution 2】:

Two problems: you initialize dic in the wrong place, and you are not collecting those dicts in a list. Here is the fix:

C = []
for i in A:
    dic = {}  # start a fresh dictionary for each string
    t = i.split()
    for j in B:
        dic[j] = t.count(j)
    C.append(dic)  # collect this string's counts
# Result:
# [{'court': 2, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0},
#  {'court': 1, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0},
#  {'court': 0, 'hope': 1, 'mention': 2, 'life': 1, 'bolster': 1, 'internal': 1, 'level': 1}]

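【Comments】:

@Learner Sure! Please don't forget to accept the most helpful answer. Thanks!
Yes, I will definitely do that. Just wanted to ask: I have a quite large list of strings, around 100,000 in length. Will the for loops here increase the computation time?
@Learner Since you have to iterate over both lists, every element without exception, I think the double loop is unavoidable.
Yes, I used your code as well. It took a long time to compute the result. Do you have a better approach? It would be really helpful.
@Learner Please look into parallel processing, i.e. multithreading in Python. That way you can break the job into multiple parts that run in parallel and significantly reduce the time.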
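On the timing question in this thread: one thing that may help before reaching for parallelism is to count each string's words once with collections.Counter, instead of calling t.count(j) once per word in B (each call rescans the whole word list). This is a minimal sketch, assuming the same A and B as in the question; the function name count_words is made up for illustration, and whether it is actually faster on your data is something to measure.

```python
from collections import Counter

A = ['philadelphia court excessive disappointed court hope hope',
     'hope hope jurisdiction obscures acquittal court',
     'mention hope maryland signal held mention problem internal reform life bolster level grievance']
B = ['court', 'hope', 'mention', 'life', 'bolster', 'internal', 'level']

def count_words(strings, words):
    """Return one dict per string with the count of each word in `words`."""
    result = []
    for s in strings:
        counts = Counter(s.split())  # single pass over the string's words
        # A Counter returns 0 for missing keys, so no .get() is needed.
        result.append({w: counts[w] for w in words})
    return result

C = count_words(A, B)
print(C)
```

In CPython, threads generally do not speed up CPU-bound pure-Python loops because of the global interpreter lock, so if parallelism is still needed the multiprocessing module is the more usual route.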

【Solution 3】:

Try this:

from collections import Counter

A = ['philadelphia court excessive disappointed court hope hope',
     'hope hope jurisdiction obscures acquittal court',
     'mention hope maryland signal held mention problem internal reform life bolster level grievance']

B = ['court', 'hope', 'mention', 'life', 'bolster', 'internal', 'level']

result = [{b: dict(Counter(i.split())).get(b, 0) for b in B} for i in A]
print(result)

Output:

[{'court': 2, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0}, {'court': 1, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0}, {'court': 0, 'hope': 1, 'mention': 2, 'life': 1, 'bolster': 1, 'internal': 1, 'level': 1}]

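【Comments】:

【Solution 4】:

You always overwrite the existing values in the dictionary dic with dic[j]=t.count(j). You can create a new dict for each i and append it to a list, like this: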

dic=[]
for i in A:
    i_dict = {}
    t=i.split()
    for j in B:
        i_dict[j]=t.count(j)
    dic.append(i_dict)
print(dic)

【Comments】:

【Solution 5】:

To avoid overwriting existing values, check whether the entry is already in the dictionary. Try adding:

if j in dic:
    dic[j] += t.count(j)
else:
    dic[j] = t.count(j)

【Comments】:

I tried something like dic=[] for i in A: i_dict = {} t=i.split() for j in B: if j in t: i_dict[j] += t.count(j) else: i_dict[j] = t.count(j) dic.append(i_dict) and it gives the error KeyError: 'court'
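The KeyError in the comment above most likely comes from i_dict[j] += ... running on a fresh, empty i_dict: the test if j in t checks the word list, not the dictionary, so the key does not exist yet when += tries to read it. Here is a minimal sketch of one way around this, assuming a per-string dictionary is still the goal; dict.get with a default of 0 stands in for the if/else.

```python
A = ['philadelphia court excessive disappointed court hope hope',
     'hope hope jurisdiction obscures acquittal court',
     'mention hope maryland signal held mention problem internal reform life bolster level grievance']
B = ['court', 'hope', 'mention', 'life', 'bolster', 'internal', 'level']

dic = []
for i in A:
    i_dict = {}
    t = i.split()
    for j in B:
        # get(j, 0) supplies 0 when j is not yet a key, so the read never fails.
        i_dict[j] = i_dict.get(j, 0) + t.count(j)
    dic.append(i_dict)

print(dic)
```

Since each key is set only once per string here, the accumulation is not strictly needed; a plain i_dict[j] = t.count(j) gives the same result.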