Faster way to compare dictionaries than using set (answers)

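【Question title】: Faster way to compare dictionaries than using set
【Posted】: 2011-12-27 22:09:50
【Question】:

I have two large dictionaries with unique keys but possibly overlapping values. I want to compare each set of dictionary values against every other and find the number of overlapping values. I have done this with two for loops and set, but I am wondering whether there is a faster or more elegant way to do it.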

dic1 = {'a': ['1', '2', '3'], 'b': ['4', '5', '6'], 'c': ['7', '8', '9']}
dic2 = {'d': ['1', '8', '9'], 'e': ['10', '11', '12'], 'f': ['7', '8', '9']}

final_list = []
for key1 in dic1:
    temp = []
    for key2 in dic2:
        test = set(dic1[key1])
        query = set(dic2[key2])
        x = len(test & query)
        temp.append([key2, x])
    final_list.append([key1, temp])

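【Comments on the question】:

The last line has an error. Did you mean final_list.append([key1, temp])?
Are you really comparing every key in dic1 against every key in dic2? That is what they call O(n^2). It is inherently slow.
@S.Lott: Yes, I am looking for an all-against-all comparison. However, there may be a way to reduce it to all-against-subset, which I could pursue.

Tags: python dictionary set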

【Solution 1】:

You want to "invert" one (or both) of your dictionaries.

from collections import defaultdict

# val1 maps each value to the list of dic1 keys that contain that value.
val1 = defaultdict(list)
for k in dic1:
    for v in dic1[k]:
        val1[v].append(k)

for k in dic2:
    for v in dic2[k]:
        # The list of all keys in dic1 that have this value.
        # .get() avoids adding empty entries to the defaultdict on a miss.
        keys_in_dic1 = val1.get(v, [])
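
To get from the inverted dictionary to the overlap counts the question asks for, something like the following might work. It is only a sketch: the function name `overlap_counts` and the use of `Counter` keyed by `(key1, key2)` pairs are my own choices for illustration, not part of the answer above. Pairs that share no value simply never appear in the result.

```python
from collections import Counter, defaultdict

dic1 = {'a': ['1', '2', '3'], 'b': ['4', '5', '6'], 'c': ['7', '8', '9']}
dic2 = {'d': ['1', '8', '9'], 'e': ['10', '11', '12'], 'f': ['7', '8', '9']}


def overlap_counts(d1, d2):
    """Count shared values for every (key1, key2) pair that overlaps at all.

    Hypothetical helper, assumed for illustration only.
    """
    # Invert d1: value -> set of d1 keys whose list contains that value.
    inverted = defaultdict(set)
    for k1, values in d1.items():
        for v in values:
            inverted[v].add(k1)

    counts = Counter()
    for k2, values in d2.items():
        # De-duplicate so the result matches len(set(...) & set(...)).
        for v in set(values):
            for k1 in inverted.get(v, ()):
                counts[(k1, k2)] += 1
    return counts


counts = overlap_counts(dic1, dic2)
```

Looking up a pair that never overlapped, for example `counts[('b', 'e')]`, gives 0, because a `Counter` returns 0 for missing keys. The work done is proportional to the number of matching (value, key) hits rather than to every key pair, so it should help most when overlaps are sparse.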

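【Comments】:

I like this idea. I think it will work well because my two dictionaries are different sizes. I can "invert" the small one and then loop through the big one.
@zach: Exactly backwards. Invert the big one. Loop through the small one. It will be faster.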
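
For what it is worth, the "invert the big one, loop through the small one" advice could be sketched roughly as below. The helper names `invert` and `overlap_counts_by_size` are hypothetical, and measuring "big" by number of keys is my assumption; whether this ordering is actually faster on your data is something to time rather than take on faith.

```python
from collections import Counter, defaultdict


def invert(d):
    """Map each value to the set of keys whose list contains it."""
    inverted = defaultdict(set)
    for key, values in d.items():
        for v in values:
            inverted[v].add(key)
    return inverted


def overlap_counts_by_size(dic1, dic2):
    """Invert the dictionary with more keys and loop over the other one.

    Result keys are always (key_from_dic1, key_from_dic2), regardless of
    which dictionary was inverted. Hypothetical helper for illustration.
    """
    dic1_is_big = len(dic1) >= len(dic2)
    big, small = (dic1, dic2) if dic1_is_big else (dic2, dic1)
    inverted = invert(big)

    counts = Counter()
    for small_key, values in small.items():
        for v in set(values):  # de-duplicate, as a set intersection would
            for big_key in inverted.get(v, ()):
                if dic1_is_big:
                    pair = (big_key, small_key)
                else:
                    pair = (small_key, big_key)
                counts[pair] += 1
    return counts


dic1 = {'a': ['1', '2', '3'], 'b': ['4', '5', '6'], 'c': ['7', '8', '9']}
dic2 = {'d': ['1', '8', '9'], 'f': ['7', '8', '9']}
counts = overlap_counts_by_size(dic1, dic2)
```

Timing both orderings with `timeit` on a realistic sample would be the way to check the claim for a particular data set.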