Python: Printing elements in the defaultdict based on the order in the OrderedDict

【Question title】: Python: Printing elements in the defaultdict based on the order in the OrderedDict
【Posted】: 2015-12-31 07:49:52
【Question】:

import string
from collections import namedtuple
from collections import defaultdict
from collections import OrderedDict

matrix_col = {'11234':0, '21234':2, '31223':0, '46541':0, '83432':1, '56443':2, '63324':0, '94334':0, '72443':1}
matrix_col = OrderedDict(sorted(matrix_col.items(), key=lambda t: t[0]))

trans = defaultdict(dict)
trans['11234']['46541'] = 2
trans['11234']['21234'] = 1
trans['11234']['31223'] = 2
trans['11234']['83432'] = 1
trans['21234']['31223'] = 2
trans['21234']['46541'] = 1
trans['21234']['72443'] = 1
trans['21234']['83432'] = 1
trans['56443']['72443'] = 1
trans['56443']['83432'] = 1

for u1, v1 in matrix_col.items():
    for u2, v2 in matrix_col.items():
        for w1 in trans.keys():
            for w2, c in trans[u1].items():
                if u1 == str(w1) and u2 == str(w2):
                    print u1, u2, c

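As shown above, I am trying to print the elements of trans (a defaultdict) in the sorted order of the elements of matrix_col (an OrderedDict), but I cannot get it to work. Below is the expected output, which I have been unable to produce: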

11234 11234 0
11234 21234 1
11234 31223 2
11234 46541 2
11234 56443 0
11234 63324 0
11234 72443 0
11234 83432 1
11234 94334 0
21234 11234 0
21234 21234 0
21234 31223 2
21234 46541 1
21234 56443 0
21234 63324 0
21234 72443 1
21234 83432 1
21234 94334 0
31223 11234 0
31223 21234 0
31223 31223 0
31223 46541 0
31223 56443 0
31223 63324 0
31223 72443 0
31223 83432 0
31223 94334 0
...

Any help is appreciated.

【Question comments】:

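Since you are apparently writing Python 2 code, I need to point out: if u1 in trans.keys(): is a terrible line of code. On Python 2, trans.keys() creates a new list of the keys, so on every test you build a largish object and then scan it linearly to find a hit, instead of doing an O(1) membership test against the dict directly. Similarly, most uses of .items() should probably be .iteritems() or .viewitems() so that you iterate directly rather than making lists that copy the contents (plain .keys()/.items() is only useful if you will mutate the dict during iteration).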
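
A minimal sketch of the point made in this comment, written for Python 3 rather than the Python 2 used in the thread (an assumption on my part). In Python 3, keys() and items() return views instead of copied lists, but testing membership against the dict itself is still the most direct form. The names has_row, missing_row and pairs are illustrative only.

```python
from collections import defaultdict

trans = defaultdict(dict)
trans['11234']['46541'] = 2

# Membership test against the dict itself: a hash lookup, no list is built.
has_row = '11234' in trans
# A membership test never triggers the default factory, so nothing is inserted.
missing_row = '31223' in trans

# Iterate items() directly; in Python 3 this is a view, not a copied list.
pairs = [(k, v) for k, v in trans['11234'].items()]

print(has_row, missing_row, pairs)
```

Because `in` does not call the default factory, trans still has only the one key after both tests.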

Tags: python numpy ordereddictionary defaultdict

【Solution 1】:

I was able to solve it. Here it is:

for u1, v1 in matrix_col.items():
    for u2, v2 in matrix_col.items():
        bastim = True
        for w1 in trans.keys():
            for w2, c in trans[u1].items():
                if u1 == str(w1) and u2 == str(w2):
                    print u1, u2, c  
                    bastim = False
        if bastim:
            print u1, u2, 0

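Thanks, everyone.

【Comments】:

But this prints more than the elements of trans. It includes pairs that you never put into trans.
@hpaulj Apparently that is the intended behavior. I didn't get it either.

【Solution 2】:

This iteration works: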

for u1 in matrix_col:
    d = trans[u1]
    # d may be empty dict
    for u2 in matrix_col:
        print u1, u2, d.get(u2, 0)

Look at trans before this iteration:

defaultdict(<type 'dict'>, {
   '21234': {'31223': 2, '46541': 1, '72443': 1, '83432': 1}, 
   '11234': {'21234': 1, '31223': 2, '46541': 2, '83432': 1}, 
   '56443': {'83432': 1, '72443': 1}
  })

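There are entries for '21234', '11234' and '56443'; when the iteration uses any other u1, d is an empty dict {}. d.get takes care of returning a meaningful value (0) when u2 is not present.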
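
For readers on Python 3, here is a hedged, self-contained sketch of the same iteration using the data from the question. It collects the results in a list called rows (a name introduced here for illustration) instead of printing each one with the Python 2 print statement.

```python
from collections import OrderedDict, defaultdict

# Column keys from the question, sorted so the iteration order is deterministic.
matrix_col = OrderedDict(sorted({
    '11234': 0, '21234': 2, '31223': 0, '46541': 0, '83432': 1,
    '56443': 2, '63324': 0, '94334': 0, '72443': 1,
}.items()))

trans = defaultdict(dict)
trans['11234'].update({'46541': 2, '21234': 1, '31223': 2, '83432': 1})
trans['21234'].update({'31223': 2, '46541': 1, '72443': 1, '83432': 1})
trans['56443'].update({'72443': 1, '83432': 1})

rows = []
for u1 in matrix_col:
    d = trans[u1]  # an empty dict when u1 has no row in trans
    for u2 in matrix_col:
        rows.append((u1, u2, d.get(u2, 0)))  # 0 stands in for a missing pair

# Show the first block of nine rows (all pairs starting with '11234').
for row in rows[:9]:
    print(*row)
```

With nine keys in matrix_col this yields 81 rows, and the first nine match the '11234' block of the expected output in the question.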

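A defaultdict adds an entry for every key you reference, but you have to reference it first. Iterating over trans.keys() does not create new keys. Your initial iteration does what you described: print the elements of trans (defaultdict).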
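
The difference between referencing a key and merely looking it up can be checked directly. The following is a small Python 3 sketch; the variable names keys_before, keys_after_get and keys_after_index are made up for this illustration.

```python
from collections import defaultdict

trans = defaultdict(dict)
trans['11234']['46541'] = 2

keys_before = sorted(trans)

# .get() is a plain lookup: it returns None here and inserts nothing.
lookup = trans.get('31223')
keys_after_get = sorted(trans)

# Indexing calls the default factory and stores the new empty dict.
row = trans['31223']
keys_after_index = sorted(trans)

print(keys_before, keys_after_get, keys_after_index)
```

Only the indexing step adds '31223' to trans; the get call leaves the key set unchanged.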

Your boolean bastim handles the same problem: filling in 0 for pairs that are not in trans.

If trans were a defaultdict of defaultdicts, the iteration could be a bit simpler:

def foo():
    # the inner dict defaults to 0
    return defaultdict(int)    
trans = defaultdict(foo)
for u1 in matrix_col:
    d = trans[u1]
    for v1 in matrix_col:
        print u1,v1, d[v1]
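
A hedged Python 3 sketch of the nested-defaultdict variant, using a shortened key list (order) as a stand-in for matrix_col. The names make_row, order, rows and stored_after are illustrative. It also shows a side effect worth knowing about: every pair that is read gets stored with the value 0.

```python
from collections import defaultdict

def make_row():
    # Inner dict: a missing column key reads as 0.
    return defaultdict(int)

trans = defaultdict(make_row)
trans['11234']['21234'] = 1
trans['11234']['31223'] = 2
trans['21234']['31223'] = 2

order = ['11234', '21234', '31223']  # stand-in for the keys of matrix_col
rows = [(u1, v1, trans[u1][v1]) for u1 in order for v1 in order]

# Side effect: reading a pair inserts it, so trans now holds explicit zeros.
stored_after = {k: dict(v) for k, v in trans.items()}

for row in rows:
    print(*row)
```

If the growth of trans matters, the d.get(u2, 0) form avoids storing the zeros, at the cost of one extra method call per pair.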

This gets more interesting if the inner dict collects values in lists:

def foo():
    return defaultdict(list)
trans = defaultdict(foo)

and values are added with append (repeatedly):

trans['11234']['46541'].append(2)
trans['11234']['21234'].append(1)
trans['11234']['31223'].append(2)
trans['11234']['83432'].append(1)

trans['11234']['46541'].append(5)
trans['11234']['21234'].append(3)
trans['11234']['31223'].append(4)

which produces

11234 11234 []
11234 21234 [1, 3]
11234 31223 [2, 4]
11234 46541 [2, 5]
....
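
The list-collecting variant can be put together as one runnable Python 3 sketch. The lambda replaces the helper function foo, and order and first_rows are names introduced here for illustration; only the first four keys are iterated, to mirror the truncated output above.

```python
from collections import defaultdict

# Outer dict of inner dicts whose missing values start as empty lists.
trans = defaultdict(lambda: defaultdict(list))

observations = [
    ('11234', '46541', 2), ('11234', '21234', 1),
    ('11234', '31223', 2), ('11234', '83432', 1),
    ('11234', '46541', 5), ('11234', '21234', 3),
    ('11234', '31223', 4),
]
for src, dst, value in observations:
    trans[src][dst].append(value)  # repeated pairs accumulate in one list

order = ['11234', '21234', '31223', '46541']
first_rows = [(v1, trans['11234'][v1]) for v1 in order]

for v1, values in first_rows:
    print('11234', v1, values)
```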

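【Comments】:

【Solution 3】:

As far as I know, there is no standard way to sort a dictionary by the arbitrary ordering of an OrderedDict, but you can always sort both by the same thing. In this case, a plain default sort is enough.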

for k, sub_dct in sorted(trans.items()):
    for sub_k, v in sorted(sub_dct.items()):
        print k, sub_k, v

I suppose the other way is to iterate over the OrderedDict twice and look things up in the defaultdict.

for k in matrix_col:
    for sub_k in matrix_col:
        v = trans.get(k, {}).get(sub_k, 0)
        print k, sub_k, v

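【Comments】:

Thanks. However, your option #1 does not produce the output I am looking for. It generates output only from the existing elements of trans. Your option #2 gives the following error: TypeError: 'int' object has no attribute '__getitem__'
@JamesDerrick bah, that was because I made a simple mistake. The adjusted code at the bottom should work. However, I don't understand your comment about option #1: if you are trying to do something other than generate output from the existing elements of trans, then your question is unclear.
Sorry if I wasn't clear: there is an OrderedDict, matrix_col, and a defaultdict, trans. I need to iterate over all elements of trans in the order of matrix_col and print them. If a matrix_col element exists in trans, print its value c. If not, print the matrix_col element with a value of 0. Please look at the expected output; it is really self-explanatory. With your code, you are only printing what is in trans. Can you also print the other elements that are in matrix_col but not in trans, with a value of 0? Thanks
@JamesDerrick Ah ha, then yes!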

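【Solution 4】:

I extended your own answer. This code appears to be equivalent and runs about 3 times as fast, though I am not sure whether you are CPU-bound, and it may not work in python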

import operator

items = matrix_col.items()
# For each key of matrix_col, fetch trans[key] and take its items().
trans_items = map(operator.methodcaller('items'), map(trans.__getitem__, matrix_col))
for (u1, v1), trans_u1_items in zip(items, trans_items):
    for u2, v2 in items:
        bastim = True
        for w1 in trans:
            for w2, c in trans_u1_items:
                if u1 == w1 and u2 == w2:
                    print u1, u2, c
                    bastim = False
        if bastim:
            print u1, u2, 0

【Comments】: