Efficiently Compacting a List of Tuples to a Dictionary of Lists in Python? Answers

【Question title】: Efficiently Compacting a List of Tuples to a Dictionary of Lists in Python?
【Posted】: 2026-01-21 13:15:01
【Question】:

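Question

I am interested in finding a more efficient way (in terms of code complexity, speed, memory usage, comprehensions, or generators) to reduce a list of two-element tuples, where the first element may be repeated across tuples, to a dictionary of lists.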

from copy import deepcopy
a = [('a', 'cat'), ('a', 'dog'), ('b', 'pony'), ('c', 'hippo'), ('c','horse'), ('d', 'cow')]

b = {x[0]: list() for x in a}

c = deepcopy(b)
for key, value in b.items():
    for item in a:
        if key == item[0]:
            c[key].append(item[1])
print(a)
print(c)

[('a', 'cat'), ('a', 'dog'), ('b', 'pony'), ('c', 'hippo'), ('c', 'horse'), ('d', 'cow')]

{'a': ['cat', 'dog'], 'b': ['pony'], 'c': ['hippo', 'horse'], 'd': ['cow']}

Testing the answers

from collections import defaultdict
from itertools import groupby
from operator import itemgetter
import timeit

timings = dict()

def wrap(func, *args, **kwargs):
    def wrapped():
        return func(*args, **kwargs)
    return wrapped

a = [('a', 'cat'), ('a', 'dog'), ('b', 'pony'), ('c', 'hippo'), ('c','horse'), ('d', 'cow')]

# yatu's solution
def yatu(x):
    output = defaultdict(list)
    for item in x:
        output[item[0]].append(item[1])
    return output

# roseman's solution
def roseman(x):
    d = defaultdict(list)
    for key, value in x:
        d[key].append(value)
    return d

# prem's solution
def prem(a):
    result = {k: [v for _,v in grp] for k,grp in groupby(a, itemgetter(0))}
    return result

# timings
yatus_wrapped = wrap(yatu, a)
rosemans_wrapped = wrap(roseman, a)
prems_wrapped = wrap(prem, a)
timings['yatus'] = timeit.timeit(yatus_wrapped, number=100000)
timings['rosemans'] = timeit.timeit(rosemans_wrapped, number=100000)
timings['prems'] = timeit.timeit(prems_wrapped, number=100000)

# output results
print(timings)

{'yatus': 0.171220442, 'rosemans': 0.153767728, 'prems': 0.22808025399999993}

Roseman's solution is the fastest, thanks.
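The timings above were taken on a six-element list, where the differences are small. As a rough sketch of how the same measurement could be repeated on a larger input, the snippet below times a single-pass defaultdict grouping on generated data. The function name `group_defaultdict`, the input size, and the seed are assumptions chosen for illustration, and the elapsed time will vary by machine.

```python
import random
import string
import timeit
from collections import defaultdict


def group_defaultdict(pairs):
    """Group (key, value) pairs into a dict of lists in a single pass."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


# Hypothetical larger input: 10,000 pairs with keys drawn from 'a'..'z'.
random.seed(0)
pairs = [(random.choice(string.ascii_lowercase), i) for i in range(10_000)]

# Time 100 runs; the absolute number depends on the machine.
elapsed = timeit.timeit(lambda: group_defaultdict(pairs), number=100)
print(f"{elapsed:.4f} s for 100 runs")
```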

【Comments on the question】:

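This is a dict comprehension over each key, with a list comprehension to build each value. Where are you stuck? Show your problem code rather than asking others to write it for you.
Hi @Prune, I am not stuck; I am asking for feedback on optimization. The solution is shown; the question is how to improve it in terms of speed, memory usage, and so on.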

Tags: python list dictionary optimization tuples

【Solution 1】:

This can be done in a single loop using a defaultdict:

from collections import defaultdict
d = defaultdict(list)
for key, value in a:
    d[key].append(value)
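For a self-contained run, the loop above can be sketched as follows, using the list `a` from the question. The final conversion to a plain `dict` is an optional addition made here, not part of the original answer.

```python
from collections import defaultdict

# Sample data taken from the question.
a = [('a', 'cat'), ('a', 'dog'), ('b', 'pony'),
     ('c', 'hippo'), ('c', 'horse'), ('d', 'cow')]

d = defaultdict(list)
for key, value in a:
    # A missing key is created with an empty list on first access.
    d[key].append(value)

# Optional: convert to a plain dict so that later lookups of missing keys
# raise KeyError instead of silently inserting empty lists.
result = dict(d)
print(result)
# {'a': ['cat', 'dog'], 'b': ['pony'], 'c': ['hippo', 'horse'], 'd': ['cow']}
```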

【Comments】:

Posted the same idea at almost the same time :)
This is the fastest solution; see the timings added to the question. Thanks.

【Solution 2】:

You can use a defaultdict:

from collections import defaultdict
a = [('a', 'cat'), ('a', 'dog'), ('b', 'pony'), ('c', 'hippo'), ('c','horse'), ('d', 'cow')]

output = defaultdict(list)

for item in a:
    output[item[0]].append(item[1])

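This approach needs less space (only a and output) and has a better running time: it is linear, because it iterates over a once and appends each element to the output dictionary, and inserting into a dictionary takes constant time on average.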
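The same single-pass, linear-time idea can also be written without defaultdict by using `dict.setdefault`. This is a variant sketched here for comparison, not one of the posted answers, and the function name `group_pairs` is hypothetical.

```python
def group_pairs(pairs):
    """Group (key, value) pairs into a plain dict of lists in one pass."""
    grouped = {}
    for key, value in pairs:
        # setdefault returns the existing list for key, or inserts and
        # returns a new empty list if the key has not been seen yet.
        grouped.setdefault(key, []).append(value)
    return grouped


a = [('a', 'cat'), ('a', 'dog'), ('b', 'pony'),
     ('c', 'hippo'), ('c', 'horse'), ('d', 'cow')]

grouped = group_pairs(a)
print(grouped)
# {'a': ['cat', 'dog'], 'b': ['pony'], 'c': ['hippo', 'horse'], 'd': ['cow']}
```

Unlike the original nested-loop version in the question, which scans the whole list once per key, this visits each tuple exactly once and works on unsorted input.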

【Comments】:

【Solution 3】:

You can first group the items with itertools.groupby and then merge them as needed:

>>> from itertools import groupby
>>> from operator import itemgetter
>>> {k: [v for _,v in grp] for k,grp in groupby(a, itemgetter(0))}
{'a': ['cat', 'dog'], 'b': ['pony'], 'c': ['hippo', 'horse'], 'd': ['cow']}

If the input is not already sorted by key, it needs to be sorted first.
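To illustrate why groupby depends on the input order, here is a small sketch on a deliberately unsorted list (the sample data `unsorted_pairs` is made up for this example). groupby only merges adjacent items with equal keys, so a key that reappears later starts a new group that overwrites the earlier one in the dict comprehension.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input in which equal keys are not adjacent.
unsorted_pairs = [('c', 'hippo'), ('a', 'cat'), ('c', 'horse'), ('a', 'dog')]

# Without sorting, 'c' and 'a' each produce two separate groups, and the
# later group replaces the earlier one under the same dict key.
broken = {k: [v for _, v in grp]
          for k, grp in groupby(unsorted_pairs, itemgetter(0))}
print(broken)
# {'c': ['horse'], 'a': ['dog']}

# Sorting by key first makes equal keys adjacent; sorted() is stable, so
# values keep their original relative order within each key.
by_key = sorted(unsorted_pairs, key=itemgetter(0))
fixed = {k: [v for _, v in grp]
         for k, grp in groupby(by_key, itemgetter(0))}
print(fixed)
# {'a': ['cat', 'dog'], 'c': ['hippo', 'horse']}
```

Note that the extra sort makes this approach O(n log n), whereas the defaultdict loop stays linear and does not need sorted input.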

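【Comments】:

groupby requires the original sequence to be sorted; otherwise k can repeat in non-adjacent elements of the resulting sequence, and your final dict will only hold the values associated with the last group for each value of k.
@chepner I already mentioned in the answer that the input needs to be sorted if it is not already.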