How can I merge Python dictionaries by keys?

【Question title】: How can I merge Python dictionaries by keys?
【Posted】: 2021-04-22 03:20:43
【Question】:

Suppose I have two dictionaries in a list, as follows:

[
    {'country': 'United Kingdom',
    'city': 'Cambridge',
    'title': 'University of Cambridge',
    'region': 'Europe'},
    {'country': 'United States',
    'city': 'Cambridge',
    'title': 'Institute of Technology (MIT)',
    'region': 'North America'}
]

I also have a second list containing two more dictionaries (the location of each university):

[
    {'title': 'University of Cambridge',
    'latitude': '52.1873962',
    'longitude': '0.1302958635475542'},
    {'title': 'Institute of Technology (MIT)',
    'latitude': '42.3582393',
    'longitude': '-71.09664602558988'},
]

How can I combine these two lists into:

[
    {'country': 'United Kingdom',
    'city': 'Cambridge',
    'title': 'University of Cambridge',
    'region': 'Europe',
    'latitude': '52.1873962',
    'longitude': '0.1302958635475542'},
    {'country': 'United States',
    'city': 'Cambridge',
    'title': 'Institute of Technology (MIT)',
    'region': 'North America', 
    'latitude': '42.3582393',
    'longitude': '-71.09664602558988'}
]

What is the most syntactically concise way to do this?

Thanks for your help!

【Comments】:

What have you tried so far?

Tags: python json python-3.x dictionary

【Solution 1】:

You can use zip() to pair up the dictionaries by position, then merge each pair in a list comprehension, for example:

d1 = [
    {'country': 'United Kingdom',
    'city': 'Cambridge',
    'title': 'University of Cambridge',
    'region': 'Europe'},
    {'country': 'United States',
    'city': 'Cambridge',
    'title': 'Institute of Technology (MIT)',
    'region': 'North America'}
]

d2 = [
    {'title': 'University of Cambridge',
    'latitude': '52.1873962',
    'longitude': '0.1302958635475542'},
    {'title': 'Institute of Technology (MIT)',
    'latitude': '42.3582393',
    'longitude': '-71.09664602558988'},
]

[{**a, **b} for a, b in zip(d1, d2)]

This gives you:

[{'country': 'United Kingdom',
  'city': 'Cambridge',
  'title': 'University of Cambridge',
  'region': 'Europe',
  'latitude': '52.1873962',
  'longitude': '0.1302958635475542'},
 {'country': 'United States',
  'city': 'Cambridge',
  'title': 'Institute of Technology (MIT)',
  'region': 'North America',
  'latitude': '42.3582393',
  'longitude': '-71.09664602558988'}]
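
One caveat is worth spelling out: zip() pairs items purely by position, and in {**a, **b} the right-hand dictionary wins whenever both share a key. The following is a minimal sketch, using shortened records with the second list deliberately reversed (an assumption made for illustration, not the data order from the question), of what happens when the two lists are not aligned:

```python
# Shortened, illustrative records; the second list is deliberately reversed.
d1 = [
    {'title': 'University of Cambridge', 'region': 'Europe'},
    {'title': 'Institute of Technology (MIT)', 'region': 'North America'},
]
d2 = [
    {'title': 'Institute of Technology (MIT)', 'latitude': '42.3582393'},
    {'title': 'University of Cambridge', 'latitude': '52.1873962'},
]

# zip() pairs by index, so each record is merged with the wrong partner,
# and the 'title' from d2 silently overwrites the one from d1.
merged = [{**a, **b} for a, b in zip(d1, d2)]

# A cheap sanity check: do the titles agree at every position?
aligned = all(a['title'] == b['title'] for a, b in zip(d1, d2))

print(merged[0])
print(aligned)
```

If the check fails, a lookup keyed on title is likely the safer route than positional pairing.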

【Comments】:

【Solution 2】:

You can merge two dictionaries with ** dictionary unpacking:

unis = [
    {'country': 'United Kingdom',
    'city': 'Cambridge',
    'title': 'University of Cambridge',
    'region': 'Europe'},
    {'country': 'United States',
    'city': 'Cambridge',
    'title': 'Institute of Technology (MIT)',
    'region': 'North America'}
]

locs = [
    {'title': 'University of Cambridge',
    'latitude': '52.1873962',
    'longitude': '0.1302958635475542'},
    {'title': 'Institute of Technology (MIT)',
    'latitude': '42.3582393',
    'longitude': '-71.09664602558988'},
]

combined_list = []
for i, v in enumerate(unis):
    combined_dict = {**v, **locs[i]}
    combined_list.append(combined_dict)

print(combined_list)

Output (formatted for readability):

[
    {'country': 'United Kingdom',
     'city': 'Cambridge',
     'title': 'University of Cambridge',
     'region': 'Europe',
     'latitude': '52.1873962',
     'longitude': '0.1302958635475542'},

    {'country': 'United States',
     'city': 'Cambridge', 
     'title': 'Institute of Technology (MIT)', 
     'region': 'North America', 
     'latitude': '42.3582393', 
     'longitude': '-71.09664602558988'}
]

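【Comments】:

【Solution 3】:

This works for the merge, assuming x and y are the two lists from the question. Note that update() modifies the dictionaries in x in place: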

for i in range(len(x)):
    x[i].update(y[i])

Output:

[{'country': 'United Kingdom', 'city': 'Cambridge', 'title': 'University of Cambridge', 'region': 'Europe', 'latitude': '52.1873962', 'longitude': '0.1302958635475542'}, {'country': 'United States', 'city': 'Cambridge', 'title': 'Institute of Technology (MIT)', 'region': 'North America', 'latitude': '42.3582393', 'longitude': '-71.09664602558988'}]
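
Because update() changes its target in place, the loop above alters the original first list. If that list is still needed unchanged, one option is to update a copy instead. This is a minimal sketch with shortened, illustrative records; the name merged is an assumption, not part of the answer:

```python
import copy

x = [{'title': 'University of Cambridge', 'region': 'Europe'}]
y = [{'title': 'University of Cambridge', 'latitude': '52.1873962'}]

# Work on a deep copy so the dictionaries inside x are left untouched.
merged = copy.deepcopy(x)
for target, extra in zip(merged, y):
    target.update(extra)  # update() returns None and changes target in place

print(x[0])       # still has only 'title' and 'region'
print(merged[0])  # has 'latitude' as well
```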

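【Comments】:

【Solution 4】:

Since we are merging, we don't know the order in which the dictionaries appear, nor can we assume that the first list is the same length as the second. Therefore: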

first = [
    {'country': 'United Kingdom',
    'city': 'Cambridge',
    'title': 'University of Cambridge',
    'region': 'Europe'},
    {'country': 'United States',
    'city': 'Cambridge',
    'title': 'Institute of Technology (MIT)',
    'region': 'North America'}
]
second = [
    {'title': 'University of Cambridge',
    'latitude': '52.1873962',
    'longitude': '0.1302958635475542'},
    {'title': 'Institute of Technology (MIT)',
    'latitude': '42.3582393',
    'longitude': '-71.09664602558988'},
]

a = {j['title']:j for j in second}
[{**i, **a[i['title']]} for i in first]

[{'city': 'Cambridge',
  'country': 'United Kingdom',
  'latitude': '52.1873962',
  'longitude': '0.1302958635475542',
  'region': 'Europe',
  'title': 'University of Cambridge'},
 {'city': 'Cambridge',
  'country': 'United States',
  'latitude': '42.3582393',
  'longitude': '-71.09664602558988',
  'region': 'North America',
  'title': 'Institute of Technology (MIT)'}]
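
Note that a[i['title']] raises KeyError when a record in the first list has no counterpart in the second. The following sketch shows one way to tolerate that case, assuming unmatched records should be kept as they are; 'Example University' is a made-up entry used only for illustration:

```python
first = [
    {'title': 'University of Cambridge', 'region': 'Europe'},
    {'title': 'Example University', 'region': 'Nowhere'},  # hypothetical, has no location
]
second = [
    {'title': 'University of Cambridge', 'latitude': '52.1873962'},
]

lookup = {j['title']: j for j in second}

# lookup.get(..., {}) falls back to an empty dict, so unmatched records
# are kept unchanged instead of raising KeyError.
merged = [{**i, **lookup.get(i['title'], {})} for i in first]

print(merged)
```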

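【Comments】:

It is definitely worth double-checking the assumption that both lists are in the same order. If they are not, it is best to build a dictionary keyed on title, which avoids the O(n²) complexity of a nested loop.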
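
To make the complexity remark concrete, here is a small sketch contrasting the two approaches on shortened, illustrative records (titles are assumed to be unique). Both produce the same result, but the nested loop compares every pair, while the lookup table needs one pass to build and one pass to merge:

```python
first = [
    {'title': 'University of Cambridge', 'region': 'Europe'},
    {'title': 'Institute of Technology (MIT)', 'region': 'North America'},
]
second = [
    {'title': 'Institute of Technology (MIT)', 'latitude': '42.3582393'},
    {'title': 'University of Cambridge', 'latitude': '52.1873962'},
]

# Nested loop: every record in first is compared with every record in second.
nested = []
for a in first:
    for b in second:
        if a['title'] == b['title']:
            nested.append({**a, **b})

# Lookup table: build the index once, then merge with constant-time lookups.
by_title = {b['title']: b for b in second}
indexed = [{**a, **by_title[a['title']]} for a in first if a['title'] in by_title]

print(nested == indexed)
```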

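【Solution 5】:

If we assume the order of the two lists may not always match, we can't simply zip the lists together. Instead, create a dict that allows O(1) lookup of the university data by its title element.

Then iterate over the countries and, whenever a matching title is found in the lookup dict, merge the two dictionaries.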

countries = [
    {'country': 'United Kingdom',
    'city': 'Cambridge',
    'title': 'University of Cambridge',
    'region': 'Europe'},
    {'country': 'United States',
    'city': 'Cambridge',
    'title': 'Institute of Technology (MIT)',
    'region': 'North America'}
]

universities = [
    {'title': 'University of Cambridge',
    'latitude': '52.1873962',
    'longitude': '0.1302958635475542'},
    {'title': 'Institute of Technology (MIT)',
    'latitude': '42.3582393',
    'longitude': '-71.09664602558988'},
]

unidict = {u['title']: u for u in universities}
result = [{**c, **unidict[c['title']]} for c in countries if c['title'] in unidict]
print(result)

Result:

[{'country': 'United Kingdom', 'city': 'Cambridge', 'title': 'University of Cambridge', 'region': 'Europe', 'latitude': '52.1873962', 'longitude': '0.1302958635475542'}, {'country': 'United States', 'city': 'Cambridge', 'title': 'Institute of Technology (MIT)', 'region': 'North America', 'latitude': '42.3582393', 'longitude': '-71.09664602558988'}]
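
The comprehension above only walks the first list, so a record that appears only in the second list is dropped, as is any record without a match. If every record should survive, one possible generalisation is sketched below. The helper merge_by_key is hypothetical (it does not come from any of the answers), and the sample records are shortened for illustration:

```python
def merge_by_key(key, *lists):
    """Merge dictionaries from several lists that share the same value for key."""
    merged = {}
    for records in lists:
        for record in records:
            # setdefault creates an empty dict the first time a key value is seen
            merged.setdefault(record[key], {}).update(record)
    return list(merged.values())


countries = [{'title': 'University of Cambridge', 'country': 'United Kingdom'}]
universities = [
    {'title': 'Institute of Technology (MIT)', 'latitude': '42.3582393'},
    {'title': 'University of Cambridge', 'latitude': '52.1873962'},
]

result = merge_by_key('title', countries, universities)
print(result)
```

Results come back in the order in which each title was first seen, since dictionaries preserve insertion order in Python 3.7 and later.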

【Comments】: