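If the order of the initial list d is not important, you can convert the .items() of each dictionary into a frozenset(), which is hashable, put all of those into a set() or frozenset(), and then convert each inner frozenset() back into a dictionary. Example -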
uniq_d = list(map(dict, frozenset(frozenset(i.items()) for i in d)))
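Sets do not allow duplicate elements, which is what removes the duplicates, though you do end up losing the order of the list. This only works if every value in the dictionaries is hashable. For Python 2.x, the list(...) is not needed, since map() returns a list there.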
Example/Demo -
>>> import pprint
>>> pprint.pprint(d)
[{'feature_a': 1, 'feature_b': 'Jul', 'feature_c': 100},
 {'feature_a': 2, 'feature_b': 'Jul', 'feature_c': 150},
 {'feature_a': 1, 'feature_b': 'Mar', 'feature_c': 110},
 {'feature_a': 1, 'feature_b': 'Jul', 'feature_c': 100},
 {'feature_a': 1, 'feature_b': 'Jul', 'feature_c': 150}]
>>> uniq_d = list(map(dict, frozenset(frozenset(i.items()) for i in d)))
>>> pprint.pprint(uniq_d)
[{'feature_a': 1, 'feature_b': 'Jul', 'feature_c': 100},
 {'feature_a': 1, 'feature_b': 'Jul', 'feature_c': 150},
 {'feature_a': 1, 'feature_b': 'Mar', 'feature_c': 110},
 {'feature_a': 2, 'feature_b': 'Jul', 'feature_c': 150}]
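If the order of d does matter, the same frozenset idea can be combined with a set of already-seen items instead of an outer frozenset. The following is only a minimal sketch and not part of the one-liner above: the function name unique_dicts is made up for illustration, and it assumes every value in the dictionaries is hashable.

```python
def unique_dicts(dicts):
    """Drop exact duplicate dicts, keeping the first occurrence of each in order."""
    seen = set()
    result = []
    for item in dicts:
        # A frozenset of (key, value) pairs is hashable and ignores key order.
        fingerprint = frozenset(item.items())
        if fingerprint not in seen:
            seen.add(fingerprint)
            result.append(item)
    return result


d = [
    {'feature_a': 1, 'feature_b': 'Jul', 'feature_c': 100},
    {'feature_a': 2, 'feature_b': 'Jul', 'feature_c': 150},
    {'feature_a': 1, 'feature_b': 'Mar', 'feature_c': 110},
    {'feature_a': 1, 'feature_b': 'Jul', 'feature_c': 100},
    {'feature_a': 1, 'feature_b': 'Jul', 'feature_c': 150},
]
uniq_d = unique_dicts(d)
```

Here uniq_d keeps the first, second, third and fifth dictionaries in their original positions; only the fourth, an exact copy of the first, is dropped.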
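For the new requirement -
However, what if I have another feature_d but I only want to dedupe on feature_a, _b and _c
If two entries have the same feature_a, _b and _c, they are considered the same and treated as duplicates, no matter what is in feature_d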
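A simple way to do this is to use a set and a new list: add only the features you care about to the set, and check for membership using only those features. Example -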
seen_set = set()
new_d = []
for i in d:
    # Only feature_a, feature_b and feature_c decide whether an entry is a duplicate
    if tuple([i['feature_a'],i['feature_b'],i['feature_c']]) not in seen_set:
        new_d.append(i)
        seen_set.add(tuple([i['feature_a'],i['feature_b'],i['feature_c']]))
Example/Demo -
>>> d = [{'feature_a':1, 'feature_b':'Jul', 'feature_c':100, 'feature_d':'A'},
... {'feature_a':2, 'feature_b':'Jul', 'feature_c':150, 'feature_d': 'B'},
... {'feature_a':1, 'feature_b':'Mar', 'feature_c':110, 'feature_d':'F'},
... {'feature_a':1, 'feature_b':'Mar', 'feature_c':110, 'feature_d':'G'}]
>>> seen_set = set()
>>> new_d = []
>>> for i in d:
...     if tuple([i['feature_a'],i['feature_b'],i['feature_c']]) not in seen_set:
...         new_d.append(i)
...         seen_set.add(tuple([i['feature_a'],i['feature_b'],i['feature_c']]))
...
>>> pprint.pprint(new_d)
[{'feature_a': 1, 'feature_b': 'Jul', 'feature_c': 100, 'feature_d': 'A'},
 {'feature_a': 2, 'feature_b': 'Jul', 'feature_c': 150, 'feature_d': 'B'},
 {'feature_a': 1, 'feature_b': 'Mar', 'feature_c': 110, 'feature_d': 'F'}]
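If the set of features that defines a duplicate is likely to change, the loop above could be wrapped in a small helper. This is just a sketch under that assumption: the name dedupe_by and its keys parameter are hypothetical, the values under those keys must be hashable, and a dictionary missing one of the keys raises KeyError.

```python
def dedupe_by(dicts, keys):
    """Keep the first dict for each distinct combination of values under `keys`."""
    seen = set()
    result = []
    for item in dicts:
        # Build the comparison key from the chosen features only.
        key = tuple(item[k] for k in keys)
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


d = [
    {'feature_a': 1, 'feature_b': 'Jul', 'feature_c': 100, 'feature_d': 'A'},
    {'feature_a': 2, 'feature_b': 'Jul', 'feature_c': 150, 'feature_d': 'B'},
    {'feature_a': 1, 'feature_b': 'Mar', 'feature_c': 110, 'feature_d': 'F'},
    {'feature_a': 1, 'feature_b': 'Mar', 'feature_c': 110, 'feature_d': 'G'},
]
new_d = dedupe_by(d, ['feature_a', 'feature_b', 'feature_c'])
```

As in the demo, the entry with feature_d 'G' is dropped because it matches the 'F' entry on feature_a, _b and _c, and the order of the remaining entries is preserved.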