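【Posted】:2021-02-02 19:28:04
【Question】:
Given two lists of lists of arbitrary length, say list1 and list2, I want to partition the lists in list1 into subsets according to whether they contain exactly one of the lists in list2.
Here is a concrete example: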
list1 = [[1, 2, 3, 4], [1, 2, 3, 5, 6, 8], [1, 2, 3, 6, 7], [1, 2, 3, 6, 8, 9, 10],
[1, 2, 3, 6, 8, 11, 12], [1, 2, 4, 5, 9, 10], [1, 2, 4, 5, 11, 12],
[1, 2, 5, 6, 7, 9, 10], [1, 2, 5, 6, 7, 11, 12], [1, 2, 5, 6, 8, 9, 10],
[1, 2, 5, 6, 8, 11, 12], [3, 4, 5, 6, 8], [3, 5, 9, 10], [3, 5, 11, 12],
[4, 6, 7], [4, 6, 8, 9, 10], [4, 6, 8, 11, 12], [9, 10, 11, 12]]
list2 = [[2], [6, 7], [6, 8], [9,9]]
The expected result of the function for the "inner" matches would then be:
[[1, 2, 3, 4],
[1, 2, 4, 5, 11, 12],
[4, 6, 7],
[3, 4, 5, 6, 8],
[4, 6, 8, 11, 12],
[3, 5, 9, 10],
[9, 10, 11, 12]]
For the "outer" matches (that is, the remaining items of list1):
[(1, 2, 5, 6, 8, 11, 12),
(1, 2, 5, 6, 7, 11, 12),
(4, 6, 8, 9, 10),
(1, 2, 5, 6, 7, 9, 10),
(1, 2, 3, 5, 6, 8),
(1, 2, 3, 6, 8, 11, 12),
(1, 2, 3, 6, 7),
(3, 5, 11, 12),
(1, 2, 4, 5, 9, 10),
(1, 2, 5, 6, 8, 9, 10),
(1, 2, 3, 6, 8, 9, 10)]
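To make the rule behind the two outputs explicit, here is a minimal sketch of how the example can be read: an item of list1 is an "inner" match when exactly one entry of list2 is a subset of it, and an "outer" match otherwise (zero or several entries). The function name partition_by_single_match is hypothetical, and unlike the expected output above this sketch returns both results as lists in list1 order.

```python
def partition_by_single_match(list1, list2):
    """Split list1 into (inner, outer).

    inner: items containing exactly one entry of list2 as a subset.
    outer: all remaining items (no entry, or more than one).
    """
    needles = [set(sub) for sub in list2]  # [9, 9] becomes {9}
    inner, outer = [], []
    for item in list1:
        item_set = set(item)
        # Count how many list2 entries are fully contained in this item.
        hits = sum(1 for needle in needles if needle <= item_set)
        (inner if hits == 1 else outer).append(item)
    return inner, outer
```

For the example data this yields the same 7 inner and 11 outer items as listed above, only in a different order.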
I wrote a quick and dirty solution that produces the desired result, but it does not scale well to very long lists (e.g. 100000 and 2500).
My solution:
from itertools import chain

def find_all_sets(list1,list2):
    d = {}
    d2 = {}
    count = 0
    # Pass 1: for every entry of list2, collect the lists in list1 that contain it.
    for i in list2:
        count = count + 1
        set2 = set(i)
        d['set'+str(count)] = set2
        d['lists'+str(count)] = []
        first = []
        d2['match'+str(count)] = []
        for a in list1:
            set1 = set(a)
            if d['set'+str(count)].issubset(set1) == True:
                first.append(a)
        d['lists'+str(count)].append(first)
        d2['match'+str(count)].append(d['lists'+str(count)])
    count = 0
    count2 = -1
    d3 = {}
    all_sub_lists = []
    # Pass 2: keep a candidate only if no *other* entry of list2 is contained in it.
    for i in d2.values():
        count = count + 1
        count2 = count2 + 1
        d3['final'+str(count)] = []
        real = []
        for item in i:
            for each_item in item:
                for each_each_item in each_item:
                    seta= set(each_each_item)
                    save = []
                    for i in list2:
                        setb = set(i)
                        a=setb.issubset(seta)
                        save.append(a)
                    # Drop the flag of the list2 entry currently being processed.
                    index_to_remove = count2
                    new_save = save[:index_to_remove] + save[index_to_remove + 1:]
                    if True not in new_save:
                        real.append(each_each_item)
        d3['final'+str(count)].append(real)
        all_sub_lists.append(real)
    inner_matches = list(chain(*all_sub_lists))
    # Everything in list1 that is not an inner match is an outer match (as tuples).
    setA = set(map(tuple, inner_matches))
    setB = set(map(tuple, list1))
    outer_matches = [i for i in setB if i not in setA]
    return inner_matches, outer_matches

inner_matches, outer_matches = find_all_sets(list1,list2)
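I am looking for a faster way to handle large lists. Apologies if the terms "inner" and "outer" matches are unclear; I did not know what else to call them.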
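One possible direction, offered as an unbenchmarked sketch rather than a definitive answer: index every entry of list2 under one of its elements, so that each item of list1 is only tested against the entries whose indexed element it actually contains, and stop as soon as a second match is found. The name partition_indexed and the choice of representative element are assumptions made for illustration; the results are returned in list1 order.

```python
from collections import defaultdict


def partition_indexed(list1, list2):
    """Return (inner, outer); inner items contain exactly one entry of list2."""
    needles = [frozenset(sub) for sub in list2]
    # An empty entry is a subset of every item, so it always counts as a hit.
    base_hits = sum(1 for needle in needles if not needle)
    # Index each non-empty entry under one (arbitrary) representative element.
    by_element = defaultdict(list)
    for needle in needles:
        if needle:
            by_element[next(iter(needle))].append(needle)

    inner, outer = [], []
    for item in list1:
        item_set = set(item)
        hits = base_hits
        for element in item_set:
            # Only entries indexed under an element of this item can match.
            for needle in by_element.get(element, ()):
                if needle <= item_set:
                    hits += 1
            if hits > 1:
                break  # more than one entry matched: not an inner match
        (inner if hits == 1 else outer).append(item)
    return inner, outer
```

How much this helps depends on the data: if most entries of list2 share the same representative element, the index prunes very little, and picking a rarer element per entry may work better.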
【Discussion】: