【Question title】: How can I find an intersection among multiple lists?
【Posted】: 2021-04-16 21:56:30
【Question】:

I have multiple arrays and I want to find the intersection among them. I tried the following code.

my_lists = [['Finish', 'Purpose', 'Form', 'Series', 'Tiles Type', 'Finishing'], ['Color', 'Thickness', 'Usage/Application', 'Brand', 'Marble Type', 'Material'], ['Color', 'Brand', 'Finishing', 'Origin', 'Marble Type', 'Thickness'], ['Thickness', 'Form', 'Size', 'Series', 'Usage/Application', 'Finishing'], ['Thickness', 'Material Grade', 'Size', 'Usage/Application', 'Material'], ['Usage/Application', 'Form', 'Finishing', 'Brand', 'Material', 'Shape'], ['Application Area', 'Form', 'Finishing', 'Brand', 'Color', 'Coverage Area'], ['Usage/Application', 'Marble Type', 'Thickness', 'Brand', 'Form'], ['Unit Size (mm X mm)', 'Marble Type', 'Thickness', 'Finishing', 'Usage', 'Brand'], ['Marble Type', 'Unit Size (mm X mm)', 'Usage', 'Thickness', 'Color'], ['color'], ['Thickness', 'Size', 'Usage/Application', 'Series', 'Finish', 'Marble Type'], ['Thickness', 'Usage/Application', 'Brand', 'Color', 'Marble Type', 'Unit Size (mm X mm)'], ['Color', 'Marble Type', 'Usage'], ['Thickness', 'Size', 'Material', 'Finish', 'Packaging Size', 'Packaging Type'], ['Color', 'Material', 'Thickness', 'Usage/Application', 'Back Lit', 'Brand'], ['Material', 'Pattern', 'Shape'], ['Form', 'Application Area', 'Material', 'Thickness', 'Colour', 'Finishing'], ['Color', 'Usage/Application', 'Brand', 'Series'], ['Color', 'Material', 'Thickness', 'Usage/Application', 'Brand', 'Surface Finish'], ['Brand', 'Color', 'Usage/Application', 'Thickness', 'Size', 'Finish'], ['Form', 'Material', 'Usage', 'Marble Type', 'Thickness', 'Finishing'], ['Form', 'Color', 'Marble Type', 'Unit Size', 'Features', 'Coverage Area'], ['Usage', 'Form'], ['Finish', 'Application Area', 'Purpose', 'Thickness', 'Pattern'], ['Usage/Application', 'Finishing', 'Material', 'Brand', 'Size', 'Category Type'], ['Usage/Application', 'Size', 'Color', 'Marble Type', 'Features', 'Finishing'], ['Marble Type', 'Surface Finishing', 'Stone Form', 'Usage'], ['Brand', 'Material', 'Finish', 'Thickness', 'Size']]
print(set.intersection(*map(set, my_lists)))

But I get an empty set:

set()

What I actually want is to find the elements that are common to all the lists.

【Comments】:

    Tags: python arrays data-structures intersection


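    【Solution 1】:

    No element is common to all the lists in your example: the first and second lists, for instance, are completely disjoint. The empty set is therefore the correct result. This operation only finds strings that appear in EVERY list.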
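A small sketch may make the point concrete. The lists below are made-up sample data, not the lists from the question, and the variable names are only illustrative. It shows that `set.intersection` keeps a string only if every list contains it, so a single list that shares nothing with the others empties the result.

```python
# Hypothetical sample data, much smaller than the lists in the question.
sample_lists = [
    ["Color", "Brand", "Thickness"],
    ["Brand", "Thickness", "Size"],
    ["Thickness", "Brand", "Form"],
]

# Only strings present in every list survive the intersection.
common = set.intersection(*map(set, sample_lists))
print(sorted(common))

# One extra list that shares nothing with the others empties the result.
with_disjoint = sample_lists + [["Shape", "Pattern"]]
common_with_disjoint = set.intersection(*map(set, with_disjoint))
print(common_with_disjoint)
```

Note that string comparison is case-sensitive, so 'color' and 'Color' count as different elements.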

    Edit

    If your goal is to find the strings that are repeated, I would do the following:

    import numpy as np
    my_lists = [['Finish', 'Purpose', 'Form', 'Series', 'Tiles Type', 'Finishing'], ['Color', 'Thickness', 'Usage/Application', 'Brand', 'Marble Type', 'Material'], ['Color', 'Brand', 'Finishing', 'Origin', 'Marble Type', 'Thickness'], ['Thickness', 'Form', 'Size', 'Series', 'Usage/Application', 'Finishing'], ['Thickness', 'Material Grade', 'Size', 'Usage/Application', 'Material'], ['Usage/Application', 'Form', 'Finishing', 'Brand', 'Material', 'Shape'], ['Application Area', 'Form', 'Finishing', 'Brand', 'Color', 'Coverage Area'], ['Usage/Application', 'Marble Type', 'Thickness', 'Brand', 'Form'], ['Unit Size (mm X mm)', 'Marble Type', 'Thickness', 'Finishing', 'Usage', 'Brand'], ['Marble Type', 'Unit Size (mm X mm)', 'Usage', 'Thickness', 'Color'], ['color'], ['Thickness', 'Size', 'Usage/Application', 'Series', 'Finish', 'Marble Type'], ['Thickness', 'Usage/Application', 'Brand', 'Color', 'Marble Type', 'Unit Size (mm X mm)'], ['Color', 'Marble Type', 'Usage'], ['Thickness', 'Size', 'Material', 'Finish', 'Packaging Size', 'Packaging Type'], ['Color', 'Material', 'Thickness', 'Usage/Application', 'Back Lit', 'Brand'], ['Material', 'Pattern', 'Shape'], ['Form', 'Application Area', 'Material', 'Thickness', 'Colour', 'Finishing'], ['Color', 'Usage/Application', 'Brand', 'Series'], ['Color', 'Material', 'Thickness', 'Usage/Application', 'Brand', 'Surface Finish'], ['Brand', 'Color', 'Usage/Application', 'Thickness', 'Size', 'Finish'], ['Form', 'Material', 'Usage', 'Marble Type', 'Thickness', 'Finishing'], ['Form', 'Color', 'Marble Type', 'Unit Size', 'Features', 'Coverage Area'], ['Usage', 'Form'], ['Finish', 'Application Area', 'Purpose', 'Thickness', 'Pattern'], ['Usage/Application', 'Finishing', 'Material', 'Brand', 'Size', 'Category Type'], ['Usage/Application', 'Size', 'Color', 'Marble Type', 'Features', 'Finishing'], ['Marble Type', 'Surface Finishing', 'Stone Form', 'Usage'], ['Brand', 'Material', 'Finish', 'Thickness', 'Size']]
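    # Flatten the list of lists into a single list of strings.
    big_list = [x for a_list in my_lists for x in a_list]
    # Unique strings and the number of times each one appears.
    unique_strings, number_of_appearances = np.unique(big_list, return_counts=True)
    # Indices that order the counts from most to least frequent.
    index = np.flip(np.argsort(number_of_appearances))
    print(unique_strings[index], number_of_appearances[index])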
    

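    This flattens your lists, finds the unique strings, and sorts them by the number of times they appear (most to least). The first string is the "most frequently found element", and any string with a count greater than 1 is repeated across lists (assuming no single list contains the same string twice).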
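The same counting idea can also be sketched with the standard library alone. This is an alternative to the NumPy version, not the answerer's method; the sample data and variable names are hypothetical. Each list is converted to a set first, so a string is counted once per list even if a list happens to contain it twice.

```python
from collections import Counter

# Hypothetical sample data for illustration.
sample_lists = [
    ["Color", "Brand", "Thickness"],
    ["Brand", "Thickness", "Size"],
    ["Thickness", "Form"],
]

# Count, for each string, how many lists contain it (once per list).
list_counts = Counter(s for a_list in sample_lists for s in set(a_list))

# (string, count) pairs ordered from most to least frequent.
ranked = list_counts.most_common()

# Strings that occur in more than one list.
repeated = sorted(s for s, n in list_counts.items() if n > 1)
print(ranked[0], repeated)
```

A string whose count equals `len(sample_lists)` is present in every list, which ties this back to the intersection.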

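    【Comments】:

    • So there is no way to find the elements common to all the lists, or the most common elements, without the intersection? For example, finding the most frequently found element across all the lists.
    • @HassanIbraheem I edited my answer to include what I think you are asking for here. If that is not it, you should update your question and be more specific about what you want.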
    【Solution 2】:

    I think this will help:

    from functools import reduce
    import numpy as np
    # Fold the lists pairwise; my_lists is the list of lists from the question.
    reduce(np.intersect1d, my_lists)
    

    Source: https://numpy.org/doc/stable/reference/generated/numpy.intersect1d.html
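The fold that `reduce` performs can be illustrated without NumPy. The sketch below is a standard-library stand-in, not the answer's code: `pairwise_common` is a hypothetical helper that plays the role of the two-argument intersection function, and the sample data is made up.

```python
from functools import reduce

# Hypothetical sample data for illustration.
sample_lists = [
    ["Color", "Brand", "Thickness"],
    ["Brand", "Thickness", "Size"],
    ["Thickness", "Brand", "Form"],
]


def pairwise_common(left, right):
    """Return the strings present in both inputs (hypothetical helper)."""
    return set(left) & set(right)


# reduce computes pairwise_common(pairwise_common(l0, l1), l2), and so on.
folded = reduce(pairwise_common, sample_lists)
print(sorted(folded))
```

Because the running result can only shrink at each step, one list with nothing in common is enough to leave it empty, which is what happens with the data in the question.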

    【Comments】:
