【Question title】: Count duplicate lists inside a list
【Posted】: 2017-11-28 13:46:20
【Question】:
lis = [ [12,34,56],[45,78,334],[56,90,78],[12,34,56] ]

I want the result to be 2, because the total number of duplicated lists is 2. How can I do that?

I tried something like this:

count=0
for i in range(0, len(lis)-1):
    for j in range(i+1, len(lis)):
        if lis[i] == lis[j]:
            count+=1

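But count ends up as 1, because the loop counts matching pairs, not the lists involved in them. How can I get the total number of duplicate lists?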

【Comments】:

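  • By your logic, what should the total be for the list [ [12,34,56],[45,78,334],[56,90,78],[12,34,56], [56,90,78],[12,34,56] ]?
  • @RomanPerekhrest For the list in your question, the total should be 5.
  • Create a second list of booleans with the same length as the list, all set to False. When you find a match, mark both indices as True, and at the end count the True values. As a bonus, you can use these flags to skip entries that are already marked True.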
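
The boolean-flag idea from the last comment can be sketched as follows. This is only an illustration of that suggestion, not code from the thread; the function name count_duplicates_flags is made up for the example. It compares sub-lists with == only, so it does not need hashable elements, but it takes quadratic time.

```python
def count_duplicates_flags(lis):
    """Count every sub-list that has at least one equal partner.

    Hypothetical helper illustrating the boolean-flag suggestion.
    """
    # One flag per sub-list, all False to start with.
    is_duplicate = [False] * len(lis)
    for i in range(len(lis)):
        # Already flagged: an earlier equal sub-list has marked
        # every later match as well, so this index can be skipped.
        if is_duplicate[i]:
            continue
        for j in range(i + 1, len(lis)):
            if lis[i] == lis[j]:
                is_duplicate[i] = True
                is_duplicate[j] = True
    # True counts as 1, so the sum is the number of flagged entries.
    return sum(is_duplicate)


lis = [[12, 34, 56], [45, 78, 334], [56, 90, 78], [12, 34, 56]]
print(count_duplicates_flags(lis))  # 2
```

For the six-element list from the first comment this gives 5, matching the total the asker expects.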

Tags: python python-3.x count duplicates


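【Solution 1】:

Solution

If your sub-lists contain only hashable elements, such as numbers, you can convert each sub-list to a tuple and use collections.Counter: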

>>> from collections import Counter
>>> lis = [[12, 34, 56], [45, 78, 334], [56, 90, 78], [12, 34, 56]]
>>> sum(y for y in Counter(tuple(x) for x in lis).values() if y > 1)
2
>>> lis = [[12, 34, 56], [45, 78, 334], [56, 90, 78], [12, 34, 56], [56, 90, 78], [12, 34, 56]]
>>> sum(y for y in Counter(tuple(x) for x in lis).values() if y > 1)
5

Step by step

Convert your sub-lists into tuples:

tuple(x) for x in lis
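
The conversion is needed because Counter stores its items as dictionary keys, and a list cannot be a key while a tuple of hashable elements can. A small sketch to check this (is_hashable is a helper made up for this illustration):

```python
def is_hashable(obj):
    """Return True if hash(obj) works, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable([12, 34, 56]))         # False
print(is_hashable(tuple([12, 34, 56])))  # True
```

A tuple that itself contains a list is still unhashable, which is why the approach assumes hashable elements inside each sub-list.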

Count them:

>>> Counter(tuple(x) for x in lis)
Counter({(12, 34, 56): 3, (45, 78, 334): 1, (56, 90, 78): 2})

Take only the values:

>>> Counter(tuple(x) for x in lis).values()
dict_values([3, 1, 2])

Finally, sum only the counts that are greater than 1:

>>> sum(y for y in Counter(tuple(x) for x in lis).values() if y > 1)
5

Making it reusable

Put it into a function, and add a docstring and a doctest:

"""Count duplicates of sub-lists.
"""


from collections import Counter


def count_duplicates(lis):
    """Count duplicates of sub-lists.

    Assumption: Sub-lists contain only hashable elements.
    Result: If a sub-list appears twice the result is 2.
    If a sub-list appears three times and another twice the result is 5.

    >>> count_duplicates([[12, 34, 56], [45, 78, 334], [56, 90, 78],
    ...                   [12, 34, 56]])
    2
    >>> count_duplicates([[12, 34, 56], [45, 78, 334], [56, 90, 78],
    ...                   [12, 34, 56], [56, 90, 78], [12, 34, 56]])
    5
    """
    # Make it a bit more verbose than necessary for readability and
    # educational purposes.
    tuples = (tuple(elem) for elem in lis)
    counts = Counter(tuples).values()
    return sum(elem for elem in counts if elem > 1)


if __name__ == '__main__':

    import doctest

    doctest.testmod(verbose=True)

Run the tests:

python count_dupes.py 
Trying:
    count_duplicates([[12, 34, 56], [45, 78, 334], [56, 90, 78],
                      [12, 34, 56]])
Expecting:
    2
ok
Trying:
    count_duplicates([[12, 34, 56], [45, 78, 334], [56, 90, 78],
                      [12, 34, 56], [56, 90, 78], [12, 34, 56]])
Expecting:
    5
ok
1 items had no tests:
    __main__
1 items passed all tests:
   2 tests in __main__.count_duplicates
2 tests in 2 items.
2 passed and 0 failed.
Test passed.

【Comments】:

  • You can also use defaultdict(int) to get the same result.
  • @Mike Müller Thanks!
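
The defaultdict(int) variant mentioned in the first comment might look roughly like this. It is a sketch under the same assumption as the answer (hashable elements in each sub-list); the name count_duplicates_defaultdict is invented for the example.

```python
from collections import defaultdict


def count_duplicates_defaultdict(lis):
    """Count duplicated sub-lists using a defaultdict(int) tally."""
    counts = defaultdict(int)  # missing keys start at 0
    for sub in lis:
        # Lists are unhashable, so use the tuple form as the key.
        counts[tuple(sub)] += 1
    # Add up only the tallies of sub-lists seen more than once.
    return sum(n for n in counts.values() if n > 1)


lis = [[12, 34, 56], [45, 78, 334], [56, 90, 78], [12, 34, 56]]
print(count_duplicates_defaultdict(lis))  # 2
```

Counter does the same tallying in one call, so this version mainly makes the counting step explicit.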