【Question title】: Searching for duplicate values in a 2D array
【Posted】: 2013-04-29 01:28:22
【Question】:

I'm looking for a way to search a 2D array for repeated sections.

Take the following array as an example:

 1    2    3    4    5
 6    7    8    9   10
11   12   13   14   15
16   17   18   19   20
21   22   23   24   25
26    *8    9*   29   30
31   *13   14   15*   35
17   *18   19*   39   40
41   *23   24*   44   45
46   47   48   49   50

Is there a way to automatically search for the repeated regions and save their coordinates?

【Comments】:

    Tags: python arrays search area


    【Solution 1】:
    >>> l=[[1,    2,    3,    4,    5],
    ... [6,    7,    8,    9,   10],
    ... [11,   12,   13,   14,   15],
    ... [16,   17,   18,   19,   20],
    ... [21,   22,   23,   24,   25],
    ... [26,    8,    9,   29,   30],
    ... [31,   13,   14,   15,   35],
    ... [17,   18,   19,   39,   40],
    ... [41,   23,   24,   44,   45],
    ... [46,   47,   48,   49,   50]]
    >>> seen = set()
    >>> dupes = {}
    >>> for i_index, i in enumerate(l):
    ...     for j_index, j in enumerate(i):
    ...         if j in seen:
    ...             dupes[(i_index, j_index)] = j
    ...         seen.add(j)
    ...
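    >>> # Python 3: dict.items() replaces iteritems(), print is a function,
    >>> # and dicts keep insertion order, so results come out in scan order.
    >>> for coord, num in dupes.items():
    ...     print("%s: %s" % (coord, num))
    ...
    (5, 1): 8
    (5, 2): 9
    (6, 1): 13
    (6, 2): 14
    (6, 3): 15
    (7, 0): 17
    (7, 1): 18
    (7, 2): 19
    (8, 1): 23
    (8, 2): 24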
    

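    【Discussion】:

    • Does this work if a value is duplicated at several coordinates?
    • @RyanSaxe The dictionary keys are coordinate pairs, so you could fill the whole array with the same value and every coordinate pair after the first would be saved as a separate item in the dupes dictionary.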
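
The point made in the reply above can be checked with a small sketch. The function name `find_dupes` and the 2x3 test grid are assumptions made for illustration; the loop itself mirrors the answer's code.

```python
def find_dupes(grid):
    """Return {(row, col): value} for every cell whose value was seen earlier."""
    seen = set()
    dupes = {}
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value in seen:
                # Keyed by coordinate, so repeated values never overwrite each other.
                dupes[(r, c)] = value
            seen.add(value)
    return dupes

# Hypothetical grid filled with a single value.
same = [[7, 7, 7],
        [7, 7, 7]]
result = find_dupes(same)
print(result)
```

Only the first cell, (0, 0), is absent from the result; each later cell gets its own entry.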
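    【Solution 2】:

    Keep a collections.Counter of all previous entries. As you iterate over the array, check whether each element is already in the counter; if it is, append its coordinates to a list and continue. If not, increment the counter for that number.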
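
This answer gives no code, so here is a minimal sketch of how it might look. The function name `find_repeats` is hypothetical, and as a small variation on the description the counter is incremented for every cell (not only unseen ones), so that it ends up holding total occurrence counts.

```python
from collections import Counter

def find_repeats(grid):
    """Return (coordinates of repeated cells, Counter of all values)."""
    counts = Counter()
    repeats = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            # A Counter returns 0 for missing keys, so this is False on first sight.
            if counts[value]:
                repeats.append((r, c))
            counts[value] += 1
    return repeats, counts

grid = [[1, 2, 3, 4, 5],
        [6, 7, 8, 9, 10],
        [11, 12, 13, 14, 15],
        [16, 17, 18, 19, 20],
        [21, 22, 23, 24, 25],
        [26, 8, 9, 29, 30],
        [31, 13, 14, 15, 35],
        [17, 18, 19, 39, 40],
        [41, 23, 24, 44, 45],
        [46, 47, 48, 49, 50]]

repeats, counts = find_repeats(grid)
print(repeats)
```

A plain set would be enough to detect repeats; the Counter is only worth it if the occurrence counts are needed afterwards.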

    【Discussion】:

      【Solution 3】:

      Use a dict whose keys are the numbers and whose values are lists of their coordinates.

      In [171]: lis
      Out[171]: 
      [[1, 2, 3, 4, 5],
       [6, 7, 8, 9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25],
       [26, 8, 9, 29, 30],
       [31, 13, 14, 15, 35],
       [17, 18, 19, 39, 40],
       [41, 23, 24, 44, 45],
       [46, 47, 48, 49, 50]]
      
      In [172]: from collections import defaultdict
      
      In [173]: dic=defaultdict(list)
      
      In [174]: for i,x in enumerate(lis):
          for j,y in enumerate(x):
              dic[y].append((i,j))
         .....:         
      
      In [175]: for num,coords in dic.items():
          if len(coords)>1:
              print("{0} was repeated at coordinates {1}".format(num,
                                                   " ".join(str(x) for x in coords)))
         .....:         
      8 was repeated at coordinates (1, 2) (5, 1)
      9 was repeated at coordinates (1, 3) (5, 2)
      13 was repeated at coordinates (2, 2) (6, 1)
      14 was repeated at coordinates (2, 3) (6, 2)
      15 was repeated at coordinates (2, 4) (6, 3)
      17 was repeated at coordinates (3, 1) (7, 0)
      18 was repeated at coordinates (3, 2) (7, 1)
      19 was repeated at coordinates (3, 3) (7, 2)
      23 was repeated at coordinates (4, 2) (8, 1)
      24 was repeated at coordinates (4, 3) (8, 2)
      

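      【Discussion】:

        【Solution 4】:

        If I understand your question correctly, it needs to look not only for single repeated values but for any run of values. That is, given [1,2,3,4], it would find the copy of [2,3,4] in [39,87,2,3,4].

        Imports and test values: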

        import itertools,pprint
        from collections import defaultdict
        l = ((1, 2, 3, 4, 5),
         (6, 7, 8, 9, 10),
         (11, 12, 13, 14, 15),
         (16, 17, 18, 19, 20),
         (21, 22, 23, 24, 25),
         (26, 8, 9, 29, 30),
         (31, 13, 14, 15, 35),
         (17, 18, 19, 39, 40),
         (41, 23, 24, 44, 45),
         (46, 47, 48, 49, 50))
        

        Main code:

        seen = defaultdict(dict)
        for y, row in enumerate(l):
            rowlen = len(row)
            # values[e] holds every contiguous slice of length e+1 in this row
            values = [[row[i:k+1] for (i, k) in zip(range(rowlen), range(e, rowlen))] for e in range(rowlen)]
            for valueGroup in values:
                for x, value in enumerate(valueGroup):
                    seen[value]['count'] = seen[value].get('count', 0) + 1
                    # store the start column per row; a single shared 'x-coOrd'
                    # would be overwritten by the last row containing the slice
                    seen[("R", y)][value] = x

        for y in range(len(l)):
            for value, x in seen[("R", y)].items():
                if seen[value]['count'] > 1:
                    print("{0} repeated at ({1},{2})".format(value, x, y))
        

        Sample output (there is more):

        (13, 14) repeated at (1,6)
        (14, 15) repeated at (2,6)
        (13,) repeated at (1,6)
        (13, 14, 15) repeated at (1,6)
        (14,) repeated at (2,6)
        (17, 18) repeated at (0,7)
        (18, 19) repeated at (1,7)
        (17,) repeated at (0,7)
        (18,) repeated at (1,7)
        (19,) repeated at (2,7)
        (17, 18, 19) repeated at (0,7)
        (23,) repeated at (1,8)
        (24,) repeated at (2,8)
        (23, 24) repeated at (1,8)
        

        The list-comprehension logic was worked out from this example:

         l = [1,2,3,4]
         len = 4
         i:k+1
         0:1 1:2 2:3 3:4  i = 0,1,2,len-e-1  k = e,e+1,e+2,e+3    e = 0
         0:2 1:3 2:4      i = 0,1,len-e-1    k = e,e+1,e+2        e = 1
         0:3 1:4          i = 0,len-e-1      k = e,e+1            e = 2
         0:4              i = len-e-1        k = e                e = 3
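
The table above can be reproduced with a short sketch. The helper name `contiguous_slices` is an assumption for illustration; the `zip(range(n), range(e, n))` pairing is the same one used in the answer's list comprehension.

```python
def contiguous_slices(row):
    """Return [(start, slice)] for every contiguous slice of row, shortest first."""
    n = len(row)
    out = []
    for e in range(n):                      # e + 1 is the slice length
        # i walks the start index, k the inclusive end index (k = i + e)
        for i, k in zip(range(n), range(e, n)):
            out.append((i, tuple(row[i:k + 1])))
    return out

slices = contiguous_slices([1, 2, 3, 4])
for start, s in slices:
    print(start, s)
```

For a row of length n this yields n*(n+1)/2 slices, so the approach grows quadratically with row length.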
        

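        This approach differs from the other answers in that it checks both single numbers and sequences of numbers, and reports both sides of each match.

        【Discussion】: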
