【Question title】: Identify similar numbers from several lists
【Posted】: 2020-08-29 23:59:25
【Question】:

I have 3 lists:

r=[0.611695403733703, 0.833193902333201, 1.09120811998494]
g=[0.300675698437847, 0.612539072191236, 1.18046695352626]
b=[0.00668849762984564, 0.611946522017357, 1.16778502636141]

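I want to compute the average of the most similar numbers. In the example above, r[0], g[1] and b[1] are very similar (roughly 0.61...). How can I identify this pattern?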

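【Question comments】:

  • There is no numpy in this question, so why is it tagged that way? Or is a numpy answer expected?
  • If numpy gives a more concise solution, that is fine; the lists can be converted with x=np.array(x)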

Tags: python-3.x list numpy pattern-matching


【Solution 1】:

Brute force using list comprehensions:

r=[0.611695403733703, 0.833193902333201, 1.09120811998494]
g=[0.300675698437847, 0.612539072191236, 1.18046695352626]
b=[0.00668849762984564, 0.611946522017357, 1.16778502636141]


rg = [ (idx_r, idx_g,r,g) if abs(rr-gg) < 0.001 else None 
      for idx_r,rr in enumerate(r) 
      for idx_g, gg in enumerate(g)]

rb = [ (idx_r, idx_b,r,b) if abs(rr-bb) < 0.001 else None 
      for idx_r,rr in enumerate(r) 
      for idx_b, bb in enumerate(b)]

gb = [ (idx_g, idx_b,g,b) if abs(gg-bb) < 0.001 else None 
      for idx_g,gg in enumerate(g) 
      for idx_b, bb in enumerate(b)]

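# filter() returns a lazy iterator in Python 3, so wrap it in list() to print the matches
print(list(filter(None, rg + rb + gb)))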

Output:

[(0, 1, [0.611695403733703, 0.833193902333201, 1.09120811998494], 
        [0.300675698437847, 0.612539072191236, 1.18046695352626]), 
 (0, 1, [0.611695403733703, 0.833193902333201, 1.09120811998494], 
        [0.00668849762984564, 0.611946522017357, 1.16778502636141]), 
 (1, 1, [0.300675698437847, 0.612539072191236, 1.18046695352626], 
        [0.00668849762984564, 0.611946522017357, 1.16778502636141])]

Each tuple in the output holds the index into the first list, the index into the second list, and the two lists that were compared.
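The pairwise matches above do not yet produce the average the question asks for. A minimal sketch of one way to get there, assuming "most similar" means the one-value-per-list combination with the smallest spread (max minus min); the helper name `closest_combination` is hypothetical and not part of the original answer:

```python
from itertools import product
from statistics import mean

r = [0.611695403733703, 0.833193902333201, 1.09120811998494]
g = [0.300675698437847, 0.612539072191236, 1.18046695352626]
b = [0.00668849762984564, 0.611946522017357, 1.16778502636141]


def closest_combination(*lists):
    """Pick one value from each list so that max - min is smallest (hypothetical helper)."""
    best = min(
        # every combination of (index, value) pairs, one pair per list
        product(*(enumerate(lst) for lst in lists)),
        key=lambda combo: max(v for _, v in combo) - min(v for _, v in combo),
    )
    indices = tuple(i for i, _ in best)
    values = tuple(v for _, v in best)
    return indices, values


indices, values = closest_combination(r, g, b)
average = mean(values)
print(indices, average)  # indices (0, 1, 1), average roughly 0.612
```

This is still brute force: it checks every combination, so the cost grows with the product of the list lengths. That is fine for three short lists but not for large inputs.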

【Comments】:

    【Solution 2】:

    You are computing the distance between every pair of points. A good way to do that is scipy.spatial.distance.cdist:

    from scipy.spatial.distance import cdist
    import numpy as np
    
    r=[0.611695403733703, 0.833193902333201, 1.09120811998494]
    g=[0.300675698437847, 0.612539072191236, 1.18046695352626]
    b=[0.00668849762984564, 0.611946522017357, 1.16778502636141]
    
    arr = np.array([r,g,b])
    # need 2d set of points
    arr_flat = arr.ravel()[:, np.newaxis]
    
    # computes distance between every point, pairwise
    dists = cdist(arr_flat, arr_flat)
    # (1,2) is the same as (2,1), so only consider each pair once
    # ie. use upper triangle
    dists = np.triu(dists)
    # set 0 values to inf so we don't consider them
    dists[dists == 0] = np.inf
    
    # get all pairs that are below this threshold level
    thold = 0.01
    coords = np.nonzero(dists<thold)
    
    labels = 'rgb'
    print(f'Pairs of points closer than {thold}:')
    for i, j in zip(*coords):
        print(labels[i//3] + f'[{i%3}]', labels[j//3] + f'[{j%3}]')
    
    >>> Pairs of points closer than 0.01:
        r[0] g[1]
        r[0] b[1]
        g[1] b[1]
    
    # can easily count the number of points as
    np.count_nonzero(dists<thold)
    >>> 3
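To get from the close pairs to the average the question asks for, the flat indices of every point that takes part in a close pair can be collected and averaged. The following is a sketch using NumPy broadcasting instead of cdist, with the same 0.01 threshold. It assumes all close pairs belong to a single cluster, as they do for this data; with several clusters the points would have to be grouped first.

```python
import numpy as np

r = [0.611695403733703, 0.833193902333201, 1.09120811998494]
g = [0.300675698437847, 0.612539072191236, 1.18046695352626]
b = [0.00668849762984564, 0.611946522017357, 1.16778502636141]

thold = 0.01
values = np.array([r, g, b]).ravel()

# absolute difference between every pair of values, via broadcasting
diffs = np.abs(values[:, np.newaxis] - values[np.newaxis, :])

# k=1 keeps only the strict upper triangle: each pair once, no self-pairs
i, j = np.nonzero(np.triu(diffs < thold, k=1))

# flat indices of all points that appear in at least one close pair
close = np.unique(np.concatenate([i, j]))
average = values[close].mean()

labels = 'rgb'
print([f'{labels[k // 3]}[{k % 3}]' for k in close], average)
```

For this data the selected points are r[0], g[1] and b[1], and their mean is roughly 0.612.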
    

    【Comments】:
