【Question Title】: Python code to return total count of no. of positions in which items are differing at same index
【Posted】: 2016-10-18 03:01:14
【Question】:

A=[1,2,3,4,5,6,7,8,9]
B=[1,2,3,7,4,6,5,8,9]

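I have to compare these two lists and return the number of positions at which the items differ, using one line of Python code.

For example: the output for the given arrays should be 4, since the items differ at indices (3,4,5,6). So the program should return 4.

My way of doing this is to compare each position with a for loop: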

count=0
for i in range(0,len(A)):
   if(A[i]==B[i]):
     continue
   else:
     count+=1
print(count)

Please help me write this as one line of Python code.

【Question Comments】:

  • @StevenRumbalski It should be a != b
  • Correction: sum(a != b for a, b in zip(A, B)) (thanks @acw1668.)

Tags: python arrays python-2.7 python-3.x numpy


【Solution 1】:
count = sum(a != b for a, b in zip(A, B))
print(count)

Or simply print(sum(a != b for a, b in zip(A, B)))
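
To see why the one-liner works, here is a minimal sketch using the lists exactly as they are written in the question. The `differing` list is an extra added here for illustration only; it is not part of the original answer. Note that for these lists the count comes out as 3, because index 5 holds 6 in both lists.

```python
A = [1, 2, 3, 4, 5, 6, 7, 8, 9]
B = [1, 2, 3, 7, 4, 6, 5, 8, 9]

# zip pairs the items by index; each comparison yields True or False,
# and bool is a subclass of int, so sum() counts the True values.
count = sum(a != b for a, b in zip(A, B))

# Illustrative extra: collect the indices at which the items differ.
differing = [i for i, (a, b) in enumerate(zip(A, B)) if a != b]

print(count)      # 3
print(differing)  # [3, 4, 6]
```

Keep in mind that zip stops at the shorter input, so any trailing items of a longer list are silently ignored rather than counted as differences.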

You can read up on zip/lambda/map here; these tools are very powerful and important in Python.

Here you can also look at other ways of using these tools.

Have fun!

    【Solution 2】:

    There are many ways to do this. If you use numpy, you can use np.count_nonzero:

    >>> import numpy as np
    >>> a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
    >>> b = np.array([1, 2, 3, 7, 4, 6, 5, 8, 9])
    >>> a != b
    array([False, False, False,  True,  True, False,  True, False, False], dtype=bool)
    >>> np.count_nonzero(a != b)
    3
    

    Note that a != b returns an array of True and False values, depending on how the condition evaluates at each index.
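
As a small script-form sketch of the same idea, the boolean mask can also be used to recover which positions differ. The use of np.flatnonzero is an addition for illustration, not something the answer itself relies on, and both arrays are assumed to have the same length.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
b = np.array([1, 2, 3, 7, 4, 6, 5, 8, 9])

mask = a != b                          # element-wise boolean array
count = int(np.count_nonzero(mask))    # number of True entries
indices = np.flatnonzero(mask)         # positions of the True entries

print(count)             # 3
print(indices.tolist())  # [3, 4, 6]
```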

    Here is a speed comparison:

    >>> %timeit np.count_nonzero(a != b)
    The slowest run took 40.59 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 752 ns per loop
    
    >>> %timeit sum(i != j for i, j in zip(a, b))
    The slowest run took 5.86 times longer than the fastest. This could mean that an intermediate result is being cached.
    100000 loops, best of 3: 18.5 µs per loop
    

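    Caching obscures the timings, but 40.59 * 0.752 = 30.52 µs while 5.86 * 18.5 = 108.41 µs, so numpy's slowest run still appears to be considerably faster than pure Python's slowest run.

    This is clearer with larger arrays: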

    >>> n = 10000
    >>> a = np.arange(n)
    >>> b = np.arange(n)
    >>> k = 50
    >>> ids = np.random.randint(0, n, k)
    >>> a[ids] = 0
    >>> ids = np.random.randint(0, n, k)
    >>> b[ids] = 0
    >>> %timeit np.count_nonzero(a != b)
    The slowest run took 20.50 times longer than the fastest. This could mean that an intermediate result is being cached.
    100000 loops, best of 3: 11.5 µs per loop
    >>> %timeit sum(i != j for i, j in zip(a, b))
    100 loops, best of 3: 15.6 ms per loop
    

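    The difference is much more pronounced: numpy takes at most about 235 microseconds (20.50 * 11.5 µs), while pure Python takes 15.6 milliseconds on average.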
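
The timings above come from IPython's %timeit. For readers outside IPython, a rough equivalent can be sketched with the standard-library timeit module. The function names, the fixed seed, and the repeat counts below are choices made for this sketch only, and the absolute numbers will vary by machine and numpy version.

```python
import timeit

import numpy as np

n = 10000
k = 50
np.random.seed(0)  # fixed seed, assumed here only to make runs repeatable

a = np.arange(n)
b = np.arange(n)
a[np.random.randint(0, n, k)] = 0
b[np.random.randint(0, n, k)] = 0


def count_numpy():
    # Vectorized comparison, then count the True entries.
    return int(np.count_nonzero(a != b))


def count_python():
    # Element-by-element comparison driven by the Python interpreter.
    return sum(i != j for i, j in zip(a, b))


# Both approaches should agree before the timings mean anything.
assert count_numpy() == count_python()

# timeit.repeat returns the total seconds for each run of `number` calls;
# dividing the best run by `number` gives a per-call time.
for name, fn in [("numpy", count_numpy), ("pure Python", count_python)]:
    best = min(timeit.repeat(fn, number=20, repeat=3)) / 20
    print(f"{name}: {best * 1e6:.1f} µs per call")
```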
