Iterating over a numpy array without a for loop: answers

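【Question title】: Iterating without for loop in numpy array
【Posted】: 2017-05-05 19:37:51
【Question】:

I need to iterate over a numpy array with logic in which each value depends on elements of other arrays (and on the previously computed value). I wrote the code below to clarify my problem. Any suggestions for solving this without a for loop?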

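Code
import numpy as np

a = np.array(['a', 'b', 'a', 'a', 'b', 'a'])
b = np.array([150, 154, 147, 126, 148, 125])
c = np.zeros_like(b)
c[0] = 150
for i in range(1, c.size):
    if a[i] == "b":
        c[i] = c[i-1]  # carry the previous output forward
    else:
        c[i] = b[i]    # otherwise take the value from b
# c is now [150, 150, 147, 126, 126, 125]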

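【Comments】:

Tags: python numpy iterator vectorization

【Answer 1】:

Here's one approach that combines np.maximum.accumulate and np.where to build stepped indices that hold their value wherever a == "b"; indexing into b with those indices then gives the desired output.

Thus, the implementation would be -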

mask = a!="b"
idx = np.maximum.accumulate(np.where(mask,np.arange(mask.size),0))
out = b[idx]

Sample step-by-step run -

In [656]: # Inputs 
     ...: a = np.array(['a', 'b', 'a', 'a', 'b', 'a'])
     ...: b = np.array([150, 154, 147, 126, 148, 125])
     ...: 

In [657]: mask = a!="b"

In [658]: mask
Out[658]: array([ True, False,  True,  True, False,  True], dtype=bool)

# Crux of the implementation happens here:
In [696]: np.where(mask,np.arange(mask.size),0)
Out[696]: array([0, 0, 2, 3, 0, 5])

In [697]: np.maximum.accumulate(np.where(mask,np.arange(mask.size),0))
Out[697]: array([0, 0, 2, 3, 3, 5])  # Stepped indices "intervaled" at masked places

In [698]: idx = np.maximum.accumulate(np.where(mask,np.arange(mask.size),0))

In [699]: b[idx]
Out[699]: array([150, 150, 147, 126, 126, 125])
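To check the approach against the loop from the question, the two can be wrapped in functions and compared. This is only a minimal sketch: the names fill_loop and fill_vectorized are made up for illustration, and it assumes the first output is always b[0] (true in the question, where c[0] = 150 = b[0]).

```python
import numpy as np

def fill_loop(a, b):
    # Reference: the loop from the question, with c[0] taken from b[0] (assumption).
    c = np.zeros_like(b)
    c[0] = b[0]
    for i in range(1, c.size):
        c[i] = c[i - 1] if a[i] == "b" else b[i]
    return c

def fill_vectorized(a, b):
    # For each position, the index of the most recent non-"b" entry (0 if none yet).
    mask = a != "b"
    idx = np.maximum.accumulate(np.where(mask, np.arange(mask.size), 0))
    return b[idx]

a = np.array(['a', 'b', 'a', 'a', 'b', 'a'])
b = np.array([150, 154, 147, 126, 148, 125])
print(fill_loop(a, b).tolist())        # [150, 150, 147, 126, 126, 125]
print(fill_vectorized(a, b).tolist())  # [150, 150, 147, 126, 126, 125]

# Runs of consecutive "b" entries are carried forward as well.
a2 = np.array(['a', 'b', 'b', 'a', 'b', 'b'])
print(fill_vectorized(a2, b).tolist())  # [150, 150, 150, 126, 126, 126]
```

Because the running maximum keeps the last "good" index, this handles runs of consecutive "b" values the same way the loop does.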

【Comments】:

【Answer 2】:

You can use a more vectorized approach, like so:

np.where(a == "b", np.roll(c, 1), b)

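np.where takes the element from np.roll(c, 1) where the condition is True and from b where it is False. np.roll(c, 1) "rolls" all elements of c forward by 1, so that each position refers to c[i-1].

These kinds of operations are what make numpy so valuable. Loops should be avoided where possible.

【Comments】:

This is a simple and neat solution. However, why does it return [150 0 147 126 0 125]? It doesn't pick up the value from the 'b' array where a[i] = "b".
I misread the question. This only works if you already have all the elements of c, but you are filling it in through the loop, so it is a bit more complicated than that.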
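The behaviour described in these comments can be reproduced with a short sketch. np.roll(c, 1) only sees whatever is already stored in c when the expression is evaluated, so positions that the loop would have filled in later are still zero. The variable names c_zero, out and out_zero are illustrative only.

```python
import numpy as np

a = np.array(['a', 'b', 'a', 'a', 'b', 'a'])
b = np.array([150, 154, 147, 126, 148, 125])

# c as initialised in the question, before the loop has run
c = np.zeros_like(b)
c[0] = 150
out = np.where(a == "b", np.roll(c, 1), b)
print(out.tolist())       # [150, 150, 147, 126, 0, 125]

# c left entirely at zero gives the output quoted in the comment
c_zero = np.zeros_like(b)
out_zero = np.where(a == "b", np.roll(c_zero, 1), b)
print(out_zero.tolist())  # [150, 0, 147, 126, 0, 125]
```

In both cases position 4 ends up as 0 rather than 126, because c[3] has not been computed yet when np.roll reads it.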

【Answer 3】:

If you don't need to wrap around at the array boundary, there is a very simple solution:

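import numpy as np

a = np.array(['a', 'b', 'a', 'a', 'b', 'a'])
b = np.array([150, 154, 147, 126, 148, 125])
c = b.copy()  # removes the need for the else case
# Where a[i] == 'b' (i >= 1), take the previous element. The boolean masks
# must match the length of the slices they index.
# Note: this looks only one element back, so a run of consecutive 'b's is not propagated.
c[1:][a[1:] == 'b'] = c[:-1][a[1:] == 'b']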

Or equivalently:

import numpy as np

a = np.array(['a', 'b', 'a', 'a', 'b', 'a'])
b = np.array([150, 154, 147, 126, 148, 125])
c = b.copy()  # removes the need for the else case
mask = a == 'b'
# mask[1:] has the same length as c[1:] and c[:-1]
c[1:][mask[1:]] = c[:-1][mask[1:]]

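If you do want to wrap around at the boundary (a[0]=='b'), it gets a little more complicated: you either need to use roll, or catch that case first with an if.

【Comments】: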
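One way to read the wrap-around case is that a 'b' in position 0 should take the last element of b. A minimal sketch of that reading with np.roll follows. The interpretation and the sample input are assumptions, not something the answer spells out, and like the slice-based version it looks only one element back, so a run of consecutive 'b's is not carried forward the way it is in the question's loop.

```python
import numpy as np

a = np.array(['b', 'a', 'b', 'a'])  # hypothetical input that starts with 'b'
b = np.array([1, 2, 3, 4])

mask = a == 'b'
# np.roll(b, 1) places b[i-1] at position i, with b[-1] wrapping to position 0
c = np.where(mask, np.roll(b, 1), b)
print(c.tolist())  # [4, 2, 2, 4]
```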