【Question Title】:Convert an array of peaks to a series of steps that represent the most recent peak value
【Posted】:2018-10-01 21:48:20
【Question】:

Given an array of peaks like this:

peaks = [0, 5, 0, 3, 2, 0, 1, 7, 0]

how can I create an array of steps that holds the most recent peak value, like this:

steps = [0, 5, 5, 3, 3, 3, 3, 7, 7]

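Requirements:

  • This will be used for image analysis on large 3D images (1000**3), so it needs to be fast, meaning no for-loops or list comprehensions... only numpy vectorization.
  • The example I gave above is a linear list, but this needs to work equally well on ND images. That means performing the operation along a single axis, while allowing for multiple axes.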

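Note

I recently asked a question that turned out to be a dupe (easily solved with scipy.maximum.accumulate), but that question also contained an optional "would be nice if" twist, as outlined above. It turns out that I actually need that second behavior too, so I am reposting just this part.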
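For contrast, here is a minimal sketch of what the running maximum gives on the same input. It uses NumPy's `np.maximum.accumulate`; the variable names are only for illustration. The running maximum never drops back down, which is why it does not produce the `steps` array asked for here.

```python
import numpy as np

peaks = np.array([0, 5, 0, 3, 2, 0, 1, 7, 0])

# Running maximum: once 5 has been seen, the later peaks 3 and 1 are hidden.
running_max = np.maximum.accumulate(peaks)
print(running_max)  # [0 5 5 5 5 5 5 7 7]

# The wanted result instead follows the most recent peak, even if it is lower.
wanted = np.array([0, 5, 5, 3, 3, 3, 3, 7, 7])
print(np.array_equal(running_max, wanted))  # False

# The accumulate method also takes an axis argument for ND input.
image = np.array([[0, 5, 0],
                  [3, 2, 4]])
row_max = np.maximum.accumulate(image, axis=1)
print(row_max)
```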

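【Comments】:

  • If it has to be fast, C is a better choice than Python. Fortran is arguably better than either.
  • @MadPhysicist Yes, you are correct, and +1 for giving props to Fortran... but there is also the matter of "fast to code". Besides, the speed of a good numpy implementation always surprises me.
  • Numpy is mostly coded in C and is designed for speed, so that is not surprising. Interfacing with Python still incurs a good deal of overhead, though.
  • Is Numba or Cython not an option for you? That would very likely be faster than any vectorized approach, and easier to code.
  • @max9111 "and easier to code" is debatable. ;-)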
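As a rough illustration of the loop-based route suggested in the comments, the sketch below writes the rule as an explicit loop that Numba could compile. It is not taken from the thread. The helper names `_steps_rows` and `steps_loop` are hypothetical, and the peak rule (a rise followed, after any plateau, by a fall or by the end of the line) is an assumption based on the expected output in the question. Numba is treated as optional: if it is not installed, the same loop runs as plain Python, only much more slowly.

```python
import numpy as np

try:
    from numba import njit  # optional; assumed to be available for speed
except ImportError:
    def njit(func):
        # Fallback: run the undecorated pure-Python loop.
        return func


@njit
def _steps_rows(rows):
    # rows is 2D: each row is one line along the chosen axis.
    n_rows, n = rows.shape
    out = np.empty_like(rows)
    for r in range(n_rows):
        current = rows[r, 0]  # the first value starts the first step
        for i in range(n):
            if i > 0 and rows[r, i] > rows[r, i - 1]:
                # We just went up: skip over a possible plateau.
                j = i + 1
                while j < n and rows[r, j] == rows[r, i]:
                    j += 1
                # It is a peak if the next change is down, or the line ends.
                if j == n or rows[r, j] < rows[r, i]:
                    current = rows[r, i]
            out[r, i] = current
    return out


def steps_loop(a, axis=-1):
    a = np.asarray(a)
    moved = np.moveaxis(a, axis, -1)
    # Flatten all other axes so the kernel only sees 2D input.
    rows = np.ascontiguousarray(moved).reshape(-1, moved.shape[-1])
    out = _steps_rows(rows).reshape(moved.shape)
    return np.moveaxis(out, -1, axis)


print(steps_loop([0, 5, 0, 3, 2, 0, 1, 7, 0]))  # [0 5 5 3 3 3 3 7 7]
```

Moving the working axis to the end and flattening the rest keeps the compiled kernel two-dimensional, at the cost of one copy when the input is not already contiguous along that axis.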

Tags: numpy


【Solution 1】:

Here is a solution that handles ND input and detects "wide peaks" such as ..., 0, 4, 4, 4, 3, ... but not ..., 0, 4, 4, 4, 7, ...

import numpy as np
import operator as op

def keep_peaks(A, axis=-1):
    B = np.swapaxes(A, axis, -1)
    # take differences between consecutive elements along axis
    # pad with -1 at the start and the end
    # the most efficient way is to allocate first, because otherwise
    # padding would involve reallocation and a copy
    # note that in order to avoid that copy we use np.subtract and its
    # out kwd
    updown = np.empty((*B.shape[:-1], B.shape[-1]+1), B.dtype)
    updown[..., 0], updown[..., -1] = -1, -1
    np.subtract(B[..., 1:], B[..., :-1], out=updown[..., 1:-1])
    # extract indices where there is a change along axis
    chnidx = np.where(updown)
    # get the values of the changes
    chng = updown[chnidx]
    # find indices of indices 1) where we go up and 2) the next change is
    # down (note how the padded -1's at the end are useful here)
    # also include the beginning of each 1D subarray
    pkidx, = np.where((chng[:-1] > 0) & (chng[1:] < 0) | (chnidx[-1][:-1] == 0))
    # use indices of indices to retain only peak indices
    pkidx = (*map(op.itemgetter(pkidx), chnidx),)
    # construct array of changes of the result along axis
    # these will be zero everywhere
    out = np.zeros_like(A)
    aux = out.swapaxes(axis, -1)
    # except where there is a new peak
    # at these positions we need to put the differences of peak levels
    aux[(*map(op.itemgetter(slice(1, None)), pkidx),)] = np.diff(B[pkidx])
    # we could ravel the array and do the cumsum on that, but raveling
    # a potentially noncontiguous array is expensive
    # instead we keep the shape, at the cost of having to replace the
    # value at the beginning of each 1D subarray (we do not need the
    # "line-jump" difference but the plain 1st value there)
    aux[..., 0] = B[..., 0]
    # finally, use cumsum to go from differences to plain values
    return out.cumsum(axis=axis)

peaks = [0, 5, 0, 3, 2, 0, 1, 7, 0]

print(peaks)
print(keep_peaks(peaks))

# show off axis kwd and broad peak detection
peaks3d = np.kron(np.random.randint(0, 10, (3, 6, 3)), np.ones((1, 2, 1), int))

print(peaks3d.swapaxes(1, 2))
print(keep_peaks(peaks3d, 1).swapaxes(1, 2))

Sample run:

[0, 5, 0, 3, 2, 0, 1, 7, 0]
[0 5 5 3 3 3 3 7 7]
[[[5 5 3 3 1 1 4 4 9 9 7 7]
  [2 2 9 9 3 3 4 4 3 3 7 7]
  [9 9 0 0 2 2 5 5 7 7 9 9]]

 [[1 1 3 3 9 9 3 3 7 7 0 0]
  [1 1 1 1 4 4 5 5 0 0 3 3]
  [5 5 5 5 8 8 1 1 2 2 7 7]]

 [[6 6 3 3 8 8 2 2 3 3 2 2]
  [6 6 9 9 3 3 9 9 3 3 9 9]
  [1 1 5 5 7 7 2 2 7 7 1 1]]]
[[[5 5 5 5 5 5 5 5 9 9 9 9]
  [2 2 9 9 9 9 4 4 4 4 7 7]
  [9 9 9 9 9 9 9 9 9 9 9 9]]

 [[1 1 1 1 9 9 9 9 7 7 7 7]
  [1 1 1 1 1 1 5 5 5 5 3 3]
  [5 5 5 5 8 8 8 8 8 8 7 7]]

 [[6 6 6 6 8 8 8 8 3 3 3 3]
  [6 6 9 9 9 9 9 9 9 9 9 9]
  [1 1 1 1 7 7 7 7 7 7 7 7]]]
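To see why a wide peak such as 0, 4, 4, 4, 3 is detected while 0, 4, 4, 4, 7 is not, the core test of the answer can be reduced to one dimension. The sketch below is a simplified restatement, not the answer's code: the helper name `peak_positions_1d` is hypothetical, and it leaves out the rule that always keeps the first element of each line.

```python
import numpy as np


def peak_positions_1d(values):
    values = np.asarray(values)
    # Differences between neighbours, padded with -1 at both ends as in the answer.
    changes = np.concatenate(([-1], np.diff(values), [-1]))
    # Zero differences (plateaus) are dropped, so only real changes remain.
    idx = np.flatnonzero(changes)
    chng = changes[idx]
    # A peak is an upward change whose next nonzero change is downward.
    is_peak = (chng[:-1] > 0) & (chng[1:] < 0)
    return idx[:-1][is_peak]


print(peak_positions_1d([0, 4, 4, 4, 3]))  # [1]: the plateau of 4s starts a peak
print(peak_positions_1d([0, 4, 4, 4, 7]))  # [4]: only the final 7 is a peak
```

Because plateaus produce zero differences, they vanish from the list of changes, so the rise into the plateau is directly followed by whatever comes after it. The trailing -1 padding makes a rise at the very end of a line count as a peak.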

【Discussion】:

  • This is amazing, but in an "almost magic" kind of way... could you sprinkle some comments in there to explain how it works? :-)
  • @2cynykyl Done. Hope that helps.