Fastest way to concatenate large numpy arrays: answers

【Question title】: Fastest way to concatenate large numpy arrays
【Posted】: 2022-01-09 21:50:03
【Question description】:

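I am doing some optical flow analysis. The goal is to iterate over every frame in a long movie, compute the dense optical flow, and append the resulting angles and magnitudes to a growing numpy array. I find that each successive loop iteration takes longer and longer to complete, and I don't know why. Here is a simple example loop that reproduces the problem: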

import time
import numpy as np

arraySize = (1, 256, 256)          # correct array size
emptyArray = np.zeros(arraySize)   # empty array to fill with angles from every image pair
timeElapsed = []                   # empty list to fill with time values

for i in range(100):               # iterates through the frames in the image stack
    start = time.time()            # start the time
    newArray = np.zeros(arraySize) # makes an example new array
    emptyArray = np.concatenate((emptyArray, newArray)) # concats new and growing arrays
    end = time.time()              # stop the time
    timeElapsed.append(end-start)  # append the total time for the loop to the growing list

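If I then plot the time taken by each iteration, it increases linearly with every pass through the loop. In this example that is still tolerable, but for my real dataset it is not.

My guess is that larger arrays take more time to process, but I don't know what to do to avoid this. Is there a better, faster, or more Pythonic way to do this?

------------- Edit -------------

Following mathfux's suggestion, I modified the loop as follows: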

arraySize = (1, 256, 256)          # correct array size
emptyArray = np.concatenate([np.zeros(arraySize) for i in range(100)])   # preallocates the full (100, 256, 256) array once
timeElapsed = []                   # empty list to fill with time values

for i in range(100):               # iterates through the frames in the image stack
    start = time.time()            # start the time
    newArray = np.zeros(arraySize) # makes an example new array
    emptyArray[i] = newArray[0]    # overwrites empty array with newarray values at the relevant position
    end = time.time()              # stop the time
    timeElapsed.append(end-start)  # append the total time for the loop to the growing list

The time per loop iteration is now very consistent.

Thanks!

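【Question comments】:

If your data contains only zeros, np.zeros((100, 256, 256), dtype=int) should be enough. The most efficient way is to flatten the data and then reshape it.
Would specifying the dtype improve memory efficiency? If I understand correctly, I can't specify "int" because I will be replacing these values with floats.
IMO it would be better to edit mathfux's answer with this update rather than adding it to the question.
I didn't know I could edit mathfux's answer with this update; when I click "edit", it says the suggested edit queue is full.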
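
On the dtype question raised in the comments, a minimal sketch may help. It assumes the array sizes from the question's example and assumes that float32 precision would be acceptable for the angles, which the thread does not confirm. It shows the memory footprint of a single preallocated buffer for two float dtypes, and what happens when a float is written into an integer buffer:

```python
import numpy as np

# Sizes taken from the question's example: 100 frames of 256 x 256.
n_frames, height, width = 100, 256, 256

# One allocation up front. float64 is NumPy's default; float32 is an
# assumption here and halves the memory if the precision is sufficient.
angles64 = np.zeros((n_frames, height, width), dtype=np.float64)
angles32 = np.zeros((n_frames, height, width), dtype=np.float32)

print(angles64.nbytes)  # 52428800 bytes (50 MiB)
print(angles32.nbytes)  # 26214400 bytes (25 MiB)

# An integer buffer silently truncates float values assigned into it,
# so dtype=int is unsuitable when the results are floats.
as_int = np.zeros(3, dtype=int)
as_int[0] = 2.7
print(as_int[0])  # 2
```

The dtype therefore affects memory use, not the growth pattern: the per-iteration slowdown comes from repeated copying, which preallocation avoids regardless of dtype.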

Tags: python numpy concatenation numpy-ndarray

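【Solution 1】:

Using an accelerator can speed up your code by taking advantage of a GPU or TPU. For example, with the jax library, your code might run about 1000 times faster than the other answers on a Google Colab TPU (roughly 40 to 50 µs per loop):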

from jax import jit

@jit
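def zac():
    # Note: under jit, this Python body (including the time.time() calls) runs only while
    # the function is traced on the first call; later calls reuse the compiled result.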
    arraySize = (1, 256, 256)          # correct array size
    emptyArray = np.zeros(arraySize)   # empty array to fill with angles from every image pair
    timeElapsed = []                   # empty list to fill with time values

    for i in range(100):               # iterates through the frames in the image stack
        start = time.time()            # start the time
        newArray = np.zeros(arraySize) # makes an example new array
        emptyArray = np.concatenate((emptyArray, newArray)) # concats new and growing arrays
        end = time.time()              # stop the time
        timeElapsed.append(end-start)  # append the total time for the loop to the growing list

Timing it with %timeit -n10000 zac() gives:

10000 loops, best of 5: 47.7 µs per loop

【Comments】:

【Solution 2】:

This approach appears to be about 28 times faster on my machine:

start = time.time()                    # start the time
arrays = []
for i in range(100):                   # iterates through the frames in the image stack
    arrays.append(np.zeros(arraySize))

# Concatenate everything in one call
newArray = np.concatenate(arrays)
end = time.time()                      # stop the time
timeElapsed2 = end - start

print("Elapsed:", timeElapsed2)

print("sum elapsed times of first method:", np.sum(timeElapsed))

Elapsed: 0.021436214447021484

sum elapsed times of first method: 0.6163454055786133

【Comments】:

Awesome! I ran this and mathfux's answer 10,000 times each, and they are essentially the same: mathfux = 13.58 ms, yours = 13.63 ms.
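
To reproduce a comparison like this, a small benchmark along the following lines could be used. It is only a sketch: make_frame is a hypothetical stand-in for the real optical-flow result, the function names are illustrative, and the timings will differ from machine to machine.

```python
import timeit
import numpy as np

FRAME_SHAPE = (256, 256)
N_FRAMES = 100

def make_frame(i):
    # Hypothetical stand-in for the optical-flow result of frame i.
    return np.full(FRAME_SHAPE, float(i))

def grow_by_concatenate():
    # Pattern from the question: every iteration copies all frames so far.
    out = np.empty((0, *FRAME_SHAPE))
    for i in range(N_FRAMES):
        out = np.concatenate((out, make_frame(i)[np.newaxis]))
    return out

def collect_then_concatenate():
    # Pattern from this answer: cheap list appends, one concatenate at the end.
    frames = [make_frame(i)[np.newaxis] for i in range(N_FRAMES)]
    return np.concatenate(frames)

def preallocate_and_fill():
    # Pattern from the question's edit: allocate once, write in place.
    out = np.empty((N_FRAMES, *FRAME_SHAPE))
    for i in range(N_FRAMES):
        out[i] = make_frame(i)
    return out

for fn in (grow_by_concatenate, collect_then_concatenate, preallocate_and_fill):
    seconds = timeit.timeit(fn, number=3) / 3
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per run")
```

All three functions build the same (100, 256, 256) array; only the amount of copying differs.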

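【Solution 3】:

Every time you append a new array, new memory is allocated for a larger array and all of the existing data is copied into it. This is very expensive. A better solution is to allocate memory of the required size once and then write your data with a single np.concatenate call: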

np.concatenate([np.zeros(arraySize) for i in range(100)])
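
If the number of frames is not known in advance, allocating once is not directly possible. One common workaround, sketched below, is to double the buffer capacity whenever it fills up so that copies happen only occasionally. FrameBuffer is a hypothetical helper written for illustration; it is not part of NumPy and not something the answer proposes.

```python
import numpy as np

class FrameBuffer:
    """Hypothetical helper: appends frames with only occasional copying."""

    def __init__(self, frame_shape, dtype=np.float64, capacity=16):
        # capacity is an illustrative starting size and must be at least 1.
        self._data = np.empty((capacity, *frame_shape), dtype=dtype)
        self._size = 0

    def append(self, frame):
        if self._size == self._data.shape[0]:
            # Buffer is full: double the capacity and copy existing frames once.
            bigger = np.empty(
                (2 * self._data.shape[0], *self._data.shape[1:]),
                dtype=self._data.dtype,
            )
            bigger[: self._size] = self._data
            self._data = bigger
        self._data[self._size] = frame
        self._size += 1

    def view(self):
        # Returns only the filled part; this is a view, not a copy.
        return self._data[: self._size]

buf = FrameBuffer((256, 256))
for i in range(100):
    buf.append(np.full((256, 256), float(i)))

print(buf.view().shape)  # (100, 256, 256)
```

With doubling, 100 appends starting from a capacity of 16 trigger only three reallocations (at 16, 32, and 64 frames) instead of one per frame.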

【Comments】: