Fastest way to grow a numpy numeric array: answers

【Question title】: Fastest way to grow a numpy numeric array
【Posted】: 2021-11-11 02:47:26
【Question description】:

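Requirements:

I need to grow an arbitrarily large array from data.
I can guess the size (roughly 100-200), but there is no guarantee that the array will fit every time.
Once it has grown to its final size, I need to perform numeric computations on it, so I would prefer to end up with a 2-D numpy array.
Speed is critical. As an example, for one of 300 files, the update() method is called 45 million times (taking about 150 s) and the finalize() method is called 500k times (taking 106 s in total), for a total of about 250 s.

Here is my code: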

def __init__(self):
    self.data = []

def update(self, row):
    self.data.append(row)

def finalize(self):
    dx = np.array(self.data)

Other things I tried include the following code, but it is much slower.

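class A:
    def __init__(self):
        self.data = np.array([])

    def update(self, row):
        # np.append returns a new array, so the result must be reassigned
        self.data = np.append(self.data, row)

    def finalize(self):
        # integer division: the shape must consist of ints
        dx = np.reshape(self.data, (self.data.shape[0] // 5, 5))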

Here is a schematic of how this is called:

for i in range(500000):
    ax = A()
    for j in range(200):
         ax.update([1,2,3,4,5])
    ax.finalize()
    # some processing on ax

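【Question discussion】:

Does it need to be a numpy array before it is finished? If not, use a list of lists and convert it when you are done.
@AndrewJaffe Does a list of lists match numpy's memory efficiency?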
There's another method using list of numpy array and np.concatenate

Tags: python performance numpy

【Solution 1】:

I tried a few different things, with timing.

import numpy as np

The method you mention as slow: (32.094 seconds)

class A:

    def __init__(self):
        self.data = np.array([])

    def update(self, row):
        self.data = np.append(self.data, row)

    def finalize(self):
        return np.reshape(self.data, (self.data.shape[0] // 5, 5))

Regular ol' Python list: (0.308 seconds)

class B:

    def __init__(self):
        self.data = []

    def update(self, row):
        for r in row:
            self.data.append(r)

    def finalize(self):
        return np.reshape(self.data, (len(self.data) // 5, 5))

Trying to implement an arraylist in numpy: (0.362 seconds)

class C:

    def __init__(self):
        self.data = np.zeros((100,))
        self.capacity = 100
        self.size = 0

    def update(self, row):
        for r in row:
            self.add(r)

    def add(self, x):
        if self.size == self.capacity:
            self.capacity *= 4
            newdata = np.zeros((self.capacity,))
            newdata[:self.size] = self.data
            self.data = newdata

        self.data[self.size] = x
        self.size += 1

    def finalize(self):
        data = self.data[:self.size]
        return np.reshape(data, (len(data) // 5, 5))

And this is how I timed it:

x = C()
for i in range(100000):
    x.update([i])

So it looks like regular old Python lists are pretty good ;)

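【Discussion】:

I think a comparison with 60M update calls and 500K finalize calls would be clearer. In this example you don't seem to call finalize.
@fodon I actually did call finalize, once per run (so I guess it has little impact). But this makes me think I may have misunderstood how your data grows: if you get 60 million on update, I assumed that would provide at least 60 million data points for the next finalize?
@Owen 60M and 500K mean 60 million and 500 thousand calls to update and finalize respectively. See my revised timing, which tests a 100:1 ratio of update to finalize.
Note that the third option is better when you are running out of memory. The second option requires a lot of memory. The reason is that Python's lists are arrays of references to values, whereas NumPy's arrays are actual arrays of values.
In the second version you can replace the for loop in update with self.data.extend(row); I don't think there will be a performance difference, but it also looks nicer.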
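
The last two comments (extend instead of the inner loop, and the list's memory overhead) can be illustrated with a minimal sketch. The class name BExtend and the row count are hypothetical, and dtype=float is an assumption made here so that the byte count is predictable.

```python
import sys

import numpy as np


class BExtend:
    """Variant of class B that uses list.extend (hypothetical name)."""

    def __init__(self):
        self.data = []

    def update(self, row):
        # extend appends every element of row in a single call
        self.data.extend(row)

    def finalize(self):
        # -1 lets NumPy infer the number of rows from the flat length
        return np.array(self.data, dtype=float).reshape(-1, 5)


b = BExtend()
for _ in range(200):
    b.update([1, 2, 3, 4, 5])
arr = b.finalize()

print(arr.shape)              # (200, 5)
print(arr.nbytes)             # 8000: 1000 float64 values, 8 bytes each
# getsizeof reports only the list's own pointer array,
# not the Python objects those pointers refer to
print(sys.getsizeof(b.data))
```

The list size printed last depends on the interpreter and platform, so it is left without an expected value.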

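【Solution 2】:

np.append() copies all the data in the array every time, whereas a list grows its capacity by a factor (1.125). A list is fast, but its memory usage is larger than an array's. If you care about memory, you can use the array module of the Python standard library.

Here is a discussion about this topic: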

How to create a dynamic array
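
A minimal sketch of the array-module suggestion, assuming rows of five floating-point values as in the question. The variable names are illustrative, and the final copy() is a choice made here so that the buffer can keep growing after the conversion.

```python
from array import array

import numpy as np

buf = array("d")                  # typecode "d": C double
for _ in range(200):
    buf.extend([1, 2, 3, 4, 5])   # grows in place, like a list

# frombuffer reads the array's memory directly; copy() detaches the result
# from buf, so buf can still be resized afterwards
dx = np.frombuffer(buf, dtype=np.float64).reshape(-1, 5).copy()

print(dx.shape)                   # (200, 5)
```

Unlike a list, the array stores the raw doubles rather than references to Python objects, which is where the memory saving comes from.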

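【Discussion】:

Is there a way to change the factor by which the list grows?
The time consumed by np.append() grows exponentially with the number of elements.
^ Linear (i.e. the total accumulated time is quadratic), not exponential.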
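
The linear-versus-quadratic point can be checked by counting element copies instead of measuring time. This is a sketch under simplifying assumptions: every np.append call is modelled as copying all existing elements, and the geometric strategy uses a starting capacity of 100 and a growth factor of 4, matching class C above. Both function names are hypothetical.

```python
def copies_with_np_append(n):
    """Elements copied if every append reallocates and copies the whole array."""
    # the k-th append copies the k elements already stored (k = 0 .. n-1)
    return sum(range(n))


def copies_with_geometric_growth(n, capacity=100, factor=4):
    """Elements copied when the buffer is multiplied by `factor` once it is full."""
    copied = 0
    size = 0
    for _ in range(n):
        if size == capacity:
            copied += size        # existing elements move to the new buffer
            capacity *= factor
        size += 1
    return copied


n = 100_000
print(copies_with_np_append(n))          # 4999950000
print(copies_with_geometric_growth(n))   # 34100
```

Each single append costs time proportional to the current length, so the total is n(n-1)/2 copies, while geometric growth keeps the total proportional to n.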

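【Solution 3】:

Using the class declarations from Owen's post, here is a revised timing that includes some effect of finalize.

In short, I find that class C provides an implementation that is over 60x faster than the method in the original post. (Apologies for the wall of text.)

The file I used: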

#!/usr/bin/python
import cProfile
import numpy as np

# ... class declarations here ...

def test_class(f):
    x = f()
    for i in range(100000):
        x.update([i])
    for i in range(1000):
        x.finalize()

for x in 'ABC':
    cProfile.run('test_class(%s)' % x)

Now, the resulting timings:

A:

     903005 function calls in 16.049 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000   16.049   16.049 <string>:1(<module>)
100000    0.139    0.000    1.888    0.000 fromnumeric.py:1043(ravel)
  1000    0.001    0.000    0.003    0.000 fromnumeric.py:107(reshape)
100000    0.322    0.000   14.424    0.000 function_base.py:3466(append)
100000    0.102    0.000    1.623    0.000 numeric.py:216(asarray)
100000    0.121    0.000    0.298    0.000 numeric.py:286(asanyarray)
  1000    0.002    0.000    0.004    0.000 test.py:12(finalize)
     1    0.146    0.146   16.049   16.049 test.py:50(test_class)
     1    0.000    0.000    0.000    0.000 test.py:6(__init__)
100000    1.475    0.000   15.899    0.000 test.py:9(update)
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
100000    0.126    0.000    0.126    0.000 {method 'ravel' of 'numpy.ndarray' objects}
  1000    0.002    0.000    0.002    0.000 {method 'reshape' of 'numpy.ndarray' objects}
200001    1.698    0.000    1.698    0.000 {numpy.core.multiarray.array}
100000   11.915    0.000   11.915    0.000 {numpy.core.multiarray.concatenate}

B:

     208004 function calls in 16.885 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.001    0.001   16.885   16.885 <string>:1(<module>)
  1000    0.025    0.000   16.508    0.017 fromnumeric.py:107(reshape)
  1000    0.013    0.000   16.483    0.016 fromnumeric.py:32(_wrapit)
  1000    0.007    0.000   16.445    0.016 numeric.py:216(asarray)
     1    0.000    0.000    0.000    0.000 test.py:16(__init__)
100000    0.068    0.000    0.080    0.000 test.py:19(update)
  1000    0.012    0.000   16.520    0.017 test.py:23(finalize)
     1    0.284    0.284   16.883   16.883 test.py:50(test_class)
  1000    0.005    0.000    0.005    0.000 {getattr}
  1000    0.001    0.000    0.001    0.000 {len}
100000    0.012    0.000    0.012    0.000 {method 'append' of 'list' objects}
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  1000    0.020    0.000    0.020    0.000 {method 'reshape' of 'numpy.ndarray' objects}
  1000   16.438    0.016   16.438    0.016 {numpy.core.multiarray.array}

C:

     204010 function calls in 0.244 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    0.244    0.244 <string>:1(<module>)
  1000    0.001    0.000    0.003    0.000 fromnumeric.py:107(reshape)
     1    0.000    0.000    0.000    0.000 test.py:27(__init__)
100000    0.082    0.000    0.170    0.000 test.py:32(update)
100000    0.087    0.000    0.088    0.000 test.py:36(add)
  1000    0.002    0.000    0.005    0.000 test.py:46(finalize)
     1    0.068    0.068    0.243    0.243 test.py:50(test_class)
  1000    0.000    0.000    0.000    0.000 {len}
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  1000    0.002    0.000    0.002    0.000 {method 'reshape' of 'numpy.ndarray' objects}
     6    0.001    0.000    0.001    0.000 {numpy.core.multiarray.zeros}

Class A is destroyed by the updates, and class B is destroyed by the finalizes. Class C is robust in the face of both.

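【Discussion】:

Update is done n times, and then finalize is called once. This whole process is done m times (otherwise there is no data to finalize). Also, when comparing with the original post... do you mean the first (array.append + numpy conversion) or (numpy.append + reshape)?
cProfile. It is the first import and the last line invoked in my code snippet.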

【Solution 4】:

There is a big performance difference depending on the function used for finalization. Consider the following code:

from time import time

import numpy as np

N=100000
nruns=5

# build a list of N one-dimensional rows
a=[]
for i in range(N):
    a.append(np.zeros(1000))

print("start")

b=[]
for i in range(nruns):
    s=time()
    c=np.vstack(a)
    b.append((time()-s))
print("Timing version vstack ",np.mean(b))

b=[]
for i in range(nruns):
    s=time()
    c1=np.reshape(a,(N,1000))
    b.append((time()-s))

print("Timing version reshape ",np.mean(b))

b=[]
for i in range(nruns):
    s=time()
    c2=np.concatenate(a,axis=0).reshape(-1,1000)
    b.append((time()-s))

print("Timing version concatenate ",np.mean(b))

# all three versions must produce the same array
print(c.shape,c2.shape)
assert (c==c2).all()
assert (c==c1).all()

Using concatenate seems to be twice as fast as the first version and more than 10 times faster than the second version.

Timing version vstack  1.5774928093
Timing version reshape  9.67419199944
Timing version concatenate  0.669512557983

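【Discussion】:

【Solution 5】:

If you want to improve performance with list operations, have a look at the blist library. It is an optimized implementation of the Python list and other structures.

I have not benchmarked it yet, but the results on their page look promising.

【Discussion】: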

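【Solution 6】:

Multi-dimensional numpy arrays

Adding to Owen's and Prashant Kumar's posts, a version using multi-dimensional numpy arrays (aka. shape) speeds up the code for the numpy solutions, especially if you need to access (finalize()) your data often.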

Version                  Prashant Kumar   row_length=1   row_length=5
Class A - np.append      2.873 s          2.776 s        0.682 s
Class B - python list    6.693 s          80.868 s       22.012 s
Class C - arraylist      0.095 s          0.180 s        0.043 s

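The Prashant Kumar column is his example executed on my machine, for comparison. row_length=5 is the example from the initial question. The dramatic increase for the python list comes from {built-in method numpy.array}, which means numpy needs much more time to convert a multi-dimensional list of lists into an array than to convert a 1-D list and reshape it, where both have the same number of entries, e.g. np.array([[1,2,3]]*5) vs. np.array([1]*15).reshape((-1,3)).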
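
The two conversion routes compared in that paragraph can be written side by side. This is a small sketch that only checks that both routes produce the same array; it uses the same values in both lists (an assumption made here for the comparison) and makes no timing claim.

```python
import numpy as np

nested = [[1, 2, 3]] * 5      # 5 rows of 3 entries, a list of lists
flat = [1, 2, 3] * 5          # the same 15 entries as a 1-D list

from_nested = np.array(nested)                  # numpy walks the nested structure
from_flat = np.array(flat).reshape((-1, 3))     # convert flat, then reshape

print(from_nested.shape, from_flat.shape)       # (5, 3) (5, 3)
print(np.array_equal(from_nested, from_flat))   # True
```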

Here is the code:

import cProfile
import numpy as np

class A:
    def __init__(self,shape=(0,), dtype=float):
        """First item of shape is ignored, the rest defines the shape"""
        self.data = np.array([], dtype=dtype).reshape((0,*shape[1:]))

    def update(self, row):
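        # axis=0 keeps the row shape; without it np.append flattens the result
        self.data = np.append(self.data, np.reshape(row, (1, *self.data.shape[1:])), axis=0)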

    def finalize(self):
        return self.data
    
    
class B:
    def __init__(self, shape=(0,), dtype=float):
        """First item of shape is ignored, the rest defines the shape"""
        self.shape = shape
        self.dtype = dtype 
        self.data = []

    def update(self, row):
        self.data.append(row)

    def finalize(self):
        return np.array(self.data, dtype=self.dtype).reshape((-1, *self.shape[1:]))
    
    
class C:
    def __init__(self, shape=(0,), dtype=float):
        """First item of shape is ignored, the rest defines the shape"""
        self.shape = shape
        self.data = np.zeros((100,*shape[1:]),dtype=dtype)
        self.capacity = 100
        self.size = 0

    def update(self, x):
        if self.size == self.capacity:
            self.capacity *= 4
            newdata = np.zeros((self.capacity,*self.data.shape[1:]), dtype=self.data.dtype)
            newdata[:self.size] = self.data
            self.data = newdata

        self.data[self.size] = x
        self.size += 1

    def finalize(self):
        return self.data[:self.size]
    

def test_class(f):
    row_length = 5
    x = f(shape=(0,row_length))
    for i in range(int(100000/row_length)):
        x.update([i]*row_length)
    for i in range(1000):
        x.finalize()

for x in 'ABC':
    cProfile.run('test_class(%s)' % x)
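
One property of class C worth noting is that its finalize() returns a slice, which in NumPy is a view on the internal buffer and not a copy. The sketch below is a compact, hypothetical variant (the name GrowableRows and its constructor arguments are not from the post) that shows this behaviour and how to take an independent snapshot.

```python
import numpy as np


class GrowableRows:
    """Hypothetical compact variant of class C for fixed-length rows."""

    def __init__(self, row_length, dtype=float, capacity=100):
        self.data = np.zeros((capacity, row_length), dtype=dtype)
        self.size = 0

    def update(self, row):
        if self.size == self.data.shape[0]:
            # allocate a buffer 4x larger, keeping the dtype
            newdata = np.zeros((4 * self.size, self.data.shape[1]),
                               dtype=self.data.dtype)
            newdata[:self.size] = self.data
            self.data = newdata
        self.data[self.size] = row
        self.size += 1

    def finalize(self):
        # basic slicing returns a view: no data is copied here
        return self.data[:self.size]


g = GrowableRows(row_length=5, dtype=int)
for i in range(250):
    g.update([i] * 5)

view = g.finalize()
print(view.shape)                        # (250, 5)
print(np.shares_memory(view, g.data))    # True: finalize() did not copy

snapshot = g.finalize().copy()           # independent of later changes to g.data
```

Returning a view is what makes repeated finalize() calls cheap. If the result has to survive further writes to the buffer, an explicit copy() is the safer choice.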

And another option to add to the post above from Luca Fiaschi:

import time

# reuses a, N and nruns from the script in the post above
b=[]
for i in range(nruns):
    s=time.time()
    c1=np.array(a, dtype=int).reshape((N,1000))
    b.append((time.time()-s))

print("Timing version array.reshape ",np.mean(b))

This gives me:

Timing version vstack         0.6863266944885253
Timing version reshape        0.505419111251831
Timing version array.reshape  0.5052066326141358
Timing version concatenate    0.5339600563049316

【Discussion】: