Multi-dimensional numpy arrays
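Adding to Owen's and Prashant Kumar's posts: a version that uses multi-dimensional numpy arrays (i.e. arrays with a shape) speeds up the numpy solutions, especially if you need to access (finalize()) your data often.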
| Version | Prashant Kumar | row_length=1 | row_length=5 |
|---|---|---|---|
| Class A - np.append | 2.873 s | 2.776 s | 0.682 s |
| Class B - python list | 6.693 s | 80.868 s | 22.012 s |
| Class C - arraylist | 0.095 s | 0.180 s | 0.043 s |
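The Prashant Kumar column is his example, run on my machine for comparison. row_length=5 is the example from the original question. The dramatic increase for the python list comes from {built-in method numpy.array}: numpy needs much more time to convert a nested list of lists into an array than to convert a flat 1-D list and reshape it, even when both hold the same number of entries, e.g. np.array([[1,2,3]]*5) vs. np.array([1]*15).reshape((-1,3)).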
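To make that comparison concrete, here is a minimal sketch (not the profiling setup used for the table) that builds the same array both ways and times the two conversions with timeit. The sizes and the names `nested`, `flat`, `n_rows`, and `n_cols` are illustrative assumptions; absolute timings depend on the machine and the numpy version, so none are quoted here.

```python
import timeit
import numpy as np

n_rows, n_cols = 20000, 5  # assumed sizes, roughly matching row_length=5

# Same entries in two layouts: a list of rows versus one flat list.
nested = [[i] * n_cols for i in range(n_rows)]
flat = [v for row in nested for v in row]

from_nested = np.array(nested, dtype=float)
from_flat = np.array(flat, dtype=float).reshape((-1, n_cols))

# Both routes produce the same 2-D array.
assert from_nested.shape == (n_rows, n_cols)
assert np.array_equal(from_nested, from_flat)

# Time each conversion; only the relative difference is of interest.
t_nested = timeit.timeit(lambda: np.array(nested, dtype=float), number=20)
t_flat = timeit.timeit(
    lambda: np.array(flat, dtype=float).reshape((-1, n_cols)), number=20
)
print(f"nested list: {t_nested:.4f} s, flat list + reshape: {t_flat:.4f} s")
```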
Here is the code:
    import cProfile
    import numpy as np


    class A:
        """Grow a numpy array with np.append on every update."""

        def __init__(self, shape=(0,), dtype=float):
            """First item of shape is ignored, the rest defines the shape"""
            self.shape = shape
            self.data = np.array([], dtype=dtype).reshape((0, *shape[1:]))

        def update(self, row):
            # np.append without an axis flattens its inputs, so self.data is 1-D from here on
            self.data = np.append(self.data, row)

        def finalize(self):
            # restore the row shape lost in update(); reshaping a contiguous array returns a view
            return self.data.reshape((-1, *self.shape[1:]))


    class B:
        """Collect rows in a python list and convert to an array on finalize."""

        def __init__(self, shape=(0,), dtype=float):
            """First item of shape is ignored, the rest defines the shape"""
            self.shape = shape
            self.dtype = dtype
            self.data = []

        def update(self, row):
            self.data.append(row)

        def finalize(self):
            return np.array(self.data, dtype=self.dtype).reshape((-1, *self.shape[1:]))


    class C:
        """Preallocated array that grows by a factor of 4 when it is full."""

        def __init__(self, shape=(0,), dtype=float):
            """First item of shape is ignored, the rest defines the shape"""
            self.shape = shape
            self.data = np.zeros((100, *shape[1:]), dtype=dtype)
            self.capacity = 100
            self.size = 0

        def update(self, x):
            if self.size == self.capacity:
                self.capacity *= 4
                # keep the dtype of the existing buffer when growing
                newdata = np.zeros((self.capacity, *self.data.shape[1:]), dtype=self.data.dtype)
                newdata[:self.size] = self.data
                self.data = newdata
            self.data[self.size] = x
            self.size += 1

        def finalize(self):
            # view of the filled part only
            return self.data[:self.size]


    def test_class(f):
        row_length = 5
        x = f(shape=(0, row_length))
        for i in range(100000 // row_length):
            x.update([i] * row_length)
        for i in range(1000):
            x.finalize()


    for name in 'ABC':
        cProfile.run('test_class(%s)' % name)
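A detail worth keeping in mind when reading class A: np.append called without an axis argument flattens both inputs before joining them. The short sketch below illustrates that behaviour and shows that passing a 2-D row together with axis=0 keeps the shape; the names `flat` and `kept` are only for illustration.

```python
import numpy as np

# Empty array with three columns, like class A with shape=(0, 3).
data = np.array([], dtype=float).reshape((0, 3))

flat = np.append(data, [1, 2, 3])            # no axis: inputs are flattened
kept = np.append(data, [[1, 2, 3]], axis=0)  # axis=0 with a 2-D row: shape is kept

print(flat.shape)  # (3,)
print(kept.shape)  # (1, 3)
```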
Here is one more option to add to the post above from Luca Fiaschi.
    import time
    import numpy as np

    # a, N and nruns are defined in Luca Fiaschi's post
    b = []
    for i in range(nruns):
        s = time.time()
        c1 = np.array(a, dtype=int).reshape((N, 1000))
        b.append(time.time() - s)
    print("Timing version array.reshape ", np.mean(b))
This gives me:
    Timing version vstack 0.6863266944885253
    Timing version reshape 0.505419111251831
    Timing version array.reshape 0.5052066326141358
    Timing version concatenate 0.5339600563049316
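The labels in this output refer to different ways of turning a list of equally sized 1-D arrays into one 2-D array. The sketch below uses small made-up sizes and assumes `a` is such a list (which the reshape((N,1000)) call suggests); how each label maps to an exact call in Luca Fiaschi's post is my guess, not taken from it. It only checks that the variants produce the same result and does no timing.

```python
import numpy as np

N, width = 4, 6  # hypothetical sizes; the timed runs use rows of length 1000
a = [np.arange(width) + i for i in range(N)]  # assumed: a list of 1-D arrays

via_vstack = np.vstack(a)
via_reshape = np.reshape(a, (N, width))
via_array_reshape = np.array(a, dtype=int).reshape((N, width))
via_concatenate = np.concatenate(a).reshape((N, width))

# All four variants build the same (N, width) array.
assert via_vstack.shape == (N, width)
for other in (via_reshape, via_array_reshape, via_concatenate):
    assert np.array_equal(via_vstack, other)
```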