矢量化解决方案的基准测试
我们希望在这篇文章中对矢量化解决方案进行基准测试。现在,向量化试图避免循环遍历每一行并进行洗牌。因此,输入数组的设置涉及更多的行数。
方法-
def app1(a): # @Daniel F's soln
i, j = np.nonzero(a.astype(bool))
k = np.argsort(i + np.random.rand(i.size))
a[i,j] = a[i,j[k]]
return a
def app2(x): # @kazemakase's soln
r, c = np.where(x != 0)
n = c.size
perm = np.random.permutation(n)
i = np.argsort(perm + r * n)
x[r, c] = x[r, c[i]]
return x
def app3(a): # @Divakar's soln
m,n = a.shape
rand_idx = np.random.rand(m,n).argsort(axis=1)
b = a[np.arange(m)[:,None], rand_idx]
a[a!=0] = b[b!=0]
return a
from scipy.ndimage.measurements import labeled_comprehension
def app4(a): # @FabienP's soln
def func(array, idx):
r[idx] = np.random.permutation(array)
return True
labels, idx = nz = a.nonzero()
r = a[nz]
labeled_comprehension(a[nz], labels + 1, np.unique(labels + 1),\
func, int, 0, pass_positions=True)
a[nz] = r
return a
基准测试程序 #1
为了进行公平的基准测试,使用 OP 的样本并简单地将它们堆叠为更多行以获得更大的数据集似乎是合理的。因此,通过该设置,我们可以创建两个具有 200 万行和 2000 万行数据集的案例。
案例 #1:包含2*1000,000 行的大型数据集
In [174]: a = np.array([[2,3,1,0],[0,0,2,1]])
In [175]: a = np.vstack([a]*1000000)
In [176]: %timeit app1(a)
...: %timeit app2(a)
...: %timeit app3(a)
...: %timeit app4(a)
...:
1 loop, best of 3: 264 ms per loop
1 loop, best of 3: 422 ms per loop
1 loop, best of 3: 254 ms per loop
1 loop, best of 3: 14.3 s per loop
案例#2:2*10,000,000 行的更大数据集
In [177]: a = np.array([[2,3,1,0],[0,0,2,1]])
In [178]: a = np.vstack([a]*10000000)
# app4 skipped here as it was slower on the previous smaller dataset
In [179]: %timeit app1(a)
...: %timeit app2(a)
...: %timeit app3(a)
...:
1 loop, best of 3: 2.86 s per loop
1 loop, best of 3: 4.62 s per loop
1 loop, best of 3: 2.55 s per loop
基准测试程序 #2:广泛的一个
为了涵盖输入数组中非零百分比变化的所有情况,我们涵盖了一些广泛的基准测试场景。此外,由于app4 似乎比其他人慢得多,为了更仔细地检查,我们将在本节中跳过那个。
设置输入数组的辅助函数:
def in_data(n_col, nnz_ratio):
# max no. of elems that my system can handle, i.e. stretching it to limits.
# The idea is to use this to decide the number of rows and always use
# max. possible dataset size
num_elem = 10000000
n_row = num_elem//n_col
a = np.zeros((n_row, n_col),dtype=int)
L = int(round(a.size*nnz_ratio))
a.ravel()[np.random.choice(a.size, L, replace=0)] = np.random.randint(1,6,L)
return a
主要计时和绘图脚本(使用 IPython 魔术函数。因此,需要在复制和粘贴到 IPython 控制台时运行)-
import matplotlib.pyplot as plt
# Setup input params
nnz_ratios = np.array([0.2, 0.4, 0.6, 0.8])
n_cols = np.array([4, 5, 8, 10, 15, 20, 25, 50])
init_arr1 = np.zeros((len(nnz_ratios), len(n_cols) ))
init_arr2 = np.zeros((len(nnz_ratios), len(n_cols) ))
init_arr3 = np.zeros((len(nnz_ratios), len(n_cols) ))
timings = {app1:init_arr1, app2:init_arr2, app3:init_arr3}
for i,nnz_ratio in enumerate(nnz_ratios):
for j,n_col in enumerate(n_cols):
a = in_data(n_col, nnz_ratio=nnz_ratio)
for func in timings:
res = %timeit -oq func(a)
timings[func][i,j] = res.best
print func.__name__, i, j, res.best
fig = plt.figure(1)
colors = ['b','k','r']
for i in range(len(nnz_ratios)):
ax = plt.subplot(2,2,i+1)
for f,func in enumerate(timings):
ax.plot(n_cols,
[time for time in timings[func][i]],
label=str(func.__name__), color=colors[f])
ax.set_xlabel('No. of cols')
ax.set_ylabel('time [seconds]')
ax.grid(which='both')
ax.legend()
plt.tight_layout()
plt.title('Percentage non-zeros : '+str(int(100*nnz_ratios[i])) + '%')
plt.subplots_adjust(wspace=0.2, hspace=0.2)
时序输出 -
观察:
结论:
在 60% 非零标记之前,方法 #1 似乎表现不错。对于更多非零并且如果行长度很小,方法#3 似乎表现不错。