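Here are two NumPy options using np.in1d, which is a vectorized version of the `in` operator from base Python. When the array is large, the first option shows some speedup:
Option one (the faster one):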
np.in1d(A, L).reshape(A.shape).astype(int)
Option two (the slower one):
np.apply_along_axis(np.in1d, 0, A, L).astype(int)
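On newer NumPy versions there may be a simpler route: np.isin (added in NumPy 1.13, as far as I know) does the same membership test but keeps the shape of the input, so the reshape step is not needed. np.in1d is also, to my knowledge, deprecated as of NumPy 2.0 in favor of np.isin. A minimal sketch, with a small made-up array standing in for A:

```python
import numpy as np

# Small illustrative inputs (made-up values, not the benchmark data).
A = np.array([[1, 3, 5],
              [4, 7, 3]])
L = [3, 4, 5]

# np.isin tests every element of A for membership in L and
# returns a boolean array with the same shape as A.
B = np.isin(A, L).astype(int)
print(B)
```

The result has the same shape as A, with 1 wherever the element appears in L and 0 elsewhere.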
Timings:
import numpy as np
A = np.random.randint(0, 10, (1000, 1000))
L = [3,4,5]
# Baseline: one boolean-mask assignment per value in L
def loop():
    B = np.zeros(A.shape)
    for e in L:
        B[A==e] = 1
    return B
%timeit np.in1d(A, L).reshape(A.shape).astype(int)
# 100 loops, best of 3: 6.4 ms per loop
%timeit loop()
# 100 loops, best of 3: 16.8 ms per loop
%timeit np.apply_along_axis(np.in1d, 1, A, L).astype(int)
# 10 loops, best of 3: 21.5 ms per loop
%timeit np.apply_along_axis(np.in1d, 0, A, L).astype(int)
# 10 loops, best of 3: 35.1 ms per loop
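The %timeit lines above only work in IPython/Jupyter. Outside of it, a comparable measurement can be sketched with the standard-library timeit module. This is only a rough sketch: the helper name vectorized and the repeat count n are my own choices, and np.isin is used as a stand-in for np.in1d(...).reshape(...) on the assumption that a reasonably recent NumPy is installed. Absolute numbers will differ by machine, so no expected timings are shown.

```python
import timeit
import numpy as np

A = np.random.randint(0, 10, (1000, 1000))
L = [3, 4, 5]

def loop():
    # Baseline: one boolean-mask assignment per value in L.
    B = np.zeros(A.shape)
    for e in L:
        B[A == e] = 1
    return B

def vectorized():
    # Hypothetical helper: a single vectorized membership test.
    return np.isin(A, L).astype(int)

n = 5  # number of repetitions per measurement (arbitrary choice)
t_loop = timeit.timeit(loop, number=n) / n
t_vec = timeit.timeit(vectorized, number=n) / n
print(f"loop:       {t_loop * 1e3:.1f} ms per call")
print(f"vectorized: {t_vec * 1e3:.1f} ms per call")
```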
Checking that the results agree:
B1 = loop()
B2 = np.apply_along_axis(np.in1d, 0, A, L).astype(int)
B3 = np.apply_along_axis(np.in1d, 1, A, L).astype(int)
B4 = np.in1d(A, L).reshape(A.shape).astype(int)
(B1 == B2).all()
# True
(B1 == B3).all()
# True
(B1 == B4).all()
# True