Here's one vectorized way -
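import numpy as np

def randnum_excludeone(A, val):
    # val is assumed to be np.arange(k), so val[-1] == len(val) - 1
    n = val[-1]
    idx = np.random.randint(0, n, len(A))  # one draw per entry, in [0, n)
    idx[idx >= A] += 1  # shift draws at or above A[i] so they skip A[i]
    return idx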
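The idea is that we generate a random integer for each entry in A, drawn from a range whose size is the length of val minus 1. Then, wherever the generated number is equal to or greater than the corresponding A element, we add 1; otherwise we keep it. So any generated number smaller than the current A element stays as is, and any other number is offset by 1 so that it skips over the current A element. That is our final output, idx.
Let's verify the randomness and make sure it is uniform across the non-A elements -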
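To make the shift concrete, here is a small sketch with fixed draws in place of random ones. The array `draws` is a hypothetical stand-in for the output of `np.random.randint(0, 5, len(A))`, chosen only for illustration.

```python
import numpy as np

A = np.array([2, 3, 1, 0, 4])
# Hypothetical fixed draws from range(0, 5), standing in for random ones
draws = np.array([0, 3, 1, 0, 4])

shifted = draws.copy()
shifted[shifted >= A] += 1  # draws at or above A[i] move one step up

print(shifted)  # [0 4 2 1 5]
```

Only the first draw is below its A entry, so it is kept. The rest are bumped by 1, and no output equals the corresponding entry of A.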
In [42]: A
Out[42]: array([2, 3, 1, 0, 2, 1, 2, 3, 1, 0, 4])
In [43]: val
Out[43]: array([0, 1, 2, 3, 4, 5])
In [44]: c = np.array([randnum_excludeone(A, val) for _ in range(10000)])
In [45]: [np.bincount(i) for i in c.T]
Out[45]:
[array([2013, 2018, 0, 2056, 1933, 1980]),
array([2018, 1985, 2066, 0, 1922, 2009]),
array([2032, 0, 1966, 1975, 2040, 1987]),
array([ 0, 2076, 1986, 1931, 2013, 1994]),
array([2029, 1943, 0, 1960, 2100, 1968]),
array([2028, 0, 2048, 2031, 1929, 1964]),
array([2046, 2065, 0, 1990, 1940, 1959]),
array([2040, 2003, 1935, 0, 2045, 1977]),
array([2008, 0, 2011, 2030, 1937, 2014]),
array([ 0, 2000, 2015, 1983, 2023, 1979]),
array([2075, 1995, 1987, 1948, 0, 1995])]
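In each row the zero count sits at the index equal to the corresponding entry of A, and the remaining counts are all close to 10000 / 5 = 2000. A short sketch of why this is expected, assuming val is np.arange(6): for every excluded value a, the shift maps the five possible raw draws one-to-one onto the five allowed values, so a uniform draw gives a uniform result. The names `n`, `a` and `results` are illustrative only.

```python
import numpy as np

n = 5  # val[-1] for val = np.arange(6)
results = {}
for a in range(n + 1):
    draws = np.arange(n)            # every possible raw draw: 0..4
    out = draws + (draws >= a)      # same shift as in randnum_excludeone
    results[a] = out.tolist()
    # Each allowed value appears exactly once, and a never appears
    assert results[a] == [v for v in range(n + 1) if v != a]

print(results[2])  # [0, 1, 3, 4, 5]
```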
Benchmarking on large arrays
Other vectorized approach:
# @Paul Panzer's solution
def pp(A, val):
    n, N = val[-1] + 1, len(A)
    D = np.random.randint(1, n, N)  # nonzero offsets in [1, n)
    B = (A - D) % n  # a nonzero offset mod n can never land back on A[i]
    return B
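As a quick sanity check on this alternative, the sketch below repeats the function so it runs on its own and asserts that the result never matches A and stays within the range of val. The test array size of 1000 is an arbitrary choice for illustration.

```python
import numpy as np

def pp(A, val):
    n, N = val[-1] + 1, len(A)
    D = np.random.randint(1, n, N)  # offsets in [1, n), never 0
    return (A - D) % n

val = np.arange(6)
A = np.random.randint(0, 6, 1000)  # arbitrary test size
B = pp(A, val)

assert not (B == A).any()                    # excluded value never returned
assert B.min() >= 0 and B.max() <= val[-1]   # results stay within val
```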
Timings -
In [66]: np.random.seed(0)
...: A = np.random.randint(0,6,100000)
In [67]: %timeit pp(A,val)
100 loops, best of 3: 3.11 ms per loop
In [68]: %timeit randnum_excludeone(A, val)
100 loops, best of 3: 2.53 ms per loop
In [69]: np.random.seed(0)
...: A = np.random.randint(0,6,1000000)
In [70]: %timeit pp(A,val)
10 loops, best of 3: 39.9 ms per loop
In [71]: %timeit randnum_excludeone(A, val)
10 loops, best of 3: 25.9 ms per loop
Extending the range of val to 10 -
In [60]: np.random.seed(0)
...: A = np.random.randint(0,10,1000000)
In [61]: %timeit pp(A,val)
10 loops, best of 3: 31.2 ms per loop
In [62]: %timeit randnum_excludeone(A, val)
10 loops, best of 3: 23.6 ms per loop