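Fairly efficient: this uses a sliding-window approach. Each window comparison costs at most len(subset), so for a fixed-size subset it should run in time linear in len(main).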
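def find_x_in_y(subset, main):
    """Return a list with the index of the first element of each match."""
    n = len(subset)  # window size follows the subset length
    results = []
    # only visit positions where a full-length window still fits
    for i in range(len(main) - n + 1):
        if main[i:i + n] == subset:
            results.append(i)
    return results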
# values borrowed from @JonClements
subset = [True, True, False, True, True]
main = [False, True, False, True, True, True, False, True, True, False, True, False, True]
>>> find_x_in_y(subset, main)
[4]
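The timings further down also include a "first match" case. Here is a minimal sketch of how the sliding window could be written as a generator, so the caller can stop at the first hit instead of scanning all of main. The names `iter_x_in_y` and `first_x_in_y` are hypothetical and not part of the original answer.

```python
def iter_x_in_y(subset, main):
    """Lazily yield the start index of each match of subset in main."""
    n = len(subset)
    # Stop at the last position where a full-length window still fits.
    for i in range(len(main) - n + 1):
        if main[i:i + n] == subset:
            yield i


def first_x_in_y(subset, main):
    """Return the index of the first match, or None if there is none."""
    return next(iter_x_in_y(subset, main), None)


subset = [True, True, False, True, True]
main = [False, True, False, True, True, True, False,
        True, True, False, True, False, True]
print(first_x_in_y(subset, main))        # 4
print(list(iter_x_in_y(subset, main)))   # [4]
```

Because the generator is lazy, `next(...)` returns as soon as the first window matches, which would explain why first-match timings barely depend on the length of main.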
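@abarnert Good call, this is an approach that is actually very efficient.
Extremely efficient: it converts the whole list of bools to a bitarray and runs the bitarray's native search method.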
from bitarray import bitarray

def find_x_in_y(subset, main):
    subarray = bitarray(subset)
    mainarray = bitarray(main)
    return [int(i) for i in mainarray.itersearch(subarray)]
timeit results:
Length of main:              10       100      1000     10000    100000   1000000
returning _all_ matches:
# number of matches          1        10       100      1000     10000    100000
# sliding window approach   (0.00059, 0.00502, 0.04194, 0.26211, 2.55554, 26.21962)
# bitarray approach         (0.00028, 0.00072, 0.00484, 0.02926, 0.2822,  2.93676)
returning first match:
# sliding window approach   (0.00034, 0.00034, 0.00034, 0.00021, 0.00026, 0.00059)
# bitarray approach         (0.00017, 0.00017, 0.00016, 0.00011, 0.00014, 0.00049)
# joined string approach    (0.00134, 0.00721, 0.06244, 0.39224, 4.21628, 39.63207)
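The "joined string approach" in the last row is not defined in this answer. One plausible reading, offered only as an assumption, is to encode each bool as a character and let `str.find` do the searching. `find_x_in_y_str` is a hypothetical name. The sketch also shows how the standard-library `timeit` module could time such a function. It does not attempt to reproduce the figures above.

```python
import timeit


def find_x_in_y_str(subset, main):
    """Return start indexes of all (possibly overlapping) matches."""
    # Assumption: encode every bool as one character, '1' or '0'.
    needle = ''.join('1' if b else '0' for b in subset)
    haystack = ''.join('1' if b else '0' for b in main)
    results = []
    i = haystack.find(needle)
    while i != -1:
        results.append(i)
        # Restart one position later so overlapping matches are kept.
        i = haystack.find(needle, i + 1)
    return results


subset = [True, True, False, True, True]
main = [False, True, False, True, True, True, False,
        True, True, False, True, False, True]
print(find_x_in_y_str(subset, main))  # [4]

# Time 10 calls on a longer input built by repeating `main`.
long_main = main * 1000
elapsed = timeit.timeit(lambda: find_x_in_y_str(subset, long_main), number=10)
print(elapsed)
```

Most of the work in a version like this goes into building the two strings on every call, which may be why it scales worse than the bitarray search.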