One way to iterate over the 3-grams of a list is to use zip:
import numpy as np

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# zip three staggered views of the list; zip stops at the shortest one
np.array(list(zip(a, a[1:], a[2:])))
array([[ 1,  2,  3],
       [ 2,  3,  4],
       [ 3,  4,  5],
       [ 4,  5,  6],
       [ 5,  6,  7],
       [ 6,  7,  8],
       [ 7,  8,  9],
       [ 8,  9, 10]])
A general function for n-gram iteration can be written as follows:
def find_ngrams(input_list, n):
    # Build n copies of the list, the i-th shifted left by i, and zip them
    return np.array(list(zip(*[input_list[i:] for i in range(n)])))
find_ngrams(a, 3)  # try setting n to other values, such as 2, 4, or 5
array([[ 1,  2,  3],
       [ 2,  3,  4],
       [ 3,  4,  5],
       [ 4,  5,  6],
       [ 5,  6,  7],
       [ 6,  7,  8],
       [ 7,  8,  9],
       [ 8,  9, 10]])
find_ngrams(a, 5)
array([[ 1,  2,  3,  4,  5],
       [ 2,  3,  4,  5,  6],
       [ 3,  4,  5,  6,  7],
       [ 4,  5,  6,  7,  8],
       [ 5,  6,  7,  8,  9],
       [ 6,  7,  8,  9, 10]])
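
If the input is a generator or is too large to slice n times, the same zip idea can be applied lazily. The following is a minimal sketch using only the standard library; the name iter_ngrams is hypothetical and not part of the answer above, and it yields tuples rather than building a NumPy array.

```python
from itertools import islice, tee


def iter_ngrams(iterable, n):
    """Lazily yield n-grams as tuples from any iterable (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Make n independent iterators over the same underlying data.
    iterators = tee(iterable, n)
    # Skip the first i items of the i-th iterator so the copies are staggered.
    shifted = [islice(it, i, None) for i, it in enumerate(iterators)]
    # zip stops as soon as the most advanced iterator is exhausted.
    return zip(*shifted)


a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for gram in iter_ngrams(a, 3):
    print(gram)
```

As with find_ngrams, an input shorter than n simply produces no n-grams. Wrapping the result in list(...) or np.array(list(...)) gives the same rows as the slicing version.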