Repeat rows of a 2D array: answers

【Question title】: Repeat rows of a 2D array
【Posted】: 2019-02-07 05:11:00
【Question】:

I have a numpy array that I want to repeat n times while preserving the original order of the rows:

>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Desired output (for n = 2):

>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

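I found the np.repeat function, but it does not preserve the original order of the rows. Is there another built-in function or trick that repeats the array while keeping that order?

【Comments】:

Tags: python numpy

【Solution 1】:

Here is another way. I have also added some timing comparisons with @coldspeed's solutions.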

n = 2
a_new = np.tile(a.flatten(), n) 
a_new.reshape((n*a.shape[0], a.shape[1]))
# array([[ 0,  1,  2,  3],
#        [ 4,  5,  6,  7],
#        [ 8,  9, 10, 11],
#        [ 0,  1,  2,  3],
#        [ 4,  5,  6,  7],
#        [ 8,  9, 10, 11]])

Performance comparison with coldspeed's solutions

My method, n = 10000

a = np.array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
n = 10000

def tile_flatten(a, n):
    a_new = np.tile(a.flatten(), n).reshape((n*a.shape[0], a.shape[1])) 
    return a_new

%timeit tile_flatten(a,n)
# 149 µs ± 20.2 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

coldspeed's solution 1, n = 10000

a = np.array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
n = 10000

def concatenate_repeat(a, n):
    a_new =  np.concatenate(np.repeat(a[None, :], n, axis=0), axis=0)
    return a_new

%timeit concatenate_repeat(a,n)
# 7.61 ms ± 1.37 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

coldspeed's solution 2, n = 10000

a = np.array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
n = 10000

def broadcast_reshape(a, n):
    a_new =  np.broadcast_to(a, (n, *a.shape)).reshape(-1, a.shape[1])
    return a_new

%timeit broadcast_reshape(a,n)
# 162 µs ± 29.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

@user2357112's solution

def tile_only(a, n):
    a_new = np.tile(a, (n, 1))
    return a_new

%timeit tile_only(a,n)
# 142 µs ± 21.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
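
The %timeit lines above are IPython magics. As a rough sketch for a plain Python script, the same four functions can be checked for identical output and timed with the standard timeit module. The `timings` dict and the choice of 20 runs are assumptions made for illustration, and absolute numbers will differ from machine to machine.

```python
import timeit

import numpy as np

a = np.arange(12).reshape(3, 4)
n = 10000


def tile_flatten(a, n):
    return np.tile(a.flatten(), n).reshape((n * a.shape[0], a.shape[1]))


def concatenate_repeat(a, n):
    return np.concatenate(np.repeat(a[None, :], n, axis=0), axis=0)


def broadcast_reshape(a, n):
    return np.broadcast_to(a, (n, *a.shape)).reshape(-1, a.shape[1])


def tile_only(a, n):
    return np.tile(a, (n, 1))


candidates = [tile_flatten, concatenate_repeat, broadcast_reshape, tile_only]
reference = tile_only(a, n)
runs = 20  # arbitrary; raise it for steadier numbers
timings = {}

for func in candidates:
    # Every approach must produce exactly the same array.
    assert np.array_equal(func(a, n), reference)
    # timeit accepts any zero-argument callable, so a lambda is enough.
    total = timeit.timeit(lambda: func(a, n), number=runs)
    timings[func.__name__] = total / runs
    print(f"{func.__name__}: {timings[func.__name__] * 1e6:.0f} us per call")
```
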

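【Discussion】:

In your test of my answer, the tuple should probably be (n, 1) rather than (2, 1).
@user2357112: Oh yes, thanks for catching that. I was surprised to see an order-of-magnitude difference. I have corrected it now.

【Solution 2】:

Use np.repeat, followed by np.concatenate: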

np.concatenate(np.repeat(a[None, :], n, axis=0), axis=0)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Another option is to use np.broadcast_to:

np.broadcast_to(a, (n, *a.shape)).reshape(-1, a.shape[1])

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

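【Discussion】:

Sorry, I made a mistake in the desired output; its dimensions were wrong.
@coldspeed: Is reshape inefficient here?
@Bazingaa I don't think so; it only changes the array's strides (the data is not changed).
@Bazingaa You should really use timeit (not time) to time statements. I added a faster solution, but I think yours is still a bit faster.
@coldspeed: Thanks for the reply. Using timeit requires putting things in a function, right?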
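
On the reshape question: one way to check whether a copy is made is np.shares_memory. The sketch below suggests that np.broadcast_to itself returns a zero-stride view of a, while the reshape that follows it cannot be expressed with strides alone and so yields a new array. The variable names are illustrative only.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
n = 2

# broadcast_to adds a leading axis with stride 0, so no data is duplicated yet.
view = np.broadcast_to(a, (n, *a.shape))
print(view.strides[0])               # 0
print(np.shares_memory(view, a))     # True

# Merging the zero-stride axis with the row axis needs real memory,
# so this reshape returns a fresh array rather than a view.
result = view.reshape(-1, a.shape[1])
print(np.shares_memory(result, a))   # False
```

So the data is materialized at the reshape step, which fits the measured timings being close to those of np.tile.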

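【Solution 3】:

numpy.repeat repeats element by element. To repeat the array as a whole, you want numpy.tile.

numpy.tile(a, (2, 1))

The tuple gives the number of repetitions along each axis. You want 2 along the first axis and 1 along the second, so the tuple is (2, 1).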
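
To make the difference concrete, here is a small side-by-side sketch on the array from the question. The names `repeated` and `tiled` are chosen just for this example.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# np.repeat along axis 0 duplicates each row in place: rows 0, 0, 1, 1, 2, 2.
repeated = np.repeat(a, 2, axis=0)
print(repeated[:, 0])   # [0 0 4 4 8 8]

# np.tile stacks whole copies of the array: rows 0, 1, 2, 0, 1, 2.
tiled = np.tile(a, (2, 1))
print(tiled[:, 0])      # [0 4 8 0 4 8]
```
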

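【Discussion】:

Wow. I didn't know you could pass a tuple. Here's my +1.
For completeness, I'd like to add the timing of your solution to my answer, to highlight how novel your answer is. May I?
@Bazingaa: Go ahead.
I've added it. Thanks a lot.

【Solution 4】:

This is one case where the fill pattern of np.resize comes in handy: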

In [82]: arr = np.arange(12).reshape(3,4)
In [83]: np.resize(arr,(6,4))
Out[83]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

(The resize method of an array behaves differently: it pads with zeros instead of repeating the data.)
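
A short sketch of that difference, assuming a fresh copy so the array owns its data, and passing refcheck=False so the in-place resize is not blocked by other references to the array:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# The function np.resize fills the larger shape with repeated copies of arr.
repeated = np.resize(arr, (6, 4))
print(repeated[3:])   # the original three rows again

# The method ndarray.resize works in place and fills new entries with zeros.
own = arr.copy()      # the array must own its data to be resized in place
own.resize((6, 4), refcheck=False)
print(own[3:])        # all zeros
```
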

【Discussion】:

【Solution 5】:

You can try numpy.tile().

Here is how you can use numpy.tile to repeat your array while preserving the original order:

import numpy as np

a = np.array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

n = 5
b = np.tile(a, (n,1))
print(b)

Output:

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

【Discussion】:

【Solution 6】:

You can also try

b=np.append(a,a).reshape(np.shape(a)[0]*2,np.shape(a)[1])

Output

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
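
The line above hard-codes two copies, and np.append without an axis flattens both inputs, which is why the reshape is needed. As a possible variation rather than part of the original answer, passing axis=0 avoids the reshape, and np.concatenate on a list of n references extends the idea to any n:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# With axis=0, np.append stacks the rows directly, so no reshape is needed.
twice = np.append(a, a, axis=0)

# For a general n, concatenate a list holding n references to the same array.
n = 3
b = np.concatenate([a] * n, axis=0)
print(b.shape)   # (9, 4)
```
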

【Discussion】: