Why is the length of the array appended in a loop greater than the number of iterations? - Answers

【Question title】: Why the length of the array appended in loop is more than the number of iteration?
【Posted】: 2020-09-04 12:45:16
【Question】:

I ran this code expecting an array of size 10000, because time is a numpy array of length 10000.

freq=np.empty([])
for i,t in enumerate(time):
    freq=np.append(freq,np.sin(t))
print(time.shape)
print(freq.shape)

But this is the output I got:

(10000,)
(10001,)

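Can anyone explain why I get this discrepancy?

【Comments】:

Have you looked at simpler cases of np.append, such as np.append(np.empty([]),1)? That produces a (2,) array. Don't use np.append. It is playing games with inputs in ways you don't understand. And it is inefficient when used in a loop.

Tags: python numpy for-loop append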

【Solution 1】:

I think you are trying to replicate a list operation:

freq=[]
for i,t in enumerate(time):
    freq.append(np.sin(t))

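But neither np.empty nor np.append is an exact clone of its list counterpart; the names are similar, but the behavior differs significantly.

First: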

In [75]: np.empty([])                                                                                  
Out[75]: array(1.)
In [77]: np.empty([]).shape                                                                            
Out[77]: ()

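This is a 1-element, 0d array.

If you look at the code of np.append, you will see that if the first argument is not 1d (and no axis is provided) it flattens it (this is also documented):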

In [78]: np.append??                                                                                   
In [82]: np.empty([]).ravel()                                                                          
Out[82]: array([1.])
In [83]: np.empty([]).ravel().shape                                                                    
Out[83]: (1,)

It is now a 1d, 1-element array. Append another value to it:

In [84]: np.append(np.empty([]), np.sin(2))                                                            
Out[84]: array([1.        , 0.90929743])

The result has 2 elements (shape (2,)). Repeat that 10000 times and you end up with 10001 values.
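
The off-by-one can be reproduced on a small scale. The following is a minimal sketch; the 5-element `time` array is an assumption for illustration (the question used 10000 elements).

```python
import numpy as np

# Hypothetical small stand-in for the question's 10000-element `time` array.
time = np.linspace(0.0, 1.0, 5)

freq = np.empty([])  # 0d array that already holds one (uninitialized) element
first = np.append(freq, np.sin(time[0]))
print(first.shape)   # (2,): the leftover element plus the appended one

for t in time:
    freq = np.append(freq, np.sin(t))

print(time.shape)    # (5,)
print(freq.shape)    # (6,): one more than the number of iterations
```

The extra value is always the first element of freq, left over from the starting array.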

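np.empty, despite its name, does not produce the equivalent of a [] list. As others have shown, np.array([]) and np.empty(0) do.

np.append is not a clone of list append. It is just a cover function for np.concatenate. It is fine for adding an element to a longer array, but beyond that it has too many pitfalls to be useful. It is especially bad in a loop like this. Getting the correct starting array is tricky. And it is slow (compared to list append). In fact, these problems apply to all uses of concatenate, stack, ... in a loop.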
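
As a rough sketch of the alternatives, the list-append loop from the top of this answer can be converted to an array once at the end. The vectorized call is not part of the original answer; it assumes `time` is a numeric numpy array, and the `time` values here are a hypothetical stand-in.

```python
import numpy as np

# Hypothetical stand-in for the question's `time` array.
time = np.linspace(0.0, 10.0, 10000)

# Option 1: collect values in a plain list, convert once at the end.
values = []
for t in time:
    values.append(np.sin(t))
freq_list = np.array(values)

# Option 2 (an assumption, not in the original answer):
# apply np.sin to the whole array at once, with no Python loop.
freq_vec = np.sin(time)

print(freq_list.shape)  # (10000,)
print(freq_vec.shape)   # (10000,)
print(np.allclose(freq_list, freq_vec))  # True
```

Both options produce exactly one value per element of time, so there is no starting array to get wrong.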

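【Comments】:

【Solution 2】:

You should use np.empty(0).

I looked in the numpy source (numpy/core.py) and found the following. Note that this docstring describes the matrix variant, np.matlib.empty, rather than np.empty itself, but both take the same shape argument: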

def empty(shape, dtype=None, order='C'):
    """Return a new matrix of given shape and type, without initializing entries.

    Parameters
    ----------
    shape : int or tuple of int
        Shape of the empty matrix.
    dtype : data-type, optional
        Desired output data-type.
    order : {'C', 'F'}, optional
        Whether to store multi-dimensional data in row-major
        (C-style) or column-major (Fortran-style) order in
        memory.

    See Also
    --------
    empty_like, zeros

    Notes
    -----
    `empty`, unlike `zeros`, does not set the matrix values to zero,
    and may therefore be marginally faster.  On the other hand, it requires
    the user to manually set all the values in the array, and should be
    used with caution.

    Examples
    --------
    >>> import numpy.matlib
    >>> np.matlib.empty((2, 2))    # filled with random data
    matrix([[  6.76425276e-320,   9.79033856e-307], # random
            [  7.39337286e-309,   3.22135945e-309]])
    >>> np.matlib.empty((2, 2), dtype=int)
    matrix([[ 6600475,        0], # random
            [ 6586976, 22740995]])
    """
    return ndarray.__new__(matrix, shape, dtype, order=order)

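It passes the first argument, shape, straight to ndarray. With shape 0 that yields a new empty array, printed as []; with shape [] it yields a 0d array that holds a single element.

You can print np.empty(0) and freq=np.empty([]) to see the difference.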
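
A minimal sketch of that comparison follows. The variable names `a` and `b` are placeholders chosen for illustration; printing the shape, number of dimensions, and size is more informative than printing the arrays, since the single value in `b` is arbitrary.

```python
import numpy as np

a = np.empty(0)   # shape given as the integer 0
b = np.empty([])  # shape given as an empty sequence

print(a.shape, a.ndim, a.size)  # (0,) 1 0
print(b.shape, b.ndim, b.size)  # () 0 1
```

Only `a` is empty in the sense of holding no elements; `b` already holds one, which is where the extra value in the question comes from.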

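【Comments】:

I think his use of np.append is more of a problem than his use of np.empty. But the OP used both np.empty([]) and np.append in a naive, list-like fashion.

【Solution 3】:

It turns out that the function np.empty() returns an uninitialized array of the given shape. So when you call np.empty([]), it returns an uninitialized array holding one arbitrary value, for example array(0.14112001). It is like having a slot that is "ready to use" but holds no meaningful value. You can check this by printing the variable freq before the loop starts.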

So when the loop runs freq = np.append(freq,np.sin(t)), the first call keeps that uninitialized element as the first value and appends np.sin(t) as the second.

Also, if you just need to create an empty array, simply use x = np.array([]) or x = [].
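
Applied to the loop from the question, the np.array([]) starting point gives matching lengths. This is a minimal sketch; the 100-element `time` array is an assumption for illustration.

```python
import numpy as np

# Hypothetical, smaller stand-in for the question's `time` array.
time = np.linspace(0.0, 10.0, 100)

freq = np.array([])  # shape (0,), holds no elements
for t in time:
    freq = np.append(freq, np.sin(t))

print(time.shape)  # (100,)
print(freq.shape)  # (100,)
```

This fixes the length, but each np.append call still copies the whole array, so it remains slow for long loops.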

You can read more about the numpy.empty function here:

https://numpy.org/doc/1.18/reference/generated/numpy.empty.html

And here is more information about initializing arrays:

https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.3/com.ibm.xlc1313.aix.doc/language_ref/aryin.html

I'm not sure whether I was clear enough. This is not a straightforward concept, so please let me know.

【Comments】: