Adding a New Column/Array to a Numpy Array: Answers

【Question title】: Adding a New Column/Array to a Numpy Array
【Posted】: 2014-03-05 14:58:39
【Question description】:

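I am trying to add a column to a numpy array. Each row currently has four values, and I want each row to have five. Below is a reproducible example that raises ValueError: all the input arrays must have same number of dimensions. I don't understand why the error occurs, because Y has the same length as X, just as b has the same length as a in the documentation. Ultimately, I want the most efficient way to add an array like Y to an existing array like X as a new column in every row.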

import numpy as np
from sklearn import datasets

#Documentation
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
c = np.concatenate((a, b.T), axis=1)
print(c)

#My Case
iris = datasets.load_iris()
X = iris.data
Y = iris.target
Z = np.concatenate((X, Y.T), axis = 1) #Is transpose necessary for single dimension array? Throws error either way
print(Z)

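Edit: I should add that in practice I will be using the predicted values from a fitted sklearn model. So I am specifically interested in the most efficient way to add predicted values to an existing array like X, which is in the format sklearn uses. The solution below comes from M4rtini's comment, and I think it is equivalent to one of Dietrich's solutions. Is this the fastest implementation?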

#My Case
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression

iris = datasets.load_iris()
X = iris.data
Y = iris.target
model = LinearRegression()
model.fit(X,Y)
y_hat = model.predict(X).reshape(-1,1)
Z = np.concatenate((X, y_hat), axis = 1)

【Question comments】:

Try Y.reshape(-1,1) instead of Y.T. Without a second dimension, the transpose does nothing.
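
The comment above can be checked directly. This is a minimal sketch that uses small made-up arrays in place of iris.data and iris.target (the values are illustrative assumptions, not the real data set):

```python
import numpy as np

# Stand-ins for X (2-D) and Y (1-D); the values are made up for illustration.
X = np.arange(12.0).reshape(3, 4)
Y = np.array([0, 1, 2])

# Transposing a 1-D array returns an array of the same shape.
print(Y.T.shape)               # (3,)

# reshape(-1, 1) gives a column with one row per element of Y.
Y_col = Y.reshape(-1, 1)
print(Y_col.shape)             # (3, 1)

# Both inputs are now 2-D with matching row counts, so concatenate works.
Z = np.concatenate((X, Y_col), axis=1)
print(Z.shape)                 # (3, 5)
```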

Tags: python arrays numpy scikit-learn

【Solution 1】:

To make sure your dimensions match, try:

Z = np.vstack((X.T,Y)).T

or

Yr = np.reshape(Y, (len(Y),1))
Z = np.hstack((X,Yr))

because

X.shape  = (150, 4)
Y.shape  = (150,)
Yr.shape = (150,1)
Z.shape  = (150,5)

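【Comments】:

Would reshape or transposition be faster, and what impact does each have on memory usage?
@Michael: Transposition will be faster and need less memory, because the array is not duplicated. Using M4rtini's solution, Y.reshape(-1,1), will probably make no difference in efficiency compared to Y.T.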
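
One way to look at the memory question is np.shares_memory, which reports whether two arrays overlap in memory. This is a minimal sketch with made-up arrays (the shapes mirror the iris data, the values are assumptions). It suggests that both Y.T and Y.reshape(-1, 1) are views of Y, while the concatenated result is a newly allocated array:

```python
import numpy as np

Y = np.arange(150)          # stand-in for a 1-D target array
X = np.zeros((150, 4))      # stand-in for the feature matrix

# Neither the transpose nor the reshape copies the data here.
Y_t = Y.T
Y_col = Y.reshape(-1, 1)
print(np.shares_memory(Y, Y_t))    # True
print(np.shares_memory(Y, Y_col))  # True

# Concatenation allocates a new array and copies both inputs into it.
Z = np.concatenate((X, Y_col), axis=1)
print(np.shares_memory(Z, X), np.shares_memory(Z, Y))  # False False
```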

【Solution 2】:

Note that b is a 2-D array:

In [1848]: b = np.array([[5, 6]])

In [1849]: b.shape
Out[1849]: (1, 2)

Y.T does not make Y a 2-D array:

In [1856]: Y
Out[1856]: array([0, 1, 2, 3])

In [1857]: Y.T
Out[1857]: array([0, 1, 2, 3])

In [1858]: Y.T.shape
Out[1858]: (4,)

To make Y a 2-D array:

In [1867]: Y1=Y.reshape(-1, 1)

In [1868]: Y1
Out[1868]: 
array([[0],
       [1],
       [2],
       [3]])

In [1869]: Y2=Y.reshape(1, -1)

In [1870]: Y2
Out[1870]: array([[0, 1, 2, 3]])

Or use np.newaxis:

In [1871]: Y3=Y[:, np.newaxis]

In [1872]: Y3
Out[1872]: 
array([[0],
       [1],
       [2],
       [3]])

In [1873]: Y4=Y[np.newaxis, :]

In [1874]: Y4
Out[1874]: array([[0, 1, 2, 3]])
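
The four forms above can be put side by side. This is a small sketch using the same Y as in the session; the matrix X is a made-up 4-row array added only to show which form can be appended as a column:

```python
import numpy as np

Y = np.array([0, 1, 2, 3])

Y1 = Y.reshape(-1, 1)      # column via reshape
Y3 = Y[:, np.newaxis]      # column via np.newaxis
Y2 = Y.reshape(1, -1)      # row via reshape
Y4 = Y[np.newaxis, :]      # row via np.newaxis

print(Y1.shape, Y3.shape)  # (4, 1) (4, 1)
print(Y2.shape, Y4.shape)  # (1, 4) (1, 4)

# Only the column form lines up with a 4-row matrix along axis=1.
X = np.ones((4, 2))        # hypothetical matrix for illustration
print(np.concatenate((X, Y3), axis=1).shape)  # (4, 3)
```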

【Comments】: