
【Question title】: Sklearn | LinearRegression | Fit
【Posted】: 2018-01-26 13:39:10
【Question description】:

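I'm having some trouble with the LinearRegression algorithm in Scikit Learn. I have trawled through the forums and Googled a lot, but for some reason I haven't managed to get past the error. I am using Python 3.5.

Below is what I've tried, but I keep getting a ValueError: "Found input variables with inconsistent numbers of samples: [403, 174]"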

X = df[["Impressions", "Clicks", "Eligible_Impressions", "Measureable_Impressions", "Viewable_Impressions"]].values

y = df["Total_Conversions"].values.reshape(-1,1)

print ("The shape of X is {}".format(X.shape))
print ("The shape of y is {}".format(y.shape))

The shape of X is (577, 5)
The shape of y is (577, 1)

X_train, y_train, X_test, y_test = train_test_split(X, y, test_size=0.3, random_state = 42)
linreg = LinearRegression()
linreg.fit(X_train, y_train)
y_pred = linreg.predict(X_test)
print (y_pred)

print ("The shape of X_train is {}".format(X_train.shape))
print ("The shape of y_train is {}".format(y_train.shape))
print ("The shape of X_test is {}".format(X_test.shape))
print ("The shape of y_test is {}".format(y_test.shape))

The shape of X_train is (403, 5)
The shape of y_train is (174, 5)
The shape of X_test is (403, 1)
The shape of y_test is (174, 1)

Am I missing something glaringly obvious?

Any help would be greatly appreciated.

Kind regards, Adrian

【Question discussion】:

Tags: python pandas numpy scikit-learn linear-regression

【Solution 1】:

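It looks like your train and test sets contain different numbers of rows for X and y. That's because you're storing the return values of train_test_split() in the wrong order. The function returns the train/test pair for each input array in turn, so with (X, y) as inputs the result is X_train, X_test, y_train, y_test. In your code, the variable named y_train actually holds the test rows of X (174, 5) and X_test holds the training rows of y (403, 1), which is why fit() sees 403 samples against 174.

Change this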

X_train, y_train, X_test, y_test = train_test_split(X, y, test_size=0.3, random_state = 42)

to this

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state = 42)
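
If you want to check the fix end to end, here is a minimal sketch. It assumes random arrays of the same shape as in the question, (577, 5) and (577, 1), as a stand-in for the original DataFrame, which isn't available here, so the fitted values themselves mean nothing. Only the shapes matter.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the asker's data (assumption: same shapes as in the question).
rng = np.random.default_rng(0)
X = rng.random((577, 5))
y = rng.random((577, 1))

# Correct unpacking order: train/test pair for X first, then train/test pair for y.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Cheap guard against the mix-up from the question: row counts must match pairwise.
assert X_train.shape[0] == y_train.shape[0]
assert X_test.shape[0] == y_test.shape[0]

linreg = LinearRegression()
linreg.fit(X_train, y_train)
y_pred = linreg.predict(X_test)

print("The shape of X_train is {}".format(X_train.shape))
print("The shape of y_train is {}".format(y_train.shape))
print("The shape of X_test is {}".format(X_test.shape))
print("The shape of y_test is {}".format(y_test.shape))
print("The shape of y_pred is {}".format(y_pred.shape))
```

With the corrected order, X_train and y_train both have 403 rows and X_test and y_test both have 174, so fit() and predict() run without the ValueError.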

【Discussion】:

That worked. I knew it was something silly. Thanks, Bob