【Question Title】: Why does the score method need its argument reshaped while r2_score does not?
【Posted】: 2020-09-02 04:25:20
【Question】:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import r2_score  # the sklearn.metrics.regression path was removed in later releases
np.random.seed(0)
n = 15
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10
X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)

def regression_score():
    polynomial_features= PolynomialFeatures(degree=12)
    x_train_poly = polynomial_features.fit_transform(X_train.reshape(-1,1))
    x_test_poly = polynomial_features.fit_transform(X_test.reshape(-1,1))
    model = LinearRegression()
    model.fit(x_train_poly, y_train)
    test_pred_linear_regression = model.predict(x_test_poly)
    LinearRegression_R2_test_score = model.score(y_test, test_pred_linear_regression)

regression_score()

Whenever I run the code above, I get the following error:

ValueError: Expected 2D array, got 1D array instead:
array=[ 0.99517935 -0.16081     0.3187423   1.53763897].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Yet when I check the shapes of y_test and test_pred_linear_regression, both are (4,). And whenever I use r2_score instead of the score method, I get the result I want.

LinearRegression_R2_test_score = r2_score(y_test, test_pred_linear_regression)

Can anyone point out what I am missing here?

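【Comments】:

  • Before scoring, try y_test = y_test.reshape(-1,1) and test_pred_linear_regression = test_pred_linear_regression.reshape(-1,1), and see whether that helps.
  • I had already tried reshaping the values inside the score call and still got the same error. After following your suggestion, I get an error saying that a local variable is referenced before assignment.
  • "local variable is referenced before assignment" means you did not put the code in the right place. Try it just before LinearRegression_R2_test_score = model.score(y_test, test_pred_linear_regression).
  • I did try it the way you describe and still get the same error. As I said in my first comment, I tried LinearRegression_R2_test_score = lasso.score(y_test.reshape(-1,1), test_pred_linear_regression.reshape(-1,1)), which gives ValueError: shapes (4,1) and (13,) not aligned: 1 (dim 1) != 13 (dim 0)
  • It was worth a try. I don't know the details of these methods. I suggest reading their documentation to understand the input requirements.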
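The (4,1) versus (13,) mismatch quoted in the comments can be reproduced with a small sketch. The data below is made up for illustration and is not the asker's split. The point is that PolynomialFeatures(degree=12) turns one feature into 13 columns, so the fitted model holds 13 coefficients and rejects any X that has a single column. The exact wording of the ValueError depends on the installed scikit-learn version.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Illustrative data (an assumption for this sketch): 11 samples, one feature.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=11)
y = np.sin(x) + x / 6

poly = PolynomialFeatures(degree=12)
x_poly = poly.fit_transform(x.reshape(-1, 1))
model = LinearRegression().fit(x_poly, y)

print(x_poly.shape)       # (11, 13): bias column plus x**1 ... x**12
print(model.coef_.shape)  # (13,)

# A reshaped target has shape (4, 1), so it cannot stand in for X.
wrong_X = y[:4].reshape(-1, 1)
raised = False
try:
    model.predict(wrong_X)
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

Reshaping y_test therefore only changes which error appears: the first argument of score is still not the 13-column feature matrix the model was trained on.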

Tags: python numpy scikit-learn


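【Solution 1】:

This comes down to what input each function expects. The score method of a scikit-learn estimator takes the test inputs (x_test_poly) and the corresponding true values (y_test), and computes the predictions itself. The r2_score function takes the true values (y_test) and the predicted values (y_pred).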
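As a minimal sketch of the two call signatures, the toy data below (made up for this illustration) shows that a regressor's score(X, y) and r2_score(y, model.predict(X)) return the same number. Only the arguments differ.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Toy data, assumed for this sketch only.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = np.array([0.1, 0.9, 2.2, 2.8, 4.1])

model = LinearRegression().fit(X, y)

via_score = model.score(X, y)               # (features, true targets)
via_metric = r2_score(y, model.predict(X))  # (true targets, predictions)

print(np.isclose(via_score, via_metric))
```

Passing predictions as the second argument of score, as in the question, makes the method treat the first argument as a feature matrix, which is why it asks for a 2D array.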

r2_score

LinearRegression

Hope that helps!

Edit: the correct code is

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import r2_score  # the sklearn.metrics.regression path was removed in later releases
np.random.seed(0)
n = 15
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10
X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)

def regression_score():
    polynomial_features= PolynomialFeatures(degree=12)
    x_train_poly = polynomial_features.fit_transform(X_train.reshape(-1,1))
    x_test_poly = polynomial_features.transform(X_test.reshape(-1,1))  # reuse the transformer fitted on the training data
    model = LinearRegression()
    model.fit(x_train_poly, y_train)
    test_pred_linear_regression = model.predict(x_test_poly)
    LinearRegression_R2_test_score = model.score(x_test_poly,y_test)
    r2 = r2_score(y_test,test_pred_linear_regression)
    return LinearRegression_R2_test_score, r2

linearreg, r2 = regression_score() # -4.311980555741178, -4.311980555741178
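Both calls return the same negative value because both evaluate the usual definition R² = 1 - SS_res / SS_tot, which falls below zero whenever the predictions are worse than always predicting the mean of the true values. A hand computation on made-up numbers (not the data above) sketches this:

```python
import numpy as np
from sklearn.metrics import r2_score

# Made-up values for illustration; the predictions are deliberately poor.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([4.0, 1.0, 5.0, 0.0])

ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares: 30
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares: 5
r2_manual = 1.0 - ss_res / ss_tot               # 1 - 30/5 = -5.0

print(r2_manual, r2_score(y_true, y_pred))
```

With a degree-12 polynomial fitted to 11 training points, a strongly negative test score like the one above is a plausible sign of overfitting rather than of a wrong call.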

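【Comments】:

  • difference-between-model-score-vs-r2-score. The two actually do the same thing; what I don't understand is the array-reshaping part of the score method.
  • You don't need to reshape anything! The score method does not take predicted values as input. I have added the correct code to the answer.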