Answers to: sklearn finding the name of linear regression coefficients

【Question title】: sklearn finding the name of linear regression coefficient
【Posted】: 2021-09-20 23:37:58
【Question description】:

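Question: How can I find out which feature each output coefficient belongs to, without manually tracking the order of the features fed into the linear regression?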

I have a dataset with the following features.
usertype contains Subscriber and Customer.

I split the data with train_test_split.

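from sklearn.model_selection import train_test_split

# Columns used as model inputs; tripduration is the target
feature = ['age','usertype','gender']

X = citibike_dropped[feature]
y = citibike_dropped['tripduration']


X_train, X_test, y_train, y_test = train_test_split(X,y,random_state=123)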

I use an sklearn Pipeline for preprocessing and for fitting the linear regression.

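from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# One-hot encode usertype, scale age, pass the remaining columns through unchanged
ct = ColumnTransformer(
    [('ohe',OneHotEncoder(handle_unknown = 'ignore'),['usertype']),
    ('scaler',MinMaxScaler(),['age'])],
    remainder = 'passthrough')

lr = LinearRegression()


Input = [('transformer',ct),('clf',lr)]
pipe = Pipeline(Input)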

After fitting pipe on X_train and y_train, I check the coefficients.

pipe.fit(X_train,y_train);
pipe.named_steps['clf'].coef_

Output

array([  0.        , 499.85347478, 177.64720307])

How can I find out which feature each of the above coefficients belongs to?

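【Comments】:

Do you mean w? For example, y = 0*usertype + 499*gender + 177.64*age + c?
@PrakashDahal Yes. In scikit-learn, coef_ is w and intercept_ is b.
The features are probably in the order in which you supplied them to the model, e.g. feature = ['age','usertype','gender'], so y = 0*age + 499*usertype + 177.64*gender + c.
@PrakashDahal I believe coef_ follows the order in which the ColumnTransformer feeds the columns into the linear regression. In this case that would be OneHotEncoder:usertype, then MinMaxScaler:age, then passthrough:gender. But that way I have to keep track of the order manually. With a large number of features and several pipelines, manual tracking can get difficult. So I would like to know whether there is a function that returns the names that go with coef_.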
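
The ordering discussed in the comments above can be checked on a small example. The sketch below is not from the original thread: the `toy` frame and `target` values are made up, and only the pipeline layout mirrors the question. It transforms the data with the fitted ColumnTransformer, confirms that the passthrough column ends up last, and reproduces the predictions as X·w + b from `coef_` and `intercept_`.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Hypothetical toy data; the values are invented for illustration only.
toy = pd.DataFrame({
    "age": [25, 40, 31, 58, 46, 22],
    "usertype": ["Subscriber", "Customer", "Subscriber",
                 "Customer", "Subscriber", "Customer"],
    "gender": [1, 2, 1, 0, 2, 1],
})
target = np.array([600.0, 1500.0, 720.0, 1800.0, 650.0, 1400.0])

ct = ColumnTransformer(
    [("ohe", OneHotEncoder(handle_unknown="ignore"), ["usertype"]),
     ("scaler", MinMaxScaler(), ["age"])],
    remainder="passthrough",
)
pipe = Pipeline([("transformer", ct), ("clf", LinearRegression())])
pipe.fit(toy, target)

# The matrix the regression actually sees: listed transformers first,
# in the order given, followed by the passthrough remainder.
Xt = pipe.named_steps["transformer"].transform(toy)
Xt = Xt.toarray() if hasattr(Xt, "toarray") else np.asarray(Xt)

model = pipe.named_steps["clf"]
# coef_ plays the role of w and intercept_ the role of b.
manual = Xt @ model.coef_ + model.intercept_

print(Xt.shape[1], model.coef_.shape[0])       # same number of columns and coefficients
print(np.array_equal(Xt[:, -1], toy["gender"]))  # passthrough column comes last
print(np.allclose(manual, pipe.predict(toy)))
```

Note that the one-hot encoder contributes one column per category, so the number of coefficients can be larger than the number of input features.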

Tags: python scikit-learn

【Solution 1】:

Perhaps you could look here or here to find the decision line or the region that belongs to the feature.

【Discussion】: