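How to get coefficients with a cross-validation model: answers

[Question title]: How to get coefficients with cross validation model
[Posted]: 2019-06-15 20:33:27
[Question]:

How can I get the coefficients from a cross-validated model? When I run cross-validation I get the scores of the CV model, but how do I get the coefficients?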

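# Split into training and testing sets
from sklearn import svm
from sklearn.model_selection import cross_val_score, train_test_split

x_train, x_test, y_train, y_test = train_test_split(samples, scores, test_size=0.30, train_size=0.70)

clf = svm.SVC(kernel='linear', C=1)
# Stored under a new name so the `scores` labels above are not overwritten
cv_scores = cross_val_score(clf, x_train, y_train, cv=5)
cv_scores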

I want to print the coefficient associated with each feature:

# Print the coefficient of each feature
for i in range(nFeatures):
    print(samples.columns[i], ":", coef[0][i])

Without cross-validation, this gives me the coefficients:

#Create SVM model using a linear kernel
model = svm.SVC(kernel='linear', C=C).fit(x_train, y_train)
coef = model.coef_

[Comments]:

Tags: python svm cross-validation

[Solution 1]:

You probably want to use model_selection.cross_validate (with return_estimator=True) instead of cross_val_score. It is more flexible and gives you access to the estimator fitted on each fold:

from sklearn.svm import SVC
from sklearn.model_selection import cross_validate

clf = SVC(kernel='linear', C=1)
cv_results = cross_validate(clf, x_train, y_train, cv=5, return_estimator=True)

for model in cv_results['estimator']:
    print(model.coef_)

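That should give you what you want, hopefully! (The metrics are available via cv_results['test_score'] and, if you also pass return_train_score=True, via cv_results['train_score'].)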
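As a rough sketch of reading the metrics next to each fold's estimator: the data below comes from make_classification and only stands in for the asker's x_train / y_train (an assumption), and return_train_score=True is passed explicitly so that the 'train_score' key is present.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.svm import SVC

# Synthetic binary data standing in for x_train / y_train (assumption).
x_train, y_train = make_classification(n_samples=150, n_features=4, random_state=0)

clf = SVC(kernel="linear", C=1)
cv_results = cross_validate(
    clf, x_train, y_train, cv=5,
    return_estimator=True,     # keep the estimator fitted on each fold
    return_train_score=True,   # needed for the 'train_score' key
)

# Each of these entries holds one item per fold, in the same order.
folds = zip(cv_results["estimator"], cv_results["train_score"], cv_results["test_score"])
for fold, (est, train, test) in enumerate(folds):
    print(f"fold {fold}: train={train:.3f} test={test:.3f} coef={est.coef_[0]}")
```

Each estimator here was fitted on a different subset of the data, so the coefficients generally differ a little from fold to fold.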

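[Comments]:

So if I print (samples.columns[i], ":", coef[0][i]), will I get the average coefficient of each column from the cross-validated model? @福迪
Sure. If you like, just load each model.coef_ result into an ndarray or a similar data structure, then compute the mean of each coefficient across the folds.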
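The averaging suggested in the last comment might look roughly like this. It is a minimal sketch under a few assumptions: a binary problem (so each coef_ has shape (1, n_features)), synthetic data from make_classification in place of the asker's samples, and a hypothetical feature_names list in place of samples.columns.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.svm import SVC

# Synthetic binary data; feature_names is a hypothetical stand-in for samples.columns.
X, y = make_classification(n_samples=200, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)
feature_names = ["f0", "f1", "f2", "f3"]

clf = SVC(kernel="linear", C=1)
cv_results = cross_validate(clf, X, y, cv=5, return_estimator=True)

# Stack the per-fold coefficients: shape (n_folds, 1, n_features).
fold_coefs = np.stack([est.coef_ for est in cv_results["estimator"]])

# Average over the fold axis: shape (1, n_features).
mean_coef = fold_coefs.mean(axis=0)

for name, value in zip(feature_names, mean_coef[0]):
    print(name, ":", value)
```

Note that the averaged vector summarizes five separately fitted models; it is not the coefficient vector of any single fitted estimator.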