How to perform a preprocessing method on the data with Perceptron in GridSearchCV?

【Question title】: How to perform a preprocessing method on the data with Perceptron in GridSearchCV?
【Posted】: 2021-02-18 04:12:21
【Question】:

I have checked this question, but the answers did not help.

I am trying to use preprocessing methods such as StandardScaler and Normalizer with a Perceptron inside GridSearchCV:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, Normalizer
from sklearn.linear_model import Perceptron
from sklearn.model_selection import GridSearchCV

param_grid = [{
    'tol': [1e-1, 1e-3, 1e-5],
    'penalty': ['l2', 'l1', 'elasticnet'],
    'eta0': [0.0001, 0.001, 0.01, 0.1, 1.0]
}]

scoring = {
    'AUC-ROC': 'roc_auc',
    'Accuracy': 'accuracy',
    'AUC-PR': 'average_precision'
}

pipe = Pipeline([('scale', StandardScaler()), ('clf', Perceptron())])

search = GridSearchCV(pipe,
                      param_grid,
                      scoring=scoring,
                      refit='AUC-ROC',
                      cv=skf,
                      return_train_score=True)

results = search.fit(Xtrain, ytrain)

When I run the code, I get:

ValueError: Invalid parameter class_weight for estimator Pipeline(steps=[('scale', StandardScaler()), ('clf', Perceptron())]). Check the list of available parameters with `estimator.get_params().keys()`.

I think this error occurs because the provided param_grid does not apply to StandardScaler(). Also, when I print search.get_params().keys(), I get:

dict_keys(['cv', 'error_score', 'estimator__memory', 'estimator__steps', 'estimator__verbose', 'estimator__scale', 'estimator__clf', 'estimator__scale__copy', 'estimator__scale__with_mean', 'estimator__scale__with_std', 'estimator__clf__alpha', 'estimator__clf__class_weight', 'estimator__clf__early_stopping', 'estimator__clf__eta0', 'estimator__clf__fit_intercept', 'estimator__clf__l1_ratio', 'estimator__clf__max_iter', 'estimator__clf__n_iter_no_change', 'estimator__clf__n_jobs', 'estimator__clf__penalty', 'estimator__clf__random_state', 'estimator__clf__shuffle', 'estimator__clf__tol', 'estimator__clf__validation_fraction', 'estimator__clf__verbose', 'estimator__clf__warm_start', 'estimator', 'n_jobs', 'param_grid', 'pre_dispatch', 'refit', 'return_train_score', 'scoring', 'verbose'])

How can I solve this?

【Comments on the question】:

Does this answer your question? How to apply StandardScaler in Pipeline in scikit-learn (sklearn)?
@JesusSono It does not, but it provides good information, thanks.

Tags: python python-3.x scikit-learn

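【Solution 1】:

You should specify which step of the pipeline each param_grid entry applies to, by prefixing the parameter name with the step name and a double underscore: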

param_grid = [{
    'clf__tol': [1e-1, 1e-3, 1e-5],
    'clf__penalty': ['l2', 'l1', 'elasticnet'],
    'clf__eta0': [0.0001, 0.001, 0.01, 0.1, 1.0]
}]
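
For a version that can be run end to end, here is a minimal sketch of the same idea. It goes one step beyond the answer: it also lists the bare step name 'scale' in the grid, so that StandardScaler and Normalizer are compared as alternative preprocessing steps. The synthetic data (X_demo, y_demo), the definition of skf, and the single 'roc_auc' scorer are assumptions made for illustration; the question does not show its data or its splitter.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import Normalizer, StandardScaler

# Synthetic stand-in for Xtrain / ytrain (assumption, not the asker's data).
X_demo, y_demo = make_classification(n_samples=200, n_features=10, random_state=0)

# Assumed definition of `skf`; the question does not show how it was built.
skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

pipe = Pipeline([('scale', StandardScaler()), ('clf', Perceptron(random_state=0))])

# Keys follow the pattern <step name>__<parameter name>.
# The bare step name 'scale' replaces the whole preprocessing step.
param_grid = [{
    'scale': [StandardScaler(), Normalizer()],
    'clf__tol': [1e-1, 1e-3, 1e-5],
    'clf__penalty': ['l2', 'l1', 'elasticnet'],
    'clf__eta0': [0.0001, 0.001, 0.01, 0.1, 1.0],
}]

# Sanity check: every grid key must be a parameter name of the pipeline itself.
valid_keys = set(pipe.get_params().keys())
unknown_keys = [key for key in param_grid[0] if key not in valid_keys]

search = GridSearchCV(pipe, param_grid, scoring='roc_auc', cv=skf)
search.fit(X_demo, y_demo)
print(search.best_params_)
```

Note that the keys printed in the question come from search.get_params() and therefore carry an extra estimator__ prefix. The grid itself is resolved against the pipeline, so its keys start at the step name (clf__tol, not estimator__clf__tol).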

【Comments】: