Why does scikit-learn's GridSearchCV take only the first parameter as the best estimator?

【Question title】: Why does scikit-learn's GridSearchCV take only first parameter as best estimator?
【Posted】: 2020-04-25 15:22:58
【Question】:

I am trying to use GridSearchCV to find the best estimator for an SVC model. Here are my code and output:

from sklearn import datasets
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
param_grid = {'C': [5e3,1e3, 1e4, 5e4, 1e5], 'gamma': [0.0005, 0.0001, 0.001, 0.005, 0.01, 0.1]}
clf = GridSearchCV(SVC(kernel='rbf', class_weight='balanced', probability=True), param_grid)
clf = clf.fit(emb_array, label)
print("Best estimator found by grid search:")
print(clf.best_estimator_)

Output

Best estimator found by grid search:
SVC(C=5000.0, cache_size=200, class_weight='balanced', coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma=0.0005, kernel='rbf',
    max_iter=-1, probability=True, random_state=None, shrinking=True, tol=0.001,
    verbose=False)

If I change param_grid to

param_grid = {'C': [1e3, 5e3, 1e4, 5e4, 1e5], 'gamma': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.1]}

then the output is

Best estimator found by grid search:
SVC(C=1000.0, cache_size=200, class_weight='balanced', coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma=0.0001, kernel='rbf',
    max_iter=-1, probability=True, random_state=None, shrinking=True, tol=0.001,
    verbose=False)

If it simply takes the first value of each parameter as best_estimator_, what is the point of GridSearchCV?

【Comments】:

Tags: python scikit-learn grid-search svc

【Answer 1】:

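best_estimator_ returns the estimator with the highest score (or lowest loss), not the estimator built from the first parameter values. What you are seeing is probably an artifact of your data and of the nature of your hyperparameter sweep. For example, here is the same code on the iris data.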

from sklearn import datasets
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()

param_grid = {'C': [5e3,1e3, 1e4, 5e4, 1e5], 'gamma': [1.0, 0.0005, 0.0001, 0.001, 0.005, 0.01, 0.1]}
clf = GridSearchCV(SVC(kernel='rbf', class_weight='balanced', probability=True), param_grid)
clf = clf.fit(iris.data, iris.target)
print("Best estimator found by grid search:")
print(clf.best_estimator_)

Output

Best estimator found by grid search:
SVC(C=1000.0, break_ties=False, cache_size=200, class_weight='balanced',
    coef0=0.0, decision_function_shape='ovr', degree=3, gamma=0.0005,
    kernel='rbf', max_iter=-1, probability=True, random_state=None,
    shrinking=True, tol=0.001, verbose=False)

Here the first value in each list is not the one that ends up being best.
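One way to check whether the result in the question is such an artifact is to look at the per-candidate scores in cv_results_. The sketch below is only an illustration on the iris data with a reduced grid. The variable name search is mine, and probability=True is left out purely to keep the run fast. As far as I can tell, when several candidates tie on mean test score, GridSearchCV reports the first one in grid order, which would look exactly like "it always picks the first value".

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

iris = datasets.load_iris()

# Reduced grid for illustration; values are deliberately listed out of order.
param_grid = {"C": [5e3, 1e3, 1e4], "gamma": [0.0005, 0.0001, 0.001]}
search = GridSearchCV(SVC(kernel="rbf", class_weight="balanced"), param_grid)
search.fit(iris.data, iris.target)

results = search.cv_results_
mean_scores = results["mean_test_score"]

# Print every candidate with its rank and mean cross-validated score.
for params, score, rank in zip(results["params"], mean_scores, results["rank_test_score"]):
    print(rank, round(float(score), 4), params)

# Count how many candidates share the top score; more than one means a tie.
n_tied = int(np.sum(np.isclose(mean_scores, mean_scores.max())))
print("candidates tied for best score:", n_tied)
print("index of reported best candidate:", search.best_index_)
```

If the same check on your own data (emb_array, label) shows every candidate with an identical mean_test_score, the grid is not distinguishing between the settings at all, and the reported "best" is just the first entry. In that case it may be worth trying a much wider range of C and gamma or scaling the features first.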

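Note that you may also want to refactor your code a little. fit modifies the object it is called on in place (and returns that same object), so there is no need to reassign clf. It is also better to define the estimator explicitly outside the GridSearchCV object, like this: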

...
svc = SVC(kernel='rbf', class_weight='balanced', probability=True)
clf = GridSearchCV(svc, param_grid)
clf.fit(iris.data, iris.target)
...
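To make the refactoring point concrete, here is a small sketch, again on the iris data with a reduced grid of my own choosing. It checks that fit hands back the very GridSearchCV object it was called on, and that (as far as I understand scikit-learn's behaviour) the search works on clones, so the refitted winner in best_estimator_ is a different object from the svc instance that was passed in.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

iris = datasets.load_iris()

svc = SVC(kernel="rbf", class_weight="balanced")
clf = GridSearchCV(svc, {"C": [1e3, 5e3], "gamma": [0.0001, 0.001]})

# fit returns the GridSearchCV object itself, so reassignment is optional.
returned = clf.fit(iris.data, iris.target)
print("fit returned the same object:", returned is clf)

# The refitted winner lives in best_estimator_; compare it with the
# original svc instance and check whether svc itself was ever fitted.
print("best_estimator_ is svc:", clf.best_estimator_ is svc)
print("svc has fitted attributes:", hasattr(svc, "support_"))
print("best parameters:", clf.best_params_)
```

Keeping svc as a separate name mainly helps readability: the base configuration stays visible in one place, and the tuned model is always taken from clf.best_estimator_.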

【Comments】:

Can you explain my case? Why does it pick only the first value of C and gamma?