Confused about how GridSearchCV works

【Question title】: Confused with respect to the working of GridSearchCV
【Posted】: 2015-01-13 17:41:03
【Question】:

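GridSearchCV implements a fit method in which it performs n-fold cross-validation to determine the best parameters. After that, we can apply the best estimator directly to the test data with predict(), as in this example: http://scikit-learn.org/stable/auto_examples/grid_search_digits.html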

It says there: "The model is trained on the full development set."

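However, we have only applied n-fold cross-validation here. Is the classifier somehow also trained on the entire data set? Or, when predict is called, does it just pick the best trained estimator, with the best parameters, from among the n folds?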

【Comments】:

Tags: python machine-learning scikit-learn

【Solution 1】:

If you want to use predict, 'refit' must be set to True. From the documentation:

refit : boolean
    Refit the best estimator with the entire dataset. 
    If “False”, it is impossible to make predictions using 
    this GridSearchCV instance after fitting.

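refit appears to default to True, so in the example predict uses the best estimator refit on the entire training (development) set, not one of the models fitted on the individual folds.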
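A minimal sketch of this behavior, assuming a recent scikit-learn where GridSearchCV lives in sklearn.model_selection (in 2015 it was in sklearn.grid_search). The parameter grid, the 50/50 split, and the variable names are illustrative choices, not taken from the linked example. It compares the search's predictions with those of a model trained by hand on the whole development set using the best parameters, and checks what happens with refit=False.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Split the digits data into a development set and a held-out test set.
X, y = load_digits(return_X_y=True)
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Illustrative grid; any small grid works for this demonstration.
param_grid = {"C": [1, 10], "gamma": [1e-3, 1e-4]}

# refit is left at its default, so after the cross-validated search the
# best parameter setting is fitted once more on all of X_dev.
search = GridSearchCV(SVC(), param_grid, cv=3)
search.fit(X_dev, y_dev)

# Train a model by hand on the full development set with the best parameters.
manual = SVC(**search.best_params_).fit(X_dev, y_dev)

# If predict() uses a model refit on the full development set, both
# models should give identical predictions on the test set.
same = bool((search.predict(X_test) == manual.predict(X_test)).all())
print("predictions identical:", same)

# With refit=False the search only records scores and best_params_;
# no final model is trained, so there is nothing to predict with.
no_refit = GridSearchCV(SVC(), param_grid, cv=3, refit=False)
no_refit.fit(X_dev, y_dev)
has_best_estimator = hasattr(no_refit, "best_estimator_")
print("best_estimator_ available without refit:", has_best_estimator)
```

So the models fitted on the individual folds are only used to score each parameter combination. The estimator behind predict() is a separate model trained afterwards on all of the data passed to fit().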

【Comments】: