make_scorer NotFittedError: Estimator not fitted, call `fit` before exploiting the model

【Question title】: make_scorer NotFittedError: Estimator not fitted, call `fit` before exploiting the model
【Posted】: 2020-07-03 02:00:54
【Question description】:

I am trying to use a custom scoring function for my GridSearchCV.

def my_custom_loss_func(model,X_test,Y_test):
    train_view=pd.DataFrame(np.round(model.predict(X_test)),columns=['predicted'])
    train_view['original']=pd.DataFrame(Y_test).reset_index().delay
    train_view['difference']=train_view.original-train_view.predicted
    score = train_view[abs(train_view.difference) <=3 ].count()[2]/train_view.shape[0]*100
    return score
my_scorer = make_scorer(my_custom_loss_func, greater_is_better=True)

Here is my model fitting code:

params = {"learning_rate"    : [0.05, 0.10, 0.15, 0.20, 0.25, 0.30 ] ,
 "max_depth"        : [ 3, 4, 5, 6, 8, 10, 12, 15],
 "min_child_weight" : [ 1, 3, 5, 7 ],
 "gamma"            : [ 0.0, 0.1, 0.2 , 0.3, 0.4 ],
 "colsample_bytree" : [ 0.3, 0.4, 0.5 , 0.7 ] }

folds = 3
param_comb = 5

skf = StratifiedKFold(n_splits=folds, shuffle = True, random_state = 1001)
model_lgbm = lgb.LGBMRegressor(n_estimators=300)

random_search = RandomizedSearchCV(model_lgbm, param_distributions=params, n_iter=param_comb, scoring=my_scorer(model_lgbm,X_train,y_train), n_jobs=4, cv=skf.split(X_train,y_train), verbose=3, random_state=1001 )
random_search.fit(X_train,y_train)

I get this error:

---------------------------------------------------------------------------
NotFittedError                            Traceback (most recent call last)
<ipython-input-101-8b3c5a6432af> in <module>
----> 1 random_search = RandomizedSearchCV(model_lgbm, param_distributions=params, n_iter=param_comb, scoring=my_scorer(model_lgbm,X_train,y_train), n_jobs=4, cv=skf.split(X_train,y_train), verbose=3, random_state=1001 )
      2 random_search.fit(X_train,y_train)

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\metrics\scorer.py in __call__(self, estimator, X, y_true, sample_weight)
     89         """
     90 
---> 91         y_pred = estimator.predict(X)
     92         if sample_weight is not None:
     93             return self._sign * self._score_func(y_true, y_pred,

C:\ProgramData\Anaconda3\lib\site-packages\lightgbm\sklearn.py in predict(self, X, raw_score, num_iteration, pred_leaf, pred_contrib, **kwargs)
    648         """
    649         if self._n_features is None:
--> 650             raise LGBMNotFittedError("Estimator not fitted, call `fit` before exploiting the model.")
    651         if not isinstance(X, (DataFrame, DataTable)):
    652             X = _LGBMCheckArray(X, accept_sparse=True, force_all_finite=False)

NotFittedError: Estimator not fitted, call `fit` before exploiting the model.

I cannot understand where I am going wrong.

【Question comments】:

Tags: scikit-learn hyperparameters gridsearchcv

【Solution 1】:

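Your custom loss function should take only y_true and y_pred as arguments, in that order, because that is the signature make_scorer expects. The error arises because predict is being called on a model that has not been trained yet: as the message states, you can only use model.predict on an already fitted model. The traceback shows where this happens: scoring=my_scorer(model_lgbm,X_train,y_train) calls the scorer immediately on the unfitted model_lgbm, before the search has fitted anything. Pass the scorer object itself (scoring=my_scorer) and let the search fit each candidate and score it on the held-out fold.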

See "Custom Loss Functions for Gradient Boosting" for an example of how to write a custom loss function in sklearn.
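
As a rough illustration of the advice above, here is a minimal sketch of the metric rewritten to take (y_true, y_pred) and wired into the search. The names pct_within_tolerance and build_search are hypothetical, not from the question. It assumes y is one-dimensional (for example the delay column as a Series), and it uses cv=3 as a stand-in for the original StratifiedKFold splitter.

```python
def pct_within_tolerance(y_true, y_pred, tol=3):
    """Percentage of rounded predictions that lie within `tol` of the target.

    Signature is (y_true, y_pred), which is what make_scorer expects.
    Assumes both inputs are one-dimensional sequences of equal length.
    """
    y_true = list(y_true)
    y_pred = list(y_pred)
    hits = sum(abs(t - round(p)) <= tol for t, p in zip(y_true, y_pred))
    return 100.0 * hits / len(y_true)


def build_search(estimator, params, n_iter=5, cv=3, random_state=1001):
    """Build a RandomizedSearchCV that uses the metric above (hypothetical helper).

    scikit-learn is imported here so the metric stays usable on its own.
    """
    from sklearn.metrics import make_scorer
    from sklearn.model_selection import RandomizedSearchCV

    scorer = make_scorer(pct_within_tolerance, greater_is_better=True)
    # Pass the scorer object itself. Calling scorer(estimator, X, y) here
    # would run predict on the unfitted estimator and raise NotFittedError.
    return RandomizedSearchCV(
        estimator,
        param_distributions=params,
        n_iter=n_iter,
        scoring=scorer,
        cv=cv,
        random_state=random_state,
    )
```

With the objects from the question, this would be used as build_search(model_lgbm, params).fit(X_train, y_train); the search then fits each candidate on the training folds and applies the scorer to the held-out fold.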

【Comments】: