RandomForestClassifier instance not fitted yet. Call 'fit' with appropriate arguments before using this method

【Question title】: RandomForestClassifier instance not fitted yet. Call 'fit' with appropriate arguments before using this method
【Posted】: 2018-12-26 02:43:44
【Question】:

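I am trying to train a decision tree model, save it, and reload it later when I need it. However, I keep getting the following error:

This DecisionTreeClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

Here is my code: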

X_train, X_test, y_train, y_test = train_test_split(data, label, test_size=0.20, random_state=4)

names = ["Decision Tree", "Random Forest", "Neural Net"]

classifiers = [
    DecisionTreeClassifier(),
    RandomForestClassifier(),
    MLPClassifier()
    ]

score = 0
for name, clf in zip(names, classifiers):
    if name == "Decision Tree":
        clf = DecisionTreeClassifier(random_state=0)
        grid_search = GridSearchCV(clf, param_grid=param_grid_DT)
        grid_search.fit(X_train, y_train_TF)
        if grid_search.best_score_ > score:
            score = grid_search.best_score_
            best_clf = clf
    elif name == "Random Forest":
        clf = RandomForestClassifier(random_state=0)
        grid_search = GridSearchCV(clf, param_grid_RF)
        grid_search.fit(X_train, y_train_TF)
        if grid_search.best_score_ > score:
            score = grid_search.best_score_
            best_clf = clf

    elif name == "Neural Net":
        clf = MLPClassifier()
        clf.fit(X_train, y_train_TF)
        y_pred = clf.predict(X_test)
        current_score = accuracy_score(y_test_TF, y_pred)
        if current_score > score:
            score = current_score
            best_clf = clf


pkl_filename = "pickle_model.pkl"  
with open(pkl_filename, 'wb') as file:  
    pickle.dump(best_clf, file)

from sklearn.externals import joblib
# Save to file in the current working directory
joblib_file = "joblib_model.pkl"  
joblib.dump(best_clf, joblib_file)

print("best classifier: ", best_clf, " Accuracy= ", score)

Here is how I load the model and test it:

#First method
with open(pkl_filename, 'rb') as h:
    loaded_model = pickle.load(h) 
#Second method 
joblib_model = joblib.load(joblib_file)

As you can see, I tried two ways of saving the model, and neither worked.

Here is how I test it:

print(loaded_model.predict(test)) 
print(joblib_model.predict(test))

You can clearly see that the models are actually fitted, and if I try any other model, such as SVM or logistic regression, this approach works fine.

【Comments】:

You fitted the grid search object, so you should change the assignment to best_clf = grid_search. Your MLPClassifier code is fine.

Tags: python machine-learning scikit-learn cross-validation grid-search

【Solution 1】:

The problem is in this line:

best_clf = clf

You passed clf to grid_search, which clones the estimator and fits the data on those clones. Your original clf is therefore never fitted.
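The cloning behaviour can be checked directly. The following is a minimal sketch, not the asker's code: the synthetic dataset from make_classification and the small max_depth grid are assumptions made only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.exceptions import NotFittedError
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, an assumption for this sketch (not the asker's data).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

clf = DecisionTreeClassifier(random_state=0)
# Hypothetical parameter grid, standing in for param_grid_DT.
grid_search = GridSearchCV(clf, param_grid={"max_depth": [2, 4, 6]}, cv=3)
grid_search.fit(X, y)

# The estimator that was passed in is still unfitted, so predict() raises.
try:
    clf.predict(X)
    original_is_fitted = True
except NotFittedError:
    original_is_fitted = False

# The refitted clone is a different object, and it can predict.
is_same_object = grid_search.best_estimator_ is clf
clone_predictions = grid_search.best_estimator_.predict(X)

print("original fitted:", original_is_fitted)
print("best_estimator_ is clf:", is_same_object)
```

Both printed values should be False, which is why pickling clf stores an unfitted model.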

What you need is

best_clf = grid_search

which saves the fitted grid_search model.
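As a rough sketch of that round trip, the fitted search object can be pickled, reloaded, and used for prediction. The synthetic data, the grid values, and the temporary file location below are assumptions for illustration. With the default refit=True, predict on the search object delegates to the refitted best estimator.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data, an assumption for this sketch.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=4
)

# Hypothetical grid, standing in for param_grid_RF.
grid_search = GridSearchCV(
    RandomForestClassifier(n_estimators=20, random_state=0),
    param_grid={"max_depth": [2, 4]},
    cv=3,
)
grid_search.fit(X_train, y_train)

best_clf = grid_search  # keep the fitted search object, not the bare clf

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    pkl_filename = os.path.join(tmp_dir, "pickle_model.pkl")
    with open(pkl_filename, "wb") as file:
        pickle.dump(best_clf, file)
    with open(pkl_filename, "rb") as file:
        loaded_model = pickle.load(file)

# The reloaded object predicts without raising NotFittedError.
predictions_match = np.array_equal(
    loaded_model.predict(X_test), grid_search.predict(X_test)
)
print("predictions match:", predictions_match)
```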

If you don't want to save the entire grid_search object, you can use its best_estimator_ attribute to get the fitted clone itself.

best_clf = grid_search.best_estimator_
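Applied to a selection loop like the one in the question, this might look as follows. It is a sketch under stated assumptions: the candidates dictionary, the grids, and the synthetic data are hypothetical, and only the two grid-searched models are included.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, an assumption for this sketch.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=4
)

# Hypothetical estimators and grids, standing in for param_grid_DT / param_grid_RF.
candidates = {
    "Decision Tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [2, 4, 6]},
    ),
    "Random Forest": (
        RandomForestClassifier(n_estimators=20, random_state=0),
        {"max_depth": [2, 4]},
    ),
}

best_score, best_name, best_clf = -1.0, None, None
for name, (clf, param_grid) in candidates.items():
    grid_search = GridSearchCV(clf, param_grid=param_grid, cv=3)
    grid_search.fit(X_train, y_train)
    if grid_search.best_score_ > best_score:
        best_score = grid_search.best_score_
        best_name = name
        # Keep the fitted clone; `clf` itself was never fitted.
        best_clf = grid_search.best_estimator_

# best_clf is a plain fitted classifier and can be pickled or used directly.
test_predictions = best_clf.predict(X_test)
print("best classifier:", best_name, "CV accuracy =", best_score)
```

Storing best_estimator_ keeps the pickle smaller than storing the whole search object, at the cost of losing cv_results_ and the other search metadata.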
