GridSearchCV error "Too many indices in the array": answers

【Question title】: GridSearchCV error "Too many indices in the array"
【Posted】: 2017-08-13 05:19:28
【Question description】:

I am training on my data with a supervised learning algorithm, a random forest classifier.

    clf = RandomForestClassifier(n_estimators=50, n_jobs=3, random_state=42)

The parameters varied in the grid are:

    param_grid = {
        'n_estimators': [200, 700],
        'max_features': ['auto', 'sqrt', 'log2'],
        'max_depth': [5, 10],
        'min_samples_split': [5, 10]
    }

The classifier "clf" and the parameter grid "param_grid" are passed to GridSearchCV:

    clf_rfc = GridSearchCV(estimator=clf, param_grid=param_grid)

When I fit it on the features and labels

    clf_rfc.fit(X_train, y_train)

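I get the error "Too many indices in the array". X_train has shape (204, 3) and y_train has shape (204, 1).

I tried clf_rfc.fit(X_train.values, y_train.values) instead, but the error did not go away.

Any suggestions would be much appreciated!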

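【Comments on the question】:

Please post the full stack trace of the error.
Also try reshaping your y_train with y_train.reshape(204) so that it becomes a one-dimensional array.

Tags: scikit-learn random-forest grid-search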

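【Solution 1】:

As the earlier comment noted, the problem appears to be the shape of y_train, which is (204, 1). I believe that is the cause: it should be (204,) rather than (204, 1).

So if you reshape y_train, everything should work: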

    # y_train has shape (n_rows, 1); reshape it to (n_rows,)
    n_rows, n_cols = y_train.shape
    y_train = y_train.reshape(n_rows,)

If that raises an error such as AttributeError: 'DataFrame' object has no attribute 'reshape', then y_train is a pandas DataFrame rather than a NumPy array, so try:

    n_rows, n_cols = y_train.shape
    y_train = y_train.values.reshape(n_rows,)
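
The two variants above can be folded into one small helper that accepts either a NumPy array or a single-column DataFrame. This is only a sketch: the helper name `to_1d` and the sample data are made up for illustration and do not come from the original answer.

```python
import numpy as np
import pandas as pd


def to_1d(y):
    """Return y as a 1-D NumPy array of shape (n_samples,).

    Hypothetical helper: accepts a 1-D array, an (n, 1) array,
    or a single-column DataFrame.
    """
    arr = np.asarray(y)  # a DataFrame becomes a 2-D ndarray
    if arr.ndim == 1:
        return arr
    if arr.ndim == 2 and arr.shape[1] == 1:
        return arr.reshape(-1)  # -1 lets NumPy infer the length
    raise ValueError(f"expected a single column, got shape {arr.shape}")


# Illustrative labels with the same shape as in the question: (204, 1)
y_frame = pd.DataFrame({"label": np.arange(204) % 2})
y_array = y_frame.values

print(to_1d(y_frame).shape)  # (204,)
print(to_1d(y_array).shape)  # (204,)
```

Using `reshape(-1)` avoids unpacking the shape by hand, and the explicit check guards against silently flattening a label matrix that really has several columns.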

【Discussion】:

【Solution 2】:

y_train should be a one-dimensional array.

I tried clf_rfc.fit(X_train, y_train.flatten()), and it worked!
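
Note that `flatten()` is a NumPy array method. A pandas DataFrame does not have it, so this call presumably assumes y_train is already an ndarray. As a rough sketch on a toy column vector (the array below is illustrative, not the asker's data), `flatten()`, `ravel()`, and `reshape(-1)` all yield the same 1-D shape and differ mainly in whether they copy the data:

```python
import numpy as np

# Toy column vector standing in for y_train; shape (6, 1)
y_2d = np.arange(6).reshape(6, 1)

flat = y_2d.flatten()    # always returns a copy
rav = y_2d.ravel()       # returns a view when the memory layout allows it
resh = y_2d.reshape(-1)  # likewise a view when possible

print(flat.shape, rav.shape, resh.shape)  # (6,) (6,) (6,)
print(np.shares_memory(y_2d, flat))       # False
print(np.shares_memory(y_2d, rav))        # True
```

For a label vector of a few hundred entries the copy is irrelevant, so any of the three is fine for passing to `fit`.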

【Discussion】:

【Solution 3】:

The "y_train" DataFrame has the wrong shape. Try this:

    # selects the column labelled 0, so this assumes that label exists
    clf_rfc.fit(X_train, y_train[0].values)

or

    clf_rfc.fit(X_train, y_train.values.ravel())
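
To see the fix in context, here is a minimal end-to-end sketch. The data is synthetic, with the same shapes as in the question, and the grid is deliberately smaller than the asker's so that it runs quickly. `cv=3` is an arbitrary choice for the example. Recent scikit-learn versions may only emit a warning for an (n, 1) label array instead of raising, so the sketch shows the corrected call rather than trying to reproduce the error.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-ins for the asker's data (assumption, not the real data)
rng = np.random.RandomState(42)
X_train = pd.DataFrame(rng.rand(204, 3))                 # shape (204, 3)
y_train = pd.DataFrame((X_train[0] > 0.5).astype(int))   # shape (204, 1)

clf = RandomForestClassifier(random_state=42)
param_grid = {            # reduced grid, for speed only
    "n_estimators": [10, 20],
    "max_depth": [5, 10],
}
clf_rfc = GridSearchCV(estimator=clf, param_grid=param_grid, cv=3)

# Pass the labels as a 1-D array of shape (204,)
y_1d = y_train.values.ravel()
clf_rfc.fit(X_train, y_1d)

print(y_1d.shape)            # (204,)
print(clf_rfc.best_params_)
```

In this sketch the single label column happens to be labelled 0, so `y_train[0].values` would give the same array. With a differently named column, only the `.values.ravel()` form works unchanged.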

【Discussion】: