tuple index out of range when trying to fit an XGBoost model: answers

【Question title】: tuple index out of range when trying to fit model XgBoost
【Posted】: 2020-01-24 20:02:20
【Question】:

I am trying to train an xgboost model on word vectors. When I run this:

model = xgb.XGBClassifier()
model.fit(X_train["comment_preproc"], y_train["label"])
y_predict = model.predict(X_test["comment_preproc"])

I get this error:

IndexError                                Traceback (most recent call last)
<ipython-input-26-870161aebeee> in <module>()
      1 model = xgb.XGBClassifier()
----> 2 model.fit(X_train["comment_preproc"], y_train["label"])
      3 y_predict = model.predict(X_test["comment_preproc"])

/usr/local/lib/python3.6/dist-packages/xgboost/sklearn.py in fit(self, X, y, sample_weight, eval_set, eval_metric, early_stopping_rounds, verbose, xgb_model, sample_weight_eval_set, callbacks)
    717             evals = ()
    718 
--> 719         self._features_count = X.shape[1]
    720 
    721         if sample_weight is not None:

IndexError: tuple index out of range

I thought X_train and y_train might have different shapes, but they do not.

What am I doing wrong?

Tags: python pandas xgboost

【Answer 1】:

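Both shapes, (758079,) and (758079,), are tuples with only one element, while fit() reads X.shape[1], the second element (line 719 in the traceback).

That is why you get the error: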

>>> t = (758079,)
>>> t[0]
758079
>>> t[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: tuple index out of range
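
To tie this back to pandas, here is a minimal sketch of the same failure with a Series. The three-row X_train is a made-up stand-in (the real data is not shown in the question), and the snippet only mimics the X.shape[1] lookup from the traceback rather than calling xgboost:

```python
import pandas as pd

# Hypothetical stand-in for X_train; the values are invented for illustration.
X_train = pd.DataFrame({"comment_preproc": [0.1, 0.2, 0.3]})

column = X_train["comment_preproc"]  # single brackets return a 1-D Series
shape = column.shape                 # a one-element tuple

error_message = None
try:
    n_features = shape[1]            # the same lookup fit() does in the traceback
except IndexError as exc:
    error_message = str(exc)

print(shape)
print(error_message)
```

Selecting one column with single brackets gives a Series, and a Series has no second dimension for fit() to read.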

【Answer 2】:

Just add .to_frame() to your X_train["comment_preproc"] Series:

model.fit(X_train["comment_preproc"].to_frame(), y_train["label"])

or

model.fit(X_train[["comment_preproc"]], y_train["label"])

Either form passes a 2-D DataFrame instead of a 1-D Series, so X.shape[1] exists and it should work.
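
A quick way to check the two variants is to compare their shapes before calling fit. This is only a sketch on an invented three-row frame (the "other" column and all values are assumptions, not from the question), and it stops short of training a model:

```python
import pandas as pd

# Hypothetical stand-in for X_train; column values are invented for illustration.
X_train = pd.DataFrame({"comment_preproc": [0.1, 0.2, 0.3], "other": [1, 2, 3]})

as_frame = X_train["comment_preproc"].to_frame()  # Series -> one-column DataFrame
by_list = X_train[["comment_preproc"]]            # list of labels -> DataFrame

frame_shape = as_frame.shape
list_shape = by_list.shape
same = as_frame.equals(by_list)                   # both spellings give the same frame

print(frame_shape, list_shape, same)
```

Both results are two-dimensional with a single column, so shape[1] is 1 in either case; which spelling to use is a matter of taste.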
