【Question title】: SKLearn Cross Validation Error -- Type Error
【Posted】: 2023-03-15 07:37:01
【Question】:

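I am trying to cross-validate the results of my KNN classifier. I used the following code, which raises a TypeError.

For context, I have already imported the scikit-learn, NumPy, and pandas libraries.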

from sklearn.cross_validation import cross_val_score, ShuffleSplit

n_samples = len(y)
knn = KNeighborsClassifier(3)
cv = ShuffleSplit(n_samples, n_iter=10, test_size=0.3, random_state=0)

test_scores = cross_val_score(knn, X, y, cv=cv)
test_scores.mean()

This returns:

    ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-139-d8cc3ee0c29b> in <module>()
  7 cv = ShuffleSplit(n_samples, n_iter=10, test_size=0.3, random_state=0)
  8 
  9 test_scores = cross_val_score(knn, X, y, cv=cv)
 10 test_scores.mean()

//anaconda/lib/python2.7/site-packages/sklearn/cross_validation.pyc in     cross_val_score(estimator, X, y, scoring, cv, n_jobs, verbose, fit_params, score_func, pre_dispatch)
1150         delayed(_cross_val_score)(clone(estimator), X, y, scorer, train, test,
1151                                   verbose, fit_params)
1152         for train, test in cv)
1153     return np.array(scores)
1154 

//anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self, iterable)
515         try:
516             for function, args, kwargs in iterable:
517                 self.dispatch(function, args, kwargs)
518 
519             self.retrieve()
//anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in dispatch(self, func, args, kwargs)
310         """
311         if self._pool is None:
312             job = ImmediateApply(func, args, kwargs)
313             index = len(self._jobs)
314             if not _verbosity_filter(index, self.verbose):
//anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __init__(self, func, args, kwargs)
134         # Don't delay the application, to avoid keeping the input
135         # arguments in memory
136         self.results = func(*args, **kwargs)
137 
138     def get(self):

//anaconda/lib/python2.7/site-packages/sklearn/cross_validation.pyc in _cross_val_score(estimator, X, y, scorer, train, test, verbose, fit_params)
1056         y_test = None
1057     else:
1058         y_train = y[train]
1059         y_test = y[test]
1060     estimator.fit(X_train, y_train, **fit_params)

TypeError: only integer arrays with one element can be converted to an index

【Comments】:

  • Please clarify whether or not your y variable comes from a pandas.DataFrame.

Tags: python numpy pandas scikit-learn cross-validation


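【Solution 1】:

This is a pandas-related error. Scikit-learn expects NumPy arrays, sparse matrices, or objects that behave like them.

The main problem with pandas DataFrames is that indexing with [...] selects columns rather than rows. Row indexing in pandas is done through DataFrame.loc[...] (by label) or DataFrame.iloc[...] (by position). sklearn does not expect this behavior. The error probably comes from line 1058 of the traceback (y_train = y[train]), where the code fails to extract the training samples.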
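The indexing difference described above can be seen in a small sketch. The DataFrame `df` and the index array `train` are hypothetical toy data, not the asker's dataset; `train` stands in for the positional indices a cross-validation splitter yields.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the asker's dataset.
df = pd.DataFrame(
    {"a": [10, 20, 30, 40], "b": [1, 2, 3, 4]},
    index=["w", "x", "y", "z"],
)
train = np.array([0, 2])  # positional row indices, as a CV splitter would yield

by_column = df[["a"]]          # [...] with a list of names selects columns
by_position = df.iloc[train]   # .iloc selects rows by integer position
by_label = df.loc[["w", "y"]]  # .loc selects rows by index label
as_array = df.values[train]    # a NumPy array indexes rows positionally

print(by_column.shape)
print(by_position)
print(as_array)
```

Code that expects NumPy semantics and writes `obj[train]` therefore does not get rows 0 and 2 when `obj` is a DataFrame.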

To fix this, if your y is a DataFrame column, try converting it to a NumPy array:

y = y.values

Otherwise, pandas-sklearn may be an option.
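As a minimal sketch of the conversion in context, the example below runs the same cross-validation on NumPy arrays. It rests on two assumptions that are not in the original post. First, it uses sklearn.model_selection, because the sklearn.cross_validation module from the question was removed in later scikit-learn releases; there, ShuffleSplit takes n_splits instead of n_samples and n_iter. Second, the DataFrame `frame` and its columns are synthetic stand-ins for the asker's X and y.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import ShuffleSplit, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical synthetic data standing in for the asker's X and y.
rng = np.random.RandomState(0)
frame = pd.DataFrame(rng.rand(60, 4), columns=["f1", "f2", "f3", "f4"])
frame["label"] = (frame["f1"] > 0.5).astype(int)

# Convert to NumPy arrays so that indexing with integer arrays selects rows.
X = frame[["f1", "f2", "f3", "f4"]].values
y = frame["label"].values

knn = KNeighborsClassifier(n_neighbors=3)
# ShuffleSplit in sklearn.model_selection no longer needs the sample count.
cv = ShuffleSplit(n_splits=10, test_size=0.3, random_state=0)

test_scores = cross_val_score(knn, X, y, cv=cv)
print(test_scores.mean())
```

Recent scikit-learn versions generally accept pandas objects directly, so the explicit `.values` conversion matters mostly for old releases like the one in the traceback.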
