【Question title】: How do I revise this code to use a SMOTE oversampling and cross-validation pipeline for a multiclass classification problem?
【Posted】: 2022-01-04 01:15:53
【Question】:
imba_pipeline = make_pipeline(SMOTE(random_state=42), 
                              DecisionTreeClassifier())
cross_val_score(imba_pipeline, X_train1, y_train1, scoring='recall', cv=kf)

When I run it, it gives the following error:

ValueError: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].

    Out[102]:
    array([nan, nan, nan, nan, nan])

【Comments】:

Tags: python classification pipeline


【Solution 1】:

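If you use scoring='recall', cross_val_score calls sklearn.metrics.recall_score with its default average="binary", which only works for two-class targets. One way around this is to create a custom scorer that sets the averaging explicitly: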

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score
from sklearn.metrics import recall_score
from sklearn.metrics import make_scorer

X,y = load_iris(return_X_y=True)

imba_pipeline = make_pipeline(SMOTE(random_state=42), 
                              DecisionTreeClassifier())

multi_recall = make_scorer(recall_score, average="micro")

cross_val_score(imba_pipeline, X, y, scoring=multi_recall, cv=5)

array([0.96666667, 0.96666667, 0.9       , 0.93333333, 1.        ])
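As a possible alternative to make_scorer, scikit-learn also accepts predefined scoring strings such as "recall_macro", "recall_micro" and "recall_weighted". The sketch below is only an illustration: it leaves SMOTE out so that it depends on scikit-learn alone, and the names clf, custom_macro and the scores_* variables are made up for this example. The scoring argument would be passed the same way when the estimator is imba_pipeline. It also checks that micro-averaged recall matches accuracy for single-label multiclass targets, which is why macro averaging may be the more informative choice when classes are imbalanced.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Fixed random_state so repeated runs build identical trees (assumption made
# for this sketch; the original answer does not set it).
clf = DecisionTreeClassifier(random_state=0)

# A custom scorer and the predefined string should give the same fold scores.
custom_macro = make_scorer(recall_score, average="macro")
scores_custom = cross_val_score(clf, X, y, scoring=custom_macro, cv=5)
scores_builtin = cross_val_score(clf, X, y, scoring="recall_macro", cv=5)
print(np.allclose(scores_custom, scores_builtin))

# For single-label multiclass targets, micro-averaged recall equals accuracy.
scores_micro = cross_val_score(clf, X, y, scoring="recall_micro", cv=5)
scores_accuracy = cross_val_score(clf, X, y, scoring="accuracy", cv=5)
print(np.allclose(scores_micro, scores_accuracy))
```

Macro averaging gives every class the same weight regardless of how many samples it has, so a poorly recalled minority class pulls the score down visibly, whereas micro averaging is dominated by the larger classes.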

【Discussion】:

  • Thank you very much.