【Question Title】: Sklearn: NotFittedError: This SVC instance is not fitted yet. Soft Voting on Calibration classifiers
【Posted】: 2019-03-10 17:02:39
【Question】:

I am trying to use soft voting on calibrated classifiers in sklearn. Since soft voting currently has no prefit option, I tried to have VotingClassifier.fit() call CalibratedClassifierCV.fit(). Here is my code:

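from sklearn import svm
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

data = load_breast_cancer()

# Data splitting.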
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25)

# Base classifiers.
clf_svm = svm.SVC(gamma=0.001, probability=True)
clf_svm.fit(X_train, y_train)

clf_lr = LogisticRegression(random_state=0, solver='lbfgs')
clf_lr.fit(X_train, y_train)

svm_isotonic = CalibratedClassifierCV(clf_svm, cv='prefit', method='isotonic')
svm_isotonic.fit(X_val, y_val)

lr_isotonic = CalibratedClassifierCV(clf_lr, cv='prefit', method='isotonic')
lr_isotonic.fit(X_val, y_val)

eclf_soft2 = VotingClassifier(estimators=[
    ('svm', svm_isotonic), ('lr', lr_isotonic)], voting ='soft')
eclf_soft2.fit(X_val, y_val)

However, I get a strange error:

Traceback (most recent call last):
  File "/home/ubuntu/projects/faceRecognition/faceVerif/util/plot_calibration.py", line 127, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "/home/ubuntu/projects/faceRecognition/faceVerif/util/plot_calibration.py", line 120, in main
    eclf_soft2.fit(X_val, y_val)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/ensemble/voting_classifier.py", line 189, in fit
    for clf in clfs if clf is not None)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 779, in __call__
    while self.dispatch_one_batch(iterator):
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 625, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 588, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 111, in apply_async
    result = ImmediateResult(func)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 332, in __init__
    self.results = batch()
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 131, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 131, in <listcomp>
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/ensemble/voting_classifier.py", line 31, in _parallel_fit_estimator
    estimator.fit(X, y)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/calibration.py", line 157, in fit
    calibrated_classifier.fit(X, y)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/calibration.py", line 335, in fit
    df, idx_pos_class = self._preproc(X)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/calibration.py", line 290, in _preproc
    df = self.base_estimator.decision_function(X)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/svm/base.py", line 527, in decision_function
    dec = self._decision_function(X)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/svm/base.py", line 384, in _decision_function
    X = self._validate_for_predict(X)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/svm/base.py", line 437, in _validate_for_predict
    check_is_fitted(self, 'support_')
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py", line 768, in check_is_fitted
    raise NotFittedError(msg % {'name': type(estimator).__name__})
sklearn.exceptions.NotFittedError: This SVC instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

My question is how to fix this error, or whether there is an alternative solution.

Thank you in advance.

【Comments】:

    Tags: python machine-learning scikit-learn ensemble-learning


    【Solution 1】:

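    VotingClassifier clones the estimators it is given (and, in this case, their inner estimators as well) and then tries to fit them. But in CalibratedClassifierCV you use cv='prefit', which assumes the base estimator has already been fitted. The two conflict, which produces the error.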

    Explanation:

    VotingClassifier has two inner estimators:

    • ('svm', svm_isotonic),
    • ('lr', lr_isotonic)

    When you call eclf_soft2.fit, it first clones svm_isotonic and lr_isotonic. Cloning these CalibratedClassifierCV estimators in turn clones their base estimators, clf_svm and clf_lr.

    This cloning copies only the parameter values, not the attributes learned by earlier calls to fit(). So the cloned clf_svm and clf_lr are now unfitted.
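
The cloning behaviour described above can be checked directly. The following is a minimal sketch, assuming a tiny hand-made dataset (the values of `X` and `y` are illustrative and not from the question); it only uses `sklearn.base.clone`, which is the public cloning helper, to show that hyperparameters are kept while fitted state is dropped.

```python
from sklearn.base import clone
from sklearn.exceptions import NotFittedError
from sklearn.svm import SVC

# Tiny illustrative dataset (an assumption, not the data from the question).
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

fitted = SVC(gamma=0.001).fit(X, y)
cloned = clone(fitted)  # copies constructor parameters only

# The hyperparameters survive cloning...
same_params = fitted.get_params() == cloned.get_params()

# ...but learned attributes such as support_ do not.
fitted_has_state = hasattr(fitted, "support_")
cloned_has_state = hasattr(cloned, "support_")

# Using the clone before fitting raises the error seen in the traceback.
try:
    cloned.decision_function(X)
    raised = False
except NotFittedError:
    raised = True

print(same_params, fitted_has_state, cloned_has_state, raised)
```

This is the same situation the traceback reports: the calibrator with cv='prefit' calls decision_function on a base estimator that has lost its fitted state.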

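    Unfortunately, there is no easy way to set this up for your use case: fitting the voting classifier so that it fits the inner calibrated classifiers without refitting the base classifiers.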

    However, if all you want is the soft-voting behaviour of VotingClassifier over the combination of the two CalibratedClassifierCV estimators, that is easy to do.

    Borrowing the idea from my other answer to a similar question:

    You can do this:

    import numpy as np
    
    # Define functions
    def custom_fit(estimators, X, y):
        for clf in estimators:
            clf.fit(X, y)
    
    def custom_predict(estimators, X, voting = 'soft', weights = None):
    
        if voting == 'hard':
            pred = np.asarray([clf.predict(X) for clf in estimators]).T
            pred = np.apply_along_axis(lambda x:
                                       np.argmax(np.bincount(x, weights=weights)),
                                       axis=1,
                                       arr=pred.astype('int'))
        else:
            pred = np.asarray([clf.predict_proba(X) for clf in estimators])
            pred = np.average(pred, axis=0, weights=weights)
            pred = np.argmax(pred, axis=1)
    
        return pred
    
    
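    # Use them. svm_isotonic and lr_isotonic wrap the already-fitted base
    # classifiers (cv='prefit'), so fit() here only fits the calibrators.
    estimators = [svm_isotonic, lr_isotonic]
    custom_fit(estimators, X_val, y_val)

    # Soft-voting predictions for the test set.
    y_pred = custom_predict(estimators, X_test)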
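
To make the soft-voting branch concrete, here is a small worked example of the same averaging step on made-up probabilities. The arrays `p_svm` and `p_lr` are hypothetical stand-ins for the `predict_proba` outputs of the two calibrated classifiers, and the weights are chosen only to show that weighting can change the vote.

```python
import numpy as np

# Hypothetical predict_proba outputs for 3 samples and 2 classes.
p_svm = np.array([[0.9, 0.1], [0.4, 0.6], [0.3, 0.7]])
p_lr = np.array([[0.6, 0.4], [0.7, 0.3], [0.2, 0.8]])

# Stack to shape (n_estimators, n_samples, n_classes).
stacked = np.asarray([p_svm, p_lr])

# Unweighted soft vote: average over estimators, then take the argmax.
avg = np.average(stacked, axis=0)
unweighted = np.argmax(avg, axis=1)

# Weighted soft vote: the first estimator counts three times as much.
avg_w = np.average(stacked, axis=0, weights=[3, 1])
weighted = np.argmax(avg_w, axis=1)

print(unweighted)  # [0 0 1]
print(weighted)    # [0 1 1]
```

The result holds column indices of the probability matrix. With labels 0 and 1, as in the breast cancer data, these coincide with the class labels; for other label sets they would need to be mapped through the classifier's classes_ attribute.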

    【Comments】:

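    • I am curious why sklearn does not offer a prefit option for the voting classifier, which would easily solve this kind of problem.
    • @Tengerye I was going to comment that it is this way because it would interfere with scikit-learn's pipeline architecture, but then I found the issue you opened.
    • Hi, glad to hear from you again. The scikit-learn developers seem to have decided to improve the code but got stuck somewhere. By the way, I fixed a bug in the code you provided; could you take a look? @Vivek Kumar
    • @Tengerye Thanks for correcting the code. I also modified custom_fit() slightly.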