【Question title】: Hypopt hyperparameter tuning error: 'sklearn.metrics' has no attribute 'scorer'
【Posted】: 2021-03-17 06:26:50
【Question】:

I am trying to use hypopt to run a GridSearch for a multi-class classification task.

param_grid = [{'C': [1, 10, 100],  'penalty' :['l2']}]
gs = GridSearch(model = LogisticRegression(multi_class='multinomial'), param_grid = param_grid)
gs.fit(X_train, y_train, X_val, y_val, scoring='f1_macro')

Without a scoring function, this runs as expected. However, when I specify one, e.g. 'f1_macro', I get the following error:

   0%|          | 0/3 [00:00<?, ?it/s]/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/_logistic.py:940: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)
/usr/local/lib/python3.6/dist-packages/hypopt/model_selection.py:174: UserWarning: ERROR in thread<NoDaemonProcess(NoDaemonPoolWorker-59, started)>with exception:
module 'sklearn.metrics' has no attribute 'scorer'
  warnings.warn('ERROR in thread' + pname + "with exception:\n" + str(e))


 33%|███▎      | 1/3 [00:13<00:26, 13.21s/it]/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/_logistic.py:940: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)
/usr/local/lib/python3.6/dist-packages/hypopt/model_selection.py:174: UserWarning: ERROR in thread<NoDaemonProcess(NoDaemonPoolWorker-60, started)>with exception:
module 'sklearn.metrics' has no attribute 'scorer'
  warnings.warn('ERROR in thread' + pname + "with exception:\n" + str(e))


 67%|██████▋   | 2/3 [00:13<00:09,  9.30s/it]/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/_logistic.py:940: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)
/usr/local/lib/python3.6/dist-packages/hypopt/model_selection.py:174: UserWarning: ERROR in thread<NoDaemonProcess(NoDaemonPoolWorker-59, started)>with exception:
module 'sklearn.metrics' has no attribute 'scorer'
  warnings.warn('ERROR in thread' + pname + "with exception:\n" + str(e))


100%|██████████| 3/3 [00:19<00:00,  6.59s/it]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-102-2a8cb30a1d8d> in <module>()
      7 # Grid-search all parameter combinations using a validation set.
      8 gs = GridSearch(model = LogisticRegression(multi_class='multinomial'), param_grid = param_grid)
----> 9 gs.fit(X_train, y_train, X_val, y_val, scoring='f1_macro')
     10 

/usr/local/lib/python3.6/dist-packages/hypopt/model_selection.py in fit(self, X_train, y_train, X_val, y_val, scoring, scoring_params, verbose)
    361             else:
    362                 results = [_run_thread_job(job) for job in params]
--> 363             models, scores = list(zip(*results))
    364             self.model = models[np.argmax(scores)]
    365         else:

ValueError: not enough values to unpack (expected 2, got 0)

The error can also be reproduced easily with:

import numpy as np

X_train = np.array([[1, 2, 3], [3, 4, 5], [1, 2, 3]])
X_val = X_train
y_train = [1,0,2]
y_val = y_train

I have no idea what is going on!?

I am using:

sklearn.__version__
>> 0.22.2.post1
hypopt.__version__
>> 1.0.9

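【Comments】:

  • I ran into a similar problem; from what I understand of the issues in the repo, there are two different problems. One is caused by the default parallelize=True in GridSearch (see here). The second seems to be related to the sklearn version; I was able to resolve it by reverting to sklearn.__version__ 0.21.
  • Anyway, I could not figure out where the problem lies (wrt newer sklearn versions). Moreover, when sticking to non-custom scoring functions, the fact that scorer has become a private API in sklearn.__version__ >= 0.22 is not really what causes the error (see another related issue here).
  • @amiola See below for a fix for the error.

Tags: python scikit-learn grid-search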


【Solution 1】:

There is a compatibility issue between the hypopt and sklearn versions, and the error message is quite clear about it.

I have:

import hypopt
import sklearn
hypopt.__version__, sklearn.__version__
('1.0.9', '0.23.2')

I do get the same error as you. The cause is the following source code:

elif type(scoring) in [metrics.scorer._PredictScorer, metrics.scorer._ProbaScorer] \
            or metrics.scorer._PredictScorer in type(scoring).__bases__ \
            or metrics.scorer._ProbaScorer in type(scoring).__bases__:
            score = scoring(model_clone, job_params["X_val"], job_params["y_val"])

Change metrics.scorer to metrics._scorer, since that is what sklearn v0.23.x expects, and you'll be fine.

Proof:
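If editing the installed hypopt source by hand is not appealing, the same rename can be handled with a lookup that tolerates both module layouts. The following is only a minimal sketch: `resolve_scorer_module` is a hypothetical helper (it is not part of hypopt or sklearn), and the two `SimpleNamespace` objects are stand-ins for `sklearn.metrics` under the old and new layouts, so the snippet runs without sklearn installed.

```python
from types import SimpleNamespace


def resolve_scorer_module(metrics_module):
    """Return the submodule that holds the scorer classes.

    Hypothetical helper: tries the private name `_scorer` first (the
    layout described in the answer), then the older public name `scorer`.
    """
    for name in ("_scorer", "scorer"):
        module = getattr(metrics_module, name, None)
        if module is not None:
            return module
    raise AttributeError("no scorer submodule found on the given module")


# Stand-ins for sklearn.metrics under the two layouts (assumed shapes).
new_layout = SimpleNamespace(_scorer="private scorer module")
old_layout = SimpleNamespace(scorer="public scorer module")

found_new = resolve_scorer_module(new_layout)
found_old = resolve_scorer_module(old_layout)
```

In the hypopt code quoted above, the idea would be to call such a helper once on `sklearn.metrics` and use the returned module in place of `metrics.scorer` inside the `elif` condition.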

import numpy as np
from sklearn.linear_model import LogisticRegression
from hypopt import GridSearch

X_train = np.array([[1, 2, 3], [3, 4, 5], [1, 2, 3]])
X_val = X_train
y_train = [1,0,2]
y_val = y_train
param_grid = [{'C': [1, 10, 100],  'penalty' :['l2']}]
model = LogisticRegression(multi_class='multinomial')
gs = GridSearch(model = model, param_grid = param_grid, num_threads=1)
gs.fit(X_train, y_train, X_val, y_val, scoring='f1_micro')
100%|██████████| 3/3 [00:00<00:00, 32.56it/s]
LogisticRegression(C=1, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=100,
                   multi_class='multinomial', n_jobs=None, penalty='l2',
                   random_state=0, solver='lbfgs', tol=0.0001, verbose=0,
                   warm_start=False)

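【Comments】:

  • Thanks @Sergey Bushmanov! I see your point, but I still have a doubt. Actually, I would say that with a non-custom scoring parameter (such as 'f1_micro') you would rather enter here, thereby circumventing the issue you showed. What am I missing?
  • @amiola At first glance I had the same feeling, but after investigating carefully I came up with the posted answer. The best proof that it is right: it works. Why? Because there is a try/except clause, and the try block fails. Why? Because we do not get past the elif I mentioned. Why do we fail there? Because metrics.scorer does not exist.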
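The chain of "why"s in the last comment can be reproduced in isolation. The sketch below is a simplified model and not hypopt's actual code: `metrics` is an empty stand-in object, `run_job` is a hypothetical function, and dropping failed jobs from `results` is an assumption made to mirror the traceback. It shows that evaluating the `elif` condition itself raises AttributeError before the string-scoring branch is reached, that the surrounding try/except swallows it, and that unpacking the empty result list then fails with the ValueError from the question.

```python
from types import SimpleNamespace

# Stand-in for sklearn.metrics on a version without the `scorer` attribute.
metrics = SimpleNamespace()

errors = []  # messages from swallowed exceptions


def run_job(scoring):
    """Hypothetical job: returns (model, score), or None if scoring fails."""
    try:
        if scoring is None:
            score = 1.0  # default scoring path, never touches metrics.scorer
        elif type(scoring) is metrics.scorer._PredictScorer:
            score = 2.0  # never reached: the condition above raises first
        else:
            score = 3.0  # string scorers such as 'f1_macro' would land here
        return ("model", score)
    except Exception as exc:
        errors.append(str(exc))
        return None


# Without a scoring function the job succeeds.
ok_result = run_job(None)

# With a string scorer every job fails, so nothing is collected.
jobs = (run_job("f1_macro") for _ in range(3))
results = [r for r in jobs if r is not None]

try:
    models, scores = list(zip(*results))
    message = None
except ValueError as exc:
    message = str(exc)
```

Here `message` ends up holding the same "not enough values to unpack" text as the traceback in the question, while `errors` holds the three swallowed attribute errors.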