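【Posted】: 2021-07-31 03:33:00
【Question】:
I am currently working on a machine learning model for disease prediction. The model uses a random forest classifier, and I am trying to get the probability of the predicted value, but the code raises an error. Specifically, I want the probability of each individual prediction. For example, I enter symptoms to predict a disease and the predicted disease is "Allergy". I then want my program to display the probability of the predicted disease "Allergy" as a percentage, but instead it raises an error such as "Classification metrics can't handle a mix of multiclass and unknown targets". I thought I needed a confusion matrix to show the probability, but that gives the same multiclass error. To be clear, I just want to display the probability of each predicted value as a percentage, for example "the probability of Allergy is 90%". How can I do this, and how do I fix the error?
Here is the relevant code: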
p=pickle_model.predict([[22,8,50,9,20,47,50,38,0,0,0]])
actual=np.array((22,8,50,9,20,47,50,38,0,0,0))
pred=pickle_model.predict_proba([[p,0,0,0,0,0,0,0,0,0,0]])
In the following code block:
from sklearn.metrics import confusion_matrix
import sklearn.metrics as mt
from sklearn.metrics import accuracy_score
from sklearn.metrics import precision_score
print(accuracy_score(actual, p, normalize=True, sample_weight=None))
I get this error:
ValueError Traceback (most recent call last)
<ipython-input-69-e8980bf68410> in <module>
3 from sklearn.metrics import accuracy_score
4 from sklearn.metrics import precision_score
----> 5 print(accuracy_score(actual, p, normalize=True, sample_weight=None))
6 #precision, recall, fscore, support =
7 #score(y_test, p)
~\anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0
~\anaconda3\lib\site-packages\sklearn\metrics\_classification.py in accuracy_score(y_true, y_pred, normalize, sample_weight)
200
201 # Compute accuracy for each possible representation
--> 202 y_type, y_true, y_pred = _check_targets(y_true, y_pred)
203 check_consistent_length(y_true, y_pred, sample_weight)
204 if y_type.startswith('multilabel'):
~\anaconda3\lib\site-packages\sklearn\metrics\_classification.py in _check_targets(y_true, y_pred)
81 y_pred : array or indicator matrix
82 """
---> 83 check_consistent_length(y_true, y_pred)
84 type_true = type_of_target(y_true)
85 type_pred = type_of_target(y_pred)
~\anaconda3\lib\site-packages\sklearn\utils\validation.py in check_consistent_length(*arrays)
260 uniques = np.unique(lengths)
261 if len(uniques) > 1:
--> 262 raise ValueError("Found input variables with inconsistent numbers of"
263 " samples: %r" % [int(l) for l in lengths])
264
ValueError: Found input variables with inconsistent numbers of samples: [11, 1]
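The message above says that `accuracy_score` received 11 values in `actual` (the feature row) and 1 value in `p` (the predicted label); the function expects true labels and predicted labels of equal length, not features. For a per-prediction percentage, no metric is needed: `predict_proba` called on the same feature row that was passed to `predict` returns one probability per class, ordered like `classes_`. A minimal sketch follows, assuming a synthetic dataset in place of the symptom data; `model` stands in for `pickle_model`, and the disease names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the symptom data (assumption): 11 features, 3 classes.
X, y_idx = make_classification(
    n_samples=300, n_features=11, n_informative=6, n_classes=3, random_state=0
)
disease_names = np.array(["Allergy", "Diabetes", "Migraine"])  # hypothetical labels
y = disease_names[y_idx]

# Stand-in for the unpickled model in the question.
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# One patient: a single row of 11 feature values, shape (1, 11).
sample = X[:1]

# Pass the SAME feature row to both calls, not the predicted label.
predicted = model.predict(sample)[0]
proba = model.predict_proba(sample)[0]  # one probability per class, ordered like model.classes_

# Look up the probability that belongs to the predicted class.
class_index = list(model.classes_).index(predicted)
confidence = proba[class_index]
print(f"{predicted}: {confidence:.0%}")
```

For a random forest this figure is the averaged vote of the trees, so it reads as the model's confidence rather than a calibrated clinical probability.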
Another error I get, from the code that computes specificity, is:
ValueError Traceback (most recent call last)
<ipython-input-65-774dbd6b46f7> in <module>
8
9 # specificity
---> 10 tn, fp, fn, tp = mt.confusion_matrix(actual, predict).ravel()
11 specificity = tn / (tn+fp)
12 print(specificity)
~\anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0
~\anaconda3\lib\site-packages\sklearn\metrics\_classification.py in confusion_matrix(y_true, y_pred, labels, sample_weight, normalize)
294
295 """
--> 296 y_type, y_true, y_pred = _check_targets(y_true, y_pred)
297 if y_type not in ("binary", "multiclass"):
298 raise ValueError("%s is not supported" % y_type)
~\anaconda3\lib\site-packages\sklearn\metrics\_classification.py in _check_targets(y_true, y_pred)
90
91 if len(y_type) > 1:
---> 92 raise ValueError("Classification metrics can't handle a mix of {0} "
93 "and {1} targets".format(type_true, type_pred))
94
ValueError: Classification metrics can't handle a mix of multiclass and unknown targets
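`confusion_matrix` compares a vector of true labels with a vector of predicted labels of the same length, so it is normally computed on a held-out test set rather than on a single feature row. The `predict` variable is not shown in the question, so the exact cause of the "unknown" target type cannot be confirmed here. Note also that unpacking `.ravel()` into `tn, fp, fn, tp` only works for a binary problem; with more than two classes the matrix is larger than 2x2. A rough sketch of a multiclass evaluation with one-vs-rest specificity per class, assuming synthetic data and hypothetical variable names:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the disease data (assumption): 11 features, 3 classes.
X, y = make_classification(
    n_samples=400, n_features=11, n_informative=6, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# True labels and predicted labels now have the same length.
y_pred = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, y_pred))

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred, labels=model.classes_)

# One-vs-rest counts for each class, derived from the full matrix.
tp = np.diag(cm)
fp = cm.sum(axis=0) - tp
fn = cm.sum(axis=1) - tp
tn = cm.sum() - (tp + fp + fn)

specificity = tn / (tn + fp)
for label, spec in zip(model.classes_, specificity):
    print(f"class {label}: specificity {spec:.2f}")
```

These metrics describe how the model performs across many samples; they do not give the probability of one particular prediction.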
【Discussion】:
Tags: python machine-learning scikit-learn