ValueError: y_true takes value in {'True', 'False'} and pos_label is not specified in roc_curve

【Question title】: ValueError: y_true takes value in {'True', 'False'} and pos_label is not specified in ROC_curve
【Posted】: 2021-09-24 01:29:37
【Question description】:

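from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import roc_auc_score, roc_curve

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.5, random_state=2)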

# generate a no skill prediction (majority class)
ns_probs = [0 for _ in range(len(y_test))]

# fit a model
model = KNeighborsClassifier(n_neighbors = 3)
model.fit(x_train, y_train)

# predict probabilities
lr_probs = model.predict_proba(x_test)

# keep probabilities for the positive outcome only
lr_probs = lr_probs[:, 1]

# calculate scores
ns_auc = roc_auc_score(y_test, ns_probs)
lr_auc = roc_auc_score(y_test, lr_probs)

# summarize scores
print('No Skill: ROC AUC=%.3f' % (ns_auc))
print('Logistic: ROC AUC=%.3f' % (lr_auc))

# calculate roc curves
ns_fpr, ns_tpr, _ = roc_curve(y_test, ns_probs)  # <-- error occurs here
lr_fpr, lr_tpr, _ = roc_curve(y_test, lr_probs)

...

I am trying to plot a ROC curve for a KNN classifier.

ValueError: y_true takes value in {'True', 'False'} and pos_label is not specified: 
either make y_true take value in {0, 1} or {-1, 1} or pass pos_label explicitly

However, as you can see above, an error occurs.

from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
encoder.fit(data.Malware)
data['TrueorFalse'] = encoder.transform(data['TrueorFalse'])
data.value_counts(data['TrueorFalse'].values, sort=False)
data.head()

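To fix this, I suspected that the 'True' and 'False' labels I had written were the problem, because they are strings. I therefore applied the code above to convert the True and False labels to 0 and 1, but the error still occurs. I use True and False as the labels in the TrueorFalse column. Am I missing something?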

【Question discussion】:

Tags: machine-learning scikit-learn computer-vision sklearn-pandas

【Solution 1】:

y_test = y_test.map({'True': 1, 'False': 0}).astype(int)

Adding this line solved my problem.
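For anyone hitting the same error, here is a minimal sketch of the two fixes the error message itself suggests: convert the labels to {0, 1}, or keep the string labels and pass pos_label explicitly. The small y_test and lr_probs values below are made-up stand-ins for the ones in the question, and the sketch assumes the labels really are the strings 'True' and 'False'. If they were Python booleans instead, the string-keyed map would produce NaN and astype(int) would fail.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_curve

# Hypothetical stand-ins for y_test and lr_probs from the question.
y_test = pd.Series(['True', 'False', 'True', 'False'])
lr_probs = np.array([0.9, 0.2, 0.6, 0.4])

# Option 1 (the accepted fix): map the string labels to integers 0/1.
y_test_int = y_test.map({'True': 1, 'False': 0}).astype(int)
fpr_a, tpr_a, _ = roc_curve(y_test_int, lr_probs)

# Option 2: keep the string labels and tell roc_curve which one is positive.
fpr_b, tpr_b, _ = roc_curve(y_test, lr_probs, pos_label='True')

# Both options describe the same curve.
print(np.allclose(fpr_a, fpr_b), np.allclose(tpr_a, tpr_b))
```

Note that the conversion has to be applied to the y_test that is actually passed to roc_curve. Encoding a DataFrame column after y has already been extracted or split would not change y_test, which may be why the LabelEncoder attempt in the question had no effect.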

【Discussion】: