【Question title】: How can I calculate the F1-measure and ROC in a multiclass classification problem in R?
【Posted】: 2023-04-01 03:33:01
【Question description】:

I have this code for a multiclass classification problem:

data$Class = as.factor(data$Class)
levels(data$Class) <- make.names(levels(factor(data$Class)))
trainIndex <- createDataPartition(data$Class, p = 0.6, list = FALSE, times=1)
trainingSet <- data[ trainIndex,]
testingSet  <- data[-trainIndex,]
train_x <- trainingSet[, -ncol(trainingSet)]
train_y <- trainingSet$Class

testing_x <- testingSet[, -ncol(testingSet)]
testing_y <- testingSet$Class

oneRM <- OneR(trainingSet, verbose = TRUE)
oneRM
summary(oneRM)
plot(oneRM)    

oneRM_pred <- predict(oneRM, testing_x)
oneRM_pred

eval_model(oneRM_pred, testing_y)


AUC_oneRM_pred <- auc(roc(oneRM_pred,testing_y))
cat ("AUC=", oneRM_pred)

# Recall-Precision curve    
oneRM_prediction <- prediction(oneRM_pred, testing_y)
RP.perf <- performance(oneRM_prediction, "tpr", "fpr")

plot (RP.perf)

plot(roc(oneRM_pred,testing_y))

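But the code does not work. After this line:

oneRM_prediction <- prediction(oneRM_pred, testing_y)

I get this error:

Error in prediction(oneRM_pred, testing_y) : Format of predictions is invalid.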

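Also, I do not know how to easily get the F1-measure.

One last question: does it make sense to calculate AUC in a multiclass classification problem?

【Question comments】:

    Tags: r classification roc multiclass-classification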


    【Solution 1】:

    If I use levels(oneRM_pred) in this way:

    ...
    oneRM <- OneR(trainingSet, verbose = TRUE)
    oneRM
    summary(oneRM)
    plot(oneRM)    
    
    oneRM_pred <- predict(oneRM, testing_x)
    levels(oneRM_pred) <- levels(testing_y)
    ...
    

    the accuracy is much lower than before. So I am not sure whether forcing the same levels is a good solution.
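One possible reason for the drop, offered as a guess rather than a diagnosis of this data: `levels(x) <- value` renames the existing levels by position. It does not match them by name. If the prediction's levels come in a different order, or include an extra level, the labels get silently shuffled. A minimal sketch with made-up vectors (`truth` and `pred` are hypothetical) that compares positional renaming with aligning by name through `factor(..., levels = )`:

```r
# Hypothetical predictions whose levels are ordered differently from the truth.
truth <- factor(c("a", "b", "c", "a"), levels = c("a", "b", "c"))
pred  <- factor(c("a", "b", "c", "a"), levels = c("c", "b", "a"))

# Positional renaming: the first level ("c") is renamed "a", the third ("a") is renamed "c".
relabelled <- pred
levels(relabelled) <- levels(truth)

# Matching by name: the labels stay the same, only the level set changes.
aligned <- factor(as.character(pred), levels = levels(truth))

acc_relabelled <- mean(as.character(relabelled) == as.character(truth))
acc_aligned    <- mean(as.character(aligned) == as.character(truth))
print(c(relabelled = acc_relabelled, aligned = acc_aligned))
```

Note that with the name-matching form, any predicted label that is not among the levels of `truth` becomes `NA`, so it is worth checking for missing values afterwards.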

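    【Comments】:

      【Solution 2】:

      Let's start with F1.

      Assuming you are using the iris dataset, we first need to load everything, train the model, and make the predictions as you did.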

      library(datasets)
      library(caret)
      library(OneR)
      library(pROC)
      
      trainIndex <- createDataPartition(iris$Species, p = 0.6, list = FALSE, times=1)
      trainingSet <- iris[ trainIndex,]
      testingSet  <- iris[-trainIndex,]
      train_x <- trainingSet[, -ncol(trainingSet)]
      train_y <- trainingSet$Species
      
      testing_x <- testingSet[, -ncol(testingSet)]
      testing_y <- testingSet$Species
      
      oneRM <- OneR(trainingSet, verbose = TRUE)
      oneRM_pred <- predict(oneRM, testing_x)
      

      Then you should calculate the precision, recall, and F1 for each class.

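      # caret's confusionMatrix(data, reference) puts predictions in rows
      # and reference (true) classes in columns
      cm <- as.matrix(confusionMatrix(oneRM_pred, testing_y))
      n = sum(cm) # number of instances
      nc = nrow(cm) # number of classes
      rowsums = apply(cm, 1, sum) # number of predictions per class
      colsums = apply(cm, 2, sum) # number of instances per class
      diag = diag(cm)  # number of correctly classified instances per class 
      
      precision = diag / rowsums # correct predictions / all predictions of the class
      recall = diag / colsums # correct predictions / all true instances of the class
      f1 = 2 * precision * recall / (precision + recall) 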
      
      print(" ************ Confusion Matrix ************")
      print(cm)
      print(" ************ Diag ************")
      print(diag)
      print(" ************ Precision/Recall/F1 ************")
      print(data.frame(precision, recall, f1)) 
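To check the per-class formulas without caret or a trained model, here is a small base-R sketch on made-up labels. `class_metrics` is a hypothetical helper, and it assumes a confusion table with predictions in rows and true classes in columns, which is the layout caret's `confusionMatrix` uses:

```r
# Per-class precision, recall and F1 from a confusion table whose rows are
# predictions and whose columns are true classes.
class_metrics <- function(cm) {
  tp        <- diag(cm)
  precision <- tp / rowSums(cm)  # of everything predicted as k, how much is k
  recall    <- tp / colSums(cm)  # of everything that is k, how much was found
  f1        <- 2 * precision * recall / (precision + recall)
  data.frame(precision, recall, f1, row.names = rownames(cm))
}

# Made-up labels, only for illustration.
actual    <- factor(c("a", "a", "a", "b", "b", "c", "c", "c"))
predicted <- factor(c("a", "a", "b", "b", "b", "c", "a", "c"))
cm <- table(Predicted = predicted, Actual = actual)
metrics <- class_metrics(cm)
print(round(metrics, 3))
```

If the table is built the other way round, for example with `table(actual, predicted)`, precision and recall swap places while F1 stays the same, so it is worth confirming the orientation before reading the numbers.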
      

      After that, you can compute the macro F1.

      macroPrecision = mean(precision)
      macroRecall = mean(recall)
      macroF1 = mean(f1)
      
      print(" ************ Macro Precision/Recall/F1 ************")
      print(data.frame(macroPrecision, macroRecall, macroF1)) 
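One caveat with averaging per-class scores: if the model never predicts some class, that class's precision is 0/0, so `mean(f1)` comes out as `NaN`. A one-rule model can plausibly leave a class unpredicted. The sketch below assumes the convention of counting undefined scores as 0. `macro_average` is a hypothetical helper and the confusion table is made up:

```r
# Macro average that counts undefined (0/0) per-class scores as 0.
# This zero-fill convention is an assumption; dropping such classes with
# mean(x, na.rm = TRUE) is another choice and gives a more optimistic number.
macro_average <- function(x) {
  x[is.nan(x)] <- 0
  mean(x)
}

# Made-up confusion table (rows = predicted, columns = actual).
# matrix() fills column by column, so each line below is one actual class.
# The last entry of every column is 0: class "c" is never predicted.
cm <- matrix(c(4, 1, 0,
               1, 3, 0,
               2, 1, 0),
             nrow = 3,
             dimnames = list(Predicted = c("a", "b", "c"),
                             Actual    = c("a", "b", "c")))

tp        <- diag(cm)
precision <- tp / rowSums(cm)
recall    <- tp / colSums(cm)
f1        <- 2 * precision * recall / (precision + recall)

plain_macro_f1 <- mean(f1)           # NaN, because F1 for "c" is undefined
safe_macro_f1  <- macro_average(f1)  # undefined F1 counted as 0
print(c(plain = plain_macro_f1, zero_filled = safe_macro_f1))
```

Whichever convention you pick, state it next to the reported score, since the two choices can give noticeably different numbers.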
      

      To find the ROC (the AUC, to be precise), it is best to use the pROC library.

      print(" ************ AUC ************")
      roc.multi <- multiclass.roc(testing_y, as.numeric(oneRM_pred))
      print(auc(roc.multi))
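On whether a multiclass AUC makes sense: it does, provided it is defined as an average over binary problems. According to the pROC documentation, `multiclass.roc` computes the multiclass AUC defined by Hand and Till, which averages pairwise AUCs. Keep in mind that `as.numeric(oneRM_pred)` turns the predicted labels into integer level codes, so the result depends on the level order. A different and simpler scheme is one-vs-rest averaging on class scores. The base-R sketch below illustrates that scheme, not what pROC computes. `auc_binary` and `auc_one_vs_rest` are hypothetical helpers, the scores are made up, and the sketch assumes you can obtain a matrix of per-class scores with one column per level (OneR's `predict` documents a `type = "prob"` option for this, but check your installed version):

```r
# AUC of one binary problem via the rank (Mann-Whitney) formula.
auc_binary <- function(score, is_positive) {
  n_pos <- sum(is_positive)
  n_neg <- sum(!is_positive)
  r <- rank(score)  # ties get average ranks
  (sum(r[is_positive]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# One-vs-rest: score each class against all the others, then average.
# `prob` is a matrix of class scores with one column per level of `truth`.
auc_one_vs_rest <- function(prob, truth) {
  per_class <- sapply(levels(truth), function(k) {
    auc_binary(prob[, k], truth == k)
  })
  list(per_class = per_class, macro = mean(per_class))
}

# Made-up scores for six observations and three classes.
truth <- factor(c("a", "a", "b", "b", "c", "c"))
prob <- matrix(c(0.7, 0.2, 0.1,
                 0.4, 0.5, 0.1,
                 0.2, 0.6, 0.2,
                 0.3, 0.3, 0.4,
                 0.1, 0.2, 0.7,
                 0.5, 0.1, 0.4),
               ncol = 3, byrow = TRUE,
               dimnames = list(NULL, levels(truth)))
result <- auc_one_vs_rest(prob, truth)
print(result)
```

The per-class values also show which classes the model separates well, which a single averaged number hides.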
      

      Hope this helps.

      See this link for details on F1 and this one for AUC.

      【Comments】:
