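【Posted】: 2018-03-30 16:21:35
【Question】:
I have data with a binary YES/NO response. I fit a random forest (RF) model with the code below, but I am having trouble getting the confusion matrix.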
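library(readxl)  # read_excel()
library(caret)   # createDataPartition(), train(), trainControl()

dataR <- read_excel("*:/*.xlsx")
# Stratified 70/30 split on the outcome
Train <- createDataPartition(dataR$Class, p = 0.7, list = FALSE)
training <- dataR[Train, ]
testing <- dataR[-Train, ]
# Random forest tuned over 3 mtry values with 5-fold cross-validation
model_rf <- train(Class ~ ., data = training, method = "rf", tuneLength = 3,
                  importance = TRUE,
                  trControl = trainControl(method = "cv", number = 5))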
Result:
Random Forest
3006 samples
82 predictor
2 classes: 'NO', 'YES'
No pre-processing
Resampling: Cross-Validated (5 fold)
Summary of sample sizes: 2405, 2406, 2405, 2404, 2404
Addtional sampling using SMOTE
Resampling results across tuning parameters:
  mtry  Accuracy   Kappa
   2    0.7870921  0.2750655
  44    0.7787721  0.2419762
  87    0.7767760  0.2524898
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mtry = 2.
So far so good, but when I run this code:
# Apply threshold of 0.50: p_class
class_log <- ifelse(model_rf[,1] > 0.50, "YES", "NO")
# Create confusion matrix
p <-confusionMatrix(class_log, testing[["Class"]])
##gives the accuracy
p$overall[1]
I get this error:
Error in model_rf[, 1] : incorrect number of dimensions
I would appreciate any help getting the confusion matrix results.
【Comments】:
- Print model_rf[, 1] to the console and take a look.
- It would be easier to help you if you included a minimal reproducible example in your question.
Tags: r random-forest r-caret confusion-matrix
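The error most likely arises because model_rf is the fitted "train" object (a list), not a matrix of predicted probabilities, so model_rf[, 1] has no columns to index. One possible way around it is to call predict() on the test set first and threshold the result. The following is a minimal sketch, not the asker's actual setup: the simulated dataR, the predictors x1 and x2, the fixed mtry = 2, and positive = "YES" are all assumptions made so the example runs on its own, and method = "rf" additionally requires the randomForest package to be installed.

```r
library(caret)  # train(), predict() for train objects, confusionMatrix()

set.seed(42)

# Hypothetical stand-in for the Excel data: two numeric predictors and a
# two-level factor outcome named Class with levels NO / YES.
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
dataR <- data.frame(
  x1 = x1,
  x2 = x2,
  Class = factor(ifelse(x1 + x2 + rnorm(n) > 0, "YES", "NO"),
                 levels = c("NO", "YES"))
)

Train    <- createDataPartition(dataR$Class, p = 0.7, list = FALSE)
training <- dataR[Train, ]
testing  <- dataR[-Train, ]

# mtry is fixed here only to keep the sketch small; the question used tuneLength = 3.
model_rf <- train(Class ~ ., data = training, method = "rf",
                  tuneGrid = data.frame(mtry = 2),
                  trControl = trainControl(method = "cv", number = 5))

# Class probabilities for the held-out rows: a data frame with one column per level.
probs <- predict(model_rf, newdata = testing, type = "prob")

# Apply the 0.50 threshold, then convert to a factor with the same levels as
# the reference so confusionMatrix() can compare the two.
class_log <- factor(ifelse(probs[, "YES"] > 0.50, "YES", "NO"),
                    levels = levels(testing$Class))

p <- confusionMatrix(data = class_log, reference = testing$Class,
                     positive = "YES")
p$table                 # the confusion matrix itself
p$overall["Accuracy"]   # overall accuracy on the test set
```

With data read through read_excel, Class may arrive as a character column, in which case wrapping the reference in factor() (for example factor(testing[["Class"]], levels = c("NO", "YES"))) is probably needed as well. If the default 0.50 cut-off is all that is required, predict(model_rf, newdata = testing) returns the predicted classes directly and the ifelse() step can be skipped.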