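【Posted】:2018-08-22 06:05:48
【Problem description】:
I am trying to maximise sensitivity for model selection in caret using rpart. To do this, I tried to replicate the method given here (scroll down to the example that uses the user-defined function FourStat): caret's github page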
# create own function so we can use "sensitivity" as our metric to maximise:
Sensitivity.fc <- function (data, lev = levels(data$obs), model = NULL) {
  out <- c(twoClassSummary(data, lev = levels(data$obs), model = NULL))
  c(out, Sensitivity = out["Sens"])
}
rpart_caret_fit <- train(outcome~pred1+pred2+pred3+pred4,
                         na.action = na.pass,
                         method = "rpart",
                         control=rpart.control(maxdepth = 6),
                         tuneLength = 20,
                         # maximise sensitivity
                         metric = "Sensitivity",
                         maximize = TRUE,
                         trControl = trainControl(classProbs = TRUE,
                                                  summaryFunction = Sensitivity.fc))
But when I print the summary
rpart_caret_fit
it shows that ROC is still being used as the criterion to select the final model:
CART
678282 samples
4 predictor
2 classes: 'yes', 'no'
No pre-processing
Resampling: Bootstrapped (25 reps)
Summary of sample sizes: 678282, 678282, 678282, 678282, 678282, 678282, ...
Resampling results across tuning parameters:
cp ROC Sens Spec Sensitivity.Sens
0.000001909738 0.7259486 0.4123547 0.8227382 0.4123547
0.000002864607 0.7259486 0.4123547 0.8227382 0.4123547
0.000005729214 0.7259489 0.4123622 0.8227353 0.4123622
0.000006684083 0.7258036 0.4123614 0.8227379 0.4123614
0.000007638953 0.7258031 0.4123576 0.8227398 0.4123576
0.000009548691 0.7258028 0.4123539 0.8227416 0.4123539
0.000010694534 0.7257553 0.4123589 0.8227332 0.4123589
0.000015277905 0.7257313 0.4123614 0.8227290 0.4123614
0.000032465548 0.7253456 0.4112838 0.8234272 0.4112838
0.000038194763 0.7252966 0.4112912 0.8234196 0.4112912
0.000076389525 0.7248774 0.4102792 0.8240339 0.4102792
0.000164237480 0.7244847 0.4093688 0.8246372 0.4093688
0.000194793290 0.7241532 0.4086596 0.8250930 0.4086596
0.000310650737 0.7237546 0.4087379 0.8250393 0.4087379
0.001625187154 0.7233805 0.4006570 0.8295729 0.4006570
0.001726403276 0.7233225 0.3983850 0.8308874 0.3983850
0.002173282000 0.7230906 0.3915758 0.8348320 0.3915758
0.002237258227 0.7230906 0.3915758 0.8348320 0.3915758
0.006140444689 0.7173854 0.4897494 0.7695558 0.4897494
0.055330843035 0.5730987 0.2710906 0.8545549 0.2710906
ROC was used to select the optimal model using the largest value.
The final value used for the model was cp = 0.000005729214.
How can I override the ROC selection method?
【Discussion】:
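One likely cause is visible in the output above: the extra column is called Sensitivity.Sens, not Sensitivity. If the name passed to metric does not match a column of the resampling results, caret appears to fall back to the first column, which here is ROC. The naming comes from how c() combines an argument name with the name of an element selected using single brackets. A minimal sketch in base R, assuming a stand-in vector `out` in place of the real twoClassSummary() result (values copied from the first row of the table), followed by a possible revision of the summary function:

```r
# Stand-in for the named vector returned by twoClassSummary() (assumption:
# the values are simply copied from the first row of the output above).
out <- c(ROC = 0.7259486, Sens = 0.4123547, Spec = 0.8227382)

# Single brackets keep the element name "Sens", so c() joins both names.
bad <- c(out, Sensitivity = out["Sens"])
print(names(bad))   # "ROC" "Sens" "Spec" "Sensitivity.Sens"

# Double brackets drop the element name, so only "Sensitivity" is used.
good <- c(out, Sensitivity = out[["Sens"]])
print(names(good))  # "ROC" "Sens" "Spec" "Sensitivity"

# Possible revision of the summary function from the question.
# It is only defined here, not called; calling it requires caret.
Sensitivity.fc <- function(data, lev = levels(data$obs), model = NULL) {
  out <- caret::twoClassSummary(data, lev = lev, model = model)
  c(out, Sensitivity = out[["Sens"]])
}
```

With that change, metric = "Sensitivity" should match a column name. A simpler alternative, assuming no other custom statistic is needed, would be to keep summaryFunction = twoClassSummary and pass metric = "Sens", since twoClassSummary already reports sensitivity under that name.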