【Question Title】: Calculating AUC of training dataset for glm function in R
【Posted】: 2019-06-09 17:11:08
【Question】:

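I am trying to find the AUC on the training data of my logistic regression model fitted with glm.

I split the data into training and test sets, fit a logistic regression model with glm, computed the predicted values, and tried to find the AUC.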

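library(pROC)  # provides roc()

d <- read.csv(file.choose(), header = TRUE)
set.seed(12345)
train <- runif(nrow(d)) < .5    # logical vector: TRUE marks a training row
table(train)
fit <- glm(y ~ ., binomial, d)  # note: fitted on all of d; train is not used here
phat <- predict(fit, type = "response")
d$phat <- phat
g <- roc(y ~ phat, data = d)
plot(g, print.auc = TRUE)       # print.auc is an argument of the plot method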

【Comments】:

Tags: r glm auc


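【Solution 1】:

Another user-friendly option is the caret library, which makes it quite easy to fit and compare regression/classification models in R. The example code below uses the GermanCredit dataset to predict creditworthiness with a logistic regression model. The code is adapted from this blog post: https://www.r-bloggers.com/evaluating-logistic-regression-models/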

library(caret)

## example from https://www.r-bloggers.com/evaluating-logistic-regression-models/
data(GermanCredit)

## 60% training / 40% test data
trainIndex <- createDataPartition(GermanCredit$Class, p = 0.6, list = FALSE)

GermanCreditTrain <- GermanCredit[trainIndex, ]
GermanCreditTest <- GermanCredit[-trainIndex, ]

## logistic regression based on 10-fold cross-validation 
trainControl <- trainControl(
     method = "cv",
     number = 10,
     classProbs = TRUE,
     summaryFunction = twoClassSummary
)

fit <- train(
    form = Class ~ Age + ForeignWorker + Property.RealEstate + Housing.Own + 
         CreditHistory.Critical,  
    data = GermanCreditTrain,
    trControl = trainControl,
    method = "glm", 
    family = "binomial", 
    metric = "ROC"
)

## Cross-validated AUC (the "ROC" column), estimated on the training data
print(fit)

## AUC ROC for test data
## See https://topepo.github.io/caret/measuring-performance.html#measures-for-class-probabilities
 predictTest <- data.frame(
         obs = GermanCreditTest$Class,                                    ## observed class labels
         predict(fit, newdata = GermanCreditTest, type = "prob"),         ## predicted class probabilities
         pred = predict(fit, newdata = GermanCreditTest, type = "raw")    ## predicted class labels
     ) 

twoClassSummary(data = predictTest, lev = levels(predictTest$obs))
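
The "ROC" value reported by twoClassSummary is the area under the ROC curve, which can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. As a rough illustration of that reading only, here is a base R sketch on made-up toy numbers; auc_pairwise is a hypothetical helper written for this example and is not part of caret.

```r
# Hypothetical helper (not part of caret): AUC from its pairwise definition.
auc_pairwise <- function(scores, is_positive) {
  pos <- scores[is_positive]
  neg <- scores[!is_positive]
  # Compare every positive score with every negative score.
  wins <- outer(pos, neg, ">")
  ties <- outer(pos, neg, "==")
  # A correctly ordered pair counts 1, a tie counts 0.5.
  mean(wins + 0.5 * ties)
}

# Made-up toy data: two negatives followed by two positives.
scores <- c(0.10, 0.40, 0.35, 0.80)
is_positive <- c(FALSE, FALSE, TRUE, TRUE)

toy_auc <- auc_pairwise(scores, is_positive)
print(toy_auc)  # 3 of the 4 positive/negative pairs are ordered correctly: 0.75
```

The pairwise comparison is quadratic in the number of rows, so it is meant for understanding the metric rather than for large datasets.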

【Discussion】:

    【Solution 2】:

    I like to use the performance command from the ROCR library.

    library(ROCR)
    # responsev = response variable
    # Assumes train and d.test are data frames (training and test sets)
    # and that fit was estimated on train.

    d.prediction <- prediction(predict(fit, type = "response"), train$responsev)
    d.performance <- performance(d.prediction, measure = "tpr", x.measure = "fpr")
    d.test.prediction <- prediction(predict(fit, newdata = d.test, type = "response"), d.test$responsev)
    d.test.performance <- performance(d.test.prediction, measure = "tpr", x.measure = "fpr")

    # What is the actual numeric performance of our model?
    # performance() returns an S4 object; the AUC itself is in the y.values slot.
    performance(d.prediction, measure = "auc")@y.values[[1]]
    performance(d.test.prediction, measure = "auc")@y.values[[1]]
    
    

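    【Discussion】:

    • Tried your code. Error in train$responsev: $ operator is invalid for atomic vectors
    • Yes, train is my dataset and I have a test set; my code is above! Maybe I have a problem.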
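
The error in the comments is consistent with train in the question being a logical vector (the result of runif(nrow(d)) < .5) rather than a data frame, so train$responsev cannot work. A minimal sketch of one way to handle that, assuming the response is a 0/1 column named y: use the logical vector to subset d when fitting and when predicting. The simulated data, the column names x1 and x2, and the helper auc_rank are all hypothetical stand-ins for this example; the AUC is computed with the rank-based (Mann-Whitney) formula in base R so that no extra package is needed.

```r
set.seed(12345)

# Simulated stand-in for the CSV in the question (hypothetical columns x1, x2, y).
n <- 400
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- rbinom(n, 1, plogis(0.5 + 1.2 * d$x1 - 0.8 * d$x2))

# Logical vector, as in the question: TRUE marks a training row.
train <- runif(nrow(d)) < 0.5

# Fit on the training rows only.
fit <- glm(y ~ ., family = binomial, data = d[train, ])

# Predicted probabilities for the training and test rows.
phat_train <- predict(fit, newdata = d[train, ], type = "response")
phat_test  <- predict(fit, newdata = d[!train, ], type = "response")

# Hypothetical helper: rank-based (Mann-Whitney) AUC for 0/1 labels.
auc_rank <- function(scores, labels) {
  r <- rank(scores)  # ties receive average ranks
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc_train <- auc_rank(phat_train, d$y[train])
auc_test  <- auc_rank(phat_test, d$y[!train])
print(c(train = auc_train, test = auc_test))
```

With ROCR, the same subsets could be passed as prediction(phat_train, d$y[train]) and prediction(phat_test, d$y[!train]).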