【Question Title】: Logistic Regression Tuning Parameter Grid in R Caret Package?
【Posted】: 2018-05-29 02:15:47
【Question Description】:

I am trying to fit a logistic regression model in R using the caret package. I did the following:

model <- train(dec_var ~., data=vars, method="glm", family="binomial",
                 trControl = ctrl, tuneGrid=expand.grid(C=c(0.001, 0.01, 0.1, 1,10,100, 1000)))

However, I am not sure what the tuning parameter for this model should be, and I am having trouble finding it. I assumed it was C, because C is the parameter used in sklearn. Currently, I am getting the following error -

Error: The tuning parameter grid should have columns parameter

Do you have any suggestions on how to fix this?

【Question Discussion】:

Tags: r logistic-regression r-caret hyperparameters


【Solution 1】:

Per Max Kuhn's web book (search for method = 'glm' here), there is no tuning parameter for glm in caret.

We can easily verify this by testing a couple of basic train calls. First, let's start with a method (rpart) that does have a tuning parameter (cp) according to the web book.

library(caret)
data(GermanCredit)

# Check tuning parameter via `modelLookup` (matches up with the web book)
modelLookup('rpart')
#  model parameter                label forReg forClass probModel
#1 rpart        cp Complexity Parameter   TRUE     TRUE      TRUE

# Observe that the `cp` parameter is tuned
set.seed(1)
model_rpart <- train(Class ~., data=GermanCredit, method='rpart')
model_rpart
#CART 

#1000 samples
#  61 predictor
#   2 classes: 'Bad', 'Good' 

#No pre-processing
#Resampling: Bootstrapped (25 reps) 
#Summary of sample sizes: 1000, 1000, 1000, 1000, 1000, 1000, ... 
#Resampling results across tuning parameters:

#  cp          Accuracy   Kappa    
#  0.01555556  0.7091276  0.2398993
#  0.03000000  0.7025574  0.1950021
#  0.04444444  0.6991700  0.1316720

#Accuracy was used to select the optimal model using  the largest value.
#The final value used for the model was cp = 0.01555556.

We can see that the cp parameter was tuned. Now let's try glm.

# Check tuning parameter via `modelLookup` (shows a parameter called 'parameter')
modelLookup('glm')
#  model parameter     label forReg forClass probModel
#1   glm parameter parameter   TRUE     TRUE      TRUE

# Try out the train function to see if 'parameter' gets tuned
set.seed(1)
model_glm <- train(Class ~., data=GermanCredit, method='glm')
model_glm
#Generalized Linear Model 

#1000 samples
#  61 predictor
#   2 classes: 'Bad', 'Good' 

#No pre-processing
#Resampling: Bootstrapped (25 reps) 
#Summary of sample sizes: 1000, 1000, 1000, 1000, 1000, 1000, ... 
#Resampling results:

#  Accuracy   Kappa    
#  0.7386384  0.3478527

In the glm case above, no parameter tuning was performed. In my experience, the parameter named parameter appears to be just a placeholder rather than a real tuning parameter. As the following code shows, even if we try to force it to tune parameter, it essentially only runs a single value.

set.seed(1)
model_glm2 <- train(Class ~., data=GermanCredit, method='glm',
                    tuneGrid=expand.grid(parameter=c(0.001, 0.01, 0.1, 1,10,100, 1000)))
model_glm2
#Generalized Linear Model 

#1000 samples
#  61 predictor
#   2 classes: 'Bad', 'Good' 

#No pre-processing
#Resampling: Bootstrapped (25 reps) 
#Summary of sample sizes: 1000, 1000, 1000, 1000, 1000, 1000, ... 
#Resampling results across tuning parameters:

#  Accuracy   Kappa      parameter
#  0.7386384  0.3478527  0.001    
#  0.7386384  0.3478527  0.001    
#  0.7386384  0.3478527  0.001    
#  0.7386384  0.3478527  0.001    
#  0.7386384  0.3478527  0.001    
#  0.7386384  0.3478527  0.001    
#  0.7386384  0.3478527  0.001    

#Accuracy was used to select the optimal model using  the largest value.
#The final value used for the model was parameter = 0.001.
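If the goal is a regularized logistic regression with a genuine tuning grid (the closest analogue to sklearn's C), one option beyond the original answer is caret's glmnet method, which exposes alpha (the elastic-net mixing parameter) and lambda (the penalty strength, roughly analogous to 1/C) as real tuning parameters. A minimal sketch, assuming the same GermanCredit data and that the glmnet package is installed:

```r
library(caret)
library(glmnet)
data(GermanCredit)

# glmnet has two real tuning parameters: alpha (0 = ridge, 1 = lasso)
# and lambda (regularization strength)
modelLookup('glmnet')

set.seed(1)
model_glmnet <- train(Class ~ ., data = GermanCredit, method = 'glmnet',
                      tuneGrid = expand.grid(alpha = c(0, 0.5, 1),
                                             lambda = c(0.001, 0.01, 0.1)))
model_glmnet$bestTune  # the (alpha, lambda) pair selected by resampling
```

Unlike the placeholder parameter of glm, this grid is actually evaluated, and the resampling results will differ across the nine (alpha, lambda) combinations.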

【Discussion】:
