【Posted】: 2016-03-07 19:40:54
【Question】:
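Hello, I am using randomForest in R. It does not accept a logical variable as the response (Y), but it seems to accept one as a predictor (X). This surprised me a bit, because I thought a logical is essentially a two-class factor.
My questions: is it true that randomForest accepts logicals as predictors but not as the response? Why is that? Do other common models (glmnet, svm, ...) accept logical variables?
Any explanation or discussion is appreciated. Thanks.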
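library(randomForest)
N = 100
data1 = data.frame(age = sample(1:80, N, replace = TRUE),
                   sex = sample(c('M', 'F'), N, replace = TRUE),
                   veteran = sample(c(TRUE, FALSE), N, replace = TRUE),
                   exercise = sample(c(TRUE, FALSE), N, replace = TRUE),
                   stringsAsFactors = TRUE)  # on R >= 4.0 this is needed for sex to be a factor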
sapply(data1, class)
#       age       sex   veteran  exercise
# "integer"  "factor" "logical" "logical"
# this doesn't work as classification because exercise is logical (it is fit as a regression)
rf = randomForest(exercise ~ ., data = data1, importance = T)
# Warning message:
# In randomForest.default(m, y, ...) :
# The response has five or fewer unique values. Are you sure you want to do regression?
# this works, and veteran and exercise (logical) work as predictors
rf = randomForest(sex ~ ., data = data1, importance = T)
importance(rf)
#                   F         M MeanDecreaseAccuracy MeanDecreaseGini
# age      -2.0214486 -7.584637            -6.242150         6.956147
# veteran   4.6509542  3.168551             4.605862         1.846428
# exercise -0.1205806 -6.226174            -3.924871         1.013030
# convert it to factor and it works
rf = randomForest(as.factor(exercise) ~ ., data = data1, importance = T)
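A base-R sketch of why the conversion might matter. It assumes, as far as I understand the package, that randomForest picks classification when the response is a factor and regression otherwise, and that predictors are turned into a numeric matrix internally. The vectors below are toy values, not the data from the question, and none of this calls randomForest itself.

```r
# Toy response vector (hypothetical values, for illustration only).
y_logical <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

# A logical vector is not a factor, so a factor-based check would
# treat it as a numeric (regression) response.
is.factor(y_logical)            # FALSE
y_numeric <- as.numeric(y_logical)
y_numeric                       # 1 0 1 1 0

# After conversion it is a two-level factor, i.e. a classification response.
y_factor <- as.factor(y_logical)
is.factor(y_factor)             # TRUE
levels(y_factor)                # "FALSE" "TRUE"

# As a predictor, a logical column becomes a 0/1 column when the data frame
# is turned into a numeric matrix, which is harmless for tree splits.
x <- data.frame(veteran = c(TRUE, FALSE, TRUE))
x_matrix <- data.matrix(x)
x_matrix[, "veteran"]           # 1 0 1
```

If that assumption holds, a 0/1 predictor can still only be split one way, which would explain why logicals are fine on the X side, while on the Y side the factor conversion is what switches the model from regression to classification.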
【Comments】:
Tags: r random-forest