【Posted】: 2018-12-08 10:41:37
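【Question】:
I am trying to build a time series model using random forest. However, every time I run the code I get the same error:
Error in `[.data.frame`(data, , all.vars(Terms), drop = FALSE) :
  undefined columns selected
I understand most of the theory behind random forests fairly well, but I have not run much code with them.
Here is my code: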
library(randomForest)
library(caret)
fitControl <- trainControl(
  method = "repeatedcv",
  number = 10,
  repeats = 1,
  classProbs = FALSE,
  verboseIter = TRUE,
  preProcOptions = list(thresh = 0.95, na.remove = TRUE, verbose = TRUE))
set.seed(1234)
rf_grid <- expand.grid(mtry = c(1:6))
fit <- train(df.ts[,1] ~ .,
  data = df.ts[,2:6],
  method = "rf",
  preProcess = c("center", "scale"),
  tuneGrid = rf_grid,
  trControl = fitControl,
  ntree = 200,
  metric = "RMSE")
For a reproducible example, you can run the code on the following dataset:
df.ts <- structure(list(ts.t = c(315246, 219908, 193014, 231970, 248246,
247112, 268218, 263637, 264306, 245730, 256548, 227525, 304468,
229614, 202985), ts1 = c(233913, 315246, 219908, 193014, 231970,
248246, 247112, 268218, 263637, 264306, 245730, 256548, 227525,
304468, 229614), ts2 = c(253534, 233913, 315246, 219908, 193014,
231970, 248246, 247112, 268218, 263637, 264306, 245730, 256548,
227525, 304468), ts3 = c(226650, 253534, 233913, 315246, 219908,
193014, 231970, 248246, 247112, 268218, 263637, 264306, 245730,
256548, 227525), ts6 = c(213268, 242558, 250554, 226650, 253534,
233913, 315246, 219908, 193014, 231970, 248246, 247112, 268218,
263637, 264306), ts12 = c(333842, 210279, 193051, 174262, 216712,
144327, 213268, 242558, 250554, 226650, 253534, 233913, 315246,
219908, 193014)), .Names = c("ts.t", "ts1", "ts2", "ts3", "ts6", "ts12"), row.names = 13:27, class = "data.frame")
I hope someone can spot my mistake.
Thanks,
【Discussion】:
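One likely cause, offered as a sketch rather than a confirmed diagnosis: the formula writes the response as the expression df.ts[,1], while data contains only columns 2 to 6. The variable names extracted from such a formula then include the data frame's own name, which is not a column of data, and that would match the "undefined columns selected" message. The example below uses a small made-up data frame called toy (hypothetical values, not the data from the question) to show the mismatch with base R only, and then a train() call that names the response column and passes the whole data frame. The caret part only runs if caret and randomForest are installed.

```r
# Toy stand-in for df.ts (hypothetical values, for illustration only).
toy <- data.frame(
  ts.t = c(10, 12, 11, 15, 14, 13, 16, 18, 17, 19, 21, 20),
  ts1  = c( 9, 10, 12, 11, 15, 14, 13, 16, 18, 17, 19, 21),
  ts2  = c( 8,  9, 10, 12, 11, 15, 14, 13, 16, 18, 17, 19)
)

# Same pattern as the call in the question: the response is an expression
# (toy[, 1]) and `data` holds only the predictor columns.
bad_vars <- all.vars(terms(toy[, 1] ~ ., data = toy[, 2:3]))
# Variable names the formula refers to that are not columns of `data`.
missing_cols <- setdiff(bad_vars, names(toy[, 2:3]))

# Naming the response column and passing the full data frame keeps every
# variable in the formula resolvable as a column.
good_vars <- all.vars(terms(ts.t ~ ., data = toy))
all_found <- all(good_vars %in% names(toy))

# The same idea applied to train(); skipped when the packages are missing.
if (requireNamespace("caret", quietly = TRUE) &&
    requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(1234)
  fit <- caret::train(
    ts.t ~ .,
    data = toy,
    method = "rf",
    preProcess = c("center", "scale"),
    tuneGrid = expand.grid(mtry = 1:2),  # at most the number of predictors
    trControl = caret::trainControl(method = "repeatedcv",
                                    number = 5, repeats = 1),
    ntree = 200,
    metric = "RMSE"
  )
}
```

For the data in the question, the equivalent change would be train(ts.t ~ ., data = df.ts, ...). Separately, the tuning grid goes up to mtry = 6 although there are only five predictors, so it may be worth limiting it to 1:5.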
Tags: r time-series rstudio random-forest