【Question title】: Rolling stepwise regression in R
【Posted】: 2016-07-06 17:51:32
【Question】:

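I have a data frame with 12 predictor variables and a list of numbers called BEI (which I want to predict). I want to run stepwise selection on every 12 rows of data, e.g. rows 1:12, 2:13, and so on. For each rolling window I want to return the coefficients and use them to predict BEI. Here is my code: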

library(leaps)  # provides regsubsets()

k <- length(BEI)
coef.list <- numeric()
predicted.list <- numeric()
for(i in 1:(k-11)){
  BEI.subset <- BEI[i:(i+11)]
  predictors.subset <- predictors[c(i:(i+11)),]
  fit.stepwise <- regsubsets(BEI.subset~., data = predictors.subset, nvmax = 10, method = "forward")
  fit.summary <- summary(fit.stepwise)
  id <- which.min(fit.summary$cp)
  coefficients <- coef(fit.stepwise,id)
  coef.list <- append(coef.list, coefficients)
  form <- as.formula(fit.stepwise$call[[2]])
  mat <- model.matrix(form,predictors.subset)
  predicted.stepwise <- mat[,names(coefficients)]%*%coefficients
  predicted.list <- append(predicted.list, predicted.stepwise)
}

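I get an error like this: Reordering variables and trying again: There were 50 or more warnings (use warnings() to see the first 50)

The warnings are:
1: In leaps.setup(x, y, wt = wt, nbest = nbest, nvmax = nvmax, ... : 1 linear dependencies found
2: In leaps.setup(x, y, wt = wt, nbest = nbest, nvmax = nvmax, ... : 1 linear dependencies found
3: In leaps.setup(x, y, wt = wt, nbest = nbest, nvmax = nvmax, ... : 1 linear dependencies found
... and so on.

How can I fix this? Or is there a better way to write the code?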

【Comments】:

    Tags: r regression


    【Answer 1】:

    The error comes from missing values (NA) in the rolling subsets of the data.

    Take data(swiss) as an example:

    dim(swiss) 
    # [1] 47  6
    
    split_swiss <- lapply(1:nrow(swiss), function(x) swiss[x:(x+11),])
    length(split_swiss)
    # [1] 47  ## the rolling subsets produce 47 data.frames
    
    lapply(tail(split_swiss), head) # show the first 6 rows of the last 6 data.frames 
    [[1]]
                 Fertility Agriculture Examination Education Catholic Infant.Mortality
    Neuchatel         64.4        17.6          35        32    16.92             23.0
    Val de Ruz        77.6        37.6          15         7     4.97             20.0
    ValdeTravers      67.6        18.7          25         7     8.65             19.5
    V. De Geneve      35.0         1.2          37        53    42.34             18.0
    Rive Droite       44.7        46.6          16        29    50.43             18.2
    Rive Gauche       42.8        27.7          22        29    58.33             19.3
    
    [[2]]
                 Fertility Agriculture Examination Education Catholic Infant.Mortality
    Val de Ruz        77.6        37.6          15         7     4.97             20.0
    ValdeTravers      67.6        18.7          25         7     8.65             19.5
    V. De Geneve      35.0         1.2          37        53    42.34             18.0
    Rive Droite       44.7        46.6          16        29    50.43             18.2
    Rive Gauche       42.8        27.7          22        29    58.33             19.3
    NA                  NA          NA          NA        NA       NA               NA
    
    [[3]]
                 Fertility Agriculture Examination Education Catholic Infant.Mortality
    ValdeTravers      67.6        18.7          25         7     8.65             19.5
    V. De Geneve      35.0         1.2          37        53    42.34             18.0
    Rive Droite       44.7        46.6          16        29    50.43             18.2
    Rive Gauche       42.8        27.7          22        29    58.33             19.3
    NA                  NA          NA          NA        NA       NA               NA
    NA.1                NA          NA          NA        NA       NA               NA
    
    [[4]]
                 Fertility Agriculture Examination Education Catholic Infant.Mortality
    V. De Geneve      35.0         1.2          37        53    42.34             18.0
    Rive Droite       44.7        46.6          16        29    50.43             18.2
    Rive Gauche       42.8        27.7          22        29    58.33             19.3
    NA                  NA          NA          NA        NA       NA               NA
    NA.1                NA          NA          NA        NA       NA               NA
    NA.2                NA          NA          NA        NA       NA               NA
    
    [[5]]
                 Fertility Agriculture Examination Education Catholic Infant.Mortality
    Rive Droite      44.7        46.6          16        29    50.43             18.2
    Rive Gauche      42.8        27.7          22        29    58.33             19.3
    NA                 NA          NA          NA        NA       NA               NA
    NA.1               NA          NA          NA        NA       NA               NA
    NA.2               NA          NA          NA        NA       NA               NA
    NA.3               NA          NA          NA        NA       NA               NA
    
    [[6]]
                Fertility Agriculture Examination Education Catholic Infant.Mortality
    Rive Gauche      42.8        27.7          22        29    58.33             19.3
    NA                 NA          NA          NA        NA       NA               NA
    NA.1               NA          NA          NA        NA       NA               NA
    NA.2               NA          NA          NA        NA       NA               NA
    NA.3               NA          NA          NA        NA       NA               NA
    NA.4               NA          NA          NA        NA       NA               NA
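
A quick way to see which windows run past the end of the data is to count the complete rows in each one. The sketch below rebuilds `split_swiss` the same way as above using base R only; `complete_rows` is a name introduced here for illustration.

```r
# Rebuild the rolling subsets exactly as above.
data(swiss)
split_swiss <- lapply(1:nrow(swiss), function(x) swiss[x:(x + 11), ])

# Count the rows without any NA in each 12-row window.
complete_rows <- sapply(split_swiss, function(d) sum(complete.cases(d)))

# Windows that start late enough to overrun the data are padded with NA rows.
table(complete_rows == 12)
tail(complete_rows)
```

Only the windows starting at rows 1 to 36 contain 12 real observations; the remaining 11 are partly or mostly NA.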
    

    If you run regsubsets on data.frames like these, where there are more predictors than complete cases, you get an error.

    library(leaps)  # provides regsubsets()
    lapply(split_swiss, function(x) regsubsets(Fertility ~., data=x, nvmax=10, method="forward"))
    
    Error in leaps.setup(x, y, wt = wt, nbest = nbest, nvmax = nvmax, force.in = force.in,  : 
      y and x different lengths
    In addition: Warning messages:
    1: In leaps.setup(x, y, wt = wt, nbest = nbest, nvmax = nvmax, force.in = force.in,  :
      1  linear dependencies found
     ......
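
The "linear dependencies found" warning is what one would expect whenever a window has fewer rows than the model has columns. As a sketch with simulated data (the matrix `X` is made up; it stands in for 12 predictors plus an intercept on a 12-row window, which appears to be the situation in the question), the design matrix cannot have full column rank:

```r
set.seed(1)
n_rows <- 12   # rows in one rolling window
n_pred <- 12   # number of candidate predictors

# Intercept column plus 12 random predictor columns: 12 rows, 13 columns.
X <- cbind(1, matrix(rnorm(n_rows * n_pred), nrow = n_rows))

ncol(X)     # number of columns in the design matrix
qr(X)$rank  # numerical rank, which cannot exceed the number of rows
```

Because the rank is bounded by the 12 rows, at least one of the 13 columns is a linear combination of the others. Under that assumption, either the window has to be longer or the set of candidate predictors smaller.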
    

    One fix is to keep only the subsets that have 12 complete rows and run the regression on those:

    split_swiss_2 <- split_swiss[sapply(lapply(split_swiss, na.omit), nrow) == 12]
    lapply(split_swiss_2, function(x) regsubsets(Fertility ~., data=x, nvmax=10, method="forward"))
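
Putting the pieces together, here is a minimal sketch of the rolling forward selection on `swiss`, assuming the `leaps` package is installed. It builds only full windows, so no NA rows are ever created, picks the model size with the lowest Cp as in the question, and stores the coefficients and fitted values per window in a list instead of appending everything to one vector. The names `window_size`, `starts` and `roll_fit` are introduced here for illustration and are not part of the original code.

```r
library(leaps)  # provides regsubsets()

data(swiss)
window_size <- 12

# Start indices for which a full window fits inside the data.
starts <- seq_len(nrow(swiss) - window_size + 1)

roll_fit <- lapply(starts, function(i) {
  w <- swiss[i:(i + window_size - 1), ]
  # swiss has 5 predictors, so nvmax = 5 covers every model size.
  fit <- regsubsets(Fertility ~ ., data = w, nvmax = 5, method = "forward")
  id <- which.min(summary(fit)$cp)   # model size with the lowest Cp
  cf <- coef(fit, id)                # named vector, includes "(Intercept)"
  mat <- model.matrix(Fertility ~ ., data = w)
  # Multiply only the selected columns by their coefficients.
  pred <- drop(mat[, names(cf), drop = FALSE] %*% cf)
  list(start = i, coefficients = cf, fitted = pred)
})

length(roll_fit)             # one entry per full window
roll_fit[[1]]$coefficients   # selected model for rows 1:12
```

Note that `fitted` holds in-sample values for the same 12 rows the model was selected on. To predict the next observation instead, the model matrix would have to be built from the row after the window.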
    

    【Comments】:
