【Question title】: R: How to for loop a multiple linear regression which drops factors with <2 levels
【Posted】: 2019-06-27 21:02:25
【Question】:

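I am trying to loop a multiple linear regression and automatically drop any factor that does not have at least two levels, to avoid the following error message:

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels

Right now my code is: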

library(dplyr)  # group_by(), do(), %>%
library(broom)  # tidy()

df %>%
  group_by(crop_name) %>%
  do(tidy(lm(value ~ intercrop + erosion_c + purchased_seed + inorg_pest +
               org_pest + landscape + fert + inorgfert,
             data = .)))

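The problem is that some crops have a large sample size, with many observations for every variable I want to regress on, while others have a very small sample size and no observations that received a given treatment (i.e. no blood fruit crops were intercropped, etc.).

Is there a way, in a for loop, to tell R to regress on whatever it can, drop everything else, and avoid this error message?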

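【Comments】:

  • You can use `table` to identify the problem combinations. You may need to leave out variables that are not of primary interest.
  • Sounds like this could be done with nest(), filter(), and purrr::map(). If you can post some sample data, I'd be happy to give you an example.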
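
As a rough illustration of the `table` suggestion in the comments above, the sketch below counts how many distinct values each predictor takes within each crop. The data frame is hypothetical toy data (only the column names come from the question), and `predictors` and `level_counts` are names made up for this example.

```r
# Hypothetical toy data: column names follow the question, values are invented.
df <- data.frame(
  crop_name = c("maize", "maize", "maize", "maize", "bean", "bean", "bean"),
  intercrop = c("yes", "no", "yes", "no", "no", "no", "no"),
  fert      = c("yes", "no", "no", "yes", "yes", "no", "yes")
)

# Cross-tabulate one predictor against crop_name. A row with a single
# non-zero cell marks a crop where that predictor has only one level.
print(table(df$crop_name, df$intercrop))

# Count the distinct observed values of each predictor within each crop.
predictors <- c("intercrop", "fert")
level_counts <- sapply(predictors, function(p) {
  tapply(df[[p]], df$crop_name, function(x) length(unique(x)))
})
print(level_counts)
```

Any cell of `level_counts` that is below 2 identifies a crop and predictor pair that would trigger the contrasts error if it were left in the model.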

Tags: r for-loop linear-regression tidy


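【Solution 1】:

I'm fairly new to this, so it may not be the best way. You will probably need to set up a for loop over crop_name, because in my example df is the subset for just one of your crop groups.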

# Example subset for a single crop group.
# stringsAsFactors = TRUE is needed in R >= 4.0 so the columns are factors.
df <- data.frame(intercrop      = c("A","B","C","A","B","C"),
                 erosion_c      = c("A","D","C","A","B","C"),
                 purchased_seed = c("A","B","D","F","E","C"),
                 inorg_pest     = c("A","B","C","A","B","C"),
                 org_pest       = c("A","B","A","A","B","B"),
                 landscape      = c("A","A","A","A","A","A"),
                 fert           = c("A","B","C","A","B","C"),
                 inorgfert      = c("A","B","C","A","B","C"),
                 stringsAsFactors = TRUE)


# Count the distinct observed values in each predictor column.
# Using unique() rather than levels() ignores unused factor levels
# and also works for character columns.
n_levels <- sapply(df, function(x) length(unique(na.omit(x))))

# Keep only the predictors with at least two levels (this drops landscape).
keep <- names(n_levels)[n_levels >= 2]

# Build the model formula from the remaining predictors.
formu <- reformulate(keep, response = "value")
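
The answer notes that the filtering has to be repeated for each crop group. A minimal base R sketch of that loop might look like the following. The data are hypothetical toy values (only the column names come from the question), `predictors`, `results`, and `sub` are names introduced for this example, and it is one possible approach rather than the answerer's own code.

```r
# Hypothetical toy data: names follow the question, values are invented.
set.seed(1)
df <- data.frame(
  crop_name = rep(c("maize", "bean"), each = 6),
  intercrop = c("yes", "no", "yes", "no", "yes", "no", rep("no", 6)),
  fert      = rep(c("yes", "no", "no"), times = 4),
  value     = rnorm(12)
)
predictors <- c("intercrop", "fert")

results <- list()
for (crop in unique(df$crop_name)) {
  sub <- df[df$crop_name == crop, ]

  # Keep only predictors with at least two distinct observed values here.
  n_distinct <- sapply(sub[predictors], function(x) length(unique(na.omit(x))))
  keep <- names(n_distinct)[n_distinct >= 2]
  if (length(keep) == 0) next  # nothing left to regress on for this crop

  # Fit the model on the surviving predictors and store the coefficients.
  fit <- lm(reformulate(keep, response = "value"), data = sub)
  results[[crop]] <- coef(fit)
}
print(results)
```

In this toy data, intercrop is constant for bean, so the bean model is fitted on fert alone. One caveat: lm() drops rows with missing values before fitting, so a predictor that passes this check can still collapse to a single level when other columns contain NA in the same rows.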
