[Posted]: 2020-02-29 18:22:00
[Question]:
food_consumption %>%
group_by(food_category) %>%
summarise(mod= lm(co2_emmission ~ consumption))
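After running this code, I get the following error:
Error: Column `mod` must be length 1 (a summary value), not 12
How can I fix this and get the regression results for each category?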
[Discussion]:
We can wrap the output of lm in a list. An lm model object has many components, whereas summarise expects each group to return a value of length 1.
library(dplyr)
food_consumption %>%
group_by(food_category) %>%
summarise(mod= list(lm(co2_emmission ~ consumption)))
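To see where the "not 12" in the error comes from, here is a minimal sketch using the built-in mtcars data as a stand-in (food_consumption is not bundled with R, so mtcars and the mpg ~ gear formula are assumptions for illustration only). An lm fit is itself a list of components, and wrapping it in list() gives summarise a single element per group.

```r
# Fit one model on the whole data set, outside of any grouping
fit <- lm(mpg ~ gear, data = mtcars)

class(fit)    # "lm"
length(fit)   # 12: the fit is a list of 12 components
names(fit)    # coefficients, residuals, effects, ...

# Wrapping the fit in a list produces an object of length 1,
# which is what summarise() accepts as a per-group summary value
wrapped <- list(fit)
length(wrapped)   # 1
```

The model is still fully available afterwards as wrapped[[1]].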
In the devel version of dplyr (at the time of writing), condense can be used instead; it returns a list automatically
food_consumption %>%
group_by(food_category) %>%
condense(mod= lm(co2_emmission ~ consumption))
Using a reproducible example
mtcars %>%
group_by(cyl) %>%
summarise(mod = list(lm(mpg ~ gear)))
# A tibble: 3 x 2
# cyl mod
# <dbl> <list>
#1 4 <lm>
#2 6 <lm>
#3 8 <lm>
or with condense
mtcars %>%
group_by(cyl) %>%
condense(mod = lm(mpg ~ gear))
# A tibble: 3 x 2
# Rowwise: cyl
# cyl mod
# <dbl> <list>
#1 4 <lm>
#2 6 <lm>
#3 8 <lm>
To get the coefficients
mtcars %>%
group_by(cyl) %>%
condense(mod = lm(mpg ~ gear), Coef = coef(mod))
# A tibble: 3 x 3
# Rowwise: cyl
# cyl mod Coef
# <dbl> <list> <list>
#1 4 <lm> <dbl [2]>
#2 6 <lm> <dbl [2]>
#3 8 <lm> <dbl [2]>
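As far as I know, condense only existed in the development version and is not exported by released dplyr. A rough equivalent, assuming dplyr 1.0.0 or later, is nest_by, which returns a rowwise data frame with a list column named data. This is a sketch of that alternative, not the method used above.

```r
library(dplyr)

# nest_by() creates one row per cyl value, with that group's rows
# stored in the list column `data`; the result is rowwise
res <- mtcars %>%
  nest_by(cyl) %>%
  mutate(mod = list(lm(mpg ~ gear, data = data)),  # one model per row
         Coef = list(coef(mod))) %>%               # `mod` is a single lm here
  ungroup()

res
```

Because the data frame is rowwise, mod inside mutate refers to the single model for that row, so coef(mod) works without map.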
or with mutate and map (map comes from purrr)
library(purrr)
mtcars %>%
group_by(cyl) %>%
summarise(mod = list(lm(mpg ~ gear))) %>%
mutate(Coef = map(mod, coef))
# A tibble: 3 x 3
# cyl mod Coef
# <dbl> <list> <list>
#1 4 <lm> <dbl [2]>
#2 6 <lm> <dbl [2]>
#3 8 <lm> <dbl [2]>
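If plain numeric columns are more convenient than a list column of coefficient vectors, map_dbl can pull out each coefficient by name. This is a small sketch building on the summarise + list approach; the column names intercept and slope are my own choice for illustration.

```r
library(dplyr)
library(purrr)

models <- mtcars %>%
  group_by(cyl) %>%
  summarise(mod = list(lm(mpg ~ gear)))

# Extract each coefficient by name into its own numeric column
coefs <- models %>%
  mutate(intercept = map_dbl(mod, ~ coef(.x)[["(Intercept)"]]),
         slope     = map_dbl(mod, ~ coef(.x)[["gear"]]))

coefs
```

Each row now holds the group key, the fitted model, and the two coefficients as ordinary numbers, which makes sorting or plotting by slope straightforward.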
Another option is nest followed by map, instead of wrapping in list (nest comes from tidyr)
library(purrr)
library(tidyr)
mtcars %>%
group_by(cyl) %>%
nest %>%
transmute(mod = map(data, ~ lm(mpg ~ gear, data = .x)))
# A tibble: 3 x 2
# Groups: cyl [3]
# cyl mod
# <dbl> <list>
#1 6 <lm>
#2 4 <lm>
#3 8 <lm>
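Since the question asks for the regression results per category, the same nest + map pattern can be extended to pull a summary statistic out of each model. The sketch below extracts R squared; the column name r_squared is hypothetical, and mtcars again stands in for food_consumption.

```r
library(dplyr)
library(tidyr)
library(purrr)

fits <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  mutate(mod = map(data, ~ lm(mpg ~ gear, data = .x)),
         # summary() of an lm fit exposes r.squared as a single number
         r_squared = map_dbl(mod, ~ summary(.x)$r.squared)) %>%
  ungroup()

fits
```

The same idea applies to any per-model quantity that reduces to one number, such as summary(.x)$sigma.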
[Comments]: