【Question title】:Alternative to the summarise() function in dplyr
【Posted】:2020-02-29 18:22:00
【Question】:
food_consumption %>% 
   group_by(food_category) %>% 
   summarise(mod= lm(co2_emmission ~ consumption))

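After running this code I get the following error:

Error: Column `mod` must be length 1 (a summary value), not 12

How can I fix this and get the regression results for each category?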

【Comments】:

    Tags: r dplyr tidyverse


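    【Solution 1】:

    We can wrap the output of the lm model in a list. An lm object has many components, whereas summarise expects each group to return a value of length 1.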

    library(dplyr)
    food_consumption %>% 
         group_by(food_category) %>%
         summarise(mod= list(lm(co2_emmission ~ consumption)))
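
    As a small illustration of where the 12 in the error message comes from: a fitted lm object is itself a list of components, so its length is the number of components, not 1. The sketch below uses mtcars rather than the asker's food_consumption data (an assumption made only so that it runs anywhere); `fit` and `wrapped` are illustrative names.

```r
# Fit a single model on all of mtcars just to inspect the object
fit <- lm(mpg ~ gear, data = mtcars)

# An lm object is a list of components (coefficients, residuals, ...)
n_components <- length(fit)
print(n_components)
print(names(fit))

# Wrapping the model in list() gives an object of length 1,
# which is what summarise() needs for a single cell per group
wrapped <- list(fit)
print(length(wrapped))
```

    For a plain fit like this one, `length(fit)` is 12, which matches the number reported in the error.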
    

    In the devel version of dplyr (at the time of writing), condense can be used instead; it returns a list column automatically.

    food_consumption %>% 
         group_by(food_category) %>%
         condense(mod= lm(co2_emmission ~ consumption))
    

    Using a reproducible example:

    mtcars %>% 
         group_by(cyl) %>%
         summarise(mod = list(lm(mpg ~ gear)))
    # A tibble: 3 x 2
    #    cyl mod   
    #  <dbl> <list>
    #1     4 <lm>  
    #2     6 <lm>  
    #3     8 <lm>  
    

    With condense:

    mtcars %>% 
       group_by(cyl) %>%
       condense(mod = lm(mpg ~ gear))
    # A tibble: 3 x 2
    # Rowwise:  cyl
    #    cyl mod   
    #  <dbl> <list>
    #1     4 <lm>  
    #2     6 <lm>  
    #3     8 <lm>  
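
    condense only existed in the development version, and as far as I know it did not make it into the released dplyr 1.0.0. A rough equivalent in released versions, assuming dplyr >= 1.0.0 is installed, is nest_by() followed by a rowwise mutate. This is a sketch of that pattern, not the original answer's code; `models` is an illustrative name.

```r
library(dplyr)  # assumes dplyr >= 1.0.0, where nest_by() is available

models <- mtcars %>%
  # one row per cyl, with the group's rows stored in a list-column `data`
  nest_by(cyl) %>%
  # the result is rowwise, so `data` here is that row's data frame;
  # list() keeps the fitted model in a single cell
  mutate(mod = list(lm(mpg ~ gear, data = data)))

print(models)
```

    As with condense, the result is a rowwise data frame with one lm per value of cyl.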
    

    To get the coefficients:

    mtcars %>% 
       group_by(cyl) %>%
        condense(mod = lm(mpg ~ gear), Coef = coef(mod))
    # A tibble: 3 x 3
    # Rowwise:  cyl
    #    cyl mod    Coef     
    #  <dbl> <list> <list>   
    #1     4 <lm>   <dbl [2]>
    #2     6 <lm>   <dbl [2]>
    #3     8 <lm>   <dbl [2]>
    

    Or use mutate with map (map comes from purrr):

    library(purrr)
    mtcars %>% 
        group_by(cyl) %>%
        summarise(mod = list(lm(mpg ~ gear))) %>% 
        mutate(Coef = map(mod, coef))
    # A tibble: 3 x 3
    #    cyl mod    Coef     
    #  <dbl> <list> <list>   
    #1     4 <lm>   <dbl [2]>
    #2     6 <lm>   <dbl [2]>
    #3     8 <lm>   <dbl [2]>
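
    If the goal is a flat table of coefficients rather than a list column, one option is to pull each coefficient out by name with map_dbl. This is a minimal sketch under the same mtcars setup; the column names `intercept` and `slope` and the object name `coefs` are made up for illustration.

```r
library(dplyr)
library(purrr)

coefs <- mtcars %>%
  group_by(cyl) %>%
  # one fitted model per group, stored in a list column
  summarise(mod = list(lm(mpg ~ gear))) %>%
  mutate(
    # coef() returns a named numeric vector; pick each entry by name
    intercept = map_dbl(mod, ~ coef(.x)[["(Intercept)"]]),
    slope     = map_dbl(mod, ~ coef(.x)[["gear"]])
  ) %>%
  # drop the model column once the numbers have been extracted
  select(-mod)

print(coefs)
```

    Each row then holds the intercept and the gear slope for one value of cyl as ordinary numeric columns.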
    

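    Another option is to nest the data and then map over it, instead of wrapping the model in list:

    library(purrr)
    library(tidyr)  # nest() comes from tidyr
    mtcars %>% 
        group_by(cyl) %>% 
        nest() %>% 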
        transmute(mod = map(data, ~ lm(mpg ~ gear, data = .x)))
    # A tibble: 3 x 2
    # Groups:   cyl [3]
    #    cyl mod   
    #  <dbl> <list>
    #1     6 <lm>  
    #2     4 <lm>  
    #3     8 <lm>  
    

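    【Comments】:

    • Thanks! This is very helpful... but now it is a bit difficult to get the coefficients of all the models out of the list. Could you help me with that, please?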