【问题标题】:Getting summaries from multiple simple linear regressions in R从 R 中的多个简单线性回归中获取摘要
【发布时间】:2020-03-12 16:34:53
【问题描述】:

我正在对我的数据集中的多个组执行简单的线性回归。但是,我想从这些回归中提取摘要,并按组将它们放入主表中。我可以像这样运行它(它可以工作):

  fit_basic <- rs2_anova %>% #Run multiple simple linear regressions
    group_by(quant_method) %>% 
    nest() %>% 
    mutate(model = map(data, ~lm(recoveries ~ treatment, data = .))) 

fit_basic_A <- fit_basic[[1,"model"]] #Remove the model from fit_basic
fit_basic_B <- fit_basic[[1,"model"]] #Remove the model from fit_basic

fit_basic_table_A <- get_regression_table(fit_basic_A) %>%
  select("term", "estimate") %>%
  pivot_wider(names_from = "term", values_from = "estimate") %>%
  mutate(quant_method = "A")

fit_basic_table_B <- get_regression_table(fit_basic_A) %>%
  select("term", "estimate") %>%
  pivot_wider(names_from = "term", values_from = "estimate") %>%
  mutate(quant_method = "B")

fit_basic_table <- rbind(fit_basic_table_A, fit_basic_table_B)

为了节省几行代码(因为我的组比这里介绍的多得多),我想我可以使用 map 函数,但我一直卡在映射汇总表上,这会引发错误:

fit_basic <- rs2_anova %>% 
  group_by(quant_method) %>% 
  nest() %>% 
  mutate(model = map(data, ~lm(recoveries ~ treatment, data = .))) %>%
  mutate(summaries = map(data, get_regression_table(.$model)))

Error in input_checks(model, digits, print) : 
  Only simple linear regression models are supported. Try again using only `lm()` models as appropriate.

我也尝试过类似的方法:

fit_basic_table <- map(fit_basic$model, 
                           function(x) {
                             p <- get_regression_table(x)
                             cbind(par=rownames(p), p)
                           }) 

但是我得到了一个无法分解为单个数据框的数据框列表,并且我丢失了我的组指定。我试过了:

fit_basic_table <- map(fit_basic$model, 
                           function(x) {
                             p <- get_regression_table(x)
                             cbind(par=rownames(p), p)
                           }) %>% 
  map_df(as_tibble, .id = "id") 

fit_basic_table <- map(fit_basic$model, 
                           function(x) {
                             p <- get_regression_table(x)
                             cbind(par=rownames(p), p)
                           }) %>% 
  unnest(cols = "id")

关于如何实现自动化的任何想法?

*随机测试数据框:

quant_method <- c("A", "A","A","A","A","A","A","A","A","A","B","B","B","B","B","B","B","B","B","B")
treatment <- c("x","x","x","x","x","y","y","y","y","y","x","x","x","x","x","y","y","y","y","y")
recoveries <-c("88","86","87","82","85","76","65","55","72","71","98","96","97","92","99","66",
               "55","55","62","61")
rs2_anova <- data.frame(quant_method, treatment, recoveries)

【问题讨论】:

    标签: r linear-regression tidyverse purrr


    【解决方案1】:

    这是使用tidyversebroom 包的一种解决方案。它与您尝试的 purrr 方法略有不同,但我认为结果显示了您有兴趣从 lm 对象中提取的术语(、术语和估计)。

    library(tidyverse)
    library(broom)
    
    #Added the stringsAsFactors argument = F to avoid an error in the lm model
    rs2_anova <- data.frame(quant_method, 
                            treatment, 
                            recoveries,
                            stringsAsFactors = F)
    
    fit_basic <- rs2_anova %>% 
      #Group by quant_method column
      group_by(quant_method) %>%
      #do the linear models by grouping var
      do(model = lm(recoveries ~ treatment, data = .)) %>%
      #tidy lm object and order it as tibble
      tidy(model)
    

    【讨论】:

    • 这不起作用。当我运行 do(model = lm(recoveries ~treatment, data = .)) 行时,我收到一个错误,即 Error in eval(predvars, data, env) : object 'recoveries' not found
    • 我从头开始运行代码,它工作正常。您可以检查“rs2_anova”对象中是否确实有一个名为“recoveries”的列。
    【解决方案2】:

    我在这里找到了答案:unnest a list column after modeling after group_by in r

    并修改为:

    fit_cec <- rs2_anova %>% 
      group_by(quant_method) %>%
      nest %>% 
      mutate(data = map(data, ~ .x %>%
                          summarise(model = list(broom::tidy(lm(recoveries ~ treatment)))))) %>% 
      unnest(data) %>% 
      unnest(model)
    

    或获取所有估计、预测和摘要 (https://drsimonj.svbtle.com/running-a-model-on-separate-groups)。这也很好用:

    fit_cec <- rs2_anova %>% 
      group_by(quant_method) %>%
      nest %>% 
      mutate(fit = map(data, ~ lm(loss_abs_BC_2 ~ cec, data = .)),
             parameters = map(fit, tidy), #provides estimate table for slope and y-intercept with std.error and estimate p-values
             summary = map(fit, broom::glance), #provides R2, adj.R2, sigma, and model p-value
             predictions = map(fit, augment)) %>% #provides fitted values with residuals and errors
      unnest(parameters) %>%
      pivot_wider(names_from = term, values_from = c(estimate, std.error, statistic, p.value)) %>%
      unnest(summary) %>%
      unnest(predictions)
    

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 1970-01-01
      • 2018-07-26
      • 2018-06-06
      • 1970-01-01
      • 2020-10-01
      • 2016-09-20
      • 2018-10-03
      • 1970-01-01
      相关资源
      最近更新 更多