为了完整起见,这里还有base R和data.table的解决方案。
基础 R
使用split() 和lapply() 的一种可能的基本R 方法是suggested by Jogo:
result <- lapply(split(DT, list(DT$region, DT$season, DT$year)),
lm, formula = response ~ altitud)
print(result)
$IT.wint.2013
Call:
FUN(formula = ..1, data = X[[i]])
Coefficients:
(Intercept) altitud
-140.0510 0.2306
$IT.wint.2014
Call:
FUN(formula = ..1, data = X[[i]])
Coefficients:
(Intercept) altitud
-484.3333 0.6667
或者,使用管道来提高可读性
library(magrittr)
result <- split(DT, list(DT$region, DT$season, DT$year)) %>%
lapply(lm, formula = response ~ altitud)
数据表
在broom的帮助下:
library(data.table)
library(magrittr)
setDT(DT)[, lm(response ~ altitud, .SD) %>% broom::tidy(), by = .(region, season, year)]
region season year term estimate std.error statistic p.value
1: IT wint 2013 (Intercept) -140.0510204 31.82553603 -4.400586 0.1422513
2: IT wint 2013 altitud 0.2306122 0.03888277 5.930962 0.1063382
3: IT wint 2014 (Intercept) -484.3333333 NaN NaN NaN
4: IT wint 2014 altitud 0.6666667 NaN NaN NaN
setDT(DT)[, lm(response ~ altitud, .SD) %>% broom::glance(), by = .(region, season, year)]
region season year r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual
1: IT wint 2013 0.9723576 0.9447152 1.111168 35.17631 0.1063382 2 -2.925132 11.85026 9.1461 1.234694 1
2: IT wint 2014 1.0000000 NaN NaN NaN NaN 2 Inf -Inf -Inf 0.000000 0
如果为不同的组计算lm() 非常耗时,那么存储结果并将其用于后续处理步骤可能是值得的:
mod <- setDT(DT)[, .(model = .(lm(response ~ altitud, .SD))), by = .(region, season, year)]
mod
region season year models
1: IT wint 2013 <lm>
2: IT wint 2014 <lm>
mod$models 是等效于result 的模型列表。
现在,我们可以从计算模型中提取所需的信息,例如,
mod[, models[[1]] %>% broom::tidy(), by = .(region, season, year)]
region season year term estimate std.error statistic p.value
1: IT wint 2013 (Intercept) -140.0510204 31.82553603 -4.400586 0.1422513
2: IT wint 2013 altitud 0.2306122 0.03888277 5.930962 0.1063382
3: IT wint 2014 (Intercept) -484.3333333 NaN NaN NaN
4: IT wint 2014 altitud 0.6666667 NaN NaN NaN
数据
library(data.table)
DT <- fread("
region season year altitud response
IT wint 2013 800 45
IT wint 2013 815 47
IT wint 2013 840 54
IT wint 2014 800 49
IT wint 2014 815 59")