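【Posted】:2021-10-26 22:58:30
【Question】:
I am trying to use the bootstrap to estimate regression slopes and their confidence intervals, and I want to do this for grouped data. I followed the example on this site (https://www.tidymodels.org/learn/statistics/bootstrap/), but I can't figure out how to make it work with grouped/nested data. I keep getting the following message: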
Error: Problem with `mutate()` column `model`.
ℹ `model = map(splits, ~lm(conc ~ yday, data = .))`.
x object 'conc' not found
library(tidyverse)
library(tidymodels)
dat <-
structure(list(site = c("mb", "mb", "mb", "mb", "mb", "mb", "mb",
"mb", "sp", "sp", "sp", "sp", "sp", "sp", "sp", "sp"), year = c(2015,
2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,
2015, 2015, 2015, 2015), yday = c(15, 15, 35, 35, 48, 48, 69,
69, 15, 15, 37, 37, 49, 49, 69, 69), samp_depth_cat2 = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Mid-2",
"Bottom"), class = "factor"), analyte = c("NO3", "NO3", "NO3",
"NO3", "NO3", "NO3", "NO3", "NO3", "NH4", "NH4", "NH4", "NH4",
"NH4", "NH4", "NH4", "NH4"), conc = c(44.8171069465267, 44.7775358035268,
33.3678662097523, 33.0710828871279, 25.8427604055115, 26.9309658742058,
23.7585524380667, 17.5240386949382, 8.35832733633183, 9.29280745341615,
10.0797380595417, 10.2322058970515, 13.7930668951239, 15.6226805882773,
25.3003042764332, 16.8723637466981)), row.names = c(NA, -16L), class = c("tbl_df",
"tbl", "data.frame"))
set.seed(27)
# This is where I get the error
lm_boot <-
  dat %>%
  group_by(site, year, samp_depth_cat2, analyte) %>%
  nest() %>%
  bootstraps(., times = 1000, apparent = TRUE) %>%
  mutate(model = map(splits, ~lm(conc ~ yday, data = .)),
         coef_info = map(model, tidy))
boot_coefs <-
  lm_boot %>%
  unnest(coef_info)
percentile_intervals <- int_pctl(lm_boot, coef_info)
percentile_intervals
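A quick way to see where the message may come from is to look at the column names after `nest()`. The sketch below uses a small made-up data frame (`toy` is hypothetical, not the data above). Once the rows are nested, `conc` and `yday` live inside the `data` list-column, so anything that resamples or models the outer frame cannot see them.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data with the same kind of layout as `dat`
toy <- tibble(
  site = rep(c("mb", "sp"), each = 4),
  yday = rep(c(15, 35, 48, 69), times = 2),
  conc = c(44.8, 33.4, 25.8, 23.8, 8.4, 10.1, 13.8, 25.3)
)

nested <- toy %>%
  group_by(site) %>%
  nest()

# Columns of the outer (nested) frame: the grouping key plus `data`
outer_cols <- names(nested)

# Columns inside each nested tibble: this is where `conc` and `yday` are
inner_cols <- names(nested$data[[1]])

print(outer_cols)
print(inner_cols)
```

If that reading is right, calling `bootstraps()` on the nested frame resamples whole groups (one row per group) rather than the observations within each group.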
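Update
I tried mapping the bootstrap function over the nested data and then running the linear regression on the splits in that list-column. It produced a new column called model, but there don't appear to be any model elements in it.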
lm_boot <-
  dat %>%
  group_by(site, year, samp_depth_cat2, analyte) %>%
  nest() %>%
  mutate(boots = map(data, ~bootstraps(., times = 1000, apparent = TRUE)),
         model = map(boots, "splits", ~lm(conc ~ yday, data = .x)))
Any ideas?
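One possible way forward is to bootstrap inside each group, fit `lm()` on `analysis(split)` rather than on the split object itself, and call `int_pctl()` once per group. This is a sketch, not a confirmed solution: it assumes an rsample version in which `mutate()` keeps the `bootstraps` class, as the tidymodels article linked above relies on. `fit_group_boots()` and `boot_slopes_by_group()` are hypothetical helper names, not part of tidymodels. Note also that when the second argument of `map()` is a string such as `"splits"`, purrr treats it as an extractor, so the formula that follows it is not applied as a model-fitting function.

```r
library(tidyverse)
library(tidymodels)

# Hypothetical helper: bootstrap the rows of one group's data and
# fit conc ~ yday on every resample.
fit_group_boots <- function(group_data, times = 1000) {
  bootstraps(group_data, times = times, apparent = TRUE) %>%
    mutate(
      # analysis() turns an rsplit into the resampled data frame
      model     = map(splits, ~ lm(conc ~ yday, data = analysis(.x))),
      coef_info = map(model, tidy)
    )
}

# Hypothetical helper: one row per group and term, with percentile limits.
# Assumes the grouping columns used in the question are present.
boot_slopes_by_group <- function(data, times = 1000) {
  data %>%
    nest(data = -c(site, year, samp_depth_cat2, analyte)) %>%
    mutate(
      boots = map(data, fit_group_boots, times = times),
      # int_pctl() is called separately on each group's bootstraps object
      pctl  = map(boots, ~ int_pctl(.x, coef_info))
    ) %>%
    select(-data, -boots) %>%
    unnest(pctl)
}
```

With the data above, calling `set.seed(27)` and then `boot_slopes_by_group(dat)` should give a lower and upper limit for `(Intercept)` and `yday` in each group. Filtering on `term == "yday"` keeps only the slopes. With only eight rows per group, some resamples may contain very few distinct `yday` values, so the intervals deserve a cautious reading.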
【Comments】:
-
I don't yet know how to do this in tidymodels, but here is a related package that looks like it might work for this: davisvaughan.github.io/strapgod/reference/bootstrapify.html
-
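I think I've got bootstrapify working, but I still need to figure out how to compute confidence intervals from the bootstrap estimates. Hopefully someone can help with the original code above.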
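If the bootstrap slopes are already in a long data frame, a percentile interval is just a pair of quantiles per group. This is a minimal sketch that assumes hypothetical column names `site` and `estimate`; the actual output of bootstrapify may be shaped differently. `percentile_ci()` is a made-up helper, and the estimates below are simulated purely to show the call.

```r
library(dplyr)

# Hypothetical helper: percentile interval of `estimate` within each group.
# alpha = 0.05 gives a 95% interval.
percentile_ci <- function(estimates, ..., alpha = 0.05) {
  estimates %>%
    group_by(...) %>%
    summarise(
      lower = quantile(estimate, alpha / 2, names = FALSE),
      upper = quantile(estimate, 1 - alpha / 2, names = FALSE),
      .groups = "drop"
    )
}

# Simulated (made-up) bootstrap estimates, only to illustrate the call
set.seed(27)
fake_estimates <- tibble(
  site     = rep(c("mb", "sp"), each = 1000),
  estimate = c(rnorm(1000, mean = -0.4, sd = 0.05),
               rnorm(1000, mean = 0.2, sd = 0.05))
)

ci <- percentile_ci(fake_estimates, site)
print(ci)
```

This is the plain percentile method. Other interval types (for example bias-corrected ones) need more than the resampled estimates alone.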
Tags: r lm bootstrapping tidymodels rsample