【Question title】: How to calculate a bootstrapped confidence interval using the mean_cl_boot used in ggplot2?
【Posted】: 2023-03-29 10:02:01
【Question】:

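I have a 2 x 2 factorial dataset for which I plotted confidence intervals using the mean_cl_boot function. I would like to compute those intervals directly in R with the appropriate function. How can I do this?

A sample of my dataset is below: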

df <- data.frame(
      fertilizer = c("N","N","N","N","N","N","N","N","N","N","N","N","P","P","P","P","P","P","P","P","P","P","P","P","N","N","N","N","N","N","N","N","N","N","N","N","P","P","P","P","P","P","P","P","P","P","P","P"), 
      level = c("low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","low"), 
      repro = c(0,90,2,4,0,80,1,90,2,33,56,0,99,100,66,80,1,0,2,33,0,0,1,2,90,5,2,2,5,8,0,1,90,2,4,66,0,0,0,0,1,2,90,5,2,5,8,55)
    )    

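I know there are ways to extract the CI points from the plot, but I really don't want to do that. I want to call the function that computes them.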

【Question comments】:

    Tags: r ggplot2 confidence-interval


    【Solution 1】:

    mean_cl_boot is built on top of Hmisc::smean.cl.boot().

    If you want to compute the bootstrap CI over all values (regardless of fertilizer or level), smean.cl.boot(df$repro) should do it.
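As a minimal sketch of that overall call, the snippet below runs smean.cl.boot() on the repro values copied from the question. The seed value is an arbitrary choice made here purely for reproducibility, and conf.int and B are written out with what I understand to be their defaults.

```r
library(Hmisc)

# Response values copied from the question's data frame
repro <- c(0, 90, 2, 4, 0, 80, 1, 90, 2, 33, 56, 0,
           99, 100, 66, 80, 1, 0, 2, 33, 0, 0, 1, 2,
           90, 5, 2, 2, 5, 8, 0, 1, 90, 2, 4, 66,
           0, 0, 0, 0, 1, 2, 90, 5, 2, 5, 8, 55)

set.seed(101)  # arbitrary seed (an assumption), only to make the resampling repeatable

# One bootstrap CI for the mean of all observations, ignoring the grouping
overall <- smean.cl.boot(repro, conf.int = 0.95, B = 1000)
overall  # named numeric vector: Mean, Lower, Upper
```

Without set.seed() the limits will differ slightly on every run, because the interval comes from random resampling.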

    This is how you would do split-apply-combine in base R:

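    library(Hmisc)
    # Split the data frame into one piece per fertilizer x level combination
    ss <- with(df, split(df, list(fertilizer,level)))
    # Bootstrap the mean and its confidence limits within each piece
    bb <- lapply(ss, function(x) smean.cl.boot(x$repro))
    # Stack the per-group results into a single matrix
    do.call(rbind,bb)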
    

    Result (the bootstrap limits vary slightly from run to run unless a seed is set):

               Mean     Lower    Upper
    N.high 19.00000  5.747917 36.58750
    P.high 26.09091  8.631818 47.27273
    N.low  33.75000 12.416667 58.26042
    P.low  20.38462  1.615385 42.69423
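To make the Lower and Upper columns less of a black box, here is a hand-rolled percentile bootstrap for the mean in base R. It is only a sketch: boot_mean_ci is a hypothetical helper written for this illustration, not part of Hmisc, and it assumes that smean.cl.boot() takes quantiles of the resampled means (which is my understanding, but check the Hmisc documentation before relying on it).

```r
# Hypothetical helper (not from Hmisc): percentile bootstrap CI for the mean
boot_mean_ci <- function(x, conf.int = 0.95, B = 1000) {
  x <- x[!is.na(x)]
  n <- length(x)
  # Draw B resamples of size n with replacement and keep each resample's mean
  boot_means <- replicate(B, mean(x[sample.int(n, n, replace = TRUE)]))
  # The CI limits are the lower and upper tail quantiles of those means
  alpha <- (1 - conf.int) / 2
  limits <- quantile(boot_means, c(alpha, 1 - alpha), names = FALSE)
  c(Mean = mean(x), Lower = limits[1], Upper = limits[2])
}

set.seed(101)  # arbitrary seed (an assumption), only for reproducibility
x <- c(0, 90, 2, 4, 0, 80, 1, 90, 2, 33, 56, 0)  # first 12 repro values from the question
res <- boot_mean_ci(x)
res
```

With small, skewed groups like these the interval is wide and asymmetric around the mean, which is what the table above shows as well.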
    

    If you want to do this in the tidyverse:

    library(tidyverse)
    # smean.cl.boot() comes from Hmisc, loaded above
    (df 
        %>% group_split(fertilizer,level) 
        %>% map_dfr(~as_tibble(rbind(smean.cl.boot(.[["repro"]]))))
    )
    

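    (This isn't entirely satisfying, not least because the group labels are dropped from the output; there is probably a cleaner way.)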
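One possibly cleaner variant, offered as a sketch rather than the answerer's method: call ggplot2::mean_cl_boot() inside dplyr's summarise(), which keeps the grouping columns. This assumes dplyr 1.0 or later (where summarise() accepts a one-row data frame and unpacks its columns) and that Hmisc is installed, since mean_cl_boot() wraps it. The data frame is rebuilt from the question with rep() for brevity, and the seed is an arbitrary choice.

```r
library(dplyr)
library(ggplot2)  # mean_cl_boot() wraps Hmisc::smean.cl.boot(), so Hmisc must be installed

# Same data as in the question, written more compactly
df <- data.frame(
  fertilizer = rep(rep(c("N", "P"), each = 12), times = 2),
  level = c(rep(c("low", "low", "high", "high"), 11), "low", "low", "high", "low"),
  repro = c(0, 90, 2, 4, 0, 80, 1, 90, 2, 33, 56, 0,
            99, 100, 66, 80, 1, 0, 2, 33, 0, 0, 1, 2,
            90, 5, 2, 2, 5, 8, 0, 1, 90, 2, 4, 66,
            0, 0, 0, 0, 1, 2, 90, 5, 2, 5, 8, 55)
)

set.seed(101)  # arbitrary seed (an assumption), only for reproducibility

# One row per fertilizer x level cell, with the grouping columns retained
ci <- df %>%
  group_by(fertilizer, level) %>%
  summarise(mean_cl_boot(repro), .groups = "drop")
ci  # columns: fertilizer, level, y (mean), ymin (lower limit), ymax (upper limit)
```

The y, ymin and ymax names are the ones ggplot2 uses for its summary functions, so this table should correspond to what stat_summary(fun.data = mean_cl_boot) draws, up to resampling noise.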

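    【Comments】:

    • Thanks! If I want to compute the overall mean of the dataset (i.e., across all fertilizers and levels) with a single bootstrapped CI, how would I do that using the first set of code?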