【Question title】: Plotting confidence intervals of predictions from a Bayesian beta regression model
【Posted】: 2018-04-11 16:47:52
【Question】:

I have the sample data and code below, and would appreciate help with plotting credible intervals for predictions from a Bayesian beta regression model.

library(ggplot2)
library(plotly)
library(zoib)

data("GasolineYield", package = "zoib")

re.md <- zoib(yield ~ temp | 1 | 1, data=GasolineYield, 
              joint = FALSE, random=1, EUID=GasolineYield$batch, 
              zero.inflation = FALSE, one.inflation = FALSE, 
              n.iter=3200, n.thin=15, n.burn=200)

pred <- pred.zoib(re.md, data.frame(temp = seq(100, 600, 0.01)))

df <- data.frame(temp = seq(100, 600, 0.01), 
                 yield = (pred$pred[[1]][, 201] + pred$pred[[2]][, 201])/2)

ggplotly( 
ggplot() + 
geom_point(data = GasolineYield, 
aes(x = temp, y = yield, fill = batch), 
size = 4, shape = 21) + 
xlim(100, 600) + 
geom_line(data = df, aes(y = yield, x = temp), col="red") + 
theme_classic())

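【Comments】:

    Tags: r ggplot2 regression prediction bayesian


    【Solution 1】:

    I have very little experience with Bayesian statistics (although I would love to dig into it), but I believe this is what you are after: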

    df1 <- data.frame(temp = seq(100, 600, 0.01), 
                      pred$summary)
    ggplotly( 
      ggplot() + 
        geom_point(data = GasolineYield, 
                   aes(x = temp, y = yield, fill = batch), 
                   size = 4, shape = 21) + 
        xlim(100, 600) + 
        geom_line(data = df1, aes(y = mean, x = temp), col="red") + 
        geom_ribbon(data = df1, aes(ymin= X2.5., ymax = X97.5., x = temp), alpha = 0.3) +
        theme_classic())
    

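    From the help page ?pred.zoib:

    summary: if TRUE (the default), a basic summary of each posterior predictive value is provided, including the mean, standard deviation, min, max, median, and the 2.5% and 97.5% quantiles.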
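
    The column names X2.5. and X97.5. used in the plotting code are not typos. Presumably pred$summary carries quantile-style column names such as "2.5%", and data.frame() rewrites them into syntactically valid names. Here is a minimal sketch of that behaviour on simulated draws; the matrix draws and the data frames df_demo and df_raw are made up for illustration and are not part of zoib:

```r
# Simulated posterior draws: 5 prediction points (rows) x 200 draws (columns).
# These numbers are made up; they only stand in for a matrix like pred$pred[[1]].
set.seed(1)
draws <- matrix(rbeta(5 * 200, shape1 = 2, shape2 = 5), nrow = 5)

# Row-wise summary; quantile() names its results "2.5%" and "97.5%"
summ <- t(apply(draws, 1, function(x) {
  c(mean = mean(x), quantile(x, probs = c(0.025, 0.975)))
}))
colnames(summ)

# data.frame() passes names through make.names(), so "2.5%" becomes "X2.5."
df_demo <- data.frame(temp = seq(100, 500, by = 100), summ)
names(df_demo)

# check.names = FALSE keeps the original names; refer to them with backticks
df_raw <- data.frame(temp = seq(100, 500, by = 100), summ, check.names = FALSE)
names(df_raw)
```

    With df_raw, the ribbon aesthetics would be written as aes(ymin = `2.5%`, ymax = `97.5%`) instead.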

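    This differs somewhat from what you are plotting, because the mean in the summary is actually the row-wise average over all posterior draws, not a single draw:

    rowSums(pred$pred[[1]])/ncol(pred$pred[[1]])

    To visualize the difference: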

    df <- data.frame(temp = seq(100, 600, 0.01), 
                     yield = (pred$pred[[1]][, 201] + pred$pred[[2]][, 201])/2)
    
    ggplotly( 
      ggplot() + 
        geom_point(data = GasolineYield, 
                   aes(x = temp, y = yield, fill = batch), 
                   size = 4, shape = 21) + 
        xlim(100, 600) + 
        geom_line(data = df1, aes(y = mean, x = temp), col="red") + 
        geom_ribbon(data = df1, aes(ymin= X2.5., ymax = X97.5., x = temp), alpha = 0.3) +
        geom_line(data = df, aes(y = yield, x = temp), col="blue") + 
        theme_classic())
    

    Some additional notes:

    all.equal(rowSums(pred$pred[[1]])/ncol(pred$pred[[1]]), df1$mean)
    #output
    TRUE
    
    all.equal(apply(pred$pred[[1]], 1, quantile, probs = 0.025), df1$X2.5.)
    #output
    TRUE
    
    all.equal(apply(pred$pred[[1]], 1, quantile, probs = 0.975), df1$X97.5.)
    #output
    TRUE
    

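    The same holds for max, min, and the other summary columns.

    I am not sure what pred$pred[[2]] represents, but you can use the approach above to generate a summary for it and plot it like this: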

    df2 <- data.frame(temp = seq(100, 600, 0.01), 
                  mean = apply(pred$pred[[2]], 1, mean),
                  X97.5. = apply(pred$pred[[2]], 1, quantile, probs = 0.975),
                  X2.5. = apply(pred$pred[[2]], 1, quantile, probs = 0.025))
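
    If the two elements of pred$pred turn out to be draws from the same posterior (for instance one matrix per MCMC chain, which is an assumption here and worth checking against the zoib documentation), one option is to pool them before summarising instead of treating them separately. A rough sketch on simulated matrices; chain1, chain2, pooled and df_pooled are hypothetical names, and the numbers are made up:

```r
# Two simulated draw matrices standing in for pred$pred[[1]] and pred$pred[[2]]
# (made-up numbers; 4 prediction points x 200 draws each).
set.seed(2)
chain1 <- matrix(rbeta(4 * 200, 2, 5), nrow = 4)
chain2 <- matrix(rbeta(4 * 200, 2, 5), nrow = 4)

# Pool the draws column-wise: each row still corresponds to one prediction point
pooled <- cbind(chain1, chain2)

# Row-wise posterior mean and 95% interval from the pooled draws
df_pooled <- data.frame(
  temp  = seq(100, 400, by = 100),
  mean  = rowMeans(pooled),
  lower = apply(pooled, 1, quantile, probs = 0.025),
  upper = apply(pooled, 1, quantile, probs = 0.975)
)
df_pooled
```

    The resulting lower and upper columns can be passed to geom_ribbon() as ymin and ymax in the same way as X2.5. and X97.5.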
    

    Let's plot both together (careful: my R session became unresponsive when I tried this with ggplotly):

      ggplot() + 
        geom_point(data = GasolineYield, 
                   aes(x = temp, y = yield, fill = batch), 
                   size = 4, shape = 21) + 
        xlim(100, 600) + 
        geom_line(data = df1, aes(y = mean, x = temp), col="red") + 
        geom_ribbon(data = df1, aes(ymin= X2.5., ymax = X97.5., x = temp), alpha = 0.3) +
        geom_line(data = df2, aes(y = mean, x = temp), col="blue") + 
        geom_ribbon(data = df2, aes(ymin= X2.5., ymax = X97.5., x = temp), alpha = 0.3)+
        theme_classic()
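
    One plausible reason for the unresponsiveness is the size of the prediction grid: seq(100, 600, 0.01) has roughly 50,000 points, and every line and ribbon carries all of them into the interactive plot. I have not verified that this is the cause, but thinning the summary data frame before plotting is cheap to try. A sketch on a stand-in data frame; band and band_thin are hypothetical names and the curve is made up:

```r
# Stand-in for df1: same temperature grid, with a made-up smooth curve and band
band <- data.frame(temp = seq(100, 600, by = 0.01))
band$mean   <- plogis((band$temp - 400) / 80)
band$X2.5.  <- pmax(band$mean - 0.05, 0)
band$X97.5. <- pmin(band$mean + 0.05, 1)

# Keep every 100th row, i.e. one point per degree instead of one per 0.01 degree
band_thin <- band[seq(1, nrow(band), by = 100), ]

nrow(band)       # rows before thinning
nrow(band_thin)  # rows after thinning
```

    Passing the thinned data frame to geom_line() and geom_ribbon() in place of df1 should give a visually identical curve at this scale. Predicting on a coarser grid in the first place, for example seq(100, 600, 1), would avoid the large matrices altogether.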
    

    【Comments】:
