Plot time series with confidence intervals in R: answers

【Question title】: Plot time series with confidence intervals in R
【Posted】: 2015-08-11 20:38:34
【Question】:

Here is a plot of several different time series that I made in R:

I made these using a simple loop:

for(i in 1:ngroups){
  x[paste0("Group_",i)] = apply(x[,group == i],1,mean)
}

plot(x$Group_1,type="l",ylim=c(0,300))
for(i in 2:ngroups){
    lines(x[paste0("Group_",i)],col=i)
}

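I could also make this plot with matplot. As you can see, each group is the mean of several other columns. What I would like to do is plot the series as in the figure above, but additionally show the range of the underlying data that contribute to each mean. For example, the purple line would be bounded by a region shaded in light purple. At any given time index, the purple region would extend from the lowest to the highest value in the purple group (or, say, from the 5th to the 95th percentile). Is there an elegant or clever way to do this?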
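The shaded band described above could be sketched in base R roughly as follows. This is only an illustration under assumptions: the data are made up, the 12-column layout and the `band` and `t_idx` names are hypothetical, and `x`, `group` and `ngroups` merely mimic the objects used in the loop above.

```r
set.seed(1)                        # reproducible made-up data
n_time  <- 50
group   <- rep(1:3, each = 4)      # hypothetical: 12 columns assigned to 3 groups
ngroups <- max(group)
# made-up series: each column is a noisy random walk around its group's level
x <- as.data.frame(sapply(group, function(g) 60 * g + cumsum(rnorm(n_time, sd = 3))))

# per time index: mean plus 5th and 95th percentiles of each group's columns
band <- lapply(seq_len(ngroups), function(i) {
  cols <- as.matrix(x[, group == i, drop = FALSE])
  data.frame(mean = rowMeans(cols),
             lo   = apply(cols, 1, quantile, probs = 0.05, names = FALSE),
             hi   = apply(cols, 1, quantile, probs = 0.95, names = FALSE))
})

t_idx <- seq_len(n_time)
ylim  <- range(unlist(lapply(band, function(b) c(b$lo, b$hi))))
plot(NA, xlim = range(t_idx), ylim = ylim, xlab = "Time index", ylab = "Value")
for (i in seq_len(ngroups)) {
  # upper edge left to right, then lower edge right to left, closes the band
  polygon(c(t_idx, rev(t_idx)), c(band[[i]]$hi, rev(band[[i]]$lo)),
          col = adjustcolor(i, alpha.f = 0.25), border = NA)
  lines(t_idx, band[[i]]$mean, col = i, lwd = 2)
}
```

Replacing the two `quantile` calls with `min` and `max` would shade the full range of each group instead of the 5th to 95th percentiles.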

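【Question comments】:

I'm not sure I understand your code, but have you looked at ggplot2::geom_smooth()?
I believe geom_smooth is meant for adding a mean. It might be a good way to create a version of my plot above, but I don't think it can be used to show the range around the lines.
I think geom_smooth is exactly what you are looking for; see the last plot here: docs.ggplot2.org/0.9.3.1/geom_smooth.html
You need stat_summary to plot the CI. There is plenty of information here: docs.ggplot2.org/current/stat_summary.html . You are interested in stat_sum_df("mean_cl_normal", geom = "smooth"), but as you can see, stat_summary can handle many tasks.
@MatiasAndina Empirical quantiles are not the same as parametric confidence intervals; mean_cl_boot may be closer.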
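The distinction raised in the last comment can be illustrated with a small base R sketch. The data are made up, and the names `v`, `ci_mean` and `q_data` are hypothetical. A confidence interval describes uncertainty about the mean, while empirical quantiles describe the spread of the data themselves, so the two generally differ.

```r
set.seed(42)                          # reproducible made-up data
v <- rnorm(20, mean = 3, sd = 1.2)    # hypothetical values at a single time index

# parametric (t-based) 95% confidence interval for the mean
half_width <- qt(0.975, df = length(v) - 1) * sd(v) / sqrt(length(v))
ci_mean <- mean(v) + c(-1, 1) * half_width

# empirical 5th and 95th percentiles of the observations
q_data <- quantile(v, c(0.05, 0.95), names = FALSE)

print(ci_mean)
print(q_data)
```

For data like these the interval for the mean is typically much narrower than the percentile range, and it shrinks as more observations are added, whereas the percentile range does not.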

Tags: r ggplot2

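【Solution 1】:

Here is an answer using the graphics package (the graphics that ship with R). I also try to explain how the polygon (used to draw the CI) is constructed. Since I don't have your exact data, this uses fake data, but it can be adapted to your problem.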

# Values for noise and CI size
s.e. <- 0.25 # standard error of noise
interval <- s.e.*qnorm(0.975) # standard error * 97.5% quantile

# Values for Fake Data
x <- 1:10 # x values
y <- (x-1)*0.5 + rnorm(length(x), mean=0, sd=s.e.) # generate y values

# Main Plot
ylim <- c(min(y)-interval, max(y)+interval) # account for CI when determining ylim
plot(x, y, type="l", lwd=2, ylim=ylim) # plot x and y

# Determine the x values that will go into CI
CI.x.top <- x # x values going forward
CI.x.bot <- rev(x) # x values backwards
CI.x <- c(CI.x.top, CI.x.bot) # polygons are drawn clockwise

# Determine the Y values for CI
CI.y.top <- y+interval # top of CI
CI.y.bot <- rev(y)-interval # bottom of CI, but rev Y!
CI.y <- c(CI.y.top,CI.y.bot) # forward, then backward

# Add a polygon for the CI
CI.col <- adjustcolor("blue",alpha.f=0.25) # Pick a pretty CI color
polygon(CI.x, CI.y, col=CI.col, border=NA) # draw the polygon

# Point out path of polygon
arrows(CI.x.top[1], CI.y.top[1]+0.1, CI.x.top[3], CI.y.top[3]+0.1)
arrows(CI.x.top[5], CI.y.top[5]+0.1, CI.x.top[7], CI.y.top[7]+0.1)

arrows(CI.x.bot[1], CI.y.bot[1]-0.1, CI.x.bot[3], CI.y.bot[3]-0.1)
arrows(CI.x.bot[6], CI.y.bot[6]-0.1, CI.x.bot[8], CI.y.bot[8]-0.1)

# Add legend to explain what the arrows are
legend("topleft", legend="Arrows indicate path\nfor drawing polygon", xjust=0.5, bty="n")

Here is the final result:

【Comments】:

【Solution 2】:

I made a df using some random data.

Here is the df:

df
   x         y
1  1 3.1667912
2  1 3.5301539
3  1 3.8497014
4  1 4.4494311
5  1 3.8306889
6  1 4.7681518
7  1 2.8516945
8  1 1.8350802
9  1 5.8163498
10 1 4.8589443
11 2 0.3419090
12 2 2.7940851
13 2 1.9688636
14 2 1.3475315
15 2 0.9316124
16 2 1.3208475
17 2 3.0367743
18 2 3.2340156
19 2 1.8188969
20 2 2.5050162

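Plotting with stat_summary, using mean_cl_normal and geom = "smooth":

library(ggplot2)  # mean_cl_normal and mean_cl_boot also need the Hmisc package installed
ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  stat_summary(fun.data = mean_cl_normal, geom = "smooth", colour = "red")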

As someone commented, mean_cl_boot may be a better choice, so I tried that as well.

ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  stat_summary(fun.data = mean_cl_boot, geom = "smooth", colour = "red")

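The two results do differ somewhat. You can also change the confidence level with the conf.int argument if needed, for example by passing fun.args = list(conf.int = 0.9) to stat_summary.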
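Since the question asks for the range of the underlying data rather than a confidence interval for the mean, a custom summary function can be passed to stat_summary instead. The following is a minimal sketch on made-up data. The `long_df` data frame, its `time`, `series`, `group` and `value` columns, and the `quantile_band` helper are all hypothetical names introduced here, and the approach does not need Hmisc.

```r
library(ggplot2)

set.seed(1)  # reproducible made-up data
# hypothetical long-format data: 3 groups, 10 series per group, 30 time points
long_df <- expand.grid(time = 1:30, series = 1:10,
                       group = factor(c("A", "B", "C")))
long_df$value <- 50 * as.integer(long_df$group) + rnorm(nrow(long_df), sd = 8)

# summary function in the shape stat_summary expects: columns y, ymin, ymax
quantile_band <- function(v) {
  data.frame(y    = mean(v),
             ymin = quantile(v, 0.05, names = FALSE),
             ymax = quantile(v, 0.95, names = FALSE))
}

p <- ggplot(long_df, aes(x = time, y = value, colour = group, fill = group)) +
  # shaded 5th to 95th percentile band per group and time point
  stat_summary(fun.data = quantile_band, geom = "ribbon", alpha = 0.25, colour = NA) +
  # group mean drawn on top of the band
  stat_summary(fun.data = quantile_band, geom = "line")
print(p)
```

Swapping the quantile calls for `min(v)` and `max(v)` would make the ribbon span the full range of each group.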

【Comments】: