【Question title】: How to add 95% confidence intervals to a graph of proportions of factor levels in ggplot?
【Posted】: 2019-10-11 14:11:29
【Question description】:

I would like to build on the excellent answer given to a previously asked question:

Graph proportion within a factor level rather than a count in ggplot2

I want to build on this code:

var1 <- c("Left", "Right", NA, "Left", "Right", "Right", "Right", "Left", "Left", "Right", "Left", "Left","Left", "Right", "Left", "Right", "Right", "Right", "Left", "Left", "Right", NA, "Left", "Left","Left", "Right", NA, "Left", "Right", "Right", "Right", "Left", "Left", "Right", "Left", "Left","Left", "Right", "Left", "Right", "Right", "Right", "Left", "Left", "Right", NA, "Left", "Left")
var2 <- c("Higher", "Lower", NA, "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", "Slightly higher","Higher", "Lower", "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", NA, "Slightly lower","Higher", "Lower", NA, "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", "Slightly higher","Higher", "Lower", "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly lower", "Higher", "Higher", "Higher", NA, "Slightly lower")
df <- as.data.frame(cbind(var1, var2))

library(dplyr)
library(ggplot2)

df %>%
  na.omit() %>%
  group_by(var1, var2) %>%
  summarise(n = n()) %>%
  mutate(n = n/sum(n)) %>%
  ungroup() %>%
  ggplot() + aes(var2, n, fill = var1) + 
  geom_bar(position = "dodge", stat = "identity") + 
  labs(x="Left or Right",y="Count")+
  scale_y_continuous() +
  scale_fill_discrete(name = "Answer:")+ theme_classic()+ 
  theme(legend.position="top")  +
  scale_fill_manual(values = c("black", "red"))

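by adding error bars, in the form of 95% confidence intervals, to each bar on the graph. I tried adding the terms

upperE=(1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n), lowerE=(-1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n).

but I keep getting errors...

I also tried building an entirely new data frame for the graph, like so: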

var1 <- c("Left", "Right", NA, "Left", "Right", "Right", "Right", "Left", "Left", "Right", "Left", "Left","Left", "Right", "Left", "Right", "Right", "Right", "Left", "Left", "Right", NA, "Left", "Left","Left", "Right", NA, "Left", "Right", "Right", "Right", "Left", "Left", "Right", "Left", "Left","Left", "Right", "Left", "Right", "Right", "Right", "Left", "Left", "Right", NA, "Left", "Left")
var2 <- c("Higher", "Lower", NA, "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", "Slightly higher","Higher", "Lower", "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", NA, "Slightly lower","Higher", "Lower", NA, "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", "Slightly higher","Higher", "Lower", "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly lower", "Higher", "Higher", "Higher", NA, "Slightly lower")
df <- as.data.frame(cbind(var1, var2))



dat <- df %>%
  na.omit() %>%
  group_by(var1, var2) %>%
  summarise(n = n()) %>%
  mutate(prop = n/sum(n),upperE=1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n, lowerE=-1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n)

test <- ggplot(dat, aes(x=var2, y = prop, fill = var1))+ 
  geom_bar(position = "dodge", stat = "identity") + geom_errorbar(aes(ymin = lowerE, ymax = upperE),position="dodge")+
  labs(x="Answer",y="Proportion")+
  scale_fill_discrete(name = "Condition:")+ theme_classic()+ 
  theme(legend.position="top") 

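This gives me error bars, but they sit at 0 on the Y axis rather than at the top of each bar...

Does anyone have any suggestions? Thank you!

【Question comments】:

    Tags: r ggplot2 errorbar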


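    【Solution 1】:

    I have now worked out how to get the error bars into the right place on each bar: the ymin and ymax specifications of the error bars need to be expressed relative to the value being plotted, like so: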

    dat <- df %>%
      na.omit() %>%
      group_by(var1, var2) %>%
      summarise(n = n()) %>%
      mutate(prop = n/sum(n),upperE=1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n, lowerE=-1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n)
    
    test <- ggplot(dat, aes(x=var2, y = prop, fill = var1))+ 
      geom_bar(position = "dodge", stat = "identity") + geom_errorbar(aes(ymin = prop+lowerE, ymax = prop+upperE),width = .2, position=position_dodge(.9))+
      labs(x="Answer",y="Proportion")+
      scale_fill_discrete(name = "Condition:")+ theme_classic()+ 
      theme(legend.position="top") 
    

    This puts the error bars at the top of each bar.
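The point of this answer is that `ymin` and `ymax` in `geom_errorbar()` are absolute positions on the y axis, not offsets from the bar. A minimal sketch of the difference follows, assuming a hypothetical toy data frame (`toy`, `half`, `p_wrong`, and `p_right` are illustrative names, and the numbers are not taken from the question's data):

```r
library(ggplot2)

# Hypothetical toy values, not taken from the question's data
toy <- data.frame(
  group = c("a", "b"),
  prop  = c(0.40, 0.60),
  half  = c(0.10, 0.08)   # half-width of each interval
)

# Offsets used directly: the error bars are drawn around y = 0
p_wrong <- ggplot(toy, aes(group, prop)) +
  geom_col() +
  geom_errorbar(aes(ymin = -half, ymax = half), width = 0.2)

# Offsets added to the plotted value: the error bars sit on the bar tops
p_right <- ggplot(toy, aes(group, prop)) +
  geom_col() +
  geom_errorbar(aes(ymin = prop - half, ymax = prop + half), width = 0.2)

# Inspect the y positions that ggplot2 will draw for the error-bar layer
wrong_y <- layer_data(p_wrong, 2)[, c("ymin", "ymax")]
right_y <- layer_data(p_right, 2)[, c("ymin", "ymax")]
```

Comparing `wrong_y` with `right_y` shows why the first attempt in the question produced error bars at 0: the half-widths were passed as if they were y coordinates.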

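    【Comments】:

    • Are you sure it shouldn't be aes(ymin = prop + lowerE, ymax = prop + upperE)?
    • Hi! Thanks, both are correct, but your way reads more intuitively, so I will change my script. Thank you :)
    • No worries, Sarah :)
    【Solution 2】: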

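    The formula for the standard error (SE) of a proportion, as used in a 95% CI, is se = sqrt(p * (1-p) / n). So I think the solution above should state sqrt(n/sum(n) * (1 - n/sum(n)) / n). However, n there is only the number of successes; the full sample is sum(n). So it should actually be sqrt(n/sum(n) * (1 - n/sum(n)) / sum(n)).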
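As a hedged sketch of how this correction could be applied to the question's data: the code below computes a normal-approximation (Wald) interval using the group total as the sample size. The column names `total`, `se`, `lower`, and `upper`, and the clipping of the bounds to [0, 1], are assumptions made for this illustration and are not part of either answer.

```r
library(dplyr)
library(ggplot2)

# Data from the question
var1 <- c("Left", "Right", NA, "Left", "Right", "Right",
          "Right", "Left", "Left", "Right", "Left", "Left",
          "Left", "Right", "Left", "Right", "Right", "Right",
          "Left", "Left", "Right", NA, "Left", "Left",
          "Left", "Right", NA, "Left", "Right", "Right",
          "Right", "Left", "Left", "Right", "Left", "Left",
          "Left", "Right", "Left", "Right", "Right", "Right",
          "Left", "Left", "Right", NA, "Left", "Left")
var2 <- c("Higher", "Lower", NA, "Slightly higher", "Slightly higher", "Slightly higher",
          "Lower", "Slightly higher", "Higher", "Higher", "Higher", "Slightly higher",
          "Higher", "Lower", "Slightly higher", "Slightly higher", "Slightly higher", "Lower",
          "Slightly higher", "Higher", "Higher", "Higher", NA, "Slightly lower",
          "Higher", "Lower", NA, "Slightly higher", "Slightly higher", "Slightly higher",
          "Lower", "Slightly higher", "Higher", "Higher", "Higher", "Slightly higher",
          "Higher", "Lower", "Slightly higher", "Slightly higher", "Slightly higher", "Lower",
          "Slightly lower", "Higher", "Higher", "Higher", NA, "Slightly lower")
df <- data.frame(var1, var2)

z <- qnorm(0.975)  # roughly 1.96 for a two-sided 95% interval

dat <- df %>%
  na.omit() %>%
  count(var1, var2) %>%            # n = rows in each var1/var2 cell
  group_by(var1) %>%
  mutate(
    total = sum(n),                # full sample size within each var1 level
    prop  = n / total,
    se    = sqrt(prop * (1 - prop) / total),
    lower = pmax(prop - z * se, 0),  # assumption: clip bounds to [0, 1]
    upper = pmin(prop + z * se, 1)
  ) %>%
  ungroup()

p <- ggplot(dat, aes(x = var2, y = prop, fill = var1)) +
  geom_col(position = position_dodge(0.9)) +
  geom_errorbar(aes(ymin = lower, ymax = upper),
                width = 0.2, position = position_dodge(0.9)) +
  labs(x = "Answer", y = "Proportion", fill = "Condition:") +
  theme_classic() +
  theme(legend.position = "top")
```

With cells this small, the Wald interval is only a rough approximation; an exact or Wilson interval (for example from `binom.test()` or `prop.test()`) may be a better choice.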
