【Question title】: How to create a bar chart with multiple x variables per bar using ggplot or plotly?
【Posted】: 2018-03-02 11:44:58
【Question description】:

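I would like to draw a bar chart in R, using ggplot or another package, that shows the values of multiple X variables for each bar.

Thanks for your help. I have attached a figure from Akseer et al to show the kind of plot I want to draw.

Below I provide sample data for replicating this bar chart.

In the first two code blocks, the spacing and order of the interventions and groups are meant to reflect the classification of interventions shown in the example figure. This is because not every intervention applies to every group. Also, after the dataset is created, the values (national medians) for groups that a given intervention does not belong to in Figure B need to be removed.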

Interventions<-c("Demand of family planning satisfied", ## interventions for 1st group

          "ANC 1+",                                 ## interventions for 2nd group
          "ANC 4+", 
          "ANC by skilled provider",
          "Protected against neonatal tetanus",

          "SBA",                                   ## interventions for 3rd group
          "Facility deliveries",

          "Early breastfeeding",                   ## interventions for 4th group

          "Exclusive breastfeeding at 6 months",   ## interventions for 5th group
          "Minimum meal frequency", 
          "BCG", 
          "Penta3", 
          "Measles",
          "Received vitamin A during the last 6 months",

          "Diarrhoea treatment (ORS)",             ## interventions for 6th group
          "Care seeking for pneumonia", 
          "Antibiotics for pneumonia", 

          "Improved drinking water sources",       ## interventions for 7th group
          "Improved sanitation facilities") 

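Now the groups. Each bar in Figure B shows the national median for one intervention. The first 7 groups hold the national medians from which those bars are drawn: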

Prepregnancy<- (sample(1:100, 19, replace=TRUE)) ## 1st group

Pregnancy<-(sample(1:100, 19, replace=TRUE))   ## 2nd group

Birth<-(sample(1:100, 19, replace=TRUE))       ## 3rd group

Postnatal<-(sample(1:100, 19, replace=TRUE))   ## 4th group

Infancy<-sample(1:100, 19, replace=TRUE)       ## 5th group

Childhood<-sample(1:100, 19, replace=TRUE)     ## 6th group

Other<-sample(1:100, 19, replace=TRUE)        ## 7th group

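Below is the last part of the data, the "provincial coverage" groups. One thing to note: unlike the 7 groups above (national medians), every one of these "provincial coverage" variables applies to each of the 19 interventions, as shown in Figure B.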

Provincial1<-sample(1:100, 19, replace=TRUE)  ## provincial level observations for each of the 19 interventions
Provincial2<-sample(1:100, 19, replace=TRUE)  ## provincial level observations for each of the 19 interventions
Provincial3<-sample(1:100, 19, replace=TRUE)  ## provincial level observations for each of the 19 interventions
Provincial4<-sample(1:100, 19, replace=TRUE)  ## provincial level observations for each of the 19 interventions
Provincial5<-sample(1:100, 19, replace=TRUE)  ## provincial level observations for each of the 19 interventions 
Provincial6<-sample(1:100, 19, replace=TRUE)  ## provincial level observations for each of the 19 interventions 
Provincial7<-sample(1:100, 19, replace=TRUE)  ## provincial level observations for each of the 19 interventions 
Provincial8<-sample(1:100, 19, replace=TRUE)  ## provincial level observations for each of the 19 interventions 
Provincial9<-sample(1:100, 19, replace=TRUE)  ## provincial level observations for each of the 19 interventions 
Provincial10<-sample(1:100, 19, replace=TRUE) ## provincial level observations for each of the 19 interventions 


mydata_B<-data.frame(Interventions, Prepregnancy, Pregnancy, 
                 Birth, Postnatal, Infancy, Childhood, Other, 

                 Provincial1, Provincial2, Provincial3,
                 Provincial4, Provincial5,
                 Provincial6, Provincial7, Provincial8,
                 Provincial9, Provincial10)

rownames(mydata_B) <- mydata_B[,1]
dtFig3B <- mydata_B[,-1]
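
ggplot2 generally expects one row per observation, so wide columns such as Provincial1 ... Provincial10 would usually be reshaped to long format before plotting. A minimal base-R sketch, assuming a three-row stand-in for mydata_B with only two provincial columns; the names `wide`, `long`, `province`, and `coverage` are illustrative and not part of the original data:

```r
set.seed(42)  # assumption: fixed seed so the sketch is reproducible

# Small stand-in for mydata_B (3 interventions, 2 provincial columns)
wide <- data.frame(
  Interventions = c("BCG", "Penta3", "Measles"),
  Provincial1   = sample(1:100, 3, replace = TRUE),
  Provincial2   = sample(1:100, 3, replace = TRUE)
)

# Stack the provincial columns: one row per intervention/province pair
prov_cols <- c("Provincial1", "Provincial2")
long <- reshape(
  wide,
  direction = "long",
  varying   = prov_cols,      # columns to stack
  v.names   = "coverage",     # name of the stacked value column
  timevar   = "province",     # name of the column holding the source column name
  times     = prov_cols,
  idvar     = "Interventions"
)

long
```

The same call should apply to the full data frame by setting `prov_cols` to all ten provincial column names.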

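Again, after the dataset is created, the values (national medians) for groups that a given intervention does not belong to in Figure B need to be removed.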
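
One possible way to do that removal is to record which group each intervention belongs to and blank out every other median. This is only a sketch: the group sizes are read off the `## interventions for ... group` comments in the Interventions vector, and `group_names`, `group_sizes`, `group_of`, and the stand-in data frame `medians` are hypothetical names introduced here:

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

group_names <- c("Prepregnancy", "Pregnancy", "Birth", "Postnatal",
                 "Infancy", "Childhood", "Other")
# Number of interventions per group, in the order of the Interventions vector
group_sizes <- c(1, 4, 2, 1, 6, 3, 2)
group_of    <- rep(group_names, times = group_sizes)  # length 19

# Stand-in for the seven median columns of mydata_B (19 rows x 7 columns)
medians <- as.data.frame(
  sapply(group_names, function(g) sample(1:100, 19, replace = TRUE))
)

# Keep a median only where the column matches the row's own group
for (g in group_names) {
  medians[[g]][group_of != g] <- NA
}

# Each intervention now has exactly one non-missing national median
rowSums(!is.na(medians))
```

Running the same loop over `mydata_B[[g]]` instead of `medians[[g]]` would apply the idea to the real data frame.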

I would appreciate any ideas on how to reproduce this bar chart in R.

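【Question discussion】:

    Tags: r ggplot2 plotly


    【Solution 1】:

    This example shows how to use factor(x, levels) to make sure that bars in the same group are placed next to each other. In the ggplot call, you can map the grouping variable to the fill aesthetic to separate the groups visually. stat = "unique" is used to plot the unique values rather than counts (with the default count stat, the height of each bar would be the number of matching rows in df).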

    library(ggplot2)
    
    df <- data.frame(x = rep(c("Z", "A", "Y", "B", "X"), each = 5), 
                     value = sample(10:99, 25))
    
    # divide into groups
    groups <- c(Z = "g1", A = "g3", Y = "g3", B = "g1", X = "g2")
    df$group <- groups[as.character(df$x)]
    
    # set the order of group
    df$group <- factor(df$group, c("g1", "g2", "g3"))
    
    # order df by group
    df <- df[order(df$group), ]
    
    # reset the order of x accordingly
    df$x <- factor(df$x, unique(df$x))
    
    # calculate medians
    medians <- tapply(df$value, df$x, median)
    df$median <- medians[as.character(df$x)]
    
    # plot, mapping group to fill aesthetic
    ggplot(df, aes(x, fill = group)) +
      geom_bar(aes(y = median), stat = "unique") +
      geom_point(aes(y = value)) + 
      labs(y = "values and median")
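
The ordering step is the part that is easiest to get wrong, so here it is in isolation. A small base-R sketch using the same labels as above (`ord` and `x_ordered` are names introduced for illustration): without explicit levels, factor() sorts alphabetically and interleaves the groups; supplying levels in group order keeps each group's bars adjacent.

```r
x      <- c("Z", "A", "Y", "B", "X")
groups <- c(Z = "g1", A = "g3", Y = "g3", B = "g1", X = "g2")

# Default levels are alphabetical, which mixes the groups on the axis
levels(factor(x))
# "A" "B" "X" "Y" "Z"

# Sort x by its group, then use that order as the factor levels
ord       <- order(factor(groups[x], levels = c("g1", "g2", "g3")))
x_ordered <- factor(x, levels = x[ord])
levels(x_ordered)
# "Z" "B" "X" "A" "Y"
```

ggplot2 places discrete x values in level order, so the second factor yields g1 (Z, B), then g2 (X), then g3 (A, Y).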
    

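    【Discussion】:

    • This draws part of the chart, but I still cannot reproduce the example. Other parts are missing. For example, how do I plot the provincial coverage values on each bar? Also, how do I add the vertical lines that separate the groups? Thanks!
    • @Krantz I extended the example to show the data points as a scatter plot and the medians as bars. As far as I know, ggplot has no direct way to add vertical lines between (groups of) bars.
    • Great! How can I give each province a different color so that the data points are more meaningful? Thanks a lot for your input, @Jordi.
    • @Krantz I would not recommend coloring the data points, because they may clash with the bar colors. To increase the visual contrast you could try aes(y = value, shape = x) in the geom_point call, although I think the mapping to the horizontal axis is sufficient.
    • I see your point. I tried aes(y = value, shape = x), but I suspect province needs to be a variable distinct from x, because we need to be able to contrast the values (coverage) of different provinces within a given x category (intervention). As it stands, the shapes contrast the levels of x (interventions: "Z", "A", "Y", "B", "X"), which are already distinguished by color (groups: "g1", "g2", "g3") and by the bars (medians). In short, the contrast between provinces needs to happen within each level of x, not between levels of x. Thanks in advance!
    【Solution 2】:

    This shows how to put lines between the groups, based on this answer. It extends @Jordi's answer above, modified to color the points by province and to use alpha on the bars. 19 provinces are really hard to tell apart by color, so some shapes may be needed as well, as mentioned in the other comments.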

    library(ggplot2)
    
    # make data
    df = read.csv(text='
    group,intervention,province,value
    g1,i1,p1,10
    g1,i1,p2,12
    g1,i2,p1,13
    g1,i2,p2,15
    g2,i3,p1,18
    g2,i3,p2,20
    g3,i4,p1,14
    g3,i4,p2,16
    g3,i5,p1,18
    g3,i5,p2,20
    ', stringsAsFactors = FALSE)
    
    # define ordered factors to control plot orders
    df$group = ordered(df$group, levels = c("g3", "g2", "g1")) ## deliberately reversed 
    df$intervention = ordered(df$intervention, levels = c("i1", "i2", "i3", "i4", "i5"))
    
    # find the last intervention in each group
    library(dplyr)
    last_in_group = df  %>%
      group_by(group, intervention) %>%
      summarize() %>%
      group_by(group) %>%
      summarize(x = as.integer(tail(intervention,1)) + .5 ) 
    
    # calculate medians
    medians <- tapply(df$value, df$intervention, median)
    df$median <- medians[as.character(df$intervention)]
    
    # plot, mapping group to fill aesthetic
    ggplot(df, aes(x = intervention, fill = group)) +
      geom_col(aes(y = median, fill = group), width = 0.3, alpha=0.2) +
      geom_point(aes(y = value, col=province)) + 
      geom_vline(xintercept = last_in_group$x, lwd = 0.5, linetype=2, alpha = 0.2) +
      scale_y_continuous(expand = c(0,0)) +
      labs(y = "values and median") +
      theme(panel.background = element_rect(fill = "white"))
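
The separator positions can also be computed without dplyr. The sketch below is an alternative under stated assumptions, not the code used above: it takes the largest axis position in each group and, unlike the version above, drops the final separator so that no line is drawn after the last group. `last_pos` and `separators` are names introduced for illustration.

```r
df <- data.frame(
  group        = c("g1", "g1", "g1", "g1", "g2", "g2", "g3", "g3", "g3", "g3"),
  intervention = c("i1", "i1", "i2", "i2", "i3", "i3", "i4", "i4", "i5", "i5")
)
df$intervention <- factor(df$intervention,
                          levels = c("i1", "i2", "i3", "i4", "i5"))

# Rightmost axis position occupied by each group
last_pos <- sapply(split(as.integer(df$intervention), df$group), max)

# A separator sits half a step past each group's last bar;
# the largest position is dropped so nothing is drawn after the final group
separators <- head(sort(unname(last_pos)), -1) + 0.5
separators
```

The result can be passed to geom_vline(xintercept = separators) in place of last_in_group$x.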
    

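    【Discussion】:

    • @epi99, that works. Could you help with comparing the points within each bar and creating a different (separate) legend for the data points? Thanks in advance.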
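
One way this request might be approached, offered as a sketch rather than a confirmed answer from the thread: give the bars and the points separate data and separate aesthetics. Mapping group to fill on the bars and province to colour and shape on the points makes ggplot2 draw two legends, and position_dodge() spreads the provinces side by side within each bar. The toy data, the `bars` data frame, and the legend titles are assumptions made for illustration.

```r
library(ggplot2)

# Toy data: two interventions, two provinces each
df <- data.frame(
  group        = c("g1", "g1", "g2", "g2"),
  intervention = c("i1", "i1", "i2", "i2"),
  province     = c("p1", "p2", "p1", "p2"),
  value        = c(10, 12, 18, 20)
)

# One row per bar: the median over provinces for each intervention
bars <- aggregate(value ~ group + intervention, data = df, FUN = median)

p <- ggplot() +
  # bars use the fill scale, so they get their own legend
  geom_col(data = bars,
           aes(x = intervention, y = value, fill = group),
           width = 0.6, alpha = 0.3) +
  # points use colour and shape, which merge into a second legend
  geom_point(data = df,
             aes(x = intervention, y = value,
                 colour = province, shape = province),
             position = position_dodge(width = 0.4), size = 2) +
  labs(y = "values and median", fill = "Group (median)",
       colour = "Province", shape = "Province")
p
```

Giving colour and shape the same legend title is what lets ggplot2 merge them into a single province legend.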
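    【Solution 3】:

    This may be the more natural ggplot approach: use facet_grid to produce a single row of panels, with scales = 'free_x' so that each panel contains only the x values it uses, and space = 'free' so that the width of each panel adjusts to fit. Further tweaks to the theme may get closer to the desired presentation.

    This follows the data structure and example from @Jordi.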

    # plot, mapping group to fill aesthetic
    ggplot(df, aes(x, fill = group)) +
      geom_bar(aes(y = median), stat = "unique", width= 0.3) +
      geom_point(aes(y = value)) + 
      labs(y = "values and median") +
      facet_grid(. ~ group, scales = "free_x", space = "free") 
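
Because the snippet above relies on df from Solution 1, a self-contained variant may be easier to try. The sketch below rebuilds similar data and adds a few theme tweaks of the kind mentioned above; the seed, the `bars` data frame, and the specific theme settings are assumptions, not part of the original answer. It also uses geom_col on pre-computed medians instead of stat = "unique".

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(x = rep(c("Z", "A", "Y", "B", "X"), each = 5),
                 value = sample(10:99, 25))
groups <- c(Z = "g1", A = "g3", Y = "g3", B = "g1", X = "g2")
df$group <- factor(unname(groups[as.character(df$x)]),
                   levels = c("g1", "g2", "g3"))

# One row per bar: the median of each x within its group
bars <- aggregate(value ~ x + group, data = df, FUN = median)

p <- ggplot(df, aes(x = x, y = value)) +
  geom_col(data = bars, aes(fill = group), width = 0.3) +
  geom_point() +
  facet_grid(. ~ group, scales = "free_x", space = "free_x", switch = "x") +
  labs(y = "values and median") +
  theme(
    panel.spacing.x = unit(0, "lines"),  # panels touch, so groups read as one axis
    strip.placement = "outside",         # group labels sit below the axis text
    panel.border    = element_rect(colour = "grey60", fill = NA)  # frame doubles as separator
  )
p
```

With zero panel spacing, the panel borders act as the vertical lines between groups, which avoids computing separator positions by hand.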
    
