【问题标题】:Add geom_rug like boxplots per group in ggplot2在 ggplot2 中为每组添加 geom_rug 之类的箱线图
【发布时间】:2018-10-10 14:55:16
【问题描述】:

我想在我的密度图的底部和顶部添加每组的地毯箱线图。我找不到实现,所以我尝试手动创建箱线图,然后将带有 annotation_custom 的箱线图添加到图中。

目前存在密度图和箱线图的x轴不对齐的问题。我试图提取第一个图的限制,但只能找到一种方法来提取数据的限制。

第二个问题是箱线图的精确 y 对齐,这应该和 geom_rug 处理的一样。

第三个问题是确保密度和箱线图使用相同的填充颜色。我使用手动方法来解决这个问题,但如果我不必在多个位置指定颜色,显然它会更通用。

set.seed(123)
library(ggplot2)
library(ggpubr)
library(data.table)
Data <- data.table(x = rnorm(100),
                   group = rep(c("group1", "group2"), times = c(30, 70)))

# Colors for groups
colors <- c("group1" = "#66C2A5", "group2" = "#FC8D62")

p <-
  ggplot(Data, aes(x = x, fill = group, color = group)) +
  geom_density(alpha = 0.5) +
  scale_color_manual(values = colors) +
  scale_fill_manual(values = colors)

# Rugs
p +
  geom_rug(data = Data[group %in% "group1"]) +
  geom_rug(data = Data[group %in% "group2"], sides = "t")

#-----

# Boxplots
boxplot1 <-
  ggplot(Data[group %in% "group1"]) +
  geom_boxplot(aes(y = x), fill = colors[["group1"]]) +
  coord_flip() +
  theme_transparent()

boxplot2 <-
  ggplot(Data[group %in% "group2"]) +
  geom_boxplot(aes(y = x), fill = colors[["group2"]]) +
  coord_flip() +
  theme_transparent()

boxplot1_grob <- ggplotGrob(boxplot1)
boxplot2_grob <- ggplotGrob(boxplot2)

# Place box plots inside density plot
x <- ggplot_build(p)$layout$panel_scales_x[[1]]$range$range
xmin <- x[1]
xmax <- x[2]
y <- ggplot_build(p)$layout$panel_scales_y[[1]]$range$range
ymin <- y[1]
ymax <- y[2]

yoffset <- (1/28) * ymax
xoffset <- (1/28) * xmax

# Add boxplots with annotation_custom
p2 <- p +
  annotation_custom(grob = boxplot1_grob, xmin = xmin, xmax = xmax,
                    ymin = ymin - yoffset, ymax = ymin + yoffset) +
  annotation_custom(grob = boxplot2_grob,
                    xmin = xmin, xmax = xmax,
                    ymin = ymax - yoffset, ymax = ymax + yoffset)

p2

# Alignment is not correct
p2 +
  geom_rug(data = Data[group %in% "group1"]) +
  geom_rug(data = Data[group %in% "group2"], sides = "t")

【问题讨论】:

  • 您可以查看 ggMarginal() 包中的函数 ggExtra daattali.com/shiny/ggExtra-ggMarginal-demo,它可以让您在边缘显示箱线图。
  • 我看过这个包,但它似乎不是在顶部和底部,而是在底部和右侧添加边距。而且外观完全不同
  • 您可以在顶部添加两个箱线图或在底部添加两个箱线图。确实,这不是您要找的。​​span>
  • 我试过这个,但它似乎不起作用:stackoverflow.com/questions/52895254/…

标签: r ggplot2


【解决方案1】:

我曾经做过类似的练习,但尚未对其进行严格测试,但它似乎确实适用于您的用例。如果有任何问题,请告诉我,我会看看是否可以修复它们:

# with boxplots only
p +
  geom_marginboxplot(data = Data[Data$group %in% "group1", ], 
                     aes(y = 1), sides = "b") +
  geom_marginboxplot(data = Data[Data$group %in% "group2", ], 
                     aes(y = 1), sides = "t")

# with both boxplots & geom_rug (check that they align exactly)
p +
  geom_marginboxplot(data = Data[Data$group %in% "group1", ], 
                     aes(y = 1), sides = "b") +
  geom_marginboxplot(data = Data[Data$group %in% "group2", ], 
                     aes(y = 1), sides = "t") +
  geom_rug(data = Data[group %in% "group1"]) +
  geom_rug(data = Data[group %in% "group2"], sides = "t")

边缘箱线图的尺寸模仿geom_rug 的尺寸,占绘图面板高度/宽度的 3%。 x 和 y 都必须映射到 aes(),尽管在这种情况下实际上不需要 y,所以我将值分配给它 1 作为占位符。

运行以下命令获取geom_marginboxplot

library(ggplot2)
library(grid)

`%||%` <- function (x, y)  if (is.null(x))  y else x

geom_marginboxplot <- function(mapping = NULL, data = NULL,
                         ...,
                         sides = "bl",
                         outlier.shape = 16,
                         outlier.size = 1.5,
                         outlier.stroke = 0.5,
                         width = 0.9,
                         na.rm = FALSE,
                         show.legend = NA,
                         inherit.aes = TRUE) {

  layer(
    data = data,
    mapping = mapping,
    stat = StatMarginBoxplot,
    geom = GeomMarginBoxplot,
    position = "identity",
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      sides = sides,
      outlier.shape = outlier.shape,
      outlier.size = outlier.size,
      outlier.stroke = outlier.stroke,
      width = width,
      notch = FALSE,
      notchwidth = 0.5,
      varwidth = FALSE,
      na.rm = na.rm,
      ...
    )
  )
}

StatMarginBoxplot <- ggproto(
  "StatMarginBoxplot", Stat,
  optional_aes = c("x", "y"),
  non_missing_aes = "weight",

  setup_data = function(data, params, 
                        sides = "bl") {
    if(grepl("l|r", sides)){
      data.vertical <- data
      data.vertical$orientation <- "vertical"
    } else data.vertical <- data.frame()
    if(grepl("b|t", sides)){
      data.horizontal <- data
      data.horizontal$y <- data.horizontal$x
      data.horizontal$orientation <- "horizontal"
    } else data.horizontal <- data.frame()
    data <- remove_missing(rbind(data.vertical, 
                                 data.horizontal),
                           na.rm = FALSE, vars = "x", 
                           "stat_boxplot")
    data
  },

  compute_group = function(data, scales, sides = "bl", 
                           width = 0.9, na.rm = FALSE, coef = 1.5){

    if(grepl("l|r", sides)){
      df.vertical <- do.call(environment(StatBoxplot$compute_group)$f,
                             args = list(data = data[data$orientation == "vertical", ], 
                                         scales = scales, width = width,
                                         na.rm = na.rm, coef = coef))
      df.vertical <- df.vertical[, c("ymin", "lower", "middle", "upper", "ymax", "outliers")]
      df.vertical$orientation = "vertical"
    } else df.vertical <- data.frame()
    if(grepl("b|t", sides)){
      df.horizontal <- do.call(environment(StatBoxplot$compute_group)$f,
                               args = list(data = data[data$orientation == "horizontal", ], 
                                           scales = scales, width = width,
                                           na.rm = na.rm, coef = coef))
      df.horizontal <- df.horizontal[, c("ymin", "lower", "middle", "upper", "ymax", "outliers")]
      df.horizontal$orientation = "horizontal"
    } else df.horizontal <- data.frame()

    df <- rbind(df.vertical, df.horizontal)

    colnames(df) <- gsub("^y", "", colnames(df))
    df
  }
)

GeomMarginBoxplot <- ggproto(
  "GeomMarginBoxplot", Geom,

  setup_data = function(data, params, sides = "bl") {

    data.vertical <- data[data$orientation == "vertical", ]
    if(nrow(data.vertical) > 0) {
      colnames(data.vertical)[1:6] <- paste0("y", colnames(data.vertical)[1:6])
    } 
    data.horizontal <- data[data$orientation == "horizontal", ]
    if(nrow(data.horizontal) > 0){
      colnames(data.horizontal)[1:6] <- paste0("x", colnames(data.horizontal)[1:6])
    }
    data <- merge(data.vertical, data.horizontal, all = TRUE)
    data <- data[, sapply(data, function(x) !all(is.na(x)))]
    data
  },

  draw_group = function(data, panel_params, coord, fatten = 2,
                        outlier.shape = 19, outlier.stroke = 0.5,
                        outlier.size = 1.5, width = 0.9,
                        notch = FALSE, notchwidth = 0.5, varwidth = FALSE,
                        sides = "bl") {

    draw.marginal.box <- function(sides){

      if(sides %in% c("l", "b")){
        pos1 <- unit(0, "npc"); pos2 <- unit(0.03, "npc")
      } else {
        pos2 <- unit(0.97, "npc"); pos1 <- unit(1, "npc")
      }
      if(width > 0 & width < 1){
        increment <- (1 - width) / 2
        increment <- increment * (pos2 - pos1)
        pos1 <- pos1 + increment
        pos2 <- pos2 - increment
      }
      pos3 <- 0.5 * pos1 + 0.5 * pos2

      outliers_grob <- NULL

      if(sides %in% c("l", "r")) {
        data <- data[data$orientation == "vertical", ]

        if (!is.null(data$youtliers) && length(data$youtliers[[1]] >= 1)) {

          outliers <- data.frame(
            y = unlist(data$youtliers[[1]]),
            x = 0,
            colour = data$colour[1],
            fill = data$fill[1],
            shape = outlier.shape %||% data$shape[1],
            size = outlier.size %||% data$size[1],
            stroke = outlier.stroke %||% data$stroke[1],
            alpha = data$alpha[1],
            stringsAsFactors = FALSE
          )

          coords <- coord$transform(outliers, panel_params)

          x.pos <- rep(pos3, nrow(coords))
          y.pos <- unit(coords$y, "native")

          outliers_grob <- pointsGrob(
            x = x.pos, y = y.pos,
            pch = coords$shape,
            gp = gpar(col = coords$colour, 
                      fill = alpha(coords$fill, coords$alpha), 
                      fontsize = coords$size * .pt + coords$stroke * .stroke/2, 
                      lwd = coords$stroke * .stroke/2))
        }

        box.whiskers <- data.frame(
          y = c(data$ymin, data$ylower, data$ymiddle, data$yupper, data$ymax),
          x = 0,
          colour = data$colour[1],
          fill = data$fill[1],
          size = data$size[1],
          alpha = data$alpha[1],
          stringsAsFactors = FALSE
        )

        box.whiskers <- coord$transform(box.whiskers, panel_params)

        whiskers_grob <- segmentsGrob(
          x0 = rep(pos3, 2),
          x1 = rep(pos3, 2),
          y0 = unit(c(box.whiskers$y[1], box.whiskers$y[5]), "native"),
          y1 = unit(c(box.whiskers$y[2], box.whiskers$y[4]), "native"),
          gp = gpar(col = box.whiskers$colour,
                    lwd = box.whiskers$size * .pt,
                    lty = box.whiskers$linetype))

        box_grob <- rectGrob(
          x = pos1,
          y = unit(box.whiskers$y[4], "native"),
          width = pos2 - pos1,
          height = unit(box.whiskers$y[4] - box.whiskers$y[2], "native"),
          just = c("left", "top"),
          gp = gpar(col = box.whiskers$colour,
                    fill = alpha(box.whiskers$fill, box.whiskers$alpha),
                    lwd = box.whiskers$size * .pt,
                    lty = box.whiskers$linetype))

        median_grob <- segmentsGrob(
          x0 = rep(pos1, 2),
          x1 = rep(pos2, 2),
          y0 = unit(box.whiskers$y[3], "native"),
          y1 = unit(box.whiskers$y[3], "native"),
          gp = gpar(col = box.whiskers$colour,
                    lwd = box.whiskers$size * .pt,
                    lty = box.whiskers$linetype))
      } 

      if(sides %in% c("b", "t")) {
        data <- data[data$orientation == "horizontal", ]

        if (!is.null(data$xoutliers) && length(data$xoutliers[[1]] >= 1)) {

          outliers <- data.frame(
            x = unlist(data$xoutliers[[1]]),
            y = 0,
            colour = data$colour[1],
            fill = data$fill[1],
            shape = outlier.shape %||% data$shape[1],
            size = outlier.size %||% data$size[1],
            stroke = outlier.stroke %||% data$stroke[1],
            alpha = data$alpha[1],
            stringsAsFactors = FALSE
          )

          coords <- coord$transform(outliers, panel_params)

          x.pos <- unit(coords$x, "native")
          y.pos <- rep(pos3, nrow(coords))

          outliers_grob <- pointsGrob(
            x = x.pos, y = y.pos,
            pch = coords$shape,
            gp = gpar(col = coords$colour, 
                      fill = alpha(coords$fill, coords$alpha), 
                      fontsize = coords$size * .pt + coords$stroke * .stroke/2, 
                      lwd = coords$stroke * .stroke/2))
        }

        box.whiskers <- data.frame(
          x = c(data$xmin, data$xlower, data$xmiddle, data$xupper, data$xmax),
          y = 0,
          colour = data$colour[1],
          fill = data$fill[1],
          size = data$size[1],
          alpha = data$alpha[1],
          stringsAsFactors = FALSE
        )

        box.whiskers <- coord$transform(box.whiskers, panel_params)

        whiskers_grob <- segmentsGrob(
          y0 = rep(pos3, 2),
          y1 = rep(pos3, 2),
          x0 = unit(c(box.whiskers$x[1], box.whiskers$x[5]), "native"),
          x1 = unit(c(box.whiskers$x[2], box.whiskers$x[4]), "native"),
          gp = gpar(col = box.whiskers$colour,
                    lwd = box.whiskers$size * .pt,
                    lty = box.whiskers$linetype))

        box_grob <- rectGrob(
          y = pos2,
          x = unit(box.whiskers$x[2], "native"),
          height = pos2 - pos1,
          width = unit(box.whiskers$x[4] - box.whiskers$x[2], "native"),
          just = c("left", "top"),
          gp = gpar(col = box.whiskers$colour,
                    fill = alpha(box.whiskers$fill, box.whiskers$alpha),
                    lwd = box.whiskers$size * .pt,
                    lty = box.whiskers$linetype))

        median_grob <- segmentsGrob(
          y0 = rep(pos1, 2),
          y1 = rep(pos2, 2),
          x0 = unit(box.whiskers$x[3], "native"),
          x1 = unit(box.whiskers$x[3], "native"),
          gp = gpar(col = box.whiskers$colour,
                    lwd = box.whiskers$size * .pt,
                    lty = box.whiskers$linetype))
      }

      grobTree(outliers_grob,
               whiskers_grob,
               box_grob,
               median_grob)
    }

    result <- list()

    if(grepl("l", sides)) result$l <- draw.marginal.box("l")
    if(grepl("r", sides)) result$r <- draw.marginal.box("r")
    if(grepl("b", sides)) result$b <- draw.marginal.box("b")
    if(grepl("t", sides)) result$t <- draw.marginal.box("t")

    gTree(children = do.call("gList", result))

  },

  draw_key = draw_key_boxplot,

  default_aes = aes(weight = 1, colour = "grey20", fill = "white", 
                    size = 0.5, stroke = 0.5,
                    alpha = 0.75, shape = 16, linetype = "solid",
                    sides = "bl"),

  optional_aes = c("lower", "upper", "middle", "min", "max")
)

会话信息:R 3.5.1,ggplot2 3.0.0。

【讨论】:

  • 非常感谢!像魅力一样工作:)
  • 你能解释一下,为什么需要 y 美学吗?这将为我返回 Ignoring unknown aesthetics: y 警告
  • 如果没有 y 美感,图层就无法工作。坦率地说,我也不太清楚为什么,因为我确实在 optional 而不是 required 美学下列出了 x 和 y。但由于它仅适用于 y 美学映射,因此我必须包含一个。我使用散点图(同时具有 x 和 y)作为设想的用例编写了函数,所以这也是我第一次遇到这种情况。有时间我会进一步调查。 (出现警告是因为顶级数据中没有 y 映射,这是预期的。)
猜你喜欢
  • 1970-01-01
  • 2011-05-13
  • 2020-11-08
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
相关资源
最近更新 更多