【Question Title】: R: Stacked area chart does not stack
【Posted】: 2018-05-18 08:43:28
【Question】:

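I have data that I want to plot as a stacked area chart. The x axis holds continuous data, and the y axis holds continuous data that I accumulate per group before plotting. Here is the code I am using, with some dummy data: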

library(data.table)
library(ggplot2)

set.seed(1)
dt <- data.table(var=sample(1:6,1000,replace=TRUE),xdata=runif(1000),ydata=runif(1000))
setorder(dt, var, xdata)

dt$cumydata <- dt[,
                  cumsum(ydata),
                  by = .(var)]$V1/sum(dt$ydata)

ggplot(dt, aes(x = xdata, y = cumydata, fill = as.factor(var))) +
  geom_area(position = "stack")

This is the resulting plot:

My problem is that the data are not stacked correctly. I suspect this is because the x data are continuous?

【Comments】:

    Tags: r plot ggplot2 stacked-chart stacked-area-chart


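    【Answer 1】:

    For a stacked area chart, every group needs the same set of x values, each occurring the same number of times. Changing your example data as follows gives the expected output: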

    set.seed(1)
    dt <- data.table(ydata=runif(1002))
    dt$var <- rep(1:6, each=167)
    dt$xdata <- rep(runif(167), 6)
    setorder(dt, var, xdata)
    
    dt$cumydata <- dt[,
                      cumsum(ydata),
                      by = .(var)]$V1/sum(dt$ydata)
    
    ggplot(dt,aes(x = xdata, y = cumydata, fill = as.factor(var))) +
      geom_area(position = "stack")
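If the original data cannot be regenerated with shared x values, one possible workaround is to evaluate each group's cumulative curve on a common set of x positions before plotting. The following is only a sketch under assumptions: it keeps the question's dummy data, uses every observed `xdata` value as the shared grid, and treats each cumulative curve as a step function via `approx(..., method = "constant")`. The names `total`, `grid` and `dt_grid` are introduced here for illustration and are not part of the original post.

```r
library(data.table)
library(ggplot2)

set.seed(1)
dt <- data.table(var = sample(1:6, 1000, replace = TRUE),
                 xdata = runif(1000),
                 ydata = runif(1000))
setorder(dt, var, xdata)

# Per-group cumulative share of the overall total, as in the question.
total <- sum(dt$ydata)
dt[, cumydata := cumsum(ydata) / total, by = var]

# Shared x grid (assumption): every x value observed in any group.
grid <- sort(unique(dt$xdata))

# Evaluate each group's cumulative curve at every grid point.
# method = "constant" carries the last observed value forward,
# yleft = 0 means nothing has accumulated before a group's first point,
# rule = 2 holds the group's final value to the right of its last point.
dt_grid <- dt[, .(xdata = grid,
                  cumydata = approx(xdata, cumydata, xout = grid,
                                    method = "constant",
                                    yleft = 0, rule = 2)$y),
              by = var]

# All groups now share identical x values, so the areas can stack.
p <- ggplot(dt_grid, aes(x = xdata, y = cumydata, fill = as.factor(var))) +
  geom_area(position = "stack")
print(p)
```

Because each group's curve ends at its share of the total, the top of the stack reaches 1 at the largest x value. Linear interpolation (`method = "linear"`) would give a smoother outline, at the cost of no longer matching the cumulative sums exactly between observations.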
    

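    【Comments】:

      【Answer 2】:

      This is how I ended up solving it, based on Jimbou's information. It only takes a little preprocessing: bin the x values of every group on the same breaks, then stack the cumulative counts per bin. I also moved the whole thing to a log scale.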

      library(data.table)
      library(ggplot2)
      
      set.seed(1)
      dtt <- data.table(var=sample(1:6,1000,replace=TRUE),xdata=runif(1000),ydata=runif(1000))
      
      setorder(dtt, var, xdata)
      
      log.min.xdata <- log(min(dtt$xdata))
      log.max.xdata <- log(max(dtt$xdata))
      
      nbreaks <- 101
      
      temp <- hist(log(dtt$xdata[dtt$var==1]),
                   breaks = seq(log.min.xdata, log.max.xdata, length=nbreaks),
                   plot = FALSE)
      
      
      dt <- data.table(var = unlist(lapply(sort(unique(dtt$var)),
                                           function(x){rep(x,nbreaks-1)})),
                       bin = rep(1:(nbreaks-1),length(unique(dtt$var))),
                       # repeat the bin midpoints once per group instead of relying on recycling
                       mid = rep(temp$mids, length(unique(dtt$var))))
      
      dt$count <- dt[,
                     hist(log(dtt$xdata[dtt$var==var]), 
                          breaks = seq(log.min.xdata, log.max.xdata, length=nbreaks),
                          plot = FALSE)$counts,
                     by = .(var)]$V1
      
      dt$cumcount <- dt[,
                        cumsum(count),
                        by = .(var)]$V1
      
      
      
      pp <- ggplot(dt, aes(x = exp(mid), y = cumcount, fill = as.factor(var))) +
        geom_area(position = "stack") +
        scale_x_log10() +
        theme_bw() +
        theme(legend.position = c(0.1, 0.70),
              legend.background = element_rect(fill="lightgrey", 
                                               size=0.5, linetype="solid")) +
        labs(title = "y",
             fill = " var",
             x = "xdata",
             y = "cumcount") +
        theme(title = element_text(face = "bold"),
              axis.title = element_text(face = "bold"),
              legend.title = element_text(face = "bold"),
              legend.text = element_text(face = "bold"))
      
      pp  # display the plot
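The same binned table can probably be built without calling `hist()` once per group. The sketch below is an assumption-laden alternative, not the answerer's method: it assigns bins with `findInterval()` on the shared log-scale breaks, counts rows per group and bin, and uses `CJ()` to add the empty bins as zeros. The names `breaks`, `mids`, `counts` and `full` are hypothetical. Note that `findInterval()` uses left-closed intervals while `hist()` defaults to right-closed ones, so counts could differ for a value that falls exactly on an interior break.

```r
library(data.table)

set.seed(1)
dtt <- data.table(var = sample(1:6, 1000, replace = TRUE),
                  xdata = runif(1000),
                  ydata = runif(1000))

nbreaks <- 101
breaks <- seq(log(min(dtt$xdata)), log(max(dtt$xdata)), length.out = nbreaks)
mids <- (head(breaks, -1) + tail(breaks, -1)) / 2

# Assign every observation to a bin on the shared breaks.
# all.inside = TRUE keeps the result within 1..(nbreaks - 1).
dtt[, bin := findInterval(log(xdata), breaks, all.inside = TRUE)]

# Count observations per (var, bin).
counts <- dtt[, .(count = .N), by = .(var, bin)]

# Complete the grid so that every group has every bin; empty bins get 0.
full <- CJ(var = sort(unique(dtt$var)), bin = seq_len(nbreaks - 1))
dt <- counts[full, on = .(var, bin)]
dt[is.na(count), count := 0L]

# Bin midpoints (on the log scale) and per-group cumulative counts.
dt[, mid := mids[bin]]
setorder(dt, var, bin)
dt[, cumcount := cumsum(count), by = var]
```

The resulting `dt` has the columns `var`, `bin`, `count`, `mid` and `cumcount`, so it should work with the same `ggplot()` call that plots `exp(mid)` against `cumcount`.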
      

      【Comments】:
