【Question title】: ggridges with time series - R
【Posted】: 2020-11-25 16:15:01
【Question description】:

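I have a data frame that I would like to draw as a density plot with geom_density_ridges from ggridges, but it returns the same line for every state. What am I doing wrong?

I would also like to add trim = TRUE as in here, but that returns the following message:

Ignoring unknown parameters: trim

My code: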

library(tidyverse)
library(ggridges)

# Ask the API for the location of the latest spreadsheet and extract its URL
url <- httr::GET("https://xx9p7hp1p7.execute-api.us-east-1.amazonaws.com/prod/PortalGeral",
                 httr::add_headers("X-Parse-Application-Id" =
                                       "unAFkcaNDeXajurGB7LChj8SgQYS2ptm")) %>%
    httr::content() %>%
    `[[`("results") %>%
    `[[`(1) %>%
    `[[`("arquivo") %>%
    `[[`("url")

data <- openxlsx::read.xlsx(url) %>%
    filter(is.na(municipio), is.na(codmun)) %>%
    mutate_at(vars(contains(c("Acumulado", "Novos", "novos"))), ~ as.numeric(.))

data[,8] <- openxlsx::convertToDate(data[,8])

data <- data %>%
    mutate(mortalidade = obitosAcumulado / casosAcumulado,
           date = data) %>%
    select(-data)

ggplot(data = data, aes(x = date, y = estado, heights = casosNovos)) +
    geom_density_ridges(trim = TRUE)

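【Question comments】:

  • This looks like the kernel density estimate of a uniform distribution. I guess your date column has regular intervals, so its density approximates a uniform distribution. What exactly do you expect from a density plot of a time series?
  • Something like that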
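
To illustrate the first comment, here is a minimal sketch using hypothetical daily dates rather than the real download. geom_density_ridges estimates the density of the x values within each group and does not use the case counts, so a column with one row per day yields an almost flat curve for every state. The date range and the variable names below are assumptions made only for this example.

```r
# Hypothetical data: one observation per day, as in a regular time series
dates <- seq(as.Date("2020-03-01"), as.Date("2020-11-24"), by = "day")
x <- as.numeric(dates)

# Kernel density of the date values alone (what a density ridge estimates)
d <- density(x)

# Restrict to the middle half of the range to stay away from edge effects
inner <- d$x > quantile(x, 0.25) & d$x < quantile(x, 0.75)

# The estimated density barely varies in the interior ...
range(d$y[inner])

# ... and sits close to the height of a uniform density over the same days
1 / length(dates)
```

Because every state has the same set of dates, each ridge ends up with the same near-uniform shape, which matches the plot described in the question.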

Tags: r ggplot2 ggridges


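【Solution 1】:

You are probably not looking for density ridges, but for regular ridgelines.

There are a few choices to make about normalisation. If you want to mimic a density, you can divide each group by its sum: height = casosNovos / sum(casosNovos). Next, you can decide whether to scale each ridge so that it fits in the space between the lines, which you can do with the scales::rescale() function. Whether you do this per group or for the whole data is up to you. I chose the whole data below.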

library(tidyverse)
library(ggridges)

# Ask the API for the location of the latest spreadsheet and extract its URL
url <- httr::GET("https://xx9p7hp1p7.execute-api.us-east-1.amazonaws.com/prod/PortalGeral",
                 httr::add_headers("X-Parse-Application-Id" =
                                     "unAFkcaNDeXajurGB7LChj8SgQYS2ptm")) %>%
  httr::content() %>%
  `[[`("results") %>%
  `[[`(1) %>%
  `[[`("arquivo") %>%
  `[[`("url")

data <- openxlsx::read.xlsx(url) %>%
  filter(is.na(municipio), is.na(codmun)) %>%
  mutate_at(vars(contains(c("Acumulado", "Novos", "novos"))), ~ as.numeric(.))

data[,8] <- openxlsx::convertToDate(data[,8])

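data <- data %>%
  # Inside mutate(), `data` refers to the date column of the spreadsheet
  # (Portuguese for "date"), not to the data frame itself
  mutate(mortalidade = obitosAcumulado / casosAcumulado,
         date = data) %>%
  select(-data) %>%
  # Normalise new cases within each state so that the heights sum to 1
  group_by(estado) %>%
  mutate(height = casosNovos / sum(casosNovos))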

ggplot(data = data[!is.na(data$estado),], 
       aes(x = date, y = estado, height = scales::rescale(height))) +
  geom_ridgeline()
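
The normalisation choices described above can be tried without the download. The following is a minimal sketch on made-up data: the toy data frame, the state codes "AA", "BB", "CC" and the Poisson counts are all assumptions for illustration, and only the column names mirror the real data.

```r
library(dplyr)
library(ggplot2)
library(ggridges)

# Hypothetical toy data standing in for the real spreadsheet
set.seed(1)
toy <- expand.grid(
  date = seq(as.Date("2020-03-01"), by = "day", length.out = 60),
  estado = c("AA", "BB", "CC"),
  stringsAsFactors = FALSE
) %>%
  mutate(casosNovos = rpois(n(), lambda = 50))

toy_norm <- toy %>%
  group_by(estado) %>%
  # Step 1: mimic a density, each state's heights sum to 1
  mutate(height = casosNovos / sum(casosNovos)) %>%
  ungroup() %>%
  # Step 2: rescale over the whole data to the range [0, 1]
  mutate(height_scaled = scales::rescale(height))

p <- ggplot(toy_norm, aes(x = date, y = estado, height = height_scaled)) +
  geom_ridgeline()
p
```

To rescale each state separately instead, the rescale step can be moved before ungroup(), which makes every ridge reach the same maximum height regardless of its case counts.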

【Comments】:
