【Question title】: R: Overlaying Points on a Graph
【Posted】: 2021-04-15 11:01:02
【Question】:

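I am working with the R programming language. I am trying to learn how to overlay points on a graph and then visualize them.

Using the code below, I can generate some time series data, aggregate it by month, take the mean/min/max, and plot the result: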

library(xts)
library(ggplot2)
library(dplyr)
library(plotly)
library(lubridate)

set.seed(123)

#time series 1
date_decision_made = seq(as.Date("2014/1/1"), as.Date("2016/1/1"),by="day")

property_damages_in_dollars <- rnorm(731,100,10)

final_data <- data.frame(date_decision_made, property_damages_in_dollars)


#####aggregate

final_data$year_month <- format(as.Date(final_data$date_decision_made), "%Y-%m")
final_data$year_month <- as.factor(final_data$year_month)


f = final_data %>% group_by (year_month) %>% summarise(max_value = max(property_damages_in_dollars), mean_value = mean(property_damages_in_dollars), min_value = min(property_damages_in_dollars))



####plot####

fig <- plot_ly(f, x = ~year_month, y = ~max_value, type = 'scatter', mode = 'lines',
        line = list(color = 'transparent'),
        showlegend = FALSE, name = 'max_value') 

fig <- fig %>% add_trace(y = ~min_value, type = 'scatter', mode = 'lines',
            fill = 'tonexty', fillcolor='rgba(0,100,80,0.2)', line = list(color = 'transparent'),
            showlegend = FALSE, name = 'min_value') 

fig <- fig %>% add_trace(x = ~year_month, y = ~mean_value, type = 'scatter', mode = 'lines',
            line = list(color='rgb(0,100,80)'),
            name = 'Average') 


fig <- fig %>% layout(title = "Average Property Damages",
         paper_bgcolor='rgb(255,255,255)', plot_bgcolor='rgb(229,229,229)',
         xaxis = list(title = "Months",
                      gridcolor = 'rgb(255,255,255)',
                      showgrid = TRUE,
                      showline = FALSE,
                      showticklabels = TRUE,
                      tickcolor = 'rgb(127,127,127)',
                      ticks = 'outside',
                      zeroline = FALSE),
         yaxis = list(title = "Dollars",
                      gridcolor = 'rgb(255,255,255)',
                      showgrid = TRUE,
                      showline = FALSE,
                      showticklabels = TRUE,
                      tickcolor = 'rgb(127,127,127)',
                      ticks = 'outside',
                      zeroline = FALSE))

fig

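Now, on the same plot (`fig`), I want to plot all of the observations for each month as a vertical column of points above that month. I am trying to create something like this:

With some data manipulation, the following code produces the plot below:

plot(final_data$year_month, final_data$property_damages_in_dollars)

Can someone show me how to extend this solution to plotly (i.e. enhance the `fig` object)?

Thanks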

【Comments on the question】:

  • Should geom_ribbon() be used?
  • How did these suggestions work out for you?

Tags: r dplyr time-series plotly r-plotly


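【Solution 1】:

For full flexibility in formatting the markers, you can split your data frame final_data by month and add each subset with add_trace, using the following code: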

date_split <- split(final_data, final_data$year_month)
for (i in seq_along(date_split)) {
  fig <- fig %>% add_trace(y=date_split[[i]]$property_damages_in_dollars,
                           x=date_split[[i]]$year_month,
                           mode='markers'
                           )
}
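
To make it clearer what the loop above iterates over, here is a small base-R sketch of what split() returns for this kind of data. The data frame is rebuilt inside the block so that it runs on its own; the names `demo_data` and `demo_split` are invented for this illustration and are not part of the answer.

```r
# Rebuild the question's data under illustrative names.
set.seed(123)
dates <- seq(as.Date("2014-01-01"), as.Date("2016-01-01"), by = "day")
demo_data <- data.frame(
  date_decision_made = dates,
  property_damages_in_dollars = rnorm(length(dates), 100, 10)
)
demo_data$year_month <- as.factor(format(demo_data$date_decision_made, "%Y-%m"))

# split() returns a named list with one data frame per factor level,
# i.e. one data frame per month.
demo_split <- split(demo_data, demo_data$year_month)

length(demo_split)      # number of months covered
names(demo_split)[1:3]  # list names are the factor levels
nrow(demo_split[[1]])   # number of daily rows in the first month

# seq_along() is a safer loop index than 1:length(), which yields c(1, 0)
# when the list is empty.
for (i in seq_along(demo_split)) {
  stopifnot(nrow(demo_split[[i]]) > 0)
}
```

Because every list element becomes its own trace, each month can in principle be styled separately, at the cost of one legend entry per month.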

Result 1:

If you only want black markers, add the following argument to add_trace():

marker=list(color='rgba(0,0,0, 1)')

Result 2:

If you want to adjust the transparency of the markers, you can do so directly through the last argument of rgba(), for example:

marker=list(color='rgba(0,0,0, 0.2)')

Result 3:
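
If you want to try several transparency levels, the colour string can also be assembled in code. The sketch below uses a hypothetical helper, `rgba_string()`, which is not part of plotly; it only pastes together the same kind of CSS-style string used above.

```r
# Hypothetical helper (not a plotly function): build a CSS-style rgba() string.
rgba_string <- function(r, g, b, alpha = 1) {
  # The alpha channel must lie between 0 (transparent) and 1 (opaque).
  stopifnot(alpha >= 0, alpha <= 1)
  sprintf("rgba(%d,%d,%d,%s)",
          as.integer(r), as.integer(g), as.integer(b), format(alpha))
}

# The same black, mostly transparent marker colour as in the answer.
marker_style <- list(color = rgba_string(0, 0, 0, 0.2))
marker_style$color
```

The resulting list can be passed as marker = marker_style in add_trace().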

Full code:

library(xts)
library(ggplot2)
library(dplyr)
library(plotly)
library(lubridate)

set.seed(123)

#time series 1
date_decision_made = seq(as.Date("2014/1/1"), as.Date("2016/1/1"),by="day")

property_damages_in_dollars <- rnorm(731,100,10)

final_data <- data.frame(date_decision_made, property_damages_in_dollars)


#####aggregate

final_data$year_month <- format(as.Date(final_data$date_decision_made), "%Y-%m")
final_data$year_month <- as.factor(final_data$year_month)


f = final_data %>% group_by (year_month) %>% summarise(max_value = max(property_damages_in_dollars), mean_value = mean(property_damages_in_dollars), min_value = min(property_damages_in_dollars))



####plot####

fig <- plot_ly(f, x = ~year_month, y = ~max_value, type = 'scatter', mode = 'lines',
        line = list(color = 'transparent'),
        showlegend = FALSE, name = 'max_value') 

fig <- fig %>% add_trace(y = ~min_value, type = 'scatter', mode = 'lines',
            fill = 'tonexty', fillcolor='rgba(0,100,80,0.2)', line = list(color = 'transparent'),
            showlegend = FALSE, name = 'min_value') 

fig <- fig %>% add_trace(x = ~year_month, y = ~mean_value, type = 'scatter', mode = 'lines',
            line = list(color='rgb(0,100,80)'),
            name = 'Average') 


fig <- fig %>% layout(title = "Average Property Damages",
         paper_bgcolor='rgb(255,255,255)', plot_bgcolor='rgb(229,229,229)',
         xaxis = list(title = "Months",
                      gridcolor = 'rgb(255,255,255)',
                      showgrid = TRUE,
                      showline = FALSE,
                      showticklabels = TRUE,
                      tickcolor = 'rgb(127,127,127)',
                      ticks = 'outside',
                      zeroline = FALSE),
         yaxis = list(title = "Dollars",
                      gridcolor = 'rgb(255,255,255)',
                      showgrid = TRUE,
                      showline = FALSE,
                      showticklabels = TRUE,
                      tickcolor = 'rgb(127,127,127)',
                      ticks = 'outside',
                      zeroline = FALSE))

date_split <- split(final_data, final_data$year_month)
for (i in seq_along(date_split)) {
  fig <- fig %>% add_trace(y=date_split[[i]]$property_damages_in_dollars,
                           x=date_split[[i]]$year_month,
                           mode='markers',
                           marker=list(color='rgba(0,0,0, 0.2)')
                           #marker=list(color='rgba(0,0,0, 1)')
                           )
}
fig

【Comments】:

    【Solution 2】:

    Personally, I always find it simpler to build the plot with ggplot and then pass it to plotly with the handy ggplotly() function. Hope this helps.

    library(xts)
    library(ggplot2)
    library(dplyr)
    library(plotly)
    library(lubridate)
    
    set.seed(123)
    
    #time series 1
    date_decision_made = seq(as.Date("2014/1/1"), as.Date("2016/1/1"),by="day")
    
    property_damages_in_dollars <- rnorm(731,100,10)
    
    final_data <- data.frame(date_decision_made, property_damages_in_dollars)
    
    
    #####aggregate
    
    dat <- final_data %>% 
      mutate(month = month(date_decision_made),
             year = year(date_decision_made),
             month_end = ceiling_date(date_decision_made, unit = "month")-1) %>% 
      group_by(month, year) %>% 
      mutate(mean_val = mean(property_damages_in_dollars,na.rm = TRUE),
             max_val = max(property_damages_in_dollars,na.rm = TRUE),
             min_val = min(property_damages_in_dollars,na.rm = TRUE))
    
    p <- ggplot(data = dat) +
      geom_ribbon(aes(x = month_end, 
                      ymin = min_val,
                      ymax = max_val), alpha = 0.2)+
      geom_point(aes(x = month_end,
                 y = property_damages_in_dollars), alpha = 0.3) +
      geom_line(aes(x = month_end,
                    y = mean_val), size = 1.25) +
      labs(y = "Dollars",
           x = "Months")+
      theme_minimal()
      
    ggplotly(p)
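
Note that this answer uses mutate() after group_by(), whereas the question used summarise(). The toy example below, with made-up data, sketches the difference: summarise() collapses each group to one row, while mutate() keeps every observation and attaches the group statistic to it, which is what lets the points and the ribbon share one data frame.

```r
library(dplyr)

# Made-up data: two groups, three observations.
toy <- data.frame(grp = c("a", "a", "b"), val = c(1, 3, 10))

# summarise(): one row per group.
collapsed <- toy %>%
  group_by(grp) %>%
  summarise(mean_val = mean(val))

# mutate(): every original row is kept, with the group mean repeated.
annotated <- toy %>%
  group_by(grp) %>%
  mutate(mean_val = mean(val)) %>%
  ungroup()

nrow(collapsed)
nrow(annotated)
annotated$mean_val
```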
    

    【Comments】:

      【Solution 3】:

      Add the following as the last line of your code:

      fig %>% add_trace(data = final_data, 
                    y = ~property_damages_in_dollars, x = ~year_month, 
                    name = "Property Damage in Dollars", mode = "markers", 
                    marker = list(color = "rgba(46, 49, 49, 1)", opacity = 0.2))
      

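      This produces the plot below, where the color and opacity parameters can be adjusted to your liking. We use the data.frame final_data because that is where the individual points live. The variable year_month has already been created there, so no additional data wrangling is needed. To actually draw points, be sure to set mode = "markers" in the add_trace() call.

      【Comments】: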
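
As a rough, self-contained sketch of the same idea (one extra trace that reads from the raw data rather than the monthly summary), the block below uses a three-month toy data set. The names `raw` and `monthly` are invented for this example, and `inherit = FALSE` is an optional choice made here so that the marker trace does not pick up attributes from the plot_ly() call.

```r
library(plotly)  # also re-exports %>%

# Toy data: daily values for three months (names are illustrative).
set.seed(123)
dates <- seq(as.Date("2014-01-01"), as.Date("2014-03-31"), by = "day")
raw <- data.frame(
  year_month = factor(format(dates, "%Y-%m")),
  value = rnorm(length(dates), 100, 10)
)

# Monthly means, computed with base R to keep the sketch short.
monthly <- aggregate(value ~ year_month, data = raw, FUN = mean)

p <- plot_ly(monthly, x = ~year_month, y = ~value,
             type = "scatter", mode = "lines", name = "Average") %>%
  # A single marker trace holding all daily observations.
  add_trace(data = raw, x = ~year_month, y = ~value,
            type = "scatter", mode = "markers", name = "Observations",
            marker = list(color = "rgba(0,0,0,0.2)"),
            inherit = FALSE)

# plotly_build() evaluates the plot; x$data holds one entry per trace.
built <- plotly_build(p)
length(built$x$data)
```

Compared with the per-month loop in Solution 1, a single trace gives one legend entry for all points, which is usually enough when every month should look the same.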
