【Question title】: ggplot2: plot time series and multiple point forecasts on a quasi time axis
【Posted】: 2015-09-25 18:46:42
【Question】:

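I am having trouble plotting time series data together with multiple point forecasts.

I want to plot historical data along with some point forecasts. The historical data should be connected by a line, while the point forecasts should be connected by an arrow, because the second forecast value, forecast_02, is actually a revision of forecast_01.

Libraries used: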

library(ggplot2)
library(plyr)
library(dplyr)
library(stringr)
library(grid)

Here is my dummy data:

set.seed(1)

my_df <-
structure(list(values = c(-0.626453810742332, 0.183643324222082, 
-0.835628612410047, 1.59528080213779, 0.329507771815361, -0.820468384118015, 
0.487429052428485, 0.738324705129217, 0.575781351653492, -0.305388387156356
), c = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j"), time = c("2014-01-01", 
"2014-02-01", "2014-03-01", "2014-04-01", "2014-05-01", "2014-06-01", 
"2014-07-01", "2014-08-01", "2014-09-01", "2014-10-01"), type_of_value = c("historical", 
"historical", "historical", "historical", "historical", "historical", 
"historical", "historical", "forecast_01", "forecast_02"), time_and_forecast = c("2014-01-01", 
"2014-02-01", "2014-03-01", "2014-04-01", "2014-05-01", "2014-06-01", 
"2014-07-01", "2014-08-01", "forecast", "forecast")), .Names = c("values", 
"c", "time", "type_of_value", "time_and_forecast"), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -10L))

It looks like this:

Source: local data frame [10 x 5]

       values c       time type_of_value time_and_forecast
1  -0.6264538 a 2014-01-01    historical        2014-01-01
2   0.1836433 b 2014-02-01    historical        2014-02-01
3  -0.8356286 c 2014-03-01    historical        2014-03-01
4   1.5952808 d 2014-04-01    historical        2014-04-01
5   0.3295078 e 2014-05-01    historical        2014-05-01
6  -0.8204684 f 2014-06-01    historical        2014-06-01
7   0.4874291 g 2014-07-01    historical        2014-07-01
8   0.7383247 h 2014-08-01    historical        2014-08-01
9   0.5757814 i 2014-09-01   forecast_01          forecast
10 -0.3053884 j 2014-10-01   forecast_02          forecast

With the code below I almost managed to produce the plot I want. However, I cannot get my historical data points connected by a line.

# my code for almost perfect chart    
ggplot(data = my_df, 
           aes(x = time_and_forecast, 
               y = values,
               color = type_of_value, 
               group = time_and_forecast)) +
      geom_point(size = 5) +
      geom_line(arrow = arrow()) +
      theme_minimal()

Can you help me connect the blue points with a line? Thanks.

# sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

locale:
[1] LC_COLLATE=Slovenian_Slovenia.1250  LC_CTYPE=Slovenian_Slovenia.1250    LC_MONETARY=Slovenian_Slovenia.1250
[4] LC_NUMERIC=C                        LC_TIME=C                          

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] stringr_1.0.0 dplyr_0.4.1   plyr_1.8.3    ggplot2_1.0.1

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.6      assertthat_0.1   digest_0.6.8     MASS_7.3-40      R6_2.0.1         gtable_0.1.2    
 [7] DBI_0.3.1        magrittr_1.5     scales_0.2.4     stringi_0.4-1    lazyeval_0.1.10  reshape2_1.4.1  
[13] labeling_0.3     proto_0.3-10     tools_3.2.0      munsell_0.4.2    parallel_3.2.0   colorspace_1.2-6

【Comments】:

  • "Connected by a line" = simply joining the points with a line, linear interpolation, or spline interpolation (geom_smooth())?

Tags: r ggplot2


【Solution 1】:

I think this gets you what you want:

ggplot(data = my_df, 
   aes(x = time_and_forecast, 
       y = values,
       color = type_of_value,
       group = 1)) +
  geom_point(size = 5) +
  geom_line(data=my_df[my_df$type_of_value=='historical',]) +
  geom_line(data=my_df[!my_df$type_of_value=='historical',], arrow=arrow()) +
  theme_minimal()

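ggplot tries to draw lines within each of your categorical x groups, but fails because each group contains only one value. If you specify group = 1 to say that they should all belong to the same group, it draws the line across groups. Since you want a line for the historical group and an arrow between the other two points, you can make two geom_line() calls on subsets of the data frame, each with a different arrow argument. I don't know whether there is a way to make ggplot choose the arrow by group automatically (as it does for color, linetype, etc.).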
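
To make the grouping behaviour described above concrete, here is a minimal sketch on a hypothetical three-row data frame (`toy`, `p_default`, and `p_single` are names invented for this illustration, not part of the question). It uses `ggplot_build()` to inspect the group column that ggplot2 computes for the line layer, assuming a reasonably recent ggplot2:

```r
library(ggplot2)

# Hypothetical toy data: one observation per discrete x value
toy <- data.frame(x = c("a", "b", "c"), y = c(1, 3, 2),
                  stringsAsFactors = FALSE)

# Default grouping: x is discrete, so each x value forms its own group
# and geom_line() has nothing to connect within a group
p_default <- ggplot(toy, aes(x = x, y = y)) + geom_line()

# group = 1 places all rows in a single group, so one line joins them
p_single <- ggplot(toy, aes(x = x, y = y, group = 1)) + geom_line()

# Count the distinct groups computed for the line layer in each plot
n_default <- length(unique(ggplot_build(p_default)$data[[1]]$group))
n_single  <- length(unique(ggplot_build(p_single)$data[[1]]$group))
print(c(default = n_default, single = n_single))
```

Comparing the two counts shows why the line only appears once `group = 1` is set.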

【Comments】:

  • This is exactly what I wanted. Thank you.
【Solution 2】:

You may want to split the dataset:

library(ggplot2)
library(grid)

df_hist <- subset(my_df, type_of_value == "historical")
df_forc <- subset(my_df, type_of_value != "historical")

ggplot() +
  geom_line(data = df_hist, aes(x = time, y = values, group = 1, color = type_of_value)) +
  geom_point(data = df_forc, aes(x = time, y = values, color = type_of_value), size = 5) +
  geom_path(data = df_forc, aes(x = time, y = values, group = 1), arrow = arrow())

You can even add a shaded rectangle to further emphasize the forecast region:

ggplot() +
  geom_line(data = df_hist, aes(x = time, y = values, group = 1, color = type_of_value)) +
  geom_point(data = df_forc, aes(x = time, y = values, color = type_of_value), size = 5) +
  geom_path(data = df_forc, aes(x = time, y = values, group = 1), arrow = arrow()) + 
  annotate("rect", xmin = min(df_forc$time), xmax = max(df_forc$time), 
           ymin = -Inf, ymax = +Inf, alpha = 0.25, fill = "yellow")
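
In the data above, `time` is a character column, so the x axis is discrete. As a hedged variation (not part of the original answer), the sketch below converts the dates with `as.Date()` so that the line, the arrow, and the rectangle bounds all sit on a continuous date axis. The data frame `toy` and its rounded values are made up for illustration:

```r
library(ggplot2)
library(grid)  # provides arrow() for older ggplot2 versions

# Hypothetical toy data with a real Date column instead of character dates
toy <- data.frame(
  time = as.Date(c("2014-07-01", "2014-08-01", "2014-09-01", "2014-10-01")),
  values = c(0.49, 0.74, 0.58, -0.31),
  type_of_value = c("historical", "historical", "forecast_01", "forecast_02"),
  stringsAsFactors = FALSE
)

# Split into historical observations and forecasts, as in the answer
toy_hist <- subset(toy, type_of_value == "historical")
toy_forc <- subset(toy, type_of_value != "historical")

p <- ggplot() +
  # historical values joined by a plain line
  geom_line(data = toy_hist,
            aes(x = time, y = values, color = type_of_value)) +
  # forecasts drawn as large points
  geom_point(data = toy_forc,
             aes(x = time, y = values, color = type_of_value), size = 5) +
  # arrow from the first forecast to its revision
  geom_path(data = toy_forc, aes(x = time, y = values), arrow = arrow()) +
  # shaded band spanning the forecast dates
  annotate("rect", xmin = min(toy_forc$time), xmax = max(toy_forc$time),
           ymin = -Inf, ymax = Inf, alpha = 0.25, fill = "yellow")

print(p)
```

With a continuous date axis no `group = 1` is needed, because the x values no longer split the data into one group per category.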

【Comments】:

  • I wanted a chart in which the forecast values are placed one above the other.