How to apply Fable/Forecast (in R) to this database? (Answers)

【Question title】: How to apply Fable/Forecast (in R) to this database?
【Posted】: 2026-02-05 11:30:01
【Question description】:

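I am trying to forecast multiple time series with the Fable functions in R. It seems to be the most efficient approach, but I am very new to R, so I am running into a lot of problems. I would just like to ask for advice and ideas. I have already found how to do this using only the forecast package, but it takes a lot of extra steps.

My data is an Excel file with 5701 columns and 50 rows. The first row of each column holds the product name, and the 49 values that follow are numbers representing sales from January 2017 to January 2021. First, how do I convert this table into a tibble? I know I need to do that to work with Fable, but I am stuck on such a simple step.

I then want to output a table with monthly forecasts for the next 3 semesters (April 2021 to September 2022), with the columns Product|Date|Model Arima(values)|Error of Arima(value/values)|Model ETS|Error of ETS|Model Naive|Error of Naive|...etc. My main goal is a table with Product|Best forecast for April 2021/September 2021|Best forecast for October 2021/March 2022|Best forecast for April 2022/September 2022|

This is the code I am currently using: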

newdata <- read_excel("ALLINCOLUMNS.xlsx")
Fcast <- ts(newdata[,1:5701], start= c(1), end=c(49), frequency=12)
output <- lapply(Fcast, function(x) forecast(auto.arima(x)))
prediction <- as.data.frame(output)
write.table(prediction, file= "C:\\Users\\thega\\OneDrive\\Documentos\\finalprediction.csv",sep=",")

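By default, this gives me output with columns in the format |Product1.Point.Forecast|Product1.Lo.80|Product1.Hi.80|Product1.Lo.95|Product1.Hi.95|Product2.Point.Forecast|...|Product5701.Hi.95|. I do not need the 80 and 95 intervals, and they make the output harder to work with in Excel. How can I get something in the following format instead: |Point forecast product 1|Point forecast product 2|...|Point forecast product 5701|, showing only the forecasts? I know I have to use level=NULL in the forecast function, but it did not work the way I tried it. I was going to write some code to delete those columns, but that is not very elegant. Finally, is there a way to show the errors of all the methods in columns? I want to add the best method to my table, so I need to check which one has the smallest error.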

【Comments】:

Tags: r database time-series forecasting fable

【Solution 1】:

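The {fable} package works best when the data is in a tidy format. In your case, the products should be represented across rows rather than columns. You can read more about tidy data here: https://r4ds.had.co.nz/tidy-data.html Once you have done that, you can also read about tidy data for time series here: https://otexts.com/fpp3/tsibbles.html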
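As a minimal sketch of that reshaping, the snippet below builds a tiny wide table as a stand-in for the Excel sheet and converts it into a tsibble. The product names, the sales numbers, and the column names month, product and sales are made up for illustration. With the real file you would start from read_excel("ALLINCOLUMNS.xlsx"), which by default treats the first row as column names. The sketch also assumes the rows are consecutive months starting in January 2017.

```r
library(dplyr)
library(tidyr)
library(tsibble)

# Hypothetical stand-in for the Excel sheet: one column per product,
# one row per month (the real data would come from read_excel())
wide <- tibble(
  product_a = c(10, 12, 11, 13),
  product_b = c(5, 7, 6, 8)
)

sales <- wide %>%
  # Assumption: rows are consecutive months starting in January 2017
  mutate(month = yearmonth(seq(as.Date("2017-01-01"), by = "month", length.out = n()))) %>%
  # One row per product and month instead of one column per product
  pivot_longer(-month, names_to = "product", values_to = "sales") %>%
  # Declare which column identifies the series and which one is the time index
  as_tsibble(key = product, index = month)

sales
```

The resulting object has one row per product and month, which is the shape that model() and forecast() expect.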

Without your dataset, I can only guess that your Fcast object (the ts() data) looks something like this:

Fcast <- cbind(mdeaths,fdeaths)
Fcast
#>          mdeaths fdeaths
#> Jan 1974    2134     901
#> Feb 1974    1863     689
#> Mar 1974    1877     827
#> Apr 1974    1877     677
#> May 1974    1492     522
#> Jun 1974    1249     406
#> Jul 1974    1280     441
#> and so on ...

That is, each of your products has its own column (and you have 5701 products, not just the 2 I will use in this example).
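Note that in the code from the question, start = c(1) labels the series as starting at time 1 rather than in January 2017. A sketch of a ts object that does carry the monthly calendar, using a made-up two-column matrix in place of the real 5701 product columns:

```r
# Made-up sales for two hypothetical products, 49 months each
sales_matrix <- cbind(product_a = 101:149, product_b = 201:249)

# start = c(2017, 1) means January 2017; frequency = 12 means monthly data.
# The end date is derived from the number of rows, so it is not passed in.
Fcast <- ts(sales_matrix, start = c(2017, 1), frequency = 12)

start(Fcast)  # first period: year 2017, month 1
end(Fcast)    # last period: year 2021, month 1
```

With the dates attached at this stage, the conversion to a tsibble described next picks up a proper monthly index.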

If your data is already in a ts object, you can use as_tsibble(<ts>) to convert it into a tidy time series dataset.

library(tsibble)
as_tsibble(Fcast, pivot_longer = TRUE)
#> # A tsibble: 144 x 3 [1M]
#> # Key:       key [2]
#>       index key     value
#>       <mth> <chr>   <dbl>
#>  1 1974 Jan fdeaths   901
#>  2 1974 Feb fdeaths   689
#>  3 1974 Mar fdeaths   827
#>  4 1974 Apr fdeaths   677
#>  5 1974 May fdeaths   522
#>  6 1974 Jan mdeaths  2134
#>  7 1974 Feb mdeaths  1863
#>  8 1974 Mar mdeaths  1877
#>  9 1974 Apr mdeaths  1877
#> 10 1974 May mdeaths  1492

Created on 2021-02-25 by the reprex package (v0.3.0)

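Setting pivot_longer = TRUE gathers the columns into long format, which is the format the {fable} package works with. We now have a key column that stores the series name (the product ID in your data), and the values are stored in the value column.

With the data in an appropriate format, we can now use the automatic ARIMA() and forecast() to obtain forecasts: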

library(fable)
#> Loading required package: fabletools
as_tsibble(Fcast, pivot_longer = TRUE) %>% 
  model(ARIMA(value)) %>% 
  forecast()
#> # A fable: 48 x 5 [1M]
#> # Key:     key, .model [2]
#>    key     .model          index        value .mean
#>    <chr>   <chr>           <mth>       <dist> <dbl>
#>  1 fdeaths ARIMA(value) 1980 Jan N(825, 6184)  825.
#>  2 fdeaths ARIMA(value) 1980 Feb N(820, 6184)  820.
#>  3 fdeaths ARIMA(value) 1980 Mar N(767, 6184)  767.
#>  4 fdeaths ARIMA(value) 1980 Apr N(605, 6184)  605.
#>  5 fdeaths ARIMA(value) 1980 May N(494, 6184)  494.
#>  6 fdeaths ARIMA(value) 1980 Jun N(423, 6184)  423.
#>  7 fdeaths ARIMA(value) 1980 Jul N(414, 6184)  414.
#>  8 fdeaths ARIMA(value) 1980 Aug N(367, 6184)  367.
#>  9 fdeaths ARIMA(value) 1980 Sep N(376, 6184)  376.
#> 10 fdeaths ARIMA(value) 1980 Oct N(442, 6184)  442.
#> # … with 38 more rows

You can also compute forecasts from other models by specifying several models in model().

Fcast <- cbind(mdeaths,fdeaths)
library(tsibble)
#> 
#> Attaching package: 'tsibble'
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, union
library(fable)
#> Loading required package: fabletools
as_tsibble(Fcast, pivot_longer = TRUE) %>% 
  model(arima = ARIMA(value), ets = ETS(value), snaive = SNAIVE(value)) %>% 
  forecast()
#> # A fable: 144 x 5 [1M]
#> # Key:     key, .model [6]
#>    key     .model    index        value .mean
#>    <chr>   <chr>     <mth>       <dist> <dbl>
#>  1 fdeaths arima  1980 Jan N(825, 6184)  825.
#>  2 fdeaths arima  1980 Feb N(820, 6184)  820.
#>  3 fdeaths arima  1980 Mar N(767, 6184)  767.
#>  4 fdeaths arima  1980 Apr N(605, 6184)  605.
#>  5 fdeaths arima  1980 May N(494, 6184)  494.
#>  6 fdeaths arima  1980 Jun N(423, 6184)  423.
#>  7 fdeaths arima  1980 Jul N(414, 6184)  414.
#>  8 fdeaths arima  1980 Aug N(367, 6184)  367.
#>  9 fdeaths arima  1980 Sep N(376, 6184)  376.
#> 10 fdeaths arima  1980 Oct N(442, 6184)  442.
#> # … with 134 more rows

The .model column now identifies which model produced each forecast; there are 3 models.
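When no horizon is given, forecast() chooses one itself (24 months per series in the output above). A sketch of setting it explicitly through the h argument, using the same example data. For the data in the question, which ends in January 2021, reaching September 2022 would take h = 20:

```r
library(tsibble)
library(fable)

Fcast <- cbind(mdeaths, fdeaths)

fc <- as_tsibble(Fcast, pivot_longer = TRUE) %>%
  model(arima = ARIMA(value), ets = ETS(value), snaive = SNAIVE(value)) %>%
  forecast(h = 20)  # 20 monthly steps ahead for every series and every model

fc
```

The result has 2 series x 3 models x 20 months = 120 rows.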

If you want to look at the point forecasts side by side, you can use tidyr::pivot_wider() to spread the forecast .mean values across multiple columns.

library(tsibble)
library(fable)
library(tidyr)
Fcast <- cbind(mdeaths,fdeaths)
as_tsibble(Fcast, pivot_longer = TRUE) %>% 
  model(arima = ARIMA(value), ets = ETS(value), snaive = SNAIVE(value)) %>% 
  forecast() %>% 
  as_tibble() %>% 
  pivot_wider(id_cols = c("key", "index"), names_from = ".model", values_from = ".mean")
#> # A tibble: 48 x 5
#>    key        index arima   ets snaive
#>    <chr>      <mth> <dbl> <dbl>  <dbl>
#>  1 fdeaths 1980 Jan  825.  789.    821
#>  2 fdeaths 1980 Feb  820.  812.    785
#>  3 fdeaths 1980 Mar  767.  746.    727
#>  4 fdeaths 1980 Apr  605.  592.    612
#>  5 fdeaths 1980 May  494.  479.    478
#>  6 fdeaths 1980 Jun  423.  413.    429
#>  7 fdeaths 1980 Jul  414.  394.    405
#>  8 fdeaths 1980 Aug  367.  355.    379
#>  9 fdeaths 1980 Sep  376.  365.    393
#> 10 fdeaths 1980 Oct  442.  443.    411
#> # … with 38 more rows
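
The same idea can produce the layout asked for in the question, with one column of point forecasts per product. A sketch for a single model, where names_from = "key" spreads the series across columns; the output file name is a placeholder and h = 20 is an assumed horizon:

```r
library(tsibble)
library(fable)
library(tidyr)

Fcast <- cbind(mdeaths, fdeaths)

point_fc <- as_tsibble(Fcast, pivot_longer = TRUE) %>%
  model(arima = ARIMA(value)) %>%
  forecast(h = 20) %>%
  as_tibble() %>%
  # One row per month, one column per series, point forecasts only
  pivot_wider(id_cols = "index", names_from = "key", values_from = ".mean")

# Convert the month index to text so it is written out in readable form
point_fc$index <- format(point_fc$index)

# Placeholder file name; adjust the path as needed
write.csv(point_fc, "point_forecasts.csv", row.names = FALSE)
```

Because only the .mean column is kept, the 80% and 95% interval columns never appear in the exported file.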

You can learn how to evaluate the accuracy of these models/forecasts here: https://otexts.com/fpp3/accuracy.html
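
As a sketch of that evaluation, the code below holds out the last 12 months of the example data, forecasts them, and uses accuracy() to compute error measures per series and model. It then keeps the model with the lowest RMSE for each series. The 12-month test set and the choice of RMSE as the criterion are assumptions for illustration; other measures in the accuracy() output (MAE, MAPE, and so on) could be used instead.

```r
library(dplyr)
library(tsibble)
library(fable)

deaths <- as_tsibble(cbind(mdeaths, fdeaths), pivot_longer = TRUE)

# Hold out the final 12 months (1979) as a test set
train <- deaths %>% filter_index(. ~ "1978 Dec")

fit <- train %>%
  model(arima = ARIMA(value), ets = ETS(value), snaive = SNAIVE(value))

# Compare the forecasts for 1979 against the observed values
acc <- fit %>%
  forecast(h = 12) %>%
  accuracy(deaths)

# For each series, keep the model with the smallest test-set RMSE
best <- acc %>%
  group_by(key) %>%
  slice_min(RMSE, n = 1) %>%
  ungroup() %>%
  select(key, .model, RMSE, MAE, MAPE)

best
```

The acc table has one row per series and model, so it can also be reshaped with pivot_wider() to show the errors of all methods in columns.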

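【Comments】:

Brooo... what a beautiful, lovely soul. You really made my day. It works great and I am almost there. Your book and your work have also been very helpful, and I have learned a lot. One last question, if I may. Where (in the last example) should I put the augment function to get something in the format key|index|value|arima|.fitted|.resid|.innov|ets|.fitted|.resid|.innov|snaive|.fitted|.resid|.innov|? I have not worked much with %>%, so this is a bit new to me. Thanks again.
augment() is used on a model object to add information to the original data: data %>% model() %>% augment(). To compare the output in wide format, you can use pivot_wider() again. While the wide format is good for displaying in a table and comparing values by eye, the longer format is better for plotting. You may find that plotting these values is more informative. as_tsibble(Fcast, pivot_longer = TRUE) %>% model(arima = ARIMA(value), ets = ETS(value), snaive = SNAIVE(value)) %>% augment() %>% pivot_wider(names_from = ".model", values_from = c(".fitted", ".resid", ".innov"))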