Developing a forecasting function/loop: answers

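【Question title】: Developing forecasting function/loop
【Posted】: 2018-02-19 10:23:16
【Question description】:

I am a beginner in R and would appreciate any help or hints on developing a function/loop to automate the forecasting process described below. Here is a dummy dataset: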

> class(stack_help) 
[1] "data.frame" 
> stack_help
    OO    GG      CC      DD
1   198.12 60.56   265.5  271.24
2   145.68 52.28   328.9  427.68
3   106.48 47.08  380.24  695.60
4    83.16 43.52  443.94  934.30
5    89.72 46.68   484.6 1084.26
6    86.48 35.46  415.56  924.68
7    93.68 24.40  376.42  798.14
8   101.70 22.68  260.42  427.72
9   115.88 22.00  228.26  245.72
10  137.24 21.60   212.7  140.64
11  129.82 18.78  230.02   46.04
12  145.00 17.62  220.74   47.16
13  135.38 18.84  245.52  143.28
14  146.38 20.68  322.18  490.20
15  154.08 19.48  374.16  621.48
16  149.34 22.68  484.28  999.50
17  152.74 28.90  533.26 1223.58
18  148.62 27.44  456.76  974.44
19  158.54 23.90  417.52  820.54
20  169.96 27.08  306.16  498.02
21  152.50 33.74   283.1  309.22
22  149.68 38.44  224.54  123.82
23  149.48 38.94  215.28   30.26
24  153.38 36.24  193.18   75.46
25  155.58 37.88  243.34  228.92
26  165.84 37.00  318.08  528.58
27  171.34 38.96   393.6  707.04
28  183.60 48.20  531.62 1169.40
29  192.58 44.46  507.96 1037.22
30  207.92 43.52     435  956.96
31  228.88 47.44  399.58  788.78
32  246.14 45.74  262.84  397.66
33  228.92 45.98   240.8  255.32
34  227.52 45.22  211.44   96.02
35  232.92 43.02  203.08   62.18
36  220.16 43.88  188.56   63.74
37  221.76 46.78  210.58  131.28
38  218.94 45.10  272.36  438.64
39  221.00 47.48  351.58  689.90
40  215.82 44.68  402.82  854.80
41  222.32 43.74  435.06 1013.92
42  239.40 52.26  474.24 1128.04
43  249.86 47.62  324.92  689.40
44  240.92 49.60  289.82  538.98
45  221.04 48.40  218.74  256.80
46  191.18 47.34  192.36  136.84
47  206.28 48.66  188.22   60.60
48  226.68 48.12  174.54   58.36
49  226.76 51.66  204.26  190.58
50  223.94 53.40  272.22  454.56
51  219.42 54.50  339.26  647.94
52  219.36 54.68 #VALUE! 1040.08
53  225.94 53.06  462.82 1066.12
54  233.04 52.64  425.32  916.22
55  218.48 64.22  438.06  961.36
56  205.76 56.44  292.24  534.28
57  206.06 53.42  225.32  272.24
58  206.22 52.50   190.2  117.16
59  215.44 52.14  182.12   32.56
60  221.92 51.10  175.82   47.50

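I would appreciate any suggestions for improving the process below, ideally automating it with an apply function or a loop.

The OO column is the variable I am building the forecasting model for.
The other columns are predictors; I want to test whether the forecast is better with them or with only the past values of OO.
I took the first 36 observations to fit an ARIMA model with the auto.arima function from the "forecast" package.
That function returns the model orders p, d, q; say 0, 1, 0.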

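Now I want to test the model in an automated way, doing the following:

a. Forecast one period ahead, which corresponds to row 37 of the table above.

b. Compare the forecast with the historical value in row 37, column OO.

c. Call the accuracy function from the "forecast" package to compare the forecast against the row 37 data point. Plus, store the error measures in a vector.

d. Update the 'xdata' argument by adding historical point 37, extend the 'xreg' argument by one more month of the predictor, request another forecast for the next period, and repeat this process until I have tested all 24 forecasts.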

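Although I fitted the model with the "forecast" package, I found the function "sarima.for" from the astsa package easier to use.

Some more information before the code:

Train.OO is a time series object holding the first 36 observations of the table above
n.ahead = the forecast horizon: 1 period in this case
0,1,0 is the ARIMA model order (p,d,q)
Train.GG is the predictor: the first 36 observations of column GG
newxreg is a slice of the TS object containing a single data point, which serves as the predictor value for the forecast.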
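For readers following along, the objects described above could be built roughly as follows. This is a minimal base-R sketch under assumptions: `stack_help_demo` is a placeholder data frame with made-up values (not the table from the question), and `oo` and `gg` are illustrative helper names.

```r
# Placeholder data frame standing in for stack_help (values are NOT the real data)
stack_help_demo <- data.frame(
  OO = seq(100, by = 2, length.out = 60),
  GG = seq(60, by = -0.5, length.out = 60)
)

# Monthly series starting in January 2009
oo <- ts(stack_help_demo$OO, start = c(2009, 1), frequency = 12)
gg <- ts(stack_help_demo$GG, start = c(2009, 1), frequency = 12)

# First 36 observations (Jan 2009 to Dec 2011) for fitting
Train.OO <- window(oo, end = c(2011, 12))
Train.GG <- window(gg, end = c(2011, 12))

# Single predictor value for the month being forecast (Jan 2012)
newxreg <- window(gg, start = c(2012, 1), end = c(2012, 1))
```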

Now the code:

library(astsa)     # sarima.for
library(forecast)  # accuracy

fc.1 <- sarima.for(
  xdata = Train.OO,
  n.ahead = 1, 0, 1, 0,   # p, d, q passed positionally
  xreg = Train.GG,
  # single predictor value for the forecast month (Jan 2012)
  newxreg = window(ts(stack_help$GG, start = c(2009,1), frequency = 12),
                   start = c(2012,1), end = c(2012,1)))
fc.1
fc.1.acc <- accuracy(fc.1$pred,
                     window(ts(stack_help$OO, start = c(2009,1), frequency = 12),
                            start = c(2012,1), end = c(2012,1)))

Now the second call:

fc.2 <- sarima.for(
  # training data extended by one month: Jan 2009 to Jan 2012
  xdata = window(ts(stack_help$OO, start = c(2009,1), frequency = 12),
                 start = c(2009,1), end = c(2012,1)),
  n.ahead = 1, 0, 1, 0,
  xreg = window(ts(stack_help$GG, start = c(2009,1), frequency = 12),
                start = c(2009,1), end = c(2012,1)),
  # single predictor value for the forecast month (Feb 2012)
  newxreg = window(ts(stack_help$GG, start = c(2009,1), frequency = 12),
                   start = c(2012,2), end = c(2012,2)))

fc.2
fc.2.acc <- accuracy(fc.2$pred,
                     window(ts(stack_help$OO, start = c(2009,1), frequency = 12),
                            start = c(2012,2), end = c(2012,2)))

fc.2.acc

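I did this for each of the following forecasts. The code is basically the same; I only updated the dates in the window function to cut the right slice of the time series for each forecast.

24 calls in total.

I know this is inefficient "brute force". However, I am a bit lost on how to start developing a function/loop. I would appreciate any comments or hints on how to automate the steps above. Thanks in advance!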

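【Question discussion】:

It is easier to help if you make your code reproducible - see here. For example, specify how the different inputs are defined: Train.gas, Train.oil.lags.2, Test.overview, etc. Also, specify which packages you use. sarima.for comes from library(astsa)
@henrik_ibsen, I just edited it again. I hope it is enough for you to help. Any hints are much appreciated! Tusen takk
@henrik_ibsen Thank you very much! I will build on this further! Just trying to understand: why the "-1" in the line below? for(i in 36:(nrow(stack_help)-1))
Because if you go all the way to nrow(stack_help), there will be no observations left in the test set. So in your case you can only measure accuracy up to nrow(stack_help)-1, because you need the final observation to evaluate the accuracy.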
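The index arithmetic behind that "-1" can be checked with a few lines of base R. This is only an illustration assuming the 60-row table from the question; `n`, `origins`, and `test_row` are illustrative names.

```r
n <- 60                    # number of rows in the question's table
origins <- 36:(n - 1)      # last training row in each iteration
test_row <- origins + 1    # the single held-out row in each iteration

length(origins)            # number of one-step forecasts
range(test_row)            # first and last row used for testing
# Looping over 36:n instead would ask for a test row n + 1, which does not exist
```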

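Tags: r function loops forecasting arima

【Solution 1】:

One of the easiest ways to step through a monthly time series is to use the fact that one month is 1/12 of a year. For example, if the data start in January 2009, we can treat that as 2009.000. After turning your data into a ts object (stack_help <- ts(stack_help, start=c(2009,1), freq=12)), this value can be extracted with timeProp <- tsp(stack_help)[1]. Then February 2009 is timeProp + 1/12 = 2009.083, and March 2009 is timeProp + 2/12 = 2009.167. January 2011 is timeProp + 24/12 = 2011.000, and so on. Let's apply this: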

library(forecast)

#first define the ts-object so we don't have to repeat it every time we use it
stack_help <- ts(stack_help, start=c(2009,1), freq=12)

#extract the start date
timeProp <- tsp(stack_help)[1]

#set vector for storing accuracy measures (one entry per forecast)
accuracyMeasure <- numeric(nrow(stack_help) - 36)
#set counter for above vector
k <- 1

#start the loop with the minimum length of data you want to use to estimate the model.
#In this case start with a length of 36 (Jan 2009 - Dec 2011)
for(i in 36:(nrow(stack_help)-1)){

  #create new train and test set for each iteration. This way makes your code
  #clearer and more transparent and easier to maintain in the future
  train <- window(stack_help, end=timeProp+((i-1)/12))
  test <- window(stack_help, start=timeProp+i/12, end=(timeProp+i/12))

  #Estimate the model. This can be changed to e.g. auto.arima(). Haven't tried
  #with sarima.for, but should be straight forward to use that as well.
  arrdarr <- Arima(train[,1], order=c(0,1,0), xreg=train[,2])

  #Forecast h=1 with the new xreg (h=1 is automatic since nrow(test)=1)
  foreArima <- forecast(arrdarr, xreg=test[,2])

  #Extract the test MAPE accuracy. Row 2 selects the test-set accuracy. "MAPE" can be changed to extract
  #others, e.g. "ME" or "MASE". Remember that with h=1, you can scratch the "mean" part of the measures.
  accuracyMeasure[k] <- accuracy(foreArima, test[,1])[2,"MAPE"]

  k <- k+1
}
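
Since the question asked for a function or an apply-style call, the loop above could also be wrapped as shown below. This is only a sketch under assumptions: it swaps in base R's stats::arima and predict in place of forecast::Arima and forecast so that it runs without extra packages, it computes the absolute percentage error by hand instead of calling accuracy(), and it runs on simulated series rather than the question's data. The names `rolling_ape`, `n_start`, `y_sim`, and `x_sim` are illustrative.

```r
# One-step-ahead rolling-origin evaluation using base R only.
# y: response ts; x: predictor ts on the same time index;
# n_start: number of observations in the first training window.
rolling_ape <- function(y, x, n_start, order = c(0, 1, 0)) {
  t0 <- tsp(y)[1]   # start time of the series
  f  <- tsp(y)[3]   # frequency (12 for monthly data)
  origins <- n_start:(length(y) - 1)
  sapply(origins, function(i) {
    train_end <- t0 + (i - 1) / f   # time of the last training observation
    test_time <- t0 + i / f         # time of the observation to forecast
    y_train <- window(y, end = train_end)
    x_train <- window(x, end = train_end)
    x_new <- as.numeric(window(x, start = test_time, end = test_time))
    y_new <- as.numeric(window(y, start = test_time, end = test_time))
    fit  <- arima(y_train, order = order, xreg = x_train)
    pred <- predict(fit, n.ahead = 1, newxreg = x_new)$pred
    # absolute percentage error of the single forecast
    100 * abs((y_new - as.numeric(pred)) / y_new)
  })
}

# Simulated monthly data (NOT the question's data), Jan 2009 onward
set.seed(1)
x_sim <- ts(50 + cumsum(rnorm(60)), start = c(2009, 1), frequency = 12)
y_sim <- ts(150 + 2 * as.numeric(x_sim) + cumsum(rnorm(60)),
            start = c(2009, 1), frequency = 12)

ape <- rolling_ape(y_sim, x_sim, n_start = 36)
length(ape)   # one error per forecast origin
mean(ape)     # averaging the per-forecast errors gives a MAPE
```

With the question's data, the equivalent call would pass the OO and GG columns of the ts object as `y` and `x`. The fit and the prediction are kept inside the same anonymous function because predict() for arima fits re-evaluates the xreg expression in the calling frame.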

【Discussion】:

Thanks @henrik_ibsen! I will build on this idea further.