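【Question title】: Seasonal adjustment in R
【Posted】: 2021-11-02 09:32:05
【Question】:

I was sure the code below worked to seasonally adjust some data, but now it doesn't seem to. I'm probably doing something stupid, but I can't see what. Can anyone help?

Data: https://github.com/Paul-Edward-C/cn-data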

library(lubridate)

library(seasonal)

library(xts)


df <- read.csv(input_file, 
               sep = ",", 
               na.strings = "NA", 
               strip.white = TRUE, 
               stringsAsFactors = FALSE)


df$Date <- as.Date(df$Date, format="%Y-%m-%d")
 
start_month <- month(df$Date[1])

start_year <- year(df$Date[1]) 

df_ts <- ts(df[-1], 
            start=c(start_year,start_month), 
            freq=12)


m <- seas(cbind(df_ts),
          xreg = genhol(cny, start = 0, end = 0, center = "calendar"),
          regression.aictest = "td",
          x11 = "",
          regression.usertype = "holiday"
)

df_sa <- as.xts(final(m))

names(df_sa)<-c(colnames(df)[-1])

index(df_sa) <- as.Date(index(df_sa), 
                        format="%b %Y")

write.csv(df_sa,output_file,
          row.names = index(df_sa))

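Attaching package: 'lubridate'

The following objects are masked from 'package:base':

    date, intersect, setdiff, union

Loading required package: zoo

Attaching package: 'zoo'

The following objects are masked from 'package:base':

    as.Date, as.Date.numeric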

> tail(df, n=20)


         Date National.retail.sales..CNY..Monthly
240 2019-12-01                            38,782.0
241 2020-01-01                            26,060.1
242 2020-02-01                            26,060.1
243 2020-03-01                            26,441.2
244 2020-04-01                            28,167.0
245 2020-05-01                            31,979.6
246 2020-06-01                            33,528.8
247 2020-07-01                            32,189.0
248 2020-08-01                            33,570.6
249 2020-09-01                            35,294.7
250 2020-10-01                            38,576.5
251 2020-11-01                            39,514.2
252 2020-12-01                            40,566.0
253 2021-01-01                            34,868.4
254 2021-02-01                            34,868.4
255 2021-03-01                            35,484.1
256 2021-04-01                            33,152.6
257 2021-05-01                            35,945.1
258 2021-06-01                            37,585.8
259 2021-07-01                            34,925.1
    National.retail.sales..retail.trade..CNY..Monthly National.retail.sales..catering..CNY..Monthly
240                                          33,855.8                                       4,930.0
241                                          23,967.6                                       2,097.5
242                                          23,967.6                                       2,097.5
243                                          24,613.9                                       1,832.2
244                                          25,869.5                                       2,306.6
245                                          28,971.0                                       3,014.5
246                                          30,272.5                                       3,263.6
247                                          28,918.1                                       3,282.1
248                                          29,951.3                                       3,619.3
249                                          31,579.5                                       3,715.1
250                                          34,204.1                                       4,372.3
251                                          34,534.4                                       4,979.7
252                                          35,616.3                                       4,949.7
253                                          31,325.7                                       3,542.7
254                                          31,325.7                                       3,542.7
255                                          31,973.5                                       3,510.5
256                                          29,775.8                                       3,376.9
257                                          32,128.8                                       3,816.3
258                                          33,663.0                                       3,922.8
259                                          31,173.7                                       3,751.4

    > tail(df_ts, n=20)
       National.retail.sales..CNY..Monthly National.retail.sales..retail.trade..CNY..Monthly
[240,]                                 165                                               117
[241,]                                  86                                                70
[242,]                                  86                                                70
[243,]                                  91                                                72
[244,]                                  96                                                82
[245,]                                 140                                                96
[246,]                                 149                                               103
[247,]                                 141                                                95
[248,]                                 150                                               102
[249,]                                 157                                               111
[250,]                                 164                                               118
[251,]                                 166                                               119
[252,]                                 196                                               120
[253,]                                 155                                               109
[254,]                                 155                                               109
[255,]                                 158                                               112
[256,]                                 146                                               101
[257,]                                 160                                               113
[258,]                                 162                                               116
[259,]                                 156                                               108
       National.retail.sales..catering..CNY..Monthly
[240,]                                           117
[241,]                                            31
[242,]                                            31
[243,]                                            13
[244,]                                            43
[245,]                                            68
[246,]                                            80
[247,]                                            81
[248,]                                            96
[249,]                                           100
[250,]                                           114
[251,]                                           119
[252,]                                           118
[253,]                                            94
[254,]                                            94
[255,]                                            92
[256,]                                            86
[257,]                                           104
[258,]                                           109
[259,]                                           102

+ )
Error: X-13 run failed

Errors:
- Adding AO2021.Apr exceeds the number of regression effects allowed in the model
  (80). Check the regression model, change the automatic outlier options, (e.g.
  method to ADDONE, raise the critical value, or change types to identify AOs only),
  or change the program limits (see Section 2.7 of the X-13ARIMA-SEATS Reference
  Manual). Program error(s) halt execution for
  /var/folders/jj/yp86y9gs6dd3fxn4j9mgyvk80000gn/T//Rtmptiuv5K/x131d895e37f09a/NationalretailsalesC.spc
- Adding AO2016.Nov exceeds the number of regression effects allowed in the model
  (80). Check the regression model, change the automatic outlier options, (e.g.
  method to ADDONE, raise the critical value, or change types to identify AOs only),
  or change the program limits (see Section 2.7 of the X-13ARIMA-SEATS Reference
  Manual). Program error(s) halt execution for
  /var/folders/jj/yp86y9gs6dd3fxn4j9mgyvk80000gn/T//Rtmptiuv5K/x131d895e37f09a/Nationalretailsalesr.spc
- Adding A

【Comments】:

    Tags: r dataframe time-series


    【Solution 1】:

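    Start a fresh session and read the raw version of the input file (not the URL given in the question but u shown below) into a zoo object z with a yearmon index. Because the numeric columns contain commas inside quotes, they would normally be read as character, so we define a class that converts them to numeric and specify that class in the colClasses= argument. Then use as.ts to convert z to a ts object with frequency 12. In addition, when passed a ts object, seas seems unable to distinguish column names that are the same in their first k characters, for some k. Workarounds seem to be (1) make the names unique in the first k positions or (2) convert the ts object into a list of ts objects. We do the latter.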

    library(seasonal)
    library(zoo)
    
    u <- "https://raw.githubusercontent.com/Paul-Edward-C/cn-data/main/cn_retail_nsa.csv"
    
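    # Virtual class whose coercion strips thousands separators, e.g. "38,782.0" -> 38782
    setClass("num.comma")
    setAs("character", "num.comma", function(from) as.numeric(gsub(",", "", from)))
    
    # Number of fields in the header line; the first column is the date
    nf <- count.fields(u, sep = ",")[1]
    z <- read.csv.zoo(u, FUN = as.yearmon, colClasses = c(NA, rep("num.comma", nf-1)))
    tt <- as.ts(z)  # monthly ts object (frequency 12)
    L <- Map(function(nm) tt[, nm], colnames(tt))  # convert to list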
    
    m <- seas(L,
              xreg = genhol(cny, start = 0, end = 0, center = "calendar"),
              regression.aictest = "td",
              x11 = "",
              regression.usertype = "holiday")
    m
    

    giving:

    $National.retail.sales..CNY..Monthly
    
    Call:
    seas(x = L, xreg = genhol(cny, start = 0, end = 0, center = "calendar"), 
        regression.aictest = "td", x11 = "", regression.usertype = "holiday")
    
    Coefficients:
                 xreg         AO2001.Jan         AO2003.May         LS2004.Apr  
             0.003553           0.015152          -0.037941           0.020231  
           AO2005.Feb         AO2006.Jan         LS2007.Oct         LS2008.Dec  
             0.029072           0.026374           0.011891          -0.017261  
           LS2009.Feb         AO2010.Feb         AO2011.Jan         LS2019.Mar  
            -0.030569           0.042868           0.020402          -0.023052  
           LS2020.Jan         AO2020.Apr         LS2020.Apr         LS2020.Jul  
            -0.304376          -0.043385           0.159083           0.027020  
           LS2020.Sep         LS2021.Jan  MA-Nonseasonal-01     MA-Seasonal-12  
             0.032748          -0.056603           0.243636          -0.569232  
    
    
    $National.retail.sales..retail.trade..CNY..Monthly
    
    Call:
    seas(x = L, xreg = genhol(cny, start = 0, end = 0, center = "calendar"), 
        regression.aictest = "td", x11 = "", regression.usertype = "holiday")
    
    Coefficients:
                 xreg         LS2011.Jan         LS2012.Apr         AO2016.Dec  
            0.0006381          0.0177392         -0.0074888          0.0045303  
           LS2018.May         LS2019.Jul         LS2020.Jan         LS2020.Mar  
           -0.0116767         -0.0205138         -0.2634525          0.0595134  
           LS2020.Apr         LS2020.May         AO2020.Jul         LS2020.Sep  
            0.0887770          0.0383029         -0.0169151          0.0246996  
           LS2021.Jan         LS2021.Mar         LS2021.Apr         AO2021.Jul  
           -0.0439001          0.0548717         -0.0323250         -0.0492754  
    AR-Nonseasonal-01  AR-Nonseasonal-02  MA-Nonseasonal-01     MA-Seasonal-12  
           -0.5535409         -0.3663086         -0.4990941         -0.8305223  
    
    
    $National.retail.sales..catering..CNY..Monthly
    
    Call:
    seas(x = L, xreg = genhol(cny, start = 0, end = 0, center = "calendar"), 
        regression.aictest = "td", x11 = "", regression.usertype = "holiday")
    
    Coefficients:
                 xreg         LS2011.Jan         LS2012.Jan         LS2020.Jan  
            3.435e-05          1.017e-01          5.798e-02         -6.519e-01  
           AO2020.Mar         LS2020.Mar         LS2020.May         AO2020.Jun  
           -2.638e-01          1.958e-01          1.542e-01          4.825e-02  
           LS2020.Jul         LS2020.Aug         LS2020.Sep         AO2020.Oct  
            9.758e-02          4.364e-02          4.313e-02          3.741e-02  
           LS2020.Nov         LS2021.Jan         LS2021.Mar  MA-Nonseasonal-01  
            2.555e-02         -1.296e-01          5.732e-02          2.241e-01  
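
To see why the comma-formatted columns matter, here is a minimal sketch in base R. The toy CSV text and the column name `Sales` are made up for illustration. The point is that `ts()` sends a data frame through `data.matrix()`, which turns character columns into integer factor codes rather than the underlying numbers; that would explain the small integers in the question's `tail(df_ts)` output.

```r
# Toy data (hypothetical) mimicking the question's file:
# numbers stored as quoted strings with thousands separators.
csv <- 'Date,Sales\n2020-01-01,"26,060.1"\n2020-02-01,"26,441.2"\n2020-03-01,"28,167.0"'

# Read as in the question: Sales arrives as character, not numeric.
raw <- read.csv(text = csv, stringsAsFactors = FALSE)
sales_class <- class(raw$Sales)

# ts() converts a data frame with data.matrix(); character columns
# become factors and then integer codes, so the values are lost.
bad <- ts(raw[-1], start = c(2020, 1), frequency = 12)

# Strip the commas first, then build the series from real numbers.
raw$Sales <- as.numeric(gsub(",", "", raw$Sales))
good <- ts(raw[-1], start = c(2020, 1), frequency = 12)

print(bad)
print(good)
```

Comparing `bad` and `good` side by side shows the difference; the `num.comma` class in the answer's code applies the same `gsub()` conversion at read time.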
    

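    【Discussion】:

    • Thanks. The input file has three distinct columns of data, but the code returns three series, two of which have identical output, "$NationalretailsalesC" and "$Nationalretailsalesc". How should this be fixed?
    • You seem to have found a bug in seas. If the names of the ts object passed to seas are the same in the first k positions, for some k (to be determined), it treats them as the same name, effectively truncating them. I have added extra discussion giving two workarounds and shown the simpler of the two in the code.
    • Have added a github issue. github.com/christophsax/seasonal/issues/274
    • I tried changing the column headings to be more distinctive, and then the code in my original question worked. Thanks for your help.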
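
The first workaround (making the names distinctive) can be checked before calling seas. The sketch below is only an approximation of the name shortening: `short_name` is a hypothetical helper, and k = 20 is an assumption read off the `.spc` file names in the question's error message (`NationalretailsalesC.spc`, `Nationalretailsalesr.spc`), not a documented limit. The comparison ignores case because the names reported as clashing differ only in their final letter's case.

```r
# Column names from the question's data.
nms <- c("National.retail.sales..CNY..Monthly",
         "National.retail.sales..retail.trade..CNY..Monthly",
         "National.retail.sales..catering..CNY..Monthly")

# Hypothetical helper: drop non-alphanumeric characters and keep the
# first k characters. k = 20 is an assumption, not a documented value.
short_name <- function(x, k = 20) substr(gsub("[^[:alnum:]]", "", x), 1, k)

short <- short_name(nms)

# TRUE if any shortened names clash when case is ignored.
collides <- anyDuplicated(tolower(short)) > 0

# Illustrative replacement names (not from the original post).
new_nms <- c("total", "retail_trade", "catering")
new_collides <- anyDuplicated(tolower(short_name(new_nms))) > 0

print(short)
print(c(original = collides, renamed = new_collides))
```

With names that pass this check, something like `colnames(tt) <- new_nms` before the seas call would be the renaming route; the answer's list conversion avoids the issue without renaming.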