Decomposing trend, seasonal and residual time series elements: answers

【Question title】: Decomposing trend, seasonal and residual time series elements
【Posted】: 2016-03-31 03:55:41
【Question】:

I have a DataFrame with several time series:

         divida    movav12       var  varmovav12
Date                                            
2004-01       0        NaN       NaN         NaN
2004-02       0        NaN       NaN         NaN
2004-03       0        NaN       NaN         NaN
2004-04      34        NaN       inf         NaN
2004-05      30        NaN -0.117647         NaN
2004-06      44        NaN  0.466667         NaN
2004-07      35        NaN -0.204545         NaN
2004-08      31        NaN -0.114286         NaN
2004-09      30        NaN -0.032258         NaN
2004-10      24        NaN -0.200000         NaN
2004-11      41        NaN  0.708333         NaN
2004-12      29  24.833333 -0.292683         NaN
2005-01      31  27.416667  0.068966    0.104027
2005-02      28  29.750000 -0.096774    0.085106
2005-03      27  32.000000 -0.035714    0.075630
2005-04      30  31.666667  0.111111   -0.010417
2005-05      31  31.750000  0.033333    0.002632
2005-06      39  31.333333  0.258065   -0.013123
2005-07      36  31.416667 -0.076923    0.002660

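I want to decompose the first time series, divida, so that I can separate its trend from its seasonal and residual components.

I found an answer here and tried the following code: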

import statsmodels.api as sm

s=sm.tsa.seasonal_decompose(divida.divida)

But I keep getting this error:

Traceback (most recent call last):
  File "/Users/Pred_UnBR_Mod2.py", line 78, in <module>
    s=sm.tsa.seasonal_decompose(divida.divida)
  File "/Library/Python/2.7/site-packages/statsmodels/tsa/seasonal.py", line 58, in seasonal_decompose
    _pandas_wrapper, pfreq = _maybe_get_pandas_wrapper_freq(x)
  File "/Library/Python/2.7/site-packages/statsmodels/tsa/filters/_utils.py", line 46, in _maybe_get_pandas_wrapper_freq
    freq = index.inferred_freq
AttributeError: 'Index' object has no attribute 'inferred_freq'

How should I proceed?

【Comments】:

What is your divida.index.dtype? It should be a DatetimeIndex.

Tags: python pandas machine-learning time-series statsmodels

【Solution 1】:

It works fine once you convert the index to a DatetimeIndex:

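import pandas as pd
import statsmodels.api as sm

df.reset_index(inplace=True)              # move the 'Date' index back into a column
df['Date'] = pd.to_datetime(df['Date'])   # parse strings such as '2004-01' into timestamps
df = df.set_index('Date')                 # the index is now a DatetimeIndex
s = sm.tsa.seasonal_decompose(df.divida)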

<statsmodels.tsa.seasonal.DecomposeResult object at 0x110ec3710>

Access the components with:

s.resid
s.seasonal
s.trend

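【Comments】:

Quick question: how do I access that result? All I get is at 0x110ec3710>
Thanks @Stefan, you saved my life!
Hi, when I try this code I get the following error: AttributeError: 'RangeIndex' object has no attribute 'inferred_freq'. Any suggestions?
The error says that your index is a RangeIndex, whereas it should be a DatetimeIndex (see what happens to the Date column in the example).
@Leevo You have to specify a frequency; for example, you can resample your data: stackoverflow.com/questions/17001389/…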

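【Solution 2】:

Statsmodels will only decompose a series if it knows the frequency. A time series index normally carries a frequency (daily, business days, weekly and so on); yours does not, which is why the error appears. You can get rid of the error in two ways: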

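The first is what Stefan did: pass the index column to the pandas to_datetime function. pandas can then infer the frequency of the resulting DatetimeIndex (via infer_freq internally), and statsmodels reads it through index.inferred_freq.
The second is to set the frequency explicitly. The answer wrote this as df.index.asfreq(freq='m'), where m stands for month. With a DatetimeIndex, asfreq is a method of the DataFrame or Series rather than of the index, so use df = df.asfreq('MS') for month-start dates such as 2004-01-01. If domain knowledge tells you the sampling interval, set that frequency directly, for example d for daily data.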

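【Comments】:

Thanks, the old problem is solved. But now it says: ValueError: cannot insert level_0, already exists. Any suggestions?
Please describe your problem in more detail. The code together with the error traceback would help in solving it.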

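【Solution 3】:

Keep it simple.

Follow three steps:
1. If it is not already done, create a date column in yyyy-mm-dd or dd-mm-yyyy format (for example in Excel).
2. Convert it to a datetime type with pandas: df['Date'] = pd.to_datetime(df['Date'])
3. Decompose: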

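from statsmodels.tsa.seasonal import seasonal_decompose
# ts_log is not defined in the answer; presumably it is the series to decompose, indexed by the parsed Date column
decomposition = seasonal_decompose(ts_log)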

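【Solution 4】:

It depends on the index format. You can have a DatetimeIndex or a PeriodIndex. Stefan showed an example for a DatetimeIndex; here is mine for a PeriodIndex. My original DataFrame had a MultiIndex whose first level is the year and whose second level is the month. This is how I converted it to a PeriodIndex: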

df["date"] = pd.PeriodIndex(df.index.map(lambda x: "{0}{1:02d}".format(*x)), freq="M")
df = df.set_index("date")

It can now be used by seasonal_decompose.
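The conversion described above can be tried on a small, hypothetical frame indexed by (year, month). This sketch builds "YYYY-MM" labels with a hyphen, a slight variation on the answer's format string, and the column values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical frame indexed by (year, month), as described in the answer.
mi = pd.MultiIndex.from_product([range(2004, 2007), range(1, 13)], names=["year", "month"])
df = pd.DataFrame({"divida": np.arange(len(mi), dtype=float)}, index=mi)

# Build "YYYY-MM" labels from each (year, month) tuple and parse them as monthly periods.
periods = pd.PeriodIndex([f"{y}-{m:02d}" for y, m in df.index], freq="M", name="date")
df.index = periods
print(df.index.freq)

# If a DatetimeIndex is preferred, to_timestamp() maps each period to its start.
df_ts = df.to_timestamp()
print(df_ts.index[:3])
```

Unlike an index produced by to_datetime, a PeriodIndex always carries an explicit frequency, here monthly.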

【Solution 5】:

Try parsing the date column with parse_dates when reading the file, and name that same column as the index with index_col.

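import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
# airline is the path to the CSV; column 0 is parsed as dates and becomes the index
# .squeeze("columns") replaces squeeze=True, which was removed from read_csv in pandas 2.0
data = pd.read_csv(airline, header=0, index_col=[0], parse_dates=[0]).squeeze("columns")
res = seasonal_decompose(data)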
