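[Posted]: 2018-09-28 08:23:51
[Question]:
I am trying to build a super dictionary that contains a number of lower-level dictionaries.
Concept
I have my retail bank's interest rates for the past 12 years, and I am trying to model those rates using a portfolio of different bonds.
Regression formula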
Y_i - Y_i-1 = A + B(X_i - X_i-1) + E
In other words, Y_Lag = alpha + beta(X_Lag) + error term
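To make the formula concrete, here is a minimal sketch of the differenced regression on synthetic data. The names `df_demo`, `true_alpha` and `true_beta` are made up for illustration, and `np.polyfit` stands in for a full OLS fit only to keep the example short; it is not the approach used in the code further down.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: build X and Y so that their first differences
# satisfy Y_i - Y_(i-1) = A + B * (X_i - X_(i-1)) + E with known A and B.
rng = np.random.default_rng(0)
true_alpha, true_beta = 0.1, 2.0
dx = rng.normal(size=200)
dy = true_alpha + true_beta * dx + rng.normal(scale=0.01, size=200)
df_demo = pd.DataFrame({"X": dx.cumsum(), "Y": dy.cumsum()})

# First differences: row i holds value_i - value_(i-1); the first row is NaN.
diffs = df_demo.diff().dropna()

# Least-squares line through the differences; polyfit returns [slope, intercept].
beta_hat, alpha_hat = np.polyfit(diffs["X"].to_numpy(), diffs["Y"].to_numpy(), deg=1)
print(f"alpha ~ {alpha_hat:.3f}, beta ~ {beta_hat:.3f}")
```

`DataFrame.diff()` takes care of the `shift(1)` subtraction, and `dropna()` removes the leading row for which no difference exists.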
Data
Note: Y = Historic Rate
df = pd.DataFrame(np.random.randint(low=0, high=10, size=(100,17)),
columns=['Historic Rate', 'Overnight', '1M', '3M', '6M','1Y','2Y','3Y','4Y','5Y','6Y','7Y','8Y','9Y','10Y','12Y','15Y'])
Code so far
#Import packages required for the analysis
import pandas as pd
import numpy as np
import statsmodels.api as sm

def Simulation(TotalSim,j):
    #super dictionary to hold all iterations of the loop
    Super_fit_d = {}
    for i in range(1,TotalSim):
        #Create an introductory loop to run the first set of regressions
        #Each loop produces a univariate regression
        #Each loop has a fixed lag of i
        fit_d = {} # This will hold all of the fit results and summaries
        for col in [x for x in df.columns if x != 'Historic Rate']:
            Y = df['Historic Rate'] - df['Historic Rate'].shift(1)
            # Need to remove the NaN for fit
            Y = Y[Y.notnull()]
            X = df[col] - df[col].shift(i)
            X = X[X.notnull()]
            #Y now has more observations than X due to lag, drop rows to match
            Y = Y.drop(Y.index[0:i-1])
            if j == 1:
                X = sm.add_constant(X) # Add a constant to the fit
            fit_d[col] = sm.OLS(Y,X).fit()
        #append the dictionary for each lag onto the super dictionary
        Super_fit_d[lag_i] = fit_d
    #Check the output for one column
    fit_d['Overnight'].summary()
    #Check the output for one column in one segment of the super dictionary
    Super_fit_d['lag_5'].fit_d['Overnight'].summary()

Simulation(11,1)
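Problem
I seem to be overwriting my dictionary on every loop, and I am not evaluating i correctly to index the iterations as lag_1, lag_2, lag_3, and so on. How can I fix this?
Thanks in advance.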
[Comments]:
Tags: python arrays pandas for-loop statsmodels