【Posted】: 2020-12-08 11:48:46
【Question】:
I am building an OLS model, but I cannot get any predictions out of it.
Can you explain what I am doing wrong?
Building the model:
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.api as sm
import matplotlib.pyplot as plt
d = {'City': ['Tokyo','Tokyo','Lisbon','Tokyo','Madrid','New York','Madrid','London','Tokyo','London','Tokyo'],
     'Card': ['Visa','Visa','Visa','Master Card','Bitcoin','Master Card','Bitcoin','Visa','Master Card','Visa','Bitcoin'],
     'Colateral': ['Yes','Yes','No','No','Yes','No','No','Yes','Yes','No','Yes'],
     'Client Number': [1,2,3,4,5,6,7,8,9,10,11],
     'Total': [100,100,200,300,10,20,40,50,60,100,500]}
d = pd.DataFrame(data=d).set_index('Client Number')
df = pd.get_dummies(d,prefix='', prefix_sep='')
X = df[['Lisbon','London','Madrid','New York','Tokyo','Bitcoin','Master Card','Visa','No','Yes']]
Y = df['Total']
X1 = sm.add_constant(X)
reg = sm.OLS(Y, X1).fit()
reg.summary()
Prediction:
d1 = {'City': ['Tokyo','Tokyo','Lisbon'],
      'Card': ['Visa','Visa','Visa'],
      'Colateral': ['Yes','Yes','No'],
      'Client Number': [11,12,13],
      'Total': [0,0,0]}
df1 = pd.DataFrame(data=d1).set_index('Client Number')
df1 = pd.get_dummies(df1,prefix='', prefix_sep='')
y_new = df1[['Lisbon','Tokyo','Visa','No','Yes']]
x_new = df1['Total']
mod = sm.OLS(y_new, x_new)
mod.predict(reg.params)
It then raises: ValueError: shapes (3,1) and (11,) not aligned: 1 (dim 1) != 11 (dim 0)
What am I doing wrong?
【Discussion】:
Tags: python pandas dataframe linear-regression statsmodels