【Question title】: Linear regression problems with statsmodels
【Posted】: 2018-01-08 18:26:54
【Question description】:

I have a pandas df that looks like this:

   broker-value-current  broker-value-prior      consensus-after  
                 590.00              510.00              462.55   
                  32.74               31.98               30.72   
                  33.00               30.00               30.04 

           pctch_broker      pctch_consensus    pctch_frstrec_eps 
              15.686275             1.599051             1.421657   
               2.376485             0.195695           -82.098455   
              10.000000             0.805369           -82.098455  

      pctch_frstrec_rev  
               1.243782  
              -1.258936  
              -1.258936 

The last few columns were created like this:

 data['pctch_broker'] = ((data['broker-value-current']-data['broker-value-prior'])/data['broker-value-prior'])*100
 data['pctch_consensus'] = ((data['consensus-after']-data['consensus-before'])/data['consensus-before'])*100
 data['pctch_frstrec_eps'] = ((data['frstrec_eps_announced']-data['frstrec_eps_forecast'])/data['frstrec_eps_forecast'])*100
 data['pctch_frstrec_rev'] = ((data['frstrec_rev_announced']-data['frstrec_rev_forecast'])/data['frstrec_rev_forecast'])*100

I also drop the NAs with this line:

cleaned_data = data.dropna()

I import statsmodels like this:

 import statsmodels.formula.api as sm

However, when I try to use either 'pctch_consensus' or 'pctch_broker' as the independent variable and either 'pctch_frstrec_rev' or 'pctch_frstrec_eps' as the dependent variable with this code:

 reg1 = sm.ols(formula="pctch_consensus ~ pctch_frstrec_rev", data=cleaned_data).fit()

I get this error:

RuntimeWarning: invalid value encountered in greater
  return (S > tol).sum(axis=-1)

【Comments】:

  • Oops, thanks, I have updated the question. Yes, my imports also include this line: import statsmodels.formula.api as sm

Tags: python pandas error-handling regression statsmodels


【Solution 1】:

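This happens because your dataframe contains infinities. You probably created them by dividing by zero when computing the new variables, and dropna() does not remove them because inf is not treated as a missing value.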
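To check whether this is what happened, a small sketch along these lines may help. The toy frame and its numbers are made up for illustration; only the column names are taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the second row has a zero denominator.
data = pd.DataFrame({
    "consensus-after": [462.55, 30.72, 30.04],
    "consensus-before": [455.27, 0.0, 29.80],
})

# Same percent-change formula as in the question.
data["pctch_consensus"] = (
    (data["consensus-after"] - data["consensus-before"])
    / data["consensus-before"]
) * 100

# dropna() only removes NaN/None, so the row holding inf survives.
cleaned_data = data.dropna()

# Count the infinite values that are still present after dropna().
n_inf = int(np.isinf(cleaned_data["pctch_consensus"]).sum())
print(n_inf)
```

A non-zero count here means the regression is being fed infinite values.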

This should fix it:

import numpy as np
cleaned_data = data.replace([np.inf, -np.inf], np.nan)
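Note that replace() turns the infinities into NaN but leaves those rows in the frame. If you would rather remove them explicitly before fitting, one possible sketch is shown here. The helper name drop_non_finite and the sample values are hypothetical and not part of pandas or statsmodels.

```python
import numpy as np
import pandas as pd


def drop_non_finite(df, columns):
    """Return a copy of df without rows where any of `columns` is NaN or +/-inf."""
    # Map +/-inf to NaN so a single notna() check covers both cases.
    subset = df[columns].replace([np.inf, -np.inf], np.nan)
    return df.loc[subset.notna().all(axis=1)].copy()


# Hypothetical values for illustration only.
data = pd.DataFrame({
    "pctch_consensus": [1.599051, np.inf, 0.805369, np.nan],
    "pctch_frstrec_rev": [1.243782, -1.258936, -1.258936, 0.5],
})

cleaned_data = drop_non_finite(data, ["pctch_consensus", "pctch_frstrec_rev"])
print(len(cleaned_data))
```

The resulting cleaned_data can then be passed to sm.ols(...) exactly as in the question. Restricting the check to the columns used in the formula avoids discarding rows because of problems in unrelated columns.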

【Comments】: