Fitting linear regression models with and without an intercept in Python: answers

【Question title】: Fitting linear regression models with and without an intercept in Python
【Posted】: 2022-04-06 09:00:29
【Question description】:

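I need to fit linear regression model 1: y = β1x1 + ε and model 2: y = β0 + β1x1 + ε to the data x1 = [0,1,2,3,4], y = [1,2,3,2,1]. For both models, I want to find the coefficients, the squared error loss, the absolute error loss, and the L1.5 loss.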

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model
import statsmodels.formula.api as smf

x1 = [0, 1, 2, 3, 4]
y = [1, 2, 3, 2, 1]

Can you suggest some ways to obtain these?

【Discussion】:

See scikit-learn.org/stable/modules/generated/…, scikit-learn.org/stable/modules/…, and scikit-learn.org/stable/auto_examples/linear_model/…

Tags: linear-regression statsmodels

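【Solution 1】:

The first approach does not use the formula API. Note that sm.OLS does not add an intercept on its own, so this fits model 1 (no intercept).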

import statsmodels.api as sm
import numpy as np

x1 = np.array([0,1,2,3,4])
y = np.array([1,2,3,2,1])
x1 = x1[:, None]  # Reshape into a (5, 1) array

res = sm.OLS(y,x1).fit()

print(res.summary())

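If you want to use the formula interface, you need to build a DataFrame and then specify the regression as "y ~ x1". The formula interface includes a constant by default (adding + 1 on the right-hand side only makes this explicit); to drop the constant, use "y ~ x1 - 1" or "y ~ x1 + 0".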

import statsmodels.formula.api as smf
import pandas as pd

x1 = [0,1,2,3,4]
y = [1,2,3,2,1]
data = pd.DataFrame({"y":y,"x1":x1})
res = smf.ols("y ~ x1", data).fit()
print(res.summary())

The formula version, which includes the intercept, produces

                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.000
Model:                            OLS   Adj. R-squared:                 -0.333
Method:                 Least Squares   F-statistic:                 4.758e-16
Date:                Wed, 17 Mar 2021   Prob (F-statistic):               1.00
Time:                        22:11:40   Log-Likelihood:                -5.6451
No. Observations:                   5   AIC:                             15.29
Df Residuals:                       3   BIC:                             14.51
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      1.8000      0.748      2.405      0.095      -0.582       4.182
x1                  0      0.306          0      1.000      -0.972       0.972
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   1.429
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.375
Skew:                           0.344   Prob(JB):                        0.829
Kurtosis:                       1.847   Cond. No.                         4.74
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
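
For model 1 through the formula interface, the intercept has to be removed explicitly. A minimal sketch, assuming the same data as in the question; the name res_no_const is only illustrative:

```python
import pandas as pd
import statsmodels.formula.api as smf

data = pd.DataFrame({"y": [1, 2, 3, 2, 1], "x1": [0, 1, 2, 3, 4]})

# "- 1" removes the intercept that the formula interface adds by default
res_no_const = smf.ols("y ~ x1 - 1", data).fit()

# params is a pandas Series indexed by term name; only "x1" should appear
print(res_no_const.params)
```

With this data the single slope works out to 18/30 = 0.6, since the no-intercept least-squares slope is sum(x*y) / sum(x*x).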

To include an intercept with the non-formula API, you can simply use

res_constant = sm.OLS(y, sm.add_constant(x1)).fit()

【Discussion】:

【Solution 2】:

You can use sklearn's LinearRegression.

For the model without an intercept (that is, forcing the fitted line through the origin), just set the parameter fit_intercept=False.
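
A minimal sketch of both fits with scikit-learn, assuming the data from the question; the variable names are illustrative:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# sklearn expects a 2-D feature matrix, so reshape x1 to (5, 1)
X = np.array([0, 1, 2, 3, 4]).reshape(-1, 1)
y = np.array([1, 2, 3, 2, 1])

# Model 1: line forced through the origin
model_no_intercept = LinearRegression(fit_intercept=False).fit(X, y)

# Model 2: intercept estimated (this is the default setting)
model_with_intercept = LinearRegression(fit_intercept=True).fit(X, y)

print(model_no_intercept.coef_, model_no_intercept.intercept_)
print(model_with_intercept.coef_, model_with_intercept.intercept_)
```

The residuals for either model can then be obtained as y - model.predict(X) and used to evaluate whichever loss is needed.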

【Discussion】: