Beta coefficients and p-values with Logistic Regression in Python: answers

【Question title】: Beta coefficients and p-value with Logistic Regression in Python
【Posted】: 2021-01-08 23:37:24
【Question description】:

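I want to run a simple logistic regression in Python (one dependent variable, one independent variable). All the documentation I have found on logistic regression in Python is aimed at building predictive models. I want to use it more from the statistical side. How can I get the odds ratio, p-value, and confidence interval for a simple logistic regression in Python?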

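from sklearn.linear_model import LogisticRegression

X = df[predictor]
y = df[binary_outcome]

model = LogisticRegression()
model.fit(X, y)

# print(model_stats)  # placeholder: the statistics I would like to print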

The ideal output would be the odds ratio, p-value, and confidence interval.

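【Comments on the question】:

What is your question? Welcome to SO. This is not a discussion forum or a tutorial site. Please take the tour and take the time to read How to Ask and the other links on that page.

Tags: python regression logistic-regression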

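【Solution 1】:

I assume you are using LogisticRegression() from sklearn. It does not give you p-values or confidence intervals. You can use statsmodels instead. Note also that statsmodels without the formula interface behaves a bit differently from sklearn (see @Josef's comments): it does not add an intercept for you, so you need to add one with sm.add_constant():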

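import numpy as np
import statsmodels.api as sm

# Random example data: a binary outcome and one continuous predictor
y = np.random.choice([0,1],50)
x = np.random.normal(0,1,50)

# add_constant() appends the intercept column that GLM does not add by itself
model = sm.GLM(y, sm.add_constant(x), family=sm.families.Binomial())
results = model.fit()
print(results.summary())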

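                 Generalized Linear Model Regression Results
==============================================================================
Dep. Variable:                      y   No. Observations:                   50
Model:                            GLM   Df Residuals:                       48
Model Family:                Binomial   Df Model:                            1
Link Function:                  logit   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:                -33.125
Date:                Sat, 09 Jan 2021   Deviance:                       66.250
Time:                        16:21:51   Pearson chi2:                     50.1
No. Iterations:                     4
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const         -0.0908      0.309     -0.294      0.769      -0.696       0.514
x1             0.5975      0.361      1.653      0.098      -0.111       1.306
==============================================================================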

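The coefficients are on the log-odds scale; exponentiate them to convert them to odds ratios. The [0.025 0.975] columns are the 95% confidence interval for the log-odds coefficients, and the P>|z| column holds the p-values. Check out the help page for more info.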

【Comments】:

Note that statsmodels does not add a constant automatically when you are not using formulas.
Yes, I missed that. Thanks for pointing it out.