Plotting Logistic Regression from Linear Regression: Answers

【Question title】: Plotting Logistic Regression from Linear Regression
【Posted】: 2021-11-23 20:29:25
【Question】:

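I have a working linear regression algorithm and can plot its result, but I can't figure out how to plot logistic regression using the sigmoid transform.

This is the code I'm using: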

m = 0
c = 0

L = 0.0001  # The learning Rate
epochs = 1000  # The number of iterations to perform gradient descent

n = float(len(X)) # Number of elements in X

#performing gradient descent
for i in range(epochs):
    Y_pred = m*X + c  # The current predicted value of Y
    D_m = (-2/n) * sum(X * (Y - Y_pred))  # Derivative wrt m
    D_c = (-2/n) * sum(Y - Y_pred)  # Derivative wrt c
    m = (m - L * D_m)  # Update m
    c = (c - L * D_c)  # Update c
    
print (m, c)

# Making predictions
Sx = 1 / (1+(np.exp(-Y_pred)))


plt.scatter(df.x1, df.x2, c=df.y, cmap=matplotlib.colors.ListedColormap(['red','blue']))
plt.plot([min(X), max(X)], [min(Y_pred), max(Y_pred)])  # regression line
plt.show()

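The program prints m as roughly 0.6985 and c as roughly 0.9674.

Here is a screenshot of the plot when using linear regression: [image: Linear Regression Plot]

Treating this as a regression problem for the moment, that looks like the correct output. But linear regression is not suited to this kind of dataset (I am trying to classify two groups, red and blue). When I change the plot to show Sx, I get what looks like a straight line that starts at 0 and approaches 1 toward the right side of the graph. That makes sense to me, since the classes are meant to be split that way (red is class 0, centered on the left; blue is class 1, on the right). However, the scikit-learn LogisticRegression implementation lets me produce a much better-looking graph, like this:

[image: Scikit-Learn LogisticRegression Line]

Given that the output of the sigmoid function lies between 0 and 1, how do I create a similar graph?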

【Question comments】:

This is essentially a contour plot, since you want to draw the line where f(x, y) = 0.5. This might help: jakevdp.github.io/PythonDataScienceHandbook/…
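
The contour idea from the comment above can be sketched roughly as follows. This is only an illustration: the weights `w`, the helper `predict_proba`, the [0, 1] grid range, and the output file name are assumptions made for the example and do not come from the question. The only matplotlib feature it relies on is passing a single level (0.5) to `contour`.

```python
import numpy as np
import matplotlib.pyplot as plt

def predict_proba(w, x1, x2):
    """Sigmoid of w0 + w1*x1 + w2*x2, evaluated elementwise."""
    z = w[0] + w[1] * x1 + w[2] * x2
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights (w0, w1, w2); in practice use the fitted ones.
w = np.array([-4.0, 5.0, 3.0])

# Evaluate the predicted probability on a grid covering the plot area.
g1, g2 = np.meshgrid(np.linspace(0, 1, 200), np.linspace(0, 1, 200))
p = predict_proba(w, g1, g2)

fig, ax = plt.subplots()
# Shade the probability surface (red = low, blue = high).
shaded = ax.contourf(g1, g2, p, levels=20, cmap="RdBu", alpha=0.4)
# Draw only the p = 0.5 contour, which is the decision boundary.
ax.contour(g1, g2, p, levels=[0.5], colors="black")
fig.colorbar(shaded, ax=ax, label="P(class 1)")
ax.set_xlabel("x1")
ax.set_ylabel("x2")
fig.savefig("decision_boundary.png")  # hypothetical output file name
```

Scattering the data points on the same axes before saving should give a picture similar in spirit to the scikit-learn one.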

Tags: python machine-learning linear-regression logistic-regression

【Solution 1】:

I'm not sure how you computed the gradient for logistic regression. Here is what I did. Reference: https://ml-cheatsheet.readthedocs.io/en/latest/logistic_regression.html

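The logistic regression model is as follows, assuming we have 2 features x1 and x2:

sigmoid(z) = 1 / (1 + e^(-z))

z = w0 + w1*x1 + w2*x2 (w0 plays the role of c, w1 of m1, w2 of m2)

To draw a line from z, we choose values for one feature and solve for the other: given x1 we need the matching x2, or vice versa.

Say we draw the line as x2 in terms of x1. It is given by the equation

0 = w0 + w1*x1 + w2*x2

We use this equation because the decision boundary is where the predicted probability equals 0.5:

1 / (1 + e^(-z)) = 0.5

2 = 1 + e^(-z)

1 = e^(-z)

ln(1) = ln(e^(-z))

0 = -z

z = 0

0 = w0 + w1*x1 + w2*x2

=================================

So,

x2 = (-w0 - w1*x1) / w2

Plugging in values of x1 gives the corresponding values of x2.

These points form the decision boundary.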
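
As a quick sanity check of the derivation, the sketch below plugs a few x1 values into the boundary formula and confirms that the sigmoid returns 0.5 at each resulting point. The weights are made up for illustration, and `boundary_x2` is a hypothetical helper name, not part of any library.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def boundary_x2(w0, w1, w2, x1):
    """Solve w0 + w1*x1 + w2*x2 = 0 for x2 (requires w2 != 0)."""
    return (-w0 - w1 * x1) / w2

# Hypothetical weights, chosen only for illustration.
w0, w1, w2 = -1.0, 2.0, 0.5

for x1 in (0.0, 0.25, 0.5, 1.0):
    x2 = boundary_x2(w0, w1, w2, x1)
    z = w0 + w1 * x1 + w2 * x2
    print(f"x1={x1:.2f}  x2={x2:.2f}  sigmoid(z)={sigmoid(z):.3f}")
```

Every printed probability should be 0.500, since each (x1, x2) pair lies on the line z = 0. Points on one side of the line give values above 0.5 and points on the other side give values below it.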

import matplotlib.pyplot as plt
import numpy as np

from sklearn.datasets import make_blobs
from sklearn.preprocessing import minmax_scale

# Two noisy clusters, scaled to [0, 1], with a leading column of ones for the bias
X, Y = make_blobs(n_samples=20, n_features=2, centers=2, random_state=75, cluster_std=5)
X = minmax_scale(X)
X = np.hstack([np.ones((len(X), 1)), X])

B = np.array([1.0, 1.0, 1.0])  # weights [w0, w1, w2]

L = 0.01  # The learning rate
epochs = 5000  # The number of iterations to perform gradient descent

N = len(X)  # number of records

for i in range(epochs):
    Y_pred = np.dot(X, B)  # linear part z = w0 + w1*x1 + w2*x2
    h = 1 / (1 + np.exp(-Y_pred))  # sigmoid

    # Mean cross-entropy cost; not used in the update, handy to print
    J = np.mean(-Y * np.log(h) - (1 - Y) * np.log(1 - h))
    # print("cost = ", J)

    gradient = np.dot(X.T, h - Y) / N  # gradient of J wrt B
    B = B - L * gradient  # update weights

# Decision boundary: x2 = (-w0 - w1*x1) / w2 for x1 in [0, 1)
x1 = np.arange(100) / 100
decision_boundary = (-B[1] * x1 - B[0]) / B[2]

plt.scatter(X[:, 1], X[:, 2], c=Y)
plt.plot(x1, decision_boundary)
plt.show()
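
Since the question's gradient was in doubt, one way to gain confidence in the update rule is to compare the analytic gradient X^T (h - Y) / N against a finite-difference estimate of the cost. The following is a minimal sketch under assumptions: the synthetic data, the seed, and the helper names (`cost`, `gradient`, `numerical_gradient`) are invented for this check and are not part of the answer's code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(B, X, Y):
    """Mean binary cross-entropy for weights B."""
    h = sigmoid(X @ B)
    return np.mean(-Y * np.log(h) - (1 - Y) * np.log(1 - h))

def gradient(B, X, Y):
    """Analytic gradient of the cost: X^T (h - Y) / N."""
    return X.T @ (sigmoid(X @ B) - Y) / len(X)

def numerical_gradient(B, X, Y, eps=1e-6):
    """Central finite differences, one weight at a time."""
    grad = np.zeros_like(B)
    for j in range(len(B)):
        step = np.zeros_like(B)
        step[j] = eps
        grad[j] = (cost(B + step, X, Y) - cost(B - step, X, Y)) / (2 * eps)
    return grad

# Small synthetic dataset (made up for this check).
rng = np.random.default_rng(0)
features = rng.random((20, 2))
X = np.hstack([np.ones((20, 1)), features])  # bias column first
Y = (features.sum(axis=1) > 1.0).astype(float)
B = np.array([1.0, 1.0, 1.0])

max_diff = np.max(np.abs(gradient(B, X, Y) - numerical_gradient(B, X, Y)))
print("largest difference:", max_diff)
```

If the two gradients agree to within a small tolerance, the update inside the training loop is at least consistent with the cost being minimized.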

【Comments】: