Why does this training loss fluctuate? (Logistic regression from scratch with binary cross-entropy loss)

【Question title】: Why does this training loss fluctuate? (Logistic regression from scratch with binary cross-entropy loss)
【Posted】: 2021-10-23 10:31:24
【Question description】:

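I am trying to implement logistic regression from scratch using the binary cross-entropy loss function. The loss function implemented below is based on the standard binary cross-entropy formula.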
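
For reference, the standard binary cross-entropy over n samples, which the function below appears to follow term by term, can be written as follows. The symbols y_i, \hat{y}_i and n are chosen here to mirror y, yhat and no_of_samples in the code; they are notation assumed for this note, not taken from the post.

```latex
\[
\mathcal{L}(y, \hat{y}) = -\frac{1}{n} \sum_{i=1}^{n}
  \Big[ y_i \log \hat{y}_i + (1 - y_i) \log\left(1 - \hat{y}_i\right) \Big]
\]
```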

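import numpy as np

def binary_crossentropy(y, yhat):
    # Mean binary cross-entropy; yhat must lie strictly between 0 and 1
    no_of_samples = len(y)

    # Per-sample terms for the positive (y = 1) and negative (y = 0) class
    numerator_1 = y * np.log(yhat)
    numerator_2 = (1 - y) * np.log(1 - yhat)

    loss = -(np.sum(numerator_1 + numerator_2) / no_of_samples)

    return loss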
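
One thing worth ruling out when a from-scratch loss curve looks erratic is log(0): if the sigmoid saturates to exactly 0 or 1 in floating point, np.log returns -inf and the recorded loss becomes inf or nan. A minimal sketch of a guarded variant is shown here; the name binary_crossentropy_clipped and the eps value are assumptions made for illustration, not part of the original post.

```python
import numpy as np

def binary_crossentropy_clipped(y, yhat, eps=1e-12):
    """Mean binary cross-entropy with predictions clipped away from 0 and 1.

    `eps` is an illustrative choice, not a value taken from the post.
    """
    y = np.asarray(y, dtype=float)
    # Clipping keeps both log() arguments strictly positive.
    yhat = np.clip(np.asarray(yhat, dtype=float), eps, 1.0 - eps)
    return -np.mean(y * np.log(yhat) + (1.0 - y) * np.log(1.0 - yhat))
```

With this variant a fully saturated prediction gives a large but finite loss instead of inf.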

Below is how I implemented training with gradient descent.

L = 0.01  # learning rate
epochs = 40000

# x, y, weight, bias and sigmoid() are assumed to be defined earlier
no_of_samples = len(x)

# Keeping track of the loss
loss = []

for _ in range(epochs):
    yhat = sigmoid(x*weight + bias)
    
    # Finding out the loss of each iteration
    loss.append(binary_crossentropy(y, yhat))
    
    # Gradients of the mean binary cross-entropy w.r.t. weight and bias
    d_weight = np.sum(x * (yhat - y)) / no_of_samples
    d_bias = np.sum(yhat - y) / no_of_samples
    
    weight = weight - L*d_weight
    bias = bias - L*d_bias
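
As a check on the update rule above, the gradients of the mean binary cross-entropy for a single-feature model can be derived directly. The notation is assumed here for illustration: w and b stand for weight and bias, \sigma is the sigmoid, n is no_of_samples, and L is the learning rate as in the code. The result agrees with d_weight and d_bias.

```latex
\[
\hat{y}_i = \sigma(w x_i + b), \qquad
\frac{\partial \mathcal{L}}{\partial w} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)\, x_i, \qquad
\frac{\partial \mathcal{L}}{\partial b} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)
\]
\[
w \leftarrow w - L \, \frac{\partial \mathcal{L}}{\partial w}, \qquad
b \leftarrow b - L \, \frac{\partial \mathcal{L}}{\partial b}
\]
```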

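The training above works well, in that the weight and bias are adjusted appropriately. But my question is: why does the loss plot fluctuate so much?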

I have also tried implementing linear regression, and there the loss seemed to decrease steadily.

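Is there anything incorrect in my logistic regression implementation? If the implementation is already correct, why does the loss fluctuate like this?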
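
One way to investigate this is to count how often the loss rises from one epoch to the next under the same update rule, once on raw inputs and once on standardized inputs. The sketch below uses synthetic data; the data, the helper names train and count_increases, and the epoch count are all assumptions for illustration, and the post does not confirm that feature scale is the cause here. For standardized inputs and a small step size, full-batch gradient descent on this convex loss should not increase the loss; on large-valued inputs the same step size may overshoot.

```python
import numpy as np

def sigmoid(z):
    # Clip the argument so np.exp cannot overflow.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500.0, 500.0)))

def bce(y, yhat, eps=1e-12):
    yhat = np.clip(yhat, eps, 1.0 - eps)
    return -np.mean(y * np.log(yhat) + (1.0 - y) * np.log(1.0 - yhat))

def train(x, y, lr, epochs):
    """Full-batch gradient descent with the post's update rule; returns the loss per epoch."""
    weight, bias = 0.0, 0.0
    losses = []
    for _ in range(epochs):
        yhat = sigmoid(x * weight + bias)
        losses.append(bce(y, yhat))
        weight -= lr * np.mean(x * (yhat - y))
        bias -= lr * np.mean(yhat - y)
    return np.array(losses)

def count_increases(losses, tol=1e-12):
    """Number of epochs in which the loss rose by more than `tol`."""
    return int(np.sum(np.diff(losses) > tol))

# Synthetic, hypothetical data: one feature on a 0..100 scale with noisy labels.
rng = np.random.default_rng(0)
x_raw = rng.uniform(0.0, 100.0, size=200)
y = (x_raw + rng.normal(0.0, 15.0, size=200) > 50.0).astype(float)
x_std = (x_raw - x_raw.mean()) / x_raw.std()

print("loss increases, raw inputs:         ", count_increases(train(x_raw, y, 0.01, 2000)))
print("loss increases, standardized inputs:", count_increases(train(x_std, y, 0.01, 2000)))
```

If the count is nonzero only for the raw inputs, rescaling x or lowering L is a reasonable first thing to try.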

【Question discussion】:

Tags: python machine-learning linear-regression logistic-regression

【Solution 1】:

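You need to tune the hyperparameters to see whether the problem goes away. One thing you can do is change the type of optimizer you use. For example, you can use fmin_tnc (from scipy.optimize) instead of plain gradient descent.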
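
A minimal sketch of the fmin_tnc suggestion, assuming SciPy is installed and using synthetic data in place of the poster's x and y (which are not shown). The helper loss_and_grad is a hypothetical name; it returns the loss together with its gradient, which is one of the calling conventions fmin_tnc accepts.

```python
import numpy as np
from scipy.optimize import fmin_tnc

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grad(params, x, y, eps=1e-12):
    """Return (mean binary cross-entropy, gradient) for params = [weight, bias]."""
    weight, bias = params
    yhat = sigmoid(x * weight + bias)
    p = np.clip(yhat, eps, 1.0 - eps)  # guard against log(0)
    loss = -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    grad = np.array([np.mean(x * (yhat - y)), np.mean(yhat - y)])
    return loss, grad

# Synthetic, hypothetical data standing in for the poster's x and y.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = (x + rng.normal(scale=1.0, size=200) > 0).astype(float)

params0 = np.zeros(2)
# fmin_tnc returns the solution, the number of function evaluations, and a return code.
params_opt, n_evals, return_code = fmin_tnc(loss_and_grad, params0, args=(x, y), disp=0)
print("weight, bias:", params_opt, "evaluations:", n_evals)
```

Because the optimizer picks its own step lengths, there is no learning rate L to tune in this variant.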

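In addition, if you use sklearn for the regression, you can adjust the number of epochs (max_iter in LogisticRegression) and the solver type ("newton-cg", "lbfgs", "liblinear", "sag", "saga"). Note that these LogisticRegression solvers choose their own step sizes, so a learning rate such as L can only be tuned in SGD-based estimators such as SGDClassifier (eta0).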
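
A short sketch of the solver comparison, assuming scikit-learn is installed and again using synthetic data in place of the poster's x and y. The max_iter value and the fitted dictionary are illustrative choices; note that LogisticRegression applies L2 regularization by default, so its coefficients will not exactly match an unregularized from-scratch fit.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic, hypothetical data standing in for the poster's x and y.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = (x + rng.normal(scale=1.0, size=200) > 0).astype(int)
X = x.reshape(-1, 1)  # sklearn expects a 2-D feature matrix

fitted = {}
for solver in ("newton-cg", "lbfgs", "liblinear", "sag", "saga"):
    model = LogisticRegression(solver=solver, max_iter=1000)
    model.fit(X, y)
    # Store (weight, bias) so the solvers can be compared side by side.
    fitted[solver] = (model.coef_[0, 0], model.intercept_[0])
    print(solver, fitted[solver])
```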

【Discussion】: