Custom loss function becomes zero when backpropagated

【Question title】: Custom Loss Function becomes zero when backpropagated
【Posted】: 2019-11-14 20:22:40
【Question】:

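I am trying to write a custom loss function based on the false positive rate and the false negative rate. I made some dummy code so you can check the first two definitions yourself, and I added the rest so you can see how it is implemented. Somewhere, however, the gradient still turns out to be zero. At which step does the gradient become zero, and how can I check this? I would like to know how to fix it :). I tried to give you enough information to play around with it, but if anything is missing, please let me know!

The gradient flag (requires_grad) stays True at every step. Still, the loss does not change while the model trains, so the neural network does not learn.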

import torch
import torch.nn as nn
from torch.autograd import Variable

y = Variable(torch.tensor((0, 0, 0, 1, 1, 1), dtype=torch.float), requires_grad=True)
y_pred = Variable(torch.tensor((0.333, 0.2, 0.01, 0.99, 0.49, 0.51), dtype=torch.float), requires_grad=True)
x = Variable(torch.tensor((0, 0, 0, 1, 1, 1), dtype=torch.float), requires_grad=True)
x_pred = Variable(torch.tensor((0.55, 0.25, 0.01, 0.99, 0.65, 0.51), dtype=torch.float), requires_grad=True)

def binary_y_pred(y_pred):
    y_pred.register_hook(lambda grad: print(grad))
    y_pred = y_pred+torch.tensor(0.5, requires_grad=True, dtype=torch.float)
    y_pred = y_pred.pow(5)  # this is my way working around using torch.where() 
    y_pred = y_pred.pow(10)
    y_pred = y_pred.pow(15)
    m = nn.Sigmoid()
    y_pred = m(y_pred)
    y_pred = y_pred-torch.tensor(0.5, requires_grad=True, dtype=torch.float)
    y_pred = y_pred*2
    y_pred.register_hook(lambda grad: print(grad))
    return y_pred

def confusion_matrix(y_pred, y):
    TP = torch.sum(y*y_pred)
    TN = torch.sum((1-y)*(1-y_pred))
    FP = torch.sum((1-y)*y_pred)
    FN = torch.sum(y*(1-y_pred))

    k_eps = torch.tensor(1e-12, requires_grad=True, dtype=torch.float)
    FN_rate = FN/(TP + FN + k_eps)
    FP_rate = FP/(TN + FP + k_eps)

    return FN_rate, FP_rate

def dif_rate(FN_rate_y, FN_rate_x):
    dif = (FN_rate_y - FN_rate_x).pow(2)
    return dif

def custom_loss_function(y_pred, y, x_pred, x):
    y_pred = binary_y_pred(y_pred)
    FN_rate_y, FP_rate_y = confusion_matrix(y_pred, y)

    x_pred= binary_y_pred(x_pred)
    FN_rate_x, FP_rate_x = confusion_matrix(x_pred, x)

    FN_dif = dif_rate(FN_rate_y, FN_rate_x)
    FP_dif = dif_rate(FP_rate_y, FP_rate_x)

    cost = FN_dif+FP_dif
    return cost

# I added the rest so you can see how it is implemented, but this piece does not fully run on its own. If you want this part to run as well, I can add more code.
class FeedforwardNeuralNetModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(FeedforwardNeuralNetModel, self).__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim) 
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(hidden_dim, output_dim)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu1(out)
        out = self.fc2(out)
        out = self.sigmoid(out)
        return out

model = FeedforwardNeuralNetModel(input_dim, hidden_dim, output_dim)

optimizer = torch.optim.Adam(model.parameters(), lr=0.0001, betas=[0.9, 0.99], amsgrad=True)
criterion = torch.nn.BCELoss(weight=None, size_average=None, reduce=None, reduction='mean')
for epoch in range(num_epochs):
    train_err = 0
    for i, (samples, truths) in enumerate(train_loader):
        samples = Variable(samples)
        truths = Variable(truths)
        optimizer.zero_grad()   # Reset gradients
        outputs = model(samples)  # Do the forward pass
        loss2 = criterion(outputs, truths) # Calculate loss

        samples_y = Variable(samples_y)
        samples_x = Variable(samples_x)

        y_pred = model(samples_y)
        y = Variable(y, requires_grad=True)

        x_pred = model(samples_x)
        x = Variable(x, requires_grad=True)

        cost = custom_loss_function(y_pred, y, x_pred, x)
        loss = loss2*0+cost #checking only if cost works.
        loss.backward()
        optimizer.step()
        train_err += loss.item()
    train_loss.append(train_err)

I expect the model to update during training. There are no error messages.

【Comments】:

k_eps should not have a gradient, by the way. It is normally just a constant for numerical stability.
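To illustrate the comment above, here is a minimal sketch in which the epsilon is a plain Python float and the labels are ordinary tensors without requires_grad. The helper name soft_rates and the constant name K_EPS are assumptions made for this example; the formulas are the ones from confusion_matrix in the question.

```python
import torch

K_EPS = 1e-12  # plain Python float: a constant that autograd never tracks


def soft_rates(y_pred, y):
    """Soft false-negative and false-positive rates (hypothetical helper)."""
    TP = torch.sum(y * y_pred)
    TN = torch.sum((1 - y) * (1 - y_pred))
    FP = torch.sum((1 - y) * y_pred)
    FN = torch.sum(y * (1 - y_pred))
    FN_rate = FN / (TP + FN + K_EPS)
    FP_rate = FP / (TN + FP + K_EPS)
    return FN_rate, FP_rate


# Labels are data, so they do not need requires_grad either.
y = torch.tensor([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
y_pred = torch.tensor([0.333, 0.2, 0.01, 0.99, 0.49, 0.51], requires_grad=True)

FN_rate, FP_rate = soft_rates(y_pred, y)
(FN_rate + FP_rate).backward()
print(y_pred.grad)  # the gradient reaches the predictions only
```

Only y_pred receives a gradient here; y.grad stays None because the labels are not part of what is being optimized.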

Tags: customization pytorch loss-function

【Solution 1】:

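With your definitions, TP + FN = y and TN + FP = 1 - y, so FN_rate = 1 - y_pred and FP_rate = y_pred. Your cost is then FN_rate + FP_rate = 1, a constant, so its gradient is 0.

You can check this by hand or with a symbolic math library (e.g., SymPy):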

from sympy import symbols

y, y_pred = symbols("y y_pred")

TP = y * y_pred
TN = (1-y)*(1-y_pred)
FP = (1-y)*y_pred
FN = y*(1-y_pred)

# let's ignore the eps for now
FN_rate = FN/(TP + FN)
FP_rate = FP/(TN + FP)
cost = FN_rate + FP_rate

from sympy import simplify
print(simplify(cost))
# output: 1
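The same result can be checked numerically with autograd. The sketch below is only an illustration: it uses element-wise rates as in the symbolic derivation (no sum over the batch), and it assumes soft labels strictly between 0 and 1 so that no denominator is zero when the eps is left out. The sample values are made up.

```python
import torch

# Soft labels strictly between 0 and 1 (an assumption, to avoid 0/0 without eps).
y = torch.tensor([0.3, 0.6, 0.9], dtype=torch.float64)
y_pred = torch.tensor([0.7, 0.2, 0.5], dtype=torch.float64, requires_grad=True)

TP = y * y_pred
TN = (1 - y) * (1 - y_pred)
FP = (1 - y) * y_pred
FN = y * (1 - y_pred)

# Element-wise rates, matching the symbolic expression above.
cost = FN / (TP + FN) + FP / (TN + FP)
cost.sum().backward()

print(cost.detach())  # every entry equals 1 up to rounding
print(y_pred.grad)    # every entry equals 0 up to rounding
```

Because the cost does not depend on y_pred, the optimizer has nothing to follow, whatever the network outputs.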

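【Discussion】:

Thank you very much for your help and feedback. However, I now see that, in my effort to make the problem shorter and more insightful, I introduced the mistake you solved; it is not in the real code. I have updated my example with the real code. I tried the same trick with symbols that you proposed. Now I end up with several symbols in "cost", so it should be differentiable. Do you have any idea what it could be now?
@Madelon Your loss is defined as loss = confusion_matrix(outputs, truths). The call loss.backward() cannot work, because loss is a tuple at that point... It would help if you showed people your real code.
Thank you for your feedback. I have added more parts of the code. As you can see in the implementation steps, it is a bit more complicated, because the loss function consists of an accuracy part and a false positive/negative rate part. You are right; the loss is indeed no longer a tuple now.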
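The thread does not say where the gradient vanishes in the updated code. One way to check is to replay the steps of binary_y_pred and call retain_grad() on every intermediate tensor, so that each step's gradient can be inspected after backward(). The sketch below assumes float32 inputs as in the question; the helper name trace_binarize and the two sample inputs are made up for illustration.

```python
import torch


def trace_binarize(p):
    """Replay the binarization chain, keeping every intermediate (hypothetical helper)."""
    steps = [("input", p)]
    t = p + 0.5
    steps.append(("shift", t))
    for exponent in (5, 10, 15):
        t = t.pow(exponent)
        steps.append((f"pow({exponent})", t))
    t = torch.sigmoid(t)
    steps.append(("sigmoid", t))
    t = (t - 0.5) * 2
    steps.append(("rescale", t))
    for _, tensor in steps[1:]:
        tensor.retain_grad()  # keep .grad on non-leaf tensors
    return t, steps


p = torch.tensor([0.2, 0.51], requires_grad=True)  # float32, as in the question
out, steps = trace_binarize(p)
out.sum().backward()

# Print value and gradient at each step, from the input to the output.
for name, tensor in steps:
    print(f"{name:>8}  value={tensor.detach().tolist()}  grad={tensor.grad.tolist()}")
```

For these two inputs the chained powers amount to raising to the 750th power, so the value either underflows to 0 or saturates the sigmoid, and the gradient that reaches the input is zero. Inputs well above 0.5 (such as 0.99) can overflow to inf in float32, in which case the gradient may come out as NaN instead of zero.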