Sigmoid vs Binary Cross Entropy Loss

【Question title】: Sigmoid vs Binary Cross Entropy Loss
【Posted】: 2021-11-25 23:45:18
【Question description】:

In my torch model, the last layer is torch.nn.Sigmoid() and the loss is torch.nn.BCELoss. During the training step, the following error was raised:

RuntimeError: torch.nn.functional.binary_cross_entropy and torch.nn.BCELoss are unsafe to autocast.
Many models use a sigmoid layer right before the binary cross entropy layer.
In this case, combine the two layers using torch.nn.functional.binary_cross_entropy_with_logits
or torch.nn.BCEWithLogitsLoss.  binary_cross_entropy_with_logits and BCEWithLogits are
safe to autocast.

However, when I try to reproduce this error by computing the loss and backpropagating, everything works fine:

import torch
from torch import nn

# last layer
sigmoid = nn.Sigmoid()
# loss
bce_loss = nn.BCELoss()


# the true classes
true_cls = torch.tensor([
            [0.],
            [1.]])

# model prediction classes
pred_cls = sigmoid(
    torch.tensor([
           [0.4949],
           [0.4824]],requires_grad=True)
)
pred_cls
# tensor([[0.6213],
#         [0.6183]], grad_fn=<SigmoidBackward>)

out = bce_loss(pred_cls, true_cls)
out
# tensor(0.7258, grad_fn=<BinaryCrossEntropyBackward>)

out.backward()

What am I missing? Any help is appreciated.

【Comments】:

Tags: pytorch loss-function sigmoid automatic-mixed-precision

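【Solution 1】:

The error is raised only when the loss is called inside an autocast region, which your snippet never enters. To reproduce it, move the tensors to CUDA and enable autocast first, like this: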

import torch
from torch import nn
from torch.cuda.amp import autocast

# last layer
sigmoid = nn.Sigmoid().cuda()
# loss
bce_loss = nn.BCELoss().cuda()


# the true classes
true_cls = torch.tensor([
            [0.],
            [1.]]).cuda()

with autocast():

    # model prediction classes
    pred_cls = sigmoid(
        torch.tensor([
               [0.4949],
               [0.4824]], requires_grad=True).cuda()
    )

    # BCELoss is unsafe to autocast: this call raises the RuntimeError shown below,
    # so the backward pass is never reached
    out = bce_loss(pred_cls, true_cls)
    out.backward()

RuntimeError: torch.nn.functional.binary_cross_entropy and torch.nn.BCELoss are unsafe to autocast.
Many models use a sigmoid layer right before the binary cross entropy layer.
In this case, combine the two layers using torch.nn.functional.binary_cross_entropy_with_logits
or torch.nn.BCEWithLogitsLoss.  binary_cross_entropy_with_logits and BCEWithLogits are
safe to autocast.
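
As the error message suggests, one way around this is to drop the final nn.Sigmoid() layer and feed the raw outputs (logits) to torch.nn.BCEWithLogitsLoss, which fuses the sigmoid and the binary cross entropy. The following is only a minimal sketch using the numbers from the question; the names `device`, `amp_dtype`, `logits`, `reference` and `probs` are illustrative, and the CPU/bfloat16 fallback is an assumption added so the snippet also runs without a GPU.

```python
import torch
from torch import nn

# Assumption: fall back to CPU autocast (bfloat16) when no GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

# Raw model outputs (logits): there is no nn.Sigmoid() in front of the loss.
logits = torch.tensor([[0.4949], [0.4824]], device=device, requires_grad=True)
true_cls = torch.tensor([[0.], [1.]], device=device)

# Sigmoid and binary cross entropy combined in a single autocast-safe loss.
bce_with_logits = nn.BCEWithLogitsLoss()

with torch.autocast(device_type=device, dtype=amp_dtype):
    loss = bce_with_logits(logits, true_cls)

# The backward pass is run outside the autocast region.
loss.backward()

# Reference value in plain float32, outside autocast: Sigmoid followed by BCELoss.
with torch.no_grad():
    reference = nn.BCELoss()(torch.sigmoid(logits.float()), true_cls)

# Probabilities can still be recovered from the logits, e.g. at inference time.
probs = torch.sigmoid(logits.detach())

print(loss.item(), reference.item())
```

With this change the model returns logits during training, so any code that expects probabilities (thresholding, metrics) needs to apply torch.sigmoid to the output itself.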

【Discussion】: