ValueError: Using a target size (torch.Size([2, 1])) that is different to the input size (torch.Size([16, 1])) is deprecated

【Question title】: ValueError: Using a target size (torch.Size([2, 1])) that is different to the input size (torch.Size([16, 1])) is deprecated
【Posted】: 2021-11-22 00:12:07
【Question】:

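I am trying to build a model for the Quora Question Pairs dataset, where the output is binary (1 or 0), but I get this error. I know that the output shape of my model is different from the input shape, but I don't know how to fix it. The batch size is set to 16.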

    class Bert_model (nn.Module):
      def __init__(self) :
        super(Bert_model,self).__init__()
        self.bert =  BertModel.from_pretrained('bert-base-uncased', return_dict=False)
        self.drop_layer = nn.Dropout(.25)
        self.output = nn.Linear(self.bert.config.hidden_size,1)
    
      def forward(self,input_ids,attention_mask):
        _,o2 = self.bert (input_ids =input_ids , attention_mask = attention_mask )
        o2 = self.drop_layer(o2)
        return self.output(o2)

    model = Bert_model()
    
    loss_fn = nn.BCELoss().to(device)

    def train_epoch(
      model, 
      data_loader, 
      loss_fn, 
      optimizer, 
      device, 
      n_examples
    ):
      model = model.train()
    
      losses = []
      correct_predictions = 0
      
      for d in data_loader:
        input_ids = d["input_ids"].to(device)
        attention_mask = d["attention_mask"].to(device)
        targets = d["target"].to(device)
    
        input_ids = input_ids.view(BATCH_SIZE,-1)
        attention_mask = attention_mask.view(BATCH_SIZE,-1)
    
        outputs = model(
          input_ids=input_ids,
          attention_mask=attention_mask
        )
    
        _, preds = torch.max(outputs, dim=1)
    
        targets = targets.unsqueeze(-1)
        loss = loss_fn(F.softmax(outputs,dim=1), targets)
    
        correct_predictions += torch.sum(preds == targets)
        losses.append(loss.item())
    
        loss.backward()
        nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
        optimizer.zero_grad()
    
      return correct_predictions.double() / n_examples, np.mean(losses)

Error:

    /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in binary_cross_entropy(input, target, weight, size_average, reduce, reduction)
       2913         weight = weight.expand(new_size)
       2914 
    -> 2915     return torch._C._nn.binary_cross_entropy(input, target, weight, reduction_enum)
       2916 
       2917 

    ValueError: Using a target size (torch.Size([2, 1])) that is different to the input size (torch.Size([16, 1])) is deprecated

【Comments】:

Can you add the stack trace of the error?
I have added it to the post.

Tags: deep-learning pytorch

【Answer 1】:

Judging from the stack trace, the error occurs in the BCELoss computation, because outputs.shape is (16, 1) while targets.shape is (2, 1).
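The mismatch can be reproduced in isolation, without BERT. The following is a minimal sketch in which random tensors stand in for the model output and the labels; the shapes are taken from the error message and the variable names are only illustrative.

```python
import torch
import torch.nn as nn

loss_fn = nn.BCELoss()

# Stand-ins for the tensors in the traceback: 16 predictions, but only 2 labels.
outputs = torch.rand(16, 1)                    # values in [0, 1), as BCELoss expects
targets = torch.randint(0, 2, (2, 1)).float()

try:
    loss_fn(outputs, targets)
    error_message = None
except ValueError as exc:
    # BCELoss refuses to broadcast a target whose shape differs from the input.
    error_message = str(exc)

print(error_message)

# With one label per prediction, the same loss evaluates to a scalar.
matching_targets = torch.randint(0, 2, (16, 1)).float()
loss = loss_fn(outputs, matching_targets)
print(loss.shape)
```

So the loss itself is behaving as designed: somewhere before the loss call, the batch dimension of the model input and the batch dimension of the targets have diverged.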

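I see one major problem in your code: BCELoss is meant for comparing probability distributions (check the docs), but your model's output has shape (n, 1), where n is the batch size (16 in your case). Indeed, in the return statement of forward you pass o2 through a linear layer whose output size is 1.

The Question Pairs dataset is a binary classification task, so you need to turn your output into a probability distribution, for example by applying a Sigmoid, or by setting the linear layer's output size to 2 and then applying a softmax.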
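Both options can be sketched as follows. This is only an illustration, not the asker's model: a random tensor stands in for BERT's pooled output, and `hidden_size = 768` is assumed to match `bert-base-uncased`. The sketch also shows that a softmax over a single column always yields 1.0, so `F.softmax(outputs, dim=1)` carries no information when the output shape is (n, 1).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
batch_size, hidden_size = 16, 768              # 768 is assumed for bert-base-uncased
pooled = torch.randn(batch_size, hidden_size)  # stand-in for BERT's pooled output
labels = torch.randint(0, 2, (batch_size,))    # 0/1 class labels, dtype int64

# A softmax over a single column normalises every row to exactly 1.0.
single_logit = nn.Linear(hidden_size, 1)(pooled)   # shape (16, 1)
softmaxed = F.softmax(single_logit, dim=1)

# Option A: one output unit with a sigmoid.
# BCEWithLogitsLoss applies the sigmoid internally and expects float targets
# with the same shape as the logits.
loss_a = nn.BCEWithLogitsLoss()(single_logit, labels.float().unsqueeze(-1))
preds_a = (torch.sigmoid(single_logit) > 0.5).long().squeeze(-1)

# Option B: two output units with a softmax.
# CrossEntropyLoss applies log-softmax internally and expects integer class
# indices of shape (batch,).
two_logits = nn.Linear(hidden_size, 2)(pooled)     # shape (16, 2)
loss_b = nn.CrossEntropyLoss()(two_logits, labels)
preds_b = two_logits.argmax(dim=1)

print(softmaxed.unique())
print(loss_a.item(), loss_b.item())
print(preds_a.shape, preds_b.shape)
```

For the same reason, `torch.max(outputs, dim=1)` on a (n, 1) tensor always returns index 0, so predictions for a single output unit should come from thresholding the sigmoid as in option A.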

【Discussion】:

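You could also switch from BCELoss to CrossEntropyLoss, which can be used for binary classification as well.
I changed the loss function to BCEWithLogitsLoss, which applies a sigmoid to the output, and removed the softmax. The problem still exists, but now because the target size is (10, 1), which differs from the input size (16, 1).
It is hard to pinpoint the error from your code. Given that 16 is the correct batch size, double-check where your target size changes from 16 to 10. Also, please avoid changing the body of the question, otherwise the answers will no longer make sense.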
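One way to follow that last suggestion is to compare the leading dimension of every tensor right before the loss call. The sketch below also illustrates one possible cause, which this thread does not confirm: the training loop reshapes with a hard-coded `BATCH_SIZE`, so a smaller final batch can be silently regrouped into 16 rows while the targets keep their real size. The sequence length, the extra middle dimension of `input_ids`, and the helper `check_batch` are all assumptions made for illustration.

```python
import torch

BATCH_SIZE = 16
SEQ_LEN = 128  # hypothetical maximum sequence length

# Hypothetical final batch: the dataset size is not a multiple of BATCH_SIZE,
# so only 2 samples remain. The extra middle dimension is an assumption about
# how the tokenizer output was stored in the dataset.
input_ids = torch.zeros(2, 1, SEQ_LEN, dtype=torch.long)
targets = torch.zeros(2)

# Hard-coded batch size: silently regroups 2 * 128 tokens into 16 rows.
reshaped_fixed = input_ids.view(BATCH_SIZE, -1)

# Using the actual batch size keeps one row per sample.
reshaped_actual = input_ids.view(input_ids.size(0), -1)

print(reshaped_fixed.shape)   # torch.Size([16, 16])
print(reshaped_actual.shape)  # torch.Size([2, 128])


def check_batch(model_inputs: torch.Tensor, targets: torch.Tensor) -> None:
    """Raise early if inputs and targets disagree on the batch dimension."""
    if model_inputs.size(0) != targets.size(0):
        raise ValueError(
            f"batch mismatch: inputs {tuple(model_inputs.shape)}, "
            f"targets {tuple(targets.shape)}"
        )


check_batch(reshaped_actual, targets)  # passes: 2 rows, 2 targets
```

If this is indeed what happens, replacing `BATCH_SIZE` with `input_ids.size(0)` in the `view` calls would keep inputs and targets aligned. Passing `drop_last=True` to the `DataLoader` is another way to avoid a smaller final batch, at the cost of skipping a few samples per epoch.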