【Question Title】:one of the variables needed for gradient computation has been modified by an inplace operation: can't find inplace operation
【Posted】:2020-11-11 07:30:16
【Question】:

I have the code below, but I can't find the in-place operation that is breaking the gradient computation.

for epoch in range(nepoch):
    model.train()
    scheduler.step()

    for batch1 in loader1:
        torch.ones(len(batch1[0]), dtype=torch.float)
        x, label = batch1
        x = x.to('cuda', non_blocking=True)  # was x1, which is never defined in this snippet
        optimizer.zero_grad()
        pred = model(x)
        pred = pred.squeeze() if pred.ndimension() > 1 else pred
        label = (label.float()).cuda(cuda0)
        weights = torch.ones(len(label))
        loss_fun = torch.nn.BCEWithLogitsLoss(weight=weights.cuda(cuda0))
        score = loss_fun(pred, label)
        label = np.array(np.round(label.cpu().detach())).astype(bool)
        pred = np.array(pred.cpu().detach()>0).astype(bool)
        torch.autograd.set_detect_anomaly(True)

        score.backward()
        optimizer.step()

In the end I get this error:

Warning: Error detected in MulBackward0. Traceback of forward call that caused the error:
File "train.py", line 98, in <module>
  pred = model(x)
File "/home/anatole2/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
  result = self.forward(*input, **kwargs)
File "/home/anatole2/miniconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
  input = module(input)
File "/home/anatole2/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
  result = self.forward(*input, **kwargs)
File "/home/anatole2/best/PCEN_pytorch.py", line 30, in forward
  filtered[i] = filtered[i] + (1-exp(self.log_s)) * filtered[i-1]
(print_stack at /pytorch/torch/csrc/autograd/python_anomaly_mode.cpp:60)
Traceback (most recent call last):
File "train.py", line 116, in <module>
  score.backward()
File "/home/anatole2/miniconda3/lib/python3.7/site-packages/torch/tensor.py", line 198, in backward
  torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home/anatole2/miniconda3/lib/python3.7/site-packages/torch/autograd/__init__.py", line 100, in backward
  allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [16, 1, 80]], which is output 0 of SelectBackward, is at version 378; expected version 377 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

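It would be great if you could help me!

【Comments】:

  • Could you paste the full error traceback into the post instead of referring to it through a link? By the way, welcome to Stack Overflow :-)
  • @Ronald Yes, I hadn't realized it showed up as a link, thanks!

Tags: python pytorch gradient conv-neural-network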


【Answer 1】:

The in-place operation appears to be on this line:

File "/home/anatole2/best/PCEN_pytorch.py", line 30, in forward
  filtered[i] = filtered[i] + (1-exp(self.log_s)) * filtered[i-1]

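Note that it reads the value of filtered[i] and then stores the result back into filtered[i]. That is what in-place means: the new value overwrites the old one.

To fix it, you need to do something like this: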
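To see the failure in isolation, here is a minimal sketch that should reproduce the same kind of error on CPU. The tensor shapes, the scalar `log_s` (standing in for `self.log_s`), and the way `filtered` is initialized are assumptions made for illustration; the real `PCEN_pytorch.py` is not shown in the question.

```python
import torch

# Hypothetical stand-in for the module's learnable parameter self.log_s.
log_s = torch.tensor(-1.0, requires_grad=True)

# A non-leaf tensor that autograd is tracking (assumed initialization).
filtered = torch.rand(5, 3) * torch.exp(log_s)

for i in range(1, filtered.shape[0]):
    # The multiplication saves filtered[i - 1] for the backward pass.
    # The assignment then writes into `filtered`, which bumps the version
    # counter shared by `filtered` and all of its views.
    filtered[i] = filtered[i] + (1 - torch.exp(log_s)) * filtered[i - 1]

error = None
try:
    filtered.sum().backward()
except RuntimeError as exc:
    # Autograd notices that a saved tensor changed after it was recorded.
    error = exc

print(type(error).__name__)
print(filtered._version)  # greater than 0: the tensor was modified in place
```

The saved operand is a slice of `filtered`, so any later write to any row of `filtered` invalidates it, even though the write targets a different row.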

filtered_new = torch.zeros_like(filtered)
...
filtered_new[i] = filtered[i] + (1-exp(self.log_s)) * filtered[i-1]

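What makes this slightly tricky is that the line appears to sit inside a loop (I assume i is the loop counter), and it probably uses the value from the previous iteration. The modified version is no longer in-place, but it may not produce the same result as the original either. So you may have to do something like this: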

filtered_new[i] = filtered[i] + (1-exp(self.log_s)) * filtered_new[i-1]

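It is impossible to fix this properly without seeing more of the code, but basically: look through the module and replace any operation that modifies an existing tensor with one that creates a new tensor to hold the result.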
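One caveat: assigning to filtered_new[i] is itself a write into filtered_new, and a slice such as filtered_new[i-1] shares its version counter, so the recursive variant can still trigger the same error when self.log_s requires a gradient. A way to avoid indexed writes entirely is to keep each step as its own tensor and join them at the end. The sketch below assumes the recursion runs along dimension 0 and that `x` holds the initial contents of `filtered`; the function name `smooth` and the shapes are hypothetical, not taken from the asker's module.

```python
import torch


def smooth(x: torch.Tensor, log_s: torch.Tensor) -> torch.Tensor:
    """Recursive filter along dim 0 with no in-place writes (illustrative)."""
    decay = 1 - torch.exp(log_s)
    steps = [x[0]]  # first row is passed through unchanged
    for i in range(1, x.shape[0]):
        # Each step is a brand-new tensor; nothing already recorded is touched.
        steps.append(x[i] + decay * steps[-1])
    return torch.stack(steps, dim=0)


# Hypothetical stand-ins for self.log_s and the input to the filter.
log_s = torch.tensor(-1.0, requires_grad=True)
x = torch.rand(5, 3)

out = smooth(x, log_s)
out.sum().backward()  # completes without the in-place error
print(log_s.grad)
```

This computes the same values as the original loop, because each step still uses the already-filtered previous row, but autograd sees only freshly created tensors. The cost is a Python list of intermediate rows plus one `torch.stack` call.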
