Efficient way to avoid modifying parameter by inplace operation

【Question title】: Efficient way to avoid modifying parameter by inplace operation
【Posted】: 2019-10-02 16:05:25
【Question】:

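I have a model with noisy linear layers (whose values are sampled from mu and sigma parameters), and I need to create two decorrelated outputs from it.

That means I have something like this: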

model.sample_noise()
output_1 = model(input)

with torch.no_grad():
     model.sample_noise()
     output_2 = model(input)

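sample_noise modifies the weights attached to the model according to a normal distribution.

But in the end this leads to

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

The question is: what is the best way to avoid modifying these parameters? I could deepcopy the model on every iteration and use the copy for the second forward pass, but that does not sound very efficient to me.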
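
A minimal sketch of how this error can arise, assuming sample_noise() overwrites a noise buffer in place. The tensor names follow the noisy layer quoted in the comments; the shapes, the initial values and the use of normal_() are assumptions made for illustration, not the original model's code.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-ins for the tensors of one noisy linear layer.
weight_mu = torch.randn(3, 4, requires_grad=True)
weight_sigma = torch.full((3, 4), 0.1, requires_grad=True)
weight_epsilon = torch.randn(3, 4)  # noise buffer, not trained

x = torch.randn(2, 4)

# First pass: autograd keeps a reference to weight_epsilon, because it is
# needed later to compute the gradient of weight_sigma.
output_1 = F.linear(x, weight_mu + weight_sigma * weight_epsilon)

with torch.no_grad():
    # Assumed behaviour of sample_noise(): resample the buffer in place.
    # no_grad() disables graph recording, but the in-place write is still
    # registered on the tensor that the first graph saved.
    weight_epsilon.normal_()
    output_2 = F.linear(x, weight_mu + weight_sigma * weight_epsilon)

try:
    output_1.sum().backward()
    error_message = None
except RuntimeError as err:
    error_message = str(err)

print(error_message)
```

In this sketch the second pass itself is harmless; what breaks backward() is that the buffer saved by the first pass was overwritten before the gradient was computed.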

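【Comments】:

Where exactly does the error occur? On which line?
In the end, it happens at loss.backward(). Thanks to torch.autograd.set_detect_anomaly(True), I could trace it back to the noisy linear layer; the exact line is: F.linear(inp, self.weight_mu + self.weight_sigma * self.weight_epsilon, self.bias_mu + self.bias_sigma * self.bias_epsilon), which says @987654331 @.
In the context of variational autoencoders, the solution to this problem is known as the "reparameterization trick". It is quite intuitive; see this question on Cross Validated. It is effectively the same as what @Jatentaki proposes.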
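
The reparameterization idea mentioned in the last comment can be sketched as follows: the random weight is written as mu + sigma * eps, with eps drawn as a fresh tensor each time, so nothing that autograd has saved is ever overwritten. The helper name sample_weight and the shapes are hypothetical and chosen only for this illustration.

```python
import torch

torch.manual_seed(0)

mu = torch.zeros(3, 4, requires_grad=True)
sigma = torch.full((3, 4), 0.5, requires_grad=True)


def sample_weight(mu, sigma):
    """Return a noisy weight and the noise used to build it (hypothetical helper)."""
    eps = torch.randn_like(mu)  # new tensor on every call, no in-place write
    return mu + sigma * eps, eps


# Two independent samples built from the same parameters.
w1, eps1 = sample_weight(mu, sigma)
w2, eps2 = sample_weight(mu, sigma)

# Backpropagating through the first sample still works after the second draw.
w1.sum().backward()

# d(w1)/d(mu) is 1 everywhere and d(w1)/d(sigma) is the noise used for w1.
print(torch.equal(mu.grad, torch.ones_like(mu)))
print(torch.allclose(sigma.grad, eps1))
```

Because the randomness lives entirely in eps, the gradients with respect to mu and sigma are ordinary deterministic expressions of that sample.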

Tags: python pytorch

【Solution 1】:

If I understand your problem correctly, you want a linear layer with matrix M and then want to create two outputs

y_1 = (M + μ_1) * x + b
y_2 = (M + μ_2) * x + b

where μ_1, μ_2 ~ P. In my opinion, the easiest way is to create a custom class:

import torch
import torch.nn.functional as F
from torch import nn

class NoisyLinear(nn.Module):
    def __init__(self, n_in, n_out):
        super(NoisyLinear, self).__init__()

        # or any other initialization you want
        self.weight = nn.Parameter(torch.randn(n_out, n_in))
        self.bias = nn.Parameter(torch.randn(n_out))

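    def sample_noise(self):
        # implement your noise generation here; randn_like returns a fresh
        # tensor with the same shape, dtype and device as the weight
        return torch.randn_like(self.weight) * 0.01

    def forward(self, x):
        # draw new noise on every call and add it out of place, so the
        # parameter itself is never modified
        noise = self.sample_noise()
        return F.linear(x, self.weight + noise, self.bias)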

nl = NoisyLinear(4, 3)
x = torch.randn(2, 4)

# each call samples its own noise, so the two outputs differ
y1 = nl(x)
y2 = nl(x)

print(y1, y2)
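
The same pattern could be adapted to the mu/sigma parameterization from the question. The following is only a sketch under assumptions: the class name MuSigmaNoisyLinear, the initial values and the toy loss are hypothetical, and only the attribute names are taken from the line quoted in the comments.

```python
import torch
import torch.nn.functional as F
from torch import nn


class MuSigmaNoisyLinear(nn.Module):  # hypothetical name
    def __init__(self, n_in, n_out, sigma_init=0.1):
        super().__init__()
        # arbitrary initial values, chosen only for this sketch
        self.weight_mu = nn.Parameter(torch.randn(n_out, n_in) * 0.1)
        self.weight_sigma = nn.Parameter(torch.full((n_out, n_in), sigma_init))
        self.bias_mu = nn.Parameter(torch.zeros(n_out))
        self.bias_sigma = nn.Parameter(torch.full((n_out,), sigma_init))

    def forward(self, x):
        # fresh noise tensors per call instead of buffers rewritten in place
        weight_epsilon = torch.randn_like(self.weight_mu)
        bias_epsilon = torch.randn_like(self.bias_mu)
        weight = self.weight_mu + self.weight_sigma * weight_epsilon
        bias = self.bias_mu + self.bias_sigma * bias_epsilon
        return F.linear(x, weight, bias)


torch.manual_seed(0)
model = MuSigmaNoisyLinear(4, 3)
x = torch.randn(2, 4)

output_1 = model(x)          # tracked by autograd
with torch.no_grad():
    output_2 = model(x)      # independent noise, no graph recorded

# toy loss that uses both outputs; gradients flow through output_1 only
loss = (output_1 - output_2).pow(2).mean()
loss.backward()

print(output_2.requires_grad)
print(model.weight_sigma.grad is not None)
```

With this layout there is no separate sample_noise() call to remember and no deepcopy of the model; the cost is that a given noise sample cannot be reused across calls unless it is passed in explicitly.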

【Comments】: