【Question Title】: Can PyTorch's autograd handle torch.cat?
【Posted】: 2018-12-09 00:18:51
【Question Description】:

I am trying to implement a simple neural network that is supposed to learn a grayscale image. The input consists of the 2d indices of a pixel, and the output should be the value of that pixel.

The network is constructed as follows: each neuron is connected to the input (i.e. the indices of the pixel) as well as to the output of every previous neuron. The output is simply the output of the last neuron in this sequence.

This kind of network has been very successful at learning images, for example here.

The problem: In my implementation, the loss function stays between 0.2 and 0.4, depending on the number of neurons, the learning rate and the number of iterations used, which is pretty bad. Furthermore, if you compare the output to what the net was trained on, it just looks like noise. But this is the first time I am using torch.cat inside a network, so I am not sure whether it is the culprit. Can anyone see what I am doing wrong?

from typing import List
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.nn import Linear

class My_Net(nn.Module):
    lin: List[Linear]

    def __init__(self):
        super(My_Net, self).__init__()
        self.num_neurons = 10
        self.lin = nn.ModuleList([nn.Linear(k+2, 1) for k in range(self.num_neurons)])

    def forward(self, x):
        v = x
        recent = torch.Tensor(0)
        for k in range(self.num_neurons):
            recent = F.relu(self.lin[k](v))
            v = torch.cat([v, recent], dim=1)
        return recent

    def num_flat_features(self, x):
        size = x.size()[1:]
        num = 1
        for i in size:
            num *= i
        return num

my_net = My_Net()
print(my_net)

#define a small 3x3 image that the net is supposed to learn
my_image = [[1.0, 1.0, 1.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]] #represents a T-shape
my_image_flat = []    #output of the net is the value of a pixel
my_image_indices = [] #input to the net is the 2d indices of a pixel
for i in range(len(my_image)):
    for j in range(len(my_image[i])):
        my_image_flat.append(my_image[i][j])
        my_image_indices.append([i, j])

#optimization loop
for i in range(100):
    inp = torch.Tensor(my_image_indices)

    out = my_net(inp)

    target = torch.Tensor(my_image_flat)
    criterion = nn.MSELoss()
    loss = criterion(out.view(-1), target)
    print(loss)

    my_net.zero_grad()
    loss.backward()
    optimizer = optim.SGD(my_net.parameters(), lr=0.001)
    optimizer.step()

print("output of current image")
print([[my_net(torch.Tensor([[i,j]])).item() for i in range(3)] for j in range(3)])
print("output of original image")
print(my_image)

【Question Discussion】:

    Tags: python python-3.x neural-network pytorch


    【Solution 1】:

    Yes, torch.cat is backprop-able: gradients flow through it, so you can use it without any problems.
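    To convince yourself of this, here is a minimal self-contained check (my own sketch, not part of the original answer) that gradients propagate through torch.cat:

    ```python
    import torch

    # Two leaf tensors that require gradients.
    a = torch.ones(2, 2, requires_grad=True)
    b = torch.zeros(2, 2, requires_grad=True)

    # Concatenate along dim=1, exactly as in the forward() above.
    c = torch.cat([a, b], dim=1)   # shape (2, 4)

    loss = (c * 2).sum()
    loss.backward()

    # cat just routes the gradient back to each input slice:
    print(a.grad)  # every entry is 2.0
    print(b.grad)  # every entry is 2.0
    ```

    If torch.cat were not differentiable, `backward()` would raise an error or leave `a.grad` as `None`; instead both inputs receive the gradient of the slice they occupy in the concatenated tensor.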

    The problem here is that you define a new optimizer on every iteration. Instead, you should define it once after defining your model.

    After this change, the code works and the loss keeps decreasing. I also added a print-out every 5000 iterations to show the progress.

    from typing import List
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim
    from torch.nn import Linear
    
    class My_Net(nn.Module):
        lin: List[Linear]
    
        def __init__(self):
            super(My_Net, self).__init__()
            self.num_neurons = 10
            self.lin = nn.ModuleList([nn.Linear(k+2, 1) for k in range(self.num_neurons)])
    
        def forward(self, x):
            v = x
            recent = torch.Tensor(0)
            for k in range(self.num_neurons):
                recent = F.relu(self.lin[k](v))
                v = torch.cat([v, recent], dim=1)
            return recent
    
        def num_flat_features(self, x):
            size = x.size()[1:]
            num = 1
            for i in size:
                num *= i
            return num
    
    my_net = My_Net()
    print(my_net)
    
    optimizer = optim.SGD(my_net.parameters(), lr=0.001)
    
    
    
    #define a small 3x3 image that the net is supposed to learn
    my_image = [[1.0, 1.0, 1.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]] #represents a T-shape
    my_image_flat = []    #output of the net is the value of a pixel
    my_image_indices = [] #input to the net is the 2d indices of a pixel
    for i in range(len(my_image)):
        for j in range(len(my_image[i])):
            my_image_flat.append(my_image[i][j])
            my_image_indices.append([i, j])
    
    #optimization loop
    for i in range(50000):
        inp = torch.Tensor(my_image_indices)
    
        out = my_net(inp)
    
        target = torch.Tensor(my_image_flat)
        criterion = nn.MSELoss()
        loss = criterion(out.view(-1), target)
        if i % 5000 == 0:
            print('Iteration:', i, 'Loss:', loss)
    
        my_net.zero_grad()
        loss.backward()
        optimizer.step()
    print('Iteration:', i, 'Loss:', loss)
    
    print("output of current image")
    print([[my_net(torch.Tensor([[i,j]])).item() for i in range(3)] for j in range(3)])
    print("output of original image")
    print(my_image)
    

    Loss output:

    Iteration: 0 Loss: tensor(0.4070)
    Iteration: 5000 Loss: tensor(0.1315)
    Iteration: 10000 Loss: tensor(1.00000e-02 * 8.8275)
    Iteration: 15000 Loss: tensor(1.00000e-02 * 5.6190)
    Iteration: 20000 Loss: tensor(1.00000e-02 * 3.2540)
    Iteration: 25000 Loss: tensor(1.00000e-02 * 1.3628)
    Iteration: 30000 Loss: tensor(1.00000e-03 * 4.4690)
    Iteration: 35000 Loss: tensor(1.00000e-03 * 1.3582)
    Iteration: 40000 Loss: tensor(1.00000e-04 * 3.4776)
    Iteration: 45000 Loss: tensor(1.00000e-05 * 7.9518)
    Iteration: 49999 Loss: tensor(1.00000e-05 * 1.7160)
    
    

    So in this case the loss goes down to 0.000017. I have to admit that your error surface is really rugged: depending on the initial weights, the loss may also converge to a minimum of 0.17, 0.10, and so on. The local minimum it converges to can be very different. So you might try initializing your weights within a smaller range.
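    One way to follow that suggestion, as a hedged sketch (the `init_small` helper and the [-0.1, 0.1] range are my own choices, not taken from the answer), is to re-initialize the Linear layers with `torch.nn.init` right after building the module list:

    ```python
    import torch.nn as nn

    # Hypothetical helper: squeeze the initial weights of every Linear layer
    # into a smaller uniform range; the 0.1 bound is an illustrative guess.
    def init_small(m):
        if isinstance(m, nn.Linear):
            nn.init.uniform_(m.weight, -0.1, 0.1)
            nn.init.zeros_(m.bias)

    # The same layer list as in My_Net above (num_neurons = 10).
    lin = nn.ModuleList([nn.Linear(k + 2, 1) for k in range(10)])
    lin.apply(init_small)  # applies init_small recursively to every submodule
    ```

    `Module.apply` visits every submodule, so the same helper would also work when called as `my_net.apply(init_small)` inside `__init__`.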

    By the way, here is the output without changing where the optimizer is defined:

    Iteration: 0 Loss: tensor(0.5574)
    Iteration: 5000 Loss: tensor(0.5556)
    Iteration: 10000 Loss: tensor(0.5556)
    Iteration: 15000 Loss: tensor(0.5556)
    Iteration: 20000 Loss: tensor(0.5556)
    Iteration: 25000 Loss: tensor(0.5556)
    Iteration: 30000 Loss: tensor(0.5556)
    Iteration: 35000 Loss: tensor(0.5556)
    Iteration: 40000 Loss: tensor(0.5556)
    Iteration: 45000 Loss: tensor(0.5556)
    

    【Discussion】:

    • Thanks for your answer! While defining the optimizer before the loop does make sense in my case, I don't think it changes much. As far as I can tell, SGD doesn't keep state in that sense, so it shouldn't make a difference in the first place. The loss is still in a similar range, and the output of the network doesn't seem to get any closer to the desired output.
    • @flawr Thanks for the feedback. I ran it a couple more times. I cannot confirm your observation that it makes no difference, quite the contrary! In fact, the loss can now almost converge to zero. But depending on the initial weights you set, it may also end up higher. Please check the edit :)
    • Thank you so much for your help! I see now that I really need to read up on the SGD function to understand it better. And I also underestimated the number of iterations needed here, which is why I never got better results. So thanks a lot for your help! It makes such a big difference when you're just starting out :)
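    The point raised in the first comment, that plain SGD carries no per-parameter state while variants such as momentum do, can be checked directly. This is my own illustration, not part of the thread:

    ```python
    import torch
    import torch.optim as optim

    # With momentum, SGD stores a per-parameter momentum buffer, so
    # re-creating the optimizer every iteration would silently discard it.
    p = torch.nn.Parameter(torch.ones(3))
    opt_mom = optim.SGD([p], lr=0.1, momentum=0.9)
    p.sum().backward()
    opt_mom.step()
    print(len(opt_mom.state))    # one state entry: the momentum buffer for p

    # Vanilla SGD (lr only) keeps no per-parameter state at all, which is
    # why re-creating it each iteration behaves the same as creating it once.
    q = torch.nn.Parameter(torch.ones(3))
    opt_plain = optim.SGD([q], lr=0.1)
    q.sum().backward()
    opt_plain.step()
    print(len(opt_plain.state))  # no state entries
    ```

    So for this particular script the placement of the vanilla-SGD optimizer is a matter of style rather than correctness; the large gain in the answer appears to come mainly from running far more iterations.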