【Question title】: Updating weights of a part of a model (nn.Module)
【Posted】: 2021-11-18 09:37:12
【Question description】:

I ran into a problem while building a network loosely based on the CycleGAN architecture.

I keep all of its components in a single nn.Module:

from torch import nn

from classes.EncoderDecoder import EncoderDecoder
from classes.Discriminator import Discriminator

class CycleGAN(nn.Module):
    def __init__(self):
        super(CycleGAN, self).__init__()
        self.encdec1 = EncoderDecoder(encoder_in_channels=3)
        self.encdec2 = EncoderDecoder(encoder_in_channels=3)
        self.disc = Discriminator()
        

    def forward(self, images, images_bw):

        disc_color = self.disc(images) # I want the Discriminator to be trained here
        disc_bw = self.disc(images_bw) # I want the Discriminator to be trained here

        decoded1 = self.encdec1(images_bw) # EncoderDecoder forward pass
        decoded2 = self.encdec2(decoded1)

        decoded_disc = self.disc(decoded1)  # I don't want to train the Discriminator here, 
                                            # only the EncoderDecoder should be trained based
                                            # on this Discriminator's result

        return [disc_color, disc_bw, decoded1, decoded2, decoded_disc]

This is how I initialize the module, the loss functions, and the optimizer:

from torch import float32
from torch.nn import BCELoss, MSELoss
from torch.optim import Adam

c_gan = CycleGAN().to('cuda', dtype=float32, non_blocking=True)

l2_loss = MSELoss().to('cuda', dtype=float32).train()
bce_loss = BCELoss().to('cuda', dtype=float32).train()

optimizer_gan = Adam(c_gan.parameters(), lr=0.00001)

This is how I train the network inside the training loop:

c_gan.zero_grad()
optimizer_gan.zero_grad()

disc_color, disc_bw, decoded1, decoded2, decoded_disc = c_gan(images, images_bw)

loss_true = bce_loss(disc_color, label_true)
loss_false = bce_loss(disc_bw, label_false)
disc_loss = loss_true + loss_false
disc_loss.backward()

decoded_loss = l2_loss(decoded2, images_bw)
decoded_disc_loss = bce_loss(decoded_disc, label_true) # This is where the loss for that Discriminator forward pass is calculated
both_decoded_losses = decoded_loss + decoded_disc_loss
both_decoded_losses.backward()
optimizer_gan.step()

The problem

I do not want to train the Discriminator module based on the EncoderDecoder -> Discriminator forward pass. I do, however, want to train it based on the images -> Discriminator and images_bw -> Discriminator forward passes.

  • Is it possible to achieve this with only one optimizer for my CycleGAN module?
  • Can I freeze the Discriminator during the optimizer's .step()?

Any help would be appreciated.

【Question discussion】:

    Tags: python machine-learning pytorch


    【Solution 1】:

    From PyTorch example: freezing a part of the net (including fine-tuning) - GitHub gist:

    class CycleGAN(nn.Module):
        ...

    c_gan = CycleGAN()

    # freeze every layer of the discriminator
    for param in c_gan.disc.parameters():
        param.requires_grad = False

    【Discussion】:

    • Does this work for my use case? I don't want to freeze the Discriminator right after initializing the network; I actually want to train it based on the first two forward passes through it. I only want to freeze it for the third pass, which receives its input from the EncoderDecoder. I want to train the EncoderDecoder based on the Discriminator's output labels.
    • It sounds like you want to detach, or set requires_grad to False, for the third Discriminator pass - Difference Between Detach and with torch.no_grad(). Hope that helps!
    • Thanks, I'll run some tests and get back to you if I have results today/tomorrow.
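    Following up on the requires_grad suggestion above, one way to get the behaviour described in the question is to backpropagate the discriminator loss from the real passes first, then flip requires_grad off on the discriminator before backpropagating the generator losses: gradients still flow *through* the frozen discriminator into the EncoderDecoder, but the discriminator's own .grad buffers stay untouched, so a single optimizer step works. A minimal sketch with toy stand-in modules (the shapes and losses are illustrative, not the asker's real ones):

```python
import torch
from torch import nn
from torch.optim import Adam

torch.manual_seed(0)

# toy stand-ins for Discriminator / EncoderDecoder (illustrative shapes)
disc = nn.Linear(4, 1)
encdec = nn.Linear(4, 4)
opt = Adam(list(disc.parameters()) + list(encdec.parameters()), lr=1e-5)

images = torch.randn(2, 4)

opt.zero_grad()

# passes 1 & 2: discriminator loss on the real inputs -> trains disc
d_loss = disc(images).mean()
d_loss.backward()
disc_grads = [p.grad.clone() for p in disc.parameters()]

# pass 3: freeze disc so the generator loss only reaches encdec
for p in disc.parameters():
    p.requires_grad_(False)
g_loss = disc(encdec(images)).mean()
g_loss.backward()  # gradients flow through disc into encdec's .grad

# unfreeze for the next iteration, then take the single step
for p in disc.parameters():
    p.requires_grad_(True)
opt.step()
```

    After the second backward, the discriminator's gradients are exactly the ones accumulated from the real passes, while the EncoderDecoder stand-in has received gradients from the adversarial term.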