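Intermediate layer loss calculation for conditional computation: answers

[Question title]: Intermediate Layer loss calculation for conditional Computation
[Posted]: 2021-07-26 10:54:50
[Question description]:

I want to create a custom MLP-based CNN model (multi-scale) made up of several small networks (capsules) that run in parallel. Each of these simple networks is instantiated as a custom layer (conv2d->Flatten->Dense) for one convolution scale (i.e. 3x3, 5x5). The purpose of the capsules is to give the CNN model an awareness of intermediate losses so that the overall global loss is reduced. I have written some rough code, but I cannot work out how to compute the local loss with these capsules correctly. The code is as follows: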

from tensorflow.keras import layers
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Layer

class capsule(tf.keras.layers.Layer):
    def __init__(self):
        super(capsule, self).__init__()
        self.loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
        self.Flatten = tf.keras.layers.Flatten()
        self.conv2D = tf.keras.layers.Conv2D(3, 3, (1, 1), padding='same', activation='relu', name="LocalLoss3x3")
        self.classifier = tf.keras.layers.Dense(10, activation='softmax', name='capsule3Output')

    def call(self, inputs):
        x = self.conv2D(inputs)
        x = self.Flatten(x)
        x = self.classifier(x)
        pred = self(x_train)
        loss = self.loss_fn(pred, y_train)
        #self.add_loss(self.rate * tf.reduce_sum(tf.square(inputs)))
        return loss, x

(x_train, y_train), (x_test, y_test)=  mnist.load_data()
from tensorflow.keras import layers
class SparseMLP(tf.keras.models.Model):

    def __init__(self, output_dim):
        super(SparseMLP, self).__init__()
        self.dense_1 = layers.Dense(1, activation=tf.nn.relu)
        self.capsule = capsule()
        self.dense_2 = layers.Dense(output_dim)

    def call(self, inputs):
        x = self.dense_1(inputs)
        loss, x = self.capsule(inputs)
        return self.dense_2(x)


mlp = SparseMLP(10)
#x_train=x_train.reshape(-1,28,28,1)
y = mlp(x_train)

[Question discussion]:

Tags: python-3.x keras tensorflow2.0 tf.keras mlp

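[Solution 1]:

To include a loss inside a layer, you can use the add_loss method of the tf.keras.layers.Layer class. It takes a loss value and adds it to the global loss defined in compile().

You can call self.add_loss(loss_value) from inside the call method of a custom layer. Losses added in this way get added to the "main" loss during training (the one passed to compile()).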

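So, for your model to take the intermediate layer's loss into account, uncomment the add_loss call and then train the model the way you normally would.

Note that it is perfectly fine to call compile() without declaring a "main" loss: once a loss has been registered through add_loss() in the layer class, the model already has a loss to minimize.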
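As a minimal sketch of the add_loss mechanism described above, the layer below registers the label-free penalty from the commented-out line in the question. The class name ActivityPenalty and the rate value are assumptions made for illustration; they do not appear in the original code.

```python
import tensorflow as tf


class ActivityPenalty(tf.keras.layers.Layer):
    """Pass-through layer that registers a label-free auxiliary loss."""

    def __init__(self, rate=1e-2):
        super().__init__()
        self.rate = rate  # assumed weighting factor for the penalty

    def call(self, inputs):
        # Register the penalty; Keras collects it in `layer.losses`.
        self.add_loss(self.rate * tf.reduce_sum(tf.square(inputs)))
        return inputs


layer = ActivityPenalty(rate=0.5)
outputs = layer(tf.ones((2, 3)))

# The registered loss is available on the layer after the call.
penalty = float(layer.losses[0])
print(penalty)
```

The same list is exposed on a model as model.losses, which gathers the losses registered by all of its layers.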

Also note that the call function of the SparseMLP model should look like this:

       x = self.dense_1(inputs)
       # You may not actually want to pass `inputs` to the capsule here
       # instead of `x`: as written, the output of dense_1 is never used,
       # so make sure each layer receives the input you intend.
       # There is no need to return or handle the loss here; Keras tracks
       # losses registered with add_loss internally.
       x = self.capsule(inputs)
       return self.dense_2(x)

So running your model as shown below should solve the problem:

# Replace the quoted placeholders with your main loss (if there is one) and your metrics.
model.compile(loss="define your main loss if there is one", metrics="define your metrics")
model.fit(x=train_inst, y=train_targets)
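To make explicit how the layer losses combine with the main loss, here is a hand-written training step, given as a sketch rather than as what fit() does internally. It sums the main loss and everything in model.losses before taking gradients. The classes PenalizedDense and TinyModel, the SGD settings and the random data standing in for train_inst and train_targets are all assumptions made for illustration.

```python
import tensorflow as tf


class PenalizedDense(tf.keras.layers.Layer):
    """Dense layer that also registers a label-free local loss (hypothetical)."""

    def __init__(self, units, rate=1e-3):
        super().__init__()
        self.dense = tf.keras.layers.Dense(units, activation="relu")
        self.rate = rate

    def call(self, inputs):
        outputs = self.dense(inputs)
        self.add_loss(self.rate * tf.reduce_sum(tf.square(outputs)))
        return outputs


class TinyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.hidden = PenalizedDense(16)
        self.out = tf.keras.layers.Dense(10)

    def call(self, inputs):
        return self.out(self.hidden(inputs))


model = TinyModel()
main_loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

# Synthetic stand-ins for train_inst / train_targets.
train_inst = tf.random.uniform((32, 20))
train_targets = tf.random.uniform((32,), maxval=10, dtype=tf.int32)


def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x)
        main_loss = main_loss_fn(y, logits)
        # Sum of all losses registered through add_loss in this forward pass.
        local_loss = tf.add_n(model.losses)
        total_loss = main_loss + local_loss
    grads = tape.gradient(total_loss, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    return main_loss, local_loss, total_loss


main_loss, local_loss, total_loss = train_step(train_inst, train_targets)
print(float(main_loss), float(local_loss), float(total_loss))
```

Writing the step out this way makes it easy to log the local and main losses separately, or to weight them differently.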

[Discussion]: