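Keras: stop gradient after a certain layer

【Question title】: Keras: stop gradient after a certain layer
【Posted】: 2018-06-19 15:04:35
【Question description】:

Suppose you have a Keras NN model. How can you stop the gradient during backpropagation after a certain layer?

That is, suppose we have a model with two outputs: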

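# Imports assumed for this snippet (Keras functional API)
from keras.layers import Input, Convolution2D, Activation, Flatten, Dense
from keras.models import Model

input_layer = Input(shape=(10, 10, 3))

# Shared convolutional trunk
x = Convolution2D(...)(input_layer)
x = Activation('relu')(x)

x = Flatten()(x)

# First output branch
x_1 = Dense(64)(x)
x_1 = Dense(32)(x_1)
x_1 = Dense(2)(x_1)

# Second output branch
x_2 = Dense(64)(x)
x_2 = Dense(32)(x_2)
x_2 = Dense(2)(x_2)

model = Model(inputs=input_layer, outputs=[x_1, x_2])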

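How can the gradient of output x_1 be stopped after the x_1 = Dense(64)(x) layer, so that it does not contribute to the weight updates of the convolutional layer?

Based on the answer in "Stopping Gradient back prop through a particular layer in keras", I would add a Lambda layer before the x_1 dense layers, but I'm not sure about it: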

x_1 = Dense(64)(x)
x_1_stop_grad = Lambda(lambda x: K.stop_gradient(x))(x_1)
x_1 = Dense(32)(x_1_stop_grad)
x_1 = Dense(2)(x_1)

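Do I have to add the Lambda layer before or after the first x_1 dense layer?

【Question discussion】:

Do you want the parameters of the convolutional layers not to be updated/trained at all?
They should be updated, but only based on output x_2. So during backpropagation, the gradients from output x_1 should only be used to update the x_1 dense layers.

Tags: keras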

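【Solution 1】:

Because the gradient flows backward through the network, you need to add the gradient-stop layer directly after the layer that should not receive any gradient.

I.e.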

# Imports assumed for this snippet
from keras import backend as K
from keras.layers import Lambda

# weights in x should not be updated by gradients from x_1
x = Convolution2D(...)(input_layer)
x_1_stop_grad = Lambda(lambda x: K.stop_gradient(x))(x)
x_1 = Dense(64)(x_1_stop_grad)
x_1 = Dense(32)(x_1)
...
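
One way to check this placement is to build the two-branch model and look at which weights receive a gradient from each output. The following is only a minimal sketch, assuming TensorFlow 2.x with its bundled tf.keras rather than the standalone Keras used in the question. `tf.stop_gradient` stands in for `K.stop_gradient`, and a small hypothetical `StopGradient` layer stands in for the Lambda layer. The convolution settings (`Conv2D(4, 3)`), the random input batch, the squared-output losses, and the variable names are illustrative assumptions, not part of the original post.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model


class StopGradient(layers.Layer):
    """Hypothetical helper: identity forward, no gradient backward."""

    def call(self, inputs):
        return tf.stop_gradient(inputs)


input_layer = layers.Input(shape=(10, 10, 3))

# Shared trunk; filter count and kernel size are placeholder values.
conv = layers.Conv2D(4, 3, activation="relu")
x = layers.Flatten()(conv(input_layer))

# Branch 1: gradient is stopped right after the shared trunk,
# i.e. before the first dense layer of this branch.
branch_1_dense = layers.Dense(64)
x_1 = branch_1_dense(StopGradient()(x))
x_1 = layers.Dense(32)(x_1)
x_1 = layers.Dense(2)(x_1)

# Branch 2: no stop, so its gradient reaches the convolution.
x_2 = layers.Dense(64)(x)
x_2 = layers.Dense(32)(x_2)
x_2 = layers.Dense(2)(x_2)

model = Model(inputs=input_layer, outputs=[x_1, x_2])

# Dummy batch and dummy losses, used only to inspect gradient flow.
batch = tf.random.normal((8, 10, 10, 3))
with tf.GradientTape(persistent=True) as tape:
    out_1, out_2 = model(batch, training=True)
    loss_1 = tf.reduce_mean(tf.square(out_1))
    loss_2 = tf.reduce_mean(tf.square(out_2))

grad_conv_from_x1 = tape.gradient(loss_1, conv.kernel)
grad_conv_from_x2 = tape.gradient(loss_2, conv.kernel)
grad_dense_from_x1 = tape.gradient(loss_1, branch_1_dense.kernel)
del tape

print("conv kernel gets gradient from x_1:", grad_conv_from_x1 is not None)
print("conv kernel gets gradient from x_2:", grad_conv_from_x2 is not None)
print("branch-1 dense gets gradient from x_1:", grad_dense_from_x1 is not None)
```

If the stop is placed as intended, the gradient of the x_1 loss with respect to the convolution kernel should come back as None, while the x_2 loss and the branch-1 dense layer still yield gradients. Moving `StopGradient` to after `branch_1_dense` would also cut off that dense layer from the x_1 loss, which is not what the question asks for.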

【Discussion】: