TensorFlow 2.2: custom gradient for a custom layer with vector-valued input/output (answers)

【Question title】: Tensorflow 2.2: custom gradient for a vector valued input/output custom layer
【Posted】: 2021-09-09 16:09:23
【Question】:

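I am using TensorFlow 2.2 for research and want to implement a custom layer that takes a vector (tensor) input and outputs a vector (tensor). My input/output relation is complicated, so I need to write a function that computes both the forward pass and the gradient. I came across the custom_gradient function, which does this. Unfortunately, it is not at all clear how to use it with vector or tensor inputs and outputs. In particular, I do not know how to return the Jacobian matrix, or some form of it.

As a simple example, suppose my custom layer computes the vector-matrix product of the input A and the weights W. This is how I am approaching the problem (skipping weight initialization, build, and so on for simplicity).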

@tf.custom_gradient
def custom_op(A,W):  # A is of size (#samples, length of input)
    result = tf.matmul(A,W) # I compute the output tensor.
    def custom_grad(dy):
        grad = ... # I have no idea what exactly grad stores, mathematically speaking
        return grad
    return result, custom_grad

class CustomLayer(tf.keras.layers.Layer):
    def __init__(self):
        super(CustomLayer, self).__init__()

    def call(self, A):
        return custom_op(A, self.W)  # assuming self.W are the weights

Any help would be greatly appreciated. Thanks!

【Comments on the question】:

Tags: python tensorflow keras

【Answer 1】:

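If you do not need a custom gradient computation, I don't think you have to define your own custom_grad function.

I would use tf.GradientTape outside the layer call. The gradient tape watches the trainable variables and the operations performed on them; tape.gradient then computes the gradients on its own. The following code snippet is adapted from this tutorial: https://www.tensorflow.org/guide/autodiff.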

layer = CustomLayer()
x = tf.constant([[1., 2., 3.]])

with tf.GradientTape() as tape:
  # Forward pass
  y = layer(x)
  loss = tf.reduce_mean(y**2)

# Calculate gradients with respect to every trainable variable
grad = tape.gradient(loss, layer.trainable_variables)

You can then train your model by applying the computed gradients with the apply_gradients function of an optimizer instance.

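# TF 2.x optimizer; learning_rate is assumed to be defined elsewhere
optimizer = tf.keras.optimizers.Adam(learning_rate)
# apply_gradients expects (gradient, variable) pairs
optimizer.apply_gradients(zip(grad, layer.trainable_variables))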
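
Putting the pieces above together, a self-contained sketch of one training step might look like the following. The `units` argument, the `build` method and the learning rate of 0.01 are assumptions added only to make the snippet runnable; the question skips weight initialization, so this is not the asker's actual layer.

```python
import tensorflow as tf


class CustomLayer(tf.keras.layers.Layer):
    def __init__(self, units):
        super().__init__()
        self.units = units  # hypothetical output size

    def build(self, input_shape):
        # Create the weight matrix W with shape (n_in, units).
        self.W = self.add_weight(
            name="W",
            shape=(int(input_shape[-1]), self.units),
            initializer="glorot_uniform",
            trainable=True,
        )

    def call(self, A):
        return tf.matmul(A, self.W)


layer = CustomLayer(units=2)
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
x = tf.constant([[1., 2., 3.]])

with tf.GradientTape() as tape:
    # Forward pass, recorded by the tape
    y = layer(x)
    loss = tf.reduce_mean(y ** 2)

# One gradient per trainable variable, in the same order
grads = tape.gradient(loss, layer.trainable_variables)
# Pair each gradient with its variable and take one optimizer step
optimizer.apply_gradients(zip(grads, layer.trainable_variables))
```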

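【Comments】:

Thanks for your reply. However, I do need my own gradient function, because automatic differentiation does not work for the problem I am interested in. So as far as I understand, tape.gradient computes the gradient corresponding to the forward pass, assuming all variables are continuous and unbounded, right? That is not what I want; I want to customize my gradient.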
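
For the matmul example in the question, a minimal sketch of what custom_grad could return is given below. It relies on two things: `dy` is the upstream gradient of the loss with respect to `result` (same shape as `result`), and the gradient function returns one vector-Jacobian product per input, in argument order, rather than the full Jacobian. For result = A W this works out to dA = dy Wᵀ and dW = Aᵀ dy. The test values and the comparison against autodiff are illustrative assumptions, not part of the original post.

```python
import tensorflow as tf


@tf.custom_gradient
def custom_op(A, W):
    # Forward pass: A is (n_samples, n_in), W is (n_in, n_out).
    result = tf.matmul(A, W)

    def custom_grad(dy):
        # dy = dL/d(result), shape (n_samples, n_out).
        # Return dL/dA and dL/dW, each shaped like its input.
        grad_A = tf.matmul(dy, W, transpose_b=True)  # (n_samples, n_in)
        grad_W = tf.matmul(A, dy, transpose_a=True)  # (n_in, n_out)
        return grad_A, grad_W

    return result, custom_grad


# Illustrative values only
A = tf.constant([[1., 2., 3.], [4., 5., 6.]])
W = tf.constant([[1., 0.], [0., 1.], [1., 1.]])

with tf.GradientTape(persistent=True) as tape:
    tape.watch([A, W])
    loss_custom = tf.reduce_sum(custom_op(A, W) ** 2)
    loss_auto = tf.reduce_sum(tf.matmul(A, W) ** 2)

# Gradients from the hand-written rule and from autodiff, for comparison
grad_custom = tape.gradient(loss_custom, [A, W])
grad_auto = tape.gradient(loss_auto, [A, W])
```

Here the two gradient lists should agree, since the hand-written rule is just the standard matmul derivative. For a forward pass that autodiff cannot handle, the body of custom_grad would be replaced with whatever rule is wanted, as long as each returned tensor has the shape of the corresponding input.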