How can I compute the gradient w.r.t. a non-variable in TensorFlow's eager execution mode?

[Question title]: How can I compute the gradient w.r.t. a non-variable in TensorFlow's eager execution mode?
[Posted]: 2018-11-20 06:27:38
[Question description]:

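I am trying to compute the gradient of my model's loss with respect to its input in order to create an adversarial example. Since the model's input is not trainable, I need to compute the gradient with respect to a tensor rather than a variable. However, I find that TensorFlow's GradientTape returns None gradients if the tensor is not a trainable variable: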

import numpy as np
import tensorflow as tf

tf.enable_eager_execution()

a = tf.convert_to_tensor(np.array([1., 2., 3.]), dtype=tf.float32)
b = tf.constant([1., 2., 3.])
c = tf.Variable([1., 2., 3.], trainable=False)
d = tf.Variable([1., 2., 3.], trainable=True)

with tf.GradientTape() as tape:
    result = a + b + c + d

grads = tape.gradient(result, [a, b, c, d])

print(grads)

This prints:

[None, None, None, <tf.Tensor: id=26, shape=(3,), dtype=float32, numpy=array([1., 1., 1.], dtype=float32)>]

I went through TensorFlow's Eager Execution tutorial and Eager Execution guide, but could not find a way to compute the gradient w.r.t. a tensor.

[Question comments]:

Tags: python tensorflow

[Answer 1]:

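The tf.GradientTape documentation reveals the simple solution:

Trainable variables (created by tf.Variable or tf.get_variable, where trainable=True is the default in both cases) are automatically watched. Tensors can be manually watched by invoking the watch method on this context manager.

In this case,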

with tf.GradientTape() as tape:
    tape.watch(a)
    tape.watch(b)
    tape.watch(c)
    result = a + b + c + d

grads = tape.gradient(result, [a, b, c, d])

will make print(grads) output:

[<tf.Tensor: id=26, shape=(3,), dtype=float32, numpy=array([1., 1., 1.], dtype=float32)>, 
 <tf.Tensor: id=26, shape=(3,), dtype=float32, numpy=array([1., 1., 1.], dtype=float32)>, 
 <tf.Tensor: id=26, shape=(3,), dtype=float32, numpy=array([1., 1., 1.], dtype=float32)>, 
 <tf.Tensor: id=26, shape=(3,), dtype=float32, numpy=array([1., 1., 1.], dtype=float32)>]
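
Applied to the use case in the question (the gradient of a loss with respect to the model input), the same tape.watch call can be sketched as follows. This is a minimal illustration under stated assumptions, not the asker's actual setup: the fixed linear "model" w, the squared-error loss, the step size epsilon, and the sign-of-gradient perturbation are all hypothetical stand-ins chosen so the numbers are easy to check by hand.

```python
import tensorflow as tf

# On TF 1.x eager execution must be switched on explicitly;
# on TF 2.x this attribute does not exist and eager mode is the default.
if hasattr(tf, "enable_eager_execution"):
    tf.enable_eager_execution()

# Hypothetical stand-in for a trained model: a fixed linear map (assumption).
w = tf.constant([[1.0], [-2.0], [0.5]])

x = tf.constant([[1.0, 2.0, 3.0]])   # model input: a plain tensor, not a variable
y_true = tf.constant([[0.0]])        # hypothetical target

with tf.GradientTape() as tape:
    tape.watch(x)                    # without this, the gradient w.r.t. x is None
    y_pred = tf.matmul(x, w)         # 1*1 + 2*(-2) + 3*0.5 = -1.5
    loss = tf.reduce_mean(tf.square(y_pred - y_true))

# d(loss)/dx = 2 * (y_pred - y_true) * w^T = [[-3., 6., -1.5]]
grad_x = tape.gradient(loss, x)

# One possible perturbation (assumption): step along the sign of the gradient.
epsilon = 0.1
x_adv = x + epsilon * tf.sign(grad_x)

print(grad_x.numpy())
print(x_adv.numpy())
```

Only x is watched here because it is the only tensor whose gradient is needed; w is a constant and is deliberately left unwatched.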

[Comments]: