Answers: TensorFlow gradient through while_loop

【Question title】: TensorFlow gradient through while_loop
【Posted】: 2018-11-04 05:03:04
【Question】:

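I have a TensorFlow model in which the output of one layer is a 2-D tensor, say t = [[1,2], [3,4]].

The next layer expects an input made up of every pairwise combination of that tensor's rows. That is, I need to turn it into t_new = [[1,2,1,2], [1,2,3,4], [3,4,1,2], [3,4,3,4]].

So far I have tried:

1) tf.unstack(t, axis=0), looping over the rows and appending each combination to a buffer, then t_new = tf.stack(buffer, axis=0). This works except when the shape is unspecified, i.e. None, so...

2) I used tf.while_loop to generate the indices idx=[[0,0], [0,1], [1,0], [1,1]] and then computed t_new = tf.gather(t, idx). My question is: should I set back_prop to True or False in this tf.while_loop? I only generate indices inside the loop. I don't know what back_prop means.

Also, do you know a better way to achieve what I need?

Here is the while_loop: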

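import tensorflow as tf

# sentence_len is defined elsewhere in the model (presumably the number of rows of t).
i = tf.constant(0)
j = tf.constant(0)
# Start from an empty int32 tensor; the index buffer is not a trainable variable.
idx = tf.constant([], dtype=tf.int32)

def body(i, j, idx):
    # Append the current (i, j) pair to the flat index buffer.
    c = tf.concat([idx, [i, j]], axis=0)
    # Advance j; at the end of a row, move to the next i and reset j.
    i, j = tf.cond(tf.equal(j, sentence_len - 1),
                   lambda: (i + 1, 0),
                   lambda: (i, j + 1))
    return i, j, c

# idx grows on every iteration, so its shape invariant has to be [None].
_, _, indices = tf.while_loop(lambda i, j, _: tf.less(i, sentence_len),
                              body,
                              [i, j, idx],
                              shape_invariants=[i.get_shape(),
                                                j.get_shape(),
                                                tf.TensorShape([None])])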

Now I can do t_new = tf.gather(t, indices).

But I am very confused about what back_prop in tf.while_loop means, in general and especially here.
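For reference, the index construction and gather step described above can be sketched without an explicit loop. This is only an illustration and not part of the original question: it assumes TensorFlow 2.x in eager mode, and the names rows, left, right, and pairs are made up for the example. Note that when the indices are a flat vector such as [0, 0, 0, 1, 1, 0, 1, 1], tf.gather returns one row of t per index, so a reshape is needed to get the [n*n, 2*d] layout of t_new; the same holds for the [n*n, 2] index layout used here.

```python
import tensorflow as tf

t = tf.constant([[1.0, 2.0], [3.0, 4.0]])
n = tf.shape(t)[0]                 # dynamic number of rows

rows = tf.range(n)                 # [0, 1]
left = tf.repeat(rows, n)          # [0, 0, 1, 1]
right = tf.tile(rows, [n])         # [0, 1, 0, 1]
pairs = tf.stack([left, right], axis=1)   # [[0, 0], [0, 1], [1, 0], [1, 1]]

gathered = tf.gather(t, pairs)             # shape [n*n, 2, d]
t_new = tf.reshape(gathered, [n * n, -1])  # shape [n*n, 2*d]
```

Because n is read with tf.shape, this sketch does not rely on the static shape of t being known.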

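【Question comments】:

Tags: python tensorflow machine-learning

【Solution 1】:

In this case you can set back_prop to False. There is no need to backpropagate through the computation of the indices, because that computation does not depend on any learned variables.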

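【Comments】:

Even if t is the output of an actual layer, computed with a differentiable function?
You are not stopping backpropagation through t. You are only stopping it through the part of the graph defined inside the while loop.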
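The point made in this exchange can be checked with a small sketch. It is an assumption-laden illustration, not the asker's model: it uses TensorFlow 2.x with tf.GradientTape, a hypothetical variable w standing in for learned weights, and a hard-coded integer index vector in place of the while_loop output. The indices are integers and carry no gradient, yet the gradient still reaches w through the values that tf.gather selects.

```python
import tensorflow as tf

w = tf.Variable([[1.0, 2.0], [3.0, 4.0]])        # hypothetical learned weights
indices = tf.constant([0, 0, 0, 1, 1, 0, 1, 1])  # integer indices, no gradient

with tf.GradientTape() as tape:
    t = w * 2.0                                  # differentiable "layer output"
    t_new = tf.reshape(tf.gather(t, indices), [4, 4])
    loss = tf.reduce_sum(t_new)

# Densify in case the gradient comes back as IndexedSlices.
grad = tf.convert_to_tensor(tape.gradient(loss, w))
```

Each row of t is selected four times and t = 2 * w, so every entry of grad is 8.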

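【Solution 2】:

It depends on the context. If you are indexing into features produced by a differentiable function, then you need backpropagation. However, if you are indexing into an input placeholder or some other kind of input data, you can leave it as False, as @Aaron said.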

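【Comments】:

What about when using tf.unstack? Are gradients propagated correctly?
Yes, tf.unstack and tf.stack are always tracked by tf.
Oh, OK. However, in my case it is probably still safe to turn off back_prop in the while loop. I think tf.gather(t, indices) should take care of it; what do you think?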
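The claim about tf.unstack and tf.stack can be illustrated with a minimal sketch of the first approach from the question. It assumes TensorFlow 2.x in eager mode and a statically known number of rows (tf.unstack needs that, which is the limitation the asker ran into); w is a hypothetical stand-in for learned weights.

```python
import itertools
import tensorflow as tf

w = tf.Variable([[1.0, 2.0], [3.0, 4.0]])  # hypothetical learned weights

with tf.GradientTape() as tape:
    t = w * 1.0                            # differentiable "layer output"
    rows = tf.unstack(t, axis=0)           # needs a static row count
    # Concatenate every ordered pair of rows, including a row with itself.
    buffer = [tf.concat([a, b], axis=0)
              for a, b in itertools.product(rows, repeat=2)]
    t_new = tf.stack(buffer, axis=0)
    loss = tf.reduce_sum(t_new)

grad = tape.gradient(loss, w)
```

Each row of t appears four times in t_new, so every entry of grad is 4, which shows the gradient passing through both tf.unstack and tf.stack.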