TensorFlow's variable_scope() and tf.AUTO_REUSE will not reuse variables in a for loop

【Question title】: Tensorflow's variable_scope() and tf.AUTO_REUSE will not reuse variables in a for loop
【Posted】: 2019-05-23 03:07:12
【Question】:

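I want to pass several different inputs through a reusable TensorFlow architecture (a decoder). To do this, I use a for loop in which I feed each input to the model. However, instead of reusing the layer variables, it seems to create new variables on every loop iteration. Consider this code: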

import tensorflow as tf

for i in range(5):
    decoder(input=input, is_training=is_training)

where the decoder is:

def decoder(self, input, is_training):

    with tf.variable_scope("physics", reuse=tf.AUTO_REUSE):
         latent = tf.expand_dims(latent, axis=1)
         latent = tf.expand_dims(latent, axis=1)

         x = latent

         """ Layer 1 """
         x = tf.layers.conv2d_transpose(x, filters=256, kernel_size=2, strides=1, activation='relu', padding='valid', name="transpose1_1", reuse=tf.AUTO_REUSE)
         x = tf.layers.batch_normalization(x, training=is_training, name="transpose_bn_1_1")

         """ Layer 2 """
         x = tf.layers.conv2d_transpose(x, filters=256, kernel_size=2, strides=2, activation='relu', padding='valid', name="transpose1_2", reuse=tf.AUTO_REUSE)
         x = tf.layers.batch_normalization(x, training=is_training, name="transpose_bn_1_2")

         ...

If I now print the graph contents right after the loop using

from pprint import pprint
pprint([n.name for n in tf.get_default_graph().as_graph_def().node])

I get the following output, which suggests that I am not sharing my variables between loop iterations:

 'physics/transpose1_1/kernel/Initializer/random_uniform/shape',
 'physics/transpose1_1/kernel/Initializer/random_uniform/min',
 'physics/transpose1_1/kernel/Initializer/random_uniform/max',
 'physics/transpose1_1/kernel/Initializer/random_uniform/RandomUniform',
 'physics/transpose1_1/kernel/Initializer/random_uniform/sub',
 'physics/transpose1_1/kernel/Initializer/random_uniform/mul',
 'physics/transpose1_1/kernel/Initializer/random_uniform',
 'physics/transpose1_1/kernel',
 'physics/transpose1_1/kernel/Assign',
 'physics/transpose1_1/kernel/read',
 'physics/transpose1_1/bias/Initializer/zeros',
 'physics/transpose1_1/bias',
 'physics/transpose1_1/bias/Assign',
 'physics/transpose1_1/bias/read',
 'physics/transpose1_1/Shape',
 'physics/transpose1_1/strided_slice/stack',
 'physics/transpose1_1/strided_slice/stack_1',
 'physics/transpose1_1/strided_slice/stack_2',
 'physics/transpose1_1/strided_slice',
 'physics/transpose1_1/strided_slice_1/stack',
 'physics/transpose1_1/strided_slice_1/stack_1',
 'physics/transpose1_1/strided_slice_1/stack_2',
 'physics/transpose1_1/strided_slice_1',
 'physics/transpose1_1/strided_slice_2/stack',
 'physics/transpose1_1/strided_slice_2/stack_1',
 'physics/transpose1_1/strided_slice_2/stack_2',
 'physics/transpose1_1/strided_slice_2',
 'physics/transpose1_1/mul/y',
 'physics/transpose1_1/mul',
 'physics/transpose1_1/add/y',
 'physics/transpose1_1/add',
 'physics/transpose1_1/mul_1/y',
 'physics/transpose1_1/mul_1',
 'physics/transpose1_1/add_1/y',
 'physics/transpose1_1/add_1',
 'physics/transpose1_1/stack/3',
 'physics/transpose1_1/stack',
 'physics/transpose1_1/conv2d_transpose',
 'physics/transpose1_1/BiasAdd',
 'physics/transpose1_1/Relu',
 ...
 'physics_4/transpose1_1/Shape',
 'physics_4/transpose1_1/strided_slice/stack',
 'physics_4/transpose1_1/strided_slice/stack_1',
 'physics_4/transpose1_1/strided_slice/stack_2',
 'physics_4/transpose1_1/strided_slice',
 'physics_4/transpose1_1/strided_slice_1/stack',
 'physics_4/transpose1_1/strided_slice_1/stack_1',
 'physics_4/transpose1_1/strided_slice_1/stack_2',
 'physics_4/transpose1_1/strided_slice_1',
 'physics_4/transpose1_1/strided_slice_2/stack',
 'physics_4/transpose1_1/strided_slice_2/stack_1',
 'physics_4/transpose1_1/strided_slice_2/stack_2',
 'physics_4/transpose1_1/strided_slice_2',
 'physics_4/transpose1_1/mul/y',
 'physics_4/transpose1_1/mul',
 'physics_4/transpose1_1/add/y',
 'physics_4/transpose1_1/add',
 'physics_4/transpose1_1/mul_1/y',
 'physics_4/transpose1_1/mul_1',
 'physics_4/transpose1_1/add_1/y',
 'physics_4/transpose1_1/add_1',
 'physics_4/transpose1_1/stack/3',
 'physics_4/transpose1_1/stack',
 'physics_4/transpose1_1/conv2d_transpose',
 'physics_4/transpose1_1/BiasAdd',
 'physics_4/transpose1_1/Relu',

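What is going on here? Shouldn't the tf.AUTO_REUSE flag let me initialize my decoder on the first iteration (i == 0) and then reuse my variables on all iterations with i > 0? The above happens for every layer in my decoder.

I am using TensorFlow version 1.12.0.

Thanks.

【Comments】:

Tags: python tensorflow scope conv-neural-network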

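【Answer 1】:

You are already reusing the variables in the for loop. A graph node is not the same thing as a Variable. The following example has several nodes but only one Variable.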

import tensorflow as tf

a = tf.Variable([2.0],name='a')
b = a+1
print([n.name for n in tf.get_default_graph().as_graph_def().node])

['a/initial_value', 'a', 'a/Assign', 'a/read', 'add/y', 'add']

You should use a different way to inspect the variables in your code.

1. Add if "Variable" in n.op to the end of the comprehension

print([n.name for n in tf.get_default_graph().as_graph_def().node if "Variable" in n.op])

['a']

2. Use tf.global_variables().

print(tf.global_variables())

[<tf.Variable 'a:0' shape=(1,) dtype=float32_ref>]

So in your code you should do the following:

import tensorflow as tf

def decoder(latent, is_training):
    with tf.variable_scope("physics", reuse=tf.AUTO_REUSE):
        x = latent
        """ Layer 1 """
        x = tf.layers.conv2d_transpose(x, filters=256, kernel_size=2, strides=1, activation='relu', padding='valid', name="transpose1_1", reuse=tf.AUTO_REUSE)
        x = tf.layers.batch_normalization(x, training=is_training, name="transpose_bn_1_1")
        """ Layer 2 """
        x = tf.layers.conv2d_transpose(x, filters=256, kernel_size=2, strides=2, activation='relu', padding='valid', name="transpose1_2", reuse=tf.AUTO_REUSE)
        x = tf.layers.batch_normalization(x, training=is_training, name="transpose_bn_1_2")

for i in range(5):
    decoder(latent=tf.ones(shape=[64,7,7,256]) , is_training=True)

print([n.name  for n in tf.get_default_graph().as_graph_def().node if "Variable" in n.op])
# print(tf.global_variables())

['physics/transpose1_1/kernel', 'physics/transpose1_1/bias', 'physics/transpose_bn_1_1/gamma', 'physics/transpose_bn_1_1/beta', 'physics/transpose_bn_1_1/moving_mean', 'physics/transpose_bn_1_1/moving_variance', 'physics/transpose1_2/kernel', 'physics/transpose1_2/bias', 'physics/transpose_bn_1_2/gamma', 'physics/transpose_bn_1_2/beta', 'physics/transpose_bn_1_2/moving_mean', 'physics/transpose_bn_1_2/moving_variance']
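
The 'physics_4/...' names in the question fit this picture: as far as I understand it, every time the "physics" variable scope is re-entered, the operation nodes get a fresh, uniquified name prefix (physics, physics_1, ...), while the variables are always looked up under the plain scope name. Below is a toy model of that naming behavior in plain Python. It is only a sketch: ToyGraph, open_scope, add_op and toy_decoder are hypothetical names invented for illustration, not TensorFlow APIs and not how TensorFlow is actually implemented.

```python
# Toy model only: mimics the naming behavior, not TensorFlow's real code.


class ToyGraph:
    def __init__(self):
        self.nodes = []        # every node name ever added to the graph
        self.variables = {}    # variable name -> value, shared by name
        self._scope_uses = {}  # how many times each scope name was opened

    def open_scope(self, name):
        """Return a unique prefix for operation nodes: physics, physics_1, ..."""
        count = self._scope_uses.get(name, 0)
        self._scope_uses[name] = count + 1
        return name if count == 0 else "{}_{}".format(name, count)

    def get_variable(self, name):
        """Create the variable on first use; later calls find the existing one."""
        if name not in self.variables:
            self.variables[name] = 0.0
            self.nodes.append(name)  # a variable contributes its node only once
        return name

    def add_op(self, op_scope, name):
        """Operation nodes are always new, under the uniquified prefix."""
        self.nodes.append("{}/{}".format(op_scope, name))


def toy_decoder(graph):
    op_scope = graph.open_scope("physics")
    # Variables use the plain scope name, so they are found again on each call.
    graph.get_variable("physics/transpose1_1/kernel")
    graph.get_variable("physics/transpose1_1/bias")
    # Operation nodes use the uniquified prefix, so each call adds new ones.
    graph.add_op(op_scope, "transpose1_1/conv2d_transpose")
    graph.add_op(op_scope, "transpose1_1/BiasAdd")


graph = ToyGraph()
for _ in range(5):
    toy_decoder(graph)

print(len(graph.nodes))         # node count grows with every call
print(sorted(graph.variables))  # variable set stays the same
```

In this toy model the node list keeps growing with each call, while the set of variables does not, which is the same pattern the node listing and the tf.global_variables() listing show for the real decoder.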

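【Comments】:

【Answer 2】:

TF builds the variable name from the name of the layer. When it then tries to create a variable, it checks whether a variable with that name already exists. If it does, it raises an exception unless you have specified that the variable may be reused.

To fix your code, you need to use the same name in the layers that should share variables. The documentation says the same thing: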

reuse: Boolean, whether to reuse the weights of a previous layer by the same name.

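Also, to debug your code and make sure your variables point to the same place, you can simply remove the reuse argument and check that you get an exception when you try to run the model.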
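
The name-based lookup and the debugging trick can be sketched with a toy variable store in plain Python. This is an illustration under assumptions, not TensorFlow's implementation: ToyVariableStore, toy_layer and the auto_reuse flag are hypothetical names, and auto_reuse is meant to behave roughly like tf.AUTO_REUSE (create the variable if it is missing, otherwise return the existing one).

```python
# Toy model only: a dict keyed by variable name, with hypothetical names.


class ToyVariableStore:
    def __init__(self):
        self._vars = {}

    def get_variable(self, name, auto_reuse=False):
        if name in self._vars:
            if not auto_reuse:
                # Same name requested twice without permission to reuse.
                raise ValueError("Variable {} already exists".format(name))
            return self._vars[name]
        self._vars[name] = object()  # stand-in for a real weight tensor
        return self._vars[name]


def toy_layer(store, layer_name, auto_reuse=False):
    # The variable name is derived from the layer name.
    return store.get_variable(layer_name + "/kernel", auto_reuse=auto_reuse)


store = ToyVariableStore()

# Same layer name with reuse allowed: both calls get the same object.
first = toy_layer(store, "transpose1_1", auto_reuse=True)
second = toy_layer(store, "transpose1_1", auto_reuse=True)
print(first is second)

# Debugging trick: drop the reuse flag and expect a name collision.
try:
    toy_layer(store, "transpose1_1")
    collided = False
except ValueError:
    collided = True
print(collided)
```

If the second build without the flag did not collide, the two calls would be producing different variable names, which would mean nothing is being shared.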

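【Comments】:

I'm not sure what you are suggesting. Giving every layer a unique name (which is what I did) does not solve the problem on its own. As for using the flag to check for an exception: yes, that is a good hint! Thanks
The documentation link is now broken.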