Tensorflow sharing variables under different variable_scope: answers

【Question title】: Tensorflow sharing variables under different variable_scope
【Posted】: 2018-08-07 16:54:50
【Question description】:

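I have three networks, called V, V_target, and Actor, and I'm trying to achieve the following setup:

V and Actor share certain layers.
V_target is an exact duplicate of V.

For those familiar with deep RL, I'm using this inside an actor-critic algorithm that has shared layers between the value and policy networks, plus a target network V_target. I tried the following: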

def shared(...):
  # define some variables, e.g.
  W = tf.get_variable('W', ...)

def Actor(...):
  with tf.variable_scope("shared"):
    shared_out = shared(...)
  ... actor-specific layers ...

def V(...):
  with tf.variable_scope("shared", reuse=True):
    shared_out = shared(...)
  ... V-specific layers...

with tf.variable_scope("Policy"):
  actor_out = Actor(...)
with tf.variable_scope("V_main"):
  V_out = V(...)
with tf.variable_scope("V_target"):
  V_target = V(...)

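As expected, this doesn't work, because the outermost variable_scope prevents sharing between Policy and V_main: the variable W is named Policy/shared/W in one scope but V_main/shared/W in the other, so the reuse=True lookup inside V finds nothing to reuse.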
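The naming problem described above can be reproduced in a few lines. This is only a minimal sketch, assuming the TF1-style API is reached through `tensorflow.compat.v1`; the `[4, 4]` shape and the helper name `demo_nested_scopes` are made up for illustration and are not part of the original post.

```python
import tensorflow.compat.v1 as tf


def shared():
    # Stand-in for the shared layers; the shape is an arbitrary assumption.
    return tf.get_variable("W", shape=[4, 4])


def demo_nested_scopes():
    with tf.Graph().as_default():
        # First use: the variable is created under the outer "Policy" scope.
        with tf.variable_scope("Policy"):
            with tf.variable_scope("shared"):
                w_policy = shared()

        # Second use: reuse=True looks for "V_main/shared/W", which was never created.
        try:
            with tf.variable_scope("V_main"):
                with tf.variable_scope("shared", reuse=True):
                    shared()
            reuse_failed = False
        except ValueError:
            reuse_failed = True

        return w_policy.name, reuse_failed


if __name__ == "__main__":
    print(demo_nested_scopes())
```

The outer scope name becomes part of the full variable name, which is why the two `shared` blocks never refer to the same variable.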

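Why not use tf.name_scope("Policy") and tf.name_scope("V_main") instead? If I do that, the shared variables can be defined, but then I have no good way to get hold of the variables under V_main and V_target. Specifically, because tf.name_scope does not prepend anything to the names created by tf.get_variable, I cannot use tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'V_main') and tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'V_target') to get the two sets of objects needed for the so-called "target updates".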
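To make the tf.name_scope limitation concrete, here is a small sketch under the same assumptions (`tensorflow.compat.v1`, an arbitrary `[2, 2]` shape, and a hypothetical helper name `demo_name_scope`). It shows that the name scope leaves the variable name untouched, so filtering the collection by 'V_main' comes back empty.

```python
import tensorflow.compat.v1 as tf


def demo_name_scope():
    with tf.Graph().as_default():
        # name_scope prefixes op names, but tf.get_variable ignores it.
        with tf.name_scope("V_main"):
            with tf.variable_scope("shared"):
                w = tf.get_variable("W", shape=[2, 2])

        # The scope argument filters on names that start with "V_main".
        found = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="V_main")
        return w.name, found


if __name__ == "__main__":
    print(demo_name_scope())
```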

Is there a clever way to get around this?

【Question discussion】:

Tags: python tensorflow machine-learning scope reinforcement-learning

【Solution 1】:

I suggest you use the trick described in this question: How to create variable outside of current scope in Tensorflow?

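You can clear the current variable scope by providing an instance of an existing scope.

So you only need to define tf.variable_scope("shared") once, keep a reference to that instance, and use it inside all the other variable scopes (with reuse=True). The W variable will then be created in the shared scope, no matter what the outer scope is.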
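A minimal sketch of that trick might look like the following. It assumes `tensorflow.compat.v1`, and it swaps in `reuse=tf.AUTO_REUSE` rather than `reuse=True` so that the very first call is allowed to create W; the shape and the helper name `demo_captured_scope` are illustrative only.

```python
import tensorflow.compat.v1 as tf


def demo_captured_scope():
    with tf.Graph().as_default():
        # Open the scope once, only to capture the VariableScope object.
        with tf.variable_scope("shared") as shared_scope:
            pass

        def shared():
            # Stand-in for the shared layers; the shape is an arbitrary assumption.
            return tf.get_variable("W", shape=[4, 4])

        with tf.variable_scope("Policy"):
            # Passing the object (not the string "shared") jumps to the captured scope.
            with tf.variable_scope(shared_scope, reuse=tf.AUTO_REUSE):
                w_actor = shared()

        with tf.variable_scope("V_main"):
            with tf.variable_scope(shared_scope, reuse=tf.AUTO_REUSE):
                w_v = shared()

        trainable = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
        return w_actor.name, w_v.name, len(trainable)


if __name__ == "__main__":
    print(demo_captured_scope())
```

Both calls resolve to the same name, and only one trainable variable ends up in the graph, regardless of the outer Policy or V_main scope.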

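【Discussion】:

Since this clears the current scope and replaces it with shared, the name of W won't contain the string "V_main", which presumably means I can't use tf.get_collection(..., 'V_main') to retrieve it. Maybe there is no perfect solution, and I'll just have to use a more roundabout way of retrieving the variables.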
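One possible shape for such a roundabout retrieval is sketched below. It is an assumption, not something proposed in the thread. V_main uses the captured shared scope, V_target builds its own copy under V_target/shared, and the two sets are paired by name after stripping the V_main/ and V_target/ prefixes. The helper names (`value_net`, `build_target_update`), the `head` variable, and the shapes are all hypothetical.

```python
import tensorflow.compat.v1 as tf


def build_target_update():
    with tf.Graph().as_default():
        with tf.variable_scope("shared") as shared_scope:
            pass

        def value_net(shared_scope_or_name):
            # Shared layers, then one V-specific variable in the enclosing scope.
            with tf.variable_scope(shared_scope_or_name, reuse=tf.AUTO_REUSE):
                tf.get_variable("W", shape=[2, 2])
            tf.get_variable("head", shape=[2, 1])

        with tf.variable_scope("V_main"):
            value_net(shared_scope)   # creates shared/W and V_main/head
        with tf.variable_scope("V_target"):
            value_net("shared")       # creates V_target/shared/W and V_target/head

        def by_key(scope, prefix):
            # Map each variable to its name with the prefix and ":0" suffix removed.
            vs = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=scope)
            return {v.name.split(":")[0][len(prefix):]: v for v in vs}

        main = by_key("shared/", "")
        main.update(by_key("V_main/", "V_main/"))
        target = by_key("V_target/", "V_target/")

        # One assign op per matching pair: target <- main.
        update_ops = [target[k].assign(main[k]) for k in sorted(main)]
        return sorted(main), sorted(target), len(update_ops)


if __name__ == "__main__":
    print(build_target_update())
```

The trailing slash in each scope string keeps the prefix match from also picking up unrelated scopes whose names merely begin with the same characters.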