TensorBoard - visualize weights of LSTM

[Question title]: TensorBoard - visualize weights of LSTM
[Posted]: 2018-05-18 08:05:28
[Question]:

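I am stacking several LSTM layers to form a deep recurrent neural network. I would like to monitor the weights of each LSTM layer during training. However, I can't figure out how to attach summaries of the LSTM layer weights to TensorBoard.

Any suggestions on how to do this?

The code is as follows: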

cells = []

with tf.name_scope("cell_1"):
    cell1 = tf.contrib.rnn.LSTMCell(self.embd_size, state_is_tuple=True, initializer=self.initializer)
    cell1 = tf.contrib.rnn.DropoutWrapper(cell1,
                input_keep_prob=self.input_dropout,
                output_keep_prob=self.output_dropout,
                state_keep_prob=self.recurrent_dropout)
    cells.append(cell1)

with tf.name_scope("cell_2"):
    cell2 = tf.contrib.rnn.LSTMCell(self.n_hidden, state_is_tuple=True, initializer=self.initializer)
    cell2 = tf.contrib.rnn.DropoutWrapper(cell2,
                output_keep_prob=self.output_dropout,
                state_keep_prob=self.recurrent_dropout)
    cells.append(cell2)

with tf.name_scope("cell_3"):
    cell3 = tf.contrib.rnn.LSTMCell(self.embd_size, state_is_tuple=True, initializer=self.initializer)
    # cell has no input dropout since previous cell already has output dropout
    cell3 = tf.contrib.rnn.DropoutWrapper(cell3,
                output_keep_prob=self.output_dropout,
                state_keep_prob=self.recurrent_dropout)
    cells.append(cell3)

cell = tf.contrib.rnn.MultiRNNCell(
    cells, state_is_tuple=True)

output, self.final_state = tf.nn.dynamic_rnn(
    cell,
    inputs=self.inputs,
    initial_state=self.init_state)

[Comments]:

Tags: tensorflow tensorboard

[Answer 1]:

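A tf.contrib.rnn.LSTMCell object has a property called variables that serves this purpose. There is one catch: the property returns an empty list until the cell has been passed through tf.nn.dynamic_rnn. At least that is the case when using a single LSTMCell. I can't speak for MultiRNNCell. So I expect this would work: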

output, self.final_state = tf.nn.dynamic_rnn(...)
for one_lstm_cell in cells:
    one_kernel, one_bias = one_lstm_cell.variables
    # I think TensorBoard handles summaries with the same name fine.
    tf.summary.histogram("Kernel", one_kernel)
    tf.summary.histogram("Bias", one_bias)

You probably know how to take it from there, but for completeness:

summary_op = tf.summary.merge_all()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    train_writer = tf.summary.FileWriter(
        "my/preferred/logdir/train", graph=tf.get_default_graph())
    for step in range(1, training_steps+1):
        ...
        _, step_summary = sess.run([train_op, summary_op])
        # Pass the step so TensorBoard can place each summary on the x-axis.
        train_writer.add_summary(step_summary, global_step=step)

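Looking at the TensorFlow documentation I linked above, there is also a weights property. I don't know what the difference is, if there is one. Also, the order in which variables returns its entries is not documented. I worked it out by printing the resulting list and looking at the variable names.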
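Since the order of the returned list is undocumented, one option is to pick out the kernel and the bias by name instead of by position. The following is only a sketch under stated assumptions: FakeVariable is a hypothetical stand-in for a TensorFlow variable (only its name attribute is used), the helper names are made up for illustration, and the variable names in the usage note are examples of what printing the list might show, not guaranteed output.

```python
from collections import namedtuple

# Hypothetical stand-in for a TensorFlow variable; only `.name` is used here.
FakeVariable = namedtuple("FakeVariable", ["name"])


def summary_tag(variable_name):
    """Drop the ':<index>' suffix so the name can be used as a summary tag."""
    return variable_name.split(":")[0]


def split_kernel_and_bias(variables):
    """Select kernels and biases by name suffix rather than by list position."""
    kernels = [v for v in variables if summary_tag(v.name).endswith("/kernel")]
    biases = [v for v in variables if summary_tag(v.name).endswith("/bias")]
    return kernels, biases


if __name__ == "__main__":
    # Illustrative names only; print your own cell's variables to see the real ones.
    found = [
        FakeVariable("rnn/multi_rnn_cell/cell_0/lstm_cell/bias:0"),
        FakeVariable("rnn/multi_rnn_cell/cell_0/lstm_cell/kernel:0"),
    ]
    kernels, biases = split_kernel_and_bias(found)
    for variable in kernels + biases:
        print(summary_tag(variable.name))
```

In a real graph the loop body would presumably become tf.summary.histogram(summary_tag(v.name), v), which also gives every layer its own tag instead of reusing "Kernel" and "Bias". Cells with extra variables (peepholes, projections) would need additional suffixes.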

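Now, MultiRNNCell has the same variables property according to its doc, which says that it returns all layer variables. Honestly, I don't know how MultiRNNCell works, so I can't tell you whether these are variables that belong exclusively to the MultiRNNCell or whether the list includes the variables of the cells inside it. Either way, knowing that the property exists should be a good tip! Hope this helps.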

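Although variables is documented for most (all?) RNN classes, it does break for DropoutWrapper. The property has been documented since r1.2, but accessing it raises an exception in 1.2 and 1.4 (and probably 1.3, though I haven't tested that). Specifically,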

from tensorflow.contrib import rnn
...
lstm_cell = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
wrapped_cell = rnn.DropoutWrapper(lstm_cell)
outputs, states = rnn.static_rnn(wrapped_cell, x, dtype=tf.float32)
print("LSTM vars!", lstm_cell.variables)
print("Wrapped vars!", wrapped_cell.variables)

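will throw AttributeError: 'DropoutWrapper' object has no attribute 'trainable'. From the traceback (or from staring at the DropoutWrapper source for a long time), I noticed that variables is implemented in Layer, the superclass of RNNCell, which is in turn the superclass of DropoutWrapper. Dizzy yet? Indeed, this is where we find the documented variables property. It returns the (documented) weights property. The weights property returns the (documented) self.trainable_weights + self.non_trainable_weights. And finally, the root of the problem: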

@property
def trainable_weights(self):
    return self._trainable_weights if self.trainable else []

@property
def non_trainable_weights(self):
    if self.trainable:
        return self._non_trainable_weights
    else:
        return self._trainable_weights + self._non_trainable_weights

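That is, variables does not work on a DropoutWrapper instance. Neither do trainable_weights or non_trainable_weights, because self.trainable is never defined.

One step deeper: Layer.__init__ sets self.trainable to True by default, but DropoutWrapper never calls it. To quote a TensorFlow contributor on Github,

DropoutWrapper does not have variables because it does not itself store any. It wraps a cell that may have variables; but it's not clear what the semantics should be if you access DropoutWrapper.variables. For example, all keras layers only report back the variables that they own; and so only one layer ever owns any variable. That said, this should probably return [], and the reason it doesn't is that DropoutWrapper never calls super().__init__ in its constructor. That should be an easy fix; PRs welcome.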

So, to access the LSTM variables in the example above, lstm_cell.variables is sufficient.
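The failure described above can be reproduced without TensorFlow. The sketch below is not TensorFlow's code: Layer, BrokenWrapper, FixedWrapper and describe_variables are simplified, hypothetical stand-ins that only mirror the property chain quoted earlier, to show why skipping the base constructor produces the AttributeError and why calling it yields an empty list.

```python
class Layer:
    """Simplified stand-in for the base class discussed above."""

    def __init__(self, trainable=True):
        self.trainable = trainable
        self._trainable_weights = []
        self._non_trainable_weights = []

    @property
    def trainable_weights(self):
        return self._trainable_weights if self.trainable else []

    @property
    def non_trainable_weights(self):
        if self.trainable:
            return self._non_trainable_weights
        return self._trainable_weights + self._non_trainable_weights

    @property
    def variables(self):
        return self.trainable_weights + self.non_trainable_weights


class BrokenWrapper(Layer):
    def __init__(self, cell):
        # No super().__init__() call, so self.trainable is never set.
        self._cell = cell


class FixedWrapper(Layer):
    def __init__(self, cell):
        super().__init__()  # sets self.trainable and the weight lists
        self._cell = cell


def describe_variables(layer):
    """Return layer.variables, or the error text if the lookup fails."""
    try:
        return layer.variables
    except AttributeError as error:
        return "AttributeError: {}".format(error)


if __name__ == "__main__":
    print(describe_variables(BrokenWrapper(cell=object())))
    print(describe_variables(FixedWrapper(cell=object())))
```

The fixed wrapper reports an empty list because it owns no variables itself, which matches the semantics the contributor suggests.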

Edit: As far as I can tell, Mike Khan's PR was incorporated into 1.5. The variables property of the dropout layer now returns an empty list.

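[Comments]:

Thank you very much! MultiRNNCell.variables returns a list containing all the weights and biases of the LSTM cells, so I can use it almost exactly as outlined in your code.
@MikeKhan Which line of code is at fault? From the error message, your problem sounds unrelated to this question. On the other hand, if calling dropoutwrapper.variables caused it, it may be related.
@MikeKhan See the edit. I encourage you to ask a new question about this behavior or open an issue on Github.
@DylanF I opened an issue on the tensorflow github, referencing your excellent explanation. Here is the link for reference: github.com/tensorflow/tensorflow/issues/15810
While we wait for a permanent fix, I came up with a workaround. Instead of calling one_kernel, one_bias = one_lstm_cell.variables, we can use one_kernel, one_bias = one_lstm_cell._cell.variables (note the ._cell). This calls variables on the original RNNCell that was passed in. I know that accessing protected attributes is generally not a good idea, but it works.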
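The ._cell workaround can be wrapped in a small helper so that the loop works whether or not a cell is wrapped. This is a rough sketch that relies on the same private attribute the comment above uses: unwrap_cell is a hypothetical helper, and FakeLSTMCell and FakeDropoutWrapper are stand-ins so the example runs without TensorFlow. Whether other wrappers store their inner cell under _cell is an assumption to check against the installed version.

```python
def unwrap_cell(cell, max_depth=10):
    """Follow `_cell` links until reaching an object that has none."""
    for _ in range(max_depth):
        inner = getattr(cell, "_cell", None)
        if inner is None:
            return cell
        cell = inner
    raise ValueError("more than {} nested wrappers".format(max_depth))


class FakeLSTMCell:
    """Stand-in for an LSTM cell that owns a kernel and a bias."""

    def __init__(self):
        self.variables = ["kernel", "bias"]


class FakeDropoutWrapper:
    """Stand-in for a wrapper that stores the wrapped cell in `_cell`."""

    def __init__(self, cell):
        self._cell = cell
        self.variables = []  # the wrapper owns no variables itself


if __name__ == "__main__":
    wrapped = FakeDropoutWrapper(FakeDropoutWrapper(FakeLSTMCell()))
    one_kernel, one_bias = unwrap_cell(wrapped).variables
    print(one_kernel, one_bias)
```

With real cells the loop would read one_kernel, one_bias = unwrap_cell(one_lstm_cell).variables. Because this leans on a protected attribute, it may break between releases.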