Converting a unidirectional LSTM cell to a bidirectional LSTM cell in TensorFlow 1.0: answers

【Question title】: Convert unidirectional LSTM cell to bidirectional LSTM cell in TensorFlow 1.0
【Posted】: 2020-06-13 13:34:58
【Question description】:

I have this legacy code, implemented in TensorFlow 1.0.1. I want to convert the current LSTM cell to a bidirectional LSTM.

with tf.variable_scope("encoder_scope") as encoder_scope:

    cell = contrib_rnn.LSTMCell(num_units=state_size, state_is_tuple=True)
    cell = DtypeDropoutWrapper(cell=cell, output_keep_prob=tf_keep_probabiltiy, dtype=DTYPE)
    cell = contrib_rnn.MultiRNNCell(cells=[cell] * num_lstm_layers, state_is_tuple=True)

    encoder_cell = cell

    encoder_outputs, last_encoder_state = tf.nn.dynamic_rnn(
        cell=encoder_cell,
        dtype=DTYPE,
        sequence_length=encoder_sequence_lengths,
        inputs=encoder_inputs,
        )

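I found some examples: https://riptutorial.com/tensorflow/example/17004/creating-a-bidirectional-lstm

But I could not convert my LSTM cell to a bidirectional LSTM cell by following them. In my case, what should be passed as state_below?

Update: in addition to the question above, I need clarification on how to convert the following decoder network (dynamic_rnn_decoder) to use a bidirectional LSTM. (The documentation gives no clue.)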

with tf.variable_scope("decoder_scope") as decoder_scope:

    decoder_cell = tf.contrib.rnn.LSTMCell(num_units=state_size)
    decoder_cell = DtypeDropoutWrapper(cell=decoder_cell, output_keep_prob=tf_keep_probabiltiy, dtype=DTYPE)
    decoder_cell = contrib_rnn.MultiRNNCell(cells=[decoder_cell] * num_lstm_layers, state_is_tuple=True)

    # define decoder train network
    decoder_outputs_tr, _, _ = dynamic_rnn_decoder(
        cell=decoder_cell,  # the cell function
        decoder_fn=simple_decoder_fn_train(last_encoder_state, name=None),
        inputs=decoder_inputs,
        sequence_length=decoder_sequence_lengths,
        parallel_iterations=None,
        swap_memory=False,
        time_major=False)

Can anyone explain this?

【Question discussion】:

Tags: python python-3.x tensorflow lstm

【Solution 1】:

You can use bidirectional_dynamic_rnn [1]

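# Assumes `import tensorflow as tf` and `from tensorflow.contrib import rnn as contrib_rnn`
# (the imports are not shown in the question).

# Forward stack
cell_fw = contrib_rnn.LSTMCell(num_units=state_size, state_is_tuple=True)
cell_fw = DtypeDropoutWrapper(cell=cell_fw, output_keep_prob=tf_keep_probabiltiy, dtype=DTYPE)
cell_fw = contrib_rnn.MultiRNNCell(cells=[cell_fw] * num_lstm_layers, state_is_tuple=True)

# Backward stack: same depth as the forward stack, so the states pair up layer by layer
cell_bw = contrib_rnn.LSTMCell(num_units=state_size, state_is_tuple=True)
cell_bw = DtypeDropoutWrapper(cell=cell_bw, output_keep_prob=tf_keep_probabiltiy, dtype=DTYPE)
cell_bw = contrib_rnn.MultiRNNCell(cells=[cell_bw] * num_lstm_layers, state_is_tuple=True)

encoder_cell_fw = cell_fw
encoder_cell_bw = cell_bw

# The first return value is a pair (outputs_fw, outputs_bw), not a single tensor
encoder_outputs, (output_state_fw, output_state_bw) = tf.nn.bidirectional_dynamic_rnn(
    cell_fw=encoder_cell_fw,
    cell_bw=encoder_cell_bw,
    dtype=DTYPE,
    sequence_length=encoder_sequence_lengths,
    inputs=encoder_inputs,
    )

# With MultiRNNCell, each final state holds one LSTMStateTuple(c, h) per layer,
# so concatenate c and h layer by layer along the feature axis
last_encoder_state = tuple(
    contrib_rnn.LSTMStateTuple(
        c=tf.concat([state_fw.c, state_bw.c], axis=-1),
        h=tf.concat([state_fw.h, state_bw.h], axis=-1))
    for state_fw, state_bw in zip(output_state_fw, output_state_bw))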

However, as the TensorFlow documentation states, this API is deprecated, and you should consider migrating to TensorFlow 2 and using keras.layers.Bidirectional(keras.layers.RNN(cell)).

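As for the updated question: you cannot use a bidirectional RNN in the decoder model, because being bidirectional would mean it already knows what it still has to generate [2]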
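To make that point concrete, here is a minimal sketch in plain Python (no TensorFlow). The toy recurrence and the names run_rnn and bidirectional are hypothetical stand-ins for an LSTM, chosen only to show which time steps each direction can see.

```python
def run_rnn(inputs, decay=0.5):
    """Toy recurrence h_t = decay * h_{t-1} + x_t; returns every h_t."""
    state, outputs = 0.0, []
    for x in inputs:
        state = decay * state + x
        outputs.append(state)
    return outputs


def bidirectional(inputs):
    """Pair each forward output with the backward output at the same step."""
    forward = run_rnn(inputs)
    # The backward pass reads the sequence from the end; reversing its outputs
    # again lines index t up with the forward output at index t.
    backward = run_rnn(inputs[::-1])[::-1]
    return list(zip(forward, backward))


original = bidirectional([1.0, 2.0, 3.0])
changed = bidirectional([1.0, 2.0, 9.0])  # only the last token differs

# The forward output at step 0 ignores the future token ...
print(original[0][0] == changed[0][0])
# ... but the backward output at step 0 already depends on it.
print(original[0][1] == changed[0][1])
```

At decoding time the future tokens do not exist yet, so the backward direction has nothing to read. The encoder does not have this problem because the whole input sequence is available up front.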

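In any case, to adapt your decoder to the bidirectional encoder, you can concatenate the encoder states and double the decoder's num_units (or halve num_units in the encoder) [3]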

# Twice the encoder width, to match the concatenated forward/backward state
decoder_cell = tf.contrib.rnn.LSTMCell(num_units=2 * state_size)
decoder_cell = DtypeDropoutWrapper(cell=decoder_cell, output_keep_prob=tf_keep_probabiltiy, dtype=DTYPE)
decoder_cell = contrib_rnn.MultiRNNCell(cells=[decoder_cell] * num_lstm_layers, state_is_tuple=True)

# define decoder train network
decoder_outputs_tr, _, _ = dynamic_rnn_decoder(
    cell=decoder_cell,  # the cell function
    decoder_fn=simple_decoder_fn_train(last_encoder_state, name=None),
    inputs=decoder_inputs,
    sequence_length=decoder_sequence_lengths,
    parallel_iterations=None,
    swap_memory=False,
    time_major=False)
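
The shape bookkeeping behind "concatenate the states and double num_units" can be checked without TensorFlow. The sketch below is plain Python; concat_shape is a hypothetical helper that only mimics the shape rule of a concatenation, and the batch, time, and state_size values are made-up examples.

```python
def concat_shape(shapes, axis):
    """Return the shape produced by concatenating tensors of `shapes` on `axis`."""
    rank = len(shapes[0])
    axis = axis % rank  # normalise negative axes, so -1 means the last axis
    for shape in shapes:
        if len(shape) != rank:
            raise ValueError("all inputs must have the same rank")
        # Every dimension except the concatenation axis must match.
        for i, (a, b) in enumerate(zip(shape, shapes[0])):
            if i != axis and a != b:
                raise ValueError("non-concat dimensions must match")
    result = list(shapes[0])
    result[axis] = sum(shape[axis] for shape in shapes)
    return tuple(result)


batch, time, state_size = 4, 7, 16  # hypothetical sizes

# Final states are [batch, units]: axis=1 and axis=-1 name the same axis.
state = concat_shape([(batch, state_size)] * 2, axis=-1)
# Per-step outputs are [batch, time, units]: only axis=-1 joins the features.
outputs_last = concat_shape([(batch, time, state_size)] * 2, axis=-1)
outputs_time = concat_shape([(batch, time, state_size)] * 2, axis=1)

decoder_num_units = state[-1]  # the decoder cell must match this width
print(state, outputs_last, outputs_time, decoder_num_units)
```

For 2-D state tensors the choice between axis=1 and axis=-1 makes no difference, but for 3-D batch-major outputs axis=1 would stack the two directions along time instead of along the feature dimension.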

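【Discussion】:

Thank you very much for your support. However, I had not included another part of the same encoder-decoder network that also needs to be converted to a biLSTM. Could you help me with the updated question?
Thanks, but tf.concat needs an axis to concatenate along. Based on your code, what should the correct axis be?
I used axis=1, following github.com/Scitator/YATS2S/blob/versions/tf_1.2/seq2seq/…. But the decoder output does not have the right shape, so it cannot be trained with the loss function I use. I think it also needs to be concatenated to recover the original shape so that my loss function works. Could you help me with how to do that?
Try using axis=-1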