Error when training a single-level LSTM in TensorFlow

【Question title】: error when training single level LSTM in tensorflow
【Posted】: 2018-05-03 20:36:47
【Question】:

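So I have been trying to train a single-layer encoder-decoder network in tensorflow, which has been frustrating given how sparse the documentation is, and my only tensorflow background is Stanford's CS231n.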

Here is the simple model:

def simple_model(X,Y, is_training):
    """
    a simple, single layered encoder decoder network, 
    that encodes X of shape (batch_size, window_len, 
    n_comp+1), then decodes Y of shape (batch_size, 
    pred_len+1, n_comp+1), of which the vector Y[:,0,
    :], is simply [0,...,0,1] * batch_size, so that 
    it starts the decoding
    """

    num_units = 128
    window_len = X.shape[1]
    n_comp = X.shape[2]-1
    pred_len = Y.shape[1]-1

    init = tf.contrib.layers.variance_scaling_initializer()
    encoder_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
    encoder_output, encoder_state = tf.nn.dynamic_rnn(
                         encoder_cell,X,dtype = tf.float32)
    decoder_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
    decoder_output, _ = tf.nn.dynamic_rnn(decoder_cell,
                                         encoder_output,
                             initial_state = encoder_state)
    # we expect the shape to be of the shape of Y
    print(decoder_output.shape)
    proj_layer = tf.layers.dense(decoder_output, n_comp)
    return proj_layer

Now I try to set up the training details:

tf.reset_default_graph()
X = tf.placeholder(tf.float32, [None, 15, 74])
y = tf.placeholder(tf.float32, [None, 4, 74])
is_training = tf.placeholder(tf.bool)
y_out = simple_model(X,y,is_training)

mean_loss = 0.5*tf.reduce_mean((y_out-y[:,1:,:-1])**2)
optimizer = tf.train.AdamOptimizer(learning_rate=5e-4)

extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(extra_update_ops):
    train_step = optimizer.minimize(mean_loss)

Okay, now I get this stupid error:

ValueError: Variable rnn/basic_lstm_cell/kernel already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

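【Comments】:

Maybe you are using (defining) rnn/basic_lstm_cell/kernel in multiple places in the same program, so tensorflow fails when it tries to build the graph. Please post more information (the full error message).

Tags: python tensorflow lstm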

【Solution 1】:

I am not sure I understand correctly. You have two BasicLSTMCells in your graph. According to the documentation, you should probably use MultiRNNCell like this:

encoder_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
decoder_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
rnn_layers = [encoder_cell, decoder_cell]
multi_rnn_cell = tf.nn.rnn_cell.MultiRNNCell(rnn_layers)
decoder_output, state = tf.nn.dynamic_rnn(cell=multi_rnn_cell,
                                          inputs=X,
                                          dtype=tf.float32)

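If this is not the architecture you want and you need to use the two BasicLSTMCells separately, I think passing different/unique names when defining encoder_cell and decoder_cell will help resolve this error. tf.nn.dynamic_rnn places the cell under the "rnn" scope. If you do not explicitly define cell names, both cells try to create a variable with the same name, rnn/basic_lstm_cell/kernel, which leads to the reuse confusion reported in the error.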
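To make the name collision concrete, here is a minimal toy model in plain Python. It is not TensorFlow code: VariableStore and try_build are hypothetical helpers invented for illustration, and the real variable-scope machinery is far more involved. The sketch only assumes what the error message itself shows, namely that a variable's full name is built as scope/cell_name/kernel and that creating the same full name twice is rejected.

```python
# Toy model of scoped variable naming. NOT TensorFlow code; all names here
# (VariableStore, try_build) are hypothetical and exist only for illustration.

class VariableStore:
    """Tracks full variable names and rejects duplicates."""

    def __init__(self):
        self._names = set()

    def create(self, scope, cell_name, var_name="kernel"):
        # Full name mimics the pattern seen in the error: scope/cell/variable.
        full_name = "/".join([scope, cell_name, var_name])
        if full_name in self._names:
            raise ValueError(
                "Variable %s already exists, disallowed." % full_name)
        self._names.add(full_name)
        return full_name


def try_build(encoder_scope, decoder_scope):
    """Create one kernel per RNN; return the names, or the error text."""
    store = VariableStore()
    try:
        encoder_var = store.create(encoder_scope, "basic_lstm_cell")
        decoder_var = store.create(decoder_scope, "basic_lstm_cell")
        return [encoder_var, decoder_var]
    except ValueError as err:
        return str(err)


# Both RNNs under the same default scope: the second creation collides.
print(try_build("rnn", "rnn"))
# Distinct scopes: both variables get unique full names.
print(try_build("encoder", "decoder"))
```

In the actual TensorFlow 1.x graph, the analogous change would be to give each part a distinct name. Besides naming the cells, tf.nn.dynamic_rnn also accepts a scope argument, so calling it with, for example, scope="encoder" for the first RNN and scope="decoder" for the second (names chosen here only as an example) should keep the two kernels apart.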
