Implementing a many-to-many LSTM in TensorFlow? Answers

[Question title]: Implementing a many-to-many LSTM in TensorFlow?
[Posted]: 2016-11-01 06:03:57
[Question]:

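I am using TensorFlow to make predictions on time-series data. For example, given 50 tags, I want to find the next 5 likely tags.

As in the picture below, I want to build it like the 4th structure.

I went through the tutorial demo: Recurrent Neural Networks

But I found that it provides something like the 5th structure in the picture above, which is different.

Which model could I use? I am thinking of a seq2seq model, but I am not sure whether that is the right approach.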

[Comments on the question]:

Tags: tensorflow deep-learning lstm

[Solution 1]:

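You are right, you can use a seq2seq model. For brevity, I have written an example of how to do this in Keras, which can also run on a TensorFlow backend. I have not run this example, so it may need tweaking. If your labels are one-hot encoded, use a cross-entropy loss instead of MSE.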

from keras.models import Model
from keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense

# The input shape is your sequence length and your token embedding size
# (seq_len and embedding_size are placeholders to set from your data)
inputs = Input(shape=(seq_len, embedding_size))

# Build a RNN encoder
encoder = LSTM(128, return_sequences=False)(inputs)

# Repeat the encoding for every input to the decoder
encoding_repeat = RepeatVector(5)(encoder)

# Pass your (5, 128) encoding to the decoder
decoder = LSTM(128, return_sequences=True)(encoding_repeat)

# Output each timestep into a fully connected layer
sequence_prediction = TimeDistributed(Dense(1, activation='linear'))(decoder)

model = Model(inputs, sequence_prediction)
model.compile('adam', 'mse')  # Or categorical_crossentropy
model.fit(X_train, y_train)
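
To make the shapes in the model above concrete, here is a NumPy-only sketch that traces the array shapes from the encoder output through the repeat step and the time-distributed dense layer. It does not implement the LSTM itself: the encoder and decoder outputs are stand-in arrays, the sizes are assumed for illustration, and `repeat_vector` and `time_distributed_dense` are hypothetical helpers that only mimic the shape behaviour of the Keras layers.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, hidden, n_out = 4, 128, 5  # assumed sizes, for illustration only

def repeat_vector(x, n):
    """Mimic the effect of RepeatVector(n): (batch, features) -> (batch, n, features)."""
    return np.repeat(x[:, np.newaxis, :], n, axis=1)

def time_distributed_dense(x, w, b):
    """Apply the same dense weights at every timestep: (batch, t, f) -> (batch, t, units)."""
    return x @ w + b

# Stand-in for the encoder LSTM's final output (return_sequences=False)
encoding = rng.normal(size=(batch, hidden))

# One copy of the encoding per output timestep
decoder_input = repeat_vector(encoding, n_out)

# Stand-in for the decoder LSTM output (return_sequences=True keeps the time axis)
decoder_output = np.tanh(decoder_input)

w = rng.normal(size=(hidden, 1))
b = np.zeros(1)
prediction = time_distributed_dense(decoder_output, w, b)

print(encoding.shape)       # (4, 128)
print(decoder_input.shape)  # (4, 5, 128)
print(prediction.shape)     # (4, 5, 1)
```

The final shape, (batch, 5, 1), is what the targets passed to `fit` would need to match per sample.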

[Discussion]:

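Thanks, Simon! I have not run the seq2seq demo. One thing that confuses me is the embedding. For the encoder and decoder, I guess I do not need an embedding layer and can just pass in my input (a list of ids).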
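You probably want to either embed or one-hot encode your ids. It depends on the nature of your ids, but unless you have a vast number of them, I think one-hot encoding is the way to go. Your data should then have shape (num_samples, seq_len, one_hot_size), and you should use a cross-entropy loss.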
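
As a minimal sketch of the one-hot layout described in that comment, the NumPy snippet below turns integer ids into an array of shape (num_samples, seq_len, one_hot_size). The `one_hot` helper and the toy ids are hypothetical, and the ids are assumed to be integers in the range 0 to one_hot_size - 1.

```python
import numpy as np

def one_hot(ids, one_hot_size):
    """Encode integer ids of shape (num_samples, seq_len) as one-hot vectors."""
    ids = np.asarray(ids)
    # Row k of the identity matrix is the one-hot vector for id k
    return np.eye(one_hot_size, dtype=np.float32)[ids]

ids = [[0, 2, 1],
       [3, 3, 0]]
encoded = one_hot(ids, one_hot_size=4)

print(encoded.shape)  # (2, 3, 4)
print(encoded[0, 1])  # [0. 0. 1. 0.]
```

With targets encoded this way, the last layer would presumably emit one_hot_size values per timestep with a softmax activation, paired with the categorical_crossentropy loss mentioned in the answer.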
Can the input and output sizes be different?
I ask because I keep getting size errors such as "Shapes (50,) and (125,) are not compatible".
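It sounds like you should spend a little more time on the basics. You always need to make sure that the input and output dimensions are compatible, both between layers and in the loss function. Your model output, i.e. sequence_prediction, needs to have the same shape as y_train; for the model above, y_train.shape = (num_samples, 5, 1). If you use an embedding, replace the last dimension with your embedding size; this should follow naturally from your data.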
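
One way to get arrays whose shapes line up with the model above is to cut the series into sliding windows. The sketch below assumes a univariate series with 50 input steps and 5 target steps (the numbers from the question); `make_windows` is a hypothetical helper, not part of Keras or TensorFlow.

```python
import numpy as np

def make_windows(series, n_in=50, n_out=5):
    """Split a 1-D series into (input, target) windows.

    Returns X with shape (num_samples, n_in, 1) and
    y with shape (num_samples, n_out, 1).
    """
    series = np.asarray(series, dtype=np.float32)
    num_samples = len(series) - n_in - n_out + 1
    if num_samples <= 0:
        raise ValueError("series is too short for the requested window sizes")
    X = np.stack([series[i:i + n_in] for i in range(num_samples)])
    y = np.stack([series[i + n_in:i + n_in + n_out] for i in range(num_samples)])
    # Add a trailing feature axis so each timestep is a length-1 vector
    return X[..., np.newaxis], y[..., np.newaxis]

series = np.arange(100)  # toy data standing in for a real time series
X_train, y_train = make_windows(series)

print(X_train.shape)  # (46, 50, 1)
print(y_train.shape)  # (46, 5, 1)
```

Here y_train.shape[1:] is (5, 1), which is the per-sample shape that sequence_prediction is expected to have in the answer's model.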