Dimensions not matching in keras LSTM model: answers

【Question title】: Dimensions not matching in keras LSTM model
【Posted】: 2017-02-26 13:19:11
【Question】:

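I want to forecast a group of time series with an LSTM neural network in keras, but I am having trouble getting the model to match the shapes I want. The dimensions of my data are:

Input tensor: (data length, number of series to train, time steps to look back)

Output tensor: (data length, number of series to forecast, time steps to look ahead)

Note: I want to keep the dimensions exactly like this, with no transposition.

Dummy-data code that reproduces the problem: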

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, TimeDistributed, LSTM

epoch_number = 100
batch_size = 20
input_dim = 4
output_dim = 3
look_back = 24
look_ahead = 24
n = 100

trainX = np.random.rand(n, input_dim, look_back)
trainY = np.random.rand(n, output_dim, look_ahead)
print('train X:', trainX.shape)
print('train Y:', trainY.shape)

model = Sequential()

# Add the first LSTM layer (The intermediate layers need to pass the sequences to the next layer)
model.add(LSTM(10, batch_input_shape=(None, input_dim, look_back), return_sequences=True))

# Add the second LSTM layer (the input dimensions are only needed in the first layer)
model.add(LSTM(10, return_sequences=True))

# the TimeDistributed object allows a 3D output
model.add(TimeDistributed(Dense(look_ahead)))

model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
model.fit(trainX, trainY, nb_epoch=epoch_number, batch_size=batch_size, verbose=1)

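This results in:

Exception: Error when checking model target: expected timedistributed_1 to have shape (None, 4, 24) but got array with shape (100, 3, 24)

The problem seems to be in how the TimeDistributed layer is defined.

How do I define the TimeDistributed layer so that the model compiles and trains?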

【Comments on the question】:

Tags: python-3.x theano keras keras-layer

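【Answer 1】:

In your case the error message is a bit misleading. The output node of your network is called timedistributed_1 because it is the last node in your Sequential model. What the error message is trying to tell you is that the output of this node does not match the target the model is being fit to, i.e. your labels trainY.

Your trainY has shape (n, output_dim, look_ahead), so (100, 3, 24), but the network produces an output of shape (batch_size, input_dim, look_ahead), so (None, 4, 24). The problem here is that output_dim != input_dim. If your time dimension changes, you may need padding, or a network node that removes those time steps.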
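
The shape bookkeeping behind this can be traced by hand. The sketch below is plain Python, not Keras: the helper names lstm_output_shape and time_distributed_dense_output_shape are hypothetical, and they only encode the rule that an LSTM with return_sequences=True and a TimeDistributed(Dense(k)) layer both keep axis 1 and replace only the last axis.

```python
# Hypothetical helpers that mimic how shapes propagate through the
# question's model. They are illustrative only and not part of Keras.

def lstm_output_shape(input_shape, units, return_sequences=True):
    """(batch, steps, features) -> (batch, steps, units) or (batch, units)."""
    batch, steps, _features = input_shape
    return (batch, steps, units) if return_sequences else (batch, units)


def time_distributed_dense_output_shape(input_shape, units):
    """The same Dense layer is applied to every step, so axis 1 is kept."""
    batch, steps, _features = input_shape
    return (batch, steps, units)


input_dim, output_dim, look_back, look_ahead, n = 4, 3, 24, 24, 100

shape = (None, input_dim, look_back)                 # batch_input_shape
shape = lstm_output_shape(shape, 10)                 # (None, 4, 10)
shape = lstm_output_shape(shape, 10)                 # (None, 4, 10)
shape = time_distributed_dense_output_shape(shape, look_ahead)

target_shape = (n, output_dim, look_ahead)           # trainY.shape

# Compare every axis except the batch axis.
mismatched_axes = [
    axis for axis in range(1, len(shape))
    if shape[axis] != target_shape[axis]
]

print('model output:', shape)             # model output: (None, 4, 24)
print('target:      ', target_shape)      # target:       (100, 3, 24)
print('mismatched axes:', mismatched_axes)  # mismatched axes: [1]
```

Only axis 1 disagrees, which is the axis Keras treats as time in this model: it still holds input_dim (4) while trainY holds output_dim (3).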

【Comments】:

【Answer 2】:

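I think the problem is that you expect output_dim (!= input_dim) to appear at the output of TimeDistributed, which is not possible. That dimension is what the layer treats as the time dimension, and it is preserved.

The input should be at least 3D, and the dimension at index one is considered to be the temporal dimension.

The purpose of TimeDistributed is to apply the same layer to every time step. You can only end up with the same number of time steps you started with.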
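
A minimal NumPy sketch of that behaviour, assuming the wrapped layer is a plain dense layer without activation (the sizes and the names W and b are made up for illustration): the same weights are applied to each slice along axis 1, so the number of steps cannot change.

```python
import numpy as np

rng = np.random.default_rng(0)

batch, steps, features, units = 5, 4, 10, 24
x = rng.random((batch, steps, features))   # e.g. output of the second LSTM
W = rng.random((features, units))          # one weight matrix shared by all steps
b = rng.random(units)                      # one bias vector shared by all steps

# Apply the same dense layer to every step, then restack along axis 1.
per_step = np.stack([x[:, t, :] @ W + b for t in range(steps)], axis=1)

# Equivalent vectorised form: matmul broadcasts over the leading axes.
vectorised = x @ W + b

print(per_step.shape)                      # (5, 4, 24)
print(np.allclose(per_step, vectorised))   # True
```

Whatever size is chosen for units, axis 1 of the result still has length steps.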

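If you really need to reduce this dimension from 4 to 3, I think you will need to either add another layer at the end or use something other than TimeDistributed.

PS: one hint for finding this issue is that output_dim is never used when the model is created; it only appears when the target data (trainY) is built. That is only a code smell (there may be nothing wrong with it), but it is worth checking.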
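
As a rough NumPy illustration of what such an extra layer would have to compute (not the answerer's code, and W_series is a hypothetical weight matrix that a real model would learn): a linear map over axis 1 takes the 4 input series to 3 output series and leaves the 24 look-ahead steps alone. In Keras this could plausibly be built from Permute layers around a dense layer, but that is an assumption and is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

n, input_dim, output_dim, look_ahead = 100, 4, 3, 24

# Stand-in for what the question's model currently emits.
y_hat = rng.random((n, input_dim, look_ahead))

# Hypothetical weights mixing the 4 input series into 3 output series.
W_series = rng.random((input_dim, output_dim))

# Contract axis 1 (series) against W_series; keep batch and time axes.
projected = np.einsum('bit,io->bot', y_hat, W_series)

# The same operation written as swap axes, dense on the last axis, swap back.
alternative = np.swapaxes(np.swapaxes(y_hat, 1, 2) @ W_series, 1, 2)

print(projected.shape)                      # (100, 3, 24)
print(np.allclose(projected, alternative))  # True
```

The result has the shape of trainY, (n, output_dim, look_ahead), without transposing the data outside the model.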

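【Comments】:

I am doing the transposition because, for a single time series, it makes the forecast more accurate. I got the idea from this tutorial: machinelearningmastery.com/… but I guess there is still some work to do for many-to-many relationships.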