Does LSTM-Keras take into account dependencies between time series? Answers

【Question title】: Does LSTM-Keras take into account dependencies between time series?
【Posted】: 2026-02-11 17:20:02
【Question description】:

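I have:

multiple time series as INPUT
a point of the time series to predict as OUTPUT

How can I make sure the model predicts the data by using the dependencies between all the time series in the input?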

Edit 1
My current model:

import keras
from keras.models import Sequential

# hidden_nodes, num_features, window and learning_rate are set elsewhere (not shown)
model = Sequential()
model.add(keras.layers.LSTM(hidden_nodes, input_dim=num_features, input_length=window, consume_less="mem"))
model.add(keras.layers.Dense(num_features, activation='sigmoid'))
optimizer = keras.optimizers.SGD(lr=learning_rate, decay=1e-6, momentum=0.9, nesterov=True)

【Question discussion】:

Could you add your current model?

Tags: python time-series keras lstm

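【Solution 1】:

By default, LSTM layers in Keras (and any other type of recurrent layer) are stateless: the state is reset each time a new batch is fed to the network, so every sample (window) starts from a fresh state. Your code uses this default. If needed, you can make the layer stateful by specifying stateful=True in the LSTM layer, and the state will then no longer be reset automatically. You can read more about the relevant syntax here, and this blog post provides more information on stateful mode.

Below is an example of the corresponding syntax, taken from here: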

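import numpy
from keras.models import Sequential
from keras.layers import LSTM, Dense

# trainX, trainY, testX and look_back come from earlier steps of the linked example (not shown)
# reshape input to be [samples, time steps, features]
trainX = numpy.reshape(trainX, (trainX.shape[0], trainX.shape[1], 1))
testX = numpy.reshape(testX, (testX.shape[0], testX.shape[1], 1))
# create and fit the LSTM network
batch_size = 1
model = Sequential()
# a stateful layer needs a fixed batch size, hence batch_input_shape
model.add(LSTM(4, batch_input_shape=(batch_size, look_back, 1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
# one epoch per fit() call, no shuffling, and a manual state reset after each pass over the data
for i in range(100):
    model.fit(trainX, trainY, epochs=1, batch_size=batch_size, verbose=2, shuffle=False)
    model.reset_states()
# make predictions
trainPredict = model.predict(trainX, batch_size=batch_size)
model.reset_states()
testPredict = model.predict(testX, batch_size=batch_size)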
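
To make the difference between the two modes concrete, here is a toy sketch in plain Python. It is not Keras's LSTM: the one-unit recurrent cell, the weights w_in and w_rec, and the two made-up windows are all assumptions chosen for illustration. It only shows what "resetting the state" versus "carrying the state over" means for consecutive windows.

```python
import math

def run_window(window, state, w_in=0.5, w_rec=0.9):
    """Run a toy one-unit recurrent cell over one window and return its final state.

    w_in and w_rec are arbitrary illustrative weights, not learned values.
    """
    for x in window:
        # The state is always carried from one time step to the next inside a window.
        state = math.tanh(w_in * x + w_rec * state)
    return state

# Two consecutive windows of a made-up series.
windows = [[1.0, 1.0], [0.0, 0.0]]

# Stateless behaviour: every window starts from a zero state.
stateless = [run_window(w, 0.0) for w in windows]

# Stateful behaviour: the final state of one window seeds the next window.
stateful = []
state = 0.0
for w in windows:
    state = run_window(w, state)
    stateful.append(state)

print("stateless:", stateless)
print("stateful: ", stateful)
```

In both modes the state flows across the time steps inside a window; the modes differ only in whether the second window still "remembers" the first one.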

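【Discussion】:

In this case batch_size equals 1, but I don't think that suits my problem. What do you think?
A batch size of 1 is easier (you can do that too; it will just make the code run more slowly). Otherwise you will have to rearrange your data carefully. This is because of the following sentence from the second link in my answer: "With a stateful model, all the states are propagated to the next batch. It means that the state of the sample located at index i, X_i, will be used in the computation of the sample X_i+bs in the next batch, where bs is the batch size (no shuffling)."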
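
One possible way to do that rearrangement is sketched below. This is an illustration under assumptions, not code from the linked post: the helper name stateful_order is hypothetical, and it assumes the samples are already in time order and that leftover samples that do not fill a whole batch can be dropped. The idea is to split the data into batch_size contiguous streams and interleave them, so that slot j of each batch continues slot j of the previous batch.

```python
def stateful_order(num_samples, batch_size):
    """Return sample indices ordered for a stateful model with batch_size > 1.

    Hypothetical helper: when the reordered samples are fed in batches of
    `batch_size` without shuffling, the sample at position i + batch_size
    is the direct successor of the sample at position i.
    """
    steps = num_samples // batch_size  # batches per pass; leftover samples are dropped
    order = []
    for k in range(steps):             # k-th batch
        for j in range(batch_size):    # j-th slot in the batch = j-th contiguous stream
            order.append(j * steps + k)
    return order

order = stateful_order(10, 3)
print(order)  # [0, 3, 6, 1, 4, 7, 2, 5, 8]
```

With NumPy arrays, the result could be applied as trainX[order] and trainY[order] before calling fit with the same batch_size and shuffle=False.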