【Posted】: 2019-02-28 15:32:54
【Problem description】:
I recently learned time-series prediction with LSTMs from https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/23_Time-Series-Prediction.ipynb
In the tutorial, the author says: instead of training the recurrent neural network on the full sequence of nearly 300,000 observations, we will use the following function to create a batch of shorter sub-sequences picked at random from the training data.
    import numpy as np

    # num_train, num_x_signals, num_y_signals, x_train_scaled and
    # y_train_scaled are globals defined earlier in the notebook.
    def batch_generator(batch_size, sequence_length):
        """
        Generator function for creating random batches of training-data.
        """
        # Infinite loop.
        while True:
            # Allocate a new array for the batch of input-signals.
            x_shape = (batch_size, sequence_length, num_x_signals)
            x_batch = np.zeros(shape=x_shape, dtype=np.float16)

            # Allocate a new array for the batch of output-signals.
            y_shape = (batch_size, sequence_length, num_y_signals)
            y_batch = np.zeros(shape=y_shape, dtype=np.float16)

            # Fill the batch with random sequences of data.
            for i in range(batch_size):
                # Get a random start-index.
                # This points somewhere into the training-data.
                idx = np.random.randint(num_train - sequence_length)

                # Copy the sequences of data starting at this index.
                x_batch[i] = x_train_scaled[idx:idx+sequence_length]
                y_batch[i] = y_train_scaled[idx:idx+sequence_length]

            yield (x_batch, y_batch)
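To make the shapes concrete, here is a self-contained sketch of the generator above driven by synthetic data (the array sizes are my own stand-ins, not the notebook's actual weather dataset):

```python
import numpy as np

# Hypothetical stand-ins for the notebook's globals.
num_train, num_x_signals, num_y_signals = 1000, 20, 3
x_train_scaled = np.random.rand(num_train, num_x_signals).astype(np.float16)
y_train_scaled = np.random.rand(num_train, num_y_signals).astype(np.float16)

def batch_generator(batch_size, sequence_length):
    """Yield random contiguous sub-sequences, as in the tutorial."""
    while True:
        x_batch = np.zeros((batch_size, sequence_length, num_x_signals),
                           dtype=np.float16)
        y_batch = np.zeros((batch_size, sequence_length, num_y_signals),
                           dtype=np.float16)
        for i in range(batch_size):
            # Random start index; the slice itself stays contiguous in time.
            idx = np.random.randint(num_train - sequence_length)
            x_batch[i] = x_train_scaled[idx:idx + sequence_length]
            y_batch[i] = y_train_scaled[idx:idx + sequence_length]
        yield x_batch, y_batch

gen = batch_generator(batch_size=8, sequence_length=50)
x, y = next(gen)
print(x.shape)  # (8, 50, 20)
print(y.shape)  # (8, 50, 3)
```

Note that the randomness is only in where each sub-sequence starts; inside each sub-sequence, the time steps remain in their original order.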
He creates several batch samples like this for training.
My question is: can we first randomly shuffle x_train_scaled and y_train_scaled, and then use the batch_generator above to sample several batches?
My motivation for asking is that, for time-series prediction, we want to train on the past and predict the future. So is it legitimate to shuffle the training samples at all?
In the tutorial, the author picks a contiguous chunk of samples, e.g.

    x_batch[i] = x_train_scaled[idx:idx+sequence_length]
    y_batch[i] = y_train_scaled[idx:idx+sequence_length]

Could we instead pick non-contiguous x_batch and y_batch entries? For example, could x_batch[0] be drawn starting at 10:00 am and x_batch[1] at 9:00 am of the same day?
To summarize, my two questions are:
(1) Can we first randomly shuffle x_train_scaled and y_train_scaled, and then use the batch_generator above to sample several batches?
(2) When training an LSTM, do we need to take the temporal order into account? What parameters does the LSTM learn?
Thanks
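Regarding question (1), one middle ground worth considering (my own sketch, not something from the tutorial) is to shuffle whole windows rather than individual time steps: precompute every contiguous sub-sequence, then randomize only the order in which windows are presented. Each window stays internally contiguous, so the LSTM still sees correct within-sequence order:

```python
import numpy as np

# Hypothetical small series for illustration: one feature that simply
# counts time steps, so we can verify that order inside a window survives.
num_train, sequence_length = 100, 10
x_train_scaled = np.arange(num_train, dtype=np.float32).reshape(-1, 1)
y_train_scaled = np.arange(num_train, dtype=np.float32).reshape(-1, 1)

# Precompute all contiguous window start indices, then shuffle the
# windows themselves - not the time steps inside them.
starts = np.arange(num_train - sequence_length)
np.random.shuffle(starts)

x_windows = np.stack([x_train_scaled[s:s + sequence_length] for s in starts])
y_windows = np.stack([y_train_scaled[s:s + sequence_length] for s in starts])

# Every window is still strictly increasing in time.
assert all(np.all(np.diff(w[:, 0]) == 1) for w in x_windows)
print(x_windows.shape)  # (90, 10, 1)
```

Shuffling the raw x_train_scaled / y_train_scaled arrays step-by-step before windowing would destroy the temporal structure the LSTM is supposed to learn; shuffling at the window level, as above, only changes the order of training examples.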
Tags: python-3.x tensorflow time-series lstm cross-validation