【发布时间】:2021-12-30 03:51:38
【问题描述】:
我想使用 RNN 进行时间序列预测,以使用 96 个反向步骤来预测未来的 96 个步骤。为此,我有以下代码:
#Import modules
import pandas as pd
import numpy as np
import tensorflow as tf
from sklearn.preprocessing import StandardScaler
from tensorflow import keras
# Define the parameters of the RNN and the training
epochs = 1
batch_size = 50
steps_backwards = 96
steps_forward = 96
split_fraction_trainingData = 0.70
split_fraction_validatinData = 0.90
randomSeedNumber = 50
helpValueStrides = int(steps_backwards /steps_forward)
#Read dataset
df = pd.read_csv('C:/Users1/Desktop/TestValues.csv', sep=';', header=0, low_memory=False, infer_datetime_format=True, parse_dates={'datetime':[0]}, index_col=['datetime'])
# standardize data
data = df.values
indexWithYLabelsInData = 0
data_X = data[:, 0:3]
data_Y = data[:, indexWithYLabelsInData].reshape(-1, 1)
scaler_standardized_X = StandardScaler()
data_X = scaler_standardized_X.fit_transform(data_X)
data_X = pd.DataFrame(data_X)
scaler_standardized_Y = StandardScaler()
data_Y = scaler_standardized_Y.fit_transform(data_Y)
data_Y = pd.DataFrame(data_Y)
# Prepare the input data for the RNN
series_reshaped_X = np.array([data_X[i:i + (steps_backwards+steps_forward)].copy() for i in range(len(data) - (steps_backwards+steps_forward))])
series_reshaped_Y = np.array([data_Y[i:i + (steps_backwards+steps_forward)].copy() for i in range(len(data) - (steps_backwards+steps_forward))])
timeslot_x_train_end = int(len(series_reshaped_X)* split_fraction_trainingData)
timeslot_x_valid_end = int(len(series_reshaped_X)* split_fraction_validatinData)
X_train = series_reshaped_X[:timeslot_x_train_end, :steps_backwards]
X_valid = series_reshaped_X[timeslot_x_train_end:timeslot_x_valid_end, :steps_backwards]
X_test = series_reshaped_X[timeslot_x_valid_end:, :steps_backwards]
Y_train = series_reshaped_Y[:timeslot_x_train_end, steps_backwards:]
Y_valid = series_reshaped_Y[timeslot_x_train_end:timeslot_x_valid_end, steps_backwards:]
Y_test = series_reshaped_Y[timeslot_x_valid_end:, steps_backwards:]
# Build the model and train it
np.random.seed(randomSeedNumber)
tf.random.set_seed(randomSeedNumber)
model = keras.models.Sequential([
keras.layers.SimpleRNN(10, return_sequences=True, input_shape=[None, 3]),
keras.layers.SimpleRNN(10, return_sequences=True),
keras.layers.Conv1D(16, helpValueStrides, strides=helpValueStrides),
keras.layers.TimeDistributed(keras.layers.Dense(1))
])
model.compile(loss="mean_squared_error", optimizer="adam", metrics=['mean_absolute_percentage_error'])
history = model.fit(X_train, Y_train, epochs=epochs, batch_size=batch_size, validation_data=(X_valid, Y_valid))
#Predict the test data
Y_pred = model.predict(X_test)
prediction_lastValues_list=[]
for i in range (0, len(Y_pred)):
prediction_lastValues_list.append((Y_pred[i][0][1 - 1]))
# Create thw dataframe for the whole data
wholeDataFrameWithPrediciton = pd.DataFrame((X_test[:,1]))
wholeDataFrameWithPrediciton.rename(columns = {indexWithYLabelsInData:'actual'}, inplace = True)
wholeDataFrameWithPrediciton.rename(columns = {1:'Feature 1'}, inplace = True)
wholeDataFrameWithPrediciton.rename(columns = {2:'Feature 2'}, inplace = True)
wholeDataFrameWithPrediciton['predictions'] = prediction_lastValues_list
wholeDataFrameWithPrediciton['difference'] = (wholeDataFrameWithPrediciton['predictions'] - wholeDataFrameWithPrediciton['actual']).abs()
wholeDataFrameWithPrediciton['difference_percentage'] = ((wholeDataFrameWithPrediciton['difference'])/(wholeDataFrameWithPrediciton['actual']))*100
# Inverse the scaling (traInv: transformation inversed)
data_X_traInv = scaler_standardized_X.inverse_transform(data_X)
data_Y_traInv = scaler_standardized_Y.inverse_transform(data_Y)
series_reshaped_X_notTransformed = np.array([data_X_traInv[i:i + (steps_backwards+steps_forward)].copy() for i in range(len(data) - (steps_backwards+steps_forward))])
X_test_notTranformed = series_reshaped_X_notTransformed[timeslot_x_valid_end:, :steps_backwards]
predictions_traInv = scaler_standardized_Y.inverse_transform(wholeDataFrameWithPrediciton['predictions'].values.reshape(-1, 1))
edictions_traInv = wholeDataFrameWithPrediciton['predictions'].values.reshape(-1, 1)
# Create thw dataframe for the inversed transformed data
wholeDataFrameWithPrediciton_traInv = pd.DataFrame((X_test_notTranformed[:,0]))
wholeDataFrameWithPrediciton_traInv.rename(columns = {indexWithYLabelsInData:'actual'}, inplace = True)
wholeDataFrameWithPrediciton_traInv.rename(columns = {1:'Feature 1'}, inplace = True)
wholeDataFrameWithPrediciton_traInv['predictions'] = predictions_traInv
wholeDataFrameWithPrediciton_traInv['difference_absolute'] = (wholeDataFrameWithPrediciton_traInv['predictions'] - wholeDataFrameWithPrediciton_traInv['actual']).abs()
wholeDataFrameWithPrediciton_traInv['difference_percentage'] = ((wholeDataFrameWithPrediciton_traInv['difference_absolute'])/(wholeDataFrameWithPrediciton_traInv['actual']))*100
wholeDataFrameWithPrediciton_traInv['difference'] = (wholeDataFrameWithPrediciton_traInv['predictions'] - wholeDataFrameWithPrediciton_traInv['actual'])
这里可以有一些测试数据(不用关心我自己编的实际值,形状很重要)Download test data
如何解释Y_pred 数据的输出?这些值中的哪一个为我提供了未来 96 步的预测值?我附上了“Y_pred”数据的截图。一次在最后一层有 5 个输出神经元,一次只有 1 个。谁能告诉我,如何解释“Y_pred”数据,这意味着 RNN 的预测究竟是什么?我可以在 RNN 模型的输出(最后一层)中使用任何值。 “Y_pred”数据始终具有形状(X_test 的批量大小、时间序列、输出神经元的数量)。我的问题是针对最后一个维度。我认为这些可能是特征,但在我的情况下并非如此,因为我只有 1 个输出特征(您可以看到 Y_train、Y_test 和 Y_valid 数据的形状)。
**提醒**:赏金即将到期,不幸的是我仍然没有收到任何答复。所以我想提醒你这个问题和赏金。我会非常感谢每一条评论。
【问题讨论】:
-
你能展示你的模型以及你是如何训练它的吗?
-
@TheGuywithTheHat:感谢戴帽子的家伙的评论。 “模型”和“你如何训练它”到底是什么意思?我已经在上面发布了我的代码。这是整个代码和(几乎)最小的可重现示例。我还包括了测试数据。据我所知,模型是在发布的代码中定义的,从
model = keras.models.Sequential([...行开始,训练命令也在model.compile(loss="mean_squared_error", optimizer="adam", metrics=['mean_absolute_percentage_error'])和history = model.fit(X_train, Y_train, ...行中定义 -
我的错,我不知何故错过了那个代码块上的滚动条。
-
@TheGuywithTheHat:没问题。你能看看我的代码,也许能告诉我一些关于我的问题的事情吗?我会非常感谢每一条评论。
-
我稍后可能会进行更彻底的查看,但是在制作
X_train、X_valid和X_test时,您是否应该使用:-steps_backwards而不是:steps_backwards?
标签: python tensorflow keras time-series recurrent-neural-network