【Posted】:2021-09-03 08:45:55
【Problem description】:
I want to scale time series data that contains outliers and use it for an LSTM model with Keras.
My scaling code is:
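import pandas as pd
from sklearn.preprocessing import RobustScaler

# Train data: fit the scaler on the training set, then scale it
scaler = RobustScaler().fit(train)
train = pd.DataFrame(scaler.fit_transform(train))
train = train.values
# Test data: reuse the scaler that was fitted on the training set
test = pd.DataFrame(scaler.transform(test))
test = test.values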
After that, I convert the data into the 3D format that Keras expects:
import numpy as np

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :12]
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)
# choose a number of time steps
n_steps = 30
# convert into train input/output
X_trai, y_trai = split_sequences(train, n_steps)
print(X_trai.shape, y_trai.shape)
# convert into test input/output
X_test, y_test = split_sequences(test, n_steps)
print(X_test.shape, y_test.shape)
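Training and prediction work well, but I cannot inverse-transform the predicted y values for the test set.
My questions:
- Is the scaling approach above correct?
- If so, how can I restore my y_hat predictions to the original scale so that I can compare them with the original y test data?
Thanks!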
【Discussion】:
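On the first question: fitting the scaler on the training set only and calling `transform` on the test set is the usual practice. The separate `.fit(train)` before `fit_transform(train)` is redundant but harmless, since `fit_transform` refits the scaler on the same data.

On the second question, here is a minimal sketch of one possible approach, not a definitive implementation. `scaler.inverse_transform` expects all feature columns, while y holds only the first 12, so the sketch undoes the scaling by hand with the fitted `center_` and `scale_` attributes of `RobustScaler`. The helper name `inverse_transform_targets`, the constant `N_TARGETS`, and the synthetic 15-column data are assumptions made for illustration; in the real workflow `y_scaled` would be the model's y_hat.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

N_TARGETS = 12  # assumption: y is the first 12 columns, as in split_sequences


def inverse_transform_targets(y_scaled, scaler, n_targets=N_TARGETS):
    """Undo RobustScaler on the first n_targets columns only (hypothetical helper)."""
    center = scaler.center_[:n_targets]  # per-column medians learned in fit()
    scale = scaler.scale_[:n_targets]    # per-column IQRs learned in fit()
    # RobustScaler computes (x - center) / scale, so the inverse is:
    return y_scaled * scale + center


# Synthetic stand-in data (assumption: 15 features, not the asker's real data)
rng = np.random.default_rng(0)
train_raw = rng.normal(size=(200, 15))
test_raw = rng.normal(size=(60, 15))

scaler = RobustScaler().fit(train_raw)   # fit on the training data only
test_scaled = scaler.transform(test_raw)

# With n_steps = 30, split_sequences returns rows 30 onward as y_test
y_scaled = test_scaled[30:, :N_TARGETS]  # stands in for y_hat from the model
y_restored = inverse_transform_targets(y_scaled, scaler)

print(y_restored.shape)
print(np.allclose(y_restored, test_raw[30:, :N_TARGETS]))
```

The same helper can be applied to both y_hat and y_test, so the two are compared on the original scale. An alternative would be to fit a second scaler on the 12 target columns alone and call its `inverse_transform` directly.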
Tags: python machine-learning keras neural-network scaling