I believe that you can have causal padding and dilation for any number of input features. Here is the solution I would propose.

The TimeDistributed layer is key to this.
From the Keras documentation: "This wrapper applies a layer to every temporal slice of an input. The input should be at least 3D, and the dimension of index one will be considered to be the temporal dimension."

For our purposes, we want this layer to apply "something" to each feature, so we move the features to the temporal index, which is 1.
The Conv1D documentation is also relevant, in particular regarding channels: "The ordering of the dimensions in the inputs. 'channels_last' corresponds to inputs with shape (batch, steps, channels) (the default format for temporal data in Keras)."
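As a quick sanity check on that convention, a minimal sketch (the array x and the filter count of 4 are illustrative, not part of the model below): causal padding preserves the steps axis, so each output step only sees current and past inputs.

import numpy as np
from tensorflow.keras.layers import Conv1D

x = np.random.rand(32, 20, 1).astype("float32")  # (batch, steps, channels)
y = Conv1D(filters=4, kernel_size=3, padding="causal", dilation_rate=2)(x)
print(y.shape)  # (32, 20, 4): steps preserved, channels become the 4 filters

With those conventions in mind, here is the full model.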
from tensorflow.keras import Sequential, backend
from tensorflow.keras.layers import (Conv1D, Dense, GlobalMaxPool1D, InputLayer,
                                     MaxPool1D, Permute, Reshape, TimeDistributed)
backend.clear_session()
lookback = 20
n_features = 5
filters = 128
model = Sequential()
model.add(InputLayer(input_shape=(lookback, n_features)))
# Causal layers are first applied to the features independently
model.add(Permute(dims=(2, 1)))  # UPDATE: must permute prior to adding the new dim and reshaping
model.add(Reshape(target_shape=(n_features, lookback, 1)))
# After reshape 5 input features are now treated as the temporal layer
# for the TimeDistributed layer
# When Conv1D is applied to each input feature, it thinks the shape of the layer is (20, 1)
# with the default "channels_last", therefore...
# 20 time steps is the temporal dimension
# 1 is the "channel", the new location for the feature maps
model.add(TimeDistributed(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**0)))
# You could add pooling here if you want.
# If you want interaction between features AND causal/dilation, then apply later
model.add(TimeDistributed(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**1)))
model.add(TimeDistributed(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**2)))
# Stack feature maps on top of each other so each time step can look at
# all features produced earlier
model.add(Permute(dims=(2, 1, 3))) # UPDATED to fix issue with reshape
model.add(Reshape(target_shape=(lookback, n_features * filters))) # (20 time steps, 5 features * 128 filters)
# Causal layers are applied to the 5 input features dependently
model.add(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**0))
model.add(MaxPool1D())
model.add(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**1))
model.add(MaxPool1D())
model.add(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**2))
model.add(GlobalMaxPool1D())
model.add(Dense(units=1, activation='linear'))
model.compile(optimizer='adam', loss='mean_squared_error')
model.summary()
Final model summary:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
permute (Permute)            (None, 5, 20)             0
_________________________________________________________________
reshape (Reshape)            (None, 5, 20, 1)          0
_________________________________________________________________
time_distributed (TimeDistri (None, 5, 20, 128)        512
_________________________________________________________________
time_distributed_1 (TimeDist (None, 5, 20, 128)        49280
_________________________________________________________________
time_distributed_2 (TimeDist (None, 5, 20, 128)        49280
_________________________________________________________________
permute_1 (Permute)          (None, 20, 5, 128)        0
_________________________________________________________________
reshape_1 (Reshape)          (None, 20, 640)           0
_________________________________________________________________
conv1d_3 (Conv1D)            (None, 20, 128)           245888
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 10, 128)           0
_________________________________________________________________
conv1d_4 (Conv1D)            (None, 10, 128)           49280
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 5, 128)            0
_________________________________________________________________
conv1d_5 (Conv1D)            (None, 5, 128)            49280
_________________________________________________________________
global_max_pooling1d (Global (None, 128)               0
_________________________________________________________________
dense (Dense)                (None, 1)                 129
=================================================================
Total params: 443,649
Trainable params: 443,649
Non-trainable params: 0
_________________________________________________________________
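To smoke-test the model end to end, one can fit it on random arrays; a minimal sketch (the X and y arrays are made-up placeholders, not data from the question):

import numpy as np

X = np.random.rand(64, lookback, n_features).astype("float32")  # (batch, 20, 5)
y = np.random.rand(64, 1).astype("float32")
model.fit(X, y, epochs=2, batch_size=16)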
EDIT:

"why you need to reshape and use n_features as the temporal layer"

The reason n_features initially needs to be in the temporal layer is that Conv1D with dilation and causal padding only works with one feature at a time, and because of how the TimeDistributed layer is implemented.
From their documentation: "Consider a batch of 32 samples, where each sample is a sequence of 10 vectors of 16 dimensions. The batch input shape of the layer is then (32, 10, 16), and the input_shape, not including the samples dimension, is (10, 16).

You can then use TimeDistributed to apply a Dense layer to each of the 10 timesteps, independently:"
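A minimal sketch of that documented example (the 8-unit Dense is an arbitrary choice): one set of Dense weights is applied to each of the 10 timesteps.

import numpy as np
from tensorflow.keras.layers import Dense, TimeDistributed

x = np.random.rand(32, 10, 16).astype("float32")  # (samples, timesteps, features)
y = TimeDistributed(Dense(8))(x)
print(y.shape)  # (32, 10, 8): the same Dense, applied independently at all 10 timesteps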
By applying the TimeDistributed layer independently to each feature, it reduces the dimensionality of the problem as if there were only a single feature (which easily allows for dilation and causal padding). With 5 features, you need to first handle each of them separately.
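To make that concrete, here is a sketch of just the permute/reshape trick from the model above, traced shape by shape (the tiny batch of 2 is arbitrary):

import numpy as np
from tensorflow.keras.layers import Conv1D, Permute, Reshape, TimeDistributed

x = np.random.rand(2, 20, 5).astype("float32")  # (batch, lookback, n_features)
x = Permute((2, 1))(x)                          # (2, 5, 20): features moved to index 1
x = Reshape((5, 20, 1))(x)                      # five independent single-channel series
y = TimeDistributed(Conv1D(128, 3, padding="causal", dilation_rate=2))(x)
print(y.shape)  # (2, 5, 20, 128): each feature gets its own causal feature maps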