【Title】Shape error when training a model with Keras (TensorFlow)
【Posted】2019-05-19 05:51:38
【Question】

I'm new to machine learning, and I'm finding model training with Keras for TensorFlow hard to grasp. I'm trying to do time-series prediction with TensorFlow. I have a generator function that produces training data and labels:

x_batch, y_batch = next(generator)

print(x_batch.shape)
print(y_batch.shape)

(256, 60, 9)
(256, 60, 3)
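For anyone who wants to reproduce the setup, a generator with the batch shapes printed above can be sketched with random data. The names `batch_generator`, `NUM_X_SIGNALS = 9`, `NUM_Y_SIGNALS = 3`, and the 60-step window are assumptions read off the printed shapes, not the asker's actual code:

```python
import numpy as np

BATCH_SIZE, SEQ_LEN = 256, 60
NUM_X_SIGNALS, NUM_Y_SIGNALS = 9, 3  # hypothetical, inferred from the printed shapes

def batch_generator(batch_size=BATCH_SIZE, seq_len=SEQ_LEN):
    """Yield endless (x, y) batches of random data with the shapes shown above."""
    while True:
        x = np.random.randn(batch_size, seq_len, NUM_X_SIGNALS)
        y = np.random.randn(batch_size, seq_len, NUM_Y_SIGNALS)
        yield x, y

generator = batch_generator()
x_batch, y_batch = next(generator)
# x_batch.shape == (256, 60, 9), y_batch.shape == (256, 60, 3)
```

Note that the labels have a time axis: there is one 3-value target per timestep, which matters for the error discussed below.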

I build the model as follows:

model = Sequential()
model.add(LSTM(128, input_shape=(None, num_x_signals,), return_sequences=True))
model.add(Dropout(0.2))
model.add(BatchNormalization())

model.add(LSTM(128, return_sequences=True))
model.add(Dropout(0.1))
model.add(BatchNormalization())

model.add(LSTM(128))
model.add(Dropout(0.2))
model.add(BatchNormalization())

model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))

model.add(Dense(num_y_signals, activation='relu'))

opt = tf.keras.optimizers.Adam(lr=0.001, decay=1e-6)

# Compile model
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=opt,
    metrics=['accuracy']
)

My model summary looks like this:

model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_19 (LSTM)               (None, None, 128)         70656     
_________________________________________________________________
dropout_23 (Dropout)         (None, None, 128)         0         
_________________________________________________________________
batch_normalization_18 (Batc (None, None, 128)         512       
_________________________________________________________________
lstm_20 (LSTM)               (None, None, 128)         131584    
_________________________________________________________________
dropout_24 (Dropout)         (None, None, 128)         0         
_________________________________________________________________
batch_normalization_19 (Batc (None, None, 128)         512       
_________________________________________________________________
lstm_21 (LSTM)               (None, 128)               131584    
_________________________________________________________________
dropout_25 (Dropout)         (None, 128)               0         
_________________________________________________________________
batch_normalization_20 (Batc (None, 128)               512       
_________________________________________________________________
dense_12 (Dense)             (None, 32)                4128      
_________________________________________________________________
dropout_26 (Dropout)         (None, 32)                0         
_________________________________________________________________
dense_13 (Dense)             (None, 3)                 99        
=================================================================
Total params: 339,587
Trainable params: 338,819
Non-trainable params: 768

This is how I try to train the model:

tensorboard = TensorBoard(log_dir="logs/{}".format(NAME))

filepath = "RNN_Final-{epoch:02d}-{val_acc:.3f}"  # unique file name that will include the epoch and the validation acc for that epoch
checkpoint = ModelCheckpoint("models/{}.model".format(filepath), monitor='val_acc', verbose=1, save_best_only=True, mode='max')  # saves only the best ones; note the keyword arguments must go to ModelCheckpoint, not to str.format()

# Train model
history = model.fit_generator(
    generator=generator,
    epochs=EPOCHS,
    steps_per_epoch=100,      
    validation_data=validation_data,
    callbacks=[tensorboard, checkpoint],
)

# Score model
score = model.evaluate(validation_x, validation_y, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
# Save model
model.save("models/{}".format(NAME))

But when I try to train the model I get the following error:

 ---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-67-f5263636596b> in <module>()
     10     steps_per_epoch=100,
     11     validation_data=validation_data,
---> 12     callbacks=[tensorboard, checkpoint],
     13 )
     14 

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
   1777         use_multiprocessing=use_multiprocessing,
   1778         shuffle=shuffle,
-> 1779         initial_epoch=initial_epoch)
   1780 
   1781   def evaluate_generator(self,

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training_generator.py in fit_generator(model, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
    134             'or `(val_x, val_y)`. Found: ' + str(validation_data))
    135       val_x, val_y, val_sample_weights = model._standardize_user_data(
--> 136           val_x, val_y, val_sample_weight)
    137       val_data = val_x + val_y + val_sample_weights
    138       if model.uses_learning_phase and not isinstance(K.learning_phase(), int):

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, batch_size, check_steps, steps_name, steps, validation_split)
    915           feed_output_shapes,
    916           check_batch_axis=False,  # Don't enforce the batch size.
--> 917           exception_prefix='target')
    918 
    919       # Generate sample-wise weight values given the `sample_weight` and

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    180                            ': expected ' + names[i] + ' to have ' +
    181                            str(len(shape)) + ' dimensions, but got array '
--> 182                            'with shape ' + str(data_shape))
    183         if not check_batch_axis:
    184           data_shape = data_shape[1:]

ValueError: Error when checking target: expected dense_13 to have 2 dimensions, but got array with shape (1, 219, 3)

【Comments】

  • What is the shape of your validation input data?
  • Because you stop using return_sequences=True at some point, your network goes from 2D to 1D per sample. But your y data is 2D per sample, hence the error.
  • @IanQuah, my validation looks like this: X is (1, 219, 9) and Y is (1, 219, 3). I build the validation data with validation_data = (np.expand_dims(x_test_scaled, axis=0), np.expand_dims(y_test_scaled, axis=0)).... So essentially I took a pair of arrays with dimensions (1, 219, 9) and (1, 219, 3).
  • @yhenon, I construct the data with expand_dims as I said above, so essentially my Y is 3D.
  • @yhenon, I went ahead and passed return_sequences=True throughout, and this time I get the following error: expected dense_2 to have shape (None, 1), but got array with shape (219, 3). I don't understand why (None, 1) is expected.

Tags: python tensorflow keras time-series lstm


【Solution 1】

As @yhenon mentioned in the comments, since your model has an output at every timestep, you must use return_sequences=True for the last LSTM layer as well.

However, it's not clear what the task is (i.e. classification or regression). If it's a classification task, you must use 'categorical_crossentropy' as the loss function (instead of the 'sparse_categorical_crossentropy' you are currently using) and 'softmax' as the activation of the last layer.

On the other hand, if it's a regression task, you need a regression loss such as 'mse' or 'mae', and you should set the activation of the last layer according to the range of the output values (i.e. use 'linear' if the output values include both negative and positive numbers).
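Assuming this is a regression task, the asker's architecture with the two changes applied (return_sequences=True on the last LSTM, a linear output layer, and an 'mse' loss) can be sketched as follows; num_x_signals=9 and num_y_signals=3 are taken from the shapes in the question:

```python
import tensorflow as tf

NUM_X_SIGNALS, NUM_Y_SIGNALS = 9, 3  # from the (256, 60, 9) / (256, 60, 3) batches

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(128, input_shape=(None, NUM_X_SIGNALS), return_sequences=True),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.LSTM(128, return_sequences=True),
    tf.keras.layers.Dropout(0.1),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.LSTM(128, return_sequences=True),  # was return_sequences=False: this kept only the last timestep
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(32, activation='relu'),      # Dense acts on the last axis, per timestep
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(NUM_Y_SIGNALS, activation='linear'),  # linear output for regression
])

model.compile(loss='mse', optimizer='adam', metrics=['mae'])
# model.output_shape is now (None, None, 3), so it matches targets of shape (batch, 60, 3)
```

With return_sequences=True everywhere, every layer keeps the time axis, so the final output has one 3-value prediction per timestep, matching the (batch, 60, 3) targets.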

【Discussion】

  • Thanks @Today, got it. I'm doing regression and was using 'sparse_categorical_crossentropy' as the loss, which I believe caused the expected (None, 1); when I changed it to mse everything works. Do you know of any guides I could use to learn more about building models and which layers and activation functions to use?
  • @ChiduMurthy Well, there are plenty of great articles about machine learning and Keras on the web. If you are looking for some model examples in Keras, you can find great ones in the Keras repository. If you are looking for a concise, quick guide to the losses and activation functions used in classification, you can read this answer (disclaimer: I wrote it!).
  • What about regression?
  • @ChiduMurthy Well, regression is more straightforward. For common regression tasks, the most common loss function is mean squared error, 'mse', and the activation is 'linear' (i.e. no activation). Of course, there are other (special) cases depending on the problem and the output values; for example, you could use sigmoid and binary_crossentropy as the activation and loss functions, respectively.
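To make the loss choice above concrete, here is a small numpy sketch of how 'mse' and 'mae' are computed; the target and prediction values are made up purely for illustration:

```python
import numpy as np

# Hypothetical targets and predictions for two samples of a 3-signal regression output
y_true = np.array([[0.1, 0.5, -0.2],
                   [0.3, 0.0,  0.4]])
y_pred = np.array([[0.0, 0.4, -0.1],
                   [0.5, 0.1,  0.2]])

mse = np.mean((y_true - y_pred) ** 2)   # mean squared error -> 0.02
mae = np.mean(np.abs(y_true - y_pred))  # mean absolute error -> 0.1333...
```

MSE penalizes large errors more heavily than MAE, which is why it is the usual default; MAE is more robust to outliers in the targets.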