How can I use data augmentation in Keras and how do I prevent overfitting on the MNIST dataset? Answers

【Question title】: How can I use data augmentation in keras and how do I prevent overfitting on the mnist dataset?
【Posted】: 2020-10-31 17:35:35
【Question】:

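I want to train a Keras neural network on the MNIST dataset. The problem is that my model already overfits after 1 or 2 epochs. To solve this, I want to use data augmentation:

First I load the data: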

#load mnist dataset
(tr_images, tr_labels), (test_images, test_labels) = mnist.load_data()

#normalize images
tr_images, test_images = preprocess(tr_images, test_images)

#function which returns the amount of train images, test images and classes
amount_train_images, amount_test_images, total_classes = get_data_information(tr_images, tr_labels, test_images, test_labels)

#convert labels into the respective vectors
tr_vector_labels = keras.utils.to_categorical(tr_labels, total_classes) 
test_vector_labels = keras.utils.to_categorical(test_labels, total_classes)

I create a model with the "create_model" function:

untrained_model = create_model()

This is the function definition:

def create_model(_learning_rate=0.01, _momentum=0.9, _decay=0.001, _dense_neurons=128, _fully_connected_layers=3, _loss="sparse_categorical_crossentropy", _dropout=0.1):
    #create model
    model = keras.Sequential()

    #input
    model.add(Flatten(input_shape=(28, 28)))
    
    #add fully connected layers
    for i in range(_fully_connected_layers):
        model.add(Dense(_dense_neurons, activation='relu'))

    model.add(Dropout(_dropout))

    #classifier
    model.add(Dense(total_classes, activation='sigmoid'))

    optimizer = keras.optimizers.SGD(
        learning_rate=_learning_rate,
        momentum=_momentum,
        decay=_decay
    )

    #compile
    model.compile(
        optimizer=optimizer,
        loss=_loss,
        metrics=['accuracy']
    )

    return model

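The function returns a compiled but untrained model. I also use this function when I try to optimize my hyperparameters (hence the many parameters). Then I create an ImageDataGenerator: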

generator = tf.keras.preprocessing.image.ImageDataGenerator(
                rotation_range=0.15,
                width_shift_range=0.15,
                height_shift_range=0.15,
                zoom_range=0.15
            )

Now I want to train the model with my train_model_with_data_augmentation function:

train_model_with_data_augmentation(
                tr_images=tr_images, 
                tr_labels=tr_labels, 
                test_images=test_images, 
                test_labels=test_labels, 
                model=untrained_model,
                generator=generator,
                hyperparameters=hyperparameters
            )

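However, I don't know how to use this generator with the model I created, because the only method I found is the generator's fit method, but I want to train my model and not the generator.

This is the graph I get from the training history: https://ibb.co/sKFnwGr

Can I somehow convert the generator into data that can be passed as an argument to the model's fit method?
If not: how do I train the model I created with this generator? (Or do I have to implement data augmentation in a completely different way?)
Does data augmentation even make sense for the MNIST dataset?
What other options are there to prevent overfitting on MNIST?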

UPDATE: I tried to use this code:

generator.fit(x_train)
model.fit(generator.flow(x_train, y_train, batch_size=32), steps_per_epoch=len(x_train)/32, epochs=epochs)

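But I get this error message: ValueError: "Input to `.fit()` should have rank 4. Got array with shape: (60000, 28, 28)"

I believe the input array of the fit method should contain the image index, height, width and depth, so it should have 4 dimensions, whereas my x_train array has only 3 dimensions and no dimension for the image depth. I tried to expand it: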

x_train = x_train[..., np.newaxis]
y_train = y_train[..., np.newaxis]

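But then I get this error message: "Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated."

【Comments】:

Tags: python tensorflow machine-learning keras neural-network

【Solution 1】: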

A working example of using ImageDataGenerator can be found here. The example itself:

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

y_train = np_utils.to_categorical(y_train, num_classes)
y_test = np_utils.to_categorical(y_test, num_classes)

datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)
# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(x_train)

# fits the model on batches with real-time data augmentation:
model.fit(datagen.flow(x_train, y_train, batch_size=32),
          steps_per_epoch=len(x_train) // 32, epochs=epochs)
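The same pattern can be adapted to 28x28 grayscale digits. The block below is only a minimal sketch under several assumptions: a TensorFlow 2.x install where `ImageDataGenerator` is still available, SciPy installed (the random transforms need it), and random arrays standing in for MNIST so nothing has to be downloaded. The small model is illustrative and is not the asker's `create_model`. Two details differ from the code in the question: `rotation_range` is given in degrees (so `0.15` would mean almost no rotation), and `generator.fit(...)` is skipped because it is only required when `featurewise_center`, `featurewise_std_normalization` or `zca_whitening` is enabled.

```python
import numpy as np
import tensorflow as tf

# Random stand-in data with MNIST-like shapes (assumption: real code would
# use mnist.load_data() and scale the pixel values to [0, 1]).
rng = np.random.default_rng(0)
x_train = rng.random((256, 28, 28)).astype("float32")
y_train = rng.integers(0, 10, size=256)  # integer class labels, left 1-D

# The generator expects rank-4 input: (samples, height, width, channels).
x_train = x_train[..., np.newaxis]

generator = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=15,        # degrees
    width_shift_range=0.15,   # fraction of the image width
    height_shift_range=0.15,  # fraction of the image height
    zoom_range=0.15,
)

# Illustrative model; the input shape includes the channel axis.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.1),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(
    optimizer="sgd",
    loss="sparse_categorical_crossentropy",  # matches integer labels
    metrics=["accuracy"],
)

# The model is trained on augmented batches; the generator itself is not "trained".
history = model.fit(
    generator.flow(x_train, y_train, batch_size=32),
    epochs=1,
    verbose=0,
)
```

If one-hot labels from `to_categorical` are passed to `flow` instead, the loss would presumably need to be `categorical_crossentropy` rather than the sparse variant.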

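【Comments】:

I tried to use the code described in the documentation you linked, but I get an error. I have updated my original question accordingly. Any ideas how to solve it?
You only need to expand the x dataset (before calling datagen.fit(x_train)); y is fine as it is.
OK, but when I expand only x_train and x_test, it throws the same error. When I print the shape of the x dataset, it shows the shape (60000, 28, 28, 1), which should be correct, shouldn't it?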
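The shapes discussed in these comments can be checked with NumPy alone. The snippet below is a small illustration that uses zero-filled arrays with MNIST's shapes (an assumption, not the real data). It shows only that adding a channel axis to the images gives the rank-4 array the generator asks for, and that doing the same to the labels turns them into a column of shape (60000, 1). It does not reproduce or explain the second error reported in the question.

```python
import numpy as np

# Zero-filled stand-ins with the same shapes as the MNIST training arrays.
x_train = np.zeros((60000, 28, 28), dtype=np.uint8)
y_train = np.zeros((60000,), dtype=np.uint8)

# Adding a trailing channel axis makes the image array rank 4.
x_expanded = x_train[..., np.newaxis]
print(x_expanded.shape)  # (60000, 28, 28, 1)
print(x_expanded.ndim)   # 4

# Doing the same to the labels is not needed and changes their shape.
y_expanded = y_train[..., np.newaxis]
print(y_train.shape)     # (60000,)
print(y_expanded.shape)  # (60000, 1)
```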