【Question title】: How to improve accuracy and validation accuracy in deep learning
【Posted】: 2020-09-25 02:35:46
【Question】:

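I am training a CNN on my own data. I tried resnet50 and resnet101 as well as my own model on the same data, and I get an accuracy of 63 and a validation accuracy of 0.08. I know the problem is in my data, and I want to try shuffling the data before splitting it, but my data is in 26 different classes. How do I shuffle the data before splitting it into training and validation sets? My dataset has more than 36K images.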

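# imports assumed for this snippet (tf.keras; the original post does not show them)
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.layers import Conv2D, Dense, Dropout, Flatten, MaxPooling2D
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# data, labels and args come from earlier in the script (not shown in the post)
# stratified split: shuffles by default and keeps class proportions in both sets
(trainX, testX, trainY, testY) = train_test_split(data, labels,
    test_size=0.25, stratify=labels, random_state=42)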

# initialize the training data augmentation object
# (featurewise_center=True is needed for the mean set below to be applied)
trainAug = ImageDataGenerator(
    featurewise_center=True,
    rotation_range=30,
    zoom_range=0.15,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.15,
    horizontal_flip=True,
    fill_mode="nearest")

# initialize the validation/testing data augmentation object (which
# we'll be adding mean subtraction to)
valAug = ImageDataGenerator(featurewise_center=True)

# define the ImageNet mean subtraction (in RGB order) and set the
# mean subtraction value for each of the data augmentation
# objects
mean = np.array([123.68, 116.779, 103.939], dtype='float32')
trainAug.mean = mean
valAug.mean = mean

model = Sequential()
# The first two layers with 32 filters of window size 5x5
model.add(Conv2D(32, (5, 5), padding='same', activation='relu', input_shape=(224, 224, 3)))
model.add(Conv2D(32, (5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (5, 5), padding='same', activation='relu'))
model.add(Conv2D(64, (5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
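# one output unit per class (the question states 26 classes);
# Dense expects an integer here, not the label array
model.add(Dense(26, activation='softmax'))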


print("[INFO] compiling model...")
opt = SGD(lr=1e-4, momentum=0.9, decay=1e-4 / args["epochs"])
model.compile(loss="categorical_crossentropy", optimizer=opt,
    metrics=["accuracy"])
print("[INFO] training head...")
H = model.fit(
    x=trainAug.flow(trainX, trainY, batch_size=32),
    steps_per_epoch=len(trainX) // 32,
    validation_data=valAug.flow(testX, testY),
    validation_steps=len(testX) // 32,
    epochs=args["epochs"])
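
Before changing the split, it can help to confirm that both subsets really contain every class in similar proportions. Note that train_test_split already shuffles by default (shuffle=True), and stratify=labels additionally keeps the class proportions equal in both subsets. The following is a minimal sketch, assuming the labels are one-hot encoded (which categorical_crossentropy implies). The helper class_fractions is hypothetical, and the arrays are synthetic stand-ins for trainY and testY, not the asker's data.

```python
import numpy as np


def class_fractions(one_hot_labels):
    """Return the fraction of samples that belong to each class."""
    counts = np.asarray(one_hot_labels).sum(axis=0)
    return counts / counts.sum()


# Synthetic stand-ins for trainY / testY: 26 classes, one-hot encoded.
num_classes = 26
rng = np.random.default_rng(42)
train_ids = rng.integers(0, num_classes, size=2700)
test_ids = rng.integers(0, num_classes, size=900)
trainY_demo = np.eye(num_classes)[train_ids]
testY_demo = np.eye(num_classes)[test_ids]

train_frac = class_fractions(trainY_demo)
test_frac = class_fractions(testY_demo)

# A class that is missing from a subset shows up with a fraction of exactly 0.
print("classes missing from train:", np.flatnonzero(train_frac == 0))
print("classes missing from test: ", np.flatnonzero(test_frac == 0))
# Largest gap in class share between the two subsets.
print("max difference in class share:", np.abs(train_frac - test_frac).max())
# Accuracy of random guessing on balanced classes, for reference.
print("chance level:", 1 / num_classes)
```

If a class is missing from one subset, or the shares differ widely, the split is the first thing to fix. If the distributions match, the gap between training and validation accuracy more likely comes from preprocessing or the model itself.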

【Comments】:

    Tags: python tensorflow keras deep-learning


    【Answer 1】:

    You can use the validation_split keyword of ImageDataGenerator to split your data into training and validation subsets automatically.

    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    train_datagen = ImageDataGenerator(rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        validation_split=0.2) # set validation split
    
    train_generator = train_datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_height, img_width),
        batch_size=batch_size,
        class_mode='categorical', # one-hot labels for the 26 classes (use 'binary' only for 2 classes)
        subset='training') # set as training data
    
    validation_generator = train_datagen.flow_from_directory(
        train_data_dir, # same directory as training data
        target_size=(img_height, img_width),
        batch_size=batch_size,
        class_mode='categorical',
        subset='validation') # set as validation data
    
    # model.fit accepts generators directly; fit_generator is deprecated in TF 2.x
    model.fit(
        train_generator,
        steps_per_epoch = train_generator.samples // batch_size,
        validation_data = validation_generator, 
        validation_steps = validation_generator.samples // batch_size,
        epochs = nb_epochs)
    

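    Because ImageDataGenerator shuffles its input by default (shuffle=True), using it gives you data that is both shuffled and split. Note that this shuffling only affects the order in which samples are drawn from each subset. With flow, the split itself takes a contiguous slice of the array, so arrays that are sorted by class should be shuffled before they are passed in.

    In your case, you will need flow instead of flow_from_directory, since your images are already loaded into arrays.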
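
For arrays that are already in memory, the idea above could look roughly like the sketch below. It assumes TensorFlow 2.x (the tensorflow.keras import path) and one-hot labels. The helpers shuffle_in_unison and make_generators are hypothetical names, and the demo arrays are tiny synthetic stand-ins for data and labels. As far as I know, flow with subset cuts the arrays at a fixed index, so two generators that share the same validation_split see the same split, which allows augmentation on the training subset only.

```python
import numpy as np


def shuffle_in_unison(x, y, seed=42):
    """Apply one random permutation to both arrays so pairs stay aligned."""
    order = np.random.default_rng(seed).permutation(len(x))
    return x[order], y[order]


def make_generators(x, y, val_fraction=0.25, batch_size=32):
    """Build training/validation iterators with ImageDataGenerator.flow."""
    # Imported here so the NumPy helper above can be used without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Both generators share validation_split, so they cut the arrays at the
    # same index; only the training generator augments.
    train_datagen = ImageDataGenerator(
        rotation_range=30,
        horizontal_flip=True,
        validation_split=val_fraction)
    val_datagen = ImageDataGenerator(validation_split=val_fraction)

    train_iter = train_datagen.flow(
        x, y, batch_size=batch_size, subset="training")
    val_iter = val_datagen.flow(
        x, y, batch_size=batch_size, shuffle=False, subset="validation")
    return train_iter, val_iter


# Synthetic stand-ins, sorted by class as they would be after loading one
# class folder after another. Each "image" stores its class id as pixel value.
num_classes = 26
ids = np.repeat(np.arange(num_classes), 4)  # 0,0,0,0,1,1,1,1,...
labels_demo = np.eye(num_classes, dtype="float32")[ids]
data_demo = np.broadcast_to(
    ids[:, None, None, None], (len(ids), 8, 8, 3)).astype("float32")

data_s, labels_s = shuffle_in_unison(data_demo, labels_demo)
print("class ids before:", labels_demo.argmax(axis=1)[:10])
print("class ids after: ", labels_s.argmax(axis=1)[:10])

# With TensorFlow installed, training would then look like:
# train_iter, val_iter = make_generators(data_s, labels_s)
# model.fit(train_iter, validation_data=val_iter, epochs=args["epochs"])
```

Shuffling once with a fixed seed keeps the split reproducible between runs. Unlike train_test_split with stratify, this approach does not guarantee equal class proportions in both subsets; with 36K images the difference is usually small, but it is worth checking.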

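    【Comments】:

    • So I should split the data after the image generator. Thanks, I will try it.
    • @nada hussien If you find my answer satisfactory, please accept it as the best answer.
    • @Yannick Funk I will.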