【Question title】: Keras and VGG training: why do I "lose" training and validation examples following model.predict_generator
【Posted】: 2017-07-18 10:26:47
【Question description】:

I am training VGG on some of my own images. I have the following code:

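import numpy as np
from keras import applications
from keras.preprocessing.image import ImageDataGenerator

img_width, img_height = 512, 512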
top_model_weights_path = 'UIP-versus-inconsistent.h5'
train_dir = 'MasterHRCT/Limited-Cuts-UIP-Inconsistent/train'
validation_dir = 'MasterHRCT/Limited-Cuts-UIP-Inconsistent/validation'
nb_train_samples = 1500
nb_validation_samples = 500
epochs = 50
batch_size = 16

def save_bottleneck_features():
    # Rescale pixel values to [0, 1]; no augmentation
    datagen = ImageDataGenerator(rescale=1. / 255)

    # VGG16 convolutional base without the fully connected top layers
    model = applications.VGG16(include_top=False, weights='imagenet')

    # Training images, in directory order (shuffle=False) and without labels
    generator = datagen.flow_from_directory(
        train_dir,
        target_size=(img_width, img_height),
        shuffle=False,
        class_mode=None,
        batch_size=batch_size
    )

    bottleneck_features_train = model.predict_generator(generator=generator, steps=nb_train_samples // batch_size)

    np.save(file="UIP-versus-inconsistent_train.npy", arr=bottleneck_features_train)

    # Same procedure for the validation images
    generator = datagen.flow_from_directory(
        validation_dir,
        target_size=(img_width, img_height),
        shuffle=False,
        class_mode=None,
        batch_size=batch_size,
    )

    bottleneck_features_validation = model.predict_generator(generator, nb_validation_samples // batch_size)

    np.save(file="UIP-versus-inconsistent_validate.npy", arr=bottleneck_features_validation)

After running this, I get the output I expect given my directories:

 Found 1500 images belonging to 2 classes.
 Found 500 images belonging to 2 classes

Then I run

 train_data = np.load(file="UIP-versus-inconsistent_train.npy")
 train_labels = np.array([0] * 750 + [1] * 750)
 validation_data = np.load(file="UIP-versus-inconsistent_validate.npy")
 validation_labels = np.array([0] * 250 + [1] * 250)

Then I check the data

 print("Train data shape", train_data.shape)
 print("Train_labels shape", train_labels.shape)
 print("Validation_data shape", validation_labels.shape)
 print("Validation_labels", validation_labels.shape)

and I get

Train data shape (1488, 16, 16, 512)
Train_labels shape (1488,)
Validation_data shape (496,)
Validation_labels (496,)

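This varies: instead of having 1500 training examples and 500 validation examples, it is as if I have "lost" some. Sometimes when I re-run save_bottleneck_features() the numbers come back, and other times they do not. It happens a lot when the process takes a long time. Is there a reproducible explanation for this? Could it be corrupted images?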

【Question discussion】:

    Tags: python machine-learning keras deep-learning vgg-net


    【Solution 1】:

    It's simple:

    1488 = (1500 // batch_size) * batch_size
    496 = (500 // batch_size) * batch_size
    

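    The missing samples come from integer division: nb_train_samples // batch_size rounds down to 93 steps, which covers only 1488 of the 1500 training images, and nb_validation_samples // batch_size rounds down to 31 steps, which covers only 496 of the 500 validation images.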
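
The arithmetic can be checked without Keras. The sketch below is only an illustration: `samples_covered` is a hypothetical helper (it is not part of Keras), and it assumes a non-shuffled generator that makes a single pass over the data and yields a smaller final batch. It compares the floor-divided step count used in the question with a step count that is rounded up instead.

```python
import math


def samples_covered(n_samples, batch_size, steps):
    """Samples seen after `steps` batches of one pass over `n_samples` items.

    Hypothetical helper for illustration; assumes the last batch may be
    smaller than `batch_size` and that the pass does not wrap around.
    """
    return min(n_samples, steps * batch_size)


batch_size = 16
for n_samples in (1500, 500):
    floor_steps = n_samples // batch_size           # drops the final partial batch
    ceil_steps = math.ceil(n_samples / batch_size)  # includes the final partial batch
    print(
        n_samples,
        samples_covered(n_samples, batch_size, floor_steps),
        samples_covered(n_samples, batch_size, ceil_steps),
    )
```

Under those assumptions, passing a rounded-up value such as math.ceil(nb_train_samples / batch_size) as steps would likely cover every image, while the hard-coded label arrays of 750 and 250 per class would then match the feature arrays in length.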

    【Discussion】:

    • Hmm... too easy :(