【Question Title】: Freeze layers with the multi_gpu_model in Keras
【Posted】: 2018-01-25 23:32:27
【Question Description】:

I am trying to fine-tune a modified InceptionV3 model in Keras.

I followed the example "Fine-tune InceptionV3 on a new set of classes" on this page.

So I first trained the top dense layers that I added to the InceptionV3 base model, using the following code:

model = Model(inputs=base_model.input, outputs=predictions)

for layer in base_model.layers:
    layer.trainable = False

parallel_model = multi_gpu_model(model, gpus=2)

parallel_model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

history = parallel_model.fit_generator(generate_batches(path), steps_per_epoch=num_images // batch_size, epochs=num_epochs)

After that, I tried to fine-tune the top 2 inception blocks of InceptionV3. According to the example, what I should do is:

for layer in model.layers[:249]:
    layer.trainable = False
for layer in model.layers[249:]:
    layer.trainable = True

model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy')

model.fit_generator(...)

But since I am using multi_gpu_model, I don't know how to freeze the first 249 layers.

What I mean is: if I freeze the layers in the non-GPU model (as in the example) and then rebuild the parallel model with parallel_model = multi_gpu_model(model, gpus=2), the weights of the just-trained top dense layers contained in parallel_model will be overwritten, right?

On the other hand, I tried using for layer in parallel_model.layers[:249]: layer.trainable = False directly, but when I inspected the layers of parallel_model, it showed:

for i, layer in enumerate(parallel_model.layers):
    print(i, layer.name)

(0, 'input_1')
(1, 'lambda_1')
(2, 'lambda_2')
(3, 'model_1')
(4, 'dense_3')

So what are the 'lambda_1', 'lambda_2' and 'model_1' layers? And why does parallel_model show only 5 layers?

More importantly, how can I freeze layers in parallel_model?

【Question Discussion】:

    Tags: keras multi-gpu


    【Solution 1】:

    This example is a bit more complicated, because you are nesting a base model

    base_model = InceptionV3(weights='imagenet', include_top=False)
    

    into a model that adds your own dense layers,

    model = Model(inputs=base_model.input, outputs=predictions)
    

    and then calling multi_gpu_model, which nests the model one more time: it uses Lambda layers to slice the input batch once per GPU, applies the nested model to each slice, and concatenates the outputs back together, so that the model is distributed across multiple GPUs.

    parallel_model = multi_gpu_model(model, gpus=2)
    

    Two things to keep in mind in this situation: change the trainability of the layers through base_model, and build the non-parallel template model on the CPU for best performance.
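
    To see the nesting for yourself, you can walk the layers of parallel_model and reach the original InceptionV3 layers through the nested template model. A minimal sketch (the layer name 'model_1' is whatever Keras assigned in your session, so check the printout first):

    # parallel_model only exposes the outer wrapper: the input, one slicing
    # Lambda per GPU, the nested template model, and the merged output.
    for i, layer in enumerate(parallel_model.layers):
        print(i, layer.name)

    # The nested template model is itself a layer and still holds all of the
    # original layers. It shares its layer objects with base_model, so setting
    # trainable on base_model.layers and recompiling parallel_model is what
    # actually takes effect.
    inner_model = parallel_model.get_layer('model_1')
    print(len(inner_model.layers))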

    Here is a complete fine-tuning example; just update train_data_dir to point at your own data location.

    import tensorflow as tf
    from keras import Model
    from keras.applications.inception_v3 import InceptionV3, preprocess_input
    from keras.layers import Dense, GlobalAveragePooling2D
    from keras.optimizers import SGD
    from keras.preprocessing.image import ImageDataGenerator
    from keras.utils import multi_gpu_model
    
    train_data_dir = '/home/ubuntu/work/data/train'
    batch_size_per_gpu = 32
    nb_classes = 3
    my_gpus = 2
    target_size = (224, 224)
    num_epochs_to_fit_dense_layer = 2
    num_epochs_to_fit_last_two_blocks = 3
    
    batch_size = batch_size_per_gpu * my_gpus
    train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
    train_iterator = train_datagen.flow_from_directory(
        train_data_dir,
        target_size=target_size,
        batch_size=batch_size,
        class_mode='categorical',
        shuffle=True)
    
    # Check to make sure our model will match our data
    assert nb_classes == train_iterator.num_classes
    
    # Create base and template models on cpu
    with tf.device('/cpu:0'):
        base_model = InceptionV3(weights='imagenet', include_top=False)
        for layer in base_model.layers:
            layer.trainable = False
    
        # Add prediction layer to base pre-trained model
        x = base_model.output
        x = GlobalAveragePooling2D()(x)
        x = Dense(1024, activation='relu')(x)
        predictions = Dense(nb_classes, activation='softmax')(x)
    
        template_model = Model(inputs=base_model.input, outputs=predictions)
    
        # If you need to load weights from previous training, do so here:
        # template_model.load_weights('template_model.h5', by_name=True)
    
    # Create parallel model on GPUs
    parallel_model = multi_gpu_model(template_model, gpus=my_gpus)
    parallel_model.compile(optimizer='adam', loss='categorical_crossentropy')
    
    # Train parallel model.
    history = parallel_model.fit_generator(
        train_iterator,
        steps_per_epoch=train_iterator.n // batch_size,
        epochs=num_epochs_to_fit_dense_layer)
    
    # Unfreeze some layers in our model
    for layer in base_model.layers[:249]:
        layer.trainable = False
    for layer in base_model.layers[249:]:
        layer.trainable = True
    
    # Train parallel_model with more trainable layers
    parallel_model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy')
    history2 = parallel_model.fit_generator(
        train_iterator,
        steps_per_epoch=train_iterator.n // batch_size,
        epochs=num_epochs_to_fit_last_two_blocks)
    
    # Save model via the template model which shares the same weights as the parallel model.
    template_model.save('template_model.h5')
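
    Because template_model and parallel_model share the same underlying weight tensors, the file saved above can later be reloaded for single-GPU or CPU inference without multi_gpu_model at all. A quick sketch (the random array is just a stand-in for a real batch run through preprocess_input):

    from keras.models import load_model
    import numpy as np

    # Reload the single-device template saved above.
    inference_model = load_model('template_model.h5')

    # Placeholder batch matching target_size; preprocess real images first.
    dummy_batch = np.random.rand(1, 224, 224, 3).astype('float32')
    print(inference_model.predict(dummy_batch).shape)  # (1, nb_classes)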
    

    【Discussion】:
