[Question Title]: Tensorflow - Preprocessing image in model prediction
[Posted]: 2021-08-03 12:33:32
[Question Description]:

I trained a model with the functional API and two different pretrained backbones: EfficientNet B5 and MobileNet V2. After training, I am running an application that uses the saved model to make some predictions.

My question is: what is the correct way to pass an image as the argument to model.predict()?

The model:

    self.feature_extractor1 = EfficientNetB5(#weights='imagenet',
                                  input_shape=self.input_shape,
                                  include_top=False)

    self.feature_extractor2 = MobileNetV2(#weights='imagenet',
                                  input_shape=self.input_shape,
                                  include_top=False)


    for layer in self.feature_extractor1.layers:
        layer.trainable = False    

    for layer in self.feature_extractor2.layers:
        layer.trainable = False        
    

    input_ = Input(shape=self.input_shape)
    processed_input1 = b5_preprocess_input(input_)

    processed_input2 = mbv2_preprocess_input(input_)

    x1 = self.feature_extractor1(processed_input1)
    x1 = GlobalAveragePooling2D()(x1)
    x1 = Dropout(0.2)(x1)
    x1 = Flatten()(x1)

    x2 = self.feature_extractor2(processed_input2)
    x2 = GlobalAveragePooling2D()(x2)
    x2 = Dropout(0.2)(x2)
    x2 = Flatten()(x2)

    x = Concatenate()([x1, x2])

    x = Dense(512, activation='relu')(x) #,kernel_initializer=initializer,kernel_regularizer=regularizers.l2(0.001)) 
    x = Dense(1024, activation='relu')(x)

    output_shape = Dense(shape_categories, activation='softmax', name='shape')(x)

    model = Model(inputs=input_,
                  outputs=output_shape)
                  
    adam_kwargs = {'beta_1': 0.9, 'beta_2': 0.9, 'epsilon': 1e-7}
    sgd_kwargs = {'decay': 1e-6, 'momentum': 0.9, 'nesterov': True}
    optimizer = self.optimizers(kwargs=adam_kwargs)
    
    model.compile(loss='categorical_crossentropy',
                  optimizer=optimizer,
                  metrics=['accuracy'])

    model.summary()

    STEP_SIZE_TRAIN = self.phase_gen[0].n// self.phase_gen[0].batch_size
    STEP_SIZE_VALID = self.phase_gen[1].n// self.phase_gen[1].batch_size
    if self.phases == 3:
        STEP_SIZE_TEST = self.phase_gen[2].n// self.phase_gen[2].batch_size

    checkpoint = ModelCheckpoint(self.model_dir,
                                monitor='val_accuracy',
                                verbose=1,
                                save_best_only=True,
                                mode='max')
    tensorboard = TensorBoard(log_dir=self.model_dir + '/logs',
                            histogram_freq=5,
                            embeddings_freq=5)
                            #[EarlyStopping(monitor='val_loss', patience=8)]
    callbacks = [checkpoint, tensorboard]

    
    hist = model.fit_generator(generator=self.phase_gen[0],
                               steps_per_epoch=STEP_SIZE_TRAIN,
                               validation_data=self.phase_gen[1],
                               validation_steps=STEP_SIZE_VALID,
                               epochs=self.epochs,
                               callbacks=callbacks
                               )

In another script I have the prediction code:

import io
import numpy as np
from PIL import Image
from tensorflow.keras.preprocessing import image  # provides image.img_to_array
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input as mbv2_preprocess_input
from tensorflow.keras.applications.efficientnet import preprocess_input as b5_preprocess_input

def preprocess_image(img):
    img = Image.open(io.BytesIO(img))
    img = img.resize((224, 224), Image.ANTIALIAS)
    img = image.img_to_array(img)
    img = np.expand_dims(img, axis=0)
    #return [b5_preprocess_input(img),  mbv2_preprocess_input(img)]
    return [img, img]

modelSHP = get_modelSHP()

@app.route('/part_numbers', methods=['POST'])
def part_number():
    img = request.files.get('image').read()
    processed_image = preprocess_image(img)
    predict_shape = modelSHP.predict(processed_image)

My first thought was that I needed to pass the input image preprocessed by the matching functions, in the same order they were used during training. But when I did that, my prediction accuracy stayed around zero. Passing the image with no preprocessing at all worked better.

Is the way I am passing the image to model.predict (without preprocessing) correct? I am wondering whether, because I used the functional API and built the model this way, the preprocessing became a layer inside each branch of the model.
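
One way I could check this (a rough sketch, assuming modelSHP is the model returned by get_modelSHP() in the script above) would be to list the loaded model's layers and look for preprocessing operations traced into the graph:

    # Sketch: inspect the loaded model for preprocessing traced into the graph.
    # Assumes modelSHP is the model returned by get_modelSHP() above.
    for layer in modelSHP.layers:
        print(layer.name, type(layer).__name__)  # TFOpLambda entries would be in-graph preprocessing
    # modelSHP.summary() prints the same structure together with output shapes.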

[Question Comments]:

    Tags: python tensorflow computer-vision conv-neural-network image-preprocessing


    [Solution 1]:

    I copied your code and printed the model summary, which is shown below:

    Model: "functional_5"
    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to                     
    ==================================================================================================
    input_23 (InputLayer)           [(None, 224, 224, 3) 0                                            
    __________________________________________________________________________________________________
    tf.math.truediv_5 (TFOpLambda)  (None, 224, 224, 3)  0           input_23[0][0]                   
    __________________________________________________________________________________________________
    tf.math.subtract_5 (TFOpLambda) (None, 224, 224, 3)  0           tf.math.truediv_5[0][0]          
    __________________________________________________________________________________________________
    efficientnetb5 (Functional)     (None, 7, 7, 2048)   28513527    input_23[0][0]                   
    __________________________________________________________________________________________________
    mobilenetv2_1.00_224 (Functiona (None, 7, 7, 1280)   2257984     tf.math.subtract_5[0][0]         
    __________________________________________________________________________________________________
    global_average_pooling2d_8 (Glo (None, 2048)         0           efficientnetb5[0][0]             
    __________________________________________________________________________________________________
    global_average_pooling2d_9 (Glo (None, 1280)         0           mobilenetv2_1.00_224[0][0]       
    __________________________________________________________________________________________________
    dropout_8 (Dropout)             (None, 2048)         0           global_average_pooling2d_8[0][0] 
    __________________________________________________________________________________________________
    dropout_9 (Dropout)             (None, 1280)         0           global_average_pooling2d_9[0][0] 
    __________________________________________________________________________________________________
    flatten_8 (Flatten)             (None, 2048)         0           dropout_8[0][0]                  
    __________________________________________________________________________________________________
    flatten_9 (Flatten)             (None, 1280)         0           dropout_9[0][0]                  
    __________________________________________________________________________________________________
    concatenate_3 (Concatenate)     (None, 3328)         0           flatten_8[0][0]                  
                                                                     flatten_9[0][0]                  
    __________________________________________________________________________________________________
    dense_6 (Dense)                 (None, 512)          1704448     concatenate_3[0][0]              
    __________________________________________________________________________________________________
    dense_7 (Dense)                 (None, 1024)         525312      dense_6[0][0]                    
    __________________________________________________________________________________________________
    shape (Dense)                   (None, 2)            2050        dense_7[0][0]                    
    ==================================================================================================
    Total params: 33,003,321
    Trainable params: 2,231,810
    Non-trainable params: 30,771,511
    

    As you suspected, the preprocessing became layers inside the model, so for prediction you do not have to preprocess the input yourself: it is built into the model. For EfficientNet, preprocess_input is just a pass-through, because EfficientNet expects input pixels in the 0 to 255 range; that is why the model summary shows the input (input_23) feeding directly into efficientnetb5. For MobileNet, preprocess_input scales the pixels to between -1 and +1 using pixel = pixel / 127.5 - 1, so the tf.math.truediv_5 layer divides input_23 by 127.5 and the tf.math.subtract_5 layer then subtracts 1.
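
    To make this concrete, here is a small standalone check (a sketch added for illustration, not code from your model) comparing the two preprocess_input functions with the formula the TFOpLambda layers apply:

        import numpy as np
        from tensorflow.keras.applications.efficientnet import preprocess_input as b5_preprocess_input
        from tensorflow.keras.applications.mobilenet_v2 import preprocess_input as mbv2_preprocess_input

        raw = np.array([[0.0, 127.5, 255.0]])    # raw pixel values in the 0-255 range

        print(b5_preprocess_input(raw.copy()))   # unchanged -> pass-through for EfficientNet
        print(mbv2_preprocess_input(raw.copy())) # [-1., 0., 1.] -> rescaled to [-1, 1]
        print(raw / 127.5 - 1.0)                 # same values as the truediv/subtract layers

    Because that rescaling is already part of the model graph, applying mbv2_preprocess_input again before model.predict() would scale the MobileNet branch's input twice, which is consistent with the near-zero accuracy you observed when preprocessing manually.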

    [Comments]:

    • Thanks! I had not checked the summary (unfortunately, that idea did not occur to me).