Deeplab - Inconsistent Inference vs Visualization Performance on trained Deeplab model

【Question title】: Deeplab - Inconsistent Inference vs Visualization Performance on trained Deeplab model
【Posted】: 2019-09-07 03:06:56
【Question description】:

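Describe the problem

I have successfully trained a model with Deeplab on a custom dataset that has 4 classes and 480x640 images, using an xception65 encoder. Whenever I use the vis.py script, I get decent results on the validation set: EvalImageA_ckpt, EvalImageB_ckpt. However, when I freeze the model, I do not get the same results on the same images.

I froze the model with export_model.py and it successfully wrote a frozen_model.pb file. However, when I run inference with this pb file on the same images linked above, the output is always 0 (i.e. everything is classified as "background"). Everything is black!

I believe this is a problem with how I export or load the model, and not necessarily with the model itself, because performance on the same images differs between the vis.py script and my custom inference code. Perhaps I am not loading the graph or initializing the variables correctly. Or perhaps I did not save the weights correctly in the first place. Any help would be greatly appreciated!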
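One way to confirm the "always 0" symptom numerically, rather than by looking at a black image, is to count the labels in the predicted segmentation map. The following is a minimal sketch; `label_counts` and `is_all_background` are hypothetical helper names, and the code assumes the map is a 2-D array (or nested list) of integer class IDs with 0 as background.

```python
from collections import Counter


def label_counts(seg_map):
    """Count how many pixels were assigned to each class ID.

    Works on any 2-D iterable of integers, including a NumPy array
    such as batch_seg_map[0] from the inference code.
    """
    return Counter(int(v) for row in seg_map for v in row)


def is_all_background(seg_map, background_label=0):
    """Return True when every pixel carries the background label."""
    return set(label_counts(seg_map)) == {background_label}


# Tiny hand-made maps standing in for real predictions.
healthy = [[0, 0, 1], [0, 3, 3]]
broken = [[0, 0, 0], [0, 0, 0]]

print(label_counts(healthy))
print(is_all_background(healthy), is_all_background(broken))
```

Running the same check on the output of vis.py and on the output of the frozen graph makes the difference between the two paths explicit.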

Source code

Below is my inference code:

from deeplab.utils import get_dataset_colormap
from PIL import Image
import tensorflow as tf
import time
import matplotlib.pyplot as plt
import numpy as np
import cv2
import os
import glob


# tensorflow arguments
flags = tf.app.flags  # flag object for setup
FLAGS = flags.FLAGS   # object to access initialized flags
flags.DEFINE_string('frozen', None,
                    'The path/to/frozen.pb file.')

def _load_graph(frozen):
    print('Loading model `deeplabv3_graph` into memory from',frozen)
    with tf.gfile.GFile(frozen, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(
            graph_def, 
            input_map=None, 
            return_elements=None, 
            name="", 
            op_dict=None, 
            producer_op_list=None
        )
    return graph

def _run_inferences(sess, image, title):
    batch_seg_map = sess.run('SemanticPredictions:0',
        feed_dict={'ImageTensor:0': [np.asarray(image)]})
    semantic_prediction = get_dataset_colormap.label_to_color_image(batch_seg_map[0],
        dataset=get_dataset_colormap.__PRDL3_V1).astype(np.uint8)
    plt.imshow(semantic_prediction)
    plt.axis('off')
    plt.title(title)
    plt.show()


def main(argv):
    # initialize model
    frozen = os.path.normpath(FLAGS.frozen)
    assert os.path.isfile(frozen)
    graph = _load_graph(frozen)

    # open graph resource and begin inference in-loop
    with tf.Session(graph=graph) as sess:
        for img_path in glob.glob('*.png'):
            img = Image.open(img_path).convert('RGB')
            _run_inferences(sess, img, img_path)

if __name__ == '__main__':
    flags.mark_flag_as_required('frozen')
    tf.app.run()  # call the main() function

Below is the command I used to export the model with the provided export_model.py script.

python export_model.py \
--logtostderr \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--checkpoint_path="/path/to/.../model.ckpt-32245" \
--export_path="/path/to/.../frozen_4_11_19.pb" \
--model_variant="xception_65" \
--num_classes=4 \
--crop_size=481 \
--crop_size=641 \
--inference_scales=1.0
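The crop_size values 481 and 641 in the command above match the convention that, as far as I recall from the DeepLab FAQ, crop_size should be output_stride * k + 1 and at least as large as the largest image dimension. A minimal sketch of that arithmetic follows; `min_crop_size` is a hypothetical helper, not part of the DeepLab code base.

```python
def min_crop_size(dim, output_stride=16):
    """Smallest value of the form output_stride * k + 1 that is >= dim."""
    # Ceiling division of (dim - 1) by output_stride, using integers only.
    k = -(-(dim - 1) // output_stride)
    return output_stride * k + 1


# Image height and width from the question (480x640).
print(min_crop_size(480), min_crop_size(640))
```

For the 480x640 images in the question and output_stride=16 this yields 481 and 641, the values passed to export_model.py.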

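System information

What is the top-level directory of the model you are using: deeplab
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
OS platform and distribution (e.g., Linux Ubuntu 16.04): Windows 10 Enterprise
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 1.12.0
Bazel version (if compiling from source): N/A
CUDA/cuDNN version: 9
GPU model and memory: NVIDIA Quadro M4000, 8GB
Exact command to reproduce: N/A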

【Question discussion】:

Tags: python tensorflow deeplab

【Solution 1】:

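Run the export script with the following flags. Compared with the command in the question, this one also passes --decoder_output_stride=4 and --dataset; set num_classes, crop_size, and dataset to match your own training run.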

python deeplab/export_model.py \
--logtostderr \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--num_classes=2 \
--decoder_output_stride=4 \
--crop_size=<<crop width>> \
--crop_size=<<crop height>> \
--dataset="shoes" \
--checkpoint_path="<<Checkpoint path>>" \
--export_path="<<Output frozen graph path>>"

【Discussion】:

python deeplab/export_model.py \
--logtostderr \
--input_type=image_tensor \
--checkpoint_path=/code/models/research/deeplab/weights_input_level_17/model.ckpt-22000 \
--export_path=/code/models/research/deeplab/frozen_weights_level_17/frozen_inference_graph.pb \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--crop_size=2048 \
--crop_size=2048 \
--num_classes=3 \
--dataset="pascal_voc_seg"
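To see which flags the answer's command passes that the question's command does not, the two commands can be compared mechanically. This is a minimal sketch: `parse_flags` and `missing_flags` are hypothetical helpers, and the command strings are abbreviated copies of the ones in this thread with the path and crop-size placeholders left out.

```python
import shlex


def parse_flags(command):
    """Collect --name=value flags; repeated flags accumulate in a list."""
    flags = {}
    for token in shlex.split(command):
        if not token.startswith("--"):
            continue
        name, _, value = token[2:].partition("=")
        flags.setdefault(name, []).append(value)
    return flags


def missing_flags(reference, candidate):
    """Flag names used in `reference` but absent from `candidate`."""
    return sorted(set(parse_flags(reference)) - set(parse_flags(candidate)))


question_cmd = (
    "python export_model.py --logtostderr --atrous_rates=6 --atrous_rates=12 "
    "--atrous_rates=18 --output_stride=16 --model_variant=xception_65 "
    "--num_classes=4 --crop_size=481 --crop_size=641 --inference_scales=1.0"
)
answer_cmd = (
    "python deeplab/export_model.py --logtostderr --model_variant=xception_65 "
    "--atrous_rates=6 --atrous_rates=12 --atrous_rates=18 --output_stride=16 "
    "--num_classes=2 --decoder_output_stride=4 --dataset=shoes"
)

print(missing_flags(answer_cmd, question_cmd))
```

The difference is limited to dataset and decoder_output_stride. If the model was trained with a decoder, exporting without --decoder_output_stride would build a different graph than the one used in training, which is one plausible reason for the mismatch described in the question.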

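【Solution 2】:

I am also struggling with my inference results. In my case, though, the exported model gives quite satisfactory results; they are just not as accurate as my visualization results.

Here is my script, which is based on the script provided as a demo. I hope it helps.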

import os

from matplotlib import gridspec
from matplotlib import pyplot as plt
import numpy as np
from PIL import Image
import time
import cv2
from tqdm import tqdm

import tensorflow as tf

# Needed to show segmentation colormap labels

from deeplab.utils import get_dataset_colormap
from deeplab.utils import labels_alstom

flags = tf.app.flags

FLAGS = flags.FLAGS

flags.DEFINE_string('model_dir', None, 'Where the model is')
flags.DEFINE_string('image_dir', None, 'Where the image is')
flags.DEFINE_string('save_dir', None, 'Dir for saving results')
flags.DEFINE_string('image_name', None, 'Image name')



class DeepLabModel(object):
    """Class to load deeplab model and run inference."""

    INPUT_TENSOR_NAME = 'ImageTensor:0'
    OUTPUT_TENSOR_NAME = 'SemanticPredictions:0'
    INPUT_SIZE = 513

    def __init__(self, model_dir):
        """Creates and loads pretrained deeplab model."""
        self.graph = tf.Graph()

        graph_def = None
        # Load the frozen graph (.pb file) from the path passed to the constructor.
        model_filename = model_dir
        with tf.gfile.FastGFile(model_filename, 'rb') as f:
            graph_def = tf.GraphDef()
            graph_def.ParseFromString(f.read())

        if graph_def is None:
            raise RuntimeError('Cannot load inference graph from %s.' % model_filename)

        with self.graph.as_default():
            tf.import_graph_def(graph_def, name='')

        self.sess = tf.Session(graph=self.graph)

    def run(self, image):
        """Runs inference on a single image.

        Args:
            image: A PIL.Image object, raw input image.

        Returns:
            resized_image: RGB image resized from original input image.
            seg_map: Segmentation map of `resized_image`.
        """
        width, height = image.size
        resize_ratio = 1.0 * self.INPUT_SIZE / max(width, height)
        target_size = (int(resize_ratio * width), int(resize_ratio * height))
        resized_image = image.convert('RGB').resize(target_size, Image.ANTIALIAS)
        print('Image resized')
        start_time = time.time()
        batch_seg_map = self.sess.run(
            self.OUTPUT_TENSOR_NAME,
            feed_dict={self.INPUT_TENSOR_NAME: [np.asarray(resized_image)]})
        print('Image processing finished')
        print('Elapsed time : ' + str(time.time() - start_time))
        seg_map = batch_seg_map[0]
        return resized_image, seg_map


model = DeepLabModel(FLAGS.model_dir)
print('Model created successfully')



def vis_segmentation(image, seg_map):
    
    seg_image = get_dataset_colormap.label_to_color_image(
         seg_map, get_dataset_colormap.get_alstom_name()).astype(np.uint8)
         
    return seg_image



def run_demo_image(image_path):
    """Runs the model on one image file and returns the colorized prediction."""
    try:
        print(image_path)
        original_im = Image.open(image_path)

    except IOError:
        print('Failed to read image from %s.' % image_path)
        return
    print('running deeplab on image...')
    resized_im, seg_map = model.run(original_im)

    return vis_segmentation(resized_im, seg_map)



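IMAGE_DIR = FLAGS.image_dir


files = os.listdir(IMAGE_DIR)
for f in tqdm(files):
    # os.path.join avoids relying on a trailing separator in the flag values.
    prediction = run_demo_image(os.path.join(IMAGE_DIR, f))
    if prediction is None:
        # The file could not be read as an image; skip it.
        continue
    Image.fromarray(prediction).save(os.path.join(FLAGS.save_dir, 'prediction_' + f))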
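Note that the run method in the script above rescales every input so that its longer side equals INPUT_SIZE before feeding it to the graph, so the returned segmentation map has the resized shape rather than the original one. A minimal sketch of that arithmetic, using a hypothetical helper `resized_size` that mirrors the expressions in run:

```python
def resized_size(width, height, input_size=513):
    """Target (width, height) after scaling the longer side to input_size."""
    resize_ratio = 1.0 * input_size / max(width, height)
    return int(resize_ratio * width), int(resize_ratio * height)


# A 640x480 (width x height) image, as in the question's dataset.
print(resized_size(640, 480))  # (513, 384)
```

Whether this rescaling contributes to the gap between exported-model and vis.py results is an assumption worth checking, for example by setting INPUT_SIZE to the image's longer side so that no resizing takes place.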

【Discussion】: