Transfer Learning From a U-Net for Image Segmentation [Keras]

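【Question title】: Transfer Learning From a U-Net for Image Segmentation [Keras]
【Posted】: 2019-01-09 17:02:17
【Question】:

I'm just getting started with conv nets and am trying to solve an image segmentation problem. I have 24 images and their masks from the DSTL Satellite Imagery Feature Detection competition (https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection/data).

I thought I would try following the tips at https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html, but I'm stuck.

I downloaded the pretrained weights for ZF_UNET_224, the second-place winner's approach to this problem. My image masks contain 5 objects, so I popped the last layer, and instead of this: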

activation_45 (Activation) (None, 224, 224, 32) 0 batch_normalization_44[0][0]

spatial_dropout2d_2 (SpatialDro (None, 224, 224, 32) 0 activation_45[0][0]

conv2d_46 (Conv2D) (None, 224, 224, 1) 33 spatial_dropout2d_2[0][0]

I now have this:

activation_45 (Activation) (None, 224, 224, 32) 0 batch_normalization_44[0][0]

spatial_dropout2d_2 (SpatialDro (None, 224, 224, 32) 0 activation_45[0][0]

predictions (Conv2D) (None, 224, 224, 5) 10 conv2d_46[0][0]

I'm trying to follow the exact steps from the Keras tutorial, but when I run

my_model.fit_generator( train_generator, steps_per_epoch= 4, epochs=10, validation_data=validation_generator )

I get an error message saying

Output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: [[[[1. 1. 1. ] [1. 1. 1. ] [1. 1. 1. ] … [1. 1. 1. ] [1. 1. 1. ] [1. 1. 1. ]]

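I think what I want is a probability for every pixel of my 224x224 image, so that I can use those probabilities to generate masks on the original images, but I'm not sure how to go about it.

I have 24 8-band input images and their masks labelling 5 objects. I want to train this U-Net on these images, put masks on some test images, and evaluate them by IoU or weighted log loss. Any help?

Update:

I'm using the same generators as in the Keras tutorial: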

    from keras.preprocessing.image import ImageDataGenerator

    batch_size = 4

    # this is the augmentation configuration we will use for training
    train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

    # this is the augmentation configuration we will use for testing:
    # only rescaling
    test_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

    # this is a generator that will read pictures found in
    # subfolders of 'data/train', and indefinitely generate
    # batches of augmented image data
    train_generator = train_datagen.flow_from_directory(
        'data/train',  # this is the target directory
        target_size=(224, 224),  # all images will be resized
        batch_size=batch_size,
        color_mode='rgb',
        class_mode=None)  # since we use binary_crossentropy loss, we need binary labels

    # this is a similar generator, for validation data
    validation_generator = test_datagen.flow_from_directory(
        'data/valid',
        target_size=(224, 224),
        batch_size=batch_size,
        color_mode='rgb',
        class_mode=None)

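One more thing: my training images have 8 bands, but the architecture only accepts 3 bands. I think the generator ends up keeping only 1 band. I don't know how to solve this either.

【Question comments】:

Can you share your training generator? The last line should look like yield X, y. It looks like you are only yielding one of the two.
I'm using the standard generator format. Updated above.

Tags: keras deep-learning image-segmentation transfer-learning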

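【Solution 1】:

About your error message:

With flow_from_directory(), your ImageDataGenerator infers the class labels from the directory structure containing your images. As in the example, the images should be arranged in one subfolder per class. With class_mode=None, the generator yields only batches of images and no labels, which is why Keras complains that it did not receive an (x, y) tuple.

For your image segmentation problem, the label structure is more complex than a single label per image. The labels are masks with one label per pixel. In general, you want to feed these labels to the model as NumPy arrays during training.

You won't be able to handle your case with flow_from_directory(). One solution is to write your own custom generator that reads images and labels from disk, and use it with fit_generator().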
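As a minimal sketch of what such a generator has to produce, here is a plain Python generator that yields (x, y) tuples forever. It assumes the images and one-hot masks already sit in memory as NumPy arrays; the function name `paired_batches` and the zero-filled stand-in arrays are hypothetical and only there for illustration.

```python
import numpy as np


def paired_batches(images, masks, batch_size):
    """Yield (x, y) batches forever, which is the shape fit_generator() expects.

    images: array of shape (n, height, width, channels)  (hypothetical input)
    masks:  array of shape (n, height, width, classes)   (hypothetical input)
    """
    n = len(images)
    while True:
        order = np.random.permutation(n)  # reshuffle once per pass over the data
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            # Index images and masks with the same indices so they stay aligned
            yield images[idx], masks[idx]


# Stand-in data: 24 three-channel images and masks with 5 classes.
images = np.zeros((24, 224, 224, 3), dtype=np.float32)
masks = np.zeros((24, 224, 224, 5), dtype=np.float32)
x, y = next(paired_batches(images, masks, batch_size=4))
```

Each call to next() returns one image batch together with its matching mask batch, which is the pairing that was missing from the flow_from_directory() setup.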

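Suppose you have a .csv file with two columns, one containing the image names and one containing the paths to the corresponding masks.

Then your generator could look like this (I'm using pandas to read the .csv file):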

import os
from math import ceil

import numpy as np
import pandas as pd
from keras.utils import Sequence
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array


class DataSequence(Sequence):
    """
    Keras Sequence object to train a model on larger-than-memory data.
    df: pandas dataframe from the csv file
    data_path: path where your images live
    """

    def __init__(self, df, data_path, batch_size):
        self.batch_size = batch_size
        self.data_path = data_path
        self.im_list = df['images'].tolist()
        self.mask_list = df['labels'].tolist()

    def __len__(self):
        """Number of batches; the last batch may be smaller than batch_size."""
        return int(ceil(len(self.im_list) / float(self.batch_size)))

    def get_batch_images(self, idx, path_list):
        # Fetch a batch of images from a list of paths.
        # Assumption: relative entries are resolved against data_path
        # (os.path.join leaves absolute paths unchanged).
        batch = path_list[idx * self.batch_size: (1 + idx) * self.batch_size]
        return np.array([img_to_array(load_img(os.path.join(self.data_path, p)))
                         for p in batch])

    def __getitem__(self, idx):
        # Images and masks are read with the same batch index so they stay paired
        batch_x = self.get_batch_images(idx, self.im_list)
        batch_y = self.get_batch_images(idx, self.mask_list)
        return batch_x, batch_y

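I'm using a Keras Sequence object here to write the generator, because it allows safe multiprocessing, which speeds up training. See the docs on this topic.

About the actual transfer learning question:

You won't be able to use an architecture that was pretrained on 3-channel images on 8-channel images as is. If you want to use that architecture, you could subsample the channels, or perform a dimensionality reduction from 8 channels to 3. See also this thread.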
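Both options can be sketched with NumPy alone. The snippet below is an illustration under stated assumptions, not a recommendation of specific bands: the function names, the default choice of bands 0 to 2, and the random stand-in image are all hypothetical, and PCA is only one possible way to do the dimensionality reduction.

```python
import numpy as np


def select_bands(image, bands=(0, 1, 2)):
    """Subsample channels: keep only the chosen bands of an (H, W, C) image."""
    return image[..., list(bands)]


def pca_reduce(image, n_components=3):
    """Project the C channels of an (H, W, C) image onto its top principal components."""
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(np.float64)  # astype makes a copy
    pixels -= pixels.mean(axis=0)                      # center each band
    cov = np.cov(pixels, rowvar=False)                 # (C, C) band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :n_components]           # vectors of the largest eigenvalues
    return (pixels @ top).reshape(h, w, n_components)


rng = np.random.default_rng(0)
image = rng.random((224, 224, 8))  # stand-in for one 8-band image
subsampled = select_bands(image)   # shape (224, 224, 3)
reduced = pca_reduce(image)        # shape (224, 224, 3)
```

This sketch fits the projection on a single image for brevity. In practice you would probably fit it once on pixels from the whole training set and reuse the same projection for validation and test images, so that every image is mapped into the same 3-channel space.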

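【Comments】:

Hi, thank you very much for your reply. I looked at some issues on the Keras GitHub and found that they suggest making a generator that yields both images and masks, so I combined the two using zip(image_generator, mask_generator). I still get an error saying ValueError: Error when checking target: expected conv2d_26 to have shape (224, 224, 5) but got array with shape (224, 224, 3)
You need to convert your masks from RGB images to images in which each pixel is a binary vector indicating which class that pixel belongs to. See this related answer.
Thanks for the reply. I'm not entirely sure how to apply one-hot encoding to the mask generator. Any suggestions?
You're using Kaggle data. Have a look at some of the kernels there to see how others have done it.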
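
The one-hot conversion of the masks discussed in these comments could be sketched as follows. This assumes the masks are color-coded RGB images with exactly one color per class; the palette below is hypothetical and has to be replaced with the colors actually used in your masks, and the function name is made up for illustration.

```python
import numpy as np

# Hypothetical palette: one RGB color per class (5 classes).
PALETTE = np.array([
    [255, 0, 0],
    [0, 255, 0],
    [0, 0, 255],
    [255, 255, 0],
    [0, 255, 255],
], dtype=np.uint8)


def rgb_mask_to_one_hot(mask, palette=PALETTE):
    """Convert an (H, W, 3) color-coded mask to an (H, W, n_classes) binary array."""
    # Compare every pixel against every palette color: shape (H, W, n_classes, 3)
    matches = mask[:, :, None, :] == palette[None, None, :, :]
    # A pixel belongs to a class only if all three color components match
    return matches.all(axis=-1).astype(np.float32)


mask = np.zeros((224, 224, 3), dtype=np.uint8)
mask[:112] = PALETTE[0]  # top half is class 0
mask[112:] = PALETTE[3]  # bottom half is class 3
one_hot = rgb_mask_to_one_hot(mask)  # shape (224, 224, 5)
```

The result has the (224, 224, 5) shape that the error message asks for. Pixels whose color is not in the palette end up as all zeros, so you may want an explicit background class if your masks contain unlabeled areas.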