Max Pooling across MRI Slices: Answers

[Question title]: Max Pooling across MRI Slices
[Posted]: 2021-07-11 10:23:31
[Question]:

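I'm trying to implement a machine learning model for diagnosis from MRI scans. My input has shape (x, 256, 256, 3), with 3 color channels, where x is the number of slices in the series. I read the MRNet paper and want to implement a similar architecture in TensorFlow Keras. Instead of the AlexNet feature extractor, I want to use VGG16.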

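Model architecture from the paper:

The primary building block of our prediction system is MRNet, a convolutional neural network (CNN) mapping a 3-dimensional MRI series to a probability [15] (Fig 2). The input to MRNet has dimensions s × 3 × 256 × 256, where s is the number of images in the MRI series (3 is the number of color channels). First, each 2-dimensional MRI image slice was passed through a feature extractor based on AlexNet to obtain a s × 256 × 7 × 7 tensor containing features for each slice. A global average pooling layer was then applied to reduce these features to s × 256. We then applied max pooling across slices to obtain a 256-dimensional vector, which was passed to a fully connected layer and sigmoid activation function to obtain a prediction in the 0 to 1 range.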
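To make the shapes in this description concrete, here is a rough NumPy sketch of the pooling steps. The feature maps and the fully connected weights are random stand-ins (assumptions, not anything from the paper), and s = 44 is an arbitrary slice count:

```python
import numpy as np

rng = np.random.default_rng(0)

s = 44  # number of slices in the series (arbitrary value for illustration)

# Stand-in for the per-slice feature maps produced by the feature extractor.
features = rng.standard_normal((s, 256, 7, 7))

# Global average pooling over the two spatial axes: (s, 256, 7, 7) -> (s, 256)
slice_vectors = features.mean(axis=(2, 3))

# Max pooling across slices: (s, 256) -> (256,)
series_vector = slice_vectors.max(axis=0)

# Fully connected layer with random, untrained weights (assumption), then sigmoid.
weights = rng.standard_normal(256) * 0.01
bias = 0.0
probability = float(1.0 / (1.0 + np.exp(-(series_vector @ weights + bias))))

print(slice_vectors.shape, series_vector.shape, probability)
```

The point is only the order of the reductions: spatial averaging happens per slice, and the maximum is taken over the slice axis afterwards.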

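So far so good. I have a Sequential model: first I add the feature extractor, then I apply GlobalAveragePooling2D() to reduce the features to shape (x, 512). After that I have to max-pool across the slices, but I can't figure out how to do that.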

from tensorflow.keras import Sequential
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D

feature_extractor = VGG16(weights='imagenet', include_top=False, input_shape=(256, 256, 3))
model = Sequential()
model.add(feature_extractor)         # output shape: (x, 8, 8, 512)
model.add(GlobalAveragePooling2D())  # output shape: (x, 512)
# Here I have to add a layer which pools over the slices.
# model.add(                         )  # output shape: (1, 512)

model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

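An example scan has shape (44, 256, 256, 3). After it passes through VGG16, its features have shape (44, 8, 8, 512). After global average pooling I get (44, 512). This 2-D array then has to be reduced to shape (1, 512) somehow. In other words, if I were working on a plain 2-D NumPy array, I would need something like np.max over axis 0: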

np.max(x, axis=0)
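As a small sketch of what this reduction does to the shapes (the array below is made-up stand-in data, not real features): by default np.max drops the reduced axis and returns shape (512,), and keepdims=True is what gives (1, 512).

```python
import numpy as np

# Stand-in for the (44, 512) per-slice features; values increase row by row.
features = np.arange(44 * 512, dtype=np.float32).reshape(44, 512)

pooled = np.max(features, axis=0)                    # shape (512,)
pooled_2d = np.max(features, axis=0, keepdims=True)  # shape (1, 512)

print(pooled.shape, pooled_2d.shape)
```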

Maybe you can give me a hint or suggest an approach. Thanks a lot for your help :)

############################################################################### EDIT: 01.05.2021

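I played around with your approach, @Aaron Keesing, but somehow fitting the model doesn't train it. After 25 epochs I still have the same accuracy. The accuracy equals the distribution of my 2 classes (I'm only training on the coronal plane and on the abnormal label).

In this case, for example, I have 500 cases; 80% of them do have an abnormality and 20% don't.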

# Dataset train, overall 500 cases
Absolute:
 abnormal  acl  meniscus
1         0    0           184
               1           118
0         0    0           100
1         1    1            63
               0            35
dtype: int64
Relative:
 abnormal  acl  meniscus
1         0    0           0.368
               1           0.236
0         0    0           0.200
1         1    1           0.126
               0           0.070

###########################################################
# Dataset valid, overall 100 cases
Absolute:
 abnormal  acl  meniscus
1         1    1           27
0         0    0           25
1         1    0           23
          0    0           20
               1            5
dtype: int64
Relative:
 abnormal  acl  meniscus
1         1    1           0.27
0         0    0           0.25
1         1    0           0.23
          0    0           0.20
               1           0.05
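For reference, the accuracy of a model that always predicts the more common abnormal label can be computed from the counts in the tables above. A minimal sketch in plain Python (the helper name majority_baseline is made up for illustration):

```python
# Case counts copied from the tables above, keyed by (abnormal, acl, meniscus).
train_counts = {(1, 0, 0): 184, (1, 0, 1): 118, (0, 0, 0): 100, (1, 1, 1): 63, (1, 1, 0): 35}
valid_counts = {(1, 1, 1): 27, (0, 0, 0): 25, (1, 1, 0): 23, (1, 0, 0): 20, (1, 0, 1): 5}


def majority_baseline(counts):
    """Accuracy of always predicting the more common 'abnormal' label."""
    total = sum(counts.values())
    positives = sum(n for (abnormal, _, _), n in counts.items() if abnormal == 1)
    return max(positives, total - positives) / total


train_baseline = majority_baseline(train_counts)
valid_baseline = majority_baseline(valid_counts)
print(train_baseline, valid_baseline)  # 0.8 0.75
```

An accuracy that stays at these values over many epochs suggests the model is predicting the same class for every case.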

[Comments]:

Tags: python tensorflow machine-learning keras

[Solution 1]:

After thinking about this problem for a while, I found a workable solution:

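    from tensorflow.keras import Sequential, constraints, layers
    from tensorflow.keras.applications import VGG16
    from tensorflow.keras.layers import Dense

    vgg16 = VGG16(weights='imagenet', include_top=False, input_shape=(256, 256, 3))

    # 2x2 average pooling (the default pool size), then flatten:
    # (8, 8, 512) -> (4, 4, 512) -> 8192
    average_pool = Sequential(name='AveragePool')
    average_pool.add(layers.AveragePooling2D(input_shape=(8, 8, 512)))
    average_pool.add(layers.Flatten())

    model = Sequential([
        vgg16,
        average_pool], name='MyModel')

    # Intended as the max-pooling step. Note that MaxNorm only constrains the
    # norm of the Dense weights; it does not take a maximum across slices.
    model.add(Dense(256, activation='relu', kernel_constraint=constraints.MaxNorm(max_value=2, axis=0)))
    model.add(Dense(1, activation='sigmoid'))

    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    average_pool.summary()
    model.summary()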

This results in the following summaries:

Model: "AveragePool"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
average_pooling2d (AveragePo (None, 4, 4, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 8192)              0         
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________
Model: "MyModel"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
vgg16 (Model)                (None, 8, 8, 512)         14714688  
_________________________________________________________________
AveragePool (Sequential)     (None, 8192)              0         
_________________________________________________________________
dense (Dense)                (None, 256)               2097408   
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 257       
=================================================================
Total params: 16,812,353
Trainable params: 16,812,353
Non-trainable params: 0
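The parameter counts in this summary can be checked by hand. A quick sketch of the arithmetic, with the VGG16 count taken as given from the summary:

```python
flattened = 4 * 4 * 512               # Flatten output size: 8192
dense_params = flattened * 256 + 256  # weights + biases of Dense(256)
output_params = 256 * 1 + 1           # weights + bias of Dense(1)
vgg16_params = 14_714_688             # taken from the summary above

total = vgg16_params + dense_params + output_params
print(dense_params, output_params, total)  # 2097408 257 16812353
```

Most of the added parameters come from flattening the 4 × 4 × 512 feature map before the first Dense layer.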

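If you have any other ideas or suggestions for improvement, please let me know!

[Comments]:

[Solution 2]:

You should be able to use a GlobalAveragePooling1D layer. Note, however, that it needs a batch dimension. Since your input is a sequence of images, it should be 5-dimensional, with the first dimension being batch_size (which can be 1).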
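As a small illustration of the extra batch dimension, here is a NumPy sketch that turns one series into a batch of size 1. The zero-filled scan and the label value are placeholders, not real data:

```python
import numpy as np

# Placeholder for one MRI series of 44 slices: (x, 256, 256, 3)
scan = np.zeros((44, 256, 256, 3), dtype=np.float32)
label = 1  # hypothetical binary label for this series

batch_x = np.expand_dims(scan, axis=0)           # (1, 44, 256, 256, 3)
batch_y = np.array([[label]], dtype=np.float32)  # (1, 1)

print(batch_x.shape, batch_y.shape)
```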

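I don't think image CNNs work with 5D input, so you can use a TimeDistributed layer to apply the CNN to each image in the sequence. That gives you a feature sequence of shape (x, 512), and GlobalAveragePooling1D then reduces it to the final feature vector.

So maybe something like this would work. Note that you have to specify the number of images in the sequence, x (which can be None):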

from tensorflow.keras import Sequential
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, GlobalAveragePooling1D, GlobalAveragePooling2D, TimeDistributed

x = None  # number of slices per series; None allows variable-length series

vgg16 = VGG16(weights='imagenet', include_top=False, input_shape=(256, 256, 3))
feature_extractor = Sequential()
feature_extractor.add(vgg16)                     # output shape: (bs, 8, 8, 512)
feature_extractor.add(GlobalAveragePooling2D())  # output shape: (bs, 512)

model = Sequential()
model.add(TimeDistributed(feature_extractor, input_shape=(x, 256, 256, 3)))  # output shape: (bs, x, 512)
# Pool over the slice axis.
model.add(GlobalAveragePooling1D())        # output shape: (bs, 512)
model.add(Dense(1, activation='sigmoid'))  # output shape: (bs, 1)

You can feed just one MRI series per batch, so that bs = 1.

With x = None, this gives the following model structure:

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
time_distributed (TimeDistri (None, None, 512)         14714688
_________________________________________________________________
global_average_pooling1d (Gl (None, 512)               0
_________________________________________________________________
dense (Dense)                (None, 1)                 513
=================================================================
Total params: 14,715,201
Trainable params: 14,715,201
Non-trainable params: 0
_________________________________________________________________
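
The question asks for max pooling across slices, as in the MRNet description quoted there, while the model above averages over them. Below is a minimal sketch of a max-pooling variant, assuming the same TimeDistributed setup and swapping in Keras's GlobalMaxPooling1D. The function name build_series_classifier and its parameters are made up for illustration, and the functional API is used only for convenience; this has not been trained or evaluated on MRI data.

```python
from tensorflow.keras import Input, Model
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import (
    Dense,
    GlobalAveragePooling2D,
    GlobalMaxPooling1D,
    TimeDistributed,
)


def build_series_classifier(num_slices=None, weights='imagenet'):
    """Hypothetical helper: one sigmoid output per MRI series.

    num_slices=None allows series of varying length; weights=None skips
    downloading the pretrained ImageNet weights.
    """
    vgg16 = VGG16(weights=weights, include_top=False, input_shape=(256, 256, 3))

    # Per-slice feature extractor: (256, 256, 3) -> (8, 8, 512) -> (512,)
    slice_features = GlobalAveragePooling2D()(vgg16.output)
    feature_extractor = Model(vgg16.input, slice_features, name='slice_features')

    series = Input(shape=(num_slices, 256, 256, 3))        # (bs, x, 256, 256, 3)
    features = TimeDistributed(feature_extractor)(series)  # (bs, x, 512)
    pooled = GlobalMaxPooling1D()(features)                # max over slices: (bs, 512)
    output = Dense(1, activation='sigmoid')(pooled)        # (bs, 1)

    model = Model(series, output, name='series_classifier')
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model
```

Calling build_series_classifier() with the default weights='imagenet' downloads the pretrained VGG16 weights on first use. Whether max or average pooling works better for a given dataset is an empirical question.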

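[Comments]:

Thanks for your help! I thought about it for a few days and played with some parameters, which led me to the solution I posted as an answer to my own question. I was very focused on using a single Sequential model rather than stacking them.
Your model will still only give one output value per image, whereas your question indicates that you want one output value per sequence of images, which is what my model does.
You're right, I didn't get any correct values when using my model. I've implemented your model and will play around with it. I'll let you know my results :)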