Capsule networks for binary classification not training: answers

【Question title】: Capsule networks for binary classification not training
【Posted】: 2020-03-18 15:38:56
【Question】:

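I am currently trying to implement a capsule network using Xifeng Guo's Keras code for capsule nets. I have a dataset of brain tumor images with 98 negatively labeled and 155 positively labeled instances. I want to use a CapsNet to predict whether an image is positive or negative for a brain tumor. Unfortunately, I cannot figure out why training never gets past a fixed accuracy/loss. I tried data augmentation to increase the size of the dataset, which resulted in 50/50 predictions.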

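I have read the paper "Capsule Networks against Medical Imaging Data Challenges", in which the authors apply capsule networks to, among others, the DIARETDB1 dataset, which contains only 89 images, and report good results even without data augmentation (an F1 score of 0.887 in imbalance scenario 1). This leads me to believe that something is going wrong in my network. FYI: my images are normalized and cropped.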

Any input is appreciated!

%pylab inline
import os
import numpy as np

import tensorflow as tf
import keras
import keras.backend as K

from capsulelayers import CapsuleLayer, PrimaryCap, Length, Mask
from keras import layers, models, optimizers
from keras.applications import vgg16
from keras.layers import Conv2D, MaxPooling2D

K.set_image_data_format('channels_last')


def CapsNet(input_shape, n_class, routings):
   x = layers.Input(shape=input_shape)

   # Layer 1: Just a conventional Conv2D layer
   conv1 = Conv2D(filters=256, kernel_size=9, strides=1, padding='valid', activation='relu', name='conv1')(x)

   # Layer 2: Conv2D layer with `squash` activation, then reshape to [None, num_capsule, dim_capsule]
   primarycaps = PrimaryCap(conv1, dim_capsule=8, n_channels=32, kernel_size=9, strides=2, padding='valid')

   # Layer 3: Capsule layer. Routing algorithm works here.
   digitcaps = CapsuleLayer(num_capsule=n_class, dim_capsule=16, routings=routings,
   name='digitcaps')(primarycaps)

   # Layer 4: This is an auxiliary layer to replace each capsule with its length. Just to match the true label's shape.
   # If using tensorflow, this will not be necessary. :)
   out_caps = Length(name='capsnet')(digitcaps) # CAN WE EXCLUDE THIS IN KERAS TOO?

   # Decoder network.
   y = layers.Input(shape=(n_class,))
   masked_by_y = Mask()([digitcaps, y]) # The true label is used to mask the output of capsule layer. For training
   masked = Mask()(digitcaps) # Mask using the capsule with maximal length. For prediction

   # Shared Decoder model in training and prediction
   decoder = models.Sequential(name='decoder')
   decoder.add(layers.Dense(512, activation='relu', input_dim=16*n_class))
   decoder.add(layers.Dense(1024, activation='relu'))
   decoder.add(layers.Dense(np.prod(input_shape), activation='sigmoid'))
   decoder.add(layers.Reshape(target_shape=input_shape, name='out_recon'))

   # Models for training and evaluation (prediction)
   train_model = models.Model([x, y], [out_caps, decoder(masked_by_y)])
   eval_model = models.Model(x, [out_caps, decoder(masked)])

   # manipulate model
   noise = layers.Input(shape=(n_class, 16))
   noised_digitcaps = layers.Add()([digitcaps, noise])
   masked_noised_y = Mask()([noised_digitcaps, y])
   manipulate_model = models.Model([x, y, noise], decoder(masked_noised_y))

   return train_model, eval_model, manipulate_model



def margin_loss(y_true, y_pred):
    """
    Margin loss for Eq.(4). When y_true[i, :] contains not just one `1`, this loss should work too. Not test it.
    :param y_true: [None, n_classes]
    :param y_pred: [None, num_capsule]
    :return: a scalar loss value.
    """
    L = y_true * K.square(K.maximum(0., 0.9 - y_pred)) + \
        0.5 * (1 - y_true) * K.square(K.maximum(0., y_pred - 0.1))

    return K.mean(K.sum(L, 1))

model, eval_model, manipulate_model = CapsNet(input_shape=x_train.shape[1:],
 n_class=1,
 routings=2)
# compile the model
model.compile(optimizer=optimizers.Adam(lr=3e-3),
 loss=[margin_loss, 'mse'],
 metrics={'capsnet': 'accuracy'})

model.summary()

history = model.fit(
        [x_train, y_train],[y_train,x_train],
        batch_size=16,
        epochs=30,
        validation_data=([x_val, y_val], [y_val, x_val]),
        shuffle=True)

The result is that accuracy and loss barely change over many epochs:

Epoch 1/30
161/161 [==============================] - 12s 77ms/step - loss: 0.2700 - capsnet_loss: 0.1911 - decoder_loss: 0.0789 - capsnet_acc: 0.5901 - val_loss: 0.2153 - val_capsnet_loss: 0.1588 - val_decoder_loss: 0.0565 - val_capsnet_acc: 0.6078
Epoch 2/30
161/161 [==============================] - 9s 56ms/step - loss: 0.2046 - capsnet_loss: 0.1560 - decoder_loss: 0.0486 - capsnet_acc: 0.6149 - val_loss: 0.2015 - val_capsnet_loss: 0.1588 - val_decoder_loss: 0.0427 - val_capsnet_acc: 0.6078
Epoch 3/30
161/161 [==============================] - 9s 56ms/step - loss: 0.1960 - capsnet_loss: 0.1560 - decoder_loss: 0.0401 - capsnet_acc: 0.6149 - val_loss: 0.1982 - val_capsnet_loss: 0.1588 - val_decoder_loss: 0.0394 - val_capsnet_acc: 0.6078
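One way to read the log above, offered as a sanity check rather than a confirmed diagnosis: the plateau values are what the margin loss yields when the single output capsule always has length close to 1.0, i.e. every image is predicted positive. The sketch below recomputes the loss under that assumption. The class counts (99 positive and 62 negative training images, 31 and 20 for validation) are not given in the question; they are inferred from the logged accuracies, since 99/161 = 0.6149 and 31/51 = 0.6078. The function names are made up for this illustration.

```python
def margin_loss_single_capsule(y_true, v):
    """Margin loss from the question for one sample and one output capsule of length v."""
    positive_term = y_true * max(0.0, 0.9 - v) ** 2
    negative_term = 0.5 * (1 - y_true) * max(0.0, v - 0.1) ** 2
    return positive_term + negative_term


def mean_loss_constant_output(n_pos, n_neg, v):
    """Mean margin loss over a dataset when the capsule length is the constant v."""
    total = (n_pos * margin_loss_single_capsule(1, v)
             + n_neg * margin_loss_single_capsule(0, v))
    return total / (n_pos + n_neg)


# ASSUMPTION: class counts inferred from the logged accuracies, not stated in the question.
train_loss = mean_loss_constant_output(n_pos=99, n_neg=62, v=1.0)
val_loss = mean_loss_constant_output(n_pos=31, n_neg=20, v=1.0)

# Compare with the logged capsnet_loss (0.1560) and val_capsnet_loss (0.1588).
print(round(train_loss, 4), round(val_loss, 4))
```

If the numbers match, the capsule length is probably saturated near 1, where the squash nonlinearity passes very little gradient. With `n_class=1` there is also only one output capsule for routing to choose between; dynamic routing normalizes each primary capsule's coupling across the output capsules, so a single output capsule leaves nothing to route. Trying `n_class=2` with one-hot labels would be a cheap experiment.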

【Comments】:

Tags: python keras deep-learning classification conv-neural-network

【Solution 1】:

I suggest you add a batch normalization layer right after the first convolutional layer and see what effect it has.

【Comments】:
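A minimal sketch of this suggestion, assuming the same Keras API as the question. Only the convolutional stem is shown so that the snippet runs without the `capsulelayers` module. The helper name `conv_stem_with_batchnorm`, the layer name `conv1_bn`, and the 64x64x1 input shape are assumptions made for illustration.

```python
from keras import layers, models


def conv_stem_with_batchnorm(input_shape):
    """Hypothetical helper: conv1 from the question followed by batch normalization."""
    x = layers.Input(shape=input_shape)

    # Layer 1 exactly as in the question.
    conv1 = layers.Conv2D(filters=256, kernel_size=9, strides=1, padding='valid',
                          activation='relu', name='conv1')(x)

    # The suggested addition: normalize conv1 activations per channel.
    conv1_bn = layers.BatchNormalization(name='conv1_bn')(conv1)

    return models.Model(x, conv1_bn, name='conv_stem')


# ASSUMPTION: a 64x64 grayscale input; the question does not state the image size.
stem = conv_stem_with_batchnorm((64, 64, 1))
stem.summary()
```

In the question's `CapsNet` function, the equivalent change would be to pass the normalized tensor to `PrimaryCap` in place of `conv1`.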

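【Solution 2】:

There are two vector transformation procedures for obtaining capsules from convolutions: matrix-vector transformation and convolutional vector transformation. Since you have very little data, the convolutional vector transformation is the better choice here, as it works better in this situation.

【Comments】: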
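The answer does not define the two transformations, so the following rests on an assumption: that "matrix-vector transformation" means one transformation matrix per (primary capsule, output capsule) pair, as in the `CapsuleLayer` used in the question, and that "convolutional vector transformation" means sharing those matrices across spatial positions. Under that reading, a rough parameter count shows why the shared variant may suit a small dataset. The function names and the 128x128 image size are hypothetical.

```python
def num_primary_capsules(height, width, n_channels=32):
    """Primary capsule count for the question's conv1 + PrimaryCap settings."""
    # conv1: kernel 9, stride 1, 'valid' padding -> size - 8
    h1, w1 = height - 8, width - 8
    # PrimaryCap conv: kernel 9, stride 2, 'valid' padding
    h2, w2 = (h1 - 9) // 2 + 1, (w1 - 9) // 2 + 1
    return h2 * w2 * n_channels


def dense_transform_params(height, width, n_class, dim_in=8, dim_out=16):
    """One (dim_out x dim_in) matrix per (primary capsule, output capsule) pair."""
    return num_primary_capsules(height, width) * n_class * dim_in * dim_out


def shared_transform_params(n_class, n_channels=32, dim_in=8, dim_out=16):
    """ASSUMED reading of the convolutional variant: matrices shared across positions."""
    return n_channels * n_class * dim_in * dim_out


# ASSUMPTION: 128x128 images; the question does not state the image size.
print(dense_transform_params(128, 128, n_class=1))
print(shared_transform_params(n_class=1))
```

For a 28x28 input with 10 classes, the first function gives the 1152 primary capsules of the original MNIST CapsNet, which is a quick check that the size arithmetic is right.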