【Title】: Capsule networks for binary classification not training
【Posted】: 2020-03-18 15:38:56
【Question】:

I'm currently trying to implement a capsule network using Xifeng Guo's Keras code for capsule nets. I have a dataset of brain tumor images with 98 negatively labeled and 155 positively labeled instances, and I want to use the capsnet to predict whether an image is positive or negative for a brain tumor. Unfortunately, I can't figure out why accuracy and loss never move beyond a fixed level. I also tried data augmentation to enlarge the dataset, which only resulted in 50/50 predictions (a generic sketch of such augmentation is shown below).
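Augmentation of this kind can be set up with Keras's ImageDataGenerator; the sketch below is generic and its parameter values are purely illustrative, not my exact settings:

from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rotation_range=15,       # small random rotations
                             width_shift_range=0.1,   # horizontal shifts
                             height_shift_range=0.1,  # vertical shifts
                             horizontal_flip=True)    # mirror images

# Yields augmented (x, y) batches. Note that the capsnet below is trained
# with [x, y] as inputs and [y, x] as targets, so a plain flow() like this
# has to be wrapped accordingly before passing it to fit().
aug_batches = datagen.flow(x_train, y_train, batch_size=16)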

I've read the paper "Capsule Networks against Medical Imaging Data Challenges", in which the authors apply capsule networks to, among others, the DIARETDB1 dataset, which contains only 89 images, and still reach an F1 score of 0.887 (imbalanced scenario 1) even without data augmentation. This leads me to believe that something must be going wrong in my network. FYI: my images are normalized and cropped.

Any input is appreciated!

%pylab inline
import os
import numpy as np

import tensorflow as tf
import keras
import keras.backend as K

from capsulelayers import CapsuleLayer, PrimaryCap, Length, Mask
from keras import layers, models, optimizers
from keras.applications import vgg16
from keras.layers import Conv2D, MaxPooling2D

K.set_image_data_format('channels_last')


def CapsNet(input_shape, n_class, routings):
    x = layers.Input(shape=input_shape)

    # Layer 1: Just a conventional Conv2D layer
    conv1 = Conv2D(filters=256, kernel_size=9, strides=1, padding='valid',
                   activation='relu', name='conv1')(x)

    # Layer 2: Conv2D layer with `squash` activation, then reshape to
    # [None, num_capsule, dim_capsule]
    primarycaps = PrimaryCap(conv1, dim_capsule=8, n_channels=32,
                             kernel_size=9, strides=2, padding='valid')

    # Layer 3: Capsule layer. Routing algorithm works here.
    digitcaps = CapsuleLayer(num_capsule=n_class, dim_capsule=16,
                             routings=routings, name='digitcaps')(primarycaps)

    # Layer 4: This is an auxiliary layer to replace each capsule with its
    # length. Just to match the true label's shape.
    # If using tensorflow, this will not be necessary. :)
    out_caps = Length(name='capsnet')(digitcaps)  # CAN WE EXCLUDE THIS IN KERAS TOO?

    # Decoder network.
    y = layers.Input(shape=(n_class,))
    masked_by_y = Mask()([digitcaps, y])  # The true label is used to mask the output of capsule layer. For training
    masked = Mask()(digitcaps)  # Mask using the capsule with maximal length. For prediction

    # Shared Decoder model in training and prediction
    decoder = models.Sequential(name='decoder')
    decoder.add(layers.Dense(512, activation='relu', input_dim=16 * n_class))
    decoder.add(layers.Dense(1024, activation='relu'))
    decoder.add(layers.Dense(np.prod(input_shape), activation='sigmoid'))
    decoder.add(layers.Reshape(target_shape=input_shape, name='out_recon'))

    # Models for training and evaluation (prediction)
    train_model = models.Model([x, y], [out_caps, decoder(masked_by_y)])
    eval_model = models.Model(x, [out_caps, decoder(masked)])

    # manipulate model
    noise = layers.Input(shape=(n_class, 16))
    noised_digitcaps = layers.Add()([digitcaps, noise])
    masked_noised_y = Mask()([noised_digitcaps, y])
    manipulate_model = models.Model([x, y, noise], decoder(masked_noised_y))

    return train_model, eval_model, manipulate_model



def margin_loss(y_true, y_pred):
    """
    Margin loss from Eq. (4) of the CapsNet paper. This should also work when
    y_true[i, :] contains more than one `1`, but that case has not been tested.
    :param y_true: [None, n_classes]
    :param y_pred: [None, num_capsule]
    :return: a scalar loss value.
    """
    # m+ = 0.9, m- = 0.1, lambda = 0.5, as in Sabour et al. (2017)
    L = y_true * K.square(K.maximum(0., 0.9 - y_pred)) + \
        0.5 * (1 - y_true) * K.square(K.maximum(0., y_pred - 0.1))

    return K.mean(K.sum(L, 1))

model, eval_model, manipulate_model = CapsNet(input_shape=x_train.shape[1:],
                                              n_class=1,
                                              routings=2)

# compile the model
model.compile(optimizer=optimizers.Adam(lr=3e-3),
              loss=[margin_loss, 'mse'],
              metrics={'capsnet': 'accuracy'})

model.summary()

history = model.fit([x_train, y_train], [y_train, x_train],
                    batch_size=16,
                    epochs=30,
                    validation_data=([x_val, y_val], [y_val, x_val]),
                    shuffle=True)

The result is that accuracy and loss barely change over many epochs:

Epoch 1/30
161/161 [==============================] - 12s 77ms/step - loss: 0.2700 - capsnet_loss: 0.1911 - decoder_loss: 0.0789 - capsnet_acc: 0.5901 - val_loss: 0.2153 - val_capsnet_loss: 0.1588 - val_decoder_loss: 0.0565 - val_capsnet_acc: 0.6078
Epoch 2/30
161/161 [==============================] - 9s 56ms/step - loss: 0.2046 - capsnet_loss: 0.1560 - decoder_loss: 0.0486 - capsnet_acc: 0.6149 - val_loss: 0.2015 - val_capsnet_loss: 0.1588 - val_decoder_loss: 0.0427 - val_capsnet_acc: 0.6078
Epoch 3/30
161/161 [==============================] - 9s 56ms/step - loss: 0.1960 - capsnet_loss: 0.1560 - decoder_loss: 0.0401 - capsnet_acc: 0.6149 - val_loss: 0.1982 - val_capsnet_loss: 0.1588 - val_decoder_loss: 0.0394 - val_capsnet_acc: 0.6078

【Comments】:

Tags: python keras deep-learning classification conv-neural-network


【Solution 1】:

I would suggest introducing a batch normalization layer right after the first convolutional layer and seeing what effect it has; a sketch is shown below.
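A minimal sketch of that suggestion (my own illustration against the architecture in the question; the `conv1_bn`/`conv1_relu` names are made up). The ReLU is split out of `conv1` so that normalization sits between the convolution and the non-linearity:

from keras.layers import Activation, BatchNormalization

# Replace the original activation='relu' conv with conv -> BN -> ReLU.
conv1 = Conv2D(filters=256, kernel_size=9, strides=1, padding='valid',
               name='conv1')(x)
conv1 = BatchNormalization(name='conv1_bn')(conv1)    # normalize per channel
conv1 = Activation('relu', name='conv1_relu')(conv1)  # non-linearity after BN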

【Comments】:

【Solution 2】:

There are two vector-transformation schemes for obtaining capsules from convolutional features: the matrix-vector transformation and the convolution-vector transformation. Since you have very little data, the convolution-vector transformation is the better choice and tends to work better in such cases; a rough sketch follows.
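A rough sketch of the idea (my own illustration, not code from Guo's repository, and it omits routing entirely): a single shared Conv2D produces the class-capsule poses directly from the feature map, instead of a dedicated weight matrix per (input capsule, output capsule) pair.

import keras.backend as K
from keras import layers

def squash(vectors, axis=-1):
    # Squash non-linearity from the CapsNet paper: short vectors shrink
    # toward zero, long vectors approach unit length.
    s_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_norm / (1 + s_norm) / K.sqrt(s_norm + K.epsilon())
    return scale * vectors

def conv_caps(feature_map, n_class, dim_capsule):
    # feature_map: 4-D tensor [None, H, W, C], e.g. the output of conv1.
    # One shared convolution emits n_class * dim_capsule pose values.
    votes = layers.Conv2D(filters=n_class * dim_capsule, kernel_size=3,
                          padding='valid')(feature_map)
    votes = layers.GlobalAveragePooling2D()(votes)          # pool spatial votes
    votes = layers.Reshape((n_class, dim_capsule))(votes)   # [None, n_class, dim]
    return layers.Lambda(squash)(votes)                     # capsule vectors

With dim_capsule=16 its output has the same [None, n_class, 16] shape as digitcaps, so the Length layer and the decoder from the question can be attached unchanged.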

【Comments】:
