Model doesn't train and outputs the same class

【Question title】: Model doesn't train and outputs the same class
【Posted】: 2021-01-16 10:06:48
【Question】:

Let me give you some background.

Dataset:

https://www.kaggle.com/dionyshsmiaris/xrays contains X-rays in the following classes:

0/"normal" (1227 xrays)
1/"pneumonia_cause_viral_infection" (2238)
2/"pneumonia_cause_bacteria" (1207)

Loading the data:

train_dir = "/content/gdrive/MyDrive/Xraydataset/train_images/"
test_dir =  "/content/gdrive/MyDrive/Xraydataset/test_images/"
def get_data(folder):
  X = []
  y = []
  for image_filename in tqdm(os.listdir(folder)):#/content/gdrive/MyDrive/Xraydataset/train_images/   == folder
    img_file = tf.keras.preprocessing.image.load_img(folder + '/' + image_filename,color_mode="grayscale")
    if img_file is not None:
      img_file=img_file.resize((224,224),1) 
      # img_file = tf.keras.preprocessing.image.smart_resize(img_file, (224, 224)) #256 256 1?
      # img_file = scipy.misc.imresize(arr=img_file, size=(150, 150, 3))
      img_arr = np.asarray(img_file)
      X.append(img_arr)
  label=pd.read_csv('/content/gdrive/MyDrive/Xraydataset/labels_train.csv',usecols = ['class_id'])
  # y.append(label)
  X = np.asarray(X)
  y = np.asarray(label)

  return X,y
X_train, y_train = get_data(train_dir)
X_test, y_test= get_data(test_dir)
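
One thing worth checking in the loader above: os.listdir returns file names in arbitrary order, while the labels are read in CSV row order, so X[i] and y[i] are not guaranteed to refer to the same image. The sketch below pairs each image with its label by iterating over the CSV rows instead. The column name file_name is an assumption (the post only shows class_id), and pair_files_with_labels is a hypothetical helper, not part of the original code.

```python
import csv
import os


def pair_files_with_labels(folder, csv_path,
                           name_col="file_name",   # assumed column name
                           label_col="class_id"):
    """Return (paths, labels) in CSV row order, so paths[i] matches labels[i]."""
    paths, labels = [], []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            path = os.path.join(folder, row[name_col])
            if os.path.isfile(path):  # skip rows whose image file is missing
                paths.append(path)
                labels.append(int(row[label_col]))
    return paths, labels
```

The returned paths can then be passed to load_img one at a time, in that order. Note also that get_data(test_dir) in the original reads labels_train.csv, so y_test there holds training labels rather than test labels.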

Preprocessing:

X_train=np.expand_dims(X_train, axis=3)
X_test=np.expand_dims(X_test, axis=3)
X_train = X_train.astype('float32') / 255
X_test = X_test.astype('float32') / 255

X_train_mean = np.mean(X_train, axis=0)
X_train -= X_train_mean
X_test -= X_train_mean

print('x_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

xtrain=X_train[:3700]
ytrain=y_train[:3700]
#split data to train and test
xtest=X_train[3700:]
ytest=y_train[3700:]


t_train = keras.utils.to_categorical(ytrain, 3)
t_test = keras.utils.to_categorical(ytest, 3)
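
The split above takes the first 3700 rows for training and the remainder for validation. If the rows happen to be grouped by class (an assumption; the post does not say), the two parts would have different class mixes. A shuffled, per-class split avoids that. stratified_split below is a hypothetical helper written with the standard library only, as a sketch rather than a drop-in replacement.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx), splitting every class in the same proportion."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[int(label)].append(i)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                          # shuffle within the class
        n_val = int(round(len(indices) * val_fraction))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])

    rng.shuffle(train_idx)                            # mix the classes together
    rng.shuffle(val_idx)
    return train_idx, val_idx
```

With NumPy arrays, calling stratified_split(y_train.ravel()) and then indexing X_train[train_idx] and y_train[train_idx] selects matching rows for the training part, and likewise with val_idx for validation.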

Data augmentation:

datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range = 30,  # randomly rotate images in the range (degrees, 0 to 180)
        zoom_range = 0.2, # Randomly zoom image 
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip = True,  # randomly flip images
        vertical_flip=False)  # randomly flip images
datagen.fit(xtrain)

Model definition and compile:

model = Sequential()
model.add(Conv2D(32 , (3,3) , strides = 1 , padding = 'same' , activation = 'relu' , input_shape = (224,224,1)))
model.add(BatchNormalization())
model.add(MaxPool2D((2,2) , strides = 2 , padding = 'same'))
model.add(Conv2D(64 , (3,3) , strides = 1 , padding = 'same' , activation = 'relu'))
model.add(Dropout(0.1))
model.add(BatchNormalization())
model.add(MaxPool2D((2,2) , strides = 2 , padding = 'same'))
model.add(Conv2D(64 , (3,3) , strides = 1 , padding = 'same' , activation = 'relu'))
model.add(BatchNormalization())
model.add(MaxPool2D((2,2) , strides = 2 , padding = 'same'))
model.add(Conv2D(128 , (3,3) , strides = 1 , padding = 'same' , activation = 'relu'))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(MaxPool2D((2,2) , strides = 2 , padding = 'same'))
model.add(Conv2D(256 , (3,3) , strides = 1 , padding = 'same' , activation = 'relu'))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(MaxPool2D((2,2) , strides = 2 , padding = 'same'))
model.add(Flatten())
model.add(Dense(units = 128 , activation = 'relu'))
model.add(Dropout(0.2))
model.add(Dense(units = 3 , activation = 'softmax'))
model.compile(optimizer = "Adam" , loss = 'categorical_crossentropy' , metrics = ['accuracy'])

Learning rate schedule and training:

learning_rate_reduction = ReduceLROnPlateau(monitor='val_accuracy',
                                            patience = 2, 
                                            verbose=1,
                                            factor=0.3,
                                            min_lr=0.000001)

history = model.fit(datagen.flow(xtrain,t_train, batch_size = 32) ,shuffle=True,
                    epochs = 20 , validation_data =datagen.flow(xtest,t_test), 
                    callbacks = [learning_rate_reduction])

Output:

[[0.276 0.455 0.27 ]
 [0.291 0.423 0.286]
 [0.286 0.434 0.28 ]
 ...
 [0.288 0.43  0.282]
 [0.292 0.429 0.28 ]
 [0.289 0.427 0.284]]

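Typical training accuracy is about 46% and validation accuracy about 49%, but that is only because the model always predicts the most frequent class.

The loss also stays almost constant, so I don't think it is training at all. Any hints/help?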
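
As a quick check of that reading, the class counts listed in the dataset description give the priors computed below. This is only arithmetic on the numbers in the question: a model that ignores its input and outputs the class frequencies would produce rows close to these priors and score about the majority-class share.

```python
# Class counts as listed in the dataset description.
counts = {0: 1227, 1: 2238, 2: 1207}
total = sum(counts.values())

# Prior probability of each class: what an input-independent model tends to output.
priors = {c: n / total for c, n in counts.items()}

# Accuracy obtained by always predicting the most frequent class.
majority_class = max(counts, key=counts.get)
majority_accuracy = counts[majority_class] / total

for c, p in priors.items():
    print(f"class {c}: prior = {p:.3f}")
print(f"always predicting class {majority_class}: accuracy = {majority_accuracy:.3f}")
```

The priors come out near 0.26 / 0.48 / 0.26, close to the printed rows (for example [0.276 0.455 0.27]), and the majority-class accuracy of about 48% sits between the reported 46% and 49%.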

Edit (with class_weights = {0: .9837, 1: .5394, 2: 1.0}):

Epoch 1/20
116/116 [==============================] - 19s 154ms/step - loss: 0.8551 - accuracy: 0.3216 - val_loss: 1.1174 - val_accuracy: 0.3302
Epoch 2/20
116/116 [==============================] - 18s 155ms/step - loss: 0.8587 - accuracy: 0.3265 - val_loss: 1.1073 - val_accuracy: 0.3261
Epoch 3/20
116/116 [==============================] - 18s 158ms/step - loss: 0.8554 - accuracy: 0.3330 - val_loss: 1.1114 - val_accuracy: 0.3241
.
.
.
Epoch 18/20
116/116 [==============================] - 18s 155ms/step - loss: 0.8585 - accuracy: 0.3208 - val_loss: 1.1097 - val_accuracy: 0.3251
Epoch 19/20
116/116 [==============================] - 18s 154ms/step - loss: 0.8556 - accuracy: 0.3378 - val_loss: 1.1084 - val_accuracy: 0.3395
Epoch 20/20
116/116 [==============================] - 18s 156ms/step - loss: 0.8562 - accuracy: 0.3422 - val_loss: 1.1127 - val_accuracy: 0.3364

Output:

[[0.333 0.345 0.322]
 [0.361 0.283 0.356]
 [0.319 0.343 0.338]
 ...
 [0.335 0.329 0.335]
 [0.439 0.179 0.383]
 [0.338 0.311 0.351]]

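From what I can tell now, my model is able to predict all three classes, but it isn't learning: the loss stays flat and accuracy is around 30-33%, which amounts to random guessing. Any hints?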
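
The flat loss values in the log can be compared against what an uninformed model would score. The sketch below assumes that Keras averages the per-sample loss multiplied by its class weight over the batch, and that the training subset has roughly the class proportions of the full dataset. Both are assumptions, not something the post states.

```python
import math

counts = {0: 1227, 1: 2238, 2: 1207}            # class counts from the post
class_weight = {0: 0.9837, 1: 0.5394, 2: 1.0}   # weights from the edit

# Cross-entropy of a uniform guess over 3 classes: -ln(1/3) = ln(3).
uniform_loss = math.log(3)

# Average class weight per sample, given the class proportions.
total = sum(counts.values())
mean_weight = sum(counts[c] * class_weight[c] for c in counts) / total

# Expected weighted training loss if every prediction is uniform.
expected_train_loss = mean_weight * uniform_loss

print(f"uniform-guess loss (unweighted): {uniform_loss:.4f}")
print(f"mean class weight:               {mean_weight:.4f}")
print(f"expected weighted training loss: {expected_train_loss:.4f}")
```

Under those assumptions the unweighted figure is about 1.10 and the weighted one about 0.85, close to the reported val_loss (about 1.11) and loss (about 0.86). That would be consistent with a network whose output does not depend on its input.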

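【Comments】:

Try a higher learning rate, a larger network, transfer learning, more training epochs, or a larger batch size.
Well, those are all good suggestions. I have already tried training for more epochs, with a larger batch size, and with a larger network, and none of it changed the results. I guess I'll try a higher learning rate. I don't want to use transfer learning because I want to build the model from scratch. Thanks for your time, @NiteyaShah.
@NiteyaShah A higher learning rate didn't help either.
kdnuggets.com/2020/06/… Can you try this architecture? If it works, we know your architecture isn't powerful enough; if it doesn't, there is a problem in your data pipeline.
@NiteyaShah I'll give it a try, but that architecture is for 2 classes. Wouldn't that be a problem for my dataset?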

Tags: python tensorflow keras deep-learning conv-neural-network

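【Answer 1】:

I copied your code, ran it, and got the same results as you. To see whether the model was the problem, I modified the code to use transfer learning with ResNet and got essentially the same results. I went to Kaggle to see whether any notebook did better; there were none. I think the problem is with the dataset. There is another dataset you can use on Kaggle at https://www.kaggle.com/bachrr/covid-chest-xray, and also this one: https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia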

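【Comments】:

Well, the dataset comes from a competition in which people reached 84%, so I assume there is a way to make it work. Thanks for your help, though; I'll try balancing the classes next.
That was going to be my next suggestion. The class counts are 0 = 1227, 1 = 2238, and 2 = 1207, so the class weights would be {0: .9837, 1: .5394, 2: 1.0}.
Where did you see a notebook that reached 84%? I don't see any notebooks listed other than the starter notebook.
By the way, you don't need to call datagen.fit(xtrain). It is only needed when featurewise_center, samplewise_center, featurewise_std_normalization, samplewise_std_normalization, or zca_whitening is set to True.
It's a private competition at my college.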
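
For reference, the class weights quoted in the comments appear to follow the rule "smallest class count divided by this class's count". That rule is inferred from the numbers, not stated by the commenter. A minimal sketch:

```python
counts = {0: 1227, 1: 2238, 2: 1207}  # class counts quoted in the comments

# Weight each class by (smallest class count) / (its own count), so the
# rarest class gets weight 1.0 and more frequent classes are down-weighted.
smallest = min(counts.values())
class_weight = {c: smallest / n for c, n in counts.items()}

for c, w in class_weight.items():
    print(f"class {c}: weight = {w:.4f}")
```

The result agrees with the quoted {0: .9837, 1: .5394, 2: 1.0} to about three decimal places. In Keras, such a dictionary is passed to model.fit through the class_weight argument (singular).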