【Posted】: 2018-08-29 22:09:36
【Problem description】:
Read this text only after you have read everything else: "Check the second model at the end. I tried to classify cars and planes from the CIFAR dataset: that model is reproducible."
I am trying to build a model that classifies cats and dogs, which should not be a hard problem. Here is what I am doing:
I created a folder with two labeled subfolders: cats and dogs. In each subfolder I have 1000 images of cats/dogs.
I built a numpy array iteratively, putting the images into it after converting each one to an array (I chose a size of (200, 200, 3) for every image), and I normalized the array by dividing it by 255. So I ended up with a normalized (2000, 200, 200, 3) array.
Then I created an array for the labels. Since I have two categories, each row of the array holds 2 binary digits: (1, 0) for a cat, (0, 1) for a dog. So I ended up with a (2000, 2) array of labels.
Next, I created X_train, Y_train and X_valid, Y_valid (70% for training, 30% for validation).
Then I created a neural network with this architecture:
Dense(200x200x3, 1000, relu) >> Dense(1000, 500, sigmoid) >> Dense(500, 100, sigmoid) >> Dense(100, 2, softmax); backprop: loss=categorical_crossentropy, optimizer=adam.
So far everything looked fine and the model trained. But when I try to predict values, the model always returns the same output regardless of the input (even when I predict elements of the training set, I always get the same constant output = array([[0.5188029, 0.48119715]])).
I really need help; I have no idea why this is happening. To guide you, I will write down all the code corresponding to what I did:
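As an aside on the symptom (this is not the author's code): one common way to get a near-constant output from an architecture with sigmoid hidden layers is saturation when inputs are not scaled into [0, 1]; saturated units have a vanishing gradient, so training barely moves the weights. A minimal numpy sketch with hypothetical weight shapes:

```python
import numpy as np

def sigmoid(z):
    # Suppress harmless overflow warnings for very large |z|.
    with np.errstate(over='ignore'):
        return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=(3072, 4))                   # hypothetical dense-layer weights
x_raw = rng.integers(0, 256, size=(1, 3072)).astype(float)  # unscaled pixels, 0..255
x_scaled = x_raw / 255.0                                    # scaled to [0, 1]

z_raw, z_scaled = x_raw @ w, x_scaled @ w   # z_raw is exactly 255 * z_scaled
grad_raw = sigmoid(z_raw) * (1 - sigmoid(z_raw))            # sigmoid'(z), ~0 once |z| is large
grad_scaled = sigmoid(z_scaled) * (1 - sigmoid(z_scaled))
# |z_raw| >= |z_scaled| elementwise and sigmoid' decreases in |z|,
# so the raw-input gradient can never be larger than the scaled one:
assert (grad_raw <= grad_scaled + 1e-12).all()
```

This matters below because the cat/dog pipeline does divide by 255, while the CIFAR example at the end does not.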
Imported libraries / function: preprocessing an image
def preprocess_image(img_path, model_image_size):
    image_type = imghdr.what(img_path)
    image = Image.open(img_path)
    resized_image = image.resize(tuple(reversed(model_image_size)), Image.BICUBIC)
    image_data = np.array(resized_image, dtype='float32')
    image_data /= 255.
    image_data = np.expand_dims(image_data, 0)  # Add batch dimension.
    return image, image_data
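For reference, the shape bookkeeping inside preprocess_image can be checked with plain numpy, using a random array as a stand-in for a decoded (200, 200, 3) RGB image (no image file needed):

```python
import numpy as np

# Stand-in for a decoded RGB image at the size chosen above.
image_data = np.random.default_rng(1).integers(0, 256, size=(200, 200, 3)).astype('float32')
image_data /= 255.                          # scale pixel values into [0, 1]
image_data = np.expand_dims(image_data, 0)  # add batch dimension -> (1, 200, 200, 3)
```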
###################################### import libraries ##########################################################
import argparse
import os
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
import scipy.io
import scipy.misc
import numpy as np
import pandas as pd
import PIL
import keras
import tensorflow as tf
from keras import backend as K
from keras.preprocessing import image
from keras.layers import Input, Lambda, Conv2D, Dense
from keras.models import load_model, Model, Sequential
# NOTE: this import shadows the preprocess_image defined above
from yolo_utils import read_classes, read_anchors, generate_colors, preprocess_image, draw_boxes, scale_boxes
from yad2k.models.keras_yolo import yolo_head, yolo_boxes_to_corners, preprocess_true_boxes, yolo_loss, yolo_body
from scipy.misc import imread
%matplotlib inline
Importing the images
train_img = []
for i in range(1000, 2000):
    (img, train_img_data) = preprocess_image('kaggle/PetImages/Cat/'+str(i)+'.jpg', model_image_size=(200, 200))
    train_img.append(train_img_data)
for i in range(1000):
    (img, train_img_data) = preprocess_image('kaggle/PetImages/Dog/'+str(i)+'.jpg', model_image_size=(200, 200))
    train_img.append(train_img_data)
train_img = np.array(train_img).reshape(2000, 200, 200, 3)
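A note on the reshape above: each preprocess_image call returns a batch-shaped (1, 200, 200, 3) array, so np.array over the list gives (2000, 1, 200, 200, 3), and the reshape drops the extra axis. A toy-sized sketch of the same bookkeeping:

```python
import numpy as np

# Toy dimensions standing in for (1, 200, 200, 3).
imgs = [np.zeros((1, 4, 4, 3)) for _ in range(5)]
stacked = np.array(imgs)            # shape (5, 1, 4, 4, 3)
flat = stacked.reshape(5, 4, 4, 3)  # drops the singleton batch axis
```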
Creating the training and validation sets
x_train = train_img.reshape(train_img.shape[0], -1)
y_train = np.zeros((2000, 2))
for i in range(1000):
    y_train[i, 0] = 1
for i in range(1000, 2000):
    y_train[i, 1] = 1
from sklearn.model_selection import train_test_split
X_train, X_valid, Y_train, Y_valid = train_test_split(x_train, y_train, test_size=0.3, random_state=42)
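For intuition, the split above is roughly equivalent to this numpy sketch (sklearn shuffles differently internally, so the exact rows differ):

```python
import numpy as np

n = 2000                    # total samples, as above
rng = np.random.default_rng(42)
idx = rng.permutation(n)    # shuffle indices
cut = int(n * 0.7)          # test_size=0.3 -> 70% train, 30% validation
train_idx, valid_idx = idx[:cut], idx[cut:]
# X_train ~ x_train[train_idx], X_valid ~ x_train[valid_idx], same for labels.
```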
Creating the model architecture (using Keras)
from keras.layers import Dense, Activation
model = Sequential()
model.add(Dense(1000, input_dim=200*200*3, activation='relu', kernel_initializer='uniform'))
keras.layers.core.Dropout(0.3, noise_shape=None, seed=None)  # NOTE: constructs a Dropout layer but never adds it to the model (no-op)
model.add(Dense(500, input_dim=1000, activation='sigmoid'))
keras.layers.core.Dropout(0.4, noise_shape=None, seed=None)  # no-op, see above
model.add(Dense(150, input_dim=500, activation='sigmoid'))
keras.layers.core.Dropout(0.2, noise_shape=None, seed=None)  # no-op, see above
model.add(Dense(units=2))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer="adam", metrics=['accuracy'])
Fitting the model
model.fit(X_train, Y_train, epochs=20, batch_size=50,validation_data=(X_valid,Y_valid))
Output of fitting the model
Train on 1400 samples, validate on 600 samples
Epoch 1/20
1400/1400 [==============================] - 73s 52ms/step - loss: 0.8065 - acc: 0.4814 - val_loss: 0.6939 - val_acc: 0.5033
Epoch 2/20
1400/1400 [==============================] - 72s 51ms/step - loss: 0.7166 - acc: 0.5043 - val_loss: 0.7023 - val_acc: 0.4967
Epoch 3/20
1400/1400 [==============================] - 73s 52ms/step - loss: 0.6969 - acc: 0.5214 - val_loss: 0.6966 - val_acc: 0.4967
Epoch 4/20
1400/1400 [==============================] - 71s 51ms/step - loss: 0.6986 - acc: 0.4857 - val_loss: 0.6932 - val_acc: 0.4967
Epoch 5/20
1400/1400 [==============================] - 74s 53ms/step - loss: 0.7018 - acc: 0.4686 - val_loss: 0.7080 - val_acc: 0.4967
Epoch 6/20
1400/1400 [==============================] - 76s 54ms/step - loss: 0.7041 - acc: 0.4843 - val_loss: 0.6931 - val_acc: 0.5033
Epoch 7/20
1400/1400 [==============================] - 73s 52ms/step - loss: 0.7002 - acc: 0.4771 - val_loss: 0.6973 - val_acc: 0.4967
Epoch 8/20
1400/1400 [==============================] - 70s 50ms/step - loss: 0.7039 - acc: 0.5014 - val_loss: 0.6931 - val_acc: 0.5033
Epoch 9/20
1400/1400 [==============================] - 72s 51ms/step - loss: 0.6983 - acc: 0.4971 - val_loss: 0.7109 - val_acc: 0.5033
Epoch 10/20
1400/1400 [==============================] - 72s 51ms/step - loss: 0.7063 - acc: 0.4986 - val_loss: 0.7151 - val_acc: 0.4967
Epoch 11/20
1400/1400 [==============================] - 78s 55ms/step - loss: 0.6984 - acc: 0.5043 - val_loss: 0.7026 - val_acc: 0.5033
Epoch 12/20
1400/1400 [==============================] - 78s 55ms/step - loss: 0.6993 - acc: 0.4929 - val_loss: 0.6958 - val_acc: 0.4967
Epoch 13/20
1400/1400 [==============================] - 90s 65ms/step - loss: 0.7000 - acc: 0.4843 - val_loss: 0.6970 - val_acc: 0.4967
Epoch 14/20
1400/1400 [==============================] - 78s 56ms/step - loss: 0.7052 - acc: 0.4829 - val_loss: 0.7029 - val_acc: 0.4967
Epoch 15/20
1400/1400 [==============================] - 80s 57ms/step - loss: 0.7003 - acc: 0.5014 - val_loss: 0.6993 - val_acc: 0.5033
Epoch 16/20
1400/1400 [==============================] - 77s 55ms/step - loss: 0.6933 - acc: 0.5200 - val_loss: 0.6985 - val_acc: 0.5033
Epoch 17/20
1400/1400 [==============================] - 78s 56ms/step - loss: 0.6962 - acc: 0.4871 - val_loss: 0.7086 - val_acc: 0.4967
Epoch 18/20
1400/1400 [==============================] - 81s 58ms/step - loss: 0.6987 - acc: 0.4971 - val_loss: 0.7119 - val_acc: 0.4967
Epoch 19/20
1400/1400 [==============================] - 77s 55ms/step - loss: 0.7010 - acc: 0.5171 - val_loss: 0.6969 - val_acc: 0.4967
Epoch 20/20
1400/1400 [==============================] - 74s 53ms/step - loss: 0.6984 - acc: 0.5057 - val_loss: 0.6936 - val_acc: 0.5033
<keras.callbacks.History at 0x23903fc7c88>
Predictions on elements of the training set:
print(model.predict(X_train[240].reshape(1,120000)))
print(model.predict(X_train[350].reshape(1,120000)))
print(model.predict(X_train[555].reshape(1,120000)))
print(model.predict(X_train[666].reshape(1,120000)))
print(model.predict(X_train[777].reshape(1,120000)))
Output of these operations
[[0.5188029 0.48119715]]
[[0.5188029 0.48119715]]
[[0.5188029 0.48119715]]
[[0.5188029 0.48119715]]
[[0.5188029 0.48119715]]
Predictions on elements of the validation set
print(model.predict(X_valid[10].reshape(1,120000)))
print(model.predict(X_valid[20].reshape(1,120000)))
print(model.predict(X_valid[30].reshape(1,120000)))
print(model.predict(X_valid[40].reshape(1,120000)))
print(model.predict(X_valid[50].reshape(1,120000)))
Output of these operations
[[0.5188029 0.48119715]]
[[0.5188029 0.48119715]]
[[0.5188029 0.48119715]]
[[0.5188029 0.48119715]]
[[0.5188029 0.48119715]]
I am really confused, because I don't know why I get this result. I also tried another classification task (male/female) and got a similar result; in other words, whatever the input, I got a fixed output (basically telling me every observation is female)...
Here is the part I mentioned at the beginning of the post:
Classifying cars and planes (reproducible)
#importing keras cifar
from keras.datasets import cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
#Building labeled arrays of cars and planes: 5000 first are plane, 5000 last are car
#Building x_train_our
a=(y_train==0)
x_plane=x_train[list(a[:,0])]
a=(y_train==1)
x_car=x_train[list(a[:,0])]
x_train_our=np.append(x_plane,x_car,axis=0)
#Building y_train_our
y_train_our = np.zeros((10000, 2))
for i in range(5000):
    y_train_our[i, 0] = 1
for i in range(5000, 10000):
    y_train_our[i, 1] = 1
print('x_train_our shape: ',x_train_our.shape)
print('y_train_our shape: ',y_train_our.shape)
#Train set and valid set
x_train_our= x_train_our.reshape(x_train_our.shape[0],-1)
y_train_our=y_train_our
print('x_train_our shape: ',x_train_our.shape)
print('y_train_our shape: ',y_train_our.shape)
from sklearn.model_selection import train_test_split
X_train_our, X_valid_our, Y_train_our, Y_valid_our=train_test_split(x_train_our,y_train_our,test_size=0.3, random_state=42)
#testing typology of different elements
print("-------------testing size of different elements et toplogie: ")
print("-------------x_train_our size: ",x_train_our.shape)
print("-------------y_train_our size: ",y_train_our.shape)
print("-------------X_train_our size: ",X_train_our.shape)
print("-------------X_valid_our size: ",X_valid_our.shape)
print("-------------Y_train_our size: ",Y_train_our.shape)
print("-------------Y_valid_our size: ",Y_valid_our.shape)
#Model 1: creating an MLP which is going to be the output for the YOLO model
from keras.layers import Dense, Activation
model = Sequential()
model.add(Dense(1000, input_dim=32*32*3, activation='relu', kernel_initializer='uniform'))
keras.layers.core.Dropout(0.3, noise_shape=None, seed=None)  # NOTE: constructs a Dropout layer but never adds it to the model (no-op)
model.add(Dense(500, input_dim=1000, activation='sigmoid'))
keras.layers.core.Dropout(0.4, noise_shape=None, seed=None)  # no-op, see above
model.add(Dense(150, input_dim=500, activation='sigmoid'))
keras.layers.core.Dropout(0.2, noise_shape=None, seed=None)  # no-op, see above
model.add(Dense(units=2))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer="adam", metrics=['accuracy'])
# fitting the model
model.fit(X_train_our, Y_train_our, epochs=20, batch_size=10,validation_data=(X_valid_our,Y_valid_our))
#Build test set
a=(y_test==0)
x_test_plane=x_test[list(a[:,0])]
a=(y_test==1)
x_test_car=x_test[list(a[:,0])]
# Test model
for i in range(1000):
    print('it should be a plane: ', model.predict(x_plane[i].reshape(1, -1)))
for i in range(1000):
    print('it should be a car: ', model.predict(x_car[i].reshape(1, -1)))
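Two of the data-building steps above can be written more compactly; here is a hedged sketch with synthetic data standing in for CIFAR-10 (whose labels have shape (N, 1)):

```python
import numpy as np

# Synthetic stand-in: 6 samples, labels with CIFAR-10's (N, 1) shape.
x = np.arange(12).reshape(6, 2).astype(float)
y = np.array([[0], [1], [2], [0], [1], [0]])

# Boolean-mask selection, same result as x[list((y == 0)[:, 0])] above:
x_plane = x[y[:, 0] == 0]
x_car = x[y[:, 0] == 1]

# One-hot labels via an identity matrix, same result as the two fill loops:
class_ids = np.array([0] * len(x_plane) + [1] * len(x_car))
y_our = np.eye(2)[class_ids]
```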
Output
x_train_our shape: (10000, 32, 32, 3)
y_train_our shape: (10000, 2)
x_train_our shape: (10000, 3072)
y_train_our shape: (10000, 2)
-------------testing size of different elements et toplogie:
-------------x_train_our size: (10000, 3072)
-------------y_train_our size: (10000, 2)
-------------X_train_our size: (7000, 3072)
-------------X_valid_our size: (3000, 3072)
-------------Y_train_our size: (7000, 2)
-------------Y_valid_our size: (3000, 2)
Train on 7000 samples, validate on 3000 samples
Epoch 1/20
7000/7000 [==============================] - 52s 7ms/step - loss: 0.7114 - acc: 0.4907 - val_loss: 0.7237 - val_acc: 0.4877
Epoch 2/20
7000/7000 [==============================] - 51s 7ms/step - loss: 0.7004 - acc: 0.4967 - val_loss: 0.7065 - val_acc: 0.4877
Epoch 3/20
7000/7000 [==============================] - 51s 7ms/step - loss: 0.6979 - acc: 0.4981 - val_loss: 0.6977 - val_acc: 0.4877
Epoch 4/20
7000/7000 [==============================] - 52s 7ms/step - loss: 0.6990 - acc: 0.4959 - val_loss: 0.6970 - val_acc: 0.4877
Epoch 5/20
7000/7000 [==============================] - 53s 8ms/step - loss: 0.6985 - acc: 0.5030 - val_loss: 0.6929 - val_acc: 0.5123
Epoch 6/20
7000/7000 [==============================] - 52s 7ms/step - loss: 0.6970 - acc: 0.5036 - val_loss: 0.7254 - val_acc: 0.4877
Epoch 7/20
7000/7000 [==============================] - 51s 7ms/step - loss: 0.6968 - acc: 0.5047 - val_loss: 0.6935 - val_acc: 0.5123
Epoch 8/20
7000/7000 [==============================] - 47s 7ms/step - loss: 0.6970 - acc: 0.5076 - val_loss: 0.6941 - val_acc: 0.5123
Epoch 9/20
7000/7000 [==============================] - 50s 7ms/step - loss: 0.6982 - acc: 0.5024 - val_loss: 0.6928 - val_acc: 0.5123
Epoch 10/20
7000/7000 [==============================] - 47s 7ms/step - loss: 0.6974 - acc: 0.5010 - val_loss: 0.7222 - val_acc: 0.4877
Epoch 11/20
7000/7000 [==============================] - 51s 7ms/step - loss: 0.6975 - acc: 0.5087 - val_loss: 0.6936 - val_acc: 0.4877
Epoch 12/20
7000/7000 [==============================] - 49s 7ms/step - loss: 0.6991 - acc: 0.5021 - val_loss: 0.6938 - val_acc: 0.4877
Epoch 13/20
7000/7000 [==============================] - 49s 7ms/step - loss: 0.6976 - acc: 0.4996 - val_loss: 0.6983 - val_acc: 0.4877
Epoch 14/20
7000/7000 [==============================] - 49s 7ms/step - loss: 0.6978 - acc: 0.5064 - val_loss: 0.6944 - val_acc: 0.5123
Epoch 15/20
7000/7000 [==============================] - 49s 7ms/step - loss: 0.6993 - acc: 0.5019 - val_loss: 0.6937 - val_acc: 0.5123
Epoch 16/20
7000/7000 [==============================] - 49s 7ms/step - loss: 0.6969 - acc: 0.5027 - val_loss: 0.6930 - val_acc: 0.5123
Epoch 17/20
7000/7000 [==============================] - 49s 7ms/step - loss: 0.6981 - acc: 0.4939 - val_loss: 0.6953 - val_acc: 0.4877
Epoch 18/20
7000/7000 [==============================] - 51s 7ms/step - loss: 0.6969 - acc: 0.5030 - val_loss: 0.7020 - val_acc: 0.4877
Epoch 19/20
7000/7000 [==============================] - 51s 7ms/step - loss: 0.6984 - acc: 0.5039 - val_loss: 0.6973 - val_acc: 0.5123
Epoch 20/20
7000/7000 [==============================] - 51s 7ms/step - loss: 0.6981 - acc: 0.5053 - val_loss: 0.6940 - val_acc: 0.5123
it should be a plane: [[0.5358367 0.46416324]]
it should be a plane: [[0.5358367 0.46416324]]
it should be a plane: [[0.5358367 0.46416324]]
it should be a plane: [[0.5358367 0.46416324]]
it should be a plane: [[0.5358367 0.46416324]]
it should be a plane: [[0.5358367 0.46416324]]
it should be a plane: [[0.5358367 0.46416324]]
it should be a plane: [[0.5358367 0.46416324]]
it should be a plane: [[0.5358367 0.46416324]]
it should be a plane: [[0.5358367 0.46416324]]
it should be a plane: [[0.5358367 0.46416324]]
it should be a plane: [[0.5358367 0.46416324]]
it should be a plane: [[0.5358367 0.46416324]]
it should be a plane: [[0.5358367 0.46416324]]
it should be a car: [[0.5358367 0.46416324]]
it should be a car: [[0.5358367 0.46416324]]
it should be a car: [[0.5358367 0.46416324]]
it should be a car: [[0.5358367 0.46416324]]
it should be a car: [[0.5358367 0.46416324]]
it should be a car: [[0.5358367 0.46416324]]
it should be a car: [[0.5358367 0.46416324]]
it should be a car: [[0.5358367 0.46416324]]
it should be a car: [[0.5358367 0.46416324]]
it should be a car: [[0.5358367 0.46416324]]
【Question discussion】:
-
It is very unclear what is going wrong here
-
Why does the model always predict the same output no matter what the input is!!! In other words, the model says I always have a cat (if I use 0.5 as the decision boundary)
-
You have 50% accuracy at the last epoch, that's why... you should try more epochs, change the learning rate, look at the loss function... do more research.
-
Have you checked (by plotting) that the images you are trying to predict are actually different images?
-
@Echows Yes, I did!
Tags: tensorflow computer-vision deep-learning keras classification