ValueError: Error when checking target: expected activation_6 to have shape (None, 2) but got array with shape (5760, 1) (answers)

【Question title】: ValueError: Error when checking target: expected activation_6 to have shape (None, 2) but got array with shape (5760, 1)
【Posted】: 2018-09-18 05:29:31
【Question description】:

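I am trying to adapt Python code for a convolutional neural network (in Keras) written for 8 classes so that it works with 2 classes. My problem is that I get the following error message:

ValueError: Error when checking target: expected activation_6 to have shape (None, 2) but got array with shape (5760, 1).

My model is as follows (there are no indentation problems in my actual code):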

    class MiniVGGNet:
        @staticmethod
        def build(width, height, depth, classes):
            # initialize the model along with the input shape to be
            # "channels last" and the channels dimension itself
            model = Sequential()
            inputShape = (height, width, depth)
            chanDim = -1

            # if we are using "channels first", update the input shape
            # and channels dimension
            if K.image_data_format() == "channels_first":
                inputShape = (depth, height, width)
                chanDim = 1

            # first CONV => RELU => CONV => RELU => POOL layer set
            model.add(Conv2D(32, (3, 3), padding="same",
                input_shape=inputShape))
            model.add(Activation("relu"))
            model.add(BatchNormalization(axis=chanDim))
            model.add(Conv2D(32, (3, 3), padding="same"))
            model.add(Activation("relu"))
            model.add(BatchNormalization(axis=chanDim))
            model.add(MaxPooling2D(pool_size=(2, 2)))
            model.add(Dropout(0.25))

            # second CONV => RELU => CONV => RELU => POOL layer set
            model.add(Conv2D(64, (3, 3), padding="same"))
            model.add(Activation("relu"))
            model.add(BatchNormalization(axis=chanDim))
            model.add(Conv2D(64, (3, 3), padding="same"))
            model.add(Activation("relu"))
            model.add(BatchNormalization(axis=chanDim))
            model.add(MaxPooling2D(pool_size=(2, 2)))
            model.add(Dropout(0.25))

            # first (and only) set of FC => RELU layers
            model.add(Flatten())
            model.add(Dense(512))
            model.add(Activation("relu"))
            model.add(BatchNormalization())
            model.add(Dropout(0.5))

            # softmax classifier
            model.add(Dense(classes))
            model.add(Activation("softmax"))

            # return the constructed network architecture
            return model

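Here classes = 2 and inputShape = (32, 32, 3).

I know my error is related to my classes and my use of binary_crossentropy, and that it occurs in the model.fit line below, but I have not been able to figure out why it fails or how to fix it.

By changing model.add(Dense(classes)) above to model.add(Dense(classes-1)) I can get the model to train, but then my label size and target_names do not match, and everything ends up classified into a single class.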

# import the necessary packages
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from pyimagesearch.nn.conv import MiniVGGNet
from pyimagesearch.preprocessing import ImageToArrayPreprocessor
from pyimagesearch.preprocessing import SimplePreprocessor
from pyimagesearch.datasets import SimpleDatasetLoader
from keras.optimizers import SGD
#from keras.datasets import cifar10
from imutils import paths
import matplotlib.pyplot as plt
import numpy as np
import argparse

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True,
    help="path to input dataset")
ap.add_argument("-o", "--output", required=True,
    help="path to the output loss/accuracy plot")
args = vars(ap.parse_args())

# grab the list of images that we'll be describing
print("[INFO] loading images...")
imagePaths = list(paths.list_images(args["dataset"]))

# initialize the image preprocessors
sp = SimplePreprocessor(32, 32)
iap = ImageToArrayPreprocessor()

# load the dataset from disk then scale the raw pixel intensities
# to the range [0, 1]
sdl = SimpleDatasetLoader(preprocessors=[sp, iap])
(data, labels) = sdl.load(imagePaths, verbose=500)
data = data.astype("float") / 255.0

# partition the data into training and testing splits using 75% of
# the data for training and the remaining 25% for testing
(trainX, testX, trainY, testY) = train_test_split(data, labels,
    test_size=0.25, random_state=42)

# convert the labels from integers to vectors
trainY = LabelBinarizer().fit_transform(trainY)
testY = LabelBinarizer().fit_transform(testY)

# initialize the label names for the items dataset
labelNames = ["mint", "used"]

# initialize the optimizer and model
print("[INFO] compiling model...")
opt = SGD(lr=0.01, decay=0.01 / 10, momentum=0.9, nesterov=True)
model = MiniVGGNet.build(width=32, height=32, depth=3, classes=2)
model.compile(loss="binary_crossentropy", optimizer=opt,
    metrics=["accuracy"])

# train the network
print("[INFO] training network...")
H = model.fit(trainX, trainY, validation_data=(testX, testY),
    batch_size=64, epochs=10, verbose=1)
print ("Made it past training")

# evaluate the network
print("[INFO] evaluating network...")
predictions = model.predict(testX, batch_size=64)
print(classification_report(testY.argmax(axis=1),
    predictions.argmax(axis=1), target_names=labelNames))

# plot the training loss and accuracy
plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0, 10), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, 10), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, 10), H.history["acc"], label="train_acc")
plt.plot(np.arange(0, 10), H.history["val_acc"], label="val_acc")
plt.title("Training Loss and Accuracy on items dataset")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend()
plt.savefig(args["output"])

I have already looked at these questions, but could not work out from the replies how to solve this problem.

Stackoverflow Question 1

Stackoverflow Question 2

Stackoverflow Question 3

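Any advice or help would be greatly appreciated, as I have been stuck on this for the last couple of days.

【Question discussion】:

Tags: tensorflow deep-learning keras valueerror

【Solution 1】:

Matt's comment was absolutely correct that the problem lay in using LabelBinarizer, and that hint led me to a solution that did not require giving up softmax or changing the last layer to classes = 1. For posterity and for anyone else who runs into this, here is the part of the code I changed and how I avoided LabelBinarizer: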

from keras.utils import np_utils
from sklearn.preprocessing import LabelEncoder    

# load the dataset from disk then scale the raw pixel intensities
# to the range [0,1]
sp = SimplePreprocessor(32, 32)
iap = ImageToArrayPreprocessor()

# encode the labels, converting them from strings to integers
le = LabelEncoder()
labels = le.fit_transform(labels)

data = data.astype("float") / 255.0
labels = np_utils.to_categorical(labels,2)

# partition the data into training and testing splits using 75% of
# the data for training and the remaining 25% for testing
....
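
To see why this gives targets the two-unit softmax layer accepts, here is a minimal NumPy sketch of the same two steps. It is a stand-in rather than the Keras or scikit-learn implementation: np.unique plays the role of LabelEncoder and indexing into np.eye plays the role of to_categorical, and the labels array is made up for illustration.

```python
import numpy as np

# Hypothetical string labels standing in for the dataset's labels.
labels = np.array(["mint", "used", "used", "mint", "used"])

# Step 1: strings -> integer ids, with classes in sorted order
# (the same result LabelEncoder().fit_transform(labels) would give).
class_names, label_ids = np.unique(labels, return_inverse=True)

# Step 2: integer ids -> one-hot rows, one column per class
# (the same shape to_categorical(label_ids, 2) would give).
num_classes = 2
one_hot = np.eye(num_classes, dtype="float32")[label_ids]

print(class_names)    # class order used for the columns
print(one_hot.shape)  # (5, 2): matches a final Dense(2) layer
```

The key difference from LabelBinarizer is that the column count is fixed by num_classes, so two classes still produce two columns instead of one.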

【Discussion】:

【Solution 2】:

I believe the problem lies in the use of LabelBinarizer.

From this example:

>>> lb = preprocessing.LabelBinarizer()
>>> lb.fit_transform(['yes', 'no', 'no', 'yes'])
array([[1],
       [0],
       [0],
       [1]])

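I gather that the output of your transformation has the same format, i.e. a single 1 or 0 encoding "is new" or "is used".

If your problem only calls for classification between these two classes, that format is preferable, because it contains all the information and uses less space than the alternative, i.e. [1,0], [0,1], [0,1], [1,0].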
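
The two formats carry the same information, which a small NumPy sketch can illustrate. The array below reuses the values from the LabelBinarizer example; the column order of the expanded form is a convention chosen here (column 0 for label 0, column 1 for label 1), not something the original code prescribes.

```python
import numpy as np

# Single-column targets, as LabelBinarizer returns for two classes.
y_column = np.array([[1], [0], [0], [1]])

# Expand to two columns: column 0 flags label 0, column 1 flags label 1.
y_one_hot = np.hstack([1 - y_column, y_column])

# Going back: the position of the 1 in each row is the original label.
y_recovered = y_one_hot.argmax(axis=1).reshape(-1, 1)

print(y_column.shape)   # (4, 1): what the error message calls (5760, 1)
print(y_one_hot.shape)  # (4, 2): what a two-unit output layer expects
```

This is the mismatch the error reports: the targets have one column while the network's last layer has two units.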

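Therefore, using classes = 1 would be correct, and the output should be a single float indicating the network's confidence that the sample belongs to the first class. Since the two probabilities must sum to 1, the probability of the sample belonging to the second class can easily be inferred by subtracting from 1.

You would need to replace softmax with some other activation, because softmax over a single value always returns 1. I am not completely sure how binary_crossentropy behaves with a single-valued result, so you may want to try mean_squared_error as the loss.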
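
The point about softmax on a single value can be checked numerically. The sketch below uses plain NumPy with made-up logits; sigmoid is shown as one possible replacement activation (it is the one commonly paired with binary_crossentropy for a single output unit), which is an assumption added here rather than something the answer specifies.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(z):
    # Squashes each value independently into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical raw outputs of a single-unit final layer for three samples.
logits = np.array([[-2.0], [0.0], [3.0]])

softmax_out = softmax(logits)  # each row normalizes over one value, so it is 1
p_class = sigmoid(logits)      # varies with the logit
p_other = 1.0 - p_class        # the complementary class, by subtraction

print(softmax_out.ravel())
print(p_class.ravel())
```

With softmax the network's output would carry no information about the input, while sigmoid yields a usable probability whose complement gives the other class.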

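If you wish to extend your model to cover more than two classes, you would want to convert your target vector to one-hot encoding. I believe inverse_transform from LabelBinarizer would do this, though that seems a rather roundabout way to get there. I see that sklearn also has OneHotEncoder, which may be a more appropriate replacement.

Note: you can specify the activation function for any layer more easily, for example: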

Dense(36, activation='relu')

This may help keep your code at a manageable size.

【Discussion】: