【Question title】: Python ConvNet Image Classifier - "ValueError" when fitting a model for binary image classification
【Posted】: 2020-12-10 05:34:34
【Question】:

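I am very new to deep learning and TensorFlow/Keras, so I can't work out why an error is thrown when I try to fit a model that classifies images as "dog" or "cat". (The image dataset can be found here: https://www.microsoft.com/en-us/download/details.aspx?id=54765.) The code was written, saved and opened in separate modules while I was following a YouTube tutorial (https://www.youtube.com/watch?v=WvoLTXIjBYU). The first code block builds the dataset and saves it with pickle; the second is where the actual convolutional network is trained.

The image dataset was downloaded and saved to a local directory, and the following code was written to prepare the training data: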

import numpy as np
import matplotlib.pyplot as plt
import os
import cv2

DATADIR = "Pictures\\kagglecatsanddogs_3367a\\PetImages" 
#Workspace directory changed for posting
CATEGORIES = ["Dog", "Cat"]

#Iterate between all photos of dogs and cats
for category in CATEGORIES:
    path = os.path.join(DATADIR, category) #path to cats or dogs dir
    for img in os.listdir(path):
        img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE) #Read as grayscale; color is not needed in this specific instance
        plt.imshow(img_array, cmap = "gray")
        break
    break

#Print image dimensions
print(img_array.shape)

#All the images are different-shaped photos, so they must be normalized
#Everything must be made the same shape
#Decide on the image size you want to go with
IMG_SIZE = 180
new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))

training_data = []

def create_training_data(): #With goal of iterating through everything and building the dataset
    for category in CATEGORIES:
        path = os.path.join(DATADIR, category) #path to cats or dogs dir
        class_num = CATEGORIES.index(category)
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE) #Read as grayscale; color is not needed in this specific instance
                new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
                training_data.append([new_array, class_num])
            except Exception as e:
                pass

create_training_data()

print(len(training_data))

#Shuffle the data
import random
random.shuffle(training_data)

for sample in training_data[:10]:
    print(sample[1])

#Packs data into variables we will use
x = []
y = []

for features, label in training_data:
    x.append(features)
    y.append(label)
x = np.array(x).reshape(-1, IMG_SIZE, IMG_SIZE, 1)


#The data (x and y, not the model) is saved with pickle
import pickle
pickle_out = open("x.pickle", "wb")
pickle.dump(x, pickle_out)
pickle_out.close()

pickle_out = open("y.pickle", "wb")
pickle.dump(y, pickle_out)
pickle_out.close()

The data is then loaded in another Jupyter Notebook file and used to build the CNN:

#Import necessary packages
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D
import pickle

#Load the data generated by the previous script
x = pickle.load(open("x.pickle", "rb"))
y = pickle.load(open("y.pickle", "rb"))

#Normalize the data
#Pixel values are 8-bit (0-255), so dividing by 255 scales them to [0, 1]
x = x/255

#Model building: First layer
model = Sequential()
#Convolutional network
model.add(Conv2D(64, (3,3), input_shape = x.shape[1:]))
model.add(Activation("relu"))
#Pooling
model.add(MaxPooling2D(pool_size = (2,2)))

#Model building: Second layer
#Convolutional network
model.add(Conv2D(64, (3,3), input_shape = x.shape[1:]))
model.add(Activation("relu"))
#Pooling
model.add(MaxPooling2D(pool_size = (2,2)))

#Final output layer
model.add(Flatten())
model.add(Dense(64))

model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss= "binary_crossentropy",
             optimizer = "adam",
             metrics = ['accuracy'])

model.fit(x, y, batch_size = 32, epochs = 3, validation_split = 0.1)

The following exception is then thrown:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-bb5f154147cd> in <module>
     39              metrics = ['accuracy'])
     40 
---> 41 model.fit(x, y, batch_size = 32, epochs = 3, validation_split = 0.1)

~\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\keras\engine\training.py in _method_wrapper(self, *args, **kwargs)
    106   def _method_wrapper(self, *args, **kwargs):
    107     if not self._in_multi_worker_mode():  # pylint: disable=protected-access
--> 108       return method(self, *args, **kwargs)
    109 
    110     # Running inside `run_distribute_coordinator` already.

~\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
   1038       (x, y, sample_weight), validation_data = (
   1039           data_adapter.train_validation_split(
-> 1040               (x, y, sample_weight), validation_split=validation_split))
   1041 
   1042     if validation_data:

~\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\keras\engine\data_adapter.py in train_validation_split(arrays, validation_split)
   1374     raise ValueError(
   1375         "`validation_split` is only supported for Tensors or NumPy "
-> 1376         "arrays, found following types in the input: {}".format(unsplitable))
   1377 
   1378   if all(t is None for t in flat_arrays):

ValueError: `validation_split` is only supported for Tensors or NumPy arrays, found following types in the input: [<class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>, <class 'int'>

How can I fix this error? It seems to come from the line model.fit(x, y, batch_size = 32, epochs = 3, validation_split = 0.1), because no exception is raised when I run the code without that line. Thanks!

【Comments】:

    Tags: python numpy tensorflow machine-learning keras


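    【Solution 1】:

    The model expects its inputs as NumPy arrays, but y is loaded as a plain list of integers. Convert the loaded data to NumPy arrays before passing it to the model.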
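
To make the type mismatch concrete, here is a minimal sketch that mirrors the packing loop from the question on synthetic data. The tiny IMG_SIZE and the random images are assumptions for illustration only; the real images and Keras itself are not needed to see the problem.

```python
import numpy as np

IMG_SIZE = 4  # tiny stand-in for the 180 used in the question

# Synthetic stand-in for training_data: [image, label] pairs
rng = np.random.default_rng(0)
training_data = [
    [rng.integers(0, 256, size=(IMG_SIZE, IMG_SIZE), dtype=np.uint8), i % 2]
    for i in range(6)
]

x, y = [], []
for features, label in training_data:
    x.append(features)
    y.append(label)

# Same conversion as in the question: only x becomes an array
x = np.array(x).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
print(type(x).__name__, type(y).__name__)  # x is an ndarray, y is still a list

# The missing step: convert the labels as well
y = np.array(y)
print(x.shape, y.shape)
```

With y converted this way before it is pickled (or right after it is loaded), both inputs to model.fit are NumPy arrays, which is what validation_split requires according to the error message.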

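    【Comments】:

    • Could you give an example of how to do this? I'm new to all of this and trying to figure it out.
    • I suggest saving your training data to a NumPy file (numpy.org/doc/stable/reference/generated/numpy.save.html) and loading it with numpy.load. Convert y with np.array the same way you already do for x. Hope this helps.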
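
The numpy.save / numpy.load suggestion in the comment above could look roughly like the sketch below. The file names x.npy and y.npy, the temporary directory and the small placeholder arrays are assumptions made so the example runs on its own; in the question's script, x and y would come from the packing loop instead.

```python
import os
import tempfile

import numpy as np

# Placeholder data standing in for the real features and labels
x = np.zeros((6, 4, 4, 1), dtype=np.uint8)
y = [0, 1, 0, 1, 0, 1]  # plain Python list, as in the question

with tempfile.TemporaryDirectory() as tmp:
    x_path = os.path.join(tmp, "x.npy")
    y_path = os.path.join(tmp, "y.npy")

    # Save both as .npy files; the labels are converted to an array first
    np.save(x_path, x)
    np.save(y_path, np.array(y))

    # np.load returns NumPy arrays, so no further conversion is needed
    x_loaded = np.load(x_path)
    y_loaded = np.load(y_path)

print(type(y_loaded).__name__, x_loaded.shape, y_loaded.shape)
```

Because .npy files always load back as arrays, this route avoids the list-versus-array mismatch that pickle preserved in the question.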
    【Solution 2】:

    You should convert y to a NumPy array as well, not just X.

    X = []
    y = []
    
    for features,label in training_data:
        X.append(features)
        y.append(label)
    
    print(X[0].reshape(-1, IMG_SIZE, IMG_SIZE, 1))
    
    X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
    y = np.array(y)
    

    【Comments】:

      【Solution 3】:
      import numpy as np 
      X = np.array(X).reshape(-1,IMG_SIZE,IMG_SIZE,1)  
      y = np.array(y) 
      
      import tensorflow as tf 
      from tensorflow.keras.models import Sequential 
      from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
      from tensorflow.keras.layers import Conv2D, MaxPooling2D
      
      
      import pickle  
      

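      . . .

      Then continue as before. I ran into the same problem, but it worked once the data was loaded into NumPy arrays as shown above, by adding the extra lines that define X and y.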
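
If the pickle files already exist and you would rather not regenerate them, the same fix can be applied on the loading side in the second notebook. The sketch below is one possible way to do that; load_pickled_array is a hypothetical helper, and the temporary files with synthetic contents only stand in for the real x.pickle and y.pickle.

```python
import os
import pickle
import tempfile

import numpy as np


def load_pickled_array(path):
    """Hypothetical helper: unpickle a file and return its contents as a NumPy array."""
    with open(path, "rb") as f:
        return np.asarray(pickle.load(f))


with tempfile.TemporaryDirectory() as tmp:
    x_path = os.path.join(tmp, "x.pickle")
    y_path = os.path.join(tmp, "y.pickle")

    # Recreate the situation from the question: x pickled as an array, y as a list
    with open(x_path, "wb") as f:
        pickle.dump(np.full((4, 2, 2, 1), 255, dtype=np.uint8), f)
    with open(y_path, "wb") as f:
        pickle.dump([0, 1, 1, 0], f)

    x = load_pickled_array(x_path) / 255.0  # scale 8-bit pixel values to [0, 1]
    y = load_pickled_array(y_path)          # list of ints becomes an ndarray

# Features and labels must have the same number of samples
assert len(x) == len(y)
print(type(y).__name__, x.shape, y.shape)
```

After this, model.fit(x, y, batch_size = 32, epochs = 3, validation_split = 0.1) can be called as in the question, since both arguments are now NumPy arrays.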

      【Comments】:
