What should be the dimension of image in Convolutional neural network

【Question title】: what should be the dimension of image in Convolutional neural network
【Posted】: 2018-08-03 21:58:20
【Question description】:

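Beginner in deep learning..

I am trying to identify slums using satellite images (Google Maps) of the city of Pune. In the training dataset I provide about 100 images of slums and 100 images of other areas. But even though the accuracy is high, my model fails to classify input images correctly. I think this may be because of the image dimensions. I resize all images to 128*128 pixels. The kernel size is 3*3.

Map link: https://www.google.co.in/maps/@18.5129661,73.822531,286m/data=!3m1!1e3?hl=en

Here is the code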

import os,cv2
import glob
import numpy as np
from keras.utils import plot_model
from keras.utils.np_utils import to_categorical
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
from keras.models import Model
from keras.layers import Input, Convolution2D, MaxPooling2D, Flatten, Dense, Dropout


PATH = os.getcwd()
data_path = PATH + '/dataset/*'


files = glob.glob(data_path)
X = []

for myFiles in files:
 image = cv2.imread(myFiles)
 image_resize = cv2.resize(image, (256, 256))
 X.append(image_resize)

image_data = np.array(X)
image_data = image_data.astype('float32')
image_data /= 255
print("Image_data shape ", image_data.shape)


no_of_classes = 2
no_of_samples = image_data.shape[0]
label = np.ones(no_of_samples, dtype='int64')

label[0:86] = 0     #Slum
label[87:] = 1    #noSlum

Y = to_categorical(label, no_of_classes)


#shuffle dataset

x,y = shuffle(image_data , Y, random_state = 2)

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state = 2)

#print(x_train)
#print(y_train)


input_shape = image_data[0].shape

input = Input(input_shape)

conv_1 = Convolution2D(32,(3,3), padding='same', activation='relu')(input)
conv_2 = Convolution2D(32,(3,3), padding = 'same', activation = 'relu')(conv_1)
pool_1 = MaxPooling2D(pool_size = (2,2))(conv_2)
drop_1 = Dropout(0.5)(pool_1)

conv_3 = Convolution2D(64,(3,3), padding='same', activation='relu')(drop_1)
conv_4 = Convolution2D(64,(3,3), padding='same', activation = 'relu')(conv_3)
pool_2 = MaxPooling2D(pool_size = (2,2))(conv_4)
drop_2 = Dropout(0.5)(pool_2)

flat_1 = Flatten()(drop_2)
hidden = Dense(64,activation='relu')(flat_1)
drop_3 = Dropout(0.5)(hidden)
out = Dense(no_of_classes,activation = 'softmax')(drop_3)

model = Model(inputs = input, outputs = out)

model.compile(loss = 'categorical_crossentropy', optimizer = 'rmsprop',  metrics= ['accuracy'])

model.fit(x_train,y_train,batch_size=10,epochs=20,verbose =1, validation_data=(x_test,y_test))

model.save('model.h5')

score = model.evaluate(x_test,y_test,verbose=1)
print('Test Loss: ',score[0])
print('Test Accuracy: ',score[1])


test_image = x_test[0:1]
print(test_image.shape)

print (model.predict(test_image))
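
For a sense of scale, here is a rough sketch of how the input size changes the model posted above. It assumes square inputs, 'same' padding and 2*2 max pooling as in that code. The helper names `flatten_size` and `dense_params` are made up for this illustration and are not Keras functions.

```python
def flatten_size(side, conv_filters=(32, 64), pool=2):
    """Final spatial side and Flatten length after the conv/conv/pool blocks.

    Assumes square inputs, 'same' padding (convolutions keep the side
    length) and non-overlapping pooling, as in the model above.
    """
    for _ in conv_filters:
        side //= pool          # each MaxPooling2D((2, 2)) halves the side
    return side, side * side * conv_filters[-1]


def dense_params(n_inputs, units=64):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * units + units


for side in (128, 256, 512):
    out_side, flat = flatten_size(side)
    print(side, out_side, flat, dense_params(flat))
```

Under these assumptions, going from 128*128 to 512*512 multiplies the number of inputs to the first Dense layer by 16, and most of the model's weights sit in that one layer. With only about 200 training images, that is a lot of parameters to fit.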

【Question discussion】:

Tags: image-processing tensorflow deep-learning keras

【Solution 1】:

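In general, the behavior you describe looks like the NN being unable to identify small objects in the input images. Imagine being handed a 128*128 image so coarse and noisy that nothing can be made out: would you expect the NN to classify the objects correctly?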

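What to do?

1) Manually convert some input images from your dataset to 128*128 and look at the data you are really training the NN on. That will give you more insight; maybe you need a larger image size.

2) Add more Conv layers with more neurons. This lets the network detect smaller and more complex objects by adding more non-linearity to the output function. Google well-known NN architectures such as ResNet.

3) Add more training data; 100 images is not enough to get a proper result.

4) Add data augmentation as well (in your case, rotations look like a powerful option).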
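
Point 4 can be sketched with plain NumPy. This is only an illustration, assuming square images stacked in an array of shape (n, height, width, channels) like `x_train` in the question. The function name `augment_with_rotations` and the random stand-in data are made up for the example; Keras also ships its own augmentation utilities, which are not used here.

```python
import numpy as np


def augment_with_rotations(images, labels):
    """Return the originals plus their 90/180/270-degree rotations.

    images: array of shape (n, height, width, channels) with height == width
    labels: array with one row per image
    """
    # k=0 keeps the originals; k=1..3 rotate in the height/width plane.
    rotated = [np.rot90(images, k=k, axes=(1, 2)) for k in range(4)]
    return np.concatenate(rotated), np.concatenate([labels] * 4)


# Stand-in batch: 5 random "images" of 128x128x3 with one-hot labels.
rng = np.random.default_rng(0)
x = rng.random((5, 128, 128, 3), dtype=np.float32)
y = np.eye(2, dtype=np.float32)[[0, 1, 0, 1, 1]]

x_aug, y_aug = augment_with_rotations(x, y)
print(x_aug.shape, y_aug.shape)
```

Apply this to the training split only, after `train_test_split`, so that rotated copies of one image do not end up in both the training and the test set.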

Don't give up :) You will solve it eventually. Good luck

【Discussion】:

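I will try these suggestions. Looks like it needs some serious elbow grease.. Thank you so much..!
You're welcome. Waiting for your success and achievements. Edit your post with additional facts; that will surely help.
You were right. I realized that it is hard for the machine to detect features at 128*128. Increasing the image dimensions and adding more training data worked for me. But the accuracy is still not very good. I think adding more images to the dataset and using augmentation should work.
Good to know) Once you get more data and add data augmentation, I am sure you will succeed :) P.S. I usually use 512*512 or thereabouts, with a batch size of 1 and a GPU on AWS. But it depends on the data; for something like dogs vs. cats classification, 128*128 may be perfectly fine. In general, when you shrink or enlarge images with the opencv library, checking by hand how the images look is the key to understanding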
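
To illustrate why it pays to look at the images after shrinking them, here is a small synthetic sketch. It uses block averaging in NumPy as a stand-in for a real resize; `cv2.resize` defaults to bilinear interpolation, so its output will differ in detail. The striped test image and the name `downscale_by_averaging` are assumptions made for this example.

```python
import numpy as np


def downscale_by_averaging(image, factor):
    """Shrink a 2-D image by averaging non-overlapping factor x factor blocks.

    Height and width must be divisible by factor.
    """
    h, w = image.shape
    blocks = image.reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))


# Synthetic 512x512 "image": vertical stripes, 2 pixels dark, 2 pixels bright.
columns = (np.arange(512) // 2) % 2
fine_detail = np.tile(columns, (512, 1)).astype(np.float32)

small = downscale_by_averaging(fine_detail, 4)   # 512x512 -> 128x128
print(fine_detail.std(), small.std())
```

The stripes are clearly visible at 512*512, but every 4*4 block averages to the same gray, so the 128*128 version has no contrast left. Structures only a few pixels wide in a satellite tile can disappear in the same way.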