[Question Title]: Sequential Image Classification
[Posted]: 2021-01-11 19:21:24
[Question Description]:

I have 100+ tif files, each of which contains multiple images, and I want to build a binary classifier. First, I split every tif into png images (e.g., 2 tif files containing 20 and 30 images respectively become 50 png images (600 x 600) in another directory). I then trained a CNN on them, but the results were not up to the mark. The images within a tif are inherently sequential, and that ordering may carry important information for classification, so I am now trying a CNN+LSTM for this purpose. I have a csv file with the filenames and labels, and I am loading the data with ImageDataGenerator's flow_from_dataframe. Here is the code:
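For reference, the tif-to-png split described above can be sketched with Pillow's multi-frame support (the function name and the numbered output scheme are illustrative, not from the question):

```python
import os
from PIL import Image, ImageSequence

def split_tif(tif_path, out_dir):
    """Save every frame of a multi-page tif as a numbered png in out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.splitext(os.path.basename(tif_path))[0]
    with Image.open(tif_path) as tif:
        for i, frame in enumerate(ImageSequence.Iterator(tif)):
            # frame index in the name keeps the sequential order recoverable
            frame.convert("RGB").save(os.path.join(out_dir, f"{base}_{i:03d}.png"))
```

Keeping the frame index in the filename matters here, since the eventual CNN+LSTM needs the original ordering back.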

# imports used below (Keras / TF 2.x)
from tensorflow.keras import backend as k, metrics
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (ConvLSTM2D, BatchNormalization, MaxPooling2D,
                                     Flatten, Dense, Activation, Dropout)
from tensorflow.keras.preprocessing.image import ImageDataGenerator

img_width, img_height = 600, 600
no_frame = 5
original_train = "PATH TO IMAGES"
nb_training_samples = 6587
nb_validation_samples = 1646
epochs = 1
batch_size = 32
lr = 0.001
    
if k.image_data_format() == "channels_first":
    input_shape = (3,img_width,img_height)
else:
    input_shape = (img_width, img_height,3)
    
METRICS = [
  metrics.TruePositives(name='tp'),
  metrics.FalsePositives(name='fp'),
  metrics.TrueNegatives(name='tn'),
  metrics.FalseNegatives(name='fn'),
  metrics.BinaryAccuracy(name='accuracy'),
  metrics.Precision(name='precision'),
  metrics.Recall(name='recall'),
  metrics.AUC(name='auc'),
]

model = Sequential()
model.add(ConvLSTM2D(filters = 32, kernel_size=(3,3),
                    activation='relu',
                    return_sequences=True,
                    padding='same',
                    input_shape=(None,img_width, img_height,3)))
model.add(BatchNormalization())
model.add(ConvLSTM2D(64,(3,3), activation='relu',padding='same'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(2))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=METRICS)

model.summary()

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv_lst_m2d_10 (ConvLSTM2D) (None, None, 600, 600, 32 40448     
_________________________________________________________________
batch_normalization_9 (Batch (None, None, 600, 600, 32 128       
_________________________________________________________________
conv_lst_m2d_11 (ConvLSTM2D) (None, 600, 600, 64)      221440    
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 300, 300, 64)      0         
_________________________________________________________________
flatten_3 (Flatten)          (None, 5760000)           0         
_________________________________________________________________
dense_4 (Dense)              (None, 64)                368640064 
_________________________________________________________________
activation_2 (Activation)    (None, 64)                0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_5 (Dense)              (None, 2)                 130       
_________________________________________________________________
activation_3 (Activation)    (None, 2)                 0         
=================================================================
Total params: 368,902,210
Trainable params: 368,902,146
Non-trainable params: 64
_________________________________________________________________

datagen = ImageDataGenerator(rescale=1/255., validation_split=0.2)

train_generator = datagen.flow_from_dataframe(dataframe=data, directory=original_train,
                                             x_col='Id',
                                             y_col='label',
                                             target_size=(img_width,img_height),
                                             class_mode='categorical',
                                             batch_size=batch_size,
                                             subset='training',
                                             seed=7)

print(train_generator.class_indices)

validation_generator = datagen.flow_from_dataframe(dataframe=data, directory=original_train,
                                             x_col='Id',
                                             y_col='label',
                                             target_size=(img_width,img_height),
                                             class_mode='categorical',
                                             batch_size=batch_size,
                                             subset='validation',
                                             seed=7)

print(validation_generator.class_indices)

train_steps = train_generator.n//train_generator.batch_size
validation_steps = validation_generator.n//validation_generator.batch_size


history = model.fit_generator(train_generator,steps_per_epoch=train_steps, epochs=epochs,
                              validation_data=validation_generator,validation_steps=validation_steps)

After this I get the following error:

ValueError: Error when checking input: expected conv_lst_m2d_10_input to have 5 dimensions, but got array with shape (32, 600, 600, 3)

I have a couple of questions about this:

  1. How do I fix this error?
  2. How can I pass one tif as a batch, given that the number of images in a single tif varies?

Any help is appreciated.

Thanks :)

Edit 1:

I created a custom generator as follows:

import cv2
import numpy as np
from tensorflow.keras.utils import Sequence

class DataGenerator(Sequence):

    def __init__(self, list_IDs, labels, image_path, to_fit=True, batch_size=32, dim=(5,600,600),
                n_channel=1, n_classes=2, shuffle=True):
        self.list_IDs = list_IDs
        self.labels = labels
        self.image_path = image_path
        self.to_fit = to_fit
        self.batch_size = batch_size
        self.dim = dim
        self.n_channel = n_channel
        self.n_classes = n_classes
        self.shuffle = shuffle
        self.on_epoch_end()

    def __len__(self):
        return int(np.floor(len(self.list_IDs)/self.batch_size))

    def __getitem__(self,index):
        indexes = self.indexes[index * self.batch_size:(index+1)*self.batch_size]

        X,y = self._generate_data(indexes)

        return X,y

    def on_epoch_end(self):
        # Keras calls this hook by name; the original "on_epoc_end" typo meant it never ran
        self.indexes = np.arange(len(self.list_IDs))
        if self.shuffle:
            np.random.shuffle(self.indexes)

    def _generate_data(self, indexes):
        X = np.empty((self.batch_size, *self.dim, self.n_channel))
        y = np.empty((self.batch_size,), dtype=np.uint8)

        for i, idx in enumerate(indexes):
            X[i,] = self._load_grayscale_image(self.image_path + self.list_IDs[idx])
            y[i] = self.labels[idx]  # label must follow the (possibly shuffled) dataset index, not the batch position
        return X, y

    def _load_grayscale_image(self,image_path):
        img = cv2.imread(image_path+'.png')
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        img = img / 255
        img = img[:,:,np.newaxis]  # shape (600, 600, 1); broadcast duplicates it across the 5-frame axis of X[i]
        return img

and load the data:

import pandas as pd
from sklearn.model_selection import train_test_split

def loadData(filepath, val_sample=0.2):
    data = pd.read_csv(filepath)
    image_IDs = data['Id'].values
    labels = data['label'].values
    X_train, X_test, Y_train, Y_test = train_test_split(image_IDs, labels, test_size=val_sample, shuffle=False)
    train_data = DataGenerator(X_train, Y_train, image_path=original_train, batch_size=batch_size, shuffle=False)
    val_data = DataGenerator(X_test, Y_test, image_path=original_train, batch_size=batch_size, shuffle=False)
    return train_data, val_data

But after getting the shapes to fit the model, it gives:

ValueError: Error when checking input: expected reshape_2_input to have 4 dimensions, but got array with shape (32, 5, 600, 600, 1)
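This second error says the model's first layer still expects 4-D images while the custom generator now yields 5-D batches of shape (32, 5, 600, 600, 1). One way to reconcile them is to drop the Reshape and give the first ConvLSTM2D a matching 5-D input_shape directly; a minimal sketch with reduced spatial size (the layer widths here are illustrative, not the question's):

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (ConvLSTM2D, BatchNormalization,
                                     MaxPooling2D, Flatten, Dense)

frames, h, w, ch = 5, 32, 32, 1  # 600x600 in the question; reduced here for speed

model = Sequential([
    # input_shape already includes the time axis, so no Reshape is needed
    ConvLSTM2D(8, (3, 3), activation="relu", padding="same",
               return_sequences=True, input_shape=(frames, h, w, ch)),
    BatchNormalization(),
    ConvLSTM2D(8, (3, 3), activation="relu", padding="same"),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(1, activation="sigmoid"),  # a single sigmoid unit suits binary labels
])
model.compile(loss="binary_crossentropy", optimizer="rmsprop")

batch = np.random.rand(2, frames, h, w, ch).astype("float32")
print(model(batch).shape)  # (2, 1)
```

With this layout the generator's (batch, 5, 600, 600, 1) arrays are consumed as-is, one 5-frame sequence per sample.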

[Question discussion]:

    Tags: python tensorflow image-processing keras lstm


    [Solution 1]:

    Like any other LSTM layer, ConvLSTM2D expects a time-step dimension, so the full input shape should be:

    (n_samples, time_steps, height, width, channels)
    

    Since it is hard to add a dimension when using ImageDataGenerator, I suggest reshaping the data as it enters the network:

    model.add(Reshape((1,) + input_shape, input_shape=input_shape))
    

    A copy/paste example:

    from tensorflow.keras.layers import *
    from tensorflow.keras import Sequential
    import numpy as np
    
    img_width, img_height = 32, 32
    input_shape = (img_width, img_height, 3)
    batch_size = 8
    
    model = Sequential()
    model.add(Reshape((1,) + input_shape, input_shape=input_shape))
    model.add(ConvLSTM2D(filters=8, kernel_size=(3, 3),
                         activation='relu',
                         return_sequences=True,
                         padding='same',
                         input_shape=(None, img_width, img_height, 3)))
    model.add(BatchNormalization())
    model.add(ConvLSTM2D(8, (3, 3), activation='relu', padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(8))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(2))
    model.add(Activation('sigmoid'))
    
    model.compile(loss='binary_crossentropy', optimizer='rmsprop')
    
    model.summary()
    
    fake_picture = np.random.rand(*((batch_size,) + input_shape)).astype(np.float32)
    model(fake_picture)
    
    <tf.Tensor: shape=(8, 2), dtype=float32, numpy=
    array([[0.49504986, 0.4995347 ],
           [0.49617144, 0.5001322 ],
           [0.4947565 , 0.50097185],
           [0.49597737, 0.4996349 ],
           [0.49563733, 0.50064707],
           [0.49486715, 0.49945754],
           [0.49625823, 0.50110054],
           [0.49568254, 0.50056493]], dtype=float32)>
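
    On the second question (passing one tif, with its variable number of frames, as a batch): since ConvLSTM2D accepts a time axis of None, one possible option is to feed one whole tif per batch, so each sample keeps its own frame count. A NumPy-only sketch, assuming each tif's frames have already been loaded as one (T, H, W, C) array:

```python
import numpy as np

def tif_batches(sequences, labels):
    """Yield one (1, T_i, H, W, C) batch per tif, so T_i may vary between tifs."""
    for frames, label in zip(sequences, labels):
        x = np.asarray(frames, dtype="float32")[np.newaxis]  # add the batch axis
        y = np.asarray([label], dtype="float32")
        yield x, y

# two fake tifs with 3 and 5 frames of 8x8 grayscale
seqs = [np.random.rand(3, 8, 8, 1), np.random.rand(5, 8, 8, 1)]
labs = [0, 1]
shapes = [x.shape for x, _ in tif_batches(seqs, labs)]
print(shapes)  # [(1, 3, 8, 8, 1), (1, 5, 8, 8, 1)]
```

    The trade-off is an effective batch size of 1; the alternative is padding or truncating every tif to a fixed frame count so that several sequences can share a batch.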
    

    [Discussion]:

    • Thanks @Nicolas for the solution; it no longer raises an error, but my jupyter kernel keeps dying and restarting automatically. I reduced the batch size and changed the model architecture a bit, but that did not help... any ideas?
    • Restart it, I guess
    • Any solution for the second question, i.e. passing the images of one tif (which can be any number) as a batch?
    • When trying to increase time_steps from 1 to 5, I get InvalidArgumentError: Input to reshape is a tensor with 34560000 values, but the requested shape has 172800000
    • You cannot make time steps appear by magic. I am afraid there are at least 3 problems bundled into your question, so it is hard to answer: there is the input shape, there is the loading of the tiff files... without the necessary data I do not know where to start