[Question title]: 3D convolutional autoencoder is not returning the right output shape
[Posted]: 2022-01-20 20:28:12
[Question description]:

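I am trying to use an autoencoder on spatiotemporal data. My data has the shape (batches, filters, timesteps, rows, columns). I am having trouble getting the autoencoder to produce the right output shape.

Here is my model: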

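from tensorflow.keras.layers import Input, Conv3D, MaxPooling3D, UpSampling3D
from tensorflow.keras.models import Model

input_imag = Input(shape=(3, 81, 4, 4))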

x = Conv3D(16, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(input_imag)
x = MaxPooling3D((3, 2, 2), data_format='channels_first', padding='same')(x)
x = Conv3D(8, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
x = MaxPooling3D((3, 2, 2), data_format='channels_first', padding='same')(x)
x = Conv3D(4, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
encoded = MaxPooling3D((3, 2, 2), data_format='channels_first', padding='same', name='encoder')(x)

x = Conv3D(4, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(encoded)
x = UpSampling3D((3, 2, 2), data_format='channels_first')(x)
x = Conv3D(8, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
x = UpSampling3D((3, 2, 2), data_format='channels_first')(x)
x = Conv3D(16, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
x = UpSampling3D((3, 2, 2), data_format='channels_first')(x)
decoded = Conv3D(3, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)

autoencoder = Model(input_imag, decoded)
autoencoder.compile(optimizer='adam', loss='mse')

autoencoder.summary()

Here is the summary:

Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 3, 81, 4, 4)]     0
_________________________________________________________________
conv3d (Conv3D)              (None, 16, 81, 4, 4)      2176
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 16, 27, 2, 2)      0
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 8, 27, 2, 2)       5768
_________________________________________________________________
max_pooling3d_1 (MaxPooling3 (None, 8, 9, 1, 1)        0
_________________________________________________________________
conv3d_2 (Conv3D)            (None, 4, 9, 1, 1)        1444
_________________________________________________________________
encoder (MaxPooling3D)       (None, 4, 3, 1, 1)        0
_________________________________________________________________
conv3d_3 (Conv3D)            (None, 4, 3, 1, 1)        724
_________________________________________________________________
up_sampling3d (UpSampling3D) (None, 4, 9, 2, 2)        0
_________________________________________________________________
conv3d_4 (Conv3D)            (None, 8, 9, 2, 2)        1448
_________________________________________________________________
up_sampling3d_1 (UpSampling3 (None, 8, 27, 4, 4)       0
_________________________________________________________________
conv3d_5 (Conv3D)            (None, 16, 27, 4, 4)      5776
_________________________________________________________________
up_sampling3d_2 (UpSampling3 (None, 16, 81, 8, 8)      0
_________________________________________________________________
conv3d_6 (Conv3D)            (None, 3, 81, 8, 8)       2163
=================================================================
Total params: 19,499
Trainable params: 19,499
Non-trainable params: 0

What should I change so that the decoder output shape is [?, 3, 81, 4, 4] instead of [?, 3, 81, 8, 8]?

[Comments]:

    Tags: python tensorflow time-series conv-neural-network autoencoder


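    [Solution 1]:

    It looks like you want the MaxPooling3D and UpSampling3D operations to be symmetric (at least in terms of output shape). Let's look at the input shape of the last MaxPooling3D layer: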

    conv3d_2 (Conv3D)            (None, 4, 9, 1, 1)        1444
    _________________________________________________________________
    encoder (MaxPooling3D)       (None, 4, 3, 1, 1)        0
    

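    The shape is (None, 4, 9, 1, 1). The last two dimensions are already 1, so they cannot be halved as pool_size asks. With padding='same' the output size along each axis is ceil(input / stride), so a dimension of size 1 stays 1. The MaxPooling3D layer, despite having pool_size=(3, 2, 2), therefore behaves as if it had pool_size=(3, 1, 1). At least I think that is what happens under the hood.

    I am a bit surprised that there is no error or warning when pool_size is larger than the input size, although padding='same' probably explains it, since the input is padded so that the window always fits.

    To fix this, set the size of the first UpSampling3D layer to (3, 1, 1):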
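
The shape arithmetic behind this can be sketched in plain Python. The sketch assumes the usual rule that a pooling layer with padding='same' outputs ceil(input / stride) along each axis, with strides defaulting to pool_size. The helper name same_pool is made up for illustration and is not a Keras API.

```python
import math

def same_pool(shape, pool):
    """Output size of a pooling layer with padding='same' and strides == pool_size."""
    return tuple(math.ceil(n / p) for n, p in zip(shape, pool))

shape = (81, 4, 4)  # (timesteps, rows, columns) of the input
encoder_shapes = []
for _ in range(3):  # the three MaxPooling3D((3, 2, 2)) layers of the encoder
    shape = same_pool(shape, (3, 2, 2))
    encoder_shapes.append(shape)
    print(shape)
```

The three printed shapes match the (27, 2, 2), (9, 1, 1) and (3, 1, 1) entries in the summary: the last pooling step shrinks only the time axis.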

    x = UpSampling3D((3, 1, 1), data_format='channels_first')(x)
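
As a quick sanity check, the decoder shapes can be traced without building the model. This is a minimal sketch that assumes UpSampling3D simply multiplies each axis by its size factor. The helpers upsample and decode are hypothetical names used only here.

```python
def upsample(shape, size):
    """Each axis is repeated `size` times, so the length is multiplied."""
    return tuple(n * s for n, s in zip(shape, size))

def decode(shape, factors):
    """Apply a sequence of upsampling factors to a starting shape."""
    for size in factors:
        shape = upsample(shape, size)
    return shape

bottleneck = (3, 1, 1)  # (timesteps, rows, columns) after the encoder

original = decode(bottleneck, [(3, 2, 2), (3, 2, 2), (3, 2, 2)])
fixed = decode(bottleneck, [(3, 1, 1), (3, 2, 2), (3, 2, 2)])

print(original)  # (81, 8, 8)
print(fixed)     # (81, 4, 4)
```

Only the first factor changes, because it mirrors the last pooling layer, the one that left rows and columns at size 1.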
    

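    So, the complete solution:

    from tensorflow.keras.layers import Input, Conv3D, MaxPooling3D, UpSampling3D
    from tensorflow.keras.models import Model

    input_imag = Input(shape=(3, 81, 4, 4))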
    
    x = Conv3D(16, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(input_imag)
    x = MaxPooling3D((3, 2, 2), data_format='channels_first', padding='same')(x)
    x = Conv3D(8, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
    x = MaxPooling3D((3, 2, 2), data_format='channels_first', padding='same')(x)
    x = Conv3D(4, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
    encoded = MaxPooling3D((3, 2, 2), data_format='channels_first', padding='same', name='encoder')(x)
    
    x = Conv3D(4, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(encoded)
    x = UpSampling3D((3, 1, 1), data_format='channels_first')(x)
    x = Conv3D(8, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
    x = UpSampling3D((3, 2, 2), data_format='channels_first')(x)
    x = Conv3D(16, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
    x = UpSampling3D((3, 2, 2), data_format='channels_first')(x)
    decoded = Conv3D(3, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
    
    autoencoder = Model(input_imag, decoded)
    autoencoder.compile(optimizer='adam', loss='mse')
    
    autoencoder.summary()
    

    Output:

    Model: "model_1"
    _________________________________________________________________
     Layer (type)                Output Shape              Param #   
    =================================================================
     input_3 (InputLayer)        [(None, 3, 81, 4, 4)]     0         
                                                                     
     conv3d_14 (Conv3D)          (None, 16, 81, 4, 4)      2176      
                                                                     
     max_pooling3d_4 (MaxPooling  (None, 16, 27, 2, 2)     0         
     3D)                                                             
                                                                     
     conv3d_15 (Conv3D)          (None, 8, 27, 2, 2)       5768      
                                                                     
     max_pooling3d_5 (MaxPooling  (None, 8, 9, 1, 1)       0         
     3D)                                                             
                                                                     
     conv3d_16 (Conv3D)          (None, 4, 9, 1, 1)        1444      
                                                                     
     encoder (MaxPooling3D)      (None, 4, 3, 1, 1)        0         
                                                                     
     conv3d_17 (Conv3D)          (None, 4, 3, 1, 1)        724       
                                                                     
     up_sampling3d_6 (UpSampling  (None, 4, 9, 1, 1)       0         
     3D)                                                             
                                                                     
     conv3d_18 (Conv3D)          (None, 8, 9, 1, 1)        1448      
                                                                     
     up_sampling3d_7 (UpSampling  (None, 8, 27, 2, 2)      0         
     3D)                                                             
                                                                     
     conv3d_19 (Conv3D)          (None, 16, 27, 2, 2)      5776      
                                                                     
     up_sampling3d_8 (UpSampling  (None, 16, 81, 4, 4)     0         
     3D)                                                             
                                                                     
     conv3d_20 (Conv3D)          (None, 3, 81, 4, 4)       2163      
                                                                     
    =================================================================
    Total params: 19,499
    Trainable params: 19,499
    Non-trainable params: 0
    

    [Discussion]:
