【Question Title】: How to reduce the bottleneck in an autoencoder as much as possible?
【Posted】: 2020-04-16 15:14:44
【Problem Description】:

Dear all,

I have the following code:

from tensorflow.keras.layers import Input, Convolution1D, MaxPooling1D

inpt = Input(shape=(160,1))

# Input is 160 samples, 20 ms for sampling rate of 8 kHz
# Of course speech can be wide-band. One should take care then

conv1 = Convolution1D(512,3,activation='relu',padding='same',strides=1)(inpt)
conv2 = Convolution1D(128,3,activation='relu',padding='same',strides=1)(conv1)
pool1 = MaxPooling1D(pool_size=2, strides=None, padding='valid')(conv2)


conv3 = Convolution1D(256,3,activation='relu',padding='same',strides=1)(pool1)
conv4 = Convolution1D(256,3,activation='relu',padding='same',strides=1)(conv3)
pool2 = MaxPooling1D(pool_size=2, strides=None, padding='valid')(conv4)


conv5 = Convolution1D(256,3,activation='relu',padding='same',strides=1)(pool2)
conv6 = Convolution1D(128,3,activation='relu',padding='same',strides=1)(conv5)
pool3 = MaxPooling1D(pool_size=2, strides=None, padding='valid')(conv6)


conv7 = Convolution1D(128,3,activation='relu',padding='same',strides=1)(pool3)
conv8 = Convolution1D(64,3,activation='relu',padding='same',strides=1)(conv7)
pool4 = MaxPooling1D(pool_size=2, strides=None, padding='valid')(conv8)


conv9 = Convolution1D(32,3,activation='relu',padding='same',strides=1)(pool4)
conv10 = Convolution1D(16,3,activation='relu',padding='same',strides=1)(conv9)
############################# EXTRA 
conv10 = Convolution1D( 8, kernel_size = (3), activation='relu', padding='same')(conv10)
pool4 = MaxPooling1D(pool_size = (2), padding='same')(conv10)
conv10 = Convolution1D( 8, 3, activation='relu', padding='same')(pool4)
encoded = Convolution1D( 8, 3, activation='relu', padding='same')(conv10)
#############

If the input is a 27000-sample signal, the bottleneck here has a length of 6920.

I want to reduce the bottleneck to only 400. How can I do that by modifying the code starting from the EXTRA section? I tried adding extra conv and pool layers, but the length would not go below 6920.
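As a rough sanity check (assuming the bottleneck size is counted as time_steps_after_pooling × channels, which is my own reading of the numbers above), this little script shows how much downsampling the current layers give and how large a total factor would be needed for ~400:

signal_len = 27000                 # length of the long input signal
channels = 8                       # channels of the last Conv1D in the EXTRA block
pool_factors = [2, 2, 2, 2, 2]     # the five MaxPooling1D layers above

steps = signal_len
for p in pool_factors:
    steps //= p                    # each pooling halves the time axis

print(steps, steps * channels)     # roughly 843 steps -> ~6700 values at the bottleneck
print(signal_len * channels // 400)  # ~540: total downsampling factor needed for ~400 values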

【Question Discussion】:

  • @furcifer can you help?

Tags: tensorflow keras autoencoder


【Solution 1】:

You can get the desired length in several different ways (a minimal shape-check sketch follows the list):

  1. Keep increasing the pool size:

    pool = MaxPooling1D(pool_size = (4))(prev) # or use an even larger number

  2. Use VALID padding in the Conv and Pool layers:

    pool = MaxPooling1D(pool_size = (4), padding='valid')(prev)

    conv10 = Convolution1D(8, 3, activation='relu', padding='valid')(prev)

  3. You can also use larger strides in the Pool and Conv layers:

    pool = MaxPooling1D(pool_size = (4), strides=4, padding='valid')(prev)

    conv10 = Convolution1D(8, 3, strides=4, activation='relu', padding='valid')(prev)
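As a quick way to see how these choices compound, here is a minimal shape-check sketch (the toy 160×8 input and the model name probe are mine, just for illustration): applying option 3 twice divides the time axis by roughly 16.

from tensorflow.keras.layers import Input, Convolution1D, MaxPooling1D
import tensorflow as tf

x = Input(shape=(160, 8))   # any 1-D feature map; 160 steps is only an example
y = Convolution1D(8, 3, strides=4, activation='relu', padding='valid')(x)   # strided conv: 160 -> 40
y = MaxPooling1D(pool_size=4, strides=4, padding='valid')(y)                # larger pool + stride: 40 -> 10
probe = tf.keras.Model(x, y)
print(probe.output_shape)   # (None, 10, 8)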

【Discussion】:

  • After applying option 3 I get the following error: Error when checking target: expected model_14 to have shape (96, 1) but got array with shape (160, 1)
  • I changed the upsampling to 4 and I still get the following error: ValueError: Error when checking target: expected model_18 to have shape (192, 1) but got array with shape (160, 1). @Bashir Kazimi
  • You did not include your decoder code. I gave a general idea for reducing the dimensionality, but you will have to adjust it so that the output has the same size as the model's input.
  • ``` input_decoder = Input(shape=(1, 4)) upsmp1 = UpSampling1D(size=2)(input_decoder) conv11 = Convolution1D(4, 3, activation='relu', padding='same')(upsmp1) upsmp1 = UpSampling1D(size=4)(conv11) conv11 = Convolution1D(8, 3, activation='relu', padding='same')(upsmp1) conv12 = Convolution1D(8, 3, activation='relu', padding='same')(conv11) pool4 = UpSampling1D(size=4)(conv12) conv10 = Convolution1D(8, kernel_size=3, activation='relu', padding='same')(pool4) ``` @Bashir Kazimi
  • From your question and your latest comment I understand that you want to reduce the encoder output to a shape of (1, 4) and then upsample your decoder output so that it has the same shape as the encoder input. I have created a model for this and added it as an answer.
【Solution 2】:

I have created a draft for you as follows:

  1. The encoder takes input of shape (batch_size, 160, 1) and outputs a vector of shape (batch_size, 1, 4)
  2. The decoder takes input of shape (batch_size, 1, 4), the same as the encoder output
  3. A combined encoder_decoder model

Encoder:

from tensorflow.keras.layers import Input, Convolution1D, MaxPooling1D, GlobalAveragePooling1D, UpSampling1D
import tensorflow as tf
inpt = Input(shape=(160,1))

# Input is 160 samples, 20 ms for sampling rate of 8 kHz
# Of course speech can be wide-band. One should take care then

conv1 = Convolution1D(512,3,activation='relu',padding='same',strides=1)(inpt)
conv2 = Convolution1D(128,3,activation='relu',padding='same',strides=1)(conv1)
pool1 = MaxPooling1D(pool_size=2, strides=None, padding='valid')(conv2)


conv3 = Convolution1D(256,3,activation='relu',padding='same',strides=1)(pool1)
conv4 = Convolution1D(256,3,activation='relu',padding='same',strides=1)(conv3)
pool2 = MaxPooling1D(pool_size=2, strides=None, padding='valid')(conv4)


conv5 = Convolution1D(256,3,activation='relu',padding='same',strides=1)(pool2)
conv6 = Convolution1D(128,3,activation='relu',padding='same',strides=1)(conv5)
pool3 = MaxPooling1D(pool_size=2, strides=None, padding='valid')(conv6)


conv7 = Convolution1D(128,3,activation='relu',padding='same',strides=1)(pool3)
conv8 = Convolution1D(64,3,activation='relu',padding='same',strides=1)(conv7)
pool4 = MaxPooling1D(pool_size=6, strides=None, padding='valid')(conv8)


conv9 = Convolution1D(32,3,activation='relu',padding='same',strides=1)(pool4)
conv10 = Convolution1D(4,3,activation='relu',padding='same',strides=1)(conv9)
encoded = MaxPooling1D(pool_size=3)(conv10)

encoder = tf.keras.Model(inputs=inpt, outputs=encoded)
encoder.summary()
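A quick sanity check (not in the original answer) that the bottleneck really is (1, 4):

# Time axis: 160 -> 80 -> 40 -> 20 -> 3 -> 1; channels end at 4.
print(encoder.output_shape)   # expected: (None, 1, 4)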

Decoder:

input_decoder = Input(shape = (1, 4) ) ############# 
upsmp1 = UpSampling1D(size=2)(input_decoder) 
conv11 = Convolution1D( 4, 3, activation='relu', padding='same')(upsmp1) 
upsmp1 = UpSampling1D(size=8)(conv11) 
conv11 = Convolution1D( 8, 3, activation='relu', padding='same')(upsmp1) 
conv12 = Convolution1D( 8, 3, activation='relu', padding='same')(conv11) 
pool4 = UpSampling1D(size=10)(conv12) 
conv10 = Convolution1D( 1, kernel_size = (3), activation='relu', padding='same')(pool4) 
decoder = tf.keras.Model(inputs=input_decoder, outputs=conv10)
decoder.summary()

Combined encoder-decoder:

encoder_decoder = tf.keras.Model(inputs=inpt, outputs=decoder(encoded))
encoder_decoder.summary()
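For completeness, a minimal training sketch; the mse loss, adam optimizer, and the random data below are assumptions for illustration, not part of the original answer.

import numpy as np

encoder_decoder.compile(optimizer='adam', loss='mse')   # assumed loss/optimizer

# Dummy data just to show the expected shapes: 100 signals, 160 samples, 1 channel.
x_dummy = np.random.randn(100, 160, 1).astype('float32')

# An autoencoder is trained to reproduce its own input.
encoder_decoder.fit(x_dummy, x_dummy, epochs=1, batch_size=16)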

【Discussion】:
