KERAS: Pretrained a CNN+Dense model. How to freeze CNN weights and substitute Dense with LSTM?

【Question title】: KERAS: Pretrained a CNN+Dense model. How to freeze CNN weights and substitute Dense with LSTM?
【Posted】: 2023-12-07 20:29:01
【Question】:

I trained and loaded a CNN+Dense model:

from keras.models import load_model

# load model
cnn_model = load_model('my_cnn_model.h5')
cnn_model.summary()

The output looks like this (my images are 2 x 3600):

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 2, 3600, 32)       128
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 2, 1800, 32)       3104
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 2, 600, 32)        0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 2, 600, 64)        6208
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 2, 300, 64)        12352
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 2, 100, 64)        0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 2, 100, 128)       24704
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 2, 50, 128)        49280
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 2, 16, 128)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 4096)              0
_________________________________________________________________
dense_1 (Dense)              (None, 1024)              4195328
_________________________________________________________________
dense_2 (Dense)              (None, 1024)              1049600
_________________________________________________________________
dense_3 (Dense)              (None, 3)                 3075
=================================================================
Total params: 5,343,779
Trainable params: 5,343,779
Non-trainable params: 0

Now I want to freeze the CNN weights and replace the Dense layers with LSTM layers, so that only the added LSTM part is trained.

I just wrote:

# freeze model
base_model = cnn_model(input_shape=(2, 3600, 1))

#base_model = cnn_model
base_model.trainable = False

# Adding the first lstm layer
x = LSTM(1024,activation='relu',return_sequences='True')(base_model.output)

# Adding the second lstm layer
x = LSTM(1024, activation='relu',return_sequences='False')(x)

# Adding the output
output = Dense(3,activation='linear')(x)

# Final model creation
model = Model(inputs=[base_model.input], outputs=[output])

But I got:

base_model = cnn_model(input_shape=(2, 3600, 1))
TypeError: __call__() missing 1 required positional argument: 'inputs'

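I know I should ideally add TimeDistributed at the Flatten layer, but I don't know how to do it. Also, I'm not sure whether base_model.trainable = False does exactly what I want. Can you help me get this working?

Thank you very much!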

【Comments】:

What if you define the base model as base_model = cnn_model() instead of base_model = cnn_model(input_shape=(2, 3600, 1))?
I get the same error: base_model = cnn_model() TypeError: __call__() missing 1 required positional argument: 'inputs'
With base_model = cnn_model I get: ValueError: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=2
At this point I think I have to insert TimeDistributed somewhere.
OK, that is right... the shapes in your earlier model do not match.
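
The two errors discussed in these comments can be reproduced on a small scale. The sketch below assumes tensorflow.keras is available and uses a hypothetical miniature model (the name cnn_model is reused, but the layer sizes and the (2, 36, 1) input are made up for illustration). It shows that an already built model is used as an object rather than called with input_shape, and that its final output is 2D while the pooling output before Flatten is still 4D.

```python
from tensorflow.keras import layers, models

# Hypothetical miniature of a CNN+Dense model. A model returned by
# load_model() is likewise an already built model object.
cnn_model = models.Sequential([
    layers.Input(shape=(2, 36, 1)),
    layers.Conv2D(4, (1, 3), padding='same', activation='relu'),
    layers.MaxPooling2D((1, 3)),
    layers.Flatten(),
    layers.Dense(3),
])

# Use the model object as it is. Calling cnn_model(input_shape=...) fails
# because calling a model means "apply it to an input tensor".
base_model = cnn_model

# The final output is 2D (batch, features); an LSTM expects 3D input
# (batch, time, features), hence "expected ndim=3, found ndim=2".
final_rank = len(base_model.output_shape)

# The pooling layer right before Flatten still has a 4D output
# (batch, height, width, filters), which can be reshaped to 3D.
pool_rank = len(base_model.layers[-3].output.shape)

print(final_rank, pool_rank)
```

Taking the tensor from the layer before Flatten, rather than from base_model.output, is what makes a reshape to (time, features) possible.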

Tags: keras lstm conv-neural-network pre-trained-model

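【Solution 1】:

You cannot feed the output of Flatten() directly into an LSTM, because an LSTM needs 2D features per sample (time, filters). You have to reshape your tensor.

You can take the output from the layer before Flatten (the max pooling layer). Assuming that layer has index i in the model, we can take its output, reshape it as needed, and pass it to the LSTM.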

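from keras.layers import Reshape, LSTM, Dense
from keras.models import Model

# i is the index of the layer from which you want to take the model output;
# i = -5 is max_pooling2d_3 in the summary above (the layer right before flatten_1)
i = -5
before_flatten = base_model.layers[i].output  # shape (None, 2, 16, 128)

# Reshape to (time, filters); you have to select the temporal dim and the filters.
# Here: 2 * 16 = 32 time steps with 128 filters each
conv2lstm_reshape = Reshape((-1, 128))(before_flatten)

# Adding the first lstm layer
x = LSTM(1024, activation='relu', return_sequences=True)(conv2lstm_reshape)

# Adding the second lstm layer
x = LSTM(1024, activation='relu', return_sequences=False)(x)

# Adding the output on top of the second lstm layer
output = Dense(3, activation='linear')(x)

# Final model creation
model = Model(inputs=[base_model.input], outputs=[output])

model.summary()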
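
The choice of the last dimension in Reshape((-1, k)) decides how many time steps the LSTM sees. The small pure-Python sketch below works out the resulting shape for the max_pooling2d_3 output (2, 16, 128) from the summary in the question. The helper name reshape_to_sequence is hypothetical and only mirrors the arithmetic that Reshape performs; it is not a Keras function.

```python
from math import prod

def reshape_to_sequence(feature_shape, n_features):
    """Return the (timesteps, n_features) shape produced by Reshape((-1, n_features)).

    feature_shape is the layer output shape without the batch dimension.
    """
    total = prod(feature_shape)
    if total % n_features != 0:
        raise ValueError(
            f"{total} values cannot be split into rows of {n_features} features"
        )
    return (total // n_features, n_features)

# Output of max_pooling2d_3 without the batch dimension
pool_shape = (2, 16, 128)

# One time step per spatial position, 128 filters per step
seq_shape = reshape_to_sequence(pool_shape, 128)

# A last dimension of 2 is also a valid reshape, but gives many short steps
alt_shape = reshape_to_sequence(pool_shape, 2)

print(seq_shape, alt_shape)
```

With channels-last data, keeping the filter axis as the feature dimension means each time step holds the 128 filter responses of one spatial position, which matches the (time, filters) layout described above.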

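【Discussion】:

Thanks. Now I get no errors. Just one odd thing: in the final model, all my parameters are trainable. Where should I put model.trainable = False?
You said you only want to freeze your CNN layers; you can iterate over the layers and set each one to non-trainable individually. *.com/questions/53503389/…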
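
A minimal sketch of the per-layer freezing suggested in the last comment, assuming tensorflow.keras and a small stand-in for the pretrained CNN. The name cnn, the layer sizes and the (2, 36, 1) input are hypothetical and chosen only to keep the example small; with the real model the loop would run over the loaded model's layers instead.

```python
from tensorflow.keras import layers, models

# Hypothetical stand-in for the pretrained CNN+Dense model
cnn = models.Sequential([
    layers.Input(shape=(2, 36, 1)),
    layers.Conv2D(4, (1, 3), padding='same', activation='relu'),
    layers.MaxPooling2D((1, 3)),
    layers.Flatten(),
    layers.Dense(3),
])

# Freeze every layer of the pretrained part, one by one.
# This is done before compiling the new model.
for layer in cnn.layers:
    layer.trainable = False

# Take the pooling output before Flatten: (None, 2, 12, 4)
features = cnn.layers[-3].output

# Reshape to (time, filters): (None, 24, 4)
x = layers.Reshape((-1, 4))(features)

# New trainable head: one LSTM plus a linear output layer
x = layers.LSTM(8)(x)
output = layers.Dense(3, activation='linear')(x)

model = models.Model(inputs=cnn.inputs, outputs=output)
model.compile(optimizer='adam', loss='mse')

# Only the LSTM (kernel, recurrent kernel, bias) and the new Dense
# (kernel, bias) should remain trainable.
print(len(model.trainable_weights))
model.summary()
```

In the summary, the convolution parameters should now be listed as non-trainable, while the LSTM and the new Dense layer stay trainable.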