Transfer learning on data with fewer target labels than the pretrained classifier: answers

【Question title】: Transfer learning on data with lesser target labels than that of the pretrained classifier
【Posted】: 2020-12-07 16:34:18
【Question description】:

Suppose there is a pretrained model (base_model) that has been trained on a large dataset and can predict 7 human emotions, for example

'Anger', 'Disgust', 'Fear', 'Happiness', 'Sadness', 'Surprise','Neutral'

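Now, to build a transfer learning model, I remove the last layer of "base_model", freeze the remaining layers so that their weights are not trainable, and then add my own fine-tuning layers, which are trainable.

I would like to know how to train this newly compiled model, "model_finetuned", on a smaller dataset that contains only 3 of the 7 emotions, namely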

'Anger', 'Sadness', 'Surprise'

Any help or suggestions in the form of Python code would be much appreciated. Thanks in advance!

【Question discussion】:

Tags: python-3.x tensorflow deep-learning transfer-learning

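【Solution 1】:

As you correctly explained, you can freeze the weights of the pretrained model and add fully connected layers at the end of the model for fine-tuning.

There are two ways to make use of a pretrained network: feature extraction and fine-tuning.

Feature extraction: use the representations learned by the previous network to extract useful features from new samples. These features are then passed through a new classifier that is trained from scratch. (This could be just the last fully connected layer.)
Fine-tuning: unfreeze a few of the top layers of the frozen base model used for feature extraction, and train them jointly with the newly added part of the model.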
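
The difference between the two approaches comes down to which parameters stay trainable. The following is a minimal PyTorch sketch rather than the answerer's code: the tiny base and head networks and the helper names set_trainable and count_trainable are hypothetical stand-ins, chosen so that the example runs without downloading any pretrained weights.

```python
import torch
from torch import nn

# Hypothetical stand-in for a pretrained network: a small "base" plus a new head.
base = nn.Sequential(
    nn.Linear(8, 16), nn.ReLU(),   # lower layers of the base
    nn.Linear(16, 16), nn.ReLU(),  # top layer of the base
)
head = nn.Linear(16, 3)  # newly added classifier for 3 classes
model = nn.Sequential(base, head)


def set_trainable(module: nn.Module, trainable: bool) -> None:
    """Enable or disable gradient updates for every parameter in a module."""
    for param in module.parameters():
        param.requires_grad = trainable


def count_trainable(module: nn.Module) -> int:
    """Count the parameters that an optimizer would update."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


# Feature extraction: freeze the whole base and train only the new head.
set_trainable(base, False)
feature_extraction_params = count_trainable(model)

# Fine-tuning: additionally unfreeze the top layer of the base.
set_trainable(base[2], True)
fine_tuning_params = count_trainable(model)
```

With a real pretrained network the same pattern applies: freeze everything first, then selectively re-enable requires_grad on the layers closest to the output.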

Example with a pretrained VGG16:

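# Load the pretrained VGG16 network
import torch
from torch import nn
from torchvision.models import vgg16

num_classes = 3
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Newer torchvision releases replace pretrained=True with the weights= argument
pretrained_model = vgg16(pretrained=True)

# Keep the convolutional part of the model as the feature extractor
feature_extractor = pretrained_model.features

# Freeze the pretrained weights so that only the new classifier is trained
for param in feature_extractor.parameters():
    param.requires_grad = False

# Define the new classifier head.
# 4*4*512 assumes 128x128 input images (VGG16 downsamples by a factor of 32);
# for 224x224 inputs the flattened size would be 7*7*512
feature_classifier = nn.Sequential(
    nn.Linear(4 * 4 * 512, 256),
    nn.ReLU(),
    nn.Linear(256, num_classes),
)

# Assemble the full model: frozen features -> flatten -> trainable classifier
model = nn.Sequential(
    feature_extractor,
    nn.Flatten(),
    feature_classifier,
).to(device)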

As you can see, you have to specify the output size of the model in the last fully connected layer. In your case, that is num_classes = 3.
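
With a 3-output head, the labels of the smaller dataset need to be integer indices in the range 0 to 2, which is the target format that PyTorch's CrossEntropyLoss expects. A minimal sketch of that mapping is shown here; the ordering of the classes, the helper name encode_labels, and the sample labels are assumptions made for illustration.

```python
# The three emotions present in the smaller dataset (taken from the question).
TARGET_EMOTIONS = ["Anger", "Sadness", "Surprise"]

# Map each emotion name to a contiguous index 0..2, matching num_classes = 3.
label_to_index = {name: i for i, name in enumerate(TARGET_EMOTIONS)}
index_to_label = {i: name for name, i in label_to_index.items()}


def encode_labels(names):
    """Convert a list of emotion names into integer class indices."""
    return [label_to_index[name] for name in names]


# Hypothetical labels for a handful of samples.
sample_labels = ["Sadness", "Anger", "Surprise", "Anger"]
encoded = encode_labels(sample_labels)
```

Keeping index_to_label around makes it possible to turn a predicted index back into an emotion name at inference time.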

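【Discussion】:

The code you provided is written in PyTorch; it is correct, but the asker may not know how to adapt it.
Thanks @mgrau, but I think that with your code the model will only make predictions for these 3 classes rather than for all 7.
Hi @PrathameshMohite, I did indeed misunderstand the question, since my solution is only useful for classifying among the 3 classes. One possible solution to your problem would be a per-class weighted loss function: you should give lower importance to the classes you are retraining on, because those 3 classes have more training samples than the others. With the PyTorch framework, you can specify a manual rescaling weight for each class. pytorch.org/docs/stable/generated/…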
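
The per-class weighting suggested in the last comment can be sketched with the weight argument of torch.nn.CrossEntropyLoss, assuming the model keeps a 7-output head. The weight values (0.5 and 1.0), the class ordering, and the random batch below are illustrative assumptions, not values from the thread.

```python
import torch
from torch import nn

# All seven emotions, in an assumed order matching the 7-output head.
EMOTIONS = ["Anger", "Disgust", "Fear", "Happiness", "Sadness", "Surprise", "Neutral"]
RETRAINED = {"Anger", "Sadness", "Surprise"}

# Illustrative weights only: lower importance for the three retrained classes.
class_weights = torch.tensor(
    [0.5 if name in RETRAINED else 1.0 for name in EMOTIONS]
)

# CrossEntropyLoss rescales each sample's loss by the weight of its target class.
criterion = nn.CrossEntropyLoss(weight=class_weights)

# Dummy batch: 4 samples with 7 logits each; targets index into EMOTIONS.
torch.manual_seed(0)
logits = torch.randn(4, len(EMOTIONS))
targets = torch.tensor([0, 4, 5, 0])  # Anger, Sadness, Surprise, Anger
loss = criterion(logits, targets)
```

Note that a class weight only affects samples whose target is that class, so the weighting matters mainly when the training data mixes samples from the retrained classes with samples from the others.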

【Solution 2】:

Here is a code example I wrote a few days ago using TensorFlow Keras:

from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

num_classes = 3

# Weights for the pretrained model: 'imagenet' loads the ImageNet weights;
# a path to a local weights file can be passed instead
resnet_weights_path = 'imagenet'

# Create your model
model = Sequential()

# Include the pretrained model without its classification head. In this case, ResNet50
model.add(ResNet50(include_top=False, pooling='avg', weights=resnet_weights_path))

# Add as many extra layers as you need, according to your problem
# You can also try it directly, without any extra layers

# Add the final layer that makes predictions. Choose the activation function that suits your task
model.add(Dense(num_classes, activation='softmax'))

# Don't train the pretrained model
model.layers[0].trainable = False

# Compile your model according to your needs
# Note: 'categorical_crossentropy' expects one-hot encoded labels
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])

Your model is now ready to be trained.
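
Because the model is compiled with categorical_crossentropy, the targets passed to training need to be one-hot vectors of length num_classes. A minimal NumPy sketch of that encoding follows; the integer labels and their meaning (0 = Anger, 1 = Sadness, 2 = Surprise) are assumptions made for illustration.

```python
import numpy as np

num_classes = 3

# Hypothetical integer labels: 0 = Anger, 1 = Sadness, 2 = Surprise.
int_labels = np.array([1, 0, 2, 0])

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing it with the labels yields one row per sample.
one_hot = np.eye(num_classes, dtype="float32")[int_labels]
```

Alternatively, compiling with loss='sparse_categorical_crossentropy' lets Keras accept the integer labels directly, and tf.keras.utils.to_categorical performs the same one-hot conversion.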

【Discussion】: