Answers: Caffe shape mismatch error using pretrained VGG-16 model

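【Question title】: Caffe shape mismatch error using pretrained VGG-16 model
【Posted】: 2016-07-27 16:05:15
【Question】:

I am using PyCaffe to implement a neural network inspired by the VGG 16-layer net. I want to use the pretrained model available from their GitHub page. Generally this is done by matching layer names.

For my "fc6" layer, I have the following definition in my train.prototxt file: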

layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param {
    num_output: 4096
  }
}

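Here is the prototxt file for the VGG-16 deploy architecture. Note that the "fc6" in their prototxt is identical to mine (apart from the learning rate, which is irrelevant here). It is also worth noting that the inputs in my model are the same size too: 3-channel 224x224px images.

I have been following this tutorial closely, and the block of code that is giving me trouble is the following: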

solver = caffe.SGDSolver(osp.join(model_root, 'solver.prototxt'))
solver.net.copy_from(model_root + 'VGG_ILSVRC_16_layers.caffemodel')
solver.test_nets[0].share_with(solver.net)
solver.step(1)

The first line loads my solver prototxt, and the second line copies the weights from the pretrained model (VGG_ILSVRC_16_layers.caffemodel). When the solver runs, I get this error:

Cannot copy param 0 weights from layer 'fc6'; shape mismatch.  Source param 
shape is 1 1 4096 25088 (102760448); target param shape is 4096 32768 (134217728). 
To learn this layer's parameters from scratch rather than copying from a saved 
net, rename the layer.

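The gist of it is that their model expects the layer to be of size 1x1x4096, while mine is just 4096. But I don't understand how to change this?

I found this answer in the Google users group, which tells me to do net surgery to reshape the pretrained model before copying. To do that, however, I need the lmdb files from the original architecture's data layers, which I don't have (the net surgery script throws an error when I try to run it).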

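【Comments】:

You have no problem with the output dimension, 4096. The problem is the input dimension: according to the error message, the pretrained VGG weights expect a 25088-dimensional input to fc6, while your net feeds it a 32768-dimensional input. Something in your conv/pooling layers or your input changed the feature map size.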
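The two input dimensions in the error message can be read as channels times a square spatial size. The following is a minimal sketch of that arithmetic, assuming pool5 carries 512 channels as in the standard VGG-16 definition; `CHANNELS` and the helper `spatial_side` are illustrative names, not part of Caffe.

```python
import math

# Shapes copied from the error message in the question.
SOURCE_SHAPE = (1, 1, 4096, 25088)  # "source": weights stored in the saved .caffemodel
TARGET_SHAPE = (4096, 32768)        # "target": weights of the net being trained

CHANNELS = 512  # assumption: pool5 has 512 channels, as in standard VGG-16


def spatial_side(fc_inputs, channels=CHANNELS):
    """Return s such that channels * s * s == fc_inputs, or None if no such s exists."""
    area, remainder = divmod(fc_inputs, channels)
    if remainder:
        return None
    side = math.isqrt(area)
    return side if side * side == area else None


source_side = spatial_side(SOURCE_SHAPE[-1])
target_side = spatial_side(TARGET_SHAPE[-1])
print("source fc6 input: 512 x", source_side, "x", source_side)  # 512 x 7 x 7
print("target fc6 input: 512 x", target_side, "x", target_side)  # 512 x 8 x 8
```

Under that assumption the saved weights correspond to a 7x7 pool5 output and the new net produces an 8x8 one, so the leading `1 1` in the source shape is not the source of the mismatch.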

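Tags: python deep-learning caffe pycaffe vgg-net

【Solution 1】:

The problem is not the 4096 but the 25088. You need to compute the output feature map size of every layer in your network from its input feature map size. Note that an fc layer takes a fixed-size input, so the output of the layer that precedes it must match the input size the fc layer requires. Compute the size of your fc6 input feature map (that is, the output feature map of the preceding layer) from that layer's input feature map size. Here is the formula: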

H_out = (H_in + 2 * Padding_Height - Kernel_Height) / Stride_Height + 1
W_out = (W_in + 2 * Padding_Width - Kernel_Width) / Stride_Width + 1
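As an illustration, the formula can be applied layer by layer. The sketch below assumes the standard VGG-16 layout (five blocks of 3x3 convolutions with padding 1 and stride 1, each followed by 2x2 max pooling with stride 2, ending in 512 channels) and assumes that convolution rounds the division down while pooling rounds it up. The names `out_size`, `CONVS_PER_BLOCK`, and `fc6_inputs` are hypothetical helpers for this example, not Caffe APIs.

```python
import math


def out_size(size, kernel, pad, stride, round_up=False):
    """Apply (size + 2*pad - kernel) / stride + 1 along one spatial dimension."""
    span = size + 2 * pad - kernel
    steps = math.ceil(span / stride) if round_up else span // stride
    return steps + 1


# Assumed VGG-16 layout: number of 3x3 convolutions in each of the five blocks.
CONVS_PER_BLOCK = [2, 2, 3, 3, 3]
FINAL_CHANNELS = 512  # assumed channel count of pool5


def fc6_inputs(input_side):
    """Return (pool5 side length, number of inputs seen by fc6) for a square image."""
    side = input_side
    for n_convs in CONVS_PER_BLOCK:
        for _ in range(n_convs):
            side = out_size(side, kernel=3, pad=1, stride=1)  # size-preserving conv
        side = out_size(side, kernel=2, pad=0, stride=2, round_up=True)  # pooling
    return side, FINAL_CHANNELS * side * side


print(fc6_inputs(224))  # (7, 25088)
```

Calling `fc6_inputs` with the side length your data layer actually produces shows whether fc6 would receive the 25088 inputs that the pretrained weights were saved with.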

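【Comments】:

【Solution 2】:

This error occurs if you crop your images to 224 instead of 227, as was done with the original dataset. Adjust this and you should be good to go.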

【Comments】:

The input size for VGG16 is 10x3x224x224! Where did you get 227 from?
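The two crop sizes mentioned here can be compared with a quick calculation. This is only a sketch under stated assumptions: five 2x2, stride-2 pooling layers that round up, convolutions that preserve the spatial size, and 512 channels reaching fc6 (the standard VGG-16 layout). The function name `fc6_input_dim` is made up for this example.

```python
import math

POOL_LAYERS = 5       # assumption: five 2x2, stride-2 pooling layers
POOL5_CHANNELS = 512  # assumption: 512 channels reach fc6


def fc6_input_dim(crop_size):
    """Number of fc6 inputs for a square crop, under the assumptions above."""
    side = crop_size
    for _ in range(POOL_LAYERS):
        side = math.ceil(side / 2)  # pooling halves the side, rounding up
    return POOL5_CHANNELS * side * side


for crop in (224, 227):
    print(crop, "->", fc6_input_dim(crop))
# 224 -> 25088
# 227 -> 32768
```

Under these assumptions a 224 crop yields the 25088 inputs of the saved weights, and a 227 crop yields the 32768 inputs reported for the target net in the error message. Whether the asker's data layer actually produces 227-pixel crops cannot be determined from the thread.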