Caffe net.predict() outputs random results (GoogleNet): answers

【Question title】: Caffe net.predict() outputs random results (GoogleNet)
【Posted】: 2015-08-29 00:10:36
【Question】:

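I took the pretrained GoogleNet from https://github.com/BVLC/caffe/tree/master/models/bvlc_googlenet and fine-tuned it on my own data (about 100k images, 101 classes). After one day of training I reached 62% top-1 and 85% top-5 classification accuracy, and I am now trying to use this network to predict a few images.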

I simply followed the example at https://github.com/BVLC/caffe/blob/master/examples/classification.ipynb.

Here is my Python code:

import caffe
import numpy as np


caffe_root = './caffe'


MODEL_FILE = 'caffe/models/bvlc_googlenet/deploy.prototxt'
PRETRAINED = 'caffe/models/bvlc_googlenet/bvlc_googlenet_iter_200000.caffemodel'

caffe.set_mode_gpu()

net = caffe.Classifier(MODEL_FILE, PRETRAINED,
               mean=np.load('ilsvrc_2012_mean.npy').mean(1).mean(1),
               channel_swap=(2,1,0),
               raw_scale=255,
               image_dims=(224, 224))

def caffe_predict(path):
    # caffe.io.load_image returns a float RGB image with values in [0, 1].
    input_image = caffe.io.load_image(path)
    print(path)
    print(input_image)

    # predict() returns one row of class probabilities per input image.
    prediction = net.predict([input_image])
    print(prediction)
    print("----------")

    probs = prediction[0]
    top1 = probs.argmax()
    print('prediction shape: {}'.format(probs.shape))
    print('predicted class: {}'.format(top1))

    proba = probs[top1]
    ind = probs.argsort()[-5:][::-1]  # top-5 predictions, most likely first

    return top1, proba, ind

In my deploy.prototxt I changed only the last layer, so that it predicts my 101 classes.

layer {
  name: "loss3/classifier"
  type: "InnerProduct"
  bottom: "pool5/7x7_s1"
  top: "loss3/classifier"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 101
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "loss3/classifier"
  top: "prob"
}

Here is the distribution of the softmax output:

[[ 0.01106235  0.00343131  0.00807581  0.01530041  0.01077161  0.0081002
   0.00989228  0.00972753  0.00429183  0.01377776  0.02028225  0.01209726
   0.01318955  0.00669979  0.00720005  0.00838189  0.00335461  0.01461464
   0.01485041  0.00543212  0.00400191  0.0084842   0.02134697  0.02500303
   0.00561895  0.00776423  0.02176422  0.00752334  0.0116104   0.01328687
   0.00517187  0.02234021  0.00727272  0.02380056  0.01210031  0.00582192
   0.00729601  0.00832637  0.00819836  0.00520551  0.00625274  0.00426603
   0.01210176  0.00571806  0.00646495  0.01589645  0.00642173  0.00805364
   0.00364388  0.01553882  0.01549598  0.01824486  0.00483241  0.01231962
   0.00545738  0.0101487   0.0040346   0.01066607  0.01328133  0.01027429
   0.01581303  0.01199994  0.00371804  0.01241552  0.00831448  0.00789811
   0.00456275  0.00504562  0.00424598  0.01309276  0.0079432   0.0140427
   0.00487625  0.02614347  0.00603372  0.00892296  0.00924052  0.00712763
   0.01101298  0.00716757  0.01019373  0.01234141  0.00905332  0.0040798
   0.00846442  0.00924353  0.00709366  0.01535406  0.00653238  0.01083806
   0.01168014  0.02076091  0.00542234  0.01246306  0.00704035  0.00529556
   0.00751443  0.00797437  0.00408798  0.00891858  0.00444583]]

It looks like a meaningless random distribution.
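One way to put a number on "looks random" is to compare the top probability with the uniform baseline 1/N and to compute the normalized entropy of the output. In the output above the largest value is about 0.026, while the uniform baseline for 101 classes is 1/101, roughly 0.0099. The helper below is only a sketch; the name uniformity_report is made up for illustration and is not part of Caffe.

```python
import numpy as np

def uniformity_report(probs):
    """Return (top probability, uniform baseline 1/N, normalized entropy in [0, 1])."""
    p = np.asarray(probs, dtype=np.float64).ravel()
    p = p / p.sum()                      # renormalize in case the values were rounded
    baseline = 1.0 / p.size
    nonzero = p[p > 0]                   # skip zeros so log() stays finite
    entropy = -(nonzero * np.log(nonzero)).sum()
    return float(p.max()), baseline, float(entropy / np.log(p.size))

# Two reference cases: a perfectly uniform output and a fully confident one.
uniform = np.full(101, 1.0 / 101)
one_hot = np.eye(101)[0]
print(uniformity_report(uniform))
print(uniformity_report(one_hot))
```

A normalized entropy close to 1 means the net is barely distinguishing between classes, which is what one would expect if the final layer carried untrained weights.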

Thanks for any help or hints, and best regards, Alex

【Comments】:

Tags: python deep-learning caffe

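【Solution 1】:

The solution is simple: I had just forgotten to rename the last layer in the deploy file. Caffe loads trained weights by layer name, so a layer whose name in the deploy file does not match the name used in the fine-tuning net gets no trained weights and keeps its initial values. The layer in question: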

layer {
  name: "loss3/classifier"
  type: "InnerProduct"
  bottom: "pool5/7x7_s1"
  top: "loss3/classifier"
  param {
    lr_mult: 1
    decay_mult: 1
  }
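A quick way to catch this kind of mismatch is to compare the layer names in the deploy file with the names stored in the trained model. The sketch below works on plain lists of names; the function name layers_left_uninitialized and the renamed layer "loss3/classifier_101" are hypothetical, chosen only to illustrate the check. In pycaffe, the deploy-side names with parameters could be taken from the keys of net.params.

```python
def layers_left_uninitialized(deploy_layer_names, trained_layer_names):
    """Return deploy layers that have no same-named layer in the trained model."""
    trained = set(trained_layer_names)
    return [name for name in deploy_layer_names if name not in trained]

# Hypothetical names: the fine-tuned model stored its classifier under a new
# name, while the deploy file still uses the original one.
trained_names = ["conv1/7x7_s2", "pool5/7x7_s1", "loss3/classifier_101"]
deploy_names = ["conv1/7x7_s2", "pool5/7x7_s1", "loss3/classifier"]

missing = layers_left_uninitialized(deploy_names, trained_names)
print(missing)
```

Any parameterized layer that shows up in this list would run with its initial weights rather than the fine-tuned ones.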

【Comments】:

【Solution 2】:

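Please check the image transformations you are using: are they the same at training time and at test time?

AFAIK bvlc_googlenet subtracts the image mean as a single value per channel, while your Python classifier uses a different mean: mean=np.load('ilsvrc_2012_mean.npy').mean(1).mean(1). This mismatch could prevent the net from classifying your inputs correctly.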
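To check whether the two means actually differ, both can be reduced to one value per channel and compared. The sketch below uses a synthetic array as a stand-in for np.load('ilsvrc_2012_mean.npy'), and it assumes training-time means of 104, 117 and 123, which to my knowledge are the mean_value entries in the bvlc_googlenet train_val.prototxt. Both are assumptions, so substitute the real mean file and the values from your own training prototxt.

```python
import numpy as np

def per_channel_mean(mean_image):
    """Collapse a (channels, height, width) mean image to one value per channel."""
    return mean_image.mean(1).mean(1)

def max_mean_difference(mean_a, mean_b):
    """Largest absolute per-channel difference between two mean vectors."""
    a = np.asarray(mean_a, dtype=np.float64)
    b = np.asarray(mean_b, dtype=np.float64)
    return float(np.abs(a - b).max())

# Synthetic stand-in for the mean file (assumption, not the real data).
fake_mean_image = np.stack(
    [np.full((256, 256), v, dtype=np.float32) for v in (104.0, 117.0, 123.0)]
)
test_time_mean = per_channel_mean(fake_mean_image)

# Assumed training-time per-channel means; verify against your prototxt.
train_time_mean = np.array([104.0, 117.0, 123.0])

print(test_time_mean.shape)
print(max_mean_difference(test_time_mean, train_time_mean))
```

If the difference is only a fraction of an intensity level, the mean is unlikely to explain a near-uniform output. A difference of tens of levels, or a mismatch in channel order or scale, is worth fixing.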

【Comments】: