Posted: 2017-05-17 10:48:45
Question:
Does anyone know what computations take place inside Caffe's Softmax layer?
I am using a pretrained network that ends with a Softmax layer.
In the test phase, for a simple forward pass of an image, the output of the second-to-last layer ("InnerProduct") is: -0.20095,0.0.2009,0.22510,-0.36796,-0.1991,0.43291,-0.2714,0.22229,-0.08174,0.01931,-0.05791,0.2169931,-0.05791,0.21699,000437,0.02350,0.02924,0.28733,0.19157,-0.04191,-0.07360,0.04191,-0.07360,0.30252
The output of the last layer ("Softmax") is: 0.00000,0.44520,0.01115,000,000,0.89348,000,000,000,000002,0.00015,000003,000940,0.00011,000006,0.00018,00010,10.00006,0.00018,00010,000550,0.00004,000,000,0.05710
If I apply a softmax to the output of the InnerProduct layer myself (using an external tool such as MATLAB), I get: 0.0398,0.0610,0.0337,0.0391,0.0751,0.0388,0.0390,0.0449,0.0496,0.046,0.05,0.0489,0.0476,0.0501,0.0365,0.0590,0.0467,0.0452,0.0659
The latter makes sense to me, because the probabilities sum to 1.0 (note that the values from Caffe's Softmax layer sum to more than 1.0).
Apparently, the Softmax layer in Caffe is not a straightforward softmax operation.
(I don't think it makes a difference, but I'll just mention that I am using the pretrained Flickr-style network; see the description here.)
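For reference, the standard numerically stable softmax (what an external tool like MATLAB would compute, and what produces probabilities summing to 1.0) can be sketched in plain Python. The input scores below are made-up illustration values, not the actual network output:

```python
import math

def softmax(logits):
    """Numerically stable softmax: shift by the max, exponentiate, normalize."""
    m = max(logits)                              # subtract the max for stability
    exps = [math.exp(x - m) for x in logits]     # exponentiate
    total = sum(exps)                            # sum after exp
    return [e / total for e in exps]             # divide

scores = [-0.20095, 0.22510, -0.36796, 0.43291]  # hypothetical example scores
probs = softmax(scores)
```

Whatever the input, the outputs of this operation always lie in (0, 1) and sum to exactly 1.0 (up to floating-point rounding), which is why the values the question reports from Caffe's Softmax layer look suspicious.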
Edit:
Here is the definition of the last two layers in the prototxt. Note that the type of the last layer is "Softmax".
layer {
  name: "fc8_flickr"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_flickr"
  param {
    lr_mult: 10
    decay_mult: 1
  }
  param {
    lr_mult: 20
    decay_mult: 0
  }
  inner_product_param {
    num_output: 20
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc8_flickr"
  top: "prob"
}
Comments:
-
"softmax" does not compute probabilities. It basically can't, because as you discovered it doesn't even force the outputs to sum to 1.0. ("none of the above" cannot have a negative probability.)
-
Does the softmax unit test pass on your build of Caffe?
-
@Shai Yes, all tests complete successfully. To clarify, I don't think there is a bug in Caffe or anything like that. I just want to know what operations take place in the Softmax layer.
-
@Shai I looked at the code, but I can't say I understand what is going on. Most of the comments describe a softmax operation ("...subtract the max", "exponentiate", "sum after exp", "divide"), but some scaling also seems to take place.
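The steps named in those code comments can be sketched as a channel-wise softmax over a 4D blob. This is an illustrative NumPy reproduction assuming the softmax runs along axis 1 (Caffe's default softmax axis is the channel axis), not Caffe's actual C++ implementation:

```python
import numpy as np

def caffe_like_softmax(blob, axis=1):
    """Softmax along one axis of an N-D array, following the steps
    described in Caffe's SoftmaxLayer code comments."""
    shifted = blob - blob.max(axis=axis, keepdims=True)  # "subtract the max"
    exps = np.exp(shifted)                               # "exponentiate"
    total = exps.sum(axis=axis, keepdims=True)           # "sum after exp"
    return exps / total                                  # "divide"

# Toy blob in N x C x H x W layout (values are arbitrary)
blob = np.arange(24, dtype=np.float64).reshape(1, 4, 2, 3)
probs = caffe_like_softmax(blob)
```

Applied this way, the probabilities at every spatial location sum to 1.0 across the channel axis; if a layer's outputs do not, whatever it computed was not this operation.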