SphereFace paper - 爱码网

We set the output dimension of the FC1 layer to 2. For example, the output after FC1 is:

feat=(256, 2)
Here 256 is the batch_size: there are 256 images, and two features are extracted from each image.

[Figure: SphereFace paper]
The horizontal and vertical axes are the two features. The points fall into two classes, 256 points in total.

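Mapping from Cartesian to polar coordinates: the vectors $W_{1}$ and $W_{2}$ do not have equal length. Features learned by the original softmax loss cannot be classified simply via angles, which is the motivation for adding an angular margin.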
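The Cartesian-to-polar mapping mentioned above can be sketched in NumPy. This is only a minimal illustration on random stand-in data, not the features from the paper; the names `feat`, `radius`, `angle`, and `recon` are assumptions made for this example.

```python
import numpy as np

# Stand-in for the FC1 output: 256 samples, 2 features each (random, not real data).
rng = np.random.default_rng(0)
feat = rng.normal(size=(256, 2))

# Cartesian (x, y) -> polar (radius, angle).
radius = np.hypot(feat[:, 0], feat[:, 1])    # feature norm ||x||
angle = np.arctan2(feat[:, 1], feat[:, 0])   # direction in radians, in [-pi, pi]

# Round trip back to Cartesian coordinates to check the mapping.
recon = np.stack([radius * np.cos(angle), radius * np.sin(angle)], axis=1)
```

In polar form, `angle` is the quantity an angle-based classifier would use, while `radius` is the part of the feature it would ignore.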
The decision boundary of the softmax loss is:

$(W_{1} - W_{2})x + b_{1} - b_{2} = 0$

If we define $x$ as a feature vector and constrain $\|W_{1}\| = \|W_{2}\| = 1$ and $b_{1} = b_{2} = 0$, the boundary becomes:

$\|x\|(\cos(\theta_{1}) - \cos(\theta_{2})) = 0$,
where $\theta_{i}$ is the angle between $W_{i}$ and $x$.
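A small numerical check can make this concrete: with unit-norm weights and zero biases, each logit equals $\|x\|\cos(\theta_{i})$, so picking the largest logit is the same as picking the smallest angle. The sketch below uses made-up values for `W` and `x`, and `angles_to_weights` is a hypothetical helper written for this example.

```python
import numpy as np

def angles_to_weights(x, W):
    """Angle (in radians) between x and each row of W."""
    cos = (W @ x) / (np.linalg.norm(W, axis=1) * np.linalg.norm(x))
    return np.arccos(np.clip(cos, -1.0, 1.0))

# Hypothetical 2-class weights, normalized to unit length; biases are zero.
W = np.array([[3.0, 1.0], [-1.0, 2.0]])
W = W / np.linalg.norm(W, axis=1, keepdims=True)

x = np.array([0.5, 2.0])               # an arbitrary feature vector
logits = W @ x                          # equals ||x|| * cos(theta_i)
theta = angles_to_weights(x, W)

pred_by_logit = int(np.argmax(logits))
pred_by_angle = int(np.argmin(theta))   # the smaller angle wins
```

Because cosine is decreasing on $[0, \pi]$, the two predictions always agree under these constraints, whatever the norm of `x`.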

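At this point the boundary depends only on the angle $\theta$. The modified softmax loss optimizes the angles directly, so that CNNs learn features with better angular separability. Now let us look at the plot obtained after adding the constraints on $W$, $b$, and $x$:
[Figure: SphereFace paper]
Features obtained with the modified softmax loss. Compared to the original softmax loss, the features learned by the modified softmax loss are angularly distributed.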
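The authors felt that the two classes were still not separated far enough, so they introduce an integer $m$ ($m \geq 1$) as a penalty factor that controls the angular distance between the classes. The decision boundaries become: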

$\|x\|(\cos(m\theta_{1}) - \cos(\theta_{2})) = 0$ for class 1, and $\|x\|(\cos(\theta_{1}) - \cos(m\theta_{2})) = 0$ for class 2.

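The larger $m$ is, the larger $m\theta_{1}$ becomes, which yields a larger angular margin and pushes the two classes further apart, as shown in the figure below:
[Figure: SphereFace paper]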
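The effect of $m$ on the class-1 side of the boundary can be sketched numerically. This is only an illustration of the condition $\cos(m\theta_{1}) > \cos(\theta_{2})$, not the full loss from the paper; the function name `satisfies_class1_margin` and the example angles are assumptions, and the check is restricted to $\theta_{1} \in [0, \pi/m]$, where $\cos(m\theta_{1})$ is monotonic.

```python
import numpy as np

def satisfies_class1_margin(theta1, theta2, m=1):
    """True if cos(m*theta1) > cos(theta2), i.e. x lies on the class-1 side.

    Only meaningful for theta1 in [0, pi/m], where cos(m*theta1) is monotonic.
    """
    if not 0.0 <= theta1 <= np.pi / m:
        raise ValueError("theta1 must lie in [0, pi/m]")
    return bool(np.cos(m * theta1) > np.cos(theta2))

theta2 = np.deg2rad(60.0)   # angle between x and W2
loose = np.deg2rad(35.0)    # closer to W1 than to W2, but not by much
tight = np.deg2rad(20.0)    # much closer to W1

m1_loose = satisfies_class1_margin(loose, theta2, m=1)   # plain angular comparison
m2_loose = satisfies_class1_margin(loose, theta2, m=2)   # margin applied
m2_tight = satisfies_class1_margin(tight, theta2, m=2)
```

With $m = 2$ the class-1 side requires $\theta_{1} < \theta_{2}/2$ (30° in this example), so the 35° sample passes for $m = 1$ but no longer qualifies for $m = 2$, while the 20° sample still does.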