【Question title】: facenet triplet loss with keras
【Posted】: 2017-04-25 20:26:12
【Question】:

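I am trying to implement FaceNet in Keras with the TensorFlow backend, but I am having problems with the triplet loss.

I call the fit function with 3*n images and define my custom loss function as follows: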

def triplet_loss(self, y_true, y_pred):

    embeddings = K.reshape(y_pred, (-1, 3, output_dim))

    positive_distance = K.mean(K.square(embeddings[:,0] - embeddings[:,1]),axis=-1)
    negative_distance = K.mean(K.square(embeddings[:,0] - embeddings[:,2]),axis=-1)
    return K.mean(K.maximum(0.0, positive_distance - negative_distance + _alpha))

self._model.compile(loss=triplet_loss, optimizer="sgd")
self._model.fit(x=x,y=y,nb_epoch=1, batch_size=len(x))

where y is just a dummy array filled with zeros.
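To make the row layout assumed by the reshape explicit: `K.reshape(y_pred, (-1, 3, output_dim))` groups every three consecutive rows of the batch, so the loss only makes sense if the inputs are ordered anchor, positive, negative, anchor, positive, negative, and so on, and if the batch size is a multiple of 3. The sketch below mimics that grouping in plain Python (no Keras); `group_triplets` is a hypothetical helper used only for illustration.

```python
def group_triplets(rows):
    """Group consecutive rows into (anchor, positive, negative) triplets.

    Mirrors how a row-major reshape to (-1, 3, dim) walks the batch.
    Hypothetical helper for illustration, not part of Keras.
    """
    if len(rows) % 3 != 0:
        raise ValueError("batch size must be a multiple of 3, got %d" % len(rows))
    return [tuple(rows[i:i + 3]) for i in range(0, len(rows), 3)]


# Six embeddings, named by role and triplet index.
batch = ["a0", "p0", "n0", "a1", "p1", "n1"]
triplets = group_triplets(batch)
```

A batch of 20 rows, for example, cannot be split into whole triplets. Keras `fit` also shuffles samples by default (`shuffle=True`), which would scramble this ordering; passing `shuffle=False` is one way to rule that out.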

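The problem is that even after the first iteration with a batch size of 20, the model starts predicting the same embedding for all images. When I first run predict on the batch, every embedding is different. Then I run fit and predict again, and suddenly the embeddings of all images in the batch become almost identical.

Also note that there is a Lambda layer at the end of the model. It normalizes the output of the network so that all embeddings have unit length, as suggested in the FaceNet paper.

Can anyone help me?

Model summary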

    Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_1 (InputLayer)             (None, 224, 224, 3)   0                                            
____________________________________________________________________________________________________
convolution2d_1 (Convolution2D)  (None, 112, 112, 64)  9472        input_1[0][0]                    
____________________________________________________________________________________________________
batchnormalization_1 (BatchNormal(None, 112, 112, 64)  128         convolution2d_1[0][0]            
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 56, 56, 64)    0           batchnormalization_1[0][0]       
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D)  (None, 56, 56, 64)    4160        maxpooling2d_1[0][0]             
____________________________________________________________________________________________________
batchnormalization_2 (BatchNormal(None, 56, 56, 64)    128         convolution2d_2[0][0]            
____________________________________________________________________________________________________
convolution2d_3 (Convolution2D)  (None, 56, 56, 192)   110784      batchnormalization_2[0][0]       
____________________________________________________________________________________________________
batchnormalization_3 (BatchNormal(None, 56, 56, 192)   384         convolution2d_3[0][0]            
____________________________________________________________________________________________________
maxpooling2d_2 (MaxPooling2D)    (None, 28, 28, 192)   0           batchnormalization_3[0][0]       
____________________________________________________________________________________________________
convolution2d_5 (Convolution2D)  (None, 28, 28, 96)    18528       maxpooling2d_2[0][0]             
____________________________________________________________________________________________________
convolution2d_7 (Convolution2D)  (None, 28, 28, 16)    3088        maxpooling2d_2[0][0]             
____________________________________________________________________________________________________
maxpooling2d_3 (MaxPooling2D)    (None, 28, 28, 192)   0           maxpooling2d_2[0][0]             
____________________________________________________________________________________________________
convolution2d_4 (Convolution2D)  (None, 28, 28, 64)    12352       maxpooling2d_2[0][0]             
____________________________________________________________________________________________________
convolution2d_6 (Convolution2D)  (None, 28, 28, 128)   110720      convolution2d_5[0][0]            
____________________________________________________________________________________________________
convolution2d_8 (Convolution2D)  (None, 28, 28, 32)    12832       convolution2d_7[0][0]            
____________________________________________________________________________________________________
convolution2d_9 (Convolution2D)  (None, 28, 28, 32)    6176        maxpooling2d_3[0][0]             
____________________________________________________________________________________________________
merge_1 (Merge)                  (None, 28, 28, 256)   0           convolution2d_4[0][0]            
                                                                   convolution2d_6[0][0]            
                                                                   convolution2d_8[0][0]            
                                                                   convolution2d_9[0][0]            
____________________________________________________________________________________________________
convolution2d_11 (Convolution2D) (None, 28, 28, 96)    24672       merge_1[0][0]                    
____________________________________________________________________________________________________
convolution2d_13 (Convolution2D) (None, 28, 28, 32)    8224        merge_1[0][0]                    
____________________________________________________________________________________________________
maxpooling2d_4 (MaxPooling2D)    (None, 28, 28, 256)   0           merge_1[0][0]                    
____________________________________________________________________________________________________
convolution2d_10 (Convolution2D) (None, 28, 28, 64)    16448       merge_1[0][0]                    
____________________________________________________________________________________________________
convolution2d_12 (Convolution2D) (None, 28, 28, 128)   110720      convolution2d_11[0][0]           
____________________________________________________________________________________________________
convolution2d_14 (Convolution2D) (None, 28, 28, 64)    51264       convolution2d_13[0][0]           
____________________________________________________________________________________________________
convolution2d_15 (Convolution2D) (None, 28, 28, 64)    16448       maxpooling2d_4[0][0]             
____________________________________________________________________________________________________
merge_2 (Merge)                  (None, 28, 28, 320)   0           convolution2d_10[0][0]           
                                                                   convolution2d_12[0][0]           
                                                                   convolution2d_14[0][0]           
                                                                   convolution2d_15[0][0]           
____________________________________________________________________________________________________
convolution2d_16 (Convolution2D) (None, 28, 28, 128)   41088       merge_2[0][0]                    
____________________________________________________________________________________________________
convolution2d_18 (Convolution2D) (None, 28, 28, 32)    10272       merge_2[0][0]                    
____________________________________________________________________________________________________
convolution2d_17 (Convolution2D) (None, 14, 14, 256)   295168      convolution2d_16[0][0]           
____________________________________________________________________________________________________
convolution2d_19 (Convolution2D) (None, 14, 14, 64)    51264       convolution2d_18[0][0]           
____________________________________________________________________________________________________
maxpooling2d_5 (MaxPooling2D)    (None, 14, 14, 320)   0           merge_2[0][0]                    
____________________________________________________________________________________________________
merge_3 (Merge)                  (None, 14, 14, 640)   0           convolution2d_17[0][0]           
                                                                   convolution2d_19[0][0]           
                                                                   maxpooling2d_5[0][0]             
____________________________________________________________________________________________________
convolution2d_21 (Convolution2D) (None, 14, 14, 96)    61536       merge_3[0][0]                    
____________________________________________________________________________________________________
convolution2d_23 (Convolution2D) (None, 14, 14, 32)    20512       merge_3[0][0]                    
____________________________________________________________________________________________________
maxpooling2d_6 (MaxPooling2D)    (None, 14, 14, 640)   0           merge_3[0][0]                    
____________________________________________________________________________________________________
convolution2d_20 (Convolution2D) (None, 14, 14, 256)   164096      merge_3[0][0]                    
____________________________________________________________________________________________________
convolution2d_22 (Convolution2D) (None, 14, 14, 192)   166080      convolution2d_21[0][0]           
____________________________________________________________________________________________________
convolution2d_24 (Convolution2D) (None, 14, 14, 64)    51264       convolution2d_23[0][0]           
____________________________________________________________________________________________________
convolution2d_25 (Convolution2D) (None, 14, 14, 128)   82048       maxpooling2d_6[0][0]             
____________________________________________________________________________________________________
merge_4 (Merge)                  (None, 14, 14, 640)   0           convolution2d_20[0][0]           
                                                                   convolution2d_22[0][0]           
                                                                   convolution2d_24[0][0]           
                                                                   convolution2d_25[0][0]           
____________________________________________________________________________________________________
convolution2d_27 (Convolution2D) (None, 14, 14, 112)   71792       merge_4[0][0]                    
____________________________________________________________________________________________________
convolution2d_29 (Convolution2D) (None, 14, 14, 32)    20512       merge_4[0][0]                    
____________________________________________________________________________________________________
maxpooling2d_7 (MaxPooling2D)    (None, 14, 14, 640)   0           merge_4[0][0]                    
____________________________________________________________________________________________________
convolution2d_26 (Convolution2D) (None, 14, 14, 224)   143584      merge_4[0][0]                    
____________________________________________________________________________________________________
convolution2d_28 (Convolution2D) (None, 14, 14, 224)   226016      convolution2d_27[0][0]           
____________________________________________________________________________________________________
convolution2d_30 (Convolution2D) (None, 14, 14, 64)    51264       convolution2d_29[0][0]           
____________________________________________________________________________________________________
convolution2d_31 (Convolution2D) (None, 14, 14, 128)   82048       maxpooling2d_7[0][0]             
____________________________________________________________________________________________________
merge_5 (Merge)                  (None, 14, 14, 640)   0           convolution2d_26[0][0]           
                                                                   convolution2d_28[0][0]           
                                                                   convolution2d_30[0][0]           
                                                                   convolution2d_31[0][0]           
____________________________________________________________________________________________________
convolution2d_33 (Convolution2D) (None, 14, 14, 128)   82048       merge_5[0][0]                    
____________________________________________________________________________________________________
convolution2d_35 (Convolution2D) (None, 14, 14, 32)    20512       merge_5[0][0]                    
____________________________________________________________________________________________________
maxpooling2d_8 (MaxPooling2D)    (None, 14, 14, 640)   0           merge_5[0][0]                    
____________________________________________________________________________________________________
convolution2d_32 (Convolution2D) (None, 14, 14, 192)   123072      merge_5[0][0]                    
____________________________________________________________________________________________________
convolution2d_34 (Convolution2D) (None, 14, 14, 256)   295168      convolution2d_33[0][0]           
____________________________________________________________________________________________________
convolution2d_36 (Convolution2D) (None, 14, 14, 64)    51264       convolution2d_35[0][0]           
____________________________________________________________________________________________________
convolution2d_37 (Convolution2D) (None, 14, 14, 128)   82048       maxpooling2d_8[0][0]             
____________________________________________________________________________________________________
merge_6 (Merge)                  (None, 14, 14, 640)   0           convolution2d_32[0][0]           
                                                                   convolution2d_34[0][0]           
                                                                   convolution2d_36[0][0]           
                                                                   convolution2d_37[0][0]           
____________________________________________________________________________________________________
convolution2d_39 (Convolution2D) (None, 14, 14, 144)   92304       merge_6[0][0]                    
____________________________________________________________________________________________________
convolution2d_41 (Convolution2D) (None, 14, 14, 32)    20512       merge_6[0][0]                    
____________________________________________________________________________________________________
maxpooling2d_9 (MaxPooling2D)    (None, 14, 14, 640)   0           merge_6[0][0]                    
____________________________________________________________________________________________________
convolution2d_38 (Convolution2D) (None, 14, 14, 160)   102560      merge_6[0][0]                    
____________________________________________________________________________________________________
convolution2d_40 (Convolution2D) (None, 14, 14, 288)   373536      convolution2d_39[0][0]           
____________________________________________________________________________________________________
convolution2d_42 (Convolution2D) (None, 14, 14, 64)    51264       convolution2d_41[0][0]           
____________________________________________________________________________________________________
convolution2d_43 (Convolution2D) (None, 14, 14, 128)   82048       maxpooling2d_9[0][0]             
____________________________________________________________________________________________________
merge_7 (Merge)                  (None, 14, 14, 640)   0           convolution2d_38[0][0]           
                                                                   convolution2d_40[0][0]           
                                                                   convolution2d_42[0][0]           
                                                                   convolution2d_43[0][0]           
____________________________________________________________________________________________________
convolution2d_44 (Convolution2D) (None, 14, 14, 160)   102560      merge_7[0][0]                    
____________________________________________________________________________________________________
convolution2d_46 (Convolution2D) (None, 14, 14, 64)    41024       merge_7[0][0]                    
____________________________________________________________________________________________________
convolution2d_45 (Convolution2D) (None, 7, 7, 256)     368896      convolution2d_44[0][0]           
____________________________________________________________________________________________________
convolution2d_47 (Convolution2D) (None, 7, 7, 128)     204928      convolution2d_46[0][0]           
____________________________________________________________________________________________________
maxpooling2d_10 (MaxPooling2D)   (None, 7, 7, 640)     0           merge_7[0][0]                    
____________________________________________________________________________________________________
merge_8 (Merge)                  (None, 7, 7, 1024)    0           convolution2d_45[0][0]           
                                                                   convolution2d_47[0][0]           
                                                                   maxpooling2d_10[0][0]            
____________________________________________________________________________________________________
convolution2d_49 (Convolution2D) (None, 7, 7, 192)     196800      merge_8[0][0]                    
____________________________________________________________________________________________________
convolution2d_51 (Convolution2D) (None, 7, 7, 48)      49200       merge_8[0][0]                    
____________________________________________________________________________________________________
maxpooling2d_11 (MaxPooling2D)   (None, 7, 7, 1024)    0           merge_8[0][0]                    
____________________________________________________________________________________________________
convolution2d_48 (Convolution2D) (None, 7, 7, 384)     393600      merge_8[0][0]                    
____________________________________________________________________________________________________
convolution2d_50 (Convolution2D) (None, 7, 7, 384)     663936      convolution2d_49[0][0]           
____________________________________________________________________________________________________
convolution2d_52 (Convolution2D) (None, 7, 7, 128)     153728      convolution2d_51[0][0]           
____________________________________________________________________________________________________
convolution2d_53 (Convolution2D) (None, 7, 7, 128)     131200      maxpooling2d_11[0][0]            
____________________________________________________________________________________________________
merge_9 (Merge)                  (None, 7, 7, 1024)    0           convolution2d_48[0][0]           
                                                                   convolution2d_50[0][0]           
                                                                   convolution2d_52[0][0]           
                                                                   convolution2d_53[0][0]           
____________________________________________________________________________________________________
convolution2d_55 (Convolution2D) (None, 7, 7, 192)     196800      merge_9[0][0]                    
____________________________________________________________________________________________________
convolution2d_57 (Convolution2D) (None, 7, 7, 48)      49200       merge_9[0][0]                    
____________________________________________________________________________________________________
maxpooling2d_12 (MaxPooling2D)   (None, 7, 7, 1024)    0           merge_9[0][0]                    
____________________________________________________________________________________________________
convolution2d_54 (Convolution2D) (None, 7, 7, 384)     393600      merge_9[0][0]                    
____________________________________________________________________________________________________
convolution2d_56 (Convolution2D) (None, 7, 7, 384)     663936      convolution2d_55[0][0]           
____________________________________________________________________________________________________
convolution2d_58 (Convolution2D) (None, 7, 7, 128)     153728      convolution2d_57[0][0]           
____________________________________________________________________________________________________
convolution2d_59 (Convolution2D) (None, 7, 7, 128)     131200      maxpooling2d_12[0][0]            
____________________________________________________________________________________________________
merge_10 (Merge)                 (None, 7, 7, 1024)    0           convolution2d_54[0][0]           
                                                                   convolution2d_56[0][0]           
                                                                   convolution2d_58[0][0]           
                                                                   convolution2d_59[0][0]           
____________________________________________________________________________________________________
averagepooling2d_1 (AveragePoolin(None, 1, 1, 1024)    0           merge_10[0][0]                   
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 1024)          0           averagepooling2d_1[0][0]         
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 128)           131200      flatten_1[0][0]                  
____________________________________________________________________________________________________
lambda_1 (Lambda)                (None, 128)           0           dense_1[0][0]                    
====================================================================================================
Total params: 7456944
____________________________________________________________________________________________________
None

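【Comments】:

  • What is your learning rate? Maybe it is too large.
  • I thought of that, so I tried a very low learning rate like 1e-10, with which the model should change the weights only very slightly. It still learns to produce the same output for every image in the batch within a single iteration, which is very strange.
  • In your code, should "embeddings[0] - embeddings[1]" be "embeddings[:,0] - embeddings[:, 1]"?
  • I am confused by the model summary. Which layer (outputting a feature map of size 128) is the loss layer connected to? The last layer in the summary seems to generate a feature map of size 3.
  • @DalekSupreme Were you able to implement FaceNet in Keras successfully? I am working on a project and would love to know whether anyone has succeeded.

Tags: neural-network tensorflow keras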


【Solution 1】:

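Besides the learning rate being too high, what could be happening is that an unstable triplet selection strategy is effectively being used. For example, if you only use 'hard triplets' (triplets where the a-n distance is smaller than the a-p distance), your network weights might collapse all embeddings to a single point (making the loss always equal to the margin (your _alpha), because all embedding distances are zero).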
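The collapse described above can be checked numerically. The sketch below is plain Python rather than Keras, uses the same mean-squared distance as the question's loss, and assumes a margin of 0.2 as a stand-in for `_alpha`. When every embedding is the same point, both distances are zero and the loss equals the margin.

```python
def squared_distance(u, v):
    """Mean squared difference, matching K.mean(K.square(u - v), axis=-1)."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def triplet_loss(triplets, alpha):
    """Average hinge loss max(0, d(a, p) - d(a, n) + alpha) over triplets."""
    losses = []
    for anchor, positive, negative in triplets:
        d_ap = squared_distance(anchor, positive)
        d_an = squared_distance(anchor, negative)
        losses.append(max(0.0, d_ap - d_an + alpha))
    return sum(losses) / len(losses)


alpha = 0.2  # assumed margin, stands in for _alpha
point = [0.6, 0.8]  # one unit-length embedding shared by every image
collapsed = [(point, point, point), (point, point, point)]
collapsed_loss = triplet_loss(collapsed, alpha)

# A well-separated triplet: positive equals the anchor, negative is far away.
separated = [([1.0, 0.0], [1.0, 0.0], [-1.0, 0.0])]
separated_loss = triplet_loss(separated, alpha)
```

A training loss that sits at exactly the margin value is therefore a quick symptom to look for when collapse is suspected.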

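This can be fixed by using other kinds of triplets as well (e.g. 'semi-hard triplets', where the a-p distance is smaller than the a-n distance, but the difference between the two is still smaller than the margin). So maybe you should always check for this... It is explained in more detail in this blog post: https://omoindrot.github.io/triplet-loss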
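As a rough illustration of these categories, a triplet can be classified from its two distances and the margin. `triplet_kind` is a hypothetical helper name, and the "easy" category (negative farther away than the positive by at least the margin, so the loss is zero) is added here for completeness.

```python
def triplet_kind(d_ap, d_an, margin):
    """Classify a triplet from its anchor-positive and anchor-negative distances.

    hard:      the negative is closer to the anchor than the positive.
    semi-hard: the negative is farther than the positive, but by less than
               the margin, so the loss is still positive.
    easy:      the negative is farther by at least the margin, so the loss is 0.
    A tie (d_an == d_ap) is treated as semi-hard in this sketch.
    """
    if d_an < d_ap:
        return "hard"
    if d_an < d_ap + margin:
        return "semi-hard"
    return "easy"


kinds = [
    triplet_kind(0.5, 0.3, 0.2),
    triplet_kind(0.3, 0.4, 0.2),
    triplet_kind(0.1, 0.9, 0.2),
]
```

A mining step could then keep only the triplets whose kind is "semi-hard" before computing the loss.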

【Discussion】:

    【Solution 2】:

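    Are you constraining the embeddings to "live on a d-dimensional hypersphere"? Try running tf.nn.l2_normalize on the embeddings right after they come out of the CNN.

    The problem may be that the embeddings are taking a shortcut: one easy way to reduce the loss is to set everything to zero. l2_normalize forces them to be unit length.

    It looks like you need to add the normalization after the last average pooling.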
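For reference, L2 normalization can be sketched in plain Python as follows. This is an illustrative stand-in rather than the TensorFlow implementation, and the `epsilon` default is an assumption chosen for the sketch to avoid dividing by zero.

```python
import math


def l2_normalize(vector, epsilon=1e-12):
    """Scale a vector to unit Euclidean length.

    epsilon guards against division by zero for an all-zero vector; its
    default value is an assumption made for this sketch.
    """
    squared_norm = sum(x * x for x in vector)
    scale = 1.0 / math.sqrt(max(squared_norm, epsilon))
    return [x * scale for x in vector]


embedding = l2_normalize([3.0, 4.0])
length = math.sqrt(sum(x * x for x in embedding))
```

Unit length rules out the all-zero solution, but on its own it does not stop every embedding from landing on the same point of the sphere.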

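    【Discussion】:

    • Thanks for the idea. Unfortunately, the last Lambda layer, lambda_1, already does this. It normalizes the embeddings as suggested in the FaceNet paper. Why would you normalize after the average pooling rather than after the dense layer?
    • Oh, hmm. I am not sure about that. That is just where I did it (for a siamese network, which is a similar idea).
    【Solution 3】:

    I ran into the same problem and did some research. I think it is because the triplet loss needs multiple inputs, which may cause the network to produce this kind of output. I have not solved the problem yet, but you can check the Keras issue page for more details: https://github.com/keras-team/keras/issues/9498

    In the issue above, I implemented a fake dataset and a fake triplet loss to reproduce the problem. After I changed the input structure of the network, the loss became normal.

    【Discussion】:

      【Solution 4】: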

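      The loss function in TensorFlow expects a list of labels, i.e. a list of integers. I think you are passing a 2D matrix, i.e. one-hot encoded labels.

      Try this:

      import keras.backend as K
      import tensorflow as tf  # requires TensorFlow 1.x; tf.contrib was removed in 2.x

      def loss(y_true, y_pred):
          # Convert one-hot rows to integer class labels.
          y_true = K.argmax(y_true, axis=-1)
          return tf.contrib.losses.metric_learning.triplet_semihard_loss(
              labels=y_true, embeddings=y_pred, margin=1.)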
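To illustrate what the argmax step does to the labels, here is a plain Python stand-in (not the Keras call itself); `argmax_last_axis` is a hypothetical helper that maps each one-hot row to the index of its largest entry.

```python
def argmax_last_axis(one_hot_rows):
    """Return, for each row, the index of its largest entry (first one on ties)."""
    return [max(range(len(row)), key=row.__getitem__) for row in one_hot_rows]


# Three samples belonging to identities 2, 0 and 1, one-hot encoded.
one_hot = [
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
labels = argmax_last_axis(one_hot)
```

With the all-zero dummy `y` from the question, every row would map to label 0, so this approach presumably needs real identity labels to be passed to `fit`.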
      

      【Discussion】:

      • I am looking for a simple Keras metric learning example. Could you please share one? Thanks a lot.