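Custom loss function in Keras - problem with implementation using K.minimum: question and answers

【Question title】: Custom loss function in keras - problem with implementation using K.minimum
【Posted】: 2020-02-03 00:13:21
【Question】:

I am trying to implement a custom loss function in Keras for a "partial label learning" problem. In my training set, each training instance is assigned a set of two candidate labels, only one of which is correct. During training, I therefore want a loss function that computes the loss for each candidate label and keeps the smaller of the two. A simplified version of this function looks like this: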

def custom_loss(y_true, y_pred):
    num_labels = tf.reduce_sum(y_true)  # y_true is e.g. [0,1,0,0,1]
    if num_labels > 1:  # create 2 separate vectors
        y_true_1 = ?  # [0,1,0,0,0]
        y_true_2 = ?  # [0,0,0,0,1]
        loss_1 = K.categorical_crossentropy(y_true_1, y_pred)
        loss_2 = K.categorical_crossentropy(y_true_2, y_pred)
        loss = K.minimum(loss_1, loss_2)
    else:
        loss = K.categorical_crossentropy(y_true, y_pred)

    return loss

This is what I tried:

# Assumed imports; they are not shown in the original post.
import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.assign)
from tensorflow.keras import backend as K

y_true = tf.constant([1., 0., 0., 0., 1., 0., 0., 0., 0.])
y_pred = tf.constant([.9, .05, .05, .5, .89, .6, .05, .01, .94])

def custom_loss(y_true, y_pred):

    def train_loss():

        y_train_copy = tf.Variable(0, dtype=y_true.dtype)
        y_train_copy = tf.assign(y_train_copy, y_true, validate_shape=False)

        label_cls = tf.where(tf.equal(y_true,1))
        raplace = tf.Variable([0.]) #Variable
        y_true_1 = tf.compat.v1.scatter_nd_update(y_train_copy, [label_cls[0]], raplace)  # [0,1,0,0,0]
        y_true_2 = tf.compat.v1.scatter_nd_update(y_train_copy, [label_cls[1]], raplace)  # [0,0,0,0,1]
        loss_1 = K.categorical_crossentropy(y_true_1, y_pred)
        loss_2 = K.categorical_crossentropy(y_true_2, y_pred)
        min_loss = tf.minimum(loss_1, loss_2)

        return min_loss

    num_labels = tf.reduce_sum(y_true) # [0,1,0,0,1]
    loss = tf.cond(num_labels > 1,
                   lambda: train_loss(),
                   lambda: K.categorical_crossentropy(y_true, y_pred))

    return loss

loss = custom_loss(y_true, y_pred)

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    print(sess.run(loss))

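The problem is that, for some reason, however I try to take the minimum of the two losses, I get 0.0, even though loss_1 and loss_2 are definitely not 0.

Any idea why? Or a better way to implement this function?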

【Comments】:

Tags: tensorflow machine-learning keras deep-learning

【Solution 1】:

There is no need to create the y_train_copy variable. I simplified your code; the output is min(loss_1, loss_2).

# Assumed imports; they are not shown in the original answer.
import tensorflow as tf  # TensorFlow 1.x API (tf.Session)
from tensorflow.keras import backend as K

y_true = tf.constant([1., 0., 0., 0., 1., 0., 0., 0., 0.])
y_pred = tf.constant([.9, .05, .05, .5, .89, .6, .05, .01, .94])

def custom_loss(y_true, y_pred):

    def train_loss():
        # Indices of the candidate labels
        label_cls = tf.where(tf.equal(y_true, 1.))
        # Build an independent one-hot target for each candidate label
        y_true_1 = tf.squeeze(tf.one_hot(label_cls[0], tf.size(y_true)), axis=0)
        y_true_2 = tf.squeeze(tf.one_hot(label_cls[1], tf.size(y_true)), axis=0)
        loss_1 = K.categorical_crossentropy(y_true_1, y_pred)
        loss_2 = K.categorical_crossentropy(y_true_2, y_pred)
        min_loss = tf.minimum(loss_1, loss_2)
        return min_loss

    num_labels = tf.reduce_sum(y_true)
    # Take the minimum over candidates only when there is more than one label
    loss = tf.cond(num_labels > 1,
                   lambda: train_loss(),
                   lambda: K.categorical_crossentropy(y_true, y_pred))

    return loss

loss = custom_loss(y_true, y_pred)

with tf.Session() as sess:
    print(sess.run(loss))

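Update:

The bug in your code is the use of tf.scatter_nd_update(), which changes the value of y_train_copy in place. When you run min_loss, both y_true_1 and y_true_2 are executed, and both updates write to the same variable, so y_true_2 is always all zeros and min_loss is always zero. If you run loss_2 on its own, you will see that it is not zero, because y_true_1 is not executed.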
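
The effect can be reproduced without TensorFlow. The following is only an analogy in plain Python, not the TensorFlow mechanism itself: a list stands in for the variable y_train_copy, and the hypothetical helpers zero_out_in_place and one_hot stand in for an in-place update and for building a fresh one-hot target. The probabilities are made-up illustration values.

```python
import math


def cross_entropy(target, probs):
    """Categorical cross-entropy for one example: sum(-t * log(p))."""
    return sum(-t * math.log(p) for t, p in zip(target, probs))


def zero_out_in_place(buffer, index):
    """Hypothetical stand-in for an in-place update: mutates and returns the SAME list."""
    buffer[index] = 0.0
    return buffer


def one_hot(index, size):
    """Builds a fresh one-hot list without touching any other object."""
    return [1.0 if i == index else 0.0 for i in range(size)]


y_true = [1.0, 0.0, 0.0, 0.0, 1.0]  # two candidate labels: 0 and 4
probs = [0.3, 0.1, 0.1, 0.1, 0.4]   # made-up predicted distribution

# Buggy pattern: both targets are the same shared, mutated buffer,
# so after both updates have run every entry is zero.
shared = list(y_true)
target_a = zero_out_in_place(shared, 0)
target_b = zero_out_in_place(shared, 4)
buggy_loss = min(cross_entropy(target_a, probs), cross_entropy(target_b, probs))

# Fixed pattern: each target is built independently.
fixed_loss = min(cross_entropy(one_hot(0, 5), probs),
                 cross_entropy(one_hot(4, 5), probs))

print(target_a is target_b, buggy_loss, fixed_loss)
```

With the shared buffer both targets end up as the same all-zero list, so every cross-entropy term is multiplied by zero. Building each target separately keeps the two losses distinct.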

A better option is tf.scatter_nd. You can do it like this:

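# Assumed imports; they are not shown in the original answer.
import tensorflow as tf  # TensorFlow 1.x API (tf.Session)
from tensorflow.keras import backend as K

y_true = tf.constant([1., 0., 0., 0., 1., 0., 0., 0., 0.])
y_pred = tf.constant([.9, .05, .05, .5, .89, .6, .05, .01, .94])

# Indices of the two candidate labels, split into one index tensor each
label_cls = tf.where(tf.equal(y_true, 1.))
idx1, idx2 = tf.split(label_cls, 2)

# tf.scatter_nd creates a new tensor instead of mutating a variable
raplace = tf.constant([1.])
y_true_1 = tf.scatter_nd(tf.cast(idx1, dtype=tf.int32), raplace, [tf.size(y_true)])
y_true_2 = tf.scatter_nd(tf.cast(idx2, dtype=tf.int32), raplace, [tf.size(y_true)])

loss_1 = K.categorical_crossentropy(y_true_1, y_pred)
loss_2 = K.categorical_crossentropy(y_true_2, y_pred)
min_loss = tf.minimum(loss_1, loss_2)

with tf.Session() as sess:
    print(sess.run(min_loss))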
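
As a sanity check on the value the session should print, the same computation can be written in plain Python. This is a sketch that assumes K.categorical_crossentropy, when given probabilities rather than logits, first rescales y_pred so that it sums to 1 and then returns -sum(y_true * log(y_pred)). The small epsilon clipping applied by the backend is ignored here because no probability is close to 0 or 1. The helper names are illustrative only.

```python
import math

y_true = [1., 0., 0., 0., 1., 0., 0., 0., 0.]
y_pred = [.9, .05, .05, .5, .89, .6, .05, .01, .94]


def normalize(values):
    """Rescale the values so they sum to 1 (assumed backend behaviour)."""
    total = sum(values)
    return [v / total for v in values]


def one_hot(index, size):
    """Fresh one-hot target with a single 1.0 at the given index."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def cross_entropy(target, probs):
    """Categorical cross-entropy for one example: sum(-t * log(p))."""
    return sum(-t * math.log(p) for t, p in zip(target, probs))


probs = normalize(y_pred)
# Positions of the candidate labels in y_true
candidates = [i for i, t in enumerate(y_true) if t == 1.0]

loss_1 = cross_entropy(one_hot(candidates[0], len(y_true)), probs)
loss_2 = cross_entropy(one_hot(candidates[1], len(y_true)), probs)
min_loss = min(loss_1, loss_2)

print(loss_1, loss_2, min_loss)
```

Under these assumptions the smaller loss belongs to the first candidate (index 0), since its predicted value is slightly larger (0.9 versus 0.89).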

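【Discussion】:

Thank you very much! It works! :) I spent 2 days trying to figure this out! Also, the only reason I created y_train_copy is that I kept getting this error - AttributeError: 'Tensor' object has no attribute '_lazy_read', and I read somewhere that it might help (and for some reason it did).
OK, so it works fine outside the network, but when I try to use it as the loss function in my model, it gives me this error: tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: Outer dimensions of indices and update must match. Indices shape: [24,2], updates shape:[1] [[{{node loss/dense_2_loss/cond/ScatterNd}}]] [[loss/mul/_3467]] (1) Invalid argument: Outer dimensions of indices and update must match. Indices shape: [24,2], updates shape:[1] [[{{node loss/dense_2_loss/cond/ScatterNd}}]] Any idea why?
The shapes of indices and updates do not match in the tf.scatter_nd() call.
Yes, the problem during training is that I train in batches, so the y tensors are 2D rather than 1D. In that case idx1 and idx2 (the indices) are wrong, and the shape of replace (the updates) is wrong as well. I tried to fix it, but without success... :( Any chance you know how to adapt this code to 2D y tensors?
Please see my new post, which may explain it better: link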
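
The thread itself gives no solution for the batched case. One possible direction, sketched here in plain Python so the logic can be checked independently of TensorFlow, rests on the observation that -log is decreasing: the smallest per-candidate cross-entropy in a row equals -log of the largest probability among that row's candidate labels. The function names are hypothetical, every row is assumed to contain at least one candidate label, and each y_pred row is assumed to be rescaled to sum to 1 as in the single-example case.

```python
import math


def row_min_candidate_loss(y_true_row, y_pred_row):
    """Minimum cross-entropy over the candidate labels of one example."""
    total = sum(y_pred_row)
    # Keep only the probabilities at positions marked as candidate labels.
    candidate_probs = [p / total
                       for t, p in zip(y_true_row, y_pred_row) if t == 1]
    # min over candidates of -log(p) equals -log(max over candidates of p)
    return -math.log(max(candidate_probs))


def batch_min_candidate_loss(y_true, y_pred):
    """Applies the row-wise loss to every example in a 2D batch."""
    return [row_min_candidate_loss(t, p) for t, p in zip(y_true, y_pred)]


# Made-up batch: the first row has two candidates, the second a single label.
y_true = [[1, 0, 0, 1],
          [0, 1, 0, 0]]
y_pred = [[0.1, 0.2, 0.3, 0.4],
          [0.25, 0.5, 0.125, 0.125]]

losses = batch_min_candidate_loss(y_true, y_pred)
print(losses)
```

In a tensor framework the same idea might be expressed as a masked maximum along the class axis, which avoids index extraction and scatter. It would also make the tf.cond branch unnecessary, because a row with a single label is simply the one-candidate case. That translation is not shown or tested here.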