Dynamically combining loss functions in tensorflow keras

【Question title】: Dynamically combining loss functions in tensorflow keras
【Posted】: 2021-04-26 22:14:04
【Question】:

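I'm working on a multi-label classification problem in which each target index represents a future time period rather than a distinct class. Besides wanting my predicted labels to match the target labels, I want an extra term that enforces certain temporal aspects of the learning.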

For example:

y_true = [1., 1., 1., 0.]
y_pred = [0.75, 0.81, 0.93, 0.65]

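Above, the true labels mean that something happened during the first three indices.

I'd like to be able to mix and match loss functions easily.

I have a few custom loss functions for overall accuracy, each wrapped in an outer function so its parameters can be adjusted: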

def weighted_binary_crossentropy(pos_weight=1):
    def weighted_binary_crossentropy_(Y_true, Y_pred):
        ...
        return tf.reduce_mean(loss, axis=-1)
    return weighted_binary_crossentropy_

def mean_squared_error(threshold=0.5):
    def mean_squared_error_(Y_true, Y_pred):
        ...
        return tf.reduce_mean(loss, axis=-1)
    return mean_squared_error_

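I also have a custom loss function that forces the predicted labels to end at the same time as the true labels (I'm not using the threshold parameter here yet):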

def end_time_error(threshold=0.5):
    def end_time_error_(Y_true, Y_pred):
        _, n_times = K.int_shape(Y_true)
        weights = K.arange(1, n_times + 1, dtype=float)
        Y_true = tf.multiply(Y_true, weights)
        argmax_true = K.argmax(Y_true, axis=1)
        argmax_pred = K.argmax(Y_pred, axis=1)
        loss = tf.math.squared_difference(argmax_true, argmax_pred)
        return tf.reduce_mean(loss, axis=-1)
    return end_time_error_

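Sometimes I may want to combine end_time_error with weighted_binary_crossentropy, sometimes with mean_squared_error, and I have plenty of other loss functions to experiment with. I don't want to write a new combined loss function for every pair.

Attempted solution 1

I tried writing a meta loss function that combines loss functions (all defined globally in the same script).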

def combination_loss(loss_dict, combine='add', weights=[]):
    losses = []
    if not weights:
        weights = [1] * len(loss_dict)
    for (loss_func, loss_args), weight in zip(loss_dict.items(), weights):
        assert loss_func in globals().keys()
        loss_func = eval(loss_func)
        loss = loss_func(loss_args)
        losses.append(loss * weight)
    if combine == 'add':
        loss = sum(losses)
    elif combine == 'multiply':
        loss = np.prod(losses)
    return loss

To use it:

loss_args = {'loss_dict':
                 {'weighted_binary_crossentropy': {'pos_weight': 1},
                  'end_time_error': {}},
             'combine': 'add',
             'weights': [0.75, 0.25]}
model.compile(loss=combination_loss(**loss_args), ...)

Error:

  File "C:\...\losses.py", line 165, in combination_loss
    losses.append(loss * weight)

TypeError: unsupported operand type(s) for *: 'function' and 'float'

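I'm playing fast and loose with functions here, so I'm not surprised this failed. But I'm not sure how to get what I want.

How can I combine the functions with the weights inside combination_loss?

Or should I just use a lambda function directly in the model.compile() call?

--EDIT

Attempted solution 2

Ditching combination_loss: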

losses = []
for loss_, loss_args_ in loss_args['loss_dict'].items():
    losses.append(get_loss(loss_)(**loss_args_))
loss = lambda y_true, y_pred: [l(y_true, y_pred) * w for l, w
                               in zip(losses, loss_args['weights'])]
model.compile(loss=loss, ...)

Error:

  File "C:\...\losses.py", line 139, in end_time_error_
    weights = K.arange(1, n_times + 1, dtype=float)

TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

Probably because y_true, y_pred can't be used as arguments to the wrapped loss functions.
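Another plausible reading of this traceback, offered as an assumption rather than a confirmed diagnosis: K.int_shape reports the static shape, and Keras can trace a loss with unknown dimensions, in which case n_times is None and `n_times + 1` fails. The sketch below reads the size at run time with tf.shape instead. The helper name last_active_index is hypothetical, and the tf.function signature is there only to reproduce an unknown static shape.

```python
import tensorflow as tf

# The input signature makes the static shape (None, None), as it can be
# when Keras traces a loss function.
@tf.function(input_signature=[tf.TensorSpec(shape=[None, None], dtype=tf.float32)])
def last_active_index(y_true):
    # tf.shape returns a tensor evaluated at run time, so it is never None
    n_times = tf.shape(y_true)[1]
    weights = tf.cast(tf.range(1, n_times + 1), y_true.dtype)
    # Later time steps get larger weights, so argmax picks the last 1 in each row
    return tf.argmax(y_true * weights, axis=1)

result = last_active_index(tf.constant([[1., 1., 1., 0.]]))
print(result.numpy())
```

Note that argmax has no useful gradient, so a loss term built only on argmax values may not contribute to training even once the shape issue is resolved.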

【Question comments】:

Tags: python tensorflow keras loss-function multilabel-classification

【Solution 1】:

Let's simplify your use case and consider only two losses:

loss = alpha * loss1 + (1-alpha) * loss2

Then you can do this:

def generate_loss(alpha):
    def combination_loss(y_true, y_pred):
        return alpha * loss1(y_true, y_pred) + (1-alpha) * loss2(y_true, y_pred)
    return combination_loss

Obviously, loss1 and loss2 would be your respective loss functions. You can use this to generate different loss functions for different values of alpha:

alpha = 0.7
combination_loss = generate_loss(alpha)
model.compile(loss=combination_loss, ...)

If alpha is meant to be static, you can also drop the outer function generate_loss.

Finally, you can also define it as a lambda function:

model.compile(loss=lambda y_true, y_pred: alpha * loss1(y_true, y_pred) + (1-alpha) * loss2(y_true, y_pred), ...)

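I'm not sure where your error is (I assume it's the eval, but I can't debug it), but if you simplify it like this, or use this as a working example and introduce your losses and weights into it, it should work.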
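One reading of the first traceback is that loss_func(loss_args) returns the inner loss function rather than a loss value, so `loss * weight` multiplies a function by a float. A minimal sketch of the two-loss template generalized to any number of losses follows. The name combine_losses and the stand-in losses abs_error and sq_error are hypothetical, and plain floats are used so the example runs without TensorFlow. The closure relies only on `*` and `+`, so it should carry over to tensors.

```python
def combine_losses(loss_fns, weights=None):
    """Return one loss(y_true, y_pred) that is the weighted sum of loss_fns."""
    loss_fns = list(loss_fns)
    if weights is None:
        weights = [1.0] * len(loss_fns)
    if len(weights) != len(loss_fns):
        raise ValueError("need exactly one weight per loss function")

    def combined(y_true, y_pred):
        # Call each loss first, then weight and sum the resulting values
        total = 0.0
        for fn, w in zip(loss_fns, weights):
            total = total + w * fn(y_true, y_pred)
        return total

    return combined


# Hypothetical stand-in losses on plain floats, for illustration only
def abs_error(y_true, y_pred):
    return abs(y_true - y_pred)

def sq_error(y_true, y_pred):
    return (y_true - y_pred) ** 2

loss = combine_losses([abs_error, sq_error], weights=[0.75, 0.25])
value = loss(1.0, 0.5)  # 0.75 * 0.5 + 0.25 * 0.25
print(value)
```

Because combined returns a single value rather than a list, it can be passed as the one loss of a single-output model, e.g. model.compile(loss=loss, ...).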

【Discussion】:

This doesn't work because I have a single output: ValueError: When passing a list as loss, it should have one entry per model outputs. The model has 1 outputs, but you passed loss=[<function weighted_binary_crossentropy.<locals>.weighted_binary_crossentropy_ at 0x000002218449CD38>, <function end_time_error.<locals>.end_time_error_ at 0x000002218449CDC8>]
I'll try something like what's in your edit, but I can't see where you pass the extra loss-function arguments, e.g. pos_weight for my weighted_binary_crossentropy.
You have to adapt my template to your use case (it's an MWE). If you want to pass something to one of the "sub-losses", you can add it to the outer function, e.g. generate_loss(alpha, pos_weight).
Can you run the example above?
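The suggestion in this thread to forward sub-loss arguments through the outer function could be sketched as follows, keeping the question's config-dict style but replacing eval() and globals() with an explicit registry. Everything named here (LOSS_FACTORIES, build_loss, and the two toy factories) is hypothetical: the real loss bodies are elided in the question, so simple float-based stand-ins take their place.

```python
# Hypothetical stand-ins for the question's loss factories
def weighted_abs_error(pos_weight=1.0):
    def loss(y_true, y_pred):
        return pos_weight * abs(y_true - y_pred)
    return loss

def squared_error():
    def loss(y_true, y_pred):
        return (y_true - y_pred) ** 2
    return loss

# Explicit name-to-factory registry instead of eval() / globals()
LOSS_FACTORIES = {
    "weighted_abs_error": weighted_abs_error,
    "squared_error": squared_error,
}

def build_loss(loss_dict, weights):
    # Build each sub-loss once, forwarding its keyword arguments to the factory
    parts = [LOSS_FACTORIES[name](**kwargs) for name, kwargs in loss_dict.items()]
    if len(parts) != len(weights):
        raise ValueError("need exactly one weight per loss function")

    def combined(y_true, y_pred):
        # Weighted sum of the sub-loss values, returned as a single value
        return sum(w * part(y_true, y_pred) for part, w in zip(parts, weights))

    return combined

loss = build_loss(
    {"weighted_abs_error": {"pos_weight": 2.0}, "squared_error": {}},
    weights=[0.75, 0.25],
)
value = loss(1.0, 0.5)  # 0.75 * (2.0 * 0.5) + 0.25 * 0.25
print(value)
```

An unknown loss name raises a KeyError at build time, which is arguably easier to trace than a failed assert on globals().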