Custom class-wise loss function in tensorflow: answers

[Question title]: Custom class-wise loss function in tensorflow
[Posted]: 2019-09-20 17:37:19
[Question]:

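For my problem I want to predict customer review scores from 1 to 5. I thought it would be good to treat this as a regression problem, because a prediction of 1 when the true value is 5 should count as a "worse" prediction than a prediction of 4. I also want the model to perform equally well across all rating classes. Because my dataset is highly imbalanced, I want to create a metric/loss that captures this (similar, I think, to what F1 does for classification). So I created the following metric (for now only the MSE is relevant):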

def custom_metric(y_true, y_pred):
    df = pd.DataFrame(np.column_stack([y_pred, y_true]), columns=["Predicted", "Truth"])
    class_mse = 0
    #class_mae = 0
    print("MSE for Classes:")
    for i in df.Truth.unique():
        temp = df[df["Truth"]==i]
        mse = mean_squared_error(temp.Truth, temp.Predicted)
        #mae = mean_absolute_error(temp.Truth, temp.Predicted)
        print("Class {}: {}".format(i, mse))
        class_mse += mse
        #class_mae += mae
    print()
    print("AVG MSE over Classes {}".format(class_mse/len(df.Truth.unique())))
    #print("AVG MAE over Classes {}".format(class_mae/len(df.Truth.unique())))

Here is a sample prediction:

import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error, mean_absolute_error

# sample predictions: "model" messed up at class 2 and 3 
y_true = np.array((1,1,1,2,2,2,3,3,3,4,4,4,5,5,5))
y_pred = np.array((1,1,1,2,2,3,5,4,3,4,4,4,5,5,5))

custom_metric(y_true, y_pred)

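Now my question: is it possible to create a custom tensorflow loss function that behaves similarly? I have also worked on the following implementation, which is not ready for tensorflow yet but is probably closer: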

def custom_metric(y_true, y_pred):
    mse_class = 0
    num_classes = len(np.unique(y_true))
    stacked = np.vstack((y_true, y_pred))
    for i in np.unique(stacked[0]):     
        y_true_temp = stacked[0][np.where(stacked[0]==i)]
        y_pred_temp = stacked[1][np.where(stacked[0]==i)]
        mse = np.mean(np.square(y_pred_temp - y_true_temp))
        mse_class += mse
    return mse_class/num_classes

However, I am still not sure how to replace the for loop with a tensorflow-style definition.

Thanks in advance for your help!

[Question comments]:

Tags: python tensorflow keras metrics loss-function

[Answer 1]:

The for loop should be replaced by numpy/tensorflow operations on tensors.

An example of a custom metric:

from keras import backend as K

def custom_mean_squared_error(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)

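Here y_true holds the ground-truth labels and y_pred holds your predictions. Note that there is no explicit for loop.

The motivation for avoiding for loops is that vectorized operations (available in both numpy and tensorflow) take advantage of modern CPU architectures by turning many iterated operations into matrix operations. As a point of comparison, a dot product computed with numpy can be roughly 30 times faster than the same computation written as a regular Python for loop.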
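To make the vectorization point concrete for the metric in the question, here is a minimal numpy sketch of a class-averaged MSE without a Python loop. The function name `class_averaged_mse` and the use of `np.unique(..., return_inverse=True)` with `np.bincount` are choices made for this illustration, not something taken from the answer above.

```python
import numpy as np

def class_averaged_mse(y_true, y_pred):
    """Mean of the per-class MSEs, computed without an explicit Python loop."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    # Map every sample to the index of its class among the classes present.
    classes, class_idx = np.unique(y_true, return_inverse=True)
    squared_error = (y_pred - y_true) ** 2
    # Sum the squared errors per class and count the samples per class.
    sse_per_class = np.bincount(class_idx, weights=squared_error, minlength=len(classes))
    n_per_class = np.bincount(class_idx, minlength=len(classes))
    # Every class returned by np.unique has at least one sample, so no division by zero.
    return np.mean(sse_per_class / n_per_class)

# Sample predictions from the question.
y_true = np.array((1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5))
y_pred = np.array((1, 1, 1, 2, 2, 3, 5, 4, 3, 4, 4, 4, 5, 5, 5))
print(class_averaged_mse(y_true, y_pred))
```

On the question's sample every class has three samples, so the result (about 0.4) matches the plain MSE. The two only diverge when class sizes differ: for `y_true = [1, 1, 1, 5]` and `y_pred = [1, 1, 1, 3]` the plain MSE is 1.0, while the class-averaged MSE is 2.0.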

[Comments]:

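Thanks for your answer! So I guess I could do something like mean_squared_error(y_true[np.where(y_true==1)], y_pred[np.where(y_true==1)]) for each class and then compute the overall mean (not pretty, but a start). But now I am stuck translating y_true[np.where(y_true==1)] into tensorflow. I will keep working on it :)
You don't need to. You can pass y_true and y_pred directly. Since you framed the problem as regression, MSE will automatically penalize a prediction of 4.2 more than a prediction of 2.1 when the ground truth is 1.0. In this case your y_true (ground truth) is a real number, so you don't need any one-hot encoding.
But a plain mean squared error does not account for class imbalance, or am I missing something? That is why I want to average the MSE over all rating classes. That way, poor performance on a small class (for example 1-star reviews) is penalized more, even though the class is relatively small.
OK, note that a loss function and a metric are two different things :D, which is where the confusion comes from. The function defined above should be used as the loss function while training your neural network. For regression problems, "MSE" = mean_squared_error is the loss function. Other metrics you can use during training include: Mean Absolute Error: mean_absolute_error, MAE, mae; Mean Absolute Percentage Error: mean_absolute_percentage_error, MAPE; Cosine Proximity: cosine_proximity, cosine.
Imbalance is a separate issue; you can use SMOTE, ADASYN, or another oversampling technique to balance your dataset.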
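For the step the asker was stuck on (expressing `y_true[np.where(y_true==1)]` in tensorflow), one possible approach is to build a one-hot mask over the classes and reduce over the batch, which selects all classes at once. The sketch below is not from the thread. It assumes the ratings are the integers 1 to 5 (`NUM_CLASSES` is an assumption), and the name `class_averaged_mse_loss` is hypothetical.

```python
import tensorflow as tf

NUM_CLASSES = 5  # assumption: ratings are the integers 1..5

def class_averaged_mse_loss(y_true, y_pred):
    """Average of per-class MSEs over the classes present in the batch."""
    y_true = tf.reshape(tf.cast(y_true, tf.float32), [-1])
    y_pred = tf.reshape(tf.cast(y_pred, tf.float32), [-1])
    squared_error = tf.square(y_pred - y_true)
    # Rating r maps to column r - 1 of a (batch, NUM_CLASSES) one-hot mask.
    class_idx = tf.cast(tf.round(y_true), tf.int32) - 1
    mask = tf.one_hot(class_idx, depth=NUM_CLASSES, dtype=tf.float32)
    # Per-class sum of squared errors and per-class sample counts.
    sse_per_class = tf.reduce_sum(mask * squared_error[:, tf.newaxis], axis=0)
    n_per_class = tf.reduce_sum(mask, axis=0)
    # divide_no_nan returns 0 for classes that do not occur in the batch.
    mse_per_class = tf.math.divide_no_nan(sse_per_class, n_per_class)
    n_present = tf.reduce_sum(tf.cast(n_per_class > 0, tf.float32))
    return tf.math.divide_no_nan(tf.reduce_sum(mse_per_class), n_present)

# Sample predictions from the question.
y_true = tf.constant([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5], dtype=tf.float32)
y_pred = tf.constant([1, 1, 1, 2, 2, 3, 5, 4, 3, 4, 4, 4, 5, 5, 5], dtype=tf.float32)
loss = class_averaged_mse_loss(y_true, y_pred)
print(float(loss))

# Hypothetical usage with an existing Keras model named `model`:
# model.compile(optimizer="adam", loss=class_averaged_mse_loss)
```

Keep in mind that a loss like this is evaluated per batch, so a class that is missing from a batch contributes nothing to that step. With a highly imbalanced dataset, larger or stratified batches may make the per-class averages less noisy.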