【Question Title】: Keras Custom Loss for One-Hot Encoded Classification
【Posted】: 2021-10-07 09:59:25
【Question】:

I currently have a DNN I trained that predicts a one-hot encoded classification of the state a game is in. Basically, say there are three states: 0, 1, or 2.

Now, I would normally use categorical_cross_entropy as the loss function, but I realized that not all misclassifications are equal for my states. For example:

  • If the model predicts the state should be 1, there is no cost to my system if that classification is wrong, since state 1 basically does nothing, so reward it 0x.
  • If the model correctly predicts state 0 or 2 (i.e. predicted = 2 and correct = 2), then the reward should be 3x.
  • If the model incorrectly predicts state 0 or 2 (i.e. predicted = 2 and correct = 0), then the reward should be -1x.

I know we can declare custom loss functions in Keras, but I'm stuck on actually forming one. Does anyone have suggestions for how to translate this pseudocode? I can't figure out how to do it with vectorized operations.

Additional question: I think what I'm really after is a reward function. Is that the same thing as a loss function? Thanks!

def custom_expectancy(y_expected, y_pred):
    
    # Get 0, 1 or 2 for each row
    expected_norm = tf.argmax(y_expected, axis=-1)
    predicted_norm = tf.argmax(y_pred, axis=-1)
    
    # Some pseudo code....
    # Now, if predicted == 1
    #     loss += 0
    # elif predicted == expected
    #     loss -= 3
    # elif predicted != expected
    #     loss += 1
    #
    # return loss
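One way the pseudocode above could be vectorized is with nested `tf.where` calls that apply the reward table element-wise (this is a sketch of my own, not the accepted approach; note that `argmax` is not differentiable, so gradients through `y_pred` will be zero here, which the answers below work around):

```python
import tensorflow as tf

def custom_expectancy(y_expected, y_pred):
    # Assumes both arguments are one-hot / probability rows of shape (N, 3).
    expected_norm = tf.argmax(y_expected, axis=-1)
    predicted_norm = tf.argmax(y_pred, axis=-1)

    # Reward table from the rules above: predicting state 1 -> 0,
    # correct 0/2 -> +3, wrong 0/2 -> -1. tf.where branches element-wise.
    reward = tf.where(
        predicted_norm == 1,
        tf.zeros_like(expected_norm, dtype=tf.float32),
        tf.where(predicted_norm == expected_norm,
                 tf.fill(tf.shape(expected_norm), 3.0),
                 tf.fill(tf.shape(expected_norm), -1.0)))

    # Keras minimizes the loss, so negate the reward.
    return -reward
```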

Sources consulted:

https://datascience.stackexchange.com/questions/55215/how-do-i-create-a-keras-custom-loss-function-for-a-one-hot-encoded-binary-classi

Custom loss in Keras with softmax to one-hot

Code update

import tensorflow as tf
def custom_expectancy(y_expected, y_pred):
    
    # Get 0, 1 or 2 for each row
    expected_norm = tf.argmax(y_expected, axis=-1)
    predicted_norm = tf.argmax(y_pred, axis=-1)
    
    results = tf.unstack(expected_norm)
    
    # Some pseudo code....
    # Now, if predicted == 1
    #     loss += 0
    # elif predicted == expected
    #     loss += 3
    # elif predicted != expected
    #     loss -= 1
    
    for idx in range(0, len(expected_norm)):
        predicted = predicted_norm[idx]
        expected = expected_norm[idx]
        
        if predicted == 1: # do nothing
            results[idx] = 0.0
        elif predicted == expected: # reward
            results[idx] = 3.0
        else: # wrong, so we lost
            results[idx] = -1.0
    
    return tf.stack(results)

I think this is what I'm after, but I haven't quite figured out how to build the right tensor (it should be batch sized) to return.

【Discussion】:

    Tags: python tensorflow machine-learning keras deep-learning


    【Solution 1】:

    The best way to build a conditional custom loss is to use tf.keras.backend.switch without involving loops.

    In your case, you should combine two switch conditional expressions to obtain the desired result.

    The desired loss function can be reproduced like this:

    def custom_expectancy(y_expected, y_pred):
        
        zeros = tf.cast(tf.reduce_sum(y_pred*0, axis=-1), tf.float32) ### important to produce gradient
        y_expected = tf.cast(tf.reshape(y_expected, (-1,)), tf.float32)
        class_pred = tf.argmax(y_pred, axis=-1)
        class_pred = tf.cast(class_pred, tf.float32)
        
        cond1 = (class_pred != y_expected) & (class_pred != 1)
        cond2 = (class_pred == y_expected) & (class_pred != 1)
        
        res1 = tf.keras.backend.switch(cond1, zeros - 1, zeros)
        res2 = tf.keras.backend.switch(cond2, zeros + 3, zeros)
        
        return res1 + res2
    

    cond1 is when the model incorrectly predicts state 0 or 2, and cond2 is when the model correctly predicts state 0 or 2. The default state is zero, which is returned when neither cond1 nor cond2 is activated.

    You'll notice that y_expected can be passed as a simple tensor/array of integer-encoded states (no need to one-hot encode them).

    The loss function works as follows:

    true = tf.constant([[1],    [2],    [1],    [0]    ])  ## no need to one-hot
    pred = tf.constant([[0,1,0],[0,0,1],[0,0,1],[0,1,0]])
    
    custom_expectancy(true, pred)
    

    which returns:

    <tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 0.,  3., -1.,  0.], dtype=float32)>
    

    This seems to match what we need.

    To use the loss in a model:

    import numpy as np
    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import Dense

    X = np.random.uniform(0,1, (1000,10))
    y = np.random.randint(0,3, (1000)) ## no need to one-hot
    
    model = Sequential([Dense(3, activation='softmax')])
    model.compile(optimizer='adam', loss=custom_expectancy)
    model.fit(X,y, epochs=3)
    

    Here is a running notebook

    【Comments】:

      【Solution 2】:

      Here there is a nice post explaining the concepts of loss function and cost function. Multiple answers illustrate how different authors in the machine learning field think about them.

      Regarding the loss function, you may find the following implementation useful. It implements a weighted cross-entropy loss, which lets you weight each class proportionally during training. This can be adapted to the constraints specified above.
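The linked implementation is not reproduced here, but as a rough sketch of the idea (the function name and weight values below are illustrative, not taken from the post), a weighted categorical cross-entropy can look like this:

```python
import tensorflow as tf

def weighted_categorical_crossentropy(weights):
    # `weights` is a per-class scale, e.g. [3.0, 1.0, 3.0] to emphasize
    # states 0 and 2 over state 1 (values are illustrative).
    w = tf.constant(weights, dtype=tf.float32)
    def loss(y_true, y_pred):
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0)  # guard against log(0)
        # Standard cross-entropy term, scaled per class by w.
        return -tf.reduce_sum(w * y_true * tf.math.log(y_pred), axis=-1)
    return loss
```

It would then be passed to compile, e.g. `model.compile(optimizer='adam', loss=weighted_categorical_crossentropy([3.0, 1.0, 3.0]))`. With all weights equal to 1.0 it reduces to the standard categorical cross-entropy.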

      【Comments】:

        【Solution 3】:

        Here is one way to do what you want. If your ground truth y_true is dense (shape N×3), you can use tf.reduce_all(y_true == [0.0, 0.0, 1.0], axis=-1, keepdims=True) and tf.reduce_all(y_true == [1.0, 0.0, 0.0], axis=-1, keepdims=True) to control the if/elif/else. You could optimize this further with tf.gather.

        def sparse_loss(y_true, y_pred):
          """Calculate loss for game. Follows keras loss signature.
          
          Args:
            y_true: Sparse tensor of shape N1, where the correct prediction
              is encoded as 0, 1, or 2. 
            y_pred: Tensor of shape N3. For each row, the three columns
              represent the predicted probability of each state. 
              For example, [0.1, 0.3, 0.6] means, "There's a 10% chance the 
              right state is 0, a 30% chance the right state is 1, 
              and a 60% chance the right state is 2". 
          """
        
          # This is the unvectorized implementation on individual rows, which is more
          # intuitive. But TF requires vectorization. 
          # if y_true == 0:
          #   # Value matrix is shape 3. Broadcasting will occur. 
          #   return -tf.reduce_sum(y_pred * [3.0, 0.0, -1.0])
          # elif y_true == 2:
          #   return -tf.reduce_sum(y_pred * [-1.0, 0.0, 3.0])
          # else:
          #   # According to the rules, this is never the correct
          #   # state to predict, so it should never show up.
          #   assert False, f'Impossible state reached. y_true: {y_true}, y_pred: {y_pred}.'
        
        
          # We vectorize by calculating the reward for all predictions for two cases:
          # if y_true is zero or if y_true is two. To eliminate this inefficiency, we 
          # could use tf.gather to build an N3 shaped matrix to multiply against. 
          reward_for_true_zero = tf.reduce_sum(y_pred * [3.0, 0.0, -1.0], axis=-1, keepdims=True) # N1
          reward_for_true_two = tf.reduce_sum(y_pred * [-1.0, 0.0, 3.0], axis=-1, keepdims=True) # N1
        
          reward = tf.where(y_true == 0.0, reward_for_true_zero, reward_for_true_two) # N1
          return -tf.reduce_sum(reward)
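The tf.gather optimization mentioned in the comments above could be sketched like this (the `values` matrix and function name are my own, following the same reward table; row 1 is all zeros since state 1 is never rewarded or penalized):

```python
import tensorflow as tf

# Row i of `values` is the reward vector to apply when y_true == i.
values = tf.constant([[ 3.0, 0.0, -1.0],   # y_true == 0
                      [ 0.0, 0.0,  0.0],   # y_true == 1
                      [-1.0, 0.0,  3.0]])  # y_true == 2

def sparse_loss_gather(y_true, y_pred):
    # y_true: (N, 1) integer states; y_pred: (N, 3) probabilities.
    idx = tf.cast(tf.reshape(y_true, (-1,)), tf.int32)
    rows = tf.gather(values, idx)         # (N, 3) reward vectors, one lookup per row
    return -tf.reduce_sum(rows * y_pred)  # expected reward over the batch, negated
```

This replaces the two reduce_sum/tf.where passes with a single table lookup and multiply.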
        
        

        【Comments】:
