Is it possible to display the sum over batches instead of the mean over batches as loss in keras?

【Question title】: Is it possible to display the sum over batches instead of the mean over batches as loss in keras?
【Posted】: 2020-04-26 07:29:38
【Question description】:

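I have a regression task and am measuring goodness of fit with the Euclidean distance. Instead of displaying the mean squared error as the loss, I want to display the sum of squares. That is, I want to sum the squared error terms only, without dividing by the number of examples.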

At the batch level, I can achieve this by defining a custom loss like this (perhaps I could also use tf.keras.losses.MeanSquaredError directly):

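import tensorflow as tf

class CustomLoss(tf.keras.losses.Loss):
    def call(self, Y_true, Y_pred):
        # Squared error summed over the last axis: one value per example
        return tf.reduce_sum(tf.math.abs(Y_true-Y_pred) ** 2, axis=-1)

# Sum the per-example losses over the batch instead of averaging them
target_loss=CustomLoss(reduction=tf.keras.losses.Reduction.SUM)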

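This computes the squared error for each example and then tells TensorFlow to sum over the examples to get the batch loss, instead of using the default SUM_OVER_BATCH_SIZE (which should not be read literally but as a fraction, i.e. SUM / BATCH_SIZE).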
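To make the two reduction modes concrete, here is a small plain-Python sketch with made-up numbers. It does not call TensorFlow; the helper names `per_example_sq_error`, `reduce_sum` and `reduce_sum_over_batch_size` are hypothetical and only mimic what the question describes for `Reduction.SUM` and `SUM_OVER_BATCH_SIZE`.

```python
# Illustrative only: mimics the two reduction modes with plain Python lists.

def per_example_sq_error(y_true, y_pred):
    """Sum of squared differences over the last axis, one value per example."""
    return [sum((t - p) ** 2 for t, p in zip(yt, yp))
            for yt, yp in zip(y_true, y_pred)]

def reduce_sum(values):
    """Analogue of Reduction.SUM: add up the per-example losses."""
    return sum(values)

def reduce_sum_over_batch_size(values):
    """Analogue of SUM_OVER_BATCH_SIZE: the sum divided by the batch size."""
    return sum(values) / len(values)

# A made-up batch of two examples with two outputs each.
y_true = [[1.0, 2.0], [0.0, 0.0]]
y_pred = [[0.0, 2.0], [3.0, 4.0]]

errors = per_example_sq_error(y_true, y_pred)      # per-example squared errors
batch_sum = reduce_sum(errors)                     # what the asker wants per batch
batch_mean = reduce_sum_over_batch_size(errors)    # what the default would give
```

For this batch the per-example errors are 1.0 and 25.0, so the summed loss is 26.0 while the default reduction would give 13.0.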

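My problem is at the epoch level: Keras takes these per-batch sums and then averages them across steps (batches) to report the loss for the epoch. How can I make Keras compute the sum over batches instead of the mean?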
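The gap between the reported value and the desired one can be shown with a few made-up per-batch sums. This is only an arithmetic sketch and assumes the epoch value is a plain, unweighted mean over steps, as the question describes; if a given Keras version weights batches differently (for example by batch size), the last step would not hold exactly.

```python
# Hypothetical per-batch losses, each already a SUM over the examples in that batch.
batch_sums = [26.0, 10.0, 4.0]

# What the question says Keras reports: the mean over steps (batches).
reported_epoch_loss = sum(batch_sums) / len(batch_sums)

# What the asker wants: the total over all batches in the epoch.
wanted_epoch_loss = sum(batch_sums)

# With an unweighted mean, the total can be recovered from the reported value
# by multiplying with the number of steps.
recovered = reported_epoch_loss * len(batch_sums)
```

So under that assumption the two quantities differ only by a constant factor, the number of steps per epoch, which does not change where the minimum lies but does change the number shown in the progress output.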

【Comments on the question】:

Tags: python tensorflow keras deep-learning

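【Solution 1】:

You have to write a custom Callback that appends the loss to a list after every batch (as shown in the linked documentation).

Implement on_epoch_end to take the sum of all the values in that list (the one to which you appended every batch loss).

If you want to minimize the sum of the losses over all batches, use the K.function API. A full implementation was linked from the original answer.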
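The bookkeeping described in this answer could look roughly like the sketch below. It is not the linked implementation: `BatchLossSummer` is a hypothetical plain-Python stand-in that only mirrors the hook names of `tf.keras.callbacks.Callback`, so it runs without TensorFlow. A real version would subclass `tf.keras.callbacks.Callback` and be passed to `model.fit(..., callbacks=[...])`. It also assumes that `logs["loss"]` holds the loss of the current batch; in some TensorFlow versions that value may instead be a running average over the epoch so far, in which case summing it would not give the intended total.

```python
class BatchLossSummer:
    """Collects per-batch losses and sums them at the end of each epoch.

    Hypothetical stand-in: the hook names mirror tf.keras.callbacks.Callback,
    but this class does not depend on TensorFlow.
    """

    def __init__(self):
        self.batch_losses = []   # losses seen in the current epoch
        self.epoch_sums = []     # one total per finished epoch

    def on_epoch_begin(self, epoch, logs=None):
        # Start every epoch with an empty list.
        self.batch_losses = []

    def on_train_batch_end(self, batch, logs=None):
        # Assumption: logs["loss"] is this batch's loss, not a running average.
        logs = logs or {}
        if "loss" in logs:
            self.batch_losses.append(logs["loss"])

    def on_epoch_end(self, epoch, logs=None):
        total = sum(self.batch_losses)
        self.epoch_sums.append(total)
        print(f"epoch {epoch}: summed batch loss = {total}")


# Drive the hooks by hand with made-up batch losses.
summer = BatchLossSummer()
summer.on_epoch_begin(0)
for step, loss in enumerate([26.0, 10.0, 4.0]):
    summer.on_train_batch_end(step, {"loss": loss})
summer.on_epoch_end(0)
```

Keeping the totals in `epoch_sums` rather than only printing them makes it easy to inspect or plot them after training.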

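【Comments】:

【Solution 2】:

You can accumulate a sum over batches inside a tf.keras.metrics.Metric as shown below. Note that there is currently an open issue in 2.4.x (see this GitHub issue), so you may want to try 2.3.2 instead.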

import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Dense, BatchNormalization, Activation
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

class AddAllOnes(tf.keras.metrics.Metric):
    """A simple metric that adds up all the ones in the current batch and is supposed to return the total number of ones seen so far at the end of every batch."""
    def __init__(self, name="add_all_ones", **kwargs):
        super(AddAllOnes, self).__init__(name=name, **kwargs)
        # Running total that persists across batches
        self.total = self.add_weight(name="total", initializer="zeros")

    def update_state(self, y_true, y_pred, sample_weight=None):
        # Add this batch's count of ones to the running total
        self.total.assign_add(tf.cast(tf.reduce_sum(y_true), dtype=tf.float32))

    def result(self):
        print('')
        print('inside result...', self.total)
        return self.total

# Random inputs and binary labels, just to exercise the metric
X_train = np.random.random((512, 8))
y_train = np.random.randint(0, 2, (512, 1))

K.clear_session()
model_inputs = Input(shape=(8,))
model_unit = Dense(256, activation='linear', use_bias=False)(model_inputs)
model_unit = BatchNormalization()(model_unit)
model_unit = Activation('sigmoid')(model_unit)
model_outputs = Dense(1, activation='sigmoid')(model_unit)
optim = Adam(learning_rate=0.001)
model = Model(inputs=model_inputs, outputs=model_outputs)
model.compile(loss='binary_crossentropy', optimizer=optim, metrics=[AddAllOnes()], run_eagerly=True)
model.fit(X_train, y_train, verbose=1, batch_size=32)

【Comments】: