TensorFlow metrics with a custom Estimator: answer

【Question title】: Tensorflow metrics with custom Estimator
【Posted】: 2018-05-02 22:24:52
【Question description】:

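I have a convolutional neural network that I recently refactored to use Tensorflow's Estimator API, largely following this tutorial. However, during training, the metrics I added to the EstimatorSpec do not show up in Tensorboard and do not seem to be evaluated in tfdbg either, even though the name scope and the metrics are present in the graph written to Tensorboard.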

The relevant bits of model_fn are as follows:

 ...

 predictions = tf.placeholder(tf.float32, [num_classes], name="predictions")

 ...

 with tf.name_scope("metrics"):
    predictions_rounded = tf.round(predictions)
    accuracy = tf.metrics.accuracy(input_y, predictions_rounded, name='accuracy')
    precision = tf.metrics.precision(input_y, predictions_rounded, name='precision')
    recall = tf.metrics.recall(input_y, predictions_rounded, name='recall')

if mode == tf.estimator.ModeKeys.PREDICT:
    spec = tf.estimator.EstimatorSpec(mode=mode,
                                      predictions=predictions)
elif mode == tf.estimator.ModeKeys.TRAIN:

    ...

    # if we're doing softmax vs sigmoid, we have different metrics
    if cross_entropy == CrossEntropyType.SOFTMAX:
        metrics = {
            'accuracy': accuracy,
            'precision': precision,
            'recall': recall
        }
    elif cross_entropy == CrossEntropyType.SIGMOID:
        metrics = {
            'precision': precision,
            'recall': recall
        }
    else:
        raise NotImplementedError("Unrecognized cross entropy function: {}\t Available types are: SOFTMAX, SIGMOID".format(cross_entropy))
    spec = tf.estimator.EstimatorSpec(mode=mode,
                                      loss=loss,
                                      train_op=train_op,
                                      eval_metric_ops=metrics)
else:
    raise NotImplementedError('ModeKey provided is not supported: {}'.format(mode))

return spec

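Does anyone have any idea why these aren't being written? I'm using Tensorflow 1.7 and Python 3.5. I tried adding them explicitly via tf.summary.scalar, and while they do make it into Tensorboard that way, they are never updated after the first pass through the graph.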

【Question comments】:

Tags: python tensorflow

【Solution 1】:

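The metrics API has a twist. Take tf.metrics.accuracy as an example (all tf.metrics.* functions work the same way). It returns 2 values, the accuracy metric and an update_op, and this looks like your first mistake. You should have something like this: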

accuracy, update_op = tf.metrics.accuracy(input_y, predictions_rounded, name='accuracy')

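accuracy is just the value you would expect it to compute. Note, however, that you may want to compute accuracy across multiple calls to sess.run, for example when computing the accuracy of a large test set that does not all fit in memory. That is where update_op comes in: it accumulates the results, so that when you ask for accuracy it gives you a running tally.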
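To make the split between the metric value and update_op concrete, here is a minimal plain-Python analogy. It is not TensorFlow's implementation: the class name StreamingAccuracy, its methods, and the total/count accumulators are all hypothetical, chosen only to stand in for the local variables a streaming metric keeps between sess.run calls.

```python
class StreamingAccuracy:
    """Plain-Python stand-in for a streaming accuracy metric (illustrative only)."""

    def __init__(self):
        self.total = 0.0  # number of correct predictions seen so far
        self.count = 0.0  # number of examples seen so far

    def update(self, labels, predictions):
        """Plays the role of update_op: fold one batch into the accumulators."""
        self.total += sum(1 for y, p in zip(labels, predictions) if y == p)
        self.count += len(labels)
        return self.value()

    def value(self):
        """Plays the role of the metric tensor: read the running result."""
        return self.total / self.count if self.count else 0.0


metric = StreamingAccuracy()
print(metric.value())                        # 0.0, nothing accumulated yet
metric.update([1, 0, 1, 1], [1, 0, 0, 1])    # batch 1: 3 of 4 correct
metric.update([0, 0], [0, 1])                # batch 2: 1 of 2 correct
print(metric.value())                        # running result: 4 correct out of 6
```

In this analogy, reading value() without ever calling update() keeps returning the initial value, which is consistent with the symptom described in the question: a scalar that appears in Tensorboard but never changes.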

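update_op has no dependencies, so you need to either run it explicitly in sess.run or add a dependency. For example, you can tie it to the cost function with a control dependency, so that update_op runs (updating the running accuracy count) whenever the cost function is computed: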

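# tf.control_dependencies expects a list of ops or tensors. Wrapping cost in
# tf.identity inside the block means fetching cost runs the update ops first.
with tf.control_dependencies([update_op, other_update_ops, ...]):
  cost = tf.identity(cost)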

You can reset the metric values with the local variables initializer:

sess.run(tf.local_variables_initializer())
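As a rough sketch of why the reset matters, the plain-Python example below keeps running counters for precision and zeroes them between two evaluation passes. The names StreamingPrecision and reset are hypothetical and this is only an analogy, not TensorFlow code; in TensorFlow the reset is the local variables initializer call shown above.

```python
class StreamingPrecision:
    """Illustrative running precision: true positives / predicted positives."""

    def __init__(self):
        self.reset()

    def reset(self):
        """Plays the role of re-running the local variables initializer."""
        self.true_positives = 0
        self.predicted_positives = 0

    def update(self, labels, predictions):
        """Fold one batch of binary labels and predictions into the counters."""
        for y, p in zip(labels, predictions):
            if p == 1:
                self.predicted_positives += 1
                if y == 1:
                    self.true_positives += 1

    def value(self):
        """Read the running precision; 0.0 if nothing was predicted positive."""
        if self.predicted_positives == 0:
            return 0.0
        return self.true_positives / self.predicted_positives


precision = StreamingPrecision()
precision.update([1, 0, 1], [1, 1, 1])   # 2 true positives of 3 predicted
first_pass = precision.value()

precision.reset()                        # start the next pass from zero
precision.update([1, 1], [1, 1])         # 2 true positives of 2 predicted
second_pass = precision.value()          # not affected by the first pass
```

Without the reset() call, the second pass would still include the first pass's counts, so the reported value would blend both passes.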

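You need to add the accuracy to Tensorboard with tf.summary.scalar('accuracy', accuracy), as you mentioned you tried (although it looks like you were adding the wrong thing).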

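【Comments】:

Ah.... In every tutorial they either use canned estimators or do just what I wrote there, with no further mention of it. Doing it manually with tf.summary.scalar (I knew about the tuple thing, but I'm glad you mentioned it for posterity) together with the control dependency worked, thanks a lot!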