[Question Title]: keras tensorboard: plot train and validation scalars in the same figure
[Posted]: 2018-06-01 07:52:10
[Question Description]:

So I am using TensorBoard in Keras. In TensorFlow it is possible to use two different summary writers for the training and validation scalars, so that TensorBoard can plot them in the same figure. Something like the figure in

TensorBoard - Plot training and validation losses on the same graph?

Is there a way to do this in Keras?

Thanks.

[Question Discussion]:

    Tags: tensorflow neural-network keras tensorboard


    [Solution 1]:

    To handle the validation logs with a separate writer, you can write a custom callback that wraps the original TensorBoard callback.

    import os
    import tensorflow as tf
    from keras.callbacks import TensorBoard
    
    class TrainValTensorBoard(TensorBoard):
        def __init__(self, log_dir='./logs', **kwargs):
            # Make the original `TensorBoard` log to a subdirectory 'training'
            training_log_dir = os.path.join(log_dir, 'training')
            super(TrainValTensorBoard, self).__init__(training_log_dir, **kwargs)
    
            # Log the validation metrics to a separate subdirectory
            self.val_log_dir = os.path.join(log_dir, 'validation')
    
        def set_model(self, model):
            # Setup writer for validation metrics
            self.val_writer = tf.summary.FileWriter(self.val_log_dir)
            super(TrainValTensorBoard, self).set_model(model)
    
        def on_epoch_end(self, epoch, logs=None):
            # Pop the validation logs and handle them separately with
            # `self.val_writer`. Also rename the keys so that they can
            # be plotted on the same figure with the training metrics
            logs = logs or {}
            val_logs = {k.replace('val_', ''): v for k, v in logs.items() if k.startswith('val_')}
            for name, value in val_logs.items():
                summary = tf.Summary()
                summary_value = summary.value.add()
                summary_value.simple_value = value.item()
                summary_value.tag = name
                self.val_writer.add_summary(summary, epoch)
            self.val_writer.flush()
    
            # Pass the remaining logs to `TensorBoard.on_epoch_end`
            logs = {k: v for k, v in logs.items() if not k.startswith('val_')}
            super(TrainValTensorBoard, self).on_epoch_end(epoch, logs)
    
        def on_train_end(self, logs=None):
            super(TrainValTensorBoard, self).on_train_end(logs)
            self.val_writer.close()
    
    • In __init__, two subdirectories are set up for the training and validation logs
    • In set_model, a writer self.val_writer is created for the validation logs
    • In on_epoch_end, the validation logs are separated from the training logs and written to file with self.val_writer
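    The key renaming can be seen in isolation with plain dictionaries. This only illustrates the dict comprehensions used in on_epoch_end; the metric names and values are made up for the example:

```python
# Hypothetical epoch-end logs as Keras would pass them to a callback
# (metric names and values are illustrative).
logs = {'loss': 0.35, 'acc': 0.90, 'val_loss': 0.42, 'val_acc': 0.88}

# Strip the 'val_' prefix so that validation scalars get the same tag
# as their training counterparts; TensorBoard overlays scalars that
# share a tag across different run directories.
val_logs = {k.replace('val_', ''): v for k, v in logs.items() if k.startswith('val_')}
train_logs = {k: v for k, v in logs.items() if not k.startswith('val_')}

print(val_logs)    # {'loss': 0.42, 'acc': 0.88}
print(train_logs)  # {'loss': 0.35, 'acc': 0.9}
```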

    Take the MNIST dataset as an example:

    from keras.models import Sequential
    from keras.layers import Dense
    from keras.datasets import mnist
    
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train = x_train.reshape(60000, 784)
    x_test = x_test.reshape(10000, 784)
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    x_train /= 255
    x_test /= 255
    
    model = Sequential()
    model.add(Dense(64, activation='relu', input_shape=(784,)))
    model.add(Dense(10, activation='softmax'))
    model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    
    model.fit(x_train, y_train, epochs=10,
              validation_data=(x_test, y_test),
              callbacks=[TrainValTensorBoard(write_graph=False)])
    

    You can then visualize the two curves on the same figure in TensorBoard.
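    To see the overlay, point TensorBoard at the parent log directory (assuming the default ./logs used above); the 'training' and 'validation' subdirectories show up as two runs, and scalars that share a tag are drawn on one chart:

```shell
# Both subdirectories are picked up as separate runs under one chart.
tensorboard --logdir=./logs
```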


    EDIT: I modified the class slightly so that it can be used with eager execution.

    The biggest change is that I use tf.keras in the code below. The TensorBoard callback in standalone Keras does not seem to support eager mode yet.

    import os
    import tensorflow as tf
    from tensorflow.keras.callbacks import TensorBoard
    from tensorflow.python.eager import context
    
    class TrainValTensorBoard(TensorBoard):
        def __init__(self, log_dir='./logs', **kwargs):
            self.val_log_dir = os.path.join(log_dir, 'validation')
            training_log_dir = os.path.join(log_dir, 'training')
            super(TrainValTensorBoard, self).__init__(training_log_dir, **kwargs)
    
        def set_model(self, model):
            if context.executing_eagerly():
                self.val_writer = tf.contrib.summary.create_file_writer(self.val_log_dir)
            else:
                self.val_writer = tf.summary.FileWriter(self.val_log_dir)
            super(TrainValTensorBoard, self).set_model(model)
    
        def _write_custom_summaries(self, step, logs=None):
            logs = logs or {}
            val_logs = {k.replace('val_', ''): v for k, v in logs.items() if 'val_' in k}
            if context.executing_eagerly():
                with self.val_writer.as_default(), tf.contrib.summary.always_record_summaries():
                    for name, value in val_logs.items():
                        tf.contrib.summary.scalar(name, value.item(), step=step)
            else:
                for name, value in val_logs.items():
                    summary = tf.Summary()
                    summary_value = summary.value.add()
                    summary_value.simple_value = value.item()
                    summary_value.tag = name
                    self.val_writer.add_summary(summary, step)
            self.val_writer.flush()
    
            logs = {k: v for k, v in logs.items() if not 'val_' in k}
            super(TrainValTensorBoard, self)._write_custom_summaries(step, logs)
    
        def on_train_end(self, logs=None):
            super(TrainValTensorBoard, self).on_train_end(logs)
            self.val_writer.close()
    

    The idea is the same --

    • Check the source code of the TensorBoard callback
    • See what it does to set up the writer
    • Do the same in this custom callback

    Again, you can test it with the MNIST data,

    from tensorflow.keras.datasets import mnist
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense
    from tensorflow.train import AdamOptimizer
    
    tf.enable_eager_execution()
    
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train = x_train.reshape(60000, 784)
    x_test = x_test.reshape(10000, 784)
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    x_train /= 255
    x_test /= 255
    y_train = y_train.astype(int)
    y_test = y_test.astype(int)
    
    model = Sequential()
    model.add(Dense(64, activation='relu', input_shape=(784,)))
    model.add(Dense(10, activation='softmax'))
    model.compile(loss='sparse_categorical_crossentropy', optimizer=AdamOptimizer(), metrics=['accuracy'])
    
    model.fit(x_train, y_train, epochs=10,
              validation_data=(x_test, y_test),
              callbacks=[TrainValTensorBoard(write_graph=False)])
    

    [Discussion]:

    • Hi Yu, is there a way to log the validation accuracy and loss every batch (instead of every epoch)? We have a huge number of images to train on, and one epoch can take several minutes per update.
    • Yes, Hao Xi. You can add a counter variable in the on_batch_end method and use it to log every 'n' batches. The code can be seen in this github issue.
    • Hi Yu, does this also work with eager execution enabled? That is, could this code also be updated to use the TensorFlow Summary API v2 (tensorflow.org/api_docs/python/tf/contrib/summary)?
    • @Derk I have updated the answer to include an example with eager execution enabled (tested with TF 1.11.0). Please check whether it fits your use case.
    • Thanks! Is there a reason you use summary_ops_v2 rather than the standard API at tensorflow.org/api_docs/python/tf/contrib/summary?
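    The per-batch counter mentioned in the comments can be sketched framework-free. The class and attribute names here are hypothetical, and the marked line is where a real TensorBoard subclass would build a tf.Summary from `logs` and call self.writer.add_summary:

```python
class EveryNBatches:
    """Sketch of the counter logic for an on_batch_end hook.

    In a real TensorBoard callback subclass, the marked line would
    build a tf.Summary from `logs` and write it with the callback's
    writer at step `self.seen`.
    """
    def __init__(self, log_every=100):
        self.log_every = log_every
        self.seen = 0           # batches seen so far
        self.logged_steps = []  # steps at which scalars would be written

    def on_batch_end(self, batch, logs=None):
        self.seen += 1
        if self.seen % self.log_every == 0:
            # <-- write scalars from `logs` here, at step `self.seen`
            self.logged_steps.append(self.seen)

cb = EveryNBatches(log_every=3)
for b in range(10):
    cb.on_batch_end(b)
print(cb.logged_steps)  # [3, 6, 9]
```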
    [Solution 2]:

    If you are using TensorFlow 2.0, the Keras TensorBoard callback now gives you this by default. (When using TensorFlow with Keras, make sure you use tensorflow.keras.)

    See this tutorial:

    https://www.tensorflow.org/tensorboard/r2/scalars_and_keras
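    A minimal sketch of this behaviour, assuming TensorFlow 2.x is installed; the tiny random-data model exists only to exercise the callback. The stock callback writes training and validation scalars into 'train' and 'validation' subdirectories, so the curves share one chart with no custom subclassing:

```python
import numpy as np
from tensorflow import keras  # assumes TF 2.x

# Tiny model on random data, just to exercise the callback.
x = np.random.rand(32, 4).astype('float32')
y = np.random.randint(0, 2, size=(32,))

model = keras.Sequential([keras.layers.Dense(2, activation='softmax')])
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam', metrics=['accuracy'])

# No custom subclass needed: the stock callback logs training and
# validation scalars to separate run subdirectories by default.
tb = keras.callbacks.TensorBoard(log_dir='./logs', write_graph=False)
model.fit(x, y, epochs=2, validation_split=0.25,
          callbacks=[tb], verbose=0)
```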

    [Discussion]:
