Keras custom layer to load data: answers

[Question title]: keras custom layer to load data
[Posted]: 2021-04-06 06:28:19
[Question]:

I am following this tutorial to use a custom layer for preprocessing.

def pre_process(file_path):
    # load the file from disk and transform it into shape [90, 13, 1]
    ...

class PreProcessBlock(layers.Layer):
    def __init__(self):
        super(PreProcessBlock,self).__init__()
    
    def call(self, inputs):
        return pre_process(inputs.numpy())
    
    def compute_output_shape(self, input_shape):
        return input_shape

preprocess = tf.keras.Sequential([
            PreProcessBlock()
])

model = keras.Sequential(
     [
      preprocess,
      
      layers.Dense(256, activation = "relu"),
      layers.Dropout(.5), 
      layers.Dense(len(LABELS)),
     ]
)

I am creating my dataset like this:

files = ['file1', 'file2']
labels = [0,1]

def get_data_set(files, labels, is_training=False):
    dataset = tf.data.Dataset.from_tensor_slices((files, labels))

    if is_training:
        dataset = dataset.shuffle(SHUFFLE_BUFFER_SIZE, reshuffle_each_iteration = True)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.prefetch(AUTOTUNE)
    return dataset

train_dataset = get_data_set(files, labels, is_training=True)
val_dataset = get_data_set(files, labels)

Fitting the model fails with an error:

model.fit(train_dataset, epochs=1, verbose=1,validation_data=val_dataset)

Error

AttributeError: in user code:

    /opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:806 train_function  *
        return step_function(self, iterator)
    <ipython-input-158-3f6d9dd39f2f>:6 call  *
        return pre_process(inputs.numpy())

    AttributeError: 'Tensor' object has no attribute 'numpy'

My question

Is this a valid way to implement a model pipeline?

[Comments]:

Tags: tensorflow keras keras-layer

[Answer 1]:

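The structure of the layers, and the idea of grouping all preprocessing layers into one Sequential block, is great. However, you should not load any training examples inside a layer: loading data is not part of the model, and it reduces portability.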
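One way to follow this advice is to move the file loading out of the model and into the tf.data pipeline with Dataset.map. The sketch below is only an illustration under stated assumptions: the question does not give the file format, so it assumes each file holds 90*13 raw float32 values, and load_example is a hypothetical helper, not something from the question.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def load_example(file_path, label):
    # Hypothetical loader: ASSUMES each file stores 90*13 raw float32 values.
    raw = tf.io.read_file(file_path)
    values = tf.io.decode_raw(raw, tf.float32)
    features = tf.reshape(values, [90, 13, 1])
    return features, label


# Write two dummy files so the sketch runs end to end.
tmp_dir = tempfile.mkdtemp()
files = []
for i in range(2):
    path = os.path.join(tmp_dir, f"file{i}.bin")
    np.full(90 * 13, i, dtype=np.float32).tofile(path)
    files.append(path)
labels = [0, 1]

# Loading happens in the input pipeline; the model only ever sees tensors.
dataset = (
    tf.data.Dataset.from_tensor_slices((files, labels))
    .map(load_example, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(2)
    .prefetch(tf.data.AUTOTUNE)
)

features, batch_labels = next(iter(dataset))
```

With this layout the model can start directly at the Dense layers (or at real preprocessing layers that transform tensors), and the same model works on data that never touched the disk.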

Two issues:

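You have no numpy() method because call() is executed in graph mode during fit(), where tensors are symbolic and carry no concrete values. I suggest sticking with the static graph and not trying to convert anything to numpy inside the Keras graph unless absolutely necessary, because it hurts performance. Most operations on tensors can be done with tf.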
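The difference is easy to reproduce outside Keras. The following sketch (all names are made up for illustration) checks for numpy() on an eager tensor and again inside a tf.function, which is roughly what fit() does with call(), and then centers the data with tf ops only.

```python
import tensorflow as tf

x = tf.constant([[1.0, 2.0], [3.0, 4.0]])

# Eager tensors carry concrete values, so numpy() is available.
eager_has_numpy = hasattr(x, "numpy")

graph_has_numpy = []


@tf.function
def traced(t):
    # During tracing `t` is symbolic: it has a shape and dtype but no value.
    graph_has_numpy.append(hasattr(t, "numpy"))
    # Pure tf ops work in both eager and graph mode.
    mean = tf.reduce_mean(t, axis=0, keepdims=True)
    return t - mean


centered = traced(x)
```

If a numpy-only routine really cannot be avoided, tf.py_function can wrap it, at the cost of the performance and portability concerns mentioned above.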

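Your custom preprocessing layer should inherit from tensorflow.keras.layers.experimental.preprocessing.PreprocessingLayer (all layers in tf.keras.layers.experimental.preprocessing inherit from it, either directly or through CombinerPreprocessingLayer from the same package). There is not much in the source code of the PreprocessingLayer class, but everything in it matters: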

PreprocessingLayer provides the interface for the adapt method:
adapt(self, data, reset_state=True). See the "pure" Keras docs for why and when it is needed.
The PreprocessingLayer class has the _must_restore_from_config = True flag, and the Layer docs say:

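When loading from a SavedModel, Layers typically can be revived into a generic Layer wrapper. Sometimes, however, layers may implement methods that go beyond this wrapper, as in the case of PreprocessingLayers' adapt method. When this is the case, layer implementers can override must_restore_from_config to return True; layers with this property must be restored into their actual objects (and will fail if the object is not available to the restoration code).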

Let's take the code of the Resizing layer as an example (comments omitted for readability):

class Resizing(PreprocessingLayer):
  def __init__(self,
               height,
               width,
               interpolation='bilinear',
               name=None,
               **kwargs):
    self.target_height = height
    self.target_width = width
    self.interpolation = interpolation
    self._interpolation_method = get_interpolation(interpolation)
    self.input_spec = InputSpec(ndim=4)
    super(Resizing, self).__init__(name=name, **kwargs)
    base_preprocessing_layer._kpl_gauge.get_cell('V2').set('Resizing')

  def call(self, inputs):
    outputs = image_ops.resize_images_v2(
        images=inputs,
        size=[self.target_height, self.target_width],
        method=self._interpolation_method)
    return outputs

  def compute_output_shape(self, input_shape):
    input_shape = tensor_shape.TensorShape(input_shape).as_list()
    return tensor_shape.TensorShape(
        [input_shape[0], self.target_height, self.target_width, input_shape[3]])

  def get_config(self):
    config = {
        'height': self.target_height,
        'width': self.target_width,
        'interpolation': self.interpolation,
    }
    base_config = super(Resizing, self).get_config()
    return dict(list(base_config.items()) + list(config.items()))

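This is a very generic layer. It can resize images to a given width and height. However, the purpose of preprocessing layers is to keep the whole end-to-end pipeline inside a single model. So in your pipeline you will have a specific width and height, and at inference time you do not want to bother with instantiating the layer with the proper parameters: they should be the same as in training (this applies to any preprocessing method, really). That is why get_config() saves height and width in addition to the base config, so they can be read back easily when the model is re-created later. Note that this layer does not override the adapt method, because it is invariant to the data.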
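The same get_config() pattern carries over to a custom layer. Below is a minimal sketch, not code from the library: ScaleAndShift is a hypothetical layer, and it subclasses the plain tf.keras.layers.Layer (an assumption made so the example does not depend on the experimental PreprocessingLayer import path being present in the installed version). It stores its constructor arguments in get_config() so the layer can be rebuilt with identical settings.

```python
import tensorflow as tf


class ScaleAndShift(tf.keras.layers.Layer):
    """Hypothetical preprocessing layer: computes inputs * scale + offset."""

    def __init__(self, scale=1.0, offset=0.0, **kwargs):
        super().__init__(**kwargs)
        self.scale = scale
        self.offset = offset

    def call(self, inputs):
        # Only tf ops here, so the layer also works in graph mode.
        return tf.cast(inputs, tf.float32) * self.scale + self.offset

    def get_config(self):
        # Save the constructor arguments next to the base config.
        config = super().get_config()
        config.update({"scale": self.scale, "offset": self.offset})
        return config


layer = ScaleAndShift(scale=0.5, offset=1.0)
# Rebuild the layer from its config, as happens when a model is reloaded.
restored = ScaleAndShift.from_config(layer.get_config())

x = tf.constant([[2.0, 4.0]])
original_out = layer(x).numpy().tolist()
restored_out = restored(x).numpy().tolist()
```

When a whole model containing such a layer is saved and loaded, the class usually has to be made known to the loader, for example through the custom_objects argument of tf.keras.models.load_model.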

[Comments]: