Can't get the output shape of a Keras layer inside a custom layer: answers

【Question title】: Can't get output shape of a keras layer inside a custom layer
【Posted】: 2021-07-23 00:45:41
【Question description】:

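I'm using a custom Keras layer that is built from several Keras layers. I'm trying to read the output_shape of an inner layer from a callback (on_train_batch_end) and I get the following error: "AttributeError: The layer has never been called and thus has no defined input shape."

I don't understand how this can happen when the custom layer's call function has already been invoked, since the model has been trained for a single batch by that point.

Code example: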

from tensorflow.keras.layers import ReLU, MaxPooling2D, Input, Dense, Conv2D, Flatten
from tensorflow.keras.layers import Layer
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import Callback
import tensorflow as tf
import numpy as np


class MyLayer(Layer):
    def __init__(self):
        super(MyLayer, self).__init__()
        self.conv = None
        self.m_max = None
        self.relu = None

    def call(self, inputs, **kwargs):
        x = self.conv(inputs)
        x = self.m_max(x)
        return self.relu(x)

    def build(self, input_shape):
        self.conv = Conv2D(input_shape=input_shape, filters=128, kernel_size=(2,2))
        self.m_max = MaxPooling2D()
        self.relu = ReLU()


class ModelCallback(Callback):
    def on_batch_end(self, batch, logs=None):
        print(self.model.layers[1].conv.output_shape)

inp = Input((32,32,3))
x = MyLayer()(inp)
x = Flatten()(x)
out = Dense(1)(x)

model = Model(inputs=inp, outputs=out)
model.compile(optimizer='adam', loss='categorical_crossentropy')

x_train = np.random.rand(5000,32,32,3)
y_train = np.random.randint(2, size=(5000,1))

model.fit(x_train, y_train, epochs=5, callbacks=ModelCallback())

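【Comments on the question】:

I tried to reproduce your code but ran into an error, NameError: name 'ds_train' is not defined. Could you provide the complete code (if it isn't confidential) so that we can help you? Please find the Colab Gist: colab.research.google.com/gist/rmothukuru/…
I've updated the post with the complete code. It is only dummy code that I'm able to share, but it still produces the same error.
@TFer2, I created a new GitHub gist on Colab with the correct code: colab.research.google.com/gist/arielAmsel/…

Tags: tensorflow keras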

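【Solution 1】:

This isn't a real answer, but I'll share my workaround with others. The idea is simply to run dummy data through the inner layers, compute the resulting shapes, and save them.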

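def _calculate_shape(self, input_tensor_shape: tf.TensorShape):
    """Pass dummy data through the inner layers and store each layer's (input, output) shape."""
    # Freeze the inner layers while the dummy batch passes through them.
    self.conv.trainable = False
    self.m_max.trainable = False
    self.relu.trainable = False
    input_shape = list(input_tensor_shape)
    # Replace the unknown batch dimension. self.batch_size is not defined in the
    # question's MyLayer, so it is assumed to be set on the layer beforehand.
    input_shape[0] = self.batch_size
    x = self.conv(np.random.rand(*input_shape))
    # [1:] removes the batch size from each recorded shape.
    self.conv_shapes = (input_shape[1:], tf.shape(x).numpy().tolist()[1:])
    x = self.m_max(x)
    self.max_shapes = (self.conv_shapes[1], tf.shape(x).numpy().tolist()[1:])
    x = self.relu(x)
    self.relu_shapes = (self.max_shapes[1], tf.shape(x).numpy().tolist()[1:])
    # Restore the trainable flags.
    self.conv.trainable = True
    self.m_max.trainable = True
    self.relu.trainable = True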

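You can then use these variables (conv_shapes, max_shapes, relu_shapes) whenever you need the shapes of the inner layers.
** Note that the batch size is removed from the stored shapes.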
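
As a cross-check on the recorded values, the same shapes can be worked out by hand from the layer settings in the question. The sketch below is plain Python rather than Keras code. The helper names (conv2d_valid_shape, max_pool2d_valid_shape, my_layer_shapes) are made up for illustration, and it assumes the Keras defaults used in the question: 'valid' padding, a convolution stride of 1, and a 2x2 pool whose stride equals the pool size.

```python
from typing import Dict, Optional, Tuple

Shape = Tuple[int, int, int]  # (height, width, channels); batch size omitted


def conv2d_valid_shape(shape: Shape, filters: int, kernel: Tuple[int, int],
                       strides: Tuple[int, int] = (1, 1)) -> Shape:
    """Output shape of a 2-D convolution with 'valid' padding (assumed default)."""
    h, w, _ = shape
    out_h = (h - kernel[0]) // strides[0] + 1
    out_w = (w - kernel[1]) // strides[1] + 1
    return (out_h, out_w, filters)


def max_pool2d_valid_shape(shape: Shape, pool: Tuple[int, int] = (2, 2),
                           strides: Optional[Tuple[int, int]] = None) -> Shape:
    """Output shape of 2-D max pooling with 'valid' padding.

    When strides is None it falls back to the pool size, which is assumed
    here to mirror the MaxPooling2D default.
    """
    if strides is None:
        strides = pool
    h, w, c = shape
    out_h = (h - pool[0]) // strides[0] + 1
    out_w = (w - pool[1]) // strides[1] + 1
    return (out_h, out_w, c)


def my_layer_shapes(input_shape: Shape) -> Dict[str, Tuple[Shape, Shape]]:
    """(input, output) shape pairs for the conv -> max-pool -> ReLU stack."""
    conv_out = conv2d_valid_shape(input_shape, filters=128, kernel=(2, 2))
    pool_out = max_pool2d_valid_shape(conv_out)
    relu_out = pool_out  # ReLU is element-wise, so the shape is unchanged
    return {
        "conv": (input_shape, conv_out),
        "max": (conv_out, pool_out),
        "relu": (pool_out, relu_out),
    }


shapes = my_layer_shapes((32, 32, 3))
print(shapes)
```

For the (32, 32, 3) input from the question this gives (31, 31, 128) after the convolution and (15, 15, 128) after both the pooling and the ReLU. Keras layers also provide a compute_output_shape method, which may be worth trying as a way to obtain these values without pushing dummy data through the layers.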

【Comments】: