TensorFlow Model Subclassing with Multiple Inputs: Answers

【Question title】: TensorFlow Model Subclassing Multi-Input
【Posted】: 2020-01-14 23:19:37
【Question】:

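I am using the Keras subclassing API to rebuild a functional model that takes two inputs and produces two outputs. I cannot find any documentation on whether or how this is possible.

Does the TF 2.0 / Keras subclassing API allow multiple inputs?

With the functional API, building my model is simple: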

word_in = Input(shape=(None,))  # sequence length
char_in = Input(shape=(None, None)) 
... layers...
m = Model(inputs=[word_in, char_in], outputs=[output_1, output_2])

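【Comments】:

Tags: python tensorflow keras

【Solution 1】:

A subclassed model with multiple inputs is no different from a single-input model: the inputs arrive in call as a list.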

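from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense

class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
        # define layers
        self.dense1 = Dense(10, name='d1')
        self.dense2 = Dense(10, name='d2')

    def call(self, inputs):
        # inputs is the list passed to the model, e.g. [word_in, char_in]
        x1 = inputs[0]
        x2 = inputs[1]
        # define model (apply the layers here); this stub returns the inputs unchanged
        return x1, x2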

You define your layers in __init__ and the forward pass of your model in the call method.

word_in = Input(shape=(None,))  # sequence length
char_in = Input(shape=(None, None)) 

model = MyModel()
model([word_in, char_in])
# returns 
# (<tf.Tensor 'my_model_4/my_model_4/Identity:0' shape=(?, ?) dtype=float32>,
# <tf.Tensor 'my_model_4/my_model_4_1/Identity:0' shape=(?, ?, ?) dtype=float32>)
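The stub above returns its inputs untouched. As a minimal sketch of what call might look like once the layers are actually applied, the example below routes each input through its own Dense layer. The class name TwoInputModel, the feature sizes, and the random input data are assumptions made for illustration and are not part of the original answer.

```python
import tensorflow as tf


class TwoInputModel(tf.keras.Model):
    """Hypothetical two-input, two-output model (illustrative names)."""

    def __init__(self):
        super().__init__()
        # Layers are created once here and reused on every call.
        self.dense1 = tf.keras.layers.Dense(10, name="d1")
        self.dense2 = tf.keras.layers.Dense(10, name="d2")

    def call(self, inputs):
        # `inputs` is the list that was passed to the model.
        x1, x2 = inputs
        # Dense acts on the last axis, so inputs of different rank are fine.
        return self.dense1(x1), self.dense2(x2)


model = TwoInputModel()
x1 = tf.random.uniform((4, 7))     # assumed shape: (batch, features)
x2 = tf.random.uniform((4, 7, 5))  # assumed shape: (batch, steps, features)
out_1, out_2 = model([x1, x2])
```

Each output keeps the leading dimensions of its input and replaces the last one with the layer width (10 here).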

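【Comments】:

I see. Is there a way to define the input layers inside the model itself? Having layers both outside and inside the model gets messy.
You don't need to define Input layers outside the model at all; you can pass the actual input data directly to the model.

【Solution 2】:

Suppose you have 3 inputs (for example, a RoBERTa model for a QA task):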
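To illustrate the point about skipping Input layers, here is a small sketch in which real tensors are passed straight to a subclassed model. It assumes eager execution (the TF 2.x default); the class name LazyTwoInput and the tensor shapes are hypothetical. The layer weights do not exist until the first call, at which point they are created from the shapes of the data.

```python
import tensorflow as tf


class LazyTwoInput(tf.keras.Model):  # hypothetical name
    def __init__(self):
        super().__init__()
        self.dense1 = tf.keras.layers.Dense(10)
        self.dense2 = tf.keras.layers.Dense(10)

    def call(self, inputs):
        x1, x2 = inputs
        return self.dense1(x1), self.dense2(x2)


model = LazyTwoInput()
weights_before = len(model.dense1.weights)  # layer not built yet

# Real tensors go straight in; no tf.keras.Input is involved.
x1 = tf.random.uniform((2, 6))
x2 = tf.random.uniform((2, 3, 6))
y1, y2 = model([x1, x2])

weights_after = len(model.dense1.weights)   # kernel and bias now exist
```

Symbolic Input tensors are only needed when you want to trace the model without data, for instance to build a functional Model around it.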

import tensorflow as tf

class MasoudModel2(tf.keras.Model):

    def __init__(self):
        # in __init__ you define all the layers
        super(MasoudModel2, self).__init__()
        self.dense1 = tf.keras.layers.Dense(4, activation=tf.nn.relu)
        self.dense2 = tf.keras.layers.Dense(10, activation='softmax')

    def call(self, inputs):
        # unpack the list in the same order it is passed to the model
        ids = inputs[0]
        toks = inputs[1]  # unpacked but not used in this toy example
        att_mask = inputs[2]
        # let's skip real layers
        a = self.dense1(ids)
        b = self.dense2(att_mask)
        return a, b

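Then:

# MAX_LEN is the fixed sequence length, defined elsewhere
ids = tf.keras.Input((MAX_LEN,), dtype=tf.int32)
att_mask = tf.keras.Input((MAX_LEN,), dtype=tf.int32)
toks = tf.keras.Input((MAX_LEN,), dtype=tf.int32)
model2 = MasoudModel2()
# the list order must match the unpacking in call(): ids, toks, att_mask
model2([ids, toks, att_mask])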
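Because a list of inputs is unpacked by position, passing the tensors in a different order than call expects fails silently when all inputs share a shape. One possible way to avoid this, sketched below, is to pass a dict and look inputs up by key. The class name NamedInputModel, the key names, the explicit casts to float32, and MAX_LEN = 8 are assumptions for this illustration, not part of the original answer.

```python
import tensorflow as tf

MAX_LEN = 8  # assumed sequence length for the demo


class NamedInputModel(tf.keras.Model):  # hypothetical name
    def __init__(self):
        super().__init__()
        self.dense1 = tf.keras.layers.Dense(4, activation="relu")
        self.dense2 = tf.keras.layers.Dense(10, activation="softmax")

    def call(self, inputs):
        # Inputs are looked up by key, so position no longer matters.
        ids = tf.cast(inputs["ids"], tf.float32)
        att_mask = tf.cast(inputs["att_mask"], tf.float32)
        # inputs["toks"] is accepted but unused, mirroring the toy model.
        return self.dense1(ids), self.dense2(att_mask)


batch = {
    "ids": tf.ones((2, MAX_LEN), dtype=tf.int32),
    "att_mask": tf.ones((2, MAX_LEN), dtype=tf.int32),
    "toks": tf.zeros((2, MAX_LEN), dtype=tf.int32),
}
a, b = NamedInputModel()(batch)
```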

More info: here is the same idea with the functional API, if you want that as well.

def functional_type():
    # shape must be a tuple, hence the trailing comma in (MAX_LEN,)
    ids = tf.keras.Input((MAX_LEN,), dtype=tf.int32)
    att_mask = tf.keras.Input((MAX_LEN,), dtype=tf.int32)
    toks = tf.keras.Input((MAX_LEN,), dtype=tf.int32)

    c = tf.keras.layers.Dense(10, activation='softmax')(ids)
    d = tf.keras.layers.Dense(3, activation='softmax')(att_mask)

    # the order of `inputs` here is the order fit/predict expect the data in
    model = tf.keras.Model(inputs=[ids, toks, att_mask], outputs=[c, d])

    return model

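Then (note: the first argument is the list of inputs; the second argument, a list of two arrays, holds the answers, i.e. the targets):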

model.fit([input_ids[idxT,], attention_mask[idxT,], token_type_ids[idxT,]],
          [start_tokens[idxT,], end_tokens[idxT,]],
          epochs=3, batch_size=32, verbose=DISPLAY, callbacks=[sv])
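The fit call above depends on arrays that are not shown. As a rough, self-contained sketch of the same pattern, the example below trains a three-input, two-output functional model on random dummy data. Everything specific here is an assumption for illustration: the function name build_model, MAX_LEN = 8, the sample count, float32 inputs (instead of int32 token ids), output widths equal to MAX_LEN, and the categorical cross-entropy loss.

```python
import numpy as np
import tensorflow as tf

MAX_LEN = 8   # assumed sequence length, kept tiny for the demo
N = 16        # assumed number of dummy samples


def build_model():  # hypothetical helper
    ids = tf.keras.Input((MAX_LEN,), dtype=tf.float32)
    att_mask = tf.keras.Input((MAX_LEN,), dtype=tf.float32)
    toks = tf.keras.Input((MAX_LEN,), dtype=tf.float32)
    # One softmax head per target: start position and end position.
    start = tf.keras.layers.Dense(MAX_LEN, activation="softmax")(ids)
    end = tf.keras.layers.Dense(MAX_LEN, activation="softmax")(att_mask)
    return tf.keras.Model(inputs=[ids, att_mask, toks], outputs=[start, end])


model = build_model()
model.compile(
    optimizer="adam",
    loss=["categorical_crossentropy", "categorical_crossentropy"],
)

# Random stand-ins for the real inputs and one-hot targets.
x = [np.random.rand(N, MAX_LEN).astype("float32") for _ in range(3)]
y = [
    np.eye(MAX_LEN, dtype="float32")[np.random.randint(0, MAX_LEN, size=N)]
    for _ in range(2)
]

# Inputs and targets are lists, in the order the model declares them.
history = model.fit(x, y, epochs=1, batch_size=4, verbose=0)
```

The list given as the first argument must follow the order of inputs= in the model, and the list of targets must follow the order of outputs=.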

【Comments】: