What do concatenate layers do in Keras multitask? Answers

【Question title】: What do concatenate layers do in Keras multitask?
【Posted】: 2019-12-03 11:47:52
【Question description】:

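I am implementing a simple multitask model in Keras. I used the code given in the documentation under the heading "Shared layers".

I know that in multitask learning we share some of the initial layers of the model, while the final layers are specific to each task, as described in the link.

I have the following two cases with the Keras API: in the first case I use keras.layers.concatenate, and in the second case I do not use keras.layers.concatenate at all.

I am posting the code and the model for each case below.

Case 1 code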

import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model
from keras.utils.vis_utils import plot_model

tweet_a = Input(shape=(280, 256))
tweet_b = Input(shape=(280, 256))

# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)

# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)

# We can then concatenate the two vectors:
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)

# And add a logistic regression on top
predictions1 = Dense(1, activation='sigmoid')(merged_vector)
predictions2 = Dense(1, activation='sigmoid')(merged_vector)

# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=[predictions1, predictions2])

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

Case 1 model

Case 2 code

import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model
from keras.utils.vis_utils import plot_model

tweet_a = Input(shape=(280, 256))
tweet_b = Input(shape=(280, 256))

# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)

# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)



# And add a logistic regression on top
predictions1 = Dense(1, activation='sigmoid')(encoded_a)
predictions2 = Dense(1, activation='sigmoid')(encoded_b)

# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=[predictions1, predictions2])

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

Case 2 model

In both cases only the LSTM layer is shared. In case 1 we have keras.layers.concatenate, but in case 2 we do not have any keras.layers.concatenate.

My question is: which one is multitask learning, case 1 or case 2? Moreover, what is the role of keras.layers.concatenate in case 1?

【Question comments】:

Tags: python tensorflow keras deep-learning nlp

【Solution 1】:

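Both are multitask models, since that depends only on whether there are multiple outputs, each associated with a task.

The difference is that your first model explicitly concatenates the features produced by the shared layer, so both output tasks can take information from both inputs into account. In the second model each output is connected directly to just one input and never considers the other input. The only link between the two branches there is that they share the LSTM weights.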
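
To make the difference concrete, here is a minimal framework-free sketch in NumPy rather than Keras. It is only an illustration under stated assumptions: `shared_encoder` is a hypothetical stand-in for the shared LSTM (mean-pooling plus one shared weight matrix, not a real recurrent layer), `dense_sigmoid` mimics a `Dense(1, activation='sigmoid')` head without a bias, and all weights are random. It checks whether the first prediction changes when only the second input is perturbed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the shared LSTM: one weight matrix reused for
# both inputs. A real LSTM is recurrent; this is only a placeholder.
W_shared = rng.normal(size=(256, 64))

def shared_encoder(x):
    """Map a (batch, 280, 256) array to a (batch, 64) feature vector."""
    return np.tanh(x.mean(axis=1) @ W_shared)

def dense_sigmoid(features, weights):
    """Mimic Dense(1, activation='sigmoid') without a bias term."""
    return 1.0 / (1.0 + np.exp(-(features @ weights)))

# Random head weights, scaled down so the sigmoid does not saturate.
w_case1 = 0.1 * rng.normal(size=(128, 1))  # head on the concatenated vector
w_case2 = 0.1 * rng.normal(size=(64, 1))   # head on encoded_a only

def case1_prediction1(a, b):
    # Concatenate along the last axis: (batch, 64) + (batch, 64) -> (batch, 128)
    merged = np.concatenate([shared_encoder(a), shared_encoder(b)], axis=-1)
    return dense_sigmoid(merged, w_case1)

def case2_prediction1(a, b):
    # In case 2, b only feeds the second head, so it is unused here.
    return dense_sigmoid(shared_encoder(a), w_case2)

tweet_a = rng.normal(size=(2, 280, 256))
tweet_b = rng.normal(size=(2, 280, 256))
tweet_b_changed = tweet_b + 1.0  # perturb only the second input

merged_shape = np.concatenate(
    [shared_encoder(tweet_a), shared_encoder(tweet_b)], axis=-1
).shape

case1_depends_on_b = not np.allclose(
    case1_prediction1(tweet_a, tweet_b),
    case1_prediction1(tweet_a, tweet_b_changed),
)
case2_depends_on_b = not np.allclose(
    case2_prediction1(tweet_a, tweet_b),
    case2_prediction1(tweet_a, tweet_b_changed),
)

print("merged shape:", merged_shape)
print("case 1, prediction 1 reacts to tweet_b:", case1_depends_on_b)
print("case 2, prediction 1 reacts to tweet_b:", case2_depends_on_b)
```

In this sketch the concatenated vector has twice the width of a single encoding, and only the case 1 head reacts to a change in `tweet_b`. In case 2 the two branches influence each other only during training, through gradient updates to the shared weights.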

【Comments】: