【Posted】: 2019-12-03 11:47:52
【Question】:
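I am implementing a simple multi-task model in Keras. I used the code given in the documentation under the heading "Shared layers".
I understand that in multi-task learning we share some of the initial layers of the model, while the final layers are specific to each task, as described in the link.
I have the following two cases with the Keras API: in the first I use keras.layers.concatenate, and in the second I do not use keras.layers.concatenate at all.
I am posting the code and model for each case below.
Case 1 code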
import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model
from keras.utils.vis_utils import plot_model
tweet_a = Input(shape=(280, 256))
tweet_b = Input(shape=(280, 256))
# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)
# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)
# We can then concatenate the two vectors:
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)
# And add two logistic regression heads on top, both reading the merged vector
predictions1 = Dense(1, activation='sigmoid')(merged_vector)
predictions2 = Dense(1, activation='sigmoid')(merged_vector)
# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=[predictions1, predictions2])
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
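As a rough way to see what keras.layers.concatenate changes in case 1, the sketch below rebuilds the shared LSTM with the same shapes and counts parameters with count_params(). The names head_merged and head_single are made up for this illustration and are not part of the original code.

```python
import keras
from keras.layers import Input, LSTM, Dense

tweet_a = Input(shape=(280, 256))
tweet_b = Input(shape=(280, 256))

# One LSTM instance applied to both inputs, so its weights are shared.
shared_lstm = LSTM(64)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)

# Case 1 wiring: a head reads the concatenation of both encodings.
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)
head_merged = Dense(1, activation='sigmoid')  # hypothetical name
head_merged(merged_vector)

# Case 2 wiring: a head reads a single encoding.
head_single = Dense(1, activation='sigmoid')  # hypothetical name
head_single(encoded_a)

lstm_params = shared_lstm.count_params()
merged_head_params = head_merged.count_params()
single_head_params = head_single.count_params()

print("shared LSTM parameters:", lstm_params)
print("head on concatenation:", merged_head_params)
print("head on single encoding:", single_head_params)
```

A head fed by the concatenation has 64 + 64 = 128 weights plus one bias, so it reads both encodings, while a head fed by a single encoding has 64 weights plus one bias. The LSTM parameters are counted once even though the layer is called on two inputs.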
Case 2 code
import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model
from keras.utils.vis_utils import plot_model
tweet_a = Input(shape=(280, 256))
tweet_b = Input(shape=(280, 256))
# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)
# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)
# And add one logistic regression head on top of each encoding
predictions1 = Dense(1, activation='sigmoid')(encoded_a)
predictions2 = Dense(1, activation='sigmoid')(encoded_b)
# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=[predictions1, predictions2])
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
In both cases, only the LSTM layer is shared. Case 1 uses keras.layers.concatenate, whereas case 2 does not use keras.layers.concatenate at all.
My question is: which one is multi-task learning, case 1 or case 2? Moreover, what is the role of keras.layers.concatenate in case 1?
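One way to probe the difference between the two wirings is to check whether the first output reacts when only the second input changes. The sketch below is an illustration under assumptions, not a definitive test: the helper names build and first_output_changes_with_b are hypothetical, and the sizes are deliberately tiny (the code above uses 280 timesteps, 256 features and 64 LSTM units) so that it runs quickly with untrained, randomly initialised weights.

```python
import numpy as np
import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model

# Hypothetical small sizes, chosen only to keep the check fast.
TIMESTEPS, FEATURES, UNITS = 5, 8, 4


def build(use_concatenate):
    """Rebuild the two-input, two-output model with or without concatenate."""
    tweet_a = Input(shape=(TIMESTEPS, FEATURES))
    tweet_b = Input(shape=(TIMESTEPS, FEATURES))
    shared_lstm = LSTM(UNITS)
    encoded_a = shared_lstm(tweet_a)
    encoded_b = shared_lstm(tweet_b)
    if use_concatenate:
        # Case 1: both heads read the merged vector.
        merged = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)
        head_in_1, head_in_2 = merged, merged
    else:
        # Case 2: each head reads only its own encoding.
        head_in_1, head_in_2 = encoded_a, encoded_b
    predictions1 = Dense(1, activation='sigmoid')(head_in_1)
    predictions2 = Dense(1, activation='sigmoid')(head_in_2)
    return Model(inputs=[tweet_a, tweet_b], outputs=[predictions1, predictions2])


def first_output_changes_with_b(model):
    """Keep tweet_a fixed, swap tweet_b, and compare the first output."""
    rng = np.random.RandomState(0)
    a = rng.rand(1, TIMESTEPS, FEATURES).astype('float32')
    b_first = rng.rand(1, TIMESTEPS, FEATURES).astype('float32')
    b_second = rng.rand(1, TIMESTEPS, FEATURES).astype('float32')
    out_first = model.predict([a, b_first], verbose=0)[0]
    out_second = model.predict([a, b_second], verbose=0)[0]
    return not np.allclose(out_first, out_second)


with_concat = first_output_changes_with_b(build(use_concatenate=True))
without_concat = first_output_changes_with_b(build(use_concatenate=False))
print("case 1, first output depends on tweet_b:", with_concat)
print("case 2, first output depends on tweet_b:", without_concat)
```

With the concatenation, each head receives features from both inputs, so the first prediction is a function of tweet_a and tweet_b together. Without it, the two branches interact only through the shared LSTM weights during training, not through the forward pass.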
【Discussion】:
Tags: python tensorflow keras deep-learning nlp