[Posted]: 2020-11-24 11:05:16
[Question]:
I am following the self-attention approach for Keras described in How to add attention layer to a Bi-LSTM.
I want to apply a Bi-LSTM to multi-class text classification with 3 classes.
I tried adding the attention layer to my code, but I get the error below. How can I fix this? Can anyone help me?
Incompatible shapes: [100,3] vs. [64,3]
[[Node: training_1/Adam/gradients/loss_11/dense_14_loss/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@training_1/Adam/gradients/loss_11/dense_14_loss/mul_grad/Reshape_1"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](training_1/Adam/gradients/loss_11/dense_14_loss/mul_grad/Shape, training_1/Adam/gradients/loss_11/dense_14_loss/mul_grad/Shape_1)]]
from keras.layers import Layer
from keras import backend as K

class attention(Layer):
    def __init__(self, return_sequences=False):
        self.return_sequences = return_sequences
        super(attention, self).__init__()

    def build(self, input_shape):
        # (features, 1) projection and a per-timestep bias
        self.W = self.add_weight(name="att_weight", shape=(input_shape[-1], 1),
                                 initializer="normal")
        self.b = self.add_weight(name="att_bias", shape=(input_shape[1], 1),
                                 initializer="zeros")
        super(attention, self).build(input_shape)

    def call(self, x):
        e = K.tanh(K.dot(x, self.W) + self.b)  # attention scores per timestep
        a = K.softmax(e, axis=1)               # weights over the time axis
        output = x * a
        if self.return_sequences:
            return output
        return K.sum(output, axis=1)           # weighted sum -> context vector
from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM, Dropout, Dense
from keras.callbacks import EarlyStopping

model = Sequential()
model.add(Embedding(17666, 100, input_length=409))
model.add(Bidirectional(LSTM(32, return_sequences=False)))
model.add(attention(return_sequences=True))  # receive 3D and output 2D
model.add(Dropout(0.3))
model.add(Dense(3, activation='softmax'))
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=3)
history777 = model.fit(x_train, y_train,
                       batch_size=100,
                       epochs=30,
                       validation_data=(x_val, y_val),
                       callbacks=[es])
The model summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_14 (Embedding) (None, 409, 100) 1766600
_________________________________________________________________
bidirectional_14 (Bidirectio (None, 64) 34048
_________________________________________________________________
attention_14 (attention) (None, 64) 128
_________________________________________________________________
dropout_6 (Dropout) (None, 64) 0
_________________________________________________________________
dense_14 (Dense) (None, 3) 195
=================================================================
Total params: 1,800,971
Trainable params: 1,800,971
Non-trainable params: 0
_________________________________________________________________
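As a sanity check of the shapes (not the Keras code itself), here is a small NumPy sketch of what the attention layer computes, using the batch size (100), sequence length (409), and Bi-LSTM width (2 × 32 = 64) from the code above. With a 3D input the shapes line up; with the 2D output that `return_sequences=False` produces, the bias built from `input_shape[1]` becomes (64, 1) and collides with the batch axis, which matches the [100, ...] vs [64, ...] mismatch in the error:

```python
import numpy as np

batch, timesteps, units = 100, 409, 64   # values taken from the model above

W = np.random.randn(units, 1)            # att_weight: (features, 1)

# Case 1: 3D input, what the layer expects (LSTM with return_sequences=True).
x = np.random.randn(batch, timesteps, units)
b = np.zeros((timesteps, 1))             # att_bias from input_shape[1] = 409
e = np.tanh(x @ W + b)                   # (100, 409, 1) attention scores
a = np.exp(e) / np.exp(e).sum(axis=1, keepdims=True)  # softmax over timesteps
context = (x * a).sum(axis=1)            # (100, 64) weighted sum over time

# Case 2: 2D input, what return_sequences=False actually produces.
# Now input_shape[1] is 64, so the bias is (64, 1) and hits the batch axis.
x2 = np.random.randn(batch, units)       # (100, 64)
b2 = np.zeros((units, 1))                # (64, 1)
broadcast_failed = False
try:
    np.tanh(x2 @ W + b2)                 # (100, 1) + (64, 1) cannot broadcast
except ValueError:
    broadcast_failed = True
```

This suggests the `Bidirectional(LSTM(...))` layer needs `return_sequences=True` so the attention layer receives a 3D (batch, timesteps, features) tensor, as in the linked answer.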
[Discussion]:
Tags: python keras deep-learning conv-neural-network