【Posted】: 2020-12-28 10:17:03
【Question】:
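I am trying to build an encoder-decoder model for text generation. I am using LSTM layers with an embedding layer. Something goes wrong where the output of the embedding layer is passed to the LSTM encoder layer. The error I get is: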
ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, 13, 128, 512)
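My encoder data has shape (40, 13, 128) = (num_observations, max_encoder_seq_length, vocab_size), and embeddings_size/latent_dim = 512.
My question is: how can I "get rid of" the 4th dimension going from the embedding layer into the LSTM encoder layer, or in other words, how should I pass these 4 dimensions to the LSTM layer of the encoder model? Since I am new to this topic, what should I eventually also correct in the decoder LSTM layer?
I have read several posts, including this, this one and many others, but could not find a solution. It seems to me that my problem is not in the model but in the shape of the data. Any hint or remark about what could be wrong would be much appreciated. Thank you very much.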
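A minimal sketch of where the fourth dimension may come from, assuming the (…, 13, 128) one-hot array is what gets fed to the model. An embedding layer is essentially a table lookup, so it appends one axis of size latent_dim to whatever shape it receives. The NumPy table below is only a stand-in for the Embedding weights, and the batch size of 2 is chosen just to keep the arrays small.

```python
import numpy as np

batch_size = 2                 # small illustrative batch, not the 40 observations
max_encoder_seq_length = 13
vocab_size = 128
latent_dim = 512

# Stand-in for the embedding weights: one latent_dim vector per token index.
embedding_table = np.zeros((vocab_size, latent_dim), dtype=np.float32)

# One-hot style input, shaped like the data described in the question.
one_hot_input = np.zeros((batch_size, max_encoder_seq_length, vocab_size), dtype=np.int64)

# Integer token indices, matching Input(shape=(max_encoder_seq_length,)).
index_input = np.zeros((batch_size, max_encoder_seq_length), dtype=np.int64)

# A lookup adds a trailing axis of size latent_dim to the input shape.
lookup_from_one_hot = embedding_table[one_hot_input]
lookup_from_indices = embedding_table[index_input]

print(lookup_from_one_hot.shape)   # (2, 13, 128, 512)
print(lookup_from_indices.shape)   # (2, 13, 512)
```

Only the second shape has the three dimensions (batch, timesteps, features) that an LSTM layer expects.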
My model is from (this tutorial):
encoder_inputs = Input(shape=(max_encoder_seq_length,))
x = Embedding(num_encoder_tokens, latent_dim)(encoder_inputs)
x, state_h, state_c = LSTM(latent_dim, return_state=True)(x)
encoder_states = [state_h, state_c]
# Set up the decoder, using `encoder_states` as initial state.
decoder_inputs = Input(shape=(max_decoder_seq_length,))
x = Embedding(num_decoder_tokens, latent_dim)(decoder_inputs)
x = LSTM(latent_dim, return_sequences=True)(x, initial_state=encoder_states)
decoder_outputs = Dense(num_decoder_tokens, activation='softmax')(x)
# Define the model that will turn
# `encoder_input_data` & `decoder_input_data` into `decoder_target_data`
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.summary()
# Compile & run training
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
# Note that `decoder_target_data` needs to be one-hot encoded,
# rather than sequences of integers like `decoder_input_data`!
model.fit([encoder_input_data, decoder_input_data],
          decoder_target_data,
          batch_size=batch_size,
          epochs=epochs,
          shuffle=True,
          validation_split=0.05)
My model summary is:
Model: "functional_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 13)] 0
__________________________________________________________________________________________________
input_2 (InputLayer) [(None, 15)] 0
__________________________________________________________________________________________________
embedding (Embedding) (None, 13, 512) 65536 input_1[0][0]
__________________________________________________________________________________________________
embedding_1 (Embedding) (None, 15, 512) 65536 input_2[0][0]
__________________________________________________________________________________________________
lstm (LSTM) [(None, 512), (None, 2099200 embedding[0][0]
__________________________________________________________________________________________________
lstm_1 (LSTM) (None, 15, 512) 2099200 embedding_1[0][0]
lstm[0][1]
lstm[0][2]
__________________________________________________________________________________________________
dense (Dense) (None, 15, 128) 65664 lstm_1[0][0]
==================================================================================================
Total params: 4,395,136
Trainable params: 4,395,136
Non-trainable params: 0
__________________________________________________________________________________________________
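As a sanity check on the summary, the parameter counts can be reproduced by hand. The sketch below assumes num_encoder_tokens = num_decoder_tokens = 128, which is consistent with the 65536 embedding parameters and the (None, 15, 128) dense output shown above.

```python
vocab_size = 128   # assumed equal for encoder and decoder tokens
latent_dim = 512

# Embedding: one latent_dim vector per token.
embedding_params = vocab_size * latent_dim

# LSTM: 4 gates, each with input weights, recurrent weights and a bias.
lstm_params = 4 * ((latent_dim + latent_dim) * latent_dim + latent_dim)

# Dense: weight matrix plus bias.
dense_params = latent_dim * vocab_size + vocab_size

total_params = 2 * embedding_params + 2 * lstm_params + dense_params

print(embedding_params)  # 65536
print(lstm_params)       # 2099200
print(dense_params)      # 65664
print(total_params)      # 4395136
```

The numbers match the summary, and the embedding outputs listed there are already three-dimensional, which fits the suspicion that the model definition is consistent and the mismatch lies in the arrays passed to it.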
Edit
I am formatting my data in the following way:
for i, text in enumerate(input_texts):
    words = text.split()  # text is a sentence
    for t, word in enumerate(words):
        encoder_input_data[i, t, input_dict[word]] = 1.
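One possible alternative, sketched under assumptions, is to store the token index itself instead of a one-hot row, so the input array has shape (num_observations, max_encoder_seq_length). The sample input_texts and input_dict below are hypothetical, and index 0 is kept free for padding here, which may not be how the original input_dict is built.

```python
import numpy as np

# Hypothetical sample data for illustration only.
input_texts = ["the cat sat", "the dog"]
input_dict = {"the": 1, "cat": 2, "sat": 3, "dog": 4}  # 0 is reserved for padding
max_encoder_seq_length = 13

# 2D integer array: one token index per (sentence, timestep) position.
encoder_input_data = np.zeros(
    (len(input_texts), max_encoder_seq_length), dtype="int32"
)

for i, text in enumerate(input_texts):
    for t, word in enumerate(text.split()):
        encoder_input_data[i, t] = input_dict[word]

print(encoder_input_data.shape)    # (2, 13)
print(encoder_input_data[0, :4])   # [1 2 3 0]
```

The same layout would apply to decoder_input_data, while the code comment in the model section says decoder_target_data stays one-hot encoded.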
For example, decoder_input_data[:2] then looks like this:
array([[[0., 1., 0., ..., 0., 0., 0.],
[0., 0., 1., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]],
[[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 1., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]]], dtype=float32)
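If the one-hot arrays already exist, they could in principle be collapsed back to integer indices with argmax over the last axis. This is a sketch on a tiny made-up array, not the actual data. One caveat: all-zero (padding) rows also map to index 0, so they become indistinguishable from a real token with index 0 unless a mask is kept.

```python
import numpy as np

# Tiny made-up one-hot array: (2 sentences, 4 timesteps, 5 tokens).
one_hot = np.zeros((2, 4, 5), dtype=np.float32)
one_hot[0, 0, 1] = 1.
one_hot[0, 1, 2] = 1.
one_hot[1, 1, 2] = 1.

# Collapse the vocabulary axis to the index of the 1 in each row.
indices = one_hot.argmax(axis=-1)

# Rows that contain no 1 at all are padding positions.
is_padding = one_hot.sum(axis=-1) == 0

print(indices.shape)   # (2, 4)
print(indices)         # [[1 2 0 0]
                       #  [0 2 0 0]]
print(is_padding[1])   # [ True False  True  True]
```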
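【Comments】:
- Please add the values for the shapes specified in the architecture, and please provide the full traceback of the error. Which line of the code causes the error?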
Tags: tensorflow keras lstm embedding