【Posted】: 2021-05-18 04:16:27
【Question】:
I am working through Tensorflow's tutorial on neural machine translation with attention.
The decoder code is as follows:
class Decoder(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, dec_units, batch_sz):
        super(Decoder, self).__init__()
        self.batch_sz = batch_sz
        self.dec_units = dec_units
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.gru = tf.keras.layers.GRU(self.dec_units,
                                       return_sequences=True,
                                       return_state=True,
                                       recurrent_initializer='glorot_uniform')
        self.fc = tf.keras.layers.Dense(vocab_size)
        # used for attention
        self.attention = BahdanauAttention(self.dec_units)

    def call(self, x, hidden, enc_output):
        # enc_output shape == (batch_size, max_length, hidden_size)
        context_vector, attention_weights = self.attention(hidden, enc_output)
        # x shape after passing through embedding == (batch_size, 1, embedding_dim)
        x = self.embedding(x)
        # x shape after concatenation == (batch_size, 1, embedding_dim + hidden_size)
        x = tf.concat([tf.expand_dims(context_vector, 1), x], axis=-1)
        # passing the concatenated vector to the GRU
        output, state = self.gru(x)
        # output shape == (batch_size * 1, hidden_size)
        output = tf.reshape(output, (-1, output.shape[2]))
        # output shape == (batch_size, vocab)
        x = self.fc(output)
        return x, state, attention_weights
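The `Decoder` above depends on a `BahdanauAttention` layer that is not shown in this post. For readers who want to follow the shapes, here is a minimal sketch of an additive (Bahdanau-style) attention layer. It is an assumption about what that class looks like, not code taken from the post: the sublayer names `W1`, `W2`, `V` and the toy sizes (batch 4, 7 time steps, hidden size 16) are made up for illustration.

```python
import tensorflow as tf


class BahdanauAttention(tf.keras.layers.Layer):
    """Additive attention (illustrative sketch, not the post's own code)."""

    def __init__(self, units):
        super().__init__()
        self.W1 = tf.keras.layers.Dense(units)  # projects the query (decoder-side state)
        self.W2 = tf.keras.layers.Dense(units)  # projects the encoder outputs
        self.V = tf.keras.layers.Dense(1)       # reduces each time step to a scalar score

    def call(self, query, values):
        # query: (batch, hidden) -> (batch, 1, hidden) so it broadcasts over time
        query_with_time_axis = tf.expand_dims(query, 1)
        # score: (batch, max_length, 1)
        score = self.V(tf.nn.tanh(self.W1(query_with_time_axis) + self.W2(values)))
        # normalise over the time axis so the weights sum to 1 per example
        attention_weights = tf.nn.softmax(score, axis=1)
        # weighted sum of encoder outputs: (batch, hidden)
        context_vector = tf.reduce_sum(attention_weights * values, axis=1)
        return context_vector, attention_weights


# Toy inputs standing in for `hidden` and `enc_output` in Decoder.call
attention = BahdanauAttention(units=10)
hidden = tf.random.normal((4, 16))
enc_output = tf.random.normal((4, 7, 16))
context_vector, attention_weights = attention(hidden, enc_output)
print(context_vector.shape, attention_weights.shape)
```

Under these assumptions, `context_vector` is computed from both `hidden` and `enc_output`, and it is this vector that `Decoder.call` concatenates onto the embedded input before the GRU.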
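What I don't understand is that the decoder's GRU cell does not appear to be connected to the encoder: it is never initialized with the encoder's last hidden state.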
output, state = self.gru(x)
# Why is it not initialized with the hidden state of the encoder?
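As I understand it, the encoder and decoder are only connected if the decoder is initialized with the "thought vector", i.e. the encoder's last hidden state.
Why is this missing from the official Tensorflow tutorial? Is it a bug, or am I missing something here?
Could someone help me understand this?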
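To make the question concrete, here is a minimal sketch of the two calls being compared: the GRU called without a starting state, as in the decoder, and the same GRU seeded through the `initial_state` argument of the Keras layer call. The tensors `x` and `enc_hidden` are random stand-ins with made-up sizes, not values from the tutorial.

```python
import tensorflow as tf

batch_size, embedding_dim, units = 4, 8, 16

gru = tf.keras.layers.GRU(units,
                          return_sequences=True,
                          return_state=True,
                          recurrent_initializer='glorot_uniform')

# Hypothetical stand-ins: one decoder time step (context vector + embedding)
# and an encoder final hidden state.
x = tf.random.normal((batch_size, 1, embedding_dim + units))
enc_hidden = tf.random.normal((batch_size, units))

# As written in the decoder: no initial_state, so the GRU starts from zeros.
out_zero, state_zero = gru(x)

# Passing zeros explicitly should give the same result as omitting the argument.
_, state_explicit_zero = gru(x, initial_state=tf.zeros((batch_size, units)))

# What the question expects: seed the GRU with the encoder's last hidden state.
out_init, state_init = gru(x, initial_state=enc_hidden)

print(out_zero.shape, state_zero.shape, state_init.shape)
```

Note that in the decoder code above, `hidden` and `enc_output` are still consumed by `self.attention`, and the resulting `context_vector` is concatenated into `x` before the GRU call. Encoder information therefore does reach the GRU through its input, even though `initial_state` is not set.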
【Discussion】:
Tags: tensorflow lstm machine-translation encoder-decoder gated-recurrent-unit