Posted: 2021-06-13 16:47:49
Problem description:
I have been implementing a sentence VAE in TF-Keras (latest version). The custom function below computes the VAE loss for sparse categorical outputs.
def vae_loss(encoder_inputs, decoder_outputs):
    # Keras invokes a compiled loss as loss(y_true, y_pred), so these
    # arguments actually receive seq_lab and the decoder's output.
    sen_loss = K.sparse_categorical_crossentropy(encoder_inputs, decoder_outputs, from_logits=True)
    sen_loss = K.sum(sen_loss, axis=-1)  # sum per-token losses over each sequence
    # z_mean and z_log_sigma are symbolic tensors captured from the model graph
    kl_loss = -0.5 * K.mean(1 + z_log_sigma - K.square(z_mean) - K.exp(z_log_sigma))
    loss = K.mean(sen_loss + kl_loss)
    return loss
optimizer = keras.optimizers.Adam(learning_rate=0.01)
model.compile(optimizer=optimizer, loss=vae_loss)
model.fit([seq_in, seq_out],
          seq_lab,
          batch_size=batch_size,
          epochs=epochs,
          validation_split=0.1)
# seq_in shape  = (no_of_samples, maxlen)
# seq_out shape = (no_of_samples, maxlen)
# seq_lab shape = (no_of_samples, maxlen, 1)
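The shapes flowing through `K.sparse_categorical_crossentropy` can be checked in isolation. A toy sketch (dimensions are illustrative, not the asker's actual data):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K

# y_true: integer labels, shape (batch, maxlen)
# y_pred: logits, shape (batch, maxlen, vocab_size)
y_true = np.array([[1, 2], [0, 3]])
y_pred = tf.random.normal((2, 2, 5))

per_token = K.sparse_categorical_crossentropy(y_true, y_pred, from_logits=True)
per_sentence = K.sum(per_token, axis=-1)

print(tuple(per_token.shape))     # (2, 2): one loss value per token
print(tuple(per_sentence.shape))  # (2,): summed over the sequence
```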
When I try to train, it raises:
TypeError: Cannot convert a symbolic Keras input/output to a numpy array. This error may indicate that you're trying to pass a symbolic value to a NumPy call, which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.
I then disabled eager execution to fall back to graph mode, but that raised:
FailedPreconditionError: 2 root error(s) found.
(0) Failed precondition: Could not find variable
training_6/Adam/embedding_16/embeddings/v.
This could mean that the variable has been deleted. In TF1, it can also mean the
variable is uninitialized. Debug info: container=localhost, status=Not found:
Resource localhost/training_6/Adam/embedding_16/embeddings/v/N10tensorflow3VarE does
not exist.
[[{{node training_6/Adam/Adam/update_embedding_16/embeddings/ReadVariableOp_3}}]]
[[_arg_keras_learning_phase_0_3/_722]]
(1) Failed precondition: Could not find variable
training_6/Adam/embedding_16/embeddings/v.
This could mean that the variable has been deleted. In TF1, it can also mean the
variable is uninitialized.
Debug info: container=localhost, status=Not found: Resource
localhost/training_6/Adam/embedding_16/embeddings/v/N10tensorflow3VarE does not
exist.
[[{{node training_6/Adam/Adam/update_embedding_16/embeddings/ReadVariableOp_3}}]]
0 successful operations.
0 derived errors ignored.
The model code is as follows:
#################################### ENCODER LAYER ################################
encoder_inputs = Input(shape=(maxlen,))
# Encoder embedding
enc_emb = Embedding(vocab_size, embedding_dim,
                    trainable=True)(encoder_inputs)
# Encoder LSTM
encoder_lstm3 = LSTM(latent_dim)
encoder_outputs = encoder_lstm3(enc_emb)
#################################### VAE Z LAYER ################################
z_mean = Dense(units=latent_dim)(encoder_outputs)
z_log_sigma = Dense(units=latent_dim)(encoder_outputs)
def sampling(args):
    z_mean, z_log_sigma = args
    epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0., stddev=1.0)
    return z_mean + z_log_sigma * epsilon

z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_sigma])
expandz_h = Dense(latent_dim)
z_exp_h = expandz_h(z)
expandz_c = Dense(latent_dim)
z_exp_c = expandz_c(z)
#################################### DECODER LAYER ################################
# Set up the decoder, using z layer outputs as the initial state
decoder_inputs = Input(shape=(maxlen, ))
# Embedding layer
dec_emb = Embedding(vocab_size, embedding_dim, trainable=True)(decoder_inputs)
# Decoder LSTM
decoder_lstm = LSTM(latent_dim, return_sequences=True,
                    return_state=True, dropout=0.4,
                    recurrent_dropout=0.0)
(decoder_outputs, decoder_fwd_state, decoder_back_state) = \
    decoder_lstm(dec_emb, initial_state=[z_exp_h, z_exp_c])
# Dense layer
decoder_dense = TimeDistributed(Dense(vocab_size, activation='softmax'))
decoder_outputs = decoder_dense(decoder_outputs)
# Define the model
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.summary()
I believe the problem lies in how the loss function is implemented in tf-keras, but I may be wrong; any guidance would be much appreciated.
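Not part of the original question, but for comparison: the symbolic-tensor error typically goes away if the KL term is registered from inside a custom layer via `self.add_loss`, so that `compile()` only receives a standard reconstruction loss and no symbolic tensors are captured by a closure. A minimal self-contained sketch under that assumption (toy dimensions and a simplified architecture, not the asker's actual model):

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

maxlen, vocab_size, embedding_dim, latent_dim = 10, 50, 16, 8

class Sampling(layers.Layer):
    """Reparameterization trick; also registers the KL divergence."""
    def call(self, inputs):
        z_mean, z_log_sigma = inputs
        # Use the dynamic batch size rather than a hard-coded one
        epsilon = tf.random.normal(shape=tf.shape(z_mean))
        kl = -0.5 * tf.reduce_mean(
            1 + z_log_sigma - tf.square(z_mean) - tf.exp(z_log_sigma))
        self.add_loss(kl)  # picked up automatically during fit()
        return z_mean + tf.exp(0.5 * z_log_sigma) * epsilon

# Encoder
encoder_inputs = keras.Input(shape=(maxlen,))
h = layers.LSTM(latent_dim)(
    layers.Embedding(vocab_size, embedding_dim)(encoder_inputs))
z_mean = layers.Dense(latent_dim)(h)
z_log_sigma = layers.Dense(latent_dim)(h)
z = Sampling()([z_mean, z_log_sigma])

# Decoder, seeded with z as the initial LSTM state
decoder_inputs = keras.Input(shape=(maxlen,))
d = layers.Embedding(vocab_size, embedding_dim)(decoder_inputs)
d = layers.LSTM(latent_dim, return_sequences=True)(d, initial_state=[z, z])
decoder_outputs = layers.TimeDistributed(
    layers.Dense(vocab_size, activation='softmax'))(d)

model = keras.Model([encoder_inputs, decoder_inputs], decoder_outputs)
# Only the reconstruction loss goes into compile(); the KL term
# was already attached through the Sampling layer.
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# Smoke test on random integer sequences
seq_in = np.random.randint(0, vocab_size, size=(4, maxlen))
seq_out = np.random.randint(0, vocab_size, size=(4, maxlen))
seq_lab = seq_out[..., np.newaxis]
history = model.fit([seq_in, seq_out], seq_lab, epochs=1, verbose=0)
```

The key difference from the question's setup is that nothing symbolic (`z_mean`, `z_log_sigma`) is referenced from inside a compiled loss function.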
Tags: keras nlp tensorflow2.0 autoencoder seq2seq