[Posted]: 2021-02-21 20:15:06
[Question]:
Background
According to the TensorFlow documentation, a custom training step can be performed as follows:
# Fake sample data for testing
x_batch_train = tf.zeros([32, 3, 1], dtype="float32")
y_batch_train = tf.zeros([32], dtype="float32")

loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
with tf.GradientTape() as tape:
    logits = model(x_batch_train, training=True)
    loss_value = loss_fn(y_batch_train, logits)
grads = tape.gradient(loss_value, model.trainable_weights)
optimizer.apply_gradients(zip(grads, model.trainable_weights))
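The snippet assumes `model` and `optimizer` already exist. A minimal runnable sketch of that setup (the three-layer `Sequential` model and the `Adam` optimizer here are hypothetical stand-ins, not from the question) could look like:

```python
import tensorflow as tf
from tensorflow import keras

# Hypothetical model: flattens the (3, 1) input and emits 10 class logits.
model = keras.Sequential([
    keras.layers.Input(shape=(3, 1)),
    keras.layers.Flatten(),
    keras.layers.Dense(10),  # raw logits; no softmax, since from_logits=True
])
optimizer = keras.optimizers.Adam()

# Fake sample data, as in the question
x_batch_train = tf.zeros([32, 3, 1], dtype="float32")
y_batch_train = tf.zeros([32], dtype="float32")

loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
with tf.GradientTape() as tape:
    logits = model(x_batch_train, training=True)
    loss_value = loss_fn(y_batch_train, logits)
grads = tape.gradient(loss_value, model.trainable_weights)
optimizer.apply_gradients(zip(grads, model.trainable_weights))
```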
However, if I want to use a different loss function, such as categorical crossentropy, I would need to take the argmax of the logits created inside the gradient tape:
loss_fn = tf.keras.losses.get("categorical_crossentropy")
with tf.GradientTape() as tape:
    logits = model(x_batch_train, training=True)
    prediction = tf.cast(tf.argmax(logits, axis=-1), y_batch_train.dtype)
    loss_value = loss_fn(y_batch_train, prediction)
grads = tape.gradient(loss_value, model.trainable_weights)
optimizer.apply_gradients(zip(grads, model.trainable_weights))
Problem
The problem is that tf.argmax is not differentiable, so TensorFlow cannot compute the gradients and you get the error:
ValueError: No gradients provided for any variable: [...]
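The missing gradient can be reproduced in isolation: `tf.argmax` is piecewise constant, so no gradient flows through it, while a `softmax` over the same logits stays differentiable. A minimal check, independent of any particular model:

```python
import tensorflow as tf

logits = tf.Variable([[2.0, 1.0, 0.5]])

# argmax path: piecewise constant, so no gradient is defined
with tf.GradientTape() as tape:
    hard = tf.cast(tf.argmax(logits, axis=-1), tf.float32)
hard_grad = tape.gradient(hard, logits)
print(hard_grad)  # None

# softmax path: smooth, so a gradient exists
with tf.GradientTape() as tape:
    soft = tf.nn.softmax(logits, axis=-1)
soft_grad = tape.gradient(soft, logits)
print(soft_grad is not None)  # True
```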
My question: how can I make the second example work without changing the loss function?
[Discussion]:
- I recommend this answer for softargmax: stackoverflow.com/a/54294985/10202807
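The linked answer points at soft-argmax: replace the hard argmax with a softmax-weighted average of the class indices, which approaches argmax as the sharpness factor grows while remaining differentiable. A sketch of that idea (the `soft_argmax` helper and the `beta=10.0` value are illustrative choices, not from the question):

```python
import tensorflow as tf

def soft_argmax(logits, beta=10.0):
    """Differentiable approximation of argmax along the last axis."""
    # Sharpened softmax: larger beta pushes the weights toward one-hot.
    weights = tf.nn.softmax(beta * logits, axis=-1)
    # Expected index under the softened distribution.
    indices = tf.range(tf.shape(logits)[-1], dtype=logits.dtype)
    return tf.reduce_sum(weights * indices, axis=-1)

logits = tf.Variable([[0.1, 2.0, 0.3]])
with tf.GradientTape() as tape:
    pred = soft_argmax(logits)
grad = tape.gradient(pred, logits)
# pred is close to 1.0 (the true argmax index) and grad is not None,
# so this prediction can be fed to a loss without killing the gradients.
```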
Tags: python tensorflow machine-learning keras deep-learning