Getting an all None gradient in my custom loss and gradient code in tensorflow 2.0: answers

【Question title】: Getting an all None gradient in my custom loss and gradient code in tensorflow 2.0
【Posted】: 2019-12-14 10:41:30
【Question】:

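I'm trying to write a very basic loss function in tensorflow 2.0. In short, I have 5 classes that I want to train on with one-hot encoding, without grouping them. I want my model to predict a value for each of the 5 classes for every input. Afterwards, I want to take the two highest values, and if they are 3 or 4 classify the input as "good", otherwise as "bad". Finally, I want my loss to be 1-precision, where a true positive is any of the following cases:

1. the model guesses 3 and the real class is 3
2. the model guesses 3 and the real class is 4
3. the model guesses 4 and the real class is 3
4. the model guesses 4 and the real class is 4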

Again, I know I could relabel the data, but I don't want to do that. I wrote my loss using some of the metrics that already exist. Here it is:

#@tf.function
def my_loss(output,real,threeandfour=1,weights=loss_weights,mod=m):
  m = tf.keras.metrics.TruePositives(thresholds=0.5)
  m.update_state(real,output,sample_weight=weights)
  shape_0=tf.shape(output)[0]
  #shape_1=tf.constant(2,dtype=tf.int32)
  shape_1=2
  halfs=tf.math.multiply(tf.constant(0.5,dtype=tf.float32),tf.ones((shape_0,shape_1),dtype=tf.float32))
  thrsfrs_1=output[:,2:4]
  thrsfrs=tf.cast(thrsfrs_1,dtype=tf.float32)
  logs_1=tf.math.greater(thrsfrs,halfs)
  logs=tf.cast(logs_1,dtype=tf.float32)
  print('shape of log: ',np.shape(logs))
  print('few logs: ',logs,)

  num_of_3_4s_in_model=tf.reduce_sum(logs)
  prec_1=tf.math.divide(m.result(),num_of_3_4s_in_model)
  prec=tf.cast(prec_1,dtype=tf.float32)
  return tf.math.subtract(tf.constant(1,dtype=tf.float32),prec)

The gradient function:

with tf.GradientTape() as tape:
  tape.watch(model.trainable_variables)
  y_=model(X_train)
  print('y_: ',y_)
  loss_value=my_loss(y_,tf_one_hot_train,mod=m,weights=loss_weights)
  #loss_value=tf.cast(loss_value,dtype=tf.float32)
  print('loss_value: ',loss_value)
grads=tape.gradient(loss_value,model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))

It does manage to get a loss value from tensorflow, and the value looks fine. Here are the gradients and the error I get:

got grads
[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]

ValueError                                Traceback (most recent call last)
<ipython-input-370-2f8f4b783a7b> in <module>()
     23 
     24 #optimizer.apply_gradients(zip(grads, model.trainable_variables), global_step)
---> 25 optimizer.apply_gradients(zip(grads, model.trainable_variables))
     26 
     27 #print("Step: {},         Loss: {}".format(global_step.numpy(),

1 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py in _filter_grads(grads_and_vars)
    973   if not filtered:
    974     raise ValueError("No gradients provided for any variable: %s." %
--> 975                      ([v.name for _, v in grads_and_vars],))
    976   if vars_with_empty_grads:
    977     logging.warning(

ValueError: No gradients provided for any variable: ['dense_40/kernel:0', 'dense_40/bias:0', 'dense_41/kernel:0', 'dense_41/bias:0', 'dense_42/kernel:0', 'dense_42/bias:0', 'dense_43/kernel:0', 'dense_43/bias:0', 'dense_44/kernel:0', 'dense_44/bias:0', 'dense_45/kernel:0', 'dense_45/bias:0', 'dense_46/kernel:0', 'dense_46/bias:0', 'dense_47/kernel:0', 'dense_47/bias:0']

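I tried including @tf.function, I tried casting the 2 to an int, and so on. I also tried many other functions (e.g. tf.confusion_matrix), and even no helper functions at all, using just tf.arg_max and the like. Nothing seems to work.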
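One common reason for an all-None gradient list is that every path from the variables to the loss passes through an operation that has no gradient, such as a comparison, tf.argmax, or a tf.keras.metrics object. The sketch below is a minimal reproduction under that assumption. The variable `w`, the input `x`, and the helpers `hard_loss`, `soft_loss`, and `grad_of` are hypothetical stand-ins, not part of the code in the question.

```python
import tensorflow as tf

# Hypothetical stand-ins for a model: one weight matrix and a small batch.
w = tf.Variable(tf.random.normal((4, 5), seed=0))
x = tf.random.normal((8, 4), seed=1)

def hard_loss(scores):
    # The comparison yields booleans; casting them to float carries no gradient.
    hits = tf.cast(tf.math.greater(scores[:, 2:4], 0.5), tf.float32)
    return 1.0 - tf.reduce_mean(hits)

def soft_loss(scores):
    # Uses the scores themselves, so the loss stays differentiable.
    return 1.0 - tf.reduce_mean(scores[:, 2:4])

def grad_of(loss_fn):
    # Record the forward pass, then ask for d(loss)/d(w).
    with tf.GradientTape() as tape:
        scores = tf.nn.softmax(tf.matmul(x, w), axis=1)
        loss = loss_fn(scores)
    return tape.gradient(loss, w)

hard_grad = grad_of(hard_loss)  # expected to be None
soft_grad = grad_of(soft_loss)  # expected to be a tensor shaped like w
print("hard:", hard_grad)
print("soft:", None if soft_grad is None else soft_grad.shape)
```

If the thresholded version returns None while the unthresholded one returns a tensor, the thresholding step is the likely culprit rather than the tape or the optimizer.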

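I'm adding the version of the loss that stays as close to pure tensorflow as I could manage. The same thing keeps happening. I have used it with tensorflow objects and with numpy objects, and I checked that my inputs are between zero and one, but the gradients are still None. Here is my tensorflow loss: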

#@tf.function
def my_loss(real,output):
  threeandfour=tf.constant(1,dtype=tf.float32)
  #turning real into real classes (opposite of one hot encoding)
  real_classes=tf.argmax(real,axis=1)
  real_classes=tf.cast(real_classes,dtype=tf.float32)
  #tf.print('real_classes: ',real_classes)

  pred_classes=tf.argmax(output,axis=1)
  pred_classes=tf.cast(pred_classes,dtype=tf.float32)
  #tf.print('pred_classes: ',pred_classes)

  #checking how many 3s and 4s there are in both
  good_real=(tf.logical_or(tf.equal(real_classes,3),tf.equal(real_classes,4)))
  good_real=tf.cast(good_real,dtype=tf.float32)
  #tf.print('good_real: ',good_real)

  good_pred=(tf.logical_or(tf.equal(pred_classes,3),tf.equal(pred_classes,4)))
  good_pred=tf.cast(good_pred,dtype=tf.float32)
  #tf.print('good_pred: ',good_pred)

  #which ones do the real and model agree on
  same=tf.math.equal(good_pred,good_real)
  same=tf.cast(same,dtype=tf.float32)
  #print('same: ',same)

  #which ones do they both think are good (3 and 4)
  same_goods=tf.math.multiply(same,good_pred)
  same_goods=tf.cast(same_goods,dtype=tf.float32)
  #print('same goods: ',same_goods)

  #number of ones they both think are good
  num_same_goods=tf.reduce_sum(same_goods)
  num_same_goods=tf.cast(num_same_goods,dtype=tf.float32)
  #print('num_same_goods: ',num_same_goods)

  #number of ones model thinks are good
  num_pred_goods=tf.reduce_sum(good_pred)
  num_pred_goods=tf.cast(num_pred_goods,dtype=tf.float32)
  #print('num_pred_goods: ',num_pred_goods)

  #making sure not to divide by 0
  non_zero_num=tf.math.add(num_pred_goods,tf.constant(0.0001,dtype=tf.float32))
  #precision
  prec=tf.math.divide(num_same_goods,non_zero_num)
  prec=tf.cast(prec,dtype=tf.float32)
  #tf.print('prec: ',prec)
  #1-precision
  one_minus_prec=tf.math.subtract(tf.constant(1,dtype=tf.float32),prec)
  one_minus_prec=tf.cast(one_minus_prec,dtype=tf.float32)

  return one_minus_prec
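Because tf.argmax and tf.equal are piecewise constant in the model output, a loss built only from them has no usable gradient. A common workaround is to train on a differentiable surrogate and keep the hard 1-precision as a reported metric. The sketch below is one possible surrogate, not the asker's method. It assumes `output` holds softmax probabilities and that classes 3 and 4 are the columns at indices 3 and 4 (matching the tf.equal checks above); the name `soft_one_minus_precision` and the toy data are hypothetical.

```python
import tensorflow as tf

def soft_one_minus_precision(real, output, eps=1e-4):
    # real: one-hot labels, output: per-class probabilities, both (batch, 5).
    real = tf.cast(real, tf.float32)
    output = tf.cast(output, tf.float32)
    # Probability mass the model puts on the "good" classes 3 and 4.
    pred_good = tf.reduce_sum(output[:, 3:5], axis=1)
    # 1.0 where the true class is 3 or 4, else 0.0.
    real_good = tf.reduce_sum(real[:, 3:5], axis=1)
    soft_tp = tf.reduce_sum(pred_good * real_good)
    soft_pred_pos = tf.reduce_sum(pred_good)
    # eps guards against division by zero, as in the original loss.
    return 1.0 - soft_tp / (soft_pred_pos + eps)

# Tiny check with hypothetical data: a gradient should now exist.
w = tf.Variable(tf.random.normal((4, 5), seed=0))
x = tf.random.normal((6, 4), seed=1)
labels = tf.one_hot([0, 3, 4, 1, 3, 2], depth=5)

with tf.GradientTape() as tape:
    probs = tf.nn.softmax(tf.matmul(x, w), axis=1)
    loss = soft_one_minus_precision(labels, probs)
grad = tape.gradient(loss, w)
print(float(loss), grad.shape)
```

The surrogate replaces the hard counts with sums of probabilities, so it only approximates precision; the argmax-based value can still be computed separately for monitoring.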

【Comments】:

Tags: tensorflow customization loss-function

【Solution 1】:

I ran into the same problem with tensorflow==2.0.0a0.

Updating to 2.0.0b1 solved it for me:

pip install -U tensorflow==2.0.0b1

【Comments】:

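Thanks for the answer, @alexey. Unfortunately, I'm already using tensorflow 2.0.0 beta-1, and after reinstalling it the way you wrote, I still get all None gradients.
I looked again and noticed another potential source of the problem. Try putting grads = tape.gradient(loss_value,model.trainable_variables) inside the with block.
According to the TF docs, grads = tape.gradient(loss_value,model.trainable_variables) goes outside the with block. I'm trying to figure out what's going on.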
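For a non-persistent tape, calling tape.gradient after the with block is the usual pattern, so moving the call is unlikely to change a None result on its own. A small sketch with a hypothetical variable `v`, comparing both placements:

```python
import tensorflow as tf

v = tf.Variable(3.0)

# Placement 1: tape.gradient called after the with block.
with tf.GradientTape() as tape:
    y = v * v
grad_outside = tape.gradient(y, v)

# Placement 2: tape.gradient called inside the with block.
# This also works for a non-persistent tape.
with tf.GradientTape() as tape:
    y = v * v
    grad_inside = tape.gradient(y, v)

# Both placements give d(v*v)/dv = 2*v for this differentiable loss.
print(float(grad_outside), float(grad_inside))
```

Both calls return the same value here, which suggests looking at the loss itself rather than at where tape.gradient is called.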
Thanks for your help!!
I have the same problem. How can it be solved??