Activation function of tf.math.pow(x, 0.5) leading to NaN losses

【Question title】: Activation function of tf.math.pow(x, 0.5) leading to NaN losses
【Posted】: 2021-08-18 20:37:23
【Question】:

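I'm trying to use a custom square-root activation function for my Keras Sequential model (specifically on the MNIST dataset). When I use tf.math.sqrt(x), training goes smoothly and the model is fairly accurate. However, when I use tf.math.pow(x, 0.5) instead, the model fails to train and the loss is NaN.

I'm really not sure why this happens, since I would have expected the two options to be identical.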

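Square-root function:

import tensorflow as tf

def tfsqrt(x):
    # Signed square root: sqrt(x) for x >= 0, -sqrt(-x) otherwise.
    cond = tf.greater_equal(x, 0)
    return tf.where(cond, tf.math.sqrt(x), -tf.math.sqrt(-x))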

Power function:

def pwsqrt(x):
    # Same signed square root, written with pow(x, 0.5) instead of sqrt(x).
    cond = tf.greater_equal(x, 0)
    return tf.where(cond, tf.math.pow(x, 0.5), -tf.math.pow(-x, 0.5))

If anyone can explain this unexpected behaviour, I'd greatly appreciate it. Thanks!

【Discussion】:

Tags: python tensorflow keras tf.keras activation-function

【Solution 1】:

The functions are correct:

x = tf.Variable([-2.0, -3.0, 0.0, 1.0, 2.0])

y = tfsqrt(x)
print(y)
y = pwsqrt(x)
print(y)

Both functions run fine in Google Colab, so the data may contain some NaN values.
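
One way to rule out bad input data is to count non-finite entries before training. This is a minimal sketch with NumPy; `count_non_finite` is a hypothetical helper, and the small `sample` array only stands in for the real training images.

```python
import numpy as np

def count_non_finite(arr):
    """Return how many entries of arr are NaN or +/-inf."""
    arr = np.asarray(arr, dtype=np.float64)
    return int(np.count_nonzero(~np.isfinite(arr)))

# Stand-in for the real training array (e.g. the MNIST images).
sample = np.array([[0.0, 0.5], [np.nan, 1.0]])
print(count_non_finite(sample))
```

A count of zero on the real inputs and labels would suggest the NaN comes from somewhere else in the pipeline.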

Alternatively, there may be a problem with the model's loss or metrics.
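
The check above only looks at forward values, while a NaN loss during training can also come from the backward pass. The following is a sketch for narrowing that down, not a diagnosis: it uses tf.GradientTape to test whether the gradient of the signed square root is finite for each variant. The helper names `signed_root` and `grads_all_finite` are hypothetical, and the inputs are chosen only for illustration.

```python
import tensorflow as tf

def signed_root(x, root_fn):
    """Signed square root built on root_fn, mirroring tfsqrt / pwsqrt."""
    cond = tf.greater_equal(x, 0)
    return tf.where(cond, root_fn(x), -root_fn(-x))

def grads_all_finite(root_fn, values):
    """True if d(sum(signed_root(x)))/dx is finite at every entry of values."""
    x = tf.constant(values, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)  # constants are not watched automatically
        y = tf.reduce_sum(signed_root(x, root_fn))
    grad = tape.gradient(y, x)
    return bool(tf.reduce_all(tf.math.is_finite(grad)).numpy())

inputs = [-2.0, -3.0, 0.0, 1.0, 2.0]
print("sqrt:", grads_all_finite(tf.math.sqrt, inputs))
print("pow: ", grads_all_finite(lambda t: tf.math.pow(t, 0.5), inputs))
```

If either line prints False, the gradient is non-finite at some of these inputs, which would point at the activation rather than the data. Since the derivative of the square root is unbounded at 0, it may be worth running the check with and without the 0.0 entry.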

【Discussion】: