Compute the mean of each tensor row in TensorFlow

【Question title】: Compute the mean for each tensor's row in TensorFlow
【Posted】: 2019-01-05 20:55:35
【Question description】:

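I'm new to TensorFlow and want to compute the mean of each row of a tensor. TensorFlow provides tf.reduce_mean for this. The problem is that when a row contains a NaN value, the mean of that row is also NaN. Beyond that, I'd like to implement it myself to better understand the TensorFlow philosophy. How can I implement it manually? The code I wrote: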

import tensorflow as tf
import numpy as np

ratings = np.array([[7, 6, 7, 4, 5, 4], [6, 7, np.NaN, 4, 3, 4], [np.NaN, 3, 3, 1, 1, np.NaN],
                   [1, 2, 2, 3, 3, 4], [1, np.NaN, 1, 2, 3, 3]], dtype = np.float16)

tRatings = tf.convert_to_tensor(ratings, dtype = np.float16)

means = tf.get_variable("means", shape=(5), dtype=tf.float16)


with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    mean = tf.reduce_mean(tRatings, axis=1)
    print(sess.run(mean))

【Question comments】:

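There are two approaches: define the mean in NumPy and call that NumPy function through tf.py_func, or define it in TensorFlow itself, replacing NaN with 0 as needed. You can replace NaN with zeros using tRatings=tf.where(tf.is_nan(tRatings), tf.zeros_like(tRatings), tRatings).
If I replace the NaN values with zeros, I get a wrong mean. I want to compute the mean from the values that actually exist in each row.
You can count the NaNs with tf.is_nan. Then, before dividing, subtract the number of NaNs from the row length.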
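
The arithmetic behind the last two comments can be checked without TensorFlow. The following is a minimal pure-Python sketch using the ratings from the question; the helper names `zero_filled_mean` and `nan_aware_mean` are made up for this illustration. It shows why zero-filling alone gives a wrong mean and how subtracting the NaN count from the divisor fixes it.

```python
import math

nan = float("nan")
ratings = [
    [7, 6, 7, 4, 5, 4],
    [6, 7, nan, 4, 3, 4],
    [nan, 3, 3, 1, 1, nan],
    [1, 2, 2, 3, 3, 4],
    [1, nan, 1, 2, 3, 3],
]

def zero_filled_mean(row):
    # Replace NaN with 0 but still divide by the full row length.
    total = sum(0.0 if math.isnan(v) else v for v in row)
    return total / len(row)

def nan_aware_mean(row):
    # Same zero-filled sum, but divide by the number of values actually present.
    # Note: a row made only of NaN would raise ZeroDivisionError here.
    total = sum(0.0 if math.isnan(v) else v for v in row)
    nan_count = sum(1 for v in row if math.isnan(v))
    return total / (len(row) - nan_count)

naive_means = [zero_filled_mean(r) for r in ratings]
row_means = [nan_aware_mean(r) for r in ratings]
print(naive_means)
print(row_means)
```

For the second row the zero-filled sum is 24. Dividing by all 6 entries gives 4.0, while dividing by the 5 entries that are present gives 4.8, which is the mean the asker wants.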

Tags: python tensorflow

【Solution 1】:

import tensorflow as tf
import numpy as np
ratings = np.array([[7, 6, 7, 4, 5, 4], [6, 7, np.NaN, 4, 3, 4], [np.NaN, 3, 3, 1, 1, np.NaN],
                       [1, 2, 2, 3, 3, 4], [1, np.NaN, 1, 2, 3, 3]], dtype = np.float16)

tRatings = tf.convert_to_tensor(ratings, dtype = np.float16)
with tf.Session() as sess:
  # mean = tf.reduce_mean(tRatings, axis=1)  # gives NaN for any row that contains NaN
  is_nan = tf.is_nan(tRatings)
  # Replace NaN with 0 so it does not contribute to the row sum.
  tRatings_wonan = tf.where(is_nan, tf.zeros_like(tRatings), tRatings)
  row_sum = tf.reduce_sum(tRatings_wonan, axis=1)
  # Number of NaN entries in each row.
  count_nans = tf.reduce_sum(tf.cast(is_nan, tf.float16), axis=1)
  # Divide by the number of non-NaN entries (row length minus NaN count).
  row_len = tf.cast(tf.shape(tRatings)[1], tf.float16)
  mean = tf.div(row_sum, tf.subtract(row_len, count_nans))
  print(sess.run(mean))
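
The answer above targets the TensorFlow 1.x session API. As a rough sketch only, the same idea could be written for TensorFlow 2.x with eager execution as follows. It assumes TensorFlow 2.x is installed; the function name `nan_row_mean` is made up for this illustration, and float32 is used instead of the original float16 purely to keep the arithmetic simple.

```python
import numpy as np
import tensorflow as tf  # assumption: TensorFlow 2.x with eager execution

def nan_row_mean(x):
    """Mean of each row of a 2-D float tensor, ignoring NaN entries."""
    is_nan = tf.math.is_nan(x)
    # Zero out NaNs so they do not contribute to the row sum.
    filled = tf.where(is_nan, tf.zeros_like(x), x)
    row_sum = tf.reduce_sum(filled, axis=1)
    # Count the entries that are actually present in each row.
    valid_count = tf.reduce_sum(tf.cast(tf.logical_not(is_nan), x.dtype), axis=1)
    # A row consisting only of NaN yields 0/0, i.e. NaN.
    return row_sum / valid_count

ratings = tf.constant(
    [[7, 6, 7, 4, 5, 4],
     [6, 7, np.nan, 4, 3, 4],
     [np.nan, 3, 3, 1, 1, np.nan],
     [1, 2, 2, 3, 3, 4],
     [1, np.nan, 1, 2, 3, 3]],
    dtype=tf.float32,
)

row_means = nan_row_mean(ratings).numpy()
print(row_means)
```

Counting the valid entries directly, rather than subtracting the NaN count from the row length, is only a stylistic variation on the answer's approach; both give the same divisor.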

【Comments】:

Thank you very much for your answer. If I use tf.py_func, will I lose a lot of efficiency?
It depends on the specific implementation: stackoverflow.com/questions/42927920/…
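
For comparison, the NumPy route mentioned in the question comments might look roughly like the sketch below. It assumes TensorFlow 2.x, where tf.numpy_function is the counterpart of the older tf.py_func; the wrapper name `numpy_row_nanmean` is made up for this illustration.

```python
import numpy as np
import tensorflow as tf  # assumption: TensorFlow 2.x; in 1.x, tf.py_func plays this role

def numpy_row_nanmean(a):
    # Plain NumPy: nanmean skips NaN entries when averaging along axis 1.
    return np.nanmean(a, axis=1).astype(np.float32)

ratings = tf.constant(
    [[7, 6, 7, 4, 5, 4],
     [6, 7, np.nan, 4, 3, 4],
     [np.nan, 3, 3, 1, 1, np.nan],
     [1, 2, 2, 3, 3, 4],
     [1, np.nan, 1, 2, 3, 3]],
    dtype=tf.float32,
)

# Wrap the NumPy function as a TensorFlow op; Tout declares the output dtype.
py_means = tf.numpy_function(numpy_row_nanmean, [ratings], tf.float32)
print(py_means.numpy())
```

The wrapped function runs in the Python interpreter and is not serialized with the graph, so any cost relative to native ops is likely to depend on how often it is called and how much data crosses the boundary.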