【Question title】: Calculate pixelwise distance of 3D tensor in tensorflow?
【Posted】: 2019-11-27 12:49:40
【Question description】:

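I am trying to create a 3D distance map (size: W * H * D) in TensorFlow to use in a loss function for training. I have a ground truth (a binary volume of size W * H * D) from which I want to build the distance map: the value of each pixel in the distance map should be the minimum distance from that pixel to the positive-valued shape (i.e. pixel = 1) in the ground truth. I am running into shape problems in 3D, because the L2 norm reduces an axis and leaves me with a 2D shape, and I also need the whole thing to be fully differentiable. Any advice or pointers would be much appreciated.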

【Question discussion】:

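Can you give an example of what you need? And, if possible, the kind of thing you have tried? I'm not sure I understand the input and output of the problem. Is the ground truth a 3D binary tensor (made of only 1s and 0s), or something else? And you want to compute, for a volume of the same size, the distance to the closest 1 in the ground truth?
@jdehesa Sure. I have a 3D volume of shape (112,112,112), a ground truth mask where structure 1 = 1, structure 2 = 2 and background = 0. I take the ground truth mask, keep only structure 2 (by thresholding) and then invert it (so now background = 1 and foreground = 0). I am now trying to generate a Euclidean distance transform in TensorFlow, like scipy.ndimage.morphology.distance_transform_edt. TF has something like this for the 2D case, but it is hard to convert to 3D. What I want is a 3D tensor with high values at positions far from pixels with value 0 and lower values at positions close to pixels with value 0, to use in a new loss function.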

Tags: python tensorflow heatmap euclidean-distance

【Solution 1】:

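If I understand correctly, you want to compute the distance from each position in the volume to the closest position of a given class. For simplicity, I'll assume the class of interest is marked with 1; if it is different, hopefully you can adapt the code to your case. The code is written for TensorFlow 2.0, but it should work for 1.x too.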

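The simplest way to do this is to compute the distance between every coordinate in the volume and every coordinate holding a 1, and then pick the smallest distance for each position. You can do that like this: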

import tensorflow as tf

# Make input data
# (a small volume, so the printed arrays stay readable)
w, h, d = 2, 3, 4
t = tf.random.stateless_uniform([w, h, d], (0, 0), 0, 2, tf.int32)
print(t.numpy())
# [[[0 1 0 0]
#   [0 0 0 0]
#   [1 1 0 1]]
#
#  [[1 0 0 0]
#   [0 0 0 0]
#   [1 1 0 0]]]
# Make coordinates
coords = tf.meshgrid(tf.range(w), tf.range(h), tf.range(d), indexing='ij')
coords = tf.stack(coords, axis=-1)
# Find coordinates that are positive
m = t > 0
coords_pos = tf.boolean_mask(coords, m)
# Find every pairwise distance
vec_d = tf.reshape(coords, [-1, 1, 3]) - coords_pos
# You may choose a different precision type here
dists = tf.linalg.norm(tf.dtypes.cast(vec_d, tf.float32), axis=-1)
# Find minimum distances
min_dists = tf.reduce_min(dists, axis=-1)
# Reshape
out = tf.reshape(min_dists, [w, h, d])
print(out.numpy().round(3))
# [[[1.    0.    1.    2.   ]
#   [1.    1.    1.414 1.   ]
#   [0.    0.    1.    0.   ]]
#
#  [[0.    1.    1.414 2.236]
#   [1.    1.    1.414 1.414]
#   [0.    0.    1.    1.   ]]]
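
As a cross-check that does not depend on TensorFlow, the same brute-force computation can be sketched in NumPy. This is only an illustration under the same assumptions as above (class of interest marked with values greater than 0, at least one positive voxel present); the function name `min_dist_to_positive` is made up for this example and is not part of any library.

```python
import numpy as np

def min_dist_to_positive(t):
    """Brute-force distance from every voxel to the nearest voxel with t > 0.

    Assumes t contains at least one positive voxel.
    """
    t = np.asarray(t)
    # Coordinates of every voxel, shape (N, 3), in row-major order
    coords = np.stack(np.indices(t.shape), axis=-1).reshape(-1, t.ndim)
    # Coordinates of the positive voxels, shape (P, 3)
    coords_pos = np.argwhere(t > 0)
    # All pairwise differences, shape (N, P, 3); this is the memory-hungry step
    vec_d = coords[:, None, :] - coords_pos[None, :, :]
    dists = np.linalg.norm(vec_d.astype(np.float32), axis=-1)
    # Minimum over the positive voxels, reshaped back to the volume
    return dists.min(axis=-1).reshape(t.shape)

# The 2 x 3 x 4 example volume printed in the snippet above
t = np.array([[[0, 1, 0, 0],
               [0, 0, 0, 0],
               [1, 1, 0, 1]],
              [[1, 0, 0, 0],
               [0, 0, 0, 0],
               [1, 1, 0, 0]]])
out = min_dist_to_positive(t)
print(out.round(3))
```

If SciPy is available, `scipy.ndimage.distance_transform_edt(t == 0)` should agree with this result, since it measures the distance from each nonzero element of its input to the nearest zero element.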

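The brute-force approach above may be good enough for you, although it is probably not the most efficient solution. The smartest thing would be to search the neighbourhood of each position for the closest positive position, but that is complicated to do efficiently in general, and even more so in a vectorized way in TensorFlow. There are, however, a couple of ways to improve the code above. On the one hand, we know that positions holding a 1 always have zero distance, so there is no need to compute those. On the other hand, if the 1 class in the 3D volume represents some kind of dense shape, we can save time by only computing distances to the surface of that shape: every other positive position is necessarily farther away from any position outside the shape. So we can do the same thing as before, but only compute distances from non-positive positions to positive surface positions. You can do that like this: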

import tensorflow as tf

# Make input data
# (a small volume, so the printed arrays stay readable)
w, h, d = 2, 3, 4
t = tf.dtypes.cast(tf.random.stateless_uniform([w, h, d], (0, 0)) > .15, tf.int32)
print(t.numpy())
# [[[1 1 1 1]
#   [1 1 1 1]
#   [1 1 0 0]]
# 
#  [[1 1 1 1]
#   [1 1 1 1]
#   [1 1 1 1]]]
# Find coordinates that are positive and on the surface
# (positive positions with at least one neighbouring 0)
t_pad_z = tf.pad(t, [(1, 1), (1, 1), (1, 1)]) <= 0
m_pos = t > 0
m_surround_z = tf.zeros_like(m_pos)
# Go through the 6 surrounding positions
for i in range(3):
    for s in [slice(None, -2), slice(2, None)]:
        slices = tuple(slice(1, -1) if i != j else s for j in range(3))
        m_surround_z |= t_pad_z[slices]
# Surface points are positive points surrounded by some zero
m_surf = m_pos & m_surround_z
coords_surf = tf.where(m_surf)
# Find coordinates that are zero
coords_z = tf.where(~m_pos)
# Find every pairwise distance
vec_d = tf.reshape(coords_z, [-1, 1, 3]) - coords_surf
dists = tf.linalg.norm(tf.dtypes.cast(vec_d, tf.float32), axis=-1)
# Find minimum distances
min_dists = tf.reduce_min(dists, axis=-1)
# Put minimum distances in output array
out = tf.scatter_nd(coords_z, min_dists, [w, h, d])
print(out.numpy().round(3))
# [[[0. 0. 0. 0.]
#   [0. 0. 0. 0.]
#   [0. 0. 1. 1.]]
#
#  [[0. 0. 0. 0.]
#   [0. 0. 0. 0.]
#   [0. 0. 0. 0.]]]

EDIT: Here is one way to split the distance computation into chunks using a TensorFlow loop:

# Following from before
coords_surf = ...
coords_z = ...
CHUNK_SIZE = 1_000 # Choose chunk size
dtype = tf.float32
# If using TF 2.x you can know in advance the size of the tensor array
# (although the element shape will not be constant due to the last chunk)
num_z = tf.shape(coords_z)[0]
arr = tf.TensorArray(dtype, size=(num_z - 1) // CHUNK_SIZE + 1, element_shape=[None], infer_shape=False)
_, arr = tf.while_loop(lambda i, arr: i < num_z,
                       lambda i, arr: (i + CHUNK_SIZE, arr.write(i // CHUNK_SIZE,
                           tf.reduce_min(tf.linalg.norm(tf.dtypes.cast(
                               tf.reshape(coords_z[i:i + CHUNK_SIZE], [-1, 1, 3]) - coords_surf,
                           dtype), axis=-1), axis=-1))),
                       [tf.constant(0, tf.int32), arr])
min_dists = arr.concat()
out = tf.scatter_nd(coords_z, min_dists, [w, h, d])
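
To get a feeling for why chunking helps and how to pick `CHUNK_SIZE`, a back-of-the-envelope estimate of the two large intermediates (`vec_d` and `dists`) can be sketched in plain Python. The helper `pairwise_bytes` and the voxel counts below are hypothetical and only for illustration; the estimate assumes int64 coordinates (which is what `tf.where` returns) and float32 distances, and it ignores the float32 copy of `vec_d` created by the cast, so it is a lower bound.

```python
def pairwise_bytes(num_z, num_surf, coord_bytes=8, dist_bytes=4):
    """Rough lower bound on the memory held by vec_d and dists, in bytes."""
    vec_d = num_z * num_surf * 3 * coord_bytes  # [num_z, num_surf, 3] differences
    dists = num_z * num_surf * dist_bytes       # [num_z, num_surf] distances
    return vec_d + dists

# Hypothetical sizes for a (112, 112, 112) volume, for illustration only
num_z = 112 ** 3 - 20_000   # zero voxels
num_surf = 5_000            # positive surface voxels
CHUNK_SIZE = 1_000

full = pairwise_bytes(num_z, num_surf)
chunk = pairwise_bytes(CHUNK_SIZE, num_surf)
print(f"all at once: {full / 2**30:.1f} GiB, per chunk: {chunk / 2**20:.1f} MiB")
```

With these made-up numbers the unchunked intermediates would need well over a hundred GiB, while a chunk of 1,000 positions stays in the low hundreds of MiB, so the chunk size can be tuned to whatever the device can hold.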

【Discussion】:

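I'm actually not sure whether, for the second snippet, I need to check for surrounding zeros in 26 ways (with diagonals) or in 6 ways (without diagonals).
Thank you so much for your help, it looks great and I really appreciate it. I will try to implement it today or tomorrow. The structure is very central but small, so I think the 26-way check would make sure the boundary is found? I have attached a snapshot of the structure to the original question.
@HelenaWilliams Yes, I think 26-way should work. What I was wondering is whether checking only 6-way (which yields fewer surface points and therefore fewer operations later) is okay, i.e. whether there is any case where a white pixel that has a black pixel on a diagonal (but not in its 6-way neighbourhood) could be the closest white pixel to some black pixel. In that case checking only 6-way would be wrong, because I would not count that white pixel as "surface".
So far my only problem is that my input tensor has shape (300,400,500) (it will be (112,112,112) during training because of sampling), which is still a very large tensor. So when I run the boolean mask / stack functions, most of them fail with resource (out-of-memory) errors. I am now trying to see whether I can use tf.where() to get the positions of the pixels, but then I can no longer reshape to w h d.
@HelenaWilliams You are right, tf.where is the right function. Also: 1) I had a mistake where coords_z was defined with ~m_surf instead of ~m_pos, which gave correct results but more computation; 2) thinking more about it, I am pretty sure the 6-way neighbour check is fine, which should reduce the computation too. I have updated the code accordingly (the neighbour check is a bit ugly now). Even so, this is still quite likely to take too much memory, especially vec_d and dists. One possibility is to do the computation in chunks with tf.while_loop, although that is a bit of a pain.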
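
The 6-way versus 26-way question can also be probed empirically. A short argument suggests 6-way is enough: starting from the nearest positive voxel and stepping one voxel towards the zero voxel along any axis where the two differ lands on a strictly closer voxel, which therefore cannot be positive, so the nearest positive voxel always has a zero among its 6 face neighbours. The NumPy sketch below checks this on random volumes; it is an illustration only, and the helper names `min_dists` and `surface_mask_6` are made up for this example.

```python
import numpy as np

def min_dists(query, targets):
    """Minimum Euclidean distance from each query point to any target point."""
    diff = query[:, None, :] - targets[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=-1)

def surface_mask_6(m_pos):
    """Positive voxels with at least one zero among their 6 face neighbours.

    Voxels outside the volume count as zero, as in the padded TensorFlow code.
    """
    zero_pad = np.pad(~m_pos, 1, constant_values=True)
    near_zero = np.zeros_like(m_pos)
    for axis in range(m_pos.ndim):
        for s in (slice(None, -2), slice(2, None)):
            idx = tuple(s if j == axis else slice(1, -1)
                        for j in range(m_pos.ndim))
            near_zero |= zero_pad[idx]
    return m_pos & near_zero

rng = np.random.default_rng(0)
for _ in range(20):
    # Random binary volume with roughly 60% positive voxels
    m_pos = rng.random((6, 7, 8)) < 0.6
    coords_z = np.argwhere(~m_pos)
    # Distances to all positive voxels versus 6-way surface voxels only
    d_all = min_dists(coords_z, np.argwhere(m_pos))
    d_surf = min_dists(coords_z, np.argwhere(surface_mask_6(m_pos)))
    assert np.array_equal(d_all, d_surf)
print("6-way surface voxels gave the same distances on all sampled volumes")
```

A random test like this is not a proof, but together with the argument above it supports using the cheaper 6-way check.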