Embedding a tensor of vectors into a tensor of matrices: answers

【Question title】: Embedding a tensor of vectors into a tensor of matrices
【Posted】: 2018-12-13 20:59:13
【Question】:

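I want to create multiple matrices that are symmetric and have a zero diagonal. An n-by-n matrix of this form needs n*(n-1)/2 parameters to be fully specified. These parameters will be learned later...

In numpy, I can do this by using numpy.triu_indices to compute the indices of the upper triangle, starting at the first diagonal above the main diagonal, and then filling those positions with the supplied parameters, as in the following code snippet: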

import numpy as np

R = np.array([[1,2,1,1,2,1], [1,1,1,1,1,1]]) 

s = R.shape[1]
M = R.shape[0]

iu_r, iu_c = np.triu_indices(s,1)

Q = np.zeros((M,s,s),dtype=float)
Q[:,iu_r,iu_c] = R
Q = Q + np.transpose(Q,(0,2,1))

Output:

[[[0. 1. 2. 1.]
 [1. 0. 1. 2.]
 [2. 1. 0. 1.]
 [1. 2. 1. 0.]]

[[0. 1. 1. 1.]
 [1. 0. 1. 1.]
 [1. 1. 0. 1.]
 [1. 1. 1. 0.]]]
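
Note that the output above is 4-by-4, so s has to be the matrix size (4), whereas R.shape[1] is the number of parameters per matrix (6). A minimal sketch, assuming Python 3.8+ for math.isqrt, that recovers s from the parameter count n by solving n = s*(s-1)/2; the name n and the ValueError check are additions for illustration and are not part of the original post:

```python
import math
import numpy as np

R = np.array([[1, 2, 1, 1, 2, 1], [1, 1, 1, 1, 1, 1]], dtype=float)
M, n = R.shape  # M matrices, n parameters per matrix

# Solve n = s*(s-1)/2 for the matrix size s: s = (1 + sqrt(1 + 8n)) / 2.
s = (1 + math.isqrt(1 + 8 * n)) // 2
if s * (s - 1) // 2 != n:
    raise ValueError("parameter count is not of the form s*(s-1)/2")

# Indices of the strict upper triangle (first diagonal above the main one).
iu_r, iu_c = np.triu_indices(s, 1)

Q = np.zeros((M, s, s), dtype=float)
Q[:, iu_r, iu_c] = R                    # fill the upper triangles
Q = Q + np.transpose(Q, (0, 2, 1))      # mirror into the lower triangles
print(Q)
```

With the R given here, n is 6 and the sketch arrives at s = 4, which matches the shape of the printed output.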

But obviously this cannot be translated directly to tensorflow, because

import tensorflow as tf
import numpy as np

M = 2
s = 4

iu_r, iu_c = np.triu_indices(s,1)

rates = tf.get_variable(shape=(M,s*(s-1)//2), name="R", dtype=float)  # integer division keeps the shape an int

Q = tf.get_variable(shape=(M,s,s), dtype=float, initializer=tf.initializers.zeros, name="Q")
Q = Q[:,iu_r,iu_c].assign(rates)

fails with

TypeError: Tensors in list passed to 'values' of 'Pack' Op have types [int32, int64, int64] that don't all match.

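What is the correct way to define a tensor of matrices from a tensor of vectors in tensorflow?

Edit:

My current solution does the embedding with the scatter_nd function provided by tensorflow, because, unlike fill_triangular, it does not require allocating redundant variables. However, the indices it expects are not compatible with those generated by numpy. For now, hardcoding them as in the following example works: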

import tensorflow as tf
import numpy as np

M = 2
s = 4

iu_r, iu_c = np.triu_indices(s,1)

rates = tf.get_variable(shape=(M,s*(s-1)//2), name="R", dtype=float)  # integer division keeps the shape an int

iupper = [[[0,0,1],[0,0,2],[0,0,3],[0,1,2],[0,1,3],[0,2,3]],[[1,0,1],[1,0,2],[1,0,3],[1,1,2],[1,1,3],[1,2,3]]]
Q = tf.scatter_nd(iupper,rates,shape=(M,s,s), name="rate_matrix")

Translating the indices obtained from

iu_r, iu_c = np.triu_indices(s,1)

into this format should be no problem, but perhaps someone has a more elegant solution?
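
One possible way to do that translation, sketched in plain NumPy: pair every (row, col) from np.triu_indices with a batch index so that each parameter gets a (batch, row, col) triple. The names n and batch are introduced here for illustration, and the final scatter is emulated with NumPy indexing only to check the indices; the assumption is that the resulting array can be passed as the indices argument of tf.scatter_nd in place of the hardcoded list.

```python
import numpy as np

M = 2  # number of matrices
s = 4  # matrix size

iu_r, iu_c = np.triu_indices(s, 1)
n = iu_r.size  # s*(s-1)/2 parameters per matrix

# Batch index for every (matrix, parameter) pair, shape (M, n).
batch = np.repeat(np.arange(M), n).reshape(M, n)

# Stack (batch, row, col) triples along the last axis, shape (M, n, 3).
iupper = np.stack(
    [batch, np.tile(iu_r, (M, 1)), np.tile(iu_c, (M, 1))], axis=-1
).astype(np.int32)

# NumPy stand-in for the scatter, to check that the triples address
# the intended cells of an (M, s, s) array.
R = np.array([[1, 2, 1, 1, 2, 1], [1, 1, 1, 1, 1, 1]], dtype=float)
Q = np.zeros((M, s, s))
Q[iupper[..., 0], iupper[..., 1], iupper[..., 2]] = R
Q = Q + np.transpose(Q, (0, 2, 1))
```

For M = 2 and s = 4, iupper reproduces the hardcoded index list above entry for entry, and it adapts to other values of M and s without editing the list by hand.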

【Comments】:

Tags: python-3.x numpy tensorflow

【Answer 1】:

It is not clear to me how this part works:

import numpy as np

R = np.array([[1,2,1,1,2,1], [1,1,1,1,1,1]]) 

s = R.shape[1]
M = R.shape[0]

iu_r, iu_c = np.triu_indices(s,1)

Q = np.zeros((M,s,s),dtype=float)
Q[:,iu_r,iu_c] = R
Q = Q + np.transpose(Q,(0,2,1))

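because it fails with an error: with s = R.shape[1] = 6, np.triu_indices(s,1) yields 15 index pairs, which cannot receive the 6 values in each row of R. You can use simpler code such as the following: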

import numpy as np
R = [1,2,1,1,2,1]
N = 4
Q = np.zeros((N,N),dtype=float)

for i in range(0,N):
  for j in range(0,N):
    if (i<j):
      Q[i][j] = R.pop(0)

Q will be:

[[0. 1. 2. 1.]
 [0. 0. 1. 2.]
 [0. 0. 0. 1.]
 [0. 0. 0. 0.]]
<class 'numpy.ndarray'>

To get a symmetric Q, just use: Q = Q + np.transpose(Q)

Whatever you later do with the rates, you can convert the result to a tensor like this:

import tensorflow as tf
data_tf = tf.convert_to_tensor(Q, np.float32)
sess = tf.InteractiveSession()  
print(data_tf.eval())
sess.close()

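【Comments】:

【Answer 2】:

The other answers suggest using the convert_to_tensor function to convert your numpy array into a TensorFlow tensor.

This does give you matrices with the desired properties: symmetric with a zero diagonal. However, once you start training, those properties may no longer hold, because in general there is no guarantee that weight updates preserve them.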
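
To make that concrete, here is a small NumPy sketch (not TensorFlow, and not part of the original answer) that treats every entry of Q as a free weight and takes one gradient-descent step on an L2 loss toward an arbitrary target T. The seed, step size, and target are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
s = 4

# Start from a symmetric matrix with a zero diagonal.
iu = np.triu_indices(s, 1)
Q = np.zeros((s, s))
Q[iu] = rng.normal(size=len(iu[0]))
Q = Q + Q.T

# One gradient-descent step on loss = 0.5 * sum((Q - T)**2),
# with every entry of Q updated independently.
T = rng.normal(size=(s, s))   # arbitrary, generally non-symmetric target
grad = Q - T                  # d(loss)/dQ
Q_updated = Q - 0.1 * grad

print(np.allclose(Q_updated, Q_updated.T))     # is it still symmetric?
print(np.allclose(np.diag(Q_updated), 0.0))    # is the diagonal still zero?
```

Since Q_updated equals 0.9*Q + 0.1*T, it inherits the asymmetry and the nonzero diagonal of T, so both checks fail for a generic target.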

If you really need the matrices to stay symmetric with a zero diagonal throughout training, you can do the following:

import tensorflow as tf
from tensorflow.contrib.distributions import fill_triangular

M = 2 # batch size
s = 4 # matrix size

rates = tf.get_variable(shape=(M,s*(s+1)//2), name="R", dtype=float)  # integer division keeps the shape an int

# Q will be triangular (with a non-zero diagonal!)
Q = fill_triangular(rates)

# set the diagonal of Q to zero.
Q = tf.linalg.set_diag(Q,tf.zeros((M,s)))

# make Q symmetric
Q = Q + tf.transpose(Q,[0,2,1])

Here is a test verifying that the matrices have the desired properties, even after training:

import numpy as np

# define some arbitrary loss function
Q_target = tf.constant(np.random.normal(size=(1,s,s)).astype(np.float32))
loss = tf.nn.l2_loss(Q-Q_target)

# a single training step (which will update the matrices)
train_step =  tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(loss)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

# this is Q before training
print(sess.run(Q))
#[[[ 0.    -0.564  0.318 -0.446]
#  [-0.564  0.    -0.028  0.2  ]
#  [ 0.318 -0.028  0.     0.369]
#  [-0.446  0.2    0.369  0.   ]]
#
# [[ 0.     0.412  0.216  0.063]
#  [ 0.412  0.     0.221 -0.336]
#  [ 0.216  0.221  0.    -0.653]
#  [ 0.063 -0.336 -0.653  0.   ]]]


# this is Q after training
sess.run(train_step)
print(sess.run(Q))
#[[[ 0.    -0.548  0.235 -0.284]
#  [-0.548  0.    -0.055  0.074]
#  [ 0.235 -0.055  0.     0.25 ]
#  [-0.284  0.074  0.25   0.   ]]
#
# [[ 0.     0.233  0.153  0.123]
#  [ 0.233  0.     0.144 -0.354]
#  [ 0.153  0.144  0.    -0.568]
#  [ 0.123 -0.354 -0.568  0.   ]]]
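
The same idea can be checked with only s*(s-1)/2 parameters per matrix, without the extra diagonal entries that fill_triangular allocates. The following is a NumPy sketch, not the TensorFlow code of this answer: the helper build_Q, the hand-derived gradient, and the seed are assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
M, s = 2, 4
iu_r, iu_c = np.triu_indices(s, 1)
n = iu_r.size  # s*(s-1)/2 free parameters per matrix


def build_Q(R):
    """Embed R of shape (M, n) into symmetric, zero-diagonal matrices of shape (M, s, s)."""
    Q = np.zeros((R.shape[0], s, s))
    Q[:, iu_r, iu_c] = R
    return Q + np.transpose(Q, (0, 2, 1))


R = rng.normal(size=(M, n))
T = rng.normal(size=(1, s, s))  # arbitrary target, broadcast over the batch

# loss = 0.5 * sum((Q - T)**2). Each R[m, k] appears in Q at (i, j) and (j, i),
# so its gradient is the sum of the residuals at those two positions.
resid = build_Q(R) - T
grad_R = resid[:, iu_r, iu_c] + resid[:, iu_c, iu_r]

R_new = R - 0.1 * grad_R  # one gradient-descent step on R only
Q_new = build_Q(R_new)

print(np.allclose(Q_new, np.transpose(Q_new, (0, 2, 1))))  # symmetry check
print(np.allclose(np.einsum("mii->mi", Q_new), 0.0))       # zero-diagonal check
```

Because the update touches only R and Q is always rebuilt from R, symmetry and the zero diagonal hold by construction after any number of steps.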

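【Comments】:

I gave two code snippets. The first shows how to build the matrices. The second demonstrates that the matrices indeed have the desired properties: I show that they hold both before and after executing a training step. To illustrate this, I needed to define a training step, which means defining a loss function and an optimizer. I arbitrarily chose the loss to be the L2 distance to some other matrix, and the optimizer to be gradient descent.
Thanks for writing up this solution. I had already seen the function fill_triangular, but what I don't like about this approach is that it requires, for each 1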

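【Answer 3】:

Apparently you need something like convert_to_tensor.

This function converts Python objects of various types to Tensor objects. It accepts Tensor objects, numpy arrays, Python lists, and Python scalars.

Note: TensorFlow operations automatically convert NumPy ndarrays to tensors.

【Comments】:

Where should I apply it? I want to use the resulting tensor Q in further computations and then optimize with respect to the parameters R, so I can't just convert the numpy version of Q, because it has nothing about R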