tensorflow doing gradients on sparse variable: answers

【Question title】: tensorflow doing gradients on sparse variable
【Posted】: 2017-04-30 14:25:54
【Question description】:

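I am trying to train a sparse variable in tensorflow. As far as I know, tensorflow currently does not allow sparse variables.

I found two threads discussing similar problems: using-sparsetensor-as-a-trainable-variable and update-only-part-of-the-word-embedding-matrix-in-tensorflow. I do not fully understand the answers, so some example code would be appreciated.

One approach I have tried is: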

import numpy as np
import tensorflow as tf

# Initialize the sparse variable sp_weights.
# w_s is assumed to be the input sparse matrix (a tf.SparseTensor) that carries the index information.
dim = 20
identity = tf.constant(np.identity(dim), dtype=tf.float32)
A = tf.sparse_tensor_dense_matmul(w_s, identity)  # convert w_s to dense
w_init = tf.random_normal([dim, dim], mean=0.0, stddev=0.1)
w_tensor = tf.multiply(A, w_init)  # random init at the entries of w_s (tf.mul was renamed tf.multiply in TF 1.0)
vars['sp_weights'] = tf.Variable(w_tensor)

# doing some operations...
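
One caveat with the initialization above: the variable is dense, so an ordinary gradient step will also move the entries that started at zero. A common way around this is to multiply the variable by a constant 0/1 mask in the forward pass, which makes the gradient exactly zero outside the sparsity pattern. The following is a minimal sketch of that idea, not the asker's code. It assumes TensorFlow 2.x eager execution (which postdates this question), and the names `indices`, `mask`, `x` and the toy loss are hypothetical.

```python
import tensorflow as tf

tf.random.set_seed(0)
dim = 4
# Hypothetical sparsity pattern: the (row, col) positions allowed to be nonzero.
indices = [[0, 1], [2, 3], [3, 0]]
# Constant 0/1 mask with ones at the allowed positions.
mask = tf.scatter_nd(indices, tf.ones([len(indices)]), [dim, dim])

# Dense variable, initialized to zero outside the pattern.
w = tf.Variable(tf.random.normal([dim, dim], stddev=0.1) * mask)
x = tf.ones([dim, 1])  # toy input (assumption)

with tf.GradientTape() as tape:
    y = tf.matmul(w * mask, x)               # the mask is applied in the forward pass
    loss = tf.reduce_sum(tf.square(y - 1.0))  # toy loss (assumption)

grad = tape.gradient(loss, w)  # zero wherever mask is zero
w.assign_sub(0.1 * grad)       # plain SGD step; masked-out entries stay at zero
```

The variable is still stored densely, so this saves no memory; it only keeps the masked-out entries from being trained.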

When computing the gradients, I use tf.IndexedSlices, following the second link:

grad = opt.compute_gradients(loss)
train_op = opt.apply_gradients(
    [tf.IndexedSlices(grad, indices)]) # indices is extracted from w_s

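The code above of course does not work, and this is where I am confused. tf.IndexedSlices turns its input into an IndexedSlices instance, but how do I use it to update the gradient at the given indices? Also, many people mention using tf.scatter_add/sub/update. The official documentation does not include any example code showing how and where to use them for gradient updates. Should I use tf.IndexedSlices or tf.scatter? Any example code would be very helpful. Thanks!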
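
For context on the two options named above: a tf.IndexedSlices holds a set of slices along the first dimension (whole rows, for a matrix) together with their row indices, so it describes row-level sparsity rather than arbitrary (row, col) entries. The sketch below shows one way such a gradient arises and is applied. It assumes TensorFlow 2.x eager execution rather than the graph-mode API used in the question, and `rows` and the toy loss are hypothetical.

```python
import tensorflow as tf

var = tf.Variable(tf.ones([4, 3]))
rows = tf.constant([0, 2])  # hypothetical: the only rows the loss depends on

with tf.GradientTape() as tape:
    picked = tf.gather(var, rows)            # only rows 0 and 2 enter the loss
    loss = tf.reduce_sum(tf.square(picked))  # toy loss (assumption)

# The gradient of a gather on a variable comes back as tf.IndexedSlices:
# grad.values holds the per-row gradients, grad.indices the row numbers.
grad = tape.gradient(loss, var)

# Apply a plain SGD step to the touched rows only.
step = tf.IndexedSlices(0.1 * grad.values, grad.indices, grad.dense_shape)
var.scatter_sub(step)
```

Here rows 1 and 3 are never read or written. For element-level sparsity, a mask or an op that takes full coordinates, such as tf.scatter_nd, is usually the better fit.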

【Question discussion】:

Could you post the error message you get?

Tags: tensorflow sparse-matrix

【Solution 1】:

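I am not familiar with IndexedSlices or sparse variables, but I understand that you are trying to apply a gradient update to only certain slices of a variable. If that is what you are doing, there is a simple workaround: take a copy of the variable with

weights_copy = tf.Variable(weights_var.initialized_value()) # Copies the current value

then apply the gradient update to the entire variable, and finally merge the two with one of the tf.scatter_* ops (for example tf.scatter_update), taking the original or the updated parts wherever you wish.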
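
The workaround described above could look roughly like the following. This is a sketch under assumptions, not the answerer's code: it uses TensorFlow 2.x eager execution, where the scatter ops are available as tf.Variable methods, and `keep_rows` and the toy loss are hypothetical.

```python
import tensorflow as tf

weights_var = tf.Variable(tf.ones([4, 2]))
keep_rows = tf.constant([1, 3])  # hypothetical rows that must keep their old values

# 1. Snapshot the current value of the variable.
weights_copy = tf.Variable(weights_var.read_value())

# 2. Apply an ordinary gradient update to the entire variable.
with tf.GradientTape() as tape:
    loss = tf.reduce_sum(tf.square(weights_var))  # toy loss (assumption)
grad = tape.gradient(loss, weights_var)
weights_var.assign_sub(0.1 * grad)

# 3. Merge: write the original values back into the rows that should not change.
original_rows = tf.gather(weights_copy, keep_rows)
weights_var.scatter_update(tf.IndexedSlices(original_rows, keep_rows))
```

This costs an extra dense copy and a full dense update on every step, so it trades efficiency for simplicity.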

【Discussion】: