How to randomly set a variable number of elements in each row of a tensor in PyTorch

【Question title】: How to randomly set a variable number of elements in each row of a tensor in PyTorch
【Posted】: 2021-11-12 15:13:40
【Question description】:

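I want to create a matrix of zeros and ones with dimensions (n, n). The ones should be placed at random, with a cap on how many appear in each row. Say I have a list of length n that holds the cap for each of the n rows. How can I do this in PyTorch?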

My question is similar to this previous question. The only change I am looking for is that there should be n values of k, one for each of the n rows.

【Question discussion】:

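You can set the first m columns of each row to True and then randomly permute them: discuss.pytorch.org/t/…
I'm not sure what m is here. Assuming it is the same as k in the question, m cannot be a fixed number. Could you suggest how to set a different (predetermined) number of columns for each row?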

Tags: python pytorch

【Solution 1】:

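As @Marcel explained in the comments above, you can first set the first m values of each row to the value k, then index with a per-row permutation to get a random tensor: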

>>> n = 10; m = 3; k = 1
>>> x = torch.zeros(n, n)

>>> x[:, :m] = k
>>> x
tensor([[1., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
        [1., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
        [1., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
        [1., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
        [1., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
        [1., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
        [1., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
        [1., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
        [1., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
        [1., 1., 1., 0., 0., 0., 0., 0., 0., 0.]])

Use torch.randperm to build one column permutation per row:

>>> perm = torch.stack([torch.randperm(n) for _ in range(len(x))])
>>> perm
tensor([[8, 0, 3, 2, 1, 6, 9, 4, 5, 7],
        [5, 7, 1, 4, 8, 0, 6, 9, 2, 3],
        [2, 1, 9, 7, 0, 8, 6, 3, 5, 4],
        [1, 3, 5, 8, 7, 6, 9, 4, 2, 0],
        [7, 6, 0, 5, 2, 9, 1, 8, 4, 3],
        [5, 0, 6, 8, 1, 9, 2, 4, 3, 7],
        [4, 0, 6, 5, 8, 1, 3, 7, 2, 9],
        [5, 3, 4, 9, 0, 1, 7, 6, 8, 2],
        [5, 7, 9, 3, 2, 6, 8, 0, 4, 1],
        [2, 7, 4, 6, 3, 0, 9, 8, 5, 1]])

Then use torch.gather to index x with perm along dim=1, so that each row is shuffled by its own permutation:

>>> x.gather(dim=1, index=perm)
tensor([[0., 1., 0., 1., 1., 0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 1., 0., 0., 1., 0.],
        [1., 1., 0., 0., 1., 0., 0., 0., 0., 0.],
        [1., 0., 0., 0., 0., 0., 0., 0., 1., 1.],
        [0., 0., 1., 0., 1., 0., 1., 0., 0., 0.],
        [0., 1., 0., 0., 1., 0., 1., 0., 0., 0.],
        [0., 1., 0., 0., 0., 1., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1., 1., 0., 0., 0., 1.],
        [0., 0., 0., 0., 1., 0., 0., 1., 0., 1.],
        [1., 0., 0., 0., 0., 1., 0., 0., 0., 1.]])
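
The fixed-count steps above can be collected into a small self-contained sketch. The function name random_rows_fixed and its parameters are hypothetical, chosen only for illustration. The sketch gathers along dim=1, which is what makes each row follow its own permutation:

```python
import torch

def random_rows_fixed(n: int, m: int, value: float = 1.0) -> torch.Tensor:
    """Return an (n, n) tensor with exactly m entries equal to `value` in each row."""
    x = torch.zeros(n, n)
    x[:, :m] = value  # fill the leading m columns of every row
    # One independent column permutation per row, stacked to shape (n, n).
    perm = torch.stack([torch.randperm(n) for _ in range(n)])
    # With dim=1, out[i, j] = x[i, perm[i, j]], so each row is shuffled on its own.
    return x.gather(dim=1, index=perm)

out = random_rows_fixed(n=10, m=3)
print(out.sum(dim=1))  # every row should sum to m
```

Checking the row sums is a cheap way to confirm that the shuffle kept the count per row intact.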

Alternatively, you can use torch.scatter directly with the value keyword argument, writing 1 at the first m column indices of each row's permutation. The ones land in different positions than with gather, but every row still contains exactly m of them:

>>> torch.zeros(n, n).scatter(dim=1, index=perm[:, :m], value=1)
tensor([[1., 0., 0., 1., 0., 0., 0., 0., 1., 0.],
        [0., 1., 0., 0., 0., 1., 0., 1., 0., 0.],
        [0., 1., 1., 0., 0., 0., 0., 0., 0., 1.],
        [0., 1., 0., 1., 0., 1., 0., 0., 0., 0.],
        [1., 0., 0., 0., 0., 0., 1., 1., 0., 0.],
        [1., 0., 0., 0., 0., 1., 1., 0., 0., 0.],
        [1., 0., 0., 0., 1., 0., 1., 0., 0., 0.],
        [0., 0., 0., 1., 1., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 1., 0., 1.],
        [0., 0., 1., 0., 1., 0., 0., 1., 0., 0.]])
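
As a rough check of the scatter variant, the sketch below assumes the index is restricted to the first m columns of each row's permutation (the name cols is hypothetical). It also shows what happens when the full permutation is used as the index: every column of every row is hit, so the result is all ones.

```python
import torch

n, m = 10, 3
perm = torch.stack([torch.randperm(n) for _ in range(n)])

# Keep only the first m column indices of each row's permutation.
# They are distinct within a row because they come from a permutation.
cols = perm[:, :m]  # shape (n, m)

# Write 1 at those column positions, row by row (dim=1).
out = torch.zeros(n, n).scatter(dim=1, index=cols, value=1.0)

# Scattering the full permutation touches all n columns of each row.
full = torch.zeros(n, n).scatter(dim=1, index=perm, value=1.0)

print(out.sum(dim=1))   # m ones per row
print(full.sum(dim=1))  # n ones per row
```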

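If m is itself a tensor (one count per row, shape (n,)), you can work around this with a combination of torch.arange and torch.where.

First encode the positions. Indexing m as m[:, None] turns it into a column, so that row i is compared against m[i]:

>>> d = torch.arange(n)[None].repeat(n,1)
>>> x = torch.where(d < m[:, None], 1, 0)
>>> x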
tensor([[1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
        [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
        [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
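
The shape of m matters for this step because of broadcasting. The following sketch uses made-up counts and the comparison d < m[:, None] as one possible way to build the mask; it contrasts a column-shaped m with a flat one:

```python
import torch

n = 4
m = torch.tensor([1, 2, 3, 4])          # hypothetical per-row counts, shape (n,)
d = torch.arange(n)[None].repeat(n, 1)  # d[i, j] = j

# m[:, None] has shape (n, 1): row i is compared against m[i].
per_row = (d < m[:, None]).long()

# A flat m of shape (n,) is broadcast along the columns instead:
# column j is compared against m[j], so all rows get the same pattern.
per_col = (d < m).long()

print(per_row)
print(per_col)
```

With a flat m, every row receives an identical pattern and therefore the same number of ones, which is consistent with the behaviour reported in the discussion.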

Construct the permutations as before:

>>> perm = torch.stack([torch.randperm(n) for _ in range(n)])
>>> perm
tensor([[2, 5, 7, 0, 4, 1, 3, 6, 8, 9],
        [7, 4, 9, 5, 6, 0, 3, 1, 2, 8],
        [5, 1, 4, 9, 0, 3, 2, 6, 7, 8],
        [9, 6, 0, 2, 3, 1, 7, 5, 4, 8],
        [3, 5, 4, 6, 0, 7, 9, 8, 2, 1],
        [5, 7, 8, 6, 9, 2, 0, 4, 3, 1],
        [8, 3, 9, 0, 6, 2, 5, 7, 4, 1],
        [2, 9, 4, 3, 7, 8, 1, 0, 6, 5],
        [5, 4, 8, 3, 2, 9, 7, 1, 6, 0],
        [8, 7, 3, 6, 5, 4, 2, 0, 9, 1]])

Then gather from x along dim=1, which shuffles each row while keeping its number of ones:

>>> x.gather(dim=1, index=perm)
tensor([[1, 1, 0, 1, 1, 1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 1, 0, 1, 1, 0],
        [1, 1, 1, 0, 1, 1, 1, 1, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 0, 1, 0, 1, 1, 1, 1, 1],
        [0, 1, 0, 1, 1, 1, 1, 1, 1, 1],
        [0, 0, 0, 0, 0, 0, 1, 1, 0, 0],
        [1, 1, 1, 1, 1, 0, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
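
Putting the variable-count steps together, a minimal sketch might look like this. The function name random_rows_variable is hypothetical, and m is assumed to be an integer tensor of shape (n,) with values between 0 and n:

```python
import torch

def random_rows_variable(m: torch.Tensor) -> torch.Tensor:
    """Return an (n, n) 0/1 tensor whose row i holds m[i] ones at random columns."""
    n = m.shape[0]
    d = torch.arange(n).expand(n, n)  # d[i, j] = j
    x = (d < m[:, None]).long()       # row i: m[i] leading ones
    # One independent column permutation per row.
    perm = torch.stack([torch.randperm(n) for _ in range(n)])
    # Shuffle each row with its own permutation; row sums are unchanged.
    return x.gather(dim=1, index=perm)

n = 10
m = torch.randint(0, n + 1, (n,))  # assumed example counts, upper bound exclusive
out = random_rows_variable(m)
print(m)
print(out.sum(dim=1))  # should match m
```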

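【Discussion】:

I can see that the final tensor has m ones in every row. I want that number to vary from row to row. Suppose we know how many ones each row needs. The total number of ones in each row is specified; they just have to be placed at random positions within the row.
I have edited my answer, see above.
I need the tensor m to have size (n, ). This answer helped me. Thanks for your answer and comments :)
@Ru11 "I need the tensor m to have size (n, )" is exactly what the second part of the answer does. Have you tried it?
Yes. Somehow, >>> d = torch.arange(n)[None].repeat(n,1) >>> x = torch.where(d+m>n, 0, 1) gave me the same number of ones in every row. I generated m with m = torch.randint(n, (n,)). How do you get a different number of ones per row? There is something I don't understand.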

【Solution 2】:

import torch 

n = 5 
top_k = torch.randint(n, (n,))
print(top_k)

# Step 1, generate a variable number of elements in each row
top_k = torch.nn.functional.one_hot(top_k, num_classes=n)
top_k = 1 - torch.cumsum(top_k, dim=1)  # row i: ones before column top_k[i], zeros from there on
print(top_k)

# Step 2, shuffle each row 
indices = torch.argsort(torch.randn((n, n)), dim=1)
result = top_k[torch.arange(top_k.shape[0])[..., None], indices]
print(result)

This answer, posted on the PyTorch forums, helped me get what I wanted.
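
A variation on the same idea, not taken from the answer above, skips the one_hot and cumsum step. Sorting random noise with argsort gives a random permutation of 0..n_cols-1 in every row, and exactly m[i] of those values are smaller than m[i]. The name random_mask and its parameters are hypothetical:

```python
import torch

def random_mask(m: torch.Tensor, n_cols: int) -> torch.Tensor:
    """Return a (len(m), n_cols) 0/1 tensor with m[i] ones in row i."""
    n_rows = m.shape[0]
    # Each row of perm is a random permutation of 0..n_cols-1.
    perm = torch.rand(n_rows, n_cols).argsort(dim=1)
    # Exactly m[i] entries of a permutation are below m[i].
    return (perm < m[:, None]).long()

n = 5
m = torch.randint(0, n + 1, (n,))  # assumed example counts in 0..n
result = random_mask(m, n_cols=n)
print(m)
print(result)
```

Unlike the one_hot version, where the counts drawn with torch.randint(n, (n,)) stay below n, this sketch also accepts a count equal to the number of columns.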
