Pytorch doesn't support one-hot vector?

【Question title】: Pytorch doesn't support one-hot vector?
【Posted】: 2019-08-28 05:03:27
【Question description】:

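I am very confused about how Pytorch deals with one-hot vectors. In this tutorial, the neural network generates a one-hot vector as its output. As far as I understand, the schematic structure of the neural network in the tutorial should look like this: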

However, the labels are not in one-hot vector format. I get the following sizes:

print(labels.size())
print(outputs.size())

output>>> torch.Size([4]) 
output>>> torch.Size([4, 10])

Miraculously, when I pass outputs and labels to criterion=CrossEntropyLoss(), there is no error at all.

loss = criterion(outputs, labels) # How come it has no error?

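My hypothesis:

Maybe pytorch automatically converts the labels to one-hot vector form. So I tried converting the labels to one-hot vectors myself before passing them to the loss function.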

def to_one_hot_vector(num_class, label):
    b = np.zeros((label.shape[0], num_class))
    b[np.arange(label.shape[0]), label] = 1

    return b

labels_one_hot = to_one_hot_vector(10,labels)
labels_one_hot = torch.Tensor(labels_one_hot)
labels_one_hot = labels_one_hot.type(torch.LongTensor)

loss = criterion(outputs, labels_one_hot) # Now it gives me error

However, I got the following error:

RuntimeError: multi-target not supported at /opt/pytorch/pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:15

So, does Pytorch not support one-hot vectors? How does Pytorch calculate the cross entropy for the two tensors outputs = [1,0,0],[0,0,1] and labels = [0,2]? It makes no sense to me at all at the moment.

【Question discussion】:

Tags: python machine-learning pytorch

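【Solution 1】:

PyTorch states in its documentation for CrossEntropyLoss that

This criterion expects a class index (0 to C-1) as the target for each value of a 1D tensor of size minibatch

In other words, CEL conceptually has your to_one_hot_vector function built in and does not expose a one-hot API. Note that one-hot vectors are less memory-efficient than stored class labels.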
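To make the "conceptually built in" point concrete, here is a minimal sketch of what the loss does with class-index targets: it takes the log-softmax of the outputs and picks out the log-probability at each label's index. The tensor values, the seed, and the names builtin, picked, and manual are illustrative assumptions, not taken from the tutorial.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed only so the sketch is repeatable

outputs = torch.randn(4, 10)         # raw scores (logits): 4 samples, 10 classes
labels = torch.tensor([3, 0, 9, 1])  # class indices, shape (4,)

# Built-in loss with class-index targets.
builtin = F.cross_entropy(outputs, labels)

# Manual version: log-softmax, then select the log-probability of the true class.
log_probs = F.log_softmax(outputs, dim=1)
picked = log_probs.gather(1, labels.unsqueeze(1)).squeeze(1)
manual = -picked.mean()  # the default reduction averages over the batch

print(builtin.item(), manual.item())
```

The two numbers should agree up to floating-point error. Selecting one entry per row is what multiplying by a one-hot vector would do, which is why no explicit one-hot tensor is needed.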

If you are given one-hot vectors and need to go to class-label format (for example, to be compatible with CEL), you can use argmax as shown below:

import torch
 
labels = torch.tensor([1, 2, 3, 5])
one_hot = torch.zeros(4, 6)
one_hot[torch.arange(4), labels] = 1
 
reverted = torch.argmax(one_hot, dim=1)
assert (labels == reverted).all().item()

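【Discussion】:

So you do not need one-hot encoded classes when using nn.CrossEntropyLoss. Note that "the combination of nn.LogSoftmax and nn.NLLLoss is equivalent to using nn.CrossEntropyLoss", so if you have an nn.LogSoftmax and a loss = nn.NLLLoss, you do not need one-hot encoding either.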
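The equivalence quoted in this comment can be checked numerically. The following is a small sketch with made-up logits and targets (the names logits, ce, and nll are assumptions for illustration only):

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed only so the sketch is repeatable

logits = torch.randn(3, 5)         # 3 samples, 5 classes, raw scores
target = torch.tensor([1, 0, 4])   # class indices, not one-hot

# CrossEntropyLoss applied directly to the raw scores.
ce = nn.CrossEntropyLoss()(logits, target)

# LogSoftmax followed by NLLLoss on the same inputs.
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), target)

print(ce.item(), nll.item())
```

Both losses take the same class-index target, so neither route requires a one-hot tensor.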

【Solution 2】:

This code will help you with both one-hot encoding (OHE) and multi-hot encoding (MHE):

import torch
batch_size=10
n_classes=5
target = torch.randint(high=5, size=(1,10)) # set size (2,10) for MHE
print(target)
y = torch.zeros(batch_size, n_classes)
y[range(y.shape[0]), target]=1
y

Output for OHE:

tensor([[4, 3, 2, 2, 4, 1, 1, 1, 4, 2]])

tensor([[0., 0., 0., 0., 1.],
        [0., 0., 0., 1., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 0., 1.],
        [0., 1., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 1.],
        [0., 0., 1., 0., 0.]])

Output for MHE when I set target = torch.randint(high=5, size=(2,10)):

tensor([[3, 2, 4, 4, 2, 4, 0, 4, 4, 1],
        [4, 1, 1, 3, 2, 2, 4, 2, 4, 3]])

tensor([[0., 0., 0., 1., 1.],
        [0., 1., 1., 0., 0.],
        [0., 1., 0., 0., 1.],
        [0., 0., 0., 1., 1.],
        [0., 0., 1., 0., 0.],
        [0., 0., 1., 0., 1.],
        [1., 0., 0., 0., 1.],
        [0., 0., 1., 0., 1.],
        [0., 0., 0., 0., 1.],
        [0., 1., 0., 1., 0.]])

If you need multiple OHEs:

torch.nn.functional.one_hot(target)

tensor([[[0, 0, 0, 1, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 1],
         [0, 0, 0, 0, 1],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 1],
         [1, 0, 0, 0, 0],
         [0, 0, 0, 0, 1],
         [0, 0, 0, 0, 1],
         [0, 1, 0, 0, 0]],

        [[0, 0, 0, 0, 1],
         [0, 1, 0, 0, 0],
         [0, 1, 0, 0, 0],
         [0, 0, 0, 1, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 1],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 1],
         [0, 0, 0, 1, 0]]])
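One detail worth knowing when calling torch.nn.functional.one_hot as above: if num_classes is omitted, the width of the result is inferred from the largest value in the input, so a random target that happens not to contain the highest class yields a narrower tensor. The sketch below uses a small hand-picked target (an assumption for illustration) and also shows one possible way to collapse the stacked one-hot tensors into a multi-hot matrix.

```python
import torch
import torch.nn.functional as F

# Two labels per sample, three samples; the largest label present is 3.
target = torch.tensor([[3, 2, 0],
                       [1, 1, 2]])

inferred = F.one_hot(target)                  # width inferred as max + 1 = 4
explicit = F.one_hot(target, num_classes=5)   # width fixed to 5 classes

print(inferred.shape)  # torch.Size([2, 3, 4])
print(explicit.shape)  # torch.Size([2, 3, 5])

# Collapse the two one-hot layers into one multi-hot row per sample.
multi_hot = explicit.sum(dim=0).clamp(max=1)
print(multi_hot)

# argmax along the class dimension recovers the original indices.
recovered = explicit.argmax(dim=-1)
```

Passing num_classes explicitly keeps the encoded width stable across batches, which matters when the result feeds a layer or loss with a fixed number of classes.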

【Discussion】:

Here is a specific use case for one_hot inside a transform, since that is what we are all looking for: ``` target_transform=torchvision.transforms.Compose([ lambda x: torch.LongTensor([x]), lambda x: F.one_hot(x, 10), lambda x: x.squeeze()]) ``` stackoverflow.com/questions/63342147/…

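【Solution 3】:

As @Jatentaki clearly pointed out, you can use torch.argmax(one_hot, dim=1) to convert one-hot encoded vectors to numbers.

However, if you still want to train your network with one-hot encoded output in PyTorch, you can use nn.LogSoftmax and nn.NLLLoss: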

import torch
from torch import nn

output_onehot = nn.LogSoftmax(dim=1)(torch.randn(3, 5)) # m = 3 samples, each with n = 5 class scores (log-probabilities)
target = torch.tensor([1, 0, 4]) # target values for each sample

nn.NLLLoss()(output_onehot, target)

print(output_onehot)
print(target)

# You can get the probabilities using the exponential function:
print("Probabilities:", torch.exp(output_onehot))

The output will look like this:

tensor([[-0.5413, -2.4461, -2.0110, -1.9964, -2.7851],
        [-2.3376, -1.6985, -1.8472, -3.0975, -0.6585],
        [-3.2820, -0.7160, -1.5297, -1.5636, -3.0412]])
tensor([1, 0, 4])
Probabilities: tensor([[0.5820, 0.0866, 0.1339, 0.1358, 0.0617],
        [0.0966, 0.1830, 0.1577, 0.0452, 0.5176],
        [0.0376, 0.4887, 0.2166, 0.2094, 0.0478]])
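If the targets really are stored as one-hot vectors, one possible way to compute the same loss by hand is to multiply them with the log-probabilities and sum over the class dimension. This is a sketch under assumed names (target_one_hot, loss_one_hot, loss_index) and made-up values, not part of the answer above:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed only so the sketch is repeatable

logits = torch.randn(3, 5)         # 3 samples, 5 classes, raw scores
target = torch.tensor([1, 0, 4])   # class indices
target_one_hot = F.one_hot(target, num_classes=5).float()

log_probs = F.log_softmax(logits, dim=1)

# Cross entropy with one-hot targets: only the true class term survives the sum.
loss_one_hot = -(target_one_hot * log_probs).sum(dim=1).mean()

# Reference value using class indices.
loss_index = F.nll_loss(log_probs, target)

print(loss_one_hot.item(), loss_index.item())
```

Recent PyTorch releases (1.10 and later, to my knowledge) also let nn.CrossEntropyLoss take a float target of class probabilities with the same shape as the input, which covers one-hot targets directly; check the documentation for your installed version before relying on it.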
