【Title】: RuntimeError: 1D target tensor expected, multi-target not supported (PyTorch)
【Posted】: 2021-09-13 08:43:12
【Question】:

I recently moved from Keras to PyTorch and I'm still trying to understand how everything works. Below is the code I implemented to classify the MNIST dataset with a simple MLP. As I used to do in Keras, I flatten each 28x28 image into a vector of 784, and I also create a one-hot representation of my labels. In the model, I expect that given a vector of 784 the model will output a one-hot vector with probabilities, but as soon as my code reaches the loss computation I get the following error:

RuntimeError: 1D target tensor expected, multi-target not supported

Here is my code:

import numpy as np
import matplotlib.pyplot as plt
import torch
import time
from torch import nn, optim
from keras.datasets import mnist
from torch.utils.data import Dataset, DataLoader

RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)
torch.manual_seed(RANDOM_SEED)


# ----------------------------------------------------

class MnistDataset(Dataset):

    def __init__(self, data_size=0):

        (x, y), (_, _) = mnist.load_data()

        x = [i.flatten() for i in x]
        x = np.array(x, dtype=np.float32)

        assert 0 <= data_size <= len(y), \
            "data_size should be between 0 and the number of samples in the dataset"

        if data_size == 0:
            data_size = len(y)

        self.data_size = data_size

        # taking the first 'data_size' samples
        self.x = x[:data_size]
        self.y = y[:data_size]

        # scaling between 0-1
        self.x = (self.x / 255)

        # Creating one-hot representation of target
        y_encoded = []
        for label in y:
            encoded = np.zeros(10)
            encoded[label] = 1
            y_encoded.append(encoded)

        self.y = np.array(y_encoded)

    def __len__(self):
        return self.data_size

    def __getitem__(self, index):

        x_sample = self.x[index]
        label = self.y[index]

        return x_sample, label


# ----------------------------------------------------

num_train_samples = 10000
num_test_samples = 2000

# Each generator returns a single
# sample & its label on each iteration.
mnist_train = MnistDataset(data_size=num_train_samples)
mnist_test = MnistDataset(data_size=num_test_samples)

# Each generator returns a batch of samples on each iteration.
train_loader = DataLoader(mnist_train, batch_size=128, shuffle=True)  # 79 batches
test_loader = DataLoader(mnist_test, batch_size=128, shuffle=True)  # 16 batches


# ----------------------------------------------------

# Defining the Model Architecture

class MLP(nn.Module):

    def __init__(self):
        super().__init__()

        self.fc1 = nn.Linear(28 * 28, 100)
        self.act1 = nn.ReLU()
        self.fc2 = nn.Linear(100, 50)
        self.act2 = nn.ReLU()
        self.fc3 = nn.Linear(50, 10)
        self.act3 = nn.Sigmoid()

    def forward(self, x):
        x = self.act1(self.fc1(x))
        x = self.act2(self.fc2(x))
        output = self.act3(self.fc3(x))

        return output


# ----------------------------------------------------

model = MLP()

# Defining optimizer and loss function
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# ----------------------------------------------------

# Training the model

epochs = 10

print("Training Started...")

for epoch in range(epochs):
    for batch_index, (inputs, targets) in enumerate(train_loader):

        optimizer.zero_grad()  # Zero the gradients
        outputs = model(inputs)  # Forward pass
        loss = criterion(outputs, targets)  # Compute the Loss
        loss.backward()  # Compute the Gradients
        optimizer.step()  # Update the parameters

        # Evaluating the model
        total = 0
        correct = 0
        with torch.no_grad():
            for batch_idx, (inputs, targets) in enumerate(test_loader):
                outputs = model(inputs)
                _, predicted = torch.max(outputs.data, 1)
                total += targets.size(0)
                correct += predicted.eq(targets.data).cpu().sum()
            print('Epoch : {} Test Acc : {}'.format(epoch, (100. * correct / total)))

print("Training Completed Successfully")

# ----------------------------------------------------

I also read some other posts related to the same problem, and most of them say that for CrossEntropyLoss the target has to be a single number, which goes completely over my head. Can someone please explain the solution? Thanks.

【Discussion】:

    Tags: python tensorflow deep-learning pytorch


    【Solution 1】:

    For nn.CrossEntropyLoss you don't need a one-hot representation of the labels; you just pass the predicted logits, with shape (batch_size, n_class), together with a target vector of shape (batch_size,).

    So just pass in the vector of label indices y instead of the one-hot vectors.
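A minimal sketch of the shapes nn.CrossEntropyLoss expects (all values here are made up):

```python
import torch
from torch import nn

criterion = nn.CrossEntropyLoss()

# Raw logits of shape (batch_size, n_class) -- no softmax/sigmoid applied
logits = torch.randn(4, 10)

# Integer class indices of shape (batch_size,), NOT one-hot vectors
targets = torch.tensor([3, 1, 5, 9])

loss = criterion(logits, targets)
print(loss.dim())  # 0 -- the loss is a single scalar
```

If you already have one-hot targets, `targets = one_hot.argmax(dim=1)` recovers the index vector. Note that in the PyTorch version the asker was using, a float (batch_size, n_class) target raises the "1D target tensor expected" error, while PyTorch 1.10+ instead interprets such a target as class probabilities.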

    Your code, fixed:

    class MnistDataset(Dataset):
    
        def __init__(self, data_size=0):
    
            (x, y), (_, _) = mnist.load_data()
    
            x = [i.flatten() for i in x]
            x = np.array(x, dtype=np.float32)
    
            assert 0 <= data_size <= len(y), \
                "data_size should be between 0 and the number of samples in the dataset"
    
            if data_size == 0:
                data_size = len(y)
    
            self.data_size = data_size
    
            # taking the first 'data_size' samples
            self.x = x[:data_size]
            self.y = y[:data_size]
    
            # scaling between 0-1
            self.x = (self.x / 255)
    
            self.y = y[:data_size]  # <-- integer class labels, no one-hot encoding
    
        def __len__(self):
            return self.data_size
    
        def __getitem__(self, index):
    
            x_sample = self.x[index]
            label = self.y[index]
    
            return x_sample, label
    

    See the PyTorch documentation for more details: https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html
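One thing the answer doesn't mention: nn.CrossEntropyLoss applies log-softmax internally, so the model should return raw logits; the final Sigmoid in the question's MLP should be dropped. A sketch of the adjusted model (same layer sizes as the question):

```python
import torch
from torch import nn

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28 * 28, 100)
        self.fc2 = nn.Linear(100, 50)
        self.fc3 = nn.Linear(50, 10)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)  # raw logits; CrossEntropyLoss handles the softmax

model = MLP()
out = model(torch.randn(2, 28 * 28))
print(out.shape)  # torch.Size([2, 10])
```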

    【Comments】:

    • Yes, I got that, but how does it compute the loss then? After training the model, when I tried to print the model's output and the target vector, I got this:
    • tensor([3, 1, 5]) followed by the model's output tensor (the pasted values are garbled in the original post)
    • Here the first tensor is my target labels and the second is the model's output. The target is a single number, for example, while the model outputs a vector of 10, so how is the loss computed?
    • The output tensor doesn't look right; shouldn't it only have 2 dimensions? Here you got a 3-d tensor of shape (3, 1, 5), but none of them has size 10. Can you confirm that again?
    • As for how it computes the loss: it takes the entry c of output, where c is a label from 0 to 9, and computes -output[c] + log(sum(exp(output))). That's the first equation in the documentation, applied to each item in the batch.
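The formula from the last comment can be verified numerically; a small sketch with made-up logits for a single sample:

```python
import torch
import torch.nn.functional as F

output = torch.tensor([[0.2, 1.5, -0.3]])  # one sample, 3 classes (made-up logits)
target = torch.tensor([1])                 # true class index c = 1

# -output[c] + log(sum(exp(output))), as in the comment above
manual = -output[0, 1] + torch.log(torch.exp(output[0]).sum())
builtin = F.cross_entropy(output, target)

print(torch.allclose(manual, builtin))  # True
```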