[Question title]: PyTorch convolutional block - CIFAR10 - RuntimeError
[Posted]: 2021-04-13 21:46:18
[Question]:

I am using PyTorch 1.7 with Python 3.8 and the CIFAR-10 dataset. I am trying to create a block with the structure: conv -> conv -> pool -> fc. The fully connected (fc) layer has 256 neurons. The code is as follows:

# Testing-
conv1 = nn.Conv2d(
    in_channels = 3, out_channels = 64,
    kernel_size = 3, stride = 1,
    padding = 1, bias = True
    )
conv2 = nn.Conv2d(
    in_channels = 64, out_channels = 64,
    kernel_size = 3, stride = 1,
    padding = 1, bias = True
    )
pool = nn.MaxPool2d(
    kernel_size = 2, stride = 2
    )
fc1 = nn.Linear(
    in_features = 64 * 16 * 16, out_features = 256,
    bias = True
)

images.shape
# torch.Size([32, 3, 32, 32])

x = conv1(images)
x.shape
# torch.Size([32, 64, 32, 32])

x = conv2(x)
x.shape
# torch.Size([32, 64, 32, 32])

x = pool(x)
x.shape
# torch.Size([32, 64, 16, 16])

# This line of code gives error-
x = fc1(x)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (32768x16 and 16384x256)

What is going wrong?

[Comments]:

    Tags: python-3.x pytorch conv-neural-network


    [Solution 1]:

    You are almost there! Notice that nn.MaxPool2d returns a shape (32, 64, 16, 16) that is incompatible with what nn.Linear expects as input: a two-dimensional tensor of shape (batch, in_features). You need to reshape it to (batch, 64*16*16).
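
    To see where the numbers in the error message come from, here is a small arithmetic sketch (no PyTorch needed). nn.Linear multiplies along the last dimension only, so the 4-D pool output is effectively treated as a matrix whose column count is the last dimension:

    ```python
    # Shape of the tensor after pooling: (batch, channels, height, width)
    batch, channels, h, w = 32, 64, 16, 16

    # nn.Linear folds all leading dimensions into rows and uses the
    # LAST dimension as the feature width:
    rows = batch * channels * h   # 32 * 64 * 16 = 32768
    cols = w                      # 16

    # fc1's weight expects inputs of width 64*16*16 instead:
    in_features = channels * h * w   # 16384

    # Hence the error: (32768 x 16) cannot multiply (16384 x 256).
    print(rows, cols, in_features)
    ```
    
    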

    I recommend using an nn.Flatten layer instead of reshaping the tensor yourself. It behaves like x.view(x.size(0), -1) but is clearer. By default it preserves the first (batch) dimension:

    conv1 = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1)
    conv2 = nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3, stride=1, padding=1)
    pool = nn.MaxPool2d(kernel_size=2, stride=2)
    flatten = nn.Flatten()
    fc1 = nn.Linear(in_features=64*16*16, out_features=256)
    
    x = conv1(images)
    x = conv2(x)
    x = pool(x)
    x = flatten(x)
    x = fc1(x)
    

    Alternatively, you can use the functional equivalent torch.flatten, in which case you must pass start_dim=1: x = torch.flatten(x, start_dim=1).
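
    As a quick sanity check, all three flattening approaches mentioned above produce the same (batch, 16384) shape on a tensor with the pooled dimensions from the question:

    ```python
    import torch
    import torch.nn as nn

    x = torch.randn(32, 64, 16, 16)   # same shape as the pooling output

    a = x.view(x.size(0), -1)         # manual reshape
    b = nn.Flatten()(x)               # layer form; start_dim=1 by default
    c = torch.flatten(x, start_dim=1) # functional form

    print(a.shape, b.shape, c.shape)  # each is torch.Size([32, 16384])
    ```
    
    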


    Once you are done debugging, you can assemble your layers with nn.Sequential:

    model = nn.Sequential(
        nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1),
        nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3, stride=1, padding=1),
        nn.MaxPool2d(kernel_size=2, stride=2),
        nn.Flatten(),
        nn.Linear(in_features=64*16*16, out_features=256)
    )
    
    x = model(images)
    

    [Comments]:

      [Solution 2]:

      You need to flatten the output of the nn.MaxPool2d layer before feeding it to the nn.Linear layer.

      Try flattening the tensor with x = x.view(x.size(0), -1) before passing it to the fc layer.
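
      A minimal sketch of that fix, using a random tensor with the pooled shape from the question:

      ```python
      import torch

      x = torch.randn(32, 64, 16, 16)  # output shape of the pooling layer
      x = x.view(x.size(0), -1)        # keep the batch dim, flatten the rest
      print(x.shape)                   # torch.Size([32, 16384])
      ```
      
      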

      [Comments]:
