PyTorch: RGB values are in the 0-1 range after rescaling. How do I normalize images? (Answers)

【Question title】: PyTorch: RGB value ranges 0-1 after rescaling, how do I normalize images?
【Posted】: 2020-08-07 09:19:44
【Question】:

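I wrote a class to rescale images, but after preprocessing the RGB values fall in the range 0 to 1. What happened to the RGB values, which I would intuitively expect to be between 0 and 255? Below are the Rescale class and the RGB values after rescaling.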

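Questions:

Do I still need min-max normalization to map the RGB values to 0-1?

How do I apply transforms.Normalize, and where should it go: before or after Rescale? How do I compute the mean and standard deviation: from RGB values in the 0-255 range or in the 0-1 range?

Thank you for your time!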

from skimage import transform  # assumed: `transform` below is skimage.transform


class Rescale(object):
    """Rescale the image in a sample to a given size."""

    def __init__(self, output_size):
        assert isinstance(output_size, (int, tuple))
        self.output_size = output_size

    def __call__(self, sample):
        image, anno = sample['image'], sample['anno']

        # get original height and width of the image
        h, w = image.shape[0:2]

        # if output_size is an integer, match the shorter side to it
        # and keep the aspect ratio
        if isinstance(self.output_size, int):
            if h > w:
                new_h, new_w = h * self.output_size / w, self.output_size
            else:
                new_h, new_w = self.output_size, w * self.output_size / h

        # if output_size is a tuple (new_h, new_w)
        else:
            new_h, new_w = self.output_size
        new_h, new_w = int(new_h), int(new_w)

        # skimage's resize returns floats; by default uint8 input ends up in [0, 1]
        image = transform.resize(image, (new_h, new_w))
        return {'image': image, 'anno': anno}

[[[0.67264216 0.50980392 0.34503034]
  [0.67243905 0.51208121 0.34528431]
  [0.66719145 0.51817184 0.3459951 ]
  ...
  [0.23645098 0.2654311  0.3759458 ]
  [0.24476471 0.28003857 0.38963938]
  [0.24885877 0.28807445 0.40935877]]

 [[0.67465196 0.50994608 0.3452402 ]
  [0.68067157 0.52031373 0.3531848 ]
  [0.67603922 0.52732436 0.35839216]
  ...
  [0.23458333 0.25195098 0.36822142]
  [0.2461343  0.26886127 0.38314558]
  [0.2454384  0.27233056 0.39977664]]

 [[0.67707843 0.51237255 0.34766667]
  [0.68235294 0.5219951  0.35553024]
  [0.67772059 0.52747687 0.35659176]
  ...
  [0.24485294 0.24514568 0.36592999]
  [0.25407436 0.26205475 0.38063318]
  [0.2597007  0.27202914 0.40214216]]

 ...

[[[172 130  88]
  [172 130  88]
  [172 130  88]
  ...
  [ 63  74 102]
  [ 65  76 106]
  [ 67  77 112]]

 [[173 131  89]
  [173 131  89]
  [173 131  89]
  ...
  [ 65  74 103]
  [ 64  75 105]
  [ 63  73 108]]

 [[173 131  89]
  [174 132  90]
  [174 132  90]
  ...
  [ 63  72 101]
  [ 62  71 102]
  [ 61  69 105]]
  ...
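
A quick sanity check on the two dumps above, assuming the second one is the same image before rescaling: the float values are consistent with the uint8 values divided by 255. That matches the default behavior of skimage.transform.resize, which converts the image to floating point (passing preserve_range=True keeps the original value range). The sketch below only uses the first pixel of each dump, and the variable names are illustrative.

```python
# First pixel of the uint8 dump and of the float dump, copied from above.
original_pixel = (172, 130, 88)
resized_pixel = (0.67264216, 0.50980392, 0.34503034)

# Dividing by 255 maps the 0-255 range onto 0-1.
scaled = [v / 255 for v in original_pixel]

# The two agree up to the small interpolation error introduced by resizing.
max_diff = max(abs(s - r) for s, r in zip(scaled, resized_pixel))
print(scaled)
print(max_diff < 0.01)
```

So the image is already min-max scaled to 0-1 after Rescale; no extra division by 255 is needed before normalization.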

【Comments】:

Tags: python-3.x deep-learning computer-vision pytorch image-preprocessing

【Solution 1】:

You can do this with torchvision.

from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize(output_size),
    transforms.ToTensor(),  # PIL image -> float tensor in [0, 1]
])

This expects a PIL image as input and returns a tensor in the range [0, 1]. You can also add mean/std normalization, as follows:

transform = transforms.Compose([
    transforms.Resize(output_size),
    transforms.ToTensor(),
    transforms.Normalize(mean, std),
])

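Here mean and std are the per-channel mean and standard deviation over all pixels of all images in the training set. You need to compute them after all images have been resized and converted to torch tensors. One way is to apply the first two transforms (Resize and ToTensor) and then compute mean and std over all training images: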

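import torch

# Stack into (N, C, H, W); assumes train_data[i] is a (C, H, W) tensor and all images share one size.
x = torch.stack([train_data[i] for i in range(len(train_data))])
mean = torch.mean(x, dim=(0, 2, 3))  # per-channel mean
std = torch.std(x, dim=(0, 2, 3))    # per-channel standard deviation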

You then pass these mean and std values to the Normalize transform above.
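
If the training set is too large to hold in memory at once, the same per-channel statistics can be accumulated one image at a time. The following is a minimal sketch, not part of the original answer: `channel_mean_std` is a hypothetical helper, and the random tensors stand in for images that have already gone through Resize and ToTensor (so their values are in 0-1).

```python
import torch


def channel_mean_std(images):
    """Per-channel mean and std over an iterable of (C, H, W) float tensors."""
    n_pixels = 0
    channel_sum = None
    channel_sq_sum = None
    for img in images:
        flat = img.reshape(img.shape[0], -1).double()  # (C, H*W)
        if channel_sum is None:
            channel_sum = torch.zeros(flat.shape[0], dtype=torch.double)
            channel_sq_sum = torch.zeros(flat.shape[0], dtype=torch.double)
        channel_sum += flat.sum(dim=1)
        channel_sq_sum += (flat ** 2).sum(dim=1)
        n_pixels += flat.shape[1]
    mean = channel_sum / n_pixels
    # Population variance: E[x^2] - (E[x])^2
    var = channel_sq_sum / n_pixels - mean ** 2
    std = var.clamp(min=0).sqrt()
    return mean.float(), std.float()


# Toy stand-in for a dataset of resized, ToTensor-converted images.
torch.manual_seed(0)
fake_images = [torch.rand(3, 8, 8) for _ in range(5)]
mean, std = channel_mean_std(fake_images)
print(mean, std)
```

This version computes the population standard deviation, whereas torch.std applies Bessel's correction by default; with many pixels the difference is negligible. Because it never stacks the images, it also works when the images differ in size.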

【Comments】:

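Hi, thank you very much. How should I compute the mean and standard deviation: from the images before or after resizing? If I compute them before resizing while normalization runs after resizing, that doesn't make sense, because resizing changes the number of pixels, so normalizing with values computed from the original images seems wrong. But if I should use the mean and standard deviation computed after resizing, do I need to write a function that computes them between Resize and Normalize?
Thank you for your time. If I want to predict on a single image with the model, I have to normalize that image again, right?
Yes, use the same values as during training.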
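
To illustrate that last point, here is a rough sketch of normalizing a single image at prediction time with the training statistics. It spells out the per-channel operation that transforms.Normalize performs, (x - mean) / std. `normalize_image` is a hypothetical helper, and the statistics and the test image are placeholder values, not numbers from this thread.

```python
import torch


def normalize_image(img, mean, std):
    """Apply (img - mean) / std per channel to a (C, H, W) tensor."""
    mean = torch.as_tensor(mean, dtype=img.dtype).view(-1, 1, 1)
    std = torch.as_tensor(std, dtype=img.dtype).view(-1, 1, 1)
    return (img - mean) / std


# Placeholder statistics; in practice these are computed once on the training set.
train_mean = [0.5, 0.4, 0.3]
train_std = [0.2, 0.2, 0.2]

# Stand-in for one resized test image already scaled to [0, 1].
test_image = torch.full((3, 4, 4), 0.5)
normalized = normalize_image(test_image, train_mean, train_std)
print(normalized[:, 0, 0])
```

The statistics are not recomputed from the test image; the values stored from training are reused unchanged.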