【Question title】: Problem with pytorch dataset.imageFolder with custom dataset in Google Colab
【Posted】: 2023-03-21 17:14:01
【Question description】:

I am trying to load a dataset for a classification task with pytorch. This is the code I use:

import os
import torch
from torchvision import datasets, transforms

data_transforms = {
    'train': transforms.Compose([
        transforms.RandomRotation(2.8),
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize((0.5), (0.5))
    ]),
    'valid': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize((0.5), (0.5))
    ])
}
print(os.listdir())
# TODO: Load the datasets with ImageFolder
image_datasets = {x: datasets.ImageFolder(os.path.join("/content/drive/MyDrive/DatasetPersonale", x),
                                          data_transforms[x])
                  for x in ['train', 'valid']}
# TODO: Using the image datasets and the transforms, define the dataloaders
batch_size = 32
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size,
                                             shuffle=True, num_workers=4)
              for x in ['train', 'valid']}
class_names = image_datasets['train'].classes
print(class_names)
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'valid']}


The code works fine, but because my dataset is grayscale I need to convert it to RGB, so I used the following code:


import os
from PIL import Image

rootdir = '/content/drive/MyDrive/DatasetPersonale/trainRGB'
print("Train")
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        filePath = os.path.join(subdir, file)
        img = Image.open(filePath, mode="r")
        print(img.mode)
        # Convert non-RGB images (e.g. grayscale mode "L") and overwrite them in place
        if img.mode != "RGB":
            RGBimg = img.convert("RGB")
            RGBimg.save(filePath, format="JPEG")

My images are still jpeg, but they are now RGB instead of L. The problem is that when I rerun the code that loads the dataset, I get this error:

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-15-3dace4b0f21b> in <module>()
     19 image_datasets = {x: datasets.ImageFolder(os.path.join("/content/drive/MyDrive/DatasetPersonale", x),
     20                                           data_transforms[x])
---> 21                   for x in ['trainRGB', 'validRGB']}
     22 
     23 # TODO: Using the image datasets and the trainforms, define the dataloaders

4 frames
<ipython-input-15-3dace4b0f21b> in <dictcomp>(.0)
     19 image_datasets = {x: datasets.ImageFolder(os.path.join("/content/drive/MyDrive/DatasetPersonale", x),
     20                                           data_transforms[x])
---> 21                   for x in ['trainRGB', 'validRGB']}
     22 
     23 # TODO: Using the image datasets and the trainforms, define the dataloaders

/usr/local/lib/python3.7/dist-packages/torchvision/datasets/folder.py in __init__(self, root, transform, target_transform, loader, is_valid_file)
    311                                           transform=transform,
    312                                           target_transform=target_transform,
--> 313                                           is_valid_file=is_valid_file)
    314         self.imgs = self.samples

/usr/local/lib/python3.7/dist-packages/torchvision/datasets/folder.py in __init__(self, root, loader, extensions, transform, target_transform, is_valid_file)
    144                                             target_transform=target_transform)
    145         classes, class_to_idx = self.find_classes(self.root)
--> 146         samples = self.make_dataset(self.root, class_to_idx, extensions, is_valid_file)
    147 
    148         self.loader = loader

/usr/local/lib/python3.7/dist-packages/torchvision/datasets/folder.py in make_dataset(directory, class_to_idx, extensions, is_valid_file)
    190                 "The class_to_idx parameter cannot be None."
    191             )
--> 192         return make_dataset(directory, class_to_idx, extensions=extensions, is_valid_file=is_valid_file)
    193 
    194     def find_classes(self, directory: str) -> Tuple[List[str], Dict[str, int]]:

/usr/local/lib/python3.7/dist-packages/torchvision/datasets/folder.py in make_dataset(directory, class_to_idx, extensions, is_valid_file)
    100         if extensions is not None:
    101             msg += f"Supported extensions are: {', '.join(extensions)}"
--> 102         raise FileNotFoundError(msg)
    103 
    104     return instances

FileNotFoundError: Found no valid file for the classes .ipynb_checkpoints. Supported extensions are: .jpg, .jpeg, .png, .ppm, .bmp, .pgm, .tif, .tiff, .webp

Does anyone know why this error occurs? I checked the extensions of all the files and they are all jpeg.

Thanks.

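【Comments】:

  • Did you copy all the grayscale images to /content/drive/MyDrive/DatasetPersonale/trainRGB beforehand? Otherwise, for subdir, dirs, files in os.walk(rootdir): with rootdir = '/content/drive/MyDrive/DatasetPersonale/trainRGB' will not do anything, because there are no files. Or: are there suitable files in /content/drive/MyDrive/DatasetPersonale/trainRGB?
  • Yes, I copied every grayscale image into trainRGB. The files in the folder are fine, and they are all RGB.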

Tags: python-imaging-library classification google-colaboratory pytorch-dataloader


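【Solution 1】:

Problem: ImageFolder treats every subfolder of the root as a class. The folder /content/drive/MyDrive/DatasetPersonale/trainRGB contains a .ipynb_checkpoints subfolder, and the files inside it are not images with a supported extension (.jpg, .jpeg, .png, .ppm, .bmp, .pgm, .tif, .tiff, .webp), so no valid file is found for that "class".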
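To check which subfolder triggers the error, the class discovery can be approximated with the standard library alone. This is only a rough sketch, not torchvision's actual implementation: the helper names `count_images_per_class` and `empty_classes` are made up for illustration, and the extension list is copied from the error message above.

```python
import os

# Extensions copied from the FileNotFoundError message in the question.
IMG_EXTENSIONS = (".jpg", ".jpeg", ".png", ".ppm", ".bmp",
                  ".pgm", ".tif", ".tiff", ".webp")


def count_images_per_class(root):
    """Map each direct subfolder of `root` to its number of supported image files."""
    with os.scandir(root) as it:
        entries = sorted(it, key=lambda e: e.name)
    counts = {}
    for entry in entries:
        if not entry.is_dir():
            continue  # loose files in the root are not classes
        n = 0
        # Count files at any depth below the class folder.
        for _dirpath, _dirnames, files in os.walk(entry.path):
            n += sum(1 for f in files if f.lower().endswith(IMG_EXTENSIONS))
        counts[entry.name] = n
    return counts


def empty_classes(root):
    """Return the subfolders that would count as classes but hold no images."""
    return [name for name, n in count_images_per_class(root).items() if n == 0]
```

Calling `empty_classes('/content/drive/MyDrive/DatasetPersonale/trainRGB')` should list `.ipynb_checkpoints` if that folder is the culprit, since hidden folders are included in the scan.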

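Solution: You can save all the images in a subfolder named "images" and then change the root folder to /content/drive/MyDrive/DatasetPersonale/trainRGB/images, so that the .ipynb_checkpoints folder is not read together with your images.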
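A different option, which is not part of the answer above, is to delete the stray checkpoint folders instead of moving the images. The sketch below assumes the folders hold nothing you need. `remove_checkpoint_dirs` is a hypothetical helper name, and the notebook environment may recreate the folder later, in which case it has to be run again before loading the dataset.

```python
import os
import shutil


def remove_checkpoint_dirs(root, name=".ipynb_checkpoints"):
    """Delete every directory called `name` under `root`; return the deleted paths."""
    removed = []
    for dirpath, dirnames, _files in os.walk(root):
        if name in dirnames:
            target = os.path.join(dirpath, name)
            shutil.rmtree(target)
            # Drop it from the walk so os.walk does not try to descend into it.
            dirnames.remove(name)
            removed.append(target)
    return removed
```

For the dataset in the question this would be called as `remove_checkpoint_dirs('/content/drive/MyDrive/DatasetPersonale')` before constructing the ImageFolder objects. Printing the returned list shows what was deleted.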

【Comments】:
