【Question Title】: How can I train a model with different image shapes in a batch?
【Posted】: 2026-02-17 03:10:01
【Question】:

I am trying to train a model with dynamic input image sizes. It works with batch_size = 1, but it raises an error as soon as the batch size is greater than 1.

After some digging I learned that numpy only allows images of the same shape to be passed together as a batch.
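The restriction can be reproduced in a few lines. The sketch below uses synthetic arrays (the shapes are arbitrary placeholders, not taken from the asker's dataset) to show that `np.stack` builds a 4-D batch only when every image has the same shape, and raises `ValueError` otherwise.

```python
import numpy as np

# Two synthetic "images" with identical shape (height, width, channels).
same_a = np.zeros((32, 32, 3), dtype=np.float32)
same_b = np.ones((32, 32, 3), dtype=np.float32)

# Stacking works: the result gains a leading batch dimension.
batch = np.stack([same_a, same_b])
print(batch.shape)  # (2, 32, 32, 3)

# An image with a different spatial size cannot join the batch.
other = np.zeros((48, 64, 3), dtype=np.float32)
try:
    np.stack([same_a, other])
    stack_failed = False
except ValueError as err:
    stack_failed = True
    print("cannot stack:", err)
```

With batch_size = 1 every batch holds a single image, so the shapes trivially agree, which is consistent with the behavior described above.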

The code looks like this:

# Assumption: imread and imresize come from scipy.misc, which has been
# removed from recent SciPy releases, so this needs an older SciPy.
import glob

import cv2
import numpy as np
from scipy.misc import imread, imresize


def sample_images(data_dir, batch_size):
    # Make a list of all images inside the data directory
    all_images = glob.glob(data_dir)

    # Create empty lists for the sets of the given batch size
    low_resolution_images_set = []
    high_resolution_images_set = []

    for x in range(int(len(all_images) / batch_size)):
        # Choose a random batch of images
        images_batch = np.random.choice(all_images, size=batch_size)
        low_resolution_images = []
        high_resolution_images = []
        for img in images_batch:
            # Get an ndarray of the current image
            img1 = imread(img, mode='RGB')
            frame = cv2.imread(img)
            height, width, channels = frame.shape
            img1 = img1.astype(np.float32)

            # The shapes depend on each image's own size, so they differ between images
            low_resolution_shape = (int(height / 4), int(width / 4), channels)
            high_resolution_shape = (low_resolution_shape[0] * 4, low_resolution_shape[1] * 4, channels)

            img1_high_resolution = imresize(img1, high_resolution_shape)
            img1_low_resolution = imresize(img1, low_resolution_shape)

            # Do a random flip
            if np.random.random() < 0.5:
                img1_high_resolution = np.fliplr(img1_high_resolution)
                img1_low_resolution = np.fliplr(img1_low_resolution)

            high_resolution_images.append(img1_high_resolution)
            low_resolution_images.append(img1_low_resolution)

        high_resolution_images_set.append(high_resolution_images)
        low_resolution_images_set.append(low_resolution_images)
    return np.array(high_resolution_images_set), np.array(low_resolution_images_set)

How can I train my architecture with a batch size greater than 1?

【Comments】:

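  • Resize all images to the same shape before batching them. You can use cv2.resize().
  • Thanks @hafiz031, but the problem is that the input sizes in my training dataset vary a lot. (For example, I can't resize both a 10*10 and a 100*100 image to 50*50; I might lose important detail in the images.)
  • Check whether RaggedTensors work for you (although they are not yet 100% supported everywhere); otherwise, padding the images may be an option.
  • You may need to read up on how the model works, which assumptions it makes, and so on. That should be in the Keras/TensorFlow documentation. If you understand some of the theory behind machine learning, you will see why a consistent image size is needed.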
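The padding option mentioned in the comments can be sketched with plain NumPy. This is a minimal illustration, not code from the thread: `pad_batch` is a hypothetical helper, zero-padding at the bottom and right is an assumed convention, and the 10*10 and 100*100 sizes simply reuse the example from the comments. The mask is returned so that padded pixels can be ignored later (for instance in the loss); whether a given model tolerates padded input is something to check separately.

```python
import numpy as np


def pad_batch(images, pad_value=0.0):
    """Pad HxWxC images at the bottom/right to a common size and stack them.

    Hypothetical helper: returns the padded batch and a boolean mask that
    marks which pixels are real image content.
    """
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)
    channels = images[0].shape[2]

    batch = np.full((len(images), max_h, max_w, channels), pad_value, dtype=np.float32)
    mask = np.zeros((len(images), max_h, max_w), dtype=bool)
    for i, img in enumerate(images):
        h, w = img.shape[:2]
        batch[i, :h, :w, :] = img   # copy the image into the top-left corner
        mask[i, :h, :w] = True      # remember where the real pixels are
    return batch, mask


# Synthetic stand-ins for a small and a large training image.
images = [
    np.ones((10, 10, 3), dtype=np.float32),
    np.ones((100, 100, 3), dtype=np.float32),
]
batch, mask = pad_batch(images)
print(batch.shape)    # (2, 100, 100, 3)
print(mask[0].sum())  # 100
```

Unlike resizing, padding leaves the original pixels untouched, at the cost of wasted computation on the padded area when sizes differ a lot within one batch.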

Tags: python numpy tensorflow machine-learning keras


【Solution 1】:

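Since you replied to my comment that your images come in many different resolutions, here is a solution for you. Choose the resolution you want to use when preprocessing the training images. The code below resizes every image to that resolution, scaling it up or down as needed. Just set your source image folder and your destination folder. Once all images have the same size, you can proceed with further operations.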

# -*- coding: utf-8 -*-
"""
Created on Sat Mar  7 18:54:24 2020

@author: Hafiz
"""

import glob
import os

import cv2

# Folder for the resized images; it is created relative to the current working directory.
destination = 'destination'
os.makedirs(destination, exist_ok=True)

image_no = 0
resolution = (512, 512)  # use your desired resolution; cv2.resize reads it as (width, height)

# Insert your input image folder here. If the folder holds image types other
# than .jpg, use *.* instead of *.jpg (but make sure it contains only images).
for img_path in glob.glob(r'C:\Users\source/*.jpg'):
    img = cv2.imread(img_path)

    if img is None:  # cv2.imread returns None when the file cannot be read
        continue

    # Scale the image up or down to the target resolution
    img = cv2.resize(img, resolution)

    cv2.imwrite(os.path.join(destination, 'image_' + str(image_no) + '.jpg'), img)

    image_no = image_no + 1
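Once every image has the same size, batching is straightforward. The sketch below is an illustration under assumptions rather than part of the answer: `iter_batches` is a hypothetical helper, and the zero-filled arrays stand in for images already resized to 512x512 by the script above (in practice they would be loaded from the destination folder).

```python
import numpy as np


def iter_batches(images, batch_size, rng=None):
    """Yield shuffled mini-batches of shape (batch, height, width, channels)."""
    rng = np.random.default_rng() if rng is None else rng
    order = rng.permutation(len(images))
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        # np.stack succeeds because all images share one shape.
        yield np.stack([images[i] for i in idx])


# Stand-ins for five images that were already resized to 512x512.
resized = [np.zeros((512, 512, 3), dtype=np.uint8) for _ in range(5)]
shapes = [b.shape for b in iter_batches(resized, batch_size=2, rng=np.random.default_rng(0))]
print(shapes)  # [(2, 512, 512, 3), (2, 512, 512, 3), (1, 512, 512, 3)]
```

The last batch is smaller when the number of images is not a multiple of the batch size; drop it or keep it depending on what the training loop expects.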

【Comments】:
