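【Question title】: Splitting an image in Python
【Posted】: 2024-04-27 03:40:01
【Question description】:

I have a random input of size (H, W, C). I want to split it into Z images of the same size (224, 224, 3), where Z is a variable determined by the image size. It does not matter whether they overlap. My goal is to process the subimages and then reconstruct the original image. Any help would be appreciated!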

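【Discussion】:

  • Do you want to split the original image of size (H, W, C) into subimages?
  • My original image has dimensions (H, W, C). My goal is to crop it into overlapping patches of the same size (224, 224, 3), and after processing I want to reconstruct the original image.
  • What is the constant C, and how does it relate to 3?
  • Of course, C is the number of channels and equals 3; the height and width are random.
  • Splitting the image this way is not complicated, but for the reconstruction it matters whether the original image size is known for a set of subimages (the overlaps can then be recalculated on the fly, for example) or not (which might even involve image stitching).

Tags: python opencv python-imaging-library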


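【Solution 1】:

For the splitting part, I will simply refer to this Q&A. Instead of saving the coordinates of each subimage in the final list, I store the actual subimages there.

Regarding the reconstruction part: if the reconstruction method knows the final (or original) image size, it can simply recompute the number of subimages per direction and their corresponding overlaps; this is basically the same code as for splitting. From this information, you can easily reconstruct the original image.

If the reconstruction method only gets the list of subimages, it gets complicated! In my opinion, for arbitrary images (and their subimages), you would need advanced image stitching techniques.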
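As a minimal sketch of the known-size case described above (not the code from the answer): the helper names `tile_offsets`, `split` and `reconstruct_known_size` are hypothetical, the tiles are assumed to be stored column by column, and the image is assumed to be at least one tile large in each direction.

```python
import numpy as np


def tile_offsets(length, tile):
    """Start offsets of evenly spread, possibly overlapping tiles along one axis."""
    n = -(-length // tile)  # ceiling division: number of tiles needed
    if n == 1:
        return [0]
    total_overlap = n * tile - length
    base, extra = divmod(total_overlap, n - 1)
    offsets = [0]
    for i in range(n - 1):
        # The first `extra` overlaps are one pixel larger than the rest
        overlap = base + (1 if i < extra else 0)
        offsets.append(offsets[-1] + tile - overlap)
    return offsets


def split(image, tile_h, tile_w):
    """Column-major list of tiles (assumed storage order)."""
    ys = tile_offsets(image.shape[0], tile_h)
    xs = tile_offsets(image.shape[1], tile_w)
    return [image[y:y + tile_h, x:x + tile_w] for x in xs for y in ys]


def reconstruct_known_size(tiles, shape, tile_h, tile_w):
    """Paste tiles back at the offsets recomputed from the original shape."""
    ys = tile_offsets(shape[0], tile_h)
    xs = tile_offsets(shape[1], tile_w)
    out = np.zeros(shape, dtype=tiles[0].dtype)
    positions = [(y, x) for x in xs for y in ys]
    for tile, (y, x) in zip(tiles, positions):
        out[y:y + tile_h, x:x + tile_w] = tile
    return out


# Synthetic stand-in for a real image
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(500, 700, 3), dtype=np.uint8)
tiles = split(image, 224, 224)
restored = reconstruct_known_size(tiles, image.shape, 224, 224)
print(len(tiles), np.array_equal(image, restored))
```

Because the offsets depend only on the image shape and the tile size, nothing has to be stored alongside the tiles except the original shape. If the tiles were processed in between, overlapping regions are simply overwritten by the tile pasted last; blending them would be a separate design choice.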

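However, for images that

  1. are stored losslessly (internally or externally), and
  2. have "distinctive" overlapping regions,

I came up with the following idea: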

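  • From the first two subimages, you can brute-force search for the first vertical overlap in the first column. This works because the splitting method above stores the subimages in that order, and because of the two assumptions above.
  • Then, you can use that vertical overlap (or that value decremented by one) to find the remaining subimages of the first column. This works because the splitting method above guarantees the most equal overlaps (within +/- 1), and the first vertical overlap in each column is always the larger one.
    • Now, when reaching the end of the first column, you will not find a proper vertical overlap with the next subimage, because it is the first image of the second column. Note that this only works if assumption 2 above holds. Looking at the image in alexzander's answer, this procedure would fail: the lower part (all zeros) of the last subimage of the first column equals the top of the first subimage of the second column.
  • Having reached the end of the first column, we can determine the number of vertical overlaps and thus the number of subimages per column. From that, we also know the number of subimages per row. So now we do the whole thing again to find the proper horizontal overlaps.
  • With all horizontal and vertical overlaps, we can cut all subimages accordingly and stack them to reconstruct the original image.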
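The brute-force search in the first step can be sketched in isolation as follows. This is only an illustration on synthetic data; the helper name `find_vertical_overlap` is hypothetical, and it assumes exact (lossless) pixel values and a distinctive overlap region, as the two assumptions above require.

```python
import numpy as np


def find_vertical_overlap(upper, lower):
    """Smallest k >= 1 such that the last k rows of `upper` equal the
    first k rows of `lower`; None if no such k exists."""
    height = upper.shape[0]
    for k in range(1, height):
        if np.array_equal(upper[height - k:], lower[:k]):
            return k
    return None


# One random column strip, cut into two 12-row tiles that share 5 rows
rng = np.random.default_rng(1)
column = rng.integers(0, 256, size=(20, 8, 3), dtype=np.uint8)
upper, lower = column[:12], column[7:19]
print(find_vertical_overlap(upper, lower))
```

The horizontal case is the same search with the first two axes swapped, for example `find_vertical_overlap(left.swapaxes(0, 1), right.swapaxes(0, 1))`.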

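Yes, this is a lot of work, but you do not need advanced image stitching, and I think this reconstruction method works for most real-world images. Any large single-colored background (combined with small overlaps), or any lossy compression between generating the subimages and reconstructing the original image, will cause it to fail.

Here is the complete code: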
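To illustrate the single-colored background failure mode, here is a small sketch on synthetic data (an all-zero strip, which is an assumption made purely for the demonstration): every candidate overlap matches, so the true one cannot be identified.

```python
import numpy as np

# A uniform (all-zero) strip, cut into two 12-row tiles that truly share 5 rows
strip = np.zeros((19, 8, 3), dtype=np.uint8)
upper, lower = strip[:12], strip[7:19]

# Collect every overlap k for which the bottom of `upper` equals the top of `lower`
height = upper.shape[0]
matches = [k for k in range(1, height)
           if np.array_equal(upper[height - k:], lower[:k])]
print(matches)  # every k from 1 to 11 matches, not only the true overlap 5
```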

import cv2
import numpy as np


# Adapted from https://*.com/questions/58383814
def gen_subimages(image, hTile, wTile):

    h, w = image.shape[:2]

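    # Number of tiles (plain Python ints, so the arithmetic below cannot
    # overflow the way fixed-width NumPy scalars such as uint8 can)
    nTilesX = int(np.ceil(w / wTile))
    nTilesY = int(np.ceil(h / hTile))

    # Total remainders, i.e. the total overlap to distribute per direction
    remainderX = nTilesX * wTile - w
    remainderY = nTilesY * hTile - h

    # Number of overlaps per direction; at least 1 to avoid dividing by
    # zero when a single tile covers the image in that direction
    nGapsX = max(nTilesX - 1, 1)
    nGapsY = max(nTilesY - 1, 1)

    # Set up remainders (overlaps) per tile: spread as evenly as possible,
    # with the larger overlaps first
    remaindersX = np.full(nTilesX - 1, remainderX // nGapsX, dtype=int)
    remaindersY = np.full(nTilesY - 1, remainderY // nGapsY, dtype=int)
    remaindersX[:remainderX % nGapsX] += 1
    remaindersY[:remainderY % nGapsY] += 1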

    images = []

    # Determine proper tile boxes (column by column, top to bottom)
    x = 0
    for i in range(nTilesX):
        y = 0
        for j in range(nTilesY):
            images.append(image[y:y+hTile, x:x+wTile, :])
            if j < (nTilesY - 1):
                y = y + hTile - remaindersY[j]
        if i < (nTilesX - 1):
            x = x + wTile - remaindersX[i]

    return images


def reconstruct_image(subimages):

    n_si = len(subimages)
    height, width = subimages[0].shape[:2]

    # VERTICAL OVERLAPS

    # Brute-force search for first vertical overlap
    y_overlaps = []
    for y in np.arange(height - 1, 1, -1):
        if np.all(subimages[0][y:, ...] == subimages[1][:-y, ...]):
            y_overlaps.append(height - y)
            break

    if len(y_overlaps) > 0:
        y_ol = y_overlaps[0]

        # Get following vertical overlaps
        for i in np.arange(1, n_si - 1):
            if np.all(subimages[i][height - y_ol:, ...] ==
                      subimages[i + 1][:y_ol, ...]):
                y_overlaps.append(y_ol)
            elif np.all(subimages[i][height - (y_ol - 1):, ...] ==
                        subimages[i + 1][:(y_ol - 1), ...]):
                y_ol -= 1
                y_overlaps.append(y_ol)
            else:
                break

    nTilesY = len(y_overlaps) + 1
    nTilesX = n_si // nTilesY

    # HORIZONTAL OVERLAPS

    # Brute-force search for first horizontal overlap
    x_overlaps = []
    for x in np.arange(width - 1, 1, -1):
        if np.all(subimages[0][:, x:, :] == subimages[nTilesY][:, :-x, :]):
            x_overlaps.append(width - x)
            break

    if len(x_overlaps) > 0:
        x_ol = x_overlaps[0]

        # Get following horizontal overlaps; subimages are stored column
        # by column, so the first tile of each column is nTilesY apart
        for i in np.arange(nTilesY, n_si - nTilesY, nTilesY):
            if np.all(subimages[i][:, width - x_ol:, :] ==
                      subimages[i + nTilesY][:, :x_ol, :]):
                x_overlaps.append(x_ol)
            elif np.all(subimages[i][:, width - (x_ol - 1):, :] ==
                        subimages[i + nTilesY][:, :(x_ol - 1), :]):
                x_ol -= 1
                x_overlaps.append(x_ol)
            else:
                break

    # Get all properly cut subimages (overlapping rows/columns removed)
    x_overlaps.insert(0, 0)
    y_overlaps.insert(0, 0)
    stacks = [subimages[iy + (ix * nTilesY)][y:, x:, ...]
              for iy, y in enumerate(y_overlaps)
              for ix, x in enumerate(x_overlaps)]

    # Stack cut subimages row by row
    image_recon = np.vstack([np.hstack(stacks[i:i + nTilesX])
                            for i in np.arange(0, nTilesX * nTilesY, nTilesX)])

    return image_recon


img = cv2.imread('path/to/your/image.png')

images = gen_subimages(img, 224, 224)
for im in images:
    print(im.shape)

img_recon = reconstruct_image(images)
print('Original image == Reconstructed image:', np.all(img == img_recon))

As usual, this is my test image:

And this is the output of the code:

(224, 224, 3)
(224, 224, 3)
(224, 224, 3)
(224, 224, 3)
Original image == Reconstructed image: True

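As you can see, all subimages have the desired shape (224, 224, C), and the reconstructed image is identical to the original.

【Discussion】:

    【Solution 2】:

    I used this image (original.png, 1280 x 640) and the PIL library instead of cv2, if that is convenient for you.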

    Code

    
    import os
    from PIL import Image
    
    class SubImage(object):
        def __init__(self, pil_image, coordinates):
            # PIL image object
            self.img = pil_image
            # list of two (x, y) corner points, top-left and
            # bottom-right (the diagonal of the crop box), used to
            # crop from the original image and to paste the piece
            # back in place
            self.coords = coordinates
    
    
    def generate_sections(x_dim, y_dim, cut_x, cut_y):
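        """ cut positions from 0 to x_dim and from 0 to y_dim, in steps
            of the cut size, with the full dimension appended as the
            last position; note that yy holds the positions along
            x_dim and xx those along y_dim
        """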
        yy = []
        for y in range(0, x_dim, cut_x):
            yy.append(y)
        yy.append(x_dim)
    
        xx = []
        for x in range(0, y_dim, cut_y):
            xx.append(x)
        xx.append(y_dim)
    
        # two lists of ints (cut positions)
        return xx, yy
    
    
    def generate_crop_coordinates(xx, yy):
        """ every combination of pair with
            the values from above function
        """
        coords = []
        for x in xx:
            rows = []
            for y in yy:
                rows.append((x, y))
            coords.append(rows)
        return coords
    
    
    def generate_subimages(coords: list):
        # NOTE: relies on the module-level `img` opened in the
        # __main__ block below
        subimages = []
        for i in range(len(coords) - 1):
            row0 = coords[i]
            row1 = coords[i + 1]
    
            for ii in range(len(row0) - 1):
                x_pair, y_pair = row0[ii], row1[ii + 1]
    
                cropped = img.crop((x_pair[1], x_pair[0], y_pair[1], y_pair[0]))
                cropped_coords = [
                    (x_pair[1], x_pair[0]),
                    (y_pair[1], y_pair[0])
                ]
                subimg = SubImage(cropped, cropped_coords)
                subimages.append(subimg)
    
        # list of SubImage objects
        return subimages
    
    
    def get_dimensions(subimages: list):
        """ we need this for reconstruction
            because we don't know the original img size
            we only have the array of subimages
        """
        max_X = 0
        max_Y = 0
        for subimage in subimages:
            for coords in subimage.coords:
                # coords is a tuple
                max_X = coords[0] if coords[0] > max_X else max_X
                max_Y = coords[1] if coords[1] > max_Y else max_Y
    
        # max x and y are the size of image
        return max_X, max_Y
    
    
    def reconstruct_image(subimages: list, folder: str):
        y, x = get_dimensions(subimages)
        new_image = Image.new("RGBA", (y, x))
    
        for subimage in subimages:
            new_image.paste(subimage.img, subimage.coords[0])
    
        # saves locally
        new_image.save(os.path.join(folder, "reconstructed.png"))
        return new_image
    
    
    if __name__ == '__main__':
        # this is what you provide as user
        cut_size = (224, 224, 3)
    
        # and you provide the image, ofc
        original_path = "original.png"
        img = Image.open(original_path)
        img_size = img.size
    
        if cut_size[0] > img_size[0] or cut_size[1] > img_size[1]:
            raise ValueError("image size smaller than cut size.")
    
        xx, yy = generate_sections(*img_size, cut_size[0], cut_size[1])
    
        coords = generate_crop_coordinates(xx, yy)
    
        subimages = generate_subimages(coords)
    
        for index, subimage in enumerate(subimages, start=1):
            # -> if you want to save the pieces and see them
            subimage.img.save(f"{index}.png")
            # print(subimage.coords)
    
        folder = "."
        reconstruct_image(subimages, folder)
    

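    Some of the pieces (coordinate pairs):

    Part 1: [(0, 0), (224, 224)]

    Part 2: [(224, 0), (448, 224)]

    Part 3: [(448, 0), (672, 224)]

    OK, I think you get the idea.

    After running the script, I got the same image as the original.

    I know this is a lot of code for one question, but it is the quickest solution.

    Enjoy.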

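    【Discussion】:

    • Your code does not meet the requirements stated above: running it with the given example produces 12.png with shape (224, 160, 3) and 18.png with shape (192, 160, 3). As stated in the question, all patches should have shape (224, 224, 3), and they may overlap.
    • Not all subimages can have shape (224, 224, 3), because 1280 % 224 != 0 and 640 % 224 != 0.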
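The arithmetic behind this exchange can be checked with a short sketch. It assumes that tiles are allowed to overlap, as the question permits, and that the overlap is spread as evenly as possible; the helper name `overlaps` is hypothetical.

```python
import math


def overlaps(length, tile):
    """Number of full-size tiles needed to cover `length`, and the
    per-gap overlaps (larger ones first) that make them fit exactly."""
    n = math.ceil(length / tile)
    if n == 1:
        return n, []
    base, extra = divmod(n * tile - length, n - 1)
    return n, [base + 1] * extra + [base] * (n - 1 - extra)


for name, length in (("width", 1280), ("height", 640)):
    n, ov = overlaps(length, 224)
    print(name, n, ov)
# width 6 [13, 13, 13, 13, 12]
# height 3 [16, 16]
```

So without overlap the last row and column of pieces are necessarily smaller, but with overlaps of 12 to 16 pixels the 1280 x 640 example can be covered by 6 x 3 = 18 pieces that are all 224 x 224.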