Splitting an image in Python: answers

【Question title】: Splitting an image in Python
【Posted】: 2024-04-27 03:40:01
【Question description】:

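I have an input of arbitrary size (H,W,C). I want to split it into Z images of the same size (224,224,3), where Z is a variable determined by the image size. It doesn't matter whether they overlap! My goal is to process the subimages and then reconstruct the original image. Any help would be appreciated!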

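【Comments】:

Do you want to split the original image of size (H,W,C) into subimages?
My original image has size (H,W,C). My goal is to crop it into overlapping patches of the same size (224,224,3), and after processing them I want to reconstruct the original image.
What is the constant C, and how does it relate to 3?
C is of course the number of channels, equal to 3; the height and width are arbitrary.
Splitting an image this way is not complicated, but for the reconstruction it matters whether the original image size is known for a set of subimages (the overlaps can then be recomputed on the fly, for example) or not (which may even involve image stitching).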

Tags: python opencv python-imaging-library

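【Solution 1】:

For the splitting part, I'll simply refer to this Q&A. Instead of saving the coordinates of each subimage in the final list, I save the actual subimages there.

Regarding the reconstruction part: if the reconstruction method knows the final (or original) image size, it can simply recompute the number of subimages per direction and their corresponding overlaps; this is basically the same code as for the splitting. From that information, you can easily reconstruct the original image.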
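The known-size case can be sketched as follows. This is a minimal illustration, not the answer's own code: the helper names `tile_origins`, `split` and `reconstruct_known_size` are made up here, the random array stands in for a real image, and each image dimension is assumed to be at least as large as the tile size. The overlaps are distributed the same way as in the splitting code (as equal as possible, larger ones first), so the tile origins can be recomputed from the image size alone.

```python
import numpy as np


def tile_origins(size, tile):
    """Start offsets of overlapping tiles covering `size` pixels.

    Hypothetical helper; assumes size >= tile.
    """
    n = int(np.ceil(size / tile))
    if n == 1:
        return [0]
    total = n * tile - size                # total overlap to distribute
    base, extra = divmod(total, n - 1)     # larger overlaps come first
    overlaps = [base + 1 if k < extra else base for k in range(n - 1)]
    origins = [0]
    for ol in overlaps:
        origins.append(origins[-1] + tile - ol)
    return origins


def split(image, tile=224):
    """Return the tiles column by column (x outer loop, y inner loop)."""
    h, w = image.shape[:2]
    return [image[y:y + tile, x:x + tile]
            for x in tile_origins(w, tile)
            for y in tile_origins(h, tile)]


def reconstruct_known_size(subimages, h, w, tile=224):
    """Paste the tiles back at their recomputed origins."""
    out = np.zeros((h, w) + subimages[0].shape[2:], dtype=subimages[0].dtype)
    it = iter(subimages)
    for x in tile_origins(w, tile):
        for y in tile_origins(h, tile):
            out[y:y + tile, x:x + tile] = next(it)
    return out


# Random stand-in for a real image of size (H, W, C) = (500, 700, 3)
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(500, 700, 3), dtype=np.uint8)

tiles = split(img)
recon = reconstruct_known_size(tiles, 500, 700)
print(len(tiles), np.array_equal(img, recon))
```

Because the overlapping regions hold identical pixels in neighbouring tiles, simply overwriting them while pasting is enough here; if the tiles were modified by processing, the overlaps would need an explicit merge rule (for example averaging).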

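If the reconstruction method only gets the list of subimages, it gets complicated! In my opinion, for arbitrary images (and their subimages), you would need advanced image stitching techniques.

However, for images that

1. are stored losslessly (internally or externally), and
2. have "distinct" overlap regions,

I came up with the following idea: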

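- From the first two subimages, you can brute-force search for the first vertical overlap in the first column. This works because the splitting method above saves the subimages in that order, and because of the two assumptions above.
- Then you can use that vertical overlap (or the overlap decremented by one) to find the remaining subimages of the first column. This works because the splitting method above guarantees overlaps that are as equal as possible (within +/- 1), and the first vertical overlap in each column is always the larger one.
- Now, when reaching the end of the first column, you won't find a proper vertical overlap with the next subimage, because that one is the first image of the second column. Note that this only works if assumption 2 holds. Looking at the image in alexzander's answer, this procedure would fail: the lower part (all zeros) of the last subimage of the first column equals the top of the first subimage of the second column.
- When reaching the end of the first column, we know the number of vertical overlaps, and thus the number of subimages per column. From that, we also know the number of subimages per row. So now we do the whole thing again to find the proper horizontal overlaps.
- Having all horizontal and vertical overlaps, we can cut all subimages accordingly and stack them to reconstruct the original image.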

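Yes, this is a lot of work, but you don't need advanced image stitching, and I think this reconstruction method works for most real-world images. Any large uniform background (combined with small overlaps) or any lossy compression between generating the subimages and reconstructing the original image will cause it to fail.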
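To illustrate why a uniform background breaks the overlap search, here is a small toy sketch. The helper `first_vertical_overlap` is a hypothetical, simplified version of the brute-force step (it returns the smallest overlap at which the bottom rows of one tile equal the top rows of the next), and the tiny 32-pixel tiles are chosen only to keep the example short.

```python
import numpy as np


def first_vertical_overlap(top, bottom):
    """Smallest overlap (in rows) where the bottom of `top` equals the top of `bottom`.

    Hypothetical helper mirroring the brute-force search; returns None if
    no overlap between 1 and height - 2 rows matches.
    """
    height = top.shape[0]
    for ol in range(1, height - 1):
        if np.array_equal(top[height - ol:], bottom[:ol]):
            return ol
    return None


# Two 32-row tiles cut from a 54-row image overlap by exactly 10 rows.
rng = np.random.default_rng(0)

# Textured image: only the true overlap produces identical rows.
textured = rng.integers(0, 256, size=(54, 32, 3), dtype=np.uint8)
a, b = textured[:32], textured[22:]

# Uniform image: every candidate overlap matches, so the search stops
# at the very first (smallest) candidate.
flat = np.zeros((54, 32, 3), dtype=np.uint8)
c, d = flat[:32], flat[22:]

print('textured:', first_vertical_overlap(a, b))
print('uniform: ', first_vertical_overlap(c, d))
```

The textured pair should yield the true overlap of 10 rows, whereas the uniform pair reports an overlap of 1, which would make the reconstructed image too tall.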

Here's the full code:

import cv2
import numpy as np


# Adapted from https://*.com/questions/58383814
def gen_subimages(image, hTile, wTile):

    h, w = image.shape[:2]

    # Number of tiles per direction (plain ints: np.uint8 can overflow
    # in the multiplications below)
    nTilesX = int(np.ceil(w / wTile))
    nTilesY = int(np.ceil(h / hTile))

    # Total remainders
    remainderX = nTilesX * wTile - w
    remainderY = nTilesY * hTile - h

    # Set up remainders (overlaps) per tile; max(..., 1) avoids a division
    # by zero when a direction has only a single tile
    remaindersX = np.full(nTilesX - 1, remainderX // max(nTilesX - 1, 1), dtype=int)
    remaindersY = np.full(nTilesY - 1, remainderY // max(nTilesY - 1, 1), dtype=int)
    remaindersX[:remainderX % max(nTilesX - 1, 1)] += 1
    remaindersY[:remainderY % max(nTilesY - 1, 1)] += 1

    images = []

    # Determine proper tile boxes; tiles are stored column by column
    x = 0
    for i in range(nTilesX):
        y = 0
        for j in range(nTilesY):
            images.append(image[y:y+hTile, x:x+wTile, :])
            if j < (nTilesY - 1):
                y = y + hTile - remaindersY[j]
        if i < (nTilesX - 1):
            x = x + wTile - remaindersX[i]

    return images


def reconstruct_image(subimages):

    n_si = len(subimages)
    height, width = subimages[0].shape[:2]

    # VERTICAL OVERLAPS

    # Brute-force search for first vertical overlap
    y_overlaps = []
    for y in np.arange(height - 1, 1, -1):
        if np.all(subimages[0][y:, ...] == subimages[1][:-y, ...]):
            y_overlaps.append(height - y)
            break

    if len(y_overlaps) > 0:
        y_ol = y_overlaps[0]

        # Get following vertical overlaps
        for i in np.arange(1, n_si - 1):
            if np.all(subimages[i][height - y_ol:, ...] ==
                      subimages[i + 1][:y_ol, ...]):
                y_overlaps.append(y_ol)
            elif np.all(subimages[i][height - (y_ol - 1):, ...] ==
                        subimages[i + 1][:(y_ol - 1), ...]):
                y_ol -= 1
                y_overlaps.append(y_ol)
            else:
                break

    nTilesY = len(y_overlaps) + 1
    nTilesX = n_si // nTilesY

    # HORIZONTAL OVERLAPS

    # Brute-force search for first horizontal overlap
    x_overlaps = []
    for x in np.arange(width - 1, 1, -1):
        if np.all(subimages[0][:, x:, :] == subimages[nTilesY][:, :-x, :]):
            x_overlaps.append(width - x)
            break

    if len(x_overlaps) > 0:
        x_ol = x_overlaps[0]

        # Get following horizontal overlaps
        # Step by nTilesY: the first tile of each column is nTilesY apart
        for i in np.arange(nTilesY, n_si - nTilesY, nTilesY):
            if np.all(subimages[i][:, width - x_ol:, :] ==
                      subimages[i + nTilesY][:, :x_ol, :]):
                x_overlaps.append(x_ol)
            elif np.all(subimages[i][:, width - (x_ol - 1):, :] ==
                        subimages[i + nTilesY][:, :(x_ol - 1), :]):
                x_ol -= 1
                x_overlaps.append(x_ol)
            else:
                break

    # Get all properly cut subimages
    x_overlaps.insert(0, 0)
    y_overlaps.insert(0, 0)
    stacks = [subimages[iy + (ix * nTilesY)][y:, x:, ...]
              for iy, y in enumerate(y_overlaps)
              for ix, x in enumerate(x_overlaps)]

    # Stack cut subimages
    image_recon = np.vstack([np.hstack(stacks[i:i + nTilesX])
                            for i in np.arange(0, nTilesX * nTilesY, nTilesX)])

    return image_recon


img = cv2.imread('path/to/your/image.png')

images = gen_subimages(img, 224, 224)
for im in images:
    print(im.shape)

img_recon = reconstruct_image(images)
print('Original image == Reconstructed image:', np.all(img == img_recon))

As usual, this is my test image:

And this is the output of the code:

(224, 224, 3)
(224, 224, 3)
(224, 224, 3)
(224, 224, 3)
Original image == Reconstructed image: True

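As you can see, all subimages have the desired shape (224, 224, C), and the reconstructed image is identical to the original image.

【Discussion】:

【Solution 2】: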

I used this image (original.png, 1280 x 640) and the PIL library rather than cv2, if that is convenient for you.

Code:


import os
from PIL import Image

class SubImage(object):
    def __init__(self, pil_image, coordinates):
        # PIL image object
        self.img = pil_image
        # this is a list with coordinates
        # used to crop from the original image;
        # these coordinates must be used as
        # DIAGONAL in order to crop or put back in place
        self.coords = coordinates


def generate_sections(x_dim, y_dim, cut_x, cut_y):
    """ sections from 0 to X and 0 to Y with step by step
        step is the cut size
    """
    yy = []
    for y in range(0, x_dim, cut_x):
        yy.append(y)
    yy.append(x_dim)

    xx = []
    for x in range(0, y_dim, cut_y):
        xx.append(x)
    xx.append(y_dim)

    # lists of int tuples
    return xx, yy


def generate_crop_coordinates(xx, yy):
    """ every combination of pair with
        the values from above function
    """
    coords = []
    for x in xx:
        rows = []
        for y in yy:
            rows.append((x, y))
        coords.append(rows)
    return coords


def generate_subimages(coords: list):
    subimages = []
    for i in range(len(coords) - 1):
        row0 = coords[i]
        row1 = coords[i + 1]

        for ii in range(len(row0) - 1):
            x_pair, y_pair = row0[ii], row1[ii + 1]

            cropped = img.crop((x_pair[1], x_pair[0], y_pair[1], y_pair[0]))
            cropped_coords = [
                (x_pair[1], x_pair[0]),
                (y_pair[1], y_pair[0])
            ]
            subimg = SubImage(cropped, cropped_coords)
            subimages.append(subimg)

    # array of PIL Images
    return subimages


def get_dimensions(subimages: list):
    """ we need this for reconstruction
        because we don't know the original image size
        we only have the array of subimages
    """
    max_X = 0
    max_Y = 0
    for subimage in subimages:
        for coords in subimage.coords:
            # coords is a tuple
            max_X = coords[0] if coords[0] > max_X else max_X
            max_Y = coords[1] if coords[1] > max_Y else max_Y

    # max x and y are the size of image
    return max_X, max_Y


def reconstruct_image(subimages: list, folder: str):
    # get_dimensions returns (width, height), the order PIL expects
    width, height = get_dimensions(subimages)
    new_image = Image.new("RGBA", (width, height))

    for subimage in subimages:
        new_image.paste(subimage.img, subimage.coords[0])

    # saves locally
    new_image.save(os.path.join(folder, "reconstructed.png"))
    return new_image


if __name__ == '__main__':
    # this is what you provide as user
    cut_size = (224, 224, 3)

    # and you provide the image, ofc
    original_path = "original.png"
    img = Image.open(original_path)
    img_size = img.size

    if cut_size[0] > img_size[0] or cut_size[1] > img_size[1]:
        raise ValueError("image size smaller than cut size.")

    xx, yy = generate_sections(*img_size, cut_size[0], cut_size[1])

    coords = generate_crop_coordinates(xx, yy)

    subimages = generate_subimages(coords)

    for index, subimage in enumerate(subimages, start=1):
        # -> if you want to save the pieces and see them
        subimage.img.save(f"{index}.png")
        # print(subimage.coords)

    folder = "."
    reconstruct_image(subimages, folder)

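Some of the pieces (coordinate pairs):

Part 1: [(0, 0), (224, 224)]

Part 2: [(224, 0), (448, 224)]

Part 3: [(448, 0), (672, 224)]

OK, I think you get the idea.

After running the script, I got the same image as the original.

I know this is a lot of code for one question, but this is the fastest solution.

Enjoy.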

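【Discussion】:

Your code does not meet the requirements stated above: running it on the given example produces 12.png with shape (224, 160, 3) and 18.png with shape (192, 160, 3). As stated in the question, all patches should have shape (224, 224, 3), and they may overlap.
Not all subimages can have shape (224, 224, 3), because 1280 % 224 != 0 and 640 % 224 != 0.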
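Since the question allows overlaps, equal-size patches are possible even when the dimensions are not multiples of 224. The following is a minimal sketch with PIL, not the method of either answer: it spreads the patch start offsets evenly with `np.linspace`, and the helper name `overlapping_boxes` and the random stand-in image are assumptions made for illustration. It also assumes the image is at least 224 pixels in each direction.

```python
import numpy as np
from PIL import Image


def overlapping_boxes(width, height, tile=224):
    """Crop boxes (left, upper, right, lower), all of size tile x tile.

    Hypothetical helper; assumes width >= tile and height >= tile.
    """
    def starts(size):
        n = -(-size // tile)  # ceiling division: number of tiles needed
        # Evenly spaced offsets from 0 to size - tile; neighbours overlap
        return np.linspace(0, size - tile, n).round().astype(int).tolist()

    return [(x, y, x + tile, y + tile)
            for y in starts(height)
            for x in starts(width)]


# Random stand-in for original.png (1280 x 640)
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(640, 1280, 3), dtype=np.uint8)
img = Image.fromarray(pixels)

boxes = overlapping_boxes(*img.size)
crops = [img.crop(box) for box in boxes]

# Reconstruction: paste every crop back at its upper-left corner
recon = Image.new(img.mode, img.size)
for box, crop in zip(boxes, crops):
    recon.paste(crop, box[:2])

print(len(crops), {c.size for c in crops})
print(np.array_equal(np.asarray(img), np.asarray(recon)))
```

Keeping the boxes alongside the crops plays the same role as the `coords` attribute of `SubImage` above: the reconstruction needs to know where each patch came from.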