Python: make bigger gaps between words in an image

【Question title】: Python make bigger gaps between words in an image
【Posted】: 2021-11-20 09:31:18
【Question】:

I have the following image:

from PIL import Image
img = Image.open("without_space.png")
img.show()

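I would like to increase the gaps between the words so that it looks like this:

I thought about converting the image to a NumPy array:

import numpy as np
img = np.asarray(img)

and then enlarging the array along the x and y axes to make room for the larger gaps: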

def increase_padding(img):
    np_arr = np.asarray(img)

    shape = np_arr.shape

    y = shape[0]
    colors = shape[2]
    zeros = np.zeros([y,20,colors], dtype=np.uint8)
    zeros[:,:,3] = 255
    np_arr = np.append(np_arr,zeros, axis=1)
    np_arr = np.append(zeros, np_arr, axis=1)

    shape = np_arr.shape

    x = shape[1]
    colors = shape[2]

    zeros = np.zeros([20,x,colors], dtype=np.uint8)
    zeros[:,:,3] = 255
    np_arr = np.append(np_arr,zeros, axis=0)
    np_arr = np.append(zeros, np_arr, axis=0)

    return np_arr

This is the result:

img = Image.fromarray(increase_padding(img))
img.show()

The image now has more room to separate the words, but I am stuck at this point. Any ideas?

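【Question comments】:

You need some way to identify the words and then insert space between them, rather than at the left/right and top/bottom of the image. In general I don't think finding words in an image is an easy task, but in this example it looks like a few simple rules might work (especially if the image is pure black and white, i.e. every color value is either 0 or 255 and nothing else).
NumPy has a function for padding, np.pad. A solution to your problem has to identify the letters and words in the image. That is a complex program, not something a single answer can cover.
Detecting the bounding boxes of the text is actually easier than I expected.
Could you say what you are actually trying to achieve with this? There may be a better way. For example, do you know the text in the image?
I don't know the text in advance. But all of the text is in the format I showed above: black on white.

Tags: python numpy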
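
As a side note on the np.pad suggestion above, here is a minimal sketch of how the border padding from the question could be written with it. The function name pad_image and its pad and value parameters are made up for illustration, and the array is assumed to have the shape height x width x channels. This only adds a border around the image; it does not move the words apart.

```python
import numpy as np

def pad_image(arr, pad=20, value=255):
    """Pad an H x W x C image array by `pad` pixels on every side."""
    # Pad the row and column axes, but leave the channel axis untouched.
    return np.pad(arr, ((pad, pad), (pad, pad), (0, 0)),
                  mode="constant", constant_values=value)

demo = np.zeros((10, 30, 4), dtype=np.uint8)  # stand-in for an RGBA image
padded = pad_image(demo)
print(padded.shape)  # (50, 70, 4)
```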

【Solution 1】:

Your padding mechanism is not quite right. Here is my version:

import cv2
import numpy as np

ROI_number = 0
factor = 40
decrement = 20
margin = 3

#sorting code source
#https://gist.github.com/divyaprabha123/bfa1e44ebdfc6b578fd9715818f07aec
def sort_contours(cnts, method="left-to-right"):
    '''
    sort_contours : Function to sort contours
    argument:
        cnts (array): image contours
        method(string) : sorting direction
    output:
        cnts(list): sorted contours
        boundingBoxes(list): bounding boxes
    '''
    # initialize the reverse flag and sort index
    reverse = False
    i = 0

    # handle if we need to sort in reverse
    if method == "right-to-left" or method == "bottom-to-top":
        reverse = True

    # handle if we are sorting against the y-coordinate rather than
    # the x-coordinate of the bounding box
    if method == "top-to-bottom" or method == "bottom-to-top":
        i = 1

    # construct the list of bounding boxes and sort them from top to
    # bottom
    boundingBoxes = [cv2.boundingRect(c) for c in cnts]
    (cnts, boundingBoxes) = zip(*sorted(zip(cnts, boundingBoxes),
        key=lambda b:b[1][i], reverse=reverse))

    # return the list of sorted contours and bounding boxes
    return (cnts, boundingBoxes)

image = cv2.imread("test.png")

#use a black container of same shape to construct new image with gaps
container = np.zeros(image.shape, np.uint8)

gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (9, 9), 0)
thresh = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 25)

# Dilate to combine adjacent text contours
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
dilate = cv2.dilate(thresh, kernel, iterations=4)

# Find contours, highlight text areas, and extract ROIs
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

#sort so that order remain preserved
cnts = sort_contours(cnts)[0]

for c in cnts:
    ROI_number += 1
    area = cv2.contourArea(c)
    print(area)
    x, y, w, h = cv2.boundingRect(c)
    
    # grow the box by `margin` pixels on every side
    x -= margin
    y -= margin
    w += 2 * margin
    h += 2 * margin

    #extract region of interest e.g. the word
    roi = image[y : y + h, x : x + w].copy()
    factor -= decrement
        
    x = x - factor

    #copy the words from the original image to container image with gap factor
    container[y : y + h, x : x + w] = roi

cv2.imshow('image', container)
cv2.waitKey()

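The output is shown below. I assume that for other images you will have to tune this code so that it finds the best threshold automatically.

What I did is the following:

Extract the contours using a threshold
Sort the contours from left to right to get the correct word order
Create an empty container (a new image with the same size as the original)
Copy every word from the original image into the new container with padding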
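
The last two steps can also be sketched with NumPy alone. This is a variant, not the code above: it assumes the word boxes are already known as (x, y, w, h) tuples (for example from cv2.boundingRect), that they lie inside the image and do not overlap, and it uses a white canvas that is wider than the original so that shifted words cannot run off the edge. The name spread_boxes and its parameters are hypothetical.

```python
import numpy as np

def spread_boxes(image, boxes, extra_gap=20, background=255):
    """Copy each (x, y, w, h) box into a wider canvas, adding
    `extra_gap` pixels between consecutive boxes."""
    boxes = sorted(boxes, key=lambda b: b[0])  # left-to-right order
    height, width = image.shape[:2]
    # One extra gap is needed between each pair of neighbouring boxes.
    new_width = width + extra_gap * max(len(boxes) - 1, 0)
    canvas = np.full((height, new_width) + image.shape[2:],
                     background, dtype=image.dtype)
    for i, (x, y, w, h) in enumerate(boxes):
        shift = i * extra_gap  # the i-th box moves right by i gaps
        canvas[y:y + h, x + shift:x + shift + w] = image[y:y + h, x:x + w]
    return canvas

# Tiny synthetic example: two black "words" on a white strip.
image = np.full((5, 20), 255, dtype=np.uint8)
image[:, 2:6] = 0
image[:, 8:12] = 0
out = spread_boxes(image, [(8, 0, 4, 5), (2, 0, 4, 5)], extra_gap=3)
```

Here the gap between the two blocks grows from 2 to 5 columns, and the canvas grows from 20 to 23 columns.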

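【Comments】:

【Solution 2】:

To move the words in a bitmap, you need to determine the bounding boxes that correspond to those regions.

The horizontal extent of these bounding boxes can be identified from the horizontal spacing.

Your first step is to collapse the image onto the horizontal axis by taking the maximum of each column (this marks every column that contains at least one non-zero pixel, so for black text on a white background the image has to be inverted first):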

horizontal = np_arr.max(axis=0)

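Then you need to find the runs of 0s in that array that are at least a given length. These will be the margins and the spaces between words. (The threshold needs to be high enough to skip the spaces between letters.)

Finally, the parts between these 0-runs are the regions that contain the words.

【Comments】: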
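
One possible way to find such runs, as a minimal sketch: the helper name zero_runs and its min_length parameter are made up for illustration, and the input is assumed to be a 1-D array like the horizontal profile above.

```python
import numpy as np

def zero_runs(profile, min_length):
    """Return (start, stop) index pairs for every run of zeros that is
    at least `min_length` long; `stop` is exclusive."""
    is_zero = (np.asarray(profile) == 0).astype(np.int8)
    # Pad with a 0 on both sides so runs touching the ends are detected too.
    padded = np.concatenate(([0], is_zero, [0]))
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)   # a run of zeros begins here
    stops = np.flatnonzero(edges == -1)   # a run of zeros ends just before here
    return [(int(s), int(e)) for s, e in zip(starts, stops)
            if e - s >= min_length]

profile = [0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0]
print(zero_runs(profile, 2))  # [(0, 2), (6, 9)]
```

The single zeros at indices 4 and 10 are shorter than the threshold and are skipped, in the same way that gaps between letters would be.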
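
Putting these steps together, the gaps can be widened by cutting the image into column blocks and inserting extra empty columns into every sufficiently long gap. This is only a sketch under a few assumptions: the function name widen_gaps and its parameters are hypothetical, the input is a non-empty 2-D array in which text pixels are non-zero (for black text on white, something like 255 - gray), and the outer margins are left as they are.

```python
import numpy as np

def widen_gaps(ink, min_gap, extra):
    """Insert `extra` empty columns into every interior gap that is at
    least `min_gap` columns wide. `ink` is 2-D, non-zero where text is."""
    horizontal = ink.max(axis=0)   # non-zero for columns that contain text
    empty = horizontal == 0
    width = ink.shape[1]
    pieces = []
    col = 0
    while col < width:
        # Extend the block while the columns stay all-empty or all-text.
        end = col
        while end < width and empty[end] == empty[col]:
            end += 1
        pieces.append(ink[:, col:end])
        is_interior = col > 0 and end < width  # skip left/right margins
        if empty[col] and is_interior and end - col >= min_gap:
            pieces.append(np.zeros((ink.shape[0], extra), dtype=ink.dtype))
        col = end
    return np.concatenate(pieces, axis=1)

# Synthetic example: three text blocks, a 1-column letter gap
# and a 3-column word gap.
ink = np.zeros((3, 12), dtype=np.uint8)
ink[:, [1, 2, 4, 5, 9, 10]] = 255
wide = widen_gaps(ink, min_gap=2, extra=4)
```

In this example only the 3-column gap is widened, so the result is 16 columns wide. For a black-on-white image, the widened array can be inverted again with 255 - wide before it is shown.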