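Here is one possible approach. First, I'll try to detect each block of text. It doesn't matter whether the blocks (or their bounding boxes) overlap. After getting the bounding boxes of all the blobs in the image, I'll detect bounding box overlaps. If a bounding box overlaps another one, the same piece of text would be shared between two or more cropped images. I'll crop that portion and fill the overlapping area with a white rectangle, so the content shows up in only one image.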
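These are the important steps:
- Use morphology to get nice blocks of text.
- Detect contours on these blocks and convert the contours to bounding boxes.
- Loop through all bounding boxes and crop each one, check it against the other boxes for overlap, and fill any overlapping area with a white rectangle.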
Here's the code. First, we need to get those nice blocks of text:
import numpy as np
import cv2
# image path
path = "C:/opencvImages/"
fileName = "sheet05.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Some deep copies of the input mat:
inputCopy = inputImage.copy()
cleanInputCopy = inputCopy.copy()
# Grayscale conversion:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Thresholding:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
binaryImage = 255 - binaryImage
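Pretty standard stuff. Just read the image, convert it to grayscale, and get a binary image via Otsu's method. The result is inverted so the text is white on a black background. You'll notice I've created some deep copies of the input. These are mainly used to visualize results, since I initially drew every bounding box found and every overlapping area.
Apply some very strong morphology to get the best possible blocks of text. Here I apply dilation + erosion, with 10 iterations for each operation: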
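For readers curious about what the Otsu step is doing, here is a minimal pure-Python sketch of the idea: pick the threshold that maximizes the between-class variance of the histogram, then invert so dark text becomes white. The function name `otsu_threshold` and the sample pixel values are made up for illustration, and this is not OpenCV's implementation, only a sketch of the same principle.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximizes between-class variance.

    Pixels <= t form one class, pixels > t form the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the "low" class
    sum0 = 0    # sum of intensities in the "low" class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # low class still empty
        w1 = total - w0
        if w1 == 0:
            break               # high class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


# Hypothetical sample: dark "ink" pixels and bright "paper" pixels.
pixels = [10] * 50 + [20] * 50 + [200] * 30 + [220] * 20
t = otsu_threshold(pixels)

# Same effect as thresholding and then computing 255 - binaryImage:
# dark pixels (text) become 255, bright pixels (paper) become 0.
inverted = [255 if p <= t else 0 for p in pixels]
print(t, inverted[0], inverted[-1])
```

The threshold lands somewhere between the two intensity clusters, which is why no manual threshold value has to be chosen for the scanned sheet.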
# Dilate and Erode with a big Structuring Element:
kernelSize = 5
structuringElement = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
iterations = 10
dilatedImage = cv2.morphologyEx(binaryImage, cv2.MORPH_DILATE, structuringElement, None, None, iterations,
                                cv2.BORDER_REFLECT101)
erodedImage = cv2.morphologyEx(dilatedImage, cv2.MORPH_ERODE, structuringElement, None, None, iterations,
                               cv2.BORDER_REFLECT101)
The last snippet produces this image:
Now, get the outer contours of this image and compute the bounding boxes:
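To see why dilation followed by erosion merges nearby characters into one block, here is a small one-dimensional sketch in plain Python. The helpers `dilate` and `erode` are illustrative stand-ins, not OpenCV calls, and the border handling (simply truncating the window) is an assumption that differs from `cv2.BORDER_REFLECT101`.

```python
def dilate(row, radius, iterations=1):
    """Binary dilation: a pixel turns on if any neighbor in the window is on."""
    for _ in range(iterations):
        n = len(row)
        row = [max(row[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]
    return row


def erode(row, radius, iterations=1):
    """Binary erosion: a pixel stays on only if every neighbor in the window is on."""
    for _ in range(iterations):
        n = len(row)
        row = [min(row[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]
    return row


# Two "letters" separated by a small gap, plus a distant isolated pixel.
row = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0]

dilated = dilate(row, radius=1)
closed = erode(dilated, radius=1)

print(closed)
# [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
```

The small gap between the two letters is filled, so they become a single blob, while the distant pixel stays separate. A bigger kernel or more iterations bridges bigger gaps, which is what turns lines of text into solid blocks.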
# Find the big contours/blobs on the filtered image:
contours, hierarchy = cv2.findContours(erodedImage, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
contours_poly = [None] * len(contours)
boundRect = []
# Alright, just look for the outer bounding boxes
# (a parent index of -1 means the contour has no parent):
for i, c in enumerate(contours):
    if hierarchy[0][i][3] == -1:
        contours_poly[i] = cv2.approxPolyDP(c, 3, True)
        boundRect.append(cv2.boundingRect(contours_poly[i]))
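The `hierarchy[0][i][3] == -1` test can look cryptic. Each hierarchy entry holds four indices, `[next, previous, first_child, parent]`, and a parent of `-1` means the contour is not nested inside another one. The sketch below uses a hand-written, hypothetical hierarchy (a plain nested list mimicking the `(1, N, 4)` shape that OpenCV returns) to show the filter in isolation.

```python
# Hypothetical hierarchy for three contours:
#   contour 0: outer blob that contains contour 1
#   contour 1: hole inside contour 0
#   contour 2: another outer blob with nothing inside
# Each row is [next, previous, first_child, parent].
hierarchy = [[
    [2, -1, 1, -1],    # contour 0: no parent -> outer
    [-1, -1, -1, 0],   # contour 1: parent is contour 0 -> inner
    [-1, 0, -1, -1],   # contour 2: no parent -> outer
]]

PARENT = 3  # position of the parent index inside each row

outer_indices = [i for i, entry in enumerate(hierarchy[0]) if entry[PARENT] == -1]
print(outer_indices)
# [0, 2]
```

Only the outer contours are turned into bounding boxes, so holes inside a text block (the inside of an "o", for instance) do not produce extra boxes.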
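Now comes the important part. We loop through each bounding box, crop the image, and check whether the bounding box just used for the crop overlaps another bounding box. If it does, just draw a big white rectangle over that area and carry on with the next crop: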
# Color (BGR) for the debug rectangles. Assumed value, any color works:
color = (0, 255, 0)
# Loop through all bounding boxes
# (checkIntersection must be defined before this loop runs):
for i in range(len(boundRect)):
    # Get current boundRect as (x, y, width, height):
    sourceRect = boundRect[i]
    # Crop the roi. Copy it so later white fills do not alter this crop:
    croppedImg = cleanInputCopy[sourceRect[1]:sourceRect[1] + sourceRect[3],
                                sourceRect[0]:sourceRect[0] + sourceRect[2]].copy()
    # Check against other bounded rects:
    for j in range(len(boundRect)):
        # Get target boundRect:
        targetRect = boundRect[j]
        # Check for intersections:
        if i != j:
            foundIntersect, overlappedRect = checkIntersection(sourceRect, targetRect)
            if foundIntersect:
                # Found some overlapped rects, draw white rectangle at this location:
                cv2.rectangle(cleanInputCopy, (int(overlappedRect[0]), int(overlappedRect[1])),
                              (int(overlappedRect[0] + overlappedRect[2]), int(overlappedRect[1] + overlappedRect[3])),
                              (255, 255, 255), -1)
    # Draw the current bounding box on the visualization copy:
    cv2.rectangle(inputCopy, (int(boundRect[i][0]), int(boundRect[i][1])),
                  (int(boundRect[i][0] + boundRect[i][2]), int(boundRect[i][1] + boundRect[i][3])), color, 5)
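One detail worth keeping in mind for the crop step: NumPy slicing returns a view that shares memory with the source array, so a white fill applied to the source afterwards also appears in a crop taken earlier, unless that crop was copied (or already saved or displayed). The tiny sketch below uses a made-up 6x8 array and made-up rectangle values to show the difference, and also shows the `(x, y, w, h)` to `[y:y+h, x:x+w]` slicing order.

```python
import numpy as np

# Hypothetical 6x8 grayscale "page": 0 = ink everywhere, for simplicity.
page = np.zeros((6, 8), dtype=np.uint8)

# A bounding box in OpenCV order: (x, y, width, height).
x, y, w, h = 2, 1, 4, 3

# Rows (y) come first when slicing, columns (x) second.
view = page[y:y + h, x:x + w]              # shares memory with page
snapshot = page[y:y + h, x:x + w].copy()   # independent pixels

# White out an overlapping area on the page, here with slice assignment.
ox, oy, ow, oh = 4, 2, 2, 2
page[oy:oy + oh, ox:ox + ow] = 255

print(view.shape)            # (3, 4)
print(int(view.max()))       # 255 -> the view sees the white fill
print(int(snapshot.max()))   # 0   -> the copy keeps the original content
```

Whether the first crop should keep the shared text is a design choice. Copying the crop keeps the shared content in the first image and removes it from the following ones.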
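These are the bounding boxes detected for each binary blob:
The code detects the overlapping area and draws a white rectangle over the intersection, so it no longer shows up in the following crops:
These are the crops (note that these are separate images):
Now, the helper function that detects bounding box intersections is this: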
# Check for boxA and boxB intersection.
# Boxes are (x, y, width, height); returns (found, [x, y, w, h]).
def checkIntersection(boxA, boxB):
    x = max(boxA[0], boxB[0])
    y = max(boxA[1], boxB[1])
    w = min(boxA[0] + boxA[2], boxB[0] + boxB[2]) - x
    h = min(boxA[1] + boxA[3], boxB[1] + boxB[3]) - y
    foundIntersect = True
    if w < 0 or h < 0:
        foundIntersect = False
    return foundIntersect, [x, y, w, h]
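This is pretty simple. It just takes the coordinates of the two bounding rectangles and computes the intersection area. If the width or the height is less than zero, there is no intersection.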
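A quick worked example may help to check the arithmetic. The function is repeated here so the snippet runs on its own, and the box values are made up for illustration.

```python
def checkIntersection(boxA, boxB):
    # Boxes are (x, y, width, height).
    x = max(boxA[0], boxB[0])
    y = max(boxA[1], boxB[1])
    w = min(boxA[0] + boxA[2], boxB[0] + boxB[2]) - x
    h = min(boxA[1] + boxA[3], boxB[1] + boxB[3]) - y
    foundIntersect = not (w < 0 or h < 0)
    return foundIntersect, [x, y, w, h]


box_a = (0, 0, 10, 10)
box_b = (5, 5, 10, 10)     # overlaps the lower-right quarter of box_a
box_c = (20, 20, 5, 5)     # far away from box_a
box_d = (10, 0, 5, 5)      # shares only an edge with box_a

print(checkIntersection(box_a, box_b))   # (True, [5, 5, 5, 5])
print(checkIntersection(box_a, box_c))   # (False, [20, 20, -10, -10])
print(checkIntersection(box_a, box_d))   # (True, [10, 0, 0, 5])
```

Note the last case: boxes that only touch along an edge produce a width or height of exactly zero, which the `< 0` test still reports as an intersection with zero area. If that matters for your images, comparing with `<= 0` instead would treat touching boxes as non-overlapping.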