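I know this is a late reply, but I think future readers may benefit from it.
Below is the answer as I understand it from the passage above (all code is in OpenCV-Python v2.4-beta):
I used this as the input image. It is a simple image, chosen to make things easy to follow.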
First we generate the binary image of the given image by thresholding it at 80% of its intensity and inverting the resulting image.
import cv2
import numpy as np

img = cv2.imread('doc4.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Inverse binary threshold (type 1) at 80% of the brightest pixel:
# dark text becomes white (255), the light background becomes black (0)
ret, thresh = cv2.threshold(gray, 0.8*gray.max(), 255, cv2.THRESH_BINARY_INV)
Thresholded image:
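To see what this step does without the image at hand, here is a tiny pure-Python sketch on a made-up grid. It is not OpenCV code: the helper name `threshold_binary_inv` and the sample pixel values are hypothetical, and it only mimics the inverse binary rule (pixels above the threshold become 0, everything else becomes the maximum value).

```python
def threshold_binary_inv(gray, fraction=0.8, maxval=255):
    """Inverse binary threshold at `fraction` of the brightest pixel (hypothetical helper)."""
    t = fraction * max(max(row) for row in gray)
    # Pixels above the threshold (paper) -> 0, the rest (ink) -> maxval
    return [[0 if v > t else maxval for v in row] for row in gray]

# Made-up 3x4 grayscale patch: 250 is paper, 20-40 is ink
gray = [
    [250, 250, 250, 250],
    [250,  20,  40, 250],
    [250, 250,  30, 250],
]
binary = threshold_binary_inv(gray)
for row in binary:
    print(row)
```

The ink pixels come out white and the paper comes out black, which is the polarity contour finding expects.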
We considered simple 8-neighborhood connectivity and performed connected component (contour) analysis of the binary image leading to the segmentation of the textual components.
This is simply contour finding in OpenCV, which here plays the role of connected-component labelling. It picks out all the white blobs (components) in the image.
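# Pass a copy: in OpenCV 2.4 findContours modifies its input image.
# It returns two values in 2.4 and 4.x, but three in 3.x.
contours, hier = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)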
Contours:
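For anyone curious what 8-neighborhood connectivity means in practice, the following is a minimal pure-Python sketch of connected-component labelling by flood fill. It is an illustration of the idea only, not how `cv2.findContours` is implemented; the names `label_components` and `bounding_rect` and the sample grid are made up for this example.

```python
from collections import deque

def label_components(binary):
    """Return a list of components, each a list of (row, col) white pixels (8-connectivity)."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            # Flood-fill outward from this unvisited white pixel
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                cr, cc = queue.popleft()
                pixels.append((cr, cc))
                # Visit all 8 neighbours, diagonals included
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and binary[nr][nc] != 0 and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            components.append(pixels)
    return components

def bounding_rect(pixels):
    """(x, y, w, h) of a component, in the same order cv2.boundingRect returns."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)

binary = [
    [255,   0,   0,   0, 0],
    [  0, 255,   0,   0, 0],
    [  0,   0,   0, 255, 0],
    [  0,   0,   0, 255, 0],
]
components = label_components(binary)
rects = [bounding_rect(p) for p in components]
print(rects)
```

The two pixels touching only at a corner end up in the same component, which is exactly what distinguishes 8-connectivity from 4-connectivity.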
For next part of algorithm we use the minimum bounding rectangle of contours.
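Now we find the bounding rectangle around each detected contour, then discard the contours with a small area to get rid of commas and the like. See the statement: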
Smaller connected patterns were discarded based on the assumption that they may have originated due to noise dependent on image acquisition system and does not in any way contribute to the final layout. Also punctuation marks were neglected using smaller size criterion e.g. comma, full-stop etc.
We also compute the average height, avgh.
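# Assumption: the original snippet never defines thresh2. A BGR version of
# the thresholded image is used here so that colored shapes can be drawn on it.
thresh2 = cv2.cvtColor(thresh, cv2.COLOR_GRAY2BGR)

height = 0
num = 0
letters = []
ht = []

for cnt in contours:
    (x, y, w, h) = cv2.boundingRect(cnt)
    if w*h < 200:
        # Small blob (noise, comma, full stop): paint it black to erase it
        cv2.drawContours(thresh2, [cnt], 0, (0, 0, 0), -1)
    else:
        # Keep the component: draw a green box and record its height
        cv2.rectangle(thresh2, (x, y), (x+w, y+h), (0, 255, 0), 1)
        height = height + h
        num = num + 1
        letters.append(cnt)
        ht.append(h)

avgh = height/num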
After this, all commas and similar marks are removed, and green rectangles are drawn around the components that were kept:
At this level we also segregate the fonts based on the height of the bounding rect using avgh (average height) as threshold. Two thresholds are used to classify fonts into three categories - small, normal and large (according to the formula given in the paper).
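The average height obtained here, avgh, is 40. So a letter is small if its height is less than 26.66 (i.e. 40 x 2/3), normal if 26.66 < height < 60, and large if height > 60. In the given image, however, all heights lie between 28 and 58, so every letter is normal and you cannot see any difference.
So I made a small modification to make it easy to visualize: small if height <= 30, normal if 30 < height <= 50, and large if height > 50.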
for (cnt, h) in zip(letters, ht):
    print(h)
    if h <= 30:
        # small: filled in blue (colors are in BGR order)
        cv2.drawContours(thresh2, [cnt], 0, (255, 0, 0), -1)
    elif 30 < h <= 50:
        # normal: filled in green
        cv2.drawContours(thresh2, [cnt], 0, (0, 255, 0), -1)
    else:
        # large: filled in red
        cv2.drawContours(thresh2, [cnt], 0, (0, 0, 255), -1)

cv2.imshow('img', thresh2)
cv2.waitKey(0)
cv2.destroyAllWindows()
Now you get the result with the letters classified as small, normal and large:
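For comparison with the hand-picked 30/50 cutoffs, here is a small sketch of the avgh-relative rule described earlier. The lower multiplier 2/3 comes from the 26.66 figure above; the upper multiplier 1.5 is my assumption, inferred only from the fact that the upper bound of 60 equals 1.5 x 40. The function name `classify_height` is made up for this example.

```python
def classify_height(h, avgh, low=2.0 / 3.0, high=1.5):
    """Classify a glyph height as 'small', 'normal' or 'large' relative to avgh.

    low and high are multipliers of avgh; high=1.5 is an assumption inferred
    from the upper bound of 60 quoted for avgh = 40.
    """
    if h < low * avgh:
        return 'small'
    if h > high * avgh:
        return 'large'
    return 'normal'

avgh = 40
for h in (20, 28, 58, 75):
    print("%d %s" % (h, classify_height(h, avgh)))
```

With avgh = 40 the heights 28 and 58 both fall in the normal band, which matches the observation that the sample image shows no visible difference under the paper's thresholds.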
These rectangles were then sorted top-to-bottom and left-to-right order, using 2D point information of leftmost-topmost corner.
I omitted this part. It just sorts all the bounding rectangles by their top-left corner.
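Since that step is not shown, here is one way it could be sketched, assuming the rectangles are `(x, y, w, h)` tuples as returned by `cv2.boundingRect`. The helper name `sort_rects` and the sample rectangles are made up, and sorting on the exact `(y, x)` pair is my simplification, not necessarily what the paper does.

```python
def sort_rects(rects):
    """Sort (x, y, w, h) rectangles by top-left corner: y first, then x."""
    return sorted(rects, key=lambda r: (r[1], r[0]))

# Made-up rectangles on two text lines (y = 10 and y = 80)
rects = [(120, 10, 20, 40), (15, 80, 22, 38), (10, 10, 18, 42), (60, 80, 25, 41)]
ordered = sort_rects(rects)
for r in ordered:
    print(r)
```

On real scans, letters on the same text line rarely share the exact same y, so a practical version would probably need to group rectangles into lines with some tolerance before sorting left to right.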