【Question title】: Cropping images in OpenCV
【Posted】: 2013-10-04 15:33:48
【Question】:

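I have an image with some text in it. I want to send the image to OCR, but it contains some white noise, so the OCR results are not very good. I tried eroding/dilating the image, but could not find a threshold that works reliably. Since all the text in the image is perfectly horizontal, I tried the Hough transform.

This is the image I get when I run the sample Hough transform program bundled with OpenCV.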

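Questions

  • How can I black out everything except the areas marked by the red lines? Or, how can I crop out a separate image for each area highlighted by the red lines?

  • I only want to concentrate on the horizontal lines; the diagonal lines can be discarded.

Either option works for me when sending to OCR. However, I would like to try both and see which one gives the best results.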

【Question discussion】:

    Tags: opencv image-processing text hough-transform


    【Solution 1】:

    How-tos and output

    • How can I black out everything except the red lines?
      • dotess2()
      • ['Footel text goes he: e\n', 'Some mole hele\n', 'Some Text Here\n']
    • Or how can I crop out a separate image for each area highlighted by the red lines?
      • dotess1()
      • ['Foolel text goes he: e\n', 'Some mole hele\n', 'Some Text Here\n', 'Directions\n']
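
The two options differ only in what is done with the detected text boxes. As a minimal numpy-only sketch, assuming the boxes are already available as (x, y, w, h) tuples: the helper names `black_out_outside` and `crop_boxes` and the synthetic image are hypothetical and are not part of the code below.

```python
import numpy as np

def black_out_outside(img, boxes):
    """Return a copy of img with every pixel outside the boxes set to 0."""
    mask = np.zeros(img.shape[:2], dtype=bool)
    for x, y, w, h in boxes:
        mask[y:y + h, x:x + w] = True
    out = np.zeros_like(img)
    out[mask] = img[mask]
    return out

def crop_boxes(img, boxes):
    """Return one independent sub-image per box."""
    return [img[y:y + h, x:x + w].copy() for x, y, w, h in boxes]

# Synthetic 6x8 BGR image filled with 200, plus two hypothetical boxes.
img = np.full((6, 8, 3), 200, dtype=np.uint8)
boxes = [(1, 1, 3, 2), (4, 4, 4, 1)]

clean = black_out_outside(img, boxes)  # page-mode input (one image)
crops = crop_boxes(img, boxes)         # line-mode input (one image per box)
```

The first result suits a page-level OCR pass over a single cleaned image, the second a line-level pass over each crop, which is the split between dotess2() and dotess1().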

    Code

    # -*- coding: utf-8 -*- 
    import cv2
    import numpy as np
    import math
    import subprocess
    import os
    import operator
    
    #some clean up/init blah blah
    junk='\/,-‘’“ ”?.\';!{§_~!@#$%^&*()_+-|:}»£[]¢€¥°><'
    tmpdir='./tmp'
    if not os.path.exists(tmpdir):
        os.makedirs(tmpdir)
    for path, subdirs, files in os.walk(tmpdir):
        for name in files:
            os.remove(os.path.join(path, name))     
    
    #when the preprocessor is not perfect, there will be junk in the result. this is a crude means of getting rid of it
    def resfilter(res):
        rd = dict()
        for l in set(res):
            rd[l]=0.
    
        for l in rd:
            for i in l:
                if i in junk:
                    rd[l]-=1
                elif i.isdigit():
                    rd[l]+=.5
                else:
                    rd[l]+=1
        ret=[]
        for v in sorted(rd.items(), key=operator.itemgetter(1), reverse=True):
            ret.append(v[0])
        return ret
    
    def dotess1():
        res =[]
        for path, subdirs, files in os.walk(tmpdir):
            for name in files:
                fpath = os.path.join(path, name)
                img = cv2.imread(fpath)
                gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
    
                '''
                #if the text is too small/contains noise etc, resize and maintain aspect ratio
                if gray.shape[1]<100:
                    gray=cv2.resize(gray,(int(100.0/gray.shape[0]*gray.shape[1]),100))
                '''     
                cv2.imwrite('tmp.jpg',gray)
                args = ['tesseract.exe','tmp.jpg','tessres','-psm','7', '-l','eng']
                subprocess.call(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE) 
                with open('tessres.txt') as f:
                        for line in f:
                            if line.strip() != '':
                                res.append(line)
        print(resfilter(res))
    
    
    def dotess2():
        res =[]
        args = ['tesseract.exe','clean.jpg','tessres','-psm','3', '-l','eng']
        subprocess.call(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE) 
        with open('tessres.txt') as f:
                for line in f:
                    if line.strip() != '':
                        res.append(line)
        print(resfilter(res))
    
    '''
    start of code
    '''
    img = cv2.imread('c:/data/ocr3.png')
    gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
    canny=cv2.Canny(gray,50,200,apertureSize=3)
    cv2.imshow('canny',canny)
    
    #remove the actual horizontal lines so that hough wont detect them
    linek = np.zeros((11,11),dtype=np.uint8)
    linek[5,...]=1
    x=cv2.morphologyEx(canny, cv2.MORPH_OPEN, linek ,iterations=1)
    canny-=x
    cv2.imshow('canny no horizontal lines',canny)
    
    #draw a fat line so that you can box it up
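    #theta=pi/2 limits the search to horizontal and vertical lines
    lines = cv2.HoughLinesP(canny, 1, math.pi/2, 50, minLineLength=50, maxLineGap=20)
    #None means no line was found; the result is shaped (1,N,4) in OpenCV 2.4 and (N,1,4) in 3.x and later
    lines = [] if lines is None else lines.reshape(-1,4)
    linemask = np.zeros(gray.shape,gray.dtype)
    for line in lines:
        if line[1]==line[3]:#check horizontal
            pt1 = (int(line[0]),int(line[1]))
            pt2 = (int(line[2]),int(line[3]))
            cv2.line(linemask, pt1, pt2, (255), 30)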
    
    cv2.imshow('linemask',linemask)
    
    '''
    * two methods of doing ocr: line mode and page mode
    * boxmask is used so that a clean image can be saved for page mode
    * for every detected box, the roi is cropped and saved so that tess3 can be run in line mode
    '''
    
    boxmask = np.zeros(gray.shape,gray.dtype)
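    #findContours returns 2 values in OpenCV 2.4/4.x and 3 values in 3.x; the last two are always contours and hierarchy
    contours,hierarchy = cv2.findContours(linemask,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)[-2:]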
    idx=0
    for cnt in contours:
        idx+=1
        area = cv2.contourArea(cnt)
        x,y,w,h = cv2.boundingRect(cnt)
        roi=img[y:y+h,x:x+w].copy()
        cv2.imwrite('%s/%s.jpg'%(tmpdir,str(idx)),roi)
        cv2.rectangle(boxmask,(x,y),(x+w,y+h),(255),-1)
    
    
    cv2.imshow('clean',img&cv2.cvtColor(boxmask,cv2.COLOR_GRAY2BGR))
    cv2.imwrite('clean.jpg',img&cv2.cvtColor(boxmask,cv2.COLOR_GRAY2BGR))
    cv2.imshow('img',img)
    
    dotess1()
    dotess2()
    cv2.waitKey(0)
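
One detail to watch when running this on a newer OpenCV: the array returned by cv2.HoughLinesP is shaped (1, N, 4) in OpenCV 2.4 but (N, 1, 4) in OpenCV 3 and later, and it is None when nothing is detected. A minimal sketch of a horizontal-line filter that tolerates all three cases follows. The helper name `horizontal_segments` and the sample coordinates are hypothetical, and only numpy is used, so the arrays stand in for real HoughLinesP output.

```python
import numpy as np

def horizontal_segments(lines):
    """Return an (N, 4) array of x1, y1, x2, y2 rows where y1 == y2."""
    if lines is None:  # nothing was detected
        return np.empty((0, 4), dtype=np.int32)
    segs = np.asarray(lines).reshape(-1, 4)
    return segs[segs[:, 1] == segs[:, 3]]

# Hypothetical detections: two horizontal segments and one vertical segment.
raw = np.array([[10, 5, 90, 5],
                [20, 0, 20, 40],
                [0, 30, 50, 30]], dtype=np.int32)
old_style = raw.reshape(1, -1, 4)  # layout used by OpenCV 2.4
new_style = raw.reshape(-1, 1, 4)  # layout used by OpenCV 3 and later

print(horizontal_segments(old_style).tolist())
print(horizontal_segments(new_style).tolist())
```

Both calls keep the same two horizontal segments, so the drawing loop does not need to know which OpenCV version produced the array.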
    

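    【Discussion】:

    • You can read documentation with pictures online (not the wiki, which is a real mess), like this one: cs.ukzn.ac.za/~sviriri/COMP702/COMP702-6.pdf. Also try the morphological operations in OpenCV with different structuring elements. In this case we only want to keep the lines, so the structuring element must have a line through its center. It does not have to be (11,11); (11,9) or (15,11) with a row of ones in the center should both work. The matrix size is how you specify the minimum width of the line. Thicker lines can also be detected by specifying a thick row, such as linek[4,...]=1;linek[5,...]=1;linek[6,...]=1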
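
The structuring elements described in this comment can be sketched as follows. This is a numpy-only illustration under stated assumptions, not the answer's code: `line_kernel` and `open_horizontal` are hypothetical helpers, and `open_horizontal` is a simple stand-in for cv2.morphologyEx with MORPH_OPEN that zero-pads at the image border, so results near the edges may differ from OpenCV's.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def line_kernel(rows, cols, thickness=1):
    """Kernel of zeros with `thickness` rows of ones centred vertically."""
    k = np.zeros((rows, cols), dtype=np.uint8)
    top = rows // 2 - thickness // 2
    k[top:top + thickness, :] = 1
    return k

def open_horizontal(binary, width):
    """Opening with a single row of `width` ones (width must be odd)."""
    pad = width // 2
    # Erosion: a pixel survives only if its whole horizontal window is set.
    padded = np.pad(binary, ((0, 0), (pad, pad)))
    eroded = sliding_window_view(padded, width, axis=1).min(axis=-1)
    # Dilation: grow the survivors back to the original run length.
    padded = np.pad(eroded, ((0, 0), (pad, pad)))
    return sliding_window_view(padded, width, axis=1).max(axis=-1)

thin = line_kernel(11, 11)      # same pattern as linek[5,...]=1
thick = line_kernel(11, 11, 3)  # rows 4, 5 and 6 set to 1

img = np.zeros((5, 20), dtype=np.uint8)
img[1, 2:14] = 1  # 12-pixel run: longer than the kernel, kept by the opening
img[3, 5:9] = 1   # 4-pixel run: shorter than the kernel, removed
opened = open_horizontal(img, 11)
without_lines = img - opened  # same idea as `canny -= x` in the answer
```

A kernel with only its center row set behaves like a single row of that width, which is why the number of columns sets how long a horizontal run must be to survive the opening.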