【Question title】: Drawing bounding boxes with Pytesseract / OpenCV
【Posted】: 2020-01-31 18:39:26
【Question description】:

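I am using pytesseract (0.3.2) and OpenCV (4.1.2) to recognize digits in an image. While image_to_string works, image_to_data and image_to_boxes do not. I need to be able to draw bounding boxes on the image, and this has me stumped. I have tried different images, older versions of pytesseract, and so on. I am on Windows, working in Jupyter Notebooks.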

import cv2
import numpy as np
import pytesseract
from PIL import Image

#erosion
def erode(image):
    kernel = np.ones((5,5),np.uint8)
    return cv2.erode(image, kernel, iterations = 1)

#grayscale
def get_grayscale(image):
    return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

#thresholding
def thresholding(image):
    #return cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)
    return cv2.threshold(image, 200, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

img = cv2.imread('my_image.jpg')
pytesseract.pytesseract.tesseract_cmd = r'C:\mypath\tesseract.exe'

gray = get_grayscale(img)
thresh = thresholding(gray)
eroded = erode(thresh)

custom_config = r'-c tessedit_char_whitelist=0123456789 --psm 6'
print(pytesseract.image_to_string(eroded, config=custom_config))

cv2.imwrite("test.jpg", eroded)

#these return nothing
print(pytesseract.image_to_boxes(Image.open('test.jpg')))
print(pytesseract.image_to_data(Image.open('test.jpg')))

【Question comments】:

    Tags: python opencv jupyter-notebook computer-vision python-tesseract


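    【Solution 1】:

    Instead of using image_to_boxes, an alternative approach is to find the contours with cv2.findContours, obtain the bounding rectangle coordinates with cv2.boundingRect, and draw the bounding boxes with cv2.rectangle.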

    Using this sample input image

    Drawn boxes

    Result from OCR

    1234567890
    

    Code

    import cv2
    import pytesseract
    import numpy as np
    
    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
    
    # Load image, grayscale, Otsu's threshold
    image = cv2.imread('1.png')
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    
    # Draw bounding boxes
    cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    for c in cnts:
        x,y,w,h = cv2.boundingRect(c)
        cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
    
    # OCR
    data = pytesseract.image_to_string(255 - thresh, lang='eng',config='--psm 6')
    print(data)
    
    cv2.imshow('thresh', thresh)
    cv2.imshow('image', image)
    cv2.waitKey()
    

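    【Comments】:

    • It's funny how, 10 years later, the same questions still come up over and over. Anyway, you are doing a great job helping people!
    • I don't see a way to print the predicted characters together with their corresponding boxes using this approach.
    • I don't understand; aren't the bounding boxes what you wanted?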
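On the point raised in the comments about pairing each predicted character with its box: by default, pytesseract.image_to_boxes returns text with one line per character in the form `char left bottom right top page`, with y measured from the bottom of the image, whereas OpenCV drawing calls measure y from the top. The following is a minimal sketch of parsing that text and flipping the y axis. The helper name `parse_char_boxes` and the sample string are hypothetical, invented for illustration, and are not output from a real run.

```python
def parse_char_boxes(box_text, image_height):
    """Parse image_to_boxes-style text into (char, (x1, y1, x2, y2)) pairs.

    Input coordinates are assumed to use a bottom-left origin; the returned
    rectangles use a top-left origin, as OpenCV drawing calls expect.
    """
    pairs = []
    for line in box_text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        char = parts[0]
        left, bottom, right, top = (int(v) for v in parts[1:5])
        # Flip the y axis so that y grows downward from the image top.
        pairs.append((char, (left, image_height - top, right, image_height - bottom)))
    return pairs


# Made-up sample in the "char left bottom right top page" layout.
sample = "1 10 20 30 60 0\n2 40 20 60 60 0\n"
for char, (x1, y1, x2, y2) in parse_char_boxes(sample, image_height=100):
    print(char, (x1, y1), (x2, y2))
```

Each returned rectangle can then be passed to cv2.rectangle(image, (x1, y1), (x2, y2), ...) and the character to cv2.putText, with image_height taken from image.shape[0].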
    【Solution 2】:

    Please try the following code:

    from pytesseract import Output
    import pytesseract
    import cv2
     
    image = cv2.imread("my_image.jpg")
    
    # OpenCV stores images in BGR order by default, while pytesseract assumes RGB,
    # so convert from BGR to RGB before running OCR:
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
     
    pytesseract.pytesseract.tesseract_cmd = r'C:\mypath\tesseract.exe'
    custom_config = r'-c tessedit_char_whitelist=0123456789 --psm 6'
    results = pytesseract.image_to_data(rgb, output_type=Output.DICT,lang='eng',config=custom_config)
    boxresults = pytesseract.image_to_boxes(rgb,output_type=Output.DICT,lang='eng',config=custom_config)
    print(results)
    print(boxresults)
    
    for i in range(0, len(results["text"])):
        # extract the bounding box coordinates of the text region from the current result
        tmp_tl_x = results["left"][i]
        tmp_tl_y = results["top"][i]
        tmp_br_x = tmp_tl_x + results["width"][i]
        tmp_br_y = tmp_tl_y + results["height"][i] 
        tmp_level = results["level"][i]
        conf = results["conf"][i]
        text = results["text"][i]
        
        if(tmp_level == 5):
            cv2.putText(image, text, (tmp_tl_x, tmp_tl_y - 10), cv2.FONT_HERSHEY_SIMPLEX,0.5, (0, 0, 255), 1)
            cv2.rectangle(image, (tmp_tl_x, tmp_tl_y), (tmp_br_x, tmp_br_y), (0, 0, 255), 1)
            
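    # image_to_boxes measures y from the bottom of the image, while OpenCV
    # measures it from the top, so flip the y coordinates before drawing
    img_h = image.shape[0]
    for j in range(0,len(boxresults["left"])):
        left = boxresults["left"][j]
        bottom = img_h - boxresults["bottom"][j]
        right = boxresults["right"][j]
        top = img_h - boxresults["top"][j]
        cv2.rectangle(image, (left, top), (right, bottom), (255, 0, 0), 1)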
           
        
    cv2.imshow("image",image)
    cv2.waitKey(0)
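
The loop above draws only entries whose level is 5, which is the word level in Tesseract's output (1 = page, 2 = block, 3 = paragraph, 4 = line, 5 = word). As a hedged sketch, the same selection can be factored into a small helper that also drops empty text and low-confidence entries. The names `word_boxes` and `min_conf` and the sample dictionary are hypothetical; the dictionary only mimics the keys used above, with invented values.

```python
def word_boxes(results, min_conf=0.0):
    """Return (text, (x1, y1, x2, y2), conf) for word-level entries.

    `results` is assumed to have the same keys as the dict used above.
    Entries with empty text or confidence below `min_conf` are dropped.
    """
    boxes = []
    for i, text in enumerate(results["text"]):
        conf = float(results["conf"][i])  # conf may arrive as int, float or str
        if results["level"][i] != 5 or not text.strip() or conf < min_conf:
            continue
        x1, y1 = results["left"][i], results["top"][i]
        x2, y2 = x1 + results["width"][i], y1 + results["height"][i]
        boxes.append((text, (x1, y1, x2, y2), conf))
    return boxes


# Invented sample data, not real Tesseract output.
sample = {
    "level": [1, 5, 5],
    "left": [0, 10, 80],
    "top": [0, 12, 12],
    "width": [200, 60, 40],
    "height": [50, 20, 20],
    "conf": ["-1", 96, 31],
    "text": ["", "12345", "67"],
}
print(word_boxes(sample, min_conf=50))
```

The threshold of 50 is arbitrary; a suitable value depends on the images being processed.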
    
