How to recognize a slash between two numbers on images with Tesseract? Answers

【Question title】: How to recognize slash between two numbers on my images with Tesseract?
【Posted】: 2021-08-30 21:51:20
【Question description】:

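I have some images in which two numbers are separated by a / that sits very close to the digits. Tesseract either does not recognize the slash at all or, in most cases, reads it as 1 (it works for only a few images).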

My Tesseract code:

pytesseract.image_to_string(img,lang='eng',config='--psm 7 --oem 3 -c tessedit_char_whitelist=/0123456789').strip()

I have tried other psm and oem settings. I have also experimented a lot with the images, for example with cv2.threshold, cv2.cvtColor, and resizing.

Edit:

After applying

img = cv2.threshold(img, 200, 255, cv2.THRESH_BINARY_INV)[1]
img = cv2.resize(img, (0, 0), fx=1.5, fy=1.5)

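most images return correct values, but for some of them a 5 is inserted at a random position (converted image):

In a few cases the slash is still not recognized.

【Comments】: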

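a) Can you get higher-resolution images, preferably not in JPEG (lossy) format? b) Can you make the images black text on a white background?
a) I cannot get a better resolution. b) I did that with img = cv2.threshold(img, 200, 255, cv2.THRESH_BINARY_INV)[1]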

Tags: python image-processing ocr tesseract python-tesseract

【Solution 1】:

On my machine, reading the image as grayscale, upscaling it even further, and using a different threshold did the job:

import cv2
import pytesseract


def extract_stats(img_filepath):
    img = cv2.imread(img_filepath, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, (0, 0), None, 4.0, 4.0)
    img = cv2.threshold(img, 160, 255, cv2.THRESH_BINARY)[1]
    config = '--psm 6 -c tessedit_char_whitelist="0123456789/"'
    text = pytesseract.image_to_string(img, config=config)
    print(text.replace('\n', '').replace('\f', ''))


for filepath in ['Bzh3j.png', 't9gAh.png', 'BBy2P.png']:
    extract_stats(filepath)
# 4319/6149
# 943/7114
# 103/6149

----------------------------------------
System information
----------------------------------------
Platform:      Windows-10-10.0.19042-SP0
Python:        3.9.6
PyCharm:       2021.2
OpenCV:        4.5.3
pytesseract:   5.0.0-alpha.20201127
----------------------------------------
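
All three outputs above have the shape digits/digits, so a reading can be checked against that shape after OCR, and another preprocessing variant tried if the check fails. The following is only a sketch of that idea, not part of the answer above: `clean_reading`, `is_valid_reading`, and `first_valid_reading` are hypothetical helper names, and `ocr` stands for any callable that returns raw OCR text (for example a wrapper around `pytesseract.image_to_string`).

```python
import re

# Expected shape of a reading: digits, one slash, digits (e.g. "943/7114").
READING_PATTERN = re.compile(r"\d+/\d+")


def clean_reading(raw_text):
    """Remove the newlines and form feeds that end Tesseract's output."""
    return raw_text.replace("\n", "").replace("\f", "").strip()


def is_valid_reading(text):
    """Return True if text looks like '<number>/<number>'."""
    return READING_PATTERN.fullmatch(text) is not None


def first_valid_reading(ocr, candidates):
    """Run ocr on each candidate and return the first well-formed reading.

    ocr is a hypothetical hook: any function that maps a candidate
    (an image, a file path, a parameter set) to raw OCR text.
    Returns None if no candidate produces the expected shape.
    """
    for candidate in candidates:
        text = clean_reading(ocr(candidate))
        if is_valid_reading(text):
            return text
    return None
```

A check like this catches a missing slash or a slash read as 1 (the result then has no / at all), but it cannot detect an extra digit such as the stray 5 mentioned in the question, because that output still matches the pattern.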

【Comments】:

【Solution 2】:

With tesseract 5.0.0-alpha-20210401 and tessdata_best, I got the correct result with this code:

import cv2
import numpy as np
import pytesseract
from IPython.display import display
from PIL import Image

pytesseract.pytesseract.tesseract_cmd = r'bin\\tesseract.exe'
tessdata = "tessdata"

img = cv2.imread('t9gAh.png', cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img,(0,0), fx=3.0, fy=3.0)
bin_inverted = ~cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
text = pytesseract.image_to_string(bin_inverted, config=f'--psm 6 --tessdata-dir "{tessdata}"')
print(text.replace('\n\f', ''))
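
The code above passes 0 as the threshold together with cv2.THRESH_OTSU, which means the threshold is chosen automatically from the image histogram rather than fixed by hand. As a rough illustration of what Otsu's method computes, here is a pure-Python sketch for a flat list of 8-bit gray values. It is not OpenCV's implementation, the function names are made up for this example, and the exact value may differ from OpenCV's when several thresholds separate the classes equally well.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold for an iterable of gray values in 0..255."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = sum(histogram)
    weighted_total = sum(level * count for level, count in enumerate(histogram))

    best_threshold = 0
    best_variance = -1.0
    count_below = 0  # number of pixels with value <= threshold
    sum_below = 0    # sum of the gray values of those pixels

    for threshold in range(256):
        count_below += histogram[threshold]
        sum_below += threshold * histogram[threshold]
        count_above = total - count_below
        if count_below == 0 or count_above == 0:
            continue  # one class is empty, so this threshold splits nothing
        mean_below = sum_below / count_below
        mean_above = (weighted_total - sum_below) / count_above
        # Between-class variance, up to a constant factor of 1 / total**2.
        variance = count_below * count_above * (mean_below - mean_above) ** 2
        if variance > best_variance:
            best_variance = variance
            best_threshold = threshold
    return best_threshold


def binarize_inverted(pixels, threshold):
    """Map values above the threshold to 0 and all others to 255."""
    return [0 if value > threshold else 255 for value in pixels]
```

`binarize_inverted` mirrors the `~cv2.threshold(...)` step: pixels brighter than the threshold become black and the rest become white.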

【Comments】: