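Pytesseract questions: answers

【Question title】: Pytesseract questions
【Posted】: 2020-09-29 14:56:53
【Question description】:

I'm trying to read numbers from a screenshot I took in a game, but I can't get the numbers to come out correctly.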

from pyautogui import *
import pyautogui as pg
import time
import keyboard
import random
import win32api, win32con
import threading
import cv2
import numpy
from pynput.mouse import Button, Controller
from pynput.keyboard import Listener, KeyCode
from PIL import Image
import pytesseract
# Point pytesseract at the Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'D:\Python\Tesseract\tesseract.exe'

# Tesseract config: --psm 6 assumes a uniform block of text; digits only
custom_config = r'--dpi 300 --psm 6 --oem 3 -c tessedit_char_whitelist=0123456789'

# 1. Load the image as grayscale
img = cv2.imread("price.png", cv2.IMREAD_GRAYSCALE)
# Lighten dark pixels (<= 150) to 231, then set gray value 199
# (the yellow after grayscale conversion) to black to fill in the digits
img[img <= 150] = 231
img[img == 199] = 0
cv2.imwrite('resultfirst.png', img)
# 2. Scale it 10x
scaled = cv2.resize(img, (0, 0), fx=10, fy=10, interpolation=cv2.INTER_CUBIC)
# 3. Smooth with a bilateral filter
filtered = cv2.bilateralFilter(scaled, 11, 17, 17)
# 4. Binarize (inverted) with Otsu's method
thresh = cv2.threshold(filtered, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
time.sleep(1)
# 5. Erode the image to bulk it up for tesseract
kernel = numpy.ones((5, 5), numpy.uint8)
eroded = cv2.erode(thresh, kernel, iterations=2)
pre_processed = eroded

output = pytesseract.image_to_string(pre_processed, config=custom_config)

cv2.imwrite('result.png', pre_processed)
print(output)

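The image is very clear, yet the output is 13500 or 18500; no modification I have tried makes the 7 come back correctly. Is there a better approach, or am I missing something?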

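Edit:

After converting the yellow (gray after grayscale conversion) to black to fill in the digits, I managed to get a better result. I have added the conversion code to the code block.

Before: [image: the original result]
After: [image: the current result]

The problem is that pytesseract still returns the 7 as a 1 every time. I don't think I can make this 7 look any more like a 7. What can I do?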
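
The yellow-to-black replacement described in the edit depends on knowing the exact gray value to replace (199 in the code above). A minimal sketch of how one might check which gray levels actually occur before hard-coding such a value is given below. The helper names `gray_level_counts` and `replace_level`, and the small `patch` grid, are hypothetical and only for illustration; the sketch uses plain Python lists so it runs without OpenCV.

```python
from collections import Counter


def gray_level_counts(pixels):
    """Count how often each gray value occurs in a 2-D grid of pixels."""
    counts = Counter()
    for row in pixels:
        counts.update(int(v) for v in row)
    return counts


def replace_level(pixels, old, new):
    """Return a copy of the grid with every pixel equal to `old` set to `new`."""
    return [[new if v == old else v for v in row] for row in pixels]


# Hypothetical 3x4 patch (assumed values): 231 = background,
# 255 = white character pixels, 199 = yellow fill after grayscale conversion.
patch = [
    [231, 255, 255, 231],
    [231, 255, 199, 255],
    [231, 231, 199, 255],
]

counts = gray_level_counts(patch)
filled = replace_level(patch, old=199, new=0)

print(sorted(counts.items()))  # gray levels present and their frequencies
print(filled)                  # same patch with the 199 pixels turned black
```

For a NumPy array such as the `img` returned by `cv2.imread`, `numpy.unique(img, return_counts=True)` reports the same information in one call. If the value to replace does not appear in the counts, the `img[img == 199] = 0` line has no effect on that image.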

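【Question comments】:

You're hard-coding the region. Are you sure you aren't cutting off part of the 7?
@rassar Yes, the region is only used for the screenshot (to remove any unnecessary clutter that might confuse tesseract). The image I posted here is the result, and nothing is cut off.
Interesting, I get the same result. I'll take a look.
@rassar I've edited the main post with the new results.

Tags: python python-3.x python-tesseract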

【Solution 1】:

I'm not sure how general this solution is, but if all your images look like this one, a threshold of 103 will work:

import cv2
import pytesseract

image = cv2.imread('price.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Fixed threshold: pixels brighter than 103 become white, the rest black
threshold = 103
_, img_binarized = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)

config = '--dpi 300 --psm 6 --oem 1 -c tessedit_char_whitelist=0123456789'
print(pytesseract.image_to_string(img_binarized, config=config).strip())

This gives 78500 on my machine.
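
The answer uses a fixed cut-off (`cv2.THRESH_BINARY` with 103), whereas the question's pipeline passes `cv2.THRESH_OTSU`, in which case the threshold argument is ignored and a value is derived from the image histogram instead. The sketch below illustrates both rules on plain Python lists so it runs without OpenCV. The function names `binarize` and `otsu_threshold` and the `sample` grid are hypothetical, and OpenCV's own implementation may differ in details such as tie-breaking.

```python
def binarize(pixels, threshold, maxval=255):
    """Fixed threshold: values strictly above `threshold` become `maxval`, the rest 0."""
    return [[maxval if v > threshold else 0 for v in row] for row in pixels]


def otsu_threshold(pixels):
    """Return the 8-bit threshold that maximizes between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in pixels:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of their gray values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so this split is not usable
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Hypothetical 2x4 grayscale patch (assumed values, not taken from price.png).
sample = [
    [40, 90, 110, 200],
    [60, 100, 105, 220],
]

fixed = binarize(sample, 103)   # same rule as the answer's fixed threshold
t = otsu_threshold(sample)      # threshold chosen from the histogram
print(fixed)
print(t, binarize(sample, t))
```

Comparing the two outputs on a real screenshot is one way to see whether the automatically chosen threshold separates the digits differently from the hand-picked 103.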

【Comments】:

Oh, that looks much simpler than what I was doing. I'll give it a try! @rassar