【Question title】: OCR on a binary image
【Posted】: 2019-03-21 14:50:55
【Question】:

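I have a binary text image like this one (black text on a white background, reading "cat").

I want to run OCR on images like this. Each one contains only a single word. I have already tried Tesseract and Google Cloud Vision, but neither returns any result. I am using Python 3.6 on Windows 10.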

# export GOOGLE_APPLICATION_CREDENTIALS=kyourcredentials.json
import io
import cv2
from PIL import Image

# Imports the Google Cloud client library
from google.cloud import vision
from google.cloud.vision import types

# Instantiates a client
client = vision.ImageAnnotatorClient()

with io.open("test.png", 'rb') as image_file:
    content = image_file.read()

image = types.Image(content=content)
response = client.text_detection(image=image)
texts = response.text_annotations
resp = ''

for text in texts:
    resp+=' ' + text.description

print(resp)

from PIL import Image as im
import pytesseract as ts
print(ts.image_to_string(im.fromarray(canvas.reshape((480,640)),'L'))) # canvas contains the Mat object from which the image is saved to png

This image should be an easy task for either of them, so I suspect I am missing something in my code. Please help!
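One thing that may be worth checking on the Tesseract side: by default Tesseract runs full page layout analysis, which can come back empty on an image that holds a single word. The sketch below passes `--psm 8`, Tesseract's "treat the image as a single word" page segmentation mode, through pytesseract's `config` argument. It assumes pytesseract, Pillow and the Tesseract binary are installed and that the image was saved as test.png; `tesseract_config` and `ocr_single_word` are hypothetical helper names, not part of pytesseract.

```python
def tesseract_config(psm=8, oem=None):
    """Build a Tesseract command-line config string (hypothetical helper).

    psm 8 asks Tesseract to treat the image as a single word.
    """
    parts = ["--psm {}".format(psm)]
    if oem is not None:
        parts.append("--oem {}".format(oem))
    return " ".join(parts)


def ocr_single_word(path, psm=8):
    """Run Tesseract on one image file and return the stripped text."""
    # Imported lazily so tesseract_config() stays usable without Tesseract.
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        gray = img.convert("L")  # 8-bit grayscale
        text = pytesseract.image_to_string(gray, config=tesseract_config(psm))
    return text.strip()


# Usage (assumes test.png exists and Tesseract is on the PATH):
#   print(ocr_single_word("test.png"))
```

If mode 8 still returns nothing, mode 7 (single text line) and a few pixels of white padding around the word are other things to try; whether either helps depends on the image.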

Edit:

Thanks to F10 for pointing me in the right direction. This is how I got it working with a local image.

# export GOOGLE_APPLICATION_CREDENTIALS=kyourcredentials.json
import io
import cv2
from PIL import Image

# Imports the Google Cloud client library
from google.cloud import vision
from google.cloud.vision import types
from google.cloud.vision import enums

# Instantiates a client
client = vision.ImageAnnotatorClient()

with io.open("test.png", 'rb') as image_file:
    content = image_file.read()

features = [
    types.Feature(type=enums.Feature.Type.DOCUMENT_TEXT_DETECTION)
]


image = types.Image(content=content)

request = types.image_annotator_pb2.AnnotateImageRequest(image=image, features=features)
response = client.annotate_image(request)

print(response)
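Printing the whole response dumps every page, block and symbol. If only the recognized string is needed, it can be pulled out of the response. A minimal sketch, assuming the response exposes the `error.message`, `full_text_annotation.text` and `text_annotations[i].description` fields of the Vision API's AnnotateImageResponse; `extract_text` is a hypothetical helper, and the `SimpleNamespace` object only stands in for a real response so the snippet runs without credentials.

```python
from types import SimpleNamespace


def extract_text(response):
    """Return the recognized text from a Vision response (hypothetical helper)."""
    # The API reports per-image failures in response.error, not as exceptions.
    error = getattr(response, "error", None)
    if error is not None and getattr(error, "message", ""):
        raise RuntimeError("Vision API error: {}".format(error.message))

    # DOCUMENT_TEXT_DETECTION fills full_text_annotation.text.
    full = getattr(response, "full_text_annotation", None)
    text = getattr(full, "text", "") if full is not None else ""
    if text:
        return text.strip()

    # Fallback: the first text annotation carries the whole detected text,
    # the following ones are the individual words.
    annotations = getattr(response, "text_annotations", None) or []
    return annotations[0].description.strip() if annotations else ""


# Stand-in for a real response, for illustration only.
fake_response = SimpleNamespace(
    error=SimpleNamespace(message=""),
    full_text_annotation=SimpleNamespace(text="cat\n"),
    text_annotations=[],
)
print(extract_text(fake_response))  # prints: cat
```

Because the first entry of `text_annotations` already contains the full text, concatenating every entry's `description` (as the first snippet in the question does) would repeat each word.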

【Comments】:

    Tags: ocr google-cloud-vision python-tesseract


    【Solution 1】:

    Based on this document, I used the following code and was able to get text: "cat\n" as output:

    from pprint import pprint
    
    # Imports the Google Cloud client library
    from google.cloud import vision
    
    # Instantiates a client
    client = vision.ImageAnnotatorClient()
    
    # The name of the image file to annotate
    response = client.annotate_image({
      'image': {'source': {'image_uri': 'gs://<your_bucket>/ORW90.png'}},
      'features': [{'type': vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION}],
    })
    
    pprint(response)
    

    Hope this helps.
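The answer reads the image from a Cloud Storage URI, while the question works with a local file. The same dict-style request can carry either one. A minimal sketch, assuming the pre-2.0 google-cloud-vision client used in this thread (the one that still ships `vision.enums`); `build_request` is a hypothetical helper, not part of the library.

```python
def build_request(feature_type, content=None, image_uri=None):
    """Build the dict form of an annotate_image request (hypothetical helper).

    Pass exactly one of `content` (raw image bytes) or `image_uri`
    (for example a gs:// path).
    """
    if (content is None) == (image_uri is None):
        raise ValueError("pass exactly one of content or image_uri")
    if content is not None:
        image = {"content": content}
    else:
        image = {"source": {"image_uri": image_uri}}
    return {"image": image, "features": [{"type": feature_type}]}


# Usage sketch (needs google-cloud-vision < 2.0 and credentials, so it is
# left commented out):
#
#   from google.cloud import vision
#   client = vision.ImageAnnotatorClient()
#   with open("test.png", "rb") as image_file:
#       request = build_request(
#           vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION,
#           content=image_file.read(),
#       )
#   response = client.annotate_image(request)
```

Keeping the feature type as a parameter makes it easy to compare DOCUMENT_TEXT_DETECTION with TEXT_DETECTION on the same image.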

    【Comments】:
