Azure OCR difference between demo page and console: answers

【Question title】: Azure OCR difference between demo page and console
【Posted】: 2019-01-20 05:36:30
【Question】:

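I have several sample images that I need to recognize using OCR.

I tried recognizing them on the demo page https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/ and it works quite well. I use the "Read text in images" option, which works even better than "Read handwritten text from images".

But when I make the REST call from a script (following the example given in the documentation), the results are much worse. Some letters are recognized incorrectly and some are missed entirely. If I run the same example from the development console https://westcentralus.dev.cognitive.microsoft.com/docs/services/5adf991815e1060e6355ad44/operations/56f91f2e778daf14a499e1fc/console, I still get the same bad results.

What can cause this difference? How can I fix it to get results as reliable as the ones the demo page produces?

Is any other information needed?

UPD: Since I could not find any solution, or even an explanation of the difference, I created a sample file (similar to the actual files) so you can take a look. The file is at http://sfiles.herokuapp.com/sample.png

As you can see, if it is used on the demo page https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/ in the "Read text in images" section, the resulting JSON is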

{
  "status": "Succeeded",
  "succeeded": true,
  "failed": false,
  "finished": true,
  "recognitionResult": {
    "lines": [
      {
        "boundingBox": [
          307,
          159,
          385,
          158,
          386,
          173,
          308,
          174
        ],
        "text": "October 2011",
        "words": [
          {
            "boundingBox": [
              308,
              160,
              357,
              160,
              357,
              174,
              308,
              175
            ],
            "text": "October"
          },
          {
            "boundingBox": [
              357,
              160,
              387,
              159,
              387,
              174,
              357,
              174
            ],
            "text": "2011"
          }
        ]
      },
      {
        "boundingBox": [
          426,
          157,
          519,
          158,
          519,
          173,
          425,
          172
        ],
        "text": "07UC14PII0244",
        "words": [
          {
            "boundingBox": [
              426,
              160,
              520,
              159,
              520,
              174,
              426,
              174
            ],
            "text": "07UC14PII0244"
          }
        ]
      }
    ]
  }
}

If I use this file in the console and make the following call:

POST https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr?language=unk&detectOrientation =true HTTP/1.1
Host: westcentralus.api.cognitive.microsoft.com
Content-Type: application/json
Ocp-Apim-Subscription-Key: ••••••••••••••••••••••••••••••••

{"url":"http://sfiles.herokuapp.com/sample.png"}

I get a different result:

{
  "language": "el",
  "textAngle": 0.0,
  "orientation": "Up",
  "regions": [{
    "boundingBox": "309,161,75,10",
    "lines": [{
      "boundingBox": "309,161,75,10",
      "words": [{
        "boundingBox": "309,161,46,10",
        "text": "October"
      }, {
        "boundingBox": "358,162,26,9",
        "text": "2011"
      }]
    }]
  }, {
    "boundingBox": "428,161,92,10",
    "lines": [{
      "boundingBox": "428,161,92,10",
      "words": [{
        "boundingBox": "428,161,92,10",
        "text": "071_lC14P110244"
      }]
    }]
  }]
}

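As you can see, the results are quite different (even the JSON format). Does anyone know what I am doing wrong, or what I am missing, or whether the "Read text in images" demo simply does not correspond to the API's ocr method?

Any help is greatly appreciated.

【Comments】:

What does your REST call look like? Are the parameters the same as the ones the web console sends?
@MoA The console shows a request preview like this: POST https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr?language=unk&detectOrientation =true HTTP/1.1 Host: westcentralus.api.cognitive.microsoft.com Content-Type: application/json Ocp-Apim-Subscription-Key: •••••••••••••••••••••••••••••••• {"url":"http://url.of/the/file.png"} This exact file URL is used on the demo page and works noticeably better. The results from the console and from my script are identical, so I think that if I get it working from the console, the problem is solved.
@MoA I have just updated the question with a sample image and a sample call, so maybe that helps to understand the problem?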

Tags: rest azure ocr

【Solution 1】:

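There are two flavors of OCR in Microsoft Cognitive Services. The newer endpoint (/recognizeText) has better recognition capabilities but currently supports only English. The older endpoint (/ocr) has broader language coverage.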
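
For illustration, here is a rough Python sketch of calling the newer endpoint instead of /ocr. It is not taken from the original answer. It assumes the v2.0 behaviour as I understand it: the POST takes a `mode=Printed` query parameter and replies with an `Operation-Location` header, and that URL is then polled until `status` is `Succeeded` or `Failed`. The function names, the polling interval, and the retry limit are hypothetical, so check them against the API reference before relying on them.

```python
import json
import time
import urllib.parse
import urllib.request


def build_recognize_text_request(host, key, image_url, mode="Printed"):
    """Build the POST request for /recognizeText (assumed v2.0 shape)."""
    query = urllib.parse.urlencode({"mode": mode})
    url = "https://{}/vision/v2.0/recognizeText?{}".format(host, query)
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def recognize_text(host, key, image_url, poll_seconds=1.0, max_polls=30):
    """Submit the image, then poll the operation URL until it finishes."""
    request = build_recognize_text_request(host, key, image_url)
    with urllib.request.urlopen(request) as response:
        # Assumption: the service returns the polling URL in this header.
        operation_url = response.headers["Operation-Location"]

    poll = urllib.request.Request(
        operation_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    for _ in range(max_polls):
        with urllib.request.urlopen(poll) as response:
            result = json.load(response)
        if result.get("status") in ("Succeeded", "Failed"):
            return result
        time.sleep(poll_seconds)
    raise TimeoutError("recognizeText did not finish in time")
```

Calling `recognize_text("westcentralus.api.cognitive.microsoft.com", "<your key>", "http://sfiles.herokuapp.com/sample.png")` should then return JSON in the same `recognitionResult` / `lines` shape that the demo page shows in the question, if the assumptions above hold.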

Some additional details about the differences are available in this post.
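
Because the two endpoints return differently shaped JSON, comparing their output is easier after flattening both to plain text lines. The helper below is a minimal sketch that assumes only the two response shapes quoted in the question; the function names are made up for this example.

```python
def lines_from_recognize_text(result):
    """Text lines from the demo-page shape: recognitionResult -> lines -> text."""
    recognition = result.get("recognitionResult", {})
    return [line["text"] for line in recognition.get("lines", [])]


def lines_from_ocr(result):
    """Text lines from the /ocr shape: regions -> lines -> words -> text."""
    lines = []
    for region in result.get("regions", []):
        for line in region.get("lines", []):
            # /ocr returns only words, so join them to rebuild the line.
            lines.append(" ".join(word["text"] for word in line.get("words", [])))
    return lines


# Trimmed copies of the two responses from the question (bounding boxes omitted).
demo_result = {
    "status": "Succeeded",
    "recognitionResult": {
        "lines": [{"text": "October 2011"}, {"text": "07UC14PII0244"}]
    },
}
ocr_result = {
    "language": "el",
    "regions": [
        {"lines": [{"words": [{"text": "October"}, {"text": "2011"}]}]},
        {"lines": [{"words": [{"text": "071_lC14P110244"}]}]},
    ],
}

demo_lines = lines_from_recognize_text(demo_result)
ocr_lines = lines_from_ocr(ocr_result)

# Report every position where the two endpoints disagree.
for demo_line, ocr_line in zip(demo_lines, ocr_lines):
    if demo_line != ocr_line:
        print("mismatch: {!r} vs {!r}".format(demo_line, ocr_line))
```

With the sample data from the question, the only mismatch is the serial number: the demo page reads `07UC14PII0244`, while /ocr reads `071_lC14P110244`.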

【Discussion】: