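【Question title】: Google Vision OCR returning too much information
【Posted】: 2019-11-06 01:38:37
【Question】:

I created a simple class to test the Google Vision OCR API. I am passing in a simple image containing five letters, which should return the string "CRAIG". However, the API call returns a lot of extra information: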

{
    "property": {
        "detectedLanguages": [
            {
                "languageCode": "en"
            }
        ]
    },
    "boundingBox": {
        "vertices": [
            {
                "x": 183,
                "y": 105
            },
            {
                "x": 674,
                "y": 105
            },
            {
                "x": 674,
                "y": 253
            },
            {
                "x": 183,
                "y": 253
            }
        ]
    },
    "symbols": [
        {
            "property": {
                "detectedLanguages": [
                    {
                        "languageCode": "en"
                    }
                ]
            },
            "boundingBox": {
                "vertices": [
                    {
                        "x": 183,
                        "y": 105
                    },
                    {
                        "x": 257,
                        "y": 105
                    },
                    {
                        "x": 257,
                        "y": 253
                    },
                    {
                        "x": 183,
                        "y": 253
                    }
                ]
            },
            "text": "C",
            "confidence": 0.99
        },
        {
            "property": {
                "detectedLanguages": [
                    {
                        "languageCode": "en"
                    }
                ]
            },
            "boundingBox": {
                "vertices": [
                    {
                        "x": 249,
                        "y": 105
                    },
                    {
                        "x": 371,
                        "y": 105
                    },
                    {
                        "x": 371,
                        "y": 253
                    },
                    {
                        "x": 249,
                        "y": 253
                    }
                ]
            },
            "text": "R",
            "confidence": 0.99
        },
        {
            "property": {
                "detectedLanguages": [
                    {
                        "languageCode": "en"
                    }
                ]
            },
            "boundingBox": {
                "vertices": [
                    {
                        "x": 459,
                        "y": 105
                    },
                    {
                        "x": 581,
                        "y": 105
                    },
                    {
                        "x": 581,
                        "y": 253
                    },
                    {
                        "x": 459,
                        "y": 253
                    }
                ]
            },
            "text": "A",
            "confidence": 0.99
        },
        {
            "property": {
                "detectedLanguages": [
                    {
                        "languageCode": "en"
                    }
                ]
            },
            "boundingBox": {
                "vertices": [
                    {
                        "x": 582,
                        "y": 105
                    },
                    {
                        "x": 638,
                        "y": 105
                    },
                    {
                        "x": 638,
                        "y": 253
                    },
                    {
                        "x": 582,
                        "y": 253
                    }
                ]
            },
            "text": "I",
            "confidence": 0.98
        },
        {
            "property": {
                "detectedLanguages": [
                    {
                        "languageCode": "en"
                    }
                ],
                "detectedBreak": {
                    "type": "LINE_BREAK"
                }
            },
            "boundingBox": {
                "vertices": [
                    {
                        "x": 636,
                        "y": 105
                    },
                    {
                        "x": 674,
                        "y": 105
                    },
                    {
                        "x": 674,
                        "y": 253
                    },
                    {
                        "x": 636,
                        "y": 253
                    }
                ]
            },
            "text": "G",
            "confidence": 0.99
        }
    ],
    "confidence": 0.98
}

How can I get back only the letters?

Class:

public static void Main(string[] args)
{
    string credential_path = @"C:\Users\35385\nodal.json";
    System.Environment.SetEnvironmentVariable("GOOGLE_APPLICATION_CREDENTIALS", credential_path);

    // Instantiates a client
    var client = ImageAnnotatorClient.Create();
    // Load the image file into memory
    var image = Image.FromFile("vision.jpg");
    // Performs document text detection on the image file
    var response = client.DetectDocumentText(image);

    foreach (var page in response.Pages)
    {
        foreach (var block in page.Blocks)
        {
            foreach (var paragraph in block.Paragraphs)
            {
                // Each Word is written via its ToString(), which is what produces the JSON shown above
                Console.WriteLine(string.Join("\n", paragraph.Words));
            }
        }
    }
}

The image I pass in is a single word that I drew in Paint:

【Comments】:

Tags: c# api google-api google-vision


【Solution 1】:

Try changing

var response = client.DetectDocumentText(image);

to

var response = client.DetectText(image);

Explanation

Here is some information from the Google Cloud Vision API documentation:

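The Vision API can detect and extract text from images. Two annotation features support optical character recognition (OCR):

  • TEXT_DETECTION detects and extracts text from any image. For example, a photograph might contain a street sign or traffic sign. The JSON includes the entire extracted string, as well as individual words and their bounding boxes.

  • DOCUMENT_TEXT_DETECTION also extracts text from an image, but the response is optimized for dense text and documents. The JSON includes page, block, paragraph, word, and break information.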
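
As a rough sketch of what this means in C#, either call can yield just the recognized string. This assumes the Google.Cloud.Vision.V1 NuGet package, credentials already configured through GOOGLE_APPLICATION_CREDENTIALS, and the vision.jpg file from the question. With DetectText, the first annotation's Description holds the whole extracted text; with DetectDocumentText, the returned TextAnnotation exposes a Text property. The class name OcrTextOnly and the FullText helper are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using Google.Cloud.Vision.V1;

public static class OcrTextOnly
{
    // Hypothetical helper: for TEXT_DETECTION results, the first annotation
    // describes the entire extracted string; later ones are individual words.
    public static string FullText(IReadOnlyList<EntityAnnotation> annotations) =>
        annotations.Count > 0 ? annotations[0].Description : string.Empty;

    public static void Main()
    {
        // Assumption: credentials are already set up and vision.jpg exists.
        var client = ImageAnnotatorClient.Create();
        var image = Image.FromFile("vision.jpg");

        // TEXT_DETECTION: print only the full string.
        Console.WriteLine(FullText(client.DetectText(image)));

        // DOCUMENT_TEXT_DETECTION: the full text sits alongside Pages.
        // The null-conditional guards against an image with no detected text.
        TextAnnotation document = client.DetectDocumentText(image);
        Console.WriteLine(document?.Text);
    }
}
```

If the page, block, and paragraph structure is still needed, the second form keeps it available through document.Pages while giving the plain text directly.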

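【Comments】:

  • Thanks for the reply. Unfortunately, I need to use DetectDocumentText in my application.
  • What exact output do you want?
  • I have created a new question with my full requirements, if you want to take a look: stackoverflow.com/questions/58736076/…
【Solution 2】:

After some research, the following gave me the word, with much cleaner output: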

Block Text at (183, 105) - (674, 105) - (674, 253) - (183, 253)
  Paragraph at (183, 105) - (674, 105) - (674, 253) - (183, 253)
    Word: CRAIG

Method:

// Requires: using System.Linq; (for Select)
foreach (var page in response.Pages)
{
    foreach (var block in page.Blocks)
    {
        string box = string.Join(" - ", block.BoundingBox.Vertices.Select(v => $"({v.X}, {v.Y})"));
        Console.WriteLine($"Block {block.BlockType} at {box}");
        foreach (var paragraph in block.Paragraphs)
        {
            box = string.Join(" - ", paragraph.BoundingBox.Vertices.Select(v => $"({v.X}, {v.Y})"));
            Console.WriteLine($"  Paragraph at {box}");
            foreach (var word in paragraph.Words)
            {
                // Join the text of each symbol instead of printing the whole Word message
                Console.WriteLine($"    Word: {string.Join("", word.Symbols.Select(s => s.Text))}");
            }
        }
    }
}
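
To see how the JSON in the question maps onto this approach, here is a stand-alone sketch that does not call the Vision API at all. It parses a trimmed, hand-written copy of that JSON with System.Text.Json, concatenates each symbol's "text", and turns a "detectedBreak" into a space or newline. The class name WordJsonDemo and the ExtractText helper are hypothetical, and the break handling is a simplification (a HYPHEN break, for instance, is ignored).

```csharp
using System;
using System.Text;
using System.Text.Json;

public static class WordJsonDemo
{
    // Concatenates the "text" of every entry in "symbols". When a symbol
    // carries property.detectedBreak.type, a space or newline is appended.
    public static string ExtractText(string wordJson)
    {
        using JsonDocument doc = JsonDocument.Parse(wordJson);
        var sb = new StringBuilder();

        foreach (JsonElement symbol in doc.RootElement.GetProperty("symbols").EnumerateArray())
        {
            sb.Append(symbol.GetProperty("text").GetString());

            if (symbol.TryGetProperty("property", out JsonElement prop) &&
                prop.TryGetProperty("detectedBreak", out JsonElement brk) &&
                brk.TryGetProperty("type", out JsonElement type))
            {
                switch (type.GetString())
                {
                    case "SPACE":
                    case "SURE_SPACE":
                        sb.Append(' ');
                        break;
                    case "EOL_SURE_SPACE":
                    case "LINE_BREAK":
                        sb.Append('\n');
                        break;
                }
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Trimmed copy of the question's JSON: bounding boxes, languages,
        // and confidences are omitted because they are not needed here.
        const string json = @"{
  ""symbols"": [
    { ""text"": ""C"" },
    { ""text"": ""R"" },
    { ""text"": ""A"" },
    { ""text"": ""I"" },
    { ""property"": { ""detectedBreak"": { ""type"": ""LINE_BREAK"" } }, ""text"": ""G"" }
  ]
}";
        Console.WriteLine(ExtractText(json).TrimEnd()); // prints "CRAIG"
    }
}
```

With the real client the same idea applies to word.Symbols directly, as in the loop above; parsing JSON is only used here so the sketch runs without credentials.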

【Comments】:
