【发布时间】:2021-12-29 08:42:26
【问题描述】:
我正在开发一个应用程序,该应用程序使用 OpenCV 读取图像文件并使用 Tesseract 处理其中的文字。 使用以下代码,Tesseract 检测到不包含文本的额外矩形。
void Application::Application::OpenAndProcessImageFile(void)
{
OPENFILENAMEA ofn;
ZeroMemory(&ofn, sizeof(OPENFILENAMEA));
char szFile[260] = { 0 };
// Initialize remaining fields of OPENFILENAMEA structure
ofn.lStructSize = sizeof(ofn);
ofn.hwndOwner = mWindow->getHandle();
ofn.lpstrFile = szFile;
ofn.nMaxFile = sizeof(szFile);
ofn.lpstrFilter = "JPG\0*.JPG\0PNG\0*.PNG\0";
ofn.nFilterIndex = 1;
ofn.lpstrFileTitle = NULL;
ofn.nMaxFileTitle = 0;
ofn.lpstrInitialDir = NULL;
ofn.Flags = OFN_PATHMUSTEXIST | OFN_FILEMUSTEXIST;
//open the picture dialog and select the image
if (GetOpenFileNameA(&ofn) == TRUE) {
std::string filePath = ofn.lpstrFile;
//load image
mImage = cv::imread(filePath.c_str());
//process image
tesseract::TessBaseAPI ocr = tesseract::TessBaseAPI();
ocr.Init(NULL, "eng");
ocr.SetImage(mImage.data, mImage.cols, mImage.rows, 3, mImage.step);
Boxa* bounds = ocr.GetWords(NULL);
for (int i = 0; i < bounds->n; ++i) {
Box* b = bounds->box[i];
cv::rectangle(mImage, { b->x,b->y,b->w,b->h }, { 0, 255, 0 }, 2);
}
ocr.End();
//show image
cv::destroyAllWindows();
cv::imshow("İşlenmiş Resim", mImage);
}
}
这是输出图像
正如您所见,Tesseract 处理根本不包含单词的区域。 我该如何解决这个问题?
【问题讨论】: