OpenCV Detecting Multiple, Rotated, Scaled Objects: Answers

【Question title】: OpenCV Detecting Multiple, Rotated, Scaled objects
【Posted】: 2018-01-19 18:55:05
【Question description】:

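I have only been using OpenCV for about 12 hours and have not solved this yet. The end goal is to take an image and store each character as an entry in one of 6 separate vector2 arrays (5 characters + bubbles).

I also need to know whether a character has been "magnified".

Resource link: https://imgur.com/a/lT5HA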

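As you can see, there is a lot going on at any given moment, which makes this task somewhat difficult. I know it is possible, though: Robotmon identifies every character with almost 100% accuracy. Its only flaw is that "magnified" characters are recognized 3 times (the distance used to discard duplicate characters does not work for large characters, because they are big enough to register multiple times).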
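
One common way to deal with the duplicate hits described above is non-maximum suppression with a radius that grows with the size of the matched template, so a magnified character absorbs its own extra hits. The following is only a minimal sketch and not Robotmon's code: the `Detection` struct, the `suppressDuplicates` function, and the default `radiusFactor` of 0.5 are hypothetical names and values chosen for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical detection record produced by whatever matcher is in use.
struct Detection {
    float x, y;   // centre of the match in the screenshot
    float size;   // side length of the matched template (larger for magnified characters)
    float score;  // match quality, higher is better
};

// Greedy non-maximum suppression. Detections are visited from best to worst
// score; one is dropped when its centre lies within
// radiusFactor * max(size_a, size_b) of a detection that was already kept.
std::vector<Detection> suppressDuplicates(std::vector<Detection> dets,
                                          float radiusFactor = 0.5f) {
    std::sort(dets.begin(), dets.end(),
              [](const Detection& a, const Detection& b) { return a.score > b.score; });
    std::vector<Detection> kept;
    for (const Detection& d : dets) {
        bool duplicate = false;
        for (const Detection& k : kept) {
            const float radius = radiusFactor * std::max(d.size, k.size);
            if (std::hypot(d.x - k.x, d.y - k.y) < radius) {
                duplicate = true;
                break;
            }
        }
        if (!duplicate) kept.push_back(d);
    }
    return kept;
}
```

Because the radius depends on the template size, two small characters sitting next to each other stay separate, while three overlapping hits on one large character collapse into the best-scoring one.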

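All characters are marked with a single color, and characters from the same color group never appear in the same match.

I am sure I have made plenty of mistakes. I have not found much useful information about OpenCV for this use case; a fair amount has been trial and error plus looking inside the library files.

For example, I am fairly sure that if I added every character that appears in the screenshot, searched for all of them, and then compared the "scores", I could rule out some of the false identifications (since each character would be claimed by its correct template).

To restate my question: how can I identify every character in an image without false positives (including small characters being "sent" to the score, or transparent characters fading out), identify all characters accurately, and identify magnified characters individually? (Perhaps also using OpenCL?)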

#include "stdafx.h"
#include <opencv2/opencv.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/core.hpp>
#include <opencv2/features2d.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/xfeatures2d.hpp>
#include <iostream>
#include <stdio.h>
#include <string>


using namespace cv;
using namespace std;
using namespace cv::xfeatures2d;

int MatchFunction();

int main()
{
    MatchFunction();
    waitKey(0);
    return 0;
}

int MatchFunction()
{

    Mat Image_Scene = imread("Bubbles.jpg");
    Mat image_Object = imread("block_peterpan_s.png");

    // Check for invalid input
    if (Image_Scene.empty() || image_Object.empty())
    {
        cout << "Could not open or find the image" << endl;
        return -1;  // signal failure to the caller
    }

    // Initiate ORB detector
    Ptr<ORB> detector = ORB::create();

    //detector->setMaxFeatures(50);
    detector->setScaleFactor(1.1);
    detector->setNLevels(12);
    //detector->setPatchSize(16);


    std::vector<KeyPoint> keypoints_object, keypoints_scene;
    Mat descriptors_object, descriptors_scene;

    // find the keypoints and descriptors with ORB
    detector->detect(image_Object, keypoints_object);
    detector->detect(Image_Scene, keypoints_scene);


    detector->compute(image_Object, keypoints_object, descriptors_object);
    detector->compute(Image_Scene, keypoints_scene, descriptors_scene);


    //-- Step 3: Matching descriptor vectors with a brute force matcher
    //BFMatcher matcher(NORM_HAMMING, true);   //BFMatcher matcher(NORM_L2);
    //Ptr<BFMatcher> matcher = BFMatcher::create(); //Ptr<ORB> detector = ORB::create();
    Ptr<BFMatcher> matcher = BFMatcher::create(NORM_HAMMING, true);


    vector<DMatch> matches;
    matcher->match(descriptors_object, descriptors_scene, matches);
    //matcher.match(descriptors_object, descriptors_scene, matches);


    vector<DMatch> good_matches;
    //vector<Point2f> featurePoints1;
    //vector<Point2f> featurePoints2;

    //Sort the matches by adding them 1 by 1 to good_matches
    //for (int i = 0; i<int(matches.size()); i++) { //Size is basically length
    //  good_matches.push_back(matches[i]);
    //}


    string k = to_string((matches.size()));
    cout << k << endl;
    //cout << " Usage: ./SURF_FlannMatcher <img1> <img2>" << std::endl;

    // ORB descriptors are 32 bytes, so a Hamming distance never exceeds 256.
    double max_dist = 0; double min_dist = 256;
    //-- Quick calculation of max and min distances between keypoints
    for (int i = 0; i < int(matches.size()); i++)
    {
        //cout << to_string(i) << endl;
        double dist = matches[i].distance;

        if (dist < min_dist) min_dist = dist;
        if (dist > max_dist) max_dist = dist;
    }
    //printf("-- Max dist : %f \n", max_dist);
    //printf("-- Min dist : %f \n", min_dist);

    //-- Keep only "good" matches (i.e. whose distance is at most 4*min_dist,
    //-- or a small arbitrary value ( 0.02 ) in the event that min_dist is very
    //-- small)
    //-- PS.- radiusMatch can also be used here.
    //std::vector< DMatch > good_matches;


    for (int i = 0; i < int(matches.size()); i++)
    {
        if (matches[i].distance <= max(4 * min_dist, 0.02))
        {
            good_matches.push_back(matches[i]);
        }
    }


    Mat img_matches;
    drawMatches(image_Object, keypoints_object, Image_Scene, keypoints_scene,
        good_matches, img_matches, Scalar::all(-1), Scalar::all(-1),
        vector<char>(), DrawMatchesFlags::NOT_DRAW_SINGLE_POINTS);
    imshow("Good Matches", img_matches);
    return 0;  // the function is declared to return int
}

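【Comments on the question】:

Simple image processing such as SIFT matching alone is not enough to detect multiple rotated and scaled objects. It needs more knowledge, such as ML. This is a big project, not toy code.
I can confirm for certain that Robotmon does not use any form of machine learning. In fact, they do it with template matching, using odd 16x16 blurred square images of the characters rather than the source images. The images are rotated in 45-degree steps, and all of them are loaded and checked against a downscaled, blurred screenshot. That is all I know. I am looking to learn, and to get this running outside an emulator (and on the GPU). github.com/r2-studio/robotmon-scripts/tree/master/scripts/…
Template matching may work, but it is inefficient. N1 classes, N2 rotations, N3 scales, and an MxN sliding window => O(N1xN2xN3xMxN). It would of course be more efficient on edge images.
Oh, of course. That is why I chose ORB: I assumed it would be faster and more accurate (or the same). I have a rough idea of what needs to be done, but I am struggling with the lower-level details at the moment. I want to do this because OpenCL and OpenCV (and machine learning) interest me and I have always liked this kind of thing, but it is not going as smoothly as I hoped. Unfortunately I also have to be AFK for 2 days. I might write an OpenGL hook, but that would only work with the emulator, means more work, and somewhat defeats my larger purpose, though it would help keep up momentum. Thanks for the tip (edges).
Take a look at dhanushkadangampola.blogspot.com.tr/2015/01/…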
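
To put a number on the O(N1xN2xN3xMxN) estimate from the comments above, here is a small worked calculation. The inputs are assumptions pieced together from figures mentioned elsewhere in this thread (5 character classes, 15° rotation steps, about 20 scales, a 960x540 screenshot); the function names are made up for this sketch. Note that the formula counts window positions only; each position additionally costs work proportional to the template area.

```cpp
#include <cstdint>

// Number of sliding-window evaluations for exhaustive template matching:
// classes x rotations x scales x (scene width x scene height).
constexpr std::uint64_t bruteForceEvaluations(std::uint64_t classes,
                                              std::uint64_t rotations,
                                              std::uint64_t scales,
                                              std::uint64_t width,
                                              std::uint64_t height) {
    return classes * rotations * scales * width * height;
}

// Number of rotation steps needed to cover a full turn at a fixed interval.
constexpr std::uint64_t rotationSteps(unsigned stepDegrees) {
    return 360u / stepDegrees;
}
```

With those assumed inputs, `rotationSteps(15)` is 24 and `bruteForceEvaluations(5, 24, 20, 960, 540)` is 1,244,160,000, that is, roughly 1.2 billion window positions per screenshot before any pruning.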

Tags: c++ opencv computer-vision

【Solution 1】:

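Template matching would be the simplest approach, but you would have to brute-force the same template at different scales and rotations for each object. If you know how many such scales and rotations can occur in the game, that would narrow down the number of iterations considerably.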
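
As a rough illustration of what a single template-matching pass does, here is a dependency-free sketch of zero-mean normalised cross-correlation on a grayscale image. The `Gray` and `Match` structs and the `ncc` and `bestMatch` functions are hypothetical names for this example. In OpenCV, `cv::matchTemplate` with `TM_CCOEFF_NORMED` computes a comparable score far more efficiently. For the brute-force approach described above, the same search would be repeated once per pre-rotated, pre-scaled copy of each template.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal grayscale image stored row by row.
struct Gray {
    int w, h;
    std::vector<float> px;
    float at(int x, int y) const { return px[static_cast<std::size_t>(y) * w + x]; }
};

struct Match { int x, y; float score; };

// Zero-mean normalised cross-correlation between `tpl` and the window of `img`
// whose top-left corner is (ox, oy). Result lies in [-1, 1]; flat windows give 0.
float ncc(const Gray& img, const Gray& tpl, int ox, int oy) {
    const float n = static_cast<float>(tpl.w * tpl.h);
    float sumI = 0.0f, sumT = 0.0f;
    for (int y = 0; y < tpl.h; ++y) {
        for (int x = 0; x < tpl.w; ++x) {
            sumI += img.at(ox + x, oy + y);
            sumT += tpl.at(x, y);
        }
    }
    const float meanI = sumI / n, meanT = sumT / n;
    float num = 0.0f, varI = 0.0f, varT = 0.0f;
    for (int y = 0; y < tpl.h; ++y) {
        for (int x = 0; x < tpl.w; ++x) {
            const float di = img.at(ox + x, oy + y) - meanI;
            const float dt = tpl.at(x, y) - meanT;
            num += di * dt;
            varI += di * di;
            varT += dt * dt;
        }
    }
    const float denom = std::sqrt(varI * varT);
    return denom > 0.0f ? num / denom : 0.0f;
}

// Slide the template over every valid position and return the best-scoring one.
Match bestMatch(const Gray& img, const Gray& tpl) {
    Match best{0, 0, -2.0f};
    for (int oy = 0; oy + tpl.h <= img.h; ++oy) {
        for (int ox = 0; ox + tpl.w <= img.w; ++ox) {
            const float s = ncc(img, tpl, ox, oy);
            if (s > best.score) best = Match{ox, oy, s};
        }
    }
    return best;
}
```

Subtracting the means and dividing by the standard deviations makes the score insensitive to uniform brightness and contrast changes, which is why a fixed acceptance threshold on the score tends to be easier to tune than a raw sum of differences.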

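The images appear to have no noise, distortion, or occlusion, so a machine learning approach is not really necessary here. If you want something more efficient and are comfortable with scientific writing and implementing algorithms, take a look at this study, which uses radial and circular filters to narrow down the number of combinations: http://pdfs.semanticscholar.org/c965/0f78bf9d18eba3850281841dc7ddd20e8d5e.pdf The algorithm, or more specifically its filters, can be parallelized with OpenCL or any other library if needed. On a modern machine that should not be necessary, since a serial implementation already runs very fast.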
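
To give a feel for why sampling on circles helps, here is a minimal sketch of the general idea only; it is not the algorithm from the linked paper. The mean gray value over a ring around a point stays (approximately) the same when the patch is rotated about that point, so a short vector of ring means can reject most candidate positions before any rotation is tested. The `GrayImage` struct and the `ringMeans` and `rotate90` functions are hypothetical names for this illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal grayscale image stored row by row.
struct GrayImage {
    int w, h;
    std::vector<float> px;
    float at(int x, int y) const { return px[static_cast<std::size_t>(y) * w + x]; }
};

// For each radius r in 0..maxR, average the pixels whose rounded distance
// from (cx, cy) equals r. Pixels outside the image are skipped.
std::vector<float> ringMeans(const GrayImage& img, int cx, int cy, int maxR) {
    std::vector<float> sum(maxR + 1, 0.0f);
    std::vector<int> count(maxR + 1, 0);
    for (int y = cy - maxR; y <= cy + maxR; ++y) {
        for (int x = cx - maxR; x <= cx + maxR; ++x) {
            if (x < 0 || y < 0 || x >= img.w || y >= img.h) continue;
            const double dist = std::hypot(static_cast<double>(x - cx),
                                           static_cast<double>(y - cy));
            const int r = static_cast<int>(std::lround(dist));
            if (r > maxR) continue;
            sum[r] += img.at(x, y);
            ++count[r];
        }
    }
    for (int r = 0; r <= maxR; ++r) {
        if (count[r] > 0) sum[r] /= count[r];
    }
    return sum;
}

// Rotate an image by 90 degrees clockwise; used here only to check invariance.
GrayImage rotate90(const GrayImage& img) {
    GrayImage out{img.h, img.w, std::vector<float>(img.px.size())};
    for (int y = 0; y < img.h; ++y) {
        for (int x = 0; x < img.w; ++x) {
            out.px[static_cast<std::size_t>(x) * out.w + (img.h - 1 - y)] = img.at(x, y);
        }
    }
    return out;
}
```

On a pixel grid the ring means match exactly only for rotations by multiples of 90°; for other angles resampling introduces small differences, so a tolerance would be needed when comparing a candidate's ring means with the template's.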

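I implemented it successfully some time ago; it works well and is fast enough to solve your problem with near real-time performance. RGB will not be necessary. As far as I know it is not available anywhere as open source, but you could also try searching for scale- and rotation-invariant template matching and see what turns up.

【Comments】: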

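Would this be better than ORB + brute-force matching? I mean, it sounds good on paper; I just want to be sure. The scale is not unbounded, since it is limited by the device display. On BlueStacks it is even better, because 960x540 is the lowest reasonable resolution the game supports. I estimate there are at most 20 possible (and known) scales. Among these there are perhaps 6 aspect ratios (downscaling the source based on Current Res / 1.0 Res). Even with rotation at 15° intervals, that is only 120 + 5 in total (bubbles do not rotate), so on a GPU this should be almost instantaneous, apart from overhead (transfer, synchronization).
In that case, even a brute-force solution with template matching would be fairly fast. I cannot really tell you whether it would be faster than ORB + BF Matcher, but it would certainly be more precise. If that does not satisfy you, try implementing the paper I mentioned. It is really fast.
Will do. Unfortunately, I will not have access to my computer for about 32 hours. I noticed that BlueStacks does not run at 540x960 (and even if it did, it would not output screenshots at that size), and the built-in screenshot also outputs low-quality images. I saw a report on Reddit about the game where someone was banned, though probably for hacking rather than automation. This is good timing for me, since I have to decide whether this will be pure OpenCV detection or a complete program, which affects how I acquire images (OpenGL/DX hooks, ADB, Windows print screen) and the performance requirements. Thanks a lot!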