【Question Title】: Optical flow ignore sparse motions
【Posted】: 2016-04-17 11:12:04
【Question】:

We are working on an image-analysis project in which we need to identify objects that appear in or disappear from a scene. Here are two images, one captured before the surgeon acts and one captured after.

Before: After:

First, we simply computed the difference between the two images. Here is the result (note that I added 128 to the difference Mat just to get a more readable picture):

(After - Before) + 128

The goal is to detect that the cup (red arrow) has disappeared from the scene and that the syringe (black arrow) has entered it; in other words, we should detect only the regions corresponding to objects that left or entered the scene. Also, the objects in the top-left of the scene have clearly shifted slightly from their initial positions. I thought of optical flow, so I used OpenCV C++ to compute Farneback flow and see whether it is sufficient for our case. Here is the result we obtained, followed by the code we wrote:

Flow:

void drawOptFlowMap(const Mat& flow, Mat& cflowmap, int step, double /*scale, unused*/, const Scalar& color)
{
    cout << flow.channels() << " / " << flow.rows << " / " << flow.cols << endl;
    for(int y = 0; y < cflowmap.rows; y += step)
        for(int x = 0; x < cflowmap.cols; x += step)
        {
            const Point2f& fxy = flow.at<Point2f>(y, x);
            line(cflowmap, Point(x,y), Point(cvRound(x+fxy.x), cvRound(y+fxy.y)), color);
            circle(cflowmap, Point(x,y), 1, color, -1);
        }
}

void MainProcessorTrackingObjects::diffBetweenImagesToTestTrackObject(string pathOfImageCaptured, string pathOfImagesAfterOneAction, string pathOfResultsFolder)
{
    //Preprocessing step...

    string pathOfImageBefore = StringUtils::concat(pathOfImageCaptured, imageCapturedFileName);
    string pathOfImageAfter = StringUtils::concat(pathOfImagesAfterOneAction, *it);

    Mat imageBefore = imread(pathOfImageBefore);
    Mat imageAfter = imread(pathOfImageAfter);

    Mat imageResult = (imageAfter - imageBefore) + 128;
    //            absdiff(imageAfter, imageBefore, imageResult);
    string imageResultPath = StringUtils::stringFormat("%s%s-color.png",pathOfResultsFolder.c_str(), fileNameWithoutFrameIndex.c_str());
    imwrite(imageResultPath, imageResult);

    Mat imageBeforeGray, imageAfterGray;
    cvtColor( imageBefore, imageBeforeGray, CV_BGR2GRAY );  // imread loads BGR, not RGB
    cvtColor( imageAfter, imageAfterGray, CV_BGR2GRAY );

    Mat imageResultGray = (imageAfterGray - imageBeforeGray) + 128;
    //            absdiff(imageAfterGray, imageBeforeGray, imageResultGray);
    string imageResultGrayPath = StringUtils::stringFormat("%s%s-gray.png",pathOfResultsFolder.c_str(), fileNameWithoutFrameIndex.c_str());
    imwrite(imageResultGrayPath, imageResultGray);


    //*** Compute FarneBack optical flow
    Mat opticalFlow;
    calcOpticalFlowFarneback(imageBeforeGray, imageAfterGray, opticalFlow, 0.5, 3, 15, 3, 5, 1.2, 0);

    drawOptFlowMap(opticalFlow, imageBefore, 5, 1.5, Scalar(0, 255, 255));
    string flowPath = StringUtils::stringFormat("%s%s-flow.png",pathOfResultsFolder.c_str(), fileNameWithoutFrameIndex.c_str());
    imwrite(flowPath, imageBefore);

    // (the enclosing loop over the captured images is elided here)
}

To see how accurate this optical flow is, I wrote a small piece of code that computes (IMAGEAFTER + FLOW) - IMAGEBEFORE:

//Reference method just to see the accuracy of the optical flow calculation
Mat accuracy = Mat::zeros(imageBeforeGray.rows, imageBeforeGray.cols, imageBeforeGray.type());

for(int y = 0; y < imageAfter.rows; y ++)
for(int x = 0; x < imageAfter.cols; x ++)
{
     const Point2f& fxy = opticalFlow.at<Point2f>(y, x);
     // clamp the displaced coordinates so the at<> lookups stay inside the image
     int yAfter = min(max(cvRound(y + fxy.y), 0), imageAfterGray.rows - 1);
     int xAfter = min(max(cvRound(x + fxy.x), 0), imageAfterGray.cols - 1);
     uchar intensityPointCalculated = imageAfterGray.at<uchar>(yAfter, xAfter);
     uchar intensityPointBefore = imageBeforeGray.at<uchar>(y, x);
     uchar intensityResult = ((intensityPointCalculated - intensityPointBefore) / 2) + 128;
     accuracy.at<uchar>(y, x) = intensityResult;
}
string validationPixelBased = StringUtils::stringFormat("%s%s-validationPixelBased.png", pathOfResultsFolder.c_str(), fileNameWithoutFrameIndex.c_str());
imwrite(validationPixelBased, accuracy);

The only purpose of ((intensityPointCalculated - intensityPointBefore) / 2) + 128 is to produce an image that is easy to interpret.
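For reference, this centring trick can be sketched in plain C++ (the helper name is mine, not from the original code): it maps the signed difference of two 8-bit intensities into a displayable value around mid-grey.

```cpp
#include <cstdint>

// Map a signed difference of two 8-bit intensities into a displayable
// 8-bit value centred at 128: identical pixels become mid-grey (128),
// and the full range [-255, 255] compresses into [1, 255] after the
// divide-by-two, so nothing clips.
inline std::uint8_t visualizeDiff(std::uint8_t after, std::uint8_t before)
{
    int diff = static_cast<int>(after) - static_cast<int>(before); // [-255, 255]
    return static_cast<std::uint8_t>(diff / 2 + 128);
}
```

With this mapping, brightening shows up above 128 and darkening below it, which is what makes the difference image readable at a glance.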

Resulting image:

Since it detects all regions that have moved / entered / left the scene, we believe optical flow is not sufficient to detect only the regions where an object disappeared from or appeared in the scene. Is there any way to ignore the sparse motions detected by optical flow? Or is there any other way to detect what we need?

【Comments】:

    Tags: c++ opencv motion-detection opticalflow


    【Solution 1】:

    I assume the goal here is to identify regions where objects appeared or disappeared, not regions that are present in both images but merely moved.

    Optical flow should be a good choice, as you have already tried. However, the issue is how to evaluate the result. As opposed to a pixel-to-pixel difference, which tolerates no rotation or scaling variation, you can do feature matching (SIFT, etc.; check out here for what you can use with OpenCV).

    Here are the Good Features To Track I obtained from your "before" image.

    GoodFeaturesToTrackDetector detector;
    vector<KeyPoint> keyPoints;
    vector<Point2f> kpBefore, kpAfter;
    detector.detect(imageBefore, keyPoints);
    // convert the KeyPoints to Point2f so they can be fed to calcOpticalFlowPyrLK
    KeyPoint::convert(keyPoints, kpBefore);
    

    Instead of dense optical flow you can use sparse flow and track only the features:

    vector<uchar> featuresFound;
    vector<float> err;
    calcOpticalFlowPyrLK(imageBeforeGray, imageAfterGray, kpBefore, kpAfter, featuresFound, err, Size(PATCH_SIZE, PATCH_SIZE));
    

    The output includes the featuresFound and error values. Here I simply used a threshold to distinguish features that moved from unmatched features that disappeared.

    vector<KeyPoint> kpNotMatched;
    for (int i = 0; i < kpBefore.size(); i++) {
        if (!featuresFound[i] || err[i] > ERROR_THRESHOLD) {
            kpNotMatched.push_back(KeyPoint(kpBefore[i], 1));
        }
    }
    Mat output;
    drawKeypoints(imageBefore, kpNotMatched, output, Scalar(0, 0, 255));  
    

    The remaining incorrectly matched features can be filtered out. Here I used simple mean filtering plus thresholding to obtain a mask of the newly appeared regions.

    Mat mask = Mat::zeros(imageBefore.rows, imageBefore.cols, CV_8UC1);
    for (int i = 0; i < kpNotMatched.size(); i++) {
        mask.at<uchar>(kpNotMatched[i].pt) = 255;
    }
    blur(mask, mask, Size(BLUR_SIZE, BLUR_SIZE));
    threshold(mask, mask, MASK_THRESHOLD, 255, THRESH_BINARY);
    

    Then find its convex hull to show the regions in the original image (in yellow).

    vector<vector<Point> > contours;
    vector<Vec4i> hierarchy;
    findContours( mask, contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE, Point(0, 0) );
    
    vector<vector<Point> >hull( contours.size() );
    for( int i = 0; i < contours.size(); i++ ) {
        convexHull(Mat(contours[i]), hull[i], false);
    }
    for( int i = 0; i < contours.size(); i++ ) {
        drawContours( output, hull, i, Scalar(0, 255, 255), 3, 8, vector<Vec4i>(), 0, Point() );
    }
    

    Then simply do the same in the opposite direction (matching from imageAfter to imageBefore) to get the regions that appeared. :)

    【Comments】:

    • I cannot reproduce the same results. Please tell me the values you used for these constants: BLUR_SIZE, ERROR_THRESHOLD, MASK_THRESHOLD
    • Based on a 960 x 540 input image, I used BLUR_SIZE=35, ERROR_THRESHOLD=30, MASK_THRESHOLD=1.5. You may also want to adjust other parameters, such as the sparse optical flow pyramid level, patch size, etc. However, simple constant thresholds may not work well in all cases; you may want to apply more sophisticated strategies depending on your use case.
    • Thanks for your support. Your answer does not cover most of my cases, but I will give you the bounty because it is close enough.
    • Feel free to let me know what worked and what didn't. Optical flow and feature-descriptor matching can be noisy, but they can be tuned in various ways for your case, with significantly improved results. Good luck.
    【Solution 2】:

    Here is what I tried:

    • Detect the regions that have changed. For this I used simple frame differencing, thresholding, morphological operations, and convex hulls.
    • Find feature points of those regions in both images and see whether they match. A good match in a region indicates that it has not undergone a significant change. A bad match means the two regions are now different. For this I used Bag-of-Words and the Bhattacharyya distance.
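For reference, the Bhattacharyya comparison used in the second step can be sketched in plain C++, following the formula OpenCV documents for compareHist with CV_COMP_BHATTACHARYYA (the function name here is mine, not from the answer):

```cpp
#include <cmath>
#include <vector>

// Bhattacharyya distance between two histograms, as OpenCV defines it for
// compareHist(..., CV_COMP_BHATTACHARYYA):
//   d = sqrt(1 - sum_i sqrt(h1[i]*h2[i]) / sqrt(sum(h1) * sum(h2)))
// 0 means identical distributions; values near 1 mean almost no overlap.
double bhattacharyyaDistance(const std::vector<double>& h1,
                             const std::vector<double>& h2)
{
    double s1 = 0.0, s2 = 0.0, s12 = 0.0;
    for (std::size_t i = 0; i < h1.size(); ++i) {
        s1  += h1[i];
        s2  += h2[i];
        s12 += std::sqrt(h1[i] * h2[i]);
    }
    double denom = std::sqrt(s1 * s2);
    if (denom == 0.0) return 1.0;   // an empty histogram matches nothing
    double coeff = s12 / denom;     // Bhattacharyya coefficient in [0, 1]
    if (coeff > 1.0) coeff = 1.0;   // guard against rounding noise
    return std::sqrt(1.0 - coeff);
}
```

This is why the code below treats a low value (comp < .2) as a good match between the two regions' visual-word distributions.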

    The parameters may need tuning. I used values that work only for the two sample images. As feature detector/descriptor I used SIFT (non-free). You can try other detectors and descriptors.

    Difference image:

    Regions:

    Changes (red: insertion/removal, yellow: sparse motion):

    // for non-free modules SIFT/SURF
    cv::initModule_nonfree();
    
    Mat im1 = imread("1.png");
    Mat im2 = imread("2.png");
    
    // downsample
    /*pyrDown(im1, im1);
    pyrDown(im2, im2);*/
    
    Mat disp = im1.clone() * .5 + im2.clone() * .5;
    Mat regions = Mat::zeros(im1.rows, im1.cols, CV_8U);
    
    // gray scale
    Mat gr1, gr2;
    cvtColor(im1, gr1, CV_BGR2GRAY);
    cvtColor(im2, gr2, CV_BGR2GRAY);
    // simple frame differencing
    Mat diff;
    absdiff(gr1, gr2, diff);
    // threshold the difference to obtain the regions having a change
    Mat bw;
    adaptiveThreshold(diff, bw, 255, CV_ADAPTIVE_THRESH_GAUSSIAN_C, CV_THRESH_BINARY_INV, 15, 5);
    // some post processing
    Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));
    morphologyEx(bw, bw, MORPH_CLOSE, kernel, Point(-1, -1), 4);
    // find contours in the change image
    Mat cont = bw.clone();
    vector<vector<Point> > contours;
    vector<Vec4i> hierarchy;
    findContours(cont, contours, hierarchy, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_NONE, Point(0, 0));
    // feature detector, descriptor and matcher
    Ptr<FeatureDetector> featureDetector = FeatureDetector::create("SIFT");
    Ptr<DescriptorExtractor> descExtractor = DescriptorExtractor::create("SIFT");
    Ptr<DescriptorMatcher> descMatcher = DescriptorMatcher::create("FlannBased");
    
    if( featureDetector.empty() || descExtractor.empty() || descMatcher.empty() )
    {
        cout << "featureDetector or descExtractor or descMatcher was not created" << endl;
        exit(0);
    }
    // BOW
    Ptr<BOWImgDescriptorExtractor> bowExtractor = new BOWImgDescriptorExtractor(descExtractor, descMatcher);
    
    int vocabSize = 10;
    TermCriteria terminate_criterion;
    terminate_criterion.epsilon = FLT_EPSILON;
    BOWKMeansTrainer bowTrainer( vocabSize, terminate_criterion, 3, KMEANS_PP_CENTERS );
    
    Mat mask(bw.rows, bw.cols, CV_8U);
    for(size_t j = 0; j < contours.size(); j++)
    {
        // discard regions that are below a specific size threshold
        Rect rect = boundingRect(contours[j]);
        if ((double)(rect.width * rect.height) / (bw.rows * bw.cols) < .01)
        {
            continue; // skip this region as it's too small
        }
        // prepare a mask for each region
        mask.setTo(0);
        vector<Point> hull;
        convexHull(contours[j], hull);
        fillConvexPoly(mask, hull, Scalar::all(255), 8, 0);
    
        fillConvexPoly(regions, hull, Scalar::all(255), 8, 0);
    
        // extract keypoints from the region
        vector<KeyPoint> im1Keypoints, im2Keypoints;
        featureDetector->detect(im1, im1Keypoints, mask);
        featureDetector->detect(im2, im2Keypoints, mask);
        // get their descriptors
        Mat im1Descriptors, im2Descriptors;
        descExtractor->compute(im1, im1Keypoints, im1Descriptors);
        descExtractor->compute(im2, im2Keypoints, im2Descriptors);
    
        if ((0 == im1Keypoints.size()) || (0 == im2Keypoints.size()))
        {
            // mark this contour as object arrival/removal region
            drawContours(disp, contours, j, Scalar(0, 0, 255), 2);
            continue;
        }
    
        // bag-of-visual-words
        Mat vocabulary = bowTrainer.cluster(im1Descriptors);
        bowExtractor->setVocabulary( vocabulary );
        // get the distribution of visual words in the region for both images
        vector<vector<int> > idx1, idx2;
        bowExtractor->compute(im1, im1Keypoints, im1Descriptors, &idx1);
        bowExtractor->compute(im2, im2Keypoints, im2Descriptors, &idx2);
        // compare the distributions
        Mat hist1 = Mat::zeros(vocabSize, 1, CV_32F);
        Mat hist2 = Mat::zeros(vocabSize, 1, CV_32F);
    
        for (int i = 0; i < vocabSize; i++)
        {
            hist1.at<float>(i) = (float)idx1[i].size();
            hist2.at<float>(i) = (float)idx2[i].size();
        }
        normalize(hist1, hist1);
        normalize(hist2, hist2);
        double comp = compareHist(hist1, hist2, CV_COMP_BHATTACHARYYA);
    
        cout << comp << endl;
        // low BHATTACHARYYA distance means a good match of features in the two regions
        if ( comp < .2 )
        {
            // mark this contour as a region having sparse motion
            drawContours(disp, contours, j, Scalar(0, 255, 255), 2);
        }
        else
        {
            // mark this contour as object arrival/removal region
            drawContours(disp, contours, j, Scalar(0, 0, 255), 2);
        }
    }
    

    【Comments】:

    • I had to add if ((im1Keypoints.size() goo.gl/W7rCFa
    【Solution 3】:

    You could try a two-pronged approach - the image-differencing method is very good at detecting objects that enter and leave the scene, as long as the object's color differs from the background's. What strikes me is that it would be greatly improved if you could remove the objects that merely moved before applying the method.

    There is a great OpenCV method for object detection here that finds points of interest in an image to detect translation of an object. I think you could achieve what you want with the following approach -

    1 Compare the images with the OpenCV code and highlight the moving objects in both images

    2 Color in the detected objects with the background of the other image over the same set of pixels (or similar ones), to reduce the image difference caused by the moved objects

    3 Find the image difference, which should now contain the large main objects and smaller artifacts left over from the moved objects

    4 Threshold for objects of a specific size detected in the image difference

    5 Compile a list of probable candidates

    There are other alternatives for object tracking, so there may be code you like better, but I think this process should be suitable for what you are doing.
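The differencing and size-filter steps (1 and 4) of this outline can be sketched in plain C++ on raw pixel buffers; all names and thresholds here are illustrative, not from the answer:

```cpp
#include <cstdlib>
#include <vector>

// Step 1: difference two grayscale frames pixel-by-pixel and mark the
// pixels whose absolute change exceeds a threshold (1 = changed).
std::vector<int> changedPixels(const std::vector<int>& before,
                               const std::vector<int>& after,
                               int diffThreshold)
{
    std::vector<int> mask(before.size(), 0);
    for (std::size_t i = 0; i < before.size(); ++i)
        mask[i] = (std::abs(after[i] - before[i]) > diffThreshold) ? 1 : 0;
    return mask;
}

// Step 4's size test: a region is a candidate only if enough of its
// pixels changed, which filters out small artifacts left by moved objects.
bool isCandidateRegion(const std::vector<int>& mask, int minChangedPixels)
{
    int count = 0;
    for (std::size_t i = 0; i < mask.size(); ++i) count += mask[i];
    return count >= minChangedPixels;
}
```

In a real pipeline these would run per connected region of the difference image, with the in-painting of moved objects (step 2) applied before the differencing.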

    【Comments】:
