[Question Title]: OpenCV, Haar cascade classifier: scaling features or computing an image pyramid?
[Posted]: 2015-01-28 05:25:16
[Question]:

I have read the Viola and Jones paper. They state explicitly that their algorithm is faster than others because it avoids computing an image pyramid: instead, the feature rectangles are scaled.

However, after a lot of googling I found that OpenCV implements the image-pyramid approach rather than scaling the feature rectangles, and it computes an integral image for every sub-image in the pyramid. If the algorithm is used to process video rather than still images, this is done for every single frame.

What is the rationale for this choice? I don't quite understand it.

What I can understand is the exact opposite: for a video application, the features only need to be scaled once, and the scaled features can be reused for every frame. Only one integral image, over the whole frame, needs to be computed.

Am I right about this?

Viola and Jones also reported a frame rate of 15 fps on a Pentium III computer, yet I can hardly find anyone achieving that kind of performance with the OpenCV implementation on a modern machine. That's odd, isn't it?

Any input would be helpful. Thanks.

[Question Discussion]:

    Tags: opencv haar-classifier


    [Solution 1]:

    I tried to verify this by looking at their code; what follows is based on version 2.4.10. The short answer is: it does both. OpenCV scales the image according to the scale factor at which detection is performed, and it can also rescale the features to different window sizes according to that scale factor. The reasoning is as follows:
    1. Look at the old function cvHaarDetectObjectsForROC from the objdetect module (haar.cpp). The notable parameters are CvSize minSize, CvSize maxSize, const CvArr* _img, double scaleFactor and int minNeighbors.

    CvSeq*
    cvHaarDetectObjectsForROC( const CvArr* _img,
                         CvHaarClassifierCascade* cascade, CvMemStorage* storage,
                         std::vector<int>& rejectLevels, std::vector<double>& levelWeights,
                         double scaleFactor, int minNeighbors, int flags,
                         CvSize minSize, CvSize maxSize, bool outputRejectLevels )
    {
        CvMat stub, *img = (CvMat*)_img;
    .... // skip a bit ahead to this part
        if( flags & CV_HAAR_SCALE_IMAGE )
        {
            CvSize winSize0 = cascade->orig_window_size; // this would be the trained size of 24x24 pixels mentioned in the paper
            for( factor = 1; ; factor *= scaleFactor )
            {
               // detection window for current scale
               CvSize winSize = { cvRound(winSize0.width*factor), cvRound(winSize0.height*factor) };
               //resized image size
               CvSize sz = { cvRound( img->cols/factor ), cvRound( img->rows/factor ) };
               // take every possible scale factor as long as the resulting window doesn't exceed the maximum size given and is bigger than the minimum one
               if( winSize.width > maxSize.width || winSize.height > maxSize.height )
                   break;
               if( winSize.width < minSize.width || winSize.height < minSize.height )
                   continue;
               img1 = cvMat( sz.height, sz.width, CV_8UC1, imgSmall->data.ptr );
           ... // skip sum, squared sum and tilted sum, a.k.a. integral image array initialization
               cvResize( img, &img1, CV_INTER_LINEAR ); // scaling down the image here
               cvIntegral( &img1, &sum1, &sqsum1, _tilted ); // compute integral representation for the scaled down version
               ...  //skip some lines
               cvSetImagesForHaarClassifierCascade( cascade, &sum1, &sqsum1, _tilted, 1. ) //-> set the structures and also rescales the feature according to the last parameter which is the scale factor.
              // Notice it is 1.0 because the image was scaled down this time.
              <call detection function with notable arguments: cascade,... factor, cv::Mat(&sum1), cv::Mat(&sqsum1) ...>
              // the above call is a parallel for that evaluates a window at a certain position in the image with the cascade classifier
              // note the class naming HaarDetectObjects_ScaleImage_Invoker in the actual code and skipped here.
    
            } // end for
        } // if
        else
        {
            int n_factors = 0; // total number of factors
            cvIntegral( img, sum, sqsum, tilted ); // -> makes a single integral image for the given image (the original one passed in the cvHaarDetectObjects)
            // below aims to see the total number of scale factors at which detection is performed.
             for( n_factors = 0, factor = 1;
                 factor*cascade->orig_window_size.width < img->cols - 10 &&
                 factor*cascade->orig_window_size.height < img->rows - 10;
                 n_factors++, factor *= scaleFactor );
            ... // skip some lines
            for( ; n_factors-- > 0; factor *= scaleFactor )
            {
                CvSize winSize = { cvRound( cascade->orig_window_size.width * factor ), cvRound( cascade->orig_window_size.height * factor )};
                ... // skip check for minSize and maxSize here
                cvSetImagesForHaarClassifierCascade( cascade, sum, sqsum, tilted, factor ); // -> notice here the scale factor is given so that the trained Haar features can be rescaled.
                <parallel for detect call given a startX, endX and startY endY, window size and cascade> // Note the name here HaarDetectObjects_ScaleCascade_Invoker used in actual code and skipped here
            }
        } // end of if
    ... // skip rest
    } // end of cvHaarDetectObjectsForROC function
    
    2. If you take the new (C++) API class CascadeClassifier, and it loads a cascade in the new .xml format produced by the traincascade.exe application, it will scale the image according to the scale factor (for Haar features, as far as I know). At some point in the code, the class's detectMultiScale method will default to the detectSingleScale method:

       if( !detectSingleScale( scaledImage, stripCount, processingRectSize, stripSize, yStep, factor, candidates, rejectLevels, levelWeights, outputRejectLevels ) )
           break; // from cascadedetect.cpp in the detectMultiScale method.
      

    A possible reason I can think of: to get a uniform design in C++, this was the only way to achieve transparency over the different feature types through a single interface.

    I'm leaving my train of thought here in case I misunderstood or missed something, so that other users can verify this lead and correct me.

    [Discussion]:
