官方文档:https://docs.opencv.org/3.4.3/d6/d0f/group__dnn.html#ga29d0ea5e52b1d1a6c2681e3f7d68473a
1.OpenCV3.4.3 DNN模块介绍
最早在OpenCV3.3版本发布中,把DNN模块从扩展模块移到了OpenCV正式发布模块中,当前DNN模块最早来自Tiny-dnn,可以加载预先训练好的Caffe模型数据,OpenCV做了近一步扩展支持所有主流的深度学习框架训练生成与导出模型数据加载,常见的有如下:
Caffe
TensorFlow
Torch/PyTorch
以face_detector Caffe模型为例。(位于${OPENCV_DIR}\sources\samples\dnn\face_detector)
一般需要两个文件:1)模型参数 2)模型配置文件即模型框架
res10_300x300_ssd_iter_140000_fp16.caffemodel
deploy.prototxt
在caffe中,模型定义在prototxt文件中,文件中定义了每层的结构信息。
模型文件需要从以下地址下载即可:https://raw.githubusercontent.com/opencv/opencv_3rdparty/dnn_samples_face_detector_20180205_fp16/res10_300x300_ssd_iter_140000_fp16.caffemodel
以GoogleNet Caffe模型为例。
一般需要两个文件:1)模型参数 2)模型配置文件即模型框架
bvlc_googlenet.caffemodel
bvlc_googlenet.prototxt
下载googlenet caffemodel文件:http://dl.caffe.berkeleyvision.org/
bvlc_googlenet.prototxt文件下载:https://github.com/opencv/opencv_extra/blob/master/testdata/dnn/bvlc_googlenet.prototxt
https://github.com/BVLC/caffe这个内容很丰富,有空要去了解了解!
参考:https://blog.csdn.net/haoji007/article/details/81035509
https://blog.csdn.net/red_ear/article/details/85990162
OpenCV中DNN模块已经支持与测试过这些常见的网络模块:
AlexNet
GoogLeNet v1 (also referred to as Inception-5h)
ResNet-34/50/...
SqueezeNet v1.1
VGG-based FCN (semantical segmentation network)
ENet (lightweight semantical segmentation network)
VGG-based SSD (object detection network)
MobileNet-based SSD (light-weight object detection network)
OpenCV中DNN模块的位置:
几种常用的DNN中的函数:
1、readNetFromCaffe()函数:读取Caffe训练得到的模型。
String modelConfiguration = "D:\\Program Files\\OpenCV\\opencv\\sources\\samples\\dnn\\face_detector\\deploy.prototxt";
String modelBinary = "D:\\Program Files\\OpenCV\\opencv\\sources\\samples\\dnn\\face_detector\\res10_300x300_ssd_iter_140000_fp16.caffemodel";
//! [Initialize network]
dnn::Net net = readNetFromCaffe(modelConfiguration, modelBinary);//Reads a network model stored in Caffe model in memory
2、blobFromImage函数
从图像中创建4维斑点。可选择从中心调整大小和裁剪 image,减去平均值mean,按比例因子scalefactor缩放,交换蓝色和红色通道。
Mat blobFromImage(InputArray image, double scalefactor=1.0, const Size& size = Size(),
const Scalar& mean = Scalar(), bool swapRB=true, bool crop=true,
int ddepth=CV_32F);
image: 输入图像(带有1个,3个或4个通道)。
size: 输出图像的空间大小
mean : 从通道中减去的平均值的标量?(scalar with mean values which are subtracted from channels.)。 如果图像具有BGR排序且swapRB为真,则值应为(mean-R,mean-G,mean-B)顺序。
scalefactor : 图像值的比例因子乘数。.
swapRB: 如果图像具有BGR排序且swapRB为真,则值应为(mean-R,mean-G,mean-B)顺序。
crop :裁剪标志,指示是否在调整大小后裁剪图像
ddepth :输出blob的深度。 选择CV_32F或CV_8U
if crop is true, input image is resized so one side after resize is equal to corresponding dimension in size and another one is equal or larger. Then, crop from the center is performed. If crop is false, direct resize without cropping and preserving aspect ratio is performed.(如果裁剪为真,则调整输入图像的大小,使调整大小后的一侧等于相应的尺寸大小,另一侧等于或大于大小。 然后,执行从中心的裁剪。 如果裁剪为假,则执行直接调整大小而不裁剪并保留纵横比。)
函数返回一个 4维 Mat 矩阵( [N,C,H,W] dimensions.)
3、DNN中的Net结构:
https://docs.opencv.org/3.4.3/db/d30/classcv_1_1dnn_1_1Net.html
3.1、setInput函数
用于设置网络的新输入值。
void setInput(InputArray blob, const String& name = "",
double scalefactor = 1.0, const Scalar& mean = Scalar());
/** @brief Sets the new input value for the network
* @param blob A new blob. Should have CV_32F or CV_8U depth.
* @param name A name of input layer.
* @param scalefactor An optional normalization scale.
* @param mean An optional mean subtraction values.
* @see connect(String, String) to know format of the descriptor.
*
* If scale or mean values are specified, a final input blob is computed
* as:
* \f[input(n,c,h,w) = scalefactor \times (blob(n,c,h,w) - mean_c)\f]
*/
blob:一个新的blob。 应该有CV_32F或CV_8U深度。
name:输入图层的名称。
scalefactor:可选的标准化比例。
mean:一个可选的平均减法值。
如果指定了scale或mean值,则最终输入blob计算为:
调用方式:
net.setInput(inputBlob, "data");
3.2、forward
前向运行计算输出层,输出层名称为outputName。
/** @brief Runs forward pass to compute output of layer with name @p outputName.
* @param outputName name for layer which output is needed to get
* @return blob for first output of specified layer.
* @details By default runs forward pass for the whole network.
*/
CV_WRAP Mat forward(const String& outputName = String());
参数
outputName:需要输出的图层的名称
返回:指定图层的第一个输出的blob。
默认情况下,为整个网络运行正向传递。
调用:
Mat detection = net.forward("detection_out"); //compute output, [Make forward pass]
实例:
以opencv3.3\sources\samples\dnn\caffe_googlenet.cpp为例
/**M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include <opencv2/dnn.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/core/utils/trace.hpp>
using namespace cv;
using namespace cv::dnn;
#include <fstream>
#include <iostream>
#include <cstdlib>
using namespace std;
/* Find best class for the blob (i. e. class with maximal probability) */
static void getMaxClass(const Mat &probBlob, int *classId, double *classProb)
{
Mat probMat = probBlob.reshape(1, 1); //reshape the blob to 1x1000 matrix
Point classNumber;
minMaxLoc(probMat, NULL, classProb, NULL, &classNumber);
*classId = classNumber.x;
}
static std::vector<String> readClassNames(const char *filename = "synset_words.txt")
{
std::vector<String> classNames;
std::ifstream fp(filename);
if (!fp.is_open())
{
std::cerr << "File with classes labels not found: " << filename << std::endl;
exit(-1);
}
std::string name;
while (!fp.eof())
{
std::getline(fp, name);
if (name.length())
classNames.push_back( name.substr(name.find(' ')+1) );
}
fp.close();
return classNames;
}
int main(int argc, char **argv)
{
CV_TRACE_FUNCTION();
String modelTxt = "bvlc_googlenet.prototxt";
String modelBin = "bvlc_googlenet.caffemodel";
String imageFile = (argc > 1) ? argv[1] : "space_shuttle.jpg";
Net net;
try {
//! [Read and initialize network]
net = dnn::readNetFromCaffe(modelTxt, modelBin);
//! [Read and initialize network]
}
catch (cv::Exception& e) {
std::cerr << "Exception: " << e.what() << std::endl;
//! [Check that network was read successfully]
if (net.empty())
{
std::cerr << "Can't load network by using the following files: " << std::endl;
std::cerr << "prototxt: " << modelTxt << std::endl;
std::cerr << "caffemodel: " << modelBin << std::endl;
std::cerr << "bvlc_googlenet.caffemodel can be downloaded here:" << std::endl;
std::cerr << "http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel" << std::endl;
exit(-1);
}
//! [Check that network was read successfully]
}
//! [Prepare blob]
Mat img = imread(imageFile);
if (img.empty())
{
std::cerr << "Can't read image from the file: " << imageFile << std::endl;
exit(-1);
}
//GoogLeNet accepts only 224x224 BGR-images
Mat inputBlob = blobFromImage(img, 1.0f, Size(224, 224),
Scalar(104, 117, 123), false); //Convert Mat to batch of images
//! [Prepare blob]
Mat prob;
cv::TickMeter t;
for (int i = 0; i < 10; i++)
{
CV_TRACE_REGION("forward");
//! [Set input blob]
net.setInput(inputBlob, "data"); //set the network input
//! [Set input blob]
t.start();
//! [Make forward pass]
prob = net.forward("prob"); //compute output
//! [Make forward pass]
t.stop();
}
//! [Gather output]
int classId;
double classProb;
getMaxClass(prob, &classId, &classProb);//find the best class
//! [Gather output]
//! [Print results]
std::vector<String> classNames = readClassNames();
std::cout << "Best class: #" << classId << " '" << classNames.at(classId) << "'" << std::endl;
std::cout << "Probability: " << classProb * 100 << "%" << std::endl;
//! [Print results]
std::cout << "Time: " << (double)t.getTimeMilli() / t.getCounter() << " ms (average from " << t.getCounter() << " iterations)" << std::endl;
return 0;
} //main
from:https://blog.csdn.net/qq_15947787/article/details/78436995