OpenCV  3.2.0
Open Source Computer Vision
Load Caffe framework models


Introduction

In this tutorial you will learn how to use the opencv_dnn module for image classification, using the pre-trained GoogLeNet network from the Caffe model zoo.

We will demonstrate the results of this example on the following picture.

Source Code

We will be using snippets from the example application, which can be downloaded here.

#include <opencv2/dnn.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
using namespace cv;
using namespace cv::dnn;
#include <fstream>
#include <iostream>
#include <cstdlib>
using namespace std;
/* Find the best class for the blob (i.e. the class with maximal probability) */
void getMaxClass(dnn::Blob &probBlob, int *classId, double *classProb)
{
    Mat probMat = probBlob.matRefConst().reshape(1, 1); //reshape the blob to 1x1000 matrix
    Point classNumber;
    minMaxLoc(probMat, NULL, classProb, NULL, &classNumber);
    *classId = classNumber.x;
}
/* Read class names, dropping the text up to the first space on each line */
std::vector<String> readClassNames(const char *filename = "synset_words.txt")
{
    std::vector<String> classNames;
    std::ifstream fp(filename);
    if (!fp.is_open())
    {
        std::cerr << "File with classes labels not found: " << filename << std::endl;
        exit(-1);
    }
    std::string name;
    while (std::getline(fp, name)) //loop ends cleanly at end of file
    {
        if (name.length())
            classNames.push_back( name.substr(name.find(' ')+1) );
    }
    fp.close();
    return classNames;
}
int main(int argc, char **argv)
{
    cv::dnn::initModule(); //Required if OpenCV is built as static libs
    String modelTxt = "bvlc_googlenet.prototxt";
    String modelBin = "bvlc_googlenet.caffemodel";
    String imageFile = (argc > 1) ? argv[1] : "space_shuttle.jpg";
    Net net = dnn::readNetFromCaffe(modelTxt, modelBin);
    if (net.empty())
    {
        std::cerr << "Can't load network by using the following files: " << std::endl;
        std::cerr << "prototxt: " << modelTxt << std::endl;
        std::cerr << "caffemodel: " << modelBin << std::endl;
        std::cerr << "bvlc_googlenet.caffemodel can be downloaded here:" << std::endl;
        std::cerr << "http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel" << std::endl;
        exit(-1);
    }
    Mat img = imread(imageFile);
    if (img.empty())
    {
        std::cerr << "Can't read image from the file: " << imageFile << std::endl;
        exit(-1);
    }
    resize(img, img, Size(224, 224)); //GoogLeNet expects 224x224 3-channel input
    dnn::Blob inputBlob = dnn::Blob::fromImages(img); //Convert Mat to dnn::Blob batch of images
    net.setBlob(".data", inputBlob); //set the network input
    net.forward(); //compute output
    dnn::Blob prob = net.getBlob("prob"); //gather output of "prob" layer
    int classId;
    double classProb;
    getMaxClass(prob, &classId, &classProb); //find the best class
    std::vector<String> classNames = readClassNames();
    std::cout << "Best class: #" << classId << " '" << classNames.at(classId) << "'" << std::endl;
    std::cout << "Probability: " << classProb * 100 << "%" << std::endl;
    return 0;
} //main
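
The two helper functions in the listing can be tried out without OpenCV or the model files. The following is a minimal sketch using only the standard library, not the sample's own code: the names bestClass and parseClassNames are hypothetical, the scores are assumed to be available as a plain std::vector<float>, and each label line is assumed to have the form "<id> <human-readable name>".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for getMaxClass: return (index, value) of the
// largest score. Ties resolve to the first maximum.
std::pair<std::size_t, float> bestClass(const std::vector<float> &probs)
{
    assert(!probs.empty());
    std::vector<float>::const_iterator it =
        std::max_element(probs.begin(), probs.end());
    return std::make_pair(static_cast<std::size_t>(it - probs.begin()), *it);
}

// Hypothetical stand-in for readClassNames: parse label lines of the
// assumed form "<id> <human-readable name>" from any input stream.
std::vector<std::string> parseClassNames(std::istream &in)
{
    std::vector<std::string> classNames;
    std::string line;
    while (std::getline(in, line))
    {
        if (line.empty())
            continue; // skip blank lines
        // Keep everything after the first space. If the line has no space,
        // find() returns npos, npos + 1 wraps to 0, and the whole line is kept.
        classNames.push_back(line.substr(line.find(' ') + 1));
    }
    return classNames;
}
```

For example, passing parseClassNames a std::istringstream that holds the made-up line "n00000000 example label" yields a single entry, "example label". The index returned by bestClass can then be used to look up the matching entry, as the listing does with classNames.at(classId).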

Explanation

  1. First, download the GoogLeNet model files: bvlc_googlenet.prototxt and bvlc_googlenet.caffemodel

    You also need the file with the names of the ILSVRC2012 classes: synset_words.txt.

    Put these files into the working directory of this example program.

  2. Read and initialize the network using the paths to the .prototxt and .caffemodel files
    Net net = dnn::readNetFromCaffe(modelTxt, modelBin);
  3. Check that the network was read successfully
    if (net.empty())
    {
        std::cerr << "Can't load network by using the following files: " << std::endl;
        std::cerr << "prototxt: " << modelTxt << std::endl;
        std::cerr << "caffemodel: " << modelBin << std::endl;
        std::cerr << "bvlc_googlenet.caffemodel can be downloaded here:" << std::endl;
        std::cerr << "http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel" << std::endl;
        exit(-1);
    }
  4. Read the input image and convert it to a blob acceptable to GoogLeNet

    Mat img = imread(imageFile);
    if (img.empty())
    {
        std::cerr << "Can't read image from the file: " << imageFile << std::endl;
        exit(-1);
    }
    resize(img, img, Size(224, 224)); //GoogLeNet expects 224x224 3-channel input
    dnn::Blob inputBlob = dnn::Blob::fromImages(img); //Convert Mat to dnn::Blob batch of images

    First, we resize the image to 224x224. Note that imread loads color images in BGR channel order, and this snippet does not reorder the channels explicitly.

    The image is now a 3-dimensional array of shape 224x224x3.

    Next, we convert the image to a 4-dimensional blob (a so-called batch) of shape 1x3x224x224 using the static function cv::dnn::Blob::fromImages.

  5. Pass the blob to the network

    net.setBlob(".data", inputBlob); //set the network input

    In bvlc_googlenet.prototxt the network input blob is named "data", so this blob is labeled ".data" in the opencv_dnn API.

    Other blobs are labeled "name_of_layer.name_of_layer_output".

  6. Make forward pass
    net.forward(); //compute output
    During the forward pass the output of every network layer is computed, but in this example we need only the output of the "prob" layer.
  7. Determine the best class
    dnn::Blob prob = net.getBlob("prob"); //gather output of "prob" layer
    int classId;
    double classProb;
    getMaxClass(prob, &classId, &classProb);//find the best class
    We store the output of the "prob" layer, which contains the probabilities for each of the 1000 ILSVRC2012 image classes, in the prob blob. We then find the index of the element with the maximal value in it. This index corresponds to the class of the image.
  8. Print results
    std::vector<String> classNames = readClassNames();
    std::cout << "Best class: #" << classId << " '" << classNames.at(classId) << "'" << std::endl;
    std::cout << "Probability: " << classProb * 100 << "%" << std::endl;
    For our image we get:

    Best class: #812 'space shuttle'

    Probability: 99.6378%
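
The shape change in step 4, from 224x224x3 to 1x3x224x224, amounts to moving from interleaved storage (all channels of a pixel stored together) to planar storage (one full plane per channel). The following is a minimal sketch of that re-layout using only the standard library. It is not the implementation of cv::dnn::Blob::fromImages; the function name hwcToNchw and the float output type are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: convert one interleaved image
// (height x width x channels) into a planar buffer laid out as
// 1 x channels x height x width, stored as a flat vector of floats.
std::vector<float> hwcToNchw(const std::vector<unsigned char> &hwc,
                             std::size_t height,
                             std::size_t width,
                             std::size_t channels)
{
    assert(hwc.size() == height * width * channels);
    std::vector<float> nchw(hwc.size());
    for (std::size_t y = 0; y < height; ++y)
        for (std::size_t x = 0; x < width; ++x)
            for (std::size_t c = 0; c < channels; ++c)
            {
                // Source index: pixel (y, x), channel c, channels interleaved.
                // Destination index: plane c, then row y, then column x.
                nchw[(c * height + y) * width + x] =
                    static_cast<float>(hwc[(y * width + x) * channels + c]);
            }
    return nchw;
}
```

For a 2x2 image with 3 channels whose pixels are (1,2,3), (4,5,6), (7,8,9) and (10,11,12), the result holds the first channel plane 1, 4, 7, 10, then the second plane 2, 5, 8, 11, then the third plane 3, 6, 9, 12. The batch dimension of size 1 adds no extra elements, so the flat buffer has the same length as the input.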
