Importing a deep learning model into OpenCV is straightforward: we can import models from TensorFlow, Caffe, Torch, and Darknet. The import functions are all very similar; in this chapter, we are going to learn how to import a TensorFlow model.
To import a TensorFlow model, we can use the readNetFromTensorflow function, which accepts two parameters: the first is the path to the model in binary protobuf format, and the second is the path to the text graph definition, also in protobuf format. The second parameter is optional and we do not pass it here; in our case, however, the model has to be prepared for inference and optimized before OpenCV can import it. Then, we can import the model with the following code:
dnn::Net dnn_net= readNetFromTensorflow("frozen_cut_graph_opt.pb");
To classify each detected segment of our plate, we have to feed each image segment to our dnn_net and obtain the probabilities. This is the full code to classify each segment:
for (auto& segment : segments) {
    // Preprocess each character so that all images have the same size
    Mat ch = preprocessChar(segment.img);
    // Convert the image to a blob and classify it with the DNN
    Mat inputBlob;
    blobFromImage(ch, inputBlob, 1.0f, Size(20, 20), Scalar(), true, false);
    dnn_net.setInput(inputBlob);
    Mat outs;
    dnn_net.forward(outs);
    cout << outs << endl;
    // Find the column (label index) with the highest probability
    double max;
    Point pos;
    minMaxLoc(outs, NULL, &max, NULL, &pos);
    cout << "---->" << pos << " prob: " << max << " " << strCharacters[pos.x] << endl;
    // Store the predicted character and its position in the plate data
    input->chars.push_back(strCharacters[pos.x]);
    input->charsPos.push_back(segment.pos);
}
Let's explain this code in more detail. First, we have to preprocess each segment so that every image has the same size, 20 x 20 pixels. The preprocessed image must then be converted to a blob stored in a Mat structure. To do this, we use the blobFromImage function, which creates four-dimensional data and can optionally resize, scale, and crop the image and swap its blue and red channels. The function has the following signature:
void cv::dnn::blobFromImage (
InputArray image,
OutputArray blob,
double scalefactor = 1.0,
const Size & size = Size(),
const Scalar & mean = Scalar(),
bool swapRB = false,
bool crop = false,
int ddepth = CV_32F
)
The parameters are defined as follows:
- image: Input image (with one, three, or four channels).
- blob: Output blob Mat.
- scalefactor: Multiplier for image values.
- size: Spatial size for the output image.
- mean: Scalar with mean values, which are subtracted from channels. Values are intended to be in (mean-R, mean-G, mean-B) order if the image has BGR ordering and swapRB is true.
- swapRB: A flag that indicates the need to swap the first and last channels in a three-channel image.
- crop: A flag that indicates whether the image will be cropped after resizing.
- ddepth: Depth of output blob. Choose CV_32F or CV_8U.
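To make the effect of scalefactor and mean more concrete, the following is a minimal sketch in plain C++ (no OpenCV) of the arithmetic applied to a single-channel image that already has the target size. The Blob struct and the makeBlobFromGray function are hypothetical names introduced only for this illustration; resizing, cropping, and channel swapping are left out. It assumes that the mean is subtracted first and the result is then multiplied by scalefactor, and that the blob is laid out as batch x channels x height x width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a four-dimensional blob; not an OpenCV type.
struct Blob {
    int n, c, h, w;           // batch size, channels, height, width
    std::vector<float> data;  // n * c * h * w values, stored row by row
};

// Converts one 8-bit single-channel image (row-major, h x w) into a
// 1 x 1 x h x w float blob. Each value becomes (pixel - mean) * scalefactor.
// Resizing, cropping, and channel swapping are intentionally not modeled.
Blob makeBlobFromGray(const std::vector<std::uint8_t>& pixels, int h, int w,
                      double scalefactor = 1.0, double mean = 0.0) {
    assert(h > 0 && w > 0);
    assert(pixels.size() == static_cast<std::size_t>(h) * w);

    Blob blob{1, 1, h, w, std::vector<float>(pixels.size())};
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        blob.data[i] = static_cast<float>((pixels[i] - mean) * scalefactor);
    }
    return blob;
}
```

With scalefactor set to 1.0 and an empty mean, as in our classification code, the pixel values are only converted to floating point; a scalefactor of 1.0 / 255.0 would instead map them to the range 0 to 1.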
This generated blob can be set as the input of our DNN using dnn_net.setInput(inputBlob).
Once the input blob is set for our network, we only need to pass the input forward to obtain our results. This is the purpose of the dnn_net.forward(outs) call, which fills outs with a Mat containing the softmax prediction results. The result is a single-row Mat in which each column corresponds to one label; to get the label with the highest probability, we only need to find the position of the maximum value in this Mat. We can use the minMaxLoc function to retrieve the label index (the x coordinate of the maximum location) and, if we so desire, the probability value too.
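The label selection step can be sketched without OpenCV as a search for the largest value in a row of scores. In this illustrative snippet, the Prediction struct, the pickLabel function, and the four-character alphabet used in the usage note are hypothetical; in the application, the row comes from the network output and the characters from strCharacters.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct Prediction {
    std::size_t index;  // column holding the highest score
    float prob;         // score stored in that column
    char label;         // character associated with that column
};

// Returns the column with the highest score and the character mapped to it.
// Assumes one score per character of the alphabet, in the same order.
Prediction pickLabel(const std::vector<float>& scores,
                     const std::string& alphabet) {
    assert(!scores.empty());
    assert(scores.size() == alphabet.size());

    auto best = std::max_element(scores.begin(), scores.end());
    auto idx = static_cast<std::size_t>(std::distance(scores.begin(), best));
    return Prediction{idx, *best, alphabet[idx]};
}
```

For example, calling pickLabel({0.05f, 0.10f, 0.80f, 0.05f}, "0123") selects column 2 and therefore the character '2'. This mirrors what minMaxLoc does for a single-row Mat, where pos.x plays the role of index.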
Finally, to complete the ANPR application, we only have to save the position of the new segment and the label obtained in the input plate data.
If we execute the application, we will obtain a result like this: