Viewing the learned knowledge

While it is not necessary, it can be quite useful to view the internal data structures that the face recognition algorithm generated when learning your training data, particularly if you understand the theory behind the algorithm you selected and want to verify that it worked, or to find out why it is not working as you hoped. The internal data structures differ between algorithms, but luckily they are the same for Eigenfaces and Fisherfaces, so let's look at just those two. Both are based on 1D eigenvector matrices that look somewhat like faces when viewed as 2D images; it is therefore common to refer to eigenvectors as Eigenfaces when using the Eigenfaces algorithm, or as Fisherfaces when using the Fisherfaces algorithm.

In simple terms, the basic principle of Eigenfaces is that it calculates a set of special images (Eigenfaces) and blending ratios (Eigenvalues), which, when combined in different ways, can generate each of the images in the training set, but can also be used to differentiate the many face images in the training set from each other. For example, if some of the faces in the training set had a moustache and some did not, then there would be at least one eigenface that shows a moustache. The training faces with a moustache would have a high blending ratio for that eigenface, to show that they contain a moustache, whereas the faces without a moustache would have a low blending ratio for it.

If the training set has five people with twenty faces for each person, then there would be 100 Eigenfaces and Eigenvalues to differentiate the 100 total faces in the training set. In fact, these would be sorted, so the first few Eigenfaces and Eigenvalues would be the most critical differentiators, while the last few would just be random image noise that does not actually help to differentiate the data. It is therefore common practice to discard some of the last Eigenfaces and just keep the first 50 or so.

In comparison, the basic principle of Fisherfaces is that, instead of calculating a special eigenvector and eigenvalue for each image in the training set, it calculates only one special eigenvector and eigenvalue for each person. So, in the preceding example, which has five people with twenty faces each, the Eigenfaces algorithm would use 100 Eigenfaces and Eigenvalues, whereas the Fisherfaces algorithm would use just five Fisherfaces and Eigenvalues.

To access the internal data structures of the Eigenfaces and Fisherfaces algorithms, we must use the cv::Algorithm::get() function to obtain them at runtime, as there is no access to them at compile time. These data structures are used internally as part of mathematical calculations, rather than for image processing, so they are usually stored as floating-point numbers, typically ranging between 0.0 and 1.0, rather than as 8-bit uchar pixels ranging from 0 to 255 like the pixels in regular images. Also, they are often either a single 1D row or column matrix, or they make up one of the many rows or columns of a larger matrix. So, before you can display many of these internal data structures, you must reshape them into the correct rectangular shape and convert them to 8-bit uchar pixels between 0 and 255. As the matrix data might range from 0.0 to 1.0, or from -1.0 to 1.0, or anything else, you can use the cv::normalize() function with the cv::NORM_MINMAX option to make sure it outputs data ranging between 0 and 255, no matter what the input range may be. Let's create a function to perform this reshaping to a rectangle and conversion to 8-bit pixels for us, as follows:

    // Convert the matrix row or column (float matrix) to a
    // rectangular 8-bit image that can be displayed or saved.
    // Scales the values to be between 0 to 255.
    Mat getImageFrom1DFloatMat(const Mat& matrixRow, int height)
    {
        // Make a rectangular shaped image instead of a single row.
        Mat rectangularMat = matrixRow.reshape(1, height);
        // Scale the values to be between 0 to 255 and store them
        // as a regular 8-bit uchar image.
        Mat dst;
        normalize(rectangularMat, dst, 0, 255, NORM_MINMAX, CV_8UC1);
        return dst;
    }

To make it easier to debug OpenCV code, and even more so when internally debugging the cv::Algorithm data structures, we can use the ImageUtils.cpp and ImageUtils.h files to easily display information about a cv::Mat structure, as follows:

    Mat img = ...;
    printMatInfo(img, "My Image");

You will see something similar to the following printed on your console:

    My Image: 640w480h 3ch 8bpp, range[79,253][20,58][18,87]

This tells you that it is 640 elements wide and 480 high (that is, a 640 x 480 image or a 480 x 640 matrix, depending on how you view it), with three channels per pixel that are 8-bits each (that is, a regular BGR image), and it shows the minimum and maximum values in the image for each of the color channels.

It is also possible to print the actual contents of an image or matrix by using the printMat() function instead of the printMatInfo() function. This is quite handy for viewing matrices and multichannel-float matrices, as these can be quite tricky for beginners to view.

The ImageUtils code is mostly for OpenCV's C interface, but is gradually including more of the C++ interface over time. The most recent version can be found at http://shervinemami.info/openCV.html.