Facial landmark detection in OpenCV

Landmark detection starts with face detection: finding the faces in the image and their extents (bounding boxes). Face detection has long been considered a solved problem, and OpenCV contains one of the first robust face detectors made freely available to the public. In fact, in its early days OpenCV was known and used primarily for its fast face detection feature, which implements the canonical Viola-Jones boosted cascade classifier algorithm (Viola et al. 2001, 2004) and ships with a pre-trained model. While face detection has advanced considerably since those early days, the fastest and easiest method for detecting faces in OpenCV is still to use the bundled cascade classifiers, by means of the cv::CascadeClassifier class provided in the objdetect module.

We implement a simple helper function to detect faces with the cascade classifier, shown as follows:

void faceDetector(const Mat& image,
                  std::vector<Rect> &faces,
                  CascadeClassifier &face_cascade) {
    Mat gray;

    // The cascade classifier works best on grayscale images
    if (image.channels() > 1) {
        cvtColor(image, gray, COLOR_BGR2GRAY);
    } else {
        gray = image.clone();
    }

    // Histogram equalization generally aids in face detection
    equalizeHist(gray, gray);

    faces.clear();

    // Run the cascade classifier
    face_cascade.detectMultiScale(
        gray,
        faces,
        1.4, // pyramid scale factor
        3,   // lower threshold for the neighbors count
        // here we hint the classifier to only look for one face
        CASCADE_SCALE_IMAGE | CASCADE_FIND_BIGGEST_OBJECT);
}

We may want to tweak the two parameters that govern the face detection: the pyramid scale factor and the number of neighbors. The pyramid scale factor is used to create a pyramid of images within which the detector will try to find faces. This is how multi-scale detection is achieved, since the bare detector has a fixed aperture (window size). At each step of the image pyramid, the image is downscaled by this factor, so a small factor (closer to 1.0) results in more images and a longer runtime, but more accurate results. We also control the lower threshold for the number of neighbors. This comes into play when the cascade classifier produces multiple positive face classifications in close proximity. Here, we instruct the overall classification to return a face bound only if it has at least three neighboring positive face classifications. A lower number (an integer, close to 1) will return more detections, but will also introduce more false positives.
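
To get a feel for how the scale factor drives the runtime, the following sketch counts the pyramid levels a fixed-size window would visit. It is a simplified illustration and not OpenCV's actual implementation: the helper name pyramidScales is hypothetical, and the 24-pixel window is an assumption based on the training size commonly cited for the bundled frontal face cascade.

```cpp
#include <vector>

// Hypothetical helper (not part of OpenCV): lists the downscale factors at
// which a fixed-size detector window still fits inside the image.
// windowSize is an assumed native window size of the detector.
std::vector<double> pyramidScales(int imageWidth, int imageHeight,
                                  double scaleFactor, int windowSize = 24) {
    std::vector<double> scales;
    if (scaleFactor <= 1.0 || windowSize <= 0) {
        return scales; // a factor of 1.0 or less would never terminate
    }
    // Keep shrinking the image while the window still fits in both dimensions.
    for (double s = 1.0;
         imageWidth / s >= windowSize && imageHeight / s >= windowSize;
         s *= scaleFactor) {
        scales.push_back(s);
    }
    return scales;
}
```

Under these assumptions, a 640 x 480 image yields 9 levels with a factor of 1.4 and 32 levels with a factor of 1.1, which is why a factor close to 1.0 costs noticeably more time.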

We must initialize the cascade classifier from the OpenCV-provided models (XML files of the serialized models are provided in the $OPENCV_ROOT/data/haarcascades directory). We use the standard trained classifier on frontal faces, demonstrated as follows:

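// Note: "$OPENCV_ROOT" is a placeholder and is not expanded inside a C++ string literal;
// substitute the actual path of your OpenCV installation.
const string cascade_name = "$OPENCV_ROOT/data/haarcascades/haarcascade_frontalface_default.xml";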

CascadeClassifier face_cascade;
if (!face_cascade.load(cascade_name)) {
    cerr << "Cannot load cascade classifier from file: " << cascade_name << endl;
    return -1;
}

// ... obtain an image in img

vector<Rect> faces;
faceDetector(img, faces, face_cascade);

// Check if any faces were detected or not
if (faces.size() == 0) {
    cerr << "Cannot detect any faces in the image." << endl;
    return -1;
}
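
Because a C++ string literal does not expand shell-style variables, one possible way to honor an OPENCV_ROOT setting is to read it at runtime. The following is a minimal sketch under that assumption: the helper name cascadePathFromEnv is hypothetical, and OPENCV_ROOT is simply the placeholder name used in the text, not a variable that OpenCV sets for you.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: builds the cascade file path from an environment
// variable. Returns an empty string when the variable is unset or empty,
// so the caller can report a configuration problem.
std::string cascadePathFromEnv(const std::string& fileName,
                               const char* envVar = "OPENCV_ROOT") {
    const char* root = std::getenv(envVar);
    if (root == nullptr || *root == '\0') {
        return std::string();
    }
    // Directory layout taken from the text: <root>/data/haarcascades/<file>
    return std::string(root) + "/data/haarcascades/" + fileName;
}
```

The returned string can then be passed to face_cascade.load() in place of the hard-coded name, after checking that it is not empty.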

A visualization of the results of the face detector is shown in the following screenshot:

The facemark detector operates on the detected faces, using their bounding boxes as a starting point. First, however, we must initialize the cv::face::Facemark object, demonstrated as follows:

#include <opencv2/face.hpp>

using namespace cv::face;

// ...

const string facemark_filename = "data/lbfmodel.yaml";
Ptr<Facemark> facemark = createFacemarkLBF();
facemark->loadModel(facemark_filename);
cout << "Loaded facemark LBF model" << endl;

The cv::face::Facemark abstract API is shared by all the landmark detector flavors, and provides the base functionality for inference and training that each specific algorithm implements. Once the model is loaded, the facemark object can be used with its fit function to find the face shape, shown as follows:

vector<Rect> faces;
faceDetector(img, faces, face_cascade);

// Proceed only if at least one face was detected
if (!faces.empty()) {
    // We assume a single face so we look at the first only
    cv::rectangle(img, faces[0], Scalar(255, 0, 0), 2);

    vector<vector<Point2f> > shapes;

    if (facemark->fit(img, faces, shapes)) {
        // Draw the detected landmarks
        drawFacemarks(img, shapes[0], cv::Scalar(0, 0, 255));
    }
} else {
    cout << "Faces not detected." << endl;
}
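
Once fit has filled in a shape, a common next step is to measure the extent of the landmarks, for example to crop tightly around the face. The following is a minimal sketch of that idea and not part of the Facemark API: the Bounds struct and the landmarkBounds helper are hypothetical names, and the template accepts any point type with public x and y members, such as cv::Point2f.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical result type: axis-aligned extent of one face shape.
struct Bounds {
    float minX, minY, maxX, maxY;
};

// Hypothetical helper: computes the extent of a list of landmark points.
// Returns false and leaves 'out' untouched when the shape is empty.
template <typename PointT>
bool landmarkBounds(const std::vector<PointT>& shape, Bounds& out) {
    if (shape.empty()) {
        return false;
    }
    const float x0 = static_cast<float>(shape[0].x);
    const float y0 = static_cast<float>(shape[0].y);
    Bounds b = {x0, y0, x0, y0};
    for (const PointT& p : shape) {
        b.minX = std::min(b.minX, static_cast<float>(p.x));
        b.minY = std::min(b.minY, static_cast<float>(p.y));
        b.maxX = std::max(b.maxX, static_cast<float>(p.x));
        b.maxY = std::max(b.maxY, static_cast<float>(p.y));
    }
    out = b;
    return true;
}
```

With the variables from the listing above, a call such as landmarkBounds(shapes[0], bounds) would give the extent of the first fitted face.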

A visualization of the results of the landmark detector (using cv::face::drawFacemarks) is shown in the following screenshot: