Face detection with OpenCV

OpenCV provides two approaches for face detection:

Haar cascade based face detectors
Deep learning-based face detectors

The framework proposed by Viola and Jones (see Rapid Object Detection Using a Boosted Cascade of Simple Features (2001)) is an effective object detection method. It is very popular, in part because OpenCV provides face detection algorithms based on it. The framework can also be used to detect objects other than faces (for example, with a full body detector, license plate detector, upper body detector, or cat face detector). In this section, we will see how to detect faces using this framework.

The face_detection_opencv_haar.py script performs face detection using Haar feature-based cascade classifiers. OpenCV provides four cascade classifiers for (frontal) face detection:

haarcascade_frontalface_alt.xml (FA1): 22 stages and 20 x 20 Haar features
haarcascade_frontalface_alt2.xml (FA2): 20 stages and 20 x 20 Haar features
haarcascade_frontalface_alt_tree.xml (FAT): 47 stages and 20 x 20 Haar features
haarcascade_frontalface_default.xml (FD): 25 stages and 24 x 24 Haar features

Several published evaluations have compared the performance of these cascade classifiers using different criteria and datasets. Overall, it can be concluded that the classifiers achieve similar accuracy. For that reason, and to keep things simple, this script uses only two of them. More specifically, the previously introduced FA2 and FD cascade classifiers are loaded:

# Load cascade classifiers:
cas_alt2 = cv2.CascadeClassifier("haarcascade_frontalface_alt2.xml")
cas_default = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

The cv2.CascadeClassifier() constructor loads a classifier from a file. You can download these cascade classifier files from the OpenCV repository: https://github.com/opencv/opencv/tree/master/data/haarcascades. We have also included the two loaded cascade classifier files in the GitHub repository (haarcascade_frontalface_alt2.xml and haarcascade_frontalface_default.xml).

The next step is to perform the detection:

faces_alt2 = cas_alt2.detectMultiScale(gray)
faces_default = cas_default.detectMultiScale(gray)

The cv2.CascadeClassifier.detectMultiScale() function detects objects and returns them as a list of rectangles. The final step is to draw the results using the show_detection() function:

img_faces_alt2 = show_detection(img.copy(), faces_alt2)
img_faces_default = show_detection(img.copy(), faces_default)

The show_detection() function draws a rectangle over each detected face:

def show_detection(image, faces):
    """Draws a rectangle over each detected face"""

    for (x, y, w, h) in faces:
        cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 0), 5)
    return image
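Each rectangle is given as (x, y, w, h), while cv2.rectangle() expects two opposite corners, which is why show_detection() computes (x + w, y + h). As a minimal sketch of that conversion applied to a whole array at once, the following helper (rects_to_corners is a hypothetical name, not part of OpenCV or the book's script) uses only NumPy and two sample rectangles:

```python
import numpy as np


def rects_to_corners(faces):
    """Convert (x, y, w, h) rectangles to (x1, y1, x2, y2) corner coordinates."""
    rects = np.asarray(faces, dtype=int).reshape(-1, 4)
    corners = rects.copy()
    # The bottom-right corner is the top-left corner plus the width and height
    corners[:, 2] = rects[:, 0] + rects[:, 2]
    corners[:, 3] = rects[:, 1] + rects[:, 3]
    return corners


# Sample rectangles in the (x, y, w, h) layout
faces = np.array([[332, 93, 364, 364], [695, 104, 256, 256]])
corners = rects_to_corners(faces)
print(corners)
```

The second column pair now holds the bottom-right corner of each face, so each row can be passed to a drawing call as two points.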

OpenCV also provides the cv2.face.getFacesHAAR() function to detect faces (the cv2.face module is part of the opencv_contrib modules):

retval, faces_haar_alt2 = cv2.face.getFacesHAAR(img, "haarcascade_frontalface_alt2.xml")
retval, faces_haar_default = cv2.face.getFacesHAAR(img, "haarcascade_frontalface_default.xml")

Note that cv2.CascadeClassifier.detectMultiScale() needs a grayscale image, while cv2.face.getFacesHAAR() needs a BGR image as input. Moreover, cv2.CascadeClassifier.detectMultiScale() outputs the detected faces as a list of rectangles, each given as (x, y, width, height). For example, the output for two detected faces will look like this:

[[332 93 364 364] [695 104 256 256]]

The cv2.face.getFacesHAAR() function returns the faces in a similar format, but with an extra dimension wrapping each rectangle:

[[[298 524 61 61]] [[88 72 315 315]]]

To remove these redundant single-element dimensions, call np.squeeze():

faces_haar_alt2 = np.squeeze(faces_haar_alt2)
faces_haar_default = np.squeeze(faces_haar_default)
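One caveat is worth illustrating: np.squeeze() removes every axis of length one, so when exactly one face is returned, the face axis disappears as well and the result can no longer be iterated as a list of rectangles. The following sketch, which assumes the nested layout shown above, compares np.squeeze() with reshape(-1, 4) (flatten_faces is a hypothetical helper name):

```python
import numpy as np


def flatten_faces(faces):
    """Return detections as an (N, 4) array, whatever the number of faces."""
    return np.asarray(faces).reshape(-1, 4)


# Same nested layout as the getFacesHAAR() output shown above
two_faces = np.array([[[298, 524, 61, 61]], [[88, 72, 315, 315]]])
one_face = np.array([[[298, 524, 61, 61]]])

print(np.squeeze(two_faces).shape)    # (2, 4)
print(np.squeeze(one_face).shape)     # (4,): the face axis is lost
print(flatten_faces(one_face).shape)  # (1, 4)
```

With reshape(-1, 4), a loop such as the one in show_detection() works in the same way for one face or for many.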

The full code for detecting and drawing the faces in the loaded image is as follows:

# Load image and convert to grayscale:
img = cv2.imread("test_face_detection.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Load cascade classifiers:
cas_alt2 = cv2.CascadeClassifier("haarcascade_frontalface_alt2.xml")
cas_default = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

# Detect faces:
faces_alt2 = cas_alt2.detectMultiScale(gray)
faces_default = cas_default.detectMultiScale(gray)
retval, faces_haar_alt2 = cv2.face.getFacesHAAR(img, "haarcascade_frontalface_alt2.xml")
faces_haar_alt2 = np.squeeze(faces_haar_alt2)
retval, faces_haar_default = cv2.face.getFacesHAAR(img, "haarcascade_frontalface_default.xml")
faces_haar_default = np.squeeze(faces_haar_default)


# Draw face detections:
img_faces_alt2 = show_detection(img.copy(), faces_alt2)
img_faces_default = show_detection(img.copy(), faces_default)
img_faces_haar_alt2 = show_detection(img.copy(), faces_haar_alt2)
img_faces_haar_default = show_detection(img.copy(), faces_haar_default)

The final step is to show the four created images using OpenCV or, as in this case, Matplotlib. The full code can be seen in the face_detection_opencv_haar.py script. The output of this script can be seen in the following screenshot:

As you can see, the detected faces vary across the four aforementioned approaches, all of which use Haar feature-based cascade classifiers. Finally, note that the cv2.CascadeClassifier.detectMultiScale() function has the minSize and maxSize parameters, which establish the minimum size (objects smaller than minSize will not be detected) and the maximum size (objects larger than maxSize will not be detected), respectively. In contrast, the cv2.face.getFacesHAAR() function does not offer this possibility.
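Because cv2.face.getFacesHAAR() has no size parameters, one possible workaround is to discard rectangles by size after detection. The following is only a sketch under that assumption: filter_by_size is a hypothetical helper, and since the filtering happens after the detector has run, it only removes results and is not equivalent to passing minSize and maxSize to the detector itself:

```python
import numpy as np


def filter_by_size(faces, min_size=(0, 0), max_size=None):
    """Keep only (x, y, w, h) rectangles whose size lies within the given limits."""
    rects = np.asarray(faces).reshape(-1, 4)
    widths, heights = rects[:, 2], rects[:, 3]
    # Sizes are given as (width, height) pairs
    keep = (widths >= min_size[0]) & (heights >= min_size[1])
    if max_size is not None:
        keep &= (widths <= max_size[0]) & (heights <= max_size[1])
    return rects[keep]


# Sample rectangles, already flattened to an (N, 4) array
faces_haar = np.array([[298, 524, 61, 61], [88, 72, 315, 315]])
large_only = filter_by_size(faces_haar, min_size=(100, 100))
small_only = filter_by_size(faces_haar, max_size=(100, 100))
print(large_only)
print(small_only)
```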

Haar feature-based cascade classifiers can be used to detect objects other than human faces. The OpenCV library also provides two cascade files to use for cat face detection.

For the sake of completeness, the cat_face_detection_opencv_haar.py script loads two cascade files, which have been trained to detect frontal cat faces in images. This script is very similar to the face_detection_opencv_haar.py script; the key difference is the two cascade files that are loaded:

haarcascade_frontalcatface.xml: A frontal cat face detector using the basic set of Haar features with 20 stages and 24 x 24 Haar features
haarcascade_frontalcatface_extended.xml: A frontal cat face detector using the full set of Haar features with 20 stages and 24 x 24 Haar features

For more information about these cascade files, check out Joseph Howse's OpenCV for Secret Agents, Packt Publishing, January 2015. You can download these cascade classifier files from the OpenCV repository: https://github.com/opencv/opencv/tree/master/data/haarcascades. Moreover, we have included these two cascade classifier files in the GitHub repository.

The output of this script can be seen in the following screenshot:

Additionally, OpenCV provides a deep learning-based face detector (https://github.com/opencv/opencv/tree/master/samples/dnn/face_detector). More specifically, the OpenCV deep neural network (DNN) face detector is based on the Single Shot MultiBox Detector (SSD) framework using a ResNet-10 network.

OpenCV has included the DNN module since version 3.1. This module implements a forward pass (inference) with pre-trained deep networks from popular deep learning frameworks, such as Caffe, TensorFlow, Torch, and Darknet. In OpenCV 3.3, the module was promoted from the opencv_contrib repository to the main repository (https://github.com/opencv/opencv/tree/master/modules/dnn) and accelerated significantly. This means that we can use pre-trained networks to perform a complete forward pass and use the output to make a prediction within our application, rather than spending hours training the network. In Chapter 12, Introduction to Deep Learning, we will further explore the DNN module; in this chapter, we will focus on the deep learning face detector.

In this section, we will perform face detection using pre-trained deep learning face detector models, which are included in the library.

OpenCV provides two models for this face detector:

Face detector (FP16): Floating-point 16 version of the original Caffe implementation (5.1 MB)
Face detector (UINT8): 8-bit quantized version using TensorFlow (2.6 MB)

In each case, you will need two files: the model file and the configuration file. In the case of the Caffe model, these files are the following:

res10_300x300_ssd_iter_140000_fp16.caffemodel: This file contains the weights for the actual layers. It can be downloaded from https://github.com/opencv/opencv_3rdparty/raw/19512576c112aa2c7b6328cb0e8d589a4a90a26d/res10_300x300_ssd_iter_140000_fp16.caffemodel and it is also included in the GitHub repository of the book.
deploy.prototxt: This file defines the model architecture. It can be downloaded from https://github.com/opencv/opencv/blob/master/samples/dnn/face_detector/deploy.prototxt and is included in the GitHub repository of the book.

If you're using the TensorFlow model, you'll need these files:

opencv_face_detector_uint8.pb: This file contains the weights for the actual layers. This file can be downloaded from https://github.com/opencv/opencv_3rdparty/raw/8033c2bc31b3256f0d461c919ecc01c2428ca03b/opencv_face_detector_uint8.pb and is included in the GitHub repository of the book.
opencv_face_detector.pbtxt: This file defines the model architecture. It can be downloaded from https://github.com/opencv/opencv_extra/blob/master/testdata/dnn/opencv_face_detector.pbtxt and is included in the GitHub repository of the book.

The face_detection_opencv_dnn.py script shows you how to detect faces using the pre-trained deep learning face detector models. The first step is to load the pre-trained model:

# Load pre-trained model:
net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "res10_300x300_ssd_iter_140000_fp16.caffemodel")
# net = cv2.dnn.readNetFromTensorflow("opencv_face_detector_uint8.pb", "opencv_face_detector.pbtxt")

As you can see, in this example, the floating-point 16 version of the original Caffe implementation is loaded. To achieve the best accuracy, we must run the model on BGR images resized to 300 x 300, subtracting mean values of (104, 177, 123) from the blue, green, and red channels, respectively. This preprocessing is performed with the cv2.dnn.blobFromImage() OpenCV function:

blob = cv2.dnn.blobFromImage(image, 1.0, (300, 300), [104., 177., 123.], False, False)

In Chapter 12, Introduction to Deep Learning, we will look at this function in more depth.
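In the meantime, the arithmetic behind this preprocessing can be sketched with NumPy alone. The following is an approximation, not the OpenCV implementation: manual_blob is a hypothetical helper, it assumes the image is already 300 x 300 (so the resize step is skipped) and that no channel swapping or cropping is requested, and it uses the mean values quoted above:

```python
import numpy as np


def manual_blob(image_bgr, mean=(104.0, 177.0, 123.0), scale=1.0):
    """Approximate the blob computation for an image that is already 300 x 300."""
    # Subtract the per-channel mean (B, G, R order), then apply the scale factor
    blob = (image_bgr.astype(np.float32) - np.array(mean, dtype=np.float32)) * scale
    # Reorder height x width x channels to channels x height x width
    # and add a leading batch axis
    return blob.transpose(2, 0, 1)[np.newaxis, ...]


# Synthetic gray image standing in for a real 300 x 300 BGR image
image = np.full((300, 300, 3), 128, dtype=np.uint8)
blob = manual_blob(image)
print(blob.shape)        # (1, 3, 300, 300)
print(blob[0, :, 0, 0])  # values 24, -49 and 5 for the B, G and R channels
```

The resulting four-dimensional layout (batch, channels, height, width) is the input format that the network expects.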

The next step is to set the blob as the network input and perform a forward pass through the whole network to compute the output:

# Set the blob as input and obtain the detections:
net.setInput(blob)
detections = net.forward()

The final step is to iterate over all the detections and draw the results, only considering detections if the corresponding confidence is greater than a fixed minimum threshold:

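# Get the image dimensions and initialize the face counter:
(h, w) = image.shape[:2]
detected_faces = 0

# Iterate over all detections:
for i in range(0, detections.shape[2]):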
    # Get the confidence (probability) of the current detection:
    confidence = detections[0, 0, i, 2]

    # Only consider detections if confidence is greater than a fixed minimum confidence:
    if confidence > 0.7:
        # Increment the number of detected faces:
        detected_faces += 1
        # Get the coordinates of the current detection:
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")

        # Draw the detection and the confidence:
        text = "{:.3f}%".format(confidence * 100)
        y = startY - 10 if startY - 10 > 10 else startY + 10
        cv2.rectangle(image, (startX, startY), (endX, endY), (255, 0, 0), 3)
        cv2.putText(image, text, (startX, y), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 0, 255), 2)
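The parsing logic in this loop can be tried without loading the model. The following sketch runs it on a synthetic detections array, assuming the layout used above: shape (1, 1, N, 7), with the confidence at index 2 and normalized corner coordinates at indices 3 to 6. Here, parse_detections is a hypothetical helper, and the clipping step is an addition that is not part of the original script:

```python
import numpy as np


def parse_detections(detections, width, height, min_confidence=0.7):
    """Return (confidence, (x1, y1, x2, y2)) tuples for detections above the threshold."""
    results = []
    for i in range(detections.shape[2]):
        confidence = float(detections[0, 0, i, 2])
        if confidence <= min_confidence:
            continue
        # Coordinates are normalized, so scale them to pixel units
        box = detections[0, 0, i, 3:7] * np.array([width, height, width, height])
        x1, y1, x2, y2 = (int(v) for v in box.astype(int))
        # Clip to the image, since a predicted box may extend past the border
        x1, x2 = max(0, x1), min(width - 1, x2)
        y1, y2 = max(0, y1), min(height - 1, y2)
        results.append((confidence, (x1, y1, x2, y2)))
    return results


# Synthetic network output with three candidate detections
fake = np.array([[[[0, 1, 0.95, 0.125, 0.25, 0.5, 0.75],
                   [0, 1, 0.30, 0.0, 0.0, 0.25, 0.25],
                   [0, 1, 0.80, 0.75, 0.5, 1.0625, 0.875]]]], dtype=np.float32)
faces = parse_detections(fake, width=400, height=300)
for confidence, box in faces:
    print("{:.3f}% {}".format(confidence * 100, box))
```

The second candidate is discarded because its confidence is below the threshold, and the third one is clipped to the right-hand border of the 400 x 300 image.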

The output of the face_detection_opencv_dnn.py script can be seen in the next screenshot:

As can be seen, the three faces are detected with high confidence.