Planning the Interactive Recognizer app

Let's begin this project with the middle layer, the Interactive Recognizer app, in order to see how all layers connect. Like Luxocator (the previous chapter's project), Interactive Recognizer is a GUI app built with wxPython. Refer to the following screenshot, which features one of my colleagues, Chief Science Officer Sanibel "San" Delphinium Andromeda, Oracle of the Numm:

[Screenshot: the Interactive Recognizer app]

The app uses a face detection model, which is loaded from disk, and it maintains a face recognition model, which it saves to and loads from disk. The user may specify the identity of any detected face, and this input is added to the face recognition model. A detection result is shown by outlining the face in the video feed, while a recognition result is shown by displaying the face's name in the text below the video. To elaborate, we can say that the app has the following flow of execution:

  1. The app loads a face detection model from a file. The role of the detection model is to distinguish faces from the background.
  2. It loads a face recognition model from a file, if any such model was saved during a previous run of Interactive Recognizer. Otherwise (if there is no such model to load), it creates a new one. The role of the recognition model is to distinguish faces of different individuals from each other.
  3. It captures and displays a live video from a camera.
  4. For each frame of video, it detects the largest face, if any. If a face is detected:
    1. It draws a rectangle around the face.
    2. It permits the user to enter the face's identity as a short string (up to four characters), such as Joe or Puss. When the user hits the Add to Model button, the recognition model is trained to recognize the face as the identity the user specified (Joe, Puss, or another identity).
    3. If the recognition model is trained for at least one face, the GUI displays the recognizer's prediction for the current face, that is, it displays the most probable identity of the current face according to the recognizer. Also, it displays a measure of distance (non-confidence) for this prediction.
  5. If the recognition model is trained for at least one face, the GUI permits the user to hit the Clear Model button to delete the model (including any version saved to the file) and create a new one.
  6. When the user exits the app, if the recognition model is trained for at least one face, the app saves the model to a file so that it can be loaded in subsequent runs of Interactive Recognizer and Angora Blue.
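The flow above can be sketched in plain Python with stand-in detection and recognition components. The file name, the dictionary layout, and all function names here are hypothetical illustrations of the app's state handling, not its real implementation:

```python
import os
import pickle

MODEL_PATH = "recognizer_model.pkl"  # hypothetical save location

def load_or_create_model(path=MODEL_PATH):
    """Step 2: load a saved recognition model if one exists;
    otherwise, create a new, untrained one."""
    if os.path.isfile(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {"samples": [], "labels": []}  # stand-in for an untrained model

def add_to_model(model, face_sample, identity):
    """Step 4.2: associate a detected face with a user-supplied
    identity (a short string such as 'Joe' or 'Puss')."""
    model["samples"].append(face_sample)
    model["labels"].append(identity)

def is_trained(model):
    """True if the model has been trained for at least one face."""
    return len(model["labels"]) > 0

def clear_model(path=MODEL_PATH):
    """Step 5: delete the model, including any version saved to a
    file, and create a new one."""
    if os.path.isfile(path):
        os.remove(path)
    return {"samples": [], "labels": []}

def save_model(model, path=MODEL_PATH):
    """Step 6: on exit, persist the model only if it is trained."""
    if is_trained(model):
        with open(path, "wb") as f:
            pickle.dump(model, f)
```

Note how persistence is conditional in both directions: a model is saved only if it is trained, and loaded only if a saved copy exists.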

We will use a type of detection model called a Haar cascade and a type of recognition model called Local Binary Pattern Histograms (LBPH). Alternatively, a Local Binary Pattern (LBP) cascade can be used for detection; as detection models, LBP cascades are faster but generally less reliable than Haar cascades. OpenCV comes with some Haar cascade and LBP cascade files, including several face detection models, as well as command-line tools for training new cascades. The APIs offer high-level classes for loading and using Haar or LBP cascades and for loading, saving, training, and using LBPH recognition models. Let's look at the basic concepts of these models.