What are integral images?

To extract these Haar features, we have to calculate the sum of the pixel values enclosed in many rectangular regions of the image. To make the feature extraction scale-invariant, we need to compute these sums at multiple scales (that is, for various rectangle sizes). Implemented naively, this is very computationally intensive: we would iterate over every pixel of each rectangle and read the same pixels again whenever rectangles overlap. A system that has to run in real time cannot afford this redundancy. To avoid it, we can use integral images. An integral image can be built in linear time (by iterating only twice over the image), and it then provides the sum of the pixels inside any rectangle of any size by reading only four values. To understand it better, let's look at the following diagram:

If we want to calculate the area of any rectangle in our diagram (here, "area" means the sum of the pixel values inside the rectangle), we don't have to iterate through all the pixels in that region. Consider a rectangle that has the top-left point of the image as one corner and any point, P, as the opposite corner. Let AP denote the area of this rectangle. For example, in the previous image, AB denotes the area of the 5 x 2 rectangle formed by taking the top-left point and B as opposite corners. Let's look at the following diagram for clarity:

Consider the top-left panel in the previous image. The blue pixels indicate the area between the top-left pixel and point A. This is denoted by AA. The remaining panels are labeled in the same way: AB, AC, and AD. Now, if we want to calculate the area of the ABCD rectangle, as shown in the preceding diagram, we would use the following formula:

Area of the rectangle ABCD = AC - (AB + AD - AA)
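
The formula can be checked on a small example. The following is a minimal Python sketch, not a reference implementation: the function names integral_image and rect_sum are hypothetical, the table is built in a single pass with a running row sum (one common way to do it), and it assumes an extra row and column of zeros so that rectangles touching the image border need no special case. With that convention, the corners A, B, and D sit just outside the rectangle and C is its bottom-right corner.

```python
def integral_image(img):
    """Return a table ii where ii[y][x] is the sum of img[0..y-1][0..x-1].

    The extra zero row and column are an assumption of this sketch.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, bottom, right):
    """Sum of the pixels in rows top..bottom-1 and columns left..right-1."""
    a_c = ii[bottom][right]  # AC: everything up to the bottom-right corner
    a_b = ii[top][right]     # AB: the strip above the rectangle
    a_d = ii[bottom][left]   # AD: the strip to the left of the rectangle
    a_a = ii[top][left]      # AA: the overlap of those two strips
    return a_c - (a_b + a_d - a_a)


img = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15],
    [16, 17, 18, 19, 20],
]
ii = integral_image(img)

# Rows 1..2 and columns 1..3 of img, computed with four lookups.
fast = rect_sum(ii, 1, 1, 3, 4)

# The same sum computed naively, pixel by pixel.
slow = sum(img[y][x] for y in range(1, 3) for x in range(1, 4))
```

Both fast and slow evaluate to 63 here, but rect_sum reads four table entries regardless of how large the rectangle is.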

What's so special about this particular formula? Extracting Haar features involves computing these sums for a large number of rectangles at multiple scales. Done naively, most of that work is repeated, because we iterate over the same pixels over and over again, and it is too slow for a real-time system. With this formula, we never have to iterate over the pixels of a rectangle at all. AC covers the rectangle plus everything above it and to its left; subtracting AB and AD removes those regions, and AA is added back because it was subtracted twice. All the values on the right-hand side of the preceding equation are readily available in our integral image, so we just pick up the right values, substitute them in the equation, and extract the features.
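
To illustrate how this pays off across scales, here is a small Python sketch that evaluates a simple two-rectangle feature (left half minus right half) at three sizes on a synthetic image. The feature shape, the function names, and the zero-padded table layout are assumptions made for this illustration; they are not taken from a particular library.

```python
def integral_image(img):
    """Build a zero-padded integral image: ii[y][x] sums img[0..y-1][0..x-1]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of a height x width rectangle using four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Hypothetical edge feature: left half minus right half (width must be even)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Synthetic 8 x 8 image: dark left half (0), bright right half (10).
img = [[0] * 4 + [10] * 4 for _ in range(8)]
ii = integral_image(img)  # built once, reused for every rectangle

# Evaluate the feature at three scales, centered on the vertical edge.
responses = {
    size: two_rect_feature(ii, 0, 4 - size // 2, size, size)
    for size in (2, 4, 8)
}
```

Each evaluation costs eight table reads whether the window is 2 x 2 or 8 x 8, which is why the integral image is built once up front and then shared by all features and scales.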