Now that we have a pretty good idea of what a contour is and of how to work with contours as objects in OpenCV, we would like to take a moment to understand how to use them for some practical purposes. The most common task associated with contours is matching them in some way with one another. We may have two computed contours that we'd like to compare or a computed contour and some abstract template with which we'd like to compare our contour. We will discuss both of these cases.
One of the simplest ways to compare two contours is to compute contour moments. This is a good time for a short digression into precisely what a moment is. Loosely speaking, a moment is a gross characteristic of the contour computed by integrating (or summing, if you like) over all of the pixels of the contour. In general, we define the (p, q) moment of a contour as

$$m_{p,q} = \sum_{i=1}^{n} I(x_i, y_i)\, x_i^p\, y_i^q$$
Here p is the x-order and q is the y-order, where "order" means the power to which the corresponding component is raised in the sum just displayed. The summation is over all of the pixels of the contour boundary (denoted by n in the equation). It then follows immediately that if p and q are both equal to 0, then the m00 moment is actually just the length in pixels of the contour.[125]
The function that computes these moments for us is
void cvContourMoments( CvSeq* contour, CvMoments* moments )
The first argument is the contour we are interested in, and the second is a pointer to a structure that we must allocate to hold the return data. The CvMoments structure is defined as follows:
typedef struct CvMoments {
    // spatial moments
    double m00, m10, m01, m20, m11, m02, m30, m21, m12, m03;
    // central moments
    double mu20, mu11, mu02, mu30, mu21, mu12, mu03;
    // m00 != 0 ? 1/sqrt(m00) : 0
    double inv_sqrt_m00;
} CvMoments;
The cvContourMoments() function uses only the m00, m01, …, m03 elements; the elements with names mu20, … are used by other routines.
When working with the CvMoments structure, there is a friendly helper function that will return any particular moment out of the structure:
double cvGetSpatialMoment( CvMoments* moments, int x_order, int y_order );
A single call to cvContourMoments() will instigate computation of all the moments through third order (i.e., m30 and m03 will be computed, as will m21 and m12, but m22 will not be).
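Here is a minimal sketch of how these pieces fit together: find the outer contour of a thresholded image, compute its moments, and read a few of them back with cvGetSpatialMoment(). The filename and the threshold value are illustrative assumptions, not values from the text.

#include <cv.h>
#include <highgui.h>
#include <stdio.h>

int main( void ) {
    // Load a grayscale image and threshold it to binary
    // (the filename here is just a placeholder).
    IplImage* img = cvLoadImage( "shape.png", CV_LOAD_IMAGE_GRAYSCALE );
    cvThreshold( img, img, 128, 255, CV_THRESH_BINARY );

    // Find the outermost contour.
    CvMemStorage* storage = cvCreateMemStorage( 0 );
    CvSeq* contour = NULL;
    cvFindContours( img, storage, &contour, sizeof(CvContour),
                    CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE, cvPoint(0,0) );

    if( contour != NULL ) {
        // Compute all moments of the contour through third order.
        CvMoments moments;
        cvContourMoments( contour, &moments );

        // m00 is the contour's length in pixels (see the text);
        // m10/m00 and m01/m00 give the centroid of the boundary.
        double m00 = cvGetSpatialMoment( &moments, 0, 0 );
        double m10 = cvGetSpatialMoment( &moments, 1, 0 );
        double m01 = cvGetSpatialMoment( &moments, 0, 1 );
        printf( "m00 = %g, centroid = (%g, %g)\n", m00, m10/m00, m01/m00 );
    }

    cvReleaseMemStorage( &storage );
    cvReleaseImage( &img );
    return 0;
}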
The moment computation just described gives some rudimentary characteristics of a contour that can be used to compare two contours. However, the moments resulting from that computation are not the best parameters for such comparisons in most practical cases. In particular, one would often like to use normalized moments (so that objects of the same shape but dissimilar sizes give similar values). Similarly, the simple moments of the previous section depend on the coordinate system chosen, which means that objects are not matched correctly if they are rotated.
OpenCV provides routines to compute normalized moments as well as Hu invariant moments [Hu62]. The CvMoments structure can be computed either with cvMoments() or with cvContourMoments(). Moreover, cvContourMoments() is now just an alias for cvMoments().
A useful trick is to use cvDrawContours() to "paint" an image of the contour and then call one of the moment functions on the resulting drawing. This allows you to control whether or not the contour is filled.
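A hedged sketch of this trick, reusing img and contour from the earlier example (cvMoments() itself is described just below):

// Paint the contour, filled, into a blank single-channel canvas.
// CV_FILLED fills the interior; a thickness of 1 would draw only
// the outline.
IplImage* canvas = cvCreateImage( cvGetSize(img), IPL_DEPTH_8U, 1 );
cvZero( canvas );
cvDrawContours( canvas, contour, cvScalarAll(255), cvScalarAll(255),
                0, CV_FILLED, 8, cvPoint(0,0) );

// Any of the image-based moment functions can now be applied to
// the drawing (here with isBinary = 1).
CvMoments filled_moments;
cvMoments( canvas, &filled_moments, 1 );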
Here are the four functions at your disposal:
void cvMoments(
    const CvArr* image,
    CvMoments*   moments,
    int          isBinary = 0
);
double cvGetCentralMoment( CvMoments* moments, int x_order, int y_order );
double cvGetNormalizedCentralMoment( CvMoments* moments, int x_order, int y_order );
void cvGetHuMoments( CvMoments* moments, CvHuMoments* HuMoments );
The first function is essentially analogous to cvContourMoments() except that it takes an image (instead of a contour) and has one extra argument. That extra argument, isBinary, if set to CV_TRUE, tells cvMoments() to treat all pixels as either 1 or 0, where 1 is assigned to any pixel with a nonzero value. When this function is called, all of the moments, including the central moments (see next paragraph), are computed at once.
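Once cvMoments() has filled the structure, each family of moments can be read out with the corresponding getter. A brief sketch, continuing from the canvas painted above (the particular orders chosen are arbitrary):

CvMoments moments;
cvMoments( canvas, &moments, 1 );   // isBinary = 1

// Spatial, central, and normalized central moments of order (2, 0).
double m20   = cvGetSpatialMoment( &moments, 2, 0 );
double mu20  = cvGetCentralMoment( &moments, 2, 0 );
double eta20 = cvGetNormalizedCentralMoment( &moments, 2, 0 );

// All seven Hu invariant moments at once.
CvHuMoments hu;
cvGetHuMoments( &moments, &hu );
printf( "m20 = %g, mu20 = %g, eta20 = %g, h1 = %g\n",
        m20, mu20, eta20, hu.hu1 );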
A central moment is basically the same as the moments just described except that the values of x and y used in the formulas are displaced by the mean values:

$$\mu_{p,q} = \sum_{i=1}^{n} I(x_i, y_i)\,(x_i - \bar{x})^p\,(y_i - \bar{y})^q$$

where $\bar{x} = m_{10}/m_{00}$ and $\bar{y} = m_{01}/m_{00}$.
The normalized moments are the same as the central moments except that they are all divided by an appropriate power of m00:[126]

$$\eta_{p,q} = \frac{\mu_{p,q}}{m_{00}^{(p+q)/2+1}}$$
Finally, the Hu invariant moments are linear combinations of the central moments. The idea here is that, by combining the different normalized central moments, it is possible to create invariant functions representing different aspects of the image in a way that is invariant to scale, rotation, and (for all but the one called h1) reflection.
The cvGetHuMoments() function computes the Hu moments from the central moments. For the sake of completeness, we show here the actual definitions of the Hu moments:

$$h_1 = \eta_{20} + \eta_{02}$$
$$h_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$$
$$h_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2$$
$$h_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2$$
$$h_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right]$$
$$h_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})$$
$$h_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right]$$
Looking at Figure 8-9 and Table 8-1, we can gain a sense of how the Hu moments behave. Observe first that the moments tend to be smaller as we move to higher orders. This should be no surprise in that, by their definition, higher Hu moments have more powers of various normalized factors. Since each of those factors is less than 1, the products of more and more of them will tend to be smaller numbers.
Figure 8-9. Images of five simple characters; looking at their Hu moments yields some intuition concerning their behavior
Table 8-1. Values of the Hu moments for the five simple characters of Figure 8-9
|   | h1       | h2       | h3       | h4       | h5         | h6        | h7         |
|---|----------|----------|----------|----------|------------|-----------|------------|
| A | 2.837e-1 | 1.961e-3 | 1.484e-2 | 2.265e-4 | -4.152e-7  | 1.003e-5  | -7.941e-9  |
| I | 4.578e-1 | 1.820e-1 | 0.000    | 0.000    | 0.000      | 0.000     | 0.000      |
| O | 3.791e-1 | 2.623e-4 | 4.501e-7 | 5.858e-7 | 1.529e-13  | 7.775e-9  | -2.591e-13 |
| M | 2.465e-1 | 4.775e-4 | 7.263e-5 | 2.617e-6 | -3.607e-11 | -5.718e-8 | -7.218e-24 |
| F | 3.186e-1 | 2.914e-2 | 9.397e-3 | 8.221e-4 | 3.872e-8   | 2.019e-5  | 2.285e-6   |
Other factors of particular interest are that the "I", which is symmetric under 180-degree rotations and reflection, has a value of exactly 0 for h3 through h7; and that the "O", which has similar symmetries, has all nonzero moments. We leave it to the reader to look at the figures, compare the various moments, and so build a basic intuition for what those moments represent.
double cvMatchShapes(
    const void* object1,
    const void* object2,
    int         method,
    double      parameter = 0
);
Naturally, with Hu moments we'd like to compare two objects and determine whether they are similar. Of course, there are many possible definitions of "similar". To make this process somewhat easier, the OpenCV function cvMatchShapes() allows us to simply provide two objects and have their moments computed and compared according to a criterion that we provide. These objects can be either grayscale images or contours. If you provide images, cvMatchShapes() will compute the moments for you before proceeding with the comparison. The method used in cvMatchShapes() is one of the three listed in Table 8-2.
In the table, $m_i^A$ and $m_i^B$ are defined as:

$$m_i^A = \operatorname{sign}(h_i^A)\cdot\log\bigl|h_i^A\bigr|, \qquad m_i^B = \operatorname{sign}(h_i^B)\cdot\log\bigl|h_i^B\bigr|$$

where $h_i^A$ and $h_i^B$ are the Hu moments of A and B, respectively.
Each of the three defined constants in Table 8-2 has a different meaning in terms of how the comparison metric is computed. This metric determines the value ultimately returned by cvMatchShapes(). The final parameter argument is not currently used, so we can safely leave it at the default value of 0.
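As a usage sketch (contourA and contourB are assumed to come from earlier calls to cvFindContours(); CV_CONTOURS_MATCH_I1 is one of the constants from Table 8-2):

// Compare two contours by their Hu moments; smaller return values
// indicate more similar shapes, and identical shapes give 0.
double similarity = cvMatchShapes( contourA, contourB,
                                   CV_CONTOURS_MATCH_I1, 0 );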
We'd often like to match two contours and come up with a similarity measure that takes into account the entire structure of the contours being matched. Methods using summary parameters (such as moments) are fairly quick, but there is only so much information they can capture.
For a more accurate measure of similarity, it will be useful first to consider a structure known as a contour tree. Contour trees should not be confused with the hierarchical representations of contours that are returned by such functions as cvFindContours(). Instead, they are hierarchical representations of the shape of one particular contour.
Understanding a contour tree will be easier if we first understand how it is constructed. Constructing a contour tree from a contour works from bottom (leaf nodes) to top (the root node). The process begins by searching the perimeter of the shape for triangular protrusions or indentations (every point on the contour that is not exactly collinear with its neighbors). Each such triangle is replaced with the line connecting its two nonadjacent points on the curve; thus, in effect, the triangle is either cut off (e.g., triangle D in Figure 8-10) or filled in (triangle C). Each such alteration reduces the contour's number of vertices by 1 and creates a new node in the tree. If such a triangle has original edges on two of its sides, then it is a leaf in the resulting tree; if one of its sides is part of an existing triangle, then it is a parent of that triangle. Iteration of this process ultimately reduces the shape to a quadrangle, which is then cut in half; both resulting triangles are children of the root node.
Figure 8-10. Constructing a contour tree: in the first round, the contour around the car produces leaf nodes A, B, C, and D; in the second round, X and Y are produced (X is the parent of A and B, and Y is the parent of C and D)
The resulting binary tree (Figure 8-11) ultimately encodes the shape information about the original contour. Each node is annotated with information about the triangle to which it is associated (information such as the size of the triangle and whether it was created by cutting off or filling in).
Once these trees are constructed, they can be used to effectively compare two contours.[127] This process begins by attempting to define correspondences between nodes in the two trees and then comparing the characteristics of the corresponding nodes. The end result is a similarity measure between the two trees.
In practice, we need to understand very little about this process. OpenCV provides us with routines to generate contour trees automatically from normal CvContour objects and to convert them back; it also provides the method for comparing the two trees. Unfortunately, the constructed trees are not quite robust (i.e., minor changes in the contour may change the resultant tree significantly). Also, the initial triangle (root of the tree) is chosen somewhat arbitrarily. Thus, obtaining a better representation requires that we first apply cvApproxPoly() and then align the contour (perform a cyclic shift) such that the initial triangle is pretty much rotation-independent.
CvContourTree* cvCreateContourTree(
    const CvSeq*  contour,
    CvMemStorage* storage,
    double        threshold
);
CvSeq* cvContourFromContourTree(
    const CvContourTree* tree,
    CvMemStorage*        storage,
    CvTermCriteria       criteria
);
double cvMatchContourTrees(
    const CvContourTree* tree1,
    const CvContourTree* tree2,
    int                  method,
    double               threshold
);
Figure 8-11. A binary tree representation that might correspond to a contour like that of Figure 8-10
This code references CvTermCriteria, the details of which are given in Chapter 9. For now, you can simply construct a structure using cvTermCriteria() with the following (or similar) defaults:
CvTermCriteria termcrit = cvTermCriteria( CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 5, 1 );
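Putting these routines together, here is a hedged sketch (contourA and contourB are assumed to come from cvFindContours(), ideally after cvApproxPoly() as advised above; the matching threshold of 10.0 is an illustrative choice):

CvMemStorage* storage = cvCreateMemStorage( 0 );
CvTermCriteria termcrit =
    cvTermCriteria( CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 5, 1 );

// Build a tree for each contour; a threshold <= 0 keeps the
// full binary tree.
CvContourTree* treeA = cvCreateContourTree( contourA, storage, 0 );
CvContourTree* treeB = cvCreateContourTree( contourB, storage, 0 );

// Compare the two trees (CV_CONTOUR_TREES_MATCH_I1 is the method
// defined for this purpose).
double diff = cvMatchContourTrees( treeA, treeB,
                                   CV_CONTOUR_TREES_MATCH_I1, 10.0 );

// Round-trip: recover an approximation of the original contour.
CvSeq* approx = cvContourFromContourTree( treeA, storage, termcrit );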
Another useful way of comprehending the shape of an object or contour is to compute a convex hull for the object and then compute its convexity defects [Homma85]. The shapes of many complex objects are well characterized by such defects.
Figure 8-12 illustrates the concept of a convexity defect using an image of a human hand. The convex hull is pictured as a dark line around the hand, and the regions labeled A through H are each "defects" relative to that hull. As you can see, these convexity defects offer a means of characterizing not only the hand itself but also the state of the hand.
#define CV_CLOCKWISE         1
#define CV_COUNTER_CLOCKWISE 2
CvSeq* cvConvexHull2(
    const CvArr* input,
    void*        hull_storage  = NULL,
    int          orientation   = CV_CLOCKWISE,
    int          return_points = 0
);
int cvCheckContourConvexity( const CvArr* contour );
CvSeq* cvConvexityDefects(
    const CvArr*  contour,
    const CvArr*  convexhull,
    CvMemStorage* storage = NULL
);
Figure 8-12. Convexity defects: the dark contour line is a convex hull around the hand; the gridded regions (A-H) are convexity defects in the hand contour relative to the convex hull
There are three important OpenCV methods that relate to convex hulls and convexity defects. The first simply computes the hull of a contour that we have already identified, and the second allows us to check whether an identified contour is already convex. The third computes convexity defects in a contour for which the convex hull is known.
The cvConvexHull2() routine takes an array of points as its first argument. This array is typically a matrix with two columns and n rows (i.e., n-by-2), or it can be a contour. The points should be 32-bit integers (CV_32SC1) or floating-point numbers (CV_32FC1). The next argument is the now familiar pointer to a memory storage where space for the result can be allocated. The next argument can be either CV_CLOCKWISE or CV_COUNTER_CLOCKWISE, which will determine the orientation of the points when they are returned by the routine. The final argument, return_points, can be either zero (0) or one (1). If set to 1, then the points themselves will be stored in the return array. If it is set to 0, then only indices[128] will be stored in the return array, indices that refer to the entries in the original array passed to cvConvexHull2().
At this point the astute reader might ask: "If the hull_storage argument is a memory storage, then why is it prototyped as void*?" Good question. The reason is that, in many cases, it is more useful to have the points of the hull returned in the form of an array rather than a sequence. With this in mind, there is another possibility for the hull_storage argument, which is to pass in a CvMat* pointer to a matrix. In this case, the matrix should be one-dimensional and have the same number of entries as there are input points. When cvConvexHull2() is called, it will actually modify the header for the matrix so that the correct number of columns is indicated.[129]
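For example, a brief sketch of the matrix variant (input_points and total, the number of input points, are assumed to exist; a CV_32SC1 matrix receives indices):

// A one-dimensional matrix with one entry per input point.
CvMat* hull_indices = cvCreateMat( 1, total, CV_32SC1 );
cvConvexHull2( input_points, hull_indices, CV_CLOCKWISE, 0 );
// On return, the matrix header has been shrunk so that its column
// count equals the number of points on the hull.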
Sometimes we already have the contour but do not know if it is convex. In this case we can call cvCheckContourConvexity(). This test is simple and fast,[130] but it will not work correctly if the contour passed contains self-intersections.
The third routine, cvConvexityDefects(), actually computes the defects and returns a sequence of the defects. In order to do this, cvConvexityDefects() requires the contour itself, the convex hull, and a memory storage from which to get the memory needed to allocate the result sequence. The first two arguments are CvArr* and take the same form as the input argument to cvConvexHull2().
typedef struct CvConvexityDefect {
    // point of the contour where the defect begins
    CvPoint* start;
    // point of the contour where the defect ends
    CvPoint* end;
    // point within the defect farthest from the convex hull
    CvPoint* depth_point;
    // distance between the farthest point and the convex hull
    float depth;
} CvConvexityDefect;
The cvConvexityDefects() routine returns a sequence of CvConvexityDefect structures containing some simple parameters that can be used to characterize the defects. The start and end members are points on the hull at which the defect begins and ends. The depth_point indicates the point on the defect that is the farthest from the edge of the hull from which the defect is a deflection. The final parameter, depth, is the distance between the farthest point and the hull edge.
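Here is a hedged sketch tying the three routines together (contour is assumed to come from cvFindContours()):

CvMemStorage* storage = cvCreateMemStorage( 0 );

// return_points = 0 so the hull refers back into the contour, which
// is the form cvConvexityDefects() expects.
CvSeq* hull = cvConvexHull2( contour, storage, CV_CLOCKWISE, 0 );

if( !cvCheckContourConvexity( contour ) ) {
    CvSeq* defects = cvConvexityDefects( contour, hull, storage );
    int i;
    for( i = 0; i < defects->total; i++ ) {
        CvConvexityDefect* d =
            (CvConvexityDefect*) cvGetSeqElem( defects, i );
        printf( "defect %d: depth = %f\n", i, d->depth );
    }
}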
Earlier we briefly visited the Freeman chain codes (FCCs). Recall that a Freeman chain is a representation of a polygon in terms of a sequence of "moves", where each move is of a fixed length and in a particular direction. However, we did not linger on why one might actually want to use such a representation.
There are many uses for Freeman chains, but the most popular one is worth a longer look because the idea underlies the pairwise geometrical histogram (PGH).[131]
The PGH is actually a generalization or extension of what is known as a chain code histogram (CCH). The CCH is a histogram made by counting the number of each kind of step in the Freeman chain code representation of a contour. This histogram has a number of nice properties. Most notably, rotations of the object by 45 degree increments become cyclic transformations on the histogram (see Figure 8-13). This provides a method of shape recognition that is not affected by such rotations.
Figure 8-13. Freeman chain code representations of a contour (top) and their associated chain code histograms (bottom); when the original contour (panel a) is rotated 45 degrees clockwise (panel b), the resulting chain code histogram is the same as the original except shifted to the right by one unit
The PGH is constructed as follows (see Figure 8-14). Each of the edges of the polygon is successively chosen to be the "base edge". Then each of the other edges is considered relative to that base edge and three values are computed: dmin, dmax, and θ. The dmin value is the smallest distance between the two edges, dmax is the largest, and θ is the angle between them. The PGH is a two-dimensional histogram whose dimensions are the angle and the distance. In particular: for every edge pair, there is a bin corresponding to (dmin, θ) and a bin corresponding to (dmax, θ). For each such pair of edges, those two bins are incremented—as are all bins for intermediate values of d (i.e., values between dmin and dmax).
Figure 8-14. Pairwise geometric histogram: every two edge segments of the enclosing polygon have an angle and a minimum and maximum distance (panel a); these numbers are encoded into a two-dimensional histogram (panel b), which is rotation-invariant and can be matched against other objects
The utility of the PGH is similar to that of the FCC. One important difference is that the discriminating power of the PGH is higher, so it is more useful when attempting to solve complex problems involving a greater number of shapes to be recognized and/or a greater variability of background noise. The function used to compute the PGH is
void cvCalcPGH( const CvSeq* contour, CvHistogram* hist );
Here contour can contain integer point coordinates; of course, hist must be two-dimensional.
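A minimal sketch of its use follows; the bin counts and the angle and distance ranges are illustrative choices, not prescribed values, and contour is assumed to come from cvFindContours():

// A two-dimensional (angle-by-distance) histogram for the PGH.
int    hist_size[]   = { 8, 8 };
float  angle_range[] = { 0, 180 };    // angle between edges, degrees
float  dist_range[]  = { 0, 100 };    // edge-to-edge distance, pixels
float* ranges[]      = { angle_range, dist_range };

CvHistogram* hist =
    cvCreateHist( 2, hist_size, CV_HIST_ARRAY, ranges, 1 );
cvCalcPGH( contour, hist );
cvNormalizeHist( hist, 1.0 );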
[125] Mathematical purists might object that m00 should be not the contour's length but rather its area. But because we are looking here at a contour and not a filled polygon, the length and the area are actually the same in a discrete pixel space (at least for the relevant distance measure in our pixel space). There are also functions for computing moments of IplImage images; in that case, m00 would actually be the area of nonzero pixels.
[126] Here, "appropriate" means that the moment is scaled by some power of m00 such that the resulting normalized moment is independent of the overall scale of the object. In the same sense that an average is the sum of N numbers divided by N, the higher-order moments also require a corresponding normalization factor.
[127] Some early work in hierarchical matching of contours is described in [Mokhtarian86] and [Neveu86]; the extension to 3D appears in [Mokhtarian88].
[128] If the input is CvSeq* or CvContour*, then what will be stored are pointers to the points.
[129] You should know that the memory allocated for the data part of the matrix is not re-allocated in any way, so don't expect a rebate on your memory. In any case, since these are C-arrays, the correct memory will be de-allocated when the matrix itself is released.
[130] It actually runs in O(N) time, which is only marginally faster than the O(N log N) time required to construct a convex hull.
[131] OpenCV implements the method of Iivarinen, Peura, Särelä, and Visa [Iivarinen97].