Now that we have a pretty good idea of what a contour is and of how to work with contours as objects in OpenCV, we would like to take a moment to understand how to use them for some practical purposes. The most common task associated with contours is matching them in some way with one another. We may have two computed contours that we'd like to compare or a computed contour and some abstract template with which we'd like to compare our contour. We will discuss both of these cases.
One of the simplest ways to compare two contours is to compute contour moments. This is a good time for a short digression into precisely what a moment is. Loosely speaking, a moment is a gross characteristic of the contour computed by integrating (or summing, if you like) over all of the pixels of the contour. In general, we define the (p, q) moment of a contour as

$$m_{p,q} = \sum_{i=1}^{n} I(x_i, y_i)\, x_i^p\, y_i^q$$
Here p is the x-order and q is the y-order, where "order" means the power to which the corresponding component is raised in the sum just displayed. The summation is over all of the pixels of the contour boundary (denoted by n in the equation). It then follows immediately that if p and q are both equal to 0, then the m00 moment is actually just the length in pixels of the contour.[125]
The function that computes these moments for us is
void cvContourMoments( CvSeq* contour, CvMoments* moments )
The first argument is the contour we are interested in, and the second is a pointer to a structure that we must allocate to hold the return data. The CvMoments structure is defined as follows:
typedef struct CvMoments {
    // spatial moments
    double m00, m10, m01, m20, m11, m02, m30, m21, m12, m03;
    // central moments
    double mu20, mu11, mu02, mu30, mu21, mu12, mu03;
    // m00 != 0 ? 1/sqrt(m00) : 0
    double inv_sqrt_m00;
} CvMoments;
The cvContourMoments() function uses only the m00, m01, …, m03 elements; the elements with names mu20, … are used by other routines.
When working with the CvMoments structure, there is a friendly helper function that will return any particular moment out of the structure:
double cvGetSpatialMoment( CvMoments* moments, int x_order, int y_order );
A single call to cvContourMoments() will instigate computation of all the moments through third order (i.e., m30 and m03 will be computed, as will m21 and m12, but m22 will not be).
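Here is a minimal sketch of how these pieces fit together: find the outer contour of a thresholded image, compute its moments, and read a few of them back with cvGetSpatialMoment(). The filename and the threshold value are illustrative assumptions, not values from the text.

#include <cv.h>
#include <highgui.h>
#include <stdio.h>

int main( void ) {
    // Load a grayscale image and threshold it to binary
    // (the filename here is just a placeholder).
    IplImage* img = cvLoadImage( "shape.png", CV_LOAD_IMAGE_GRAYSCALE );
    cvThreshold( img, img, 128, 255, CV_THRESH_BINARY );

    // Find the outermost contour.
    CvMemStorage* storage = cvCreateMemStorage( 0 );
    CvSeq* contour = NULL;
    cvFindContours( img, storage, &contour, sizeof(CvContour),
                    CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE, cvPoint(0,0) );

    if( contour != NULL ) {
        // Compute all moments of the contour through third order.
        CvMoments moments;
        cvContourMoments( contour, &moments );

        // m00 is the contour's length in pixels (see the text);
        // m10/m00 and m01/m00 give the centroid of the boundary.
        double m00 = cvGetSpatialMoment( &moments, 0, 0 );
        double m10 = cvGetSpatialMoment( &moments, 1, 0 );
        double m01 = cvGetSpatialMoment( &moments, 0, 1 );
        printf( "m00 = %g, centroid = (%g, %g)\n", m00, m10/m00, m01/m00 );
    }

    cvReleaseMemStorage( &storage );
    cvReleaseImage( &img );
    return 0;
}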
The moment computation just described gives some rudimentary characteristics of a contour that can be used to compare two contours. However, the moments resulting from that computation are not the best parameters for such comparisons in most practical cases. In particular, one would often like to use normalized moments (so that objects of the same shape but dissimilar sizes give similar values). Similarly, the simple moments of the previous section depend on the coordinate system chosen, which means that objects are not matched correctly if they are rotated.
OpenCV provides routines to compute normalized moments as well as Hu invariant moments [Hu62]. The CvMoments structure can be computed either with cvMoments() or with cvContourMoments(). Moreover, cvContourMoments() is now just an alias for cvMoments().
A useful trick is to use cvDrawContours() to "paint" an image of the contour and then call one of the moment functions on the resulting drawing. This allows you to control whether or not the contour is filled.
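A hedged sketch of this trick, reusing img and contour from the earlier example (cvMoments() itself is described just below):

// Paint the contour, filled, into a blank single-channel canvas.
// CV_FILLED fills the interior; a thickness of 1 would draw only
// the outline.
IplImage* canvas = cvCreateImage( cvGetSize(img), IPL_DEPTH_8U, 1 );
cvZero( canvas );
cvDrawContours( canvas, contour, cvScalarAll(255), cvScalarAll(255),
                0, CV_FILLED, 8, cvPoint(0,0) );

// Any of the image-based moment functions can now be applied to
// the drawing (here with isBinary = 1).
CvMoments filled_moments;
cvMoments( canvas, &filled_moments, 1 );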
Here are the four functions at your disposal:
void cvMoments(
    const CvArr* image,
    CvMoments*   moments,
    int          isBinary = 0
);
double cvGetCentralMoment( CvMoments* moments, int x_order, int y_order );
double cvGetNormalizedCentralMoment( CvMoments* moments, int x_order, int y_order );
void cvGetHuMoments( CvMoments* moments, CvHuMoments* HuMoments );
The first function is essentially analogous to cvContourMoments() except that it takes an image (instead of a contour) and has one extra argument. That extra argument, isBinary, if set to CV_TRUE, tells cvMoments() to treat all pixels as either 1 or 0, where 1 is assigned to any pixel with a nonzero value. When this function is called, all of the moments, including the central moments (see next paragraph), are computed at once.
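Once cvMoments() has filled the structure, each family of moments can be read out with the corresponding getter. A brief sketch, continuing from the canvas painted above (the particular orders chosen are arbitrary):

CvMoments moments;
cvMoments( canvas, &moments, 1 );   // isBinary = 1

// Spatial, central, and normalized central moments of order (2, 0).
double m20   = cvGetSpatialMoment( &moments, 2, 0 );
double mu20  = cvGetCentralMoment( &moments, 2, 0 );
double eta20 = cvGetNormalizedCentralMoment( &moments, 2, 0 );

// All seven Hu invariant moments at once.
CvHuMoments hu;
cvGetHuMoments( &moments, &hu );
printf( "m20 = %g, mu20 = %g, eta20 = %g, h1 = %g\n",
        m20, mu20, eta20, hu.hu1 );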
A central moment is basically the same as the moments just described except that the values of x and y used in the formulas are displaced by the mean values:

$$\mu_{p,q} = \sum_{i=1}^{n} I(x_i, y_i)\,(x_i - \bar{x})^p\,(y_i - \bar{y})^q$$

where $\bar{x} = m_{10}/m_{00}$ and $\bar{y} = m_{01}/m_{00}$.
The normalized moments are the same as the central moments except that they are all divided by an appropriate power of m00:[126]

$$\eta_{p,q} = \frac{\mu_{p,q}}{m_{00}^{(p+q)/2+1}}$$
Finally, the Hu invariant moments are linear combinations of the central moments. The idea here is that, by combining the different normalized central moments, it is possible to create invariant functions representing different aspects of the image in a way that is invariant to scale, rotation, and (for all but the one called h1) reflection.
The cvGetHuMoments() function computes the Hu moments from the central moments. For the sake of completeness, we show here the actual definitions of the Hu moments:

$$h_1 = \eta_{20} + \eta_{02}$$
$$h_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$$
$$h_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2$$
$$h_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2$$
$$h_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right]$$
$$h_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})$$
$$h_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right]$$
Looking at Figure 8-9 and Table 8-1, we can gain a sense of how the Hu moments behave. Observe first that the moments tend to be smaller as we move to higher orders. This should be no surprise in that, by their definition, higher Hu moments have more powers of various normalized factors. Since each of those factors is less than 1, the products of more and more of them will tend to be smaller numbers.
Figure 8-9. Images of five simple characters; looking at their Hu moments yields some intuition concerning their behavior
Table 8-1. Values of the Hu moments for the five simple characters of Figure 8-9
|   | h1       | h2       | h3       | h4       | h5         | h6        | h7         |
|---|----------|----------|----------|----------|------------|-----------|------------|
| A | 2.837e-1 | 1.961e-3 | 1.484e-2 | 2.265e-4 | -4.152e-7  | 1.003e-5  | -7.941e-9  |
| I | 4.578e-1 | 1.820e-1 | 0.000    | 0.000    | 0.000      | 0.000     | 0.000      |
| O | 3.791e-1 | 2.623e-4 | 4.501e-7 | 5.858e-7 | 1.529e-13  | 7.775e-9  | -2.591e-13 |
| M | 2.465e-1 | 4.775e-4 | 7.263e-5 | 2.617e-6 | -3.607e-11 | -5.718e-8 | -7.218e-24 |
| F | 3.186e-1 | 2.914e-2 | 9.397e-3 | 8.221e-4 | 3.872e-8   | 2.019e-5  | 2.285e-6   |
Other factors of particular interest are that the "I", which is symmetric under 180-degree rotations and reflection, has a value of exactly 0 for h3 through h7; and that the "O", which has similar symmetries, has all nonzero moments. We leave it to the reader to look at the figures, compare the various moments, and so build a basic intuition for what those moments represent.
double cvMatchShapes(
    const void* object1,
    const void* object2,
    int         method,
    double      parameter = 0
);
Naturally, with Hu moments we'd like to compare two objects and determine whether they are similar. Of course, there are many possible definitions of "similar". To make this process somewhat easier, the OpenCV function cvMatchShapes() allows us to simply provide two objects and have their moments computed and compared according to a criterion that we provide. These objects can be either grayscale images or contours. If you provide images, cvMatchShapes() will compute the moments for you before proceeding with the comparison. The method used in cvMatchShapes() is one of the three listed in Table 8-2.
In the table, $m_i^A$ and $m_i^B$ are defined as:

$$m_i^A = \operatorname{sign}(h_i^A)\cdot\log\bigl|h_i^A\bigr|, \qquad m_i^B = \operatorname{sign}(h_i^B)\cdot\log\bigl|h_i^B\bigr|$$

where $h_i^A$ and $h_i^B$ are the Hu moments of A and B, respectively.
Each of the three defined constants in Table 8-2 has a different meaning in terms of how the comparison metric is computed. This metric determines the value ultimately returned by cvMatchShapes(). The final parameter argument is not currently used, so we can safely leave it at the default value of 0.
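As a usage sketch (contourA and contourB are assumed to come from earlier calls to cvFindContours(); CV_CONTOURS_MATCH_I1 is one of the constants from Table 8-2):

// Compare two contours by their Hu moments; smaller return values
// indicate more similar shapes, and identical shapes give 0.
double similarity = cvMatchShapes( contourA, contourB,
                                   CV_CONTOURS_MATCH_I1, 0 );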
We'd often like to match two contours and come up with a similarity measure that takes into account the entire structure of the contours being matched. Methods using summary parameters (such as moments) are fairly quick, but there is only so much information they can capture.
For a more accurate measure of similarity, it will be useful first to consider a structure known as a contour tree. Contour trees should not be confused with the hierarchical representations of contours that are returned by such functions as cvFindContours(). Instead, they are hierarchical representations of the shape of one particular contour.
Understanding a contour tree will be easier if we first understand how it is constructed. Constructing a contour tree from a contour works from bottom (leaf nodes) to top (the root node). The process begins by searching the perimeter of the shape for triangular protrusions or indentations (every point on the contour that is not exactly collinear with its neighbors). Each such triangle is replaced with the line connecting its two nonadjacent points on the curve; thus, in effect, the triangle is either cut off (e.g., triangle D in Figure 8-10) or filled in (triangle C). Each such alteration reduces the contour's number of vertices by 1 and creates a new node in the tree. If such a triangle has original edges on two of its sides, then it is a leaf in the resulting tree; if one of its sides is part of an existing triangle, then it is a parent of that triangle. Iteration of this process ultimately reduces the shape to a quadrangle, which is then cut in half; both resulting triangles are children of the root node.
Figure 8-10. Constructing a contour tree: in the first round, the contour around the car produces leaf nodes A, B, C, and D; in the second round, X and Y are produced (X is the parent of A and B, and Y is the parent of C and D)
The resulting binary tree (Figure 8-11) ultimately encodes the shape information about the original contour. Each node is annotated with information about the triangle to which it is associated (information such as the size of the triangle and whether it was created by cutting off or filling in).
Once these trees are constructed, they can be used to effectively compare two contours.[127] This process begins by attempting to define correspondences between nodes in the two trees and then comparing the characteristics of the corresponding nodes. The end result is a similarity measure between the two trees.
In practice, we need to understand very little about this process. OpenCV provides us with routines to generate contour trees automatically from normal CvContour objects and to convert them back; it also provides the method for comparing the two trees. Unfortunately, the constructed trees are not quite robust (i.e., minor changes in the contour may change the resultant tree significantly). Also, the initial triangle (root of the tree) is chosen somewhat arbitrarily. Thus, obtaining a better representation requires that we first apply cvApproxPoly() and then align the contour (perform a cyclic shift) such that the initial triangle is pretty much rotation-independent.
CvContourTree* cvCreateContourTree(
    const CvSeq*  contour,
    CvMemStorage* storage,
    double        threshold
);
CvSeq* cvContourFromContourTree(
    const CvContourTree* tree,
    CvMemStorage*        storage,
    CvTermCriteria       criteria
);
double cvMatchContourTrees(
    const CvContourTree* tree1,
    const CvContourTree* tree2,
    int                  method,
    double               threshold
);
Figure 8-11. A binary tree representation that might correspond to a contour like that of Figure 8-10
This code references CvTermCriteria, the details of which are given in Chapter 9. For now, you can simply construct a structure using cvTermCriteria() with the following (or similar) defaults:
CvTermCriteria termcrit = cvTermCriteria( CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 5, 1 );
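Putting these routines together, here is a hedged sketch (contourA and contourB are assumed to come from cvFindContours(), ideally after cvApproxPoly() as advised above; the matching threshold of 10.0 is an illustrative choice):

CvMemStorage* storage = cvCreateMemStorage( 0 );
CvTermCriteria termcrit =
    cvTermCriteria( CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 5, 1 );

// Build a tree for each contour; a threshold <= 0 keeps the
// full binary tree.
CvContourTree* treeA = cvCreateContourTree( contourA, storage, 0 );
CvContourTree* treeB = cvCreateContourTree( contourB, storage, 0 );

// Compare the two trees (CV_CONTOUR_TREES_MATCH_I1 is the method
// defined for this purpose).
double diff = cvMatchContourTrees( treeA, treeB,
                                   CV_CONTOUR_TREES_MATCH_I1, 10.0 );

// Round-trip: recover an approximation of the original contour.
CvSeq* approx = cvContourFromContourTree( treeA, storage, termcrit );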
Another useful way of comprehending the shape of an object or contour is to compute a convex hull for the object and then compute its convexity defects [Homma85]. The shapes of many complex objects are well characterized by such defects.
Figure 8-12 illustrates the concept of a convexity defect using an image of a human hand. The convex hull is pictured as a dark line around the hand, and the regions labeled A through H are each "defects" relative to that hull. As you can see, these convexity defects offer a means of characterizing not only the hand itself but also the state of the hand.
#define CV_CLOCKWISE         1
#define CV_COUNTER_CLOCKWISE 2
CvSeq* cvConvexHull2(
    const CvArr* input,
    void*        hull_storage  = NULL,
    int          orientation   = CV_CLOCKWISE,
    int          return_points = 0
);
int cvCheckContourConvexity( const CvArr* contour );
CvSeq* cvConvexityDefects(
    const CvArr*  contour,
    const CvArr*  convexhull,
    CvMemStorage* storage = NULL
);
Figure 8-12. Convexity defects: the dark contour line is a convex hull around the hand; the gridded regions (A-H) are convexity defects in the hand contour relative to the convex hull
There are three important OpenCV methods that relate to convex hulls and convexity defects. The first simply computes the hull of a contour that we have already identified, and the second allows us to check whether an identified contour is already convex. The third computes convexity defects in a contour for which the convex hull is known.
The cvConvexHull2() routine takes an array of points as its first argument. This array is typically a matrix with two columns and n rows (i.e., n-by-2), or it can be a contour. The points should be 32-bit integers (CV_32SC1) or floating-point numbers (CV_32FC1). The next argument is the now familiar pointer to a memory storage where space for the result can be allocated. The next argument can be either CV_CLOCKWISE or CV_COUNTER_CLOCKWISE, which will determine the orientation of the points when they are returned by the routine. The final argument, return_points, can be either zero (0) or one (1). If set to 1, then the points themselves will be stored in the return array. If it is set to 0, then only indices[128] will be stored in the return array, indices that refer to the entries in the original array passed to cvConvexHull2().
At this point the astute reader might ask: "If the hull_storage argument is a memory storage, then why is it prototyped as void*?" Good question. The reason is that, in many cases, it is more useful to have the points of the hull returned in the form of an array rather than a sequence. With this in mind, there is another possibility for the hull_storage argument, which is to pass in a CvMat* pointer to a matrix. In this case, the matrix should be one-dimensional and have the same number of entries as there are input points. When cvConvexHull2() is called, it will actually modify the header for the matrix so that the correct number of columns is indicated.[129]
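For example, a brief sketch of the matrix variant (input_points and total, the number of input points, are assumed to exist; a CV_32SC1 matrix receives indices):

// A one-dimensional matrix with one entry per input point.
CvMat* hull_indices = cvCreateMat( 1, total, CV_32SC1 );
cvConvexHull2( input_points, hull_indices, CV_CLOCKWISE, 0 );
// On return, the matrix header has been shrunk so that its column
// count equals the number of points on the hull.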
Sometimes we already have the contour but do not know if it is convex. In this case we can call cvCheckContourConvexity(). This test is simple and fast,[130] but it will not work correctly if the contour passed contains self-intersections.
The third routine, cvConvexityDefects(), actually computes the defects and returns a sequence of the defects. In order to do this, cvConvexityDefects() requires the contour itself, the convex hull, and a memory storage from which to get the memory needed to allocate the result sequence. The first two arguments are CvArr* and take the same form as the input argument to cvConvexHull2().
typedef struct CvConvexityDefect {
    // point of the contour where the defect begins
    CvPoint* start;
    // point of the contour where the defect ends
    CvPoint* end;
    // point within the defect farthest from the convex hull
    CvPoint* depth_point;
    // distance between the farthest point and the convex hull
    float depth;
} CvConvexityDefect;
The cvConvexityDefects() routine returns a sequence of CvConvexityDefect structures containing some simple parameters that can be used to characterize the defects. The start and end members are points on the hull at which the defect begins and ends. The depth_point indicates the point on the defect that is the farthest from the edge of the hull from which the defect is a deflection. The final parameter, depth, is the distance between the farthest point and the hull edge.
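Here is a hedged sketch tying the three routines together (contour is assumed to come from cvFindContours()):

CvMemStorage* storage = cvCreateMemStorage( 0 );

// return_points = 0 so the hull refers back into the contour, which
// is the form cvConvexityDefects() expects.
CvSeq* hull = cvConvexHull2( contour, storage, CV_CLOCKWISE, 0 );

if( !cvCheckContourConvexity( contour ) ) {
    CvSeq* defects = cvConvexityDefects( contour, hull, storage );
    int i;
    for( i = 0; i < defects->total; i++ ) {
        CvConvexityDefect* d =
            (CvConvexityDefect*) cvGetSeqElem( defects, i );
        printf( "defect %d: depth = %f\n", i, d->depth );
    }
}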
Earlier we briefly visited the Freeman chain codes (FCCs). Recall that a Freeman chain is a representation of a polygon in terms of a sequence of "moves", where each move is of a fixed length and in a particular direction. However, we did not linger on why one might actually want to use such a representation.
There are many uses for Freeman chains, but the most popular one is worth a longer look because the idea underlies the pairwise geometrical histogram (PGH).[131]
The PGH is actually a generalization or extension of what is known as a chain code histogram (CCH). The CCH is a histogram made by counting the number of each kind of step in the Freeman chain code representation of a contour. This histogram has a number of nice properties. Most notably, rotations of the object by 45 degree increments become cyclic transformations on the histogram (see Figure 8-13). This provides a method of shape recognition that is not affected by such rotations.
Figure 8-13. Freeman chain code representations of a contour (top) and their associated chain code histograms (bottom); when the original contour (panel a) is rotated 45 degrees clockwise (panel b), the resulting chain code histogram is the same as the original except shifted to the right by one unit
The PGH is constructed as follows (see Figure 8-14). Each of the edges of the polygon is successively chosen to be the "base edge". Then each of the other edges is considered relative to that base edge and three values are computed: dmin, dmax, and θ. The dmin value is the smallest distance between the two edges, dmax is the largest, and θ is the angle between them. The PGH is a two-dimensional histogram whose dimensions are the angle and the distance. In particular: for every edge pair, there is a bin corresponding to (dmin, θ) and a bin corresponding to (dmax, θ). For each such pair of edges, those two bins are incremented—as are all bins for intermediate values of d (i.e., values between dmin and dmax).
Figure 8-14. Pairwise geometric histogram: every two edge segments of the enclosing polygon have an angle and a minimum and maximum distance (panel a); these numbers are encoded into a two-dimensional histogram (panel b), which is rotation-invariant and can be matched against other objects
The utility of the PGH is similar to that of the FCC. One important difference is that the discriminating power of the PGH is higher, so it is more useful when attempting to solve complex problems involving a greater number of shapes to be recognized and/or a greater variability of background noise. The function used to compute the PGH is
void cvCalcPGH( const CvSeq* contour, CvHistogram* hist );
Here contour can contain integer point coordinates; of course, hist must be two-dimensional.
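A minimal sketch of its use follows; the bin counts and the angle and distance ranges are illustrative choices, not prescribed values, and contour is assumed to come from cvFindContours():

// A two-dimensional (angle-by-distance) histogram for the PGH.
int    hist_size[]   = { 8, 8 };
float  angle_range[] = { 0, 180 };    // angle between edges, degrees
float  dist_range[]  = { 0, 100 };    // edge-to-edge distance, pixels
float* ranges[]      = { angle_range, dist_range };

CvHistogram* hist =
    cvCreateHist( 2, hist_size, CV_HIST_ARRAY, ranges, 1 );
cvCalcPGH( contour, hist );
cvNormalizeHist( hist, 1.0 );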
[125] Mathematical purists might object that m00 should be not the contour's length but rather its area. But because we are looking here at a contour and not a filled polygon, the length and the area are actually the same in a discrete pixel space (at least for the relevant distance measure in our pixel space). There are also functions for computing moments of IplImage images; in that case, m00 would actually be the area of nonzero pixels.
[126] Here, "appropriate" means that the moment is scaled by some power of m00 such that the resulting normalized moment is independent of the overall scale of the object. In the same sense that an average is the sum of N numbers divided by N, the higher-order moments also require a corresponding normalization factor.
[127] Some early work in hierarchical matching of contours is described in [Mokhtarian86] and [Neveu86]; the extension to 3D appears in [Mokhtarian88].
[128] If the input is CvSeq* or CvContour*, then what will be stored are pointers to the points.
[129] You should know that the memory allocated for the data part of the matrix is not re-allocated in any way, so don't expect a rebate on your memory. In any case, since these are C-arrays, the correct memory will be de-allocated when the matrix itself is released.
[130] It actually runs in O(N) time, which is only marginally faster than the O(N log N) time required to construct a convex hull.
[131] OpenCV implements the method of Iivarinen, Peura, Särelä, and Visa [Iivarinen97].