In Chapter 5 we introduced the function cvPyrSegmentation()
. Pyramid segmentation uses a color merge
(over a scale that depends on the similarity of the colors to one another) in order to
segment images. This approach is based on minimizing the total energy
in the image; here energy is defined by a link strength, which is
further defined by color similarity. In this section we introduce
cvPyrMeanShiftFiltering()
, a similar algorithm that is
based on mean-shift clustering over color [Comaniciu99]. We'll see the details of the
mean-shift algorithm cvMeanShift()
in Chapter 10, when we discuss tracking and motion. For now, what we
need to know is that mean shift finds the peak of a color-spatial (or other feature)
distribution over time. Here, mean-shift segmentation finds the peaks of color distributions
over space. The common theme is that both the motion tracking and the color segmentation
algorithms rely on the ability of mean shift to find the modes (peaks) of a
distribution.
Given a set of multidimensional data points whose dimensions are (x,
y, blue, green, red), mean shift can find the highest density "clumps" of data
in this space by scanning a window over the space. Notice, however,
that the spatial variables (x, y) can have very different ranges from
the color magnitude ranges (blue, green, red). Therefore, mean shift needs to allow for
different window radii in different dimensions. In this case we should have one radius for
the spatial variables (spatialRadius
) and one radius for
the color magnitudes (colorRadius
). As mean-shift windows move, all the points traversed by the windows that converge
at a peak in the data become connected or "owned" by that peak. This ownership, radiating
out from the densest peaks, forms the segmentation of the image. The segmentation is
actually done over a scale pyramid (cvPyrUp(),
cvPyrDown())
, as described in Chapter 5, so that color
clusters at a high level in the pyramid (shrunken image) have their boundaries refined at
lower pyramid levels in the pyramid. The function call for cvPyrMeanShiftFiltering()
looks like this:
void cvPyrMeanShiftFiltering( const CvArr* src, CvArr* dst, double spatialRadius, double colorRadius, int max_level = 1, CvTermCriteria termcrit = cvTermCriteria( CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 5, 1 ) );
In cvPyrMeanShiftFiltering()
we have an input image
src
and an output image dst
. Both must be 8-bit, three-channel color images of the same width and
height. The spatialRadius
and colorRadius
define how the mean-shift algorithm averages color and space
together to form a segmentation. For a 640-by-480 color image, it works well to set spatialRadius
equal to 2 and colorRadius
equal to 40. The next parameter of this algorithm is max_level
, which describes how many levels of scale pyramid you
want used for segmentation. A max_level
of 2 or 3 works
well for a 640-by-480 color image.
The final parameter is CvTermCriteria
, which we saw
in Chapter 8. CvTermCriteria
is used for
all iterative algorithms in OpenCV. The mean-shift segmentation function comes with good
defaults if you just want to leave this parameter blank. Otherwise, cvTermCriteria
has the following constructor:
cvTermCriteria( int type; // CV_TERMCRIT_ITER, CV_TERMCRIT_EPS, int max_iter, double epsilon );
Typically we use the cvTermCriteria()
function to
generate the CvTermCriteria
structure that we need. The
first argument is either CV_TERMCRIT_ITER
or CV_TERMCRIT_EPS
, which tells the algorithm that we want to
terminate either after some fixed number of iterations or when the convergence metric
reaches some small value (respectively). The next two arguments set the values at which one,
the other, or both of these criteria should terminate the algorithm. The reason we have both
options is because we can set the type to CV_TERMCRIT_ITER |
CV_TERMCRIT_EPS
to stop when either limit is reached. The parameter max_iter
limits the number of iterations if CV_TERMCRIT_ITER
is set, whereas epsilon
sets the error limit if CV_TERMCRIT_EPS
is set. Of course the exact meaning of epsilon
depends on the algorithm.
Figure 9-11 shows an example of mean-shift segmentation using the following values:
cvPyrMeanShiftFiltering( src, dst, 20, 40, 2);