Distance Transform

The distance transform of an image is defined as a new image in which every output pixel is set to the distance from that pixel to the nearest zero pixel in the input image. The typical input to a distance transform is therefore some kind of edge image. In most applications the input is the output of an edge detector, such as the Canny edge detector, that has been inverted (so that the edges have value zero and the non-edges are nonzero).
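The definition can be written directly as a brute-force loop. The following is a minimal sketch in plain C, not the way OpenCV computes the transform; the function name dist_sq_brute and the row-major 8-bit input layout are assumptions made for illustration. To stay in integer arithmetic it stores the squared Euclidean distance, so the caller would take the square root to obtain the distance itself.

```c
#include <limits.h>

/* Brute-force reference (hypothetical helper, not an OpenCV function).
 * For every pixel, scan the whole image for the nearest zero pixel and
 * store the SQUARED Euclidean distance to it.
 * src: w*h bytes, row-major; a value of 0 marks an edge pixel.
 * dst: w*h ints; entries stay INT_MAX if the image has no zero pixel. */
void dist_sq_brute(const unsigned char *src, int *dst, int w, int h)
{
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            int best = INT_MAX;
            for (int v = 0; v < h; ++v) {
                for (int u = 0; u < w; ++u) {
                    if (src[v * w + u] != 0)
                        continue;               /* only zero pixels count */
                    int dx = x - u, dy = y - v;
                    int d = dx * dx + dy * dy;  /* squared distance */
                    if (d < best)
                        best = d;
                }
            }
            dst[y * w + x] = best;
        }
    }
}
```

The cost grows with the square of the number of pixels, which is why practical implementations rely on masks or on a dedicated exact algorithm instead.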

In practice, the distance transform is carried out by using a mask that is typically a 3-by-3 or 5-by-5 array. Each point in the array defines the "distance" to be associated with a point in that particular position relative to the center of the mask. Larger distances are built up (and thus approximated) as sequences of "moves" defined by the entries in the mask. This means that using a larger mask will yield more accurate distances.
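One common way to apply a 3-by-3 mask is a two-pass sweep over the image, sketched below in plain C. This is an illustrative sketch under stated assumptions, not OpenCV's implementation: the function name chamfer3x3 and its parameters are hypothetical, a is the cost of a horizontal or vertical move, and b is the cost of a diagonal move. Both weights are supplied by the caller.

```c
#include <float.h>

static float chamfer_min(float p, float q) { return p < q ? p : q; }

/* Two-pass 3x3 mask distance transform (hypothetical helper).
 * src: w*h bytes, row-major, 0 marks an edge pixel.
 * dst: w*h floats receiving the approximate distances.
 * a:   cost of a horizontal/vertical move; b: cost of a diagonal move. */
void chamfer3x3(const unsigned char *src, float *dst, int w, int h,
                float a, float b)
{
    const float BIG = FLT_MAX / 4;  /* sentinel; adding a or b cannot overflow */
    for (int i = 0; i < w * h; ++i)
        dst[i] = src[i] ? BIG : 0.0f;

    /* Forward pass: propagate from the left and from the row above. */
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            float d = dst[y * w + x];
            if (x > 0) d = chamfer_min(d, dst[y * w + x - 1] + a);
            if (y > 0) {
                d = chamfer_min(d, dst[(y - 1) * w + x] + a);
                if (x > 0)     d = chamfer_min(d, dst[(y - 1) * w + x - 1] + b);
                if (x < w - 1) d = chamfer_min(d, dst[(y - 1) * w + x + 1] + b);
            }
            dst[y * w + x] = d;
        }

    /* Backward pass: propagate from the right and from the row below. */
    for (int y = h - 1; y >= 0; --y)
        for (int x = w - 1; x >= 0; --x) {
            float d = dst[y * w + x];
            if (x < w - 1) d = chamfer_min(d, dst[y * w + x + 1] + a);
            if (y < h - 1) {
                d = chamfer_min(d, dst[(y + 1) * w + x] + a);
                if (x < w - 1) d = chamfer_min(d, dst[(y + 1) * w + x + 1] + b);
                if (x > 0)     d = chamfer_min(d, dst[(y + 1) * w + x - 1] + b);
            }
            dst[y * w + x] = d;
        }
}
```

With a = 1 and b = 2 every distance is a sum of unit axis moves (city-block distance); with a = 1 and b = 1 a diagonal costs the same as an axis move (chessboard distance). Weights between these extremes approximate the Euclidean distance, and the approximation error is what a larger mask reduces.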

Depending on the desired distance metric, the appropriate mask is automatically selected from a set known to OpenCV. It is also possible to tell OpenCV to compute "exact" distances according to some formula appropriate to the selected metric, but of course this is much slower.

The distance metric can be any of several different types, including the classic L2 (Euclidean) distance metric; see Table 6-2 for a listing. In addition to these, you may define a custom metric and associate it with your own custom mask.

Table 6-2. Possible values for distance_type argument to cvDistTransform()

Value of distance_type    Metric
CV_DIST_L2                [formula image not available]
CV_DIST_L1                [formula image not available]
CV_DIST_L12               [formula image not available]
CV_DIST_FAIR              [formula image not available]
CV_DIST_WELSCH            [formula image not available]
CV_DIST_USER              User-defined distance

When calling the OpenCV distance transform function, the output image should be a 32-bit floating-point image (i.e., IPL_DEPTH_32F).

void cvDistTransform(
   const CvArr* src,
   CvArr*       dst,
   int          distance_type = CV_DIST_L2,
   int          mask_size     = 3,
   const float* kernel        = NULL,
   CvArr*       labels        = NULL
);

There are several optional parameters when calling cvDistTransform(). The first is distance_type, which indicates the distance metric to be used. The available values for this argument are defined in Borgefors (1986) [Borgefors86].

After the distance type is the mask_size, which may be 3 (choose CV_DIST_MASK_3) or 5 (choose CV_DIST_MASK_5); alternatively, distance computations can be made without a kernel[87] (choose CV_DIST_MASK_PRECISE). The kernel argument to cvDistTransform() is the distance mask to be used in the case of a custom metric. These kernels are constructed according to the method of Gunilla Borgefors, two examples of which are shown in Figure 6-20. The last argument, labels, requests that each point be associated with the nearest connected component consisting of zero pixels. When labels is non-NULL, it must be a pointer to an array of integer values the same size as the input and output images. When the function returns, this array can be read to determine which object was closest to any given pixel. Figure 6-21 shows the outputs of distance transforms on a test pattern and a photographic image.
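To make the meaning of the labels output concrete, here is a small brute-force sketch in plain C. It is not how OpenCV fills the labels array, and it assumes the connected components of zero pixels have already been numbered; the function name nearest_labels_brute and the comp input array are hypothetical and exist only for this illustration.

```c
#include <limits.h>

/* Brute-force nearest-component labeling (hypothetical helper).
 * comp:   w*h ints, row-major; comp[i] > 0 is the component id of a zero
 *         (edge) pixel, comp[i] == 0 marks a non-edge pixel.
 * labels: w*h ints; receives the id of the component that owns the nearest
 *         zero pixel, or 0 if the image has no zero pixel.
 * Ties are resolved in favor of the first candidate in scan order. */
void nearest_labels_brute(const int *comp, int *labels, int w, int h)
{
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            int best = INT_MAX, id = 0;
            for (int v = 0; v < h; ++v) {
                for (int u = 0; u < w; ++u) {
                    if (comp[v * w + u] == 0)
                        continue;               /* not a zero pixel */
                    int dx = x - u, dy = y - v;
                    int d = dx * dx + dy * dy;  /* squared distance */
                    if (d < best) {
                        best = d;
                        id = comp[v * w + u];
                    }
                }
            }
            labels[y * w + x] = id;
        }
    }
}
```

For a single row whose two end pixels belong to components 1 and 2, the pixels nearer the left end receive label 1 and those nearer the right end receive label 2, which is the kind of partition the labels array encodes.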



[87] The exact method comes from Pedro F. Felzenszwalb and Daniel P. Huttenlocher [Felzenszwalb63].