Detecting circles and lines

From The Living Headlights (our project in Chapter 5, Equipping Your Car with a Rearview Camera and Hazard Detection), we are already familiar with one technique to detect circles. We treated the problem as a special case of blob detection and we used an OpenCV class, SimpleBlobDetector, which allows us to specify many detection criteria such as a blob's size, color, and circularity (or noncircularity, that is linearity).

A blob is a shape filled with a solid (or nearly solid) color. This definition implies that many circular or linear objects are not detectable as blobs. In the following photo we see a sunlit desk with a china teapot, china bowl, and pewter bowl:

The bowls and the lid of the teapot have approximately circular outlines in this top-down view. However, they are unlikely to pass the detection as blobs because the interior of each shape is multicolored, especially in uneven light.

Blob detection starts with a simple threshold filter (marking bright regions as white and dark regions as black); a more general approach to shape detection should start with an edge finding filter (marking edge regions as white and interior regions as black) and then a thresholding process. We define an edge as the discontinuity between neighboring regions of different brightness. Thus, an edge pixel has darker neighbors on one side and brighter neighbors on the opposite side. An edge-finding filter subtracts the neighbor values from one side and adds them from the opposite side in order to measure how strongly a pixel exhibits this edge-like contrast in a given direction. To achieve a measurement that is independent of edges' direction, we can apply multiple filters (each oriented for edges of a different direction) and treat each filter's output as a dimension of a vector whose magnitude represents the overall "edginess" of the pixel. A set of such measurements for all pixels is sometimes called the derivative of the image. Having computed the image's derivative, we select a threshold value based on the minimum contrast that we require in an edge. A high threshold accepts only high-contrast edges, while a lower threshold also accepts lower-contrast edges.

Having identified edge pixels, we can count how many of them are intersected by a given primitive shape. The greater the number of intersections, the more confident we can be that the given primitive shape correctly represents an edge in the image. Each intersection is called a vote, and a shape needs a specified number of votes to be accepted as a real edge's shape. Out of all the possible primitive shapes (of a given kind) in the image, we will consider an evenly spaced representative sample. We do so by specifying a step size for the shapes' geometric parameters. (For example, a line's parameters are a point and angle, while a circle's parameters are a center point and radius.) This sample of possible shapes is called a grid, the individual shapes in it are called cells, and the votes are said to be cast in cells. This process—tallying the matches between actual edge pixels and a sample of possible shapes—is the core of a technique called the Hough transform, which has various specializations such as Hough line detection and Hough circle detection.

Hough line detection has two implementations in OpenCV: Imgproc.HoughLines, which is based on the original Hough transform, and Imgproc.HoughLinesP, which is based on a probabilistic variant of the Hough transform. Imgproc.HoughLines does an exhaustive count of intersections for all possible lines of a given pair of step sizes in pixels and in radians. Imgproc.HoughLinesP is usually faster (particularly in images with a few long line segments) as it takes possible lines in a random order and discards some possible lines after finding a good line in a region. Imgproc.HoughLines expresses each line as a distance from the origin and an angle, whereas Imgproc.HoughLinesP expresses each line as two points—the endpoints of a detected segment of the line—which is a more useful representation since it gives us the option to treat the detection results as line segments rather than indefinitely long lines. For both the functions, the arguments include the image (which should be preprocessed with Canny or a similar algorithm), the step sizes in pixels and radians, and the minimum number of intersections required to accept a line. The arguments to Imgproc.HoughLinesP also include a minimum length between endpoints and a maximum gap, where a gap consists of non-edge pixels between edge pixels that intersect the line.

Hough circle detection has one implementation in OpenCV, Imgproc.HoughCircles, which is based on a variant of the Hough transform that makes use of gradient information at edges. This function's arguments include the image (preprocessed with Canny or a similar algorithm), a step size in pixels, a minimum distance between detected circles' centers, and minimum and maximum radii.

Note

For more details on the Canny algorithm, the Hough transform, and OpenCV's implementations of them, refer to Chapter 7 Extracting Lines, Contours, and Components in Robert Laganière's book, OpenCV 2 Computer Vision Application Programming Cookbook (Packt Publishing, 2011).

Despite using a more efficient algorithm than the original Hough transform, Imgproc.HoughCircles is a computationally expensive function, so we do not actually use it in Rollingball. Instead, we assume that the user will draw circles in a solid color and we will apply blob detection, which is cheaper. For line detection, we will use the Imgproc.HoughLinesP function, which is not as expensive as OpenCV's other Hough detectors.

Having chosen algorithms and their OpenCV implementations, let's set up the plugin that will let us easily access this functionality in Unity.