Lucas-Kanade method

The Lucas-Kanade method is used for sparse optical flow tracking. By sparse, we mean that the number of feature points is relatively low. You can refer to their original paper here: http://cseweb.ucsd.edu/classes/sp02/cse252/lucaskanade81.pdf. We start the process by extracting the feature points. For each feature point, we create a 3 x 3 patch with the feature point at the center. The assumption here is that all the points within each patch will have similar motion. We can adjust the size of this window depending on the problem at hand.
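In this walkthrough the points to track are added by clicking on the window, but you could also seed the tracker automatically with a corner detector. Here is a minimal sketch using OpenCV's goodFeaturesToTrack; the function name detectFeaturePoints and the corner count, quality, and spacing values are illustrative assumptions rather than part of the example that follows:

#include <opencv2/imgproc.hpp>
#include <vector>

// Detect up to 'maxCorners' strong corners in a grayscale frame.
// The resulting points can be fed to calcOpticalFlowPyrLK as the set to track.
std::vector<cv::Point2f> detectFeaturePoints(const cv::Mat& grayFrame)
{
    std::vector<cv::Point2f> corners;
    int maxCorners = 100;        // illustrative upper bound on the number of corners
    double qualityLevel = 0.01;  // minimum accepted corner quality
    double minDistance = 7;      // minimum spacing between detected corners
    cv::goodFeaturesToTrack(grayFrame, corners, maxCorners, qualityLevel, minDistance);
    return corners;
}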

For each feature point in the current frame, we take the surrounding 3 x 3 patch as our reference. For this patch, we look in its neighborhood in the previous frame to get the best match. This neighborhood is usually bigger than 3 x 3 because we want to find the patch that's closest to the one under consideration. The vector from the center pixel of the matched patch in the previous frame to the center pixel of the patch under consideration in the current frame then becomes the motion vector. We do that for all the feature points and extract all the motion vectors.
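To make the notion of a motion vector concrete, here is a hedged sketch that runs a single Lucas-Kanade step between two grayscale frames and draws each motion vector as an arrow from the old location to the new one; the function name drawMotionVectors and the drawing parameters are placeholders, not part of the tracking code shown later:

#include <opencv2/imgproc.hpp>
#include <opencv2/video/tracking.hpp>
#include <vector>

// prevGray and curGray are consecutive grayscale frames; prevPts holds the
// feature points found in prevGray. One arrow is drawn on 'display' for each
// point that was successfully tracked into the current frame.
void drawMotionVectors(const cv::Mat& prevGray, const cv::Mat& curGray,
                       const std::vector<cv::Point2f>& prevPts, cv::Mat& display)
{
    if (prevPts.empty())
        return;

    std::vector<cv::Point2f> nextPts;
    std::vector<uchar> status;
    std::vector<float> error;
    cv::calcOpticalFlowPyrLK(prevGray, curGray, prevPts, nextPts, status, error);

    for (size_t i = 0; i < prevPts.size(); i++)
    {
        // Skip points for which the flow could not be found
        if (!status[i])
            continue;

        // The displacement from the old location to the new one is the motion vector
        cv::arrowedLine(display, prevPts[i], nextPts[i], cv::Scalar(0, 0, 255), 2);
    }
}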

Let's consider the following frame:

We need to add some points that we want to track. Just go ahead and click on a bunch of points on this window with your mouse:

If I move to a different position, you will see that the points are still being tracked correctly, within a small margin of error:

Let's add a lot of points and see what happens:

As we can see, it will keep tracking those points. But you will notice that some of the points get dropped along the way, either because they are not prominent enough for the tracker to lock onto or because they move too fast between frames. If you want to play around with it, you can just keep adding more points. You can also let the user select a region of interest in the input video, extract feature points from that region, and track the object by drawing a bounding box around it. It will be a fun exercise! A small sketch of this idea follows.
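If you want to try the region-of-interest variant, a minimal sketch could look like the following. It assumes a recent OpenCV version that provides selectROI, and the detector parameters and the function name seedPointsFromROI are illustrative choices:

#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

// Let the user draw a rectangle on 'frame', then seed the tracker with
// corners detected only inside that region.
std::vector<cv::Point2f> seedPointsFromROI(const cv::Mat& frame)
{
    cv::Rect roi = cv::selectROI("Select object", frame);
    if (roi.area() == 0)
        return std::vector<cv::Point2f>();   // selection was cancelled

    cv::Mat gray;
    cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);

    // Restrict the corner detector to the selected region using a mask
    cv::Mat mask = cv::Mat::zeros(gray.size(), CV_8UC1);
    mask(roi).setTo(255);

    std::vector<cv::Point2f> points;
    cv::goodFeaturesToTrack(gray, points, 50, 0.01, 7, mask);
    return points;
}

On each new frame, applying cv::boundingRect to the currently tracked points would give a rough bounding box that you can draw around the object.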

Here is the code to do Lucas-Kanade-based tracking:

int main(int argc, char* argv[]) 
{ 
    // Variable declaration and initialization 
     
    // Iterate until the user hits the Esc key 
    while(true) 
    { 
        // Capture the current frame 
        cap >> frame; 
         
        // Check if the frame is empty 
        if(frame.empty()) 
            break; 
         
        // Resize the frame 
        resize(frame, frame, Size(), scalingFactor, scalingFactor, INTER_AREA); 
         
        // Copy the input frame 
        frame.copyTo(image); 
         
        // Convert the image to grayscale 
        cvtColor(image, curGrayImage, COLOR_BGR2GRAY); 
         
        // Check if there are points to track 
        if(!trackingPoints[0].empty()) 
        { 
            // Status vector to indicate whether the flow for the corresponding features has been found 
            vector<uchar> statusVector; 
             
            // Error vector to indicate the error for the corresponding feature 
            vector<float> errorVector; 
             
            // Check if previous image is empty 
            if(prevGrayImage.empty()) 
            { 
                curGrayImage.copyTo(prevGrayImage); 
            } 
             
            // Calculate the optical flow using Lucas-Kanade algorithm 
            calcOpticalFlowPyrLK(prevGrayImage, curGrayImage, trackingPoints[0], trackingPoints[1], statusVector, errorVector, windowSize, 3, terminationCriteria, 0, 0.001); 

We use the current image and the previous image to compute the optical flow information. Needless to say, the quality of the output will depend on the parameters chosen. You can find more details about the parameters at http://docs.opencv.org/2.4/modules/video/doc/motion_analysis_and_object_tracking.html#calcopticalflowpyrlk. To increase quality and robustness, we need to filter out points that don't add new information: a newly clicked point is discarded if it is too close to an existing tracking point, and any point for which the flow could not be found is dropped. Let's go ahead and do that:

 
            int count = 0; 
             
            // Minimum distance between a newly added point and any existing tracking point 
            int minDist = 7; 
             
            for(int i=0; i < trackingPoints[1].size(); i++) 
            { 
                if(pointTrackingFlag) 
                { 
                    // If the new point is within 'minDist' distance from an existing point, it will not be tracked 
                    if(norm(currentPoint - trackingPoints[1][i]) <= minDist) 
                    { 
                        pointTrackingFlag = false; 
                        continue; 
                    } 
                } 
                 
                // Check if the status vector is good 
                if(!statusVector[i]) 
                    continue; 
                 
                trackingPoints[1][count++] = trackingPoints[1][i]; 
 
                // Draw a circle around each of the tracking points 
                int radius = 8; 
                int thickness = 2; 
                int lineType = 8; 
                circle(image, trackingPoints[1][i], radius, Scalar(0,255,0), thickness, lineType); 
            } 
             
            trackingPoints[1].resize(count); 
        } 

We now have the tracking points. The next step is to refine the location of the newly added point. What exactly does refine mean in this context? To keep the computation fast, feature locations are initially quantized to whole pixel coordinates; in layman's terms, you can think of it as rounding off. Now that we have the approximate location, we can refine it within a small surrounding region to subpixel accuracy and get a more accurate outcome. Let's go ahead and do that:

         
        // Refining the location of the feature points 
        if(pointTrackingFlag && trackingPoints[1].size() < maxNumPoints) 
        { 
            vector<Point2f> tempPoints; 
            tempPoints.push_back(currentPoint); 
             
            // Refine the location of the clicked point to subpixel accuracy. 
            // The search is performed within a neighborhood of size 'windowSize' around the point 
            cornerSubPix(curGrayImage, tempPoints, windowSize, Size(-1,-1), terminationCriteria); 
             
            trackingPoints[1].push_back(tempPoints[0]); 
            pointTrackingFlag = false; 
        } 
         
        // Display the image with the tracking points 
        imshow(windowName, image); 
         
        // Check if the user pressed the Esc key 
        char ch = waitKey(10); 
        if(ch == 27) 
            break; 
         
        // Swap the 'points' vectors to update 'previous' to 'current' 
        std::swap(trackingPoints[1], trackingPoints[0]); 
         
        // Swap the images to update previous image to current image 
        cv::swap(prevGrayImage, curGrayImage); 
    } 
     
    return 0; 
}
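The listing above leaves out the variable declarations and the mouse handler that sets currentPoint and pointTrackingFlag. A minimal sketch of that missing setup is shown below; the capture source, window name, window size, point limit, and termination criteria values are assumptions chosen for illustration:

#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/video/tracking.hpp>
#include <string>
#include <vector>

using namespace cv;
using namespace std;

// Shared between main() and the mouse callback
Point2f currentPoint;
bool pointTrackingFlag = false;

// Record the clicked location and tell the main loop to start tracking it
void onMouse(int event, int x, int y, int, void*)
{
    if(event == EVENT_LBUTTONDOWN)
    {
        currentPoint = Point2f((float)x, (float)y);
        pointTrackingFlag = true;
    }
}

Inside main(), the placeholder comment at the top could then stand for declarations along these lines:

    // Possible declarations for the loop above (the values are illustrative)
    VideoCapture cap(0);                         // open the default webcam
    string windowName = "Lucas-Kanade Tracker";
    namedWindow(windowName);
    setMouseCallback(windowName, onMouse);

    Mat frame, image, curGrayImage, prevGrayImage;
    vector<Point2f> trackingPoints[2];           // [0] = previous frame, [1] = current frame
    float scalingFactor = 0.75f;                 // shrink the input frames
    Size windowSize(25, 25);                     // search window for calcOpticalFlowPyrLK and cornerSubPix
    const int maxNumPoints = 200;                // upper bound on the number of tracked points
    TermCriteria terminationCriteria(TermCriteria::COUNT | TermCriteria::EPS, 10, 0.02);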