Measuring error

Visually, the results look very good. However, since we have the ground truth data, we can also compare it to the detection analytically and get an error estimate. We can use a standard mean Euclidean distance metric, $\frac{1}{n}\sum_{i=1}^{n} \lVert A_i - B_i \rVert_2$, to tell how close the predicted landmarks are to the ground truth on average:

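#include <opencv2/core.hpp>
#include <vector>

// Mean Euclidean (L2) distance between corresponding points of A and B.
float MeanEuclideanDistance(const std::vector<cv::Point2f>& A, const std::vector<cv::Point2f>& B) {
    CV_Assert(!A.empty() && A.size() == B.size());  // avoid out-of-range reads and division by zero
    float med = 0.0f;
    for (size_t i = 0; i < A.size(); ++i) {
        med += (float)cv::norm(A[i] - B[i]);  // distance between the i-th pair of points
    }
    return med / (float)A.size();
}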
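To make the metric concrete, here is a minimal stand-alone sketch that computes the error of each landmark separately before averaging. It is an illustration only, not part of the chapter's code: `Pt`, `PerLandmarkError`, and `MeanError` are hypothetical names, and `Pt` stands in for `cv::Point2f` so that the snippet builds without OpenCV.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cv::Point2f so the sketch builds without OpenCV.
struct Pt { float x, y; };

// Euclidean distance between each predicted landmark and its ground truth.
std::vector<float> PerLandmarkError(const std::vector<Pt>& pred,
                                    const std::vector<Pt>& truth) {
    assert(pred.size() == truth.size());
    std::vector<float> err(pred.size());
    for (std::size_t i = 0; i < pred.size(); ++i) {
        err[i] = std::hypot(pred[i].x - truth[i].x, pred[i].y - truth[i].y);
    }
    return err;
}

// Mean of the per-landmark errors; the same quantity as the mean
// Euclidean distance, computed in two steps.
float MeanError(const std::vector<float>& err) {
    assert(!err.empty());
    float sum = 0.0f;
    for (float e : err) sum += e;
    return sum / static_cast<float>(err.size());
}
```

For example, predictions (3, 4), (1, 1), (0, 2) against ground truth (0, 0), (1, 1), (0, 1) give per-landmark errors of 5, 0, and 1 pixels, so the mean is 2 pixels. Keeping the individual errors around is a design choice: the mean alone would hide the fact that one landmark is 5 pixels off.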

A visualization of the results, with the prediction (red) and ground truth (green) overlaid, is shown in the following screenshot:

We can see that the average error over all landmarks is only about one pixel for these particular video frames.