Matrix and Image Operators

Table 3-3 lists a variety of routines for matrix manipulation, most of which work equally well for images. They do all of the "usual" things, such as diagonalizing or transposing a matrix, as well as some more complicated operations, such as computing image statistics.

Table 3-3. Basic matrix and image operators

| Function | Description |
|---|---|
| cvAbs | Absolute value of all elements in an array |
| cvAbsDiff | Absolute value of differences between two arrays |
| cvAbsDiffS | Absolute value of differences between an array and a scalar |
| cvAdd | Elementwise addition of two arrays |
| cvAddS | Elementwise addition of an array and a scalar |
| cvAddWeighted | Elementwise weighted addition of two arrays (alpha blending) |
| cvAvg | Average value of all elements in an array |
| cvAvgSdv | Average value and standard deviation of all elements in an array |
| cvCalcCovarMatrix | Compute covariance of a set of n-dimensional vectors |
| cvCmp | Apply selected comparison operator to all elements in two arrays |
| cvCmpS | Apply selected comparison operator to an array relative to a scalar |
| cvConvertScale | Convert array type with optional rescaling of the value |
| cvConvertScaleAbs | Convert array type after absolute value with optional rescaling |
| cvCopy | Copy elements of one array to another |
| cvCountNonZero | Count nonzero elements in an array |
| cvCrossProduct | Compute cross product of two three-dimensional vectors |
| cvCvtColor | Convert channels of an array from one color space to another |
| cvDet | Compute determinant of a square matrix |
| cvDiv | Elementwise division of one array by another |
| cvDotProduct | Compute dot product of two vectors |
| cvEigenVV | Compute eigenvalues and eigenvectors of a square matrix |
| cvFlip | Flip an array about a selected axis |
| cvGEMM | Generalized matrix multiplication |
| cvGetCol | Copy elements from column slice of an array |
| cvGetCols | Copy elements from multiple adjacent columns of an array |
| cvGetDiag | Copy elements from an array diagonal |
| cvGetDims | Return the number of dimensions of an array |
| cvGetDimSize | Return the size of a selected dimension of an array |
| cvGetRow | Copy elements from row slice of an array |
| cvGetRows | Copy elements from multiple adjacent rows of an array |
| cvGetSize | Get size of a two-dimensional array and return as CvSize |
| cvGetSubRect | Copy elements from subregion of an array |
| cvInRange | Test if elements of an array are within values of two other arrays |
| cvInRangeS | Test if elements of an array are in range between two scalars |
| cvInvert | Invert a square matrix |
| cvMahalanobis | Compute Mahalanobis distance between two vectors |
| cvMax | Elementwise max operation on two arrays |
| cvMaxS | Elementwise max operation between an array and a scalar |
| cvMerge | Merge several single-channel images into one multichannel image |
| cvMin | Elementwise min operation on two arrays |
| cvMinS | Elementwise min operation between an array and a scalar |
| cvMinMaxLoc | Find minimum and maximum values in an array |
| cvMul | Elementwise multiplication of two arrays |
| cvNot | Bitwise inversion of every element of an array |
| cvNorm | Compute the norm of an array, or the norm of the difference between two arrays |
| cvNormalize | Normalize elements in an array to some value |
| cvOr | Elementwise bit-level OR of two arrays |
| cvOrS | Elementwise bit-level OR of an array and a scalar |
| cvReduce | Reduce a two-dimensional array to a vector by a given operation |
| cvRepeat | Tile the contents of one array into another |
| cvSet | Set all elements of an array to a given value |
| cvSetZero | Set all elements of an array to 0 |
| cvSetIdentity | Set all elements of an array to 1 for the diagonal and 0 otherwise |
| cvSolve | Solve a system of linear equations |
| cvSplit | Split a multichannel array into multiple single-channel arrays |
| cvSub | Elementwise subtraction of one array from another |
| cvSubS | Elementwise subtraction of a scalar from an array |
| cvSubRS | Elementwise subtraction of an array from a scalar |
| cvSum | Sum all elements of an array |
| cvSVD | Compute singular value decomposition of a two-dimensional array |
| cvSVBkSb | Compute singular value back-substitution |
| cvTrace | Compute the trace of an array |
| cvTranspose | Transpose all elements of an array across the diagonal |
| cvXor | Elementwise bit-level XOR between two arrays |
| cvXorS | Elementwise bit-level XOR between an array and a scalar |
| cvZero | Set all elements of an array to 0 |

void cvAdd(
    const CvArr* src1,
    const CvArr* src2,
    CvArr*       dst,
    const CvArr* mask = NULL
);
void cvAddS(
    const CvArr* src,
    CvScalar     value,
    CvArr*       dst,
    const CvArr* mask = NULL
);
void cvAddWeighted(
    const CvArr* src1,
    double       alpha,
    const CvArr* src2,
    double       beta,
    double       gamma,
    CvArr*       dst
);

cvAdd() is a simple addition function: it adds all of the elements in src1 to the corresponding elements in src2 and puts the results in dst. If mask is not set to NULL, then any element of dst that corresponds to a zero element of mask remains unaltered by this operation. The closely related function cvAddS() does the same thing except that the constant scalar value is added to every element of src.
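To make the role of the mask concrete, here is a minimal plain-C sketch of the same arithmetic on flat 8-bit buffers. It does not call OpenCV; the names add_masked_u8 and saturate_u8 are hypothetical, and clamping the sum to 255 is our assumption about how 8-bit overflow should be handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCV function): clamp an int to 0..255. */
static unsigned char saturate_u8(int v)
{
    return (unsigned char)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* dst[i] = src1[i] + src2[i] wherever mask[i] is nonzero.
 * A NULL mask means "process every element". */
void add_masked_u8(const unsigned char *src1, const unsigned char *src2,
                   unsigned char *dst, const unsigned char *mask,
                   size_t count)
{
    assert(src1 != NULL && src2 != NULL && dst != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (mask == NULL || mask[i] != 0) {
            dst[i] = saturate_u8((int)src1[i] + (int)src2[i]);
        }
        /* Where the mask is zero, dst[i] keeps its previous value. */
    }
}
```

Note that elements of dst under a zero mask are not cleared; whatever was there before the call is still there afterward.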

The function cvAddWeighted() is similar to cvAdd() except that the result written to dst is computed according to the following formula:

$$\text{dst}_{x,y} = \alpha \cdot \text{src1}_{x,y} + \beta \cdot \text{src2}_{x,y} + \gamma$$

This function can be used to implement alpha blending [Smith79; Porter84]; that is, it can be used to blend one image with another. The form of this function is:

void  cvAddWeighted(
    const CvArr* src1,
    double       alpha,
    const CvArr* src2,
    double       beta,
    double       gamma,
    CvArr*       dst
);

In cvAddWeighted() we have two source images, src1 and src2. These images may be of any pixel type so long as both are of the same type. They may also be one or three channels (grayscale or color), again as long as they agree. The destination result image, dst, must also have the same pixel type as src1 and src2. These images may be of different sizes, but their ROIs must agree in size or else OpenCV will issue an error. The parameter alpha is the blending strength of src1, and beta is the blending strength of src2. The alpha blending equation is:

$$\text{dst}_{x,y} = \alpha \cdot \text{src1}_{x,y} + \beta \cdot \text{src2}_{x,y} + \gamma$$

You can convert to the standard alpha blend equation by choosing α between 0 and 1, setting β = 1 − α, and setting γ to 0; this yields:

$$\text{dst}_{x,y} = \alpha \cdot \text{src1}_{x,y} + (1 - \alpha) \cdot \text{src2}_{x,y}$$

However, cvAddWeighted() gives us a little more flexibility—both in how we weight the blended images and in the additional parameter γ, which allows for an additive offset to the resulting destination image. For the general form, you will probably want to keep alpha and beta at no less than 0 and their sum at no more than 1; gamma may be set depending on average or max image value to scale the pixels up. A program showing the use of alpha blending is shown in Example 3-14.

The code in Example 3-14 takes two source images: the primary one (src1) and the one to blend (src2). It reads in a rectangle ROI for src1 and applies an ROI of the same size to src2, this time located at the origin. It reads in alpha and beta levels but sets gamma to 0. Alpha blending is applied using cvAddWeighted(), and the results are put into src1 and displayed. Example output is shown in Figure 3-4, where the face of a child is blended onto the face and body of a cat. Note that the code took the same ROI as in the ROI addition example in Figure 3-3. This time we used the ROI as the target blending region.
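Independently of the ROI handling, the per-pixel arithmetic of the blend can be sketched in a few lines of plain C. This is only an illustration of the formula on flat 8-bit buffers; add_weighted_u8 and round_saturate_u8 are hypothetical names, and the round-then-clamp step is our assumption rather than a description of OpenCV's internals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round to nearest and clamp to 0..255. */
static unsigned char round_saturate_u8(double v)
{
    if (v <= 0.0)   return 0;
    if (v >= 255.0) return 255;
    return (unsigned char)(v + 0.5);
}

/* dst[i] = alpha * src1[i] + beta * src2[i] + gamma, per element. */
void add_weighted_u8(const unsigned char *src1, double alpha,
                     const unsigned char *src2, double beta,
                     double gamma, unsigned char *dst, size_t count)
{
    assert(src1 != NULL && src2 != NULL && dst != NULL);
    for (size_t i = 0; i < count; ++i) {
        double v = alpha * (double)src1[i] + beta * (double)src2[i] + gamma;
        dst[i] = round_saturate_u8(v);
    }
}
```

With alpha = 0.25, beta = 0.75, and gamma = 0, a pixel pair (100, 200) blends to 175. Choosing weights that sum to more than 1, or a large gamma, simply drives bright pixels into the clamp at 255.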

void cvCalcCovarMatrix(
    const CvArr** vects,
    int           count,
    CvArr*        cov_mat,
    CvArr*        avg,
    int           flags
);

Given any number of vectors, cvCalcCovarMatrix() will compute the mean and covariance matrix for the Gaussian approximation to the distribution of those points. This can be used in many ways, of course, and OpenCV has some additional flags that will help in particular contexts (see Table 3-4). These flags may be combined using the bitwise OR operator.

In all cases, the vectors are supplied in vects as an array of OpenCV arrays (i.e., a pointer to a list of pointers to arrays), with the argument count indicating how many arrays are being supplied. The results will be placed in cov_mat in all cases, but the exact meaning of avg depends on the flag values (see Table 3-4).

The flags CV_COVAR_NORMAL and CV_COVAR_SCRAMBLED are mutually exclusive; you should use one or the other but not both. In the case of CV_COVAR_NORMAL, the function will simply compute the mean and covariance of the points provided.

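$$\Sigma^2_{x,y} = z \begin{bmatrix} v_{1,1} - \bar{v}_1 & \cdots & v_{m,1} - \bar{v}_1 \\ \vdots & \ddots & \vdots \\ v_{1,n} - \bar{v}_n & \cdots & v_{m,n} - \bar{v}_n \end{bmatrix} \begin{bmatrix} v_{1,1} - \bar{v}_1 & \cdots & v_{m,1} - \bar{v}_1 \\ \vdots & \ddots & \vdots \\ v_{1,n} - \bar{v}_n & \cdots & v_{m,n} - \bar{v}_n \end{bmatrix}^T$$

Thus the normal covariance is computed from the m vectors of length n, where $v_{i,j}$ is the jth element of the ith vector and $\bar{v}_n$ is defined as the nth element of the average vector $\bar{v}$. The resulting covariance matrix is an n-by-n matrix. The factor z is an optional scale factor; it will be set to 1 unless the CV_COVAR_SCALE flag is used.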

In the case of CV_COVAR_SCRAMBLED, cvCalcCovarMatrix() will compute the following:

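$$\Sigma^2_{x,y} = z \begin{bmatrix} v_{1,1} - \bar{v}_1 & \cdots & v_{m,1} - \bar{v}_1 \\ \vdots & \ddots & \vdots \\ v_{1,n} - \bar{v}_n & \cdots & v_{m,n} - \bar{v}_n \end{bmatrix}^T \begin{bmatrix} v_{1,1} - \bar{v}_1 & \cdots & v_{m,1} - \bar{v}_1 \\ \vdots & \ddots & \vdots \\ v_{1,n} - \bar{v}_n & \cdots & v_{m,n} - \bar{v}_n \end{bmatrix}$$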

This matrix is not the usual covariance matrix (note the location of the transpose operator). This matrix is computed from the same m vectors of length n, but the resulting scrambled covariance matrix is an m-by-m matrix. This matrix is used in some specific algorithms such as fast PCA for very large vectors (as in the eigenfaces technique for face recognition).

The flag CV_COVAR_USE_AVG is used when the mean of the input vectors is already known. In this case, the argument avg is used as an input rather than an output, which reduces computation time.

Finally, the flag CV_COVAR_SCALE is used to apply a uniform scale to the covariance matrix calculated. This is the factor z in the preceding equations. When used in conjunction with the CV_COVAR_NORMAL flag, the applied scale factor will be 1.0/m (or, equivalently, 1.0/count). If instead CV_COVAR_SCRAMBLED is used, then the value of z will be 1.0/n (the inverse of the length of the vectors).

The input and output arrays to cvCalcCovarMatrix() should all be of the same floating-point type. The size of the resulting matrix cov_mat should be either n-by-n or m-by-m depending on whether the standard or scrambled covariance is being computed. It should be noted that the "vectors" input in vects do not actually have to be one-dimensional; they can be two-dimensional objects (e.g., images) as well.
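The difference between the normal and scrambled forms is easiest to see when both are written out as loops. The following plain-C sketch is not OpenCV code: covar_normal and covar_scrambled are hypothetical names, the vectors are assumed to be packed back to back in one row-major buffer, and the scale argument stands in for the CV_COVAR_SCALE flag.

```c
#include <assert.h>
#include <stddef.h>

/* vects: m vectors of length n, packed as vects[i * n + j].
 * avg:   output mean vector, length n.
 * cov:   output n-by-n matrix, row-major.
 * scale: nonzero applies z = 1/m, zero leaves z = 1. */
void covar_normal(const double *vects, size_t m, size_t n,
                  double *avg, double *cov, int scale)
{
    assert(vects != NULL && avg != NULL && cov != NULL && m > 0 && n > 0);
    for (size_t j = 0; j < n; ++j) {
        double sum = 0.0;
        for (size_t i = 0; i < m; ++i) sum += vects[i * n + j];
        avg[j] = sum / (double)m;
    }
    double z = scale ? 1.0 / (double)m : 1.0;
    for (size_t a = 0; a < n; ++a) {
        for (size_t b = 0; b < n; ++b) {
            double sum = 0.0;
            for (size_t i = 0; i < m; ++i)
                sum += (vects[i * n + a] - avg[a]) * (vects[i * n + b] - avg[b]);
            cov[a * n + b] = z * sum;
        }
    }
}

/* Scrambled form: an m-by-m matrix of dot products between the
 * mean-subtracted vectors. Here avg is an input (already known),
 * and scale nonzero applies z = 1/n. */
void covar_scrambled(const double *vects, size_t m, size_t n,
                     const double *avg, double *cov, int scale)
{
    assert(vects != NULL && avg != NULL && cov != NULL && m > 0 && n > 0);
    double z = scale ? 1.0 / (double)n : 1.0;
    for (size_t i = 0; i < m; ++i) {
        for (size_t k = 0; k < m; ++k) {
            double sum = 0.0;
            for (size_t j = 0; j < n; ++j)
                sum += (vects[i * n + j] - avg[j]) * (vects[k * n + j] - avg[j]);
            cov[i * m + k] = z * sum;
        }
    }
}
```

For a handful of very long vectors (say, m = 20 images of n = 10,000 pixels each), the scrambled result is a 20-by-20 matrix rather than a 10,000-by-10,000 one, which is why it is attractive for techniques such as eigenfaces.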

The cvConvertScale() function is actually several functions rolled into one; it will perform any of several functions or, if desired, all of them together. The first function is to convert the data type in the source image to the data type of the destination image. For example, if we have an 8-bit grayscale image and would like to convert it to a 16-bit signed image, we can do that by calling cvConvertScale().

The second function of cvConvertScale() is to perform a linear transformation on the image data. Each pixel value will be multiplied by the value scale and then have added to it the value shift. It is critical to remember that, even though "Convert" precedes "Scale" in the function name, the actual order in which these operations are performed is the opposite (i.e., multiplication by scale and the addition of shift occur before the type conversion takes place).

When you simply pass the default values (scale = 1.0 and shift = 0.0), you need not have performance fears; OpenCV is smart enough to recognize this case and not waste processor time on useless operations. For clarity (if you think it adds any), OpenCV also provides the macro cvConvert(), which is the same as cvConvertScale() but is conventionally used when the scale and shift arguments will be left at their default values.

cvConvertScale() will work on all data types and any number of channels, but the number of channels in the source and destination images must be the same. (If you want to, say, convert from color to grayscale or vice versa, see cvCvtColor(), which is coming up shortly.)
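The order of operations described above (scale and shift first, type conversion last) can be illustrated with a small plain-C sketch that converts floating-point pixels to 8-bit values. The function name convert_scale_f32_to_u8 is hypothetical, and the rounding and clamping rules are simplifying assumptions of ours, not a statement of what OpenCV does internally.

```c
#include <assert.h>
#include <stddef.h>

/* dst[i] = (8-bit) (src[i] * scale + shift).
 * The linear transform is carried out in double precision; only the
 * final result is rounded and clamped into the 0..255 range. */
void convert_scale_f32_to_u8(const float *src, unsigned char *dst,
                             size_t count, double scale, double shift)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; ++i) {
        double v = (double)src[i] * scale + shift;
        if (v <= 0.0) {
            dst[i] = 0;
        } else if (v >= 255.0) {
            dst[i] = 255;
        } else {
            dst[i] = (unsigned char)(v + 0.5);
        }
    }
}
```

With scale = 255 and shift = 0, a floating-point pixel of 0.5 becomes 128. Had the conversion to 8 bits happened first, the 0.5 would already have collapsed to an integer before the multiplication, and the fractional information would have been lost.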

void cvCvtColor(
    const CvArr* src,
    CvArr*       dst,
    int          code
);

The previous functions were for converting from one data type to another, and they expected the number of channels to be the same in both source and destination images. The complementary function is cvCvtColor(), which converts from one color space (number of channels) to another [Wharton71] while expecting the data type to be the same. The exact conversion operation to be done is specified by the argument code, whose possible values are listed in Table 3-6.[25]

Table 3-6. Conversions available by means of cvCvtColor()

| Conversion code | Meaning |
|---|---|
| CV_BGR2RGB, CV_RGB2BGR, CV_RGBA2BGRA, CV_BGRA2RGBA | Convert between RGB and BGR color spaces (with or without alpha channel) |
| CV_RGB2RGBA, CV_BGR2BGRA | Add alpha channel to RGB or BGR image |
| CV_RGBA2RGB, CV_BGRA2BGR | Remove alpha channel from RGB or BGR image |
| CV_RGB2BGRA, CV_RGBA2BGR, CV_BGRA2RGB, CV_BGR2RGBA | Convert between RGB and BGR color spaces while adding or removing alpha channel |
| CV_RGB2GRAY, CV_BGR2GRAY | Convert RGB or BGR color spaces to grayscale |
| CV_GRAY2RGB, CV_GRAY2BGR, CV_RGBA2GRAY, CV_BGRA2GRAY | Convert grayscale to RGB or BGR color spaces, or convert RGBA or BGRA to grayscale (removing alpha channel in the process) |
| CV_GRAY2RGBA, CV_GRAY2BGRA | Convert grayscale to RGB or BGR color spaces and add alpha channel |
| CV_RGB2BGR565, CV_BGR2BGR565, CV_BGR5652RGB, CV_BGR5652BGR, CV_RGBA2BGR565, CV_BGRA2BGR565, CV_BGR5652RGBA, CV_BGR5652BGRA | Convert from RGB or BGR color space to BGR565 color representation or vice versa, with optional addition or removal of alpha channel (16-bit images) |
| CV_GRAY2BGR565, CV_BGR5652GRAY | Convert grayscale to BGR565 color representation or vice versa (16-bit images) |
| CV_RGB2BGR555, CV_BGR2BGR555, CV_BGR5552RGB, CV_BGR5552BGR, CV_RGBA2BGR555, CV_BGRA2BGR555, CV_BGR5552RGBA, CV_BGR5552BGRA | Convert from RGB or BGR color space to BGR555 color representation or vice versa, with optional addition or removal of alpha channel (16-bit images) |
| CV_GRAY2BGR555, CV_BGR5552GRAY | Convert grayscale to BGR555 color representation or vice versa (16-bit images) |
| CV_RGB2XYZ, CV_BGR2XYZ, CV_XYZ2RGB, CV_XYZ2BGR | Convert RGB or BGR image to CIE XYZ representation or vice versa (Rec 709 with D65 white point) |
| CV_RGB2YCrCb, CV_BGR2YCrCb, CV_YCrCb2RGB, CV_YCrCb2BGR | Convert RGB or BGR image to luma-chroma (aka YCC) color representation or vice versa |
| CV_RGB2HSV, CV_BGR2HSV, CV_HSV2RGB, CV_HSV2BGR | Convert RGB or BGR image to HSV (hue saturation value) color representation or vice versa |
| CV_RGB2HLS, CV_BGR2HLS, CV_HLS2RGB, CV_HLS2BGR | Convert RGB or BGR image to HLS (hue lightness saturation) color representation or vice versa |
| CV_RGB2Lab, CV_BGR2Lab, CV_Lab2RGB, CV_Lab2BGR | Convert RGB or BGR image to CIE Lab color representation or vice versa |
| CV_RGB2Luv, CV_BGR2Luv, CV_Luv2RGB, CV_Luv2BGR | Convert RGB or BGR image to CIE Luv color representation or vice versa |
| CV_BayerBG2RGB, CV_BayerGB2RGB, CV_BayerRG2RGB, CV_BayerGR2RGB, CV_BayerBG2BGR, CV_BayerGB2BGR, CV_BayerRG2BGR, CV_BayerGR2BGR | Convert from Bayer pattern (single-channel) to RGB or BGR image |

The details of many of these conversions are nontrivial, and we will not go into the subtleties of Bayer representations and the CIE color spaces here. For our purposes, it is sufficient to note that OpenCV contains tools to convert to and from these various color spaces, which are of importance to various classes of users.

The color-space conversions all use the following conventions: 8-bit images are in the range 0–255, 16-bit images are in the range 0–65535, and floating-point numbers are in the range 0.0–1.0. When grayscale images are converted to color images, all components of the resulting image are taken to be equal; but for the reverse transformation (e.g., RGB or BGR to grayscale), the gray value is computed using the perceptually weighted formula:

$$Y = 0.299 \cdot R + 0.587 \cdot G + 0.114 \cdot B$$

In the case of HSV or HLS representations, hue is normally represented as a value from 0 to 360.[26] This can cause trouble in 8-bit representations and so, when converting to HSV, the hue is divided by 2 when the output image is an 8-bit image.
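Both conventions, the weighted gray value and the halved 8-bit hue, are simple enough to sketch in plain C. The helper names below (bgr_to_gray_u8, hue_to_u8, hue_from_u8) are hypothetical, the gray weights are the standard 0.299, 0.587, 0.114 luma coefficients, and the sketch uses floating-point arithmetic with round-to-nearest, which may differ in the last bit from what the library computes.

```c
#include <assert.h>

/* Perceptually weighted gray value for one 8-bit BGR pixel.
 * The weights sum to 1, so the result always fits in 0..255. */
unsigned char bgr_to_gray_u8(unsigned char b, unsigned char g, unsigned char r)
{
    double y = 0.299 * (double)r + 0.587 * (double)g + 0.114 * (double)b;
    return (unsigned char)(y + 0.5);
}

/* Hue in whole degrees (0..359) stored in 8 bits by halving it. */
unsigned char hue_to_u8(int hue_degrees)
{
    assert(hue_degrees >= 0 && hue_degrees < 360);
    return (unsigned char)(hue_degrees / 2);
}

/* Approximate inverse: the stored value covers 0..179, and the
 * original hue is recovered only to within one degree. */
int hue_from_u8(unsigned char stored)
{
    return 2 * (int)stored;
}
```

A pure green pixel therefore maps to a much brighter gray (150) than a pure blue one (29), and a hue of 240 degrees is stored as 120 in an 8-bit HSV image.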

double cvDotProduct(
    const CvArr* src1,
    const CvArr* src2
);

This function computes the vector dot product [Lagrange1773] of two N-dimensional vectors.[27] As with the cross product (and for the same reason), it does not matter if the vectors are in row or column form. Both src1 and src2 should be single-channel arrays, and both arrays should be of the same data type.
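As a point of reference, the computation amounts to a sum of elementwise products over the two buffers, which is why the row or column orientation is irrelevant. A minimal plain-C sketch follows; dot_product is a hypothetical name, and the arrays are assumed to be contiguous single-channel data of equal length.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of products of corresponding elements. Treating both inputs
 * as flat buffers makes the row/column orientation irrelevant. */
double dot_product(const double *a, const double *b, size_t count)
{
    assert(a != NULL && b != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < count; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

For the vectors (1, 2, 3) and (4, 5, 6) this returns 32. The same loop applied to two n-by-m matrices stored contiguously gives the sum of products of corresponding elements.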

void cvEigenVV(
    CvArr* mat,
    CvArr* evects,
    CvArr* evals,
    double eps    = 0
);

Given a symmetric matrix mat, cvEigenVV() will compute the eigenvectors and the corresponding eigenvalues of that matrix. This is done using Jacobi's method [Bronshtein97], so it is efficient for smaller matrices.[28] Jacobi's method requires a stopping parameter, which is the maximum size of the off-diagonal elements in the final matrix.[29] The optional argument eps sets this termination value. The supplied matrix mat serves as working storage during the computation, so its values will be altered by the function. When the function returns, you will find your eigenvectors in evects, stored as successive rows. The corresponding eigenvalues are stored in evals. The eigenvectors are always ordered by descending magnitude of the corresponding eigenvalues. The cvEigenVV() function requires all three arrays to be of floating-point type.

As with cvDet() (described previously), if the matrix in question is known to be symmetric and positive definite[30] then it is better to use SVD to find the eigenvalues and eigenvectors of mat.
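Because the eigenvectors come back as rows and the input matrix is overwritten, it can be worth checking a result against a saved copy of the original matrix. The plain-C sketch below does exactly that; eigen_residual is a hypothetical helper of ours, and it assumes row-major storage with row k of evects paired with entry k of evals.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the largest absolute entry of (A * v_k - lambda_k * v_k)
 * over all eigenpairs k.
 * mat:    n-by-n matrix, row-major (a copy made BEFORE the decomposition)
 * evects: n-by-n, row k is eigenvector k
 * evals:  n eigenvalues, evals[k] belongs to row k of evects */
double eigen_residual(const double *mat, const double *evects,
                      const double *evals, size_t n)
{
    assert(mat != NULL && evects != NULL && evals != NULL);
    double worst = 0.0;
    for (size_t k = 0; k < n; ++k) {
        const double *v = evects + k * n;
        for (size_t i = 0; i < n; ++i) {
            double av = 0.0;
            for (size_t j = 0; j < n; ++j) {
                av += mat[i * n + j] * v[j];
            }
            double r = av - evals[k] * v[i];
            if (r < 0.0) r = -r;
            if (r > worst) worst = r;
        }
    }
    return worst;
}
```

A residual near machine precision indicates that the rows and eigenvalues are paired as expected; a large residual usually means the vectors were read as columns or the eigenvalues were matched in the wrong order.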

void cvGEMM(
    const CvArr* src1,
    const CvArr* src2,
    double       alpha,
    const CvArr* src3,
    double       beta,
    CvArr*       dst,
    int          tABC = 0
);

Generalized matrix multiplication (GEMM) in OpenCV is performed by cvGEMM(), which performs matrix multiplication, multiplication by a transpose, scaled multiplication, et cetera. In its most general form, cvGEMM() computes the following:

$$D = \alpha \cdot \text{op}(A) \cdot \text{op}(B) + \beta \cdot \text{op}(C)$$

Here A, B, and C are (respectively) the matrices src1, src2, and src3; α and β are numerical coefficients; and op() is an optional transposition of the matrix enclosed. The result D is written to dst. The argument src3 may be set to NULL, in which case it will not be added. The transpositions are controlled by the optional argument tABC, which may be 0 or any combination (by means of bitwise OR) of CV_GEMM_A_T, CV_GEMM_B_T, and CV_GEMM_C_T (with each flag indicating a transposition of the corresponding matrix).

In the distant past OpenCV contained the methods cvMatMul() and cvMatMulAdd(), but these were too often confused with cvMul(), which does something entirely different (i.e., element-by-element multiplication of two arrays). These functions continue to exist as macros for calls to cvGEMM(). In particular, we have the equivalences listed in Table 3-7.

All matrices must be of the appropriate size for the multiplication, and all should be of floating-point type. The cvGEMM() function supports two-channel matrices, in which case it will treat the two channels as the two components of a single complex number.
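The general form can be spelled out as a triple loop. The plain-C sketch below handles real-valued row-major matrices only and, to stay short, supports transposition of A and B but not of C. The function gemm_sketch and the flags GEMM_A_T and GEMM_B_T are hypothetical stand-ins that merely echo the idea of CV_GEMM_A_T and CV_GEMM_B_T.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags for this sketch only. */
enum { GEMM_A_T = 1, GEMM_B_T = 2 };

/* D = alpha * op(A) * op(B) + beta * C
 * op(A) is rows-by-inner, op(B) is inner-by-cols, C and D are
 * rows-by-cols. All matrices are row-major. C may be NULL. */
void gemm_sketch(const double *A, const double *B, double alpha,
                 const double *C, double beta, double *D,
                 size_t rows, size_t inner, size_t cols, int flags)
{
    assert(A != NULL && B != NULL && D != NULL);
    for (size_t i = 0; i < rows; ++i) {
        for (size_t j = 0; j < cols; ++j) {
            double sum = 0.0;
            for (size_t k = 0; k < inner; ++k) {
                /* With the transpose flag set, A is stored inner-by-rows. */
                double a = (flags & GEMM_A_T) ? A[k * rows + i]
                                              : A[i * inner + k];
                /* With the transpose flag set, B is stored cols-by-inner. */
                double b = (flags & GEMM_B_T) ? B[j * inner + k]
                                              : B[k * cols + j];
                sum += a * b;
            }
            D[i * cols + j] = alpha * sum
                            + (C != NULL ? beta * C[i * cols + j] : 0.0);
        }
    }
}
```

Calling it with alpha = 1, C = NULL, and flags = 0 reduces to an ordinary matrix product, which is the case the cvMatMul() macro covers.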

double cvMahalanobis(
    const CvArr* vec1,
    const CvArr* vec2,
    CvArr*       mat
);

The Mahalanobis distance is defined as the vector distance measured between a point and the center of a Gaussian distribution; it is computed using the inverse covariance of that distribution as a metric. See Figure 3-5. Intuitively, this is analogous to the z-score in basic statistics, where the distance from the center of a distribution is measured in units of the standard deviation of that distribution. The Mahalanobis distance is just a multivariable generalization of the same idea.

cvMahalanobis() computes the value:

$$r_{\text{Mahalanobis}} = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^T \, \Sigma^{-1} \, (\mathbf{x} - \boldsymbol{\mu})}$$

The vector vec1 is presumed to be the point x, and the vector vec2 is taken to be the distribution's mean.[32] The matrix mat is the inverse covariance.

In practice, this covariance matrix will usually have been computed with cvCalcCovarMatrix() (described previously) and then inverted with cvInvert(). It is good programming practice to use the CV_SVD method for this inversion because someday you will encounter a distribution for which one of the eigenvalues is 0!
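The quadratic form under the square root is straightforward to write out by hand. The plain-C sketch below computes the squared distance only, so that it needs nothing beyond the standard headers shown; pass the result to sqrt() from math.h to obtain the distance itself. The name mahalanobis_sq is hypothetical, and icov is assumed to be an already inverted covariance matrix in row-major order.

```c
#include <assert.h>
#include <stddef.h>

/* Squared Mahalanobis distance: (x - mean)^T * icov * (x - mean).
 * x, mean: vectors of length n
 * icov:    n-by-n inverse covariance (or any other metric), row-major */
double mahalanobis_sq(const double *x, const double *mean,
                      const double *icov, size_t n)
{
    assert(x != NULL && mean != NULL && icov != NULL);
    double total = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double di = x[i] - mean[i];
        for (size_t j = 0; j < n; ++j) {
            double dj = x[j] - mean[j];
            total += di * icov[i * n + j] * dj;
        }
    }
    return total;
}
```

When icov is the identity matrix, the result is the ordinary squared Euclidean distance; a diagonal icov rescales each axis by the inverse variance along it, which is the multivariable analog of the z-score described above.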



[25] Long-time users of IPL should note that the function cvCvtColor() ignores the colorModel and channelSeq fields of the IplImage header. The conversions are done exactly as implied by the code argument.

[26] Excluding 360, of course.

[27] Actually, the behavior of cvDotProduct() is a little more general than described here. Given any pair of n-by-m matrices, cvDotProduct() will return the sum of the products of the corresponding elements.

[28] A good rule of thumb would be that matrices 10-by-10 or smaller are small enough for Jacobi's method to be efficient. If the matrix is larger than 20-by-20 then you are in a domain where this method is probably not the way to go.

[29] In principle, once the Jacobi method is complete then the original matrix is transformed into one that is diagonal and contains only the eigenvalues; however, the method can be terminated before the off-diagonal elements are all the way to zero in order to save on computation. In practice it is usually sufficient to set this value to DBL_EPSILON, or about $2 \times 10^{-16}$ for IEEE double precision.

[30] This is, for example, always the case for covariance matrices. See cvCalcCovarMatrix().

[31] Remember that OpenCV regards a "vector" as a matrix of size n-by-1 or 1-by-n.

[32] Actually, the Mahalanobis distance is more generally defined as the distance between any two vectors; in any case, the vector vec2 is subtracted from the vector vec1. Neither is there any fundamental connection between mat in cvMahalanobis() and the inverse covariance; any metric can be imposed here as appropriate.

[33] At least in the case of the L2 norm, there is an intuitive interpretation of the difference norm as a Euclidean distance in a space of dimension equal to the number of pixels in the images.

[34] Purists will note that averaging is not technically a proper fold in the sense implied here. OpenCV has a more practical view of reductions and so includes this useful operation in cvReduce.