CHAPTER 13
MATRICES. QUADRATIC AND HERMITIAN FORMS
13.2. Matrix Algebra and Matrix Calculus
13.2-1. Rectangular Matrices. Norms
13.2-3. Identities and Inverses
13.2-4. Integral Powers of Square Matrices
13.2-5. Matrices as Building Blocks of Mathematical Models
13.2-6. Multiplication by Special Matrices. Permutation Matrices
13.2-7. Rank, Trace, and Determinant of a Matrix
13.2-8. Partitioning of Matrices
13.2-9. Step Matrices. Direct Sums
13.2-10. Direct Product (Outer Product) of Matrices
13.2-11. Convergence and Differentiation
13.2-12. Functions of Matrices
13.3. Matrices with Special Symmetry Properties
13.3-1. Transpose and Hermitian Conjugate of a Matrix
13.3-2. Matrices with Special Symmetry Properties
13.3-4. Decomposition Theorems. Normal Matrices
13.4. Equivalent Matrices. Eigenvalues, Diagonalization, and Related Topics
13.4-1. Equivalent and Similar Matrices
13.4-2. Eigenvalues and Spectra of Square Matrices
13.4-4. Diagonalization of Matrices
13.4-5. Eigenvalues and Characteristic Equation of a Finite Matrix
13.4-6. Eigenvalues of Step Matrices
13.4-7. The Cayley-Hamilton Theorem and Related Topics
13.5. Quadratic and Hermitian Forms
13.5-4. Transformation of Quadratic and Hermitian Forms. Transformation to Diagonal Form
13.5-5. Simultaneous Diagonalization of Two Quadratic or Hermitian Forms
13.5-6. Tests for Positive Definiteness, Nonnegativeness, etc.
13.6-1. Systems of Ordinary Differential Equations. Matrix Notation
13.6-2. Linear Differential Equations with Constant Coefficients (Time-invariant Systems)
(a) Homogeneous Systems. Normal-mode Solution
(b) Nonhomogeneous Systems. The State-transition Matrix
(c) Laplace-transform Solution
13.6-3. Linear Systems with Variable Coefficients
13.6-4. Perturbation Methods and Sensitivity Equations
13.6-5. Stability of Solutions: Definitions
13.6-6. Lyapunov Functions and Stability
13.6-7. Applications and Examples
13.7. Related Topics, References, and Bibliography
13.7-2. References and Bibliography
13.1-1. Matrices (Sec. 13.2-1) are the building blocks of an important class of mathematical models. Matrix techniques permit a simplified representation of various mathematical and physical operations in terms of numerical operations on matrix elements. Chapter 13 introduces matrix algebra and calculus (Secs. 13.2-1 to 13.4-7), quadratic and hermitian forms (Secs. 13.5-1 to 13.5-6), and the matrix/state-variable treatment of ordinary differential equations (Secs. 13.6-1 to 13.6-7). The general application of matrices to the representation of vectors, linear transformations (linear operators), and inner products is reserved for Chap. 14.
Sections 13.5-1 to 13.5-6 similarly introduce quadratic and hermitian forms from the viewpoint of simple algebra; their real significance for the representation of scalar products is discussed in Secs. 14.7-1 and 14.7-2.
13.2. MATRIX ALGEBRA AND MATRIX CALCULUS
13.2-1. Rectangular Matrices. Norms. (a) An array
of “scalars” aik taken from a commutative field (Sec. 12.3-1) F is called a (rectangular) m X n matrix over the field F whenever one of the “matrix operations” defined in Sec. 13.2-2 is to be used. The elements aik are called matrix elements; the matrix element aik is situated in the ith row and in the kth column of the matrix (1). m is the number of rows, and n is the number of columns. A matrix is finite if and only if it has a finite number of rows and a finite number of columns; otherwise the matrix is infinite.
A finite or infinite matrix (1) over the field of complex numbers is bounded if and only if it has a finite bound (norm) defined typically as
Table 13.2-1. Some Matrix Norms (n and/or m can be infinite; see also Secs. 12.5-5 and 14.4-1)
(see also Table 13.2-1 and Sec. 14.4-1). A finite matrix over the field of complex numbers is bounded if and only if all its matrix elements are bounded.
Throughout this handbook all matrices will be understood to be bounded matrices over the field of complex numbers, unless the contrary is specifically stated. A matrix A ≡ [aik] is real if and only if all matrix elements aik are real numbers.
(b) n X 1 matrices are column matrices, and 1 X n matrices are row matrices. The following notation will be used:
(see also Secs. 13.3-1, 14.5-1, and 14.5-2).
(c) An n X n matrix is called a square matrix of order n. A square matrix A ≡ [aik] is
Triangular (superdiagonal) if and only if i > k implies aik = 0
Strictly triangular if and only if i ≥ k implies aik = 0
Diagonal if and only if i ≠ k implies aik = 0
Monomial if and only if each row and column has one and only one element different from zero
13.2-2. Basic Operations. Operations on matrices are defined in terms of operations on the matrix elements.
In every matrix product AB the number n of columns of A must match the number of rows of B (A and B must be conformable). If AB exists, BA also exists if and only if the number of columns of B equals the number of rows of A; in particular, both products exist whenever A and B are square matrices of equal order. In general BA ≠ AB (see also Sec. 13.4-4b). Note
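As a numerical illustration of conformability and noncommutativity (a sketch using NumPy, which is external to the handbook; the matrices are arbitrary examples):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])      # 2 x 3
B = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])          # 3 x 2: conformable with A, so AB exists

AB = A @ B                        # a 2 x 2 product
print(AB.shape)                   # (2, 2)

# For square matrices of equal order both products exist,
# but matrix multiplication is in general noncommutative:
P = np.array([[0., 1.],
              [0., 0.]])
Q = np.array([[0., 0.],
              [1., 0.]])
print(np.allclose(P @ Q, Q @ P))  # False: PQ != QP
```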
13.2-3. Identities and Inverses. Note the following definitions:
Products and reciprocals of nonsingular matrices are nonsingular; if A and B are nonsingular, and α ≠ 0,
(see also Sec. 14.3-5).
A given matrix A is nonsingular if it has a unique left or right inverse, or equal left and right inverses; the mere existence of one or more left and/or right inverses is not sufficient (see also Sec. 14.3-5). A given n X n matrix is nonsingular if and only if it can be partitioned (Sec. 13.2-8) into a linearly independent set of row or column matrices.
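A small NumPy check of these definitions (NumPy and the example matrices are illustrative, not part of the handbook): for a nonsingular square matrix the computed inverse is simultaneously a left and a right inverse, while a matrix with linearly dependent rows has reduced rank and no inverse.

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 1.]])                   # det A = 1, so A is nonsingular
A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(2)))   # True: right inverse
print(np.allclose(A_inv @ A, np.eye(2)))   # True: the same matrix is a left inverse

# A singular matrix (linearly dependent rows) has no inverse:
S = np.array([[1., 2.],
              [2., 4.]])
print(np.linalg.matrix_rank(S))            # 1 < 2: rows linearly dependent
```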
13.2-4. Integral Powers of Square Matrices. One defines A0 = I, A1 = A, A2 = AA, A3 = AAA, . . . and, if A is nonsingular, A−p = (A−1)p (p = 1, 2, . . .).
The ordinary rules for operations with exponents apply (see also Sec. 14.3-6).
13.2-5. Matrices as Building Blocks of Mathematical Models. The definitions of Secs. 13.2-2 and 13.2-3 (constructive definitions, Sec. 12.1-1) imply the following results:
1. Given any pair of (finite) positive integers m, n, the class of all m X n matrices over a field F is an mn-dimensional vector space over F (Secs. 12.4-1 and 14.2-4). In particular, n-element column or row matrices form n-dimensional vector spaces (see also Sec. 14.5-2).
2. The class of all square matrices of a given (finite) order n over a field F is a linear algebra of order n2 over F; singular matrices are zero divisors (Sec. 12.4-2).
3. The class of all nonsingular square matrices of a given (finite) order n over a field F constitutes a multiplicative group (Sec. 12.2-1) and, together with the n X n null matrix, a division algebra of order n2 over the field F (Sec. 12.4-2).
Analogous theorems apply to bounded infinite matrices over the field of real or complex numbers.
13.2-6. Multiplication by Special Matrices. Permutation Matrices. Given any n X n matrix A,
1. Premultiplication of A by the matrix obtained on replacing the 1 in the ith row of the n X n identity matrix by a complex number α multiplies all elements in the ith row of A by α.
2. Premultiplication of A by the matrix obtained on replacing the nondiagonal element in the ith row and kth column of the n X n identity matrix by 1 adds the kth row of A to the ith row of A.
3. Premultiplication of A by the permutation matrix formed through a permutation of the rows of the n X n identity matrix results in an identical permutation of the rows of A.
The transposition relations of Sec. 13.3-1 yield three analogous theorems describing operations on the columns of a matrix as results of postmultiplication by special matrices.
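The three premultiplication rules, and the corresponding column operations by postmultiplication, can be checked numerically (a NumPy sketch; the 2 X 2 matrices are illustrative examples):

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])

# 1. Replace the 1 in row 0 of the identity matrix by alpha = 5.
E1 = np.array([[5., 0.],
               [0., 1.]])
# 2. Set the nondiagonal (0, 1) element of the identity matrix to 1.
E2 = np.array([[1., 1.],
               [0., 1.]])
# 3. Permutation matrix: permute the rows of the identity matrix.
P = np.array([[0., 1.],
              [1., 0.]])

print(E1 @ A)   # row 0 of A multiplied by 5
print(E2 @ A)   # row 1 of A added to row 0
print(P @ A)    # rows of A swapped
print(A @ P)    # postmultiplication permutes columns instead
```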
13.2-7. Rank, Trace, and Determinant of a Matrix (see also Sec. 14.3-2). The rank of a given matrix is the largest number r such that at least one rth-order determinant (Sec. 1.5-1) formed from the matrix by deleting rows and/or columns is different from zero. An m X n matrix A is nonsingular if and only if m = n = r, i.e., if and only if A is square and det (A) ≠ 0 (Sec. 13.2-3).
The trace (spur) of an n X n matrix A = [aik] is the sum
Tr (A) = a11 + a22 + · · · + ann
of the diagonal terms. If n = ∞, this sum converges whenever A is bounded (Sec. 13.2-1a). For finite matrices A, B
Tr (A + B) = Tr (A) + Tr (B)     Tr (AB) = Tr (BA)
The determinant det (A) of an infinite square matrix A ≡ [aik] is defined as
if the limit exists. The theorems of Secs. 1.5-1 to 1.5-6 apply to determinants defined in this manner whenever both of the series concerned converge.
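For finite matrices, rank, trace, and determinant are directly computable; a NumPy illustration (the matrices are arbitrary examples, and the trace identity Tr (AB) = Tr (BA) holds for any finite conformable pair):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])            # rows are linearly dependent
print(np.linalg.matrix_rank(A))         # 2: largest nonvanishing minor is 2 x 2
print(np.trace(A))                      # 15.0 = 1 + 5 + 9
print(np.linalg.det(A))                 # 0 to rounding error: A is singular

B = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [1., 0., 0.]])
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))   # True
```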
13.2-8. Partitioning of Matrices. A matrix having more than one row and column may be partitioned into smaller rectangular submatrices by lines drawn between rows and/or columns. One can multiply two similarly partitioned n X n matrices A and B by entering their rectangular submatrices as elements in the ordinary matrix-product formula (Sec. 13.2-2); the product elements thus obtained are the submatrices of the n X n matrix AB. This theorem is helpful in numerical computations (Sec. 20.3-4).
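The blockwise-multiplication theorem can be verified directly (a NumPy sketch with a random 4 X 4 example partitioned into 2 X 2 blocks; NumPy is not part of the handbook):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# Partition both matrices into 2 x 2 submatrices.
A11, A12 = A[:2, :2], A[:2, 2:]
A21, A22 = A[2:, :2], A[2:, 2:]
B11, B12 = B[:2, :2], B[:2, 2:]
B21, B22 = B[2:, :2], B[2:, 2:]

# Multiply the submatrices exactly as if they were scalar elements.
C = np.block([[A11 @ B11 + A12 @ B21, A11 @ B12 + A12 @ B22],
              [A21 @ B11 + A22 @ B21, A21 @ B12 + A22 @ B22]])

print(np.allclose(C, A @ B))   # True: the blockwise product equals AB
```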
13.2-9. Step Matrices. Direct Sums (see also Secs. 13.4-6, 14.8-2, and 14.9-2). A step matrix is a square matrix A which can be partitioned into a diagonal matrix (Sec. 13.2-lc) of square submatrices A1, A 2, . . . , so that
A step matrix is often referred to as the direct sum
of the square matrices along its diagonal (see also Sec. 12.7-5). Note that Ap = A1p ⊕ A2p ⊕ … for p = 0, 1, 2, . . . (and also for p = −1, −2, . . . if A is nonsingular), and that det (A) = det (A1) det (A2) · · · and Tr (A) = Tr (A1) + Tr (A2) + · · · whenever these quantities exist.
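A direct-sum check (a sketch using `scipy.linalg.block_diag` to build the step matrix; SciPy and the block values are illustrative assumptions):

```python
import numpy as np
from scipy.linalg import block_diag

A1 = np.array([[2., 1.],
               [0., 2.]])
A2 = np.array([[3.]])
A = block_diag(A1, A2)          # the direct sum A1 (+) A2, a 3 x 3 step matrix

# A^p is the direct sum of the corresponding powers of the diagonal blocks.
p = 3
lhs = np.linalg.matrix_power(A, p)
rhs = block_diag(np.linalg.matrix_power(A1, p),
                 np.linalg.matrix_power(A2, p))
print(np.allclose(lhs, rhs))    # True

# det decomposes into the product of the block determinants.
print(np.isclose(np.linalg.det(A),
                 np.linalg.det(A1) * np.linalg.det(A2)))   # True
```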
13.2-10. Direct Product (Outer Product) of Matrices (see also Sec. 12.7-3). The direct product (outer product) A ⊗ B of the m X n matrix A ≡ [aik] and the m′ X n′ matrix B ≡ [bi′k′] is the mm′ X nn′ matrix
where j enumerates the pairs (i, i′) in the sequence (1, 1), (1, 2), . . . , (1, m′), (2, 1), (2, 2), . . . (m, m′) and where h enumerates the pairs (k, k′) in a similar manner. Note
13.2-11. Convergence and Differentiation. (a) A sequence of matrices S0, S1, S2, … each having the same number of rows and the same number of columns is said to converge to a bounded matrix S if and only if every matrix element of Sn converges to the corresponding element of S as n → ∞. One similarly defines limits of matrix functions of a scalar parameter t (see also Sec. 12.5-3).
(b) If the matrix elements of a matrix A ≡ [aik] are differentiable functions aik(t) of a scalar parameter t, one writes dA/dt ≡ [daik/dt].
Partial differentiation and integration of matrices are defined in an analogous manner.
13.2-12. Functions of Matrices. Matrix polynomials and algebraic functions of matrices are defined in terms of the elementary matrix operations. The Cayley-Hamilton theorem (Sec. 13.4-7) reduces every convergent series in powers of an n X n matrix A (analytic function of the matrix A) to an nth-degree polynomial in A.
13.3. MATRICES WITH SPECIAL SYMMETRY PROPERTIES
13.3-1. Transpose and Hermitian Conjugate of a Matrix (see also Secs. 14.4-3 and 14.4-6a). Given any m X n matrix A ≡ [aik] over the field of complex numbers,
The transpose (transposed matrix) of A is the n X m matrix à ≡ [aki].
The hermitian conjugate (adjoint, conjugate, associate matrix)* of A is the n X m matrix A† ≡ [a*ki].
Note the following relations:
A, Ã, and A† are necessarily of equal rank. For every square matrix A,
13.3-2. Matrices with Special Symmetry Properties (see also Secs. 14.4-4 to 14.4-6). A square matrix A ≡ [aik] is
Symmetric if and only if Ã = A, i.e., aik = aki
Skew-symmetric (antisymmetric) if and only if Ã = −A, i.e., aik = −aki
Hermitian (self-adjoint, self-conjugate) if and only if A† = A, i.e., aik = a*ki
Skew-hermitian (alternating) if and only if A† = −A, i.e., aik = −a*ki
Orthogonal if and only if ÃA = AÃ = I, i.e., Ã = A−1
Unitary if and only if A†A = AA† = I, i.e., A† = A−1
* a*ki is the complex conjugate of aki (Sec. 1.3-1). The terms adjoint, conjugate, and associate(d) are used with different meanings (see also Secs. 12.2-5, 14.4-3, 16.7-1, and 16.7-2); some authors refer to the matrix A−1 det (A) (matrix of cofactors with transposed indices, adjugate of A) as the adjoint of A. The symbols used for the transpose Ã, the hermitian conjugate A†, and the complex conjugate A* ≡ [a*ik] ≡ (Ã)† of a given matrix A also vary; some authors denote the hermitian conjugate of A by A*.
A hermitian matrix is symmetric, a skew-hermitian matrix is skew-symmetric, and a unitary matrix is orthogonal if and only if all its matrix elements are real. The diagonal elements of hermitian, skew-hermitian, and skew-symmetric matrices are, respectively, real, pure imaginary, and equal to zero.
The determinant of a hermitian matrix is real. The determinant of a skew-hermitian n X n matrix is real if n is even, and pure imaginary if n is odd. The determinant of a skew-symmetric matrix of odd order is equal to zero. The determinant of a unitary matrix has the absolute value 1, and the determinant of an orthogonal matrix equals either + 1 or -1.
13.3-3. Combination Rules (see also Sec. 14.4-7). (a) If the matrix A is symmetric the same is true for Ap (p = 0, 1, 2, . . .), A−1, T̃AT, and αA.
Given any nonsingular matrix T, T̃AT is symmetric if and only if A is symmetric; hence for any orthogonal matrix T, T−1AT is symmetric if and only if the same is true for A.
If A and B are symmetric the same is true for A + B. The product AB of two symmetric matrices A and B is symmetric if and only if BA = AB.
(b) If the matrix A is hermitian the same is true for Ap (p = 0, 1, 2, . . .), A−1, and T†AT, and for αA if α is real.
Given any nonsingular matrix T, T†AT is hermitian if and only if A is hermitian; hence for any unitary matrix T, T−1AT is hermitian if and only if the same is true for A.
If A and B are hermitian the same is true for A + B. The product AB of two hermitian matrices A and B is hermitian if and only if BA = AB.
(c) If A is an orthogonal matrix the same is true for Ap (p = 0, 1, 2, . . .), A−1, Ã, and −A. If A and B are orthogonal the same is true for AB.
If A is a unitary matrix the same is true for Ap (p = 0, 1, 2, . . .), A−1, and A†, and for αA if |α| = 1. If A and B are unitary the same is true for AB.
13.3-4. Decomposition Theorems. Normal Matrices (see also Secs. 13.4-4a and 14.4-8). (a) For every square matrix over the field of complex numbers
1. ½(A + Ã) = S1 is a symmetric matrix, and ½(A − Ã) = S2 is skew-symmetric. A = S1 + S2 is the (unique) decomposition of the given matrix A into a symmetric part and a skew-symmetric part.
2. ½(A + A†) = H1 and (1/2i)(A − A†) = H2 are hermitian matrices. A = H1 + iH2 is the (unique) cartesian decomposition of the given matrix A into a hermitian part and a skew-hermitian part (comparable to the cartesian decomposition of complex numbers into real and imaginary parts, Sec. 1.3-1).
3. A†A is hermitian (and nonnegative, Sec. 13.5-3), and there exists a polar decomposition A = QU of A into a nonnegative hermitian factor Q and a unitary factor U. Q is uniquely defined by
Q2 = A†A
and U is uniquely defined if and only if A is nonsingular (compare this with Sec. 1.3-2).
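Both decompositions can be computed numerically; a sketch using SciPy (the complex matrix is an arbitrary example, and note that conventions for the polar factor order differ: `scipy.linalg.polar` returns A = UQ by default, for which Q² = A†A, while the A = QU ordering corresponds to `side='left'` and Q² = AA†):

```python
import numpy as np
from scipy.linalg import polar

A = np.array([[1. + 2.j, 3.],
              [0., 4. - 1.j]])
Ah = A.conj().T                          # hermitian conjugate A†

# Cartesian decomposition A = H1 + i*H2 with H1, H2 hermitian.
H1 = (A + Ah) / 2
H2 = (A - Ah) / 2j
print(np.allclose(H1, H1.conj().T))      # True: H1 hermitian
print(np.allclose(H2, H2.conj().T))      # True: H2 hermitian
print(np.allclose(A, H1 + 1j * H2))      # True

# Polar decomposition A = U Q (scipy's default ordering).
U, Q = polar(A)
print(np.allclose(A, U @ Q))                     # True
print(np.allclose(U @ U.conj().T, np.eye(2)))    # True: U unitary
print(np.allclose(Q @ Q, Ah @ A))                # True: Q**2 = A†A
```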
(b) A square matrix A is a normal matrix if and only if A†A = AA†, or, equivalently, if and only if H2H1 = H1H2.
13.4. EQUIVALENT MATRICES. EIGENVALUES,
DIAGONALIZATION, AND RELATED TOPICS
13.4-1. Equivalent and Similar Matrices (see also Secs. 12.2-5a, 13.5-4, 13.5-5, and 14.6-2). (a) Two rectangular matrices A and B are equivalent if and only if there exist two nonsingular matrices S, T such that A and B are related by the transformation
B = SAT     (1)
Every matrix B equivalent to a given matrix A has the same number of rows and the same number of columns as A and can be obtained through successive application of the six operations defined in Sec. 13.2-6. Equivalent matrices are of equal rank; two m X n matrices of equal rank are necessarily equivalent.
A and B are equivalent whenever B = QA or B = AQ, where Q is a nonsingular matrix.
(b) In particular, two square matrices A and Ā are similar (sometimes simply called equivalent) if and only if there exists a nonsingular matrix T (transformation matrix) such that A and Ā are related by the similarity transformation (collineatory transformation)
Ā = T−1AT     (2)
A, Ā, and T are necessarily square matrices of equal order.
Every similarity transformation (2) preserves the results of matrix addition, matrix multiplication, and multiplication by scalars (see also Sec. 12.1-6). Two similar matrices have the same rank, the same trace, and the same determinant (see also Sec. 13.4-2a).
(c) Two square matrices A and Ā related by a transformation Ā = T̃AT, where T is nonsingular, are congruent. Two square matrices A and Ā related by Ā = T†AT, where T is nonsingular, are conjunctive. In either case A, Ā, and T are necessarily square matrices of equal order.
(d) Matrix equivalence, similarity, congruence, and conjunctivity are equivalence relations; each defines a partition of the class of matrices under consideration (Sec. 12.1-3b). In most applications, two or more similar matrices constitute different representations of a linear transformation (linear operator, dyadic) A (Sec. 14.6-2). It is, then, of interest (1) to find similarity transformations yielding particularly simple representations of A (transformation of a matrix to diagonal or other “canonical” form) and (2) to find properties of matrices which are invariant with respect to similarity transformations and are thus common to each class of similar matrices (e.g., rank, trace, determinant, eigenvalues).
13.4-2. Eigenvalues and Spectra of Square Matrices (see also Sec. 14.8-3). (a) The eigenvalues (proper values, characteristic values, characteristic roots, latent roots) of a (finite or infinite) square matrix A ≡ [aik] are those values of the scalar parameter λ for which the matrix A − λI is singular. The spectrum (eigenvalue spectrum) of the matrix A is the set of all its eigenvalues.
The eigenvalues of a square matrix A can be defined directly as the eigenvalues of a linear operator represented by A (Sec. 14.8-3); similar matrices have identical spectra.
Note that the spectrum of an infinite matrix may or may not be a discrete set: e.g., every real number between −π and π is an eigenvalue of the infinite matrix . Some authors restrict the term eigenvalue to values of λ in the discrete spectrum (see also Sec. 14.8-3d).
(b) Given a normal matrix A (A†A = AA†, Sec. 13.3-4b) with eigenvalues λ, A† has the eigenvalues λ*, H1 = ½(A + A†) has the eigenvalues Re (λ), and H2 = (1/2i)(A − A†) has the eigenvalues Im (λ) (see also Sec. 13.3-4a).
All eigenvalues of a given normal matrix are real if and only if the matrix is similar to a hermitian matrix (see also Sec. 14.8-4). In particular, all eigenvalues of hermitian and real symmetric matrices are real. All eigenvalues of a unitary matrix have absolute values equal to 1; in particular, real eigenvalues of real orthogonal matrices equal +1 or −1, and their complex eigenvalues occur in pairs e±iφ. A square matrix is nonsingular if and only if all its eigenvalues are different from zero.
(c) Refer to Secs. 13.4-5a, 14.8-5, and 20.3-5 for the numerical calculation of eigenvalues, and to Sec. 14.8-9 for the calculation of bounds for the eigenvalues of a given matrix.
13.4-3. Transformation of a Square Matrix to Triangular Form. Algebraic Multiplicity of an Eigenvalue (see also Sec. 14.8-3e). (a) Given any square matrix A having a purely discrete eigenvalue spectrum, there exists a similarity transformation Ā = T−1AT such that Ā is triangular (Sec. 13.2-1c). The diagonal elements of every triangular matrix similar to A are eigenvalues of A, and each eigenvalue λj of A occurs exactly m′j ≥ 1 times as a diagonal element; m′j is called the algebraic multiplicity of the eigenvalue λj.
NOTE: m′j is not necessarily equal to the degree of degeneracy mj defined in Sec. 14.8-3b. In the case of infinite matrices, one or more of the m′j may be infinite.
(b) If Tr (A) exists, and A has a purely discrete eigenvalue spectrum, Tr (A) equals the sum of all eigenvalues, each counted a number of times equal to its algebraic multiplicity. If det (A) exists (Sec. 13.2-7), it equals the similarly computed product of the eigenvalues (see also Sec. 13.4-5).
13.4-4. Diagonalization of Matrices (see also Sec. 14.8-5). (a) A square matrix A can be diagonalized by a similarity transformation (i.e., there exists a nonsingular transformation matrix T such that Ā = T−1AT is diagonal, Sec. 13.2-1c) if and only if A has a purely discrete eigenvalue spectrum and is similar to a normal matrix (Sec. 13.3-4b). More specifically, a given matrix A having a purely discrete eigenvalue spectrum can be diagonalized by a similarity transformation with unitary transformation matrix T (or with a real orthogonal transformation matrix if A is real) if and only if A is a normal matrix (A†A = AA†, Sec. 13.3-4b). In each case the diagonal elements of Ā are eigenvalues of A; every eigenvalue occurs a number of times equal to its algebraic multiplicity. Refer to Sec. 14.8-6 for a procedure yielding transformation matrices T with the desired properties.
SPECIAL CASES OF DIAGONALIZABLE MATRICES. Hermitian and unitary matrices (and thus real and symmetric or orthogonal matrices) are special instances of normal matrices. Every matrix having only discrete eigenvalues of algebraic multiplicity 1 is similar to a normal matrix.
(b) Two hermitian matrices A and B with purely discrete eigenvalue spectra can be diagonalized by the same similarity transformation (and, in particular, by the same similarity transformation with unitary transformation matrix T) if and only if BA = AB (see also Secs. 13.5-5 and 14.8-6e).
(c) Given any hermitian matrix A having a purely discrete eigenvalue spectrum, there exists a nonsingular matrix T such that Ā = T†AT is diagonal; the diagonal elements of Ā are then real. In particular, there exists a nonsingular matrix T such that the diagonal elements of Ā take only the values +1, −1, and/or 0.
Given any real symmetric matrix A having a purely discrete eigenvalue spectrum, there exists a real nonsingular matrix T such that Ā = T̃AT is diagonal. In particular, there exists a real nonsingular matrix T such that the diagonal elements of Ā take only the values +1, −1, and/or 0.
Matrices T with the desired properties are obtained from T = DU, where U is a unitary (or real orthogonal) matrix such that U−1AU is diagonal, and D is a real diagonal matrix; U is found by the method of Sec. 14.8-6 (see also Sec. 13.5-4d).
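For a real symmetric (hence normal) matrix, the diagonalizing orthogonal matrix is delivered directly by an eigendecomposition; a NumPy sketch (the example matrix is arbitrary):

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])                   # real symmetric, hence normal

eigvals, T = np.linalg.eigh(A)             # columns of T: orthonormal eigenvectors
D = T.T @ A @ T                            # T^{-1} = T~ since T is orthogonal

print(np.allclose(T.T @ T, np.eye(2)))     # True: T is orthogonal
print(np.allclose(D, np.diag(eigvals)))    # True: similarity transform is diagonal
print(eigvals)                             # [1. 3.]: the eigenvalues of A
```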
13.4-5. Eigenvalues and Characteristic Equation of a Finite Matrix. (a) The eigenvalue spectrum of a finite n X n matrix A ≡ [aik] is identical with the set of roots λ of the nth-degree algebraic equation
det (A − λI) = 0     (characteristic equation of the matrix A)     (5)
The multiplicity (order, Sec. 1.6-2) of each root λj equals its algebraic multiplicity m′j as an eigenvalue, so that m′1 + m′2 + … = n.
Similar n X n matrices have identical characteristic equations; the coefficients in Eq. (5) are symmetric functions of the n roots λ1, λ2, . . . , λn (Sec. 1.6-4). In particular, the coefficient of λn−1 and the constant term in Eq. (5) are, respectively,
The coefficient of λn−r equals (−1)r times the sum of the r-rowed principal minors (Sec. 1.5-4) of det (A).
(b) (See also Sec. 14.8-3.) Given a finite square matrix A with eigenvalues λj, αA has the eigenvalues αλj, and Ap has the eigenvalues λjp (p = 0, 1, 2, . . . ; p = 0, ±1, ±2, . . . if A is nonsingular). Every polynomial or analytic function f(A) (Sec. 13.2-12) has the eigenvalues f(λj). A matrix power series
converges (Sec. 13.2-11a) if and only if the corresponding scalar power series converges for every eigenvalue λj of A. Given two finite square matrices A and B having the respective eigenvalues λj and μk, the eigenvalue spectrum of the direct product A ⊗ B (Sec. 13.2-10) is the set of products λjμk.
13.4-6. Eigenvalues of Step Matrices (Direct Sums, Sec. 13.2-9). The spectrum of a (finite or infinite) step matrix (direct sum) A = A1 ⊕ A2 ⊕ … is the union of the spectra of A1, A2, … ; algebraic multiplicities add. The contribution of each finite submatrix Ak may be obtained with the aid of its characteristic equation.
13.4-7. The Cayley-Hamilton Theorem and Related Topics. (a) Every finite square matrix A satisfies its own characteristic equation (Sec. 13.4-5a), i.e., if the characteristic equation is λn + c1λn−1 + · · · + cn = 0, then An + c1An−1 + · · · + cnI = 0.
(b) The Cayley-Hamilton theorem permits one to represent every integral power, and hence every analytic function, of a finite n X n matrix A as a linear function of any n distinct positive integral powers of A (see also Sec. 13.2-12). Specifically,
where Δ is the Vandermonde determinant (Sec. 1.6-5) det [λik−1], and Δj is the determinant obtained on substitution of f(λ1), f(λ2), . . . , f(λn) for λ1j, λ2j, . . . , λnj in Δ.
If the eigenvalues λ1, λ2, . . . , λn of the matrix A are distinct, then Eq. (8) can be rewritten as
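The Cayley-Hamilton theorem, and the resulting reduction of higher powers, can be checked on a small example (a NumPy sketch; the 2 X 2 matrix is arbitrary, and `np.poly` returns the characteristic-polynomial coefficients):

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])

# Characteristic polynomial: lambda^2 - 5*lambda - 2.
coeffs = np.poly(A)                     # [1., -5., -2.]
print(coeffs)

# Cayley-Hamilton: A satisfies its own characteristic equation.
n = A.shape[0]
p_of_A = sum(c * np.linalg.matrix_power(A, n - k)
             for k, c in enumerate(coeffs))
print(np.allclose(p_of_A, 0))           # True: A^2 - 5A - 2I = 0

# Hence A^2 = 5A + 2I, and every higher power is linear in A and I.
print(np.allclose(A @ A, 5 * A + 2 * np.eye(2)))   # True
```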
13.5. QUADRATIC AND HERMITIAN FORMS
13.5-1. Bilinear Forms. A bilinear form in the 2n real or complex variables ξ1, ξ2, . . . , ξn, η1, η2, . . . , ηn is a homogeneous polynomial of the second degree (Sec. 1.4-3a)
13.5-2. Quadratic Forms. A (homogeneous) quadratic form in n real or complex variables* ξ1, ξ2, . . . , ξn is a polynomial
* The theory of Secs. 13.5-2 to 13.5-6 also applies to quadratic and hermitian forms in a countably infinite set of variables ξ1, ξ2, . . . (n = ∞), and to the corresponding infinite matrices, provided that the infinite sums concerned converge.
where A1 = ½(A + Ã) is the “symmetric part” (Sec. 13.3-4) of the matrix A ≡ [aik]. The expression (2) vanishes identically if and only if A is skew-symmetric (aki = −aik, Sec. 13.3-2). A quadratic form (2) is symmetric if and only if the matrix A ≡ [aik] is symmetric (aki = aik, Sec. 13.3-2) and real if and only if A (and thus every aik, Sec. 13.2-1a) is real.
A real symmetric quadratic form (2), and also the corresponding real symmetric matrix A, is called positive definite, negative definite, nonnegative, or nonpositive if and only if, respectively, x̃Ax > 0, x̃Ax < 0, x̃Ax ≥ 0, or x̃Ax ≤ 0 for every set of real numbers ξ1, ξ2, . . . , ξn not all equal to zero. All other real symmetric quadratic forms are indefinite (i.e., the sign of x̃Ax depends on ξ1, ξ2, . . . , ξn) or identically zero.
A real symmetric quadratic form (2), and also the corresponding real symmetric matrix A, is positive semidefinite or negative semidefinite if and only if it is, respectively, nonnegative or nonpositive, and x̃Ax = 0 for some set of real numbers ξ1, ξ2, . . . , ξn not all equal to zero.
13.5-3. Hermitian Forms. A hermitian form in n real or complex variables* ξ1, ξ2, . . . , ξn is a polynomial
such that the matrix A ≡ [aik] is hermitian (aik = a*ki, Sec. 13.3-2). A form (3) is real for every set of complex numbers ξ1, ξ2, . . . , ξn if and only if A is hermitian (see also Sec. 14.4-4).
A hermitian form (3), and also the corresponding hermitian matrix A ≡ [aik], is called positive definite, negative definite, nonnegative, or nonpositive if and only if, respectively, x†Ax > 0, x†Ax < 0, x†Ax ≥ 0, or x†Ax ≤ 0 for every set of complex numbers ξ1, ξ2, . . . , ξn not all equal to zero. All other hermitian forms (or hermitian matrices) are indefinite (i.e., the sign of x†Ax depends on ξ1, ξ2, . . . , ξn) or identically zero.
A hermitian form (3), and also the corresponding hermitian matrix A, is positive semidefinite or negative semidefinite if and only if it is, respectively, nonnegative or nonpositive, and x†Ax = 0 for some set of complex numbers ξ1, ξ2, . . . , ξn not all equal to zero.
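These sign classes correspond exactly to the signs of the (real) eigenvalues of the hermitian matrix (Sec. 13.5-6a); a small classifier sketch in NumPy (the function name, tolerance, and example matrices are illustrative assumptions):

```python
import numpy as np

def classify(A, tol=1e-12):
    """Classify a hermitian matrix by the signs of its (real) eigenvalues."""
    lam = np.linalg.eigvalsh(A)
    if np.all(lam > tol):
        return "positive definite"
    if np.all(lam < -tol):
        return "negative definite"
    if np.all(lam > -tol):
        return "nonnegative"
    if np.all(lam < tol):
        return "nonpositive"
    return "indefinite"

print(classify(np.array([[2., 1.], [1., 2.]])))    # positive definite
print(classify(np.array([[1., 0.], [0., -1.]])))   # indefinite
print(classify(np.array([[1., 1.], [1., 1.]])))    # nonnegative (eigenvalues 2, 0)
```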
13.5-4. Transformation of Quadratic and Hermitian Forms. Transformation to Diagonal Form. (a) A linear substitution
* See footnote to Sec. 13.5-2.
(nonsingular homogeneous linear transformation of coordinates or vector components, “alias” point of view, Sec. 14.6-1) transforms every quadratic form (2) into a quadratic form in the new variables:
Ā is symmetric if A is symmetric; Ā is real if A and T are real.
The linear substitution (4) transforms every hermitian form (3) into a new hermitian form:
(b) For every given real symmetric quadratic form (2) there exist linear transformations (4) with real coefficients tik such that the transformed matrix Ā in Eq. (5) is diagonal (see also Sec. 13.4-4c), so that
Similarly, for every given hermitian form (3) there exist linear transformations (4) such that
The number r of nonzero coefficients āii in Eq. (7) or (8) is independent of the particular diagonalizing transformation used and equals the rank of the given matrix A; r is called the rank of the given quadratic or hermitian form. For any given real symmetric quadratic form (2) the difference between the respective numbers of positive and negative coefficients āii in Eq. (7) is independent of the particular diagonalizing transformation used (Jacobi-Sylvester Law of Inertia); this number is referred to as the signature of the given quadratic form.
(c) In particular, there exists a real orthogonal diagonalizing matrix T for every real symmetric quadratic form (2), and a unitary diagonalizing matrix T for every hermitian form (3) (see also Sec. 13.4-4). The resulting principal-axes transformation (transformation to normal coordinates; see also Sec. 9.4-8) yields the normal form of the given quadratic or hermitian form, viz.,
where the set of real numbers λi is the eigenvalue spectrum of the given matrix A (Sec. 13.4-2).
(d) The additional transformation ξ̄i → ξ̄i/√|λi| (applied for each λi ≠ 0) reduces the expressions (9) to their respective canonical forms
where each coefficient equals +1, −1, or 0 according as the corresponding eigenvalue λi is positive, negative, or zero.
(e) The calculation of suitable diagonalizing transformation matrices is discussed in Sec. 14.8-6.
13.5-5. Simultaneous Diagonalization of Two Quadratic or Hermitian Forms (see also Secs. 13.4-4b and 14.8-7). Given two real symmetric quadratic forms x̃Ax, x̃Bx, where x̃Bx is positive definite, it is possible to find a real transformation (4) which diagonalizes x̃Ax and x̃Bx simultaneously. In particular, there exists a real transformation (4) to new coordinates ξ̄1, ξ̄2, . . . , ξ̄n such that
x̃Ax = μ1ξ̄1² + μ2ξ̄2² + · · · + μnξ̄n²     x̃Bx = ξ̄1² + ξ̄2² + · · · + ξ̄n²
Similarly, given two hermitian forms x†Ax, x†Bx, where x†Bx is positive definite, there exists a transformation (4) to new coordinates ξ̄1, ξ̄2, . . . , ξ̄n such that
x†Ax = μ1|ξ̄1|² + μ2|ξ̄2|² + · · · + μn|ξ̄n|²     x†Bx = |ξ̄1|² + |ξ̄2|² + · · · + |ξ̄n|²
In either case, the set of real numbers μ1, μ2, . . . , μn is the eigenvalue spectrum of the matrix B−1A, obtainable as the set of roots of the nth-degree algebraic equation
det (A − μB) = 0
The desired transformation matrix T is obtained by the method of Sec. 14.8-7b, or from T = UT0, where T0 is the matrix reducing x̃Bx or x†Bx to canonical form (Sec. 13.5-4d), and U is a unitary matrix which diagonalizes x̃Ax or x†Ax in the new coordinates (Sec. 13.5-4c).
NOTE: Two real symmetric quadratic forms x̃Ax, x̃Bx or two hermitian forms x†Ax, x†Bx can be diagonalized simultaneously by the same unitary transformation matrix T if and only if BA = AB (see also Secs. 13.4-4b and 14.8-6e).
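Numerically, the simultaneous diagonalization is the generalized symmetric eigenproblem, which `scipy.linalg.eigh` solves directly (a sketch; SciPy and the 2 X 2 example matrices are illustrative assumptions):

```python
import numpy as np
from scipy.linalg import eigh

A = np.array([[2., 1.],
              [1., 3.]])
B = np.array([[2., 0.],
              [0., 1.]])        # positive definite

# Generalized eigenproblem A v = mu B v: the mu are the roots of
# det(A - mu*B) = 0, i.e. the eigenvalues of B^{-1} A.
mu, T = eigh(A, B)              # eigenvectors are B-orthonormal

# The same real matrix T diagonalizes both forms:
print(np.allclose(T.T @ B @ T, np.eye(2)))     # True: x~Bx becomes a sum of squares
print(np.allclose(T.T @ A @ T, np.diag(mu)))   # True: x~Ax becomes sum of mu_i terms
```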
13.5-6. Tests for Positive Definiteness, Nonnegativeness, etc. (a) A real symmetric quadratic form or hermitian form is positive definite, negative definite, nonnegative, nonpositive, indefinite, or identically zero (Secs. 13.5-2 and 13.5-3) if and only if the (necessarily real) eigenvalues λj of the matrix A ≡ [aik] are, respectively, all positive, all negative, all nonnegative, all nonpositive, of different signs, or all equal to zero.
A real symmetric quadratic form or hermitian form is positive semidefinite or negative semidefinite if and only if it is, respectively, nonnegative or nonpositive, and at least one eigenvalue λj of the matrix A ≡ [aik] equals zero.
Note that the λj are the roots of the characteristic equation (13.4-5); the signs of these roots can often be investigated by one of the methods of Sec. 1.6-6.
(b) A hermitian matrix A ≡ [aik] (and the corresponding hermitian form or real symmetric quadratic form) is positive definite if and only if every one of the n determinants formed from the first k rows and columns of A (k = 1, 2, . . . , n), i.e., every one of the leading principal minors of A, is positive (Sylvester's Criterion).
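Sylvester's criterion is easy to apply mechanically; a NumPy sketch for the real symmetric case (the helper names and example matrices are illustrative):

```python
import numpy as np

def leading_principal_minors(A):
    """Determinants of the upper-left 1x1, 2x2, ..., nxn submatrices of A."""
    return [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

def is_positive_definite(A):
    """Sylvester's criterion: all leading principal minors positive."""
    return all(m > 0 for m in leading_principal_minors(A))

A = np.array([[2., -1., 0.],
              [-1., 2., -1.],
              [0., -1., 2.]])
print(leading_principal_minors(A))     # approximately [2.0, 3.0, 4.0]
print(is_positive_definite(A))         # True

B = np.array([[1., 2.],
              [2., 1.]])
print(is_positive_definite(B))         # False: minors are 1 and -3
```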
(c) A hermitian matrix A (and the corresponding hermitian form or real symmetric quadratic form) is negative definite, nonpositive, or negative semidefinite if and only if —A is, respectively, positive definite, nonnegative, or positive semidefinite.
(d) A matrix A is a nonnegative hermitian matrix if and only if there exists a matrix B such that A = B†B. A real matrix A is a nonnegative symmetric matrix if and only if there exists a real matrix B such that A = B̃B. In either case, A is positive definite if B, and thus A, is nonsingular.
(e) If both A and B are positive definite or nonnegative, the same is true for AB. Every positive definite matrix A has a unique pair of square roots H, −H defined by H2 = A; H is positive definite (see also Sec. 13.3-4).
13.6. MATRIX NOTATION FOR SYSTEMS OF DIFFERENTIAL
EQUATIONS (STATE EQUATIONS). PERTURBATIONS AND
LYAPUNOV STABILITY THEORY
13.6-1. Systems of Ordinary Differential Equations. Matrix Notation. As noted in Sec. 9.1-3, a general system of ordinary differential equations (9.1-4) reduces to the first-order form

    dyi/dt = fi(t; y1, y2, . . . , yn)        (i = 1, 2, . . . , n)        (1a)

if appropriate derivatives are introduced as new variables yi. The system (1a) is written as a single matrix differential equation

    dy/dt = f(t, y)        (1)
(see also Sec. 13.2-11), where y(t) and f(t, y) are n X 1 column matrices. If the fi are single-valued and continuous and satisfy a Lipschitz condition (9.2-4) over the domain of interest, then the solution y(t) of Eq. (1) is uniquely determined by the initial condition
The system (1) is called autonomous if and only if f does not depend explicitly on the independent variable t.
More than merely a notational convenience, the matrix notation will be seen to extend intuitive insight gained from studies of simple first-order differential equations to systems of first-order equations. Matrix operations needed for solution of linear systems (Sec. 13.6-2) are, moreover, readily implemented with digital computers.
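As a hypothetical illustration of the reduction (1a) (the equation y″ + 3y′ + 2y = 0 and all names below are our own example, not from the text), introduce the state variables y1 = y and y2 = dy/dt:

```python
import numpy as np
from scipy.integrate import solve_ivp

# y'' + 3y' + 2y = 0 rewritten in first-order form: y1 = y, y2 = dy/dt.
def f(t, y):
    y1, y2 = y
    return [y2, -2.0 * y1 - 3.0 * y2]

sol = solve_ivp(f, (0.0, 5.0), [1.0, 0.0], rtol=1e-9, atol=1e-12)

# exact solution for y(0) = 1, y'(0) = 0: y(t) = 2e^{-t} - e^{-2t}
t_end = sol.t[-1]
y_exact = 2.0 * np.exp(-t_end) - np.exp(-2.0 * t_end)
```

The numerical solution of the first-order system reproduces the classical solution of the original second-order equation.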
In the most important applications, t represents physical time, and the yi(t) are state variables representing the state of a dynamical system; the system (1) is then called a system of state equations (see also Sec. 11.8-4 and Refs. 13.10 to 13.16).*
13.6-2. Linear Differential Equations with Constant Coefficients (Time-invariant Systems). (a) Homogeneous Systems. Normal-mode Solution. The solution of the homogeneous linear system

    dy/dt = Ay ≡ [aik]y        y(0) = y0        (2)

with constant coefficients aik (see also Secs. 9.3-1 and 9.4-1d) is explicitly given by

    y(t) = eAty0        (3)

where the matrix function eAt is the n X n matrix defined in accordance with Secs. 13.2-12 and 13.4-7. Expansion of eAt by Eq. (13.4-8) involves cumbersome matrix multiplications; but, if the given matrix A has n distinct eigenvalues, expansion of eAt by Sylvester's theorem (13.4-9) yields the normal-mode expansion of Sec. 9.4-1.
* Many engineering texts refer to the matrix y(t) as a state vector. It would be more correct to state that the matrix elements yi(t) (state variables) represent a state vector in a specific scheme of measurements (in the sense of tensor analysis, Chap. 16; see also Ref. 13.15).
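A numerical sketch of the solution (3) and its normal-mode expansion (the matrix and names are our own example): for distinct eigenvalues, e^{At}y0 equals the sum of modal terms c_k e^{λ_k t} v_k.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])            # distinct eigenvalues -1 and -2
y0 = np.array([1.0, 0.0])
t = 1.5
y = expm(A * t) @ y0                    # matrix-exponential solution y(t) = e^{At} y0

# normal-mode expansion: y(t) = sum_k c_k e^{lam_k t} v_k
lam, V = np.linalg.eig(A)
c = np.linalg.solve(V, y0)              # modal amplitudes from y(0)
y_modes = (V * np.exp(lam * t)) @ c
```

Both evaluations agree, and they match the closed-form solution 2e^{−t} − e^{−2t} of the equivalent scalar equation.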
One can often simplify the solution of a problem (2) by introducing n new dependent variables (state variables) ȳh through a nonsingular linear transformation

    y = Tȳ        (det T ≠ 0)        (4)

such that the resulting transformed system

    dȳ/dt = T⁻¹ATȳ ≡ Āȳ        (5)

is simplified (see also Secs. 14.6-1 and 14.6-2). If, in particular, there exists a transformation (4) which diagonalizes the given system matrix A (Secs. 13.4-4 and 14.8-6), then the transformed variables ȳh are normal coordinates of the given linear system (see also Sec. 9.4-8): they satisfy the "uncoupled" differential equations

    dȳh/dt = λhȳh        (h = 1, 2, . . . , n)        (6)

where λ1, λ2, . . . , λn are the eigenvalues of A. If A has n distinct eigenvalues, the solution of the original problem (2) is then given by Eq. (4) with

    ȳh(t) = ȳh(0)eλht        (h = 1, 2, . . . , n)        (7)
Complex-conjugate terms in a normal-mode solution (4), and also coincident and zero eigenvalues, can be treated in a manner analogous to Sec. 9.4-1. In the general case, one can use a transformation (4) producing a triangular matrix Ā (Sec. 13.4-3), so that the ȳi(t) can be derived one by one (Ref. 13.15).
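When A is not diagonalizable, the triangularizing transformation mentioned above can be realized numerically with a Schur decomposition; a brief sketch (the defective matrix is our own example):

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])      # defective: double eigenvalue 1, one eigenvector
Abar, T = schur(A)              # Abar is upper triangular, T is unitary
```

With Ā = T⁻¹AT triangular, the last transformed equation is a scalar equation dȳn/dt = λnȳn; each earlier ȳi then satisfies a scalar linear equation whose forcing term is already known, so the ȳi(t) can be derived one by one.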
(b) Nonhomogeneous Equations. The State-transition Matrix. The linear system

    dy/dt = Ay + f(t)        y(0) = y0        (8)

where f(t) is an n X 1 column matrix, describes the response of a (time-invariant) linear system to the inputs fi(t). As in Secs. 9.3-1 and 9.4-2, the matrix solution y(t) is obtained by superposition of the homogeneous-system solution (3) and a particular integral (normal response) yN(t):

    y(t) = eAty0 + ∫₀ᵗ h+(t − τ)f(τ) dτ        (9)

The n X n state-transition matrix h+(t − τ) ≡ [{h+(t − τ)}ik] for the initial-value problem (8) is a generalization of the one-dimensional weighting function h+(t − τ) in Sec. 9.4-3 and satisfies

    h+(t − τ) ≡ eA(t−τ)        (t ≥ τ)        h+(t − τ) ≡ [0]        (t < τ)        (10)

h+(t) is the response to the set of (asymmetrical) unit impulses fi(t) = δ+(t) (i = 1, 2, . . . , n; see also Sec. 9.4-3d). Note that the solution (9) is precisely analogous to the solution of the one-dimensional problem dy/dt = ay + f(t), y(0) = y0.
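A sketch of the superposition solution (9) for a constant input, evaluating the convolution integral by the trapezoidal rule (the matrices, input, and step count are our own choices):

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
y0 = np.array([1.0, 0.0])
f = lambda t: np.array([0.0, 1.0])      # constant (step) input

t_end = 2.0
taus = np.linspace(0.0, t_end, 2001)
dt = taus[1] - taus[0]
# convolution kernel h+(t_end - tau) f(tau) = e^{A(t_end - tau)} f(tau)
kernel = np.array([expm(A * (t_end - tau)) @ f(tau) for tau in taus])
integral = dt * (0.5 * (kernel[0] + kernel[-1]) + kernel[1:-1].sum(axis=0))
y_conv = expm(A * t_end) @ y0 + integral

# cross-check by direct numerical integration of the state equations
sol = solve_ivp(lambda t, y: A @ y + f(t), (0.0, t_end), y0,
                rtol=1e-10, atol=1e-12)
```

The convolution result agrees with direct integration to the accuracy of the quadrature.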
(c) Laplace-transform Solution (see also Sec. 9.4-5). Element-by-element Laplace transformation of the given constant-coefficient matrix equation (8) produces sY(s) − y0 = AY(s) + F(s), or

    Y(s) = (sI − A)⁻¹y0 + (sI − A)⁻¹F(s)        (12)

where Y(s), F(s) denote the respective Laplace transforms of y(t), f(t). The two terms in Eq. (12) are the transforms of those in Eq. (9); inverse Laplace transformation of each element Yi(s) of Y(s) produces yi(t).
13.6-3. Linear Systems with Variable Coefficients (see also Secs. 9.2-4, 9.3-3, and 18.12-2). (a) The most general linear system (1) has the form

    dy/dt = A(t)y + f(t)        y(0) = y0        (13)

where A(t) ≡ [aik(t)] is an n X n matrix, and f(t) is an n X 1 column matrix (linear differential equations with variable coefficients and forcing terms). The solution can again be written as

    y(t) = w+(t, 0)y0 + ∫₀ᵗ w+(t, λ)f(λ) dλ        (14)

where w+(t, λ) is the n X n state-transition matrix determined for t ≥ λ as the solution of

    ∂w+(t, λ)/∂t = A(t)w+(t, λ)        w+(λ, λ) = I        (15)

or as the response to a set of (asymmetrical) unit impulses fi(t) = δ+(t − λ), where i = 1, 2, . . . , n; see also Sec. 9.4-3d. For constant-coefficient systems, w+(t, λ) ≡ h+(t − λ).
(b) For any real or complex matrix A(t) with continuous elements, the solution of the homogeneous linear system

    dy/dt = A(t)y        (16)

is y(t) = Y(t)y(0), where Y = Y(t) is the n X n matrix uniquely determined as the solution of the matrix differential equation

    dY/dt = A(t)Y        Y(0) = I        (17)

Y(t) is nonsingular; its columns constitute n linearly independent solutions of Eq. (16) (fundamental-solution matrix, see also Sec. 9.3-2). U(t) ≡ [Y⁻¹(t)]† is the unique solution of

    dU/dt = −A†(t)U        U(0) = I        (18)

Equations (17) and (18) are adjoint linear differential equations.*
The state-transition matrix w+(t, λ) of Sec. 13.6-3a is given by

    w+(t, λ) = Y(t)Y⁻¹(λ)        (t ≥ λ)        (19)

so that Eq. (14) corresponds to a matrix version of the variation-of-constants solution of Sec. 9.3-3.
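A sketch of Eqs. (17) and (19) for a variable-coefficient example of our own choosing: integrate the matrix equation with the identity as initial value, then form w+(t, λ) = Y(t)Y⁻¹(λ).

```python
import numpy as np
from scipy.integrate import solve_ivp

A = lambda t: np.array([[0.0, 1.0],
                        [-2.0 - np.sin(t), -3.0]])

def matrix_rhs(t, yflat):
    # dY/dt = A(t) Y, with Y flattened to a vector for solve_ivp
    return (A(t) @ yflat.reshape(2, 2)).ravel()

def Y_of(t):
    sol = solve_ivp(matrix_rhs, (0.0, t), np.eye(2).ravel(),
                    rtol=1e-10, atol=1e-12)
    return sol.y[:, -1].reshape(2, 2)

# state-transition matrix (19): w+(t, lam) = Y(t) Y(lam)^{-1}
t1, lam1 = 2.0, 0.5
w = Y_of(t1) @ np.linalg.inv(Y_of(lam1))
```

Applied to any solution of dy/dt = A(t)y, w carries the state at t = 0.5 forward to the state at t = 2.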
13.6-4. Perturbation Methods and Sensitivity Equations. (a) Given a system of differential equations

    dy/dt = f(t, y; α1, α2, . . . , αm) ≡ f(t, y, α)        y(0) = y0        (20)

which depends on a set (column matrix) α ≡ {α1, α2, . . . , αm} of m parameters αk, let y(1)(t) be the known solution for the parameter values given by α = α(1) ≡ {α1(1), α2(1), . . . , αm(1)}. The perturbed solution y(1)(t) + δy(t) corresponding to the perturbed parameter matrix α = α(1) + δα may be easier to find through solution of

    d(δy)/dt = f(t, y(1) + δy, α(1) + δα) − f(t, y(1), α(1))        δy(0) = 0        (21)

for the perturbation (variation, Sec. 11.5-1) δy than by direct solution
* d/dt − A(t) and −[d/dt + A†(t)] are adjoint operators on n X 1 matrix functions u(t) such that ∫₀∞ u†(t)u(t) dt exists and u(0) = 0 if one defines the inner product of two such functions u, v by

    (u, v) ≡ ∫₀∞ v†(t)u(t) dt

(Sec. 14.4-3; see also Sec. 15.4-3).
of Eq. (20). Equation (21) is exact. For suitably differentiable f(t, y, α), one may, however, be able to neglect all but first-order terms in a Taylor-series expansion of Eq. (21) to find an approximation to δy (first-order perturbation) by solving the linear system

    d(δy)/dt = (∂f/∂y)δy + (∂f/∂α)δα        δy(0) = 0        (22)

where the elements of the n X n matrix ∂f/∂y ≡ [∂fi/∂yk]y=y(1) and the n X m matrix ∂f/∂α ≡ [∂fi/∂αk]y=y(1) will, in general, depend on the "nominal solution" y(1)(t) and hence on t. If the perturbations δyi are small compared with the |yi|, one may be able to neglect approximation and numerical errors in the computation of the δyi.
(b) The dependence of the solution y(t) on the parameters αk is often described by the sensitivity coefficients (parameter-influence coefficients) zik ≡ ∂yi/∂αk, which form an n X m matrix Z ≡ ∂y/∂α ≡ [∂yi/∂αk]y=y(1). For each given nominal solution y(t) = y(1)(t), the sensitivity coefficients are functions of t and satisfy the mn linear differential equations (sensitivity equations)

    dZ/dt = (∂f/∂y)Z + ∂f/∂α        (23)
(c) The initial values yi(0) = yi0 may be treated as parameters in perturbation and sensitivity calculations. In this case, the initial conditions δy(0) = 0 in Eqs. (21) and (22) must be replaced by

    δy(0) = δy0        (24)

The appropriate initial conditions for the sensitivity equations (23) are

    zik(0) = ∂yi(0)/∂yk0 = δik        (i, k = 1, 2, . . . , n)        (25)
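A minimal sketch of the sensitivity equations for a scalar example of our own (dy/dt = −αy, y(0) = 1): the sensitivity z = ∂y/∂α obeys dz/dt = (∂f/∂y)z + ∂f/∂α and is integrated alongside the nominal equation.

```python
import numpy as np
from scipy.integrate import solve_ivp

alpha = 0.7

def rhs(t, w):
    y, z = w
    # nominal equation dy/dt = -alpha*y together with its sensitivity
    # equation dz/dt = (df/dy) z + df/dalpha = -alpha*z - y
    return [-alpha * y, -alpha * z - y]

sol = solve_ivp(rhs, (0.0, 3.0), [1.0, 0.0], rtol=1e-10, atol=1e-12)
y_end, z_end = sol.y[:, -1]
```

Here the exact values are y(t) = e^{−αt} and z(t) = −t e^{−αt}, so the first-order perturbation estimate for a parameter change δα is δy ≈ z δα.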
13.6-5. Stability of Solutions: Definitions (see also Sec. 9.5-4). (a) Given a system

    dy/dt = f(y, t)        (26)

different types of stability of a solution y = y(1)(t) can be defined in terms of the effects of various parameter perturbations (Sec. 13.6-4). The following theory is concerned with stability in the sense of Lyapunov, which is determined by the effects of small changes

    δy(t0) ≡ y(t0) − y(1)(t0)        (27)

in initial solution values on the resulting perturbations

    δy(t) ≡ y(t) − y(1)(t)

for t > t0.
The solution y = y(1)(t) of the system (26) is
stable in the sense of Lyapunov if and only if for every real ε > 0 there exists a real Δ(ε, t0) > 0 such that ||δy(t0)|| < Δ(ε, t0) implies ||δy(t)|| < ε for all t ≥ t0. Otherwise the solution is unstable.
asymptotically stable in a region D1(t0) of the "state space" of points y ≡ {y1, y2, . . . , yn} if and only if y(1)(t) is stable, and y(t0) in D1(t0) implies ||δy(t)|| → 0 as t → ∞ (Sec. 13.2-11).
asymptotically stable in the large (completely stable, globally asymptotically stable) if and only if the entire state space is a region of asymptotic stability.
NOTE: In the above definitions, the norm ||δy|| of the n X 1 column matrix δy ≡ {δy1, δy2, . . . , δyn}, defined in accordance with Eq. (13.2-2) as

    ||δy|| = +√(|δy1|² + |δy2|² + · · · + |δyn|²)

can be conveniently replaced by one of the alternate norms (Table 13.2-1)

    ||δy|| = |δy1| + |δy2| + · · · + |δyn|        or        ||δy|| = max i |δyi|
Note that these definitions refer to stability of solutions, not of systems (see also Secs. 9.4-4 and 13.6-7). If a solution is stable in the sense of Lyapunov, sufficiently small changes in initial values cannot cause large solution changes at any time. For an asymptotically stable solution, the effects of finite initial-value changes, up to specified bounds, are nullified after sufficient time has elapsed. If the solution is asymptotically stable in the large, even arbitrarily large initial-value changes will have negligible long-term effects. Asymptotic stability is a requirement for practical control systems.
(b) An unstable solution y(1)(t) of Eq. (26) has a finite escape time T if and only if it becomes unbounded after a finite time t = T.
13.6-6. Lyapunov Functions and Stability. (a) Stability of Equilibrium for Autonomous Systems (see also Sec. 9.5-4b). An equilibrium solution y(t) = y(1) (t ≥ 0) of the autonomous system

    dy/dt = f(y)        (28)

is defined by

    f(y(1)) = 0        (29)
It will suffice to consider equilibrium solutions y(t) = y(1) = 0, since
other equilibrium “points” y = y(1) in state space can be translated to the origin by a simple coordinate transformation.
With reference to the solution y(t) = 0 of a given system (28), a Lyapunov function is any real function V(y) such that V(0) = 0 and, throughout a neighborhood D of the "point" y = 0 in the "state space" of "points" y ≡ {y1, y2, . . . , yn}, V(y) is continuously differentiable and satisfies

    V(y) > 0        dV/dt ≡ Σi (∂V/∂yi)fi(y) ≤ 0

for all y(t) ≠ 0 satisfying Eq. (28). The equilibrium solution y(t) = 0 is stable in the sense of Lyapunov if (and only if, Ref. 13.11) there exists a corresponding Lyapunov function. y(t) = 0 is asymptotically stable
if there exists a Lyapunov function V(y) satisfying the stronger condition dV/dt < 0 for all solutions y(t) ≠ 0 of Eq. (28) in D (Lyapunov's Theorem on Asymptotic Stability).
if there exists a Lyapunov function V(y) whose derivative dV/dt is not identically zero on any solution trajectory y = y(t) ≠ 0 in D (Kalman-Bertram Theorem).
If the neighborhood D of the origin defining a Lyapunov function V(y) contains a bounded region D1 such that V(y) < V0, where V0 is any positive constant, then y(t) = 0 is asymptotically stable in D1. If a Lyapunov function V(y) can be defined for the entire state space, and V(y) → ∞ as ||y(t)|| → ∞, then the solution y(t) = 0 is asymptotically stable in the large (La Salle's Theorem on Asymptotic Stability).
The equilibrium solution y(t) = 0 of Eq. (28) is unstable if there exist a neighborhood D of y = 0, a region D1 in D, and a real function U(y) such that
U(y) is continuously differentiable, and U(y) > 0, dU/dt > 0 for all solutions y(t) in D1, except that
U(y) = 0 at all boundary points (Sec. 4.3-6) of D1 in D.
y = 0 is a boundary point of D1 (Cetaev's Instability Theorem).
(b) Nonautonomous Systems. Every solution y = y(1)(t) of the system (26) can be transformed to the equilibrium solution ȳ(t) = 0 of a new (generally nonautonomous) system by the transformation y(t) = ȳ(t) + y(1)(t).
The equilibrium solution y(t) = 0 of a given system (26) is asymptotically stable in the large if there exist a continuously differentiable real function V(t, y), two continuous nondecreasing real functions V1(||y||), V2(||y||), and a continuous real function V3(||y||) such that V(t, 0) = V1(0) = V2(0) = V3(0) = 0 and

    0 < V1(||y||) ≤ V(t, y) ≤ V2(||y||)        dV/dt ≤ −V3(||y||) < 0

with V1(||y||) → ∞ as ||y|| → ∞, for all y(t) ≠ 0 satisfying Eq. (26).
13.6-7. Applications and Examples (see also Sec. 9.5-4). (a) Applications such as control-system design motivate the search for Lyapunov functions establishing asymptotic stability in specified state-space regions, or in as large regions as possible (“direct method” of Lyapunov for stability investigations). Lyapunov functions for particular solutions are not unique, and practical search methods are more of an art than a science (Refs. 13.11 to 13.14).
(b) As noted in Sec. 9.5-4a, the equilibrium solution y(t) = 0 of the linear homogeneous constant-coefficient system
(Sec. 13.6-2a) is asymptotically stable in the large (completely stable) if and only if the system is completely stable in the sense of Sec. 9.4-4, i.e., if and only if all eigenvalues of the system matrix A have negative real parts. This is true if and only if for an arbitrary positive definite symmetric matrix Q, there exists a positive definite symmetric matrix P such that
V(y) ≡ ỹPy is then a Lyapunov function for the equilibrium solution y{ t ) = 0.
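A sketch of this matrix-Lyapunov test (the example matrix is ours; the tilde of the text is the transpose): solve the Lyapunov equation for P and check that P is positive definite.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])               # eigenvalues -1, -2: all real parts negative
Q = np.eye(2)
# solve A~ P + P A = -Q; solve_continuous_lyapunov(a, q) solves a X + X a^H = q
P = solve_continuous_lyapunov(A.T, -Q)
```

Since P turns out positive definite, V(y) = ỹPy decreases along every solution, confirming asymptotic stability in the large.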
(c) Duffing's equation

    d²y/dt² + k dy/dt + ay + by³ = 0        (k > 0)

describes the damped oscillations of a nonlinear spring. Introducing y = y1, dy/dt = y2, one has the nonlinear first-order system

    dy1/dt = y2        dy2/dt = −ky2 − ay1 − by1³

The theory of Sec. 13.6-6a indicates that

    V(y1, y2) = y2²/2 + ay1²/2 + by1⁴/4

is a Lyapunov function for the equilibrium solution y1(t) = y2(t) = 0 when a > 0, b > 0 ("hard spring"); this solution is asymptotically stable in the large.
For a > 0, b < 0 ("soft spring"), the equilibrium solution y1(t) = y2(t) = 0 is asymptotically stable, but not in the large (Fig. 13.6-1).
FIG. 13.6-1. Region of asymptotic stability for Duffing's equation with a = 1, b = −0.04. (Based on Ref. 13.11.)
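A simulation sketch of the hard-spring case (the damping coefficient k and all numerical values are our own assumptions, not from the figure): along the computed trajectory, the Lyapunov function above has dV/dt = −ky2² and so is nonincreasing.

```python
import numpy as np
from scipy.integrate import solve_ivp

k, a, b = 0.2, 1.0, 0.5     # damped hard spring (assumed values)

def duffing(t, y):
    y1, y2 = y
    return [y2, -k * y2 - a * y1 - b * y1**3]

def V(y1, y2):
    # Lyapunov function of Sec. 13.6-7c: dV/dt = -k*y2**2 <= 0 along solutions
    return 0.5 * y2**2 + 0.5 * a * y1**2 + 0.25 * b * y1**4

sol = solve_ivp(duffing, (0.0, 50.0), [2.0, 0.0],
                rtol=1e-9, atol=1e-12, max_step=0.05)
vals = V(sol.y[0], sol.y[1])
```

The monotone decay of V toward zero is the numerical counterpart of asymptotic stability in the large.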
13.7. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY
13.7-1. Related Topics. The following topics related to the study of matrices, quadratic forms, and hermitian forms are treated in other chapters of this handbook:
Linear simultaneous equations Chap. 1
Systems of ordinary differential equations Chap. 9
Matrix notation for optimum-control problems Chap. 11
Use of matrices for the representation of vectors, linear transformations (linear operators), scalar products, and group elements Chap. 14
Eigenvectors and eigenvalues of linear operators Chap. 14
Matrix techniques for difference equations Chap. 20
Numerical techniques Chap. 20
13.7-2. References and Bibliography (see also Secs. 12.9-2 and 14.11-2).
13.1. Aitken, A. C.: Determinants and Matrices, 8th ed., Interscience, New York, 1956.
13.2. Birkhoff, G., and S. MacLane: A Survey of Modern Algebra, rev. ed., Macmillan, New York, 1965.
13.3. Gantmakher, F. R.: The Theory of Matrices, Chelsea, New York, 1959.
13.4. Gantmakher, F. R.: Applications of the Theory of Matrices, Interscience, New York, 1959.
13.5. Hohn, F. E.: Elementary Matrix Algebra, 2d ed., Macmillan, New York, 1964.
13.6. Nering, E. D.: Linear Algebra and Matrix Theory, Interscience, New York, 1963.
13.7. Shields, P. C.: Linear Algebra, Addison-Wesley, Reading, Mass., 1964.
13.8. Thrall, R. M., and L. Tornheim: Vector Spaces and Matrices, Wiley, New York, 1957.
13.9. Zurmuehl, R.: Matrizen, 2d ed., Springer, Berlin, 1964.
(See also the articles by G. Falk and H. Tietz in vol. II of the Handbuch der Physik, Springer, Berlin, 1955. For numerical techniques, see Secs. 20.3-3 to 20.3-5.)
Matrix Techniques for Systems of Differential Equations
(See also Refs. 9.3 and 9.16 in Sec. 9.7-2)
13.10. DeRusso, P., et al.: State Variables for Engineers, Wiley, New York, 1965.
13.11. Geiss, G. R.: “The Analysis and Design of Nonlinear Control Systems via Lyapunov's Direct Method,” RTD-TDR-63-4076, U.S. Air Force Flight Dynamics Laboratory, Wright-Patterson AFB, Ohio, 1964.
13.12. Hahn, W.: Theory and Application of Lyapunov's Direct Method, Prentice-Hall, Englewood Cliffs, N.J., 1963.
13.13. Krasovskii, N. N.: Stability of Motion, Stanford, Stanford, Calif., 1963.
13.14. Letov, A. M.: Stability of Nonlinear Control Systems, Princeton, Princeton, N.J., 1961.
13.15. Schultz, D. G., and J. L. Melsa: State Functions in Automatic Control, McGraw-Hill, New York, 1967.
13.16. Tomović, R.: Sensitivity Analysis of Dynamic Systems, McGraw-Hill, New York, 1964.