CHAPTER 13
MATRICES. QUADRATIC AND HERMITIAN FORMS
13.2. Matrix Algebra and Matrix Calculus
13.2-1. Rectangular Matrices. Norms
13.2-3. Identities and Inverses
13.2-4. Integral Powers of Square Matrices
13.2-5. Matrices as Building Blocks of Mathematical Models
13.2-6. Multiplication by Special Matrices. Permutation Matrices
13.2-7. Rank, Trace, and Determinant of a Matrix
13.2-8. Partitioning of Matrices
13.2-9. Step Matrices. Direct Sums
13.2-10. Direct Product (Outer Product) of Matrices
13.2-11. Convergence and Differentiation
13.2-12. Functions of Matrices
13.3. Matrices with Special Symmetry Properties
13.3-1. Transpose and Hermitian Conjugate of a Matrix
13.3-2. Matrices with Special Symmetry Properties
13.3-4. Decomposition Theorems. Normal Matrices
13.4. Equivalent Matrices. Eigenvalues, Diagonalization, and Related Topics
13.4-1. Equivalent and Similar Matrices
13.4-2. Eigenvalues and Spectra of Square Matrices
13.4-4. Diagonalization of Matrices
13.4-5. Eigenvalues and Characteristic Equation of a Finite Matrix
13.4-6. Eigenvalues of Step Matrices
13.4-7. The Cayley-Hamilton Theorem and Related Topics
13.5. Quadratic and Hermitian Forms
13.5-4. Transformation of Quadratic and Hermitian Forms. Transformation to Diagonal Form
13.5-5. Simultaneous Diagonalization of Two Quadratic or Hermitian Forms
13.5-6. Tests for Positive Definiteness, Nonnegativeness, etc.
13.6-1. Systems of Ordinary Differential Equations. Matrix Notation
13.6-2. Linear Differential Equations with Constant Coefficients (Time-invariant Systems)
(a) Homogeneous Systems. Normal-mode Solution
(b) Nonhomogeneous Systems. The State-transition Matrix
(c) Laplace-transform Solution
13.6-3. Linear Systems with Variable Coefficients
13.6-4. Perturbation Methods and Sensitivity Equations
13.6-5. Stability of Solutions: Definitions
13.6-6. Lyapunov Functions and Stability
13.6-7. Applications and Examples
13.7. Related Topics, References, and Bibliography
13.7-2. References and Bibliography
13.1-1. Matrices (Sec. 13.2-1) are the building blocks of an important class of mathematical models. Matrix techniques permit a simplified representation of various mathematical and physical operations in terms of numerical operations on matrix elements. Chapter 13 introduces matrix algebra and calculus (Secs. 13.2-1 to 13.4-7), quadratic and hermitian forms (Secs. 13.5-1 to 13.5-6), and the matrix/state-variable treatment of ordinary differential equations (Secs. 13.6-1 to 13.6-7). The general application of matrices to the representation of vectors, linear transformations (linear operators), and inner products is reserved for Chap. 14.
Sections 13.5-1 to 13.5-6 similarly introduce quadratic and hermitian forms from the viewpoint of simple algebra; their real significance for the representation of scalar products is discussed in Secs. 14.7-1 and 14.7-2.
13.2. MATRIX ALGEBRA AND MATRIX CALCULUS
13.2-1. Rectangular Matrices. Norms. (a) An array
of “scalars” aik taken from a commutative field (Sec. 12.3-1) F is called a (rectangular) m X n matrix over the field F whenever one of the “matrix operations” defined in Sec. 13.2-2 is to be used. The elements aik are called matrix elements; the matrix element aik is situated in the ith row and in the kth column of the matrix (1). m is the number of rows, and n is the number of columns. A matrix is finite if and only if it has a finite number of rows and a finite number of columns; otherwise the matrix is infinite.
A finite or infinite matrix (1) over the field of complex numbers is bounded if and only if it has a finite bound (norm) defined typically as
Table 13.2-1. Some Matrix Norms (n and/or m can be infinite; see also Secs. 12.5-5 and 14.4-1)
(see also Table 13.2-1 and Sec. 14.4-1). A finite matrix over the field of complex numbers is bounded if and only if all its matrix elements are bounded.
Throughout this handbook all matrices will be understood to be bounded matrices over the field of complex numbers, unless the contrary is specifically stated. A matrix A ≡ [aik] is real if and only if all matrix elements aik are real numbers.
(b) n X 1 matrices are column matrices, and 1 X n matrices are row matrices. The following notation will be used:
(see also Secs. 13.3-1, 14.5-1, and 14.5-2).
(c) An n X n matrix is called a square matrix of order n. A square matrix A ≡ [aik] is
Triangular (superdiagonal) if and only if i > k implies aik = 0
Strictly triangular if and only if i ≥ k implies aik = 0
Diagonal if and only if i ≠ k implies aik = 0
Monomial if and only if each row and column has one and only one element different from zero
13.2-2. Basic Operations. Operations on matrices are defined in terms of operations on the matrix elements.
In every matrix product AB the number n of columns of A must match the number of rows of B (A and B must be conformable). If AB exists, BA also exists if and only if the number of columns of B equals the number of rows of A; in particular, both products exist whenever A and B are square matrices of equal order. In general BA ≠ AB (see also Sec. 13.4-4b). Note
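As a numerical illustration of conformability and noncommutativity (a sketch using NumPy, which is external to the handbook; the matrices are arbitrary examples):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])      # 2 x 3
B = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])          # 3 x 2: conformable with A, so AB exists

AB = A @ B                        # a 2 x 2 product
print(AB.shape)                   # (2, 2)

# For square matrices of equal order both products exist,
# but matrix multiplication is in general noncommutative:
P = np.array([[0., 1.],
              [0., 0.]])
Q = np.array([[0., 0.],
              [1., 0.]])
print(np.allclose(P @ Q, Q @ P))  # False: PQ != QP
```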
13.2-3. Identities and Inverses. Note the following definitions:
Products and reciprocals of nonsingular matrices are nonsingular; if A and B are nonsingular, and α ≠ 0,
(see also Sec. 14.3-5).
A given matrix A is nonsingular if it has a unique left or right inverse, or equal left and right inverses; the mere existence of one or more left and/or right inverses is not sufficient (see also Sec. 14.3-5). A given n X n matrix is nonsingular if and only if it can be partitioned (Sec. 13.2-8) into a linearly independent set of row or column matrices.
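A small NumPy check of these definitions (NumPy and the example matrices are illustrative, not part of the handbook): for a nonsingular square matrix the computed inverse is simultaneously a left and a right inverse, while a matrix with linearly dependent rows has reduced rank and no inverse.

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 1.]])                   # det A = 1, so A is nonsingular
A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(2)))   # True: right inverse
print(np.allclose(A_inv @ A, np.eye(2)))   # True: the same matrix is a left inverse

# A singular matrix (linearly dependent rows) has no inverse:
S = np.array([[1., 2.],
              [2., 4.]])
print(np.linalg.matrix_rank(S))            # 1 < 2: rows linearly dependent
```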
13.2-4. Integral Powers of Square Matrices. One defines A0 = I, A1 = A, A2 = AA, A3 = AAA, . . . and, if A is nonsingular, A−p = (A−1)p (p = 1, 2, . . .).
The ordinary rules for operations with exponents apply (see also Sec. 14.3-6).
13.2-5. Matrices as Building Blocks of Mathematical Models. The definitions of Secs. 13.2-2 and 13.2-3 (constructive definitions, Sec. 12.1-1) imply the following results:
1. Given any pair of (finite) positive integers m, n, the class of all m X n matrices over a field F is an mn-dimensional vector space over F (Secs. 12.4-1 and 14.2-4). In particular, n-element column or row matrices form n-dimensional vector spaces (see also Sec. 14.5-2).
2. The class of all square matrices of a given (finite) order n over a field F is a linear algebra of order n2 over F; singular matrices are zero divisors (Sec. 12.4-2).
3. The class of all nonsingular square matrices of a given (finite) order n over a field F constitutes a multiplicative group (Sec. 12.2-1) and, together with the n X n null matrix, a division algebra of order n2 over the field F (Sec. 12.4-2).
Analogous theorems apply to bounded infinite matrices over the field of real or complex numbers.
13.2-6. Multiplication by Special Matrices. Permutation Matrices. Given any n X n matrix A,
1. Premultiplication of A by the matrix obtained on replacing the 1 in the ith row of the n X n identity matrix by a complex number α multiplies all elements in the ith row of A by α.
2. Premultiplication of A by the matrix obtained on replacing the nondiagonal element in the ith row and kth column of the n X n identity matrix by 1 adds the kth row of A to the ith row of A.
3. Premultiplication of A by the permutation matrix formed through a permutation of the rows of the n X n identity matrix results in an identical permutation of the rows of A.
The transposition relations of Sec. 13.3-1 yield three analogous theorems describing operations on the columns of a matrix as results of postmultiplication by special matrices.
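The three premultiplication rules, and the corresponding column operations by postmultiplication, can be checked numerically (a NumPy sketch; the 2 X 2 matrices are illustrative examples):

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])

# 1. Replace the 1 in row 0 of the identity matrix by alpha = 5.
E1 = np.array([[5., 0.],
               [0., 1.]])
# 2. Set the nondiagonal (0, 1) element of the identity matrix to 1.
E2 = np.array([[1., 1.],
               [0., 1.]])
# 3. Permutation matrix: permute the rows of the identity matrix.
P = np.array([[0., 1.],
              [1., 0.]])

print(E1 @ A)   # row 0 of A multiplied by 5
print(E2 @ A)   # row 1 of A added to row 0
print(P @ A)    # rows of A swapped
print(A @ P)    # postmultiplication permutes columns instead
```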
13.2-7. Rank, Trace, and Determinant of a Matrix (see also Sec. 14.3-2). The rank of a given matrix is the largest number r such that at least one rth-order determinant (Sec. 1.5-1) formed from the matrix by deleting rows and/or columns is different from zero. An m X n matrix A is nonsingular if and only if m = n = r, i.e., if and only if A is square and det (A) ≠ 0 (Sec. 13.2-3).
The trace (spur) of an n X n matrix A = [aik] is the sum
Tr (A) = a11 + a22 + · · · + ann
of the diagonal terms. If n = ∞, this sum converges whenever A is bounded (Sec. 13.2-1a). For finite matrices A, B
Tr (A + B) = Tr (A) + Tr (B)     Tr (AB) = Tr (BA)
The determinant det (A) of an infinite square matrix A ≡ [aik] is defined as
if the limit exists. The theorems of Secs. 1.5-1 to 1.5-6 apply to determinants defined in this manner whenever both of the series concerned converge.
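For finite matrices, rank, trace, and determinant are directly computable; a NumPy illustration (the matrices are arbitrary examples, and the trace identity Tr (AB) = Tr (BA) holds for any finite conformable pair):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])            # rows are linearly dependent
print(np.linalg.matrix_rank(A))         # 2: largest nonvanishing minor is 2 x 2
print(np.trace(A))                      # 15.0 = 1 + 5 + 9
print(np.linalg.det(A))                 # 0 to rounding error: A is singular

B = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [1., 0., 0.]])
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))   # True
```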
13.2-8. Partitioning of Matrices. A matrix having more than one row and column may be partitioned into smaller rectangular submatrices by lines drawn between rows and/or columns. One can multiply two similarly partitioned n X n matrices A and B by entering their rectangular submatrices as elements in the ordinary matrix-product formula (Sec. 13.2-2); the product elements thus obtained are the submatrices of the n X n matrix AB. This theorem is helpful in numerical computations (Sec. 20.3-4).
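The blockwise-multiplication theorem can be verified directly (a NumPy sketch with a random 4 X 4 example partitioned into 2 X 2 blocks; NumPy is not part of the handbook):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# Partition both matrices into 2 x 2 submatrices.
A11, A12 = A[:2, :2], A[:2, 2:]
A21, A22 = A[2:, :2], A[2:, 2:]
B11, B12 = B[:2, :2], B[:2, 2:]
B21, B22 = B[2:, :2], B[2:, 2:]

# Multiply the submatrices exactly as if they were scalar elements.
C = np.block([[A11 @ B11 + A12 @ B21, A11 @ B12 + A12 @ B22],
              [A21 @ B11 + A22 @ B21, A21 @ B12 + A22 @ B22]])

print(np.allclose(C, A @ B))   # True: the blockwise product equals AB
```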
13.2-9. Step Matrices. Direct Sums (see also Secs. 13.4-6, 14.8-2, and 14.9-2). A step matrix is a square matrix A which can be partitioned into a diagonal matrix (Sec. 13.2-lc) of square submatrices A1, A 2, . . . , so that
A step matrix is often referred to as the direct sum
of the square matrices along its diagonal (see also Sec. 12.7-5). Note that Ap = A1p ⊕ A2p ⊕ … for p = 0, 1, 2, . . . (and also for p = −1, −2, . . . if A is nonsingular), and that det (A) = det (A1) det (A2) · · · and Tr (A) = Tr (A1) + Tr (A2) + · · · whenever these quantities exist.
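A direct-sum check (a sketch using `scipy.linalg.block_diag` to build the step matrix; SciPy and the block values are illustrative assumptions):

```python
import numpy as np
from scipy.linalg import block_diag

A1 = np.array([[2., 1.],
               [0., 2.]])
A2 = np.array([[3.]])
A = block_diag(A1, A2)          # the direct sum A1 (+) A2, a 3 x 3 step matrix

# A^p is the direct sum of the corresponding powers of the diagonal blocks.
p = 3
lhs = np.linalg.matrix_power(A, p)
rhs = block_diag(np.linalg.matrix_power(A1, p),
                 np.linalg.matrix_power(A2, p))
print(np.allclose(lhs, rhs))    # True

# det decomposes into the product of the block determinants.
print(np.isclose(np.linalg.det(A),
                 np.linalg.det(A1) * np.linalg.det(A2)))   # True
```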
13.2-10. Direct Product (Outer Product) of Matrices (see also Sec. 12.7-3). The direct product (outer product) A ⊗ B of the m X n matrix A ≡ [aik] and the m′ X n′ matrix B ≡ [bi′k′] is the mm′ X nn′ matrix
where j enumerates the pairs (i, i′) in the sequence (1, 1), (1, 2), . . . , (1, m′), (2, 1), (2, 2), . . . (m, m′) and where h enumerates the pairs (k, k′) in a similar manner. Note
13.2-11. Convergence and Differentiation. (a) A sequence of matrices S0, S1, S2, … each having the same number of rows and the same number of columns is said to converge to a bounded matrix S if and only if every matrix element of Sn converges to the corresponding element of S as n → ∞. One similarly defines limits of matrix functions of a scalar parameter t (see also Sec. 12.5-3).
(b) If the matrix elements of a matrix A ≡ [aik] are differentiable functions aik(t) of a scalar parameter t, one writes dA/dt ≡ [daik/dt].
Partial differentiation and integration of matrices are defined in an analogous manner.
13.2-12. Functions of Matrices. Matrix polynomials and algebraic functions of matrices are defined in terms of the elementary matrix operations. The Cayley-Hamilton theorem (Sec. 13.4-7) reduces every convergent series in powers of an n X n matrix A (analytic function of the matrix A) to an nth-degree polynomial in A.
13.3. MATRICES WITH SPECIAL SYMMETRY PROPERTIES
13.3-1. Transpose and Hermitian Conjugate of a Matrix (see also Secs. 14.4-3 and 14.4-6a). Given any m X n matrix A ≡ [aik] over the field of complex numbers,
The transpose (transposed matrix) of A is the n X m matrix à ≡ [aki].
The hermitian conjugate (adjoint, conjugate, associate matrix)* of A is the n X m matrix A† ≡ [a*ki].
Note the following relations:
A, Ã, and A† are necessarily of equal rank. For every square matrix A,
13.3-2. Matrices with Special Symmetry Properties (see also Secs. 14.4-4 to 14.4-6). A square matrix A ≡ [aik] is
Symmetric if and only if Ã = A, i.e., aik = aki
Skew-symmetric (antisymmetric) if and only if Ã = −A, i.e., aik = −aki
Hermitian (self-adjoint, self-conjugate) if and only if A† = A, i.e., aik = a*ki
Skew-hermitian (alternating) if and only if A† = −A, i.e., aik = −a*ki
Orthogonal if and only if ÃA = AÃ = I, i.e., Ã = A−1
Unitary if and only if A†A = AA† = I, i.e., A† = A−1
* a*ki is the complex conjugate of aki (Sec. 1.3-1). The terms adjoint, conjugate, and associate(d) are used with different meanings (see also Secs. 12.2-5, 14.4-3, 16.7-1, and 16.7-2); some authors refer to the matrix A−1 det (A) (matrix of cofactors with transposed indices, adjugate of A) as the adjoint of A. The symbols used for the transpose Ã, the hermitian conjugate A†, and the complex conjugate A* ≡ [a*ik] ≡ (Ã)† of a given matrix A also vary; some authors denote the hermitian conjugate of A by A*.
A hermitian matrix is symmetric, a skew-hermitian matrix is skew-symmetric, and a unitary matrix is orthogonal if and only if all its matrix elements are real. The diagonal elements of hermitian, skew-hermitian, and skew-symmetric matrices are, respectively, real, pure imaginary, and equal to zero.
The determinant of a hermitian matrix is real. The determinant of a skew-hermitian n X n matrix is real if n is even, and pure imaginary if n is odd. The determinant of a skew-symmetric matrix of odd order is equal to zero. The determinant of a unitary matrix has the absolute value 1, and the determinant of an orthogonal matrix equals either + 1 or -1.
13.3-3. Combination Rules (see also Sec. 14.4-7). (a) If the matrix A is symmetric the same is true for Ap (p = 0, 1, 2, . . .), A−1, T̃AT, and αA.
Given any nonsingular matrix T, T̃AT is symmetric if and only if A is symmetric; hence for any orthogonal matrix T, T−1AT is symmetric if and only if the same is true for A.
If A and B are symmetric the same is true for A + B. The product AB of two symmetric matrices A and B is symmetric if and only if BA = AB.
(b) If the matrix A is hermitian the same is true for Ap (p = 0, 1, 2, . . .), A−1, and T†AT, and for αA if α is real.
Given any nonsingular matrix T, T†AT is hermitian if and only if A is hermitian; hence for any unitary matrix T, T−1AT is hermitian if and only if the same is true for A.
If A and B are hermitian the same is true for A + B. The product AB of two hermitian matrices A and B is hermitian if and only if BA = AB.
(c) If A is an orthogonal matrix the same is true for Ap (p = 0, 1, 2, . . .), A−1, Ã, and −A. If A and B are orthogonal the same is true for AB.
If A is a unitary matrix the same is true for Ap (p = 0, 1, 2, . . .), A−1, and A†, and for αA if |α| = 1. If A and B are unitary the same is true for AB.
13.3-4. Decomposition Theorems. Normal Matrices (see also Secs. 13.4-4a and 14.4-8). (a) For every square matrix over the field of complex numbers
1. ½(A + Ã) = S1 is a symmetric matrix, and ½(A − Ã) = S2 is skew-symmetric. A = S1 + S2 is the (unique) decomposition of the given matrix A into a symmetric part and a skew-symmetric part.
2. ½(A + A†) = H1 and (1/2i)(A − A†) = H2 are hermitian matrices. A = H1 + iH2 is the (unique) cartesian decomposition of the given matrix A into a hermitian part and a skew-hermitian part (comparable to the cartesian decomposition of complex numbers into real and imaginary parts, Sec. 1.3-1).
3. A†A is hermitian (and nonnegative, Sec. 13.5-3), and there exists a polar decomposition A = QU of A into a nonnegative hermitian factor Q and a unitary factor U. Q is uniquely defined by
Q2 = A†A
and U is uniquely defined if and only if A is nonsingular (compare this with Sec. 1.3-2).
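Both decompositions can be computed numerically; a sketch using SciPy (the complex matrix is an arbitrary example, and note that conventions for the polar factor order differ: `scipy.linalg.polar` returns A = UQ by default, for which Q² = A†A, while the A = QU ordering corresponds to `side='left'` and Q² = AA†):

```python
import numpy as np
from scipy.linalg import polar

A = np.array([[1. + 2.j, 3.],
              [0., 4. - 1.j]])
Ah = A.conj().T                          # hermitian conjugate A†

# Cartesian decomposition A = H1 + i*H2 with H1, H2 hermitian.
H1 = (A + Ah) / 2
H2 = (A - Ah) / 2j
print(np.allclose(H1, H1.conj().T))      # True: H1 hermitian
print(np.allclose(H2, H2.conj().T))      # True: H2 hermitian
print(np.allclose(A, H1 + 1j * H2))      # True

# Polar decomposition A = U Q (scipy's default ordering).
U, Q = polar(A)
print(np.allclose(A, U @ Q))                     # True
print(np.allclose(U @ U.conj().T, np.eye(2)))    # True: U unitary
print(np.allclose(Q @ Q, Ah @ A))                # True: Q**2 = A†A
```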
(b) A square matrix A is a normal matrix if and only if A†A = AA†, or, equivalently, if and only if H2H1 = H1H2.
13.4. EQUIVALENT MATRICES. EIGENVALUES,
DIAGONALIZATION, AND RELATED TOPICS
13.4-1. Equivalent and Similar Matrices (see also Secs. 12.2-5a, 13.5-4, 13.5-5, and 14.6-2). (a) Two rectangular matrices A and B are equivalent if and only if there exist two nonsingular matrices S, T such that A and B are related by the transformation
B = SAT     (1)
Every matrix B equivalent to a given matrix A has the same number of rows and the same number of columns as A and can be obtained through successive application of the six operations defined in Sec. 13.2-6. Equivalent matrices are of equal rank; two m X n matrices of equal rank are necessarily equivalent.
A and B are equivalent whenever B = QA or B = AQ, where Q is a nonsingular matrix.
(b) In particular, two square matrices A and Ā are similar (sometimes simply called equivalent) if and only if there exists a nonsingular matrix T (transformation matrix) such that A and Ā are related by the similarity transformation (collineatory transformation)
Ā = T−1AT     (2)
A, Ā, and T are necessarily square matrices of equal order.
Every similarity transformation (2) preserves the results of matrix addition, matrix multiplication, and multiplication by scalars (see also Sec. 12.1-6). Two similar matrices have the same rank, the same trace, and the same determinant (see also Sec. 13.4-2a).
(c) Two square matrices A and Ā related by a transformation Ā = T̃AT, where T is nonsingular, are congruent. Two square matrices A and Ā related by Ā = T†AT, where T is nonsingular, are conjunctive. In either case A, Ā, and T are necessarily square matrices of equal order.
(d) Matrix equivalence, similarity, congruence, and conjunctivity are equivalence relations; each defines a partition of the class of matrices under consideration (Sec. 12.1-3b). In most applications, two or more similar matrices constitute different representations of a linear transformation (linear operator, dyadic) A (Sec. 14.6-2). It is, then, of interest (1) to find similarity transformations yielding particularly simple representations of A (transformation of a matrix to diagonal or other “canonical” form) and (2) to find properties of matrices which are invariant with respect to similarity transformations and are thus common to each class of similar matrices (e.g., rank, trace, determinant, eigenvalues).
13.4-2. Eigenvalues and Spectra of Square Matrices (see also Sec. 14.8-3). (a) The eigenvalues (proper values, characteristic values, characteristic roots, latent roots) of a (finite or infinite) square matrix A ≡ [aik] are those values of the scalar parameter λ for which the matrix A − λI is singular. The spectrum (eigenvalue spectrum) of the matrix A is the set of all its eigenvalues.
The eigenvalues of a square matrix A can be defined directly as the eigenvalues of a linear operator represented by A (Sec. 14.8-3); similar matrices have identical spectra.
Note that the spectrum of an infinite matrix may or may not be a discrete set: e.g., every real number between −π and π is an eigenvalue of the infinite matrix . Some authors restrict the term eigenvalue to values of λ in the discrete spectrum (see also Sec. 14.8-3d).
(b) Given a normal matrix A (A†A = AA†, Sec. 13.3-4b) with eigenvalues λ, A† has the eigenvalues λ*, H1 = ½(A + A†) has the eigenvalues Re (λ), and H2 = (1/2i)(A − A†) has the eigenvalues Im (λ) (see also Sec. 13.3-4a).
All eigenvalues of a given normal matrix are real if and only if the matrix is similar to a hermitian matrix (see also Sec. 14.8-4). In particular, all eigenvalues of hermitian and real symmetric matrices are real. All eigenvalues of a unitary matrix have absolute values equal to 1; in particular, real eigenvalues of real orthogonal matrices equal +1 or −1, and their complex eigenvalues occur in pairs e±iφ. A square matrix is nonsingular if and only if all its eigenvalues are different from zero.
(c) Refer to Secs. 13.4-5a, 14.8-5, and 20.3-5 for the numerical calculation of eigenvalues, and to Sec. 14.8-9 for the calculation of bounds for the eigenvalues of a given matrix.
13.4-3. Transformation of a Square Matrix to Triangular Form. Algebraic Multiplicity of an Eigenvalue (see also Sec. 14.8-3e). (a) Given any square matrix A having a purely discrete eigenvalue spectrum, there exists a similarity transformation Ā = T−1AT such that Ā is triangular (Sec. 13.2-1c). The diagonal elements of every triangular matrix similar to A are eigenvalues of A, and each eigenvalue λj of A occurs exactly m′j ≥ 1 times as a diagonal element; m′j is called the algebraic multiplicity of the eigenvalue λj.
NOTE: m′j is not necessarily equal to the degree of degeneracy mj defined in Sec. 14.8-3b. In the case of infinite matrices, one or more of the m′j may be infinite.
(b) If Tr (A) exists, and A has a purely discrete eigenvalue spectrum, Tr (A) equals the sum of all eigenvalues, each counted a number of times equal to its algebraic multiplicity. If det (A) exists (Sec. 13.2-7), it equals the similarly computed product of the eigenvalues (see also Sec. 13.4-5).
13.4-4. Diagonalization of Matrices (see also Sec. 14.8-5). (a) A square matrix A can be diagonalized by a similarity transformation (i.e., there exists a nonsingular transformation matrix T such that Ā = T−1AT is diagonal, Sec. 13.2-1c) if and only if A has a purely discrete eigenvalue spectrum and is similar to a normal matrix (Sec. 13.3-4b). More specifically, a given matrix A having a purely discrete eigenvalue spectrum can be diagonalized by a similarity transformation with unitary transformation matrix T (or with a real orthogonal transformation matrix if A is real) if and only if A is a normal matrix (A†A = AA†, Sec. 13.3-4b). In each case the diagonal elements of Ā are eigenvalues of A; every eigenvalue occurs a number of times equal to its algebraic multiplicity. Refer to Sec. 14.8-6 for a procedure yielding transformation matrices T with the desired properties.
SPECIAL CASES OF DIAGONALIZABLE MATRICES. Hermitian and unitary matrices (and thus real and symmetric or orthogonal matrices) are special instances of normal matrices. Every matrix having only discrete eigenvalues of algebraic multiplicity 1 is similar to a normal matrix.
(b) Two hermitian matrices A and B with purely discrete eigenvalue spectra can be diagonalized by the same similarity transformation (and, in particular, by the same similarity transformation with unitary transformation matrix T) if and only if BA = AB (see also Secs. 13.5-5 and 14.8-6e).
(c) Given any hermitian matrix A having a purely discrete eigenvalue spectrum, there exists a nonsingular matrix T such that Ā = T†AT is diagonal; the diagonal elements of Ā are then real. In particular, there exists a nonsingular matrix T such that the diagonal elements of Ā take only the values +1, −1, and/or 0.
Given any real symmetric matrix A having a purely discrete eigenvalue spectrum, there exists a real nonsingular matrix T such that Ā = T̃AT is diagonal. In particular, there exists a real nonsingular matrix T such that the diagonal elements of Ā take only the values +1, −1, and/or 0.
Matrices T with the desired properties are obtained from T = DU, where U is a unitary (or real orthogonal) matrix such that U−1AU is diagonal, and D is a real diagonal matrix; U is found by the method of Sec. 14.8-6 (see also Sec. 13.5-4d).
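For a real symmetric (hence normal) matrix, the diagonalizing orthogonal matrix is delivered directly by an eigendecomposition; a NumPy sketch (the example matrix is arbitrary):

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])                   # real symmetric, hence normal

eigvals, T = np.linalg.eigh(A)             # columns of T: orthonormal eigenvectors
D = T.T @ A @ T                            # T^{-1} = T~ since T is orthogonal

print(np.allclose(T.T @ T, np.eye(2)))     # True: T is orthogonal
print(np.allclose(D, np.diag(eigvals)))    # True: similarity transform is diagonal
print(eigvals)                             # [1. 3.]: the eigenvalues of A
```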
13.4-5. Eigenvalues and Characteristic Equation of a Finite Matrix. (a) The eigenvalue spectrum of a finite n X n matrix A ≡ [aik] is identical with the set of roots λ of the nth-degree algebraic equation
det (A − λI) = 0     (characteristic equation of the matrix A)     (5)
The multiplicity (order, Sec. 1.6-2) of each root λj equals its algebraic multiplicity m′j as an eigenvalue, so that m′1 + m′2 + … = n.
Similar n X n matrices have identical characteristic equations; the coefficients in Eq. (5) are symmetric functions of the n roots λ1, λ2, . . . , λn (Sec. 1.6-4). In particular, the coefficient of λn−1 and the constant term in Eq. (5) are, respectively,
The coefficient of λn−r equals (−1)r times the sum of the r-rowed principal minors (Sec. 1.5-4) of det (A).
(b) (See also Sec. 14.8-3.) Given a finite square matrix A with eigenvalues λj, αA has the eigenvalues αλj, and Ap has the eigenvalues λjp (p = 0, 1, 2, . . . ; p = 0, ±1, ±2, . . . if A is nonsingular). Every polynomial or analytic function f(A) (Sec. 13.2-12) has the eigenvalues f(λj). A matrix power series
converges (Sec. 13.2-11a) if and only if the corresponding scalar power series converges for every eigenvalue λj of A. Given two finite square matrices A and B having the respective eigenvalues λj and μk, the eigenvalue spectrum of the direct product A ⊗ B (Sec. 13.2-10) is the set of products λjμk.
13.4-6. Eigenvalues of Step Matrices (Direct Sums, Sec. 13.2-9). The spectrum of a (finite or infinite) step matrix (direct sum) A = A1 ⊕ A2 ⊕ … is the union of the spectra of A1, A2, … ; algebraic multiplicities add. The contribution of each finite submatrix Ak may be obtained with the aid of its characteristic equation.
13.4-7. The Cayley-Hamilton Theorem and Related Topics. (a) Every finite square matrix A satisfies its own characteristic equation (Sec. 13.4-5a), i.e., if the characteristic equation is λn + c1λn−1 + · · · + cn = 0, then An + c1An−1 + · · · + cnI = 0.
(b) The Cayley-Hamilton theorem permits one to represent every integral power, and hence every analytic function, of a finite n X n matrix A as a linear function of any n distinct positive integral powers of A (see also Sec. 13.2-12). Specifically,
where Δ is the Vandermonde determinant (Sec. 1.6-5) det [λik−1], and Δj is the determinant obtained on substitution of f(λ1), f(λ2), . . . , f(λn) for λ1j, λ2j, . . . , λnj in Δ.
If the eigenvalues λ1, λ2, . . . , λn of the matrix A are distinct, then Eq. (8) can be rewritten as
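The Cayley-Hamilton theorem, and the resulting reduction of higher powers, can be checked on a small example (a NumPy sketch; the 2 X 2 matrix is arbitrary, and `np.poly` returns the characteristic-polynomial coefficients):

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])

# Characteristic polynomial: lambda^2 - 5*lambda - 2.
coeffs = np.poly(A)                     # [1., -5., -2.]
print(coeffs)

# Cayley-Hamilton: A satisfies its own characteristic equation.
n = A.shape[0]
p_of_A = sum(c * np.linalg.matrix_power(A, n - k)
             for k, c in enumerate(coeffs))
print(np.allclose(p_of_A, 0))           # True: A^2 - 5A - 2I = 0

# Hence A^2 = 5A + 2I, and every higher power is linear in A and I.
print(np.allclose(A @ A, 5 * A + 2 * np.eye(2)))   # True
```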
13.5. QUADRATIC AND HERMITIAN FORMS
13.5-1. Bilinear Forms. A bilinear form in the 2n real or complex variables ξ1, ξ2, . . . , ξn, η1, η2, . . . , ηn is a homogeneous polynomial of the second degree (Sec. 1.4-3a)
13.5-2. Quadratic Forms. A (homogeneous) quadratic form in n real or complex variables* ξ1, ξ2, . . . , ξn is a polynomial
* The theory of Secs. 13.5-2 to 13.5-6 also applies to quadratic and hermitian forms in a countably infinite set of variables ξ1, ξ2, . . . (n = ∞), and to the corresponding infinite matrices, provided that the infinite sums concerned converge.
where A1 = ½(A + Ã) is the “symmetric part” (Sec. 13.3-4) of the matrix A ≡ [aik]. The expression (2) vanishes identically if and only if A is skew-symmetric (aki = −aik, Sec. 13.3-2). A quadratic form (2) is symmetric if and only if the matrix A ≡ [aik] is symmetric (aki = aik, Sec. 13.3-2) and real if and only if A (and thus every aik, Sec. 13.2-1a) is real.
A real symmetric quadratic form (2), and also the corresponding real symmetric matrix A, is called positive definite, negative definite, nonnegative, or nonpositive if and only if, respectively, x̃Ax > 0, x̃Ax < 0, x̃Ax ≥ 0, or x̃Ax ≤ 0 for every set of real numbers ξ1, ξ2, . . . , ξn not all equal to zero. All other real symmetric quadratic forms are indefinite (i.e., the sign of x̃Ax depends on ξ1, ξ2, . . . , ξn) or identically zero.
A real symmetric quadratic form (2), and also the corresponding real symmetric matrix A, is positive semidefinite or negative semidefinite if and only if it is, respectively, nonnegative or nonpositive, and x̃Ax = 0 for some set of real numbers ξ1, ξ2, . . . , ξn not all equal to zero.
13.5-3. Hermitian Forms. A hermitian form in n real or complex variables* ξ1, ξ2, . . . , ξn is a polynomial
such that the matrix A ≡ [aik] is hermitian (aik = a*ki, Sec. 13.3-2). A form (3) is real for every set of complex numbers ξ1, ξ2, . . . , ξn if and only if A is hermitian (see also Sec. 14.4-4).
A hermitian form (3), and also the corresponding hermitian matrix A ≡ [aik], is called positive definite, negative definite, nonnegative, or nonpositive if and only if, respectively, x†Ax > 0, x†Ax < 0, x†Ax ≥ 0, or x†Ax ≤ 0 for every set of complex numbers ξ1, ξ2, . . . , ξn not all equal to zero. All other hermitian forms (or hermitian matrices) are indefinite (i.e., the sign of x†Ax depends on ξ1, ξ2, . . . , ξn) or identically zero.
A hermitian form (3), and also the corresponding hermitian matrix A, is positive semidefinite or negative semidefinite if and only if it is, respectively, nonnegative or nonpositive, and x†Ax = 0 for some set of complex numbers ξ1, ξ2, . . . , ξn not all equal to zero.
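These sign classes correspond exactly to the signs of the (real) eigenvalues of the hermitian matrix (Sec. 13.5-6a); a small classifier sketch in NumPy (the function name, tolerance, and example matrices are illustrative assumptions):

```python
import numpy as np

def classify(A, tol=1e-12):
    """Classify a hermitian matrix by the signs of its (real) eigenvalues."""
    lam = np.linalg.eigvalsh(A)
    if np.all(lam > tol):
        return "positive definite"
    if np.all(lam < -tol):
        return "negative definite"
    if np.all(lam > -tol):
        return "nonnegative"
    if np.all(lam < tol):
        return "nonpositive"
    return "indefinite"

print(classify(np.array([[2., 1.], [1., 2.]])))    # positive definite
print(classify(np.array([[1., 0.], [0., -1.]])))   # indefinite
print(classify(np.array([[1., 1.], [1., 1.]])))    # nonnegative (eigenvalues 2, 0)
```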
13.5-4. Transformation of Quadratic and Hermitian Forms. Transformation to Diagonal Form. (a) A linear substitution
* See footnote to Sec. 13.5-2.
(nonsingular homogeneous linear transformation of coordinates or vector components, “alias” point of view, Sec. 14.6-1) transforms every quadratic form (2) into a quadratic form in the new variables:
Ā is symmetric if A is symmetric; Ā is real if A and T are real.
The linear substitution (4) transforms every hermitian form (3) into a new hermitian form:
(b) For every given real symmetric quadratic form (2) there exist linear transformations (4) with real coefficients tik such that the transformed matrix Ā in Eq. (5) is diagonal (see also Sec. 13.4-4c), so that
Similarly, for every given hermitian form (3) there exist linear transformations (4) such that
The number r of nonzero coefficients āii in Eq. (7) or (8) is independent of the particular diagonalizing transformation used and equals the rank of the given matrix A; r is called the rank of the given quadratic or hermitian form. For any given real symmetric quadratic form (2) the difference between the respective numbers of positive and negative coefficients āii in Eq. (7) is independent of the particular diagonalizing transformation used (Jacobi-Sylvester Law of Inertia); this number is referred to as the signature of the given quadratic form.
(c) In particular, there exists a real orthogonal diagonalizing matrix T for every real symmetric quadratic form (2), and a unitary diagonalizing matrix T for every hermitian form (3) (see also Sec. 13.4-4). The resulting principal-axes transformation (transformation to normal coordinates; see also Sec. 9.4-8) yields the normal form of the given quadratic or hermitian form, viz.,
where the set of real numbers λi is the eigenvalue spectrum of the given matrix A (Sec. 13.4-2).
(d) The additional transformation ξ̄i → ξ̄i/√|λi| (applied for each λi ≠ 0) reduces the expressions (9) to their respective canonical forms
where each coefficient equals +1, −1, or 0 according as the corresponding eigenvalue λi is positive, negative, or zero.
(e) The calculation of suitable diagonalizing transformation matrices is discussed in Sec. 14.8-6.
13.5-5. Simultaneous Diagonalization of Two Quadratic or Hermitian Forms (see also Secs. 13.4-4b and 14.8-7). Given two real symmetric quadratic forms x̃Ax, x̃Bx, where x̃Bx is positive definite, it is possible to find a real transformation (4) which diagonalizes x̃Ax and x̃Bx simultaneously. In particular, there exists a real transformation (4) to new coordinates ξ̄1, ξ̄2, . . . , ξ̄n such that
x̃Ax = μ1ξ̄1² + μ2ξ̄2² + · · · + μnξ̄n²     x̃Bx = ξ̄1² + ξ̄2² + · · · + ξ̄n²
Similarly, given two hermitian forms x†Ax, x†Bx, where x†Bx is positive definite, there exists a transformation (4) to new coordinates ξ̄1, ξ̄2, . . . , ξ̄n such that
x†Ax = μ1|ξ̄1|² + μ2|ξ̄2|² + · · · + μn|ξ̄n|²     x†Bx = |ξ̄1|² + |ξ̄2|² + · · · + |ξ̄n|²
In either case, the set of real numbers μ1, μ2, . . . , μn is the eigenvalue spectrum of the matrix B−1A, obtainable as the set of roots of the nth-degree algebraic equation
det (A − μB) = 0
The desired transformation matrix T is obtained by the method of Sec. 14.8-7b, or from T = UT0, where T0 is the matrix reducing x̃Bx or x†Bx to canonical form (Sec. 13.5-4d), and U is a unitary matrix which diagonalizes x̃Ax or x†Ax in the new coordinates (Sec. 13.5-4c).
NOTE: Two real symmetric quadratic forms x̃Ax, x̃Bx or two hermitian forms x†Ax, x†Bx can be diagonalized simultaneously by the same unitary transformation matrix T if and only if BA = AB (see also Secs. 13.4-4b and 14.8-6e).
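Numerically, the simultaneous diagonalization is the generalized symmetric eigenproblem, which `scipy.linalg.eigh` solves directly (a sketch; SciPy and the 2 X 2 example matrices are illustrative assumptions):

```python
import numpy as np
from scipy.linalg import eigh

A = np.array([[2., 1.],
              [1., 3.]])
B = np.array([[2., 0.],
              [0., 1.]])        # positive definite

# Generalized eigenproblem A v = mu B v: the mu are the roots of
# det(A - mu*B) = 0, i.e. the eigenvalues of B^{-1} A.
mu, T = eigh(A, B)              # eigenvectors are B-orthonormal

# The same real matrix T diagonalizes both forms:
print(np.allclose(T.T @ B @ T, np.eye(2)))     # True: x~Bx becomes a sum of squares
print(np.allclose(T.T @ A @ T, np.diag(mu)))   # True: x~Ax becomes sum of mu_i terms
```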
13.5-6. Tests for Positive Definiteness, Nonnegativeness, etc. (a) A real symmetric quadratic form or hermitian form is positive definite, negative definite, nonnegative, nonpositive, indefinite, or identically zero (Secs. 13.5-2 and 13.5-3) if and only if the (necessarily real) eigenvalues λj of the matrix A ≡ [aik] are, respectively, all positive, all negative, all nonnegative, all nonpositive, of different signs, or all equal to zero.
A real symmetric quadratic form or hermitian form is positive semidefinite or negative semidefinite if and only if it is, respectively, nonnegative or nonpositive, and at least one eigenvalue λj of the matrix A ≡ [aik] equals zero.
Note that the λj are the roots of the characteristic equation (13.4-5); the signs of these roots can often be investigated by one of the methods of Sec. 1.6-6.
(b) A hermitian matrix A ≡ [aik] (and the corresponding hermitian form or real symmetric quadratic form) is positive definite if and only if every one of the n determinants formed from the first k rows and columns of A (k = 1, 2, . . . , n), i.e., every one of the leading principal minors of A, is positive (Sylvester's Criterion).
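Sylvester's criterion is easy to apply mechanically; a NumPy sketch for the real symmetric case (the helper names and example matrices are illustrative):

```python
import numpy as np

def leading_principal_minors(A):
    """Determinants of the upper-left 1x1, 2x2, ..., nxn submatrices of A."""
    return [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

def is_positive_definite(A):
    """Sylvester's criterion: all leading principal minors positive."""
    return all(m > 0 for m in leading_principal_minors(A))

A = np.array([[2., -1., 0.],
              [-1., 2., -1.],
              [0., -1., 2.]])
print(leading_principal_minors(A))     # approximately [2.0, 3.0, 4.0]
print(is_positive_definite(A))         # True

B = np.array([[1., 2.],
              [2., 1.]])
print(is_positive_definite(B))         # False: minors are 1 and -3
```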
(c) A hermitian matrix A (and the corresponding hermitian form or real symmetric quadratic form) is negative definite, nonpositive, or negative semidefinite if and only if —A is, respectively, positive definite, nonnegative, or positive semidefinite.
(d) A matrix A is a nonnegative hermitian matrix if and only if there exists a matrix B such that A = B†B. A real matrix A is a nonnegative symmetric matrix if and only if there exists a real matrix B such that A = B̃B. In either case, A is positive definite if B, and thus A, is nonsingular.
(e) If both A and B are positive definite or nonnegative, the same is true for AB. Every positive definite matrix A has a unique pair of square roots H, −H defined by H2 = A; H is positive definite (see also Sec. 13.3-4).
13.6. MATRIX NOTATION FOR SYSTEMS OF DIFFERENTIAL
EQUATIONS (STATE EQUATIONS). PERTURBATIONS AND
LYAPUNOV STABILITY THEORY
13.6-1. Systems of Ordinary Differential Equations. Matrix Notation. As noted in Sec. 9.1-3, a general system of ordinary differential equations (9.1-4) reduces to the first-order form

    dyi/dt = fi(t; y1, y2, . . . , yn)        (i = 1, 2, . . . , n)        (1a)

if appropriate derivatives are introduced as new variables yi. The system (1a) is written as a single matrix differential equation

    dy/dt = f(t, y)        (1)
(see also Sec. 13.2-11), where y(t) and f(t, y) are n X 1 column matrices. If the fi are single-valued and continuous and satisfy a Lipschitz condition (9.2-4) over the domain of interest, then the solution y(t) of Eq. (1) is uniquely determined by the initial condition
The system (1) is called autonomous if and only if f does not depend explicitly on the independent variable t.
More than merely a notational convenience, the matrix notation will be seen to extend intuitive insight gained from studies of simple first-order differential equations to systems of first-order equations. Matrix operations needed for solution of linear systems (Sec. 13.6-2) are, moreover, readily implemented with digital computers.
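As a hypothetical illustration of the reduction (1a) (the equation y″ + 3y′ + 2y = 0 and all names below are our own example, not from the text), introduce the state variables y1 = y and y2 = dy/dt:

```python
import numpy as np
from scipy.integrate import solve_ivp

# y'' + 3y' + 2y = 0 rewritten in first-order form: y1 = y, y2 = dy/dt.
def f(t, y):
    y1, y2 = y
    return [y2, -2.0 * y1 - 3.0 * y2]

sol = solve_ivp(f, (0.0, 5.0), [1.0, 0.0], rtol=1e-9, atol=1e-12)

# exact solution for y(0) = 1, y'(0) = 0: y(t) = 2e^{-t} - e^{-2t}
t_end = sol.t[-1]
y_exact = 2.0 * np.exp(-t_end) - np.exp(-2.0 * t_end)
```

The numerical solution of the first-order system reproduces the classical solution of the original second-order equation.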
In the most important applications, t represents physical time, and the yi(t) are state variables representing the state of a dynamical system; the system (1) is then called a system of state equations (see also Sec. 11.8-4 and Refs. 13.10 to 13.16).*
13.6-2. Linear Differential Equations with Constant Coefficients (Time-invariant Systems). (a) Homogeneous Systems. Normal-mode Solution. The solution of the homogeneous linear system

    dy/dt = Ay ≡ [aik]y        y(0) = y0        (2)

with constant coefficients aik (see also Secs. 9.3-1 and 9.4-1d) is explicitly given by

    y(t) = eAty0        (3)

where the matrix function eAt is the n X n matrix defined in accordance with Secs. 13.2-12 and 13.4-7. Expansion of eAt by Eq. (13.4-8) involves cumbersome matrix multiplications; but, if the given matrix A has n distinct eigenvalues, expansion of eAt by Sylvester's theorem (13.4-9) yields the normal-mode expansion of Sec. 9.4-1.
* Many engineering texts refer to the matrix y(t) as a state vector. It would be more correct to state that the matrix elements yi(t) (state variables) represent a state vector in a specific scheme of measurements (in the sense of tensor analysis, Chap. 16; see also Ref. 13.15).
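A numerical sketch of the solution (3) and its normal-mode expansion (the matrix and names are our own example): for distinct eigenvalues, e^{At}y0 equals the sum of modal terms c_k e^{λ_k t} v_k.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])            # distinct eigenvalues -1 and -2
y0 = np.array([1.0, 0.0])
t = 1.5
y = expm(A * t) @ y0                    # matrix-exponential solution y(t) = e^{At} y0

# normal-mode expansion: y(t) = sum_k c_k e^{lam_k t} v_k
lam, V = np.linalg.eig(A)
c = np.linalg.solve(V, y0)              # modal amplitudes from y(0)
y_modes = (V * np.exp(lam * t)) @ c
```

Both evaluations agree, and they match the closed-form solution 2e^{−t} − e^{−2t} of the equivalent scalar equation.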
One can often simplify the solution of a problem (2) by introducing n new dependent variables (state variables) ȳh through a nonsingular linear transformation

    y = Tȳ        (det T ≠ 0)        (4)

such that the resulting transformed system

    dȳ/dt = T⁻¹ATȳ ≡ Āȳ        (5)

is simplified (see also Secs. 14.6-1 and 14.6-2). If, in particular, there exists a transformation (4) which diagonalizes the given system matrix A (Secs. 13.4-4 and 14.8-6), then the transformed variables ȳh are normal coordinates of the given linear system (see also Sec. 9.4-8): they satisfy the "uncoupled" differential equations

    dȳh/dt = λhȳh        (h = 1, 2, . . . , n)        (6)

where λ1, λ2, . . . , λn are the eigenvalues of A. If A has n distinct eigenvalues, the solution of the original problem (2) is then given by Eq. (4) with

    ȳh(t) = ȳh(0)eλht        (h = 1, 2, . . . , n)        (7)
Complex-conjugate terms in a normal-mode solution (4), and also coincident and zero eigenvalues, can be treated in a manner analogous to Sec. 9.4-1. In the general case, one can use a transformation (4) producing a triangular matrix Ā (Sec. 13.4-3), so that the ȳi(t) can be derived one by one (Ref. 13.15).
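When A is not diagonalizable, the triangularizing transformation mentioned above can be realized numerically with a Schur decomposition; a brief sketch (the defective matrix is our own example):

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])      # defective: double eigenvalue 1, one eigenvector
Abar, T = schur(A)              # Abar is upper triangular, T is unitary
```

With Ā = T⁻¹AT triangular, the last transformed equation is a scalar equation dȳn/dt = λnȳn; each earlier ȳi then satisfies a scalar linear equation whose forcing term is already known, so the ȳi(t) can be derived one by one.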
(b) Nonhomogeneous Equations. The State-transition Matrix. The linear system

    dy/dt = Ay + f(t)        y(0) = y0        (8)

where f(t) is an n X 1 column matrix, describes the response of a (time-invariant) linear system to the inputs fi(t). As in Secs. 9.3-1 and 9.4-2, the matrix solution y(t) is obtained by superposition of the homogeneous-system solution (3) and a particular integral (normal response) yN(t):

    y(t) = eAty0 + ∫₀ᵗ h+(t − τ)f(τ) dτ        (9)

The n X n state-transition matrix h+(t − τ) ≡ [{h+(t − τ)}ik] for the initial-value problem (8) is a generalization of the one-dimensional weighting function h+(t − τ) in Sec. 9.4-3 and satisfies

    h+(t − τ) ≡ eA(t−τ)        (t ≥ τ)        h+(t − τ) ≡ [0]        (t < τ)        (10)

h+(t) is the response to the set of (asymmetrical) unit impulses fi(t) = δ+(t) (i = 1, 2, . . . , n; see also Sec. 9.4-3d). Note that the solution (9) is precisely analogous to the solution of the one-dimensional problem dy/dt = ay + f(t), y(0) = y0.
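A sketch of the superposition solution (9) for a constant input, evaluating the convolution integral by the trapezoidal rule (the matrices, input, and step count are our own choices):

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
y0 = np.array([1.0, 0.0])
f = lambda t: np.array([0.0, 1.0])      # constant (step) input

t_end = 2.0
taus = np.linspace(0.0, t_end, 2001)
dt = taus[1] - taus[0]
# convolution kernel h+(t_end - tau) f(tau) = e^{A(t_end - tau)} f(tau)
kernel = np.array([expm(A * (t_end - tau)) @ f(tau) for tau in taus])
integral = dt * (0.5 * (kernel[0] + kernel[-1]) + kernel[1:-1].sum(axis=0))
y_conv = expm(A * t_end) @ y0 + integral

# cross-check by direct numerical integration of the state equations
sol = solve_ivp(lambda t, y: A @ y + f(t), (0.0, t_end), y0,
                rtol=1e-10, atol=1e-12)
```

The convolution result agrees with direct integration to the accuracy of the quadrature.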
(c) Laplace-transform Solution (see also Sec. 9.4-5). Element-by-element Laplace transformation of the given constant-coefficient matrix equation (8) produces sY(s) − y0 = AY(s) + F(s), or

    Y(s) = (sI − A)⁻¹y0 + (sI − A)⁻¹F(s)        (12)

where Y(s), F(s) denote the respective Laplace transforms of y(t), f(t). The two terms in Eq. (12) are the transforms of those in Eq. (9); inverse Laplace transformation of each element Yi(s) of Y(s) produces yi(t).
13.6-3. Linear Systems with Variable Coefficients (see also Secs. 9.2-4, 9.3-3, and 18.12-2). (a) The most general linear system (1) has the form

    dy/dt = A(t)y + f(t)        y(0) = y0        (13)

where A(t) ≡ [aik(t)] is an n X n matrix, and f(t) is an n X 1 column matrix (linear differential equations with variable coefficients and forcing terms). The solution can again be written as

    y(t) = w+(t, 0)y0 + ∫₀ᵗ w+(t, λ)f(λ) dλ        (14)

where w+(t, λ) is the n X n state-transition matrix determined for t ≥ λ as the solution of

    ∂w+(t, λ)/∂t = A(t)w+(t, λ)        w+(λ, λ) = I        (15)

or as the response to a set of (asymmetrical) unit impulses fi(t) = δ+(t − λ), where i = 1, 2, . . . , n; see also Sec. 9.4-3d. For constant-coefficient systems, w+(t, λ) ≡ h+(t − λ).
(b) For any real or complex matrix A(t) with continuous elements, the solution of the homogeneous linear system

    dy/dt = A(t)y        (16)

is y(t) = Y(t)y(0), where Y = Y(t) is the n X n matrix uniquely determined as the solution of the matrix differential equation

    dY/dt = A(t)Y        Y(0) = I        (17)

Y(t) is nonsingular; its columns constitute n linearly independent solutions of Eq. (16) (fundamental-solution matrix, see also Sec. 9.3-2). U(t) ≡ [Y⁻¹(t)]† is the unique solution of

    dU/dt = −A†(t)U        U(0) = I        (18)

Equations (17) and (18) are adjoint linear differential equations.*
The state-transition matrix w+(t, λ) of Sec. 13.6-3a is given by

    w+(t, λ) = Y(t)Y⁻¹(λ)        (t ≥ λ)        (19)

so that Eq. (14) corresponds to a matrix version of the variation-of-constants solution of Sec. 9.3-3.
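A sketch of Eqs. (17) and (19) for a variable-coefficient example of our own choosing: integrate the matrix equation with the identity as initial value, then form w+(t, λ) = Y(t)Y⁻¹(λ).

```python
import numpy as np
from scipy.integrate import solve_ivp

A = lambda t: np.array([[0.0, 1.0],
                        [-2.0 - np.sin(t), -3.0]])

def matrix_rhs(t, yflat):
    # dY/dt = A(t) Y, with Y flattened to a vector for solve_ivp
    return (A(t) @ yflat.reshape(2, 2)).ravel()

def Y_of(t):
    sol = solve_ivp(matrix_rhs, (0.0, t), np.eye(2).ravel(),
                    rtol=1e-10, atol=1e-12)
    return sol.y[:, -1].reshape(2, 2)

# state-transition matrix (19): w+(t, lam) = Y(t) Y(lam)^{-1}
t1, lam1 = 2.0, 0.5
w = Y_of(t1) @ np.linalg.inv(Y_of(lam1))
```

Applied to any solution of dy/dt = A(t)y, w carries the state at t = 0.5 forward to the state at t = 2.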
13.6-4. Perturbation Methods and Sensitivity Equations. (a) Given a system of differential equations

    dy/dt = f(t, y; α1, α2, . . . , αm) ≡ f(t, y, α)        y(0) = y0        (20)

which depends on a set (column matrix) α ≡ {α1, α2, . . . , αm} of m parameters αk, let y(1)(t) be the known solution for the parameter values given by α = α(1) ≡ {α1(1), α2(1), . . . , αm(1)}. The perturbed solution y(1)(t) + δy(t) corresponding to the perturbed parameter matrix α = α(1) + δα may be easier to find through solution of

    d(δy)/dt = f(t, y(1) + δy, α(1) + δα) − f(t, y(1), α(1))        δy(0) = 0        (21)

for the perturbation (variation, Sec. 11.5-1) δy than by direct solution
* d/dt − A(t) and −[d/dt + A†(t)] are adjoint operators on n X 1 matrix functions u(t) such that ∫₀∞ u†(t)u(t) dt exists and u(0) = 0 if one defines the inner product of two such functions u, v by

    (u, v) ≡ ∫₀∞ v†(t)u(t) dt

(Sec. 14.4-3; see also Sec. 15.4-3).
of Eq. (20). Equation (21) is exact. For suitably differentiable f(t, y, α), one may, however, be able to neglect all but first-order terms in a Taylor-series expansion of Eq. (21) to find an approximation to δy (first-order perturbation) by solving the linear system

    d(δy)/dt = (∂f/∂y)δy + (∂f/∂α)δα        δy(0) = 0        (22)

where the elements of the n X n matrix ∂f/∂y ≡ [∂fi/∂yk]y=y(1) and the n X m matrix ∂f/∂α ≡ [∂fi/∂αk]y=y(1) will, in general, depend on the "nominal solution" y(1)(t) and hence on t. If the perturbations δyi are small compared with the |yi|, one may be able to neglect approximation and numerical errors in the computation of the δyi.
(b) The dependence of the solution y(t) on the parameters αk is often described by the sensitivity coefficients (parameter-influence coefficients) zik ≡ ∂yi/∂αk, which form an n X m matrix Z ≡ ∂y/∂α ≡ [∂yi/∂αk]y=y(1). For each given nominal solution y(t) = y(1)(t), the sensitivity coefficients are functions of t and satisfy the mn linear differential equations (sensitivity equations)

    dZ/dt = (∂f/∂y)Z + ∂f/∂α        (23)
(c) The initial values yi(0) = yi0 may be treated as parameters in perturbation and sensitivity calculations. In this case, the initial conditions δy(0) = 0 in Eqs. (21) and (22) must be replaced by

    δy(0) = δy0        (24)

The appropriate initial conditions for the sensitivity equations (23) are

    zik(0) = ∂yi(0)/∂yk0 = δik        (i, k = 1, 2, . . . , n)        (25)
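A minimal sketch of the sensitivity equations for a scalar example of our own (dy/dt = −αy, y(0) = 1): the sensitivity z = ∂y/∂α obeys dz/dt = (∂f/∂y)z + ∂f/∂α and is integrated alongside the nominal equation.

```python
import numpy as np
from scipy.integrate import solve_ivp

alpha = 0.7

def rhs(t, w):
    y, z = w
    # nominal equation dy/dt = -alpha*y together with its sensitivity
    # equation dz/dt = (df/dy) z + df/dalpha = -alpha*z - y
    return [-alpha * y, -alpha * z - y]

sol = solve_ivp(rhs, (0.0, 3.0), [1.0, 0.0], rtol=1e-10, atol=1e-12)
y_end, z_end = sol.y[:, -1]
```

Here the exact values are y(t) = e^{−αt} and z(t) = −t e^{−αt}, so the first-order perturbation estimate for a parameter change δα is δy ≈ z δα.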
13.6-5. Stability of Solutions: Definitions (see also Sec. 9.5-4). (a) Given a system

    dy/dt = f(y, t)        (26)

different types of stability of a solution y = y(1)(t) can be defined in terms of the effects of various parameter perturbations (Sec. 13.6-4). The following theory is concerned with stability in the sense of Lyapunov, which is determined by the effects of small changes

    δy(t0) ≡ y(t0) − y(1)(t0)        (27)

in initial solution values on the resulting perturbations

    δy(t) ≡ y(t) − y(1)(t)

for t > t0.
The solution y = y(1)(t) of the system (26) is
stable in the sense of Lyapunov if and only if for every real ε > 0 there exists a real Δ(ε, t0) > 0 such that ||δy(t0)|| < Δ(ε, t0) implies ||δy(t)|| < ε for all t ≥ t0. Otherwise the solution is unstable.
asymptotically stable in a region D1(t0) of the "state space" of points y ≡ {y1, y2, . . . , yn} if and only if y(1)(t) is stable, and y(t0) in D1(t0) implies ||δy(t)|| → 0 as t → ∞ (Sec. 13.2-11).
asymptotically stable in the large (completely stable, globally asymptotically stable) if and only if the entire state space is a region of asymptotic stability.
NOTE: In the above definitions, the norm ||δy|| of the n X 1 column matrix δy ≡ {δy1, δy2, . . . , δyn}, defined in accordance with Eq. (13.2-2) as

    ||δy|| = +√(|δy1|² + |δy2|² + · · · + |δyn|²)

can be conveniently replaced by one of the alternate norms (Table 13.2-1)

    ||δy|| = |δy1| + |δy2| + · · · + |δyn|        or        ||δy|| = max i |δyi|
Note that these definitions refer to stability of solutions, not of systems (see also Secs. 9.4-4 and 13.6-7). If a solution is stable in the sense of Lyapunov, sufficiently small changes in initial values cannot cause large solution changes at any time. For an asymptotically stable solution, the effects of finite initial-value changes, up to specified bounds, are nullified after sufficient time has elapsed. If the solution is asymptotically stable in the large, even arbitrarily large initial-value changes will have negligible long-term effects. Asymptotic stability is a requirement for practical control systems.
(b) An unstable solution y(1)(t) of Eq. (26) has a finite escape time T if and only if it becomes unbounded after a finite time t = T.
13.6-6. Lyapunov Functions and Stability. (a) Stability of Equilibrium for Autonomous Systems (see also Sec. 9.5-4b). An equilibrium solution y(t) = y(1) (t ≥ 0) of the autonomous system

    dy/dt = f(y)        (28)

is defined by

    f(y(1)) = 0        (29)
It will suffice to consider equilibrium solutions y(t) = y(1) = 0, since
other equilibrium “points” y = y(1) in state space can be translated to the origin by a simple coordinate transformation.
With reference to the solution y(t) = 0 of a given system (28), a Lyapunov function is any real function V(y) such that V(0) = 0 and, throughout a neighborhood D of the "point" y = 0 in the "state space" of "points" y ≡ {y1, y2, . . . , yn}, V(y) is continuously differentiable and satisfies

    V(y) > 0        dV/dt ≡ Σi (∂V/∂yi)fi(y) ≤ 0

for all y(t) ≠ 0 satisfying Eq. (28). The equilibrium solution y(t) = 0 is stable in the sense of Lyapunov if (and only if, Ref. 13.11) there exists a corresponding Lyapunov function. y(t) = 0 is asymptotically stable
if there exists a Lyapunov function V(y) satisfying the stronger condition dV/dt < 0 for all solutions y(t) ≠ 0 of Eq. (28) in D (Lyapunov's Theorem on Asymptotic Stability).
if there exists a Lyapunov function V(y) whose derivative dV/dt is not identically zero on any solution trajectory y = y(t) ≠ 0 in D (Kalman-Bertram Theorem).
If the neighborhood D of the origin defining a Lyapunov function V(y) contains a bounded region D1 such that V(y) < V0, where V0 is any positive constant, then y(t) = 0 is asymptotically stable in D1. If a Lyapunov function V(y) can be defined for the entire state space, and V(y) → ∞ as ||y(t)|| → ∞, then the solution y(t) = 0 is asymptotically stable in the large (La Salle's Theorem on Asymptotic Stability).
The equilibrium solution y(t) = 0 of Eq. (28) is unstable if there exist a neighborhood D of y = 0, a region D1 in D, and a real function U(y) such that
U(y) is continuously differentiable, and U(y) > 0, dU/dt > 0 for all solutions y(t) in D1, except that
U(y) = 0 at all boundary points (Sec. 4.3-6) of D1 in D.
y = 0 is a boundary point of D1 (Cetaev's Instability Theorem).
(b) Nonautonomous Systems. Every solution y = y(1)(t) of the system (26) can be transformed to the equilibrium solution ȳ(t) = 0 of a new (generally nonautonomous) system by the transformation y(t) = ȳ(t) + y(1)(t).
The equilibrium solution y(t) = 0 of a given system (26) is asymptotically stable in the large if there exist a continuously differentiable real function V(t, y), two continuous nondecreasing real functions V1(||y||), V2(||y||), and a continuous real function V3(||y||) such that V(t, 0) = V1(0) = V2(0) = V3(0) = 0 and

    0 < V1(||y||) ≤ V(t, y) ≤ V2(||y||)        dV/dt ≤ −V3(||y||) < 0

with V1(||y||) → ∞ as ||y|| → ∞, for all y(t) ≠ 0 satisfying Eq. (26).
13.6-7. Applications and Examples (see also Sec. 9.5-4). (a) Applications such as control-system design motivate the search for Lyapunov functions establishing asymptotic stability in specified state-space regions, or in as large regions as possible (“direct method” of Lyapunov for stability investigations). Lyapunov functions for particular solutions are not unique, and practical search methods are more of an art than a science (Refs. 13.11 to 13.14).
(b) As noted in Sec. 9.5-4a, the equilibrium solution y(t) = 0 of the linear homogeneous constant-coefficient system
(Sec. 13.6-2a) is asymptotically stable in the large (completely stable) if and only if the system is completely stable in the sense of Sec. 9.4-4, i.e., if and only if all eigenvalues of the system matrix A have negative real parts. This is true if and only if for an arbitrary positive definite symmetric matrix Q, there exists a positive definite symmetric matrix P such that
V(y) ≡ ỹPy is then a Lyapunov function for the equilibrium solution y{ t ) = 0.
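A sketch of this matrix-Lyapunov test (the example matrix is ours; the tilde of the text is the transpose): solve the Lyapunov equation for P and check that P is positive definite.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])               # eigenvalues -1, -2: all real parts negative
Q = np.eye(2)
# solve A~ P + P A = -Q; solve_continuous_lyapunov(a, q) solves a X + X a^H = q
P = solve_continuous_lyapunov(A.T, -Q)
```

Since P turns out positive definite, V(y) = ỹPy decreases along every solution, confirming asymptotic stability in the large.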
(c) Duffing's equation

    d²y/dt² + k dy/dt + ay + by³ = 0        (k > 0)

describes the damped oscillations of a nonlinear spring. Introducing y = y1, dy/dt = y2, one has the nonlinear first-order system

    dy1/dt = y2        dy2/dt = −ky2 − ay1 − by1³

The theory of Sec. 13.6-6a indicates that

    V(y1, y2) = y2²/2 + ay1²/2 + by1⁴/4

is a Lyapunov function for the equilibrium solution y1(t) = y2(t) = 0 when a > 0, b > 0 ("hard spring"); this solution is asymptotically stable in the large.
For a > 0, b < 0 ("soft spring"), the equilibrium solution y1(t) = y2(t) = 0 is asymptotically stable, but not in the large (Fig. 13.6-1).
FIG. 13.6-1. Region of asymptotic stability for Duffing's equation with a = 1, b = −0.04. (Based on Ref. 13.11.)
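A simulation sketch of the hard-spring case (the damping coefficient k and all numerical values are our own assumptions, not from the figure): along the computed trajectory, the Lyapunov function above has dV/dt = −ky2² and so is nonincreasing.

```python
import numpy as np
from scipy.integrate import solve_ivp

k, a, b = 0.2, 1.0, 0.5     # damped hard spring (assumed values)

def duffing(t, y):
    y1, y2 = y
    return [y2, -k * y2 - a * y1 - b * y1**3]

def V(y1, y2):
    # Lyapunov function of Sec. 13.6-7c: dV/dt = -k*y2**2 <= 0 along solutions
    return 0.5 * y2**2 + 0.5 * a * y1**2 + 0.25 * b * y1**4

sol = solve_ivp(duffing, (0.0, 50.0), [2.0, 0.0],
                rtol=1e-9, atol=1e-12, max_step=0.05)
vals = V(sol.y[0], sol.y[1])
```

The monotone decay of V toward zero is the numerical counterpart of asymptotic stability in the large.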
13.7. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY
13.7-1. Related Topics. The following topics related to the study of matrices, quadratic forms, and hermitian forms are treated in other chapters of this handbook:
Linear simultaneous equations Chap. 1
Systems of ordinary differential equations Chap. 9
Matrix notation for optimum-control problems Chap. 11
Use of matrices for the representation of vectors, linear transformations (linear operators), scalar products, and group elements Chap. 14
Eigenvectors and eigenvalues of linear operators Chap. 14
Matrix techniques for difference equations Chap. 20
Numerical techniques Chap. 20
13.7-2. References and Bibliography (see also Secs. 12.9-2 and 14.11-2).
13.1. Aitken, A. C.: Determinants and Matrices, 8th ed., Interscience, New York, 1956.
13.2. Birkhoff, G., and S. MacLane: A Survey of Modern Algebra, rev. ed., Macmillan, New York, 1965.
13.3. Gantmakher, F. R.: The Theory of Matrices, Chelsea, New York, 1959.
13.4. Gantmakher, F. R.: Applications of the Theory of Matrices, Interscience, New York, 1959.
13.5. Hohn, F. E.: Elementary Matrix Algebra, 2d ed., Macmillan, New York, 1964.
13.6. Nering, E. D.: Linear Algebra and Matrix Theory, Interscience, New York, 1963.
13.7. Shields, P. C.: Linear Algebra, Addison-Wesley, Reading, Mass., 1964.
13.8. Thrall, R. M., and L. Tornheim: Vector Spaces and Matrices, Wiley, New York, 1957.
13.9. Zurmuehl, R.: Matrizen, 2d ed., Springer, Berlin, 1964.
(See also the articles by G. Falk and H. Tietz in vol. II of the Handbuch der Physik, Springer, Berlin, 1955. For numerical techniques, see Secs. 20.3-3 to 20.3-5.)
Matrix Techniques for Systems of Differential Equations
(See also Refs. 9.3 and 9.16 in Sec. 9.7-2)
13.10. DeRusso, P., et al.: State Variables for Engineers, Wiley, New York, 1965.
13.11. Geiss, G. R.: “The Analysis and Design of Nonlinear Control Systems via Lyapunov's Direct Method,” RTD-TDR-63-4076, U.S. Air Force Flight Dynamics Laboratory, Wright-Patterson AFB, Ohio, 1964.
13.12. Hahn, W.: Theory and Application of Lyapunov's Direct Method, Prentice-Hall, Englewood Cliffs, N.J., 1963.
13.13. Krasovskii, N. N.: Stability of Motion, Stanford, Stanford, Calif., 1963.
13.14. Letov, A. M.: Stability of Nonlinear Control Systems, Princeton, Princeton, N.J., 1961.
13.15. Schultz, D. G., and J. L. Melsa: State Functions in Automatic Control, McGraw-Hill, New York, 1967.
13.16. Tomović, R.: Sensitivity Analysis of Dynamic Systems, McGraw-Hill, New York, 1964.