CHAPTER 14

LINEAR VECTOR SPACES AND LINEAR TRANSFORMATIONS (LINEAR OPERATORS). REPRESENTATION OF MATHEMATICAL MODELS IN TERMS OF MATRICES

    14.1. Introduction. Reference Systems and Coordinate Transformations

      14.1-1. Introductory Remarks

      14.1-2. Numerical Description of Mathematical Models: Reference Systems

      14.1-3. Coordinate Transformations

      14.1-4. Invariance

      14.1-5. Schemes of Measurements

    14.2. Linear Vector Spaces

      14.2-1. Defining Properties

      14.2-2. Linear Manifolds and Subspaces in υ

      14.2-3. Linearly Independent Vectors and Linearly Dependent Vectors

      14.2-4. Dimension of a Linear Manifold or Vector Space. Bases and Reference Systems (Coordinate Systems)

      14.2-5. Normed Vector Spaces

      14.2-6. Unitary Vector Spaces

      14.2-7. Metric and Convergence in Normed Vector Spaces. Banach Spaces and Hilbert Spaces

      14.2-8. The Projection Theorem

    14.3. Linear Transformations (Linear Operators)

      14.3-1. Linear Transformation of a Vector Space. Linear Operators

      14.3-2. Range, Null Space, and Rank of a Linear Transformation (Operator)

      14.3-3. Addition and Multiplication by Scalars. Null Transformation

      14.3-4. Product of Two Linear Transformations (Operators). Identity Transformation

      14.3-5. Nonsingular Linear Transformations (Operators). Inverse Transformations (Operators)

      14.3-6. Integral Powers of Operators

    14.4. Linear Transformations of a Normed or Unitary Vector Space into Itself. Hermitian and Unitary Transformations (Operators)

      14.4-1. Bound of a Linear Transformation

      14.4-2. The Bounded Linear Transformations of a Normed Vector Space into Itself

      14.4-3. Hermitian Conjugate of a Linear Transformation (Operator)

      14.4-4. Hermitian Operators

      14.4-5. Unitary Transformations (Operators)

      14.4-6. Symmetric, Skew-symmetric, and Orthogonal Transformations of Real Unitary Vector Spaces

      14.4-7. Combination Rules

      14.4-8. Decomposition Theorems. Normal Operators

      14.4-9. Conjugate Vector Spaces. More General Definition of Conjugate Operators

      14.4-10. Infinitesimal Linear Transformations

    14.5. Matrix Representation of Vectors and Linear Transformations (Operators)

      14.5-1. Transformation of Base Vectors and Vector Components: "Alibi" Point of View

      14.5-2. Matrix Representation of Vectors and Linear Transformations (Operators)

      14.5-3. Matrix Notation for Simultaneous Linear Equations

      14.5-4. Dyadic Representation of Linear Operators

    14.6. Change of Reference System

      14.6-1. Transformation of Base Vectors and Vector Components: "Alias" Point of View

      14.6-2. Representation of a Linear Operator in Different Schemes of Measurements

      14.6-3. Consecutive Operations

    14.7. Representation of Inner Products. Orthonormal Bases

      14.7-1. Representation of Inner Products

      14.7-2. Change of Reference System

      14.7-3. Orthogonal Vectors and Orthonormal Sets of Vectors

      14.7-4. Orthonormal Bases (Complete Orthonormal Sets)

      14.7-5. Matrices Corresponding to Hermitian-conjugate Operators

      14.7-6. Reciprocal Bases

      14.7-7. Comparison of Notations

    14.8. Eigenvectors and Eigenvalues of Linear Operators

      14.8-1. Introduction

      14.8-2. Invariant Manifolds. Decomposable Linear Transformations (Linear Operators) and Matrices

      14.8-3. Eigenvectors, Eigenvalues, and Spectra

      14.8-4. Eigenvectors and Eigenvalues of Normal and Hermitian Operators

      14.8-5. Determination of Eigenvalues and Eigenvectors: Finite-dimensional Case

      14.8-6. Reduction and Diagonalization of Matrices. Principal-axes Transformations

      14.8-7. "Generalized" Eigenvalue Problems

      14.8-8. Eigenvalue Problems as Stationary-value Problems

      14.8-9. Bounds for the Eigenvalues of Linear Operators

      14.8-10. Nonhomogeneous Linear Vector Equations

    14.9. Group Representations and Related Topics

      14.9-1. Group Representations

      14.9-2. Reduction of a Representation

      14.9-3. The Irreducible Representations of a Group

      14.9-4. The Character of a Representation

      14.9-5. Orthogonality Relations

      14.9-6. Direct Products of Representations

      14.9-7. Representations of Rings, Fields, and Linear Algebras

    14.10. Mathematical Description of Rotations

      14.10-1. Rotations in Three-dimensional Euclidean Vector Space

      14.10-2. Angle of Rotation. Rotation Axis

      14.10-3. Euler Parameters and Gibbs Vector

      14.10-4. Representation of Vectors and Rotations by Spin Matrices and Quaternions. Cayley-Klein Parameters

      14.10-5. Rotations about the Coordinate Axes

      14.10-6. Euler Angles

      14.10-7. Infinitesimal Rotations, Continuous Rotation, and Angular Velocity

      14.10-8. The Three-dimensional Rotation Group and Its Representations

    14.11. Related Topics, References, and Bibliography

      14.11-1. Related Topics

      14.11-2. References and Bibliography

14.1. INTRODUCTION. REFERENCE SYSTEMS AND COORDINATE TRANSFORMATIONS

14.1-1. Introductory Remarks. This chapter reviews the theory of linear vector spaces (see also Sec. 12.4-1) and linear transformations (linear operators). Vectors and linear operators represent physical objects and operations in many important applications.

Most practical problems require a description (representation) of mathematical models (Sec. 12.1-1) in terms of ordered sets of real or complex numbers. In particular, the concepts of homomorphism and isomorphism (Sec. 12.1-6) make it possible to “represent” many mathematical models by corresponding classes of matrices (Sec. 13.2-1; see also Sec. 13.2-5), so that abstract mathematical operations are made to correspond to numerical operations on matrix elements (EXAMPLES: matrix representations of quantum-mechanical operators and of electrical transducers). Sections 14.5-1 to 14.10-8 describe the use of matrices to represent vectors, linear operators, and group elements.

14.1-2. Numerical Description of Mathematical Models: Reference Systems (see also Secs. 2.1-2, 3.1-2, 5.2-2, 6.2-1, 12.1-1, and 16.1-2). A reference system (coordinate system) is a scheme of rules which describes (represents) each object (point) of a class (space, region of a space) C by a corresponding ordered set of (real or complex) numbers (components, coordinates) x1, x2, . . . . The number of coordinates required to define each point (x1, x2, . . .) is called the dimension of the space C (see also Sec. 14.2-4b). In many applications, coordinate values are related to physical measurements.

14.1-3. Coordinate Transformations (see also Secs. 2.1-5 to 2.1-8, 3.1-12, 6.2-1, and 16.1-2). A transformation of the coordinates x1, x2, . . . is a set of rules or relations associating each point (x1, x2, . . .) with a new set of coordinates x̄1, x̄2, . . . . Coordinate transformations admit two interpretations:

1. “Alibi” or “active” point of view: the coordinate transformation x̄i = x̄i(x1, x2, . . .) (14.1-1) describes an operation (function, mapping, Sec. 12.1-4) associating a new mathematical object (point) (x̄1, x̄2, . . .) with each given point (x1, x2, . . .)

2. “Alias” or “passive” point of view: the coordinate transformation x̄i = x̄i(x1, x2, . . .) (14.1-2) introduces a new description (new representation, relabeling) of each point (x1, x2, . . .) in terms of the new coordinates x̄1, x̄2, . . . .

Coordinate transformations permit one to represent abstract mathematical relationships by numerical relations (“alibi” point of view) and to change reference systems (“alias” point of view). A change of reference system often simplifies a given problem (EXAMPLES: principal-axes transformations, Secs. 2.4-8, 3.5-7, 9.4-8, and 17.4-7; contact transformations, Secs. 10.2-5 and 11.5-6; angle and action variables in dynamics).

14.1-4. Invariance (see also Secs. 12.1-5 and 16.2-1, and refer to Secs. 16.1-4 and 16.4-1 for more detailed discussions). A function of the coordinates labeling an object or objects is invariant with respect to a given coordinate transformation (1) or (2) if the function value is unchanged on substitution of x̄i(x1, x2, . . .) or xi(x̄1, x̄2, . . .) for each xi. A relation between coordinate values is invariant if it remains valid after similar substitutions.

Invariance with respect to an “alibi”-type coordinate transformation is interpreted in the manner of Sec. 12.1-5. Functions and relations invariant with respect to a suitable class of “alias”-type coordinate transformations may be regarded as functions of, and relations between, the actual objects (invariants) represented by different sets of coordinates in different reference systems. A complete set of invariants f1(x1, x2, . . .), f2(x1, x2, . . .), . . . uniquely specifies all properties of the object (x1, x2, . . .) which are invariant with respect to a given class (group) of coordinate transformations (see also Sec. 12.2-8).

14.1-5. Schemes of Measurements. The representation of a model involving two or more classes of objects will, in general, require a reference system for each class of objects; the resulting set of reference systems will be called a scheme of measurements. A change of the scheme of measurements involves an “alias”-type coordinate transformation for each class of objects; one usually relates these transformations so as to ensure the invariance of important functions and/or relations (see also Secs. 14.6-2, 16.1-4, 16.2-1, and 16.4-1).

14.2. LINEAR VECTOR SPACES

14.2-1. Defining Properties. As already stated in Sec. 12.4-1, a linear vector space υ of vectors a, b, c, . . . over the ring (with identity, Sec. 12.3-1b) R of scalars α, β, . . . admits vector addition and multiplication of vectors by scalars with the following properties:

a + b = b + a        (a + b) + c = a + (b + c)
α(a + b) = αa + αb        (α + β)a = αa + βa
α(βa) = (αβ)a        1a = a        (1)

together with the existence of a null vector 0 such that a + 0 = a, and of a vector −a such that a + (−a) = 0, for every a in υ.

Note that

0a = 0        α0 = 0        (−1)a = −a

Unless the contrary is specifically stated, all linear vector spaces considered in this handbook are understood to be real vector spaces or complex vector spaces respectively defined as linear vector spaces over the field of real numbers and over the field of complex numbers.

In the case of vector spaces admitting a definition of infinite sums (Sec. 14.2-7b), many authors refer to a set of vectors with the above properties as a linear manifold and reserve the term vector space for a linear manifold which is closed, i.e., which contains all its limit points (Sec. 12.5-1b; the two terms are equivalent in the case of finite-dimensional manifolds, Sec. 14.2-4).

14.2-2. Linear Manifolds and Subspaces in υ. A subset υ1 of a linear vector space υ is a linear manifold in υ if and only if υ1 is itself a linear vector space over the same ring of scalars as υ; υ1 will be called a subspace of υ if it is a closed linear manifold in υ (see also Sec. 14.2-1). A proper subspace of υ is a subspace other than 0 or υ itself.

Any given set of vectors e1, e2, . . . in υ generates (spans, determines) a linear manifold comprising all linear combinations of e1, e2, . . .

EXAMPLES: straight lines and planes through the origin in three-dimensional Euclidean space.

14.2-3. Linearly Independent Vectors and Linearly Dependent Vectors (see also Secs. 1.9-3, 5.2-2, and 9.3-2). (a) A finite set of vectors a1, a2, . . . , am is linearly independent if and only if

λ1a1 + λ2a2 + . . . + λmam = 0   implies   λ1 = λ2 = . . . = λm = 0        (2)

Otherwise, the vectors a1, a2, . . . , am are linearly dependent, and at least one vector of the set, say ak, can be expressed as a linear combination of the other vectors ai of the set. As a trivial special case, this is true whenever ak is a null vector.

(b) The definitions of Sec. 14.2-3a apply to infinite sets of vectors a1, a2, . . . if it is possible to assign a meaning to Eq. (2). In general, this will require the vector space to admit a definition of convergence in addition to the algebraic postulates of Sec. 14.2-1 (Secs. 12.5-3 and 14.2-7b).

14.2-4. Dimension of a Linear Manifold or Vector Space. Bases and Reference Systems (Coordinate Systems). (a) A (linear) basis in the linear manifold υ is a set of linearly independent vectors e1, e2, . . . of υ such that every vector a of υ can be expressed as a linear form

a = α1e1 + α2e2 + . . .        (3)

in the base vectors ei. Every set of linearly independent vectors forms a basis for the linear manifold comprising all linear combinations of the given vectors.

(b) In a finite-dimensional linear manifold or vector space spanned by n base vectors

Every set of n linearly independent vectors is a basis.

No set of m < n vectors is a basis.

Every set of m > n vectors is necessarily linearly dependent.

The number n is called the (linear) dimension of the vector space. An infinite-dimensional vector space does not admit a finite basis.

(c) In every finite-dimensional real or complex vector space, the numbers α1, α2, . . . , αn are unique components or coordinates of the vector a = α1e1 + α2e2 + . . . + αnen in a reference system (coordinate system) defined by the base vectors e1, e2, . . . , en. Note that a + b has the components αi + βi, and αa has the components ααi (i = 1, 2, . . . , n; see also Sec. 5.2-2).
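The computation of the components αi of a vector in a given basis amounts to solving a set of simultaneous linear equations. The following sketch is an editorial illustration (not part of the handbook text); the basis and the vector are assumed examples, and NumPy is used for the numerics:

```python
import numpy as np

# An assumed (nonorthogonal) basis of three-dimensional real space:
# the base vectors e1, e2, e3 appear as the columns of E.
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

a = np.array([2.0, 3.0, 4.0])

# Components alpha_i of a in this basis: solve E @ alpha = a.
alpha = np.linalg.solve(E, a)

# Reconstruct a as the linear form alpha_1 e1 + alpha_2 e2 + alpha_3 e3.
a_reconstructed = E @ alpha
```

In an orthonormal basis (Sec. 14.7-4) the solve step reduces to inner products with the base vectors.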

(d) Two linear vector spaces υ and υ′ over the same ring of scalars α, β, . . . are isomorphic (Sec. 12.1-6) if and only if it is possible to relate their respective vectors a, b, . . . and a′, b′, ... by a reciprocal one-to-one correspondence a ↔ a′, b ↔ b′, . . . such that a + b ↔ a′ + b′, αa ↔ αa′. In the case of finite-dimensional vector spaces, this is true if and only if υ and υ′ have the same linear dimension.

In particular, every n-dimensional real or complex vector space is isomorphic with the space of n-rowed column matrices over the field of real or complex numbers, respectively (matrix representation, Sec. 14.5-2).

14.2-5. Normed Vector Spaces. A real or complex vector space υ is a normed vector space if and only if for every vector a of υ there exists a real number ||a|| (norm, absolute value, magnitude of a) such that a = b implies ||a|| = ||b||, and, for all a, b in υ,

||a|| ≥ 0        ||a|| = 0 if and only if a = 0
||αa|| = |α| ||a||        ||a + b|| ≤ ||a|| + ||b||   (triangle inequality)        (4)

A unit vector is a vector of unit magnitude (see also Sec. 5.2-5). Note that ||−a|| = ||a||, ||0|| = 0 (see also Sec. 13.2-1).

14.2-6. Unitary Vector Spaces. (a) A real or complex vector space υu is a unitary (hermitian) vector space if and only if it is possible to define a binary operation (inner or scalar multiplication of vectors) associating a scalar (a, b), the (hermitian) inner, scalar, or dot product of a and b, with every pair a, b of vectors of υu, where

1. (a, b + c) = (a, b) + (a, c)
2. (a, αb) = α(a, b)
3. (b, a) = (a, b)*   (the asterisk denotes the complex conjugate)
4. (a, a) > 0 for a ≠ 0;   (a, a) = 0 for a = 0        (5)

It follows that in every unitary vector space

(αa, b) = α*(a, b)        (a + b, c) = (a, c) + (b, c)        (a, 0) = (0, a) = 0
|(a, b)|2 ≤ (a, a)(b, b)   (Cauchy-Schwarz inequality)        (6)

* Some authors use the alternative defining postulate (αa, b) = α(a, b) which amounts to an interchange of a and b in the definition of (a, b).

The Cauchy-Schwarz inequality (6) reduces to an equation if and only if a and b are linearly dependent (see also Secs. 1.3-2, 4.6-19, and 15.2-1c).

m vectors a1, a2, . . . , am of υu are linearly independent if and only if the mth-order determinant det [(ai, ak)] is different from zero (Gram's determinant; see also Secs. 5.2-8 and 15.2-1a).
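The Gram-determinant criterion is directly computable. The following sketch is an editorial illustration (not handbook text); the vector sets are assumed examples, and NumPy is used:

```python
import numpy as np

def gram_det(vectors):
    """Gram determinant det[(a_i, a_k)] of a list of vectors."""
    G = np.array([[np.vdot(ai, ak) for ak in vectors] for ai in vectors])
    return np.linalg.det(G)

# Three linearly independent vectors: nonzero Gram determinant.
independent = [np.array([1.0, 0.0, 0.0]),
               np.array([1.0, 1.0, 0.0]),
               np.array([0.0, 1.0, 1.0])]

# A linearly dependent set (third vector is the sum of the first two).
dependent = [np.array([1.0, 0.0, 0.0]),
             np.array([1.0, 1.0, 0.0]),
             np.array([2.0, 1.0, 0.0])]

d1 = gram_det(independent)  # nonzero
d2 = gram_det(dependent)    # zero, up to roundoff
```

`np.vdot` conjugates its first argument, matching the hermitian inner product of Sec. 14.2-6a for complex vectors as well.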

(b) If a unitary vector space is a real vector space, all scalar products (a, b) are real, and scalar multiplication of vectors is commutative, so that

(a, b) = (b, a)        (αa, b) = (a, αb) = α(a, b)        (7)

NOTE: The inner-product spaces with indefinite metric used in relativity theory are real or complex vector spaces admitting the definition of an inner product (a, b) which satisfies conditions (1) to (3), but not condition (4) of Sec. 14.2-6a. The vectors of such a space may be classified as vectors of positive, negative, and zero square (a, a). One defines ||a||2 = |(a, a)|. See also Secs. 14.2-5 and 16.8-1.

14.2-7. Metric and Convergence in Normed Vector Spaces. Banach Spaces and Hilbert Spaces. (a) Every normed vector space (Sec. 14.2-5) is a metric space with the metric d(a, b) = ||a − b|| (Sec. 12.5-2) and permits the definition of neighborhoods and convergence in the manner of Sec. 12.5-3 (see also Sec. 5.3-1). In this sense, an infinite sum a0 + a1 + a2 + . . . converges to (equals) a vector

s = a0 + a1 + a2 + . . .   if and only if   lim n→∞ ||s − (a0 + a1 + . . . + an)|| = 0

A normed vector space υ is complete (Sec. 12.5-4) if and only if every sequence of vectors s0, s1, s2, . . . of υ such that

lim m→∞, n→∞ ||sm − sn|| = 0

(Cauchy sequence, see also Sec. 4.9-1) converges to a vector s of υ. Complete normed vector spaces are called Banach spaces. Every finite-dimensional normed vector space is complete.

(b) Every unitary vector space υu permits one to introduce the norm (absolute value, magnitude) ||a|| of each vector a, the distance d(a, b) between two “points” a, b of υu, and the angle γ between any two vectors a, b by means of the definitions

||a|| = +√(a, a)        d(a, b) = ||a − b|| = +√((a − b, a − b))        cos γ = (a, b)/(||a|| ||b||)        (8)

The functions ||a|| and d(a, b) defined by Eq. (8) satisfy all conditions of Secs. 14.2-5 and 12.5-2. If υu is a real unitary vector space, the Cauchy-Schwarz inequality (6) ensures that the angle γ is real for all a, b.

(c) Finite-dimensional real unitary vector spaces are called Euclidean vector spaces. They are separable, complete, and boundedly compact (Sec. 12.5-4) and serve as models for n-dimensional Euclidean geometries (see also Chaps. 2 and 3 and Secs. 5.2-6 and 17.4-6d). Complete infinite-dimensional unitary vector spaces are called Hilbert spaces.* The complete sequence and function spaces listed in Table 12.5-1 are all Hilbert spaces (and hence also Banach spaces).

Hilbert spaces preserve many of the properties of Euclidean spaces. In particular, every separable (Sec. 12.5-1b) real or complex Hilbert space is isomorphic and isometric to the space l2 of respectively real or complex infinite sequences (ξ1, ξ2, . . .) such that ||(ξ1, ξ2, . . .)||2 = |ξ1|2 + |ξ2|2 + . . . converges (Table 12.5-1a). Hence each vector of a separable Hilbert space can be labeled with a countable set of coordinates (or with a column or row matrix, Sec. 14.5-2).

Every closed linear manifold in a Hilbert space is a complete subspace (see also Sec. 14.2-2) and is, thus, itself a Euclidean vector space or a Hilbert space.

14.2-8. The Projection Theorem. Given any vector x of a unitary vector space υu and a complete subspace υ1, there exists a unique vector y = xp of υ1 which minimizes the distance ||x − y|| over all y in υ1; moreover, xp is the unique vector y of υ1 such that x − y is orthogonal to every vector x1 of υ1, i.e.,

(x − xp, x1) = 0   for all x1 in υ1        (9)

(see also Sec. 14.7-3b). The mapping x → xp is a bounded linear operation (Sec. 14.4-2) called the orthogonal projection of (the points of) υu onto υ1.

The projection theorem is of the greatest practical importance, for Eq. (9) defines the optimal approximation of a vector x by a vector y of the “simpler” class υ1 if ||x — y||2 measures the error of the approximation.

EXAMPLES: Projection of points onto planes in Euclidean geometry; orthogonal-function approximations (Secs. 15.2-3, 15.2-6, 20.6-2, and 20.6-3), mean-square regression (Sec. 18.4-6), Wiener filtering and prediction.
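When υ1 is spanned by the columns of a matrix, the projection xp is exactly the least-squares solution of the corresponding overdetermined system. The following sketch is an editorial illustration (not handbook text); the subspace basis B and the vector x are assumed examples:

```python
import numpy as np

# Subspace v1 of R^4 spanned by the columns of B (assumed example basis).
B = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0],
              [0.0, 0.0]])

x = np.array([1.0, 2.0, 3.0, 4.0])

# Least-squares coefficients give the distance-minimizing y = B c, i.e. x_p.
c, *_ = np.linalg.lstsq(B, x, rcond=None)
x_p = B @ c

# Orthogonality of the residual x - x_p to the subspace: its inner product
# with every basis vector (column of B) vanishes, as in Eq. (9).
residuals = B.T @ (x - x_p)
```

Any other vector of the subspace lies strictly farther from x, illustrating the uniqueness of the minimizer.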

14.3. LINEAR TRANSFORMATIONS (LINEAR OPERATORS)

14.3-1. Linear Transformation of a Vector Space. Linear Operators. Given two linear vector spaces υ and υ′ over the same field of

* Some authors do not require Hilbert spaces to be infinite-dimensional; others require them to be separable (Sec. 12.5-1b) as well as complete. Unitary vector spaces are sometimes referred to as pre-Hilbert spaces.

scalars α, β, . . . , a (homogeneous) linear transformation of υ into υ′ is a correspondence

x′ = f(x)        (1)

which relates vectors x′ of υ′ to vectors x of υ so as to preserve the “linear” operations of vector addition and multiplication of vectors by scalars:

f(x + y) = f(x) + f(y)        f(αx) = αf(x)        (2)

Each linear transformation can be written as a multiplication by a linear operator A (linear operation), with

x′ = Ax        A(x + y) = Ax + Ay        A(αx) = αAx        (3)

f(x) = Ax + a′ is called a linear vector function. The definition of each linear operator must include that of its domain of definition. In physics, the first relation (3) is often referred to as a superposition principle for a class of operations.

14.3-2. Range, Null Space, and Rank of a Linear Transformation (Operator). The range (Sec. 12.1-4) of a linear transformation A of υ into υ′ is a linear manifold (Sec. 14.2-2) of υ′. The null space of A is the manifold of vectors x of υ mapped onto 0 (Ax = 0). The rank r and the nullity r′ of a linear transformation A are the respective linear dimensions (Sec. 14.2-4b) of its range and null space. If υ has the finite dimension n, then the range and null space of A are subspaces, and r + r′ = n.
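The relation r + r′ = n can be checked numerically for a matrix representing a singular operator. The following sketch is an editorial illustration (not handbook text); the matrix is an assumed example:

```python
import numpy as np

# A singular linear operator on R^3: the third row equals the sum of
# the first two, so the rank r is 2.
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0],
              [1.0, 3.0, 4.0]])

n = A.shape[1]
r = np.linalg.matrix_rank(A)   # dimension of the range
nullity = n - r                # r' from the relation r + r' = n

# A null-space vector: the right singular vector belonging to the
# zero singular value satisfies A @ z = 0.
_, s, Vt = np.linalg.svd(A)
z = Vt[-1]
```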

14.3-3. Addition and Multiplication by Scalars. Null Transformation. (a) Let A and B be linear transformations (operators) mapping a given domain of definition D in υ into υ′. One defines A ± B and αA as linear transformations of υ into υ′ such that

(A ± B)x = Ax ± Bx        (αA)x = α(Ax)        (4)

for all vectors x in D.

(b) The null transformation O of υ into υ is defined by Ox = 0 for all vectors x in υ.

14.3-4. Product of Two Linear Transformations (Operators). Identity Transformation. (a) Let A be a linear transformation (operator) mapping υ into υ′, and let B be a linear transformation mapping (the range of A in) υ′ into υ″. The product BA is the linear transformation of υ into υ″ obtained by performing the transformations A

and B successively (see also Sec. 12.2-8):

(BA)x ≡ B(Ax)        (5)

(b) The identity transformation I of any vector space υ transforms every vector x of υ into itself:

Ix = x        (6)

14.3-5. Nonsingular Linear Transformations (Operators). Inverse Transformations (Operators). A linear transformation (operator) A is nonsingular (regular) if and only if it defines a reciprocal one-to-one correspondence mapping all of υ onto all of υ′ (υ and υ′ are then necessarily isomorphic, Sec. 14.2-4d). A is nonsingular if and only if it has a unique inverse (inverse transformation, inverse operator) A-1 mapping υ′ onto υ so that x′ = Ax implies x = A-1x′ and conversely, or

A-1A = AA-1 = I        (7)

Products and inverses of nonsingular transformations (operators) are non-singular; if A and B are nonsingular, and α ≠ 0,

(BA)-1 = A-1B-1        (αA)-1 = α-1A-1        (A-1)-1 = A        (8)

Nonsingular linear transformations (operations) preserve linear independence of vectors and hence also the linear dimensions of transformed manifolds (Secs. 14.2-3 and 14.2-4).

A linear operator A is nonsingular if it has a unique left or right inverse, or if it has equal left and right inverses; the mere existence of a left or right inverse is not sufficient. A linear transformation (operator) A defined on a finite-dimensional vector space is nonsingular if and only if Ax = 0 implies x = 0, i.e., if and only if r = n, r′ = 0 (Sec. 14.3-2).

14.3-6. Integral Powers of Operators. One defines A0 ≡ I, A1 ≡ A, A2 ≡ AA, A3 ≡ AAA, . . . , and, if A is nonsingular, A-p ≡ (A-1)p = (Ap)-1 (p = 1, 2, . . .). The ordinary rules for operations with exponents apply (see also Sec. 12.4-2).
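The exponent rules carry over directly to matrix representations. A brief sketch (an editorial illustration, with an assumed nonsingular example matrix):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 1.0]])   # a nonsingular operator on R^2 (det = 2)

A0 = np.linalg.matrix_power(A, 0)    # A^0 = I
A3 = np.linalg.matrix_power(A, 3)    # A A A
Am2 = np.linalg.matrix_power(A, -2)  # (A^-1)^2, defined since A is nonsingular

# Exponent rule: A^3 A^-2 = A^1.
product = A3 @ Am2
```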

14.4. LINEAR TRANSFORMATIONS OF A NORMED OR UNITARY VECTOR SPACE INTO ITSELF. HERMITIAN AND UNITARY TRANSFORMATIONS (OPERATORS)

14.4-1. Bound of a Linear Transformation (see also Sec. 13.2-1a). A linear transformation A of a normed vector space (Sec. 14.2-5) υ into a normed vector space υ' is bounded if and only if A has a finite bound (norm)

||A|| = sup ||Ax||/||x||   (x ≠ 0)        (1)

Every linear transformation (operator) defined throughout a finite-dimensional normed vector space is bounded.*

If υ and υ' are unitary vector spaces (Sec. 14.2-6), then

image

14.4-2. The Bounded Linear Transformations of a Normed Vector Space into Itself. (a) The bounded linear transformations (operators) A, B, . . . of a normed vector space υ into itself constitute a linear algebra (Sec. 12.4-2). Within this algebra, the singular transformations are zero divisors (Sec. 12.3-1a); the nonsingular transformations constitute a multiplicative group and, together with the null transformation (Sec. 14.3-3b), form a division algebra (Sec. 12.4-2). If υ has the finite dimension n, the transformation algebra is of order n2.

The transformation algebra (operator algebra) is not in general commutative (see also Sec. 12.4-2). The operator AB — BA is called the commutator of A and B.
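Noncommutativity is easy to exhibit with matrices. A minimal sketch (an editorial illustration; the two matrices are assumed examples):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0, 0.0],
              [1.0, 0.0]])

# The commutator AB - BA is nonzero: A and B do not commute.
commutator = A @ B - B @ A
```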

(b) Bounded linear operators defined on complete unitary vector spaces (either finite-dimensional or Hilbert spaces, Sec. 14.2-7c) permit the definition of convergent sequences and analytic functions of operators with the aid of the metric ||A — B|| (Sec. 12.5-3) in the manner of Secs. 13.2-11 and 13.2-12.

14.4-3. Hermitian Conjugate of a Linear Transformation (Operator). Every bounded linear transformation† A of a complete unitary vector space υu into itself has a unique hermitian conjugate (adjoint, conjugate, associate operator) A† defined by

(y, Ax) = (A†y, x)   for all x, y in υu        (3)

* Many texts define homogeneous linear operators on any normed vector space as operators which satisfy Eq. (14.3-3) and are bounded (and hence continuous in the sense of Sec. 12.5-1e).

† See footnote to Sec. 14.4-1.

so that

(A†)† = A        (A + B)† = A† + B†        (αA)† = α*A†        (BA)† = A†B†        ||A†|| = ||A||        (4)

(see also Sec. 14.2-6a).

14.4-4. Hermitian Operators. A linear operator A mapping a complete unitary vector space υu into itself is a hermitian operator (self-adjoint operator, self-conjugate operator) if and only if

A† = A        (5)

If υu is a complex complete unitary vector space, A is hermitian if and only if (x, Ax) is real for all x, or

(x, Ax) = (Ax, x)   for all x in υu        (6)

(alternative definition). A transformation (operator) A such that A† = — A is called skew-hermitian.

Hermitian operators are of great importance in applications which require (x, Ax) to be a real quantity (vibration theory, quantum mechanics). A hermitian operator A is, respectively, positive definite, negative definite, nonnegative, nonpositive, positive semidefinite, negative semidefinite, indefinite, or zero if the same is true for the inner product (hermitian form) (x, Ax) (see also Secs. 13.5-3 and 14.7-1). If A is nonnegative, there exists a hermitian operator Q such that Q†Q = QQ† = A; Q is uniquely defined if A is nonsingular.
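For a matrix representation of a nonnegative hermitian A, one such Q is the hermitian square root obtained from the spectral decomposition. A sketch (an editorial illustration; the matrix is an assumed example):

```python
import numpy as np

# A nonnegative hermitian operator, represented by a real symmetric matrix
# (eigenvalues 1 and 3, both nonnegative).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Spectral decomposition A = V diag(w) V†, then Q = V diag(sqrt(w)) V†.
w, V = np.linalg.eigh(A)
Q = V @ np.diag(np.sqrt(w)) @ V.conj().T
```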

14.4-5. Unitary Transformations (Operators). A linear transformation A mapping a complete unitary vector space υu into itself is unitary if and only if

A†A = AA† = I

Every unitary transformation A is nonsingular and bounded, and ||A|| = 1. Every unitary transformation x′ = Ax preserves scalar products:

(Ax, Ay) = (x, y)        ||Ax|| = ||x||        (10)

If υu is finite-dimensional, each of the relations (10) implies that A is unitary.

Unitary transformations preserve the results of scalar multiplication of vectors as well as those of vector addition and multiplication by scalars, so that absolute values, distances, angles, orthogonality, and orthonormality (Secs. 14.2-7a and 14.7-3) are invariant (Sec. 12.1-5).
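A plane rotation is a familiar orthogonal (hence unitary) transformation, and the invariances are easy to verify numerically. A sketch (an editorial illustration; angle and vectors are assumed examples):

```python
import numpy as np

theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # a rotation of R^2

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

# Scalar products and norms are unchanged by the transformation.
ip_before = np.vdot(x, y)
ip_after = np.vdot(U @ x, U @ y)
```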

14.4-6. Symmetric, Skew-symmetric, and Orthogonal Transformations of Real Unitary Vector Spaces. (a) The hermitian conjugate (Sec. 14.4-3) A† associated with a linear transformation A of a real complete unitary vector space υE into itself is often called the transpose (conjugate, associate operator) Ã of A and satisfies the relations

(y, Ax) = (Ãy, x)   for all x, y in υE

à may be substituted for A† in all relations of Sec. 14.4-3 whenever the vector space in question is a real unitary vector space.

(b) A linear transformation A of a real complete unitary vector space υE is symmetric if and only if

Ã = A

skew-symmetric (antisymmetric) if and only if

Ã = −A

orthogonal if and only if

ÃA = AÃ = I

Orthogonal transformations defined on real unitary vector spaces are unitary, so that all theorems of Sec. 14.4-5 apply.

14.4-7. Combination Rules (see also Sec. 13.3-3). (a) If the operator A is hermitian, the same is true for Ap (p = 0, 1, 2, . . .), A-1, and T†AT, and for αA if α is real.

Given any nonsingular operator T, T†AT is hermitian if and only if A is hermitian; hence for any unitary operator T, T-1AT is hermitian if and only if the same is true for A.

In particular, if the vector space in question is a real complete unitary vector space, and A is symmetric, the same is true for Ap (p = 0, 1, 2, . . .), A-1, T̃AT, and αA. Given any nonsingular operator T, T̃AT is symmetric if and only if A is symmetric; and for any orthogonal operator T, T-1AT is symmetric if and only if the same is true for A.

(b) If A and B are hermitian (or symmetric), the same is true for A + B. The product AB of two hermitian (or symmetric) operators A and B is hermitian (or symmetric) if and only if BA = AB (see also Sec. 13.4-4b).

(c) If A is a unitary transformation (operator), the same is true for Ap (p = 0, 1, 2, . . .), A-1, and A†, and for αA if |α| = 1. If A and B are unitary, the same is true for AB.

If A is an orthogonal transformation, the same is true for Ap (p = 0, 1, 2, . . .), A-1, Ã, and −A. If A and B are orthogonal, the same is true for AB.

14.4-8. Decomposition Theorems. Normal Operators (see also Secs. 13.3-4 and 14.8-4). (a) For every linear operator A mapping a complete unitary vector space into itself, ½(A + A†) = H1 and (1/2i)(A − A†) = H2 are hermitian operators. A = H1 + iH2 is the (unique) cartesian decomposition of the given operator A into a hermitian part and a skew-hermitian part (comparable to the cartesian decomposition of complex numbers into real and imaginary parts, Sec. 1.3-1).

If A is defined on a real complete unitary vector space, the cartesian decomposition reduces to the (unique) decomposition of A into the symmetric part ½(A + Ã) and the skew-symmetric part ½(A — Ã).

For every linear operator A mapping a complete unitary vector space into itself, A†A is hermitian and nonnegative, and there exists a polar decomposition A = QU of A into a nonnegative hermitian factor Q and a unitary factor U. Q is uniquely defined by Q2 = AA†, and U is uniquely defined if and only if A is nonsingular (compare this with Sec. 1.3-2).
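For a real matrix (where the hermitian conjugate is the transpose), the polar factors can be computed from the singular-value decomposition. A sketch (an editorial illustration; the matrix is an assumed example):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])   # a nonsingular example operator

# SVD A = W diag(s) V†; then Q = W diag(s) W† is nonnegative hermitian
# and U = W V† is unitary, with A = Q U.
W, s, Vt = np.linalg.svd(A)
Q = W @ np.diag(s) @ W.conj().T
U = W @ Vt
```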

(b) A linear operator A mapping a complete unitary vector space υu into itself is a normal operator if and only if A†A = AA† or, equivalently, if and only if H2H1 = H1H2. A bounded operator A is normal if and only if ||Ax|| = ||A†x|| for all x in υu. Hermitian and unitary operators are normal.

14.4-9. Conjugate (Adjoint, Dual) Vector Spaces. More General Definition of Conjugate (Adjoint) Operators. (a) The bounded linear transformations* A of a normed vector space υ into any complete normed vector space (Banach space, Sec. 14.2-7) constitute a complete normed vector space, with addition and multiplication by scalars defined by Eq. (14.3-4), and norm ||A||. In particular, the class of bounded, linear, and homogeneous scalar functions φ(x) defined on a normed vector space υ constitute a complete normed vector space,† the conjugate (adjoint, dual) vector space υ† associated with υ.

A bounded linear transformation

x′ = Ax

*See footnote to Sec. 14.4-1.

† Note carefully that the value of φ(x) is a scalar, while the function φ(x) can be a multidimensional vector. In the context of Chap. 15 (Sec. 15.4-3), the numerical-valued function φ(x) is a functional.

mapping υ into another normed vector space υ′ relates vectors φ, φ′ of the corresponding conjugate spaces υ†, υ′† by the bounded linear transformation

φ = A†φ′   with   φ(x) ≡ φ′(Ax) for all x in υ

A† is called the conjugate (adjoint) operator associated with A; one has

image

Note that (A†)† is not in general identical with A in this context.

(b) Every bounded, linear, and homogeneous scalar function φ(x) defined on a complete unitary vector space (Euclidean space or Hilbert space) υu can be expressed as an inner product

φ(x) ≡ (f, x)

where f is a vector of υu. The correspondence between the vectors φ of υ† and the vectors f of υu is an isomorphism and, since Sec. 14.4-1 implies

||φ|| = ||f||

the correspondence is also isometric (Secs. 12.5-2 and 14.2-7b). Hence, complete unitary vector spaces are self-conjugate, i.e., identical with their conjugate spaces except for isomorphism and isometry. The definition of operators conjugate to linear transformations of a complete unitary vector space into itself can then be reduced to the simple definition of Hermitian-conjugate operators given in Sec. 14.4-3.

14.4-10. Infinitesimal Linear Transformations (see also Secs. 4.5-3 and 14.10-5). (a) An infinitesimal linear transformation (infinitesimal linear operator, infinitesimal dyadic) defined on a normed real or complex vector space has the form

A = I + εB

where B is bounded, and |ε|2 is negligibly small compared to 1 (ε is usually a scalar differential).

(b) For infinitesimal linear transformations A = I + εB, A1 = I + ε1B1, A2 = I + ε2B2,

A1A2 = A2A1 = I + ε1B1 + ε2B2        A⁻¹ = I − εB

Infinitesimal linear transformations (operators) commute.

(c) An infinitesimal linear transformation A = I + εB defined on a complete unitary vector space is unitary if and only if εB is skew-hermitian. An infinitesimal linear transformation A = I + εB defined on a real complete unitary vector space is orthogonal if and only if εB is skew-symmetric.
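These first-order rules are easy to check numerically. The following NumPy sketch is illustrative only: the matrices B1, B2 and the small scalars e1, e2 are hypothetical stand-ins for bounded operators and the differentials ε1, ε2.

```python
import numpy as np

rng = np.random.default_rng(0)
B1 = rng.standard_normal((3, 3))
B2 = rng.standard_normal((3, 3))
I = np.eye(3)
e1, e2 = 1e-6, 2e-6          # small scalar parameters ("differentials")

A1 = I + e1 * B1
A2 = I + e2 * B2

# Products agree to first order: A1 A2 = A2 A1 = I + e1 B1 + e2 B2 + O(e^2)
first_order = I + e1 * B1 + e2 * B2
assert np.allclose(A1 @ A2, first_order, atol=1e-10)
assert np.allclose(A1 @ A2, A2 @ A1, atol=1e-10)

# Inverse to first order: (I + eB)^-1 = I - eB + O(e^2)
assert np.allclose(np.linalg.inv(A1), I - e1 * B1, atol=1e-10)
```

The residual in each comparison is of order ε², which is exactly the error the text declares negligible.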

14.5. MATRIX REPRESENTATION OF VECTORS AND LINEAR TRANSFORMATIONS (OPERATORS)

14.5-1. Transformation of Base Vectors and Vector Components: “Alibi” Point of View (see also Sec. 14.1-3). Consider a finite-dimensional* real or complex vector space υn with a reference system defined by n base vectors e1, e2, . . . , en (Sec. 14.2-4). Each vector

x = ξ1e1 + ξ2e2 + · · · + ξnen    (1)

*The theory of Secs. 14.5-1 to 14.7-7 applies also to certain infinite-dimensional vector spaces (Secs. 14.2-4 and 14.2-7b). Such spaces must permit the definition of countable bases (Sec. 14.2-4), such as orthonormal bases (Sec. 14.7-4), and of convergence (Sec. 14.2-7), so that sums like that in Eq. (1) become convergent infinite series. This is, in particular, true for every separable Hilbert space (Sec. 14.2-7c). Vector spaces which do not admit countable bases can be represented by suitable function spaces (Sec. 15.2-1).

is described by its components ξ1, ξ2, . . . , ξn. A linear transformation (operator) A mapping υn into itself (Sec. 14.3-1) transforms each base vector ek into a corresponding vector

Aek = a1ke1 + a2ke2 + · · · + anken ≡ Σi aikei    (k = 1, 2, . . . , n)    (2)

and each vector x of υn into a corresponding vector x′ of υn:

x′ = Ax    (3)

The components ξi' of the vector x′ and the components ξk of the vector x, both referred to the e1, e2, . . . , en reference system, are related by the n linear homogeneous transformation equations

ξi′ = ai1ξ1 + ai2ξ2 + · · · + ainξn ≡ Σk aikξk    (i = 1, 2, . . . , n)    (4)

14.5-2. Matrix Representation of Vectors and Linear Transformations (Operators). For each given reference basis e1, e2, . . . , en in υn

The vectors x ≡ ξ1e1 + ξ2e2 + · · · + ξnen of υn are represented on a reciprocal one-to-one basis by the column matrices {ξk} (Sec. 13.2-1b).

The linear transformations (operators) mapping υn into itself are represented on a reciprocal one-to-one basis by the n × n matrices A ≡ [aik] defined by Eq. (2) or (4).

The transition between vectors and operators and the corresponding matrices is an isomorphism (Sec. 12.1-6): sums and products involving scalars, vectors, and transformations correspond to analogous sums and products of matrices. Identities and inverses correspond; nonsingular and unbounded transformations (operators) correspond, respectively, to nonsingular and unbounded matrices, and conversely.

In particular, the coordinate-free vector equation

x′ = Ax    (5)

is represented in the e1, e2, . . . , en reference system by the matrix equation

{ξi′} = A{ξk}    (6)

which is equivalent to the n transformation equations (4); and the product of two linear transformations A and B is represented by the product of the corresponding matrices A and B (carefully note Sec. 14.6-3).
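The isomorphism between operators and matrices can be illustrated numerically. In this NumPy sketch the 2 × 2 matrices A and B are hypothetical examples representing two operators in a fixed basis, and x is a column of vector components.

```python
import numpy as np

# Operators A and B represented by their matrices in a fixed basis:
A = np.array([[1.0, 2.0], [0.0, 1.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
x = np.array([3.0, 4.0])    # column of components {xi_k}

# Applying A and then B to the vector corresponds to the matrix product BA:
assert np.allclose(B @ (A @ x), (B @ A) @ x)

# The isomorphism also carries sums and scalar multiples over:
assert np.allclose((2.0 * A + B) @ x, 2.0 * (A @ x) + B @ x)
```

Note the order: the matrix of the product transformation BA is the product of the matrices taken in the same order, B times A, exactly as in the text.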

NOTE: Transformations of an n-dimensional vector space υn into an m-dimensional vector space υm may be similarly represented by m × n matrices. Transformations relating two real vector spaces can always be represented by real matrices.

14.5-3. Matrix Notation for Simultaneous Linear Equations (see also Secs. 1.9-2 to 1.9-5). A set of simultaneous linear equations

ai1x1 + ai2x2 + · · · + ainxn = bi    (i = 1, 2, . . . , n)    (7)

is equivalent to the matrix equation

Ax = b    (8)

The unknowns xk may be regarded as components of an unknown vector such that the transformation (8) yields the vector represented by the bi. If, in particular, the matrix [aik] is nonsingular (Sec. 13.2-3), then the matrix equation (8) can be solved to yield the unique result

x = A⁻¹b    (9)

which is equivalent to Cramer's rule (1.9-4).
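A minimal numerical sketch (NumPy; the coefficient matrix and right-hand side are hypothetical examples) comparing the solution (9) with Cramer's rule:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([5.0, 10.0])

# A unique solution x = A^-1 b exists because A is nonsingular:
assert abs(np.linalg.det(A)) > 1e-12
x = np.linalg.solve(A, b)
assert np.allclose(A @ x, b)

# Cramer's rule (1.9-4): x_k = det(A with k-th column replaced by b) / det A
det_A = np.linalg.det(A)
x_cramer = np.array([np.linalg.det(np.column_stack([b, A[:, 1]])) / det_A,
                     np.linalg.det(np.column_stack([A[:, 0], b])) / det_A])
assert np.allclose(x, x_cramer)    # both give x = (1, 3)
```

In practice `np.linalg.solve` (elimination) is preferred over an explicit inverse or Cramer's rule, which are shown here only to mirror the text.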

14.5-4. Dyadic Representation of Linear Operators. A linear operator A defined on an n-dimensional vector space may also be expressed as a sum of n outer products of pairs of vectors (n dyads) in the manner of Sec. 16.9-1. The corresponding n × n matrix A can be similarly expressed as the sum of n outer products of pairs of row and column matrices (see also Sec. 13.2-10).

14.6. CHANGE OF REFERENCE SYSTEM

14.6-1. Transformation of Base Vectors and Vector Components: “Alias” Point of View (see also Sec. 14.1-3). (a) Given a reference basis e1, e2, . . . , en in the finite-dimensional* vector space υn, m vectors

image

are linearly independent if and only if the matrix [aik] is of rank m (see also Sec. 1.9-3).

In particular, for every reference basis ē1, ē2, . . . , ēn in υn

ēk = t1ke1 + t2ke2 + · · · + tnken ≡ Σi tikei    (k = 1, 2, . . . , n)    (1)

The matrix T ≡ [tik] represents a (necessarily nonsingular) transformation T relating the old base vectors ei and the new base vectors ēk = Tek in the manner of Eq. (14.5-2).

(b) Now each vector x of υn can be expressed in terms of vector components ξi referred to the ei system or in terms of vector components ξ̄k referred to the ēk system:

x = ξ1e1 + ξ2e2 + · · · + ξnen = ξ̄1ē1 + ξ̄2ē2 + · · · + ξ̄nēn    (2)

* See the footnote to Sec. 14.5-1.

The vector components ξi and ξ̄k of the same vector x are related by the n linear homogeneous transformation equations

ξi = ti1ξ̄1 + ti2ξ̄2 + · · · + tinξ̄n ≡ Σk tikξ̄k    (i = 1, 2, . . . , n)    (3)

The meaning of the transformation equations (3) must be carefully distinguished from that of the formally analogous relations (14.5-4) and (14.5-6).

Note also the inverse relations, viz.,

ξ̄i = Σk t̄ikξk    (i = 1, 2, . . . , n)

or

[t̄ik] ≡ T⁻¹, i.e., t̄ik = Tki/det [tik]

where Tik is the cofactor of tik in the determinant det [tik] (Sec. 1.5-2).

14.6-2. Representation of a Linear Operator in Different Schemes of Measurements. (a) Consider a linear operator A represented by the matrix A in the scheme of measurements (Sec. 14.1-5) associated with the base vectors ei (Sec. 14.5-2) and by the matrix Ā in the ēi scheme of measurements, so that for every vector x of υn

x′ = Ax        x̄′ = Āx̄

Given the transformation matrix T relating the ei and ēk reference systems so that

x = Tx̄        x′ = Tx̄′    (6)

(Sec. 14.6-1), the matrices A and Ā are related by the similarity transformation (Sec. 13.4-1)

Ā = T⁻¹AT    (8)

Conversely, every matrix Ā related to A by a similarity transformation (8) represents the same linear operator A in a scheme of measurements specified by the base vectors (1).

(b) All matrices (8) representing the same operator A have the same rank r; r equals the rank of A (Sec. 14.3-2). The trace and the determinant of the matrix A are also common to all matrices (8) and are referred to as the trace Tr (A) and the determinant det (A) of the operator A (see also Secs. 14.1-4 and 13.4-1b).
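The invariance of rank, trace, and determinant under the similarity transformation (8) can be checked with random matrices; this NumPy sketch assumes only that the randomly drawn change-of-basis matrix T is nonsingular (which it is with probability one, and which the assertion verifies).

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))    # matrix of the operator in the e_i system
T = rng.standard_normal((3, 3))    # base-vector transformation, assumed nonsingular
assert abs(np.linalg.det(T)) > 1e-9

A_bar = np.linalg.inv(T) @ A @ T   # matrix of the SAME operator in the new system

# Rank, trace, and determinant are common to all representing matrices:
assert np.linalg.matrix_rank(A_bar) == np.linalg.matrix_rank(A)
assert np.isclose(np.trace(A_bar), np.trace(A))
assert np.isclose(np.linalg.det(A_bar), np.linalg.det(A))
```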

(c) Matrix Transformation of Base Vectors. If one admits row and column matrices of vectors, Eqs. (1) and (6) can be respectively written as

image

14.6-3. Consecutive Operations (see also Secs. 14.5-2 and 14.10-6). (a) Consider two consecutive linear operations A, B defined by the base-vector transformations

image

where A, B, and BA are, respectively, represented by the matrices A ≡ [aik], B ≡ [bik], and BA in the ei scheme of measurements, i.e.,

image

is represented by

image

where x′ ≡ {ξ1′, ξ2′, . . .} and x″ ≡ {ξ1″, ξ2″, . . .} as well as x ≡ {ξ1, ξ2, . . .} are columns of vector components measured in the ei reference system.

Note carefully that the operation B defined by Eqs. (12) to (14) is, in general, different from the operation defined by

image

which corresponds to

image

since the matrix B ≡ [bik] represents ABA⁻¹ and not B in the e′k scheme of measurements.

(b) The column matrices x ≡ {ξ1, ξ2, . . . , ξn}, x̄ ≡ {ξ̄1, ξ̄2, . . . , ξ̄n}, x̿ ≡ {ξ̿1, ξ̿2, . . . , ξ̿n} representing the same vector

image

are related by the alias-type transformations

image

Note again that, in general, x̄ ≠ Bx.

14.7. REPRESENTATION OF INNER PRODUCTS. ORTHONORMAL BASES

14.7-1. Representation of Inner Products (see also Secs. 14.2-6, 14.7-6b, and 16.8-1). Given a finite-dimensional unitary vector space or separable Hilbert space (Sec. 14.2-7)* υu, let the vectors a and b be represented by the respective column matrices a ≡ {αi} and b ≡ {βi} in the manner of Sec. 14.5-2 (see also Sec. 13.2-1b). Then

(a, b) = Σi Σk gikαi*βk ≡ a†Gb    [gik ≡ (ei, ek)]    (1)

The matrix G ≡ [gik] is necessarily hermitian (gik ≡ gki*) and positive definite (Sec. 13.5-3); if υu is a real unitary vector space, then G is real and symmetric.

Equation (1) makes it possible to describe absolute values, angles, distances, convergence, etc., in υu in terms of numerical vector components (see also Sec. 14.2-7). In particular,

||a||² = (a, a) = Σi Σk gikαi*αk = a†Ga    (2)

* See the footnote to Sec. 14.5-1.

The hermitian form (2) (Sec. 13.5-3) is called the fundamental form of υu in the scheme of measurements defined by the base vectors ei.
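A small NumPy sketch with a hypothetical non-orthonormal basis of R³ illustrates Eq. (1): the Gram matrix gik = (ei, ek) is symmetric and positive definite, and a†Gb applied to component columns reproduces the ambient inner product.

```python
import numpy as np

# Hypothetical non-orthonormal basis of R^3, written as columns E = [e1 e2 e3]:
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
G = E.T @ E                      # g_ik = (e_i, e_k): the Gram matrix

# G is symmetric (hermitian, real case) and positive definite:
assert np.allclose(G, G.T)
assert np.all(np.linalg.eigvalsh(G) > 0)

# For component columns a, b the inner product is (a, b) = a^dagger G b:
a = np.array([1.0, 2.0, -1.0])
b = np.array([0.0, 1.0, 1.0])
assert np.isclose(a @ G @ b, (E @ a) @ (E @ b))   # agrees with the ambient dot product
```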

14.7-2. Change of Reference System (see also Secs. 14.6-1 and 16.8-1). If one introduces a new set of base vectors ēk such that a = Tā, b = Tb̄ (Sec. 14.6-1), the invariance (Sec. 14.1-4) of

(a, b) = a†Gb = ā†Ḡb̄ implies Ḡ = T†GT

14.7-3. Orthogonal Vectors and Orthonormal Sets of Vectors. (a) Two vectors a, b of a unitary vector space υu are (mutually) orthogonal if and only if (a, b) = 0 (so that the angle between them equals 90 deg, Sec. 14.2-7b). An orthonormal (normal orthogonal) set of vectors is a set of mutually orthogonal unit vectors u1, u2, . . . , so that

(ui, uk) = δik    (i, k = 1, 2, . . .)

Every set of mutually orthogonal nonzero vectors (and, in particular, every orthonormal set) is linearly independent, so that the largest number of vectors in any such set (the orthogonal dimension of υu) cannot exceed the linear dimension of υu (see also Secs. 14.2-3 and 14.2-4b).

(b) Bessel's Inequality (see also Sec. 15.2-3b). Given a finite or infinite orthonormal set u1, u2, . . . and any vector a in υu,

Σk |(uk, a)|² ≤ ||a||²

The equal sign applies if and only if the vector a belongs to the linear manifold spanned by the orthonormal set (see also Sec. 14.7-4). Bessel's inequality is closely related to the projection theorem of Sec. 14.2-8 and is often used to prove the convergence of infinite series.
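Bessel's inequality can be illustrated with a two-vector orthonormal set in R³ (a hypothetical example): the set spans only a plane, so a vector with a component outside that plane gives strict inequality, while a vector in the plane gives equality.

```python
import numpy as np

# Orthonormal set u1, u2 in R^3 (spanning only a coordinate plane):
U = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]

# Strict inequality for a vector outside the spanned manifold:
a = np.array([1.0, 2.0, 2.0])
bessel_sum = sum(abs(np.vdot(u, a)) ** 2 for u in U)      # 1 + 4 = 5
assert bessel_sum <= np.vdot(a, a).real + 1e-12           # 5 <= ||a||^2 = 9

# Equality for a vector inside the spanned manifold:
a_in = np.array([3.0, -1.0, 0.0])
assert np.isclose(sum(abs(np.vdot(u, a_in)) ** 2 for u in U),
                  np.vdot(a_in, a_in).real)               # 10 = 10
```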

14.7-4. Orthonormal Bases (Complete Orthonormal Sets). (a) In a finite-dimensional unitary vector space of dimension n, every orthonormal set of n vectors is a basis (orthonormal basis). More generally, in every complete unitary vector space υc (this includes both finite-dimensional unitary vector spaces and Hilbert spaces, Sec. 14.2-7c), an orthonormal set of vectors u1, u2, . . . constitutes an orthonormal basis (complete orthonormal set, complete orthonormal system) if and only if it satisfies the following conditions:

1. (uk, a) = 0 for all k implies a = 0.
2. Every vector a of υc can be expanded in the form a = α1u1 + α2u2 + · · · with αk = (uk, a) (k = 1, 2, . . .).
3. ||a||² = Σk |(uk, a)|² for every vector a of υc (Parseval's identity).
4. (a, b) = Σk (a, uk)(uk, b) for every pair of vectors a, b of υc.

Each of these four conditions implies the three others. The relative simplicity of the above expressions for ||a||² and (a, b) makes orthonormal bases especially useful as reference systems. Note that the concept of a complete orthonormal set extends the definition of a basis (Sec. 14.2-4) to suitable infinite-dimensional vector spaces in the following sense: if a, a′ are two vectors with identical components αk, then ||a − a′|| = 0 (Uniqueness Theorem).

(b) Construction of Orthonormal Sets of Vectors. Given any countable (finite or infinite) set of linearly independent vectors e1, e2, . . . of a complete unitary vector space, there exists an orthonormal set u1, u2, . . . spanning the same linear manifold.

Such an orthonormal set may be constructed with the aid of the following recursion formulas (Gram-Schmidt orthogonalization process, see also Sec. 15.2-5):

u1 = e1/||e1||        vk = ek − Σi<k (ui, ek)ui        uk = vk/||vk||    (k = 2, 3, . . .)
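A straightforward implementation sketch of the Gram-Schmidt process in NumPy (the helper name `gram_schmidt` is ours, not the handbook's); `np.vdot` conjugates its first argument, matching the convention that the inner product is conjugate-linear in its first slot.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors (real or complex)."""
    basis = []
    for e in vectors:
        v = e.astype(complex)
        for u in basis:                    # subtract the projections (u_i, e) u_i
            v = v - np.vdot(u, e) * u
        basis.append(v / np.linalg.norm(v))
    return basis

e = [np.array([1.0, 1.0, 0.0]),
     np.array([1.0, 0.0, 1.0]),
     np.array([0.0, 1.0, 1.0])]
u = gram_schmidt(e)

# The result is orthonormal: (u_i, u_k) = delta_ik
for i in range(3):
    for k in range(3):
        assert np.isclose(np.vdot(u[i], u[k]), 1.0 if i == k else 0.0)
```

This is the classical variant; for ill-conditioned inputs the modified Gram-Schmidt variant (projecting the running `v` instead of the original `e`) is numerically preferable.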

14.7-5. Matrices Corresponding to Hermitian-conjugate Operators (see also Secs. 14.4-3 to 14.4-6, 13.3-1, and 13.3-2). (a) Given a linear operator A represented by the matrix A, the hermitian conjugate A† is represented by G⁻¹A†G in the same scheme of measurements.*

(b) For an orthonormal reference system u1, u2, . . . , one has G = I (Sec. 14.7-4), and

image

so that hermitian-conjugate operators correspond to hermitian-conjugate matrices, and conversely. Thus hermitian, skew-hermitian, and unitary operators correspond to matrices of the same respective types, and conversely. In particular, symmetric, skew-symmetric, and orthogonal operators defined on real vector spaces correspond, respectively, to symmetric, skew-symmetric, and orthogonal matrices whenever an orthonormal reference system is used.

More generally, unitary or orthogonal operators correspond to matrices of the same respective type whenever the reference base vectors are orthogonal (not necessarily orthonormal).

14.7-6. Reciprocal Bases. (a) For every basis e1, e2, . . . , en in a finite-dimensional vector space, there exists a uniquely corresponding reciprocal (dual) basis e¹, e², . . . , eⁿ defined by the symmetric relationship

(eⁱ, ek) = δik    (i, k = 1, 2, . . . , n)

so that each eⁱ is perpendicular to all ek with k ≠ i, and

image

(b) Vectors a, b, . . . represented in the ei reference system by column matrices a, b, . . . are represented in the eⁱ system by the column matrices Ga, Gb, . . . ,† and

image

In particular, (eⁱ, a) = αi.

A linear operator A given by the matrix A in the ei scheme of measurements is represented by the matrix GAG⁻¹ in the eⁱ scheme of measurements.

(c) The reciprocal basis ē¹, ē², . . . , ēⁿ corresponding to a new set of base vectors

image

* Note that A† is represented by the matrix A† in the scheme of measurements corresponding to the reciprocal basis, Sec. 14.7-6.

†One can also represent the vectors a, b, . . . by the row matrices (Ga)†, (Gb)†, . . . or (Ga), (Gb), . . . , corresponding to a representation by covariant vector components (Secs. 16.2-1 and 16.7-3).

is given by

image

The base vectors eⁱ and ek are said to transform contragrediently (see also Sec. 16.6-2). Similarly

image

(d) Every orthonormal basis (Sec. 14.7-4a) is identical with its reciprocal basis (self-dual), so that eⁱ = ei = ui (i = 1, 2, . . . , n), and

image
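For a finite-dimensional real space the reciprocal basis can be computed explicitly. In this NumPy sketch the basis matrix E is a hypothetical example; the reciprocal base vectors come out as the columns of (E⁻¹)ᵀ, which makes the biorthogonality relation (eⁱ, ek) = δik immediate.

```python
import numpy as np

# Hypothetical basis of R^2 as columns of E; the reciprocal base vectors e^i
# are the columns of (E^-1)^T:
E = np.array([[2.0, 1.0],
              [0.0, 1.0]])
E_recip = np.linalg.inv(E).T

# Biorthogonality (e^i, e_k) = delta_ik:
assert np.allclose(E_recip.T @ E, np.eye(2))

# Components of a vector a in the e_i basis are recovered as alpha_i = (e^i, a):
alpha = np.array([3.0, -1.0])
a = E @ alpha
assert np.allclose(E_recip.T @ a, alpha)

# An orthonormal basis is self-dual:
Q = np.array([[0.0, 1.0], [1.0, 0.0]])    # orthonormal columns
assert np.allclose(np.linalg.inv(Q).T, Q)
```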

14.7-7. Comparison of Notations. In order to permit easy reference to standard textbooks, subscripts only have been used throughout Chaps. 12 to 14 to label vector components and matrix elements. The improved dummy-index notation employed in tensor analysis and described in Sec. 16.1-3 uses superscripts as well as subscripts. Table 14.7-1 reviews the different notations used to describe vectors and linear operators and may be used to translate one notation into the other (see also Sec. 16.2-1).

Table 14.7-1. Comparison of Different Notations Describing Scalars, Vectors, and Linear Operators

image

14.8. EIGENVECTORS AND EIGENVALUES OF LINEAR OPERATORS

14.8-1. Introduction. The study of eigenvectors and eigenvalues is of outstanding practical interest, because

Many relations involving a linear operator are radically simplified if its eigenvectors are introduced as reference base vectors (diagonalization of matrices, quadratic forms, solution of operator equations in spectral form; see also Sec. 15.1-1).

The eigenvalues of a linear operator specify important properties of the operator without reference to a particular coordinate system.

In many applications, eigenvectors and eigenvalues of linear operators have a direct geometrical or physical significance; they can usually be interpreted in terms of a maximum-minimum problem (Sec. 14.8-8). The most important applications involve hermitian operators, which have real eigenvalues (Secs. 14.8-4 and 14.8-10).

14.8-2. Invariant Manifolds. Decomposable Linear Transformations (Linear Operators) and Matrices. A manifold υ1 in a linear vector space υ is invariant with respect to (reduces) a given linear transformation A of υ into itself if and only if A transforms every vector x of υ1 into a vector Ax of υ1. Given a reference basis e1, e2, . . . , em, em+1, . . . in υ such that e1, e2, . . . , em span υ1, A is represented by a matrix A which can be partitioned in the form

image

where A1 is the m × m matrix representing the linear transformation A1 of υ1 “induced” by A. A1 may or may not be capable of further reduction.

A linear transformation (linear operator) A of the vector space υ into itself is decomposable (reducible, completely reducible*) if and only if υ is the direct sum υ = υ1 ⊕ υ2 ⊕ · · · (Sec. 12.7-5a) of two or more subspaces υ1, υ2, . . . each invariant with respect to A. In this case, one writes A as the direct sum A = A1 ⊕ A2 ⊕ · · · of the linear transformations A1, A2, . . . respectively induced by A in υ1, υ2, . . . .

A square matrix A represents a decomposable operator A if and only if A is similar to a step matrix (direct sum of matrices A1, A2, . . . corresponding

* See the footnote to Sec. 14.9-2b.

to A1, A2, . . . , see also Sec. 13.2-9). A matrix A with this property is also called decomposable (reducible, completely reducible*).

14.8-3. Eigenvectors, Eigenvalues, and Spectra (see also Sec. 13.4-2). (a) An eigenvector (proper vector, characteristic vector) of the linear transformation (linear operator) A defined on a linear vector space υ is a vector y ≠ 0 of υ such that

Ay = λy    (2)

where λ is a suitably determined scalar called the eigenvalue (proper value, characteristic value) of A associated with the eigenvector y.

(b) If y is an eigenvector associated with the eigenvalue λ of A, the same is true for every vector αy ≠ 0. If y1, y2, . . . , ys are eigenvectors associated with the eigenvalue λ of A, the same is true for every vector α1y1 + α2y2 + · · · + αsys ≠ 0; these vectors span a linear manifold invariant with respect to A (Sec. 14.8-2; see also Sec. 14.8-4c). This theorem also applies to convergent infinite series of eigenvectors in Hilbert spaces.

An eigenvalue λ associated with exactly m > 1 linearly independent eigenvectors is said to be m-fold degenerate; m is called the degree of degeneracy or geometrical multiplicity of the eigenvalue.

Eigenvectors associated with different eigenvalues of a linear operator are linearly independent.

A linear operator defined on an n-dimensional vector space has at most n distinct eigenvalues. Every eigenvalue of a nonsingular operator is different from zero.

(c) Given a linear operator A with eigenvalues λ, αA has the eigenvalues αλ, and Aᵖ has the eigenvalues λᵖ (p = 0, 1, 2, . . . ; p = 0, ±1, ±2, . . . if A is nonsingular). Every polynomial f(A) (Sec. 14.4-2b) has the eigenvalues f(λ) (see also Sec. 13.4-5b). All these functions of A have the same eigenvectors as A.

(d) The Spectrum of a Linear Operator. The spectrum of a linear operator A mapping a complete normed vector space (Banach space, Sec. 14.2-7a) υ into itself is the set of all complex numbers (spectral values, eigenvalues†) λ such that the vector equation

Ax − λx ≡ (A − λI)x = f

does not have a unique solution x = (A − λI)⁻¹f of finite norm for every vector f of finite norm (see also Sec. 14.8-10). More precisely stated, the operator A − λI does not have a unique bounded inverse (A − λI)⁻¹

* See the footnote to Sec. 14.9-2b.

† Some authors refer to all spectral values as eigenvalues; others restrict the use of this term to the discrete spectrum.

(resolvent operator). The spectrum may be partitioned into

The discrete spectrum (point spectrum) defined by Eq. (2) with eigenvectors y ≠ 0 of finite norm ||y||

The continuous spectrum, where (A − λI)⁻¹ is unbounded with domain dense in υ

The residual spectrum, where (A − λI)⁻¹ is unbounded with domain not dense in υ

The spectrum of a linear operator A contains its approximate spectrum, defined as the set of all complex numbers λ such that there exists a sequence of unit vectors u1, u2, . . . with ||(A − λI)un|| < 1/n (n = 1, 2, . . .). The approximate spectrum contains both the discrete and the continuous spectrum (see also Sec. 14.8-4).

The residual spectrum of A is contained in the discrete spectrum of A†.

(e) The spectrum of a linear operator A is identical with the spectrum of every matrix A representing A in the manner of Sec. 14.5-2. The algebraic multiplicity m′j of any discrete eigenvalue λj of A is the algebraic multiplicity of the corresponding matrix eigenvalue (Sec. 13.4-3a). m′j is greater than, or equal to, the geometrical multiplicity mj of λj (see also Sec. 14.8-4c).

For every linear operator having a purely discrete spectrum and a finite trace, Tr (A) equals the sum of all eigenvalues, each counted a number of times equal to its algebraic multiplicity. If det (A) exists, it equals the similarly computed product of the eigenvalues (see also Sec. 13.4-3b).

The characteristic equation FA (λ) = 0 associated with a class of similar finite matrices (Sec. 13.4-5a) is the characteristic equation of the corresponding operator A and yields its eigenvalues together with their algebraic multiplicities. The Cayley-Hamilton theorem [FA (A) = 0, Sec. 13.4-7a] and the theorems of Sec. 13.4-7b apply to linear operators defined on finite-dimensional vector spaces.

(f) If A, y, and z are bounded, then Ay = λy, A†z = μz implies either μ = λ* or (y, z) = 0 (see also Sec. 14.8-4).

14.8-4. Eigenvectors and Eigenvalues of Normal and Hermitian Operators (see also Secs. 13.4-2, 13.4-4a, 14.4-6, 14.8-8, 15.3-3b, 15.3-4, and 15.4-6). (a) If A is a normal operator (A†A = AA†, Sec. 14.4-8b), then A and A† have identical eigenvectors; corresponding eigenvalues of A and A† are complex conjugates. The spectrum of every normal operator is identical with its approximate spectrum; the residual spectrum is empty (see also Sec. 14.8-3d).

For every normal operator with eigenvalues λ, the hermitian operators ½(A + A†) ≡ H1, (1/2i)(A − A†) ≡ H2, and A†A have the same eigenvectors as A and the respective eigenvalues Re(λ), Im(λ), and |λ|².

(b) All spectral values of any hermitian operator are real.

The converse is not necessarily true; but every normal operator having a real spectrum is hermitian.

The spectrum of every bounded* hermitian operator A is a closed set of real numbers; its largest and smallest values equal sup‖x‖=1 (x, Ax) and inf‖x‖=1 (x, Ax), respectively.

(c) The following important special properties of normal operators apply, in particular, to hermitian operators:

Orthogonality of Eigenvectors. Eigenvectors corresponding to different eigenvalues of a normal operator are mutually orthogonal.

Completeness Property of Eigenvectors. Every bounded normal operator A defined on a complete unitary vector space υu is completely reduced (Sec. 14.8-2) by a subspace spanned by a complete orthonormal set of eigenvectors (Sec. 14.7-4) and a subspace orthogonal to every eigenvector of A. If υu is separable (in particular, if υu is finite-dimensional), the orthonormal eigenvectors span υu .

Every normal operator A defined on a complete and separable unitary vector space υu is decomposable into a direct sum (Sec. 14.8-2) of normal operators A1, A2, . . . defined on corresponding subspaces υ1, υ2, . . . of υu so that each Aj has the single eigenvalue λj. In each subspace υj, there exists a complete orthonormal set of eigenvectors yi(j).

If the eigenvalue λ of a normal operator has the (finite) algebraic multiplicity m, then the degree of degeneracy of λ is also equal to m, and conversely.

(d) Spectral Representation. The following properties of normal operators apply, in particular, to hermitian operators. Given an orthonormal set of eigenvectors y1, y2, . . . of the bounded normal operator A and any vector x = ξ1y1 + ξ2y2 + · · · [ξk = (yk, x), Sec. 14.7-4a], note

Ax = λ1ξ1y1 + λ2ξ2y2 + · · · ≡ Σk λk(yk, x)yk    (3)

where λk is the eigenvalue associated with yk (see also Sec. 13.5-4b).

See Refs. 14.10 and 14.11 for analogous properties of normal operators whose spectra are not discrete. If A is an operator of this type, the sums (3) must be replaced by Stieltjes integrals over the spectrum.

14.8-5. Determination of Eigenvalues and Eigenvectors: Finite-dimensional Case (see also Secs. 13.4-5 and 13.5-5; refer to Sec. 20.3-5 for numerical methods). Given the n × n matrix A ≡ [aik] representing a linear operator A in the scheme of measurements defined by the base vectors e1, e2, . . . , en, one determines the eigenvalues λ and the eigenvectors y = η1e1 + η2e2 + · · · + ηnen of A as follows:

* See footnote to Sec. 14.4-1.

1. The eigenvalues λ = λj are obtained as the roots of the characteristic equation

det [aik − λδik] = 0

(Sec. 13.4-5a).

2. For each eigenvalue λj, the components η1(j), η2(j), . . . , ηn(j) of the corresponding eigenvectors are obtained as nontrivial solutions of the n simultaneous linear equations

Σk (aik − λjδik)ηk = 0    (i = 1, 2, . . . , n)

The mj column matrices {ηk(j)} obtained in this way are called modal columns (or simply eigenvectors) of the given matrix A, associated with the eigenvalue λj. Each modal column may be multiplied by an arbitrary constant different from zero.
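The procedure corresponds directly to a numerical eigenvalue routine. This NumPy sketch (with a hypothetical 2 × 2 symmetric matrix) exhibits the modal columns and their indeterminacy up to a nonzero scalar factor:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])              # matrix of A in the e_i system

# Eigenvalues = roots of the characteristic equation det[a_ik - lambda delta_ik] = 0:
lams, T = np.linalg.eig(A)              # columns of T are modal columns {eta_k}
assert np.allclose(np.sort(lams.real), [1.0, 3.0])

# Each modal column satisfies (A - lambda_j I) eta = 0:
for j in range(2):
    assert np.allclose(A @ T[:, j], lams[j] * T[:, j])

# Modal columns are determined only up to a nonzero scalar factor:
y = 5.0 * T[:, 0]
assert np.allclose(A @ y, lams[0] * y)
```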

14.8-6. Reduction and Diagonalization of Matrices. Principal-axes Transformations (see also Sec. 14.8-2). (a) Every finite set of linearly independent eigenvectors y1, y2, . . . , ys associated with the same eigenvalue of a linear operator A spans a subspace υ1 invariant with respect to A (Sec. 14.8-3b). If y1, y2, . . . , ys are introduced as the first s base vectors, A will be represented by a matrix of the form

image

(b) In particular, let A be a normal operator defined on a finite-dimensional vector space of dimension n. Then the procedure of Sec. 14.8-5 will yield exactly n linearly independent eigenvectors

image

The n corresponding modal columns form a nonsingular modal matrix

image

where each distinct eigenvalue λj accounts for mj adjacent columns; the n = m1 + m2 + · · · columns are labeled by successive values of k = 1, 2, . . . , n. The alias-type coordinate transformation

x = Tx̄    (6)

(Sec. 14.6-1) introduces the n eigenvectors y(j) of the normal operator A as a reference basis. The similarity transformation

Ā = T⁻¹AT    (7)

yields the matrix Ā representing A in the new reference system; Ā is a step matrix

image

where each submatrix Āj corresponds to a different eigenvalue λj of A and has exactly mj rows and mj columns (see also Sec. 14.8-4c).

(c) If the n eigenvectors defining the columns of the modal matrix (5) are mutually orthogonal, then the similarity transformation (7) yields a diagonal matrix Ā (diagonalization of the given matrix A, Sec. 13.4-4a). To obtain a transformation matrix T which diagonalizes a given matrix A representing a normal operator A, proceed as follows:

If all eigenvalues λj are nondegenerate (this is true whenever the characteristic equation has no multiple roots), every modal matrix (5) diagonalizes A.

If there are degenerate eigenvalues λj, orthogonalize each set of mj eigenvectors y(j) = η1(j)e1 + η2(j)e2 + · · · + ηn(j)en by the Gram-Schmidt process (Sec. 14.7-4b). The n modal columns {ηk(j)} representing the resulting m1 + m2 + · · · = n mutually orthogonal eigenvectors form the desired transformation matrix T.

(d) In many applications, the original reference basis e1, e2, . . . , en is orthonormal (rectangular cartesian coordinates), so that

(ei, ek) = δik    (i, k = 1, 2, . . . , n)

(Sec. 14.7-4a), and the new reference basis is taken to be an orthonormal set of eigenvectors y(j) (obtained, if necessary, with the aid of the Gram-Schmidt process). Then every modal matrix T formed from the y(j) is a unitary matrix. A unitary coordinate transformation (6) introducing n orthonormal eigenvectors as base vectors is called a principal-axes transformation for the operator A (see also Secs. 2.4-7 and 3.5-6).

A principal-axes transformation for a hermitian operator A reduces the corresponding hermitian form (9) to its normal form (13.5-9) (see also Sec. 13.5-4).
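A NumPy sketch of a principal-axes transformation for a hypothetical 2 × 2 hermitian matrix: `eigh` returns orthonormal eigenvectors, so the modal matrix is unitary and the similarity transformation (7) diagonalizes A with real eigenvalues.

```python
import numpy as np

# Hermitian matrix in an orthonormal basis (G = I):
A = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])
assert np.allclose(A, A.conj().T)

lams, T = np.linalg.eigh(A)       # orthonormal eigenvectors -> unitary modal matrix
assert np.allclose(T.conj().T @ T, np.eye(2))

# The principal-axes transformation diagonalizes A, with real eigenvalues:
A_bar = T.conj().T @ A @ T        # equals T^-1 A T, since T is unitary
assert np.allclose(A_bar, np.diag(lams))
assert np.all(np.isreal(lams))
```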

(e) Two hermitian operators A and B can be represented by diagonal matrices in the same scheme of measurements if and only if BA = AB (see also Secs. 13.4-4b and 13.5-5).

14.8-7. “Generalized” Eigenvalue Problems (see also Secs. 13.5-5 and 15.4-5). (a) Some applications require one to find “eigenvectors” y and “eigenvalues” μ defined by a relation of the form

Ay = μBy    (10)

where B is a nonsingular operator. The quantities y and μ are necessarily the eigenvectors and eigenvalues of the operator B-1 A; the problem reduces to that of Eq. (2) if B is the identity operator. If both A and B are hermitian, and B is positive definite (Sec. 14.4-4), then

All eigenvalues μ are real.

One can introduce the new inner product

(a, b)B ≡ (a, Bb)    (11)

(see also Sec. 14.2-6a). In terms of this new inner product, the operator B⁻¹A becomes hermitian, and the orthogonality and completeness theorems of Sec. 14.8-4c apply. In particular, eigenvectors y associated with different eigenvalues μ are mutually orthogonal relative to the scalar product (11) (see also Sec. 14.7-3a).

(b) Consider a finite-dimensional unitary vector space and an orthonormal reference system u1, u2, . . . , un, so that (ui, uk) = δik (Sec. 14.7-4a). Let A and B be represented by hermitian matrices A ≡ [aik], B ≡ [bik], where B is positive definite. Then the eigenvalues μ defined by Eq. (10) are identical with the roots of the nth-degree algebraic equation

det [aik − μbik] = 0    (12)

For each root μj of multiplicity mj, there are exactly mj linearly independent eigenvectors y(j) = η1(j)u1 + η2(j)u2 + · · · + ηn(j)un; the components ηi(j) are obtained from the linear simultaneous equations

Σk (aik − μjbik)ηk = 0    (i = 1, 2, . . . , n)    (13)

Application of the Gram-Schmidt process (Sec. 14.7-4b) to the m1 + m2 + · · · = n eigenvectors y(j) yields a complete orthonormal set relative to the new inner product

(a, b)B ≡ (a, Bb)

If this orthonormal set of eigenvectors is introduced as a reference basis in the manner of Sec. 14.8-6c, the hermitian forms (x, Ax) = x†Ax and (x, Bx) = x†Bx take the form (13.5-12) (simultaneous diagonalization of two hermitian forms, Sec. 13.5-5).
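A minimal sketch of the generalized problem (10) in NumPy (hypothetical diagonal A and B, chosen so the answer is easy to see by hand): the eigenvalues of B⁻¹A are the roots of Eq. (12), and eigenvectors belonging to different μ are orthogonal in the inner product (11).

```python
import numpy as np

# Hermitian A and positive-definite hermitian B (hypothetical examples):
A = np.array([[2.0, 0.0], [0.0, 8.0]])
B = np.array([[1.0, 0.0], [0.0, 2.0]])

# The "eigenvalues" mu of Ay = mu By are the eigenvalues of B^-1 A:
mus, Y = np.linalg.eig(np.linalg.inv(B) @ A)
order = np.argsort(mus.real)
mus, Y = mus[order].real, Y[:, order]
assert np.allclose(mus, [2.0, 4.0])     # roots of det[a_ik - mu b_ik] = 0

# Eigenvectors with different mu are orthogonal in (a, b)_B = (a, Bb):
assert np.isclose(Y[:, 0] @ B @ Y[:, 1], 0.0)
```

For larger problems, library routines that accept the pair (A, B) directly (e.g. a generalized symmetric-definite eigensolver) are numerically preferable to forming B⁻¹A explicitly.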

(c) Refer to Sec. 15.4-5 for a discussion of analogous “generalized” eigenvalue problems involving an infinite-dimensional vector space.

14.8-8. Eigenvalue Problems as Stationary-value Problems (see also Sec. 15.4-7). (a) Consider a hermitian operator A defined on a finite-dimensional unitary vector space υu, and introduce an orthonormal reference basis* u1, u2, . . . , un, so that A is represented by a hermitian matrix A ≡ [aik]. The important problem of finding the eigenvectors y(j) = η1(j)u1 + η2(j)u2 + · · · + ηn(j)un and the corresponding eigenvalues λj of A is precisely equivalent to each of the following problems:

1. Find (the components ηi of) each vector y ≠ 0 such that

(y, Ay)/(y, y) = y†Ay/y†y    (Rayleigh's quotient)    (15)

has a stationary value; y = y(j) yields the stationary value λj.

2. Find (the components ηi of) each vector y such that

(y, Ay) = y†Ay    (16)

has a stationary value subject to the constraint

(y, y) = y†y = 1    (17)

y = y(j) yields the stationary value λj.

*The orthonormal basis is used for convenience in most applications; in the general case, it is only necessary to substitute (y, Ay) = y†GAy and (y, y) = y†Gy in Eqs. (15) to (17).

      3. Find (the components ηi of) each vector y ≠ 0 such that (y, y) has a stationary value subject to the constraint (y, Ay) = 1. y = y(j) yields the stationary value 1/λj.

Let the eigenvalues of A be arranged in increasing order, with an m-fold degenerate eigenvalue repeated m times, so that λ1 ≤ λ2 ≤ · · · ≤ λn. The smallest eigenvalue λ1 is the minimum value of Rayleigh's quotient (15) for an arbitrary vector y ≠ 0 of υu. The rth eigenvalue λr in the above sequence similarly is less than or equal to Rayleigh's quotient if y is an arbitrary nonzero vector orthogonal to all eigenvectors associated with λ1, λ2, . . . , λr−1; λr is the maximum of min (y, Ay)/(y, y), the minimum being taken over all nonzero vectors y orthogonal to an arbitrary (r − 1)-dimensional subspace υr−1 of υu (Courant's Minimax Principle).

The last theorems may be restated for problems 2 and 3, and for maxima instead of minima; note that a minimum in problem 1 or 2 corresponds to a maximum in problem 3, and conversely. The inner product (y, Ay) usually has direct physical significance. Problem 3 associates the eigenvalues of A with the principal axes of a second-order hypersurface (see also Secs. 2.4-7 and 3.5-6).
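Problem 1 can be checked numerically: for a hypothetical symmetric 2 × 2 matrix, Rayleigh's quotient (15) attains its stationary values at the eigenvectors and is bracketed by the extreme eigenvalues for every nonzero y.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])              # hermitian (symmetric); eigenvalues 1 and 3

def rayleigh(y):
    """Rayleigh's quotient (y, Ay)/(y, y) for real y."""
    return (y @ A @ y) / (y @ y)

# The quotient is stationary exactly at the eigenvectors, with value lambda_j:
lams, Y = np.linalg.eigh(A)
assert np.isclose(rayleigh(Y[:, 0]), lams[0])
assert np.isclose(rayleigh(Y[:, 1]), lams[1])

# For every nonzero y the quotient lies between the extreme eigenvalues:
rng = np.random.default_rng(2)
for _ in range(100):
    y = rng.standard_normal(2)
    assert lams[0] - 1e-12 <= rayleigh(y) <= lams[1] + 1e-12
```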

(b) Generalizations. The theory of Sec. 14.8-8a may be extended to apply to the “generalized” eigenvalue problem defined by Ay = μBy, where A is a hermitian operator, and B is hermitian and positive definite (Sec. 14.8-7). It is only necessary to replace (y, y) by (y, y)B ≡ (y, By) in each problem statement of Sec. 14.8-8a. In particular, Rayleigh's quotient (15) is replaced by

(y, Ay)/(y, By)    (18)

Analogous theorems apply to suitable operators defined on Hilbert spaces; here the inner products (y, Ay), (y, y), and (y, By) may be integrals rather than sums, so that the stationary-value problems of Sec. 14.8-8a become variation problems (Sec. 15.4-7).

14.8-9. Bounds for the Eigenvalues of Linear Operators (see also Sec. 15.4-10). The following theorems are often helpful for the estimation of eigenvalues.

(a) Every eigenvalue λ of a normal linear operator A represented by a finite n × n matrix A ≡ [aik] is contained in the union of the n circles

|z − aii| ≤ Σk≠i |aik|    (i = 1, 2, . . . , n)    (19)

Re(λ) lies between the smallest and the largest eigenvalue of ½(A + A†) ≡ H1, and Im(λ) lies between the smallest and the largest eigenvalue of (1/2i)(A − A†) ≡ H2 (see also Sec. 13.3-4a).
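The circles (19) are easy to verify numerically; this NumPy sketch uses a hypothetical symmetric (hence normal) matrix.

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.5],
              [1.0, -2.0, 0.3],
              [0.5, 0.3, 1.0]])        # symmetric, hence normal

lams = np.linalg.eigvals(A)

# Every eigenvalue lies in the union of the circles |z - a_ii| <= sum_{k != i} |a_ik|:
radii = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
for lam in lams:
    assert any(abs(lam - A[i, i]) <= radii[i] + 1e-12 for i in range(3))

# Re(lam) lies between the extreme eigenvalues of H1 = (A + A^dagger)/2 (= A here):
h1 = np.linalg.eigvalsh((A + A.T) / 2)
assert np.all(lams.real >= h1.min() - 1e-12)
assert np.all(lams.real <= h1.max() + 1e-12)
```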

(b) For hermitian matrices and operators

image

(c) Comparison Theorems. Let μ1 ≤ μ2 ≤ · · · ≤ μn be the sequence (including multiplicities) of the eigenvalues for a finite-dimensional eigenvalue problem (10), where A and B are hermitian, and B is positive definite. Then

Addition of a positive-definite hermitian operator to A cannot decrease any eigenvalue μr in the above sequence.

Addition of a positive-definite hermitian operator to B cannot increase any eigenvalue μr.

If a constraint restricts the vectors y to an (n − m)-dimensional subspace of υu, then the n − m eigenvalues μ′1 ≤ μ′2 ≤ · · · ≤ μ′n−m of the constrained problem satisfy the relations

image

The constraint usually takes the form of m independent linear equations relating the vector components ηi.

These theorems also apply to operators defined on Hilbert spaces if A and B are positive-definite hermitian operators yielding a discrete sequence μ1 ≤ μ2 ≤ · · · with finite multiplicities.
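
With B = I (a special case of problem (10)), the first comparison theorem and the constrained-problem inequalities can be illustrated as follows; the constraint here is deletion of one coordinate, so m = 1 and the familiar interlacing inequalities μr ≤ μ′r ≤ μr+1 result (NumPy assumed; matrices illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                     # hermitian
C = rng.standard_normal((5, 5))
P = C @ C.T + np.eye(5)               # positive-definite hermitian

mu = np.linalg.eigvalsh(A)            # increasing order
mu_shifted = np.linalg.eigvalsh(A + P)

# Adding a positive-definite operator to A cannot decrease any mu_r.
assert np.all(mu_shifted >= mu)

# Constraining y to the subspace orthogonal to the last coordinate axis
# amounts to deleting the last row and column (m = 1); the constrained
# eigenvalues interlace the original ones.
mu_c = np.linalg.eigvalsh(A[:4, :4])
assert np.all(mu[:4] <= mu_c + 1e-10) and np.all(mu_c <= mu[1:] + 1e-10)
```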

14.8-10. Nonhomogeneous Linear Vector Equations (see also Secs. 1.9-4, 15.3-7, and 15.4-12). (a) Given a bounded operator A, the vector equation

image

has a unique solution x for every given vector f if and only if the given scalar λ is not contained in the spectrum of A (Sec. 14.8-3d). If λ equals an eigenvalue λ1 of A in the sense of Eq. (2), then Eq. (22) has a solution only if the given vector f is orthogonal to every eigenvector of A† associated with the eigenvalue λ1*. In the latter case, there are infinitely many solutions: every sum of a particular solution and a linear combination of eigenvectors corresponding to the eigenvalue λ1 is a solution.

(b) The important special case

image

where A is a bounded normal operator, admits a unique solution x for every given vector f if and only if Ax = 0 implies x = 0, i.e., if and only if A is nonsingular. If A is singular, Eq. (23) has a solution only if f is orthogonal to every eigenvector of A† associated with the eigenvalue zero.

(c) For a hermitian operator A = A† having an orthonormal set of eigenvectors yk such that image, the solution of Eq. (22) is given by

image

where the λ k are the (not necessarily distinct) eigenvalues corresponding to each yk.
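
Taking Eq. (22) in the form λx − Ax = f (an assumed sign convention; the text's Eq. (22) is not reproduced here), the eigenvector expansion of the solution reads x = Σk (yk, f)/(λ − λk) yk, which the following NumPy sketch verifies for a real symmetric A:

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                      # hermitian, orthonormal eigenvectors
lam_k, Y = np.linalg.eigh(A)           # columns of Y are eigenvectors

f = rng.standard_normal(4)
lam = 10.0                             # chosen outside the spectrum of A

# Expansion solution of lam*x - A x = f in the eigenvector basis:
# x = sum_k (y_k, f) / (lam - lam_k) * y_k
coeffs = (Y.T @ f) / (lam - lam_k)
x = Y @ coeffs

residual = float(np.linalg.norm(lam * x - A @ x - f))
assert residual < 1e-9

# If lam equals an eigenvalue lam_k, a solution exists only when f is
# orthogonal to the corresponding eigenvectors; the component Y[:, 0] @ f
# is the obstruction for lam = lam_1.
```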

14.9. GROUP REPRESENTATIONS AND RELATED TOPICS

14.9-1. Group Representations. (a) Every group (Sec. 12.2-1) can be represented by a homomorphism (Sec. 12.1-6) relating the group elements to a group of nonsingular linear transformations of a vector space (representation space, carrier space), and thus to a group of nonsingular matrices (this is a form of Cayley's theorem stated in Sec. 12.2-9b). A representation of degree or dimension n of a group G in the field F is a group of n × n matrices A, B, . . . over F related to the elements a, b, . . . of G by a homomorphism A = A(a), B = B(b), . . . , so that ab = c implies A(a)B(b) = C(c) for all a, b in G (representation condition). The dimension n equals the linear dimension of the representation space. A representation is faithful (true) if and only if it is reciprocal one-to-one (and thus an isomorphism, Sec. 12.1-6).

Every group admits a complex vector space as a representation space; i.e., every group has a representation in the field of complex numbers. Such a representation permits one to describe the defining operation of any group in terms of numerical additions and multiplications (see also Sees. 12.1-1 and 14.1-1). Most applications deal with groups of transformations (Sec. 12.2-8; for examples refer to Sec. 14.10-7).

Every group a1, a2, . . . , ag of finite order g admits a faithful representation comprising the g linearly independent permutation matrices (Sec. 13.2-6) Aj(aj) ≡ [aik(aj)] defined by

aik(aj) = 1 if ajak = ai, i.e., if ai−1ajak = E;     aik(aj) = 0 otherwise

where E is the identity element of the given group (regular representation of the finite group). Every finite group is thus isomorphic to a group of permutations (see also Sec. 12.2-8).
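
A minimal computational illustration of the regular representation, using S3 (g = 6) with composition of permutation tuples (NumPy assumed; the element ordering is arbitrary):

```python
import numpy as np
from itertools import permutations

# Regular representation of S3 (order g = 6); elements are tuples p with
# composition (p*q)(i) = p[q[i]].
elems = list(permutations(range(3)))
index = {p: i for i, p in enumerate(elems)}

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def rep(a):
    """Permutation matrix of the regular representation: A[i, k] = 1 iff a*a_k = a_i."""
    A = np.zeros((6, 6))
    for k, b in enumerate(elems):
        A[index[compose(a, b)], k] = 1
    return A

# Representation condition: A(a) A(b) = A(ab) for all a, b.
for a in elems:
    for b in elems:
        assert np.array_equal(rep(a) @ rep(b), rep(compose(a, b)))

# Faithful: distinct group elements receive distinct matrices.
mats = [rep(a).tobytes() for a in elems]
assert len(set(mats)) == 6
```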

(b) Two representations R and R̄ of a group G are similar or equivalent if and only if all pairs of corresponding matrices A(a) of R and Ā(a) of R̄ are related by the same similarity transformation (Sec. 13.4-1b) Ā(a) = T-1A(a)T. In this case, the matrices A(a) and Ā(a) are said to describe one and the same linear transformation A(a) of a representation space common to R and R̄ (see also Sec. 14.6-2).

A representation R is bounded, unitary, and/or orthogonal if and only if all its matrices have the corresponding properties. Every representation of a finite group and every unitary representation is bounded. For every bounded representation there exists an equivalent unitary representation.

(c) The rank of any representation R is the greatest number of linearly independent matrices* in R .

14.9-2. Reduction of a Representation . (a) A representation† R of a group G is reducible if and only if the representation space υ has a proper subspace υ1 invariant with respect to R , i.e., with respect to every linear transformation of υ described by a matrix of R (Sec. 14.5-2). This is true if and only if there exists a similarity transformation

image

which reduces the matrices A (a), B(b), . . . of R simultaneously to corresponding matrices of the form

image

where A1(a), B1(b), . . . are square matrices of equal order (alternative definition). The matrices A1(a), B1(b), . . . constitute a representation R1 of the given group G, with the representation space υ1 . A representation which cannot be reduced in this manner is called irreducible.

(b) A representation R will be called decomposable† if and only if its representation space is a direct sum υ = υ1 image υ2 image . . . (Sec. 12.7-5a) of subspaces υ1, υ2, . . . invariant with respect to R. This is true if and only if there exists a similarity transformation Ā = T-1AT which reduces the matrices A(a), B(b), . . . of R simultaneously to

* Linear independence of matrices is defined in the manner of Sec. 14.2-3, since matrices may be regarded as vectors.

†The definitions of this section also apply to any set of linear transformations of a vector space into itself, or to any corresponding set of matrices (not necessarily a group).

† The terms reducible, decomposable, and completely reducible are variously interchanged by different authors. These terms are, indeed, equivalent in the case of bounded matrices, transformations, and representations, and thus for all representations of finite groups (Sec. 14.9-2d).

corresponding step matrices

image

(direct sums of matrices, Sec. 13.2-9), where corresponding submatrices are of equal order. Each set of matrices Ai(a), Bi(b), . . . ( i = 1, 2, . . .) constitutes a representation Ri of the given group G with the representation space υ i . R is written as the direct sum R = R1 image R2 image . . . .

(c) A representation R is completely reducible if and only if it is decomposable into irreducible representations (irreducible components) R(1), R(2), . . . .

(d) Conditions for Reducibility (see also Sec. 14.9-5b). Every bounded representation (and, in particular, every representation of a finite group) is either completely reducible or irreducible.

A group G has a decomposable representation if and only if it is the direct product (Sec. 12.7-2) of simple groups (Sec. 12.2-5b).

A bounded representation R is completely reducible if and only if there exists a matrix Q, not a multiple of I, which commutes with every matrix of R. Irreducible representations of commutative (Abelian) groups are necessarily one-dimensional.

Whenever all corresponding matrices A, Ā of two irreducible representations R and R̄ are related by the same transformation QA = ĀQ, then R and R̄ are either equivalent or Q = [0] (Schur's Lemma).

14.9-3. The Irreducible Representations of a Group. (a) The decomposition R = R(1) image R(2) image . . . of a given completely reducible representation R of a group into irreducible components is unique except for equivalence and for the relative order of terms. Every completely reducible representation is uniquely defined by its irreducible components (except for equivalence). If R(j) is one of exactly mj mutually equivalent irreducible components of R (j = 1, 2, . . .), one may write

image

(b) For every group G of finite order g,

The number m of distinct nonequivalent irreducible representations is finite and equals the number of distinct classes of conjugate elements (Sec. 12.2-5a).

If nj is the dimension of the jth irreducible representation, its rank equals nj² (Burnside's Theorem); the rank of every representation R of G equals the sum of the ranks nj² of the distinct irreducible components in R.

Each nj is a divisor of g, and

n1² + n2² + · · · + nm² = g

The regular representation of G (Sec. 14.9-1a) contains the jth irreducible representation of G exactly nj times.

(c) The determination of the complete set of irreducible representations of a group G of operators is of particular interest as a key to the solution of certain eigenvalue problems. Given any hermitian operator H which commutes with every operator of a group G, there exists a reciprocal one-to-one correspondence between the distinct eigenvalues λj of H and the nonequivalent irreducible representations R(j) of G, and the degree of degeneracy of each λj equals the dimension of R(j) (classification of quantum-mechanical eigenvalues from symmetry considerations, Refs. 14.20 to 14.22).

14.9-4. The Character of a Representation. (a) The character of a representation R is the function

χ(a) ≡ Tr [A(a)]

defined on the elements a of the group G represented by R . Conjugate group elements (Sec. 12.2-5a) have equal character values.

For every bounded representation,

image

Two completely reducible representations of the same group G are equivalent if and only if they have identical characters.

(b) The characters of irreducible representations are called simple or primitive characters, and the characters of reducible representations are known as composite characters. R = R1 image R2 image . . . implies χ(a) ≡ χ1(a) + χ2(a) + · · ·, where χ(a), χ1(a), χ2(a), . . . are the respective characters of R, R1, R2, . . .

14.9-5. Orthogonality Relations (see also Sec. 14.9-6). (a) The primitive characters χ(1)(a), χ(2)(a), . . . respectively associated with the nonequivalent irreducible representations R(1), R(2), . . . of a finite group G satisfy the relations

image

(b) Each of the m nonequivalent irreducible representations R(j) of a finite group G is equivalent to a corresponding unitary irreducible representation comprising the matrices [uik (j)(a)] (see also Sec. 14.9-1b). The elements of these unitary matrices satisfy the relations

image

(c) The relations (7) to (11) apply to countably infinite, continuous, and mixed-continuous groups whenever the mean values (Sec. 12.2-12) in question exist. In this case the primitive characters χ(j)(a) ≡ χ(j)[a(α1, α2, . . .)] constitute a complete orthogonal set of functions in the sense of Sec. 15.2-4.
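
For a finite group the orthogonality relations for primitive characters can be verified directly from a character table; the sketch below uses the well-known table of S3 (three classes, of sizes 1, 3, 2) and also checks the relation n1² + n2² + · · · + nm² = g:

```python
# Character orthogonality for S3 (g = 6), using its known character table.
# Classes: identity (size 1), transpositions (size 3), 3-cycles (size 2).
class_sizes = [1, 3, 2]
chars = [
    [1,  1,  1],   # trivial representation
    [1, -1,  1],   # sign representation
    [2,  0, -1],   # two-dimensional irreducible representation
]
g = sum(class_sizes)

def inner(chi1, chi2):
    """(1/g) * sum over the group of chi1(a) chi2*(a), summed by classes."""
    return sum(n * x * y for n, x, y in zip(class_sizes, chi1, chi2)) / g

# Primitive characters are orthonormal: inner(chi_i, chi_j) = delta_ij.
for i in range(3):
    for j in range(3):
        assert inner(chars[i], chars[j]) == (1 if i == j else 0)

# Burnside: the squared dimensions sum to the group order g.
assert sum(chi[0] ** 2 for chi in chars) == g
```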

14.9-6. Direct Products of Representations. (a) Given an n1-dimensional representation R1 and an n2-dimensional representation R2 of the same group G, the n1n2 × n1n2 matrices each obtained as a direct product (Sec. 13.2-10) of a matrix of R1 by a matrix of R2 constitute a representation of G, the direct product (Kronecker product) R1 image R2 of the representations R1 and R2 (see also Sec. 12.7-2). Its representation space is the direct product of the representation spaces associated with R1 and R2 (Sec. 12.7-3). The character χ(a) of R = R1 image R2 is the product of the respective characters χ1(a) of R1 and χ2(a) of R2:

image

If both R1 and R2 are bounded or unitary, the same is true for R1 imageR2.

(b) The direct product R(1) image R(2) of bounded irreducible representations R(1) and R(2) of G is irreducible if the dimension of R(1) and/or R(2) equals 1; otherwise, R(1) image R(2) is completely reducible. One may use this last fact to derive new irreducible representations of G from given irreducible representations.

(c) The irreducible representations of the direct product G1 image G2 of two groups G1 and G2 (Sec. 12.7-2) are the direct products R1(j) image R2 (j′) of the irreducible representations R1(j) of G1 and R2(j′) of G2.

14.9-7. Representations of Rings, Fields, and Linear Algebras (see also Secs. 12.3-1 and 12.4-2). Rings, fields, and linear algebras may also be represented by suitable classes of matrices or linear transformations. In particular, a linear algebra of order n² over a field F is isomorphic to an algebra of n × n matrices over F, provided that the given algebra has a multiplicative identity (regular representation of a linear algebra; see also Secs. 14.9-1a and 14.10-6).

14.10. MATHEMATICAL DESCRIPTION OF ROTATIONS

14.10-1. Rotations in Three-dimensional Euclidean Vector Space.

(a) Every orthogonal linear transformation

image

of a three-dimensional Euclidean vector space (Sec. 14.2-7a) onto itself preserves absolute values of vectors and angles between vectors (Secs. 14.4-5 and 14.4-6). Such a transformation is a (proper) rotation if and only if det (A) = 1, i.e., if and only if the transformation also preserves the relative orientation of any three base vectors (and hence right- and left-handedness of axes, vector products, and triple scalar products). A transformation (1a) with det (A) = −1 is an improper rotation, or a rotation with reflection.

(b) Given any orthonormal basis (Sec. 14.7-4) u1, u2, u3, let

image

Every transformation (1a) is represented by

image

or in matrix form

where

image

for proper rotations. Since an orthonormal reference system is used, the real matrix A ≡ [aik] describing each rotation is orthogonal (ÃA = AÃ = I, where Ã is the transpose of A; see also Sec. 14.7-5), i.e.,

image

and each coefficient aik equals its own cofactor in the determinant det [aik]. Three suitably given coefficients aik determine all nine.

Geometrically, aik is the cosine of the angle between the base vector ui and the rotated base vector u′k ≡ Auk (see also Sec. 14.5-1):

image
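
The orthogonality conditions and the cofactor property of a proper rotation matrix can be checked numerically (NumPy assumed; the particular product of elementary rotations is illustrative):

```python
import numpy as np

def rot_z(t):
    """Right-handed rotation through angle t about the third axis (a sketch)."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

A = rot_z(0.7) @ rot_x(-1.2) @ rot_z(0.3)   # a generic proper rotation

# Orthogonality: A~ A = A A~ = I, and det A = +1 for a proper rotation.
assert np.allclose(A.T @ A, np.eye(3)) and np.allclose(A @ A.T, np.eye(3))
assert np.isclose(np.linalg.det(A), 1.0)

def cofactor(A, i, k):
    minor = np.delete(np.delete(A, i, axis=0), k, axis=1)
    return (-1) ** (i + k) * np.linalg.det(minor)

# Each coefficient of a proper rotation matrix equals its own cofactor.
assert all(np.isclose(A[i, k], cofactor(A, i, k))
           for i in range(3) for k in range(3))
```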

14.10-2. Angle of Rotation. Rotation Axis. (a) A rotation (1) rotates the position vector x of each point in a three-dimensional Euclidean space through an angle of rotation δ about a directed rotation axis whose points are invariant. The rotation angle δ and the direction cosines c1, c2, c3 of the positive rotation axis are given by

image

so that δ > 0 corresponds to a rotation in the sense of a right-handed screw propelled in the direction of the positive rotation axis. Either the sign of δ or the positive direction on the rotation axis may be arbitrarily assigned.

The direction of the positive rotation axis is that of the eigenvector c1u1 + c2u2 + c3u3 corresponding to the eigenvalue +1 of A and is obtained by a principal-axes transformation of the matrix A (Sec. 14.8-6). The remaining eigenvalues of A are cos δ ± i sin δ = e±iδ.
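
The eigenvalue description of the rotation axis and angle can be verified with NumPy; the Rodrigues form used here to build the rotation matrix is an assumed equivalent of Eq. (6):

```python
import numpy as np

def rot_about_axis(c, delta):
    """Rotation through delta about the unit axis c (Rodrigues form -- a sketch)."""
    c = np.asarray(c, dtype=float)
    K = np.array([[0, -c[2], c[1]], [c[2], 0, -c[0]], [-c[1], c[0], 0]])
    return np.eye(3) + np.sin(delta) * K + (1 - np.cos(delta)) * (K @ K)

axis = np.array([1.0, 2.0, 2.0]) / 3.0      # unit vector
delta = 0.9
A = rot_about_axis(axis, delta)

# The rotation angle follows from the trace: tr A = 1 + 2 cos(delta) ...
assert np.isclose(np.trace(A), 1 + 2 * np.cos(delta))

# ... and the axis is the eigenvector of A for the eigenvalue +1.
w, v = np.linalg.eig(A)
i = np.argmin(np.abs(w - 1))
e = np.real(v[:, i])
e = e / np.linalg.norm(e)
assert np.allclose(abs(e @ axis), 1.0)       # parallel to the given axis

# The remaining eigenvalues are exp(+-i delta).
others = np.delete(w, i)
assert np.allclose(sorted(np.angle(others)), [-delta, delta])
```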

(b)The transformation matrix A corresponding to a given rotation described by δ, c1, c2, c3 is

image

14.10-3. Euler Parameters and Gibbs Vector. (a) The four Euler symmetrical parameters

λ = c1 sin (δ/2)     μ = c2 sin (δ/2)     ν = c3 sin (δ/2)     ρ = cos (δ/2)     (λ² + μ² + ν² + ρ² = 1)

define the rotation uniquely, since Eq. (6) yields

image

λ, μ, ν, ρ and −λ, −μ, −ν, −ρ represent the same rotation.
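
A sketch (NumPy assumed) of the Euler symmetrical parameters λ = c1 sin (δ/2), μ = c2 sin (δ/2), ν = c3 sin (δ/2), ρ = cos (δ/2), treated as the components of a unit quaternion; the final reassignment confirms that the negated parameter set represents the same rotation:

```python
import numpy as np

c = np.array([2.0, -1.0, 2.0]) / 3.0   # unit rotation axis
delta = 1.1
lam, mu, nu = c * np.sin(delta / 2)
rho = np.cos(delta / 2)

# The four parameters are not independent: lam^2 + mu^2 + nu^2 + rho^2 = 1.
assert np.isclose(lam**2 + mu**2 + nu**2 + rho**2, 1.0)

# Quaternion rotation x' = q x q^-1 with q = (rho; lam, mu, nu) -- a sketch.
def quat_mul(p, q):
    pw, pv = p[0], np.asarray(p[1:])
    qw, qv = q[0], np.asarray(q[1:])
    return np.concatenate(([pw * qw - pv @ qv],
                           pw * qv + qw * pv + np.cross(pv, qv)))

def rotate(x):
    q = np.array([rho, lam, mu, nu])
    qc = np.array([rho, -lam, -mu, -nu])
    return quat_mul(quat_mul(q, np.concatenate(([0.0], x))), qc)[1:]

# Check against the Rodrigues rotation with the same axis and angle.
x = np.array([0.3, -0.7, 1.2])
expected = (np.cos(delta) * x + np.sin(delta) * np.cross(c, x)
            + (1 - np.cos(delta)) * (c @ x) * c)
assert np.allclose(rotate(x), expected)

# (lam, mu, nu, rho) and (-lam, -mu, -nu, -rho) give the same rotation.
lam, mu, nu, rho = -lam, -mu, -nu, -rho
assert np.allclose(rotate(x), expected)
```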

(b) The Gibbs vector

image

likewise defines the rotation uniquely. The rotated vector x′ can be written as

image

14.10-4. Representation of Vectors and Rotations by Spin Matrices and Quaternions. Cayley-Klein Parameters. (a) Given an orthonormal basis u1, u2, u3, every real vector x = ξ1u1 + ξ2u2 + ξ3u3 may be represented by the (in general complex) hermitian 2 × 2 matrix

image

where the hermitian Pauli spin matrices

S1 = [0  1; 1  0]     S2 = [0  −i; i  0]     S3 = [1  0; 0  −1]

correspond, respectively, to u1, u2, u3. The correspondence (11) is an isomorphism preserving the results of vector addition and multiplication of vectors by (real) scalars.
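
The correspondence (11) and its elementary properties can be checked directly (NumPy assumed); note the useful identity det X = −(x, x):

```python
import numpy as np

# The Pauli spin matrices corresponding to u1, u2, u3.
S1 = np.array([[0, 1], [1, 0]], dtype=complex)
S2 = np.array([[0, -1j], [1j, 0]])
S3 = np.array([[1, 0], [0, -1]], dtype=complex)

def spin_matrix(x):
    """Hermitian 2x2 matrix representing the real vector x."""
    return x[0] * S1 + x[1] * S2 + x[2] * S3

x = np.array([0.5, -1.0, 2.0])
X = spin_matrix(x)

# X is hermitian and traceless, and det X = -(x, x).
assert np.allclose(X, X.conj().T)
assert np.isclose(np.trace(X), 0)
assert np.isclose(np.linalg.det(X), -(x @ x))

# The correspondence preserves addition and multiplication by real scalars.
y = np.array([1.0, 2.0, -0.5])
assert np.allclose(spin_matrix(2 * x + y), 2 * spin_matrix(x) + spin_matrix(y))
```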

For every rotation (1), the 2 × 2 matrix representing the rotated vector

image

where U is the (in general complex) unitary 2 × 2 matrix with determinant 1 (unimodular 2 × 2 matrix) defined by

U ≡ [a  b; −b*  a*]     (det U = aa* + bb* = 1)

The complex numbers a, b determine the corresponding rotation uniquely; but a, b and −a, −b, and hence U and −U, represent the same rotation. Either a, b, −b*, a* or a*, ib*, −ib, a are referred to as the Cayley-Klein parameters of the rotation.

Geometrically, the complex parameters a, b define the complex-plane transformation

image

(bilinear transformation, Sec. 7.9-2) relating the stereographic projection u of the point (ξ1, ξ2, ξ3) on a sphere about the origin onto the complex u plane (Sec. 7.2-4) and the stereographic projection u′ of the rotated point (ξ′1, ξ′2, ξ′3) (see also Ref. 14.13).

(b) The linear combinations of I, iS1, iS2, and iS3 with real coefficients constitute a matrix representation of the quaternion algebra (Sec. 12.4-2), whose scalars correspond to real multiples of I, and whose generators correspond to iS1, iS2, iS3 with

image

Every complex 2 X 2 matrix can be expressed as such a linear combination; in particular,

image

Again, both U and −U represent the same rotation.

14.10-5. Rotations about the Coordinate Axes. The following transformation matrices represent right-handed rotations about the positive coordinate axes:

image

14.10-6. Euler Angles. (a) Every matrix A ≡ [aik] representing a proper rotation in three-dimensional Euclidean space can be variously expressed as a product of three matrices (18), and in particular as

image

The three Euler angles α, β, γ define the rotation uniquely; except for multiples of 2π, they are uniquely determined by a given rotation, unless β = 0 ("gimbal lock," Sec. 14.10-6d).

A set of cartesian x′, y′, z′ axes (moving "body axes" of a rigid body) initially aligned with u1, u2, u3 can be rotated into alignment with u′1, u′2, u′3 by three successive rotations (18) [Fig. 14.10-1; note the discussion of Sec. 14.6-3, which explains the apparently inverted order of the three matrices in Eq. (20)].

Rotate about the z′ axis through the Euler angle α

Rotate about the new x′ axis (the line of nodes) through the Euler angle β

Rotate about the new z′ axis through the Euler angle γ

image

FIG. 14.10-1. The Euler angles α, β, γ. The axis OL of the second rotation (through β) is often called the line of nodes. Note that α and β are the spherical polar coordinates for the vector u′3 in the u1, u2, u3 system.

image

FIG. 14.10-2. Body axes of an aircraft.

The inverse rotation A-1 (which turns x′ back into x) is represented by the matrix

image

There are six ways to express a rotation matrix (1) as a product

image

of rotations about two different coordinate axes. Of the different resulting Euler-angle systems, another frequently used one is defined by

image

which is related to the system of Eq. (20) by

image

(b) There are, furthermore, six ways to represent a rotation matrix A as a product

image

of rotations about three different coordinate axes. In particular,

image

is frequently used to describe the attitude of an aircraft or space vehicle after successive roll φ, pitch ϑ, and yaw ψ about body-centered axes respectively directed forward, to starboard, and toward the bottom of the craft (Fig. 14.10-2).

(c) The profusion of the 12 Euler-angle systems defined above is augmented by the fact that some authors replace one or more of the Euler angles by its negative, and that some of the literature involves left-handed coordinate systems. In addition, the reader is warned to check whether a given Euler-angle transformation is originally defined as an operation (“alibi” interpretation, Sec. 14.5-1) or as a coordinate transformation (“alias” interpretation, Sec. 14.6-1), since it is possible to confuse A and A-1 = Ã. Specifically, the component matrix x ≡ {ξ1, ξ2, ξ3} of a vector x in the fixed ui system and the component matrix x′ ≡ {ξ′1, ξ′2, ξ′3} in the rotated u′k system are related by

image

and the base-vector matrices (Sec. 14.6-2) transform according to

image

Note the discussion of Sec. 14.6-3.

(d) The parameters c1, c2, c3, δ and λ, μ, ν, ρ are readily expressed in terms of Euler angles with the aid of Eq. (5) and the Euler-angle matrix. Thus

image

Note that addition of 2π to one of the Euler angles changes the sign of all parameters (29) and leaves the rotation matrix A unchanged.

If β = 0 in Eq. (20), β′ = 0 in Eq. (22), or ϑ = π/2 in Eq. (25), then the remaining two Euler angles are no longer uniquely defined by the given rotation ("gimbal lock" in attitude-reference systems). Euler angles are, therefore, employed only for the representation of rotations which are suitably restricted in range.
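
Gimbal lock is easy to demonstrate numerically. The sketch below uses a generic yaw-pitch-roll composition (an assumed aerospace-style convention, not necessarily identical with the text's Eq. (26)); at pitch π/2 only the difference of roll and yaw affects the matrix:

```python
import numpy as np

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def attitude(roll, pitch, yaw):
    """Yaw-pitch-roll composition (one common aerospace convention -- a sketch)."""
    return rot_z(yaw) @ rot_y(pitch) @ rot_x(roll)

# Away from pitch = pi/2 the pitch angle is recoverable from the matrix ...
A = attitude(0.3, 0.5, -0.8)
assert np.isclose(np.arcsin(-A[2, 0]), 0.5)

# ... but at pitch = pi/2 ("gimbal lock") roll and yaw degenerate:
# distinct angle pairs with the same difference give the same matrix.
A1 = attitude(0.3, np.pi / 2, -0.8)
A2 = attitude(0.3 + 0.4, np.pi / 2, -0.8 + 0.4)
assert np.allclose(A1, A2)
```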

14.10-7. Infinitesimal Rotations, Continuous Rotation, and Angular Velocity (see also Secs. 5.3-2 and 14.4-9). (a) An infinitesimal three-dimensional rotation through a small angle dδ about a rotation axis with the direction cosines c1, c2, c3 is described by the orthogonal infinitesimal transformation

image

so that Δ is skew-symmetric (Sec. 14.4-9). In the orthonormal u1, u2, u3 reference frame, Δ is represented by the skew-symmetric matrix†

image

obtained from Eq. (6) for δ = dδ → 0. It follows that

image

where e = c1u1 + c2u2 + c3u3 is the unit vector in the direction of the

* a and b are defined in Sec. 14.10-4. Note that several different definitions of the Cayley-Klein parameters are in common use.

† In general, if W is any skew-symmetric linear operator defined on a three-dimensional Euclidean vector space VE and represented in an orthonormal reference frame u1, u2, u3 by the skew-symmetric matrix

image

then, for every vector x of VE,

image

where w = w1u1 + w2u2 + w3u3 (see also Sec. 16.9-2).

positive rotation axis. This relation is independent of any reference frame (see also Sec. 14.10-3b).

(b) For a continuous three-dimensional rotation described by

image

where x is constant, Eq. (32) yields

image

The vector ω(t), given in terms of the fixed and rotating base vectors by

image

is directed along the instantaneous axis of rotation (axis of the rotation x' → x' + dx'), and |ω(t)| is the instantaneous absolute rate of rotation with respect to t. If the parameter t used to describe the rotation is the time, then ω(t) is the angular velocity of the rotation.
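
The skew-symmetry of Ω(t) ≡ (dA/dt)A⁻¹ and the relation dx′/dt = ω × x′ can be checked with a finite-difference derivative (NumPy assumed; a fixed-axis rotation is used for simplicity):

```python
import numpy as np

def rot_about_axis(c, angle):
    """Rodrigues rotation about the unit axis c (a sketch)."""
    K = np.array([[0, -c[2], c[1]], [c[2], 0, -c[0]], [-c[1], c[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

axis = np.array([2.0, 2.0, 1.0]) / 3.0
rate = 0.7                             # angular speed |omega|
A = lambda t: rot_about_axis(axis, rate * t)

# Omega(t) = (dA/dt) A^-1 is skew-symmetric; approximate dA/dt by a
# central difference.
t, h = 1.3, 1e-6
dA = (A(t + h) - A(t - h)) / (2 * h)
Omega = dA @ A(t).T                    # A^-1 = A~ for a rotation

assert np.allclose(Omega, -Omega.T, atol=1e-6)

# Its off-diagonal entries carry the angular-velocity components:
# Omega = [[0, -w3, w2], [w3, 0, -w1], [-w2, w1, 0]].
omega = np.array([Omega[2, 1], Omega[0, 2], Omega[1, 0]])
assert np.allclose(omega, rate * axis, atol=1e-6)

# Equivalently dx'/dt = omega x x' for a rotating vector x' = A(t) x.
x = np.array([1.0, -2.0, 0.5])
dx = (A(t + h) - A(t - h)) @ x / (2 * h)
assert np.allclose(dx, np.cross(omega, A(t) @ x), atol=1e-6)
```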

Equation (33) yields

image

where Ω is the skew-symmetric operator represented in the u1, u2, u3 and u′1(t), u′2(t), u′3(t) reference systems by the respective matrices

image

With the aid of the relations (14.6-8) and A-1 = Ã, it follows that

image

Substitution of Eq. (8), (20), or (26) into Eq. (36) yields relations between angular-velocity components and matrix elements (direction cosines), Euler parameters, or Euler angles. In particular,

image

image

14.10-8. The Three-dimensional Rotation Group and Its Representations (see also Secs. 12.2-1 to 12.2-12 and 14.9-1 to 14.9-6). (a) The orthogonal transformations (1) of a three-dimensional Euclidean vector space onto itself are necessarily bounded and nonsingular and constitute a group, the three-dimensional rotation-reflection group image. The proper rotations [det (A) = 1] constitute a normal subgroup of image, the three-dimensional rotation group image. Neither of these groups is commutative.

Rotations involving the same absolute rotation angle |δ| belong to the same class of conjugate elements. The rotations about any fixed axis form a commutative subgroup of image (two-dimensional rotations).

image is a mixed-continuous group, and image is a continuous group (normal subgroup of image with index 2). image is, in turn, a subgroup of the group of all nonsingular linear transformations of the Euclidean vector space onto itself (full linear group, FLG). Note that every transformation of FLG is the product of a proper or improper rotation and a nonnegative symmetric transformation (affine transformation, stretching or contraction; see also Secs. 14.4-8 and 13.3-4).

(b) The Irreducible Representations of image. The 2 × 2 matrices (14) constitute an irreducible unitary two-to-one representation of image in the field of complex numbers; i.e., the three-dimensional rotation group image is represented by the group of unitary transformations with determinant 1 of a two-dimensional unitary vector space onto itself (two-dimensional unimodular unitary group, special unitary group, SUG).

More generally, the three-dimensional rotation group image has bounded irreducible representations of dimensions n = 2, 3, 4, . . . . The complete set of unitary irreducible representations is conveniently written as R(1/2), R(1), R(3/2), . . . ; R(j) has the dimension n = 2j + 1, and the (2j + 1) × (2j + 1) matrix representing the rotation with Cayley-Klein parameters a, b, −b*, a* (Sec. 14.10-4) or Euler angles α, β, γ (Sec. 14.10-6) is

image

Each sum has only a finite number of terms, since 1/N! = 0 for N < 0. The representation R(j) is true (one-to-one) for j = 1, 2, . . . and two-to-one for j = 1/2, 3/2, . . . (see also Sec. 14.10-4). The character (Sec. 14.9-4) of R(j) is

χ(j)(δ) = sin [(2j + 1)δ/2] / sin (δ/2)

where δ is the angle of rotation defined in Sec. 14.10-2.
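
The character formula for R(j) is assumed here in the standard form χ(j)(δ) = sin [(2j + 1)δ/2]/sin (δ/2); the sketch checks it against the traces of the familiar j = 1 (ordinary rotation) and j = 1/2 (unimodular unitary) representations:

```python
import numpy as np

def character(j, delta):
    """Character of R^(j) at rotation angle delta:
    chi = sin((2j + 1) delta/2) / sin(delta/2)  (an assumed standard form)."""
    return np.sin((2 * j + 1) * delta / 2) / np.sin(delta / 2)

delta = 0.9

# j = 1 gives the ordinary three-dimensional rotations: the character
# equals the trace 1 + 2 cos(delta) of a rotation matrix.
assert np.isclose(character(1, delta), 1 + 2 * np.cos(delta))

# j = 1/2 gives the two-dimensional unimodular unitary representation:
# the trace of the corresponding 2x2 unitary matrix is 2 cos(delta/2).
assert np.isclose(character(0.5, delta), 2 * np.cos(delta / 2))

# The dimension of R^(j) is recovered as delta -> 0: chi -> 2j + 1.
assert np.isclose(character(1.5, 1e-8), 4.0)
```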

The peculiar indices j, m, and q employed to label the R(j) and their matrix elements are those associated with the spherical surface harmonics of degree j (Sec. 21.8-12). For integral values of j, these functions constitute a (2j + 1)-dimensional representation space for R(j), with the functions (21.8-6b) as an orthonormal basis.

* Some authors replace the factor (−1)h by (−1)h−q+m, corresponding to multiplication of each matrix by the same diagonal matrix. As written, Eq. (41) reduces to Eq. (14) for j = 1/2.

(c) Direct Products of Rotation Groups (see also Secs. 12.7-2 and 14.9-6). Direct products of three-dimensional rotation groups describe, for instance, the composite rotations of dynamical systems comprising two or more rotating bodies (atoms or nuclei in quantum mechanics). image is isomorphic with image; note

image

14.11. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY

14.11-1. Related Topics. The following topics related to the study of matrix methods are treated in other chapters of this handbook:

Linear simultaneous equations Chap. 1

Elementary vector algebra Chap. 5

Vector relations in terms of curvilinear coordinates Chap. 6

Groups, vector spaces, and linear operators Chap. 12

Diagonalization of matrices, eigenvalue problems, quadratic and hermitian forms Chap. 13

Tensor algebra Chap. 16

14.11-2. References and Bibliography (see also Secs. 12.9-2, 13.7-2, and 15.7-2).

      14.1. Birkhoff, G., and S. MacLane: A Survey of Modern Algebra, 3d ed., Macmillan, New York, 1965.

      14.2. Dunford, N., and J. T. Schwartz: Linear Operators, Interscience, New York, 1964.

      14.3. Finkbeiner, D. T.: Introduction to Matrices and Linear Algebra, Freeman, San Francisco, 1960.

      14.4. Halmos, P. R.: Finite-dimensional Vector Spaces, 2d ed., Princeton, Princeton, N.J., 1958.

      14.5. Halmos, P. R.: Introduction to Hilbert Space and the Theory of Spectral Multiplicity, Chelsea, New York, 1957.

      14.6. Nering, E. D.: Linear Algebra and Matrix Theory, Interscience, New York, 1963.

      14.7. Shields, P. C.: Linear Algebra, Addison-Wesley, Reading, Mass., 1964.

      14.8. Stoll, R. R.: Linear Algebra and Matrix Theory, McGraw-Hill, New York, 1952.

      14.9. Thrall, R. M., and L. Tornheim: Vector Spaces and Matrices, Wiley, New York, 1957.

      14.10. Von Neumann, J.: Die Mathematischen Grundlagen der Quantenmechanik, Springer, Berlin, 1932.

      14.11. Zaanen, A. C.: Linear Analysis, North Holland Publishing Company, Amsterdam, 1956.

(See also the articles by G. Falk and H. Tietz in Vol. II of the Handbuch der Physik, Springer, Berlin, 1955.)

Representation of Rotations

      14.12. Goldstein, H.: Classical Mechanics, Addison-Wesley, Reading, Mass., 1953.

      14.13. Synge, J. L.: Classical Dynamics, in Handbuch der Physik, Vol. III, Springer, Berlin, 1960.

Representation of Groups

      14.14. Boerner, H.: Darstellungen von Gruppen mit Berücksichtigung der Bedürfnisse der modernen Physik, Springer, Berlin, 1955.

      14.15. Hall, M.: The Theory of Groups, Macmillan, New York, 1961.

      14.16. Kurosh, A. G.: The Theory of Groups, 2 vols., 2d ed., Chelsea, New York, 1960.

      14.17. Margenau, H., and G. M. Murphy: The Mathematics of Physics and Chemistry, 2d ed., Van Nostrand, Princeton, N.J., 1957.

      14.18. Murnaghan, F. D.: The Unitary and Rotation Groups, Spartan, Washington, D.C., 1962.

      14.19. Van der Waerden, B. L.: Die Gruppentheoretische Methode in der Quantenmechanik, Springer, Berlin, 1932.

      14.20. Weyl, H.: The Theory of Groups and Quantum Mechanics, Dover, New York, 1931.

      14.21. Weyl, H.: The Classical Groups, Princeton, Princeton, N.J., 1939.

      14.22. Wigner, E.: Gruppentheorie und ihre Anwendung auf die Quantenmechanik der Atomspektren, Brunswick, 1951.

      14.23. Zassenhaus, Hans J.: The Theory of Groups, 2d ed., Chelsea, New York, 1958.