Let T be a linear operator on a vector space of finite dimension. As seen in Chapter 6, T may not have a diagonal matrix representation. However, it is still possible to “simplify” the matrix representation of T in a number of ways. This is the main topic of this chapter. In particular, we obtain the primary decomposition theorem, and the triangular, Jordan, and rational canonical forms.
We comment that the triangular and Jordan canonical forms exist for T if and only if the characteristic polynomial Δ(t) of T has all its roots in the base field K. This is always true if K is the complex field C but may not be true if K is the real field R.
We also introduce the idea of a quotient space. This is a very powerful tool, and it will be used in the proof of the existence of the triangular and rational canonical forms.
Let T be a linear operator on an n-dimensional vector space V. Suppose T can be represented by a triangular matrix A = [aij] (i.e., aij = 0 whenever i > j). Then the characteristic polynomial Δ(t) of T is a product of linear factors; that is,
Δ(t) = det(tI − A) = (t − a11)(t − a22)⋯(t − ann)
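A sketch of such a triangular representation and the resulting factorization, in the notation just introduced (generic entries aij assumed):

\[
A=\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
       & a_{22} & \cdots & a_{2n}\\
       &        & \ddots & \vdots\\
       &        &        & a_{nn}
\end{bmatrix},
\qquad
\Delta(t)=\det(tI-A)=(t-a_{11})(t-a_{22})\cdots(t-a_{nn}).
\]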
The converse is also true and is an important theorem (proved in Problem 10.28).
THEOREM 10.1: Let T:V → V be a linear operator whose characteristic polynomial factors into linear polynomials. Then there exists a basis of V in which T is represented by a triangular matrix.
THEOREM 10.1: (Alternative Form) Let A be a square matrix whose characteristic polynomial factors into linear polynomials. Then A is similar to a triangular matrix—that is, there exists an invertible matrix P such that P−1AP is triangular.
We say that an operator T can be brought into triangular form if it can be represented by a triangular matrix. Note that in this case, the eigenvalues of T are precisely those entries appearing on the main diagonal. We give an application of this remark.
EXAMPLE 10.1 Let A be a square matrix over the complex field C. Suppose λ is an eigenvalue of A2. Show that √λ or −√λ is an eigenvalue of A.
By Theorem 10.1, A and A2 are similar, respectively, to triangular matrices B and B2 whose diagonal entries are μ1, ..., μn and μ12, ..., μn2 (sketched below). Because similar matrices have the same eigenvalues, λ = μi2 for some i. Hence, μi = √λ or μi = −√λ is an eigenvalue of A.
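A sketch of the two triangular forms used in this argument (generic diagonal entries μ1, ..., μn assumed):

\[
A\sim B=\begin{bmatrix}
\mu_1 & * & \cdots & *\\
      & \mu_2 & \cdots & *\\
      &       & \ddots & \vdots\\
      &       &        & \mu_n
\end{bmatrix},
\qquad
A^2\sim B^2=\begin{bmatrix}
\mu_1^2 & * & \cdots & *\\
        & \mu_2^2 & \cdots & *\\
        &         & \ddots & \vdots\\
        &         &        & \mu_n^2
\end{bmatrix}.
\]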
Let T:V → V be linear. A subspace W of V is said to be invariant under T or T-invariant if T maps W into itself—that is, if υ ∈ W implies T(υ) ∈ W. In this case, T restricted to W defines a linear operator on W; that is, T induces a linear operator T̂:W → W defined by T̂(w) = T(w) for every w ∈ W.
EXAMPLE 10.2 (a) Let T: R3 → R3 be the linear operator that rotates each vector υ about the z-axis by an angle θ (shown in Fig. 10-1); that is, T(x, y, z) = (x cos θ − y sin θ, x sin θ + y cos θ, z).
Observe that each vector w = (a, b, 0) in the xy-plane W remains in W under the mapping T; hence, W is T-invariant. Observe also that the z-axis U is invariant under T. Furthermore, the restriction of T to W rotates each vector about the origin O, and the restriction of T to U is the identity mapping of U.
(b) Nonzero eigenvectors of a linear operator T:V → V may be characterized as generators of T-invariant one-dimensional subspaces. Suppose T(υ) = λυ, υ ≠ 0. Then W = {kυ : k ∈ K}, the one-dimensional subspace generated by υ, is invariant under T, because T(kυ) = kT(υ) = kλυ ∈ W.
Conversely, suppose dim U = 1 and u ≠ 0 spans U, and U is invariant under T. Then T(u) ∈ U and so T(u) is a multiple of u—that is, T(u) = μu. Hence, u is an eigenvector of T.
The next theorem (proved in Problem 10.3) gives us an important class of invariant subspaces.
THEOREM 10.2: Let T:V → V be any linear operator, and let f(t) be any polynomial. Then the kernel of f(T) is invariant under T.
The notion of invariance is related to matrix representations (Problem 10.5) as follows.
THEOREM 10.3: Suppose W is an invariant subspace of T:V → V. Then T has a block matrix representation [A B; 0 C], where A is a matrix representation of the restriction T̂ of T to W.
A vector space V is termed the direct sum of subspaces W1, ..., Wr, written
V = W1 ⊕ W2 ⊕ ⋯ ⊕ Wr
if every vector υ ∈ V can be written uniquely in the form
υ = w1 + w2 + ⋯ + wr,   where wi ∈ Wi
The following theorem (proved in Problem 10.7) holds.
THEOREM 10.4: Suppose W1, W2, ..., Wr are subspaces of V, and suppose B1, B2, ..., Br
are bases of W1, W2, ..., Wr, respectively. Then V is the direct sum of the Wi if and only if the union B = B1 ∪ ... ∪ Br is a basis of V.
Now suppose T:V → V is linear and V is the direct sum of (nonzero) T-invariant subspaces W1, W2, ..., Wr; that is,
V = W1 ⊕ W2 ⊕ ⋯ ⊕ Wr
Let Ti denote the restriction of T to Wi. Then T is said to be decomposable into the operators Ti or T is said to be the direct sum of the Ti, written T = T1 ⊕ ... ⊕ Tr. Also, the subspaces W1, ..., Wr are said to reduce T or to form a T-invariant direct-sum decomposition of V.
Consider the special case where two subspaces U and W reduce an operator T:V → V; say dim U = 2 and dim W = 3, and suppose {u1, u2} and {w1, w2, w3} are bases of U and W, respectively. If T1 and T2 denote the restrictions of T to U and W, respectively, then T1(u1) and T1(u2) are linear combinations of u1 and u2, and T2(w1), T2(w2), T2(w3) are linear combinations of w1, w2, w3.
Accordingly, the matrices A = [T1] (relative to {u1, u2}), B = [T2] (relative to {w1, w2, w3}), and the block diagonal matrix M = diag(A, B), sketched below, are the matrix representations of T1, T2, T, respectively.
The block diagonal matrix M results from the fact that {u1, u2, w1, w2, w3} is a basis of V (Theorem 10.4), and that T(ui) = T1(ui) and T(wj) = T2(wj).
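A sketch of these matrices (hypothetical entries aij and bij):

\[
A=\begin{bmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{bmatrix},\qquad
B=\begin{bmatrix}b_{11}&b_{12}&b_{13}\\b_{21}&b_{22}&b_{23}\\b_{31}&b_{32}&b_{33}\end{bmatrix},\qquad
M=\begin{bmatrix}A&0\\0&B\end{bmatrix},
\]

where M is the matrix of T relative to the basis {u1, u2, w1, w2, w3}.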
A generalization of the above argument gives us the following theorem.
THEOREM 10.5: Suppose T:V → V is linear and suppose V is the direct sum of T-invariant subspaces, say, W1, ..., Wr. If Ai is a matrix representation of the restriction of T to Wi, then T can be represented by the block diagonal matrix
M = diag(A1, A2, ..., Ar)
The following theorem shows that any operator T:V → V is decomposable into operators whose minimal polynomials are powers of irreducible polynomials. This is the first step in obtaining a canonical form for T.
THEOREM 10.6: (Primary Decomposition Theorem) Let T:V → V be a linear operator with minimal polynomial
m(t) = f1(t)n1 f2(t)n2 ⋯ fr(t)nr
where the fi(t) are distinct monic irreducible polynomials. Then V is the direct sum of T-invariant subspaces W1, ..., Wr, where Wi is the kernel of fi(T)ni. Moreover, fi(t)ni is the minimal polynomial of the restriction of T to Wi.
The above polynomials fi(t)ni are relatively prime. Therefore, the above fundamental theorem follows (Problem 10.11) from the next two theorems (proved in Problems 10.9 and 10.10, respectively).
THEOREM 10.7: Suppose T:V → V is linear, and suppose f(t) = g(t)h(t) are polynomials such that f(T) = 0 and g(t) and h(t) are relatively prime. Then V is the direct sum of the T-invariant subspaces U and W, where U = Ker g(T) and W = Ker h(T).
THEOREM 10.8: In Theorem 10.7, if f(t) is the minimal polynomial of T [and g(t) and h(t) are monic], then g(t) and h(t) are the minimal polynomials of the restrictions of T to U and W, respectively.
We will also use the primary decomposition theorem to prove the following useful characterization of diagonalizable operators (see Problem 10.12 for the proof).
THEOREM 10.9: A linear operator T:V → V is diagonalizable if and only if its minimal polynomial m(t) is a product of distinct linear polynomials.
THEOREM 10.9: (Alternative Form) A matrix A is similar to a diagonal matrix if and only if its minimal polynomial is a product of distinct linear polynomials.
EXAMPLE 10.3 Suppose A ≠ I is a square matrix for which A3 = I. Determine whether or not A is similar to a diagonal matrix if A is a matrix over: (i) the real field R, (ii) the complex field C.
Because A3 = I, A is a zero of the polynomial f(t) = t3 − 1 = (t − 1)(t2 + t + 1). The minimal polynomial m(t) of A cannot be t − 1, because A ≠ I. Hence,
m(t) = t2 + t + 1   or   m(t) = t3 − 1
Because neither polynomial is a product of linear polynomials over R, A is not diagonalizable over R. On the other hand, each of the polynomials is a product of distinct linear polynomials over C. Hence, A is diagonalizable over C.
A linear operator T:V → V is termed nilpotent if Tn = 0 for some positive integer n; we call k the index of nilpotency of T if Tk = 0 but Tk − 1 ≠ 0. Analogously, a square matrix A is termed nilpotent if An = 0 for some positive integer n, and of index k if Ak = 0 but Ak − 1 ≠ 0. Clearly the minimal polynomial of a nilpotent operator (matrix) of index k is m(t) = tk; hence, 0 is its only eigenvalue.
EXAMPLE 10.4 The following two r-square matrices will be used throughout the chapter:
The first matrix N, called a Jordan nilpotent block, consists of 1’s above the diagonal (called the super-diagonal), and 0’s elsewhere. It is a nilpotent matrix of index r. (The matrix N of order 1 is just the 1 × 1 zero matrix [0].)
The second matrix J(λ), called a Jordan block belonging to the eigenvalue λ, consists of λ's on the diagonal, 1's on the superdiagonal, and 0's elsewhere. Observe that J(λ) = λI + N.
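For illustration, with r = 4 these matrices are

\[
N=\begin{bmatrix}0&1&0&0\\0&0&1&0\\0&0&0&1\\0&0&0&0\end{bmatrix},
\qquad
J(\lambda)=\begin{bmatrix}\lambda&1&0&0\\0&\lambda&1&0\\0&0&\lambda&1\\0&0&0&\lambda\end{bmatrix}=\lambda I+N.
\]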
In fact, we will prove that any linear operator T can be decomposed into operators, each of which is the sum of a scalar operator and a nilpotent operator.
The following (proved in Problem 10.16) is a fundamental result on nilpotent operators.
THEOREM 10.10: Let T:V → V be a nilpotent operator of index k. Then T has a block diagonal matrix representation in which each diagonal entry is a Jordan nilpotent block N. There is at least one N of order k, and all other N are of orders ≤k. The number of N of each possible order is uniquely determined by T. The total number of N of all orders is equal to the nullity of T.
The proof of Theorem 10.10 shows that the number of N of order i is equal to 2mi − mi+1 − mi−1, where mi is the nullity of Ti.
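For instance (a hypothetical illustration), suppose dim V = 6 and the nullities of T, T2, T3 are m1 = 3, m2 = 5, m3 = 6, so that T is nilpotent of index k = 3. Taking m0 = 0 and m4 = m3, the formula gives
2m3 − m4 − m2 = 1 block of order 3,   2m2 − m3 − m1 = 1 block of order 2,   2m1 − m2 − m0 = 1 block of order 1
These m1 = 3 blocks account for dimension 3 + 2 + 1 = 6 = dim V, as they must.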
An operator T can be put into Jordan canonical form if its characteristic and minimal polynomials factor into linear polynomials. This is always true if K is the complex field C. In any case, we can always extend the base field K to a field in which the characteristic and minimal polynomials do factor into linear factors; thus, in a broad sense, every operator has a Jordan canonical form. Analogously, every matrix is similar to a matrix in Jordan canonical form.
The following theorem (proved in Problem 10.18) describes the Jordan canonical form J of a linear operator T.
THEOREM 10.11: Let T:V → V be a linear operator whose characteristic and minimal polynomials are, respectively,
Δ(t) = (t − λ1)n1(t − λ2)n2 ⋯ (t − λr)nr   and   m(t) = (t − λ1)m1(t − λ2)m2 ⋯ (t − λr)mr
where the λi are distinct scalars. Then T has a block diagonal matrix representation J in which each diagonal entry is a Jordan block Jij = J(λi). For each λi, the corresponding blocks Jij have the following properties:
(i) There is at least one Jij of order mi; all other Jij are of order ≤mi.
(ii) The sum of the orders of the Jij is ni.
(iii) The number of Jij equals the geometric multiplicity of λi.
(iv) The number of Jij of each possible order is uniquely determined by T.
EXAMPLE 10.5 Suppose the characteristic and minimal polynomials of an operator T are, respectively,
Then the Jordan canonical form of T is one of the following block diagonal matrices:
The first matrix occurs if T has two independent eigenvectors belonging to the eigenvalue 2; and the second matrix occurs if T has three independent eigenvectors belonging to the eigenvalue 2.
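For concreteness, assume (for illustration) that the polynomials above are Δ(t) = (t − 2)4(t − 5)3 and m(t) = (t − 2)2(t − 5)3, which is consistent with this discussion. Then the blocks belonging to 2 have orders summing to 4 with largest order 2, and the eigenvalue 5 contributes a single block of order 3, so the two possible forms are

\[
J=\operatorname{diag}\!\left(
\begin{bmatrix}2&1\\0&2\end{bmatrix},
\begin{bmatrix}2&1\\0&2\end{bmatrix},
\begin{bmatrix}5&1&0\\0&5&1\\0&0&5\end{bmatrix}\right)
\quad\text{and}\quad
J=\operatorname{diag}\!\left(
\begin{bmatrix}2&1\\0&2\end{bmatrix},
[\,2\,],[\,2\,],
\begin{bmatrix}5&1&0\\0&5&1\\0&0&5\end{bmatrix}\right),
\]

with two and three blocks (hence two and three independent eigenvectors) belonging to the eigenvalue 2, respectively.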
Let T be a linear operator on a vector space V of finite dimension over K. Suppose υ ∈ V and υ ≠ 0. The set of all vectors of the form f (T)(υ), where f(t) ranges over all polynomials over K, is a T-invariant subspace of V called the T-cyclic subspace of V generated by υ ; we denote it by Z(υ, T) and denote the restriction of T to Z(υ, T) by Tυ. By Problem 10.56, we could equivalently define Z(υ, T) as the intersection of all T-invariant subspaces of V containing υ.
Now consider the sequence
υ, T(υ), T2(υ), T3(υ), ...
of powers of T acting on υ. Let k be the least integer such that Tk(υ) is a linear combination of those vectors that precede it in the sequence, say,
Tk(υ) = −ak−1Tk−1(υ) − ⋯ − a1T(υ) − a0υ
Then
mυ(t) = tk + ak−1tk−1 + ⋯ + a1t + a0
is the unique monic polynomial of lowest degree for which mυ(T)(υ) = 0. We call mυ(t) the T-annihilator of υ and Z(υ, T).
The following theorem (proved in Problem 10.29) holds.
THEOREM 10.12: Let Z(υ, T), Tυ, mυ(t) be defined as above. Then
(i) The set {υ, T(υ), ..., Tk − 1(υ)} is a basis of Z(υ, T); hence, dim Z(υ, T) = k.
(ii) The minimal polynomial of Tυ is mυ(t).
(iii) The matrix representation of Tυ in the above basis is the companion matrix C(mυ) of mυ(t), which has 1's below the diagonal, the negatives of the coefficients a0, a1, ..., ak−1 of mυ(t) in the last column, and 0's elsewhere.
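A sketch of this companion matrix, in the notation above:

\[
C(m_\upsilon)=\begin{bmatrix}
0&0&\cdots&0&-a_0\\
1&0&\cdots&0&-a_1\\
0&1&\cdots&0&-a_2\\
\vdots&\vdots&\ddots&\vdots&\vdots\\
0&0&\cdots&1&-a_{k-1}
\end{bmatrix}.
\]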
In this section, we present the rational canonical form for a linear operator T:V → V. We emphasize that this form exists even when the minimal polynomial cannot be factored into linear polynomials. (Recall that this is not the case for the Jordan canonical form.)
LEMMA 10.13: Let T:V → V be a linear operator whose minimal polynomial is f(t)n, where f(t) is a monic irreducible polynomial. Then V is the direct sum
V = Z(υ1, T) ⊕ Z(υ2, T) ⊕ ⋯ ⊕ Z(υr, T)
of T-cyclic subspaces Z(υi, T) with corresponding T-annihilators
f(t)n1, f(t)n2, ..., f(t)nr,   where n = n1 ≥ n2 ≥ ⋯ ≥ nr
Any other decomposition of V into T-cyclic subspaces has the same number of components and the same set of T-annihilators.
We emphasize that the above lemma (proved in Problem 10.31) does not say that the vectors υi or the T-cyclic subspaces Z(υi, T) are uniquely determined by T, but it does say that the set of T-annihilators is uniquely determined by T. Thus, T has a unique block diagonal matrix representation
M = diag(C1, C2, ..., Cr)
where the Ci are companion matrices. In fact, the Ci are the companion matrices of the polynomials f(t)ni.
Using the Primary Decomposition Theorem and Lemma 10.13, we obtain the following result.
THEOREM 10.14: Let T:V → V be a linear operator with minimal polynomial
m(t) = f1(t)m1 f2(t)m2 ⋯ fr(t)mr
where the fi(t) are distinct monic irreducible polynomials. Then T has a unique block diagonal matrix representation
M = diag(C11, ..., C1s1, ..., Cr1, ..., Crsr)
where the Cij are companion matrices. In particular, the Cij are the companion matrices of the polynomials fi(t)nij, where
mi = ni1 ≥ ni2 ≥ ⋯ ≥ nisi   for each i
The above matrix representation of T is called its rational canonical form. The polynomials fi(t)nij are called the elementary divisors of T.
EXAMPLE 10.6 Let V be a vector space of dimension 8 over the rational field Q, and let T be a linear operator on V whose minimal polynomial is
m(t) = f1(t)f2(t)2,   where f1(t) = t4 − 4t3 + 6t2 − 4t − 7 and f2(t) = t − 3
Thus, because dim V = 8, the characteristic polynomial is Δ(t) = f1(t)f2(t)4. Also, the rational canonical form M of T must have one block equal to the companion matrix of f1(t) and one block equal to the companion matrix of f2(t)2. There are two possibilities:
(a) diag[C(t4 − 4t3 + 6t2 − 4t − 7), C((t − 3)2), C((t − 3)2)]
(b) diag[C(t4 − 4t3 + 6t2 − 4t − 7), C((t − 3)2), C(t − 3), C(t − 3)]
That is, M is the block diagonal matrix whose diagonal blocks are the companion matrices listed in (a) or (b), as sketched below.
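As a sketch of the blocks involved (using the companion-matrix convention of Theorem 10.12), C(t4 − 4t3 + 6t2 − 4t − 7), C((t − 3)2) = C(t2 − 6t + 9), and C(t − 3) are, respectively,

\[
\begin{bmatrix}0&0&0&7\\1&0&0&4\\0&1&0&-6\\0&0&1&4\end{bmatrix},
\qquad
\begin{bmatrix}0&-9\\1&6\end{bmatrix},
\qquad
[\,3\,],
\]

and M is the block diagonal matrix built from these according to (a) or (b).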
Let V be a vector space over a field K and let W be a subspace of V. If υ is any vector in V, we write υ + W for the set of sums υ + w with w ∈ W; that is,
υ + W = {υ + w : w ∈ W}
These sets are called the cosets of W in V. We show (Problem 10.22) that these cosets partition V into mutually disjoint subsets.
EXAMPLE 10.7 Let W be the subspace of R2 defined by
W = {(a, b) : a = b}
that is, W is the line given by the equation x − y = 0. We can view υ + W as a translation of the line obtained by adding the vector υ to each point in W. As shown in Fig. 10-2, the coset υ + W is also a line, and it is parallel to W. Thus, the cosets of W in R2 are precisely all the lines parallel to W.
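For example, taking υ = (1, 0), the coset υ + W = {(1 + a, a) : a ∈ R} is the line x − y = 1 (i.e., y = x − 1), which is parallel to W.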
In the following theorem, we use the cosets of a subspace W of a vector space V to define a new vector space; it is called the quotient space of V by W and is denoted by V/W.
THEOREM 10.15: Let W be a subspace of a vector space V over a field K. Then the cosets of W in V form a vector space over K with the following operations of addition and scalar multiplication:
(u + W) + (υ + W) = (u + υ) + W   and   k(u + W) = ku + W
We note that, in the proof of Theorem 10.15 (Problem 10.24), it is first necessary to show that the operations are well defined; that is, whenever u + W = u′ + W and υ + W = υ′ + W, then
(u + υ) + W = (u′ + υ′) + W   and   ku + W = ku′ + W
In the case of an invariant subspace, we have the following useful result (proved in Problem 10.27).
THEOREM 10.16: Suppose W is a subspace invariant under a linear operator T:V → V. Then T induces a linear operator T̄ on V/W defined by T̄(υ + W) = T(υ) + W. Moreover, if T is a zero of any polynomial, then so is T̄. Thus, the minimal polynomial of T̄ divides the minimal polynomial of T.
SOLVED PROBLEMS
10.1. Suppose T:V → V is linear. Show that each of the following is invariant under T:
(a) {0}, (b) V, (c) kernel of T, (d) image of T.
(a) We have T(0) = 0 ∈ {0}; hence, {0} is invariant under T.
(b) For every υ ∈ V, T(υ) ∈ V; hence, V is invariant under T.
(c) Let u ∈ Ker T. Then T(u) = 0 ∈ Ker T because the kernel of T is a subspace of V. Thus, Ker T is invariant under T.
(d) Because T(υ) ∈ Im T for every υ ∈ V, it is certainly true when υ ∈ Im T. Hence, the image of T is invariant under T.
10.2. Suppose {Wi} is a collection of T-invariant subspaces of a vector space V. Show that the intersection W = ∩i Wi is also T-invariant.
Suppose υ ∈ W; then υ ∈ Wi for every i. Because Wi is T-invariant, T(υ) ∈ Wi for every i. Thus, T(υ) ∈ W and so W is T-invariant.
10.3. Prove Theorem 10.2: Let T:V → V be linear. For any polynomial f(t), the kernel of f(T) is invariant under T.
Suppose υ ∈ Ker f(T)—that is, f(T)(υ) = 0. We need to show that T(υ) also belongs to the kernel of f(T)—that is, f(T)(T(υ)) = (f(T) ∘ T)(υ) = 0. Because f(t)t = tf(t), we have f(T) ∘ T = T ∘ f(T). Thus, as required,
(f(T) ∘ T)(υ) = (T ∘ f(T))(υ) = T(f(T)(υ)) = T(0) = 0
10.4. Find all invariant subspaces of viewed as an operator on R2.
By Problem 10.1, R2 and {0} are invariant under A. Now if A has any other invariant subspace, it must be one-dimensional. However, the characteristic polynomial of A is a quadratic with no real roots.
Hence, A has no eigenvalues (in R) and so A has no eigenvectors. But the one-dimensional invariant subspaces correspond to the eigenvectors; thus, R2 and {0} are the only subspaces invariant under A.
10.5. Prove Theorem 10.3: Suppose W is T-invariant. Then T has a block matrix representation [A B; 0 C], where A is the matrix representation of the restriction T̂ of T to W.
We choose a basis {w1, ..., wr} of W and extend it to a basis {w1, ..., wr, υ1, ..., υs} of V. Because W is T-invariant, each T(wi) lies in W; hence,
T(wi) = ai1w1 + ai2w2 + ⋯ + airwr,   i = 1, ..., r
T(υj) = bj1w1 + ⋯ + bjrwr + cj1υ1 + ⋯ + cjsυs,   j = 1, ..., s
But the matrix of T in this basis is the transpose of the matrix of coefficients in the above system of equations (Section 6.2). Therefore, it has the form [A B; 0 C], where A is the transpose of the matrix of coefficients of the obvious subsystem. By the same argument, A is the matrix of T̂ relative to the basis {wi} of W.
10.6. Let T̂ denote the restriction of an operator T to an invariant subspace W; that is, T̂(w) = T(w) for every w ∈ W. Prove
(a) For any polynomial f(t), f(T̂)(w) = f(T)(w) for every w ∈ W.
(b) The minimal polynomial of T̂ divides the minimal polynomial of T.
(a) If f(t) = 0 or if f(t) is a constant (i.e., of degree 0), then the result clearly holds. Assume deg f = n > 0 and that the result holds for polynomials of degree less than n. Suppose that
f(t) = antn + g(t),   where deg g < n
Then
f(T̂)(w) = (anT̂n−1)(T̂(w)) + g(T̂)(w) = (anTn−1)(T(w)) + g(T)(w) = anTn(w) + g(T)(w) = f(T)(w)
where the middle equality uses the inductive hypothesis applied to the vector T̂(w) = T(w) ∈ W and to w.
(b) Let m(t) denote the minimal polynomial of T. Then by (a), m(T̂)(w) = m(T)(w) = 0(w) = 0 for every w ∈ W; that is, T̂ is a zero of the polynomial m(t). Hence, the minimal polynomial of T̂ divides m(t).
10.7. Prove Theorem 10.4: Suppose W1, W2, ..., Wr are subspaces of V with respective bases
B1 = {w11, w12, ..., w1n1},   ...,   Br = {wr1, wr2, ..., wrnr}
Then V is the direct sum of the Wi if and only if the union B = ∪i Bi is a basis of V.
Suppose B is a basis of V. Then, for any υ ∈ V,
υ = a11w11 + ⋯ + a1n1w1n1 + ⋯ + ar1wr1 + ⋯ + arnrwrnr = w1 + w2 + ⋯ + wr
where wi = ai1wi1 + ⋯ + ainiwini ∈ Wi. We next show that such a sum is unique. Suppose
υ = w1′ + w2′ + ⋯ + wr′,   where wi′ ∈ Wi
Because {wi1, ..., wini} is a basis of Wi, each wi′ = bi1wi1 + ⋯ + biniwini, and so
υ = b11w11 + ⋯ + b1n1w1n1 + ⋯ + br1wr1 + ⋯ + brnrwrnr
Because B is a basis of V, aij = bij for each i and each j. Hence, wi = wi′, and so the sum for υ is unique. Accordingly, V is the direct sum of the Wi.
Conversely, suppose V is the direct sum of the Wi. Then for any υ ∈ V, υ = w1 + ⋯ + wr, where wi ∈ Wi. Because {wij} is a basis of Wi, each wi is a linear combination of the wij, and so υ is a linear combination of the elements of B. Thus, B spans V. We now show that B is linearly independent. Suppose
a11w11 + ⋯ + a1n1w1n1 + ⋯ + ar1wr1 + ⋯ + arnrwrnr = 0
Note that ai1wi1 + ⋯ + ainiwini ∈ Wi. We also have that 0 = 0 + 0 + ⋯ + 0, where 0 ∈ Wi. Because such a sum for 0 is unique,
ai1wi1 + ⋯ + ainiwini = 0,   i = 1, ..., r
The independence of the bases {wij} implies that all the a's are 0. Thus, B is linearly independent and is a basis of V.
10.8. Suppose T:V → V is linear and suppose T = T1 ⊕ T2 with respect to a T-invariant direct-sum decomposition V = U ⊕ W. Show that
(a) m(t) is the least common multiple of m1(t) and m2(t), where m(t), m1(t), m2(t) are the minimal polynomials of T, T1, T2, respectively.
(b) Δ(t) = Δ1(t)Δ2(t), where Δ(t), Δ1(t), Δ2(t) are the characteristic polynomials of T, T1, T2, respectively.
(a) By Problem 10.6, each of m1(t) and m2(t) divides m(t). Now suppose f(t) is a multiple of both m1(t) and m2(t); then f(T1)(U) = 0 and f(T2)(W) = 0. Let υ ∈ V; then υ = u + w with u ∈ U and w ∈ W. Now
f(T)(υ) = f(T)(u) + f(T)(w) = f(T1)(u) + f(T2)(w) = 0 + 0 = 0
That is, T is a zero of f(t). Hence, m(t) divides f(t), and so m(t) is the least common multiple of m1(t) and m2(t).
(b) By Theorem 10.5, T has a matrix representation M = diag(A, B), where A and B are matrix representations of T1 and T2, respectively. Then, as required,
Δ(t) = det(tI − M) = det(tI − A) det(tI − B) = Δ1(t)Δ2(t)
10.9. Prove Theorem 10.7: Suppose T:V → V is linear, and suppose f(t) = g(t)h(t) are polynomials such that f(T) = 0 and g(t) and h(t) are relatively prime. Then V is the direct sum of the T-invariant subspaces U and W, where U = Ker g(T) and W = Ker h(T).
Note first that U and W are T-invariant by Theorem 10.2. Now, because g(t) and h(t) are relatively prime, there exist polynomials r(t) and s(t) such that
r(t)g(t) + s(t)h(t) = 1
Hence, for the operator T,
r(T)g(T) + s(T)h(T) = I     (*)
Let υ ∈ V; then, by (*),
υ = r(T)g(T)(υ) + s(T)h(T)(υ)
But the first term in this sum belongs to W = Ker h(T), because
h(T)r(T)g(T)(υ) = r(T)g(T)h(T)(υ) = r(T)f(T)(υ) = r(T)(0)(υ) = 0
Similarly, the second term belongs to U. Hence, V is the sum of U and W.
To prove that V = U ⊕ W, we must show that a sum υ = u + w with u ∈ U, w ∈ W, is uniquely determined by υ. Applying the operator r(T)g(T) to υ = u + w and using g(T)(u) = 0, we obtain
r(T)g(T)(υ) = r(T)g(T)(u) + r(T)g(T)(w) = r(T)g(T)(w)
Also, applying (*) to w alone and using h(T)(w) = 0, we obtain
w = r(T)g(T)(w) + s(T)h(T)(w) = r(T)g(T)(w)
Both of the above formulas give us w = r(T)g(T)υ, and so w is uniquely determined by υ. Similarly u is uniquely determined by υ. Hence, V = U ⊕ W, as required.
10.10. Prove Theorem 10.8: In Theorem 10.7 (Problem 10.9), if f(t) is the minimal polynomial of T (and g(t) and h(t) are monic), then g(t) is the minimal polynomial of the restriction T1 of T to U and h(t) is the minimal polynomial of the restriction T2 of T to W.
Let m1(t) and m2(t) be the minimal polynomials of T1 and T2, respectively. Note that g(T1) = 0 and h(T2) = 0 because U = Ker g(T) and W = Ker h(T). Thus,
m1(t) divides g(t)   and   m2(t) divides h(t)     (1)
By Problem 10.9, V = U ⊕ W is a T-invariant direct-sum decomposition; hence, by Problem 10.8(a), f(t) is the least common multiple of m1(t) and m2(t). But m1(t) and m2(t) are relatively prime because g(t) and h(t) are relatively prime. Accordingly, f(t) = m1(t)m2(t). We also have that f(t) = g(t)h(t). These two equations together with (1) and the fact that all the polynomials are monic imply that g(t) = m1(t) and h(t) = m2(t), as required.
10.11. Prove the Primary Decomposition Theorem 10.6: Let T:V → V be a linear operator with minimal polynomial
m(t) = f1(t)n1 f2(t)n2 ⋯ fr(t)nr
where the fi(t) are distinct monic irreducible polynomials. Then V is the direct sum of T-invariant subspaces W1, ..., Wr, where Wi is the kernel of fi(T)ni. Moreover, fi(t)ni is the minimal polynomial of the restriction of T to Wi.
The proof is by induction on r. The case r = 1 is trivial. Suppose that the theorem has been proved for r − 1. By Theorem 10.7, we can write V as the direct sum of T-invariant subspaces W1 and V1, where W1 is the kernel of f1(T)n1 and where V1 is the kernel of f2(T)n2 ⋯ fr(T)nr. By Theorem 10.8, the minimal polynomials of the restrictions of T to W1 and V1 are f1(t)n1 and f2(t)n2 ⋯ fr(t)nr, respectively.
Denote the restriction of T to V1 by T̂1. By the inductive hypothesis, V1 is the direct sum of subspaces W2, ..., Wr such that Wi is the kernel of fi(T̂1)ni and such that fi(t)ni is the minimal polynomial for the restriction of T̂1 to Wi. But the kernel of fi(T)ni, for i = 2, ..., r, is necessarily contained in V1, because fi(t)ni divides f2(t)n2 ⋯ fr(t)nr. Thus, the kernel of fi(T)ni is the same as the kernel of fi(T̂1)ni, which is Wi. Also, the restriction of T to Wi is the same as the restriction of T̂1 to Wi (for i = 2, ..., r); hence, fi(t)ni is also the minimal polynomial for the restriction of T to Wi. Thus, V = W1 ⊕ W2 ⊕ ⋯ ⊕ Wr is the desired decomposition of T.
10.12. Prove Theorem 10.9: A linear operator T:V → V has a diagonal matrix representation if and only if its minimal polynomial m(t) is a product of distinct linear polynomials.
Suppose m(t) is a product of distinct linear polynomials, say,
m(t) = (t − λ1)(t − λ2) ⋯ (t − λr)
where the λi are distinct scalars. By the Primary Decomposition Theorem, V is the direct sum of subspaces W1, ..., Wr, where Wi = Ker(T − λiI). Thus, if υ ∈ Wi, then (T − λiI)(υ) = 0 or T(υ) = λiυ. In other words, every vector in Wi is an eigenvector belonging to the eigenvalue λi. By Theorem 10.4, the union of bases for W1, ..., Wr is a basis of V. This basis consists of eigenvectors, and so T is diagonalizable.
Conversely, suppose T is diagonalizable (i.e., V has a basis consisting of eigenvectors of T). Let λ1, ..., λs be the distinct eigenvalues of T. Then the operator
f(T) = (T − λ1I)(T − λ2I) ⋯ (T − λsI)
maps each basis vector into 0. Thus, f(T) = 0, and hence, the minimal polynomial m(t) of T divides the polynomial
f(t) = (t − λ1)(t − λ2) ⋯ (t − λs)
Accordingly, m(t) is a product of distinct linear polynomials.
10.13. Let T:V → V be linear. Suppose, for υ ∈ V, Tk(υ) = 0 but Tk − 1(υ) ≠ 0. Prove
(a) The set S = {υ, T(υ), ..., Tk − 1(υ)} is linearly independent.
(b) The subspace W generated by S is T-invariant.
(c) The restriction of T to W is nilpotent of index k.
(d) Relative to the basis {Tk − 1(υ), ..., T(υ), υ} of W, the matrix of T is the k-square Jordan nilpotent block Nk of index k (see Example 10.4).
(a) Suppose
aυ + a1T(υ) + a2T2(υ) + ⋯ + ak−1Tk−1(υ) = 0     (*)
Applying Tk−1 to (*) and using Tk(υ) = 0, we obtain aTk−1(υ) = 0; because Tk−1(υ) ≠ 0, a = 0. Now applying Tk−2 to (*) and using Tk(υ) = 0 and a = 0, we find a1Tk−1(υ) = 0; hence, a1 = 0. Next applying Tk−3 to (*) and using Tk(υ) = 0 and a = a1 = 0, we obtain a2Tk−1(υ) = 0; hence, a2 = 0. Continuing this process, we find that all the a's are 0; hence, S is independent.
(b) Let w ∈ W. Then
w = bυ + b1T(υ) + b2T2(υ) + ⋯ + bk−1Tk−1(υ)
Using Tk(υ) = 0, we have
T(w) = bT(υ) + b1T2(υ) + ⋯ + bk−2Tk−1(υ) ∈ W
Thus, W is T-invariant.
(c) By hypothesis, Tk(υ) = 0. Hence, writing T̂ for the restriction of T to W, we have, for i = 0, ..., k − 1,
T̂k(Ti(υ)) = Tk+i(υ) = 0
That is, applying T̂k to each generator of W, we obtain 0; hence, T̂k = 0, and so T̂ is nilpotent of index at most k. On the other hand, T̂k−1(υ) = Tk−1(υ) ≠ 0; hence, T̂ is nilpotent of index exactly k.
(d) For the basis {Tk − 1(υ), Tk − 2(υ), ..., T(υ), υ} of W, we have
T(Tk−1(υ)) = Tk(υ) = 0   and   T(Ti(υ)) = Ti+1(υ)   for i = 0, 1, ..., k − 2
that is, T maps each basis vector into the basis vector immediately preceding it (and maps the first basis vector into 0).
Hence, as required, the matrix of T in this basis is the k-square Jordan nilpotent block Nk.
10.14. Let T:V → V be linear. Let U = Ker Ti and W = Ker Ti + 1. Show that
(a) U ⊆ W, (b) T(W) ⊆ U.
(a) Suppose u ∈ U = Ker Ti. Then Ti(u) = 0 and so Ti+1(u) = T(Ti(u)) = T(0) = 0. Thus, u ∈ Ker Ti+1 = W. But this is true for every u ∈ U; hence, U ⊆ W.
(b) Similarly, if w ∈ W = Ker Ti+1, then Ti+1(w) = 0. Thus, Ti(T(w)) = Ti+1(w) = 0, so T(w) ∈ Ker Ti = U, and hence T(W) ⊆ U.
10.15. Let T:V → V be linear. Let X = Ker Ti−2, Y = Ker Ti−1, Z = Ker Ti. Therefore (Problem 10.14), X ⊆ Y ⊆ Z. Suppose
{u1, ..., ur},   {u1, ..., ur, υ1, ..., υs},   {u1, ..., ur, υ1, ..., υs, w1, ..., wt}
are bases of X, Y, Z, respectively. Show that
S = {u1, ..., ur, T(w1), ..., T(wt)}
is contained in Y and is linearly independent.
By Problem 10.14, T(Z) ⊆ Y, and hence S ⊆ Y. Now suppose S is linearly dependent. Then there exists a relation
a1u1 + ⋯ + arur + b1T(w1) + ⋯ + btT(wt) = 0
where at least one coefficient is not zero. Furthermore, because {ui} is independent, at least one of the bk must be nonzero. Transposing, we find
b1T(w1) + ⋯ + btT(wt) = −a1u1 − ⋯ − arur ∈ X = Ker Ti−2
Hence,
Ti−2(b1T(w1) + ⋯ + btT(wt)) = 0
Thus,
Ti−1(b1w1 + ⋯ + btwt) = 0,   and so   b1w1 + ⋯ + btwt ∈ Y = Ker Ti−1
Because {ui, υj} generates Y, we obtain a relation among the ui, υj, wk where one of the coefficients (i.e., one of the bk) is not zero. This contradicts the fact that {ui, υj, wk} is independent. Hence, S must also be independent.
10.16. Prove Theorem 10.10: Let T:V → V be a nilpotent operator of index k. Then T has a unique block diagonal matrix representation consisting of Jordan nilpotent blocks N. There is at least one N of order k, and all other N are of orders ≤k. The total number of N of all orders is equal to the nullity of T.
Suppose dim V = n. Let W1 = Ker T, W2 = Ker T2, ..., Wk = Ker Tk. Let us set mi = dim Wi, for i = 1, ..., k. Because T is of index k, Wk = V and Wk−1 ≠ V, and so mk−1 < mk = n. By Problem 10.14,
W1 ⊆ W2 ⊆ ⋯ ⊆ Wk = V
Thus, by induction, we can choose a basis {u1, ..., un} of V such that {u1, ..., umi} is a basis of Wi.
We now choose a new basis for V with respect to which T has the desired form. It will be convenient to label the members of this new basis by pairs of indices. We begin by setting
υ(1, k) = umk−1+1,   υ(2, k) = umk−1+2,   ...,   υ(mk − mk−1, k) = umk
and setting
υ(1, k − 1) = Tυ(1, k),   υ(2, k − 1) = Tυ(2, k),   ...,   υ(mk − mk−1, k − 1) = Tυ(mk − mk−1, k)
By the preceding problem,
S1 = {u1, ..., umk−2, υ(1, k − 1), ..., υ(mk − mk−1, k − 1)}
is a linearly independent subset of Wk−1. We extend S1 to a basis of Wk−1 by adjoining new elements (if necessary), which we denote by
υ(mk − mk−1 + 1, k − 1),   υ(mk − mk−1 + 2, k − 1),   ...,   υ(mk−1 − mk−2, k − 1)
Next we set
υ(1, k − 2) = Tυ(1, k − 1),   υ(2, k − 2) = Tυ(2, k − 1),   ...,   υ(mk−1 − mk−2, k − 2) = Tυ(mk−1 − mk−2, k − 1)
Again by the preceding problem,
{u1, ..., umk−3, υ(1, k − 2), ..., υ(mk−1 − mk−2, k − 2)}
is a linearly independent subset of Wk−2, which we can extend to a basis of Wk−2 by adjoining elements
υ(mk−1 − mk−2 + 1, k − 2),   υ(mk−1 − mk−2 + 2, k − 2),   ...,   υ(mk−2 − mk−3, k − 2)
Continuing in this manner, we get a new basis for V, which for convenient reference we arrange as follows:
The bottom row forms a basis of W1, the bottom two rows form a basis of W2, and so forth. But what is important for us is that T maps each vector into the vector immediately below it in the table or into 0 if the vector is in the bottom row. That is,
T(υ(i, j)) = υ(i, j − 1)   for j > 1,   and   T(υ(i, 1)) = 0
Now it is clear [see Problem 10.13(d)] that T will have the desired form if the υ(i, j) are ordered lexicographically: beginning with υ(1, 1) and moving up the first column to υ(1, k), then jumping to υ(2, 1) and moving up the second column as far as possible.
Moreover, there will be exactly mk − mk−1 diagonal entries of order k. Also, there will be
2mk−1 − mk − mk−2 diagonal entries of order k − 1,   ...,   2m2 − m3 − m1 diagonal entries of order 2,   and   2m1 − m2 diagonal entries of order 1
as can be read off directly from the table. In particular, because the numbers m1, ..., mk are uniquely determined by T, the number of diagonal entries of each order is uniquely determined by T. Finally, the identity
m1 = (mk − mk−1) + (2mk−1 − mk − mk−2) + ⋯ + (2m2 − m3 − m1) + (2m1 − m2)
shows that the nullity m1 of T is the total number of diagonal entries of T.
10.17. Let A and B be 5-square matrices that are both nilpotent of index 3; that is, A3 = 0 but A2 ≠ 0, and B3 = 0 but B2 ≠ 0. Suppose also that rank(A) = 2 and rank(B) = 3. Find the nilpotent matrices MA and MB in canonical form that are similar to A and B, respectively.
Because A and B are nilpotent of index 3, MA and MB must each contain a Jordan nilpotent block of order 3, and none greater than 3. From the ranks, nullity(A) = 5 − 2 = 3 and nullity(B) = 5 − 3 = 2. Thus, MA must contain three diagonal blocks, which must be one of order 3 and two of order 1; and MB must contain two diagonal blocks, which must be one of order 3 and one of order 2.
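Explicitly, a sketch of these canonical forms (writing N3 and N2 for the Jordan nilpotent blocks of orders 3 and 2, and [0] for the 1 × 1 zero block):

\[
M_A=\operatorname{diag}(N_3,[0],[0])=
\begin{bmatrix}0&1&0&0&0\\0&0&1&0&0\\0&0&0&0&0\\0&0&0&0&0\\0&0&0&0&0\end{bmatrix},
\qquad
M_B=\operatorname{diag}(N_3,N_2)=
\begin{bmatrix}0&1&0&0&0\\0&0&1&0&0\\0&0&0&0&0\\0&0&0&0&1\\0&0&0&0&0\end{bmatrix}.
\]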
10.18. Prove Theorem 10.11 on the Jordan canonical form for an operator T.
By the primary decomposition theorem, T is decomposable into operators T1, ..., Tr; that is, T = T1 ⊕ ⋯ ⊕ Tr, where (t − λi)mi is the minimal polynomial of Ti. Thus, in particular,
(T1 − λ1I)m1 = 0,   (T2 − λ2I)m2 = 0,   ...,   (Tr − λrI)mr = 0
Set Ni = Ti − λiI. Then, for i = 1, ..., r,
Ti = Ni + λiI,   where Nimi = 0
That is, Ti is the sum of the scalar operator λiI and a nilpotent operator Ni, which is of index mi because (t − λi)mi is the minimal polynomial of Ti.
Now, by Theorem 10.10 on nilpotent operators, we can choose a basis so that Ni is in canonical form. In this basis, Ti = Ni + λiI is represented by a block diagonal matrix Mi whose diagonal entries are the matrices Jij. The direct sum J of the matrices Mi is in Jordan canonical form and, by Theorem 10.5, is a matrix representation of T.
Last, we must show that the blocks Jij satisfy the required properties. Property (i) follows from the fact that Ni is of index mi. Property (ii) is true because T and J have the same characteristic polynomial. Property (iii) is true because the nullity of Ni = Ti − λiI is equal to the geometric multiplicity of the eigenvalue λi. Property (iv) follows from the fact that the Ti and hence the Ni are uniquely determined by T.
10.19. Determine all possible Jordan canonical forms J for a linear operator T:V → V whose characteristic polynomial Δ(t) = (t − 2)5 and whose minimal polynomial m(t) = (t − 2)2.
J must be a 5 × 5 matrix, because Δ(t) has degree 5, and all diagonal elements must be 2, because 2 is the only eigenvalue. Moreover, because the exponent of t − 2 in m(t) is 2, J must have one Jordan block of order 2, and the others must be of order 2 or 1. Thus, there are only two possibilities:
J = diag(J2, J2, J1)   or   J = diag(J2, J1, J1, J1)
where Jk denotes the Jordan block of order k belonging to the eigenvalue 2.
10.20. Determine all possible Jordan canonical forms for a linear operator T:V → V whose characteristic polynomial Δ(t) = (t − 2)3(t − 5)2. In each case, find the minimal polynomial m(t).
Because t − 2 has exponent 3 in Δ(t), 2 must appear three times on the diagonal. Similarly, 5 must appear twice. Thus, there are six possibilities; they are sketched in block form following the list of minimal polynomials below.
The exponent in the minimal polynomial m(t) is equal to the size of the largest block. Thus,
(a) m(t) = (t − 2)3(t − 5)2, (b) m(t) = (t − 2)3(t − 5), (c) m(t) = (t − 2)2(t − 5)2,
(d) m(t) = (t − 2)2(t − 5), (e) m(t) = (t − 2)(t − 5)2, (f) m(t) = (t − 2)(t − 5)
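One way of writing the six forms in block notation, consistent with the minimal polynomials (a)-(f) above (Jk(λ) denotes the Jordan block of order k belonging to λ):
(a) diag[J3(2), J2(5)], (b) diag[J3(2), J1(5), J1(5)], (c) diag[J2(2), J1(2), J2(5)],
(d) diag[J2(2), J1(2), J1(5), J1(5)], (e) diag[J1(2), J1(2), J1(2), J2(5)], (f) diag[J1(2), J1(2), J1(2), J1(5), J1(5)]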
10.21. Let W be a subspace of a vector space V. Show that the following are equivalent:
(i) u ∈ υ + W, (ii) u − υ ∈ W, (iii) υ ∈ u + W.
Suppose u ∈ υ + W. Then there exists w0 ∈ W such that u = υ + w0. Hence, u − υ = w0 ∈ W. Conversely, suppose u − υ ∈ W. Then u − υ = w0 where w0 ∈ W. Hence, u = υ + w0 ∈ υ + W. Thus, (i) and (ii) are equivalent.
We also have u − υ ∈ W iff −(u − υ) = υ − u ∈ W iff υ ∈ u + W. Thus, (ii) and (iii) are also equivalent.
10.22. Prove the following: The cosets of W in V partition V into mutually disjoint sets. That is,
(a) Any two cosets u + W and υ + W are either identical or disjoint.
(b) Each υ ∈ V belongs to a coset; in fact, υ ∈ υ + W.
Furthermore, u + W = υ + W if and only if u − υ ∈ W, and so (υ + w) + W = υ + W for any w ∈ W.
Let υ ∈ V. Because 0 ∈ W, we have υ = υ + 0 ∈ υ + W, which proves (b).
Now suppose the cosets u + W and υ + W are not disjoint; say, the vector x belongs to both u + W and υ + W. Then u − x ∈ W and x − υ ∈ W. The proof of (a) is complete if we show that u + W = υ + W. Let u + w0 be any element in the coset u + W. Because u − x, x − υ, and w0 belong to W,
u + w0 = υ + (x − υ) + (u − x) + w0 ∈ υ + W
Thus, u + w0 ∈ υ + W, and hence the coset u + W is contained in the coset υ + W. Similarly, υ + W is contained in u + W, and so u + W = υ + W.
The last statement follows from the fact that u + W = υ + W if and only if u ∈ υ + W, and, by Problem 10.21, this is equivalent to u − υ ∈ W.
10.23. Let W be the solution space of the homogeneous equation 2x + 3y + 4z = 0. Describe the cosets of W in R3.
W is a plane through the origin O = (0, 0, 0), and the cosets of W are the planes parallel to W. Equivalently, the cosets of W are the solution sets of the family of equations
2x + 3y + 4z = k,   k ∈ R
In fact, the coset υ + W, where υ = (a, b, c), is the solution set of the linear equation
2x + 3y + 4z = 2a + 3b + 4c
10.24. Suppose W is a subspace of a vector space V. Show that the operations in Theorem 10.15 are well defined; namely, show that if u + W = u′ + W and υ + W = υ′ + W, then
(a) (u + υ) + W = (u′ + υ′) + W   and   (b) ku + W = ku′ + W for any k ∈ K
(a) Because u + W = u′ + W and υ + W = υ′ + W, both u − u′ and υ − υ′ belong to W. But then (u + υ) − (u′ + υ′) = (u − u′) + (υ − υ′) ∈ W. Hence, (u + υ) + W = (u′ + υ′) + W.
(b) Also, because u − u′ ∈ W implies k(u − u′) ∈ W, then ku − ku′ = k(u − u′) ∈ W; accordingly, ku + W = ku′ + W.
10.25. Let V be a vector space and W a subspace of V. Show that the natural map η: V → V/W, defined by η(υ) = υ + W, is linear.
For any u, υ ∈ V and any k ∈ K, we have
η(u + υ) = (u + υ) + W = (u + W) + (υ + W) = η(u) + η(υ)
and
η(kυ) = kυ + W = k(υ + W) = kη(υ)
Accordingly, η is linear.
10.26. Let W be a subspace of a vector space V. Suppose {w1, ..., wr} is a basis of W and the set of cosets {ῡ1, ..., ῡs}, where ῡj = υj + W, is a basis of the quotient space. Show that the set of vectors B = {υ1, ..., υs, w1, ..., wr} is a basis of V. Thus, dim V = dim W + dim(V/W).
Suppose u ∈ V. Because {ῡj} is a basis of V/W,
u + W = a1ῡ1 + ⋯ + asῡs
Hence, u = a1υ1 + ⋯ + asυs + w, where w ∈ W. Since {wi} is a basis of W,
w = b1w1 + ⋯ + brwr,   and so   u = a1υ1 + ⋯ + asυs + b1w1 + ⋯ + brwr
Accordingly, B spans V.
We now show that B is linearly independent. Suppose
c1υ1 + ⋯ + csυs + d1w1 + ⋯ + drwr = 0     (1)
Then, taking cosets modulo W,
c1ῡ1 + ⋯ + csῡs = 0̄ = W
Because {ῡj} is independent, the c's are all 0. Substituting into (1), we find d1w1 + ⋯ + drwr = 0. Because {wi} is independent, the d's are all 0. Thus, B is linearly independent and therefore a basis of V.
10.27. Prove Theorem 10.16: Suppose W is a subspace invariant under a linear operator T:V → V. Then T induces a linear operator T̄ on V/W defined by T̄(υ + W) = T(υ) + W. Moreover, if T is a zero of any polynomial, then so is T̄. Thus, the minimal polynomial of T̄ divides the minimal polynomial of T.
We first show that T̄ is well defined; that is, if u + W = υ + W, then T̄(u + W) = T̄(υ + W). If u + W = υ + W, then u − υ ∈ W, and, as W is T-invariant, T(u − υ) = T(u) − T(υ) ∈ W. Accordingly,
T̄(u + W) = T(u) + W = T(υ) + W = T̄(υ + W)
as required.
We next show that T̄ is linear. We have
T̄((u + W) + (υ + W)) = T̄((u + υ) + W) = T(u + υ) + W = (T(u) + W) + (T(υ) + W) = T̄(u + W) + T̄(υ + W)
Furthermore,
T̄(k(u + W)) = T̄(ku + W) = T(ku) + W = kT(u) + W = kT̄(u + W)
Thus, T̄ is linear.
Now, for any coset u + W in V/W,
T̄2(u + W) = T̄(T(u) + W) = T2(u) + W
Hence, T̄2 is the operator on V/W induced by T2. Similarly, T̄n is the operator induced by Tn for any n. Thus, for any polynomial f(t) = antn + ⋯ + a1t + a0,
f(T̄)(u + W) = anT̄n(u + W) + ⋯ + a1T̄(u + W) + a0(u + W) = (anTn(u) + ⋯ + a1T(u) + a0u) + W = f(T)(u) + W
and so f(T̄) is the operator on V/W induced by f(T). Accordingly, if T is a root of f(t), then f(T) = 0 and hence f(T̄)(u + W) = 0 + W = W, the zero of V/W; that is, T̄ is also a root of f(t). The theorem is proved.
10.28. Prove Theorem 10.1: Let T:V → V be a linear operator whose characteristic polynomial factors into linear polynomials. Then V has a basis in which T is represented by a triangular matrix.
The proof is by induction on the dimension of V. If dim V = 1, then every matrix representation of T is a 1 × 1 matrix, which is triangular.
Now suppose dim V = n > 1 and that the theorem holds for spaces of dimension less than n. Because the characteristic polynomial of T factors into linear polynomials, T has at least one eigenvalue and so at least one nonzero eigenvector υ, say T(υ) = a11υ. Let W be the one-dimensional subspace spanned by υ. Set V̄ = V/W. Then (Problem 10.26) dim V̄ = dim V − dim W = n − 1. Note also that W is invariant under T. By Theorem 10.16, T induces a linear operator T̄ on V̄ whose minimal polynomial divides the minimal polynomial of T. Because the characteristic polynomial of T is a product of linear polynomials, so is its minimal polynomial, and hence, so are the minimal and characteristic polynomials of T̄. Thus, V̄ and T̄ satisfy the hypothesis of the theorem. Hence, by induction, there exists a basis {ῡ2, ..., ῡn} of V̄ such that
T̄(ῡ2) = a22ῡ2
T̄(ῡ3) = a32ῡ2 + a33ῡ3
.................................
T̄(ῡn) = an2ῡ2 + an3ῡ3 + ⋯ + annῡn
Now let υ2, ..., υn be elements of V that belong to the cosets ῡ2, ..., ῡn, respectively. Then {υ, υ2, ..., υn} is a basis of V (Problem 10.26). Because T̄(ῡ2) = a22ῡ2, we have
T̄(ῡ2) − a22ῡ2 = 0̄,   and so   T(υ2) − a22υ2 ∈ W
But W is spanned by υ; hence, T(υ2) − a22υ2 is a multiple of υ, say,
T(υ2) − a22υ2 = a21υ,   and so   T(υ2) = a21υ + a22υ2
Similarly, for i = 3, ..., n,
T(υi) − ai2υ2 − ai3υ3 − ⋯ − aiiυi ∈ W,   and so   T(υi) = ai1υ + ai2υ2 + ⋯ + aiiυi
Thus,
T(υ) = a11υ
T(υ2) = a21υ + a22υ2
.................................
T(υn) = an1υ + an2υ2 + ⋯ + annυn
and hence the matrix of T in this basis is triangular.
10.29. Prove Theorem 10.12: Let Z(υ, T) be a T-cyclic subspace, Tυ the restriction of T to Z(υ, T), and mυ(t) = tk + ak − 1tk − 1 + ⋯ + a0 the T-annihilator of υ. Then,
(i) The set {υ, T(υ), ..., Tk − 1(υ)} is a basis of Z(υ, T); hence, dim Z(υ, T) = k.
(ii) The minimal polynomial of Tυ is mυ(t).
(iii) The matrix of Tυ in the above basis is the companion matrix C = C(mυ) of mυ(t) [which has 1's below the diagonal, the negatives of the coefficients a0, a1, ..., ak − 1 of mυ(t) in the last column, and 0's elsewhere].
(i) By definition of mυ(t), Tk(υ) is the first vector in the sequence υ, T(υ), T2(υ), ... that, is a linear combination of those vectors that precede it in the sequence; hence, the set B = {υ, T(υ), ..., Tk − 1(υ)} is linearly independent. We now only have to show that Z(υ, T) = L(B), the linear span of B. By the above, Tk(υ) ∈ L(B). We prove by induction that Tn(υ) ∈ L(B) for every n. Suppose n > k and Tn − 1(υ) ∈ L(B)—that is, Tn − 1(υ) is a linear combination of υ, ..., Tk − 1(υ). Then Tn(υ) = T(Tn − 1(υ)) is a linear combination of T(υ), ..., Tk(υ). But Tk(υ) ∈ L(B); hence, Tn(υ) ∈ L(B) for every n. Consequently, f(T)(υ) ∈ L(B) for any polynomial f(t). Thus, Z(υ, T) = L(B), and so B is a basis, as claimed.
(ii) Suppose m(t) = ts + bs−1ts−1 + ⋯ + b0 is the minimal polynomial of Tυ. Then, because υ ∈ Z(υ, T),
0 = m(Tυ)(υ) = Ts(υ) + bs−1Ts−1(υ) + ⋯ + b0υ
Thus, Ts(υ) is a linear combination of υ, T(υ), ..., Ts−1(υ), and therefore k ≤ s. However, mυ(T)(υ) = 0, and hence mυ(Tυ) = 0 on Z(υ, T), because every element of Z(υ, T) has the form f(T)(υ) and mυ(Tυ)(f(T)(υ)) = f(T)(mυ(T)(υ)) = 0. Then m(t) divides mυ(t), and so s ≤ k. Accordingly, k = s and hence mυ(t) = m(t).
(iii) We have
Tυ(υ) = T(υ)
Tυ(T(υ)) = T2(υ)
.................................
Tυ(Tk−2(υ)) = Tk−1(υ)
Tυ(Tk−1(υ)) = Tk(υ) = −a0υ − a1T(υ) − ⋯ − ak−1Tk−1(υ)
By definition, the matrix of Tυ in this basis is the transpose of the matrix of coefficients of the above system of equations; hence, it is C, as required.
10.30. Let T:V → V be linear. Let W be a T-invariant subspace of V and T̄ the induced operator on V/W. Prove
(a) The T-annihilator of υ ∈ V divides the minimal polynomial of T.
(b) The T̄-annihilator of ῡ = υ + W ∈ V/W divides the minimal polynomial of T.
(a) The T-annihilator of υ ∈ V is the minimal polynomial of the restriction of T to Z(υ, T); therefore, by Problem 10.6, it divides the minimal polynomial of T.
(b) The T̄-annihilator of ῡ divides the minimal polynomial of T̄, which divides the minimal polynomial of T by Theorem 10.16.
Remark: In the case where the minimal polynomial of T is f(t)n, where f(t) is a monic irreducible polynomial, the T-annihilator of υ ∈ V and the T̄-annihilator of ῡ ∈ V/W are both of the form f(t)m, where m ≤ n.
10.31. Prove Lemma 10.13: Let T:V → V be a linear operator whose minimal polynomial is f(t)n, where f(t) is a monic irreducible polynomial. Then V is the direct sum of T-cyclic subspaces Zi = Z(υi, T), i = 1, ..., r, with corresponding T-annihilators
f(t)n1, f(t)n2, ..., f(t)nr,   where n = n1 ≥ n2 ≥ ⋯ ≥ nr
Any other decomposition of V into the direct sum of T-cyclic subspaces has the same number of components and the same set of T-annihilators.
The proof is by induction on the dimension of V. If dim V = 1, then V is T-cyclic and the lemma holds. Now suppose dim V > 1 and that the lemma holds for those vector spaces of dimension less than that of V.
Because the minimal polynomial of T is f(t)n, there exists υ1 ∈ V such that f(T)n−1(υ1) ≠ 0; hence, the T-annihilator of υ1 is f(t)n. Let Z1 = Z(υ1, T) and recall that Z1 is T-invariant. Let V̄ = V/Z1 and let T̄ be the linear operator on V̄ induced by T. By Theorem 10.16, the minimal polynomial of T̄ divides f(t)n; hence, the hypothesis holds for V̄ and T̄. Consequently, by induction, V̄ is the direct sum of T̄-cyclic subspaces; say,
V̄ = Z(ῡ2, T̄) ⊕ ⋯ ⊕ Z(ῡr, T̄)
where the corresponding T̄-annihilators are f(t)n2, ..., f(t)nr, with n ≥ n2 ≥ ⋯ ≥ nr.
We claim that there is a vector υ2 in the coset ῡ2 whose T-annihilator is f(t)n2, the T̄-annihilator of ῡ2. Let w be any vector in ῡ2. Then f(T)n2(w) ∈ Z1. Hence, there exists a polynomial g(t) for which
f(T)n2(w) = g(T)(υ1)     (1)
Because f(t)n is the minimal polynomial of T, we have, by (1),
0 = f(T)n(w) = f(T)n−n2 g(T)(υ1)
But f(t)n is the T-annihilator of υ1; hence, f(t)n divides f(t)n−n2 g(t), and so g(t) = f(t)n2 h(t) for some polynomial h(t). We set
υ2 = w − h(T)(υ1)
Because w − υ2 = h(T)(υ1) ∈ Z1, υ2 also belongs to the coset ῡ2. Thus, the T-annihilator of υ2 is a multiple of the T̄-annihilator of ῡ2. On the other hand, by (1),
f(T)n2(υ2) = f(T)n2(w − h(T)(υ1)) = f(T)n2(w) − g(T)(υ1) = 0
Consequently, the T-annihilator of υ2 is f(t)n2, as claimed.
Similarly, there exist vectors υ3, ..., υr ∈ V such that υi belongs to the coset ῡi and the T-annihilator of υi is f(t)ni, the T̄-annihilator of ῡi. We set
Zi = Z(υi, T),   i = 2, ..., r
Let d denote the degree of f(t), so that f(t)ni has degree dni. Then, because f(t)ni is both the T-annihilator of υi and the T̄-annihilator of ῡi, we know that
{υi, T(υi), ..., Tdni−1(υi)}   and   {ῡi, T̄(ῡi), ..., T̄dni−1(ῡi)}
are bases for Z(υi, T) and Z(ῡi, T̄), respectively, for i = 2, ..., r. But V̄ = Z(ῡ2, T̄) ⊕ ⋯ ⊕ Z(ῡr, T̄); hence,
{ῡ2, ..., T̄dn2−1(ῡ2), ..., ῡr, ..., T̄dnr−1(ῡr)}
is a basis for V̄. Therefore, by Problem 10.26 and the relation T̄i(ῡ) = Ti(υ) + Z1 (see Problem 10.27),
{υ1, ..., Tdn1−1(υ1), υ2, ..., Tdn2−1(υ2), ..., υr, ..., Tdnr−1(υr)}
is a basis for V. Thus, by Theorem 10.4, V = Z(υ1, T) ⊕ ⋯ ⊕ Z(υr, T), as required.
It remains to show that the exponents n1, ..., nr are uniquely determined by T. Because d = degree of f(t),
dim V = d(n1 + n2 + ⋯ + nr)   and   dim Zi = dni,   i = 1, ..., r
Also, if s is any positive integer, then (Problem 10.59) f(T)s(Zi) is a cyclic subspace generated by f(T)s(υi), and it has dimension d(ni − s) if ni > s and dimension 0 if ni ≤ s.
Now any vector υ ∈ V can be written uniquely in the form υ = w1 + ⋯ + wr, where wi ∈ Zi. Hence, any vector in f(T)s(V) can be written uniquely in the form
f(T)s(υ) = f(T)s(w1) + ⋯ + f(T)s(wr)
where f(T)s(wi) ∈ f(T)s(Zi). Let t be the integer, dependent on s, for which
n1 > s,   ...,   nt > s,   nt+1 ≤ s
Then
f(T)s(V) = f(T)s(Z1) ⊕ ⋯ ⊕ f(T)s(Zt)
and so
dim[f(T)s(V)] = d[(n1 − s) + (n2 − s) + ⋯ + (nt − s)]     (2)
The numbers on the left of (2) are uniquely determined by T. Set s = n − 1, and (2) determines the number of ni equal to n. Next set s = n − 2, and (2) determines the number of ni (if any) equal to n − 1. We repeat the process until we set s = 0 and determine the number of ni equal to 1. Thus, the ni are uniquely determined by T and V, and the lemma is proved.
10.32. Let V be a seven-dimensional vector space over R, and let T:V → V be a linear operator with minimal polynomial m(t) = (t2 − 2t + 5)(t − 3)3. Find all possible rational canonical forms M of T.
Because dim V = 7, there are only two possible characteristic polynomials, Δ1(t) = (t2 − 2t + 5)2(t − 3)3 or Δ2(t) = (t2 − 2t + 5)(t − 3)5. Moreover, the orders of the companion matrices must add up to 7. Also, one companion matrix must be C(t2 − 2t + 5) and one must be C((t − 3)3) = C(t3 − 9t2 + 27t − 27). Thus, M must be one of the block diagonal matrices sketched below.
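A sketch of the three possibilities consistent with these constraints:
M1 = diag[C(t2 − 2t + 5), C(t2 − 2t + 5), C((t − 3)3)]
M2 = diag[C(t2 − 2t + 5), C((t − 3)3), C((t − 3)2)]
M3 = diag[C(t2 − 2t + 5), C((t − 3)3), C(t − 3), C(t − 3)]
Here M1 has characteristic polynomial Δ1(t), whereas M2 and M3 have characteristic polynomial Δ2(t).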
10.33. Suppose V = W1 ⊕ ⋯ ⊕ Wr. The projection of V into its subspace Wk is the mapping E: V → V defined by E(υ) = wk, where υ = w1 + ⋯ + wr, wi ∈ Wi. Show that (a) E is linear, (b) E2 = E.
(a) Because the sum υ = w1 + ⋯ + wr, wi ∈ Wi, is uniquely determined by υ, the mapping E is well defined. Suppose, for u ∈ V, u = w1′ + ⋯ + wr′, wi′ ∈ Wi. Then
υ + u = (w1 + w1′) + ⋯ + (wr + wr′)   and   kυ = kw1 + ⋯ + kwr,   where wi + wi′, kwi ∈ Wi
are the unique sums corresponding to υ + u and kυ. Hence,
E(υ + u) = wk + wk′ = E(υ) + E(u)   and   E(kυ) = kwk = kE(υ)
and therefore E is linear.
(b) We have that
wk = 0 + ⋯ + 0 + wk + 0 + ⋯ + 0
is the unique sum corresponding to wk ∈ Wk; hence, E(wk) = wk. Then, for any υ ∈ V,
E2(υ) = E(E(υ)) = E(wk) = wk = E(υ)
Thus, E2 = E, as required.
10.34. Suppose E:V → V is linear and E2 = E. Show that (a) E(u) = u for any u ∈ Im E (i.e., the restriction of E to its image is the identity mapping); (b) V is the direct sum of the image and kernel of E:V = Im E ⊕ Ker E; (c) E is the projection of V into Im E, its image. Thus, by the preceding problem, a linear mapping T:V → V is a projection if and only if T2 = T; this characterization of a projection is frequently used as its definition.
(a) If u ∈ Im E, then there exists υ ∈ V for which E(υ) = u; hence, as required,
E(u) = E(E(υ)) = E2(υ) = E(υ) = u
(b) Let υ ∈ V. We can write υ in the form υ = E(υ) + (υ − E(υ)). Now E(υ) ∈ Im E and, because
E(υ − E(υ)) = E(υ) − E2(υ) = E(υ) − E(υ) = 0
υ − E(υ) ∈ Ker E. Accordingly, V = Im E + Ker E.
Now suppose w ∈ Im E ∩ Ker E. By (a), E(w) = w because w ∈ Im E. On the other hand, E(w) = 0 because w ∈ Ker E. Thus, w = 0, and so Im E ∩ Ker E = {0}. These two conditions imply that V is the direct sum of the image and kernel of E.
(c) Let υ ∈ V and suppose υ = u + w, where u ∈ Im E and w ∈ Ker E. Note that E(u) = u by (a), and E(w) = 0 because w ∈ Ker E. Hence,
E(υ) = E(u + w) = E(u) + E(w) = u + 0 = u
That is, E is the projection of V into its image.
10.35. Suppose V = U ⊕ W and suppose T:V → V is linear. Show that U and W are both T-invariant if and only if TE = ET, where E is the projection of V into U.
Observe that E(υ) ∈ U for every υ ∈ V, and that (i) E(υ) = υ iff υ ∈ U, (ii) E(υ) = 0 iff υ ∈ W. Suppose ET = TE. Let u ∈ U. Because E(u) = u,
T(u) = T(E(u)) = (TE)(u) = (ET)(u) = E(T(u)) ∈ U
Hence, U is T-invariant. Now let w ∈ W. Because E(w) = 0,
E(T(w)) = (ET)(w) = (TE)(w) = T(E(w)) = T(0) = 0,   and so   T(w) ∈ W
Hence, W is also T-invariant.
Conversely, suppose U and W are both T-invariant. Let υ ∈ V and suppose υ = u + w, where u ∈ U and w ∈ W. Then T(u) ∈ U and T(w) ∈ W; hence, E(T(u)) = T(u) and E(T(w)) = 0. Thus,
(ET)(υ) = E(T(u) + T(w)) = E(T(u)) + E(T(w)) = T(u)
and
(TE)(υ) = T(E(u + w)) = T(E(u) + E(w)) = T(u)
That is, (ET)(υ) = (TE)(υ) for every υ ∈ V; therefore, ET = TE, as required.
SUPPLEMENTARY PROBLEMS
10.36. Suppose W is invariant under T:V → V. Show that W is invariant under f(T) for any polynomial f(t).
10.37. Show that every subspace of V is invariant under I and 0, the identity and zero operators.
10.38. Let W be invariant under T1: V → V and T2: V → V. Prove W is also invariant under T1 + T2 and T1T2.
10.39. Let T:V → V be linear. Prove that any eigenspace Eλ of T is T-invariant.
10.40. Let V be a vector space of odd dimension (greater than 1) over the real field R. Show that any linear operator on V has an invariant subspace other than V or {0}.
10.41. Determine the invariant subspaces of A viewed as a linear operator on (a) R2, (b) C2.
10.42. Suppose dim V = n. Show that T:V → V has a triangular matrix representation if and only if there exist T-invariant subspaces W1 ⊂ W2 ⊂ ⋯ ⊂ Wn = V for which dim Wk = k, k = 1, ..., n.
10.43. The subspaces W1, ..., Wr are said to be independent if w1 + ⋯ + wr = 0, wi ∈ Wi, implies that each wi = 0. Show that span(Wi) = W1 ⊕ ⋯ ⊕ Wr if and only if the Wi are independent. [Here span(Wi) denotes the linear span of the Wi.]
10.44. Show that V = W1 ⊕ ⋯ ⊕ Wr if and only if (i) V = span(Wi) and (ii) for k = 1, 2, ..., r, Wk ∩ span(W1, ..., Wk − 1, Wk + 1, ..., Wr) = {0}.
10.45. Show that span(Wi) = W1 ⊕ ⋯ ⊕ Wr if and only if dim[span(Wi)] = dim W1 + ⋯ + dim Wr.
10.46. Suppose the characteristic polynomial of T:V → V is Δ(t) = f1(t)n1f2(t)n2 ⋯ fr(t)nr, where the fi(t) are distinct monic irreducible polynomials. Let V = W1 ⊕ ⋯ ⊕ Wr be the primary decomposition of V into T-invariant subspaces. Show that fi(t)ni is the characteristic polynomial of the restriction of T to Wi.
10.47. Suppose T1 and T2 are nilpotent operators that commute (i.e., T1T2 = T2T1). Show that T1 + T2 and T1T2 are also nilpotent.
10.48. Suppose A is a supertriangular matrix (i.e., all entries on and below the main diagonal are 0). Show that A is nilpotent.
10.49. Let V be the vector space of polynomials of degree ≤n. Show that the derivative operator on V is nilpotent of index n + 1.
10.50. Show that any Jordan nilpotent block matrix N is similar to its transpose NT (the matrix with 1’s below the diagonal and 0’s elsewhere).
10.51. Show that two nilpotent matrices of order 3 are similar if and only if they have the same index of nilpotency. Show by example that the statement is not true for nilpotent matrices of order 4.
10.52. Find all possible Jordan canonical forms for those matrices whose characteristic polynomial Δ(t) and minimal polynomial m(t) are as follows:
(a) Δ(t) = (t − 2)4(t − 3)2, m(t) = (t − 2)2(t − 3)2,
(b) Δ(t) = (t − 7)5, m(t) = (t − 7)2, (c) Δ(t) = (t − 2)7, m(t) = (t − 2)3
10.53. Show that every complex matrix is similar to its transpose. (Hint: Use its Jordan canonical form.)
10.54. Show that all n × n complex matrices A for which An = I but Ak ≠ I for k < n are similar.
10.55. Suppose A is a complex matrix with only real eigenvalues. Show that A is similar to a matrix with only real entries.
10.56. Suppose T:V → V is linear. Prove that Z(υ, T) is the intersection of all T-invariant subspaces containing υ.
10.57. Let f(t) and g(t) be the T-annihilators of u and υ, respectively. Show that if f(t) and g(t) are relatively prime, then f(t)g(t) is the T-annihilator of u + υ.
10.58. Prove that Z(u, T) = Z(υ, T) if and only if g(T)(u) = υ where g(t) is relatively prime to the T-annihilator of u.
10.59. Let W = Z(υ, T), and suppose the T-annihilator of υ is f(t)n, where f(t) is a monic irreducible polynomial of degree d. Show that f(t)s(W) is a cyclic subspace generated by f(T)s(υ) and that it has dimension d(n − s) if n > s and dimension 0 if n ≤ s.
10.60. Find all possible rational forms for a 6 × 6 matrix over R with minimal polynomial:
(a) m(t) = (t2 − 2t + 3)(t + 1)2, (b) m(t) = (t − 2)3.
10.61. Let A be a 4 × 4 matrix with minimal polynomial m(t) = (t2 + 1)(t2 − 3). Find the rational canonical form for A if A is a matrix over (a) the rational field Q, (b) the real field R, (c) the complex field C.
10.62. Find the rational canonical form for the four-square Jordan block with λ’s on the diagonal.
10.63. Prove that the characteristic polynomial of an operator T:V → V is a product of its elementary divisors.
10.64. Prove that two 3 × 3 matrices with the same minimal and characteristic polynomials are similar.
10.65. Let C( f(t)) denote the companion matrix to an arbitrary polynomial f(t). Show that f(t) is the characteristic polynomial of C( f(t)).
10.66. Suppose V = W1 ⊕ ⋯ ⊕ Wr. Let Ei denote the projection of V into Wi. Prove (i) EiEj = 0, i ≠ j; (ii) I = E1 + ⋯ + Er.
10.67. Let E1, ..., Er be linear operators on V such that
(i) Ei2 = Ei (i.e., the Ei are projections); (ii) EiEj = 0, i ≠ j; (iii) I = E1 + ⋯ + Er
Prove that V = Im E1 ⊕ ⋯ ⊕ Im Er.
10.68. Suppose E: V → V is a projection (i.e., E2 = E). Prove that E has a matrix representation of the form [Ir 0; 0 0], where r is the rank of E and Ir is the r-square identity matrix.
10.69. Prove that any two projections of the same rank are similar. (Hint: Use the result of Problem 10.68.)
10.70. Suppose E: V → V is a projection. Prove
(i) I − E is a projection and V = Im E ⊕ Im (I − E), (ii) I + E is invertible (if 1 + 1 ≠ 0).
10.71. Let W be a subspace of V. Suppose the set of cosets {υ1 + W, υ2 + W, ..., υn + W} in V/W is linearly independent. Show that the set of vectors {υ1, υ2, ..., υn} in V is also linearly independent.
10.72. Let W be a subspace of V. Suppose the set of vectors {u1, u2, ..., un} in V is linearly independent, and that L(ui) ∩ W = {0}. Show that the set of cosets {u1 + W, ..., un + W} in V/W is also linearly independent.
10.73. Suppose V = U ⊕ W and that {u1, ..., un} is a basis of U. Show that {u1 + W, ..., un + W} is a basis of the quotient spaces V/W. (Observe that no condition is placed on the dimensionality of V or W.)
10.74. Let W be the solution space of the linear equation
a1x1 + a2x2 + ⋯ + anxn = 0
and let υ = (b1, b2, ..., bn) ∈ Kn. Prove that the coset υ + W of W in Kn is the solution set of the linear equation
a1x1 + a2x2 + ⋯ + anxn = b,   where b = a1b1 + a2b2 + ⋯ + anbn
10.75. Let V be the vector space of polynomials over R and let W be the subspace of polynomials divisible by t4 (i.e., of the form a0t4 + a1t5 + ⋯ + an − 4tn). Show that the quotient space V/W has dimension 4.
10.76. Let U and W be subspaces of V such that W ⊂ U ⊂ V. Note that any coset u + W of W in U may also be viewed as a coset of W in V, because u∈ U implies u ∈ V; hence, U/W is a subset of V/W. Prove that (i) U/W is a subspace of V/W, (ii) dim(V/W) − dim(U/W) = dim(V/U).
10.77. Let U and W be subspaces of V. Show that the cosets of U ∩ W in V can be obtained by intersecting each of the cosets of U in V by each of the cosets of W in V; in fact,
υ + (U ∩ W) = (υ + U) ∩ (υ + W)
10.78. Let T:V → V′ be linear with kernel W and image U. Show that the quotient space V/W is isomorphic to U under the mapping θ:V/W → U defined by θ(υ + W) = T(υ). Furthermore, show that T = i ∘ θ ∘ η, where η:V → V/W is the natural mapping of V into V/W (i.e., η(υ) = υ + W), and i:U → V′ is the inclusion mapping (i.e., i(u) = u). (See diagram.)
ANSWERS TO SUPPLEMENTARY PROBLEMS
10.41. (a) R2 and {0}, (b) C2, {0}, W1 = span(2, 1 − 2i), W2 = span(2, 1 + 2i)
10.52. (c) Let Mk denote a Jordan block with λ = 2 and order k. Then diag(M3, M3, M1), diag(M3, M2, M2), diag(M3, M2, M1, M1), diag(M3, M1, M1, M1, M1)
10.60. Let A = C(t2 − 2t + 3), B = C((t + 1)2), C = C((t − 2)3), and D = C((t − 2)2), where C(f(t)) denotes the companion matrix of f(t).
(a) diag(A, A, B), diag(A, B, B), diag(A, B, –1, –1); (b) diag(C, C), diag(C, D, 2), diag(C, 2, 2, 2)
10.61. Let A = C(t2 + 1) and B = C(t2 − 3).
(a) diag(A, B), (b) diag(A, √3, −√3), (c) diag(i, −i, √3, −√3)
10.62. Companion matrix with the last column [−λ4, 4λ3, −6λ2, 4λ]T