The definition of a vector space V involves an arbitrary field K. Here we first restrict K to be the real field R, in which case V is called a real vector space; in the last sections of this chapter, we extend our results to the case where K is the complex field C, in which case V is called a complex vector space. Also, we adopt the previous notation that
Furthermore, the vector spaces V in this chapter have finite dimension unless otherwise stated or implied.
Recall that the concepts of “length” and “orthogonality” did not appear in the investigation of arbitrary vector spaces V (although they did appear in Section 1.4 on the spaces Rn and Cn). Here we place an additional structure on a vector space V to obtain an inner product space, and in this context these concepts are defined.
We begin with a definition.
DEFINITION: Let V be a real vector space. Suppose to each pair of vectors u, υ ∈ V there is assigned a real number, denoted by ⟨u, υ⟩. This function is called a (real) inner product on V if it satisfies the following axioms:
[I1] (Linear Property): ⟨au1 + bu2, υ⟩ = a⟨u1, υ⟩ + b⟨u2, υ⟩.
[I2] (Symmetric Property): ⟨u, υ⟩ = ⟨υ, u⟩.
[I3] (Positive Definite Property): ⟨u, u⟩ ≥ 0; and ⟨u, u⟩ = 0 if and only if u = 0.
The vector space V with an inner product is called a (real) inner product space.
Axiom [I1] states that an inner product function is linear in the first position. Using [I1] and the symmetry axiom [I2], we obtain
⟨u, cυ1 + dυ2⟩ = ⟨cυ1 + dυ2, u⟩ = c⟨υ1, u⟩ + d⟨υ2, u⟩ = c⟨u, υ1⟩ + d⟨u, υ2⟩
That is, the inner product function is also linear in its second position. Combining these two properties and using induction yields the following general formula:
⟨a1u1 + a2u2 + … + arur, b1υ1 + b2υ2 + … + bsυs⟩ = Σi,j aibj⟨ui, υj⟩
That is, an inner product of linear combinations of vectors is equal to a linear combination of the inner products of the vectors.
EXAMPLE 7.1 Let V be a real inner product space. Then, by linearity,
Observe that in the last equation we have used the symmetry property that ⟨u, υ⟩ = ⟨υ, u⟩.
Remark: Axiom [I1] by itself implies ⟨0, 0⟩ = ⟨0υ, 0⟩ = 0⟨υ, 0⟩ = 0. Thus, [I1], [I2], [I3] are equivalent to [I1], [I2], and the following axiom:
If u ≠ 0, then ⟨u, u⟩ is positive.
That is, a function satisfying [I1], [I2], and the above axiom is an inner product.
By the third axiom [I3] of an inner product, ⟨u, u⟩ is nonnegative for any vector u. Thus, its positive square root exists. We use the notation
||u|| = √⟨u, u⟩
This nonnegative number is called the norm or length of u. The relation ||u||2 = ⟨u, u⟩ will be used frequently.
Remark: If ||u|| = 1 or, equivalently, if ⟨u, u⟩ = 1, then u is called a unit vector and it is said to be normalized. Every nonzero vector υ in V can be multiplied by the reciprocal of its length to obtain the unit vector
υ̂ = υ/||υ||
which is a positive multiple of υ. This process is called normalizing υ.
This section lists the main examples of inner product spaces used in this text.
Consider the vector space Rn. The dot product or scalar product in Rn is defined by
u · υ = a1b1 + a2b2 + … + anbn
where u = (ai) and υ = (bi). This function defines an inner product on Rn. The norm ||u|| of the vector u = (ai) in this space is as follows:
||u|| = √⟨u, u⟩ = √(a1² + a2² + … + an²)
On the other hand, by the Pythagorean theorem, the distance from the origin O in R3 to a point P(a, b, c) is given by √(a² + b² + c²). This is precisely the same as the above-defined norm of the vector υ = (a, b, c) in R3. Because the Pythagorean theorem is a consequence of the axioms of Euclidean geometry, the vector space Rn with the above inner product and norm is called Euclidean n-space. Although there are many ways to define an inner product on Rn, we shall assume this inner product unless otherwise stated or implied. It is called the usual (or standard) inner product on Rn.
Remark: Frequently the vectors in Rn will be represented by column vectors—that is, by n × 1 column matrices. In such a case, the formula
⟨u, υ⟩ = uTυ
defines the usual inner product on Rn.
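For readers who want to check such computations numerically, the following minimal sketch uses Python with NumPy (an assumed tool for this illustration); it evaluates the usual inner product and the induced norm for the vectors used in Example 7.2 below.

```python
# Sketch: the usual (dot product) inner product and norm on R^n, using NumPy.
import numpy as np

u = np.array([1.0, 3.0, -4.0, 2.0])
v = np.array([4.0, -2.0, 2.0, 1.0])

inner = u @ v                     # <u, v> = a1*b1 + ... + an*bn
norm_u = np.sqrt(u @ u)           # ||u|| = sqrt(<u, u>)

print(inner)                      # 4 - 6 - 8 + 2 = -8
print(norm_u, np.linalg.norm(u))  # both give sqrt(30)
```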
EXAMPLE 7.2 Let u = (1, 3, −4, 2), υ = (4, −2, 2, 1), w = (5, −1, −2, 6) in R4.
(a) Show ⟨3u − 2υ, w⟩ = 3⟨u, w⟩ − 2⟨υ, w⟩.
By definition,
⟨u, w⟩ = 5 − 3 + 8 + 12 = 22 and ⟨υ, w⟩ = 20 + 2 − 4 + 6 = 24
Note that 3u − 2υ = (−5, 13, −16, 4). Thus,
⟨3u − 2υ, w⟩ = −25 − 13 + 32 + 24 = 18
As expected, 3⟨u, w⟩ − 2⟨υ, w⟩ = 3(22) − 2(24) = 18 = ⟨3u − 2υ, w⟩.
(b) Normalize u and υ.
By definition,
||u|| = √(1 + 9 + 16 + 4) = √30 and ||υ|| = √(16 + 4 + 4 + 1) = √25 = 5
We normalize u and υ to obtain the following unit vectors in the directions of u and υ, respectively:
û = u/||u|| = (1/√30, 3/√30, −4/√30, 2/√30) and υ̂ = υ/||υ|| = (4/5, −2/5, 2/5, 1/5)
The notation C[a, b] is used to denote the vector space of all continuous functions on the closed interval [a, b]—that is, where a ≤ t ≤ b. The following defines an inner product on C[a, b], where f(t) and g(t) are functions in C[a, b]:
⟨f, g⟩ = ∫ₐᵇ f(t)g(t) dt
It is called the usual inner product on C[a, b].
The vector space P(t) of all polynomials is a subspace of C[a, b] for any interval [a, b], and hence, the above is also an inner product on P(t).
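The integral inner product can be evaluated symbolically; the sketch below uses SymPy (an assumed tool), with the interval [0, 1] chosen only for illustration, applied to the polynomials f(t) = 3t − 5 and g(t) = t2 of the next example.

```python
# Sketch: the usual inner product on C[a, b] as a definite integral, evaluated
# symbolically with SymPy.  The interval [0, 1] below is an illustrative choice.
from sympy import symbols, integrate, sqrt

t = symbols('t')

def inner(f, g, a=0, b=1):
    """<f, g> = integral from a to b of f(t) g(t) dt."""
    return integrate(f * g, (t, a, b))

f = 3*t - 5
g = t**2
print(inner(f, g))                           # <f, g> = -11/12 on [0, 1]
print(sqrt(inner(f, f)), sqrt(inner(g, g)))  # ||f|| and ||g||
```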
EXAMPLE 7.3 Consider f(t) = 3t − 5 and g(t) = t2 in the polynomial space P(t) with inner product ⟨f, g⟩ = ∫ f(t)g(t) dt.
(a) Find ⟨f, g⟩.
We have f(t)g(t) = 3t3 − 5t2. Hence,
We have [f(t)]2 = f(t)f(t) = 9t2 − 30t + 25 and [g(t)]2 = t4. Then
Therefore, .
Let M = Mm,n, the vector space of all real m × n matrices. An inner product is defined on M by
⟨A, B⟩ = tr(BTA)
where, as usual, tr( ) is the trace—the sum of the diagonal elements. If A = [aij] and B = [bij], then
⟨A, B⟩ = tr(BTA) = Σi,j aijbij
That is, ⟨A, B⟩ is the sum of the products of the corresponding entries in A and B and, in particular, ⟨A, A⟩ is the sum of the squares of the entries of A.
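A short NumPy sketch (the matrices are arbitrary illustrations) confirms that tr(BTA) agrees with the sum of products of corresponding entries.

```python
# Sketch: the inner product <A, B> = tr(B^T A) on real m x n matrices (NumPy).
# The two matrices below are arbitrary illustrations.
import numpy as np

A = np.array([[9., 8., 7.],
              [6., 5., 4.]])
B = np.array([[1., 2., 3.],
              [4., 5., 6.]])

print(np.trace(B.T @ A))   # tr(B^T A)
print(np.sum(A * B))       # equal: the sum of products of corresponding entries
```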
Let V be the vector space of all infinite sequences of real numbers (a1, a2, a3, …) satisfying
Σ ai² = a1² + a2² + … < ∞
that is, the sum converges. Addition and scalar multiplication are defined in V componentwise; that is, if
u = (a1, a2, …) and υ = (b1, b2, …)
then
u + υ = (a1 + b1, a2 + b2, …) and ku = (ka1, ka2, …)
An inner product is defined in V by
⟨u, υ⟩ = a1b1 + a2b2 + …
The above sum converges absolutely for any pair of points in V. Hence, the inner product is well defined. This inner product space is called l2-space or Hilbert space.
The following formula (proved in Problem 7.8) is called the Cauchy–Schwarz inequality or Schwarz inequality. It is used in many branches of mathematics.
THEOREM 7.1: (Cauchy–Schwarz) For any vectors u and υ in an inner product space V,
⟨u, υ⟩² ≤ ⟨u, u⟩⟨υ, υ⟩ or, equivalently, |⟨u, υ⟩| ≤ ||u|| ||υ||
Next we examine this inequality in specific cases.
(a) Consider any real numbers a1, …, an, b1, …, bn. Then, by the Cauchy–Schwarz inequality,
(a1b1 + a2b2 + … + anbn)² ≤ (a1² + … + an²)(b1² + … + bn²)
That is, (u · υ)2 ≤ ||u||2||υ||2, where u = (ai) and υ = (bi).
(b) Let f and g be continuous functions on the unit interval [0, 1]. Then, by the Cauchy–Schwarz inequality,
[∫₀¹ f(t)g(t) dt]² ≤ [∫₀¹ f²(t) dt][∫₀¹ g²(t) dt]
That is, ⟨f, g⟩² ≤ ||f||²||g||². Here V is the inner product space C[0, 1].
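The following NumPy sketch spot-checks the Cauchy–Schwarz inequality on randomly generated vectors; it is a numerical illustration, not a proof.

```python
# Sketch: a numerical spot-check of the Cauchy-Schwarz inequality
# <u, v>^2 <= ||u||^2 ||v||^2 on randomly chosen vectors.
import numpy as np

rng = np.random.default_rng(0)
for _ in range(5):
    u = rng.standard_normal(6)
    v = rng.standard_normal(6)
    assert (u @ v) ** 2 <= (u @ u) * (v @ v) + 1e-12   # holds up to round-off
print("Cauchy-Schwarz verified on all samples")
```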
The next theorem (proved in Problem 7.9) gives the basic properties of a norm. The proof of the third property requires the Cauchy–Schwarz inequality.
THEOREM 7.2: Let V be an inner product space. Then the norm in V satisfies the following properties:
[N1] ||υ|| ≥ 0; and ||υ|| = 0 if and only if υ = 0.
[N2] ||kυ|| = |k| ||υ||.
[N3] ||u + υ|| ≤ ||u|| + ||υ||.
The property [N3] is called the triangle inequality, because if we view u + υ as the side of the triangle formed with sides u and υ (as shown in Fig. 7-1), then [N3] states that the length of one side of a triangle cannot be greater than the sum of the lengths of the other two sides.
For any nonzero vectors u and υ in an inner product space V, the angle between u and υ is defined to be the angle θ such that 0 ≤ θ ≤ π and
cos θ = ⟨u, υ⟩ / (||u|| ||υ||)
By the Cauchy–Schwarz inequality, −1 ≤ cos θ ≤ 1, and so the angle exists and is unique.
(a) Consider vectors u = (2, 3, 5) and υ = (1, −4, 3) in R3. Then
⟨u, υ⟩ = 2 − 12 + 15 = 5, ||u|| = √(4 + 9 + 25) = √38, ||υ|| = √(1 + 16 + 9) = √26
Then the angle θ between u and υ is given by
cos θ = 5/(√38 √26)
Note that θ is an acute angle, because cos θ is positive.
(b) Let f(t) = 3t − 5 and g(t) = t2 in the polynomial space P(t) with the inner product of Example 7.3. Using the values computed there,
Then the “angle” θ between f and g is given by
Note that θ is an obtuse angle, because cos θ is negative.
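The angle computation of part (a) can be reproduced numerically, as in this NumPy sketch.

```python
# Sketch: the angle between two vectors from cos(theta) = <u, v>/(||u|| ||v||),
# using the vectors of part (a).
import numpy as np

u = np.array([2., 3., 5.])
v = np.array([1., -4., 3.])

cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
theta = np.arccos(cos_theta)          # 0 <= theta <= pi
print(cos_theta, np.degrees(theta))   # cos(theta) = 5/sqrt(38*26), an acute angle
```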
Let V be an inner product space. The vectors u, υ ∈ V are said to be orthogonal and u is said to be orthogonal to υ if
⟨u, υ⟩ = 0
The relation is clearly symmetric—if u is orthogonal to υ, then ⟨υ, u⟩ = 0, and so υ is orthogonal to u. We note that 0 ∈ V is orthogonal to every υ ∈ V, because
⟨0, υ⟩ = ⟨0υ, υ⟩ = 0⟨υ, υ⟩ = 0
Conversely, if u is orthogonal to every υ ∈ V, then ⟨u, u⟩ = 0 and hence u = 0 by [I3]. Observe that u and υ are orthogonal if and only if cos θ = 0, where θ is the angle between u and υ. Also, this is true if and only if u and υ are “perpendicular”—that is, θ = π/2 (or θ = 90º).
(a) Consider the vectors u = (1, 1, 1), υ = (1, 2, −3), w = (1, −4, 3) in R3. Then
⟨u, υ⟩ = 1 + 2 − 3 = 0, ⟨u, w⟩ = 1 − 4 + 3 = 0, ⟨υ, w⟩ = 1 − 8 − 9 = −16
Thus, u is orthogonal to υ and w, but υ and w are not orthogonal.
(b) Consider the functions sin t and cos t in the vector space C[−π, π] of continuous functions on the closed interval [−π, π]. Then
⟨sin t, cos t⟩ = ∫ sin t cos t dt (from −π to π) = ½ sin² t evaluated from −π to π = 0 − 0 = 0
Thus, sin t and cos t are orthogonal functions in the vector space C[−π, π].
Remark: A vector w = (x1, x2, …, xn) is orthogonal to u = (a1, a2, …, an) in Rn if
⟨u, w⟩ = a1x1 + a2x2 + … + anxn = 0
That is, w is orthogonal to u if w satisfies a homogeneous equation whose coefficients are the elements of u.
EXAMPLE 7.7 Find a nonzero vector w that is orthogonal to u1 = (1, 2, 1) and u2 = (2, 5, 4) in R3.
Let w = (x, y, z). Then we want ⟨u1, w⟩ = 0 and ⟨u2, w⟩ = 0. This yields the homogeneous system
x + 2y + z = 0, 2x + 5y + 4z = 0 or, in echelon form, x + 2y + z = 0, y + 2z = 0
Here z is the only free variable in the echelon system. Set z = 1 to obtain y = −2 and x = 3. Thus, w = (3, −2, 1) is a desired nonzero vector orthogonal to u1 and u2.
Any multiple of w will also be orthogonal to u1 and u2. Normalizing w, we obtain the following unit vector orthogonal to u1 and u2:
ŵ = w/||w|| = (3/√14, −2/√14, 1/√14)
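Equivalently, w spans the null space of the matrix whose rows are u1 and u2; the NumPy sketch below finds such a vector via the singular value decomposition (one possible numerical approach, not the text's method).

```python
# Sketch: a vector orthogonal to u1 and u2 spans the null space of the matrix
# whose rows are u1 and u2; here it is found with NumPy's SVD.
import numpy as np

u1 = np.array([1., 2., 1.])
u2 = np.array([2., 5., 4.])
A = np.vstack([u1, u2])

w = np.linalg.svd(A)[2][-1]   # last right-singular vector spans the null space
print(w)                      # proportional to (3, -2, 1)
print(A @ w)                  # numerically the zero vector
```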
Let S be a subset of an inner product space V. The orthogonal complement of S, denoted by S⊥ (read “S perp”) consists of those vectors in V that are orthogonal to every vector u ∈ S; that is,
S⊥ = {υ ∈ V : ⟨υ, u⟩ = 0 for every u ∈ S}
In particular, for a given vector u in V, we have
u⊥ = {υ ∈ V : ⟨υ, u⟩ = 0}
that is, u⊥ consists of all vectors in V that are orthogonal to the given vector u.
We show that S⊥ is a subspace of V. Clearly 0 ∈ S⊥, because 0 is orthogonal to every vector in V. Now suppose υ, w ∈ S⊥. Then, for any scalars a and b and any vector u ∈ S, we have
⟨aυ + bw, u⟩ = a⟨υ, u⟩ + b⟨w, u⟩ = a · 0 + b · 0 = 0
Thus, aυ + bw ∈ S⊥, and therefore S⊥ is a subspace of V.
We state this result formally.
PROPOSITION 7.3: Let S be a subset of a vector space V. Then S⊥ is a subspace of V.
Remark 1: Suppose u is a nonzero vector in R3. Then there is a geometrical description of u⊥. Specifically, u⊥ is the plane in R3 through the origin O and perpendicular to the vector u. This is shown in Fig. 7-2.
Remark 2: Let W be the solution space of an m × n homogeneous system AX = 0, where A = [aij] and X = [xi]. Recall that W may be viewed as the kernel of the linear mapping A:Rn → Rm. Now we can give another interpretation of W using the notion of orthogonality. Specifically, each solution vector w = (x1, x2, …, xn) is orthogonal to each row of A; hence, W is the orthogonal complement of the row space of A.
EXAMPLE 7.8 Find a basis for the subspace u⊥ of R3, where u = (1, 3, −4).
Note that u⊥ consists of all vectors w = (x, y, z) such that ⟨u, w⟩ = 0, or x + 3y − 4z = 0. The free variables are y and z.
(1) Set y = 1, z = 0 to obtain the solution w1 = (−3, 1, 0).
(2) Set y = 0, z = 1 to obtain the solution w2 = (4, 0, 1).
The vectors w1 and w2 form a basis for the solution space of the equation, and hence a basis for u⊥.
Suppose W is a subspace of V. Then both W and W⊥ are subspaces of V. The next theorem, whose proof (Problem 7.28) requires results of later sections, is a basic result in linear algebra.
THEOREM 7.4: Let W be a subspace of V. Then V is the direct sum of W and W⊥; that is, V = W ⊕ W⊥.
Consider a set S = {u1, u2, …, ur} of nonzero vectors in an inner product space V. S is called orthogonal if each pair of vectors in S are orthogonal, and S is called orthonormal if S is orthogonal and each vector in S has unit length. That is,
(i) Orthogonal: ⟨ui, uj⟩ = 0 for i ≠ j.
(ii) Orthonormal: ⟨ui, uj⟩ = 0 for i ≠ j and ⟨ui, ui⟩ = 1 for each i.
Normalizing an orthogonal set S refers to the process of multiplying each vector in S by the reciprocal of its length in order to transform S into an orthonormal set of vectors.
The following theorems apply.
THEOREM 7.5: Suppose S is an orthogonal set of nonzero vectors. Then S is linearly independent.
THEOREM 7.6: (Pythagoras) Suppose {u1, u2, …, ur} is an orthogonal set of vectors. Then
||u1 + u2 + … + ur||² = ||u1||² + ||u2||² + … + ||ur||²
These theorems are proved in Problems 7.15 and 7.16, respectively. Here we prove the Pythagorean theorem in the special and familiar case of two vectors. Specifically, suppose ⟨u, υ⟩ = 0. Then
||u + υ||² = ⟨u + υ, u + υ⟩ = ⟨u, u⟩ + 2⟨u, υ⟩ + ⟨υ, υ⟩ = ⟨u, u⟩ + ⟨υ, υ⟩ = ||u||² + ||υ||²
which gives our result.
(a) Let E = {e1, e2, e3} = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} be the usual basis of Euclidean space R3. It is clear that
⟨e1, e2⟩ = ⟨e1, e3⟩ = ⟨e2, e3⟩ = 0 and ⟨e1, e1⟩ = ⟨e2, e2⟩ = ⟨e3, e3⟩ = 1
Namely, E is an orthonormal basis of R3. More generally, the usual basis of Rn is orthonormal for every n.
(b) Let V = C[−π, π] be the vector space of continuous functions on the interval −π ≤ t ≤ π with inner product defined by ⟨f, g⟩ = ∫ f(t)g(t) dt taken from −π to π. Then the following is a classical example of an orthogonal set in V:
{1, cos t, cos 2t, cos 3t, …, sin t, sin 2t, sin 3t, …}
This orthogonal set plays a fundamental role in the theory of Fourier series.
Let S consist of the following three vectors in R3:
The reader can verify that the vectors are orthogonal; hence, they are linearly independent. Thus, S is an orthogonal basis of R3.
Suppose we want to write υ = (7, 1, 9) as a linear combination of u1, u2, u3. First we set υ as a linear combination of u1, u2, u3 using unknowns x1, x2, x3 as follows:
We can proceed in two ways.
METHOD 1: Expand (*) (as in Chapter 3) to obtain the system
Solve the system by Gaussian elimination to obtain x1 = 3, x2 = −1, x3 = 2. Thus, υ = 3u1 − u2 + 2u3.
METHOD 2: (This method uses the fact that the basis vectors are orthogonal, and the arithmetic is much simpler.) If we take the inner product of each side of (*) with respect to ui, we get
Here two terms drop out, because u1, u2, u3 are orthogonal. Accordingly,
Thus, again, we get υ = 3u1 − u2 + 2u3.
The procedure in Method 2 is true in general. Namely, we have the following theorem (proved in Problem 7.17).
THEOREM 7.7: Let {u1, u2, …, un} be an orthogonal basis of V. Then, for any υ ∈ V,
υ = [⟨υ, u1⟩/⟨u1, u1⟩]u1 + [⟨υ, u2⟩/⟨u2, u2⟩]u2 + … + [⟨υ, un⟩/⟨un, un⟩]un
Remark: The scalar ki ≡ ⟨υ, ui⟩/⟨ui, ui⟩ is called the Fourier coefficient of υ with respect to ui, because it is analogous to a coefficient in the Fourier series of a function. This scalar also has a geometric interpretation, which is discussed below.
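The following NumPy sketch illustrates Theorem 7.7 with a hypothetical orthogonal basis of R3 (chosen only for illustration): the Fourier coefficients reassemble υ exactly.

```python
# Sketch of Theorem 7.7: coordinates relative to an orthogonal basis are the
# Fourier coefficients <v, u_i>/<u_i, u_i>.  The basis below is a hypothetical
# orthogonal set chosen only for illustration.
import numpy as np

basis = [np.array([1., 1., 0.]), np.array([1., -1., 0.]), np.array([0., 0., 1.])]
v = np.array([7., 1., 9.])

coeffs = [(v @ u) / (u @ u) for u in basis]
recon = sum(c * u for c, u in zip(coeffs, basis))
print(coeffs)    # [4.0, 3.0, 9.0]
print(recon)     # equals v
```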
Let V be an inner product space. Suppose w is a given nonzero vector in V, and suppose υ is another vector. We seek the “projection of υ along w,” which, as indicated in Fig. 7-3(a), will be the multiple cw of w such that υ′ = υ − cw is orthogonal to w. This means
⟨υ − cw, w⟩ = 0 or ⟨υ, w⟩ − c⟨w, w⟩ = 0 or c = ⟨υ, w⟩/⟨w, w⟩
Accordingly, the projection of υ along w is denoted and defined by
proj(υ, w) = cw = [⟨υ, w⟩/⟨w, w⟩]w
Such a scalar c is unique, and it is called the Fourier coefficient of υ with respect to w or the component of v along w.
The above notion is generalized as follows (see Problem 7.25).
THEOREM 7.8: Suppose w1, w2, …, wr form an orthogonal set of nonzero vectors in V. Let υ be any vector in V. Define
υ′ = υ − (c1w1 + c2w2 + … + crwr), where ci = ⟨υ, wi⟩/⟨wi, wi⟩
Then υ′ is orthogonal to w1, w2, …, wr.
Note that each ci in the above theorem is the component (Fourier coefficient) of υ along the given wi.
Remark: The notion of the projection of a vector υ ∈ V along a subspace W of V is defined as follows. By Theorem 7.4, V = W ⊕ W⊥. Hence, υ may be expressed uniquely in the form
υ = w + w′, where w ∈ W and w′ ∈ W⊥
We define w to be the projection of υ along W, and denote it by proj(υ, W), as pictured in Fig. 7-3(b). In particular, if W = span(w1, w2, …, wr), where the wi form an orthogonal set, then
proj(υ, W) = c1w1 + c2w2 + … + crwr
Here ci is the component of υ along wi, as above.
Suppose {υ1, υ2, …, υn} is a basis of an inner product space V. One can use this basis to construct an orthogonal basis {w1, w2, …, wn} of V as follows. Set
w1 = υ1, w2 = υ2 − [⟨υ2, w1⟩/⟨w1, w1⟩]w1, w3 = υ3 − [⟨υ3, w1⟩/⟨w1, w1⟩]w1 − [⟨υ3, w2⟩/⟨w2, w2⟩]w2, …
In other words, for k = 2, 3, …, n, we define
wk = υk − ck1w1 − ck2w2 − … − ck,k−1wk−1
where cki = ⟨υk, wi⟩/⟨wi, wi⟩ is the component of υk along wi. By Theorem 7.8, each wk is orthogonal to the preceding w’s. Thus, w1, w2, …, wn form an orthogonal basis for V as claimed. Normalizing each wi will then yield an orthonormal basis for V.
The above construction is known as the Gram–Schmidt orthogonalization process. The following remarks are in order.
Remark 1: Each vector wk is a linear combination of υk and the preceding w’s. Hence, one can easily show, by induction, that each wk is a linear combination of υ1, υ2, …, υk.
Remark 2: Because taking multiples of vectors does not affect orthogonality, it may be simpler in hand calculations to clear fractions in any new wk, by multiplying wk by an appropriate scalar, before obtaining the next wk+1.
Remark 3: Suppose u1, u2, …, ur are linearly independent, and so they form a basis for U = span(ui). Applying the Gram–Schmidt orthogonalization process to the u’s yields an orthogonal basis for U.
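A minimal implementation of the process is sketched below in Python/NumPy; the input vectors are arbitrary linearly independent vectors chosen for illustration.

```python
# A minimal Gram-Schmidt sketch (NumPy).
import numpy as np

def gram_schmidt(vectors):
    """Return an orthogonal set spanning the same subspace as `vectors`."""
    ws = []
    for v in vectors:
        w = v.astype(float).copy()
        for u in ws:
            w -= (v @ u) / (u @ u) * u   # subtract the component of v along u
        ws.append(w)
    return ws

vs = [np.array([1., 1., 1.]), np.array([1., 0., 2.]), np.array([2., 1., 0.])]
for w in gram_schmidt(vs):
    print(w)    # (1,1,1), (0,-1,1), (1,-0.5,-0.5): mutually orthogonal
```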
The following theorems (proved in Problems 7.26 and 7.27) use the above algorithm and remarks.
THEOREM 7.9: Let {υ1, υ2, …, υn} be any basis of an inner product space V. Then there exists an orthonormal basis {u1, u2, …, un} of V such that the change-of-basis matrix from {υi} to {ui} is triangular; that is, for k = 1, …, n,
uk = ak1υ1 + ak2υ2 + … + akkυk
THEOREM 7.10: Suppose S = {w1, w2, …, wr} is an orthogonal basis for a subspace W of a vector space V. Then one may extend S to an orthogonal basis for V; that is, one may find vectors wr+1, …, wn such that {w1, w2, …, wn} is an orthogonal basis for V.
EXAMPLE 7.10 Apply the Gram–Schmidt orthogonalization process to find an orthogonal basis and then an orthonormal basis for the subspace U of R4 spanned by
(1) First set w1 = υ1 = (1, 1, 1, 1).
(2) Compute
Set w2 = (−2, −1, 1, 2).
(3) Compute
Clear fractions to obtain w3 = (16, −17, −13, 14).
Thus, w1, w2, w3 form an orthogonal basis for U. Normalize these vectors to obtain an orthonormal basis {u1, u2, u3} of U. We have ||w1||2 = 4, ||w2||2 = 10, ||w3||2 = 910, so
u1 = (1/2)(1, 1, 1, 1), u2 = (1/√10)(−2, −1, 1, 2), u3 = (1/√910)(16, −17, −13, 14)
EXAMPLE 7.11 Let V be the vector space of polynomials f(t) with inner product ⟨f, g⟩ = ∫ f(t)g(t) dt taken from −1 to 1. Apply the Gram–Schmidt orthogonalization process to {1, t, t2, t3} to find an orthogonal basis {f0, f1, f2, f3} with integer coefficients for P3(t).
Here we use the fact that, for r + s = n,
⟨t^r, t^s⟩ = ∫ tⁿ dt (from −1 to 1) = 2/(n + 1) if n is even, and 0 if n is odd
(1) First set f0 = 1.
(2) Compute f1 = t − [⟨t, 1⟩/⟨1, 1⟩]1 = t − 0 = t.
(3) Compute
t2 − [⟨t2, 1⟩/⟨1, 1⟩]1 − [⟨t2, t⟩/⟨t, t⟩]t = t2 − (2/3)/2 = t2 − 1/3
Multiply by 3 to obtain f2 = 3t2 − 1.
(4) Compute
t3 − [⟨t3, 1⟩/⟨1, 1⟩]1 − [⟨t3, t⟩/⟨t, t⟩]t − [⟨t3, f2⟩/⟨f2, f2⟩]f2 = t3 − (3/5)t
Multiply by 5 to obtain f3 = 5t3 − 3t.
Thus, {1, t, 3t2 − 1, 5t3 − 3t} is the required orthogonal basis.
Remark: Normalizing the polynomials in Example 7.11 so that p(1) = 1 yields the polynomials
1, t, ½(3t2 − 1), ½(5t3 − 3t)
These are the first four Legendre polynomials, which appear in the study of differential equations.
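The computation of Example 7.11 can be reproduced symbolically; the SymPy sketch below assumes the inner product ∫ f g dt over [−1, 1] and recovers scalar multiples of the Legendre polynomials.

```python
# Sketch: Gram-Schmidt on {1, t, t^2, t^3} with the assumed inner product
# <f, g> = integral of f*g over [-1, 1], done symbolically with SymPy.
from sympy import symbols, integrate, expand

t = symbols('t')
inner = lambda f, g: integrate(f * g, (t, -1, 1))

ortho = []
for f in [1, t, t**2, t**3]:
    w = f
    for g in ortho:
        w -= inner(f, g) / inner(g, g) * g
    ortho.append(expand(w))

print(ortho)   # [1, t, t**2 - 1/3, t**3 - 3*t/5]
```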
This section discusses two types of matrices that are closely related to real inner product spaces V. Here vectors in Rn will be represented by column vectors. Thus, ⟨u, υ⟩ = uTυ denotes the inner product in Euclidean space Rn.
A real matrix P is orthogonal if P is nonsingular and P−1 = PT, or, in other words, if PPT = PTP = I. First we recall (Theorem 2.6) an important characterization of such matrices.
THEOREM 7.11: Let P be a real matrix. Then the following are equivalent: (a) P is orthogonal; (b) the rows of P form an orthonormal set; (c) the columns of P form an orthonormal set.
(This theorem is true only using the usual inner product on Rn. It is not true if Rn is given any other inner product.)
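As a concrete check of Theorem 7.11, the NumPy sketch below builds a 2 × 2 rotation matrix (one familiar family of orthogonal matrices) and verifies PPT = I and P−1 = PT.

```python
# Sketch of Theorem 7.11: a 2 x 2 rotation matrix is orthogonal, so its rows
# (and columns) are orthonormal and P^{-1} = P^T.
import numpy as np

theta = 0.3
P = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.allclose(P @ P.T, np.eye(2)))      # True: P P^T = I
print(np.allclose(np.linalg.inv(P), P.T))   # True: P^{-1} = P^T
```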
(a) Let . The rows of P are orthogonal to each other and are unit vectors. Thus P is an orthogonal matrix.
(b) Let P be a 2 × 2 orthogonal matrix. Then, for some real number θ, we have
The following two theorems (proved in Problems 7.37 and 7.38) show important relationships between orthogonal matrices and orthonormal bases of a real inner product space V.
THEOREM 7.12: Suppose E = {ei} and E′ = {e′i} are orthonormal bases of V. Let P be the change-of-basis matrix from the basis E to the basis E′. Then P is orthogonal.
THEOREM 7.13: Let {e1, …, en} be an orthonormal basis of an inner product space V. Let P = [aij] be an orthogonal matrix. Then the following n vectors form an orthonormal basis for V:
e′i = a1ie1 + a2ie2 + … + anien,  i = 1, 2, …, n
Let A be a real symmetric matrix; that is, AT = A. Then A is said to be positive definite if, for every nonzero vector u in Rn,
⟨u, Au⟩ = uTAu > 0
Algorithms to decide whether or not a matrix A is positive definite will be given in Chapter 13. However, for 2 × 2 matrices, we have simple criteria that we state formally in the following theorem (proved in Problem 7.43).
THEOREM 7.14: A 2 × 2 real symmetric matrix A = [a, b; b, d] is positive definite if and only if the diagonal entries a and d are positive and the determinant |A| = ad − bc = ad − b2 is positive.
EXAMPLE 7.13 Consider the following symmetric matrices:
A is not positive definite, because |A| = 4 − 9 = −5 is negative. B is not positive definite, because the diagonal entry −3 is negative. However, C is positive definite, because the diagonal entries 1 and 5 are positive, and the determinant |C| = 5 − 4 = 1 is also positive.
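The following NumPy sketch applies the criterion of Theorem 7.14 to a matrix C = [1, 2; 2, 5], whose entries are consistent with the determinant computed above (the off-diagonal entry 2 is an assumption of this illustration), and compares it with an eigenvalue test.

```python
# Sketch: checking positive definiteness of a 2 x 2 symmetric matrix by the
# criterion of Theorem 7.14 and, independently, by its eigenvalues.
import numpy as np

C = np.array([[1., 2.],
              [2., 5.]])   # assumed form consistent with |C| = 5 - 4 = 1

by_criterion   = C[0, 0] > 0 and C[1, 1] > 0 and np.linalg.det(C) > 0
by_eigenvalues = bool(np.all(np.linalg.eigvalsh(C) > 0))
print(by_criterion, by_eigenvalues)   # True True
```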
The following theorem (proved in Problem 7.44) holds.
THEOREM 7.15: Let A be a real positive definite matrix. Then the function ⟨u, υ⟩ = uTAυ is an inner product on Rn.
Theorem 7.15 says that every positive definite matrix A determines an inner product on Rn. This subsection may be viewed as giving the converse of this result.
Let V be a real inner product space with basis S = {u1, u2, …, un}. The matrix
A = [aij], where aij = ⟨ui, uj⟩
is called the matrix representation of the inner product on V relative to the basis S.
Observe that A is symmetric, because the inner product is symmetric; that is, ⟨ui, uj⟩ = ⟨uj, ui⟩. Also, A depends on both the inner product on V and the basis S for V. Moreover, if S is an orthogonal basis, then A is diagonal, and if S is an orthonormal basis, then A is the identity matrix.
EXAMPLE 7.14 The vectors u1 = (1, 1, 0), u2 = (1, 2, 3), u3 = (1, 3, 5) form a basis S for Euclidean space R3. Find the matrix A that represents the inner product in R3 relative to this basis S.
First compute each ⟨ui, uj⟩ to obtain
⟨u1, u1⟩ = 2, ⟨u1, u2⟩ = 3, ⟨u1, u3⟩ = 4, ⟨u2, u2⟩ = 14, ⟨u2, u3⟩ = 22, ⟨u3, u3⟩ = 35
Then A = [2, 3, 4; 3, 14, 22; 4, 22, 35]. As expected, A is symmetric.
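The Gram matrix of Example 7.14 can be obtained in one step as UUT, where the rows of U are the basis vectors; a NumPy sketch:

```python
# Sketch of Example 7.14: the matrix A = [<u_i, u_j>] of the basis S
# relative to the usual inner product on R^3.
import numpy as np

U = np.array([[1., 1., 0.],    # u1
              [1., 2., 3.],    # u2
              [1., 3., 5.]])   # u3

A = U @ U.T                    # entry (i, j) is <u_i, u_j>
print(A)                       # [[2, 3, 4], [3, 14, 22], [4, 22, 35]], symmetric
```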
The following theorems (proved in Problems 7.45 and 7.46, respectively) hold.
THEOREM 7.16: Let A be the matrix representation of an inner product relative to basis S for V. Then, for any vectors u, υ ∈ V, we have
⟨u, υ⟩ = [u]TA[υ]
where [u] and [υ] denote the (column) coordinate vectors relative to the basis S.
THEOREM 7.17: Let A be the matrix representation of any inner product on V. Then A is a positive definite matrix.
This section considers vector spaces over the complex field C. First we recall some properties of the complex numbers (Section 1.7), especially the relations between a complex number z = a + bi, where a, b ∈ R, and its complex conjugate z̄ = a − bi: zz̄ = a² + b², |z| = √(a² + b²), and the conjugate of a sum (product) is the sum (product) of the conjugates.
Also, z is real if and only if z̄ = z.
The following definition applies.
DEFINITION: Let V be a vector space over C. Suppose to each pair of vectors u, υ ∈ V there is assigned a complex number, denoted by ⟨u, υ⟩. This function is called a (complex) inner product on V if it satisfies the following axioms:
[I1*] (Linear Property): ⟨au1 + bu2, υ⟩ = a⟨u1, υ⟩ + b⟨u2, υ⟩.
[I2*] (Conjugate Symmetric Property): ⟨u, υ⟩ equals the complex conjugate of ⟨υ, u⟩.
[I3*] (Positive Definite Property): ⟨u, u⟩ ≥ 0; and ⟨u, u⟩ = 0 if and only if u = 0.
The vector space V over C with an inner product is called a (complex) inner product space. Observe that a complex inner product differs from the real case only in the second axiom [I2*]. Axiom [I1*] (Linear Property) is equivalent to the two conditions:
⟨u1 + u2, υ⟩ = ⟨u1, υ⟩ + ⟨u2, υ⟩ and ⟨ku, υ⟩ = k⟨u, υ⟩
On the other hand, applying [I1*] and [I2*], we obtain
⟨u, kυ⟩ = conjugate of ⟨kυ, u⟩ = conjugate of (k⟨υ, u⟩) = k̄ · (conjugate of ⟨υ, u⟩) = k̄⟨u, υ⟩
That is, we must take the conjugate of a complex number when it is taken out of the second position of a complex inner product. In fact (Problem 7.47), the inner product is conjugate linear in the second position; that is,
⟨u, aυ1 + bυ2⟩ = ā⟨u, υ1⟩ + b̄⟨u, υ2⟩
Combining linearity in the first position and conjugate linearity in the second position, we obtain, by induction,
⟨Σi aiui, Σj bjυj⟩ = Σi,j aib̄j⟨ui, υj⟩
The following remarks are in order.
Remark 1: Axiom [I1*] by itself implies that ⟨0, 0⟩ = ⟨0υ, 0⟩ = 0⟨υ, 0⟩ = 0. Accordingly, [I1*], [I2*], and [I3*] are equivalent to [I1*], [I2*], and the following axiom:
If u ≠ 0, then ⟨u, u⟩ is positive.
That is, a function satisfying [I1*], [I2*], and the above axiom is a (complex) inner product on V.
Remark 2: By [I2*], ⟨u, u⟩ equals its own complex conjugate. Thus, ⟨u, u⟩ must be real. By [I3*], ⟨u, u⟩ must be nonnegative, and hence, its positive real square root exists. As with real inner product spaces, we define ||u|| = √⟨u, u⟩ to be the norm or length of u.
Remark 3: In addition to the norm, we define the notions of orthogonality, orthogonal complement, and orthogonal and orthonormal sets as before. In fact, the definitions of distance and Fourier coefficient and projections are the same as in the real case.
EXAMPLE 7.15 (Complex Euclidean Space Cn). Let V = Cn, and let u = (zi) and υ = (wi) be vectors in Cn. Then
⟨u, υ⟩ = z1w̄1 + z2w̄2 + … + znw̄n
is an inner product on V, called the usual or standard inner product on Cn. V with this inner product is called Complex Euclidean Space. We assume this inner product on Cn unless otherwise stated or implied. Assuming u and υ are column vectors, the above inner product may be defined by
⟨u, υ⟩ = uTῡ
where, as with matrices, ῡ means the conjugate of each element of υ. If u and υ are real, we have w̄i = wi. In this case, the inner product reduces to the analogous one on Rn.
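A NumPy sketch of the standard inner product on Cn (the sample vectors are arbitrary illustrations); note that the conjugate is applied to the second vector, matching the convention above.

```python
# Sketch: the standard inner product on C^n, <u, v> = sum z_i * conj(w_i).
import numpy as np

u = np.array([1 + 1j, 3 + 0j, 4 - 1j])
v = np.array([3 - 4j, 1 + 1j, 2j])

inner = np.sum(u * np.conj(v))                  # <u, v>, conjugate on the second vector
norm_u = np.sqrt(np.sum(u * np.conj(u)).real)   # ||u|| is real and nonnegative
print(inner, norm_u)
```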
(a) Let V be the vector space of complex continuous functions on the (real) interval a ≤ t ≤ b. Then the following is the usual inner product on V:
⟨f, g⟩ = ∫ₐᵇ f(t) ḡ(t) dt
(b) Let U be the vector space of m × n matrices over C. Suppose A = (zij) and B = (wij) are elements of U. Then the following is the usual inner product on U:
⟨A, B⟩ = tr(BHA) = Σi,j zijw̄ij
As usual, BH = B̄T; that is, BH is the conjugate transpose of B.
The following is a list of theorems for complex inner product spaces that are analogous to those for the real case. Here a Hermitian matrix A (i.e., one where AH = A) plays the same role that a symmetric matrix A (i.e., one where AT = A) plays in the real case. (Theorem 7.18 is proved in Problem 7.50.)
THEOREM 7.18: (Cauchy–Schwarz) Let V be a complex inner product space. Then
|⟨u, υ⟩| ≤ ||u|| ||υ||
THEOREM 7.19: Let W be a subspace of a complex inner product space V. Then V = W ⊕ W⊥.
THEOREM 7.20: Suppose {u1, u2, …, un} is a basis for a complex inner product space V. Then, for any υ ∈ V,
THEOREM 7.21: Suppose {u1, u2, …, un} is a basis for a complex inner product space V. Let A = [aij] be the complex matrix defined by aij = ⟨ui, uj⟩. Then, for any u, υ ∈ V,
where [u] and [υ] are the coordinate column vectors in the given basis {ui}. (Remark: This matrix A is said to represent the inner product on V.)
THEOREM 7.22: Let A be a Hermitian matrix (i.e., AH = A) such that XTAX̄ is real and positive for every nonzero vector X ∈ Cn. Then ⟨u, υ⟩ = uTAῡ is an inner product on Cn.
THEOREM 7.23: Let A be the matrix that represents an inner product on V. Then A is Hermitian, and XTAX̄ is real and positive for any nonzero vector X in Cn.
We begin with a definition.
DEFINITION: Let V be a real or complex vector space. Suppose to each υ ∈ V there is assigned a real number, denoted by ||υ||. This function ||·|| is called a norm on V if it satisfies the following axioms:
[N1] ||υ|| ≥ 0; and ||υ|| = 0 if and only if υ = 0.
[N2] ||kυ|| = |k|||υ||.
[N3] ||u + υ|| ≤ ||u||+||υ||.
A vector space V with a norm is called a normed vector space.
Suppose V is a normed vector space. The distance between two vectors u and υ in V is denoted and defined by
The following theorem (proved in Problem 7.56) is the main reason why d (u, υ) is called the distance between u and υ.
THEOREM 7.24: Let V be a normed vector space. Then the function d(u, υ) = ||u − υ|| satisfies the following three axioms of a metric space:
[M1] d(u, υ) ≥ 0; and d (u, v) = 0 if and only if u = υ.
[M2] d(u, υ) = d(υ, u).
[M3] d(u, υ) ≤ d (u, w)+ d (w, υ).
Suppose V is an inner product space. Recall that the norm of a vector υ in V is defined by
||υ|| = √⟨υ, υ⟩
One can prove (Theorem 7.2) that this norm satisfies [N1], [N2], and [N3]. Thus, every inner product space V is a normed vector space. On the other hand, there may be norms on a vector space V that do not come from an inner product on V, as shown below.
The following define three important norms on Rn and Cn, where υ = (a1, a2, …, an):
||υ||∞ = max(|ai|), ||υ||1 = |a1| + |a2| + … + |an|, ||υ||2 = √(|a1|² + |a2|² + … + |an|²)
(Note that subscripts are used to distinguish between the three norms.) The norms ||·||∞, ||·||1, and ||·||2 are called the infinity-norm, one-norm, and two-norm, respectively. Observe that ||·||2 is the norm on Rn (respectively, Cn) induced by the usual inner product on Rn (respectively, Cn). We will let d∞, d1, d2 denote the corresponding distance functions.
EXAMPLE 7.17 Consider vectors u = (1, −5, 3) and υ = (4, 2, −3) in R3.
(a) The infinity norm chooses the maximum of the absolute values of the components. Hence,
||u||∞ = 5 and ||υ||∞ = 4
(b) The one-norm adds the absolute values of the components. Thus,
||u||1 = 1 + 5 + 3 = 9 and ||υ||1 = 4 + 2 + 3 = 9
(c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm induced by the usual inner product on R3). Thus,
||u||2 = √(1 + 25 + 9) = √35 and ||υ||2 = √(16 + 4 + 9) = √29
(d) Because u − υ = (1 − 4, −5 − 2, 3 + 3) = (−3, −7, 6), we have
d∞(u, υ) = 7, d1(u, υ) = 3 + 7 + 6 = 16, d2(u, υ) = √(9 + 49 + 36) = √94
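The three norms and distances of Example 7.17 can be checked with NumPy's norm function, as sketched below.

```python
# Sketch: the infinity-, one-, and two-norms of Example 7.17 via NumPy.
import numpy as np

u = np.array([1., -5., 3.])
v = np.array([4., 2., -3.])

for name, order in [("infinity", np.inf), ("one", 1), ("two", 2)]:
    print(name, np.linalg.norm(u, order), np.linalg.norm(u - v, order))
```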
EXAMPLE 7.18 Consider the Cartesian plane R2 shown in Fig. 7-4.
(a) Let D1 be the set of points u = (x, y) in R2 such that ||u||2 = 1. Then D1 consists of the points (x, y) such that x² + y² = 1. Thus, D1 is the unit circle, as shown in Fig. 7-4.
(b) Let D2 be the set of points u = (x, y) in R2 such that ||u||1 = 1. Then D2 consists of the points (x, y) such that ||u||1 = |x| + |y| = 1. Thus, D2 is the diamond inside the unit circle, as shown in Fig. 7-4.
(c) Let D3 be the set of points u = (x, y) in R2 such that ||u||∞ = 1. Then D3 consists of the points (x, y) such that ||u||∞ = max(|x|, |y|) = 1. Thus, D3 is the square circumscribing the unit circle, as shown in Fig. 7-4.
Consider the vector space V = C[a, b] of real continuous functions on the interval a ≤ t ≤ b. Recall that the following defines an inner product on V:
⟨f, g⟩ = ∫ₐᵇ f(t)g(t) dt
Accordingly, the above inner product defines the following norm on V = C[a, b] (which is analogous to the ||·||2 norm on Rn):
||f||2 = √(∫ₐᵇ [f(t)]² dt)
The following define the other norms on V = C[a, b]:
||f||1 = ∫ₐᵇ |f(t)| dt and ||f||∞ = max(|f(t)|) for a ≤ t ≤ b
There are geometrical descriptions of these two norms and their corresponding distance functions, which are described below.
The first norm is pictured in Fig. 7-5. Here ||f||1 is the area between the graph of |f(t)| and the t-axis, and d1(f, g) is the area between the graphs of f(t) and g(t). This norm is analogous to the norm ||·||1 on Rn.
The second norm is pictured in Fig. 7-6. Here ||f||∞ is the maximum vertical distance between the graph of f(t) and the t-axis, and d∞(f, g) is the maximum vertical distance between the graphs of f(t) and g(t). This norm is analogous to the norm ||·||∞ on Rn.
SOLVED PROBLEMS
7.1. Expand: (a) ⟨5u1 + 8u2, 6υ1 − 7υ2⟩, (b) ⟨3u + 5υ, 4u − 6υ⟩, (c) ||2u − 3υ||2.
Use linearity in both positions and, when possible, symmetry, ⟨u, υ⟩ = ⟨υ, u⟩.
(a) Take the inner product of each term on the left with each term on the right:
⟨5u1 + 8u2, 6υ1 − 7υ2⟩ = 30⟨u1, υ1⟩ − 35⟨u1, υ2⟩ + 48⟨u2, υ1⟩ − 56⟨u2, υ2⟩
[Remark: Observe the similarity between the above expansion and the expansion (5a–8b)(6c–7d) in ordinary algebra.]
7.2. Consider vectors u = (1, 2, 4), υ = (2, −3, 5), w = (4, 2, −3) in R3. Find (a) u · υ, (b) u · w, (c) υ · w, (d) (u + υ) · w, (e) ||u||, (f) ||υ||.
(a) Multiply corresponding components and add to get u · υ = 2 − 6 + 20 = 16.
(b) u · w = 4 + 4 − 12 = −4.
(c) υ · w = 8 − 6 − 15 = −13.
(d) First find u + υ = (3, −1, 9). Then (u + υ) · w = 12 − 2 − 27 = −17. Alternatively, using [I1], (u + υ) · w = u · w + υ · w = −4 − 13 = −17.
(e) First find ||u||2 by squaring the components of u and adding:
||u||2 = 1 + 4 + 16 = 21, and so ||u|| = √21
(f) ||υ||2 = 4 + 9 + 25 = 38, and so ||υ|| = √38.
7.3. Verify that the following defines an inner product in R2:
⟨u, υ⟩ = x1y1 − x1y2 − x2y1 + 3x2y2, where u = (x1, x2), υ = (y1, y2)
We argue via matrices. We can write ⟨u, υ⟩ in matrix notation as follows:
⟨u, υ⟩ = uTAυ, where A = [1, −1; −1, 3]
Because A is real and symmetric, we need only show that A is positive definite. The diagonal elements 1 and 3 are positive, and the determinant |A| = 3 − 1 = 2 is positive. Thus, by Theorem 7.14, A is positive definite. Accordingly, by Theorem 7.15, ⟨u, υ⟩ is an inner product.
7.4. Consider the vectors u = (1, 5) and υ = (3, 4) in R2. Find
(a) ⟨u, υ⟩ with respect to the usual inner product in R2.
(b) ⟨u, υ⟩ with respect to the inner product in R2 in Problem 7.3.
(c) ||υ|| using the usual inner product in R2.
(d) ||υ|| using the inner product in R2 in Problem 7.3.
(a) ⟨u, υ⟩ = 3 + 20 = 23.
(b) ⟨u, υ⟩ = 1 · 3 − 1 · 4 − 5 · 3 + 3 · 5 · 4 = 3 − 4 − 15 + 60 = 44.
(c) ||υ||2 = ⟨υ, υ⟩ = ⟨(3, 4), (3, 4)⟩ = 9 + 16 = 25; hence, ||υ|| = 5.
(d) ||υ||2 = ⟨υ, υ⟩ = ⟨(3, 4), (3, 4)⟩ = 9 − 12 − 12 + 48 = 33; hence, ||υ|| = √33.
7.5. Consider the following polynomials in P(t) with the inner product :
(a) Find ⟨f, g⟩ and ⟨f, h⟩.
(b) Find ||f|| and ||g||.
(c) Normalize f and g.
(b)
(c) Because and g is already a unit vector, we have
7.6. Find cos θ where θ is the angle between:
(a) u = (1, 3, −5, 4) and υ = (2, −3, 4, 1) in R4,
(a) Compute:
Thus,
(b) Use ⟨A, B⟩ = Σi,j aijbij, the sum of the products of corresponding entries.
Use ||A||2 = ⟨A, A⟩ = Σi,j aij², the sum of the squares of all the elements of A.
Thus,
7.7. Verify each of the following:
(a) Parallelogram Law (Fig. 7-7): ||u + υ||2 + ||u − υ||2 = 2||u||2 + 2||υ||2.
(b) Polar form for ⟨u, υ⟩ (which shows the inner product can be obtained from the norm function):
⟨u, υ⟩ = ¼(||u + υ||2 − ||u − υ||2)
Expand as follows to obtain
(1) ||u + υ||2 = ⟨u + υ, u + υ⟩ = ||u||2 + 2⟨u, υ⟩ + ||υ||2
(2) ||u − υ||2 = ⟨u − υ, u − υ⟩ = ||u||2 − 2⟨u, υ⟩ + ||υ||2
Add (1) and (2) to get the Parallelogram Law (a). Subtract (2) from (1) to obtain
||u + υ||2 − ||u − υ||2 = 4⟨u, υ⟩
Divide by 4 to obtain the (real) polar form (b).
7.8. Prove Theorem 7.1 (Cauchy–Schwarz): For u and υ in a real inner product space V, ⟨u, υ⟩2 ≤ ⟨u, u⟩⟨υ, υ⟩ or, equivalently, |⟨u, υ⟩| ≤ ||u|| ||υ||.
For any real number t,
⟨tu + υ, tu + υ⟩ = t2⟨u, u⟩ + 2t⟨u, υ⟩ + ⟨υ, υ⟩ = t2||u||2 + 2t⟨u, υ⟩ + ||υ||2
Let a = ||u||2, b = 2⟨u, υ⟩, c = ||υ||2. Because ||tu + υ||2 ≥ 0, we have
at2 + bt + c ≥ 0
for every value of t. This means that the quadratic polynomial cannot have two distinct real roots, which implies that b2 − 4ac ≤ 0 or b2 ≤ 4ac. Thus,
4⟨u, υ⟩2 ≤ 4||u||2||υ||2
Dividing by 4 gives our result.
7.9. Prove Theorem 7.2: The norm in an inner product space V satisfies
(a) [N1] ||υ|| ≥ 0; and ||υ|| = 0 if and only if υ = 0.
(b) [N2] ||kυ|| = |k|||υ||.
(c) [N3] ||u + υ|| ≤ ||u|| + ||υ||.
(a) If υ ≠ 0, then ⟨υ, υ⟩ > 0, and hence, ||υ|| = √⟨υ, υ⟩ > 0. If υ = 0, then ⟨0, 0⟩ = 0. Consequently, ||0|| = 0. Thus, [N1] is true.
(b) We have ||kυ||2 = ⟨kυ, kυ⟩ = k2⟨υ, υ⟩ = k2||υ||2. Taking the square root of both sides gives [N2].
(c) Using the Cauchy–Schwarz inequality, we obtain
||u + υ||2 = ⟨u + υ, u + υ⟩ = ||u||2 + 2⟨u, υ⟩ + ||υ||2 ≤ ||u||2 + 2||u|| ||υ|| + ||υ||2 = (||u|| + ||υ||)2
Taking the square root of both sides yields [N3].
7.10. Find k so that u = (1, 2, k, 3) and υ = (3, k, 7, −5) in R4 are orthogonal.
First find
⟨u, υ⟩ = 3 + 2k + 7k − 15 = 9k − 12
Then set ⟨u, υ⟩ = 9k − 12 = 0 to obtain k = 4/3.
7.11. Let W be the subspace of R5 spanned by u = (1, 2, 3, −1, 2) and υ = (2, 4, 7, 2, −1). Find a basis of the orthogonal complement W⊥ of W.
We seek all vectors w = (x, y, z, s, t) such that
⟨w, u⟩ = x + 2y + 3z − s + 2t = 0 and ⟨w, υ⟩ = 2x + 4y + 7z + 2s − t = 0
Eliminating x from the second equation, we find the equivalent system
x + 2y + 3z − s + 2t = 0, z + 4s − 5t = 0
The free variables are y, s, and t. Therefore,
(1) Set y = −1, s = 0, t = 0 to obtain the solution w1 = (2, −1, 0, 0, 0).
(2) Set y = 0, s = 1, t = 0 to find the solution w2 = (13, 0, −4, 1, 0).
(3) Set y = 0, s = 0, t = 1 to obtain the solution w3 = (−17, 0, 5, 0, 1).
The set {w1, w2, w3} is a basis of W⊥.
7.12. Let w = (1, 2, 3, 1) be a vector in R4. Find an orthogonal basis for w⊥.
Find a nonzero solution of x + 2y + 3z + t = 0, say υ1 = (0, 0, 1, −3). Now find a nonzero solution of the system
x + 2y + 3z + t = 0, z − 3t = 0
say υ2 = (0, −5, 3, 1). Last, find a nonzero solution of the system
x + 2y + 3z + t = 0, z − 3t = 0, −5y + 3z + t = 0
say υ3 = (−14, 2, 3, 1). Thus, υ1, υ2, υ3 form an orthogonal basis for w⊥.
7.13. Let S consist of the following vectors in R4:
(a) Show that S is orthogonal and a basis of R4.
(b) Find the coordinates of an arbitrary vector υ = (a, b, c, d) in R4 relative to the basis S.
(a) Compute
Thus, S is orthogonal, and S is linearly independent. Accordingly, S is a basis for R4 because any four linearly independent vectors form a basis of R4.
(b) Because S is orthogonal, we need only find the Fourier coefficients of υ with respect to the basis vectors, as in Theorem 7.7. Thus,
are the coordinates of υ with respect to the basis S.
7.14. Suppose S, S1, S2 are subsets of V. Prove the following (where S⊥⊥ means (S⊥)⊥):
(a) S ⊆ S⊥⊥.
(b) If S1 ⊆ S2, then S2⊥ ⊆ S1⊥.
(c) S⊥ = span(S)⊥.
(a) Let w ∈ S. Then ⟨w, υ⟩ = 0 for every υ ∈ S⊥; hence, w ∈ S⊥⊥. Accordingly, S ⊆ S⊥⊥.
(b) Let w ∈ S2⊥. Then ⟨w, υ⟩ = 0 for every υ ∈ S2. Because S1 ⊆ S2, ⟨w, υ⟩ = 0 for every υ ∈ S1. Thus, w ∈ S1⊥, and hence, S2⊥ ⊆ S1⊥.
(c) Because S ⊆ span(S), part (b) gives us span(S)⊥ ⊆ S⊥. Suppose u ∈ S⊥ and υ ∈ span(S). Then there exist w1, w2, …, wk in S such that υ = a1w1 + a2w2 + … + akwk. Then, using u ∈ S⊥, we have
⟨u, υ⟩ = ⟨u, a1w1 + a2w2 + … + akwk⟩ = a1⟨u, w1⟩ + a2⟨u, w2⟩ + … + ak⟨u, wk⟩ = a1 · 0 + a2 · 0 + … + ak · 0 = 0
Thus, u ∈ span(S)⊥. Accordingly, S⊥ ⊆ span(S)⊥. Both inclusions give S⊥ = span(S)⊥.
7.15. Prove Theorem 7.5: Suppose S is an orthogonal set of nonzero vectors. Then S is linearly independent.
Suppose S = {u1, u2, …, ur} and suppose
a1u1 + a2u2 + … + arur = 0   (1)
Taking the inner product of (1) with u1, we get
0 = ⟨0, u1⟩ = ⟨a1u1 + a2u2 + … + arur, u1⟩ = a1⟨u1, u1⟩ + a2⟨u2, u1⟩ + … + ar⟨ur, u1⟩ = a1⟨u1, u1⟩
Because u1 ≠ 0, we have ⟨u1, u1⟩ ≠ 0. Thus, a1 = 0. Similarly, for i = 2, …, r, taking the inner product of (1) with ui,
0 = ⟨0, ui⟩ = ⟨a1u1 + … + arur, ui⟩ = ai⟨ui, ui⟩
But ⟨ui, ui⟩ ≠ 0, and hence, every ai = 0. Thus, S is linearly independent.
7.16. Prove Theorem 7.6 (Pythagoras): Suppose {u1, u2, …, ur} is an orthogonal set of vectors. Then
||u1 + u2 + … + ur||2 = ||u1||2 + ||u2||2 + … + ||ur||2
Expanding the inner product, we have
||u1 + u2 + … + ur||2 = ⟨u1 + u2 + … + ur, u1 + u2 + … + ur⟩ = ⟨u1, u1⟩ + ⟨u2, u2⟩ + … + ⟨ur, ur⟩ + Σi≠j ⟨ui, uj⟩
The theorem follows from the fact that ⟨ui, ui⟩ = ||ui||2 and ⟨ui, uj⟩ = 0 for i ≠ j.
7.17. Prove Theorem 7.7: Let {u1, u2, …, un} be an orthogonal basis of V. Then for any υ ∈ V,
υ = [⟨υ, u1⟩/⟨u1, u1⟩]u1 + [⟨υ, u2⟩/⟨u2, u2⟩]u2 + … + [⟨υ, un⟩/⟨un, un⟩]un
Suppose υ = k1u1 + k2u2 + … + knun. Taking the inner product of both sides with u1 yields
⟨υ, u1⟩ = ⟨k1u1 + k2u2 + … + knun, u1⟩ = k1⟨u1, u1⟩ + k2⟨u2, u1⟩ + … + kn⟨un, u1⟩ = k1⟨u1, u1⟩
Thus, k1 = ⟨υ, u1⟩/⟨u1, u1⟩. Similarly, for i = 2, …, n,
⟨υ, ui⟩ = ki⟨ui, ui⟩
Thus, ki = ⟨υ, ui⟩/⟨ui, ui⟩. Substituting for ki in the equation υ = k1u1 + … + knun, we obtain the desired result.
7.18. Suppose E = {e1, e2, …, en} is an orthonormal basis of V. Prove
(a) For any u ∈ V, we have u = ⟨u, e1⟩e1 + ⟨u, e2⟩e2 + … + ⟨u, en⟩en.
(b) ⟨a1e1 + … + anen, b1e1 + … + bnen⟩ = a1b1 + a2b2 + … + anbn.
(c) For any u, υ ∈ V, we have ⟨u, υ⟩ = ⟨u, e1⟩⟨υ, e1⟩ + … + ⟨u, en⟩⟨υ, en⟩.
(a) Suppose u = k1e1 + k2e2 + … + knen. Taking the inner product of u with e1,
⟨u, e1⟩ = k1⟨e1, e1⟩ + k2⟨e2, e1⟩ + … + kn⟨en, e1⟩ = k1(1) + k2(0) + … + kn(0) = k1
Similarly, for i = 2, …, n, ⟨u, ei⟩ = ki.
Substituting ⟨u, ei⟩ for ki in the equation u = k1e1 + … + knen, we obtain the desired result.
(b) We have
⟨a1e1 + … + anen, b1e1 + … + bnen⟩ = Σi,j aibj⟨ei, ej⟩
But ⟨ei, ej⟩ = 0 for i ≠ j, and ⟨ei, ej⟩ = 1 for i = j. Hence, as required,
⟨a1e1 + … + anen, b1e1 + … + bnen⟩ = a1b1 + a2b2 + … + anbn
(c) By part (a), we have
u = ⟨u, e1⟩e1 + … + ⟨u, en⟩en and υ = ⟨υ, e1⟩e1 + … + ⟨υ, en⟩en
Thus, by part (b),
⟨u, υ⟩ = ⟨u, e1⟩⟨υ, e1⟩ + ⟨u, e2⟩⟨υ, e2⟩ + … + ⟨u, en⟩⟨υ, en⟩
7.19. Suppose w ≠ 0. Let υ be any vector in V. Show that
c = ⟨υ, w⟩/⟨w, w⟩ = ⟨υ, w⟩/||w||2
is the unique scalar such that υ′ = υ − cw is orthogonal to w.
In order for υ′ to be orthogonal to w we must have
⟨υ − cw, w⟩ = 0 or ⟨υ, w⟩ − c⟨w, w⟩ = 0
Thus, c = ⟨υ, w⟩/⟨w, w⟩. Conversely, suppose c = ⟨υ, w⟩/⟨w, w⟩. Then
⟨υ − cw, w⟩ = ⟨υ, w⟩ − c⟨w, w⟩ = ⟨υ, w⟩ − [⟨υ, w⟩/⟨w, w⟩]⟨w, w⟩ = 0
7.20. Find the Fourier coefficient c and the projection of υ = (1, −2, 3, −4) along w = (1, 2, 1, 2) in R4.
Compute ⟨υ, w⟩ = 1 − 4 + 3 − 8 = −8 and ||w||2 = 1 + 4 + 1 + 4 = 10. Then
c = −8/10 = −4/5 and proj(υ, w) = cw = (−4/5, −8/5, −4/5, −8/5)
7.21. Consider the subspace U of R4 spanned by the vectors:
Find (a) an orthogonal basis of U; (b) an orthonormal basis of U.
(a) Use the Gram–Schmidt algorithm. Begin by setting w1 = u = (1, 1, 1, 1). Next find
Set w2 = (−1, −1, 0, 2). Then find
Clear fractions to obtain w3 = (1, 3, −6, 2). Then w1, w2, w3 form an orthogonal basis of U.
(b) Normalize the orthogonal basis consisting of w1, w2, w3. Because ||w1||2 = 4, ||w2||2 = 6, and ||w3||2 = 50, the following vectors form an orthonormal basis of U:
u1 = (1/2)(1, 1, 1, 1), u2 = (1/√6)(−1, −1, 0, 2), u3 = (1/√50)(1, 3, −6, 2)
7.22. Consider the vector space P(t) with inner product ⟨f, g⟩ = ∫ f(t)g(t) dt taken from 0 to 1. Apply the Gram–Schmidt algorithm to the set {1, t, t2} to obtain an orthogonal set {f0, f1, f2} with integer coefficients.
First set f0 = 1. Then find
t − [⟨t, 1⟩/⟨1, 1⟩]1 = t − 1/2
Clear fractions to obtain f1 = 2t − 1. Then find
t2 − [⟨t2, 1⟩/⟨1, 1⟩]1 − [⟨t2, 2t − 1⟩/⟨2t − 1, 2t − 1⟩](2t − 1) = t2 − 1/3 − (1/2)(2t − 1) = t2 − t + 1/6
Clear fractions to obtain f2 = 6t2 − 6t + 1. Thus, {1, 2t − 1, 6t2 − 6t + 1} is the required orthogonal set.
7.23. Suppose υ = (1, 3, 5, 7). Find the projection of υ onto W or, in other words, find w ∈ W that minimizes ||υ − w||, where W is the subspace of R4 spanned by
(a) u1 = (1, 1, 1, 1) and u2 = (1, −3, 4, −2),
(b) υ1 = (1, 1, 1, 1) and υ2 = (1, 2, 3, 2).
(a) Because u1 and u2 are orthogonal, we need only compute the Fourier coefficients:
c1 = ⟨υ, u1⟩/⟨u1, u1⟩ = 16/4 = 4, c2 = ⟨υ, u2⟩/⟨u2, u2⟩ = −2/30 = −1/15
Then w = proj(υ, W) = c1u1 + c2u2 = 4(1, 1, 1, 1) − (1/15)(1, −3, 4, −2) = (59/15, 63/15, 56/15, 62/15).
(b) Because υ1 and υ2 are not orthogonal, first apply the Gram–Schmidt algorithm to find an orthogonal basis for W. Set w1 = υ1 = (1, 1, 1, 1). Then find
υ2 − [⟨υ2, w1⟩/⟨w1, w1⟩]w1 = (1, 2, 3, 2) − (8/4)(1, 1, 1, 1) = (−1, 0, 1, 0)
Set w2 = (−1, 0, 1, 0). Now compute
c1 = ⟨υ, w1⟩/⟨w1, w1⟩ = 16/4 = 4 and c2 = ⟨υ, w2⟩/⟨w2, w2⟩ = 4/2 = 2
Then w = proj(υ, W) = c1w1 + c2w2 = 4(1, 1, 1, 1) + 2(−1, 0, 1, 0) = (2, 4, 6, 4).
7.24. Suppose w1 and w2 are nonzero orthogonal vectors. Let υ be any vector in V. Find c1 and c2 so that υ′ is orthogonal to w1 and w2, where υ′ = υ − c1w1 − c2w2.
If υ′ is orthogonal to w1, then
0 = ⟨υ − c1w1 − c2w2, w1⟩ = ⟨υ, w1⟩ − c1⟨w1, w1⟩ − c2⟨w2, w1⟩ = ⟨υ, w1⟩ − c1⟨w1, w1⟩
Thus, c1 = ⟨υ, w1⟩/⟨w1, w1⟩. (That is, c1 is the component of υ along w1.) Similarly, if υ′ is orthogonal to w2, then
0 = ⟨υ − c1w1 − c2w2, w2⟩ = ⟨υ, w2⟩ − c2⟨w2, w2⟩
Thus, c2 = ⟨υ, w2⟩/⟨w2, w2⟩. (That is, c2 is the component of υ along w2.)
7.25. Prove Theorem 7.8: Suppose w1, w2, …, wr form an orthogonal set of nonzero vectors in V. Let υ ∈ V. Define
υ′ = υ − (c1w1 + c2w2 + … + crwr), where ci = ⟨υ, wi⟩/⟨wi, wi⟩
Then υ′ is orthogonal to w1, w2, …, wr.
For i = 1, 2, …, r and using ⟨wi, wj⟩ = 0 for i ≠ j, we have
⟨υ − c1w1 − c2w2 − … − crwr, wi⟩ = ⟨υ, wi⟩ − ci⟨wi, wi⟩ = ⟨υ, wi⟩ − [⟨υ, wi⟩/⟨wi, wi⟩]⟨wi, wi⟩ = 0
The theorem is proved.
7.26. Prove Theorem 7.9: Let {υ1, υ2, …, υn} be any basis of an inner product space V. Then there exists an orthonormal basis {u1, u2, …, un} of V such that the change-of-basis matrix from {υi} to {ui} is triangular; that is, for k = 1, 2, …, n,
uk = ak1υ1 + ak2υ2 + … + akkυk
The proof uses the Gram–Schmidt algorithm and Remarks 1 and 3 of Section 7.7. That is, apply the algorithm to {υi} to obtain an orthogonal basis {w1, …, wn}, and then normalize {wi} to obtain an orthonormal basis {ui} of V. The specific algorithm guarantees that each wk is a linear combination of υ1, …, υk, and hence, each uk is a linear combination of υ1, …, υk.
7.27. Prove Theorem 7.10: Suppose S = {w1, w2, …, wr} is an orthogonal basis for a subspace W of V. Then one may extend S to an orthogonal basis for V; that is, one may find vectors wr+1, …, wn such that {w1, w2, …, wn} is an orthogonal basis for V.
Extend S to a basis S′ = {w1, …, wr, υr+1, …, υn} for V. Applying the Gram–Schmidt algorithm to S′, we first obtain w1, w2, …, wr because S is orthogonal, and then we obtain vectors wr+1, …, wn, where {w1, w2, …, wn} is an orthogonal basis for V. Thus, the theorem is proved.
7.28. Prove Theorem 7.4: Let W be a subspace of V. Then V = W ⊕ W⊥.
By Theorem 7.9, there exists an orthogonal basis {u1, …, ur} of W, and by Theorem 7.10 we can extend it to an orthogonal basis {u1, u2, …, un} of V. Hence, ur+1, …, un ∈ W⊥. If υ ∈ V, then
υ = a1u1 + … + arur + ar+1ur+1 + … + anun, where a1u1 + … + arur ∈ W and ar+1ur+1 + … + anun ∈ W⊥
Accordingly, V = W + W⊥.
On the other hand, if w ∈ W ∩ W⊥, then ⟨w, w⟩ = 0. This yields w = 0. Hence, W ∩ W⊥ = {0}.
The two conditions V = W + W⊥ and W ∩ W⊥ = {0} give the desired result V = W ⊕ W⊥.
Remark: Note that we have proved the theorem for the case that V has finite dimension. We remark that the theorem also holds for spaces of arbitrary dimension.
7.29. Suppose W is a subspace of a finite-dimensional space V. Prove that W = W⊥⊥.
By Theorem 7.4, V = W ⊕ W⊥, and also V = W⊥ ⊕ W⊥⊥. Hence,
dim W = dim V − dim W⊥ and dim W⊥⊥ = dim V − dim W⊥
This yields dim W = dim W⊥⊥. But W ⊆ W⊥⊥ (see Problem 7.14). Hence, W = W⊥⊥, as required.
7.30. Prove the following: Suppose w1, w2, …, wr form an orthogonal set of nonzero vectors in V. Let υ be any vector in V and let ci be the component of υ along wi. Then, for any scalars a1, …, ar, we have
||υ − Σ ckwk|| ≤ ||υ − Σ akwk||
That is, Σ ciwi is the closest approximation to υ as a linear combination of w1, …, wr.
By Theorem 7.8, υ − Σ ckwk is orthogonal to every wi and hence orthogonal to any linear combination of w1, w2, …, wr. Therefore, using the Pythagorean theorem and summing from k = 1 to r,
||υ − Σ akwk||2 = ||υ − Σ ckwk + Σ (ck − ak)wk||2 = ||υ − Σ ckwk||2 + ||Σ (ck − ak)wk||2 ≥ ||υ − Σ ckwk||2
The square root of both sides gives our theorem.
7.31. Suppose {e1, e2, …, er} is an orthonormal set of vectors in V. Let υ be any vector in V and let ci be the Fourier coefficient of υ with respect to ei. Prove Bessel’s inequality:
c1² + c2² + … + cr² ≤ ||υ||²
Note that ci = ⟨υ, ei⟩, because ||ei|| = 1. Then, using ⟨ei, ej⟩ = 0 for i ≠ j and summing from k = 1 to r, we get
0 ≤ ||υ − Σ ckek||2 = ⟨υ − Σ ckek, υ − Σ ckek⟩ = ⟨υ, υ⟩ − 2Σ ck⟨υ, ek⟩ + Σ ck² = ⟨υ, υ⟩ − 2Σ ck² + Σ ck² = ⟨υ, υ⟩ − Σ ck²
This gives us our inequality.
7.32. Find an orthogonal matrix P whose first row is u1 = (1/3, 2/3, 2/3).
First find a nonzero vector w2 = (x, y, z) that is orthogonal to u1—that is, for which
0 = ⟨u1, w2⟩ = x/3 + 2y/3 + 2z/3 or x + 2y + 2z = 0
One such solution is w2 = (0, 1, −1). Normalize w2 to obtain the second row of P:
u2 = (0, 1/√2, −1/√2)
Next find a nonzero vector w3 = (x, y, z) that is orthogonal to both u1 and u2—that is, for which
x + 2y + 2z = 0 and y − z = 0
Set z = −1 and find the solution w3 = (4, −1, −1). Normalize w3 and obtain the third row of P; that is,
u3 = (4/√18, −1/√18, −1/√18)
Thus,
P = [1/3, 2/3, 2/3; 0, 1/√2, −1/√2; 4/√18, −1/√18, −1/√18]
We emphasize that the above matrix P is not unique.
7.33. Let A = [1, 1, −1; 1, 3, 4; 7, −5, 2]. Determine whether or not: (a) the rows of A are orthogonal;
(b) A is an orthogonal matrix; (c) the columns of A are orthogonal.
(a) Yes, because (1, 1, −1) · (1, 3, 4) = 1 + 3 − 4 = 0, (1, 1, −1) · (7, −5, 2) = 7 − 5 − 2 = 0, and (1, 3, 4) · (7, −5, 2) = 7 − 15 + 8 = 0.
(b) No, because the rows of A are not unit vectors; for example, ||(1, 1, −1)||2 = 1 + 1 + 1 = 3.
(c) No; for example, (1, 1, 7) · (1, 3, −5) = 1 + 3 − 35 = −31 ≠ 0.
7.34. Let B be the matrix obtained by normalizing each row of A in Problem 7.33.
(a) Find B.
(b) Is B an orthogonal matrix?
(c) Are the columns of B orthogonal?
(a) The rows of A have lengths √3, √26, and √78, respectively. Thus,
B = [1/√3, 1/√3, −1/√3; 1/√26, 3/√26, 4/√26; 7/√78, −5/√78, 2/√78]
(b) Yes, because the rows of B are still orthogonal and are now unit vectors.
(c) Yes, because the rows of B form an orthonormal set of vectors. Then, by Theorem 7.11, the columns of B must automatically form an orthonormal set.
7.35. Prove each of the following:
(a) P is orthogonal if and only if PT is orthogonal.
(b) If P is orthogonal, then P−1 is orthogonal.
(c) If P and Q are orthogonal, then PQ is orthogonal.
(a) We have (PT)T = P. Thus, P is orthogonal if and only if PPT = I if and only if (PT)TPT = I if and only if PT is orthogonal.
(b) We have PT = P−1, because P is orthogonal. Thus, by part (a), P−1 is orthogonal.
(c) We have PT = P−1 and QT = Q−1. Thus, (PQ)(PQ)T = PQQTPT = PQQ−1P−1 = I. Therefore, (PQ)T = (PQ)−1, and so PQ is orthogonal.
7.36. Suppose P is an orthogonal matrix. Show that
(a) ⟨Pu, Pυ⟩ = ⟨u, υ⟩ for any u, υ ∈ V;
(b) ||Pu|| = ||u|| for every u ∈ V.
Use PTP = I and ⟨u, υ⟩ = uTυ.
(a) ⟨Pu, Pυ⟩ = (Pu)T(Pυ) = uTPTPυ = uTυ = ⟨u, υ⟩.
(b) We have
||Pu||2 = ⟨Pu, Pu⟩ = ⟨u, u⟩ = ||u||2
Taking the square root of both sides gives our result.
7.37. Prove Theorem 7.12: Suppose E = {ei} and E′ = {e′i} are orthonormal bases of V. Let P be the change-of-basis matrix from E to E′. Then P is orthogonal.
Suppose
e′i = bi1e1 + bi2e2 + … + binen,  i = 1, …, n   (1)
Using Problem 7.18(b) and the fact that E′ is orthonormal, we get
δij = ⟨e′i, e′j⟩ = bi1bj1 + bi2bj2 + … + binbjn   (2)
Let B = [bij] be the matrix of the coefficients in (1). (Then P = BT.) Suppose BBT = [cij]. Then
cij = bi1bj1 + bi2bj2 + … + binbjn   (3)
By (2) and (3), we have cij = δij. Thus, BBT = I. Accordingly, B is orthogonal, and hence, P = BT is orthogonal.
7.38. Prove Theorem 7.13: Let {e1, …, en} be an orthonormal basis of an inner product space V. Let P = [aij] be an orthogonal matrix. Then the following n vectors form an orthonormal basis for V:
e′i = a1ie1 + a2ie2 + … + anien,  i = 1, 2, …, n
Because {ei} is orthonormal, we get, by Problem 7.18(b),
⟨e′i, e′j⟩ = a1ia1j + a2ia2j + … + anianj = ⟨Ci, Cj⟩
where Ci denotes the ith column of the orthogonal matrix P = [aij]. Because P is orthogonal, its columns form an orthonormal set. This implies ⟨e′i, e′j⟩ = ⟨Ci, Cj⟩ = δij. Thus, {e′1, e′2, …, e′n} is an orthonormal basis.
7.39. Which of the following symmetric matrices are positive definite?
Use Theorem 7.14 that a 2 × 2 real symmetric matrix is positive definite if and only if its diagonal entries are positive and if its determinant is positive.
(a) No, because |A| = 15 − 16 = −1 is negative.
(b) Yes.
(c) No, because the diagonal entry −3 is negative.
(d) Yes.
7.40. Find the values of k that make each of the following matrices positive definite:
(a) First, k must be positive. Also, |A| = 2k − 16 must be positive; that is, 2k − 16 > 0. Hence, k > 8.
(b) We need |B| = 36 − k2 positive; that is, 36 − k2 > 0. Hence, k2 < 36 or −6 < k < 6.
(c) C can never be positive definite, because C has a negative diagonal entry −2.
7.41. Find the matrix A that represents the usual inner product on R2 relative to each of the following bases of R2: (a) {υ1 = (1, 3), υ2 = (2, 5)}; (b) {w1 = (1, 2), w2 = (4, −2)}.
(a) Compute ⟨υ1, υ1⟩ = 1 + 9 = 10, ⟨υ1, υ2⟩ = 2 + 15 = 17, ⟨υ2, υ2⟩ = 4 + 25 = 29. Thus, A = [10, 17; 17, 29].
(b) Compute ⟨w1, w1⟩ = 1 + 4 = 5, ⟨w1, w2⟩ = 4 − 4 = 0, ⟨w2, w2⟩ = 16 + 4 = 20. Thus, A = [5, 0; 0, 20].
(Because the basis vectors are orthogonal, the matrix A is diagonal.)
7.42. Consider the vector space P2(t) with inner product .
(a) Find ⟨f, g⟩, where f(t) = t + 2 and g(t) = t2 − 3t + 4.
(b) Find the matrix A of the inner product with respect to the basis {1, t, t2} of V.
(c) Verify Theorem 7.16 by showing that ⟨f, g⟩ = [f]TA[g] with respect to the basis {1, t, t2}.
(b) Here we use the fact that if r + s = n,
Then . Thus,
(c) We have [f]T = (2, 1, 0) and [g]T = (4, −3, 1) relative to the given basis. Then
7.43. Prove Theorem 7.14: A = [a, b; b, d] is positive definite if and only if a and d are positive and |A| = ad − b2 is positive.
Let u = [x, y]T. Then
f(u) = uTAu = [x, y] [a, b; b, d] [x, y]T = ax2 + 2bxy + dy2
Suppose f(u) > 0 for every u ≠ 0. Then f(1, 0) = a > 0 and f(0, 1) = d > 0. Also, we have f(b, −a) = a(ad − b2) > 0. Because a > 0, we get ad − b2 > 0.
Conversely, suppose a > 0, d > 0, ad − b2 > 0. Completing the square gives us
f(u) = a(x + (b/a)y)2 + [(ad − b2)/a]y2
Accordingly, f(u) > 0 for every u ≠ 0.
7.44. Prove Theorem 7.15: Let A be a real positive definite matrix. Then the function ⟨u, υ⟩ = uTAυ is an inner product on Rn.
For any vectors u1, u2, and υ,
⟨u1 + u2, υ⟩ = (u1 + u2)TAυ = (u1T + u2T)Aυ = u1TAυ + u2TAυ = ⟨u1, υ⟩ + ⟨u2, υ⟩
and, for any scalar k and vectors u, υ,
⟨ku, υ⟩ = (ku)TAυ = kuTAυ = k⟨u, υ⟩
Thus [I1] is satisfied.
Because uTAυ is a scalar, (uTAυ)T = uTAυ. Also, AT = A because A is symmetric. Therefore,
⟨u, υ⟩ = uTAυ = (uTAυ)T = υTATu = υTAu = ⟨υ, u⟩
Thus, [I2] is satisfied.
Last, because A is positive definite, XTAX > 0 for any nonzero X ∈ Rn. Thus, for any nonzero vector υ, ⟨υ, υ⟩ = υTAυ > 0. Also, ⟨0, 0⟩ = 0TA0 = 0. Thus, [I3] is satisfied. Accordingly, the function ⟨u, υ⟩ = uTAυ is an inner product.
7.45. Prove Theorem 7.16: Let A be the matrix representation of an inner product relative to a basis S of V. Then, for any vectors u, υ ∈ V, we have
⟨u, υ⟩ = [u]TA[υ]
Suppose S = {w1, w2, …, wn} and A = [kij]. Hence, kij = ⟨wi, wj⟩. Suppose
u = a1w1 + a2w2 + … + anwn and υ = b1w1 + b2w2 + … + bnwn
Then
⟨u, υ⟩ = Σi,j aibj⟨wi, wj⟩ = Σi,j aibjkij   (1)
On the other hand,
[u]TA[υ] = (a1, a2, …, an)A(b1, b2, …, bn)T = Σi,j aikijbj   (2)
Equations (1) and (2) give us our result.
7.46. Prove Theorem 7.17: Let A be the matrix representation of any inner product on V. Then A is a positive definite matrix.
Because ⟨wi, wj⟩ = ⟨wj, wi⟩ for any basis vectors wi and wj, the matrix A is symmetric. Let X be any nonzero vector in Rn. Then [u] = X for some nonzero vector u ∈ V. Theorem 7.16 tells us that XTAX = [u]TA[u] = ⟨u, u⟩ > 0. Thus, A is positive definite.
7.47. Let V be a complex inner product space. Verify the relation
⟨u, aυ1 + bυ2⟩ = ā⟨u, υ1⟩ + b̄⟨u, υ2⟩
Using [I2*], then [I1*], and then [I2*] again, we find
⟨u, aυ1 + bυ2⟩ = conjugate of ⟨aυ1 + bυ2, u⟩ = conjugate of (a⟨υ1, u⟩ + b⟨υ2, u⟩) = ā · (conjugate of ⟨υ1, u⟩) + b̄ · (conjugate of ⟨υ2, u⟩) = ā⟨u, υ1⟩ + b̄⟨u, υ2⟩
7.48. Suppose ⟨u, υ⟩ = 3 + 2i in a complex inner product space V. Find
7.49. Find the Fourier coefficient (component) c and the projection cw of υ = (3 + 4i, 2 − 3i) along w = (5 + i, 2i) in C2.
Recall that c = ⟨υ, w⟩/⟨w, w⟩. Compute
⟨υ, w⟩ = (3 + 4i)(5 − i) + (2 − 3i)(−2i) = (19 + 17i) + (−6 − 4i) = 13 + 13i
⟨w, w⟩ = |5 + i|2 + |2i|2 = 25 + 1 + 4 = 30
Thus, c = (13 + 13i)/30. Accordingly,
proj(υ, w) = cw = ((13 + 13i)/30)(5 + i, 2i) = ((26 + 39i)/15, (−13 + 13i)/15)
7.50. Prove Theorem 7.18 (Cauchy–Schwarz): Let V be a complex inner product space. Then |⟨u, υ⟩| ≤ ||u|| ||υ||.
If υ = 0, the inequality reduces to 0 ≤ 0 and hence is valid. Now suppose υ ≠ 0. Using zz̄ = |z|2 (for any complex number z) and the conjugate symmetry of the inner product, we expand ||u − ⟨u, υ⟩tυ||2 ≥ 0, where t is any real value:
0 ≤ ||u − ⟨u, υ⟩tυ||2 = ⟨u − ⟨u, υ⟩tυ, u − ⟨u, υ⟩tυ⟩ = ⟨u, u⟩ − 2t|⟨u, υ⟩|2 + t2|⟨u, υ⟩|2⟨υ, υ⟩
Set t = 1/||υ||2 to find 0 ≤ ||u||2 − |⟨u, υ⟩|2/||υ||2, from which |⟨u, υ⟩|2 ≤ ||u||2||υ||2. Taking the square root of both sides, we obtain the required inequality.
7.51. Find an orthogonal basis for u⊥ in C3 where u = (1, i, 1 + i).
Here u⊥ consists of all vectors s = (x, y, z) such that
⟨s, u⟩ = x − iy + (1 − i)z = 0
Find one solution, say w1 = (0, 1 − i, i). Then find a solution of the system
x − iy + (1 − i)z = 0, (1 + i)y − iz = 0
Here z is a free variable. Set z = 1 to obtain y = i/(1 + i) = (1 + i)/2 and x = (3i − 3)/2. Multiplying by 2 yields the solution w2 = (3i − 3, 1 + i, 2). The vectors w1 and w2 form an orthogonal basis for u⊥.
7.52. Find an orthonormal basis of the subspace W of C3 spanned by υ1 = (1, i, 0) and υ2 = (1, 2, 1 − i).
Apply the Gram–Schmidt algorithm. Set w1 = υ1 = (1, i, 0). Compute
υ2 − [⟨υ2, w1⟩/⟨w1, w1⟩]w1 = (1, 2, 1 − i) − [(1 − 2i)/2](1, i, 0) = ((1 + 2i)/2, (2 − i)/2, 1 − i)
Multiply by 2 to clear fractions, obtaining w2 = (1 + 2i, 2 − i, 2 − 2i). Next find ||w1||2 = 2 and then ||w2||2 = 5 + 5 + 8 = 18. Normalizing {w1, w2}, we obtain the following orthonormal basis of W:
u1 = (1/√2)(1, i, 0), u2 = (1/√18)(1 + 2i, 2 − i, 2 − 2i)
7.53. Find the matrix P that represents the usual inner product on C3 relative to the basis {1, i, 1 − i }.
Compute the following six inner products:
⟨1, 1⟩ = 1, ⟨1, i⟩ = ī = −i, ⟨1, 1 − i⟩ = 1 + i, ⟨i, i⟩ = 1, ⟨i, 1 − i⟩ = i(1 + i) = −1 + i, ⟨1 − i, 1 − i⟩ = 2
Then, using the fact that ⟨υj, υi⟩ is the conjugate of ⟨υi, υj⟩, we obtain
P = [1, −i, 1 + i; i, 1, −1 + i; 1 − i, −1 − i, 2]
(As expected, P is Hermitian; that is, PH = P.)
7.54. Consider vectors u = (1, 3, −6, 4) and υ = (3, −5, 1, −2) in R4. Find
(a) ||u||∞ and ||υ||∞, (b) ||u||1 and ||υ||1, (c) ||u||2 and ||υ||2,
(d) d∞(u, υ), d1(u, υ), d2(u, υ).
(a) The infinity norm chooses the maximum of the absolute values of the components. Hence,
||u||∞ = 6 and ||υ||∞ = 5
(b) The one-norm adds the absolute values of the components. Thus,
||u||1 = 1 + 3 + 6 + 4 = 14 and ||υ||1 = 3 + 5 + 1 + 2 = 11
(c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm induced by the usual inner product on R4). Thus,
||u||2 = √(1 + 9 + 36 + 16) = √62 and ||υ||2 = √(9 + 25 + 1 + 4) = √39
(d) First find u − υ = (−2, 8, −7, 6). Then
d∞(u, υ) = 8, d1(u, υ) = 2 + 8 + 7 + 6 = 23, d2(u, υ) = √(4 + 64 + 49 + 36) = √153
7.55. Consider the function f(t) = t2 − 4t in C[0, 3].
(a) Find ||f||∞, (b) Plot f(t) in the plane R2, (c) Find ||f||1, (d) Find ||f||2.
(a) We seek ||f||∞ = max(|f(t)|). Because f(t) is differentiable on [0, 3], |f(t)| has a maximum at a critical point of f(t) (i.e., when the derivative f′(t) = 0), or at an endpoint of [0, 3]. Because f′(t) = 2t − 4, we set 2t − 4 = 0 and obtain t = 2 as a critical point. Compute
f(2) = 4 − 8 = −4, f(0) = 0, f(3) = 9 − 12 = −3
Thus, ||f||∞ = |f(2)| = | − 4| = 4.
(b) Compute f(t) for various values of t in [0, 3], for example,
t: 0, 1, 2, 3 and f(t): 0, −3, −4, −3
Plot the points in R2 and then draw a continuous curve through the points, as shown in Fig. 7-8.
(c) We seek ||f||1 = ∫ |f(t)| dt taken from 0 to 3. As indicated in Fig. 7-8, f(t) is negative on [0, 3]; hence, |f(t)| = −(t2 − 4t) = 4t − t2. Thus,
||f||1 = ∫ (4t − t2) dt from 0 to 3 = [2t2 − t3/3] from 0 to 3 = 18 − 9 = 9
(d) Compute
||f||2² = ⟨f, f⟩ = ∫ (t2 − 4t)2 dt from 0 to 3 = ∫ (t4 − 8t3 + 16t2) dt from 0 to 3 = [t5/5 − 2t4 + 16t3/3] from 0 to 3 = 153/5
Thus, ||f||2 = √(153/5).
7.56. Prove Theorem 7.24: Let V be a normed vector space. Then the function d(u, υ) = ||u − υ|| satisfies the following three axioms of a metric space:
[M1] d(u, υ) ≥ 0; and d(u, υ) = 0 iff u = υ.
[M2] d(u, υ) = d (υ, u).
[M3] d(u, υ) ≤ d (u, w) + d(w, υ).
If u ≠ υ, then u − υ ≠ 0, and hence, d(u, υ) = ||u − υ|| > 0. Also, d(u, u) = ||u − u|| = ||0|| = 0. Thus,
[M1] is satisfied. We also have
d(u, υ) = ||u − υ|| = ||−1(υ − u)|| = |−1| ||υ − u|| = d(υ, u)
and
d(u, υ) = ||u − υ|| = ||(u − w) + (w − υ)|| ≤ ||u − w|| + ||w − υ|| = d(u, w) + d(w, υ)
Thus, [M2] and [M3] are satisfied.
SUPPLEMENTARY PROBLEMS
7.57. Verify that the following is an inner product on R2, where u = (x1, x2) and υ = (y1, y2):
7.58. Find the values of k so that the following is an inner product on R2, where u = (x1, x2) and υ = (y1, y2):
7.59. Consider the vectors u = (1, −3) and υ = (2, 5) in R2. Find
(a) ⟨u, υ⟩ with respect to the usual inner product in R2.
(b) ⟨u, υ⟩ with respect to the inner product in R2 in Problem 7.57.
(c) ||υ|| using the usual inner product in R2.
(d) ||υ|| using the inner product in R2 in Problem 7.57.
7.60. Show that each of the following is not an inner product on R3, where u = (x1, x2, x3) and υ = (y1, y2, y3):
7.61. Let V be the vector space of m × n matrices over R. Show that ⟨A, B⟩ = tr(BTA) defines an inner product in V.
7.62. Suppose |⟨u, υ⟩| = ||u|| ||υ||. (That is, the Cauchy–Schwarz inequality reduces to an equality.) Show that u and υ are linearly dependent.
7.63. Suppose f(u, υ) and g(u, υ) are inner products on a vector space V over R. Prove
(a) The sum f + g is an inner product on V, where (f + g)(u, υ) = f(u, υ) + g(u, υ).
(b) The scalar product kf, for k > 0, is an inner product on V, where (kf)(u, υ) = kf(u, υ).
7.64. Let V be the vector space of polynomials over R of degree ≤2 with inner product defined by . Find a basis of the subspace W orthogonal to h(t) = 2t + 1.
7.65. Find a basis of the subspace W of R4 orthogonal to u1 = (1, −2, 3, 4) and u2 = (3, −5, 7, 8).
7.66. Find a basis for the subspace W of R5 orthogonal to the vectors u1 = (1, 1, 3, 4, 1) and u2 = (1, 2, 1, 2, 1).
7.67. Let w = (1, −2, −1, 3) be a vector in R4. Find
(a) an orthogonal basis for w⊥, (b) an orthonormal basis for w⊥.
7.68. Let W be the subspace of R4 orthogonal to u1 = (1, 1, 2, 2) and u2 = (0, 1, 2, −1). Find
(a) an orthogonal basis for W, (b) an orthonormal basis for W. (Compare with Problem 7.65.)
7.69. Let S consist of the following vectors in R4:
(a) Show that S is orthogonal and a basis of R4.
(b) Write υ = (1, 3, −5, 6) as a linear combination of u1, u2, u3, u4.
(c) Find the coordinates of an arbitrary vector υ = (a, b, c, d) in R4 relative to the basis S.
(d) Normalize S to obtain an orthonormal basis of R4.
7.70. Let M = M2,2 with inner product ⟨A, B⟩ = tr(BTA). Show that the following is an orthonormal basis for M:
7.71. Let M = M2,2 with inner product ⟨A, B⟩ = tr(BTA). Find an orthogonal basis for the orthogonal complement of (a) the diagonal matrices, (b) the symmetric matrices.
7.72. Suppose {u1, u2, …, ur} is an orthogonal set of vectors. Show that {k1u1, k2u2, …, krur} is an orthogonal set for any scalars k1, k2, …, kr.
7.73. Let U and W be subspaces of a finite-dimensional inner product space V. Show that
(U + W)⊥ = U⊥ ∩ W⊥ and (U ∩ W)⊥ = U⊥ + W⊥
7.74. Find the Fourier coefficient c and projection cw of υ along w, where
(a) υ = (2, 3, −5) and w = (1, −5, 2) in R3.
(b) υ = (1, 3, 1, 2) and w = (1, −2, 7, 4) in R4.
(c) υ = t2 and w = t + 3 in P(t), with inner product
(d) υ and w in M = M2,2, with inner product ⟨A, B⟩ = tr(BTA).
7.75. Let U be the subspace of R4 spanned by
(a) Apply the Gram–Schmidt algorithm to find an orthogonal and an orthonormal basis for U.
(b) Find the projection of υ = (1, 2, −3, 4) onto U.
7.76. Suppose υ = (1, 2, 3, 4, 6). Find the projection of υ onto W, or, in other words, find w ∈ W that minimizes ||υ − w||, where W is the subspace of R5 spanned by
7.77. Consider the subspace W = P2(t) of P(t) with inner product ⟨f, g⟩ = ∫ f(t)g(t) dt taken from 0 to 1. Find the projection of f(t) = t3 onto W. (Hint: Use the orthogonal polynomials 1, 2t − 1, 6t2 − 6t + 1 obtained in Problem 7.22.)
7.78. Consider P(t) with inner product and the subspace W = P3(t).
(a) Find an orthogonal basis for W by applying the Gram–Schmidt algorithm to {1, t, t2, t3}.
(b) Find the projection of f(t) = t5 onto W.
7.79. Find the number and exhibit all 2 × 2 orthogonal matrices of the form .
7.80. Find a 3 × 3 orthogonal matrix P whose first two rows are multiples of u = (1, 1, 1) and υ = (1, −3, 2), respectively.
7.81. Find a symmetric orthogonal matrix P whose first row is (1/3, 2/3, 2/3). (Compare with Problem 7.32.)
7.82. Real matrices A and B are said to be orthogonally equivalent if there exists an orthogonal matrix P such that B = PTAP. Show that this relation is an equivalence relation.
7.83. Find the matrix A that represents the usual inner product on R2 relative to each of the following bases:
7.84. Consider the following inner product on R2:
Find the matrix B that represents this inner product on R2 relative to each basis in Problem 7.83.
7.85. Find the matrix C that represents the usual inner product on R3 relative to the basis S of R3 consisting of the vectors u1 = (1, 1, 1), u2 = (1, 2, 1), u3 = (1, −1, 3).
7.86. Let V = P2(t) with inner product .
(a) Find ⟨f, g⟩, where f(t) = t + 2 and g(t) = t2 − 3t + 4.
(b) Find the matrix A of the inner product with respect to the basis {1, t, t2} of V.
(c) Verify Theorem 7.16 that ⟨f, g⟩ = [f]TA[g] with respect to the basis {1, t, t2}.
7.87. Determine which of the following matrices are positive definite:
7.88. Suppose A and B are positive definite matrices. Show that:
(a) A + B is positive definite and
(b) kA is positive definite for k > 0.
7.89. Suppose B is a real nonsingular matrix. Show that: (a) BTB is symmetric and (b) BTB is positive definite.
More generally, prove that .
7.91. Consider u = (1 + i, 3, 4 − i) and υ = (3 − 4i, 1 + i, 2i) in C3. Find
7.92. Find the Fourier coefficient c and the projection cw of
7.93. Let u = (z1, z2) and υ = (w1, w2) belong to C2. Verify that the following is an inner product on C2:
7.94. Find an orthogonal basis and an orthonormal basis for the subspace W of C3 spanned by u1 = (1, i, 1) and u2 = (1 + i, 0, 2).
7.95. Let u = (z1, z2) and υ = (w1, w2) belong to C2. For what values of a, b, c, d ∈ C is the following an inner product on C2?
7.96. Prove the following (polar) form for an inner product in a complex space V:
⟨u, υ⟩ = ¼(||u + υ||2 − ||u − υ||2) + (i/4)(||u + iυ||2 − ||u − iυ||2)
[Compare with Problem 7.7(b).]
7.97. Let V be a real inner product space. Show that
(i) ||u|| = ||υ|| if and only if ⟨u + υ, u − υ⟩ = 0;
(ii) ||u + υ||2 = ||u||2 + ||υ||2 if and only if ⟨u, υ⟩ = 0.
Show by counterexamples that the above statements are not true for, say, C2.
7.98. Find the matrix P that represents the usual inner product on C3 relative to the basis {1, 1 + i, 1 − 2i}.
7.99. A complex matrix A is unitary if it is invertible and A−1 = AH. Alternatively, A is unitary if its rows (columns) form an orthonormal set of vectors (relative to the usual inner product of Cn). Find a unitary matrix whose first row is:
7.100. Consider vectors u = (1, −3, 4, 1, −2) and υ = (3, 1, −2, −3, 1) in R5. Find
7.101. Repeat Problem 7.100 for u = (1 + i, 2 − 4i) and υ = (1 − i, 2 + 3i) in C2.
7.102. Consider the functions f(t) = 5t − t2 and g(t) = 3t − t2 in C[0, 4]. Find
7.103. Prove (a) ||·||1 is a norm on Rn. (b) ||·||∞ is a norm on Rn.
7.104. Prove (a) ||·||1 is a norm on C[a, b]. (b) ||·||∞ is a norm on C[a, b].
ANSWERS TO SUPPLEMENTARY PROBLEMS
Notation: M = [R1; R2; …] denotes a matrix M with rows R1, R2, …. Also, a basis need not be unique.
7.60. Let u = (0, 0, 1); then ⟨u, u⟩ = 0 in both cases.
7.65. {(1, 2, 1, 0), (4, 4, 0, 1)}
7.66. (−1, 0, 0, 0, 1), (−6, 2, 0, 1, 0), (−5, 2, 1, 0, 0)
7.95. a and d real and positive, and ad − bc positive.
7.98. P = [1, 1 − i, 1 + 2i; 1 + i, 2, −1 + 3i; 1 − 2i, −1 − 3i, 5]