Chapter 2

Continuous and linear maps

A normed space has two structures: a linear one (the underlying vector space), and a topological one (the norm). So when we study maps between normed spaces, it is natural to focus on maps which are well-behaved with respect to these structures, and we’ll do this now. In particular, we’ll study:

(1) linear transformations (well-behaved with respect to the linear structure),

(2) continuous maps (well-behaved with respect to the topological structure),

(3) continuous linear transformations (well-behaved with respect to both structures).

In the context of normed spaces, continuous linear transformations are most important, and these are sometimes also called bounded linear operators.


The reason for this terminology will become clear in Theorem 2.6 (page 67). We’ll see that the set of all bounded linear operators is itself a vector space, with obvious pointwise operations of addition and scalar multiplication, and it also has a natural notion of a norm, called the operator norm. Equipped with the operator norm, the vector space of bounded linear operators is a Banach space, provided that the co-domain is a Banach space. This is a useful result, which we will use in order to prove the existence of solutions to integral and differential equations.

2.1 Linear transformations

Linear transformations are maps that respect vector space operations.

Definition 2.1. (Linear transformation).

Let X and Y be vector spaces over K (R or C).

A map T : X → Y is called a linear transformation if:

(L1) For all x1, x2 ∈ X, T(x1 + x2) = T(x1) + T(x2).

(L2) For all x ∈ X and all α ∈ K, T(α · x) = α · T(x).

Example 2.1. (Linear galore!)

(1) D : C1[a, b] → C[a, b] given by Dx = x′, x ∈ C1[a, b], is a linear transformation, since

(L1) D(x + y) = (x + y)′ = x′ + y′ = Dx + Dy for all x, y ∈ C1[a, b];

(L2) D(αx) = (αx)′ = α · x′ = α · Dx for all α ∈ R and x ∈ C1[a, b].

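(2) Let m, n ∈ N and X = Rn and Y = Rm. If A = [aij] ∈ Rm×n, then the map TA : Rn → Rm given by

TAx := Ax (x ∈ Rn)

is a linear transformation from Rn to Rm. Indeed, for all x, y ∈ Rn,

TA(x + y) = A(x + y) = Ax + Ay = TAx + TAy,

and so (L1) holds. Moreover, for all α ∈ R and x ∈ Rn,

TA(α · x) = A(α · x) = α · (Ax) = α · (TAx),

and so (L2) holds as well. Hence TA is a linear transformation.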

(3) Let X = Y = ℓ2. Define the left/right shift operators L, R as follows: if x = (xn)n∈N ∈ ℓ2, then

L(x1, x2, x3, ···) := (x2, x3, x4, ···) and R(x1, x2, x3, ···) := (0, x1, x2, ···).

Then it is easy to see that R and L are linear transformations.

(4) Let X := c (⊂ ℓ∞), the space of all real-valued convergent sequences, and Y = R. The map L : c → R, L((an)n∈N) := lim an (the limit as n → ∞) for (an)n∈N ∈ c, is a linear transformation (using the algebra of limits).

Recall that given a linear transformation T : X → Y, we can associate with T two natural subspaces, of X and Y, respectively,

the kernel of T, ker T := {x ∈ X : Tx = 0Y} ⊂ X, and

the range of T, ran T := {y ∈ Y : ∃x ∈ X such that y = Tx} ⊂ Y.


In the above example of the linear transformation L, we have

ker L = c0 (the set of sequences convergent with limit 0),

ran L = R (since for every r ∈ R, the constant sequence (r)n∈N converges to r).

(5) The map I : C[a, b] → R, given by I(x) := ∫_a^b x(t) dt for all x ∈ C[a, b], is a linear transformation.

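(6) Let S := {h ∈ C1[a, b] : h(a) = h(b) = 0}. From Exercise 1.3 (page 7), we see that S is a subspace of C1[a, b]. Let A, B ∈ C[a, b] be fixed functions.

Let L : S → R be given by L(h) := ∫_a^b (A(t)h(t) + B(t)h′(t)) dt, h ∈ S.

Let us check that L is a linear transformation. We have:

(L1) For all h1, h2 ∈ S,

L(h1 + h2) = ∫_a^b (A(t)(h1(t) + h2(t)) + B(t)(h1′(t) + h2′(t))) dt = ∫_a^b (A(t)h1(t) + B(t)h1′(t)) dt + ∫_a^b (A(t)h2(t) + B(t)h2′(t)) dt = L(h1) + L(h2).

(L2) For all h ∈ S and all α ∈ R,

L(α · h) = ∫_a^b (A(t)αh(t) + B(t)αh′(t)) dt = α ∫_a^b (A(t)h(t) + B(t)h′(t)) dt = α · L(h).

Thus L is a linear transformation.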

(7) Let C1(R), C2(R) denote the vector spaces of once, respectively twice, continuously differentiable real-valued functions on R with pointwise operations. For f ∈ C2(R) and g ∈ C1(R), consider the initial value problem for the one (spatial) dimensional wave equation:

image

Let C2(R × [0, ∞)) denote the vector space of all twice continuously differentiable functions (x, t) ↦ u(x, t) : R × [0, ∞) → R, again with pointwise operations. Then it can be shown that the unique solution uf,g in C2(R × [0, ∞)) to (IVP) is given by d’Alembert’s Formula,

image

Then the map (f, g) ↦ uf,g : C2(R) × C1(R) → C2(R × [0, ∞)) is a linear transformation.


Here are some non-examples.

Example 2.2. (Not quite linear!)

(1) If z̄ denotes the complex conjugate of z ∈ C, then the complex conjugation map z ↦ z̄ : C → C is not a linear transformation, since although (L1) is satisfied (the conjugate of z + w is z̄ + w̄ for all z, w ∈ C), we see that (L2) isn’t: indeed, the conjugate of i · 1 is –i, which differs from i times the conjugate of 1, namely i.

(2) Consider the map T : R2 → R2 defined by

image

Then T is not a linear transformation since (L1) is not satisfied.

Indeed, image

while image

If α ∈ R\{0} and image then we have

image

If α = 0 and image

So for all image Thus (L2) holds.
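The failure of (L2) for complex conjugation in item (1) can also be checked on a computer. The following Python sketch is an illustration only and is not part of the original text; the helper name conj and the sample values z, w and alpha are our own choices.

```python
# Complex conjugation z -> conj(z), viewed as a map from C to C over the field C.
def conj(z: complex) -> complex:
    return z.conjugate()

# Hypothetical sample values chosen for this illustration.
z, w = 2 + 3j, -1 + 0.5j
alpha = 1j  # the scalar i

# (L1): additivity holds for this sample pair.
additive = conj(z + w) == conj(z) + conj(w)

# (L2): homogeneity fails for alpha = i and x = 1.
lhs = conj(alpha * complex(1))   # conjugate of i * 1
rhs = alpha * conj(complex(1))   # i times the conjugate of 1
homogeneous = lhs == rhs

print("(L1) holds on the sample:", additive)
print("conj(i * 1) =", lhs, " but  i * conj(1) =", rhs)
print("(L2) holds for alpha = i, x = 1:", homogeneous)
```

A single failing pair (α, x) is enough to rule out linearity, whereas the check of (L1) on one sample pair is of course not a proof of additivity.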


Notation 2.1. We will denote the set of all linear transformations from the vector space X to the vector space Y by L(X, Y). Recall from elementary linear algebra that L(X, Y) is itself a vector space (over the common field K for X, Y) with pointwise operations: if T, S ∈ L(X, Y), then we define T + S ∈ L(X, Y) by (T + S)(x) = Tx + Sx, for all x ∈ X, and if α ∈ K and T ∈ L(X, Y), then we define α · T ∈ L(X, Y) by (α · T)(x) = α · (Tx), for all x ∈ X. What is the zero vector in this vector space L(X, Y)? It is the “zero linear transformation” 0 : X → Y, given by 0x = 0Y, for all x ∈ X, where 0Y denotes the zero vector in Y.


If X = Y, then we write L(X) instead of L(X, X).

Exercise 2.1. Consider the two maps S1, S2 : C[0, 1] → R given by

image

Show that S1 is not a linear transformation, while S2 is.

Exercise 2.2. Let a, b be nonzero real numbers, and consider the two real-valued functions f1, f2 defined on R by f1(t) = e^(at) cos(bt) and f2(t) = e^(at) sin(bt), t ∈ R. The functions f1 and f2 are vectors belonging to the infinite dimensional vector space C1(R), consisting of all continuously differentiable functions from R to R. Denote by Sf1,f2 the span of the two functions f1 and f2.

(1)Prove that f1 and f2 are linearly independent in C1(R).

(2)Show that the differentiation map, image is a linear transformation.

(3)What is the matrix [D]B of D with respect to the (ordered) basis B = (f1, f2)?

(4)Prove that D is invertible, and write down the matrix corresponding to the inverse of D.

(5)Compute the indefinite integrals image

2.2 Continuous maps

Let X and Y be normed spaces. As there is a notion of distance between pairs of vectors in either space (provided by the norm of the difference of the pair of vectors in each respective space), one can talk about continuity of maps. Within the huge collection of all maps, the class of continuous maps forms an important subset. Continuous maps play a prominent role in functional analysis since they possess some useful properties.

Before discussing the case of a function between normed spaces, let us first of all recall the notion of continuity of a function f : R → R.

Continuity of functions from R to R

In everyday speech, a ‘continuous’ process is one that proceeds without gaps or interruptions or sudden changes. What does it mean for a function f : R → R to be continuous? The common informal definition of this concept states that a function f is continuous if one can sketch its graph without lifting the pencil. In other words, the graph of f has no breaks in it. If a break does occur in the graph, then this break will occur at some point. Thus (based on this visual view of continuity), we first give the formal definition of the continuity of a function at a point below. Next, if a function is continuous at each point, then it will be called continuous. If a function has a break at a point, say x0, then even if points x are close to x0, the points f(x) do not get close to f(x0).


This motivates the definition of continuity in calculus, which guarantees that if a function is continuous at a point x0, then we can make f(x) as close as we like to f(x0), by choosing x sufficiently close to x0.


Definition 2.2. A function f : R → R is continuous at x0 if for every ε > 0, there exists a δ > 0 such that for all x ∈ R satisfying |x – x0| < δ, we have that |f(x) – f(x0)| < ε.

f : R → R is continuous if for every x0 ∈ R, f is continuous at x0.

Continuity of functions between normed spaces

We now define the set of continuous maps from a normed space X to a normed space Y.

We observe that in the definition of continuity in ordinary calculus, if x, y are real numbers, then |x – y| is a measure of the distance between them, and that the absolute value | · | is a norm in the finite (one) dimensional normed space R. So it is natural to define continuity in arbitrary normed spaces by simply replacing the absolute values by the corresponding norms, since the norm provides a notion of distance between vectors.

Definition 2.3. (Continuity of maps between normed spaces).

Let X and Y be normed spaces over K (R or C). Let x0 ∈ X. A map f : X → Y is continuous at x0 if for every ε > 0, there exists a δ > 0 such that for all x ∈ X satisfying ||x – x0|| < δ, we have ||f(x) – f(x0)|| < ε. f : X → Y is continuous if for all x0 ∈ X, f is continuous at x0.

We will soon study when linear transformations are continuous, but first let us consider some examples of nonlinear maps.

Example 2.3. Consider the map S : C[0, 1] → R, given by

image

We’ll show that S is continuous. (As usual, C[0, 1] is endowed with the supremum norm.) Suppose that x0 ∈ C[0, 1]. Let ε > 0. As we would like to make |S(x) – S(x0)| small, let us first consider this expression. We have

image

if ||x – x0||∞ < δ, where δ > 0 is some number. We ought to choose δ > 0 suitably so as to make the right-hand side above smaller than ε. There is no unique way to do this, and anything one can justify works. We set

image

Whenever ||x – x0||∞ < δ, in light of the above computation, we have

image

Thus S is continuous at x0. As the choice of x0 was arbitrary, it follows that S is continuous (on C[0, 1]).


Example 2.4. c00 is the subspace of ℓ∞ of all finitely supported sequences. c00 is a normed space with the supremum norm inherited from ℓ∞.

Consider the map s : c00 → R given by image

We’ll show that s is not continuous at 0. Suppose on the contrary that it is. With ε = 1/4 > 0, there exists a δ > 0 such that if ||a||∞ = ||a – 0||∞ < δ, then we are guaranteed that |s(a) – s(0)| = |s(a) – 0| = |s(a)| < ε = 1/4. If image then image So for all m sufficiently large, we must have ||am||∞ < δ, giving in turn that |s(am)| < 1/4. But for all m we have

image

a contradiction. Hence s is not continuous at 0.


Exercise 2.3. (Rationale for the C1[a, b] norm.)

This exercise concerns the norm on C1[a, b] we have chosen to use. Since we want to be able to use ordinary analytic operations such as passage to the limit, then, given a function f : C1[0, 1] → R, it is reasonable to choose a norm such that f is continuous. As our f, let us take the arc length function given by

f(x) := ∫_0^1 √(1 + (x′(t))²) dt, x ∈ C1[0, 1].

We show in the following sequence of exercises that f is not continuous if we equip C1[0, 1] with the supremum norm || · ||∞ induced from C[0, 1].

(1)Calculate f(0). (The arc length of the graph of the constant function taking value 0 everywhere on [0, 1] is obviously 1, and check that the above formula delivers this.)

(2)Now consider image Using image for all image and the periodicity of sin(2πnt) (the graph of sin(2πnt) on image is repeated n times in [0, 1]), conclude that

image

(3) Show that f is not continuous at 0. (Prove this by contradiction. Note that by taking larger and larger n, ||xn – 0||∞ can be made as small as we please, but f(xn) doesn’t stay close to f(0).)

Show that the arc length function f is continuous if we equip C1[0, 1] with the norm || · ||1,∞. It may be useful to note that by using the triangle inequality in (R2, || · ||2), we have for a, b ∈ R that

image

Exercise 2.4. Let (X, || · ||) be a normed space. Show that the norm || · || : X → R is a continuous map.

Continuity and open sets

We’ll now learn an important property of continuous maps:

“inverse images” of open sets under a continuous map are open.

In fact, we shall see that this property is a characterisation of continuity. First let’s fix some notation. Let f : X → Y be a map between the normed spaces X and Y, and let V ⊂ Y. We set f–1(V) := {x ∈ X : f(x) ∈ V}, and call it the inverse image of V under f. Clearly f–1(Y) = X and f–1(∅) = ∅.


Exercise 2.5. Let f : R → R be given by f(x) = cos x (x ∈ R).

Find f–1(V), where V = {–1, 1}, V = {1}, V = [–1, 1], V = R, image

On the other hand if U ⊂ X, then we set f(U) := {f(x) ∈ Y : x ∈ U}, and call it the image of U under f.


Exercise 2.6. Let f : R → R be given by f(x) = cos x (x ∈ R).

Find f(U), where U = R, U = [0, 2π], U = [δ, δ + 2π] where δ > 0.

Theorem 2.1. Let X, Y be normed spaces and f : X → Y be a map.

Then f is continuous on X if and only if

for every V open in Y, f–1(V) is open in X.

Proof.

(If) Let c ∈ X, and let ε > 0. Consider the open ball B(f(c), ε) with center f(c) and radius ε in Y. We know that this open ball V := B(f(c), ε) is an open set in Y. Thus we also know that f–1(V) = f–1(B(f(c), ε)) is an open set in X. But the point c ∈ f–1(B(f(c), ε)), because f(c) ∈ B(f(c), ε) (||f(c) – f(c)|| = 0 < ε!). So by the definition of an open set, there is a δ > 0 such that B(c, δ) ⊂ f–1(B(f(c), ε)). In other words, whenever x ∈ X satisfies ||x – c|| < δ, we have x ∈ f–1(B(f(c), ε)), that is, f(x) ∈ B(f(c), ε), which implies ||f(x) – f(c)|| < ε. Hence f is continuous at c. But the choice of c ∈ X was arbitrary. Consequently f is continuous on X. See the picture on the left.

[Figure: two pictures; the left one illustrates the (If) part, the right one the (Only if) part.]

(Only if) Now let f be continuous, and let V be an open subset of Y. We would like to show that f–1(V) is open. So let c ∈ f–1(V). Then f(c) ∈ V. As V is open, there is a small open ball B(f(c), ε) with center f(c) and radius ε that is contained in V. By the continuity of f at c, there is a δ > 0 such that whenever ||x – c|| < δ, we have ||f(x) – f(c)|| < ε, that is, f(x) ∈ B(f(c), ε) ⊂ V. But this means that B(c, δ) ⊂ f–1(V). Indeed, if x ∈ B(c, δ), then ||x – c|| < δ and so by the above, f(x) ∈ V, that is, x ∈ f–1(V). Consequently, f–1(V) is open in X. See the picture on the right above.


Note that the theorem does not claim that for every U open in X, f(U) is open in Y. Consider for example X = Y = R equipped with the Euclidean norm, and the constant function f(x) = c (x ∈ R), which is clearly continuous. But note that direct images of open sets are not always open under f: indeed X = R is open in X = R, but f(X) = {c} is not open in Y = R.

Corollary 2.1. Let X, Y be normed spaces and f : X → Y be a map.

Then f is continuous on X if and only if

for every F closed in Y, f–1(F) is closed in X.

Proof. If F ⊂ Y, then f–1(Y\F) = X\(f–1(F)).


Exercise 2.7. Fill in the details of the proof of Corollary 2.1.

Theorem 2.2. Let X, Y, Z be normed spaces, and f : X → Y, g : Y → Z be continuous maps. Then the composition map g ∘ f : X → Z, defined by (g ∘ f)(x) := g(f(x)) (x ∈ X), is continuous.

Proof. Let W be open in Z. Then since g is continuous, g–1(W) is open in Y. Also, since f is continuous, f–1(g–1(W)) is open in X. Finally, we note that (g ∘ f)–1(W) = f–1(g–1(W)). So g ∘ f is continuous.


Exercise 2.8. In the proof of Theorem 2.2, we used (g ∘ f)–1(W) = f–1(g–1(W)). Check this.

Exercise 2.9. Let X be a normed space and f : X → R be a continuous map. Determine if the following statements are true or false.

(1) {x ∈ X : f(x) < 1} is an open set.

(2) {x ∈ X : f(x) > 1} is an open set.

(3) {x ∈ X : f(x) = 1} is an open set.

(4){xX : f(x) image 1} is a closed set.

(5) {x ∈ X : f(x) = 1} is a closed set.

(6) {x ∈ X : f(x) = 1 or f(x) = 2} is a closed set.

(7) {x ∈ X : f(x) = 1} is a compact set.

Continuity and convergence

We have the following characterisation of continuous maps in terms of convergence of sequences: “Continuous maps preserve convergent sequences”.

Theorem 2.3. Let X, Y be normed spaces, c ∈ X, and let f : X → Y. Then the following two statements are equivalent:

(1) f is continuous at c.

(2) For every sequence (xn)n∈N in X such that (xn)n∈N converges to c, (f(xn))n∈N converges to f(c).

Proof.

(1) ⇒ (2): Suppose that f is continuous at c. Let (xn)n∈N be a sequence in X such that (xn)n∈N converges to c. Let ε > 0. Then there exists a δ > 0 such that for all x ∈ X satisfying ||x – c|| < δ, we have ||f(x) – f(c)|| < ε. As the sequence (xn)n∈N converges to c, for this δ > 0, there exists an N ∈ N such that whenever n > N, ||xn – c|| < δ. But then by the above, ||f(xn) – f(c)|| < ε. So we have shown that for every ε > 0, there is an N ∈ N such that for all n > N, ||f(xn) – f(c)|| < ε. In other words, the sequence (f(xn))n∈N converges to f(c).

(2) ⇒ (1): Suppose that f is not continuous at c. Thus there is an ε > 0 such that for every δ > 0, there is an x ∈ X such that ||x – c|| < δ, but ||f(x) – f(c)|| ≥ ε. We will use this statement to construct a sequence (xn)n∈N for which the conclusion in (2) does not hold. Let δ = 1/n, for n ∈ N, and denote a corresponding x as xn: thus, ||xn – c|| < δ = 1/n, but ||f(xn) – f(c)|| ≥ ε. Clearly the sequence (xn)n∈N is convergent with limit c, but (f(xn))n∈N does not converge to f(c) since ||f(xn) – f(c)|| ≥ ε for all n ∈ N. Consequently if (1) does not hold, then (2) does not hold. In other words, we have shown that (2) ⇒ (1).


Exercise 2.10. Let X, Y be normed spaces. Find all continuous maps f : X → Y such that for all x ∈ X, f(x) + f(2x) = 0. Hint: image

Exercise 2.11. (∗)(Continuity of the determinant; {invertible matrices} is open). Show that the determinant M → det M : (Rn×n, || · ||) → (R, | · |) is continuous. Prove that the set of invertible matrices is open in (Rn×n, || · ||). Hint: det–1{0}.

Continuity and compactness

In this section we will learn about a very useful result in Optimisation Theory, on the existence of global minimisers of real-valued continuous functions on compact sets.

Theorem 2.4.

If (1) K is a compact subset of a normed space X,

(2) Y is a normed space, and

(3) f : X → Y is a function that is continuous at each x ∈ K,

then f(K) is a compact subset of Y.

Proof. Suppose that (yn)n∈N is a sequence contained in f(K). Then for each n ∈ N, there exists an xn ∈ K such that yn = f(xn). Thus we obtain a sequence (xn)n∈N in the set K. As K is compact, there exists a convergent subsequence, say (xnk)k∈N, with limit L ∈ K. As f is continuous, it preserves convergent sequences. So (f(xnk))k∈N = (ynk)k∈N is convergent with limit f(L) ∈ f(K). Consequently, f(K) is compact.


Now we prove the aforementioned result which turns out to be very useful in Optimisation Theory, namely that a real-valued continuous function on a compact set attains its maximum/minimum on the compact set. This is a generalisation of the Extreme Value Theorem we had learnt earlier, where the compact set in question was just the interval [a, b].

Theorem 2.5. (Weierstrass).

If (1) K is a nonempty compact subset of a normed space X, and

(2) f : X → R is a function that is continuous at each x ∈ K,

then there exists a c ∈ K such that f(c) = sup{f(x) : x ∈ K}.

We note that since c ∈ K, f(c) ∈ {f(x) : x ∈ K}, and so the supremum above is actually a maximum:

f(c) = max{f(x) : x ∈ K}.

Also, under the same hypothesis of the above result, there exists a minimiser in K, that is, there exists a d ∈ K such that

f(d) = min{f(x) : x ∈ K}.

This follows from the above result by just looking at –f, that is by applying the above result to the function g : X → R given by g(x) = –f(x) (x ∈ X).

Proof. (Of Theorem 2.5.) We know that the image of K under f, namely the set f(K), is compact and hence bounded. So {f(x) : x ∈ K} is bounded. It is also nonempty since K is nonempty. But by the least upper bound property of R, a nonempty subset of R that is bounded above has a least upper bound. Thus M := sup{f(x) : x ∈ K} ∈ R. Now consider M – 1/n (n ∈ N). This number cannot be an upper bound for {f(x) : x ∈ K}. So there must be an xn ∈ K such that f(xn) > M – 1/n. In this manner we get a sequence (xn)n∈N in K. As K is compact, (xn)n∈N has a convergent subsequence (xnk)k∈N with limit, say c, belonging to K. As f is continuous, (f(xnk))k∈N is convergent as well with limit f(c). But from the inequalities f(xn) > M – 1/n (n ∈ N), it follows that f(c) ≥ M. On the other hand, from the definition of M, we also have that f(c) ≤ M. So f(c) = M.


Example 2.5. Since the set image is compact in R3 and since the function x ↦ x1 + x2 + x3 is continuous on R3, it follows that the optimisation problem

image

has a minimiser.


Remark 2.1. (∗) In Optimisation Theory, one often meets necessary conditions for a minimiser, that is, results of the following form:

image

(Here image are certain mathematical conditions, such as the Lagrange multiplier equations.) Now such a result has limited use as such since even if we find all image which satisfy image, we can’t conclude that there is one that is a minimiser. But now suppose that we know that f is continuous on F and that F is compact. Then we know that a minimiser exists, and so we know that among the image that satisfy image, there is at least one which is a minimiser.

Notation 2.2. We will denote the set of all continuous maps from the normed space X to the normed space Y by C(X, Y).

2.3 The normed space CL(X, Y)

In this section we study those linear transformations from a normed space X to a normed space Y that are also continuous.

Notation 2.3. We denote the set of all continuous linear transformations from the normed space X to the normed space Y by CL(X, Y), that is, CL(X, Y) := C(X, Y) ∩ L(X, Y). If X = Y, then we denote CL(X, X) simply by CL(X).

We begin by giving a characterisation of continuous linear transformations.

When is a linear transformation continuous?

Theorem 2.6.

Let X and Y be normed spaces, and T : X → Y be a linear transformation. Then the following properties of T are equivalent:

(1) T is continuous.

(2) T is continuous at 0.

(3) There exists an M > 0 such that for all x ∈ X, ||Tx||Y ≤ M ||x||X.

We’ll see the proof below. But let us first remark that the useful part is the equivalence of (1) and (3), since by just showing the existence/lack of the bound M, we can conclude the continuity/lack of continuity of the given linear transformation. So we don’t have to go through the rigmarole of verifying the ε-δ definition: rather, a simple estimate, as stipulated in (3), suffices. Note also that it seems miraculous that continuity at just one point (at 0) delivers continuity everywhere on X! This miracle happens because the map T is not any old map, but rather a linear transformation. Here is an elementary example.

Example 2.6. (The left shift and right shift operators).

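The left shift operator, L : ℓ2 → ℓ2, given by

L(a1, a2, a3, ···) := (a2, a3, a4, ···), (an)n∈N ∈ ℓ2,

is a linear transformation. We have for all (an)n∈N ∈ ℓ2 that

(||L(an)n∈N||2)² = |a2|² + |a3|² + |a4|² + ··· ≤ |a1|² + |a2|² + |a3|² + ··· = (||(an)n∈N||2)²,

and so L ∈ C(ℓ2, ℓ2). The right shift operator R : ℓ2 → ℓ2, given by R(a1, a2, a3, ···) := (0, a1, a2, ···), (an)n∈N ∈ ℓ2, is also a linear transformation which is continuous, thanks to the equality

(||R(an)n∈N||2)² = 0² + |a1|² + |a2|² + ··· = (||(an)n∈N||2)²

for all (an)n∈N ∈ ℓ2.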
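The two norm relations for the shift operators can be tried out on a concrete finitely supported sequence. The following Python sketch is an illustration only; the helper names left_shift, right_shift and norm2, and the sample sequence a, are assumptions made for this example. A list stands for a sequence whose remaining terms are all 0.

```python
import math

def left_shift(a):
    """L(a1, a2, a3, ...) = (a2, a3, ...); the list holds the leading terms only."""
    return a[1:]

def right_shift(a):
    """R(a1, a2, a3, ...) = (0, a1, a2, ...)."""
    return [0.0] + a

def norm2(a):
    """The l2 norm of a finitely supported sequence."""
    return math.sqrt(sum(abs(t) ** 2 for t in a))

# Hypothetical sample sequence (3, 4, 12, 0, 0, ...), whose l2 norm is 13.
a = [3.0, 4.0, 12.0]

print("||a||  =", norm2(a))
print("||La|| =", norm2(left_shift(a)))    # drops a1, so it cannot exceed ||a||
print("||Ra|| =", norm2(right_shift(a)))   # only prepends a 0, so it equals ||a||
print("L(Ra) == a:", left_shift(right_shift(a)) == a)
```

In both cases M = 1 serves as the bound in item (3) of Theorem 2.6; for R the inequality is even an equality.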


Proof. (Of Theorem 2.6.) We will show the three implications (1)⇒(2), (2)⇒(3), and (3)⇒(1), which are enough to get all the three equivalences (and six implications) given in the statement of the theorem.

(1)⇒(2). This is just the definition of continuity on X. Indeed, T has to be continuous at each point in X, and in particular at 0X.

(2)⇒(3). Take ε := 1 > 0. Then there exists a δ > 0 such that whenever ||x – 0|| = ||x|| < δ, we have that ||Tx – T0|| = ||Tx – 0|| = ||Tx|| < 1. Let’s check that this yields:

for all x ∈ X, ||Tx|| ≤ (2/δ) ||x||. (2.2)

First consider x = 0. Then

||T0|| = ||0|| = 0 = (2/δ) · 0 = (2/δ) ||0||.

And so the claim in (2.2) holds because we have in fact an equality.

On the other hand, now suppose that x ≠ 0. Set y := (δ/(2||x||)) · x. Then

||y|| = (δ/(2||x||)) ||x|| = δ/2 < δ,

and so ||Ty|| < 1, that is, (δ/(2||x||)) ||Tx|| < 1.

Upon rearranging, we obtain (2.2). So the claim in (3) holds with M := 2/δ.

(3)⇒(1). Let M > 0 be such that for all x ∈ X, ||Tx|| ≤ M ||x||. Let x0 ∈ X, and ε > 0. Set δ := ε/M > 0. Then whenever ||x – x0|| < δ, we have

||Tx – Tx0|| = ||T(x – x0)|| ≤ M ||x – x0|| < Mδ = ε.

So T is continuous at x0. But as x0 ∈ X was arbitrary, T is continuous.


Example 2.7. (Norm on C1[a, b] revisited). Consider the differentiation mapping D : C1[0, 1] → C[0, 1] defined by (Dx)(t) = x′(t), t ∈ [0, 1], x ∈ C1[0, 1]. We had seen that D is a linear transformation. Let’s now investigate if D is also continuous.

(1) We will show that D is not continuous if both C1[0, 1] and C[0, 1] are equipped with the || · ||∞ norm. Suppose on the contrary that the map D is continuous. Because D is a linear transformation, it follows from Theorem 2.6 that there exists an M > 0 such that for all x ∈ C1[0, 1],

||Dx||∞ ≤ M ||x||∞.

But if we take x = t^n (n ∈ N), then we have

||x||∞ = sup{|t^n| : t ∈ [0, 1]} = 1 and ||x′||∞ = sup{|n t^(n–1)| : t ∈ [0, 1]} = n,

and so ||Dx||∞ = ||x′||∞ = n ≤ M ||x||∞ = M · 1, that is, n ≤ M for all n ∈ N, which is clearly not true. So D is not continuous.

(2) However, D is continuous if C1[0, 1] is equipped with the || · ||1,∞ norm:

||x||1,∞ := ||x||∞ + ||x′||∞, x ∈ C1[0, 1],

while C[0, 1] has the usual supremum norm || · ||∞. Indeed, we have for all x ∈ C1[0, 1], that ||Dx||∞ = ||x′||∞ ≤ ||x||∞ + ||x′||∞ = ||x||1,∞.
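The contrast between the two norms on C1[0, 1] can be seen numerically for the monomials x = t^n. The Python sketch below is only an illustration: the function names and the grid are our own choices, and the supremum is approximated by a maximum over grid points (which is exact here, since both t^n and n t^(n–1) attain their maximum over [0, 1] at the grid point t = 1).

```python
def sup_norm_on_grid(f, grid):
    """Maximum of |f| over the grid: a sampled stand-in for the supremum norm."""
    return max(abs(f(t)) for t in grid)

# Uniform grid on [0, 1] that contains both endpoints.
grid = [k / 1000 for k in range(1001)]

def ratios(n):
    x = lambda t: t ** n                # x(t) = t^n
    dx = lambda t: n * t ** (n - 1)     # (Dx)(t) = n t^(n-1)
    norm_x = sup_norm_on_grid(x, grid)
    norm_dx = sup_norm_on_grid(dx, grid)
    # First ratio: ||Dx|| / ||x|| with the supremum norm on both spaces.
    # Second ratio: ||Dx|| / ||x||_{1,inf}, where ||x||_{1,inf} = ||x|| + ||x'||.
    return norm_dx / norm_x, norm_dx / (norm_x + norm_dx)

results = {n: ratios(n) for n in (1, 5, 25, 125)}
for n, (unbounded, bounded) in results.items():
    print(n, unbounded, bounded)
```

The first ratio grows like n, so no single M can work in item (3) of Theorem 2.6, while the second ratio never exceeds 1.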


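Example 2.8. (If X, Y are finite dimensional, then L(X, Y) = CL(X, Y).) Let X = (Rn, || · ||2), Y = (Rm, || · ||2) and let A ∈ Rm×n be given by

A = [aij] (1 ≤ i ≤ m, 1 ≤ j ≤ n).

Let TA : Rn → Rm be the linear transformation given by TAx := Ax for all x ∈ Rn. Then for all x ∈ Rn, by the Cauchy-Schwarz inequality in Rn,

(||TAx||2)² = Σi (Σj aij xj)² ≤ Σi (Σj aij²)(Σj xj²) = (Σi Σj aij²) (||x||2)² (where i runs from 1 to m and j from 1 to n),

and so ||TAx||2 ≤ (Σi Σj aij²)^(1/2) ||x||2. Hence TA is continuous.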


Remark 2.2. We know that every linear transformation on finite dimensional vector spaces X, Y can be represented by TA once bases for X, Y have been chosen. Also we know that all norms on finite-dimensional normed spaces are equivalent to each other. It follows from these two facts that every linear transformation between finite dimensional normed spaces is continuous.

Example 2.9. Let image and let image

For x = (xj)j∈N ∈ ℓ2, set image

We claim that TA : ℓ2 → ℓ2 is a continuous linear transformation on ℓ2. Firstly, TAx ∈ ℓ2, since

image

Moreover, it is easily seen that TA ∈ L(ℓ2). Hence, by Theorem 2.6, the computation above shows that TA ∈ CL(ℓ2).


Example 2.10. (Integral operators).

Suppose that A : [0, 1] × [0, 1] → R is such that image

We think of A as a “non-discrete/continuous” analogue of a square matrix: the indices i, j are replaced by the “non-discrete/continuous” indices t, τ.

Then the map TA : L2[0, 1] → L2[0, 1], defined by

image

is a continuous linear transformation. The following picture illustrates the action of TA on x schematically, highlighting the analogy with matrix multiplication.

image

We note that for x ∈ L2[0, 1],

image

(The inequality in the second line above is the Cauchy-Schwarz inequality in L2[0, 1], and it follows from the general Cauchy-Schwarz inequality in inner product spaces, which will be shown in Theorem 4.1, page 157; see also Example 4.3, page 159. We’ll accept this for now.) So TAx ∈ L2[0, 1], and TA ∈ CL(L2[0, 1]).

Operators TA are called integral operators. It used to be common to call the function A, which plays the role of the matrix, the “kernel”1 of the integral operator. Many variations of the integral operator are possible.


Example 2.11. We had seen on page 55 that with

S := {h ∈ C1[a, b] : h(a) = h(b) = 0}

and A, B ∈ C[a, b], the map L : S → R given by

L(h) := ∫_a^b (A(t)h(t) + B(t)h′(t)) dt, h ∈ S,

is a linear transformation. Now we ask: is L continuous? Here we equip S ⊂ C1[a, b] with the norm || · ||1,∞. For h ∈ S,

image

where image In the above, we have used

image

Hence L is continuous.


Example 2.12. (∗)(Fourier transform).

Let L1(R) be the space of all complex valued Lebesgue integrable functions on R, with the usual L1-norm:

||f||1 := ∫_R |f(x)| dx, f ∈ L1(R).

The Fourier transform of f ∈ L1(R) is the function f̂ : R → C defined by

image

Then f̂ is a continuous function on R, and it is also bounded because

image

The vector space Cb(R) of all complex-valued continuous functions on R that are bounded, is a normed space with the supremum norm:

||g||∞ := sup{|g(x)| : x ∈ R}, g ∈ Cb(R).

(We won’t check this; the proof is analogous to Example 1.9, page 10.) Thus from the above, we have f̂ ∈ Cb(R). It is also easy to check that f ↦ f̂ : L1(R) → Cb(R) is a linear transformation, and it is continuous, thanks to the estimate above, giving ||f̂||∞ ≤ ||f||1.


Remark 2.3. Owing to the characterisation of continuous linear transformations by the existence of a bound as in item (3) of Theorem 2.6 above, they are sometimes called bounded linear operators.

Exercise 2.12. Show that if A ∈ Rm×n, then ker A = {x ∈ Rn : Ax = 0} is a closed subspace of Rn.

Exercise 2.13. (∗) Prove that every subspace of Rn is closed.

Hint: Construct a linear transformation whose kernel is the given subspace.

Exercise 2.14. Let C[a, b] be endowed with the || · ||∞-norm.

(1)Show that image is a continuous linear transformation.

(2)Prove that if image converges to f in C[a, b], then image

Exercise 2.15. (Convolution operator).

If f ∈ L1(R), then the corresponding convolution operator f∗ : L∞(R) → L∞(R) is given by

image

Show that f∗ is well-defined and that f∗ ∈ CL(L∞(R)).

Exercise 2.16. Let Y = {f ∈ L2(R) : f = f̌ := f(–·)} be the set of all even functions in L2(R). Show that Y is a closed subspace of L2(R).

Hint: View Y as the kernel of a suitable map in CL(L2(R)).

Operator norm and the normed space CL(X, Y)

Consider the set CL(X, Y) of all continuous linear transformations from a normed space X to a normed space Y. We will show that CL(X, Y) is a normed space, with pointwise operations (inherited from L(X, Y)), and the “operator norm” || · || : CL(X, Y) → R given by

||T|| := sup{||Tx|| : x ∈ X, ||x|| ≤ 1}, T ∈ CL(X, Y).

Let us first show that CL(X, Y) is a subspace of L(X, Y), making it a vector space in its own right.

Proposition 2.1. CL(X, Y) is a subspace of L(X, Y).

Proof. We have:

(S1) Let S, T ∈ CL(X, Y). Then there exist MS, MT > 0 such that for all x ∈ X, ||Sx|| ≤ MS ||x|| and ||Tx|| ≤ MT ||x||. So for all x ∈ X,

||(S + T)x|| = ||Sx + Tx|| ≤ ||Sx|| + ||Tx|| ≤ MS ||x|| + MT ||x|| = (MS + MT) ||x||.

Thus S + T ∈ CL(X, Y) too.

(S2) Let α ∈ K and T ∈ CL(X, Y). There exists an M > 0 such that for all x ∈ X, ||Tx|| ≤ M||x||. So ||(αT)x|| = ||α(Tx)|| = |α| ||Tx|| ≤ |α|M||x||. Hence αT ∈ CL(X, Y).

(S3) The zero linear transformation 0 ∈ CL(X, Y) because for all x ∈ X, ||0x|| = ||0Y|| = 0 ≤ 1 · ||x||.

Consequently, CL(X, Y) is a subspace of L(X, Y).


Next we show that the operator norm || · || : CL(X, Y) → R given by

||T|| := sup{||Tx|| : x ∈ X, ||x|| ≤ 1}, T ∈ CL(X, Y),

is indeed a norm on CL(X, Y). First let us check that ||T|| is a well-defined number. If we set S := {||Tx|| : x ∈ X, ||x|| ≤ 1}, then we note that this is a subset of the real numbers. Let us observe that this is a nonempty set that is bounded above:

(1) S ≠ ∅ because if we take x = 0X ∈ X, then ||x|| = ||0X|| = 0 ≤ 1, and so ||Tx|| = ||T0X|| = ||0Y|| = 0 ∈ S.

(2) S is bounded above. As T ∈ CL(X, Y), there is an M > 0 such that for all x ∈ X, ||Tx|| ≤ M||x||. We claim that M is an upper bound of S. Indeed, if x ∈ X and ||x|| ≤ 1, then ||Tx|| ≤ M||x|| ≤ M · 1 = M.

Since S is a nonempty subset of R which is bounded above, it follows from the Least Upper Bound Property of R that the supremum of S exists: so for all T ∈ CL(X, Y), ||T|| := sup{||Tx|| : x ∈ X, ||x|| ≤ 1} < ∞. In order to do our verification that this operator norm || · || is a norm on CL(X, Y), the following two results will be useful.

Lemma A. Let T ∈ CL(X, Y).

If M > 0 is such that for all x ∈ X, ||Tx|| ≤ M||x||, then ||T|| ≤ M.

Proof. If x ∈ X and ||x|| ≤ 1, then ||Tx|| ≤ M||x|| ≤ M · 1 = M. So M is an upper bound of S = {||Tx|| : x ∈ X, ||x|| ≤ 1}. Thus sup S ≤ M, that is, ||T|| ≤ M.


Lemma B. Let T ∈ CL(X, Y). Then for all x ∈ X, ||Tx|| ≤ ||T|| ||x||.

Proof.

1° x = 0. Then ||Tx|| = ||T0|| = ||0|| = 0 = ||T|| · 0 = ||T|| ||0|| = ||T|| ||x||.

2° Suppose that 0X ≠ x ∈ X. Let y := x/||x||. Then ||y|| = ||x||/||x|| = 1 ≤ 1.

Thus ||Ty|| ∈ S, and so ||Ty|| ≤ sup S = ||T||, that is,

||T(x/||x||)|| = ||Tx||/||x|| ≤ ||T||.

Rearranging, we get ||Tx|| ≤ ||T|| ||x||.


Lemmas A and B together tell us that for a T ∈ CL(X, Y), ||T|| is allowed as an “M” in

for all x ∈ X, ||Tx|| ≤ M ||x||,

and moreover it is the smallest possible such number M, in the sense that any other allowed M has got to be at least as large as ||T||.

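Theorem 2.7. The operator norm, || · || : CL(X, Y) → R, given by

||T|| := sup{||Tx|| : x ∈ X, ||x|| ≤ 1}, T ∈ CL(X, Y),

is a norm on CL(X, Y).

Proof. We have:

(N1) For T ∈ CL(X, Y), ||T|| = sup{||Tx|| : x ∈ X, ||x|| ≤ 1} ≥ 0, since ||Tx|| ≥ 0 for all x.

If T ∈ CL(X, Y) and ||T|| = 0, then for all x ∈ X, ||Tx|| ≤ ||T|| ||x|| = 0||x|| = 0, and so ||Tx|| = 0, that is, Tx = 0Y for all x ∈ X. So T = 0, the zero linear transformation.

(N2) For α ∈ K and T ∈ CL(X, Y),

||αT|| = sup{||(αT)x|| : x ∈ X, ||x|| ≤ 1} = sup{|α| ||Tx|| : x ∈ X, ||x|| ≤ 1} = |α| sup{||Tx|| : x ∈ X, ||x|| ≤ 1} = |α| ||T||.

(N3) Let T, S ∈ CL(X, Y). Then for all x ∈ X,

||(T + S)x|| = ||Tx + Sx|| ≤ ||Tx|| + ||Sx|| ≤ ||T|| ||x|| + ||S|| ||x|| = (||T|| + ||S||) ||x||,

from which it follows (Lemma A) that ||T + S|| ≤ ||T|| + ||S||.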


Example 2.13. Recall Example 2.8, page 69.

Let R^n and R^m be equipped with the Euclidean || · ||_2-norm.

Let A = [a_ij] ∈ R^{m×n}, and T_A ∈ CL(R^n, R^m) be the continuous linear transformation given by T_A x = Ax, x ∈ R^n.

Then we’d seen that for all x ∈ R^n, ||T_A x||_2 ≤ (Σ_{i=1}^m Σ_{j=1}^n |a_ij|²)^{1/2} ||x||_2. So

||T_A|| ≤ (Σ_{i=1}^m Σ_{j=1}^n |a_ij|²)^{1/2}.

So we have an estimate for ||T_A|| in terms of the matrix coefficients a_ij. But there does not exist a general “formula” for ||T_A|| in terms of the matrix coefficients except in the special cases n = 1 or m = 1, when A is a single column or a single row and ||T_A|| equals the bound above. It can be seen that the map

A ↦ ||A||_HS := (Σ_{i=1}^m Σ_{j=1}^n |a_ij|²)^{1/2}

is also a norm on R^{m×n}, and is called the Hilbert-Schmidt norm of A.

image
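To make the estimate in Example 2.13 concrete, here is a minimal numerical sketch. The matrix A and the helper names (operator_norm, hilbert_schmidt) are our own illustrative choices, not part of the text. The operator norm is approximated by power iteration on AᵀA, a standard numerical technique swapped in here since no closed formula is available, and the result is compared with the Hilbert-Schmidt bound.

```python
import math

def matvec(A, x):
    """Return the product Ax for a matrix A (list of rows) and a vector x."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def norm2(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(t * t for t in x))

def hilbert_schmidt(A):
    """Square root of the sum of the squares of all entries of A."""
    return math.sqrt(sum(a * a for row in A for a in row))

def operator_norm(A, iterations=500):
    """Approximate sup{||Ax||_2 : ||x||_2 <= 1} by power iteration on A^T A."""
    At = transpose(A)
    x = [1.0] * len(A[0])          # starting vector (an arbitrary choice)
    for _ in range(iterations):
        y = matvec(At, matvec(A, x))
        size = norm2(y)
        if size == 0.0:
            return 0.0
        x = [t / size for t in y]  # renormalise so that ||x||_2 = 1
    return norm2(matvec(A, x))

A = [[1.0, 2.0], [3.0, 4.0]]       # hypothetical example matrix
op = operator_norm(A)
hs = hilbert_schmidt(A)
print(op, hs)                      # the first number should not exceed the second
```

For this particular A the two numbers are close but not equal, which illustrates that the Hilbert-Schmidt norm is in general only an upper bound for ||T_A||.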

Exercise 2.17. (Diagonal operator norm; operator norm needn’t be attained.) Let (λ_n)_{n∈N} be a bounded sequence in K, and let Λ : ℓ² → ℓ² be given by Λ(a1, a2, a3, ···) = (λ1a1, λ2a2, λ3a3, ···) for all (a1, a2, a3, ···) ∈ ℓ². Show that Λ ∈ CL(ℓ²) and ||Λ|| = sup{|λ_n| : n ∈ N}.

Now let λ_n := 1 − 1/n, n ∈ N. Show that there is no x ∈ ℓ² such that ||x||_2 ≤ 1 and ||Λx||_2 = ||Λ||. This gives an example showing that the operator norm need not be attained.

Exercise 2.18. (Schauder basis). Let X be a Banach space. A sequence of vectors (e_n)_{n∈N} in X is a Schauder basis for X if for every x ∈ X, there exists a unique sequence of numbers (ξ_n)_{n∈N} such that x = Σ_{n=1}^∞ ξ_n e_n, the series converging in the norm of X.

Let 1 ≤ p < ∞, and e_n = (0, ···, 0, 1, 0, ···) be the sequence in ℓ^p with nth term equal to 1 and all others 0. Show that {e_n : n ∈ N} is a Schauder basis for ℓ^p.

Hint: For uniqueness use the continuity of the “coordinate map” φ_n : x ↦ x_n, selecting the nth term of the sequence x.

Remark. A Banach space X that has a Schauder basis is separable, that is, there exists a countable dense subset in X (for example the linear combinations of the e_n with rational coefficients). The converse question, namely whether every separable Banach space has a Schauder basis, was an open problem for a long time. In 1973, the Swedish mathematician Per Enflo finally constructed an example of a separable Banach space that does not have a Schauder basis.

Exercise 2.19. (Invariant subspace, and the Invariant Subspace Problem)

(1) Prove that the averaging operator² A : ℓ∞ → ℓ∞, defined by

A(x1, x2, x3, ···) = (x1, (x1 + x2)/2, (x1 + x2 + x3)/3, ···),

is a continuous linear transformation. What is the operator norm of A?

(2) (∗) A subspace Y of a normed space X is said to be an invariant subspace with respect to a linear transformation T : X → X if TY ⊂ Y. Let A ∈ CL(ℓ∞) be the averaging operator from part (1). Show that the subspace c of ℓ∞, consisting of all convergent sequences, is an invariant subspace of the averaging operator A. Hint: Show that if x ∈ c has limit L, then Ax has limit L.

Remark. Invariant subspaces are useful since they are helpful in studying complicated operators by breaking them down into smaller operators acting on invariant subspaces. This is already familiar to the student from the diagonalisation procedure in linear algebra, where one decomposes the vector space into eigenspaces, and in these eigenspaces the linear transformation acts trivially. One of the open problems in functional analysis is the invariant subspace problem:

Does every T ∈ CL(H) on a separable complex Hilbert space H have a non-trivial invariant subspace?

Hilbert spaces are just special types of Banach spaces, in which the norm is induced by an inner product, and we will learn about Hilbert spaces in Chapter 4. Non-trivial means that the invariant subspace must be different from {0} or H. In the case of Banach spaces, the answer to the above question is “no”: during the annual meeting of the American Mathematical Society in Toronto in 1976, Per Enflo (again!) announced the existence of a Banach space and a bounded linear operator on it without any non-trivial invariant subspace.

Now that we know CL(X, Y) is a normed space with the operator norm, it is natural to ask if CL(X, Y) is complete, that is, if CL(X, Y) is a Banach space. It turns out that CL(X, Y) is a Banach space if and only if Y is a Banach space, and we’ll show this in the next section.

When is CL(X, Y) complete?

We’ll see that CL(X, Y) is a Banach space if and only if Y is a Banach space. In this section the “if” part will be shown, and the “only if” part will be done in Remark 2.9, page 109.

Theorem 2.8. If Y is a Banach space, and X is any normed space, then CL(X, Y) is a Banach space.

Proof. Let (T_n)_{n∈N} be a Cauchy sequence in CL(X, Y). Let x ∈ X. Claim: (T_n x)_{n∈N} is Cauchy in Y.

Indeed, for all n, m, ||T_n x − T_m x|| ≤ ||T_n − T_m|| ||x||.

As Y is Banach, (T_n x)_{n∈N} converges in Y, with limit, say Tx ∈ Y.

So we get a map x ↦ Tx : X → Y.

Questions: (a) Is T ∈ CL(X, Y)?

(b) Does T_n → T in CL(X, Y)?

(a) Is T a linear transformation?

If x1, x2 ∈ X, then (T_n x1)_{n∈N} converges to Tx1 in Y, and (T_n x2)_{n∈N} converges to Tx2 in Y. Thus (T_n x1 + T_n x2)_{n∈N} = (T_n(x1 + x2))_{n∈N} converges to Tx1 + Tx2 in Y. But we know that (T_n(x1 + x2))_{n∈N} converges to T(x1 + x2) in Y. By the uniqueness of limits, T(x1 + x2) = Tx1 + Tx2.

Let α ∈ K and x ∈ X. Then (T_n x)_{n∈N} converges to Tx in Y. So we have (α · (T_n x))_{n∈N} = (T_n(α · x))_{n∈N} converges to α · (Tx) in Y. But (T_n(α · x))_{n∈N} converges to T(α · x) in Y. So α · T(x) = T(α · x).

Is T continuous? Let ε = 1. Then there exists an N ∈ N such that for all n, m > N, ||T_n − T_m|| < ε = 1. So for all n > N, ||T_n − T_{N+1}|| < 1. Thus for n > N and x ∈ X, ||T_n x − T_{N+1} x|| ≤ ||T_n − T_{N+1}|| ||x|| ≤ 1 · ||x||. Passing to the limit as n → ∞, we obtain ||Tx − T_{N+1} x|| ≤ ||x|| for all x ∈ X. So for all x ∈ X, ||Tx|| ≤ ||Tx − T_{N+1} x|| + ||T_{N+1} x|| ≤ ||x|| + ||T_{N+1}|| ||x|| = (1 + ||T_{N+1}||) ||x||. Conclusion: T ∈ CL(X, Y).

(b) Is it true that lim_{n→∞} T_n = T in CL(X, Y)?

Let ε > 0. Then there exists an N ∈ N such that for all n, m > N, we have ||T_n − T_m|| < ε. So for all n, m > N and all x ∈ X, we obtain that ||T_n x − T_m x|| ≤ ||T_n − T_m|| · ||x|| ≤ ε||x||. Passing to the limit as m → ∞, we get that for all n > N and x ∈ X, ||T_n x − Tx|| ≤ ε||x||. Hence, by Lemma A, for all n > N, ||T_n − T|| ≤ ε.

image

Corollary 2.2. If X is a normed space over K, then the dual space of X, X′ := CL(X, K), is a Banach space with the operator norm.

Corollary 2.3. If X is a Banach space, then CL(X) := CL(X, X) is a Banach space with the operator norm.

Remark 2.4. (“Hilbert” versus Banach spaces). In Chapter 4, we’ll meet Hilbert spaces: a Hilbert space is a special type of a Banach space in which the norm is induced by an “inner product”. If instead of Banach spaces, we are interested only in Hilbert spaces, then the notion of a Banach space is still indispensable, since for a Hilbert space H, the normed space CL(H) is typically only a Banach space, and not a Hilbert space in general.

(∗) Strong and weak operator topologies on CL(X, Y)

Many claims in this section won’t be proved, but are included to provide the reader with a “road map”. The main content of the section are the definitions of the three operator topologies and the illustrative examples. One who wants to know more could embark on a deeper study, as offered for example in [Pedersen (1989)] or [Rudin (1976)].

Let a set X be equipped with two topologies, and let X1 (respectively X2) denote the set X equipped with the first (respectively second) topology. If the identity map x ↦ x : X1 → X2 is continuous, namely if every set open in X2 is open in X1, one says that the first topology is stronger than the second, or that the second topology is weaker/coarser/smaller than the first. Of all the topologies on the set X, there is a strongest one (discrete topology), namely the one for which all subsets of X are open, and there is a weakest one (trivial topology), namely the one for which only X, ∅ are open.

Now suppose we have a set X, and a family F = {f_i : X → R | i ∈ I} of maps. Then of course there exists at least one topology on X with respect to which all the maps f_i are continuous, namely the discrete topology on X. However, there is also a “less wasteful/more efficient/weakest” topology on X that makes all the maps f_i, i ∈ I, continuous, characterized by the following: U is open in this topology on X if for every x ∈ U, there exist a finite number of indices i_1, ···, i_n ∈ I and intervals (a_1, b_1), ···, (a_n, b_n) such that x ∈ {y ∈ X : f_{i_k}(y) ∈ (a_k, b_k), k = 1, ···, n} ⊂ U. It can be shown that this gives a topology T on X, and for any other topology T′ on X that makes the maps f_i, i ∈ I, continuous, we have T ⊂ T′.

We had seen that CL(X, Y) is a normed space with the operator norm ||T|| := sup{||Tx|| : x ∈ X, ||x|| ≤ 1}, T ∈ CL(X, Y). We call the resulting topology the uniform operator topology on CL(X, Y); it is the weakest topology making each map in the family

image

continuous. A subset U ⊂ CL(X, Y) is open in the uniform operator topology on CL(X, Y) if for each T ∈ U, there exists an ε > 0 such that {S ∈ CL(X, Y) : ||S − T|| < ε} ⊂ U. A sequence (T_n)_{n∈N} converges to T ∈ CL(X, Y) in the uniform operator topology if lim_{n→∞} ||T_n − T|| = 0.

We remark that besides the uniform operator topology on CL(X, Y), there are weaker topologies (with fewer open sets), on CL(X, Y), called the Strong Topology and the Weak Topology. Here are the definitions, although in this basic introduction, we won’t use these useful alternative topologies much.

Definition 2.4. (Strong Operator Topology)

Let X, Y be normed spaces. Then the weakest topology on CL(X, Y) which makes each map in the family

image

continuous, is called the strong operator topology on CL(X, Y). A subset U ⊂ CL(X, Y) is open in the strong operator topology on CL(X, Y) if for each T ∈ U, there exist an ε > 0 and finitely many x_1, ···, x_n ∈ X such that {S ∈ CL(X, Y) : ||Sx_k − Tx_k|| < ε, k = 1, ···, n} ⊂ U. A sequence (T_n)_{n∈N} converges to T ∈ CL(X, Y) in the strong operator topology if for all x ∈ X, lim_{n→∞} ||T_n x − Tx|| = 0.

Example 2.14. (Strong but not uniform convergence).

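For n ∈ N, let P_n ∈ CL(ℓ²) be the “projection operator” given by

P_n(a1, a2, a3, ···) = (a1, ···, an, 0, 0, ···), (a1, a2, a3, ···) ∈ ℓ².

We claim that (P_n)_{n∈N} converges to the identity operator I ∈ CL(ℓ²) in the strong operator topology.

Indeed, for each a ∈ ℓ², ||Ia − P_n a||_2² = ||(0, ···, 0, a_{n+1}, a_{n+2}, ···)||_2² = Σ_{k=n+1}^∞ |a_k|² → 0 as n → ∞.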

But the sequence (P_n)_{n∈N} does not converge to the identity I ∈ CL(ℓ²) in the uniform operator topology. Let’s show this by contradiction.

Suppose it does converge to I with respect to the operator norm. With ε := 1/2 > 0, there exists an N ∈ N such that ||P_N − I|| < 1/2. So if e_{N+1} ∈ ℓ² is the sequence with the (N + 1)st term 1 and all others 0, then we have

1 = ||e_{N+1}||_2 = ||(P_N − I)e_{N+1}||_2 ≤ ||P_N − I|| ||e_{N+1}||_2 < 1/2,

a contradiction!

image
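The contrast in Example 2.14 can be seen numerically. The following is only a sketch under stated assumptions: sequences in ℓ² are truncated to a finite length M, and the test sequence a_k = 1/k and the helper names (project, unit) are our own illustrative choices. For a fixed a, the distance ||a − P_n a||_2 shrinks as n grows, while for each n the unit vector e_{n+1} satisfies ||(I − P_n)e_{n+1}||_2 = 1.

```python
import math

M = 2000  # truncation length (assumption: stands in for an infinite sequence)
a = [1.0 / k for k in range(1, M + 1)]  # a_k = 1/k, a square-summable sequence

def project(n, x):
    """P_n keeps the first n terms of x and replaces the rest by 0."""
    return x[:n] + [0.0] * (len(x) - n)

def norm2(x):
    return math.sqrt(sum(t * t for t in x))

def unit(k, length):
    """e_k: the sequence whose k-th term (1-indexed) is 1 and all others 0."""
    return [1.0 if j == k else 0.0 for j in range(1, length + 1)]

tails = []  # values of ||a - P_n a||_2 for a fixed a
gaps = []   # values of ||(I - P_n) e_{n+1}||_2
for n in (10, 100, 1000):
    diff = [s - t for s, t in zip(a, project(n, a))]
    tails.append(norm2(diff))
    e = unit(n + 1, M)
    gap = [s - t for s, t in zip(e, project(n, e))]
    gaps.append(norm2(gap))

print(tails)  # decreasing: pointwise (strong) convergence at this a
print(gaps)   # constant: no convergence in the operator norm
```

The second list shows why the convergence cannot be uniform: whatever n is, some unit vector is moved a distance 1 by I − P_n.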

A yet weaker topology than the strong operator topology is the weak operator topology, defined below.

Definition 2.5. (Weak Operator Topology)

Let X, Y be normed spaces. Let Y′ := CL(Y, K). Then the weakest topology on CL(X, Y) which makes each map in the family

image

continuous, is called the weak operator topology on CL(X, Y). A subset U ⊂ CL(X, Y) is open in the weak operator topology on CL(X, Y) if for all T ∈ U, there exist an ε > 0, finitely many x_1, ···, x_n ∈ X, and φ_1, ···, φ_n ∈ Y′ such that

{S ∈ CL(X, Y) : |φ_k(Sx_k) − φ_k(Tx_k)| < ε, k = 1, ···, n} ⊂ U.

A sequence (T_n)_{n∈N} converges to T ∈ CL(X, Y) in the weak operator topology if for all φ ∈ Y′ and for all x ∈ X, lim_{n→∞} φ(T_n x) = φ(Tx).

The following table summarises this:

image

Example 2.15. (Weak but not strong convergence).

Let R ∈ CL(ℓ²) be given by ℓ² ∋ (a1, a2, a3, ···) ↦ (0, a1, a2, a3, ···), the right shift operator. We claim that (R^n)_{n∈N} converges to 0 ∈ CL(ℓ²) in the weak operator topology. We’ll use a result which will be proved later on in Theorem 2.14, page 104 (and also in Chapter 4, Theorem 4.10, page 189):

For each φ ∈ CL(ℓ², C) =: (ℓ²)′, there is an x_φ = (x_φ(k))_{k∈N} ∈ ℓ², such that

φ(a) = Σ_{k=1}^∞ a_k conj(x_φ(k)) for all a = (a_k)_{k∈N} ∈ ℓ².

(Here conj(·) denotes complex conjugation.)

Using the Cauchy-Schwarz inequality (page 159), for all a ∈ ℓ², φ ∈ (ℓ²)′,

|φ(R^n a)| = |Σ_{k=1}^∞ a_k conj(x_φ(n + k))| ≤ ||a||_2 (Σ_{k=n+1}^∞ |x_φ(k)|²)^{1/2} → 0 as n → ∞.

Thus (R^n)_{n∈N} converges to 0 ∈ CL(ℓ²) in the weak operator topology.

If e_1 := (1, 0, 0, ···) ∈ ℓ², then R^n e_1 = (0, ···, 0, 1, 0, ···), the sequence with (n + 1)st term 1 and all others 0. So ||R^n e_1||_2 = 1, n ∈ N. Thus it is not the case that lim_{n→∞} R^n e_1 = 0 = 0e_1.

So (Rn)nN does not converge to 0 in the strong operator topology.

image
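A small numerical sketch of Example 2.15 follows. It rests on assumptions of our own: real sequences truncated to a finite length M, the particular representing sequence x_φ(k) = 1/k, and hypothetical helper names. With a = e_1, the numbers φ(R^n e_1) tend to 0, while the norms ||R^n e_1||_2 stay equal to 1.

```python
import math

M = 5000  # truncation length (assumption)
x_phi = [1.0 / k for k in range(1, M + 1)]  # hypothetical representing sequence of phi

def right_shift_power(a, n):
    """R^n a: prepend n zeros to the (finitely supported) sequence a."""
    return [0.0] * n + list(a)

def pairing(a, x):
    """phi(a) = sum of a_k * x_k; no conjugation is needed for real sequences."""
    return sum(s * t for s, t in zip(a, x))

def norm2(a):
    return math.sqrt(sum(t * t for t in a))

e1 = [1.0]  # e_1 = (1, 0, 0, ...), stored by its leading term
ns = (1, 10, 100, 1000)
values = [pairing(right_shift_power(e1, n), x_phi) for n in ns]
norms = [norm2(right_shift_power(e1, n)) for n in ns]

print(values)  # phi(R^n e_1) = x_phi(n + 1) = 1/(n + 1), tending to 0
print(norms)   # ||R^n e_1||_2, equal to 1 for every n
```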

2.4 Composition of continuous linear transformations

If T ∈ CL(X, Y), S ∈ CL(Y, Z), then the composition ST : X → Z of T, S is defined by (ST)(x) = S(T(x)), x ∈ X.

image

It is easily checked that ST is linear. Moreover, it is continuous too, since for all x ∈ X, we have ||(ST)(x)|| = ||S(T(x))|| ≤ ||S|| ||Tx|| ≤ ||S|| ||T|| ||x||. By Lemma A, the above inequality also shows that ||ST|| ≤ ||S|| ||T||.

In particular, if X is a normed space, then CL(X), besides possessing a natural addition and scalar multiplication (both defined pointwise), also possesses a natural multiplication of elements of CL(X), namely composition (S, T) ↦ ST : CL(X) × CL(X) → CL(X). So CL(X) is an “algebra”. Loosely speaking, an algebra is a vector space in which there is also available a nice way of multiplying vectors and producing new vectors.

Definition 2.6. (Algebra). An algebra is a vector space V in which an associative and distributive multiplication is defined, that is,

u(vw) = (uv)w, (u + v)w = uw + vw, u(v + w) = uv + uw

for all u, v, w ∈ V, and which is related to scalar multiplication so that

α(uv) = u(αv) = (αu)v

for all u, v ∈ V and all α ∈ K. We call e ∈ V a multiplicative identity element if for all v ∈ V, one has ev = v = ve.

The algebra V := CL(X) has a multiplicative identity element, namely the identity operator I. The identity operator is the map I : X → X, given by Ix = x, x ∈ X. The operator I clearly belongs to CL(X) (with ||I|| = 1), and I serves as the multiplicative identity element of the algebra CL(X): IT = T = TI for all T ∈ CL(X).

Definition 2.7. (Normed and Banach algebras).

A normed algebra is an algebra V equipped with a norm || · || that satisfies:

||uv|| ≤ ||u|| ||v|| for all u, v ∈ V.

A Banach algebra is a normed algebra which is complete.

We note that V := CL(X) is a normed algebra. We’d seen earlier that CL(X) is a Banach space if X is a Banach space. So CL(X) is a Banach algebra if X is a Banach space.

Let us note that as opposed to vector addition in CL(X), vector multiplication (that is, composition) in CL(X) is in general not commutative. Here is an example. Take X = R². Let T be clockwise rotation by π/2, and S be reflection in the x-axis, that is,

T(x1, x2) = (x2, −x1) and S(x1, x2) = (x1, −x2), (x1, x2) ∈ R².

Then one can check that TS ≠ ST. This can also be observed visually by observing the distinct fates of the point (1, 0) under TS and under ST:

image
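The non-commutativity can also be checked by a direct computation. In the sketch below, the matrices are written down from the verbal description (clockwise rotation by π/2 and reflection in the x-axis) with respect to the standard basis of R²; the helper names matmul and apply are ours.

```python
def matmul(P, Q):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(P, v):
    """Apply the 2x2 matrix P to the vector v."""
    return tuple(sum(P[i][k] * v[k] for k in range(2)) for i in range(2))

T = [[0, 1], [-1, 0]]  # clockwise rotation by pi/2: (x1, x2) -> (x2, -x1)
S = [[1, 0], [0, -1]]  # reflection in the x-axis:   (x1, x2) -> (x1, -x2)

TS = matmul(T, S)      # first apply S, then T
ST = matmul(S, T)      # first apply T, then S
commutator = [[TS[i][j] - ST[i][j] for j in range(2)] for i in range(2)]

print(apply(TS, (1, 0)))  # (0, -1)
print(apply(ST, (1, 0)))  # (0, 1)
print(commutator)         # [[0, -2], [-2, 0]], which is not the zero matrix
```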

The commutator of A, B ∈ CL(X) is defined by [A, B] = AB − BA, and “measures” the lack of commutativity of A and B. The above example shows that the commutator need not be 0. In Exercise 2.22, page 86, we will investigate the “largeness” of the commutator in finite and infinite dimensional spaces X. This plays a role in Quantum Mechanics. We’ll show in Chapter 4 (page 204) that for “observables” A, B, the Heisenberg Uncertainty Relation holds:

image

We won’t explain this3 right now, but we simply notice that the commutator makes an appearance on the right-hand side.

If dim X = d < ∞, and T, S ∈ CL(X) are such that TS = I, then ST = I too. So TS = I implies TS = ST = I. (Let us show this. First of all, if TS = I, then ker S = {0}. Indeed, if Sx = 0, then

x = Ix = (TS)x = T(Sx) = T0 = 0.

Next observe that if {v_1, ···, v_d} is a basis for X, then Sv_1, ···, Sv_d are linearly independent: if α_1, ···, α_d are scalars such that α_1Sv_1 + ··· + α_dSv_d = 0, then S(α_1v_1 + ··· + α_dv_d) = 0, and so α_1v_1 + ··· + α_dv_d = 0, making all the α_k zero. Hence {Sv_1, ···, Sv_d} must be a basis for X. For x ∈ X, there exist β_1, ···, β_d in K such that x = β_1Sv_1 + ··· + β_dSv_d = S(β_1v_1 + ··· + β_dv_d); and so STx = STS(β_1v_1 + ··· + β_dv_d) = SI(β_1v_1 + ··· + β_dv_d) = S(β_1v_1 + ··· + β_dv_d) = x.)

However, if dim X = ∞, then it can happen that TS = I, but ST ≠ I. Consider for example the left/right shift operators L, R on ℓ². We have LR = I as LR(a1, a2, a3, ···) = L(0, a1, a2, ···) = (a1, a2, ···) = I(a1, a2, a3, ···), for all (a1, a2, a3, ···) ∈ ℓ². But RL ≠ I since

RL(a1, a2, a3, ···) = R(a2, a3, ···) = (0, a2, a3, ···) ≠ (a1, a2, a3, ···) whenever a1 ≠ 0.
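The behaviour of the two shifts is easy to try out on finitely supported sequences. In this sketch a sequence is stored as a tuple of its leading terms, with all later terms understood to be 0; this representation and the function names are our own choices.

```python
def right_shift(a):
    """R(a1, a2, a3, ...) = (0, a1, a2, a3, ...)."""
    return (0,) + tuple(a)

def left_shift(a):
    """L(a1, a2, a3, ...) = (a2, a3, ...)."""
    return tuple(a[1:])

a = (1, 2, 3)                       # stands for (1, 2, 3, 0, 0, ...)
LR = left_shift(right_shift(a))     # L(0, 1, 2, 3) = (1, 2, 3)
RL = right_shift(left_shift(a))     # R(2, 3) = (0, 2, 3)

print(LR == a)  # True: LR acts as the identity
print(RL)       # (0, 2, 3): the first term has been lost, so RL is not the identity
```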

This prompts the following definition.

Definition 2.8. (Invertible operator) Let X be a normed space. An element A ∈ CL(X) is said to be invertible if there exists a B ∈ CL(X) such that AB = I = BA.

Inverses are unique. This follows from the associativity of composition.

Proposition 2.2. If A ∈ CL(X) is invertible, then there exists a unique B ∈ CL(X) such that AB = I = BA.

The unique inverse of an invertible A ∈ CL(X) is denoted by A⁻¹ ∈ CL(X).

Proof. If B_1, B_2 ∈ CL(X) satisfy AB_1 = I = B_1A and AB_2 = I = B_2A, then B_1 = IB_1 = (B_2A)B_1 = B_2(AB_1) = B_2I = B_2.

image

Proposition 2.3. If A ∈ CL(X) is invertible, then A is bijective.

Proof. If x, y ∈ X are such that Ax = Ay, then A⁻¹(Ax) = A⁻¹(Ay), that is, Ix = Iy, and so x = y. Thus A is injective/one-to-one.

If y ∈ X, then x := A⁻¹y ∈ X, and so Ax = A(A⁻¹y) = Iy = y. Hence A is surjective/onto too.

image

If A ∈ CL(X) is bijective, then the inverse map is automatically a linear transformation. In the case when dim X < ∞, we have L(X) = CL(X). So in this case the inverse is automatically continuous too. So if dim X < ∞, then A ∈ CL(X) is invertible if and only if A is a bijection.

In the infinite dimensional case, is it still true that if A ∈ CL(X) is a bijection, then A must be invertible? The answer is “yes” if X is a Banach space. The proof is not immediate, and we will show this below, using a deep result called the “Open Mapping Theorem”. But first, let us see an example showing that in non-Banach spaces, the inverses of continuous bijections may fail to be continuous.

Example 2.16. (Bijection, but not invertible.)

Recall that c00 is the subspace of ℓ∞ of all finitely supported sequences. Consider the map A : c00 → c00 given by

A(x1, x2, x3, ···) = (x1, x2/2, x3/3, ···), (x1, x2, x3, ···) ∈ c00.

Then A is linear, and continuous (because ||Ax||∞ ≤ ||x||∞ for all x ∈ c00). It is also easily seen that A is injective and surjective. So A is a bijection. However, it is not invertible. Indeed, if otherwise, B ∈ CL(c00) is the inverse, then we would have, with e_m := (0, ···, 0, 1, 0, ···) (mth term 1, all others 0), that

m = ||m · e_m||∞ = ||BA(m · e_m)||∞ = ||Be_m||∞ ≤ ||B|| ||e_m||∞ = ||B||,

giving m ≤ ||B|| for all m ∈ N, a contradiction. But we aren’t shocked by this example, since c00 is not complete with the supremum norm, and the equivalence of bijectivity with invertibility is supposed to hold for operators in a Banach space.

image

Exercise 2.20. (When is the diagonal operator invertible?)

Let (λ_n)_{n∈N} be a bounded sequence in K, and consider Λ ∈ CL(ℓ²) given by

Λ(a1, a2, a3, ···) = (λ1a1, λ2a2, λ3a3, ···), (a1, a2, a3, ···) ∈ ℓ².

Show that Λ is invertible in CL(ℓ²) if and only if inf{|λ_n| : n ∈ N} > 0.

Exercise 2.21. Let X be a normed space, and suppose that A, B ∈ CL(X).

Show that if I + AB is invertible, then I + BA is also invertible, with the inverse (I + BA)⁻¹ given by I − B(I + AB)⁻¹A.

Remark. This identity can be used to show that the nonzero spectrum of AB and BA coincide. λ is said to be in the spectrum of an operator T if λI − T is not invertible in CL(X).
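Before attempting a proof, one can sanity-check the formula of Exercise 2.21 on small matrices. The following sketch uses two hypothetical 2 × 2 real matrices A and B of our own choosing and helper names of our own; it only tests one instance and is no substitute for the general argument.

```python
def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(P, Q):
    return [[P[i][j] + Q[i][j] for j in range(2)] for i in range(2)]

def sub(P, Q):
    return [[P[i][j] - Q[i][j] for j in range(2)] for i in range(2)]

def inverse(P):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / det, -P[0][1] / det],
            [-P[1][0] / det, P[0][0] / det]]

I2 = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0], [0.0, 1.0]]    # hypothetical test matrices
B = [[0.5, 0.0], [1.0, -1.0]]

# Candidate inverse of I + BA, namely I - B (I + AB)^(-1) A.
candidate = sub(I2, matmul(B, matmul(inverse(add(I2, matmul(A, B))), A)))
left = matmul(add(I2, matmul(B, A)), candidate)
right = matmul(candidate, add(I2, matmul(B, A)))

print(left)   # should be (numerically) the identity matrix
print(right)  # likewise
```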

Exercise 2.22. ([A, B] can’t be “large” for A, B ∈ CL(X).)

(1) The trace, tr(A), of a square matrix A = [a_ij] ∈ C^{d×d} is the sum of its diagonal entries: tr(A) = a_11 + ··· + a_dd. It can be shown that tr(A + B) = tr(A) + tr(B) and that tr(AB) = tr(BA). Prove that there cannot exist A, B in C^{d×d} such that AB − BA = I, where I denotes the d × d identity matrix.

(2) Let X be a normed space, and A, B be in CL(X). Show that if AB − BA = I, then for all n ∈ N, AB^n − B^nA = nB^{n−1}, where we set B^0 := I. Taking the operator norm on both sides of AB^n − B^nA = nB^{n−1}, conclude that we can never have AB − BA = I with A, B ∈ CL(X).

(3) Let C^∞(R) denote the set of all functions f : R → R such that for all n ∈ N, the nth derivative f^(n) exists. It is clear that C^∞(R) is a vector space with pointwise operations. Consider the operators A, B : C^∞(R) → C^∞(R) given as follows:

(Af)(x) = f′(x) and (Bf)(x) = x f(x), x ∈ R, f ∈ C^∞(R).

(The operators A and B appear as the momentum operator and the position operator in Quantum Mechanics.) Show that AB − BA = I, where I denotes the identity on C^∞(R).

The Neumann Series Theorem.

Theorem 2.9. (Neumann⁴ Series Theorem).

Let X be a Banach space, and A ∈ CL(X) be such that ||A|| < 1.

Then (1) I − A is invertible in CL(X),

(2) (I − A)⁻¹ = I + A + A² + ··· + A^n + ··· = Σ_{n=0}^∞ A^n, and

(3) ||(I − A)⁻¹|| ≤ 1/(1 − ||A||).

In particular, I − A : X → X is bijective: for each y ∈ X, there exists a unique solution x ∈ X of the equation x − Ax = y, and moreover,

||x|| ≤ ||y||/(1 − ||A||),

so that x depends continuously on y.

This plays a role in integral equation theory:

image

where y, k are given, and x is the unknown function.

(This is called the Fredholm equation of the second type.)

Proof. (Of the Neumann Series Theorem). For all n ∈ N, ||A^n|| ≤ ||A||^n.

As ||A|| < 1, Σ_{n=0}^∞ ||A||^n converges. By comparison, Σ_{n=0}^∞ ||A^n|| converges too.

As X is Banach, so is CL(X). Since all absolutely convergent series in the Banach space CL(X) converge, it follows that

S := Σ_{n=0}^∞ A^n

converges in CL(X). Is this S the inverse of I − A? For n ∈ N, define

S_n := I + A + A² + ··· + A^n.

Then we know that lim_{n→∞} S_n = S in CL(X). We have

AS_n = S_nA = A + A² + ··· + A^{n+1} = S_{n+1} − I.

Since ||AS_n − AS|| ≤ ||A|| ||S_n − S|| and ||S_nA − SA|| ≤ ||A|| ||S_n − S||, it follows on letting n → ∞ that SA = AS = S − I. This gives (I − A)S = I = S(I − A). Hence I − A is invertible in CL(X) and

(I − A)⁻¹ = S = Σ_{n=0}^∞ A^n.

Moreover, ||(I − A)⁻¹|| = ||S|| ≤ Σ_{n=0}^∞ ||A^n|| ≤ Σ_{n=0}^∞ ||A||^n = 1/(1 − ||A||).

image
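In finite dimensions the Neumann series can be watched converging. The sketch below is illustrative only: the 2 × 2 matrix A is a hypothetical choice of ours whose Hilbert-Schmidt norm (and hence operator norm) is less than 1, and the function names are ours. It forms the partial sum S_n = I + A + ··· + A^n and checks how close (I − A)S_n is to I.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def neumann_partial_sum(A, n_terms):
    """Return S_n = I + A + A^2 + ... + A^n with n = n_terms."""
    S = identity(len(A))
    power = identity(len(A))
    for _ in range(n_terms):
        power = matmul(power, A)   # power now holds the next power of A
        S = matadd(S, power)
    return S

A = [[0.2, 0.1], [0.3, 0.4]]       # hypothetical matrix with ||A|| < 1
I2 = identity(2)
I_minus_A = [[I2[i][j] - A[i][j] for j in range(2)] for i in range(2)]

S = neumann_partial_sum(A, 60)
residual = matmul(I_minus_A, S)    # equals I - A^61, which is very close to I
max_error = max(abs(residual[i][j] - I2[i][j])
                for i in range(2) for j in range(2))
print(max_error)
```

Since (I − A)S_n = I − A^{n+1}, the error is governed by how fast the powers of A decay, which is fast here but can be slow when ||A|| is close to 1.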

Exercise 2.23. Consider the system

image

in the unknown variables (x1, x2) ∈ R². If I denotes the 2 × 2 identity matrix, then this system can be written as (I − K)x = y, where

image

(1) Show that if R² is equipped with the norm || · ||_2, then ||K|| < 1.
Conclude that (2.3) has a unique solution (denoted by x in the sequel).

(2) Find out the unique solution x by computing (I − K)⁻¹.

(3) Write a computer program to compute x_n = (I + K + ··· + K^n)y and the relative error ||x − x_n||_2/||x||_2 for various values of n (say, until the relative error is less than 1%). Note the slow convergence of the Neumann series.

Exercise 2.24. Let X be a Banach space, and let A ∈ CL(X) be such that ||A|| < 1.

For n ∈ N, let P_n := (I + A)(I + A²)(I + A⁴) ··· (I + A^{2^n}).

(1) Using induction, show that (I − A)P_n = I − A^{2^{n+1}} for all n ∈ N.

(2) Prove that (P_n)_{n∈N} is convergent in CL(X) to (I − A)⁻¹.

Exercise 2.25. (∗) (The set of invertibles is open, and T ↦ T⁻¹ is continuous.)

Let X be a Banach space and GL(X) denote the set of all invertible continuous linear transformations on X.

(1) Prove that GL(X) is an open subset of CL(X) in the usual operator norm topology.

(2) Prove that T ↦ T⁻¹ is continuous on GL(X), that is, for all T_0 ∈ GL(X) and each ε > 0, there exists a δ > 0 such that if T ∈ CL(X) satisfies ||T − T_0|| < δ, then T ∈ GL(X) and ||T⁻¹ − T_0⁻¹|| < ε.

The exponential of an operator. Let X be a Banach space and let A ∈ CL(X). We will now study the exponential operator e^A ∈ CL(X).

For a ∈ R, one defines the exponential e^a ∈ R by

e^a := Σ_{n=0}^∞ a^n/n! = 1 + a + a²/2! + a³/3! + ···.

The exponential function is useful, because it provides a solution to the initial value problem for the most basic differential equation

x′(t) = a x(t), t ∈ R, x(0) = x_0.

(Here x(t) ∈ R and x_0 ∈ R.) The unique solution is given by x(t) = e^{ta} x_0, t ∈ R. This fundamental differential equation arises in all sorts of applications, for example, radioactive decay, Newton’s law of cooling, continuous compound interest, population growth, etc.

For A ∈ CL(X), we will show that an analogous definition,

e^A := Σ_{n=0}^∞ (1/n!) A^n = I + A + A²/2! + A³/3! + ···

(where we have simply replaced the little a by capital A!) works, and the series converges in CL(X). Then the map t ↦ e^{tA} x_0 provides a solution to the analogous initial value problem, but now in the Banach space X, with the initial condition x_0 ∈ X.

Theorem 2.10. Let X be a Banach space, and A ∈ CL(X).

Then Σ_{n=0}^∞ (1/n!) A^n converges in CL(X).

Proof. The real series Σ_{n=0}^∞ ||A||^n/n! converges (to e^{||A||}). Since for n ∈ N we have ||(1/n!) A^n|| ≤ ||A||^n/n!, by the Comparison Test, Σ_{n=0}^∞ (1/n!) A^n converges absolutely. So Σ_{n=0}^∞ (1/n!) A^n converges in the Banach space CL(X).

image

Remark 2.5. (∗) Recall that when a ∈ R, we have (d/dt) e^{ta} = a e^{ta}.

Similarly, it can be shown that when A ∈ CL(X), (d/dt) e^{tA} = A e^{tA} = e^{tA} A.

The last equality is not superfluous, since commutativity of multiplication in CL(X) is not always guaranteed, but it turns out that A does commute with e^{tA}. Formally, the above result is not surprising, as can be seen by differentiating the series for e^{tA} termwise with respect to t:

(d/dt)(I + tA + (t²/2!)A² + (t³/3!)A³ + ···) = A + tA² + (t²/2!)A³ + ··· = A(I + tA + (t²/2!)A² + ···) = A e^{tA}.

A rigorous justification can be given using the fact that e^{(t+s)A} = e^{tA} e^{sA} for all s, t ∈ R. In general, if A, B ∈ CL(X) commute, that is, AB = BA, then e^{A+B} = e^A e^B. This shows that e^A is always invertible in CL(X). Indeed, since A commutes with −A, we have e^A e^{−A} = e^{A−A} = e^0 = I = e^{−A} e^A.

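Now let x_0 ∈ X, A ∈ CL(X), and consider the initial value problem:

x′(t) = Ax(t), t ∈ R, x(0) = x_0.

Then x(t) := e^{tA} x_0, t ∈ R, solves the initial value problem because

x′(t) = (d/dt)(e^{tA} x_0) = A e^{tA} x_0 = Ax(t),

with x(0) = e^{0A} x_0 = e^0 x_0 = Ix_0 = x_0.

Moreover, the solution is unique, since if x̃ is any solution, then

(d/dt)(e^{−tA} x̃(t)) = −A e^{−tA} x̃(t) + e^{−tA} x̃′(t) = −A e^{−tA} x̃(t) + e^{−tA} A x̃(t) = 0,

so that e^{−tA} x̃(t) = e^{−0A} x̃(0) = Ix_0 = x_0 for all t, giving

x̃(t) = e^{tA} x_0

for all t ∈ R. Hence the solution t ↦ e^{tA} x_0, t ∈ R, is unique.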

Initial value problems in Banach spaces of the above type arise from initial boundary value problems for partial differential equations and their discretisations. More generally, the operator A in the initial value problem is then “unbounded”, and similar to t ↦ e^{tA}, one can then associate a “C0-semigroup (e^{tA})_{t≥0} generated by the infinitesimal operator A”. The solution to the initial value problem is given by x(t) = e^{tA} x_0 for t ≥ 0. For example, the initial value problem for the diffusion equation with the homogeneous Dirichlet boundary conditions

image

gives the initial value problem for the following ordinary differential equation in the Banach space L2[0, 1]:

image

where x(t) = u(·, t) ∈ L²[0, 1], and A : D(A) (⊂ L²[0, 1]) → L²[0, 1] is an unbounded operator given by

image

and image

This completes our (rather long!) Remark 2.5.

Example 2.17. (Computing eA for diagonalisable A). Consider the system

image

With x = (x1, x2), this system can be written as x′(t) = Ax(t), where

image

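We know that given the initial condition x_0 = (x1(0), x2(0)) ∈ R², the unique solution is x(t) = e^{tA} x_0. This raises the question:

How do we compute e^{tA}?

There are several ways, but let us consider a method which works for diagonalisable A. Write diag(λ1, λ2) for the diagonal 2 × 2 matrix with diagonal entries λ1, λ2. First we note that if

D = diag(λ1, λ2), then D^n = diag(λ1^n, λ2^n) for all n,

and so

e^D = Σ_{n=0}^∞ (1/n!) D^n = diag(e^{λ1}, e^{λ2}).

Note in particular that e^0 = I, and so calculating e^A cannot be the same as taking exponentials of the entries of A!

Now suppose that A is diagonalisable, that is, A = PDP⁻¹ where D is diagonal and P is invertible. Then A^n = PD^nP⁻¹ and so

e^A = Σ_{n=0}^∞ (1/n!) PD^nP⁻¹ = P (Σ_{n=0}^∞ (1/n!) D^n) P⁻¹ = Pe^DP⁻¹.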

Let’s see this method in action when image where a, bR.

By computing the eigenvalues and eigenvectors of A, we can write

image

and so

image

In particular, our initial value problem for (2.4) has the solution (putting a = 1 and b = 2 above)

image

So we’ve seen how to compute eA if the matrix A is diagonalisable. However, not all matrices are diagonalisable. For example, consider the matrix

image

The eigenvalues of this matrix are both 0, and so if it were diagonalisable, say A = PDP⁻¹, then the diagonal matrix D must be the zero matrix. But then A = PDP⁻¹ = P0P⁻¹ = 0, and we have arrived at a contradiction since A ≠ 0! So this A is not diagonalisable.

In general, however, every matrix has what is called a Jordan canonical form, that is, there exists an invertible P such that P⁻¹AP = D + N, where D is diagonal, N is nilpotent (that is, there exists an n ∈ N such that N^n = 0), and D and N commute. Then the exponential of A is:

e^A = Pe^{D+N}P⁻¹ = Pe^D e^N P⁻¹ = Pe^D (I + N + N²/2! + ··· + N^{n−1}/(n − 1)!) P⁻¹.

But the computation of a P taking A to its Jordan form requires some sophisticated linear algebra, and we won’t treat this here. The interested reader is referred to [Hirsch and Smale (1974), Chapter 6].

image
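The diagonalisation recipe of Example 2.17 can be compared with a direct summation of the exponential series. The sketch below uses a hypothetical symmetric matrix A with eigenvalues 3 and 1 (our own choice, not the matrix of the example), for which P, D and P⁻¹ can be written down by hand, and a nilpotent matrix N with N² = 0 (also our own choice); the function names are ours.

```python
import math

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_series(A, terms=40):
    """Sum the first `terms` terms of I + A + A^2/2! + A^3/3! + ..."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]                  # current term A^k / k!
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

# Hypothetical diagonalisable matrix: A = P D P^(-1) with D = diag(3, 1).
A = [[2.0, 1.0], [1.0, 2.0]]
P = [[1.0, 1.0], [1.0, -1.0]]                        # columns are eigenvectors
P_inv = [[0.5, 0.5], [0.5, -0.5]]
exp_D = [[math.exp(3.0), 0.0], [0.0, math.exp(1.0)]]

via_diagonalisation = matmul(P, matmul(exp_D, P_inv))
via_series = exp_series(A)
print(via_diagonalisation)
print(via_series)                                    # should agree closely

# A nilpotent matrix N with N^2 = 0: the series stops after two terms.
N = [[0.0, 1.0], [0.0, 0.0]]
print(exp_series(N))                                 # [[1.0, 1.0], [0.0, 1.0]], that is, I + N
```

Note that the exponential of N is not obtained by exponentiating its entries, in line with the warning in the example.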

Exercise 2.26. (e^{A+B} ≠ e^A e^B).

Compute eA and eB, where A, B are the nilpotent matrices image

Give an example of matrices A, B ∈ R^{2×2} for which e^{A+B} ≠ e^A e^B.

2.5 (∗) Open Mapping Theorem

In this section, we will show Theorem 2.11, the “Open Mapping Theorem”. The proofs in this section are somewhat more technical than the rest of the sections of this chapter.

Definition 2.9. (Open map) Let X, Y be normed spaces.

T ∈ CL(X, Y) is called open if for all open sets U ⊂ X, T(U) is open in Y.

Proposition 2.4.

Let X, Y be normed spaces, T ∈ CL(X, Y), and B := {x ∈ X : ||x|| ≤ 1}. Then the following are equivalent:

(1) T is open.

(2) There exists a δ > 0 such that B(0_Y, δ) ⊂ T(B).

Proof.

(2)⇒(1): Suppose that there exists a δ > 0 such that B(0_Y, δ) ⊂ T(B). Let U be open in X. If y_0 ∈ T(U), then y_0 = Tx_0 for some x_0 ∈ U. As U is open, there exists an r > 0 such that the open ball B(x_0, r) with centre x_0 and radius r is contained in U. We claim that the open ball B(y_0, δr/2) is contained in T(U). If y ∈ B(y_0, δr/2), then ||y − y_0|| < δr/2, that is, ||(2/r)(y − y_0)|| < δ, and so (2/r)(y − y_0) ∈ B(0_Y, δ) ⊂ T(B). Hence there exists an x ∈ B such that (2/r)(y − y_0) = Tx, that is, we have y = T((r/2)x + x_0). But as ||((r/2)x + x_0) − x_0|| = (r/2)||x|| ≤ (r/2) · 1 < r, we see that (r/2)x + x_0 ∈ B(x_0, r) ⊂ U. Consequently, y ∈ T(U), as desired.

(1)⇒(2): Suppose that T is open. Then T(B(0_X, 1)), the image of the open set B(0_X, 1), must be open. But 0_Y = T0_X ∈ T(B(0_X, 1)), and so, there must exist a δ > 0 such that the open ball B(0_Y, δ) ⊂ T(B(0_X, 1)) ⊂ T(B), as wanted.

image

Lemma 2.1. (Baire Lemma)

Let (1) X be a Banach space, and

(2) (F_n)_{n∈N} be a sequence of closed sets in X such that X = ∪_{n∈N} F_n.

Then there exist an n ∈ N and a nonempty open set U such that U ⊂ F_n.

Proof. We assume that none of the sets F_n contains a nonempty open subset and construct a Cauchy sequence that converges to a point, which lies in none of the F_n, contradicting the fact that the F_n cover X.

First let us observe that whenever a closed set F in X does not contain any nonempty open set, we have that the complement X \ F is dense in X. (To see this, let x ∈ X, and r > 0. We’d like to show that B(x, r) ∩ (X \ F) ≠ ∅. If x ∈ X \ F, then x ∈ B(x, r) ∩ (X \ F), and we are done. On the other hand, if x ∉ X \ F, then x ∈ F. But as F doesn’t contain any nonempty open set, it won’t, in particular, contain B(x, r). So there must be an element y in B(x, r) which is not in F. But this means that y ∈ X \ F, and so we’ve got y ∈ B(x, r) ∩ (X \ F), as wanted.) By our assumption, it follows that X \ F_n is dense in X for all n ∈ N.

Let x_1 be any element in the nonempty (dense!) open set X \ F_1. Let r_1 > 0 be such that the closed ball B̄(x_1, r_1) := {x ∈ X : ||x − x_1|| ≤ r_1} ⊂ X \ F_1. As X \ F_2 is dense in X, there exists an x_2 ∈ B(x_1, r_1) ∩ (X \ F_2). As B(x_1, r_1) ∩ (X \ F_2) is open, we can find an r_2 < r_1/2 such that B̄(x_2, r_2) ⊂ B(x_1, r_1) ∩ (X \ F_2). As X \ F_3 is dense in X, there exists an x_3 ∈ B(x_2, r_2) ∩ (X \ F_3). As B(x_2, r_2) ∩ (X \ F_3) is open, we can find an r_3 < r_1/4 such that B̄(x_3, r_3) ⊂ B(x_2, r_2) ∩ (X \ F_3).

image

Proceeding in this manner, we obtain a sequence (x_n)_{n∈N}, with the term x_{n+1} ∈ B(x_n, r_n). If n > m, then B(x_n, r_n) ⊂ B(x_m, r_m), and so we have ||x_n − x_m|| < r_m ≤ r_1/2^{m−1}, which tends to 0 as m → ∞. Thus (x_n)_{n∈N} is Cauchy, and as X is Banach, also convergent, say, to x ∈ X. With a fixed m, in the inequality above, if we pass to the limit as n → ∞, then we obtain ||x − x_m|| ≤ r_m, that is, x belongs to the closed ball with centre x_m and radius r_m, which is contained in X \ F_m. As the choice of m ∈ N was arbitrary, for all m ∈ N, x ∉ F_m. But this contradicts the fact that the F_m cover X.

image

Exercise 2.27.

Show that the Hamel basis5 of a Banach space can only be finite or uncountable.

Before proving the Open Mapping Theorem, we’ll give some notation and a useful technical result. For subsets A, B of a normed space X and a scalar α, we set αA := {αa : a ∈ A}, and A + B := {a + b : a ∈ A, b ∈ B}.

Lemma 2.2. Let X be a normed space, and A ⊂ X satisfy

(1) A is symmetric, that is, −A = A,

(2) A is mid-point convex, that is, for all x, y ∈ A, (x + y)/2 ∈ A, and

(3) there is a nonempty open set U ⊂ A.

Then there exists a δ > 0 such that B(0, δ) ⊂ A.

Proof. First note that for a fixed scalar α ≠ 0, and an a ∈ X, the maps x ↦ x + a : X → X and x ↦ αx : X → X, are both continuous, with the continuous inverses (x ↦ x − a and x ↦ α⁻¹x).

Hence if U is open in X, then U + {−a} is open in X.

So U + (−A) = ∪_{a∈A} (U + {−a}) is open in X. Thus (1/2)(U + (−A)) is open in X.

If a ∈ U, then 0 = (a + (−a))/2 ∈ (1/2)(U + (−A)).

Thus there exists a δ > 0 such that B(0, δ) ⊂ (1/2)(U + (−A)) ⊂ (1/2)(A + A) ⊂ A, where the last two inclusions use U ⊂ A together with the symmetry and the mid-point convexity of A.

image

Theorem 2.11. (Open Mapping Theorem).

Let X, Y be Banach spaces, and T ∈ CL(X, Y) be surjective.

Then T is open.

Proof. Throughout, cl(S) denotes the closure of a set S ⊂ Y. Let B := {x ∈ X : ||x|| ≤ 1}. Then X = ∪_{n∈N} nB. Thanks to the surjectivity of T, we have Y = ∪_{n∈N} T(nB). Thus certainly Y = ∪_{n∈N} cl(T(nB)). It can be checked that cl(T(nB)) = n cl(T(B)). By the Baire Lemma, there exists an n ∈ N such that n cl(T(B)) contains a nonempty open set. But since the map y ↦ ny : Y → Y is continuous with a continuous inverse, it follows that cl(T(B)) contains a nonempty open set too. By Lemma 2.2, there exists a δ > 0 such that B(0_Y, δ) ⊂ cl(T(B)). We will now show that this implies

B(0_Y, δ/2) ⊂ T(B),     (2.5)

giving the required openness of T by Proposition 2.4. Let y ∈ Y be such that ||y|| < δ/2. We must show that there exists an x ∈ B with y = Tx. Using B(0_Y, δ) ⊂ cl(T(B)), it can be seen that

B(0_Y, δ/2^n) ⊂ cl(T((1/2^n)B)) for all n ∈ N.     (2.6)

From (2.6), with n = 1, it follows that we can arbitrarily closely approximate y by elements from T(B/2). Thus there exists an x_1 with ||x_1|| ≤ 1/2 such that ||y − Tx_1|| < δ/4, that is, y − Tx_1 ∈ B(0, δ/4). From (2.6) again it follows (with n = 2) that we can arbitrarily closely approximate y − Tx_1 by an element Tx_2 with ||x_2|| ≤ 1/4: ||y − Tx_1 − Tx_2|| < δ/8. Proceeding in this manner, we can inductively construct a sequence (x_n)_{n∈N} such that: ||x_n|| ≤ 1/2^n and ||y − Tx_1 − Tx_2 − ··· − Tx_n|| < δ/2^{n+1}.

As ||x_n|| ≤ 1/2^n, Σ_{n=1}^∞ x_n is absolutely convergent, and Σ_{n=1}^∞ ||x_n|| ≤ Σ_{n=1}^∞ 1/2^n = 1.

If we denote the sum of the series Σ_{n=1}^∞ x_n by x, then

Tx = lim_{n→∞} T(x_1 + ··· + x_n) = lim_{n→∞} (Tx_1 + ··· + Tx_n) = y,

thanks to the continuity of T. Since ||x|| ≤ Σ_{n=1}^∞ ||x_n|| ≤ 1, this proves the desired inclusion (2.5).

image

Corollary 2.4. If X, Y are Banach spaces, and T ∈ CL(X, Y) is bijective, then T⁻¹ ∈ L(Y, X) is continuous.

We then refer to T as a normed space isomorphism, and say that X, Y are isomorphic (as normed spaces), written X ≃ Y.

Proof. T is open, and so if U is open in X, then T(U) is open in Y. But (T⁻¹)⁻¹(U) = {y ∈ Y : T⁻¹y ∈ U} = {y ∈ Y : y ∈ T(U)} = T(U). Thus the inverse images of open sets under T⁻¹ are open, showing that T⁻¹ is continuous.

image

Exercise 2.28. Construct a continuous and surjective, but not open, map f : R → R.

Exercise 2.29. (Closed Graph Theorem).

The aim of this exercise is to prove the Closed Graph Theorem:

Let X, Y be Banach spaces and T : X → Y be a linear transformation.

Then T is continuous if and only if its graph G(T) is closed in X × Y.

Here X × Y has the norm ||(x, y)|| := max{||x||, ||y||}, (x, y) ∈ X × Y, and the set G(T) := {(x, Tx) : x ∈ X} ⊂ X × Y is the graph of T.

The “only if” part is easy to see. If (x_n, Tx_n) → (x, y), then x_n → x, and as T is continuous, ||Tx_n – Tx|| ≤ ||T|| ||x_n – x||, so that Tx_n → Tx. But Tx_n → y, and so, by the uniqueness of limits, Tx = y. Thus (x_n, Tx_n) → (x, Tx) ∈ G(T), showing that G(T) is closed.

Show the “if” part. Hint: Consider p : G(T) → X, where p((x, Tx)) = x, x ∈ X.

Uniform Boundedness Principle.

We give below another important application of the Baire Lemma.

Theorem 2.12. (Uniform Boundedness Principle).

Suppose that

(1)X and Y are Banach spaces,

(2)T_i ∈ CL(X, Y), i ∈ I, is a “pointwise bounded” family, that is,

for all x ∈ X, sup_{i∈I} ||T_i x|| < +∞.

Then the family is “uniformly bounded”, that is, sup_{i∈I} ||T_i|| < +∞.

Proof. For n ∈ N, the set F_n := {x ∈ X : sup_{i∈I} ||T_i x|| ≤ n} = ⋂_{i∈I} {x ∈ X : ||T_i x|| ≤ n} is mid-point convex, symmetric, and closed, as F_n is the intersection of the mid-point convex, symmetric, and closed sets {x ∈ X : ||T_i x|| ≤ n}, i ∈ I.

From (2), we have X = ⋃_{n∈N} F_n, and so by the Baire Lemma, there exists an n such that F_n contains a nonempty open set. By Lemma 2.2, there exists a δ > 0 such that the ball B(0, δ) with center 0 and radius δ is contained in F_n, that is, if ||x|| < δ, then for all i ∈ I we have ||T_i x|| ≤ n. We claim that ||T_i x|| ≤ (2n/δ)||x|| for all x ∈ X and all i ∈ I. Clearly this is true if x = 0, since then both sides of the inequality are equal to 0. On the other hand, if x ≠ 0, then y := (δ/(2||x||))x has norm ||y|| = δ/2 < δ, and so we must have ||T_i y|| ≤ n, that is, (δ/(2||x||))||T_i x|| ≤ n by the linearity of T_i and the positive homogeneity of the norm, which rearranges to the desired inequality. Thus ||T_i|| ≤ 2n/δ for all i ∈ I, and thus sup_{i∈I} ||T_i|| ≤ 2n/δ.

image

Corollary 2.5. (Banach-Steinhaus Theorem).

Let (1) X, Y be Banach spaces, and

(2)(T_n)_{n∈N} in CL(X, Y) be such that lim_{n→∞} T_n x exists for all x ∈ X.

Then x ↦ lim_{n→∞} T_n x : X → Y belongs to CL(X, Y).

Proof. It is clear that the map x ↦ lim_{n→∞} T_n x : X → Y is linear.

It remains to show that it is continuous too. Set Tx := lim_{n→∞} T_n x, x ∈ X. For each x ∈ X, (T_n x)_{n∈N} is convergent, and in particular, bounded:

sup_{n∈N} ||T_n x|| < +∞.

Hence by the Uniform Boundedness Principle, there exists an M such that for all n ∈ N, ||T_n|| ≤ M. This gives, for each fixed x ∈ X, that

||T_n x|| ≤ ||T_n|| ||x|| ≤ M||x|| for all n ∈ N.

Passing to the limit n → ∞ yields ||Tx|| ≤ M||x||. As the choice of x was arbitrary, this holds for all x, and consequently, the linear transformation T is continuous.

image

2.6Spectral Theory

For a linear transformation T ∈ L(X) on a finite dimensional vector space X over C, the set of eigenvalues of T is known as its spectrum σ(T), and has cardinality at most dim X. But in infinite dimensional complex vector spaces, strange things may happen: for example, linear transformations may have no eigenvalues at all, or finitely many, or (countably/uncountably) infinitely many! First of all, here is a natural definition of eigenvectors and eigenvalues, extending our prior familiarity with eigenvalues from elementary linear algebra. We remind the reader that the prefix eigen is derived from German, meaning “one’s own”.

Definition 2.10. (Eigenvalues and eigenvectors). Let X be a normed space and T ∈ CL(X). Then λ ∈ C is called an eigenvalue of T if there exists a nonzero vector x ∈ X such that Tx = λx. Such a nonzero vector x is then called an eigenvector of T corresponding to the eigenvalue λ.

Example 2.18. (Uncountably many eigenvalues).

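Let λ ∈ D := {z ∈ C : |z| < 1}. If x := (1, λ, λ², λ³, ···), then as |λ| < 1,

||x||₂² = ∑_{n=1}^∞ |λ^{n–1}|² = ∑_{n=1}^∞ (|λ|²)^{n–1} = 1/(1 – |λ|²) < ∞,

and so x ∈ ℓ². Clearly x ≠ 0 too.

We see that x is an eigenvector of the left shift operator L ∈ CL(ℓ²) because

Lx = (λ, λ², λ³, ···) = λ(1, λ, λ², ···) = λx.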

Thus each point in the open unit disk6 is an eigenvalue of L.

image
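As a numerical sanity check of Example 2.18, the following minimal sketch works with a finite truncation of x = (1, λ, λ², ···). The particular value of λ, the truncation length, and the helper names `left_shift` and `geometric_vector` are illustrative assumptions, not part of the text; a truncation can of course only suggest, not prove, the statement in ℓ².

```python
def left_shift(x):
    """Left shift L(x1, x2, x3, ...) = (x2, x3, ...) applied to a finite truncation."""
    return x[1:]


def geometric_vector(lam, n_terms):
    """First n_terms entries of x = (1, lam, lam^2, ...)."""
    return [lam ** k for k in range(n_terms)]


lam = 0.5 + 0.3j                 # any point with |lam| < 1 (illustrative choice)
x = geometric_vector(lam, 60)

# L x agrees with lam * x on every coordinate that survives the truncation.
Lx = left_shift(x)
residual = max(abs(Lx[k] - lam * x[k]) for k in range(len(Lx)))

# Partial sums of ||x||_2^2 approach 1 / (1 - |lam|^2).
norm_sq = sum(abs(t) ** 2 for t in x)
limit = 1.0 / (1.0 - abs(lam) ** 2)

print(residual, norm_sq, limit)
```

Any other λ in the open unit disk can be substituted; for |λ| ≥ 1 the partial sums of the squared norm no longer settle, which matches the requirement x ∈ ℓ².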

Example 2.19. (No eigenvalues). On the other hand, the right shift operator R ∈ CL(ℓ²) has no eigenvalues. Suppose that λ ∈ C is such that Rx = λx for some x = (x_n)_{n∈N} ∈ ℓ². Then

(0, x_1, x_2, x_3, ···) = Rx = λx = (λx_1, λx_2, λx_3, ···).

Suppose first that λ ≠ 0. Then from the above, λx1 = 0 gives x1 = 0. Next, λx2 = x1 now gives x2 = 0. Proceeding in this manner, we obtain x1 = x2 = x3 = ··· = 0, and so x = 0.

On the other hand, if λ = 0, then (0, x1, x2, x3, ···) = (λx1, λx2, λx3, ···) shows immediately that x1 = x2 = x3 = ··· = 0, and so x = 0.

Consequently, R has no eigenvalues.

image

Note that when dim X < ∞, and T ∈ CL(X), then

λ ∈ C is an eigenvalue of T if and only if λI – T is not invertible.

So the points in the spectrum σ(T) are exactly the ones where λI – T fails to be invertible in CL(X). This prompts the following natural concept in the general case, that is, when dim X ≤ ∞.

Definition 2.11. (Spectrum and resolvent).

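Let X be a normed space and T ∈ CL(X).

We say that λ ∈ C belongs to the spectrum σ(T) of T if λI – T is not invertible in CL(X). Thus

σ(T) := {λ ∈ C : λI – T is not invertible in CL(X)}, and ρ(T) := C \ σ(T) = {λ ∈ C : λI – T is invertible in CL(X)}.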

The set ρ(T) is called the resolvent set of T.

The set σp(T) of all eigenvalues of T is called the point spectrum of T.

We have that σp(T) ⊂ σ(T), since if λ ∈ σp(T), then there exists a nonzero vector x such that Tx = λx, that is, (λI – T)x = 0, showing that λI – T is not injective, and hence can’t be invertible either!

We’ll now show that if X is Banach and T ∈ CL(X), then σ(T) is a compact nonempty subset of C.

Theorem 2.13. Let X be a Banach space and T ∈ CL(X).
Then

(1)σ(T) ⊂ {λ ∈ C : |λ| ≤ ||T||}.

(2)ρ(T) is an open subset of C.

(3)σ(T) is a compact subset of C.

(4) σ(T) is nonempty.

image

Proof.

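(1)Let |λ| > ||T|| ≥ 0. Then ||T/λ|| < 1, and so I – T/λ is invertible in CL(X). Thus, as λ ≠ 0, we have that

λI – T = λ(I – T/λ)

is invertible in CL(X) too.

(2)Let λ_0 ∈ ρ(T). Then for λ ∈ C,

(λ_0 I – T)⁻¹(λI – T) = (λ_0 I – T)⁻¹((λ_0 I – T) – (λ_0 – λ)I) = I – (λ_0 – λ)(λ_0 I – T)⁻¹.

So I – (λ_0 I – T)⁻¹(λI – T) = (λ_0 – λ)(λ_0 I – T)⁻¹ =: A.

For λ close to λ_0, A has small norm; in particular, ||A|| < 1 whenever |λ – λ_0| < 1/||(λ_0 I – T)⁻¹||. Hence it follows that (λ_0 I – T)⁻¹(λI – T) = I – A =: S is invertible in CL(X). So we conclude that λI – T = (λ_0 I – T)S (being the product of two invertible operators in CL(X)) is also invertible in CL(X). Thus every λ sufficiently close to λ_0 belongs to ρ(T).

(3)σ(T) is bounded (as σ(T) is contained in the closed disc {z ∈ C : |z| ≤ ||T||}), and also it is closed (because its complement C\σ(T) = ρ(T) is open). So σ(T) is compact.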

(4)(∗)7 Let σ(T) = ∅. Then f(z) := (zI – T)⁻¹ ∈ CL(X) for all z ∈ C.

In particular, T⁻¹ exists, and is not 0.

Let φ ∈ (CL(X))′ be such that φ(T⁻¹) ≠ 0.

Such a φ exists by the Hahn-Banach Theorem (Exercise 2.38, page 109).

Let g : R² → C be given by g(r, θ) = φ(f(re^{iθ})), for all (r, θ) ∈ R².

We will show that g ∈ C¹(R², C) by showing that it has continuous first order partial derivatives (which will in turn be used in the calculations, and also to justify a differentiation under the integral sign).

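Using the resolvent identity (Exercise 2.30, page 102), we have for r ≠ r_0 that

(g(r, θ) – g(r_0, θ))/(r – r_0) = φ((f(re^{iθ}) – f(r_0 e^{iθ}))/(r – r_0)) = –e^{iθ} φ(f(re^{iθ}) f(r_0 e^{iθ})).

Using continuity of φ and that of the inverse operation (Exercise 2.25, page 88), it follows from the above calculation, that

(∂g/∂r)(r_0, θ) = –e^{iθ} φ((f(r_0 e^{iθ}))²).

Similarly, (∂g/∂θ)(r, θ_0) = –i r e^{iθ_0} φ((f(re^{iθ_0}))²). In particular, ∂g/∂θ = ir ∂g/∂r.

Set F(r) := ∫_0^{2π} g(r, θ) dθ, r ∈ R. By differentiating under the integral sign, we obtain

F′(r) = ∫_0^{2π} (∂g/∂r)(r, θ) dθ.

Consequently, for r ≠ 0,

F′(r) = (1/(ir)) ∫_0^{2π} (∂g/∂θ)(r, θ) dθ = (g(r, 2π) – g(r, 0))/(ir) = 0,

and F′(0) = 0 too, by the continuity of F′.

Hence F is constant, and we have

F(r) = F(0) = 2π φ(f(0)) = –2π φ(T⁻¹) for all r ∈ R.

Now, for r > ||T||,

|φ(f(re^{iθ}))| ≤ ||φ|| ||(re^{iθ}I – T)⁻¹|| ≤ ||φ||/(r – ||T||), which tends to 0 as r → ∞, uniformly in θ.

Fix r such that |φ(f(re^{iθ}))| < |φ(T⁻¹)|/2 for all θ ∈ [0, 2π]. Then

2π|φ(T⁻¹)| = |F(r)| ≤ ∫_0^{2π} |φ(f(re^{iθ}))| dθ < 2π · |φ(T⁻¹)|/2 = π|φ(T⁻¹)|,

giving 2 < 1, a contradiction. This completes the proof.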

image
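Part (1) of the proof rests on the invertibility of I – T/λ when ||T/λ|| < 1. One standard way to see this is the Neumann series, which suggests the candidate inverse ∑_{n≥0} Tⁿ/λ^{n+1} for λI – T. The sketch below checks this numerically for a 2 × 2 matrix; the matrix, the value of λ, the number of terms, and the helper names are illustrative assumptions.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def neumann_resolvent(T, lam, n_terms=200):
    """Partial sum of sum_{n>=0} T^n / lam^(n+1), a candidate for (lam*I - T)^(-1)."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]          # T^0 = I
    for n in range(n_terms):
        coeff = 1.0 / lam ** (n + 1)
        total = [[total[i][j] + coeff * power[i][j] for j in range(2)]
                 for i in range(2)]
        power = matmul(power, T)              # advance to T^(n+1)
    return total


T = [[0.5, 1.0], [0.0, -0.25]]   # illustrative matrix; its max-row-sum norm is 1.5
lam = 2.0                        # |lam| exceeds that norm
lam_I_minus_T = [[lam - T[0][0], -T[0][1]],
                 [-T[1][0], lam - T[1][1]]]

product = matmul(lam_I_minus_T, neumann_resolvent(T, lam))
print(product)                   # numerically close to the identity matrix
```

The partial sums satisfy (λI – T) ∑_{n<N} Tⁿ/λ^{n+1} = I – T^N/λ^N, so the error term is controlled by (||T||/|λ|)^N, which is why |λ| > ||T|| is the natural hypothesis.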

Example 2.20. (Spectrum of the left shift operator).

Consider the left shift operator L ∈ CL(ℓ²). Then ||L|| ≤ 1. So it follows that σ(L) ⊂ {z ∈ C : |z| ≤ 1}. As {z ∈ C : |z| < 1} ⊂ σp(L) ⊂ σ(L), and because σ(L) is closed, it follows that {z ∈ C : |z| ≤ 1} ⊂ σ(L) too. So σ(L) = {z ∈ C : |z| ≤ 1}.

image

We now claim that σp(L) = {z ∈ C : |z| < 1}. We had seen earlier that {z ∈ C : |z| < 1} ⊂ σp(L). Now we’ll show the reverse inclusion.

To this end, let λ ∈ σp(L) with eigenvector x = (x_n)_{n∈N} ∈ ℓ².

Then (x_2, x_3, ···) = L(x_1, x_2, x_3, ···) = λ(x_1, x_2, x_3, ···).

So λx_n = x_{n+1} for all n, giving (by induction) x_n = λ^{n–1}x_1 for all n.

As ℓ² ∋ x ≠ 0, we have

0 < ||x||₂² = ∑_{n=1}^∞ |x_n|² = |x_1|² ∑_{n=1}^∞ (|λ|²)^{n–1} < ∞,

so that x_1 ≠ 0, and the geometric series with common ratio |λ|² converges. So |λ| < 1, and we get the reverse inclusion σp(L) ⊂ {z ∈ C : |z| < 1}.

image

We will return to this topic on spectral theory when we deal with operators on a Hilbert space, and also in the context of compact operators.

Exercise 2.30. (Resolvent Identity). Let X be a normed space, T ∈ CL(X) and λ, μ ∈ ρ(T). Prove that (λI – T)⁻¹ – (μI – T)⁻¹ = (μ – λ)(λI – T)⁻¹(μI – T)⁻¹.

Exercise 2.31. (Spectral radius). Let X be a Banach space, and T ∈ CL(X). Define the spectral radius of T by rσ(T) := sup_{λ∈σ(T)} |λ|.

(1)Prove that rσ(T) ≤ ||T||.

(2)Show that for TACL(R2), A := image, then rσ(TA) < ||TA||.

Here R2 has the usual Euclidean || · ||2-norm.

Remark. In this connection, the Gelfand-Beurling Formula8 says that:

If X is Banach and T ∈ CL(X), then rσ(T) = lim_{n→∞} ||Tⁿ||^{1/n}.
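To get a feel for the formula, here is a small numerical sketch for a 2 × 2 matrix acting on C². The matrix, the exponents, and the choice of the max-row-sum norm (the operator norm induced by the max norm on C², rather than the Euclidean norm used in Exercise 2.31) are assumptions made purely for illustration. The matrix is upper triangular with both diagonal entries 1, so its only eigenvalue is 1, while its norm is 2.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def max_row_sum(A):
    """Operator norm of A induced by the max norm on C^2 (an assumed choice of norm)."""
    return max(sum(abs(entry) for entry in row) for row in A)


def gelfand_terms(A, exponents):
    """Return {n: ||A^n||^(1/n)} for the given increasing list of exponents."""
    results = {}
    power = [[1, 0], [0, 1]]     # A^0 = I
    n = 0
    for target in exponents:
        while n < target:
            power = matmul(power, A)
            n += 1
        results[target] = max_row_sum(power) ** (1.0 / target)
    return results


A = [[1, 1], [0, 1]]             # spectrum {1}, so the spectral radius is 1
terms = gelfand_terms(A, [1, 10, 100, 1000])
for n, value in terms.items():
    print(n, value)
```

Here Aⁿ has rows (1, n) and (0, 1), so ||Aⁿ|| = 1 + n in this norm, and (1 + n)^{1/n} decreases towards the spectral radius 1 even though ||A|| = 2.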

Exercise 2.32. Let X be a Banach space, T ∈ CL(X), and λ ∈ σ(T).

Prove that λ² belongs to the spectrum of T².

Hint: Use (λ²I – T²) = (λI – T)(λI + T) = (λI + T)(λI – T).

Remark. More generally, the Spectral Mapping Theorem9 says that:

If X is a Banach space, T ∈ CL(X), p = c_0 + c_1 z + ··· + c_d z^d ∈ C[z] (a polynomial with complex coefficients), and p(T) := c_0 I + c_1 T + ··· + c_d T^d, then we have σ(p(T)) = p(σ(T)) := {p(λ) : λ ∈ σ(T)}.

Exercise 2.33. (Spectrum of the diagonal operator).

Let (λ_n)_{n∈N} be a sequence in C which is convergent to 0, and consider Λ ∈ CL(ℓ²) given by Λ(a_1, a_2, a_3, ···) = (λ_1a_1, λ_2a_2, λ_3a_3, ···) for all (a_1, a_2, a_3, ···) ∈ ℓ².

Show that {λ_n : n ∈ N} ⊂ σp(Λ) ⊂ {λ_n : n ∈ N} ∪ {0} = σ(Λ).

Remark. (Spectral Theorem for Compact Operators).

In Chapter 5, we will learn that this Λ is an example of a “compact operator”; see Example 5.3 on page 214. More generally, one can show the Spectral Theorem for Compact Operators, which says that for a compact operator K on an infinite dimensional Hilbert space H,

(1)σ(K)\{0} = σp(K)\{0}, and σ(K) is countable,

(2)0 is the only accumulation point of σ(K),

(3)For all λ ∈ σp(K)\{0}, dim ker(λI – K) = dim ker(λ*I – K*) < ∞.

Exercise 2.34. (Approximate spectrum).

(1)Let X be a Banach space, and T ∈ CL(X). A number λ ∈ C is said to belong to the approximate spectrum σap(T) of T if there exists a sequence (x_n)_{n∈N} of vectors from X such that ||x_n|| = 1 for all n ∈ N, and Tx_n – λx_n → 0 in X. Prove that σap(T) ⊂ σ(T).

(2)Let Λ ∈ CL(ℓ²) be the diagonal operator corresponding to a convergent (and hence bounded) sequence (λ_n)_{n∈N}. Prove that lim_{n→∞} λ_n ∈ σap(Λ).

Exercise 2.35. (Point spectrum of the position operator).

Let X be a normed space, and let A : D_A → X be an “unbounded operator10”, where the domain D_A is a subspace of X. Then the point spectrum of the unbounded operator A is defined in an analogous manner as before: σp(A) := {λ ∈ C : there exists an x ∈ D_A\{0} such that Ax = λx}.

Now consider the position operator Q : D_Q → L²(R), arising in Quantum Mechanics, where D_Q := {Ψ ∈ L²(R) : (x ↦ xΨ(x)) =: QΨ ∈ L²(R)}, and (QΨ)(x) := xΨ(x), for almost all x ∈ R, and all Ψ ∈ D_Q.

Show that σp(Q) = ∅.

Remark. So Q has no eigenvectors in D_Q ⊂ L²(R). However, when we learn elementary distribution theory later on in Chapter 6, we’ll see that Qδ_λ = λδ_λ for all λ ∈ R, where δ_λ is the “Dirac distribution” with support at λ ∈ R. See Example 6.11 on page 251.

2.7(∗) Dual space and the Hahn-Banach Theorem

Definition 2.12. (Dual space of a normed space).

Let X be a normed space over K. Then the normed space CL(X, K), equipped with the operator norm, is called the dual space of X. One denotes the dual space of X simply by X′. Elements of the dual space are sometimes called bounded linear functionals.

Recall that a consequence of Theorem 2.8 (page 78) was Corollary 2.2, which says that X′ is always a Banach space, even if X isn’t. This is because K = R or C are both Banach spaces.

Given a concrete X, like R^d or ℓ^p, it is sometimes possible to “recognize” X′, that is, to establish a (normed space) isomorphism from X′ to some other Banach space, for example:

image

Such results are called representation theorems, and we will see a few such results now, and also later on in the chapter on Hilbert spaces (Chapter 4), when we will learn about the Riesz Representation Theorem, page 189.

Theorem 2.14. For 1 ≤ p < ∞, (ℓ^p)′ ≃ ℓ^q, where 1/p + 1/q = 1.

(Here the understanding is that if p = 1, then q = ∞.)

Proof. (Sketch). We consider K = R for simplicity. Let 1 < p < ∞.

By Hölder’s Inequality, |a_1b_1 + ··· + a_nb_n| ≤ ||(a_1, ···, a_n)||_p ||(b_1, ···, b_n)||_q, with equality if a_kb_k ≥ 0 for all k and (|a_1|^p, ···, |a_n|^p) is a multiple of (|b_1|^q, ···, |b_n|^q). Let T ∈ CL(ℓ^p, R). Let e_k ∈ ℓ^p be the sequence (0, ···, 0, 1, 0, ···) with kth term 1, and all others 0. Fix n ∈ N. Let a = (a_1, ···, a_n, 0, ···) ∈ ℓ^p be such that a_k Te_k ≥ 0 for all k and (|a_1|^p, ···, |a_n|^p) is a multiple of (|Te_1|^q, ···, |Te_n|^q) (for instance, a_k := sgn(Te_k)|Te_k|^{q/p}, k = 1, ···, n).

Then, whenever a ≠ 0, ||T|| ≥ |Ta|/||a||_p = ||(Te_1, ···, Te_n)||_q.

Passing to the limit n → ∞, we get (Te_1, Te_2, Te_3, ···) ∈ ℓ^q. So we get a continuous linear transformation ι : T ↦ (Te_1, Te_2, Te_3, ···) : CL(ℓ^p, R) → ℓ^q. It can be checked that this map ι is injective and surjective. As ι is bijective, it is an isomorphism.

If p = 1, then let us now show that (ℓ¹)′ ≃ ℓ^∞. This is easier to see, since if T ∈ CL(ℓ¹, R), then we get immediately that for all k, |Te_k| = |Te_k|/||e_k||_1 ≤ ||T||, giving (Te_k)_{k∈N} ∈ ℓ^∞.

image
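The key step of the sketch, namely that a suitable a attains equality in Hölder’s Inequality so that |Ta|/||a||_p equals the q-norm of (Te_1, ···, Te_n), can be checked numerically. In the sketch below the vector b is a hypothetical stand-in for (Te_1, ···, Te_n), and the extremal choice a_k = sgn(b_k)|b_k|^{q/p}, the value p = 3, and the helper names are assumptions made for illustration.

```python
import math


def p_norm(v, p):
    """The finite-dimensional p-norm (sum |v_k|^p)^(1/p)."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def extremal_vector(b, p, q):
    """a_k = sgn(b_k) |b_k|^(q/p), an equality case of Hoelder's inequality."""
    return [math.copysign(abs(t) ** (q / p), t) for t in b]


p = 3.0
q = p / (p - 1.0)                  # conjugate exponent, so 1/p + 1/q = 1
b = [3.0, -1.0, 0.5, 2.0]          # hypothetical values standing in for Te_1, ..., Te_n

a = extremal_vector(b, p, q)
Ta = sum(s * t for s, t in zip(a, b))      # T a = sum_k a_k * Te_k
ratio = abs(Ta) / p_norm(a, p)

print(ratio, p_norm(b, q))                 # the two numbers agree up to rounding
```

With this choice, Ta = ∑|b_k|^q and ||a||_p = (∑|b_k|^q)^{1/p}, so the ratio is (∑|b_k|^q)^{1 – 1/p} = ||b||_q, which is the bound used in the proof.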

Remark 2.6. (The dual space (ℓ^∞)′ ≠ ℓ¹.)

If a = (a_n)_{n∈N} ∈ ℓ¹, then define the functional φ_a ∈ CL(ℓ^∞, R) = (ℓ^∞)′ by

φ_a(b) := ∑_{n=1}^∞ a_n b_n, b = (b_n)_{n∈N} ∈ ℓ^∞.

Then a ↦ φ_a : ℓ¹ → (ℓ^∞)′ is an injective linear transformation. It is continuous since |φ_a(b)| ≤ ||b||_∞ ||a||_1 for all b ∈ ℓ^∞, giving ||φ_a|| ≤ ||a||_1. However it is not surjective, and this can be shown by using the Hahn-Banach Theorem (see Theorem 2.15 on page 108), which says that a continuous linear functional on a subspace of a normed space can be extended to the whole normed space while preserving the operator norm of the functional. To see how this gives the non-surjectivity of the map a ↦ φ_a : ℓ¹ → (ℓ^∞)′ above, let us consider the subspace c ⊂ ℓ^∞ of all convergent sequences, and the “limit functional” λ : c → R, given by

λ(b) := lim_{n→∞} b_n, b = (b_n)_{n∈N} ∈ c.

Then λ ∈ CL(c, R) = c′, and ||λ|| = 1. By the Hahn-Banach Theorem, this functional λ on the subspace c of ℓ^∞ has an extension Λ ∈ CL(ℓ^∞, R). But now we see that Λ can’t be φ_a for some a ∈ ℓ¹. Otherwise, with e_n ∈ c being the sequence with nth term 1 and all others 0, we have

a_n = φ_a(e_n) = Λ(e_n) = λ(e_n) = 0

for all n, showing that a = 0, and so Λ = φ_a = 0, which is clearly false, since Λ(1, 1, 1, ···) = λ(1, 1, 1, ···) = 1 ≠ 0! So (ℓ^∞)′ is “bigger” than ℓ¹.

Remark 2.7. If 1 ≤ p < ∞, then it can be shown that

image

where image

Exercise 2.36. Consider the subspace c_0 ⊂ ℓ^∞ consisting of all sequences that converge to 0. Prove that ℓ¹ ≃ (c_0)′.

Exercise 2.37. (∗) (Dual of C[a, b]). In this exercise we will learn a representation of the dual space of C[a, b]. A function μ : [a, b] → R is said to be of bounded variation on [a, b] if its total variation var(μ) on [a, b] is finite, where

var(μ) := sup_{P∈𝒫} ∑_{j=1}^n |μ(t_j) – μ(t_{j–1})|.

Here 𝒫 is the set of all partitions of [a, b]. A partition of [a, b] is a finite set P = {t_0, t_1, ···, t_{n–1}, t_n} with t_0 := a < t_1 < ··· < t_{n–1} < b =: t_n.

(1)Show that the set of all functions of bounded variation on [a, b], with the usual pointwise operations, forms a vector space, denoted BV[a, b].

Define || · || : BV[a, b] → [0, +∞) by ||μ|| := |μ(a)| + var(μ), for μ ∈ BV[a, b].

(2)Prove that || · || is a norm on BV[a, b].

The Riemann-Stieltjes integral: Let x ∈ C[a, b] and μ ∈ BV[a, b]. For a partition of [a, b], say P = {t_0, t_1, ···, t_{n–1}, t_n}, let δ_P be the length of a largest interval [t_{j–1}, t_j], that is, δ_P := max{t_1 – t_0, ···, t_n – t_{n–1}}, and set

image

Then it can be shown that there exists a unique real number, denoted by

∫_a^b x(t) dμ(t),

called the Riemann-Stieltjes integral of x over [a, b] with respect to μ, such that for every ε > 0 there is a δ > 0 such that if P is a partition of [a, b] satisfying δ_P < δ, then

image

The usual linearity of the integral (as with the ordinary Riemann integral) holds:

image

(3)Prove that |∫_a^b x(t) dμ(t)| ≤ ||x||_∞ var(μ), where x ∈ C[a, b] and μ ∈ BV[a, b].

(4)Conclude that every μ ∈ BV[a, b] gives rise to a φ_μ ∈ CL(C[a, b], R),

φ_μ(x) := ∫_a^b x(t) dμ(t), x ∈ C[a, b],

and that ||φ_μ|| ≤ var(μ).

The following converse result was proved by F. Riesz: For all φ ∈ CL(C[a, b], R), there exists a μ ∈ BV[a, b] such that

φ(x) = ∫_a^b x(t) dμ(t) for all x ∈ C[a, b],

and ||φ|| = var(μ). In other words, every element of (C[a, b])′ can be represented by a Riemann-Stieltjes integral.

(5)For the functional x ↦ x(a) on C[a, b], find a corresponding μ ∈ BV[a, b].

Dual spaces are important because, among other things, they allow us to define dual operators. Here is the definition.

Definition 2.13. (Dual operator).

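Let X, Y be normed spaces, and T ∈ CL(X, Y). We define the dual operator (of T), T′ ∈ CL(Y′, X′), by (T′ψ)(x) = ψ(Tx), for all x ∈ X and ψ ∈ Y′.

Several things need to be checked here:

(1)For ψ ∈ Y′, does T′ψ belong to X′?

(2)Does T′ ∈ CL(Y′, X′)?

Let us begin with (1). If ψ ∈ Y′, then we have that

(L1)for all x_1, x_2 ∈ X,

(T′ψ)(x_1 + x_2) = ψ(T(x_1 + x_2)) = ψ(Tx_1 + Tx_2) = ψ(Tx_1) + ψ(Tx_2) = (T′ψ)(x_1) + (T′ψ)(x_2),

(L2)for all α ∈ K and x ∈ X,

(T′ψ)(αx) = ψ(T(αx)) = ψ(αTx) = αψ(Tx) = α(T′ψ)(x).

Hence T′ψ ∈ L(X, K). Moreover T′ψ is continuous because for all x ∈ X,

|(T′ψ)(x)| = |ψ(Tx)| ≤ ||ψ|| ||Tx|| ≤ ||ψ|| ||T|| ||x||.     (2.7)

Now let’s check (2), that is, that T′ ∈ CL(Y′, X′). We have

(L1)for all ψ_1, ψ_2 ∈ Y′, for all x ∈ X,

(T′(ψ_1 + ψ_2))(x) = (ψ_1 + ψ_2)(Tx) = ψ_1(Tx) + ψ_2(Tx) = (T′ψ_1)(x) + (T′ψ_2)(x) = (T′ψ_1 + T′ψ_2)(x),

and so T′(ψ_1 + ψ_2) = T′(ψ_1) + T′(ψ_2),

(L2)for all α ∈ K, for all ψ ∈ Y′, for all x ∈ X,

(T′(αψ))(x) = (αψ)(Tx) = αψ(Tx) = α(T′ψ)(x) = (α(T′ψ))(x),

and so T′(αψ) = α(T′ψ).

Thus T′ is linear. It is also continuous, because (2.7) gives ||T′ψ|| ≤ ||ψ|| ||T|| for all ψ ∈ Y′, that is, T′ ∈ CL(Y′, X′) and ||T′|| ≤ ||T||.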
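In finite dimensions the definition can be made concrete: if T is given by a matrix and a functional ψ on Rᵐ by a coefficient vector, then T′ψ has the coefficient vector obtained by multiplying ψ with the columns of T (that is, T′ acts by the transpose). The sketch below checks the defining identity (T′ψ)(x) = ψ(Tx) on one example; the matrix, the vectors, and the helper names are illustrative assumptions.

```python
def apply_matrix(T, x):
    """Compute Tx for an m x n matrix T (list of rows) and a vector x of length n."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]


def evaluate(functional, v):
    """Evaluate the functional with the given coefficient vector at v."""
    return sum(c * t for c, t in zip(functional, v))


def dual_apply(T, psi):
    """Coefficient vector of T'psi: entry j is sum_i psi_i * T[i][j]."""
    return [sum(psi[i] * T[i][j] for i in range(len(T))) for j in range(len(T[0]))]


T = [[1, 2, 0],
     [0, -1, 3]]          # a map from R^3 to R^2 (illustrative)
psi = [2, 5]              # a functional on R^2
x = [1, -1, 4]            # a vector in R^3

lhs = evaluate(dual_apply(T, psi), x)      # (T'psi)(x)
rhs = evaluate(psi, apply_matrix(T, x))    # psi(Tx)
print(lhs, rhs)
```

Both sides evaluate the same double sum ∑_i ∑_j ψ_i T_{ij} x_j, only grouped differently, which is all the definition of T′ asks for.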

Example 2.21. Consider D : C¹[0, 1] → C[0, 1] given by x ↦ x′, x ∈ C¹[0, 1]. Then D′ : (C[0, 1])′ → (C¹[0, 1])′ is given by (D′ψ)(x) = ψ(Dx) = ψ(x′), ψ ∈ (C[0, 1])′, x ∈ C¹[0, 1]. But by the representation of (C[0, 1])′ described in Exercise 2.37, every ψ ∈ (C[0, 1])′ can be represented by some element μ_ψ ∈ BV[0, 1], so that

ψ(y) = ∫_0^1 y(t) dμ_ψ(t), y ∈ C[0, 1].

Thus if ψ ∈ (C[0, 1])′, then

(D′ψ)(x) = ∫_0^1 x′(t) dμ_ψ(t), x ∈ C¹[0, 1],

where μ_ψ ∈ BV[0, 1] is such that ψ(y) = ∫_0^1 y(t) dμ_ψ(t), y ∈ C[0, 1].

image

Sometimes problems for an operator can be simplified by looking at the dual operator, making the consideration of dual spaces and dual operators a useful endeavour.

Remark 2.8. (Dual versus adjoint operators). When we learn about Hilbert spaces, we will learn about the notion of the adjoint T* ∈ CL(Y, X) of an operator T ∈ CL(X, Y), where X, Y are Hilbert spaces, and we can use the Riesz Representation Theorem (which we will also learn there) to represent elements of Y′, X′ by elements of Y, X. (The next sentence should be read after the discussion of the adjoint operator and the Riesz Representation Theorem.) If X, Y are Hilbert spaces, and T ∈ CL(X, Y), then for Y ∋ y_ψ ↔ ψ ∈ Y′, we have for all x ∈ X that

(T′ψ)(x) = ψ(Tx) = 〈Tx, y_ψ〉 = 〈x, T*y_ψ〉,

where we identified T*y_ψ ∈ X with the functional x ↦ 〈x, T*y_ψ〉 : X → K in X′. In this sense the notions of adjoint and dual “coincide” in this context of operators on Hilbert spaces.

Hahn-Banach Theorem11. Finally, we will learn a fundamental result, known as the Hahn-Banach Theorem, which says that X′ always contains sufficiently many elements to separate points of X: for x ≠ y in X, there exists a φ ∈ X′ such that φ(x) ≠ φ(y). In this sense, the elements of X′ play the role of “coordinates” for the points of X (which is the kind of thinking one is used to in elementary linear algebra when X = K^d).

Theorem 2.15. (Hahn-Banach).

Let (1) X be a normed space,

(2)Y ⊂ X be a linear subspace,

(3)φ ∈ CL(Y, K).

Then there exists a Φ ∈ CL(X, K) such that Φ|_Y = φ and ||Φ|| = ||φ||.

In other words: “Every continuous linear functional on a subspace Y of a normed space X possesses a norm-preserving extension to the entire normed space X”.

Before proving the Hahn-Banach Theorem, we will now list a few important consequences one obtains from it.

Corollary 2.6. Let X be a normed space and x_0 ∈ X\{0}. Then there exists an element Φ ∈ X′ such that Φ(x_0) = ||x_0|| and ||Φ|| = 1.

Proof. Let Y := span {x_0} and φ : Y → K be given by φ(y) = α||x_0|| for y = αx_0 ∈ Y, α ∈ K. Then φ is linear. Moreover, φ is a continuous map because |φ(y)| = |φ(αx_0)| = |α| ||x_0|| = ||αx_0|| = ||y||, that is, |φ(y)| = ||y|| for all y ∈ Y. Hence ||φ|| = 1. By Hahn-Banach there now exists an extension Φ with the desired property.

image

As mentioned earlier, once we have the Hahn-Banach Theorem, one has the ability of distinguishing elements of X using elements from X′. This is shown in the two corollaries below.

Corollary 2.7. Let x and y be elements in a normed space X with x ≠ y. Then there exists a Φ ∈ X′ such that Φ(x) ≠ Φ(y). (In other words, X′ separates the points of X.)

Proof. By Corollary 2.6 applied to x – y ≠ 0, take Φ ∈ X′ with Φ(x) – Φ(y) = Φ(x – y) = ||x – y|| ≠ 0.

image

Exercise 2.38. Let X be a complex normed space and x ∈ X\{0}. Show that there exists a φ ∈ CL(X, C) such that φ(x) ≠ 0.

Remark 2.9. (CL(X, Y) Banach ⇒ Y Banach).

Fix any nonzero x_0 ∈ X. By Exercise 2.38, there exists a φ ∈ CL(X, C), such that φ(x_0) ≠ 0. Let (y_n)_{n∈N} be a Cauchy sequence in Y. For n ∈ N, define T_n ∈ CL(X, Y) by

T_n x := (φ(x)/φ(x_0)) y_n, x ∈ X.

Then using the linearity of φ, it follows that T_n is linear. Also, T_n is continuous because

||T_n x|| = (|φ(x)|/|φ(x_0)|) ||y_n|| ≤ (||φ|| ||y_n||/|φ(x_0)|) ||x|| for all x ∈ X.

A similar computation also gives that for n, m ∈ N,

||T_n – T_m|| ≤ (||φ||/|φ(x_0)|) ||y_n – y_m||,

showing that (T_n)_{n∈N} is a Cauchy sequence (since (y_n)_{n∈N} is Cauchy). As CL(X, Y) is Banach, the Cauchy sequence (T_n)_{n∈N} is convergent, with limit, say, T ∈ CL(X, Y). But for x ∈ X,

||T_n x – Tx|| ≤ ||T_n – T|| ||x||,

and so we have that for all x ∈ X, (T_n x)_{n∈N} converges to Tx.

In particular, with x = x_0, (T_n x_0)_{n∈N} = (y_n)_{n∈N} converges to Tx_0 ∈ Y.

Hence Y is a Banach space!

Since X′ is itself a normed space, we know that X′ too has a dual space (X′)′ =: X″, and X″ is called the bidual of X. For x ∈ X, now consider the map φ_x : X′ → K, given by

φ_x(ψ) := ψ(x), ψ ∈ X′.

It is clear that φ_x is linear. Moreover, it is continuous too, since

|φ_x(ψ)| = |ψ(x)| ≤ ||ψ|| ||x|| for all ψ ∈ X′.

We thus see that ||φ_x|| ≤ ||x||.

If x = 0, the zero vector in X, we have φ_x = 0, the zero linear transformation in CL(X′, K). So ||φ_x|| = 0 = ||x|| in this case.

If x ≠ 0, then from Corollary 2.6 it follows that there exists a ψ ∈ X′\{0} for which |ψ(x)| = ||x|| and ||ψ|| = 1. So we get the reverse inequality ||φ_x|| ≥ ||x|| too. Hence ||φ_x|| = ||x|| in this case as well.

So we have the following third consequence of the Hahn-Banach Theorem:

Corollary 2.8. Let X be a normed space and x ∈ X. Then the map φ_x on X′, given by φ_x(ψ) = ψ(x), ψ ∈ X′, has the operator norm ||φ_x|| = ||x||.

Thus the map x ↦ φ_x : X → X″ is a linear isometric embedding of X in X″. If we consider the elements of X as bounded linear functionals on X′ (by identifying x with φ_x), then:

X ⊂ X″,     (2.8)

where the norm of x on X agrees with the norm of φ_x in X″. Sometimes, the map x ↦ φ_x from X into X″ is also surjective. In that case, the space X is called reflexive and the inclusion (2.8) is replaced by the equality:

X = X″.

For proving the Hahn-Banach theorem, we will need a few preliminaries. We will first prove the theorem in the case K = R, and then show how the result for the case K = C can be derived from the real case.

In the following lemma, we consider a normed space X over R, and instead of a norm, we consider a more general function p : X → R such that

p(x + y) ≤ p(x) + p(y) for all x, y ∈ X,     (2.9)

p(αx) = αp(x) for all α ≥ 0 and all x ∈ X,     (2.10)

that is, a subadditive and positive-homogeneous functional.

Lemma 2.3. (Hahn-Banach Lemma).

Let X be a normed space over R and p : X → R satisfy (2.9) and (2.10). Furthermore, let Y ⊂ X be a subspace and φ : Y → R be a linear map such that

φ(y) ≤ p(y) for all y ∈ Y.     (2.11)

Then there exists a linear map Φ : X → R such that Φ|_Y = φ, and

Φ(x) ≤ p(x) for all x ∈ X.     (2.12)

(That is, there exists a linear extension of φ to X preserving the estimate.)

Proof. (∗) This is a rather technical proof, but the idea of the proof is to extend φ “one dimension at a time”. Let x_0 ∈ X\Y. Every vector x ∈ Y + (span {x_0}) has a unique decomposition x = αx_0 + y, with y ∈ Y and α ∈ R. An extension Φ of φ to Y + (span {x_0}) is given by Φ(x) = αr + φ(y), where r, which ought to be Φ(x_0), will be chosen so that (2.12) holds, that is:

αr + φ(y) ≤ p(αx_0 + y) for all α ∈ R and all y ∈ Y.     (2.13)

Owing to the positive homogeneity of p, it is sufficient to choose r such that (2.13) is satisfied with α = 1 and α = –1:

r + φ(y) ≤ p(x_0 + y) for all y ∈ Y,     (2.14)

–r + φ(y) ≤ p(–x_0 + y) for all y ∈ Y.     (2.15)

Indeed, once these hold, then multiplication with t > 0 yields

tr + φ(ty) ≤ p(tx_0 + ty) and –tr + φ(ty) ≤ p(–tx_0 + ty) for all y ∈ Y,

which, in light of the fact that every element from Y can be written in the form ty with y ∈ Y, gives (2.13) with α = ±t ≠ 0. For α = 0, (2.13) is already satisfied according to the hypothesis. Now the inequalities (2.14) and (2.15) are equivalent to the statement:

φ(y) – p(–x_0 + y) ≤ r ≤ –φ(z) + p(x_0 + z) for all y, z ∈ Y.

But there exists such an r precisely if all numbers φ(y) – p(–x_0 + y), with y ∈ Y, lie to the left of all numbers –φ(z) + p(x_0 + z), with z ∈ Y, that is,

φ(y) – p(–x_0 + y) ≤ –φ(z) + p(x_0 + z) for all y, z ∈ Y,

that is, if for all y, z ∈ Y, φ(y) + φ(z) ≤ p(–x_0 + y) + p(x_0 + z). But this is indeed the case since we have for all y, z ∈ Y that

φ(y) + φ(z) = φ(y + z) ≤ p(y + z) = p((–x_0 + y) + (x_0 + z)) ≤ p(–x_0 + y) + p(x_0 + z).     (2.16)

From (2.16) it now follows that:

sup_{y∈Y} (φ(y) – p(–x_0 + y)) ≤ inf_{z∈Y} (–φ(z) + p(x_0 + z)),

and it is sufficient to choose, for instance, r := sup_{y∈Y} (φ(y) – p(–x_0 + y)).

(In general, the sup and the inf here are unequal, and we can choose r arbitrarily from an interval.) Now the number r also satisfies (2.14) and (2.15), and thus (2.13), and so we have obtained an extension to Y + (span {x_0}) such that (2.12) holds.

Now the idea is that we extend φ one dimension at a time in order to get an extension to the space X. If X were finite dimensional, then it is clear that this can be done. After dim X – dim Y steps we will have obtained a linear transformation Φ : XR that satisfies (2.12).

In the general case, the proof goes through, in essentially the same manner, by successive one-dimensional extensions, but we won’t be able to get an extension to X in finitely many steps. In order to complete the process, we will use Zorn’s Lemma.

Zorn’s Lemma

Zorn’s Lemma says that a partially ordered set P with the property that every chain has an upper bound in P possesses a maximal element. The terms are explained below.

A partial order ≤ on a set P is a relation on P satisfying

(transitivity) for all x, y, z ∈ P, x ≤ y, y ≤ z ⇒ x ≤ z,

(antisymmetry) for all x, y ∈ P, x ≤ y, y ≤ x ⇒ x = y,

(reflexivity) for all x ∈ P, x ≤ x.

A set with a partial order is called a partially ordered set.

A familiar example is R with the usual ≤ relation, but the situation can be much more general: for example, consider R² with the order: (a, b) ≤ (c, d) if a ≤ c and b ≤ d. This latter example justifies the terminology partial. Indeed, ≤ is not a total order because not every pair of elements can be compared with ≤: we have neither (0, 1) ≤ (1, 0) nor (1, 0) ≤ (0, 1).

A subset C of P is said to be bounded above if there exists an element u ∈ P such that x ≤ u for all x ∈ C. The element u ∈ P is then called an upper bound of C.

A subset C of P is said to be a chain if for all x, y ∈ C, there holds x ≤ y or y ≤ x. Thus on a chain C, ≤ forms a total order since any two elements in C can be compared with ≤. The set R² with the above order ≤ is not a chain, since neither (0, 1) ≤ (1, 0) nor (1, 0) ≤ (0, 1). However, the diagonal {(x, x) : x ∈ R} is a chain.
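The componentwise order on R² and the chain property can be tested mechanically on finite samples. The following small sketch does this; the function names `leq`, `comparable` and `is_chain` and the sample points are hypothetical choices for illustration, and a finite sample of the diagonal only illustrates, rather than proves, that the whole diagonal is a chain.

```python
def leq(u, v):
    """Componentwise order on R^2: (a, b) <= (c, d) iff a <= c and b <= d."""
    return u[0] <= v[0] and u[1] <= v[1]


def comparable(u, v):
    """True if u and v can be compared in the componentwise order."""
    return leq(u, v) or leq(v, u)


def is_chain(points):
    """True if every pair of the given points is comparable."""
    return all(comparable(u, v) for u in points for v in points)


diagonal_sample = [(t, t) for t in (-2.0, -0.5, 0.0, 1.0, 3.5)]
mixed_sample = [(0, 1), (1, 0), (2, 2)]

print(comparable((0, 1), (1, 0)))   # False: this pair is incomparable
print(is_chain(diagonal_sample))    # True: points on the diagonal are pairwise comparable
print(is_chain(mixed_sample))       # False: contains the incomparable pair above
```

Note that (2, 2) is an upper bound of `mixed_sample` even though that set is not a chain, so being bounded above and being a chain are independent notions.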

An element m ∈ P is called maximal if whenever x ∈ P and m ≤ x we have that x = m.

Zorn’s Lemma (named after the mathematician Max Zorn) is an axiom in Set Theory. It can be shown that it is equivalent with the Axiom of Choice: for every family A_i, i ∈ I, of nonempty sets A_i, there exists a map i ↦ x_i on I such that x_i ∈ A_i for every i ∈ I.

In order to apply Zorn’s Lemma to complete the proof of the Hahn-Banach Lemma, we proceed as follows.

Consider the set P of all pairs (Z, ψ), where Z is a subspace of X with Y ⊂ Z ⊂ X, and ψ : Z → R is a linear transformation extending φ such that ψ(z) ≤ p(z) for all z ∈ Z.

We define the partial order ≤ on P by defining (Z, ψ) ≤ (Z′, ψ′) if Z ⊂ Z′ and ψ = ψ′|_Z. Then every chain in P has an upper bound, as explained below.

If C is a chain in P, then we can construct an upper bound (Z_C, ψ_C) of C as follows: Let Z_C be the union of all subspaces Z, with (Z, ψ) ∈ C, and let ψ_C be the common extension of the linear transformations ψ. More precisely, for z ∈ Z_C, there exists a (Z, ψ) ∈ C such that z ∈ Z, and we define ψ_C(z) = ψ(z). This definition of ψ_C(z) is independent of the choice of (Z, ψ). Indeed, if (Z′, ψ′) also belongs to C, and z ∈ Z′, then we have (Z, ψ) ≤ (Z′, ψ′) or (Z′, ψ′) ≤ (Z, ψ), and so ψ is the restriction of ψ′ or vice versa. In either case, we have ψ(z) = ψ′(z). The map ψ_C : Z_C → R so defined is linear: Indeed, if z, z′ belong to Z_C, then there exists a (Z, ψ) ∈ C such that z ∈ Z and there exists a (Z′, ψ′) ∈ C such that z′ ∈ Z′. We have Z ⊂ Z′ or Z′ ⊂ Z. Suppose that Z′ ⊂ Z. Then also z′ ∈ Z so that for α, α′ ∈ R, we have αz + α′z′ ∈ Z ⊂ Z_C, and so it follows that Z_C is a subspace of X, and ψ_C(αz + α′z′) = ψ(αz + α′z′) = αψ(z) + α′ψ(z′) = αψ_C(z) + α′ψ_C(z′). Finally, ψ_C satisfies the inequality ψ_C(z) ≤ p(z), z ∈ Z_C, since indeed ψ(z) ≤ p(z) for all z ∈ Z, for all (Z, ψ) ∈ C. Thus we see that (Z_C, ψ_C) belongs to P and that (Z, ψ) ≤ (Z_C, ψ_C) for all (Z, ψ) ∈ C. This completes the proof that every chain in P has an upper bound.

By Zorn’s Lemma, P has a maximal element (Z, Φ). Then Z = X. Indeed, if Z ⊊ X, then there exists an x ∈ X\Z, and then from the first part of the proof of the Hahn-Banach Lemma, it follows that we can extend Φ to Z + (span{x}) with the same estimate given by p, contradicting the maximality of (Z, Φ). Thus we have a linear Φ : X → R that extends φ : Y → R, while satisfying the estimate (2.12). This completes the proof of the Hahn-Banach Lemma!

image

We will now apply this Hahn-Banach Lemma to prove the Hahn-Banach Theorem, first of all in the case when K = R.

Proof. (Of the Hahn-Banach Theorem; real case.)

Let φ : Y → R be a continuous linear transformation. Then we have:

|φ(y)| ≤ ||φ|| ||y|| for all y ∈ Y.     (2.17)

Now we apply the Hahn-Banach Lemma with p(x) := ||φ|| ||x||, x ∈ X. From (2.17), we certainly have for all y ∈ Y, φ(y) ≤ |φ(y)| ≤ ||φ|| ||y|| = p(y). Thus, by the Hahn-Banach Lemma, there exists a linear map Φ : X → R, extending φ to X, that moreover satisfies the estimate that for all x ∈ X, Φ(x) ≤ p(x) = ||φ|| ||x||. Replacing x by –x, we obtain –Φ(x) ≤ ||φ|| ||x||, and so for all x ∈ X, |Φ(x)| ≤ ||φ|| ||x||. Hence it follows that Φ is continuous and that ||Φ|| ≤ ||φ||. Since φ is the restriction of Φ, we have, on the other hand, also that

||φ|| = sup_{y∈Y, ||y||≤1} |φ(y)| = sup_{y∈Y, ||y||≤1} |Φ(y)| ≤ sup_{x∈X, ||x||≤1} |Φ(x)| = ||Φ||.

This proves the Hahn-Banach Theorem in the real case.

image

The proof for complex scalars can be derived from the real case. We remark that real versions of the Hahn-Banach Theorem were first proved independently by Hahn and by Banach. The complex version was given by Bohnenblust and Sobczyk, following the ideas of Murray.

Proof. (Of the Hahn-Banach Theorem; complex case.)

Let X be a normed space over C. By restricting the multiplication with scalars to real numbers, we obtain a normed space over R, which we denote simply by X_R. If Φ : X → C is a linear transformation, then Φ_R : X_R → R, given by Φ_R(x) = Re(Φ(x)), x ∈ X_R, is also a linear transformation. We now observe below that Φ is completely determined by its “real part” Φ_R. For complex z = a + ib, with a, b ∈ R, we have iz = –b + ia, and hence Im(z) = –Re(iz). So Im(Φ(x)) = –Re(iΦ(x)) = –Re(Φ(ix)) = –Φ_R(ix). Thus

Φ(x) = Φ_R(x) – iΦ_R(ix), x ∈ X.     (2.18)

Now if Φ_R : X_R → R is R-linear, then the right-hand side expression of (2.18) determines a C-linear map Φ : X → C:

(1)It is clear that Φ is R-linear.

(2)We have Φ(ix) = Φ_R(ix) – iΦ_R(–x) = Φ_R(ix) + iΦ_R(x) = i(Φ_R(x) – iΦ_R(ix)) = iΦ(x).
Since every complex number is of the form a + ib, with a, b ∈ R, it follows from here and the above part (1) that Φ is also C-linear.

Finally we show that Φ is continuous if and only if its real part Φ_R is continuous, and moreover, ||Φ|| = ||Φ_R||.

(1)If Φ is continuous, then since |Φ_R(x)| = |Re(Φ(x))| ≤ |Φ(x)| ≤ ||Φ|| ||x||, we have that Φ_R is continuous and moreover ||Φ_R|| ≤ ||Φ||.

(2)Now suppose that Φ_R is continuous and that Φ is given by (2.18). For x ∈ X, let θ ∈ R be such that Φ(x) = e^{iθ}|Φ(x)|. As |Φ(x)| is real,

|Φ(x)| = e^{–iθ}Φ(x) = Φ(e^{–iθ}x) = Re(Φ(e^{–iθ}x)) = Φ_R(e^{–iθ}x) ≤ ||Φ_R|| ||e^{–iθ}x|| = ||Φ_R|| ||x||,

so that Φ is continuous, and moreover ||Φ|| ≤ ||Φ_R||.

The proof of the Hahn-Banach theorem in the complex case can now be completed as follows. Let φ ∈ CL(Y, C) and let φ_R ∈ CL(Y_R, R) be the real part of φ. Then there exists an extension Φ_R ∈ CL(X_R, R) of φ_R to X_R with ||Φ_R|| = ||φ_R||. Let Φ ∈ CL(X, C) be defined by (2.18). Then Φ is an extension of φ, and ||Φ|| = ||Φ_R|| = ||φ_R|| = ||φ||.


Exercise 2.39. (Hamel basis). Let X be a vector space over any field F. Show that there exists a subset B ⊂ X such that B is linearly independent, and span B = X. Such a set is called a Hamel basis of X.

Exercise 2.40. Let X, Y be vector spaces over a field F. Show that any function f : B → Y defined on a Hamel basis B of X can be extended to a linear transformation F : X → Y, that is, F|_B = f. Hint: Every vector in X can be uniquely expressed as a linear combination of vectors from B.

Exercise 2.41. Let X be an infinite dimensional normed space, and let Y be a nontrivial normed space. Prove that there exists a linear transformation from X to Y which is not continuous.

Exercise 2.42. R is a vector space over Q, and hence has a Hamel basis B. Prove that B is necessarily uncountable.

Exercise 2.43. (Additive discontinuous F : R → R). Show that there exists a function F : R → R such that for all x, y ∈ R, F(x + y) = F(x) + F(y), but F is not continuous on R.

Exercise 2.44. (∗)(Banach limits).

Consider the subspace c of ℓ^∞ comprising convergent sequences.

Let l : c → K be the limit functional given by l(x) = lim_{n→∞} x_n for x = (x_n)_{n∈N} ∈ c.

(1)Show that l is an element in the dual space CL(c, K) of c, when c is equipped with the induced norm from ℓ^∞.

Let Y be given by Y := {x = (x_n)_{n∈N} ∈ ℓ^∞ : lim_{n→∞} (x_1 + ··· + x_n)/n exists}.

(2)Show that Y is a subspace of ℓ^∞.

(3)Prove that for all x ∈ ℓ^∞, x – Sx ∈ Y, where S : ℓ^∞ → ℓ^∞ denotes the left shift operator: S(x_1, x_2, x_3, ···) = (x_2, x_3, x_4, ···), (x_n)_{n∈N} ∈ ℓ^∞.

(4)Prove that c ⊂ Y.

(5)Show that there exists an L ∈ CL(ℓ^∞, K) such that L|_c = l and LS = L.

This gives a generalisation of the concept of a limit, and the number Lx is called a Banach limit of a (possibly divergent!) sequence x.

Hint: First observe that L_0 : Y → K defined by

$$L_0 x = \lim_{n \to \infty} \frac{x_1 + \cdots + x_n}{n}, \qquad x = (x_n)_{n \in N} \in Y,$$

is an extension of the functional l from c to Y. Now use the Hahn-Banach Theorem to extend L_0 from Y to ℓ^∞.

(6)Find the Banach limit of the divergent sequence ((–1)^n)_{n∈N}.
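A small numerical experiment can make the hint concrete. The Python sketch below assumes that the functional of the hint assigns to a sequence the limit of its arithmetic means (x_1 + ··· + x_n)/n. The function name and the test sequences are illustrative choices, and finite samples only suggest the limiting behaviour; they prove nothing.

```python
import numpy as np

def cesaro_means(x):
    """Arithmetic means (x_1 + ... + x_n)/n of a finite sample of a sequence."""
    x = np.asarray(x, dtype=float)
    return np.cumsum(x) / np.arange(1, len(x) + 1)

N = 300_000
n = np.arange(1, N + 1)

convergent = 1.0 + 1.0 / n                 # converges to 1
periodic = (n % 3 == 1).astype(float)      # 1, 0, 0, 1, 0, 0, ... (divergent)
shift_diff = periodic[:-1] - periodic[1:]  # first N-1 terms of x - Sx

# Last available mean of each sample, as a proxy for the limit.
mean_convergent = cesaro_means(convergent)[-1]
mean_periodic = cesaro_means(periodic)[-1]
mean_shift_diff = cesaro_means(shift_diff)[-1]
print(mean_convergent, mean_periodic, mean_shift_diff)
```

In this sample the means of the convergent sequence stay close to its limit 1, those of the divergent periodic sequence settle near 1/3, and those of x – Sx are close to 0. This is the behaviour that parts (3) to (5) ask to establish rigorously.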

Notes

The proof of the Open Mapping Theorem given in §2.5, and the proof of the Hahn-Banach Theorem given in §2.7 are based on [Thomas (1997)].

 

1 This has nothing to do with the null space: ker T_A := {x ∈ L^2[0, 1] : T_A x = 0}, which is also called the kernel of the integral operator.

2 In Fourier/Harmonic Analysis, this is sometimes called the Cesàro summation operator.

3 That is, what Δψ, 〈·〉ψ, etc. mean.

4 The “geometric” series in (2) is called the Neumann series, after the German mathematician Carl Neumann, who used it in connection with the Dirichlet problem.

5 For the definition/existence of a Hamel basis, see Exercise 2.39, page 115.

6 We remark that if we look at the “matrix” corresponding to L, while thinking of vectors in ℓ^2 as “infinite columns”, then the action of L is described by

image

a matrix with all diagonal entries equal to 0, and with 1s along an “upper” diagonal. So in this case, our “matricial intuition” would have led us astray, since based on the above matrix, reminiscent of a Jordan block in finite dimensional linear algebra, one would be tempted to hastily guess that L has the only eigenvalue 0!

7 The usual proof of this is by using some tools from complex analysis. We will instead follow the proof from [Singh (2006)] relying on real analysis techniques.

8 See for example [Taylor and Lay (1980), page 287, Theorem 3.2].

9 See for example [Taylor and Lay (1980), page 279, Theorem 3.4].

10 By an unbounded operator, we mean a linear transformation that is not continuous.

11 Named after the mathematicians Hans Hahn and Stefan Banach.