4
Semigroup Methods in the Theory of Partial Differential Equations
RALPH S. PHILLIPS
PROFESSOR OF MATHEMATICS
STANFORD UNIVERSITY
4.1 Introduction
The theory of semigroups of operators is intimately connected with the initial-value problem for systems of partial differential equations. Mathematical physics abounds in problems of this kind; for instance, the initial-value problem for the heat equation,

ut = uχχ    u(χ,0) = f(χ)    −∞ < χ < ∞, t > 0    (4.1)

and the initial-value problem for the wave equation,

utt = uχχ    u(χ,0) = f(χ)    ut(χ,0) = g(χ)    −∞ < χ < ∞, t > 0    (4.2)

are both of this type. J. Hadamard has called such a problem well formulated if there is a unique solution for the given initial data and if the solution varies continuously with the initial data; see pages 93 to 94 and 111 to 112 of Ref. 1. These two requirements are certainly reasonable on physical grounds, the existence and uniqueness being merely an affirmation of the principle of scientific determinism, whereas the continuous dependence is an expression of the stability of the solution of the problem. In order that the initial-value problem for a given system of partial differential equations may be well formulated, it is frequently necessary to limit the set of initial data, for instance by imposing certain boundary conditions. Thus, for the heat equation over a finite interval, a well-formulated problem would be

ut = uχχ    u(0,t) = 0 = u(1,t)    u(χ,0) = f(χ)    0 < χ < 1, t > 0
The purpose of this chapter is to give a precise, but somewhat abstract, formulation to the initial-value problem, also called the Cauchy problem.
If we assume that the system of differential equations is independent of time—or, equivalently, that the corresponding physical mechanism is time-invariant—then the Cauchy problem leads directly to a semigroup of operators. In fact, suppose a class of suitable initial data is denoted by 𝒟, and let the solution to the problem at time t > 0 with initial data y in 𝒟 be denoted by S(t)y. We must also assume that 𝒟 is invariant in the sense that S(t)y lies in 𝒟 for each y in 𝒟; this guarantees that the solution is continuable. We can then determine the solution at time t1 + t2, taking both t1 and t2 > 0, either directly as S(t1 + t2)y or indirectly by using S(t1)y as initial data and determining the solution at time t2 later, as S(t2)[S(t1)y]. The uniqueness of the solution implies that

S(t1 + t2)y = S(t2)[S(t1)y]

or briefly,

S(t1 + t2) = S(t2)S(t1)    (4.3)

This is the semigroup property for the operators [S(t); t > 0].
In some physical problems, the initial data determine the entire past as well as the future of the mechanism. For such problems, the restriction t1 > 0, t2 > 0 in Eq. (4.3) need not be imposed and the resulting family of operators [S(t)] defined for all t, − ∞ < t < ∞, forms a group of operators. In particular, S(t) and S(−t) will be inverses of each other, so that
S(t)S(−t) = I
the identity operator.
The semigroup property of solutions to the time-invariant Cauchy problem is reflected in certain addition theorems. Thus, as is well known, in the case of the previously mentioned heat equation (4.1) we have

[S(t)f](χ) = ∫−∞∞ k(χ − ξ; t)f(ξ) dξ    (4.4)

and the kernel

k(χ; t) = (4πt)−1/2 exp (−χ2/4t)

exhibits the semigroup property of the operators [S(t); t > 0] in the form

k(χ − ξ; t1 + t2) = ∫−∞∞ k(χ − ζ; t2)k(ζ − ξ; t1) dζ
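The kernel-composition identity can be checked numerically by quadrature; the following sketch is ours (the function names k and convolve are not from the text), and it uses the classical Gaussian form of the heat kernel:

```python
import math

def k(x, t):
    """Heat kernel k(x; t) = (4*pi*t)^(-1/2) * exp(-x^2 / (4t))."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def convolve(t1, t2, x, lo=-20.0, hi=20.0, n=4000):
    """Evaluate  integral of k(x - z; t2) k(z; t1) dz  by the trapezoidal
    rule over a window wide enough to capture both Gaussians."""
    h = (hi - lo) / n
    s = 0.5 * (k(x - lo, t2) * k(lo, t1) + k(x - hi, t2) * k(hi, t1))
    for i in range(1, n):
        z = lo + i * h
        s += k(x - z, t2) * k(z, t1)
    return s * h

# The composed kernel should agree with k(x; t1 + t2).
t1, t2, x = 0.7, 1.3, 0.5
lhs = convolve(t1, t2, x)
rhs = k(x, t1 + t2)
print(abs(lhs - rhs) < 1e-8)
```

The agreement reflects the fact that convolving two centered Gaussians adds their variances, which is exactly the semigroup law for the heat flow.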
A similar result holds for the wave-equation initial-value problem (4.2). If we write η1 = uχ, η2 = ut, then Eq. (4.2) takes the form

∂η1/∂t = ∂η2/∂χ    ∂η2/∂t = ∂η1/∂χ    η1(χ,0) = f′(χ)    η2(χ,0) = g(χ)    (4.5)

and the solution can be written as

[S(t)F](χ) = ∫−∞∞ K(χ − ξ; t)F(ξ) dξ

Here F(χ) is the column vector F(χ) = [f′(χ), g(χ)] and K(χ; t) is the matrix kernel

K(χ; t) = ½ [ δ(χ + t) + δ(χ − t)    δ(χ + t) − δ(χ − t) ]
            [ δ(χ + t) − δ(χ − t)    δ(χ + t) + δ(χ − t) ]

where δ(χ) denotes the Dirac delta function (see Chap. 1). Again the fact that the solution satisfies the semigroup property results in the kernel satisfying the relationship

K(χ − ξ; t1 + t2) = ∫−∞∞ K(χ − ζ; t2)K(ζ − ξ; t1) dζ
Stability requires that S(t)yn converge to S(t)y whenever yn converges to y, whence we may conclude that the operator S(t) is continuous on 𝒟. Moreover, the statement that y in 𝒟 is the initial value for the solution of a Cauchy problem means that the solution at time t, namely S(t)y, approaches y as t goes to 0+, in other words that

lim t→0+ S(t)y = y

We shall in addition assume that the problem is linear; i.e., for y and w in 𝒟, and a and b complex numbers, we assume that ay + bw lies in 𝒟 and that the solution satisfies the equation

S(t)(ay + bw) = aS(t)y + bS(t)w

This will be the case when the system of partial differential equations is linear and the boundary conditions determining 𝒟 are homogeneous.

Uniqueness, stability, and linearity constitute our principal assumptions, and it is evident that a large class of problems in partial differential equations meet these conditions and consequently have solutions that can be described by semigroups of linear operators continuous on 𝒟 for each t > 0 and such that

lim t→0+ S(t)y = y

for each y in 𝒟.
4.2 Semigroups of Operators on Finite-dimensional Spaces
We now proceed to a discussion of one-parameter families of operators of the sort described in Sec. 4.1, and we begin by considering the simplest case, namely, that for which 𝒟 is one-dimensional. In effect, then, 𝒟 will be the complex-number field, and for each t > 0, S(t) will be merely a multiplicative factor, also a complex number. Since S(t + τ) = S(t)S(τ) and S(τ) → 1 as τ → 0+, we see that S(t) is continuous on the right and hence integrable. Clearly we have

lim t→0+ t−1 ∫0t S(s) ds = 1

so that in particular, for t sufficiently small, S(t) must satisfy the condition

∫0t S(s) ds ≠ 0

For 0 < ε < t, we have

ε−1[S(ε) − 1] ∫0t S(s) ds = ε−1 ∫t t+ε S(s) ds − ε−1 ∫0ε S(s) ds

The right-hand member has the limit S(t) − 1 as ε → 0+, and therefore

A = lim ε→0+ ε−1[S(ε) − 1]

exists. Accordingly, we obtain

S′(t) = AS(t)    S(0) = 1    (4.7)

or, in integrated form, S(t) = 1 + A ∫0t S(s) ds, and successive substitutions into the right-hand member yield

S(t) = exp (At) = Σn=0∞ (At)n/n!    (4.8)

Although the relationship (4.8) has been established only for t sufficiently small, the semigroup property shows directly that it holds for all t > 0. Thus in the one-dimensional case we can show that all semigroups of operators are exponentials.

The argument given above also applies when 𝒟 is finite-dimensional, with the difference that now S(t) and A are to be interpreted as matrices. The relationships (4.7) and (4.8) continue to hold, however, and we obtain in this way a general form for the solution to problems that arise in the theory of small vibrations and in electric-circuit theory. Thus, if y is a given initial vector and we set y(t) = S(t)y, then it follows that

y′(t) = Ay(t)    y(0) = y    (4.9)

which is the general form of the initial-value problem in the finite-dimensional case. As an example of the solution to Eq. (4.9) for
we have
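The exponential representation of a finite-dimensional semigroup lends itself to a direct numerical check. In the sketch below (the helper names are ours, and the power series is truncated at 40 terms) the semigroup property is verified for a 2 × 2 generator:

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def expm(A, t, terms=40):
    """S(t) = exp(At) via the power series sum of (At)^n / n!."""
    n = len(A)
    At = [[t * a for a in row] for row in A]
    S = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in S]
    for m in range(1, terms):
        term = mat_mul(term, At)                 # now (At)^m * (m-1)!/m! pending
        term = [[a / m for a in row] for row in term]   # (At)^m / m!
        S = mat_add(S, term)
    return S

A = [[0.0, 1.0], [-1.0, 0.0]]      # generator of a rotation group
S1 = expm(A, 0.4)
S2 = expm(A, 0.9)
S12 = expm(A, 1.3)
prod = mat_mul(S2, S1)
err = max(abs(prod[i][j] - S12[i][j]) for i in range(2) for j in range(2))
print(err < 1e-12)                 # S(t1 + t2) = S(t2)S(t1)
```

Since this particular A is skew-symmetric, the resulting S(t) are rotations, so the family is in fact a group in the sense discussed in Sec. 4.1.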
4.3 Hilbert Space
Applications of semigroup theory to problems such as the Cauchy problem for the heat and wave equations require that we extend the foregoing theory to the infinite-dimensional case. The resulting theory is much more refined and considerably richer in detail. It is now necessary, however, that we be precise in defining our concepts. For example, we have repeatedly used the notion of a limit in 𝒟 without saying what was meant by this concept. As a matter of fact, almost all definitions of limit in the finite-dimensional case are equivalent. This is not true of the infinite-dimensional case, and so we shall start by describing a suitable setting for our theory.
The simplest and perhaps the most useful of the infinite-dimensional spaces is the Hilbert space H (see, for instance, the book by F. Riesz and B. Sz.-Nagy9); H is a direct generalization of the familiar complex euclidean space and accordingly one has considerable insight into its geometry right from the start. Thus H is a vector space with complex numbers as scalar multipliers; more precisely, H consists of elements y, z, w, etc., called vectors; to every pair y and z of vectors there corresponds a vector w, called their sum, w = y + z; and to every vector y and complex number a there corresponds a vector z, called their product, z = ay. The sum and product satisfy the following properties:
a. y + z = z + y
b. y + (z + w) = (y + z) + w
c. There exists a unique zero vector, denoted by 0, such that for all y in H,
y + 0 = y
d. To each y in H there corresponds a unique vector, denoted by − y, with the property
y + (− y) = 0
e. a(y + z) = ay + az
f. (a + b)y = ay + by
g. a(by) = (ab)y
h. 0y = 0 and 1y = y
What distinguishes the Hilbert space H from other vector spaces is the existence of a complex-valued inner product (y,z) defined for all ordered pairs y and z in H and having the properties
a. (ay,z) = a(y,z)
b. (y + z, w) = (y,w) + (z,w)
c. (z,y) = α − iβ whenever (y,z) = α + iβ and α, β are real; that is, (z,y) is the complex conjugate of (y,z)

d. (y,y) ≥ 0, the equality sign holding if and only if y = 0
The norm
||y|| = [(y,y)]1/2
is a measure of the length of the vector y. We note that
||y || ≥ 0
the equality sign holding if and only if y = 0. Further, we have
||ay|| = |a| ||y||
The fact that the cosine of the angle between two vectors is of absolute value less than or equal to 1 is expressed by the inequality of Schwarz (see pages 126, 134, 166, 175, and 400 of Ref. 1):
|(y,z)| ≤ ||y|| ||z||
In order to establish this inequality, we note that, if z is equal to the zero element, then the result is obviously valid. If z ≠ 0, then for every complex number a

0 ≤ ||y + az||2 = ||y||2 + a(z,y) + ā(y,z) + |a|2||z||2

and by setting a = − (y,z)/(z,z) we obtain the desired result. The so-called triangle inequality
||y + z|| ≤ ||y|| + ||z||
now follows directly; in fact, we have
||y + z||2 = ||y||2 + (y,z) + (z,y) + ||z||2 ≤ ||y||2 + 2||y|| ||z|| + ||z||2
The distance between two elements y and z is defined as ||y − z||. It is clear that this function is symmetric and nonnegative, zero if and only if y = z, and it follows from the triangle inequality that
||y − z|| ≤ ||y − w|| + ||w − z||
It also follows from the triangle inequality that
| ||y|| − ||z|| | ≤ ||y − z||
from which we see that ||y|| is a continuous function. The inner product can also be shown to be a continuous function of both of its arguments.
It is assumed that H is complete in terms of the above distance function; that is, given a Cauchy sequence {yn}, which by definition is a sequence such that

lim m,n→∞ ||yn − ym|| = 0

then there is a y in H for which

lim n→∞ ||yn − y|| = 0
Finally, it is assumed that H is not finite-dimensional.
The simplest example of a Hilbert space is the space of complex-valued sequences

y = {η1, η2, …}    z = {ζ1, ζ2, …}

for which the sum of the squares of the absolute values of the components is finite; here the inner product is defined as

(y,z) = Σi=1∞ ηiζ̄i
Another example of a Hilbert space is given by L2(G), the Lebesgue-measurable complex-valued functions y = y(χ) defined on a domain G in an m-dimensional real euclidean space Em and having the property that

∫G |y(χ)|2 dχ < ∞

In this case, the inner product is defined for y and z in L2(G) as

(y,z) = ∫G y(χ)z̄(χ) dχ

In order that the inequality (y,y) > 0 be satisfied when y ≠ 0, it is necessary to identify all functions that differ only on sets of measure zero. The resulting classes of functions in L2(G) then define a Hilbert space. It should be noted that in such a space it is possible to have a sequence {yn} for which ||yn|| converges to zero even though yn(χ) does not converge for any point χ in G.
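The last remark can be made concrete with the classical "sliding bump" sequence of indicator functions of dyadic intervals of [0,1); the enumeration below is a sketch of ours, not notation from the text:

```python
import math

def dyadic_interval(n):
    """Enumerate the dyadic intervals [j/2^k, (j+1)/2^k) of [0,1):
    n = 0 gives k = 0; n = 1, 2 give k = 1; n = 3..6 give k = 2; etc."""
    k = 0
    while n >= (1 << k):
        n -= (1 << k)
        k += 1
    return k, n          # y_n is the indicator of [n/2^k, (n+1)/2^k)

def norm(n):
    k, _ = dyadic_interval(n)
    return math.sqrt(2.0 ** (-k))    # L2 norm of the n-th indicator

def value_at(n, x):
    k, j = dyadic_interval(n)
    return 1.0 if j / 2**k <= x < (j + 1) / 2**k else 0.0

# The norms tend to zero ...
print(norm(1000))        # roughly 0.044
# ... yet at the fixed point x = 0.3 the sequence takes the value 1
# once in every "generation" k, and the value 0 in between, so y_n(0.3)
# does not converge.
hits = [n for n in range(1023) if value_at(n, 0.3) == 1.0]
print(len(hits))         # one hit per generation k = 0, ..., 9
```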
A subspace L contained in H is simply a vector space, i.e., a subset with the property that for all y and z in L and for all complex numbers a and b, the vector ay + bz lies in L. A closed subspace is a subspace that contains all its limit points. It is easy to see that a closed subspace of a Hilbert space is complete and therefore is itself a Hilbert space when it is not finite-dimensional. Two vectors y and z of H are said to be orthogonal if (y,z) = 0; it follows that the zero vector is orthogonal to every vector of H. Two sets S1 and S2 are said to be orthogonal if every element of S1 is orthogonal to every element of S2. The orthogonal complement of a set S, in symbols S⊥, is the set of all vectors orthogonal to S. It is readily verified that the orthogonal complement of a set is a closed subspace of H.
If L is a closed subspace of H and y is an arbitrary element in H, then there is in L a unique closest element y1 to y. In fact, let
d = inf (||y − z||; z in L)
Then there is a sequence {zn} in L such that ||y − zn|| converges to d. Now it can readily be verified that
||zm − zn||2 = 2(||y − zn||2 + ||y − zm||2) − 4||y − ½(zn + zm)||2
and since ||y − zn||2 and ||y − zm||2 converge to d2, and
||y − ½(zn + zm)||2 ≥ d2
[because (½)(zn + zm) lies in L], it follows that {zn} forms a Cauchy sequence. Thus {zn} converges to some y1 in L with ||y − y1|| = d. Now, if there were a second value w1 in L with ||y − w1|| = d, then the above identity applied to y1 and w1 shows that ||y1 − w1|| = 0, which proves that y1 is unique.
To pursue the foregoing matter a bit further, suppose y and y1 are defined as above and let z be an arbitrary element in L. Then, since y1 + az lies in L for every complex number a, we have

||y − y1||2 ≤ ||y − y1 − az||2 = ||y − y1||2 − a(z, y − y1) − ā(y − y1, z) + |a|2||z||2

This inequality can hold for all complex a if and only if (y − y1, z) = 0. Thus y2 = y − y1 lies in the orthogonal complement to L; accordingly, to each y in H there correspond a y1 in L and a y2 in L⊥ such that y = y1 + y2. Since 0 is the only vector common to L and L⊥, it is clear that this decomposition of y is unique. The vector y1 is usually called the projection of y on L; in symbols, we write

PLy = y1

Now, because y1 and y2 are orthogonal, we have

||y||2 = ||y1||2 + ||y2||2

Further, it is easily seen that
PL(ay + bz) = aPLy + bPLz PLPLy = PLy (PLy,z) = (y,PLz)
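These three identities are easy to test numerically for a concrete closed subspace; in the sketch below (our own function names) L is the span of a single unit vector in complex 3-space:

```python
import math

def inner(y, z):
    """Inner product: sum of eta_i times the conjugate of zeta_i."""
    return sum(a * b.conjugate() for a, b in zip(y, z))

def project(y, basis):
    """Orthogonal projection of y onto the span of an orthonormal family."""
    out = [0j] * len(y)
    for e in basis:
        c = inner(y, e)
        out = [o + c * ei for o, ei in zip(out, e)]
    return out

# L = one-dimensional subspace of C^3 spanned by the unit vector e.
e = [1 / math.sqrt(2), 1j / math.sqrt(2), 0j]
y = [1 + 0j, 2 + 0j, 3 + 0j]
z = [0j, 1 + 0j, 1j]

Py = project(y, [e])
PPy = project(Py, [e])
print(max(abs(a - b) for a, b in zip(Py, PPy)) < 1e-12)       # P_L P_L = P_L
print(abs(inner(Py, z) - inner(y, project(z, [e]))) < 1e-12)  # (P_L y, z) = (y, P_L z)
```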
It is obvious from the definition of orthogonal complements that (S⊥)⊥ ⊃ S for an arbitrary subset S. In the case of a closed subspace L, however, if y lies in (L⊥)⊥, then y is orthogonal to L⊥; by writing y = y1 + y2, y1 in L and y2 in L⊥, we see that
||y2||2 = (y,y2) − (y1,y2) = 0
Hence in this case y lies in L and we have L = (L⊥)⊥.
It is convenient to introduce the notion of a product space H × H consisting of all ordered pairs [y,z] with y and z in H. This becomes a vector space under the convention
a[y,z] + b[u,v] = [ay + bu, az + bv]
And if we define an inner product as
([y,z],[u,v]) = (y,u) + (z,v)
then H × H actually becomes a Hilbert space.
We come now to the subject of transformations or operators on H to itself. An operator T with domain 𝒟(T) and range ℛ(T) is said to be linear if 𝒟(T) is a linear subspace and if

T(ay + bz) = aTy + bTz    for all y, z in 𝒟(T)

The graph of T, in symbols 𝒢(T), is defined as

𝒢(T) = ([y,Ty]; y in 𝒟(T))

If T is a linear operator, it is clear that 𝒢(T) is a linear subspace of H × H. For an operator T, if we have yn → y and Tyn → z, we shall say that [yn, Tyn] converges to [y,z] in the graph topology. In particular if, for every convergent sequence of this type, y lies in 𝒟(T) and Ty = z, then T is said to be a closed operator; in other words, T is closed if and only if 𝒢(T) is a closed subspace of H × H. Continuity is a much more restrictive concept, since an operator is continuous if yn → y in 𝒟(T) implies that Tyn → Ty. A continuous operator need not be closed; however, it always has a smallest closed extension that is continuous and has a closed domain. If T is linear and continuous, then there is a δ > 0 such that ||Ty|| ≤ 1 whenever ||y|| ≤ δ. For any y ≠ 0 in 𝒟(T), it is clear that δy/||y|| is of norm δ and hence

||Ty|| ≤ δ−1||y||

Thus, for a continuous linear operator, the norm or bound

||T|| = sup (||Ty||; y in 𝒟(T), ||y|| ≤ 1)

is finite. It is clear that

||Ty|| ≤ ||T|| ||y||    (4.12)

and that ||T|| is the smallest number for which this holds. Thus if T is continuous and linear, and if {yn} is a Cauchy sequence of elements in 𝒟(T), then the inequality

||Tyn − Tym|| ≤ ||T|| ||yn − ym||    (4.13)

shows that {Tyn} also forms a Cauchy sequence. We can therefore define an extension T̄ of T as

T̄(lim yn) = lim Tyn

for all such Cauchy sequences. This is the smallest closed extension of T, to which we alluded above; it is readily shown that T̄ is again continuous and linear and that 𝒟(T̄) is a closed subspace of H. The inequality (4.13) shows, incidentally, that a linear operator satisfying the relationship (4.12) is necessarily continuous.
In order to illustrate the foregoing discussion, we note that the operator S(t) defined as in Eq. (4.4) on H = L2(−∞, ∞) and representing the solution to the initial-value problem (4.1) for the heat equation is linear and continuous. In fact, we have ||S(t)|| ≤ 1 since

∫−∞∞ k(χ − ξ; t) dξ = 1

and since, by the Schwarz inequality,

|[S(t)f](χ)|2 ≤ ∫−∞∞ k(χ − ξ; t)|f(ξ)|2 dξ

so that, upon integrating with respect to χ, ||S(t)f||2 ≤ ||f||2. On the other hand, the associated diffusion operator A [also on the space H = L2(−∞, ∞)], defined by

Ay = yχχ    (4.14)

for all y in H such that y and yχ are absolutely continuous and yχχ lies in H, can be shown to be closed but not continuous.
Suppose T is a closed linear operator and let λ be a complex number. If λI − T has a continuous inverse with domain all of H, then λ is said to belong to the resolvent set of T; the inverse is called the resolvent of T at the point λ and is denoted by R(λ;T). It is clear that

R(λ;T)(λI − T)y = y for all y in 𝒟(T)    (λI − T)R(λ;T)y = y for all y in H

As an example, we consider the operator A defined as in Eq. (4.14) and take λ > 0. Then
(λI − A)y = λy − yχχ
has an inverse if and only if the equation λy − yχχ = 0 has no nontrivial solution in 𝒟(A). The only nontrivial solutions of this equation for which y and yχ are absolutely continuous, however, are linear combinations of exp (λ½χ) and exp (−λ½χ), and no nontrivial combination of these lies in L2(−∞, ∞). Thus λI − A has an inverse; in fact, the equation
(λI − A)y = f
has a solution given by

y(χ) = Rλf = (2λ½)−1 ∫−∞∞ exp (−λ½|χ − ξ|)f(ξ) dξ

One can verify directly that the solution y given here belongs to 𝒟(A) and also that Rλ is a continuous linear operator with bound 1/λ. Thus Rλ is a right inverse of λI − A. In order to prove that Rλ is actually the resolvent of A at λ, it remains only to verify that

Rλ(λI − A)z = z    for all z in 𝒟(A)

To this end, let z in 𝒟(A) be given, and set

f = λz − Az    y = Rλf

According to what has already been established, we have

(λI − A)(y − z) = 0

Hence it follows from the above uniqueness argument that

z = y = Rλf

This proves that Rλ(λI − A)z = z, and hence that Rλ = R(λ;A). We note parenthetically that 𝒟(A) is dense in H; that is, every element in H is the limit of a sequence of elements in 𝒟(A).
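Assuming the exponential kernel representation of Rλ described above, both the total mass of the kernel and the bound ||Rλ|| ≤ 1/λ can be spot-checked by quadrature; the helper names and tolerances below are ours:

```python
import math

def green(x, lam):
    """Kernel of R_lam: (2*sqrt(lam))^(-1) * exp(-sqrt(lam)*|x|)."""
    r = math.sqrt(lam)
    return math.exp(-r * abs(x)) / (2.0 * r)

def trapz(g, lo, hi, n):
    """Trapezoidal rule for a scalar function g on [lo, hi]."""
    h = (hi - lo) / n
    s = 0.5 * (g(lo) + g(hi))
    for i in range(1, n):
        s += g(lo + i * h)
    return s * h

lam = 2.0

# The kernel has total mass 1/lam; this is the source of the bound ||R_lam|| <= 1/lam.
mass = trapz(lambda x: green(x, lam), -30.0, 30.0, 6000)
print(abs(mass - 1.0 / lam) < 1e-4)

# Spot check of the bound for a Gaussian f: ||R_lam f|| <= ||f|| / lam.
f = lambda x: math.exp(-x * x)
Rf = lambda x: trapz(lambda xi: green(x - xi, lam) * f(xi), -10.0, 10.0, 2000)
norm_f = math.sqrt(trapz(lambda x: f(x) ** 2, -10.0, 10.0, 2000))
norm_Rf = math.sqrt(trapz(lambda x: Rf(x) ** 2, -10.0, 10.0, 400))
print(norm_Rf <= norm_f / lam)
```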
4.4 Semigroups of Operators on a Hilbert Space
Returning now to semigroup theory, we suppose that we have a semigroup of linear operators [S(t); t > 0] defined on a domain 𝒟 that we assume to be contained and dense in H. Stability implies that each S(t) is continuous on 𝒟, and hence S(t) can be uniquely extended to be linear and continuous on H. We shall denote the extended operator again by S(t), and it is readily verified that the extended operators also have the semigroup property (4.3).

It is convenient at this point to make one further assumption, namely, that the operators S(t) satisfy the inequality

||S(t)|| ≤ 1    (4.15)

Such operators are called contraction operators. In this case, continuity with respect to t at t = 0 also carries over for the extended operators; in other words,

lim t→0+ S(t)y = y    for each y in H

Actually, more is true. It can be shown on the basis of Eq. (4.15) that S(t)y is a continuous function of t, t > 0, for each y in H, and if we set S(0) = I, then the assertion holds for all t ≥ 0.

We now define the infinitesimal generator A of the semigroup [S(t)] by setting

Ay = lim t→0+ t−1[S(t)y − y]

The domain 𝒟(A) of A consists of all y in H for which this limit exists. It is easy to verify that A is a linear operator, and it can be shown that A is actually closed with a dense domain 𝒟(A). Now
ε−1[S(t + ε)y − S(t)y] = ε−1[S(ε) − I]S(t)y = S(t){ε−1[S(ε)y − y]}
For each y in 𝒟(A), the right-hand member converges to S(t)Ay as ε → 0+. This shows that the middle member also converges and hence that S(t)y lies in 𝒟(A); we may therefore assert that the right-hand derivative satisfies

d+S(t)y/dt = AS(t)y = S(t)Ay    (4.16)

Upon writing

ε−1[S(t)y − S(t − ε)y] = S(t − ε){ε−1[S(ε)y − y]}    0 < ε < t

we see that the left-hand derivative also exists, and since it equals S(t)Ay, we conclude that Eq. (4.16) holds for the two-sided derivative.
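In the finite-dimensional case, Eq. (4.16) can be verified directly by comparing a difference quotient of exp (At)y with exp (At)Ay; the sketch below uses our own series-based helper:

```python
def mat_vec(A, y):
    return [sum(a * v for a, v in zip(row, y)) for row in A]

def expm_vec(A, t, y, terms=60):
    """exp(At) y via the power series: y + (At)y + (At)^2 y / 2! + ..."""
    out = y[:]
    term = y[:]
    for m in range(1, terms):
        term = [t * v / m for v in mat_vec(A, term)]   # (At)^m y / m!
        out = [o + v for o, v in zip(out, term)]
    return out

A = [[-1.0, 2.0], [0.0, -3.0]]
y = [1.0, 1.0]
t, d = 0.8, 1e-5

# Two-sided difference quotient of S(t)y versus S(t)Ay:
lhs = [(p - q) / (2 * d) for p, q in
       zip(expm_vec(A, t + d, y), expm_vec(A, t - d, y))]
rhs = expm_vec(A, t, mat_vec(A, y))
print(max(abs(a - b) for a, b in zip(lhs, rhs)) < 1e-8)
```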
We note that the derivative with respect to t is taken in the sense of the metric in H. Thus suppose that H = L2(−∞, ∞) and set u(χ,t) equal to a function in H that corresponds to S(t)f for a given f in 𝒟(A) and t ≥ 0. We can then take the difference quotient of u(χ,t) with respect to t and pass to the limit in the mean-square sense as the t increment goes to zero. This gives a generalized partial derivative with respect to t, and according to Eq. (4.16) the resulting derivative is obtained by having A act on u(χ,t). If A were the operator (4.14), say, then the semigroup S(t) restricted to 𝒟(A) would provide a solution to the Cauchy problem for the heat equation (4.1), at least in this generalized sense. It will be the aim of this chapter to show how the theory of semigroups can be used to establish the existence of such solutions to a large class of Cauchy problems defined by means of certain differential operators.
The semigroup method is in essence an abstract analogue of the Laplace-transform approach to the initial-value problem for time-invariant partial differential equations and as such should be not entirely unfamiliar to the reader (see Chap. 3). The Laplace transform

Rλy = ∫0∞ exp (−λt)S(t)y dt

can be defined in a straightforward manner and converges for λ > 0. Since S(t) is very close to exp (At), and at least heuristically

∫0∞ exp (−λt) exp (At) dt = (λI − A)−1

it is not surprising that Rλ turns out to be the resolvent of A, namely, R(λ;A). Here, as in the classical Laplace-transform approach, it is the inverse problem that is important. In other words, given an operator A, for which the resolvent exists when λ is sufficiently large, the problem is to determine whether A is the infinitesimal generator of a semigroup of operators. It is precisely at this point that the abstract semigroup theory provides a better answer than the classical theory, at least better in the sense that the abstract criterion is easier to verify than the classical criterion. The first result of this kind was obtained independently by E. Hille and K. Yosida (see pages 360 to 364 of Ref. 6) and can be expressed as follows:
THEOREM 4.1. A necessary and sufficient condition that a closed linear operator L with dense domain generate a semigroup of contraction operators is that the resolvent R(λ;L) satisfy the inequality
||R(λ;L)|| ≤ λ−1    λ > 0
The Hille-Yosida result is valid in more general settings than a Hilbert space H, so that it is to be expected that a simpler criterion holds in the case of H. By way of motivating the terminology in this simpler criterion, we note that, in the model that we shall use, the inner product (y,y) turns out to be the energy associated with the state y; thus, assuming S(t) to be a contraction operator is the same as assuming that the process is dissipative in the sense that no energy enters the system. Since for τ > 0 we have

||S(t + τ)y|| = ||S(τ)S(t)y|| ≤ ||S(t)y||

we see that ||S(t)y|| is nonincreasing, and hence for y in 𝒟(A) we obtain

Re (Ay,y) = lim t→0+ t−1 Re (S(t)y − y, y) ≤ 0    (4.17)

We shall call an operator L dissipative if

Re (Ly,y) ≤ 0    for all y in 𝒟(L)

and maximal dissipative if it is not the proper restriction of any other dissipative operator. We can prove the following theorems.7
THEOREM 4.2. A necessary and sufficient condition for a dissipative operator L with dense domain to be maximal dissipative is that the range of I − L be all of H.
THEOREM 4.3. A necessary and sufficient condition that a linear operator L generate a semigroup of contraction operators is that L be maximal dissipative with dense domain.
These results are readily verified in the case of the finite-dimensional analogue of H, namely, the k-dimensional complex euclidean space Zk with elements

y = (η1, η2, …, ηk)    z = (ζ1, ζ2, …, ζk)

and inner product

〈y,z〉 = Σi=1k ηiζ̄i

As we have already seen, each semigroup of matrices [S(t)] defined on Zk and continuous on the right can be represented as

S(t) = exp (At)    (4.18)

where the matrix A is the infinitesimal generator of the semigroup. If, in addition, the operators S(t) are contraction operators, then, according to the relationship (4.17), A will be dissipative, actually maximal dissipative since A is defined on all of Zk. In the finite-dimensional case this simply means that the eigenvalues of the real part of A are nonpositive, in symbols

B = ½(A + A*) ≤ Θ
where A* is the transpose of A relative to the inner product 〈y,z〉 and Θ is the zero matrix. Conversely, if B ≤ Θ, then for S(t) defined as in Eq. (4.18) and for arbitrary y in Zk we have

d〈S(t)y, S(t)y〉/dt = 〈(A + A*)S(t)y, S(t)y〉 = 2〈BS(t)y, S(t)y〉 ≤ 0

and since

〈S(t)y, S(t)y〉 = 〈y,y〉    for t = 0

we see that

||S(t)y|| ≤ ||y||    for all t > 0
Thus A generates a semigroup of contraction operators in accordance with Theorem 4.3. For example, the semigroup (4.11) generated by Eq. (4.10) will consist of contraction operators if and only if
We further note, for A dissipative, λ > 0, and f = λy − Ay, that

λ||y||2 = Re (f,y) + Re (Ay,y) ≤ Re (f,y) ≤ ||f|| ||y||

from which we conclude that

λ||y|| ≤ ||f||

It follows from this that λI − A is nonsingular, so that

ℛ(λI − A) = Zk

in accordance with Theorem 4.2, and it further follows that

R(λ;A) = (λI − A)−1

is of norm less than or equal to λ−1 in accordance with Theorem 4.1.
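The finite-dimensional chain of reasoning (dissipativity of the symmetric part, the contraction property of exp (At), and the resolvent bound) can be exercised on a concrete 2 × 2 real matrix; the sketch below, with our own helper names, samples random vectors rather than computing operator norms exactly:

```python
import math, random

def mat_vec(A, y):
    return [sum(a * v for a, v in zip(row, y)) for row in A]

def expm_vec(A, t, y, terms=60):
    """exp(At) y via the power series."""
    out, term = y[:], y[:]
    for m in range(1, terms):
        term = [t * v / m for v in mat_vec(A, term)]
        out = [o + v for o, v in zip(out, term)]
    return out

def norm(y):
    return math.sqrt(sum(v * v for v in y))

# A real matrix whose symmetric part B = (A + A^T)/2 = diag(-1, -2) <= 0,
# hence a dissipative generator in the sense of the text.
A = [[-1.0, 3.0], [-3.0, -2.0]]

random.seed(7)
ok_contraction, ok_resolvent = True, True
lam = 0.5
for _ in range(200):
    y = [random.uniform(-1, 1), random.uniform(-1, 1)]
    # ||S(t)y|| <= ||y|| for t > 0:
    if norm(expm_vec(A, 1.3, y)) > norm(y) + 1e-12:
        ok_contraction = False
    # lam*||y|| <= ||f|| with f = lam*y - A*y, i.e. ||R(lam;A)|| <= 1/lam:
    f = [lam * yi - v for yi, v in zip(y, mat_vec(A, y))]
    if lam * norm(y) > norm(f) + 1e-12:
        ok_resolvent = False
print(ok_contraction and ok_resolvent)
```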
For additional material on the theory of semigroups of operators, we refer the reader to the treatise by E. Hille and R. S. Phillips,6 entitled “Functional Analysis and Semi-groups,” and to the excellent set of notes by K. Yosida,11 entitled “Lectures on Semi-group Theory and Its Applications to Cauchy’s Problem in Partial Differential Equations.”
4.5 Hyperbolic Systems of Partial Differential Equations
We shall apply the previous discussion to symmetric hyperbolic systems of partial differential equations. Such a system can be described as follows: Let G be a bounded domain in an m-dimensional euclidean space with points

χ = (χ1, χ2, …, χm)

and again let Zk be a k-dimensional complex euclidean space with elements

y = (η1, η2, …, ηk)    z = (ζ1, ζ2, …, ζk)

and inner product

〈y,z〉 = Σi=1k ηiζ̄i    (4.19)

For functions y(χ,t) defined on G × (0, ∞), we shall consider the Cauchy problem

yt = Σi=1m Aiyi + By    y(χ,0) = f(χ)    (4.20)

Moreover, y(χ,t) is restricted so as to satisfy certain homogeneous lateral conditions that suffice to make the solution to the initial-value problem unique. Throughout the remainder of this chapter, the subscript i will denote differentiation with respect to χi. In Eq. (4.20), Ai and B are matrix-valued functions of χ alone, the Ai are symmetric and continuously differentiable in Ḡ, the closure of G, whereas B is merely continuous in Ḡ. The coefficients are subject to one further condition, namely,

D ≡ B + B* − Σi=1m (Ai)i ≤ Θ    (4.21)

Here B* is the transpose of B relative to the inner product (4.19). As we shall see, the hypothesis (4.21) is a consequence of the fact that we deal only with energy-dissipative systems.
We note that the wave equation can readily be transformed into a symmetric hyperbolic system. We have already shown in Eqs. (4.5) that this is true of Eq. (4.2). More generally, we have, for the damped wave equation in two space variables,

utt = u11 + u22 − rut    (4.22)

and if we set

η1 = u1    η2 = u2    and    η3 = ut

then Eq. (4.22) becomes

η1t = (η3)1    η2t = (η3)2    η3t = (η1)1 + (η2)2 − rη3

which is now of the form (4.20) with A1 having rows (0,0,1), (0,0,0), (1,0,0), A2 having rows (0,0,0), (0,0,1), (0,1,0), and B = diag (0, 0, −r), so that

D = B + B* = diag (0, 0, −2r)

which is ≤ Θ if r ≥ 0, as we shall assume.
The potential energy plus the kinetic energy at time t for the amplitude function of Eq. (4.22) is given by

E(t) = ∫G (|u1|2 + |u2|2 + |ut|2) dχ

Extending this to the system (4.20), we take, as the energy at time t of the amplitude function y, the quantity

E(t) = ∫G 〈y(χ,t), y(χ,t)〉 dχ

This suggests the Hilbert space H with inner product

(y,z) = ∫G 〈y(χ), z(χ)〉 dχ

as the natural setting for our problem. If we assume that the solution S(t)f is energy-nonincreasing, then we have

||S(t)f|| ≤ ||f||

Thus the associated semigroup operator is contractional, as desired.
By differentiating E with respect to t and making use of an integration by parts, we obtain, at least formally,

Et = ∫∂G 〈Σi niAiy, y〉 dσ + (Dy,y)    (4.23)

where ∂G is the boundary of G, n = (n1, n2, …, nm) the outer normal to ∂G, and dσ the surface element on ∂G. The first term on the right in Eq. (4.23), namely,

Q(y,y) = ∫∂G 〈Σi niAiy, y〉 dσ

represents the rate at which energy enters the system through the boundary, the m-tuple with components 〈Aiy,y〉 being a generalization of the “Poynting vector.” The other term, (Dy,y), represents the rate at which energy enters the system from energy sources within G. In particular, if we assume that Et ≤ 0 for all smooth initial-value functions that vanish near ∂G, then for t = 0 the surface integral vanishes and we are left with the condition (Dy,y) ≤ 0 for all such functions. This is easily seen to imply the dissipative condition (4.21), which we assume to be satisfied by the coefficients of our system. Finally, when we set

Ly = Σi Aiyi + By    (4.24)

the computation (4.23) shows that

(Ly,y) + (y,Ly) = Q(y,y) + (Dy,y)    (4.25)

and hence the condition Et ≤ 0 for all solutions of the initial-value problem is equivalent to the condition that the domain of L is so defined as to make this operator dissipative in the sense of Sec. 4.4. This comes as no surprise, since we have already shown in Secs. 4.1 and 4.4 that a necessary and sufficient condition for the problem to be well formulated is that L be maximal dissipative.
Now if the domain of L is such that Q(y,y) ≤ 0 for all functions in this domain, then L is obviously dissipative. It turns out that this is the case if and only if the domain of L is suitably restricted by what may be called boundary conditions. In particular, this will be the case if the functions y in 𝒟(L) are subject to homogeneous constraints requiring that

〈Σi ni(χ)Ai(χ)y(χ), y(χ)〉

be nonpositive at each point χ of ∂G. Boundary conditions of this sort will be called local boundary conditions; K. O. Friedrichs4 has characterized boundary conditions of this kind that define L as a maximal dissipative operator. There are also a wide variety of boundary conditions defining L as maximal dissipative in which the boundary values at more than one point of ∂G are related; for instance, periodic-type boundary conditions can be of this sort.

Since most of the present discussion deals with the question of boundary conditions, it will be helpful to elaborate on the description of local boundary conditions as given by Friedrichs. For fixed χ on ∂G, it is clear that the matrix

Aχ = Σi ni(χ)Ai(χ)

is symmetric, say of rank r with n negative and p positive eigenvalues.
Relative to the local quadratic form
Qχ(y,y) = 〈Aχy,y〉
a subspace N of Zk will be called negative [positive] if
Qχ(y,y) ≤ 0    [Qχ(y,y) ≥ 0]

for all y ∈ N, and maximal negative [maximal positive] if N is negative [positive] but not the proper subspace of some other negative [positive] subspace. It is easy to show that a negative [positive] subspace is maximal negative [maximal positive] if and only if it is of dimension n + (k − r) [of dimension p + (k − r)]. It follows from Eq. (4.25) that L will be dissipative if the functions in its domain lie in negative subspaces at each point of ∂G, and it is clear that L cannot be maximal dissipative unless these subspaces are maximal negative. Friedrichs has shown that this condition is also sufficient. More precisely, Friedrichs assumes that Aχ is of constant rank throughout ∂G and chooses a continuous family of maximal negative subspaces on ∂G; the set of all smooth functions having boundary values in the chosen subspaces determines a dissipative operator L for which the smallest closed extension is maximal dissipative.
On the other hand, it is clear from Eqs. (4.23) and (4.25) that the condition Et ≤ 0, which is equivalent to L being dissipative, does not require the surface integral Q(y,y) to be nonpositive for all y in the domain of L. If D ≠ Θ, then we can have Et ≤ 0 even if energy enters through the boundary, provided that it is compensated for by the energy dissipated in the interior. In order to ensure such a state of affairs, the domain of the operator L must be restricted by global lateral conditions that relate the boundary values of a function in 𝒟(L) to its values at the sinks in G. This is a new phenomenon that can occur only when D ≠ Θ or, what amounts to the same thing (as we shall see), only in the non-self-adjoint problem. Such lateral conditions are reminiscent of lateral conditions found by W. Feller2 in describing certain return processes in diffusion theory. There is also a dual process, no longer governed by a pure partial differential operator, in which energy leaks out through the boundary and is in part transported back into the interior.
Returning now to our main problem, namely, that of finding all the dissipative solutions of the initial-value problem (4.20), we see by Theorem 4.3 that this is equivalent to the problem of characterizing all the maximal dissipative operators that can be associated with the differential operator L of Eq. (4.24). There is still a certain amount of arbitrariness as to the operators that can be associated with L. We shall take this to mean either extensions of a minimal operator L0 defined by Eq. (4.24), with domain containing in essence only the smooth functions that vanish near ∂G, or restrictions of a maximal operator L1 also defined by Eq. (4.24), but with domain containing in essence all smooth functions on Ḡ.

More precisely, L0 and L1 are defined as the least closed extensions of L00 and L10, respectively, which are in turn defined as follows:

L00y = Σi Aiyi + By    for all continuously differentiable y vanishing near ∂G

and

L10y = Σi Aiyi + By    for all continuously differentiable y on Ḡ

The maximal operator L1 can also be defined in terms of the formal adjoint to L, namely,

My = −Σi (Aiy)i + B*y

We note that the differential operator M is readily shown to be dissipative in the sense that its coefficients satisfy the analogue of Eq. (4.21); in fact, D is the same for both L and M. We proceed as above to define M0 and M1 by means of M00 and M10, respectively, where

M00y = My    for all continuously differentiable y vanishing near ∂G
M10y = My    for all continuously differentiable y on Ḡ
It can be shown3 that L1 is the adjoint of M0; in other words, 𝒟(L1) consists of all y in H for which there exists a y′ in H such that

(M0z, y) = (z, y′)    for all z in 𝒟(M0)

in which case we have L1y = y′.
4.6 Maximal Dissipative Operators
Our problem, then, is to characterize both the maximal dissipative extensions of L0 and the restrictions of L1. The extensions of L0 need not be of the form (4.24) for y not in 𝒟(L1), whereas the restrictions of L1 will in general have domains determined by global lateral conditions. The maximal dissipative operators that are both extensions of L0 and restrictions of L1 have domains determined by boundary conditions, and the elements of these domains satisfy the condition Q(y,y) ≤ 0; it is with these operators that we shall be most concerned.
Before proceeding further with our characterization of the maximal dissipative operators that lie between L0 and L1, we shall illustrate the foregoing discussion with a simple example, in which G is the interval 0 < x < 1 and we take m = 1, k = 1, H = L2(0,1), and
In this case we find that A = 1, B = −1, and D = −2 ≤ 0. It can readily be verified that
Moreover, we have
The only local boundary condition defining the domain of a maximal dissipative operator is y(1) = 0. For the general boundary condition, we consider the form
for which r = 2 = dimension of the boundary space, and n = 1 = p. Each maximal negative subspace for this form is characterized as y(1) = αy(0) for some α of absolute value less than or equal to 1. Each such subspace defines the domain of a maximal dissipative operator Lα, L0 ⊂ Lα ⊂ L1, and conversely, each maximal dissipative operator between L0 and L1 is characterized in this way.
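This dissipativity can be checked numerically. The sketch below assumes the scalar operator of the example is Ly = y′ − y on (0,1) — a reconstruction consistent with the stated values A = 1, B = −1, D = −2, not a formula from the text — and verifies that for trial functions with y(1) = αy(0), |α| ≤ 1, the form (Ly,y) + (y,Ly) is never positive.

```python
import numpy as np

# A numerical sketch.  We assume the scalar operator of the example is
# L y = y' - y on (0,1), a reconstruction consistent with the stated
# values A = 1, B = -1, D = -2.
def inner(f, g, x):
    """L2(0,1) inner product by the trapezoid rule (real-valued data)."""
    h = f * g
    return float(np.sum((h[1:] + h[:-1]) / 2 * np.diff(x)))

x = np.linspace(0.0, 1.0, 4001)
for alpha in (-1.0, -0.5, 0.0, 0.7, 1.0):
    y = 1.0 + (alpha - 1.0) * x        # trial function: y(0) = 1, y(1) = alpha
    Ly = (alpha - 1.0) - y             # L y = y' - y, with y' = alpha - 1
    form = 2.0 * inner(Ly, y, x)       # (Ly,y) + (y,Ly) for real y
    Q = alpha**2 - 1.0                 # boundary form |y(1)|^2 - |y(0)|^2
    assert Q <= 0.0                    # the subspace b = alpha*a is negative
    assert form <= 0.0                 # the restricted operator is dissipative
    # energy identity: (Ly,y) + (y,Ly) = Q(y,y) + (Dy,y), with D = -2
    assert abs(form - (Q - 2.0 * inner(y, y, x))) < 1e-6
```

The final assertion checks the integration-by-parts identity (Ly,y) + (y,Ly) = Q(y,y) − 2(y,y) term by term against the quadrature.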
On the other hand, it is easy to construct extensions of L0 that are maximal dissipative but not of the above type; for instance, consider
Here a = a(χ) belongs to L2(0,1) and we assume that (a,a) ≤ 2. In order to see that U is dissipative, we observe that
where the inequality follows from the Schwarz inequality:
|(y,a)| ≤ ||y|| ||a||
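Spelled out under a reconstruction of the formulas involved (we take Uy = y′ − y + a(χ)y(0) with the boundary condition y(1) = 0, consistent with A = 1, B = −1, D = −2), the dissipativity estimate runs:

```latex
\begin{aligned}
(Uy,y)+(y,Uy)
  &= |y(1)|^{2}-|y(0)|^{2}-2\|y\|^{2}
     +2\,\operatorname{Re}\bigl[\overline{y(0)}\,(y,a)\bigr] \\
  &= -|y(0)|^{2}-2\|y\|^{2}
     +2\,\operatorname{Re}\bigl[\overline{y(0)}\,(y,a)\bigr]
     \quad\text{since } y(1)=0 \\
  &\le -|y(0)|^{2}-2\|y\|^{2}+2\,|y(0)|\,\|y\|\,\|a\| \\
  &\le -\bigl(|y(0)|-\sqrt{2}\,\|y\|\bigr)^{2}\;\le\;0,
\end{aligned}
```

the last step using the hypothesis (a,a) ≤ 2.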
It is clear that is dense in L2(0,1); hence, according to Theorem 4.2, we shall prove that U is maximal dissipative if we show that
This requires that we solve the equation
y − Uy = f
for arbitrary f ∈ H. It is readily verified that this is satisfied by the function
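Solvability can also be verified numerically. The sketch below is a reconstruction, not the text's formula: we take Uy = y′ − y + a(χ)y(0) with boundary condition y(1) = 0, so that y − Uy = f becomes a linear ODE coupled to the unknown value c = y(0), and the sample coefficient a and right-hand side f are hypothetical choices.

```python
import math
import numpy as np

# A numerical sketch of the maximality argument.  Reconstruction (the
# explicit formulas are assumptions): U y = y' - y + a(x) y(0) with the
# boundary condition y(1) = 0, so that y - Uy = f reads
#     y' = 2y - y(0) a(x) - f(x),   y(1) = 0.
# Writing y = y_p + c*y_h with c = y(0) splits this into two linear ODEs,
# each integrated backward from x = 1.

def integrate_back(rhs, n=4000):
    """RK4 for y' = rhs(x, y) from x = 1, y(1) = 0, down to x = 0."""
    h = -1.0 / n
    x, y = 1.0, 0.0
    ys = [y]
    for _ in range(n):
        k1 = rhs(x, y)
        k2 = rhs(x + h / 2, y + h * k1 / 2)
        k3 = rhs(x + h / 2, y + h * k2 / 2)
        k4 = rhs(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
        ys.append(y)
    return np.array(ys[::-1])          # values on the ascending grid

a = lambda x: 1.0                      # sample coefficient, (a,a) = 1 <= 2
f = lambda x: x                        # sample right-hand side

y_h = integrate_back(lambda x, y: 2 * y - a(x))   # y_h' = 2 y_h - a
y_p = integrate_back(lambda x, y: 2 * y - f(x))   # y_p' = 2 y_p - f
c = y_p[0] / (1.0 - y_h[0])            # self-consistency y(0) = c; the
                                       # condition (a,a) <= 2 keeps the
                                       # denominator away from zero
y = y_p + c * y_h

assert abs(y[-1]) < 1e-12              # boundary condition y(1) = 0
assert abs(y[0] - c) < 1e-10           # lateral consistency y(0) = c
# closed-form value of c for these particular a and f
c_exact = ((1 - 3 * math.exp(-2)) / 4) / (1 - (1 - math.exp(-2)) / 2)
assert abs(c - c_exact) < 1e-8
```

The decisive point mirrored here is that the self-consistency equation for c is always solvable when (a,a) ≤ 2, so y − Uy = f has a solution for every f.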
We remark that the term ay(0) in the expression for U corresponds physically to transporting energy into the interior in amounts having linear density a(χ)|y(0)|2.
The dual situation of a maximal dissipative operator V that is a restriction of L1 but not an extension of L0 can be illustrated by the operator
where again it is assumed that (a,a) ≤ 2. In this case,
so that V is dissipative. One verifies as before that V is maximal dissipative by showing that
This amounts to solving the equation
y − Vy = g
and one can easily find that this equation is satisfied by
where
The condition (a,a) ≤ 2 is sufficient to make the coefficient of k different from zero. It can be shown that for some y in the boundary integral Q(y,y) is positive, so that energy can enter the system through the boundary; however, the lateral condition y(1) = (y,a), which determines
, requires that the rate of energy being dissipated in the interior, namely, −2(y,y), is always at least as large as the flow in through the boundary, namely, |y(1)|2.
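This energy balance can be written out explicitly, again taking Vy = y′ − y (a reconstruction) on the domain singled out by the lateral condition y(1) = (y,a):

```latex
\begin{aligned}
(Vy,y)+(y,Vy) &= Q(y,y)+(Dy,y)
               = |y(1)|^{2}-|y(0)|^{2}-2\|y\|^{2},\\
|y(1)|^{2} &= |(y,a)|^{2} \le \|a\|^{2}\,\|y\|^{2} \le 2\,\|y\|^{2},
\end{aligned}
```

so that (Vy,y) + (y,Vy) ≤ −|y(0)|² ≤ 0, and V is dissipative even though the boundary term Q(y,y) by itself may be positive.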
We return now to the general problem of characterizing the boundary conditions for the maximal dissipative operators L between L0 and L1. An integration by parts as in Eq. (4.25) shows that
at least for y and z in ; hence, by considering sequences in the graph topology we see that Eq. (4.28) gives us a means of defining the boundary integral Q(y,z) for all y and z in
. For functions y in
, which by definition vanish near
, and functions z in
, it is clear that Q(y,z) = 0; upon again taking sequences in the graph topology, we see that this remains true for all y in
and z in
. Thus, as far as the boundary integral is concerned, the functions in
have absolutely no influence and can be said to exhibit zero-like boundary behavior. An argument similar to that used for Eq. (4.27) shows that
is the largest class of functions in
with this property.
Suppose now we consider the residue classes, or cosets,
Since each coset consists of functions that differ from each other by functions in , it follows that all the functions in such a coset exhibit the same boundary behavior. On the other hand, functions in different cosets do not differ by an element of
and hence exhibit different boundary behavior. Thus, there is a one-to-one correspondence between the cosets in
and the types of boundary behavior displayed by the functions in
. Moreover, suppose that y1 and y2 belong to one coset and z1 and z2 belong to another. Then y1 − y2 and z1 − z2 both lie in
and, according to our previous remarks,
Q(y1,z1) = Q([y2 + (y1 − y2)], [z2 + (z1 − z2)]) = Q(y2,z2)
so that Q(y,z) depends only on the cosets to which y and z belong. Thus, Q induces a bilinear form on the residue classes; we denote this form by .
We now have a space of boundary data
together with a bilinear form
that is in essence the boundary integral associated with the differential operator (4.24). A subset of
can be thought of as a boundary condition on a restriction of L1 and, in particular, a linear subspace of
can be considered as a homogeneous boundary condition on such an operator; the corresponding domain of this operator would consist of the totality of functions in the cosets of such a subset or subspace.
The maximal dissipative operators between L0 and L1 can be described as follows. Relative to the quadratic form defined on the cosets of
, we can again define the notion of a negative [positive] subspace of
as well as that of a maximal negative [maximal positive] subspace of
.
These maximal negative subspaces determine the boundary conditions for the maximal dissipative operators in question. More precisely, we have the following result.
THEOREM 4.4. There is a one-to-one correspondence between the maximal negative subspaces of the space of boundary data and the maximal dissipative operators L that lie between L0 and L1; this correspondence is given by
In the case of one spatial variable, is finite-dimensional and the problem of determining the maximal negative subspaces of
relative to
is purely algebraic. For instance, in the case of our previous example the cosets of
are determined by the values of the functions in
at 0 and 1, so that
is two-dimensional. Thus
, and y lies in the coset
if and only if
y(0) = a and y(1) = b
Moreover,
and each maximal negative subspace of is characterized by a number α, |α| ≤ 1, as b = αa.
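The algebra here is explicit. Identifying the coset of y with its boundary data (a,b) = (y(0), y(1)), and again taking Ly = y′ − y as a reconstruction so that Q(y,y) = |y(1)|² − |y(0)|², the induced form is

```latex
\hat{Q}\bigl((a,b),(a,b)\bigr) = |b|^{2} - |a|^{2},
```

and the subspace b = αa is negative exactly when |α| ≤ 1. It is maximal negative because any strictly larger subspace would be the whole two-dimensional boundary space and would contain (0,1), on which the form is positive.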
In the many-spatial-variable case, is no longer finite-dimensional and the problem is quite complex, as we shall see. Nevertheless, at each point χ of
we do have the bilinear form Qχ that acts on the k-dimensional vector space Zk. In effect, Friedrichs4 has characterized the maximal negative subspaces of
that correspond to local boundary conditions as the direct sum of the maximal negative subspaces relative to Qχ at each point χ of
.
By way of illustration, we consider the initial-value problem for the circular membrane
which we rewrite as before with η1 = u1, η2 = u2, η3 = ut,
If we parameterize the boundary point χ1 = cos σ, χ2 = sin σ by σ, 0 ≤ σ < 2π, then the local quadratic form can be written as
where ℜ[ ] denotes the real part of the bracketed expression. Each maximal negative subspace relative to Qσ is characterized by a number α(σ) with
and described by the relationship
η1 cos σ + η2 sin σ = −α(σ)η3
The maximal negative subspaces of Friedrichs are obtained by choosing α(σ) as a continuous function of σ with
α(2π−) = α(0)
It is instructive actually to characterize for this example and then exhibit certain maximal negative subspaces of
. To this end, we note that the mapping
y → [y,Liy]
carries in a one-to-one linear fashion onto the graph
for i = 0 and 1. Consequently, Eq. (4.29) can be written equivalently as
The advantage of Eq. (4.30) over Eq. (4.29) lies in the fact that the graphs of L1 and L0 are closed subspaces of H × H, and it is therefore possible to take the orthogonal complement of with respect to
, in symbols
. Now each element in
has a unique decomposition as an element in
plus an element in
, and from this it follows that the cosets of
are in one-to-one correspondence with the elements of
. The elements of
are just the pairs [z1,z2] in
for which
In this example we have D = Θ, so that M0 = −L0; see Eq. (4.26). It follows from Eq. (4.27) that z2 lies in and z1 = L1z2. On the other hand, [z1,z2] being in
means that z2 = L1z1, and hence it follows that
Accordingly, there is a one-to-one correspondence between the solutions of Eq. (4.31) and the elements of .
For the circular membrane, Eq. (4.31) is of the form
The first two of Eqs. (4.32) show that ζ12 = ζ21, and hence there is a function ϕ such that
ζ1 = ϕ1 and ζ2 = ϕ2
By combining this with Eqs. (4.32), we see that
Thus each element of is characterized by two solutions of Eq. (4.33), say ϕ and ζ3, such that [ϕ1,ϕ2,ζ3] lies in
. Now [ϕ1,ϕ2,ζ3] lies in
if and only if ϕ1,ϕ2, ϕ11 + ϕ22 = ϕ, ζ3, ζ31, and ζ32 are in
. If ϕ satisfies Eq. (4.33), then by Green’s theorem we have
where
r2 = (χ1)2 + (χ2)2
Hence the integrability conditions on ϕ and ζ3 are
Further, in the present case,
Thus only the boundary values of ∂ϕ/∂r and ζ3 enter into the quadratic form Q. Let us represent these boundary values by the Fourier series
Then, since both ϕ and ζ3 satisfy Eq. (4.33), the functions ∂ϕ/∂r and ζ3 can be represented by means of Bessel-function expansions in r < 1; these expansions are, respectively,
The integrability conditions become simply
where , which is of order |k|. Thus
can be represented as the direct product of two sequence spaces
and
with elements
and inner product
where
and
Here, of course, corresponds to the boundary values of
and corresponds to the boundary values of ζ3. The quadratic form
is given by
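The stated growth of the ρk can be checked numerically. The sketch below rests on an assumption: that Eq. (4.33) is the interior equation Δϕ = ϕ, so that the expansions in r < 1 involve the modified Bessel functions Ik(r) and ρk = Ik′(1)/Ik(1).

```python
from math import factorial

# A sketch under an assumption: if Eq. (4.33) is Laplacian(phi) = phi,
# the radial factors on r < 1 are the modified Bessel functions I_k(r),
# and rho_k = I_k'(1) / I_k(1).
def bessel_i(k, x, terms=40):
    """Modified Bessel function I_k(x) by its power series."""
    return sum((x / 2.0) ** (2 * m + k) / (factorial(m) * factorial(m + k))
               for m in range(terms))

def rho(k):
    # derivative recurrence: I_k'(x) = (I_{k-1}(x) + I_{k+1}(x)) / 2
    return 0.5 * (bessel_i(k - 1, 1.0) + bessel_i(k + 1, 1.0)) / bessel_i(k, 1.0)

# rho_k is of order |k|, as stated in the text
for k in (5, 10, 20):
    assert abs(rho(k) / k - 1.0) < 0.15
```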
Let α(σ) be a bounded measurable function on [0,2π) with nonnegative real part. We now show that the local boundary condition
defines the domain of a maximal dissipative operator L between L0 and L1. We note that a sequence {bk} in corresponds to a function
in L2(0,2π). By imposing the usual norm in L2(0,2π), namely,
we see that the mapping
{bk} → f
is continuous on to L2(0,2π), in fact of norm ρ0−½, and that the mapping
[f(σ) in L2(0,2π)] → [α(σ)f(σ) in L2(0,2π)]
is also continuous and of norm
||α|| = essential sup |α(σ)|
Finally, the mapping
is also continuous and of norm ρ0−½, so that the composite mapping
is of norm ρ0−1||α||. Consequently, the relationship (4.34) makes sense and defines a subspace in
. It is clear that
is closed, and since
it follows that is a negative subspace of
. If
were not maximal negative, then
would be properly contained in another negative sub-space
, and there would be a [ {ck}, {dk}] in
and hence an element
[{ek} = {ck} − α · {dk}, {0}]
in . Choose bk = ek/ρk so that {bk} lies in
and
In this case,
lies in and therefore
But
so that the above inequality cannot hold for all c. The supposition that is not maximal negative has thus led to a contradiction. It now follows from Theorem 4.4 that the boundary condition (4.34) defines a maximal dissipative operator L between L0 and L1. The condition (4.34) is more general than Friedrichs’ local boundary conditions in that continuity of α(σ) has been replaced by bounded measurability.
It is also easy to exhibit maximal negative subspaces of that do not correspond to local boundary conditions. For example, if ∂ϕ/∂r and ζ3 are represented as above by the Fourier coefficients {ak} and {bk}, respectively, then for each sequence {ωk} satisfying the conditions
for some M > 0, the subspace of
defined by the relationships
is closed and negative since the mapping
is obviously continuous, in fact of norm ≤M, and since for
we have
Again, suppose that were properly contained in a negative subspace
; then, as above, there would be an element
in . Choose bk = ρk−1 ek; then the element
lies in ,
and again it is impossible to have
for all c. This shows that is actually maximal negative and therefore defines the domain of a maximal dissipative operator between L0 and L1 as in Theorem 4.4.
4.7 Parabolic Partial Differential Equations
The Cauchy problem for the diffusion equation is associated with an entirely new set of concepts. Energy no longer plays a role; instead, we have to do with density and positivity, both of which are somewhat foreign to Hilbert-space theory. Consequently, one cannot expect the foregoing material to be especially suitable for the parabolic case. Indeed, the best results in this case have been obtained by W. Feller,2 E. Hille,5 and K. Yosida10 with both the space of continuous functions and the space of Lebesgue-integrable functions as settings. But the previous development can be adapted8 to take care of the diffusion equation, and this results in new information about the boundary-value problem when the domain is in a euclidean space of more than one dimension.
For notational convenience, we shall consider the initial-value problem with domain G in E2:
where a and b are positive continuously differentiable functions of χ in , while c is merely nonpositive and continuous in
. When we set
η0 = u, η1 = u1, η2 = u2
we can write Eq. (4.36) as
It is clear that this is not of the form (4.20). The right-hand member of the system (4.37) is again of the form (4.24), however, with
Furthermore,
and hence the differential operator L can be treated as before. We proceed, therefore, to define the minimal operator L0 and the maximal operator L1 as in Sec. 4.5; Theorem 4.4 then furnishes us with a characterization of all the maximal dissipative operators L that lie between L0 and L1.
When we have obtained such a maximal dissipative operator L, our next step is somehow to restrict L to the subspace of first components, namely,
H1 = [η0,0,0]
in order to recover an operator of the form
Kη0 = (aη01)1 + (bη02)2 + cη0
To this end, we first define the restriction L′ ⊂ L by
For y in , we see that
η1 = η01, η2 = η02
L′y = [Kη0,0,0]
There is still the objection that L′ acts on H and not simply on H1. With this in mind, we set P1 equal to the projection
P1[η0,η1,η2] = [η0,0,0]
and define the retraction of L to H1 as
It is easily shown that L″ is uniquely determined by P1y alone and hence is well defined. In fact, suppose that y and z lie in and that
P1y = P1z
Then
w = y − z = [ω0,ω1,ω2]
lies in , and
P1w = [ω0,0,0] = 0
Thus, w is orthogonal to L′w and
0 = (L′w,w) + (w,L′w) = Q(w,w) + (Dw,w)
by Eq. (4.25). Since w also belongs to , and as such has nonpositive boundary conditions [that is, Q(w,w) ≤ 0], we see that
from which we can conclude that
ω1 = 0 = ω2
and hence that
w = 0 and L′w = 0
It follows that L″ is a well-defined operator on H1 to itself and of the form K if we write the argument
P1y = [η0,0,0]
as simply η0. In this case, for
η0 = P1y
and y in , we have
(L″η0,η0) + (η0,L″η0) = (L′y,y) + (y,L′y) ≤ 0
so that L″ is dissipative. Actually, it can be shown that L″ is maximal dissipative with a dense domain in H1, so that L″ generates a semigroup of contraction operators [S1(t)] on H1. Thus for , the function
u(·,t) = S1(t)f
solves the initial-value problem (4.36). Moreover, there is a one-to-one correspondence between the maximal dissipative operators L that lie between L0 and L1 and the retractions defined in this way.
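The contraction property of the semigroup S1(t) can be illustrated numerically. The following one-dimensional sketch is an analogue, not the two-dimensional problem of the text: it takes Kη0 = η0″ with Dirichlet conditions at both endpoints and advances the solution by the backward-Euler step (I − hK)⁻¹, the resolvent whose existence the generation theorem requires; the L² norm of the solution never increases.

```python
import numpy as np

# One-dimensional analogue: K eta0 = eta0'' with Dirichlet conditions at
# 0 and 1 is dissipative, and the backward-Euler step (I - hK)^{-1} --
# the resolvent of K -- approximates the contraction semigroup S1(t).
n, h = 200, 1e-3
dx = 1.0 / (n + 1)
x = np.linspace(dx, 1.0 - dx, n)                     # interior grid points
K = (np.diag(-2.0 * np.ones(n))
     + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / dx**2          # discrete d^2/dx^2
step = np.linalg.inv(np.eye(n) - h * K)              # backward-Euler step

u = np.sin(3 * np.pi * x) + 0.5 * np.sin(7 * np.pi * x)   # initial data f
norms = [np.sqrt(dx) * np.linalg.norm(u)]
for _ in range(50):
    u = step @ u                                     # one semigroup step
    norms.append(np.sqrt(dx) * np.linalg.norm(u))

# contraction: the L2 norm is nonincreasing in t
assert all(norms[i + 1] <= norms[i] + 1e-12 for i in range(50))
```

Since the discrete K is symmetric negative definite, every eigenvalue of (I − hK)⁻¹ lies in (0,1), which is the discrete expression of the contraction property ‖S1(t)f‖ ≤ ‖f‖.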
We note that the above procedure does not furnish us with all of the maximal dissipative operators that lie between the retractions of L0 and L1. We do, however, obtain all the usual operators associated with the system (4.36) and then some.
In the special case
a = 1, b = 1, G = [(χ1,χ2); (χ1)2 + (χ2)2 < 1]
the resulting operators L are those considered in the example at the end of Sec. 4.6, with η3 now replaced by η0. For the restricted operator L″, we have
η1 = η01, η2 = η02
so that the function ϕ of the example can now be replaced by η0; the boundary data are therefore determined in the present case by η0 and ∂η0/∂r. The relationship (4.34) becomes
where again α(σ) is a bounded measurable function with nonnegative real part, and this relationship now defines the domain of a maximal dissipative operator L″. Likewise, the equations (4.35) furnish a nonlocal relationship between η0 and ∂η0/∂r and also define the domain of a maximal dissipative operator of type L″.
REFERENCES
1. Beckenbach, E. F. (ed.), “Modern Mathematics for the Engineer,” First Series, McGraw-Hill Book Company, Inc., New York, 1956.
2. Feller, W., The Parabolic Differential Equations and the Associated Semi-groups of Transformations, Ann. of Math., ser. 2, vol. 55, pp. 468–519, 1952.
3. Friedrichs, K. O., Symmetric Hyperbolic Linear Differential Equations, Comm. Pure Appl. Math., vol. 7, pp. 345–392, 1954.
4. ——, Symmetric Positive Linear Differential Equations, Comm. Pure Appl. Math., vol. 11, pp. 333–418, 1958.
5. Hille, Einar, The Abstract Cauchy Problem and Cauchy’s Problem for Parabolic Differential Equations, J. Analyse Math., vol. 3, pp. 81–196, 1954.
6. —— and R. S. Phillips, “Functional Analysis and Semi-groups,” Amer. Math. Soc. Colloquium Publ., vol. 31, American Mathematical Society, New York, 1957.
7. Phillips, R. S., Dissipative Operators and Hyperbolic Systems of Partial Differential Equations, Trans. Amer. Math. Soc., vol. 90, pp. 193–254, 1959.
8. ——, Dissipative Operators and Parabolic Partial Differential Equations, Comm. Pure Appl. Math., vol. 12, pp. 249–276, 1959.
9. Riesz, F., and B. Sz.-Nagy, “Functional Analysis,” Frederick Ungar Publishing Co., New York, 1955.
10. Yosida, K., Semi-group Theory and the Integration Problem of Diffusion Equations, Proc. Internat. Cong. Math., vol. 2, pp. 1–16, 1954.
11. ——, “Lectures on Semi-group Theory and Its Applications to Cauchy’s Problem in Partial Differential Equations,” Tata Institute of Fundamental Research, Bombay, 1957.