4
Semigroup Methods in the Theory of Partial Differential Equations
RALPH S. PHILLIPS
PROFESSOR OF MATHEMATICS
STANFORD UNIVERSITY
4.1 Introduction
The theory of semigroups of operators is intimately connected with the initial-value problem for systems of partial differential equations. Mathematical physics abounds in problems of this kind; for instance, the initial-value problem for the heat equation,

ut = uχχ    u(χ,0) = f(χ)    −∞ < χ < ∞, t > 0    (4.1)

and the initial-value problem for the wave equation,

utt = uχχ    u(χ,0) = f(χ)    ut(χ,0) = g(χ)    −∞ < χ < ∞, t > 0    (4.2)

are both of this type. J. Hadamard has called such a problem well formulated if there is a unique solution for the given initial data and if the solution varies continuously with the initial data; see pages 93 to 94 and 111 to 112 of Ref. 1. These two requirements are certainly reasonable on physical grounds, the existence and uniqueness being merely an affirmation of the principle of scientific determinism, whereas the continuous dependence is an expression of the stability of the solution of the problem. In order that the initial-value problem for a given system of partial differential equations may be well formulated, it is frequently necessary to limit the set of initial data, for instance by imposing certain boundary conditions. Thus, for the heat equation over a finite interval, a well-formulated problem would be

ut = uχχ    u(0,t) = 0 = u(1,t)    u(χ,0) = f(χ)    0 < χ < 1, t > 0
The purpose of this chapter is to give a precise, but somewhat abstract, formulation to the initial-value problem, also called the Cauchy problem.
If we assume that the system of differential equations is independent of time—or, equivalently, that the corresponding physical mechanism is time-invariant—then the Cauchy problem leads directly to a semigroup of operators. In fact, suppose a class of suitable initial data is denoted by 𝒟, and let the solution to the problem at time t > 0 with initial data y in 𝒟 be denoted by S(t)y. We must also assume that 𝒟 is invariant in the sense that S(t)y lies in 𝒟 for each y in 𝒟; this guarantees that the solution is continuable. We can then determine the solution at time t1 + t2, taking both t1 and t2 > 0, either directly as S(t1 + t2)y or indirectly by using S(t1)y as initial data and determining the solution at time t2 later, as S(t2)[S(t1)y]. The uniqueness of the solution implies that

S(t1 + t2)y = S(t2)[S(t1)y]

or briefly,

S(t1 + t2) = S(t2)S(t1)    (4.3)

This is the semigroup property for the operators [S(t); t > 0].
In some physical problems, the initial data determine the entire past as well as the future of the mechanism. For such problems, the restriction t1 > 0, t2 > 0 in Eq. (4.3) need not be imposed and the resulting family of operators [S(t)] defined for all t, − ∞ < t < ∞, forms a group of operators. In particular, S(t) and S(−t) will be inverses of each other, so that
S(t)S(−t) = I
the identity operator.
The semigroup property of solutions to the time-invariant Cauchy problem is reflected in certain addition theorems. Thus, as is well known, in the case of the previously mentioned heat equation (4.1) we have

[S(t)f](χ) = ∫−∞∞ k(χ − ξ; t)f(ξ) dξ    (4.4)

and the kernel

k(χ; t) = (4πt)−1/2 exp (−χ2/4t)

exhibits the semigroup property of the operators [S(t); t > 0] in the form

k(χ − ξ; t1 + t2) = ∫−∞∞ k(χ − ζ; t2)k(ζ − ξ; t1) dζ
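The kernel-composition identity can be checked numerically by quadrature; the following sketch is ours (the function names k and convolve are not from the text), and it uses the classical Gaussian form of the heat kernel:

```python
import math

def k(x, t):
    """Heat kernel k(x; t) = (4*pi*t)^(-1/2) * exp(-x^2 / (4t))."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def convolve(t1, t2, x, lo=-20.0, hi=20.0, n=4000):
    """Evaluate  integral of k(x - z; t2) k(z; t1) dz  by the trapezoidal
    rule over a window wide enough to capture both Gaussians."""
    h = (hi - lo) / n
    s = 0.5 * (k(x - lo, t2) * k(lo, t1) + k(x - hi, t2) * k(hi, t1))
    for i in range(1, n):
        z = lo + i * h
        s += k(x - z, t2) * k(z, t1)
    return s * h

# The composed kernel should agree with k(x; t1 + t2).
t1, t2, x = 0.7, 1.3, 0.5
lhs = convolve(t1, t2, x)
rhs = k(x, t1 + t2)
print(abs(lhs - rhs) < 1e-8)
```

The agreement reflects the fact that convolving two centered Gaussians adds their variances, which is exactly the semigroup law for the heat flow.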
A similar result holds for the wave-equation initial-value problem (4.2). If we write η1 = uχ, η2 = ut, then Eq. (4.2) takes the form

∂η1/∂t = ∂η2/∂χ    ∂η2/∂t = ∂η1/∂χ    η1(χ,0) = f′(χ)    η2(χ,0) = g(χ)    (4.5)

and the solution can be written as

[S(t)F](χ) = ∫−∞∞ K(χ − ξ; t)F(ξ) dξ

Here F(χ) is the column vector F(χ) = [f′(χ), g(χ)] and K(χ; t) is the matrix kernel

K(χ; t) = ½ [ δ(χ + t) + δ(χ − t)    δ(χ + t) − δ(χ − t) ]
            [ δ(χ + t) − δ(χ − t)    δ(χ + t) + δ(χ − t) ]

where δ(χ) denotes the Dirac delta function (see Chap. 1). Again the fact that the solution satisfies the semigroup property results in the kernel satisfying the relationship

K(χ − ξ; t1 + t2) = ∫−∞∞ K(χ − ζ; t2)K(ζ − ξ; t1) dζ
Stability requires that S(t)yn converge to S(t)y whenever yn converges to y, whence we may conclude that the operator S(t) is continuous on 𝒟. Moreover, the statement that y in 𝒟 is the initial value for the solution of a Cauchy problem means that the solution at time t, namely S(t)y, approaches y as t goes to 0+, in other words that

lim t→0+ S(t)y = y

We shall in addition assume that the problem is linear; i.e., for y and w in 𝒟, and a and b complex numbers, we assume that ay + bw lies in 𝒟 and that the solution satisfies the equation

S(t)(ay + bw) = aS(t)y + bS(t)w

This will be the case when the system of partial differential equations is linear and the boundary conditions determining 𝒟 are homogeneous.

Uniqueness, stability, and linearity constitute our principal assumptions, and it is evident that a large class of problems in partial differential equations meet these conditions and consequently have solutions that can be described by semigroups of linear operators continuous on 𝒟 for each t > 0 and such that

lim t→0+ S(t)y = y

for each y in 𝒟.
4.2 Semigroups of Operators on Finite-dimensional Spaces
We now proceed to a discussion of one-parameter families of operators of the sort described in Sec. 4.1, and we begin by considering the simplest case, namely, that for which 𝒟 is one-dimensional. In effect, then, 𝒟 will be the complex-number field, and for each t > 0, S(t) will be merely a multiplicative factor, also a complex number. Since S(t + τ) = S(t)S(τ) and S(τ) → 1 as τ → 0+, we see that S(t) is continuous on the right and hence integrable. Clearly we have

lim t→0+ t−1 ∫0t S(s) ds = 1

so that in particular, for t sufficiently small, S(t) must satisfy the condition

∫0t S(s) ds ≠ 0

For 0 < ε < t, we have

ε−1[S(ε) − 1] ∫0t S(s) ds = ε−1 ∫t t+ε S(s) ds − ε−1 ∫0ε S(s) ds

The right-hand member has the limit S(t) − 1 as ε → 0+, and therefore

A = lim ε→0+ ε−1[S(ε) − 1]

exists. Accordingly, we obtain

S′(t) = AS(t)    S(0) = 1    (4.7)

or, in integrated form, S(t) = 1 + A ∫0t S(s) ds, and successive substitutions into the right-hand member yield

S(t) = exp (At) = Σn=0∞ (At)n/n!    (4.8)

Although the relationship (4.8) has been established only for t sufficiently small, the semigroup property shows directly that it holds for all t > 0. Thus in the one-dimensional case we can show that all semigroups of operators are exponentials.

The argument given above also applies when 𝒟 is finite-dimensional, with the difference that now S(t) and A are to be interpreted as matrices. The relationships (4.7) and (4.8) continue to hold, however, and we obtain in this way a general form for the solution to problems that arise in the theory of small vibrations and in electric-circuit theory. Thus, if y is a given initial vector and we set y(t) = S(t)y, then it follows that

y′(t) = Ay(t)    y(0) = y    (4.9)

which is the general form of the initial-value problem in the finite-dimensional case. As an example of the solution to Eq. (4.9) for
we have
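The exponential representation of a finite-dimensional semigroup lends itself to a direct numerical check. In the sketch below (the helper names are ours, and the power series is truncated at 40 terms) the semigroup property is verified for a 2 × 2 generator:

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def expm(A, t, terms=40):
    """S(t) = exp(At) via the power series sum of (At)^n / n!."""
    n = len(A)
    At = [[t * a for a in row] for row in A]
    S = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in S]
    for m in range(1, terms):
        term = mat_mul(term, At)                 # now (At)^m * (m-1)!/m! pending
        term = [[a / m for a in row] for row in term]   # (At)^m / m!
        S = mat_add(S, term)
    return S

A = [[0.0, 1.0], [-1.0, 0.0]]      # generator of a rotation group
S1 = expm(A, 0.4)
S2 = expm(A, 0.9)
S12 = expm(A, 1.3)
prod = mat_mul(S2, S1)
err = max(abs(prod[i][j] - S12[i][j]) for i in range(2) for j in range(2))
print(err < 1e-12)                 # S(t1 + t2) = S(t2)S(t1)
```

Since this particular A is skew-symmetric, the resulting S(t) are rotations, so the family is in fact a group in the sense discussed in Sec. 4.1.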
4.3 Hilbert Space
Applications of semigroup theory to problems such as the Cauchy problem for the heat and wave equations require that we extend the foregoing theory to the infinite-dimensional case. The resulting theory is much more refined and considerably richer in detail. It is now necessary, however, that we be precise in defining our concepts. For example, we have repeatedly used the notion of a limit in 𝒟 without saying what was meant by this concept. As a matter of fact, almost all definitions of limit in the finite-dimensional case are equivalent. This is not true of the infinite-dimensional case, and so we shall start by describing a suitable setting for our theory.
The simplest and perhaps the most useful of the infinite-dimensional spaces is the Hilbert space H (see, for instance, the book by F. Riesz and B. Sz.-Nagy9); H is a direct generalization of the familiar complex euclidean space and accordingly one has considerable insight into its geometry right from the start. Thus H is a vector space with complex numbers as scalar multipliers; more precisely, H consists of elements y, z, w, etc., called vectors; to every pair y and z of vectors there corresponds a vector w, called their sum, w = y + z; and to every vector y and complex number a there corresponds a vector z, called their product, z = ay. The sum and product satisfy the following properties:
a. y + z = z + y
b. y + (z + w) = (y + z) + w
c. There exists a unique zero vector, denoted by 0, such that for all y in H,
y + 0 = y
d. To each y in H there corresponds a unique vector, denoted by − y, with the property
y + (− y) = 0
e. a(y + z) = ay + az
f. (a + b)y = ay + by
g. a(by) = (ab)y
h. 0y = 0 and 1y = y
What distinguishes the Hilbert space H from other vector spaces is the existence of a complex-valued inner product (y,z) defined for all ordered pairs y and z in H and having the properties
a. (ay,z) = a(y,z)
b. (y + z, w) = (y,w) + (z,w)
c. (z,y) = α − iβ whenever (y,z) = α + iβ and α, β are real; that is, (z,y) is the complex conjugate of (y,z)

d. (y,y) ≥ 0, the equality sign holding if and only if y = 0
The norm
||y|| = [(y,y)]1/2
is a measure of the length of the vector y. We note that
||y || ≥ 0
the equality sign holding if and only if y = 0. Further, we have
||ay|| = |a| ||y||
The fact that the cosine of the angle between two vectors is of absolute value less than or equal to 1 is expressed by the inequality of Schwarz (see pages 126, 134, 166, 175, and 400 of Ref. 1):
|(y,z)| ≤ ||y|| ||z||
In order to establish this inequality, we note that, if z is equal to the zero element, then the result is obviously valid. If z ≠ 0, then for every complex number a

0 ≤ ||y + az||2 = ||y||2 + a(z,y) + ā(y,z) + |a|2||z||2

and by setting a = − (y,z)/(z,z) we obtain the desired result. The so-called triangle inequality
||y + z|| ≤ ||y|| + ||z||
now follows directly; in fact, we have
||y + z||2 = ||y||2 + (y,z) + (z,y) + ||z||2 ≤ ||y||2 + 2||y|| ||z|| + ||z||2
The distance between two elements y and z is defined as ||y − z||. It is clear that this function is symmetric and nonnegative, zero if and only if y = z, and it follows from the triangle inequality that
||y − z|| ≤ ||y − w|| + ||w − z||
It also follows from the triangle inequality that
| ||y|| − ||z|| | ≤ ||y − z||
from which we see that ||y|| is a continuous function. The inner product can also be shown to be a continuous function of both of its arguments.
It is assumed that H is complete in terms of the above distance function; that is, given a Cauchy sequence {yn}, which by definition is a sequence such that

lim m,n→∞ ||yn − ym|| = 0

then there is a y in H for which

lim n→∞ ||yn − y|| = 0
Finally, it is assumed that H is not finite-dimensional.
The simplest example of a Hilbert space is the space of complex-valued sequences

y = {η1, η2, …}    z = {ζ1, ζ2, …}

for which the sum of the squares of the absolute values of the components is finite; here the inner product is defined as

(y,z) = Σi=1∞ ηiζ̄i
Another example of a Hilbert space is given by L2(G), the Lebesgue-measurable complex-valued functions y = y(χ) defined on a domain G in an m-dimensional real euclidean space Em and having the property that

∫G |y(χ)|2 dχ < ∞

In this case, the inner product is defined for y and z in L2(G) as

(y,z) = ∫G y(χ)z̄(χ) dχ

In order that the inequality (y,y) > 0 be satisfied when y ≠ 0, it is necessary to identify all functions that differ only on sets of measure zero. The resulting classes of functions in L2(G) then define a Hilbert space. It should be noted that in such a space it is possible to have a sequence {yn} for which ||yn|| converges to zero even though yn(χ) does not converge for any point χ in G.
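The last remark can be made concrete with the classical "sliding bump" sequence of indicator functions of dyadic intervals of [0,1); the enumeration below is a sketch of ours, not notation from the text:

```python
import math

def dyadic_interval(n):
    """Enumerate the dyadic intervals [j/2^k, (j+1)/2^k) of [0,1):
    n = 0 gives k = 0; n = 1, 2 give k = 1; n = 3..6 give k = 2; etc."""
    k = 0
    while n >= (1 << k):
        n -= (1 << k)
        k += 1
    return k, n          # y_n is the indicator of [n/2^k, (n+1)/2^k)

def norm(n):
    k, _ = dyadic_interval(n)
    return math.sqrt(2.0 ** (-k))    # L2 norm of the n-th indicator

def value_at(n, x):
    k, j = dyadic_interval(n)
    return 1.0 if j / 2**k <= x < (j + 1) / 2**k else 0.0

# The norms tend to zero ...
print(norm(1000))        # roughly 0.044
# ... yet at the fixed point x = 0.3 the sequence takes the value 1
# once in every "generation" k, and the value 0 in between, so y_n(0.3)
# does not converge.
hits = [n for n in range(1023) if value_at(n, 0.3) == 1.0]
print(len(hits))         # one hit per generation k = 0, ..., 9
```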
A subspace L contained in H is simply a vector space, i.e., a subset with the property that for all y and z in L and for all complex numbers a and b, the vector ay + bz lies in L. A closed subspace is a subspace that contains all its limit points. It is easy to see that a closed subspace of a Hilbert space is complete and therefore is itself a Hilbert space when it is not finite-dimensional. Two vectors y and z of H are said to be orthogonal if (y,z) = 0; it follows that the zero vector is orthogonal to every vector of H. Two sets S1 and S2 are said to be orthogonal if every element of S1 is orthogonal to every element of S2. The orthogonal complement of a set S, in symbols S⊥, is the set of all vectors orthogonal to S. It is readily verified that the orthogonal complement of a set is a closed subspace of H.
If L is a closed subspace of H and y is an arbitrary element in H, then there is in L a unique closest element y1 to y. In fact, let
d = inf (||y − z||; z in L)
Then there is a sequence {zn} in L such that ||y − zn|| converges to d. Now it can readily be verified that
||zm − zn||2 = 2(||y − zn||2 + ||y − zm||2) − 4||y − ½(zn + zm)||2
and since ||y − zn||2 and ||y − zm||2 converge to d2, and
||y − ½(zn + zm)||2 ≥ d2
[because (½)(zn + zm) lies in L], it follows that {zn} forms a Cauchy sequence. Thus {zn} converges to some y1 in L with ||y − y1|| = d. Now, if there were a second value w1 in L with ||y − w1|| = d, then the above identity applied to y1 and w1 shows that ||y1 − w1|| = 0, which proves that y1 is unique.
To pursue the foregoing matter a bit further, suppose y and y1 are defined as above and let z be an arbitrary element in L. Then, since y1 + az lies in L for every complex number a, we have

||y − y1||2 ≤ ||y − y1 − az||2 = ||y − y1||2 − a(z, y − y1) − ā(y − y1, z) + |a|2||z||2

This inequality can hold for all complex a if and only if (y − y1, z) = 0. Thus y2 = y − y1 lies in the orthogonal complement to L; accordingly, to each y in H there correspond a y1 in L and a y2 in L⊥ such that y = y1 + y2. Since 0 is the only vector common to L and L⊥, it is clear that this decomposition of y is unique. The vector y1 is usually called the projection of y on L; in symbols, we write

PLy = y1

Now, because y1 and y2 are orthogonal, we have

||y||2 = ||y1||2 + ||y2||2

Further, it is easily seen that
PL(ay + bz) = aPLy + bPLz PLPLy = PLy (PLy,z) = (y,PLz)
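These three identities are easy to test numerically for a concrete closed subspace; in the sketch below (our own function names) L is the span of a single unit vector in complex 3-space:

```python
import math

def inner(y, z):
    """Inner product: sum of eta_i times the conjugate of zeta_i."""
    return sum(a * b.conjugate() for a, b in zip(y, z))

def project(y, basis):
    """Orthogonal projection of y onto the span of an orthonormal family."""
    out = [0j] * len(y)
    for e in basis:
        c = inner(y, e)
        out = [o + c * ei for o, ei in zip(out, e)]
    return out

# L = one-dimensional subspace of C^3 spanned by the unit vector e.
e = [1 / math.sqrt(2), 1j / math.sqrt(2), 0j]
y = [1 + 0j, 2 + 0j, 3 + 0j]
z = [0j, 1 + 0j, 1j]

Py = project(y, [e])
PPy = project(Py, [e])
print(max(abs(a - b) for a, b in zip(Py, PPy)) < 1e-12)       # P_L P_L = P_L
print(abs(inner(Py, z) - inner(y, project(z, [e]))) < 1e-12)  # (P_L y, z) = (y, P_L z)
```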
It is obvious from the definition of orthogonal complements that (S⊥)⊥ ⊃ S for an arbitrary subset S. In the case of a closed subspace L, however, if y lies in (L⊥)⊥, then y is orthogonal to L⊥; by writing y = y1 + y2, y1 in L and y2 in L⊥, we see that
||y2||2 = (y,y2) − (y1,y2) = 0
Hence in this case y lies in L and we have L = (L⊥)⊥.
It is convenient to introduce the notion of a product space H × H consisting of all ordered pairs [y,z] with y and z in H. This becomes a vector space under the convention
a[y,z] + b[u,v] = [ay + bu, az + bv]
And if we define an inner product as
([y,z],[u,v]) = (y,u) + (z,v)
then H × H actually becomes a Hilbert space.
We come now to the subject of transformations or operators on H to itself. An operator T with domain 𝒟(T) and range ℛ(T) is said to be linear if 𝒟(T) is a linear subspace and if

T(ay + bz) = aTy + bTz    for all y, z in 𝒟(T)

The graph of T, in symbols 𝒢(T), is defined as

𝒢(T) = ([y,Ty]; y in 𝒟(T))

If T is a linear operator, it is clear that 𝒢(T) is a linear subspace of H × H. For an operator T, if we have yn → y and Tyn → z, we shall say that [yn, Tyn] converges to [y,z] in the graph topology. In particular if, for every convergent sequence of this type, y lies in 𝒟(T) and Ty = z, then T is said to be a closed operator; in other words, T is closed if and only if 𝒢(T) is a closed subspace of H × H. Continuity is a much more restrictive concept, since an operator is continuous if yn → y in 𝒟(T) implies that Tyn → Ty. A continuous operator need not be closed; however, it always has a smallest closed extension that is continuous and has a closed domain. If T is linear and continuous, then there is a δ > 0 such that ||Ty|| ≤ 1 whenever ||y|| ≤ δ. For any y ≠ 0 in 𝒟(T), it is clear that δy/||y|| is of norm δ and hence

||Ty|| ≤ δ−1||y||

Thus, for a continuous linear operator, the norm or bound

||T|| = sup (||Ty||; y in 𝒟(T), ||y|| ≤ 1)

is finite. It is clear that

||Ty|| ≤ ||T|| ||y||    (4.12)

and that ||T|| is the smallest number for which this holds. Thus if T is continuous and linear, and if {yn} is a Cauchy sequence of elements in 𝒟(T), then the inequality

||Tyn − Tym|| ≤ ||T|| ||yn − ym||    (4.13)

shows that {Tyn} also forms a Cauchy sequence. We can therefore define an extension T̄ of T as

T̄(lim yn) = lim Tyn

for all such Cauchy sequences. This is the smallest closed extension of T, to which we alluded above; it is readily shown that T̄ is again continuous and linear and that 𝒟(T̄) is a closed subspace of H. The inequality (4.13) shows, incidentally, that a linear operator satisfying the relationship (4.12) is necessarily continuous.
In order to illustrate the foregoing discussion, we note that the operator S(t) defined as in Eq. (4.4) on H = L2(−∞, ∞) and representing the solution to the initial-value problem (4.1) for the heat equation is linear and continuous. In fact, we have ||S(t)|| ≤ 1 since

∫−∞∞ k(χ − ξ; t) dξ = 1

and since, by the Schwarz inequality,

|[S(t)f](χ)|2 ≤ ∫−∞∞ k(χ − ξ; t)|f(ξ)|2 dξ

so that, upon integrating with respect to χ, ||S(t)f||2 ≤ ||f||2. On the other hand, the associated diffusion operator A [also on the space H = L2(−∞, ∞)], defined by

Ay = yχχ    (4.14)

for all y in H such that y and yχ are absolutely continuous and yχχ lies in H, can be shown to be closed but not continuous.
Suppose T is a closed linear operator and let λ be a complex number. If λI − T has a continuous inverse with domain all of H, then λ is said to belong to the resolvent set of T; the inverse is called the resolvent of T at the point λ and is denoted by R(λ;T). It is clear that

R(λ;T)(λI − T)y = y for all y in 𝒟(T)    (λI − T)R(λ;T)y = y for all y in H

As an example, we consider the operator A defined as in Eq. (4.14) and take λ > 0. Then
(λI − A)y = λy − yχχ
has an inverse if and only if the equation λy − yχχ = 0 has no nontrivial solution in 𝒟(A). The only nontrivial solutions of this equation for which y and yχ are absolutely continuous, however, are linear combinations of exp (λ½χ) and exp (−λ½χ), and no nontrivial combination of these lies in L2(−∞, ∞). Thus λI − A has an inverse; in fact, the equation
(λI − A)y = f
has a solution given by

y(χ) = Rλf = (2λ½)−1 ∫−∞∞ exp (−λ½|χ − ξ|)f(ξ) dξ

One can verify directly that the solution y given here belongs to 𝒟(A) and also that Rλ is a continuous linear operator with bound 1/λ. Thus Rλ is a right inverse of λI − A. In order to prove that Rλ is actually the resolvent of A at λ, it remains only to verify that

Rλ(λI − A)z = z    for all z in 𝒟(A)

To this end, let z in 𝒟(A) be given, and set

f = λz − Az    y = Rλf

According to what has already been established, we have

(λI − A)(y − z) = 0

Hence it follows from the above uniqueness argument that

z = y = Rλf

This proves that Rλ(λI − A)z = z, and hence that Rλ = R(λ;A). We note parenthetically that 𝒟(A) is dense in H; that is, every element in H is the limit of a sequence of elements in 𝒟(A).
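Assuming the exponential kernel representation of Rλ described above, both the total mass of the kernel and the bound ||Rλ|| ≤ 1/λ can be spot-checked by quadrature; the helper names and tolerances below are ours:

```python
import math

def green(x, lam):
    """Kernel of R_lam: (2*sqrt(lam))^(-1) * exp(-sqrt(lam)*|x|)."""
    r = math.sqrt(lam)
    return math.exp(-r * abs(x)) / (2.0 * r)

def trapz(g, lo, hi, n):
    """Trapezoidal rule for a scalar function g on [lo, hi]."""
    h = (hi - lo) / n
    s = 0.5 * (g(lo) + g(hi))
    for i in range(1, n):
        s += g(lo + i * h)
    return s * h

lam = 2.0

# The kernel has total mass 1/lam; this is the source of the bound ||R_lam|| <= 1/lam.
mass = trapz(lambda x: green(x, lam), -30.0, 30.0, 6000)
print(abs(mass - 1.0 / lam) < 1e-4)

# Spot check of the bound for a Gaussian f: ||R_lam f|| <= ||f|| / lam.
f = lambda x: math.exp(-x * x)
Rf = lambda x: trapz(lambda xi: green(x - xi, lam) * f(xi), -10.0, 10.0, 2000)
norm_f = math.sqrt(trapz(lambda x: f(x) ** 2, -10.0, 10.0, 2000))
norm_Rf = math.sqrt(trapz(lambda x: Rf(x) ** 2, -10.0, 10.0, 400))
print(norm_Rf <= norm_f / lam)
```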
4.4 Semigroups of Operators on a Hilbert Space
Returning now to semigroup theory, we suppose that we have a semigroup of linear operators [S(t); t > 0] defined on a domain 𝒟 that we assume to be contained and dense in H. Stability implies that each S(t) is continuous on 𝒟, and hence S(t) can be uniquely extended to be linear and continuous on H. We shall denote the extended operator again by S(t), and it is readily verified that the extended operators also have the semigroup property (4.3).

It is convenient at this point to make one further assumption, namely, that the operators S(t) satisfy the inequality

||S(t)|| ≤ 1    (4.15)

Such operators are called contraction operators. In this case, continuity with respect to t at t = 0 also carries over for the extended operators; in other words,

lim t→0+ S(t)y = y    for each y in H

Actually, more is true. It can be shown on the basis of Eq. (4.15) that S(t)y is a continuous function of t, t > 0, for each y in H, and if we set S(0) = I, then the assertion holds for all t ≥ 0.

We now define the infinitesimal generator A of the semigroup [S(t)] by setting

Ay = lim t→0+ t−1[S(t)y − y]

The domain 𝒟(A) of A consists of all y in H for which this limit exists. It is easy to verify that A is a linear operator, and it can be shown that A is actually closed with a dense domain 𝒟(A). Now
ε−1[S(t + ε)y − S(t)y] = ε−1[S(ε) − I]S(t)y = S(t){ε−1[S(ε)y − y]}
For each y in 𝒟(A), the right-hand member converges to S(t)Ay as ε → 0+. This shows that the middle member also converges and hence that S(t)y lies in 𝒟(A); we may therefore assert that the right-hand derivative satisfies

d+S(t)y/dt = AS(t)y = S(t)Ay    (4.16)

Upon writing

ε−1[S(t)y − S(t − ε)y] = S(t − ε){ε−1[S(ε)y − y]}    0 < ε < t

we see that the left-hand derivative also exists, and since it equals S(t)Ay, we conclude that Eq. (4.16) holds for the two-sided derivative.
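In the finite-dimensional case, Eq. (4.16) can be verified directly by comparing a difference quotient of exp (At)y with exp (At)Ay; the sketch below uses our own series-based helper:

```python
def mat_vec(A, y):
    return [sum(a * v for a, v in zip(row, y)) for row in A]

def expm_vec(A, t, y, terms=60):
    """exp(At) y via the power series: y + (At)y + (At)^2 y / 2! + ..."""
    out = y[:]
    term = y[:]
    for m in range(1, terms):
        term = [t * v / m for v in mat_vec(A, term)]   # (At)^m y / m!
        out = [o + v for o, v in zip(out, term)]
    return out

A = [[-1.0, 2.0], [0.0, -3.0]]
y = [1.0, 1.0]
t, d = 0.8, 1e-5

# Two-sided difference quotient of S(t)y versus S(t)Ay:
lhs = [(p - q) / (2 * d) for p, q in
       zip(expm_vec(A, t + d, y), expm_vec(A, t - d, y))]
rhs = expm_vec(A, t, mat_vec(A, y))
print(max(abs(a - b) for a, b in zip(lhs, rhs)) < 1e-8)
```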
We note that the derivative with respect to t is taken in the sense of the metric in H. Thus suppose that H = L2(−∞, ∞) and set u(χ,t) equal to a function in H that corresponds to S(t)f for a given f in 𝒟(A) and t ≥ 0. We can then take the difference quotient of u(χ,t) with respect to t and pass to the limit in the mean-square sense as the t increment goes to zero. This gives a generalized partial derivative with respect to t, and according to Eq. (4.16) the resulting derivative is obtained by having A act on u(χ,t). If A were the operator (4.14), say, then the semigroup S(t) restricted to 𝒟(A) would provide a solution to the Cauchy problem for the heat equation (4.1), at least in this generalized sense. It will be the aim of this chapter to show how the theory of semigroups can be used to establish the existence of such solutions to a large class of Cauchy problems defined by means of certain differential operators.
The semigroup method is in essence an abstract analogue of the Laplace-transform approach to the initial-value problem for time-invariant partial differential equations and as such should be not entirely unfamiliar to the reader (see Chap. 3). The Laplace transform

Rλy = ∫0∞ exp (−λt)S(t)y dt

can be defined in a straightforward manner and converges for λ > 0. Since S(t) is very close to exp (At), and at least heuristically

∫0∞ exp (−λt) exp (At) dt = (λI − A)−1

it is not surprising that Rλ turns out to be the resolvent of A, namely, R(λ;A). Here, as in the classical Laplace-transform approach, it is the inverse problem that is important. In other words, given an operator A, for which the resolvent exists when λ is sufficiently large, the problem is to determine whether A is the infinitesimal generator of a semigroup of operators. It is precisely at this point that the abstract semigroup theory provides a better answer than the classical theory, at least better in the sense that the abstract criterion is easier to verify than the classical criterion. The first result of this kind was obtained independently by E. Hille and K. Yosida (see pages 360 to 364 of Ref. 6) and can be expressed as follows:
THEOREM 4.1. A necessary and sufficient condition that a closed linear operator L with dense domain generate a semigroup of contraction operators is that the resolvent R(λ;L) satisfy the inequality
||R(λ;L)|| ≤ λ−1    λ > 0
The Hille-Yosida result is valid in more general settings than a Hilbert space H, so that it is to be expected that a simpler criterion holds in the case of H. By way of motivating the terminology in this simpler criterion, we note that, in the model that we shall use, the inner product (y,y) turns out to be the energy associated with the state y; thus, assuming S(t) to be a contraction operator is the same as assuming that the process is dissipative in the sense that no energy enters the system. Since for τ > 0 we have

||S(t + τ)y|| = ||S(τ)S(t)y|| ≤ ||S(t)y||

we see that ||S(t)y|| is nonincreasing, and hence for y in 𝒟(A) we obtain

Re (Ay,y) = lim t→0+ t−1 Re (S(t)y − y, y) ≤ 0    (4.17)

We shall call an operator L dissipative if

Re (Ly,y) ≤ 0    for all y in 𝒟(L)

and maximal dissipative if it is not the proper restriction of any other dissipative operator. We can prove the following theorems.7
THEOREM 4.2. A necessary and sufficient condition for a dissipative operator L with dense domain to be maximal dissipative is that the range of I − L be all of H.
THEOREM 4.3. A necessary and sufficient condition that a linear operator L generate a semigroup of contraction operators is that L be maximal dissipative with dense domain.
These results are readily verified in the case of the finite-dimensional analogue of H, namely, the k-dimensional complex euclidean space Zk with elements

y = (η1, η2, …, ηk)    z = (ζ1, ζ2, …, ζk)

and inner product

〈y,z〉 = Σi=1k ηiζ̄i

As we have already seen, each semigroup of matrices [S(t)] defined on Zk and continuous on the right can be represented as

S(t) = exp (At)    (4.18)

where the matrix A is the infinitesimal generator of the semigroup. If, in addition, the operators S(t) are contraction operators, then, according to the relationship (4.17), A will be dissipative, actually maximal dissipative since A is defined on all of Zk. In the finite-dimensional case this simply means that the eigenvalues of the real part of A are nonpositive, in symbols

B = ½(A + A*) ≤ Θ
where A* is the transpose of A relative to the inner product 〈y,z〉 and Θ is the zero matrix. Conversely, if B ≤ Θ, then for S(t) defined as in Eq. (4.18) and for arbitrary y in Zk we have

d〈S(t)y, S(t)y〉/dt = 〈(A + A*)S(t)y, S(t)y〉 = 2〈BS(t)y, S(t)y〉 ≤ 0

and since

〈S(t)y, S(t)y〉 = 〈y,y〉    for t = 0

we see that

||S(t)y|| ≤ ||y||    for all t > 0
Thus A generates a semigroup of contraction operators in accordance with Theorem 4.3. For example, the semigroup (4.11) generated by Eq. (4.10) will consist of contraction operators if and only if
We further note, for A dissipative, λ > 0, and f = λy − Ay, that

λ||y||2 = Re (f,y) + Re (Ay,y) ≤ Re (f,y) ≤ ||f|| ||y||

from which we conclude that

λ||y|| ≤ ||f||

It follows from this that λI − A is nonsingular, so that

ℛ(λI − A) = Zk

in accordance with Theorem 4.2, and it further follows that

R(λ;A) = (λI − A)−1

is of norm less than or equal to λ−1 in accordance with Theorem 4.1.
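The finite-dimensional chain of reasoning (dissipativity of the symmetric part, the contraction property of exp (At), and the resolvent bound) can be exercised on a concrete 2 × 2 real matrix; the sketch below, with our own helper names, samples random vectors rather than computing operator norms exactly:

```python
import math, random

def mat_vec(A, y):
    return [sum(a * v for a, v in zip(row, y)) for row in A]

def expm_vec(A, t, y, terms=60):
    """exp(At) y via the power series."""
    out, term = y[:], y[:]
    for m in range(1, terms):
        term = [t * v / m for v in mat_vec(A, term)]
        out = [o + v for o, v in zip(out, term)]
    return out

def norm(y):
    return math.sqrt(sum(v * v for v in y))

# A real matrix whose symmetric part B = (A + A^T)/2 = diag(-1, -2) <= 0,
# hence a dissipative generator in the sense of the text.
A = [[-1.0, 3.0], [-3.0, -2.0]]

random.seed(7)
ok_contraction, ok_resolvent = True, True
lam = 0.5
for _ in range(200):
    y = [random.uniform(-1, 1), random.uniform(-1, 1)]
    # ||S(t)y|| <= ||y|| for t > 0:
    if norm(expm_vec(A, 1.3, y)) > norm(y) + 1e-12:
        ok_contraction = False
    # lam*||y|| <= ||f|| with f = lam*y - A*y, i.e. ||R(lam;A)|| <= 1/lam:
    f = [lam * yi - v for yi, v in zip(y, mat_vec(A, y))]
    if lam * norm(y) > norm(f) + 1e-12:
        ok_resolvent = False
print(ok_contraction and ok_resolvent)
```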
For additional material on the theory of semigroups of operators, we refer the reader to the treatise by E. Hille and R. S. Phillips,6 entitled “Functional Analysis and Semi-groups,” and to the excellent set of notes by K. Yosida,11 entitled “Lectures on Semi-group Theory and Its Applications to Cauchy’s Problem in Partial Differential Equations.”
4.5 Hyperbolic Systems of Partial Differential Equations
We shall apply the previous discussion to symmetric hyperbolic systems of partial differential equations. Such a system can be described as follows: Let G be a bounded domain in an m-dimensional euclidean space with points

χ = (χ1, χ2, …, χm)

and again let Zk be a k-dimensional complex euclidean space with elements

y = (η1, η2, …, ηk)    z = (ζ1, ζ2, …, ζk)

and inner product

〈y,z〉 = Σi=1k ηiζ̄i    (4.19)

For functions y(χ,t) defined on G × (0, ∞), we shall consider the Cauchy problem

yt = Σi=1m Aiyi + By    y(χ,0) = f(χ)    (4.20)

Moreover, y(χ,t) is restricted so as to satisfy certain homogeneous lateral conditions that suffice to make the solution to the initial-value problem unique. Throughout the remainder of this chapter, the subscript i will denote differentiation with respect to χi. In Eq. (4.20), Ai and B are matrix-valued functions of χ alone, the Ai are symmetric and continuously differentiable in Ḡ, the closure of G, whereas B is merely continuous in Ḡ. The coefficients are subject to one further condition, namely,

D ≡ B + B* − Σi=1m (Ai)i ≤ Θ    (4.21)

Here B* is the transpose of B relative to the inner product (4.19). As we shall see, the hypothesis (4.21) is a consequence of the fact that we deal only with energy-dissipative systems.
We note that the wave equation can readily be transformed into a symmetric hyperbolic system. We have already shown in Eqs. (4.5) that this is true of Eq. (4.2). More generally, we have, for the damped wave equation in two space variables,

utt = u11 + u22 − rut    (4.22)

and if we set

η1 = u1    η2 = u2    and    η3 = ut

then Eq. (4.22) becomes

η1t = (η3)1    η2t = (η3)2    η3t = (η1)1 + (η2)2 − rη3

which is now of the form (4.20) with A1 having rows (0,0,1), (0,0,0), (1,0,0), A2 having rows (0,0,0), (0,0,1), (0,1,0), and B = diag (0, 0, −r), so that

D = B + B* = diag (0, 0, −2r)

which is ≤ Θ if r ≥ 0, as we shall assume.
The potential energy plus the kinetic energy at time t for the amplitude function of Eq. (4.22) is given by

E(t) = ∫G (|u1|2 + |u2|2 + |ut|2) dχ

Extending this to the system (4.20), we take, as the energy at time t of the amplitude function y, the quantity

E(t) = ∫G 〈y(χ,t), y(χ,t)〉 dχ

This suggests the Hilbert space H with inner product

(y,z) = ∫G 〈y(χ), z(χ)〉 dχ

as the natural setting for our problem. If we assume that the solution S(t)f is energy-nonincreasing, then we have

||S(t)f|| ≤ ||f||

Thus the associated semigroup operator is contractional, as desired.
By differentiating E with respect to t and making use of an integration by parts, we obtain, at least formally,

Et = ∫∂G 〈Σi niAiy, y〉 dσ + (Dy,y)    (4.23)

where ∂G is the boundary of G, n = (n1, n2, …, nm) the outer normal to ∂G, and dσ the surface element on ∂G. The first term on the right in Eq. (4.23), namely,

Q(y,y) = ∫∂G 〈Σi niAiy, y〉 dσ

represents the rate at which energy enters the system through the boundary, the m-tuple with components 〈Aiy,y〉 being a generalization of the “Poynting vector.” The other term, (Dy,y), represents the rate at which energy enters the system from energy sources within G. In particular, if we assume that Et ≤ 0 for all smooth initial-value functions that vanish near ∂G, then for t = 0 the surface integral vanishes and we are left with the condition (Dy,y) ≤ 0 for all such functions. This is easily seen to imply the dissipative condition (4.21), which we assume to be satisfied by the coefficients of our system. Finally, when we set

Ly = Σi Aiyi + By    (4.24)

the computation (4.23) shows that

(Ly,y) + (y,Ly) = Q(y,y) + (Dy,y)    (4.25)

and hence the condition Et ≤ 0 for all solutions of the initial-value problem is equivalent to the condition that the domain of L is so defined as to make this operator dissipative in the sense of Sec. 4.4. This comes as no surprise, since we have already shown in Secs. 4.1 and 4.4 that a necessary and sufficient condition for the problem to be well formulated is that L be maximal dissipative.
Now if the domain of L is such that Q(y,y) ≤ 0 for all functions in this domain, then L is obviously dissipative. It turns out that this is the case if and only if the domain of L is suitably restricted by what may be called boundary conditions. In particular, this will be the case if the functions y in 𝒟(L) are subject to homogeneous constraints requiring that

〈Σi ni(χ)Ai(χ)y(χ), y(χ)〉

be nonpositive at each point χ of ∂G. Boundary conditions of this sort will be called local boundary conditions; K. O. Friedrichs4 has characterized boundary conditions of this kind that define L as a maximal dissipative operator. There are also a wide variety of boundary conditions defining L as maximal dissipative in which the boundary values at more than one point of ∂G are related; for instance, periodic-type boundary conditions can be of this sort.

Since most of the present discussion deals with the question of boundary conditions, it will be helpful to elaborate on the description of local boundary conditions as given by Friedrichs. For fixed χ on ∂G, it is clear that the matrix

Aχ = Σi ni(χ)Ai(χ)

is symmetric, say of rank r with n negative and p positive eigenvalues.
Relative to the local quadratic form
Qχ(y,y) = 〈Aχy,y〉
a subspace N of Zk will be called negative [positive] if
Qχ(y,y) ≤ 0    [Qχ(y,y) ≥ 0]

for all y ∈ N, and maximal negative [maximal positive] if N is negative [positive] but not the proper subspace of some other negative [positive] subspace. It is easy to show that a negative [positive] subspace is maximal negative [maximal positive] if and only if it is of dimension n + (k − r) [of dimension p + (k − r)]. It follows from Eq. (4.25) that L will be dissipative if the functions in its domain lie in negative subspaces at each point of ∂G, and it is clear that L cannot be maximal dissipative unless these subspaces are maximal negative. Friedrichs has shown that this condition is also sufficient. More precisely, Friedrichs assumes that Aχ is of constant rank throughout ∂G and chooses a continuous family of maximal negative subspaces on ∂G; the set of all smooth functions having boundary values in the chosen subspaces determines a dissipative operator L for which the smallest closed extension is maximal dissipative.
On the other hand, it is clear from Eqs. (4.23) and (4.25) that the condition Et ≤ 0, which is equivalent to L being dissipative, does not require the surface integral Q(y,y) to be nonpositive for all y in the domain of L. If D ≠ Θ, then we can have Et ≤ 0 even if energy enters through the boundary, provided that it is compensated for by the energy dissipated in the interior. In order to ensure such a state of affairs, the domain of the operator L must be restricted by global lateral conditions that relate the boundary values of a function in 𝒟(L) to its values at the sinks in G. This is a new phenomenon that can occur only when D ≠ Θ or, what amounts to the same thing (as we shall see), only in the non-self-adjoint problem. Such lateral conditions are reminiscent of lateral conditions found by W. Feller2 in describing certain return processes in diffusion theory. There is also a dual process, no longer governed by a pure partial differential operator, in which energy leaks out through the boundary and is in part transported back into the interior.
Returning now to our main problem, namely, that of finding all the dissipative solutions of the initial-value problem (4.20), we see by Theorem 4.3 that this is equivalent to the problem of characterizing all the maximal dissipative operators that can be associated with the differential operator L of Eq. (4.24). There is still a certain amount of arbitrariness as to the operators that can be associated with L. We shall take this to mean either extensions of a minimal operator L0 defined by Eq. (4.24), with domain containing in essence only the smooth functions that vanish near ∂G, or restrictions of a maximal operator L1 also defined by Eq. (4.24), but with domain containing in essence all smooth functions on Ḡ.

More precisely, L0 and L1 are defined as the least closed extensions of L00 and L10, respectively, which are in turn defined as follows:

L00y = Σi Aiyi + By    for all continuously differentiable y vanishing near ∂G

and

L10y = Σi Aiyi + By    for all continuously differentiable y on Ḡ

The maximal operator L1 can also be defined in terms of the formal adjoint to L, namely,

My = −Σi (Aiy)i + B*y

We note that the differential operator M is readily shown to be dissipative in the sense that its coefficients satisfy the analogue of Eq. (4.21); in fact, D is the same for both L and M. We proceed as above to define M0 and M1 by means of M00 and M10, respectively, where

M00y = My    for all continuously differentiable y vanishing near ∂G
M10y = My    for all continuously differentiable y on Ḡ
It can be shown3 that L1 is the adjoint of M0; in other words, 𝒟(L1) consists of all y in H for which there exists a y′ in H such that

(M0z, y) = (z, y′)    for all z in 𝒟(M0)

in which case we have L1y = y′.
4.6 Maximal Dissipative Operators
Our problem, then, is to characterize both the maximal dissipative extensions of L0 and the restrictions of L1. The extensions of L0 need not be of the form (4.24) for y not in 𝒟(L1), whereas the restrictions of L1 will in general have domains determined by global lateral conditions. The maximal dissipative operators that are both extensions of L0 and restrictions of L1 have domains determined by boundary conditions, and the elements of these domains satisfy the condition Q(y,y) ≤ 0; it is with these operators that we shall be most concerned.
Before proceeding further with our characterization of the maximal dissipative operators that lie between L0 and L1, we shall illustrate the foregoing discussion with a simple example, in which G is the interval 0 < x < 1 and we take m = 1, k = 1, H = L2(0,1), and
In this case we find that A = 1, B = −1, and D = −2 ≤ 0. It can readily be verified that
Moreover, we have
The only local boundary condition defining the domain of a maximal dissipative operator is y(1) = 0. For the general boundary condition, we consider the form
for which r = 2 = dimension of the boundary space, and n = 1 = p. Each maximal negative subspace for this form is characterized as y(1) = αy(0) for some α of absolute value less than or equal to 1. Each such subspace defines the domain of a maximal dissipative operator Lα, L0 ⊂ Lα ⊂ L1, and conversely, each maximal dissipative operator between L0 and L1 is characterized in this way.
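This dissipativity can be checked numerically. The sketch below assumes the scalar operator of the example is Ly = y′ − y on (0,1) — a reconstruction consistent with the stated values A = 1, B = −1, D = −2, not a formula from the text — and verifies that for trial functions with y(1) = αy(0), |α| ≤ 1, the form (Ly,y) + (y,Ly) is never positive.

```python
import numpy as np

# A numerical sketch.  We assume the scalar operator of the example is
# L y = y' - y on (0,1), a reconstruction consistent with the stated
# values A = 1, B = -1, D = -2.
def inner(f, g, x):
    """L2(0,1) inner product by the trapezoid rule (real-valued data)."""
    h = f * g
    return float(np.sum((h[1:] + h[:-1]) / 2 * np.diff(x)))

x = np.linspace(0.0, 1.0, 4001)
for alpha in (-1.0, -0.5, 0.0, 0.7, 1.0):
    y = 1.0 + (alpha - 1.0) * x        # trial function: y(0) = 1, y(1) = alpha
    Ly = (alpha - 1.0) - y             # L y = y' - y, with y' = alpha - 1
    form = 2.0 * inner(Ly, y, x)       # (Ly,y) + (y,Ly) for real y
    Q = alpha**2 - 1.0                 # boundary form |y(1)|^2 - |y(0)|^2
    assert Q <= 0.0                    # the subspace b = alpha*a is negative
    assert form <= 0.0                 # the restricted operator is dissipative
    # energy identity: (Ly,y) + (y,Ly) = Q(y,y) + (Dy,y), with D = -2
    assert abs(form - (Q - 2.0 * inner(y, y, x))) < 1e-6
```

The final assertion checks the integration-by-parts identity (Ly,y) + (y,Ly) = Q(y,y) − 2(y,y) term by term against the quadrature.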
On the other hand, it is easy to construct extensions of L0 that are maximal dissipative but not of the above type; for instance, consider
Here a = a(χ) belongs to L2(0,1) and we assume that (a,a) ≤ 2. In order to see that U is dissipative, we observe that
where the inequality follows from the Schwarz inequality:
|(y,a)| ≤ ||y|| ||a||
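Spelled out under a reconstruction of the formulas involved (we take Uy = y′ − y + a(χ)y(0) with the boundary condition y(1) = 0, consistent with A = 1, B = −1, D = −2), the dissipativity estimate runs:

```latex
\begin{aligned}
(Uy,y)+(y,Uy)
  &= |y(1)|^{2}-|y(0)|^{2}-2\|y\|^{2}
     +2\,\operatorname{Re}\bigl[\overline{y(0)}\,(y,a)\bigr] \\
  &= -|y(0)|^{2}-2\|y\|^{2}
     +2\,\operatorname{Re}\bigl[\overline{y(0)}\,(y,a)\bigr]
     \quad\text{since } y(1)=0 \\
  &\le -|y(0)|^{2}-2\|y\|^{2}+2\,|y(0)|\,\|y\|\,\|a\| \\
  &\le -\bigl(|y(0)|-\sqrt{2}\,\|y\|\bigr)^{2}\;\le\;0,
\end{aligned}
```

the last step using the hypothesis (a,a) ≤ 2.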
It is clear that is dense in L2(0,1); hence, according to Theorem 4.2, we shall prove that U is maximal dissipative if we show that
This requires that we solve the equation
y − Uy = f
for arbitrary f ∈ H. It is readily verified that this is satisfied by the function
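Solvability can also be verified numerically. The sketch below is a reconstruction, not the text's formula: we take Uy = y′ − y + a(χ)y(0) with boundary condition y(1) = 0, so that y − Uy = f becomes a linear ODE coupled to the unknown value c = y(0), and the sample coefficient a and right-hand side f are hypothetical choices.

```python
import math
import numpy as np

# A numerical sketch of the maximality argument.  Reconstruction (the
# explicit formulas are assumptions): U y = y' - y + a(x) y(0) with the
# boundary condition y(1) = 0, so that y - Uy = f reads
#     y' = 2y - y(0) a(x) - f(x),   y(1) = 0.
# Writing y = y_p + c*y_h with c = y(0) splits this into two linear ODEs,
# each integrated backward from x = 1.

def integrate_back(rhs, n=4000):
    """RK4 for y' = rhs(x, y) from x = 1, y(1) = 0, down to x = 0."""
    h = -1.0 / n
    x, y = 1.0, 0.0
    ys = [y]
    for _ in range(n):
        k1 = rhs(x, y)
        k2 = rhs(x + h / 2, y + h * k1 / 2)
        k3 = rhs(x + h / 2, y + h * k2 / 2)
        k4 = rhs(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
        ys.append(y)
    return np.array(ys[::-1])          # values on the ascending grid

a = lambda x: 1.0                      # sample coefficient, (a,a) = 1 <= 2
f = lambda x: x                        # sample right-hand side

y_h = integrate_back(lambda x, y: 2 * y - a(x))   # y_h' = 2 y_h - a
y_p = integrate_back(lambda x, y: 2 * y - f(x))   # y_p' = 2 y_p - f
c = y_p[0] / (1.0 - y_h[0])            # self-consistency y(0) = c; the
                                       # condition (a,a) <= 2 keeps the
                                       # denominator away from zero
y = y_p + c * y_h

assert abs(y[-1]) < 1e-12              # boundary condition y(1) = 0
assert abs(y[0] - c) < 1e-10           # lateral consistency y(0) = c
# closed-form value of c for these particular a and f
c_exact = ((1 - 3 * math.exp(-2)) / 4) / (1 - (1 - math.exp(-2)) / 2)
assert abs(c - c_exact) < 1e-8
```

The decisive point mirrored here is that the self-consistency equation for c is always solvable when (a,a) ≤ 2, so y − Uy = f has a solution for every f.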
We remark that the term ay(0) in the expression for U corresponds physically to transporting energy into the interior in amounts having linear density a(χ)|y(0)|2.
The dual situation of a maximal dissipative operator V that is a restriction of L1 but not an extension of L0 can be illustrated by the operator
where again it is assumed that (a,a) ≤ 2. In this case,
so that V is dissipative. One verifies as before that V is maximal dissipative by showing that
This amounts to solving the equation
y − Vy = g
and one can easily find that this equation is satisfied by
where
The condition (a,a) ≤ 2 is sufficient to make the coefficient of k different from zero. It can be shown that for some y in the boundary integral Q(y,y) is positive, so that energy can enter the system through the boundary; however, the lateral condition y(1) = (y,a), which determines
, requires that the rate of energy being dissipated in the interior, namely, −2(y,y), is always at least as large as the flow in through the boundary, namely, |y(1)|2.
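This energy balance can be written out explicitly, again taking Vy = y′ − y (a reconstruction) on the domain singled out by the lateral condition y(1) = (y,a):

```latex
\begin{aligned}
(Vy,y)+(y,Vy) &= Q(y,y)+(Dy,y)
               = |y(1)|^{2}-|y(0)|^{2}-2\|y\|^{2},\\
|y(1)|^{2} &= |(y,a)|^{2} \le \|a\|^{2}\,\|y\|^{2} \le 2\,\|y\|^{2},
\end{aligned}
```

so that (Vy,y) + (y,Vy) ≤ −|y(0)|² ≤ 0, and V is dissipative even though the boundary term Q(y,y) by itself may be positive.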
We return now to the general problem of characterizing the boundary conditions for the maximal dissipative operators L between L0 and L1. An integration by parts as in Eq. (4.25) shows that
at least for y and z in ; hence, by considering sequences in the graph topology we see that Eq. (4.28) gives us a means of defining the boundary integral Q(y,z) for all y and z in
. For functions y in
, which by definition vanish near
, and functions z in
, it is clear that Q(y,z) = 0; upon again taking sequences in the graph topology, we see that this remains true for all y in
and z in
. Thus, as far as the boundary integral is concerned, the functions in
have absolutely no influence and can be said to exhibit zero-like boundary behavior. An argument similar to that used for Eq. (4.27) shows that
is the largest class of functions in
with this property.
Suppose now we consider the residue classes, or cosets,
Since each coset consists of functions that differ from each other by functions in , it follows that all the functions in such a coset exhibit the same boundary behavior. On the other hand, functions in different cosets do not differ by an element of
and hence exhibit different boundary behavior. Thus, there is a one-to-one correspondence between the cosets in
and the types of boundary behavior displayed by the functions in
. Moreover, suppose that y1 and y2 belong to one coset and z1 and z2 belong to another. Then y1 − y2 and z1 − z2 both lie in
and, according to our previous remarks,
Q(y1,z1) = Q([y2 + (y1 − y2)], [z2 + (z1 − z2)]) = Q(y2,z2)
so that Q(y,z) depends only on the cosets to which y and z belong. Thus, Q induces a bilinear form on the residue classes; we denote this form by .
We now have a space of boundary data
together with a bilinear form
that is in essence the boundary integral associated with the differential operator (4.24). A subset of
can be thought of as a boundary condition on a restriction of L1 and, in particular, a linear subspace of
can be considered as a homogeneous boundary condition on such an operator; the corresponding domain of this operator would consist of the totality of functions in the cosets of such a subset or subspace.
The maximal dissipative operators between L0 and L1 can be described as follows. Relative to the quadratic form defined on the cosets of
, we can again define the notion of a negative [positive] subspace of
as well as that of a maximal negative [maximal positive] subspace of
.
These maximal negative subspaces determine the boundary conditions for the maximal dissipative operators in question. More precisely, we have the following result.
THEOREM 4.4. There is a one-to-one correspondence between the maximal negative subspaces of the space of boundary data and the maximal dissipative operators L that lie between L0 and L1; this correspondence is given by
In the case of one spatial variable, is finite-dimensional and the problem of determining the maximal negative subspaces of
relative to
is purely algebraic. For instance, in the case of our previous example the cosets of
are determined by the values of the functions in
at 0 and 1, so that
is two-dimensional. Thus
, and y lies in the coset
if and only if
y(0) = a and y(1) = b
Moreover,
and each maximal negative subspace of is characterized by a number α, |α| ≤ 1, as b = αa.
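The algebra here is explicit. Identifying the coset of y with its boundary data (a,b) = (y(0), y(1)), and again taking Ly = y′ − y as a reconstruction so that Q(y,y) = |y(1)|² − |y(0)|², the induced form is

```latex
\hat{Q}\bigl((a,b),(a,b)\bigr) = |b|^{2} - |a|^{2},
```

and the subspace b = αa is negative exactly when |α| ≤ 1. It is maximal negative because any strictly larger subspace would be the whole two-dimensional boundary space and would contain (0,1), on which the form is positive.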
In the many-spatial-variable case, is no longer finite-dimensional and the problem is quite complex, as we shall see. Nevertheless, at each point χ of
we do have the bilinear form Qχ that acts on the k-dimensional vector space Zk. In effect, Friedrichs4 has characterized the maximal negative subspaces of
that correspond to local boundary conditions as the direct sum of the maximal negative subspaces relative to Qχ at each point χ of
.
By way of illustration, we consider the initial-value problem for the circular membrane
which we rewrite as before with η1 = u1, η2 = u2, η3 = ut,
If we parameterize the boundary point χ1 = cos σ, χ2 = sin σ by σ, 0 ≤ σ < 2π, then the local quadratic form can be written as
where ℜ[ ] denotes the real part of the bracketed expression. Each maximal negative subspace relative to Qσ is characterized by a number α(σ) with
and described by the relationship
η1 cos σ + η2 sin σ = −α(σ)η3
The maximal negative subspaces of Friedrichs are obtained by choosing α(σ) as a continuous function of σ with
α(2π−) = α(0)
It is instructive actually to characterize for this example and then exhibit certain maximal negative subspaces of
. To this end, we note that the mapping
y → [y,Liy]
carries in a one-to-one linear fashion onto the graph
for i = 0 and 1. Consequently, Eq. (4.29) can be written equivalently as
The advantage of Eq. (4.30) over Eq. (4.29) lies in the fact that the graphs of L1 and L0 are closed subspaces of H × H, and it is therefore possible to take the orthogonal complement of with respect to
, in symbols
. Now each element in
has a unique decomposition as an element in
plus an element in
, and from this it follows that the cosets of
are in one-to-one correspondence with the elements of
. The elements of
are just the pairs [z1,z2] in
for which
In this example we have D = Θ, so that M0 = −L0; see Eq. (4.26). It follows from Eq. (4.27) that z2 lies in and z1 = L1z2. On the other hand, [z1,z2] being in
means that z2 = L1z1, and hence it follows that
Accordingly, there is a one-to-one correspondence between the solutions of Eq. (4.31) and the elements of .
For the circular membrane, Eq. (4.31) is of the form
The first two of Eqs. (4.32) show that ζ12 = ζ21, and hence there is a function ϕ such that
ζ1 = ϕ1 and ζ2 = ϕ2
By combining this with Eqs. (4.32), we see that
Thus each element of is characterized by two solutions of Eq. (4.33), say ϕ and ζ3, such that [ϕ1,ϕ2,ζ3] lies in
. Now [ϕ1,ϕ2,ζ3] lies in
if and only if ϕ1,ϕ2, ϕ11 + ϕ22 = ϕ, ζ3, ζ31, and ζ32 are in
. If ϕ satisfies Eq. (4.33), then by Green’s theorem we have
where
r2 = (χ1)2 + (χ2)2
Hence the integrability conditions on ϕ and ζ3 are
Further, in the present case,
Thus only the boundary values of ∂ϕ/∂r and ζ3 enter into the quadratic form Q. Let us represent these boundary values by the Fourier series
Then, since both ϕ and ζ3 satisfy Eq. (4.33), the functions ∂ϕ/∂r and ζ3 can be represented by means of Bessel-function expansions in r < 1; these expansions are, respectively,
The integrability conditions become simply
where , which is of order |k|. Thus
can be represented as the direct product of two sequence spaces
and
with elements
and inner product
where
and
Here, of course, corresponds to the boundary values of
and corresponds to the boundary values of ζ3. The quadratic form
is given by
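The stated growth of the ρk can be checked numerically. The sketch below rests on an assumption: that Eq. (4.33) is the interior equation Δϕ = ϕ, so that the expansions in r < 1 involve the modified Bessel functions Ik(r) and ρk = Ik′(1)/Ik(1).

```python
from math import factorial

# A sketch under an assumption: if Eq. (4.33) is Laplacian(phi) = phi,
# the radial factors on r < 1 are the modified Bessel functions I_k(r),
# and rho_k = I_k'(1) / I_k(1).
def bessel_i(k, x, terms=40):
    """Modified Bessel function I_k(x) by its power series."""
    return sum((x / 2.0) ** (2 * m + k) / (factorial(m) * factorial(m + k))
               for m in range(terms))

def rho(k):
    # derivative recurrence: I_k'(x) = (I_{k-1}(x) + I_{k+1}(x)) / 2
    return 0.5 * (bessel_i(k - 1, 1.0) + bessel_i(k + 1, 1.0)) / bessel_i(k, 1.0)

# rho_k is of order |k|, as stated in the text
for k in (5, 10, 20):
    assert abs(rho(k) / k - 1.0) < 0.15
```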
Let α(σ) be a bounded measurable function on [0,2π) with nonnegative real part. We now show that the local boundary condition
defines the domain of a maximal dissipative operator L between L0 and L1. We note that a sequence {bk} in corresponds to a function
in L2(0,2π). By imposing the usual norm in L2(0,2π), namely,
we see that the mapping
{bk} → f
is continuous on to L2(0,2π), in fact of norm ρ0−½, and that the mapping
[f(σ) in L2(0,2π)] → [α(σ)f(σ) in L2(0,2π)]
is also continuous and of norm
||α|| = essential sup |α(σ)|
Finally, the mapping
is also continuous and of norm ρ0−½, so that the composite mapping
is of norm ρ0−1||α||. Consequently, the relationship (4.34) makes sense and defines a subspace in
. It is clear that
is closed, and since
it follows that is a negative subspace of
. If
were not maximal negative, then
would be properly contained in another negative sub-space
, and there would be a [ {ck}, {dk}] in
and hence an element
[{ek} = {ck} − α · {dk}, {0}]
in . Choose bk = ek/ρk so that {bk} lies in
and
In this case,
lies in and therefore
But
so that the above inequality cannot hold for all c. The supposition that is not maximal negative has thus led to a contradiction. It now follows from Theorem 4.4 that the boundary condition (4.34) defines a maximal dissipative operator L between L0 and L1. The condition (4.34) is more general than Friedrichs’ local boundary conditions in that continuity of α(σ) has been replaced by bounded measurability.
It is also easy to exhibit maximal negative subspaces of that do not correspond to local boundary conditions. For example, if ∂ϕ/∂r and ζ3 are represented as above by the Fourier coefficients {ak} and {bk}, respectively, then for each sequence {ωk} satisfying the conditions
for some M > 0, the subspace of
defined by the relationships
is closed and negative since the mapping
is obviously continuous, in fact of norm ≤M, and since for
we have
Again, suppose that were properly contained in a negative subspace
; then, as above, there would be an element
in . Choose bk = ρk−1 ek; then the element
lies in ,
and again it is impossible to have
for all c. This shows that is actually maximal negative and therefore defines the domain of a maximal dissipative operator between L0 and L1 as in Theorem 4.4.
4.7 Parabolic Partial Differential Equations
The Cauchy problem for the diffusion equation is associated with an entirely new set of concepts. Energy no longer plays a role; instead, we have to do with density and positivity, both of which are somewhat foreign to Hilbert-space theory. Consequently, one cannot expect the foregoing material to be especially suitable for the parabolic case. Indeed, the best results in this case have been obtained by W. Feller,2 E. Hille,5 and K. Yosida10 with both the space of continuous functions and the space of Lebesgue-integrable functions as settings. But the previous development can be adapted8 to take care of the diffusion equation, and this results in new information about the boundary-value problem when the domain is in a euclidean space of more than one dimension.
For notational convenience, we shall consider the initial-value problem with domain G in E2:
where a and b are positive continuously differentiable functions of χ in , while c is merely nonpositive and continuous in
. When we set
η0 = u, η1 = u1, η2 = u2
we can write Eq. (4.36) as
It is clear that this is not of the form (4.20). The right-hand member of the system (4.37) is again of the form (4.24), however, with
Furthermore,
and hence the differential operator L can be treated as before. We proceed, therefore, to define the minimal operator L0 and the maximal operator L1 as in Sec. 4.5; Theorem 4.4 then furnishes us with a characterization of all the maximal dissipative operators L that lie between L0 and L1.
When we have obtained such a maximal dissipative operator L, our next step is somehow to restrict L to the subspace of first components, namely,
H1 = [η0,0,0]
in order to recover an operator of the form
Kη0 = (aη01)1 + (bη02)2 + cη0
To this end, we first define the restriction L′ ⊂ L by
For y in , we see that
η1 = η01, η2 = η02
L′y = [Kη0,0,0]
There is still the objection that L′ acts on H and not simply on H1. With this in mind, we set P1 equal to the projection
P1[η0,η1,η2] = [η0,0,0]
and define the retraction of L to H1 as
It is easily shown that L″ is uniquely determined by P1y alone and hence is well defined. In fact, suppose that y and z lie in and that
P1y = P1z
Then
w = y − z = [ω0,ω1,ω2]
lies in , and
P1w = [ω0,0,0] = 0
Thus, w is orthogonal to L′w and
0 = (L′w,w) + (w,L′w) = Q(w,w) + (Dw,w)
by Eq. (4.25). Since w also belongs to , and as such has nonpositive boundary conditions [that is, Q(w,w) ≤ 0], we see that
from which we can conclude that
ω1 = 0 = ω2
and hence that
w = 0 and L′w = 0
It follows that L″ is a well-defined operator on H1 to itself and of the form K if we write the argument
P1y = [η0,0,0]
as simply η0. In this case, for
η0 = P1y
and y in , we have
(L″η0,η0) + (η0,L″η0) = (L′y,y) + (y,L′y) ≤ 0
so that L″ is dissipative. Actually, it can be shown that L″ is maximal dissipative with a dense domain in H1, so that L″ generates a semigroup of contraction operators [S1(t)] on H1. Thus for , the function
u(·,t) = S1(t)f
solves the initial-value problem (4.36). Moreover, there is a one-to-one correspondence between the maximal dissipative operators L that lie between L0 and L1 and the retractions defined in this way.
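The contraction property of the semigroup S1(t) can be illustrated numerically. The following one-dimensional sketch is an analogue, not the two-dimensional problem of the text: it takes Kη0 = η0″ with Dirichlet conditions at both endpoints and advances the solution by the backward-Euler step (I − hK)⁻¹, the resolvent whose existence the generation theorem requires; the L² norm of the solution never increases.

```python
import numpy as np

# One-dimensional analogue: K eta0 = eta0'' with Dirichlet conditions at
# 0 and 1 is dissipative, and the backward-Euler step (I - hK)^{-1} --
# the resolvent of K -- approximates the contraction semigroup S1(t).
n, h = 200, 1e-3
dx = 1.0 / (n + 1)
x = np.linspace(dx, 1.0 - dx, n)                     # interior grid points
K = (np.diag(-2.0 * np.ones(n))
     + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / dx**2          # discrete d^2/dx^2
step = np.linalg.inv(np.eye(n) - h * K)              # backward-Euler step

u = np.sin(3 * np.pi * x) + 0.5 * np.sin(7 * np.pi * x)   # initial data f
norms = [np.sqrt(dx) * np.linalg.norm(u)]
for _ in range(50):
    u = step @ u                                     # one semigroup step
    norms.append(np.sqrt(dx) * np.linalg.norm(u))

# contraction: the L2 norm is nonincreasing in t
assert all(norms[i + 1] <= norms[i] + 1e-12 for i in range(50))
```

Since the discrete K is symmetric negative definite, every eigenvalue of (I − hK)⁻¹ lies in (0,1), which is the discrete expression of the contraction property ‖S1(t)f‖ ≤ ‖f‖.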
We note that the above procedure does not furnish us with all of the maximal dissipative operators that lie between the retractions of L0 and L1. We do, however, obtain all the usual operators associated with the system (4.36) and then some.
In the special case
a = 1, b = 1, G = [(χ1,χ2); (χ1)2 + (χ2)2 < 1]
the resulting operators L are those considered in the example at the end of Sec. 4.6, with η3 now replaced by η0. For the restricted operator L″, we have
η1 = η01, η2 = η02
so that the function ϕ of the example can now be replaced by η0; the boundary data are therefore determined in the present case by η0 and ∂η0/∂r. The relationship (4.34) becomes
where again α(σ) is a bounded measurable function with nonnegative real part, and this relationship now defines the domain of a maximal dissipative operator L″. Likewise, the equations (4.35) furnish a nonlocal relationship between η0 and ∂η0/∂r and also define the domain of a maximal dissipative operator of type L″.
REFERENCES
1. Beckenbach, E. F. (ed.), “Modern Mathematics for the Engineer,” First Series, McGraw-Hill Book Company, Inc., New York, 1956.
2. Feller, W., The Parabolic Differential Equations and the Associated Semi-groups of Transformations, Ann. of Math., ser. 2, vol. 55, pp. 468–519, 1952.
3. Friedrichs, K. O., Symmetric Hyperbolic Linear Differential Equations, Comm. Pure Appl. Math., vol. 7, pp. 345–392, 1954.
4. ——, Symmetric Positive Linear Differential Equations, Comm. Pure Appl. Math., vol. 11, pp. 333–418, 1958.
5. Hille, Einar, The Abstract Cauchy Problem and Cauchy’s Problem for Parabolic Differential Equations, J. Analyse Math., vol. 3, pp. 81–196, 1954.
6. —— and R. S. Phillips, “Functional Analysis and Semi-groups,” Amer. Math. Soc. Colloquium Publ., vol. 31, American Mathematical Society, New York, 1957.
7. Phillips, R. S., Dissipative Operators and Hyperbolic Systems of Partial Differential Equations, Trans. Amer. Math. Soc., vol. 90, pp. 193–254, 1959.
8. ——, Dissipative Operators and Parabolic Partial Differential Equations, Comm. Pure Appl. Math., vol. 12, pp. 249–276, 1959.
9. Riesz, F., and B. Sz.-Nagy, “Functional Analysis,” Frederick Ungar Publishing Co., New York, 1955.
10. Yosida, K., Semi-group Theory and the Integration Problem of Diffusion Equations, Proc. Internat. Cong. Math., vol. 2, pp. 1–16, 1954.
11. ——, “Lectures on Semi-group Theory and Its Applications to Cauchy’s Problem in Partial Differential Equations,” Tata Institute of Fundamental Research, Bombay, 1957.