2
Operational Methods for Separable Differential Equations
BERNARD FRIEDMAN
PROFESSOR OF MATHEMATICS
UNIVERSITY OF CALIFORNIA, BERKELEY
2.1 Introduction
From the earliest development of calculus, mathematicians have attempted to apply algebraic manipulations to differential operators. They have tried to add, multiply, and divide operators; to expand them in infinite series; and, in short, to treat them as if they were algebraic quantities. Leibnitz7 himself, as early as 1695, was inspired enough to introduce the suggestive notation Dy for the first derivative, D²y for the second derivative, and so on. Because of the algebraic properties of the operator D, it was easy to derive the formula for the nth derivative of a product of two functions u and v. The procedure might have been as follows:
The well-known formula for the derivative of uv is
D(uv) = uDv + vDu
We introduce two operators D₁ and D₂, where D₁ operates only on u and D₂ operates only on v; then this formula may be written as

D(uv) = D₁uv + D₂uv = (D₁ + D₂)(uv)

Consequently, we have

D = D₁ + D₂

and naturally

Dⁿ = (D₁ + D₂)ⁿ

By applying this result to the product uv and expanding by the binomial theorem, we get

Dⁿ(uv) = uDⁿv + nDu Dⁿ⁻¹v + [n(n − 1)/2!] D²u Dⁿ⁻²v + ⋯ + vDⁿu

which is Leibnitz’s formula for the nth derivative of a product.
In 1859, George Boole1 published a book on differential equations containing a calculus of operators. Some parts of this calculus are still taught in elementary courses in differential equations. With the methods he developed, it is possible to find particular solutions of ordinary differential equations. For example, consider the equation

d²u/dx² + k²u = x²     (2.1)

We write D for the differentiation operator, and then this equation can be represented as follows:

(D² + k²)u = x²

If we may treat D as an algebraic quantity, the solution of the foregoing equation is

u = (D² + k²)⁻¹x²     (2.2)

To interpret this result, we use ordinary algebraic division to obtain

(D² + k²)⁻¹ = k⁻² − k⁻⁴D² + k⁻⁶D⁴ − ⋯

By substituting in Eq. (2.2), we get

u = (x/k)² − 2k⁻⁴

since D⁴x² = 0. It is easy to verify that the polynomial (x/k)² − 2k⁻⁴ is indeed a solution of Eq. (2.1).
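The verification is a one-line computation in a computer algebra system; a quick sketch in Python with sympy:

```python
import sympy as sp

x, k = sp.symbols('x k', positive=True)

# Candidate particular solution produced by the operator expansion
u = x**2 / k**2 - 2 / k**4

# Substitute into the left-hand side of (D^2 + k^2)u = x^2
lhs = sp.diff(u, x, 2) + k**2 * u

assert sp.simplify(lhs - x**2) == 0
```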
Such remarkable successes encouraged further attempts at developing a calculus of operators. If D represents differentiation, then it was natural to let D⁻¹ represent integration, because with that convention we have

DD⁻¹u = u     (2.3)

However, when we consider that

D⁻¹Du = u + c     (2.4)

where c is an arbitrary constant of integration, we run into difficulties (see Sec. 1.5). One difficulty is that from Eq. (2.3) we should write DD⁻¹ = 1, but from Eq. (2.4) we see that D⁻¹D ≠ 1. Another difficulty is that we obtain particular solutions and not the complete solution of the equation. Boole of course recognized these difficulties and introduced notation such as D⁻¹0 to represent the constant of integration. This notation is unsatisfactory because it contradicts the well-known algebraic fact that, when any quantity is multiplied by zero, the product is zero. These difficulties, among others, caused Boole’s treatment to be neglected.
2.2 Heaviside Theory
The theory of operators was completely revitalized by O. Heaviside6 in a series of papers in the 1890s. His first contribution was to interpret D⁻¹ not as an indefinite integral, but as a definite integral, so that for him D⁻¹u meant

∫₀ᵗ u(τ) dτ

With this interpretation, we still have

DD⁻¹u = u

but now

D⁻¹Du = u(t) − u(0) = u(t)

if u(0) = 0. Therefore, if u(0) = 0, we may write

DD⁻¹ = D⁻¹D = 1
By using this definition of D−1, Heaviside showed that algebraic methods could be used to find solutions of differential equations satisfying given initial conditions.
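Heaviside’s definite-integral interpretation is easy to check symbolically; a sketch in Python with sympy (the sample functions sin t and cos t are arbitrary choices for illustration):

```python
import sympy as sp

t, tau = sp.symbols('t tau', nonnegative=True)

D = lambda f: sp.diff(f, t)                                   # differentiation
Dinv = lambda f: sp.integrate(f.subs(t, tau), (tau, 0, t))    # definite integral from 0 to t

u = sp.sin(t)                                # u(0) = 0
assert sp.simplify(D(Dinv(u)) - u) == 0      # D D^{-1} u = u
assert sp.simplify(Dinv(D(u)) - u) == 0      # D^{-1} D u = u, since u(0) = 0

v = sp.cos(t)                                # v(0) = 1, not 0
assert sp.simplify(Dinv(D(v)) - v) != 0      # here D^{-1} D v = v - v(0)
```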
We shall illustrate Heaviside’s algebraic approach. Let us follow Heaviside’s notation by using p to represent differentiation and p⁻¹ to represent the definite integral from 0 to t. From the usual rule for differentiating a product, we see that

p[e^{at}f(t)] = e^{at}pf(t) + ae^{at}f(t) = e^{at}(p + a)f(t)

or

e^{−at}pe^{at}f(t) = (p + a)f(t)

If we omit the function f(t), this result may be written in the following operational form:

p + a = e^{−at}pe^{at}     (2.5)
Notice that, by using Eq. (2.5) repeatedly, we get

(p + a)² = e^{−at}pe^{at}e^{−at}pe^{at} = e^{−at}p²e^{at}

and

(p + a)ⁿ = e^{−at}pⁿe^{at}

It is easy to generalize this result to obtain

ϕ(p + a) = e^{−at}ϕ(p)e^{at}

if ϕ(p) is an analytic function of p. In particular,

(p + a)⁻¹ = e^{−at}p⁻¹e^{at}     (2.6)
We shall use Eq. (2.6) to find the solution of the differential equation

du/dt + au = e^{−t}     (2.7)

that vanishes when t = 0. We write the equation as

(p + a)u = e^{−t}

and then, by Eq. (2.6), assuming a ≠ 1,

u = (p + a)⁻¹e^{−t} = e^{−at}p⁻¹e^{at}e^{−t} = e^{−at}p⁻¹e^{(a−1)t} = e^{−at}[e^{(a−1)t} − 1]/(a − 1) = (e^{−t} − e^{−at})/(a − 1)

The verification that this is the correct solution of the differential equation will be left to the reader.
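That verification can be carried out with a computer algebra system; a sketch in Python with sympy (a is kept symbolic, assumed positive and different from 1):

```python
import sympy as sp

t = sp.symbols('t', nonnegative=True)
a = sp.symbols('a', positive=True)   # assume a != 1 so the formula makes sense

u = (sp.exp(-t) - sp.exp(-a*t)) / (a - 1)

# u satisfies du/dt + a u = e^{-t} ...
assert sp.simplify(sp.diff(u, t) + a*u - sp.exp(-t)) == 0
# ... and vanishes when t = 0
assert u.subs(t, 0) == 0
```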
Heaviside’s successes in the use of his techniques were remarkable. Many objections, however, were advanced against his methods, especially by professional mathematicians. One objection was that he did not have a systematic way of evaluating functions of the operator p, but instead treated each case individually by means of some trick adapted to it. Another and more serious objection was that his methods were frankly experimental. If the methods achieved the desired results, he continued to use them; if they did not work, he modified them until they did. This meant that he could not really explain the principles underlying his methods. Heaviside’s famous answer to this argument was that he didn’t stop eating just because he didn’t understand the process of digestion (see Chap. 3). The rebuttal, however, is that even though we do not stop eating, we try to learn how to avoid eating anything that may upset our digestion, and some of Heaviside’s methods are still indigestible.
The first objection to Heaviside’s methods, namely, that there is no systematic method for evaluating a function of an operator, was overcome when Bromwich,2 in 1919, introduced the contour integral that inverts the integral now known as the Laplace transform. By this integral, to any given function F(p) of the operator p there corresponds a function f(t) defined by the contour integral

f(t) = (1/2πj) ∫ e^{pt}F(p) dp/p     (2.8)

where the integration is along a vertical line in the complex p plane to the right of all singularities of the integrand. For example, the operational solution of

d²u/dt² + a²u = 1     (2.9)

is

u = (p² + a²)⁻¹ · 1

and by Eq. (2.8) this corresponds to

u = (1 − cos at)/a²

Thus we obtain the solution u of Eq. (2.9) that satisfies the conditions u(0) = u′(0) = 0.
One final step was necessary before a rigorous theory of some of Heaviside’s work could be obtained. The need for such a step may be seen by referring to Eq. (2.7). The operational solution of Eq. (2.7) is

u = (p + a)⁻¹e^{−t}     (2.10)

Bromwich’s integral cannot be used here because the right-hand side of Eq. (2.10) is not solely a function of p but is a function of p acting on a function of t. Wagner,10 in 1916, and Carson,3 in 1922, however, introduced the Laplace transform that established a correspondence between a function f(t) and a function F(p) as follows:

F(p) = ∫₀^∞ e^{−pt}f(t) dt

For Eq. (2.7), let f(t) = e^{−t}, so that F(p) = 1/(p + 1); the solution of Eq. (2.7) can then be obtained from the operational form (p + a)⁻¹(p + 1)⁻¹ by means of Bromwich’s integral. We find

u = (e^{−t} − e^{−at})/(a − 1)
In practical applications, the operational technique is not used. Instead, the Laplace transform of Eq. (2.7) is taken. If we let ū(p) denote the Laplace transform of u, we get immediately

(p + a)ū = 1/(p + 1)

and therefore

ū = 1/[(p + a)(p + 1)]

By using Bromwich’s integral or a table of Laplace transforms, we again obtain the result

u = (e^{−t} − e^{−at})/(a − 1)
The work of Bromwich, Carson, and Wagner thus provided a justification for a large part of Heaviside’s work, but there still are parts that cannot be explained. It should be noticed that the Laplace transform and the Heaviside calculus can be applied principally to differential equations containing time derivatives and not to differential equations containing space derivatives. The basic reason for this is that Heaviside’s method was designed for differential equations involving initial conditions and not for those involving boundary conditions.
2.3 Domain of an Operator
It is the purpose of this chapter to develop an operational calculus of the Heaviside type for problems involving space derivatives and boundary conditions.5 For this purpose, it is necessary to look very carefully at the concept of an operator. Notice, first, that just as a pianist must have a piano on which to play, an operator must have something on which to operate. The set of functions on which the operator acts will be called the domain of the operator.
For illustrative purposes, let us consider the operator L = −d²/dx² on the interval 0 ≤ x ≤ 1. What kind of functions u(x) could be in its domain? Clearly, the functions u(x) must have at least a second derivative. For mathematical convenience, it is useful to restrict the domain of L further by requiring that

∫₀¹ u(x)² dx

be finite, that u″(x) exist, and that

∫₀¹ u″(x)² dx

be finite.
Consider the differential equation

−d²u/dx² + k²u = f(x)     (2.11)

We may write it as

(L + k²)u = f(x)

and then we are tempted to divide by L + k² to get

u = (L + k²)⁻¹f(x)

But we know that the solution of Eq. (2.11) is not unique. To make it unique, we may specify that u(x) satisfy two conditions—e.g., initial conditions such as u(0) = 1 and u′(0) = 2, or boundary conditions such as u(0) = 1 and u(1) = 2. Because our purpose is to discuss operators with boundary conditions, we restrict the domain of L still further by requiring that u(x) in the domain of L must satisfy a sufficient number of boundary conditions to make the solution of Eq. (2.11) unique. There is another restriction that arises from the further requirement that both L and L⁻¹ acting on the function zero should give zero. To satisfy this requirement, we must confine our discussion to homogeneous boundary conditions only. Later, we shall show how nonhomogeneous boundary conditions may be included.
Let us consider an arbitrary differential operator L defined over the interval (a,b). For mathematical convenience, we shall assume that the operator acts only on a collection of real-valued functions u(x), defined for a ≤ x ≤ b, that are such that Lu exists and the integrals

∫ₐᵇ u(x)² dx and ∫ₐᵇ (Lu)² dx

are finite. The domain of the operator will be those functions of this collection that satisfy a pair of homogeneous boundary conditions. The following are examples of such boundary conditions:

u(a) = u(b) = 0
u′(a) = u′(b) = 0
u′(a) = α₁u(a), u′(b) = α₂u(b)

where α₁ and α₂ are arbitrary scalars.
2.4 Linear Operators
The domains referred to above have the very important property of being linear vector spaces.8 A linear vector space of functions is defined as a collection of functions such that, if u1(x) and u2(x) are any two functions in the collection, then any linear combination αu1(x) + βu2(x), where α and β are scalars, is also in the collection. If we had used nonhomogeneous boundary conditions, the domains would not have been linear vector spaces. We shall see later that the use of this concept of linear vector space enables us to give a more intuitive and geometric description of the properties of the operator.
We shall also restrict our attention to the consideration of linear differential operators,4 i.e., operators L that act on functions u(x) in such a way that
L[αu(x)] = αLu(x)    and    L[u₁(x) + u₂(x)] = Lu₁(x) + Lu₂(x)
Examples of linear differential operators are d/dx, −d²/dx², and

p(x) d²/dx² + q(x) d/dx + r(x)

When this last operator acts on a function u(x), the result is

p(x)u″(x) + q(x)u′(x) + r(x)u(x)

Note that an operator A whose action involves u(x) nonlinearly—for example, Au = u(x)²—is not a linear operator, because

A[αu(x)] = α²u(x)² = α²Au ≠ αAu

unless α² = α.
Later, we shall also consider partial differential operators, e.g.,

−∂²/∂x² − ∂²/∂y²
To summarize, hereafter when using the term “operator” we shall mean a linear differential operator, usually of the second order, together with a linear vector space as its domain. Both the differential operator and the domain are required in the definition of the operator. For us, the same differential operator with different domains will be considered as different operators. For example, the second derivative on a domain of functions u(x) such that u(0) = u(1) = 0 is not the same as the second derivative on a domain of functions u(x) such that u′(0) = u′(1) = 0. The failure to distinguish between two such operators has caused a great deal of confusion and even error in the applications of the calculus of operators.
2.5 Functions of Operators
Now that we have a suitable definition of operator, our next aim will be to define functions of an operator. If L represents an operator as defined above, what will be meant by operators such as L² or e^{−αL} or L^{½}? We proceed first to define powers of L. Suppose that u is in the domain of L; then v = Lu exists. If v is also in the domain of L, then Lv = w exists. We put

w = Lv = L²u
Thus we see that, if u is such that u and Lu are in the domain of L, then L²u is defined as the result of the repeated application of the operator L on u. For example, if L = −d²/dx² with the boundary conditions u(0) = u(1) = 0, then we can define L²u for the function u₁ = sin πx but not for the function u₂ = x(1 − x). Note that both u₁ and u₂, and also Lu₁ = π² sin πx, are in the domain of L, but Lu₂ = 2 is not; consequently L²u₁ = π⁴ sin πx, whereas L²u₂ is not defined.
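The computations behind this example can be checked symbolically; a sketch in Python with sympy:

```python
import sympy as sp

x = sp.symbols('x')
L = lambda u: -sp.diff(u, x, 2)    # the operator L = -d^2/dx^2

u1 = sp.sin(sp.pi * x)             # in the domain: u1(0) = u1(1) = 0
u2 = x * (1 - x)                   # also in the domain

v1, v2 = L(u1), L(u2)
assert sp.simplify(v1 - sp.pi**2 * u1) == 0   # L u1 = pi^2 u1, still in the domain
assert v2 == 2                                 # L u2 = 2, which does not vanish at 0 and 1

# Since L u1 is in the domain, L^2 u1 makes sense:
assert sp.simplify(L(v1) - sp.pi**4 * u1) == 0
```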
It is clear that this process may be generalized to higher powers of L. If u, Lu, L²u, …, Lⁿ⁻¹u are all in the domain of L, then Lⁿu is defined as follows:

Lⁿu = L(Lⁿ⁻¹u)

Naturally, then, if p(L) is a polynomial in L of the nth degree, say

p(L) = a₀ + a₁L + a₂L² + ⋯ + aₙLⁿ

we define

p(L)u = a₀u + a₁Lu + a₂L²u + ⋯ + aₙLⁿu
We may go even further and define analytic functions of L. Suppose ϕ(λ) is an analytic function of λ having a convergent power-series expansion

ϕ(λ) = a₀ + a₁λ + a₂λ² + ⋯

Then if Lⁿu is in the domain of L for all values of n, we put

ϕ(L)u = a₀u + a₁Lu + a₂L²u + ⋯

provided that the resulting series of functions converges.
2.6 Eigenfunctions and Self-adjoint Operators
The foregoing method of defining functions of the operator, although natural, is nevertheless not the most convenient. We shall see that a better way to define the functions of the operator is to begin with the concept of an eigenfunction of the operator. An eigenfunction of an operator L is a function ϕ(x), not identically zero, in the domain of L such that L acting on ϕ(x) leaves the function unchanged except for multiplication by a scalar λ,
Lϕ = λϕ
The scalar λ is called an eigenvalue of L; cf. Ref. 9 for the analogous concept in matrix theory. For example, if L₁ is −d²/dx² on the domain of functions u(x) such that u(0) = u(1) = 0, then the functions

ϕₙ(x) = αₙ sin nπx

where n = 1, 2, 3, … and the αₙ are constants ≠ 0, are eigenfunctions of L₁ with corresponding eigenvalues n²π², because

L₁ϕₙ = −d²/dx²(αₙ sin nπx) = n²π²αₙ sin nπx = n²π²ϕₙ

In many cases, the eigenfunctions are mutually orthogonal; i.e., if ϕₙ(x) and ϕₘ(x) are eigenfunctions of L, then

∫ₐᵇ ϕₙ(x)ϕₘ(x) dx = 0

if n ≠ m. For example,

∫₀¹ sin nπx sin mπx dx = 0    n ≠ m
We shall give a characterization of the class of operators that have mutually orthogonal eigenfunctions. An operator L defined over the interval (a,b) is called self-adjoint if, for all functions u(x) and v(x) in the domain of L, we have

∫ₐᵇ (vLu − uLv) dx = 0     (2.12)

Notice that this is a sort of generalized integration by parts. For example, the operator L₁ we have considered previously is self-adjoint because, by using integration by parts twice, we get

∫₀¹ (vL₁u − uL₁v) dx = ∫₀¹ (uv″ − vu″) dx = [uv′ − vu′]₀¹

If u(x) and v(x) are in the domain of L₁, then

u(0) = u(1) = v(0) = v(1) = 0

and therefore

[uv′ − vu′]₀¹ = 0

which shows that Eq. (2.12) is satisfied when L = L₁.
We now prove that, for a self-adjoint operator, eigenfunctions corresponding to different eigenvalues are mutually orthogonal:

THEOREM. Suppose that ϕₙ(x) is an eigenfunction of a self-adjoint operator L with eigenvalue λₙ and that ϕₘ(x) is also an eigenfunction with eigenvalue λₘ; then

∫ₐᵇ ϕₙ(x)ϕₘ(x) dx = 0     (2.13)

provided that λₙ ≠ λₘ.

Proof. From the definition of eigenfunctions we have

Lϕₙ = λₙϕₙ    Lϕₘ = λₘϕₘ

If we multiply the first equation by ϕₘ and the second equation by ϕₙ, subtract, and then integrate, we get

∫ₐᵇ (ϕₘLϕₙ − ϕₙLϕₘ) dx = (λₙ − λₘ) ∫ₐᵇ ϕₙϕₘ dx

Since L is self-adjoint, Eq. (2.12) shows that the left-hand side of the above equation is zero; therefore, since λₙ ≠ λₘ, we have

∫ₐᵇ ϕₙϕₘ dx = 0

as desired.
Since αₙϕₙ(x), where αₙ is a constant ≠ 0, is an eigenfunction if ϕₙ(x) is, we can always normalize the eigenfunctions, i.e., so choose the constant multiplying factor αₙ that

∫ₐᵇ ϕₙ(x)² dx = 1

For example, for the eigenfunctions of L₁ we have

∫₀¹ (αₙ sin nπx)² dx = αₙ²/2

and therefore αₙ = √2; consequently, if we take the eigenfunctions of L₁ as

ϕₙ(x) = √2 sin nπx

they will be normalized.

Hereafter, we shall assume that the eigenfunctions are both orthogonal and normalized, so that both Eq. (2.13) and the following equation hold:

∫ₐᵇ ϕₙ(x)² dx = 1     (2.14)

The set of eigenfunctions ϕ₁(x), ϕ₂(x), … will thus correspond exactly to a set of mutually perpendicular unit vectors.
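Orthogonality and normalization of the eigenfunctions of L₁ can be confirmed directly; a sketch in Python with sympy (the indices 2, 3, 5 are arbitrary sample choices):

```python
import sympy as sp

x = sp.symbols('x')
phi = lambda n: sp.sqrt(2) * sp.sin(n * sp.pi * x)   # normalized eigenfunctions of L1

ip = lambda f, g: sp.integrate(f * g, (x, 0, 1))     # inner product on (0, 1)

assert ip(phi(2), phi(5)) == 0   # orthogonality, Eq. (2.13)
assert ip(phi(3), phi(3)) == 1   # normalization, Eq. (2.14)
```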
2.7 Spectral Representation
Self-adjoint operators are important because in many cases the eigenfunctions of such an operator form a basis for the space of functions under consideration; i.e., any function of the space can be written as a sum, perhaps with an infinite number of terms, of linear multiples of the eigenfunctions. For example, we have seen that for the operator L₁ the normalized eigenfunctions are √2 sin nπx, where n = 1, 2, 3, …. Let u(x) be any function such that

∫₀¹ u(x)² dx < ∞

Then there exists an expansion

u(x) = Σₙ cₙ√2 sin nπx     (2.15)

This equation is to be understood in the following sense: The partial sums of the series on the right approximate the function on the left in the mean-square sense, so that

∫₀¹ [u(x) − Σₙ₌₁ᴺ cₙ√2 sin nπx]² dx → 0    as N → ∞

Because the eigenfunctions are orthogonal and normalized, it is easy to find an explicit formula for cₙ in Eq. (2.15). Multiply this equation by the normalized eigenfunction √2 sin nπx and integrate. We find

cₙ = √2 ∫₀¹ u(x) sin nπx dx     (2.16)

by the use of Eqs. (2.13) and (2.14) in this special case. Note that Eq. (2.15) is just the Fourier sine series for the function u(x) and Eq. (2.16) is the formula for the nth coefficient.
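The mean-square convergence of such a sine series can be observed numerically; a sketch in Python (the test function x(1 − x), the grid size, and the number of terms are arbitrary choices):

```python
import numpy as np

x = np.linspace(0, 1, 2001)
u = x * (1 - x)                       # a sample function vanishing at x = 0 and x = 1

def integrate(f):                     # trapezoidal rule on the grid
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

N = 10
partial = np.zeros_like(x)
for n in range(1, N + 1):
    phi = np.sqrt(2) * np.sin(n * np.pi * x)
    c_n = integrate(u * phi)          # Eq. (2.16)
    partial += c_n * phi              # partial sum of Eq. (2.15)

mse = integrate((u - partial) ** 2)   # mean-square error
assert mse < 1e-6
```

Even ten terms already reproduce this smooth function to within a mean-square error far below the assertion threshold.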
For a general self-adjoint operator L, the results will be very much the same as they are for L₁. Suppose that ϕ₁(x), ϕ₂(x), … are the normalized eigenfunctions for L corresponding to the eigenvalues λ₁, λ₂, …, respectively. If u(x) is a function in the domain of L, then u(x) can be expanded, in the mean-square sense, as follows:

u(x) = c₁ϕ₁(x) + c₂ϕ₂(x) + ⋯

where, because of Eqs. (2.13) and (2.14),

cₙ = ∫ₐᵇ u(x)ϕₙ(x) dx

Furthermore, by the definition of an eigenfunction, we have

Lu = λ₁c₁ϕ₁(x) + λ₂c₂ϕ₂(x) + ⋯     (2.17)

A formula such as (2.17), which uses an eigenfunction expansion to represent the action of an operator L on an arbitrary function in its domain, will be called the spectral representation of the operator L.
Once we have the spectral representation (2.17), it is easy to define functions of L. For example,

L²u = λ₁²c₁ϕ₁(x) + λ₂²c₂ϕ₂(x) + ⋯

Similarly, for any polynomial p(λ),

p(L)u = p(λ₁)c₁ϕ₁(x) + p(λ₂)c₂ϕ₂(x) + ⋯

We make a natural extension of this formula. If q(λ) is a function of λ that is defined for λ = λ₁, λ₂, …, then we define q(L) as follows:

q(L)u = q(λ₁)c₁ϕ₁(x) + q(λ₂)c₂ϕ₂(x) + ⋯     (2.18)

Of course, it is understood that this definition applies only when the right-hand side converges in the mean-square sense.
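Equation (2.18) translates directly into a computation; a numerical sketch in Python (the choice q(λ) = e^{−αλ}, the value of α, and the test function are assumptions made only for illustration):

```python
import numpy as np

# Apply q(L1)u via the spectral representation, Eq. (2.18):
# q(L1)u = sum over n of q(n^2 pi^2) c_n sqrt(2) sin(n pi x),
# here with q(lambda) = exp(-alpha * lambda).
alpha = 0.01
x = np.linspace(0, 1, 1001)

def integrate(f):                        # trapezoidal rule on the grid
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

# A test function built from the first two eigenfunctions, so the
# exact answer is known from the eigenvalues pi^2 and 4 pi^2.
u = 3*np.sqrt(2)*np.sin(np.pi*x) - np.sqrt(2)*np.sin(2*np.pi*x)

qLu = np.zeros_like(x)
for n in range(1, 20):
    phi = np.sqrt(2) * np.sin(n * np.pi * x)
    c_n = integrate(u * phi)
    qLu += np.exp(-alpha * (n*np.pi)**2) * c_n * phi

exact = (3*np.exp(-alpha*np.pi**2)*np.sqrt(2)*np.sin(np.pi*x)
         - np.exp(-alpha*(2*np.pi)**2)*np.sqrt(2)*np.sin(2*np.pi*x))
assert np.max(np.abs(qLu - exact)) < 1e-6
```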
As an illustration, consider again the operator L₁. What is to be meant by the operator

e^{−αL₁}

where α is a positive scalar? Since the eigenfunctions of L₁ are √2 sin nπx and correspond to the eigenvalues n²π², n = 1, 2, …, we find from Eq. (2.18) that

e^{−αL₁}u = Σₙ e^{−αn²π²}cₙ√2 sin nπx

where

cₙ = √2 ∫₀¹ u(x) sin nπx dx
2.8 A Partial Differential Equation
We shall now use these ideas concerning spectral representations and functions of operators to solve the following partial differential equation, which is the mathematical formulation of the problem of determining the equilibrium distribution of temperature T(x,y) in a rectangular plate of length 1 and width h if three sides are kept at temperature zero but the fourth side has a prescribed temperature distribution f(x):
Find a function T(x,y) such that

∂²T/∂x² + ∂²T/∂y² = 0    0 < x < 1, 0 < y < h     (2.19)

and T(0,y) = T(1,y) = T(x,h) = 0 but T(x,0) = f(x).
Equation (2.19) may be represented in the following operational form:

(L₁ + L₂)T = 0

where L₁ = −∂²/∂x² on the domain of functions u(x,y) that as functions of x satisfy the boundary conditions of vanishing for both x = 0 and x = 1, and where L₂ = −∂²/∂y² on the domain of functions u(x,y) that as functions of y satisfy a zero boundary condition for y = h and a given nonzero boundary condition at y = 0. Notice that L₂ is not a linear operator, because the boundary condition for y = 0 is not homogeneous. This suggests writing Eq. (2.19) as follows:

∂²T/∂y² = L₁T     (2.20)

where T = f(x) for y = 0 and T = 0 for y = h.
Since in Eq. (2.20) the operator L₁ has no effect on the y dependence of T but acts on T only as a function of x and has its boundary conditions dependent only on the values of x, we may treat L₁ in Eq. (2.20) as if it were a constant independent of y. If L₁ were a constant, however, the solution of Eq. (2.20) would be

T = Ae^{ky} + Be^{−ky}

where k = L₁^{½} and where A and B are quantities independent of y. The boundary conditions at y = 0 and y = h imply that

A + B = f(x)    Ae^{kh} + Be^{−kh} = 0

When these equations are solved for A and B and the expression for T is rearranged slightly, we find that

T = [sinh k(h − y)/sinh kh] f(x)     (2.21)

is the solution of Eq. (2.20) satisfying the prescribed boundary conditions.
In Eq. (2.21), the function T(x,y) is represented as a function of a differential operator in x acting on a function of x. It would be desirable to interpret this function of the operator and so obtain T as an explicit function of x and y. This interpretation can be achieved by using the spectral representation of the operator L₁. We have seen that the eigenfunctions of L₁ are the functions √2 sin nπx, for n = 1, 2, …, and that they correspond to the eigenvalues n²π². By using Eqs. (2.15) and (2.16), we see that

f(x) = Σₙ cₙ√2 sin nπx    where cₙ = √2 ∫₀¹ f(ξ) sin nπξ dξ

and then by using Eq. (2.18) we find that

T(x,y) = Σₙ cₙ [sinh nπ(h − y)/sinh nπh] √2 sin nπx

It is easy to verify that this is the desired solution of Eq. (2.19).
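That verification can be delegated to a computer algebra system, one term of the series at a time; a sketch in Python with sympy:

```python
import sympy as sp

x, y, h = sp.symbols('x y h', positive=True)
n = sp.symbols('n', positive=True, integer=True)

# The general term of the series solution for T(x,y)
T_n = sp.sinh(n*sp.pi*(h - y)) / sp.sinh(n*sp.pi*h) * sp.sqrt(2)*sp.sin(n*sp.pi*x)

# Harmonic: T_xx + T_yy = 0, Eq. (2.19)
assert sp.simplify(sp.diff(T_n, x, 2) + sp.diff(T_n, y, 2)) == 0

# Homogeneous boundary conditions on three sides
assert T_n.subs(x, 0) == 0
assert sp.simplify(T_n.subs(x, 1)) == 0   # sin(n pi) = 0 for integer n
assert T_n.subs(y, h) == 0

# At y = 0 each term reduces to the eigenfunction itself
assert sp.simplify(T_n.subs(y, 0) - sp.sqrt(2)*sp.sin(n*sp.pi*x)) == 0
```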
2.9 Types of Spectral Representation
So far, we have considered the spectral representation of only those operators having a set of eigenfunctions that can form a basis for the space. For many operators, no eigenfunctions exist, and even if they do exist, there may not be enough of them to form a basis for the space. For example, consider the operator −d²/dx² on the domain of functions u(x) that are defined and have second derivatives for −∞ < x < ∞ and that are such that the integrals

∫₋∞^∞ u(x)² dx and ∫₋∞^∞ u″(x)² dx     (2.22)

are finite. This operator has no eigenfunctions; i.e., there do not exist any functions ϕ(x) ≢ 0 in the domain such that

−ϕ″(x) = λϕ(x)     (2.23)

because any nonnull solution of Eq. (2.23) will cause the integrals (2.22) to become infinite.
It is also easy to give examples of differential operators that have only a finite number of eigenfunctions. It is clear that such a finite number of functions cannot be a basis for a function space containing an infinite number of linearly independent functions.
In order to use the techniques of the previous sections in interpreting functions of operators, we must extend the concept of spectral representation. We shall say that the operator L has a spectral representation if, for any function u(x) in the domain of L, there exists a representation of u(x) as a sum or as an integral or, perhaps, as a combination of both sum and integral,

u(x) = Σₙ cₙϕₙ(x) + ∫ c(λ)ϕ(x,λ) dλ

and if the expansion is such that

Lu = Σₙ λₙcₙϕₙ(x) + ∫ λc(λ)ϕ(x,λ) dλ

For example, if L = −d²/dx² on the domain defined by Eqs. (2.22), then, for any u(x) in the domain, we have

u(x) = ∫ c(λ)ϕ(x,λ) dλ    Lu = ∫ λc(λ)ϕ(x,λ) dλ

These formulas reduce to the well-known ones for the Fourier integral if we put λ = k². They become

u(x) = (1/2π) ∫₋∞^∞ e^{jkx} dk ∫₋∞^∞ u(ξ)e^{−jkξ} dξ

Lu = (1/2π) ∫₋∞^∞ k²e^{jkx} dk ∫₋∞^∞ u(ξ)e^{−jkξ} dξ

It is clear that these formulas can be used to define functions of the operator L. We find

q(L)u = (1/2π) ∫₋∞^∞ q(k²)e^{jkx} dk ∫₋∞^∞ u(ξ)e^{−jkξ} dξ
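This Fourier-multiplier definition of q(L) can be exercised numerically with the FFT; a sketch in Python (the choice q(λ) = e^{−λ}, the grid, and the Gaussian test function are assumptions made for illustration; e^{−L} is the heat flow for unit time, so the exact answer for a Gaussian is known):

```python
import numpy as np

# Interpret q(L), L = -d^2/dx^2 on (-inf, inf), as the Fourier multiplier q(k^2):
# transform, multiply each mode e^{jkx} by q(k^2) = exp(-k^2), transform back.
N, width = 4096, 40.0
x = np.linspace(-width/2, width/2, N, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(N, d=width/N)

u0 = np.exp(-x**2 / 4)                          # Gaussian test function
qLu = np.fft.ifft(np.exp(-k**2) * np.fft.fft(u0)).real

# Heat flow for unit time spreads this Gaussian to sqrt(1/2) exp(-x^2/8)
exact = np.sqrt(0.5) * np.exp(-x**2 / 8)
assert np.max(np.abs(qLu - exact)) < 1e-10
```

The domain is taken wide enough that the periodicity imposed by the FFT has a negligible effect on the rapidly decaying Gaussian.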
We shall use these formulas to find the acoustic field produced in the region −∞ < x < ∞, 0 ≤ y < ∞ by sources of given strength f(x,y) and frequency ω if the line y = 0 represents a rigid surface. Mathematically, the problem is to find that solution of

∂²u/∂x² + ∂²u/∂y² + (ω²/c²)u = f(x,y)     (2.24)

satisfying the condition

∂u/∂y = 0    at y = 0     (2.25)

for −∞ < x < ∞. Here c is the velocity of sound in the medium. Proceeding as before, we put

L = −∂²/∂x²

over the interval (−∞, ∞), and then Eq. (2.24) becomes

d²u/dy² − (L − ω²/c²)u = f(x,y)

with the condition (2.25). By the method of variation of constants, we find that the solution of this ordinary differential equation is

u = −(1/2κ) ∫₀^∞ [e^{−κ|y−η|} + e^{−κ(y+η)}] f(x,η) dη

where κ = (L − ω²/c²)^{½}.
This result can be interpreted by means of the spectral representation of L. The function f(x,η) is represented in terms of the Fourier integral as follows:

f(x,η) = (1/2π) ∫₋∞^∞ e^{jkx} dk ∫₋∞^∞ f(ξ,η)e^{−jkξ} dξ

Now, remembering that

Le^{jkx} = k²e^{jkx}

we find that

u(x,y) = −(1/2π) ∫₋∞^∞ (e^{jkx}/2κ) dk ∫₀^∞ [e^{−κ|y−η|} + e^{−κ(y+η)}] dη ∫₋∞^∞ f(ξ,η)e^{−jkξ} dξ

where now κ = (k² − ω²/c²)^{½}.
2.10 Conclusion
The preceding discussion has shown how separable partial differential equations may be solved by operational means. The procedure is as follows: One of the differential operators is treated as a constant and the resulting simpler differential equation is solved. The solution of this simpler differential equation will contain functions of the operator that was treated as a constant. These functions of the operator are interpreted by the use of the spectral representation of the operator; thus, the solution to the partial differential equation is obtained.
The usual transform methods are applications of this theory. For example, the Laplace transform defines the spectral representation for the operator d/dt on the domain of functions u(t) satisfying the initial condition u(0) = 0; the Fourier transform defines the spectral representation for the operator −d²/dx² on the domain of functions u(x) defined over (−∞, ∞) such that the integrals

∫₋∞^∞ u(x)² dx and ∫₋∞^∞ u″(x)² dx

are finite; and the Hankel transform defines the spectral representation for the operator

−d²/dr² − (1/r) d/dr

on the domain of functions u(r), defined over (0, ∞), such that the integrals

∫₀^∞ u(r)² r dr and ∫₀^∞ [u″(r) + u′(r)/r]² r dr

are finite. These transforms will be studied in detail in Chap. 3. It is clear, then, that transform methods and operational calculus depend on the study of the spectral representation of differential operators.
REFERENCES
1. Boole, George, “Treatise on Differential Equations,” Cambridge University Press, London, 1859.
2. Bromwich, T. J. I’A., Operational Methods in Mathematical Physics, Phil. Mag., ser. 6, vol. 37, pp. 407–419, 1919.
3. Carson, J. R., Heaviside Operational Calculus, Bell System Tech. J., vol. 1, pp. 1–13, 1922.
4. Dunford, Nelson, and Jacob T. Schwartz, “Linear Operators,” Interscience Publishers, Inc., New York, 1958.
5. Friedman, Bernard, “Principles and Techniques of Applied Mathematics,” John Wiley & Sons, Inc., New York, 1956.
6. Heaviside, O., “Electromagnetic Theory,” vol. 1, The Electrician, London, 1893.
7. Leibnitz, G. W., “Mathematische Schriften,” vol. 1, edited by C. J. Gerhardt, Berlin, 1849.
8. Morrey, Charles B., Jr., Nonlinear Methods, chap. 16 in “Modern Mathematics for the Engineer,” First Series, edited by E. F. Beckenbach, McGraw-Hill Book Company, Inc., New York, 1956.
9. Pipes, Louis A., Matrices in Engineering, chap. 13 in “Modern Mathematics for the Engineer,” First Series, edited by E. F. Beckenbach, McGraw-Hill Book Company, Inc., New York, 1956.
10. Wagner, K. W., Über eine Formel von Heaviside zur Berechnung von Einschaltvorgängen, Arch. Elektrotechnik, vol. 4, pp. 159–193, 1916.