2
Operational Methods for Separable Differential Equations
BERNARD FRIEDMAN
PROFESSOR OF MATHEMATICS
UNIVERSITY OF CALIFORNIA, BERKELEY
2.1 Introduction
From the earliest development of calculus, mathematicians have attempted to apply algebraic manipulations to differential operators. They have tried to add, multiply, and divide operators; to expand them in infinite series; and, in short, to treat them as if they were algebraic quantities. Leibnitz7 himself, as early as 1695, was inspired enough to introduce the suggestive notation Dy for the first derivative, D²y for the second derivative, and so on. Because of the algebraic properties of the operator D, it was easy to derive the formula for the nth derivative of a product of two functions u and v. The procedure might have been as follows:
The well-known formula for the derivative of uv is
D(uv) = uDv + vDu
We introduce two operators D₁ and D₂, where D₁ operates only on u and D₂ operates only on v; then this formula may be written as

D(uv) = D₁uv + D₂uv = (D₁ + D₂)(uv)

Consequently, we have

D = D₁ + D₂

and naturally

Dⁿ = (D₁ + D₂)ⁿ

By applying this result to the product uv and expanding by the binomial theorem, we get

Dⁿ(uv) = uDⁿv + nDu Dⁿ⁻¹v + [n(n − 1)/2!] D²u Dⁿ⁻²v + ⋯ + vDⁿu

which is Leibnitz’s formula for the nth derivative of a product.
In 1859, George Boole1 published a book on differential equations containing a calculus of operators. Some parts of this calculus are still taught in elementary courses in differential equations. With the methods he developed, it is possible to find particular solutions of ordinary differential equations. For example, consider the equation

d²u/dx² + k²u = x²     (2.1)

We write D for the differentiation operator, and then this equation can be represented as follows:

(D² + k²)u = x²

If we may treat D as an algebraic quantity, the solution of the foregoing equation is

u = (D² + k²)⁻¹x²     (2.2)

To interpret this result, we use ordinary algebraic division to obtain

(D² + k²)⁻¹ = k⁻² − k⁻⁴D² + k⁻⁶D⁴ − ⋯

By substituting in Eq. (2.2), we get

u = (x/k)² − 2k⁻⁴

since D⁴x² = 0. It is easy to verify that the polynomial (x/k)² − 2k⁻⁴ is indeed a solution of Eq. (2.1).
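The verification is a one-line computation in a computer algebra system; a quick sketch in Python with sympy:

```python
import sympy as sp

x, k = sp.symbols('x k', positive=True)

# Candidate particular solution produced by the operator expansion
u = x**2 / k**2 - 2 / k**4

# Substitute into the left-hand side of (D^2 + k^2)u = x^2
lhs = sp.diff(u, x, 2) + k**2 * u

assert sp.simplify(lhs - x**2) == 0
```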
Such remarkable successes encouraged further attempts at developing a calculus of operators. If D represents differentiation, then it was natural to let D⁻¹ represent integration, because with that convention we have

DD⁻¹u = u     (2.3)

However, when we consider that

D⁻¹Du = u + c     (2.4)

where c is an arbitrary constant of integration, we run into difficulties (see Sec. 1.5). One difficulty is that from Eq. (2.3) we should write DD⁻¹ = 1, but from Eq. (2.4) we see that D⁻¹D ≠ 1. Another difficulty is that we obtain particular solutions and not the complete solution of the equation. Boole of course recognized these difficulties and introduced notation such as D⁻¹0 to represent the constant of integration. This notation is unsatisfactory because it contradicts the well-known algebraic fact that, when any quantity is multiplied by zero, the product is zero. These difficulties, among others, caused Boole’s treatment to be neglected.
2.2 Heaviside Theory
The theory of operators was completely revitalized by O. Heaviside6 in a series of papers in the 1890s. His first contribution was to interpret D⁻¹ not as an indefinite integral, but as a definite integral, so that for him D⁻¹u meant

∫₀ᵗ u(τ) dτ

With this interpretation, we still have

DD⁻¹u = u

but now

D⁻¹Du = u(t) − u(0) = u(t)

if u(0) = 0. Therefore, if u(0) = 0, we may write

DD⁻¹ = D⁻¹D = 1
By using this definition of D−1, Heaviside showed that algebraic methods could be used to find solutions of differential equations satisfying given initial conditions.
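Heaviside’s definite-integral interpretation is easy to check symbolically; a sketch in Python with sympy (the sample functions sin t and cos t are arbitrary choices for illustration):

```python
import sympy as sp

t, tau = sp.symbols('t tau', nonnegative=True)

D = lambda f: sp.diff(f, t)                                   # differentiation
Dinv = lambda f: sp.integrate(f.subs(t, tau), (tau, 0, t))    # definite integral from 0 to t

u = sp.sin(t)                                # u(0) = 0
assert sp.simplify(D(Dinv(u)) - u) == 0      # D D^{-1} u = u
assert sp.simplify(Dinv(D(u)) - u) == 0      # D^{-1} D u = u, since u(0) = 0

v = sp.cos(t)                                # v(0) = 1, not 0
assert sp.simplify(Dinv(D(v)) - v) != 0      # here D^{-1} D v = v - v(0)
```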
We shall illustrate Heaviside’s algebraic approach. Let us follow Heaviside’s notation by using p to represent differentiation and p⁻¹ to represent the definite integral from 0 to t. From the usual rule for differentiating a product, we see that

p[e^{at}f(t)] = e^{at}pf(t) + ae^{at}f(t) = e^{at}(p + a)f(t)

or

e^{−at}pe^{at}f(t) = (p + a)f(t)

If we omit the function f(t), this result may be written in the following operational form:

p + a = e^{−at}pe^{at}     (2.5)
Notice that, by using Eq. (2.5) repeatedly, we get

(p + a)² = e^{−at}pe^{at}e^{−at}pe^{at} = e^{−at}p²e^{at}

and

(p + a)ⁿ = e^{−at}pⁿe^{at}

It is easy to generalize this result to obtain

ϕ(p + a) = e^{−at}ϕ(p)e^{at}

if ϕ(p) is an analytic function of p. In particular,

(p + a)⁻¹ = e^{−at}p⁻¹e^{at}     (2.6)
We shall use Eq. (2.6) to find the solution of the differential equation

du/dt + au = e^{−t}     (2.7)

that vanishes when t = 0. We write the equation as

(p + a)u = e^{−t}

and then, by Eq. (2.6), assuming a ≠ 1,

u = (p + a)⁻¹e^{−t} = e^{−at}p⁻¹e^{at}e^{−t} = e^{−at}p⁻¹e^{(a−1)t} = e^{−at}[e^{(a−1)t} − 1]/(a − 1) = (e^{−t} − e^{−at})/(a − 1)

The verification that this is the correct solution of the differential equation will be left to the reader.
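That verification can be carried out with a computer algebra system; a sketch in Python with sympy (a is kept symbolic, assumed positive and different from 1):

```python
import sympy as sp

t = sp.symbols('t', nonnegative=True)
a = sp.symbols('a', positive=True)   # assume a != 1 so the formula makes sense

u = (sp.exp(-t) - sp.exp(-a*t)) / (a - 1)

# u satisfies du/dt + a u = e^{-t} ...
assert sp.simplify(sp.diff(u, t) + a*u - sp.exp(-t)) == 0
# ... and vanishes when t = 0
assert u.subs(t, 0) == 0
```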
Heaviside’s successes in the use of his techniques were remarkable. Many objections, however, were advanced against his methods, especially by professional mathematicians. One objection was that he did not have a systematic way of evaluating functions of the operator p, but instead treated each case individually by means of some trick adapted to it. Another and more serious objection was that his methods were frankly experimental. If the methods achieved the desired results, he continued to use them; if they did not work, he modified them until they did. This meant that he could not really explain the principles underlying his methods. Heaviside’s famous answer to this argument was that he didn’t stop eating just because he didn’t understand the process of digestion (see Chap. 3). The rebuttal, however, is that even though we do not stop eating, we try to learn how to avoid eating anything that may upset our digestion, and some of Heaviside’s methods are still indigestible.
The first objection to Heaviside’s methods, namely, that there is no systematic method for evaluating a function of an operator, was overcome when Bromwich,2 in 1919, introduced the contour integral that inverts the integral now known as the Laplace transform. By this integral, to any given function F(p) of the operator p there corresponds a function f(t) defined by the contour integral

f(t) = (1/2πj) ∫ e^{pt}F(p) dp/p     (2.8)

where the integration is along a vertical line in the complex p plane to the right of all singularities of the integrand. For example, the operational solution of

d²u/dt² + a²u = 1     (2.9)

is

u = (p² + a²)⁻¹ · 1

and by Eq. (2.8) this corresponds to

u = (1 − cos at)/a²

Thus we obtain the solution u of Eq. (2.9) that satisfies the conditions u(0) = u′(0) = 0.
One final step was necessary before a rigorous theory of some of Heaviside’s work could be obtained. The need for such a step may be seen by referring to Eq. (2.7). The operational solution of Eq. (2.7) is

u = (p + a)⁻¹e^{−t}     (2.10)

Bromwich’s integral cannot be used here because the right-hand side of Eq. (2.10) is not solely a function of p but is a function of p acting on a function of t. Wagner,10 in 1916, and Carson,3 in 1922, however, introduced the Laplace transform that established a correspondence between a function f(t) and a function F(p) as follows:

F(p) = ∫₀^∞ e^{−pt}f(t) dt

For Eq. (2.7), let f(t) = e^{−t}, so that F(p) = 1/(p + 1); the solution of Eq. (2.7) can then be obtained from the operational form (p + a)⁻¹(p + 1)⁻¹ by means of Bromwich’s integral. We find

u = (e^{−t} − e^{−at})/(a − 1)
In practical applications, the operational technique is not used. Instead, the Laplace transform of Eq. (2.7) is taken. If we let ū(p) denote the Laplace transform of u, we get immediately

(p + a)ū = 1/(p + 1)

and therefore

ū = 1/[(p + a)(p + 1)]

By using Bromwich’s integral or a table of Laplace transforms, we again obtain the result

u = (e^{−t} − e^{−at})/(a − 1)
The work of Bromwich, Carson, and Wagner thus provided a justification for a large part of Heaviside’s work, but there still are parts that cannot be explained. It should be noticed that the Laplace transform and the Heaviside calculus can be applied principally to differential equations containing time derivatives and not to differential equations containing space derivatives. The basic reason for this is that Heaviside’s method was designed for differential equations involving initial conditions and not for those involving boundary conditions.
2.3 Domain of an Operator
It is the purpose of this chapter to develop an operational calculus of the Heaviside type for problems involving space derivatives and boundary conditions.5 For this purpose, it is necessary to look very carefully at the concept of an operator. Notice, first, that just as a pianist must have a piano on which to play, an operator must have something on which to operate. The set of functions on which the operator acts will be called the domain of the operator.
For illustrative purposes, let us consider the operator L = −d²/dx² on the interval 0 ≤ x ≤ 1. What kind of functions u(x) could be in its domain? Clearly, the functions u(x) must have at least a second derivative. For mathematical convenience, it is useful to restrict the domain of L further by requiring that

∫₀¹ u(x)² dx

be finite, that u″(x) exist, and that

∫₀¹ u″(x)² dx

be finite.
Consider the differential equation

−d²u/dx² + k²u = f(x)     (2.11)

We may write it as

(L + k²)u = f(x)

and then we are tempted to divide by L + k² to get

u = (L + k²)⁻¹f(x)

But we know that the solution of Eq. (2.11) is not unique. To make it unique, we may specify that u(x) satisfy two conditions—e.g., initial conditions such as u(0) = 1 and u′(0) = 2, or boundary conditions such as u(0) = 1 and u(1) = 2. Because our purpose is to discuss operators with boundary conditions, we restrict the domain of L still further by requiring that u(x) in the domain of L must satisfy a sufficient number of boundary conditions to make the solution of Eq. (2.11) unique. There is another restriction that arises from the further requirement that both L and L⁻¹ acting on the function zero should give zero. To satisfy this requirement, we must confine our discussion to homogeneous boundary conditions only. Later, we shall show how nonhomogeneous boundary conditions may be included.
Let us consider an arbitrary differential operator L defined over the interval (a,b). For mathematical convenience, we shall assume that the operator acts only on a collection of real-valued functions u(x), defined for a ≤ x ≤ b, that are such that Lu exists and the integrals

∫ₐᵇ u(x)² dx and ∫ₐᵇ (Lu)² dx

are finite. The domain of the operator will be those functions of this collection that satisfy a pair of homogeneous boundary conditions. The following are examples of such boundary conditions:

u(a) = u(b) = 0
u′(a) = u′(b) = 0
u′(a) = α₁u(a), u′(b) = α₂u(b)

where α₁ and α₂ are arbitrary scalars.
2.4 Linear Operators
The domains referred to above have the very important property of being linear vector spaces.8 A linear vector space of functions is defined as a collection of functions such that, if u1(x) and u2(x) are any two functions in the collection, then any linear combination αu1(x) + βu2(x), where α and β are scalars, is also in the collection. If we had used nonhomogeneous boundary conditions, the domains would not have been linear vector spaces. We shall see later that the use of this concept of linear vector space enables us to give a more intuitive and geometric description of the properties of the operator.
We shall also restrict our attention to the consideration of linear differential operators,4 i.e., operators L that act on functions u(x) in such a way that
L[αu(x)] = αLu(x)    and    L[u₁(x) + u₂(x)] = Lu₁(x) + Lu₂(x)
Examples of linear differential operators are d/dx, −d²/dx², and

p(x) d²/dx² + q(x) d/dx + r(x)

When this last operator acts on a function u(x), the result is

p(x)u″(x) + q(x)u′(x) + r(x)u(x)

Note that an operator A whose action involves u(x) nonlinearly—for example, Au = u(x)²—is not a linear operator, because

A[αu(x)] = α²u(x)² = α²Au ≠ αAu

unless α² = α.
Later, we shall also consider partial differential operators, e.g.,

−∂²/∂x² − ∂²/∂y²
To summarize, hereafter when using the term “operator” we shall mean a linear differential operator, usually of the second order, together with a linear vector space as its domain. Both the differential operator and the domain are required in the definition of the operator. For us, the same differential operator with different domains will be considered as different operators. For example, the second derivative on a domain of functions u(x) such that u(0) = u(1) = 0 is not the same as the second derivative on a domain of functions u(x) such that u′(0) = u′(1) = 0. The failure to distinguish between two such operators has caused a great deal of confusion and even error in the applications of the calculus of operators.
2.5 Functions of Operators
Now that we have a suitable definition of operator, our next aim will be to define functions of an operator. If L represents an operator as defined above, what will be meant by operators such as L² or e^{−αL} or L^{½}? We proceed first to define powers of L. Suppose that u is in the domain of L; then v = Lu exists. If v is also in the domain of L, then Lv = w exists. We put

w = Lv = L²u
Thus we see that, if u is such that u and Lu are in the domain of L, then L²u is defined as the result of the repeated application of the operator L on u. For example, if L = −d²/dx² with the boundary conditions u(0) = u(1) = 0, then we can define L²u for the function u₁ = sin πx but not for the function u₂ = x(1 − x). Note that both u₁ and u₂, and also Lu₁ = π² sin πx, are in the domain of L, but Lu₂ = 2 is not; consequently L²u₁ = π⁴ sin πx, whereas L²u₂ is not defined.
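The computations behind this example can be checked symbolically; a sketch in Python with sympy:

```python
import sympy as sp

x = sp.symbols('x')
L = lambda u: -sp.diff(u, x, 2)    # the operator L = -d^2/dx^2

u1 = sp.sin(sp.pi * x)             # in the domain: u1(0) = u1(1) = 0
u2 = x * (1 - x)                   # also in the domain

v1, v2 = L(u1), L(u2)
assert sp.simplify(v1 - sp.pi**2 * u1) == 0   # L u1 = pi^2 u1, still in the domain
assert v2 == 2                                 # L u2 = 2, which does not vanish at 0 and 1

# Since L u1 is in the domain, L^2 u1 makes sense:
assert sp.simplify(L(v1) - sp.pi**4 * u1) == 0
```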
It is clear that this process may be generalized to higher powers of L. If u, Lu, L²u, …, Lⁿ⁻¹u are all in the domain of L, then Lⁿu is defined as follows:

Lⁿu = L(Lⁿ⁻¹u)

Naturally, then, if p(L) is a polynomial in L of the nth degree, say

p(L) = a₀ + a₁L + a₂L² + ⋯ + aₙLⁿ

we define

p(L)u = a₀u + a₁Lu + a₂L²u + ⋯ + aₙLⁿu
We may go even further and define analytic functions of L. Suppose ϕ(λ) is an analytic function of λ having a convergent power-series expansion

ϕ(λ) = a₀ + a₁λ + a₂λ² + ⋯

Then if Lⁿu is in the domain of L for all values of n, we put

ϕ(L)u = a₀u + a₁Lu + a₂L²u + ⋯

provided that the resulting series of functions converges.
2.6 Eigenfunctions and Self-adjoint Operators
The foregoing method of defining functions of the operator, although natural, is nevertheless not the most convenient. We shall see that a better way to define the functions of the operator is to begin with the concept of an eigenfunction of the operator. An eigenfunction of an operator L is a function ϕ(x), not identically zero, in the domain of L such that L acting on ϕ(x) leaves the function unchanged except for multiplication by a scalar λ,
Lϕ = λϕ
The scalar λ is called an eigenvalue of L; cf. Ref. 9 for the analogous concept in matrix theory. For example, if L₁ is −d²/dx² on the domain of functions u(x) such that u(0) = u(1) = 0, then the functions

ϕₙ(x) = αₙ sin nπx

where n = 1, 2, 3, … and the αₙ are constants ≠ 0, are eigenfunctions of L₁ with corresponding eigenvalues n²π², because

L₁ϕₙ = −d²/dx²(αₙ sin nπx) = n²π²αₙ sin nπx = n²π²ϕₙ

In many cases, the eigenfunctions are mutually orthogonal; i.e., if ϕₙ(x) and ϕₘ(x) are eigenfunctions of L, then

∫ₐᵇ ϕₙ(x)ϕₘ(x) dx = 0

if n ≠ m. For example,

∫₀¹ sin nπx sin mπx dx = 0    n ≠ m
We shall give a characterization of the class of operators that have mutually orthogonal eigenfunctions. An operator L defined over the interval (a,b) is called self-adjoint if, for all functions u(x) and v(x) in the domain of L, we have

∫ₐᵇ (vLu − uLv) dx = 0     (2.12)

Notice that this is a sort of generalized integration by parts. For example, the operator L₁ we have considered previously is self-adjoint because, by using integration by parts twice, we get

∫₀¹ (vL₁u − uL₁v) dx = ∫₀¹ (uv″ − vu″) dx = [uv′ − vu′]₀¹

If u(x) and v(x) are in the domain of L₁, then

u(0) = u(1) = v(0) = v(1) = 0

and therefore

[uv′ − vu′]₀¹ = 0

which shows that Eq. (2.12) is satisfied when L = L₁.
We now prove that, for a self-adjoint operator, eigenfunctions corresponding to different eigenvalues are mutually orthogonal:

THEOREM. Suppose that ϕₙ(x) is an eigenfunction of a self-adjoint operator L with eigenvalue λₙ and that ϕₘ(x) is also an eigenfunction with eigenvalue λₘ; then

∫ₐᵇ ϕₙ(x)ϕₘ(x) dx = 0     (2.13)

provided that λₙ ≠ λₘ.

Proof. From the definition of eigenfunctions we have

Lϕₙ = λₙϕₙ    Lϕₘ = λₘϕₘ

If we multiply the first equation by ϕₘ and the second equation by ϕₙ, subtract, and then integrate, we get

∫ₐᵇ (ϕₘLϕₙ − ϕₙLϕₘ) dx = (λₙ − λₘ) ∫ₐᵇ ϕₙϕₘ dx

Since L is self-adjoint, Eq. (2.12) shows that the left-hand side of the above equation is zero; therefore, since λₙ ≠ λₘ, we have

∫ₐᵇ ϕₙϕₘ dx = 0

as desired.
Since αₙϕₙ(x), where αₙ is a constant ≠ 0, is an eigenfunction if ϕₙ(x) is, we can always normalize the eigenfunctions, i.e., so choose the constant multiplying factor αₙ that

∫ₐᵇ ϕₙ(x)² dx = 1

For example, for the eigenfunctions of L₁ we have

∫₀¹ (αₙ sin nπx)² dx = αₙ²/2

and therefore αₙ = √2; consequently, if we take the eigenfunctions of L₁ as

ϕₙ(x) = √2 sin nπx

they will be normalized.

Hereafter, we shall assume that the eigenfunctions are both orthogonal and normalized, so that both Eq. (2.13) and the following equation hold:

∫ₐᵇ ϕₙ(x)² dx = 1     (2.14)

The set of eigenfunctions ϕ₁(x), ϕ₂(x), … will thus correspond exactly to a set of mutually perpendicular unit vectors.
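Orthogonality and normalization of the eigenfunctions of L₁ can be confirmed directly; a sketch in Python with sympy (the indices 2, 3, 5 are arbitrary sample choices):

```python
import sympy as sp

x = sp.symbols('x')
phi = lambda n: sp.sqrt(2) * sp.sin(n * sp.pi * x)   # normalized eigenfunctions of L1

ip = lambda f, g: sp.integrate(f * g, (x, 0, 1))     # inner product on (0, 1)

assert ip(phi(2), phi(5)) == 0   # orthogonality, Eq. (2.13)
assert ip(phi(3), phi(3)) == 1   # normalization, Eq. (2.14)
```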
2.7 Spectral Representation
Self-adjoint operators are important because in many cases the eigenfunctions of such an operator form a basis for the space of functions under consideration; i.e., any function of the space can be written as a sum, perhaps with an infinite number of terms, of linear multiples of the eigenfunctions. For example, we have seen that for the operator L₁ the normalized eigenfunctions are √2 sin nπx, where n = 1, 2, 3, …. Let u(x) be any function such that

∫₀¹ u(x)² dx < ∞

Then there exists an expansion

u(x) = Σₙ cₙ√2 sin nπx     (2.15)

This equation is to be understood in the following sense: The partial sums of the series on the right approximate the function on the left in the mean-square sense, so that

∫₀¹ [u(x) − Σₙ₌₁ᴺ cₙ√2 sin nπx]² dx → 0    as N → ∞

Because the eigenfunctions are orthogonal and normalized, it is easy to find an explicit formula for cₙ in Eq. (2.15). Multiply this equation by the normalized eigenfunction √2 sin nπx and integrate. We find

cₙ = √2 ∫₀¹ u(x) sin nπx dx     (2.16)

by the use of Eqs. (2.13) and (2.14) in this special case. Note that Eq. (2.15) is just the Fourier sine series for the function u(x) and Eq. (2.16) is the formula for the nth coefficient.
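The mean-square convergence of such a sine series can be observed numerically; a sketch in Python (the test function x(1 − x), the grid size, and the number of terms are arbitrary choices):

```python
import numpy as np

x = np.linspace(0, 1, 2001)
u = x * (1 - x)                       # a sample function vanishing at x = 0 and x = 1

def integrate(f):                     # trapezoidal rule on the grid
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

N = 10
partial = np.zeros_like(x)
for n in range(1, N + 1):
    phi = np.sqrt(2) * np.sin(n * np.pi * x)
    c_n = integrate(u * phi)          # Eq. (2.16)
    partial += c_n * phi              # partial sum of Eq. (2.15)

mse = integrate((u - partial) ** 2)   # mean-square error
assert mse < 1e-6
```

Even ten terms already reproduce this smooth function to within a mean-square error far below the assertion threshold.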
For a general self-adjoint operator L, the results will be very much the same as they are for L₁. Suppose that ϕ₁(x), ϕ₂(x), … are the normalized eigenfunctions for L corresponding to the eigenvalues λ₁, λ₂, …, respectively. If u(x) is a function in the domain of L, then u(x) can be expanded, in the mean-square sense, as follows:

u(x) = c₁ϕ₁(x) + c₂ϕ₂(x) + ⋯

where, because of Eqs. (2.13) and (2.14),

cₙ = ∫ₐᵇ u(x)ϕₙ(x) dx

Furthermore, by the definition of an eigenfunction, we have

Lu = λ₁c₁ϕ₁(x) + λ₂c₂ϕ₂(x) + ⋯     (2.17)

A formula such as (2.17), which uses an eigenfunction expansion to represent the action of an operator L on an arbitrary function in its domain, will be called the spectral representation of the operator L.
Once we have the spectral representation (2.17), it is easy to define functions of L. For example,

L²u = λ₁²c₁ϕ₁(x) + λ₂²c₂ϕ₂(x) + ⋯

Similarly, for any polynomial p(λ),

p(L)u = p(λ₁)c₁ϕ₁(x) + p(λ₂)c₂ϕ₂(x) + ⋯

We make a natural extension of this formula. If q(λ) is a function of λ that is defined for λ = λ₁, λ₂, …, then we define q(L) as follows:

q(L)u = q(λ₁)c₁ϕ₁(x) + q(λ₂)c₂ϕ₂(x) + ⋯     (2.18)

Of course, it is understood that this definition applies only when the right-hand side converges in the mean-square sense.
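Equation (2.18) translates directly into a computation; a numerical sketch in Python (the choice q(λ) = e^{−αλ}, the value of α, and the test function are assumptions made only for illustration):

```python
import numpy as np

# Apply q(L1)u via the spectral representation, Eq. (2.18):
# q(L1)u = sum over n of q(n^2 pi^2) c_n sqrt(2) sin(n pi x),
# here with q(lambda) = exp(-alpha * lambda).
alpha = 0.01
x = np.linspace(0, 1, 1001)

def integrate(f):                        # trapezoidal rule on the grid
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

# A test function built from the first two eigenfunctions, so the
# exact answer is known from the eigenvalues pi^2 and 4 pi^2.
u = 3*np.sqrt(2)*np.sin(np.pi*x) - np.sqrt(2)*np.sin(2*np.pi*x)

qLu = np.zeros_like(x)
for n in range(1, 20):
    phi = np.sqrt(2) * np.sin(n * np.pi * x)
    c_n = integrate(u * phi)
    qLu += np.exp(-alpha * (n*np.pi)**2) * c_n * phi

exact = (3*np.exp(-alpha*np.pi**2)*np.sqrt(2)*np.sin(np.pi*x)
         - np.exp(-alpha*(2*np.pi)**2)*np.sqrt(2)*np.sin(2*np.pi*x))
assert np.max(np.abs(qLu - exact)) < 1e-6
```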
As an illustration, consider again the operator L₁. What is to be meant by the operator

e^{−αL₁}

where α is a positive scalar? Since the eigenfunctions of L₁ are √2 sin nπx and correspond to the eigenvalues n²π², n = 1, 2, …, we find from Eq. (2.18) that

e^{−αL₁}u = Σₙ e^{−αn²π²}cₙ√2 sin nπx

where

cₙ = √2 ∫₀¹ u(x) sin nπx dx
2.8 A Partial Differential Equation
We shall now use these ideas concerning spectral representations and functions of operators to solve the following partial differential equation, which is the mathematical formulation of the problem of determining the equilibrium distribution of temperature T(x,y) in a rectangular plate of length 1 and width h if three sides are kept at temperature zero but the fourth side has a prescribed temperature distribution f(x):
Find a function T(x,y) such that

∂²T/∂x² + ∂²T/∂y² = 0    0 < x < 1, 0 < y < h     (2.19)

and T(0,y) = T(1,y) = T(x,h) = 0 but T(x,0) = f(x).
Equation (2.19) may be represented in the following operational form:

(L₁ + L₂)T = 0

where L₁ = −∂²/∂x² on the domain of functions u(x,y) that as functions of x satisfy the boundary conditions of vanishing for both x = 0 and x = 1, and where L₂ = −∂²/∂y² on the domain of functions u(x,y) that as functions of y satisfy a zero boundary condition for y = h and a given nonzero boundary condition at y = 0. Notice that L₂ is not a linear operator, because the boundary condition for y = 0 is not homogeneous. This suggests writing Eq. (2.19) as follows:

∂²T/∂y² = L₁T     (2.20)

where T = f(x) for y = 0 and T = 0 for y = h.
Since in Eq. (2.20) the operator L₁ has no effect on the y dependence of T but acts on T only as a function of x and has its boundary conditions dependent only on the values of x, we may treat L₁ in Eq. (2.20) as if it were a constant independent of y. If L₁ were a constant, however, the solution of Eq. (2.20) would be

T = Ae^{ky} + Be^{−ky}

where k = L₁^{½} and where A and B are quantities independent of y. The boundary conditions at y = 0 and y = h imply that

A + B = f(x)    Ae^{kh} + Be^{−kh} = 0

When these equations are solved for A and B and the expression for T is rearranged slightly, we find that

T = [sinh k(h − y)/sinh kh] f(x)     (2.21)

is the solution of Eq. (2.20) satisfying the prescribed boundary conditions.
In Eq. (2.21), the function T(x,y) is represented as a function of a differential operator in x acting on a function of x. It would be desirable to interpret this function of the operator and so obtain T as an explicit function of x and y. This interpretation can be achieved by using the spectral representation of the operator L₁. We have seen that the eigenfunctions of L₁ are the functions √2 sin nπx, for n = 1, 2, …, and that they correspond to the eigenvalues n²π². By using Eqs. (2.15) and (2.16), we see that

f(x) = Σₙ cₙ√2 sin nπx    where cₙ = √2 ∫₀¹ f(ξ) sin nπξ dξ

and then by using Eq. (2.18) we find that

T(x,y) = Σₙ cₙ [sinh nπ(h − y)/sinh nπh] √2 sin nπx

It is easy to verify that this is the desired solution of Eq. (2.19).
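That verification can be delegated to a computer algebra system, one term of the series at a time; a sketch in Python with sympy:

```python
import sympy as sp

x, y, h = sp.symbols('x y h', positive=True)
n = sp.symbols('n', positive=True, integer=True)

# The general term of the series solution for T(x,y)
T_n = sp.sinh(n*sp.pi*(h - y)) / sp.sinh(n*sp.pi*h) * sp.sqrt(2)*sp.sin(n*sp.pi*x)

# Harmonic: T_xx + T_yy = 0, Eq. (2.19)
assert sp.simplify(sp.diff(T_n, x, 2) + sp.diff(T_n, y, 2)) == 0

# Homogeneous boundary conditions on three sides
assert T_n.subs(x, 0) == 0
assert sp.simplify(T_n.subs(x, 1)) == 0   # sin(n pi) = 0 for integer n
assert T_n.subs(y, h) == 0

# At y = 0 each term reduces to the eigenfunction itself
assert sp.simplify(T_n.subs(y, 0) - sp.sqrt(2)*sp.sin(n*sp.pi*x)) == 0
```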
2.9 Types of Spectral Representation
So far, we have considered the spectral representation of only those operators having a set of eigenfunctions that can form a basis for the space. For many operators, no eigenfunctions exist, and even if they do exist, there may not be enough of them to form a basis for the space. For example, consider the operator −d²/dx² on the domain of functions u(x) that are defined and have second derivatives for −∞ < x < ∞ and that are such that the integrals

∫₋∞^∞ u(x)² dx and ∫₋∞^∞ u″(x)² dx     (2.22)

are finite. This operator has no eigenfunctions; i.e., there do not exist any functions ϕ(x) ≢ 0 in the domain such that

−ϕ″(x) = λϕ(x)     (2.23)

because any nonnull solution of Eq. (2.23) will cause the integrals (2.22) to become infinite.
It is also easy to give examples of differential operators that have only a finite number of eigenfunctions. It is clear that such a finite number of functions cannot be a basis for a function space containing an infinite number of linearly independent functions.
In order to use the techniques of the previous sections in interpreting functions of operators, we must extend the concept of spectral representation. We shall say that the operator L has a spectral representation if, for any function u(x) in the domain of L, there exists a representation of u(x) as a sum or as an integral or, perhaps, as a combination of both sum and integral,

u(x) = Σₙ cₙϕₙ(x) + ∫ c(λ)ϕ(x,λ) dλ

and if the expansion is such that

Lu = Σₙ λₙcₙϕₙ(x) + ∫ λc(λ)ϕ(x,λ) dλ

For example, if L = −d²/dx² on the domain defined by Eqs. (2.22), then, for any u(x) in the domain, we have

u(x) = ∫ c(λ)ϕ(x,λ) dλ    Lu = ∫ λc(λ)ϕ(x,λ) dλ

These formulas reduce to the well-known ones for the Fourier integral if we put λ = k². They become

u(x) = (1/2π) ∫₋∞^∞ e^{jkx} dk ∫₋∞^∞ u(ξ)e^{−jkξ} dξ

Lu = (1/2π) ∫₋∞^∞ k²e^{jkx} dk ∫₋∞^∞ u(ξ)e^{−jkξ} dξ

It is clear that these formulas can be used to define functions of the operator L. We find

q(L)u = (1/2π) ∫₋∞^∞ q(k²)e^{jkx} dk ∫₋∞^∞ u(ξ)e^{−jkξ} dξ
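This Fourier-multiplier definition of q(L) can be exercised numerically with the FFT; a sketch in Python (the choice q(λ) = e^{−λ}, the grid, and the Gaussian test function are assumptions made for illustration; e^{−L} is the heat flow for unit time, so the exact answer for a Gaussian is known):

```python
import numpy as np

# Interpret q(L), L = -d^2/dx^2 on (-inf, inf), as the Fourier multiplier q(k^2):
# transform, multiply each mode e^{jkx} by q(k^2) = exp(-k^2), transform back.
N, width = 4096, 40.0
x = np.linspace(-width/2, width/2, N, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(N, d=width/N)

u0 = np.exp(-x**2 / 4)                          # Gaussian test function
qLu = np.fft.ifft(np.exp(-k**2) * np.fft.fft(u0)).real

# Heat flow for unit time spreads this Gaussian to sqrt(1/2) exp(-x^2/8)
exact = np.sqrt(0.5) * np.exp(-x**2 / 8)
assert np.max(np.abs(qLu - exact)) < 1e-10
```

The domain is taken wide enough that the periodicity imposed by the FFT has a negligible effect on the rapidly decaying Gaussian.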
We shall use these formulas to find the acoustic field produced in the region −∞ < x < ∞, 0 ≤ y < ∞ by sources of given strength f(x,y) and frequency ω if the line y = 0 represents a rigid surface. Mathematically, the problem is to find that solution of

∂²u/∂x² + ∂²u/∂y² + (ω²/c²)u = f(x,y)     (2.24)

satisfying the condition

∂u/∂y = 0    at y = 0     (2.25)

for −∞ < x < ∞. Here c is the velocity of sound in the medium. Proceeding as before, we put

L = −∂²/∂x²

over the interval (−∞, ∞), and then Eq. (2.24) becomes

d²u/dy² − (L − ω²/c²)u = f(x,y)

with the condition (2.25). By the method of variation of constants, we find that the solution of this ordinary differential equation is

u = −(1/2κ) ∫₀^∞ [e^{−κ|y−η|} + e^{−κ(y+η)}] f(x,η) dη

where κ = (L − ω²/c²)^{½}.
This result can be interpreted by means of the spectral representation of L. The function f(x,η) is represented in terms of the Fourier integral as follows:

f(x,η) = (1/2π) ∫₋∞^∞ e^{jkx} dk ∫₋∞^∞ f(ξ,η)e^{−jkξ} dξ

Now, remembering that

Le^{jkx} = k²e^{jkx}

we find that

u(x,y) = −(1/2π) ∫₋∞^∞ (e^{jkx}/2κ) dk ∫₀^∞ [e^{−κ|y−η|} + e^{−κ(y+η)}] dη ∫₋∞^∞ f(ξ,η)e^{−jkξ} dξ

where now κ = (k² − ω²/c²)^{½}.
2.10 Conclusion
The preceding discussion has shown how separable partial differential equations may be solved by operational means. The procedure is as follows: One of the differential operators is treated as a constant and the resulting simpler differential equation is solved. The solution of this simpler differential equation will contain functions of the operator that was treated as a constant. These functions of the operator are interpreted by the use of the spectral representation of the operator; thus, the solution to the partial differential equation is obtained.
The usual transform methods are applications of this theory. For example, the Laplace transform defines the spectral representation for the operator d/dt on the domain of functions u(t) satisfying the initial condition u(0) = 0; the Fourier transform defines the spectral representation for the operator −d²/dx² on the domain of functions u(x) defined over (−∞, ∞) such that the integrals

∫₋∞^∞ u(x)² dx and ∫₋∞^∞ u″(x)² dx

are finite; and the Hankel transform defines the spectral representation for the operator

−d²/dr² − (1/r) d/dr

on the domain of functions u(r), defined over (0, ∞), such that the integrals

∫₀^∞ u(r)² r dr and ∫₀^∞ [u″(r) + u′(r)/r]² r dr

are finite. These transforms will be studied in detail in Chap. 3. It is clear, then, that transform methods and operational calculus depend on the study of the spectral representation of differential operators.
REFERENCES
1. Boole, George, “Treatise on Differential Equations,” Cambridge University Press, London, 1859.
2. Bromwich, T. J. I’A., Operational Methods in Mathematical Physics, Phil. Mag., ser. 6, vol. 37, pp. 407–419, 1919.
3. Carson, J. R., Heaviside Operational Calculus, Bell System Tech. J., vol. 1, pp. 1–13, 1922.
4. Dunford, Nelson, and Jacob T. Schwartz, “Linear Operators,” Interscience Publishers, Inc., New York, 1958.
5. Friedman, Bernard, “Principles and Techniques of Applied Mathematics,” John Wiley & Sons, Inc., New York, 1956.
6. Heaviside, O., “Electromagnetic Theory,” vol. 1, The Electrician, London, 1893.
7. Leibnitz, G. W., “Mathematische Schriften,” vol. 1, edited by C. J. Gerhardt, Berlin, 1849.
8. Morrey, Charles B., Jr., Nonlinear Methods, chap. 16 in “Modern Mathematics for the Engineer,” First Series, edited by E. F. Beckenbach, McGraw-Hill Book Company, Inc., New York, 1956.
9. Pipes, Louis A., Matrices in Engineering, chap. 13 in “Modern Mathematics for the Engineer,” First Series, edited by E. F. Beckenbach, McGraw-Hill Book Company, Inc., New York, 1956.
10. Wagner, K. W., Über eine Formel von Heaviside zur Berechnung von Einschaltvorgängen, Arch. Elektrotechnik, vol. 4, pp. 159–193, 1916.