Methods of Mathematics Applied to Calculus, Probability, and Statistics

21 Applications of Infinite Series

21.1 THE FORMAL ALGEBRA OF POWER SERIES

Power series play a prominent role in much of mathematics; hence we need to know how to handle them. It is evident from the linearity of both differentiation and summation that if

then

For multiplication of two power series (we ignore for the moment the problem of the convergence of the product), write the product

How can a term in xⁿ arise in the product? To avoid troubles, we first write the products with dummy variables of summation m and k:

We see that the condition for the term xⁿ to arise is

m + k = n

Then, and only then, can we get a term Jtⁿ; the sum of the exponents of the individual terms in the product must add up to the value n. Rewriting this, we get

k = n − m

Thus each coefficient w_n of the product is given by corresponding finite sum of products

This is called the convolution of the two sequences of coefficients. Its structure is best seen in the Figure 21.1-1, where we have written on one strip the coefficients of the u_n series in order as you go down, and have written the coefficients of v_n in the opposite order. Each coefficient w_n is found from the sum of all the products when the two arrays are shifted an amount of exactly n. We have seen this before, for example in the sum of two random variables. Note that the convolution of the u_k with the v_m is the same as the convolution of the v_m with the u_k.

images

Figure 21.1-1 Convolution

Example 21.1-1

Consider the product e^xe^y. We get

If w_n are the coefficients of the product, then the convolution gives

If (1) we write this as (multiply and divide by n!), (2) note the binomial coefficients appearing, and (3) apply the binomial expansion, we get

We have, therefore,

This is, of course, what you expect since

We have proved that the function exp x is an exponential function.

Example 21.1-2

Division of power series is done much like the division of polynomials. Using

we divide the two series:

Thus, assuming that the quotient series converges, we have

We cannot expect convergence beyond the first zero of the denominator since there the quotient is infinite, and the power series, if it represented the quotient, would diverge there. We will simply state that this series does converge that far.

21.2 GENERATING FUNCTIONS

In Chapter 2 we introduced the idea of a generating function of the binomial coefficients. The expansion

is the generating function of the binomial coefficients C(n, k). We also saw that by multiplying generating functions we could find various identities. For example the equation

(1 + t)^n+m = (1 + t)ⁿ(l + t)^m

regarded as generating functions, leads to various identities among the binomial coefficients when we equate the corresponding linearly independent terms on both sides (the corresponding powers of t). Because the expansion of a function in a power series about a point is unique, it follows that the corresponding coefficients of like powers are equal. We can differentiate and integrate such identities to get new identities.

We now regard the expansion

as a generating function of the numbers u_n [or of functions of x in case u_n = u_n(x)].

Example 21.2-1

We can regard the equation

as the generating function of the numbers 1, 1, 1, …. If we differentiate this and then multiply through by t, we get

which is the generating function of the numbers 1, 2, 3, ….

Had we integrated the original expression (21.2-1) from 0 to t (and multiplied by –1), we would have found, for |t| < 1,

where in the second line we have shifted the dummy index of summation by 1. It is easy to verify that these two expansions are the same by simply writing out the first few terms and comparing the two forms.

Example 21.2-2

Given a probability distribution p(n) for n = 0, 1, 2,…, the corresponding generating function is

If we differentiate this expression, we get

The expected value of the distribution p(k) (the mean of average) is

If we multiply the derivative Equation (21.2-5) by t and differentiate again, we get

Hence, putting t = 1, we have

Recalling that the variance is

we have that

Thus the generating function enables us to find the first and second moments of a distribution by the simple process of differentiation. Indeed, it is easy to see how to get the higher moments, if you want them, from the generating function of the distribution. The formula (21.2-8) for the variance is very useful.

A power series can, therefore, be regarded as a generating function of the coefficients. The student may notice that in this approach the convergence of the series is not important; it is the formal relationships that seem to matter. When we multiply two generating functions together, we get the corresponding generating function of the new set of numbers. We saw such a process when we multiplied two powers of (1 + t)^k and compared the results. If the numbers are to be combined according to the corresponding convolution, then the corresponding coefficients of like powers are to be identified without regard to the convergence. The question of the convergence of generating functions is a delicate matter that has long been debated; it is, of course nice if the series converges, but at times useful results are obtained by the formal manipulation of generating functions without regard to the convergence or divergence of any of the expansions.

The student may wonder how we get generating functions in cases where we do not know them already. The Fibonacci numbers in the next example will show how: (1) the assumption of the generating functions of the Fibonacci numbers, (2) applying the known relationship that defines the Fibonacci numbers leads us to the generating function, and (3) the manipulation of the function to get it expanded in a formal power series, without the cost of trying to compute the general derivative, produces the coefficients.

Example 21.2-3

The Fibonacci numbers (Example 2.4-3)are defined by

we assume (1) that we have the generating function of the Fibonacci numbers:

If we multiply by t, we get

and shifting the dummy index of summation from k + 1 to k, we have

Evidently, multiplying the generating function by t lowers the index of summation by 1. This suggests that dividing by t would shift the index up by 1. We have

We are now ready to do step (2). The fundamental defining equation for the Fibonacci numbers is

We write out the three expansions in detail to see what is going on.

images

We see that (remember f₀ = 0)

and the coefficients of all other powers are automatically zero because of the defining relation f_n+1 – f_n – f_n–1 = 0. Solve for F(t) to get

We have the generating function for the Fibonacci numbers. It is now a matter of getting it into a power series form (3). We could divide it out, but that appears to give a confusing result, so we first apply the method of partial fractions to the right-hand side.

where t₁ and t₂ are the roots of the quadratic in the denominator

t² + t – 1 = 0

These roots are

After tedious algebra (and no one said that the method is easy to carry out, only that the ideas are simple), we get

A little more algebra and we have

But recalling that for this quadratic equation the product of the roots t₁t₂ = –1, we get

We can now divide these out using the form

to get finally for the coefficient of tⁿ

which upon further algebra yields the value given in Example 2.4-3:

Generalization 21.2-4

From the single example of the Fibonacci numbers, we see that with any given set of numbers defined by a difference equation with constant coefficients of the form

we can do a similar process and obtain the general term in a closed form, provided we can factor the denominator into linear factors. Indeed, if quadratic factors arose, we could, at some cost in effort, divide out the quadratic factors.

EXERCISES 21.2

1. The generating function of the sequence 1, 1, 1, … is 1/(1 – t). Show that the square of it gives the numbers 1, 2, 3, …. What does the cube give?

2. If p(n) = l/2 ⁿ⁺¹ for n = 0, 1, 2, …, find the generating function and check that P(l) = 1. Find the mean and variance of the distribution.

3. Using the recurrence relation u_n = au_n-₁, u₀ = 1, find the generating function.

4.* Using the recurrence relation u_n+1 = au_n + bu_n–1 with u₀ = 0, u₁ = 1, find the generating function.

21.3 THE BINOMIAL EXPANSION AGAIN

In Section 2.4 we found the binomial expansion for integer n:

images

where the binomial coefficients C(n,k) are given by the formula

Looking at the expansion, we see that it is simply the Maclaurin expansion of the function (1 + x)ⁿ. We can get it directly by

To get the coefficient of x^k, the kth derivative must be diveded by k!, and we see that we get exactly the same result as we earlier found for the binomial expansion.

Now that we have two different derivations of this important result, we are in a position to compare the assumptions behind the derivations and perhaps loosen the conditions. For the earlier proof we used mathematicalinduction and were pretty well confined to positive integer exponents n. In this new derivation we are not so limited. If we believe

for the exponent n,whether or not an integer, then we have the corresponding binomial expansion. But we will now get an endless sequence of coefficients; the numerator factor will not produce a zero coefficient past some point unless n is an integer. It is natural to extend the defination of the binomial coefficients to noninteger values of n, and say

for all real values of n and integer k. (Compare with Example 9.10-3.)

We will have, for noninteger exponents n, an infinite series

What about the convergence? The ratio test gives

and as k → ∞, we see that for |x| < 1 we have convergence. We will not examine the convergence at the ends of the interval.

The student of the foundations of what we know should note that the derivative

was derived for rational numbers only, and by assumption 1 of Section 4.5 we extended it to the irrational exponents. However, if the student believes that for all real n

y = xⁿ

is equivalent to

ln y = n ln x

then differentiating this gives

from which we get

without use of assumption 1 of Section 4.5. Thus we have an alternative derivation for the derivative of a power of x, one that may seem to make fewer assumptions.

We need to examine the binomial coefficients for negative integers. We have the definition

If we replace n by -n we get

Thus we can write (for positive n) the negative binomial coefficients as

This is the fundamental tool for converting binomial coefficients from negative exponents to positive exponents.

Example 21.3-1

Suppose we check this binomial expansion formula for posivitive exponents by observing that

We have the Maclaurin expansion for the first few terms:

images

We have now to multiply this series by itself (using detached coefficients); we have

We can similarly check the binomial expansion for negative exponents by using

In this way we can find many identities among the binomial coefficients.

Example 21.3-2

First occurrence. We often want to know when something will happen for the first time. Consider the tossing of a coin, or other Bernoulli trial, with probability of success p and of failure q = 1 — p. For independent trials we have the probability of the first failure to occur on the kth trial equals: success, success, success, …, success, failure. This probability is

As a partial check of this, we examine the sum of all possible outcomes:

which is what it should be, since the first occurrence must occur sometime.

What is the expected value of this probability of first occurrence? We need the generating function of the probabilities. This generating function is (by definition)

To get the mean and variance of the distribution of first occurrence [see formula (21.2-8)], xwe need the first two derivatives of the generating function. We have

Evaluating these at t = 1, we get

Example 21.3-3

Suppose we now ask for the probability distribution of the time to the second failure (occurrence). Evidently, the first failure can occur at any time before the second, so considering all those times of first failure plus the second failure, which add up to the second failure at the kth trial, we see a convolution arising. We have a sum of all products of occurrence for the first and second failure such that the sum of their times is the given amount. The generating functions of each failure (first and second) are the same since the two are assumed to be independent. Hence, the generating function of the time to second failure is just the square of the generating function to first failure:

The individual terms involve the binomial coefficients of order —2. We have, using (21.3-3),

images

Using C(n, k) = C(n, n– k), we have

The probability distribution of the second failure is given by (k – 1)p ^k–2q².

Case History 21.3-4

The Catalan numbers. The Catalan numbers occur in many different applications, but the background from which they come is, in each case, difficult to explain simply. The stages in the solution leading to them illustrate the use of many of the small ideas that we have just developed. The numbers u_n are defined by the convolution

with u_o = 1. This convolution suggests immediately that we deal with generating functions. Let the generating function be

We multiply the convolution equation by tⁿ and sum with respect to n. We see that

This can be written as a quadratic equation

whose solution is

This is no u_-1 term; hence we must use the minus sign. We have, therefore,

Our problem is now to get this expression as a power series in t. To do this, we begin with the binomial expansion of the radical:

We see that the general term

Hence

and our original expression becomes (in the second line shift the summation index)

The number u_n is (the coefficient of tⁿ)

These are known as the Catalan numbers, and occur in numerous places in mathematics.

The values of the sequence of Catalan numbers can be either found one at a time from this formula, or else we can find a recurrence relation. To find a recursive formula, we write

The first few Catalan numbers are uo = 1, u₁ = 1, u₂ = 2, u₃ = 5, u₄ = 14, u₅ = 42, u₆ = 132, u₇ = 429, …. The size of the nth Catalan number can be found using Stirling’s approximation for n!

EXERCISES 21.3

1. Write C(– 3, k) as a positive binomial coefficient.

2. Verify by expanding and multiplying out the first four powers of (1 + x)(1 + x)^-1 = 1.

3. Verify by expanding and multiply ingout the fir st four terms of (1 + x) ^1/2 (1 + x)–^1/2 = 1.

4. Find the probability of the third occurrence at trial k.

5. Find the mean and variance of the probability distribution of the second occurrence.

6. Show that (1 + x)^1/3 (1 + x)^2/3 = 1 + x through the fourth-degree terms.

7. Using the Stirling approximation, find an estimate of the nth Catalan number.

21.4 EXPONENTIAL GENERATING FUNCTIONS

The form of the generating function series we assumed is the standard one. However, at times it is worth using various alternative forms. The second most common form for a generating function of the numbers u_n is the exponential generating function, which has the form

The difference is that we now are dealing with the original numbers u_n divided by nl. Note that the formulas for the mean and variance do not apply with this generating function.

Example 21.4-1

Suppose we want the moments of the normal distribution centered about the origin (for convenience). The probability density function is

and the moments are

We form the exponential generating function of the moments

We have carefully arranged that inside the integral there will be the sum

Therefore, the moment generating function is of the form

The exponent from the p(x) will add to this to give us, in the exponent,

This immediately suggests completing the square:

We now have

Change the variable of integration to x – σ²t – y. This gives

This is the generating function of the moments of the normal distribution. We now express this in powers of t. We have immediately, upon expanding the exponential in a Maclaurin expansion,

It remains to equate the like powers of t. Clearly, all the odd moments are zero (the integrand is odd over a symmetric range, so this clearly checks for all the odd powers of t). For the even powers, we get

The generating function approach gives us all the moments of the normal distribution at one time.

The method just used can be applied to many situations besides finding moments of a probability distribution. With variations you can find all the integrals of a family of the form

provided you can do the corresponding integral that arises, depending on the kind of generating function you adopt. It is then a matter of expanding the generating function into powers of t and equating like terms.

EXERCISES 21.4

1. Using a generating function, find the moments of the distribution p(x) = a exp (–ax), x ≥ 0.

2. Using a generating function, find the moments of the function sin πx, 0 ≤ x ≤ 1.

21.5 COMPLEX NUMBERS AGAIN

The expansion of the exponential function is

Suppose we make the formal substitution

x = iθ

where i = We will get

and if we rearrange this in terms of real and imaginary numbers, (using i² = –1, i³ = –i, i⁴ = 1, …), we get

images

This is the famous Euler identity. The formula is sometimes written as exp (iθ) = cos θ. The formula is certainly surprising at first, so we evaluate it for a number of values.

images

We see that each additional increase of π/2 to the angle gives a multiplicative factor of i. Thus the results are consistent with the multiplicative property of the exponential function. The particular value

e^πi = –1

is very interesting as it shows that a close relationship exists between the two transcendental numbers, e and π, that dominate the calculus and its applications. (The number y = 0.577… is a very different number in the sense that it is apparently unrelated to e and π).

Let us explore the matter further and see if cos θ + i sin θ is truly an exponential. We begin with

and write each term out in the sin and cos form.

Using the addition formulas of trigonometry, we have the result

This is the de Moivre (1667–1754) identity (of whom Newton once said, “Ask Monsieur de Moivre; he knows these things better than I do.”). The identity (21.5-2) provides a very convenient way of remembering the addition formulas of trigonometry.

In this identity we can set θ = θ and use the same methods as we used in Section 3.8 for studying the exponentials, to get the fact that Euler’s and de Moivre’s identities are true for all rational powers. Using the assumption 1 of Section 4.5, they are also true for all real numbers. To get to negative exponents, we can either replace i by – i throughout, or a bit more reasonably note that

Hence cos θ – i sin θ is the reciprocal of cos θ + i sin θ.

In some respects the de Moivre identity is the heart of trigonometry;it explains

so much of what one sees. For example, it is what is behind the result (Section 2.3) where we showed that cos nx = polynomial of degree n in cos x. It is simply the nth power of the cos x + i sin x, plus noting that the real parts have only sin² x, which is easily converted to cos² x so the result is only in cos x.

It is possible that you might have felt better about all the formalism if we had started with

exp (ax)

and set a = i, but it would be formalism nevertheless.

We see that |exp(ix)| = 1. If we plot cos θ + i sin θ in the complex plane (Section 5.12), we find Figure 21.5-1. This figure should give you a bit more

Figure 21.5-1 e^iθ

confidence in the results obtained so far; you can see that the point described by

e^iθ = cos θ + i sin θ

goes around the unit circle as θ goes from 0 to 2π, and the picture has the right periodicity for both sides of the equation. The geometric approach complements and reinforces the algebraic approach.

We write out the two Euler identities (21.5-1) (note the shift to the x variable),

e^ix = cos x + i sin x

e^–ix = cos x – i sin x

add, and divide by 2, to get

Similarly, by subtraction we get

Thus we can convert from trigonometry to complex exponentials and back again as we please. This is useful, especially if you abbreviate e^ix by writing it as a new variable z:

You have

images

These substitutions reduce many trigonometric identities to problems in rational algebra where you do not need to recall the many trigonometric identities. In the sense of analytic geometry versus synthetic geometry, this is the analytic approach to trigonometric identities. The analytic approach to trigonometry has the corresponding power. Note that

Example 21.5-1

Consider the expression (compare with Section 17.4)

images

In the above derivation, we have used mainly algebra and have used trigonometry only at the two ends. Of course, the derivation is much more rapid if you know all the various identities and when to use them; the above method is useful when you have no idea of what to do with the mess of trigonometric functions. The method is especially useful in handling “regular” identities.

Example 21.5-2

Consider the Euler identity

Writing this in terms of the symbol z, we get

This is clearly a geometric progression, and we get (arranging things in a symmetric form)

which is the same as before, but a lot more directly obtained.

From the rules of exponents it follows that for complex exponents we have

exp (a + ib) = (exp a) (cos b + i sin b)

Example 21.5-3

In Example 16.5-7, we considered integrals of the form

and used integration by parts twice to get each answer. With complex exponentials we can get the two results much more easily. They are the imaginary and real parts of

where we have ignored the constant of integration, and for the last equation have multiplied the numerator and the denominator by the complex conjugate a – ib to make the denominator real. We have now to pick out the real and imaginary parts of the right-hand side. They are

which agrees with the earlier results (a here is —a there).

What are we to think about all this formalism? Generally, most mathematicians are a lot happier when it is placed on a postulational basis of what appears to be more elementary assumptions. Engineers and scientists tend to feel that it must be true since the whole machinery works so well with the known parts of mathematics that they use daily. “Let the mathematicians make it more rigorous by the mathematician’s standards if they wish,” is their attitude, “we know it is true.”

And this illustrates much of the progress of mathematics. What was at one time assumed to be obviously true by those using it, mathematicians have later analyzed more deeply and placed on a foundation of a postulational system that is logically more satisfying to them, and the results are generally more teachable. The postulational approach makes things more compact and makes them more certain in the sense that the assumptions are fewer, more visible, and easier to study for contradictions. But the engineer’s and scientist’s view is that whatever they believe is true is up to the mathematicians to account for; and if what the mathematicians report is not what the users want, then let the mathematicians try again. Of course, what was “obvious” to the engineers has more than once turned out to be false! The rigor of mathematics is the hygiene necessary to prevent these kinds of mistakes. This process describes much of the history of mathematics. The first formalizations are often not adequate and further work must be done. Thus mathematics is a cooperative effort (with much fighting between the two sides) of creating the useful, artistic structure we call mathematics. Mathematical rigor is seldom creative, but it is a necessary criticism of intuition if we are to avoid endless mistakes. We need to take advantage of the best of both approaches.

Of course, it is often the mathematician who first notices some interesting relationship and extends, generalizes, and abstracts it before it is needed by the users of mathematics. There is both a risk and an advantage of going forward without regard to the needs of the users; you are free from petty details, but you risk making an irrelevant generalization. Much useful mathematics has thus been found, and that which is ill generalized is soon forgotten.

EXERCISES 21.5

1. Find cos 6x in terms of powers of cos x.

Using the z notation, complete the following identities:

2. cos ²x – sin²x

3. 2 sin x cos x

4. cos³x in terms of cos x

5. tan 2x/(1 – tan²x)

6. sin x + sin 2x + sin 3x + … + sin nx

7. cos x + 2 cos 2x + 3 cos 3x + … + n cos nx

21.6 HYPERBOLIC FUNCTIONS

The Euler identities

suggest looking at the corresponding real expressions. These are the hyperbolic functions

Cosh rhymes with “gosh,” and sinh is pronounced “cinch.” The other hyperbolic functions are defined similarly:

Their graphs are shown in Figure 21.6-1.

images

Figure 21.6-1 The hyperbolic function

Clearly, these functions are analogous to the trigonometric functions with minor differences. For example, we have

images

Similarly,

and there are no changes in sign as there are for the derivatives of the trigonometric functions.

You can show in a number of ways that the corresponding addition formulas are

For example, write out the right-hand sides as exponentials and then simplify.

The power series for the two main hyperbolic functions closely resemble those for the corresponding trigonometric functions (except that there are no changes in sign):

The inverse hyperbolic functions can be studied easily by simply interchanging the two variables and then solving for y. Thus we start with, for example,

cosh x = y

and take the inverse function

We have

This is a quadratic exp y, which is easily solved for

But since y > 0, we need to use the plus sign. Taking logs, we get

This is suitable for the range |x| > 1.

EXERCISES 21.6

1. Write out all the definitions of the hyperbolic functions.

2. Differentiate all the hyperbolic functions.

3. Solve the quadratic that arises in finding the arctanh x. Be careful with the argument ranges.

4. Find the arcsinh x as a In function.

5. Find by differentiation the power series of cosh x and sinh x.

21.7* HYPERBOLIC FUNCTIONS CONTINUED

Why are they called “hyperbolic functions?” They are related to the hyperbola

x² – y² = 1

much as the trigonometric (circular) functions are related to the circle

x² + y² = 1

In the first equation, x = cosh θ, and in the second, x = cos 0. Correspondingly, for the y we have sinh θ and sin θ. Our problem is to identify the angle θ.

For the circular functions, we defined the angle θ as the ratio of the arc length cut out by the angle to the radius of the circle. For this analogy between the trigonometric and hyperbolic functions, we use the alternative definition of the angle as twice the area of the sector corresponding to the angle (see Figure 21.7-1). We need to find the shaded area to measure the “angle.”

The proof is made easy if we refer the hyperbola to its asymptotes. Set

images

Figure 21.7-1 Hyperbolic angle

Then

The point A has coordinates

while the point P has coordinates

The angle θ, being the double of the area, is

where since y > 0 for positive angles the plus sign is used. This is the arccosh x as we asserted at the end of Section 21.6. From this it follows that

x = cosh θ

which shows that the angle plays the same role for both types of functions.

Thus the hyperbolic functions can be related to the hyperbola as the trigonometric (circular) functions are related to the circle. The analogy is often useful. However, it should be noted that when you represent some function in terms of the sinh x and cosh x for large positive x you are in trouble, since the two functions are both approximately

While, mathematically, the two functions cosh x and sinh x are linearly independent, in the presence of roundoff errors they are practically dependent (for |x| large).

21.8 SUMMARY

In this chapter we have greatly extended the use of infinite series, especially power series. Since the power series is unique in any range, it follows that the powers of x are linearly independent for all powers, and we may equate the corresponding coefficients in two expressions that represent the same function in the same range. This is the basis for the generating function approach to many problems, especially combinatorial problems, where it is the main tool.

The binomial theorem was extended, by the usual approach, from integers to fractions, and then to all real exponents. Some tests of this new formula were given to both convince you that the formal expansions were consistent and to give you some practice in handling fractional exponents. Then a few of the numerous applications were indicated.

Next, we took on the tentative extension from the real numbers to the complex in the matter of power series. We simply asserted that the same formal power series we obtained for reals applied also for complex numbers. In particular, we used pure imaginary exponents to get the important Euler identities and found their relationship to the trigonometric functions. Thus we connected apparently different parts of mathematics together and found an inner unity. This phenomenon was discussed in Chapter

1 when we observed that the same kinds of things kept coming up in very different situations—the universality of mathematics. It follows that the inverse functions must similarly be related.

We then examined a close analogy to the trigonometric functions, called the hyperbolic functions. They again showed an inner structure that is not evident on the surface. This time we had to alter our original definition of angle to make the close analogy, something that should not surprise you. When you begin mathematics, there are often several ways of starting, several ways of defining things. As you progress and search for the inner unity, you sometimes have to go back and pick a different approach to make an analogy work. It is this inner unity, behind the surface diversity, that mathematicians find so beautiful about their subject. And this unity applies not only to pure mathematics, but often occurs in its various applications—the same mathematics is used in widely different fields. In a sense the mathematician is the last universalist; a sound knowledge of mathematics enables you to work in many different fields, and the contributions in your field are apt to reveal the unity behind the apparent diversity in many different fields of application.