Real Variables with Basic Metric space Topology

DIFFERENTIATION

5.1 THE DERIVATIVE AND ITS BASIC PROPERTIES

In this section we examine the differentiation process and obtain the Mean Value Theorem, which has several basic applications, including L’Hospital’s rule and Taylor’s formula with remainder.

5.1.1 Definition and Comments

If f : (a, b) → R and x ∈ (a, b), the derivative of f at x is defined by

\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}, \]

provided the limit exists and is finite. In this case we say that f is differentiable at x.
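As a small numerical illustration of the definition, the following sketch evaluates the difference quotient for shrinking increments. The helper name `difference_quotient` and the test function `square` are hypothetical choices made for this example only.

```python
def difference_quotient(f, x, h):
    """Return [f(x + h) - f(x)] / h for a nonzero increment h."""
    if h == 0:
        raise ValueError("the increment h must be nonzero")
    return (f(x + h) - f(x)) / h


def square(t):
    """Illustrative function f(t) = t^2, whose derivative at x is 2x."""
    return t * t


# For f(t) = t^2 the quotient equals 2x + h, so it approaches 2x = 6 at x = 3.
for h in (0.1, 0.01, 0.001, -0.001):
    print(h, difference_quotient(square, 3.0, h))
```

Both positive and negative increments are tried because the limit in the definition is two-sided.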

We will not derive the familiar calculus formulas for the derivative of a sum, product, or quotient of two functions, but the chain rule is worth looking at. Here we assume that f is differentiable at x (when making a statement of this type, we always assume that f is defined at least on some open interval containing x) and g is differentiable at y = f(x).

If h is the composition of f and g, that is, h(t) = g(f(t)), we wish to show that h is differentiable at x and

\[ h'(x) = g'(f(x))\, f'(x). \]

There is a rather intuitive argument that almost works. We use Δx instead of h for the increment in x, and write Δy = f(x + Δx) − f(x), so y + Δy = f(x + Δx). Let Δz = h(x + Δx) − h(x); we then have

\[ \Delta z = g(f(x + \Delta x)) - g(f(x)) = g(y + \Delta y) - g(y). \]

Thus,

\[ \frac{\Delta z}{\Delta x} = \frac{g(y + \Delta y) - g(y)}{\Delta y} \cdot \frac{f(x + \Delta x) - f(x)}{\Delta x}. \tag{1} \]

We would simply like to let Δx → 0 to obtain the desired result. However, there is a problem of division by 0. Although Δx is restricted to be nonzero by definition of f′(x) (see the remarks at the beginning of Section 4.3), we have no guarantee that Δy will be nonzero; since Δy is determined by Δx, we have no control over it. One possible solution is to define [g(y + Δy) − g(y)]/Δy to be g′(y) if Δy = 0. In this case, f(x + Δx) − f(x) = Δy = 0, so the right side of (1) is 0; also Δz = g(y + Δy) − g(y) = 0, so the left side is 0 as well, and (1) remains valid. Now if we make Δx sufficiently small, [f(x + Δx) − f(x)]/Δx will be arbitrarily close to f′(x). Furthermore, if we knew that Δx → 0 implies Δy → 0, we would be able to conclude that [g(y + Δy) − g(y)]/Δy approaches g′(f(x)), proving the chain rule. But the statement that Δx → 0 implies Δy → 0 is just the assertion that differentiability implies continuity, which we now prove.

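5.1.2 THEOREM. If f is differentiable at x, then f is continuous at x.

Proof. Let Δy = f(x + Δx) − f(x); then

\[ \Delta y = \frac{\Delta y}{\Delta x} \cdot \Delta x \to f'(x) \cdot 0 = 0 \quad \text{as } \Delta x \to 0. \]

Thus, f(x + Δx) → f(x) as Δx → 0. ■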
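With continuity available, the chain rule argument sketched earlier can be checked numerically. The sketch below computes the two factors [g(y + Δy) − g(y)]/Δy and [f(x + Δx) − f(x)]/Δx, using the convention that the first factor equals g′(y) when Δy = 0. The functions `inner` and `constant` and the helper `chain_factors` are hypothetical names introduced only for this illustration.

```python
import math


def chain_factors(f, g, g_prime_at_y, x, dx):
    """Return the two difference quotients whose product is dz/dx.

    If dy == 0, the first factor is taken to be g'(y), which is the
    convention suggested in the text; g_prime_at_y supplies that value.
    """
    y = f(x)
    dy = f(x + dx) - y
    first = g_prime_at_y if dy == 0 else (g(y + dy) - g(y)) / dy
    second = dy / dx
    return first, second


def inner(t):
    """Hypothetical inner function f(t) = t^2."""
    return t * t


def constant(t):
    """Hypothetical inner function for which dy == 0 for every dx."""
    return 5.0


x = 1.0
for dx in (1e-2, 1e-4, 1e-6):
    first, second = chain_factors(inner, math.sin, math.cos(inner(x)), x, dx)
    # The product should approach g'(f(x)) f'(x) = 2 cos(1).
    print(dx, first * second, 2 * math.cos(1.0))

# With a constant inner function the second factor, and hence the product, is 0.
print(chain_factors(constant, math.sin, math.cos(5.0), x, 1e-3))
```

The constant inner function is included because it is exactly the situation in which Δy = 0 for every Δx, so the naive quotient would divide by zero.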

The familiar results on local maxima and minima may now be established.

5.1.3 THEOREM. Let f be differentiable at x, and assume that f has either a local maximum at x (that is, f(x + h) ≤ f(x) for all sufficiently small h) or a local minimum at x (f(x + h) ≥ f(x) for all sufficiently small h). Then f′(x) = 0.

Proof. Suppose f has a local maximum at x; then, for all sufficiently small h,

\[ \frac{f(x + h) - f(x)}{h} \le 0 \ \text{ if } h > 0, \qquad \frac{f(x + h) - f(x)}{h} \ge 0 \ \text{ if } h < 0. \]

Let h → 0 to conclude that f′(x) is both ≤ 0 and ≥ 0; hence f′(x) = 0. The argument for a local minimum is similar. ■

We now prove a result that is perhaps the most useful of all differentiation theorems.

5.1.4 Mean Value Theorem

Let f : [a, b] → R, and assume that f is continuous on [a, b] and differentiable on (a, b). Then for some x ∈ (a, b) we have

\[ f'(x) = \frac{f(b) - f(a)}{b - a}. \]

Thus, at x, the slope of the tangent to the curve is the same as the slope of the chord joining (a, f(a)) to (b, f(b)) (see Fig. 5.1.1). In the special case f(a) = f(b) = 0, we have f′(x) = 0 for some x ∈ (a, b), so that the curve has a horizontal tangent. This is known as Rolle’s Theorem (see Fig. 5.1.2). (Intuitively, if the average speed of a car is 60 mph, there is an instant at which the speedometer reads exactly 60. If the car is slower at the beginning, it must be faster later.)

Proof. We consider Rolle’s Theorem first. If f is identically 0, there is nothing to prove, so assume f(t) > 0 for some t ∈ (a, b). Since f is continuous on the compact set [a, b], by Corollary 4.2.2, f attains a maximum at some x, necessarily in (a, b). (If f(t) < 0 for some t, we obtain a minimum at some point of (a, b).) By Theorem 5.1.3, f′(x) = 0.

Figure 5.1.1 Mean Value Theorem

Figure 5.1.2 Rolle’s Theorem

The general case may be reduced to Rolle’s Theorem by subtracting a linear function from f. Define the function g by

\[ g(t) = f(t) - f(a) - \frac{f(b) - f(a)}{b - a}\,(t - a), \qquad a \le t \le b. \]

Then g is continuous on [a, b], differentiable on (a, b), and g(a) = g(b) = 0. Furthermore,

\[ g'(t) = f'(t) - \frac{f(b) - f(a)}{b - a}, \qquad a < t < b. \]

By Rolle’s Theorem, g′(x) = 0 for some x ∈ (a, b), and the result follows. ■
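The theorem asserts only that a suitable point x exists. When f′ happens to be continuous and f′ minus the chord slope changes sign at the endpoints, such a point can be located numerically. The sketch below uses bisection, which is a technique chosen for this illustration and is not part of the proof; `mean_value_point`, `cube`, and `cube_prime` are hypothetical names.

```python
import math


def mean_value_point(f, f_prime, a, b, tol=1e-12):
    """Locate x in (a, b) with f'(x) = [f(b) - f(a)] / (b - a) by bisection.

    Assumes (beyond the theorem's hypotheses) that f' is continuous and
    that f'(t) - slope changes sign between a and b.
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    if (f_prime(lo) - slope) * (f_prime(hi) - slope) > 0:
        raise ValueError("no sign change at the endpoints; bisection not applicable")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Keep the half-interval on which f' - slope still changes sign.
        if (f_prime(lo) - slope) * (f_prime(mid) - slope) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


def cube(t):
    """Illustrative function f(t) = t^3."""
    return t ** 3


def cube_prime(t):
    """Derivative of the illustrative function: f'(t) = 3 t^2."""
    return 3 * t * t


# On [0, 2] the chord slope is 4, so 3 x^2 = 4 and x = 2 / sqrt(3).
x = mean_value_point(cube, cube_prime, 0.0, 2.0)
print(x, 2 / math.sqrt(3))
```

For a function whose derivative is not continuous, the theorem still applies but this search procedure may not.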

5.1.5 COROLLARY. Let f be differentiable on the open interval I.

(a) If f′ ≥ 0 on I, then f is increasing on I; that is, if a, b ∈ I, a < b, then f(a) ≤ f(b). If f′ > 0 on I, then f is strictly increasing on I.

(b) If f′ ≤ 0 on I, then f is decreasing on I; that is, if a, b ∈ I, a < b, then f(a) ≥ f(b). If f′ < 0 on I, then f is strictly decreasing on I.

Proof. If a, b ∈ I, a < b, then, by Theorem 5.1.4, f(b) − f(a) = (b − a)f′(x) for some x ∈ (a, b); the result follows. ■

The following generalization of Theorem 5.1.4 will be needed.

5.1.6 Generalized Mean Value Theorem

Assume that f and g are continuous real-valued functions on [a, b], both differentiable on (a, b). We assume that g′ is never 0 on (a, b); hence, g(b) − g(a) ≠ 0 by Theorem 5.1.4. Then for some x ∈ (a, b) we have

\[ \frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(x)}{g'(x)}. \]

When g(t) = t, this reduces to Theorem 5.1.4.

[Intuitively, if car f covers four times as much distance as car g, there is an instant when its speedometer reads four times as much. (If f′/g′ < 4 at the beginning, it must be > 4 later, to compensate.)]

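Proof. Consider the function h defined by

\[ h(t) = f(t) - f(a) - \frac{f(b) - f(a)}{g(b) - g(a)}\,[g(t) - g(a)], \qquad a \le t \le b. \]

Then h is continuous on [a, b], differentiable on (a, b), and h(a) = h(b) = 0. By Rolle’s Theorem, h′(x) = 0 for some x ∈ (a, b). Since h′(x) = f′(x) − ([f(b) − f(a)]/[g(b) − g(a)]) g′(x) and g′(x) ≠ 0, the theorem is proved. ■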
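A concrete instance may help. One auxiliary function with the properties used in the proof, h(a) = h(b) = 0, is h(t) = f(t) − f(a) − r[g(t) − g(a)] with r = [f(b) − f(a)]/[g(b) − g(a)]; this particular formula is an assumption of the sketch. The functions `cube` and `square` and the helper `make_auxiliary` are hypothetical names.

```python
def make_auxiliary(f, g, a, b):
    """Build h(t) = f(t) - f(a) - r * (g(t) - g(a)), r = [f(b)-f(a)]/[g(b)-g(a)]."""
    r = (f(b) - f(a)) / (g(b) - g(a))

    def h(t):
        return f(t) - f(a) - r * (g(t) - g(a))

    return h, r


def cube(t):
    """Illustrative f(t) = t^3."""
    return t ** 3


def cube_prime(t):
    return 3 * t * t


def square(t):
    """Illustrative g(t) = t^2, whose derivative 2t is never 0 on (1, 2)."""
    return t * t


def square_prime(t):
    return 2 * t


h, r = make_auxiliary(cube, square, 1.0, 2.0)
print(h(1.0), h(2.0))  # both values should vanish up to rounding

# Here f'(x)/g'(x) = 3x/2, so the ratio r = 7/3 is attained at x = 14/9.
x_star = 14 / 9
print(r, cube_prime(x_star) / square_prime(x_star))
```

The point 14/9 lies in (1, 2), as the theorem requires.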

Problems for Section 5.1

1. Let f(x) = x² sin(1/x), x ≠ 0; f(0) = 0. Show that f is differentiable everywhere.

2. If f : R → R and for some r > 0 and all x, y ∈ R we have

\[ |f(x) - f(y)| \le |x - y|^{1 + r}, \]

show that f is constant.

3. Let p(x) = a₀ + a₁x + ... + aₙxⁿ be a polynomial with real coefficients. If all roots of p are real, show that all roots of the derivative p′ are real also.

4. If the hypothesis that g′ is never 0 on (a, b) is dropped from the statement of the Generalized Mean Value Theorem, the result is no longer true. Can you give an explicit counterexample (preferably one with g(b) − g(a) ≠ 0 and f′(x₀) = 0 whenever g′(x₀) = 0, with f′(x)/g′(x) approaching a finite limit as x → x₀)?

5.2 ADDITIONAL PROPERTIES OF THE DERIVATIVE; SOME APPLICATIONS OF THE MEAN VALUE THEOREM

Let’s examine the assumption we made in Theorem 5.1.6 that g′ is never 0 on (a, b). It is natural to expect that either g′ > 0 on (a, b) or g′ < 0 on (a, b), for if g′(c) < 0 < g′(d) then g′ would be 0 somewhere between c and d, which is impossible. In presenting this argument, we are applying the Intermediate Value Theorem 4.3.2 to g′, but we have a problem because g′ need not be continuous. However, the result holds anyway, as we now show.

5.2.1 Intermediate Value Theorem for Derivatives

Assume f is differentiable on the open interval I and that a, b ∈ I, a < b, with f′(a) < c < f′(b). There is a point x ∈ (a, b) such that f′(x) = c.

Proof. Let g(t) = f(t) − ct; then g′(a) = f′(a) − c < 0, and g′(b) = f′(b) − c > 0. Since [g(a + h) − g(a)]/h → g′(a) as h → 0, we have g(a + h) − g(a) < 0 for h > 0 and sufficiently small. Similarly, [g(b + h) − g(b)]/h → g′(b) as h → 0; hence, g(b + h) − g(b) < 0 for h negative and sufficiently small. Thus, we can find points t₁, t₂ ∈ (a, b) such that g(t₁) < g(a) and g(t₂) < g(b). The point of doing this is to conclude that if x minimizes g(t), a ≤ t ≤ b (the existence of x is guaranteed by Corollary 4.2.2), then we must have a < x < b, so by Theorem 5.1.3, g′(x) = 0; in other words, f′(x) = c. ■
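The mechanism of the proof, minimizing g(t) = f(t) − ct over [a, b] and observing that the minimizer is interior, can be imitated numerically. The sketch below replaces the exact minimizer by the best point on a uniform grid, which is an approximation adopted for illustration; the function f(t) = t³ − t, the value c = 0.5, and the helper `argmin_on_grid` are hypothetical choices.

```python
def argmin_on_grid(g, a, b, steps=10000):
    """Return the grid point in [a, b] at which g is smallest."""
    points = [a + (b - a) * k / steps for k in range(steps + 1)]
    return min(points, key=g)


def f(t):
    """Illustrative function f(t) = t^3 - t."""
    return t ** 3 - t


def f_prime(t):
    return 3 * t * t - 1


a, b, c = 0.0, 1.0, 0.5  # f'(a) = -1 < c < f'(b) = 2


def g(t):
    """Auxiliary function from the proof: g(t) = f(t) - c t."""
    return f(t) - c * t


x = argmin_on_grid(g, a, b)
# The approximate minimizer is interior, and f'(x) is close to c.
print(x, f_prime(x))
```

Because the grid has spacing 10⁻⁴, the printed value of f′(x) agrees with c only approximately.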

5.2.2 COROLLARY. If f is differentiable on an open interval containing x₀, then f′ cannot have a simple discontinuity at x₀.

Proof. Intuitively, if f′ jumps at x₀, a typical situation might be f′(x) = 0, x < x₀; f′(x) = k, x ≥ x₀. By considering the cumulative area under f′, we conclude that f(x) = 0, x < x₀; f(x) = k(x − x₀), x ≥ x₀. But then f is not differentiable at x₀ (see Fig. 5.2.1).

For the formal proof, first assume f′ has a simple discontinuity at x₀ with f′(x₀⁻) < c < f′(x₀⁺), where f′(x₀⁺) and f′(x₀⁻) denote the right- and left-hand limits of f′ at x₀; see Fig. 5.2.2. By changing c slightly, we may assume c ≠ f′(x₀). By definition of a simple discontinuity, f′(t) → f′(x₀⁺) as t → x₀ from above, and f′(t) → f′(x₀⁻) as t → x₀ from below. It follows that we can find points a, b and an ε > 0 such that a < x₀ < b and

\[ f'(t) < c - \epsilon \ \text{ for } a \le t < x_0, \qquad f'(t) > c + \epsilon \ \text{ for } x_0 < t \le b. \]

Now apply Theorem 5.2.1 to obtain x ∈ (a, b) such that f′(x) = c; in view of the preceding inequalities we must have x = x₀. But this is a contradiction because f′(x₀) ≠ c.

Figure 5.2.1 Intuitive Argument That the Derivative Cannot Have a Simple Discontinuity

Figure 5.2.2 Proof of Corollary 5.2.2

There is one remaining possibility, namely, that the right- and left-hand limits of f′ at x₀ are equal but differ from f′(x₀), and this is covered by a similar argument (Problem 2). Thus, f′ cannot have a simple discontinuity at x₀. ■

For an example of a function whose derivative has a nonsimple discontinuity, see Problem 3.

We now discuss a basic application of the Generalized Mean Value Theorem.

5.2.3 L’Hospital’s Rule

In calculus, you encountered the problem of finding the limit of a quotient f(x)/g(x) as x → a, where f(x) and g(x) both approach 0 as x → a (the “0/0 case”), or f(x) and g(x) both approach ∞ as x → a (the “∞/∞ case”). In either case, if f and g are differentiable, with g′ never 0, on (a, b), and f′(x)/g′(x) → L as x → a, then f(x)/g(x) → L as x → a. (In this case, L is allowed to be infinite.)

To prove the assertion, consider the 0/0 case first. Since f is differentiable, hence continuous, on (a, b), and f(x) → 0 as x → a, we may extend f to a continuous function on [a, b) by setting f(a) = 0, and similarly for g. If a < x < b, Theorem 5.1.6 yields

\[ \frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(y)}{g'(y)} \quad \text{for some } y \in (a, x); \]

let x → a, so that y → a as well, to obtain f(x)/g(x) → L, as desired. (Intuitively, if the speed of car f is four times the speed of car g at time a, and they start from the same position at that time, then in a small time interval beginning with a, car f will cover four times as much distance.)

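In the ∞/∞ case, let a < x < t₀ < b, and apply Theorem 5.1.6 to obtain

\[ \frac{f(x) - f(t_0)}{g(x) - g(t_0)} = \frac{f'(y)}{g'(y)} \quad \text{for some } y \in (x, t_0). \tag{1} \]

Now observe that

\[ \frac{f(x)}{g(x)} = \frac{f(x) - f(t_0)}{g(x) - g(t_0)} \left[ \frac{1 - g(t_0)/g(x)}{1 - f(t_0)/f(x)} \right]. \]

The expression in brackets approaches 1 as x → a with t₀ held fixed, and it follows from (1) and the assumption that f′(y)/g′(y) → L as y → a that f(x)/g(x) → L, as desired. (Formally, we may first choose t₀ so that f′(t)/g′(t) is close to L for all t ∈ (a, t₀), and then let x → a.)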
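The 0/0 case can be observed numerically. The sketch below takes a = 0 with f(x) = 1 − cos x and g(x) = x² on an interval (0, b), where g′(x) = 2x is never 0; these functions are hypothetical choices for illustration. Both quotients should approach L = 1/2 as x → 0.

```python
import math


def f(x):
    """Illustrative numerator: f(x) = 1 - cos x, which tends to 0 as x -> 0."""
    return 1.0 - math.cos(x)


def g(x):
    """Illustrative denominator: g(x) = x^2, which tends to 0 as x -> 0."""
    return x * x


def f_prime(x):
    return math.sin(x)


def g_prime(x):
    return 2.0 * x


# Compare f/g with f'/g' as x decreases toward a = 0 from the right.
for x in (0.5, 0.1, 0.01, 0.001):
    print(x, f(x) / g(x), f_prime(x) / g_prime(x))
```

Much smaller values of x are avoided here because 1 − cos x loses accuracy in floating-point arithmetic when x is tiny.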

Our final application of the Mean Value Theorem is to the problem of expanding a function f in a power series. If we are interested in the behavior of f near x = a, we may attempt to represent f in the form a₀ + a₁(x − a) + a₂(x − a)² + .... If such a representation is possible, the coefficient aₙ must be given by f⁽ⁿ⁾(a)/n!, where f⁽ⁿ⁾ is the nth derivative of f (take f⁽⁰⁾ = f). The general problem of convergence of power series is best studied via complex variables, but we can obtain the following useful result.

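5.2.4 Taylor’s Formula with Remainder

Assume that f⁽ⁿ⁾ exists on the open interval I, and let a, b ∈ I. Then

\[ f(b) = \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}\,(b - a)^k + R_n, \]

where

\[ R_n = \frac{f^{(n)}(x)}{n!}\,(b - a)^n \quad \text{for some } x \text{ between } a \text{ and } b. \]

Proof. Since differentiability implies continuity, f and all its derivatives up to order n − 1 are continuous on I. Assume for simplicity that a < b; the argument for a > b is essentially the same. Let M be defined by the equation

\[ f(b) = \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}\,(b - a)^k + \frac{M}{n!}\,(b - a)^n. \]

We must produce x ∈ (a, b) such that f⁽ⁿ⁾(x) = M. Essentially, we replace a by a variable t in the preceding equation; let

\[ g(t) = -f(b) + f(t) + \sum_{k=1}^{n-1} \frac{f^{(k)}(t)}{k!}\,(b - t)^k + \frac{M}{n!}\,(b - t)^n, \qquad a \le t \le b. \]

By hypothesis, g is continuous on [a, b] and differentiable on (a, b). Also, g(b) = 0 by definition of g, and g(a) = 0 by definition of M. Thus, by Theorem 5.1.4, g′(x) = 0 for some x ∈ (a, b). But

\[ g'(x) = f'(x) + \sum_{k=1}^{n-1} \left[ \frac{f^{(k+1)}(x)}{k!}\,(b - x)^k - \frac{f^{(k)}(x)}{(k-1)!}\,(b - x)^{k-1} \right] - \frac{M}{(n-1)!}\,(b - x)^{n-1}. \]

All terms cancel except for the last two, leaving g′(x) = [f⁽ⁿ⁾(x) − M](b − x)ⁿ⁻¹/(n − 1)!. Since b − x ≠ 0, if we find x ∈ (a, b) such that g′(x) = 0, we must have f⁽ⁿ⁾(x) = M, as desired. ■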
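As a numerical check of the formula, the sketch below takes f = exp and a = 0, so that every derivative of f at a equals 1 and the remainder has the form exp(x) bⁿ/n! for some x between 0 and b. Solving that equation for x recovers the intermediate point. The choice of f and the helper names are hypothetical, and the default b = 1 is an arbitrary illustrative value.

```python
import math


def taylor_partial_sum(n, b=1.0):
    """Sum of b^k / k! for k = 0, ..., n - 1 (Taylor polynomial of exp at 0)."""
    return sum(b ** k / math.factorial(k) for k in range(n))


def remainder(n, b=1.0):
    """Remainder R_n = exp(b) minus the partial sum with n terms."""
    return math.exp(b) - taylor_partial_sum(n, b)


def intermediate_point(n, b=1.0):
    """Solve R_n = exp(x) * b^n / n! for x; assumes b > 0."""
    return math.log(remainder(n, b) * math.factorial(n) / b ** n)


# For each n the recovered point x should lie strictly between 0 and b = 1.
for n in range(1, 11):
    print(n, remainder(n), intermediate_point(n))
```

Larger values of n are omitted because the remainder then approaches the limits of double-precision arithmetic.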

Problems for Section 5.2

1. Give an example of a function g differentiable on an open interval containing [a, b] such that the minimum value of g(t), a ≤ t ≤ b, occurs at a, but g′(a) ≠ 0. (This is the case we took pains to exclude in the proof of Theorem 5.2.1.)

2. Complete the proof of Corollary 5.2.2 by analyzing the case in which the right- and left-hand limits of f′ at x₀ are equal but differ from f′(x₀).

3. Let f(x) = x² sin(1/x), x ≠ 0; f(0) = 0. By Problem 1 of Section 5.1, f is differentiable everywhere. Show that f′ has a nonsimple discontinuity at x = 0.

4. Use Taylor’s formula with remainder to show that the familiar power series expansion

is valid for all x.

5. Discuss the application of the Mean Value Theorem to the problem of estimating how close b must be to a in order that f(b) be within a specified degree of closeness to f(a). State your assumptions clearly.

REVIEW PROBLEMS FOR CHAPTER 5

1. Let f be differentiable on R, with f′(−1) < 0, f′(1) > 0. Must f have a local maximum at some point in (−1, 1)? Explain.

2. Let f(x) = e^(−1/x²) if x ≠ 0; f(0) = 0. Show that f is infinitely differentiable on R; that is, the nth derivative f⁽ⁿ⁾(x) exists for every n = 1, 2, ... and every x ∈ R.

3. Give an example of a function f whose Taylor expansion converges for every x ∈ R but does not converge to f(x) except at x = 0.

4. Give an example of functions f : (0, 1) → R and g : (0, 1) → R such that f and g are differentiable everywhere, g is never 0, g′ is never 0, and f(x)/g(x) → 1 as x → 0, but f′(x)/g′(x) → L ≠ 1 as x → 0.

5. Give an example of a function f that is differentiable everywhere and whose derivative has a nonsimple discontinuity at x = 1.

6. Use Taylor’s formula with remainder to show that the power series expansion

is valid for all x.

7. Let f and g be differentiable on the open interval I, and let a be a point of I with f(a) = g(a) = 0. Consider the following “proof” of the 0/0 case of L’Hospital’s Rule:

For x near a, we may write, by the Mean Value Theorem,

\[ \frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(y)(x - a)}{g'(z)(x - a)} = \frac{f'(y)}{g'(z)}, \]

where y and z are between a and x. Now let x, hence y and z, approach a. If f′(x)/g′(x) → L, it follows that f(x)/g(x) → L also.

(a) Explain why this argument is unsound.

(b) The proof works if an extra hypothesis is added. What is this hypothesis, and where is it used?