APPENDIX 1 Euler’s Original Derivation

Below, I’ll go over Euler’s first proof of e^iθ = cos θ + i sin θ. The math is a little more challenging than that in the rest of the book. But most of it will look familiar if you read the trig chapter and are comfortable with basic algebra. (In case of algebraic discomfort, I’ve included a cheat sheet on pertinent algebra formulas.) And taking on the challenge comes with a major reward: you’ll get to look over the shoulder of a genius as he pieces together a great discovery.

First, some math-history mood music.

Euler presented his initial proof of e^iθ = cos θ + i sin θ in the Introductio in Analysin Infinitorum (Introduction to the Analysis of the Infinite), a two-volume opus published in 1748. The Introductio was basically a high-end pre-calculus text whose goal was to familiarize eighteenth-century math students with the infinite, and the tricky issues it raises, while they were standing on the solid ground of well-known algebraic techniques. Later, they would presumably move on to grappling with infinity in calculus, which at the time was still an evolving branch of math. (Euler later wrote his era’s definitive texts on differential and integral calculus.) Among the Introductio’s topics were the use of infinitely large and small numbers in calculations; the manipulation of infinite sums; and the expression of trigonometric functions in terms of infinite sums.

Although the Introductio was ostensibly a textbook, it was actually more like a very long research paper, and it included a slew of firsts. Examples: it presented the first modern definition of functions; it brought the unit circle to the fore in trigonometry; it pioneered the modern definitions of the sine, cosine, and other trig functions, and established the abbreviations we still use for them, such as sin θ; and it standardized π as the symbol for the famous circle-related number. In a 1950 lecture, historian Carl Boyer called the Introductio the most influential mathematics textbook of modern times, comparable to Euclid’s Elements in importance.

High praise indeed. But French mathematician André Weil arguably topped it in 1979 when he declared that contemporary math students could profit much more from the Introductio than from modern alternatives. This probably seemed eccentric to most mathematicians at the time, for it conflicted with the conventional view that the Introductio’s derivations are generally based on dubious, eighteenth-century reasoning that, when carelessly applied, can lead to contradictions—such stuff as math nightmares are made on. Post-eighteenth-century mathematicians and historians have sometimes described Euler’s conceptual moves as brash, even reckless. Thus, while his results have been celebrated as brilliant advances for over two centuries, many of his derivations, particularly ones related to infinity, have long been regarded as little more than quaint relics. Indeed, writers on math have often marveled over the fact that he almost always reached what are now regarded as sound conclusions—he seemed to have had an uncanny knack for doing the right thing despite his purportedly dubious reasoning. In effect, while belittling Euler’s methods, they’ve bolstered the idea that he possessed almost preternatural intuition.

But Weil’s view has seemed less eccentric in recent years. One reason is that revisionist mathematicians have shown that Euler’s inferential moves when proving infinity-related theorems were actually quite similar to ones that are now made in non-standard analysis, an indisputably rigorous branch of mathematics that emerged in the 1960s. (Non-standard analysis is based on “hyperreal” numbers, which are basically the infinitely large and small numbers of Euler’s era couched in rigorous, modern terms.) Their research suggests that most of Euler’s conceptual maneuvering, with only minor adjustments, can be seen as perfectly valid by today’s standards.

Further, some math educators have argued that Euler’s conceptual moves are better aligned with students’ natural intuitions, and thus easier to grasp, than their standard modern counterparts. (This view isn’t new, by the way, but its appeal has grown as teachers have cast about for better ways to introduce calculus.) Today, math texts that make reference to infinity often read like legal documents filled with abstruse logic and complex qualifying clauses. The complexities help ward off inconsistencies that were implied by the less-than-rigorous concepts of Euler’s day—these older ideas were basically trashed by nineteenth-century mathematicians as they sought to eliminate shadows of doubt that had loomed over math’s treatment of the infinite. But the greater rigor had a cost: it distanced a lot of math from intuitively appealing ideas that had long informed mathematical thought. For instance, picturing functions as curves that a moving hand might draw came to be seen as relying on ill-defined geometrical intuitions that could all too easily lead to serious weirdness, or even horrible cracks, in math’s very foundations.^*

Euler’s masterpieces are also noteworthy because of their lucidity. Indeed, he’s almost unique in mathematics for “taking pains” to carefully present his reasoning, observed twentieth-century Hungarian mathematician George Pólya. Largely because of that, Pólya added, Euler’s works possess a “distinctive charm”—a quality that’s not superabundant in the technical math lit. Mathematician William Dunham has similarly noted that Euler’s expositions are “fresh and enthusiastic, in contrast to the modern tendency of obscuring a scholar’s passion behind the facade of detached technical prose.”

TO UNDERSTAND HOW Euler derived e^iθ = cos θ + i sin θ, you need to know a little about the infinitely small numbers—infinitesimals—that he routinely introduced in calculations. Infinitesimals were popularized in mathematics by Leibniz when he developed his version of calculus in the 1670s. They were loosely defined as numbers that were so very close to zero that they could be treated as zeroes—or not—depending on the circumstances. Importantly, when wearing their non-zero hats, they could serve as divisors, or, equivalently, as denominators of fractions. In contrast, true zeroes can’t do that—dividing by zero isn’t allowed in math, and fractions of the form x/0 are undefined.

These vaguely defined numberlets were crucial in the development of calculus. And following in Leibniz’s footsteps, Euler and his contemporaries freely introduced them when computing instantaneous rates of change (which involve zero time increments, because that’s what the term “instant” means). That enabled them to avoid dividing by zero in such calculations—instead, they divided by infinitesimals. But when the convenient specks were no longer needed, they would be eliminated as a simplifying move, just as if they really were equal to zero after all. This maneuver resembled the perfectly valid move of replacing x + 0 with x, and it allowed mathematicians to write things like x + dx = x, where dx designates a non-zero (wink, wink) infinitesimal. Such strategic “neglect” of infinitesimals was justified by the argument that they were too small to matter when added or subtracted from what might be called normal-sized (finite) numbers in calculations.

Plainly, the infinitesimals of Euler’s day were metaphysically fishy. But like a zillion magical microbes in harness, they helped propel a powerful engine of discovery. In fact, “great creations” in mathematics were “more numerous [during the eighteenth century] than during any other century,” according to eminent math historian Morris Kline.

Kline added, however, that Euler and his peers were so “intoxicated” with their successes that they were often “indifferent to the missing rigor” of their mathematics. The most glaring problem was that their clever calculating tricks had skirted, rather than solved, deep problems involving the infinite that had bedeviled mathematics and philosophy since the time of the ancient Greeks. By 1900, fully sobered-up mathematicians had replaced the loose, intuitive ideas of Euler’s time with precision-engineered definitions of limits and other concepts that referenced only finite quantities, effectively shoving the threatening specters out of sight.^* (Although not necessarily out of mind—the infinite will probably always be a very provocative topic.)

NOW FOR THE MATH. Let’s begin with a set of rules governing the use of infinitely small and large numbers in Euler’s reckonings. Just give them a quick once-over at this point; you can revisit them as needed when they’re invoked in the calculations below.

(1) Multiplying an infinitely small number times a finite one yields another infinitesimal. This is analogous to the rule of arithmetic specifying that multiplying a number times zero equals zero. Thus, if y and z designate infinitesimals, and x is a finite number, you can write y × x = z, or yx = z.

(2) Dividing a finite number by an infinitely large one yields an infinitesimal. This logic is akin to dividing a cake into an infinitely large number of pieces for the attendees at a very, very large birthday party, resulting in each person getting a vanishingly small piece. Mathematically, it would be x/n = z, where n is infinitely large, x is a finite number, and z is an infinitesimal.

(3) Dividing a positive finite number by a positive infinitesimal is equal to an infinitely large number. This is analogous to dividing a positive number by a very small positive fraction to get a very big number. For instance, 1 divided by a millionth (which is equivalent to the number of one-millionths there are in the number 1), equals a million, or, in short, 1/(1/1,000,000) = 1,000,000. Thus, if x were a finite number, z were an infinitesimal, and n were an infinitely large number, you could write x/z = n.

(4) Multiplying an infinitesimal times an infinitely large number yields a finite number. There’s no analogue to this rule in regular arithmetic. However, it accords with the intuitive idea that when the infinitely large is pitted against the infinitely small, the two basically cancel each other out in a titanic clash, and after the dust clears a finite number remains. Thus, if z designates a positive infinitesimal, n stands for an infinitely large number, and x represents a finite number, you could write z × n = x, or zn = x.

(5) Recall that cos 0 = 1 and sin 0 = 0. Because the difference between an infinitesimal, call it z, and 0 is infinitely small, it would make sense—at least when no post-eighteenth-century mathematician is looking—to sneak z in for 0 in these trig facts to get cos z = 1 and sin z = z. This move, which Euler made in the derivation at hand, invokes one side of the two-faced nature of infinitesimals—in effect, z is treated like zero here. George Berkeley, the philosopher who argued that infinitesimals were metaphysically goofy, is spinning in his grave about now. But Euler is remaining totally cool with the whole thing during his infinitely long rest.
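If you’d like to watch the rules at work, here is a minimal Python sketch. It rests on a loud assumption: a computer has no true infinitesimals, so a large finite n stands in for the infinitely large number, and z = x/n stands in for the infinitesimal. The function name rule_check and the sample value x = 2 are mine, chosen only for illustration.

```python
import math

def rule_check(x, n):
    """Try Rules 2, 4, and 5 with a large finite n standing in for infinity."""
    z = x / n                         # Rule 2: finite / huge gives something tiny
    product = z * n                   # Rule 4: tiny * huge gives back a finite number
    cos_gap = abs(math.cos(z) - 1.0)  # Rule 5: how far cos z is from 1
    sin_gap = abs(math.sin(z) - z)    # Rule 5: how far sin z is from z
    return z, product, cos_gap, sin_gap

for n in (10**3, 10**6, 10**9):
    print(n, rule_check(2.0, n))
```

The gaps shrink rapidly as n grows, which is the numerical shadow of treating z like zero in Rule 5.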

HERE’S THE PROMISED algebra cheat sheet. These rules show how to manipulate terms (designated by a, b, c, and d) that represent specific numbers, individual variables, or elaborate expressions involving numbers and variables.

• a⁰ = 1, a¹ = a, a² = a × a, a³ = a × a × a, etc.

• (a^m)ⁿ = a^(m × n)

• If a = b, then aⁿ = bⁿ

• If a/b = c, then a/c = b.

• a/b × b = a

• If a = b and c = d, then a + c = b + d (Which, in effect, means that equations can be added together.)

• (a + b) × (c + d) = (a × c) + (a × d) + (b × c) + (b × d) (This is known as the FOIL rule, for it entails expanding the left side of the equation to get the right side by successively adding the products of: the First components of each term on the left (a and c), the Outer components (a and d), the Inner components (b and c), and the Last components (b and d).)

• (a + b)/2 = a/2 + b/2.

I’VE DIVIDED EULER’S derivation into seven steps and labeled certain equations (e.g., A.1 and A.2 in the first step) so they can be easily referenced later. I’ve also updated Euler’s notation—I write, for instance, x² instead of xx as he did for typesetters’ convenience. With one exception, noted in Step 6, the reasoning presented here closely tracks Euler’s.

Step 1: Our first move is to show how de Moivre’s formula, which was mentioned in passing in Chapter 9, can be extracted from a couple of basic trigonometry equations called the angle addition identities. I’ve opted to skip derivations of these identities; they follow from the triangle-based trig I showed you in Chapter 7, and if you want to see the proofs, I recommend the Khan Academy’s excellent versions at www.khanacademy.org. (Just search its website for “proof of angle addition identities.”)

The identities are:

sin (a + b) = sin a cos b + sin b cos a

and

cos (a + b) = cos a cos b − sin a sin b.

In case you’re wondering, the terms on the right sides of these equations represent products. For instance, sin a sin b = sin a × sin b.

To generate de Moivre’s formula, Euler used the identities to show that expressions of the form (cos θ + i sin θ)ⁿ are equal to expressions of the form cos (nθ) + i sin (nθ), where n = 1, 2, 3,….

This is trivial to prove for n = 1, because

(cos θ + i sin θ)¹ = cos θ + i sin θ = cos (1 × θ) + i sin (1 × θ) (Since a¹ = a, and 1 × θ = θ.)

For n = 2, we have

(cos θ + i sin θ)² = (cos θ + i sin θ)(cos θ + i sin θ)

= cos²θ + i sin θ cos θ + i sin θ cos θ + i² sin²θ [Using FOIL, and the fact that cos²θ means cos θ times cos θ; likewise for sin²θ.]

= (cos²θ − sin²θ) + [i × (2 sin θ cos θ)] [By using the fact that i² = −1, combining the first and fourth terms after converting i² to −1, and adding the two identical middle terms.]

= cos 2θ + i sin 2θ,

because of the angle addition identities given above. For instance, when both a and b are set equal to θ in the second identity, it implies that cos 2θ = cos (θ + θ) = cos θ cos θ − sin θ sin θ = cos²θ − sin²θ, which justifies replacing cos²θ − sin²θ with cos 2θ.

For n = 3, we have

(cos θ + i sin θ)³ = (cos θ + i sin θ)²(cos θ + i sin θ) [By the definition of exponents.]

= (cos 2θ + i sin 2θ)(cos θ + i sin θ) [By plugging in the result above for n = 2.]

= cos 2θ cos θ + i cos 2θ sin θ + i sin 2θ cos θ − sin 2θ sin θ [By FOIL and i² = −1.]

= (cos 2θ cos θ − sin 2θ sin θ) + [i × (cos 2θ sin θ + sin 2θ cos θ)] [By rearranging terms.]

= cos 3θ + i sin 3θ [By using the trig identities with a = 2θ and b = θ].

For n = 4, we repeat this procedure, duly plugging in the result we got for n = 3 and invoking the trig identities with a and b set equal to 3θ and θ, respectively, to obtain

(cos θ + i sin θ)⁴ = cos 4θ + i sin 4θ.

At this point, I hope you can see that repeating the procedure for n = 5, 6, and so on will yield similar results. Expressing this conclusion for any positive integer n gives us de Moivre’s formula:

(A.1) (cos θ + i sin θ)ⁿ = cos (nθ) + i sin (nθ).

Tweaking A.1 yields a similar equation:

(A.2) (cos θ − i sin θ)ⁿ = cos (nθ) − i sin (nθ).

The tweak simply involves replacing θ by −θ in A.1, and then applying these trig facts: sin (−θ) = −sin θ, and cos (−θ) = cos θ. You can verify the trig facts, if desired, by reviewing Chapter 7’s unit-circle-based definition of the trig functions and the meaning of negative angles. Or visit the Khan Academy and search for “Sine & cosine identities: symmetry.”
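A quick numerical spot check can make A.1 and A.2 feel less abstract. The sketch below is mine, not Euler’s: it relies on Python’s built-in complex numbers, and the helper name de_moivre_gap and the sample angle 0.7 are hypothetical choices for illustration. It measures how far apart the two sides of each formula are for n = 1 through 6.

```python
import math

def de_moivre_gap(theta, n, sign=1):
    """Distance between the two sides of A.1 (sign=1) or A.2 (sign=-1)."""
    left = complex(math.cos(theta), sign * math.sin(theta)) ** n
    right = complex(math.cos(n * theta), sign * math.sin(n * theta))
    return abs(left - right)

theta = 0.7  # an arbitrary angle in radians
for n in range(1, 7):
    print(n, de_moivre_gap(theta, n), de_moivre_gap(theta, n, sign=-1))
```

The gaps should come out at the level of floating-point rounding. That is evidence for a handful of cases, of course, not a proof for every n.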

Step 2: Reversing equations A.1 and A.2 and adding them together, we get

cos (nθ) + i sin (nθ) + cos (nθ) − i sin (nθ) = (cos θ + i sin θ)ⁿ + (cos θ − i sin θ)ⁿ.

Rearranging this equation’s left-side terms makes it cos (nθ) + cos (nθ) + i sin (nθ) − i sin (nθ), or simply 2 cos (nθ). (Note that i sin (nθ) − i sin (nθ) equals 0—the two terms cancel each other out.) Thus, the equation becomes

2 cos (nθ) = (cos θ + i sin θ)ⁿ + (cos θ − i sin θ)ⁿ,

and, by dividing each side by 2, we obtain

(A.3) cos (nθ) = [(cos θ + i sin θ)ⁿ + (cos θ − i sin θ)ⁿ]/2.

Step 3: At this point, Euler brought infinity into play. He assumed that the n shown in equation A.3 is infinitely large. If a finite number, which we’ll represent by the variable v, is divided by the infinitely large n, the result is an infinitesimal number, which we’ll call z. In short, v/n = z (in accordance with Rule 2 above). By basic algebra, v/n = z implies that v = zn = nz. (Which accords with Rule 4.)

Note well: n, z, and v will be brought into play multiple times below.

Now, by applying Rule 5 to z, we have cos z = 1. Using the “sine part” of the same rule along with z = v/n (from the previous paragraph), we get, as Euler did, sin z = z = v/n.

Hang on, we’re making real progress. We’re now close to turning the sines and cosines on the right side of equation A.3 into expressions very much like (1 + 1/n)ⁿ with n assumed to be infinitely large. This expression is the number e, which we need to bring into the picture in order to produce the equation we’re after.

Here’s a quick review of the numbers that are now available for action: we’ve assumed that n is an infinitely large number, z is an infinitely small one, nz = v (where v is finite), z = v/n, cos z = 1, and sin z = v/n.

The next maneuver is to plug in an infinitely small number for θ in A.3, namely z. After we plug in z for θ on A.3’s right side, it becomes [(cos z + i sin z)ⁿ + (cos z − i sin z)ⁿ]/2. Now comes a key move that Euler cleverly set up by bringing the infinite into play: because cos z = 1 and sin z = v/n, we can replace cos z with 1, and sin z with v/n, to turn A.3’s right side into [(1 + iv/n)ⁿ + (1 − iv/n)ⁿ]/2. (Note the two e-like thingies we’ve now conjured up.)

Meanwhile, when we plug in z for θ on A.3’s left side, it becomes cos (nz), and since nz = v the left side can be rewritten as cos v.

Installing these new, improved left and right sides in A.3 gives us

(A.4) cos v = [(1 + iv/n)ⁿ + (1 − iv/n)ⁿ]/2.
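Here is a hedged numerical look at A.4. Since no computer can make n infinitely large, the sketch assumes that a big finite n is a fair stand-in, and it simply watches the right side drift toward cos v as n grows. The function name a4_right_side and the choice v = 1 are mine.

```python
import math

def a4_right_side(v, n):
    """[(1 + iv/n)^n + (1 - iv/n)^n] / 2 for a large but finite n."""
    r = complex(1.0, v / n) ** n
    t = complex(1.0, -v / n) ** n
    return (r + t) / 2

v = 1.0
for n in (10**2, 10**4, 10**6):
    approx = a4_right_side(v, n)
    print(n, approx.real, abs(approx.real - math.cos(v)))
```

The printed error shrinks as n increases, which is the finite-n echo of Euler’s claim that the two sides agree when n is infinitely large.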

Step 4: This step is virtually identical to Step 3, except that its first move is to subtract equation A.2 from A.1 instead of adding the two. That yields

2i sin (nθ) = (cos θ + i sin θ)ⁿ − (cos θ − i sin θ)ⁿ,

which, by duplicating the infinity-based logic of Step 3, leads to

(A.5) i sin v = [(1 + iv/n)ⁿ − (1 − iv/n)ⁿ]/2.

Step 5: Now let’s clear away some clutter by temporarily replacing the expressions (1 + iv/n)ⁿ and (1 − iv/n)ⁿ in equations A.4 and A.5 with the variables r and t, respectively. That turns them into

cos v = (r + t)/2

and

i sin v = (r − t)/2.

Adding these equations together gives us

(A.6) cos v + i sin v = (r + t)/2 + (r − t)/2.

By basic algebra, (r + t)/2 = r/2 + t/2, and (r − t)/2 = r/2 − t/2, which allows us to rewrite A.6 as

cos v + i sin v = r/2 + t/2 + r/2 − t/2.

Finally, notice that the second and fourth terms of the right side of this last equation cancel each other out, and the first and third add up to r. This means that the right side algebraically melts down to r alone, which, recall, is equal to (1 + iv/n)ⁿ. And that permits rewriting A.6 again as

(A.7) cos v + i sin v = (1 + iv/n)ⁿ.
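The bookkeeping in Steps 4 and 5 can be replayed numerically. This is only a sketch under the same assumption as before, that a large finite n can play the part of Euler’s infinite one; the helper name r_and_t and the values v = 1 and n = 1,000,000 are illustrative choices of mine.

```python
import math

def r_and_t(v, n):
    """Return r = (1 + iv/n)^n and t = (1 - iv/n)^n for a large finite n."""
    return complex(1.0, v / n) ** n, complex(1.0, -v / n) ** n

v, n = 1.0, 10**6
r, t = r_and_t(v, n)
half_sum = (r + t) / 2    # right side of A.4, close to cos v
half_diff = (r - t) / 2   # right side of A.5, close to i sin v
print(half_sum + half_diff, r)                     # the t terms cancel, leaving r
print(abs(r - complex(math.cos(v), math.sin(v))))  # gap between the sides of A.7
```

The first line of output shows the cancellation that produces r; the second shows that r already sits very close to cos v + i sin v at this n.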

Step 6: Now all we need to do is show that the right side of A.7 is equivalent to e^iv and we’ll be done. Continuing in Euler’s footsteps, we’ll begin by assuming that a is a finite number greater than 1 and z is an infinitely small number. From algebra, we know that a⁰ = 1, a¹ = a, and, in general, increasing a’s exponent will produce larger and larger numbers. (For instance, if a = 3, then a⁰ = 1, a¹ = 3, a² = 9, and so on.) Thus, since z, an infinitesimal, is assumed to be only very slightly greater than 0, it stands to reason that a^z will be only very slightly greater than a⁰. Indeed, Euler reasoned that a^z is only infinitesimally greater than a⁰. Expressing this as an equation, we have a^z = a⁰ + w, where w is infinitely small. And since a⁰ = 1, we can rewrite this equation as a^z = 1 + w.

Dividing a number m by the number p to get k can be written as m/p = k. Similarly, for the infinitesimals w and z, we can write w/z = k for some number k. Multiplying both sides of this equation by z and applying basic algebra to the result gives us w = kz. This means that we can replace w with kz in the equation above (a^z = 1 + w) to get

(A.8) a^z = 1 + kz.

Importantly, this last equation implies that a and k are tied to each other in the same way that y and x are tied by the equation y = 1 + 2x—in the latter, if x is assigned a specific value, say 2, then y must take on a corresponding specific value, in this case 5. This important a-k linkage will be revisited momentarily.
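To see the a-k linkage with actual numbers, the following sketch solves A.8 for k, using a tiny finite z as an assumed stand-in for the infinitesimal. The function name k_for and the default z = 0.0000001 are hypothetical choices of mine. As a side observation that goes beyond the text, the k values it prints track Python’s natural logarithm, math.log(a).

```python
import math

def k_for(a, z=1e-7):
    """Estimate k in a^z = 1 + kz, using a tiny finite z for the infinitesimal."""
    return (a ** z - 1.0) / z

for a in (2.0, math.e, 3.0, 10.0):
    print(a, k_for(a), math.log(a))
```

Each a gets its own k, and the a that lands on k = 1 (to within the approximation) is e.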

Now glance back at Rule 2 above. It implies that the infinitesimal z is equal to some number v/n, where v stands for a finite number (the variable x was used to represent a finite number above, but v will work just as well), and n stands for an infinitely large number. This logic expressed as an equation is z = v/n, which allows us to replace z with v/n in A.8 to get

a^(v/n) = 1 + kv/n.

Raising both sides of this equation to the nth power yields (a^(v/n))ⁿ on the left side, which, by the cheat sheet’s rule for a power raised to a power, reduces to a^v (because v/n × n = v). Meanwhile, raising the equation’s right side to the nth power turns it into (1 + kv/n)ⁿ. Thus, we now have:

(A.9) a^v = (1 + kv/n)ⁿ.

Based on the a-k linkage mentioned above, we know that if we set k equal to a specific number, then a must take on a corresponding specific value. Thus, if we set k equal to 1, a will take on the value of some corresponding constant. To identify this mystery constant, let’s examine the equation after k is set equal to 1:

a^v = (1 + v/n)ⁿ.

Because v represents an unspecified finite number, it can freely range over different numbers without falsifying the equation. That permits us to set v equal to 1 if we want (and we do), turning the equation into

a = (1 + 1/n)ⁿ.

Now we can see what a must be when k = 1. Based on the definition of e from Chapter 2 (and the fact that n was assumed above to be infinitely large), this last equation implies that a = e. Therefore, we can rewrite A.9 with k = 1 and a = e to get

(A.10) e^v = (1 + v/n)ⁿ.

(Euler used more elaborate reasoning than shown here to get to A.10. But the trail he followed also led to the conclusion that a = e when k = 1.)
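A.10 is easy to probe on a computer, with the usual caveat that a large finite n is only an assumed substitute for an infinite one. In the sketch below, the function name compound and the sample values of v are mine.

```python
import math

def compound(v, n):
    """(1 + v/n)^n, the right side of A.10 with a large finite n."""
    return (1.0 + v / n) ** n

for n in (10**2, 10**4, 10**6):
    print(n, compound(1.0, n), abs(compound(1.0, n) - math.e))

print(compound(2.5, 10**6), math.exp(2.5))  # another v, for comparison
```

With v = 1 the values creep up toward e as n grows, and the last line suggests the same pattern holds for other finite values of v.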

Last step: Recall from Chapter 9 how Euler boldly substituted the imaginary-number variable iθ for the real-number variable θ in an equation he knew was true for real numbers. Let’s do the same thing in equation A.10 by plugging in iv for the real-number variable v to get

e^iv = (1 + iv/n)ⁿ,

which, with A.7, implies that e^iv and cos v + i sin v equal the same thing, namely (1 + iv/n)ⁿ. Thus, e^iv and the expression cos v + i sin v are themselves equal, or, as Euler rather excitedly put it when he arrived at his famous formula in the Introductio, “truly there will be” (erit vero)

e^iv = cos v + i sin v,

or, more familiarly, e^iθ = cos θ + i sin θ, using θ as the variable.
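As a closing sanity check, here is a short sketch that compares the two sides of the finished formula at a few angles. Be warned that it leans on Python’s cmath.exp, the standard library’s own complex exponential, so it is a consistency check on the result and not a retracing of Euler’s route. The helper name euler_gap and the sample angles are mine.

```python
import cmath
import math

def euler_gap(theta):
    """Distance between e^(i*theta) and cos(theta) + i*sin(theta)."""
    return abs(cmath.exp(1j * theta) - complex(math.cos(theta), math.sin(theta)))

for theta in (0.0, 1.0, math.pi / 2, math.pi):
    print(theta, cmath.exp(1j * theta), euler_gap(theta))
```

At θ = π the middle column lands, up to rounding, on −1, which is the special case celebrated in the rest of the book.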

^* Here’s an example of the kind of weirdness that geometrical intuitions can lead to: Picture two, same-sized adjacent circles touching at a single point. It’s intuitively natural to think of them as “kissing”—you might call it “Circles in Love” if you were a minimalist-art aficionado. But now consider what this kiss would actually be in human terms: two people whose lip skin melts together when they kiss, so that they’re literally joined at the lips. This follows from the fact that the point where the circles touch is an intersection point like one shared by two straight lines that cross—it lies on both circles. George Lakoff and Rafael E. Núñez put this wonderfully creepy insight on the record in their book Where Mathematics Comes From. It isn’t mathematically troublesome, but it memorably illustrates how seemingly simple geometric ideas sometimes imply strange things.

^* For example, here’s the definition of the limit of the infinite sequence of fractions, 1, 1/2, 1/3,… (i.e., 1/n, where n = 1, 2, 3,…): The sequence 1/n for n = 1, 2, 3,… has the limit L if, for any positive number ε, there exists a positive integer m such that the absolute value of L − 1/n is less than ε for all n greater than or equal to m. (For this sequence, by the way, L = 0.) Complicated, yes. Infinity-infested, well, not exactly. The infinite (that is, the infinitely small) is hidden behind the phrase “for any positive ε,” which implies that we can make 1/n as close to L as we like by choosing n to be ever larger. Importantly, this “epsilontic” definition (so-named because ε, the Greek letter epsilon, is frequently used in math to designate a small, finite number) omits the somewhat vague notion of motion (conveyed by terms such as “approaching” or “tending toward”), which was implicit in earlier definitions of limits. The now-standard, industrial-strength definition of limits was formulated by German mathematician Karl Weierstrass.
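To make the footnote’s definition concrete, this sketch finds, for a given ε, the smallest m that does the job for the sequence 1/n with L = 0. It uses exact fractions to sidestep rounding, and the function name smallest_m and the sample values of ε are my own illustrative choices.

```python
from fractions import Fraction
import math

def smallest_m(eps):
    """Smallest positive integer m with |0 - 1/n| < eps for every n >= m."""
    # 1/n < eps exactly when n > 1/eps, so step just past floor(1/eps).
    return math.floor(1 / eps) + 1

for eps in (Fraction(1, 10), Fraction(1, 1000), Fraction(3, 7)):
    m = smallest_m(eps)
    print(eps, m, Fraction(1, m) < eps)
```

Because 1/n only gets smaller as n grows, checking the inequality at n = m is enough to cover every later n. Notice that nothing in the code moves or approaches anything; it just answers a finite question for each ε.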