§7a. This section even has epigraphs:
“The modern theory of the infinite evolved in a contiguous way out of the mathematics that preceded it.”
—S. Lavine
“But the uninitiated may wonder how it is possible to deal with a number which cannot be counted.”
—B. Russell
“Buckle your lapbelts please everyone as we are about to undergo a steep ascent.”
—E. R. Goris
For reasons that by now will be familiar, a lot of what follows is going to be really fast. It’s also somewhat hard at the start, but like a lot of pure math it gets easier the deeper we go. As mentioned already, G. F. L. P. Cantor does his graduate work at Berlin under Weierstrass and Kronecker; his earliest published articles are fairly standard-issue work in certain problems of number theory.1 Ph.D. in hand, Cantor gets a low-rung job as a privatdozent (which seems to be a weird sort of freelance T.A.2) at U. Halle, and there meets one E. H. Heine (1821–1881), a specialist in applied analysis who’d done important work in Heat, particularly on the Potential Equation.3 Anyway, as of c. 1870 Heine is part of the whole big group of mathematicians working on Fourier Series and the issues raised by Riemann’s “On the Representability …,” and apparently he (=Heine) gets Cantor interested in what had come to be known as the Uniqueness problem: If any given f(x) can be represented by a trigonometric series, is that representation Unique, i.e. is there only one trig series that can do it? Heine himself had been able to prove Uniqueness only on the condition that the relevant trig series was uniformly convergent.4 Which clearly wasn’t good enough, since there are plenty of trig series and even Fourier Series that aren’t uniformly convergent.
In 1872, Cantor’s “On the Extension of a Proposition from the Theory of Trigonometric Series”5 defines and proves a far more general Uniqueness Theorem that not only doesn’t require uniform convergence but permits exceptions to convergence at an infinity of points, provided that these exceptional points6 are distributed in a certain specific way. As we’ll see, this precise distribution’s rather complicated—as is the ’72 Theorem itself, which Cantor has actually developed over several prior papers and published appendices to same, as his criteria for Uniqueness gradually evolve from requiring the given trig series to converge for all values of x, to allowing a finite number of exceptional points, to the Uniqueness Theorem’s final, 100% general form. What’s interesting is that it’s Professor L. Kronecker who helps Cantor refine and simplify his proof at several early points. At the same time, Cantor’s approach is deeply informed by K. Weierstrass’s work on continuity and convergence, e.g. the observation that for a general trig series of the form to be integrable term-by-term (which was the way Heine and everyone else had tried to attack the Uniqueness problem), it has to be uniformly convergent. Historians have noted that Cantor-Kronecker relations cooled and L. K.’s letters got more and more critical as Cantor’s refinements of the Uniqueness Theorem began to allow infinitely many points at which exceptions to either the representation of the given f(x) or the convergence of the series7 could be allowed. In each successive version of the proof, Kronecker was basically watching Cantor move from his own algebraic, Constructivist position to a more Weierstrassian, function-theoretic one. The apostasy was complete when the final 1872 paper came out and its whole first part was devoted to the theory of irrationals/reals detailed in §6e.
Understanding just why it required a coherent theory of irrationals is as good a way as any to see how Cantor’s work on the Uniqueness Theorem led him into studying infinite sets per se. The discussion, which gets a bit hairy, requires that you recollect the Bolzano-Weierstrass Theorem’s rule that every bounded infinite set of points contains at least one limit point—and so of course what a limit point is.8 Plus you need to keep in mind that the Uniqueness proof’s central concepts are all about the Real Line (i.e., when terms like ‘point set’ or ‘exceptional point’ or ‘limit point’ are used, they are really referring to geometric points that correspond to numbers9). Please note also that the ‘bounded infinite sets of points’ under consideration in both the B.W.T. and Cantor’s proof are actually sequences, which are also the easiest entities w/r/t which to understand limit points—like for instance the infinite set of N.L. points 10 has as its limit point the same
that is the limit of the infinite sequence
So here’s how Cantor gets his increasingly general result. At first (1870), he needs a trig series that isn’t uniformly convergent11 but is convergent everywhere, meaning convergent for all values of x. In the next step (1871)12, he is able to prove that if two apparently distinct trig series converge to the same (arbitrary) function everywhere except a finite number of x-points, they are really the same series. We’re going to skip this proof because what’s really germane is the next one, which is the 1872 result where Cantor is able to allow an infinite number of exceptional points and still prove that the two representative trig series are ultimately identical.13 How he does this is by introducing the concept of a derived set, whose def. is basically: If P is a point set (meaning just any set of real-number points, though what Cantor obviously has in mind is the infinite set of all exceptional points between the two trig series), then P’s derived set P′ is the set of all limit points of P. Or rather we should say that P′ is the first derived set of P, because as long as the relevant point sets are infinite, the whole process is, in principle, endlessly iterable—P″ is the derived set of P′, P′″ is the derived set of P″, and so on, until after (n − 1) iterations you’ll have Pn as the nth derived set of P. In the gloss of the Uniqueness Theorem two ¶s back, what the “provided that these exceptional points are distributed in a certain specific way” criterion centers on is this nth derived set Pn, with the vital question being whether Pn is infinite or not.
Some of its real math involves extremely technical stuff about how limit points operate, but in essence here’s how the U.T.’s exceptional-point-distribution thing works. Another deep breath would probably not be out of place.14 Posit an infinite point set P such that, for some finite n, P’s (n − 1)th derived set P(n−1) is infinite while its nth derived set Pn is finite. Then, if two trigonometric series converge to the same f(x) except at some or all of the points in P, they are the equivalent series; hence Uniqueness is proved. That was probably not blindingly vivid and clear. The part it’s crucial to unpack is the requirement that Pn be finite. And to explain it, we have to introduce another distinction: any set P for which its derived set Pn is finite for some finite n is what Cantor calls a set of the first species; whereas, if Pn is not finite for any finite value of n, then P is a set of the second species. (This is why the derived-set process was described above as “in principle” endless—it’s only the second-species sets that never produce finite derived sets and so allow a ∞ of iterations of the derived-set-of-derived-set thing.)
OK then. Here’s why, in his proof of the U.T., Cantor needs the original infinite set of exceptional points P to be a first-species set, and thus Pn to be finite. It is because, via K. Weierstrass’s Extreme Values Theorem, you can prove for sure that if any derived set Pn is finite, then at some further point n + k the derived set P(n + k) is going to take its absolute minimum value m, which in this case will be 0, or no limit points at all. Meaning, in other words, that anytime in the whole P′, P″, P′″, … Pn, … progression you arrive at a derived set that’s finite, you know that the whole iterative process is going to stop somewhere; you’re eventually going to get to a P (n + k) with no members. And of course we know that the members of all these various P′s and Pns are limit points; and we further know, from the Bolzano-Weierstrass Theorem, that any bounded infinite set has at least one limit point. If P (n + k) has no members, then the set of which it is the derived set has no limit point, and thus by the B.W.T. must itself be finite, and thus by the E.V.T. must at some point take its own minimum value of 0 members, at which same point the set of which that is the derived set becomes finite, and so on back down through the P (n + k)s and Pns and P (n −1)s and P′ s … all of which means—to boil everything way, way down—that at some provable point the two representative trig series can be shown to collapse into a single series, which establishes Uniqueness.
You’ll remember from §§ 5e and 6a, though, that in order to be 100% workable in all cases the Extreme Values Theorem requires a theory of real numbers, and that the Weierstrassian version of such a theory was (as Cantor himself showed) a dink. This is one reason why Cantor’s 1872 proof requires its own theory of reals. The other, and related, reason is that in order to use the more general Bolzano-Weierstrass Theorem to construct his theory of derived sets and species, Cantor needs to reconcile the geometric properties of points on a line’s continuum—meaning here concepts like ‘limit point,’ ‘interval,’ etc.—with the arithmetic continuum of real numbers, since the entities involved in analysis are of course really numbers and not points.15
If we observe that Cantor’s derived sets resemble the general idea of a subset, and that the minimum value P (n+k) = 0 is essentially the same as the definition of the empty set, it’s possible to discern the seeds of what’s now known as set theory16 in Cantor’s Uniqueness proof. Derived sets, the R.L./ real-number continuum, and the Uniqueness Theorem are also the progenitors of Cantor’s transfinite math, although in a rather complex bunch of ways. We’ve just seen that the Uniqueness Theorem’s Pn requires n to be finite, i.e. that Cantor uses only finite iterations of the derived-set-of-derived-set process for his proof. Since there are already so many infinite sets floating around the proof, though (as in: the original P is infinite, and all the P′s and P″s and so on up through P (n−1) can be infinite, and of course the relevant trig series are infinite series, not to mention that limit points involve an infinite number of intra-interval points, and that these intervals can themselves be infinitesimally small), it shouldn’t be surprising that Cantor starts to consider more closely the characteristics of his derived sets under infinite iterations.
More specifically, Cantor starts to ask whether the infinity of an infinitely iterated second-species derived set P ∞ might differ from or somehow exceed the infinity of the finitely iterated first-species set P (n−1). More specifically still, he notices how closely those questions resemble one about the relative ∞s of the N.L.’s rational numbers versus the R.L.’s real numbers. This latter question concerns the issue first discussed all the way back in §2c—namely that while the rational numbers are both infinite and infinitely dense, they are not continuous (i.e., the Number Line is shot through with holes), whereas of course both Dedekind and Cantor have now proved the continuity of the set of all reals as schematized on the R.L. It thus appears natural to Cantor17—who for the U.T. has already developed techniques for examining both real v. rational numbers and the properties of infinite sets—to ask whether the infinite set of all real numbers is somehow bigger than the infinite set of all rational numbers. Except what would ‘bigger’ here mean, and ditto for ‘exceed’ in the above P ∞ v. P (n–1) question—that is, how can the comparative sizes of different ∞s be described and explained mathematically? At which point, in the immortal words of J. Gleason….
ADMINISTRATIVE INTERPOLATION
There are now a couple procedural issues that need to be addressed. Whole scholarly tomes are devoted to Cantor’s accomplishments.18 You can take a two-term course on set theory under the departmental rubrics of Logic, Math, Philosophy, or Computer Science19 and still have just scratched the veneer. Historically, Cantor’s transfinite theories and proofs are spread over 20 years20 and scores of different papers, often with successive refinements and amendments to earlier stuff so that there is sometimes more than one version of the same proof. It’s clearly going to be impossible here to unpack transfinite math all the way or to do real justice to its evolution in Cantor’s publications.21 On the other hand, there are certain recent pop books that give such shallow and reductive accounts of Cantor’s proofs (accounts which are, as mentioned, usually subordinated to some larger Promethean narrative about G. Cantor’s psych problems or supposed mystical affiliations) that the math is distorted and its beauty obscured; and we obviously don’t want to do this, either.
So Command Decision: The compromise henceforth will be to sacrifice chronology and a certain developmental thoroughness for the sake of conceptual thoroughness and cohesion—that is, to present Cantor’s concepts, theorems, and proofs in a way that highlights their connections to one another and to math per se. This will involve not only skipping around a bit, but often not telling you that we’re skipping around, or that there are sometimes several different versions of a given proof and we’re covering only the best one, or what the exact dates and English-v.-German titles of all the relevant articles are,22 etc. It also entails a special set-theoretic EMERGENCY GLOSSARY III, though this E.G. will need to be administered gradually and in situ because some of the material is just too abstract to shove at you up front without context.
END A.I.
§7b. As should be evident, some very important ∞-related ideas come out of the proof of the general U.T. One concerns the relative sizes of the set of all rationals v. the set of all reals; another is whether the continuity of the Real Line is related somehow to the size/composition of the latter set. Yet another is the concept of a transfinite number, which Cantor derives from the same considerations that led him to distinguish first- from second-species infinite sets in the ’72 proof.
To see how Cantor conceives and generates his transfinite numbers, we need first to make sure we’re E.G.-grade clear on a few set-theory terms you probably first saw in elementary school.23 To wit: Set A is a subset of set B if and only if there is no member of A that is not a member of B. The union of two sets A and B is the set of all members of A and all members of B, while the intersection of A and B is the set of just those members of A that are also members of B. Union and intersection are normally symbolized by ‘∪’ and ‘∩,’ respectively. Lastly, the good old empty set, whose usual symbol is ‘Ø,’ is a set with no members—and be apprised that, by what looks at first like merely a quirk of the definition of ‘subset,’ any set whatsoever will always include Ø as a subset. End of first situated chunk of E.G.III. Plus here you have to remember the first-v. second-species thing for point sets from the previous §. There’s actually also some technical stuff involving criteria for ‘dense’ v. ‘everywhere-dense’ sets that we’re omitting, but in essence the way Cantor conceives and generates the transfinite numbers is this:24
Assume that P is a second-species infinite point-set. Cantor shows that P’s first derived set, P′, can be “decomposed”25 or broken down into the union of two different subsets, Q and R, where Q is the set of all points belonging to first-species derived sets of P′, and R is the set of all points that are contained in every single derived set of P′, meaning R is the set of just those points that all the derived sets of P′ have in common. Why not take a second and read that last sentence over again.26 R is the important part, and it’s actually how Cantor first defines ‘intersection’ for sets, here via the infinite sequence of derived sets P′, P″, P′″, … (the sequence being infinite because P is a second-species set). Unlike our ‘∩,’ Cantor’s symbol for intersection is a strange ultracursive (Again, we’re not going to be doing everything in this much detail.) So the official definition of R is: R =
(P′, P″, P″′, …), which, together with the def. of ‘second-species set,’ means that both the following are true:
(1)R = (P(2), P(3), P(4), P(5), …) …
(2)R = (P(n), P(n+1), P(n+2), P(n+3), …)27
EMBEDDED MINI-INTERPOLATION
What (1) and (2) together really are is a type of proof, the other really famous one besides the reductio. This one’s called mathematical induction. To prove some statement Cn for all (n = ∞) cases by math induction, you (a) prove that C1 is true for the first case n = 1, then (b) assume that Ck is true for the first k cases (you don’t know what number k is, but from step (a) you know that it exists—if nothing else, it’s 1), and then (c) prove that Ck + 1 is true for the first (k + 1) cases. Weird-looking or no, (a)–(c) ensure that Cn will be true no matter what n is, that is, that C is a genuine theorem.
END E.M-I.
What Cantor’s (1) and (2) allow him to do is to define R, as taken from P, as: R = P∞—that is, R is the ∞th derived set of P. And since (again) P is a second-species set, there is no chance that P ∞ = Ø, which means that P ∞ will itself yield the derived set P(∞+1), which latter will yield the derived set P (∞+2), and so on, except here ‘and so on’ means we can keep generating infinite derived sets of the abstract form28 P(n0∞v + n1∞v−1 + … + nv). And since this formula’s n and v are variables, Cantor can construct the following infinite sequence of infinite sets: P (n∞∞), P(∞∞+1), P(∞∞ + n), P(∞n ∞), P (∞∞n), P(∞∞∞), …. Of this sequence he says “We see here a dialectic generation of concepts, which leads further and further, and thus remains in itself necessarily and consequently free of any arbitrariness” … by which he means that these “concepts” are real math entities—transfinite numbers—rigorously established by the Bolzano-Weierstrass Theorem, G. C.’s own definitions of ‘real number’ and ‘derived set’ and ‘intersection,’ and mathematical induction.
If you object (as some of us did to Dr. Goris) that Cantor’s transfinite numbers aren’t really numbers at all but rather sets, then be apprised that what, say, ‘P(∞∞+n)’ really is is a symbol for the number of members in a given set, the same way ‘3’ is a symbol for the number of members in the set {1, 2, 3}. And since the transfinites are provably distinct and compose an infinite ordered sequence just like the integers,29 they really are numbers, symbolizable (for now) by Cantor’s well-known system of alephs or s.30 And, as true numbers, transfinites turn out to be susceptible to the same kinds of arithmetical relations and operations as regular numbers—although, just as with 0, the rules for these operations are very different in the case of
and have to be independently established and proved.
(IYI We won’t be doing a whole lot with them, but if you’re curious, here are some of the standard theorems for the addition, multiplication, and exponentiation of transfinite numbers, all either derived or suggested by Cantor. (Please note that sums and/or products of infinitely many terms here have nothing to do with the sums/limits of infinite series in analysis, which series are now known, post-Cantor, as quasi-infinite.) Assume that n is any finite integer, and that we’ve got two distinct transfinite numbers, designated and
where
31 in which case the following are all true:
N.B. that subtraction and division are possible only in certain misceginated cases—e.g., for a finite n, not between transfinites per se. (Again, this is not all that different from the arithmetic of 0.) Note also that the case of transfinite exponents like
etc. is special and gets discussed at length later on. END IYI)
In case you’re wondering what any of this has to do with the other big ∞-related issues—namely the comparative infinities of the rational numbers v. the real numbers, and surds’ role in the continuity of the Real Line—be apprised that one of Cantor’s favorite arguments for transfinite numbers’ reality32 is their mathematical/metaphysical similarity to irrationals, which latter Dedekind has already successfully defined in terms of infinite sets. How Cantor puts it is:
The transfinite numbers themselves are in a certain sense new irrationals, and in fact I think the best way to define the finite irrational numbers is entirely similar[33]; I might even say in principle it is the same as my method for introducing transfinite numbers. One can absolutely assert: the transfinite numbers stand or fall with the finite irrational numbers; they are alike in their most intrinsic nature; for the former like these latter are definite, delineated forms or modifications of the actual infinite.
What’s interesting is that this clear, unequivocal statement appears in the same “Contributions to the Study of the Transfinite” where all the way back in §3a we saw Cantor quote and credit St. Thomas’s objection to infinite numbers qua infinite sets. Nevertheless, Cantor’s own #1 argument for transfinite numbers—an argument repeated in many forms from 1874 to the late 1890s—is that “their existence is confirmed directly by abstraction from the existence of infinite sets.”34 Thus the central project of Cantor’s 1874–84 work is to develop a coherent, consistent theory of infinite sets—and please notice the plural ‘sets,’ because for such a theory to be nontrivial there needs to be more than one kind (meaning mathematical kind, meaning basically size35) and some set of rules for evaluating and comparing them.
§7c. This obviously segues into the question whether the Real Line’s continuity means that the infinite set of all real numbers is somehow > the infinite set of all rational numbers. To make a very long story short, Cantor’s work on this problem proceeds more or less simultaneously with his development of the derived-set and transfinite-number stuff.36
All right. In trying to find some way to compare the sizes of two sets that are both infinitely large, Cantor hits on the very concept that’s now used in 4th grade to define equality between two sets, namely one-to-one correspondence, or ‘1-1C’. (Actually ‘hits on’ isn’t quite right, since we’ve seen both Galileo and Bolzano use one-to-one correspondence to establish their respective paradoxes (though after Cantor’s theory they’d be paradoxes no more).) One-to-one correspondence is, as you may already know, the way to establish whether two collections are equal without having to count them. Textbooks use all kinds of different scenarios to illustrate how the 1-1C matchup works, e.g. the fingers on your right v. left hands, the number of patrons v. available seats in a theater, a restaurant’s cups v. saucers. Dr. Goris’s own chosen trope (which was clearly audience-tailored) involved the numbers of boys v. girls at a dance and having everybody couple up and dance and seeing whether anyone was left standing stricken and alone against the wall. You get the idea. A couple formal definitions: There is a one-to-one correspondence between sets A and B if and only if there exists a method (which technically no one has to know about) of pairing off the members of A w/ the members of B so that each A-member is paired with exactly one B-member and vice versa. Sets A and B are defined as having the same cardinal number (a.k.a. cardinality) if and only if there is indeed a 1-1C between them.37
Now, for the next definition please recall how Galileo generated his eponymous Paradox in §1d. It would also be helpful to remember the formal def. of subset one § back. A set A is a proper subset of set B if and only if A is a subset of B and there is at least one member of B that is not a member of A.38 So, by definition, every set is a subset of itself, but no set is a proper subset of itself. Make sense? It ought to, at least for sets with a finite number of members.
But what G. Cantor posits as the defining formal property of an infinite set is that such a set can be put in a 1-1C with at least one of its proper subsets. Which is to say that an infinite set can have the same cardinal number as its proper subset, as in Galileo’s infinite set of all positive integers and that set’s proper subset of all perfect squares, which latter is itself an infinite set.
This feature makes the whole idea of comparing the quote-unquote ‘sizes’ of infinite sets look very freaky, since by definition an infinite set can have the same size (or cardinality) as a set it’s by definition bigger than. What Cantor does here39 is take yet another element of Galileo’s Paradox and turn it into an extremely powerful and important tool for comparing ∞-type sets. This, if you want to keep track, is his first stroke of incredible, nape-tingling genius, although it may not look like much at first. It’s the idea of one-to-one correspondence with the set of all positive integers, viz. {1, 2, 3, …}. The reason this is critical is that the set of all positive integers can, in principle, be counted 40—as in it’s possible to go ‘Here’s the first member, 1, and the next member is 2, and …,’ etc., even though as a practical matter the process never ends. Anyway, hence Cantor’s concept of denumerability : An infinite set A is denumerable if and only if there is a 1-1C between A and the set of all positive integers.41
The set of all positive integers also establishes a sort of baseline cardinal number for infinite sets; it’s this set’s cardinality that Cantor symbolizes w/ his famous .42 The idea is that other infinite sets’ cardinalities can be evaluated via this baseline cardinal—that is, you can compare them to
by seeing whether they can be put in one-to-one correspondence with the positive integers. Here’s an example (it isn’t Cantor’s per se, but it’s a good warmup):
Consider whether the set C of all positive integers and the set D of all integers (incl. 0 and the negatives) have the same cardinality. The problem is that there’s a crucial difference between these two sets: C has a very first (meaning smallest) member, namely 1, whereas D (which is basically the set {…, −n, …, 0, …, n, …}) doesn’t. And initially it’s hard to see how we can test two sets for 1-1C if one of them doesn’t have a first member. Luckily what we’re talking about here is cardinality, which has nothing to do with the specific order of the sets’ members43; thus we can futz with the order of set D in such a way that even though D doesn’t have a smallest member it does indeed have a first member, here let’s say 0. And this single bit of futzing lets us set up, and represent schematically, a perfect 1-1C—
—that proves C and D have the same cardinality. Notice that even though you can never literally finish the matching process with infinite sets, as long as you can establish a procedure for one-to-one correspondence that works for the 1st, nth, and (n + 1)th cases, you have proved by mathematical induction that the correspondence will obtain all the way through both sets. In the above example, we’ve proved that the set of all integers is denumerable even though we can’t possibly count every member.44 The proof’s method is © G. Cantor, and the big thing to see is that he is once again able to take an implicit property of something—here math induction’s ability to abstract a finite number of results over a ∞ of possible cases—and to make it explicitly, rigorously applicable to infinite sets.
OK, so now it’s clear how Cantor can do a size-comparison on the infinite sets of all rational numbers and all real numbers:45 he can see whether either or both are denumerable. What follow are a series of very famous proofs, most worked out in correspondence with R. Dedekind, published in the 1870s, and then revised and expanded in the early ’90s. First the rationals.46 When you consider the infinite density that Zeno had exploited merely in the geometric rationals between 0 and 1, it appears as if the set of all rational numbers can’t possibly be denumerable. Not only does it lack a smallest member, but there isn’t even a next-largest member after any given rational (as we’ve seen two different proofs of). What Cantor notices, though, is that by ignoring ‘relations of magnitude’ between successive members, we can actually arrange the set of all rationals in a row, something like the row of all positive integers; and in that row there’ll be a first member r1, a second member r2, and so on. It just so happens that the technical term for putting a set into such a row is giving a denumeration of the set—plus the row itself is called the set’s denumeration—meaning that here the valid construction of an ordered row will constitute a proof that the set of all rationals really is denumerable (that is, 1-1C-able with, and so equivalent in cardinality to, the set of all integers). Cantor’s construction, which is sometimes referred to incorrectly as his ‘Diagonal Proof,’47 runs more or less thus:
As we saw in §6c, all rational numbers can be put in the ratio-of-integers form So we make a 2D array of all these
where the top horizontal row is all the rationals of the form
(i.e., the integers), and the first vertical column is all the rationals of the form
and every rational
will be located in the qth row and pth column, like so:
Granted, a 2D array is not the same as the single ordered sequence/row of true denumeration, but Cantor figures out how to sequence the array’s rationals via a single continuous zigzaggy line, like so: start at 1 and go due east one place to 2, then diagonally southwest to then due south to
then diagonally northeast to the first row again and 3, then east to 4, then southwest all the way to
south to
northeast to 5, and so on, as in:
The points on the above line will compose the sequence 1, 2, from which we can licitly cancel out all the ratios in which p and q have a common factor, so each different rational appears just once in its most basic form. This process of elimination/reduction then yields the linear sequence
which sequence constitutes the ordered row required for denumeration,48 meaning that the set of all rationals is indeed denumerable and therefore has the same cardinality as the set of integers, namely good old
The true Diagonal Proof appears in Cantor’s answer to the question whether the set of all real numbers is > the set of all rationals. It should now be obvious that Cantor’s proof here will concern the denumerability of the real numbers; i.e., if the reals are denumerable then their cardinality = that of the rationals, and if they’re not then the reals are > the rationals. The overall proof is a reductio, and its method of Diagonalization is now regarded as one of the most important proof-techniques in all of number theory. Two preliminary things to note here. (1) Cantor’s first, Dedekind-informed, ’73–’74 proof of the nondenumerability of the reals involves limits of sequences w/r/t ‘nested intervals’ on the Real Line and is just hideously complex. The proofs we’re doing here are Cantor’s revised versions, c. 1890; they are both simpler and more significant than the early one. (2) Notice once again in the following how Cantor uses the decimal form of real numbers and exploits §2c’s fact that .999 … = 1.0 in order to represent not just the irrationals but all real numbers as nonterminating decimals—as in for example 0.5 = 0.4999 …, 13.1 = 13.0999, etc. This move (which was actually Dedekind’s suggestion) ensures that there’s only one licit representation of each decimal; we’ll see in a minute why Cantor needs to set the real numbers up this way.
So here’s the proof. Because it’s a reductio, we first assume that the set of all real numbers truly is denumerable—i.e., that it is listable in an ordered row or sequence.49 This sequence will consist in an infinite table of infinite nonterminating decimals, which table we can show at least the start of, like so:
1st Real # = X1.a1a2a3a4a5a6a7…
2nd Real # = X2.b1b2b3b4b5b6b7…
3rd Real # = X3.c1c2c3c4c5c6c7…
4th Real # = X4.d1d2d3d4d5d6d7…
5th Real # = X5.e1e2e3e4e5e6e7…
6th Real # = X6.f1 f2 f3 f4 f5 f6 f7…
.
.
.
And so on….
In this table, the X’s denote any and all pre-decimal-point integers, and a’s, b’s, etc. represent the infinite sequences of digits after the decimal points; and the proof’s assumption is that the infinite version of such a table will be exhaustive of all the real numbers. This means that the reductio’s desired contradiction will require us to prove that such a table does not really exhaust the set of all reals, which proof requires that we come up with a real number that isn’t—can’t be—included in the table.
What Cantor’s Diagonal Proof does is generate just such a number, which let’s call R. The proof is both ingenious and beautiful—a total confirmation of art’s compresence in pure math. First, have another look at the above table. We can let the integral value of R be whatever X we want; it doesn’t matter. But now look at the table’s very first row. We’re going to make sure R’s first post-decimal digit, a, is a different number from the table’s a1. It’s easy to do this even though we don’t know what particular number a1 is: let’s specify that a = (a1 − 1) unless a1 happens to be 0, in which case a = 9. Now look at the table’s second row, because we’re going to do the same thing for R’s second digit b: b = (b2 − 1), or b = 9 if b2 = 0. This is how it works. We use the same procedure for R’s third digit c and the table’s c3, for d and d4, for e and e5, and so on, ad inf. Even though we can’t really construct the whole R (just as we can’t really finish the whole infinite table), we can still see that this real number R = X.abcdefghi … is going to be demonstrably different from every real number in the table. It will differ from the table’s 1st Real in its first post-decimal digit, from the 2nd Real in its second digit, from the 3rd Real in its third digit, … and will, given the Diagonal Method here,50 differ from the table’s Nth Real in its nth digit. Ergo R is not—cannot be—included in the above infinite table; ergo the infinite table is not exhaustive of all the real numbers; ergo (by the rules of reductio) the initial assumption is contradicted and the set of all real numbers is not denumerable, i.e. it’s not 1-1C-able with the set of integers. And since the set of all rational numbers is 1-1C-able with the integers, the set of all reals’ cardinality has got to be greater than the set of all rationals’ cardinality. Q.E.D.*
*QUICK FOREST-V.-TREE INTERPOLATION
Let’s step back and reflect for just a second here on how stratospherically abstract all this is. And on why set theory, which is arguably the most fundamental part of modern math, is also the most mindbending. Set theory is 100% trivial as long as you’re dealing with finite sets, because all relations between such sets can be determined empirically—you just count up their members. In real set theory, we’re dealing with abstract aggregates of abstract entities so numerous they cannot ever be counted or completed or even comprehended … and yet we are proving, deductively and thus definitively, truths about the makeup and relations of such things. In the heat of all this proof and explication, it’s easy to lose sight of the utter strangeness of infinite sets, a strangeness which is diminished not one bit by Cantor and Dedekind’s having shown that these ∞s lie at the very taproot of math and are required for handling something as basic as a straight line. Apropos this strangeness, here is a nice quotation from philosophers P. Benacerraf and H. Putnam:
There are the sets; beautiful, imperishable, multitudinous, intricately connected. They toil not, neither do they spin. Nor, and this is the rub, do they interact with us in any way. So how are we supposed to have epistemological access to them? To answer, ‘by intuition,’ is hardly satisfactory. We need some account of how we can have knowledge of these beasties.
—and one from the hardass Intuitionist H. Poincaré:
A reality completely independent of the spirit that conceives it, sees it, or feels it, is an impossibility. A world so external as that, even if it existed, would be forever inaccessible to us.
—and a rather delicious rebuttal from the Platonist K. Gödel:
Despite their remoteness from sense experience, we do have something like a perception also of the objects of set theory, as is seen from the fact that the axioms force themselves upon us as being true. I don’t see any reason why we should have less confidence in this kind of perception, i.e. in mathematical intuition, than in sense perception, which induces us to build up physical theories and to expect that future sense perceptions will agree with them… .
END Q. F.-V.-T. I. RETURN TO §7c AT THE ¶ ON p. 256
W/ ASTERISK AT END
Some addenda regarding these first two proofs. (1) Since the cardinal number of denumerable sets is it looks as if it would make sense to signify the set of all reals’ cardinality by
but for complicated reasons Cantor designates this set’s cardinal number c, which he also calls “the power of the Continuum,” since it turns out to be the nondenumerability of the reals that accounts for the continuity of the Real Line. What this means is that the ∞ of points involved in continuity is greater than the ∞ of points comprised by any kind of discrete sequence, even an infinitely dense one. (2) Via his Diagonal Proof that c >
Cantor has succeeded in characterizing arithmetical continuity entirely in terms of order, sets, denumerability, etc. That is, he has characterized it 100% abstractly, without reference to time, motion, streets, noses, pies, or any other feature of the physical world—which is why Russell credits him with ‘definitively solving’ the deep problems behind the Dichotomy.51 (3) The D.P. also explains, with respect to Dr. G.’s demonstration back in §2e, why there will always be more real numbers than red hankies. And it helps us understand why rational numbers ultimately take up 0 space on the Real Line,52 since it’s obviously the irrational numbers that make the set of all reals nondenumerable. (4) An extension of Cantor’s proof helps confirm J. Liouville’s 1851 proof that there are an infinite number of transcendental irrationals in any interval on the Real Line. (This is pretty interesting. You’ll recall from §3a FN 15 that of the two types of irrationals, transcendentals are the ones like π and e that can’t be the roots of integer-coefficient polynomials. Cantor’s proof that the reals’ ∞ outweighs the rationals’ ∞ can be modified to show that it’s actually the transcendental irrationals that are nondenumerable and that the set of all algebraic irrationals has the same cardinality as the rationals,53 which establishes that it’s ultimately the transcendental-irrational-reals that account for the R.L.’s continuity.) (5) Given that the D.P. is a reductio proof and that its quantities are in no way constructible, it should come as no surprise that Prof. L. Kronecker and other proto-Constructivists didn’t like it at all (re which there’s much more a couple §s down). By all accounts, Kronecker’s public campaign against Cantor commenced in earnest with the c-related papers.
§7d. Mathematically, you can probably see what Cantor’s next big move is. Having proved with c that there is a power of ∞ greater than he starts looking for infinite sets whose cardinality might be greater than c. His next major proof (which you’ll notice still concerns point sets) is an attempt to show that the 2D plane contains a ∞ of points that’s greater than the 1D Real Line’s c in the same way that c is greater than the Number Line’s
This is the proof of whose final result Cantor famously wrote to Dedekind “Je le vois, mais je’n le crois pas” in 1877.54 It’s known in English as his Dimension Proof. The general idea is to show that the real numbers cannot be put into a 1-1C with the set of points in an n-dimensional space, here a plane, and hence that the cardinality of the plane’s point set is > the cardinality of the set of all reals. The proof’s specific cases are the good old Pythagorean Unit Square and the interval [0,1] on the Real Line. (You will remember from §3 that Bolzano’s P. of the I. had already suggested in 1850 that [0,1] contained as many points as the whole Real Line, which equivalence Cantor now formally proves in his Dimension paper. Since we’ve already seen a graphical demonstration of the equivalence in §3c, we’ll skip this proof except to point out what you can likely anticipate: Cantor shows that whatever type of Diagonalization you use to create a new real number that’s > 1 can be duplicated to create a new real number in [0,1].)
For the paper’s main Dimension Proof, you sort of have to visualize the Unit Square set up like a Cartesian grid, with numerical coordinates corresponding to each and every point on its plane. Cantor’s strategy is to use Diagonalization to show that there are numbers corresponding to these 2D coordinates that cannot be found in the set of all reals. As is clear from his letters to Dedekind, Cantor is sure at the outset that such numbers can be generated, since every geometer from Riemann on had operated under the assumption that any space’s dimension (as in 1D, 2D, 3D) was uniquely determined by the number of coordinates required to identify a point in that space.
Except that assumption turns out to be wrong, as Cantor discovers in his attempt to construct decimal sequences of 2D coordinates that will allow planar points to be compared with real-number decimals. The tricky thing is obviously that planar points are specified by pairs of real numbers and linear points by solo reals, so (harking back to Pythagoras and Eudoxus) Cantor has to devise a way to make the two sets of points commensurable. It takes him three years to figure out how to do this. Again, let all relevant numbers be represented by infinite nonterminating decimals. Take any point (x, y) on the Unit Square; these coordinates are writable:
x = 0.a1a2a3a4a5a6a7…
y = 0.b1b2b3b4b5b6b7…
which combine to make up the point (x, y)’s unique55 decimal representation:
0.a1b1a2b2a3b3a4b4a5b5a6b6a7b7…
And to this point there will clearly correspond a unique point z in the R.L. interval [0,1], namely the z that equals the real number 0.a1b1a2b2a3b3a4b4a5b5a6b6a7b7 ….56
So, by straightforward extrapolation from the Unit Square and [0,1], every point on a 2D plane can be put into a 1-1C with a point on the R.L. in just this way, and vice versa. More, Cantor’s (relatively) simple method of combining coordinates into a single real number means that the same general technique can be used to show that a 3D cube, a 4D hypercube, or actually any n-dimensional figure’s total point-set has the same cardinality as the R.L.’s set of real numbers, namely c. This is an extraordinary result, and it’s why Cantor wasn’t disappointed at having failed to prove his original premise: he’d discovered an incredible depth and richness to the Continuum, and his proof showed (this is him writing to Dedekind) “what wonderful power there is in the real numbers, since one is in a position to determine uniquely, with a single coordinate, the elements of an n-dimensional continuous space.”
Cantor’s discovery that lines, planes, cubes, and polytopes57 were all equivalent as sets of points goes a long way toward explaining why set theory was such a revolutionary development for math—revolutionary in theory and practice both. Some of this goes all the way back to the Greeks’ commensurability problem and classical calc’s ambivalent relationship to geometry. Uneasiness about using quantities like x2 and x3 in the same equation (since squares entailed 2D areas and cubes → 3D volumes) had persisted for centuries, and the 1800s’ emphasis on rigor made the geometric ambiguities even less palatable. To make a long story short, Cantorian set theory helps unify and clarify math in the sense that all mathematical entities can now be understood as fundamentally the same kind of thing—a set. Plus, in the new non-Euclidean geometries,58 Cantor’s finding that all geometric-point-sets are transfinitely equivalent (i.e., that they all had the cardinality c) is of major importance, particularly in the idea of dimension, as Cantor also observes to Dedekind:
This [=G.C.’s] view seems opposed to that generally prevailing in particular among the advocates of the new geometry, since they speak of simply infinite, two, three, …, n-dimensional infinite domains. Sometimes one will even find the idea that the infinity of points of a [2D] surface may be produced so to speak by squaring, that of a solid by cubing the infinity of points of a line.
It goes without saying, though, that our ‘revolutionary development’ and ‘major importance’ stuff is in hindsight. As has been abundantly foreshadowed, it’s not the case that mainstream math immediately dropped to one knee with arms out to welcome Cantor’s post–Uniqueness Theorem proofs. Particularly re the Dimension Proof, mathematicians of nearly all stripe and school lined up to revile it. Besides the general objections in §7c, Constructivists especially hated the idea of somehow creating 1D irrationals out of 2D combinations of other irrationals, as well as the ‘noncontinuous mapping’ the Dimension Proof produced between line-points and plane-points59; and it was actually Cantor’s Dimension-Proof paper60 that L. Kronecker first intrigued to get rejected from a journal he was on the editorial board of, over which Cantor spent a great many letters venting spleen. But it wasn’t just Constructivists or fundamentalists. See, for just one example, these lines from P. Du Bois-Reymond—who is not a Kroneckerian but a mainstream analyst in the Aristotle–Gauss tradition of potential-only ∞s—in a review of the Dimension Proof:
It seems utterly repugnant to common sense. The simple fact is that this is the outcome of a type of reasoning that allows Idealistic [=Platonic] fictions to assume the role of genuine quantities even though they are not even truly limits of representations of quantities.61
§7e. Anyway, so we’ve established at least and maybe at most two distinct orders of infinite sets, and c,62 and it’s now appropriate to ask what exactly these cardinal numbers have to do with the transfinite numbers that we saw Cantor manufacture out of R and
and the derived-sets-of-derived-sets thing in §7b. With the big specific question being whether the infinite sequence of infinite set P(n∞∞), P(∞∞ + 1), p(∞∞ + n), etc. can be shown to correspond to an infinite hierarchy of greater and greater cardinal numbers, or whether
and c are the only infinite cardinals and there are no real ∞s beyond the transdimensional power of the Continuum.
Cantor’s next big discovery is that you can validly construct an infinite sequence of infinite sets with larger and larger cardinal numbers using nothing but the formal properties of sets.63 These properties involve the concepts of the subset and of the Power Set, the second of which is hereby defined, for some set A, as simply the set of all subsets of A. Meaning every member of P(A) is some subset of A. This turns out to be heavier than it looks. Every set, finite or not, has a Power Set64; but what Cantor’s able to prove is that even if set A is infinite, its Power Set P(A) will always have a larger cardinal number than A—more specifically, he’s able to prove that the cardinal number of P(A) will always equal 2A.65 And this A→2A thing ends up being crucial for navigating the transfinite, in which realm it turns out that one sort of jumps, quantumlike, from one number class to the next, with nothing in between: and so on (as it were).
Cantor’s Power Set proofs are extremely intricate, and we have to kind of build up to them. And 4th grade was doubtless a long time ago for everybody, so in case we haven’t already done so let’s explicitize that the formal way to designate a set is to put its members inside {braces} like this, and that the symbolism for ‘item a is a member of set A’ is ‘a ∈ A’. Let’s further remind you that ‘subset’ is by definition more inclusive than ‘proper subset,’ and that included among the subsets of any set A will be (1) A itself and (2) the empty set, symbolized ‘Ø’ or sometimes ‘{ }’.66 Since any set, therefore, has at least some subsets, it follows that every set has a Power Set. To see informally that the number of members of A’s Power Set always equals 2(Number of Members of A), let A be the three-member set {1, 2, 3}. A’s subsets here are: { }, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}, of which there are exactly 8, or 23. A more rigorous way to prove P(A) = 2A is by mathematical induction, which technically isn’t the way Cantor does it but is at least implicit in Cantor’s proof; plus it’s comparatively easy. Please review or resummon §7b’s thing on the three steps of proof by math induction, which here will go like:
(a) Prove that the cardinality of P(A) equals 2A for a set A with just one member. Such an A has as subsets the following: Ø and A itself, which means its P(A) has two members, which is 21 members, which is 2A members, so bingo.67 (b) Assume it’s true that if A has k members, P(A)’s cardinal number = 2k. (c) Prove that if A has (k + 1) members, P(A) = 2(k +1). From step (b) we know that A’s first k members yield 2k subsets of A. We now take each one of these 2k subsets and form a brand new subset that also contains the very last of A’s (k + 1) members (i.e., the new, extra member designated by the ‘+1’). We can form exactly 2k of these new ‘+1’ subsets—one for each of the original subsets. So now we’ve got the original 2k subsets that don’t contain the new ‘+1’ member, and we’ve got 2k new subsets that do contain it. That’s (2k + 2k) subsets, which is equivalent to (2 × 2k) subsets, which equals 2(k+1) subsets. So (c) is also proven. So sure enough, P(A) = 2A.
For our purposes, Cantor’s got two main Power Set proofs. In neither one is he worried about the 2A thing yet: what he’s basically concerned to show is that even for an infinite set A, P(A) > A.68 The first version, which dates around 1891, is important mainly because it shows what a potent reductio-weapon the Diagonalization technique is. It can be considered a proof that the set of all subsets of the set of integers is not denumerable69—which, since Cantor’s already shown that the set of integers is denumerable, will obviously mean that its Power Set has a higher cardinal number than
Here’s the proof. Call the set of all integers I; call I ’s Power Set P(I). We know from §7c that in order for P(I) to be denumerable, it has to be possible to set up a one-to-one correspondence between P(I) and I. The present proof is a reductio, so assume that verily such a 1-1C between P(I) and I is possible. This (as we also know from §7c’s Diagonal Proofs) means that the 1-1C can be charted in an array like the following partial example, with the members of I on the left and the subsets of I (which are also the members of P(I), and can be in any sort of random order we want) on the right:
ARRAY #1
I P (I)
0 ↔ {All Integers}
1 ↔ { }
2 ↔ {All Even Integers}
3 ↔ {All Odd Integers}
4 ↔ {All Primes}
5 ↔ {All Integers > 3}
6 ↔ {All Perfect Squares}
7 ↔ {All Perfect Cubes}
· ↔ ·
· ↔ ·
· ↔ ·
As it happens, we can tweak this array’s informative range by exploiting a property of its 1-1C that you might already have noticed if you’ve spent any time thinking about why exactly the relation between sets and their Power Sets is always 2A rather than 3A or some other xA. The deep answer is that the ‘2’ in 2A reflects a particular kind of decision procedure. For each subset s of some set A, you have exactly two choices with respect to each member a of A: either a is a member of s, or it isn’t. That last sentence probably requires more than one read. It’s hard to put it clearly in natural language, but the idea itself isn’t that complicated. A is a set, a is some particular member of A, s is some particular subset of A. Ask whether a happens to be a member of s. Well, either it is or it isn’t. You exhaust all the possibilities regarding a’s membership in s by including a in s once and excluding a from s once—thereby producing the duo of subsets s and s′ w/r/t each a.
(SEMI-IYI Here’s one of those places where it’s simply impossible to tell whether or not what’s just been said will make sense to a general reader. If the abstract is-a-a-member-of-s-or-not thing is clear enough so that you understand why it alone explains why a set A with three members will have 23 subsets, feel free to skip the rest of this ¶. If it isn’t, we’ll do a concrete example. Let’s say A is the same set {1, 2, 3} unpacked on p. 267, where we listed A’s subsets: { }, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}. Take a look at those subsets and see how many times any particular member of A—let’s say the member 1—is included in the eight total subsets. You’ll notice it’s included in four of the subsets and excluded from four. If you look at A’s member 2, you’ll see it’s the same thing: 2 is present in four and absent from four. Same with 3. Can you see why? There are eight total subsets; half of them contain any particular member of A, and half of them don’t. You can actually construct the set of all A’s subsets this way. Take any member of A. If your first subset s doesn’t contain the member, your next one s′ will. Or obversely. That is, for any particular member and subset, there are two choices, and the set of all subsets will comprise them both. Two choices for each member. Hence the number of subsets of {1, 2, 3} will be 2 × 2 × 2, or 23. If this still fails to make the basic idea clear, you’re asked to please just eat it (the idea) because this is the best we can do.)
OK, so this means that we can take Array #1 and sort of expand it sideways by asking, for every integer in set I, whether it really is part of its corresponding subset in the P(I) column, and entering ‘Yes’ if the integer is in that particular subset and ‘No’ if it isn’t. As in:
And once this Array #2 is set up, we can easily show that the assumed correspondence between I and P(I) isn’t exhaustive and so isn’t a valid 1-1C. We do this by using good old Diagonalization to construct a subset of I that will never ever show up in the table’s I ↔ P(I) correspondence. It’s the subset defined by starting at the extreme northwest corner of Array #2’s ‘Yes’/‘No’ table and going diagonally southeast, changing ‘Yes’s to ‘No’s and vice versa throughout—like so:
All we know about this new subset is that it includes 1, 4, 6, and 7, and that it differs in at least one member from each subset (a.k.a. each member of P(I)) in the original 1-1C. Naturally, our Array #3 is just a fragment, but by continuing the simple process of Diagonal ‘Yes’–‘No’ switching we can guarantee that the new subset generated thereby will differ from all the 1-1C’s subsets no matter how far out in the pairings we get. Hence a true 1-1C between I and P(I) is impossible. Which means P(I) is nondenumerable,70 which means its cardinality is greater than Q.E.D.
While there are good reasons why we’ve gone through the proof in such graphic detail here, be advised that this is not the way G. Cantor does it. The truth is that he never explicitly lays out the Diagonal Proof for P(I) > I and therefore for P(A) > A; he merely alludes to it as a “natural extension” of his Diagonal Proof of the nondenumerability of the real numbers.71 The argument for P(A) > A that he does give is a bit of a skullclutcher, but it ends up playing a key role in our Story’s dénouement and so needs to be spelled out. This proof’s wholly abstract and nonspecific, designed to show only that from any infinite set A you can construct some infinite set B whose cardinal number trumps that of A. It would maybe be good to prepare yourself, emotionally, for having to read the following more than once:
A is an infinite set; B is the set of all subsets of A.72 Because all sets are by definition subsets of themselves, A is a subset of A, meaning A is a member of B; so it’s definitely possible to set up a 1-1C between all the members of A and at least one member of B. It is not possible, however, to set up a 1-1C between all the members of A and all the members of B. We’re going to prove this by reductio, so we assume the customary position and posit that such a 1-1C is indeed constructed and is exhaustive of both infinite sets. Now, let a be any member of A and b be any member of B (so b is any subset of A). As we saw with Array #2 above, the 1-1C between A and B can be wholly random in the sense that, in any individual correspondence a ↔ b, a may or may not be a member of the b it’s paired with. For instance, the integer 3 got paired with {All Odd Integers} and is itself an odd integer, whereas 6 got paired with {All Perfect Squares} but is not a perfect square. It will be the same with our present 1-1C and its infinite pairings a ↔ b: sometimes a will be a member of the subset b it’s paired with; sometimes it won’t. This is all fairly straightforward. Now, though, consider the total of all a’s in the 1-1C that are not members of the b’s they’re matched with. Let ϕ be the set of all such a’s. ϕ is, of course, a subset of A, which means that ϕ is a member of B—and yet it’s provable that ϕ cannot be included in the supposedly exhaustive 1-1C between A and B. For if ϕ is included, it’s matched with some a, and there are as we’ve seen only two options: either this a is itself a member of ϕ, or it isn’t. If a is a member of ϕ—but it can’t be, since this contradicts the definition of ϕ. But if a is not a member of ϕ, then it is, by definition, a member of ϕ—which it can’t be, but so it must be, but so it can’t be … and so you’ve got your LEM-grade contradiction either way. Hence no true 1-1C between A and B is possible; hence B’s cardinality is > A’s cardinality. Quod erat dem.*
*SEMI-INTERPOLATIVE §7f.
Please notice the way this last proof resembles the ancient Greek ‘I Am Lying’ paradox73 where if the sentence is true it’s false and if it’s false it’s true. Meaning we’ve now entered the chasmal terrain of self-reference. This is the real reason we just slogged through Cantor’s B > A proof—it opens up a whole new kind of crevasse for modern math.
Though it’s not strictly in our purview, be informed that in the 1930s Prof. K. Gödel74 will use something very much like Cantor’s device to prove his devastating Incompleteness Theorems. (In crude terms, Gödel will prove that certain well-formed math propositions are true and yet unprovable by deriving ‘Proposition P: Proposition P is unprovable’ as a theorem.) More important for our purposes is this idea that sets can include other sets as members, which is essential to the concept of Power Sets and certainly looks innocent enough … except after Cantor’s proof it turns out to be to be a veritable swan-dive into the crevasse of self-reference. Example: Consider the theorem Cantor’s just proved, that the set of all subsets of set A will always contain more members than A itself. But suppose now that A is defined as ‘the set of all sets’. By definition, this A will contain all its subsets, since these subsets are sets—so here there’s no way P(A) > A. Upshot: The same sets-of-sets principle that Cantor needs in order to build a hierarchy of infinite sets yields a paradox almost right away.
Historical evidence shows that Cantor knew about the set-of-all-sets paradox by c. 1895,75 though never once in his published work does he mention it. It’s nevertheless known now as Cantor’s Paradox. It’s also regarded as the basis for the most famous set-theoretic paradox of all, which is usually called Russell’s Antinomy because the ubiquitous B. Russell used it to torpedo G. Frege’s Foundations of Arithmetic in 1901.76 We can sketch Russell’s Antinomy very quickly and easily because almost everybody’s heard of it sometime or other. Though it drops right out of Cantor’s abstract Power Set proof, the Antinomy also plays havoc with the main criterion in Cantor’s definition of ‘set,’ which (you’ll recall from §7a FN 16) is that there’s always a procedure such that for any given item you can always determine whether it’s a member of a given set.77 So here is Russell’s Antinomy. As we’ve seen, some sets are members of themselves and some aren’t. Actually, most aren’t—as in for example the set of all chairs is not itself a chair, the set of all entities that can tie a knot in a cherry-stem with their tongue cannot itself tie such a knot, etc. But some sets do contain themselves as members, e.g. the set of all sets, the set of all abstractions, the set of all entities that cannot tie a cherry-stem knot. Russell calls a set that is not a member of itself a normal set, and one that is self-containing an abnormal set. So now consider the set N of all normal sets—is N a normal set?78 Well, if N is a normal set, then by definition it isn’t a member of itself; but N is the set of all sets that aren’t members of themselves, so N is a member of itself if it isn’t a member of itself; although now if N really is a member of itself then it can’t be a member of the set of all sets that aren’t members of themselves, so N actually isn’t a member of itself, in which case it is … and around and around ad inf.
This kind of paradox, like the cruncher in Cantor’s reductio, is officially known as a Vicious Circle. The ‘vicious’ here means roughly the same thing it did in §2a’s VIR, namely that it becomes logically impossible to do something we’re logically required to do. In VC paradoxes like Russell’s and Cantor’s, what we cannot do is determine whether something is or isn’t a member of a set, which violates both the formal definition of ‘set’ and (way worse) LEM. So these are not lightweight problems.
By this point you’ve almost certainly discerned the Story of ∞’s overall dynamic, whereby certain paradoxes give rise to conceptual advances that can handle those original paradoxes but in turn give rise to new paradoxes, which then generate further conceptual advances, and so on. If you’re one of the readers who is bothering with the ‘IYI’ footnotes, you’ve already seen stuff about one kind of technical remedy for Russell’s Antinomy, which is Zermelo et al.’s replacement of the Unlimited Abstraction Principle with the Limited A.P. Another type of solution is the prohibition of impredicative definitions championed by J. H. Poincaré (1854–1912), a major figure in topology who incidentally was, after Kronecker’s death in 1891, the #1 opponent of transfinite math.79 Poincaré’s definition of impredicative is somewhat shifty, but in essence it means defining an object in terms of a whole group of objects of which it’s a part. Even more essentially, an impredicative definition depends on self-referential properties and descriptions, and ‘The set of all sets that are not members of themselves’ is a perfect example of such a definition (as are ‘The set of all sets’ and Cantor’s def. of set ϕ in the B > A proof above). This all gets very involved, but Poincaré’s general tactic is to characterize impredicative definitions in terms of the paradoxical results they can yield,80 which then forms the logical argument for disallowing them. It’s rather the same way dividing by 0 got outlawed. Unfortunately, the formal definitions of all sorts of terms and concepts in analysis, from ‘sequence’ and ‘series’ to ‘limit point’ and ‘lower bound,’ are also impredicative—not to mention that the concept of impredicativity can itself be made to generate nasty VCs81—so Poincaré’s solution never really caught on.
Russell’s own proposed way to avoid his and Cantor’s eponymous Paradoxes is the Theory of Types, which to make a very long story short is part of Russell’s foundational program for trying to show that all math is reducible to symbolic logic. The Theory of Types is a sort of grammar of abstraction that disallows certain kinds of propositions in which different Types of entities are treated as equivalent. Meaning, in essence, metaphysically equivalent.82 The idea is that sets of individuals are not the same Type of entity as individuals themselves, and sets of sets are not the same Type as sets of individuals, and so on. And a particular entity’s Type is a direct function of how abstract that entity is, so you end up with a set-theoretic hierarchy that resembles the informal abstraction-Levels we talked about in §1b—Russell’s Type 1 = Individuals, Type 2 = Sets, Type 3 = Sets of sets, Type 4 = Sets of sets of sets, etc. etc.83 What enables the theory to preempt Vicious Circles is that the same sort of hierarchy can be applied to propositions—e.g., Type x = some entity of some particular Type; Type (x + 1) = some proposition about that entity; Type (x + 2) = some proposition about that proposition about the entity; and so on. (N.B. that for Russell ‘proposition’ can mean either a natural-language sentence or a formal/mathematical assertion like ‘a ∈ A’.84) And the big rule is that, where m and n are integers, a proposition or set of Type n cannot be applied to another proposition/set of Type n, but only to a proposition/set of some Type m where m < n.
As far as our Story’s concerned, the Theory of Types can be seen as a perfect example of trying to legislate one’s way out of a paradox. The theory does indeed offer a ‘solution’ to Russell’s and Cantor’s crunchers—that is, it gives an account of what the paradoxes’ illicit move is—but it’s also incredibly arcane and cumbersome, and ultimately as damaging to math as Poincaré’s impredicativity thing. Quick example: Since rational numbers are defined as ratios of integers, and surds as sets/sequences of rationals, the three kinds of numbers are of different Types, and by the theory’s rules we couldn’t predicate things of all three in common without endless different proofs and Levels and caveats. FYI, Russell tried to patch up some of these difficulties via what he called Axioms of Reducibility, but these were even more complicated and contrived … and basically his whole Typology spins off into the aether and is now of merely historical interest.85
If it’s necessary to say once again that we’re just barely skimming a turbid surface here, consider it said. Specific counter-paradox measures like Russell’s and Poincaré’s are part of a much larger and deeper crisis, one that predates G. Cantor but is brought to a head by his theories of ∞. Thrust, broadly stated: The paradoxes of set theory, coupled with the foundational concerns that start with Abel and Cauchy and climax with Frege and Peano, lead directly to the great Formalist-v.-Intuitionist controversies of the early 1900s. These are controversies that we again can only trace the outlines of. Respecting infinite sets, for example, Intuitionism is rabidly anti-Cantor and Formalism staunchly pro-Cantor, even though both Formalism and Intuitionism are anti-Plato and Cantor is a diehard Platonist. Which, migrainous or not, means we’re again back to metaphysics: the modern wrangle over math’s procedures is ultimately a dispute over the ontological status of math entities.
There’s already been some intro to Intuitionism in §6’s discussions of Constructivism; Formalism is its own separate kettle. The best way to come at this might be to recast the broadly stated thrust just above, to wit: The paradoxes of set theory are part of the larger issue of the Consistency of Math, which D. Hilbert proposed as Major Problem #286 at the same Paris Congress where he could be seen rhapsodizing about Cantor in §1a. Hilbert’s own program for reconstructing mathematics in such a way that theorems don’t yield paradoxes is Formalism, which seeks to make the abstractness of math both total and primary. The basic idea of Formalism is to totally separate math from the world and turn it into a game. Literally. This game involves the manipulation of certain symbols according to certain rules that let you construct sequences of symbols from other sequences of symbols. It’s 100% formal—hence the name. What the math-game’s symbols mean, or whether they even denote at all, has nothing to do with it; and to say that a math entity ‘exists’ is merely to say that it doesn’t cause a contradiction.87 What matter are the rules, and the whole project of Formalism is proof-theoretic: the goal is to construct a set of axioms and rules of inference88 from which all of math can be derived, so that the whole thing’s totally deductive and rigorous and clean—as a self-enclosed game, that is.
If you have any sort of background in logic or the philosophy of math, you’ll recognize that this is a radically boiled-down description of Formalism. (For one thing, Hilbert’s program also involves breaking math down into Levels of reasoning somewhat like Russell’s Types, with again no inter-Level propositions allowed.) You’ll also probably know that the movement runs into serious problems long before Gödel’s aforementioned proofs that a formal system can’t be both Complete and Consistent89—like e.g. the Formalists couldn’t even get basic arithmetic to be Complete and Consistent if it included multiplication as a legal operation, which is obviously a serious problem. So we don’t have to talk about the philosophical impoverishment or flat-out weirdness of a referentless math-game, because Formalism couldn’t even succeed on its own terms.
The most coherent and successful responses to the VC paradoxes come from within set theory itself (which by c. 1900 is a thriving field in both math and logic, thanks to guess who), and are spearheaded by Cantor’s #1 follower and systematizer, Prof. E. Zermelo.90 A result of these responses is the split of abstract set theory into two subtypes, naïve set theory and axiomatic set theory. N.S.T. is just regular Cantorian set theory with all its warts and glories, including its susceptibility to paradoxes.91 Axiomatic set theory is an attempt to derive a more rigorous, foundationally secure version of set theory that’s got all the conceptual power of N.S.T. but is set up in such a way as to avoid gross paradox. The A.S.T. program is somewhat Formalist in spirit, and Euclidean: it’s to make set theory its own independent formal system92 with its own set of axioms that yield maximal Consistency and Completeness. As mentioned someplace already, the best-known axiomatic system is usually called ZFS (for Zermelo, A. Fraenkel, and T. Skolem); there’s also the more restrictive von Neumann–Bernays (VNB) system, as well as some others, w/ various metatheoretical bells and whistles, designed by eminences like A. Tarski, W. V. O. Quine, F. P. Ramsey, & c.
As it happens, axiomatic set theory and the logic of same have had fruitful applications in everything from math’s theory of real functions, analysis, and topology, to generative grammar and syntax studies in linguistics, to decision theory, algorithms, logic circuitry, halting-probabilities/‘Ω-studies,’ A.I., and combinatorial processing in computer science. Despite increasingly dire space limitations, it is therefore worth it to include at least a doubletime tour of the basic axiomatization that all the major systems are variations of, w/ terse and directly relevant glosses where necessary—and of course at this late point the whole thing being skip- or skimmable at your IYI discretion—as follows:
Primitive Concept: The membership relation ∈, where ‘s ∈ S’ means object s is a member of set S.
Ax. 1: Two sets are equal if they contain the same members. (Notice it’s not ‘if and only if …’; this is because infinite sets and their proper subsets can also be equal.)
Ax. 2: If a and b are either objects or sets, then {a, b} is a set.
Ax. 3: There are two variants of this one. 1st variant—For a set S and a ‘definite predicate’93 P, there exists the set SP that contains only those x ∈ S 94 that have the property designated by P. 2nd variant—There exists a set S with the following features: (a) Ø ∈ S, and (b) For any x, if x ∈ S, then {x} ∈ S. (These are two technically distinct versions of the Limited Abstraction Principle mentioned supra. Both versions do two important things. First, they establish that the empty set exists. Second, they define and validate the set-theoretic method of transfinite induction and, via this method, establish the existence of a denumerably infinite set S whose members are Ø, {Ø}, {{Ø}}, {{{Ø}}}, ….95 Whereupon if, in this set, Ø is taken to be 1 and, for any x, {x} equals (x + 1), then S becomes the ordered set of all positive integers (which happens to be very close to the way Peano’s Postulates96 generate the integers).)
Ax. 4: The union of a set of sets is itself a set. (This serves as a technical definition of ‘union,’ from which ‘intersection,’ ‘Cartesian Product,’ 97 etc. can be derived by logical manipulation (rather the way you can define the logical connective ‘and’ wholly in terms of ‘not-’ and ‘or’).)
Ax. 5: The famous Power Set Axiom: For any set S, there exists the Power Set P(S) of S. (This one establishes the infinite hierarchy of infinite sets. Recall from §7b ff. that all set theory is trivial in the case of finite sets, w/ ‘trivial’ meaning you can check the veracity of any set-theoretic proposition just by looking at the members of the relevant sets. The whole point of these axioms is to be able to prove theorems that are trans-experiential, 100% abstract—just like ∞ itself.)
Ax. 6: The famous and infamous Axiom of Choice. In the nomenclature of set theory, the A.C. is: ‘If S is a set of pair-wise disjoint nonempty sets, the Cartesian Product of the members of S 98 is not empty; every member of this Cartesian Product is designated a selection set of S ’. In regular English, it’s that from any S you can construct a subset S′ with a particular property even if you can’t specify a procedure for choosing the individual members of S′. (Zermelo came up with the Axiom of Choice in the early 1900s. It’s way too technical to try to unpack here,99 but one important consequence of the A.C. is the well-ordering principle, viz. that any subset S′ of any set S can be chosen and arranged in such a way that S′ has a first member. We’ve already seen the importance of this principle in 1-1C demonstrations, e.g. the very first one about {all integers} and {all positive integers} having the same cardinality. The w.o.p. is also crucial for Cantor’s proofs that c > and P(I) > I, since these proofs’ various arrays all obviously had to have a first element. But the Axiom of Choice was also horribly controversial (for one thing, you can understand why Intuitionists and Constructivists hated the idea that you could designate a subset without any kind of procedure for picking its members), and it remained one of the great vexed questions of set theory until (1) K. Gödel in 1940 proved the A.C.’s logical Consistency with set theory’s other axioms, and then (2) Prof. P. Cohen100 in 1963 proved the A.C.’s logical Independence from (i.e., its negation’s Consistency with) set theory’s other axioms, which proofs together pretty much settled the Axiom’s hash.101)
Ax. 7: This one’s usually known as the Axiom of Regularity; it too has several versions. The simplest one is that whether x is an object or a set, A racier formulation is: ‘Every nonempty set S contains a member x such that S and x have no common member.’102 (The Axiom of Regularity sort of encapsulates Poincaré’s and Russell’s objections to self-reference; or at any rate it’s this axiom that heads off Russell’s Antinomy. It also bars formulations like ‘the set of all sets’ and ‘the set of all ordinal numbers’ and so avoids Cantor’s Paradox and the soon-to-be-explained Burali-Forti Paradox. Notice it also disallows Cantor’s ϕ-based proof of P(A) > A in §7e. This is why there’s the whole separate Power Set Axiom above, from which P(A) > A can be derived without any sort of proof that violates the Axiom of Regularity. But please be informed that even with the A.R., axiomatics like ZFS can still be prone to certain model-theoretic paradoxes,103 so that as of say 2000 C.E. there’s now a whole hierarchy of axiomatizations for set theory, each with its own special immunity to paradoxes, known in the trade as Consistency strength. If you’re interested—and because if nothing else their names are fun—today’s main systems, listed in order of increasing Consistency strength, are: Peano’s Postulates, Analytic, ZFS, Mahlo, VNB, Quinian, Weakly Compact, Hyper-Mahlo, Ineffable, Ramsey, Supercompact, and n-Huge.)
END S-I. §7f.
§7g. You’ve doubtless noticed that it’s been a while since G. Cantor même has been mentioned and have maybe wondered where he is in all §7f’s foundational roil. Poincaré’s and Russell’s prophylactics, Zermelo’s axiomatizations, etc. are all around the early 1900s, by which time Cantor’s best work is behind him and he’s mostly abandoned math for the obsessive preoccupations that consumed his later years.104 It’s also now that he’s in and out of hospitals all the time. The poignant irony is that it’s just when Cantor’s work is gaining wide acceptance and set theory is inflorescing throughout math and logic that his illness gets really bad, and there are all sorts of special conferences and awards presentations he can’t go to because he’s too sick.
More directly apropos is that even when Cantor first happened on his paradoxes in (probably) the 1880s, he didn’t worry too much about them, or rather couldn’t, because he had more pressing problems. As in mathematically. Chief among them is what’s now known as the Continuum Hypothesis.105 The C.H. gets characterized in all kinds of different ways—‘Is the power of the Continuum equivalent to that of the second number class?’; ‘Do the real numbers constitute the Power Set of the rational numbers?’; ‘Is c the same as ‘Does
—but here’s the nub. Cantor has already proved that there’s an infinite hierarchy of infinite sets and their Power Sets, and he’s proved that P(A) = 2A and 2A > A are theorems for infinite sets. But he hasn’t yet proved just how these different results are connected. The central question is whether the 2A > A thing constitutes an exhaustive law for how the transfinite hierarchy is arranged—that is, whether for any infinite set A the next larger set is always 2A, with no intermediate ∞s between them—and thus whether this process of ‘binary exponentiation’ is the way you get from one infinite set to the next, just the way addition lets you get from one integer to the next. A yes to this long question is the Continuum Hypothesis. What’s now regarded as the general form of the C.H. is
106 but Cantor’s original version is more specific. We know that he’s proved the existence and cardinalities of two distinct infinite sets, namely the set of all integers/rationals/algebraics
and that of all reals/transcendentals/continuous intervals and spaces (=c); and he’s proved that c >
His own Continuum Hypothesis is that c =
i.e. that c is actually
the very next infinite set after
with nothing in between.107
Cantor’s attempts to prove the C.H. went on through the 1880s and ’90s, and there are some heartbreaking letters to Dedekind in which he’d excitedly announce a proof and then a couple days later discover an error and have to retract it. He never did prove or disprove it, and some pop-type historians think the C.H. is what really sent Cantor over the edge for good.
Mathematically speaking, the truth about the Continuum Hypothesis is more complicated than pop writers let on, because Cantor really comes upon the C.H.’s various problems through his work on ordinal numbers, which numbers’ relations are rather more like the ‘R = (P′, P″, P″′, …)’ thing of §7b, and which despite our best intentions we now have to sketch very briefly.108 First, to save time, please recall or review §5e(1) FN 78’s primer on ordinal v. cardinal integers. We’re now concerned with ordinal numbers in set theory, which are a little different, and involve the concept of sets’ order-types. Simple explanation: We know that if sets A and B have the same cardinal number, they are 1-1C-able. If this 1-1C can be carried out in such a way that the order of the members of A and B remains unchanged, then A and B are the same order-type. (A straightforward example of two sets with the same cardinality but different order-types was {all positive integers} and {all integers} in §7c. Remember that we had to tinker with the latter set’s order so that it would have a first member to match up with the set of positives’ 1.)
You can see why this is going to be more complicated than the cardinals: we’re now concerned not just with a set’s number of members but with the way in which they’re arranged. Or rather ways, because the possible permutations of these arrangements form a good part of the ordinal theory’s meat. Which meat we will now look at, though you should be aware that there are a great many technical terms and distinctions—‘ordered,’ ‘well-ordered,’ ‘partially ordered,’ ‘everywhere-’ v. ‘nowhere dense,’ ‘relation number,’ ‘enumeration theorem,’ and so on—that we are going to mostly blow off.109 Some basic facts: For finite sets, cardinality = order-type; that is, two finite sets with the same cardinal number will automatically have the same order-type. This is because there’s exactly one order-type for all sets with one member, one order-type for all sets with two members, and so on.110 The total number of possible order-types for finite sets is, in fact, the same as the cardinal number of the set of positive integers, namely It’s with infinite sets that order-types get complicated. Which should be unsurprising. Take the prenominate denumerably infinite set of all positive integers: {1, 2, 3, 4, …} has more than one order-type. This doesn’t mean just switching certain chunks of numbers in the infinite sequence around, since the set will still be 1-1C-able with the original set of positive integers, even if the correspondence is something like
But if you take one of the set of integers’ members and put it last—as in {1, 3, 4, 5, 6, 7, …, 2} you now have a totally different order-type. The set {1, 3, 4, 5, 6, 7, …, 2} is no longer 1-1C-able with a regularly ordered set that has no last member and so gives you no way to arrive at anything to match up with the 2. Plus observe that in the new order-type, 2 becomes a different ordinal number: it is no longer the 2nd member of the set but rather now the last member, and it has no specific number immediately before it. Hence the comprehensive def. of ordinal number: It’s a number that identifies where a certain member of a set appears in a certain order-type.111
In Cantorian set theory there are two main rules for generating ordinal numbers. (1) Given any ordinal number n, you can always derive the next ordinal, which is n + 1. (2) Given any set N of ordinal n’s ordered in an increasing sequence (e.g., the set of positive integers), you can always derive a last ordinal that’s bigger than all the other n’s. This final ordinal technically functions as the limit of N’s sequence and can be written ‘Lim(N)’.112 These rules don’t look too bad, but things start to get tricky when we consider not just sets of ordinal numbers but ordinal numbers as sets—which we can do because a basic tenet of set theory is that all math entities can be represented as sets (e.g., the transfinite cardinal is the set of cardinal numbers {1, 2, 3, 4, …}; plus recall §2a’s ante rem thing about ‘5’ literally being the set of all quintuples). So but then just what set is some ordinal number n? The answer is Cantor’s third big rule: for any ordinal n, n = the set of all ordinals less than n ; i.e., n is identified with just that set of ordinals of which it is the limit.113 Or, in formal terms,114 n = {(∀x)x < n}. You can generate the whole sequence of regular integers (as either cardinals or ordinals) this way: 0 = {(∀x)x < 0} = Ø; 1 = {(∀x)x < 1} = {0}; 2 = {(∀x)x < 2} = {0, 1}; and so on. The ordinal number of the whole denumerably infinite set {0, 1, 2, 3, 4, …} gets symbolized by the little omega ‘ω’. This transfinite ordinal is the limit of the set’s members’ sequence—that is, it’s the very smallest number bigger than all finite integers. Another, more common way to describe ω is that it’s the ordinal number of that set of which
is the cardinal number.115
IYI INTERPOLATION
However hard the last ¶ seemed, most everything beyond that in the theory of ordinals is so brutally abstruse and technical that we can only make some general observations. One is that the arithmetic of transfinite ordinals is different from but no less weird than that of transfinite cardinals—for example, (1 + ω) = ω, but (ω + 1) > ω because by definition (ω + 1) is the very next ordinal after ω. Another is that, just as with the cardinal an infinite hierarchy of transfinite ordinals of infinite sets of ordinals is generatable (you might want to read that last clause over), though in this case it’s a very different process from the
thing. The transfinite-ordinal hierarchy is associated both with abstract entities called epsilon numbers and with an arithmetical operation called tetration. We’re not getting near the former except to say that they’re essentially a class of numbers such that ωε = ε116; but tetration is simpler, and you might already be familiar with it from, say, field theory or combinatorics if you had a lot of college math. It’s basically exponentiation on acid. The 4th tetration of 3 is written ‘43’ and means 3(3(33)), which = 3(39), which = 319,683, which you are hereby dared to try to calculate. The technical connection between tetration, transfinite ordinals, and epsilon numbers is the fact that ε0 =ωω, which isn’t all that important. But if you can conceive, abstractly, of a progression like ω, ((ω + 1), (ω + 2), …, (ω + ω)), ω2, ωω, ωω, ωωω, …, then you can get an idea—or at any rate an ‘idea’—of the hierarchy and the unthinkable heights of ordinal numbers of infinite sets of infinite sets of the ordinals of infinite sets it involves. End general obs.
END IYI I.
All right, so the specific way that Cantor runs up against the Continuum Hypothesis concerns ordinals and order-types. We’ve seen that there’s more than one order-type for infinite sets, as with {1, 2, 3, 4, …} v. {1, 3, 4, …, 2} a few ¶s back. There are, in fact, a ∞ of different order-types for any infinite set; and what Cantor proves117 is that the set of possible order-types for a denumerably infinite set is itself non-denumerable. This means that there’s yet another distinct way to generate an infinite hierarchy of infinite sets—if S is some denumerably infinite set, then Z is the nondenumerably infinite set of possible order-types of S, and Z will be the set of possible order-types of Z, and …, and away we go. (Actually, to call the different processes for deriving ∞-hierarchies ‘distinct’ is a little misleading, because in truth they’re related in all kinds of ways. The math of these relations is beyond our technical scope here, but you can get at least some notion of the connections from the technical definition Cantor gives of set Z (keeping in mind that ‘number class’ really refers to sets of ordinals), viz.: “The second number class Z is the entirety {α} of all the order-types α of well-ordered sets of the cardinality
It doesn’t have to get that deep, though. Leaving transfinite ordinals like ω out of it, we can still see a marked and surely not coincidental similarity among (1) c as the set of all real numbers (v. raionals); (2)
as the Power Set of
i.e. as
(3) Z as the set of all order-types of
The real problem is that Cantor can’t prove a certain crucial connection between these three identities. You’ll recall from a couple pages back that Cantor’s original C.H. is that (1) and (2) are the same, that
and that there’s no kind of intermediate-size ∞ between
and c. We are now set up to understand at least roughly how relation (3) is involved here. In the later §s of “Contributions … Numbers”—through a process of profoundly, unsummarizably technical reasoning—Cantor is able to deduce two big things: (a) that there is no way that c is >
and (b) that if there does exist any infinite set that’s greater than
but smaller than c, this set has got to be the nondenumerable set Z, a.k.a. the second number class. It is big thing (b) that informs his main attack on the C.H., which consists in an attempt to show that relations (2) and (3) above are actually the same—that is, if Cantor can prove that Z =
then by (b) it will be provable that there exists no intermediate set between
and c, which will entail that c =
It’s specifically this Z =
that he couldn’t prove. Ever. Despite years of unimaginable noodling. Whether it’s what unhinged him or not is an unanswerable question, but it is true that his inability to prove the C.H. caused Cantor pain for the rest of his life; he considered it his great failure. This too, in hindsight, is sad, because professional mathematicians now know exactly why G. Cantor could neither prove nor disprove the C.H. The reasons are deep and important and go corrosively to the root of axiomatic set theory’s formal Consistency, in rather the same way that K. Gödel’s Incompleteness proofs deracinate all math as a formal system. Once again, the issues here can be only sketched or synopsized (although this time Gödel is directly involved, so the whole thing is probably fleshed out in the Great Discoveries Series’ Gödel booklet).
The Continuum Hypothesis and the aforementioned Axiom of Choice are the two great besetting problems of early set theory. Particularly respecting the former, it’s important to distinguish between two different questions. One, which is metaphysical, is whether the C.H. is true or false. The other is whether the C.H. can be formally proved or disproved from the axioms of standard set theory.118 It’s the second question that has been definitively answered, over a period of several decades, by K. Gödel and P. Cohen, to wit:
1938—Gödel formally proves that the general form of the Continuum Hypothesis is Consistent with the axioms of ZFS—i.e., that if the C.H. is treated as its own axiom and added to those of set theory no logical contradiction can possibly result.
1963—In one of those out-of-nowhere coups d’éclats that pop scholars and moviemakers love, a young Stanford prof. named Paul J. Cohen proves that the negation of the general C.H. can also be added to ZFS without contradiction.119
These two results together establish what’s now known as the Independence of the Continuum Hypothesis, meaning that the C.H. occupies a place rather like the Parallel Axiom’s120 w/r/t the rest of Euclidean geometry: it can be neither proved nor disproved from set theory’s standard axioms.121 Plus you’ll recall from the previous § that Gödel and Cohen are able to derive pretty much the same results for the Axiom of Choice so vital to Cantor’s various Diagonal proofs—Gödel proving that the Axiom isn’t disprovable in ZFS/VNB, Cohen that it isn’t provable in ZFS/VNB.122 There are, as was mentioned, alternative axiom systems in which the C.H. and A.C. are provable/disprovable (e.g., Quinian set theory is set up in such a way that the Axiom of Choice is prima facie contradictory), although many of these Consistency-enhanced systems use ‘set’ in ways that are awfully different from Cantor et al.’s original definitions.
The Continuum Hypothesis remains alive in other ways. It is, for instance, the motive cause behind several different theoretical axiomatizations and extensions of set-theoretic models in which the C.H. and various equivalents are assumed123 to be proved or disproved. These speculative systems are among the most hyperabstract constructs in modern math, involving rarefied terms like ‘Cantorian’ v. ‘1st-Order Universes,’ ‘constructible’ v. ‘nonconstructible sets,’ ‘measurable cardinals,’ ‘inaccessible ordinals,’ ‘transfinite recursion,’ ‘supercompletion,’ and many others that are fun to say even if one has no clear idea what they’re supposed to denote.124 By way of closure, the more important thing for us to consider is how the C.H.’s unprovability bears on the other big question—whether the Hypothesis is in fact true. There are, not surprisingly, n different possible views on this. One kind of Formalist take is that various axiomatizations have various strengths and weaknesses, that the C.H. will be provable/disprovable in some and Undecidable in others, and that which system you adopt will depend on what your particular purpose is. Another, more strictly Hilbertian response will be that ‘true’ in this context can’t really mean anything except ‘provable in ZFS,’ and thus that the C.H.’s logical Independence from ZFS means it’s literally neither true nor false.125 A pure Intuitionist is apt to see the whole mess of paradox and unprovability in set theory as the natural consequence of allowing fuzzy and unconstructive concepts like sets, subsets, ordinals, and of course actual-type ∞ into math.126
But it is the mathematical Platonists (sometimes a.k.a. Realists, Cantorians, and/or Transfinitists) who are most upset by the C.H.’s Undecidability—which is interesting, since the two most famous modern Platonists are G. Cantor and K. Gödel, who together are at least two-thirds responsible for the whole nonplus. The Platonic position here is nicely summarized by Gödel, writing about his own and Cohen’s proofs of the C.H.’s Independence:
Only someone who (like the Intuitionist) denies that the concepts and axioms of classical set theory have any meaning could be satisfied with such a solution, not someone who believes them to describe some well-determined reality. For in reality Cantor’s conjecture must be either true or false, and its undecidability from the axioms as known today can only mean that these axioms do not contain a complete description of reality.
That is, for a mathematical Platonist, what the C.H. proofs really show is that set theory needs to find a better set of core axioms than classical ZFS, or at least it will need to add some further postulates that are—like the Axiom of Choice—both “self-evident” and Consistent with classical axioms. If you’re interested, Gödel’s own personal view was that the Continuum Hypothesis is false, that there are actually a whole ∞ of Zeno-type ∞s nested between and c, and that sooner or later a principle would be found that proved this. As of now no such principle’s ever been found. Gödel and Cantor both died in confinement,127,128 bequeathing a world with no finite circumference. One that spins, now, in a new kind of all-formal Void. Mathematics continues to get out of bed.
1 IYI More specifically in indeterminate equations and ternary forms, both of which are algebraic number-theory topics that Kronecker’s interested in. (Have we mentioned that Kronecker was Cantor’s dissertation advisor? and that it’s SOP for young mathematicians to work on mentors’ problems—q.v. Cantor also taking over Heine’s proof just below?)
2 IYI The German academic system of the 1800s is pretty much unparsable.
3 IYI = a particular kind of partial differential equation that you might remember from sophomore math, probably in the context of Green’s Theorem.
4 which, you might recall from E.G.II’s thing on —Uniform Convergence & Associated Arcana, means that the f(x) the series represents (=sums to) must be continuous. (IYI N.B.: Heine’s real proof requires that the series and function be ‘almost everywhere’ uniformly convergent and continuous, which involves distinctions so fine that we can ignore them without distortion.)
5 IYI i.e., the same paper in which he lays out his theory of real numbers in §6e (which paper’s title will probably now make more sense).
6 IYI Q.v. E.G.II’s —Uniform Convergence & Associated Arcana item (d) for exceptional points, which again please recall can also be called ‘discontinuities’. (N.B.: Some math classes also use singularity to mean exceptional point, which is both confusing and intriguing since the term also refers to Black Holes, which in a sense is what discontinuities are.)
7 It is important throughout this section to remember that as far as we’re concerned these are the same thing.
8 IYI q.v. §5e, p. 188’s text and FN 72.
9 C.f. Cantor, right at the start of “On the Extension … Series”: “To every number there corresponds a definite point of the line, whose coordinate is equal to that number.”
10 Have we inserted yet anyplace that these strange human-profile brackets, ‘{ },’ are what you put around things to show they compose a mathematical set?
11 i.e., that is not integrable term by term such that the result is convergent.
12 IYI Evidently here’s where Kronecker was especially helpful.
13 And clearly, if any series-representation ends up being provably identical to the f(x)’s original representative series, then that original series is the function’s Unique representation. This is basically how the Uniqueness Theorem works—it’s not that there’s only one series that can represent a given function, but rather that all series that do represent it are provably equivalent.
14 IYI If the following couple ¶s seem brutal, please don’t lose heart. The truth is that Cantor’s route to the theory of ∞ is a lot harder, mathwise, than the theory itself. All you really need to get is a rough sense of how the Uniqueness Theorem leads Cantor into transfinite math. The brutal part will be over quickly.
15 See for instance J. W. Dauben on the ’72 proof: “Cantor stressed, however, that the numbers in these various [derived-set] domains remained entirely independent of this geometric identification, and the isomorphism served, really, as an aid in thinking about the numbers themselves.” You’ll notice that this attitude is more or less identical to Dedekind’s in “Continuity and Irrational Numbers”.
16 Preliminary tidying: We need to draw a distinction between two different kinds of set theory you might know about. What’s called point-set theory involves sets whose members are numbers, spatial or R.L.-points, or various groups/systems of these. Point-set theory is today a big deal in, e.g., function theory and analytic topology. Abstract set theory, on the other hand, is so named because the nature and/or members of its sets isn’t specified. Meaning it concerns sets of pretty much anything at all; it’s totally general and nonspecific—hence the ‘abstract’.
From here on out, ‘set theory’ will refer to abstract set theory unless otherwise specified.
What’s complicated is that G. Cantor, whose real fame is as the author of abstract set theory, obviously started out in (and basically invented) the point-set kind. It’s really not until the 1890s that he provides the definition of set that now characterizes abstract set theory—“A collection into a whole of definite, distinct objects of our intuition or thought, [which] are called the members of the set”—but all his significant results in the ’70s and ’80s on point sets also apply to abstract s.t. Plus finally please note, anent Cantor’s above definition v. the gloss we saw in §6f, that his ‘definite’ means that for some set S and any object x, it’s at least in principle possible to determine whether x is a member of S (if you’ve had much logic, you might recall that this feature of formal systems is called Decidability). Whereas his definition’s ‘distinct’ means that for any two members x and y of S, x ≠ y, which serves formally to distinguish a set from a sequence, since in sequences the same term can show up over and over. Tidying complete.
17 IYI It goes without saying that this is all very condensed—it’s not like there was some single epiphanic moment at his desk when these things occurred to him.
18 IYI First-rate scholarly books in English include A. Abian’s The Theory of Sets and Transfinite Arithmetic, M. Hallett’s Cantorian Set Theory and Limitation of Size, and J. W. Dauben’s Georg Cantor: His Mathematics and Philosophy of the Infinite, all of which are listed in the Bibliography. The caveat, though, is that ‘scholarly’ here means pitilessly dense and technical. Dauben’s book in particular requires such a strong pure-math background that it’s hard to imagine any reader who’s able to enjoy it wasting her time on the present booklet … which renders this whole FN self-nullifying in a sort of interesting way.
19 IYI the last usually in connection with the extensional logic of G. Boole (1815–1864), whom it’s a shame we’re not going to talk about.
20 IYI It’s more nearly accurate to say that the bulk of his original work gets done 1874–84, with the subsequent decade’s papers being mostly expansions and revisions of previous proofs, as well as responses to other mathematicians’ objections.
21 Please note also that as he was inventing the stuff Cantor did not proceed axiomatically; he based most of it on what he called “pictures,” or rough concepts. Plus, despite the heavy symbolism, most of his actual arguments were in natural language—and not exactly crisp clear Russell-caliber language, either. The point being that Cantor’s original work is quite a bit fuzzier and more complicated than the transfinite math we’re going to discuss, much of which latter uses axiomatizations and codifications supplied by post-Cantor set theorists like E. Zermelo, A. A. Fraenkel, and T. Skolem (not that the names matter much yet in this §).
22 IYI although obviously everything is ultimately findable in the Cantor publications listed in the Bibliography.
23 IYI especially if you’re the right age to have been subjected to the New Math.
24 IYI FN 14’s apologies and reassurances pertain to the following text ¶, too.
25 IYI = Cantor’s term, which apparently in German doesn’t have the postmortem connotations it does for us.
26 IYI Cantor’s proof that P′ = Q ∪ R is emetically complex but wholly valid; please take it on faith.
27 IYI Notice how (1)’s got ellipses outside the right paren, too, meaning the sequence continues beyond the finite-superscripted progression. (2)’s ellipses are 100% intraparenthetical because n itself is infinite. Make sense?
28 The following actual notation is of course IYI—but if nothing else it’s quite pretty.
29 IYI In truth there’s a much better analogy for transfinite numbers than the integers, viz. some other kind of number that can’t actually be named or counted but can nevertheless be abstractly generated—by, say, drawing the diagonal of the Unit Square, or taking the square root of 5, or describing a particular Dedekindian A and B and interstitial schnitt. Regarding all of which please see or await the discussion two text ¶s down.
30 IYI That the aleph is a Hebrew letter is sometimes made much of by historians w/r/t Cantor’s ethnicity or kabbalistic leanings. More plausible explanations for Cantor’s choice of are that (1) he wanted a whole new symbol for a whole new kind of number, and/or (2) all the good Greek letters were already taken.
31 IYI re which proofs are on the way in §7c ff.
32 IYI mathematically speaking.
33 He’s talking here about Dedekind’s schnitt method, which after he got going on ∞ Cantor preferred over his own approach, for rather obvious reasons.
34 Because of space considerations we’re not going to harp too much on this, but let’s emphasize once more here that G. Cantor is, like R. Dedekind, a mathematical Platonist; i.e., he believes that both infinite sets and transfinite numbers really exist, as in metaphysically, and that they are “reflected” in actual real-world infinities, although his theory of these latter involve Leibnizian monads and is best steered clear of. As it happens, Cantor develops all sorts of theological positions and arguments respecting ∞, too, some of which are cogent and powerful and others merely eccentric. Still, as a mathematician and rhetor Cantor is smart enough to argue that one doesn’t need to accept any particular metaphysical premises in order to admit infinite sets or their abstract numbers into the domain of legit mathematical concepts. See e.g. this passage from Cantor’s prenominate “Contributions …”:
In particular, in introducing new numbers, mathematics is only obliged to give definitions of them, by which such a definiteness and, circumstances permitting, such a relation to the older numbers are conferred upon them that in given cases they can definitely be distinguished from one another. As soon as a number satisfies all these conditions, it can and must be regarded as existent and real in mathematics. Here I perceive the reason why one has to regard the rational, irrational, and complex numbers as being just as thoroughly existent as the finite positive integers.
That last and clearest sentence is a tiny blown kiss to L. Kronecker. The rest is obscure enough that J. Dauben gives it the following gloss: “For mathematicians, only one test was necessary: once the elements of any mathematical theory were seen to be consistent, then they were mathematically acceptable. Nothing more was required.”
35 IYI Here the best supporting scholium is from Cantor’s follower A. A. Fraenkel: “The concept of transfinite magnitude is insignificant so long as only one such magnitude was known to exist.”
36 IYI Really more like ‘in conjunction’ than ‘simultaneously,’ since the two projects end up being connected in all sorts of high-level ways.
37 IYI If you’ve ever run across references to Cantor’s transfinite cardinals, this is what they are—the cardinal numbers of infinite sets.
38 IYI Galileo’s Paradox rests squarely if covertly on this def. See, for instance, §1d p. 39’s “It’s also obvious that while every perfect square (viz. 1, 4, 9, 16, 25, …) is an integer, not every integer is a perfect square.”
39 IYI ‘Here’ = mostly two seminal articles in 1874 and ’78, though he also spends a lot of time fleshing the idea out in his later, more discursive papers. If you’d like the title of the ‘74 monograph, it’s “Über eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen.” This translates to something like “On a Characteristic of the Set of All Real Algebraic Numbers,” regarding which feel free to see FN 53 below.
40 IYI This is, in point of fact, what counting any collection of n things is: it’s putting the things in a one-to-one correspondence with the set of integers {1, 2, 3, …, n}. The equivalence of counting and 1-1C-ing-with-{all integers} is what made set theory the basis for teaching little kids arithmetic in the New Math.
41 IYI N.B. that in Cantorian set theory denumerable is related to but not synonymous with countable. Def.: A set is countable if and only if it is either (a) finite or (b) denumerable.
42 This traditionally gets called either aleph-null or aleph-nought.
43 IYI Order starts mattering only with Cantor’s transfinite ordinals, which enter play in §7g.
44 IYI Notice that we’re starting to be able to answer Russell’s epigraphic query at the start of §7.
45 Command Decision: From here on out, when we talk about ‘all numbers’ we’re going to deal only with positive values. This includes the integers, since after all we’ve just proved that the set of all integers’ cardinality = that of the set of all positive integers. Plus it ought to be obvious that Cantor’s proofs for all positive rationals, all positive reals, etc. will still be valid if the relevant infinite sets are doubled to comprise negative values. If you’re dubious, then observe that doubling something is the same as multiplying it by 2, and that 2 is finite, and that—by transfinite theorems (3) and (6) in §7b—any times a finite n will still =
46 IYI Again, please be advised that we’re doing these proofs in the order that yields the clearest and most logical exposition. N.B. also that there are also some cardinals-v.-ordinals distinctions that for now we’re going to fudge.
47 IYI What Cantor really calls ‘Diagonalization’ is his method for proving the nondenumerability of the reals, which as we’ll see is quite different.
48 IYI If this schematic def. seems to you insufficient and you want an actual 1-1C between the sets of rationals and integers, then simply take the prenominate ordered row and match its first member with 1, its second member with 2, … and so on and so forth.
49 IYI You may well be able to anticipate some familiar complications here w/r/t what can possibly be the sequence’s very first real number, and to see why none of the previous kinds of fiddling with the row-orders will work with the real numbers. In which case, for your own peace of mind, be now advised that Prof. E. Zermelo’s famous set-theoretic Axiom of Choice (which is its own briarthicket—see §7f) ensures that we can always construct an ordered set of real numbers in such a way that it’s got a bona fide first member. It would be in your interest to just swallow this for now.
50 So the reason for the ‘Diagonal’ thing is the first-digit-in-first-row, second-digit-in-second-row, 45°-angle construction of R.
51 IYI Q.v. the beginning of §5e.
52 IYI The proof of this weird factoid, which was way back on §2e’s pp. 89–90, can of course now be freed from the requirement that all the hankies and half-hankies and half-half-hankies be infinitesimally small—we merely invoke §5e(1)’s Weierstrassian proof that
53 IYI Historically speaking, the earliest nondenumerability result Cantor could actually prove was that the set of all transcendentals was nondenumerable and that the total set of all rationals + all algebraic irrationals had the same cardinality as the rationals. Q.v. here FN 39 supra and the title of Cantor’s ’74 paper, which should now make more sense.
54 IYI = “I see it, but I don’t believe it” (which is a pithy if unintentional bit of anti-empiricism). Just why he says this in French to a fellow German isn’t clear—it seems to have been a way to emphasize emotion. Cantor’s scholarly German, too, often switches into French or Greek for no discernible reason. Perhaps this was SOP.
55 This uniqueness is critical. You can’t allow two different decimal ways to represent the same point, because the whole idea is to see whether to each particular point in the U. Square there corresponds a particular point in the R.L.’s [0,1]. It should now be 100% clear why Cantor needs to stipulate that numbers like are to be designated only by .4999 … and so on. If you remember the mention of this stipulation in §7c, be now apprised that it was this Dimension Proof w/r/t which Dedekind suggested it to Cantor, pointing out the “unique mapping” (which was D. & C.’s original term for 1-1C, in letters) would be screwed up if both .4999 … and .5000 … were allowed.
56 INTERPOLATION THAT IS SO IYI IT’S NOT EVEN IN THE MAIN TEXT Technically it’s a little more complicated than that. But not a lot. Cantor’s original Dimension Proof is sort of unnecessarily recondite. It involves defining the decimal representation of point (x, y) as the convergent series and then “pulling the terms of [this series] apart” in such a way as to form for each member a sequence of “ρ independent variables” in [0,1], these latter designated α1, α2, α3, α4, …, αρ. The “pulling apart” and subsequent mapping (as well as the reverse, so that the α↔β correspondence works both ways) is accomplished via four equations, of which the first looks like: ‘α1,n = β(n−1)ρ+1’. Which may not make the 1-1C here terribly obvious. The real proof’s rub (as explained in 1979 by the redoubtable Dr. G.) is that the ‘a1a2a3/b1b2b3’ description of the planar coordinates (x, y) we gave above is a bit too simple in that it makes it look like a and b are individual digits. What Cantor’s technique really does is break x and y up into little chunks of post-decimal digits; the rule is that each chunk terminates at the very first nonzero digit you hit (which is another, more technical reason why Cantor can’t have integers and rationals ending in .0000 ….) So say for example that x = 0.020093089 … and that y = 0.702064101 …, in which case they get broken down into:
x = 0.02 009 3 08 9 …
y = 0.7 02 06 4 1 01 …
It’s these chunks that get combined tit for tat to compose point (x, y)’s unique decimal representation: (x, y) = 0.02 7 009 02 3 06 08 4 …. And here the extra spaces are just illustrative aids; the actual decimal rep. of (x, y) is 0.02700902306084 …. This is, of course, also a real number, namely the [0,1] point z that equals 0.02700902306084 …. And what’s ingenious about the chunks-of-digits device is that if there’s a z′ that differs from z by even one digit lying n places out past the decimal (which of course there will be—a ∞ of such z’s, in fact), then the relevant (x, y) will also be different in the chunk comprising that nth digit, so the relevant correspondence is biunique, meaning for each z there’s a unique (x, y) and vice versa, which means it’s a genuine 1-1C.
57 IYI (inserted at editor’s insistence) = the prurient term for polyhedra in 4+ dimensions.
58 IYI as mentioned in §§ 5b and 5d. (And in the latter §, Riemannian geometry’s use of ∞ and ∞-related points was at least touched on.)
59 This gets into a pretty specialized area of function theory, but in essence noncontinuous mapping means that if you travel continuously along the points in [0,1] on the R.L., the corresponding points on the Unit Square won’t form a continuous curve but will be spread out patternlessly all over the place. (IYI With respect to the second ¶ of §7d above, it turns out that Riemann et al.’s assumption was wrong in an interesting way: the dimension of a given set of points depends not only on how many coordinates per point there are, nor even on the cardinality of the total point-set, but also on the particular way the points are distributed. This latter is an issue in point-set topology, regarding which all we’re in a position to say in this booklet is that it is yet another branch of math that wouldn’t exist without Cantor’s work on ∞.)
60 IYI date = 1878; title = “Ein Beitrag zur Mannigfaltigkeitslehre,” or roughly “A Contribution to the Theory of Manifolds/Aggregates/Sets.”
61 IYI It’s probably obvious that D. B.-R.’s specific “fictions” here are the composite decimal reps. for (x, y); but in the context of the whole review he’s also talking about the infinite sets of R.L.- and U.S.-points the decimals are mapping. (N.B. that exactly the same charge could have been leveled against Dedekind’s schnitt theory’s A and B—for some reason Dedekind never drew the same kind of fire Cantor did.)
62 Cantor often refers to these as the first number class and second number class, respectively.
63 IYI Textbooks often state this as an abstract theorem, like ‘Given any infinite set S, it is possible to construct another infinite set S′ with a greater cardinal number’.
64 IYI This principle is known in set theory as the Power Set Axiom. One reason it’s an Axiom is that it drops right out of the definitions of ‘subset’ and ‘empty set,’ as will be evident just below in the main text. There are problems with the P.S.A., though—q.v. the interpolative §7f below.
65 IYI Technically, this ought to be written ‘P(A) = 2Ā,’ where ‘Ā’ stands for the cardinal number of A. Having stated this for the record, we’ll just write it informally from now on.
66 IYI Again, technically it would be better to say that ‘Ø’ is the symbol for { }, which latter is the empty set … but you get the idea.
67 IYI You can prove P(A) = 2A for the empty set too. If A = Ø, it has 0 members. It does, however, have a subset—viz. Ø, since the empty set is a subset of every set. So here P(A) = 1, which is 20.
68 This will make sense if you remember that the overall context of these proofs is Cantor’s attempt to derive infinite sets (a.k.a. number classes) whose cardinality exceeds c.
69 IYI In point of fact, 1891’s is really a proof that this Power Set is uncountable. But recall that a set’s being countable requires its being either finite or denumerable, and it’s easy to show that the relevant set here isn’t finite. Since the set of all integers {1, 2, 3, 4, 5, …} is itself infinite, we have only to take each individual member, put braces around it, and realize that {1}, {2}, {3}, etc. are each subsets of the set of all integers. So there’s no way the set of all such subsets can be finite. So the real issue is whether the Power’s Set’s denumerable.
70 IYI FROM SERIES EDITOR’S LETTER OF QUERIES ON MS. VERSION OF BOOKLET: “p. 272, paragraph following graphic of ‘Array #3’: so, in other words, no matter how many subsets of I we come up with, we can always create new ones? If so, do you want to say something like that, just to spell it out?” FROM TESTY AUTH. REPLY: “No we do not want to say something like that, because it’s wrong. What Array #3 shows is that no matter how infinitely or ∞∞ly many subsets of I we list, it’s provable that there will always be some subsets that aren’t on the list. This is, recall, what ‘nondenumerable’ means: incapable of being exhaustively listed/rowed/Arrayed (and, again, it’s why there will always be more irrationals than Hankies of Death back in Dr. G.’s §2e demo—hankies, like integers and rationals, can compose only a denumerable ∞). Plus the ‘we can always create new ones’ part is deeply, seriously wrong: we’re not creating new subsets; we’re proving that there do exist and will always exist some subsets that no list or integral 1-1C can capture. W/ ‘exist,’ admittedly, requiring a wiseass ‘as it were’ or ‘whatever that means’ or something—but the reader’d have to be a radical-Shiite Kroneckerian to believe that what we’re doing in this proof is really creating these new subsets.”
71 IYI This brings up an important issue. You may well have noticed how closely the Diagonal Proof of P(I)’s nondenumerability resembles §7c’s Diagonal Proof of {all real numbers}’s nondenumerability. And now, given that both P(I)’s and {all reals}’s cardinalities are > , you may well be wondering whether P(I)’s cardinal number is c just as {all reals}’s is. In which case you have derived, on your own, a version of one of the most profound problems in Cantorian set theory, which problem gets hashed out at length in §7g. The point being that you are 100% right to be wondering about P(I)’s relation to c, but just hang on.
72 IYI meaning that B is also P(A)—but it’s easier if you forget about the whole Power Set thing for this proof.
73 IYI sometimes a.k.a. Eublides’ Paradox to distinguish it from Epimenides’ ‘All Cretans Are Liars’ variant—long story.
74 IYI 1906–1978, modern math’s absolute Prince of Darkness, referenced all the way back in §1a and elsewhere.
75 IYI We know, for example, that he told D. Hilbert about it, and it’s mentioned in at least one of Cantor’s letters to Dedekind. Note now for later that there’s also another paradox he stumbled on and also didn’t publish, which is known today as the Burali-Forti Paradox and has to do with transfinite ordinals, which as mentioned are themselves upcoming.
76 IYI The Frege-Russell thing is a long but much-loved story among math historians, very easy to find elsewhere. (N.B. that Russell’s Antinomy is just as often called Russell’s Paradox—but it gets tiresome saying ‘paradox’ over and over.)
77 IYI Once again, it’s all a bit more complicated than that. What Russell’s Antinomy really exploits is an unsound axiom in early set theory called (no kidding) the Unlimited Abstraction Principle, which in effect states that every conceivable feature/condition determines a set—i.e., that given any conceivable property, there exists a set of all entities possessing this property. Three quick remarks about the U.A.P. (1) Notice its intriguing resemblance to Plato’s One Over Many argument from §2a. (2) It ought to become evident soon in the main text why the U.A.P. is faulty and enables Russell’s Antinomy. (3) Please hold neocortically for just a few pages the fact that the Zermelo–Fraenkel–Skolem system of axioms for set theory amends the U.A.P. to the Limited Abstraction Principle, which holds that given any property p and a set S, we can form the set of all elements of S that have p.
78 IYI The remainder of this text-¶ is skippable if you can already see how the paradox works just from “Is N a normal set?” (IYI2 Russell also has a famous way to set up his Antinomy in natural language, to wit: Imagine a barber who shaves all and only those who do not shave themselves—does this barber shave himself or not?)
79 IYI In this opposition Poincaré’s often associated with the finite-point-set specialists E. Borel and L. Lebesgue, and in the metaphysics of math this trio’s sometimes known as the Anti-Platonic School.
80 very much like the Greek characterization of ∞ as to apeiron.
81 IYI Here’s one you may already have anticipated: If some quality is impredicative if it applies to itself—say e.g. the quality of being expressible in natural language, or correctly spelled, or abstract—then we can call a quality ‘predicative’ if it doesn’t apply to itself. So this quality of being predicative—is it predicative, or impredicative?
82 IYI Yes: we’re now coming back around to the abstract existence and denotation questions posed in §1.
83 IYI If you think you can see the ghost of Aristotle’s Third Man hovering around the Theory of Types, you are not mistaken. Many of the foundational problems in set theory end up looping back to Greek metaphysics.
84 IYI N.B. also that Russell’s arguments for the connection between the metaphysical Typology of entities/abstractions and the semantic Typology of entities/statements/metastatements are lengthy and complex, but they do exist—it’s not like he’s positing all this out of nowhere.
85 IYI Subsequent extensions and modifications of Russellian Type-theory, by logicians like F. P. Ramsey and A. Tarski, are so nightmarishly complicated and confusing that most mathematicians will pretend they don’t even hear you if you try to bring them up.
86 IYI meaning the second of the 10 Major Unsolved Problems that Hilbert listed at 1900’s 2nd I.C.M. as crucial for math to nail down in the upcoming century—another whole long story you can find in any good math-history survey. (IYI2 If you’ve learned/heard that there were really 23 Hilbert Problems, the truth is that Hilbert listed 1–10 in his Paris speech and 1–23 in the written version that came out in 1902.)
87 Compare this Formalist ontology to the view of Intuitionism that “[M]athematical objects are mental entities that do not exist independently of our ability to provide a proof of their existence in a finite number of steps.” You can see that the two views are not all that dissimilar, especially in their rejection of the idea that math has anything to do with extra-mental reality—although the Intuitionists’ “finite number of steps” criterion is specifically meant to outlaw entities like irrationals and infinite sets that Poincaré and Brouwer, like Kronecker before them, had metaphysical (not just procedural) problems with. (IYI Dr. G.’s way to contrast the two schools was to say that Intuitionism was sneaky whereas Formalism was more just crazy.)
88 IYI q.v. §1c.
89 These C-words should have been E.G.III’d by now. They’re model-theoretic terms from logic. A system is Complete if and only if every last true proposition can be adduced as a theorem; it’s Consistent if it doesn’t include or entail any contradictions. There’s incidentally a third, briefly aforementioned criterion called Decidability, which concerns whether there’s a procedure/algorithm for determining, for any well-formed proposition of the system, whether or not it’s true (i.e., whether it’s a theorem). The three criteria are obviously interconnected, but they’re also distinct in important ways; and a deductively immaculate formal system is supposed to satisfy all three … which Gödel basically showed no system could, which is why he’s the Dark Prince, and why pure math’s been in mid-air for the last 70 years.
90 IYI Dates: 1871–1953. Major paper: “Investigations of the Foundations of Set Theory” (1908). Main collaborator: A. Fraenkel, who is also Cantor’s first biographer.
91 which paradoxes many working mathematicians now don’t worry too much about in the course of their day-to-day work, any more than we worry about melting through the floor when we get out of bed.
92 Probably rather than ‘independent’ it would be better to say ‘conceptually prior to,’ or ‘underlying,’ mathematics per se. The idea behind A.S.T. is that since set theory is the most abstract and primitive branch of math, it serves as the foundation for math’s most basic concepts, such as ‘number,’ ‘function,’ ‘order,’ etc. Though the whole issue gets very involved—particularly the questions of set theory’s relation to symbolic logic and of which one is math’s real fundament—it’s nevertheless true that G. Frege and G. Peano, the two most important figures in the foundations of arithmetic, both define numbers and basic math operations in terms of set theory.
93 IYI = either a single-valued function or some natural-language predicate that’s meaningful for all members of S (where ‘meaningful’ basically means the predicate is something you can verify as definitely T v. F for any set-member, like ‘is blue’ or ‘weighs more than 28.7 grams’ as opposed to ‘is lovely’ or ‘tastes like chicken’).
94 Using the membership symbol in a noun phrase like this is the sexy way to say ‘member of S’.
95 So an obvious corollary to the L.A.P. is: Infinite sets exist.
96 IYI Here’s another place where it’s unclear exactly which readers will know or remember what’s being tossed off. If Peano’s Postulates are not familiar and you’d like them to be, invest 45 seconds in the following: P.’s P.s are the five basic axioms of number theory; they’re how you derive the whole infinite sequence of positive integers from just two primitive concepts, which latter are (a) ‘is an integer,’ and (b) ‘is a successor of’. In natural language, the Postulates are: (1) 1 is an integer; (2) If x is an integer, the successor of x is an integer; (3) 1 is not the successor of an integer; (4) If the successors of two numbers x and y are equal, then x = y ; (5) If a set I contains 1, and if, for any integer x in I, the successor of x is in I, then every integer is in I. Just why Postulate (5) is the axiomatic authority behind proof by mathematical induction becomes clearer in an alternative formulation, which goes more or less: (5) If P is a certain property, and if 1 has P, and if whenever an integer x has P, the successor of x has P, then all integers have P.
(IYI2 Gorisian factoid: Though Peano does deserve 100% credit for introducing all kinds of important stuff to number- and set theory (not least the standard symbols ‘∈,’ ‘∩,’ and ‘∪’), his eponymous Postulates are a clear case of capricious math-fame, since axioms equivalent to (1)–(5) appeared in Dedekind’s “Nature and Meaning of Numbers” at least two years before Peano’s own Arithmetices Principia Nova Methode Exposita came out.)
97 IYI We won’t bother much with Cartesian Products except to say (1) that they’re a specific kind of interset union involving ‘ordered pairs,’ which are a whole saga to themselves; and (2) that C.P.s instantiate the important principle of Preservation of Homogeneity, meaning that if two sets A and B both have certain special characteristics, their Cartesian Product (A × B) will also have those characteristics (like if A and B are point sets and both characterize topological spaces, their C.P. will also be a topological space).
98 IYI Here ‘Cartesian Product’ specifically means (deep inhalation) ‘the set of just those subsets of the union of all members of S such that each (=each subset) contains exactly one member of each set in S’. This sort of thing is just to let you sample the heady bouquet of real A.S.T.
99 IYI Any decent mathematical-logic or set-theory text will give you a whole chapter on the Axiom of Choice and its relation to such other high-eros concepts as Russell’s Multiplicative Axiom, Zorn’s Lemma, the Trichotomy Principle, the Hausdorff Maximal Principle, and (no kidding) the Teichmüller-Tukey Maximal Element Lemma.
100 IYI an American (!) whom we’re also about to see in action w/r/t the Continuum Hypothesis, just below.
101 IYI The proof-career of the A.C. is—surprise—a very long story; the upshot of which is captured in the following from E. Mendelson’s 1979 Introduction to Mathematical Logic (w/ the second sentence being about as heated as a logician ever gets):
The status of the Axiom of Choice has become less controversial in recent years. To most mathematicians it seems quite plausible and it has so many important applications in practically all branches of mathematics that not to accept it would seem to be a willful hobbling of the practicing mathematician.
102 IYI If you’d like to see the A.R. in 100% naked symbolism, it’s (∀S)[(S ≠ Ø) → (∃x)((x ∈ S) & (x ∩ S = Ø))], in which the only unfamiliar symbols might be the predicate-calculus quantifiers ‘∀S’ (which means ‘For all S …’) and ‘∃x’ (which means ‘There exists at least one x such that …’ (w/ ‘exists’ meaning mathematically/set-theoretically (which of course assumes that this kind of existence is distinct from some other kind(s)))).
103 These have to do mainly with how many different valid interpretations an axiom system can have (model being the uptown term for a specific interpretation of what the abstract symbols and formulae really stand for). It turns out that most reasonably Complete axiomatics have a ∞ of valid models—sometimes even a nondenumerable ∞ of them—which entails enormous headaches, since systems like ZFS or Peano’s Postulates are set up with fairly specific models in mind, and it’s not difficult to see that, with an actual ∞ of possible models, some are going to contradict the desired ones.
104 IYI His two main ones were Jesus’s real (biological) paternity and the Bacon-v.-Shakespeare question. By way of armchair psych, both these issues concern not just factual accuracy but the denial of credit to someone deserving. Given the amount of professional shit Cantor took, his choice of obsessions seems thus both understandable and sad.
105 IYI In some texts this is referred to as the Continuum Problem.
106 Mathematicians who call it the Continuum Problem frame the general form as ‘Does there exist a set of higher cardinality than but lower cardinality than P
107 IYI It’s Cantor’s specific focus on c that gave the C.H. its name.
108 IYI N.B. that what follows is, even by our standards, a woefully simplistic overview of Cantor’s theory of ordinals—a theory that’s even more complex and ramificatory than the cardinal number stuff—and the only reason we’re even dipping a phalange here is that it would be more woeful still to pretend that the C.H. has only to do with the hierarchy of transfinite cardinals.
109 IYI G. Cantor’s horripilatively technical theory of ordinals and sets’ order-types gets worked out mostly in two papers, “Principles of a Theory of Order-Types” (1885) and the booklet-length “Contributions to the Founding of the Theory of Transfinite Numbers” (1895).
110 The reason that might be confusing at first is the same reason our initial explanation’s “in such a way that the order of the members of A and B remains unchanged” was simplistic—order-type is not the same as mere arrangement. ‘{a, b}’ and ‘{b, a}’ are the same order-type, for instance.
111 IYI Another tailored analogy of Dr. G.’s was that cardinal numbers are like the characters in a school play and ordinal numbers are the marks they’re supposed to hit in a scene, as in a play’s script v. its stage directions.
112 IYI We can see here some clear affinities with Cantor’s theory of irrationals as limits of number-sequences (in §6e). This earlier theory was, in certain ways, the origin of his work on ordinals.
113 IYI The heretofore undefined Burali-Forti Paradox drops right out of this definition. Consider the set of all ordinal numbers everywhere. Now consider this set’s own ordinal number, which by definition will be greater than any ordinal in the set—except that set was defined as containing all ordinals. So either way there’s a contradiction. This is a mean one, and it’s the real prophylactic motive behind the Axiom of Regularity.
114 Except q.v. §7f FN 102 for what the upcoming ‘∀x’ means—which in retrospect means that FN 102 should not have been classed IYI.
115 DEFINITELY IYI Not sure it’s smart to mention this, but at least sometimes G. Cantor used to designate the 1st transfinite ordinal and ‘ω’ to designate the 1st transfinite cardinal. The strict truth is that it was really the set of all finite ordinals (which is what he really called the “first number class”) that Cantor used to derive the first transfinite cardinal—which he basically did because in his theory cardinal numbers were also definable as limit ordinals, which concept we’re not discussing because it requires a level of set-theoretic detail on the relations between cardinal and ordinal numbers that this booklet’s not equipped for. We are using what’s now come to be the standard symbolism, viz.
for transfinite cardinals and ‘ω’s for transf. ordinals; the reasoning is that this symbolism stands the best chance of being familiar to at least some readers. (N.B. that the undiluted poop on Cantorian math in all its intricacies is available in several good technical books, including Dauben’s aforementioned Georg Cantor, Abian’s aforementioned Theory of Sets, and E. V. Huntington’s The Continuum and Other Types of Serial Order, with an Introduction to Cantor’s Transfinite Numbers—q.v. Bibl.)
116 IYI and that they are related to the Weierstrassian epsilons of §5e only in the sense that they’re created by a similar ‘there exists … such that’–type definition—e.g., the first ordinal number k such that ωk = k is designated ‘epsilon 0’ or ‘ε0’.
117 IYI in §15 of the prenominate “Contributions … Numbers.”
118 These two questions collapse into one only if either (1) formal set theory is an accurate map/mirror of the actual reality of ∞ and ∞-grade sets, or (2) formal set theory is that actual reality, meaning that a given infinite set’s ‘existence’ is all and only a matter of its logical compatibility with the theory’s axioms. Please notice that these are just the questions about the metaphysical status of abstract entities that have afflicted math since the Greeks.
119 IYI If set and proof theory weren’t so incredibly esoteric, there would already have been a big-budget movie about Cohen’s proof and the stories surrounding it, which math historians love and you can find in myriad sources. What’s apposite for us are some eerie parallels with G. Cantor. For one thing, Cohen’s background is in functional and harmonic analysis, areas that involve both differential equations and Fourier Series—meaning that he too comes to set theory from pure analysis. It gets eerier. Cohen’s Ph.D. dissertation (U. of Chicago, 1958) is entitled Topics in the Theory of Uniqueness of Trigonometric Series. Plus, just as Cantor had invented entirely new, Diagonal and ‘ϕ’-type set-theoretic proofs, so too Cohen invents a whole new proof-technique known as forcing, which is prohibitively high-tech but in some ways resembles a sort of Manichean math induction where you’re requiring the ‘n = 1’ and ‘k’ cases to take one of only two possible values… . Which may not make sense but isn’t all that vital here—what’s Hollywoodesque is that Cohen gets turned on to set theory, invents and refines his proof-method, and proves the C.H.’s Independence all within a single year.
120 IYI Q.v. §§ 1d and 5b.
121 This kind of Independence (which can also be called Undecidability) is a big deal indeed. For one thing, it demonstrates that Gödel’s Incompleteness results (as well as A. Church’s 1936 proof that 1st-order predicate logic is also Undecidable) are not just describing theoretical possibilities, that there really are true and significant theorems in math that can’t be proved/disproved. Which in turn means that even a maximally abstract, general, wholly formal mathematics is not going to be able to represent (or, depending on your metaphysical convictions, contain) all real-world mathematical truths. It’s this shattering of the belief that 100% abstraction = 100% truth that pure math has still not recovered from—nor is it yet even clear what ‘recovery’ here would mean.
122 IYI Plus, in yet another ’63 proof, Cohen was able to show that even if the Axiom of Choice is added to the other axioms of ZFS, the general-form Continuum Hypothesis still isn’t provable—which establishes that the A.C. and C.H. are also Independent of each other, which again knocked the socks off the math world.
123 Here ‘assumed’ = in a speculative, what-if way. (IYI Factoid regarding the same clause’s ‘various equivalents’: W. Sierpinski’s 1934 Hypothèse du Continu lists over 80 mathematical propositions that either equal or reduce to the general-form C.H.)
124 IYI A good deal of contemporary set theory seems to involve arguing about what these theoretical terms mean and just when and why they do (=mean what they mean, if anything (and, if not anything, then what that nothing might mean (and so on))).
125 IYI Again, you can see how this Formalist view also incorporates elements of Intuitionism, the most obvious of which is the willingness to bag LEM.
126 IYI L. E. J. Brouwer’s pronouncement on the whole Consistency-v-Undecidability thing in set theory is the very Aristotelian-sounding “Nothing of mathematical value will be attained in this manner; a false theory which is not stopped by a contradiction is nonetheless false, just as a criminal policy unchecked by a reprimanding court is nonetheless criminal.”
127 IYI Hilbert didn’t go out easy either. Brouwer and Russell, on the other hand, both ended up living so long they practically had to be dispatched with clubs.
128 IYI As of this writing, P. J. Cohen is the Marjorie Mhoon Fair Professor of Quantitative Science at Stanford.