6 / VI   How to choose a number

In 1792, the wealthy businessman John Jacob Astor wrote one of the first checks issued by the Bank of the United States, newly founded the year before by Alexander Hamilton (figure 6.1). Then as now, the structure of the written text of checks required a repetition of numerals both in numeral words (fifteen hundred and fifty) and in numeral signs (1550), to secure against fraudulent alteration and to reduce the risk of ambiguity of reading. The practice of dual notation is an essential norm of this text genre to this day. We all still do it—those of us who still write paper checks, at least—even if we don’t necessarily reflect very much on why. This 200-year-old check thus has a familiarity to the modern eye despite many other differences.

Figure 6.1

Check written by John Jacob Astor, 1792

Over a century later, the first edition of the now-famous Chicago Manual of Style was published in 1906. In figure 6.2 we see a dizzying array of rules and principles at play to answer the seemingly simple question: How should I write a number? We might prefer conciseness, writing 128 instead of one hundred and twenty-eight. We might prefer aesthetics when starting a sentence with words (“Five hundred and ninety-three”). We are urged to prefer “two dollars” written in words in preparing ordinary reading matter but to use an ideogram (the dollar sign) and numeral sign in matters “of a statistical character,” depending on genre. Years should always be written in numerals—even though the reader isn’t given any guidance whether to read such a numeral aloud as “two thousand eighteen” or “twenty eighteen” or something else entirely. We might prefer consistency across multiple numerals in a sentence, wherever two or more of these principles conflict.

Figure 6.2

Chicago Manual of Style, first edition: rules for writing numbers (University of Chicago Press 1906: 30–31)

All published writers have had to deal with the modern versions of these same rules, often to our own dismay, in the editorial process for our publications. Once, when I was working as an editorial assistant for the late Bruce Trigger, my doctoral supervisor, he reported to me with alarm that the book I was helping him with had been copyedited to replace the word “million” with “10 lakhs” and “two million” with “20 lakhs.” It turns out that the editing had been outsourced to an Indian firm, whose copyeditor had duly employed the word “lakh” consistently, which is Indian English for 100,000 (from Hindi lakh, hundred thousand).1

These sorts of variations, these degrees of freedom in writing and reading texts, are ubiquitous in textual traditions. Over the past chapters, I’ve written at length about how individuals and groups adopt, use, and abandon particular numerical notation systems in specific contexts. But this perspective presents choice as a binary, as if one day a writer simply decided to stop using Roman numerals and then never used them again. That might conceivably happen, but it isn’t the most likely scenario. Skilled writers are always taking account of the context in which, and the audiences for which, they’re writing. And because, in addition to numerical notations, languages also have number words available for expressing numbers, the choices and combinations available to writers are even more expansive than simply a choice among notations.

Writing systems typically record language, but numerical notations—like Roman numerals, or like the Western numerals 0 through 9—are graphic, relatively permanent notations not linked to any specific language. We can read them in whatever language we prefer, and often do—in other words, they’re not multilingual (recording information in multiple languages) but rather translinguistic (allowing the reader to choose what language to read something in). Most writing systems, especially after the earliest ones, can be used to write any word in a language, including the set or sets of lexical number words of the language. Given the ubiquity of lexical and nonlexical resources for expressing numbers in writing, almost any numeral phrase can be written using a variety of strategies.

Here are seven different English expressions for the same number (one million two hundred thousand) from relatively formal texts from the first half of the twentieth century:

Canadian production of steel ingots from 1935 to 1938 averaged about 1 million, 200 thousand tons per annum. (Anonymous 1941: 154)

But perhaps the most important fact bearing upon the situation is the report of the steamship companies to the effect that over twelve hundred thousand applications have been received in the last four years for passage to their native European lands, as soon as the war is over. (Freund 1918: 20)

In fiscal 1948 such carryings under the U.S. flag amounted to 1.9 million tons out of a total of 3.6 million tons, while in 1938 the U.S. flag share was 1.2 million tons of the 2.4 million tons total industrial carryings. (Anonymous 1949: 82–83)

There are one million two hundred thousand students in high schools; a third of a million in higher institutions. (Cattell 1914: 156–157)

Whole shrimp examined immediately after removal from the trawl-net varied in bacterial count from 1,600 to 1,200,000 per gram. (Green 1949)

One point two million ($1,198,150), or .6%, for command and management. (Anonymous 1953: 68)

The total number of ions per cc. produced in air by the radiation to which they are subject is then not more than thirty times 1.2 × 106 (the number of seconds in two weeks), or 3.6 × 107. (Muller and Mott-Smith 1930: 279)

We can trace the rise and fall in frequency of different forms for this number over time (figure 6.3). In English, we have the largest and broadest corpus of written texts available for analysis, the Google Books corpus, analyzable through the Ngram tool, which traces changes in the relative frequency of words among 155 billion words in English-language printed books (Google Ngram Viewer 2016, http://books.google.com/ngrams). Throughout the nineteenth century, twelve hundred thousand, which to many modern ears sounds jarring or even ungrammatical, was the most frequent form. (I can say twelve hundred without an issue, but when I add another number word on the end of it, it sounds infelicitous to me, and to others. This form is almost absent from contemporary written English.) One million two hundred thousand was slightly less popular but still widely used. In contrast, 1.2 million was almost absent from nineteenth-century English books, but especially after 1940 rapidly overtook all other options. Other options, like one point two million, are rare throughout the whole period and only show up in the twentieth century. And still other options, like 1.2 × 106, using scientific notation, are vanishingly rare in ordinary books, magazines, and texts intended for nonspecialists.

Figure 6.3

Variation in frequency of expressions for “1.2 million,” 1800–2000 (Google Books Ngram Viewer, http://books.google.com/ngrams)

Some of this linguistic change reflects changing genres—the rise of scientific writing, for instance, affects word frequencies in the Google Books sample (Pechenick et al. 2015). Some of it surely reflects the changes in formal style guides, which influence what makes it into print. And some of it reflects changes in taste and aesthetics, and in what English speakers and writers choose and expect in writing. But at no point has there been just one grammatical way to say or write this number. This variation exists across text genres—for instance, scientific writing may well differ from literature—but also may vary by time period, the nature of the text’s production (e.g., transcription or editing), and the individual idiosyncrasies of writers. How do literate norms and traditions, and all these other factors, constrain user choices? We are once again returning to the issues of constraint outlined in chapter 1, but with additional ammunition at our disposal. Recognizing that there is variation is only the first step. Determining how genre, dialect, and broader social factors interact with the preferences of individual writers and readers must follow.

Explaining how and why these sorts of variability persist, and why these changes occur, is no easy task, even for a contemporary world language like modern English. It is substantially more challenging to examine variation and choice in numerical expressions in premodern literate traditions where the survival of texts is highly fragmentary and the entire corpus of materials is orders of magnitude smaller than for English. This brings us to a set of key analytical questions. First, what options exist in each society, or in particular literate traditions, for writing numbers? Second, how does this variation in numerical expression correlate with genre, audience, and medium—in other words, what might help explain the variation? Third, how can we use this variation to usefully explain writers’ choices?

Let’s start, then, with the recognition of variation—things are not always the same. In nearly every literate context, number is expressible using a variety of strategies, as we’ve just seen. But just describing variation isn’t enough—as social scientists and humanists we should be trying to explain variation. In using the concept of agency, I am stepping into tricky territory, because of the often-interminable and unproductive social scientific discussions over the past decades about what constitutes agency, how it relates to social structure, and other theoretical quagmires (Giddens 1979; Ahearn 2001; Dobres and Robb 2000). But put simply, agency is the ability to make meaningful choices among options. In other words, I want to draw our attention not only to what sorts of notational options are available to writers, but also to how writers select among them. Variation can be a key or a clue to the existence of agency, but they’re not the same thing.

A lot of the scholarship on numeration and number systems doesn’t talk about either variation or agency. Part of the problem is that we—and by “we,” I mean both specialists in number systems as well as people more generally—have tended to think of numbers as a technical solution to problems, as a sort of adjunct to mathematics. As I’ve outlined earlier, I see numerals as representational systems, related to practices of literacy and writing, not as computational systems. They have consequences for numerical cognition, yes, but their use and history are not driven by those consequences. The scholarship on literacy and writing systems is full of discussions of agency (Englehardt 2012). By drawing as much attention to variation and agency in the writing of numbers as has already been done for writing in general, I think we can get to a more satisfying account of how writers chose and used the notations available to them.

When we’re talking about choice in numerical systems, we need to use another concept, this one from semiotics, the concept of modality, which derives from the philosopher of language Charles Sanders Peirce and has made its way into a lot of theory in semiotics and the social sciences. Modality includes the medium in which something is expressed—visual, auditory, etc.—but also involves the mode of its expression—how it conveys the meaning, and the degree to which it claims to be a representation of some underlying reality. So, for instance, the concept “stop” may be expressed in speech, in writing, gesturally by an outstretched hand, or simply through the ideogram of a red octagon (figure 6.4). I guarantee you will stop your car at a red octagon, even with nothing written on it. Please, please stop at a plain red octagon, just in case. But modalities can also be combined—a red octagon combined with the written word STOP, or a forward-facing palm combined with a shouted “STOP!” Both the octagon and the written word are visual, but the modality is different. In combination, these are multimodal representations, which achieve more together than would be possible as the sum of their parts (Hull and Nelson 2005). As the Sumerologist Cale Johnson notes, “each modality acts as an implicit metalinguistic confirmation of the meaning of the other.” Particularly when modalities are brought into coherence with one another, the effect is to encode meaning and simultaneously signal something about the choice being made (Johnson 2013: 28).

Figure 6.4

Modalities of representing “stop” and “four”

All of these same modalities are found in number systems: the written or spoken word “four,” a hand with four fingers extended, or a numeral IV. Here too, multimodal representations are available, as in the expression 1.2 million—numerical notation and written number words are two different modalities, within the visual medium, that frequently co-occur and work together to convey meaning. We can then call these multimodal numbers—ones that combine modalities in various ways. When they work together, we may not even notice this consciously. But they provide writers with opportunities to highlight particular readings, make particular numerals more salient, or facilitate more rapid reading—in other words, to use the text to facilitate a particular communicative relationship with some audience.

Below, I set out a model for describing and analyzing variation in multimodal numerical expressions in a comparative perspective. In particular, I look at two modalities (introduced in chapter 1) that are extraordinarily common cross-culturally: lexical numeration—number words—and numerical notation itself. While there is no one reason why writers choose to mix these two modalities, it happens so often, and in particular ways, that it is no mere coincidence. So let’s have a look at how some written traditions do this.

Agency without variation

At one end of the continuum, there are a few writing systems that have no corresponding numerical notation—in other words, numbers always have to be written out as words. Many of the scripts of the Philippines and Indonesia, for instance, historically used no numerical notation. The scripts from which they are descended, the Indian and Southeast Asian writing systems descended from the Brahmi script, had or have numerical notation—ultimately that’s where Western numerals came from. But in most of island Southeast Asia, numbers were traditionally written in words, never in numerical notation. It should not be entirely surprising that some script traditions have no numerical notation—after all, numbers can always be expressed in the numeral words of some language, so in some sense numerical notations are redundant. Their worldwide near-ubiquity is thus all the more striking.

In direct contrast, most of the earliest script traditions worldwide use numerical notation but rarely number words. Numbers are widespread in the earliest texts of Mesoamerica, China, Egypt, and Mesopotamia, the four regions of the world where writing is generally thought to have developed relatively independently. In all of these early scripts, numerical notation arrives early, and numbers are not written out in words normally or at all (Chrisomalis 2009). Houston (2004: 238) contends that “most early scripts use word signs bundled with systems of numeration that probably had a different and far-more-ancient origin.” We don’t know much about Neolithic and other notations prior to writing—most of this evidence simply hasn’t survived (Postgate, Wang, and Wilkinson 1995). Whatever the case, it’s clear that there is some connection—numerical notation immediately precedes writing and maybe even causes writing to emerge, although the latter claim is much harder to sustain (Schmandt-Besserat 1992).

Egyptian presents an extreme case because of the near-absence of numeral words across its many thousands of texts. A few Old Kingdom hieroglyphic Pyramid Texts (ca. 2350–2100 BCE), written on the sarcophagi and walls of the pyramids of the pharaohs of the period, use both number words and number symbols side by side. One striking example is discussed by Pascal Vernus (2004: 284) where the phrase “nine bows” (pst pwt), a collective term referring to non-Egyptian lands, appears in three different places in different Pyramid Texts in three different forms (figure 6.5)—once with only nine vertical strokes for 9, once with nine bows each drawn separately, and once with both nine strokes and the word “nine” written out phonetically, side by side. These aren’t exactly multimodal texts—they all occur in the same genre, the Pyramid Texts, using the same phrase, but were written at different times by different writers. These cross-textual correspondences do, however, allow us to know how the numerals were pronounced (minus the vowels, as with Egyptian writing systems).

Figure 6.5

Egyptian expressions for “nine bows” in three different Pyramid Texts (after Vernus 2004: 284)

Beyond these few texts, though, almost all of what we can reconstruct of the Egyptian number words comes from Coptic texts from the early first millennium CE, which help us fill in vowel sounds and reconstruct the number system more fully (Loprieno 1995). Coptic is clearly an Egyptian descendant language, but its script is largely Greek-derived with only a few signs taken from Egyptian demotic writing (Loprieno and Müller 2012). The hieroglyphic script, used primarily in monumental contexts, the hieratic script used cursively to write on papyrus, and the demotic script used in the first millennium BCE and later all use numerical notations virtually all the time, which presents a real challenge in linguistic reconstruction. Thus, James Allen’s (2000: 99) magisterial grammar of Middle Egyptian advises students (correctly) that “it is not necessary to learn all these number words in order to read hieroglyphic texts”—which does raise the question of how Egyptology students ought to read them, when they’re reading Egyptian. Mostly, I am told by several graduate students in the field, they just read them out in English or skip over them.

One is struck by this absence in the enormous Egyptian corpus of tens of thousands of texts, across three different writing systems, over three thousand years. It is hard to believe that writing numerals out in words would never have occurred to scribes. The Pyramid Texts show us that they could, on occasion, do so. Rather than sitting there bemused, we should then ask whether some other principle was more relevant. I suspect the norm at work was part of the graphic norms of the very conservative scribal practice, which the Egyptologist John Baines calls “decorum” (Baines 2007). Baines contends that in many respects, Egyptian art, iconography, and written practice weren’t so much blindly traditionalistic as they were consciously fixed in place by a set of canons. In other words, it wasn’t that Egyptian scribes couldn’t have written numbers in words, or that they never thought of it, but that they chose not to, consistently, over a period of three thousand years. This demonstrates, moreover, that variation and agency are not the same. Sometimes a lack of variation where we might expect it can also be a sign of agency.

Linear B, the script of Mycenaean Greece and the Aegean islands, is another interesting case where there is a near-absence of lexical numerals in the thousands of clay tablets and other surviving texts. We can infer what the archaic Greek numeral words probably were, working backward from later Greek texts and using comparative historical linguistics, but they were virtually never written out in the Linear B inscriptions. Instead, very simple abstract graphic numerical notations—lines and circles—were common to both Minoan and Mycenaean notational practices. In this case, the exception is all the more interesting for its relevance to the decipherment of Linear B. In his famous book on the decipherment, John Chadwick, one of its co-decipherers, reprinted a letter from May 1953 from Carl Blegen to Michael Ventris (now recognized as a primary figure in the decipherment), writing in excitement about tablet P641 from Pylos:

Enclosed for your information is a copy of P641, which you may find interesting. It evidently deals with pots, some on three legs, some with four handles, some with three, and others without handles. The first word by your system seems to be ti-ri-po-de and it recurs twice as ti-ri-po (singular?). The four-handled pot is preceded by qe-to-ro-we, the three-handled by ti-ri-o-we or ti-ri-jo-we, the handleless pot by a-no-we. All this seems too good to be true. Is coincidence excluded? (Chadwick 1990: 81)

Shortly after Ventris’s initial identification of syllabic values for the signs that would secure the language of the tablets as Greek, Blegen had noted the correlation on P641 of words containing the morphemes ti-ri and qe-to-ro with pictographic representations of vessels with three feet, or three or four handles, as seen in figure 6.6. These words are exactly what we would expect in archaic Greek on the basis of linguistic reconstruction. P641 actually contains three different modalities of numerical representation: lexical number words (the numerical prefixes used to indicate the values 3 and 4); pictograms with numerical indicators of 3 or 4 feet or handles; and finally, the ordinary Linear B numeral signs, indicating the quantities of each type of vessel using vertical strokes.

Figure 6.6

Linear B tablet from Pylos, P641, with different numerical modalities (drawing by Michael Ventris; image courtesy of the University of London Institute of Classical Studies Ventris Archive (MV 062.3))

But rather than thinking solely about this tablet’s utility for the modern decipherment of Mycenaean Greek, I want to draw our attention to its graphic complexity as an index of scribal choice. For a scribe to represent the number of handles or legs on a vessel in two different ways—both lexically and pictographically—is a significant choice on small clay tablets such as these, where space is at a premium. The writer may have felt it useful, or thought it essential, to employ both forms, perhaps for less-than-fully-literate readers who may not have known words like ti-ri-po-de—even though the modern reader might see the word tripod jumping off the page, just as Blegen did in 1953.

Finally, there can be variation in numerical expressions even within numerical notation itself. In Western mathematics, for instance, one can write the same number as a fraction (1/4) or a decimal (0.25), or one can write a number with auxiliary marks like commas and decimal points, or not (1200 vs. 1,200 vs. 1200.00). All of these are numerical notation, but one still has choice as to which one to employ. In some script traditions where there is a close correspondence between signs on the one hand and words/morphemes on the other, it is not even always possible to unambiguously distinguish numerical notation from number words, such as the various Siniform writing systems of East Asia. Whereas in English we have both 4 and four, in Chinese there is, normally, only . The distinction between written number word and number sign breaks down when each numeral word has a single sign. But this is not to say that there is no variation in writing practice. Figure 6.7 shows several Chinese representations for the number 20,406.

Figure 6.7

Variation in Chinese numerical expressions

The first, and earliest, is an example of what I have described in chapter 1 as a multiplicative-additive system—it uses signs for 1–9 and for each power of 10, combined in multiplicative pairs, so that 20,406 is written 2 10000 4 100 6. From the Shang Dynasty onward, that was the normal way to write the number. But starting in the late sixteenth century, a sign for zero, líng, could be inserted (and read) to indicate empty positions, as in the thousands and tens place in 20406. Líng is a different sort of zero, though—when used in the classical system, it’s really redundant, because the signs for the powers are still included. This also meant that, in many numeral phrases with successive zeroes, only one líng was used, so that 20,006 would just be 二 萬零 六 (2 10000 0 6). Finally, today Chinese numerals are frequently written positionally, using either the líng sign or a circular zero, but without the multiplier signs for the powers. All of these are read, character by character, in different ways. Today in China, it is just as common for writers to simply use the Western signs for 0 through 9 outright. There is also, in common use in finance, a secondary set of numerals, the dàxiě (lit. “big writing”) numerals. These are read identically to the ordinary classical numerals but exist for banking purposes, with signs specifically selected for their complexity in order to prevent fraudulent alteration. This practice is directly analogous with the security measure of writing numerals both lexically and graphically on Western-style checks. Finally, Chinese has a variant sign for 2 (), pronounced liang instead of the more common èr (). It may seem odd to have two words for two, but thinking about the English lexicon for a moment, and considering the “twoness” of couple, second, pair, deuce, duo, twain, twice, and both, it is not so surprising after all.

Thus, even in contexts where there is seemingly only one choice—where numbers are only written using one modality—there is often variation available to the writer. Now let’s turn from cases where numbers are expressed typically in only one modality to ones that mix modalities in various ways. Three such ways are:

  1. Blended modalities: numerical notations that provide cues or clues to their pronunciation or their linguistic origins;
  2. Hybrid modalities: numerals expressed through a combination of lexical and notational resources, but with the two remaining distinct;
  3. Parallel modalities: numerals expressed in two or more modalities within a single text.

Blended modalities

Sometimes the distinction between lexical and graphic modalities is not so clear-cut. Normally, numerical notation is translinguistic and can be read in different ways in different languages—in other words, there is no phonetic component to it. Sometimes, however, a numeral phrase is written using signs that provide some clue to the phonetic value of the relevant number word, but without using full representation in words. I will call these blended modalities.

Immediately your mind might turn to the Roman numerals, where C is the first letter of centum (100) and M is the first letter of mille (1,000). However, this was actually a later development. Note that the other Roman numerals don’t have any phonetic associations, even though they are letters—why would L be 50 or X be 10? As mentioned in chapter 3 (and simplifying a very complex story), the Roman numerals began as a completely different sign system from the Roman alphabet, one that was not associated with letters at all; over time, the signs became associated with letters which integrated them with literate practices but also obscured their origin (Keyser 1988). By coincidence, the old Roman Ɔ for 100 could be mirrored across the vertical axis to become C, and the Roman for 1,000 could be separated at the bottom to resemble an M. Actually, was the main form of 1,000 in antiquity, and even throughout the medieval and early modern period it was extremely common.

The same sort of assimilative process went on with the closely related Greek acrophonic system. Originally unconnected to the sounds of words, by the sixth century BCE the signs were altered to correspond with the first sounds (Greek akros = “first, highest” + phone “sound”) of the words PENTE, DEKA, HEKATON, CHILIOS, and MYRIOS as their respective numeral signs for 5, 10, 100, 1,000, and 10,000 (Tod 1979). Thus, for instance, 135 would be ΗΔΔΔΠ (100 + 10 + 10 + 10 + 5). But note that when written out as such, Greek acrophonic numerals cannot be “read” as words: the ancient Greek word for 30 is triákonta, not deka deka deka. Moreover, the sign for 1 (a vertical stroke) is not acrophonic at all. At best, the acrophonic numerals probably helped in learning the signs and, for semiliterate readers, drew associations that made reading easier. Blending modalities served a mnemonic function, rather than one oriented toward the needs of fully literate readers.

Acrophonic numeration survives today in Western notations such as the use of K for 1,000, as in popular video games like NBA 2K18, or in Y2K. The K is the first letter of the pseudo-Greek morpheme kilo, which was stripped in the early modern period of the proud chi at the start of χίλιοι (or, I must add indignantly, Chrisomalis) when it became a metric prefix (as in kilogram). Now, over the past few decades, K has turned back into an acrophonic numeral for 1,000. According to the Oxford English Dictionary, K for 1,000 seems to have originated in the 1960s in computing and electronic contexts (where it normally refers to 1,024 bytes, not 1,000), then spread to salary figures like “She makes $92K a year” in the 1970s. Since Y2K, K has become a playful and hypermodern (even though millennia-old) strategy for representing years. It’s no shorter to write 2K18 than 2018, and presents uncertainty to readers as to how it should be read aloud, even where its meaning is unambiguous. Rather, it’s an example of what the linguist David Crystal (2001) calls “ludic language”—language influenced by the human capacity for play, manipulation, and puzzling.2

Another case of blending modalities is the siyaq or dewani notation used by Arabic, Persian, and Ottoman administrators from the tenth to the nineteenth centuries (figure 6.8). Here, the sign for each multiple of each power of the base of 10 is a cursive reduction of the corresponding Arabic number word. This is like a visual clue indexing a particular linguistic reading, as if the number 9 looked like the word nine. Yet despite their lexical origin, siyaq numeral phrases could not simply be read lexically. They are too reduced to be read or written automatically except by a trained scribe, and constructing numeral phrases using two or more siyaq signs does not automatically generate a grammatical Arabic number word. There is also no necessary expectation that a scribe using siyaq numerals would have been knowledgeable in Arabic, and we know that a lot of them were Turkish, Farsi, or Hindi speakers. Figure 6.8 shows the Persian version of the siyaq, and while some of its users may have known some Arabic, that would not have been necessary. The origin or the “visual etymology” of the signs is lexical, but their reading and use are not necessarily so. Actually, all users of the siyaq numerals were also fluent users of other numerical notations, such as the Arabic, Persian, Indian, or Western variants of the decimal, positional numerals 0 through 9. The role of the siyaq in its social context was actually a form of social control, limiting access to financial information to those initiated in its use, while providing a visual clue that might have been useful in teaching. Blending modalities, in this case, was semicryptographic and part of a set of accounting practices designed to limit the flow of information rather than to encourage it.

Figure 6.8

Persian variant of siyaq / dewani numerals (Kazem-Zadeh 1915: plate VIII)

A final instance of blended modalities consists of numerals used in unexpected contexts for nonnumerical, often playful purposes, including phonographic ones. In these, rather than the numeral giving a clue to the numerical value, the numeral indexes a phonetic value. Probably the best-known numerical phonogram in English is K-9 used as a pun on the word canine, but it is not the only one—2 for to/too, 4 for for, and 8 for -ate in phrases like “Stop h8.” The use of the numerals for their phonetic value remains frequent, although less so than a decade ago, in the written language of youth texting, much to their elders’ dismay.3 But this is no mere modern trick but a longstanding practice in multiple script traditions. Take, for instance, the Maya numeral 4, expressed as four dots, with the phonetic value kan or chan. The same four dots can be employed nonnumerically to express homophones or near-homophones in words meaning “snake” or “sky”—the numerical sign could be used for any of the three meanings (Houston 1984). Or, in Elamite cuneiform, the logogram for “king” is the numeral phrase “180 20,” or 3,600 (šar), which is nearly a homophone of the word for king, šarru (Nougayrol 1972; Biggs and Stolper 1983). B4 we dismiss “kids these days” for what we suppose are their linguistic atrocities, we should look to past scribal practice for evidence of numerical play.

Hybrid modalities

Other numerical traditions combine lexical and nonlexical graphic signs in a systematic way, normally by using lexical terms for powers of the base in combination with graphic signs for low numbers. I call these hybrid modalities—there are two modalities, but they’re not blended together so much as interwoven. Both the lexical and nonlexical parts of a hybrid phonographic representation are discrete and comprehensible on their own, and then they are combined in some systematic way that is directly readable.

A well-developed numerical tradition that uses hybrid phonography systematically is the Eblaite script found on thousands of tablets at the palace archive of Ebla in what is now northwestern Syria, dating to the middle of the third millennium BCE. The Eblaites were speakers of a Semitic language related to Akkadian and Babylonian, and used a cuneiform script akin to those used in Mesopotamia, to the east. For numbers below 100, Eblaite used the ordinary Sumerian curviform or cuneiform signs for 1, 10, and 60, written additively just like almost any other Mesopotamian numerical system at the time (Pettinato 1981; Chrisomalis 2010: 245–247). Where Eblaite diverged from the practice of Akkadian scribes is in the writing of numbers for powers of 10 starting with 100. For 100, 1,000, 10,000, and 100,000, Eblaite did not use graphic signs but represented the powers of the base lexically, using the numeral words. In some cases, these could be reduced to the first syllable of each word (mi, li, ri, ma).

In figure 6.9, we have an Eblaite clay tablet probably from the twenty-fourth or perhaps the twenty-third century BCE (ARET 02, 20; CDLI P241045; cf. Edzard 1981). We’ll focus only on the bottom left register. The coefficients of each power are easily readable by the number of signs: 1, 8, 6, 2. The words that follow each number are the Eblaite words for each power of ten descending from 100,000 to 100, giving the final total of 182,600. This has a close parallel with the expression “1.2 million” that I discussed earlier. Eblaite didn’t have any known numerical signs for the higher powers, even though many of the other Mesopotamian cuneiform writing systems did. However, the scribes were familiar with other ways of writing, such as the Sumerian numerals used by both the Sumerians and later, the Akkadians, throughout the third millennium BCE.

Figure 6.9

Eblaite text with a representation of 182,600 using numeral signs with words (source: Edzard 1981, ARET 02,20)

Why, then, did they diverge from the Sumerian, sexagesimal numerical practice? In these cases it’s often tempting for social scientists to invoke concepts like identity to explain variation, but in most other respects, Eblaite had many similarities with Akkadian scribal practice; simply invoking identity seems like a weak explanation, given that there’s no real evidence for or against it. An alternative account might focus on cognitive effort. Sumerian written numerals have a complexity in that they use many elements of sexagesimal (base-60) notation, whereas Eblaite, like all the Semitic languages, is decimal. To write 182,600 in Sumerian, the scribe would need to write five signs for 36,000, four signs for 600, three for 60, and two for 10. It’s not that it’s longer to write—14 signs is not so many, compared to the text we’re looking at—but it requires a great deal of calculation to figure out how to write it in this system, for speakers of a language with decimal numerals, like Eblaite. Analogously: imagine if every time you needed to write out a time in hours, you needed to figure out how many seconds it is first. In contrast, the Eblaite text, although not exactly concise to write unless the words are reduced to their first syllables, requires no particular cognitive load—simply append the numerals for 1 through 9 to whatever word for a power of 10 you need. But the fact that sometimes the short forms were used solves even that problem—note the parallel with the Greek acrophonic system, using the first letter to represent the whole word. Because Ebla was politically distinct from the Akkadian Empire to the east, their scribes may have been more free to experiment. Eventually, the Semitic-speaking Babylonians around 2100 BCE started using decimal numerical notation much like the Eblaite system, alongside a new notation—positional, base-60 numerals that were the earliest true place value system (Proust 2009; Ouyang 2016). The decimal system began to use me and li-im for 100 and 1,000, perhaps in emulation of Eblaite practice but just as likely independently developed. And, by the seventeenth century, the Hittites had begun using the same Semitic words for 100 and 1,000, along with elaborated ideograms for 10,000 and (probably) 100,000 (Hoffner 2007). Even though Hittite was an Indo-European language, its numerals (like those of most Indo-European languages) were decimal, so this was an easy adoption of Mesopotamian practice, though the signs were probably read aloud in Hittite, not Babylonian or another Semitic language. Sumerian—known and available, though deeply archaic at the time—was not considered.

The Sogdian written tradition of central Asia was used roughly from the fourth to tenth centuries CE in parts of what is now Uzbekistan, Tajikistan, and western China. Sogdian is a highly cursive alphabet, difficult to read today as it surely must have been at the time. In figure 6.10, we have a Sogdian financial document from the eighth or ninth century (Bi and Sims-Williams 2010). The original findspot of the text is unknown, but it is from western China, near the city of Hetian (ancient Khotan). In Sogdian texts, there were both numerical notations and numeral words for all the numbers at least as high as 10,000, so in theory, any text could be written entirely in words or entirely in notation. However, many texts, like the one shown here, use words for the lower numerals one through nine, combined with notation for the powers for 100, 1,000, and 10,000.

Figure 6.10

Sogdian financial text, eighth–ninth century CE, GXW 04320 (Bi and Sims-Williams 2010: 501; Museum of Renmin University of China, Sogdian document no. 3; Image © Bi Bo and Nicholas Sims-Williams)

δs 1,000 pny βyrt ctβ’r 1,000 βt 100

ten 1,000 pny received four 1,000 seven 100

“10,000 pny. Received: 4,700 [pny].”

δ’rt ’ytxw msyδr ’δw 1,000 δwy 100 pny

has Itkhu priest two 1,000 two 100 pny

“Itkhu the priest has 2,200 pny.”

δ’rt βwγδ’t 24 1,000 pny

has Vogh-dhat 24 1,000 pny

“Vogh-dhat has 24,000 pny.”

δ’rt ypγw ’’tryc pnc 1000 βt 100 pny

has yabghu Atarich five 1,000 seven 100 pny

“The yabghu Atarich has 5,700 pny.”

This is the exact opposite of the pattern we saw in Eblaite—here the powers tend to be in numerical notation, and the units in words. For instance, in line 2, the number 2,200 is written with two in words, but 100 and 1,000 notationally. But immediately after this, in line 3, 24,000 is written entirely in notation: the sign for twenty, plus four strokes for one, followed by 1,000. A Sogdian sign exists for 10,000 that the scribe could have used if they had wanted to write “two 10,000 four 1,000,” but it is rare, and the scribe may not have known it or wanted to avoid it. In any case, in line 4 the scribe returns to the hybrid modality “five 1,000 seven 100” for 5,700.

Again, we can ask why. Here I think it’s possible that the decision was motivated by issues related to multilingualism. The numbers one through ten are among the first words learned by children and new learners. This is an economic document, located in or around Khotan, a major strategic location along the Silk Road, and Sogdian was a minority language here, well to the east of its heartland. It’s not that the numerals 1–10 would have been hard to write—they’re mostly vertical strokes, ligatured together at the bottom, as in the 4 in 24 in line 3. It’s that the words for 100, 1,000, and 10,000 couldn’t be presumed to be known by all potential readers of the text.

Hybrid modalities are common cross-culturally. Returning to the Roman numerals, in a lot of early modern writing, Roman numerals and numeral words could be mixed fluidly. One problem faced by medieval writers was that there was no standard way for writing large numbers in Roman numerals. Charles Burnett (2002) discusses the interesting case of the Helcep Sarracenium of Ocreatus, a twelfth-century text explaining place value and Indian/Arabic arithmetic (the “algorism”) including transliterations from Roman to very early Western numerals. This is a multicultural document influenced by Indo-Arabic knowledge but using the representational systems available to a medieval European writer, well before Leonardo of Pisa’s Liber abaci of 1202 made the Western numerals familiar (to scholars, at least). Because there were no standard ways of expressing numbers higher than 1,000,000 in Roman numerals (the bar or vinculum representing multiplication by 1,000 only took you so far), some other strategy was needed. The solution in this text was to simply add the words decies and centies after the MM for 1,000,000, as shown in table 6.1. Because the Western numerals are infinite, we don’t have quite the same problem, because the pure notational forms 10,000,000 and 100,000,000 exist, but we still usually find 10 million and 100 million easier to read and understand.

Table 6.1

Hybrid Roman numerals in the Helcep Sarracenium

i

1

x

10

c

100

M

1,000

xM

10,000

cM

100,000

MM

1,000,000

deciesMM

10,000,000

centiesMM

100,000,000

Similarly, Ford (2018) analyzes numerical expressions in the Middle English verse romance Capystranus, dating to the late fifteenth or early sixteenth century and notable for its frequent use of large numbers. This was a poem clearly intended to be read aloud, as was the common practice at the time. The meter of the poem clearly indicates how the numerals must have been read. Of the expressions in the poem, fifteen were in lexical numerals alone, six in Roman numerals alone, and nine used hybrid modalities such as Syxe and twenty.M. for 26,000 and C. thousande for 100,000. It wasn’t that there was one way to write any specific number, though—for instance, 20,000 was written as xx. thousande, Twenty.M (twice), and Twenty thousande (twice). Rather, one of the best predictors for the choice of modality was the position of the numeral in the line. Roman numerals never started a line and never ended a line (so as to clearly indicate the rhyming scheme). Seven of the nine hybrid numerals were at the start of the line, with the lexical numeral preceding the Roman numeral, likely to conform to the aesthetic canon of starting lines with words (comparable with the modern norm of not starting a sentence with numeral notation, discussed earlier).

Now, the whole thing could have been written out in words, and this is the choice made by many poets even today. But hybrid modality is far more common than most contemporary scholars acknowledge. Important English printers such as William Tyndale and William Caxton often avoided the use of Roman numeral C and M in early printed Bibles in favor of lexical hybrids such as “ix hundred and xxx yere” (Williams 1997: 11). We even see cross-modality influences such as the use of XX (vingt) for 20 in hybrid French texts, as in a letter from King Richard II of England dating from 1392 but referring to events three years earlier, “l’An Mille, CCC, IV XX. & Neuf” (Rymer 1740: 76). Clearly this can only be read in French as mille trois cents quatre-vingt et neuf (1389). Not only is there a hybrid modality, but the use of XX as a structuring feature in combination with IV shows a further influence, in this case, from French, where quatre-vingt is the number 80 (Preston 1994). Crossley (2013) shows that mixtures of Roman numerals and number words (French, Latin, or other), mixtures of Roman and Western numerals, or annotations of numerals with signs indicating how they were to be read (e.g., a superscript o or mo in phrases like Mo for millesimo = 1,000th) were extraordinarily common in late medieval texts. The frequency of this sort of notation, combined with our awareness of just how frequently contemporary writers use phrases like “27 million,” has enormous implications not only for how we read texts, but how we explain longer-term processes of change and replacement of notations like the Roman numerals. Because hybrids are common, we shouldn’t treat numerical notations as pure pristine objects to be compared to one another, without considering these sorts of fruitful blends.

Parallel modalities

The third type of multimodal numbers I’m interested in are what we can call parallel modalities. These are examples like John Jacob Astor’s check from 1792, in which each individual number is written with only one modality, but across a single text multiple modalities are used in parallel for writing the same number. This sort of apparent redundancy is actually very useful for several reasons, as some examples from the ancient world will show us.

The South Arabian script tradition started in the early first millennium BCE, as a variant of the Bronze Age alphabets of the Sinai and the Levant, but quickly took on a distinct aesthetic quality especially in monumental contexts. Used mainly in the southern part of the Arabian peninsula (Yemen and Oman), but also in other parts of Arabia as well as across the Red Sea in what is now Eritrea, it was written in boustrophedon style—in other words, with lines alternating right to left and left to right. For the first several centuries of its existence, South Arabian was a writing system without any special numeral signs—numbers were only written out in words. Then, around the sixth or fifth century, possibly under the influence of the Greek acrophonic system I discussed earlier, a set of distinct additive South Arabian acrophonic numerals developed (Biella 1982).

In figure 6.11, we see an inscription from Sirwah, probably from the sixth or fifth century BCE, in which the number 6,000 is written. South Arabian numerical notation is always preceded and followed by a hatched vertical bar—so here, we see the six signs for 1,000 surrounded by bars. This creates a visually salient indicator that makes numerals distinct within the text. What’s interesting with South Arabian is that writers almost never used numerical notation alone, but rather, preceded by the exact same number word.4

Figure 6.11

South Arabian inscription for “6,000” with parallel modalities (DAI Sirwah 2005–2050)

sdtt ‘lfm 1,000 1,000 1,000 1,000 1,000 1,000

“six thousand (6,000)”

One possible reason for this redundancy is that numerical notation was rare in the South Arabian writing system. It hadn’t existed for centuries previous to this inscription, and at the time it was fairly new, so writers may not have assumed that even fluent readers would be familiar with it. It’s also possible that this duplication serves the same security function as on modern checks, or as in legal contracts where redundant numbers in parentheses are used to ensure correct readings, and I’ve transcribed it here as if it were so. However, in these sorts of monumental inscriptions, it doesn’t seem likely that scribes carving in stone would have this concern. A final, intriguing possibility is that adding numerical notation at all was simply done to emphasize and make salient the number being expressed—in other words, the visually striking repeated signs, surrounded by hatched bars, are part of an aesthetic canon designed as a sort of emphatic flourish. The parallel here is with the “conspicuous computation” strategy I discussed in chapter 2. This sort of monumental inscription is exactly the sort of place where salience—giving prominence to numerals within a text—might be a viable strategy.

In any event, this practice did not survive too long. Although the South Arabian script tradition survived right up until the early Islamic period, the South Arabian numerical notation was not used past the first century BCE. After that point, the script reverted to writing all numbers out in words alone. This process of simply abandoning numerical notation may seem strange, given how essential we, as modern people, imagine it to be to the survival of civilization. But the history of numerical notations is not a unilinear, progressive arc toward representational and computational perfection. Not only will systems be replaced for reasons other than “efficiency”—for whatever value of “efficiency” you might choose—sometimes they won’t be replaced at all, but simply abandoned. To me, the abandonment of numerical notation in South Arabian writing just reinforces that it was always an auxiliary system, never central to scribal practice.

A related example of parallel modalities comes from a very interesting Greek text, Fort. 1771, found among the Persepolis Fortification Archive, analyzed and curated at the Oriental Institute of the University of Chicago (figure 6.12). The Archive is one of our most precious bodies of knowledge about economic and social life during the Achaemenid empire centered in Persia but spreading from the Balkans to the Indus Valley. At Persepolis, the ceremonial capital of the empire founded by Cyrus the Great and Darius I in the late sixth century, tens of thousands of economic and administrative texts from around 500 BCE have been found, mostly in Elamite cuneiform and in Aramaic, along with a few unusual texts in other languages. Fort. 1771 is the only Greek text in the Archive, at a time when trade and conflict between Greeks and Persians was common, but not particularly involving Persepolis, which was a new city at the time.

Figure 6.12

Fort. 1771, Greek tablet from the Persepolis Fortification Archive, ca. 500 BCE (courtesy of the Oriental Institute of the University of Chicago)

ΟΙΝΟΣ ΔΥΟ ΙΙ ΜΑΡΙΣ

oinos duo II maris

wine two II maris

“two (2) liquid measures of wine”

But, as Matt Stolper and Jan Tavernier (2007) rightly note, this is not some misplaced text that just accidentally happened to fall out of some Greek merchant’s satchel. It has an Elamite seal on one end, deals with a commodity of wine, and uses the Persian unit maris to measure it, in common with many of the other texts of the archive. The language of the text may be different, but the function and the information system are the same. I give here just the text on the obverse of the tablet, “oinos duos II maris.” Once you know that maris is a unit of liquid measure (equal to about 9.3 liters), you need almost no Greek to read it as two units of wine. As with the South Arabian text from Sirwah, it looks like the number word two is followed by two vertical strokes as a sort of redundancy.

How was this text meant to be read and understood—and how should we think about the choice to inscribe two vertical lines below the Greek word duos? Unfortunately, we don’t have a Persepolis Manual of Style to tell us how to read and write numbers, and neither did the scribe who wrote it at the time. Here, aesthetic preferences don’t seem a plausible explanation—this is not a beautiful display text. In an excellent and underappreciated recent article, Flavia Pompeo (2015) begins with the observation, first made by Schmitt (1989), that the two vertical marks seem to have been added between duos and maris shortly after the rest of the text had been written, judging by the spacing, to clarify for readers who may not have been fluent in Greek. This seems right—it’s on its own line, rather tightly placed between the lines above and below it. But Pompeo then notes that even the two marks are not self-explanatory. Perhaps they are Greek acrophonic numerals: two vertical strokes would be 2 in this system, which I discussed earlier. Or perhaps they are meant to be Aramaic numerals: Aramaic would also use two vertical strokes. Or perhaps they were intended as linear models of the Elamite or Babylonian cuneiform numerals (two lines standing for two wedges). Or perhaps they are just two general tally marks, not really intended as part of a specific notation but intended to be readable by just about anyone. We’ll probably never know, but it bears on the question of who exactly added them, and why. Were they added by a thoughtful Greek-speaking scribe as a sort of afterthought? Alternately, perhaps a bilingual Elamite or Persian scribe thought to clarify the text for readers who might not have been familiar with the word duos. Perhaps the marks were part of an administrative practice by which the secondary notation was added as a security measure or a kind of reckoning tool for when the two maris reached some destination.

One final twist: imagine, hypothetically, that the marks were there but the word duos was not. Would we then be so confident that two random lines meant the number 2, as opposed to two instances of the letter I, or some incomplete letter? Epigraphers often struggle to clearly identify marks like these as numerals, because they’re open to so many other interpretations. It’s probably right to think that the two marks were put there to clarify the duos, but their clarity derives in part from the fact that the lexical and notational numerals are both there in parallel, not just one or the other.

One final example comes from the earliest period of the Aramaic script tradition as used in Assyria in northern Mesopotamia (figure 6.13). The bronze Assyrian lion weight BM 91220 from Nimrud found by Sir Austen H. Layard in the 1840s bears an early Aramaic inscription; it probably dates from the reign of the emperor Shalmaneser V (726–722 BCE). The social context of the period was complex and multicultural. At the time, the Aramaic language and script were relative latecomers to the environment, having only recently been elevated to the status of lingua franca of the Assyrian empire under Shalmaneser’s predecessor, Tiglath-Pileser III (745–727 BCE). It was widely spoken, but still only tentatively a prestige language. The lion weight, one of sixteen Layard found, is notable in that it bears three inscriptions, two of which are Aramaic and one purely numerical (Fales 1995).

Figure 6.13

Aramaic lion weight, Nimrud, ca. 725 BCE (BM 91220; image © The Trustees of the British Museum)

Right flank:

mnn —|||| b zy `rq`

mina-PL 15 by the land

“15 minas, by the standard of the land”

Right base:

[]mšt `šr mnyn [b zy] mlk

five ten mina-PL [by the] king

“fifteen minas, by the standard of the king”

Left flank:

|||||||||||||||

On the right flank, the weight of the object, “15 minas” is expressed in the Aramaic decimal numerical notation: a sign for 10 followed by five signs for 1. Below it, on the right base, the amount is written out in Aramaic words: khamshat ashar, five plus ten. Finally, on the left flank, there are fifteen ungrouped vertical strokes, a sort of tally.

What motivated the triplication of the value of 15? Was it in part so that the value could be seen from both left and right sides? Perhaps, but then why write it three different ways? Could it be to prevent fraud, as in check-writing practices? Perhaps, but this is an inscription on bronze, and hardly easy to alter—and in any case, its weight would give the alteration away. More likely, some of the users of the weight would not have been literate in Aramaic, and might not have been able to read either of the inscriptions on the right side. The tallies serve a clear and useful purpose in this case—they are probably readable by just about anyone. Even those literate in Aramaic need not have known 15 as a combination of a horizontal 10 plus five vertical strokes. Aramaic and Phoenician writing had been around for a while, but the lion weight inscription is actually the very earliest example of an unambiguously Aramaic numerical notation. In other words, the numerical notation was a novelty in the newly Aramaeized Nimrud of the mid to late eighth century. Thus, in choosing three modalities—one using the novel Aramaic numerals, a second using Aramaic words readable by anyone literate in the script, and a third using a tallying system accessible to anyone—the scribe was considering the ability of the inscription to be readable by all its potential audiences.

Code choice

So far I’ve discussed blended modalities, where numerical notation contains a cue or index to a lexical reading; hybrid modalities, where a single number is written using two (or more) distinct notations, one lexical, the other graphic; and parallel modalities, where the same number is written two or more times in different modalities. But what about the simplest case of all, texts that use number words in one set of contexts and numerical notation in another? These we might simply classify, in a sociolinguistic model, as an example of code choice—we select an expression based on some goal, interest, or principle.

Let’s return to the sample sentence advocated in the Chicago Manual of Style (figure 6.2, above), the almost-baffling “Five hundred and ninety-three men, 417 women, and 126 children under eighteen, besides 63 of the crew, went down with the ship.” In this example, “Five hundred and ninety-three” starts the sentence, and the norm is that sentences start with words, not numerical notation. One of the rationales for this principle, whether the original explanation or not, is that sentences should start with a capital letter, as a way of visually marking sentences, and, after all, there are no capital numbers. In contrast,5 417 and 126 are three or more digits so should be written with numerals, and while 63 should be written out in words by the style guide’s rules, keeping it in the same style as the previous two figures of persons makes some sense. Eighteen, however, should stay as a word because it’s a relatively low number, or perhaps because it’s an age rather than a count of people. You may disagree with any of these decisions and might have made them differently. That is the whole point of an agency-based account of numerical variation. But the writer undeniably expected all fluent readers of English to be able to quickly read and understand the sentence. However, if you asked the average reader thirty (or 30) seconds later, they likely couldn’t tell you which was expressed in which format.

The Etruscan numerals provide a fascinating case of variation in code choice in part because even though Etruscan script is readable, the Etruscan language is poorly understood due to a lack of known relatives; but numerical variation in texts helps us to improve that understanding. It’s not an Indo-European language like Latin and the other Italic languages, but coexisted on the Italian peninsula for centuries until finally going extinct probably in the first century CE. The Etruscan numeral words, in particular, have been the source of some perplexity for decades because in the absence of clear context, it is difficult to assign specific numerical meanings to specific words. The problem is made even more extreme in that, of around 10,000 extant Etruscan inscriptions, around 200 write numerals using a numerical notation very similar to, and in fact ancestral to, Roman numerals, which is completely understood, but only around 40 known inscriptions appear to have numerical values (often ages of deceased individuals) written using number words (Kharsekin 1967). Because the numerical notation can be read precisely, and because many of the numerals designate age at death on tombs and sarcophagi, Kharsekin (1967) a half-century ago was able to evaluate most of the lexical numerals by correlating “age curves” of the graphically inscribed tomb texts, whose reading was secure, with those with lexical numerals, which were potentially ambiguous. But we don’t know why writers chose lexical expressions in around 15–20% of the available inscriptions. The variation here is useful to us, as modern decipherers, but otherwise not really analyzable.

The chief remaining challenge was that, while the numerals for four and six could be identified as huθ and śa, which was which remained in dispute, with important Etruscologists on either side of the issue. In 2011 a study in the journal Archaeometry shed light on the issue using variation across modalities in a very unusual corpus of “texts”: Etruscan six-sided dice (figure 6.14) (Artioli, Nociti, and Angelini 2011).

Figure 6.14

Etruscan numeral words, dice, and numerical notation

Of 93 Etruscan surviving dice, 91 have one through six pips on the faces, as is typical of modern dice, but two very remarkable examples, instead, use the first six Etruscan numeral words. For the 91 dice with pips, there were only two patterns by which the numbers were distributed on opposite faces. Dice where 1 opposes 2, 3 opposes 4, and 5 opposes 6 were more prominent in dice from the eighth through fifth centuries BCE, while others (as in modern dice) opposed 1 to 6, 2 to 5, 3 to 4, typically later examples from the fifth through third centuries. But note that in both of these formats, 3 opposes 4, and in both of the two dice with lexical numerals, śa opposes ci (three), so śa is overwhelmingly more likely to be 4 rather than 6, and huθ must be 6 rather than 4. Representational variation allowed the solution of a century-old linguistic problem.

Earlier, I discussed the Greek acrophonic numerals. However, the acrophonic numerals were used only for a small range of domains and functions in antiquity (Tod 1979; Threatte 1980). For instance, they were never used to express ordinal numbers, only cardinal values. They were never used to express dates, or the ages of the deceased in funerary inscriptions. They were virtually never used in connected prose at all, such as decrees. In these and other contexts, numerals were expressed lexically. There was nothing to prevent acrophonic numerals from being used in these contexts, except, perhaps, the fact that numbers, like texts themselves, were generally read aloud at this period; i.e., silent reading was not the norm in classical antiquity (Saenger 1997).

After about 325 BCE, another numerical notation became common: the alphabetic numerals, in which the 24 letters of the alphabet plus three additional letters were assigned the values of 1–9, 10–90, and 100–900 (Tod 1979; Johnston 1979). Alphabetic numerals had been around for several hundred years, having an ultimate origin in seventh-century BCE contact with Egypt (Chrisomalis 2003). But this did not affect the widespread use of lexical numerals in connected prose. What it did do, though, was encourage the Greek writing of number words in descending order in the units and tens (compare “twenty-four” with “four-and-twenty”). Prior to around 325 BCE, descending order from the highest to the lowest power was rare; afterward, it was common, a fact that Keyser (2015) correctly attributes to increasing numeracy and familiarity with numerical notation in that period. In other words, as writers began using numerical notation more frequently alongside lexical numerals, linguistic structures that were consonant with those systems increased, and ones that contradicted them decreased. There is nothing requiring such a transition—German, to this day, still places units before tens in its number words, but no one seriously thinks that German speakers are less numerate than others. But it illustrates the ways in which numerical modalities, even when not used for the same purposes or in the same contexts, can influence one another.

In chapters 3 and 4, I discussed the transition from Roman to Western numerals in terms of an extremely gradual, socially motivated replacement influenced by the increase in literacy associated with printing practices. We saw that Western numerals were adopted in some rather haphazard ways throughout early printed books, and not in ways that necessarily fit the model of rationality often presumed to be the case. This situation clearly fits the model of code choice. To this, we then need to add the abundance of texts that use Roman numerals along with alternatives such as Western numerals or number words for different purposes. This will help us to think of the rise of Western numerals not only in the context of the Roman numerals—their purported competitor—but also in the context of the lexical numerals, which never went away.

Nor did the mixing of numeral systems end with the so-called transition to Western numerals. A South Carolina eight-dollar bill from 1776 (figure 6.15), from the same generation of early American life when John Jacob Astor wrote his check with which we started this chapter, exhibits extraordinary variation and complexity in code choice. Roman numerals, Western numerals, and English numeral words are not in competition with one another, but rather complement one another in producing a multimodal text rich with numerical information.

Figure 6.15

South Carolina eight-dollar bill, 1776

Some of this variation in parallel modalities may have served to clarify or prevent alteration (VIII vs. Eight, 13 vs. Thirteen), but clearly this is not a full explanation. Rather, viewing this text and others as complex fusions of different modalities—not employed haphazardly, but rather chosen by those who had the agency to select them—and reflecting on why they made the choices they did, allows a richer understanding of all sorts of numerical material around us. For instance, the use of Roman numerals for the amount in dollars—a purely American currency—versus Western numerals (at the bottom, center) for the amount in pounds, takes on a new meaning when we see that this bill was produced just a few months after the signing of the Declaration of Independence. But the number 2308 handwritten on the bill as its serial number is, understandably, in Western numerals—quite a convenience when some poor writer had to annotate each bill in turn.

Conclusion

Sociolinguists often talk about code switching or code mixing, both in discussions of verbal speech and in analyses of texts, because people do not always speak only in one language at any given time. Some sociolinguists use code switching to refer to switching across different phrases or sentences, and code mixing to refer to switching languages within an utterance or phrase. Under this model, parallel modalities most closely resemble code switching, while blended and hybrid modalities are closer to code mixing. But more important than a typology of different kinds of language mixing as they relate to notations is identifying the cognitive, social, and textual explanations for the phenomenon. Multilingual people, sometimes for reasons of imperfect fluency but most often for reasons of style, emphasis, or identity marking, engage in code mixing and code switching regularly (Myers-Scotton 1993). We might want to know what we can learn from these sorts of instances—how choices are made and deployed, and why.

Those of us who study writing and literacy, similarly, are used to thinking about bilingual and trilingual texts. Most schoolchildren still learn about the decipherment of Egyptian hieroglyphs using the example of the Rosetta Stone written in two languages (Greek and Egyptian) and three scripts (Greek, demotic, and hieroglyphic), and the artifact has come to serve as a materialized metaphor for translation and decipherment in general. There is also a vast and growing literature on the multilingualism of individuals and groups in antiquity, with code choice always being present as writers used different languages promiscuously across a range of genres (Crisostomo 2015; McDonald 2015). Who writes and for whom? What code(s) do they use and why? These questions motivate many authors in Near Eastern studies, classics, and numerous other historical disciplines. In contrast, number systems are seen almost as needing no decipherment, needing no bilingual texts, needing no explanation—in other words, not of much interest except for mathematicians. After all, even where there are undeciphered scripts, we can often read the numerals. The Maya numerals were deciphered almost a century before the phonetic aspects of the Maya script were revealed. The Minoan Linear A numerals were fully understood at the time of Sir Arthur Evans (1909), and are still the only part of that still-undeciphered script about which there is some agreement.

We can see now that this perspective is a mistake. There is almost always some variation, and thus some choice, as to how to write a number, lexically, graphically, or using some combination of principles. Even a text written in a single language, and intended for a narrow audience, affords the writer opportunities to select some combination of expressions that suits a set of needs and goals. Very often, multiple and sometimes competing principles and interests will be at play, making the explanation of particular choices all the more fluid. Some of the explanations for these choices include, but are not limited to:

  • Multilingualism: Because not every reader can be assumed to be fluent in a particular language or script, numerical notation serves as a translinguistic bridge.
  • Clarity: Because numerical notation is visually salient within a text, adding numerals provides clarity and makes the task of identifying and reading numbers in a text easier.
  • Security: Using two distinct representations of the same number provides security, in case of a scribal error or alteration, that the number represented is that intended.
  • Familiarity: Rare numerical notations—very new or obsolescent, e.g.—may need explication or linguistic redundancy.
  • Conciseness: Mixing modalities can in some cases be shorter than either modality used alone (12 billion vs. 12,000,000,000 vs. twelve billion).
  • Aesthetics: Code choice involves aesthetic as well as practical considerations: prestige, modernity, playfulness.6
  • Audience: Numerals are chosen that are likely to be understandable or valuable to particular imagined readers of the text, as opposed to others.

Numeration is thus not a mere appendage to scripts. It displays remarkable variation in form and function in practically every literate society. It is not just “math stuff”—in fact, most of it isn’t for math at all, but simply solves the problem: how to write a number? Because there are multiple ways to write numbers in almost every tradition, numeration is a rich source of insight into how premodern writers thought about problems of representation. It also helps us think about reading and readership in environments where familiarity with number symbols was almost surely more widespread than full literacy.

Once we can recognize the complex set of factors that motivate these choices, we can return to the broader picture, and to larger historical time scales, as we look for patterns and processes that apply on a comparative basis across multiple millennia. The micro-scale choices made at the level of individual writers come back to inform how we think about much longer-term patterns of change across deep time.

Notes

  1. 1.  As mentioned in chapter 2, Indian users of Western numerical notation usually divide long chunks of numerals differently than Western writers do—where I would write 100,000, a Hindi speaker might write 1,00,000.

  2. 2.  The cover image of this book, a watch designed by Jean-Antoine Lépine in 1788, is a similarly playful blended modality of two numerical notations (Roman and Western numerals). Thompson (2008: 100–101) suggests that this may have been to keep the watch face uncluttered, as each number takes at most two digits.

  3. 3.  There is probably no better way 4 U 2 seem hopelessly old than 2 try 2 imit8 the inventive online linguistic practices of youth, which change extremely rapidly (Tagliamonte 2016).

  4. 4.  Compare with modern Arabic ستة افال stt alaf.

  5. 5.  Here I’m conforming to the style guide—rather than just start the sentence “417,” which wouldn’t do, I am enjoined to rewrite it so as not to have to violate a rule.

  6. 6.  In June 2020, the technology entrepreneur Elon Musk and his partner, the musician Grimes, had a baby boy. Originally the plan was to name the young lad X Æ A-12. It turns out, though, that this violates California law which mandates that numerals cannot be used in a legal name. Their solution was to name the child X AE A-XII instead—using Roman numerals, because they apparently count as letters rather than numbers. But when asked about her choice, Grimes indicated that the new name looked better anyway.