The ’60s attitude in education that grammar “doesn’t allow us to express our inner souls,” that grammar is in fact a class warfare tool used by the ruling elite to oppress the lumpen proletariat, may have placed the English language beyond redemption. This suits some linguists fine.
—CHRISTOPHER ORLET, “Our Inarticulate Future,”
The Weekly Standard
Why are linguists so hell-bent on placing the English language beyond redemption? Is the discipline of linguistics, or the school of “descriptivism,” ruining our linguistic savvy, our ability to communicate? Who are these linguists, anyway, and why do they hate our language so much? Is it possible that people who have chosen to dedicate their entire lives to studying language—the professional linguists Christopher Orlet disdains in the quotation above—love language, too, and, if so, how could they see it so differently from the way the sticklers do?
In 2006, Louann Brizendine published a book that tapped straight into readers’ intellectual id. And no, “intellectual id” is not an oxymoron. There are some things that people seem desperately eager to believe, and they’re delighted to find those things “confirmed” by a piece of scholarly-seeming work. Brizendine’s The Female Brain was just such a hit.
In the book Brizendine claimed, among other things, that women speak 20,000 words a day, while men utter just 7,000. It was all part of her larger thesis that women’s brains work differently from men’s. And it was just what many people—especially henpecked husbands, I suspect—wanted to hear. The British Daily Mail wrote, “It is something one half of the population has long suspected—and the other half always vocally denied.” A journalist blogging at The Washington Post wrote, “Women talk too much, and men only think about sex … you need a Ph.D. to figure that out?” (Brizendine has an M.D.) The claim was touted prominently on the book jacket and was an Internet sensation.
Something didn’t sound right to Mark Liberman, a linguist at the University of Pennsylvania, though. Women speaking three times as much as men? Though his field is phonetics, Liberman also keeps a popular blog, Language Log, where he and about a dozen other linguists regularly post on general-interest language topics that crop up in the news.
Had Brizendine done some new research? Or had Liberman missed some past research that found this huge disparity in men’s and women’s speech? He looked in the back of Brizendine’s book—one-third of the text is footnotes, lending it a weighty air—and found only one reference for the 20,000-word claim: a self-help book called Talk Language: How to Use Conversation for Profit and Pleasure, by Allan Pease and Alan Garner. Pease and Garner had not done any original fact-finding research on the subject themselves, nor did they cite anyone who had.
Liberman dug around further. Had anybody else done the research on how much women and men talk? Sure enough, he found that they had. Unsurprisingly, there’s a huge amount of variation in talkativeness. Some people, male or female, never shut up, and some rarely talk at all. But as for average differences between the sexes, the studies Liberman turned up showed either no difference at all or a small one—in favor of men. Yes, according to some studies, men talk (on average) slightly more than women. Liberman has not yet found any study showing women talking significantly more, though he’s asked his blog’s readers to send him any, promising to publish the results. None has shown up.
Confronted with this, Brizendine hedged. She claimed that the Pease and Garner self-help book in her footnotes was meant to be “further reading,” not a scholarly citation. She claimed an unfair backlash against her ideas: “It’s very politically incorrect to say there are any gender differences.” She backtracked to say that women produced more “communication events”—gestures, facial expressions, and whatnot—than men. But in the end she promised to take the bit about female logorrhea out of future editions of the book. Well she might. A study published in Science the next year, 2007, was the first to track a large number of people (210 women, 186 men) throughout the day in both the United States and Mexico. Both sexes used about 16,000 words a day, though on average, in this study, the women used 3.5 percent more words, a statistically trivial difference. Brizendine had said women talk 185 percent more than men (20,000 words against 7,000).
Of course, Brizendine’s dud “fact” was already out of the gate, racing around blogs and book reviews. As the book went into multiple translations, foreigners latched on as fast as English speakers had. (“Warum gebrauchen Frauen 20 000 Wörter am Tag, während Männer nur 7000?” [“Why do women use 20,000 words a day, while men only 7,000?”], as Das Weibliche Gehirn’s German publisher touted the claim on Germany’s Amazon.de.) It is likely that, despite Liberman’s efforts, it will become one of the early twenty-first century’s favorite factoids, something that everyone “knows.”
Liberman was at it again in May 2009. First George Will, the conservative newspaper columnist, then Stanley Fish, a professor of English with a perch at The New York Times, and finally Mary Kate Cary, a former speechwriter for George H. W. Bush, all published columns or blog posts noting that the new Obama presidency was looking unduly egocentric. Specifically, all three claimed that they were starting to hear the word “I” coming out of Barack Obama’s mouth more often than seemed proper for a man whose unofficial campaign slogan had been “Yes we can.” It was supposed to be about the movement, not the man.
Liberman did something that had never occurred to Will, Fish, or Cary: he checked the easily available public record. He looked at the first press conferences given by Obama and his two predecessors. The word “I” made up 4.5 percent of the words George W. Bush used in his first two press conferences; Bill Clinton’s rate had been 3.9 percent. Of Obama’s words in his first press conference, just 2.9 percent were “I.” When Cary piled on with her column about how her old boss, the first President Bush, had been far too genteel to prefer the word “I,” Liberman went on to show that the elder Bush, too, used it far more often than Obama had. It seems that Obama is unusually I-shy for a modern president.
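Checking a claim like this requires little more than a transcript and a word count. Below is a minimal sketch, in Python, of that sort of tally; the file names and the crude tokenization are my own assumptions for illustration, not Liberman’s actual method.

```python
import re

def first_person_share(path):
    """Return the percentage of a transcript's words that are the bare pronoun 'I'."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    # Crude tokenization: word-like runs, so contractions such as "I'm" stay separate from "I".
    # A fuller count would have to decide how to treat "I'm," "I'll," and so on.
    words = re.findall(r"[A-Za-z']+", text)
    i_count = sum(1 for w in words if w == "I")
    return 100.0 * i_count / len(words)

# Hypothetical usage with invented file names:
# print(first_person_share("obama_first_presser.txt"))
# print(first_person_share("bush_first_pressers.txt"))
```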
But Liberman has only a blog, albeit one that gets about 10,000 visits a day; the other three wrote in The Washington Post, The New York Times, and the website of U.S. News & World Report. No points for guessing which version of the “facts” more people will read and remember. Liberman may not change the public perception, but he patiently chugs on, fact-checking columnists’ and pundits’ claims about language, motivated by the sometimes naive-looking belief that the truth matters and that in language it is easily discoverable, right there on the record, if only people care enough to actually look.
To get a better idea of what linguists actually do, I thought it was a good idea to observe one in his native habitat. Liberman meets me in his office at the Linguistic Data Consortium, an independent research outfit attached to the University of Pennsylvania. The LDC was set up with funding from the Pentagon’s famed Defense Advanced Research Projects Agency (DARPA). On my brief tour around, few people look up from their work. A young woman transcribes an audio file into written Chinese. An Indian programmer and a Tunisian partner work on a computer program that will automatically analyze any Arabic word of a text: click on a word, and the program will tell you the part of speech, case (subject, object, possessor, etc.) or conjugation, attached pronouns and prepositions, proper pronunciation (short vowels are not written in Arabic), and so forth. Reading in Arabic is difficult because many of these things are clear only from context. Humans can use their common sense. The program uses a sophisticated statistical analysis of the words around it to decide whether ktbt should be read as katabtu, “I wrote,” katabat, “she wrote,” katabta, “you [male] wrote,” or katabti, “you [female] wrote.” The program could eventually prove useful in improving the still-rocky quality of computer translation, a field where breakthroughs to higher quality without human assistance always seem five years away.
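The LDC’s analyzer is far more sophisticated than anything that fits in a few lines, but the core idea, scoring each possible reading of an ambiguous written form by how well it fits the neighboring words, can be sketched roughly. Everything below (the tiny co-occurrence table, the transliterated context words) is invented purely for illustration and is not the consortium’s code or data.

```python
# Toy sketch of context-based disambiguation for the written form "ktbt".
# The candidate readings are real; the counts below are invented placeholders
# standing in for statistics a real system would learn from a large corpus.

CANDIDATES = {
    "katabtu": "I wrote",
    "katabat": "she wrote",
    "katabta": "you (male) wrote",
    "katabti": "you (female) wrote",
}

# Pretend co-occurrence counts between a reading and a nearby word.
CONTEXT_COUNTS = {
    ("katabat", "hiya"): 40,   # hiya = "she"
    ("katabtu", "ana"): 35,    # ana = "I"
    ("katabta", "anta"): 30,   # anta = "you (male)"
    ("katabti", "anti"): 28,   # anti = "you (female)"
}

def best_reading(context_words):
    """Pick the candidate reading whose toy co-occurrence score with the context is highest."""
    def score(reading):
        return sum(CONTEXT_COUNTS.get((reading, w), 0) for w in context_words)
    return max(CANDIDATES, key=score)

# Hypothetical usage: the sentence also contains the pronoun "hiya" ("she").
print(best_reading(["hiya", "risala"]))   # -> "katabat", i.e., "she wrote"
```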
All around the LDC, hard drives are scattered like bags of chips around a sophomore dorm room. Stacks of CD-ROMs and DVDs a yard high sit next to machines that will mass-produce copies of the LDC’s products for use by other research outfits. In the computer room, servers lie in horizontal stacks in tall racks, while the fans that cool them create a surprisingly loud hum. Eight or nine small screens show live television in Arabic, Chinese, and other languages.
Liberman explains the genesis of the LDC, going into more detail than a casual visitor needs but showing the pride of its director. DARPA is famous for funding blue-sky research into technologies so far ahead of the curve that its projects either fail spectacularly or produce great technological leaps forward that the market would never deliver. DARPA’s interest in linguistics is obvious; machine translation and “defined-item recognition” (such as finding the name “Osama bin Laden” on blogs and television broadcasts and in wiretaps) are clearly interesting to the Pentagon. But the center does no classified research; by the terms of its grant, the LDC must share its work.
Over the course of lunch, I find Liberman open-minded about everything we discuss, though he is plenty opinionated. When the famous debate over Whorfianism (which we’ll return to in chapter 4) comes up, he makes a mini-case against it, then refers me to some scholars who support it. He speaks so slowly that when I listen to my tape and play it on fast speed, I sound like an auctioneer and he sounds normal. In his writings on Language Log, he slowly unwinds thoughts until he has trawled around and gathered enough information to support what he wants to say. The overwhelming impression is that of a man who is relentlessly empirical: he doesn’t form an opinion first and then scour for facts that support it. As with the Brizendine and Obama-“I” episodes, he looks at the facts first, does the math where appropriate, and determines what truths the facts support.
For those with conservative attitudes to language, this careful empiricism might come as some surprise. The usage wars that have pitted sticklers, or “prescriptivists,” against professional language scholars, almost all “descriptivists,” have had both an epistemological and a political cast. The sticklers we met in the last chapter want to tell people how to speak and write. Linguists want to discover how people do speak and write and deduce “rules” from observation or description. For sticklers, this is an intellectual mess.
Take Mark Halpern, a computer programmer by trade who has also written about language for The Atlantic and The Vocabula Review and published a book in 2008 called Language and Human Nature. Halpern has mixed it up with Liberman on Language Log; when Liberman gently mocked Language and Human Nature, Halpern sent in a long reply that Liberman duly posted on the blog. The two men are good examples of their respective sides of the debate.
Linguists, he feels, aren’t really scientists, methodical observers. Instead, he smells an agenda: Halpern liberally (or should that be “conservatively”?) heaps the label “progressive” on linguists. In Halpern’s mental caricature, linguists think nobody should condemn anyone for fear of hurting their feelings; linguists approve of any and all language production, including “mistakes,” and this is because they are in league with the other softheaded left-wingers who make up the academy. Linguistics is bound up with postmodernism and the denial of the existence of objective facts. This enables “linguistic decadence,” which paves the way for atrocities such as anti-Semitism and communism to be wrapped in healthy-looking labels such as virile nationalism or communitarian spirit. When people no longer pay proper attention to language, Halpern argues, they are more likely to be suckered by politicians.
Halpern’s writings make clear that he is a political conservative. His criticism of academics as seeking to destroy the very notion of objectivity is an old saw of the political Right. Oddly enough, though, this view of linguists as destroyers of healthy objective thinking is not limited to one-half of the political spectrum.
David Foster Wallace closely echoed Halpern’s critique in a 2001 article in Harper’s entitled “Tense Present: Democracy, English, and the Wars over Usage,” an article widely read and quoted because of its author’s dazzling writing in the novel Infinite Jest and in his essays and short stories. Wallace was clearly not a political conservative himself, but he made many of the same mistakes Halpern does when talking about linguists and linguistics.
Wallace, who admitted to being a usage stickler himself, from a proud family of the same, argued many of the same things that this book does: that language controversies are political ones too, and that identity politics lies at the heart of the project to promote a language standard.
Wallace taught composition at Illinois State University in Bloomington-Normal, an experience that convinced him that minority students—coming, as they usually did, with a different language or a distinct dialect such as Black English*—needed standard English even more than whites. His sensitivity to this makes him stand out among sticklers, as does his self-awareness:
we SNOOTs know when and how to hyphenate phrasal adjectives and to keep participles from dangling, and we know that we know, and we know how very few other Americans know this stuff or even care, and we judge them accordingly.
When Wallace says this, he’s not paying himself a not-so-subtle compliment, as Truss does. He uses “judge” here to mean no healthy thing. Sensitivity to the political content of sticklerism pervades the essay.
Worse, Wallace somehow confuses linguistics with the humanities. The guiding principles of those two modern disciplines are utterly different. Yes, English and comparative literature departments focus heavily on race, class, gender, colonialism, paradigms, and the like. Wallace doesn’t seem to have talked to many colleagues in the linguistics department, though, because very few linguists spend their research time thinking along the lines Wallace condemns.
Perhaps the low point of the Wallace essay comes when he writes:
Descriptivism so quickly and thoroughly took over English education in this country that just about everybody who started junior high after c. 1970 has been taught to write Descriptively—via “freewriting,” “brainstorming,” “journaling,” a view of writing as self-exploratory and -expressive rather than as communicative, an abandonment of systematic grammar, usage, semantics, rhetoric, etymology.
Wallace has a point here, but it has nothing to do with linguistics. It seems he really means to attack the progressive education movement, which sought (with decent intentions but often misguidedly) to increase the focus on how much students enjoy education. Make it entertaining and personal, and students will learn. But linguists had virtually nothing to do with throwing out the rules of grammar and usage in favor of “freewriting” or “journaling.” Wallace was not a journalist, but he would have done well to pick up the phone and call a linguist: if he had chosen a random member of the Linguistic Society of America, he would have been a lot more likely to find someone like Mark Liberman than some softheaded proponent of making kids feel good through “freewriting.” In the same sentence, Wallace describes linguists as “doctrinaire positivists” (true of linguists, in that they think scientific truth is discoverable) and having “their intellectual roots … firmly in the U.S. sixties” (untrue, at least as far as most of their scholarship goes).
So if they’re not really wooly 1960s let’s-all-get-along types, how do linguists actually think? Talking about his discipline, Liberman makes a useful comparison to economics. At one end of the spectrum in both fields are the highly data-driven types, the quants whose number crunching and use of computers mean articles filled with formulae utterly forbidding to the outsider. At the other end of both disciplines are the theorists. In economics, this means those who try to come up with elegant (if jargon-laden) explanations for why economies behave as they do, while rarely getting into the data. The same is true of theoretical linguists, who do most of their writing from their studies, trying to figure out how the guts of human language, especially its syntax, work. Much of this field centers around debates about Noam Chomsky, who upended linguistics beginning in the 1950s. Many modern debates still pit passionately pro- and anti-Chomsky camps against each other.
But Chomsky’s linguistics does not spring from his politics. For decades, he has been a scathing left-wing critic of American foreign policy in Southeast Asia, Latin America, and the Middle East. But those who might expect that this would lead to antiauthoritarian permissivism in his linguistic work will be disappointed. A typical passage reads not like the postcolonial theorist Edward Said or the postmodern critic Michel Foucault, but like this:
(64)
a. this book is too interesting to put down without having finished
b. this book is too interesting [O [PRO to put t down without PRO having finished e]]
The structure (64b) is an instance of (57) with α=O; that is, t is the trace of the empty operator O moved to COMP in the syntax, and e is the parasitic gap licensed by the variable t. If there were no such operator O, the structure would be barred for the reasons already discussed: e would be an NP-trace Ā-bound by this book (structurally analogous to (59b), in which α locally Ā-binds t and e), rather than being assigned the status of a parasitic gap, as it is.
I don’t know what this means, but I take comfort that several linguists tell me they wouldn’t, either. For linguists who like to leave campus, this is the kind of thing they forget after settling into a nontheoretical field following graduate school.
Liberman, as an applied phonetician who began his career at Bell Labs, is one of those who keeps a foot outside the academy. He says that Chomsky once told him that it wouldn’t matter a whit to have descriptive grammars of all the world’s languages (and that one might as well survey the location of every blade of grass on MIT’s campus). For a data-monger like Liberman, the prospect of so much raw information would be drool-inducing.
But whether theoretical or data-crunching, linguists have beliefs about how language works that are as sharply held as those of any Lowth, Fowler, or Truss. And they are just as frustrated when they see others saying things that to them are just plain wrong. Why did Liberman spend so much time debunking Louann Brizendine’s “fact” of 20,000 words a day for women against 7,000 for men? Because getting the facts right is important, and the claimed “fact” is used to buttress a theory with important social implications—that women’s brains differ significantly from men’s. Liberman doesn’t disagree that women’s brains might well be different; he just objected to seeing bogus information from his field wielded to prove it. If fun facts are fun only when they’re true, important facts are important only when they’re true.
Linguistics does not have its roots, as Wallace thought, in the 1960s. It goes back much further than that, and perhaps the best introduction to the discipline is a look at one of its most famous subfields, historical linguistics. Historical linguists have been responsible for some of the field’s most breathtaking intellectual achievements, including the reconstruction of languages that have not been spoken for thousands of years and were never written down.
The ancient Greeks and Romans thought a lot about language, putting their thinking to use mainly in developing rhetoric, logic, and poetics. They did little studying of how their languages had come to be, however. Elsewhere, the Indians took their Sanskrit language more seriously than possibly any culture in the world, believing that the exact form of its sacred texts gave access to the divine. The scholar Panini, from what is now Pakistan, composed a grammar of Sanskrit that consisted of almost four thousand rules, one of the most extensive grammars of any language ever published—and this in the fourth century B.C.
After Panini and the classical era, thinking about language largely went downhill. The Arabs made useful and interesting studies of Arabic but rarely looked abroad. (They were largely codifiers of classical rules, highly successful prescriptivists in their time—one of the roots of the Arabic “diglossia,” as we will see later.) Christians, meanwhile, believing the literal truth of the Babel story, sought to discover the original human language. Some thought it to be Hebrew; others tried to classify the world’s languages in groups named after the sons of Noah. One scholar, Johannes Goropius Becanus, thought that the world’s first language was his own language, Flemish. (The Language Log bloggers, including Liberman, named an award for Becanus, the “Becky,” for outstandingly ill-informed linguistic pontificating. Its first recipient was Brizendine.)
But modern historical linguistics would have to wait almost two millennia to come into its own. It had not a Fort Sumter moment but a Christopher Columbus moment. William Jones, a lawyer, had been a British colonial official in Calcutta. But he was also an amateur linguist, and in 1786 he gave a lecture to the Asiatic Society in Calcutta that spawned a thousand voyages of linguistic study after him. Having studied the classical European languages and then Sanskrit in India, he noticed something striking:
The Sanskrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and in the forms of grammar, than could possibly have been produced by accident; so strong indeed that no philologer could examine them all three, without believing them to have sprung from some common source, which perhaps no longer exists: there is a similar reason, though not quite so forcible, for supposing that both the Gothic and the Celtic, though blended with a very different idiom, had the same origin with the Sanskrit; and that the old Persian might be added to the same family.
Language may not be in decline, but they don’t make British civil servants like they used to. The lawyer and amateur linguist had just taken the biggest step yet in the invention of modern linguistics. Greek and Latin pater (“father”) looked too much like Sanskrit pitar, mater too much like Sanskrit matar; Jones found so many such correspondences that they simply couldn’t be attributed to chance.
As with Columbus’s discovery of America, Jones may not have gotten there first. Others before him had noted the similarities among Latin, Greek, and Persian, for example. But it was Jones’s public announcement that changed the world. Further investigation would show that Hindi, Punjabi, Kashmiri, and other north Indian languages, plus Persian, some of the languages spoken in the Caucasus and Central Asia, and virtually all the European languages, were related. (Of the European languages, a very few, including Basque, Finnish, Hungarian, and Estonian, did not form part of the family.) As relatives, the members of this newly discovered family needed a common ancestor. This language, painstakingly reconstructed in ensuing years, came to be known as Proto-Indo-European.
Jones’s discovery and the quotation above are as well known to linguists as the opening of Thomas Jefferson’s Declaration of Independence is to American schoolchildren. The discovery of Indo-European led to an explosion of interest in languages and their histories. The first modern scientific linguists began to investigate just how the modern languages had emerged.
If Jones’s speech formed a foundational text for linguistics, one of its early lawgivers was Jacob Grimm—perhaps the James Madison of the discipline. In his work, he began to develop one of the tools of modern language science: the laws of language change. Grimm proved that languages don’t change randomly due to speakers’ laziness; they change systematically. Grimm noticed, in piecing together modern German words with their Gothic relatives and with Greek, that certain sounds in one almost always corresponded with a related sound in another. Where Greek had p, Germanic would tend to have f: Greek pous (“foot”) corresponds to German Fuss. Other Germanic languages share the p-to-f shift: Gothic fotus, Icelandic fótur, Danish fod, Swedish fot, and of course English “foot.” Other languages kept the p sound, like Latin (ped-); we see that old p today in “pedestrian” and “podiatry,” borrowed into English from Latin and Greek.
Grimm found many other systematic changes, such as Greek b to Old Germanic p to modern German f, and tied them together in what linguists now know as Grimm’s law. Other scholars have refined the original “law,” improving on its details, and extended that work to the other Indo-European languages, finding changes just as systematic as the ones Grimm had discovered. German and Danish scholars jumped especially eagerly into the new research.
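The regularity Grimm described is mechanical enough to check with a small lookup table. The sketch below, with a handful of well-known cognate pairs chosen by me for illustration, shows how a few of the classic initial-consonant correspondences play out; it is a toy, not a reconstruction method.

```python
# A few initial-consonant correspondences covered by Grimm's law,
# mapping the sound preserved in Latin or Greek to its Germanic reflex.
GRIMM = {"p": "f", "t": "th", "k": "h", "d": "t"}

# Well-known cognate pairs: (Latin or Greek form, English form).
COGNATES = [
    ("pater", "father"),
    ("tres", "three"),
    ("cornu", "horn"),   # Latin c spells the k sound
    ("duo", "two"),
]

for classical, english in COGNATES:
    first = "k" if classical[0] == "c" else classical[0]
    predicted = GRIMM[first]
    status = "matches" if english.startswith(predicted) else "does not match"
    print(f"{classical} -> predicted Germanic initial '{predicted}': {english} {status}")
```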
The discovery first of the huge Indo-European family, and then of systematic changes over time, made it impossible to see language the same way again. While the books of Lindley Murray and Robert Lowth were selling millions of copies by simply pronouncing ex cathedra rules—“Two negatives destroy each other”—Jones, Grimm, and their companions and followers were investigating dozens of languages across thousands of years. They were discovering rules of phonetic and grammatical mutation, finding both regularity and huge variety. The laws the early linguists discovered even allowed them to reconstruct, to a certain extent, words and stems of the Proto-Indo-European language itself, though that language had never been written down. One German scholar, August Schleicher, went so far as to write a short fable in what he said was Indo-European. Of course, true knowledge of the exact language—which was more probably a bundle of dialects and not a single standard language—isn’t possible, and Schleicher’s effort is enjoyable for his audacity if not to be trusted for accuracy.* (Other scholars have updated it as scholarship has advanced.) Today, joining together with archaeologists and historians, historical linguists continue to try to locate the original Indo-Europeans in time and space. Some posit Turkey as their home, while others make the case for southern Russia.
Whatever their disagreements, though, all historical linguists know of Jones’s speech and Grimm’s law, and the incredible intellectual achievements that followed. That Irish, French, Hindi, and Armenian sprang from a common source is not a curious fact but a commonplace one to the historical linguist. After having those stories drilled into their heads in the first year of graduate school, modern linguists find it far harder to look at any language today, whether English or Sinhalese, and see anything but a snapshot of a general, constantly mutating phenomenon called “language” at a particular place and time. While the stickler might see the misuse and gradual disappearance of “whom” as proof that education and society have been flushed down the toilet, most linguists—even though they will almost certainly use “whom” in their written work themselves—see the pronoun’s replacement with “who” as merely another step in English’s gradual shedding of case endings. In the era of Beowulf, English nouns had endings that showed what role they played in the sentence, as Latin did. But nearly all of them had disappeared by the time of Shakespeare, and a linguist would see the death of “whom” as simply the conclusion of the process.
John McWhorter, an expert in language history and change, compares language to a lava lamp. For him, as for most linguists, it doesn’t make much sense to get red in the face because the globs of goo, like language’s words and rules, simply won’t stay put. (Of course, it does make sense, but not on linguistic grounds. Prescriptive rage feels pretty good for many people.)
All this doesn’t mean that linguists want to throw out the teaching of standard grammar rules to schoolchildren, for example. But knowing what linguists know about eons of linguistic history and change really puts things into perspective: any “rule” that makes the average stickler want to commit assault with a red pen seems terribly temporary and contingent to the linguist. Studying a dozen languages or a thousand years of history will do that. The sticklers, though they grudgingly acknowledge that language changes, could do with a bit of the same perspective.
Historical comparative linguistics, which for a long time went by the name “philology,” is just the first and perhaps the most famous subfield of modern linguistics. For about a century (roughly the nineteenth), that was almost all there was. Ferdinand de Saussure, the brilliant Swiss linguist, was first known for his historical work. He hypothesized consonants, “sonant coefficients” (later known as laryngeals), in Indo-European. These sounds, articulated at the back of the throat—consonants like a rasping h or a quick stop of the airflow with the glottis—must have once existed in Proto-Indo-European, he reckoned. There was no other explanation for variations in vowels in Indo-European’s successor languages. Of course there was no direct evidence, as Proto-Indo-European had never been written down. But Saussure was proven right, decades later, by the discovery of Hittite texts, which included laryngeals.
Saussure’s theoretical work is more famous, however, leading to the rise of linguistic “structuralism.” He posited that langue, the system of a language, and parole, its external manifestation, were two sides of the same thing; one could no more separate them than one could cut just the obverse and not the reverse of a piece of paper. Linguists, to that point, had been focusing on individual words, sounds, and grammar points and their histories. Saussure got linguists thinking about how all the pieces hung together, contingent on one another, a structure. Saussure’s research led, through a series of intermediary theorists and half a century of advances, to Noam Chomsky.
Before Chomsky, psychologists, not linguists, had published the most prominent theories of how our brains and tongues work together. A dominant school was the “behaviorism” of B. F. Skinner, a psychologist (but not a language expert). Skinner posited that human language was little more than a sophisticated version of the stimulus and response he had observed in other animals. Behaviors were learned through rewards and punishments. Rats could not only learn to press a lever for food; they could progressively learn more complicated tasks if rewarded properly.
Skinner posited that the human animal also responded to stimuli. A child, responding to his mother’s praise, gradually learned to say the words that would elicit it, Skinner wrote in the 1957 book Verbal Behavior. Language was a fancy version of the same process that made a rat press a bar:
Behavior alters the environment through mechanical action, and its properties or dimensions are often related in a simple way to the effects produced. When a man walks toward an object, he usually finds himself closer to it; if he reaches for it, physical contact is likely to follow; and if he grasps and lifts it, or pushes or pulls it, the object frequently changes position in appropriate directions. All this follows from simple geometrical and mechanical principles.
Much of the time, however, a man acts only indirectly upon the environment.… Instead of going to a drinking fountain, a thirsty man may simply “ask for a glass of water”—that is, may engage in behavior which produces a certain pattern of sounds which in turn induces someone to bring him a glass of water.… The consequences of such behavior are mediated by a train of events no less physical or inevitable than direct mechanical action, but clearly more difficult to describe.
Though Skinner did not say that humans were no different from rats, the extension of his earlier work is clear: he wrote that “the results [of his work] have been surprisingly free of species restrictions.”
Chomsky would have none of it, writing a devastating review of Skinner’s book that killed the behaviorist view of language with one stone to the temple. (In 1959 Chomsky himself was not yet Goliath.) Chomsky pointed out that while certain linguistic reactions to a stimulus were predictable, others were nothing of the sort.
A typical example of stimulus control for Skinner would be the response to a piece of music with the utterance Mozart or to a painting with the response Dutch. These responses are asserted to be “under the control of extremely subtle properties” of the physical object or event (108). Suppose instead of saying Dutch we had said Clashes with the wallpaper, I thought you liked abstract work, Never saw it before, Tilted, Hanging too low, Beautiful, Hideous, Remember our camping trip last summer?, or whatever else might come into our minds when looking at a picture.
Skinner offers “response strength” as evidence that the stimuli are connected to responses; a rat repeatedly and urgently pressing a bar would indicate a tight connection between behavior and reward. In trying to extend the concept to language, Skinner writes that “if we are shown a prized work of art and exclaim Beautiful!, the speed and energy of the response will not be lost on the owner.” Chomsky drily replies that “It does not appear totally obvious that in this case the way to impress the owner is to shriek Beautiful in a loud, high-pitched voice, repeatedly, and with no delay.” No, not obvious at all.
Having killed behaviorism with this kind of dry wit, and having also published his revolutionary 1957 book Syntactic Structures, Chomsky launched linguists on the task of trying to construct “grammars” of languages. These were not books that should be beaten into the heads of young children to correct their habits. Chomsky instead defined a “grammar” as a set of rules that could produce all the possible sentences in a language but none of the impossible ones. As Chomsky wrote in Syntactic Structures, Colorless green ideas sleep furiously, while nonsensical, was grammatical. All of the pieces fit together as adjectives, nouns, verbs, and adverbs should. Tired young children sleep marvelously has the exact same form. By contrast, Chomsky noted that Furiously sleep ideas green colorless is both nonsensical and ungrammatical. And The child seems sleeping, though sensible, isn’t grammatical. The pieces don’t fit together. Crucially, virtually any normal speaker would make the same judgments about the three statements. A grammatical sentence can be false, badly written, or illogical. An ungrammatical string of words can be true, beautiful, or powerful. Grammar simply isn’t the same thing as rhetoric, logic, mechanics, or style.
Since Chomsky’s late-1950s publications, linguists have largely concerned themselves with discovering exactly what the rules people use are and how they work in the human mind. This has been an enormous and intensive research project—but one that those outside linguistics have had little clue about. David Foster Wallace, in his caricature of descriptivism, wrote:
The very language in which today’s socialist, feminist, minority, gay, and environmentalist movements frame their sides of political debates is informed by the Descriptivist belief that traditional English is conceived and perpetuated by Privileged WASP Males and is thus inherently capitalist, sexist, racist, xenophobic, homophobic, elitist: unfair.
This notion—that linguists, as descriptivists, want to throw out the rules because they are “inherently capitalist, sexist, racist, xenophobic, homophobic, elitist, [and] unfair”—would come as a surprise to Geoff Pullum, one of the many linguists who, since Chomsky, have been trying to figure out how language really does work in the human mind. Pullum, an expert in English grammar at the University of Edinburgh, is no “Let’s overthrow the man, man” hippie. He is a very different, and rare, breed: the pissed-off, curmudgeonly descriptivist.
Language rants are usually the domain of the sticklers. But Pullum, who blogs along with Mark Liberman at Language Log, has had it. When he sees someone making what he thinks are foolish claims, he isn’t professorial or even G-rated. In one typical post, he goes through and demolishes a list of copyediting shibboleths—that, for example, “have to” and “need to,” or “note” and “notice,” mean starkly different things and mustn’t be confused. Finishing up, he writes:
The things mentioned above are not debatable, they are facts about English that can easily be checked, and it is about time copy editors were told to stop wasting millions of hours on pointlessly correcting them when they were correct in the first place. God dammit, I can feel the veins standing out in my neck. I need to step outside for a while and kick something.
“Facts about English”? What is a descriptivist, in Wallace’s caricature, doing talking about “facts” about a language if descriptivists think that the rules of standard English are just a racist, homophobic (etc.) tool of oppression? Doesn’t “descriptivism” mean that if a native speaker says something, it is ipso facto okay?
This is “utterly insane,” says Pullum. Of course English has rules: in fact, he lays them out in an 1,860-page book, The Cambridge Grammar of the English Language, which he coauthored with an Australian syntactician, Rodney Huddleston. Descriptivists like Pullum do believe people can misuse those rules; when they do, they miscommunicate or just sound silly.
But linguists imagine rules very differently than do Wallace-style grouchy grammarians, deriving them from observation of some of the billions of words spoken each day and analyses of the trillions that have been written down (and that are, today, conveniently searchable by computer). Linguists might observe native speakers undetected, record them in casual conversation, ask them what sounds grammatical to them, or observe what educated people write. If a sentence strikes the vast majority of speakers of a language as well formed, it is well formed.
This doesn’t rule out variation, say, by dialect or region. “His team is in trouble” is grammatical in standard American English. “His team are in trouble” is grammatical in Britain. And linguists would say, “His team in trouble” is grammatical in Black English—it is perfectly comprehensible and, crucially, native speakers of the dialect will not bat an eye at it. But “The team am in trouble,” even if comprehensible, is grammatical in no dialect of English. Descriptivists draw their rules from native speakers, but that doesn’t mean anything goes. Grammar rules should generate sentences that the large majority of speakers of a given (dialect of a) language would accept as correct. Pullum’s nearly two thousand pages of rules in The Cambridge Grammar happen to focus on modern standard English. But just such a grammar could also be written about Black English or the English of southeastern England at the time of Shakespeare.
Pullum has special vitriol for Elements of Style, which he calls “E. B. White’s disgusting and hypocritical revision of William Strunk’s little hodgepodge of bad grammar advice and stylistic banalities” or elsewhere simply a “horrid little notebook of nonsense.” Pullum catches E. B. White using a “which” to introduce a restrictive clause in the second paragraph of Stuart Little, something White himself prohibits in Elements. But Pullum notes that it is no good pointing this out to the self-appointed language guardians. When you tell prescriptivists that they have just violated their own rule, they simply say they erred and promise to do better next time. When you point out that dozens of great writers violate the same rule, they retreat to “Everybody makes mistakes sometimes.” They never imagine that the rule itself may be bogus.
This myth is the opposite of “everything is correct,” and Pullum calls it “nothing is relevant.” The Rules exist on some plane of their own, and no amount of empirical evidence—say, repeated examples from professionally edited works of indisputably great writers—matters. For Pullum and the majority of linguists who are like him in this regard, if great writers break a rule frequently and naturally in writing, everyone else follows suit in speech, and doing so creates no confusion, that rule is illegitimate, a waste of everyone’s time. The Cambridge Grammar thus dismisses prescriptivism with no recourse to evidence as mere “universalizing of one person’s taste.”
What does the modern descriptivist syntactician look like when he formulates his own rules? These rules are a mix of logic, the linguist’s intuition about what a native speaker would consider acceptable, and examples taken from large bodies of written or transcribed language. Using a mix of data and his own judgments (both are needed, since people can make real-time mistakes, but a linguist’s intuitions can be wrong too), the linguist seeks to tease out the rules.
For example, if you knock on someone’s door and the person on the other side says, “Who is it?,” the stickler’s correct response is “It is I.” The thinking is that “it” is a subject pronoun, and “is” means “the first part of this sentence and the last must be grammatically equivalent, because ‘is’ makes two things the same.” That means that since “It” is the subject, the complement must be “I,” a subject pronoun, and not “me,” which is used as the object of verbs, prepositions, and so forth.
In pure prescriptivist form, the rule is simply the rule. It must not vary, whether by formality or by grammatical context. It is probably based on Latin grammar, where “predicate” and “predicand” (as they’re known in the jargon) match in case. But this is by no means required by universal logic. French uses alternate pronouns: the nominative is je, but “It is I” translates not as C’est je but C’est moi. The Scandinavian languages, cousins to English, do much the same: Det er mig in Danish, for example.
As Pullum and his coauthor, Huddleston, have it, the case of I and its equivalents can alternate in English between the subject forms (I, he, she, they) and the object forms (me, him, her, them).
Consider these examples:
a. It is I who love you.
b. It’s me who loves you.
c. It is I she loves.
d. It’s me she loves.
e. Yes, it is she!
f. Yes, it’s her.
g. This is he.
h. This is him.
i. The only one who objected was I.
j. The only one who objected was me.
k. This one here is I at the age of twelve.
l. This one here is me at the age of twelve.
For Pullum and Huddleston, the difference between (a) and (b) is simply one of formal versus informal register. (a) is formal but would ring oddly in many contexts; (b) is neutral to informal. Those who might insist that the rules are the rules and must never vary would have to insist on (c) as well, which is so stuffy or archaic as to be ridiculous in the early twenty-first century. In (k), a plausible scenario in which the speaker is showing an old photo, the prescriptive rule leads to a sentence that isn’t merely stuffy but probably unacceptable to most speakers. At the very least, someone who spoke like that would be considered very odd and probably not invited to many parties. If the “rule” produces sentences the vast majority of people wouldn’t say and would reject if others said them, the rule is, by (descriptivists’) definition, no rule, or it needs modification.
Several objections might occur to the stickler. One is that the rules should simply be standard and clear. If you allow for variation, children will not be able to learn them. But in this case, the descriptivist reply is that there is indeed a rule: both nominative and accusative are acceptable in most cases and differ by formality. To the accusation that having two choices is intolerable because consistency must be ironclad, the linguist would simply point to many other optional variations in English: “It’s” versus “It is,” for example. No rule requires that “It is” be contracted, or kept apart. Both are available and serve different needs, usually different levels of formality.
One more thing needs to be said about how linguists think as opposed to traditional sticklers. The self-appointed stickler usually holds the written language far above the spoken in its purported logic, clarity, elegance, and style. Spoken language, with its far greater share of errors, false starts, and variation by time, place, exhaustion, and presence of alcohol, seems by comparison debased and debauched.
For the linguist, the focus is almost the other way around. All typical, healthy adults speak, and spoken language has been in existence for tens if not hundreds of thousands of years. Writing is a newcomer by comparison. Spoken language may be a natural faculty wired in the brain, which needs input only during formative years to become the amazing machine that is the adult language-producing box. Writing is an artificial modern skill that must be taught for years when children are older, and (as the stickler knows) the results often fail to impress. Sociolinguists more often study how people talk in different situations, not how they write. Comparative linguists doing fieldwork spend much of their time learning languages that are fascinatingly rich and usually hugely complicated—but have no writing system at all.
This isn’t to say that linguists don’t care about writing. Historical linguists such as Jacob Grimm obviously relied on older writings. Syntacticians such as Pullum pore through written works to see if theoretical or rare formations can be found and, if so, how often. But the notion that the written word is the “true” one and the spoken word “false” makes little sense from the linguist’s chair. Everybody speaks, but not everyone reads. Just a few hundred of the world’s six thousand languages are written down. And few of the world’s people write on a regular basis. Even in the literate world, the average person is a fluent speaker and a fairly clumsy writer.
Is it possible to bridge the gap between prescriptive and descriptive? There is absolutely no reason why not; Pullum and Huddleston, for example, explicitly declare that their Cambridge Grammar is descriptive only but say that a writer seeking usage advice is well advised to find a good style manual. (Just not The Elements of Style.) Linguists just don’t see that as their job. David Foster Wallace mocked linguists’ pretensions to descriptive scientific accuracy, saying that it would be like an ethics textbook that described how people behaved, rather than how they should. But linguistics isn’t ethics; it’s not the humanities either. It’s more like economics or political science, in which methods range from highly theoretical to highly quantitative.
And the popular grammar crank’s complaint against descriptivists, that they don’t believe in the rules (or that they see them as creations of the oppressive white capitalist Man), is, as shown above, so far off the mark that linguists have only themselves to blame for not getting the word out better about what they do. Chomsky, a feisty left-winger in politics, has spent half a century discovering grammar rules but has never connected his politics and his linguistics by writing a book aimed at conservative misunderstandings of language.
The idea that language has no rules does exist in linguistics, but it is an exotic minority position that most reject. Geoffrey Sampson, of the University of Sussex, put forth that position in an article called “Grammar Without Grammaticality”: “I believe that the concept of ‘ungrammatical’ or ‘ill-formed’ word-sequences is a delusion.” Pullum responded with the exact opposite position: that nearly any string of words is ungrammatical. Take ten common English words; they can be combined in 3,628,800 different ten-word strings (10 × 9 × 8 × 7 × 6 × 5 × 4 × 3 × 2 × 1). Almost none of those 3 million-plus strings will be accepted as sentences by an English speaker. In this light, producing a grammatical sentence is a miraculous thing indeed.
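The arithmetic is easy to verify, and the point about how few orderings survive can be made concrete on a smaller scale. In the sketch below, the list of “accepted” orderings for five words reflects my own quick judgments, included only to illustrate the idea.

```python
import math
from itertools import permutations

# Ten distinct words can be ordered in 10! ways.
print(math.factorial(10))   # 3628800

# A toy illustration with five words: enumerate every ordering and count how
# many a native speaker would plausibly accept as a sentence. The "accepted"
# set is my own judgment call, added purely for illustration.
words = ["the", "dog", "chased", "a", "cat"]
accepted = {
    ("the", "dog", "chased", "a", "cat"),
    ("a", "dog", "chased", "the", "cat"),
    ("the", "cat", "chased", "a", "dog"),
    ("a", "cat", "chased", "the", "dog"),
}

total = 0
good = 0
for order in permutations(words):
    total += 1
    if order in accepted:
        good += 1
print(f"{good} grammatical orderings out of {total}")   # 4 out of 120
```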
There are many other huge disagreements among modern syntacticians. Chomsky first promoted a vision in which the brain changes a meaning-based structure (“deep structure”) into the actual form of words used by a speaker (“surface structure”) through a process he called “transformation.” He has since abandoned transformational grammar, but others have continued work in a transformational vein.
Research into syntax—constructing grammars for the world’s languages—is now split not only between pro- and anti-Chomsky researchers, but between those who follow his older work and those who support his new “minimalist program.” The new program has tried to strip down the number of rules considerably and even argues that the feature of “recursion” is the only unique feature of human language (as distinct from other forms of communication such as whale song, bee dances, and so on). Recursion is the ability to fit one kind of syntactic unit into another of the same type; “the cat” is a noun phrase, and “the cat in the hat” is a noun phrase with another noun phrase (“the hat”) inside it. But though once upon a time everyone in linguistics seemed to be responding to the dominant “classical” Chomskyan paradigm, today the minimalist program has put the titan himself into a smaller, more controversial camp, against, for example, his fellow “innatist” (a believer that some elements of grammar are wired in the brain), Steven Pinker.
All this should dispel the notion that descriptivists don’t believe in rules. But they see their role as discovering, not pronouncing, them. Some (like Pullum) use real-world evidence. Others (like Chomsky) construct artificial examples to illustrate their points. But what neither does is sit in a chair saying “This is how it is, by Jove, and anyone who doesn’t know this rule is a fool.” Remember our analogy of linguistics to social science. A political scientist or an economist who says “Vote Democrat” or “Progressive taxation is unfair” is not doing academic work. She has every right to say that, but this is the stuff of an op-ed, not a peer-reviewed journal. And linguists see their job much the same way. Saying “That is annoying and incorrect” isn’t what they’re after; “This is informal, and many usage books recommend against it” is more their style. Pullum is the rare one who will add “but you should feel free to tell them where they can shove it.”
Syntax is at the more theoretical end of the linguistic spectrum, though; as mentioned, there are many syntacticians who spend their day going through bodies of text for real-world examples of this or that construction. But linguistics also has subfields that are far more directly empirical, two of which are relevant to the sticklers-versus-linguists debate.
The first is psycholinguistics. The field itself grew out of Chomsky’s revolution against behaviorism. When Chomsky posited that many language structures are innate, psycholinguists redoubled their efforts to figure out how the brain processes language. Many of their efforts are not of interest here. But their methods are: psycholinguistics offers a test of many prescriptivist notions.
The Elements of Style by Strunk and White, for example, says that
The pronoun this, referring to the complete sense of a preceding sentence or clause, can’t always carry the load and so may produce an imprecise statement.
Visiting dignitaries watched yesterday as ground was broken for the new high-energy physics laboratory with a blowout safety wall. This is the first visible evidence of the university’s plans for modernization and expansion.
In the left-hand example above, this does not immediately make clear what the first visible evidence is.
As usual, no evidence is offered that “this” really doesn’t carry the load. We are told to believe that this sentence is harder to understand—here, at least, Strunk and White are telling you to think of your reader’s comprehension, not merely yelling at you. But they offer no support for this pronouncement; the reader is meant to merely take Strunk and White’s word for it.
As it happens, though, these kinds of things can often be tested in a laboratory. One simple method available to modern researchers, for example, is the minute tracking of eyeballs as readers scan a page of text. A vague or unclear passage will make readers stop tracking through the text. (Test subjects are sometimes told that they will be asked questions about the text afterward, so that they try to comprehend it.) If the eyeballs pause on the “this” in a construction like the one Strunk and White proscribe, it may indeed be vague or problematic. (That wouldn’t make it ungrammatical, in the sense that the pieces of the sentence don’t add up; it’s very easy to write a grammatical sentence that people have to stop again and again to read. It would simply mean that avoiding the construction makes for better writing.) As it happens, there is just such a test under way: the sexily named Anaphora Resolution and Underspecification research program at the Universities of Essex and Glasgow. (The program is also studying whether computers can parse the sentences in question.) When wondering whether a certain construction is acceptable, Strunk and White pronounce; linguists research.
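In outline, the analysis behind such a reading-time test is simple: measure how long readers’ eyes linger on the target word in each version of a text and compare the averages. The sketch below does only that final comparison; real input would come from an eye tracker, and the numbers in the usage comment are invented placeholders rather than the Essex and Glasgow program’s data.

```python
from statistics import mean

def extra_delay_ms(control_times, test_times):
    """Average extra time (in milliseconds) readers spent on the target word
    in the test condition, relative to the control condition."""
    return mean(test_times) - mean(control_times)

# Hypothetical usage with made-up fixation durations (ms) on the target pronoun:
# control = [210, 230, 190, 250]
# test    = [320, 300, 340, 280]
# print(extra_delay_ms(control, test))   # -> 90.0
```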
Sometimes their research might give comfort to prescriptivists. Sex-indefinite singular pronouns have been the source of a long-running battle between sticklers and descriptivists. “Everyone has finished their dinner” is a stickler no-no. It uses “they,” a plural pronoun, to refer back to “everyone,” which is grammatically singular. (Even if it is semantically plural, we say “Everyone is,” not “Everyone are.”) Linguists say that the sex-indefinite singular “they” goes back in English writing to the King James Bible, among other places: “They set forward, every one after their families, according to the house of their fathers” (Numbers 2:34). And of course virtually everyone but the most Wahhabist stickler uses “they” in plain conversation. The band Jane’s Addiction sang “Everybody has their own opinion” because in a punk-influenced rock song, “Everybody has his own opinion” would sound silly and “Everybody has his or her own opinion”—both prescriptivist and politically correct—would sound idiotic. Strunk and White, naturally, counsel simply “his,” asserting that it stands for both sexes, and that “his or her” is stylistically ugly.
As it happens, the Anaphora Resolution and Underspecification program has data. Two groups read slightly different texts. One group got the unremarkable
Mr Jones was looking for the station. He saw some people on the other side of the road, so he crossed over and asked them politely where the station was. It was in a different part of town.
Others got this (emphasis added):
Mr Jones was looking for the station. He saw some people on the other side of the road, so he crossed over and asked her politely where the station was. It was in a different part of town.
Compared to the first group, readers in the second group took, on average, an extra tenth of a second to process the mismatched “her” before moving on with their eyeballs. Other subjects were given this:
Mr Jones was looking for the station. He saw someone on the other side of the road, so he crossed over and asked them politely where the station was. It was in a different part of town.
Here, readers paused about an extra twentieth of a second. It seems that upon finding “them,” they wondered briefly where the plural antecedent was. But they quickly moved on, with only half the extra delay of the readers who had hit the mismatched “her,” a genuine oddity.
The point, again, is this: if a prescriptivist says something is illogical or confusing, this claim can be tested. Some tests would show the sticklers wrong, but some might prove them right. David Foster Wallace considered the scientific study of language to be “Unbelievably Naïve Positivism.” Unfortunately, he never seemed to look into what he was talking about; linguists can’t know everything about every speaker’s language, but they certainly can discover quite a bit.
Noam Chomsky may not care much for real-world data, but the fact that he is the world’s best-known linguist doesn’t mean that he is representative of the whole clan. At the other end of the spectrum are those who constantly seek out how language is actually being used day to day. Among them are sociolinguists, perhaps the ones most relevant to the subject of this book. Chomsky studies language in the abstract; sociolinguists study how it is used on the streets and in the fields of the real world.
Do you find New York accents annoying? As a southerner living in Brooklyn, I have to confess that sometimes they grate on my ears. William Labov, perhaps the world’s foremost sociolinguist, discovered an interesting fact: New Yorkers don’t like their accents either.
Or at least they act as though they don’t. The accent we associate with New York is actually a working-class accent (and, contrary to popular belief, it doesn’t differ much from Brooklyn to the Bronx to the other boroughs). There was once also a high-class New York accent—think Franklin Roosevelt saying “We have nothing to feahhh but feahhh itself”—but it is rare today.
Among the features of the working-class New York accent are the dropping of r sounds after vowels; the replacement of the th sound, as in “third,” with t; the replacement of the th sound, as in “then,” with d; and so on. I once actually asked someone for directions to a bar that, improbably enough, happened to be located at “toity-toid and toid.”
To test the prestige or lack of it for these sounds (rather than just asserting that they’re lower class), Labov went to three New York City department stores for his 1966 doctoral research. Saks, Macy’s, and S. Klein were high to low status, respectively. Asking for a department he knew to be located on the fourth floor, he surreptitiously recorded the answers of the employees.
The results neatly confirmed his suspicion: employees’ accents became more “New York” the farther down the socioeconomic ladder of stores Labov went. Employees at S. Klein, the low-end store, were the most likely to drop the r sound in both “fourth” and “floor,” doing so more than 90 percent of the time. (The r is slightly more likely to drop in “floor” than in “fourth,” incidentally.) In Macy’s, employees dropped the r in “floor” 73 percent of the time. In Saks, the figure was 70 percent.
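The arithmetic behind those percentages is simple enough to sketch. Here is a minimal, hypothetical version in Python; the transcriptions are made up to stand in for Labov’s recorded answers, and the resulting percentages are illustrative only.

```python
# A toy tally of r-dropping by store, using invented transcriptions rather than
# Labov's recordings. A token counts as "r-less" if it is transcribed without
# the post-vocalic r (e.g. "fawth", "flaw").
from collections import defaultdict

# (store, transcription of the answer "fourth floor"), invented for illustration.
responses = [
    ("S. Klein", "fawth flaw"), ("S. Klein", "fawth flaw"), ("S. Klein", "fourth flaw"),
    ("Macy's", "fourth flaw"), ("Macy's", "fawth flaw"), ("Macy's", "fourth floor"),
    ("Saks", "fourth floor"), ("Saks", "fourth flaw"), ("Saks", "fourth floor"),
]

R_FULL = {"fourth", "floor"}  # spellings with the r pronounced

tallies = defaultdict(lambda: [0, 0])  # store -> [r-less tokens, total tokens]
for store, answer in responses:
    for token in answer.split():
        tallies[store][1] += 1
        if token not in R_FULL:
            tallies[store][0] += 1

for store, (rless, total) in tallies.items():
    print(f"{store}: {100 * rless / total:.0f}% r-less")
```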
Another interesting finding is the attitude the employees showed when Labov pretended he couldn’t hear them and asked them to repeat themselves. When they did, they were speaking carefully, and lo, they avoided the lower-class pronunciation and became far more likely to pronounce the r. Another Labov study showed that in casual speech, the rs were dropped almost all the time. But when subjects read words aloud from a list, r pronunciation shot up. How much attention a subject is paying—quick unmonitored speech, careful speech, reading from a page—makes a big difference and inspires linguists to come up with ingenious experiments to tease things out.
Did Labov show that r-dropping is mere laziness? When they repeated themselves or read from a page, Labov’s New Yorkers were less likely to drop the rs. But that doesn’t mean pronouncing the r is “careful” or “educated” and hence correct, while dropping it is “ignorant” or “lazy.” The social dynamics are much more interesting than that. In surveys, New Yorkers rate r-dropping as undesirable, even as they do it themselves. And intriguingly, Labov found that the lower middle classes pronounced their rs even more carefully than the upper middle class in formal settings (as when reading aloud). He guessed that these socially vulnerable but hopeful people hypercorrect when put on the spot, anxious not to seem lower class than they are.
Why, then, does anyone drop rs? Labov posits “covert prestige”: the working classes’ way of showing solidarity and pride. It isn’t only the upper classes or sticklers that monitor their language. Members of any group—not just academics or pedants but sports announcers, rappers, and construction workers—signal who they are and who they want to be associated with by how they talk. The persistence of r-dropping among New Yorkers, despite the pressures to climb socially, shows that climbing the social ladder is not all that matters. Working-class New Yorkers (and many others) are caught between social ambition and solidarity with the class they come from.
Most people imagine languages as codified, straightforward, and pure. The rules are written in grammar books, the meanings of words are in dictionaries, and every sound has a correct pronunciation. Linguists, obviously, don’t think that way. And sociolinguists try to figure out in what ways these things, like Labov’s rs, vary socially: by region, class, gender, social situation, and many other factors.
Dialectology is a big part of sociolinguistics. Most languages have clear and stable subvarieties. (The line between a “dialect” and a “language” isn’t itself linguistically clear-cut, and is usually drawn politically, as will become more apparent in later chapters.) Since dialects are often ignored if not flat-out denied by the keepers of the pure language, it often falls to the sociolinguists to describe how they are used.
As you would expect, they don’t find sloppiness or laziness when they describe (for example) southern English, Cajun French, Swiss German, Damascus Arabic, or other nonstandard varieties. They find dialects or languages with stable grammar and pronunciation—not as stable as those of the written standard languages (which have the advantage of a large body of written literature), but highly coherent nonetheless. As Labov put it, “a decade of work outside the university as an industrial chemist had convinced me that the everyday world was stubborn but consistently so.” Where others saw a mess, he saw order.
Take what sociolinguists call “code switching”: moving back and forth between two languages or distinct dialects in a single conversation. In the New York subway, for example, I often hear mixtures like this one from Labov:
Por eso cada, you know it’s nothing to be proud of, porque yo no estoy proud of it, as a matter of fact I hate it, pero viene Vierne y Sabado yo estoy, tu me ve hacia mi, sola with a, aqui solita, a veces que Frankie me deja, you know a stick or something, y yo aqui solita, queces Judy no sabe y yo estoy haci, viendo television, but I rather, y cuando estoy con gente yo me … borracha porque me siento mas, happy, mas free, you know, pero si yo estoy con mucha gente yo no estoy, you know, high, more or less, I couldn’t get along with anybody.
What looks like a jumble to an onlooker is in fact systematic, and more interesting than it first appears. Studies of code switching show regularities. For example, some code switchers do so only between sentences. But others do so almost freely within sentences. It turns out that though the latter looks more jumbled, as in the passage above, it is actually most common among those who speak both languages well and who have more positive feelings toward their bicultural identity.
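To make the between-sentence versus within-sentence distinction concrete, here is a minimal sketch in Python. The language-tagged snippet is invented for illustration, not taken from Labov’s transcript, and the rule of thumb it uses (a switch counts as inter-sentential only when it follows sentence-final punctuation) is a deliberate simplification of how researchers actually annotate such data.

```python
# Toy classification of code switches as intra- vs. inter-sentential,
# given tokens hand-tagged for language. The snippet is invented, not Labov's.
tagged = [
    ("pero", "es"), ("viene", "es"), ("Viernes", "es"), (",", "es"),
    ("I", "en"), ("hate", "en"), ("it", "en"), (".", "en"),
    ("y", "es"), ("yo", "es"), ("aqui", "es"), ("solita", "es"), (",", "es"),
    ("you", "en"), ("know", "en"), (".", "en"),
]

SENTENCE_FINAL = {".", "!", "?"}
intra = inter = 0
for (prev_tok, prev_lang), (tok, lang) in zip(tagged, tagged[1:]):
    if lang != prev_lang:
        if prev_tok in SENTENCE_FINAL:
            inter += 1   # switch at a sentence boundary
        else:
            intra += 1   # switch mid-sentence

print(f"intra-sentential switches: {intra}, inter-sentential switches: {inter}")
```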
Studies of code switching have shown that the switch from one language to another often comes at emotional moments: describing a stressful experience, for example. In the case of this New York mix, Spanish may be used to show solidarity (the mentions of mutual acquaintances are in Spanish). But the speaker switches, after a long stretch of Spanish, to English for “happy” and “free” (“I feel more happy, more free, you know”). This might be because the English words “happy” and “free” carry different emotional freight—as distinctly American aspirations—than their Spanish equivalents feliz/contenta and libre would. What other country on Earth put “the pursuit of happiness” in its founding document?
Other language phenomena that look messy or random are also predictable and systematic ways of seeking solidarity. One example is a behavior many Americans have guiltily confessed to me: when they travel to (say) Scotland or Ireland, before long they find themselves half imitating the local accent, and maybe its rhythm and vocabulary too. Most people who do this think that they are uniquely chameleonic and feel slavish and foolish in changing their own accent to sound like the natives. In fact, sociolinguists have seen this behavior all around the world and, as they do with so many other things, given it a name: accommodation. If you think about it, you probably speak more quietly when talking to someone who speaks unusually softly. If you’re talking with someone who runs a hundred miles an hour conversationally, the chances are good that you speed up too. The desire to show solidarity with conversation partners motivates these changes of speed or volume. The traveler’s accent shifts are no different. And choice of speaking style and vocabulary can be another way of getting closer. As a teacher, I have noticed that I get more colloquial and relaxed when talking to my students; I want to feel as though I’m joining the group temporarily.
But accommodation has its flip side: intentional distancing. I am more formal and stiff on purpose when dressing down students or expressing disappointment. I want them to know that I am the teacher and they have to listen to what I say because of our teacher-student relationship. The same kind of distancing may be going on when a boss raises his voice at a whimpering subordinate. The message is that we’re not the same: I can be loud and you must be quiet.
How language choices like this hang together with identity and politics is particularly clear in the case of what linguists call “diglossia.” In the classic diglossic situation, two varieties of a language, such as standard French and Haitian creole French, exist alongside each other in a single society. Each variety has its own fixed functions—one a “high,” prestigious variety, and one a “low,” or colloquial, one. Using the wrong variety in the wrong situation would be socially inappropriate, almost on the level of delivering the BBC’s nightly news in broad Scots. Their functions are quite different.
Children learn the low variety as a native language; in diglossic cultures, it is the language of home, the family, the streets and marketplaces, friendship, and solidarity. By contrast, the high variety is spoken by few or none as a first language. It must be taught in school. The high variety is used for public speaking, formal lectures and higher education, television broadcasts, sermons, liturgies, and writing. (Often the low variety has no standard written form.)
Diglossic pairs include Swiss German and standard German; colloquial Arabic (in many regional guises) and classical/standard Arabic; Haitian creole French and standard French; and two varieties of Greek called dhimotiki (demotic) and katharévousa. An important point to remember is that the “low” varieties are socially but not linguistically low. They are no run-down gutter cants. They are highly regular languages; linguists can and do write grammars of, for example, modern Palestinian Arabic. But in the typical diglossic situation, most people think the low variety isn’t the “real” language. The real language is the (high) language, which must be taught, essentially as a foreign language, to children who have never used it before they begin school.
Attitudes toward the low varieties vary. In Switzerland, standard German gives German-speakers access to the economy, writing, television, and culture of a big international arena, including neighboring Germany and Austria, which with 90 million people dwarf the 4 million German-speakers of Switzerland. But Swiss German is a token of pride, separating the Swiss from their neighbors. Personal ads in newspapers are often taken out in Swiss (called Schwyzerdütsch, among other things, by the natives). And since it varies from place to place in Switzerland, one’s own variety of Swiss can signal allegiance to a particular region or town. The Swiss, in other words, have no problem with their diglossia, and there is certainly nothing lower class about speaking Swiss German.
Slightly more mixed feelings prevail in Haiti regarding the relationship between standard French and Haitians’ own, by now highly regular, Creole variety. Knowledge of standard French is crucial for access to power in government or business. But French is also the language of a former colonial power, and Haitians, with a few exceptions, are proud of the Creole. (Some in the upper rungs of society disclaim knowledge of the Creole but can be heard to speak it fluently nonetheless.) Creole makes you Haitian.
In Greece, the relationship between demotic (low) and katharévousa (high, closer to classical Greek) is even more obviously political. In the twentieth century, with people speaking varieties of demotic as the native language, politicians made an effort to regularize and spread the formal use of the low variety. But the conservative pushback was fierce: a translation of the Bible into demotic Greek sparked riots in 1901. When a group of colonels took control of Greece in a 1967 military coup, they halted the official spread of demotic.
The restoration of democracy restored demotic (the two words, incidentally, share the same linguistic root, demos, “the people”). Katharévousa, originally conceived as a pragmatic compromise between ancient and modern Greek, was now associated with authoritarian politics. Demotic has basically won the day. A Greek student of mine, Maria, typifies the shift. Her grandmother read a book of Greek mythology, written in katharévousa, to Maria’s mother. But Maria’s mother spot-translated it into demotic when she read it to Maria. Maria, like most young Greeks, can understand written katharévousa but cannot speak it.
The victory of demotic has not resolved Greeks’ language anxiety, however. There is a general opinion that Greek is in decline. Maria, though she doesn’t speak it, describes katharévousa in terms that are common to speakers of the “low” variety when talking about the “high” one: that it is more complex, expressive, and beautiful. In Maria’s words, it is “a complex mathematical problem meets a paintbrush with vibrant colors.” (The “complexity” refers to katharévousa’s richer set of word endings; in the low varieties these endings have eroded with time.) At the same time, she is democratic about her demotic, expressing frustration that the most important hour of the week for many Greeks, church, is katharévousa-only. Since the New Testament was originally written in a form of ancient Greek closer to katharévousa, it may be understandable that Greeks would want to hang on to it. But this also means that many parishioners have little idea what is being said to them.
Even though demotic has won out, political angst still plays into language attitudes in Greece. Greece joined what is now the polyglot European Union in 1981 but has failed to thrive through trade, aid, and integration as many other countries have. It has fallen economically behind many of its European partners. Words and phrases from katharévousa, nowadays sometimes mixed into demotic, remind Greeks of a time of national glory and Greek domination of the neighborhood. There will always be those who think that if the language hadn’t been let go, Greece would still be a power to reckon with.
If that nostalgia goes for Greek, it goes triply for Arabic. After the advent of Islam in the seventh century, the Arabic language roared out of its home in the Arabian Peninsula, spreading north, east, and west like a fire across dry brush. With the conversion of enormous territories to Islam came the spread of the language of the Qur’an, which the prophet Muhammad said had been dictated to him by the archangel Gabriel. The exact words of the Qur’an were so sacred that learning to read it in Arabic was the only way to access the religion itself. Today, there are twenty-two members of the Arab League, with a combined population of over 300 million (though not all are Arabic-speakers).
But the Arabic language should stand as a caution to sticklers. The veneration of the Qur’an (and also the hadith, or sayings of the Prophet) had the effect of canonizing, and freezing, one form of Arabic. Medieval grammarians made detailed studies of its properties while Europe remained in the Dark Ages. Poetry and even storytelling flourished. Arabic was the language of science, which is why alkali, alcohol, algebra, and many other English words come from Arabic.
But despite the existence of a prestigious written standard, Arabic, like every other language on Earth, went on changing in the mouths of its speakers. And having spread from Morocco to Iraq, over a territory bigger than any modern country but Russia, it changed not in one stream but in many. Today’s linguistic situation in the Arab world isn’t just diglossic but “polyglossic”—that is, incredibly varied. Though most Arabs don’t like to admit it, today’s spoken Arabic is not a “dialect” of classical Arabic. It has become a different language, in widely varying forms classed by linguists into North African, Egyptian, Gulf, Levantine, and Iraqi, each with many subdialects.
Sometimes classical Arabic, also called fusha (pronounced “fus-ha”), resembles the colloquial, especially when the vocabulary is somewhat elevated:
Classical: Hiyya mu’allima fii jaami’at Dimashq.
Syrian colloquial: Hiyye m’allme b’jaami’at Dimashq.
(She is a teacher at Damascus University.)
But elsewhere it is clear, even in simple sentences, just how far Arabic has moved in 1,300 years:
Classical: Ra’aytu ar-rajul ma’a ibnihi. Yathhabaani ila as-suq al’aan.
Palestinian colloquial: Shuft az-zalame ma’ walado. Biruhu ‘a as-su’ halla’.
(I saw the man with his son. They’re going to the market now.)
In the first full-length Arabic conversation I ever had, with two young Egyptians I met in South Africa who didn’t speak English, I could speak only fusha (not yet having begun learning a modern colloquial Arabic dialect). They tried to respond in kind. But it was a clumsy exchange on both sides. They mixed in not only typically Egyptian pronunciations (such as gadiid for jadiid, “new”) but “wrong” ones in fusha that came from their dialect, such as munazama for munadhama (“organization”). Though I struggled to remember some vocabulary, in other ways my fusha was better than theirs. (On the other hand, a sociolinguist might say my overall performance was much worse; fusha is utterly inappropriate for late-night hotel-bar drinks. I must have sounded something like a professor lecturing to them.)
Furthermore, “Arabic”-speakers from one region can have serious trouble speaking to Arabs from another. A Moroccan and a Lebanese will lose much subtlety and confidence in understanding if they try to talk to each other in their home dialects. If they have trouble and need to talk about a difficult technical topic (such as engineering or computer science), they will often switch to English or French rather than fusha. Meanwhile, a Moroccan from the countryside and his uneducated Lebanese counterpart would hardly be able to carry on a conversation at all.
Talking about this is awkward, and Arabs are often at pains to stress that really they do speak the same language and it’s a shame more people don’t speak “better Arabic.” They are in denial that classical Arabic has no native speakers, and they are in a messy situation with few ground rules. This denial has political roots and heavy implications. Many Arabs recognize that they speak differently from one another, but they still strongly identify with an “Arab nation.” Mastery of fusha Arabic, including literacy and familiarity with classical texts, remains a gateway to social prestige.
At the same time, the modern colloquial Arabics have been put to political use too. Remember Labov’s lower-class “covert prestige.” Just as Bill Clinton could turn up his southern accent when in front of the right kind of crowd, Gamal Abdel Nasser, the revolutionary president of Egypt from 1952 to 1970, mixed Egyptian colloquial into his speeches to Egyptians. While he was president of the short-lived United Arab Republic, which merged Egypt and Syria, however, he never used Egyptian colloquial in his speeches in Damascus; he was not fluent in Syrian colloquial, and during this pan-Arabist heyday he wanted to seem Arab, not Egyptian, to the Syrian crowds.
The ongoing use of fusha for education has repercussions. Mohamed Maamouri, a Tunisian who works at the Linguistic Data Consortium with Mark Liberman, argues that teaching children in fusha, essentially a foreign language, hampers literacy and learning and psychologically distances them from the culture of the written word. Illiteracy is 40 percent in the Arab League states; among women it reaches 50 percent. And even those who can read feel cut off from the content that reading gives access to. Maamouri relates the impressions of Khaled, a typical sixteen-year-old in Tunis:
Since he chose to specialize in Sciences, most of his subjects are taught in French: all the scientific subjects are taught in French. This has been the case since the first year of secondary school. Before then, all of his classes were taught in Fusha. Khaled doesn’t like Fusha and thinks he can do without it. With his friends, he speaks mostly Arbi [Tunisian colloquial], with a little French thrown in. Nobody speaks in Fusha, it sounds too weird and forced. The only classes he has in Fusha are liberal arts classes, like history, religion and social studies. These classes bore him. The teachers are too traditional, and have no sense of humor.
Khaled understands most of what [his cousin] Sourour [who grew up in Saudi Arabia] says when she speaks in Arabic, but she does not understand Arbi. He has to use Fusha or French in order to speak to her. They finally settle on a mixture of the two, because her French is not as good as his.
Arabs are told that the language they grow up speaking isn’t a “real” language, and that the “real” language is one they must learn in school, a language not made for fun or creativity. This alienation from the written language has obvious consequences for the development of journalism, political activism, and other forms of democracy-building that rely on the written word. But Maamouri laments that there is no one with the political clout in the Arab world to push through reforms that might ameliorate the situation.
One option might be to simplify the grammar of fusha—the case endings are notoriously difficult but also largely unnecessary—while also mixing in words that are found in many of the colloquials. Such a “middle Arabic” would be closer to today’s living languages than to the distant, aloof fusha that springs from the Qur’an. Such “middle Arabics” already exist in an ad hoc way, for example in informal writing online. This makes sense; online writing is often spontaneous, which calls for the colloquial, but it is still written and so draws on the only standard written variety, fusha. Codifying a middle Arabic would create a useful, but somewhat artificial, language. And artificial languages—from wholly invented ones such as Esperanto to the pandialectal “New Norwegian” invented as a vehicle for Norwegian nationalism—have a poor track record. They can become written languages and will always have their adherents for political reasons. But they belong to no one. In any case, doing this with Arabic today remains unlikely. Distancing Arabic from its roots implies distancing modern Arabs from a cherished religious, military, and cultural history. And no coherent vision of modern Arab life has yet emerged to replace those past glories.
These are just a few of the issues that sociolinguists deal with. To other, “purer” linguists who work with the Saussurean system, sociolinguists may seem like interdisciplinary vagrants with messy data. To the sociolinguists, sitting in the library and concocting example sentences, rather than getting out and listening to people talk, seems as detached from the real world as describing the chemistry of sugar fermentation is from tasting wines.
What linguists—theoreticians and field researchers—have in common is a search for facts that can support robust, testable theories. They are not in the business of overturning paradigms of capitalism, patriarchy, and colonialism. Of course, many linguists are politically left-wing, but so are most academics. And some linguists’ work has explicitly political content: sociolinguists, in particular, deal heavily with issues of class, social mobility, and power and how language interacts with them.
By that same token, as seekers after social facts, they also discover that a prestige variety of a language—whether standard English or Arabic fusha—has a social value. People treasure these languages for their literary heritage and for the way they bind communities together. Ultimately, it seems, most people like belonging to a community and a legacy, and written standard languages provide both.
But variation from that norm—whether a person grows up speaking Swiss German or Black English—is just that: variation. Black English is no more “broken-down English” than Swiss German is “broken-down German.” And as Bill Labov’s r-dropping experiments found, sometimes people cling tightly to the way they talk even when they know it deviates from the prestigious form—the working classes value a sense of community, too. Variation is a linguistic fact of life and one that a true lover of language should study in all its fantastic variety—not seek to eradicate through a homogenizing, orthodox sticklerism that has little to do with history, logic, and beauty and much to do with power, status, and control.
* Linguists also call this Black Vernacular English (BVE) or African-American Vernacular English (AAVE), but I see no reason not to use the shorter term.
* It is “Avis, jasmin varna na a ast, dadarka akvams, tam, vagham garum vaghantam, tam, bharam magham, tam, manum aku bharantam. Avis akvabhjams a vavakat: kard aghnutai mai vidanti manum akvams agantam. Akvasas a vavakant: krudhi avai, kard aghnutai vividvant-svas: manus patis varnam avisams karnauti svabhjam gharmam vastram avibhjams ka varna na asti. Tat kukruvants avis agram a bhugat.” The English is “[On a hill,] a sheep that had no wool saw horses, one of them pulling a heavy wagon, one carrying a big load, and one carrying a man quickly. The sheep said to the horses: ‘My heart pains me, seeing a man driving horses.’ The horses said: ‘Listen, sheep, our hearts pain us when we see this: a man, the master, makes the wool of the sheep into a warm garment for himself. And the sheep has no wool.’ Having heard this, the sheep fled into the plain.”