2

The Beauty of Deep Structure

The hidden harmony is better than the obvious.

—Heraclitus (Fragment B54)

In general, as syntactic description becomes deeper, what appear to be semantic questions fall increasingly within its scope.

—Noam Chomsky (1964b [1962]:936)

The Chomskyan Revolution

“Nobody in academia had ever witnessed or even heard of a performance like this before,” Tom Wolfe said of Noam Chomsky’s famously dramatic rise in linguistics:

In just five years, 1953–57, a University of Pennsylvania graduate student—a student, in his twenties—had taken over an entire field of study, linguistics, and stood it on its head and hardened it from a spongy so-called social science into a real science, a hard science, and put his name on it: Noam Chomsky. (Wolfe 2016:86)

Wolfe’s account is hyperbolic bullshit. Even the chronology is ludicrous.1 Chomsky was certainly starting to make an impression by 1957, but it would be more than a decade before his influence approached something one might justly call “taking over the field,” fanning out through allies like psychologist George Miller, an important co-architect with Chomsky of the cognitive revolution; Morris Halle, a close friend from his days at Harvard who hired him into the MIT fold and created a linguistics graduate program with him; Robert B. Lees, a chemical engineer who became the first student of that program and one of its most aggressive polemicists; Robert Stockwell, a philologist and early convert to Chomsky’s Transformational Grammar (TG) who founded the influential linguistics department at UCLA; Paul Postal, an early recruit, talented theorist, and intellectual gladiator who never waited for a thumbs-up/thumbs-down signal before slaying his opponents; Jerrold Katz and Jerry Fodor, philosophers who were attracted by TG’s semantic possibilities; Charles Fillmore and Emmon Bach, energetic young linguists and transformational autodidacts who developed and promoted Chomskyan linguistics, in Ohio and Texas respectively—to name a significant, but far from comprehensive, set of his supporters. These scholars, in their own technical articles, conference talks, teaching, textbooks, and hallway proselytizing, spread the word.

Chomsky was unquestionably at the helm, with a dazzlingly prolific series of articles and books, some key conference appearances, a suite of brilliant arguments, and two landmarks that flank the revolutionary period: Syntactic Structures, his highly persuasive case for Transformational Grammar (1957a), and Aspects of the Theory of Syntax, a remarkable synthesis of collaborative semantic and syntactic research growing out of his first proposals and anchored by the highly evocative notion of Deep Structure (1965 [1964]). Over that period, he attacked the preceding generation of linguists remorselessly (he mostly called them taxonomic linguists, to portray them as mechanical collectors and categorizers, trivializing their interests, but history generally calls them Bloomfieldians now, or neo-Bloomfieldians, after the great Leonard Bloomfield, who codified much of the agenda for American linguistics in the 1930s, 40s, and 50s2).

Chomsky also went outside the paradigm to attack their favored psychology, behaviorism, in a blistering review of B. F. Skinner’s Verbal Behavior. This review was widely admired, adored even, repeated far and wide, for its compelling arguments that behaviorism should be thrown unceremoniously overboard to clear the decks for a new cognitive psychology. He forged alliances with computer science in the early years of artificial intelligence (AI), the other leg of the revolutionary cognitive tripod (along with psychology and Transformational-Generative Grammar, Chomsky’s trademarked approach). He pushed the topics of child language acquisition, mathematical modelling, and mental structure to the rhetorical forefront of the field.

Chomsky was unquestionably at the helm. But he was far from alone. The Chomskyan revolution in linguistics—a real thing that did in fact remake linguistics in Chomsky’s image, for a time—extended deep into the 1960s.3

Here are the specifics. The crucial revolutionary period is between the years 1957, when Syntactic Structures was published, and 1965, when Aspects of the Theory of Syntax appeared. The revolutionary trajectory certainly continued afterwards, peaking perhaps in the mid 1970s in terms of the general intellectual community, with various surges and flows, and recurrent changes of emphasis or direction. But the bulk of what Chomsky is known for, and the essential components of his framework, were forged in those years. We’ll diagram things out later (Figure 2.3), and talk lots about the whens and whys of various developments, but the schematic version for now is that Syntactic Structures outlined a dramatically new model of linguistics, a rule-based, procedural grammar that built sentences which could be altered or combined by transformations into other sentences. From that foundation, and with help from a growing community of scholars, Chomsky developed a more detailed, technically sophisticated, and richer grammatical model; in the process, he presented it as an utter rejection of the previous generation’s linguistics; strongly connected it to innate, universal cognitive capacities; and associated it with certain philosophical and psychological frameworks; computer scientists and psychologists saw it as fundamentally linked to their own revolutionary agendas, quickly becoming advocates and proselytes; English studies folks, especially in composition and rhetoric, were not far behind.

What the end of that period saw, with Aspects, was a model of language that looked like this: a base component of Phrase Structure Rules, which defined canonical syntactic patterns, and a lexicon, which provided words; out of the base came Deep Structures, abstract representations of sentences; Deep Structures were made available to a set of Semantic Interpretation Rules, which rendered the meaning of those sentences; Deep Structures also underwent various transformational operations to produce Surface Structures, abstract representations that were much closer in terms of order and content to the sentences that come out of our mouths or off our keyboards. The upshot? Transformations mediated form and meaning.
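
(For readers who think in code, here is a minimal sketch of that pipeline. Everything in it—the toy tree, the rule stand-ins, the function names—is my own invention for illustration, not anything drawn from Chomsky or the Aspects literature.)

```python
# A minimal, invented sketch of the Aspects-era picture described above; the
# structures and rules here are toys, standing in for the real machinery.

DeepStructure = tuple  # a crude stand-in for a tree

def base() -> DeepStructure:
    """Base component: Phrase Structure Rules plus a lexicon yield a Deep Structure."""
    return ("S", ("NP", "the boy"), ("VP", ("V", "read"), ("NP", "the book")))

def interpret(deep: DeepStructure) -> str:
    """Semantic Interpretation Rules read meaning off the Deep Structure."""
    subject, verb, obj = deep[1][1], deep[2][1][1], deep[2][2][1]
    return f"AGENT={subject}  ACTION={verb}  PATIENT={obj}"

def to_surface(deep: DeepStructure, passive: bool = False) -> str:
    """Transformations map the Deep Structure to a Surface Structure,
    flattened here to a plain string; passive is one optional transformation."""
    subject, verb, obj = deep[1][1], deep[2][1][1], deep[2][2][1]
    return f"{obj} was {verb} by {subject}" if passive else f"{subject} {verb} {obj}"

deep = base()
print(interpret(deep))                 # meaning comes off the Deep Structure
print(to_surface(deep))                # "the boy read the book"
print(to_surface(deep, passive=True))  # "the book was read by the boy" -- same meaning
```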

Still, I’ve quoted Wolfe’s fairy tale for two reasons. For one, it epitomizes the standard Chomskyan lore. The contours of that lore, despite Wolfe’s screwy details, are basically true: a young man with a towering intellect and commanding personal charisma goes to work and fundamentally reshapes a discipline to his personal specifications. But, second and more important for our chronicle, it is not all that far from how the early Chomskyans, the generation of linguists at the narrative core of this book, saw their own history. That generation felt they were boarding the kind of messianic rocket ship Wolfe describes. They imbibed this vision with their theories and their methods, reinforcing it with each other over coffee and beer and mimeographed discussion papers—especially the part about making linguistics a science. In a very influential piece—a highly promotional review article that was certainly written with Chomsky’s input and approval—Robert Lees cast Transformational Grammar as broaching “a comprehensive theory of language which may be understood in the same sense that a chemical [or] biological theory is ordinarily understood by experts in those fields” (Lees 1957:377).

Most of the recruits believed, and were encouraged to believe, that linguistics was a hopelessly misguided and spongey pursuit, not much better than stamp collecting, before Chomsky came along and transformed it into a hard science, with his powerful ideas and his crushing dismissals of the Bloomfieldians. Lees’s own dismissal of the Bloomfieldians portrayed them as preoccupied with the “mere reorganization of data into a new kind of library catalogue” (377), interested only in “collection and classification” (379). Chomsky’s focus on building grammatical models is what “mark[ed] off the brilliant scientist from the dull cataloger of data” (380). “Which would you rather be?” was the implicit question. In the early years of his program, almost as soon as Syntactic Structures hit the shelves, Chomsky’s arguments always came as a package, setting his ideas against the backdrop of Bloomfieldian error. “I was told that my work would arouse much less antagonism,” he said a few years later, when the Bloomfieldian dust had been trampled and he had in fact taken over the field,

if I didn’t always couple my presentation of transformational grammar with a sweeping attack on empiricists and behaviorists and on other linguists. A lot of kind older people who were well disposed toward me told me I should stick to my own work and leave other people alone. But that struck me as an anti-intellectual counsel. (Mehta 1971b:190–91)

Noam Chomsky has never been one to leave other people alone. From his perspective, he might choose to say he doesn’t want to leave other ideas alone. They need to be rooted out like weeds for his own ideas to thrive. But ideas have sponsors and he has never been very careful about detaching the two.

As Chomsky’s work swept linguistics, it also moved into neighboring fields, with psychology and computer science, its close allies in the cognitive upheaval, leading the way—though in somewhat different terms. Psychology was pretty well established academically and intellectually and had already been mingling with linguistics before Chomsky arrived on the scene. But the interactions went up dramatically with the rise of transformational theories. Computer science, almost indistinguishable from AI at the time, was a fledgling field that grew up alongside the Chomskyan wave. Other fields—English studies, rhetoric, composition, philosophy, eventually political science and media studies—all found Chomsky an arresting presence as well, adapting what they could of his work in large, not always well digested, chunks.

The scope of Chomsky’s success is a tribute to the power of his arguments, the initial welcoming embrace of most linguists (based in part on the close association of his work with a major Bloomfieldian, Zellig Harris), the zeal of his troops, the influx of capital into U.S. higher education in the two decades after Syntactic Structures (much of it from the military), and his own calm and righteous personality. But, also, the time was ripe.

The latter 1950s and early 1960s were times of existential dread, with two “superpowers” escalating nuclear arsenals and sponsoring proxy conflicts that threatened to erupt into global conflagration. “The entire intellectual upheaval” of the cognitive revolution, Chris Knight has argued, “was driven by industrial and military imperatives bound up with the Cold War” (2016:47; Knight’s emphasis). There was an eager audience for the big claims circulating around Chomsky’s program. “A lot of people, inside and outside the academy,” David Golumbia writes,

were looking for someone to say that the human mind is wildly different from other minds, that it does not use capacities animals do, that “rationality” itself is somehow hard-coded into it, and that the mind is essentially a “machine” or a “computer,” however metaphorical those concepts may be. (Golumbia 2018)

This new view of the human mind came together exquisitely in George Miller’s textbook with Eugene Galanter and Karl Pribram, Plans and the Structure of Behavior (1960). It fully epitomizes the themes and trends of the first wave of the cognitive revolution, and nowhere, outside of linguistics, is the influence of Chomsky more pervasive. It is a version of Chomsky, portrayed as the architect of a plan-based, information-processing cognitive psychology, that Chomsky later repudiated but which fits snugly with many of his arguments at the time. Directly counterposing his theory of mind to Skinnerian behaviorism, for instance, Chomsky describes language acquisition in terms of “the built-in structure of an information-processing (hypothesis-forming) system” (Chomsky 1959a:58), using terms he shared with the emerging field of AI.4

Close upon the heels of Plans came the field-defining Psycholinguistics: A Book of Readings (Saporta 1961). It includes four pieces by Chomsky, all extracted from Syntactic Structures (not, on the surface, a book much concerned with psycholinguistics, but by the early 1960s being read through the lens of Chomsky’s explicit mentalism), as well as papers by Halle, Miller, Eric Lenneberg, and other Chomsky comrades from Cambridge. Psycholinguistics was overwhelmingly cognitive by the mid-1960s, deeply influenced by Chomsky and cemented by his collaborations with Miller on important chapters of the Handbook of Mathematical Psychology (Luce et al. 1963)—“holy writ,” Eric Wanner calls them. Even twenty-five years later, Wanner was saying, “These chapters seem to me the clearest foundational statement of the field” (1988:143).

More broadly, the powerful cognitive turn in psychology had not only Chomsky’s corrosive attack on Skinner to steer psychologists from the shoals of behaviorism but also the positive beacon of his Transformational Grammar to guide them into new and teeming waters; his arguments, widely adopted and propagated, “had shown that an activity could be rule governed and yet infinitely free and creative” (Neisser 1988:86). Chomsky appears, deservedly so, as a full-chested revolutionary hero in most histories of cognitive science (Gardner 1985; Baars 1986; Hirst 1988; Boden’s definitive two-volume Mind as Machine (2006) gives us sections like “Chomsky Comes on the Scene,” “Transforming Linguistics,” and “Chomsky as Guru”).

Computer science had a negligible but distributed and expanding presence in academia, industry, and the military. A few institutions offered assorted courses scattered through various engineering and math departments (the first independent department came in 1962, at Purdue), with many offices, corporations, and think tanks lining up for the new machines from International Business Machines and Digital Equipment Corporation. In computer science research, Chomsky was among the biggest names. The journals and conference proceedings show recurrently how Chomsky’s work hailed “an entire community of technologically-minded intellectuals” across the new landscape: “logicians, computer scientists, and technicians . . . looking for someone to lead them down the glory road to ‘machines speaking’ ” (Golumbia 2009:38).

Chomsky’s effect on English studies was more diffuse, but equally rapid. American linguists of the earlier decades had little to offer people studying literature and composition: little, that is, except scorn. Bloomfieldian work was heavily sound-based, rarely extended in any systematic way to units as big as sentences, and was congenitally nervous about meaning. In themselves, these characteristics were enough to discourage composition folk, rhetoricians, and literary critics—people who spend a great deal of time with written texts, who are very interested in sentences, paragraphs, discourses, and who fret constantly about meaning. At best, Bloomfieldian work left the English people cold and empty-handed.

At worst, it found them hostile and brandishing torches. Bloomfieldians regularly coupled their self-promotions with strident attacks on so-called “traditional grammar,” the only place where these people could find text-and-sentence analyses of any depth. Alienation was almost complete; noisy, nasty skirmishes broke out regularly over such issues as prescription and description, the English department types wanting to pursue norms that they could prescribe for little Johnny to make him speak and write correctly, while the linguists insisted that Johnny already spoke his language just fine, thank you very much, and that the only role for a scientist was to describe the way language came out of his mouth. Vladimir Nabokov’s novel Pnin characterizes “modern scientific linguistics” as

that ascetic fraternity of phonemes, that temple wherein earnest young people are taught not the language itself, but the method of teaching others to teach that method; which method, . . . perhaps in some fabulous future [being] instrumental in evolving esoteric dialects—Basic Basque and so forth—spoken only by certain elaborate machines. (1996 [1957]:303)

The Bloomfieldians looked like philistines to the English mavens. Linguists responded like chimpanzees waving their scientific genitalia from the other side of the academic watering hole, as in the exclamatory polemics of Robert A. Hall Jr.’s Leave Your Language Alone! (1950), where he speculates that prescriptivists had turned Americans into a species of linguomasochists, wanting “to be humiliated and abased by someone telling us our language is ‘bad’ or ‘ungrammatical’” (Hall 1950:6).

The mood was not good. But the market was.

Humanities folk very much wanted to get something they could use from the scientists, and after the Bloomfieldians, Chomsky’s program was a dream come true for them. Chomsky starts with the sentence, not the sound. His work comes out of Zellig Harris’s discourse studies; he promises to help crack meaning; he endorses the pre-Bloomfieldian traditional grammars on their shelves; and he champions creativity. The consumer base was in a lather. There was a little bit of traveling salesmanship, a few well-placed papers and conference appearances by Stockwell, Lees, and Postal, contrasting Bloomfieldian vices with Transformational and Generative virtues. But it was barely necessary: Chomsky and Stockwell met one English professor, Paul Roberts, at an important 1958 Texas conference, and he practically leapt into their arms, publishing several books and articles over the next few years, often with Chomsky’s close coöperation (Roberts 1962:viii; 1964:vii). Roberts had some allegiance to Bloomfieldian methods, so he didn’t engage in the negative polemics of Lees or Postal, but he sang the virtues of Chomsky’s program far and wide in English circles—“This grammar is traditional grammar made explicit and rigorous” he propounded (1963:334)—and very shortly there was an English choir raising the roofbeams with claims that it was impossible to engage grammar “without using the brilliant work of Noam Chomsky” (Catwell 1966:xix).5

Chomsky’s impact on philosophy was different again, but perhaps the most successful of all in terms of the long game. He is now widely acknowledged as one of the most important philosophers of the late twentieth century. He had significant contacts in philosophy from the very beginning of his career, preceding even his contacts with psychology. At the University of Pennsylvania Chomsky took several courses with Nelson Goodman, who helped get him into the Harvard Society of Fellows with a very strong recommendation. The society in turn brought him into the ambit of, among others, Joshua Bar-Hillel, Willard V. O. Quine, and J. L. Austin—an ambit which featured a fertile mixture of agreement and disagreement with Chomsky’s ideas. The field was philosophy, after all. Bar-Hillel, for instance, was an enthusiastic supporter of the formal tack on syntax pioneered in linguistics by Chomsky’s supervisor, Zellig Harris; there were a few crucial differences (some of them visible in an early Chomsky paper that aggressively takes issue with Bar-Hillel’s criticisms of Harris6 ), but they played out against a background of shared assumptions about language and its models. The two of them worked together in MIT’s Research Laboratory of Electronics (RLE), one of the incubators of Chomsky’s program. Austin, on the other hand, disagreed with much of that background, but he was nevertheless quite impressed with Syntactic Structures, incorporating it into his final Saturday Morning Meeting group back in Oxford (Warnock 1973), and Chomsky found Austin’s general, ordinary-language approach to meaning very congenial. Quine’s famous criticism of logical positivism resonated with Chomsky, as did his complementary endorsement of the role simplicity played in science, though Chomsky disagreed rather violently with Quine’s behaviorist semantic notions.

By the mid-1960s, philosophers were lining up around the block. “I can vividly remember the shock wave that rolled through philosophy,” Daniel Dennett recalls of the period, “when Chomsky’s work first came to our attention. . . . His fame soon engulfed us all” (1995:385). Soon, one was hearing that “nothing has had a greater impact on contemporary philosophy than Chomsky’s theory of language” (Harman 1974:vii), in large measure because of its conception of the mind. Chomsky became well known in philosophy, that is, not just as the leading figure in a related discipline, but as one of their own, the most forceful advocate of an erstwhile-discredited theory of mind (rationalism), which his arguments helped to resurrect.

Meanwhile, back in computer science, Chomsky was a rock star. Many of the specifics of Chomskyan mathematico-mechanical formalisms were enough to take a computer scientist’s breath away, especially one with AI allegiances—the algorithmic rewriting rules, the set-theoretic definitions, the elegant and familiar notation. But the very idea of modeling a language as a small set of axioms, with formal rules that could be used to prove (er, “generate”) theorems (ah, “sentences”) was paradigmatically seismic on its own. Chomsky’s grammar looked like a sorting device for legitimate (“grammatical”) and illegitimate (“ungrammatical”) strings, a kind of device common in computer science. It was mind blowing. For computer scientists, Chomsky’s Transformational Grammar was inextricably bound to a ranked series of models, of which it was the prize at the top. That series was later memorialized as the Chomsky Hierarchy.7 Citations to the paper that first outlined the hierarchy, “Three Models for the Description of Language,” were ritualistic in computer science, with Syntactic Structures a close second and assorted other works getting scattered nods.8 Invocations of the Chomsky Hierarchy were obligatory then and are still showing up very widely in the literature over sixty years later. Chomsky could have gotten tenure in any computer science department in the world; by the 1970s, in fact, he would have been a prize catch, a superstar, in almost any computer science department he chose. “I must admit to taking a copy of Noam Chomsky’s Syntactic Structures along with me on my honeymoon in 1961,” computer science giant Don Knuth confessed much later. “Here was a marvelous thing: a mathematical theory of language in which I could use a computer programmer’s intuition” (Knuth 2003:iii).
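
(If you want the flavor of why that ranking mattered, here is the standard textbook contrast, rendered as a small sketch of my own rather than anything from the period: a finite-state device can demand “some a’s, then some b’s,” but it cannot insist that the counts match; a context-free grammar, one rung up the hierarchy, can.)

```python
import re

# Illustrative only: the classic contrast behind the hierarchy's ranking.
# A regular expression (finite-state power) can demand "some a's then some b's"
# but cannot require the two counts to match; a context-free grammar can.

regular_check = re.compile(r"^a+b+$")   # too permissive: accepts "aab"

def context_free_check(s: str) -> bool:
    """Recognize a^n b^n (n >= 1), the language generated by S -> a S b | a b."""
    n = len(s) // 2
    return n > 0 and s == "a" * n + "b" * n

for s in ["ab", "aaabbb", "aab"]:
    print(f"{s!r:10} regular={bool(regular_check.match(s))} context_free={context_free_check(s)}")
# "aab" passes the regular check but fails the context-free one; separations
# like this are what make the hierarchy a ranked series of models.
```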

The mathematical form of Transformational Grammar made everyone swoon, not just the computer scientists. Syntactic Structures models language like physicists model matter, chemists model reactions, and astronomers model the heavens. It has two sets of rules, adapted from earlier work in syntax, but smoothly integrated and expressed in Chomsky’s own quasi-mathematical formalisms. There are Phrase Structure Rules, like those in 1, pretty spectacular little devices (also called Rewrite Rules, because they work like instructions to rewrite the symbol to the left of the arrow as the symbols to the right of the arrow), and the even-more-spectacular Transformational Rules, like those of 2:

[Rules 1a–i (Phrase Structure Rules) and 2 (Transformational Rules)]

There is a lot going on here, in these few formulas. Most crucially, the Phrase Structure Rules (1a–i) give us sentences like 3–6, and the Transformational Rules (2) take those sentences and transform them into sequences (not all of them are sentences) like 7–12:

[Sentences 3–6 (kernel sentences) and 7–12 (derived sequences)]

You can see, I’m sure, why these sorts of rules would make the hearts of computer scientists beat faster, even if you’re not the sort to take them on your honeymoon with you: they look like a blueprint for a language machine, and they pattern just like logic derivations. For cognitive psychologists, too, we get a set of procedures that humans could be following when they speak. In English studies, the rules could be brought into the classroom and given to little Johnny to show him how language is put together and made grammatical. Linguists? They were moved by the high-science purity of it all, and the ability of these rules to describe syntactic facts with hitherto unapproached precision. Syntactic Structures was a hit.
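
(Here, as an illustration of my own, is what a rewrite-rule grammar looks like when you actually run it as a little language machine. The rules and vocabulary below are toys in the spirit of 1, not Chomsky’s exact formulations.)

```python
import random

# A toy Phrase Structure (rewrite-rule) grammar in the spirit of the rules in 1.
# These particular rules and words are illustrative stand-ins, not Chomsky's own.
RULES = {
    "Sentence": [["NP", "VP"]],
    "NP":       [["T", "N"]],
    "VP":       [["Verb", "NP"]],
    "T":        [["the"]],
    "N":        [["man"], ["ball"], ["linguist"]],
    "Verb":     [["hit"], ["took"], ["admired"]],
}

def rewrite(symbol: str) -> list[str]:
    """Rewrite the symbol on the left of the arrow as the symbols on the right,
    recursively, until only words (symbols with no rule of their own) remain."""
    if symbol not in RULES:
        return [symbol]                       # a terminal: an actual word
    expansion = random.choice(RULES[symbol])  # pick one right-hand side
    words = []
    for sym in expansion:
        words.extend(rewrite(sym))
    return words

print(" ".join(rewrite("Sentence")))  # e.g. "the man hit the ball"
```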

Sentences 3–6 are basic sentences—kernel sentences, Chomsky calls them—and 7–12 are derived sentences (or, with 9, a derived phrase). They are derived in exactly the way a mathematical or logical proof is derived. Just as one might derive “x + y” from “y + x” by “applying” the principle of commutation, the Syntactic Structures model derives 7 from 3 by “applying” the Passive Transformation (TPass), and 11 from 3 and 4 by way of Passive (TPass) coupled with the transformation that conjoins sentences by way of the sentence adverbial, so (TSo).

There are seductive layers of abstraction here, too, for the academicians, not just in the way this system puts a kind of algebraic net over concepts like sentence and verb, but in the very configuration of the model. Look especially at the transformation known as Affix-hopping (TAf-hop).10 If you’re catching on to the formalism now, you can probably figure out what it does pretty easily. If you haven’t bothered to map my claims back against the rules and the sentences, you can keep taking my word for what’s going on. Either way, Affix-hopping is a game of pin-the-tail-on-the-verb. Af is a variable that ranges over morphemes like -ing and -en, just as x might be a variable that ranges over rational numbers. So Affix-hopping finds an affix sitting around in front of a verb, and jumps it over top that verb, attaching it to the verb’s hindquarters. But why? Where does that affix come from anyway, and why isn’t it already on the verb? That’s all part of a system of abstractions for getting the distribution of the forms right while maintaining some kind of low-wattage link with meaning. This story will take a little time, but Affix-hopping was one of the most rhetorically successful rules in Chomsky’s system, so bear with me. The rule and its accompanying arguments were picked up and circulated recurrently through textbooks, lectures, journal articles, and popularizations, to illustrate the beauty and power of Transformational Grammar.

Remember the two dispositions that characterize linguistic programs, towards distributional coverage and precision on the one hand, and form-meaning mediation on the other? The Bloomfieldians were especially keen on distribution, and Affix-hopping is the distributional crown jewel of Chomsky’s program. Even better, syntax was a significant weak point for the Bloomfieldians. They were great with sounds and words, but their syntactic descriptions relied on a body of “heterogeneous and incomplete methods” (Wells 1947b:81), so they were mostly partial and inconsistent. Add to that: English verbal affixes are a tough nut to crack anyway. Even Phrase Structure Rules can’t do it. Chomsky’s solution was a tour de force.

Affix-hopping played directly to the central Bloomfieldian preoccupation of getting all one’s descriptive ducklings in a row. To see how tough it can be to get the Auxiliary-verb ducklings to line up neatly in English, compare sentence 13a with 13b (progressive aspect), 13c (perfective aspect), and 13d (perfective-progressive aspect):

[Sentences 13a–d: simple present, progressive, perfective, and perfective-progressive]

So: 13a, simple present-tense agreement. No biggie. Just change the verb. But: 13b, the progressive, utilizes is (a form of be) along with a verb ending, -ing; 13c, the perfective, gives us has (a form of have) along with yet another verb ending, -en; and in 13d, when the perfective and progressive double up, we get a row of alternating verb and affix ducklings. What’s weird, though, is that it’s actually more like a row of duckling-parts, because the bits doing one kind of work are interrupted by other bits doing different work: rise gets between be and -ing in 13b, and between has and -en in 13c, and be gets between has and -en in 13d. It’s actually even a bit trickier than that, but if you’re still with me, I invite you to build a rule following the template of the rules in 1 that will handle this kind of severed-duckling distribution.11 For the record, these things are called discontinuous dependencies (be and -ing are mutually dependent, but not continuous; ditto for have and -en), and they are a headache.
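
(For the code-inclined, here is one toy way to render an optional-element auxiliary expansion in the spirit of rule 1d—my simplification, not Chomsky’s actual rule—which shows both the appeal and the problem: each affix is generated next to its conceptual partner, but in front of the wrong word.)

```python
from itertools import product

# A toy, illustrative auxiliary expansion in the spirit of 1d (not the book's
# exact rule): Tense, then optional "have -en" (perfective), then optional
# "be -ing" (progressive). Each optional chunk keeps the affix with its
# partner, so the affix ends up in front of the verbal element it should be
# attached to -- conceptually right, linearly wrong, until Affix-hopping applies.

def underlying_strings(verb: str = "rise"):
    tense = ["PRESENT"]
    perfective = [[], ["have", "-en"]]      # optional
    progressive = [[], ["be", "-ing"]]      # optional
    for perf, prog in product(perfective, progressive):
        yield tense + perf + prog + [verb]

for s in underlying_strings():
    print(" ".join(s))
# PRESENT rise
# PRESENT be -ing rise
# PRESENT have -en rise
# PRESENT have -en be -ing rise   <- the affixes precede the verbs they belong on
```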

First, before we get on with Chomsky’s solution, I have a confession. That business about deriving sentences from sentences was a fib, a convenient fib, a shorthand that everyone uses about Transformational Grammar, and a fib I will repeat, but a fib nonetheless. Sentences don’t derive from other sentences. Sentences derive from underlying structures, some of which they share with other sentences. So, technically, it’s more accurate to say that sentences 7–12 derive from the underlying structures of 3–6, not from the sentences themselves.

Those underlying structures were another one of the most compelling calling cards of Chomsky’s grammar. So, we need to think a bit bigger than the strings of words I’ve been using to illustrate the rules so far. The structures have a few different overlapping names, but colloquially they’re just trees, a name that captures their shape pretty well. They are full of branches. But they look more like upside-down genealogy trees than like trees in the forest. They are hierarchical arrangements of syntactic details, with words and other bits hung onto the very bottom. Trees may seem like one more level of arcana, but they are really expressive and were extraordinarily popular, especially as Chomsky’s influence moved out into grammar and foreign-language classes. We’ll see a lot more of them.

Okay, back to Affix-hopping.

Trees 1–4, respectively, are the ones associated with Sentences 13a–d, but here are the terminal strings of those trees (the words and affixes hanging onto the bottom of the trees), for convenience:

[Trees 1–4 and their terminal strings, 13′a–d]

No one ever says such things as Trees 1–4, of course, or even such strings as 13′a–d, but that’s not what Syntactic Structures claims. These are underlying forms, abstract structures that represent something crucial about English—namely that be and -ing work together to evoke the meaning, progressive, and so do have and -en, for perfective, and if they show up at the same time, have and -en always start things off—but they are wholly abstract.12 So, the Phrase Structure Rules give us the order (see 1d), capturing important conceptual information, solving a mediational problem. Be and -ing, have and -en, belong together conceptually.13 But, solve one problem, create another.

The paired elements go together when we want to express the subtle meanings associated with progressive and perfective aspects but they don’t go side by side. So, the trees (and 13′a–d) get something right, but only by introducing something wrong. TAf-hop to the rescue! It is an obligatory rule. It always applies when structures like Trees 1–4 show up. It causes -ing to hop over rise, and get pinned on its tail (for rising); -en hops over be, and gets pinned on its tail (for been); and so on. Voila! We get representations of things that people do actually say. (Figure 2.1 shows Affix-hopping in action.)

Figure 2.1 The Syntactic Structures Affix-hopping Transformation. Adapted from Chomsky 1957a:39–40.

Not all transformations are obligatory, firing every time some particular pattern shows up in the underlying trees. Transformations like TPass and TNom are optional. They can apply or not—depending on, one would guess, the stylistic choice of the speaker, though that is left out of the picture. If TPass or TNom (or both) apply, you get one sentence; if not, you get another sentence. But TAf-hop is always on duty. It always applies when the Phrase Structure Rules generate trees like 1–4 and straightens the distribution out: the sequence Af  Verb is turned into Verb-Af, much the way a mathematical function transforms one set into another, or one polygon into another.
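
(Rendered as code—my own toy encoding, not Chomsky’s notation—the hop looks like this:)

```python
# A toy rendering of Affix-hopping (T_Af-hop): whenever an affix immediately
# precedes a verbal element, hop it over and pin it on that element's tail.
# The token names and the crude spell-out gloss are my simplifications.

AFFIXES = {"PRESENT", "-en", "-ing"}

def affix_hop(tokens: list[str]) -> list[str]:
    out, i = [], 0
    while i < len(tokens):
        if tokens[i] in AFFIXES and i + 1 < len(tokens):
            out.append(tokens[i + 1] + "+" + tokens[i])   # Verb-Af
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

# A terminal string in the spirit of 13'd: affixes sit in front of their verbs.
underlying = ["PRESENT", "have", "-en", "be", "-ing", "rise"]
print(affix_hop(underlying))
# ['have+PRESENT', 'be+-en', 'rise+-ing']  -> spelled out: "has been rising"
```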

This formal leap-frogging maneuver was one of “Chomsky’s best rules from the public relations point of view; its apparent success in reducing chaos to an unexpected kind of order won many converts to Transformational Grammar” (Sampson 1979:361). But if Affix-hopping doesn’t make your heart go pitter-pat as fast as it did for the Bloomfieldians, there were some other set-pieces that populated articles, blackboards, conference papers, and the like, that excited others, many of them well outside of linguistics.

TPass was one of the biggest stars because it killed two birds with one stone. There were some hints in Syntactic Structures that the rule comes close to capturing two meanings with one underlying form, saying that “there are striking correspondences between the structures and elements that are discovered in formal, grammatical analysis and specific semantic functions” between sentences related by the Passive Rule. Indeed, the formal and semantic parallels between such sentences as 3 and 7 or 4 and 8 on page 24, or this pair, are “undeniable,” spake Chomsky (1957a:101):

[Sentences 14a and 14b: an active/passive pair]

These kinds of sentential synonyms—same underlying form, different transformational activity—would become incredibly important for Chomsky’s program, but in Syntactic Structures he keeps them at arm’s length, noting that the active (3, 4, 14a)/passive (7, 8, 14b) form/meaning correspondences are somewhat “imperfect” (Chomsky 1957a:101). What he trumpets instead is how the Passive Transformation reduces “inelegant duplication” (1957a:43). A simple Phrase Structure grammar would have to generate sentences like 3 and 7, 4 and 8, 14a and 14b, independently—making them look like just any old pairs of random sentences, no different than a pair like, say, 3 and 14b or 4 and 7. But 3 and 7 are related, 4 and 8 are related, 14a and 14b are related. We feel that relatedness as English speakers. Grammarians have always said as much. So, a grammatical theory which didn’t include the facts of those relations would be deficient.

The most famous of the underlying-structure advocacy arguments, however, was undoubtedly the argument based on Sentences 15 and 16:

[Sentences 15 and 16: the eager to please / easy to please pair]

These two sentences have the same apparent structure: a noun phrase, a copula, an adjective, and an infinitive verb. They even appear to have the same kind of functional relationship. They both predicate something about John and about pleasing, but—drum roll, please—when you go deeper, you find important distinctions that only Chomsky’s grammar can explain; namely, that there is a crucial difference in who is doing the pleasing, and who is being pleased. Accounts like the following, taken from a 1964 College Composition and Communication article, were common in English studies, psychology, philosophy, and general-readership journals well into the 1970s (it’s a long passage, but is nicely representative in the way it spins all the central appeals around the easy/eager-to-please sentences):

The interpretation of sentences, the way we understand them, is based on their underlying structure. In [15 and 16] the basic sentences that were transformed to form the new sentences are preserved. . . .

What is crucial here is not the mere ability of Transformational Grammars to produce the right set of sentences, in terms of the final strings produced, but to provide these sentences with the correct Structural Descriptions [that is, Trees], which means, to generate them in such a way that the process of generation imposes on the sentences the correct grammatical relations between their elements—the relations which determine how we understand sentences.

But more important even than the insight into the particular sentences of a particular language—such as, for instance, the specification of the differences underlying such constructions as “eager to please” vs. “easy to please” . . . is the general insight into the way human beings use and understand language. For now that we have a precise definition of the notion “grammatical relation” we can see how the transformational process combines the grammatical relations of underlying simple sentences in complex sentences. Thus in “John is eager to please” the underlying subject-verb relation of “John pleases” is combined with the subject-adjective complement relation of the sentence into which the underlying structure has been incorporated by the transformation.

Rising to the crescendo, our author proclaims, “We now have a universal hypothesis about language.” The hypothesis is about meaning, about how sentences are understood, not by way of their surface appearances but by way of their deeper, underlying structures:

Sentences are understood in terms of the grammatical relations of the basic simple structures; and there is a universal synthetic process, the transformations, which out of the finite set of these basic structures builds up complex structures, the number of which is without limit. It is here that the “infinite creativity” of language lies. (Viertel 1964:79–80)14

All of the clarion notes of the program are struck here—traditional grammar, universality, a difference in meaning that underlies superficial sameness, and infinite creativity—and all of these themes evoked the central preoccupation of English studies, style.
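
(The underlying-relations point can be put in a nutshell. The little encoding below is mine, invented purely for illustration; it just records who pleases whom beneath each of the two superficially identical sentences.)

```python
# A toy encoding of the famous pair: identical surface frames, different
# underlying grammatical relations. The dictionary format is invented here
# purely for illustration.

analyses = {
    "John is eager to please": {
        # underlying: John pleases (someone) -- John is the subject of please
        "subject_of_please": "John",
        "object_of_please": "someone unspecified",
    },
    "John is easy to please": {
        # underlying: (someone) pleases John -- John is the object of please
        "subject_of_please": "someone unspecified",
        "object_of_please": "John",
    },
}

for sentence, relations in analyses.items():
    print(sentence)
    for role, filler in relations.items():
        print(f"  {role}: {filler}")
```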

Meanwhile, back at the Bar-TG Ranch, as these broad infiltrations into other fields were going on, Chomskyan theories were undergoing major reformulations. The post–Syntactic Structures years, particularly with the increase of researchers, saw a lot of activity—Lees’s work on nominalizations, Halle’s work in phonology, Katz and Fodor and Postal’s work splicing semantics into the shifting transformational model. Nor was it just Cambridge insiders contributing to the growth and development of that model. Linguists across the United States were becoming Transformationalists.

Bach (at Texas) published a very successful (and briefly definitive) textbook, An Introduction to Transformational Grammar (Bach 1964), that helped take the framework into undergraduate classrooms; and Fillmore (at Ohio) made a major and enduring contribution to what became known as the “transformational cycle” (Fillmore 1963). Indiana was graduating philosophical doctors in Transformational Grammar, including someone named George Lakoff, and places like UCLA, with Stockwell as the founding chair, and the University of Illinois, with Lees taking the same role, were effectively branch plants of MIT. Transformational Grammar increasingly dominated North American linguistics conferences, and was making a splash internationally as well. The journals were filling up.15

A bibliography published in 1965 included nearly a thousand items on Transformational Grammar, reams of them appearing in MIT Quarterly Progress Reports, and over forty including Chomsky on the byline, but many also far-flung about the United States and the world. The bibliography, by William Dingwall, stretches the definition of Transformational Generative Grammar (its title) a fair amount, and includes mimeographed articles, conference presentations, and other ephemera; a crude 50 percent coefficient to control for these tendencies would give us a somewhat more authentic account of the movement’s academic impact. But five hundred items, only eight years after the publication of Syntactic Structures, in a still-modest discipline, is a remarkable footprint all the same, including work by scholars in the Soviet Union, Hungary, East Germany, Czechoslovakia, England, and Wales, and at American universities like Wesleyan, Pennsylvania, Indiana, Illinois, Ohio, Michigan, the University of Texas, and Georgetown (where Dingwall had just received his own doctorate).

Research was beginning to diversify linguistically as well as geographically. Dingwall’s bibliography cites Transformational Grammars of several dozen languages, including Postal’s work on Mohawk, G. H. Matthews’s work on Hidatsa, and assorted theses, dissertations, or other studies on such languages as Arabic, Azerbaijani, Cantonese, Estonian, Finnish, French, German, ancient and modern Greek, Hindi, Italian, Japanese, Mandarin, Russian, Tamil, and Winnebago.

The theory was changing almost daily, and we will see a host of elegant and increasingly persuasive maneuvers in this chapter—trigger morphemes and Δ–nodes, base recursion, the cycle, lexical insertion rules—some to expand the scope of the theory, some to simplify the machinery, some just regulatory (a change in one procedure to adjust for a change made somewhere else for reasons of scope or simplicity or both). Many of these moves, if not most of them, were driven by the nexus of semantic aspirations in Katz and Fodor’s work. We can glimpse the speed with which the theory was changing in Kenneth Hale’s review of Bach’s textbook:

The book under review constitutes a successful attempt to present, in a single volume, a coherent account of the transformational generative framework as of early 1964. That some parts of the work are now, by late 1964 and early 1965, obsolete is by no means an indictment of the author or of the work itself but, rather, an inevitable state of affairs in any vital field where insights are deepening at a rapid pace. (Hale 1965:xxx–xxxi)

The belief that annual, if not monthly, obsolescence is an inevitable index of vitality was common at the time, signaling an overweening confidence far more than the anxiety one might suppose to attend constant change.

There were transformational groups developing in many quarters, and MIT was a hive, the hallways constantly abuzz with the newest ideas or the latest attack. But while there was an increasing number of people doing a wider range of work, there was no question it was all being done in the orbit of one massive intellect, prompting and proposing and promoting and polemicizing. While not every early Transformational Grammarian may agree with Lees’s self-deprecating assessment that “we all rode on Chomsky’s coat tails,” none would deny that his work was driving everything and everyone else.16

Into the Great unNoam

Noam is not a human being. He’s an angel.

—Jay Keyser (personal communication, 1986)

[Chomsky is] the devil’s accountant.

—Avishai Margalit (quoted in MacFarquhar 2003)

Now that we have seen his phenomenal rise but before we look into the war which erupted out of the crowning document of that rise, Aspects of the Theory of Syntax, perhaps we can step out of the chronology for an excursus on Chomsky’s personality. We are in dangerous and enigmatic territory here. Opinions really do range from Keyser’s angel to Margalit’s bean-counter-to-Beelzebub, with very substantial logjams at both ends, and there really is ample motivation for both extremes. I certainly have no special competence for the job of disentangling the elements of his character that have sponsored, in almost equal measure, devotion and demonization. The one consolation I have is that very few people have special competence about Chomsky’s character; as Jay Parini entitled one of the scores of profiles of the man, “Noam’s an Island” (Parini 1988). Haj Ross, who was his student, his colleague, and his opponent, who worked under him and with him for more than two decades in the MIT linguistics department, says he is almost completely in the dark about the man’s personality:

Chomsky is a real mystery man. What he’s like as a person, I don’t know. . . .

There was never any personal contact outside of the university. When people came, they would meet Noam in his office, never even go to lunch or anything. You never go to Chomsky’s house for a party or something, even among staff members, faculty members. . . .

I know very little about Chomsky, where Chomsky’s heart is.

Even the full Chomsky biography that has seen his most active coöperation (Barsky 1997a) “offers hardly any clear glimpses of Chomsky’s personal life, despite being so rich in quotations from him that the blurb touts it as ‘the autobiography that Chomsky says he will never write’ ” (Pullum 1997a:776). Keyser’s assessment has to do more with the seemingly super-anthropic level of Chomsky’s intelligence than with the sweetness of his disposition; Margalit’s, more with political argumentation than with a fiendishness in daily affairs. Of the people involved in our story, probably Halle is the one, outside his immediate family, who best knew where Chomsky’s heart lay.

But there are some rather extreme and inescapable characteristics which contribute to virtually every aspect of our tale and which call for some reckoning, so reckon I will. I have already alluded to several—in particular, his towering intellect and eristic gifts—and these qualities alone are enough to provoke reactions from worship to loathing, with side orders of envy. A few more traits belong on our ledger.

First, his graciousness. “When I was 15,” someone with the handle @natalieisonline tweeted (2:19 AM, December 7, 2018), “I emailed Chomsky with some asinine questions about life, philosophy, and whether writing could make a positive dent on the world.” What came back floored her: “To my shock, he responded with an essay-length reply.” @natalieisonline is not alone. Her story has been multiplied many, many times over, perhaps thousands, or tens of thousands, of times. “If you email him,” a journalist noted, “he will email you back. He’s known for that. Even if you’re a stranger with a random question” (Bartlett 2016:B8). One of his biographers, Wolfgang Sperlich, estimates he receives and responds to about two hundred emails daily (2006:10). It is worth noting, too, @natalieisonline’s addendum to her story. “I think about that a lot,” she adds, “when I meet academics who think they’re God’s gift.”

Chomsky will talk to you, too. He lectures tirelessly, he works long, hard hours, he publishes several books a year, and he will spend hours with absolutely anyone who is interested in talking with him. Sometimes the wait is a long one—his calendar is booked months in advance—but he will patiently, kindly, and in exactly the right level of detail, explain his ideas to linguists, to laypeople, to undergraduates, to other people’s students, to journalists and directors and candlestick makers, and carefully explore their ideas with them. People come to MIT in droves to see him (yes, he is still teaching, now over ninety years old). His linguistics classes, which usually have quite small enrolments, are held in a lecture theater to accommodate the scores of visitors every week. He gives away ideas for free. He has a deeply admirable commitment to scholarly exploration, at all levels.

To students, he is ceaselessly helpful. Virtually all the theses he has supervised (over one hundred), and a large number in which he has had no official role, at a wide range of universities, include heartfelt acknowledgments of the time and thought he has given the work. “I remember well that Noam welcomed me with a smile when I was very nervous at our first meeting,” one remembers. “I will treasure forever the opportunity to work with him!” (Obata 2010:ii–iii). “An incredibly warm, supportive, and down-to-earth person . . . I am deeply grateful to him for the time he so generously devoted to my cause despite a mind-boggling schedule,” says another (Ott 2011:ix). Just three more, from noteworthy members of our dramatis personae:

To Noam Chomsky I owe my conception of linguistic description as well as thanks for long hours of discussion of the theoretical intricacies of the formal grammatical devices used in this work. (Postal 1963:i)

I would like to express my most sincere gratitude to my advisor, Noam Chomsky, who has had a profound influence on my thinking and has been a continuous source of intellectual stimulation such as I had never encountered before coming to M.I.T. (McCawley 1965:7)

What I owe to Noam Chomsky is incalculable. (Ross 1967:ix)

But second, there is a startling counterpoint to that graciousness: his splenetic expressions of contempt for many individuals and for great swaths of thought. Chomsky slings mud with abandon. His published comments can be dismissive in the extreme. His private comments can reach shocking levels of rancor, though almost always delivered deadpan, even kindly, as though he is trying to help someone see their own intellectual and moral inadequacies. “Chomsky can be brutal in argument,” Larissa MacFarquhar has noted, “but except for the words themselves there is no outward indication that he is attacking. The expression on his face doesn’t change. He never raises his voice” (2003). He doesn’t raise his voice in print, either. He may never have used an exclamation mark in his life.

Chomsky has many targets of scorn, from people to intellectual traditions, but one in particular is a source of incredible divisiveness in linguistics: every linguist—often cloaked in a collective noun such as “the field,” since he seems to regard himself as outside that field—who holds opinions about language and linguistics which depart from his, sometimes only very slightly. He often speaks, for instance, about the immaturity of linguistics, contrasting it with the “more mature” sciences, and the context of these remarks invariably betrays forked intentions. He does not mean simply that linguistics is in an earlier developmental stage than physics or chemistry, that linguists have only to solve methodological problems and reach empirical consensuses that workers in those other sciences have already achieved. He also means that linguists are whiny, irrational, petulant children. There are, he always allows, “a few quite serious people,” just “a tiny minority in the field”—who happen coincidentally to be sympathetic with his own goals—but virtually everyone else in linguistics is intellectually and emotionally and even morally callow.17

In part, this attitude seems to reflect a need to work in an us-and-them, or even a me-and-them, intellectual climate. In part, it is delusion. His program is hugely successful, yet he always seems to feel embattled. One of the most influential thinkers of the last six decades, a person who has sponsored some of the most powerful intellectual currents of the twentieth century, and who may not be done yet, Chomsky is ever the scrappy underdog. In part, it is just complete, utter, magisterial arrogance. He’s right; so very many others, with divergent views, are wrong. The rhetoric that flows from his embattled and righteous sensibility can be blistering.

In one form of this alone-against-the-world (or, at least, alone-against-the-bigwigs) ethotic stance, Chomsky has recurrently broken from his teachers and mentors, accusing them of incomprehension and neglect, most scandalously with Zellig Harris, his teacher and supervisor at the University of Pennsylvania, who was extraordinarily influential in Chomsky’s development and highly beneficial to his career (see R. A. Harris 2018a, 2018b). Marcus Tomalin, for instance, remarks that “despite Harris’s early interest in Chomsky, the two men drifted apart as Chomsky matured,” citing one of Chomsky’s frequent assertions that Harris never looked at his work (2006:110–11). Tomalin goes on:

Strangely, the basic pattern of Chomsky’s relationship with Harris (or at least Chomsky’s own account of it) is identical to the basic pattern of his relationship with Goodman. [Tomalin sketches Chomsky’s account of that relationship.] . . . Chomsky’s memories of his friendship with Quine outline a similar pattern. . . . This general pattern of initial closeness followed by sudden separation is clearly of interest and it raises various questions: Why did Chomsky come to be spurned by his early mentors, and precisely which aspects of his work offended them? (2006:111–12)

One might add the linguistic titan Roman Jakobson to “this general pattern” as well, and perhaps Bar-Hillel. Victor Yngve, who employed Chomsky at MIT’s Lincoln Labs, did not remember him fondly. There are no doubt others of that generation who could be added to this list. But Tomalin’s questions may be the wrong ones. There is little evidence that there was something offensive in Chomsky’s work, in his theoretical developments, something different for each replication of this pattern, that somehow repelled his early mentors and friends. There is more reason to believe that it was his perennial sense of being misunderstood and oppressed, his own turning away, that caused the friction. Tomalin’s “drifted apart [from his mentors]” is probably better rendered as “pushed them away.”

Third (or maybe it’s 2a), Chomsky is obsessively argumentative, not just in terms of everyday quarrelsomeness, though there is some of that—one member of Harris’s family, with whom Chomsky visited for long hours as a young man, recalls that he “would follow you from room to room arguing, arguing, arguing you down to dust!” (Nevin 2010:105)—but in a more abstract and technical sense of always, relentlessly, building arguments, dismantling arguments, recombining arguments. Chomsky’s compulsive argumentation manifests in all aspects of his character. He clinically assembles a case for P, and dismantles one for Q. He waxes splenetic about Smith, graciously buttresses Jones. He liberally offers data and connections and analyses to students and other scholars, without the slightest concern for ownership. He also absorbs the work of other scholars without, sometimes, the slightest concern for ownership.

He is, in any event, responsible for some very elegant argumentation. His “Three Models” is beyond classic as an argument in AI. Various incarnations of it circulated widely for decades in linguistics, psychology, and scattered other disciplines, and it still occupies a crucial role in the cursus of computational linguistics (see Linz 2011:296 for a recent account). His poverty-of-the-stimulus argument (latterly rechristened as Plato’s Problem) is still a centerpiece of debates about learning theory, language acquisition, and rationalism (Clark & Lappin 2011). Syntactic Structures is a long, scrupulously precise argument for Transformational Grammar, running numerous subroutines, looping back to recurrent topoi, and sustaining thematic appeals: it placed as the single most important book in the history of cognitive science, in a list of the top 100 assembled by the University of Minnesota’s Center for Cognitive Sciences in 2000, and even made a top-100 list of the most influential books ever written, in all cultures, for all time—a little bit behind the Bible, Confucius’s Analects, and Plato’s Republic, but on the list all the same.18

Ray Jackendoff is positively rapturous about Chomsky’s argumentative eloquence. Syntactic Structures, he says, is beautiful. “Remarks” is beautiful. “Also a great part of Reflections on Language, where [Chomsky] starts talking about mental organs and bringing in the biological analogies for the first time. . . . ‘We don’t learn to have arms instead of wings.’. . . I practically cried when I read that.” But there can also be something of a community-of-the-blessed effect for Chomsky’s arguments. If you are disposed to his views, he seems to hit every note with sureness and grace; and the intellectual world was certainly disposed to respond well to Syntactic Structures: the syntax-hungry, formally inclined, scientifically insecure linguists; the language-eager, formally inclined, algorithmically enamored computer scientists; and the process-oriented, model-preferring proto-cognitive psychologists. It was equally ready for the review of Skinner’s Verbal Behavior, which largely banished behaviorism from the study of language for decades.

That’s not to say Chomsky did not provide a compelling structure of reasons in those works. I’m not saying he’s a lucky stiff, the right guy at the right time with the gift of gab. But there were arguments for transformations, for mathematical precision, for scientific rigor, and against behaviorism already in circulation, arguments that Chomsky marshalled and augmented. Much like Aristotle or any crowning thinker, he collated and winnowed, synthesized, blended, and extended those arguments more powerfully and effectively than anyone else.

And Aspects? Chomsky’s earlier works had laid the track, had set the table, had built the foundation, had groomed the intellectual world and a new generation of linguists for Aspects. It was spectacularly successful.

But for audiences less disposed to his views, especially those invested in other frameworks, the arguments can ring hollow, or incite antipathy. “People say [the same things],” Jackendoff adds, forecasting where we will be a few chapters hence, “about Lectures on Government and Binding. That certainly revolutionized things, but it’s not my era. I don’t see the same force in it.” Others, particularly those who had Chomsky’s oppositional rhetoric turned on them directly, react more bitterly. We will see many arguments and exchanges before we are through, but one brief example of an extemporaneous political wrangle that happened while the Linguistics Wars were in full throttle will give us a flavor of Chomsky in action.

William F. Buckley Jr., a renowned pundit of the right and fanboy of American foreign policy, of which the Vietnam War was the most culturally inescapable excrescence at the time, had a television show, a mano a mano interview show in the 1960s that served alternately as a platform for his political friends, like Richard Nixon and Henry Kissinger, or a whipping post for his foes, a category into which Chomsky fell. Buckley invited him onto the show to rough him up and expose him as an unpatriotic, left-wing, ivory-tower egghead. Early in the interview, he threatened Chomsky that if he got out of hand, “I’d smash you in the goddamn face,” before breaking into his trademark smirk.19

In the middle of an apparently minor semantic exchange about the definition of disinterested with reference to imperial activity (the broader point at issue being whether America was acting disinterestedly in Vietnam, simply to help the populace of the country, or in its own economic self-interest), Chomsky suggests that there "is a conceptual distinction [to be made], but in actual fact, the history of colonialism shows that . . . I mean, there are exceptions—probably the Belgians in the Congo are an exception—but by and large the major imperialist ventures have been in the material interest or in the perceived material interest of . . ."

Buckley breaks in here, as was his wont, with "I'm not interested in the mathematics of the . . ." trying both to derail Chomsky and to tar him as a milquetoast academic. Simultaneously, calmly, Chomsky says, "Let me finish." But Buckley continues the smug bullying: "You have already conceded that it is not merely conceptual difference. You say that there are exceptions." Sproinngg!

Here is how Chomsky snaps the trap:

chomsky: There are a few exceptions.

buckley: All right. Okay. All right. Let’s talk about the exceptions then.

chomsky: Well, now wait a minute. The exceptions I mentioned—for example, the Belgian Congo—there they didn’t even pretend to have a civilizing mission. There it was pure imperial self-interest. These are the exceptions. There are, as far as I know, no exceptions on the other side. . . . The few exceptions are of pure predatory imperialism, without even any pretense of doing anything [beneficial], but these are quite rare.

The exchange goes on, but the snare was set (“there are exceptions”) and sprung (“pure predatory imperialism”), and—go visit YouTube—you can see it dawning on Buckley’s face.

Buckley, who prided himself on his great erudition and especially his knowledge of history, is prodded, with a few deft strokes, to reveal himself on his own pulpit as a shallow historian and a shallower thinker. This moment is pure Chomsky. He is brilliant at setting the agenda in any argument, usually without the person on the other side of the exchange having any idea they have now taken up his terms. Chomsky controls the debate throughout, carrying the day seemingly without effort.

Chomsky is apparently capable of more circumspect argumentation as well. Journalist Glenn Greenwald tells of sharing the stage with Chomsky one night. Prior to going on, he reports Chomsky turned to him with

chomsky: You know, there’s this interesting essay by Albert Camus, written during his first visit to the United States, in which he described his surprise at what he regarded as the poor clothing taste of Americans, particularly men’s choices of ties.

me (slightly confused): Are you sharing that anecdote because you dislike my tie?

chomsky: Yes.

That, Greenwald adds, is “how you receive a fashion critique from the world’s greatest public intellectual” (Greenwald 2016).

Fourth, not unrelated to the reason he was sharing the stage with Greenwald, his compassion: Chomsky is clearly very moved by, and very moving about, the downtrodden, especially when his own government or its agents are the oppressors. Fred Branfman, someone who worked with Laotian refugees in the late 1960s, victims of a cruel covert American bombing campaign, tells of the time that he was helping Chomsky understand their situation. “I was . . . stunned when, as I was translating Noam’s questions and the refugees’ answers, I suddenly saw him break down and begin weeping.” Branfman was especially stunned, he says, because while overwhelming compassion is the appropriate human response to suffering on this scale, story after story of mounting horror, he had brought journalists and activists to the camp before and translated the same atrocities only to find them casually taking notes, perhaps steeling themselves against those emotions (Branfman 2012).

Compassion, with Chomsky, kindled action. Remember how Ross, his student and colleague, said “I know very little about . . . where Chomsky’s heart is?” He wasn’t done. He continued, “except that [his] heart is for people who are oppressed by nasty politics. He works very hard for people like that.” His political activism began in 1964: “It got so horrible [in Vietnam]” he said, “that I couldn’t look at myself in the mirror anymore” (quoted in Knight 2019:59). “The Responsibility of Intellectuals” came out a few years later in the New York Review of Books (Chomsky 1967d), still a bracing manifesto of political engagement, in which he took on Establishment demigods like Henry Kissinger and Arthur Schlesinger. Chomsky opposed the war in Vietnam, and the leaders responsible for it, and the intellectuals supporting them, with every ounce of attention and strength he could wrest from his linguistics. What this meant, in the climate of the time, was not just writing political essays, though he did lots of that, but also

speaking several nights a week at a church to an audience of half a dozen people, mostly bored or hostile, or at someone’s home where a few people might be gathered, or at a meeting at a college that included the topics of Vietnam, Iran, Central America, and nuclear arms, in the hope that maybe the participants would outnumber the organizers. (Chomsky 1987:54–55)

He had the courage to match his compassion, and as the antiwar movement gathered some steam, he gravitated to the forefront of the protests, leading marches, engaging in civil disobedience, getting arrested, and inevitably, ending up on Nixon’s enemy list (“Has any other linguist received [that] accolade?” asked Dwight Bolinger admiringly—1991 [1974]:28).20 It was those activities, of course, that landed him on Firing Line in the first place, Buckley itching to give Chomsky his comeuppance for denouncing American imperialism shortly after the publication of his first political book, a collection of scathing essays entitled American Power and the New Mandarins (Chomsky 1969).

His political interests have not subsided, nor has the time he pours into them, which eats deeply into his linguistics work. The rationing of his energies between the two domains (with two as a rather gross rounding error, each of them having multiple and diverse “subfields” that are enough on their own for any one person’s career) has oscillated over the years but by the 1970s he had almost completely abandoned his work in phonology, in mathematical modeling, and in the history of linguistics, “up to the point of barely reading about them” he says (Chomsky 1982a:57). While he has continued to write and speak on linguistics, philosophy, psychology, and evolutionary theory since his “retirement” in the late 1990s, his political writings and speaking engagements now outpace his linguistic ones substantially, and the huge bibliography of his political writings is now markedly longer than the huge bibliography of his linguistic and philosophical writings, despite the fifteen-year head start of the latter.

He is gracious. He is obstreperous. He shows great compassion. He shows great contempt. He is courageous. He is acrimonious. He works very hard. He is very smart. He is unwaveringly dedicated to the pursuit of ideas generally, fiercely dedicated to the propagation and defense of several specific ideas. He argues congenitally, inexorably, ruthlessly.

These characteristics are unmistakable, and dramatically magnified through the tremendous stature Chomsky rapidly achieved. As early as 1970, just a little more than a decade after the publication of Syntactic Structures, he was the subject of a monograph in the Fontana Modern Masters series, alongside Einstein, and Freud, and Marx (Lyons 1970a). That book had gone into its third edition by the 1990s (Lyons 1991), the only one in the series to have done so, and it has been joined by hundreds of other celebrations—several more with only his name as title, along with such entries as On Noam Chomsky, Reflections on Chomsky, Chomsky: A Guide for the Perplexed, The Cambridge Companion to Chomsky, Chomsky's System of Ideas, The Chomsky Effect, Chomsky Update, The Chomskyan Turn, Chomsky for Beginners (a comic book, with a cover illustration of Chomsky in a superhero outfit), even The Noam Chomsky Lectures: A Play—not counting the thousands of books which assume or teach or attack or mangle his notions in more general terms.21 Chomsky, in fact, is one of the all-time citation kings—again in the company of Freud and Marx, also Yahweh, and a long way beyond Einstein—with hundreds of thousands of references to his work. By the early 1990s, according to the Institute for Scientific Information, the Top Ten list of sources cited for U.S. academic journals over the prior seven years looked like this (Kesterton 1993):

1. Marx
2. Lenin
3. Shakespeare
4. Aristotle
5. The Bible
6. Plato
7. Freud
8. Chomsky
9. Hegel
10. Cicero

Chomsky was the only one on the list with a pulse, let alone an active role on the intellectual stage, and his career was then only about two-thirds along.22 A web search on the single term Chomsky will return tens of millions of hits, not quite in Kardashian territory as we go to press, but Chomsky is assuredly in the influence game for a much longer haul. Geoffrey Pullum, not known for empty encomia, calls Chomsky “the most lionized intellectual of the twentieth century and most famous linguist in history” (1997:776). A Brazilian newspaper goes somewhat further, dubbing him “o deus dos intelectuais,” the god of intellectuals (Angelo 2009). “Reason’s earthly embodiment,” says Jay Parini (1988:37). “In linguistics and cognitive sciences,” Ashish Mehta adds, “he is routinely named as the greatest mind in the last 2,500 years” (2018). Clearing away even the disciplinary boundaries, we get “one of the greatest minds in human history” (Al Jazeera 2015). So: incredibly smart, hugely influential.

At some point, the phonologically inevitable happened, and one can now purchase Gnome Chomsky the Garden Noam™, a 17-inch figurine with nerdy glasses, tweed jacket, and a pointy cap; he is clutching two books, The Manufacture of Compost and Hedgerows not Hegemony. (Chomsky owns one himself, given to him by his grandchildren [Tanenhaus 2016]).23 He is, as Tom Bartlett puts it, souvenir-level famous (2016). There are mugs, posters, T-shirts, bookmarks, greeting cards, a graphic novel (Wilson 2019). In the Viggo Mortensen movie Captain Fantastic, a counterculture, family-off-the-grid flick, the family does not celebrate Christmas. They celebrate Noam Chomsky Day, on December 7th (his birthday), complete with decorations, gift exchanges, and carols (“Uncle Noam! It’s the day of your birth!”—M. Ross 2016).

While this is a point of contention—little that implicates Chomsky is not a point of contention—there is no doubt that his fame/notoriety is bolstered from two sides, academic and political. Tom Wolfe spins off into figurative paroxysms here, but he gets deep-level facts of Chomsky’s trajectory right.

Chomsky’s politics enhanced his reputation as a great linguist, and his reputation as a great linguist enhanced his reputation as a political solon, and his reputation as a political solon inflated his reputation from great linguist to all-around genius, and the genius inflated the solon into a veritable Voltaire, and the veritable Voltaire inflated the genius of all geniuses into a philosophical giant . . . Noam Chomsky. (Wolfe 2016:104)

This mutual reinforcement is to be expected, of course. How could renown in one field not affect renown in another, and how could they not, together, confer greater overall renown? Compounded fame is certainly not uncommon. We see it in Linus Pauling’s career, Bertrand Russell’s, Albert Einstein’s.

Returning to the 1960s and to Chomsky’s more local influence, on his students, and on their students, rippling out into the ethos of the field, his graciousness and his spleen had immediate and lasting consequences. He has supporters, vehement ones. He has opponents, vehement ones. His withering polemical style has led emulators to embarrassing extremes, and he has many emulators on this score, both defending him and attacking him. James McCloskey has wondered why it is that “phonologists, morphologists, and semanticists can all survive and coöperate in courteous disagreement, but syntacticians seem to thrive on a more robust diet of anger, polemic, and personal abuse” (1988:18). There is nothing inherently factionalizing about syntax, and there may be several answers, but unquestionably one of the most significant is “Noam Chomsky”. His demeanor has defined the field, not just because he hurt some feelings by displacing the Bloomfieldians and quashing the Generative Semanticists and contemptuously dismissing stray attackers, but because of the truculent style he uses to those ends, and because of the people he has attracted and trained and provoked with that style. Largely because of him, a gunslinger mentality suffused the field in the early 1960s, continued on for decades afterwards, and still sponsors both scornful aggression and dismissive neglect of work between the major frameworks.

The first ripples of Chomsky's tsunamic influence on the ethos of the field, once the Bloomfieldians were dispatched, began with his followers' exegesis of Aspects, alongside other volumes of Chomskyan writ, a process epitomized by the shift in perspective the Generative Semanticists had toward their former leader. At the outset, they "all felt they owed an allegiance deeper than professional commitment to Chomsky—it verged on worship," Robin Lakoff remembers. But "once Chomsky was seen not to be an idol," once his erstwhile disciples fell under the bludgeoning of his terrible rhetoric, "he was recast as satanic, the Enemy" (1989:963, 970).

This process began, innocently enough, with the exploration of Aspects’ central notion, Deep Structure.

Stalking the Hidden Harmonies

The syntactic component of a grammar must specify, for each sentence, a deep structure that determines its semantic interpretation, and a surface structure that determines its phonetic interpretation. . . . My concern in this book is primarily with deep structure and, in particular, with the elementary objects of which deep structure is constituted.

—Noam Chomsky (1965 [1964]:15, 16; emphasis in original)

Deep Structure is arguably the most famous and pervasive of Chomsky’s coinages, but the basic notion it regularly evokes goes back to the very first speculations about language. Ultimately, there seems to be something behind, underneath, preceding, sponsoring, underlying—in any case, prior and determining—the words that come out of our mouths. There is thought, we feel, which determines our speech, and the whole apparatus of language seems like a process or a mechanism that “takes” thought and “makes” talk. That’s why mediational models are so attractive. They don’t just mediate sound and meaning. They mediate the inner and the outer, the deep and the superficial. Deep Structure is Chomsky’s nexus for the inner and the outer, the deep and the superficial. It is where meaning coalesces.

One of the principal weapons in Chomsky’s thrashing of the Bloomfieldians was their failure to look underneath the surface of language. That’s why they couldn’t account for the active-passive relation or explain discontinuous dependencies. That’s why the underlying structures of his model were so compelling. Aside from a few mopping-up operations (Chomsky’s counterattack on Reichling and Uhlenbeck; Lakoff’s counterattack on Hockett; McCawley’s attack on Hockett; Postal’s attack on everyone who endorsed constituent structure; Postal’s attack on Bloomfieldian phonology; Postal’s attack on Martinet; Postal’s counterattack on Dixon . . .),24 the rout was all but complete by 1965, when Deep Structure debuted in Aspects of the Theory of Syntax, Chomsky’s magnificent summary of the developments introduced or incited by Syntactic Structures. In the geekspeak of 1960s Transformationalism, Syntactic Structures was the Old Testament, Aspects, the New.

The labels are apt for the near-scriptural authority of those books, but they are exactly wrong in textual terms. Syntactic Structures has the clarity, univocality, even, in a sense, the narrative drive, of the New Testament; Aspects, a book full of refinements and extensions, has fewer of those qualities. The tone can also be more oracular in places, and, while there is no Old Testament wrath, we do get more belligerence.

Syntactic Structures is also pretty much a one-man show; it relies on innovations by Zellig Harris, and shows the influence of Charles Hockett, but little beyond that. By 1965, Transformational Grammar had been thoroughly transformed—added to, subtracted from, permuted. While the Aspects theory was still very much Chomsky's, it showed signs of refraction through a committee. Aspects is more comprehensive, broader in its implications, and much more detailed technically than Syntactic Structures. But it is also blurrier around the edges, and cagier in its assertions. This caginess is nowhere more apparent than with the central hypothesis of the Aspects model, the Katz-Postal Principle—the pivot for our whole story—which says that transformations are semantically transparent, that they have no impact on meaning, preserving it from the underlying structure up through to speech. The Katz-Postal Principle guarantees a free ride from ideas to soundwaves, which makes Deep Structure a kind of universal semantic solvent, dissolving the welter of problems that had long kept meaning at bay in linguistics.

Aspects’ endorsement of Katz-Postal rings clear and loud in the main text. If anything, there are indications it might not go far enough, that syntax and semantics might absorb one another (1965 [1964]:158–59), so that there is no “level” representing meaning at all, that meaning could percolate up directly from thought. But then there’s this footnote that hints, on the other hand, maybe the principle could be “somewhat too strong” (1965 [1964]:224n5). Maybe some aspects of meaning aren’t deep at all. Maybe they show up on the surface. Aspects’ hermeneutical potential, that is, looks much closer to the dense, prophetic books of the Old Testament than to the uncomplicated stories and modest parables of the New Testament, and subsequent generations of linguists have found support in it for an amazing range of positions.25

The first two positions to grow out of Aspects—Generative Semantics and Interpretive Semantics—can in fact be traced to exactly that tension between the main text’s claims about the Katz-Postal hypothesis and its but-then-again footnote. Chomsky says perhaps it is too weak. The proto-Generative-Semanticists took that as their mission statement. Make it stronger. But then again, Chomsky says, well, perhaps it is too strong. The Interpretive Semanticists, with some coaching, took that as their mission statement. Undermine and disconfirm it. Both sides took Aspects as their defining document; Chomsky, as their spiritual leader.

The Inevitability of Deep Structure

It would be absurd to develop a general syntactic theory without assigning an absolutely crucial role to semantic considerations.

—Noam Chomsky (1966b [1964]:20)

There is a “main division of grammar,” Jespersen says, that proceeds “from the interior or meaning,” a division that deeply implicates logic and universality. “We call this syntax,” he says (1965 [1924]:45). He might have been defining the linguistics research at MIT between Syntactic Structures and Aspects of the Theory of Syntax.

Much, perhaps most, of the Transformational-Generative energy was expended in and around MIT to migrate from kernel sentence, a loose and still relatively superficial notion heavily reliant on Zellig Harris’s work, to Deep Structure, the nexus of meaning, and definitive of Noam Chomsky’s work. There is a good deal of wishful thinking in the direction of a kind of super-kernel from early on, even if the name, Deep Structure, was still a twinkle in Chomsky’s eye. “It would be a great step forward,” Lees said, in his highly influential Syntactic Structures review, “if it could be shown that all or most of what is ‘meant’ by a sentence is contained in the kernel sentence from which it is derived” (Lees 1957:394). Then, in his paradigmatic monograph, Grammar of English Nominalizations, he added, “It is possible that the problem of explaining how sentences mean might be reduced to the simpler problem of the meanings of kernel sentences” (Lees 1968 [1960]:7). Most tellingly, Chomsky says in Aspects, once Deep Structure has been forged and put into place, “This is the basic idea that has motivated the theory of transformational grammar since its inception,” an inception he backdates all the way to the seventeenth century (Chomsky 1965 [1964]:146; 221n33).

The overall program reduced the power and range of transformations, distributing their activities elsewhere and preventing them from changing meaning. The program had three distinct subroutines: building a lexicon, a formal dictionary, a home in the grammar for the words; beefing up the Phrase Structure Rules, which did all of their work pre-transformationally; and eliminating one whole class of transformations altogether, so-called Generalized Transformations, the ones that most obviously altered meaning.26

The lexicon, and much else, appears in a collaborative paper by Jerrold Katz and Jerry Fodor (along with sundry companion pieces by Katz) that was a self-conscious attempt to do for semantics what Chomsky had done for syntax, probing such issues as anomaly, ambiguity, and redundancy. Katz and Fodor took up the job of explaining semantic interpretation, which fell to two components—a lexicon (they called it a dictionary, but linguists have subsequently preferred the more technical lexicon) and a set of Semantic Interpretation Rules (they called them projection rules). The lexicon contained word entries specified for semantic content, but also for syntactic behavior (they ignored pronunciation).

The meaning of bachelor, for instance (more specifically, one meaning of bachelor), was represented as [+noun, +male, +human, -married] (a plus indicates the word includes the following concept, a minus that it lacks that concept).27 The syntactic feature, [+noun], specified that bachelor could participate in a noun phrase, like the ones in these sentences:

[Example sentences 17–20, not reproduced here]

The semantic features determine various meaning relations. The feature assignment [+male], for instance, gives bachelor a sex-antonymy relation with the word spinster in 17 (which, you guessed it, is [-male]), a relation that defines great lexical swathes of English ("man and woman, aunt and uncle, bride and groom, brother and sister, cow and bull"; Katz & Fodor 1963:187); we are in an academic era when bachelorhood and sex-antonymy were on the minds of philosophers, who were predominantly [+male] and either bachelors or perhaps wished they were. The feature [-married] makes 18 anomalous and 20 redundant. As for [+human], it eliminated confusion with another usage of bachelor, which Katz and Fodor gloss as "young fur seals . . . without a mate during breeding season" (1963:185).

Katz and Fodor’s goal was to provide a theory of semantics that could run in parallel with Chomsky’s syntax. Syntactic Structures was preoccupied with separating grammatical sequences from the ungrammatical sequences. The first job Katz and Fodor set for their rules was to separate sense from nonsense, to determine whether a given sequence of English elements was anomalous or not. They also had their rules detect two other familiar features of meaning that philosophers wrangle with, ambiguity and sameness of meaning. The features did all this rather elegantly, at least for the test cases they provided. So, a [+married] adjective modifying a [-married] noun, like the married bachelor in 18, is anomalous; or, for the rhetoricians in the crowd, oxymoronic. The pair of [-married] words in 20 (bachelor and unmarried) mark their respective clauses as paraphrases. And with [+human] of bachelor in 19 alongside the [-human] of quantum, we just get nonsense.
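For readers who like to see the mechanics run, here is a minimal sketch of the feature idea in Python. The entries, feature names, and checking functions are my own toy inventions, far cruder than Katz and Fodor's projection rules, but they show how binary features can flag the oxymoron of 18 and the redundancy of 20.

```python
# A toy sketch of Katz-and-Fodor-style binary features (illustrative only;
# their actual projection rules were far richer than this).

LEXICON = {
    "bachelor":  {"noun": True, "male": True, "human": True, "married": False},
    "spinster":  {"noun": True, "male": False, "human": True, "married": False},
    "married":   {"married": True},    # the adjective, as a modifier
    "unmarried": {"married": False},   # likewise
}

def anomalous(modifier, noun):
    """Anomalous (oxymoronic) if modifier and noun assign opposite values
    to some shared feature, as in 'married bachelor'."""
    m, n = LEXICON[modifier], LEXICON[noun]
    return any(feature in n and m[feature] != n[feature] for feature in m)

def redundant(modifier, noun):
    """Redundant if the modifier only restates features the noun already
    carries, as in 'unmarried bachelor'."""
    m, n = LEXICON[modifier], LEXICON[noun]
    return all(feature in n and m[feature] == n[feature] for feature in m)

print(anomalous("married", "bachelor"))    # True  (cf. the married bachelor of 18)
print(redundant("unmarried", "bachelor"))  # True  (cf. the paraphrase pair in 20)
```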

Katz and Fodor built a new wing for the grammatical house blueprinted in Syntactic Structures, but they were unhappy: “It would be theoretically most satisfying,” they sighed, expressing a general MIT sentiment, “if we could take the position that transformations never changed meaning” (1963:206). But they couldn’t, not yet.

In fact, there were transformations whose job was to change meaning, like Tnot which linked sentences like 21a to their own negation (21b) by introducing a not.

[Examples 21a and 21b: a sentence and its negation]

Jerrold Katz, though, was on the case—this time with Postal, in their (1964) Integrated Theory of Linguistic Descriptions—ferreting out such transformations and inoculating them against meaning change. For Tnot, Katz and Postal adopted neg, an abstract marker now supplied by the Phrase Structure Rules (1964:73f), something like this (modifying our Rule 1a on page 23 above):28

[Rule 1a, revised to include an optional neg]

The parentheses mean that the neg is optional (like the have -en and be -ing of 1d), so the derivation of Sentence 21a is unchanged, but 21b now has a new underlying structure, 21c, and Tnot would kick in automatically whenever it saw that neg (and not otherwise).

[Example 21c: the underlying structure of 21b, containing neg]
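The trigger mechanism itself is easy to mock up. The sketch below uses my own symbols and a deliberately crude placement rule, not Katz and Postal's notation; the point is only that the base optionally supplies neg, and the stand-in for Tnot applies exactly when that trigger is present, so the transformation no longer decides on its own whether a sentence gets negated.

```python
# Toy sketch of an abstract trigger (illustrative symbols, not Katz & Postal's
# actual rules): the base optionally supplies 'neg', and the negation
# transformation applies if and only if it finds that trigger.

def expand_sentence(include_neg):
    """Stand-in for the revised Sentence rule: Sentence -> (neg) NP VP."""
    return (["neg"] if include_neg else []) + ["NP", "VP"]

def t_not(symbols):
    """Stand-in for T_not: rewrites the abstract 'neg' as a surface 'not'
    (placement here is deliberately crude)."""
    if "neg" not in symbols:
        return symbols                       # no trigger, no application
    out = [s for s in symbols if s != "neg"]
    out.insert(out.index("VP"), "not")       # drop 'not' in front of the VP
    return out

print(t_not(expand_sentence(include_neg=False)))   # ['NP', 'VP']
print(t_not(expand_sentence(include_neg=True)))    # ['NP', 'not', 'VP']
```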

Sooo (listen: you can hear the sound of Katz and Postal drumming their fingers on their desktops), . . . Phrase Structure Rules can introduce special meaning-bearing elements, can they? And these thingys can also trigger transformations, can they? Hmmm, . . .

Eureka (cue the light bulbs over their heads)! If we can have a neg marker, how about some others? Question Transformations, TQ and TWh, change meaning slightly, so let's give them a trigger (TQ and TWh make questions by moving around auxiliary verbs, and TWh also inserts a wh-word, like who or what; don't worry about the details).29 So does the imperative-forming transformation, TImp, so it's going to need one, too. Now familiar sentences like 22a, 23a, and 24a show up in the grammar with underlying structures like 22b, 23b, and 24b.

[Examples 22–24: question and imperative sentences paired with their triggered underlying structures]

If you’re paying attention, you noticed that there is an abstract symbol in one of these examples that I didn’t warn you about, the Δ, called a delta node, or dummy symbol (1965 [1964]:122). If not, go back and look at 23b. That Δ was suggested by Katz and Postal (1964:72–73), but it is Chomsky who really brings the notion home (1965 [1964]:122 et passim). Syntactic Structures derived 23a from kernels like 22a, and from kernels like Shelly kissed Jeff, Lynda kissed Jeff, RuPaul kissed Jeff, and so on, virtually an infinite set of possible relations. We get some weird metaphysics here. It’s almost as if the answer to who did the kissing was somehow known in the Syntactic Structures universe ahead of time, and the transformation strips out that knowledge.

Whatever. The complication for Katz and Postal's mission of rendering transformations semantically inert is obvious: TWh caused virtually infinite meaning loss; whoever the kisser might have been was known to the kernel, and then gone by the time the question surfaced. But we know that the word who (like which, what, and so on) only functions in English questions precisely because it is wholly ambiguous over the range of Nirm, Shelly, Lynda, RuPaul, and all other conceivable kissing entities. The beauty of Δ is that it is a placeholder for precisely such ranges. Delta nodes were exciting, and their many, many descendants would see a lot of action in generative grammar subsequently; they're still around. Robin Lakoff calls Δ nodes "an open sesame" to a cave of riches that fueled the Wars and filled Chomsky's coffers for a long time to come.

Katz and Postal’s assortment of abstract symbols cleaned up a lot of semantic complications, but the biggest remaining problem was with a class of transformations, called Generalized Transformations, that spliced together different kernels—like 25a–25d, which could end up combined in sentences like 25e–25g—and therefore had all kinds of semantic effects.

[Examples 25a–25g: kernel sentences and the combined sentences built from them]

We’ll get to the rather obvious semantic effects shortly, but there’s a more immediate problem, an issue that Lees dubbed the “ ‘traffic rules’ problem” (1968 [1960]:1, 53–57). Keeping transformations from crashing into each other on the derivational roadways could be tricky indeed. How, for instance, would one get 25e from 25a–25d? A small clutch of transformations is implicated, producing the nominalization, repair, the possessive, toaster’s, embedding some version of 25b into 25c, and joining that spliced sentence with 25a, and so on, but Passive is part of the process as well, at two distinct points in the derivation (preceding and following the embedding), and of course, we need our regular mopping-up transformation, Affix-hopping, at least three times, one for each kernel.

Try it yourself. Get a pencil and a sheet of paper and plot it out. You might need an eraser, too, some extra sheets of paper.

. . .

Take your time.

. . .

Back so soon? That's okay; in Syntactic Structures, derivations present a confusing tangle of roadways. To sort out the traffic, reducing accidents and infractions, it took a couple of guys smarter than you and me—Chomsky, of course, but also Charles Fillmore, who came up with the flow chart we get in Figure 2.2.


Figure 2.2 The interaction of Simple Transformations and the two types of Generalized Transformations in the Syntactic Structures model. Adapted from Fillmore 1963:209.

The flow chart may look a bit kludgey, but it would be difficult to diagram a coherent flow to the rules before the innovations Figure 2.2 represents.30 There were some specific ordering restrictions, but, really, it was a free-for-all. All the transformations wedged in together, trying to get somewhere, blaring their horns and expostulating expletives, with occasional bursts of action when a clog broke loose. New Delhi traffic. Figure 2.2 is Fillmore's proposal to get the transformations moving together. We can bypass the specifics, but the big advance is in the arrows that reverse the flow, so that all the rules are not just jamming forward. What this means is that the Simple Transformations get to run recurrently between the running of the Generalized Transformations. Fillmore calls this a "re-cycling framework" (1963:16), and it introduced what became known as the Transformational Cycle, which proved to be among the most enduring ideas in Chomskyan linguistics.
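For the schematically minded, here is a drastic simplification of the re-cycling idea (mine, not Fillmore's flow chart): the simple transformations make a pass over each kernel, a generalized transformation splices one result into another, and then the simple transformations get another pass over the combined structure, rather than every rule trying to squeeze through in a single pass.

```python
# A drastically simplified sketch of the "re-cycling" order of rule
# application (not Fillmore's actual flow chart): simple transformations
# reapply after each generalized (embedding) transformation.

def simple_transformations(clause):
    """Stand-in for the within-clause rules (Passive, Affix-hopping, etc.)."""
    return f"T[{clause}]"

def generalized_transformation(matrix, embedded):
    """Stand-in for an embedding rule: splice one clause into another
    at a placeholder slot (the slot is purely for this illustration)."""
    return matrix.replace("_", embedded)

kernels = ["the toaster _ was repaired", "someone had broken the toaster"]

# First cycle: tidy up each kernel on its own.
cycle_one = [simple_transformations(k) for k in kernels]

# Embed the second kernel into the first ...
spliced = generalized_transformation(cycle_one[0], cycle_one[1])

# ... and the simple transformations run again over the whole result.
print(simple_transformations(spliced))
```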

The Syntactic Structures model became more efficient, more elaborate, and semantically richer the way science usually advances: lots of people pulling in the same direction: Fodor, Katz, Postal, and Fillmore, and Lees, among the most prominent. But there is often someone who pulls everything together as well and ties off the bundle. That would be—no surprises here—Noam Chomsky. He provided the masterful finishing touches that culminated with Aspects (and now we're returning directly to the meaning-alteration issues we suspended to talk about traffic issues). Chomsky combined Fillmore's insights with a Katz and Postal dummy-symbol suggestion (1964:120ff), and some of his own earlier work with Halle (Chomsky, Halle, & Lukoff 1956, Halle & Chomsky 1960)—really, a kind of erector set of available hypotheses—under the general mandate to redistribute transformational duties to other components of the grammar and concentrate meaning in the underlying structure, hugely reducing the semantic perturbations of the now-diminished transformations. The payoff? One-stop semantic shopping at a level of analysis he called Deep Structure.

Generalized Transformations combine sentences in various ways (if you look back to our list of transformations, they are represented by TConj and TSo), so they completely undermine semantic concentration of the Deep Structure sort: no amount of finagling and enriching of the underlying phrase markers can predict the semantic consequences of incorporating another sentence. But Generalized Transformations were absolutely crucial for an aspect of language Chomsky regards as the single most important criterion for languagehood, recursion.

Recursion is what gives language creativity—gives us creativity—the ability to generate infinite output from finite input. I’ll say that again, because it is a really big deal for Transformational Grammar: for what Chomsky was up to in developing it and for what almost everybody (computer scientists, English studies people, psychologists) found to be most compelling about Chomskyan linguistics: it models a situation whereby finite resources—a bag of words and a slate of rules—generate infinitudes of structures, from “The cat is on the mat” to “There are more things in heaven and earth, Horatio, than are dreamt of in our philosophy” and “I do not like that Sam-I-Am,” and everything in between. Recursion shows up when some operation replicates its own elements in a potentially iterative way: 25a–25d are all sentences, but TConj and Tso take those sentences and make new sentences (25e–g). And either of them could take those new sentences and make newer sentences, and do it over, again, and again, and again—to the end of time in principle, and beyond.

Sentences 25e–g, that is, are just the tip of the infinity iceberg. Maybe they aren’t very creative if we put Green Eggs and Ham or Hamlet on the table, but they are technically creative in the sense of producing new, never before produced utterances. Waxing somewhat creatively (though shy of Shakespearean and Seussian levels) Howard Lasnik says of the Syntactic Structures model that “the infinitude of human languages is the responsibility of generalized transformations” (2015:169). But, in a brilliant and sweeping move, Chomsky eliminated them entirely, hanging onto creativity in the bargain. He reassigned recursion.

Chomsky made a few small changes to the Phrase Structure Rules that had big implications. He did it in a few deft ways, but let’s just look at the noun phrase rule, to which he added an optional Sentence:

[Rule 26: the noun phrase rule, now with an optional Sentence]

Voila, recursion in the base: the Sentence Rule (1a) introduces NP into a derivation, but now the NP rule can introduce a sentence, which can introduce an NP, which can introduce a sentence, which . . .31 In Syntactic Structures, we would need a special Generalized Transformation to produce relative clauses. But with a rule like 26 we can get potentially infinite recursive sequences of this sort:

[Examples of recursively embedded noun phrases]
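A tiny rewriting system makes the point concrete. The rules below are my own miniature stand-ins, not rule 26 or Chomsky's categories in any detail, but they have the crucial property: because NP can optionally re-introduce a Sentence, a finite set of rules generates structures of unbounded depth (the demo caps the depth only so that it halts).

```python
import random

# A miniature base component (my own toy rules, not rule 26 itself) showing
# recursion in the base: NP can optionally re-introduce S, so a finite set
# of rules yields arbitrarily deep structures. A depth cap keeps the demo finite.

RULES = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"], ["Det", "N", "S"]],   # the optional embedded Sentence
    "VP":  [["V", "NP"]],
    "Det": [["the"]],
    "N":   [["claim"], ["toaster"], ["senator"]],
    "V":   [["repaired"], ["denied"]],
}

def expand(symbol, depth=0, max_depth=3):
    if symbol not in RULES:                      # a terminal word
        return [symbol]
    options = RULES[symbol]
    if depth >= max_depth:                       # stop embedding new sentences
        options = [o for o in options if "S" not in o] or options
    expansion = random.choice(options)
    words = []
    for sym in expansion:
        words.extend(expand(sym, depth + 1 if sym == "S" else depth, max_depth))
    return words

random.seed(0)
print(" ".join(expand("S")))   # one randomly generated (toy) sentence
```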

Base-recursion was a grand and dazzling and typically Chomskyan stroke. On the surface, the development had no semantic motivations at all, just a way to bring some efficiency to the grammar. But all the discarded transformations were troublesome meaning-changers. Chomsky’s move also, for good measure, eliminated an entire class of Semantic Interpretation Rules that Katz and Fodor had matched to Generalized Transformations (a proposal that caused some embarrassment—1963:207), leaving behind a tidier set of semantic rules as well. Things were getting better all the time, and the improvements warranted a terminological change. Kernel was out. Deep Structure was in.

There were flies in the semantic ointment, and they would soon come out to buzz. Briefly though, Chomsky produced a triumphant testament to a centuries-long program to mediate form and meaning, and one that did pretty well getting the distributional ducklings in order as well. Aspects of the Theory of Syntax was the shining synthesis of a remarkably productive period in linguistic history, Deep Structure its brightest jewel. “The notion of Deep Structure caught the imagination of thousands of scholars in half a dozen disciplines,” Pullum notes, and “changed the whole mood of language study: Linguistics wasn’t a matter of classifying parts of sentences anymore; it was about discovering something deep, surprising, and hidden” (2015).

The Model

Thus the syntactic component consists of a base that generates deep structures and a transformational part that maps them into surface structures. The deep structure of a sentence is submitted to the semantic component for semantic interpretation, and its surface structure enters the phonological component and undergoes phonetic interpretation. The final effect of a grammar, then, is to relate a semantic interpretation to a phonetic representation—that is, to state how a sentence is interpreted.

—Noam Chomsky (1965 [1964]:135)

The Aspects transformational model was a hit, first in the burgeoning Chomskyan community and then out into the academic world at large. Deep Structure was technically breathtaking and rhetorically powerful, suggesting that cognitive bedrock had been reached.


Figure 2.3 The Syntactic Structures and Aspects grammars

The Aspects grammar, diagrammed out in Figure 2.3 next to its predecessor, is certainly more technically detailed, perhaps more intimidating, than the simple—really, in retrospect, rather quaint—model of Syntactic Structures. But, firstly, remember what happens when you open that box labeled Transformations in the Syntactic Structures model, as Fillmore did: you get a snarl of different and largely unregulated rules. Its simplicity is deceptive. And, secondly, the Aspects model is far more complete, more richly detailed, and, for those reasons, more responsible. Look! There are semantics! There is a lexicon! With its own rules!

And it worked beautifully. Haj Ross, the post-Aspects boy wonder, still waxed lyrical, thirty years later, about putting the machinery of Aspects to work solving the hard problems of syntax:

There’s this mechanical thrill of having got this set of rules ordered trickily and you send these constructions through them, and then, by God, here comes this long fucker twenty words long out the end, and it works! There was this real Gyro Gearloose thrill of linguistics, doing that kind of thing then. And there was also this feeling that we basically knew about 85 percent of the correct answer. There was still some stuff we didn’t know about, like maybe superlatives or purpose clauses, but that was peripheral. We’d get to it, and the basic stuff was in place and if we twiddled the knobs a little bit . . . (Huck & Goldsmith 1995:126)

Syntax had been tamed, and meaning was on the horizon. With Katz and Fodor’s work, Ross recalls, “We’d been given permission to think about semantics” (Huck & Goldsmith 1995:122). Integrated Theory had folded those semantic insights into the grammar, formulating the principle that gave Deep Structure the pivotal role, the Katz-Postal Principle that transformations don’t change meaning.

If some parts of the diagram are still a little vague, don’t worry. They were vague at the time, too. Some of the arrows and boxes in Figure 2.3 were little more than that, arrows and boxes, with no clear specification of what they meant. No one bothered to specify what a Semantic Interpretation Rule was, for instance, beyond the programmatic remarks of Katz and Fodor (1963), or to specify what a semantic representation looked like. The whole right side of the diagram was shrouded in obscurity. But it was suggestive obscurity, pregnant obscurity. Phrase Structure Rules were pretty well investigated by this point, and so were transformations. Deep and Surface Structures bloomed profusely in all the papers of the period. Productive, innovative, challenging, Gyro-Gearloose, knob-twiddling work was afoot.

No one was much concerned about the lack of detail in the semantic component. It would come, and, in any case, Katz and Fodor had defined semantics as an essentially residual matter. Their equation was "linguistic description minus phonology and syntax equals semantics" (1963:172), what was left over when the sounds and the syntax were taken care of. Syntax wasn't finished quite yet (nor was phonology, but that's another story).32 Syntax, in fact, was expanding, or, in the terms of the period, it was getting deeper: becoming more abstract; welcoming arcane theory-internal devices like triggers and dummy symbols; and embracing the content of sentences much more directly, with feature notation and selectional restrictions. There was a lot of semantic action in the Aspects model, that is, but almost all of it was happening on the syntactic side of the diagram. The form/meaning "relation is mediated by the syntactic component of the grammar," in Chomsky's words, "which constitutes its sole 'creative' part" (1965 [1964]:135–36).

Deep Structure was the star of the show but it is literally inconceivable without its other half, Surface Structure. One had more cachet but it was their pairing that made the Aspects model so appealing. “In the mid-1960s, Deep Structure was an exciting idea,” Eric Wanner recalls, but when he describes that excitement, he features both structures: “[Deep Structure] provided a formal justification for the intuitively appealing notion that sentences have an underlying logical form that differs from the surface arrangement of words and phrases” (1988:147).

It was easy to lose sight of the fact that neither structure has any privileged position in the theory; they were mutually dependent. The charismatic power of the term Deep Structure, and the fact that deep syntactic relations were starting to merge with semantic relations—and frequent statements from Chomsky like “one major function of the transformational rules is to convert an abstract Deep Structure that expresses the content of a sentence into a fairly concrete Surface Structure that indicates its form” (1965 [1964]:136)—led almost everyone to think of Deep Structure on a much grander scale: the long-awaited grammarian’s stone linking the drossy material of speech to the golden realm of thought.

Chomsky was front and center in creating these impressions. He made Deep Structure the keystone of his bridge to Universal Grammar in Cartesian Linguistics (1966a), which celebrates an earlier period of grammatical wisdom; and gave Deep Structure a starring role in his academic best-seller, Language and Mind (1968), in which he argued that his transformationally mediated deep-to-surface-structure grammar was the architecture of a universal cognitive endowment for language; and featured it in numerous talks and articles across the disciplines.

For an audience of philosophers he foregrounds the term philosophical grammar, and describes the “underlying deep structure that conveys the semantic content” of a sentence as “a system of . . . propositions, present in the mind when the physical sentence is produced and understood” (1965b:18).

For English teachers and professors, Deep Structure, thought, and universalist principles all march together in an advance-of-reason narrative. The story, beginning in the seventeenth century and linked to Descartes, goes like this:

Universal grammar developed in part in reaction to an earlier descriptivist tradition which held that the only proper task for the grammarian was to present data, to give a kind of "natural history" of language (specifically, of the "cultivated usage" of the court and the best writers). In contrast, universal grammarians urged that the study of language should be elevated from the level of "natural history" to that of "natural philosophy"; hence the term "philosophical grammar," "philosophical" being used, of course, in essentially the sense of our term "scientific." (Chomsky 1966d:587)

That was a golden time, too, he reminds his readers, in which the English professor’s cherished notion of creativity was a defining concern of language scholars. But the Camelot of Cartesianism was short-lived. The backward forces, the cataloguers, the mere curators of data, took over yet again, with the achievements of the universalist school being “very rapidly forgotten, and an interesting mythology developed . . . [that] its assumptions about language structure have been refuted by modern ‘anthropological linguistics.’ ” Au contraire:

This is not only untrue, but, for a rather important reason, could not be true. The reason is that universal grammar made a sharp distinction between what we may call “Deep Structure” and “Surface Structure.” The Deep Structure of a sentence is the abstract underlying form which determines the meaning of the sentence; it is present in the mind but not necessarily represented directly in the physical signal. The Surface Structure of a sentence is the actual organization of the physical signal into phrases of varying size, into words of various categories, with certain particles, inflections, arrangement, and so on. The fundamental assumption of the universal grammarians was that languages scarcely differ at the level of Deep Structure—which reflects the basic properties of thought and conception—but that they may vary widely at the much less interesting level of Surface Structure. But modern anthropological linguistics does not attempt to deal with Deep Structure and its relations to Surface Structure. Rather, its attention is limited to Surface Structure—to the phonetic form of an utterance and its organization into units of varying size. Consequently, the information that it provides has no direct bearing on the hypotheses concerning Deep Structure postulated by the universal grammarians. And, in fact, it seems to me that what information is now available to us [through the breakthroughs of Transformational Generative Grammar] suggests not that they went too far in assuming universality of underlying structure, but that they may have been much too cautious and restrained in what they proposed. (Chomsky 1966d:587–88)

Competence and Performance

To think of a generative grammar in these terms is to take it to be a model of performance rather than a model of competence, thus totally misconceiving its nature.

—Noam Chomsky (1965 [1964]:150)

The Aspects model is a beautiful model. But what is it a model of?

For the Bloomfieldians a grammar modeled data, in the form of corpora (a corpus being a collection of linguistic specimens). Initially, Chomsky adopted the Bloomfieldian idiom, talking a lot about corpora and their relation to linguistic theory in Syntactic Structures. “A grammar of English is based on a finite corpus of utterances,” he said (1957a:49). But that might have been the last kind thing he said about corpora, which he has subsequently scorned throughout his career. Among the Bloomfieldians, he had absolutely no compunction about ignoring actual corpora, and implying their irrelevance. Illustrating a syntactic relation, he would simply pluck some examples out of his head (say, the good and bad combinatorics of some word). In one discussion, for instance, casually disparaging corpora, Chomsky said “the trouble with using a corpus is that some authors do not use the English language,” and cited an example from Veblen who wrote the expression “performed leisure.” Chomsky declaimed that “the verb perform cannot take such an object . . . [Veblen] has broken a law.” There is a grammatical law against combining perform and leisure in this way.

Chomsky elaborated. The Bloomfieldians were alarmed. Chomsky was challenged to explain how he knows about the law if he doesn’t have any data, if he has not abstracted it from a corpus:

chomsky: The verb perform cannot be used with mass-word objects: one can perform a task, but one cannot perform labor.

hatcher: How do you know, if you don't use a corpus and have not studied the verb perform?

chomsky: How do I know? Because I am a native speaker of the English language. (Hill 1962c [1958]:29)

As it turns out, Chomsky was too rash in his perform pronouncement. You can disprove his intuition yourself in a few seconds with a corpus search of your own, of the internet. I found almost ten thousand examples of “perform labour,” going back to the eighteenth century, in the Google Books corpus.33 But what comes next in the exchange with Anna Granville Hatcher is nicely instructive. Archibald Hill intercedes to say that both types of evidence serve their purposes—in an example of what we would call mansplaining these days—but when Hatcher is allowed back into the discussion, she promptly comes up with an incisive counterexample, perform magic (Hill 1962c [1958]:31). Chomsky is temporarily chastened and admits to a hasty generalization:

chomsky: I think I would have to say that my generalization was wrong.

hatcher: The generalizations of the speaker of a given language are usually wrong. (Hill 1962c [1958]:31)

Hatcher justly rapped Chomsky’s knuckles, but we can see that her own generalization is wrong, too; she has pressed her point about intuition and corpora too far. She has not had the time, even with Hill’s interruption, to pull a collection of utterances out of her bag and root around in them until she found what she needed. Her counterexample is the generalization of a native speaker, Anna Granville Hatcher, not the product of diligent corpus research.

Chomsky goes on, taking up Hatcher's observation. Okay, he says, now we have some data we can work with. We can ask what's different between magic in this context and other mass nouns. Why is perform magic fine, but perform justice bad? Chomsky and Hatcher don't get anywhere on this second question—what Aspects would call the selectional restrictions of perform—and Hill has nothing to offer, but the moral of the story is clear: everyone in the discussion is using intuitions to do linguistics.

Intuitive data is a perfectly reasonable way to do some linguistic work. You have been putting up with my invented examples now for almost two chapters. You had never read or heard “Galen helped Oriana,” I’m willing to bet, until you encountered it here. It probably doesn’t exist in any corpus that doesn’t include this book (at the time of this writing, Google couldn’t find that sentence anywhere in its gargantuan corpus). But as Hatcher demonstrates and Hill suggests, the notion of using such data was not completely alien to the Bloomfieldians.

One rising Bloomfieldian star, in particular—Charles Hockett, whose career was subsequently somewhat stunted by the mounting anti-Bloomfieldian rhetoric that accompanied Chomsky's rise—had required of grammars that they go beyond the corpus to predict the structure of nonobserved utterances (1954:34). Chomsky echoes Hockett on this point (without citation) in Syntactic Structures. The grammar, Chomsky says, must "project the finite and somewhat accidental corpus of observed utterances to a set (presumably infinite) of grammatical utterances" (1957a:15; Chomsky's italics). But Chomsky's indifference to corpora was outrageous for Bloomfieldians, and his heavy reliance on intuition was a major shift in the methodological winds.

The shift toward intuition, away from corpora, was liberating. It opened up vast worlds of cheap, easily obtainable data. No tape recorders, no transcripts, no luck-of-the-draw about whether you’ll get enough embedded passives in the progressive aspect to understand such constructions; just find an armchair, settle in, knead your temples, and puzzle it out. It also radically changed the nature of that data, especially in the emphasis it put on negative data, non-sentences, the trademark asterisked sequences of Chomskyan argumentation. Chomsky and Hatcher’s data might easily be arrayed as in 28 and 29, for instance.

[Examples 28 and 29: acceptable and starred combinations of perform with various objects]

(For the record, however, Chomsky’s *perform justice generalization—law?—doesn’t fare much better than his initial *perform [mass noun] generalization; on February 25, 2021, Google returned “about 225,000” hits for that sequence, by its own reckoning.)

There are two issues at play in the Hatcher/Chomsky exchange. The first is the responsibility of the grammar. It is accountable not just to describe the attested patterns of the corpus, but to predict nonattested examples, and even “excluded” examples. For Chomsky, the corpus doesn’t generate the grammar. The grammar generates the corpus. A Chomskyan grammar (eventually, when it is fully and accurately specified) generates some ultimate, ideal, and infinite, super-corpus-in-the-sky. Plato would have loved it.

The super-corpus might look like the answer to the question we started with here: what does the Aspects model model? But not so fast.

The second issue is what the grammar is a model of. Chomsky's interest in language, by the time he had this exchange with Hatcher, was increasingly psychological. As early as the waning years of the 1950s—collaborating with George Miller, excoriating B. F. Skinner, consulting on Plans and the Structure of Behavior—Chomsky isn't interested in what's in the sky, but in what's in our heads, our competence as he calls it in Aspects, which comes with a perennial traveling companion, performance. We will see a good deal of these terms before we're through, but for the moment the important points about them are that they signal, respectively, knowledge and use. Competence is ideal, abstract; in principle, well-defined, and therefore a suitable object of scientific study. Performance is messy, concrete; by definition, highly variable, at best a pool from which data might sometimes be fished.

A production model—what a Chomskyan generative grammar decidedly isn’t, at least by the early 1960s—represents how speakers formulate and produce utterances. A competence model, of the Aspects variety, represents the knowledge inside a language user’s head, not the techniques, strategies, and neuromotor activities for getting pieces of it out of her mouth. When you put down this book for a few minutes and grab a snack or use the bathroom, you’re not using language, not reading or speaking. But your knowledge of language doesn’t go away. Whatever it is that is just sitting in your head, waiting until you come back to the book or activate it in some other way, the faculty or power or grammar that is your knowledge of the language, that’s competence. Now, when you’re reading, you are performing your language, putting that knowledge to work.

Knowledge of the set of natural numbers, {1, 2, 3, . . .}, represents some arithmetic competence that can be called on in an arithmetic performance in some situations, but it is not the actual process of counting a herd of goats.

More generally, competence is the hardcore knowledge someone has of her language—that subjects and verbs agree in number, that adjectives precede nouns, that easy takes one type of complement, eager another, that—if you are Chomsky—perform goes with magic but not with justice or leisure. It is ultimately variable, person to person. We don’t have the same competence in English, you and I. But language is relatively stable after childhood acquisition and sufficiently generalizable over a community so that it can be formalized, and, for Chomsky, it is the single proper focus of linguistics. Performance is the application of that knowledge in speech and other linguistic activities. It is quite variable—subject to fatigue, exhilaration, situation, molecules (like alcohol) that degrade articulation—much more difficult to formalize in a meaningful way, and, for Chomsky, largely a wastebasket of uninteresting facts.

The differences between competence and performance can be subtle, as are most differences between social and psychological accounts of the same phenomena, or between any internalized activity and its external realization. But we can grasp the difference easily in principle. We all know how to do push-ups, but after enough of them even the most fit of us start shaking and halting and stop altogether. We haven’t forgotten how to do push-ups. The knowledge is still there. But our push-up performance ceases to reflect that knowledge reliably.

The analogy only goes so far, though; language is somewhat more complicated than calisthenics, and there was considerable confusion at the time about what Chomsky was modeling because of the actual mechanics of his grammar. Those rules we looked at earlier (1a–i and the six transformations in 2 on page 23) look like instructions for how to speak. That’s why computer scientists and plan-based cognitive psychologists liked them so much. Carleton Hodge, for instance, took it for granted that Transformational Grammar concerns “the putting together” of sentences, that it seeks to answer the question “How does one proceed, not to describe a sentence, but to make one?” (in Hill 1969:38). Hodge was a typical late-Bloomfieldian linguist and typical in his assumption. Many people thought Chomsky was marketing a production model, a representation of how speakers build sentences. But that’s performance.

The construal was natural, in part because of the unclarity at the time about what it means to model the mind, but equally because of the theory’s terminology. Chomsky wanted to model a mental state, but his terminology strongly suggested a model of a mental process. His relentless emphasis on creativity, a notion with inescapably dynamic implications, didn’t help matters. For Chomsky, creativity is a crucial potentiality of linguistic competence, not a direct factor of performance. But lots of people saw his work as something “that explains how we produce [sentences],” showing how “Professor Chomsky . . . is thus interested in the creative rather than the interpretive side of grammar” (Francis 1963:320, 321).

Even members of the inner circle seemed pretty hazy at times. Lees, for instance, proposes a separation between optional and obligatory rules in his paradigmatic Grammar of English Nominalizations, because from this separation, “we might expect to gain a deeper insight into how a grammar can be used by a speaker in the production of sentences” (1968 [1960]:3), and in one conference exchange, Postal responded to a question about why some data might be unclear with:

One answer is that there is a limitation on memory. It may be that in the course of derivations of unclear cases, many complex rules are involved and the informant has difficulty in tracing the path of derivation. (Dallaire et al. 1962:26)

Behind all of this was Chomsky’s analogical framing, his assembly-line vocabulary for Transformational Grammar. No single word is more troublesome here than generative; Hodge, in fact, uses sentence generator as a synonym for transformational grammar (Hill 1969:39). In principle, generate identifies an abstract notion, like delineate, define, or enumerate, but it was often interchanged with the quite different produce, which signals a concrete notion, like make, build, and assemble. Here is Chomsky describing the workings of his model:

To produce a sentence from such a grammar we construct an extended derivation beginning with Sentence. Running through the rules . . . we construct a terminal string that will be a sequence of morphemes, though not necessarily in the correct order. We then run through the sequence of transformations T1, . . . Tj, applying each obligatory one and perhaps certain optional ones. . . . We then run through the morphophonemic rules, thereby converting this string of words into a string of phonemes. . . . This sketch of the process of generation of sentences must (and easily can) be generalized to allow for proper functioning of such rules as [the Conjunction Transformation, Tconj] which operate on a set of sentences, and to allow transformations to reapply to transforms so that more and more complex sentences can be produced. (1957a:46)

How does one avoid thinking in production terms with verbs like these (produce, construct, run through, apply, convert, operate . . .), appearing in the very widely read, anthologized, paraphrased, and epitomized text, the manifesto of the revolution, Syntactic Structures? That book, by the way, contains forty-one instances of produce, and another twenty-eight of generate, in largely indistinguishable contexts. Much of Transformational Grammar’s success outside of linguistics, in short, and some of it within, was based on the belief that it was a performance model of language production, a belief that escalated well into the 1960s.
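To see how easily the vocabulary invites a production reading, consider a toy sketch in Python; the rewrite rules and the crude passive below are invented for illustration and are not the rules or transformations referenced earlier. The program quite literally produces a sentence by running through rules, even though the grammar it encodes is, on Chomsky’s intended reading, only a static definition of a set of sentences.

```python
# A toy sketch (not Chomsky's actual rules) of why the machinery reads like a
# production recipe: rewrite rules expand Sentence into a terminal string, and
# an optional "transformation" then rearranges it.

import random

PHRASE_STRUCTURE = {                    # hypothetical rewrite rules, for illustration
    "Sentence": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"]],
    "N": [["boy"], ["ball"]],
    "V": [["kicked"]],
}

def derive(symbol="Sentence"):
    """Run through the rewrite rules, top down, until only terminals remain."""
    if symbol not in PHRASE_STRUCTURE:
        return [symbol]                              # a terminal morpheme
    expansion = random.choice(PHRASE_STRUCTURE[symbol])
    return [word for part in expansion for word in derive(part)]

def passive_transformation(words):
    """An optional transformation, crudely rearranging 'NP kicked NP' into a passive."""
    if len(words) == 5 and words[2] == "kicked":
        return words[3:5] + ["was", "kicked", "by"] + words[0:2]
    return words

kernel = derive()                                  # e.g. ['the', 'boy', 'kicked', 'the', 'ball']
print(" ".join(kernel))
print(" ".join(passive_transformation(kernel)))    # e.g. 'the ball was kicked by the boy'
```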

Small price to pay, Chomsky and the Chomskyans must have thought. They had, in a few short and feverish years, hammered out an elegant framework which accomplished the ultimate goal of all linguistic work from at least the time of the Stoics. They had formally mediated sound and meaning. The fulcrum of this beautiful theory was the underlying syntactic representation, the evocatively named Deep Structure. It was the direct output of the base component, a beefed-up descendant of Syntactic Structures’ Phrase Structure Rules. It was the direct input to the transformations, the early theory’s titular technical device. Most crucially for all concerned, with the Katz-Postal Principle in place, it was the locus of meaning, a promise “that generative grammar would at last provide a key to meaning, the holy grail of the study of mind” (Jackendoff 2002:73). Barbara Partee calls these Aspects idylls, when the biggest and most enduring questions of linguistics seemed in reach, “The Garden of Eden Period” (2015b; see also R. Lakoff 1989:944).

The New Grammar

In the language curriculum [developed for Oregon high-school programs] we have tried to rely upon sound linguistic scholarship and to give students a scientific approach to the structure of their language and an appreciation of what language is. . . . The grammar is transformational, beginning in the seventh grade with the basic structure of the “kernel” sentence, defined in eighteen phrase structure rules. These are organized so that they can be expanded periodically as students are better able to comprehend linguistic complexities. The eighth grade introduces transformations, both single base (e.g., questions and passives) and double base [generalized transformations] (compound structures, relative clauses leading to adjectives, possessives, etc.).

Michael F. Shugrue (1966:32)

In a photograph accompanying a 1965 Newsweek article entitled “The New Grammar,” one can see second-grade students literally stacking and arranging the Chomsky-inspired building blocks of sentences.34 The blocks each have different words on them and are color-coded for different parts of the sentence. The children assemble sentences, and then by rotating the blocks they can perform a kind of lexical substitution exercise to discover that “although the words change, the sentence structure doesn’t (‘Sam and I kick’ becomes ‘Pete and Susy play,’ and so on).” The article reports on the impact Chomsky’s theory is having on education, how Transformational Grammar, the New Grammar, was lining up alongside the New Math and the New Biology to revolutionize education. It cites Transformational Grammar curricular reform efforts in several states, a new series of textbooks for grades three through nine by Harcourt-Brace, a multi-year study across sixteen schools in San Francisco, heavily funded by the United States Office of Education (USOE),35 and, of course, the many advantages to be gained by these efforts.
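For readers who want the mechanics spelled out, here is a minimal sketch in Python of the substitution-in-frames idea the blocks embody; the frame and the word lists are illustrative, not drawn from the article. The structure is held constant while the lexical items rotate.

```python
# A minimal sketch of the block exercise: the frame (the sentence structure)
# stays fixed while the words rotate. Word lists here are illustrative only.

frame = ["{subj1}", "and", "{subj2}", "{verb}"]

rotations = [
    {"subj1": "Sam",  "subj2": "I",    "verb": "kick"},
    {"subj1": "Pete", "subj2": "Susy", "verb": "play"},
]

for words in rotations:
    sentence = " ".join(slot.format(**words) for slot in frame)
    print(sentence)   # the words change, the structure doesn't
# Sam and I kick
# Pete and Susy play
```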

The old grammar is “totally mechanical . . . that’s my criticism,” Chomsky is quoted as saying in the article. Once you’ve learned “the nomenclature,” he adds, “you don’t know any more about the language than you did before.” The New Grammar teaches the logic behind sentence structure. There are fewer rules. Students get a “much more explicit understanding of grammaticalness” (quoting Paul Roberts this time). The article is preparative, so you’ll know what to do, one evening, when your child comes home from school and tells you that one sentence is a transform of another, or reels off the “ ‘morpheme’ string, the + boy + plural + present + be + ing + swim + in + edwin + lake.” There is a sidebar, a “Transformer’s Guide” to terms like Kernel Sentence, Noun Phrase, Verb Phrase, Matrix, and Insert Sentence.

The linguistic blocks were an especially powerful teaching tool for the New Grammar advocates. “In the thirty-three years I’ve been teaching,” a second-grade teacher (Juanita Carr, fifty-seven) in an upper-income neighborhood said, “I’ve never seen such a change in students.” We get a similar message from Jane Lord, twenty-five, a second-grade teacher in a school “serv[ing] a Negro slum.” The blocks were emblematic of the “creative” advantages regularly touted at the time that Transformational Grammar provides for understanding and shaping language. Their developer, Robert Ruddell, says the blocks emphasize “the importance of syntax and the idea that you can transform sentences in certain ways” (Schantz 2002:96). While computer scientists and psychologists were thrilled with the way Transformational Grammar could represent knowledge, what set the pulses of English teachers racing was its implications for prose style.

How involved were card-carrying Chomskyans in these developments? That’s not clear. Aside from Roberts, a very early Chomsky disciple who kept in frequent contact with him and with Stockwell, none of the primary scholars engaged in the pedagogical research, grammar book publication, and educational policy formation appear to have been very closely aligned with the leading Transformationalists. Nor were Transformationalists entirely sympathetic to much of this work; Robin Lakoff, for one, complained that it tended to rely only on “a hollow shell of [transformational] formalism” and taught undigested “new formulas” by rote (1969c:130; see Newmeyer 1983:135–37). It was also, inevitably, out of step. The Newsweek article features kernels and Generalized Transformations at exactly the time Aspects was writing them out of the theory.

But, while direct involvement was limited, these educational reforms and scholarly expansions were certainly encouraged—indeed, urged on in the familiar revolutionary rhetoric—from Transformation Central, MIT (Lees 1963 [1962]; Viertel 1964; Chomsky 1966d, 1967c).

Things were going very well indeed for Chomsky and his grammar. The Aspects model was elegant and compelling. It linked sound and meaning. It represented knowledge of language. It housed the alluring Deep Structure. It gained rapid renown in psychology and computer science, in philosophy, in English studies, in education policy, and among the intelligentsia everywhere. Chomsky and his New Grammarians were showing up in Time and Newsweek and the New York Review of Books. Their work was the talk of coffee houses hither and yon. The Aspects model was elegant and compelling, but it leaked.