And when men did engage in debate about their deepest concerns, they found that each man could say unto his brother, Racca, thou fool.
—Wayne Booth (1974:31)
Nyaah, nyaah!
—George Lakoff (1973b:290)
“For base men,” Empedocles warned us long, long ago, “it is indeed possible to withhold belief from strong proofs” (fragment 55), but baseness, like beauty and contact lenses, is in the eye of the beholder. When scientists across the way refuse to grant the force of an argument that its sponsor finds compelling, they are virtually base by definition, or obtuse, or misguided, or corrupt, or treacherous, or . . . something unflattering . . . which is why the most common accusations in science are forms of ad personams—implications of personal failings, like stupidity, sloppy scholarship, dishonesty.
Personal attacks are far more common in science than is generally thought (suggesting, among other things, that gauzy rhetorical overlays such as x is confused, or y fails to understand the issue, or z misrepresents my position actually work). Even the parched and pinched pages of professional journals are full of them, and they cluster fructiferously around paradigm disputes. It is not hard to see why. For one thing, mudslinging is easy. If you find an observation disagreeable, said Stephen Hawking, “you can always question the competence of the person who carried the observation out” (1988:10). But the plainer truth is that people get angry and lash out. Vituperation is a natural response when you see something clearly, believe it strongly, propound it fervently, and find your colleagues looking at you like a cheap snake-oil merchant. Listen to Luis Alvarez’s reasoning about Robert Oppenheimer, coming to the most damning conclusion one could offer of an atomic physicist in the McCarthy-sponsoring 1950s:
Oppenheimer and I often have the same facts on a question and come to opposing decisions—he to one, I to another. Oppenheimer has high intelligence. He can’t be analyzing and interpreting the facts wrong. I have high intelligence. I can’t be wrong. So with Oppenheimer it must be insincerity, bad faith—perhaps treason. (Davis 1968:314)
There were no accusations of political treason in the Generative-Interpretive debates, so far as I can tell (and, in any case, such charges would have had a much different impact in the anti-establishment 1960s, more likely conferring honor than suspicion). But there were plenty of charges of scientific treason, especially of the x-is-just-a-Bloomfieldian-in-Chomskyan-clothing variety. Since Bloomfield had become the Bogey Man of linguistics, and since the Generative and the Interpretive Semanticists seemed to be working within the same overall framework, it took a while for these accusations to surface. They marked the death throes of the dispute, an obvious sign that reconciliation was completely hopeless, that each side was on its own, that neither side thought the other was doing Real Linguistics; and, in the end, some Generative Semanticists, when they learned what the Bloomfieldians were really doing, embraced the associations.
When the differences still revolved around a few technical questions under the same Chomskyan umbrella, a wide variety of ad personams ruled the day. The most common of them, as with most debates, was the straw house charge: “x is arguing not against my real position, but against a caricature of my position.” Thickly implicit, and occasionally explicit, in such charges was willful, deceitful distortion. But building straw houses for your opponents is a quite natural side-effect of conviction. The practice of reading (or listening) only for weaknesses inevitably leads to distortions of one another’s arguments—McCawley’s wonderful title, “Interpretive Semantics Meets Frankenstein,” jumps on Katz for his attack not on a living, breathing Generative Semantics, but on the spare parts he has stitched and bolted together in order to rally the villagers with their torches—but such stitching and bolting is very common.
Another frequent and related charge, in science generally and the Generative Semantics dispute particularly, is vagueness. Witness Chomsky—a man who told me “I never had the slightest problem reading papers in [Generative Semantics]”—repeatedly professing his thwarted attempts to decode Generative Semantic arguments:
• “At best, the logic of [Lakoff’s] argument is unclear . . . [it] is hardly clear enough even to be a speculation.” (1972b [1967]:35)
• “Until these matters are cleared up, I see no force to McCawley’s contention.” (1972b [1967]:48n30)
• “I do not see how these questions can be resolved without undertaking an analysis of these structures which . . . goes well beyond the approach to these questions that Lakoff presents.” (1972b [1968]:82)
• “I fail to see what more can be said, at the level of generality at which McCawley develops his critique.” (1972b [1968]:80)
The charge, naturally, returned to Chomsky and his camp in quadruplicate:
• [To Chomsky:] “You ought to make explicit just what new devices you would need in the lexicon. . . . Throughout your [“Remarks”] lectures you remarked in an offhand fashion that you would need certain principles of word-formation in the lexicon, without stating just what these principles would be, how many, what kind, [or] how they would differ from one another. Without fairly precise claims along these lines and without fairly precise claims about what new rules of semantic interpretation you would need, it is impossible to figure just what the lexicalist hypothesis is claiming.” (G. Lakoff 1967:6)
• “[An] interpretive or ‘Surface Structure’ approach to semantic interpretation . . . has been posited, not very clearly in my opinion, in a number of papers by various authors at MIT over the last year or so, most notably Chomsky and Jackendoff.” (Postal 1988a [1969]:79)
• “The concept of ‘independent motivation’ is highly obscure in [“Remarks”].” (Ross 1973c:212)
• “I can speak only vaguely of the bad guy [Interpretive Semantics] conception of semantic structure, since the papers by bad guys which I have seen generally give very little clue as to what they think a semantic structure looks like or a semantic interpretation rule does.” (McCawley 1973a [1970]:276–77)
Some motivation for this charge is once again simple denigration, implying woolly-mindedness or deliberate obfuscation. Some of it also has to do with a combative spirit that prevents one from tolerating the same level of informality from opponents as allies: a confederate writing vaguely is “suggestive” or “promising” or “intriguing”; enemies are “sketchy” or “obscure” or “sloppy.” And much of it has to do with the high levels of vagueness which are in fact present in such disputes. “Is Deep Structure Necessary?” is a vague paper, which Ross and Lakoff’s colleagues found promising and programmatic and nourishing, and their opponents found very thin gruel indeed. “Remarks” is a vague paper, which Chomsky’s contemporary students found rich, suggestive, deep, but former students and colleagues found flimsy, underdetermined, shallow.
Students are a special case, a more specific audience than colleagues and supporters. They are not confederates so much as disciples, and disciples don’t just see promise. They see gospel. In a way that might apply equally to Lexicalism or X̄-syntax, Daniel Dennett marveled of a later remark by Chomsky, “[This] mild suggestion has been eagerly inflated by others into a scientifically based demonstration!” (1995:381n3). There were disciples on both sides. Consider just the outrageousness of McCawley’s student, Georgia Green, in her comment that “the theory of transformational grammar [a]s developed since 1966, . . . may be considered an explicit framework for investigating the properties of natural language” (1970:153). Transformational Grammar, she appears to be saying, whatever minor antecedents it may have had in Harris or Chomsky, really begins in earnest as an explicit framework for investigating languages with Abstract Syntax.
There were also issues—some of them genuine, some figments, some of them lasting, some ephemeral. Some of them require a little attention, some don’t.
Chomsky’s first direct move against Generative Semantics, for instance, was to charge that it was just a hollow imitation of the Aspects model, and the charge is a whiff of smoke we can dispel pretty quickly. “It is easy to be misled,” Chomsky said, “into assuming that differently formulated theories actually do differ in empirical consequences, when in fact they are intertranslatable—in a sense, mere notational variants” (1972b [1968]:69). Moving past the latent snarkiness of misled and mere, what Chomsky says is clear enough: no matter what two theories look like, in terms of formalisms, architecture, what-have-you, if they make the same empirical predictions, they are just different ways of saying the same thing. All well and good, really, and it is around this question, in fact, that Chomsky coined the term Standard Theory, and then argued repeatedly that Generative Semantics was a mere notational variant of the Standard Theory. All well and good, really, except that one of the notational variants might be preferable for other reasons, such as simplicity or clarity or mathematical amenability or simple aesthetics. All well and good, really, except that the notational variants Chomsky tags with this label (he uses this argument often—for instance, of Case Grammar at the time, Chomsky 1972b [1970]:174) are always wrong, often flagrantly so.
Katz picked up this stick and beat Generative Semantics with it for a while as well, but almost everyone else was bemused by the argument—in particular, by how it could conceivably count as a reason to dismiss Generative Semantics. The most immediate implication of the notational-variants position must be that there is just no point in squabbling, that the two theories should go off arm-in-arm, like Tweedledum and Tweedledee after their pillow fight. But, if so, Generative Semantics wins. Everyone agreed, Chomsky included, that Generative Semantics (in its initial, Homogeneous I configuration) was by far the prettiest and simplest theory—two criteria that have traditionally been extremely successful in theory-marketing, and which Chomsky has leaned on repeatedly over the years; in fact (looking ahead a few chapters) by the mid 1990s he was trumpeting just such a Homogeneous I conception in his own approach, dubbed the Minimalist Program.
So—since Chomsky’s motivation was not to endorse Generative Semantics—he devoted a fair amount of attention to showing not only (1) that Generative Semantics says the same thing as the Standard Theory, but (2) that Generative Semantics is wrong; really, badly, horribly wrong. Almost everyone, friend and foe alike, found the combination of these claims incoherent, and the notational variants charge had a fairly short shelf life. It was widely treated as a joke (one that linguists were still chuckling about ten years later1). After a few more developments, after Chomsky had coined the term Extended Standard Theory, after Jackendoff had put some flesh on it, after the claims about Surface-Structure interpretation were developed enough to be taken seriously, Chomsky’s argument was a little clearer: the Standard Theory is wrong and Generative Semantics exacerbates these wrongs, while the Extended Standard Theory remediates them. But these issues were too fuzzy at the time to get anywhere. By the time fuzziness had receded enough to make a bit of sense, the debate had moved on to other stages. We can safely ignore the argument here.
The next set of arguments (taking next very loosely; arguments overlapped temporally and conceptually), though, does require some ink. These are the exchanges over Deep Structure and the Katz-Postal hypothesis that formally opened the dispute—the issues the debate is most remembered for, and the issues over which Generative Semantics is said to have lost its shirt. They are certainly real issues, but they were ephemeral—for the very good reason that they had the clearest resolutions: Deep Structure (in its Aspects variant) was thrown out by everyone, on both sides, and so was (the strong version of) the Katz-Postal Principle.
Chomsky’s next tack was restrictiveness—Generative Semantics was said to be descriptively wanton, Interpretive Semantics responsibly restricted—and it was a hugely successful rhetorical maneuver. The argument was never resolved; in fact, it could never be resolved in either an empirical or a mathematical sense (Gazdar, in Longuet-Higgins et al. 1981:[690]). All the same, Chomsky won the day utterly with this topos. Restrictiveness was the principal issue which led to Generative Semantics being laughed from the scene for irrationality and error, while Chomsky’s post-Aspects approach triumphantly defined a new vein of constraint-based research that determined linguistics pretty much for the rest of the century. The issue is real enough, though its inability to be meaningfully resolved also brought considerable haze to the exchanges.
For their part, the Generative Semanticists countered with charges that the Interpretive camp was playing fast and loose with crucial notions like grammaticality and sweeping important issues under any rug they could find. What Chomsky and Jackendoff couldn’t handle directly, they said, was simply banished to some netherworld of ill-behaved phenomena, in a shady shell-and-pea game with the data.
This was actually a Bloomfieldian move, though no one noticed it at the time. Recalcitrant data, Bloomfield had said, should “properly be disposed of by merely naming them as belonging to the domain of other sciences” (1926:154; 1970:129). Some x might look like linguistics, and it might smell like linguistics, he said, but if we call x “psychological” or “sociological” we don’t have to look at it or smell it any more. Chomsky and his followers did not invoke other sciences, but they did shift the borders within their model—chiefly, the competence-performance border and the syntax-semantics border—and conveniently left the troublesome facts on the other side.
The you’re-a-Bloomfieldian charge came first from the Interpretive side. Dougherty raised it in an embarrassing so’s-your-old-teacher diatribe that really wasn’t very clear on what it was alleging. He just called up a convenient scarecrow and, insofar as he went beyond name-calling, pointed narrowly to certain structuralist methodologies he mistakenly took to represent the forces of darkness swirling around that scarecrow. But the case was actually quite strong, particularly in connection with the rhetorical use of those flip-side epistemologies, empiricism and rationalism. Bloomfield was an unapologetic empiricist (language was at the intersection of general cognition and rich data) as was the generation he invigorated and defined; Chomsky was a champion of rationalism (language required grammar-specific cognition because of inescapably impoverished data). Many of the first generation Chomsky invigorated and defined, however, including all the horsemen but Postal, became increasingly empiricist.
At this point, the two sides could do little more than wave good-bye and walk away from each other, with some muttering of “Fuck you, Ray” and “Fuck you, George” over their shoulders.
He who sets to work on a different strand destroys the whole fabric.
—Confucius (Analects II.16)
The earliest concerted argument against Deep Structure is McCawley’s respectively argument, coming in a curious paper delivered at the 1967 Texas Conference on Universals and published in the proceedings of that conference with a postscript substantially modifying his position. The main body of the paper, “The Role of Semantics in a Grammar” (McCawley 1968a), attempts to stretch the Aspects model in several directions, particularly in the use of indices in syntax and the use of logic as a tool of semantic representation. Most of the discussion involves interesting and subtle facts about plurality and subject-verb agreement, such as in 1.
Rationally, 1b should be good, 1a bad, because only one person likes the book, Avash, and only one person is disappointed by it, Jaidon, but English grammar leans the other way.
For all the stretching, however, the discussion falls clearly within the (rather broad) scope of the Aspects model; sub-species, Abstract Syntax. But between delivering the paper and submitting the manuscript for the proceedings, McCawley turned “from a revisionist interpretive semanticist into a generative semanticist” (1976a:159), and the postscript has an air of manifesto about it. Like any recent convert in a rhetorical enterprise like science, he went quickly to work uncovering arguments to justify the conversion, focusing on the titular question of Lakoff and Ross’s “Is Deep Structure Necessary?”
The principal argument he came up with (Postal seems to have had a hand in it as well) is based on indices, quantifiers, and cool features of the word respectively. Chomsky’s manhandling of this argument is one of the things that caused Lakoff to blow his cool last chapter, if you remember, and we’ll get to that manhandling in a moment. First, the argument. McCawley’s evidence is in sentences like these three:
Aspects says that these sentences need at least three levels of representation: Surface Structures (which, for our purposes, we can represent with 2a–c), Deep Structures (3a–c) and Semantic Representations (4a–c). The last two sentences (2b and 2c) are especially interesting since both of them come from very similar Deep Structures, different only in the distinguishing indices which identify whether the two occurrences of that man refer to the same guy or to different guys, as in 3b (underlying 2b) and 3c (underlying 2c). Okay, the Deep Structures (terminal strings) are:
And McCawley gives their respective semantic representations as:
So far, so good. Now comes the tricky part. According to the sketch of a rule McCawley offers (but never formulates) called the Respectively transformation, all of these phenomena can be handled in a unitary way. But there is a problem: the rule can’t operate on structures like 3a–c (Aspects-type Deep Structures). It needs access to universal quantifiers (∀) and set indices (the subscripts).
Therefore—this is the key point—for McCawley’s Respectively transformation to work, it has to run directly off the semantic representations (4a–c) to generate the Surface Structures (2a–c). Generative Semantics therefore can do the same work with one rule that takes the Aspects model at least two rules—operating at two different stages and angled in two different “directions”—a Semantic Interpretation Rule which extracts the quantification facts, functions, and predicate relations from 3a–c angled toward producing Semantic Representations (like the ones in 4), and a transformation (Conjunction-reduction), angled toward the Surface Structure, which crunches the set indices to yield those men from that manᵢ and that manⱼ and that man from that manᵢ and that manᵢ. The upshot: Generative Semantics only takes one rule and avoids Deep Structure; Aspects takes two rules and gratuitously requires Deep Structure.
The argument may look dense, but structurally it is almost identical to a legendary argument by Halle attacking the Bloomfieldian “taxonomic phoneme” (Halle 1959 [1955]:21–3; Churma 1983). Halle had said there are three levels in a Bloomfieldian grammar, with a box of rules separating each one, and he uncovered some data from Russian that could only be handled by this three-level grammar in a very clumsy way, with two identical rules, one per box. But if the middle level—the phonemic level—was tossed out and the grammar restructured, one rule by itself would do the trick. For reasons of simplicity, therefore, the phonemic level had to go. Known as the Hallean Syllogism, the argument was a classic of the Generative tradition, taught to cohort after cohort of linguistics students.
McCawley took exactly the Hallean party line, only now about an Aspects grammar: that there are three levels, with a box of rules separating each one, and he uncovered some data from English that could only be handled by this three-level grammar in a clumsy way, with two rules, one per box. But if the middle level—the level of Deep Structure—is tossed out, and the grammar restructured, one rule would do the trick (see Figure 5.1). For reasons of simplicity, therefore, Deep Structure has to go. McCawley is quite conscious of the provenance of his reasoning, and brags that his argument follows the template of “Halle’s celebrated argument against the phonemic level” (1976b [1967]:122).
Chomsky predictably and legitimately objected to the argument (1972 [1968]:76–80), but not on any of the grounds which seem fairly obvious—such as “Let’s see the rule before we decide if anything follows from it” or “Sure, but do the facts really show that there is a unitary phenomenon here? (And, if so, prove it.)” He objected by reconstructing McCawley’s argument in a very unappealing way, and beyond the distortion, he also accuses McCawley of the fallacy of equivocation frequently enough in the short discussion to cast doubt on either McCawley’s integrity (if intentional) or his acumen (if not). Just business as usual in scholarly disputation, I suppose, but it is hard not to find a lack of charity on this scale for one’s former student noteworthy.
It gets worse. McCawley had written Chomsky to repudiate Chomsky’s reconstruction long before it was published (remember, these arguments circulated in mimeograph). Chomsky was resolute. He wrote back to McCawley, maintaining that his reconstruction was accurate and, in fact, that McCawley was confused about what his own paper says. McCawley tried again a few times before it went to press, with similar results.2 There was also a more explicit version of McCawley’s argument (1976b [1967]:121–32) available years before Chomsky’s criticisms went to press, which he might have addressed. But Chomsky’s paper was published with no indication that McCawley had clarified his argument; in fact, with only marginal indications that his own straw-house version is anything but a verbatim reproduction of the only version that existed, forlorn and misbegotten.
Maybe Chomsky genuinely could not see what McCawley was saying—stranger things happened in the course of the dispute—but that is almost beside the point next to his total unwillingness to view work by his former student with any courtesy at all. His insistence on publishing a version, identified as McCawley’s, that McCawley clearly, repeatedly, politely rejected in personal correspondence shows, in the kindest interpretation, his utter lack of interest in harmoniously resolving the debate. A less kindly interpretation of his actions—that is, an interpretation of the sort which by this point had become automatic to most Generative Semanticists—has more deceit than disinterest about it; and, deceit or disinterest, we can certainly see an unmistakable arrogance in Chomsky’s stubborn assertion that he was better able to judge what McCawley was saying than was McCawley himself.
The most immediate lesson here, Chomsky’s ill-treatment of his former student aside, is that this emulation of the Hallean Syllogism did not have the effect McCawley was aiming for, a resolution to the question of the existence of Deep Structure. It only stirred up more animosities. Lakoff denounced Chomsky’s treatment (1971b). Chomsky replied. Dougherty weighed in (1975:268n4), adding to the confusion by somehow finding Lakoff to agree with Chomsky’s claim that McCawley was equivocating.3 The lines were very, very clear: those who were prepared to doubt the existence of Deep Structure liked the argument (though not without some reservation). Those who were unprepared to doubt Deep Structure liked it not. The Interpretive camp felt (or, in any case, proclaimed) that “the [respectively] argument has been effectively refuted by Chomsky” (Wasow 1976:288), and declared the case closed. “The dispute,” it was clear, quickly reached the “point where logic was powerless to resolve it” (Huck & Goldsmith 1995:158n4). No one but McCawley ever paid much attention to it after the fireworks from Chomsky’s construal died down. But there was a much stronger class of arguments about Deep Structure on the table—again with McCawley at the helm. He articulated a theory of grammar in which lexical insertion worked without Deep Structure. He supplied an alternative to Aspects.
There was a widespread attempt by Generative Semantics to deny they had any burden of proof with respect to the dissolution of Deep Structure, claiming the ball was in Chomsky’s court. “You have sound,” they said, “and you have meaning. They’re given. They are in the data. If you’re going to stick some other level between them you have to prove it belongs there.” Or, in the real words of a real Generative Semanticist: “any theory which claims the existence of special levels of structure beyond surface structure and semantic representation . . . require[s] special empirical justification” (Postal 1972 [1969]:137–38).
McCawley makes the point by turning Chomsky’s words against him, again yoking in the venerable Hallean anti-phoneme argument:
Chomsky’s remark [in a discussion of the Hallean syllogism] that “the burden of proof is on the linguist who believes . . . that there is . . . a linguistically significant level of representation meeting the conditions of taxonomic phonemics and provided by the phonological rules of the grammar” [1966c:48] applies equally well to the linguist who believes that a level such as “deep structure” exists intermediate between the semantic and surface syntactic representation. (1976b [1968]:170; McCawley’s elisions; see also 1976b [1967]:92–93)
McCawley and the other Generative Semanticists maintained that the burden had never been met: “It was simply assumed in Aspects that [Deep Structure] contained all lexical items and preceded all transformations,” Lakoff said; “no arguments were given” (G. Lakoff 1971b:281).
But Chomsky on taxonomic phonemics—and, in his footsteps, McCawley and Lakoff on Deep Structure—had it exactly backwards. Presumption always falls on the side of established scientific principles. The burden of proof was on Halle and on Generative phonology, and it was met by providing a model of phonology that worked efficiently without “the phonemic level.” Lakoff’s claim that no specific arguments were advanced in Aspects is strictly true. But so what? Deep Structure is the linchpin of a model that the entire community (including, of course, all of the budding Generative Semanticists) found very compelling. Or, look at it from the other end: the Aspects model is itself an elaborate argument for Deep Structure.
History also made specific justifications moot: Deep Structure was simply what the Syntactic Structures model got after incorporating trigger morphemes and Δ nodes, doing away with Generalized Transformations, adding a semantic component, and so on. In the terms of good Bishop Whately, a stuffy cleric and astute rhetorician of the nineteenth century, Deep Structure preoccupied the ground of Transformational Grammar—just as the “taxonomic phoneme” preoccupied the ground of structuralist phonology—and Generative Semantics was obliged to dislodge it:
According to the most correct use of the term, a “Presumption” in favor of any supposition, means, not (as has been sometimes erroneously imagined) a preponderance of probability in its favour, but such a preoccupation of the ground, as implies that it must stand good till some sufficient reason is adduced against it; in short, that the Burden of Proof lies on the side of him who would dispute it. (1963 [1846]:112; Whately’s italics)
Chomsky actually gets this right elsewhere, more bluntly and more concisely: “There is no burden of proof on the person who provides the only theory that exists.”4
Halle’s anti-phoneme argument came with a winning treatment of the “highly complex patterns of phonological relationships in Russian” (Anderson 1985:321). McCawley’s respectively argument came only with a post-scripted IOU, offering a choice between a reasonably well articulated theory (the Aspects model) and a promissory note for a theory (Generative Semantics), much the same but without its most glamorous element. Absent partisanship, such an argument is not persuasive. Confederates are usually willing to take your IOUs, but your opponents never are, and the field at large rarely is.
Practically speaking, though, whatever the Generative Semanticists’ rhetorical maneuvering was on the matter, no one was unclear about where the burden of proof over Deep Structure really fell: on its apostates, not on its apostles. Generative Semanticists mounted a range of negative arguments against it, but the Interpretivists only bothered with a few positive arguments in favor of Deep Structure (Chomsky 1972b [1968]:85–86, [1969]:151–62). McCawley certainly realized, whatever he said at the time, where his responsibilities lay. “I had an ulterior motive in writing and presenting [the respectively argument],” he said later, “namely corruption of the young . . . I intended to generate more confidence in that sketch of a model than is warranted by current knowledge” (1976b [1967]:129–30). Not only were arguments against Deep Structure required; so, and much more urgently, were arguments in favor of successful alternatives. So, McCawley got to work—work we’ve seen already, the work that hooked Postal into the movement, his Predicate-raising rule and its related lexical-insertion proposals. These were the first real steps toward articulating a genuine alternative to Aspects.
The key difference between a theory with Predicate-raising and the Aspects model concerns where Predicate-raising operates relative to lexical insertion. Critically, the relation between the lexicon and Deep Structure in Aspects is such that first you plug your words into a sanctioned Phrase Structure tree, giving you a Deep Structure, and only then do all the transformations occur (as in Figure 2.3, page 53). But Predicate-raising, a transformation, has to precede at least some lexical insertion rules, because it collects deep predicates (or abstract verbs) together and assembles them into meaning complexes that correspond to words. Predicate-raising turns underlying strings like Juones strikes me as like a monkey into sentences like Juones reminds me of a monkey by way of a lexical insertion rule that replaces strikes like with reminds (or, of course, it might just surface as Juones strikes me as like a monkey; Postal 1971b [1969]). The resulting grammar, with some transformations preceding some lexical insertions, could not, therefore, include an Aspects level of Deep Structure.
Generative Semanticists flocked down the trail blazed by McCawley. Parallel deep-predicate analyses sprang up in fairly short order for all sorts of other words, along with ingenious supporting arguments for lexical decomposition.5 Lexical decomposition is a somewhat misleading term for this line of research, since the argumentation was about lexical composition; especially about how words were composed, transformationally, out of “semantic atoms.” But the term reflects the methodology. Someone would probe the meaning of some word, like kill, find that it can be decomposed into atomic bits, like cause and become and not and alive, and chart out the syntactic and semantic implications of the relations among words and their atomic bits.
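The decomposition-and-raising mechanics can be sketched in a few lines of Python (a toy illustration only: the Pred structure, the atom names, and the LEXICON entry are my own stand-ins, not a formalism anyone used at the time):

```python
from dataclasses import dataclass, field

@dataclass
class Pred:
    """A semantic predicate with (possibly predicate-valued) arguments."""
    name: str
    args: list = field(default_factory=list)

# kill decomposed into the atoms the text mentions:
# CAUSE(x, BECOME(NOT(ALIVE(y))))
kill = Pred("CAUSE", ["x", Pred("BECOME", [Pred("NOT", [Pred("ALIVE", ["y"])])])])

def predicate_raise(p):
    """Collect the chain of atomic predicates into one word-sized complex,
    leaving the non-predicate arguments behind (a cartoon of Predicate-raising)."""
    atoms, args = [], []
    def walk(node):
        if isinstance(node, Pred):
            atoms.append(node.name)
            for a in node.args:
                walk(a)
        else:
            args.append(node)
    walk(p)
    return atoms, args

# Hypothetical lexical entry: the raised complex maps onto a single word.
LEXICON = {("CAUSE", "BECOME", "NOT", "ALIVE"): "kill"}

def lexical_insertion(p):
    """Replace a raised predicate complex with a word, if the lexicon has one."""
    atoms, args = predicate_raise(p)
    word = LEXICON.get(tuple(atoms))
    return (word, args) if word else (atoms, args)

print(lexical_insertion(kill))  # prints ('kill', ['x', 'y'])
```

The point of the sketch is the ordering: predicate_raise has to run before lexical_insertion can find anything word-sized to insert, which is exactly why a grammar with Predicate-raising cannot have an Aspects-style Deep Structure preceding all transformations.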
The Interpretive Semanticists, of course, beat a trail in the opposite direction, gathering counterevidence to the lexical decomposition arguments. Many papers questioned the semantic implications of the atomism, the most renowned being Jerry Fodor’s “Three Reasons for Not Deriving ‘Kill’ from ‘Cause to Die’ ” (1970), hinging on distinctions between such data as 5a and 5b:
Chomsky also dismissed Predicate-raising as merely an upside-down Semantic Interpretation Rule, and unmotivated to boot, and attacked the general consequences of lexical decomposition, which he said would lead to such absurdities as cause to die by unlawful means and with malice aforethought for murder (1972b [1968]:72). We’ve seen the Floyd tree, and no one who was happy with a tree that could be modeled as a floor-to-ceiling mobile was going to get too worried about Chomsky’s supposed absurdities, but the Interpretive cadre felt those kinds of dilations put a stake through the heart of Predicate-raising.6
McCawley’s arguments and their variations did not, in short, move the needle toward consensus. The Generative Semanticists saw enough in McCawley’s proposals to warrant throwing Deep Structure overboard; the Interpretive Semanticists just folded their arms intransigently. The Interpretive reaction was not especially rational, particularly seen in retrospect, given that nobody (except Katz) wanted Aspects’s Deep Structure anyway by this point, and given that, when the dispute died down, they quietly incorporated versions of both lexical (de)composition and Predicate-raising. It was, in the end, little more than a terminological wrangle. The Generative Semanticists wanted to do away with the term Deep Structure, and the Interpretive Semanticists wanted to hang onto it, but nobody (again asterisking Katz) wanted to retain the syntactic level that actually sported that term in the Aspects model, which brings us to the defining principle of Deep Structure, the Katz-Postal Principle.
The great tragedy of science—the slaying of a beautiful hypothesis by an ugly fact.
—T. H. Huxley (1870)
The Katz-Postal Principle is a different story: argument and counterargument over the principle were very successful, doubly successful, in modifying linguistic theory. First, as virtually everyone agreed (*Katz), it was disconfirmed. Generative Semantics retained the principle (for a while), but in a clearly attenuated form; Interpretive Semantics rejected it, but retained some of its implications. Second, both theories changed markedly as a result of the issues raised by the question of meaning preservation and transformations.
By 1969—an important year, the year of the contentious Texas Conference on the Goals of Linguistic Theory in which all the major players participated—Chomsky was saying that the Katz-Postal hypothesis was disconfirmed, and his students were coming up daily with new data they could fashion into arguments to exactly the same end. “Death to the Katz-Postal Principle!” echoed in the halls of Building 20, MIT.
“Long live the Katz-Postal Principle!” continued to echo in other hallways, but by 1969 it was not the same old Katz-Postal Principle. It had hardened into dogma. The step to dogma—neither unusual nor necessarily unwholesome in science (see, e.g., Popper 1970 [1965]:55, Feyerabend 1978 [1975]:42)—was a very natural one in the transformational milieu. From Harris’s earliest efforts to explore synonymy relations, through trigger morphemes, the discarding of Generalized Transformations, and the design of Deep Structure, the entire transformational program had marched ineluctably toward the Katz-Postal Principle. Chomsky had upgraded it from a heuristic to a hypothesis in works like Aspects, Cartesian Linguistics, and Language and Mind, where he consistently expressed strong support for it; the next reasonable step was to see what conception of grammar one got by taking it very seriously. The answer was Generative Semantics.
The earliest position of Generative Semantics on apparent violations of their defining axiom was the Integrated Theory position; namely, that seeming counterexamples to it were merely the result of inadequate analyses. Either the offending transformation had been wrongly formulated or the data was insufficiently understood. Integrated Theory and Aspects had done such a thorough job of banishing violations, in fact, that there seemed to be only one major class of troublesome data left—the relative Deep and Surface Structure locations of quantifiers—and the Generative Semanticists made a good deal of progress taming them with these faulty-analysis sorts of arguments. For instance, the conventional derivation for 6a ran into trouble, since it would derive from 6b, resulting in a nonsynonymous relation and a Katz-Postal violation.
But Carden argued (1968) that the Deep Structure sources for 6a and 6b should be, respectively, something like 7a and 7b (with only 7a sanctioning a deletion):7
Lakoff wove similar re-analyses around sentences like 8a, which Aspects would derive from (the Katz-Postal violating) 8b, but which he said ought to come from something more like (the Katz-Postal maintaining) 8c.
On the other side of the fence, Chomsky’s busy band of Interpretivists kept finding more and more examples where the Deep Structure position of quantifiers supported one reading and, after certain transformational derangements, the Surface Structure position supported another. Interpretive Semanticists used these arguments to justify a major renovation of the Aspects semantic component, albeit a very vaguely sketched renovation: Deep Structure continued to feed the semantic component information about grammatical relations and lexical content, but Surface Structure now fed it information about logical elements (like quantifiers and negatives). This change made for a more powerful and distributed semantic component, which many linguists regarded with a good deal of nervousness. But that was just the beginning.
Chomsky (1972b [1969]:180) says, with his familiar indifference, that Semantic Interpretation Rules must apply “to deep and surface (perhaps also shallow) structure.” Just as he had hinted earlier that Surface Structure semantic rules were on their way, he was now hinting that a new syntactic level, Shallow Structure, might be getting its own complement of semantic rules.8 These rules, too, were left unspecified.
The task of putting meat on these skeletal suggestions, and many others, fell to Chomsky’s conscience, Ray Jackendoff.
One thing about pioneers that you don’t hear mentioned is that they are invariably, by their nature, mess-makers. They go forging ahead, seeing only their noble, distant goal, and never notice any of the crud and debris they leave behind them. Someone else gets to clean that up and it’s not a very glamorous or interesting job.
—Robert M. Pirsig (2011 [1974]:241)
Lees was called “Chomsky’s Huxley” in the early years, with a certain appropriateness, though Chomsky did not really need an attack dog; the phrase holds at least as well of Jackendoff, with the same qualification, though maybe “Chomsky’s janitor” would be more in keeping with our epigram from Pirsig, or “Chomsky’s handyman,” or “Chomsky’s mechanic.” He was a polemicist, certainly, but he did very important theoretical and methodological work as well, cleaning up and repairing after Chomsky’s brilliant insights had left much of the established Generative machinery broken and strewn across the floor.
His results were not pretty. “Jackendoff is quite candid about the aesthetic appeal of [Interpretive Semantics]—or lack of it,” Michael Kac noted at the time. “He admits that the picture would be prettier if all semantic interpretation took place at a single level, but maintains that the facts militate against such an assumption” (1975:23). Jackendoff’s results were not pretty, but they worked. Jackendoff had sufficient brilliance of his own, and he was a true believer.
Like T. H. Huxley with natural selection, and Lees with the first iteration of Chomskyan gospel, Jackendoff had the “Remarks” material as part of a small privileged group, directly from the source, and at length, before most of the world had even heard of it. “I heard the lectures,” he reminisced with me. “That’s different from reading the paper. They were more fleshed out. It took [Chomsky] probably the better part of a semester to cover that material.” More importantly—and here the Huxley/Darwin analogy takes on richer colors—Jackendoff had a front-row seat to the considerably more hostile objections of Lakoff and others, which Chomsky also settled to his satisfaction. Having the recalcitrant, baiting Generative Semanticists constantly raising objections to the “Remarks” proposals gave the entire emerging Interpretive Semantics community a sense of shared purpose. Exactly the same sense of conspiratorial camaraderie shows up in the letters among Darwin, Huxley, and Hooker, discussing some of their opponents’ blockheadedness; as Darwin put it, “truth can only be known by rising victorious from every attack” (letter to Sedgwick, November 26, 1859).
Chomsky also had somewhat more need for a conspiratorial cadre, a kennel of bulldogs, in the mid-1960s. Chomsky is rather less shy of combat than Darwin was, we know, but his energy was more divided during the Linguistics Wars than in his earlier career, with his political activism drawing his polemical energies to other sectors. Jackendoff, along with Dougherty, Michael Brame, and others, was more than happy to take up the linguistic cudgels.
Jackendoff tore into not just Ross and Lakoff and McCawley, his professional seniors by only a few years, but into the considerably more formidable Postal as well. He showed, repeatedly, the subtle ability to play partisan science with the best of them—such as calling a constraint he endorses “the Complex NP Constraint” (not, say, “Ross’s Complex NP Constraint”), and one he doesn’t endorse “the Postal Crossover Condition” (1968). The favored proposals are just proposals; the unfavored ones are errors made by bad guys.
There is a major difference between Huxley’s role and Jackendoff’s, however, which is where mechanic comes in. While Huxley largely explicated and rephrased the meticulously detailed arguments of Darwin, Jackendoff elaborated and expanded the casual adumbrations of Chomsky, putting formal flesh on suggestions that even the unflaggingly loyal Jackendoff called “sketchy and programmatic” (1977:xi).9 It is as if Darwin had quit after his 1858 paper to the Linnean Society and Huxley had written Origin and The Descent of Man himself. Or, fishing for a better analogy than Huxley to Darwin on this front, we might settle briefly on Kepler to Copernicus. Like Kepler’s elliptical orbits, the form that Jackendoff gave to some of Chomsky’s beautiful airy proposals violated his own sense of aesthetics; it was unlovely, but it worked.
Whatever the analogy, Jackendoff is a hero in the tale of Interpretive Semantics. There were others—especially Adrian Akmajian and Joseph Emonds (Dougherty was a polemicist and little more)—but aside from Chomsky, no one else came anywhere close to him in terms of scale—positively (in supporting lexicalism, X-bar syntax, and post-Deep Structure semantic interpretation), negatively (in discrediting the Katz-Postal Principle), and polemically (in taking a hammer to many specific Generative Semantics proposals). Indeed, in some ways, his contribution was more substantial than Chomsky’s: certainly, it was more sustained, more comprehensive, and considerably more rigorous. The level of his heroism is clear if we step back into the historical long view for a moment. Chomsky’s linguistic development has been fecund to an unprecedented (and, indeed, un-subsequented as well) degree. Chomsky’s career is often said to be punctuated by six main theories. They generally go by the names Early Transformational Theory, the Standard Theory, the Extended Standard Theory, Government-and-Binding (or Principles and Parameters) theory, the Minimalist Program, and Biolinguistics (we will see them all in some degree before we’re done, charting their overlaps and discontinuities in general ways). Here’s the curious thing: each of these models but one is associated with a major Chomsky text: Syntactic Structures (and/or Logical Structure of Linguistic Theory) for the early theory, Aspects for the Standard Theory, Lectures on Government and Binding for Government-and-Binding theory, The Minimalist Program for the Minimalist Program, and Why Only Us? for Biolinguistics. The odd model out is the Extended Standard Theory. There is a major text, but it’s not Chomsky’s. It is Jackendoff’s. His Semantic Interpretation in Generative Grammar (1972) is the Extended Standard Theory Word.10
If Chomsky’s central “Remarks” proposals had stayed as he left them in 1967, it is a good bet that many fewer linguists would have been drawn to them; the other two papers in Chomsky’s anti-Generative-Semantics trilogy—“Deep Structure” and “Some Empirical Issues”—are a great deal shorter on positive details even than “Remarks.” They are both important collections of arguments, which Interpretive Semanticists, virtually en bloc, regarded as lethal to Generative Semantics; they are extremely effective works of rhetoric, completely shifting the agenda of the debate; and they touch on a number of issues that subsequently became central topics in linguistics. All the same, they are little more than heaps of negative criticisms, attended only by the most allusive positive suggestions. Most aggravatingly to the Generative Semanticists, they carry the strong implication that Chomsky has no responsibility to provide positive accounts of the phenomena he introduces. For instance, he brings the notions of focus and presupposition into the debate, arguing that Generative Semantics can’t adequately account for them, but where his own model is concerned, he goes little further than the remark “these notions seem to involve Surface Structure in an essential way” (1972b [1968]:101).
Presupposition is a typical example, as in 9a–c (upper case signals phonological stress—a bit more volume and duration for emphasis).
These three questions all have the same framing main clause (“Was it . . . he was warned to look out for?”), but each differs in its focalized Noun Phrase, which corresponds to a different presupposition. We know that he was warned to look out for someone who seems to have been an ex-con, who was wearing something red, which may have been a shirt, and that, in any case, he was wearing a shirt, which may have been red. Each of 9a–c exhibits a level of uncertainty about some aspect of a supposedly shared proposition. If they share a proposition, they share a Generative Semantics underlying structure, but Chomsky proposes these kinds of appropriate answers to questions 9a–c:
Answers 10a–c align perfectly with questions 9a–c, because 10a presupposes that the focus of the question is a red article of clothing, 10b that the focus is the type of person wearing the clothing, and 10c that the focus is the color of the shirt. All other question-answer pairings are wonky (10c as an answer to question 9a, for instance, is incoherent). Now, the presence of stress as a focalizer is the real killer in Chomsky’s argument, because stress is theoretically associated with Surface Structure, making 9a–c poison pills for Generative Semantics.
Focus and presupposition are part of a bait-and-switch argument against Generative Semantics. They rely on context (question/answer pairings) and belief structures (the presuppositions grounding the focus), aspects of meaning Chomsky had never talked about before, rarely mentions again, and does not hold his own frameworks responsible for. The upshot is “You can’t handle focus and presupposition. Me? I don’t have to.” After this argument, he blithely switched back to ignoring such phenomena.
Generative Semantics, in any case, took the bait, or just felt more responsibility than Chomsky. Either way, focus and presupposition are prominent among the phenomena for which Lakoff calls global rules into being (G. Lakoff 1969b, 1970b, 1971b), and Chomsky, as if waiting for the opening, expressed outrage and scorn over global rules, which then formed the rhetorical shovel of the Interpretivist case against Generative Semantics throughout the 1970s. They beat every argument with that shovel, and then they used it to bury the whole movement.
Jackendoff, though, could not be so cavalier. He offered concrete proposals for how Surface Structure might be involved in the interpretation of focus and presupposition (1972:229–78), just as he did for virtually every other thorny issue of the day—grammatical relations, pronouns, modals, negation, quantifiers—even venturing with some success into the very murky regions of intonational meaning.
While Jackendoff was building his Swiss-army-knife semantic system to manage these daunting tasks, the major ideas of “Remarks” were largely neglected. Chomsky (except for passing mentions—e.g., 1972b [1969]:158–62) did not revisit them, and his other students, despite a totemic identification with the name Lexicalists, took them no further either. Even Jackendoff appeared reluctant to embrace them at first.11
But his reluctance may have had less to do with embarrassment than with a lack of time. Both the Lexicalist hypothesis and X-bar syntax are very ramified ideas, and exploring them with any seriousness would have meant dropping the project Jackendoff found more urgent, justifying post-Deep-Structure semantics—more urgent, of course, because more devastating to Generative Semantics. Once he had offered that justification, in Semantic Interpretation, he turned first to the lexical rules necessary to give Lexicalism some formal substance, then to the X-bar notation necessary to give Lexicalism some explanatory capacity—all the while maintaining a fervent opposition to Generative Semantics.
Some of Jackendoff’s digressive attacks were annoying to readers on the sidelines—Michael Kac’s review of Semantic Interpretation complained that Jackendoff was squandering his considerable skill “in meaningless polemical exercises” and with “deliberate parochialism” in order “to score points in a sectarian debate” (Kac 1975:30). But—as David Hull tells us, among the many worthy motivations for scientists, like “natural curiosity, the love of truth, and the desire to help humanity,” is one, equally prevalent, that is not quite so worthy, which he labels, simply, as “Get that son of a bitch!” (1988:160)—those sectarian aggressions kept Jackendoff’s imagination fired.
Semantic Interpretation is stunning in its ambitions and its meticulous ingenuity; forty years later, one of the premier semanticists of the age was still calling it “magnificent and wide-ranging . . . foundational” (Partee 2015c: xvii). But the model it lovingly details is a gawky contraption next to the Generative Semantics model or the Aspects model, both of which look very good on a blackboard. The Aspects model had an orderly semantic component that looked in on derivations at Deep Structure, and only at Deep Structure (Figure 2.3, page 53). Its semantic rules were all of the same type, and they produced one semantic representation per derivation. The Generative Semantics pure-case model, Homogeneous I, had a semantic component that, to all effects, percolated meaning directly up into the syntax (Figure 3.3, page 103). Its semantic rules were not only all the same, they were also the same as its syntactic rules; namely, transformations. There was only one semantic representation per derivation. Jackendoff’s model, though, looked in on derivations virtually at will. And it did away with a guiding principle of Transformational Grammar from the outset, that each derivation has a single semantic representation; Jackendoff’s derivations had four distinct semantic representations.12
Moreover, one of the calling cards of Aspects is that it reduced the complexity of the Katz-Fodor semantic picture by eliminating one of their rule classes. Generative Semantics went further and eliminated Semantic Interpretation Rules altogether. The Extended Standard Theory model, in the only full articulation it got, Jackendoff’s Semantic Interpretation, added three new classes of semantic rules, each of which produced its own semantic representation. To make things worse yet, the Extended Standard Theory also had several other bits and pieces of theoretical paraphernalia that had entered the field since the mid-1960s—lexical redundancy rules, output constraints, conditions on transformations, and the like—all of which Jackendoff felt responsible to incorporate. You can check out a relatively conservative diagram of the Semantic Interpretation model in Figure 5.2. Jackendoff now says of this model, “I’m sure everybody thought that it was off-the-wall and weird, although nobody complained to my face. [They must have thought] ‘That is really wacko. If semantic interpretation is like that, forget it.’”
On the contrary, many people thought, “So that’s how serious semantics gets done in a Chomskyan grammar”; and “If semantic interpretation looks like that, it needs to be cleaned up, tightened, maybe winnowed a bit, not discarded.” Semantic Interpretation in Generative Grammar pursues a rhetoric of assent which rarely fails to put its money where its mouth is. Virtually every negative criticism is balanced by a positive proposal.
It has become clear over the past five years that Transformational Generative Grammar is nowhere near being an adequate theory of human language. Those of us who have tried to make Transformational Grammar work have attempted to patch up the classical theory with one ad hoc device after another: my theory of exceptions, Ross’s constraints on movement transformations, the Ross-Perlmutter output conditions, Postal’s Crossover principle and anaphoric island constraints, Jackendoff’s surface interpretation rules, Chomsky’s lexical redundancy rules and his analogy component, and so on. . . . Most, if not all, of these ad hoc patching attempts [are] special cases of a single general phenomenon: global derivational constraints.
—George Lakoff (1970b:627)13
Interpretive Semantics was a very different beast, and a much lumpier one, once Chomsky’s scattered proposals and intimations were instrumentalized by Jackendoff. But there was uglification going on in the Generative Semantics camp too, for much the same reasons. The sort of data that the Interpretivists’ research kept turning up against the Katz-Postal Principle proved too much for arguments and devices modeled on the Integrated Theory approach to meaning-preservation. Jackendoff and Chomsky had no trouble abandoning the Katz-Postal Principle by enriching the semantic component substantially. But Generative Semantics was incomprehensible without some form of the principle. There was no semantic component, sitting off to the side, away from syntax, which it could enrich. There were only two conceivable options: abandon the theory, or enrich the homogeneous rule system in a way that would preserve some version of the Katz-Postal Principle, however theoretically compromised it became. Aside from the dialectical pressures of the dispute, which would have made surrendering to Chomsky’s new vision impossible, the Generative Semanticists increasingly saw that vision as essentially abandoning any hope for a realistic account of language.
Most of them had a good deal of respect for Jackendoff’s ingenuity, and for his willingness to confront the implications of data in which Chomsky had no apparent interest after it had served his anti-Generative Semantics purposes. But they regarded Jackendoff’s efforts as an endless, misbegotten series of patches in a wall built to keep meaning and structure artificially apart; when the Aspects version of Deep Structure wasn’t strong enough, the Interpretive crew invoked Surface Structure, and then Shallow Structure, and then, recurrently, the nameless structural levels at the end of each cycle (Figure 5.2).
Generative Semanticists felt the right approach was simply to admit the artificiality of that wall, to acknowledge that semantics and syntax intermingle so thoroughly as to make autonomous accounts of either futile. Then, having made this admission, the real task of linguistics could begin, finding the order in this gumbo; the Katz-Postal Principle still looked to be the best bet on this front, even in a weakened form, and one new rule type looked a small price to pay for its maintenance.
This new rule type was introduced by George Lakoff, who had become, far and away, the most prominent and productive Generative Semanticist. It was immediately endorsed by the other leaders; the Interpretive camp, led by Chomsky, assumed the posture and facial expression made famous by Edvard Munch’s The Scream, howling with outrage. Lakoff proposed to incorporate devices he called global derivational constraints (global rules, for short; 1970b; 1971b:238ff), moving the theory into what Postal called its Homogeneous II phase. In brief, global rules recognize that some transformations can alter the relations of words such that Deep and Surface Structures of the same derivation could support different semantic readings; however, they outlaw such derivations. Derivations in which transformations change meaning were legislated out of the theory.
To take an analogy from Logical Structure (Chomsky 1975a [1955–56]:145), early transformational theory allows the generation of sentences like the legendary 11:
But—somehow, somewhere, some way—the grammar rejects it for not achieving “the highest degree (first order) of grammaticalness” (1975a [1955–56]:154). Katz and Fodor adopted the same approach, by having the semantic component fail to return any semantic reading for sentences like 11. This general plan of attack fits the Aspects term filtering.
Lakoff simply applied this filters-approach to some new sentences, like 12a when it derives from an underlying structure like 12b (and thereby violates the Katz-Postal Principle). In order to be legitimate, the global rule declares, 12a must arise only from 12c.
Once again, we have quantifiers (few, many) throwing a wrench into meaning-preservation: 12a means that there are hardly any books such that lots of men read them; 12b means that there are lots of men who read hardly any books. In Syntactic Structures our old transformational friend, Passive, would relate 12a and 12b. But 12a and 12c can also be related transformationally.
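The scope contrast can be made explicit in a restricted-quantifier notation (the notation is mine, chosen only to render the glosses above, not the actual formulas of any of the disputants):

```latex
% Reading of 12a: few books are such that many men read them
% (FEW takes scope over MANY)
\text{FEW } x\,[\text{book}(x)]\;\; \text{MANY } y\,[\text{man}(y)]\;\; \text{read}(y,x)

% Reading of 12b: many men are such that they read few books
% (MANY takes scope over FEW)
\text{MANY } y\,[\text{man}(y)]\;\; \text{FEW } x\,[\text{book}(x)]\;\; \text{read}(y,x)
```

A Passive derivation of 12a from 12b would silently swap the two quantifier prefixes, which is exactly the kind of meaning change the Katz-Postal Principle forbids.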
Here’s the deal, then: Transformational Rules alone permit the derivation of 12a from 12b or from 12c, doesn’t matter, but Lakoff introduces a global rule that outlaws one of those derivations, the 12b ⇒ 12a derivation, so the Katz-Postal Principle is maintained. Essentially, we have a semantic output condition. If this solution smells of desperation, your nose is serving you well. The principle is maintained by the strategy Aristotle called, none too affectionately, to en archêi aiteisthai (aka “begging the question”; aka “circular reasoning”). Aristotle’s phrase translates fairly literally as “asking for the initial thing.” The initial thing in this case is “transformations do not change meaning,” and the proof proceeds by inventing something that stipulates that situations where transformations do change meaning (of the 12b ⇒ 12a sort) cannot exist, with the conclusion, “See, transformations do not change meaning” (see especially 1971b:240).
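A semantic output condition of this sort can be pictured as a filter over whole derivations rather than as a rewrite step. The sketch below is only an illustration under invented assumptions (the function names and the toy "meaning" reader are mine, not Lakoff's formalism): a derivation is a sequence of trees, and the global constraint inspects two noncontiguous stages at once, something no single transformation, as a local stage-to-stage rewrite, can do.

```python
# Toy illustration of a global derivational constraint as a filter.
# A "derivation" is modeled as the sequence of structures from Deep
# Structure (first element) to Surface Structure (last element).

def meaning(tree):
    # Hypothetical stand-in for semantic interpretation: read quantifier
    # scope directly off the left-to-right order of the quantifiers.
    return tuple(word for word in tree if word in ("FEW", "MANY"))

def meaning_preserving(derivation):
    """The global rule: Deep and Surface Structure must agree in meaning."""
    deep, surface = derivation[0], derivation[-1]
    return meaning(deep) == meaning(surface)

# A 12c-style derivation: the transformations preserve quantifier scope.
good = [["FEW", "books", "MANY", "men", "read"],
        ["FEW", "books", "are", "read", "by", "MANY", "men"]]

# A 12b-style derivation: Passive has inverted the quantifier order.
bad = [["MANY", "men", "read", "FEW", "books"],
       ["FEW", "books", "are", "read", "by", "MANY", "men"]]

print(meaning_preserving(good))  # True: the 12c-to-12a derivation survives
print(meaning_preserving(bad))   # False: the 12b-to-12a derivation is filtered out
```

The filter never licenses anything; it only throws derivations away after the fact, which is why it functions as an output condition rather than a rule of grammar in the ordinary sense.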
The Katz-Postal Principle cannot be violated because the Katz-Postal Principle cannot be violated.
From this angle, then, global rules are a public confession that the Katz-Postal hypothesis is false, since only a hand-of-god stipulation prevents violations, and, therefore, global rules vitiate the entire Generative Semantics program. Certainly, that’s the way most Interpretive Semanticists saw it, as well as a few Generative Semanticists, and Chomsky surely has quantifiers and global rules in mind when he says that Generative Semantics “was proven wrong, very early.”
But Lakoff argues persuasively that global rules are necessary for reasons wholly unrelated to the Katz-Postal Principle or any semantic issues at all. In particular, he points out that a number of non-transformational rule types which both sides of the schism had already adopted—Ross’s island constraints, Postal’s Crossover Principle, David Perlmutter’s output conditions—have similar powers. And then there is the whole panoply of general conditions on rules that had mushroomed in Transformational Grammar and was codified in Aspects—rule ordering, cyclicity, Recoverability of Deletion, the Katz-Postal Principle itself, and so on. The specifics of these various conditions aren’t important; what matters is that they make up a fat, knobby bag of extra-transformational goodies.
Lakoff also unearths lots of data for which such a solution seems inevitable in any Transformational Grammar. One of his most compelling arguments concerned Greek case-agreement. We can cut to the chase on it: adjectives and participles must agree morphologically with their Deep Structure nouns; some transformations can derange sequences such that the relevant words are indefinitely far apart and in different phrases; case assignment can occur after such derangements; therefore—here comes the globality—the agreement rule must be “able to look back to a prior point in the derivation” (Lakoff 1970b:629). Quantum rules might have been a better label than global rules, because different points in a derivation could communicate with each other instantly without regard to time and space. Except for the unpalatable consequences for the (Extended) Standard Theory, the argument is textbook distributional reasoning, with clear data and natural consequences. Ipso facto, Lakoff asks, since “global rules are necessary, whatever position one takes on the relative merits of generative and interpretive semantics” (1970b:638n9), why not use them to maintain the Katz-Postal Principle?
Generative Semantics, even more than Aspects before it, ran on a transformational engine; globality became the governor for that engine. But because global rules involve a sort of direct communication between noncontiguous trees in a derivation (like Deep and Surface Structure in the many-men-few-books example in 12), global rules raise a welter of complications that contributed heavily to the downfall of Generative Semantics (to be traced out shortly), and it’s not clear that Lakoff did anything more than stencil a name onto the fat, knobby bag of devices shoring up Transformational Grammar in all its variants. But, for the moment, we can give him the last word, pointing to a definite advantage of global rules over the aesthetic and conceptual messiness of incorporating several distinct Jackendovian classes of semantic rules:
For each different case [Chomsky] would propose not a different rule, but a different kind of rule, adding a new type of theoretical apparatus to the theory of grammar for each new global rule discovered.
It is sad and strange to encounter such [an attitude]. (1970b:637; Lakoff’s emphasis)
May 27, 1969: George Lakoff discovers the global rule. Supermarkets in Cambridge, Mass., are struck by frenzied buying of canned goods.
—entry in James McCawley’s (1978 [c1971]) “Dates in the Month of May That Are of Interest to Linguists”
Chomsky’s anti-Generative-Semantics response had several stages, each one defining a new direction for its own model. First, he undermined Abstract Syntax with the “Remarks” proposals, then he attacked the Katz-Postal Principle by promoting the Surface-Structure-impinges-on-meaning arguments, and then, in his reply to Postal’s “Best Theory,” he attacked Generative Semantics for descriptive wantonness. The first two approaches—Lexicalism and post-Deep-Structure semantics—failed to resonate with anyone beyond his immediate students. But the descriptive-wantonness argument caught fire.
“The great weakness of the theory of transformational grammar,” Chomsky said,
is its enormous descriptive power, and this deficiency becomes more pronounced to the extent that we permit other rules beyond transformations (i.e., other sorts of “derivational constraints” [global rules]). . . . Any imaginable rule can be described as a “constraint on derivations.” The question is: what kinds of rules (“derivational constraints”) are needed, if any, beyond those permitted by the standard theory? (1972b [1969]:133–34)
Chomsky cannot even bear to use Lakoff’s term derivational constraint without the sanitizing envelope of quotation marks (and double-bagging it by adding a set of parentheses), but the general point is reasonably clear: unless some rigor is brought to the notion, it drives linguistic theory away from what he represents as a crucial goal, restrictiveness.
Ordinarily, one thinks of descriptive power as a virtue in science, so enormous descriptive power should be the mother of all scientific virtues. Within certain parameters, this is certainly the case. Big-time descriptive sciences like astronomy and biology bring home the bacon by having enormous descriptive ranges, from pin-size black holes to pulsars, amoebas to elephants. But the parameters are extremely important, because they represent the limits of the science. Astronomy is not the science of all objects with mass and weight and velocity. The objects it describes do not include ’56 Chevys. Biology is not the science of all objects that consume and excrete and have inherited characteristics. The objects it describes do not include ’56 Chevys. Linguistics is not the science of all possible symbols or symbol systems. No ’56 Chevys.
Linguistics is the science of natural languages, and there are lots of symbol systems which are not natural languages. It is extremely easy, in fact, to come up with symbol systems that operate in ways that natural languages don’t; computer scientists do it all the time. Sentences might be ordered by word length, or vowel frequency, or chronological occurrence. Questions might be formed by reversing the order of words in a declarative sentence, or rearranging them alphabetically, or transposing every second pair of consonants, or only being uttered when the speaker is leaning against a ’56 Chevy. An unconstrained Transformational Grammar—say, the one outlined in Lees’s English Nominalizations—can describe all of these systems, and many, many more. Transformational Grammar is so powerful, Emmon Bach once said, that “a not too far-fetched analogy” to the way it describes language “would be a biological theory which failed to characterize the difference between raccoons and lightbulbs” (1974:158). This ugly situation is made all the uglier in Generative Grammar because of the cognitive mandate it assumed.
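The point about unnatural rule systems is easy to make concrete. Here is a minimal sketch (the function names are our own, illustrative inventions; only the rule descriptions come from the text above) of two "question formation" rules that an unconstrained grammar could state perfectly well, even though no natural language works this way:

```python
# Two perfectly describable, perfectly unnatural "question formation" rules.
# Any formalism that can state these as easily as real question formation
# has the descriptive over-capacity the passage above complains about.

def reversal_question(declarative: str) -> str:
    """'Form questions by reversing the order of words in a declarative
    sentence.' Statable, but unattested in any natural language."""
    return " ".join(reversed(declarative.split()))

def alphabetical_question(declarative: str) -> str:
    """'Form questions by rearranging the words alphabetically.' Equally
    statable, equally unnatural."""
    return " ".join(sorted(declarative.split()))

print(reversal_question("the cat sat"))      # -> "sat cat the"
print(alphabetical_question("the cat sat"))  # -> "cat sat the"
```

A theory restrictive in the sense Chomsky demanded would be one in which rules like these are not even expressible.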
At its core, remember, Transformational-Generative Grammar is supposed to be psychologically plausible, describing what is between the ears of a language user, and it is supposed to be particularly attuned to the problems of language acquisition. How is a child to acquire a language if she can’t even know what one is, if she is in the same position as a biologist trying to learn about raccoons while unable to distinguish one from a lightbulb or a ’56 Chevy?
Hence, the opposition to enormous descriptive power. Hence, Chomsky’s work on the A-over-A principle, Ross’s on island constraints, and Postal’s on crossover effects. Hence, Chomsky’s burning question, the one that the Extended Standard Theory set out to answer in the 1970s—“What kinds of rules (‘derivational constraints’) are needed, if any, beyond those permitted by the standard theory?” In short—though very few people noticed it at the time, and few would acknowledge it now—Lakoff’s proposal for a program to explore and develop global rules pretty much defined the Interpretive program.
Chomsky’s great rhetorical triumph was that, in very short order, he managed to turn the words global rule into a synonym for “any imaginable rule,” completely reversing the thrust of Lakoff’s argument, and the words Generative Semantics into a metonym for “enormous descriptive power,” and the words Extended Standard Theory into a metonym for “restrictiveness.” Chomsky and the Interpretivists regularly pointed to the vast, growing menagerie of Generative Semantics descriptive devices in horror, but conveniently ignored their own proliferating menagerie when making comparisons. When they weighed the two frameworks, everything went on the Generative Semantics side of the scale, only basic syntax on their own; McCawley later called it “comparing apples with orange peels” (1999:159).
The Generative Semanticists were outraged. To them, it was all sleight of hand, smoke and mirrors. They still recall Chomsky’s restrictiveness arguments with the imagery of prestidigitation. Here’s Postal:
Chomsky had these—what did George call them?—these wild cards that he could pull out of his hat whenever he wanted, and somehow they didn’t count when it came to talking about restrictiveness.
Whenever he was doing something descriptive, where he needed to describe facts that Generative Semantics would handle with transformations—linking meanings to Deep Structures, or to other kinds of structures, by way of global rules—Chomsky would appeal to Semantic Interpretation Rules.
He would never define them. He has never, to this day, given any content to that notion. He’s never said what they were. But he could have as many as he wanted. Whenever he needed one, he could pull one out of his hat and use it. Now, when it came time to compare Generative Semantics to his framework [in terms of restrictiveness], those were never included. He never felt he had to say anything about them.
It seems, a priori, implausible that he could get away with that. But he did.14
The Generative Semanticists were particularly alert to Chomsky’s moves on this front, watching his hands closely, pointing out the cards up his sleeves, but almost everyone else took his restrictiveness arguments at face value. Even many of his most fervent detractors frequently broke off from attacking him to take a few kicks at his Generative Semantics scapegoat for descriptive profligacy. “Chomsky is pretty bad, but—hoo-wee!—those Generative Semanticists are crazy” themes abounded (see, for instance, Hagège 1981:83).
In fact, Chomsky’s victory on this issue was so complete that it is now difficult to appreciate its dimensions, especially for anyone unfamiliar with the rhetorical history of Transformational Grammar. We have only looked very casually at the appeals which constituted the arsenal of topoi Transformational-Generative Grammar deployed in its blitzkrieg colonization of linguistics. One topos in particular, which we brushed past fairly quickly, ran through the overwhelming bulk of early arguments: simplicity—simplicity as a goal of linguistic research, simplicity as the central criterion in theory comparison, and simplicity as a methodological principle. The evaluation metric with which Chomsky thumped the Bloomfieldians, Halle’s case against the phoneme, the daily warrants for specific analyses, instruments, and hypotheses—all leaned heavily, in some cases exclusively, on the virtues of simplicity.
With the central, virtually defining role of simplicity in Chomskyan linguistics, one would have thought (Postal surely thought) that the “Best Theory” case would be enormously appealing. It is a straightforward minimalist argument that the grammar with the fewest theoretical devices is the simplest and therefore should be the most highly prized. On these grounds, Generative Semantics wins, hands down. In particular, the model Postal dubs Homogeneous I is “the best grammatical theory a priori possible” (Postal 1972a [1969]:136). Homogeneous I has semantic representations at one end, surface representations at the other, and a relatively uniform component of Transformational Rules taking you from one to the other; the beautiful and compelling Aspects theory, by comparison, had three levels, and two sets of rules, and the emerging Extended Standard Theory was positing or assuming or hinting at new devices in every other argument.
Alas, Postal is forced to concede by 1969, there is a rub. His best of all possible theories, regrettably, can’t accommodate the facts that have been stirred up in the clash of theories. Any responsible grammatical model needs to include Lakoff’s new rule-type, global derivational constraints. He is quick to point out that the result, Homogeneous II, is still much simpler than Aspects; not to mention that vague post-Aspects Interpretivist theory Chomsky was peddling at the same conference.
Newmeyer observes, correctly, that despite what should have been an extremely winning case to formal linguists, “probably no metatheoretic statement by a Generative Semanticist did more to undermine confidence in that model than Postal’s paper, ‘The Best Theory’ ” (Newmeyer 1980a:169; 1986a:135). The reason goes far beyond the arrogance many found in Postal’s title, and far beyond the culprit that Newmeyer himself cites, the character (or, in Newmeyer’s view, lack of character) of the new rule-type that moved Generative Semantics from Homogeneous I to Homogeneous II. The reason is straightforwardly rhetorical: Chomsky managed a remarkable change of agenda in the debate.
Chomsky raised the alarm—“the gravest defect of the theory of transformational grammar is its enormous latitude and descriptive power” (1972b [1969]:125)—and as grave as that defect is, Chomsky said, Generative Semantics makes it worse by adding global rules to the arsenal. A grammar organized solely around transformations (that is, Homogeneous I) “is a rather uninteresting theory,” because of its immense descriptive power; “It can be made still more uninteresting by permitting still further latitude, for example, by allowing rules other than transformations that can be used to constrain derivations [that is, by adding global rules and becoming Homogeneous II]” (1972b [1969]:126). To this point, the argument adds no weight to Chomsky’s position; indeed, it undermines him. His argument is that transformations are bad, and that adding more rule-types makes any grammar that incorporates them even worse, so Aspects looks at least as bad as Generative Semantics, with its Semantic Interpretation Rules, and post-Aspects Interpretivism looks worst of all, with distinct types of Semantic Interpretation Rules making the rounds almost daily.
But Chomsky isn’t through. “Notice that it is often a step forward,” he says, “when linguistic theory becomes more complex” (1972b [1969]:126).
The grounds of theory comparison changed almost overnight: simplicity was out, restrictiveness was in, and progress was to be found through increases of complexity. There is an aspect of rhetoric known as kairos, the opportune moment. Some arguments get a better hearing at one moment than another, and the moment was ripe for restrictiveness.
Although Chomsky regularly dismisses their influence, the most important kairotic factor in the success of his restrictiveness case was a series of papers by Stanley Peters and Robert Ritchie (1969, 1971, 1973a, 1973b), demonstrating mathematically Transformational Grammar’s virtually complete lack of discrimination; it is in a passage on the Peters-Ritchie results that Bach made his lightbulbs-and-raccoons comment about Transformational Grammar.15 Specifically, the Peters-Ritchie results show that the class of grammars described by Aspects is so all-encompassing that it can’t distinguish between any indiscriminate list of strings of symbols (say, all the decimal places of π, divided into arbitrary sequences and enumerated by value of the products of their digits) and a list of actual strings that people use to communicate (say, English). These results formalize notions that had been present in transformational theory for some time, but a mathematical proof brought them home very powerfully.16 The concern with restricting Transformational Grammar that led Chomsky and Ross and Postal and Perlmutter to work on constraints had been one of many themes around the Aspects theory, but Chomsky’s urgings at the Goals conference, coinciding with the publication of the Peters-Ritchie results, put flashing lights on its roof and opened up the siren. Something of a crisis ensued in the linguistic theory of the 1970s, and—with the target Chomsky painted on its back—Generative Semantics was the emblem of profligacy.
The most curious aspect of the restrictiveness counterargument, however, is that in 1969 Generative Semantics showed very few signs of the descriptive wantonness for which Chomsky indicted it. Indeed, between Aspects and the Texas Goals conference, Generative Semanticists had done a great deal more to constrain Transformational Grammar than anyone in the Interpretivist camp. Ross and Postal had both done extremely important work on constraints. McCawley had re-analyzed Phrase Structure Rules in a way that made them serve as filters, and argued to extend Ross’s movement constraints to the lexicon. Ross and Lakoff had done crucial work on the cycle. Ross and Ronald Langacker had made parallel proposals for restricting the application of pronominal transformations. And the Bogey Man of Chomsky’s attack, Lakoff’s proposal of global rules, was an attempt at further restriction; the expanded name is global derivational constraint. “When Lakoff proposed that you need global rules,” McCawley pointed out a few years later, “that did not carry with it a proposal that every imaginable global rule is a possible rule” (in Parret 1974 [1972]:268).
Lakoff argued that virtually all of the serious work in Transformational Grammar had been to constrain derivational relations. Transformations, he said, constrain two contiguous trees in a derivation. Their application is local. They apply only to two trees standing side by side. Other principles and rules—in particular the Ross-Perlmutter-Postal line of research, but also such transformational traditions as rule ordering, cyclicity, and Recoverability of Deletion—constrain noncontiguous trees in a derivation. Their application is global. They apply to pairs of trees (“or perhaps sometimes triples”—Lakoff 1970b:638) in a derivational arboretum, irrespective of the distance between them.
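Lakoff's local/global distinction can be sketched in a few lines. This is an illustrative model of our own, not Lakoff's formalism: a derivation is represented as a list of simplified "trees" (here, just lists of quantifier tokens), a transformation-style condition relates only adjacent stages, and a global constraint relates stages at any distance, for instance the first and last trees:

```python
# Illustrative sketch (not Lakoff's notation): a derivation modeled as a
# sequence of stages. Transformations are local; global derivational
# constraints reach across noncontiguous stages.

def local_ok(derivation, relation):
    """Transformation-style condition: every pair of *contiguous* stages
    in the derivation must satisfy `relation`."""
    return all(relation(a, b) for a, b in zip(derivation, derivation[1:]))

def global_ok(derivation, relation):
    """Global constraint: relates *noncontiguous* stages directly, here the
    first (Deep Structure) and last (Surface Structure) trees, however many
    intermediate trees separate them."""
    return relation(derivation[0], derivation[-1])

# Hypothetical check in the spirit of many-men-few-books: surface precedence
# of quantifiers must preserve their underlying scope order.
same_order = lambda deep, surface: deep == surface

derivation = [["many", "few"], ["many", "few"], ["many", "few"]]
print(global_ok(derivation, same_order))  # scope order preserved end to end
```

The point of the sketch is only the shape of the two conditions: a transformation sees two trees standing side by side, while a global rule sees a pair of trees across the whole derivation.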
Global rules, in short, are just derivational constraints with a wider range than transformations. What Lakoff offered is nominal, but naming things is an important precondition for understanding: he offered a label for the seemingly disparate research into restrictions on transformations. It is worth backing up a moment here to note that Syntactic Structures offered a very similarly nominal service. Its transformations moved things around, deleted things, inserted things, joined things together. The label transformation, that is, tags a cluster of operations, not a singular process; or, rather, it is the label that yokes them into a conceptual singularity. The label global rules yokes a cluster of conditions on transformations.
Lakoff’s basic argument is that linguists need to recognize the necessity for all the extra paraphernalia beyond transformations and begin exploring them as a class of rules, rather than as a mixed bag of ad hoc devices. Lakoff calls for the development of a “theory of global grammar” (1970b:638), but then—and this is probably the other main kairotic factor in the success of Chomsky’s restrictiveness argument—Lakoff never took up the mission himself.
Back at MIT, the Interpretivists had done very little work on derivational constraints. Chomsky (1964b [1962]) had inaugurated this area of investigation, with his A-over-A principle, and Joseph Emonds’s important (Chomsky-supervised) 1970 dissertation on a new type of filtering was just about to hit the market. But the bulk of the work on constraining the transformational component—in 1969, when Chomsky called the lack of restrictiveness “the gravest defect” in transformational theory, and 1970, when Lakoff urged a theory of global grammar to correct that defect—had been done by Generative Semanticists. Moreover, Postal pointed out, as the dispute unfolded, the Interpretivists were at least as guilty as their Generative Semantics whipping boys of the “illegitimate appeal to overly powerful devices” (1972c:215)—vague or completely unspecified rules of performance, partially sketched Semantic Interpretation Rules, and syntactic features.
That situation changed dramatically in the early 1970s. Postal and Perlmutter moved on, proposing new and interesting constraints, but in another framework altogether, Relational Grammar. Ross and Lakoff lost interest in constraints. The ground was left to Chomsky’s camp, which took up the job of grammatical restriction with a vengeance.
And—in a fit of suicidal strangeness, the final and ultimately fatal kairotic element in Chomsky’s success—many Generative Semanticists wore the target on their backs with pride, becoming champions of descriptive profusion. In a climate where the most urgent problem in Transformational Grammar was perceived to be restricting descriptive power, and global rules were painted as the most serious offender, Lakoff announced that “the real problem with global rules is not that they are too powerful, but that they are too weak” (Parret 1974 [1972]:176; Lakoff’s emphasis), and accordingly proposed more powerful devices—in particular transderivational constraints, which relate not two noncontiguous trees in a derivation, but two trees in different derivations (and, with the introduction of these devices, Lakoff became explicit that it no longer made sense to maintain the Katz-Postal hypothesis—1975:283–84).
Sadock even proposed meta-transderivational constraints (which involve “two derivations and an aspect of the real world”—1974b:604). The methodological liabilities of these descriptively powerful devices were compounded by terminological confusions. McCawley, for instance, introduced the term panderivational constraint (1982b [1973]:54), and adopted extraderivational constraint as a generic for both Lakoff’s transderivational constraint and Sadock’s meta-transderivational constraint—attempts at clarification which did little more than contribute to the Generative Semanticists’ growing reputation for theoretical extravagance. Such additional descriptive devices as meaning postulates, conversational postulates, and syntactic amalgams all entered the Generative Semantics pageant, with very little clarity as to how or if they related to transderivational constraints, or even to derivations.
All the while, both Lakoffs, Ross, and a good many second-generation Generative Semanticists were spending much of their time and effort mucking around in data that appeared to call for more powerful devices yet; George Lakoff even entertained “such madness as ordering of transderivational constraints, cyclical transderivational constraints, exceptions to transderivational constraints, and perhaps the elimination of transformations altogether” (1973a [1970]:452).
In the other camp, the Interpretive Semanticists effectively adopted Lakoff’s proposal for a theory of global rules, while vigorously attacking Lakoff and turning global rules into a compound curse word. Emonds (1970) developed some of McCawley’s ideas into an elegant way for the Phrase Structure rules to exercise direct control over every tree in a derivation, and Chomsky (1973a [1971]) developed Ross’s constraints in such a remarkable way that they became the focus of Interpretivist work for the rest of the decade.
The most obvious Interpretivist excursion into globality was Chomsky’s (1973a [1971]) introduction of the trace convention, which gives transformations the power to mark sentences at one stage so that other transformations, arbitrarily later in the derivation, can tell they have applied.17 In Aspects, Chomsky had proposed that deletions “leave a residue” (Chomsky 1965 [1964]:146), often an abstract morpheme introduced by Phrase Structure rules. The trace convention introduces another type of residue.
This convention has a movement rule leave behind a syntactically and semantically relevant, but invisible “trace,” as in the examples of 13: 13a represents the Deep Structure, 13b the Surface Structure, with t marking the place where the noun phrase, Mikka, was before it moved transformationally to the front of the sentence.
Among other things, this convention means that Surface Structure, which most interested onlookers and many linguists took to represent directly “what we say,” was becoming increasingly abstract in the Interpretive camp. No one says that little t when they speak.
But traces are extremely interesting little doodads. They are global, of course, since they provide a way for different stages in a derivation to communicate with one another, and their chief use is to maintain aspects of Deep Structure (like the original location of Mikka as the subject of to like) so that later transformations or post-Deep-Structure Semantic Interpretation Rules can access them—noncontiguous trees talk to each other—but they also have some fascinating implications that correlate with empirical facts. Nobody says that little t, but it does appear to have consequences for speech. The most celebrated of these implications is the account they offer of the lack of ambiguity in 14b, in contrast to the ambiguous 14a.
Sentence 14a could mean that I want Bernie to succeed (be successful), or that I want to succeed Bernie (follow him in some way, maybe as a progressive politician); 14b can only mean that I want to follow Bernie in some way. The Aspects model explains the two different meanings of 14a by saying they spring from two different Deep Structures (14c when I want Bernie to be successful, 14d when I want to follow Bernie).
So, the Aspects model handles the ambiguity in its familiar, efficient, different-Deep-Structures way. But it says nothing about the fact that 14b, similar in almost every respect, is unambiguous, necessarily deriving from only one Deep Structure, 14d. Trace theory to the rescue: with trace-enriched Surface Structures the two meanings are represented, respectively, 14e and 14f.
With one small extension, the assumption that traces block contraction, the explanation is clear: either 14e or 14f can be a Surface Structure for 14a (hence, its ambiguity), but only 14f can be the Surface Structure for 14b (hence, its univocality).18 Nobody says that little t, but somehow it has phonological effects.
And there’s a really big payoff for Interpretive Semantics as well, not a coincidence. Remember all those arrows sticking out of the side of Jackendoff’s “wacko” interpretive model, one for each cycle (Figure 5.2, page 165)? Traces eliminate them. Jackendoff’s semantics needed all those access points to the derivation in order to keep track of when and where movement transformations shuttled various constituents around. With movement rules always leaving bread crumbs behind, that information now makes its way into the Surface Structure.
Time for a parable: In a not uncommon narrative development in scientific debates, we get two sides of a debate—call them Theory X and Theory Y. Now, since it wants to sink the other theory, Theory X finds data that Theory Y can’t handle. Actually Theory X can’t really handle the data yet either, but it argues for a new innovation to take care of them. This theoretical innovation, Theory X proponents insist, shores up its overall program. “Hmm,” say Theory Y proponents, and (when they don’t just ignore the data, also a common response) they proceed to develop their own theoretical innovation, which, they argue, makes Theory Y the stronger of the two. Scientists then choose (for all sorts of reasons) which of the innovations to adopt and a consensus aligns itself with the theory that houses the preferred innovation. Rinse and repeat.
Theory X in our trace convention narrative is Generative Semantics. Data like 14a and 14b first show up in an argument coming from . . . wait for it . . . arch-villain George Lakoff, in connection with . . . drum roll, please . . . the need for “a constraint operating at two separate points in the grammar” (1970b:632); that is, a global rule.19 The data and the rule, in turn, Lakoff argues, support a theory of grammar in which semantic representations and syntactic representations are homogeneous, “exactly what is claimed by the theory of generative semantics” (1970b:638). Interpretive Semantics, the Theory Y of our parable, rejected the homogeneity-of-representations part of the argument, proclaimed disgust at the word global, but incorporated the data to support its arguments for the trace convention. After some delay, while other issues worked themselves out, linguistic consensus went with traces, and with the Extended Standard Theory framework that came along for the ride (sometimes, in recognition of how far it had traveled, under a new name, the Conditions framework; for a little while, in some quarters, it was even called Trace Theory).
The moral of the parable, exemplified by the emergence of the trace convention, is that the crucible of debate forges new scientific positions and makes knowledge.
If the broader Interpretive/Generative Semantics dispute teaches us anything about the development of science and the making of knowledge, it is that when two frameworks collide and one of them comes out of that collision as the governing paradigm, however it may represent itself, the victor is not the same as before the dispute began. It abandons claims and instruments, redefines some of them, generates new ones, and appropriates some from the opposing framework, often with little or no acknowledgment. The emergence of traces—a global condition based on data raised by Generative Semanticists that is insistently represented as the very antithesis of a global rule, in fact as an antidote to the poison of global rules—illustrates all of these tendencies with special clarity.
Trace theory grew rapidly into a cottage industry for Interpretivists, part of the dedicated effort to work on constraints. Meanwhile, the few Generative Semantic attempts to propose and explore specific global rules or conditions (such as Lakoff 1971b:238ff, 1974; Ross 1972a) were restless and abortive, betraying little conviction; indeed, Ross discards his derivational constraint in the last few pages of the paper in which he proposes it, waving instead at a transderivational solution. Most Generative Semantic invocations of globality were little more than gestures: “here are some phenomena, and it looks like we’re going to need a global rule to handle them”; no specific rules offered. (Postal was an honorable exception—1972b [1970].)
The situation with transderivational constraints was looser yet. Lakoff introduced them in a very informal paper, “Some Thoughts on Transderivational Constraints,” which doesn’t offer so much as a single example of this new rule-type (1973a [1970]), and further discussion of transderivational constraints was largely of only two, equally unproductive sorts: (1) horrified and categorical denunciation, from the Interpretivists; and (2) gleeful and unformalized invocation, by the Generative Semanticists.
Chomsky’s camp associated transderivational constraints with the complete abandonment of formal grammar,20 which seems about right. Certainly, the introduction of transderivational constraints coincides with a decline in formal interests among most Generative Semanticists, especially in the persons of the Lakoffs and Ross. The Generative Semanticists simply appealed to transderivational constraints when other theoretical mechanisms broke down (an extremely frequent occurrence, given the data they were exploring), with little justification, and without specifying the constraint formally, or even examining its application very carefully. The only explicitly proposed transderivational rule came very late in the schism, and its author quickly repudiated it (Gazdar 1977; 1979).
Notice, however, that we have moved a long way from Chomsky’s initial any-imaginable-rule charges of descriptive profligacy. For one thing—and, from their perspective, it is by far the most important thing—the Generative Semanticists had expanded their data concerns very extensively. They were no longer interested in the set-of-possible-sentences approach in which Chomskyan grammar is rooted—exploring, in particular, a great many contextual phenomena well outside the bounds of Aspects. But they had also reconceived or rejected many of the defining notions of the Chomskyan framework, like competence, performance, and grammaticality. Several had abandoned formal theory construction altogether, while others were barely hanging on by their fingernails. Interpretive and Generative Semantics were no longer comparable on metrics like restrictiveness; indeed, the Generative Semantics of 1973 was no longer comparable on such metrics to the Generative Semantics of 1970, which was not altogether comparable to the Generative Semantics of 1967.
While there was general agreement about the notion of “grammaticality” in 1967, generative semanticists have come to dispute the notion that one can speak coherently of a string of words (or even a surface phrase-marker) as being grammatical or ungrammatical or having a degree of grammaticality. . . . Thus, strictly speaking, generative semanticists are not engaged in “generative grammar.” Chomsky, on the other hand, has greatly expanded the range of sentences which he would call “grammatical” but semantically unacceptable and thus, while maintaining a notion of grammaticality of sentences, applies it very differently than he did in [Aspects of the Theory of Syntax].
While Chomsky and his kith were piously intoning against the descriptive recklessness of Generative Semantics, the return charge was of an insular and unwholesome descriptive asceticism. Much of this counterattack was unfocused, but some Generative Semanticists—most notably, McCawley—brought it directly to bear on the concept of grammaticality. Remember what he told the New Yorker: “Chomsky assumes that there are sentences which belong to the language and other sequences of words which don’t—and the grammarian’s task is to write rules [that] determine which belong and which don’t. Postal and Lakoff and I say this isn’t a coherent notion” (Shenker 1972).
Although he hasn’t been entirely consistent in application, and there are some surprises on this front later in our story, Chomsky always said that grammaticality is a technical term, relative to some specified grammar.21 A sentence is grammatical if and only if there is a grammar, a body of rules, which generates it. A sentence is grammatical or not relative to, for instance, the Aspects grammar or the Syntactic Structures grammar. Acceptability, on the other hand, is relative to a specified speaker in a specified context. A sentence is acceptable if a speaker says it is okay.
Since grammars are ideally models of linguistic knowledge, grammatical pertains to the theory of competence, acceptable to the theory of performance.
There are obvious and important overlaps between grammatical and acceptable, of course, but they are theoretically very distinct notions; grammatical sentences, for instance, can be very, very long, so long as to be unacceptable to some speakers, or they can lead speakers down a garden path so winding that they refuse to accept them, and acceptable sentences can be, as in poetry, deliberately ungrammatical. The position that grammaticality is relative to an abstract grammar (“model,” “theory”), rather than to a speaker’s judgment, has led to some confused ridicule of Chomsky, mostly from humanists and Bloomfieldian holdovers, about such theoretical phenomena as the shifting status of “Colorless green ideas sleep furiously”—sometimes grammatical in his framework, sometimes not. But this sort of drift is not uncommon in science: as the grammar changed, so did its grammaticality implications. The atom was once the smallest piece of matter in physics, now it’s not; light is sometimes corpuscular, sometimes undulant, sometimes both; space sometimes full of ether, sometimes not, sometimes peaceful, sometimes turbulent.
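The technical sense of grammatical relative to a specified grammar can be made concrete with a toy rewriting system. Everything below is an invented illustration (the grammar, names, and sentences are ours, not Chomsky's): a sentence is grammatical with respect to a grammar exactly when some derivation in that body of rules yields it, while acceptability would be a separate, speaker-relative judgment the grammar says nothing about:

```python
# Illustrative sketch: "grammatical" is relative to a specified grammar.
# A grammar is a dict mapping nonterminals to lists of expansions; a
# sentence is grammatical iff the grammar derives it.

def generates(grammar, start, sentence, depth=6):
    """True iff `grammar` derives the token sequence `sentence` from
    `start` within `depth` rewriting steps."""
    def expand(form, d):
        if all(sym not in grammar for sym in form):
            yield tuple(form)       # all terminals: a derived sentence
            return
        if d == 0:
            return                  # rewriting budget exhausted
        i = next(j for j, sym in enumerate(form) if sym in grammar)
        for rhs in grammar[form[i]]:
            yield from expand(form[:i] + list(rhs) + form[i + 1:], d - 1)
    return tuple(sentence) in expand([start], depth)

# A hypothetical toy grammar; it happily generates strings a speaker
# might still reject (acceptability is a separate, speaker-relative matter).
TOY = {
    "S":  [["NP", "VP"]],
    "NP": [["ideas"]],
    "VP": [["sleep"], ["sleep", "furiously"]],
}

print(generates(TOY, "S", ["ideas", "sleep", "furiously"]))  # grammatical
print(generates(TOY, "S", ["furiously", "sleep", "ideas"]))  # not generated
```

Change the rule body and the grammaticality verdicts change with it, which is exactly the "drift" described above: as the grammar changes, so do its grammaticality implications.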
But as Carden archly points out, the combination of the crucial, theory-deciding notion of grammaticality with the slipperiness of data that comes from introspection is not a recipe for easy and natural theory choice. “When informants A and B speak superficially similar dialects of American English,” he says—and, hey!, he can use Informant A and Informant B all he likes, but let’s substitute Jackendoff and Lakoff—“we must be suspicious of a theory that forces us to conclude that [Jackendoff] speaks a language with, say, Deep Structure and interpretive semantic rules, while [Lakoff] speaks a language with a generative semantics controlled by derivational constraints” (Carden 1976:5).
The instability of one notion can be a domino, and the Generative Semanticists came increasingly to view the competence-performance distinction, on which the grammatical-acceptable distinction rests, as artificial—worse, incoherent—and grammaticality as the symptom of a sterile, head-in-the-sand framework. Ross and the Lakoffs attacked the distinction somewhat obliquely, by cataloging phenomena clearly required by language (or, perhaps more accurately, clearly required by speakers), but wholly indifferent to grammaticality—phenomena like please and thank you and the relative appropriateness of 15a and 15b for a lecture presented to an anthropological society.
Lakoff’s use of the asterisk here is telling, since she uses it to signal inappropriateness in a given context, not ungrammaticality. Her opposition to Chomskyan grammaticality had much to do with her increasing interest in sociolinguistics and ordinary language philosophy, for which context is crucial. Ordinary language philosophy, in particular, grew more important for Generative Semantics in the 1970s. Beginning with Ross’s and Robin Lakoff’s fairly direct importation of Austin’s insights about performatives, which Sadock and Davison took up at Chicago under McCawley, it gained momentum under the influence that philosopher H. P. Grice’s conversational implicature work had on both Lakoffs, on their students, and on McCawley.22
It is from this period that linguists began to develop a sense of something they called pragmatics, as distinct from what they called semantics. For Generative Semanticists in particular, and linguists in general, the latter was virtually a synonym for “meaning” until the early-to-mid-1970s. Now, semantics began hardening into a term for the truth conditions of sentences (like our Yallis-Island examples) and a hatful of related notions (principally, paraphrase and entailment), largely informed by connections to formal work in philosophy.
Pragmatics, . . . well, pragmatics never hardened into anything particularly, but it pretty much stands for ‘everything else about meaning’ in linguistics now—in particular, everything related to the influence of context. Gerald Gazdar put this into a blunt equation (1979a:2):
Pragmatics = Meaning − Truth Conditions
Non-truth-conditional meaning began to get serious attention for the first time in the history of linguistics when Generative Semanticists began snooping around in the data and implications of ordinary language philosophy, a great deal of which played havoc with Chomsky’s notion of grammaticality. Robin Lakoff was at the forefront of this work, inventing the sociolinguistics of gender in the bargain, with her foundational Language and Woman’s Place (1975).
The main, and most vocal, opponent of Chomskyan grammaticality wasn’t Ross or either of the Lakoffs, though. It was McCawley, and his disaffection followed a somewhat different route. In 1970, James Heringer did an unusual study of quantifier-negative idiolects—unusual for Transformational Grammar in that it gathered data empirically rather than introspectively. The study was very modest, but among its results were that a number of his informants found 16a, a sentence most Transformationalists would brand ungrammatical, to be acceptable given the context supplied in 16b:
Although this study is clearly and intentionally about acceptability, McCawley took it as a confirmation of his “long-held suspicion . . . that (contrary to claims by Chomsky, Katz, and others) native speakers of a language are not capable of giving reliable judgments as to whether a given string of morphemes or words is possible in that language” (1979 [1972]:218); in brief, that Chomskyan grammaticality has no psychological basis. Indeed, McCawley came to regard the pursuit of grammaticality as utterly foolish: “[it is something which] I would not label as linguistics” (1979 [1972]:217–18); worse, it is unethical, belonging to his collection of “ideas not to live by” which “I hold to be pernicious in that they have retarded our development of an understanding of how language functions” (1979 [1976]:234); worse yet, it is a personal embarrassment, for which “I hang my head in shame at seeing how many times I have spoken of sentences as being ‘grammatical’ or ‘ungrammatical’ ” (1982b:8).
As one might expect, the use of grammaticality and related terms in the Generative/Interpretive dispute is frequently confused, with McCawley an archetypal case (hence, the head-hanging). At one point grammaticality for him does not apply to strings of words, but to complexes of semantic structures, surface structures, all intermediate structures, contextual information, and the speaker’s intentions, adding a telltale Generative Semantic, etc. (Parret 1974 [1972]:250). But McCawley, who read deeply in philosophy of science, realized the havoc that the slipperiness of key terms could play in reasoned debate. He even retroactively qualifies his use of grammaticality terms in a 1973 review of Chomsky’s Studies on Semantics, denuding them of technical significance and asking the reader to “take ‘ungrammatical’ as simply an informal English equivalent for . . . the kind of anomaly that I happen to be talking about at the time” (1982b:8). Eventually he developed a five-point syntactic-anomaly scale (marked by what he called stigmata ranging, left to right, from mild to extreme deviance: ?, ??, ?*, *, **), sometimes throwing in a “%” for dialectal variation (1988:5–6); he was even known to use “✡” for English sentences with “a Yiddish flavor” (1976 [1964]:9n6).
These and other quasi-conventions (quasi- because they varied linguist to linguist) show up in the literature, such as the exclamation mark, singly or in pairs, as the “double shriek” (!!), for screamingly bad sentences,23 but the collective meaning of stigmata is clear: many, many linguists do not see grammaticality as a binary, off-or-on notion, and many confusions arose in the dispute owing to a lack of clarity over in-the-grammar/out-of-the-grammar claims.
Chomsky’s response to this line of argument had been on record for a long time. Some Bloomfieldians objected to his concept of grammaticality when he first proposed it (see, especially, Hill 1961), as did his old mentalist mentor, Roman Jakobson; and Chomsky had some words for them:
Linguists, when presented with examples of semi-grammatical, deviant utterances, often respond by contriving possible interpretations in constructed contexts, concluding that the examples do not illustrate departure from grammatical regularities. This line of argument completely misses the point. It blurs an important distinction between a class of utterances that need no analogic or imposed interpretation and others that can receive an interpretation by virtue of their relations to properly selected members of this class. Thus, e.g., when Jakobson observes that “golf plays John” can be a perfectly perspicuous utterance [see Jakobson 1959:144], he is quite correct. But when he concludes that it is therefore as fully in accord with the grammatical rules of English as “John plays golf,” he is insisting on much too narrow an interpretation of the notion “grammatical rule”—an interpretation that makes it impossible to mark the fundamental distinction between the two phrases. (1964a [1961]:385)
Chomsky is up to familiar tricks here—in particular, calling the way-broader Jakobson “much too narrow”—but his principal point is inescapably clear: no matter what one can do with “All the applicants didn’t fail the test we so carefully rigged, didn’t they?” and “Spiro conjectures Ex-Lax,” one still needs to distinguish them somehow from more canonical sequences.
Finally, however, the argument on this front comes down, like so many scientific disputes concerning goals and methods, to a matter of faith. Chomsky acknowledges that the limits one puts on the study of grammar by roping off certain sections from others are necessarily arbitrary (1964a [1961]:385n5), but he is willing to live with the arbitrariness because it buys him a more manageable data set. The position is familiar in philosophy of science: “We are always surrounded by far more ‘phenomena’ than we can use and which we decide—and must decide—to discard at any particular stage of science” (Holton 1988:39). Bloomfield simply and unapologetically consigned the troublesome phenomena of meaning to “some other science,” like sociology or psychology, or resorted “to makeshift devices” (1933:140). Chomsky is in good company.
But how much data can one safely rope off before ending up with an unproductively shallow framework? A frequently invoked criticism of the Bloomfieldians was that they discarded semantics, to concentrate on phonemes and morphemes, and missed a great deal of what is going on in language. Generative Semanticists saw Chomsky committing the same sort of error by ignoring or outlawing pragmatics. They saw his velvet ropes as excluding crucial areas of research, perniciously directing linguistics down a blind alley.
The defining “framework containing many underlying agreements” that Postal (1974:v) cites in his On Raising dedication to Chomsky was crumbling by the time that book came out.
Chomsky’s shifting definitions of performance provide him with a rug big enough to cover the Himalayas.
—George Lakoff (in Parret 1974 [1972]:155)
The Generative Semanticists also saw Chomsky’s strategic roping-off of data as malign in another way. It wasn’t just a methodological error, they felt, it was cheating: a deliberate and disingenuous attempt to cloud the discussion of everything about descriptive power, a beclouding mission that spread to virtually every corner of the Interpretivist program. Whenever the Interpretivists attacked Generative Semantics as too powerful, the entire Generative Semantic kit bag of doohickeys and kludges was brought into the tally, called global rules and treated as a byword for ‘any imaginable rule.’ On their side of the ledger, however, the Interpretivists listed only transformations and constraints (which, of course, were not to be called global, and were held to reduce power, not increase it). Conspicuously absent were the increasingly powerful Semantic Interpretation Rules, which Interpretivists seemed to regard as free for the taking. And since the Generative Semanticists felt responsible for a much wider class of phenomena than the Interpretivists, there was another loophole in the accounting procedure. Many of the facts that Generative Semantics addressed were simply ignored, postponed, left out of the comparison.
The trifecta Interpretive Semantics strategy on restrictiveness looked like this: (1) its own architecture was more complex, which Chomsky claimed as an advantage; (2) it could handle many phenomena essentially for free, by leaving them to Semantic Interpretation Rules; and (3) it could relegate its most troublesome phenomena to performance, at best waving in the direction of a solution, such as the analogic rules of “Remarks.” As we have seen, the first of these maneuvers, complexity, caught the Generative Semanticists rather slack-jawed. They couldn’t believe Chomsky would pull it, or that anyone would fall for it, and offered no counterarguments. Simplicity was almost a religious mantra for Chomsky up through Aspects (and it returned again in the 1990s), but for a decade or so, complexity became a virtue. On the other two moves, shuffling data into the “free” semantic component or into the netherworld of performance, Generative Semanticists came to fight, repeatedly drawing attention to what they regarded as blatantly specious argumentation.
Even before Chomsky had launched the restrictiveness assault, Lakoff was complaining about his use of wild cards—beginning with “Remarks.” Chomsky’s trick with analogy was particularly notorious. If the grammaticality judgments go the right way, Chomsky said in “Remarks,” well and good. If they go the other way, then we can attribute them to a rule of performance, and that’s not my concern. Heads, Chomsky wins; tails, Generative Semantics loses. All of the Interpretivists continued this tack throughout the debate, shuffling problem data out of their competence-centered purview and exhibiting the attitude “that theoretical innovations need no particular justification if they can be relegated to ‘performance’ ” (McCawley 1982b [1973]:29). It never stopped driving the Generative Semanticists to distraction. But performance was only one of the dodges.
Semantic Interpretation Rules were also brought in, not just to handle semantic phenomena of the sort Generative Semanticists were handling with transformations and global rules, but for a surprising number of (formerly) syntactic phenomena. Postal proposed his Crossover principle to explain straightforward grammaticality facts, and grammaticality was a wholly syntactic beast for Chomskyan linguists. But Jackendoff (1972:145–59) said that the data ought to be handled by Semantic Interpretation Rules, an account that requires a sequence of words like “Himself was shaved by Rob” to be grammatical (“syntactically well formed”) but semantically anomalous. One of the more notorious examples of this now-you-see-it, now-you-don’t application of Semantic Interpretation Rules occurred at the 1969 Texas conference, where Lakoff presented an argument he got from Perlmutter concerning respectively sentences in Spanish. In some dialects, goes the argument, 17b is a legitimate sentence, indicating that it derives from something like 17a (Spanish requires adjectives to agree in gender with the nouns they modify):
The data clearly supports lexical (de)composition (padres from madre-y-padre), and clearly suggests that the gender-agreement transformation is conditioned semantically (since the adjectives do not agree with padres, which is masculine, but with its implied constituents)—seeming to offer solid evidence for Generative Semantics. Chomsky is reputed to have declared gender agreement to be a wholly semantic phenomenon, something handled by Semantic Interpretation Rules, rather than transformational rules, a position which implies that 18 is perfectly grammatical, but semantically anomalous:
Since gender agreement had traditionally been treated the same as other “syntactic” properties, like person, number, and case, a corollary claim in English would be that 19a and 19b are both grammatical, but 19b is weird for semantic reasons:
We obviously can’t reconstruct the conference argument with any reliability, but Chomsky did comment on the exchange in “Some Empirical Issues,” observing that “if [the facts in 17a and 17b] are accurate, it would appear that gender agreement may be a matter of surface interpretation, perhaps similar to determination of coreference. This would seem not unnatural” (1972b [1969]:155n26). As the prophylactic would, the cautious copula, and the double negative litotes (not unnatural) all indicate, such a conclusion is in fact quite unnatural in the linguistic tradition, but if it allows him to keep syntax in a petri dish, away from semantic contamination, Chomsky is willing to embrace it.24
The Generative Semanticists were gob-smacked. No matter what arguments they came up with, Chomsky just calmly moved the goalposts back on them. McCawley uses this incident to demonstrate the violence Chomsky is prepared to commit on the “commonly held conceptions of ‘grammaticality’ ” and the power he is prepared to give over to Semantic Interpretation Rules (1982b [1973]:89–90) in order to keep semantics away from syntax; Lakoff uses it as an illustration of how counterevidence can push the Interpretivists “to ever crazier positions” (Parret 1974 [1972]:169). One moment a given sentence is grammatical, the next it is ungrammatical but acceptable because of rules of performance; one moment it is ungrammatical, the next it is grammatical but now semantically anomalous. As Talmy Givón observed, however, the Interpretivists were not the only linguists redefining their data to suit their needs. Generative Semantics was a theory, he snorted, “where adjectives were proclaimed to be verbs one day (Ross & Lakoff 1967) and nouns the next (Ross 1969[a])” (1979:14).
The Interpretivists regarded their goalpost relocation maneuvers as the price of doing business, responsibly arranging their data into the piles that would make it most manageable. To them, the Generative Semanticists, with their ill-specified global rules and completely ungovernable data, were the ones who looked crazy. Worse: they were slipping back into the pre-scientific dark ages from which Chomsky had delivered the field.
Is generative linguistics infiltrated by a counterrevolutionary underground?
—Ray Dougherty (1974:277)
Chomsky’s routing of the Bloomfieldians had been so complete that by the late 1960s any of the synonyms for that school (taxonomic, descriptive—even structuralist, which described Chomsky as well as anyone at the time, better than some) were code words for misguided, unscientific, and blockheaded. It was inevitable that these terms would be deployed against the new unscientific blockheads on the block, Generative Semanticists, and the final offensive against Generative Semantics was that it represented a backslide into the stamp-collecting era from which Chomsky had rescued linguistics; or, in Dougherty’s quasi-Kuhnian terminology, a “Bloomfieldian counterrevolution” (1974).
Chomsky was not active in the final campaign, though he may well have been its sponsor; much of the argumentation that flows out of MIT begins with him, and he certainly endorsed the Generative-Semantics-as-Bloomfieldian-backslide case (see, for instance, Chomsky 1979 [1976]:154; also Brame 1976:26n1). The case is in fact very strong, though with even a modicum of objective tempering it is difficult to feel the same horror in a return to some Bloomfieldian tenets that Dougherty (1974, 1975, 1976a, 1976b, 1976c), Brame (1976), Katz and Bever (1976 [1974]), and Ronat (1972 [1970]) all declaim. Bloomfieldians cared about usage. They respected the particularities of different languages, registers, and purposes. They acknowledged context and culture. They were cautious about universalizing. What could be so bad? In a word, empiricism.
Rationalism and empiricism had accrued rhetorical magnetism for the Chomskyan community, the first pulling mentalism, Universal Grammar and genetic endowment along with it into any arguments, the second pulling in Bloomfieldian blockheadedness. While no one holds to either of these views at the in extremis level of these definitions, scholars who pledge allegiance to one of these terms frequently accuse their enemies of holding to the opposite definition (in the familiar straw-house fallacy of caricaturing opponents):
Empiricism: all knowledge is acquired through the senses.
Rationalism: all knowledge comes hard-wired.
The Bloomfieldians pledged allegiance to empiricism. Chomsky and Chomskyans pledge allegiance to rationalism.
Rationalism had fallen largely into obsolescence by the early twentieth century, and Chomsky’s most renowned contribution to contemporary philosophy is its resurrection, largely around his language acquisition arguments. We acquire an intricate linguistic system with little overt teaching on the basis of incomplete and glitchy exposure, this argument runs (the poverty of stimulus argument); therefore, much of it must be hard-wired, part of our biological endowment, a Universal Grammar (see especially 1965 [1964]:25–26, et passim; 1988:xxv–xxix, et passim). Chomsky takes this incomplete and glitchy exposure to be axiomatic, so he also labels this focus of his program Plato’s problem because Plato famously held that we had more knowledge in our heads than the senses could possibly extract from the world. If it’s the case that we can’t get enough from our environment, and it’s the case that we have a very rich cognitive system of thought and communication, language, then we must have a universal, pre-wired grammar.
The job of the linguist becomes building an explanatory theory of this Universal Grammar. In his case for rationalism, Chomsky frequently used the Bloomfieldians as whipping boys, whom he depicted appropriately in general terms, but not without distortion in specific arguments, as empiricists. To make any progress in linguistics, Chomsky told the troops, “it is necessary to go far beyond the restricted framework of modern taxonomic linguistics and the narrowly-conceived empiricism from which it springs” (1964d [1963]:113), a framework relying on a loose “body of procedures for determining the grammar” that assumes “the form of language [is] unspecified” (1965 [1964]:54).25
By the time the Linguistics Wars were in full swing, the evil of empiricism was so self-evident to the Interpretivists that few of them felt any burden to establish why the Generative Semantics return to empiricism is so disastrous. Dougherty (1974, 1975) and Brame (1976) are content with name-calling; Katz and Bever (1976 [1974]) show carefully how Generative Semantics opens the door for the return of Bloomfieldian epistemology, and then end their paper with an ominous “Caveat lector.” But none of them take much time to say what was wrong with empiricism.
The backslide arguments have two primary aspects, one methodological, one philosophical. The methodological part of the case was silly and vitriolic, and belongs mostly to Dougherty, who asked such memorable questions as
Whatever became of those linguists who were thoroughly trained in taxonomic methodology? Where are those old students who brought joy to the taxonomic hearts of their old masters? Where are the old students who, while suffering through a sequence of field methods, relentlessly pursued the phoneme from teepee to teepee? (1974:278)
After attacking McCawley and Ross and Lakoff for many haranguing pages, and dropping such broad hints as “having cut their eyeteeth on Bloch, etc.,” and diagnosing Postal’s work as symptomatic of the dread disease “Generative Breakdown-Taxonomic Relapse,” Dougherty leaves his readers to find the answers on their own.26 He returns to such issues recurrently. Here, for instance, is a lavish metaphysical conceit he fashioned a few years later:
What is Generative Semantics? Figuratively speaking, GS is a current manifestation of the Phoenix of Science. In periods of scientific revolution, the Phoenix of Science arises from the ashes of the expiring paradigm and can thrive only until the new paradigm becomes the basis for research. Once the new paradigm is established as a coherent set of assumptions, the lush growths of pseudoproblems and the abundance of interesting, but unimportant, data upon which the Phoenix feeds start to disappear. Lacking substance, the Phoenix consumes itself in a mystical fire only to reappear at the next scientific revolution from the ashes of the disintegrating paradigm. (Dougherty 1976c:22)
Huh?
The philosophical component of the empiricism critique, represented most fully by Katz and Bever’s (1976 [1974]) “The Fall and Rise of Empiricism,” is somewhat more cogent. They don’t argue either that Generative Semantics entails an empiricist epistemology, or that Generative Semanticists have deliberately adopted empiricism, still less that they inherited it from their teepee-creeping professors:
We do not claim that the linguists who are bringing it back are necessarily empiricists or are aware that their work has this thrust, but only that their work clears the way for the return of empiricism. (Katz & Bever 1976 [1974]:30)
Generative Semanticists are simply unwitting hosts for the reemergence of a virulent strain of empiricism. Katz and Bever warn linguists about the direction in which Generative Semantics will lead the field, not about its current (mid-1970s) stance. Trotting out the rhetoric of Cold War American foreign policy to strike fear in the hearts of linguistic consumers, they suggest a domino theory of encroaching empiricism in linguistics, built around the notion of grammaticality: first it is relaxed; then it is modified; eventually it must be discarded. “Once it fell,” they argue, “so would each other domino: conversational bizarreness; next cultural deviance, then, perceptual complexity, and so on” (1976 [1974]:59).
Katz and Bever were clearly right—not about the danger, necessarily, but certainly about the cascade. We’ve already seen the Generative Semantics rejection of grammaticality, the stigmata of conversational bizarreness, and the implications of cultural divergence. Generative Semantics unquestionably represents a “return” to empiricism. Look, for instance, at how McCawley criticizes a diagram Chomsky had made famous in defining the central focus of his program (given as Figure 5.3). “The flaw in this account,” McCawley says, is that it makes the child look like “a linguist who elicits ten notebooks full of data from his informants in New Guinea and doesn’t start writing his grammar until he is on the boat back to the United States” (1976b [1968]:171): Chomsky’s diagram suggests a model with no room for hypothesis-testing, experimentation, game-playing, error correction—in a word, no room for feedback—all of which are defined not by exposure to data (the rationalist position) but interaction with data (the empiricist position). McCawley suggests the diagram be revised along the lines of Figure 5.4.
McCawley’s commentary is a willful misreading of Chomsky. While it is certainly true that Chomsky cares far more for the character of the “device” determining language acquisition than the actual processes of language acquisition, he would not deny that “primary linguistic data” includes feedback, or that acquisition involves the formulation and modification of successive grammars before full competence is achieved (see, for instance, Chomsky 1962b [1960]:530, 1965 [1964]:207).
But the two diagrams are very revealing all the same. Chomsky rarely cites empirical acquisition studies in his work, for instance, or pays more than lip service to feedback or modification, and versions of Figure 5.3 have shown up recurrently in his work with essentially the same form. Chomsky is interested far more in the properties of the cognitive mechanism he terms the language acquisition device (1965 [1964]:32 et passim) than in its specific employment—a definitively rationalist concern. McCawley is interested at least as much in how the mechanism is put to work, and in how it interacts with general-purpose learning strategies, and in the character of the acquisition data—empiricist concerns all. He doesn’t reject Chomsky’s rationalist arguments about language acquisition. He just has somewhat broader interests:
Chomsky’s well-known arguments that language acquisition cannot be accomplished purely by general purpose learning faculties should not lead to the non sequitur of concluding that general purpose learning mechanisms play no role in language acquisition: General purpose learning faculties clearly exist . . . and it is absurd to suppose that they shut off while language is being acquired. (1980b:183)
Returning to Katz and Bever for a moment, we find they barely consider McCawley. They focus almost exclusively on George Lakoff, which, among other things, illustrates how fully Generative Semantics had come to be associated with Lakoff by then, especially in Interpretive eyes. But they don’t actually look at anything he said about the issue either.
What Lakoff said was fairly consistent by the early 1970s. There was a little I-know-you-are-but-what-am-I? taunting about empiricism, as in his remarks about (who else?) Chomsky:
I would say that, of contemporary linguists, Chomsky is among the more empiricist linguists . . . in the sense that he is still interested in accounting for distributions of formatives in surface structure without regard to meaning. (Parret 1974 [1972]:172; see also McCawley’s comments, p. 251)
But Lakoff, turning himself in many ways into the anti-Chomsky by the early 1970s, was actually quite happy to admit his interest in data and language processes and general-purpose cognition—he calls himself, in fact, a “Good Guy Empiricist”—and equally happy to be associated with Chomsky’s notorious earlier opponents. He celebrates the Bloomfieldians in very respectful and accurate terms, for their creation of “a broad, diverse, and interesting field, which happened not to be very good at dealing with the syntactical problems raised by Chomsky, and which showed little interest in formalized theories” (1973c). And, also accurately, he noted that “when transformational grammar eclipsed structural linguistics, it also eclipsed many of these concerns, much to the detriment of the field” (Parret 1974 [1972]:172).
More explicitly, Lakoff’s critique of Chomsky’s famous nativist argument goes well beyond McCawley’s live-and-let-live, don’t-forget-general-purpose-learning-mechanisms approach. Lakoff puts the implications of the argument in binary terms, and lobbies heavily against the rationalist side of the coin:
What Chomsky has shown is that either there is a specifically linguistic innate faculty or there is a general learning theory (not yet formulated) from which the acquisition of language universals follows. The former may well turn out to be true, but in my opinion the latter would be a much more interesting conclusion [though see the earlier Lakoff (1968b:1–4), when his views were the reverse]. If I were a psychologist, I would be much more interested in seeing if there were connections between linguistic mechanisms and other cognitive mechanisms, than in simply making the assumption with the least possible interest, namely, that there are none. (G. Lakoff 1973c; his italics)
Meanwhile, Ross began to emphasize Zellig Harris’s influence more and more—citing him, for instance, as marking the major conceptual break that led to modern linguistics, in a paper noteworthy for its odd sense of history. The paper is addressed to cognitive psychologists, outlining the sorts of contributions linguists can make to their field, and Harris, as thoroughgoing an anti-mentalist as they come, gets the lion’s share of the credit for making these contributions possible; Chomsky, incredibly, is not even mentioned until well into the article, when he is conspicuously introduced as Harris’s student (1974b:64, 68). Pursuing the same theme, G. Lakoff takes the implications of Chomsky’s apprenticeship to even further absurdities, suggesting that his work is mindlessly derivative:
Chomsky was extraordinarily dependent on his teachers for his intellectual development. Most of his early linguistic analyses are taken directly from Harris, as is the idea of the transformation. The idea of evaluation metrics was taken over directly from Nelson Goodman. (Parret 1974 [1972]:172–73).27
There was a decidedly schizoid flavor to the Generative Semantics invocations of Bloomfieldian linguistics. On the one hand, we see the standard move in reaching back historically to embrace the enemy of your enemy, as Chomsky had reached back to Humboldt and the Port-Royal philosophers. On the confused other hand, Chomsky was just a shallow and derivative Bloomfieldian anyway.
Not to be outdone, Dougherty gives us this absurdity: “Generative Semantics has been developed internal to Harris’s transformational taxonomic system” (1975:154). Chomsky, the recursive base, the Katz-Postal Principle, Aspects of the Theory of Syntax, and so on, apparently had nothing to do with the development of Generative Semantics; Harris begot them directly.
When two opponents have been arguing, though the initial difference in their positions may have been slight, they tend under the “dialectical pressure” of their drama to become eventually at odds in everything. No matter what one of them happens to assert, the other (responding to the genius of the contest) takes violent exception to it—and vice versa.
—Kenneth Burke (1941:139)
Shenker describes Postal’s main occupation in the early 1970s as “proliferating exceptions to Professor Chomsky’s theories” (1972), which is actually a pretty good phrase for what both sides were up to for much of the dispute. A great many exceptions to Professor Chomsky’s work percolated out of Abstract Syntax—Postal’s verb-adjective conflation, Ross’s auxiliary analysis, McCawley’s re-analysis of selectional restrictions as semantic rather than syntactic. When the threshold to Generative Semantics was crossed, the exceptions became far more specific, zeroing in on Deep Structure—McCawley’s respectively argument, Postal’s remind argument, the Predicate-raising, cause-to-die arguments. On Professor Chomsky’s part, the “Remarks” proposals gnawed away at the Abstract-Syntax foundations of Generative Semantics, but the exceptions they spun out were somewhat scattered, going after abstract verbs here, category-reduction there, “the Transformationalist hypothesis” somewhere else. For the next two years, however, the whole MIT program, Chomsky at the helm, seemed to do little more than proliferate a class of very specific exceptions, directed at the Katz-Postal hypothesis. Meanwhile, the Generative Semanticists switched from their assault on Deep Structure and began proliferating grammaticality exceptions to Professor Chomsky’s work hither and yon, most of them pragmatic.
At this point, a rather clear difference surfaced. It may have been part of the intellectual makeup of the principals. It may have arisen through the genius of a contest in which the other side seemed to be playing a constant shell-game with the facts. More likely, it was both. But the Generative Semanticists loved data. And the more problems it caused for various bits of theoretical machinery, the more they seemed to love it. They kept their exception-proliferating noses to the grindstone throughout the 1970s.
The Interpretivists loved theory. And the more problematic the data was, the more eagerly they shunted it off to grammatical provinces for which they felt little or no responsibility in order to inoculate their theory (again, with exceptions for Jackendoff). McCawley compares this attitude to “the traditional Christian attitude towards sex: the pleasure of gathering data is proper only within the confines of holy theory construction and when not carried to excess; recreational data-gathering is an abomination” (1980a:917–19; see also Newmeyer 1980b:932–34 and R. Lakoff 1989:956n5).
Since exception-proliferation is a data-heavy activity, corrosive to theory construction, Interpretive Semanticists quickly tired of exception-generating, and the brunt of their attack on Generative Semantics became conceptual. In fairly rapid succession, the arguments came: Generative Semantics is just a new name for the same old grammar (a notational variant); it is licentious in its use of theoretical mechanisms (unrestrictive); and it has the wrong philosophy (backsliding into Bloomfieldian empiricism). At times, the combination of these arguments strains credulity; for instance, in one paragraph Chomsky says that global rules “are quite similar, if not identical, to the interpretive rules proposed by Jackendoff and others,” but that they add “immense descriptive potential” to grammatical theory, and thus their introduction “constitutes a highly undesirable move” (1979 [1976]:152): global rules are notational variants of a good rule-type, but they are a bad rule-type.
The Generative Semanticists kept exception-proliferation factories running day and night to expose the absurdly narrow conception of grammaticality Chomsky championed. They accused him of sweeping the excluded facts under the performance or Semantic-Interpretation carpet, or of waving half-baked, inexplicit solutions at them.
Then, suddenly, they found that he just didn’t care.
After the barrage of counterarguments in his 1969 Texas paper, “Some Empirical Issues,” Chomsky just turned his back on them. He was still happy to bash Generative Semantics in class, in interviews, and in other informal settings, and remains happy to do so, but it rarely earned even a contemptuous footnote in his formal work after the Texas paper. In 1971, his trace-proposing paper, “Conditions on Transformations,” began circulating underground (published 1973a; 1977:81–162), marking his official withdrawal from the debate. Before that paper, Sadock says, Chomsky was “still talking the same lingo.” Afterwards, “there was a new philosophy,” a philosophy of restrictiveness, and a complete inattention to any of the issues that were driving Generative Semantics. His students continued to press the attack, becoming more and more savage, but Chomsky no longer had even a sneering allusion left for his old enemies in this work. He gave up the attack and pursued positive work.
But Generative Semantics was no longer talking the same lingo it started with either. By the mid-1970s, its practitioners were no longer “engaged in ‘generative grammar’ ” (McCawley 1982b [1973]:11; Parret 1974 [1972]:152). In his Parret interview, G. Lakoff uses Generative Semantics and generative grammar as virtual antonyms throughout.
There had been behind-the-scenes attempts to find some amicable common ground over the central issues via correspondence, at least among Ross, Postal, and McCawley, on the one side, Chomsky on the other, without any success. The letters are mostly calm, with glimpses of congeniality. Ross, for instance, points out that one of Chomsky’s remarks about Postal in the manuscript of “Empirical Issues” (1971 [1969]) is “an unnecessary slap in the face.” Chomsky disagrees, saying that he is “simply giving [his] opinion,” but agrees to “qualify it even more” if Ross feels he is being unfair (see Huck & Goldsmith 1995:74, for the exchange). But very few points of contact were achieved or maintained in these exchanges, and goodwill was strained. In a letter to Postal on the same issue, Chomsky says
What so arouses your ire is a side remark . . . that I regard some rule [of yours] as dubious. Even if that were erroneous, your response would be astonishing. In fact, even if I had given a wrong argument, made a factual error, or whatever, such a response would be astonishing—it is doubly so when what is involved is a statement that I regard a proposal [of yours] as dubious. . . . This kind of reaction makes discussion impossible, too unpleasant to be worth pursuing.
Postal’s reply?
We are so far apart and so lacking common assumptions and judgment that the time has possibly come when discussion is largely fruitless beyond picking up occasional counterexamples to proposals, which opposing-hostile thought generates with great facility. Frankly, I find it increasingly hard to see the kind of work you do and sponsor as part of a common field of interest. (Huck & Goldsmith 1995:75–76)
The rhetorical breakdown was complete. Lakoff talks about a kind of incommensurability between the two approaches, charting out the conflict of methodological and teleological commitments. Maybe. Certainly there were large-scale value differences—still hinging significantly on the mutual influence of syntax and meaning in language, but by now implicating a wide array of particulars (of methodology, of instruments, of axioms and primitives, and especially of scope). By the mid-1970s it barely made sense to speak of two sides. The division was not binary or even polar, but cultural and ideological. One culture followed a visceral allegiance to autonomous syntax. Distributional matters were always in vitro. The other culture followed a visceral allegiance to language in vivo, to the great woolly commingle, to pragmantax.28
Incommensurability, to the extent that there is such a thing in science, is a phenomenon of theorists, not of theories (R. A. Harris 2005:92), and there was precious little will to commensurate by this stage, not just among the leading figures, but across the discipline as a whole. Geoffrey Pullum, for instance, recalls the reception for a paper about auxiliaries he wrote with Deirdre Wilson in the mid-1970s (Pullum & Wilson 1977):
It tried desperately to separate the issue of whether auxiliaries are main verbs (they are) from the issue of whether Generative Semantics was right. Hardly anyone was listening. We might as well have suggested the Israelis and the Palestinians sit down together and talk.
Each side went its own way. To the Generative Semanticists, it looked pretty much like they were going off in victory, and it looked that way to Chomsky as well. He recalls that by the mid-1970s “the overwhelming mass of linguists interested in transformational grammar at all were doing some kind of generative semantics” (1982a [1979–80]:43, 46). Not only was the first generation of MIT linguists enamored of it, so were their students. Influential Generative Semantics work was being done at most of the major institutions—at the University of Texas–Austin, at the University of Massachusetts–Amherst, at the University of California–Los Angeles, at the University of Illinois at Urbana-Champaign, at the University of Michigan (where the Lakoffs taught briefly), at the University of California at Berkeley (where both Lakoffs moved in 1972), and, of course, at the University of Chicago. Chomsky really had only MIT. Transformational textbooks from the 1970s virtually all include a sort of cutting-edge section that declares or suggests Generative Semantics to be the linguistics of the future. The most dramatic illustration is John Kimball’s The Formal Theory of Grammar. Kimball was a 1971 MIT graduate, and the final section of his book doesn’t so much as hint that Chomsky has a contemporary position on semantic questions. Generative Semantics shows up as the natural successor to the “Standard Theory,” just as natural a successor to that theory as the Standard Theory was to the one presented in Syntactic Structures (Kimball 1973:116–26); it’s almost as if Chomsky got a basset hound, put on his slippers, and retired after Aspects.29
Nor was it just the younger linguists who were signing up to Generative Semantics. Many linguists from earlier generations also found the model very attractive. Emmon Bach was influential early on, and Lees endorsed it with enthusiasm. Wallace Chafe had proposed a semantically based Transformational Grammar himself, independently (1967a, 1970a, 1970b), then aligned himself with Generative Semantics (especially, 1970b:56, 68). He was instrumental in bringing the Lakoffs to Berkeley. Even some Bloomfieldians, from the linguistics boneyard, rattled their approval. Hill’s LSA memoir captures the general mood of the old guard (or maybe “the ancient guard,” since Chomsky now looked like the old guard) in the 1970s—Hill recalls receiving Ross’s Abstract Syntax work fondly, delighting at a Robin Lakoff paper for blowing the lid off Chomsky’s Cartesian claims, and even praising the other Lakoff for his humorous sentences. Eugene Nida painted his work in the period as having “a dependence on the Generative Semantics approach” (1975:8). Others from that generation chimed in as well, Householder commenting that “the views of Ross, Lakoff, McCawley, [and] Mrs. Lakoff on the nature of the base . . . are the ones which I find most congenial” (1970:35). Bolinger became an active ally in the movement, showing up widely in Generative Semantics footnotes and acknowledgments for perceptive or inspiring assistance; Ross singles him out for “a special kind of thanks” in one article, because he “has been saying the kind of things I say in this paper for a lot longer than I have been able to hear them” (1973c:234). Even the transformational granddaddy, Zellig Harris, was being touted by his supporters as developing a theory “similar to a Generative Semantics theory” (Muntz 1972:270). Whatever their views of Generative Semantics, the Bloomfieldians were certainly not displeased to see Chomsky reaping the discordant harvest he had sown.
Psychologists found Generative Semantics appealing, for obvious reasons—“the attraction for psychologists of Generative Semantics is the greater plausibility of supposing that a speaker begins by generating the basic semantic content of ‘what he wants to say,’ only then going on to cast it in an appropriate syntactic form” (Greene 1972:85)—and the leading Generative Semanticists had no trouble gaining a hearing with them. McCawley and Ross, for instance, were featured presenters at the 1972 Conference on Cognition and Symbolic Processes (McCawley 1974b; Ross 1974b). Philosophers, too, showed considerable interest, welcoming in particular the explorations of logical form by G. Lakoff and McCawley. Donald Davidson and Gilbert Harman invited them to “a cozy cross-cultural colloquium at Stanford” in 1969 (Quine 1985:357), and then included a paper by Ross in the collection of work stemming from that colloquium (G. Lakoff 1972b; McCawley 1972; Ross 1972d. See also Harman 1972). David Lewis, while professing in a now-classic paper on semantics not to “[take] sides on disputed issues in syntactic theory” (1970:37), cites Chomsky only routinely, approvingly engages with McCawley’s work, appeals to pre-Deep-Structure lexical-insertion transformations, thanks George Lakoff for help with the manuscript, and even invokes global rules.
Europeans, most of whom only became interested in transformational work in the late 1960s, also signed on to Generative Semantics in large numbers. It resonated with the older generation, who complained that Chomsky was just a meaning-ignoring Bloomfieldian in Interpretive-Semantic clothing, and the younger generation was attracted by the cachet of the latest thing (Koster 1990:306). Pieter Seuren, Rudolf de Rijk, Eva Hajičová, and Werner Abraham all became influential Generative Semanticists. The model had notable followings in the Netherlands, Sweden, France, Czechoslovakia, and Germany, where Herbert Brekle’s Generative Satzsemantik went through two editions (1970 [1968], 1978), and where the important American papers all saw translations (Abraham and Binnick 1972; G. Lakoff 1971d; Vater 1971). In Czechoslovakia, where Petr Sgall had independently proposed a semantically based Transformational Grammar, the term gained rapid acceptance. There was even a growing following in Japan, and in Australia Frans Liefrink proposed his Semantico-syntax (1973).
Generative Semantics, it seemed, had inherited the field; Chomsky and his small band of Interpretivists were fading reactionaries; Aspects belonged to a quaint antebellum world. Carden even remembers his old teacher expressing puzzlement over one of his mid-1970s papers that attempted to keep a dialogue open with the Interpretivists: “Guy, why are you still talking to those people?” Lakoff asked him. “Haven’t you noticed the war is over? We’ve won.” Winning a war has its privileges, apparently. Lakoff noted in a paper from this period that “one of the joys” of debating is that “the winner gets to say ‘Nyaah, nyaah!’ to the loser.” A few pages later, of course, he closes with “Nyaah, nyaah!” (1973b:286, 290). Few published words tell us more about the dynamics of the dispute or about the personality of Lakoff.