`4 Parsing at Interfaces`

`4.1 Logical Form: Pronouns`

So far we have been discussing syntactic well-formedness in terms of what UG prescribes about syntactic structures and what is learned by children through the parsing process, that is, through discovery and selection. However, in the most general terms, languages are systems in which syntax connects meaning to some form of EXTERNALIZATION. Meanings, as framed by logicians, philosophers, and linguists, are “logical forms,” consisting of units formed from subunits, head–complement relations, phrase–adjunct relations, thematic roles, indexical relations, topic–comment relations, presupposition–assertion distinctions, and much more. For every syntactic structure, there is a logical form, specifying aspects of the meaning, associated with an externalization.

In most people, the syntax connects meaning and sound: the externalization consists of a phonological form for the expression. Thus, in these individuals, for every syntactic structure there is a corresponding externalization that specifies aspects of the sound and a logical form that specifies aspects of the meaning. These are interpreted at the “sensorimotor” and “conceptual-intentional” interfaces respectively, which have their own well-formedness conditions. Those well-formedness conditions interact with learned, variable properties and therefore involve parsing on our approach. For a significant minority, the externalization is some kind of signed system, not based on sounds, but on gestures like those of American Sign Language or the new Nicaraguan Sign Language. For smaller minorities, the externalization might be tactile; the Tadoma method is one such system. Whatever the externalization used, sound, sign, or touch, the point is the same all around. Phonologists, beginning in the early twentieth century, did fundamental work developing rich descriptive systems based on the theories of contrast and distinctive features of Roman Jakobson, Edward Sapir, Nikolai Trubetzkoy, and others (Jakobson 1941; Sapir 1925; Stokoe 1960; Trubetzkoy 1939). In the 1960s, analysts followed the seminal work of William Stokoe of Gallaudet University in Washington, DC, and set about understanding how signed languages conveyed the meanings that syntacticians had discovered in working on oral languages. Soon a vibrant research community emerged, finding that signed systems were as rich and complex as oral systems and making fundamental discoveries about individual signed languages, which proved to bear essentially no relation to their ambient oral languages.

In this chapter we will consider some difficult ellipsis phenomena that interact with both the sensorimotor interface and the conceptual-intentional interface and have interesting logical and phonological consequences. They also do not manifest the properties one would expect if linguistic variation were due to binary parameters defined in UG. Let us begin by considering the conceptual-intentional or syntax–meaning interface, in particular the Binding Theory and the parsing issues that it raises. The logical form will involve the interpretation of pronouns, particularly those in ellipsed VPs, such as Papa Bear wiped his face and Brother Bear did _VP[wipe his face], too. Then in the next section we will consider the distribution of ellipsed VPs, determining where VPs may be reduced to silence in this way.

The early history of generative grammar spawned many publications on the way in which pronouns could refer to people and objects. Linguists offered complex indexing procedures, whereby DPs might have the same or different indices, depending on whether they referred to the same or different entities. For a taste of the kind of unpleasant complexity invoked, see the appendix to Chomsky 1980 and the indexing procedures postulated there. Fortunately, the classical Binding Theory was introduced soon after, in Chomsky 1981a, and consisted of just three principles, simplifying matters enormously:

(1) A. Anaphors are bound locally.

B. Pronouns are free locally.

C. Names are free.

To be bound locally meant being coindexed with a higher, C-COMMANDING expression that is local, within its DOMAIN; being free meant not being coindexed with a local c-commanding expression. Those three principles, simply known as Principle A, Principle B, and Principle C, permitted a dramatic simplification of analyses. By hypothesis, they constitute a component of UG, available to humans in advance of experience, in fact enabling children to interpret their experience. The principles facilitate structures that can meet the demands of learnability, because children only need to learn which nominals are anaphors and pronouns, which does not look difficult. The principles divide nominals into three types: anaphors like the reflexive pronouns himself, themselves, and so on (in English), pronouns like she, her, their, and names (all other nominals). Each nominal is contained inside a Domain, roughly its clause or a larger DP, and it is either coindexed with another, higher DP within that Domain or not. If so, then it is bound locally; if not, then it is free locally. The Binding Theory, in modern terms, involves the indexical relations that make up a well-formed logical form; it is part of the conceptual-intentional interface.

As one way of visualizing things, for an expression Kim’s mother washed herself, whose structure is represented in (2), an analyst might start from herself and proceed up the hierarchical structure from one node to another. If a node is reached on this upward trajectory whose sister is a DP, that DP is a potential binder for herself. This approach involving tree climbing and sister checking enables us to capture the basic technical notion of c-command. The analyst gets as far as the lower IP, at which point there is a DP that is a sister to this node: Kim’s mother.

Herself must be coindexed with (and refer to) this maximal DP Kim’s mother; it may not refer to the lower DP Kim, because that lower DP is not a sister to the IP—it is contained within the larger DP and is therefore inaccessible to the Binding Theory.
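The tree-climbing and sister-checking procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, the function `potential_binders`, and the internal structure given to the subject DP are hypothetical conveniences, not part of the analysis itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    label: str                      # category label, e.g. "DP", "IP", "VP"
    text: str = ""                  # words spanned, for readable output only
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, *kids: "Node") -> "Node":
        """Attach daughters and record their mother node."""
        for kid in kids:
            kid.parent = self
            self.children.append(kid)
        return self


def potential_binders(start: Node) -> List[Node]:
    """Climb from `start` to the root, collecting every sister DP met on the way."""
    found = []
    node = start
    while node.parent is not None:
        for sister in node.parent.children:
            if sister is not node and sister.label == "DP":
                found.append(sister)
        node = node.parent
    return found


# Assumed structure for "Kim's mother washed herself".
kim = Node("DP", "Kim")
subject = Node("DP", "Kim's mother").add(kim, Node("D", "'s"), Node("NP", "mother"))
herself = Node("DP", "herself")
vp = Node("VP").add(Node("V", "washed"), herself)
lower_ip = Node("IP").add(Node("I"), vp)
root = Node("IP").add(subject, lower_ip)

print([dp.text for dp in potential_binders(herself)])  # ["Kim's mother"]
```

The lower DP Kim is never collected, because the climb from herself never passes through a node whose sister it is.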

The representation in (3a) is an alternative and partial representation of (2).

(3) a. _DP[_DP[Kim]’s mother]_i washed herself_i.

b. _DP[_DP[Kim]’s mother]_j washed her_i.

c. _DP[_DP[Kim]’s mother] said _CP[that the doctor_i washed her_j].

d. _DP[_DP[Kim_i]’s mother]_j said _CP[that the doctor_k washed Kim_i].

e. Kim said _CP[that the doctor_i washed her_j].

f. Kim_i said _CP[that the doctor washed Kim_j].

Now consider (3b): it has the same structure, just with her in place of herself—and her may not be coindexed with the DP Kim’s mother, because as a pronoun it needs to be free in its clause. It may, on the other hand, be coindexed with Kim, precisely because the DP Kim is not a sister to the IP (or any node reached by moving up the tree node by node starting with her) and is, therefore, irrelevant to the demands of the Binding Theory. Next, note that (3c) is ambiguous: her may refer to Kim or to Kim’s mother. The Binding Theory stipulates only that her, a pronoun, be free within its own Domain, the clause (CP) indicated; beyond that, there is nothing systematic to be said and any indexing is possible. Similarly, in (3e) her may be coindexed with Kim, because her is thus free (not coindexed with anything) within its own clause; or it may have its own unique index, referring to a woman other than Kim. Turning to names, note that the difference between them and pronouns is that while pronouns only need to be free locally, the need of names to be free is not limited to their own Domain. On the one hand, (3d), with its two Kims, can be a statement about one Kim; the lower Kim, the complement of washed, may not be coindexed with any sister DP we meet as we work our way up the tree structure (not stopping at the CP node but continuing to climb), but the higher DP Kim is not a sister to any node dominating the lower Kim, hence invisible to the Binding Theory. On the other hand, (3f) necessarily concerns two Kims; the lower Kim may not be coindexed with the higher Kim, whose DP is a sister to the IP node dominating the lower Kim.
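As a rough summary of how the three principles sort the indexings just discussed, one might write the following sketch. It assumes that the structural work (c-command and Domain) has already been done, so each case is reduced to two hypothetical yes/no facts: whether the nominal is coindexed with a c-commanding DP inside its Domain, and whether it is coindexed with one outside it. The function name and the encoding of the cases are illustrative only.

```python
def well_formed(kind: str, bound_locally: bool, bound_nonlocally: bool = False) -> bool:
    """Check one nominal against Principles A, B, and C of (1)."""
    if kind == "anaphor":      # Principle A: bound locally
        return bound_locally
    if kind == "pronoun":      # Principle B: free locally
        return not bound_locally
    if kind == "name":         # Principle C: free everywhere
        return not (bound_locally or bound_nonlocally)
    raise ValueError(f"unknown nominal type: {kind}")


# (label, type, bound inside Domain, bound outside Domain)
cases = [
    ("(3a) herself = Kim's mother", "anaphor", True, False),
    ("(3b) her = Kim's mother", "pronoun", True, False),
    ("(3b) her = Kim", "pronoun", False, False),      # Kim does not c-command her
    ("(3e) her = Kim", "pronoun", False, True),
    ("(3d) lower Kim = Kim inside the subject DP", "name", False, False),
    ("(3f) lower Kim = higher Kim", "name", False, True),
]

for label, kind, local, distant in cases:
    print("ok" if well_formed(kind, local, distant) else "* ", label)
```

Only the second and last cases are marked ill-formed, matching the judgments in the text.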

The Binding Theory yields the necessary distinctions beautifully but itself cannot be learned from data accessible to young children, the PLD. We therefore say that it is part of UG, part of what children bring to the analysis of initial experience. Learning is involved, however: children must learn which words are anaphors and which are pronouns, but nothing more complex is needed. The three possibilities are defined in (1) and they hold for all languages. Once a child acquiring English has learned that themselves is an anaphor, her a pronoun, and so on, all the appropriate indexing relations follow, with no further learning. Similarly for other languages, children learn which words are anaphors and pronouns and everything else follows. How, then, do they learn which words are which? We will see that parsing must be involved.

Exposure to a simple sentence like (4a), interpreted with themselves referring to (coindexed with) they, suffices to show that themselves is an anaphor and not a pronoun or a name; pronouns and names may not be thus coindexed with an accessible phrasal category within their Domain.

(4) a. They_i washed themselves_i.

b. Kim_i’s father loves her_i.

c. Kim_i heard _DP[Bill’s speeches about her_i].

d. Kim left.

The sentence in (4b), interpreted with her referring to Kim, shows that her is no anaphor, since it is not coindexed with any sister DP encountered as we move up the tree structure within the Domain of her. And (4c), with her referring to Kim, shows that her is not a name, since names may not be coindexed with a sister DP anywhere; the Domain of her is the DP indicated and her is free within that Domain, happily. If neither an anaphor nor a name, then her is a pronoun. So far, so good, but here comes the snag.

A very simple expression like (4d) shows that Kim is not an anaphor, but there is no positive evidence available to a child showing that Kim is not a pronoun. Analysts know that Kim is not a pronoun, because one does not find sentences like Kim said that Kim left, with the two Kims referring to the same person, but that is a negative datum, information that something doesn’t occur, hence unavailable to young children. So a complication has arisen; but it can be resolved by appealing to hierarchical organization.

If we turn to hierarchical relations, the starting point for a child might be that every word is a name, unless there is positive, refuting evidence. Under that view, sentences like (4a) show that themselves is not a name, and not a pronoun either, hence an anaphor. And (4c) shows that her is not a name, because it is coindexed with an accessible sister DP (Kim), and not an anaphor, because it is not locally coindexed, hence a pronoun. This yields a satisfactory account. We have a theory of mature capacity that provides the appropriate distinctions, and one can show how children learn from environmental data which elements are anaphors (1A) and which are pronouns (1B); all other nominals are names (1C).
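The default-to-name learning path can be made concrete with a small sketch. It assumes that each piece of experience has already been parsed into a hypothetical triple: the word, whether the child understood it as coindexed with a c-commanding DP, and whether that DP lies inside the word's Domain. The function `classify` and this encoding are illustrative assumptions.

```python
from typing import Dict, Iterable, Tuple

# (word, bound by a c-commanding DP?, is that DP inside the word's Domain?)
Observation = Tuple[str, bool, bool]


def classify(observations: Iterable[Observation]) -> Dict[str, str]:
    """Treat every nominal as a name unless positive evidence refutes that."""
    lexicon: Dict[str, str] = {}
    for word, bound, local in observations:
        lexicon.setdefault(word, "name")       # starting assumption
        if bound and local:
            lexicon[word] = "anaphor"          # locally bound, as in (4a)
        elif bound and not local:
            lexicon[word] = "pronoun"          # bound, but only from outside the Domain
    return lexicon


pld = [
    ("themselves", True, True),    # (4a) They_i washed themselves_i.
    ("her", False, False),         # (4b) Kim does not c-command her.
    ("her", True, False),          # (4c) Kim_i heard [Bill's speeches about her_i].
    ("Kim", False, False),         # (4d) Kim left.
]

print(classify(pld))
# {'themselves': 'anaphor', 'her': 'pronoun', 'Kim': 'name'}
```

No negative datum is consulted at any point: Kim stays a name simply because nothing in the input dislodges the default.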

However, further work suggests that this problem with the learnability of pronouns may be symptomatic of a bigger issue. Principles A and C, applying to anaphors and names, have stood the test of time well, but Principle B has been problematic from the early days of the Binding Theory. For example, Avrutin and Wexler 1992 observes that Principle B seems to be delayed in Russian-speaking children and does not come into effect until well after Principles A and C. Avrutin and Wexler argue that the effect is illusory and that, in fact, Russian children behave according to Principle B from the earliest stage but lack a particular pragmatic principle, which they spell out (below). That is what makes it appear that Russian children lack Principle B.

According to Grodzinsky and Reinhart 1993, Principle B applies only to pronouns that are bound variables, hence to the pronoun in Is every bear touching her? but not to the pronoun in Is Mama Bear touching her?, which can be referential. A clear instance of a nonreferential pronoun is No bear likes his father, where there can be no referent for no bear and therefore none for his. If Principle B does not apply to referential pronouns, one may wonder why Mama Bear is touching her does not mean the same as the corresponding sentence with an anaphor, Mama Bear is touching herself. For that reason, Grodzinsky and Reinhart invoke a special rule, their Rule I: “NP A cannot co-refer with NP B if replacing A with C, C a variable A-bound by B, yields an indistinguishable interpretation” (p. 70). The status of such a rule is quite unclear; for good discussion, see Elbourne 2005.

Elbourne claims (p. 361) that languages vary in whether they observe Principle B: earlier forms of English and Maori, for example, do not. That leads him to conclude that Principle B is subject to a parameter and that children have to learn it only if it applies to their I-language. However, as Elbourne notes, there is a serious problem with that proposal: as we noted four paragraphs back, learning the principle would require access to negative evidence, information that certain things do not occur in the language.

Also contributing to this literature on the special properties of pronouns, Thornton and Wexler 1999: chap. 2 surveys empirical investigations showing that children have Principles A and C but not Principle B. Thornton and Wexler (p. 9) find that Principle B “stands out as an empirical problem area,” as argued by Elbourne. They adapt the proposal of Avrutin’s dissertation (Avrutin 1994) and distinguish three analyses of Mama Bear is washing her face, what they call the deictic, coreference, and quantificational readings. In the deictic reading, Mama Bear washes somebody else’s face, perhaps Snow White’s, and there is no coindexing. In the second reading there is coreference between Mama Bear and the pronoun. And the third reading involves a quantificational analysis with a LAMBDA OPERATOR: Mama Bear (λx (x is washing x’s face)) (cf. the bound-variable use of pronouns discussed above). The second and third analyses have the same truth conditions and are difficult to tell apart. There has been a vast amount of work on the interpretation of pronouns and the circumstances under which they may be interpreted as coindexed with a preceding noun phrase; Elbourne 2005 has wise discussion.

Thornton and Wexler argue that children’s misinterpretations of sentences like Mama Bear is washing her, with her referring to Mama Bear, are not violations of Principle B but reflect incomplete pragmatic knowledge. “As a consequence, children accept coreference between a pronoun and a name, what we will be calling a local coreference interpretation, in circumstances in which an adult would not” (p. 14). They claim that children have difficulty evaluating other speakers’ intentions, which has consequences for both speech and understanding of language. Children typically take new information to be old information, in nonadult fashion, explaining why “children may announce ‘He hit me’ instead of ‘A boy hit me’ or ‘John hit me’” (p. 15). This is why children allow local-coreference interpretations for expressions like Mama Bear is washing her.

Thornton and Wexler experiment with VP-ellipsis constructions. They compare children’s interpretation of pronouns in simple sentences like (5a), governed by Principle B, with their responses to sentences like (5b), governed by Principle C. As noted, children sometimes let the pronoun and the name co-refer in (5a), as adults would never do.

(5) a. Mama Bear is washing her.

b. She is washing Mama Bear.

Thornton and Wexler also compare children’s interpretation of (5a) with their interpretation of pronouns in ellipsed VPs. An example of VP ellipsis is given in (6a); in (6b), the ellipsed VP contains a pronoun.

(6) a. Papa Bear ate pizza and Brother Bear did _VP[eat pizza], too.

b. Papa Bear wiped his face and Brother Bear did _VP[wipe his face], too.

The pronoun in (6b) is multiply ambiguous and may have a deictic, coreference, or bound-variable reading. In simple sentences the coreference and bound-variable readings are hard to distinguish, because they are true under the same truth conditions, as we noted in our discussion above of Mama Bear is washing her. In ellipses, however, the truth conditions are different for the two readings. For (6b), the deictic and coreference readings show strict identity (cf. §2.6): the deictic reading takes the two pronouns to refer to a specific individual not mentioned in the sentence, perhaps to Sister Bear, while the coreference reading takes them to refer to Papa Bear, with the second pronoun (in the ellipsed VP) linked to the overt pronoun in both cases. The bound-variable reading, on the other hand, shows sloppy identity. The pronoun is bound in both clauses but by different operators, so it refers to different individuals in each clause.
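The way ellipsis pulls the coreference and bound-variable readings apart can be illustrated with a short sketch. It assumes, for illustration only, that each reading of the VP in (6b) is a function from a subject to a pair (who wipes, whose face), and that ellipsis reuses the same function for the second subject. The name `readings` and this encoding are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Event = Tuple[str, str]  # (who wipes, whose face is wiped)


def readings(first: str, second: str, other: str) -> Dict[str, List[Event]]:
    """Apply each reading of 'wiped his face' to both subjects of (6b)."""
    vps: Dict[str, Callable[[str], Event]] = {
        "deictic": lambda x: (x, other),         # his = someone outside the sentence
        "coreference": lambda x: (x, first),     # his = subject of the first clause
        "bound variable": lambda x: (x, x),      # λx (x wiped x's face)
    }
    # The ellipsed VP is interpreted with the same VP meaning as its antecedent.
    return {name: [vp(first), vp(second)] for name, vp in vps.items()}


result = readings("Papa Bear", "Brother Bear", "Sister Bear")
for name, (clause1, clause2) in result.items():
    print(f"{name}: {clause1}, {clause2}")
```

In the first clause the coreference and bound-variable functions return the same pair, which is why the two readings cannot be told apart in a simple sentence. In the second clause they diverge: strict identity has Brother Bear wiping Papa Bear's face, sloppy identity has him wiping his own.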

When we consider the three principles of the Binding Theory, Principles A and C look quite straightforward, and it is easy to see how children learn what an anaphor is and what a name is. Principle B is different and there is substantial learning involved, which seems to require parsing on the part of our children; the literature shows children having difficulty with Principle B. Elbourne 2005 surveys experimental work by many researchers investigating children’s use of pronouns and the different ways they link to other DPs. If Principle B were simply part of the toolbox made available by UG like Principles A and C, we would expect similarly uniform linguistic behavior of children and rapid, accurate learning. Instead, we see children behaving quite differently, depending on the language they are selecting and their age.¹ Children appear to be challenged by the behavior of pronouns and to be conducting detailed analysis, sometimes arriving at systems that differ from those of the adults around them.²

We see that the Binding Theory needs to be part of the conceptual-intentional interface, part of what is given by UG, but it interacts with variable properties, which have to be learned and therefore involve parsing. No proposal for phenomena of this kind has been provided in terms of binary parameters whose content is stated at UG, as far as I am aware. If children are born to parse and if parsing is of fundamental importance, we can see how they might arrive at an appropriate analysis, incorporating much variation depending on the language being selected and the age at which learning takes place.

`4.2 Phonological Form: Not Pronounced`

Now let us turn to the sensorimotor interface and the requirements for the phonological form of expressions (one of the possible externalizations). We will be concerned with when elements may go unpronounced. Ellipsed VPs (VPs rendered silent through the operation of ellipsis) are unusual across languages, but English allows them, and children have plenty of evidence to that effect, as illustrated in (7). They occur in a wide range of structures, but they need a “host,” an adjacent overt head that licenses them. The suggestion is that empty VPs occur only where they cliticize onto an adjacent host.³ In (7a) the empty VP is the complement of did, and did hosts it. Of course, VP ellipsis only applies to VPs: (7b) is ill-formed because part of the VP remains, for Naples, and there is no null VP. In the ungrammatical (ii) structures of (7c,d), the null VP is separated from its potential host had, hence their ungrammaticality can be attributed to failure to cliticize. A properly hosted ellipsed VP may occur in a subordinate clause (7e), to the left of its antecedent (7f), in a separate sentence from its antecedent (7g), within a complex DP (7h), with an antecedent that is contained in a relative clause (7i), or even without any overt antecedent (7j).

(7) a. Max left on Wednesday but Mary did _VP[leave on Wednesday] as well.

b. *Max left for Rio but Mary didn’t _VP[leave for Naples].

c. i. They denied reading it, although they all had _VP[read it].

ii. *They denied reading it, although they had all _VP[read it].

d. i. They denied reading it, although they often/certainly had _VP[read it].

ii. *They denied reading it, although they had often/certainly _VP[read it].

e. Max left for Rio, although Mary didn’t _VP[leave for Rio].

f. Although Max couldn’t _VP[leave for Rio], Mary was able to leave for Rio.

g. Susan went to Rio.

Yes, but Jane didn’t _VP[go to Rio].

h. The man who speaks French knows _DP[the woman who doesn’t _VP[speak French]].

i. People who appear to support mavericks generally don’t _VP[support mavericks].

j. Don’t _VP[??]!

It appears that an ellipsed VP must cliticize (or incorporate in some other way) to the left, to a host head of which it is the complement:

(8) Max could visit Rio and Susan _INFLcould + _VP[visit Rio], too.

This requirement explains the nonoccurrence of (9a), noted in Zagona 1988: the ellipsed VP needs an appropriate, adjacent host, a full phonological word, of which it is the complement, as in (9b). In (9a), has, reduced to ’s, has become part of the noun John and no longer heads a phrase of which the empty VP is the complement.

(9) a. *I haven’t seen that movie, but John’s _VP[seen the movie].

b. I haven’t seen that movie, but John [has + _VP[seen the movie]].

Consider now null complementizers and deleted copies, where something similar seems to be at work, as discussed briefly in §1.1 (see (5)–(8) there). A child might hear sentences like (10a–c) pronounced with or without the complementizer that, because in English both versions occur. Such experiences would license an operation of the form in (10d) whereby that is deleted or rendered silent. French, Dutch, and German children have no comparable experiences and hence no grounds to parse a comparable deletion operation in their grammars; nothing like (10d) is triggered and there is no optionality for them; the complementizer must be present.

(10) a. Peter said [that/0 Kay had left already].

b. The book [that/0 Kay wrote] arrived.

c. It was obvious [that/0 Kay left].

d. that → 0

So experience licenses the operation in (10d) for children acquiring English; but a linguist may observe that as a generalization, (10d) breaks down at certain points: that may not be null in the contexts of (11). The crucial data here are negative data, data about what does not occur, which are not available to children. Hence UG must be playing some role.

(11) a. Peter said yesterday [that/*0 Kay had left already].

b. The book arrived yesterday [that/*0 Kay wrote].

c. [that/*0 Kay left] was obvious to all of us.

d. Fay believes, but Kay doesn’t, [that/*0 Ray is smart].

e. Fay said Ray left and Tim _Ve [that/*0 Jim stayed].

f. Fay said [that/0 [that/*0 the moon is round] is obvious].

What we see here is that, much as with ellipsed VPs, that can be deleted only if the clause it occurs in is the complement of an overt, adjacent word. In (11a,b) the clause is the complement of said and book respectively, neither adjacent.⁴ In (11c), the clause is the complement of nothing. In (11d) it is the complement of believes, which is not adjacent, and in (11e) it is the complement of a verb that is not overt. In (11f) the lower complementizer may not be null because its clause is not the complement of said.⁵
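The licensing condition for a null complementizer can be stated as a simple conjunction, sketched below. The sketch assumes that each clause in (10) and (11) has already been analyzed into three hypothetical facts: the word whose complement it is (if any), whether that word is overt, and whether it is adjacent to the clause. The `Site` record and `may_delete` function are illustrative names only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    example: str
    head: Optional[str]       # word whose complement the clause is, or None
    head_overt: bool = True   # False for a gapped (silent) verb
    adjacent: bool = True     # False if other material separates head and clause


def may_delete(site: Site) -> bool:
    """that may be null only in the complement of an overt, adjacent word."""
    return site.head is not None and site.head_overt and site.adjacent


sites = [
    Site("(10a)", head="said"),
    Site("(11a)", head="said", adjacent=False),      # yesterday intervenes
    Site("(11b)", head="book", adjacent=False),      # arrived yesterday intervenes
    Site("(11c)", head=None),                        # subject clause: complement of nothing
    Site("(11d)", head="believes", adjacent=False),
    Site("(11e)", head="said", head_overt=False),    # the verb is gapped
]

for site in sites:
    verdict = "that may be null" if may_delete(site) else "that must be overt"
    print(site.example, verdict)
```

Only (10a) passes; each example in (11) fails one conjunct of the condition, which is the pattern described above.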

The same condition holds for what we used to view as “traces” of wh- movement. English-speaking children learn that wh- elements are displaced, that is, pronounced in a position other than where they are understood, on hearing and understanding a sentence like (12a). On Minimalism’s Copy-and-Delete implementation of displacement, there are actually multiple copies of the same element; an independent principle says that only one of them may be pronounced (in this case, it is the sentence-initial one), entailing deletion of all the others. For more on this, see note 7. In (12a′,b′), the structures posited for (12a,b), the lowest who is the complement of the adjacent verb, and in (12b′), the intermediate who occurs in a clause that is the complement of the adjacent verb say.

(12) a. Who did Jay see?

b. Who did Jay say _CP[that Fay saw]?

a′. Who did Jay see who?

b′. Who did Jay say _CP[who that Fay saw who]?

Assuming the Copy-and-Delete Minimalist structures of (12a′,b′), a copy of who can be deleted only when it or the clause in which it occurs is the complement of an adjacent, overt word. If that is the condition, it predicts, with no further learning, that (13a) is ill-formed, because the boldface who is undeletable (henceforth, boldface indicates a copy that cannot be deleted as required): it is in a clause that is the complement of apparent but not adjacent to it. The lowest who is the complement of the adjacent, overt seen, hence deletable. Also, if yesterday in Chicago were not present in (13a), then it would be the case that who was in an adjacent complement of the overt apparent, hence deletable; this yields the well-formed (13b), where (13b′) is the Copy-and-Delete representation.

(13) a. *Who was it apparent yesterday in Chicago _CP[who that [Kay had seen who]]?

i.e.,

*Who was it apparent yesterday in Chicago who that Kay had seen?

and

*Who was it apparent yesterday in Chicago 0 that Kay had seen?

b. Who was it apparent that Kay had seen?

b′. Who was it apparent [who that Kay had seen who]?

We thus solve the poverty-of-stimulus problem posed by (13a) as follows: children learn simply that wh- items may be displaced (copied and deleted), and the interface condition requiring deleted items to cliticize onto an adjacent host causes the derivation of (13a) to crash with no further learning.

Other contexts likewise indicate that items may be deleted only if they are the complement or in the complement of an overt, adjacent word. So which man is deletable in the leftmost conjunct of (14c) (the complement of the adjacent introduce) but not the boldface which woman in the rightmost conjunct, the complement of a nonovert verb. Hence the corresponding sentence is ill-formed. Similarly, in (14d,e,g), the boldface element fails to meet the condition for deletion, because the relevant verb is not overt. These structures involve wh- movement (14c,d), readily learnable as noted above; heavy-DP shift (14e,g), learnable on exposure to simple expressions like John gave to Ray his favorite racket; and gapping (14c,d,e,g), learnable on exposure to things like (14b,f). The UG principle then solves the poverty-of-stimulus problems of (14c,d,e,g).⁶

(14) a. Jay introduced Kay to Ray and Jim introduced Kim to Tim.

b. Jay introduced Kay to Ray and Jim _Ve Kim to Tim.

c. *Which man_i did Jay introduce which man_i to Ray and which woman_j Jim _Ve which woman_j to Tim?

i.e.,

*Which man did Jay introduce to Ray and which woman Jim which woman to Tim?

and

*Which man did Jay introduce to Ray and which woman Jim 0 to Tim?

d. *Jay wondered what_i Kay gave what_i to Ray and what_j Jim _Ve what_j to Tim.

e. *Jay admired [his uncle from Paramus]_i greatly [his uncle from Paramus]_i but Jim _Ve [his uncle from New York]_j only moderately [his uncle from New York]_j.

f. Jay gave his favorite racket to Ray and Jim _Ve his favorite plant to Tim.

g. *Jay gave [his favorite racket]_i to Ray [his favorite racket]_i and Jim _Ve [his favorite plant]_j to Tim [his favorite plant]_j.

The same condition explains why a complementizer may not be null if it occurs to the right of a gapped (nonovert) verb, as in (15b); nor does one find a deleted copy in that same position, as with the boldface who in (15c).

(15) a. Jay thought Kay hit Ray and Jim _Ve _CP[that Kim hit Tim].

b. *Jay thought Kay hit Ray and Jim _Ve _CP[0 Kim hit Tim].

c. *Who_i did Jay think Kay hit who_i and who_j Jim _Ve _CP[who_j (that) [Kim hit who_j]]?

So, children exposed to some form of English have plenty of evidence that a that complementizer is deletable (10d), that wh- phrases may be displaced (copied), and that heavy DPs may be copied to the end of a clause (14e,g); but they also know without evidence that complementizers and copies may not be deleted unless they are the complement or in the complement of an adjacent, overt word. And the data of (10–15) suggest that this is the information that UG needs to provide and that head–complement relations are crucial. The convergence of that information with the I-language-specific devices that delete a that complementizer and allow a wh- phrase or a heavy DP to be copied yields the distinctions we have noted and solves the poverty-of-stimulus problems.⁷ The UG requirement guarantees that deleted items must be understood in structurally prominent positions, where they have an appropriate host. This might be motivated by parsing needs: the possibility of a deleted item need only be considered where there is an appropriate host for one. The absence of an appropriate host rules out a deleted element; correspondingly, the presence of an appropriate host is a potential cue to the presence of a deleted item.

More evidence for this interface requirement comes from failures of verb reduction. The verbs is, am, are, has, have, had, will, would, and shall may reduce: Kim’s happy, Jim’ll do it, Sarah’d read it, and so on. However, readers will not be surprised by now that there are apparent exceptions: for example, the boldface instances of is in Kim’s happier than Tim is, I wonder what the problem is, I wonder what that is up there, I wonder where the concert is on Wednesday may not reduce. These data, negative data concerning contexts where is does not reduce, are not available to children directly, and that is the familiar poverty-of-stimulus problem: the stimulus appears to be too poor to determine all the properties of the mature system. Children hear some instances of the reduced forms but somehow come to know much more, namely that is may be reduced generally but not in the boldface contexts above. But notice that the boldface items each precede a deletion site, as shown in (16). Our emerging analysis suffices to explain the nonreduction: the full form is needed to license the deletion site.⁸

(16) a. Kim is happier than Tim is happy.

b. I wonder what the problem is what. (Cf. The problem’s twofold.)

c. I wonder what that is what up there. (Cf. That’s a fan up there.)

d. I wonder where the concert is where on Wednesday. (Cf. The concert’s in Nogales on Wednesday.)

All is well so far, but now the question is: how is what deleted? Let us first review effects of earlier restrictions and see how we might capture them with the economy and elegance that the Minimalist Program encourages.

We know that elements may cliticize to the left and become an inseparable part of their host. That happens with the reduced is discussed earlier. When is reduces, its pronunciation is determined by the last segment of the word to which it attaches, as (17a) illustrates: voiceless if the last segment is voiceless, voiced if the last segment is voiced, and syllabic if the last segment is a sibilant or affricate. Precisely the same is true of the plural marker, the possessive, and the third-person singular ending on a verb, illustrated in (17b–d) respectively.

(17) a. Pat’s happy, Doug’s happy, and Alice’s here.

b. cats, dogs, and chalices

c. Pat’s dog, Doug’s cat, and Alice’s crocodile

d. commits, digs, and misses
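The three-way alternation in (17) depends only on the final segment of the host, as the following sketch shows. It is a simplification under stated assumptions: the final segment is supplied by hand as an IPA symbol rather than derived from spelling, the segment inventories are partial, and the syllabic form is written ɪz although its vowel quality varies across speakers.

```python
# Partial, assumed inventories of English segments (IPA symbols).
SIBILANTS_AND_AFFRICATES = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ"}


def clitic_form(final_segment: str) -> str:
    """Shape of reduced is, plural, possessive, or 3sg -s after `final_segment`."""
    # Sibilants are checked first: s is voiceless but still takes the syllabic form.
    if final_segment in SIBILANTS_AND_AFFRICATES:
        return "ɪz"
    if final_segment in VOICELESS:
        return "s"
    return "z"   # everything else (voiced consonants and vowels) is voiced


for word, final in [("Pat", "t"), ("Doug", "g"), ("Alice", "s")]:
    print(f"{word}'s:", clitic_form(final))
```

The same function covers all four rows of (17), since cats, Pat’s, and commits share a final t, and likewise for the other two columns.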

Children understand Pat’s happy as ‘Pat is happy’, Pat being the subject of the phrase ‘is happy’. However, is is pronounced inseparably with Pat, and children parse what they hear as (18a), that is, with reduced is attached to the noun, with normal pronunciation applying. What (18a) expresses is a piece of structure, (18b), that serves to determine the shape of the emerging grammar, showing particularly that elements may be cliticized (Lightfoot 1999, 2006a). So from hearing and understanding an expression like Pat’s happy, children learn that is may be reduced and absorbed into the preceding word. Again we see the effects of parsing.

(18) a. _NPat +’s

b. noun + clitic

If we draw (17) together with (19), we now find something interesting: copies do not delete if they are to the right of a cliticized verb. In (19), the copied elements may be deleted if is is in its full form, but not if it is reduced; the corresponding sentences with ’s do not occur.

(19) a. Kim is happieri than Tim is/*Tim’s happyi.

b. That is a fan up there.

c. I wonder whati that is/*that’s whati up there.

d. I wonder wherei the concert is/*concert’s wherei on Wednesday.

This suggests again that a deleted copy is incorporated into the element of which it is the complement. In (19), if is cliticizes onto the subject noun and becomes part of that noun, it no longer heads a phrase of which what/where is the complement, and no incorporation is possible, hence no deletion if deletion is incorporation or cliticization.

That idea enables us to capture another subtle and interesting distinction. The sentence in (20a) is ambiguous: it may mean that Mary is dancing in New York or just that she is in New York (working on Wall Street, say, not dancing). The minimally different (20b), however, only has the latter interpretation. The ‘dancing in New York’ interpretation of (20a) has a structure with an empty verb, understood as ‘dancing’, represented in (20c). If empty elements (like an understood verb) are incorporated, there must be an appropriate host. There is an appropriate host in (20c), where the empty verb cliticizes onto a full verb, is, but not in (20d): _Ve isn’t the complement of Mary’s, therefore it is not licensed. Consequently (20b) unambiguously means that Mary is in New York (occupation unspecified), because there is no empty, understood verb. Again, it is inconceivable that children learn such distinctions purely on the basis of external evidence.

(20) a. Max is dancing in London and Mary is in New York.

b. Max is dancing in London and Mary’s in New York.

c. Max is dancing in London and Mary is _Ve in New York.

d. *Max is dancing in London and Mary’s _Ve in New York.

So copies are deleted in the phonology in order to satisfy linearization requirements, and our analysis takes deletion to be an instance of cliticization, which allows the analysis to generalize to other null elements, such as copies, as already discussed above. In (21a) the deleted complement cliticizes onto the adjacent see, and in (21b) the deleted Jay is in the complement of expected, which is adjacent to it, and accordingly cliticizes onto it.

(21) a. Whoi did Jay see whoi?

b. Jayi was expected [Jayi to win].

The analysis appeals to head–complement relations and adjacency.
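
Those two ingredients, together with the requirements that the host be overt and not itself cliticized onto another word, can be put together as a single predicate. This is a minimal sketch under stated assumptions: the class names, field names, and the Boolean encoding of each example are hypothetical simplifications introduced here, not part of the analysis itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Head:
    form: str
    overt: bool = True        # False for an understood, unpronounced head
    cliticized: bool = False  # True once the head is absorbed into the word to its left


@dataclass
class EmptyElement:
    label: str
    adjacent_head: Optional[Head]  # None when no head is adjacent
    in_complement: bool            # is it the complement of that head, or inside it?


def can_delete(e: EmptyElement) -> bool:
    """True if the element has a viable host and so may go unpronounced."""
    h = e.adjacent_head
    return h is not None and h.overt and not h.cliticized and e.in_complement


# Hand-coded encodings of examples from the text (an illustrative assumption).
examples = {
    "(21a) who after see": EmptyElement("who", Head("see"), True),
    "(19c) what after that's": EmptyElement("what", Head("'s", cliticized=True), True),
    "(22c) heavy subject DP": EmptyElement("DP", None, False),
    "(24a) Jay after empty noun": EmptyElement("Jay", Head("picture", overt=False), True),
    "(26d) which meeting in adjunct": EmptyElement("which meeting", Head("angry"), False),
}
for name, e in examples.items():
    print(name, can_delete(e))
```

Only the (21a) case satisfies the predicate; each of the others fails on exactly one conjunct, which is how the account separates the ill-formed cases from one another.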

Our analysis captures many other distinctions. For example, English speakers’ grammars typically have an operation whereby a “heavy” DP is displaced to the right (see (14e,g) above). Under our Copy-and-Delete approach that means merging a copy to the right and reducing the first copy to silence by absorbing it clitic-like into a host. In (22a) the copied element is the complement of introduced, hence incorporated and deleted successfully; in (22b) it is in the complement of the adjacent expect; but in (22c) the element that needs to be deleted is neither the complement nor contained in the complement of anything, and the derivation is ill-formed and crashes.

(22) a. I introduced [all the students from Brazil]i to Mary [all the students from Brazil]i.

b. I expect [[all the students from Brazil]i to be at the party] [all the students from Brazil]i.

c. *[[All the students from Brazil]i are unhappy] [all the students from Brazil]i.

Our UG principle, that deletion of this kind is cliticization or incorporation, solves the poverty-of-stimulus problem of (22c): children simply learn that heavy DPs may be copied to the right, and the UG condition accounts for the nonoccurrence of (22c) with no further learning or experience needed.

Our analysis can also solve a puzzle about genitives and DP structure, discussed in §3.1. Whereas a simple DP like a book has the structure _DP[_Da _Nbook], a DP like Kim’s book about syntax has the Determiner ’s governing (and assigning Case to) its specifier, the genitive Kim, as well as its complement _NP[book about syntax]. Consider now an expression like Jay’s picture. It is three-way ambiguous: Jay may be the owner of the picture, the painter, or the person portrayed. The latter reading is the so-called objective genitive and is usually analyzed as in (23), where Jay is copied from the “object” position to the specifier of the DP. The operation is specific to grammars of English speakers and does not occur in French, for example. This much is learnable: children hear expressions like Jay’s picture in contexts where it is clear that Jay is pictured.

(23) _DP[Jayi’s _NP[picture Jayi]]

The curious thing is that comparable expressions like the picture of Jay’s, The picture is Jay’s, and the picture that is Jay’s show only a two-way ambiguity, where Jay may be the owner or the painter but not the person portrayed. This is yet another poverty-of-stimulus problem, because it is inconceivable that children are systematically supplied with evidence that the objective interpretation is not available in these cases. We have an explanation for this, as already noted in §3.1: the structure of these expressions would need to be as follows.

(24) a. *the picture of _DP[Jay’s _NP[picture Jay]] (the picture of Jay’s)

b. *the picture is _DP[Jay’s _NP[picture Jay]] (the picture is Jay’s)

c. *the picture that is _DP[Jay’s _NP[picture Jay]] (the picture that is Jay’s)

A preposition like of in (24a) is always followed by a DP, a possessive like Jay’s occurs only as the fused specifier and head of a DP, and Ds always have an NP complement, even if the noun is empty, as it is here (where it is understood as ‘picture’). Now we can see why the structures are ill-formed: the copied Jay has no host to cliticize onto, hence it is undeletable (boldface) and the derivation crashes. Jay is the complement of the adjacent noun, but that noun is not overt, hence not a viable host.

The pair in (25) reflects another distinction covered by our account. The sentence in (25a) is well-formed and involves no deletion of a copied element, whereas (25b) involves two instances of DP copying and deletion (to yield the passive constructions). The leftmost instance is well-formed, because the copied Jay is in the complement of the adjacent known and therefore deletes; however, in the rightmost conjunct, the copied he has no overt host to cliticize onto and therefore cannot be deleted as required, leading the derivation to crash.

(25) a. It is known that Jay left but it isn’t _Ve that he went to the movies.

b. *Jayi is known [Jayi to have left] but hei isn’t _Ve [hei to have gone to the movies].

And there is more: it is well known that an expression like They were too angry to hold the meeting is ambiguous, meaning either that they were so angry that they couldn’t hold the meeting or that some unspecified person (e.g., the speaker) couldn’t hold the meeting; the ambiguity lies in who was in charge of holding the meeting (Chomsky 1986: 33). The former reading has the structure of (26a), where they is copied and deleted; the CP is the complement of angry and they is in that complement and adjacent to angry, hence incorporated. The other reading has arbitrary PRO as the subject of hold, as shown in (26b): nothing is copied, and that would not be possible because the clause is an ADJUNCT to angry, not a complement (adjuncthood is represented here by italics).

(26) a. Theyi were too angry _CP[theyi to hold the meeting].

b. They were too angry CP[PRO_arb to hold the meeting].

c. Which meetingi were theyj too angry _CP[which meetingi [theyj to hold which meetingi]]?

d. *Which meetingi were theyj too angry CP[which meetingi [PRO_arb to hold which meetingi]]?

However, the corresponding question Which meeting were they too angry to hold? is unambiguous and has only the anaphoric reading, as in (26c), under which they are unable to hold the meeting. It lacks the meaning of an arbitrary subject for hold: (26d) is ill-formed. In (26c), the clause is the complement of angry and therefore which meeting in that complement can cliticize onto angry and thus be deleted. Likewise, the copied they is deleted successively in (26c). (See also (31).) However, in (26d), the clause is an adjunct to angry, not a complement, and therefore the intermediate copy of which meeting is undeletable.

Several instances of deletion, we have now seen, are subject to poverty-of-stimulus problems suggesting a cliticization or incorporation analysis. Our children are learning what they need to learn through parsing positive data. Other instances of apparent deletion are not subject to comparable poverty-of-stimulus problems and do not fall under a cliticization treatment. Van Craenenbroeck and Merchant 2013 offer a quite comprehensive inventory of deletion processes, instances where elements are not pronounced. In some instances we understand analyses in some detail, but other examples are less well understood, and work remains to be done on why a cliticization or incorporation analysis works in some places and not elsewhere. Nonetheless the poverty-of-stimulus problems are real and require at least the information invoked here, even if analyses require further elaboration. For example, gapped verbs have a very different distribution from ellipsed VPs, so they do not cliticize in the way that we have analyzed ellipsed VPs here. Compare the gapped verbs in (14, 27) with the ellipsed VPs in (7): their distribution is quite different.

(27) a. *Max speaks French, although Mary _Ve German.

b. *Jim said that Max speaks French and Kim said that Mary _Ve German.

c. *Max _Ve French and Mary speaks German.

d. *The man who speaks French knows _DP[the woman who _Ve German].

e. *Max drove to New York and Susan did _Ve to Chicago.⁹

So far we have been talking about deletion sites as involving cliticization onto a host, treating the deleted item as some kind of clitic. Indeed, it is profitable to view the incorporated items as clitics. Zwicky and Pullum (1983) distinguish between clitics and AFFIXES, and this distinction permits some further understanding. Specifically, Zwicky and Pullum argue that the English reduced negative n’t is an affix: so in our terms isn’t, for example, is formed in the lexicon and merged directly into syntactic structure. That distinguishes between (28b), where isn’t is merged with here to form a constituent, and the ill-formed (28c).

(28) a. John’s not here.

b. John isn’t here.

c. *John’sn’t here.

Two of Zwicky and Pullum’s criteria for their distinction are given in (29). Criterion F says that affixes may not attach to material already containing clitics, hence the nonoccurrence of (28c).

(29) E. Syntactic rules can affect affixed words, but cannot affect clitic groups.

F. Clitics can attach to material already containing clitics, but affixes cannot.

This allows us to distinguish between the structures of (30): criterion E allows a syntactic copying operation (what we used to think of as displacement or movement) to affect couldn’t, an affixed form, but not could’ve, where ’ve is cliticized onto could.

(30) a. Couldn’t Kim see that?

b. *Could’ve Kim seen that?

Hence also the grammaticality of the corresponding Could Kim’ve seen that? versus *Could Kimn’t see that?

If n’t is an affix, then phonologically reduced verbs (’s, ’ve, wanna, etc.), ellipsed VPs, null complementizers, gapped verbs, and deleted copies are clitics. If clitics may attach to material already containing clitics (29F), we allow (31a–d) but not (31e), which has an affix attached to could’ve, in violation of (29F).¹⁰

(31) a. Kim visited NY and Jim could’ve _VPe.

b. Kim visited NY but Jim couldn’t _VPe.

c. Kim visited NY but Jim couldn’t’ve _VPe.

d. I’d’ve visited NY.

e. *Jim could’ven’t seen it.
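
Criterion (29F) amounts to an ordering restriction on what may attach to a host, and it can be checked mechanically. The following is a minimal sketch, assuming that each word is given as its attachments in left-to-right order and that each attachment is already classified as an affix or a clitic; the function name and the encoding are hypothetical. It tests only (29F) and says nothing about other conditions on these forms.

```python
def obeys_criterion_f(attachments):
    """attachments: (form, kind) pairs, left to right, kind being 'affix' or 'clitic'.

    Returns False if an affix attaches to material that already contains a clitic.
    """
    seen_clitic = False
    for form, kind in attachments:
        if kind == "affix" and seen_clitic:
            return False
        if kind == "clitic":
            seen_clitic = True
    return True


# Attachments to the hosts could, I, and John (hand-coded for illustration).
forms = {
    "couldn't've": [("n't", "affix"), ("'ve", "clitic")],  # (31c)
    "I'd've": [("'d", "clitic"), ("'ve", "clitic")],       # (31d)
    "could'ven't": [("'ve", "clitic"), ("n't", "affix")],  # (31e)
    "John'sn't": [("'s", "clitic"), ("n't", "affix")],     # (28c)
}
for word, parts in forms.items():
    print(word, obeys_criterion_f(parts))
```

The two forms that fail, (31e) and (28c), are exactly those in which n’t follows a clitic.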

There is a vast literature on clitics and many distinctions are drawn; indeed, Arnold Zwicky argued in his later work that there are no clitics (Zwicky 1994). I have drawn selectively from that literature in arguing that the deletion sites discussed so far are clitics. However, it may be that the incorporation analysis of deletion is correct but that the incorporated elements are not clitics; the claims are logically distinct. Thinking of the deletion of copied phrases as cliticization enables us to understand old puzzles about the Fixed-Subject Condition (Bresnan 1972) and the that–trace effect of the 1970s, later subsumed under the agreement relations of Rizzi 1990. It also enables us to learn more about the cliticization operation. In general, subjects resist displacement; when they are copied into a displaced position, odd things happen (for discussion, see Lightfoot 2006b).¹¹

Not only do complementizers like that and how not generally host clitics (see note 11), neither do prepositions. This explains the well-known observation that generally prepositions do not license movement sites: French *Qui as-tu parlé avec?, Dutch *Wie heb je met gesproken?, ‘Who have you spoken with?’. In English, prepositions may be stranded like this, but only where they are themselves reanalyzed as part of a complex verb, as in (32a) (see Hornstein & Weinberg 1981 for discussion of the reanalysis operation); compare the ill-formed (32b,c), where the PP is not the complement of an adjacent verb (in (32b) it is not adjacent, in (32c) it is an adjunct) and consequently may not host the deleted copy.

(32) a. Whoi did you _Vtalk + to whoi?

b. *Whoi did you talk at the meeting to whoi?

c. *Whati did you sleep during whati?

I have argued that English speakers learn that certain verbs may be phonologically reduced, that complementizers may be null, that wh- phrases may be displaced (pronounced in positions other than where they are understood), that verbs may be gapped, that heavy DPs may be displaced to the right, that VPs may be ellipsed, and that possessive noun phrases may have objective interpretations. These seven variable properties are readily learnable from the linguistic environment, and we can point to plausible PLD. The data that all English-speaking children hear include sentences like Kim’s happy, manifesting reduction; Peter said Kay had left already (11a), exhibiting a null complementizer; Who did Jay see? (12a), with a displaced wh- phrase; Jay introduced Kay to Ray and Jim Kim to Tim (14b), an example of gapping; Jay gave to Ray his favorite racket (14g), heavy-DP shift; Max could visit Rio and Susan could, too (8), an ellipsed VP; and Jay’s picture (23), meaning ‘picture of Jay’.

The way to think of this, I believe, is that children identify certain structures, through understanding and assigning structure to what they experience, that is, through parsing; some of these structures reflect variable properties. Consider the object–verb-order parameter. If we take parsing to be the key, children find either _VP[DP V] or _VP[V DP] structures, very specific information. Children use structures or lose them: a child who builds object–verb _VP[DP V] into her I-language loses _VP[V DP] structures, which atrophy. Notice that children are reacting to abstract structures, elements of grammar, which are required to understand expressions that they hear; they identify only structures that are unambiguous.

I have argued that an empty element (a deleted phrasal copy, a null complementizer, an ellipsed VP, the ellipsed dancing in 20b,c) is incorporated or cliticized onto an adjacent phonological head (N, V, Infl) of which it is (in) the complement. This one simple idea at the level of UG interacts with seven grammar-specific devices, all demonstrably learnable, and that interaction yields a complex range of phenomena. This involves carving up the grammatical world differently.

We seek a single object: the genetically prescribed properties of the language organ. Those properties permit language acquisition to take place in the way that it does, and that means that we must examine language variation along the lines of Baker 2001; that yields a wealth of empirical considerations. Baker analogized parametric options in language to the elements of chemistry, claiming that the linguistic options are the basic building blocks of languages. That imputes much detailed information to UG in violation of Minimalist principles. What we postulate must solve the poverty-of-stimulus problems that we identify and solve them for all languages as well. We also want our ideas to be as elegant and economical as is feasible. In addition, the grammars that our theory of UG permits must meet other demands.

To take just one example, they must allow speech comprehension to take place in the way that it does. That means that considerations of parsing might drive proposals. That hasn’t happened much yet, but there is no principled reason why not, and the situation might change. Similarly, evidence drawn from brain imaging or even from brain damage might suggest grammatical properties. In fact, the proposals here look promising for studies of online parsing. When a person hears a displaced element, say a wh- phrase at the beginning of an expression, she needs to search for the deletion site, the position in which it needs to be understood. The ideas developed here restrict the places where she can look.

Here I have tried to sketch the details of what a good theory of parsing would lead a child to select. We are far from a satisfactory theory, but thinking in terms of how children interpret the contrasts they experience looks far more tractable than seeking to define UG-defined parameters of what constitutes what kind of clitic. The latter would entail postulating very rich information as part of UG, violating Minimalist aspirations.

One uses what looks like the best evidence available at any given time, but that will vary as research progresses, and consequently the form of our innateness claims will vary. There are many basic requirements that our hypotheses must meet, and there is no shortage of empirical constraints, and therefore there are many angles one may take on what we aim for. In this chapter I have taken one angle and progressed beyond where government took us: to delete an element is to cliticize it. This is certainly not the end of any story, but a reasonable way to proceed and an improvement on earlier accounts.

Notes

1. For example, Elbourne investigates the Salience Hypothesis, the idea that the different behavior of referential and quantificational antecedents does not reflect a real difference in the way that bound and referential pronouns are analyzed with respect to Principle B; rather, it arises because children interpret pronouns as referring to the most prominent characters.
2. Elbourne 2005 and Conroy et al. 2009 dispute Thornton and Wexler’s idea that Principle B violations occur only with nonquantificational antecedents. Conroy et al.’s paper constitutes a major clarification of the complex literature on the alleged “delay-of–Principle B effect.”
3. Lobeck 1995 and Zagona 1988 offer good Government and Binding accounts of VP ellipsis in terms of the Empty Category Principle of more than thirty years ago; this section adopts the spirit of those analyses in requiring a host to license ellipsed VPs and, in doing so, draws on Lightfoot 2006b.

As a simple illustration of the apparent role of a host for deletion sites, Potsdam 1997 observes the distinction between (ia) and (ib,c), where do and not appear to license an ellipsed VP; Potsdam notes a similar distinction between (iia) and (iib), where to licenses an ellipsed VP.

(i) a. *It is possible to eat this fruit, and we recommend that you _VP[eat this fruit].

b. It is possible to eat this fruit, and we recommend that you do _VP[eat this fruit].

c. It is possible to eat this fruit, and we recommend that you not _VP[eat this fruit].

(ii) a. *Kim began singing a song before Jim began _VP[singing a song].

b. Kim began to sing a song before Jim began to _VP[sing a song].
4. I assume here that restrictive relative clauses are complements to nouns, distinguishing (10b, 11b); I will take nonrestrictive relatives to be noncomplements, that is, adjuncts. We need a syntactic distinction between restrictive and nonrestrictive relative clauses, and restrictive relatives have some properties of complement clauses. What I am calling complement structures may be captured through Richard Kayne’s 1994 raising analysis of restrictive relatives, where a restrictive relative headed by that is the complement of D and the head raises out of the relative clause. For a good discussion of relative clauses and the problems they pose for modern theories of phrase structure, see Borsley 1997.
5. Bošković and Lasnik 2003, adopting ideas from Pesetsky 1991 (an unpublished extension of Pesetsky 1995), treats null complementizers as resulting from a phonological affixation operation. For Bošković and Lasnik, affixation requires adjacency, but the data of (11) show that head–complement relations are also crucially involved.
6. Notice that Which man did Jay introduce to Ray and Jim to Tim?, analogous to the ill-formed (14c), is well-formed. Here only one wh- phrase is overt and it moves across the board. One way of thinking of this is that across-the-board movement takes place on a three-dimensional structure before the two clauses are linearized; at that point which man is the complement of introduce (Williams 1978).
7. I also adopt the proposals of Nunes 2004, namely that deletion of the copied element follows from the linearization of chains. Linearization is a phonological operation that converts a syntactic structure into a sequence of items in consonance with the Linear-Correspondence Axiom of Kayne 1994. The two whats in a structure like (i) are nondistinct, and this leads to ordering contradictions. What must precede buy, for instance, but what must also follow buy. That is a contradiction: x cannot both precede and follow y.

(i) [what [did _IP[you did buy what]]]
It is this failure to yield a linear order that renders the structure ill-formed—unless one of the whats is deleted; and it must be the lower what, for reasons of the Binding Theory. So, the fact that there can be no chains in the phonology with more than one overtly realized link entails that the lower what in (i) must be deleted. Nunes offers a rich analysis, noting exceptional cases where multiple wh- items are pronounced; there he shows that it is only intermediate copies that may be pronounced, not the lowest copy, and, indeed, that these pronounced copies have clitic-like qualities (see Nunes 2004: 38–43 for discussion).
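
The ordering contradiction can be made concrete with a small sketch. It rests on a strong simplifying assumption: flat left-to-right order in a list of words stands in for the structure-based ordering of the Linear-Correspondence Axiom, and copies count as nondistinct simply because they are the same string. The function names are hypothetical.

```python
from itertools import combinations


def precedence_pairs(tokens):
    """All (x, y) such that some occurrence of x precedes some occurrence of y."""
    return {(x, y) for x, y in combinations(tokens, 2) if x != y}


def contradictions(tokens):
    """Unordered pairs of items that are required both to precede and to follow each other."""
    pairs = precedence_pairs(tokens)
    return {tuple(sorted(p)) for p in pairs if (p[1], p[0]) in pairs}


with_copies = ["what", "did", "you", "did", "buy", "what"]  # structure (i), flattened
lower_copies_deleted = ["what", "did", "you", "buy"]

print(contradictions(with_copies))
print(contradictions(lower_copies_deleted))
```

With both copies of what present, what and buy are each required to precede the other; once the lower copies are removed, no contradictory pair remains.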
8. There are many interesting distinctions at work. Compare, for example, I wonder what is/’s that up there, where reduction is possible. In this example there is no deletion site right-adjacent to is, and so is may be reduced. In (16c), however, the deletion site of what is between is and up, blocking reduction.
9. Another category of deletions different from ellipsed VPs is “pseudogapping,” when an auxiliary is present; such constructions are as bad as the gaps in (27), for example, *Which man did Jay introduce to Ray and which woman did Jim to Tim?, analogous to (14c), or *Jay wondered what Kay gave to Ray and what did Jim to Tim, analogous to (14d). Pseudogapping structures are often analyzed very differently (but see Lasnik 1999: chap. 7, which treats them as VP ellipsis plus “remnant raising”), and I avoid them here. A good theory of parsing would show how children make the right selections.
10. Notice that (31a) is well-formed but (9a) is not. In (31a) the null VP is the complement of the adjacent could’ve, but in (9a) it is not the complement of John’s. This also accounts for the following distinction.

(i) a. Kim canceled her subscription, and I would’ve _VPe, too.

b. *Kim canceled her subscription, and I’d’ve _VPe, too.
A null VP following the reduced ’ve in (ia) is the complement of would’ve, but a null VP following the reduced ’ve in (ib) is not the complement of I’d’ve, where the reduced auxiliaries have been cliticized to the subject DP and are no longer in the position of Infl with a VP complement.
11. In (ia) the complementizer contained in the clausal complement to thought cliticizes to thought straightforwardly and may be deleted (unpronounced). In (ib) the lowest who cliticizes to saw, the intermediate who to think, and the complementizer to think + who, successively like the clitics of (31), again straightforwardly.

(i) a. I thought [that/0 Ray saw Fay].

b. Whoi did you think [whoi that/0 Ray saw whoi]?

c. *Whoi did you think [whoi that whoi saw Fay]?
However, in (ic), the intermediate who cliticizes to think just like in (ib), but the lowest (boldface) who apparently cannot cliticize to that, presumably because that does not take complements in the usual sense (despite the name “complementizer,” the following clause does not “complete” the meaning of that in the way that Fay completes the meaning of saw) and is not an appropriate host. Likewise for equivalent complementizers in other languages.

Similarly, in (iia) the boldface who may not cliticize to the complementizer how, because how has no complement. Hence the difference with (iib), where each copied element is deleted in the appropriate way, and the sentence is grammatical if not completely felicitous.

(ii) a. *Whoi do you wonder [whoi how [whoi solved the problem]]?

b. Whati do you wonder [whati how [John solved whati]]?

There is also an interesting range of comparative data resulting from the resistance of subject DPs to displacement in several languages; see Rizzi 1990: §2.6 and Lightfoot 2006b: §2.5–§2.6 on French, West Flemish, Swedish, and Vata.