Many generations of productive scholarship notwithstanding, the questions to which this paper is addressed can receive only quite tentative answers. There are few languages for which descriptions in depth are available, and only selected aspects of language have been studied with sufficient care and success to provide support for conclusions of a general nature. Still, it is possible, with some degree of confidence, to outline certain properties and conditions that distinguish human languages among arbitrary systems of symbolic manipulation, communication, and self-expression.
At the crudest level of description, we may say that a language associates sound and meaning in a particular way; to have command of a language is to be able, in principle, to understand what is said and to produce a signal with an intended semantic interpretation. But aside from much unclarity, there is also a serious ambiguity in this crude characterization of command of language. It is quite obvious that sentences have an intrinsic meaning determined by linguistic rule and that a person with command of a language has in some way internalized the system of rules that determine both the phonetic shape of the sentence and its intrinsic semantic content – that he has developed what we will refer to as a specific linguistic competence. However, it is equally clear that the actual observed use of language – actual performance – does not simply reflect the intrinsic sound–meaning connections established by the system of linguistic rules. Performance involves many other factors as well. We do not interpret what is said in our presence simply by application of the linguistic principles that determine the phonetic and semantic properties of an utterance. Extralinguistic beliefs concerning the speaker and the situation play a fundamental role in determining how speech is produced, identified, and understood. Linguistic performance is, furthermore, governed by principles of cognitive structure (for example, by memory restrictions) that are not, properly speaking, aspects of language.
To study a language, then, we must attempt to disassociate a variety of factors that interact with underlying competence to determine actual performance; the technical term “competence” refers to the ability of the idealized speaker–hearer to associate sounds and meanings strictly in accordance with the rules of his language. The grammar of a language, as a model for idealized competence,1 establishes a certain relation between sound and meaning – between phonetic and semantic representations. We may say that the grammar of the language L generates a set of pairs (s, I), where s is the phonetic representation of a certain signal2 and I is the semantic interpretation assigned to this signal by the rules of the language. To discover this grammar is the primary goal of the linguistic investigation of a particular language.
The general theory of linguistic structure is concerned with discovering the conditions that any such grammar must meet. This general theory will be concerned with conditions of three kinds: conditions on the class of admissible phonetic representations, the class of admissible semantic representations, and the systems of rules that generate paired phonetic and semantic representations. In all three respects, human languages are subject to stringent limiting conditions. There is no difficulty in constructing systems that do not meet these conditions, and that do not, therefore, qualify as potential human languages despite the fact that they associate sound and meaning in some definite way. Human languages are systems of a highly specific kind. There is no a priori necessity for a system relating sound and meaning to be of this kind. As this paper proceeds, we shall mention some of the highly restrictive conditions that appear to be essential properties of human language.
A grammar generates a certain set of pairs (s, I), where s is a phonetic representation and I its associated semantic interpretation. Similarly, we might think of a performance model as relating sound and meaning in a specific way. A perceptual model, PM, for example, might be described, as in 1, as a device that accepts a signal as input (along with much else) and assigns various grammatical representations as “output.”
1 | [Diagram: signal (input) → PM → grammatical representations (output)] |
A central problem for psychology is to discover the characteristics of a system PM of this sort. Clearly, in understanding a signal, a hearer brings to bear information about the structure of his language. In other words, the model PM incorporates the grammar G of a language. The study of how sentences are understood – the general problem of speech perception – must, obviously, remain within narrow limits unless it makes use of this basic property of a perceptual model. But it is important to distinguish clearly between the function and properties of the perceptual model PM and the competence model G that it incorporates. Both G and PM relate sound and meaning; but PM makes use of much information beyond the intrinsic sound–meaning association determined by the grammar G, and it operates under constraints of memory, time, and organization of perceptual strategies that are not matters of grammar. Correspondingly, although we may describe the grammar G as a system of processes and rules that apply in a certain order to relate sound and meaning, we are not entitled to take this as a description of the successive acts of a performance model such as PM – in fact, it would be quite absurd to do so. What we have said regarding perceptual models is equally applicable to production models. The grammatical rules that generate phonetic representations of signals with their semantic interpretations do not constitute a model for the production of sentences, although any such model must incorporate the system of grammatical rules. If these simple distinctions are overlooked, great confusion must result.
In this paper, attention is focused on competence and the grammars that characterize it; when speaking of semantic and phonetic interpretation of sentences, we refer exclusively to the idealized representations determined by this underlying system. Performance provides data for the study of linguistic competence. Competence, in the sense just described, is one of many factors that interact to determine performance. In general, we would expect that in studying the behavior of a complex organism, it will be necessary to isolate such essentially independent underlying systems as the system of linguistic competence, each with its intrinsic structure, for separate attention.
Turning to the study of underlying competence, let us first take note of a few very obvious properties of the grammar of a human language. It is, first of all, quite clear that the set of paired phonetic and semantic representations generated by the grammar will be infinite. There is no human language in which it is possible, in fact or in principle, to specify a certain sentence as the longest sentence meaningful in this language. The grammar of any language contains devices that make it possible to form sentences of arbitrary complexity, each with its intrinsic semantic interpretation. It is important to realize that this is no mere logical nicety. The normal use of language relies in an essential way on this unboundedness, on the fact that language contains devices for generating sentences of arbitrary complexity. Repetition of sentences is a rarity; innovation, in accordance with the grammar of the language, is the rule in ordinary day-by-day performance. The idea that a person has a “verbal repertoire” – a stock of utterances that he produces by “habit” on an appropriate occasion – is a myth, totally at variance with the observed use of language. Nor is it possible to attach any substance to the view that the speaker has a stock of “patterns” in which he inserts words or morphemes. Such conceptions may apply to greetings, a few cliches, and so on, but they completely misrepresent the normal use of language, as the reader can easily convince himself by unprejudiced observation.3
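The unboundedness just described can be illustrated with a toy sketch in Python. The lexical items and the single recursive embedding rule below are invented for illustration and are not a fragment of any actual grammar of English. The point is only that a finite rule system yields a distinct sentence for every depth of embedding, so that no sentence can be singled out as the longest.

```python
def sentence(depth):
    """Return a sentence with `depth` levels of embedding.

    Hypothetical toy rule (an assumption, not from the text):
        S -> "John left" | "Bill said that " S
    """
    if depth == 0:
        return "John left"
    return "Bill said that " + sentence(depth - 1)


def first_sentences(n):
    """List the first n sentences, ordered by depth of embedding."""
    return [sentence(d) for d in range(n)]


if __name__ == "__main__":
    for s in first_sentences(3):
        print(s)
```

Every call with a larger `depth` returns a longer sentence built by the same rule, which is the sense in which finite means are put to unbounded use.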
To discover the grammar of some language user, we must begin by obtaining information that bears on his interpretation of sentences, on the semantic, grammatical, and phonetic structure that he assigns to them. For example, for the study of English, it would be important to discover such facts as the following. Consider the sentence frames 2 and the words “persuaded,” “expected,” and “happened”:
2 |
|
The word “persuaded” can be inserted in a and b, but not c or d; “expected” can be inserted in b, c, d, but not a; “happened” can be inserted only in c. Inserting “persuaded” in a, we derive an ambiguous sentence, the interpretation of which depends on the reference of “he”; under one interpretation, the sentence is a near paraphrase of b, with “persuaded” inserted. When “expected” appears in b and c, the subject–verb relation holds between “Bill” and “leave” in b, but between “John” and “leave” in c. The sentence “John happened to leave” has roughly the same meaning as “It happened that John left,” but “John expected to leave” is not even a remote paraphrase of “It expected that John left.” Such facts as these can be stated in many ways, and we might use one or another technique to make sure of their accuracy. These are facts about the competence of the speaker of English. They can be used as a basis for discovering his internalized grammar.
Let us consider the status of such observations with slightly greater care. These observations actually bear directly on the output of a perceptual model such as 1; they relate to the structures assigned to signals by the hearer. Our characterization of the output of 1 is a construct based on evidence of this sort. Then, the perceptual model PM itself is a second-order construct. Abstracting further, we can study the grammar that constitutes one fundamental component of 1 as a third-order construct. Thus the evidence cited in the preceding paragraph actually has a bearing on grammar only indirectly. We must, in other words, presuppose the legitimacy of each abstraction. There seems little question of the legitimacy of abstraction in such cases as these, and there is an overwhelming mass of evidence of the sort cited. Once again, we note that idealization of the kind just described is inescapable if a complex organism is to be studied in a serious way.
This process of abstraction can be carried one step further. Consider an acquisition model AM that uses linguistic data to discover the grammar of the language to which this data pertains.
3 | [Diagram: linguistic data (input) → AM → grammar (output)] |
Just how the device AM selects a grammar will be determined by its internal structure, by the methods of analysis available to it, and by the initial constraints that it imposes on any possible grammar. If we are given information about the pairing of linguistic data and grammars, we may try to determine the nature of the device AM. Although these are not the terms that have been used, linguistics has always been concerned with this question. Thus modern structural linguistics has attempted to develop methods of analysis of a general nature, independent of any particular language, and an older and now largely forgotten tradition attempted to develop a system of universal constraints that any grammar must meet. We might describe both these attempts as concerned with the internal structure of the device AM, with the innate conception of “human language” that makes language acquisition possible.4
Let us now turn to the study of underlying competence, and consider the general problem of how a sound–meaning pairing might be established. As a preliminary to this investigation of universal grammar, we must ask how sounds and meanings are to be represented. Since we are interested in human languages in general, such systems of representation must be independent of any particular language. We must, in other words, develop a universal phonetics and a universal semantics that delimit, respectively, the set of possible signals and the set of possible semantic representations for any human language. It will then be possible to speak of a language as a particular pairing of signals with semantic interpretations, and to investigate the rules that establish this pairing. Our review of the general properties of language thus falls naturally into three parts: a discussion of universal phonetics, of universal semantics, and of the overarching system of universal grammar. The first two topics involve the representation of idealized form and semantic content; the theory of universal grammar deals with the mechanisms used in natural languages to determine the form of a sentence and its semantic content.
The importance of developing a universal semantics and universal phonetics, in the sense of the last paragraph, was clearly recognized long before the development of modern linguistics. For example, Bishop Wilkins in his Essay Towards a Real Character and a Philosophical Language (1668) attempted to develop a universal phonetic alphabet and a universal catalogue of concepts in terms of which, respectively, the signals and semantic interpretations for any language can be represented. The phonetic alphabet is based on a system of phonetic properties developed in terms of point and manner of articulation. Each phonetic symbol is analyzable as a set of such properties; in modern terms, it is analyzable as a set of distinctive features. It is furthermore tacitly assumed that the physical signal is determined, by language-independent principles, from its representation in terms of phonetic symbols. The concepts that are proposed as units of semantic interpretation are also analyzable into fixed properties (semantic features) of some sort, for example, animate-inanimate, relational-absolute, agent-instrument, etc. It is tacitly assumed that the semantic interpretation of a sentence is determined by universal, language-independent principles from the concepts comprised in the utterance and the manner in which they are grammatically related (for example, as subject-predicate).5 Although the defects in execution in such pioneering studies as that of Wilkins are obvious, the general approach is sound. The theory of universal phonetics has been intensively pursued along the lines just indicated with considerable success; the parallel theory of universal semantics has, in contrast, been very little studied.
The theory of universal phonetics attempts to establish a universal phonetic alphabet and a system of laws. The alphabet defines the set of possible signals from which the signals of a particular language are drawn. If the theory is correct, each signal of a language can be represented as a sequence of symbols of the phonetic alphabet. Suppose that two physical events are represented as the same sequence. Then in any language they must be repetitions of one another.6 On the other hand, two physical events might be regarded by speakers of one language as repetitions and by speakers of another language as nonrepetitions. In this case, the universal alphabet must provide the means for distinguishing them. Representation in terms of the universal alphabet should provide whatever information is necessary to determine how the signal may be produced, and it should, at the same time, correspond to a refined level of perceptual representation. We stress once again, however, that actual performance involves other factors beyond ideal phonetic representation.
The symbols of the universal phonetic alphabet are not the “primitive elements” of universal phonetic theory. These primitive elements include, rather, what have been called (phonetic) distinctive features, properties such as voicing, frontness–backness, stress, etc.7 Each of these features can be thought of as a scale in terms of which two or more values can be distinguished (how many values need be distinguished is an open question, but the number is apparently quite small for each feature). A symbol of the phonetic alphabet is properly to be regarded as a set of features, each with a specified value. A signal, then, is represented as a sequence of such sets.
Three obvious properties of language are reflected in a phonetic theory of this sort. The first is its discreteness – the fact that only a determinable finite number of signals of any given length can be nonrepetitions. The second property is the unboundedness of language – the fact that a signal can be of arbitrary length, so that a language will contain infinitely many semantically interpreted signals. In addition to these formal properties, a phonetic theory of this sort reflects the fact that two segments of a signal, represented by two symbols of the universal alphabet, may be alike in certain respects and distinct in others; and that there are, furthermore, a fixed number of such dimensions of sameness and difference and a fixed number of potentially significant points along these dimensions. Thus, the initial segments of pin and bin8 differ with respect to voicing and aspiration but not (significantly) with respect to point of articulation; the two consonants of cocoa differ with respect to neither point of articulation nor voicing, but only with respect to aspiration; etc.
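The comparison of segments along fixed dimensions can be sketched as follows. This is a minimal illustration, assuming a drastically reduced feature inventory (point of articulation, voicing, aspiration) and two-valued features. The variable names and the encoding are assumptions made for the example, not part of the phonetic theory itself.

```python
# Each segment is a mapping from feature name to value.
# Only the three features mentioned in the text are modeled (an assumption).
P_IN_PIN = {"point": "labial", "voiced": False, "aspirated": True}
B_IN_BIN = {"point": "labial", "voiced": True, "aspirated": False}
FIRST_C_IN_COCOA = {"point": "velar", "voiced": False, "aspirated": True}
SECOND_C_IN_COCOA = {"point": "velar", "voiced": False, "aspirated": False}


def differing_features(a, b):
    """Return the set of features on which two segments have different values."""
    return {feature for feature in a if a[feature] != b[feature]}


if __name__ == "__main__":
    print(sorted(differing_features(P_IN_PIN, B_IN_BIN)))
    print(sorted(differing_features(FIRST_C_IN_COCOA, SECOND_C_IN_COCOA)))
```

Because the set of features is fixed in advance, any two segments can be compared in this way, and the result is always a subset of the same finite list of dimensions.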
It is important to note that the distinctive features postulated in universal phonetic theory are absolute in several senses but relative in others. They are absolute in the sense that they are fixed for all languages. If phonetic representation is to provide sufficient information for identification of a physical signal, then specification of feature values must also be absolute. On the other hand, the features are relative when considered in terms of the notion of repetition–nonrepetition. For example, given three absolute values designated 1, 2, 3 in terms of the feature front–back, we might find that in language L1 two utterances that differ only in the values 1, 2 of frontness–backness are distinguished as nonrepetitions but utterances differing only in the values 2, 3 are not; whereas in language L2 the opposite might be the case. Each language would use the feature front–back to distinguish nonrepetitions, but the absolute value 2 that is “front” in one language would be “back” in the other.
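The L1/L2 example can be made concrete with a small sketch. The table below encodes the three absolute values and the way each language groups them. Treating value 1 as the "front" end of the scale is an assumption made for the illustration, and the names `CATEGORY` and `is_nonrepetition` are hypothetical.

```python
# Absolute values 1, 2, 3 on the front-back scale, grouped per language.
# Assumption: value 1 lies at the "front" end of the scale.
CATEGORY = {
    "L1": {1: "front", 2: "back", 3: "back"},   # boundary between 1 and 2
    "L2": {1: "front", 2: "front", 3: "back"},  # boundary between 2 and 3
}


def is_nonrepetition(language, value_a, value_b):
    """True if utterances differing only in these two values are distinguished."""
    return CATEGORY[language][value_a] != CATEGORY[language][value_b]


if __name__ == "__main__":
    for language in ("L1", "L2"):
        print(language, is_nonrepetition(language, 1, 2),
              is_nonrepetition(language, 2, 3))
```

Both languages use the same feature and the same absolute scale; they differ only in where the boundary falls, so that the absolute value 2 counts as "back" in L1 and as "front" in L2.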
In addition to a system of distinctive features, a universal phonetic theory will also attempt to formulate certain laws that govern the permitted sequences and permitted variety of selection in a particular language. For example, Jakobson has observed that no language uses both the feature labialization and the feature velarization for distinguishing nonrepetitions, and he has suggested a more general formulation in terms of which these two features can be regarded as variants of a single, more abstract feature. Generalizations of this sort – particularly when they can be supported by rational argument – can be proposed as laws of universal phonetics.
Although universal phonetics is a fairly well-developed subject, the same cannot be said of universal semantics. Here, too, we might hope to establish a universal system of semantic features and laws regarding their interrelations and permitted variety. In fact, the problem of determining such features and such laws has once again become a topic of serious investigation in the past few years,9 and there is some promise of fruitful development. It can be seen at once that an analysis of concepts in terms of such features as animateness, action, etc. (see p. 107), will hardly be adequate, and that certain features must be still more abstract. It is, for example, a fact of English that the phrase “a good knife” means “a knife which cuts well.” Consequently the concept “knife” must be specified in part in terms of features having to do with characteristic functions (not just physical properties), and in terms of an abstract “evaluation feature”10 that is determined by such modifiers as “good,” “terrible,” etc. Only by such an analysis can the semantic relationship between “this is a good knife” and “this knife cuts well” be established. In contrast, the irrelevance of “this is a good knife for digging with” to “this knife cuts well” shows that the semantic interpretation of a sentence is determined by grammatical relations of a sort that are by no means transparent.
As in the case of universal phonetics, we might hope to establish general principles regarding the possible systems of concepts that can be represented in a human language and the intrinsic connections that may exist among them. With the discovery of such principles, universal semantics would become a substantive discipline.
Suppose that a satisfactory theory of universal phonetics and of universal semantics were at hand. We could then define a language as a set of sentences, where a sentence is a particular kind of sound–meaning pair, and go on to study the systems of rules that define human languages. But in fact only the theory of universal phonetics is sufficiently well established to support this enterprise. Consequently, we must approach the study of language structure in a slightly more indirect way.
Notice that although the notion “semantic representation” is itself far from clear, we can, nevertheless, find innumerable empirical conditions that an explication of this notion must meet. Consider, for example, the following sentence:
4 | What disturbed John was being disregarded by everyone. |
It is clear, first of all, that this expression has two distinct interpretations. Under one interpretation, it means that John was disturbed by the fact that everyone disregarded him; under the second, it means that everyone was disregarding the things that disturb John. Under the first of these interpretations, a certain grammatical relation holds between “disregard” and “John,” namely the same relation that holds between these items in “Everyone disregards John” (the “verb–object” relation). Under the second interpretation neither this nor any other grammatically significant relation holds between “disregard” and “John.” On the other hand, if we insert the word “our” between “was” and “being,” the sentence is unambiguous, and no grammatical relation holds between “disregard” and “John,” although the verb–object relation now holds between “disregard” and “we” (an underlying element of “our”).
Examples of this sort can be elaborated indefinitely. They provide conditions of adequacy that the notion “semantic interpretation” must meet (for example, relations of paraphrase and implication and the property of ambiguity must be correctly reflected), and they illustrate clearly some of the ways in which the semantic interpretations of linguistic expressions must be determined from those of their grammatically related parts.
From such considerations, we are led to formulate a more restricted but quite significant immediate goal for the study of linguistic structure. Still taking a language to be a set of sentences, let us consider each abstract “sentence” to be a specific pairing of a phonetic representation with an abstract structure of some sort (let us call it a deep structure) that incorporates information relevant to semantic interpretation. We can then study the system of rules that determines this pairing, in a particular language, and the general characteristics of such rules. This enterprise will be significant to the extent that these underlying deep structures do actually provide a way to meet the empirical conditions on semantic interpretation. Semantic theory, as it progresses, will then provide means for enriching deep structures and associating semantic interpretations with them. The empirical significance of a full theory of grammar, comprising a universal phonetics, semantics, and syntax, will depend in part on the extent to which conditions on semantic interpretation can be satisfied by systematic use of the devices and principles that this theory supplies.
Summarizing these remarks, let us establish the following framework for the study of linguistic structure. The grammar of a language is a system of rules that determines a certain pairing of sound and meaning. It consists of a syntactic component, a semantic component, and a phonological component. The syntactic component defines a certain (infinite) class of abstract objects (D, S), where D is a deep structure and S a surface structure. The deep structure contains all information relevant to semantic interpretation; the surface structure, all information relevant to phonetic interpretation. The semantic and phonological components are purely interpretive. The former assigns semantic interpretations to deep structures; the latter assigns phonetic interpretations to surface structures. Thus the grammar as a whole relates semantic and phonetic interpretations, the association being mediated by the rules of the syntactic component that define paired deep and surface structures. The study of the three components will, of course, be highly integrated; each can be investigated to the extent that it is clear what conditions the others impose upon it.
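The division of labor among the three components can be pictured schematically. The sketch below is only a data-flow diagram in code, under strong simplifying assumptions: deep and surface structures are stood in for by opaque strings, and the two interpretive components are placeholder functions that merely wrap their input. None of the names corresponds to an actual rule system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntacticObject:
    """A pair (D, S) defined by the syntactic component (placeholder strings)."""
    deep: str     # D: carries the information relevant to semantic interpretation
    surface: str  # S: carries the information relevant to phonetic interpretation


def semantic_component(deep):
    """Placeholder: assign a semantic interpretation I to a deep structure."""
    return "I(" + deep + ")"


def phonological_component(surface):
    """Placeholder: assign a phonetic representation s to a surface structure."""
    return "s(" + surface + ")"


def sound_meaning_pair(obj):
    """Return the pair (s, I) that the grammar associates via (D, S)."""
    return (phonological_component(obj.surface), semantic_component(obj.deep))


if __name__ == "__main__":
    print(sound_meaning_pair(SyntacticObject(deep="D1", surface="S1")))
```

The sketch makes one structural point visible: the interpretive components never see each other's input, so sound and meaning are related only through the pairing (D, S) supplied by the syntactic component.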
This formulation should be regarded as an informal first approximation. When we develop a precise theory of grammatical structure – for example, the particular version of the theory of transformational grammar sketched below – we will provide a technical meaning for the terms “deep structure” and “surface structure,” and in terms of these technical meanings, we can then raise the empirical (not conceptual) question of how deep and surface structures contribute to and determine semantic and phonetic interpretations. In the technical sense that is given to the concepts of deep and surface structure in the theory outlined below, it seems to me that present information suggests that surface structure completely determines phonetic interpretation and that deep structure completely determines certain highly significant aspects of semantic interpretation. But the looseness of the latter term makes a more definite statement impossible. In fact, I think that a reasonable explication of the term “semantic interpretation” would lead to the conclusion that surface structure also contributes in a restricted but important way to semantic interpretation, but I will say no more about this matter here.
Universal grammar might be defined as the study of the conditions that must be met by the grammars of all human languages. Universal semantics and phonetics, in the sense described earlier, will then be a part of universal grammar. So defined, universal grammar is nothing other than the theory of language structure. This seems in accord with traditional usage. However, only certain aspects of universal grammar were studied until quite recently. In particular, the problem of formulating the conditions that must be met by the rules of syntax, phonology, and semantics was not raised in any explicit way in traditional linguistics, although suggestive and nontrivial steps toward the study of this problem are implicit in much traditional work.11
A grammar of the sort described previously, which attempts to characterize in an explicit way the intrinsic association of phonetic form and semantic content in a particular language, might be called a generative grammar 12 to distinguish it from descriptions that have some different goal (for example, pedagogic grammars). In intention, at least, traditional scholarly grammars are generative grammars, although they fall far short of achieving the goal of determining how sentences are formed or interpreted. A good traditional grammar gives a full exposition of exceptions to rules, but it provides only hints and examples to illustrate regular structures (except for trivial cases – for example, inflectional paradigms). It is tacitly presumed that the intelligent reader will use his “linguistic intuition” – his latent, unconscious knowledge of universal grammar – to determine the regular structures from the presented examples and remarks. The grammar itself does not express the deep-seated regularities of the language. For the purpose of the study of linguistic structure, particular or universal, such grammars are, therefore, of limited value. It is necessary to extend them to full generative grammars if the study of linguistic structure is to be advanced to the point where it deals significantly with regularities and general principles. It is, however, important to be aware of the fact that the concept “generative grammar” itself is no very great innovation. The fact that every language “makes infinite use of finite means” (Wilhelm von Humboldt) has long been understood. Modern work in generative grammar is simply an attempt to give an explicit account of how these finite means are put to infinite use in particular languages and to discover the deeper properties that define “human language,” in general (that is, the properties that constitute universal grammar).
We have been concerned thus far only with clarification of concepts and setting of goals. Let us now turn to the problem of formulating hypotheses of universal grammar.
The syntactic component of a generative grammar defines (generates) an infinite set of pairs (D, S), where D is a deep structure and S is a surface structure; the interpretive components of the grammar assign a semantic representation to D and a phonetic representation to S.
Let us first consider the problem of assigning phonetic representations to surface structures. As in the previous discussion of universal phonetics, we take a phonetic representation to be a sequence of symbols of the universal phonetic alphabet, each symbol being analyzed into distinctive features with specific values. Stating the same idea slightly differently, we may think of a phonetic representation as a matrix in which rows correspond to features of the universal system, columns correspond to successive segments (symbols of the phonetic alphabet), and each entry is an integer that specifies the value of a particular segment with respect to the feature in question. Our problem, then, is to determine what information must be contained in the surface structure, and how the rules of the phonological component of the grammar use this information to specify a phonetic matrix of the sort just described.
Consider once again the example 4, which we repeat in 5 for ease of reference:
5 | What # disturb-ed # John # was # be-ing # dis-regard-ed # by # every-one. |
To first approximation,13 we may think of 5 as a sequence of the formatives “what,” “disturb,” “ed,” “John,” “was,” “be,” “ing,” “dis,” “regard,” “ed,” “by,” “every,” “one,” with the junctures represented by the symbols # and – in the positions indicated in 5. These junctures specify the manner in which formatives are combined; they provide information which is required by the interpretive rules of the phonological component. A juncture must, in fact, be analyzed as a set of features, that is, as a single-column matrix in which the rows correspond to certain features of the junctural system and each entry is one of two values which we may represent as + or –. Similarly, each formative will be analyzed as a matrix in which columns stand for successive segments, rows correspond to certain categorial features, and each entry is either + or –. Therefore, the entire sentence 5 can be regarded as a single matrix with the entries + and –.14
The categorial features include the universal features of the phonetic system, along with diacritic features which essentially indicate exceptions to rules. Thus the matrix corresponding to “what,” in the dialect in which the corresponding phonetic representation is [wat], will contain three segments, the first specified as a labial glide, the second as a low back unrounded vowel, the third as an unvoiced dental stop consonant (these specifications given completely in terms of the + and – values of features supplied by the universal phonetic system). The rules of the phonological component, in this case, will convert this specification in terms of + and – values into a more detailed specification in terms of integers, in which the value of each segment with respect to the phonetic features (for example, tongue height, degree of aspiration, etc.) is indicated to whatever degree of accuracy is required by the presupposed theory of universal phonetics, and with whatever range of variation is allowed by the language. In this example, the assigned values will simply refine the bifurcation into + and – values given in the underlying matrix for “what” in 5.
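The refinement of a classificatory matrix into a phonetic matrix, in a simple case of this kind, can be sketched as follows. The feature names (`high`, `back`, `round`, `voiced`), the + and – entries chosen for each segment, and the integer scales are simplified assumptions introduced for illustration; they are not the feature system or the values of any actual phonological analysis.

```python
# Columns: the three segments of "what" in the dialect with phonetic form [wat].
SEGMENTS = ["w", "a", "t"]

# Classificatory matrix: rows are features, entries are "+" or "-".
# Feature inventory and entries are illustrative assumptions.
CLASSIFICATORY = {
    "high":   ["+", "-", "-"],
    "back":   ["+", "+", "-"],
    "round":  ["+", "-", "-"],
    "voiced": ["+", "+", "-"],
}

# Hypothetical language-particular scales: how "+" and "-" are realized
# as integer values for each feature.
SCALE = {
    "high":   {"+": 3, "-": 1},
    "back":   {"+": 2, "-": 0},
    "round":  {"+": 2, "-": 0},
    "voiced": {"+": 1, "-": 0},
}


def phonetic_matrix(classificatory, scale):
    """Replace each +/- entry by the integer value the scale assigns to it."""
    return {
        feature: [scale[feature][entry] for entry in row]
        for feature, row in classificatory.items()
    }


if __name__ == "__main__":
    for feature, row in phonetic_matrix(CLASSIFICATORY, SCALE).items():
        print(feature.ljust(7), dict(zip(SEGMENTS, row)))
```

In this sketch the integer values do no more than refine the two-way division already present in the underlying matrix: no entry changes category, and no segment is inserted, deleted, or rearranged.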
The example just cited is unusually simple, however. In general, the rules of the phonological component will not only give a finer specification of the underlying division into + and – values, but will also change values significantly and, perhaps, insert, delete, or rearrange segments. For example, the formative “by” will be represented with an underlying matrix consisting of two columns, the second of which is specified as a high front-vowel (specification given in terms of values of features). The corresponding phonetic matrix, however, will consist of three columns, the second of which is specified as a low back-vowel and the third as a palatal glide (the specification here being in terms of integral valued entries in a phonetic matrix).15
The surface structure of 5, then, is represented as a matrix in which one of two values appears in each entry. The fact that only two values may appear indicates that this underlying matrix really serves a purely classificatory function. Each sentence is classified in such a way as to distinguish it from all other sentences, and in such a way as to determine just how the rules of the phonological component assign specific positional phonetic values. We see, then, that the distinctive features of the universal phonetic system have a classificatory function in the underlying matrix constituting a part of the surface structure, and a phonetic function in the matrix constituting the phonetic representation of the sentence in question. Only in the former function are the distinctive features uniformly binary; only in the latter do they receive a direct physical interpretation.
The underlying classificatory matrix just described does not exhaust the information required by the interpretive phonological rules. Beyond this, it is necessary to know how the sentence in question is subdivided into phrases of varying size, and what types of phrase these are. In the case of 5, for example, phonological interpretation requires the information that “disturb” and “disregard” are verbs, that “what disturbed John” is a noun phrase, that “John was being” is not a phrase at all, and so on. The relevant information can be indicated by a proper bracketing of the sentence with labeled brackets.16 The unit contained within paired brackets [A and ]A will be referred to as a phrase of the category A. For example, the sequence “what # disturbed # John” in 5 will be enclosed within the brackets [NP,]NP, indicating that it is a noun phrase; the formative “disturb” will be enclosed within the brackets [V,]V, indicating that it is a verb; the whole expression 5 will be enclosed within the brackets [S,]S, indicating that it is a sentence; the sequence “John was being” will not be enclosed within paired brackets, since it is no phrase at all. To take an extremely simple example, the sentence “John saw Bill” might be represented in the following way as a surface structure, where each orthographically represented item is to be regarded as a classificatory matrix:
6 | [S [NP [N John]N ]NP [VP [V saw]V [NP [N Bill]N ]NP ]VP ]S |
This representation indicates that “John” and “Bill” are nouns (N’s) and “saw” a verb (V); that “John” and “Bill” are, furthermore, noun phrases (NP’s); that “saw Bill” is a verb phrase (VP); and that “John saw Bill” is a sentence (S). It seems that interpretation of a sentence by the phonological component of the grammar invariably requires information which can be represented in the way just described. We therefore postulate that the surface structure of a sentence is a properly labeled bracketing of a classificatory matrix of formatives and junctures.
The phonological component of a grammar converts a surface structure into a phonetic representation. We have now given a rough specification of the notions “surface structure” and “phonetic representations.” It remains to describe the rules of the phonological component and the manner in which they are organized.
The evidence presently available suggests that the rules of the phonological component are linearly ordered in a sequence R1, . . . Rn, and that this sequence of rules applies in a cyclic fashion to a surface structure in the following way. In the first cycle of application, the rules R1, . . . , Rn apply in this order to a maximal continuous part of the surface structure containing no internal brackets. After the last of these rules has applied, innermost brackets are erased and the second cycle of application is initiated. In this cycle, the rules again apply in the given order to a maximal continuous part of the surface structure containing no internal brackets. Innermost brackets are then erased, and the third cycle is initiated. The process continues until the maximal domain of phonological processes (in simple cases, the entire sentence) is reached. Certain of the rules are restricted in application to the level of word-boundary – they apply in the cycle only when the domain of application is a full word. Others are free to iterate at every stage of application. Notice that the principle of cyclic application is highly intuitive. It states, in effect, that there is a fixed system of rules that determines the form of large units from the (ideal) form of their constituent parts.
We can illustrate the principle of cyclic application with some rules of stress assignment in English. It seems to be a fact that although phonetic representations for English must allow five or six different values along the distinctive feature of stress, nevertheless, all segments can be unmarked with respect to stress in surface structures – that is, stress has no categorial function (except highly marginally) as a distinctive feature for English. The complex stress contours of the phonetic representation are determined by such rules as 7 and 8.17
7 | Assign primary stress to the left-most of two primary stressed vowels, in nouns. |
8 | Assign primary stress to the right-most stress-peak, where a vowel V is a stress-peak in a certain domain if this domain contains no vowel more heavily stressed than V. |
Rule 7 applies to nouns with two primary stresses; rule 8 applies to a unit of any other kind. The rules apply in the order 7, 8, in the cyclic manner described above. By convention, when primary stress is assigned in a certain position, all other stresses are weakened by one. Notice that if a domain contains no stressed vowel, then rule 8 will assign primary stress to its right-most vowel.
To illustrate these rules, consider first the surface structure 6. In accordance with the general principle of cyclic application, the rules 7 and 8 first apply to the innermost units [N John]N, [V saw]V, and [N Bill]N. Rule 7 is inapplicable; rule 8 applies, assigning primary stress to the single vowel in each case. Innermost brackets are then erased. The next cycle deals with the units [NP John]NP and [NP Bill]NP and simply reassigns primary stress to the single vowel, by rule 8. Innermost brackets are then erased, and we have the unit [VP saw¹ Bill¹]VP (superscripts marking the stress level of each word, 1 being primary) as the domain of application of the rules. Rule 7 is again inapplicable, since this is not a noun; rule 8 assigns primary stress to the vowel of “Bill,” weakening the stress on “saw” to secondary. Innermost brackets are erased, and we have the unit [S John¹ saw² Bill¹]S as the domain of application. Rule 7 is again inapplicable, and rule 8 assigns primary stress to “Bill,” weakening the other stresses and giving John² saw³ Bill¹, which can be accepted as an ideal representation of the stress contour.
Consider now the slightly more complex example “John’s black-board eraser.” In the first application of the cycle, rules 7 and 8 apply to the innermost bracketed units “John,” “black,” “board,” “erase”; rule 7 is inapplicable, and rule 8 assigns primary stress in each case to the right-most vowel (the only vowel, in the first three). The next cycle involves the units “John’s” and “eraser,” and is vacuous.18 The domain of application for the next cycle is [N black¹ board¹]N (a superscript 1 marking primary stress, higher numerals weaker stress). Being a noun, this unit is subject to rule 7, which assigns primary stress to “black,” weakening the stress on “board” to secondary. Innermost brackets are erased, and the domain of application for the next cycle is [N black¹ board² eraser¹]N. Again rule 7 applies, assigning primary stress to “black” and weakening all other stresses by one. In the final cycle, the domain of application of the rules is [NP John’s¹ black¹ board³ eraser²]NP. Rule 7 is inapplicable, since this is a full noun phrase. Rule 8 assigns primary stress to the right-most primary stressed vowel, weakening all the others and giving John’s² black¹ board⁴ eraser³.
In this way, a complex phonetic representation is determined by independently motivated and very simple rules, applying in accordance with the general principle of the cycle.
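The two derivations just given can be traced mechanically. The Python sketch below is only one possible rendering, offered as an illustration rather than as the formal statement of the rules. It assumes that each word carries a single stress-bearing vowel (so “John’s” and “eraser” are treated as atomic), that surface structures are written as nested tuples, and that rule 7 applies to a noun containing at least two primary stresses. The names `apply_rules` and `cycle` are hypothetical. Working outward from the innermost units stands in for the erasure of innermost brackets after each cycle.

```python
# Surface structures as nested tuples (label, child, ...); a leaf is a word.
# Assumption: one stress-bearing vowel per word; None means "not yet stressed".

def apply_rules(label, stresses):
    """Apply rule 7 or rule 8 once to a domain, weakening all other stresses."""
    primaries = [i for i, s in enumerate(stresses) if s == 1]
    if label == "N" and len(primaries) >= 2:
        target = primaries[0]                  # rule 7: left-most primary, in nouns
    else:
        marked = [s for s in stresses if s is not None]
        if marked:
            peak = min(marked)                 # heaviest stress in the domain
            target = max(i for i, s in enumerate(stresses) if s == peak)
        else:
            target = len(stresses) - 1         # no stress yet: right-most vowel
    # Convention: the chosen position becomes primary, other stresses weaken by one.
    return [1 if i == target else (None if s is None else s + 1)
            for i, s in enumerate(stresses)]

def cycle(node):
    """Return (word, stress) pairs after cyclic application, innermost units first."""
    if isinstance(node, str):
        return [(node, None)]
    label, *children = node
    items = [item for child in children for item in cycle(child)]
    new = apply_rules(label, [s for _, s in items])
    return [(word, s) for (word, _), s in zip(items, new)]

JOHN_SAW_BILL = ("S", ("NP", ("N", "John")),
                      ("VP", ("V", "saw"), ("NP", ("N", "Bill"))))
ERASER = ("NP", ("NP", ("N", "John's")),
                ("N", ("N", ("A", "black"), ("N", "board")), ("N", "eraser")))

print(cycle(JOHN_SAW_BILL))  # [('John', 2), ('saw', 3), ('Bill', 1)]
print(cycle(ERASER))  # [("John's", 2), ('black', 1), ('board', 4), ('eraser', 3)]
```

Under the same assumptions, the phrase “black board” bracketed as a noun phrase comes out with the rising contour 2 1, while the noun “blackboard” keeps the falling contour 1 2.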
This example is characteristic and illustrates several important points. The grammar of English must contain the rule 7 so as to account for the fact that the stress contour is falling in the case of the noun “blackboard,” and it must contain rule 8, to account for the rising contour of the phrase “black board” (“board which is black”). The principle of the cycle is not, strictly speaking, part of the grammar of English but is rather a principle of universal grammar that determines the application of the particular rules of English or any other language, whatever these rules may be. In the case illustrated, the general principle of cyclic application assigns a complex stress contour, as indicated. Equipped with the principle of the cycle and the two rules 7 and 8, a person will know19 the proper stress contour for “John’s blackboard eraser” and innumerable other expressions which he may never have heard previously. This is a simple example of a general property of language; certain universal principles must interrelate with specific rules to determine the form (and meaning) of entirely new linguistic expressions.
This example also lends support to a somewhat more subtle and far-reaching hypothesis. There is little doubt that such phenomena as stress contours in English are a perceptual reality; trained observers will, for example, reach a high degree of unanimity in recording new utterances in their native language. There is, however, little reason to suppose that these contours represent a physical reality. It may very well be the case that stress contours are not represented in the physical signal in anything like the perceived detail. There is no paradox in this. If just two levels of stress are distinguished in the physical signal, then the person who is learning English will have sufficient evidence to construct the rules 7 and 8 (given the contrast “blackboard,” “black board,” for example). Assuming then that he knows the principle of the cycle, he will be able to perceive the stress contour of “John’s blackboard eraser” even if it is not a physical property of the signal. The evidence now available strongly suggests that this is an accurate description of how stress is perceived in English.
It is important to see that there is nothing mysterious in this description. There would be no problem in principle in designing an automaton that uses the rules 7 and 8, the rules of English syntax, and the principle of the transformational cycle to assign a multi-leveled stress contour even to an utterance in which stress is not represented at all (for example, a sentence spelled in conventional orthography). The automaton would use the rules of syntax to determine the surface structure of the utterance, and would then apply the rules 7 and 8, in accordance with the principle of the cycle, to determine the multi-leveled contour. Taking such an automaton as a first approximation to a model for speech perception (see 1, p. 103), we might propose that the hearer uses certain selected properties of the physical signal to determine which sentence of the language was produced and to assign to it a deep and surface structure. With careful attention, he will then be able to “hear” the stress contour assigned by the phonological component of his grammar, whether or not it corresponds to any physical property of the presented signal. Such an account of speech perception assumes, putting it loosely, that syntactic interpretation of an utterance may be a prerequisite to “hearing” its phonetic representation in detail; it rejects the assumption that speech perception requires a full analysis of phonetic form followed by a full analysis of syntactic structure followed by semantic interpretation, as well as the assumption that perceived phonetic form is an accurate point-by-point representation of the signal. But it must be kept in mind that there is nothing to suggest that either of the rejected assumptions is correct, nor is there anything at all mysterious in the view just outlined that rejects these assumptions. 
In fact, the view just outlined is highly plausible, since it can dispense with the claim that some presently undetectable physical properties of utterances are identified with an accuracy that goes beyond anything experimentally demonstrable even under ideal conditions, and it can account for the perception of stress contours of novel utterances20 on the very simple assumption that rules 7 and 8 and the general principle of cyclic application are available to the perceptual system.
There is a great deal more to be said about the relative merits of various kinds of perceptual models. Instead of pursuing this topic, let us consider further the hypothesis that rules 7 and 8, and the principle of cyclic application, are available to the perceptual system and are used in the manner suggested. It is clear how rules 7 and 8 might be learned from simple examples of rising and falling contour (for example, “black board” contrasted with “blackboard”). But the question then arises: how does a person learn the principle of cyclic application? Before facing this question, it is necessary to settle one that is logically prior to it: why assume that the principle is learned at all? There is much evidence that the principle is used, but from this it does not follow that it has been learned. In fact, it is difficult to imagine how such a principle might be learned, uniformly by all speakers, and it is by no means clear that sufficient evidence is available in the physical signal to justify this principle. Consequently, the most reasonable conclusion seems to be that the principle is not learned at all, but rather that it is simply part of the conceptual equipment that the learner brings to the task of language acquisition. A rather similar argument can be given with respect to other principles of universal grammar.
Notice again that there should be nothing surprising in such a conclusion. There would be no difficulty, in principle, in designing an automaton which incorporates the principles of universal grammar and puts them to use to determine which of the possible languages is the one to which it is exposed. A priori, there is no more reason to suppose that these principles are themselves learned than there is to suppose that a person learns to interpret visual stimuli in terms of line, angle, contour, distance, or, for that matter, that he learns to have two arms. It is completely a question of empirical fact; there is no information of any general extralinguistic sort that can be used, at present, to support the assumption that some principle of universal grammar is learned, or that it is innate, or (in some manner) both. If linguistic evidence seems to suggest that some principles are unlearned, there is no reason to find this conclusion paradoxical or surprising.
Returning to the elaboration of principles of universal grammar, it seems that the phonological component of a grammar consists of a sequence of rules that apply in a cyclic manner, as just described, to assign a phonetic representation to a surface structure. The phonetic representation is a matrix of phonetic feature specifications and the surface structure is a properly labeled bracketing of formatives which are, themselves, represented in terms of marking of categorial distinctive features. What evidence is now available supports these assumptions; they provide the basis for explaining many curious features of phonetic fact.
It is important to notice that there is no a priori necessity for the phonological component of a grammar to have just these properties. These assumptions about universal grammar restrict the class of possible human languages to a very special subset of the set of imaginable “languages.” The evidence available to us suggests that these assumptions pertain to the language acquisition device AM of 3, p. 106, that is that they form one part of the schematism that the child brings to the problem of language learning. That this schematism must be quite elaborate and highly restrictive seems fairly obvious. If it were not, language acquisition, within the empirically known limits of time, access, and variability, would be an impenetrable mystery. Considerations of the sort mentioned in the foregoing discussion are directly relevant to the problem of determining the nature of these innate mechanisms, and, therefore, deserve extremely careful study and attention.
Let us now consider the second interpretive component of a generative grammar, the system of rules that converts a deep structure into a semantic representation that expresses the intrinsic meaning of the sentence in question. Although many aspects of semantic interpretation remain quite obscure, it is still quite possible to undertake a direct investigation of the theory of deep structures and their interpretation, and certain properties of the semantic component seem fairly clear. In particular, as we have noted earlier, many empirical conditions on semantic interpretation can be clearly formulated. For example, we know that sentence 4 on p. 110 must be assigned at least two semantic representations, and that one of these must be essentially the same as the interpretation assigned to both 9 and 10.
9 | Being disregarded by everyone disturbed John. |
10 | The fact that everyone disregarded John disturbed him.21 |
Furthermore, it is clear that the semantic representation of a sentence depends on the representation of its parts, as in the parallel case of phonetic interpretation. For example, in the case of 10, it is obvious that the semantic interpretation depends, in part, on the semantic interpretation of “Everyone disregarded John”; if the latter were replaced in 10 by “Life seemed to pass John by,” the interpretation of the whole would be changed in a fixed way. This much is transparent, and it suggests that a principle like the principle of cyclic application in phonology should hold in the semantic component.
A slightly more careful look at the problem shows that semantic interpretation must be significantly more abstract than phonological interpretation with respect to the notion of “constituent part.” Thus the interpretation of “Everyone disregarded John” underlies not only 10, but also 9 and 4, and in exactly the same way. But neither 4 nor 9 contains “everyone disregarded John” as a constituent part, as does 10. In other words, the deep structures underlying 9 and 10 should both be identical (or very similar) to one of two deep structures underlying 4, despite the wide divergence in surface structure and phonetic form. It follows that we cannot expect deep structure to be very close to surface structure in general.
In the case of a sentence like 6 (“John saw Bill”), there is little difference between deep and surface structure. Semantic interpretation would not be far from the mark, in this case, if it were quite parallel to phonetic interpretation. Thus the interpretation of “saw Bill” can be derived from that of “saw”22 and that of “Bill,” and the interpretation of 6 can be determined from that of “John” and that of “saw Bill.” To carry out such interpretation we must know not only the bracketing of 6 into constituents, but also the grammatical relations that are represented; that is, we must know that “Bill” is the direct-object of “saw” and that the subject–predicate relation holds between “John” and “saw Bill” in “John saw Bill.” Similarly, in the slightly more complex case of “John saw Bill leave,” we must know that the subject–predicate relation holds between “John” and “saw Bill leave” and also between “Bill” and “leave.”
Notice that at least in such simple cases as 6, we already have a mechanism for representing grammatical relations of just the sort that are required for semantic interpretation. Suppose that we define the relations subject-of as the relation holding between a noun phrase and a sentence of which it is an immediate constituent23 and the relation predicate-of as holding between a verb phrase and a sentence of which it is an immediate constituent. The subject–predicate relation can then be defined as the relation holding between the subject of a sentence and the predicate of this sentence. Thus, in these terms, “John” is the subject and “saw Bill (leave)” the predicate of “John saw Bill (leave),” and the subject–predicate relation holds between the two. In the same way, we can define the relation direct-object (in terms of the immediate constituency of verb and noun phrase in verb phrase) and others in a perfectly appropriate and satisfactory way. But returning now to 6, this observation implies that a labeled bracketing will serve as the deep structure (just as a labeled bracketing will serve as the surface structure); it contains just the information about constituency and about grammatical relations that is required for semantic interpretation.
We noted that in “John saw Bill leave” the subject–predicate relation holds between “Bill” and “leave,” as well as between “John” and “saw Bill leave.” If 6 or something very much like it – see, for example, note 22 – is to be taken as the deep structure, with grammatical relations defined as previously, then the deep structure of “John saw Bill leave” will have to be something like 11 (many details omitted):
11 | [S [NP John]NP [VP [V saw]V [S [NP Bill]NP [VP [V leave]V ]VP ]S ]VP ]S |
The labeled bracketing 11 expresses the subject–predicate relation between “John” and “saw Bill leave” and between “Bill” and “leave,” as required.
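The configurational definitions of the grammatical relations can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not part of the theory: phrase-markers are written as nested tuples, the tree for 11 is written out from the description in the text (with many details omitted, as there), and the function names are hypothetical. Subject–predicate is read off an NP and a VP that are immediate constituents of the same S, and direct-object off a V and an NP that are immediate constituents of the same VP.

```python
def words(node):
    """Terminal words dominated by a node, in left-to-right order."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in words(child)]

def constituents(node, label):
    """Immediate constituents of node that bear the given category label."""
    return [c for c in node[1:] if not isinstance(c, str) and c[0] == label]

def relations(node):
    """Collect grammatical relations defined by configuration, outermost first."""
    if isinstance(node, str):
        return []
    found = []
    if node[0] == "S":
        for np in constituents(node, "NP"):
            for vp in constituents(node, "VP"):
                found.append(("subject-predicate",
                              " ".join(words(np)), " ".join(words(vp))))
    if node[0] == "VP":
        for v in constituents(node, "V"):
            for np in constituents(node, "NP"):
                # (relation, object, verb)
                found.append(("direct-object",
                              " ".join(words(np)), " ".join(words(v))))
    for child in node[1:]:
        found.extend(relations(child))
    return found

# Assumed rendering of 11, "John saw Bill leave".
SAW_LEAVE = ("S", ("NP", "John"),
                  ("VP", ("V", "saw"),
                         ("S", ("NP", "Bill"), ("VP", ("V", "leave")))))
# Assumed rendering of 6, "John saw Bill".
SAW = ("S", ("NP", ("N", "John")),
            ("VP", ("V", "saw"), ("NP", ("N", "Bill"))))

print(relations(SAW_LEAVE))
# [('subject-predicate', 'John', 'saw Bill leave'), ('subject-predicate', 'Bill', 'leave')]
print(relations(SAW))
# [('subject-predicate', 'John', 'saw Bill'), ('direct-object', 'Bill', 'saw')]
```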
Moving to a somewhat more complex example, the sentences 9 and 10 (as well as 4 under one interpretation) will each have to contain something like 12 in the deep structure:
12 | [S [NP everyone]NP [VP [V disregards]V [NP John]NP ]VP ]S |
If this requirement is met, then we will be able to account for the fact that, obviously, the meaning of 4 (= “what disturbed John was being disregarded by everyone”) in one interpretation and of 9 (= “being disregarded by everyone disturbed John”) is determined in part by the fact that the direct-object relation holds between “disregard” and “John” and the subject–predicate relation between “everyone” and “disregards John,” despite the fact that these relations are in no way indicated in the surface structure in 4 or 9.
From many such examples, we are led to the following conception of how the semantic component functions. This interpretive component of the full generative grammar applies to a deep structure and assigns to it a semantic representation, formulated in terms of the still quite obscure notions of universal semantics. The deep structure is a labeled bracketing of minimal “meaning-bearing” elements. The interpretive rules apply cyclically, determining the semantic interpretation of a phrase X of the deep structure from the semantic interpretations of the immediate constituents of X and the grammatical relation represented in this configuration of X and its parts.
Superficially, at least, the two interpretive components of the grammar are rather similar in the way in which they operate, and they apply to objects of essentially the same sort (labeled bracketings). But the deep structure of a sentence will, in nontrivial cases, be quite different from its surface structure.
Notice that if the notions “noun phrase,” “verb phrase,” “sentence,” “verb,” can receive a language-independent characterization within universal grammar, then the grammatical relations defined above (similarly, others that we might define in the same way) will also receive a universal characterization. It seems that this may be possible, and certain general lines of approach to such a characterization seem clear (see p. 139). We might then raise the question of whether the semantic component of a grammar contains such particular rules as the rules 7 and 8 of the phonological component of English or whether, alternatively, the principles of semantic interpretation belong essentially to universal grammar. However, we will put aside these and other questions relating to the semantic component, and turn next to the discussion of the one noninterpretive component of the grammar – which we have called its “syntactic component.” Notice that as in the case of the phonological component, insofar as principles of interpretation can be assigned to universal rather than particular grammar, there is little reason to suppose that they are learned or that they could in principle be learned.
The syntactic component of a grammar must generate (see note 12) pairs (D, S), where D is a deep structure and S an associated surface structure. The surface structure S is a labeled bracketing of a sequence of formatives and junctures. The deep structure D is a labeled bracketing that determines a certain network of grammatical functions and grammatical relations among the elements and groups of elements of which it is composed. Obviously, the syntactic component must have a finite number of rules (or rule schemata), but these must be so organized that an infinite number of pairs (D, S) of deep and surface structures can be generated, one corresponding to each interpreted sentence (phonetically and semantically interpreted, that is) of the language.24 In principle, there are various ways in which such a system might be organized. It might, for example, consist of independent rules generating deep and surface structures and certain conditions of compatibility relating them, or of rules generating surface structures combined with rules mapping these into the associated deep structure, or of rules generating deep structures combined with rules mapping these into surface structures.25 Choice among these alternatives is a matter of fact, not decision. We must ask which of the alternatives makes possible the deepest generalizations and the most far-reaching explanation of linguistic phenomena of various sorts. As with other aspects of universal grammar, we are dealing here with a set of empirical questions; crucial evidence may be difficult to obtain, but we cannot conclude from this that there is, in principle, no right and wrong in the matter.
Of the many alternatives that might be suggested, the linguistic evidence now available seems to point consistently to the conclusion that the syntactic component consists of rules that generate deep structures combined with rules mapping these into associated surface structures. Let us call these two systems of rules the base and the transformational components of the syntax, respectively. The base system is further subdivided into two parts: the categorial system and the lexicon. Each of these three subparts of the syntax has a specific function to perform, and there seem to be heavy universal constraints that determine their form and interrelation. The general structure of a grammar would, then, be as depicted in diagram 13:
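13 | B: base → deep structure; S: deep structure → semantic representation; T: deep structure → surface structure; P: surface structure → phonetic representation |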
The mapping S is carried out by the semantic component; T by the transformational component; and P by the phonological component. Generation of deep structures by the base system (by operation B) is determined by the categorial system and the lexicon.
The lexicon is a set of lexical entries; each lexical entry, in turn, can be regarded as a set of features of various sorts. Among these are the phonological features and the semantic features that we have already mentioned briefly. The phonological features can be thought of as indexed as to position (that is, first, second, etc.); aside from this, each is simply an indication of marking with respect to one of the universal distinctive features (regarded here in their categorial function) or with respect to some diacritic feature (see p. 114), in the case of irregularity. Thus the positionally indexed phonological features constitute a distinctive feature matrix with the entries given as + or – values, as described earlier. The semantic features constitute a “dictionary definition.” As noted previously, some of these at least must be quite abstract; there may, furthermore, be intrinsic connections of various sorts among them that are sometimes referred to as “field structure.” In addition, the lexical entry contains syntactic features that determine the positions in which the entry in question may appear, and the rules that may apply to structures containing it as these are converted into surface structures. In general, the lexical entry contains all information about the item in question that cannot be accounted for by general rule.
Aside from lexical entries, the lexicon will contain redundancy rules that modify the feature content of a lexical entry in terms of general regularities. For example, the fact that vowels are voiced or that humans are animate requires no specific mention in particular lexical entries. Much of the redundant lexical information can, no doubt, be provided by general conventions (that is, rules of universal grammar) rather than by redundancy rules of the language.
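As a rough illustration of how redundancy rules might fill in predictable features, consider the following Python sketch. It makes several simplifying assumptions that are not in the text: a feature bundle is a flat dictionary from feature names to “+” or “-”, each rule is an “if these values, then those values” pair, and a value already listed in the entry is never overwritten (so that an explicit mark can record an exception). The feature names and the function `expand` are hypothetical.

```python
# Each rule: (conditions, predicted features). Both examples come from the text:
# vowels are voiced; humans are animate.
REDUNDANCY_RULES = [
    ({"vowel": "+"}, {"voiced": "+"}),
    ({"human": "+"}, {"animate": "+"}),
]

def expand(entry, rules=REDUNDANCY_RULES):
    """Return a copy of the feature bundle with predictable features filled in."""
    full = dict(entry)
    changed = True
    while changed:                      # repeat until no rule adds anything new
        changed = False
        for conditions, predicted in rules:
            if all(full.get(k) == v for k, v in conditions.items()):
                for feature, value in predicted.items():
                    if feature not in full:   # assumption: explicit marks win
                        full[feature] = value
                        changed = True
    return full

print(expand({"human": "+"}))   # {'human': '+', 'animate': '+'}
print(expand({"vowel": "+"}))   # {'vowel': '+', 'voiced': '+'}
```

On this picture the entry itself need list only “human”; “animate” is supplied by rule and requires no specific mention.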
The lexicon is concerned with all properties, idiosyncratic or redundant, of individual lexical items. The categorial component of the base determines all other aspects of deep structure. It seems that the categorial component is what is called a simple or context-free phrase-structure grammar. Just what such a system is can be understood quite easily from a simple example. Suppose that we have the rules 14:
14 | S → NP VP; VP → V NP; NP → N; N → Δ; V → Δ |
With these rules we construct the derivation 15 in the following way. First write down the symbol S as the first line of the derivation. We interpret the first rule of 14 as permitting S to be replaced by NP VP, giving the second line of 15. Interpreting the second rule of 14 in a similar way, we form the third line of the derivation 15 with VP replaced by V NP. We form the fourth line of 15 by applying the rule NP → N of 14, interpreted the same way, to both of the occurrences of NP in the third line. Finally, we form the final two lines of 15 by applying the rules N → Δ and V → Δ.
15 | S; NP VP; NP V NP; N V N; Δ V Δ; Δ Δ Δ (successive lines of the derivation) |
Clearly, we can represent what is essential to the derivation 15 by the tree diagram 16.
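16 | [Tree diagram: S dominates NP and VP; the first NP dominates N, which dominates Δ; VP dominates V and NP; V dominates Δ; the second NP dominates N, which dominates Δ] |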
In the diagram 16, each symbol dominates the symbols by which it was replaced in forming 15. In fact, we may think of the rules of 14 as simply describing the way in which a tree diagram such as 16 can be constructed. Evidently, 16 is just another notation for the labeled bracketing 17:
17 | [S [NP [N Δ]N ]NP [VP [V Δ]V [NP [N Δ]N ]NP ]VP ]S |
Domination of some element by a symbol A in 16 (as, for example, V NP is dominated by VP) is indicated in 17 by enclosing this element by the labeled brackets [A,]A. If we have a lexicon which tells us that “John” and “Bill” can replace the symbol Δ when this symbol is dominated by N (that is, is enclosed by [N,]N), and that “saw” can replace Δ when it is dominated by V, then we can extend the derivation 15 to derive “John saw Bill,” with the associated structure that we have given as 6. In fact, 6 derives from 17 by replacing the first occurrence of Δ by “John,” the second by “saw,” and the third by “Bill.”
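The steps from the rules to the derivation, the labeled bracketing, and lexical insertion can be sketched in Python. The sketch assumes that 14 consists of the five rewriting rules walked through in the text and that each category has exactly one rule, which holds for this toy system but not for a realistic grammar. The function names and the one-word-one-category lexicon are hypothetical simplifications.

```python
RULES = [("S", ["NP", "VP"]), ("VP", ["V", "NP"]), ("NP", ["N"]),
         ("N", ["Δ"]), ("V", ["Δ"])]
LEXICON = {"John": "N", "Bill": "N", "saw": "V"}   # word -> lexical category

def derive(rules, start="S"):
    """Apply each rule in order to every occurrence of its left-hand symbol."""
    line, lines = [start], [[start]]
    for lhs, rhs in rules:
        new = [sym for old in line for sym in (rhs if old == lhs else [old])]
        if new != line:
            line = new
            lines.append(line)
    return lines

def bracket(symbol, rules):
    """Labeled bracketing imposed by the rules (one rule per category assumed)."""
    table = dict(rules)
    if symbol not in table:
        return symbol                       # terminal symbol, e.g. the dummy Δ
    inner = " ".join(bracket(s, rules) for s in table[symbol])
    return f"[{symbol} {inner}]{symbol}"

def insert(bracketing, words, lexicon):
    """Replace successive Δ's by words whose category matches the enclosing bracket."""
    parts = bracketing.split("Δ")
    if len(parts) != len(words) + 1:
        raise ValueError("need exactly one word per Δ")
    out = parts[0]
    for word, rest in zip(words, parts[1:]):
        category = out.rsplit("[", 1)[1].strip()   # label of the innermost open bracket
        if lexicon[word] != category:
            raise ValueError(f"{word!r} cannot replace Δ under {category}")
        out += word + rest
    return out

for line in derive(RULES):
    print(" ".join(line))
skeleton = bracket("S", RULES)
print(skeleton)   # [S [NP [N Δ]N]NP [VP [V Δ]V [NP [N Δ]N]NP]VP]S
print(insert(skeleton, ["John", "saw", "Bill"], LEXICON))
# [S [NP [N John]N]NP [VP [V saw]V [NP [N Bill]N]NP]VP]S
```

The category check in `insert` is what prevents, say, “saw” from replacing a Δ dominated by N.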
Notice that the rules 14 in effect define grammatical relations, where the definitions are given as on pp. 121–22. Thus, the first rule of 14 defines the subject–predicate relation and the second, the verb–object relation. Similarly, other semantically significant grammatical functions and relations can be defined by rules of this form, interpreted in the manner indicated.
Restating these notions in a more formal and general way, the categorial component of the base is a system of rules of the form A → Z, where A is a category symbol such as S (for “sentence”), NP (for “noun phrase”), N (for “noun”), etc., and Z is a string of one or more symbols which may again be category symbols or which may be terminal symbols (that is, symbols which do not appear on the left-hand side of the arrow in any base rule). Given such a system, we can form derivations, a derivation being a sequence of lines that meets the following conditions: the first line is simply the symbol S (standing for sentence); the last line contains only terminal symbols; if X, Y are two successive lines, then X must be of the form . . . A . . . and Y of the form . . . Z . . . , where A → Z is one of the rules. A derivation imposes a labeled bracketing on its terminal string in the obvious way. Thus given the successive lines X = . . . A . . . , Y = . . . Z . . . , where Y was derived from X by the rule A → Z, we will say that the string derived from Z (or Z itself, if it is terminal) is bracketed by [A,]A. Equivalently, we can represent the labeled bracketing by a tree diagram in which a node labeled A (in this example) dominates the successive nodes labeled by the successive symbols of Z.
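The three conditions on a derivation can be checked mechanically. The following is a minimal sketch, assuming that lines are lists of symbols and that rules are `(A, Z)` pairs; the function names are hypothetical. It follows the formal statement literally, so that each line must differ from its predecessor by the rewriting of exactly one occurrence of A.

```python
def follows(x, y, rules):
    """True if line y results from line x by rewriting one occurrence of some A as Z."""
    for lhs, rhs in rules:
        for i, symbol in enumerate(x):
            if symbol == lhs and x[:i] + rhs + x[i + 1:] == y:
                return True
    return False

def is_derivation(lines, rules, start="S"):
    """Check the three conditions: initial line, terminal last line, rule-licensed steps."""
    nonterminals = {lhs for lhs, _ in rules}
    if not lines or lines[0] != [start]:
        return False
    if any(symbol in nonterminals for symbol in lines[-1]):
        return False
    return all(follows(x, y, rules) for x, y in zip(lines, lines[1:]))

# The rules used in the earlier illustration (S, VP, NP, N, V).
RULES = [
    ("S", ["NP", "VP"]),
    ("VP", ["V", "NP"]),
    ("NP", ["N"]),
    ("N", ["Δ"]),
    ("V", ["Δ"]),
]

STEPWISE = [line.split() for line in [
    "S", "NP VP", "N VP", "Δ VP", "Δ V NP", "Δ Δ NP", "Δ Δ N", "Δ Δ Δ",
]]
print(is_derivation(STEPWISE, RULES))
```

Under this literal reading, a display that rewrites two occurrences of a symbol in one line (as the informal derivation 15 does for NP) compresses two steps into one and would have to be expanded before it passes the check.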
We assume that one of the terminal symbols of the categorial component is the dummy symbol Δ. Among the nonterminal symbols are several that stand for lexical categories, in particular N (for “noun”), V (for “verb”), ADJ (for “adjective”). A lexical category A can appear on the left-hand side of a rule A → Z only if Z is Δ. Lexical entries will then be inserted in derivations in place of Δ by rules of a different sort, extending the derivations provided by the categorial component. Aside from Δ, indicating the position in which an item from the lexicon may appear, the terminal symbols of the categorial component are grammatical elements such as be, of, etc. Some of the terminal symbols introduced by categorial rules will have an intrinsic semantic content.
A labeled bracketing generated by base rules (that is, by the phrase-structure rules of the categorial component and by the rule of lexical insertion mentioned in the preceding paragraph) will be called a base phrase-marker. More generally, we will use the term “phrase-marker” here to refer to any string of elements properly bracketed with labeled brackets.26 The rules of the transformational component modify phrase-markers in certain fixed ways. These rules are arranged in a sequence T1, . . . , Tm. This sequence of rules applies to a base phrase-marker in a cyclic fashion. First, it applies to a configuration dominated by S (that is, a configuration [S . . .]S) and containing no other occurrence of S. When the transformational rules have applied to all such configurations, then they next apply to a configuration dominated by S and containing only S-dominated configurations to which the rules have already applied. This process continues until the rules apply to the full phrase-marker dominated by the initial occurrence of S in the base phrase-marker. At this point, we have a surface structure. It may be that the ordering conditions on transformations are looser – that there are certain ordering conditions on the set {T1, . . . , Tm}, and that at a given stage in the cycle, a sequence of transformations can apply if it does not violate these conditions – but I will not go into this matter here.
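The cyclic principle can be pictured as a bottom-up traversal: the sequence of transformations applies to the most deeply embedded S first and to the full phrase-marker last. The sketch below assumes the `(label, children)` tree encoding, treats each transformation as a function from phrase-marker to phrase-marker, and uses a hypothetical tracing function in place of real transformations; the bracketing of the example is simplified for illustration.

```python
def terminals(node):
    """Left-to-right terminal symbols of a tree given as (label, children) or a leaf string."""
    if isinstance(node, str):
        return [node]
    _, children = node
    return [t for child in children for t in terminals(child)]

def apply_cyclically(node, transformations):
    """Apply T1, ..., Tm to innermost S configurations first, then to enclosing ones."""
    if isinstance(node, str):
        return node
    label, children = node
    # Process everything this node dominates before the node itself.
    node = (label, [apply_cyclically(child, transformations) for child in children])
    if label == "S":
        for transform in transformations:
            node = transform(node)
    return node

# Hypothetical "transformation" that changes nothing and only records each cycle.
visited = []

def trace(node):
    visited.append(" ".join(terminals(node)))
    return node

# Simplified phrase-marker for "it that Bill past dream past annoy John".
tree = ("S", [
    ("NP", ["it", "that", ("S", [("NP", ["Bill"]), ("AUX", ["past"]), ("VP", ["dream"])])]),
    ("AUX", ["past"]),
    ("VP", ["annoy", ("NP", ["John"])]),
])

result = apply_cyclically(tree, [trace])
print(visited)
```

The recorded order shows the embedded sentence being processed before the sentence that contains it. Looser ordering conditions of the kind mentioned at the end of the paragraph are not modeled here.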
The properties of the syntactic component can be made quite clear by an example (which, naturally, must be much oversimplified). Consider a subpart of English with the lexicon 18 and the categorial component 19.
18 | ![]() |
19 | ![]() |
In 19, parentheses are used to indicate an element that may or may not be present in the rule. Thus the first line of 19 is an abbreviation for two rules, one in which S is rewritten Q NP AUX VP, the other in which S is rewritten NP AUX VP. Similarly, the third line of 19 is actually an abbreviation for four rules, etc. The last line of 19 stands for five rules, each of which rewrites one of the categorial symbols on the left as the dummy terminal symbol Δ.
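The parenthesis convention amounts to a simple expansion: a rule with n optional elements abbreviates 2^n rules. A minimal sketch follows, assuming that an optional element is written as a tuple inside the right-hand side; `expand_optional` and `expand_shared_rhs` are hypothetical names, and the rule `X → A (B) (C)` in the usage lines is an invented example rather than a line of 19.

```python
from itertools import product

def expand_optional(lhs, rhs):
    """Expand a rule whose right-hand side may contain optional elements.

    An optional element is written as a tuple of symbols, e.g. ("Q",);
    obligatory symbols are plain strings.
    """
    choices = []
    for item in rhs:
        if isinstance(item, tuple):
            choices.append([[], list(item)])  # element absent, or present
        else:
            choices.append([[item]])
    return [
        (lhs, [symbol for part in combo for symbol in part])
        for combo in product(*choices)
    ]

def expand_shared_rhs(lhs_symbols, rhs):
    """Expand 'A, B, C → Z' into one rule per left-hand symbol."""
    return [(lhs, list(rhs)) for lhs in lhs_symbols]

first_line = expand_optional("S", [("Q",), "NP", "AUX", "VP"])
two_options = expand_optional("X", ["A", ("B",), ("C",)])
last_line = expand_shared_rhs(["N", "V", "ADJ", "M", "DET"], ["Δ"])
print(first_line)
print(len(two_options), len(last_line))
```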
This categorial component provides such derivations as the following:
20 | ![]() |
These derivations are constructed in the manner just described. They impose labeled bracketings which, for clarity, we will give in the equivalent tree representation:
21 | ![]() |
We now use the lexicon to complete the base derivations 20a, 20b. Each entry in the lexicon contains syntactic features which identify the occurrences of Δ that it can replace in a derivation. For example, the items of the five rows of 18 can replace occurrences of Δ that are dominated, in the tree representations of 21, by the categorial symbols N, V, ADJ, M, DET, respectively.
But the restrictions are much narrower than this. Thus of the verbs in 18 (line 2), only persuade can replace an occurrence of Δ dominated by V when this occurrence of V is followed in the VP by: NP of NP. We can form “. . . persuade John of the fact,” but not “. . . dream (see, annoy) John of the fact.” Similarly, of the nouns in 18 (first line) only fact can appear in the context DET – that S (that is, “the fact that John left”); only it in a NP of the form – that S;27 only fact, boy, and future in a NP of the form DET – (“the fact,” “the boy,” “the future”), etc. Details aside, the general character of such restrictions is quite clear. Assuming, then, that the lexical entries contain the appropriate lexical features, we can extend the base derivations of 20 to give the terminal strings 22, inserting the items enclosed in brackets in 21.
22 | a. John past be sad |
| b. the boy will persuade John of the fact that Bill past dream |
We can also form such terminal strings as 23, with other choices in derivations.
23 |
|
In this way, we form full base derivations, using the rules of the categorial component and then substituting lexical entries for particular occurrences of the dummy symbol Δ in accordance with the syntactic features of these lexical entries. Correspondingly, we have the labeled brackets represented as 21, with lexical entries substituted for occurrences of Δ in the permitted ways. These are the base phrase-markers.
Notice that the rules that introduce lexical entries into base phrase-markers are entirely different in character from the rules of the categorial component. The rules of 19 that were used to form 20 are of a very elementary sort. Each such rule allows a certain symbol A in the string . . . A . . . to be rewritten as a certain string Z, independently of the context of A and the source of A in the derivation. But in introducing lexical entries in place of Δ, we must consider selected aspects of the phrase-marker in which Δ appears. For example, an occurrence of Δ can be replaced by “John” if it is dominated in the phrase-marker by N, but not by V. Thus the rules of lexical insertion really apply not to strings of categorial and terminal symbols, as do the rules of the categorial component, but to phrase-markers such as 21. Rules which apply to phrase-markers, modifying them in some specific way, are referred to in current terminology as (grammatical) transformations. Thus the rules of lexical insertion are transformational rules, whereas the rules of the categorial component are simply rewriting rules.
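The contrast between context-free rewriting and context-sensitive lexical insertion can be illustrated with a toy lexicon. The sketch assumes that each entry lists its category and the frames (left and right sister symbols within the containing phrase) in which it may replace Δ. Only the frames stated in the text are encoded; the representation and the names `LEXICON` and `can_insert` are hypothetical.

```python
# word -> (category, set of (left context, right context) frames).
# Only restrictions mentioned in the text are listed; other entries are omitted.
LEXICON = {
    "persuade": ("V", {((), ("NP", "of", "NP"))}),
    "fact": ("N", {(("DET",), ("that", "S")), (("DET",), ())}),
    "it": ("N", {((), ("that", "S"))}),
    "boy": ("N", {(("DET",), ())}),
    "future": ("N", {(("DET",), ())}),
}

def can_insert(word, category, left, right):
    """True if `word` may replace a Δ dominated by `category` in the context left _ right."""
    entry = LEXICON.get(word)
    if entry is None:
        return False
    entry_category, frames = entry
    return entry_category == category and (tuple(left), tuple(right)) in frames

print(can_insert("fact", "N", ["DET"], ["that", "S"]))
print(can_insert("boy", "N", ["DET"], ["that", "S"]))
```

A categorial rule needs to know only the symbol being rewritten, whereas `can_insert` must inspect the dominating category and the neighboring symbols; this is the sense in which lexical insertion operates on phrase-markers rather than on strings.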
Let us now return to the examples 22a, 22b. Consider first 22a, with the base phrase-marker 21a.28 We see at once that 21a contains just the information required in the deep structure of the sentence “John was sad.” Clearly, the string past be is simply a representation of the formative “was,” just as past see represents “saw,” past persuade represents “persuaded,” etc. With a rule that converts past be to the formative “was,” we form the surface structure of the sentence “John was sad.” Furthermore, if we define grammatical functions and relations in the manner described earlier (see pp. 121–22), 21a expresses the fact that the subject–predicate relation holds between John and past be sad, and it also contains semantic information about the meaning-bearing items John, past, sad; we may assume, in fact, that past is itself a symbol of a universal terminal alphabet with a fixed semantic interpretation, and the semantic features of the lexical entries of John and sad can also be assumed to be selected, like the phonological features of these entries, from some universal system of representation of the sort discussed above. In short, 21a contains all information required for semantic interpretation, and we can, therefore, take it to be the deep structure underlying the sentence “John was sad.”
What is true of this example is true quite generally. That is, the base phrase-markers generated by the categorial component and the lexicon are the deep structures that determine semantic interpretation. In this simple case, only one rule is needed to convert the deep structure to a surface structure, namely, the rule converting past be to the formative was. Since this rule is clearly a special case of a rule that applies as well to any string of the form past V, it is really a very simple transformational rule (in the terminology just given) rather than an elementary rule of the type that we find in the categorial component. This observation can be generalized. The rules that convert deep structures to surface structures are transformational rules.
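The rule converting past V to a single formative can be sketched as a pass over the terminal string. This is a simplification: the text treats the rule as a transformation on phrase-markers, whereas the sketch works on a flat list of tokens, ignores agreement, and relies on a small table (`PAST_FORMS`, a hypothetical name) containing only forms cited in the text.

```python
# Past-tense formatives cited in the text.
PAST_FORMS = {"be": "was", "see": "saw", "persuade": "persuaded", "dream": "dreamt"}

def realize_past(tokens):
    """Replace each sequence 'past V' by the corresponding formative."""
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "past" and i + 1 < len(tokens) and tokens[i + 1] in PAST_FORMS:
            out.append(PAST_FORMS[tokens[i + 1]])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

print(" ".join(realize_past("John past be sad".split())))
```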
Suppose now that instead of the derivation 20a we had formed the very similar derivation 24:
24 | ![]() |
with its associated phrase-marker. We intend the symbol Q to be a symbol of the universal terminal alphabet with a fixed semantic interpretation, namely, that the associated sentence is a question. Suppose that the transformational component of the syntax contains rules that convert phrase-markers of the form Q NP AUX . . . to corresponding phrase-markers of the form AUX NP . . . (that is, the transformation replaces Q by AUX, leaving the phrase-marker otherwise unchanged). Applied to the phrase-marker corresponding to 24, this rule gives the labeled bracketing of the sentence “Will John be sad?”; that is, it forms the surface structure for this sentence.
Suppose that in place of 24 we had used the rule rewriting AUX as past. The question transformation of the preceding paragraph would give a phrase-marker with the terminal string “past John be sad,” just as it gives “Will John be sad?” in the case of 24. Evidently, we must modify the question transformation so that it inverts not just past, in this case, but the string past be, so that we derive finally, “Was John sad?” This modification is, in fact, straightforward, when the rules are appropriately formulated.
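The question transformation and its modification for past be can be sketched as follows. The sketch assumes that the relevant part of the phrase-marker is given as a list of `(label, tokens)` constituents immediately dominated by S. How the inverted string past be is bracketed after movement is not specified in the text, so grouping it under AUX here is an assumption of the illustration, as are the function names.

```python
def question(constituents):
    """Q NP AUX ...  ->  AUX NP ...; when AUX is 'past' and VP begins with 'be', invert 'past be'."""
    labels = [label for label, _ in constituents]
    if labels[:3] != ["Q", "NP", "AUX"]:
        return constituents  # the rule does not apply
    _, np, aux, *rest = constituents
    aux_label, aux_tokens = aux
    if aux_tokens == ["past"] and rest and rest[0][1][:1] == ["be"]:
        vp_label, vp_tokens = rest[0]
        aux = (aux_label, ["past", "be"])
        rest = [(vp_label, vp_tokens[1:])] + rest[1:]
    return [aux, np] + rest

def flatten(constituents):
    """Terminal string of a list of (label, tokens) constituents."""
    return " ".join(token for _, tokens in constituents for token in tokens)

with_modal = [("Q", ["Q"]), ("NP", ["John"]), ("AUX", ["will"]), ("VP", ["be", "sad"])]
with_past = [("Q", ["Q"]), ("NP", ["John"]), ("AUX", ["past"]), ("VP", ["be", "sad"])]

print(flatten(question(with_modal)))
print(flatten(question(with_past)))
```

The second result still contains the sequence past be, which the rule for tense formatives then converts to “was,” giving “Was John sad?”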
Whether we select M or past in 24, the generated base phrase-marker once again qualifies as a deep structure. The grammatical relation of John to will (past) be sad is exactly the same in 24 as in 20a, with the definitions proposed previously, as required for empirical adequacy. Of course, the surface forms do not express these grammatical relations directly; as we have seen earlier, significant grammatical relations are rarely expressed directly in the surface structure.
Let us now turn to the more complex example 20b – 21b – 22b. Once again, the base phrase-marker 21b of 22b expresses the information required for the semantic interpretation of the sentence “The boy will persuade John of the fact that Bill dreamt,” which derives from 22b by a transformational rule that forms “dreamt” from past dream. Therefore, 21b can serve as the deep structure underlying this sentence, exactly as 21a can serve for “John was sad,” and the phrase-marker corresponding to 24 for “Will John be sad?”
Suppose that in rewriting NP in the third line of 20b, we had selected not DET N that S but N that S [see the fourth line of 19]. The only lexical item of 18 that can appear in the position of this occurrence of N is it. Therefore, instead of 22b, we would have derived
25 | the boy will persuade John of it that Bill past dream, |
with grammatical relations and lexical content otherwise unmodified. Suppose now that the transformational component of the syntax contains rules with the following effect:
26 |
|
Applying 26a and 26b to 25 in that order, with the rule that converts past dream to “dreamt,” we derive the surface structure of “The boy will persuade John that Bill dreamt.” The base phrase-marker corresponding to 25 serves as the deep structure underlying this sentence.
Notice that the rule 26a is much more general. Thus suppose we select the NP it that Bill past dream as the subject of past annoy John, as is permitted by the rules 18, 19. This gives
27 | it that Bill past dream past annoy John |
Applying the rule 26a (and the rules for forming past tense of verbs), we derive, “That Bill dreamt annoyed John.” Alternatively, we might have applied the transformational rule with the effect of 28:
28 | A phrase-marker of the form it that S X is restructured as the corresponding phrase-marker of the form it X that S. |
Applying 28 to 27, we derive “It annoyed John that Bill dreamt.” In this case, 26a is inapplicable. Thus 27 underlies two surface structures, one determined by 28 and the other by 26a; having the same deep structure, these are synonymous. In the case of 25, 28 is inapplicable and, therefore, we have only one corresponding surface structure.
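The interaction of these rules can be traced in a small sketch. Two assumptions should be flagged. First, the content of 26a and 26b is inferred from the derivations reported in the text: 26a is taken to delete it before that S, and 26b to delete of before that S. Second, an embedded clause is encoded as a tuple `("that", tokens)` so that its boundaries are explicit, and 28 is applied only when it begins the phrase-marker and X is non-empty. All names are hypothetical.

```python
PAST_FORMS = {"dream": "dreamt", "annoy": "annoyed"}

def is_that_clause(item):
    return isinstance(item, tuple) and item[0] == "that"

def delete_before_that(items, word):
    """Delete `word` wherever it immediately precedes a that-clause."""
    out = []
    for i, item in enumerate(items):
        following = items[i + 1] if i + 1 < len(items) else None
        if item == word and is_that_clause(following):
            continue
        out.append(item)
    return out

def rule_26a(items):
    return delete_before_that(items, "it")  # inferred content of 26a

def rule_26b(items):
    return delete_before_that(items, "of")  # inferred content of 26b

def rule_28(items):
    """it that-S X  ->  it X that-S, for non-empty X."""
    if len(items) > 2 and items[0] == "it" and is_that_clause(items[1]):
        return [items[0]] + items[2:] + [items[1]]
    return items

def surface(items):
    """Flatten the structure and spell out 'past V' as a single formative."""
    tokens = []
    for item in items:
        tokens.extend(["that"] + item[1] if is_that_clause(item) else [item])
    out, i = [], 0
    while i < len(tokens):
        if tokens[i] == "past" and i + 1 < len(tokens) and tokens[i + 1] in PAST_FORMS:
            out.append(PAST_FORMS[tokens[i + 1]])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)

CLAUSE = ("that", ["Bill", "past", "dream"])
S_25 = ["the", "boy", "will", "persuade", "John", "of", "it", CLAUSE]
S_27 = ["it", CLAUSE, "past", "annoy", "John"]

print(surface(rule_26b(rule_26a(S_25))))
print(surface(rule_26a(S_27)))
print(surface(rule_28(S_27)))
```

In this encoding the order of 26a and 26b matters for 25: of comes to stand immediately before the clause only after it has been deleted. After 28 has applied to 27, it no longer precedes the clause, so 26a finds nothing to delete.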
We can carry the example 25 further by considering additional transformational rules. Suppose that instead of selecting Bill in the embedded sentence of 25, we had selected John a second time. There is a very general transformational rule in English and other languages providing for the deletion of repeated items. Applying this rule along with other minor ones of an obvious sort, we derive
29 | The boy will persuade John to dream |
from a deep structure that contains, as it must, a subphrase-marker that expresses the fact that John is the subject of dream. Actually, in this case the deep phrase-marker would be slightly different, in ways that need not concern us here, in this rough expository sketch.
Suppose now that we were to add a transformation that converts a phrase-marker of the form NP AUX V NP into the corresponding passive, in the obvious way.29 Applying to phrase-markers very much like 21b, this rule would provide surface structures for the sentences “John will be persuaded that Bill dreamt (by the boy)” [from 25] and “John will be persuaded to dream (by the boy)” [from 29]. In each case, the semantic interpretation will be that of the underlying deep phrase-marker. In certain cases, the significant grammatical relations are entirely obscured in the surface structure. Thus in the case of the sentence “John will be persuaded to dream,” the fact that “John” is actually the subject of “dream” is not indicated in the surface structure, although the underlying deep structure, as we have noted, expresses this fact directly.
From these examples we can see how a sequence of transformations can form quite complicated sentences in which significant relations among the parts are not represented in any direct way. In fact, it is only in artificially simple examples that deep and surface structure correspond closely. In the normal sentences of everyday life, the relation is much more complex; long sequences of transformations apply to convert underlying deep structures into the surface form.
The examples that we have been using are stilted and unnatural. With a less rudimentary grammar, quite natural ones can be provided. For example, in place of the sentences formed from 27 by 26 or 28 we could use more acceptable sentences such as “That you should believe this is not surprising,” “It is not surprising that you should believe this,” etc. Actually, the unnaturalness of the examples we have used illustrates a simple but often neglected point, namely, that the intrinsic meaning of a sentence and its other grammatical properties are determined by rule, not by conditions of use, linguistic context, frequency of parts, etc.30 Thus the examples of the last few paragraphs may never have been produced in the experience of some speaker (or, for that matter, in the history of the language), but their status as English sentences and their ideal phonetic and semantic interpretations are unaffected by this fact.
Since the sequence of transformations can effect drastic modifications in a phrase-marker, we should not be surprised to discover that a single structure31 may result from two very different deep structures – that is, that certain sentences are ambiguous (for example, sentence 4 on p. 110). Ambiguous sentences provide a particularly clear indication of the inadequacy of surface structure as a representation of deeper relations.32
More generally, we can easily find paired sentences with essentially the same surface structure but entirely different grammatical relations. To mention just one such example, compare the sentences of 30:
30 | a. I persuaded the doctor to examine John |
| b. I expected the doctor to examine John |
The surface structures are essentially the same. The sentence 30a is of the same form as 29. It derives from a deep structure which is roughly of the form 31:
31 | ![]() |
This deep structure is essentially the same as 21b, and by the transformational process described in connection with 29, we derive from it the sentence 30a. But in the case of 30b there are no such related structures as “I expected the doctor of the fact that he examined John,” “. . . of the necessity (for him) to examine John,” etc., as there are in the case of 30a. Correspondingly, there is no justification for an analysis of 30b as derived from a structure like 31. Rather, the deep structure underlying 30b will be something like 32 (again omitting details):
32 | ![]() |
There are many other facts that support this analysis of 30a and 30b. For example, from a structure like 32 we can form “What I expected was that the doctor (will, should, etc.) examine John,” by the same rule that forms “What I saw was the book,” from the underlying NP-V-NP structure “I saw the book.” But we cannot form “What I persuaded was that the doctor should examine John,” corresponding to 30a, because the underlying structure 31 is not of the form NP-V-NP as required by this transformation. Applying rule 26a to 32, we derive “I expected that the doctor (will, should, etc.) examine John.” We derive 30b, instead, by the use of the same rule that gives 29, with “to” rather than “that” appearing with the embedded sentence, which, in this case, contains no other representative of the category AUX.
Details aside, we see that 30a is derived from 31 and 30b from 32, so that despite near identity of surface structure, the deep structures underlying 30a and 30b are very different. That there must be such a divergence in deep structure is not at all obvious.33 It becomes clear, however, if we consider the effect of replacing “the doctor to examine John” by its passive, “John to be examined by the doctor,” in 30a and 30b. Thus we have under examination the sentences 33 and 34:
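33 | a. I persuaded the doctor to examine John |
| b. I persuaded John to be examined by the doctor |
34 | a. I expected the doctor to examine John |
| b. I expected John to be examined by the doctor |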
The semantic relation between the paired sentences of 34 is entirely different from the relation between the sentences of 33. We can see this by considering the relation in truth value. Thus 34a and 34b are necessarily the same in truth value; if I expected the doctor to examine John then I expected John to be examined by the doctor, and conversely. But there is no necessary relation in truth value between 33a and 33b. If I persuaded the doctor to examine John, it does not follow that I persuaded John to be examined by the doctor, or conversely.
In fact, exchange of active and passive in the embedded sentence preserves meaning, in a rather clear sense, in the case of 30b but not 30a. The explanation is immediate from consideration of the deep structures underlying these sentences. Replacing active by passive in 32, we then go on to derive 34b in just the way that 30b is derived from 32. But to derive 33b, we must not only passivize the embedded sentence in 31, but we must also select “John” instead of “the doctor” as the object of the verb “persuade”; otherwise, the conditions for deletion of the repeated noun phrase, as in the derivation of 29, will not be met. Consequently, the deep structure underlying 33b is quite different from that underlying 33a. Not only is the embedded sentence passivized, but the object “the doctor” must be replaced in 31 by “John.” The grammatical relations are, consequently, quite different, and the semantic interpretation differs correspondingly. It remains true, in both cases, that passivization does not affect meaning (in the sense of “meaning” relevant here). The change of meaning in 30a when “the doctor to examine John” is replaced by “John to be examined by the doctor” is occasioned by the change of grammatical relations, “John” now being the direct object of the verb phrase in the underlying structure rather than “the doctor.” There is no corresponding change in the case of 34a, so that the meaning remains unaltered when the embedded sentence is passivized.
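The argument can be restated with a deliberately crude encoding of the deep structures, in which only the grammatical relations are recorded. The dictionaries below are an assumption of this sketch and omit nearly everything in 31 and 32; the deletion condition is modeled as identity between the object of the main verb and the derived subject of the embedded sentence.

```python
def clause(subject, verb, obj, voice="active"):
    return {"subject": subject, "verb": verb, "object": obj, "voice": voice}

def relations(deep):
    """Collect (relation, verb, noun phrase) triples; voice plays no role."""
    found = {("subject-of", deep["verb"], deep["subject"])}
    if deep.get("object"):
        found.add(("object-of", deep["verb"], deep["object"]))
    if "clause" in deep:
        found |= relations(deep["clause"])
    return found

def deletion_applies(deep):
    """The main-verb object must match the derived subject of the embedded sentence."""
    inner = deep["clause"]
    derived_subject = inner["object"] if inner["voice"] == "passive" else inner["subject"]
    return deep["object"] == derived_subject

active = clause("the doctor", "examine", "John")
passive = clause("the doctor", "examine", "John", voice="passive")

expect_a = {"subject": "I", "verb": "expect", "clause": active}
expect_b = {"subject": "I", "verb": "expect", "clause": passive}
persuade_a = {"subject": "I", "verb": "persuade", "object": "the doctor", "clause": active}
blocked = {"subject": "I", "verb": "persuade", "object": "the doctor", "clause": passive}
persuade_b = {"subject": "I", "verb": "persuade", "object": "John", "clause": passive}

print(relations(expect_a) == relations(expect_b))
print(deletion_applies(blocked), deletion_applies(persuade_b))
print(relations(persuade_a) == relations(persuade_b))
```

Passivizing the embedded sentence leaves the relations under “expect” untouched. Under “persuade,” deletion goes through only if the object is changed to “John,” and that change alters the set of grammatical relations.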
The example 30a, 30b illustrates, once again, the inadequacy (and, quite generally, irrelevance) of surface structure for the representation of semantically significant grammatical relations. The labeled bracketing that conveys the information required for phonetic interpretation is in general very different from the labeled bracketing that provides the information required for semantic interpretation. The examples 30a, 30b also illustrate how difficult it may be to bring one’s “linguistic intuition” to consciousness. As we have seen, the grammar of English, as a characterization of competence (see pp. 102f.), must, for descriptive adequacy, assign different deep structures to the sentences 30a and 30b. The grammar that each speaker has internalized does distinguish these deep structures, as we can see from the fact that any speaker of English is capable of understanding the effect of replacing the embedded sentence by its passive in the two cases of 30. But this fact about his internalized grammatical competence may escape even the careful attention of the native speaker (see note 33).
Perhaps such examples as these suffice to give something of the flavor of the syntactic structure of a language. Summarizing our observations about the syntactic component, we conclude that it contains a base and a transformational part. The base generates deep structures, and the transformational rules convert them to surface structures. The categorial component of the base defines the significant grammatical relations of the language, assigns an ideal order to underlying phrases, and, in various ways, determines which transformations will apply.34 The lexicon specifies idiosyncratic properties of individual lexical items. Together, these two components of the base seem to provide the information relevant for semantic interpretation in the sense in which we have been using this term, subject to the qualifications mentioned earlier. The transformational rules convert phrase-markers to new phrase-markers, effecting various kinds of reordering and reorganization. The kinds of changes that can be effected are quite limited; we will, however, not go into this matter here. Applying in sequence, the transformations may affect the organization of a base phrase-marker quite radically, however. Thus the transformations provide a wide variety of surface structures that have no direct or simple relation to the base structures from which they originate and which express their semantic content.
It is a fact of some significance that the mapping of deep to surface structures is not a matter of a single step but is, rather, analyzable into a sequence of successive transformational steps. The transformations that contribute to this mapping of deep to surface structures can be combined in many different ways, depending on the form of the deep structure to which they apply. Since these transformations apply in sequence, each must produce a structure of the sort to which the next can apply. This condition is met in our formulation, since transformations apply to phrase-markers and convert them into new phrase-markers. But there is very good empirical evidence that the surface structures that determine phonetic form are, in fact, phrase-markers (that is, labeled bracketing of formatives). It follows, then, that the deep structures to which transformations originally apply should themselves be phrase-markers, as in our formulation.
In principle, there are many ways in which a network of grammatical relations might be represented. One of the major reasons for selecting the method of phrase-markers generated by base rules is precisely the fact that transformations must apply in sequence and therefore must apply to objects of the sort that they themselves produce, ultimately, to phrase-markers that have the same formal properties as surface structures.35
The grammatical theory just presented calls for several comments. We pointed out earlier that the grammar of a language must, for empirical adequacy, allow for infinite use of finite means, and we assigned this recursive property to the syntactic component, which generates an infinite set of paired deep and surface structures. We have now further localized the recursive property of the grammar, assigning it to the categorial component of the base. Certain base rules introduce the initial symbol S that heads derivations, for example, the fourth rule of 19. It may be that introduction of “propositional content” in deep structures by this means is the only recursive device in the grammar apart from the rules involved in forming coordinated constructions, which raise various problems going beyond what we have been discussing here.
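The way in which a single S-introducing rule yields unbounded embedding can be seen in a short sketch. It assumes a noun phrase of the form DET N that S in subject position, built from lexical items used in the earlier examples; the function `embed` is hypothetical, and the strings it produces are pre-transformational terminal strings, not surface sentences.

```python
def embed(depth):
    """Terminal string with `depth` levels of 'the fact that S' in subject position."""
    if depth == 0:
        return ["Bill", "past", "dream"]
    # NP of the form DET N that S, followed by the remainder of the enclosing sentence.
    return ["the", "fact", "that"] + embed(depth - 1) + ["past", "annoy", "John"]

for depth in range(3):
    print(" ".join(embed(depth)))
```

Each additional level reuses the same finite set of rules, so there is no longest string: the recursion lies entirely in the reappearance of S on the right-hand side of a base rule.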
It is reasonable to ask why human languages should have a design of this sort – why, in particular, they should use grammatical transformations of the sort described to convert deep structures to surface form. Why should they not make use of deep structures in a more direct way?36 Two reasons suggest themselves at once. We have already observed that the conditions of lexical insertion are essentially transformational rather than phrase-structural (see p. 130). More generally, we find many nonphrase-structural constraints (for example, those involved in deletion of identical items – see pp. 132 and 136) when we study a language carefully. Thus transformations not only convert a deep structure to a surface structure, but they also have a “filtering effect,” ruling out certain potential deep structures as not well-formed.37 Apart from this, we would naturally be inclined to seek an explanation for the use of grammatical transformations in the empirical constraints that linguistic communication must meet. Even the simple fact that sound is unrecoverable imposes conditions on speech that need not, for example, be imposed on a linguistic system designed only for writing (for example, the artificial systems mentioned in note 36). A written system provides an “external memory” that changes the perceptual problem in quite a significant way. We would expect a system designed for the conditions of speech communication to be somehow adapted to the load on memory. In fact, grammatical transformations characteristically reduce the amount of grammatical structure in phrase-markers in a well-defined way, and it may be that one consequence of this is to facilitate the problem of speech perception by a short-term memory of a rather limited sort.38 This observation suggests some promising directions for further research, but little of substance can be said with any confidence on the basis of what is understood today.
One further point requires some clarification. We noted at the outset that performance and competence must be sharply distinguished if either is to be studied successfully. We have now discussed a certain model of competence. It would be tempting, but quite absurd, to regard it as a model of performance as well. Thus we might propose that to produce a sentence, the speaker goes through the successive steps of constructing a base-derivation, line by line from the initial symbol S, then inserting lexical items and applying grammatical transformations to form a surface structure, and finally applying the phonological rules in their given order, in accordance with the cyclic principle discussed earlier. There is not the slightest justification for any such assumption. In fact, in implying that the speaker selects the general properties of sentence structure before selecting lexical items (before deciding what he is going to talk about), such a proposal seems not only without justification but entirely counter to whatever vague intuitions one may have about the processes that underlie production. A theory of performance (production or perception) will have to incorporate the theory of competence – the generative grammar of a language – as an essential part. But models of performance can be constructed in many different ways, consistently with fixed assumptions about the competence on which they are based. There is much that can be said about this topic, but it goes beyond the bounds of this paper.
Specifying the properties of the various components and subcomponents of a grammar precisely, along the lines outlined in this discussion, we formulate a highly restrictive hypothesis about the structure of any human language. As we have remarked several times, it is far from necessary, on any a priori grounds, that a language must have a structure of this sort. Furthermore, it seems quite likely that very heavy conditions can be placed on grammars beyond those outlined above. For example, it may be (as, in fact, was traditionally assumed) that base structures can vary only very slightly from language to language; and, by sufficiently restricting the possible range of base structures, it may be possible to arrive at quite general definitions for the categories that function as “nonterminal symbols” in the rules of the categorial component. As observed previously, this would provide language-independent definitions of grammatical relations, and would raise the possibility that there exist deep-seated universal principles of semantic interpretation.
In mentioning such possibilities, we must take note of the widespread view that modern investigations have not only conclusively refuted the principles of traditional universal grammar but have, moreover, shown that the search for such principles was ill-conceived from the start. But it seems to me that such conclusions are based on a serious misunderstanding of traditional universal grammar, and on an erroneous interpretation of the results of modern work. Traditional universal grammar tried to demonstrate, on the basis of what information was then available, that deep structures vary little from language to language. That surface structures might be highly diverse was never doubted. It was also assumed that the categories of syntax, semantics, and phonetics are universal and quite restricted in variety. Actually, modern “anthropological linguistics” has provided little evidence that bears on the assumption of uniformity of deep structures, and insofar as the universality of categories is concerned, conclusions rather like the traditional ones are commonly accepted in practice in descriptive work.39
Modern linguistics and anthropological linguistics have concerned themselves only marginally with deep structure, either in theory or practice. A great diversity of surface structures has been revealed in descriptive work, as anticipated in traditional universal grammar. Nevertheless, a good case can be made for the conclusion that the fundamental error of traditional universal grammar was that it was not sufficiently restrictive in the universal conditions it proposed for human language – that much heavier constraints must be postulated to account for the empirical facts.
Our discussion of the structure of English in the illustrative examples given previously has necessarily been quite superficial and limited to very simple phenomena. But even a discussion of the topics we have touched on requires a fairly intimate knowledge of the language and a reasonably well-articulated theory of generative grammar. Correspondingly, it is only when problems of the sort illustrated are seriously studied that any contribution can be made to the theory of universal grammar. Under these circumstances, it is not too surprising that even today, the hypotheses of universal grammar that can be formulated with any conviction are supported by evidence from a fairly small number of studies of very few of the languages of the world, and that they must therefore be highly tentative. Still, the inadequacy of the evidence should not be overstated. Thus it is surely true – and there is nothing paradoxical in this – that a single language can provide strong evidence for conclusions regarding universal grammar. This becomes quite apparent when we consider again the problem of language acquisition (see p. 106). The child must acquire a generative grammar of his language on the basis of a fairly restricted amount of evidence.40 To account for this achievement, we must postulate a sufficiently rich internal structure – a sufficiently restricted theory of universal grammar that constitutes his contribution to language acquisition.
For example, it was suggested earlier that in order to account for the perception of stress contours in English, we must suppose that the user of the language is making use of the principle of cyclic application. We also noted that he could hardly have sufficient evidence for this principle. Consequently, it seems reasonable to assume that this principle is simply part of the innate schematism that he uses to interpret the limited and fragmentary evidence available to him. It is, in other words, part of universal grammar. Similarly, it is difficult to imagine what “inductive principles” might lead the child unerringly to the assumptions about deep structure and about organization of grammar that seem to be necessary if we are to account for such facts as those we have mentioned. Nor is a search for such principles particularly well-motivated. It seems reasonable to assume that these properties of English are, in reality, facts of universal grammar. If such properties are available to the child, the task of language acquisition becomes feasible. The problem for the child is not the apparently insuperable inductive feat of arriving at a transformational generative grammar from restricted data, but rather that of discovering which of the possible languages he is being exposed to. Arguing in this way, we can arrive at conclusions about universal grammar from study of even a single language.
The child is presented with data, and he must inspect hypotheses (grammars) of a fairly restricted class to determine compatibility with these data. Having selected a grammar of the predetermined class, he will then have command of the language generated by this grammar.41 Thus he will know a great deal about phenomena to which he has never been exposed, and which are not “similar” or “analogous” in any well-defined sense to those to which he has been exposed.42 He will, for example, know the relations among the sentences 33 and 34, despite their novelty; he will know what stress contours to assign to utterances, despite the novelty and lack of physical basis for these phonetic representations; and so on, for innumerable other similar cases. This disparity between knowledge and experience is perhaps the most striking fact about human language. To account for it is the central problem of linguistic theory.
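The selection procedure just described can be made concrete with a deliberately tiny sketch. Everything in it is an assumption introduced for illustration only: the lexicon, the two candidate grammars (labeled "verb-medial" and "verb-final"), and the function names `generates` and `select_grammar` are hypothetical, and the predetermined class that universal grammar would actually make available is, of course, far richer. The sketch is "instantaneous" in the sense that all the data are consulted at once. Its only purpose is to show how, once the data single out one member of a restricted class, the learner thereby has command of sentences to which he was never exposed.

```python
from functools import lru_cache

# Hypothetical lexicon shared by every candidate grammar (an assumption).
LEXICON = {
    "John": "NP", "Bill": "NP", "Mary": "NP",
    "saw": "V", "knows": "V",
    "that": "C",
}

# Two hypothetical candidate grammars: each maps a category to the
# binary expansions it permits. They differ only in phrase order.
VERB_MEDIAL = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP"), ("V", "CP")],
    "CP": [("C", "S")],
}
VERB_FINAL = {
    "S": [("NP", "VP")],
    "VP": [("NP", "V"), ("CP", "V")],
    "CP": [("S", "C")],
}
HYPOTHESES = {"verb-medial": VERB_MEDIAL, "verb-final": VERB_FINAL}


def generates(grammar, words):
    """Return True if the grammar derives the word sequence from S."""
    cats = tuple(LEXICON[w] for w in words)

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        # A single word is derived by its own lexical category.
        if j - i == 1 and cats[i] == symbol:
            return True
        # Otherwise try every rule and every split point of cats[i:j].
        for left, right in grammar.get(symbol, ()):
            for k in range(i + 1, j):
                if derives(left, i, k) and derives(right, k, j):
                    return True
        return False

    return derives("S", 0, len(cats))


def select_grammar(hypotheses, data):
    """Return the names of all hypotheses compatible with every datum."""
    return [name for name, grammar in hypotheses.items()
            if all(generates(grammar, sentence) for sentence in data)]


if __name__ == "__main__":
    data = [("John", "saw", "Bill")]  # the restricted evidence
    print(select_grammar(HYPOTHESES, data))  # ['verb-medial']

    # A sentence absent from the data, with embedding the data never showed.
    novel = ("Mary", "knows", "that", "John", "saw", "Bill")
    print(generates(VERB_MEDIAL, novel))  # True
```

A single three-word sentence suffices here to eliminate one hypothesis, and the grammar that remains assigns a structure to an embedded sentence of a kind absent from the data. Nothing in the sketch bears on how hypotheses are sampled or ordered over time; it models only the end state of the selection.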
The basic conclusion that seems to be emerging with increasing clarity from contemporary work in linguistics is that very restrictive initial assumptions about the form of generative grammar must be imposed if explanations are to be forthcoming for the facts of language use and language acquisition. Furthermore, there is, so far, no evidence to suggest that the variety of generative grammars for human languages is very great. The theory of universal grammar suggested by the sketchy description that we have just given will no doubt be proven incorrect in various respects. But it is not unlikely that its fundamental defect will be that it permits far too much latitude for the construction of grammars, and that the kinds of languages that can be acquired by humans in the normal way are actually of a much more limited sort than this theory would suggest. Yet even as the theory of generative grammar stands today, it imposes fairly narrow conditions on the structure of human language. If this general conclusion can be firmly established – and, furthermore, significantly strengthened – this will be a highly suggestive contribution to theoretical psychology. It is hardly open to controversy that today, as in the seventeenth century, the central and critical problem for linguistics is to use empirical evidence from particular languages to refine the principles of universal grammar. I have tried, in this paper, to suggest some of the principles that seem well established and to illustrate some of the empirical considerations that bear on such principles.43
1. The term “grammar” is often used ambiguously to refer both to the internalized system of rules and to the linguist’s description of it.
2. To be more precise, a certain class of signals that are repetitions of one another, in a sense to which we return subsequently.
3. Or by some simple calculations of the number of sentences and “patterns” that might be needed, for empirical adequacy, in such repertoires. For some relevant comments, see G. A. Miller, E. Galanter, and K. H. Pribram, Plans and the Structure of Behavior (New York: Holt, Rinehart and Winston, 1960), pp. 145 f.; G. A. Miller and N. Chomsky, “Finitary Models of Language Users,” in R. D. Luce, R. Bush, and E. Galanter, eds., Handbook of Mathematical Psychology (New York: Wiley, 1963), Vol. II, p. 430.
4. The existence of innate mental structure is, obviously, not a matter of controversy. What we may question is just what it is and to what extent it is specific to language.
5. This assumption is not explicit in Wilkins, but is developed in other seventeenth- and eighteenth-century work. See my Cartesian Linguistics (New York: Harper & Row, 1966) for references and discussion.
6. In an appropriate sense of repetition. Thus any two physical signals are in some way distinct, but some of the differences are irrelevant in a particular language, and others are irrelevant in any language.
7. A theory of phonetic distinctive features is developed in R. Jakobson, G. Fant, and M. Halle, Preliminaries to Speech Analysis, 2nd edn (Cambridge, Mass.: MIT Press, 1963). A revised and, we think, improved version appears in N. Chomsky and M. Halle, Sound Pattern of English (New York: Harper & Row, 1968).
8. Observe that although the order of phonetic segments is a significant fact, there is no reason to assume that the physical event represented by a particular sequence of phonetic symbols can be analyzed into successive parts, each associated with a particular symbol.
9. See J. Katz, The Philosophy of Language (New York: Harper & Row, 1965), for a review of some recent work. For another view, see U. Weinreich, “Explorations in Semantic Theory,” in T. A. Sebeok, ed., Current Trends in Linguistics, Vol. III of Linguistic Theory (The Hague: Mouton, 1966); and for comments on this and more extensive development of the topic, see J. Katz, Semantic Theory (New York: Harper & Row, 1972). In addition, there has been quite a bit of recent work in descriptive semantics, some of which is suggestive with respect to the problems discussed here.
10. For discussion of this notion, see J. Katz, “Semantic Theory and the Meaning of ‘Good,’” Journal of Philosophy, Vol. 61, No. 23, 1964.
11. See Chomsky, Cartesian Linguistics, for discussion.
12. See p. 91. In general, a set of rules that recursively define an infinite set of objects may be said to generate this set. Thus a set of axioms and rules of inference for arithmetic may be said to generate a set of proofs and a set of theorems of arithmetic (last lines of proofs). Similarly, a (generative) grammar may be said to generate a set of structural descriptions, each of which, ideally, incorporates a deep structure, a surface structure, a semantic interpretation (of the deep structure), and a phonetic interpretation (of the surface structure).
13. The analysis that is presented here for purposes of exposition would have to be refined for empirical adequacy.
14. Notice that every two successive formatives are separated by a juncture, as is necessary if the representation of 5 as a single matrix is to preserve the formative structure. For present purposes, we may think of each segment of a formative as unmarked for all junctural features and each juncture as unmarked for each formative feature.
15. The reasons for this analysis go beyond the scope of this discussion. For details see Chomsky and Halle, Sound Pattern of English.
16. In the obvious sense. Thus [A . . . [B . . . ]B . . . [C . . . ]C . . . ]A would, for example, be a proper bracketing of the string . . . in terms of the labeled brackets [A, ]A, [B, ]B, [C, ]C, but neither of the following would be proper bracketing:
17. These are simplified, for expository purposes. See Chomsky and Halle, Sound Pattern of English, for a more accurate account. Notice that in this exposition we are using the term “applies” ambiguously, in the sense of “available for application” and also in the sense of “actually modifies the sequence under consideration.”
18. The word “eraser” is, at this stage, bisyllabic.
19. As earlier, we refer here to “tacit” or “latent knowledge,” which can, perhaps, be brought to consciousness with proper attention but is surely not presented to “unguided intuition.”
20. And other aspects. The argument is, in fact, much more general. It must be kept in mind that speech perception is often impaired minimally, or not at all, even by significant distortion of the signal, a fact difficult to reconcile with the view that phonetic analysis in detail is a prerequisite for analysis of the syntactic and semantic structure.
21. The latter is again ambiguous in an entirely different way from 4, depending on the reference of “him.” We will assume, throughout, that it refers to John.
22. But the interpretation of this depends on that of “see” and that of “past tense”; hence, these separate items must be represented in the deep structure, though not, in this case, in the surface structure.
23. A phrase X is an immediate constituent of the phrase Y containing X if there is no phrase Z which contains X and is contained in Y. Thus, the noun phrase “John” is an immediate constituent of the sentence “John saw Bill” [analyzed as in 6], but the noun phrase “Bill” is not, being contained in the intervening phrase “saw Bill.” “John saw” is not an immediate constituent of the sentence, since it is not a phrase; “John” is not an immediate constituent of “John saw,” since the latter is not a phrase. Notice that the definitions proposed here for grammatical functions and relations make sense only when restricted to deep structures, in general.
24. In fact, we might think of a grammar as assigning a semantic interpretation to all possible sentences (this being a clear notion, given theories of universal phonetics and semantics), including those that deviate from rules of the language. But this is a matter that we will not go into any further here.
25. The question of how the syntactic component is organized should not be confused, as it all too often is, with the problem of developing a model of performance (production or perception). In fact, any of the kinds of organization just described (and others) could be used as the basis for a theory of performance of either kind.
26. It may be that a slightly more general notion of “phrase-marker” is needed, but we will put this question aside here.
27. This may not seem obvious. We return to the example directly.
28. We henceforth suppose 21a and 21b to be extended to full phrase-markers by insertion of appropriate lexical entries, as indicated.
29. Notice that this transformation would modify the phrase-marker to which it applies in a more radical way than those discussed above. The principles remain the same, however.
30. These factors may affect performance, however. Thus they may affect the physical signal and play a role in determining how a person will interpret sentences. In both producing and understanding sentences, the speaker–hearer makes use of the ideal phonetic and semantic interpretations, but other factors also play a role. The speaker may be simply interested in making himself understood; the hearer, in determining what the speaker intended (which may not be identical with the literal semantic interpretation of the sentence or sentence fragment that he produced). Once again, we must insist on the necessity for distinguishing performance from competence if either is to be studied in a serious way.
31. More accurately, surface structures that are sufficiently close so as to determine the same phonetic representation.
32. Modern linguistics has made occasional use of this property of language as a research tool. The first general discussion of how ambiguity can be used to illustrate the inadequacy of certain conceptions of syntactic structure is in C. F. Hockett’s “Two Models of Grammatical Description,” Word, Vol. 10, 1954, pp. 210–31, reprinted in M. Joos, ed., Readings in Linguistics One, 4th edn (Chicago: University of Chicago Press, 1966).
33. It seems, in fact, that this phenomenon has escaped the attention of English grammarians, both traditional and modern.
34. It is an open question whether this determination is unique.
35. There are other supporting reasons. For one thing, grammatical relations are not among words or morphemes but among phrases, in general. For another, empirical investigation has uniformly shown that there is an optimal ideal order of phrases in underlying structures, consistent with the assumption that these are generated by a base system of the sort discussed above.
36. It is interesting to observe, in this connection, that the theory of context-free phrase-structure grammar (see p. 125) is very close to adequate for “artificial languages” invented for various purposes, for example, for mathematics or logic or as computer languages.
37. And hence, in certain cases, as underlying “semigrammatical sentences” that deviate, in the indicated way, from grammatical rule. This suggests one approach to the problem touched on in note 24.
38. For some speculations about this matter and discussion of the general problem, see G. A. Miller and N. Chomsky, “Finitary Models of Language Users,” in R. D. Luce, R. Bush, and E. Galanter, eds., Handbook of Mathematical Psychology (New York: Wiley, 1963), Vol. II. The suggestion that transformations may facilitate performance is implicit in V. Yngve, “A Model and a Hypothesis for Language Structure,” Proceedings of the American Philosophical Society, 1960, pp. 444–66.
39. Traditional theories of universal phonetics have been largely accepted as a basis for modern work, and have been refined and amplified in quite important ways. See the references in note 7.
40. Furthermore, evidence of a highly degraded sort. For example, the child’s conclusions about the rules of sentence formation must be based on evidence that consists, to a large extent, of utterances that break rules, since a good deal of normal speech consists of false starts, disconnected phrases, and other deviations from idealized competence.
The issue here is not one of “normative grammar.” The point is that a person’s normal speech departs from the rules of his own internalized grammar in innumerable ways, because of the many factors that interact with underlying competence to determine performance. Correspondingly, as a language learner, he acquires a grammar that characterizes much of the evidence on which it was based as deviant and anomalous.
41. We are presenting an “instantaneous model” of language acquisition which is surely false in detail, but can very well be accepted as a reasonable first approximation. This is not to deny that the fine structure of learning deserves study. The question, rather, is what the range of possibilities may be within which experience can cause knowledge and belief to vary. If the range is quite narrow (as, it seems to me, is suggested by considerations of the sort mentioned above), then a first approximation of the sort suggested will be a prerequisite to any fruitful investigation of learning. Given an instantaneous model that is empirically well supported, as a first approximation, there are many questions that can immediately be raised: for example, what are the strategies by which hypotheses are sampled, how does the set of hypotheses available at one stage depend on those tested at earlier stages, etc.
42. Except, tautologically, in the sense that they are accounted for by the same theory.
43. In addition to works mentioned in earlier notes, the following books can be consulted for further development of topics touched on in this paper: N. Chomsky, Syntactic Structures (The Hague: Mouton, 1957); N. Chomsky, Aspects of the Theory of Syntax (Cambridge, Mass.: MIT Press, 1965); M. Halle, Sound Pattern of Russian (The Hague: Mouton, 1959); J. Katz and P. Postal, An Integrated Theory of Linguistic Descriptions (Cambridge, Mass.: MIT Press, 1964). See also many papers in J. Fodor and J. Katz, eds., Structure of Language: Readings in the Philosophy of Language (Englewood Cliffs, N.J.: Prentice-Hall, 1964). For more information on aspects of English structure touched on here, see also R. Lees, Grammar of English Nominalizations (New York: Humanities Press, 1963), and P. Rosenbaum, “Grammar of English Predicate Complement Constructions,” unpublished Ph.D. dissertation, MIT, 1965. For further material see the bibliographies of the works cited.