6

Second language phonology

Fred R. Eckman

Introduction

The field of second-language (L2) phonology attempts to document and explain the pronunciation patterns of non-native learners of a language.1 The purpose of this chapter is to describe this area of study through several steps. The first is to locate L2 phonology within the field of second language acquisition (SLA), and, in turn, to relate it to other disciplines. We will then consider a number of specific aspects of L2 phonology, including how it fits into a historical context, what its major findings have been, where the major questions and issues lie, and what kinds of data it seeks to use in order to test empirically the various hypotheses that one can postulate within the field. Finally, the chapter will take up the question of how the results of research on L2 phonology can be brought to bear on language pedagogy, and outline what some of the major questions are that need to be addressed by future research.

Because the field of SLA necessarily impinges on several academic areas, including psychology (SLA entails language contact within an individual), sociology (language contact within a society) and biology (the end state for acquisition seems to be different for older versus younger learners), L2 acquisition is naturally studied from a multidisciplinary perspective. Second-language phonology, by extension, is also of interest to researchers in several disciplines. Psycholinguistic perspectives on L2 phonology are concerned with, for example, the explicit and implicit learning of L2 phonemic categories; sociolinguistic approaches to L2 phonology, on the other hand, are interested in how L2 pronunciation patterns may vary according to the social context in which a person is speaking; and from a biological viewpoint, L2 phonologists may investigate the acquisition of L2 sounds and contrasts as a function of age of first exposure to the target language (TL). L2 phonology is also studied from the point of view of phonological theory, placing it squarely within the domain of linguistics. The major distinction in the ways that L2 phonology is approached by the above-mentioned related disciplines, on the one hand, and the way it is investigated through a linguistic approach, on the other, is in the kinds of constructs and principles used to give explanations for the relevant L2 facts. Whereas psycholinguistic, sociolinguistic and biological approaches to L2 phonology employ constructs appropriate to their corresponding disciplines, such as short-term memory, prestigious dialect, and the critical period hypothesis, respectively, linguistic approaches invoke concepts such as grammar, phoneme, devoicing, and derived environment. For a general discussion of these other perspectives, the reader is referred to the appropriate chapters in this volume. The focus of this chapter will be on a linguistic approach to L2 phonology.

Given this introduction, the remainder of the chapter is structured as follows. The next section places L2 phonology in its historical context by dividing the field into two distinct periods according to the objectives of the research carried out at the time. The section following that addresses several of the major models and approaches to L2 phonology, as well as some of the findings that derive from these different research programs. This is followed by a description of the kinds of studies that have been conducted in order to test the various hypotheses that these approaches have made, which leads naturally to a discussion of the instructional relevance of these findings. The chapter concludes with an outline of some of the important questions remaining to be addressed by future research.

An important caveat at the outset is that our discussion faces limitations of space that will necessarily curtail to some extent our treatment of all of these topics. Where possible the reader is referred to relevant chapters in this, or other volumes, or to the original work itself for more details.

Historical discussion

Although there are a number of ways in which one could structure the discussion of the historical context, we will, for our purposes, divide the study of L2 phonology into two major eras, the first being the time prior to the formulation of the interlanguage (IL) hypothesis, which we will designate as “pre-ILH,” and the second being the period after the postulation of the IL hypothesis, which we term “post-ILH.” The most salient characteristic of the pre-ILH approaches to explaining L2 pronunciation problems is that they focused on only two linguistic systems, the L2 learner's native language (NL) and TL. We turn first to the discussion of the pre-ILH period, after which we will define the term “interlanguage,” and take up the post-ILH era.

Pre-ILH

The explanation of pronunciation patterns of L2 learners that we label as pre-ILH focused largely on explaining learners’ errors in terms of the differences between the NL and TL, the underlying assumption being that the NL influenced the learning of the TL. The goal of SLA studies in general, and investigations into L2 phonology, in particular, was to explain learning difficulty, which, in turn, led to claims about degree of difficulty, and to the postulation of hierarchies of difficulty.

The best known and most explicit claim that the L2 learner's NL had a significant role to play in accounting for pronunciation errors dates from the middle of the last century. This work was carried out within the context of the contrastive analysis hypothesis (CAH), and claimed that NL-TL differences, along with L1 transfer, were paramount in accounting for L2 utterances. During this era, a phonological analysis of a language consisted of a description of the phonemes of that language along with the distribution of the allophones of those phonemes. L2 pronunciation errors were explained in terms of a comparison of the phonemes and their distribution within the respective NL and TL, and although phonemes figured prominently in the predictions of the CAH, a large role was also played by allophones. (See also Pickering, Chapter 20, this volume, for further discussion.) Thus, Lado's (1957) proposals concerning learning difficulty addressed the question of what constituted maximum phonological difficulty, and allophonic differences between the NL and TL were an important part of his predictions. For him, the greatest difficulty lay in the learner re-categorizing two or more allophones in the NL into different phonemic categories in the TL. An example, the one used by Lado, involved the sounds [d] and [ð], which are allophones of /d/ in Spanish, but which contrast as distinct phonemes in English. Lado claimed that assigning the allophones [d] and [ð] to separate phonemes in English by a learner whose native language is Spanish constituted maximum learning difficulty.

Important contributions to what constituted maximal learning difficulty in L2 pronunciation within the context of the CAH were made by Stockwell and Bowen (1965) and by Hammerly (1982). Stockwell and Bowen (1965) expanded and refined the predictions of the CAH by comparing the NL and TL in terms of whether any given sound was phonemic, allophonic, or absent in either language. Through these comparisons the authors constructed an eight-level hierarchy of difficulty in which maximum phonological difficulty was ascribed to a learner having to acquire a TL allophone that was absent in the NL. Work by Hammerly (1982) supported empirically some of the claims about learning difficulty made by the CAH and by Stockwell and Bowen's hierarchy. Hammerly's analysis showed that, of the six most problematic areas of pronunciation, the top three involved allophones. The greatest difficulty for his subjects was the suppression of NL allophones in pronouncing the TL; the second area of difficulty was producing NL allophones with a different distribution in the TL, including contrastive distribution; and the third most difficult aspect of L2 pronunciation was the production of a TL allophone that did not exist in the NL.

In the research that followed the postulation of the CAH in the ensuing decades, the results were mixed in terms of whether the hypothesis was confirmed. Although there were studies in which the findings were supportive of the CAH, there were many others that reported facts that were counter to the hypothesis, which led to the eventual demise of the CAH. Much of the work in this framework, while setting out to find support for the CAH, actually found that developmental processes, patterns often found in first language acquisition, played a more significant role in the explanation of L2 sound patterns than did NL-TL differences. For example, Johansson's (1973) study of 20 L2 learners of Swedish from eight different native-language backgrounds showed that, although some of the errors were predictable by the CAH, others were explainable in terms of articulation ease.

In sum, research within the CAH paradigm from this period showed that, whereas NL influence had a role to play in explaining some aspects of L2 pronunciation, the influence of the NL could not explain all of the facts. It became clear, therefore, that other principles were necessary to explain learning difficulty that could not be directly related to NL-TL differences. Over the decades since that time, numerous proposals have been made to account for facts that are not subsumed under the CAH. These include proposals pertaining to the similarities between the NL and TL, facts about the relationship between production and perception, and principles of markedness, each of which is briefly considered below.

Although the majority of work on L2 pronunciation during this time was done within the principles of the CAH, and attempted to explain L2 phonological difficulty on the basis of differences between the NL and TL, there are also frameworks that base the explanation of pronunciation problems on similarities between the NL and TL. Two such models that incorporate the role of L2 perception are the speech learning model (SLM) developed by Flege (1995), and the perceptual assimilation model (PAM), proposed by Best (1995). Pickering (Chapter 20, this volume) gives an in-depth discussion of Flege's SLM. For this reason, and because the topic of L2 speech perception is covered in Part IV of this volume, I will not discuss the SLM and PAM any further here.

A principle that was introduced into SLA theory to help address problems with the CAH was typological markedness, a concept that was pioneered by the Prague School of Linguistics in the theories of Trubetzkoy (1939). The idea behind markedness is that binary oppositions between certain linguistic representations (e.g., voiced and voiceless obstruents, or open and closed syllables) are not simply polar opposites, but that one member of the opposition is assumed to be privileged in that it has wider distribution, both across languages and within a language. Assigning the term “unmarked” to this privileged member is a way of giving it special status, and indicating that it is considered to be, in some definable way, simpler, more basic, and more natural than the less widely occurring member of the opposition, which is designated as being “marked.” Over the years, the term markedness has taken on a number of different definitions within several distinct approaches to linguistics (see Battistella, 1990 for discussion).

The proposal with respect to L2 phonology was that markedness, as defined in terms of cross-linguistic, implicational generalizations, as in (1) below, would be incorporated into the CAH as a measure of relative difficulty. According to this proposal, any given TL structure would be predicted to be more difficult if it was both different from the corresponding NL structure, and was also more marked than that structure. This claim was explicitly embodied in the markedness differential hypothesis (MDH) stated in Eckman (1977). Whereas the CAH attempted to explain L2 learning difficulty only on the basis of differences between the NL and TL, the claim behind the MDH is that NL-TL differences are necessary for such an explanation, but they are not sufficient, and that therefore one must incorporate into the hypothesis the concept of typological markedness as a measure of relative difficulty.

(1) Typological markedness

A structure X is typologically marked relative to another structure, Y, if every language that has X also has Y, but every language that has Y does not necessarily have X.
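The implicational logic of (1) can be made concrete with a minimal sketch. The three "languages" below are an invented toy sample, not attested inventories, and the function name and structure labels are hypothetical; the point is only to show that markedness in this sense is an asymmetric relation over a set of languages.

```python
from typing import Dict, Set


def is_marked_relative_to(x: str, y: str, sample: Dict[str, Set[str]]) -> bool:
    """Return True if, within the sample, X is marked relative to Y:
    every language that has X also has Y, and at least one language
    has Y without having X."""
    langs_with_x = [lang for lang, structs in sample.items() if x in structs]
    langs_with_y = [lang for lang, structs in sample.items() if y in structs]
    x_implies_y = all(y in sample[lang] for lang in langs_with_x)
    y_without_x = any(x not in sample[lang] for lang in langs_with_y)
    # Require at least one language with X so the implication is not vacuous.
    return bool(langs_with_x) and x_implies_y and y_without_x


# Hypothetical toy sample (labels A, B, C are placeholders, not real languages).
TOY_SAMPLE = {
    "A": {"final_voiceless_obstruent", "final_voiced_obstruent"},
    "B": {"final_voiceless_obstruent"},
    "C": set(),
}

print(is_marked_relative_to(
    "final_voiced_obstruent", "final_voiceless_obstruent", TOY_SAMPLE))
print(is_marked_relative_to(
    "final_voiceless_obstruent", "final_voiced_obstruent", TOY_SAMPLE))
```

In this toy sample the first call reports that word-final voiced obstruents are marked relative to voiceless ones, and the second call shows that the relation does not hold in the opposite direction.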

Over the last three decades there have been a number of studies addressing the claims of the MDH, showing that typological markedness is a reliable predictor of difficulty, that there are cases where the directionality of difficulty between the NL and TL involved in a language-contact situation follows the predictions of the MDH, and that the relative degree of difficulty corresponds to the relative degree of markedness (Anderson, 1987; Carlisle, 1992; de Jong et al., 2009).

To sum up this sub-section, we have seen that several proposals have been developed over the years to address some of the perceived shortcomings of the CAH. Thus, it is fair to state that conventional wisdom within the field of L2 phonology is that, although the learner's NL has a role to play in explaining certain aspects of learning difficulty, NL influence is by no means sufficient to account for L2 pronunciation patterns. Rather, additional principles are necessary.

Post-ILH

Interlanguage. We now address the post-interlanguage hypothesis period of L2 phonology. Interlanguage (IL) is the term given to the mental system developed by L2 learners that enables them to produce and understand utterances of the TL. The idea behind this construct, which has been one of the key developments over the past few decades in SLA theory, in general, and in L2 phonology, in particular, is that L2 learners create their own version of the target language. The motivation for this idea, which was proposed independently by three different scholars (Corder, 1971; Nemser, 1971; Selinker, 1972), is the same as that invoked by a linguist postulating a mental grammar as underlying the ability of a native speaker of any language to speak and understand utterances of that language. Just as the patterns found in the productions of a native speaker of a language are assumed to derive from that speaker's mental system, so too, the patterns found in the utterances made by non-native speakers of a language are hypothesized to derive from a mental set of rules, viz. the interlanguage system that the learner has acquired.

It is important to note that, in linguistic discussions, the term “grammar” can be used with a systematic ambiguity, on the one hand referring to a description of sentence structure, and on the other hand indicating a mental system that relates meanings and utterances. It is in the latter, more general sense in which “grammar” includes phonological rules that the word is used in this chapter. Where these phonological rules characterize L2 patterns, the term “interphonology” is often used.

This IL system of the L2 learner must be at least partially different from the learner's NL grammar, because the L2 utterances produced by the learner of a language are different from that learner's NL utterances. By the same token, the IL grammar must be different in part from the TL grammar, because the L2 utterances of the learner are distinct from those produced by native speakers of the TL. Thus, a mental grammar that is in at least some respects different from both the NL and the TL must underlie the utterances of the L2 learners. On this view, the process of second language acquisition becomes the construction of a mental system of rules, the interlanguage. And although the acquisition of this IL may be based in part on structures transferred from the NL, and in part on input from the TL, it has also been shown that the IL is, to some extent, independent of both the NL and TL. This is true because studies have shown that some L2 patterns are not part of either the NL or TL.

The strongest argument for the postulation of an IL is an empirical one that requires providing evidence of what is acknowledged to be the most interesting of L2 data, viz., a pattern of utterances that, on the one hand, does not derive from NL transfer, because the NL does not evidence the regularity in question, and on the other hand could not be tied to TL input, because the TL does not exhibit the relevant pattern either. In other words, neither the NL nor the TL can explain the observed L2 patterns, but, as with all regularities, an explanation is required. Therefore, the interlanguage must be hypothesized to explain the observed systematicity.

The value of the construct of an IL is that it has allowed researchers to propose answers to questions that, before this notion was proposed, could not even be asked. Given the concept of IL, not only is it possible, but also reasonable, to raise the question of whether IL grammars are similar in important ways to native language grammars. It is this question which has driven many, if not most, of the research programs in L2 phonology over the last few decades, and on which we will focus in the following sections.

An example of this kind of IL pattern in L2 phonology was reported in Altenberg and Vago (1983) and in Eckman (1981). Altenberg and Vago found that their subjects who were native speakers of Hungarian learning English exhibited the kind of L2 pattern that would motivate a rule of word-final devoicing. What is particularly interesting about this outcome is that neither the phonology of the NL nor that of the TL has a devoicing rule, because both languages have a voice contrast in word-final position. Thus, the IL devoicing rule is independent of both the NL and TL. In the Eckman study, speakers of Spanish produced an IL pattern that motivated a rule of word-final devoicing. This was also a situation where the IL rule was independent of both the NL and TL in that Spanish did not exhibit the kind of evidence necessary to motivate such a rule. What is especially interesting is that the cases of the Hungarian and Spanish learners represent an example of an IL pattern that is not attributable to either NL transfer or TL input, but is attested in other languages of the world, including Catalan, German, Polish, and Russian, to name a few. See also Pickering (Chapter 20, this volume) for discussion of the relationship between the markedness differential hypothesis and the Structural Conformity Hypothesis (SCH).

To summarize this subsection, the concept of interlanguage led directly to the possibility that L2 patterns could emerge which were independent of both the NL and TL. This development allowed L2 researchers to question whether IL phonologies were in fact similar in important respects to L1 phonologies.

Constraints on IL grammars. With the postulation of the construct, interlanguage, the goal of SLA theory, and by inclusion, L2 phonology, changed from attempting to explain learning difficulty in L2 acquisition to the goal of addressing the question of why IL phonologies are the way they are, which is actually a subset of the question, “why is SLA the way it is?” The response that the research programs of this era have given is along these lines: “SLA is the way it is, because IL systems are the way they are; and IL systems are as they are, because they are constrained by general, linguistic principles.” In the case of interphonology, the constraints are principles of phonological theory, and within this context, it is also possible for explanations to hark back to the role of the learner's NL. Thus, a number of proposals hypothesized that various linguistic principles interacted with the NL phonology to explain L2 pronunciation patterns. Approaches to L2 phonology differed as to which kinds of theoretical principles the investigators proposed as constraining IL phonologies, and as to the extent to which the learner's NL interacted with these principles. We now turn our discussion to the kind of principles that were proposed as constraining IL phonologies.

Constraining principles. The development of phonological theory during this period led to the use of a number of general principles and constructs from phonology to explain facts about L2 pronunciation patterns. These constructs included distinctive features, rule types, underlying representations, and derivations. Researchers viewed the application of these principles as a way, on the one hand, of testing the general claim that interphonologies are constrained in the same way as are native-language phonologies, and, on the other hand, as a way of accounting for different aspects of L2 pronunciation patterns. Whereas some of the constructs that were used to explain some aspect of L2 phonology may have overlapped in part with those invoked as an explanation in other areas, and although the application of some principles may have resulted in deeper explanations than the application of others, it is in general the case that the phonological principles in question did not conflict with each other, or create issues that had to be addressed.

As proposals for non-linear representations made their way into phonological theory, prosodic hierarchies (Zampini, 1997), metrical grids (Archibald, 1993, 1995), and feature geometry (Brown, 1998) were also employed as explanatory principles in L2 phonology. Likewise, as did phonologists focusing on NL phonologies, L2 phonologists appealed to linguistic universals, interpreted in the broadest sense of “universal,” including typological generalizations and principles of Universal Grammar (UG), as sources of explanation for second-language pronunciation patterns. More recently, some L2 phonologists have turned from rule-driven grammars to constraint-based analyses within an Optimality Theory (OT) framework for analyzing L2 pronunciation patterns. OT is a framework in which constraints, constraint rankings and constraint re-rankings in the learner's interlanguage, rather than rules, determine the time course of acquisition (Broselow et al., 1998; Hancin-Bhatt, 2008). During this period, one of the important recurring themes in virtually all approaches to L2 phonology has been the reporting and explanation of L2 phonological patterns that are not directly attributable either to the learner's NL or to the TL, but may be attested in the phonologies of other languages of the world. Indeed, it is this kind of evidence that purports to show the fundamental properties involved in the acquisition of L2 phonology.

The first example of this involves the employment of prosodic hierarchies and metrical grids to account for the stress patterns of L2 learners. A prosodic hierarchy is a structural representation of the phonological domains that are relevant to the application of rules, with the phonetic segment being the smallest domain, the syllable the next larger, the foot the next larger, and so on, with the utterance being the largest and most inclusive domain. A metrical grid is a structural hierarchy of the syllables and prosodic feet of a given utterance from which properties of stress can be predicted. These constructs have been invoked by L2 phonologists (Archibald, 1993, 1995; Mairs, 1989) to account for the acquisition of stress patterns. Both Archibald and Mairs found that the NL had a role to play in the explanation of IL patterns in that L2 learners transferred some, or all, of their NL metrical grid into the IL grammar, but that the NL principles that applied to these grids may operate differently in the IL grammar than they do in the NL or in the TL.

Two other studies of note involving prosodic features are Nguyen and Macken (2008), and Zampini (1997). Nguyen and Macken analyzed the acquisition of the production of Vietnamese tones by American learners. The authors found the patterns of tone production to be influenced by several factors, including universal principles, NL structures, and TL-specific rules. Zampini studied the acquisition of Spanish spirantization, that is, the rule by which a stop becomes a fricative, by native speakers of English. She showed that the spirantization rule of Spanish must be formulated in terms of the prosodic hierarchy, with the domain of the rule being the intonational phrase. However, she found that the spirantization rule in the respective interlanguage grammars of her subjects applied in a more restrictive domain: only word-internally for most of her subjects, but in the phonological word for others.

Other kinds of phonological constructs have also been put forth as constraints on IL grammars; these include aspects of feature geometry and underspecification, along with principles governing the well-formedness of syllables.

Brown (1998) proposed that feature geometry and underspecification could be invoked as explanatory principles for L2 pronunciation patterns. Feature geometry is a system of representing segments in which phonological features are not unordered bundles of properties, but are instead structured hierarchically so that some features are dependent on others. Underspecification is a system for representing the underlying representations of segments that takes advantage of the fact that some properties of sounds are predictable on the basis of other properties. These predictable properties are redundant for making contrasts, and are not represented underlyingly. All features are completely specified only when the segments are realized phonetically. Assuming that feature geometry and underspecification are principles that constrain IL phonologies, just as they constrain L1 grammars, Brown employed them in her analysis of the acquisition of several English contrasts by speakers of Japanese, Korean, and Mandarin.

Studies involving principles of syllable structure as a constraint on IL grammars have been carried out, for the most part, using NLs which are much more restrictive than English in the kinds of syllables they allow. Sato (1984) conducted one such study of two Vietnamese-speaking brothers, aged 10 and 12, eliciting utterances exclusively through spontaneous conversations. Sato's data contained numerous tokens of syllable-initial and syllable-final consonant clusters, demonstrating that the subjects’ difficulty with the TL clusters was reflected not in terms of vowel epenthesis, but in terms of reducing the length of the clusters in question, or in changing the features of one or both of the segments involved. Thus, bi-literal clusters were often reduced to single consonants, and voiced obstruents in the clusters were often devoiced. More recently, Hansen (2004) carried out a year-long study of two Vietnamese learners of English, mapping the development of onsets and codas as a function of several linguistic and contextual factors. Other longitudinal studies have raised the question of whether the development of L2 syllable structure is linear. Abrahamsson (2003) concluded that coda development over time was U-shaped rather than linear. One explanation proposed for this kind of development is that the subjects may tend to pay less attention to form as their fluency increases, and as their ability to control a more casual style of speaking develops.

We now focus our attention on the proposal that linguistic universals act as constraints on IL grammars, and then conclude this section with a discussion of the framework of Optimality Theory. See Pickering (Chapter 20, this volume) for discussion of the use of Optimality Theory in explaining L2 phonology.

One of the earliest studies in L2 phonology that utilized a parameter of UG as an explanatory principle was Broselow and Finer (1991). A parameter is a construct that specifies how grammars can differ from each other with respect to a given structure. The study by Broselow and Finer invoked the Minimal Sonority Distance (MSD) parameter to explain the performance of Korean and Japanese learners of English on the production of onset clusters.

This parameter uses a measure called the Sonority Index (SI), from Selkirk (1982), which assigns a numerical value to certain consonant types according to the consonant's sonority—the greater the sonority of the segment, the greater the value assigned by the SI. The idea is that the MSD parameter characterizes the systematic variation found in the kinds of onsets allowed cross-linguistically, specifying, for any given language the minimal difference in sonority that must exist between adjacent consonants in the onset of a syllable. This minimal difference is computed by subtracting the SI value of one consonant type from that of the other consonant type co-occurring in the onset. If the resulting value is equal to or greater than the value specified by the MSD for that language, then the onset cluster is allowed; if the resulting value is less than that number, the cluster is disallowed. The point of most interest in the Broselow and Finer (1991) study was the finding that their subjects did not simply transfer the value of the MSD parameter of the NL into the IL, nor did they evidence TL-like values of this parameter. Rather, the subjects evinced IL patterns that were somewhere in between the NL and TL settings, providing another instance in which interlanguage grammars obey general principles of phonological theory.
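The computation described above can be sketched in a few lines. The numerical scale below is an illustrative assumption rather than a quotation of the Sonority Index, the function names are hypothetical, and the direction of subtraction (second consonant minus first) is also an assumption, since the text specifies only that one value is subtracted from the other.

```python
# Illustrative sonority scale (an assumption for this sketch): higher
# numbers stand for more sonorous consonant classes.
SONORITY_INDEX = {
    "stop": 1,
    "fricative": 2,
    "nasal": 3,
    "liquid": 4,
    "glide": 5,
}


def sonority_distance(first: str, second: str) -> int:
    """Difference in sonority between two adjacent onset consonants,
    computed here as second minus first (rising toward the nucleus)."""
    return SONORITY_INDEX[second] - SONORITY_INDEX[first]


def onset_allowed(first: str, second: str, msd: int) -> bool:
    """An onset cluster is allowed if its sonority distance is equal to
    or greater than the MSD setting; otherwise it is disallowed."""
    return sonority_distance(first, second) >= msd


# The same cluster types evaluated under three hypothetical MSD settings.
for msd in (1, 2, 3):
    print(msd,
          onset_allowed("stop", "liquid", msd),      # distance 3
          onset_allowed("fricative", "nasal", msd))  # distance 1
```

On this sketch, a grammar with a lower MSD setting admits every cluster admitted by a grammar with a higher setting, which is what makes an intermediate interlanguage setting a coherent possibility.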

An approach to L2 phonology that invoked generalizations from the other school of universals, typological markedness, is the Structural Conformity Hypothesis (SCH) (Eckman, 1991), stated as in (2).

(2) The Structural Conformity Hypothesis (Eckman 1991, p. 24)

The universal generalizations that hold for primary languages hold also for interlanguages.

This hypothesis developed historically from the Markedness Differential Hypothesis (MDH), discussed briefly in the Pre-ILH section above, and it is part of the research program that invokes principles of linguistic typology to explain facts about L2 phonology. Thus, the universal generalizations that have been tested with respect to the SCH are typological universals. The crucial difference between the MDH and the SCH is that the former is relevant only in cases where the NL and TL are different with respect to some representation or structure, whereas the SCH is neutral on NL-TL differences. The MDH can be considered a special case of the SCH, viz., the case when the typological generalization in question involves an area of NL-TL difference. Consistent with the point made in the above section on Interlanguage, the strongest kind of support for the SCH is an L2 pattern in which the structures adhere to universal principles, but the patterns in question are not directly derivable from either the NL or the TL. This type of evidence has been adduced in a number of studies, including Altenberg and Vago (1983), Eckman (1991), and Carlisle (1998), among others.

Two other proposals related to the topic of markedness, and also discussed by Pickering (Chapter 20, this volume), are the Similarity Differential Rate Hypothesis (SDRH) formulated by Major and Kim (1996), and Optimality Theory. The SDRH reprises an idea from earlier proposals in L2 phonology that sounds which are different from those in the NL may be easier to acquire, in that these sounds are acquired more quickly than sounds that are similar. According to the hypothesis, “rate of acquisition” is the basis for explaining many L2 pronunciation errors, not difficulty, as is stated in both the CAH and the MDH.

The most significant difference between a phonology within the framework of Optimality Theory and phonologies within other approaches to sound systems is how well-formedness is described. In rule-based phonologies, a well-formed representation is characterized by constructing a set of rules, which, if adhered to, will yield well-formed utterances as judged by native speakers. Deviance is described by showing that unacceptable representations violate at least one of the principles of the phonology. Within OT, on the other hand, phonologies consist of a universal set of constraints instead of a set of rules. A good way to conceive of the constraints is as criteria for well-formedness, and the assumption is that no language can satisfy all of these criteria. Therefore, the constraints in question can be violated, and conflicts among constraints are resolved by ranking them in such cases in different orders for different languages. Within OT, therefore, well-formedness is not characterized on the basis of whether or not an utterance violates one or more of the constraints; instead, whether an utterance is well-formed is determined by an optimization procedure whereby well-formed utterances are those that conform to the highest ranked constraints. Thus, within the framework of OT, phonologies of particular languages result from different rankings of the universal constraints; any ranking of the universal constraints should yield a phonology of some language, and any phonological system of a language should conform to one of the possible rankings of the constraints.
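The optimization procedure just described can be illustrated with a minimal sketch. The input, the four candidates, the constraint labels, and the violation counts are all simplified assumptions chosen for illustration, as is the function name; they are not an analysis drawn from any of the studies cited. A candidate wins under a given ranking if its violations, read off from the highest ranked constraint downward, are the least serious.

```python
from itertools import permutations
from typing import Dict, Sequence

# Hypothetical tableau for an input like /bed/. Each candidate maps
# constraint labels to violation counts; "bede" stands for a form with
# an epenthetic final vowel.
TABLEAU: Dict[str, Dict[str, int]] = {
    "bed":  {"NoVoicedCoda": 1, "IdentVoice": 0, "Max": 0, "Dep": 0},
    "bet":  {"NoVoicedCoda": 0, "IdentVoice": 1, "Max": 0, "Dep": 0},
    "be":   {"NoVoicedCoda": 0, "IdentVoice": 0, "Max": 1, "Dep": 0},
    "bede": {"NoVoicedCoda": 0, "IdentVoice": 0, "Max": 0, "Dep": 1},
}


def optimal(tableau: Dict[str, Dict[str, int]], ranking: Sequence[str]) -> str:
    """Return the candidate whose violation profile, ordered from the
    highest ranked constraint to the lowest, is lexicographically least."""
    return min(tableau, key=lambda cand: tuple(tableau[cand][c] for c in ranking))


# Faithfulness ranked high: the voiced coda surfaces unchanged.
print(optimal(TABLEAU, ["IdentVoice", "Max", "Dep", "NoVoicedCoda"]))
# Markedness ranked high, IdentVoice ranked lowest: devoicing is optimal.
print(optimal(TABLEAU, ["NoVoicedCoda", "Max", "Dep", "IdentVoice"]))

# Every ranking of the same constraints selects some winner; collecting
# the winners across all rankings gives the patterns this toy set predicts.
winners = {optimal(TABLEAU, r) for r in permutations(["NoVoicedCoda", "IdentVoice", "Max", "Dep"])}
print(sorted(winners))
```

Note that the winning candidate under each ranking still violates some constraint; what makes it optimal is only that the constraint it violates is ranked lower than those violated by its competitors.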

In recent years there have been a few studies on L2 phonology done within an OT framework. One example is Broselow et al. (1998), in which the authors illustrated that the simplification strategies used to modify English codas by native speakers of Mandarin could be explained as what is termed “the emergence of the unmarked,” one of the ways that OT phonologies can represent an L2 pattern that is independent of both the NL and TL. Another example is Lombardi (2003), who argued for an OT approach to the classical problem of “differential substitution,” in which learners of L2 English substitute either [t] or [s] for [θ], depending on their NL background.

To summarize this section, studies of L2 pronunciation patterns can be insightfully divided into those that investigate the topic from the standpoint of the mental system of the L2 learner, that is, the interlanguage, and those that consider the patterns in terms of the native and target languages.

Core issues

This section is divided into two parts. First, we will outline the major tenets underlying research into L2 phonology, and then, we will discuss some of the major findings that have emerged from this research.

The principal postulation of virtually all, if not truly all, approaches to L2 phonology is the notion that each L2 learner must create an interlanguage system on the basis of whatever TL input is available. As discussed above, this mental system is the learner's version of the target language, and is used by the learner to produce and understand L2 utterances. Moreover, it is fair to say that the construct of an interphonology is the only tenet that is used in research on L2 pronunciation that is particular to L2 phonology. All of the other concepts that have been brought to bear on explaining patterns of L2 utterances are motivated also for the explanation of phonologies of L1 grammars. Where approaches to L1 phonology differ in their underlying assumptions and constructs, so, too, do the assumptions brought to bear on L2 phonologies.

In fact, one of the major underpinnings of approaches to L2 phonology is that interphonologies are subject to the same principles and constraints as are L1 phonologies. Whereas this claim is embodied implicitly in the analyses of L2 utterances that use principles that are motivated for L1s, it is made explicitly by the Structural Conformity Hypothesis in (2) above, and within the Optimality Theoretic approach to L2 phonology. The major premise of OT is that all phonologies are constructed in the same way; specifically, all phonologies, including interphonologies, must be one of the possible rankings of the universal set of constraints (Hancin Bhatt, 2008).

This brings us to the second point of this section: the major finding of research on L2 phonology over the decades is that IL phonologies are constrained by the same principles, and obey the same generalizations, as do phonologies of languages learned natively. In other words, the fundamental hypothesis that IL phonologies do not violate principles of native-language phonologies is supported, as is the claim that IL phonological systems are similar in important ways to native-language systems.

Therefore, the reasoning behind Zampini's (1997) and Archibald's (1993) respective claims that the prosodic hierarchy and the metrical grid could account for facts about the L2 acquisition of Spanish spirantization or English stress is that L2 phonologies adhere to the principles of the prosodic hierarchy and to metrical grids. Likewise, the rationale behind Brown (1998) invoking feature geometry, or Broselow and Finer (1991) appealing to the Minimal Sonority Distance parameter in their studies on the L2 acquisition of lateral contrasts or consonant clusters, is the assumption that IL systems will obey these principles. And these works are simply examples among many other studies pursuing the same general research program. And although this is a very general characterization of the findings of research on L2 phonology, we will see in the section below on applications that many of the pedagogical applications of this research hark back to this very point. We will consider this in more detail after we discuss the kind of data that is adduced by L2 phonologists.

Data and common elicitation measures

The kind of data that a researcher employs can be sorted along several dimensions, depending on the nature of the question being addressed, and the hypothesis being tested. One major distinction that can be made along these lines is whether the data to be gathered bear on the L2 learner's production versus perception of the TL sounds or contrasts. The second parameter by which the data can be characterized is the extent to which the researcher allows the speakers’ utterances to be free ranging, or the extent to which the utterances are controlled.

For example, data gathered from L2 speakers in a casual conversation with another interlocutor are uncontrolled in the sense that the kinds of utterances produced by the speaker would be determined entirely by the flow of the conversation topics, and by the speaker, rather than by the researcher. On the other hand, data elicited from L2 learners by having them read words on a list would be highly controlled in that the words produced by the speakers would be determined by the researcher. In a word-list recitation, the context for the productions would be the task itself, where words are produced one at a time, rather than as part of connected speech. Such elicited productions would represent a careful style of talking as opposed to the more casual style that one encounters during free conversation.

There are clear trade-offs as to which kind of elicitation methods are preferable. For example, if the researchers in question had not yet determined which kind of phonological structures to investigate, then it may well be worthwhile for them to simply “cast out a net” and record their subjects during an interview or some other extemporaneous form of speech. Free conversation might also be the task of choice if the researchers were investigating structures that would be likely to occur in all conversations, and would be likely to occur with sufficient frequency to allow analysis. This might not be the case with words containing consonant clusters in onsets or codas, as this type of phonological structure might not be very frequent. To ensure sufficient tokens of such words, a researcher might have to elicit them explicitly using a more controlled type of task.

Having considered some of the data used in L2 phonological studies, we turn to the applications of the results, more specifically, to the empirical testing of the hypotheses, and to the pedagogical implications of the findings. We begin with the empirical testing of hypotheses.

Empirical testing

The approach taken by virtually every L2 phonologist is that the claims that are made about IL phonologies are empirical in that they can be tested on the basis of facts about the real world. There are, however, several issues, listed in (3) below, that arise with respect to testing various hypotheses. We will address each in turn.

(3) Issues in empirical tests of hypotheses

(a)   How does one determine the kinds of facts that bear on a hypothesis?

(b)   Given that IL grammars are internal mental systems, what considerations must be taken into account?

(c)   What criteria does one invoke to conclude that a structure has been acquired?

Determining the kinds of data to bring to bear on a hypothesis involves a number of assumptions about what the objective of the research is. In the period that we referred to above as pre-ILH, the goal of studies on L2 phonology was to explain learning difficulty. Linguists posed the question as to why some aspects of the TL were difficult for a given learner, and why some structures were more difficult than others.

L2 phonologists generally agreed that errors that learners made were a reflection of difficulty, though there were other approaches, such as Schachter (1974), who proposed that difficulty could be reflected also in a learner's avoidance of a structure. Within this framework, to test empirically a hypothesis that structure A was more difficult to learn than structure B, the prediction would be, other things being equal, that a learner would make more errors on structure A than on structure B.

The second issue concerns the fact that hypotheses about the acquisition of aspects of L2 phonology are hypotheses about the state of an IL grammar, and as such, these hypotheses are claims about an individual's internal system. This system can be placed in time and space as existing in the mind of the L2 learner. Two conclusions follow immediately from this point. The first is that linguists cannot directly observe the workings of an internal IL grammar, and therefore must base their claims about an IL on a speaker's utterances or behavior (but see the section Future directions below). The second conclusion is that linguists’ claims about IL grammars must be tested against individual data, not against aggregate data. In other words, if second language acquisition is understood as the learning of an IL system, then a second language is learned by an individual. There is no IL grammar of a group of people, just as there is no mind of a group, at least not one that can be placed in time and space. That is, a class or a group of people does not acquire the IL grammar collectively; rather, individuals in a class or a group acquire the IL grammar.

This leads directly to the third point: linguists formulating hypotheses about the acquisition of various TL structures must have a way of ascertaining when a structure has been acquired. As was true in the case of measuring difficulty, there have been several proposals for determining when a structure has been learned, but, in general, linguists consider a structure to be acquired when it is part of the IL grammar. A structure is assumed to be part of the IL grammar when it occurs systematically in the utterances of the learner. Finally, systematic occurrence is generally taken to mean that the structure in question occurs at a relatively high rate, usually assumed to be around 80 or 90 percent.
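The last two points, that the criterion is a percentage threshold and that it must be applied to individuals, can be illustrated with a short sketch. The Python below is a minimal illustration, not a procedure taken from any of the studies cited: the function names, the default threshold of 0.8 (the lower of the two figures mentioned above), and all of the production data are invented for the example.

```python
# Per-learner acquisition criterion; all data below are invented for illustration.

def proportion_target_like(tokens):
    """tokens: list of booleans, True when a production was target-like."""
    return sum(tokens) / len(tokens)


def has_acquired(tokens, threshold=0.8):
    """Apply the criterion to ONE learner's productions of one structure."""
    return proportion_target_like(tokens) >= threshold


# Hypothetical productions of one TL structure by three learners.
learners = {
    "learner_A": [True] * 10,                # 100% target-like
    "learner_B": [True] * 10,                # 100% target-like
    "learner_C": [True] * 5 + [False] * 5,   # 50% target-like
}

# The criterion applied learner by learner.
individual = {name: has_acquired(tokens) for name, tokens in learners.items()}

# Pooling all tokens gives 25/30, which clears the threshold even though
# learner_C has not met the criterion.
pooled = [t for tokens in learners.values() for t in tokens]
pooled_rate = proportion_target_like(pooled)

print(individual)
print(round(pooled_rate, 2))
```

In this invented data set the pooled rate would suggest that the structure has been acquired, while the individual results show that one of the three learners has not acquired it.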

Applications

The basis for the claim that research on L2 phonology has pedagogical implications is the conclusion, developed through studies on second-language pronunciation, that learning how to pronounce a TL involves much more than simply learning the sounds, that is, the phonetics, of the target language. In this subsection, we will illustrate this point using several examples, most importantly word-final devoicing discussed above in the section on Interlanguage. The idea we will attempt to get across in this discussion is that the pedagogical strategy devised to address at least some pronunciation errors must be based on the nature of the interlanguage phonological system, and should not simply be grounded on the sounds produced by the learner.

Within this context, let us consider an L2 pronunciation pattern in which a learner of English pronounces TL words containing word-final voiced obstruents such as [b], [d], and [g], as the corresponding voiceless sounds [p], [t], and [k]. In other words, the learner in question systematically pronounces the English words [ɹIb] “rib,” [bId] “bid,” and [pIg] “pig” as [ɹIp], [bIt], and [pIk], respectively. Based on the sounds produced, and with no further analysis of the utterances, one might conclude that the learner in question needs to learn that these and similar TL words all have word-final voiced obstruents, and therefore must be distinguished from their minimal-pair counterparts “rip,” “bit,” and “pick.” However, a deeper analysis of the IL systems of these learners may reveal that they already know that these words end in voiced sounds, and that they instead need to acquire other aspects of TL pronunciation.

The crux of the matter, and what a deeper, phonological analysis would reveal, is the nature of the phonemic representation that the learners have stored in their mental lexicon, and that ultimately underlies the errors produced. There are two possibilities for this. On the one hand, the learners could have represented the words “rib,” “bid,” and “pig” phonemically in the IL lexicon as, respectively, /ɹIb/, /bId/, and /pIg/, in which case their ILs would also have to incorporate a word-final devoicing rule that systematically changes the pronunciation of all word-final voiced obstruents to voiceless. On the other hand, the learners’ ILs could represent the relevant words in the lexicon as /ɹIp/, /bIt/, and /pIk/, exactly as these words are pronounced, in which case the IL systems do not incorporate a rule of devoicing to yield the observed pronunciation pattern.

The differences between these two analyses, that is to say, the differences between the two IL systems, are not trivial, and they have different pedagogical implications. Moreover, it is an empirical matter as to which of the two IL grammar types a given learner has internalized. If a learner has acquired an IL system where the phonemic representations in the IL lexicon are /ɹIb/, /bId/, and /pIg/, in which case the IL contains a word-final devoicing rule, then the target-like pronunciation of the voiced obstruents should occur in morphologically-related forms of the words when the consonants in question are not in word-final position. For example, a learner with such an IL grammar should pronounce “ribbing” as [ɹIbIŋ] and “piggy” as [pIgi], with voiced medial obstruents, whereas a learner who has internalized an IL system with the relevant words represented as /ɹIp/, /bIt/, and /pIk/ in the IL lexicon should pronounce “ribbing” as [ɹIpIŋ] and “piggy” as [pIki]. According to this analysis, a learner with the former IL grammar type knows, albeit implicitly, that the words in question end with a voiced obstruent, whereas a learner with the latter type of IL has not yet acquired this aspect of the TL. And based on this reasoning, the two learner types need to acquire different aspects of English in order for their pronunciation patterns to become more target-like. Specifically, the former needs to suppress the IL rule of devoicing, and the latter must learn that English has a word-final voice contrast. Without going into detail, it seems reasonable that different pedagogical strategies would be involved, depending on which IL grammar type confronted a teacher.
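The contrast between the two IL grammar types can be sketched as a pair of toy grammars. The Python below is only an illustration of the reasoning in the preceding paragraphs: the data structures and function names are hypothetical, and the transcriptions are simplified ASCII stand-ins, with "r" for the English rhotic and "N" for the velar nasal.

```python
# Two hypothetical IL grammars that produce the same word-final errors.
# "r" stands in for the English rhotic and "N" for the velar nasal.

DEVOICE = {"b": "p", "d": "t", "g": "k"}


def final_devoicing(form):
    """Rule: a word-final voiced obstruent surfaces as voiceless."""
    if form[-1] in DEVOICE:
        return form[:-1] + DEVOICE[form[-1]]
    return form


def pronounce(stem, suffix="", rules=()):
    """Concatenate stem and suffix, then apply each rule to the whole word."""
    word = stem + suffix
    for rule in rules:
        word = rule(word)
    return word


# Grammar 1: voiced underlying stems plus a word-final devoicing rule.
grammar_1 = {"lexicon": {"rib": "rIb", "pig": "pIg"}, "rules": (final_devoicing,)}
# Grammar 2: voiceless underlying stems and no devoicing rule.
grammar_2 = {"lexicon": {"rib": "rIp", "pig": "pIk"}, "rules": ()}


def outputs(grammar):
    """Predicted pronunciations of bare and suffixed forms under one grammar."""
    lex, rules = grammar["lexicon"], grammar["rules"]
    return {
        "rib": pronounce(lex["rib"], rules=rules),
        "ribbing": pronounce(lex["rib"], "IN", rules=rules),
        "pig": pronounce(lex["pig"], rules=rules),
        "piggy": pronounce(lex["pig"], "i", rules=rules),
    }


print(outputs(grammar_1))
print(outputs(grammar_2))
```

Both grammars yield the same pronunciations for the bare forms “rib” and “pig”; they diverge only on the suffixed forms, which is why morphologically related words provide the evidence needed to tell the two learner types apart.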

The point of our discussion is this. Although a teacher may observe L2 learners making the same kind of pronunciation errors (saying [bIt] instead of [bId]), unless the analysis moves beyond the level of simply noting the sounds produced, the differences in the IL systems that the learners may have internalized and their concomitant pedagogical implications will not be revealed.

The same conclusion can be defended with respect to other aspects of L2 pronunciation. It has been shown that the phonetic properties of certain TL sounds do not have as great an impact on their acquisition as do the more abstract, phonological characteristics of a segment, such as whether the TL sounds to be learned are distributed as non-contrasting allophones or as phonemes. Studies have also confirmed that a learner's suppressing the pronunciation of NL allophones and learning to produce TL allophones can present significant difficulty, leading to later acquisition of these sound patterns (Flege, 1995; Stockwell and Bowen, 1965). Moreover, it is a serious learning problem when NL allophones correspond to TL phonemes. Well-known examples include Spanish [d] and [ð], which are allophones in Spanish but are phonemes in English, on the one hand, and Korean [s] and [š], which are allophonic in Korean but contrast in the English words “sip” and “ship,” on the other. This is the area that Lado (1957) highlighted as being maximally difficult, and the intractable nature of this difficulty has also been borne out empirically, most recently by Eckman, Elreyes and Iverson (2003).

As a final example of the necessity to look beyond phonetics to more abstract representations in order to understand L2 pronunciation patterns, consider the role of the syllable in acquisition. Research into language typology has revealed that languages are much more restrictive with the sounds that can occur in syllable codas than they are with the sounds that can occur in onsets. Although all languages allow some consonants in onsets, some languages, such as Hawaiian, do not allow any consonants in coda position, while others, such as Mandarin, allow only sonorant consonants, and still others, such as Spanish, allow only the coronal consonants [s n l r ð]. It should not be surprising, then, that IL grammars adhere to the same constraints as L1 grammars, and that L2 learners have more difficulty learning TL segments and contrasts in coda position than they do acquiring these aspects of the TL in onset position.

In sum, we are able to defend the position that learning to pronounce a target language involves the acquisition of an IL grammar, and therefore requires much more than simply learning the phonetics of the TL, and that these more abstract properties of L2 phonologies have implications for language pedagogy.

We conclude this chapter by speculating a bit on what important questions in L2 phonology will guide the future direction of research.

Future directions

Three important questions that confront future research in L2 phonology will, if on the mark, shape future work in the field. The first is what the place of heritage learners is within the study of L2 phonology; the second is whether it is possible for an L2 learner to achieve native-like competence or proficiency; and the third is what the role of cognitive neuroscience will be in L2 phonological research. Because the question of heritage learners is taken up in Section 5 of this volume, we will consider only the last two of these questions.

An issue that will likely continue to shape work in L2 phonology is how to measure whether an L2 learner can achieve native-like proficiency in a language. There are at least two ways in which this question has been addressed: one is to have the utterances of non-native speakers recorded and then evaluated as to how native-like the recordings sound, and the other has been to measure the performance of the non-native speakers on various aspects of the TL and compare the results to the performance of native speakers on the same structures. Recent work by Abrahamsson and Hyltenstam (2009) used the second method, comparison of performance on a number of TL phonological structures, and found that L2 speakers did not fall within the range of performance by natives.

The other important question for the future is whether developments in the field of cognitive neuroscience can confirm some of the constructs that L2 phonologists have postulated as part of their explanation for IL pronunciation patterns (Gollestani and Zatorre, 2002). In other words, can studies using neuro-imaging of the brain, such as through fMRI, provide independent evidence for underlying representations, markedness, and phonemic contrasts, to cite just a few examples?

To conclude this subsection, postulating hypothetical constructs such as underlying forms and lexical representations has been part and parcel of L2 phonological analyses for decades. Thus, it is intriguing to consider whether advances in cognitive neuroscience will be able to provide more direct evidence for these constructs.

Conclusion

This chapter has characterized the field of second-language phonology from several perspectives. We have outlined the development of second-language phonology over the decades, and described its most significant findings. We have also reviewed, from an empirical standpoint, some of the most important questions in the field, and pointed out the kind of data that can be brought to bear on resolving these issues. Finally, we concluded our discussion with two additional points: the first, relating a number of pedagogical implications from the insights of theoretical research to instructional practice; and the second, pointing to what promise to be some of the important questions of the future.

Note

1   This work was supported in part by a grant from the National Institutes of Health R01 HD046908−05. The views and positions held in this work are those of the author. The NIH is not responsible for, nor does it necessarily agree with, any of the views taken in this chapter.

References

Abrahamsson, N. (2003). Development and recoverability of L2 codas: A longitudinal study of Chinese-Swedish interphonology. Studies in Second Language Acquisition, 25, 313–349.

Abrahamsson, N. and Hyltenstam, K. (2009). Age of onset and nativelikeness in a second language: Listener perception versus linguistic scrutiny. Language Learning, 59, 249–306.

Altenberg, E. and Vago, R. (1983). Theoretical implications of an error analysis of second language phonology production. Language Learning, 33, 427–447.

Anderson, J. (1987). The markedness differential hypothesis and syllable structure difficulty. In G. Ioup and S. Weinberger (Eds.), Interlanguage phonology: The acquisition of a second language sound system (pp. 279–291). Cambridge, MA: Newbury House.

Archibald, J. (1993). The learnability of English metrical parameters by Spanish speakers. International Review of Applied Linguistics, 31, 129–141.

Archibald, J. (1995). The acquisition of stress. In J. Archibald (Ed.), Phonological acquisition and phonological theory (pp. 81–109). Hillsdale, NJ: Lawrence Erlbaum Associates.

Battistella, E. (1990). Markedness: The evaluative superstructure of language. Albany: The State University of New York Press.

Best, C. T. (1995). A direct realist's view of cross-language speech perception. In W. Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 171–204). Baltimore: York Press.

Broselow, E., Chen, S., and Wang, C. (1998). The emergence of the unmarked. Studies in Second Language Acquisition, 20, 261–280.

Broselow, E. and Finer, D. (1991). Parameter setting in second language phonology and syntax. Second Language Research, 7, 35–59.

Brown, C. (1998). The role of the L1 grammar in the acquisition of segmental structure. Second Language Research, 14, 139–193.

Carlisle, R. (1992). Environment and markedness as interacting constraints on vowel epenthesis. In J. Leather and A. James (Eds.), New sounds 92: Proceedings of the Amsterdam symposium on the acquisition of second language speech (pp. 64–75). Amsterdam: University of Amsterdam.

Carlisle, R. S. (1998). The acquisition of onsets in a markedness relationship: A longitudinal study. Studies in Second Language Acquisition, 20, 245–260.

Corder, S. P. (1971). Idiosyncratic dialects and error analysis. International Review of Applied Linguistics, 9, 149–159.

de Jong, K., Silbert, N. H., and Park, H. (2009). Generalizations across segments in second language consonant identification. Language Learning, 59, 1–31.

Eckman, F. (1977). Markedness and the contrastive analysis hypothesis. Language Learning, 27, 315–330.

Eckman, F. (1981). On the naturalness of interlanguage phonological rules. Language Learning, 31, 195–216.

Eckman, F. (1991). The Structural Conformity Hypothesis and the acquisition of consonant clusters in the interlanguage of ESL learners. Studies in Second Language Acquisition, 13, 23–41.

Eckman, F., Elreyes, A., and Iverson, G. (2003). Some principles of second language phonology. Second Language Research, 19, 169–208.

Flege, J. E. (1995). Second language speech learning: Theory, findings and problems. In W. Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 233–277). Baltimore: York Press.

Gollestani, N. and Zatorre, R. J. (2002). Anatomical correlates of learning novel speech. Neuron, 35, 997–1010.

Hammerly, H. (1982). Contrastive phonology and error analysis. International Review of Applied Linguistics, 20, 17–32.

Hancin Bhatt, B. (2008). Second language phonology in Optimality Theory. In J. Hansen Edwards and M. Zampini (Eds.), Phonology and second language acquisition (pp. 117–146). Amsterdam: John Benjamins.

Hansen, J. (2004). Developmental sequences in the acquisition of English L2 syllable codas: A preliminary study. Studies in Second Language Acquisition, 26, 85–124.

Johannsson, F. (1973). Immigrant Swedish phonology: A study in multiple contact analysis. Lund, Sweden: CWK Gleerup.

Lado, R. (1957). Linguistics across cultures: Applied linguistics for language teachers. Ann Arbor: University of Michigan Press.

Lombardi, L. (2003). Second language data and constraints on manner: Explaining substitutions for the English interdentals. Second Language Research, 19, 225–250.

Mairs, J. L. (1989). Stress assignment in interlanguage phonology: An analysis of the stress system of Spanish speakers learning English. In S. Gass and J. Schachter (Eds.), Linguistic perspectives on second language acquisition (pp. 260–283). Cambridge: Cambridge University Press.

Major, R. and Kim, E. (1996). The similarity differential rate hypothesis. Language Learning, 46, 465–496.

Nemser, W. (1971). Approximative systems of foreign language learners. International Review of Applied Linguistics, 9, 115–123.

Nguyen, H. and Macken, M. (2008). Factors affecting the production of Vietnamese tones: A study of American learners. Studies in Second Language Acquisition, 30, 49–78.

Sato, C. (1984). Phonological processes in second language acquisition: Another look at interlanguage syllable structure. Language Learning, 34, 43–57.

Schachter, J. (1974). An error in error analysis. Language Learning, 24, 205–214.

Selinker, L. (1972). Interlanguage. International Review of Applied Linguistics, 10, 209–231.

Selkirk, E. (1982). The syllable. In H. Van der Hulst and N. Smith (Eds.), The structure of phonological representations Part II (pp. 337–384). Dordrecht: Foris Publications.

Stockwell, R. and Bowen, J. (1965). The sounds of English and Spanish. Chicago: University of Chicago Press.

Trubetzkoy, N. (1939). Principles of phonology. Paris: Klincksieck.

Zampini, M. (1997). L2 Spanish spirantization, prosodic domains and interlanguage rules. In S. J. Hannahs and M. Young-Scholten (Eds.), Focus on phonological acquisition (pp. 263–289). Amsterdam: John Benjamins.