The study of second language (L2) speech production has a long history in second language acquisition (SLA) research, both for what it can tell us about the development of the specific skill and for how it might illuminate the general processes of SLA. Over time, studies in L2 phonological attainment have become a battleground for perennial issues in SLA research encompassing cognitive, psychological, and socio-cultural factors as varied as age-related constraints on ultimate attainment and the role of identity in perceptions of accentedness. SLA research has also benefited from the ongoing development of phonological theory, which has been consistently applied to L2 production throughout the decades.
The investigation of L2 speech production has been part of second language acquisition research from its beginnings. Following the prevailing linguistic paradigm of the time, researchers employed Contrastive Analysis (CA) (Lado, 1957) and the “difference = difficulty” hypothesis to explain the shape of L2 accents (Weinreich, 1953). Despite this belief that transfer was at the root of non-native accents, it became evident that the prognoses made by CA with regard to L2 phonology, as with other language systems, were not predictive of learner error. Recognizing this difficulty, Eckman supplemented CA with the Markedness Differential Hypothesis (1977) and the Structural Conformity Hypothesis (1991) (for further information, see Chapter 6 by Eckman).
Throughout the 1980s, speech researchers followed the shift in the field of SLA away from CA toward a closer investigation of learner language through Error Analysis (Corder, 1967) and the recognition of stable, transitional grammars, or interlanguages (ILs) (Selinker, 1972). Major's Ontogeny Model (OM) (1987), for example, proposed a three-part structure underlying IL comprising influences from L1, L2, and universal processes. The OM further stated that these influences would be more salient during different phases of learner development in phonology, with L1 transfer initially frequent in the IL and then decreasing as developmental processes increased.
During this time, few had challenged the presumption of a critical period (Lenneberg, 1967) for linguistic development, and Scovel (1988) predicted that post-pubescent L2 learners would retain permanently accented speech. In the early 1980s, however, Flege and his colleagues began to question the role of the Critical Period Hypothesis (CPH) as the primary explanatory factor for the differences between the production of L2 phonetic segments by children and adults (see later in this chapter for a detailed discussion of the debate regarding the CPH). Based on an ongoing series of studies (Flege, 1981, 1987; Flege and Eefting, 1986; Flege and Hillenbrand, 1984; among others), they posited a relationship between perception and production that operated differently in L2, not because of an age-based constraint but because adult speaker-hearers already have established phonetic categories in their L1. Flege proposed a mechanism termed Equivalence Classification in which adult learners are more likely to identify new phones appearing in the L2 as equivalent to already existing L1 categories. Equivalence Classification predicts that learners are likely to be less effective at successfully distinguishing and producing sounds that are similar to sounds in their L1 than they are with sounds that are novel and have no correlate in the L1 sound system. This may be the result of perceptually similar but phonetically distinct sounds in L2 being assimilated into a single category (see Hardison, Chapter 21, in this volume for a discussion of the relationship between perception and production in the L2). Although the hypothesis predicts that missed perceptual cues will render production inaccurate and that changes in perception should lead to changes in production, Flege notes that not all inaccurate production can or should be explained by perception.
In a 1995 study, he discusses output constraints on syllable type as a possible cause of production difficulties such as the word-external epenthesis typically shown by Spanish speakers of English. Over time, Flege has formalized his hypotheses in the Speech Learning Model (SLM) which he describes as follows:
An assumption we make is that the phonetic systems used in the production and perception of vowels and consonants remain adaptive over the life span, and that phonetic systems reorganize in response to sounds encountered in an L2 through the addition of new phonetic categories, or through the modification of old ones.
(1995, p. 233)
The SLM focuses on the production of segments and with a few exceptions, research on the development of suprasegmentals in L2 has traditionally been scarce (see Wenk, 1986 and Juffs, 1990). Taking intonation study as a specific example, Willems (1982) initially reported that Dutch speakers of English demonstrated differences in all aspects of intonation structure including pitch range, pitch prominence, and pitch reset following a boundary. These features have since been confirmed in more recent studies with participants comprising a range of L1s. In investigations of advanced and intermediate Asian and European learners of English, Wennerstrom (1994, 1997) found that speakers did not use pitch variation to signal new or contrastive lexical items, and used less reduction of pitch than L1 speakers on non-prominent words. Japanese, Thai, and Chinese speakers also tended to use low boundary tones between repeated propositions where rising or mid level tones would be anticipated by native speaker (NS) hearers. Pirt (1990) reported similar results in a study of Italian learners, and Pickering (2001) reported equivalent findings for Chinese speakers of English in academic discourse. Hewings (1995) found a preference for the use of falling tones in the discourse of advanced L2 learners from Korea, Greece, and Indonesia in contexts where NSs would use rising or level tones. Both Mennen (1998) and Pickering (2004) report demonstrably narrower pitch ranges in L2 learners as compared to NSs.
In addition to studies addressing different aspects of the phonological system, some of the recent approaches taken to L2 speech production have emerged from current models of phonology such as Optimality Theory (OT) (Prince and Smolensky, 1993) and Connectionism (Elman et al., 1996). Optimality Theory proposes a universal set of violable constraints accessible to all speakers. Each language ranks these constraints differently, and these different rankings account for phonological differences among languages (see Archangeli and Langendoen, 1997, for a complete introduction to OT, and Eckman, Chapter 6, in this volume for a more detailed discussion of OT). As examples of natural language, ILs may also allow novel structures to surface (i.e., structures that are not present in either the L1 or L2) as a result of learners hypothesizing different constraint rankings. Hancin-Bhatt and Bhatt (1997) account for both Japanese and Spanish learners’ difficulties with phonotactic structure in English within the OT framework, and Broselow et al. (1998) attribute the preference of Mandarin speakers of English to devoice final obstruents to a specific re-ranking of constraints.
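The core OT mechanism described above, in which the same violable constraints yield different outputs under different rankings, can be illustrated with a minimal sketch. The two constraints, the orthographic stand-ins for phonetic forms, and the function names are simplifying assumptions introduced here for illustration; they are not the analysis proposed by Broselow et al. (1998) or any other study cited in this chapter.

```python
from typing import Callable, List, Sequence

# A constraint maps (input form, candidate output) to a number of violations.
Constraint = Callable[[str, str], int]

# Hypothetical, simplified segment inventory (letters stand in for phones).
VOICED_OBSTRUENTS = set("bdgvz")


def no_voiced_coda(input_form: str, candidate: str) -> int:
    """Markedness (illustrative): penalize a word-final voiced obstruent."""
    return 1 if candidate and candidate[-1] in VOICED_OBSTRUENTS else 0


def ident_voice(input_form: str, candidate: str) -> int:
    """Faithfulness (illustrative): one violation per segment that differs
    from the input; the candidates used below differ only in voicing."""
    return sum(1 for i, c in zip(input_form, candidate) if i != c)


def evaluate(input_form: str, candidates: Sequence[str],
             ranking: List[Constraint]) -> str:
    """Return the optimal candidate under a strict ranking.

    Violation profiles are compared lexicographically, so a single
    violation of a higher-ranked constraint outweighs any number of
    violations of lower-ranked ones.
    """
    def profile(candidate: str):
        return tuple(c(input_form, candidate) for c in ranking)
    return min(candidates, key=profile)


candidates = ["bed", "bet"]  # faithful vs. devoiced output for input "bed"

# Faithfulness outranks markedness: the final voiced stop survives.
target_like = evaluate("bed", candidates, [ident_voice, no_voiced_coda])

# Markedness outranks faithfulness: the final stop is devoiced.
reranked = evaluate("bed", candidates, [no_voiced_coda, ident_voice])

print(target_like, reranked)
```

Under the first ranking the faithful candidate is selected, and under the second the devoiced candidate is selected, although the constraint set itself is unchanged. This is the sense in which a learner's hypothesized ranking, rather than a different set of constraints, is taken to shape the IL output.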
Connectionist frameworks are modeled on computer programming and propose a network of nodes which have different activation values. Connections between the nodes are weighted, with larger weights indicating stronger connections. A network of connections is built as the learner is exposed to many instances of a given language feature. As an example of its possible application to L2 phonology, Hancin-Bhatt (1992) exemplifies how a connectionist approach to processing may account for the substitution of a dental [t] for the voiceless interdental fricative by Hindi speakers of English. There continues to be a burgeoning research agenda in the field of L2 speech production, with a continued emphasis on model building supplemented by more recent additions such as the investigation of neurological factors in language production (Sereno and Wang, 2007).
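The general connectionist idea outlined above (nodes with activation values, weighted connections, and weights that strengthen with repeated exposure) can be sketched in a few lines. The following is a toy illustration only: the class name, the two-feature input, the learning rate, and the error-driven update rule are all assumptions made here, and the sketch does not reproduce the model discussed by Hancin-Bhatt (1992).

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Squash a summed input into an activation value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


class TinyNetwork:
    """A single layer of weighted connections from input to output nodes."""

    def __init__(self, n_inputs: int, n_outputs: int) -> None:
        # All connections start at zero strength (an arbitrary assumption).
        self.weights = [[0.0] * n_inputs for _ in range(n_outputs)]

    def activate(self, inputs: List[float]) -> List[float]:
        """Compute each output node's activation from the weighted inputs."""
        return [sigmoid(sum(w * x for w, x in zip(row, inputs)))
                for row in self.weights]

    def expose(self, inputs: List[float], targets: List[float],
               rate: float = 0.5) -> None:
        """One exposure: nudge each weight in proportion to the output error."""
        outputs = self.activate(inputs)
        for j, (out, target) in enumerate(zip(outputs, targets)):
            for i, x in enumerate(inputs):
                self.weights[j][i] += rate * (target - out) * x


# Hypothetical coding: the input pattern represents one sound feature,
# and the two output nodes represent two competing sound categories.
net = TinyNetwork(n_inputs=2, n_outputs=2)
pattern = [1.0, 0.0]
category = [1.0, 0.0]

before = net.activate(pattern)
for _ in range(50):  # repeated exposure to the same pairing
    net.expose(pattern, category)
after = net.activate(pattern)

print(before, after)
```

Before any exposure both category nodes are equally active; after repeated exposure the connection to the associated category has a larger weight and that node's activation rises, while the competing node's activation falls. The sketch is meant only to make the notions of weight and activation concrete.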
This part of the chapter discusses the following issues as they relate to L2 speech production: Age-related effects, language-related effects, and socio-affective factors involved in L2 production. The final section reviews research in intelligibility as it pertains to SLA. An additional core area of importance, the relationship between speech production and perception is only briefly addressed, and the reader is referred to Chapter 21 in this volume by Hardison.
Often referred to as the “Conrad phenomenon” after the novelist Joseph Conrad, perhaps the most compelling question in L2 phonology research has been the interaction between the Critical Period Hypothesis (CPH) and degree of accent, i.e., the assumption that after a certain age, L2 learners are biologically incapable of achieving a native accent in their second language (see also Chapter 31 by Byrnes and Chapter 27 by DeKeyser in this volume):
In its most succinct and theory neutral formulation, the CPH states that there is a limited developmental period during which it is possible to acquire a language be it L1 or L2 to normal, nativelike levels. Once this window of opportunity is passed, however, the ability to learn language declines.
(Birdsong, 1999, p. 1)
With regard to pronunciation specifically, proponents of the CPH have proposed a developmental constraint ranging from five to 15 years old. L2 speech production has been one of the primary testing grounds for the CPH, and the controversy is well illustrated in an open debate that began in the journal Applied Linguistics between Flege (1987, 1999) and Patkowski (1990, 1994).
As noted above, Flege (1987) questioned a number of the assumptions underlying traditional acceptance of the CPH. He cites studies in which children do not appear to out-perform adults in the production and perception of L2 speech sounds (Snow and Hoefnagel-Höhle, 1978; Winitz, 1981) and argues that studies show a linear relationship between degree of foreign accent and age as opposed to the noticeable discontinuity which would be expected at the end of the critical period (Oyama, 1978). He describes the focus on the CPH as reductionistic and suggests that differences between adult and child learners may be the result of a number of factors other than (or in addition to) a critical period. Examples of possible confounding factors include previous linguistic experience, affective factors such as motivation, and social factors such as group identity.
In his reply to this paper, Patkowski (1990) argues that proponents of the CPH focus on ultimate L2 proficiency rather than rate of acquisition; thus, evidence of adults showing faster initial rates than children is not relevant to the debate. With regard to Flege's contention that there is a lack of research evidence verifying the onset of a marked discontinuity which would mirror the end of the critical period, Patkowski both challenges the design of the studies cited by Flege and cites a study of his own (Patkowski, 1980) in which such a discontinuity was in evidence. In summary, Patkowski states that there is no “convincing rationale for entirely discarding the notion of a biologically based age limitation on the ability to acquire second languages with native fluency” (1990, p. 86).
In a follow-up paper in 1994, Patkowski cites a number of review articles and empirical studies (including most notably Long, 1990 and Patkowski, 1990) to support his position that a biologically based sensitive or critical period somewhere between the ages of 12 and 15 years exists for the ultimate attainment of second language phonology. Flege (1999) responds and cites two studies (Flege et al., 1995; Yeni-Komshian et al., 1997) which continue to show a linear relationship between degree of perceived accent and age in subjects between the ages of two and 23 years, a pattern that does not support an abrupt biologically or neurologically based shift in ability.
The debate continues to expand (for an accessible summary see “The whys and why nots of the CPH-L2A” by Birdsong, 1999). Bongaerts (1999) conducted several studies in which some highly advanced late Dutch learners of English and French were rated by judges as indistinguishable from native speakers, suggesting that the CPH could be nullified. Bongaerts submits that this may be the result of high motivation, high levels of input, and training in the perception and production of L2 speech. Birdsong (2007) reports similar results in a study with late Anglophone learners of French. There continues to be no clear resolution to this controversy. In the first chapter of their 2007 volume, Bohn and Munro cite Flege et al. (2006), who find that even very young L2 learners exhibit foreign accents, and also report Hakuta et al. (2003), whose adult learners exhibit success that correlates negatively with age of arrival.
Early studies that conceived of transfer through the lens of CA sought straightforward explanations of L2 pronunciation errors in a comparative analysis of the different phonological systems of L1 and L2. As more and more empirical evidence came to light that did not support this thesis, a more moderate version of CA (Oller and Ziahosseiny, 1970) became popular. The original hypothesis was revised to include both similarities and differences between phonological systems, and there was a recognition that perceptual saliency may play a crucial role.
Major (2001) suggests that learners may perceive large differences between the L1 and L2 sound systems but have more difficulty noticing smaller differences. Thus, the learner may be more likely to hit a phonological target if that target is unlikely to be substituted by a similar target in the L1 (see also Flege's SLM above). Major further advanced the notion of similarity and dissimilarity by adding the principle of rate (Major and Kim, 1996). The Similarity Differential Rate Hypothesis suggests that dissimilar features will be acquired at faster rates than similar ones but that markedness will slow rate. Major proposed that this combination of underlying factors results in a surface structure in the IL that does not support simplistic notions of transfer.
In an investigation of the relative contribution of markedness and direct L1 transfer, Carlisle (1994) reviews studies in the area of syllable structure. These studies show a clear preference for L2 learners to transfer syllable structure into IL phonology by resyllabifying to match L1 constraints rather than simplifying to produce a universally less marked structure (e.g., an open CV syllable). With regard to prosodic structure, overall prosodic profiles of certain groups of learners have resulted in similar suggestions of the primacy of transfer over developmental features (for example, Wennerstrom (1998) for Chinese learners of English and Jilka (2007) for English speakers of German). However, it is also the case that similar errors in L2 intonation structure by learners of very different backgrounds have been reported (see Mennen, 2007, for a summary). As Mennen notes, much of this work is inconclusive. The majority of studies report on L2 acquisition of English only, and the lack of a common framework to describe intonation systems cross-linguistically makes it difficult to assess features that may reflect universal tendencies vs. cases of L1 transfer.
Despite a historical focus on both age- and language-related effects, research suggests that there are additional language-independent constraints that may affect L2 phonological attainment. Thus far, studies have investigated task variation (Tarone, 1980), attitudes and motivation (Stokes, 2001), concern for pronunciation accuracy (Elliott, 1995), social markings of identity (Dowd et al., 1990; Lybeck, 2002), and extent of L1 and L2 use (Piske et al., 2001), among other socio-affective variables. Two current studies, Moyer (2004) and Hansen (2006), demonstrate a more recent research agenda in which social factors are at the center of the analysis.
Moyer investigates 25 advanced learners of German as a second language from a range of L1 backgrounds. She considers a number of instructional and social factors including level of motivation, self-perceived accentedness, amount of formal language instruction, and amount and context of use of German on a regular basis. The participants completed four language tasks which were recorded and then judged by three NS judges. Participants also completed a questionnaire and semi-structured interviews in which they talked about their personal language-related experiences. Moyer found that although age exerted some independent influence, psychological variables such as intensity of motivation, satisfaction with attainment, and professional motivational orientation accounted for a larger percentage of the variance than age of onset combined with length of residence. Thus, she determines that “the idea that ultimate attainment is primarily a function of age must be reconsidered. Instead, the impact of age should be understood as indirect as well as possibly direct” (p. 140).
Hansen (2006) conducted a longitudinal study of the development of syllable margins in the emerging L2 of two adult Vietnamese learners of English and considered both linguistic and social constraints in accounting for acquisition. The participants in the study, a husband and wife in their 40s (Nhi and Anh), arrived in the USA from Vietnam one year before data collection began. Like Moyer, Hansen uses interview data to target socio-affective factors. In her discussion of social constraints, she presents a narrative account spanning ten months, partially in her own words and partially in the words of her participants. The reader is introduced to the participants’ changing social contexts over the length of the study and given insights into each person's personality, motivations, and frustrations. We learn that Anh adapts very slowly to her new surroundings and by the end of the study, she still struggles to communicate in English. Nhi is a more easy-going learner and significantly less anxious than Anh. He prioritizes his relationships with English speakers and engages in his English language environment at work. In her interpretation of these data, Hansen uses both Schumann's Acculturation Model (1986) and Peirce's (1995) concept of investment to explain how social constraints such as perceptions of cultural identity and extended family dynamics may impact Anh and Nhi's individual phonological development. Although emerging production modifications suggest a shift in developmental patterns that will ultimately favor Nhi (and by extension his approach to learning), this remained to be confirmed in terms of overall change.
Assessment of intelligibility has long been considered a core area of L2 speech research. Although we can use intelligibility in a broad sense to mean “intelligible production and felicitous interpretation of English” (Nelson, 1995, p. 274), more recently there has been a distinction in the literature between “intelligibility,” meaning formal recognition or decoding of words and utterances, and “comprehensibility,” meaning the listener's ability to understand the meaning of the word or utterance in its given context. Thus, as Field (2003) suggests, a listener may use contextual understanding to compensate for the fact that a message cannot be precisely decoded.
Comprehensibility or intelligibility judgments of L2 speech tend to rely on NS listener ratings (Anderson-Hsieh et al., 1992; Piske et al., 2001) and are often accompanied by judgments of accentedness (Derwing and Munro, 1997); as yet, however, no clear relationship has been established between accentedness and comprehensibility. Speakers who succeed in reducing the degree of foreignness in their accents (based on expert NS raters) may still be heard as incomprehensible by lay listeners (Munro and Derwing, 1995). Although accentedness in L2 speech may derive from several different sources, Derwing and Munro (1997) conclude that L2 comprehensibility is improved for NS listeners with enhanced prosodic proficiency, and their position is supported by subsequent studies (e.g., Derwing and Rossiter, 2003; Field, 2005). Prosodic characteristics that have been found to be important include speech rate (Derwing, 1990; Derwing and Munro, 2001), mean length of utterance (Kormos and Dénes, 2004), length and placement of pauses (Anderson-Hsieh and Venkatagiri, 1994; Pickering, 1999; Riggenbach, 1991), and non-standard word stress (Field, 2005; Hahn, 2004).
Most recently, intelligibility studies have expanded to include non-native speaker (NNS) perceptions of comprehensibility in NNS-NNS or learner-learner interactions (for a review of studies within the context of English as a lingua franca see Pickering, 2006). This work suggests that L2 listeners may process phonological features differently from their NS counterparts. While prosodic features appear to be a crucial cue for NS comprehensibility, studies with NNS listeners suggest that they may rely more on segmental features (Deterding, 2005; Field, 2005; Jenkins, 2000). Jenkins suggests that this predominant focus on bottom-up processing (i.e., resorting to acoustic information rather than contextual information) reflects L2 speakers’ higher dependency on phonological form as opposed to shared contextual knowledge with their interlocutors.
In addition, L2 speaker-hearers may draw on an “interlanguage speech intelligibility benefit” (Bent and Bradlow, 2003), an effect resulting from some familiarity with particular nonstandard phonological forms. An L2 learner may be better equipped to interpret specific acoustic-phonetic features of an L2 speaker that are matched with her or his own production, and therefore find understanding an L2 speaker from their own L1 background easier than understanding someone from a different L1 background (cf. Major et al., 2002; see also Gass and Varonis, 1984).
The types of data utilized and the methods of analysis employed in L2 speech production research reflect the breadth of quantitative and qualitative research possibilities typically found in applied linguistics. Research designs range from experimental designs comprising a hundred or more participants (Flege et al., 1995) to more naturalistic contexts involving just two participants (Hansen, 2006). Elicitation measures range from native speaker judgments of perceived accent or intelligibility (Munro and Derwing, 1995) to objective measures of acoustic characteristics such as voice onset time (VOT) (Strange, 1995) or fundamental frequency values (Jilka, 2007; Kang et al., 2010). The following discussion addresses issues of data collection and verification in studies of segmental and suprasegmental features of L2 production.
At the highly experimental end of the research continuum, L2 research focused on segmentals has been dominated by Flege and his colleagues (see earlier discussion). Their goal for the most part has been to find evidence for the claims made by the SLM (1995), namely, that phonetic systems remain adaptive and that L2 learners will be more successful at creating new categories for L2 sounds that are dissimilar from L1 sounds than those that are similar. Thus, studies typically employ late learners who can be exempted a priori from traditional conceptions of the CPH or learners comprising a range of ages. L2 populations have also come from a wide variety of L1 backgrounds including Swedish, Chinese, Spanish, Italian, Dutch, and Japanese speakers. As the focus is on creation of phonetic categories, testing is usually confined to a very small aspect of production such as VOT or vowel duration. In order to limit confounding variables for these heavily statistical designs, spontaneous speech is also eschewed in favor of word lists or utterances read aloud. Certain segmental features have become emblematic of L2 phonological research, such as the perception and production of /l/ and /r/ by Japanese learners of English (Bradlow, 2008; Yamada et al., 1996) or the cross-linguistic comparison of formant structures (F1 and F2) or duration in vowels (see Strange, 2007, for a review). These shared test cases have allowed researchers to compare findings more easily.
At the opposite end of the continuum, while still focusing on individual elements of L2 phonology, in this case the production of syllable margins, Hansen (2006) adopts a dual design that incorporates both qualitative and quantitative components. Interviews recorded with two participants targeted socio-affective factors and were transcribed and coded for production of syllable onsets and codas. Hansen is able to document detailed production modifications over time that suggest an emerging L2 phonology in which L2 consonants are very gradually acquired by the two learners in similar stages but at different rates. Quantitative findings, however, are difficult to interpret, as there are not enough data to make this kind of analysis work well. At one point, for example, Hansen notes that although it appears that one participant has acquired three-member consonant onsets at 89 percent accuracy, there are too few tokens for this percentage to be meaningful.
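The caution about percentages based on few tokens can be made concrete with a worked calculation. The sketch below computes a Wilson score confidence interval for a proportion, which is a standard way to express the uncertainty around an observed accuracy rate. The token counts (8 of 9 and 80 of 90) are hypothetical values chosen only because they both yield roughly 89 percent; they are not figures reported by Hansen (2006), and the function name is introduced here for illustration.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be a positive number of tokens")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical token counts: the same accuracy rate from 9 and from 90 tokens.
few_low, few_high = wilson_interval(8, 9)
many_low, many_high = wilson_interval(80, 90)

print(f"8/9 tokens:   {few_low:.2f} to {few_high:.2f}")
print(f"80/90 tokens: {many_low:.2f} to {many_high:.2f}")
```

With only nine tokens the interval spans a wide range of plausible accuracy rates, whereas the same percentage based on ninety tokens is estimated far more tightly. This is one way to see why a high percentage drawn from a handful of tokens cannot on its own be taken as evidence of acquisition.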
Studies of the suprasegmental characteristics of L2 speech tend to look quite different from work in segmentals. Most notably, they are usually smaller in terms of numbers of participants. Following a review of studies on L2 prosody in major journals over the past 25 years, Gut (2007, p. 145) finds that research on intonation is conducted with an average number of 22.6 participants and research on word stress is based on an average of 7.7 participants. In addition, she notes that most studies comprise artificial speech tasks such as reading aloud and are undertaken in laboratory settings. There is growing evidence that such data are problematic for assessing suprasegmental features such as intonation and rhythm. Tao (1996, p. 34), for example, argues that the proposed pitch register differences suggested between interrogatives and declaratives in Mandarin may be an artifact of studying isolated sentences as opposed to natural discourse. He finds that the theory of register does not account for a large portion of the intonation patterns that may be present in natural speech. Similarly, with regard to rhythmic characteristics, Lai (2002) proposes that misconceptions about the stress patterns of Cantonese derive in part at least from a reliance on experimental production of the language which alters its natural prosodic patterns. Finally, Brazil (1992) recognizes a number of different levels of engagement in reading aloud by a speaker which result in different prosodic compositions depending on the type of reading (i.e., text versus isolated sentences), and the degree of engagement by the speaker with text and listener.
Despite their disadvantages, these research designs often reflect necessary compromises if we want to compare apples to apples; two issues are particularly salient. First, differences between L1 and L2 production of intonational features such as use of contrastive prominence or tonal structure are difficult to assess if participants are saying different things. To address this difficulty, Hewings (1995) asked L2 learners to read scripted dialogs and then compared these readings to NS performances. Wennerstrom (1994) asked all her participants to read the same passage that had been constructed to exemplify specific intonational features. The second issue is the increased use of instrumental data to support findings regarding L2 prosodic structure through programs such as WASP (Huckvale, 2003) and PRAAT (Boersma and Weenink, 2002), which are freely downloadable, as well as commercially available programs (see Schuetze-Coburn et al., 1991, for a comparison of auditory and instrumental analysis). These tools allow researchers to measure a variety of acoustic features; however, they also demand audio data of a high quality that is difficult to obtain outside of a controlled environment (although see Pickering (1999) and Wennerstrom (1997) for the recording of naturally occurring data).
An alternative approach to data collection lies in recent developments in corpus construction. Currently there are at least two learner corpora that include some annotation of prosodic features of L2 speakers. The LeaP (Learning Prosody in a Foreign Language) corpus comprises more than 12 hours of recording time of second language learners of German and English and includes six manually annotated and two automatically annotated tiers. The Hong Kong Corpus of Spoken English has approximately one million words prosodically transcribed (manually) using Brazil's (1997) discourse model. Both corpora have generated studies of L2 speech production (e.g., Cheng et al., 2005; Gut, 2007) which benefit from the large datasets on which they are based.
There has been a consistent interplay between L2 speech production research and pedagogy throughout the history of this area, most particularly in the teaching of EFL/ESL. English pronunciation materials such as Drills and Exercises in English Pronunciation (1967) which focuses on stress and intonation and English Pronunciation Illustrated (Trim, 1965) which practices phonemes and minimal pairs reflect the tenets of CA and the belief that practice will instill the good habits needed to conquer L2 pronunciation (O'Connor, 1967). With the onset of research in L2 interlanguage and the recognition of the complexity underlying L1 transfer, some attempts were made to introduce notions of markedness and universal processes to language instruction. For example, Yavas (1994) addressed the finding of a universal tendency toward final devoicing with a set of graded teaching materials for English in which presentation and instruction of bilabial final stops (e.g., tub, cab) precedes more difficult stops and consonant clusters.
During the 1980s, the popularity of the communicative approach encouraged development of pedagogical materials that embraced the full scope of the L2 phonological system including suprasegmentals (Chun, 2002). Materials often explicitly addressed the changing ideologies in SLA. In their introduction to Teaching American English Pronunciation, for example, Avery and Ehrlich (1992) go beyond linguistic factors in pronunciation instruction to discuss the roles of socio-cultural and personality variables. They further note that teachers should be concerned with comprehensibility rather than accuracy when correcting student pronunciation.
Despite this progress toward a focus on language use, the phonological system still tended to be taught in pieces rather than in a realistic context. This applied particularly to the intonation structure in English, where priority was still given to grammatical contrasts or attitudinal effects (Levis, 1999) despite growing recognition that intonation formed part of a speaker's discourse and pragmatic competence (Brazil, 1997; Grosz and Sidner, 1986) and that isolated contours form part of a larger organizational structure through which they acquire their full significance (Pierrehumbert and Hirschberg, 1990). In the past several years, interactive teaching materials such as Streaming Speech (Cauldwell, 2003) have begun to incorporate examples of naturally occurring discourse that introduce the characteristics of conversational English. The 1980s also saw the more widespread use of speech visualization technology in prosodic instruction (de Bot, 1980; de Bot and Mailfert, 1982; Weltens and de Bot, 1984). Results suggested that learners who received audio-visual feedback demonstrated improved perception and production of intonational contrasts in the L2. More recently, Levis and Pickering (2004) discuss these applications and extend them to a discourse context.
Instruction in English as a foreign language has also been at the forefront of the paradigm shift prompted by the precipitate growth of English as a global lingua franca (Jenkins, 2002). Traditional conceptions of intelligibility that prioritize the speaker are giving way to those that more explicitly consider the listener, and a review of recent research suggests that we may want to promote very different strategies in L2 learners if they intend to remain in an international context (Pickering, 2006). There is also ongoing debate regarding pedagogical practices that privilege certain varieties of English as exemplified by Walker (2001) who discusses proposals that reconsider traditional target models and move toward pronunciation for international intelligibility.
Most recently, there has been a heartening trend in volumes addressing L2 phonological acquisition to include not only research but instructional implications and practice (see Hansen Edwards and Zampini, 2008; and Trouvain and Gut, 2007, for examples). In their preface, Trouvain and Gut describe their hope for what this kind of cross-pollination may achieve:
The first part [of the volume] contains contributions by SLA researchers and experts in prosody ... This includes overviews of current theoretical models as well as findings from empirical investigations. In the second part, some of the leading teaching practitioners and developers of phonological learning materials present a variety of methods and exercises in the area of prosody ... On the one hand, research on non-native prosody can help teachers to interpret and make sense of their classroom experiences and to provide them with a broad range of pedagogic options. On the other hand, researchers may be encouraged to investigate aspects of non-native prosody that have shown to be of primary importance in language classrooms.
(pp. v–vi)
This synergy between laboratory and classroom will be critical to the continued evolution of L2 speech production research and practice.
The future agenda of this area is robust, as the extent of the work discussed in this chapter suggests. In this section I identify three areas of particular interest for researchers and teachers:
(1) Perhaps the most promising area of growth is in the new technologies being used to investigate the processes of language production such as fMRI scans (Sereno and Wang, 2007) or ultrasound imaging techniques (Gick et al., 2008). It remains to be seen how much this work will impact our current understanding of L2 acquisition of speech; however, it is likely that increased understanding of neurological factors will illuminate differences between L1 and L2 phonological experience.
(2) Some of the most recent expansions in this area have come as a result of a shift in the research terrain toward an interrogation of what constitutes intelligibility, the native speaker, and possible influences on speech production. We need to continue to fill in these gaps. They include both investigations of language learning outside of an English context and, as Leather (1999) suggests, a broadening of our learner base to include multilingual speaker-hearers in non-Western environments.
(3) Speech production research continues to benefit from methodological innovation. Most recently, the development of learner corpora offers a new and largely untapped resource for researchers to access data which may previously have been unattainable due to limited resources.
1 For the current revision of this model see Major (2001).
Anderson-Hsieh, J., Johnson, R., and Koehler, K. (1992). The relationship between native speaker judgments of non-native pronunciation and deviance in segmentals, prosody, and syllable structure. Language Learning, 42, 529–555.
Anderson-Hsieh, J. and Venkatagiri, H. S. (1994). Syllable duration and pausing in the speech of intermediate and high proficiency Chinese ESL speakers. TESOL Quarterly, 28, 807–812.
Archangeli, D. and Langendoen, D. T. (1997). Optimality theory: An overview. Malden, MA/London: Blackwell.
Avery, P. and Ehrlich, S. (1992). Teaching American English pronunciation. Oxford: Oxford University Press.
Bent, T. and Bradlow, A. R. (2003). The interlanguage speech intelligibility benefit. Journal of the Acoustical Society of America, 114, 1600–1610.
Birdsong, D. (1999). Introduction: Whys and why nots of the critical period hypothesis for second language acquisition. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 1–22). Mahwah, NJ: Lawrence Erlbaum.
Birdsong, D. (2007). Nativelike pronunciation among late learners of French as a second language. In O.-S. Bohn and M. J. Munro (Eds.), Language experience in second language speech learning (pp. 99–116). Amsterdam/Philadelphia: John Benjamins.
Boersma, P. and Weenink, D. (2002). Praat [Computer software]. Amsterdam, The Netherlands: Institute of Phonetic Sciences, University of Amsterdam.
Bohn, O. S. and Munro, M. (Eds.). (2007). Language experience in second language speech learning. Philadelphia: John Benjamins.
Bongaerts, T. (1999). Ultimate attainment in L2 pronunciation: The case of very advanced late L2 learners. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 133–159). Mahwah, NJ: Lawrence Erlbaum.
Bradlow, A. (2008). Training non-native language sound patterns: Lessons from training Japanese students on the English /r/ and /l/ contrast. In J. Hansen Edwards and M. Zampini (Eds.), Phonology and second language acquisition (pp. 287–308). Philadelphia: John Benjamins.
Brazil, D. (1992). Listening to people reading. In M. Coulthard (Ed.), Advances in spoken discourse analysis (pp. 209–241). London: Routledge.
Brazil, D. (1997). The communicative value of intonation in English. Cambridge: Cambridge University Press. (Originally published in 1985 by University of Birmingham: English Language Research, UK.)
Broselow, E., Chen, S., and Wang, C. (1998). The emergence of the unmarked in second language phonology. Studies in Second Language Acquisition, 20, 261–280.
Carlisle, R. S. (1994). Markedness and environment as internal constraints on the variability of interlanguage phonology. In M. Yavas (Ed.), First and second language phonology (pp. 223–249). San Diego, CA: Singular.
Cauldwell, R. (2003). Streaming Speech Student's Book: Listening and pronunciation for advanced learners of English. Birmingham: Speechinaction.
Cheng, W., Greaves, C., and Warren, M. (2005). The creation of a prosodically transcribed intercultural corpus: The Hong Kong Corpus of Spoken English (prosodic). International Computer Archive of Modern English (ICAME) Journal, 29, 47–68.
Chun, D. (2002). Discourse intonation in L2: From theory to research practice. Philadelphia: John Benjamins.
Corder, S. P. (1967). The significance of learners’ errors. International Review of Applied Linguistics, 5, 161–170.
de Bot, K. (1980). The role of feedback and feedforward in the teaching of pronunciation: An overview. System, 8, 35–45.
de Bot, K. and Mailfert, K. (1982). The teaching of intonation. Fundamental research and classroom applications. TESOL Quarterly, 16, 71–77.
Derwing, T. M. (1990). Speech rate is no simple matter: Rate adjustment and NS-NNS communicative success. Studies in Second Language Acquisition, 12, 303–313.
Derwing, T. M. and Munro, M. J. (1997). Accent, comprehensibility and intelligibility: Evidence from four L1s. Studies in Second Language Acquisition, 19, 1–16.
Derwing, T. and Munro, M. (2001). What speaking rates do non-native listeners prefer? Applied Linguistics, 22, 324–337.
Derwing, T. M. and Rossiter, M. J. (2003). The effects of pronunciation instruction on the accuracy, fluency and complexity of L2 accented speech. Applied Language Learning, 13, 1–18.
Deterding, D. (2005). Listening to estuary English in Singapore. TESOL Quarterly, 39, 425–440.
Dowd, J., Zuengler, J., and Berkowitz, D. (1990). L2 social marking: research issues. Applied Linguistics, 10, 16–29.
Eckman, F. (1977). Markedness and the contrastive analysis hypothesis. Language Learning, 27, 315–330.
Eckman, F. (1991). The Structural Conformity Hypothesis and consonant clusters in the interlanguage of ESL Learners. Studies in Second Language Acquisition, 13, 23–41.
Elliott, A. (1995). Field independence/dependence, hemispheric specialization, and attitude in relation to pronunciation accuracy in Spanish as a foreign language. The Modern Language Journal, 79, 356–371.
Elman, J., Bates, E., Johnson, M., Karmiloff-Smith, A., Parisi, D., and Plunkett, K. (1996). Rethinking innateness. A connectionist perspective on development. Cambridge, MA: MIT Press.
English Language Service (1967). Drills and exercises in English pronunciation. Stress and intonation. London/New York: Macmillan.
Field, J. (2003). Promoting perception: Lexical segmentation in L2 listening. ELT Journal, 57, 325–334.
Field, J. (2005). Intelligibility and the listener: The role of lexical stress. TESOL Quarterly, 39, 399–424.
Flege, J. E. (1981). The phonological basis of foreign accent. TESOL Quarterly, 15, 443–455.
Flege, J. E. (1987). A critical period for learning to pronounce foreign languages? Applied Linguistics, 8, 162–177.
Flege, J. E. (1995). Second language speech learning. Theory, findings, and problems. In W. Strange (Ed.), Speech perception and linguistic experience (pp. 233–277). Baltimore, MD: York Press.
Flege, J. E. (1999). Age of learning and second language speech. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 101–131). Mahwah, NJ: Lawrence Erlbaum.
Flege, J., Birdsong, D., Bialystok, E., Mack, M., Sung, H., and Tsukada, K. (2006). Degree of foreign accent in English sentences produced by Korean children and adults. Journal of Phonetics, 34, 153–175.
Flege, J. and Eefting, W. (1986). Linguistic and developmental effects on the production and perception of stop consonants. Phonetica, 43, 155–171.
Flege, J. and Hillenbrand, J. (1984). Limits on pronunciation accuracy in adult foreign language speech production. Journal of the Acoustical Society of America, 76, 708–721.
Flege, J., Munro, M., and MacKay, I. (1995). The effect of age of second language learning on the production of English consonants. Speech Communication, 16, 1–26.
Gass, S. and Varonis, E. (1984). The effect of familiarity on the comprehensibility of non-native speakers. Language Learning, 34, 65–89.
Gick, B., Bernhardt, B., Bacsfalvi, P., and Wilson, I. (2008). Ultrasound imaging applications in second language acquisition. In J. Hansen Edwards and M. Zampini (Eds.), Phonology and second language acquisition (pp. 309–322). Amsterdam: John Benjamins.
Grosz, B. and Sidner, C. (1986). Attention, intentions, and the structure of discourse. Computational Linguistics, 12, 175–204.
Gut, U. (2007). Learner corpora in second language prosody research and teaching. In J. Trouvain and U. Gut (Eds.), Non-native prosody. Phonetic description and teaching practice (pp. 145–167). Berlin: Mouton de Gruyter.
Hahn, L. D. (2004). Primary stress and intelligibility: Research to motivate the teaching of suprasegmentals. TESOL Quarterly, 38, 201–223.
Hakuta, K., Bialystok, E., and Wiley, E. (2003). Critical evidence: A test of the critical-period hypothesis for second-language acquisition. Psychological Science, 14, 31–38.
Hancin-Bhatt, B. (1992). Toward a forward model of second language phonology: Phonological theory and connectionism. Papers in Applied Linguistics Michigan (PALM), 7, 61–81.
Hancin-Bhatt, B. and Bhatt, R. M. (1997). Optimal L2 syllables. Interactions of transfer and developmental effects. Studies in Second Language Acquisition, 19, 331–378.
Hansen Edwards, J. and Zampini, M. (Eds.). (2008). Phonology and second language acquisition. Philadelphia: John Benjamins.
Hansen, J. G. (2006). Acquiring a non-native phonology. London: Continuum.
Hewings, M. (1995). Tone choice in the English intonation of non-native speakers. International Review of Applied Linguistics, 33, 251–265.
Huckvale, M. (2003). SFS/WASP Version 1.41 http://www.phon.ucl.ac.uk/resource/sfs/wasp.htm
Jenkins, J. (2000). The phonology of English as an international language. Oxford: Oxford University Press.
Jenkins, J. (2002). A sociolinguistically-based, empirically-researched pronunciation syllabus for English as an International Language. Applied Linguistics, 23, 83–103.
Jilka, M. (2007). Different manifestations and perceptions of foreign accent in intonation. In J. Trouvain and U. Gut (Eds.), Non-native prosody. Phonetic description and teaching practice (pp. 77–96). Berlin: Mouton de Gruyter.
Juffs, A. (1990). Tone, syllable structure and interlanguage phonology: Chinese learners’ stress errors. International Review of Applied Linguistics, 28(2), 99–117.
Kang, O., Rubin, D., and Pickering, L. (2010). Suprasegmental measures of accentedness and judgments of English language learner proficiency in oral English. The Modern Language Journal, 94, 554–566.
Kormos, J. and Dénes, M. (2004). Exploring measures and perceptions of fluency in the speech of second language learners. System, 32, 146–164.
Lado, R. (1957). Linguistics across cultures. Ann Arbor, MI: University of Michigan Press.
Lai, E. L. Y. (2002). Prosody and prosodic transfer in foreign language acquisition: Cantonese and Japanese. Muenchen: Lincom.
Leather, J. (1999). Second-language speech research: An introduction. Language Learning, 49, 1–56.
Lenneberg, E. (1967). The biological foundations of language. New York: Wiley and Sons.
Levis, J. (1999). Intonation in theory and practice, revisited. TESOL Quarterly, 33, 37–64.
Levis, J. and Pickering, L. (2004). Teaching intonation in discourse using speech visualization technology. System, 32(4), 505–524.
Long, M. H. (1990). Maturational constraints on language development. Studies in Second Language Acquisition, 12, 251–285.
Lybeck, K. (2002). Cultural identification and second language pronunciation of Americans in Norway. The Modern Language Journal, 86, 174–191.
Major, R. C. (1987). A model for interlanguage pronunciation. In G. Ioup and S. Weinberger (Eds.), Interlanguage phonology: The acquisition of a sound system. Rowley, MA: Newbury House.
Major, R. C. (2001). Foreign accent. The ontogeny and phylogeny of second language phonology. Mahwah, NJ: Lawrence Erlbaum.
Major, R., Fitzmaurice, S., Bunta, F., and Balasubramanian, C. (2002). The effects of non-native accents on listening comprehension: Implications for ESL assessment. TESOL Quarterly, 36, 173–190.
Major, R. C. and Kim, E. (1996). The similarity differential rate hypothesis. Language Learning, 46, 465–496.
Mennen, I. (1998). Can second language learners ever acquire the intonation of a second language? Proceedings of the ESCA workshop on speech technology in language learning. Marholmen, Sweden.
Mennen, I. (2007). Phonological and phonetic influences in non-native intonation. In J. Trouvain and U. Gut (Eds.), Non-native prosody. Phonetic description and teaching practice (pp. 53–76). Berlin: Mouton de Gruyter.
Moyer, A. (2004). Age, accent and experience in second language acquisition. Clevedon, UK: Multilingual Matters.
Munro, M. and Derwing, T. (1995). Foreign accent, comprehensibility, and intelligibility in the speech of second language learners. Language Learning, 45, 73–97.
Nelson, C. (1995). Intelligibility and world Englishes in the classroom. World Englishes, 14, 273–279.
Oller, J. and Ziahosseiny, S. (1970). The contrastive analysis hypothesis and spelling errors. Language Learning, 20, 183–189.
Oyama, S. (1978). The sensitive period and comprehension of speech. Working Papers in Bilingualism/Travaux de Recherches sur le Bilinguisme, 61, 1–17. In S. D. Krashen, R. C. Scarcella, and M. Long, (Eds.) (1982), Child-adult differences in second language acquisition (pp. 39–51). Rowley, MA: Newbury House.
O'Connor, J. D. (1967). Better English pronunciation. Cambridge: Cambridge University Press.
Patkowski, M. (1980). The sensitive period for the acquisition of syntax in a second language. Unpublished doctoral dissertation. New York University.
Patkowski, M. S. (1990). Age and accent in a second language: A reply to James Emil Flege. Applied Linguistics, 11, 73–89.
Patkowski, M. S. (1994). The critical period hypothesis and interlanguage phonology. In M. Yavas (Ed.), First and second language phonology (pp. 205–221). San Diego, CA: Singular.
Pickering, L. (1999). An analysis of prosodic systems in the classroom discourse of native speaker and nonnative speaker teaching assistants. Unpublished dissertation. University of Florida.
Pickering, L. (2001). The role of tone choice in improving ITA communication in the classroom. TESOL Quarterly, 35(2), 233–255.
Pickering, L. (2004). The structure and function of intonational paragraphs in native and non-native instructional discourse. English for Specific Purposes, 23, 19–43.
Pickering, L. (2006). Current research on intelligibility in English as a lingua franca. Annual Review of Applied Linguistics, 26, 219–233.
Pierce, B. N. (1995). Social identity, investment, and second language learning. TESOL Quarterly, 29, 9–31.
Pierrehumbert, J. and Hirschberg, J. (1990). The meaning of intonation in the interpretation of discourse. In P. Cohen, J. Morgan, and M. Pollack (Eds.), Intentions in communication (pp. 271–311). Cambridge MA: MIT Press.
Pirt, G. (1990). Discourse intonation problems for non-native speakers. In M. Hewings (Ed.), Papers in Discourse Intonation (pp. 145–156). Birmingham, England: University of Birmingham, English Language Research.
Piske, T., Mackay, I. R. A., and Flege, J. E. (2001). Factors affecting degree of foreign accent in a L2: A review. Journal of Phonetics, 29, 191–215.
Prince, A. and Smolensky, P. (1993). Optimality Theory: Constraint interaction in generative grammar. RuCCS Technical Report #2, Rutgers University Center for Cognitive Science.
Riggenbach, H. (1991). Toward an understanding of fluency: A microanalysis of non-native speaker conversations. Discourse Processes, 14, 423–441.
Schuetze-Coburn, S., Shapley, M., and Weber, E. (1991). Units of intonation in discourse: A comparison of acoustic and auditory analyses. Language and Speech, 34, 207–234.
Schumann, J. (1986). Research on the acculturation model for second language acquisition. Journal of Multilingual and Multicultural Development, 7, 379–392.
Scovel, T. (1988). A time to speak: A psycholinguistic enquiry into the critical period for human speech. Rowley, MA: Newbury House.
Selinker, L. (1972). Interlanguage. International Review of Applied Linguistics, 10, 209–231.
Sereno, J. A. and Wang, Y. (2007). Behavioral and cortical effects of learning a second language: The acquisition of tone. In O. Bohn and M. Munro (Eds.), Language experience in second language speech learning (pp. 239–258). New York: John Benjamins.
Snow, C. E. and Hoefnagel-Höhle, M. (1978). The critical period for language acquisition: Evidence from second language learning. Child Development, 49, 1114–1128.
Stokes, J. (2001). Factors in the acquisition of Spanish pronunciation. I.T.L. Review of Applied Linguistics, 131/132, 63–83.
Strange, W. (1995). Cross-language studies of speech perception: A historical review. In W. Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language speech research (pp. 3–45). Timonium, MD: York Press.
Strange, W. (2007). Cross-language phonetic similarity of vowels: Theoretical and methodological issues. In O. Bohn and M. Munro (Eds.), Language experience in second language speech learning (pp. 35–55). New York: John Benjamins.
Tao, H. (1996). Units in Mandarin conversation: Prosody, discourse, and grammar. Amsterdam/Philadelphia, PA: John Benjamins.
Tarone, E. (1980). Communication strategies, foreigner talk and repair in interlanguage. Language Learning, 30, 417–431.
Trim, J. (1965). English pronunciation illustrated. Cambridge: Cambridge University Press.
Trouvain, J. and Gut, U. (Eds.). (2007). Non-native prosody. Phonetic description and teaching practice. Berlin: Mouton de Gruyter.
Walker, R. (2001). Pronunciation for international intelligibility. English Teaching Professional, 21, 1–7.
Weinreich, U. (1953). Languages in contact. New York: Linguistic circle of New York.
Weltens, B. and de Bot, K. (1984). Visual feedback of Intonation II: Feedback delay and quality of feedback. Language and Speech, 27, 79–88.
Wenk, B. (1986). Cross-linguistic influence in second language phonology: Speech rhythms. In E. Kellerman and M. Sharwood-Smith (Eds.), Cross-linguistic influence in second language acquisition (pp. 120–133). Oxford: Pergamon Press.
Wennerstrom, A. (1994). Intonational meaning in English discourse. Applied Linguistics, 15, 399–421.
Wennerstrom, A. (1997). Discourse intonation and second language acquisition: Three genre-based studies. Unpublished PhD Dissertation. University of Washington.
Wennerstrom, A. (1998). Intonation as cohesion in academic discourse: A study of Chinese speakers of English. Studies in Second Language Acquisition, 20, 1–25.
Willems, N. (1982). English intonation from a Dutch point of view. Dordrecht/Cinnaminson, NJ: Foris.
Winitz, H. (1981). Input considerations in the comprehension of first and second language. In H. Winitz (Ed.), Native and foreign language acquisition. New York: New York Academy of Sciences.
Yamada, R. A., Tohkura, Y., and Kobayashi, N. (1996). Effect of word familiarity on non-native phoneme perception: Identification of English /r/, /l/ and /w/ by native speakers of Japanese. In J. Leather and A. James (Eds.), Second-Language speech: Structure and process (pp. 103–117). Berlin: Mouton de Gruyter.
Yavas, M. (Ed.). (1994). First and second language phonology. San Diego, CA: Singular.
Yeni-Komshian, G., Flege, J., and Liu, S. (1997). Pronunciation proficiency in L1 and L2 among Korean-English bilinguals: The effect of age of arrival in the US. Journal of the Acoustical Society of America, 102(5), 3138.