This chapter examines the neurocognition of second language, that is, the neural and cognitive (psychological and computational) bases of late-learned second or subsequent language (L2). The chapter focuses on lexical and grammatical processes. We first summarize the historical context of research related to the neurocognition of L2, and then examine current neurocognitive theories and their predictions. Next, we describe two common neurocognitive measures, Event-Related Potentials (ERPs) and functional Magnetic Resonance Imaging (fMRI), and review the extant empirical evidence they have provided for L2. Finally, we discuss the instructional relevance of such theories and evidence and suggest future directions for research.
The field of second language acquisition (SLA) has examined, among other issues, the nature of the cognitive mechanisms underlying L2 learning and use, and the various intrinsic and extrinsic factors that may influence these mechanisms. The field has gained insight into these issues, primarily through behavioral approaches. However, the development of valid operationalizations and reliable measurements of the cognitive mechanisms of interest is often a complex task at best, as in the case of assessing implicit learning. Because of the difficulty of identifying underlying mechanisms with behavioral methods, interest is growing in more brain-based approaches, which can complement behavioral approaches in gaining an understanding of these issues. Importantly, this research may elucidate not only the cognitive mechanisms of L2, which have been one focus of SLA, but also its brain and biological bases, which are potentially important in their own right.
Brain-based research on L2 has experienced a virtual explosion in recent years, for several reasons, including recent advances both in neuroimaging techniques and in the development of neurocognitive theories of L2 acquisition and processing. Indeed, a search of the databases Linguistics and Language Behavior Abstracts and PsycINFO using the search terms “second language,” “neur*,” and “ERP” or “fMRI” revealed only one publication in 1981–1990, 21 publications in 1991–2000, and an astounding 875 from 2001 to mid-2010.
However, the study of the neurocognition of L2, which has emerged mainly from the field of cognitive neuroscience, has developed largely independently from the field of SLA. Unlike SLA, it has focused almost exclusively on two questions: (1) whether the neurocognitive representation and processing of L2 are similar to or different from those of first language (L1); and (2) to what degree such similarities and differences are influenced by a small number of factors, in particular, L2 age of acquisition and proficiency, and to a lesser extent, transfer effects from the learner's L1 to the L2. The empirical focus on these factors is evident in a number of recent reviews of L2 neurocognitive research (Abutalebi, 2008; Hernandez and Li, 2007; Indefrey, 2006; Kotz, 2009; Mueller, 2005; Schmidt and Roberts, 2009; Steinhauer et al., 2009; van Hell and Tokowicz, 2010). Perhaps most importantly, although this body of neurocognitive research has made a significant contribution to our understanding of what the neural representation and processing of L2 look like in comparison to L1, it is still in its infancy in understanding why L2 neural representation and processing look the way they do, and why various factors might affect them.
In this chapter we take a somewhat different approach from most previous overviews of L2 neurocognitive research. We focus not only on the L2 empirical evidence (the what), but also on the theoretical landscape and how the evidence constrains it (the why). Specifically, we present current neurocognitive theories of L2; discuss how their claims and predictions overlap or differ; review the L2 neurocognitive evidence from ERPs and fMRI; and finally discuss how these data do or do not fit current theories.
Here we focus on four theories, all of which make claims about the neural as well as the cognitive basis of second language: the Declarative/Procedural model, Paradis’ model, the Competition Model, and the Convergence Hypothesis. As we shall see below, these four theories are representative of two broad neurocognitive perspectives that make quite different claims and predictions. For reasons of space, we have left out other theories, including those of Clahsen (Clahsen and Felser, 2006a, b, c), whose theory shares many claims with the Declarative/Procedural model, and Ellis (Ellis, 2002a, b, 2006), whose view shares certain similarities with the Competition Model.
The Declarative/Procedural (DP) model (Ullman, 2001b, 2004) proposes that crucial aspects of language depend on two well-studied brain memory systems, declarative and procedural memory, that support non-language functions in humans and other animals (Eichenbaum and Lipton, 2008; Henke, 2010; Mishkin et al., 1984; Squire and Schacter, 2002). Declarative memory, which stores knowledge about facts and events (semantic and episodic knowledge), is posited to also underlie the mental lexicon, including the conceptual meanings of words as well as their phonological forms and grammatical specifications (e.g., irregular morphology and argument structure). This lexical dependence on declarative memory is expected in both L1 and L2. Knowledge can be learned rapidly in declarative memory, and it is at least partly, but not completely (Chun, 2000), explicit, that is, available to conscious awareness. Declarative memory relies on the hippocampus and other medial temporal structures for learning new knowledge, which eventually depends largely on neocortical regions, particularly in the temporal lobes. Other brain structures play a role in declarative memory as well, including a region in frontal neocortex corresponding to Brodmann's Areas (BAs) 45 and 47 (within and near classical Broca's area), which underlie the selection or retrieval of declarative memories. (Note that for both declarative and procedural memory, the DP model refers to the entire neurocognitive system involved in the learning, representation, and processing of the relevant knowledge, not just to those parts underlying learning and consolidating new knowledge, which is what some researchers refer to regarding the two memory systems.)
The procedural memory system, which is a distinct brain memory system from declarative memory, underlies the implicit learning and use of motor and cognitive skills, and may be specialized at least in part for sequences and rules (Henke, 2010; Mishkin et al., 1984; Poldrack and Foerde, 2008; Squire and Schacter, 2002). Note that evidence suggests the existence of more than one non-declarative implicit memory system, only one of which is referred to as procedural memory by the DP model. Thus it is not the case that all implicit skills or knowledge are learned in the procedural memory system. According to the DP model (Ullman, 2001a, 2005), procedural memory in L1 also underlies aspects of a symbol-manipulating grammar, across grammatical sub-domains, including phonology, morphology, and syntax, in both expressive and receptive language. Procedural memory is posited to be especially important in the real-time sequential and hierarchical combination (e.g., Merge (Chomsky, 1995) or concatenation) of stored lexical forms and abstract representations into rule-governed complex structures (e.g., the + cat, walk + -ed, Noun Phrase + Verb Phrase), that is, in grammatical structure-building. However, just as we can memorize lyrics, poems, and speeches, presumably in declarative memory, so complex structures can also be stored in declarative memory, for example as chunks (“the cat”). Unlike learning in declarative memory, learning in procedural memory requires repeated exposure to the stimulus, or practice with the skill, although once learned, skills seem to apply automatically, rapidly, and reliably. The procedural memory system is composed of a network of interconnected brain structures rooted in circuits connecting the basal ganglia (a set of structures deep within the brain, including the caudate nucleus and putamen) with certain frontal regions, in particular pre-motor regions and nearby BA 44 within Broca's area. The basal ganglia may be more important in learning new skills, while frontal regions are more important in representing and processing already-learned skills.
Crucial to the DP model's predictions about L2 is the fact that learning abilities in the declarative and procedural memory systems change differentially across the lifespan. Learning in declarative memory improves during childhood, and plateaus during adolescence and early adulthood, though it subsequently begins to decline (DiGiulio et al., 1994; Graf, 1990; Vaidya et al., 2007). In contrast, procedural memory learning abilities seem to be established early in childhood, with minimal subsequent changes into adulthood, including a possible decrement from childhood to adolescence (Bachevalier, 2001; Dorfberger et al., 2007; Siegel, 2001). These changes in functionality during development lead to predicted changes in reliance on the two memory systems for later- vs. earlier-learned language. According to the DP model, later-learned L2s are expected to initially depend on the declarative memory system not only for lexical knowledge (which can only be learned in this system, and thus depends on it for both L2 and L1) but also for complex forms, which can be memorized as chunks (or processed in declarative memory in other ways, e.g., computed with rules learned in this system; Ullman, 2005, 2006). This dependence on declarative rather than procedural memory for complex forms is due both to the fact that knowledge is learned faster in declarative than procedural memory (and thus is learned initially in the former), and to the fact that later learners of language should have more developed declarative learning abilities than younger learners, and possibly also more attenuated procedural learning abilities. Thus complex forms in L2 should depend more on declarative memory than in L1, and the later the L2 is learned (up to young adulthood), the greater its dependence on this system.
However, the procedural memory system is not dysfunctional in adulthood. At worst, it is somewhat attenuated, and procedural learning can and does take place in adults. Thus, with increasing practice with the L2 (and accompanying proficiency), aspects of grammar are predicted to increasingly rely on the procedural memory system. Therefore, the DP model predicts that though the neurocognition of L2 is always L1-like for lexical/conceptual knowledge, it can also become L1-like for complex forms. The degree of this proceduralization of grammatical abilities is expected to be a function of multiple intrinsic and extrinsic factors, including not only the amount of L2 practice (experience), but also the type of input (see below) and the kinds of grammatical rules and relations (some may be easier to proceduralize), as well as intrinsic factors such as genotype and sex (male vs. female), which can modulate the functionality of the declarative and/or procedural memory systems (Ullman, 2004, 2005, 2007, 2008; Ullman et al., 2008).
Note that although the improvement of declarative memory and possible worsening of procedural memory during childhood should lead to a greater reliance of complex forms on declarative than procedural memory in later than earlier learned language (e.g., in L2 vs. L1, and later- vs. earlier-learned L2), the basic pattern of the trajectory is predicted to be similar in L1 and L2 acquisition, that is, in both cases the rapid learning ability of declarative memory should lead to earlier learning of words as well as complex forms as chunks, with only later proceduralization of the grammar (Ullman, 2005).
Finally, note that it is not the case that such changes in the relative reliance on the two memory systems are due to any “transformation” of knowledge from one to the other system. The two systems can independently acquire knowledge, though knowledge acquired in one system can either enhance or inhibit learning analogous knowledge in the other system (Poldrack and Packard, 2003; Ullman, 2004). Thus proceduralization of grammar does not constitute the “transformation” of declarative into procedural representations, but rather the gradual acquisition of grammatical knowledge in procedural memory, which is increasingly relied on, with an accompanying decrease in reliance on declarative memory (Ullman, 2001a, 2004, 2005).
Paradis’ model also implicates notions of declarative and procedural memory in language acquisition (Paradis, 1994, 2004, 2009). Unlike the DP model, Paradis assumes isomorphic relations between declarative memory and explicit knowledge; that is, declarative memory only contains explicit knowledge, and all explicit knowledge is in declarative memory. Similar to the DP model, Paradis claims that procedural memory is necessarily implicit but that other types of implicit knowledge also exist (Paradis, 2009). He argues that explicit knowledge (stored in declarative memory) must be “learned,” with attention directed to what is being learned, whereas implicit knowledge (stored in procedural memory) can only be “acquired” incidentally, without attention being paid directly to it. Paradis implicates hippocampal/medial-temporal structures and association cortex in declarative memory, and the basal ganglia, the cerebellum, and perisylvian neocortex (i.e., around the Sylvian fissure) in procedural memory (Paradis, 1994, 2009).
For L1, language acquisition is posited to depend on procedural memory for all aspects of language that are implicit. This includes all aspects of the grammar, including not only phonology, morphology, and syntax but also implicit grammatical properties of the lexicon (Paradis, 2009). In contrast, consciously accessible vocabulary items, specifically, the sound-meaning pairings of words, are stored in declarative memory, although their use in context, which requires automatic access to their representation, is “also implicit (i.e., non-conscious)” (Paradis, 2009, p. 12). Note a specific but important difference between Paradis’ and Ullman's positions. For both Paradis (Paradis, 1994, 2009) and Ullman (Ullman, 2001b, 2004), the “lexicon” comprises the sound-meaning pairings and the grammatical properties of words. Paradis claims that the sound-meaning pairings of words, that is, the “vocabulary,” are explicit and are subserved by declarative memory, which is synonymous with explicit memory, whereas knowledge about the grammatical properties of words, also referred to as the “lexicon,” is implicit and is subserved by procedural memory. In contrast, Ullman claims that both the sound-meaning pairings and the grammatical properties of words are part of the mental lexicon and are expected to rely on the declarative memory system, which is understood to underlie implicit as well as explicit knowledge.
Paradis accounts for differences between L1 and L2 in terms of brain maturation. Specifically, he claims that there is a decline in “the use” of procedural memory after the “optimal period” for language acquisition between two and five years of age (Paradis, 2009, pp. 114–118). Because declarative memory becomes more available during development (see above), older learners depend more on declarative-based learning as opposed to procedural-based acquisition. This reliance on declarative memory is predicted across various domains of language, including lexical knowledge and processing that is implicit in L1. Paradis states that, with sufficient practice, that is, “repeated use (involving both comprehension and production) in interactive communicative situations” (p. 4), it is theoretically possible, though very rare, for learners to acquire L2 grammar in its entirety (Paradis, 2009). More commonly, some specific elements of the grammar may become part of the implicit linguistic system and come to depend on procedural memory. A final key element of this model is that explicit, metalinguistic knowledge about the L2 grammar is expected to indirectly contribute to the process of acquiring elements of the L2 grammar, in that explicit knowledge may promote repeated use of the form, which in turn leads to the development of procedural knowledge for that form.
According to the Competition Model (Hernandez et al., 2005; MacWhinney, 2005, 2007), language comprehension is based on the detection of a set of cues whose strength is determined by their reliability and availability. Language acquisition in the Competition Model occurs as cue-driven learning, which is a “process of acquiring coalitions of form-function meanings, and adjusting the weight of each mapping until it provides an optimal fit to the processing environment” (MacWhinney and Bates, 1989, p. 59). This mechanism for acquisition is relevant for all aspects of language, including lexical, phonological, and grammatical forms, in both first and second language (MacWhinney, 2007).
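The notions of cue availability and reliability can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the cue names and toy observations are hypothetical, and combining the two quantities as a product (often called cue validity in the Competition Model literature) is an assumption that is not spelled out in this chapter.

```python
def cue_statistics(observations):
    """Estimate availability, reliability, and their product for each cue.

    Each observation is a dict mapping a cue name to True (the cue was
    present and pointed to the correct interpretation) or False (present
    but misleading). A cue missing from the dict was not available.
    """
    total = len(observations)
    present = {}
    correct = {}
    for obs in observations:
        for cue, was_correct in obs.items():
            present[cue] = present.get(cue, 0) + 1
            correct[cue] = correct.get(cue, 0) + int(was_correct)

    stats = {}
    for cue in present:
        availability = present[cue] / total          # how often the cue occurs
        reliability = correct[cue] / present[cue]    # how often it is right when it occurs
        stats[cue] = {
            "availability": availability,
            "reliability": reliability,
            # Assumed combination rule (product), not taken from the chapter.
            "validity": availability * reliability,
        }
    return stats


# Hypothetical toy input: four sentences with two illustrative cues.
toy_sentences = [
    {"word_order": True, "agreement": True},
    {"word_order": True},
    {"word_order": False, "agreement": True},
    {"word_order": True},
]
stats = cue_statistics(toy_sentences)
```

In this toy input, word order is always available but occasionally misleading, whereas agreement is available in only half of the sentences but never misleading, so the two cues trade off availability against reliability.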
Thus the Competition Model posits that L2 acquisition relies on the same mechanisms as does L1 acquisition. The difference between early L1 and late L2 acquisition is explained in terms of the concepts of competition, resonance, parasitism, and entrenchment (Hernandez et al., 2005). Specifically, cue-driven learning is more difficult for a learner in an L2 than in L1 because an entrenched set of L1 relationships is already established and resonant, that is, repeatedly coactivated. L2 learning is claimed to be parasitic on L1 because L2 associations are interspersed with L1 forms instead of being clustered in a separate L2 region. This parasitism would be expected to lead to L1 transfer effects. However, by using explicit metacognitive procedures such as rehearsal and recoding, L2 learners can increase the strength of relationships within L2, so that the L2 itself becomes more resonant and is able to compete with entrenched L1 forms and block L1 transfer effects more effectively. Additionally, high similarity between L2 and L1 forms is predicted to lead to positive overlap and to be facilitative for acquisition of the similar L2 form. In contrast, when L2 forms differ from those in L1, competition between the forms may negatively affect successful acquisition of the L2.
Consistent with these claims, on a neurocognitive level the Competition Model expects that L1 and L2 will show significant overlap in neural representation and processing, for example as measured by fMRI or ERP (Hernandez et al., 2005), especially when there is a lack of competition between the L1 and L2 forms, and even at lower levels of proficiency (Tokowicz and MacWhinney, 2005). Thus few differences in the neural representation and processing of L1 and L2 are expected, apart from some neural separation that might be detectable at a local cortical level, as well as the activation of non-language brain areas, such as those recruited for explicit metacognitive processes (Hernandez et al., 2005, p. 223).
The Convergence Hypothesis (Abutalebi, 2008; Abutalebi and Green, 2007; Green, 2003; Green et al., 2006) is largely consistent with the Competition Model, in that it claims that L2 acquisition depends on the same neural mechanisms as does L1, and that these mechanisms operate in the context of the already-specified L1 neural system. Specifically, the Convergence Hypothesis expects that at initial stages of L2 acquisition, aspects of L2 (e.g., L2 semantics and syntax) will depend on the neural substrates that underlie parallel aspects of the learner's L1, and thus will arise in the context of the speaker's already-specified L1 system. Therefore initially the representation of L2 is “convergent” with the (already-specified) representation of the learner's L1. However, with increasing proficiency, the L2 is expected to “converge with the representation of that language learned as an L1” (Green, 2003, p. 204).
Although the model predicts that the neural representation of L2 converges to that of the same language in a native speaker, especially at higher levels of proficiency, the model also expects L2/L1 differences, possibly due to competition between the L2 and L1, particularly at lower levels of proficiency. This competition may lead to increased neural activation for L2 as compared to L1 in certain areas, particularly for those associated with language control (prefrontal cortex, basal ganglia, and anterior cingulate cortex; Abutalebi, 2008). In addition, the model recognizes that “explicit, declarative representations of grammatical information” may play an initial role in on-line processing, and expects that different contexts of acquisition (e.g., “formal school setting versus immersion setting”) should affect this initial registration of linguistic information (Green, 2003, p. 205). Thus the model accounts not only for convergence between L1 and L2 but also offers a proficiency- and context-based account for differences between L1 and L2.
Ideally, in order to test these competing theories one would compare and contrast them on a range of issues, such as claims of domain generality, the specific implicated neural substrates and their functions, the roles for explicit metalinguistic L2 learning and knowledge, and so on. However, many of these comparisons are difficult to make because they are either specified to different levels by different models or not specified at all. Here we focus on two related issues that seem sufficiently well specified by each of the theories laid out above and that lead to different predictions, providing a means to help distinguish between two broad neurocognitive perspectives represented by these theories. Note that although there are clear differences between the theories within each broad perspective (e.g., between the DP model and Paradis, and between the Competition model and the Convergence hypothesis), we do not address these here. The two issues we examine in light of the theoretical perspectives are: what the neurocognition of L2 representation, processing and acquisition looks like in comparison to L1, and why it looks that way, including which factors affect it.
First, a clear difference emerges between two sets of theories regarding the assumptions underlying L2 vs. L1 (the why). On the one hand, the DP model and Paradis posit the involvement of two distinct neurocognitive systems that play somewhat different roles in L2 and L1, due primarily to the development or maturation of the declarative and procedural memory systems (Paradis, 1994, 2004, 2009; Ullman, 2001a, 2005). On the other hand, the Competition and Convergence models posit the involvement of a single set of mechanisms that play similar roles in L2 and L1, with any differences emerging largely as a result of the neural commitment of L1, which competes with the establishment of an L2 (Green, 2003; Green et al., 2006; Hernandez et al., 2005).
Second, these different underlying assumptions lead to different predictions regarding the neurocognitive basis of L2 (the what). The DP model and Paradis both predict certain qualitative neurocognitive differences between L1 and L2, as well as between lower and higher exposure L2—that is, differences in which brain and cognitive systems are being relied on. In particular, both of these models expect that L1 will depend more on procedural memory, and L2 more on declarative memory, particularly at lower exposure L2. (Note that the specifics differ between the two models, both in which brain structures are involved, and in which linguistic forms and functions should show the qualitative differences. The DP model predicts that rule-governed relations and forms should show these differences, while idiosyncratic lexical knowledge should not. Paradis predicts these differences for all knowledge and processing that is implicit in L1, including both grammar and lexical knowledge. However, these specific differences are not addressed here. For further discussion, see e.g., Ullman, 2001a, 2005).
Both the Competition and Convergence models predict primarily either no differences or quantitative differences between L2 and L1: that is, L2 should activate largely the same set of neural structures as L1. At lower levels of proficiency, L2 learners may show increased brain activity in these structures as compared to L1, due to greater competition and difficulty. At higher proficiency levels, activation should decrease, perhaps to the level of L1 (Abutalebi, 2008; Green, 2003; Green et al., 2006; Hernandez et al., 2005). Note, however, that these models also provide some accounts of possible qualitative differences between L1 and L2. For example, both models posit a reliance on explicit metalinguistic knowledge at lower levels of proficiency, and the Convergence Model emphasizes an increased reliance on neural structures involved in language control.
These differing claims and predictions can be tested using a variety of different behavioral and brain measures. Consistent with this chapter's focus on brain-based approaches, here we provide a short overview of the two brain-based methods most commonly used in neurocognitive L2 research: the electrophysiological technique of ERPs and the hemodynamic neuroimaging approach of fMRI. In our discussion of evidence below, we also present results from Positron Emission Tomography (PET), a more rarely used hemodynamic neuroimaging approach. Note that other methods, such as the lesion method, magnetoencephalography (MEG), direct cortical brain stimulation, and transcranial magnetic stimulation (TMS), have also been used in L2 research but are beyond the scope of this chapter.
ERPs are scalp-recorded electrical potentials of the brain activity that takes place after subjects are presented with stimulus “events,” such as words, pictures, or sounds. Language-related ERP research often employs a violation paradigm for presenting linguistic stimuli. In this paradigm the ERP response to a linguistic violation (e.g., lexical, syntactic, morphosyntactic) is compared to the ERP response to a matched control word or structure. As the stimuli are presented, electrophysiological activity is recorded at the scalp from electrodes that are attached to ERP caps. The precise time of each stimulus is marked in the ongoing electroencephalogram (EEG). After the EEG is amplified, the time-locked signal for each stimulus is averaged by condition in order to detect that part of the waveform related to that specific condition. This waveform is called the “event-related potential.” Characteristic peaks (i.e., “bumps” or “dips”) in the ERP waveform that are consistently found in particular experimental conditions are referred to as “ERP components.” Different ERP components can be identified and distinguished by various factors, including their latency (when they occur in the waveform), amplitude (of the voltage), scalp distribution (which electrodes the component is strongest at and extends to), and polarity (whether their voltage is positive or negative).
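The averaging procedure just described can be sketched in a few lines. This is a minimal illustration on simulated single-channel data, assuming NumPy is available; the function names, sampling rate, epoch window, baseline correction step, and the simulated negative deflection around 400 ms are all hypothetical choices made for the example, not details taken from any study reviewed here.

```python
import numpy as np


def average_erp(eeg, onsets, labels, sfreq, tmin=-0.1, tmax=0.8):
    """Average time-locked epochs of a single-channel EEG by condition."""
    start = int(round(tmin * sfreq))  # samples relative to onset (negative = pre-stimulus)
    stop = int(round(tmax * sfreq))
    epochs = {}
    for onset, label in zip(onsets, labels):
        if onset + start < 0 or onset + stop > len(eeg):
            continue  # skip epochs that run off the recording
        segment = eeg[onset + start:onset + stop].astype(float)
        if start < 0:
            # Assumed preprocessing step: subtract the pre-stimulus baseline mean.
            segment = segment - segment[:-start].mean()
        epochs.setdefault(label, []).append(segment)
    times = np.arange(start, stop) / sfreq
    erps = {label: np.mean(segs, axis=0) for label, segs in epochs.items()}
    return times, erps


def describe_peak(times, wave, window):
    """Report latency, amplitude, and polarity of the largest deflection in a window."""
    lo, hi = window
    mask = (times >= lo) & (times <= hi)
    idx = int(np.argmax(np.abs(wave[mask])))
    amplitude = float(wave[mask][idx])
    return {
        "latency_s": float(times[mask][idx]),
        "amplitude": amplitude,
        "polarity": "negative" if amplitude < 0 else "positive",
    }


# Simulated recording (hypothetical values): 80 trials, alternating conditions.
rng = np.random.default_rng(0)
sfreq = 250
onsets = np.arange(100, 100 + 80 * 300, 300)
labels = ["control", "violation"] * 40
eeg = rng.normal(0.0, 1.0, size=onsets[-1] + 400)

# Add a negative-going deflection centered at 400 ms to violation trials only.
t = np.arange(200) / sfreq
bump = -5.0 * np.exp(-0.5 * ((t - 0.4) / 0.05) ** 2)
for onset, label in zip(onsets, labels):
    if label == "violation":
        eeg[onset:onset + 200] += bump

times, erps = average_erp(eeg, onsets, labels, sfreq)
difference = erps["violation"] - erps["control"]
peak = describe_peak(times, difference, (0.3, 0.5))
```

Averaging many time-locked epochs shrinks activity that is unrelated to the stimulus, so the simulated deflection stands out in the violation-minus-control difference wave, where it can be characterized by its latency, amplitude, and polarity. Scalp distribution would require repeating the same computation across electrodes, which this single-channel sketch omits.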
A benefit that researchers gain when using ERPs is that, unlike other neuroimaging techniques (e.g., fMRI, MEG), ERP research has revealed a set of widely studied language-related ERP components in L1, whose characteristics and underlying functions are relatively well understood (Steinhauer and Connolly, 2008). These components thus provide a frame of reference for examining the attainment of native-language processing in L2. A second advantage of ERPs is that they reflect actual electrophysiological neuronal activity, which changes on the order of milliseconds, and thus provide us with excellent temporal information.
Unfortunately, ERPs’ advantage in temporal resolution is accompanied by a strong disadvantage in spatial resolution. It is quite difficult, though not impossible, to identify the actual brain structures that generate scalp-recorded potentials. There are also other limitations to ERPs. The electrical activity of a single neuron firing is much too small to measure outside the brain, and ERPs are generally detectable only when hundreds or even thousands of neurons with a similar geometrical orientation are active at the same time. It also turns out that only certain types of neurons in the cortex tend to have these properties, and therefore it is mainly these cortical neurons that are captured by ERPs. Finally, ERP research is limited by the fact that participants must refrain from moving (even blinking their eyes) during the presentation of stimuli because of the electrical noise produced both by muscles and by motor neurons. So almost all ERP language studies are limited to language perception (reading or listening).
A second common method used for exploring the neurocognition of L2 is fMRI, a neuroimaging technique that emerged in the 1990s. In fMRI two images are recorded: (a) a structural image, which details the anatomical structures of the brain, and (b) a functional image, which typically reflects changes in blood oxygenation in the brain. These changes are thought to reflect increases or decreases in neuronal firing rates, which in turn are taken to be related to cognitive processes. When a functional image is aligned with a structural image, one can thus attempt to determine the location in the brain of neuronal activity related to a particular cognitive process.
The primary benefit of fMRI is that it provides superb spatial resolution. Indeed, it allows one to distinguish different areas of brain activation that are as little as a few millimeters apart. However, fMRI does have its limitations. First, the very strong magnetic fields of the MRI scanner can be quite dangerous in some circumstances (e.g., if the participant has metal in their body). Second, in fMRI studies, the participant must be lying down and must not move their head, limiting the nature of experimental paradigms. Finally, the hemodynamic changes that take place in response to neural activity are too slow to allow the detection of real-time processing changes, so one cannot use fMRI to measure the spatio-temporal dynamics of language. It is therefore important to test a given linguistic process not only with fMRI, but also with techniques such as ERPs that provide very high temporal resolution.
Several reviews of ERP and hemodynamic (fMRI and PET) evidence have recently provided synthesis and insight on the neurocognition of L2 (Abutalebi, 2008; Hernandez and Li, 2007; Indefrey, 2006; Kotz, 2009; Mueller, 2005; Schmidt and Roberts, 2009; Steinhauer et al., 2009; Stowe and Sabourin, 2005; van Hell and Tokowicz, 2010). Thus the purpose of the current section is not to provide an exhaustive review of empirical studies, but rather to point out emerging patterns of evidence, and to consider these in light of the theories discussed above. Specifically, we consider ERP and hemodynamic studies of natural and artificial L2 in order to assess whether the extant evidence suggests qualitative or quantitative differences between L2 and L1 and between lower and higher exposure L2.
ERP evidence from natural language. In L1, different types of processing difficulties elicit different ERP components (for recent comprehensive reviews see Kaan, 2007; Steinhauer and Connolly, 2008). Difficulties in lexical/semantic processing in L1 elicit central/posterior bilaterally distributed negativities (N400s) that often peak about 400 ms after the onset of the word (Friederici et al., 1999; Kutas and Hillyard, 1980). N400s reflect aspects of lexical/semantic processing, and may depend on the declarative memory brain system (Lau et al., 2008; McCarthy et al., 1995; Simos et al., 1997; Steinhauer and Connolly, 2008; Ullman, 2001a). In contrast, difficulties in (morpho)syntactic processing often produce three components. First, such processing difficulties can, though do not always (Hagoort and Brown, 1999; Osterhout et al., 1997), elicit early (150–500 ms) left-to-bilateral anterior negativities (LANs) (Friederici et al., 1993; Neville et al., 1991). LANs appear to reflect aspects of rule-governed automatic structure-building (Friederici and Kotz, 2003; Hahne and Friederici, 1999) and have been posited to depend on the procedural memory brain system (Ullman, 2001a, 2004). Second, (morpho)syntactic processing difficulties usually elicit late (600 ms) centro-parietal positivities (P600s) (Kaan et al., 2000; Osterhout and Holcomb, 1992) linked to controlled (conscious) processing and structural reanalysis (Hahne and Friederici, 1999; Kaan, 2007; Kaan et al., 2000). The biphasic pattern of a LAN followed by a P600 may be characteristic of native-speaker processing of (morpho)syntactic violations (Friederici et al., 1993; Steinhauer and Connolly, 2008). Finally, such violations may also elicit later (600–2000 ms) sustained anterior negativities (“late anterior negativities”), which generally show bilateral distributions, and may reflect increased working memory demands (Martin-Loeches et al., 2005).
In L2, ERP studies have revealed the following: Difficulties in lexical/semantic processing do not differ qualitatively between L1 and L2, reliably eliciting N400s in both cases, even after minimal L2 exposure (McLaughlin et al., 2004; Steinhauer et al., 2009; Ullman, 2001a). In contrast, L2 differs from L1 in aspects of (morpho)syntactic (grammatical) processing, in particular at lower levels of exposure and proficiency. At lower levels LANs are typically absent, with subjects instead showing no negativity at all (Hahne and Friederici, 2001; Ojima et al., 2005) or N400s or N400-like posterior negativities (Osterhout et al., 2008; Weber-Fox and Neville, 1996). However, recent studies have reported LANs in higher proficiency L2 (Gillon Dowens et al., 2009; Isel, 2007; Ojima et al., 2005; Rossi et al., 2006; Steinhauer et al., 2009) (but see Chen et al., 2007). These LANs are sometimes bilaterally distributed (Isel, 2007; Rossi et al., 2006), possibly due to lower L2 proficiency (Steinhauer et al., 2009). P600s are generally found in L2, particularly at higher proficiency (Gillon Dowens et al., 2009; Isel, 2007; Osterhout et al., 2008; Rossi et al., 2006; Steinhauer et al., 2009; Weber-Fox and Neville, 1996). In some studies of high proficiency L2, the LAN and P600 are both elicited in response to (morpho)syntactic violations (Gillon Dowens et al., 2009; Hahne et al., 2006; Rossi et al., 2006; Steinhauer et al., 2009). Finally, late anterior negativities have also been observed in L2, again mainly at higher proficiency (Gillon Dowens et al., 2009; Rossi et al., 2006).
Hemodynamic evidence from natural language. In L1, fMRI and PET research shows somewhat different activation patterns for lexical/semantic and grammatical processing (for a review see Hasson and Small, 2008). Brain structures in temporal/temporo-parietal regions, including the medial temporal lobe, are linked with lexical/semantic processing (Friederici et al., 2003; Illes et al., 1999; Kuperberg et al., 2000; Newman et al., 2001). In addition, BA 45 and 47 seem to underlie the selection, retrieval or integration of lexical and semantic knowledge (Dapretto and Bookheimer, 1999; Illes et al., 1999; Poldrack et al., 1999). In the (morpho)syntactic domain, activation is observed in inferior frontal regions, in particular BA 44 and the frontal operculum, as well as in the superior temporal gyrus and the basal ganglia (Dapretto and Bookheimer, 1999; Friederici et al., 2003; Newman et al., 2001).
In L2, certain patterns are beginning to emerge from hemodynamic studies (for comprehensive reviews see Indefrey, 2006; Kotz, 2009; Stowe and Sabourin, 2005). First, tasks that involve only lexical processing typically depend on the same brain structures in L2 as in L1, and do not generally yield different activation patterns in L1 and L2 (Chee et al., 1999; Klein et al., 1999; Xue et al., 2004), although some studies have reported greater activation in L1 compared to L2 (Pillai et al., 2004) as well as in L2 compared to L1 (Perani et al., 2003). In contrast, grammatical processing has thus far shown varied results in both region and level of activation. Some studies suggest that grammatical processing in L2 as compared to L1 elicits greater temporal lobe involvement, in both the left and right hemispheres (Dehaene et al., 1997; Perani et al., 1996; Perani et al., 1998). The extent of temporal lobe involvement during these tasks also appears to be greater for later than for earlier L2 learners, and for L2 speakers with less exposure than for those with more, although confounds between age of exposure and amount of exposure complicate interpretation (Dehaene et al., 1997; Perani et al., 1996; Perani et al., 1998). Findings on frontal lobe involvement in L2 relative to L1 have also varied across studies. Although some studies have not found more frontal activation in L2 than L1 (Dehaene et al., 1997; Perani et al., 1996, 1998), others have found increased frontal activation in L2, albeit in different regions across studies, including BA 44, 45, and 47 (Golestani et al., 2006; Hasegawa et al., 2002; Nakai et al., 1999; Ruschemeyer et al., 2005; Wartenburger et al., 2003).
ERP and hemodynamic evidence from artificial language. Examining artificial language learning and processing offers several important advantages over, and crucially complements, testing only natural language. These advantages include both very rapid learning (e.g., on the order of hours to days to reach high proficiency levels of production and comprehension) and the ability to control for a range of linguistic and other factors. If indeed it can be demonstrated that (at least certain types of) artificial languages constitute valid models for aspects of natural language, this will provide researchers in the study of language with an extremely useful tool, analogous to the reliance of many other scientific disciplines on simplified models to test complex phenomena.
An early ERP study examined a simple artificial language (Brocanto) whose syntactic rules conform to language universals (Friederici et al., 2002). Adults listened to Brocanto sentences while playing a computer-based game until they reached high proficiency at the language, at which point syntactic violations elicited a LAN/P600, typical of L1 natural languages. This ERP pattern was found even for violations of rules that did not exist in any language known to the subjects, ruling out L1–L2 rule transfer.
More recent ERP research used a modified version of Brocanto to examine how L2 neurocognition might be differentially affected by explicit training (in which the rules of the language are explained to the learner) and implicit training (in which only meaningful phrases and sentences of the language are heard) (Morgan-Short, 2007; Morgan-Short et al., 2010; Morgan-Short et al., in press). The findings showed that for L2 syntax (Morgan-Short et al., in press), learning under explicit and implicit training conditions led to similar performance on behavioral (performance) measures of the language. There were striking differences, however, in the ERPs elicited by syntactic violations. Implicitly but not explicitly trained learners showed evidence of (a) reliance on lexical/semantic processing, as evidenced by an N400, at low proficiency; and (b) an L1-like LAN/P600 biphasic pattern at high proficiency. For L2 morphosyntax (grammatical gender agreement) (Morgan-Short et al., 2010), some ERP differences were also found between the explicit and implicit training conditions, although neither condition led to a fully native-like biphasic LAN-P600 response.
In an fMRI Brocanto study (Opitz and Friederici, 2003), adult Brocanto acquisition initially involved the hippocampus and temporal neocortical regions. While activation in these brain structures decreased during learning, activation increased in BA 44. This finding suggests a switch from declarative memory to L1-like (procedural memory) processing during L2 learning. Two other fMRI studies examined the acquisition of artificial grammatical rules that either do or do not follow natural-language universals (Musso et al., 2003; Tettamanti et al., 2002). In both studies, subjects learned both types of rules to high proficiency. In one study, activation increased in Broca's area (BA 45) as proficiency increased, but only for the rules following natural-language patterns (Musso et al., 2003). Similarly, in the other study learners showed activation in Broca's area (BA 44) only for natural-language-like rules (Tettamanti et al., 2002); moreover, those learners with the highest proficiency showed more such activation than learners with somewhat lower proficiency (Tettamanti et al., 2002).
Discussion of ERP and hemodynamic evidence. We have summarized ERP and hemodynamic (fMRI and PET) evidence of lexical and grammatical aspects of L2, examining studies of both natural and artificial language. Studies of ERPs provide evidence for qualitative differences in (morpho)syntactic processing between (a) L2 at lower levels of exposure/proficiency and L1, with LAN/P600 responses typically found in the latter but not the former, for which N400s have been reported; and (b) lower and higher exposure/proficiency L2, with L1-like LAN/P600 responses found only in the latter case. In contrast, no such qualitative differences are found for lexical/semantic processing, which consistently yields N400s in both L1 and L2, irrespective of exposure and proficiency levels. Overall, this pattern is consistent with the qualitative differences predicted by the DP model and Paradis.
From hemodynamic studies, perhaps the clearest findings come from lexical/semantic processing, which does not seem to show any qualitative differences between L1 and L2. For grammatical processing the pattern is less clear, with both differences and similarities between L1 and L2 reported across studies. Within the grammatical domain, the clearest hemodynamic results seem to come from studies of artificial languages, for which the evidence appears to be largely consistent with that from ERPs: L1-like processing patterns at high but not low levels of exposure and proficiency, and declarative memory brain structures active only at low levels, as would be expected under the DP model and Paradis.
It is interesting to note that, to our knowledge, hemodynamic as well as ERP studies that employ longitudinal designs, in which between-subject variability is eliminated, consistently provide evidence of a qualitative neurocognitive shift in (morpho)syntactic processing with increasing exposure and proficiency (Morgan-Short, 2007; Morgan-Short et al., 2010; Morgan-Short et al., in press; Opitz and Friederici, 2003; Osterhout et al., 2008; Osterhout et al., 2006). Specifically, ERP studies with longitudinal designs report qualitative shifts of neural patterns with increasing L2 exposure and proficiency, either from one ERP component to others (e.g., from N400 to LAN/P600), or from an absence of components to their emergence (e.g., of a P600) (Morgan-Short et al., 2010; Morgan-Short et al., in press; Opitz and Friederici, 2003; Osterhout et al., 2006; Osterhout et al., 2008). Likewise, the one longitudinal fMRI study we are aware of reported a shift from declarative memory brain structures at low proficiency to BA 44 (implicated in L1 syntactic processing and procedural memory) at high proficiency. Thus longitudinal studies seem to support the DP model and Paradis.
In sum, although there appears to be a fair bit of variability in the findings of cross-sectional hemodynamic studies of L2 grammar, consistent findings can be found in various strands of L2 neurocognitive research: (a) in ERP studies, whether of natural or artificial language; (b) in hemodynamic as well as ERP studies of artificial language; (c) in longitudinal studies, again whether using ERP or hemodynamic methods; and (d) in hemodynamic studies of lexical/semantic processing. Overall, these consistencies support the qualitative neurocognitive L1/L2 differences and low/high exposure and proficiency shifts predicted by the DP model and Paradis.
Future research is necessary to understand why some strands of L2 neurocognitive research are more consistent than others. For example, it remains to be resolved why the ERP approach might yield much greater consistency across studies of grammatical processing than hemodynamic methods (e.g., perhaps the small number of ERP components leads to less variability than hemodynamic activation patterns, or the violation paradigm used in ERPs is more consistent than the task paradigms used in fMRI or PET). Moreover, some of the consistent strands of research are still somewhat confounded. In particular, almost all longitudinal L2 studies examine artificial language, so it remains unclear whether the consistency of such studies is due to their longitudinal design or to the fact that they examine artificial languages. Nevertheless, both artificial languages (see above) and longitudinal designs (de Bot, 2008; Green, 2003; Indefrey, 2006; Larsen-Freeman and Cameron, 2008; Ortega and Byrnes, 2008; Osterhout et al., 2006; Steinhauer et al., 2009) seem to offer clear advantages, and both seem worth pursuing.
It is not yet clear to what extent L2 neurocognitive research has direct implications for L2 instruction. However, as researchers continue to adopt a more explanatory rather than descriptive approach, we will gain a deeper understanding of the neurocognition of L2 processing and acquisition, which should lead to clearer implications for instruction. In other words, as we come to understand why there are similarities or differences between L1 and L2, we may better be able to draw conclusions about how instruction can best be structured to promote successful L2 acquisition.
Based on current research, however, we can already draw a few cautious conclusions about the instructional relevance of L2 neurocognitive research. First, the neurocognitive evidence seems promising for adult L2 learners. It is not the case that such learners are unable to achieve L1-like neural processing. Rather, increasing evidence suggests that they can do so, not only for lexical/semantic processing but also for at least some aspects of grammar. Of course, native-like neurocognition does not imply native-like proficiency (conversely, high L2 proficiency does not necessarily suggest a dependence on native-language neurocognitive mechanisms; Ullman, 2005). Nevertheless, given that L1 mechanisms are evidently extremely well-suited to language, it is quite plausible that native-like proficiency might be reliably attained only with native-language neurocognitive mechanisms. Thus the achievement of such mechanisms by L2 learners is good news.
Second, as we have seen, some research suggests that the type of training and instruction can influence the achievement of L1-like neurocognitive mechanisms. Specifically, learning under more implicit contexts, such as immersion, may be more effective at promoting L1-like processing at higher levels of proficiency than more explicit contexts such as traditional grammar-focused classroom settings. This suggestion is based on two observations: (a) The only study to examine the effects of explicit and implicit training on the neurocognition of L2 found that when processing syntactic elements of an artificial L2, implicitly trained learners evidenced both the automatic and controlled processes (i.e., LAN-P600) seen in native speakers whereas explicitly trained learners evidenced controlled (i.e., P600) but not automatic processes (Morgan-Short et al., in press); and (b) an examination of neurocognitive studies of natural L2 showing L1-like grammatical processing suggests that these effects are generally found in L2 learners who have been immersed in their L2 for non-trivial amounts of time (e.g., Gillon Dowens et al., 2009; Hahne et al., 2006; Rossi et al., 2006; Steinhauer et al., 2009). Although additional studies are clearly needed (e.g., in these natural language studies immersion is not teased apart from other factors such as motivation and amount of exposure), the evidence suggests that immersion may be an important element in attaining native-like neurocognition.
Neurocognitive L2 research has clearly contributed to our understanding of L2 acquisition and processing. The empirical research has begun to paint an increasingly detailed picture of the neurocognition of aspects of L2, and how it is affected by various factors, including age of acquisition, amount of exposure and proficiency, similarity of L1/L2 linguistic forms, and type of training. Moreover, neurocognitive theories of L2 and its relation to L1 are beginning to explain and, even more importantly, to predict the way in which L2 is acquired, represented, and processed in the brain and mind.
But many issues remain unresolved: some have been investigated by the field of SLA but not yet addressed by neurocognitive research, while others have been left entirely unexamined. We strongly believe that the future of L2 research should be bi-directional in the sense that the fields of SLA and cognitive neuroscience should inform each other. Neurocognitive research may be able to shed light on central questions in SLA that are difficult to assess with behavioral methods alone. Neurocognitive researchers, moreover, would be wise to design their studies in a manner informed by theoretical perspectives and empirical findings from SLA (de Bot, 2008).
Following are some issues and questions that would greatly benefit from future investigations, and at least in some cases from an influence from SLA research:
(1) Although most neurocognitive L2 research has examined lexical and grammatical processing, the topic of this chapter, other L2 domains are also important, such as prosody and phonology. Indeed, some research has begun to examine these areas (Chandrasekaran et al., 2010; Sanders and Neville, 2003; Sanders et al., 2002; Song et al., 2008; Wong et al., 2008; Wong et al., 2007), though much work remains to be done.
(2) Most neurocognitive L2 studies to date have examined language processing. Future studies should begin to investigate the acquisition and learning of L2 (e.g., neuroimaging during the acquisition of an artificial language).
(3) As discussed in the introduction to this chapter, research on the factors affecting L2 neurocognition has been limited to a small number of factors, in particular age of acquisition and proficiency. However, many other intrinsic (e.g., sex, genotype) and extrinsic (e.g., type of training) factors need to be investigated, some of which are specifically predicted by neurocognitive theory to play important roles (see above). Additionally, the factors of proficiency and exposure, which have been confounded by most previous research, need to be teased apart.
In sum, the explosion of research in recent years on the neurocognition of L2 has begun to elucidate the neural and cognitive bases of L2 acquisition, representation and processing. These empirical and theoretical advances not only complement SLA research, but also crucially help to answer questions that have been difficult to address within the field of SLA as it has been traditionally defined. In the future, SLA and cognitive neuroscience will increasingly interact, and indeed a new area of research tightly integrating the two is likely to emerge.
Abutalebi, J. (2008). Neural aspects of second language representation and language control. Acta Psychologica, 128(3), 466–478.
Abutalebi, J., and Green, D. (2007). Bilingual language production: The neurocognition of language representation and control. Journal of Neurolinguistics, 20, 242–275.
Bachevalier, J. (2001). Neural bases of memory development: Insights from neuropsychological studies in primates. In C. A. Nelson and M. Luciana (Eds.), Handbook of developmental cognitive neuroscience (pp. 365–379). Cambridge: The MIT Press.
Chandrasekaran, B., Sampath, P. D., and Wong, P. C. M. (2010). Individual variability in cue-weighting and lexical tone learning. Journal of the Acoustical Society of America, 128(1), 456–465.
Chee, M. W., Tan, E. W., and Thiel, T. (1999). Mandarin and English single word processing studied with functional magnetic resonance imaging. Journal of Neuroscience, 19(8), 3050–3056.
Chen, L., Shu, H., Liu, Y., Zhao, J., and Li, P. (2007). ERP signatures of subject-verb agreement in L2 learning. Bilingualism: Language and Cognition, 10(2), 161–174.
Chomsky, N. (1995). The minimalist program. Cambridge, MA: The MIT Press.
Chun, M. M. (2000). Contextual cueing of visual attention. Trends in Cognitive Sciences, 4(5), 170–178.
Clahsen, H., and Felser, C. (2006a). Continuity and shallow structures in language processing: A reply to our commentators. Applied Psycholinguistics, 27(1), 107–126.
Clahsen, H., and Felser, C. (2006b). Grammatical processing in language learners. Applied Psycholinguistics, 27(1), 3–42.
Clahsen, H., and Felser, C. (2006c). How native-like is non-native language processing? Trends in Cognitive Sciences, 10(12), 564–570.
Dapretto, M., and Bookheimer, S. Y. (1999). Form and content: Dissociating syntax and semantics in sentence comprehension. Neuron, 24(2), 427–432.
de Bot, K. (2008). The imaging of what in the multilingual mind? Second Language Research, 24(1), 111–133.
Dehaene, S., Dupoux, E., Mehler, J., Cohen, L., Paulesu, E., and Perani, D. (1997). Anatomical variability in the cortical representation of first and second language. NeuroReport, 8(17), 3809–3815.
DiGiulio, D. V., Seidenberg, M., O'Leary, D. S., and Raz, N. (1994). Procedural and declarative memory: A developmental study. Brain and Cognition, 25(1), 79–91.
Dorfberger, S., Adi-Japha, E., and Karni, A. (2007). Reduced susceptibility to interference in the consolidation of motor memory before adolescence. PLoS ONE, 2(2), 1–6.
Eichenbaum, H., and Lipton, P. A. (2008). Towards a functional organization of the medial temporal lobe memory system: Role of the parahippocampal and medial entorhinal cortical areas. Hippocampus, 18(12), 1314–1324.
Ellis, N. C. (2002a). Frequency effects in language processing: A review with implications for theories of implicit and explicit language acquisition. Studies in Second Language Acquisition, 24, 143–188.
Ellis, N. C. (2002b). Reflections on frequency effects in language processing. Studies in Second Language Acquisition, 24, 297–339.
Ellis, N. C. (2006). Cognitive perspectives on SLA: The Associative-Cognitive CREED. AILA Review, 19(1), 100–121.
Friederici, A. D., and Kotz, S. A. (2003). The brain basis of syntactic processes: Functional imaging and lesion studies. NeuroImage, 20 (Supplement 1), S8–S17.
Friederici, A. D., Pfeifer, E., and Hahne, A. (1993). Event-related brain potentials during natural speech processing: Effects of semantic, morphological and syntactic violations. Cognitive Brain Research, 1(3), 183–192.
Friederici, A. D., Ruschemeyer, S. A., Hahne, A., and Fiebach, C. J. (2003). The role of left inferior frontal and superior temporal cortex in sentence comprehension: Localizing syntactic and semantic processes. Cerebral Cortex, 13(2), 170–177.
Friederici, A. D., Steinhauer, K., and Frisch, S. (1999). Lexical integration: Sequential effects of syntactic and semantic information. Memory and Cognition, 27(3), 438–453.
Friederici, A. D., Steinhauer, K., and Pfeifer, E. (2002). Brain signatures of artificial language processing: Evidence challenging the critical period hypothesis. Proceedings of the National Academy of Sciences, 99(1), 529–534.
Gillon Dowens, M., Vergara, M., Barber, H., and Carreiras, M. (2009). Morphosyntactic processing in late second-language learners. Journal of Cognitive Neuroscience, 22(8), 1870–1887.
Golestani, N., Alario, F. X., Meriaux, S., Le Bihan, D., Dehaene, S., and Pallier, C. (2006). Syntax production in bilinguals. Neuropsychologia, 44, 1029–1040.
Graf, P. (1990). Life-span changes in implicit and explicit memory. Bulletin of the Psychonomic Society, 28(4), 353–358.
Green, D. W. (2003). Neural basis of lexicon and grammar in L2 acquisition: The convergence hypothesis. In R. van Hout, A. Hulk, F. Kuiken, and R. Towell (Eds.), The lexicon-syntax interface in second language acquisition (pp. 197–218). Amsterdam: John Benjamins.
Green, D. W., Crinion, J., and Price, C. J. (2006). Convergence, degeneracy, and control. Language Learning, 56 (Supplement 1), 99–125.
Hagoort, P. and Brown, C. M. (1999). Gender electrified: ERP evidence on the syntactic nature of gender processing. Journal of Psycholinguistic Research, 28(6), 715–728.
Hahne, A. and Friederici, A. D. (1999). Electrophysiological evidence for two steps in syntactic analysis: Early automatic and late controlled processes. Journal of Cognitive Neuroscience, 11(2), 194–205.
Hahne, A. and Friederici, A. D. (2001). Processing a second language: Late learners’ comprehension mechanisms as revealed by event-related brain potentials. Bilingualism: Language and Cognition, 4, 123–141.
Hahne, A., Mueller, J. L., and Clahsen, H. (2006). Morphological processing in a second language: Behavioral and event-related brain potential evidence for storage and decomposition. Journal of Cognitive Neuroscience, 18(1), 121–134.
Hasegawa, M., Carpenter, P. A., and Just, M. A. (2002). An fMRI study of bilingual sentence comprehension and workload. NeuroImage, 15, 647–660.
Hasson, U., and Small, S. L. (2008). Functional magnetic resonance imaging (fMRI) research of language. In B. Stemmer and H. Whitaker (Eds.), Handbook of the neuroscience of language (pp. 81–89). Elsevier.
Henke, K. (2010). A model for memory systems based on processing modes rather than consciousness. Nature Reviews Neuroscience, 11, 523–532.
Hernandez, A., Li, P., and MacWhinney, B. (2005). The emergence of competing modules in bilingualism. Trends in Cognitive Sciences, 9(5), 220–225.
Hernandez, A. E., and Li, P. (2007). Age of acquisition: Its neural and computational mechanisms. Psychological Bulletin, 133(4), 638–650.
Illes, J., Francis, W. S., Desmond, J. E., Gabrieli, J. D., Glover, G. H., and Poldrack, R. (1999). Convergent cortical representation of semantic processing in bilinguals. Brain and Language, 70(3), 347–363.
Indefrey, P. (2006). A meta-analysis of hemodynamic studies on first and second language processing: Which suggested differences can we trust and what do they mean? Language Learning, 56 (Supplement 1), 279–304.
Isel, F. (2007). Syntactic and referential processes in second-language learners: Event-related brain potential evidence. NeuroReport, 18(18), 1885–1889.
Kaan, E. (2007). Event-related potentials and language processing: A brief overview. Language and Linguistics Compass, 1(6), 571–591.
Kaan, E., Harris, A., Gibson, E., and Holcomb, P. (2000). The P600 as an index of syntactic integration difficulty. Language and Cognitive Processes, 15(2), 159–201.
Klein, D., Milner, B., Zatorre, R. J., Zhao, V., and Nikelski, J. (1999). Cerebral organization in bilinguals: A PET study of Chinese-English verb generation. NeuroReport, 10(13), 2841–2846.
Kotz, S. A. (2009). A critical review of ERP and fMRI evidence on L2 syntactic processing. Brain and Language, 109, 68–74.
Kuperberg, G. R., McGuire, P. K., Bullmore, E. T., Brammer, M. J., Rabe-Hesketh, S., and Wright, I. C. (2000). Common and distinct neural substrates for pragmatic, semantic, and syntactic processing of spoken sentences: An fMRI study. Journal of Cognitive Neuroscience, 12, 321–341.
Kutas, M. and Hillyard, S. A. (1980). Reading between the lines: Event-related brain potentials during natural sentence processing. Brain and Language, 11, 354–373.
Larsen-Freeman, D. and Cameron, L. (2008). Complex systems and applied linguistics. Oxford: Oxford University Press.
Lau, E. F., Phillips, C., and Poeppel, D. (2008). A cortical network for semantics: (de)constructing the N400. Nature Reviews Neuroscience, 9, 920–933.
MacWhinney, B. (2005). A unified model of language acquisition. In J. F. Kroll and A. M. B. de Groot (Eds.), Handbook of bilingualism: Psycholinguistic approaches (pp. 49–67). Oxford: Oxford University Press.
MacWhinney, B. (2007). A Unified Model. In N. Ellis and P. Robinson (Eds.), Handbook of cognitive linguistics and second language acquisition. Mahwah, NJ: Lawrence Erlbaum Press.
MacWhinney, B. and Bates, E. (Eds.) (1989). The cross-linguistic study of sentence processing. New York: Cambridge University Press.
Martin-Loeches, M., Munoz, F., Casado, P., Melcon, A., and Fernandez-Frias, C. (2005). Are the anterior negativities to grammatical violations indexing working memory? Psychophysiology, 42, 508–519.
McCarthy, G., Nobre, A. C., Bentin, S., and Spencer, D. D. (1995). Language-related field potentials in the anterior-medial temporal lobe: I. Intracranial distribution and neural generators. Journal of Neuroscience, 15(2), 1080–1089.
McLaughlin, J., Osterhout, L., and Kim, A. (2004). Neural correlates of second-language word learning: Minimal instruction produces rapid change. Nature Neuroscience, 7(7), 703–704.
Mishkin, M., Malamut, B., and Bachevalier, J. (1984). Memories and habits: Two neural systems. In G. Lynch, J. L. McGaugh, and N. W. Weinberger (Eds.), Neurobiology of learning and memory (pp. 65–77). New York: Guilford Press.
Morgan-Short, K. (2007). A neurolinguistic investigation of late-learned second language knowledge: the effects of explicit and implicit conditions. Dissertation. Georgetown University, Washington, D.C.
Morgan-Short, K., Sanz, C., Steinhauer, K., and Ullman, M. T. (2010). Second language acquisition of gender agreement in explicit and implicit training conditions: An event-related potential study. Language Learning, 60(1), 154–193.
Morgan-Short, K., Steinhauer, K., Sanz, C., and Ullman, M. T. (in press). Implicit but not explicit second language training leads to native-language brain patterns. Journal of Cognitive Neuroscience.
Mueller, J. L. (2005). Electrophysiological correlates of second language processing. Second Language Research, 21(2), 152–174.
Musso, M., Moro, A., Glauche, V., Rijntjes, M., Reichenbach, J., and Buchel, C. (2003). Broca's area and the language instinct. Nature Neuroscience, 6(7), 774–781.
Nakai, T., Matsuo, K., Kato, C., Matsuzawa, M., Okada, T., and Glover, G. H. (1999). A functional magnetic resonance imaging study of listening comprehension of languages in human at 3 tesla-comprehension level and activation of the language areas. Neuroscience Letters, 263(1), 33–36.
Neville, H. J., Nicol, J. L., Barss, A., Forster, K. I., and Garrett, M. F. (1991). Syntactically based sentence processing classes: Evidence from event-related brain potentials. Journal of Cognitive Neuroscience, 3(2), 151–165.
Newman, A. J., Pancheva, R., Ozawa, K., Neville, H. J., and Ullman, M. T. (2001). An event-related fMRI study of syntactic and semantic violations. Journal of Psycholinguistic Research, 30(3), 339–364.
Ojima, S., Nakata, H., and Kakigi, R. (2005). An ERP study on second language learning after childhood: Effects of proficiency. Journal of Cognitive Neuroscience, 17(8), 1212–1228.
Opitz, B., and Friederici, A. D. (2003). Interactions of the hippocampal system and the prefrontal cortex in learning language-like rules. NeuroImage, 19(4), 1730–1737.
Ortega, L., and Byrnes, H. (Eds.) (2008). The longitudinal study of advanced L2 capacities. New York: Routledge.
Osterhout, L., Bersick, M., and McLaughlin, J. (1997). Brain potentials reflect violations of gender stereotypes. Memory and Cognition, 25(3), 273–285.
Osterhout, L., and Holcomb, P. J. (1992). Event-related brain potentials elicited by syntactic anomaly. Journal of Memory and Language, 31, 785–806.
Osterhout, L., McLaughlin, J., Pitkänen, I., Frenck-Mestre, C., and Molinaro, N. (2006). Novice learners, longitudinal designs, and event-related potentials: A means for exploring the neurocognition of second language processing. Language Learning, 56 (Supplement 1), 199–230.
Osterhout, L., Poliakov, A., Inoue, K., McLaughlin, J., Valentine, G., Pitkänen, I., Frenck-Mestre, C., and Hirschensohn, J. (2008). Second-language learning and changes in the brain. Journal of Neurolinguistics, 21(6), 509–521.
Paradis, M. (1994). Neurolinguistic aspects of implicit and explicit memory: Implications for bilingualism and SLA. In N. C. Ellis (Ed.), Implicit and explicit learning of languages (pp. 393–419). London, UK: Academic Press.
Paradis, M. (2004). A neurolinguistic theory of bilingualism. Amsterdam, Netherlands: John Benjamins.
Paradis, M. (2009). Declarative and procedural determinants of second languages (Vol. 40). John Benjamins Publishing Company.
Perani, D., Abutalebi, J., Paulesu, E., Brambati, S., Scifo, P., and Cappa, S. F. (2003). The role of age of acquisition and language usage in early, high-proficient bilinguals: An fMRI study during verbal fluency. Human Brain Mapping, 19(3), 170–182.
Perani, D., Dehaene, S., Grassi, F., Cohen, L., Cappa, S. F., and Dupoux, E. (1996). Brain processing of native and foreign languages. NeuroReport, 7(15–17), 2439–2444.
Perani, D., Paulesu, E., Galles, N. S., Dupoux, E., Dehaene, S., and Bettinardi, V. (1998). The bilingual brain. Proficiency and age of acquisition of the second language. Brain, 121(10), 1841–1852.
Pillai, J. J., Allison, J. D., Sethuraman, S., Araque, J. M., Thiruvaiyaru, D., and Ison, C. B. (2004). Functional MR imaging study of language-related differences in bilingual cerebellar activation. American Journal of Neuroradiology, 25(4), 523–532.
Poldrack, R. A., and Foerde, K. (2008). Category learning and the memory systems debate. Neuroscience and Biobehavioral Reviews, 32, 197–205.
Poldrack, R. A., and Packard, M. G. (2003). Competition among multiple memory systems: Converging evidence from animal and human brain studies. Neuropsychologia, 41(3), 245–251.
Poldrack, R. A., Wagner, A. D., Prull, M. W., Desmond, J. E., Glover, G. H., and Gabrieli, J. D. (1999). Functional specialization for semantic and phonological processing in the left inferior prefrontal cortex. NeuroImage, 10(1), 15–35.
Rossi, S., Gugler, M. F., Friederici, A. D., and Hahne, A. (2006). The impact of proficiency on syntactic second-language processing of German and Italian: Evidence from event-related potentials. Journal of Cognitive Neuroscience, 18(12), 2030–2048.
Rüschemeyer, S. A., Fiebach, C. J., Kempe, V., and Friederici, A. D. (2005). Processing lexical semantic and syntactic information in first and second language: fMRI evidence from German and Russian. Human Brain Mapping, 25(2), 266–286.
Sanders, L. D., Neville, H., and Woldorff, M. G. (2002). Speech segmentation by native and non-native speakers: The use of lexical, syntactic, and stress-pattern cues. Journal of Speech, Language, and Hearing Research, 45(3), 519–530.
Sanders, L. D., and Neville, H. J. (2003). An ERP study of continuous speech processing. II. Segmentation, semantics, and syntax in non-native speakers. Cognitive Brain Research, 15(3), 214–227.
Schmidt, G. L., and Roberts, T. P. L. (2009). Second language research using magnetoencephalography: A review. Second Language Research, 25(1), 135–166.
Siegel, D. J. (2001). Memory: An overview, with emphasis on developmental, interpersonal, and neurobiological aspects. Journal of the American Academy of Child and Adolescent Psychiatry, 40(9), 997–1011.
Simos, P. G., Basile, L. F. H., and Papanicolaou, A. C. (1997). Source localization of the N400 response in a sentence-reading paradigm using evoked magnetic fields and magnetic resonance imaging. Brain Research, 762(1–2), 29–39.
Song, J. H., Skoe, E., Wong, P. C. M., and Kraus, N. (2008). Plasticity in the adult human auditory brainstem following short-term linguistic training. Journal of Cognitive Neuroscience, 20(10), 1892–1902.
Squire, L. R., and Schacter, D. L. (2002). Neuropsychology of memory (3rd ed.). New York: The Guilford Press.
Steinhauer, K., and Connolly, J. F. (2008). Event-related potentials in the study of language. In B. Stemmer and H. A. Whitaker (Eds.), Handbook of the neuroscience of language. Oxford, UK: Elsevier.
Steinhauer, K., White, E. J., and Drury, J. E. (2009). Temporal dynamics of late second language acquisition: Evidence from event-related brain potentials. Second Language Research, 25(1), 13–41.
Stowe, L. A., and Sabourin, L. (2005). Imaging the processing of a second language: Effects of maturation and proficiency on the neural processes involved. International Review of Applied Linguistics in Language Teaching, 43(4), 329–353.
Tettamanti, M., Alkadhi, H., Moro, A., Perani, D., Kollias, S., and Weniger, D. (2002). Neural correlates for the acquisition of natural language syntax. NeuroImage, 17(2), 700–709.
Tokowicz, N., and MacWhinney, B. (2005). Implicit and explicit measures of sensitivity to violations in second language grammar: An event-related potential investigation. Studies in Second Language Acquisition, 27(2), 173–204.
Ullman, M. T. (2001a). The neural basis of lexicon and grammar in first and second language: The declarative/procedural model. Bilingualism: Language and Cognition, 4(1), 105–122.
Ullman, M. T. (2001b). A neurocognitive perspective on language: The declarative/procedural model. Nature Reviews Neuroscience, 2, 717–726.
Ullman, M. T. (2004). Contributions of memory circuits to language: The declarative/procedural model. Cognition, 92(1–2), 231–270.
Ullman, M. T. (2005). A cognitive neuroscience perspective on second language acquisition: The declarative/procedural model. In C. Sanz (Ed.), Mind and context in adult second language acquisition: Methods, theory and practice (pp. 141–178). Washington, D.C.: Georgetown University Press.
Ullman, M. T. (2006). The declarative/procedural model and the shallow-structure hypothesis. Applied Psycholinguistics, 27(1), 97–105.
Ullman, M. T. (2007). The biocognition of the mental lexicon. In M. G. Gaskell (Ed.), The Oxford handbook of psycholinguistics (pp. 267–286). Oxford, UK: Oxford University Press.
Ullman, M. T. (2008). The role of memory systems in disorders of language. In B. Stemmer and H. A. Whitaker (Eds.), Handbook of the neuroscience of language (pp. 189–198). Oxford, UK: Elsevier.
Ullman, M. T., Miranda, R. A., and Travers, M. L. (2008). Sex differences in the neurocognition of language. In J. B. Becker, K. J. Berkley, N. Geary, E. Hampson, J. Herman, and E. Young (Eds.), Sex on the brain: From genes to behavior (pp. 291–309). New York, NY: Oxford University Press.
Vaidya, C. J., Huger, M., Howard, D. V., and Howard, J. H., Jr. (2007). Developmental differences in implicit learning of spatial context. Neuropsychology, 21(4), 497–506.
van Hell, J. G., and Tokowicz, N. (2010). Event-related brain potentials and second language learning: Syntactic processing in late L2 learners at different L2 proficiency levels. Second Language Research, 26(1), 43–74.
Wartenburger, I., Heekeren, H. R., Abutalebi, J., Cappa, S. F., Villringer, A., and Perani, D. (2003). Early setting of grammatical processing in the bilingual brain. Neuron, 37, 159–170.
Weber-Fox, C. M., and Neville, H. J. (1996). Maturational constraints on functional specializations for language processing: ERP and behavioral evidence in bilingual speakers. Journal of Cognitive Neuroscience, 8(3), 231–256.
Wong, P. C. M., Perrachione, T. K., and Parrish, T. B. (2007). Neural characteristics of successful and less successful speech and word learning in adults. Human Brain Mapping, 28, 995–1006.
Wong, P. C. M., Warrier, C. M., Penhune, V. B., Roy, A. K., Sadehh, A., and Parrish, T. B. (2008). Volume of left Heschl's gyrus and linguistic pitch learning. Cerebral Cortex, 18, 828–836.
Xue, G., Dong, Q., Jin, Z., Zhang, L., and Wang, Y. (2004). An fMRI study with semantic access in low proficiency second language learners. NeuroReport, 15(5), 791–796.