32
Using corpora for writing instruction

Lynne Flowerdew

1. Introduction: an overview

In a 1997 article examining the link between language corpora and teaching, Leech remarks on the ‘trickle-down’ effect from corpus research to teaching, which only really took off in the early 1990s. The main instigator of the exploitation of corpora in language teaching, specifically on the application to writing, was Johns, whose seminal work in the field of data-driven learning (DDL) is reported in Johns (1984, 1986). Another landmark publication in the field is the volume by Tribble and Jones (1990), which also acted as a catalyst for the pedagogic application of corpora. Since then, as Leech (1997: 2) humorously observes: ‘The original “trickle down” from research to teaching is now becoming a torrent’! The purpose of this chapter is to review the veritable cascade of corpus applications to various aspects of writing over the past couple of decades. But first a brief overview of the utility of corpora for enhancing different features of writing is in order.

The multiple lines of concordance output can reveal grammatical features, such as the different use of tenses with for and since in time expressions. Likewise, concordance data can shed light on vocabulary items, e.g. the most common senses of a word. However, aside from individual grammatical or lexical features, the main utility of corpora in writing lies in their unique capacity to show what can loosely be termed phraseological patterning, involving collocations, colligations and semantic preferences and prosodies. Biber et al. (1999: 990) have underscored the importance of lexical bundles, ‘sequences of words that commonly go together in natural discourse’, in academic writing. Corpora are ideally suited to helping learners master such patterning in writing classes, as these phraseological features tend not to be easily accessible in either dictionaries or grammars.
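By way of illustration, the two notions mentioned above, concordance output and lexical bundles, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure of any study cited here: the function names (tokenize, kwic, lexical_bundles), the sample sentences and the cut-off values are all hypothetical, and bundle studies in the literature typically apply further criteria (for example, distribution across several texts) that are omitted.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def kwic(tokens, keyword, width=4):
    """Return key-word-in-context lines: up to `width` tokens either side of each hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left:>30}  [{tok}]  {right}")
    return lines

def lexical_bundles(tokens, n=4, min_freq=2):
    """Count recurrent n-word sequences; keep those seen at least `min_freq` times.

    Simplification (assumption): frequency is the only criterion applied here.
    """
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return [(" ".join(g), c) for g, c in grams.most_common() if c >= min_freq]

# Invented sample sentences for the for/since contrast in time expressions.
tense_sample = (
    "I have lived here for ten years. She has worked here since 2005. "
    "We have waited for two hours. They have known him since last year."
)
tense_tokens = tokenize(tense_sample)
for keyword in ("for", "since"):
    for line in kwic(tense_tokens, keyword):
        print(line)

# Invented sample text containing a recurrent four-word sequence.
bundle_sample = (
    "On the other hand, the data are limited. "
    "On the other hand, the method is simple."
)
for bundle, freq in lexical_bundles(tokenize(bundle_sample)):
    print(freq, bundle)
```

Aligning the search word in a central column, as kwic does, is what allows patterns to the left and right of the word to be scanned vertically by the learner.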

This focus on the usage patterns of words, and variations thereof, has been accompanied by greater attention to their occurrence in specific genres. This has led to more initiatives to build small specialised corpora; see Aston (1997, 2001) for a discussion on the advantages of using small corpora, and Flowerdew (2004) for an overview of such corpora. Moreover, Hyland (2006) has noted that such corpora are ideally suited for genre-based writing instruction as they reveal the prototypical and frequently occurring phraseologies of specific genres. Hyland (2003) has also made the point that corpus-oriented genre-based instruction should not only aim to increase students’ explicit awareness of the linguistic conventions of specialist texts, but also elucidate the interrelationship between rhetorical purpose and lexico-grammatical choices.

The following sections review the increasingly important role that corpora are now playing in writing instruction, especially corpora of a specialist nature. Corpora have principally been used in two main ways to inform writing instruction, either through a corpus-based approach where worksheet materials (e.g. gap-filling exercises) are derived from concordance output, or through a corpus-driven approach, commonly referred to as data-driven learning (DDL), which requires the student to interact directly with the corpus. It should be pointed out, though, that in reality many writing instruction programmes utilise a combination of these two approaches, although, as will be seen, the corpus-driven approach is far more prevalent. Moreover, corpora have been exploited at different stages of the writing process from initial drafting through to the final proofreading and editing stages.

In the following sections corpus-based and corpus-driven approaches to informing and designing writing materials are discussed in relation to English for General Academic Purposes (EGAP) and English for Specific Academic Purposes (ESAP) instruction (see Flowerdew 2002 for specific examples of such corpora).

2. Exploiting corpora for EGAP writing instruction

The corpus-based approach in EGAP

One advantage of having students work with worksheet output of concordance data is that it is a valuable means of providing them with ‘corpus competence’, thereby gently familiarising them with corpus methodologies such as the inductive approach, interpretation of frequency data, etc. (Boulton 2010). Another advantage is that it allows teachers to sift through what may be a vast number of concordance lines to reduce and select data on the basis of utility value. These main advantages are inherent in the corpus-based materials for EGAP writing instruction discussed below.

It is surprising that since the publication of Thurstun and Candlin’s (1998a) textbook Exploring Academic English: A Workbook for Student Essay Writing, based on the one-million-word MicroConcord Corpus of Academic Texts, there do not seem to have been any other similar initiatives, quite possibly owing to the fact that producing such corpus-based writing activities is a time-consuming task. In this workbook the lexico-grammar is introduced according to its specific rhetorical function, e.g. referring to the literature, reporting the research of others. Within each broad function, each key word (e.g. claim, identify) is examined within the following chain of activities:

• LOOK at concordances for the key term and words surrounding it, thinking of meaning.

• FAMILIARISE yourself with the patterns of language surrounding the key term by referring to the concordances as you complete the tasks.

• PRACTISE key terms without referring to the concordances.

• CREATE your own piece of writing using the terms studied to fulfil a particular function of academic writing.

(Thurstun and Candlin 1998b: 272)

Such a suite of activities based on a clear progression from controlled to more open-ended writing activities would seem to be inculcating in students the kind of ‘corpus competence’ after Kreyer (2008).

The corpus-driven approach in EGAP

In contrast to the dearth of corpus-based instructional writing material for EGAP, there are more reports in the literature on the use of corpora in data-driven learning. Two of the pedagogic applications, in common with those reported in Thurstun and Candlin (1998a), concern the use of citations in EAP. Thompson and Tribble (2001) devised DDL class activities for postgraduate students in which they conducted their own analyses of citation practices in small corpora, to develop genre awareness. A key feature of these activities is that they are based on investigations comparing the use of integral and non-integral citations in both doctoral theses and novice student writing (non-integral citations are those that are placed outside the sentence, usually in brackets, and integral citations are those that play an explicit grammatical role within the sentence). For the corpus exercises Thompson and Tribble use a carefully graded procedure, moving from teacher input on a range of citation forms to applications of citations to students’ own writing, as outlined below:

Stage 1:

Learners are introduced to a range of citation forms appropriate to their level of study.

Stage 2:

Learners investigate actual practice in relevant texts, reporting back on the range, form and purpose of the citations they identify.

Stage 3:

Learners investigate the practices of their peers in writing assignments.

Stage 4:

Learners review their own writing and revise in the light of these investigations.

(Thompson and Tribble 2001: 101)

Thompson and Tribble used Lee’s (2001) BNC index to make a micro-corpus of twenty-two extracts from one academic journal – Language and Literature – with the assumption that the texts would be of relevance for postgraduate humanities students. The extract in Table 32.1 is from a worksheet for Stage 2, in which students were required to identify different types of citation practices in actual text.
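The distinction between integral and non-integral citations that learners are asked to identify at Stage 2 can be approximated mechanically, which may help a teacher pre-sort examples for a worksheet. The following is a rough heuristic sketch, not Thompson and Tribble's procedure: it assumes author-date referencing, treats 'Name (Year)' as integral and '(Name Year)' as non-integral, and ignores brackets containing several sources. The pattern names and the sample text are hypothetical.

```python
import re

# A surname, optionally followed by "and Surname" or "et al." (assumed citation style).
NAME = r"[A-Z][\w'’-]+(?: and [A-Z][\w'’-]+| et al\.)?"
# A four-digit year, optional letter suffix, optional page reference such as ": 101".
YEAR = r"\d{4}[a-z]?(?:: ?\d+)?"

# Integral: the name is part of the sentence and only the year is bracketed.
INTEGRAL = re.compile(NAME + r" \(" + YEAR + r"\)")
# Non-integral: name and year both sit inside the brackets.
NON_INTEGRAL = re.compile(r"\(" + NAME + r",? " + YEAR + r"\)")

def count_citations(text):
    """Count citations of each type found by the two heuristic patterns."""
    return {
        "integral": len(INTEGRAL.findall(text)),
        "non_integral": len(NON_INTEGRAL.findall(text)),
    }

# Invented sample text for illustration only.
sample = (
    "Swales (1990) argues that genres evolve. "
    "Genres evolve over time (Hyland 2003). "
    "Bhatia et al. (2004) agree."
)
print(count_citations(sample))
```

A heuristic of this kind captures surface form only; judging the purpose of a citation, which the stages above also require, remains a matter for the learner's interpretation.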

While Thompson and Tribble’s postgraduate students may have been able to cope with the level of text difficulty and the metalanguage required for identifying the different types of citations, such sub-corpora would not lend themselves easily to exploitation by lower-level students.

In addition to the choice of corpus to promote writing instruction, a related issue to consider is the choice of concordancing software. These are the main concerns of Bloch (2008, 2009) in his corpus-driven activities for teaching the use of reporting verbs in academic writing. Bloch makes the important point that it is necessary to control for the types of language and text that the teacher wants to focus on. To this end, he assembled two types of corpora: an ‘analogue’ and an ‘exemplar’ corpus. The ‘analogue’ corpus consisted of texts similar to the writing task of a critical review, since no articles directly related to the writing assignment were available. The ‘exemplar’ corpus comprised research reports directly related to the writing task at hand (see Tribble 2002 for more details on the application of these types of corpora). He also designed a program with a user-friendly interface that presented users with only a limited number of hits for each query and a limited number of criteria for querying the database, namely, integral/non-integral; indicative/informative; writer/author; attitude towards claim; strength of claim.

While the corpus materials designed by Thurstun and Candlin (1998a), Thompson and Tribble (2001) and Bloch (2008) focus on individual rhetorical functions, Charles’ (2007: 296) EAP materials target the combinatorial function of defending your work against criticism, a two-part pattern: ‘anticipated criticism → defence and its realisation using signals of apparent concession, contrast and justification’. Another feature of Charles’ materials is that she approaches these functions by first using a top-down approach, providing students with a suite of worksheet activities to sensitise them to the extended discourse properties of this rhetorical function. She then supplements these with a more bottom-up level approach by having students search the corpus to identify typical lexico-grammatical patterns realising these functions, as exemplified in Table 32.2.

In an interesting departure from the usual ‘accommodationist’ perspective on EAP writing in which students are often encouraged to adhere uncritically to conventionalised forms of expression, Starfield (2004: 139) uses concordancing as ‘a strategic engagement with technology’ as a means of ‘further exploration of issues of power and identity in academic writing’. Starfield devised both worksheet and online concordancing activities as a consciousness-raising activity to foster awareness as to how writers position themselves with regard to the research of others with a view to creating a niche for their own work, and how they structure their own argument at the level of textual metadiscourse. She reported that students experienced a sense of empowerment when they realised they had readily available access to the language resources of authoritative English to expropriate these for their own means. A similar phenomenon has been noted by Lee and Swales (2006) in their students’ reaction to the use of corpora as they realised they did not always need access to native speakers to check up on certain language issues.

The following section discusses how corpora of an ESAP nature have been utilised in informing and preparing writing materials concentrating on specialised genres.

3. Exploiting corpora for ESAP writing instruction

The corpus-based approach in ESAP

It has been pointed out in the previous section that there is a paucity of materials for corpus-based writing instruction in EGAP, and the same is true for corpus-based writing materials in ESAP. Reasons for this, as exemplified below, may well be that such materials require students to compare either non-technical with sub-technical language or student with professional writing. Some intervention and manipulation of the concordance output by the tutor would seem to be necessary in such cases to avoid overwhelming the student with too numerous or irrelevant examples.

In the realm of engineering, sub-technical vocabulary (i.e. those items such as current, solution, tension which have one sense in general English, but are used in a different sense in technical English) has been found to be problematic for students (Mudraya 2006). In her corpus-based materials, on the basis of findings from a two-million-word Student Engineering English Corpus made up of engineering textbooks, Mudraya proposes a set of queries based around solution on the grounds that this word occurs, in its general sense, both as a high-frequency word family and as a frequent sub-technical item. Students are presented with concordance output of carefully selected examples of solution and in one exercise are asked to identify, for example, the following: those adjectives used with solution (1) in the general sense and (2) in the technical (chemical) sense, and then asked to underline those adjectives that can be used with both senses of solution. This type of phraseologically oriented task would serve to highlight collocational sensitivities and may also uncover examples of ‘universal’ and ‘local’ semantic prosodies. Tribble (2000) argues that such features play an important role in the teaching of written genres, proposing that there may be a ‘universal’ semantic prosody for a word in relation to general English, but a ‘local’ semantic prosody in a specific context or genre.

Other corpus-based materials are those by Hewings and Hewings (2002) and Hewings (2002) on the use of metadiscoursal anticipatory it in business writing. An interesting feature of these materials is that they incorporate the findings of both expert texts, i.e. published journal articles from the field of Business Studies, and the findings of student writing, i.e. MBA dissertations written by non-native speakers. Students are asked to compare and discuss the differences of it seems … in concordance lines selected from the two corpora, as shown in Table 32.3.

Other examples of business-related, corpus-based writing materials are those by Nelson based on his research of a Business English Corpus of around one million words, made up of both written (56 per cent) and spoken (44 per cent) genres (Nelson 2006). For the written genres of business contracts and minutes of business meetings, Nelson has devised gap-filling and matching function with phrase tasks, targeting key lexis (a web search of business English lexis will bring you to the homepage).

The corpus-driven approach in ESAP

In common with corpus-driven instruction for EGAP, that for ESAP very often combines initial pen-and-paper awareness-raising activities with follow-up direct consultation of the corpus by students. One key methodological feature of these activities is that the majority approach the corpus consultation from a genre-based perspective (Weber 2001; Bhatia et al. 2004; Noguchi 2004). See Handford, this volume, for the role of corpus linguistics in analysing specialist genres.

Bhatia et al. (2004) propose various move-specific concordancing activities for one genre of legal English, the problem-question genre written by students within academic settings. They note that deductive reasoning plays a major role in this highly specialised genre. One of their major foci, therefore, is to have students examine various types of non-lexical epistemic and pragmatic/discoursal hedges for the role they play in deductive reasoning. This activity thus exemplifies a type of ‘local grammar’, which ‘attempts to describe the resources for only one set of meanings in a language rather than for the language as a whole’ (Hunston 2002: 90). Bhatia et al. also propose a task-based activity comprising three steps (Awareness; Contextualising; Application) for familiarising students with both the form and function all the various hedges take in different parts of the legal problem answer. In Stage 3: Application, Bhatia et al. (2004: 224) suggest having students writing alternative arguments and outcomes.

Another advocate of a concordance- and genre-based approach to academic essay writing in the legal field, specifically formal legal essays written by undergraduates, is Weber (2001). First, Weber’s students were inducted into the genre of legal essays by reading through whole essays taken from the University of London LLB Examinations written by native speakers, and identifying some of the prototypical rhetorical features, e.g. identifying and/or delimiting the legal principle involved in the case. They were then asked to identify any lexical expressions which seemed to correlate with the genre features. This was followed up by consulting the corpus of the legal essays to verify and pinpoint regularities in lexico-grammatical expressions. Swales (2002) has contrasted the ‘fragmented’ world of corpus linguistics with its tendency to adopt a somewhat bottom-up, atomistic approach to text with the more ‘integrated’ world of ESP material design with its focus on top-down analysis of macro-level features. Weber’s tasks, and those by Charles described in the previous section, seem to be achieving a ‘symbiosis’ between these two approaches, as called for by Partington (1998).

Similar to those tasks proposed by Bhatia et al. (2004), Weber also approaches the lexico-grammar from the perspective of a ‘local grammar’. For example, items such as assume, consider, regard and issue, in various constructions, were all found to act as signals in an opening-type move, delimiting the case under consideration before the principle involved in it was defined, as exemplified by the extract in Table 32.4.

In an interesting departure from the normal type of ESP work, Weber’s students were also exposed to corpora of different, non-legal genres in order to sensitise them to the highly specific use and patterning of certain lexical items, such as held and submit, in legal texts.

However, as legal discourse is such an intricate discourse area, corpus-based methodologies may not completely align themselves with legal writing tasks. For example, Bhatia et al. (2004: 224) have underscored the complexity of legal discourse, pointing out that a number of academic and professional genres in law appear to be ‘dynamically embedded in one another’. In view of this, they caution that one has to go beyond the immediate textual concordance lines and look at discursive and institutional concerns and constraints to fully interpret and by extension become a skilled writer of these highly specialised genres. Hafner and Candlin (2007), in their report on the use of legal corpora by university students, also note a tension between professional discourse practices which encourage students to focus on models, and the phraseological approach associated with corpus-driven learning. However, they see this as a tension to be exploited, arguing that ‘continuing lifelong learners still need to be able to focus and reflect on the functional lexical phrases that constitute the essence of the texture of the documents they are composing’ (p. 312).

Turning to another ESP area, that of psychology, Bianchi and Pazzaglia (2007) adopted a genre perspective in a cycle of activities for helping Italian students write psychology research articles in English. Acknowledging the continuing debate on English as an International Language, they state that their 500,000-word corpus drawn from the areas of language acquisition and developmental psychology ‘should be representative of the language of the psychology community, which includes authors from different nationalities using English as a lingua franca’ (p. 265). An innovative feature of this writing instruction cycle is that students were asked to subdivide their choice of written article into moves and annotate it themselves using a functional and meta-communicative coding system devised by the authors. This was followed by data-driven guided writing tasks, which focused on the lexico-grammatical patterning of key words related to the concept of research and verb tense usage in different moves.

Although in the past few years there has been a steady stream of articles reporting on the manifold applications of corpora to EAP and ESP writing materials at the individual or institutional level, DDL aimed at writing instruction cannot be considered to have percolated through to the language teaching community at large. This situation can be accounted for, by and large, by the three following issues: the user-friendliness of corpora and tools for classroom use; strategy training for both students and teachers; and evaluation on the effectiveness of corpora for improving writing performance. These three considerations are discussed in the following section.

4. Issues in the application of corpora to writing instruction

Corpora and tools

Kosem (2008), in a recent survey on corpus tools for language teaching and learning, notes that one possible reason for the lack of uptake of DDL by the teaching community is not the methodology but the medium. Some tools try to meet the needs of both researchers and teachers, which makes them over-complicated. This issue has also been flagged by Römer (2006) and Granger and Meunier (2008: 251), who have indicated that one future challenge lies in ‘creating ready-made and user-friendly interfaces to enable learners and teachers to access multiword units from a variety of genres and text types’. However, very recent endeavours are underway in this area and user-friendly tools, specifically to enhance academic writing skills, are described in Milton (2004), Krishnamurthy and Kosem (2007) and Kaszubski (in press, 2010). Another key feature of these tools is that they are accompanied by corpora compiled in-house to meet the needs of specific learners.

A related issue concerns the sometimes inappropriate nature of ready-made corpora for language learners, which could be another reason for the lack of uptake. As Osborne (2004: 252) comments, ‘Unless corpus examples are filtered in some way … many of the contexts are likely to be linguistically and culturally bewildering for the language learner’. In this respect, Chujo et al. (2007) note that according to their readability index most of their English–Japanese corpora were rated at the advanced level, and what is needed, therefore, are available e-texts at beginner level. Once again, this is an area where progress is being made. Wible et al. (2002) describe a lexical filter which sorts examples according to a flexible threshold of lexical difficulty. A similar function is available in SketchEngine, which has an option to sort concordances ‘best first’ from a learner’s point of view (Kilgarriff, message posted on Corpus Linguistics discussion list, 16 April 2008).
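The general idea of filtering and sorting concordance lines by lexical difficulty can be illustrated with a short sketch. It should not be read as the algorithm of Wible et al. (2002) or of SketchEngine, neither of which is described in detail here: the measure used (the share of words in a line that fall outside an 'easy word' list), the tiny word list, the threshold value and the sample lines are all assumptions made for the purpose of illustration.

```python
import re

# Hypothetical "easy word" list; a real filter would draw on a graded frequency list.
EASY_WORDS = {
    "the", "a", "of", "to", "is", "in", "for", "this", "we", "was", "and",
    "found", "new", "simple", "problem", "solution",
}

def difficulty(line, easy_words=EASY_WORDS):
    """Share of tokens in the line that are NOT on the easy-word list (0.0 to 1.0)."""
    tokens = re.findall(r"[a-z']+", line.lower())
    if not tokens:
        return 0.0
    hard = sum(1 for t in tokens if t not in easy_words)
    return hard / len(tokens)

def filter_lines(lines, threshold=0.3):
    """Keep lines at or below the difficulty threshold, sorted easiest first."""
    kept = [line for line in lines if difficulty(line) <= threshold]
    return sorted(kept, key=difficulty)

# Invented concordance lines for the word "solution".
lines = [
    "The solution was titrated against aqueous sodium hydroxide",
    "A new solution is in the pipeline",
    "We found a simple solution to this problem",
]
for line in filter_lines(lines):
    print(f"{difficulty(line):.2f}  {line}")
```

Raising or lowering the threshold argument corresponds to the 'flexible threshold' mentioned above: a teacher could set it low for beginners and relax it as learners progress.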

Strategy training for learners and teachers

Another impediment to the adoption of DDL by classroom practitioners may well be the fact that the writing teachers themselves lack confidence in using the technology or do not possess a pedagogic grounding in exploiting corpora. Interestingly, and somewhat surprisingly, there are very few accounts in the literature which touch on the question of learner training. One writing programme which has integrated strategy training into writing work is reported in Kennedy and Miceli (2002). These practitioners built a corpus of contemporary written Italian to aid students with personal writing on everyday topics. Initially, the teacher gave directions for corpus investigations through a series of leading questions. This was followed up, after a few sessions, with the students encouraged to use the corpus on their own while revising their own work, the teacher acting as facilitator. See also O’Sullivan and Chambers (2006) for an account of how strategy training has been integrated into a writing course to help students improve their writing skills in French.

A detailed overview with examples of exercises for inducting advanced-level students into the skills needed for exploitation of corpus tools and data is given in Lee and Swales (2006). Students were introduced to the ‘corpus way’ of investigating language through, for example, using context to disambiguate near-synonyms and ‘gaining sensitivity to norms and distributional patterns in language (semantic prosody; genre analysis)’ (p. 62). One of these induction sessions is given below.

Wk 6: Corpus, usage patterns and subtle nuances

Guessing/scrutinizing the meanings of words by studying concordances (e.g. cabal; continually v. continuously). Looking at similar lexical items (e.g., for instance v. for example; effective v. efficient; expect v. anticipate; somewhat v. fairly). Participant-generated examples of puzzling pairs, such as totally v. in total, seek v. search.

(Lee and Swales 2006: 66)

In view of the paucity of such induction-type exercises for acculturating students into the ‘corpus way’ of looking at language from a phraseological perspective, it seems that more such activities should be made available in the literature for different levels of learners.

Likewise, effective induction tasks for teachers, in both pre- and in-service training, also need to be devised for the successful adoption of DDL (Mukherjee 2004; O’Keeffe and Farr 2003). However, as pointed out by O’Keeffe et al. (2007), there exists a considerable gap between corpus theory and teacher practice, which has only recently begun to be bridged. See Frankenberg-Garcia (2010) for specific consciousness-raising induction tasks for teacher training, and Coxhead and Byrd (2007) on the specific lexico-grammar features that teacher educators can introduce via computer means in teacher development courses specifically aimed at teachers of academic writing.

A third reason writing teachers may not have integrated DDL into the curriculum is that, to date, there is very little empirical evidence to show the efficacy of corpus methodology on writing performance, as discussed below.

Evaluation of corpora in writing performance

There is no doubt that corpus consultation has potential for enhancing L2 writing at different stages, but to what extent is still to be ascertained. Although some very insightful studies have been conducted on learners’ evaluation of corpora (Yoon and Hirvela 2004; Curado Fuentes 2002; Yoon 2008), much more empirical research needs to be carried out on the influence of corpus methodologies on learners’ writing performance.

In this area the few studies of note are Boulton’s (2009) tests on linking adverbials, and Cresswell’s (2007) study on the use of connectors in experimental and control writing groups, which showed DDL in the context of the communicative teaching of writing skills to be moderately effective. Three studies which focus on students’ writing improvement in the revision stages of writing after being given feedback on errors are those by Watson-Todd (2001), Gaskell and Cobb (2004) and O’Sullivan and Chambers (2006). Watson-Todd’s implementation is innovative in that it requires students to build their own concordances of lexical items from internet sources, inducing valid patterns for self-correction. However, interestingly, an exploratory study by Jones and Haywood (2004) found that although students’ awareness of formulaic sequences increased through corpus-based tasks, they did not do so well in transferring these phrases to their own writing. Thus the experimental results to date suggest that corpus consultation seems to be most effective for the revising process.

Having reviewed three issues which present potential impediments to the application of corpora to writing, in the following section other considerations for future expansion and extension for the use of corpora in writing are discussed.

5. Future pathways: expansion and extension

Expansion of corpora: into other varieties

A glaring omission in the previous sections is that very little has been done to address writing instruction in other languages. More projects such as those reported in Chambers and O’Sullivan (2004) and O’Sullivan and Chambers (2006) on writing in French, and in Kennedy and Miceli (2002) on writing in Italian, would certainly provide a welcome expansion to the field.

Also, only a few of the corpus-based writing tasks reviewed, such as those by Hewings (2002) and Thompson and Tribble (2001), make use of learner corpora. Corpus linguists, most notably Granger (2004a, 2004b), Gilquin et al. (2007) and Nesselhauf (2004), have persuasively argued for the findings from learner corpora to be used to inform EAP writing materials. One experimental classroom project where learner corpora are being integrated in the instruction cycle is reported in Mukherjee and Rohrbach (2006), who advocate individualising writing by having students build mini-corpora of their own writing, and localising the database.

Another consideration is for native language interference to be taken into account as this is also a source of error (Granger 2004a; O’Sullivan and Chambers 2006). A course for teaching technical writing which makes reference to the L1 is described in Foucou and Kübler (2000: 67), who point out that the use of the passive presents difficulties for French speakers as this construction is used less frequently in French than in English (e.g. ‘On donne ci-dessous des conventions pour ces options’ would be translated as: ‘Below, conventions are given for these options’). They deal with such phraseologies using corpora compiled from the web, which raises the question as to why there are so few accounts of using the web for preparing writing materials.

Moving to the area of critical pedagogies, it was noted that Starfield’s students reported a sense of empowerment through accessing corpus data to glean the phraseologies used in authoritative writing. One can also argue from a different angle, as van Rij-Heyligers (2007: 105) does, that this kind of corpus approach ‘may contain the hidden message that the native speaker knows best, hence representing elements of linguistic imperialism’. For this reason, van Rij-Heyligers makes a case for using the web for building corpora of academic English for writing instruction, as this source would treat EAP as a lingua franca and convey the sense that academic genres are dynamic entities continually being shaped and negotiated by participants, rather than as prescriptive and fixed artefacts. It could well be the case that the increasing focus on English as a lingua franca will entail more corpus building from the web reflecting this changing nature of English.

Another area for future expansion of corpora in writing is for students to work with tagged corpora of some kind, one possibility being corpora coded for genre features such as moves and steps. From the perspective of Systemic Functional Linguistics, in which choices are made from the lexico-grammar for meaning-making, Ragan (2001) describes a programme in which students worked with learner corpora tagged for features such as different types of process verbs (i.e. material; relational; existential).

Not only is there room for writing instruction to pay attention to other varieties of corpora, but scope also exists for writing instruction to escape the confines of the classroom and extend to other milieus.

Extension of corpora: out of the classroom

Chambers (2007: 13) notes that the next important step in the use of corpora is ‘out of the classroom’. It is pertinent to note that to date there have been very few accounts or research on the use of corpora for self-access purposes in writing, one exception being the corpus tools and programmes reported in Milton (1997, 2004). Gavioli and Aston (2001: 244) propose ways (e.g. pre-editing corpus data, grading corpora) ‘to progressively develop learners’ autonomy so that they become able to select and interact with appropriate data independently’. Flowerdew (2008) reports on a writing programme which systematically moves from teacher-directed convergent tasks to more divergent ones, where students take on a more autonomous role. The detailed suite of tasks undertaken during an induction period outlined in Lee and Swales (2006) would be good preparation to set students on a more autonomous writing path. But what is needed to realise autonomous use of corpus data is training not only of learners but of teachers in teacher education programmes (Mukherjee 2004).

It is to be noted that corpora of professional workplace writing have overwhelmingly been used in the academy for professionals-in-training purposes. It seems that corpora have yet to infiltrate the workplace for writing instruction. There are initiatives underway to compile a 100-million-word Corpus of Professional English in science, engineering, technology and other fields (a search of ‘Corpus of Professional English’ will bring you to the homepage). It is hoped that endeavours such as this will provide relevant corpora for use not only by professionals-in-training, but also by working professionals in the scheme of lifelong learning.

To conclude, this chapter has revealed the many innovative ways in which corpora have been used to produce materials and corpus-driven learning has been integrated into the different stages of writing programmes. However, this ‘torrent’ (to use Leech’s word) has only burst into applications by individuals or institutions. The applications of corpora to writing are at a watershed at present. Such initiatives are still to be adopted at a more national level or be implemented outside the classroom for autonomous learning, also including professional lifelong learning. However, recent innovations in corpora and tools, and the introduction of strategy training for learners and teachers, hold promise of a trickle-down effect.

Further reading

Aijmer, K. (ed.) (2009) Corpora and Language Teaching. Amsterdam: John Benjamins. (Several of the papers in this volume deal with some of the issues discussed in this chapter, such as learner and teacher training for implementing corpus methodologies.)

Gavioli, L. (2005) Exploring Corpora for ESP Learning. Amsterdam: John Benjamins. (This volume provides very useful information on all aspects of using corpora in ESP situations.)

Mukherjee, J. (2006) ‘Corpus Linguistics and Language Pedagogy: The State of the Art – and Beyond’, in S. Braun, K. Kohn and J. Mukherjee (eds) Corpus Technology and Language Pedagogy. Frankfurt am Main: Peter Lang, pp. 5–24. (This article gives a very informative introductory overview of pedagogic applications in writing.)

References

Aston, G. (1997) ‘Small and Large Corpora in Language Learning’, in B. Lewandowska-Tomaszczyk and J. Melia (eds) Practical Applications in Language Corpora. Łódź: Łódź University Press, pp. 51–62.

——(2001) ‘Learning with Corpora: An Overview’, in G. Aston (ed.) Learning with Corpora. Houston, TX: Athelstan, pp. 7–45.

Bhatia, V. K., Langton, N. and Lung, J. (2004) ‘Legal Discourse: Opportunities and Threats for Corpus Linguistics’, in U. Connor and T. Upton (eds) Discourse in the Professions. Amsterdam: John Benjamins, pp. 203–31.

Bianchi, F. and Pazzaglia, R. (2007) ‘Student Writing of Research Articles in a Foreign Language: Metacognition and Corpora’, in R. Facchinetti (ed.) Corpus Linguistics 25 Years On. Amsterdam: Rodopi, pp. 259–87.

Biber, D., Johansson, S., Leech, G., Conrad, S. and Finegan, E. (1999) Longman Grammar of Spoken and Written English. Harlow: Pearson Education.

Bloch, J. (2008) Technologies in the Second Language Composition Classroom. Ann Arbor, MI: University of Michigan Press.

——(2009) ‘The Design of an Online Concordancing Program for Teaching about Reporting Verbs’, Language Learning and Technology 13(1): 59–78.

Boulton, A. (2009) ‘Testing the Limits of Data-driven Learning: Language Proficiency and Training’, ReCALL 21(1): 37–54.

——(2010) ‘Data-driven Learning: Taking the Computer Out of the Equation’, Language Learning 60(3).

Chambers, A. (2007) ‘Popularising Corpus Consultation by Language Learners and Teachers’, in E. Hidalgo, L. Quereda and J. Santana (eds) Corpora in the Foreign Language Classroom. Amsterdam: Rodopi, pp. 3–16.

Chambers, A. and O’Sullivan, I. (2004) ‘Corpus Consultation and Advanced Learners’ Writing Skills’, ReCALL 16(1): 158–72.

Charles, M. (2007) ‘Reconciling Top-down and Bottom-up Approaches to Graduate Writing: Using a Corpus to Teach Rhetorical Functions’, Journal of English for Academic Purposes 6(4): 289–302.

Chujo, K., Utiyama, M. and Nishigaki, C. (2007) ‘Towards Building a Usable Corpus Collection for the ELT Classroom’, in E. Hidalgo, L. Quereda and J. Santana (eds) Corpora in the Foreign Language Classroom. Amsterdam: Rodopi, pp. 47–69.

Coxhead, A. and Byrd, P. (2007) ‘Preparing Writing Teachers to Teach the Vocabulary and Grammar of Academic Prose’, Journal of Second Language Writing 16(3): 129–47.

Cresswell, A. (2007) ‘Getting to “Know” Connectors? Evaluating Data-driven Learning in a Writing Skills Course’, in E. Hidalgo, L. Quereda and J. Santana (eds) Corpora in the Foreign Language Classroom. Amsterdam: Rodopi, pp. 267–87.

Curado Fuentes, A. (2002) ‘Exploitation and Assessment of a Business English Corpus through Language Learning Tasks’, ICAME Journal 26: 5–32.

Flowerdew, L. (2002) ‘Corpus-based Analyses in EAP’, in J. Flowerdew (ed.) Academic Discourse. London: Longman, pp. 95–114.

——(2004) ‘The Argument for Using English Specialized Corpora to Understand Academic and Professional Language’, in U. Connor and T. Upton (eds) Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam: John Benjamins, pp. 11–33.

——(2008) ‘Corpus Linguistics for Academic Literacies Mediated through Discussion Activities’, in D. Belcher and A. Hirvela (eds) The Oral-Literate Connection: Perspectives on L2 Speaking, Writing and Other Media Interactions. Ann Arbor, MI: University of Michigan Press, pp. 268–87.

Foucou, P.-Y. and Kübler, N. (2000) ‘A Web-based Environment for Teaching Technical English’, in L. Burnard and T. McEnery (eds) Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt am Main: Peter Lang, pp. 65–71.

Frankenberg-Garcia, A. (2010) ‘Raising Teachers’ Awareness to Corpora’, in N. Kübler (ed.) Corpora, Language Teaching, and Resources. Bern: Peter Lang.

Gaskell, D. and Cobb, T. (2004) ‘Can Learners Use Concordance Feedback for Writing Errors?’, System 32: 301–19.

Gavioli, L. and Aston, G. (2001) ‘Enriching Reality: Language Corpora in Language Pedagogy’, ELT Journal 55(3): 238–46.

Gilquin, G., Granger, S. and Paquot, M. (2007) ‘Learner Corpora: The Missing Link in EAP Pedagogy’, Journal of English for Academic Purposes 6(4): 319–35.

Granger, S. (2004a) ‘Computer Learner Corpus Research: Current Status and Future Prospects’, in U. Connor and T. Upton (eds) Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam: John Benjamins, pp. 123–45.

——(2004b) ‘Practical Applications of Learner Corpora’, in B. Lewandowska-Tomaszczyk (ed.) Practical Applications in Language and Computers (PALC) 2003. Łódź: Łódź University Press, pp. 291–301.

Granger, S. and Meunier, F. (2008) ‘Phraseology in Language Learning and Teaching: Where To From Here?’, in S. Granger and F. Meunier (eds) Phraseology in Foreign Language Learning and Teaching. Amsterdam: John Benjamins, pp. 247–51.

Hafner, C. and Candlin, C. (2007) ‘Corpus Tools as an Affordance to Learning in Professional Legal Education’, Journal of English for Academic Purposes 6(4): 303–18.

Hewings, M. (2002) ‘Using Computer-based Corpora in Teaching’, paper presented at the TESOL Conference, Utah, March.

Hewings, M. and Hewings, A. (2002) ‘“It Is Interesting to Note That …”: A Comparative Study of Anticipatory “It” in Student and Published Writing’, English for Specific Purposes 21(4): 367–83.

Hunston, S. (2002) Corpora in Applied Linguistics. Cambridge: Cambridge University Press.

Hyland, K. (2003) Second Language Writing. Cambridge: Cambridge University Press.

——(2006) English for Academic Purposes: An Advanced Resource Book. London and New York: Routledge.

Johns, T. (1984) ‘From Printout to Handout: Grammar and Vocabulary Teaching in the Context of Data-driven Learning’, in T. Odlin (ed.) Perspectives on Pedagogical Grammar. Cambridge: Cambridge University Press, pp. 293–313.

——(1986) ‘Micro-Concord: A Language Learner’s Research Tool’, System 14(2): 151–62.

Jones, M. and Haywood, S. (2004) ‘Facilitating the Acquisition of Formulaic Sequences: An Exploratory Study in an EAP Context’, in N. Schmitt (ed.) Formulaic Sequences. Amsterdam: John Benjamins, pp. 269–91.

Kaszubski, P. (in press, 2010) ‘A Guided Collaboration Tool for Online Concordancing with EAP Learners’, in A. Frankenberg-Garcia, L. Flowerdew and G. Aston (eds) New Trends in Corpora and Language Learning. London: Continuum.

Kennedy, C. and Miceli, T. (2002) ‘The CWIC Project: Developing and Using a Corpus for Intermediate Italian Students’, in B. Kettemann and G. Marko (eds) Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, pp. 183–92.

Kosem, I. (2008) ‘User-friendly Corpus Tools for Language Teaching and Learning’, in A. Frankenberg-Garcia, T. Rkibi, M. Braga da Cruz, R. Carvalho, C. Direito and D. Santos-Rosa (eds) Proceedings of the 8th Teaching and Language Corpora Conference. Lisbon: ISLA.

Kreyer, R. (2008) ‘Corpora in the Classroom and Beyond: Aspects of Corpus Competence’, paper presented at the Fourth International Inter-Varietal Applied Corpus Studies (IVACS) Conference, 13 June, University of Limerick, Ireland.

Krishnamurthy, R. and Kosem, I. (2007) ‘Issues in Creating a Corpus for EAP Pedagogy and Research’, Journal of English for Academic Purposes 6(4): 356–73.

Lee, D. (2001) ‘Genres, Registers, Text Types, Domains, and Styles: Clarifying the Concepts and Navigating a Path through the BNC Jungle’, Language Learning and Technology 5(3): 37–72.

Lee, D. and Swales, J. (2006) ‘A Corpus-based EAP Course for NNS Doctoral Students: Moving from Available Specialized Corpora to Self-compiled Corpora’, English for Specific Purposes 25(1): 56–75.

Leech, G. (1997) ‘Teaching and Language Corpora: A Convergence’, in A. Wichmann, S. Fligelstone, T. McEnery and G. Knowles (eds) Teaching and Language Corpora. London: Longman, pp. 1–23.

Milton, J. (1997) ‘Providing Computerized Self-access Opportunities for the Development of Writing Skills’, in P. Benson and P. Voller (eds) Autonomy and Independence in Language Learning. Harlow: Longman, pp. 204–14.

——(2004) ‘From Parrots to Puppet Masters: Fostering Creative and Authentic Language Use with Online Tools’, in B. Holmberg, M. Shelley and C. White (eds) Distance Education and Languages: Evolution and Change. Clevedon: Multilingual Matters, pp. 242–57.

Mudraya, O. (2006) ‘Engineering English: A Lexical Frequency Instructional Model’, English for Specific Purposes 25(2): 235–56.

Mukherjee, J. (2004) ‘Bridging the Gap between Applied Corpus Linguistics and the Reality of English Language Teaching in Germany’, in U. Connor and T. Upton (eds) Applied Corpus Linguistics: A Multi-dimensional Perspective. Amsterdam: Rodopi, pp. 239–50.

Mukherjee, J. and Rohrbach, J. (2006) ‘Rethinking Applied Corpus Linguistics from a Language-Pedagogical Perspective: New Departures in Learner Corpus Research’, in B. Kettemann and G. Marko (eds) Planing, Gluing and Painting Corpora: Inside the Applied Corpus Linguist’s Workshop. Frankfurt am Main: Peter Lang, pp. 205–31.

Nelson, M. (2006) ‘Semantic Associations in Business English: A Corpus-based Analysis’, English for Specific Purposes 25(2): 217–34.

Nesselhauf, N. (2004) ‘Learner Corpora and their Potential for Language Teaching’, in J. Sinclair (ed.) How to Use Corpora in Language Teaching. Amsterdam: John Benjamins, pp. 125–52.

Noguchi, J. (2004) ‘A Genre-analysis and Mini-corpora Approach to Support Professional Writing by Nonnative English Speakers’, English Corpus Studies 11: 101–10.

O’Keeffe, A. and Farr, F. (2003) ‘Using Language Corpora in Initial Teacher Training: Pedagogic Issues and Practical Application’, TESOL Quarterly 37(3): 389–418.

O’Keeffe, A., McCarthy, M. J. and Carter, R. A. (2007) From Corpus to Classroom: Language Use and Language Teaching. Cambridge: Cambridge University Press.

Osborne, J. (2004) ‘Top-down and Bottom-up Approaches to Corpora in Language Teaching’, in U. Connor and T. Upton (eds) Applied Corpus Linguistics: A Multi-dimensional Perspective. Amsterdam: Rodopi, pp. 251–65.

O’Sullivan, I. and Chambers, A. (2006) ‘Learners’ Writing Skills in French: Corpus Consultation and Learner Evaluation’, Journal of Second Language Writing 15(1): 49–68.

Partington, A. (1998) Patterns and Meanings. Amsterdam: John Benjamins.

Ragan, P. (2001) ‘Classroom Use of a Systemic Functional Small Learner Corpus’, in M. Ghadessy, A. Henry and R. L. Roseberry (eds) Small Corpus Studies and ELT. Amsterdam: John Benjamins, pp. 207–36.

Römer, U. (2006) ‘Pedagogical Applications of Corpora: Some Reflections on the Current Scope and a Wish List for Future Developments’, Zeitschrift für Anglistik und Amerikanistik 54(2): 121–34.

SketchEngine, http://www.sketchengine.co.uk/ (accessed 8 October 2008).

Starfield, S. (2004) ‘Why Does This Feel Empowering? Thesis Writing, Concordancing and the “Corporatising University”’, in B. Norton and K. Toohey (eds) Critical Pedagogies and Language Learning. Cambridge: Cambridge University Press, pp. 138–57.

Swales, J. (2002) ‘Integrated and Fragmented Worlds: EAP Materials and Corpus Linguistics’, in J. Flowerdew (ed.) Academic Discourse. London: Longman, pp. 150–64.

Thompson, P. and Tribble, C. (2001) ‘Looking at Citations: Using Corpora in English for Academic Purposes’, Language Learning and Technology 5(3): 91–105.

Thurstun, J. and Candlin, C. (1998a) Exploring Academic English: A Workbook for Student Essay Writing. Macquarie University: NCELTR.

——(1998b) ‘Concordancing and the Teaching of Vocabulary of Academic English’, English for Specific Purposes 17(3): 267–80.

Tribble, C. (2000) ‘Genres, Keywords, Teaching: Towards a Pedagogic Account of the Language of Project Proposals’, in L. Burnard and T. McEnery (eds) Rethinking Language Pedagogy from a Corpus Perspective. Frankfurt am Main: Peter Lang, pp. 75–90.

——(2002) ‘Corpora and Corpus Analysis: New Windows on Academic Writing’, in J. Flowerdew (ed.) Academic Discourse. London: Longman, pp. 131–49.

Tribble, C. and Jones, G. (1990) Concordances in the Classroom. London: Longman.

van Rij-Heyligers, J. (2007) ‘To Weep Perilously or W.EAP Critically: The Case for a Corpus-based Critical EAP’, in E. Hidalgo, L. Quereda and J. Santana (eds) Corpora in the Foreign Language Classroom. Amsterdam: Rodopi, pp. 105–18.

Watson-Todd, R. (2001) ‘Induction from Self-selected Concordances and Self-correction’, System 29(1): 91–102.

Weber, J.-J. (2001) ‘A Concordance- and Genre-informed Approach to ESP Essay Writing’, ELT Journal 55(1): 14–20.

Wible, D., Kuo, C.-H., Chien, F.-Y., Liu, A. and Wang, C.C. (2002) ‘Towards Automating a Personalized Concordancer for Data-driven Learning: A Lexical Difficulty Filter for Language Learners’, in B. Kettemann and G. Marko (eds) Teaching and Learning by Doing Corpus Analysis. Amsterdam: Rodopi, pp. 147–54.

Yoon, H. (2008) ‘More than a Linguistic Reference: The Influence of Corpus Technology on L2 Academic Writing’, Language Learning and Technology 12(2): 31–48.

Yoon, H. and Hirvela, A. (2004) ‘ESL Student Attitudes toward Corpus Use in L2 Writing’, Journal of Second Language Writing 13(4): 257–83.