24
What features of spoken and written corpora can be exploited in creating language teaching materials and syllabuses?

Steve Walsh

1. Integrating corpus-based approaches in a syllabus

Corpus-based approaches have clearly benefited language learning and teaching in both formal (class-based) and informal or naturalistic learning environments. Few can dispute the enormous developments which have occurred in EFL dictionaries as a result of corpora (see Walter, this volume). While similar developments are now occurring in the fields of morpho-syntax and semantics, the same rate of progress has not yet been seen in the integration of corpus-based approaches in syllabuses. In terms of language teaching methodology, corpus-based approaches have resulted in a more student-centred approach to learning and teaching and enabled learners to become researchers of their own developing interlanguage (see below). One of the main names associated with a problem-based approach to learning is Tim Johns, whose work on data-driven learning (DDL) has revolutionised the ways in which corpus-based approaches are integrated with more traditional methodologies (see, for example, Johns 1994; see also the chapters by Chambers and Sripicharn, this volume).

Learner corpora – that is, collections of students’ spoken or written work – are, arguably, one of the most useful ways to help learners understand their own problems and develop new insights into their interlanguage system (see Gilquin and Granger, this volume). The data to which they are exposed is likely to be both more relevant and more appropriate to their needs. In terms of the selection and use of teaching materials, learner corpora allow teachers to really tailor materials to the group of students they are working with. Not only does this ensure that materials are perceived as being relevant by learners, but language acquisition is more likely to occur since the materials are at the appropriate level (cf. Krashen 1983). Using learner corpora can greatly facilitate form-focused instruction (see, for example, Schmidt 1990), considered to be one of the most effective ways of ensuring that second language acquisition occurs.

Other types of commercially available corpora are also helpful in ensuring that the language which is being used in classrooms is authentic. A number of studies have compared textbook materials with spoken and written corpora and identified notable differences in what appears in the published materials and what occurs in ‘real life’ (see, for example, Biber et al. 1994; Gilmore 2004). This has led some writers to base teaching materials on corpora, although, at the time of writing, this is not happening on any wide scale in mainstream EFL course books, with the notable exception of Touchstone (McCarthy et al. 2005; see also McCarten, this volume).

It is uncontroversial to say that, at the time of writing, the following common features of everyday spoken interaction are often missing from invented textbook dialogues:

In naturally occurring conversation, speakers construct meaning together:

In the remainder of this chapter, we will focus on the ways in which evidence from corpora, wedded to corpus-based approaches, might be used to help teach the cornerstone skills of speaking/listening and reading/writing. Finally, we will take a closer look at some of the applications of learner corpora.

2. Using corpus-based materials to teach speaking and listening

Features of spoken language

Learners experience all kinds of difficulty when speaking and listening in English. These include things like initiating discussions and taking part in multi-participant conversations, dealing with listening and speaking at the same time, taking responsibility for deciding who speaks and when, opening, closing and changing topics of conversation, negotiating meaning, expressing personal feelings and using common language chunks appropriately. A corpus can be used to help learners understand and overcome many of these problems.

When we look at examples of spoken language in a corpus, we can immediately see how speakers create patterns and structures which characterise spoken English. The following two extracts have a number of features which can be found in virtually all instances of communication where two or more people are involved.

A: And if we don’t use them all up, they can be used, can be used elsewhere, anywhere.

B: Yeah elsewhere, okay.

A: That’s right.

B: I’ll go for that. It won’t be, it won’t be wasted.

A: No, okay, well I’ll sort that out with Kath then.

(BNC)

Our first observation is that expressions such as okay and that’s right are used to follow up on the other speaker’s turn. So, instead of thinking of dialogues as being two-part (A/B, A/B) what we see here is, in fact, three parts (A/B/A, A/B/A). These three-part exchanges (see Sinclair and Coulthard 1975; Sinclair and Brazil 1982) are the basic building block or organising function of everyday spoken language. Each exchange consists of:

I

Initiation

A: And if we don’t use them all up, they can be used, can be used elsewhere, anywhere.

R

Response

B: Yeah elsewhere, okay.

F

Feedback/follow-up

A: That’s right.

Three-part exchanges are often referred to as IRF exchanges (Sinclair and Coulthard 1975). The IRF pattern is extremely common in any spoken corpus of everyday interaction. Understanding IRF patterns is useful when we consider the first two learner problems we listed at the start of this section: initiating and taking part in discussions freely and dealing with listening and speaking at the same time.

A second observation when we look at typical everyday exchanges is that usually only one person is speaking at any one time. Taking turns in a conversation is a precise activity and there are relatively few overlaps or interruptions. A spoken corpus can help us gain a closer understanding of the mechanics of turn-taking and help us to evaluate things like how turns are opened, closed and passed, which words appear in turn initial or turn final place, and so on (Tao 2003). Most of the time, a speaker will have an expectation as to what the next person will say. Turns are context dependent and context renewing (Heritage 1997). That is, one turn is both dependent on a previous one and establishes a context for what might follow.
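As a rough illustration of how turn-initial and turn-final words might be counted, the following minimal sketch works through the BNC extract quoted above. It assumes a transcript in which each turn sits on one line in the form ‘Speaker: text’; the function names are hypothetical, and real corpus transcripts usually carry richer mark-up that would need its own handling.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and return word tokens, keeping contractions whole."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())


def turn_edge_words(transcript_lines):
    """Count the first and last word of every turn in a 'Speaker: text' transcript."""
    initial, final = Counter(), Counter()
    for line in transcript_lines:
        if ":" not in line:
            continue  # skip anything that is not a speaker turn
        _speaker, utterance = line.split(":", 1)
        tokens = tokenize(utterance)
        if tokens:
            initial[tokens[0]] += 1
            final[tokens[-1]] += 1
    return initial, final


# The five turns of the BNC extract quoted earlier in this section.
extract = [
    "A: And if we don’t use them all up, they can be used, can be used elsewhere, anywhere.",
    "B: Yeah elsewhere, okay.",
    "A: That’s right.",
    "B: I’ll go for that. It won’t be, it won’t be wasted.",
    "A: No, okay, well I’ll sort that out with Kath then.",
]

turn_initial, turn_final = turn_edge_words(extract)
print("Turn-initial:", turn_initial.most_common())
print("Turn-final:", turn_final.most_common())
```

With only five turns every word occurs once, so the counts become informative only when the same procedure is run over a much larger sample of conversation.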

Corpus-derived evidence of exchange- and turn-construction can be used as the basis of teaching input and items for the syllabus for what McCarthy (2002) refers to as ‘good listenership’: that is to say, the way listeners acknowledge incoming talk, respond to it appropriately and thereby show understanding. In this way, the traditional listening skill can be integrated with the speaking skill in tasks which demand particular responses which mirror common types found in corpora. Examples of these may be found in the speaking conversation strategy syllabus in McCarthy et al. (2005).

Spoken corpora and oral fluency

Traditionally, spoken fluency is measured (Fillmore 1979; Lennon 1990) by considering features such as:

• Coherence: is the contribution relevant and does it ‘make sense’?

• Hesitancy: are pauses and hesitations too frequent?

• Long turns: does the speaker produce longer stretches of talk?

• Flexibility: does the speaker use vocabulary in a flexible and varied way?

• Automaticity: including the ability to retrieve and use a repertoire of chunks or fixed formulae.

However, when we look at a corpus, we find that native speakers also hesitate a lot, are not always coherent, frequently use shorter turns, and may use a fairly narrow range of vocabulary. Having said this, expert users do succeed in communicating with apparent ease so that listeners understand their intended meaning. Oral fluency is as much about helping listeners and attending to meaning as it is about producing coherent language forms in monologue. If we want learners to become more fluent, we have to consider listening and speaking together, and not as separate skills. When people speak fluently, they use a range of interactional strategies to help their listener follow their intended meaning (for example, by using appropriate questioning strategies). Meanings do not just happen. Both speakers and listeners work hard to ensure that they are each understood. A spoken corpus can help us to understand the interactional aspect of fluent speech and increase learners’ interactional competence.

Language chunks

Another feature of spoken language that a corpus can help us to understand is a category of vocabulary often referred to as fixed and semi-fixed expressions, or variously termed chunks, clusters, lexical bundles and multi-word units (see Greaves and Warren, this volume). Fluent speakers are able to recognise and use a wide repertoire of fixed expressions. Learning to recognise and use language chunks can help learners to become more fluent. In fact, it is almost impossible to think of spoken fluency without having at our disposal hundreds of ready-made chunks; we cannot possibly create each sentence from scratch every time we speak. A corpus can show us how speakers repeat the same chunks over and over, and frequency lists of the most common chunks can be generated with relative ease with proprietary software. Such chunks, when viewed as lexical items, can be incorporated directly into the vocabulary syllabus and can be graded either according to frequency or according to degree of complexity of pragmatic specialisation. McCarthy and Carter (2004) looked at the discourse functions associated with the most frequently used chunks in CANCODE, a five-million-word corpus of spoken discourse, and found that they performed core spoken language functions such as vagueness (things like that) and discourse marking (I mean, you know), as well as more advanced strategic acts such as face protection and politeness (do you think, I don’t know if … ) (see also Greaves and Warren, this volume; McCarten, this volume). O’Keeffe et al. (2007) offer a discussion of practical teaching concerns for the incorporation of chunks into syllabuses and materials.
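A frequency list of chunks of the kind described above can also be approximated with a few lines of code rather than proprietary software. The sketch below is one simple way of doing so, not the procedure used in the studies cited: it counts every sequence of n words within each utterance. The function names are hypothetical and the sample utterances are invented for illustration; they are not taken from CANCODE or the BNC.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and return word tokens, keeping contractions whole."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())


def chunk_frequencies(utterances, n=2):
    """Count n-word sequences inside each utterance (chunks never cross turns)."""
    counts = Counter()
    for utterance in utterances:
        tokens = tokenize(utterance)
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Invented utterances, for illustration only.
sample = [
    "I mean, you know, it was fine",
    "You know, I mean, things like that",
    "He likes films and things like that",
]

two_word = chunk_frequencies(sample, n=2)
three_word = chunk_frequencies(sample, n=3)
print(two_word.most_common(4))
print(three_word.most_common(1))
```

Raw counts like these say nothing about discourse function; deciding that a recurrent string such as things like that marks vagueness still requires reading the chunk in context.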

One of the main lessons of examining a spoken corpus is that all speakers are listeners and that ‘flow’ or fluency is a jointly engineered outcome. Pedagogic intervention to foster such natural communicative interaction can take place at the level of both syllabus and classroom materials.

3. Using corpus-based materials to teach reading and writing

Using corpus-based texts to develop reading skills

Learners are commonly held to face a number of problems when reading (see, for example, Nuttall 1996; Ur 1996). These include:

• trying to understand every word;

• not being able to read quickly enough;

• ‘getting lost’ in a text;

• finding lexical density too high;

• not knowing the cultural content;

• being unfamiliar with the topic;

• having no interest in the topic of the text.

One of the main advantages of a corpus is that teachers have a large resource from which to select texts to use with a group of learners (see Allan 2009 for an example of this). By using a corpus, teachers can select texts according to things like their potential to maximise exposure to a target form, their level or content, all of which choices can be assisted by corpus-analytical software. Text difficulty can be investigated using web-based resources for the lexical profiling of texts – for example, at the time of writing, the excellent and easy-to-use web-based tools provided by Tom Cobb, where lexical difficulty levels of reading texts are generated by comparing any text input by the user against large corpora such as the BNC. Alternatively, measures such as type-token ratios for texts in a corpus can offer useful guidelines in selecting suitable reading material. A corpus-based approach to reading and writing gives teachers much more control over the types of text used. By controlling text types, there is a much greater possibility of minimising some of the difficulties listed above and of maximising the learning potential of the materials. For example, with students at lower levels, we might use (or create) a corpus of graded readers to ensure that the material is manageable for learners (Allan 2009).
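The two measures mentioned above can be sketched briefly. The type-token ratio is the number of different words (types) divided by the number of running words (tokens); a crude form of lexical profiling is the proportion of tokens that appear in a list of words learners are assumed to know. The function names, the sample sentences and the word list below are all hypothetical, and the sketch makes no claim to reproduce how any particular web-based profiler works.

```python
import re


def tokenize(text):
    """Lower-case the text and return word tokens, keeping contractions whole."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())


def type_token_ratio(text):
    """Number of distinct word forms divided by the number of running words."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def known_word_coverage(text, known_words):
    """Proportion of running words that appear in a given word list."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(1 for token in tokens if token in known_words) / len(tokens)


# Invented sentences and an invented word list, for illustration only.
repetitive = "the cat sat on the mat and the dog sat on the rug"
varied = "a quick brown fox jumps over one lazy dog"
assumed_known = {"the", "cat", "sat", "on", "and", "dog"}

print(round(type_token_ratio(repetitive), 2))
print(round(type_token_ratio(varied), 2))
print(round(known_word_coverage(repetitive, assumed_known), 2))
```

Because the type-token ratio falls as texts get longer, it is safest to compare texts of similar length, or equal-sized samples taken from each text.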

There are basically two types of ‘homemade’ corpus which can be used to help students improve their reading and writing (Aston 1996):

• sets of texts which have been written (Allan 2009), or read by learners in the course of a programme of study (see, for example, Willis 1998); a corpus of learner writing is an especially powerful tool for helping learners to understand and overcome the difficulties they encounter when writing in English (see Seidlhofer 2002);

• collections of one particular text type which contain specific linguistic features, such as specialised vocabulary, recurring grammatical structures, a particular rhetorical structure (see, for example, Varantola 2000).

Both types have a number of advantages:

• They promote a focus on form, whereby learners are given opportunities to identify (or notice, Schmidt 1990) and practise particular linguistic features. In terms of using corpora in teaching, this corresponds to data-driven learning (Johns 1991).

• They allow the use of authentic texts which are taken from and used in contexts that are relevant to learners. Students are more likely to become more proficient readers and writers if they have some interest in or familiarity with the texts being used.
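Data-driven learning typically presents learners with concordance lines, in which every occurrence of a search word is shown with a little context on either side. The following is a minimal sketch of such a keyword-in-context display, assuming plain running text as input; the function name and the default context width are hypothetical choices. The sample text is the learner extract on therefore (Walsh 2007) that is quoted later in this chapter.

```python
import re


def kwic(text, keyword, width=30):
    """Return keyword-in-context lines for every whole-word match of keyword."""
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        # Pad both sides so that the keyword lines up in a single column.
        lines.append(f"{left:>{width}}[{match.group(0)}]{right:<{width}}")
    return lines


learner_text = (
    "Therefore, it is suggested that more research is needed to find out the "
    "effects of lexical CALL on learning of vocabulary and comprehension. "
    "The research had to communicate with the participants therefore beforehand. "
    "The language learners therefore would not obtain more opportunities to "
    "negotiate for meaning and repair the breakdown in communication. "
    "Although audio recordings were made of what the conversations were, some "
    "aspects of non-verbal conversations could not be recorded and therefore "
    "they could not be analysed as collected data."
)

concordance = kwic(learner_text, "therefore")
for line in concordance:
    print(line)
```

Aligning the search word in one column is what allows learners to scan down the page and notice what typically comes before and after it.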

A corpus of reading texts can help enormously with learners’ vocabulary development. One approach, known as R-Read (Read Extended and Authentic Documents, Cobb 1997), makes use of web-based helper resources which offer the reader linguistic support in relation to meanings of words, parts of speech, etc. (see Cobb 2007 for one example based on Jack London’s Call of the Wild).

4. Using texts to develop writing skills

By scrutinising even a small corpus of student academic writing, it is possible to detect common problems, which might include things such as:

• an under- or over-use of discourse markers (see De Cock 2000): first, in addition, however, etc.

• a lack of attention to cohesion and coherence;

• not using an adequate range of vocabulary (detectable through type-token ratio counts, for example);

• improper use of academic conventions, such as citing, referencing, etc.;

• not paying attention to the audience.

Each of the extracts below is taken from the first 100 words of each dissertation in the Learner Dissertation Writing Corpus (Walsh 2007). They tell us a great deal about these students’ writing and could be used to raise awareness about good practice in academic writing. For example, learner awareness might be encouraged by asking questions such as:

1. Which is the best opening sentence and why?

2. Can you identify the topic sentence in each extract?

3. Should the first-person pronoun ‘I’ be used? Why/why not?

4. How does each writer signpost (using words like first, next, etc.) for the reader?

5. Comment on the length of sentences in each extract. Are any too long or too short? What changes would you make?

6. Rewrite one extract so that it can be read more easily.

In this study, I would like to focus on an understanding of student–student interaction in a language classroom. There are many key components in a language classroom, such as textbooks, teaching materials, teachers, and students. The learning environment, learners’ age and their reasons for learning are all influencing factors on the way they learn. I will describe their functions in the following sections.

This dissertation aims to discuss the influences of multi-media teaching materials on second language acquisition (SLA) in a comparative study. The learners are international students in a language classroom. The description and survey will focus on two parts, the first part is language learning materials and the second one is computer-assisted language learning (CALL).

A second type of learner writing corpus entails learners constructing their own, individual corpus taken from, for example, work that they have already submitted. Another alternative is for students to seek out texts that are relevant to their specialised language needs and to create a corpus based on texts from this area: for example, medical students might build a corpus of medical papers. The advantage of this approach is that learners and teachers can adopt a more individualised approach to writing, quickly identifying and addressing common problems.

There are a number of other features of academic writing which might be investigated using a learner writing corpus.

Signposting and linking

One of the most important features of good writing is the way the writer signposts and links the arguments for the reader. One of the most commonly used adverbial linking words is therefore. It is frequently over-used, or used where another linking device would be better. In the corpus extract below, for example, there are problems in the ways in which therefore is used, both syntactically (its position in a sentence) and semantically (the way it communicates meaning).

Therefore, it is suggested that more research is needed to find out the effects of lexical CALL on learning of vocabulary and comprehension. The research had to communicate with the participants therefore beforehand. The language learners therefore would not obtain more opportunities to negotiate for meaning and repair the breakdown in communication. Although audio recordings were made of what the conversations were, some aspects of non-verbal conversations could not be recorded and therefore they could not be analysed as collected data.

(Walsh 2007)

In this extract, we can see that not only is therefore over-used (if normalised, its use here would be about 100 times more frequent than its occurrence in the academic segment of the BNC), but it is used in a way which makes the linkages it suggests rather unclear.

Reporting verbs

Another aspect of academic writing which students often find difficult to cope with is the use of reporting verbs in citations and quotations. We use many different verbs for reporting the work of others, each with a very specific meaning. For example, verbs like suggested, indicated, claimed allow a writer to adopt a particular position or stance while still making valid claims. Reporting verbs also allow writers to hedge, or to be less assertive or confident about the claims they are making.

By going to Tim Johns’ web homepage we can learn something about reporting verbs. For example, one study, based on a corpus of around 430,000 words from the scientific journal Nature, found that the following are the most common reporting verbs in academic texts (see also McCarthy and O’Dell 2008: 72):

When we look at show in the learner corpus of academic writing, it is obvious that it is indeed used widely, as indicated in these extracts, all taken from one chapter of one student’s writing:

Donato’s (1994) findings show that learners are able to develop their own L2 knowledge as well as ‘extend the linguistic development of their peers’ during the process of scaffolding.

The findings of Anton’s (1999, p315) study show that ‘the functions of scaffolded assistance are achieved’.

One example is shown in Mattos’ (2000) research, which states that learners play different roles when working together on a given task, as a provider and receiver.

Insights like these enable learners to ask questions and address problems in their own writing such as:

• Am I over-using show in this chapter?

• What alternatives can I find (for example: indicate, confirm, demonstrate)?

• How is my intended meaning changed if I replace show with suggest?
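A learner could begin to answer the first of these questions with a simple count. The sketch below, offered under clear assumptions, tallies a handful of reporting verbs in the three extracts quoted above (abridged here). The verb list and function name are hypothetical and far from complete, and matching on word forms alone will miscount words such as state when they are used as nouns.

```python
import re
from collections import Counter

# Hypothetical, deliberately short list of reporting verbs and their forms.
REPORTING_VERBS = {
    "show": {"show", "shows", "showed", "shown", "showing"},
    "suggest": {"suggest", "suggests", "suggested", "suggesting"},
    "indicate": {"indicate", "indicates", "indicated", "indicating"},
    "state": {"state", "states", "stated", "stating"},
}

# Reverse lookup from each inflected form to its base verb.
FORM_TO_VERB = {
    form: verb for verb, forms in REPORTING_VERBS.items() for form in forms
}


def reporting_verb_counts(text):
    """Count occurrences of each listed reporting verb, grouping inflected forms."""
    counts = Counter()
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in FORM_TO_VERB:
            counts[FORM_TO_VERB[token]] += 1
    return counts


# Abridged from the three learner extracts quoted above.
chapter_text = (
    "Donato’s (1994) findings show that learners are able to develop their own "
    "L2 knowledge. The findings of Anton’s (1999, p315) study show that the "
    "functions of scaffolded assistance are achieved. One example is shown in "
    "Mattos’ (2000) research, which states that learners play different roles."
)

verb_counts = reporting_verb_counts(chapter_text)
print(verb_counts.most_common())
```

A count of this kind only flags possible over-use; whether indicate or suggest would be a better choice in a given sentence remains a question of meaning that the learner has to judge.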

Proof-reading and error correction

If learners are to become effective writers, it is important that they acquire good proofreading skills and are able to correct their own errors. A corpus is very helpful to the development of both sets of skills.

A useful place to look at error correction is the University of Toronto website. Here, learners can find out about the most commonly occurring errors in academic writing, with examples taken from small corpora of student writing. By comparing the most common errors with their own writing, students can learn to avoid many of the pitfalls found in non-native speaker writing.

In a longitudinal study in which learners focused on their own data, Chambers and O’Sullivan (2004) stress the importance of ‘corpus consultation’ as a means of improving writing. They found that most students in the study were able to make significant improvements to accuracy, especially in grammar and vocabulary. Their work underscores the importance of allowing students access to their own data as a means of increasing autonomy and maximising authenticity.

Cohesive devices

Effective writers make appropriate but not excessive use of cohesive devices (words like moreover, in addition, on the other hand). Similarly, good readers use cohesive devices to quickly find their way through a text.

Tankó (2004) conducted a study in which he found that Hungarian writers used more adverbial connectors but from a narrower range than native speaker writers. For example, connectors like therefore (see also above) were used almost twice as frequently by non-native as by native speakers. Tankó also found that the most common types of adverbials used by his students were those which listed (e.g. firstly, next, finally) as a means of structuring an essay. Clearly this use of connectors is over-simplistic and fails to provide adequate coherence in many texts.

Findings like these are useful for a number of reasons:

• They help instructors decide which are the most frequently occurring connectors for the purposes of syllabus grading.

• They indicate that, contrary to expectations, native writers of English use more ‘simple’ connectors (so, yet, that is) than ‘complex’ ones (such as furthermore, moreover, etc.).

Tankó (2004: 164) contends that ‘a corpus-based data-driven approach to the teaching of adverbial connectors should be adopted’. By using this approach, Tankó found that his Hungarian students’ use of adverbial connectors improved.
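Comparisons such as ‘almost twice as frequently’, or the earlier estimate for therefore, depend on normalising raw counts, because a learner corpus and a reference corpus are rarely the same size. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the figures in the example are invented purely to show the calculation and are not drawn from Tankó (2004), Walsh (2007) or the BNC.

```python
def per_million(count, corpus_size):
    """Convert a raw count into a frequency per million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count * 1_000_000 / corpus_size


def overuse_ratio(learner_count, learner_size, reference_count, reference_size):
    """How many times more frequent an item is in learner than in reference data."""
    reference_rate = per_million(reference_count, reference_size)
    if reference_rate == 0:
        raise ValueError("the item does not occur in the reference corpus")
    return per_million(learner_count, learner_size) / reference_rate


# Invented figures: 12 hits in 20,000 learner words against
# 300 hits in 1,000,000 reference words.
learner_rate = per_million(12, 20_000)
reference_rate = per_million(300, 1_000_000)
ratio = overuse_ratio(12, 20_000, 300, 1_000_000)
print(learner_rate, reference_rate, ratio)
```

A ratio well above 1 points to possible over-use and a ratio well below 1 to under-use, although with very small learner corpora a few extra occurrences can shift the ratio considerably.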

5. Exploiting learner corpora

Why use a learner corpus?

In the previous sections of this chapter, we have seen how relatively small samples of student data (essays, reports, recorded role plays, tests, and so on) can be used to produce a learner corpus. In this section, we’ll look in more detail at ways of exploiting learner corpora.

Investigating errors

By analysing the language learners produce and making comparisons with what native speakers produce, we can start to interpret errors and offer plausible explanations. Take the following example:

I don’t know whether he is going.

I didn’t know that he went.

I did’nt know whether he will go.

I don’t know does he goes.

I don’t know he go or don’t go

I don’t know he goes or not.

I don’t know he will go or not.

I don’t know him where to go.

I don’t know if he go there.

I don’t know if he go.

I don’t know if he goes.

(Hiroshima English Learners’ Corpus)

Here, we can see that the correct model sentence (I don’t know whether he is going) has been reproduced by learners with a range of errors. Learner corpora offer us the opportunity to compare learner language with native speaker language and explain why errors occur. Using real data like this has the advantage that it is based on what learners actually produce rather than invented examples. This may be seen as a very effective way of raising learner language awareness.

An early example of a learner corpus was the Louvain-based International Corpus of Learner English (ICLE; Granger 1994). This corpus consists of a collection of written texts (totalling approximately two million words) produced by advanced learners of English from countries such as France, Germany, Spain, Poland, Russia, Japan and China. The ICLE can offer interesting insights into the errors learners produce in terms of lexis, grammar and discourse.

For example, in Table 24.1, we can classify errors in relation to a ‘standard’ native speaker (NS) utterance like the one given: She is Mike’s sister.

Once we have classified errors in this way, we can look at their frequency of occurrence and consider the extent to which the errors affect intelligibility. By using a spoken learner corpus, we can gain access to non-native speaker talk and consider how errors might impact on the interaction taking place. For example, the Michigan Corpus of Academic Spoken English (MICASE, available free on the MICASE website) permits users to download examples of NNS talk.

The next extract, taken from MICASE, involves a Japanese student giving a presentation. This data could be used to identify errors (here noted in bold). It could also be used to help the student improve the presentation by making it more transparent to listeners, adding in signpost words, and so on.

Okay, so, I’m sorry my voice doesn’t sound as good as it should be but the last week I had a crew from Japan who videotape various part of the on- and off-campus about the child care resource in Ann Arbor. This is something I report last year based on a request from the government in Japan. The child care issue I went through as a parent but, it’s not my expertise and try to resign, but I couldn’t so at the end I did and thought that was it. You know they gave me a small funding but I did it and I thought that’s the end for good and then they came back this year and said this was really good, we found your report fascinating so we’d like to create the video.

Another use of the same extract would be to have students rewrite it, but in a different genre: for example, as the introduction to a report, or a news item reporting the events.

In this chapter, we have considered how corpus-based insights might be incorporated into a syllabus and used to underpin teaching materials. While we are not suggesting that corpora should replace existing materials and syllabuses, we are proposing that they are an extremely useful means of helping learners by extending and consolidating more traditional approaches to teaching and learning. Specifically, in this chapter we have seen how spoken and written corpora have much to offer in terms of helping learners improve their interactional competence and language awareness. The main advantages of corpus-based approaches are that materials can be tailored to both the level and the needs of particular groups of students, and that students can be more actively involved in the learning process and can develop skills which will help them in their own interlanguage development.

Further reading

Allan, R. (2009) ‘Can a Graded Reader Corpus Provide “Authentic” Input?’ ELT Journal 63(1): 23–32. (Useful article on ways of exploiting a reading corpus by using authentic texts to help students acquire essential reading skills.)

Granger, S. (2002) ‘A Bird’s Eye View of Learner Corpus Research’, in S. Granger, J. Hung and S. Petch-Tyson (eds) Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam: John Benjamins. (Offers some useful insights into the use of learner corpora for language teaching and learning.)

O’Keeffe, A., McCarthy, M. J. and Carter, R. A. (2007) From Corpus to Classroom. Cambridge: Cambridge University Press. (Gives an excellent introduction to the use of corpora in the teaching of all four skills.)

References

Allan, R. (2009) ‘Can a Graded Reader Corpus Provide “Authentic” Input?’ ELT Journal 63(1): 23–32.

Aston, G. (1996) ‘The British National Corpus as a Language Learner Resource’, in S. Botley, J. Glass, T. McEnery and A. Watson (eds) Proceedings of Teaching and Language Corpora. Lancaster: University Centre for Computer Corpus Research on Language, pp. 178–91.

Biber, D., Conrad, S. and Reppen, R. (1994) ‘Corpus-based Approaches to Issues in Applied Linguistics’, Applied Linguistics 15(2): 169–89.

Chambers, A. and O’Sullivan, I. (2004) ‘Corpus Consultation and Advanced Learners’ Writing Skills in French’, ReCALL 16(1): 158–72.

Cobb, T. (1997) ‘Is There Any Measurable Learning from Hands-on Concordancing?’ System 25(3): 301–15.

——(2007) ‘Computing the Vocabulary Demands of L2 Reading’, Language Learning and Technology 11(3): 38–63.

De Cock, S. (2000) ‘Repetitive Phrasal Chunkiness and Advanced EFL Speech and Writing’, in C. Mair and M. Hundt (eds) Corpus Linguistics and Linguistic Theory. Papers from ICAME 20 1999. Amsterdam: Rodopi, pp. 51–68.

Fillmore, C. J. (1979) ‘On Fluency’, in C. J. Fillmore, D. Kempler and W. Wang (eds) Individual Differences in Language Ability and Language Behavior. New York: Academic Press, pp. 85–101.

Gilmore, A. (2004) ‘A Comparison of Textbook and Authentic Interactions’, ELT Journal 58(4): 363–74.

Granger, S. (1994) ‘The Learner Corpus: A Revolution in Applied Linguistics’, English Today 39 (10/3): 25–9.

Heritage, J. (1997) ‘Conversational Analysis and Institutional Talk: Analysing Data’, in D. Silverman (ed.) Qualitative Research: Theory, Method and Practice. London: Sage, pp. 223–45.

Johns, T. (1991) ‘Should You Be Persuaded?: Two Samples of Data-Driven Learning Materials’, English Language Research Journal 4: 1–16.

——(1994) ‘Data-driven Learning: An Update’, TELL&CALL 2: 4–10.

Krashen, S. (1983) The Input Hypothesis. London: Longman.

Lennon, P. (1990) ‘Investigating Fluency in EFL: A Quantitative Approach’, Language Learning 40 (3): 387–417.

McCarthy, M. J. (2002) ‘Good Listenership Made Plain: British and American Non-minimal Response Tokens in Everyday Conversation’, in R. Reppen, S. Fitzmaurice and D. Biber (eds) Using Corpora to Explore Linguistic Variation. Amsterdam: John Benjamins, pp. 49–71.

McCarthy, M. J. and Carter, R. A. (2004) ‘This, That and the Other. Multi-word Clusters in Spoken English as Visible Patterns of Interaction’, Teanga 21: 30–52.

McCarthy, M. J. and O’Dell, F. (2008) Academic Vocabulary in Use. Cambridge: Cambridge University Press.

McCarthy, M. J., McCarten, J. and Sandiford, H. (2005) Touchstone. Cambridge: Cambridge University Press.

Nuttall, C. (1996) Teaching Reading Skills in a Foreign Language. London: Macmillan.

O’Keeffe, A., McCarthy, M. J. and Carter, R. A. (2007) From Corpus to Classroom: Language Use and Language Teaching. Cambridge: Cambridge University Press.

Schmidt, R. W. (1990) ‘The Role of Consciousness in Second Language Learning’, Applied Linguistics 11(2): 129–58.

Seidlhofer, B. (2002) ‘Pedagogy and Local Learner Corpora: Working with Learning-driven Data’, in S. Granger, J. Hung and S. Petch-Tyson (eds) Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam: John Benjamins, pp. 213–34.

Sinclair, J. and Brazil, D. (1982) Teacher Talk. Oxford: Oxford University Press.

Sinclair, J. and Coulthard, M. (1975) Towards an Analysis of Discourse. Oxford: Oxford University Press.

Tankó, G. (2004) ‘The Use of Adverbial Connectors in Hungarian University Students’ Argumentative Essays’, in J. Sinclair (ed.) How to Use Corpora in Language Teaching. Amsterdam: John Benjamins, pp. 157–81.

Tao, H. (2003) ‘Turn Initiators in Spoken English: A Corpus-based Approach to Interaction and Grammar’, in P. Leistyna and C. Meier (eds) Corpus Analysis: Language Structure and Language Use. Amsterdam: Rodopi, pp. 187–207.

Ur, P. (1996) A Course in Language Teaching: Practice and Theory. Cambridge: Cambridge University Press.

Varantola, K. (2000) ‘Translators, Dictionaries and Text Corpora’, in S. Bernardini and F. Zanettin (eds) I Corpora nella Didattica della Traduzione. Bologna: CLUEB, pp. 117–33.

Walsh, S. (2007) ‘Learner Dissertation Writing Corpus’, unpublished.

Willis, J. (1998) ‘Concordances in the Classroom without a Computer: Assembling and Exploiting Concordances of Common Words’, in B. Tomlinson (ed.) Materials Development in Language Teaching. Cambridge: Cambridge University Press, pp. 44–66.