42
How to use corpus linguistics in the study of political discourse

Annelie Ädel

1. What is political discourse?

Political discourse has a long-standing tradition as an object of study across many disciplines, including rhetoric, linguistics and political science. However, this is not to say that it is a well-defined entity. After brief reflection, it quickly becomes apparent that ‘political discourse’ is a vague term that can be defined rather narrowly or quite broadly. Essentially, our definition of political discourse boils down to how we define ‘political’. This stands in contrast to, for example, legal discourse, where ‘legal’ is a much more restricted concept than ‘political’. The reader may wish to look at how ‘political’ collocates in a large standard corpus, such as the British National Corpus, to learn more about its various meanings.

A narrow definition of ‘political discourse’ sees it as taking place when ‘political actors, in and out of government, communicate about political matters, for political purposes’ (Graber 1981: 196). A broader definition of ‘political discourse’, however, could even include any discourse or part of a discourse which happens to be on a political topic, such as an informal conversation between friends where the topic revolves around an upcoming election or around a politicised issue, such as abortion. An even broader view of what ‘political’ means can be exemplified by the slogan ‘the personal is political’ (used in the feminist movement in the 1960s and 1970s). In terms of linguistic research, we can consider for example Ochs and Taylor (1992: 301), who frame a study of family dinner narratives as political discourse, arguing that ‘[f]amilies are political bodies in that certain members review, judge, formulate codes of conduct, make decisions and impose sanctions that evaluate and impact the actions, conditions, thoughts and feelings of other members’. Some linguists (e.g. Shapiro 1981, cited in Wilson 2003) take the stance that all discourse is essentially political.

We can talk about ‘political discourse’ in terms of three different definitional scopes. In the narrow scope, the political genre is the main criterion; ‘political discourse’ entails a speech event which takes place in a political context, involving political agents. In the broad scope, the political topic becomes the main criterion; ‘political discourse’ refers to any discourse on a topic which is political. Finally, in the extended scope, the underlying political issue is the main criterion; the idea is that power and control are (often or always) enacted through discourse, which makes it possible to consider any discourse in, for example, an educational or institutional setting as ‘political’.

In this chapter, a narrow scope will be taken, so the focus will be on the political genre. It would be beyond the scope of this chapter to bring in examples which do not represent ‘big politics’, even if they may be highly relevant to people’s everyday lives. Further, the focus is on linguistic analysis rather than political comment, although O’Halloran (this volume) on critical discourse analysis offers a different perspective. A necessary limitation of the chapter, given space constraints, is the scarce representation of non-English-language studies and corpora.

Genres of political discourse

As political discourse can encompass a range of different speech events, it would perhaps be more appropriate to speak of ‘political discourses’. Despite the vagueness of the term, it is still possible to talk about prototypical examples. Many language users, at least in the Western world, would likely cite the political speech or the political debate as prototypical examples of political discourse. Further examples of genres of political discourse are the political manifesto, the political pamphlet, the political press conference, the political editorial, the political media interview, the political poster, the political television advertisement, the political slogan and the political car sticker. One thing that these genres tend to have in common is that political discourse is typically persuasive – that is, its main communicative purpose is to persuade an audience about something.

In this chapter, we will focus on four genres of political discourse: (1) the political speech, (2) parliamentary debates, (3) the governmental press conference and (4) the political news report. These reflect some of the most powerful and visible agents in political discourse: the politicians, the political institution, the government and the political media. They all occur in relatively formal contexts, where talk is ‘on the record’.

2. Political discourse and corpora

This section gives an overview of corpora and corpus studies of political discourse from the perspective of the four prototypical genres mentioned above. Further examples of political genres which have been covered in corpus analysis are websites of special interest groups (e.g. Teubert 2002; Koteyko 2007), the media interview of the politician (e.g. Milizia and Spinzi 2008) and the editorial or leader, which expresses the opinion of the editor or publisher of the newspaper on a topical issue (Westin and Geisler 2002; Morley 2004). We will return to these later in the chapter.

Political discourse is frequently represented in corpora, in part because many political genres are not only public but also widely publicised, which makes them more easily accessible than many other types of discourse. It is also the case that, in parliamentary debates in particular, keeping records of the discourse is of importance for legal and democratic reasons.

Political discourse has been represented in standard corpora since the first-generation projects, although in written corpora this primarily takes the form of mediated political discourse; the Brown and the LOB include ‘political reportage’ and the editorial. The spoken London–Lund Corpus, for its part, also includes political speeches. A more recent example of a standard corpus, the British National Corpus, offers a broader range of both spoken and written political genres: parliamentary speeches, parliamentary debates and public spoken debates. The political discourse represented in the media includes such categories as editorials and letters to the editor. Historical corpora of English in which political discourse is represented also exist; for example, the Lampeter Corpus includes a section of political texts from 1646 to 1730.

Political discourse has also held a strong position in translation corpora, perhaps due primarily to the immense translation activity (often required by law) triggered by multilingual political contexts. The Hansard Corpus, which consists of proceedings from the Canadian Parliament in French and English, can be mentioned as an early example of the translation corpus (see Tognini Bonelli, this volume). Another highly active translation area is the European Union, where laws regulate the production of translations of various political and institutional documents. Such widely accessible documents continue to feed into many corpora.

Although political discourse tends to be fairly well represented in standard corpora, few generally available corpora exist which contain exclusively genres of political discourse. The following sections comment on the corpora available for the four different genres, or the relative ease of ad hoc corpus compilation, and offer examples of studies based on such corpora.

Political speeches

The political speech is typically meticulously prepared, rhetorically elaborate and read from a written manuscript. A corpus of political speeches makes it possible to analyse the idiolect of specific politicians (although note that many politicians use speech writers), for example in terms of rhetorical style or the typical connotations of specific keywords. Many different types of political speeches exist, ranging from live speeches by local politicians to televised presidential inaugural addresses. The political speech can be experienced live by an audience; it can be experienced through radio or television and often the internet; it can be read in the form of a transcript; or parts of it can be represented by either direct or indirect reported speech, for example in the news media. All of these representations can be made into corpora of political discourse to be studied in systematic ways.

One corpus resource for political speeches is CORPS (CORpus of tagged Political Speeches), which has been annotated with audience reactions, such as applause or laughter. There are also many online resources for collecting political speeches.

Perhaps not surprisingly, there is a strong tradition in the study of political discourse of focusing on speeches by national leaders. Examples of corpus-based studies include Fairclough’s (2000) analysis of the speeches of the British Prime Minister Tony Blair, Charteris-Black’s (2002) study of inaugural speeches by various US presidents, and Berber Sardinha’s (2008) study of speeches by Brazilian President Lula.

Parliamentary debates

A corpus of parliamentary debates makes it possible to analyse the political discourse of specific political parties or groups. As parliamentary debates are often argumentative, with representatives defending and attacking various political positions, they are of particular interest for the analysis of argumentation strategies.

A corpus resource for parliamentary debates is the Corpus of European Parliament proceedings (EUROPARL), consisting of parallel texts from 1996 to 2006 in the various languages of the EU Parliament. Much more material is available online at official parliamentary or congressional websites. To find material from an English-language context, search for items such as ‘British Parliament’; ‘Irish Parliament’; ‘US Congress’. There are also many resources outside of the English-language context, such as the extensive archives of the EU.

There are several examples of corpus-based analyses of parliamentary debates, for instance Baker (2004) on debates in the UK House of Lords, Bevitori (2006) on debates in the UK House of Commons, and Ilie (2000) on debates in both UK Houses of Parliament. These will be discussed below.

Political press conferences

A corpus of political press conferences makes it possible to analyse language use in government or in the process of governance, as well as the interactions between government and media representatives. These events provide an important channel for an administration’s policy-making.

A ready-made corpus resource for press conferences is the Corpus of Spoken Professional American-English (CSPAE; Barlow 2000), half of which contains press conference transcripts from the White House amounting to almost one million words. There are also many official sites for downloading press briefings, for example the official White House website.

An example of corpus-based analysis of this genre is Partington (2003), in which White House press briefings are studied with respect to metaphor, as we will see below.

Mediated political discourse: news reports

A corpus of political news reports makes it possible to analyse political discourse as represented in the media. A great deal of the research carried out on political discourse has been on political news reported in the media, which bears witness to the relative ease of accessibility of media genres and to the special relationship between politics and the media.

It is technically relatively easy to compile one’s own database of news reports, for example by using a large searchable database (such as LexisNexis Academic) and searching for specific keywords or phrases (e.g. ‘war in Iraq’). This can give access to a great many articles from a range of different newspapers; however, this method tends to produce a great deal of duplicate material, as many papers print only slightly modified versions of articles from large news associations such as the Associated Press. Another potential problem is that it is not always possible to include only news reports and not editorials, for example. It is also important to note that some databases impose extremely restrictive licensing rules. On the other hand, many newspaper sites – in English as well as other languages – make their archives available for the searching and downloading of articles, though copyright laws still prevail.
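The duplicate problem noted above can be reduced with a simple overlap check before articles are admitted to the corpus. The following is a minimal sketch, not a procedure described in the studies cited: it compares texts by the proportion of three-word sequences they share (Jaccard similarity) and keeps only the first of any near-identical pair. The function names and the 0.8 threshold are assumptions chosen for illustration and would need tuning on real newswire material.

```python
def shingles(text, n=3):
    """Return the set of n-word sequences (shingles) in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Share of shingles two texts have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def drop_near_duplicates(texts, threshold=0.8):
    """Keep a text only if it is not too similar to one already kept.

    The threshold is an illustrative assumption, not an established value.
    """
    kept, kept_shingles = [], []
    for text in texts:
        current = shingles(text)
        if all(jaccard(current, seen) < threshold for seen in kept_shingles):
            kept.append(text)
            kept_shingles.append(current)
    return kept
```

Each new text is compared with every text already kept, which is adequate for a few thousand articles but would be slow for very large collections.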

It is important to know one’s corpus and maintain control of what is in it. One way of doing so is to use explicit criteria when collecting one’s own material. Garretson and Ädel (2008: 162), for example, drew their articles on the 2004 US election (see Section 4) from eleven high-circulation newspapers in the US. The criteria were as follows: each article had to be at least 400 words long, mention both candidates (Bush and Kerry) and be published within the thirty days before the election. Editorials and letters to the editor were excluded from the corpus manually, as only ‘objective’ news reports were desired. This procedure resulted in a corpus of 1.74 million words which took only a few days to compile (for more on building a corpus, see chapters by Adolphs and Knight, Biber, Clancy, Koester, Nelson, Reppen and Thompson, this volume).
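Explicit criteria of this kind lend themselves to being written down as a filter, which also documents the selection for later readers. The sketch below is illustrative only. The Article class, its field names and the reading of ‘within the thirty days before the election’ as the thirty calendar days preceding 2 November 2004 are assumptions, not details reported by Garretson and Ädel (2008). The manual exclusion of editorials and letters is not modelled.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta

ELECTION_DAY = date(2004, 11, 2)   # 2004 US presidential election
WINDOW = timedelta(days=30)        # assumed reading of "thirty days before"
MIN_WORDS = 400


@dataclass
class Article:                     # hypothetical record for one news article
    text: str
    published: date


def meets_criteria(article: Article) -> bool:
    """Apply the three stated criteria: length, candidates, date."""
    long_enough = len(article.text.split()) >= MIN_WORDS
    tokens = set(re.findall(r"[a-z]+", article.text.lower()))
    mentions_both = "bush" in tokens and "kerry" in tokens
    in_window = ELECTION_DAY - WINDOW <= article.published < ELECTION_DAY
    return long_enough and mentions_both and in_window
```

Matching on whole tokens rather than substrings avoids counting words such as ‘ambush’ as a mention of a candidate.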

3. Corpus techniques for exploring political discourse

What can corpus linguistics add to the study of political discourse? The value of the corpus and corpus tools has been discussed at length in the corpus-linguistic literature, and other chapters of this handbook provide a rich coverage of the advantages of taking a corpus-linguistic approach to the study of language use (see, for example, chapters by Evison, Hunston, Scott and Tribble, this volume). Naturally, the general advantages of using corpus-based methods (e.g. the empirical basis and the potential for systematic and semi-automatic analysis) also hold true for the study of political discourse in particular.

Carrying out a corpus-based study can mean different things; as in the case of defining ‘political discourse’, there are broad and narrow definitions of what a corpus-linguistic study is. Some researchers simply take it to mean that an electronic database of texts was used, whether or not any corpus-linguistic tools, such as the concordancer, were employed. In the majority of cases, however, a researcher using a corpus will also make sure to capitalise on the fact that search tools can be used to automatically locate all instances of a particular word form, or tag in the case of an annotated corpus. Another way in which researchers differ in their approach to the corpus is in how data-driven the analysis is. Most often, the researcher has already decided on the research question before searching the corpus. However, an alternative route the researcher can take is to let the corpus data guide any decisions about what to look for, for example by creating word frequency lists or keyword lists.

This section provides four examples of useful corpus analysis techniques in the context of political discourse: (1) analysing ‘how X is talked about’, (2) making corpus comparisons, (3) analysing sets of linguistic features marking a particular style, and (4) analysing keywords. There is a certain amount of overlap between the techniques; comparison, in particular, is fundamental and is also part of both (3) and (4), albeit in more complex ways. Furthermore, the techniques by no means represent an exhaustive set, but were selected with an eye towards exemplifying a variety of common techniques useful for analysing political discourse. More detailed information about corpus methods is given in other chapters of this book; Evison, for example, covers the basics of exploring word frequency, concordance lines and keywords.

Analysing ‘how X is talked about’

Quantitative and qualitative methods complement each other usefully in corpus studies of political discourse. Researchers often start with the quantitative analysis (looking at frequency and distribution), then proceed in an increasingly qualitative way, looking at concordance lines and finding patterns in the co-text – perhaps also exploring discourse-level phenomena or rhetorical functions. In the process of analysis, there is a great deal of moving back and forth between overall frequencies and more contextualised examples.

The link between the word and the co-text (the concordance lines and broader discourse patterns) bears emphasising: ‘While much has been made of single words in political discourse … in most cases it is the context … which carries the political message’ (Wilson 2003: 409). Even though the point of departure in corpus analysis is typically a word form, the ultimate interest is in broader topics such as discourse patterns, argumentation patterns, schemas or cultural beliefs. For example, many linguists who study political discourse are interested in ‘how X is talked about’.
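For readers without access to a concordancer, the idea of a concordance line can be illustrated in a few lines of Python. This is a minimal sketch under simplifying assumptions: the tokeniser and the function name kwic are illustrative, and dedicated concordance programs offer sorting, larger contexts and much else. It returns each occurrence of a search word together with a fixed number of words to its left and right.

```python
import re


def kwic(text, node, width=4):
    """Return (left context, node, right context) for each hit of `node`.

    Matching is case-insensitive; `width` is the number of words of
    co-text shown on each side.
    """
    tokens = re.findall(r"\w+(?:['’-]\w+)*", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


sample = "We want a liberal economy. The Liberal party opposed it."
for left, node, right in kwic(sample, "liberal", width=2):
    print(f"{left:>15}  {node}  {right}")
```

Reading down the aligned node word is what allows recurrent patterns in the co-text to be spotted.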

An early analysis of corpus data with respect to ‘how X is talked about’ is de Beaugrande (1999), who investigated ‘liberal’ and related words in corpora from the UK, the US and South Africa. He specifically takes the view that corpus data can be successfully examined for ‘expressions which are presumed to undergo ideological contestation’ (1999: 259); ‘liberal’, for example, is said to be used by different groups interested in different aspects of its meaning, e.g. ‘for freedom from government regulations and for solidarity of white people with black people’ (1999: 273).

A somewhat different approach was used in Ilie (2000) for a pragmatic analysis of political clichés in the British Parliament. What she did was use the term ‘cliché’ itself as a search term, as in ‘We should be careful before uttering a string of good-sounding clichés to solve all the problems’, which was then analysed from a pragmatic perspective, examining the words labelled as clichés and the salient argumentation patterns that referred to the clichés under discussion.

Making corpus comparisons

Bringing out what is typical of a particular discourse is really only possible by comparison. Ideally, the analyst should be able to make the comparison to something more than simply his or her own intuitions about other types of discourse. Using representative samples of discourse makes it possible to make more empirically sound and accurate comparisons.

When studying a word, expression or syntactic structure in a specific type of discourse, it is often a good idea to check its frequency and/or use in another type of discourse. Morley (2004), for example, studied generic statements (X is Y) in editorials, specifically through searching for present-tense forms of the verb BE. He compared the frequency of present-tense BE in editorials to its frequency in news reports, and found considerably higher numbers in the editorials. Thus, his hypothesis that ‘telling the readers what is the case’ is a significant function in editorials was strengthened.
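Because two corpora are rarely the same size, a comparison of this kind is normally made on normalised rather than raw frequencies. The sketch below is an illustration of the general principle, not a reconstruction of Morley’s procedure: the list of forms, the tokeniser and the per-thousand basis are assumptions, and contracted forms such as ‘it’s’ are ignored.

```python
import re

BE_PRESENT = {"am", "is", "are"}   # assumed set of forms; contractions ignored


def per_thousand(text, forms):
    """Frequency of the given word forms per 1,000 running words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in forms)
    return 1000 * hits / len(tokens)


editorial = "The policy is wrong and the critics are right"
report = "The minister announced the policy on Monday"
print(per_thousand(editorial, BE_PRESENT), per_thousand(report, BE_PRESENT))
```

With real corpora the two figures would be computed over entire subcorpora, and a significance test would usually accompany the comparison.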

In addition to comparing two genres, one can also compare the production of two different speakers in the same genre. A study by Milizia and Spinzi (2008) compared language produced by George W. Bush and Tony Blair in the form of speeches, press conferences and interviews from a specific year (2005). They analysed how the two leaders used multi-word units involving the lemma TERROR differently and found different idiolects at work, ‘signalling different cultural and political identity’ (2008: 346). For example, Bush preferred the phrase ‘war on terror’, often co-occurring with ‘allies and friends’, while Blair preferred ‘fight against terrorism’, often co-occurring with nouns like ‘co-operation’, ‘solidarity’, ‘unity’ and ‘support’. Comparison of the two subcorpora is said to tell us something about the different objectives of the two speakers, for example that ‘Bush assumes a more overtly warlike style’ (2008: 346).
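A comparison of this type rests on collecting the words that occur near a search item in each speaker’s subcorpus. The following is a minimal sketch under stated assumptions. The function name, the span of four words and the use of a simple prefix test as a stand-in for the lemma TERROR are all illustrative, and none of this reproduces the tools used by Milizia and Spinzi (2008).

```python
import re
from collections import Counter


def collocates(text, is_node, span=4):
    """Count words occurring within `span` words of any node word.

    `is_node` is a function deciding whether a token counts as the
    search item, e.g. any form beginning with "terror".
    """
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if is_node(token):
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


def is_terror_form(token):
    # Crude approximation of a lemma: also matches "terrorism", "terrorist"
    return token.startswith("terror")
```

Running the function separately on each speaker’s texts and comparing the most frequent collocates gives a first impression of differences, which then need to be checked against the concordance lines.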

Yet another option for comparison is to study the discourse of two different groups of speakers with opposing views on a political issue. Especially in the context of a political debate, it may prove useful to create subcorpora for comparison based on the representation of different views, for instance one subcorpus for those speakers who are for and one for those who are against a specific issue (cf. Baker 2004). This way, the differences in stance can be more easily brought out when analysing how specific topics or populations are talked about.

Analysing sets of linguistic features marking a particular style

Another example of a corpus technique which has been used in studies of political discourse is the analysis of sets of linguistic features that have been shown to be indicative of a particular style – for example, a persuasive or narrative style. This technique has been applied, for example, to diachronic corpora, such that statistically significant variation in these features across sections of the corpus can tell us something about how the genre as a whole has changed over time (see e.g. Biber 1988; Biber, this volume).

One corpus-based study of changes over time in a specific political genre is Westin and Geisler (2002), focusing on editorials in British up-market newspapers from 1900 to 1993. By analysing some forty linguistic features, they demonstrate a development in editorials towards less narrative and more persuasive and argumentative styles throughout the period. Especially during the latter part of the twentieth century, the language of editorials also became less abstract and more informal.

A broad overall increase in informalisation has also been shown in a study of UK party election broadcasts between 1966 and 1997 (Pearce 2005). Twenty-eight linguistic features were selected as indicators of a formal/informal style, including features such as nominalisations, personal pronouns, common adverbs, common lexical verbs, questions and words containing nine or more letters. This trend toward informality was used to support claims made about a general increase in informalisation in public discourse in the twentieth century.
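A few of the surface features mentioned here can be counted without any annotation, which gives a sense of how a feature-based profile is built up. The sketch below covers only three of the features named (long words, personal pronouns and questions). The pronoun list, the function name and the per-thousand normalisation are assumptions for illustration, and features such as nominalisations would in practice require a tagged corpus.

```python
import re

# Assumed, simplified pronoun list; "her" is also a possessive determiner
PRONOUNS = {"i", "me", "we", "us", "you", "he", "him",
            "she", "her", "it", "they", "them"}


def style_profile(text):
    """Rates per 1,000 words for three rough formality indicators."""
    tokens = re.findall(r"[a-z]+", text.lower())
    n = len(tokens)
    if n == 0:
        return {"long_words": 0.0, "pronouns": 0.0, "questions": 0.0}
    return {
        # words containing nine or more letters
        "long_words": 1000 * sum(len(t) >= 9 for t in tokens) / n,
        "pronouns": 1000 * sum(t in PRONOUNS for t in tokens) / n,
        # question marks as a proxy for questions
        "questions": 1000 * text.count("?") / n,
    }
```

Computing such a profile for each period of a diachronic corpus makes it possible to see whether the indicators move together in the direction of a more informal style.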

Analysing keywords

Studying keywords is a popular approach to the analysis of political discourse. Keyword analysis is essentially based on the notion that recurrent ways of talking about concepts and ideas reveal something about how we think about the social world. Different researchers mean different things by the term ‘keyword’, however. A non-technical definition of ‘keyword’ is any word which is important in a discourse – and, by extension, in the culture in which it is used (for early work on keywords, see Williams 1985). A more technical definition is that a keyword is ‘found to be outstanding in its frequency in the text’ (Scott 1999; this volume) by comparison to another corpus. Since most concordance programs offer automatic comparison of words across corpora, such data are relatively easily obtained. Note that this technique is more data-driven than the ones mentioned above; that is, the corpus material itself constitutes the starting point of the study.

In this subsection, we will look at two different types of corpus comparisons involving keywords: one example involving a comparison of a specialised corpus with a general corpus and three examples involving comparison of different types of specialised corpora with each other.

An example of how keywords can be studied in a specialised corpus and a general corpus is found in Teubert (2002). The specialised corpus was compiled of text from anti-EU websites in the UK, while the general corpus used was the British National Corpus. First, two word frequency lists were created, one based on the specialised corpus and one based on the general corpus. Next, a keyword list was automatically generated, based on the two word frequency lists. The keyword list highlighted which words were unusually central to the specialised corpus. In other words, the most significant lexical choices in the corpus were revealed, and these say something about what the writers choose to focus on. Among the keywords found in the specialised corpus were ‘bureaucratic’, ‘corrupt’, ‘prosperity’, ‘Anglo-Saxon’, ‘sovereignty’ and ‘province’ – all words which have strong connotations. As Teubert remarks (2002: 11), the anti-EU discourse is quite emotional.
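The two-step procedure described above (two frequency lists, then a keyword list derived from them) can be sketched as follows. The statistic used here is log-likelihood, which is one measure commonly used for keyness. The source does not state which statistic was applied in Teubert’s study, so this choice, like the function names, is an assumption made for illustration.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Keyness of a word occurring a times in a study corpus of c words
    and b times in a reference corpus of d words."""
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_study)
    if b > 0:
        score += b * math.log(b / expected_ref)
    return 2 * score


def keywords(study_tokens, ref_tokens, top=10):
    """Words relatively more frequent in the study corpus, ranked by keyness."""
    if not study_tokens or not ref_tokens:
        raise ValueError("both corpora must contain at least one token")
    study, ref = Counter(study_tokens), Counter(ref_tokens)   # frequency lists
    c, d = len(study_tokens), len(ref_tokens)
    scored = []
    for word, a in study.items():
        b = ref[word]
        if a / c > b / d:          # keep only 'positive' keywords
            scored.append((log_likelihood(a, b, c, d), word))
    scored.sort(reverse=True)
    return [(word, round(score, 2)) for score, word in scored[:top]]
```

A word that is equally frequent, relative to corpus size, in both corpora receives a score of zero. The higher the score, the stronger the evidence that the word is characteristic of the specialised corpus.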

An example of how keywords can be compared across two specialised corpora is found in Fairclough (2000). Here, discourse samples from the same political party but from different time periods – specifically, before and after a major reform – were contrasted. What Fairclough did was to compare keywords from ‘New Labour’ (the re-named and modernised British Labour Party) material and from earlier Labour material. Some of the keywords of New Labour were ‘new’, ‘business’, ‘rights’, ‘values’ and ‘work’. However, it is only by considering contextualised examples that we can know how these words are used in New versus Old Labour discourse. For example, Fairclough’s analysis shows that ‘rights’ in New Labour tends to collocate with ‘responsibilities’ or ‘duties’ (unless used in compounds like ‘human/civil rights’), while this pattern is not found in the earlier Labour corpus. There, ‘responsibilities’ co-occurs instead with mention of public authorities (e.g. ‘responsibilities of local councils’). According to Fairclough (2000: 40–1), this illustrates the individualist discourse of New Labour, which is contrasted with the traditional collectivism of ‘old’ Labour.

4. Examples of topics in corpus-based research on political discourse

In this section, we will look at three different phenomena that have been studied in political discourse using corpora and corpus tools. These are (un)favourable representations, metaphors and reported speech. (Un)favourable representations and metaphor have been widely studied in political discourse even before the advent of the corpus, while reported speech has only recently started to attract researchers’ attention.

(Un)favourable representation in political discourse

Researchers interested in studying favourable or unfavourable representations of concepts in political discourse tend to find implicit types more interesting than explicit ones, since these are less obvious and cannot be taken at face value. One long-standing topic of study in political discourse is euphemism. In George Orwell’s essay Politics and the English Language (1946), political discourse is described as ‘largely the defense of the indefensible’, by which he was referring to the use of euphemism. Euphemisms – also referred to as ‘nukespeak’ in work on political discourse (see Wilson 2003: 401) – can be described as ‘words with relatively clear definitions but meanings designed to conceal referents’ (Gastil 1992: 474). One of Orwell’s well-known examples is the term ‘pacification’, used to refer to a situation in which ‘[d]efenseless villages are bombarded from the air, the inhabitants driven out into the countryside, the cattle machine-gunned, the huts set on fire with incendiary bullets’ (Peterson and Brereton 2008: 324).

Although euphemism is understudied in a corpus context, the analysis of favourable or unfavourable representations of concepts or populations in corpora of political discourse has attracted a great deal of interest. It is an especially active area of research in critical discourse analysis (see O’Halloran, this volume).

An example of a study of lexical differences between oppositional stances is Baker (2004), who studied debates in the UK Parliament’s House of Lords on law reform for the age of consent of gay males. His corpus consisted of discourse samples on this specific topic. Guided by the most frequent lexical items used, he focused specifically on how discourses of homosexuality were constructed by the two opposing groups. What he did was to compare the keywords in two subcorpora: one consisting of the speech of the pro-reformers and the other consisting of the speech of the anti-reformers. The qualitative analysis of the keywords uncovered the ways in which the arguments were framed by the debaters (2004: 91). It was found that the pro-reformers used a ‘discourse of tolerance’, including keywords such as ‘convention’, ‘rights’ and ‘human’. The anti-reformers, on the other hand, were found to use a complex chain of argumentation involving (1) talking about homosexuality as an act rather than an identity, (2) describing the prototypical act of homosexuality as anal sex, and (3) establishing that ‘anal sex is dangerous, criminal and unnatural indulgence’ (2004: 103).

To turn to the analysis of the development of (un)favourable representations of concepts over time, Koteyko (2007) studied a diachronic corpus of texts written between 1998 and 2003 by members of the Russian pro-Communist community. Specifically, Koteyko analysed paraphrases (metalinguistic restatements of what has been said) involving English loanwords into Russian such as ‘business’ and ‘businessman’ and tracked how these are represented favourably or unfavourably throughout the period. From 1998 to 2002, a negative attitude towards business was found to be prevalent in the form of an association of ‘business’ with ‘stealing’ or ‘deception of people’: ‘[t]he theme of business as a crime is gradually developed by subsequent texts that enumerate new types of illegal activities of referents of this loanword’ (2007: 81). However, starting in the latter half of 2002, a different pattern was found to emerge, with ‘business’ being used more often in the neutral-to-positive sense, and with the expression ‘big businessmen’ only (not ‘small-scale businessmen’) being used pejoratively, paraphrased as ‘the oligarchs’.

Metaphor in political discourse

Metaphor has been shown to be highly central to political discourse. One perspective on metaphor is that it ‘plays an important rhetorical role in persuasive language because it has the potential to exploit the associative power of language in order to provoke an emotional response on the part of the hearer’ (Charteris-Black 2002: 134). This is a traditional rhetorical view. A more cognitive view stresses the idea that in studying patterns of metaphor use, it is possible to examine the ways in which a phenomenon is conceptualised.

It may prove to be a challenge to study metaphor using corpus tools, as searches typically have to be done at the word level. In order to find all examples of the conceptual metaphor SOCIAL ORGANISATIONS ARE PLANTS, for example, it is necessary to first work out a list of possible word forms associated with the metaphor (e.g. ‘branch’ and ‘grow’). Another point to consider is that metaphors are ‘not inherent in word forms but arise from the relationship between words and their contexts’ (Charteris-Black 2002: 134) – ‘path’ and ‘step’, for example, may draw on the domain of journeys, or may equally well be used literally – as in the case of ‘steps’ referring to the steps of the White House from where the US president sometimes delivers speeches.
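In practice, this means that a search can only retrieve candidates, which the analyst must then classify as metaphorical or literal by reading the context. The sketch below illustrates that first retrieval step. The list of plant-related forms is a hypothetical starter list (only ‘branch’ and ‘grow’ come from the discussion above), and the sentence splitter is deliberately naive.

```python
import re

# Hypothetical starter list of forms for the PLANTS source domain
PLANT_FORMS = {"branch", "branches", "grow", "grows", "growing",
               "grew", "grown", "root", "roots"}


def candidate_sentences(text, forms):
    """Return (matched forms, sentence) for each sentence containing a form.

    The output is a list of candidates only: literal uses are included
    and have to be removed by manual inspection.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    hits = []
    for sentence in sentences:
        words = re.findall(r"[a-z]+", sentence.lower())
        found = sorted({w for w in words if w in forms})
        if found:
            hits.append((found, sentence))
    return hits
```

For a text containing both ‘The party opened a new branch in Leeds’ and ‘A branch fell from the oak tree’, both sentences are returned, and only the analyst can decide that the first is metaphorical and the second literal.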

A somewhat easier option, from a corpus perspective, is to select a concept (e.g. immigration) and search the corpus for related words to see what metaphors are used in connection to that concept. For example, are refugees described in terms of ‘torrents’, ‘influxes’, ‘waves’ or some other natural disaster which is difficult to control? (See also O’Halloran, this volume.)

Partington (2003) analysed metaphors in a corpus of White House briefings, which were compared to a corpus of political interviews. One of the types of metaphor analysed was the orientational metaphor, which is based on movement in space. It was identified through the unusually high frequency of prepositions and particles in the White House briefings, especially involving ‘toward(s)’ and ‘forward’. The most frequent collocates of these were the various forms of the verb ‘move’, as shown in the large number of clusters such as ‘as we move forward’ and ‘to move forward with’. Based on the data, two systematic metaphors were found: PROGRESS IS FORWARD MOTION and MOVING FORWARD IS NECESSARY. The fact that these metaphors dominate the briefings shows that the press ‘sees immobility as stagnation, as culpable lethargy and so the administration must project itself as being in a state of perpetual motion’ (2003: 200). Interestingly, ‘move’ occurred in the interview material, by contrast, predominantly in reference to moving on to another topic.
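Clusters of the kind mentioned here are recurrent word sequences containing the search item, and they can be counted directly once a text has been split into words. The following is a minimal sketch, assuming a pre-tokenised, lower-cased text. The function name and the default cluster length of three words are illustrative and do not reproduce the settings used by Partington (2003).

```python
from collections import Counter


def clusters(tokens, node, n=3):
    """Count every n-word sequence that contains the node word."""
    counts = Counter()
    for i in range(len(tokens) - n + 1):
        gram = tokens[i:i + n]
        if node in gram:
            counts[" ".join(gram)] += 1
    return counts


tokens = "as we move forward we must move forward with reform".split()
print(clusters(tokens, "forward", n=2).most_common(1))
```

Sorting the resulting counts by frequency brings recurrent sequences to the top, which can then be examined in context for their metaphorical use.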

Finally, two examples of metaphor studies which incorporate cultural comparison are Musolff (2004) and Charteris-Black (2002). Musolff (2004: 437) investigated the geographical heart metaphor in British and German press coverage of EU politics in the 1990s, pointing out that the human body is a long-standing source of metaphors ‘denoting social and political entities in Western culture’, with the heart, being such an essential organ, occupying ‘a particularly prominent status in political imagery’. While the German sample contained an overwhelming number of positive claims about Germany being ‘at/in the heart of Europe’, the British sample did not place Britain in this location, but rather placed other geographically peripheral parts of Europe there (such as former Yugoslavia). Furthermore, it was found that the governmental slogan of Britain ‘being/working at the heart of Europe’ was recontextualised in the press reports in DISEASE/ILLNESS scenarios (e.g. heart failure) to express scepticism towards integration. The findings are said to reflect ‘deep-seated differences in political attitude and perception patterns’ (Musolff 2004: 449–50).

Charteris-Black (2002) undertook a cognitive-semantic and corpus-based comparison of metaphor in US inaugural speeches and UK election manifestos, where he identified lexical fields which occurred in only one of the varieties: fire/light and the physical environment in the US corpus, and plants in the UK corpus. The explanations given for these differences are cultural and historical, with the US experience of struggling for independence leading to a positive evaluation of fire metaphors and the UK passion for gardening leading to positive associations with words like ‘nurture’ and ‘growth’.

It is possible to discern a pattern of increasing reliance on corpus tools in the study of metaphor. While some corpus-based studies of metaphor (e.g. Charteris-Black 2002) start qualitatively and involve reading the corpus texts to look for potential metaphorical uses, it is becoming increasingly common for studies to start quantitatively, taking a frequency or keyword list as a starting point (e.g. Partington 2003).
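A quantitative starting point of this kind can be illustrated with a small keyword-list sketch. The choice of statistic is an assumption made for the example: log-likelihood is one measure commonly used to compare word frequencies across two corpora, but the studies cited here are not claimed to have used this exact calculation. The two sample texts are invented and far too small for real analysis.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def log_likelihood(a, b, c, d):
    """Log-likelihood for a word occurring a times in a study corpus of c
    tokens and b times in a reference corpus of d tokens."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(study_text, reference_text):
    """Rank words that are relatively more frequent in the study text."""
    study = Counter(tokenize(study_text))
    ref = Counter(tokenize(reference_text))
    c, d = sum(study.values()), sum(ref.values())
    scored = []
    for word, a in study.items():
        b = ref[word]
        # Keep only words over-represented in the study corpus.
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented miniature "corpora".
briefings = "we move forward and we move forward together"
interviews = "we talk and we listen and we talk again"

ranked = keywords(briefings, interviews)
for word, score in ranked:
    print(f"{word:<10} {score:.2f}")
```

The resulting list only points to candidate items; whether a high-ranking word such as ‘forward’ is used metaphorically still has to be established by reading the concordance lines.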

Reported speech in political discourse

Human language tends to be full of reproductions of and references to other people’s discourse. A great deal of recycling goes on in discourse – not only in the form of explicit references to what has already been said or written, but also in the form of expressions and linguistic structures (with no explicit attribution) which have been used by others. The explicit reference to what other speakers/writers have said/written is a prominent feature of many types of political discourse, such as parliamentary debates and political news reports. Direct quotation is a central tool of argumentation in parliamentary debates, where the official record is often invoked to quote political opponents. Also in governmental press conferences, what has or has not been said is highly topical.

Corpus-based studies of reported speech in political discourse are found in, for example, Bevitori (2006) and Garretson and Ädel (2008). Bevitori (2006) used a corpus of debates on Iraq in the UK House of Commons in 2003 to examine reporting verbs. She started with the types of reporting verbs employed by Members of Parliament and then analysed the context – specifically, the form of evaluative meanings and rhetorical function these verbs carry. Her explicit aim was to ‘place the range of reporting verbs in the wider context of the meaning potential of the language by moving from concordance to discourse’ (2006: 163). Studying expanded concordance lines and the semantics of the verbs used (e.g. ‘acknowledge’ or ‘suggest’), she examined whether the speaker explicitly aligns with the attributed material or not.

How was the corpus searched? As reporting structures cannot be retrieved automatically from a corpus without any manual analysis, the Bevitori study was restricted to past simple verb forms (with the exception of SAY, which is considerably more frequent than any of the other reporting verbs). A similar method was used by Garretson and Ädel (2008), who first created a list of ‘reporting words’ (e.g. all forms of the verb lemma STATE, the noun ‘statement’, and the phrase ‘according to’) in an attempt to capture as many instances as possible in their data. They then checked all hits and rejected irrelevant examples. In the case of homonymous words like ‘state’, examples like ‘the association states that misconceptions continue to affect law’ were retained and examples like ‘two dozen states that allow early voting’ were rejected.
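The two-step procedure described above (a broad search followed by manual checking) might be sketched like this. The pattern covers only the three ‘reporting words’ mentioned as examples, the third sample sentence is invented, and the function name is hypothetical; the actual list used by Garretson and Ädel (2008) was more extensive.

```python
import re

# Forms of the lemma STATE, the noun 'statement(s)' and the phrase 'according to'.
REPORTING_PATTERN = re.compile(
    r"\b(?:state[sd]?|stating|statements?|according to)\b",
    re.IGNORECASE,
)


def candidate_hits(sentences):
    """Yield (sentence, matched form) pairs for manual checking."""
    for sentence in sentences:
        for match in REPORTING_PATTERN.finditer(sentence):
            yield sentence, match.group(0)


sentences = [
    "the association states that misconceptions continue to affect law",
    "two dozen states that allow early voting",
    "According to the campaign, turnout was high.",  # invented example
]

hits = list(candidate_hits(sentences))
for sentence, form in hits:
    print(f"{form!r:<16} {sentence}")
```

Both instances of ‘states’ are retrieved, although only the first is a reporting verb. The search therefore produces candidates only, and the decision to retain or reject each hit remains a manual one.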

Garretson and Ädel’s (ibid.) study investigated a corpus of newspaper reports culled from a dozen major US newspapers and relating to the 2004 US presidential election. Some linguists have stressed the difficulty of rendering other people’s discourse without the opinions or interpretations of the person doing the reporting becoming part of the message. This led to the question of whether media bias can be reflected in reporting structures. Garretson and Ädel analysed the sources to whom statements in the corpus were attributed in order to find out who got to speak, and whether there was balance between the two sides (Democrats and Republicans) in the election. They also examined how speech was reported in the corpus with respect to the use of direct versus indirect speech, the explicitness of source identification, and the effects that the choice of reporting word can have on the portrayal of a source. Slight evidence was found of an apparent preference for one candidate or the other in certain papers, but overall no statistically significant differences that could be construed as bias were found.

5. Corpus linguistics and political discourse: looking to the future

This final section offers some reflections on possible future developments in corpus analysis of political discourse, considering new genres to study, new types of corpora to compile, and new topics to cover.

New genres to study

It may be that what is considered prototypical political discourse is changing. Political discourse is increasingly channelled through various media: television, radio, newspapers, the internet, e-mail and telephones. This has led to new (sub-)genres emerging, such as political blogs, special interest group e-mails and message boards, talk radio interviews of politicians, webcasts and political cell-phone messages (introduced in 2008 on a large scale by the Obama campaign in the US). The so-called ‘new media’ are said to play an increasingly important role in the political process, which makes these important types of discourse to study from a linguistic viewpoint. What is particularly exciting about this development, from the perspective of corpus analysis, is that there is great potential for exploring political discourse on the internet using corpus tools, not least because the material is already accessible in an electronic format.

New types of corpora to compile

It seems reasonable to make two predictions about future types of corpora in the study of political discourse. One prediction is that increasingly specialised corpora will be developed, in particular ‘topic-based corpora’, each compiled to research a specific topic (such as an armed conflict) or a specific political issue (such as global warming). The second prediction is that we will see the creation of multimodal corpora of political discourse, for example involving text, audio data and image data. This would be of great interest, as persuasive types of discourse can rely heavily on audio or on visual information. The political poster, for example, would be a suitable genre for taking a semiotic approach to both the linguistic and the visual features simultaneously. In order for this type of information to be usefully applied in corpus studies, future textual corpora would need to be annotated for multimodal features – such as images used in a political manifesto, or gestures used in a political speech.

New topics to cover

The study of political discourse would benefit from greater coverage at the pragmatic and discourse level. One thing that would help bring this about is improved general tools for manual or automated annotation of corpus data. With better opportunities for user-defined classification of data, linguistic categories which are difficult or even impossible to search for by means of surface word forms or part-of-speech tagging could be analysed in more systematic ways. Parallel to such developments, there is also a need to develop new ideas for finding ‘proxies’ for phenomena at the pragmatic or discourse level. For example, in work by Partington (2008), the speech act of teasing (both performing and responding to it) is analysed in political press conferences. As the transcripts were already marked for instances of laughter, and as laughter-talk is frequently associated with teasing, what Partington did was to use the tag [laughter] as a way of identifying instances of teasing (cf. McCarthy and Carter’s (2004) study of hyperbole using a similar method).
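The idea of using a transcription tag as a proxy can be sketched as follows. The transcript format (one string per speaker turn), the sample exchange and the function name are all assumptions made for the illustration; only the tag [laughter] itself comes from the description above.

```python
LAUGHTER_TAG = "[laughter]"


def laughter_contexts(turns, before=1):
    """Return each turn containing the laughter tag, together with the
    `before` turns that precede it, as a list of turn lists."""
    results = []
    for i, turn in enumerate(turns):
        if LAUGHTER_TAG in turn.lower():
            results.append(turns[max(0, i - before):i + 1])
    return results


# Invented exchange in a hypothetical one-turn-per-string format.
turns = [
    "Q: Will the secretary answer the question this time?",
    "A: I always answer your questions. [Laughter]",
    "Q: Moving on to the budget.",
]

contexts = laughter_contexts(turns)
for context in contexts:
    print("\n".join(context))
    print("---")
```

As with the other searches, the tag only narrows down where to look: not every instance of laughter accompanies a tease, so the retrieved stretches of talk still need to be classified by hand.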

Finally, one of the major challenges of the corpus-based approach is how to anchor linguistic findings in the social and political world – and how to do so in a systematic and scientific way. Future corpus-based studies of political discourse need to further develop theoretically sound ways of showing – to use a quote from Stubbs (2008: 1) – ‘how all the empirical information contributes to solving the great intellectual puzzles of language in society’.

Further reading

Musolff, A. (2004) ‘The Heart of the European Body Politic: British and German Perspectives on Europe’s Central Organ’, Journal of Multilingual and Multicultural Development 25(5 and 6): 437–52. (This article is a good example of a corpus-assisted study of metaphor in political discourse.)

Partington, A. (2003) The Linguistics of Political Argument: The Spin-Doctor and the Wolf-Pack at the White House. London: Routledge. (This book contains useful chapters on concordance analysis in the context of White House press briefings.)

References

Baker, P. (2004) ‘Unnatural Acts: Discourses of Homosexuality within the House of Lords Debates on Gay Male Law Reform’, Journal of Sociolinguistics 8(1): 88–106.

Barlow, M. (2000) ‘Corpus of Spoken Professional American English’, Houston, TX: Michael Barlow [producer], Athelstan (www.athel.com) [distributor].

Berber Sardinha, T. (2008) ‘Lula e a Metáfora da Conquista [Lula and the Metaphor of “Conquest”]’, Linguagem em (Dis)curso 8(1): 93–120.

Bevitori, C. (2006) ‘Speech Representation in Parliamentary Discourse. Rhetorical Strategies in a Heteroglossic Perspective: A Corpus-based Study’, in J. Flowerdew and M. Gotti (eds) Studies in Specialized Discourse. Bern: Peter Lang.

Biber, D. (1988) Variation across Speech and Writing. Cambridge: Cambridge University Press.

Charteris-Black, J. (2002) ‘Why “An Angel Rides in the Whirlwind and Directs the Storm”: A Corpus-based Comparative Study of Metaphor in British and American Political Discourse’, in K. Aijmer and B. Altenberg (eds) Advances in Corpus Linguistics. Amsterdam: Rodopi.

de Beaugrande, R. (1999) ‘Discourse Studies and the Ideology of “Liberalism”’, Discourse Studies 1(3): 259–95.

Fairclough, N. (2000) New Labour, New Language? London: Routledge.

Garretson, G. and Ädel, A. (2008) ‘Who’s Speaking?: Evidentiality in US newspapers during the 2004 Presidential Campaign’, in A. Ädel and R. Reppen (eds) Corpora and Discourse: The Challenges of Different Settings. Amsterdam: John Benjamins.

Gastil, J. (1992) ‘Undemocratic Discourse: A Review of Theory and Research on Political Discourse’, Discourse and Society 3(4): 469–500.

Graber, D. (1981) ‘Political Language’, in D. Nimmo and K. Sanders (eds) Handbook of Political Communication. Beverly Hills, CA: Sage, pp. 195–224.

Ilie, C. (2000) ‘Cliché-based Metadiscursive Argumentation in the Houses of Parliament’, International Journal of Applied Linguistics 10(1): 65–84.

Koteyko, N. (2007) ‘A Diachronic Approach to Meaning: English Loanwords in Russian Opposition Discourse’, Corpora 2(1): 65–95.

McCarthy, M. J. and Carter, R. A. (2004) ‘“There’s Millions of Them”: Hyperbole in Everyday Conversation’, Journal of Pragmatics 36: 149–84.

Milizia, D. and Spinzi, C. (2008) ‘The “Terroridiom” Principle between Spoken and Written Discourse’, International Journal of Corpus Linguistics 13(3): 322–50.

Morley, J. (2004) ‘The Sting in the Tail: Persuasion in English Editorial Discourse’, in A. Partington, J. Morley and L. Haarman (eds) Corpora and Discourse. Bern: Peter Lang.

Musolff, A. (2004) ‘The Heart of the European Body Politic: British and German Perspectives on Europe’s Central Organ’, Journal of Multilingual and Multicultural Development 25(5 and 6): 437–52.

Ochs, E. and Taylor, C. (1992) ‘Family Narrative as Political Activity’, Discourse and Society 3(3): 301–40.

Partington, A. (2003) The Linguistics of Political Argument: The Spin-Doctor and the Wolf-Pack at the White House. London: Routledge.

——(2008) ‘Teasing at the White House: A Corpus-assisted Study of Face Work in Performing and Responding to Teases’, Text and Talk 28(6): 771–92.

Pearce, M. (2005) ‘Informalization in UK Party Election Broadcasts 1966–97’, Language and Literature 14(1): 65–90.

Peterson, J. C. and Brereton, L. H. (eds) (2008) The Norton Reader: An Anthology of Nonfiction, twelfth shorter edition. New York: W. W. Norton.

Scott, M. (1999) WordSmith Tools: Version 3.0. Oxford: Oxford University Press.

Stubbs, M. (2008) ‘Three Concepts of Keywords’, revised version of a paper presented to the conference on Keyness in Text at the Certosa di Pontignano, University of Siena, June 2007, available at www.uni-trier.de/fileadmin/fb2/ANG/Linguistik/Stubbs/stubbs-2008-keywords.pdf (accessed 22 March 2009).

Teubert, W. (2002) ‘Der britische Anti-Europa-Diskurs und seine Schlüsselwörter’, Sprachreport 2: 7–12.

Westin, I. and Geisler, C. (2002) ‘A Multi-dimensional Study of Diachronic Variation in British Newspaper Editorials’, ICAME Journal 26: 133–52.

Williams, R. (1985) Keywords: A Vocabulary of Culture and Society. Oxford: Oxford University Press.

Wilson, J. (2003 [2001]) ‘Political Discourse’, in D. Schiffrin, D. Tannen and H. Hamilton (eds) The Handbook of Discourse Analysis. Malden, MA: Blackwell.