One of Grice’s (1975, 1989) major contributions to pragmatics stems from his distinction between what a sentence means linguistically and what a speaker intends to communicate when that sentence is uttered. When reviewing the contributions from experimental pragmatics, one could be justified in thinking that pragmatic research is devoted almost exclusively to the sentence meaning side because one finds there an abiding interest in particular terms, such as the quantifier some (e.g. Noveck, 2001; Pouscoulous et al., 2007; Grodner et al., 2010; Bott et al., 2012; Noveck, 2018), and the way these are understood in paradigms that use out-of-the-blue sentences with minimal contexts. As this volume suggests (for some examples, see also Degen & Tanenhaus and Breheny (Chapters 3 and 4, respectively, in this volume)), this has been a rather successful strategy for getting linguists and other cognitive researchers to appreciate the role of relatively low-level pragmatic enrichments in comprehension.

While this approach has been productive and edifying, it also comes with two drawbacks. One is that a focus on terms renders experimental pragmatics microscopic. That is, showing how a pragmatic reading of a word affects interpretation implies that that is all that matters. The other is that any Gricean would insist that the desired unit of analysis is the utterance and not individual subunits within it. There is more to pragmatic processing than understanding how an individual word is enriched. According to Grice, recognizing the linguistically encoded meaning is only part of the effort in discerning the speaker’s intention. There must be pragmatic phenomena that can non-controversially receive sentence-wide treatment in a more ecologically valid context.

As the title of the chapter suggests, one phenomenon that—prima facie—requires a listener to understand an utterance in its entirety (and in context) is irony. To appreciate how this is the case, consider a scenario (one that we will be revisiting throughout this chapter) in which an opera singer and a colleague share a moment after a performance. In one version of this scenario, it is clear that the two had sung horribly. Now, consider (1) as a comment the singer makes to her colleague in this post-performance moment:

(1) ‘Tonight we gave a superb performance!’

She is being ironic, of course. Now, if the performance had been astounding, instead of awful, the exact same sentence in (1) would no longer be ironic; it would just be a sincere, self-congratulatory literal remark. Note that there is no single isolable word in this utterance that can be declared to be the source of a pragmatic enrichment in a way similar to a scalar, a metaphor, a referent, or a presupposition.

These theoretical features also come with two experimental advantages. One is that a target item, such as the utterance in (1), can be both an experimental object as well as a control. All one needs is a small change in context to view an ironic utterance as a literal one (and vice versa). The other is that irony is a perfect testing ground for investigating what cognitive resources, other than the language ‘faculty’, are engaged during the comprehension of an utterance. Given that intention-reading ought to be central to irony comprehension, one can anticipate that Theory of Mind (ToM), a well-developed area of research, is implicated. Irony makes for an ideal testbed for investigating how ToM is implicated in utterance processing because, when compared to its (sincere) literal control, an ironic utterance arguably calls for additional ToM-related effort.

In what follows, we present a summary of the pragmatic literature on irony as well as irony processing in a historical order. This will help the reader appreciate that, early on in modern pragmatics, theoretical approaches assumed a role for intentions in irony comprehension. It will also become clear that this important feature of irony was eclipsed, or ignored, soon afterward in order to consider how literal meanings enter into figurative language processing. We will then go on to show that, while debates on the relevance or irrelevance of literal readings to irony processing persisted in the literature, work on ToM was expanding exponentially (see Rubio-Fernández, Chapter 31 in this volume). We conclude by reviewing work, including our own recent endeavours, that reasserts a role for attitude ascription in irony processing while taking advantage of insights from the ToM literature.

17.1 GRICE: BRINGING AN ATTITUDE

Spanning classical antiquity to Gricean pragmatics (e.g. from Quintilian, first century AD/1921; to Grice, 1989), one finds a rich literature on the nature and uses of irony in linguistics, rhetoric and literary studies. This literature a) accepts as fundamental the basic assertion of classic rhetoric that misdirection is the crucial feature of irony and b) labels ironic meaning as ‘the opposite of what the speaker said’. This label has become the traditional definition of irony: that is, the one found in dictionaries. However, it is not entirely satisfying. It suffices to look carefully at the example of the two singers in (1) to see two reasons why.

One is that if the singer just wanted to communicate to her colleague that their performance was awful, she could have said so. Indeed, from the point of view of ostensive-inferential communication (e.g. Grice, 1975; Sperber & Wilson, 1986/95; Grice, 1989; Carston, 2002; Wilson & Sperber, 2004), it is generally assumed that the speaker would do her best to convey the intended meaning and so she should avoid effortful and deviant formulations if they are not necessary. On the traditional definition, irony does not come with any obvious added value. The other is that the singer’s colleague probably knows that the performance was bad (or she would not be able to get the ironic meaning), so simply stating the obvious (albeit through some figurative technique) does not add anything new either. In the end, the definition of the ironic meaning as just ‘the opposite of what the speaker said’ turns irony into an effortful and meaningless communicative act.

What is missing in the above description? An ironic interpretation critically involves the ability to ascribe attitudes. In fact, the main theories of pragmatics assert that a speaker who uses an ironic remark does so in order to convey his personal attitude. Paul Grice (1967/89), who analysed irony as an instance of figurative language (while not distinguishing irony from, say, metaphor), pointed out that irony involves a ‘hostile or derogatory judgment or a feeling such as indignation or contempt’ (p. 53).

Sperber & Wilson were more specific and expansive about the role of attitude in irony when they proposed the Echoic theory of irony (e.g. Sperber & Wilson, 1981; Wilson, 2006; Wilson, 2009; Wilson & Sperber, 2012). According to them, verbal irony is a subset of the attributive use of language, which is the ascription of a thought to someone else, as in ‘John thinks that it’s Monday.’ This sentence is not directly about the actual day, but it is about another thought that the speaker attributes to some source other than himself, that is, John. The echoic use of language is a subcategory of the attributive use, which occurs when the message that the speaker wants to convey is not the content of the attributed thought but rather his own attitude or reaction to it. To make attributive and echoic uses clear, consider two different analyses of a single response to (2):

(2) John says: ‘It’s Monday.’

(3) Mary says: ‘John thinks that it’s Monday.’ (Mary is informing someone else about John’s thought because John is the only one with a calendar.)

(4) Mary says: ‘John thinks that it’s Monday!’ (Actually, it is Tuesday but John is so drunk as to have forgotten what day it is.)

While (2) is an ordinary descriptive use of language with which John describes a state of affairs, (3) is an attributive use of language because Mary’s utterance does not refer directly to the state of affairs but to John’s thought. More germanely, (4) is an echoic use of language because the content of Mary’s sentence is not so important; it is just used to convey Mary’s reaction to John’s utterance. Verbal irony is a sub-type of echoic use in which the utterance conveys a sceptical, mocking attitude about a thought that is attributed to someone else.

Irony was one of several pragmatic phenomena that Relevance Theory (Sperber & Wilson, 1986/95), a cognitive pragmatic account that is both an extension of and a challenge to Grice, sought to account for. Relevance Theory describes in detail how intended meanings are the result of the interplay between cognitive effects and effort in a particularized way with each communication. According to Sperber & Wilson’s (1986/95) principle of Relevance, determining the meaning of an utterance is part of a listener’s effort to understand the speaker’s intended meaning and that meaning is always inferred (even when it consists in a literal interpretation of an utterance). The inferences involved, however, make the comprehension of an utterance vary with respect to the effort required. Both the sentence meaning and the context contribute to making some interpretations more easily derivable than others.

Aside from being an original account that dates back at least three decades, the echoic mention account came with empirical support as well. Jorgensen et al. (1984) presented six stories that included a potentially innocuous remark at the end of each, for example The Clarks have a beautiful lawn, and showed that the perceived irony of such a remark nearly tripled as a function of the presence of an expectation made explicit earlier in the story (in the experimental condition). For example, the remark above was uttered by a character named Joe and it was potentially ironic because he was talking to Irma who had earlier given him (poor) directions for finding the Clarks’ party by saying It’s the house with the big maple tree on the front lawn. Joe’s remark was making reference to the fact that Irma had gotten the description of the house all wrong (it turned out that the Clarks lived in a walk-up apartment above a store on a street). In other words, Joe was making an echoic reference to Irma’s inexact instruction. The control condition presented the same story without Irma’s echo-able utterance in the context.

Remarkably, the Echoic Mention proposal prompted another attitude-laden account, namely the Pretense Theory (Clark & Gerrig, 1984). Clark & Gerrig claimed that the basic mechanism behind irony is not an attributive use of language but a speaker’s fake communicative act. The main idea of this proposal is that the speaker of an ironical utterance is not himself performing a speech act but pretending to perform one in order to convey a mocking, sceptical, or contemptuous attitude to the speech act itself. Like an actor on a stage, the ironic speaker stops being himself for a moment and becomes a character who asserts something that is clearly false or inappropriate. The fictive scenario should allow the speaker to distance himself from the content of the utterance in order to reveal his mocking attitude towards it. Clearly the mise en scène works only if the audience is able to understand the pretense.

While the chapter will not pursue the differences among these seminal theories, we simply point out that these accounts all share the fundamental idea that the communication of the speaker’s attitude is the hallmark of irony. All the protagonists in this Grice-inspired debate take for granted the notion that the message conveyed by an ironic remark is the speaker’s mocking, sceptical, or contemptuous attitude. Despite this shared insight, the importance of attitude in irony would not become a permanent fixture of irony debates.

17.2 THE PSYCHOLINGUISTIC APPROACH TO IRONY

Despite these early, thoughtful, attitude-rich accounts, the psychological literature on irony would soon focus on a very different Gricean claim, viz. that the literal meaning of a figurative utterance (which includes metaphor too) needs to be considered and rejected before the speaker’s intended meaning is accessed. Several researchers (including Gibbs and Glucksberg) grew doubtful of this claim.¹

Their general mode of attack was to transform the architecture of Grice’s seminal work into a psychological framework, to dub it the Standard Pragmatic Model (SPM), and to use that as Grice’s description of figurative language processing. At its simplest, the SPM is a three-step process that involves (1) the computation of the semantic/literal meaning; (2) the recognition of a violation of a maxim; and (3) the computation of an implicature. However, it has been difficult to establish that these three steps actually occur (let alone in such an order) and there is much evidence indicating that pragmatic processing is much faster than such a three-step process suggests. As far as irony is concerned, studies like Dews & Winner (1999), and Schwoebel et al. (2000), as well as some of Ortony (1979), are compatible with the SPM, but other accounts (Gibbs, 1986, 1994a) argue vociferously against it. The upshot is that findings showing fast figurative interpretations have made it easy for critics to rail against the SPM and, in so doing, the entire Gricean approach.

However, this anti-Gricean turn was not entirely justifiable. As we have pointed out elsewhere (Noveck & Spotorno, 2013), Grice never intended his theory to be used as a model of language processing (as a philosopher, Grice should not be expected to view his theory as one). Transforming Gricean theory into ‘the SPM’ is emblematic of a common pitfall in the cognitive sciences as underlined by David Marr (1982). Marr pointed out how the understanding of a phenomenon can advance at three different levels—often referred to as the computational, algorithmic, and implementational levels of analysis—and how the three levels are separate and complementary. We summarize the three briefly. The computational level makes explicit the input and output of the process as well as the constraints that would allow a specified problem to be solved. The algorithmic level describes how to get from input to output, and specifically determines which representations have to be used and which processes have to be employed in order to build and manipulate the representations. The implementational level provides a description of the physical system that should realize the process at the physical (for example, neuronal) level.

It should be clear, then, that Grice’s theory was designed at the computational level; the so-called SPM was invented to practically mimic it at the algorithmic level. However, as Marr (1982) argued, it is neither necessary nor advisable to assume that the two levels resemble each other. One can do theoretical work at the computational level without recourse to the algorithmic level and so on with any level with respect to the other two.

In any case, viewing the SPM as an architecture for processing became a fait accompli. From the mid-1980s onward, the great majority of online studies became focused on determining whether the literal meaning of an ironic utterance needs to be processed before its figurative meaning. This led to several years of concrete investigations aimed at determining the linguistic or contextual factors that make irony more or less accessible to readers. With a few exceptions (for example, see Pexman & Olineck, 2002), this line of research did not systematically make reference to theories of pragmatics that emphasized intentions or attitude ascription.² Ultimately, the back and forth concerning the role of literal meanings led to two major hypotheses and a compelling compromise. We summarize each in turn.

17.2.1 Direct Access

Gibbs’s (1994a, 2002) approach to figurative language comprehension contrasts directly with the SPM. While both Gibbs and the SPM make no distinction between irony and other forms of figurative language, Gibbs suggests similar processing mechanisms for both figurative and literal language (Gibbs, 1994a; Gibbs & Moise, 1997). This assumption is based on the notion that comprehending a literal or non-literal meaning of a sentence depends largely on pragmatic knowledge and a listener’s figurative modes of thought (Gibbs, 1994a, 2002). Both sorts of reading are determined by contextual information. In other words, contextually appropriate meanings of, say, a metaphor can be understood directly without leading to an incompatibility during semantic information processing. This is why he calls his approach Direct Access.

Evidence in favour of the Direct Access view comes from comprehension studies showing that latencies for comprehending literal and figurative readings of similar target sentences are comparable. With respect to irony, this is based primarily on Gibbs’s (1986) study on sarcasm (but see also Gibbs, O’Brien, & Doolittle, 1995). Other, more recent studies have also reported that contextual information facilitates the recognition and comprehension of sentences conveying ironic meaning (e.g. Colston & O’Brien, 2000; Colston and Colston, 2002; Ivanko & Pexman, 2003).

17.2.2 Graded salience

The SPM and the Direct Access view can be considered two extremes of a spectrum, which leaves space for other proposals. One of the most influential accounts that fills the Direct-Indirect Access gap is the Graded Salience Hypothesis, proposed by Rachel Giora (1997). According to the Graded Salience Hypothesis, the initial processing of lexical information is an encapsulated and graded process in which salient meanings of words or expressions are retrieved from the mental lexicon (Giora, 2003). Contextual information is processed in parallel with lexical processes but it neither interacts with nor inhibits salient meanings when contextually incompatible (e.g. Peleg et al., 2001; Giora, 2002). Salience is a function of properties such as familiarity, prototypicality, and frequency. In order to be salient, the meaning of a word has to be encoded in the mental lexicon. In cases where words or expressions have multiple meanings varying in salience, Giora (2003) suggests that this process is graded: more salient meanings are accessed earlier than less salient meanings. Thus, the most salient meanings are always accessed initially irrespective of their literality or contextual support. This implies that the processing of figurative sentences only diverges from that of literal sentences during later phases of processing, and only if the salient meanings that were accessed cannot be integrated with contextual information. In that case the salient meanings have to give way to less salient but contextually appropriate meanings. As opposed to the Direct Access view, contextual information is proposed to have a limited impact because it cannot restrict the initial access to salient meanings that might be contextually incompatible.
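The access order described above can be illustrated with a minimal sketch. It is a toy rendering under stated assumptions, not Giora's model: the `Meaning` class, the salience scores, and the `fits_context` flag are all hypothetical, and the serial loop simplifies a process that the hypothesis describes as running in parallel with contextual processing. The only point is that initial access is ordered by salience alone, with context mattering only at the later integration stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meaning:
    gloss: str
    salience: float      # hypothetical score, not an empirical estimate
    fits_context: bool   # can this meaning be integrated with the context?


def access_order(meanings):
    """Initial access is ordered by salience alone; context is ignored."""
    return sorted(meanings, key=lambda m: m.salience, reverse=True)


def interpret(meanings):
    """Return the first accessed meaning that fits the context, plus the
    number of meanings accessed before integration succeeded."""
    for accessed, meaning in enumerate(access_order(meanings), start=1):
        if meaning.fits_context:
            return meaning, accessed
    return None, len(meanings)


# Utterance (1) after an awful performance: the salient literal reading
# is accessed first, fails to integrate, and gives way to the ironic one.
ironic_context = [
    Meaning("the performance was superb", salience=0.9, fits_context=False),
    Meaning("the performance was awful", salience=0.2, fits_context=True),
]
# The same utterance after an astounding performance.
literal_context = [
    Meaning("the performance was superb", salience=0.9, fits_context=True),
    Meaning("the performance was awful", salience=0.2, fits_context=False),
]

chosen_ironic, cost_ironic = interpret(ironic_context)
chosen_literal, cost_literal = interpret(literal_context)
```

In this sketch the ironic context requires two access steps and the literal context one, which mirrors the claim that figurative and literal processing diverge only when the salient meaning cannot be integrated.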

Evidence for the Graded Salience Hypothesis comes from behavioural studies that emphasize how conventional meanings are more facilitative for irony comprehension than novel ones (e.g. Giora et al., 1998; Giora & Fein, 1999a; Giora et al., 2007). For example, Giora & Fein (1999a) have shown that conventional ironies (e.g. ‘Very funny.’) can be processed as easily as literal remarks, while unconventional ironies seem to require more effort. In addition, Giora and her colleagues (e.g. Giora et al., 2007; Giora, 2011; Fein et al., 2015) presented evidence against the Direct Access view by showing several instances where reading an irony takes longer than reading a literal sentence, regardless of the amount of contextual information that should facilitate the interpretation of the remark as ironic. Other evidence in favour of the Graded Salience Hypothesis comes from a study by Filik and colleagues (2014) who showed, through both eye movements and electrophysiological measures, that sentence processing is disrupted by unfamiliar (thus non-salient) ironies, while familiar ironies seem to be processed as easily as literal statements.

17.2.3 A middle way

A compelling model that tries to find a middle ground between the Direct Access view and the Graded Salience Hypothesis is the parallel constraint satisfaction account (e.g. Katz, 2005; Pexman, 2008). According to this account, various cues, such as expectations about the speaker’s style of communication, the familiarity or conventionality of the ironic utterance, and paralinguistic cues (e.g. prosody), are used in parallel during the processing of the utterance. Pexman and colleagues’ hypothesis is based on a connectionist approach according to which the stimulus (the ironic statement) activates multiple cues in a network of nodes; when the network stabilizes, the interpretation that has the highest activation value emerges as the relevant one and the alternative meanings are suppressed. In other words, an ironic interpretation is considered as soon as enough converging cues point to that interpretation. Thus, there is no need to theorize an extra (literal) stage of processing that must take place every time an ironic statement is encountered. Processing ironic meanings does not necessarily take longer than processing literal meanings, although this could happen if contextual cues are not strong enough to quickly lead to an ironic interpretation.
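The settling process just described can be sketched as a very small network. This is an illustrative toy, not the model used by Pexman, Katz, or their colleagues: the cue names, the weights in `WEIGHTS`, and the `inhibition` and `rate` parameters are all assumptions chosen for the example. Cues feed two competing interpretation nodes that inhibit each other, and the interpretation with the higher activation once the network has stabilized is taken as the winner.

```python
# Hypothetical cue-to-interpretation weights (illustrative values only).
WEIGHTS = {
    "negative_event":    {"ironic": 0.6, "literal": 0.0},
    "positive_event":    {"ironic": 0.0, "literal": 0.6},
    "sarcastic_speaker": {"ironic": 0.3, "literal": 0.0},
    "positive_wording":  {"ironic": 0.1, "literal": 0.3},
}


def settle(cues, weights=WEIGHTS, inhibition=0.5, rate=0.2, steps=200):
    """Update both interpretation nodes in parallel until they stabilize.

    `cues` maps cue names to strengths in [0, 1]. Each node drifts towards
    its cue support minus inhibition from the rival node, clipped to [0, 1].
    """
    act = {"ironic": 0.0, "literal": 0.0}
    for _ in range(steps):
        updated = {}
        for interp in act:
            rival = "literal" if interp == "ironic" else "ironic"
            support = sum(s * weights[c][interp] for c, s in cues.items())
            target = min(1.0, max(0.0, support - inhibition * act[rival]))
            updated[interp] = act[interp] + rate * (target - act[interp])
        act = updated
    return act


def winner(activations):
    """The interpretation with the highest activation after settling."""
    return max(activations, key=activations.get)


# Utterance (1) after an awful performance, from a speaker known for sarcasm.
after_bad_show = {"negative_event": 1.0, "sarcastic_speaker": 1.0,
                  "positive_wording": 1.0}
# The same utterance after an astounding performance.
after_good_show = {"positive_event": 1.0, "positive_wording": 1.0}

ironic_case = settle(after_bad_show)
literal_case = settle(after_good_show)
```

With these assumed weights, `winner(ironic_case)` is the ironic reading and `winner(literal_case)` the literal one, and neither case requires a separate literal stage before the ironic reading becomes available.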

Evidence in favour of the parallel constraint satisfaction account comes from studies suggesting that the ironic interpretation of a statement is taken into consideration early on during sentence processing, although the processing of an ironic utterance can still be more taxing than the processing of a straight literal assertion when multiple cues (e.g. the valence of the meaning, the prosody, the reader/listener expectancy) point in different directions. For example, Kowatch and colleagues (2013) showed that ironic criticism was harder to understand than literal criticism (suggesting a difference in cognitive demand) but response time and eye-gaze data showed no difference between the two conditions, suggesting that the processing of literal and ironic criticism follows similar procedures.

17.2.4 Interim conclusions

However brief, this summary captures the state of irony comprehension research in mainstream psycholinguistic venues. As we have pointed out, it took off in the mid-1980s with proposals that followed up on Grice’s. While investigations into irony processing have provided the literature with a rich set of data, the upshot is that the literature’s main debate ultimately led to an impasse because there are some data showing support for fast figurative readings and there are others that do not. This literature also treats irony as on a par with other pragmatically rich phenomena such as metaphor and idioms. As we argue, that is not justifiable, since the speaker’s attitude is especially prominent in irony. As is clear, our goal is to reintroduce attitude into discussions of irony. In order to do that, we need to appreciate a literature that incorporates attitudes, namely research into ToM, and what it has to offer.

17.3 THEORY OF MIND

Premack & Woodruff published a seminal paper ‘Does the Chimpanzee have a ‘Theory of Mind’?’ (1978) that described ToM as the ability to impute mental states to oneself and to others. According to their proposal, human ToM abilities are specialized for the rapid attribution of beliefs, intentions, desires, or knowledge to others and ourselves and in the spontaneous understanding that others have mental states that may differ from our own. Depending on which literature one peruses, synonyms for this ability are mindreading and mentalizing. The point is that this ability permeates human cognition, especially with respect to communicative exchanges. When watching people’s actions, we automatically interpret their current behaviour in light of the intentions they might have. If we see a man in a queue at a coffee machine, for example, we will be inclined to explain his behaviour by the fact that he wants a coffee, and this attribution of intention will lead us to predict that he will put coins into the machine and that he will choose his favourite coffee. Attributing mental states and predicting behaviour on that basis can thus be seen as our most natural way of grasping the social world.

Empirical work designed to tap into ToM abilities evolved through the False Belief Task (FBT; Wimmer & Perner, 1983; Baron-Cohen et al., 1985), in which a participant needs to take someone’s belief into account in order to provide the appropriate answer to a question. Crucially, the beliefs she needs to take into account differ from her own, and they are also in contrast with the actual state of affairs (hence the term ‘false’ belief). Arguably the most common version of the FBT is the ‘Sally-Anne’ paradigm. Children are told a story involving two dolls, Sally and Anne, who play with a marble. Sally puts the marble away in a basket and leaves the room. In Sally’s absence, Anne takes the marble out and plays with it. Once she has finished playing, she puts the marble away in a box. Sally returns and the child is asked where Sally will look for the marble. The child passes the task if she answers that Sally will look where she first put the marble; the child fails the task if she answers that Sally will look in the box, where the marble actually is. Passing this test has been taken to mean that one is able to represent others’ mental states and failing the test has been taken as evidence that one is as yet incapable of, or impaired in, representing another’s mental states.

This task would go on to inspire dozens of similar ones and to evolve into more sophisticated ToM measures. Consider a more complex version of the original task that calls on second-order ToM. Here, Anne changes the location of the marble thinking that Sally cannot see. However, Sally, looking through the peephole in the door, is observing the action. When Sally comes to the scene, the subject is asked where Anne thinks that Sally will look for the marble. In this task, the subject has to be able to reflect on the beliefs of a character about the beliefs of another character.
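The difference between the first-order and second-order versions of the task can be made concrete with a minimal sketch. It is an illustration only, not a model from the ToM literature: the `Move` record and the functions `first_order` and `second_order` are hypothetical names introduced here, and belief is reduced to "the last move an agent saw" (first order) or "the last move an agent thinks the other agent saw" (second order).

```python
from typing import NamedTuple, Optional


class Move(NamedTuple):
    location: str    # where the marble is placed
    witnesses: set   # agents who actually see the move
    assumed: dict    # agent -> agents that this agent thinks saw the move


def first_order(moves, agent) -> Optional[str]:
    """Where `agent` believes the marble is: the last move the agent saw."""
    belief = None
    for move in moves:
        if agent in move.witnesses:
            belief = move.location
    return belief


def second_order(moves, observer, target) -> Optional[str]:
    """Where `observer` thinks `target` believes the marble is."""
    belief = None
    for move in moves:
        seen = observer in move.witnesses
        if seen and target in move.assumed.get(observer, set()):
            belief = move.location
    return belief


both = {"Sally", "Anne"}
into_basket = Move("basket", both, {"Sally": both, "Anne": both})

# First-order task: Sally is out of the room when the marble is moved.
standard = [into_basket, Move("box", {"Anne"}, {"Anne": {"Anne"}})]

# Second-order task: Sally watches through the peephole, unknown to Anne.
peephole = [into_basket,
            Move("box", both, {"Sally": both, "Anne": {"Anne"}})]

actual_location = standard[-1].location
sally_looks = first_order(standard, "Sally")
anne_expects = second_order(peephole, "Anne", "Sally")
sally_knows = first_order(peephole, "Sally")
```

In the standard scenario `sally_looks` is the basket while the marble is actually in the box, which is the answer a child must give to pass. In the peephole scenario `anne_expects` is still the basket even though `sally_knows` is the box, so answering correctly requires representing Anne's (false) belief about Sally's belief.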

The reliability of these tasks and the insights that they have provided into ToM have led to tools for diagnosing the extent to which people fall on the Autism Spectrum (as we will see later). Together with other measures, such as Baron-Cohen et al.’s (2001) Autism-spectrum Quotient (AQ), this work can be regarded as creating a genuinely new scientific frontier, one that comes with societal benefits.

While there are arguably limits to the conclusions one can draw from the FBT, for instance with respect to the age-related claims of competence (Southgate & Hamilton, 2008), the experimental research on ToM has shown how we readily attribute mental states in order to explain and predict the behaviour of intentional agents. ToM is clearly a building block of social cognition and the investigation of the interaction between mindreading and language processing can shed light on the cognitive processes behind human communicative skills.

17.4 REINTRODUCING THEORY OF MIND TO LANGUAGE PROCESSING

As far as we know, Francesca Happé was the first to make a link between ToM and communicative phenomena. As a specialist in Autism, Happé predicted that people with Autism Spectrum Disorders (ASD), those who manifest severe impairments in ToM abilities and in communication, should reveal language comprehension competence that corresponds with different levels of mindreading abilities. Happé employed Relevance Theory, the post-Gricean theoretical framework mentioned earlier that makes explicit a role for comprehension of intentions in human communication, to clarify her expectations about how people, and especially those on the spectrum, would understand a range of figurative uses.

According to Happé, simile could be understood literally and need not rely on ToM mechanisms. ‘He was like a lion’ should not differ from ‘He was like his father’ because in both cases the hearer simply has to decide in what respect there is a similarity. Metaphor should require some understanding of intentions. In a metaphor the propositional form of the utterance is more or less a loose interpretation of the speaker’s thought (see, for example, Wilson & Carston, 2007, and Sperber & Wilson, 2008, for the relationship between loose talk and metaphor). Therefore metaphors should require first-order ToM to be properly understood. Irony should be more demanding still because the hearer has to comprehend a thought about an attributed thought, engaging second-order ToM.

Happé was interested in determining whether ToM capacities correspond with the comprehension of these three language phenomena. Her results revealed that a) individuals with ASD who fail the standard FBT are able to understand neither metaphor nor irony, but perform well with similes, b) individuals with ASD who pass first-order FBTs but not second-order FBTs master the comprehension of metaphors but not of irony, and c) only individuals with ASD who pass second-order FBTs are able to properly understand ironic remarks. In addition, a further experiment, which was included in the same paper (Happé, 1993), supported the same predictions, showing that only typically developing children who pass the second-order FBT are able to correctly understand ironies. This was the first study, since Jorgensen et al.’s (1984), to consider intention-reading as a basis for comprehending irony. Note, however, that it did not come with the kind of psycholinguistic evidence, such as reading times, that was already current by that time.

17.5 RECONCILING THEORETICAL PRAGMATIC APPROACHES WITH PSYCHOLINGUISTIC METHODS: DOING EXPERIMENTAL PRAGMATICS

In 2011, our group started to look at irony processing from a more traditional pragmatics-oriented point of view. It struck us as surprising that debates on irony processing, which had inspired multiple experimental studies, had become practically devoid of proposals that considered attitude ascription. As far as we could tell, there had been no reason to reject the attitude-laden accounts.³

We turned our attention first to neural correlates of ToM and irony because it is one domain in which ToM research had successfully evolved with neural imaging techniques. Work on FBTs led to increasingly elaborate work on ToM from dozens of researchers—Rebecca Saxe, Simon Baron-Cohen, Uta and Chris Frith, Jason Mitchell, and many others. This led to a general consensus (see Overwalle’s (2006) meta-analysis) indicating that the best candidate regions for the neural underpinnings of ToM are the right and left temporo-parietal junction (the rTPJ and lTPJ), the medial prefrontal cortex (MPFC) and the precuneus (PC). These four regions are often collectively defined as the ‘ToM network’ and their involvement in ToM-processing has been shown through studies employing different techniques such as functional Magnetic Resonance Imaging (fMRI) (e.g. Kampe et al., 2003; Saxe & Kanwisher, 2003; Saxe & Wexler, 2005; Saxe & Powell, 2006; Jenkins & Mitchell, 2010), transcranial magnetic stimulation (TMS; e.g. Kalbe et al., 2010; Lev-Rana et al., 2012) and studies involving patients (e.g. Stone et al., 2003; Shamay-Tsoory & Aharon-Peretz, 2007; Shamay-Tsoory et al., 2007).

To provide an inkling of this very rich literature, consider this straightforward study from Rebecca Saxe and colleagues (2004), in which participants watch a short film. In the experimental condition, the film consists of a person shown walking, stopping behind a bookcase for four seconds, and continuing walking. The control condition presents the same events but in a different order so that the first scene is the walker stepping out from behind the bookcase. In this way, the only difference between the experimental and control conditions is the way participants interpret one identical scene, the one showing the walker momentarily behind the bookcase. In one case, the walker stops intentionally and in the other it is the initial state of the film. One can then determine what areas of the brain are implicated when the walker is observed stopping behind the bookcase. The stopping versus the pause-prior-to-stepping-out prompts activity in the area that is now classically associated with ToM (intention attribution): the right TPJ.

17.5.1 Links between Theory of Mind and irony processing?

Early fMRI studies on irony did not reveal strong activations in the ToM network during irony processing in the way other ToM investigations do (cf. Saxe & Powell, 2006). One finds either no overlap with ToM regions (e.g. Uchiyama et al., 2011) or only partial overlap (e.g. Eviatar & Just, 2006). As we discuss in Spotorno et al. (2012), we surmised that this lack was likely due to experimental designs that prevented these studies from detecting ToM-related activations. More specifically, we noticed five shortcomings in these fMRI studies, especially when compared to studies in the psycholinguistic literature. We summarize them very briefly here. First, the vignettes portraying irony were unusually short, ranging from two to at most four sentences (Eviatar & Just, 2006; Uchiyama et al., 2006, 2011; Wang et al., 2006; Wakusawa et al., 2007; Rapp et al., 2010; Shibata et al., 2010). Second, the length of time needed to read the vignettes was typically outside the participant’s control (a vignette was presented as a block or else at a speed pre-determined by the experiment). Hence, irony reading was neither time-locked nor natural. Third, the ironies in the neuroimaging studies could be anticipated as a result of the experimental design. Most noticeable is the fact that ironic items were usually cued by negative events while literal uses of similar utterances were not. Fourth, the ironic materials were very prominent in most of these fMRI investigations, meaning that they often made up a majority of the test items, which detracts from the task’s ecological validity (according to Gibbs, 2000, frequency estimates indicate that irony represents 8% of conversational turns in talk among friends). This particular issue could easily have been dealt with by using extra filler items, but this was rarely done. Finally, not all the studies took advantage of the fact that an ironic statement comes with an ideal control (that is, the same utterance used literally in a slightly different context). In some cases (e.g. Wakusawa et al., 2007; Shibata et al., 2010), ironic and literal stories were not designed from common contexts.

In light of these shortcomings, we prepared an fMRI study on irony comprehension with better-suited materials (Spotorno et al., 2012). This paradigm set up ironies in vignettes that are a) quite long; b) self-paced; c) based on comparing identical utterances whose contexts vary only minimally in a between-subjects design; and d) arrayed with filler items, including distractor items we call decoys. Decoys are vignettes that present a negatively charged event followed by a banal (literal) remark instead of an ironic statement (see the example in Table 17.1 where a character breaks a mirror before his friend says ‘We have made a big mistake’). We also provided comprehension questions that did not concern irony to ensure that stories were understood. See Table 17.1 for a portrayal of our paradigm.

Table 17.1. A summary of a paradigm

Our results confirmed our hypothesis that irony implicates ToM areas. Both a whole-brain analysis and a region-of-interest analysis showed clear activations in the most prototypical ToM-related areas: the right and left TPJ, the MPFC, and the PC. In addition, a functional connectivity analysis showed increased coupling between the activity of the left inferior frontal gyrus and that of the MPFC, which are hubs of the language-related and the ToM-related networks, respectively (see Spotorno et al., 2012, for further details).

In a complementary study, based on electroencephalography (EEG) techniques, we aimed to better capture how mindreading interacts with language during on-line processing (Spotorno et al., 2013). In this study, we used the same paradigm except that the target sentence was presented word by word, so that the EEG signal could be recorded while the last word of the sentence was on the screen. This work was designed to investigate Event-Related Potentials (ERPs) with particular focus on the P600 component, which had repeatedly been associated with pragmatic phenomena (Coulson & Lovett, 2010; De Grauwe et al., 2010; Regel et al., 2011). We also applied a Time Frequency Analysis (TFA), which is best suited for capturing ongoing processes that are not necessarily time-locked to the presentation of a single word. As expected, we found an increase in the magnitude of the P600 when comparing the ironic condition to the literal one. We also observed an increase of power in the gamma band for the same contrast. This is an intriguing finding because it takes place early on in the comprehension process (in the 280–400 milliseconds (ms) time window). This might indicate that integration operations during irony processing start well before the latency associated with the P600 and that the integration of the linguistic code with contextual information is an ongoing process that is not necessarily confined to a late Gricean step in the comprehension of an utterance. However, it is important to bear in mind that detecting variations in gamma power at the level of the scalp (i.e. with EEG recording) can be tricky, so these effects should be replicated before firm conclusions are drawn.

Armed with revealing data from neuroimaging experiments, we used our paradigm in a more classical setting: a reading time study. We anticipated that attitude ascription could affect processing time in at least one of two ways. One is that a mindreading ability can help participants anticipate ironic readings. That is, participants can take advantage of contextual and ‘experimental’ cues to anticipate the arrival of an ironic statement. For example, we would expect that a reading task that routinely presents a negative event followed by an ironic remark would be more conducive to speeded irony readings over the course of an experimental session than a task that does not present such reliable cues. We thus predicted that we could create a set of materials that would yield an interaction in which ironic readings start out distinctly slower than literal readings early in a session and become progressively faster later on. In so doing, we would be in a position to account for the mixed results with respect to irony in the psycholinguistic literature. The other is that individuals vary in their mindreading abilities and can therefore be more or less sensitive to the cues that anticipate an ironic remark. To measure individual differences, we used Baron-Cohen et al.’s (2001) AQ. We tested both of these hypotheses in three reading time experiments (Spotorno & Noveck, 2014).

The first experiment was essentially a replication of our neuroimaging studies in that it included decoys. Again, these distractor vignettes were included to keep participants from anticipating that a negative event is a cue that an irony is on its way; the upshot is that the ironic stories, when they arrive, are viewed spontaneously. As in the prior neuroimaging studies, our reading time results revealed a significant slowdown overall in reading the ironic statements when compared to both the literal and the decoy items.

Our second reading time experiment was identical to the first except that we removed the decoys. The results were twofold. First, the data showed that the ironic target sentences prompted reading times that were longer than their literal controls, but only at the beginning of the experimental session. That is, the absence of decoys allowed readers to anticipate ironies to such an extent that, over the course of the experiment, reading times for the ironic sentences approached those for the literal ones. Second, there was a positive correlation between one subscale of the AQ (the Social Skill subscale, which includes ToM-related questions) and an individual’s Ironic-minus-Literal reading time difference in the second half of the experiment. In other words, those who scored higher on the AQ (those we described as socially disinclined) were more likely than those who scored lower (the socially inclined) to continue processing ironies at a slower pace even into the second half of the session. This correlation supports our hypothesis that ToM abilities have a role to play in producing mixed results in irony processing.
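To make the second measure concrete, the following is a minimal sketch of how an Ironic-minus-Literal reading time difference per session half might be computed and then correlated with Social Skill scores. It is not the analysis reported in Spotorno & Noveck (2014). The participant labels, reading times, subscale scores, and the function names `irony_gap` and `pearson_r` are all hypothetical and invented for illustration, and a plain Pearson correlation is assumed here only for simplicity.

```python
from math import sqrt
from statistics import mean

# Hypothetical toy data, invented for illustration only (NOT the study's data).
# Each record: (participant, session_half, condition, reading_time_ms).
TRIALS = [
    ("p1", 1, "ironic", 2400), ("p1", 1, "literal", 2000),
    ("p1", 2, "ironic", 2050), ("p1", 2, "literal", 2000),
    ("p2", 1, "ironic", 2500), ("p2", 1, "literal", 2100),
    ("p2", 2, "ironic", 2300), ("p2", 2, "literal", 2100),
    ("p3", 1, "ironic", 2600), ("p3", 1, "literal", 2200),
    ("p3", 2, "ironic", 2450), ("p3", 2, "literal", 2200),
    ("p4", 1, "ironic", 2450), ("p4", 1, "literal", 2050),
    ("p4", 2, "ironic", 2450), ("p4", 2, "literal", 2050),
]

# Hypothetical Social Skill subscale scores (higher = more socially disinclined).
SOCIAL_SKILL = {"p1": 1, "p2": 3, "p3": 6, "p4": 8}


def irony_gap(participant, half):
    """Mean ironic RT minus mean literal RT for one participant in one session half."""
    def mean_rt(condition):
        return mean(
            rt for p, h, c, rt in TRIALS
            if p == participant and h == half and c == condition
        )
    return mean_rt("ironic") - mean_rt("literal")


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


participants = sorted(SOCIAL_SKILL)
second_half_gaps = [irony_gap(p, 2) for p in participants]
r = pearson_r([SOCIAL_SKILL[p] for p in participants], second_half_gaps)

print("Second-half Ironic-minus-Literal gaps (ms):", second_half_gaps)
print(f"Pearson r with Social Skill score: {r:.2f}")
```

In this toy data set every participant shows the same gap in the first half, while the second-half gap shrinks most for those with low Social Skill scores. That is the pattern a positive correlation in the second half would reflect.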

The third experiment employed the same structure as Experiment 2 (no decoys) while investigating the effect of echoic mention (as seen in Jorgensen et al., 1984, and in Gibbs, 1986). To make this concrete, imagine that the opera singer from our favourite example explicitly indicates, before going on-stage, that she expects the two of them to perform well (by saying ‘Tonight our performance will be fantastic’). In such a case, the target utterance ‘Tonight, we gave a superb performance’ after an awful show would arguably prompt ironic interpretations more readily than in the prior versions because it echoes the singer’s hopeful prediction. We hypothesized that explicitly referring to an antecedent would either a) eliminate differences between ironic and literal reading times completely, or, at least, b) maintain the kind of interaction reported in Experiment 2.

The results of this experiment closely resemble those of the second experiment. At first blush, this could be interpreted as indicating that the to-be-echoed antecedent had no additional effect on the comprehension of irony. However, this is not quite so. The results of Experiment 3 showed a negative correlation between the Social Skill subscale and the (Irony-minus-Literal) reading time difference. While the more socially inclined distinguished between ironic and literal readings only in the first half of the session, as in Experiment 2, the socially disinclined read the ironic and the literal statements at comparable speeds from the beginning to the end of the experimental session. This is unlike Experiment 2, where the socially disinclined maintained an Irony–Literal gap across the session.

It appears then that the socially inclined and the socially disinclined participants react in distinctive ways across the two experiments. In our paper (Spotorno & Noveck, 2014), we argued that of the two cues, i) a reliable link between negative events and irony across stories and ii) the role of echoic mention, only the former can be viewed as representative of a second level of ToM. Arguably, those who are more socially inclined are seeking to better understand the structure of the task and are looking for ToM-related cues. It is less clear to us why the socially disinclined appear to take full advantage of the to-be-echoed antecedent in Experiment 3.

Taken together, the results of the reading time experiments provide strong evidence in favour of the centrality of attitude ascription in irony processing. It would be difficult to treat these data as part of a debate around the priority of the literal meaning. Instead, they show that subtle experimental manipulations bearing on attitude ascription can drastically affect the results of an experiment.

17.6 CONCLUSIONS

Over the course of this chapter we presented the main positions with respect to a lively debate around irony processing. We started with largely theoretical perspectives before taking an increasingly empirical path. We hope that two take-home messages will remain. The first is that irony research provides researchers with the opportunity to genuinely study utterances. While this approach can appear risky (because it does not focus on the effects linked to a single word), it also comes with benefits. Results from irony comprehension research, in particular, can be edifying as long as investigators are rigorous in their methods. It is important, for example, to prevent participants from anticipating the arrival of ironic remarks.

The second take-home message is that, when considering irony comprehension, attitude ascription cannot be ignored. While the reader has probably encountered the words ‘attitude’ and ‘Theory of Mind’ (or ‘ToM’) more often here than in other chapters in this volume, this insistence is valuable because attitude ascription is a) the hallmark of irony processing and b) the fil rouge that links the modern pragmatic discussions on irony to the debate about the priority of the literal meaning, as well as to the irony studies that implicate neuroscience (neuroimaging and autism). A clearer picture of irony emerges through converging evidence when mindreading is brought (back) into the discussion.

To conclude, in keeping with pragmatic theorists dating back to Grice, intention-reading (or mindreading) is central to the comprehension of a speaker’s meaning. While irony comprehension might make mindreading obvious, it should not be viewed as exceptional. Irony simply provides a useful paradigm in which to appreciate its role. Used with care, irony can indeed provide us with many useful lessons.

¹ This claim was attributed to both Grice and Searle, two of the leading lights in the emerging pragmatics literature.

² Pexman & Olineck (2002) compared their results with the theoretical positions (e.g. the Echoic account and pretence-based proposals) but this is not the focus of their paper and they found just partial overlaps with some of them (e.g. Utsumi, 2000; Colston, 2001).

³ We surmise that the absence of ToM from discussions of irony was at least partly due to the fact that it did not lend itself to psycholinguistic investigations, at least in the 1980s. It was not obvious how to manipulate ToM in an epoch where the focus was on reading times (though see Gibbs, 1986: Experiment 2). Reading times made factors like conventionality more attractive to study.

CHAPTER 17

IRONIC UTTERANCES