5
Thinking Aloud about Mental Voices
Abstract
There is a consensus that auditory verbal hallucinations (AVHs) stem from a misattribution of inner speech to an external agency. We consider whether a developmental view of inner speech can resolve some of the problems associated with inner-speech theories. We examine neurophysiological and phenomenological evidence relevant to the issue and point up some key issues for future research.
The recent development of the cognitive sciences has been marked by an increased interest in inner experience. One factor in this resurgence has been the renewed legitimacy of an interest in consciousness as a topic of scientific and philosophical inquiry (e.g., Velmans & Schneider, 2007; Zelazo, Moscovitch, & Thompson, 2007). Related to this development is a growing recognition that taking consciousness seriously demands a comparable seriousness about its phenomenal properties, such as have typically been explored through introspective methods (Hurlburt & Schwitzgebel, 2007). A third factor concerns methodological advances in techniques for studying inner experience, such as the development of the method of Descriptive Experience Sampling (DES) (Hurlburt, 1990; Hurlburt & Heavey, 2006). As the science of consciousness assumes an ever more prominent position within the cognitive sciences, there is a growing consensus that any such endeavor must pay attention to the qualitative details of inner experience.
One aspect of inner experience that is beginning to be studied systematically is inner speech or verbal thought (e.g., Wiley, 2006; Feigenbaum, 2009; Riley, 2004; Fernyhough, 2009). Both armchair introspection and more systematic investigations of inner experience point to it having a verbal quality. Baars (2003), for example, views inner speech as a constant of consciousness: “Overt speech takes up perhaps a tenth of the waking day; but inner speech goes on all the time” (7). A flow of verbal ideation has been proposed both to have a constitutive role in conscious mentation (Carruthers, 2002) and to be a source of evidence through which we know our own minds (Carruthers, 2009). Covert mental language has been proposed to structure our cognitive environments in such a way as to augment existing cognitive capacities and expand the range of tasks that our brains can perform (Clark, 1998, 2006).
Such theoretical proposals have been supported by experimental findings that disruption of covert articulatory mechanisms can impair cognitive functioning in a range of domains (e.g., Baddeley, Chincotta, & Adlam, 2001; Hermer-Vazquez, Spelke, & Katsnelson, 1999; Lidstone, Meins, & Fernyhough, 2010). Such indirect measures of inner speech use, obtained through the employment of secondary-task methodologies assumed to disrupt such speech, do not, however, allow us to distinguish between different conceptions of the phenomenon. In particular, the notion of inner speech as a stream of verbal ideation capable of describing one’s own experience and shaping new plans of action is distinctly richer than a conception of subvocal rehearsal, as inner speech has frequently been operationalized in experimental research (Jones & Fernyhough, 2007; see sec. 4 below). That said, it is likely that both conceptions of inner speech rely on common underlying cognitive and neural pathways (Al-Namlah, Fernyhough, & Meins, 2006).
A further reason why inner speech has been a focus of attention concerns the role it has played in recent theorizing about hallucinations. In this chapter, we explore the idea that the study of inner speech can be illuminating about the causes of a particular kind of anomalous experience, namely, auditory verbal hallucinations (AVHs). Our contribution is in four parts. First, we set out a model of inner speech that sees it as developing through the internalization of social dialogue. We argue that this developmental view of inner speech can help to explain some of the qualitative features of the phenomenon. In the second section, we consider the experience of AVHs and attempts to account for them in terms of disordered inner speech, before going on to consider what a developmental view can contribute to such accounts of the phenomenon. In the third section, we consider implications of the developmental view for research into the neurophysiology of inner speech. Finally, we review some recent research on the phenomenology of inner speech in both voice hearers and healthy participants.
2 A Developmental View of Inner Speech
Understanding where inner speech comes from, in terms of individual ontogenesis, may help us to understand the quality and behavior of this form of experience. The most informative writings in this respect are those of L. S. Vygotsky (1896–1934). Vygotsky ([1934] 1987) argued that inner speech represents the end point of a developmental process involving the gradual internalization of linguistic (and other semiotic) exchanges with others. A transitional stage in this process is represented by the phenomenon of private speech, where children and adults talk to themselves out loud in ongoing commentaries on their own actions.
The evidence from the study of children’s and adults’ private speech broadly supports Vygotsky’s account (e.g., Winsler, Fernyhough, & Montero, 2009). For example, Vygotsky’s predictions about relations between task difficulty and private speech production have been supported by recent research (e.g., Fernyhough & Fradley, 2005), as have hypotheses, inspired by Vygotsky’s writings, that task-relevant self-regulatory private speech will relate positively to task performance (Fernyhough & Fradley, 2005; Al-Namlah et al., 2006; Lidstone et al., 2010). Vygotsky’s claims about the developmental trajectory of private speech, particularly the prediction that it will become more covert as it is gradually internalized, are also broadly supported by recent empirical findings (Winsler & Naglieri, 2003).
Vygotsky’s ideas about private and inner speech have also inspired some theoretical contributions to the ongoing debate about the relation between language and thought. Fernyhough (1996, 2008, 2009) has used Vygotsky’s ideas as a starting point for exploring the dialogic quality of thinking. The Dialogic Thinking (DT) framework (Fernyhough, 1996, 2008) sets out some of the implications of a view of thinking as internalized, mediated interpersonal activity. Implicit in Vygotsky’s writings, although never fully spelled out by him, is the suggestion that inner speech will retain the dialogic quality of the mediated social exchanges from which it is ontogenetically derived. Furthermore, the capacity of language to make manifest differing perspectives on reality means that internalized dialogues involve a comparable richness of perspectival difference. As dialogue is internalized, so too are the various perspectives on reality that are manifested in that dialogue. The eventual developmental consequence is a “restructuring of cognition to enable the simultaneous accommodation of multiple perspectives upon a topic of thought” (Fernyhough, 2008, 232–233). A view of thinking as internalized dialogue can help to explain the flexibility and open-endedness of human cognition (Fernyhough, 1996) and has also been proposed to provide a basis for a theory of the development of social understanding (Fernyhough, 2008).
For present purposes, the most important element of the DT framework is the idea that inner speech can occur in more than one form. Specifically, the framework entails a distinction between expanded and condensed inner speech. The key to understanding the difference between these two forms of inner speech lies in the transformations that are proposed to accompany internalization.
The concomitants of internalization described by Vygotsky ([1934] 1987) can be divided into syntactic and semantic types. The main syntactic transformation is abbreviation, where the psychological subject of the utterance is dropped, leaving only the psychological predicate. For example, imagine sitting in your living room late at night and hearing a loud sound outside, like a dustbin lid being knocked to the ground. After some initial alarm, you have the thought that it is your neighbor’s cat that has knocked over the dustbin. You are, however, unlikely to experience an utterance in inner speech of the kind “next-door’s cat has just knocked the dustbin over.” Rather, the inner-speech utterance is likely to be abbreviated to a brief comment like “the cat.” Reiterating the psychological subject (the alarming noise) is unnecessary, and so only the relevant predicate (the assumed cause of the noise) is expressed.
In addition to syntactic abbreviation, Vygotsky described three kinds of semantic transformation accompanying the internalization of speech: the predominance of sense over meaning (the greater prominence of personal, private meanings compared to conventional, public ones), the process of agglutination (the development of hybrid words signifying complex concepts), and the infusion of sense (the tendency for specific elements of inner language to become infused with more semantic associations than are present in their conventional meanings).
Although other important transformations likely accompany internalization, the two categories of process described by Vygotsky illustrate the difference between expanded and condensed inner speech. In the first of these forms, internal dialogue retains many of the acoustic qualities and turn-taking properties of the external dialogue from which (developmentally speaking) it was derived. Because the process of internalization, and its accompanying syntactic and semantic transformations, are only partially complete, expanded inner speech appears phenomenally as an exchange between voices in the head that bear many of the acoustic and functional properties of external speech.
In condensed inner speech, by contrast, the interplay of semiotically manifested perspectives that constitutes inner dialogue is profoundly altered. Condensed inner speech is speech that has been fully internalized and therefore fully subjected to the transformational processes proposed to accompany internalization. As a result of these transformations, condensed inner speech loses most of the acoustic and structural qualities of external speech and approaches the state of “thinking in pure meanings” described by Vygotsky ([1934] 1987).
The distinction between these two forms of inner speech adds another level to Vygotsky’s tripartite description of the internalization of speech. Figure 5.1 depicts this resultant four-stage model (see also Fernyhough, 2004, and, for empirical support concerning the expanded-condensed distinction, McCarthy-Jones & Fernyhough, 2011). The figure also illustrates a further important feature of the model, namely, that movement between the stages of inner speech development is possible even after internalization is complete. That is, adults who have already fully internalized speech (and thus completed the developmental process described by Vygotsky) nevertheless revert to more ontogenetically primary forms of inner speech under certain conditions. In ordinary circumstances, inner speech takes a condensed form (level 4), such that the thinker is not aware of an explicit, expanded to-and-fro between different internalized voices.
Figure 5.1
Condensed inner speech can therefore be considered the default setting for inner speech. Under conditions of stress and cognitive challenge, however, condensed inner speech can be “re-expanded” into the developmentally more primitive form of inner speech, namely, expanded inner speech (level 3). A further movement back to private speech (level 2) is also possible, accounting for the evidence that adults often engage in overt private speech (e.g., Duncan & Cheyne, 2002). As will be seen in the next sections, this four-stage model of the development of inner speech can contribute in important ways to our understanding of the phenomenon of AVHs.
3 Applying a Developmental View of Inner Speech to an Explanation of AVHs
AVHs involve the experience of perceiving speech in the absence of any corresponding external stimulation. Although AVHs are usually associated with psychiatric disorders such as schizophrenia, there is a growing recognition that they can be part of normal experience (see McCarthy-Jones, 2012), with around 2 to 4 percent of the general population each year having auditory hallucinatory experiences when completely awake (Tien, 1991).
Views of AVHs as involving disordered inner speech date back to Maudsley (1886). Inner-speech theorists see hallucinations as arising when utterances in inner speech are misattributed to external agents (e.g., Leudar et al., 1997; Bentall, 2003). Common to all such approaches is the challenge of explaining why hallucinated voices appear to consciousness as both alien and at the same time somehow “of the self” (Leudar & Thomas, 2000; Fernyhough, 2004).
Despite their prominence among theoretical explanations of AVHs, inner-speech accounts have achieved only limited success (McCarthy-Jones, 2012). Jones and Fernyhough (2007) have set out some key explanatory challenges that any successful theory of AVHs must meet. First, such a theory must explain why the hearer has the experience of perceiving a voice without any corresponding external stimulation. Second, it must explain why the voice appears to be generated by a person other than the self. Third, it must account for the fact that the hallucinated voice often has person-specific acoustic properties (such as accent or timbre) different from the hearer’s own. Fourth, it must explain the particular content and pragmatics of hallucinated voices, such as the fact that they often take the form of commands (McCarthy-Jones et al., in press; Nayani & David, 1996).
How well have inner-speech theories fared in meeting these challenges? They have had some success at explaining why a voice is heard in the absence of any stimulus. As Jones and Fernyhough (2007) point out, both inner speech and AVHs involve “some form of internal verbal mentation, or ‘voice in the head’” (141), whose content typically relates to the individual’s ongoing behavior. A misattribution of the source of that voice as a result of cognitive bias (Bentall, 1990) or neurocognitive deficit (Frith, 1992) can therefore plausibly be related to the experience of hallucination.
Inner-speech theories have had less success at explaining the “alien yet self” paradox. If AVHs result from a simple misattribution of an utterance in inner speech to an external agency, it is difficult to see why they are also acknowledged to have an internal origin, except through a process of inference from the fact that no external speaker is actually present. Inner-speech theories also struggle to explain why AVHs frequently have person-specific characteristics that are alien to the hearer. If inner speech is the voice of the self, then how is it apparently populated by other voices? Finally, inner-speech theories have had little success at explaining the characteristic content and pragmatics of AVHs, particularly the predominance of self-directed commands.
We submit that among the reasons for the limited success of inner-speech theories of AVHs is that they have not had a developmental account of the phenomenon. We noted earlier that in developmental, cognitive, and phenomenological respects, inner speech is underdescribed. Inner-speech accounts of AVHs have generally ignored the developmental transformations proposed to accompany the internalization of social and private speech. In particular, such theories have made no distinction between expanded (level 3) and condensed (level 4) inner speech. In addition, the pragmatic functions of inner speech, as a developmental derivative of self-regulatory private speech, have largely been ignored.
Fernyhough (2004) set out to address these imbalances by exploring the implications of a Vygotskian developmental view of inner speech for theorizing about AVHs. Specifically, he built on the assumption that transition between levels of inner and outer speech is possible in adulthood, as well as throughout ontogenesis. Distinguishing between two possible models that might result from such a treatment, he favored one in which condensed inner speech (level 4) is temporarily re-expanded into expanded inner speech (level 3). As noted earlier, this re-expansion is particularly likely to occur under conditions of stress and cognitive challenge. When combined with other cognitive biases and patterns of metacognitive beliefs, this can lead to sudden intrusion into consciousness of other voices, and thus the experience of AVHs.
Several empirical predictions follow from the re-expansion model (Fernyhough, 2004). First, one would predict that AVH hearers will not experience normal expanded (level 3) inner speech. Second, the model predicts that AVH hearers will experience normal condensed (level 4) inner speech. Third, AVHs should be associated with conditions of stress and cognitive challenge. Fourth, AVHs would be predicted to occur in psychiatrically healthy individuals under conditions of extreme stress and cognitive challenge.
Empirical evidence relating to some of these predictions is considered in the next sections (see also Fernyhough, 2004). The re-expansion model of AVHs would appear to have several immediate advantages over conventional inner-speech theories of AVHs. As with other such accounts, the re-expansion model is well placed to explain why AVH hearers experience voices in the absence of any external stimulation, since such voices are viewed as products of inner speech. The “alien yet self” paradox that proves a particular challenge for inner-speech theories seems more tractable for the re-expansion model. This is because inner speech, dialogic by default, already involves a heterogeneity of voices, by virtue of the particular developmental pathway through which it emerges. Where inner speech is concerned, “the inner is always at least partly outer” (Fernyhough, 2004, 62).
The re-expansion model can also account for the person-specific acoustic properties associated with AVHs. Because hallucinations occur during episodes of expanded inner speech, the syntactic and semantic transformations that accompany internalization are temporarily reversed, and the resulting voices regain many of the acoustic qualities of the external speech from which they derive, including the accent and timbre of particular speakers. A similar line of reasoning can account for the particular functional properties of AVHs, such as the predominance of self-directed commands. Analysis of children’s private speech shows that, as Vygotsky predicted, it has a self-regulatory function, with many utterances expressed as commands to the self (Luria, 1961). Given the proposed developmental continuity between private and inner speech, it is unsurprising that AVHs frequently have a self-directive quality (see sec. 4 for further discussion of this point).
A further advantage of the re-expansion model is that it meets an acknowledged need (Bentall et al., 2007) for a developmental approach to psychosis. In providing an account of how “normal” transitions between different levels of internal and external speech can result in anomalous experiences, the model is similarly congruent with approaches that see these experiences as existing on a continuum between typical and atypical experience. As we will see in the next section, the model’s distinction between expanded and condensed inner speech also helps to straighten out some paradoxes in the neuroimaging literature.
4 Neuroimaging of Inner Speech and AVHs
The influence of inner-speech theories of AVHs has led researchers to use neuroimaging techniques to examine the neural correlates of inner speech. Two studies have directly compared the neural activation associated with a variety of inner-speech tasks between individuals who experience AVHs and healthy controls (McGuire et al., 1995; Shergill, Bullmore, Simmons, Murray, & McGuire, 2000). One such study (Shergill et al., 2000) used functional magnetic resonance imaging (fMRI) to compare the neural activation associated with inner speech between healthy controls and patients diagnosed with schizophrenia who had prominent AVHs but were in remission at the time. In the “inner speech” condition, participants heard a word (e.g., swimming) spoken in a neutral tone presented via an audio recording. They were then asked silently to articulate this word in a sentence with the form “I like . . .,” ending with the presented word. In addition to this task, participants were asked to perform auditory verbal imagery (AVI). This involved imagining the same phrases being spoken. Participants were asked to perform first-person AVI (imagining the same sentence spoken in their own voice), second-person AVI (imagining the sentence in the form “You like . . .” being spoken in the voice heard on the audio recording), and third-person AVI (imagining the sentence in the form “he/she likes . . .” being said in the voice from the audio recording).
The study found that the neural activation associated with the “inner speech” condition did not differ between patients and controls. In contrast, AVH hearers in remission who imagined others speaking to them showed less activation than controls in the posterior cerebellum, hippocampal complex, and lenticular nuclei bilaterally, and also in the right thalamus, middle and superior temporal gyri, and left nucleus accumbens. This pattern of findings (no differences during inner speech combined with differences during AVI between subjects with AVHs in remission and healthy controls) replicated the findings of the earlier positron emission tomography study using a similar methodology (McGuire et al., 1995).
The researchers’ explanation for these findings was based on the observation that AVI involves greater levels of verbal self-monitoring (VSM), the cognitive capacity responsible for monitoring inner speech (e.g., McGuire et al., 1995). Mentally imitating another voice and internal inspection of this imagined speech (necessary to assess whether the voice has the prosody, tone, pitch, and rhythms of the voice it is intended to be; see McGuire et al., 1996) is thought to place high demands on the VSM system. The researchers proposed that participants with AVHs have impaired VSM skills, and as the generation of inner speech demands only low levels of this skill, no differences are detectable between participants with AVHs and controls. In contrast, on tasks high in VSM, such as generating AVI, the impaired VSM of participants with AVHs makes neural differences observable.
These neuroimaging findings seem to present something of a paradox. Inner speech is meant to be the raw material of AVHs, and yet it does not differ, neurologically, between patients and controls. However, neurophysiological differences do obtain on tasks that are not to do with inner speech as it is usually construed, but instead involve imagining another voice talking to or about you. As we have previously noted (Jones & Fernyhough, 2007), researchers need to explain why AVH hearers might be performing verbal mentation in inner speech that uses the same cognitive resources that are involved with imagining people speak. This, we have proposed, can be resolved through a Vygotskian interpretation of inner speech as elaborated in the previous section (Jones & Fernyhough, 2007).
Neuroimaging studies of AVHs typically draw on a definition of inner speech proposed by Levine, Calvanio, and Popovics (1982), namely, the “subjective phenomenon of talking to oneself, of developing an auditory-articulatory image of speech without uttering a sound” (391). This leads neurophysiological researchers to assume implicitly that the silent articulation of sentences represents inner speech (e.g., Shergill et al., 2001). As such, patients and controls are said to be performing inner speech when they mentally recite sentences such as “You are stupid” (McGuire et al., 1995, 597). However, such a conception is somewhat impoverished relative to the conception of inner speech set out in the previous section, according to which inner speech retains several features of the external dialogic exchanges from which it derives. Thus, we would argue, what the neurophysiological researchers assume elicits “inner speech” (asking participants to articulate silently) is not actually eliciting inner speech at all.
Conversely, the neurophysiologists’ baseline condition (in which they assume that no inner speech is taking place) is actually likely to involve inner speech. Specifically, in a study such as that of Shergill et al. (2001), the neural correlates of inner speech are examined by subtracting the level of baseline activation (in which subjects listen to words, and where no inner speech is assumed to be occurring) from an “inner speech” condition in which they are silently articulating sentences. However, this baseline condition is likely to involve ongoing condensed (level 4) inner speech. The results of such studies would therefore appear to be contaminated by the persistence of a form of inner speech even into the baseline condition. This leads to the awkward conclusion that when neurophysiological researchers assume that inner speech is occurring, it is probably not, and when they assume it is not happening, it probably is.
This apparent paradox, concerning why AVH hearers might be performing verbal mentation in inner speech that uses the same cognitive resources that are involved with imagining people speak, begins to fall away when we consider the dialogic nature of inner speech. If we assume that inner speech has a dialogic nature and incorporates a multiplicity of internalized voices, we should not be surprised that when AVH hearers perform inner speech, they use cognitive resources involved with imagining people speak. Indeed, this is precisely what expanded inner speech is likely to involve in all of us. Furthermore, if the re-expansion model of AVHs is correct, and expanded dialogues do indeed form the raw material of AVHs, it is unsurprising that the existing thoughts and ideas of voice hearers may come to be reflected in part or much of the content of the AVHs (Leudar & Thomas, 2000). A second implication of the conception of inner speech used by the re-expansion model is that it predicts the characteristic pragmatic qualities of AVHs. Specifically, it allows us to understand how AVHs frequently take the form of commands. Nayani and David (1996) found that command hallucinations such as “get the milk” or “go to the hospital” were reported by 84 percent of voice hearers. As inner speech is developmentally linked with the control of action (Luria, 1961; Vygotsky, [1934] 1987), we should not be surprised that AVHs frequently have a similar regulatory quality.
Jones and Fernyhough (2007) set out a number of empirical predictions that neuroimaging studies may be able to address in future research. First, the neurological correlates of condensed (level 4) inner speech should not differ between patients and controls. As attenuation of the acoustic characteristics of speech in level 4 (condensed) inner speech is, we suggest, likely to result in a reduced VSM load in inner speech of this kind, this prediction is consistent with the neuroimaging data showing that low-VSM tasks do not differ between patients and controls. Second, any neurological differences should show up only on tasks eliciting expanded (level 3) inner speech, in which VSM demands are increased. Finally, neural correlates of AVHs in patients should be similar to patterns observed during expanded (level 3) inner speech in healthy controls. In support of the last hypothesis, brain researchers have already noted that “the pattern of activation we observed during auditory hallucinations is remarkably similar to that seen when healthy volunteers imagine another person talking to them (auditory verbal imagery)” (Shergill et al., 2000, 1036).
The observed commonalities between inner speech and AVHs lead us to the broader question of to what extent the phenomenology of inner speech parallels that of AVHs. In the next section, we consider these details of the relation between the two forms of experience.
5 Phenomenology: What Are Inner Speech and AVHs Like?
One key issue for any inner-speech theory, including the re-expansion model highlighted here, concerns the similarity of the phenomenology of inner speech to the phenomenology of AVHs (Jones, 2010). If the relation of inner speech to AVHs were to be one of apples to oranges, then this would clearly be a prima facie case against inner-speech theories of AVHs. Such a comparison may start with a specific question: in terms of its form, function, and pragmatics, is the inner speech of voice hearers similar to their voices?
To date, only one study has asked AVH hearers about their inner speech (Langdon, Jones, Connaughton, & Fernyhough, 2009). Twenty-nine AVH-hearing patients underwent an in-depth interview about AVHs and inner speech, and their responses were compared to reports from 42 healthy (non-voice-hearing) controls. The study found that reports of the frequency of occurrence of inner speech did not differ between patients and controls. There was also no difference in the amount of expanded dialogic (level 3) inner speech between controls and voice hearers. No differences existed between the two groups in terms of other characteristics of inner speech, such as intelligibility, speed, and pragmatics. There was a nonsignificant trend toward fewer patients with AVHs than controls reporting dialogic inner speech (i.e., answering in the affirmative to questions about inner speech as a back-and-forth conversation).
The study also investigated the concordance between the phenomenological qualities of the voice hearers’ voices and their inner speech. No relation was found between measures of the speed, volume, and intelligibility of patients’ inner speech and equivalent measures for hallucinated voices. Furthermore, there was no relation between the tendency for voice hearers to experience their AVHs as talking to them directly and their tendency to talk directly to themselves in their own thoughts. Similarly, there was no relation between the experience of hearing voices conversing and voice hearers’ tendency to experience thinking as a conversation with oneself. Likewise, there were no concordances between measures of the usage of personal names and second-person or third-person pronouns in inner speech and the frequency with which similar terms of address were used by voices.
In conclusion, the study of Langdon et al. (2009) offers evidence that inner speech is unimpaired in subjects who hear voices, and can operate normally as a phenomenon distinct from patients’ AVHs. Such a conclusion is consistent with the neuroimaging research presented in the previous section, which showed no neural abnormalities in inner speech conceived as subvocal rehearsal in remitted voice hearers. Langdon et al.’s study did not, however, examine how often voice hearers performed auditory verbal imagery (the creation of the voices of other people). Such a study would likely be of great interest because, as highlighted in the previous section, the involvement of other voices in dialogic inner speech, coupled with the finding that it is in AVI that we find neural abnormalities in subjects who hear voices, suggests that AVI would be a promising candidate for the raw material of AVHs. Hoffman, Varanko, Gilmore, and Mishara (2008) have also recently made a similar point, noting that “source monitoring mislabeling may selectively attach to verbal imagery of non-self speakers rather than ordinary inner speech” (1172).
One challenge for any inner-speech account of AVHs is to explain why voice hearers frequently experience negative comments about themselves or their ongoing actions (McCarthy-Jones et al., in press; Nayani & David, 1996). The evidence actually indicates that the dialogic inner speech of such individuals is likely to involve negative perspectives on the self. Specifically, substantial evidence demonstrates that AVHs are often, though not always, associated with earlier experiences of physical and sexual abuse (Offen, Waller, & Thomas, 2003; Read & Argyle, 1999). It is hence plausible that such events may form a key part of the stream of consciousness of such individuals, with the content likely being hyperaccessible (Wenzlaff & Wegner, 2000). Both the re-expansion model and other theories linking inner speech to AVHs would thus predict that the content of AVHs in people in such situations would in some way be related to such traumatic events.
A conception of inner speech as shot through with other voices, possibly those involved in traumatic events in the individual’s past, thus appears to have reasonable phenomenological concordance with some facets of AVHs. However, this still leaves some puzzles to be solved. First, AVHs are typically reported as having the phenomenological quality of being heard. For example, in a study by Leudar et al. (1997), all voice-hearing patients with a diagnosis of schizophrenia reported that it was “very much like hearing other people speak” (889). One approach to reconciling this with the phenomenology of inner speech has been to suggest that inner speech has particularly developed acoustic properties in voice hearers, in comparison to expanded (level 3) inner speech in healthy individuals. For example, Moritz and Larøi (2008) found that approximately 40 percent of patients with schizophrenia with AVHs rated their own thoughts (defined as cognitions that the participant deliberately initiated or contemplated, such as thinking about how to respond to a particular question) as having some acoustic properties (as opposed to being silent). This led the authors to argue that AVHs may be associated with abnormalities in sensory inner perception “which apparently arise already at the stage of thoughts.” Such a conclusion would, however, conflict with the assumption made in the re-expansion model that condensed (level 4) inner speech in voice hearers is no different from that in healthy controls.
An alternative approach is to question the degree to which an experience’s being labeled as a voice has to do with its acoustic properties. Stephens and Graham (2000) have argued that “something can count as a voice without being experienced as audition-like or mistaken for sensory perception of another’s speech” (114). In line with this claim, not all AVHs have the phenomenal qualities of a heard voice. Bleuler (1952) noted that some “patients are not always sure that they are actually hearing the voices or whether they are only compelled to think them. There are such ‘vivid thoughts’ which are called voices by the patients” (110).
A second limitation of all inner-speech-based models is that they do not seem appropriate for the approximately 10 to 20 percent of individuals whose voices have content that can be linked directly back to memories of trauma (Jones, 2010). These instead appear better modeled as verbatim intrusions from memory. A third limitation, as Waters, Badcock, Michie, and Maybery (2006) have noted, is that such models cannot explain other types of AVH, such as the voices of crowds, or nonverbal auditory hallucinations, such as environmental noise, music, and silence (see Kennedy, this volume). Indeed, both McCarthy-Jones et al. (in press) and Nayani and David (1996) found hallucinations of music to be quite frequent, occurring in 46 percent and 36 percent of voice-hearing patients respectively.
Fourth, it may be worth considering Hurlburt and Schwitzgebel’s (2007) distinction, on the basis of their DES study, between inner speech and inner hearing. These authors note that while inner speech is experienced as “going away,” “produced by,” and “under the control of” the individual and is “just like speaking aloud except no sound,” in contrast inner hearing is the experience of a sound that is “coming toward,” “experienced by,” and “listened to” by the individual (257). In these terms, many AVHs are more phenomenologically consistent with inner hearing than inner speech. If Hurlburt and Schwitzgebel’s distinction is validated by future experience sampling research, one might conclude that the typical balance between inner speaking and inner hearing is distorted in voice hearers as compared to healthy controls.
Despite these outstanding problems, it is interesting to note that the model of inner speech laid out here can also explain the phenomenal quality of one of the more unusual forms of AVHs. As noted in section 2, Vygotsky claimed that inner speech becomes a process of “thinking in pure meanings,” which we have characterized as condensed (level 4) inner speech. Is it possible that condensed, as well as expanded, inner speech could form the raw material for AVHs? While such a possibility would require the revision of the re-expansion model, it would lead to the prediction that, in addition to fully formed words or sentences being experienced as AVHs, some hallucinatory experiences would also have this quality of “pure meaning.” Some such types of AVHs have indeed been documented and were designated by Bleuler (1952) as “soundless voices” (110). In such AVHs, a message or meaning is communicated although it is not actually heard. For example, a patient of Bleuler’s who threw himself into the Rhine reported afterward that “it was as if someone pointed his finger at me and said, ‘Go and drown yourself’” (111; italics added). The prominent psychiatrist/psychologist Pierre Janet also noted this phenomenon, giving the example of a patient who reported that “it is not a voice, I do not hear anything, I sense that I am spoken to” (Leudar & Thomas, 2000). However, although no reliable data exist on the prevalence of such AVHs, they seem to be much rarer than their counterparts involving fully formed words and sentences. It is interesting to speculate that such AVHs may form a midpoint on an auditory-ideational spectrum ranging from, at one extreme, clear AVHs, through these “soundless voices,” to intrusive thoughts that may form the basis of delusions (e.g., “that’s a police helicopter searching for me”). 
Such a conclusion would, however, require us to go beyond the re-expansion model set out in Fernyhough (2004), toward a model where condensed (level 4) inner speech could also be misattributed to an external agency. This would then leave the puzzle of how an element of inner experience that had no or few perceptual qualities (i.e., condensed inner speech) could be misperceived as anything.
In conclusion, various aspects of the phenomenology of inner speech, including its self-regulatory nature, its linkage to ongoing events, its involvement of the voices and perspectives of others, its ability to take the form of “thinking in pure meanings,” and its creative nature, are consistent with the phenomenological properties of a large proportion of AVHs. That said, a number of properties of AVHs, such as their tendency to be associated with other nonverbal forms of auditory hallucinations, their “heard” nature, and in some cases their similarity to verbatim memories, limit their resemblance to inner speech. Despite these limitations and the need for future empirical testing, inner-speech theories have the starting advantage of proposing a raw material for AVHs that accords with the phenomenology of a significant number of such hallucinations. It may be that some subsets of AVHs have inner speech as their raw material, and other AVHs do not involve inner speech (McCarthy-Jones et al., in press; McCarthy-Jones, 2012; Jones, 2010). Indeed, such a claim is consistent with recent work on phenomenological subtypes of AVHs (McCarthy-Jones et al., in press). The tendency to hallucinate may be associated with alterations to neurological mechanisms that result in both inner-speech-based processes (such as the re-expansion of inner speech) and memory-based processes (such as the intrusions of material from memory) being experienced as hallucinatory phenomena. In this way, a single underlying neurological change may impact across many cognitive domains.
We have proposed that close attention to the developmental origins of inner speech can be illuminating about verbal thought in adulthood, and about the cognitive, neurophysiological, and phenomenological properties of AVHs. A developmental account can show up contradictions and inconsistencies in existing inner-speech accounts of AVHs, as well as resolving some problems that such accounts face. That said, many empirical and conceptual puzzles remain to be solved. With the exception of the work of Hurlburt (1990), research into the phenomenal qualities of AVHs has relied on introspective and retrospective methods. Further development of methods for experience sampling such as DES is likely to shed further light on the nature of both forms of inner experience considered here.
We also note that our account would appear to have little to say about hallucinations in other modalities, such as the visual and olfactory. Our skepticism that there can be a modality-general account of hallucinatory experiences may turn out to be misplaced, but we nevertheless maintain that the puzzle of AVHs will never fully be resolved until we make further progress in understanding the mysteries of inner speech.
Acknowledgments
Charles Fernyhough gratefully acknowledges the support of a fellowship from the Institute of Advanced Study, Durham University, and a Strategic Award from the Wellcome Trust (grant no. WT098455MA).
References
Al-Namlah, A. S., Fernyhough, C., & Meins, E. (2006). Sociocultural influences on the development of verbal mediation: Private speech and phonological recoding in Saudi Arabian and British samples. Developmental Psychology, 42, 117–131.
Baars, B. J. (2003). How brain reveals mind: Neural studies support the fundamental role of conscious experience. Journal of Consciousness Studies, 10, 100–114.
Baddeley, A., Chincotta, D., & Adlam, A. (2001). Working memory and the control of action: Evidence from task switching. Journal of Experimental Psychology: General, 130, 641–657.
Bentall, R. P. (1990). The illusion of reality: A review and integration of psychological research on hallucinations. Psychological Bulletin, 107, 82–95.
Bentall, R. P. (2003). Madness explained. London: Penguin.
Bentall, R. P., Fernyhough, C., Morrison, A. P., Lewis, C., & Corcoran, R. (2007). Prospects for a cognitive-developmental account of psychotic experiences. British Journal of Clinical Psychology, 46, 155–173.
Bleuler, E. (1952). Dementia Praecox or the group of schizophrenias. New York: International Universities Press.
Carruthers, P. (2002). The cognitive functions of language. Behavioral and Brain Sciences, 25, 657–726.
Carruthers, P. (2009). How we know our own minds: The relationship between mindreading and metacognition. Behavioral and Brain Sciences, 32, 121–182.
Clark, A. (1998). Magic words: How language augments human cognition. In P. Carruthers & J. Boucher (Eds.), Language and thought: Interdisciplinary themes (pp. 162–183). Cambridge: Cambridge University Press.
Clark, A. (2006). Language, embodiment, and the cognitive niche. Trends in Cognitive Sciences, 10, 370–374.
Duncan, R. M., & Cheyne, J. A. (2002). Private speech in young adults: Task difficulty, self-regulation, and psychological predication. Cognitive Development, 16, 889–906.
Feigenbaum, P. (2009). Development of communicative competence through private and inner speech. In A. Winsler, C. Fernyhough, & I. Montero (Eds.), Private speech, executive functioning, and the development of verbal self-regulation. Cambridge: Cambridge University Press.
Fernyhough, C. (1996). The dialogic mind: A dialogic approach to the higher mental functions. New Ideas in Psychology, 14, 47–62.
Fernyhough, C. (2004). Alien voices and inner dialogue: Towards a developmental account of auditory verbal hallucinations. New Ideas in Psychology, 22, 49–68.
Fernyhough, C. (2008). Getting Vygotskian about theory of mind: Mediation, dialogue, and the development of social understanding. Developmental Review, 28, 225–262.
Fernyhough, C. (2009). Dialogic thinking. In A. Winsler, C. Fernyhough, & I. Montero (Eds.), Private speech, executive functioning, and the development of verbal self-regulation. Cambridge: Cambridge University Press.
Fernyhough, C., & Fradley, E. (2005). Private speech on an executive task: Relations with task difficulty and task performance. Cognitive Development, 20, 103–120.
Frith, C. D. (1992). The cognitive neuropsychology of schizophrenia. Hove, UK: Lawrence Erlbaum.
Hermer-Vazquez, L., Spelke, E. S., & Katsnelson, A. S. (1999). Sources of flexibility in human cognition: Dual-task studies of space and language. Cognitive Psychology, 39, 3–36.
Hoffman, R. E., Varanko, M., Gilmore, J., & Mishara, A. L. (2008). Experiential features used by patients with schizophrenia to differentiate “voices” from ordinary verbal thought. Psychological Medicine, 38, 1167–1176.
Hurlburt, R. T. (1990). Sampling normal and schizophrenic inner experience. New York: Plenum.
Hurlburt, R. T., & Heavey, C. L. (2006). Exploring inner experience: The Descriptive Experience Sampling method. Amsterdam: John Benjamins.
Hurlburt, R. T., & Schwitzgebel, E. (2007). Describing inner experience? Proponent meets skeptic. Cambridge, MA: MIT Press.
Jones, S. R. (2010). Do we need multiple models of auditory verbal hallucinations? Examining the phenomenological fit of cognitive and neurological models. Schizophrenia Bulletin, 36, 566–575.
Jones, S. R., & Fernyhough, C. (2007). Neural correlates of inner speech and auditory verbal hallucinations: A critical review and theoretical integration. Clinical Psychology Review, 27, 140–154.
Langdon, R., Jones, S. R., Connaughton, E., & Fernyhough, C. (2009). The phenomenology of inner speech: Comparison of schizophrenia patients with auditory verbal hallucinations and healthy controls. Psychological Medicine, 39, 655–663.
Leudar, I., & Thomas, P. (2000). Voices of reason, voices of insanity: Studies of verbal hallucinations. London: Routledge.
Leudar, I., Thomas, P., McNally, D., & Glinski, A. (1997). What voices can do with words: Pragmatics of verbal hallucinations. Psychological Medicine, 27, 885–898.
Levine, D. N., Calvanio, R., & Popovics, A. (1982). Language in the absence of inner speech. Neuropsychologia, 20, 391–409.
Lidstone, J. S. M., Meins, E., & Fernyhough, C. (2010). The roles of private speech and inner speech in planning in middle childhood: Evidence from a dual task paradigm. Journal of Experimental Child Psychology, 107, 438–451.
Luria, A. R. (1961). The role of speech in the regulation of behavior. Harmondsworth: Penguin.
Maudsley, H. (1886). Natural causes and supernatural seemings. London: Kegan Paul, Trench & Co.
McCarthy-Jones, S. (2012). Hearing voices: The histories, causes and meanings of auditory verbal hallucinations. Cambridge: Cambridge University Press.
McCarthy-Jones, S. R., & Fernyhough, C. (2011). The varieties of inner speech: Links between quality of inner speech and psychopathological variables in a sample of young adults. Consciousness and Cognition, 20, 1586–1593.
McCarthy-Jones, S., Trauer, T., Mackinnon, A., Sims, E., Thomas, N., & Copolov, D. L. (in press). A new phenomenological survey of auditory hallucinations: Evidence for subtypes and implications for theory and practice. Schizophrenia Bulletin.
McGuire, P. K., David, A. S., Murray, R. M., Frackowiak, R. S. J., Frith, C. D., Wright, I., et al. (1995). Abnormal monitoring of inner speech: A physiological basis for auditory hallucinations. Lancet, 346, 596–600.
McGuire, P. K., Silbersweig, D. A., Murray, R. M., David, A. S., Frackowiak, R. S. J., & Frith, C. D. (1996). Functional anatomy of inner speech and auditory verbal imagery. Psychological Medicine, 26, 29–38.
Moritz, S., & Larøi, F. (2008). Differences and similarities in the sensory and cognitive signatures of voice-hearing, intrusions and thoughts. Schizophrenia Research, 96–107.
Nayani, T. H., & David, A. S. (1996). The auditory hallucination: A phenomenological survey. Psychological Medicine, 26, 177–189.
Offen, L., Waller, G., & Thomas, G. (2003). Is reported childhood sexual abuse associated with the psychopathological characteristics of patients who experience auditory hallucinations? Child Abuse and Neglect, 27, 919–927.
Read, J., & Argyle, N. (1999). Hallucinations, delusions, and thought disorder among adult psychiatric inpatients with a history of child abuse. Psychiatric Services, 50, 1467–1472.
Riley, D. (2004). “A voice without a mouth”: Inner speech. In J. J. Lecercle & D. Riley (Eds.), The force of language. Basingstoke: Palgrave Macmillan.
Shergill, S. S., Bullmore, E. T., Brammer, M. J., Williams, S. C. R., Murray, R. M., & McGuire, P. K. (2001). A functional study of auditory verbal imagery. Psychological Medicine, 31, 241–253.
Shergill, S. S., Bullmore, E., Simmons, A., Murray, R., & McGuire, P. (2000). Functional anatomy of auditory verbal imagery in schizophrenic patients with auditory hallucinations. American Journal of Psychiatry, 157, 1691–1693.
Stephens, G. L., & Graham, G. (2000). When self-consciousness breaks: Alien voices and inserted thoughts. Cambridge, MA: MIT Press.
Tien, A. Y. (1991). Distribution of hallucination in the population. Social Psychiatry and Psychiatric Epidemiology, 26, 287–292.
Velmans, M., & Schneider, S. (Eds.). (2007). The Blackwell companion to consciousness. Oxford: Blackwell.
Vygotsky, L. S. [1934] (1987). Thinking and speech. In The collected works of L. S. Vygotsky (Vol. 1). New York: Plenum.
Waters, F. A. V., Badcock, J. C., Michie, P. T., & Maybery, M. T. (2006). Auditory hallucinations in schizophrenia: Intrusive thoughts and forgotten memories. Cognitive Neuropsychiatry, 11, 65–83.
Wenzlaff, R. M., & Wegner, D. M. (2000). Thought suppression. Annual Review of Psychology, 51, 59–91.
Wiley, N. (2006). Inner speech as a language: A Saussurean inquiry. Journal for the Theory of Social Behaviour, 36, 319–341.
Winsler, A., Fernyhough, C., & Montero, I. (Eds.). (2009). Private speech, executive functioning, and the development of verbal self-regulation. New York: Cambridge University Press.
Winsler, A., & Naglieri, J. (2003). Overt and covert verbal problem-solving strategies: Developmental trends in use, awareness, and relations with task performance in children aged 5 to 17. Child Development, 74, 659–678.
Zelazo, P. D., Moscovitch, M., & Thompson, E. (Eds.). (2007). The Cambridge handbook of consciousness. Cambridge: Cambridge University Press.