4    Is the Theory of Mind Wired In?

As we employ it and as our ancestors have used it for at least the last fifty thousand years or so, the theory of mind is complex, detailed, hard to use, and, worst of all, almost impossible to teach. Why impossible to teach? Because teaching almost anything at all and certainly learning anything from someone else already require that teacher and learner both share a theory of mind. That fact makes it hard not to conclude that the theory must be innate, the result of a developmental program encoded by neural genes in our brains. If not, then it must be the case that our brains are strongly predisposed, somehow prepared for or primed by our genetic inheritance to learn, acquire, and internalize the theory on the basis of very early and very limited experience.

Either one of these alternative possibilities explains why we employ the theory of mind without even noticing that we do, why we find it satisfies curiosity better than anything else from earliest childhood, and why we can’t give it up, no matter what its weaknesses, failures, and alternatives.

How could so complicated a theory be written in our genes? Or, alternatively, how could we be so predisposed to learn such a complicated theory so soon after birth? If acquiring the theory of mind is a necessary condition for learning anything from someone else—including language—then, hard as it may be to accept, one of these two possibilities must be true.

There is plenty of scientific evidence in favor of each possibility. We’ll walk through some of this evidence, from neuroscience, from clinical cases of mental illness, from developmental and cognitive social psychology. We won’t be able to determine once and for all whether the theory of mind is completely innate or acquired in early childhood. But we won’t have to. Either way, we’ll have a full explanation for why we love narratives, effortlessly remember stories, fall for conspiracy theories, and are so sure we understand history.

One way to convince you of the (nearly) innate role of theory of mind in human understanding is to tell a story of how cognitive scientists discovered that it’s (close to) hardwired in our brains. A second way is to tell a story about how they discovered that we acquire the theory very early in childhood, as infants even. But the right thing to do is not to tell a story at all. Rather it’s to identify the specific regions of the brain where cognitive scientists have found the theory of mind to be “hardwired” and what they’ve found about how these regions work to control our behavior. We’ll do all three, because the stories work, and the science is the right explanation.

Let’s start with the second way. Like many stories about babies, my account begins with the twinkling of an eye (Gredebäck and Melinder, 2011).

Subtle as they are, the eye movements of other people capture an infant’s visual tracking within a few weeks of birth, as do self-propelled objects. By the age of only a few months, infants are already distinguishing the motion of bodies that looks goal-directed from uniform or random motion. They can do so without any prior exposure to the specific behaviors in question. Expose a four-month-old infant to a human, a robot, or a box moving around a barrier to a location at which each stops. Remove the barrier and have the person, the robot, or the box move along the same path, instead of directly to the location. The infant’s pupil dilation, its expression of surprise, and dishabituation—amount of time before the child ceases to look—are all greater under the second condition. This response is universally interpreted as showing an extremely early, unlearned interest in and an ability to discriminate means-ends behavior from mere motion, and to be perplexed by the failure of the human or robot or plain box to manifest it. The experimenters describe the result as revealing that “infants already at four months of age are able to interpret other people’s actions as rational or irrational.” They go on to admit “it is unclear at this stage if the demonstrated early sensitivity to irrational social interactions is innate or based on early experience [within the first three months of life] with rational actions” (Gredebäck and Melinder, 2011, p. 4). This may simply be a pardonable overinterpretation, but detecting purposeful behavior is an impressive achievement for any four-month-old. Let’s postpone asking how such detection is made—what developmental program lays down the capacity to discriminate purposive behavior from mere motion and how the program discriminates when it gets the chance.

Figure 4.1

Infant with eye motion tracker. http://www.indiana.edu/~dll/images/IMG_1303.jpg

Cognitive scientists have also found that, at about one year of age, infants begin to make much more sophisticated inferences from observed behavior of others to their unobservable “intentions” (Gredebäck and Melinder, 2011). Admittedly, the only basis for this conclusion is the appropriateness of the infants’ responses, but they found that, by eighteen months, infants’ joint attention on an object cued by an adult’s gaze is pretty reliable and that, by two and a half, toddlers demonstrate a grip on “shared intentionality,” pushing a ball back and forth when they play with their partners (Tomasello, 2014).

They further determined that, even at sixteen months, infants seem already to have figured out that other people carry around information about their surroundings. When presented with an adult observing where an object was being hidden and with another adult prevented from doing so by being blindfolded, infants met the mistaken guesses by the adult who had not been blindfolded with expressions of surprise and sustained visual attention (Gredebäck and Melinder, 2011).

Scientists are quite rightly unwilling to credit young children with having sophisticated cognitive skills until they confirm unambiguously that the children actually have them. Without good evidence, they quite rightly resist crediting a child with anything so sophisticated as a “nested belief”—having the concept of someone else having a belief. For one thing, they can’t see into the child’s mind to detect beliefs; for another, it’s hard to determine exactly what any belief is about, and, for a third, it’s even harder to establish that a child has a belief about someone else’s belief. Exactly what behavior of the child would reveal this?

It was the philosopher Daniel C. Dennett who first figured out how to answer this question (Dennett, 1978). Thinking about how children respond to a Punch and Judy show (figure 4.2, plate 1) led Dennett to realize the obvious fact staring everyone in the face: if a child believes that someone else has a false belief, then the child has to believe that someone else has a belief in the first place. Punch and Judy shows only work if children have beliefs about the false beliefs of Punch. They laugh when they see him search for Judy in places they know she isn’t. Their laughter betrays their belief that he has a false belief.

Figure 4.2

Punch and Judy show: experimental apparatus for determining the child’s acquisition of the theory of mind. https://en.wikipedia.org/wiki/Punch_and_Judy#/media/File:Swanage_Punch_%26_Judy.jpg

Following Dennett’s insight, cognitive scientists were surprised to discover that, at about four years of age, children have acquired a quite grown-up theory about such false beliefs. By the early 1980s, scientists had set up experiments to detect the age at which children’s behavior unambiguously betrays that they think other people have false beliefs (Baron-Cohen et al., 1985). In the standard experiment, something is hidden in the presence of the child, who is required to predict the search behavior of another person. Correct prediction that the person will look in the wrong place requires the child to attribute a false belief to the searcher. For children to grasp that a third person can have beliefs—true and false—about the false beliefs of a second person takes a few more years. But parents know that, from a very early age, children are effectively manipulating us in ways that require us to attribute to them a complicated theory about our beliefs, desires, intentions, and so on. That theory is a pretty full-fledged theory of mind.

Now we can begin to see other complex and interesting components of the theory of mind we all carry around with us, ones we didn’t even mention in chapter 3. To begin with, once we have grasped the theory of mind, there is, in principle, no apparent limit to the number of nested beliefs we can have about another person’s belief about still another person’s belief and so on ad infinitum. The same goes for desires about other people’s desires about still other people’s desires. Moreover, we can have beliefs about other people’s desires, desires about other people’s beliefs, and so on in any number of mind-boggling iterations and combinations. However, as Nathan Oesch and Robin Dunbar show (Oesch and Dunbar, 2016), there is an upper limit to the “recursion” of nested beliefs and desires we can keep in our working memory. The limit is about five levels of “A believes that B wants C to believe that D wants E to believe …” Many of these iterations are important in human life. If you think about your willingness to accept paper money, for example, you’ll realize it’s based on your beliefs about other people’s beliefs about still other people’s beliefs, and so on. Why else do we accept as valuable small pieces of paper based only on our belief in the “full faith and credit” of the U.S. government? The same goes for all the kinds and combinations of beliefs and desires we have—hopes, fears, likes, dislikes, wishful thoughts: the theory of mind tells us they can be endlessly combined, iterated, and nested inside people’s belief and desire boxes.

The achievement of infants and toddlers in learning a theory of mind or deploying an innate one is even more impressive when we appreciate what using the theory requires them to assume. As we’ve seen, whoever adopts the theory of mind, whether consciously or not, accepts that we all have in our minds “representations” of how the world or some part of it is arranged and the way it could be arranged. Recall our discussion about one of the crucial differences between beliefs and desires, the “direction of fit” difference that is revealed whenever we talk about “wishful thinking.”

The crucial role of this component of the theory of mind is reflected in the way it gets combined with the nesting of beliefs and desires that is required for the child’s acquisition of language. A toddler’s achievement of realizing that the sounds its parents make are meaningful speech and not just sounds requires that the toddler have beliefs about what its sound-making parents want and believe, and, in particular, what those beliefs and wants are about, what their direction of fit to the world is. The child has to believe that its parents have beliefs and desires that represent the way things are and the way the parents want things to be. Until it has nested beliefs about the content of its parents’ beliefs and desires, the child is treating the sounds its parents make the way we treat the sounds a squeaky wheel makes, trying anything we can think of to get it to stop, without thinking that the wheel’s actually telling us something about what it wants and thinks. We’ll come back to this relation between the theory of mind and language in later chapters.

So, before children have acquired language and almost anything else we can teach them, they have taken on board a pretty hefty psychological theory. How could they have done it?

Here’s why some cognitive scientists argue that the theory of mind is “innate,” hardwired, laid down in the developing brain by a genetically encoded process. The theory of mind is universal in us humans, present in all cultures. If acquired at all, it’s triggered or learned quickly, very early, in infancy, and under a vast range of circumstances. How could we all learn the same theory so quickly, so uniformly, and so accurately? Maybe, as children, we don’t learn it at all. Maybe we just have it from birth, requiring a little priming in early infancy to trigger its expression. Even children blind, deaf, or both from birth seem to acquire the theory easily and early. Only a very little learning of no special kind seems required—if any at all. Think of blind, deaf, and mute Helen Keller, who somehow has acquired the theory of mind, as she is portrayed in the 1959 play and 1962 film The Miracle Worker.

Moreover, children employ the theory of mind automatically, and, indeed, it’s hard for them to stop from invoking the theory to explain and predict, even when it doesn’t apply. They overshoot, as we all do even as adults, seeking theory-of-mind explanations for things that happen for no reason at all.

Motivated by these sorts of considerations, cognitive scientists in clinical settings began to uncover a much more powerful reason for treating the theory of mind as either innate or triggered by only the slightest provocation in early childhood experience: the behavioral pattern labeled “autism.”

“Autism” or “autism spectrum disorder” are fraught terms, subject to controversies at the intersection of science and politics. In recent years, autism has become a touchstone for the politics of difference, diversity, and demands for the treatment of unusual human traits as equally valuable ways to function, each with its own contribution to humanity. To explore what autism reveals about the theory of mind, we have to negotiate a minefield of passionate disputes at the intersection of science and human values.

Autism was first diagnosed variously, as an “illness,” a “disease,” a “disorder,” or a “defect.” It’s worth noting, to begin with, that most physical diseases or illnesses are known to have a number of symptoms but only a single cause—that’s what makes each of them a distinct disease or illness. That is not yet true of most mental illnesses or disorders. Our taxonomy of mental illnesses, as set forth in the Diagnostic and Statistical Manual of Mental Disorders, 5th edition (DSM-5) and its predecessors, is a classification in terms of symptoms—effects, not causes. Since the same effects can and often do have different causes, the various editions of the DSM don’t actually identify kinds of mental illnesses with much precision. And some effects that look pathological to one physician, culture, or epoch may seem quite unpathological to another. Consider homosexuality, once classified as a “disorder” in the DSM, but no longer.

By contrast with other mental conditions, autism spectrum disorders were initially thought to have a fairly narrow range of causal agents; their onset was attributed to the failure of one or another component of the child’s acquisition and employment of a theory of mind.

Thirty years ago, social psychologists established that children with autism fail the false-belief test at ages when normal children and otherwise mentally disabled children (for example, those with Down’s syndrome) pass the test. When the genetic basis of autism was widely recognized and accepted in the 1980s (see Baron-Cohen et al., 1985), biomedical researchers sought to identify the defect in the brain resulting from the genetic defect or defects associated with autism. The next step would be to locate the defect in the DNA.

To say that autism has a genetic basis is not to say that it is transmitted from parents to offspring. Indeed, researchers have found little indication that the condition is genetically hereditary. But it’s well known that when one identical twin is autistic, the probability that the other twin is as well is higher than 50 percent, and the probability that the other twin has some related developmental “disorder” is higher still. This suggests that autism probably has some early, possibly prenatal cause that is carried into both twins by DNA copying. There is, however, almost certainly no single gene for autism. This should be no surprise given what we already know about the quantitative genetics of most human traits that have been studied. Even height, the most strongly inherited human trait, is correlated with literally scores of genes, no one of which seems to contribute more than 2 percent to the probability that an individual attains a certain height or not.

Autism, then, is a genetic trait in the sense not of inherited but of somatic genes, the genes in the body’s cells that control what the cells do. In particular, it’s caused by variations in the genes of the cells that build certain parts of the brain. Genetic variants can be inherited from parents if they are present in the sperm or ovum. But they can also be the result of “breakdowns” in the copying of somatic genes soon after fertilization that produces the zygote’s full complement of twenty-three chromosome pairs. Then they won’t be inherited, even though they are genetic in origin. Twins start out sharing an almost perfectly identical sequence of 3 billion nucleotides in their genes, which then get copied over and over through development in all their cells. The copying produces a relatively small number of inconsequential differences—copy “errors”—in their DNA sequences. (The quotation marks around “breakdowns” and “errors” remind us that all genetic variations have their source in the always imperfect process of gene copying, and that whether the resulting trait is favorable or harmful depends entirely on the environment it interacts with.) Since the chances these copy errors will be the same in two twins are vanishingly small, the variations in autistic twins’ genes that produce the same symptoms almost certainly happened before the original division of the twins’ zygote. And because autism doesn’t have a clear pattern of inheritance across the generations, it’s also probably the result of combining two genomes—the mother’s and the father’s—neither of which, alone, has enough genetic copy “errors” to result in autism, but which together do have enough (Brandler and Sebat, 2015).

But which genes are involved? Some 60 percent of all genes are expressed in the brain, that is, make proteins that build and maintain the brain, and different amounts or assemblages of protein structures and different orders of activity can produce much the same developmental outcome or behavior. Autism is therefore the result of a variation somewhere in the building of the brain or in some regions of it that involves a relatively rare combination of a large number of genes, most of which are common to everyone, but some of which are rare. The particular combination will almost certainly differ from case to case, even as it produces the “same” result, namely, the range of behavior that is symptomatic of autism. Although we still have much more to learn about the condition, what the evidence shows so far is that autism is a result of differences in the genetically encoded program of brain development.

Is the failure of children with autism to pass the false-belief test also a failure to fully deploy the theory of mind? Influential researchers like Simon Baron-Cohen (Baron-Cohen, Leslie, and Frith, 1985) and many others thought so in the 1980s. They inferred that the theory of mind must itself be largely the result of a genetically encoded, hardwired program of neurological development. The wiggle word “largely” acknowledges that, almost certainly, some crucial, though as yet undetected, triggering stimuli in earliest childhood, most likely in infancy, were also required for the child to begin fully deploying the theory of mind. Researchers hypothesized that in autistic children, when the crucial triggering stimuli occurred, the theory of mind fails to deploy.

This view of the nature of autism led cognitive scientists to seek a domain or module in the brain that specifically and narrowly “subserves,” “realizes,” “implements,” or “instantiates” the theory of mind. Most of their research has employed functional magnetic resonance imaging (fMRI), a technique for inferring brain activity from the differential uptake of oxygen during cognitive tasks of various sorts (and one whose well-understood limitations will eventually give way to finer discriminations employing more direct measurements). Another approach to localization employed by cognitive neuroscientists is transcranial magnetic stimulation (TMS), which focuses a magnetic field on a small area of the brain, where it interferes with the neurons’ electrical polarization up to 6 centimeters below the scalp. Whereas fMRI identifies regions of activity by their use of oxygen, TMS identifies them by temporarily disabling the effects of their neurons on behavior.

Every month, research employing fMRI and TMS by a large number of cognitive neuroscientists reports new findings about the localization of cognitive activity and emotional response. The regions identified are relatively large and almost certainly responsible for a range of distinct, though related, behaviors. Moreover, several of the regions work together in bringing about any one activity, and each is involved in more than one identifiable behavioral task. In fact, the main challenge in this research is designing behavioral tasks that are so constrained, specialized, and measurable that they are produced only by a small number of these distinct regions (Gweon and Saxe, 2013; Saxe, Carey, and Kanwisher, 2004).

Identifying brain regions with distinct functions—modules—is an iterative process: researchers start out with localization in relatively large areas due to co-occurrence of behavioral deficits with visible abnormalities—lesions, congenital deformities, stroke sites, and so on. They then correlate these deficits with findings from fMRI and TMS in normal brains and devise behavioral tests that activate specific regions. Subsequently, they try to design new behavioral tests that elicit activity in smaller and smaller parts of these regions. The limitation here is, of course, the researchers’ ability to design such tests and ensure that all subjects understand instructions and interpret stimuli in a closely similar way. Testing for the localization of theory of mind requires that subjects all share the same theory of mind and that they want to aid the researchers by using the theory while under fMRI scanning. The researchers’ tasks are to design experiments in which subjects have to attribute false beliefs to others, either upon listening to stories or viewing cartoons, films, or demonstrations of behavior inexplicable unless the agents are acting on false beliefs.

Using both fMRI and TMS, Tobias Schuwerk, Berthold Langguth, and Monika Sommer identified five regions of the brain as loci that work together to deploy a theory of mind (figure 4.3): (1) the left and right temporoparietal junction (LTPJ and RTPJ); (2) the left and right dorsolateral prefrontal cortex (DLPFC); (3) the left inferior frontal gyrus (IFG); (4) the ventral medial prefrontal cortex (vMPFC); and (5) the posterior medial prefrontal cortex (pMPFC) (Schuwerk, Langguth, and Sommer, 2014).

Figure 4.3

Five regions of brain identified by both fMRI and TMS as loci that work together to deploy a theory of mind: the left and right temporoparietal junction (LTPJ and RTPJ); the left and right dorsolateral prefrontal cortex (DLPFC); the left inferior frontal gyrus (IFG); the ventral medial prefrontal cortex (vMPFC), and the posterior medial prefrontal cortex (pMPFC). From Schuwerk, Langguth, and Sommer, 2014, fig. 1. Courtesy of Frontiers in Psychology.

The localization of theory of mind to this small number of particular brain regions is one of the most robust and well-replicated results in neuroimaging (see Saxe et al., 2004). Using different tests and stimuli over a range of different male and female subjects of different ages, speaking different languages, and from different cultures, researchers showed that the same five regions of the brain were specifically activated by tasks requiring theory of mind.

Transcranial magnetic stimulation has the advantage over fMRI in that it can be employed to temporarily inhibit normal functioning in specific regions of the brain, especially ones close to its outside surface and near the skull. In effect, TMS can switch on and off regions of the brain by interfering with or stimulating patterns of electrochemical discharge. TMS studies naturally offer the prospect of a finer-grained geography of specialization among brain regions than fMRI can provide. They have strongly confirmed results from fMRI about the specific regions of the brain that are differentially involved in theory-of-mind reasoning. In fact, the TMS studies have begun to help identify the specific contribution of each of them to the mind-reading task (Kalb et al., 2009). Of the five regions highlighted in figure 4.3, transcranial magnetic stimulation has begun to suggest that one specific function of the right temporoparietal junction (RTPJ) is to enable the experimental subject to adopt the perspective of the other person (agent) the subject is thinking about and whose behavior the subject is to explain or predict (O’Connell et al., 2015). By contrast, TMS studies suggest that the posterior medial prefrontal cortex (pMPFC) does the opposite, enabling the experimental subject to distinguish his or her own perspective from that of the other person (agent) whose thoughts the subject is tracking, an essential task in false-belief reasoning (e.g., Schuwerk, Langguth, and Sommer, 2014).

Taken together, the research provides evidence that the theory of mind is localized to a small number of specific brain regions acting together as a mental “module.” Besides its localization, the imaging research reveals that it’s “domain specific”: the regions of the brain that encode and deploy theory of mind are not active in other tasks that seem quite similar to the theory’s proprietary activities. For example, when the brain detects other kinds of “falsehoods” besides false beliefs, for example, directional signs pointing in ways the experimental subjects know are wrong, or pictures that are known by the subjects to be inaccurate, the theory-of-mind regions are inactive. When subjects are told stories or shown cartoons that don’t involve false beliefs, the pattern of nonactivation of these regions and activation of non-theory-of-mind regions is quite distinctive. The theory-of-mind regions are also distinguishable on fMRI from other nearby regions involved in cognition that is focused on human purposeful behavior that does not involve either false or true beliefs or desires.

The theory-of-mind regions have several other features characteristic of special-purpose mental modules deployed in specific regions of the brain. As we’ve seen, theory-of-mind capacities develop in a fixed pattern in children, and they break down in characteristic ways that are reflected in medical diagnoses. We know from our own experience that these capacities operate quickly and automatically, often without our being conscious of them, to explain or predict other people’s behavior.

About the strongest evidence against the idea that the right temporoparietal junction is itself a theory-of-mind module is the fact that, as fMRI research has shown, it is also active in some presumably non-theory-of-mind tasks, especially attention tasks (O’Connell et al., 2015). It may turn out, of course, that finer-grained fMRI investigation or the use of other, more refined techniques will differentiate structures in the right temporoparietal junction that deliver each of these activities independently.

Functional magnetic resonance imaging has also enabled neuroscientists to locate the regions of the brain that “subserve” a different kind of understanding, the kind involved in identifying causes in processes that don’t involve purposeful behavior or intentional action, and that don’t employ a theory of mind at all. These regions of the brain (the frontopolar cortex, the dorsolateral prefrontal cortex, and the motor cortex) are involved in mathematical processing, logical and spatial reasoning, and visual modeling. They are all well away from those involved in theory-of-mind reasoning, indeed, all the way on the other side of the brain. It’s interesting that mathematically gifted male adolescents show patterns of brain activation in three regions different from those active in average-ability students dealing with the same problems. These mathematically gifted kids are often less gifted in their employment of the theory of mind (Yun et al., 2011).

But the findings from neuroscience are not just that brain regions presumably subserving natural understanding are distinct and distant from the regions that subserve the theory of mind. These two sets of regions have also been shown to inhibit, suppress, and obstruct each other’s activity. The regions of the brain that subserve theory-of-mind cognition are components of a larger domain—the default mode network (DMN)—whereas the regions involved in mathematical reasoning and visual modeling are part of another larger domain—the task positive network (TPN). The DMN drives social information processing, whereas the TPN subserves reasoning about physical objects (Raichle et al., 2001). What’s more, fMRI studies have revealed that these two networks appear to be antagonistic: when either is active, the other is suppressed. There is a competition between theory-of-mind reasoning and physical-object reasoning in the brain. Behavioral experiments reveal that mathematical tasks interfere with responses that require theory-of-mind reasoning. Just as we might suspect, many autism spectrum disorders are associated with higher levels of “fluid intelligence” and visualization ability, the kind of ability that makes for high achievement in math, but that correlates with lower levels of social cognition (Simard et al., 2015). The antagonistic relationship between these two domains helps explain why most of us, with fully functioning theories of mind, prefer to learn about science through narratives and why this doesn’t work very well.

The competition between narrative and analytical modes of thinking is also reflected in the enhanced visualization powers associated with reduced social functioning in some dementias. The opposite also occurs: heightened theory-of-mind capacities are often associated with mathematical deficits (for a review of some of these studies see Jack, Connelly, and Morris, 2012).

In short, neuroscience has begun to amass a good deal of evidence that the theory of mind is somehow inscribed in highly specialized regions of the brain, ones that initially showed deficits in people with autism. It has shown that in neurotypical people, transient interference with these brain regions temporarily disturbs theory-of-mind reasoning. It has provided evidence that the brain’s employment of this theory is antagonistic to its employment of reasoning required to understand nonpurposive, physical or mechanical behavior, and that there are differences in the brains of people better than average at reasoning about nonpurposive behavior and worse at theory-of-mind reasoning.

It was at this point that autism researchers dropped out of the research program to locate the theory of mind in a specialized module of the brain. They did so for several reasons. For one, they began to discover that there are other, separately identifiable dimensions of cognitive, motivational, and emotional activity where autistic individuals differ from “normal” or “neurotypical” ones. For another, these researchers turned their attention to the ability of autistic people to deploy cognitive strategies other than theory of mind to cope with social interaction, including the prediction of other people’s behavior. And for a third, the achievements of autistic people like Temple Grandin led them to challenge the taxonomy of psychopathology, at least for some people diagnosed with the condition. Autism rights advocates have increasingly argued that, like homosexuality, autism does not require a cure but should be accepted instead as an alternative set of behaviors that may in fact facilitate some creative achievements.

Meanwhile, independent of any simple association with autism, the localization of the theory of mind to a specific domain or module of the brain is well established. But how could such a complicated theory be laid down in our brains by our genes or DNA? After all, just getting the brain built in the first place seems a formidable undertaking for a bunch of macromolecules. Surely, the theory of mind is not simply written down in the alphabet of the four nucleotides of DNA molecules, even if they’re three billion base pairs long, but, rather, is something we as infants, babies, toddlers, or young children learn from experience as we grow up. That view drives a lot of the resistance to the notion that our cognitive abilities are hardwired.

And yet there is clear evidence of our acquiring these complex cognitive abilities very soon after birth on the basis of so little experience that there must be some innate “device” or hardwired “module,” or at least some detector of early stimuli that triggers our acquisition and deployment of the theory of mind immediately on encountering them. How might such a triggering mechanism work?

Imprinting by ducklings on their mother or on any moving object of roughly her height in the right time window of early development is the oldest well-known example of early experiential triggering (Lorenz, 1949). Another is the Nobel Prize–winning discovery of David Hubel and Torsten Wiesel that kittens’ normal visual abilities depend on having a certain experience that organizes identifiable parts of their brains during a brief critical period just after birth. It’s well established that infants respond to the presence of snakes, and that both they and adults detect spiders and snakes more accurately and more quickly than other stimuli, in each case without prior experience of either. And this innate snake/spider detection ability can be rapidly converted to a fear/flight response with minimal learning. Our widespread antipathy to snakes and spiders, though not innate, is at least close to innate (Öhman and Mineka, 2001).

Noam Chomsky famously noted that every normal prelinguistic infant can learn any language at all, even from highly imperfect speakers, after only a few months of exposure. We acquire a capacity to decode and encode an infinite number of spoken sentences within a brief period of time after hearing even a small sample of expressions, however ungrammatical or otherwise defective they might be. All this suggested to Chomsky that we humans are born with a cognitive device that enables us to construct, discover, or extract a set of syntactical rules for the language we’re exposed to, and begin using it early in childhood. Contrary to popularizations of his work, Chomsky believes that this device evolved as the result of selective pressure not for language, but for a more basic role in cognition and reasoning (Hauser, Chomsky, and Fitch, 2002). Perhaps, in order to produce language, our ability to think syntactically needed to be harnessed to a theory of mind (a subject we’ll return to in chapter 11).

All of which is to say there is much evidence to support the conclusion that a theory of mind is a fundamental and universal component of our cognitive equipment, largely innate, requiring only the slightest triggering by stimuli in early childhood and emerging at the same time among most of us, regardless of our circumstances. This component can be damaged or disabled in the developmental process that builds our infant brains, and, like a distinct organ, it is localized to particular parts of those brains. Moreover, the ability of our brains to use the theory can be turned off by experimental intervention, as TMS studies show.

The conclusion that the theory of mind is a (nearly) innate component of our cognitive tool kit, and that it is indispensable for learning language and, indeed, for learning anything at all from other people, raises the deep and yet also obvious question of how such a trait evolved. If it turns out to be an indispensable prerequisite for life as we humans know it, the theory of mind almost certainly must have an evolutionary pedigree.

Although we’ll address its pedigree in greater detail in chapter 5, a few remarks are worth making here. Evolutionary anthropologists have discovered that the theory of mind and the more fundamental abilities from which it emerged played a central role in our very survival as a species (Tomasello, 2014; Hrdy, 2009). Their findings vindicate what fMRI studies have shown about the theory of mind being a distinct cognitive module and reveal how so complicated a cognitive capacity can get itself (practically) written into our genomes. In short, these findings establish that the theory of mind couldn’t work at all unless it were almost completely hardwired in the developmental program of our brains.

But the Darwinian pedigree of the theory of mind also raises serious questions about the theory’s explanatory power—questions whose answers end up completely discrediting historians’ claims to provide real understanding of human affairs.