Imagine arriving at the airport just in time to catch a plane. Everything in your behavior betrays the heightened concentration of your attention. Your mind on alert, you look for the departures sign, without letting yourself be distracted by the flow of travelers; you quickly scroll through the list to find your flight. Advertisements all around call out to you, but you do not even see them—instead, you head straight for the check-in counter. Suddenly, you turn around: in the crowd, an unexpected friend has just called out your first name. This message, which your brain considers a priority, takes over your attention and invades your consciousness . . . making you forget which check-in counter you were supposed to go to.
In the space of a few minutes, your brain went through most of the key states of attention: vigilance and alertness, selection and distraction, orientation and filtering. In cognitive science, “attention” refers to all the mechanisms by which the brain selects information, amplifies it, channels it, and deepens its processing. These mechanisms are evolutionarily ancient: whenever a dog reorients its ears or a mouse freezes upon hearing a cracking sound, it is making use of attention circuits very close to ours.1
Why did attention mechanisms evolve in so many animal species? Because attention solves a very common problem: information saturation. Our brain is constantly bombarded with stimuli: the senses of sight, hearing, smell, and touch transmit millions of bits of information per second. Initially, all these messages are processed in parallel by distinct neurons—yet it would be impossible to digest them all in depth: the brain’s resources would not suffice. This is why a pyramid of attention mechanisms, organized like a gigantic filter, carries out a selective triage. At each stage, our brain decides how much importance to attribute to each input and allocates resources only to the information it considers most essential.
Selecting relevant information is fundamental to learning. In the absence of attention, discovering a pattern in a pile of data is like looking for the fabled needle in a haystack. This is one of the main reasons behind the slowness of conventional artificial neural networks: they waste considerable time analyzing all possible combinations of the data provided to them, instead of sorting out the information and focusing on the relevant bits. It was only in 2014 that two researchers, Canadian Yoshua Bengio and Korean Kyunghyun Cho, showed how to integrate attention into artificial neural networks.2 Their first model learned to translate sentences from one language to another. They showed that attention brought in immense benefits: their system learned better and faster because it managed to focus on the relevant words of the original sentence at each step.
Very quickly, the idea of learning to pay attention spread like wildfire in the field of artificial intelligence. Today, if artificial systems manage to successfully label a picture (“A woman throwing a Frisbee in a park”), it is because they use attention to channel the information by focusing a spotlight on each relevant part of the image. When describing the Frisbee, the network concentrates all its resources on the corresponding pixels of the image and temporarily sets aside those that correspond to the person and the park—it will return to them later.3 Nowadays, no sophisticated artificial intelligence system connects all inputs with all outputs anymore. Learning is faster when such a plain network, in which every input pixel could potentially predict any output word, is replaced by an organized architecture with two modules: one that learns to pay attention, and another that learns to name the data filtered by the first.
The first pillar of learning is attention, a mechanism so fundamental that it is now being integrated into most contemporary artificial neural networks. Here, the machine learns to find the words to describe an image. Selective attention acts as a spotlight that lights up certain areas of the image (in white on the right) and discards everything else. At any given moment, attention thus concentrates all the learning power on a selected data set.
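For readers who want to see what such an attention module boils down to, here is a minimal sketch in Python. It implements the dot-product form of attention that has since become standard, not the exact additive mechanism of the 2014 translation model, and the function names and toy numbers are illustrative assumptions only.

```python
import numpy as np

def softmax(scores):
    exp = np.exp(scores - scores.max())  # subtract the max for numerical stability
    return exp / exp.sum()

def attend(query, keys, values):
    """Score every input item against the current query, convert the scores
    into normalized weights (the 'spotlight'), and return the weighted mix
    of the items' values."""
    scores = keys @ query        # one relevance score per input item
    weights = softmax(scores)    # weights sum to 1: amplify some items, squash others
    return weights @ values, weights

# Toy data: three input items, each described by a 2-D key vector.
keys = np.array([[1.0, 0.1],
                 [0.1, 1.0],
                 [0.5, 0.5]])
values = np.array([[10.0], [20.0], [30.0]])
query = np.array([3.0, 0.0])     # the query most resembles item 0

context, weights = attend(query, keys, values)
print(weights)   # largest weight falls on item 0
print(context)   # the output is dominated by item 0's value
```

The point of the design is exactly the one described above: instead of letting every input influence every output, the network first computes where to look, and only the highly weighted items shape the next stage of processing.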
Attention is essential, but it comes with a risk: if attention is misdirected, learning can get stuck.4 If I don’t pay attention to the Frisbee, this part of the image is wiped out: processing goes on as if it did not exist. Information about it is discarded early on, and it remains confined to the earliest sensory areas. Unattended objects cause only a modest activation that induces little or no learning.5 This is utterly different from the extraordinary amplification that occurs in our brain whenever we pay attention to an object and become aware of it. With conscious attention, the discharges of the sensory and conceptual neurons that code for an object are massively amplified and prolonged, and their messages propagate into the prefrontal cortex, where whole populations of neurons ignite and fire for a long time, well beyond the original duration of the image.6 Such a strong surge of neural firing is exactly what synapses need in order to change their strength—what neuroscientists call “long-term potentiation.” When a pupil pays conscious attention to, say, a foreign-language word that the teacher has just introduced, she allows that word to propagate deep into her cortical circuits, all the way into the prefrontal cortex. As a result, that word has a much better chance of being remembered. Unconscious or unattended words remain largely confined to the brain’s sensory circuits, never getting a chance to reach the deeper lexical and conceptual representations that support comprehension and semantic memory.
This is why every student should learn to pay attention—and also why teachers should pay more attention to attention! If students don’t attend to the right information, it is quite unlikely that they will learn anything. A teacher’s greatest talent consists of constantly capturing and channeling children’s attention in order to guide them properly.
Attention plays such a fundamental role in the selection of relevant information that it is present in many different circuits in the brain. American psychologist Michael Posner distinguishes at least three major attention systems:
Alerting, which indicates when to attend, and adapts our level of vigilance.
Orienting, which signals what to attend to, and amplifies any object of interest.
Executive attention, which decides how to process the attended information, selects the processes that are relevant to a given task, and controls their execution.
These systems massively modulate brain activity and can therefore facilitate learning, but also point it in the wrong direction. Let us examine them one by one.
The first attention system, perhaps the oldest in evolution, tells us when to be on the watch. It sends warning signals that mobilize the entire body when circumstances require it. When a predator approaches or when a strong emotion overwhelms us, a whole series of subcortical nuclei immediately increases the wakefulness and vigilance of the cortex. This system triggers a massive and diffuse release of neuromodulators such as serotonin, acetylcholine, and dopamine (see figure 16 in the color insert). Through long-range axons with many spread-out branches, these alerting messages reach virtually the entire cortex, greatly modulating cortical activity and learning. Some researchers speak of a “now print” signal, as if these messages directly told the cortex to commit the current contents of neural activity to memory.
Animal experiments show that the firing of this warning system can indeed radically alter cortical maps (see figure 16 in the color insert). The American neurophysiologist Michael Merzenich conducted several experiments in which the alerting system of mice was tricked into action by electrical stimulation of their subcortical dopamine or acetylcholine circuits. The outcome was a massive shift in cortical maps. All the neurons that happened to be activated at that moment, even if they had no objective importance, were subject to intense amplification. When a sound, for instance, a high-pitched tone, was systematically associated with a flash of dopamine or acetylcholine, the mouse’s brain became heavily biased toward this stimulus. As a result, the whole auditory map was invaded by this arbitrary note. The mouse became better and better at discriminating sounds close to this sensitive note, but it partially lost the ability to represent other frequencies.7
It is remarkable that such cortical plasticity, induced by tampering with the alerting system, can occur even in adult animals. Analysis of the circuits involved shows that neuromodulators such as serotonin and acetylcholine—particularly via the nicotinic receptor (sensitive to nicotine, another major player in arousal and alertness)—modulate the firing of cortical inhibitory interneurons, tipping the balance between excitation and inhibition.8 Remember that inhibition plays a key role in the closing of sensitive periods for synaptic plasticity. Disinhibited by the alerting signals, cortical circuits seem to recover some of their juvenile plasticity, thus reopening the sensitive period for signals that the mouse brain labels as crucial.
What about Homo sapiens? It is tempting to think that a similar reorganization of cortical maps occurs every time a composer or a mathematician passionately dives into their chosen field, especially when their passion starts at an early age. A Mozart or a Ramanujan is perhaps so electrified by fervor that his brain maps become literally invaded with mental models of music or math. Furthermore, this may apply not only to geniuses, but to anyone passionate in their work, from a manual worker to a rocket scientist. By allowing cortical maps to massively reshape themselves, passion breeds talent.
Even though not everyone is a Mozart, the same brain circuits of alertness and motivation are present in all people. What circumstances of daily life would mobilize these circuits? Do they activate only in response to trauma or strong emotions? Maybe not. Some research suggests that video games, especially action games that play with life and death, provide a particularly effective means of engaging our attentional mechanisms. By mobilizing our alerting and reward systems, video games massively modulate learning. The dopamine circuit, for example, fires when we play an action game.9 Psychologist Daphné Bavelier has shown that this translates into rapid learning.10 The most violent action games seem to have the most intense effects, perhaps because they most strongly mobilize the brain’s alerting circuits. Ten hours of game-play suffice to improve visual detection, refine the rapid estimation of the number of objects on the screen, and expand the capacity to concentrate on a target without being distracted. A video game player manages to make ultra-fast decisions without compromising his or her performance.
Parents and teachers complain that today’s children, plugged into computers, tablets, consoles, and other devices, constantly flit from one activity to the next and have lost the capacity to concentrate—but this is untrue. Far from reducing our ability to concentrate, video games can actually increase it. In the future, will they help us remobilize synaptic plasticity in adults and children alike? Undoubtedly, they are a powerful stimulant of attention, which is why my laboratory has developed a whole range of educational tablet games for math and reading, based on cognitive science principles.11
Video games also have their dark side: they present well-known risks of social isolation, time loss, and addiction. Fortunately, there are many other ways to unlock the effects of the alerting system while also drawing on the brain’s social sense. Teachers who captivate their students, books that draw in their readers, and films and plays that transport their audiences and immerse them in real-life experiences probably provide equally powerful alerting signals that stimulate our brain plasticity.
The second attention system in the brain determines what we should attend to. This orienting system acts as a spotlight on the outside world. From the millions of stimuli that bombard us, it selects those to which we should allocate our mental resources, because they are urgent, dangerous, appealing . . . or merely relevant to our present goals.
The founding father of American psychology, William James (1842–1910), in his The Principles of Psychology (1890), best defined this function of attention: “Millions of items of the outward order are present to my senses which never properly enter into my experience. Why? Because they have no interest for me. My experience is what I agree to attend to. Only those items which I notice shape my mind.”
Selective attention operates in all sensory domains, even the most abstract. For example, we can pay attention to the sounds around us: dogs move their ears, but for us humans, only an internal pointer in our brain moves and tunes in to whatever we decide to focus on. At a noisy cocktail party, we are able to select one out of ten conversations based on voice and meaning. In vision, the orienting of attention is often more obvious: we generally move our head and eyes toward whatever attracts us. By shifting our gaze, we bring the object of interest into our fovea, which is an area of very high sensitivity in the center of our retina. However, experiments show that even without moving our eyes, we can still pay attention to any place or any object, wherever it is, and amplify its features.12 We can even attend to one of several superimposed drawings, just like we attend to one of several simultaneous conversations. And there is nothing stopping you from paying attention to the color of a painting, the shape of a curve, the speed of a runner, the style of a writer, or the technique of a painter. Any representation in our brains can become the focus of attention.
In all these cases, the effect is the same: the orienting of attention amplifies whatever lies in its spotlight. The neurons that encode the attended information increase their firing, while the noisy chattering of other neurons is squashed. The impact is twofold: attention makes the attended neurons more sensitive to the information that we consider relevant, but, above all, it increases their influence on the rest of the brain. Downstream neural circuits echo the stimulus to which we lend our eyes, ears, or mind. Ultimately, vast expanses of cortex reorient to encode whatever information lies at the center of our attention.13 Attention acts as an amplifier and a selective filter.
“The art of paying attention, the great art,” says the philosopher Alain (1868–1951), “supposes the art of not paying attention, which is the royal art.” Indeed, paying attention also involves choosing what to ignore. For an object to come into the spotlight, thousands of others must remain in the shadows. To direct attention is to choose, filter, and select: this is why cognitive scientists speak of selective attention. This form of attention amplifies the signal which is selected, but it also dramatically reduces those that are deemed irrelevant. The technical term for this mechanism is “biased competition”: at any given moment, many sensory inputs compete for our brain’s resources, and attention biases this competition by strengthening the representation of the selected item while squashing the others. This is where the spotlight metaphor reaches its limits: to better light up a region of the cortex, the attentional spotlight of our brain also reduces the illumination of other regions. The mechanism relies on interfering waves of electrical activity: to suppress a brain area, the brain swamps it with slow waves in the alpha frequency band (between eight and twelve hertz), which inhibit a circuit by preventing it from developing coherent neural activity.
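This push-pull dynamic is easy to caricature in code. The sketch below is a toy divisive-normalization model in the spirit of computational accounts of biased competition; the numbers and the normalization constant are made up for illustration, and nothing here is meant to capture the alpha-wave mechanism itself.

```python
import numpy as np

def biased_competition(drive, gain, sigma=1.0):
    """Toy normalization model of biased competition: each item's response is
    its attention-weighted drive divided by the pooled drive of all items.
    Raising one item's gain boosts its response and, through the shared
    denominator, suppresses its competitors."""
    attended = gain * drive
    return attended / (attended.sum() + sigma)

stimuli = np.array([1.0, 1.0, 1.0])             # three equally strong inputs
print(biased_competition(stimuli, np.ones(3)))  # no attention: equal responses
print(biased_competition(stimuli, np.array([4.0, 1.0, 1.0])))
# attending to item 0 raises its response while pushing items 1 and 2
# below their unattended baseline: amplification and suppression at once
```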
Paying attention, therefore, consists of suppressing the unwanted information—and in doing so, our brain runs the risk of becoming blind to what it chooses not to see. Blind, really? Really. The term is fully appropriate, because many experiments, including the famous “invisible gorilla” experiment,14 demonstrate that inattention can induce a complete loss of sight. In this classic experiment, you are asked to watch a short movie where basketball players, dressed in black and white, pass a ball back and forth. Your task is to count, as precisely as you can, the number of passes made by the white team. A piece of cake, you think—and indeed, thirty seconds later, you triumphantly give the right answer. But now the experimenter asks a strange question: “Did you see the gorilla?” The gorilla? What gorilla? We rewind the tape, and to your amazement, you discover that an actor in a full-body gorilla costume walked across the screen and even stopped in the middle to pound on his chest for several seconds. It seems impossible to miss. Furthermore, experiments show that, at some point, your eyes looked right at the gorilla. Yet you did not see it. The reason is simple: your attention was entirely focused on the white team and therefore actively inhibited the distracting players who were dressed in black . . . gorilla included! Busy with the counting task, your mental workspace was unable to become aware of this incongruous creature.
The invisible gorilla experiment is a landmark study in cognitive science, and one which is easily replicated: in a great variety of settings, the mere act of focusing our attention blinds us to unattended stimuli. If, for instance, I ask you to judge whether the pitch of a sound is high or low, you may become blind to another stimulus, such as a written word that appears within the next fraction of a second. Psychologists call this phenomenon the “attentional blink”:15 your eyes may remain open, but your mind “blinks”—for a short while, it is fully busy with its main task and utterly unable to attend to anything else, even something as simple as a single word.
In such experiments, we actually suffer from two distinct illusions. First, we fail to see the word or the gorilla, which is bad enough. (Other experiments show that inattention can lead us to miss a red light or run over a pedestrian—never use your cell phone behind the wheel!) But the second illusion is even worse: we are unaware of our own unawareness—and, therefore, we are absolutely convinced that we have seen all there is to see! Most people who try the invisible gorilla experiment cannot believe their own blindness. They think that we played a trick on them, for instance by using two different movies. Typically, their reasoning is that if there really was a gorilla in the video, they would have seen it. Unfortunately, this is false: our attention is extremely limited, and despite all our good will, when our thoughts are focused on one object, other objects—however salient, amusing, or important—can completely elude us and remain invisible to our eyes. The intrinsic limits of our awareness lead us to overestimate what we and others can perceive.
The gorilla experiment truly deserves to be known by everyone, especially parents and teachers. When we teach, we tend to forget what it means to be ignorant. We all think that what we see, everyone can see. As a result, we often have a hard time understanding why a child, despite the best of intentions, fails to see, in the most literal sense of the term, what we are trying to teach him. But the gorilla carries a clear message: seeing requires attending. If students, for one reason or another, are distracted and fail to pay attention, they may be entirely oblivious to their teacher’s message—and what they cannot perceive, they cannot learn.16
As an example, consider an experiment recently performed by the American psychologist Bruce McCandliss which probed the role of attention in learning to read.17 Is it better to pay attention to the individual letters of a word or to the overall form of the whole word? To find out, McCandliss and his colleagues taught adults an unusual writing system made up of elegant curves. The subjects were first trained with sixteen words, then their brain responses were recorded while they tried to read these sixteen learned words, as well as sixteen new words in the same script. Unbeknownst to them, however, their attention was also being manipulated. Half the participants were told to attend to the curves as a whole, because each of them, much like a Chinese character, corresponded to one word. The other group was told that, in fact, the curves were made up of three superimposed letters, and that they would learn better by paying attention to each letter. Thus, the first group paid attention on the whole-word level, while the second group attended to the individual letters, which had actually been used to write the words.
Selective attention can orient learning to the right or wrong circuit. In this experiment, adults learned to read a new writing system using either a phonics approach or a whole-word approach. Those who attended to the overall shape of the words did not realize that the words were made of letters, even after three hundred trials. Whole-word attention directed the learning to an inappropriate circuit in the right hemisphere and prevented the participants from generalizing what they had learned to novel words. When attention was drawn to the presence of letters, however, people were able to decipher the alphabet and to read novel words, using the normal reading circuit located in the left ventral visual cortex.
What were the results? Both groups managed to remember the first sixteen words, but attention radically altered their ability to decipher new words. The participants in the second group, focused on letters, discovered many of the correspondences between letters and sounds and were able to read 79 percent of the new words. Furthermore, an examination of their brains showed that they had activated the normal reading circuitry, localized to the ventral visual areas of the left hemisphere. In the first group, however, attending to the overall word form completely hindered the capacity to generalize to novel items: these volunteers could not read any new words, and they activated a totally inappropriate circuit located in the visual areas of the right hemisphere.
The message is clear: attention radically changes brain activity. Paying attention to the overall shape of the words prevents the discovery of the alphabetic code and directs brain activity toward an inadequate circuit in the opposite hemisphere. To learn to read, phonics training is essential. Only by attending to the correspondence between letters and sounds can a student activate the classical reading circuit, allowing for the proper type of learning to take place. All first-grade teachers who teach reading should be familiar with these data: they show how important it is to properly direct children’s attention. A wealth of converging data convincingly demonstrates the superiority of such a phonics approach over whole-word reading.18 When a child attends to the letter level, for instance, by tracking each letter with her finger, from left to right, learning becomes much easier. If, on the other hand, the child is not provided with any attentional clues and naively examines the written word as a whole, without attending to its internal structure, nothing happens. Attention is a key ingredient of successful learning.
Above all, therefore, good teaching requires permanent attention to children’s attention. Teachers must carefully choose what they want children to attend to, because only the items that lie at the focus of attention are represented in the brain with sufficient strength to be efficiently learned. The other stimuli, the losers of the attentional competition, cause little or no stir inside the child’s plastic synapses.
The efficient teacher therefore pays close attention to his pupils’ mental states. By constantly stirring children’s curiosity with attention-grabbing lessons, he ensures that each class is a memorable experience. By tailoring his teaching to each child’s attention span, he ensures that all students follow the entire lesson.
Our third and final attention system determines how the attended information is processed. The executive control system, sometimes called the “central executive,” is a hodgepodge of circuits that allows us to choose a course of action and stick to it.19 It involves a whole hierarchy of cortical areas, mainly located in the frontal cortex—the huge mass of cortex that lies beneath our forehead and comprises close to a third of the human brain. Compared with other primates, our frontal lobes are enlarged, better connected, and packed with a larger number of neurons, each with a broader and more complex dendritic tree.20 It’s no wonder, then, that human cognitive abilities are much more developed than those of any other primate—and this is especially true at the highest level of the cognitive hierarchy, which allows us to supervise our mental operations and become aware of our mistakes: the executive control system.21
Imagine having to mentally multiply 23 by 8. It is your executive control system that ensures that the whole series of relevant mental operations runs smoothly from beginning to end: first, focus on the ones digit (3) and multiply it by 8, then store the result (24) in memory; now focus on the tens digit (2) and also multiply it by 8 to obtain 16, and remember that you are working in the tens column, therefore it corresponds to 160; and finally, add 24 and 160 to reach the final result: 184.
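As a purely illustrative analogy, and certainly not a model of the underlying neural machinery, the same serial procedure can be written out in a few lines of Python, with a list standing in for working memory:

```python
def serial_multiply(n, multiplier):
    """Step through n one digit at a time, as in mental arithmetic: compute
    each partial product, hold it in 'working memory', then add them up."""
    working_memory = []
    for place, digit in enumerate(reversed(str(n))):
        partial = int(digit) * multiplier * 10 ** place  # 3*8 = 24, then 20*8 = 160
        working_memory.append(partial)                   # store the intermediate result
    return sum(working_memory)                           # final step: 24 + 160

print(serial_multiply(23, 8))  # 184
```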
Executive control is the switchboard of the brain: it orients, directs, and governs our mental processes, much like a railroad yardman who tends the switches in a busy railway station and manages to bring each train to the right track by choosing the appropriate orientation for each switch. The brain’s central executive is considered one of the attention systems because, like the others, it selects from many possibilities—but this time, from the available mental operations rather than from the stimuli that reach us. Thus, spatial attention and executive attention complement each other. When we do mental arithmetic, spatial attention is the system that scans the mathematics textbook page and shines the spotlight on the problem 23 × 8—but it is executive attention which then guides the spotlight step by step, first selecting the 3 and the 8, then routing them to the brain circuits for multiplication, and so on. The central executive activates the relevant operations and inhibits the inappropriate ones. It constantly ensures that the mental program runs smoothly, and decides when to change strategies. It is also the system which, within a specialized subcircuit of the cingulate cortex, detects when we make an error, or when we deviate from the goal, and immediately corrects our action plan.
There is a close link between executive control and what cognitive scientists call working memory. In order to follow a mental algorithm and control its execution, we must constantly keep in mind all the elements of the ongoing program: intermediate results, steps already carried out, operations remaining to be performed. . . . Thus, executive attention controls the inputs and the outputs of what I have called the “global neural workspace”: a temporary conscious memory within which we can maintain, for a short period, practically any piece of information that seems relevant to us and relay it to any other module.22 The global workspace acts as the brain’s router, the signalman that decides how, and in what order, to send the information to the many different processors that our brain hosts. At this level, mental operations are slow and serial: this is a system that processes one piece of information at a time and is therefore incapable of doing two operations at once. Psychologists also call it the “central bottleneck.”
Are we really unable to execute two mental programs at once? We are sometimes under the impression that we can simultaneously perform two tasks, or even follow two distinct trains of thought—but this is a pure illusion. A basic experiment illustrates this point: Give someone two very simple tasks—for example, pressing a key with the left hand whenever they hear a high-pitched sound, and pressing another key with the right hand if they see the letter Y. When both targets occur simultaneously or in close succession, the person performs the first task at a normal speed, but the execution of the second task is considerably slowed down, in direct proportion to the time spent making the first decision.23 In other words, the first task delays the second: while our global workspace is busy with the first decision, the second one has to wait. And the lag is huge: it easily reaches a few hundred milliseconds. If you are too concentrated on the first task, you may even miss the second task entirely. Remarkably, however, none of us is aware of this large dual-task delay—because, by definition, we cannot be aware of information before it enters our conscious workspace. While the first stimulus gets consciously processed, the second one has to wait outside the door, until the global workspace is free—but we have no introspection of that waiting time, and if asked about it, we think that the second stimulus appeared exactly when we were finished with the first, and that we processed it at a normal speed.24
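The serial-bottleneck account of this delay can be simulated in a few lines. In the toy model below, the stage durations are round numbers chosen purely for illustration, not measured values; its single point is that when the two targets coincide, the second response is delayed by exactly the duration of the first central decision.

```python
def dual_task_rts(t2_onset, t1_decision=0.4, t2_decision=0.3,
                  perception=0.1, motor=0.1):
    """Toy serial-bottleneck model: the central decision stage handles one
    task at a time, so task 2's decision cannot start until task 1's
    decision has finished. All durations are in seconds."""
    rt1 = perception + t1_decision + motor
    central_free = perception + t1_decision            # when the bottleneck opens up
    start2 = max(t2_onset + perception, central_free)  # task 2 may have to wait
    rt2 = (start2 + t2_decision + motor) - t2_onset
    return rt1, rt2

print(dual_task_rts(t2_onset=1.0))  # targets far apart: task 2 runs at full speed
print(dual_task_rts(t2_onset=0.0))  # simultaneous targets: task 2 is slowed by
                                    # exactly the duration of task 1's decision
```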
Once again, we are unaware of our mental limits (indeed, it would be paradoxical if we could somehow become aware of our lack of awareness!). The only reason we believe that we can multitask is that we are unaware of the huge delay it causes. Thus, many of us continue to text while we drive—in spite of all the evidence that texting is one of the most distracting activities ever. The lure of the screen and the myth of multitasking are among the most dangerous fabrications of our digital society.
What about training? Can we ever turn ourselves into genuine multitaskers who do multiple things at once? Perhaps, but only with intense training on one of the two tasks. Automatization frees the conscious workspace: by routinizing an activity, we can execute it unconsciously, without tying up the brain’s central resources. Through hard practice, for instance, a professional pianist may be able to talk while playing, or a typist may be able to copy a document while listening to the radio. However, these are rare exceptions, and psychologists continue to debate them, because it is also possible that executive attention quickly switches from one task to the next in an almost undetectable manner.25 The basic rule stands: in any multitask situation, whenever we have to perform multiple cognitive operations under the control of attention, at least one of the operations is slowed down or forgotten altogether.
Because of this severe effect of distraction, learning to concentrate is an essential ingredient of learning. We cannot expect a child or an adult to learn two things at once. Teaching requires paying attention to the limits of attention and, therefore, carefully prioritizing specific tasks. Any distraction slows down or wastes our efforts: if we try to do several things at once, our central executive quickly loses track. In this respect, cognitive science experiments in the lab converge nicely with educational findings. For instance, field experiments demonstrate that an overly decorated classroom distracts children and prevents them from concentrating.26 Another recent study shows that when students are allowed to use their smartphones in class, their performance suffers, even months later, when they are tested on the specific content of that day’s class.27 For optimal learning, the brain must avoid any distraction.
Executive attention roughly corresponds to what we call “concentration” or “self-control.” Importantly, this system is not immediately available to children: it will take fifteen or twenty years before their prefrontal cortex reaches its full maturity. Executive control emerges slowly throughout childhood and adolescence as our brain, through experience and education, gradually learns to control itself. Much time is needed for the brain’s central executive to systematically select the appropriate strategies and inhibit the inadequate ones, all the while avoiding distraction.
Cognitive psychology is full of examples where children gradually correct their mistakes as they increasingly manage to concentrate and inhibit inappropriate strategies. Psychologist Jean Piaget was the first to notice this: Very young children sometimes make seemingly silly mistakes. If, for example, you hide a toy a few times at location A, and then switch to hiding it at location B, babies below one year of age continue to look for it at location A (even if they saw perfectly well what happened). This is the famous “A-not-B error,” which led Piaget to conclude that infants lack object permanence—the knowledge that an object continues to exist when it is hidden. However, we now know that this interpretation is wrong. Examination of the babies’ eyes shows that they know where the hidden object is. But they have trouble resolving mental conflicts: in the A-not-B task, the routine response that they learned on previous trials tells them to go to location A, while their more recent working memory tells them that, on the present trial, they should inhibit this habitual response and go to location B. Before ten months of age, the habit prevails. At this age, what is lacking is executive control, not object knowledge. Indeed, the A-not-B error disappears around twelve months of age, in direct relation to the development of the prefrontal cortex.28
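The tug-of-war described here can be caricatured as a single comparison. The sketch below is a deliberately crude illustration of the habit-versus-working-memory competition, with made-up numbers; it is not a model from the developmental literature.

```python
def a_not_b_choice(a_trials, memory_strength, habit_gain=0.2):
    """Toy competition: a habit trace that grows with each hiding at A
    competes with working memory for the most recent location, B.
    Whichever signal is stronger drives the reach."""
    habit_for_A = habit_gain * a_trials
    return "A" if habit_for_A > memory_strength else "B"

print(a_not_b_choice(a_trials=4, memory_strength=0.5))  # weak executive control: A-not-B error
print(a_not_b_choice(a_trials=4, memory_strength=1.0))  # stronger control: correct reach to B
```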
Another typical error of children is the confusion between number and size. Here again, Piaget made an essential discovery but got the interpretation wrong. He found that young children, before they were about three years old, had trouble judging the number of objects in a group. In his classical number conservation experiments, Piaget first showed children two equal rows of marbles, in one-to-one correspondence, such that even the youngest children would agree that the rows had the same number of marbles. He would then space the marbles in one of the rows apart:
Remarkably, the children would now affirm that the two sets were unequal, and that the longer row had more objects. This is a surprisingly silly error—but contrary to what Piaget thought, it does not mean that children at this age are incapable of “conserving number.” As we have seen, even newborn babies already possess an abstract sense of number, independent of the spacing of items or even the sensory modality in which they are presented. No, the difficulty arises, once again, from executive control. Children must learn to inhibit a prominent feature (size) and amplify a more abstract one (number). Even in adults, such selective attention may fail. For instance, we all have a hard time deciding which of two sets is larger when the items in the smaller set are bigger and more spread out in space; and we are slower to pick the numerically larger of 7 and 9 when the smaller digit is printed in a bigger font. What develops with age and education is not so much the intrinsic precision of the number system, but the ability to use it efficiently without getting distracted by irrelevant cues, such as density or size.29 Once again, progress in such tasks correlates with the development of neural responses in the prefrontal cortex.30
I could list many more examples: at all stages of life and in all domains of knowledge, whether cognitive or emotional, it is primarily the development of our executive control abilities which allows us to avoid making errors.31 Let’s try it on your own brain: name the color of the ink (black or white) in which each of the following words is printed:
When you reached the second half of the list, did the task become more difficult? Did you slow down and make errors? This classic effect (which is even more striking when the words are printed in color) reflects the intervention of your executive control system. When the words and colors conflict, the central executive must inhibit word reading to remain focused on the task of naming the ink color.
Now try solving the following problem: “Mary has twenty-six marbles. This is four more than Gregory. How many marbles does Gregory have?” Did you have to fight the urge to add the two numbers? Did you think of thirty instead of the correct result of twenty-two? The problem statement uses the word “more” even though you have to subtract—this is a trap that many children fall into before they manage to control themselves and think deeper about the meanings of such math problems in order to select the relevant arithmetic operation.
Attention and executive control develop spontaneously with the progressive maturation of the prefrontal cortex, which extends over the first two decades of our lives. But this circuit, like all others, is plastic, and many studies show that its development can be enhanced by training and education.32 Because this system intervenes in a great variety of cognitive tasks, many educational activities, including the most playful, can effectively develop executive control. The American psychologist Michael Posner was the first to develop educational software that improves young children’s ability to concentrate. One game, for instance, forces the player to heed the orientation of a fish in the center of the screen. The target fish is surrounded by others that face in the opposite direction. In the course of the game, which consists of many levels of increasing difficulty, the child progressively learns to avoid being distracted by the target fish’s neighbors—a simple task that teaches concentration and inhibition. This is just one of many ways to encourage reflection and discourage immediate, knee-jerk responding.
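For the curious, the core mechanic of such a game is simple to sketch. The few lines below generate one trial of a flanker-style display; the symbols and layout are my own illustrative choices, not the actual software described above.

```python
import random

def flanker_trial(congruent):
    """One trial of a flanker-style attention game: report the direction of
    the central fish while ignoring its neighbors, which may face the
    same way (easy) or the opposite way (requires inhibition)."""
    target = random.choice(["<", ">"])
    flanker = target if congruent else ("<" if target == ">" else ">")
    display = flanker * 2 + target + flanker * 2
    return display, target

display, answer = flanker_trial(congruent=False)
print(display, "-> correct response:", answer)  # e.g. '>><>>' -> '<'
```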
Long before computers were invented, the Italian doctor and teacher Maria Montessori (1870–1952) noticed how a variety of practical activities could develop concentration in young children. In today’s Montessori schools, for example, children walk along an ellipse drawn on the ground, without ever taking their feet off the line. Once they succeed, the difficulty is raised by having them walk with a spoon in their mouth, then with a ping-pong ball in the spoon, and so on. Experimental studies suggest that the Montessori approach has a positive impact on many aspects of child development.33 Other studies demonstrate the attentional benefits of video games, meditation, or the practice of a musical instrument. . . . For a young child, controlling their body, gaze, and breathing while coordinating their two hands can be an excruciatingly difficult task—that is probably why playing music at an early age has a strong impact on the attention circuits of the brain, including a significant bilateral increase in the thickness of the prefrontal cortex.34
Executive attention, the ability to concentrate and control oneself, develops with age and education. Learning to play a musical instrument is one of the many ways to enhance concentration and self-control from an early age. The cortex is thicker in musicians than in well-matched nonmusicians, particularly the dorsolateral prefrontal cortex, which plays an important role in executive control.
Training in executive control can even change one’s IQ. This may come as a surprise, because IQ is often viewed as a given—a fundamental determinant of children’s mental potential. However, the intelligence quotient is just a behavioral measure, and as such, it is far from being unchangeable by education. Like any of our abilities, IQ rests on specific brain circuits whose synaptic weights can be changed by training. What we call fluid intelligence—the ability to reason and solve new problems—makes massive use of the brain’s executive control system: both mobilize a similar network of brain areas, notably the dorsolateral prefrontal cortex.35 Indeed, standardized measures of fluid intelligence resemble the tests that cognitive psychologists use to assess executive control: both emphasize attention, concentration, and the ability to move quickly from one activity to another, without losing sight of the overall goal. And in fact, training programs that focus on working memory and executive control cause a slight increase in fluid intelligence.36 These results are consistent with previous findings showing that although intelligence is not devoid of genetic determinism, it can change dramatically in response to environmental factors, including education. And these effects can be enormous. In one study, low-IQ children between the ages of four and six were adopted into families with either a high or a low socioeconomic status. By adolescence, those who had landed in the better-off families had gained twenty IQ points, compared to only eight points for the others.37 A recent meta-analysis examined the effect of education on intelligence and concluded that each additional year at school yields a gain of one to five IQ points.38
The current frontier of research involves optimizing the effects of cognitive training and clarifying their limits. Can the effects last for years? How can we ensure that they extend well beyond the trained tasks, in various situations throughout life? This is the challenge, because, by default, the brain tends to develop tricks specific to each task, on a case-by-case basis. The solution probably lies in the diversification of learning experiences, and the best results seem to be obtained by educational programs that stimulate the core cognitive skills of working memory and executive attention in a great variety of contexts.
Certain findings make me particularly optimistic. Early training in working memory, especially if done in kindergarten, appears to have positive effects on concentration and success in many areas, including those most directly relevant to school: reading and mathematics.39 This is not surprising, since we have known for years that working memory is one of the best predictors of later success in arithmetic.40 The effects of these exercises are multiplied if we combine memory training with more direct teaching of the concept of the “number line”—the essential idea that numbers are organized on a linear axis where adding or subtracting consists of moving right or left.41 All these educational interventions seem to be the most beneficial to children from disadvantaged backgrounds. For families at low socioeconomic levels, early intervention, starting in kindergarten and teaching the fundamentals of learning and attention, can be one of the best educational investments.
Ὁ ἄνθρωπος φύσει πολιτικὸν ζῷον
Man is by nature a social (or political) animal.
Aristotle (350 BCE)
All mammalian species—including, of course, all primates—possess attention systems. But attention in humans exhibits a unique feature that further accelerates learning: social attention sharing. In Homo sapiens, more than in any other primate, attention and learning depend on social signals: I attend where you attend, and I learn from what you teach me.
From the earliest age, infants gaze at faces and pay particular attention to people’s eyes. As soon as something is said to them, their first reflex is not to explore the scene, but to catch the gaze of the person they are interacting with. Only once eye contact is established do they turn toward the object that the adult is staring at. This remarkable ability for social attention sharing, which psychologists also call “joint attention,” determines what children learn.
I have already told you about experiments where babies are taught the meaning of a new word, such as “wog.” If the infants can follow the speaker’s gaze toward the so-called wog, they have no trouble learning this word in just a few trials—but if “wog” is repeatedly played from a loudspeaker in consistent association with the same object, no learning occurs. The same goes for learning phonetic categories: a nine-month-old American child who interacts with a Chinese nanny for only a few weeks acquires Chinese phonemes—but if he receives exactly the same amount of linguistic stimulation from a very high-quality video, no learning occurs.42
Hungarian psychologists Gergely Csibra and György Gergely postulate that teaching others and learning from others are fundamental evolutionary adaptations of the human species.43 Homo sapiens is a social animal whose brain is endowed with circuits for “natural pedagogy” that are triggered as soon as we attend to what others are trying to teach us. Our global success is due, at least in part, to a specific evolutionary trait: the ability to share attention with others. Most of the information we learn, we owe to others rather than to our personal experience. In this manner, the collective culture of the human species can rise far above what any individual could discover alone. This is what psychologist Michael Tomasello calls the “cultural ratchet” effect: just as a ratchet prevents an elevator from falling back down, social sharing prevents culture from regressing. Whenever one person makes a useful discovery, it quickly spreads to the whole group. Thanks to social learning, it is very rare for the cultural elevator to come down and for a major invention to be forgotten.
Our attentional system has adapted to this cultural context. Gergely and Csibra’s research shows that, from an early age, children’s attention is highly attuned to adult signals. The presence of a human tutor, who looks at the child before making a specific demonstration, massively modulates learning. Not only does eye contact attract the child’s attention, but it also signals that the tutor intends to teach the child an important point. Even babies are sensitive to this: eye contact puts them in a “pedagogical stance” that encourages them to interpret the information as important and generalizable.
Let’s take an example: A young woman turns to object A with a big smile, then to object B with a grimace. An eighteen-month-old baby watches the scene. What conclusion will the baby draw? It all depends on the signals that the child and the adult exchanged. If no eye contact was established, then the child simply remembers one specific piece of information: this person likes object A and dislikes object B. If, however, eye contact was established, then the child deduces much more: he believes that the adult was trying to teach him something important, and he therefore draws the more general conclusion that object A is good and object B is bad, not only for this person in particular but for everyone. Children pay extreme attention to any evidence of voluntary communication. When someone gives obvious signs of trying to communicate with them, they infer that this person wants to teach them abstract information, not just their own idiosyncratic preferences.
It is not only eye contact that matters: children also quickly understand the communicative intention that lies behind the act of pointing with a finger (whereas chimpanzees never really understand this gesture). Even babies realize when someone is trying to get their attention and give them important information. For instance, when nine-month-old babies see someone trying to catch their attention and then pointing to an object, they later remember the identity of that object, because they understand that this is the information that matters to their interlocutor—whereas, if they see the same person reaching toward the object without looking at them, they remember only the position of the object, not its identity.44
Social interactions are an essential ingredient of the human learning algorithm. What we learn depends on our understanding of the intentions of others. Even eighteen-month-old babies understand that if you look them in the eye, you are trying to convey important information to them. Following eye contact, they learn more effectively and generalize more readily (top). As early as fourteen months of age, babies can already interpret people’s intentions: after seeing a person turn on a light with her head, they imitate this gesture in every detail, unless the person’s hands were occupied, in which case babies understand that they can simply press the button with their hands (bottom).
Parents and teachers, always keep this crucial fact in mind: your attitude and your gaze mean everything for a child. Getting a child’s attention through visual and verbal contact ensures that she shares your attention and increases the chance that she will retain the information you are trying to convey.
No other species can teach like we do. The reason is simple: we are probably the only animals with a theory of other people’s minds, an ability to pay attention to them and imagine their thoughts—including what they think others think, and so on and so forth, in an infinite loop. This type of recursive representation is typical of the human brain and plays an essential role in the pedagogical relationship. Educators must constantly think about what their pupils do not know: teachers adapt their words and choose their examples in order to increase their students’ knowledge as quickly as possible. And the pupils know that their teacher knows that they do not know. Once children adopt this pedagogical stance, they interpret each act of the teacher as an attempt to convey knowledge to them. And the loop goes on forever: adults know that children know that adults know that they do not know . . . which allows adults to choose their examples knowing that children will try to generalize them.
This pedagogical relationship may well be unique to Homo sapiens: it does not seem to exist in any other species. In 2006, a landmark article45 published in Science described a form of teaching in the meerkat, a small South African mammal of the mongoose family—but in my view, the study misused the very definition of teaching. What was it about? A most serious family affair: learning how to prepare food! Meerkats face a serious cooking challenge: they feed on extremely dangerous prey, scorpions with deadly stingers that need to be removed before eating. Their plight is similar to that of Japanese cooks preparing fugu, a fish whose liver, ovaries, eyes, and skin contain deadly doses of the paralyzing poison tetrodotoxin: one error in the recipe, and you are dead. Japanese chefs train for three years before they are allowed to serve their first fugu—but how do meerkats acquire their know-how? The Science paper convincingly showed that adult meerkats help their young by first offering them “prepared” food consisting of scorpions with the stingers removed. As the young meerkats grow, the adults provide them with an increasing proportion of live scorpions, and this obviously helps the young become independent hunters. Thus, according to the authors, three teaching criteria are met: the adult performs a specific behavior in the presence of the young; this behavior has a cost for the adult; and the young benefit by acquiring knowledge more quickly than if the adult had not intervened.
The case of meerkats is certainly noteworthy: during mongoose evolution, a singular mechanism emerged that clearly facilitates survival. But is this genuine teaching? In my opinion, the data do not allow us to conclude that meerkats really teach their young, because one crucial ingredient is missing: shared attention to one another’s knowledge. There is no evidence that adult meerkats pay any attention to what the young know or, conversely, that the young take into account the pedagogical stance of the adults. Adult meerkats simply present increasingly dangerous prey to their young as the latter grow older. As far as we know, this mechanism could be completely pre-wired and specific to scorpion consumption—a complex but narrow-minded behavior comparable to the famous bee dance or the flamingo’s courtship display.
In brief, although we are tempted to project our own preconceptions onto mongooses and scorpions, a closer look reveals how far their behavior is from ours. With its obvious limitations, the story of the teaching meerkat actually teaches us, as in a photographic negative, what is truly unique and precious about our species. The genuine pedagogical relationships that happen in our schools and universities involve strong mental bonds between teachers and students. A good teacher builds a mental model of his students, their skills and their mistakes, and takes every action to enrich his pupils’ minds. This ideal definition therefore excludes any teacher (human or computer) who merely delivers a stereotypical lesson mechanically, without tailoring it to the prior knowledge and expectations of his audience—such mindless, unidirectional teaching is inefficient. Conversely, teaching is efficient only when the students, for their part, have good reasons to be persuaded that teachers are doing their best to convey their knowledge. Any healthy pedagogical relationship must be based on bidirectional streams of attention, listening, respect, and mutual trust. There is currently no evidence that such a “theory of mind”—the capacity of students and teachers to attend to one another’s mental states—exists in any animal other than the human species.
The meerkat’s modest pedagogy also fails to do justice to the role that education plays in human societies. “Every man is a humanity, a universal history,” says Jules Michelet (1798–1874). Through education, we convey to others the best thoughts of the thousands of human generations that preceded us. Every word, every concept we learn is a small conquest that our ancestors passed on to us. Without language, without cultural transmission, without communal education, none of us could have discovered, alone, all the tools that currently extend our physical and mental abilities. Pedagogy and culture make each of us the heir to an extensive chain of human wisdom.
But Homo sapiens’ dependency on social communication and education is as much of a curse as it is a gift. The flip side of the coin is that the very receptivity which makes education possible also lets religious myths and fake news propagate easily through human societies. From the earliest age, our brains trustfully absorb the tales we are told, whether they are true or false. In a social context, our brains lower their guard; we stop acting like budding scientists and become mindless lemmings. This can be good—as when we trust the knowledge of our science teachers, and thus avoid having to replicate every experiment since Galileo’s time! But it can also be detrimental, as when we collectively propagate an unreliable piece of “wisdom” inherited from our forebears. It is on this basis that doctors foolishly practiced bloodletting and cupping therapies for centuries, without ever testing their actual impact. (In case you are wondering, both are actually harmful in the vast majority of diseases.)
A famous experiment demonstrates the extent to which social learning can turn intelligent children into unthinking copycats. As early as fourteen months of age, babies readily imitate a person’s action, even if it doesn’t make sense to them—or perhaps especially when it doesn’t.46 In this experiment, infants see an adult with her hands tied up by a shawl, pressing a button with her head. The infants infer that they can simply press the button with their free hands, and this is how they end up imitating the action, rather than copying it in every detail. If, however, they see the same person pressing a button with her head for no particular reason, hands completely free and perfectly visible, then the babies seem to abandon all reasoning and blindly trust the adult—they faithfully imitate the action with a bow of the head, although this movement is meaningless. The infants’ head bow seems to be a precursor of the thousands of arbitrary gestures and conventions that human societies and religions perpetuate. In adulthood, this social conformism persists and grows. Even the most trivial of our perceptual decisions, such as judging the length of a line, are influenced by social context: when our neighbors come to a different conclusion than us, we frequently revise our judgment to align it with theirs, even when their answer seems implausible.47 In such cases, the social animal in us overrides the rational beast.
In short, our Homo sapiens brain is equipped with two modes of learning: an active mode, in which we test hypotheses against the outside world like good scientists, and a receptive mode, in which we absorb what others transmit to us without personally verifying it. The second mode, through the cultural ratchet effect, is what allowed the extraordinary expansion of human societies over the past fifty thousand years. But without the critical thinking that characterizes the first mode, the second becomes vulnerable to the spread of fake news. The active verification of knowledge, the rejection of simple hearsay, and the personal construction of meaning are essential filters that protect us from deceitful legends and gurus. We must therefore strike a balance between our two learning modes: our students must be attentive and confident in their teachers’ knowledge, but also autonomous, critical thinkers who are active agents of their own learning.
We are now touching the second pillar of learning: active engagement.