There are multiple neural systems that drive decision-making, each of which has different computational properties that make it better suited to drive action-selection under different conditions. We will identify four action-selection systems: Reflexes, Pavlovian, Deliberative, and Procedural. Along with motoric, perceptual, situation-categorization, and motivational support systems, these make up the decision-making system.
Imagine the first time you drive to work. You would probably look at a map and plan the best route. During the drive, you would pay particular attention to road signs and landmarks. But after driving that same route each day for weeks, months, or years, the route becomes automatic. It no longer requires the same amount of attention, and you can think about other things: the test you have to write, your kids’ soccer game that evening, what you’re going to make for dinner. And yet, you arrive at work. In fact, it can become so automatic that you drive to work without meaning to.A
As we will see later in this section, these two processes (the conscious, attention-demanding, map-based, planning process and the automatic, low-attention, sequence-based, routine process) actually reflect two different systems that process information about the decision in fundamentally different ways, and depend on different brain structures.1 Scientists studying decision-making have identified at least four separate action-selection processes:
• A hardwired system that reacts quickly to immediately sensed direct dangers and events. These are your reflexes, genetically wired into your spinal cord, your peripheral nervous system, and your central brainstem.
• A system that predicts outcomes and reacts in a genetically prewired way to those outcomes. Computationally, what is learned in this system is a hypothesized causal relationship between the cues and the outcomes. The actions are not learned; they are “released” as appropriate responses to the expected outcome. Throughout this book, I will refer to this as the Pavlovian action-selection system.
• A system that deliberates over decisions, requiring a lot of resources, but also capable of complex planning and flexibility. I will refer to this as the Deliberative system, but it has also been described as goal-directed learning, as model-based decision-making, as the action–outcome system, and as the locale or place navigation system.2
• A system that simply learns the best action to take in a given situation. Once that association is stored, the decision process is very simple but inflexible. I will refer to this as the Procedural action-selection system, as it is heavily involved in learned motor sequences (procedures), but it has also been referred to as the habit, stimulus–response, stimulus–action, cached-action, and taxon navigation system.3
These four systems are generally sufficient to explain action-selection in the mammalian brain;B however, they each have internal subtleties that we will address as we examine each system separately. In addition, their interactions produce subtleties not directly obvious from their individual computational components.
For these systems to work in the real world, we need several additional support systems. At this point, it is not clear whether these support systems are separate components that are shared among the decision-making systems, or whether there are separate versions of each, so that each decision-making system has its own copy of each support system. Alternatively, these support systems could be full-fledged systems in their own right, with their own internal components. Finally, it’s possible that the support systems correspond in some way to some aspect of one or more of the decision-making systems. Nevertheless, breakdowns can occur in the support systems as well,5 and we will need to include them in order to complete our description of the machinery of decision-making.
• First, of course, one needs a motor control system that produces the muscle movements, the effectors that move the body and take the actual action. As the questions being addressed in this book are primarily about the selection of the action, not the action itself, we will not spend a lot of time on the spinal motor control system, except to say that actions are generally stored as processes more complex than moving a single muscle.
• Second, one needs perceptual systems that receive and interpret the basic sensory signals to determine what information those signals carry. These systems need to recognize objects and their basic properties (What is it? Where is it?). We will discuss these in terms of the detection of features and the integration of information.
• In addition, however, one needs a system that recognizes which are the important cues on which to base one’s decision, whether in terms of external cues in the world or internal body-related cues. Although many scientific decision-making literatures refer to this process as “stimulus identification,” I think that term leads us to imagine that reactions are made to a single stimulus, while clearly we are recognizing multiple stimuli and integrating them into a recognition of the situation we are in. Thus, I will refer to this as the situation-recognition system.
• Finally, one needs a system that encodes one’s goals and desires. This system is going to depend on one’s internal (evolutionary) needs, but will also need to mediate conflicting goals, needs, and desires. For obvious reasons, I will refer to this as the motivational system.
Although I will refer to these different aspects of decision-making as “systems,” I don’t want you to take them as different, separate modules. A good example of what I mean by the term “system” is the electric and gas engines in the new hybrid cars (like the Toyota Prius)—the car can be driven by the electric motor or by the gas motor, but both engines still share the same drive train and the same accelerator pedal. Some parts are shared, and some parts are separate. In the end, however, the goal of both engine systems is to make the car go. And the car still requires an additional steering system to guide it on the correct course. Similarly, what matters for evolution is the behavior of the organism itself, not the separate decision-making systems. The decision-making systems are separate in that they require different information processing and involve overlapping, but also different, neural structures, even though they all interact to produce behavior. We will find it useful to separate these systems as a first step in our analysis, but we will need to come back to their interaction in order to understand behavior (and its failure modes).
Although we like to think of ourselves as a unitary decision-maker with a single set of desires, we all have experiences in which these multiple systems can come into conflict with each other. The reason that multiple systems evolved is that they each have advantages and disadvantages, and they are each better at some tasks and worse at others. If one can successfully select which system is going to be best at each time, one can gain the advantages of each and diminish the disadvantages of each. We will talk later about what is known about the mechanism by which we mediate between these systems, but there is still a lot of debate in the scientific community about exactly how that mediation happens.
Our reflexes are the simplest decision-making systems we have. Scientists studying decision-making often dismiss the reflex as not being a decision-making system because it’s too simple. But by our definition of decision-making as selecting an action, we have to include a reflex as a decision. Because reflexes interact with the other decision-making systems, we will find that it is useful to include reflexes in our taxonomy of decision-making systems.
Let’s take an example reflex. If your hand touches something hot enough to burn it (say a candle flame), you pull your hand away quickly. This is something that an animal wants to do as quickly as possible. If we took the time to think, “That’s hot. Should I pull my hand away?” the flame would have done damage to our fingers. Similarly, we don’t have time to learn to pull our hand away from the fire; evolution wants that to happen correctly the first time. We can think of reflexes as simple operations (actions to be selected in reaction to specific conditions) that have been learned over evolutionary time, but that once learned within a species, are hardwired within a given individual. These twin issues of the time it takes to compute an answer and the time it takes to learn the right answer are critical advantages and disadvantages of each of the action-selection systems.
Proof that a reflex really is a decision can be seen in its interaction with the other action-selection systems. In David Lean’s Lawrence of Arabia, there is a famous scene where T. E. Lawrence, played by Peter O’Toole, waits for the match to burn all the way down to his fingers. His companion then tries it and complains that it hurts. “Of course,” replies Lawrence, “but the trick is not minding that it hurts.” It is possible to override the default Reflex action-selection with a different, more conscious system. This interaction between multiple systems as an important part of how we make decisions is something we will return to several times in this book.
Pavlovian action-selection is also computationally simple in that it can only produce actions that have been learned over an evolutionary timescale (often called unconditioned responses),6 but it differs from the Reflex action-selection system in that it learns.C
The Pavlovian system learns stimuli that predict outcomes, but it does not learn the actions to take with those outcomes.14 Pavlovian learning is named after Ivan Pavlov (1849–1936), the great Russian physiologist. Supposedly, Pavlov rang a bell before feeding a dog and found that the dog learned to salivate to the bell in anticipation of the food.D
When describing Pavlovian learning, scientists often describe the stimuli as “releasing specific actions.”19 In classic animal learning theory, there is an unconditioned stimulus (the food being presented to Pavlov’s dog), an unconditioned response (salivating), and a conditioned stimulus (the bell). In Pavlovian learning, the unconditioned response shifts in time from the unconditioned stimulus to the conditioned stimulus. Interpretations of Pavlovian learning tend to ignore the action-selection component of the learning, suggesting that the unconditioned response is simply a tool to identify an association between the conditioned stimulus and the unconditioned stimulus.20
These interpretations arise in part because the same training that makes animals take the unconditioned response after seeing the training cue (Pavlovian action-selection) also often makes animals more likely to take other, untrained actions that they know will reach the unconditioned stimulus (as if they had an increased motivation to reach that unconditioned stimulus).21 This is a process called Pavlovian-to-instrumental transfer, which we will examine in depth when we reach our discussion of motivation, below. Recent data suggest that the two effects (release of unconditioned responses to conditioned stimuli [Pavlovian action-selection] and an increase in motivation as evidenced by increased actions taken by other systems [Pavlovian-to-instrumental transfer]) are dissociable. They are affected differently by lesions of different brain structures and by different pharmacological manipulations.22 The relationship between the Pavlovian action-selection and motivational systems is still an open question being actively pursued by scientists.
However, we must not forget that the animal is, in fact, taking an action. Recent interpretations suggest that Pavlovian learning plays a role in an action-selection system, in which the unconditioned responses are actions learned over an evolutionary timescale.23 Imagine living on the savannah where there are lions that might hunt you. You can learn that the rustle in the brush predicts that a lion is stalking you and you can learn to run from the rustling grass, but you don’t get a chance to learn to run from the lion—you’ve got to get that right the first time. The Pavlovian action-selection system can learn that the rustle in the grass is a lion stalking you, which leads to fear, and running, but it can’t learn that the right response is to do jumping jacks.
A great way to understand the Pavlovian system is the phenomenon of sign-trackers and goal-trackers.24 Imagine rats being placed in an environment with a food-release port and a light, separated by some small but significant distance. The light comes on, and a few moments later food is delivered at the food port. The connection between the light and the food is independent of what the rat does—the light simply predicts that the food will be available. The right thing for the rat to do is to go to the food port when the light comes on. (This would be “goal-tracking.”) Some rats, however, go to the light during the intervening interval and gnaw on it. They still go to the food when the food is delivered, but they have wasted time and energy going out of their way to the light first. (This would be “sign-tracking.”) In fact, if a sign-tracking rat has had enough experience with the light–food association, and you stop giving the food if the rat goes to the light and give it only if the rat goes directly to the food port, the rat has a devil of a time stopping going to the light.25 The Pavlovian association is too strong.
Current thinking suggests that Pavlovian action-selection is related to what we recognize as emotional responses,26 although the specific relationship between the Pavlovian action and the linguistically labeled emotion is controversial.27 Some authors have suggested that emotional responses are Pavlovian actions changing internal responses (heart rate, salivation, etc.).28 Others have suggested that the things we label emotions are a categorization process applied to these internal responses.29 A discussion of categorization processes in mammalian brains (including humans) can be found in Appendix C. We will come back to the relationships between Pavlovian action-selection and emotion in Chapter 8.
Although flexible in the stimuli it can react to, the set of actions available to the Pavlovian action-selection system is greatly limited. Pavlovian responses can only transfer a simple “unconditioned response” to an earlier stimulus. This makes it more flexible than simple reflexes, but the Pavlovian action-selection system cannot take an arbitrary action in response to an arbitrary stimulus. For that, we need a more complex decision-making machinery, like the Deliberative or Procedural systems.
The Deliberative system is more flexible. It can take any action in any situation, but it is a computationally expensive system. In humans, deliberation entails planning and consists of a conscious investigation of the future. This is particularly useful for large one-time decisions, where one cannot try the individual options multiple times to determine the value of each.
For example, a friend of mine spent the last month deciding between two universities that had made him job offers. Each offer had advantages and disadvantages. This was not a decision that he could make ten times, observing the outcome each time, and then slowly learn to make the correct choice. Obviously, it was not a decision with an action that evolved over generations.E Instead, he had to imagine himself in those two futures—What would life be like at University One? What would life be like at University Two? Through his life experience and through discussion with his colleagues,F he tried to work out the consequences of his decision. From that complex deliberation, he worked out what he thought was the best choice.
This is a process called episodic future thinking.30 We will return to the issue of episodic future thinking in later chapters, but the concept is that when trying to plan a complex future event, one needs to draw together memories and experiences from lots of sources to imagine what the potential future might be like. By imagining ourselves in that future event, we can determine whether it is something we want to happen or something we want to avoid. As we will see, both the hippocampus and the prefrontal cortex are critically involved in episodic future thinking.31;G
A critical question that has plagued psychology for years is: Do animals deliberate as well?34 We can’t ask a rat what it is thinking about, but we can decode information from neural signals. If we know what information is represented by an ensemble of cells, then we can decode that information from the activity in those cells. (See Appendix B for a discussion of current technology and methods that enable the decoding of neural signals.) The hippocampus is a critical part of episodic future thinking in humans.35 In rats, the information represented by hippocampal cells is primarily the location of the animal in its environment (these are known as place cells).36 From the activity in these cells, we can decode the location represented by the cells at a given moment in time.37 So, if a rat were to imagine being at another place in the environment, we should be able to decode that imagination by looking at the set of active hippocampal cells.38
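To make the decoding idea concrete, here is a toy sketch of a population-vector decoder, not the actual analysis used in the studies cited above. It assumes (hypothetically) that each place cell fires according to a Gaussian place field on a linear track, and it estimates the represented position as the firing-rate-weighted average of the cells’ preferred locations; all numbers are invented for illustration.

```python
import math

def tuning_curve(preferred, position, peak=10.0, width=5.0):
    """Expected firing rate of a place cell with a Gaussian place field."""
    return peak * math.exp(-((position - preferred) ** 2) / (2 * width ** 2))

def decode_position(preferred_locations, firing_rates):
    """Population-vector decoder: firing-rate-weighted mean of preferred places."""
    total = sum(firing_rates)
    if total == 0:
        return None  # no activity, nothing to decode
    return sum(p * r for p, r in zip(preferred_locations, firing_rates)) / total

# Five hypothetical place cells tiling a 100 cm track.
preferred = [10.0, 30.0, 50.0, 70.0, 90.0]

# Simulate the ensemble response when the rat is (or imagines being) at 50 cm.
rates = [tuning_curve(p, 50.0) for p in preferred]
print(decode_position(preferred, rates))  # decodes to ~50.0, the represented place
```

Real decoders are usually Bayesian and work on spike counts in short time windows (see Appendix B), but the principle is the same: knowing each cell’s tuning lets us read the represented location back out of the ensemble activity, whether the rat is actually there or only imagining being there.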
Rats certainly look like they deliberate over choices. When rats come to a difficult or recently changed choice (say at a T-intersection in a maze), they sometimes pause, turning first one way and then the other.39 When this behavior was first observed by Karl Muenzinger and his student Evelyn Gentry in 1931, they called it “vicarious trial and error” because they thought that the rat was vicariously trying out the possibilities. Just as my friend imagined himself at his two future jobs to try to decide between them, the rat was imagining itself taking the two choices. These ideas were further developed by Edward Tolman in the 1930s and 1940s, who showed that these vicarious trial and error processes occurred in situations in which we ourselves would deliberate over choices and argued explicitly that the animals were representing the future consciously.H
Unfortunately, Tolman did not have a mathematical way to explain his ideas and was attacked because he “neglected to predict what the rat will do.” The psychologist Edwin Guthrie, in a 1937 review article, complained that “So far as the theory is concerned the rat is left buried in thought.”42 Tolman, however, was writing before the invention of the modern computer and had no language to say “yes, the rat is buried in thought because it is doing a complex calculation.” In our modern world, we have all experienced the idea of computational complexity—algorithms take time to run. We see it when our computers search for a file in a directory, when a website takes time to load, when a computer opponent in a game pauses before responding. It is not so strange to us for a computer to be “lost in thought.” And, in fact, Tolman (following Muenzinger and Gentry) had directly observed rats pausing at those choice points, presumably “lost in thought.”
In addition, of course, Tolman and his colleagues did not have the neurophysiological tools we have now. Just as humans with hippocampal damage cannot imagine future events,43 so too rats with hippocampal damage no longer pause at choice points, nor do they show as much vicarious trial and error.44 We can now decode representations of place from recorded ensembles of hippocampal neurons (Appendix B). When we did this in my laboratory, we found that during those paused “vicarious trial and error” events, the hippocampal representation swept ahead of the animal, first down one potential future choice, then down the other.45 Just as my friend was imagining what it would be like to take one job or the other—so, too, the rat was imagining what would happen if it went running down the leftward path or down the rightward path. The most important observation we made was that these representations were sequential, coherent, and serial. They were sequential in that they consisted of a sequence of cells firing in the correct order down a path. They were coherent in that the sequences consisted of accurate representations of places ahead of the animal. And they were serial in that they went down one path and then the other, not both. Our rats really were imagining the future.I
It’s one thing to imagine the future. But how did our rats escape being lost in thought? How did our rats make the actual decision? The next step in deliberation is evaluation. Our rat needs to go beyond saying “going left will get me banana-flavored food pellets” to say “and banana-flavored food pellets are good.” Two structures involved in evaluation are the ventral striatum and the orbitofrontal cortex. Both of these structures are involved in motivation and are often dysfunctional in drug addiction and other motivation-problem syndromes.47 Moreover, the ventral striatum receives direct input from the hippocampal formation.48 Orbitofrontal cortex firing is modulated by hippocampal integrity.49 What were these structures doing when the hippocampus represented these potential futures?
Just as we can record neural ensembles from the hippocampus and decode spatial representation, we can record neural ensembles from the ventral striatum or the orbitofrontal cortex and decode reward-related representations. We found that both ventral striatal and orbitofrontal reward-related cells (cells that normally respond when the animal gets reward) showed extra activity at the choice point where vicarious trial and error was occurring.50 Muenzinger, Gentry, and Tolman were entirely correct—when these rats pause and look back and forth at choice points, they are vicariously trying out the potential choices, searching through the future, and evaluating those possibilities.
The Deliberative decision-making system has a lot of advantages. It’s extremely flexible. Knowing that going left can get you banana-flavored food pellets does not force you to go left. You can make a decision about which college to go to, which job to take, which city to move to, without having to spend years trying them. Knowing that one company has made you a job offer does not commit you to taking that offer. But deliberation is computationally expensive. It takes resources and time to calculate those potential future possibilities. If you are faced with the same situation every day and the appropriate action is the same every time, then there’s no reason to waste that time re-planning and re-evaluating those options—just learn what the right choice is and take it.
The Procedural system is, on the surface, much simpler—it simply needs to recognize the situation and then select a stored action or sequence of actions.51 This is the classic stimulus–response theory of action-selection. The Procedural action-selection system differs from the other three systems. Unlike the hardwired reflexes, the Procedural system can learn to take any action. Like the Pavlovian system, it can learn to work from any given situation, but unlike the Pavlovian system, it can learn to take any action in response to that recognized situation. In contrast to the Deliberative system, Procedural action-selection can work very quickly, but it doesn’t include a representation of the outcome, nor is it flexible enough to change easily. In short, Procedural learning associates actions with situations. It can take the action quickly because it is just looking up which action to take, but for the same reason it lacks the flexibility to change easily if needed.
To see the distinction between the Deliberative and Procedural systems, we can imagine a rat trained to push a lever to receive fruit-flavored food pellets. Once the rat has learned the task (push the lever, get fruit-flavored food), when the rat is placed back in the experimental box, it will push the lever and eat the food. We can imagine two possible learning mechanisms here:52 “If I push the lever, fruit-flavored food comes out. I like fruit-flavored food. I’ll push the lever” or “Pushing the lever is a good thing to do. When the lever appears, I’ll push the lever.” The first logic corresponds to the Deliberative (search-based, predicting outcomes from actions) system, while the second logic corresponds to the Procedural (cached-action, automatic, habit, stimulus–response) system.
These two cognitive processes can be differentiated by changing the value of the fruit-flavored food to the rat. For example, the rat can be given fruit-flavored food pellets and then given lithium chloride, a chemical that does not hurt the rat but makes it feel ill.53 (Lithium chloride often makes humans vomit.) When the rat is next placed into the experimental box, if it is using the Deliberative system, the logic becomes “If I push the lever, fruit-flavored food comes out. Yuck! That stuff is disgusting. Don’t push the lever.” If, instead, the rat is using the Procedural system, then the logic becomes “There’s that lever. I should push it.” When fruit-flavored food comes out, the rat ignores it and doesn’t eat it—but it still pushes the lever. These two reactions depend on different brain structures, occur after different training paradigms, and require different computations.54
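This devaluation contrast maps onto the model-based versus model-free distinction in reinforcement learning, and it can be sketched in a few lines of code. The dictionaries and values below are hypothetical placeholders, not a model of the actual experiments:

```python
# A Deliberative (model-based) agent predicts the outcome and evaluates it
# now; a Procedural (model-free) agent looks up only a cached action value.

outcome_of_action = {"press_lever": "fruit_pellet"}   # learned world model
outcome_value = {"fruit_pellet": +1.0}                # current evaluation
cached_action_value = {"press_lever": +1.0}           # stored during training

def deliberative_choice(action):
    """Predict the outcome, then check its value at decision time."""
    return outcome_value[outcome_of_action[action]] > 0

def procedural_choice(action):
    """Just look up the stored value of the action itself."""
    return cached_action_value[action] > 0

# Devalue the food (lithium chloride pairing): only the outcome value changes.
outcome_value["fruit_pellet"] = -1.0

print(deliberative_choice("press_lever"))  # False: "Yuck, don't press"
print(procedural_choice("press_lever"))    # True: still presses the lever
```

After devaluation, only the Deliberative agent stops pressing, because only it consults the outcome’s current value at decision time; the Procedural agent’s cached value would change only through further experience with the lever.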
Computationally, the Procedural system is very simple: learn the right action or sequence of actions, associate it with a recognized situation.55 (Actually, most models associate a value with each available action in each situation, storing values of situation–action pairs, and then take the action with the highest value.56) Then the next time you face that situation, you know exactly what to do. This is very much the concept of a non-conscious habit that we are all familiar with. When a sports star learns to react quickly “without thinking,” it’s the Procedural system that’s learning. When soldiers or police learn to respond lightning-quick to danger, it’s their Procedural systems that are learning those responses.
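The parenthetical above, storing values of situation–action pairs and taking the highest-valued action, is easy to sketch. The situations, actions, and numbers here are hypothetical placeholders; in real models these values are learned from experience rather than written down by hand:

```python
# Minimal sketch of Procedural action-selection: cached values for
# situation-action pairs, and a lookup that takes the best stored action.

cached_values = {
    ("lever_present", "press_lever"): 0.9,
    ("lever_present", "groom"): 0.1,
    ("tone_playing", "press_lever"): 0.2,
    ("tone_playing", "freeze"): 0.7,
}

def select_action(situation):
    """Look up the stored value of each available action in this
    situation and take the highest-valued one -- fast, but inflexible."""
    candidates = {a: v for (s, a), v in cached_values.items() if s == situation}
    return max(candidates, key=candidates.get)

print(select_action("lever_present"))  # press_lever
print(select_action("tone_playing"))   # freeze
```

Note how all the work has moved into deciding which situation we are in; once that is settled, the "decision" is a single table lookup, which is why situation recognition carries so much of the load for this system.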
The old-school psychology literature often talked about this system as “stimulus–response,” with the idea that the specific responses (press a lever) are becoming attached to specific stimuli (the light turns on), but we know now that the responses can be pretty complex (throw the ball to the receiver) and that the stimuli are better described as full situations (he’s open). Extensive experimental work on the mechanisms of motor control shows that the Procedural learning process can recognize very complex situations and can learn very complex action sequences.57 Part of the key to being a successful sports star or making the correct police response is the ability to recognize the situation and categorize it correctly.58 Once the situation has been categorized, the action-selection process is simple. One way to think of the difference between the Deliberative and Procedural systems is that the Procedural system shifts all of the decision-making complexity into the situation-recognition component.
One of my favorite scenes of this is in the movie Men in Black,59 where the new recruit (Agent J-to-be, played by Will Smith) is faced with a simulated city scene and decides to ignore all the cutouts of scary aliens and shoot the little girl. When asked why he had shot “little Tiffany” by Chief Agent Zed, he responds by explaining how each alien looked scary but was just “trying to get home” or was “actually sneezing,” but “little Tiffany was an eight-year-old girl carrying quantum mechanics textbooks through an unsafe city street at night.” The point, of course, is that Agent J had excellent situation-categorization skills and noticed the subtleties (such as the scary alien’s handkerchief and little Tiffany’s quantum textbooks) and could react quickly to the correct situation.
Although describing Procedural action-selection as storing an action seems simple enough—recognize the situation and take the stored action—there are two important complexities in how this system works. First, one has to learn what the right action is. Unlike the Deliberative system, the Procedural system is very inflexible; which action is stored as the best choice cannot be updated on the fly. We will see later that overriding a stored action is difficult and requires additional neural circuitry (self-control systems in the prefrontal cortex, Chapter 15). Second, the circuitry in the mammalian brain (including in the human) seems to include separate “go” and “don’t go” circuits—one system learns “in this situation, take this action,” while another system learns “in that situation, don’t.”60
The current data strongly suggest that the Procedural system includes the basal ganglia, particularly the dorsal and lateral basal ganglia circuits, starting with the caudate nucleus and the putamen, which receive inputs from the cortex.61 These synaptic connections between the cortical circuits (which are believed to represent information about the world) and the striatal neurons are trained up by dopamine signals.62 If you remember our discussion of value, euphoria, and the do-it-again signal (Chapter 4), you’ll remember that dopamine signaled the error in our prediction of how valuable things were63—if things came out better than we expected, there was an increase in dopamine and we wanted to increase our willingness to take an action; if things were worse than we expected, there was a decrease in dopamine and we wanted to decrease our willingness to take an action; and if things came out as expected, then we didn’t need to learn anything about what to do in that situation. The connection between the cortex and the caudate/putamen encodes our “willingness to take an action” and is trained up by these dopamine signals.
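The dopamine logic in this paragraph is usually formalized as a reward-prediction error. Here is a deliberately stripped-down delta-rule sketch (one action, a scalar "willingness," an arbitrary learning rate), not a model of the real circuitry:

```python
def update_willingness(willingness, reward, learning_rate=0.1):
    """Delta-rule sketch: a dopamine-like prediction error trains the
    'willingness to take an action' encoded in cortex-striatum synapses."""
    prediction_error = reward - willingness  # better or worse than expected?
    return willingness + learning_rate * prediction_error

w = 0.0
for _ in range(100):
    w = update_willingness(w, reward=1.0)  # the action keeps paying off
print(round(w, 2))  # climbs toward 1.0 as the prediction error shrinks
```

Once the prediction matches the reward, the error, and hence the dopamine teaching signal, goes to zero: exactly the "if things came out as expected, we didn't need to learn anything" case described above.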
In addition to the four action-selection systems, we need four additional components: a physical action system, which can physically take the action that’s been selected; a perceptual system, which can recognize the objects in the world around us; a situation-recognition system, which categorizes stimuli into “situations”; and a motivational system, which identifies what we need next and how desperate we are to satisfy those needs.
Most actions require that we move our muscles. This is true whether we are running down a street, turning to face an adversary, smiling or frowning, or signing on the dotted line. In the end, we move our muscles and have an effect on the world. Even speech is muscle movement. As a support system, this would often be referred to as the “motor control” system, because it is the final step in physically interacting with the world. For completeness, it is probably worth including nonmotor actions within this support system. For example, in our description of Pavlovian action-selection, Pavlov’s dogs were salivating, releasing a liquid into their mouths, which is not really muscle movement. Similarly, getting overheated leads to sweating, which again is not really muscle movement. Nevertheless, most of the effects of the physical action system are motor movements.
Details of how muscles work can be found in any medical textbook. Many neuroscience textbooks have good descriptions.64 The important issue for us is that within the motor control (physical action) system, the building blocks are not individual muscle movements, but rather components made of sets of related movements, called muscle synergies.65 But even within those muscle synergies, we do not place one foot down, heel to toe, place the other down, and lift the first foot; we do not put our feet one step at a time; instead, we walk.J Human walking, like a fish or lamprey swimming, is an example of a process called a central pattern generator, so called because it is able to generate a sequence centrally, without external input.67 Central pattern generators can continue to oscillate without input, but they can be modulated directly by external sensory cues, and also by top-down signals (presumably from the action-selection systems themselves).
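The central-pattern-generator idea can be caricatured in code with a toy "half-center" model: two units inhibit each other and the active unit fatigues, so activity alternates rhythmically with no external input at all. This is a cartoon of the principle, not a biophysical model, and every parameter is arbitrary:

```python
# Toy half-center oscillator: the winner-take-all step stands in for
# mutual inhibition; fatigue makes the active unit eventually lose.

def simulate(steps):
    activity = [1.0, 0.0]   # unit 0 starts active
    fatigue = [0.0, 0.0]
    pattern = []
    for _ in range(steps):
        drive = [activity[i] - fatigue[i] for i in (0, 1)]
        winner = 0 if drive[0] > drive[1] else 1
        loser = 1 - winner
        activity = [0.0, 0.0]
        activity[winner] = 1.0                            # winner fires
        fatigue[winner] += 0.3                            # firing fatigues
        fatigue[loser] = max(0.0, fatigue[loser] - 0.3)   # resting recovers
        pattern.append(winner)
    return pattern

# Bursts alternate with no input at all, like left-right stepping:
print(simulate(12))  # -> [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
```

Sensory or top-down input could be added by biasing the drive terms, which is how the action-selection systems are thought to modulate an ongoing rhythm without having to generate it themselves.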
Since our target in this book is to understand the action-selection process, we are going to sidestep the question of what those actions are at the individual motor control level, but obviously these processes can be very important clinically.
Much like the thermostat that we started with at the beginning of the book (Chapter 2), to take the appropriate actions in the world, one needs to perceive the world. Of course, our perceptual systems are much more complex than simply measuring the temperature. Human perceptual systems take information about the world from sensory signals, process that information, and interpret it. Although our eyes transform light (photons) into neural activity, we don’t see the photons themselves; we see objects—the tree outside my window, the chickadee hopping from branch to branch. Similarly, our ears detect vibrations in the air, but we hear sounds—the whistle of the wind, the song of the chickadee.
As with the physical action system, details of how perceptual systems work can be found in many neuroscience textbooks. Neil Carlson’s Physiology of Behavior and Avi Chaudhuri’s Fundamentals of Sensory Perception are excellent starting points. We are not going to spend much time on the transition from physical reality to neural activity (photons to sight, vibrations to sound), but we will take some time to explore how the neural systems categorize perception to identify objects and pull information out of those sensory signals (see Chapter 11).
When psychologists classically tested animals in learning experiments, they used individually identifiable stimuli so that they could control the variables in their experiments.68;K But what’s the important cue in the world right now? I am writing this paragraph on a black Lenovo laptop, sitting at a desk covered in a dark-brown woodgrain and glass, listening to Carlos Santana’s wailing guitar, looking out the window at the darkness (it’s 6 a.m. in winter). Outside, a streetlight illuminates the empty intersection, the shadow of a tree looms over my window, there’s snow on the ground, and I can see the snowflakes falling. The light over my desk reflects off the window. Next to me are my notes. Which of these cues are important to the situation? If something were to happen, what would I identify as important?
Although it’s controversial, my current hunch is that the neocortex is one big situation-recognition machine. Basically, each cortical system is categorizing information about the world. The visual cortex recognizes features about sights you see; the auditory cortex recognizes features about sounds you hear. There are also components of the cortex that recognize more abstract features—the parietal cortex recognizes the location of objects around you; the temporal cortex recognizes what those objects are. There is an area of the cortex that recognizes faces, and an area that recognizes the room you are in.70 Even the motor cortex can be seen as a categorization of potential actions.71 As one progresses from the back of the brain (where the sensory cortices tend to be) to the front of the brain, the categories get more and more abstract, but the cortical structure is remarkably conserved and the structure of the categorization process does not seem to change much.72
Models aimed at each of these individual cortices all seem to work through a process called content-addressable memory.73 Content-addressable memory differs from the indexed memory that we are familiar with in typical computers: content-addressable memory recalls the full memory from a part of it, while indexed memory is recalled from an unrelated identifier. An image of your daughter’s birthday party stored on your computer might be titled IMG3425.jpg (indexed memory, no relation between title and subject), but the memory would be recalled in your brain from a smile on her face, cake, candles getting blown out …
Content-addressable memory is a process in which partial patterns become completed. The computational processes that underlie content-addressable memories are well understood and can be thought of as a means of categorization. A thorough description of those computational processes, their neural implementation, and how they lead to categorization can be found in Appendix C.
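A minimal version of such a pattern-completion network is the classic Hopfield model: store a few patterns in Hebbian weights, then repeatedly update each unit toward agreement with the rest, and a partial or noisy cue settles into the nearest stored memory. The sketch below is illustrative (the pattern names and sizes are invented for the example, not drawn from the text or Appendix C).

```python
def train(patterns):
    """Hebbian weights for a Hopfield network. Each pattern is a list of
    +1/-1 values; units that are active together get linked together."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w

def recall(w, cue, sweeps=5):
    """Content-addressable recall: start from a partial or noisy cue and
    let each unit flip toward the stored pattern the cue best matches."""
    s = list(cue)
    for _ in range(sweeps):
        for i in range(len(s)):
            h = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if h >= 0 else -1
    return s

# Two stored "memories": orthogonal 16-unit patterns, chosen for illustration.
birthday = [1, -1] * 8
beach_trip = [1, 1, -1, -1] * 4
w = train([birthday, beach_trip])

cue = list(beach_trip)
cue[0], cue[5] = -cue[0], -cue[5]  # a degraded cue: two units are wrong
# recall(w, cue) completes the pattern back to beach_trip
```

Nothing here looks up an identifier like IMG3425.jpg; the content of the cue itself addresses the memory, which is exactly the distinction drawn above.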
The hypothesis that the neocortex is machinery for situation-categorization provides a potential explanation for both the extensive increase in neocortical surface area in humans and our amazing abilities that seem to be uniquely human—we are better at categorizing situations, better at learning the causal structure of the world (what leads to what), and better at abstracting this information. A more complex categorization of the world would lead to an improved ability to generalize from one situation to another and an improved ability to modify what worked in one situation for use in another. However, this also provides a potential explanation for those cognitive diseases and dysfunctions that are uniquely human, such as the paranoia and incorrect causal structure seen in schizophrenic delusions74—our ability to recognize the complexity of the world can lead us to find conspiracies and interactions that might not actually be present.
In a sense, the situation-recognition system is learning the structure of the world. It learns what remains constant in a given situation and what changes. It learns what leads to what. This has sometimes been referred to as semantic knowledge,75 the facts about our world, our place in it, the stories we tell ourselves. Semantic knowledge in humans includes the categorization of the features of the world (that’s a chair, but that’s a table), facts about the world (this table is made of wood), and the narrative structure we live in (I have to go home so that my family can eat dinner together tonight). This semantic knowledge forms the inputs to our action-selection systems. Changing that knowledge changes the causal structure that leads to expectations (a rustle in the brush predicts a lion, Pavlovian action-selection), the expected consequences of our actions (too much alcohol will get you drunk, Deliberative), and the set of situations in which we react (stop at a red light, Procedural). We’ll explore this system and how changing it changes our decisions in depth in Chapter 12.
Finally, we need a motivational system, which includes two important components: first, What are the goals we need to achieve? (we want to take different actions if we’re hungry or if we’re thirsty) and second, How hard should we be working for those goals? (how thirsty we are will translate into how desperate we are to find water). We will see that both of these issues are related to the concept of value that we first looked at in Chapter 3.
Historically, these issues have been addressed through the reduction of innate “drives.”76 But what creates those drives? Some motivations certainly exist to maintain our internal homeostatic balance,L but people often eat when they are not hungry. Why? To understand this, two additional concepts need to be taken into account—first, that of intrinsic reward functions77 and second, that of the role of learning in value calculations.78
Our complex decision-making system evolved because animals who made better decisions were more able to survive and to find better mates, were more likely to procreate, and were more likely to have offspring that survived themselves.79 But there is nothing in this process that requires that the proximal reasons for decisions be related to any of these evolutionary goals. Instead, what evolution tends to do is find goals that are correlated with these factors. For example, an animal that lived in an environment where food was scarce would be less likely to starve and more likely to survive if it ate whenever food was available, whether it was hungry or not.80 But what happens when these intrinsic processes no longer track survival? What would happen to a species that had evolved to eat whenever food was available but was now faced with an overabundance of food? It would get fat, and new diseases, such as diabetes, would appear.81
We can see examples of how intrinsic reward functions correlated to evolutionary success can get corrupted in many animals and many situations. Dogs love to chase large game (e.g., deer, moose, elk). They do this because that was how they got their food—that was their evolutionary niche. Domestic dogs chase cars. It is pretty clear that if a dog ever caught a car, it would not be able to eat it. But dogs chase cars for the sake of the chase. In humans, the classic example is pornography. Men like looking at pretty women because being attracted to a beautiful woman helped them judge who would be the best mates,M but reading Playboy or watching Internet porn is not going to help men find better mates. In fact, one could easily imagine an argument that these nonreproductive options impair the ability to find better mates.
So, yes, we have drives that are directly related to evolutionary success (hunger, thirst, sex), and we have intrinsic reward functions that are correlated to evolutionary success (chasing prey, an attraction to beauty). But there is also a learned component to motivation in mammals. This learned component can be seen both in generalized increases (or decreases) in arousal and in specific increases or decreases in the willingness to work for a given reward.
Does this learning component also occur in humans? Actually, this is one of the best-established effects in drug addicts: seeing the paraphernalia for their drug of choice leads to craving for that drug and an increase in actions leading to taking that drug.85 Marketing symbols (Coke, McDonald’s) increase our internal perceptions of thirst or hunger and make us more likely to go get a soda or a hamburger.86 These are all examples of the motivational system driving other action systems.
Motivation is a complex phenomenon related to the complexity of evaluation. It depends on intrinsic functions that we have evolved as animals.87;N Motivation can also be modified through an interaction between the emotional, Pavlovian systems and the other, more complex decision-making systems (Deliberative, Procedural).
Behavioral decision-making occurs as a consequence of a number of different, interacting systems. We have identified four action-selection systems (Reflex, Pavlovian action-selection, Deliberative, Procedural) and four support systems (taking the physical action, perception, situation-recognition, motivation). One of the remarkable things that has occurred over the past several decades is the convergence of different fields on these multiple decision-making systems. A number of fields have been examining how different agents interact with each other and with the world (e.g., human psychology and psychiatry, animal learning theory, robotics and control theory, artificial intelligence, and neuroscience, including both computational modeling and the new fields of neuroeconomics and computational psychiatry).91
One of the things that motivated me to write this book has been the observation that these different fields seem to be converging on a similar categorization of decision-making systems. Even the debates within the fields are similar—for example, Is the value calculation within the search-based (Deliberative) and cached-action (Procedural) systems separate or unitary?92 (I have argued here and elsewhere that they are separate, but the issue is far from settled.)
For example, roboticists have come to the conclusion that one needs multiple levels of decision-making systems—fast, low-level, hardwired systems (don’t bump into walls; a Reflex system), as well as slower, search processes (slow but flexible; a Deliberative system), and fast, cached-action processes (fast but inflexible; a Procedural system).93
Computer scientists studying artificial intelligence and computational neuroscientists trying to model animal decision-making have identified these systems in terms of their expectations—Pavlovian systems create an expectation of an outcome and lead to unconditioned actions related to that outcome; Deliberative systems search through potential futures, produce an expectation of the outcome, and evaluate it; Procedural systems directly evaluate actions in response to specific situations.94
Similarly, learning theorists working in the fields of psychology and animal learning identify three systems.95 First, there is a stimulus–outcome system in which unconditioned reactions associated with outcomes get transferred from being done in response to the outcome to being done in response to the stimulus. (Think Pavlov’s dogs salivating in response to a cue that predicts food reward.) Second, they identify an action–outcome system in which animals learn what outcomes will occur if they take a given action. These actions are taken to achieve that expected outcome. Third, they identify a stimulus–response or stimulus–action system in which actions are taken in response to a stimulus. The second and third systems can be differentiated by revaluation and devaluation experiments, in which the value of the outcome is changed online. (Remember our example of the rat deciding whether or not to push the lever to get food that it didn’t like anymore.)
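The devaluation logic can be sketched in a few lines. In the toy world below (the world, names, and learning rule are illustrative, not a model from the literature), an action–outcome system evaluates the predicted outcome at decision time, so devaluing the outcome changes its choice immediately; a stimulus–response system only caches a value per action and keeps responding until it relearns.

```python
# Toy world: each action deterministically produces one outcome.
model = {"press_lever": "food", "do_nothing": "nothing"}
outcome_value = {"food": 1.0, "nothing": 0.0}

def deliberative_choice(model, outcome_value):
    """Action-outcome: look up the predicted outcome and evaluate it now."""
    return max(model, key=lambda act: outcome_value[model[act]])

def train_procedural(model, outcome_value, episodes=100, alpha=0.1):
    """Stimulus-response: cache a value per action from experienced reward.
    The identity of the outcome is not stored, only how good it felt."""
    q = {act: 0.0 for act in model}
    for _ in range(episodes):
        for act in model:
            q[act] += alpha * (outcome_value[model[act]] - q[act])
    return q

q = train_procedural(model, outcome_value)  # the rat learns to press the lever
outcome_value["food"] = -1.0                # devaluation: it no longer wants the food
deliberative_choice(model, outcome_value)   # -> "do_nothing" (adjusts at once)
max(q, key=q.get)                           # -> "press_lever" (the habit persists)
```

The cached values in `q` only change with further experience, which is why devaluation dissociates the two systems in the experiments described above.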
So where does that leave us? We have identified four action-selection systems and four support systems. Although these systems are separable in that different brain structures are involved in each of them, they also must interact in order to produce behavior. (The analogy brought up earlier was that of the electrical and gasoline systems in a hybrid car.) Each of these systems is a complex calculation in its own right and will require its own chapter for full treatment. In the rest of this part of the book, we will explore each of these components, the calculations they perform, and their neural substrates, and we will begin to notice potential failure points within them. Let’s start with the simplest action-selection system, the reflex.
• Matthijs A. A. van der Meer, Zeb Kurth-Nelson, and A. David Redish (2012) Information processing in decision-making systems. The Neuroscientist, 18, 342–359.
• Yael Niv, Daphna Joel, and Peter Dayan (2006) A normative perspective on motivation. Trends in Cognitive Sciences, 10, 375–381.
• Mortimer Mishkin and Tim Appenzeller (1987) The anatomy of memory. Scientific American, 256 (June), 80–89.