A The pharmacology term “precursors” is used to identify molecules that are converted into the molecule being studied.6 Your cells are, in a sense, chemical factories that convert molecules into other molecules, using specialized molecules called enzymes. For example, the chemical levodopa (L-dopa) is a precursor for dopamine: if you had extra levodopa in your system, your cells would have more building blocks with which to make dopamine, and so would have more dopamine to use. This is why levodopa is a good treatment for Parkinson’s disease, which is, in fact, a diminishment of dopamine production in certain areas of the brain.7

B The ventral tegmental area is so called because it is ventral (near the base of the brain) in the tegmentum (Latin for “covering”), an area that covers the brain stem. Substantia nigra is Latin for “black stuff,” so named because it contains neuromelanin, a dark pigment (not to be confused with the similarly named hormone melatonin), which makes the area appear black on a histological slice.

Melatonin is a chemical used by the body to signal the nighttime component of day/night cycles.8 This is why it is sometimes used to treat circadian rhythm problems. It interacts with dopamine, particularly the release of dopamine in the brain, which means that manipulations of melatonin (say, for jet lag) can affect learning and cognition. This is why pharmacology is so complex:9 manipulating one thing often produces lots of other effects.

C Not all abused drugs are euphoric. Nicotine, for example, is often highly dysphoric on initial use, even though it is one of the most reinforcing of all drugs.14

D Control theory uses equations to determine how to change variables in response to stimuli so as to have desired effects. The thermostat (negative feedback) that we saw at the very beginning of the book is often the first example used in control theory textbooks. Cruise control in your car and the stability maintained by current fighter jets are both derived from modern control theory mechanisms. Operations research is the term often used for university departments studying these questions.
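The negative-feedback idea behind the thermostat can be sketched in a few lines of code. This is only an illustrative sketch, not a model of any real thermostat: it assumes a proportional correction (real household thermostats usually just switch the heat on or off), and the names thermostat_step, simulate, setpoint, and gain are hypothetical.

```python
def thermostat_step(temperature, setpoint, gain=0.5):
    """Return the temperature after one proportional negative-feedback correction."""
    error = setpoint - temperature       # positive when the room is too cold
    return temperature + gain * error    # push the temperature toward the setpoint


def simulate(start, setpoint, steps, gain=0.5):
    """Apply the feedback rule repeatedly and return the temperature trajectory."""
    trajectory = [start]
    for _ in range(steps):
        trajectory.append(thermostat_step(trajectory[-1], setpoint, gain))
    return trajectory


if __name__ == "__main__":
    # Assumed example values: a 15-degree room and a 20-degree setpoint.
    for step, temp in enumerate(simulate(15.0, 20.0, 10)):
        print(step, round(temp, 3))
```

With a gain between 0 and 1, each step removes a fixed fraction of the remaining error, so the temperature approaches the setpoint without overshooting it.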

E I prefer to use the term situation rather than the typical computer science term state (or the term preferred by psychology, stimulus set). State is used in many other literatures to refer to things like “internal state” (such as being under the influence of a drug, or not being under the influence of the drug), so overloaded terms like state can be confusing. Although many animal psychology experiments do use a single stimulus, look around you: what is the stimulus that is driving your behavior? I am currently listening to music and watching the sun rise above the snow outside my house, my feet are in fuzzy slippers, but my ankles are chilly. Which of these stimuli are important? As we will see in Chapter 12, how we identify the important stimuli is a critical component of the decision-making system, so simply saying stimulus is incomplete. However, conveniently, state, stimulus, and situation all start with S, so if we wanted to write equations (as scientists are wont to do), we wouldn’t have to change our variables.

F There is a tradeoff between exploration and exploitation. If you take only the best choice, you might miss opportunities. If you go exploring, you might make mistakes. For our discussion here, we will assume that there is enough exploration not to get trapped in a situation where you know a good answer but haven’t found the best one yet. We will explore the exploration–exploitation tradeoff in depth in its own chapter (Chapter 14).
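One common way to express this tradeoff in code is an "epsilon-greedy" rule: most of the time take the option currently believed to be best, but occasionally pick at random. The sketch below is a minimal illustration under that assumption; it is not claimed to be the mechanism discussed in Chapter 14, and the names epsilon_greedy, values, and epsilon are hypothetical.

```python
import random


def epsilon_greedy(values, epsilon, rng=random):
    """Choose an action index: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(values))
    # Exploit: pick the action with the highest estimated value.
    return max(range(len(values)), key=lambda action: values[action])


if __name__ == "__main__":
    estimated_values = [0.1, 0.9, 0.3]   # assumed example estimates
    choices = [epsilon_greedy(estimated_values, epsilon=0.1) for _ in range(1000)]
    print("fraction choosing the best-known option:", choices.count(1) / 1000)
```

Setting epsilon to 0 gives pure exploitation, which risks never discovering a better option; setting it to 1 gives pure exploration, which ignores everything already learned.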

G One of the problems with the science of decision-making is that it draws from lots of fields, each of which has its own terminology. The δ used in computer science to represent the value-prediction error is unrelated to the δ used to differentiate types of opioid receptors.

H Parkinson’s disease was first identified in the 1800s by James Parkinson, who wrote a monograph called An Essay on the Shaking Palsy. It is not uncommon in modern societies, but descriptions consistent with Parkinson’s go back to ancient times.26 For example, Indian Ayurvedic texts contain a description that is likely Parkinson’s disease, and it was treated with a medicinal plant that is now known to contain levodopa (a common modern treatment for Parkinson’s disease). Throughout most of the 20th century, Parkinson’s disease was thought of as a dysfunction in the motor system: most patients with Parkinson’s show an inability to initiate movement (akinesia) or a slowing of the initiation of movement (bradykinesia).27 Patients with the disease also develop posture problems and tremors in their limbs both before and during movement. Parkinson’s disease, however, is not a disorder of the muscles or the actual motor system; there are situations where a Parkinson’s patient can react surprisingly well.28 This is nicely shown in the movie Awakenings, where Rose is able to walk to the window once the floor tiles are painted black and white. We will see later in this book that these preserved abilities are due to the intact nature of other decision-making systems in Parkinson’s patients. Whether they arise from emotional reactions (running from a room during a fire, Chapter 8), from simple, long-stored reflexive action sequences (surprised by a thrown ball and “Catch!,” Chapter 10), or from the presence of explicit visual cues (lines on the floor enabling access of Deliberative systems, Chapter 9) is still unknown.29

Neurophysiologically, Parkinson’s disease is due to the loss of dopamine neurons.30 How the effects of dopamine loss in Parkinson’s disease relate to dopamine’s role as a δ signal is complex. The δ signal observed by Schultz and colleagues is related to fast changes in dopamine firing (bursts, called phasic signaling). Dopamine cells also fire at a regular baseline rate (called tonic firing).31 The relationship between the phasic bursts, the tonic firing, Parkinson’s disease, and decision-making is still unknown, although decision-making is known to be impaired in Parkinson’s disease,32 and several hypotheses have been proposed.33 Tonic levels of dopamine have been suggested to play roles in the recognition of situations34 (this chapter, below, and Chapter 12) and in the invigoration of movement35 (motivation, Chapter 13).

I Just how well the value-prediction error (δ) theory explains the dopamine data is a point of much controversy today. Part of the problem is that all experiments require behavior, mammalian behavior is a complex process, and we often need to run computer simulations of the behavior itself to determine what a value-prediction error signal would look like in a given behavioral experiment.
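To make concrete what "simulating a value-prediction error" can mean, here is a minimal sketch. It assumes the standard temporal-difference form of δ from the reinforcement-learning literature (reward received, plus the discounted value of the next situation, minus the value that was expected); the function names and the parameters gamma and alpha are illustrative assumptions, not taken from any particular experiment.

```python
def prediction_error(reward, value_next, value_current, gamma=1.0):
    """Value-prediction error: what was received minus what was expected."""
    return reward + gamma * value_next - value_current


def learn_cue_value(reward=1.0, alpha=0.1, trials=200):
    """Repeatedly pair a cue with a reward and track delta on each trial."""
    value = 0.0                  # initial expectation for the cue
    deltas = []
    for _ in range(trials):
        # The trial ends after the reward, so the next situation has value 0.
        delta = prediction_error(reward, 0.0, value)
        value += alpha * delta   # move the expectation toward the outcome
        deltas.append(delta)
    return value, deltas


if __name__ == "__main__":
    value, deltas = learn_cue_value()
    print("first delta:", deltas[0], "last delta:", deltas[-1])
    # Omitting an expected reward gives a negative delta.
    print("delta when reward is omitted:", prediction_error(0.0, 0.0, value))
```

In this toy case δ starts large when the reward is unexpected, shrinks toward zero as the reward becomes predicted, and goes negative if a fully predicted reward is withheld. Real behavioral tasks involve many more situations and actions, which is why full simulations are often needed.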

For example, Sham Kakade and Peter Dayan showed that if two signals, one positive and one neutral, are provided to the agent, and the sensory cues from the two signals are similar, then one can see “generalization effects” where there is an illusory positive δ signal to the neutral stimulus that is followed quickly by a less-than-expected negative δ signal.43 Imagine two doors, one that leads to food and one that doesn’t. At the sound of a door opening, you might look to the two doors, getting a positive δ signal, but then when you realize that the neutral (nonrewarded) door was the one that was opened, you would be disappointed (giving you a negative δ).
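The two-door example can be written out as a small piece of arithmetic. This is a deliberately simplified sketch of the generalization idea, not the model used by Kakade and Dayan: it assumes the ambiguous sound is valued at the probability-weighted average of the two doors, and every name and number here is hypothetical.

```python
def generalization_deltas(p_rewarded=0.5, reward_value=1.0, baseline=0.0):
    """Return delta at the ambiguous sound, and after each possible resolution."""
    # Value of hearing "a door opened" before knowing which door it was.
    ambiguous_value = p_rewarded * reward_value
    delta_at_sound = ambiguous_value - baseline          # illusory positive signal
    delta_if_neutral = 0.0 - ambiguous_value             # disappointment
    delta_if_rewarded = reward_value - ambiguous_value   # better than the average
    return delta_at_sound, delta_if_neutral, delta_if_rewarded


if __name__ == "__main__":
    at_sound, neutral, rewarded = generalization_deltas()
    print("delta at the sound:", at_sound)
    print("delta once the neutral door is identified:", neutral)
    print("delta once the food door is identified:", rewarded)
```

Under these assumptions, a neutral stimulus produces a brief positive δ followed by a negative δ of the same size, which is the pattern described in the text.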

As another example, Patryk Laurent has shown that an agent with limited sensory resources is evolutionarily well served by a positive orienting signal, even to potentially aversive events.44 Essentially, this orienting signal allows the agent to do better than expected at avoiding those aversive events.

Both of these examples would produce complex dopamine signals in response to neutral and aversive events, which may explain some of the dopamine signals that have been observed to such events. However, they make specific predictions about how dopamine should respond to those events. Schultz and colleagues have found that the predicted less-than-expected (negative) response does occur after generalization dopamine signals.45 Although the Laurent hypothesis has not been explicitly tested, it may explain the very fast response signals seen by Peter Redgrave and colleagues.46

J Some researchers have suggested that serotonin might play the role of the negative (aversive) signal,48 but recent experiments have shown that not to be the case.49 Another interesting potential candidate is norepinephrine (called noradrenaline in Europe). Norepinephrine is a chemical modification of dopamine, making dopamine a precursor to it.50 In invertebrates, the norepinephrine analogue octopamine serves as the positive error signal, while dopamine serves as the negative error signal.51 All of these molecules (serotonin, norepinephrine, octopamine, and dopamine) are very similar in their molecular structure,52 making them plausible candidates, given the copy-and-modify process often seen in evolution.

Additionally, some very exciting recent work has found that neurons in another neural structure, the habenula (which projects to the dopamine neurons in the ventral tegmental area), increase their firing with aversion and decrease their firing with reinforcement.53 It is still unclear whether these habenula neurons are the long-sought aversion neurons or whether they are part of the brain’s calculation of δ. Remember, δ is a mathematical term. Dopamine reflects δ; therefore, the brain calculates δ … somehow.

K Using the word extinction for this process is unfortunate because it has no relation to the evolutionary term extinction, meaning the ending of a species.