Memories in the brain are stored as patterns of firing across cells and are retrieved by the content they contain. This process is called “pattern completion” because a memory is retrieved by completing a partial pattern of cellular firing. The neural mechanisms of this process are well understood and shape how we recognize and remember, which in turn has important implications for decision-making systems.
When a digital camera takes a picture of your kid’s birthday party and writes it onto its flash memory card, it translates that picture into a set of on-and-off signals (“bits”A), which it writes onto the card. It then writes the location into the “file allocation table,” which is basically an index of locations on the memory card. When you want to recall the picture, the camera checks the index, follows it, and reads the file. When the camera first stored the picture on its memory card, it gave it a name like img_3541.jpg, which tells you nothing about the content of that image. Your brain can also store information like the memory of your kid’s birthday party in small changes across many neurons, in this case through changes in the connection strength between neurons. But there’s no central index. Instead, the full memory is retrieved from the partial content—the smell of the birthday cake, the sound of kids’ laughter, the sight of a specific toy.
Figure C.1 CONTENT-ADDRESSABLE MEMORY. Information can be addressed by indexes (whether by related names [such as stonehenge.jpg] or by unrelated names [such as img_1054.jpg]) or by content-addressable memory. Content-addressable memories are stored so that similar pictures and similar concepts are linked. Thus, a blurry image of Stonehenge, a paper-cutout image of Himeji, and a pencil drawing of Notre Dame retrieve their respective pictures. Similarly, memories can be retrieved by conceptual content.
The term content-addressable memory comes from the computer science and psychology literatures of the 1980s. In computer science, the term was used to contrast with index-addressable memory, which is how your laptop, desktop, or smartphone stores information.1 In psychology, the term was used because human memory is triggered by content, like the sights, sounds, or smells of the birthday party.2 (Compare Figure C.1.) The concept, though, comes from Donald Hebb’s famous 1949 book.3
One of the most famous sentences in neuroscience is “cells that fire together wire together,” which is usually attributed to a 1949 book on neuropsychology by the psychologist Donald Hebb. However, that sentence isn’t actually in Hebb’s book; instead, the attribution traces to a paragraph that concludes, “When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased” (Hebb, 1949/2002, p. 62). From this paragraph, we get the terms Hebbian synapse and Hebbian learning.
The concept that co-firing leads to an increase in connectivity almost certainly predates Hebb. In a sense, it is the obvious translation of the theory that mentation is associative, which goes back to David Hume in the 1700s, if not earlier.4 Hebb himself says that the general idea is an old one (Hebb, 1949/2002, p. 70). In his historical perspective on LTP, Bruce McNaughton recounts that Hebb was surprised at the excitement about the “Hebbian synapse,” that the idea dates back at least to Lorente de Nó, and that it is the only obvious mechanism of association.5 In fact, the famous paragraph in Hebb’s book is the assumption step in his theory, not the prediction step: the full paragraph starts with “Let us assume that …” What Hebb was really proposing was that if such a mechanism exists in a cortical system, then it would produce cell assemblies and content-addressable memory.
Hebb defined a cell assembly as a set of cells that worked together to represent a specific set of information. Sometimes the term “cell assembly” is used to identify a single thing—thus, the set of cells active when you think of the dog you had growing up forms a cell assembly. Sometimes the term is used to identify a set of things—thus, the set of cells active whenever you see or think about dogs would form a cell assembly.
The important concept in both cases is that these sets are not formed by any single cell.6 In both cases, we’re talking about groups of cells that work together. A single cell may participate in multiple cell assemblies, but the assemblies will differ in at least some of their cells. This means that only in the aggregate can an observer differentiate representations in this population of cells. We can measure the tuning curve of a single cell, but we cannot decode the representation from it. For that we need a population of cells, an ensemble of cells. This kind of representation is called distributed because it is distributed across the population.
Hebb’s concept was that cells that represent, say, the sound of that dog barking, the color of its fur, and how you felt when you hugged it would all be linked together because these experiences tended to co-occur.7 Thus, the taste of ratatouille made the way your mom made it could bring back a flood of memory—of your hometown, your youth, and that family emotion that can melt your heart.8 The Hebbian synapses would complete a part of the cell assembly into the whole memory. The memory could be addressed by its content alone.
Hebb’s conceptual derivation preceded the mathematical tools needed to study it, and the concepts of cell assembly and content-addressable memory were not tested or explored until the 1980s and 1990s, after the development of the necessary computational and neurophysiological technologies.9
Computationally, content-addressable memory is equivalent to completing a partial pattern.10 Imagine a degraded picture, where some of the pixels have been replaced with noise. With enough noise, so much of the picture will be changed that you won’t be able to recognize it anymore, but for a long time before that, you will still be able to recognize the picture, even through the noise.
Computer models have shown that a very good system for completing partial patterns is to store connections between each pair of pixels representing the similarities and differences between them. For simplicity, let’s imagine a black-and-white picture, where each pixel is either on or off. If two pixels are both on, then they can support each other, but when a pair of pixels are opposite to each other (one on, the other off), they counteract each other. This is a network originally studied by John Hopfield, who showed in a very famous 1982 paper that such a network will complete a degraded picture to an original stored picture.11 Hopfield was a physicist at Caltech at the time. Using mathematics originally derived in physics to describe how iron becomes magnetic as it cools (the magnetic poles of the individual atoms slowly align), he showed how one can define an error or energy surface to understand how the system settles from the noisy, degraded picture into the remembered picture.
Imagine that we represent the activity across all of our neurons as a point in a very high-dimensional space.12 Each neuron provides a dimension, and the position along that dimension is the firing rate of the neuron. This is easiest to think of with only two neurons, so we have only two dimensions, but the mathematics works with any number of dimensions. At any moment in time, our firing pattern across the population of neurons is a point in this n-dimensional space. We can describe the error (energy) function as height in an additional, (n + 1)th dimension. This gives us a surface with hills and valleys. What Hopfield was able to show was that his network would always roll downhill and that the synaptic weights in his network could be constructed so that the bottoms of the valleys correctly corresponded to the stored (remembered) patterns. The connection strengths between the cells of a pattern determine the depth of the basin representing that pattern, and thus the likelihood that one will roll down into it.13
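Hopfield’s “height” can be written down explicitly. A minimal sketch in the standard notation (the symbols here are ours, not from the text): for binary units si = ±1 and symmetric connection weights wij (with no self-connections), the energy of a network state is

```latex
% Hopfield's energy function (standard 1982 formulation; notation ours):
%   s_i = +/-1 are the unit states, w_{ij} = w_{ji}, w_{ii} = 0.
E(\mathbf{s}) \;=\; -\tfrac{1}{2} \sum_{i \neq j} w_{ij}\, s_i\, s_j

% Storing a pattern \xi by the Hebbian rule w_{ij} = \xi_i \xi_j places
% that pattern at a local minimum of E, and each asynchronous update
%   s_i \leftarrow \operatorname{sgn}\Bigl(\textstyle\sum_j w_{ij} s_j\Bigr)
% can only lower E, which is why the network always "rolls downhill."
```

Because every update lowers (or preserves) E, and E is bounded below, the dynamics must settle into some valley floor.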
So imagine that we store some pattern (say a picture of Rodin’s The Thinker) in our network (Figure C.2). This means that the pattern of activity across the network that corresponds to our picture of that iconic image is going to be at the bottom of one of these valleys. We then degrade that pattern by adding noise. When we put that degraded pattern into our network, it starts at some point different from our stored memory. But then the activity of our network starts to change—neurons that are off, but have lots of other connected neurons that say they should be on, start to turn on; neurons that are on, but don’t have a lot of support to stay on, turn off. As the activity pattern changes, we are moving along that energy/error surface. We are a ball rolling downhill. Eventually, we will arrive at the bottom of the valley of The Thinker.
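This rolling-downhill recall can be sketched in a few lines of code. What follows is a toy illustration, not a model from the text: a random ±1 vector stands in for the pixels of the stored picture, and the names (`recall`, `energy`, and so on) are our own.

```python
# Minimal Hopfield-style pattern completion (a sketch, not the full model).
# Store one binary pattern with the Hebbian outer-product rule, flip some
# "pixels" to add noise, then let the network roll downhill until it settles.
import numpy as np

rng = np.random.default_rng(0)
n = 100                                    # number of +/-1 "pixels" (neurons)
pattern = rng.choice([-1, 1], size=n)      # stand-in for the stored picture

# Hebbian storage: cells that are on together support each other.
W = np.outer(pattern, pattern).astype(float)
np.fill_diagonal(W, 0.0)                   # no self-connections

def energy(s):
    """Hopfield's energy: lower means closer to a valley floor."""
    return -0.5 * s @ W @ s

def recall(s, sweeps=10):
    """Asynchronous updates: each neuron aligns with its weighted input."""
    s = s.copy()
    for _ in range(sweeps):
        for i in rng.permutation(n):
            s[i] = 1 if W[i] @ s >= 0 else -1
    return s

# Degrade the pattern: flip 20% of the pixels.
cue = pattern.copy()
flip = rng.choice(n, size=20, replace=False)
cue[flip] *= -1

completed = recall(cue)
print(np.array_equal(completed, pattern))  # → True: the cue is completed
```

Each flipped pixel has many correctly stored neighbors “telling” it to flip back, so the state slides down the energy surface to the bottom of the stored valley.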
Of course, as we saw in our discussion of actual neurons (Appendix A), neurons are not off or on. Instead, they fire spikes, but these spikes are in response to signals arriving in the neuron’s dendrites, which depend on the firing of other neurons and the synaptic weight between those other neurons and the neuron in question. This produces a network of neurons working together.
Figure C.2 DEGRADED PATTERNS. We can measure the difference between patterns as the mean squared error E. As the pattern changes from the remembered pattern, the error increases. We can use this error as an energy function. If a means can be found to move down in this “error” function, we will find ourselves completing the pattern, “remembering” our image. Content-addressable memories work this way—by moving downhill of the error or energy function.
The concept of this network of neurons working together to recall a memory is sometimes called connectionism, because it is in the connections (synapses) between the neurons that the memory is stored.B Hopfield’s contribution was to simplify the model to the point where he could define the energy or error function and prove that his system would recall the stored memory through pattern completion. In fact, these sorts of models work well even if we change only the excitatory connections and use inhibition solely to maintain an overall level of activity, if we have graded neurons recalling firing rates rather than on/off states, or if we have detailed models of spiking neurons.15;C These models have been able to explain how changes in neuromodulators (like dopamine, acetylcholine, or serotonin) can change the ability of a network to store memories, as well as very specific (and often surprising) firing rate patterns observed in neural recordings from behaving animals.22 What these models give us is an explanation for memory as categorization, which has a number of interesting consequences.
What if you have two memories stored in this network? Then you would have two valleys. This would mean that there was a set of patterns that would roll down into the first valley and a different set of locations that would roll down into the second valley. We call each of these sets the basin of attraction for a given valley. Just as water to the east of the Continental Divide in the Rocky Mountains in the United States flows east to the Atlantic, while water to the west of the Continental Divide flows west to the Pacific, starting points in one basin of attraction flow to one memory, while starting points in the other basin flow to the other (Figure C.3).
What is really going on here is that the pattern of hills and valleys depends on the connections between cells. By changing those connections, we change the hills and valleys and change how our memory completes patterns. Storing a memory entails changing those connections so that the memory is at the bottom of a valley.23 Recalling a memory entails rolling down that surface to the bottom of the valley (Figure C.4). Changing connection weights changes the basins of attraction for each valley and changes what memories are recalled from what cues. It turns out that (as originally predicted by Hebb) the Hebbian synapse (or modifications of it that include both increases and decreases24) changes the connection weights in the right ways to change the basins of attraction. In fact, many memories, with many valleys and many basins of attraction, can be stored in these networks. There’s good evidence now that several brain structures that store and recall memory work this way, including both the hippocampus and frontal cortex.25
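The same sketch extends to multiple memories: summing the Hebbian outer products for several patterns digs several valleys, and each degraded cue falls into its own basin of attraction. Again, this is a toy illustration with our own names and random stand-in patterns, not a model from the text.

```python
# Sketch: two memories stored in one network create two basins of attraction.
# Each degraded cue should flow back to the pattern whose basin it starts in.
import numpy as np

rng = np.random.default_rng(1)
n = 200
memories = rng.choice([-1, 1], size=(2, n))    # two stored patterns

# Sum of Hebbian outer products: each memory digs its own valley.
W = sum(np.outer(m, m) for m in memories).astype(float)
np.fill_diagonal(W, 0.0)                       # no self-connections

def recall(s, sweeps=10):
    """Roll downhill: each neuron aligns with its weighted input."""
    s = s.copy()
    for _ in range(sweeps):
        for i in rng.permutation(n):
            s[i] = 1 if W[i] @ s >= 0 else -1
    return s

def degrade(m, nflip):
    """Flip nflip randomly chosen entries of the pattern."""
    s = m.copy()
    s[rng.choice(n, size=nflip, replace=False)] *= -1
    return s

# Each 15%-degraded cue lands back in its own basin.
results = [np.array_equal(recall(degrade(m, 30)), m) for m in memories]
print(results)
```

With random patterns the cross-talk between valleys is small, but it is worth noting that these networks have a finite capacity: store too many patterns relative to the number of neurons and the valleys begin to interfere.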
Figure C.3 WATERSHEDS OF NORTH AMERICA AS BASINS OF ATTRACTION. The simplest example of a basin of attraction is that of a watershed. Water to the west of the Continental Divide ends in the Pacific Ocean, while water east of it ends in the Atlantic. Most of the Canadian water flows to the Arctic, but some flows to the St. Lawrence Seaway, and some flows to the Pacific. Water in the Great Basin flows into the Great Salt Lake and does not end up in an ocean. (Map modified from the U.S. Geological Survey.)
The effects of these basins can be seen in categorization—for example, in speech, in the ability to recognize differences between sounds, or, in perception, in the ability to differentiate colors.
Although the human vocal apparatus can produce many sounds reliably, each language selects a subset of those sounds to use to communicate information.26 Thus, languages of southern Africa such as Xhosa include different kinds of clicks (e.g., a slap of the tongue vs. a sounded swallowing), something not included in any Indo-European language. Some languages differentiate sounds that others don’t. For example, in Japanese, the sounds /l/ and /r/ (as in “glossary”) are both categorized as the same sound. Japanese words do include both /l/ and /r/ sounds, but native Japanese speakers don’t hear the difference. Every language has sounds it categorizes together. English, for example, doesn’t differentiate aspirated and unaspirated /t/ sounds—although English uses both (the /t/ in “ton” is aspirated, while the /t/ in “stun” is not), English speakers generally don’t hear the difference.D In Hindi, the aspiration of certain consonants differentiates words (just as /l/ and /r/ differentiate “light” and “right”). We can explain this categorization ability in terms of basins of attraction27 (Figure C.5).
Figure C.4 BASINS OF ATTRACTION. The basin of attraction is the set of points that, when allowed to change through a dynamic system, converge on the same place. If we think of the state of a neural system as a point in a high-dimensional space, then we can describe the flow of points in this space as a space of hills and valleys. Similar patterns will change to recall a stored pattern.
In actuality, the two sounds (/l/ and /r/) form a continuum in the shape of the mouth and the position of the tongue. /l/ is the tongue touching the front top of the palate, while /r/ is the tongue against the side teeth. (The rolled /r/ of French or Spanish includes a glottal purring with the tongue against the side teeth.) In Japanese, all of the sounds in this continuum between /l/ and /r/ fall into the same basin of attraction; they activate the same cell assembly, and they are categorized as a single sound. In English, we draw a line between the extremes of this continuum, separating them. More /l/-like sounds fall into the /l/ basin, while more /r/-like sounds fall into the /r/ basin; they activate different cell assemblies, and they are categorized differently. This understanding of categorization has led to new techniques for teaching Japanese speakers to recognize the English differentiation between /l/ and /r/ by starting with the largest separation, thus overemphasizing the difference between them, and then explicitly training the categorization skills.28 This enables Japanese speakers to create new basins of attraction, to learn to separate these categories, and to improve their ability to differentiate the sounds.29 Similarly, parents talking to young children overemphasize the difference between individual vowels, which may help demonstrate the clear category differences to the children.30
Figure C.5 PHONEME RECOGNITION DEPENDS ON LEARNED BASINS OF ATTRACTION. The ability to distinguish two categories depends on their lying in different basins. Native English speakers have learned that the sounds /l/ and /r/ are in different basins, while native Japanese speakers have learned that they are in the same basin. This makes it very difficult for native Japanese speakers to recognize the differences between the /l/ and /r/ sounds.
Another classic example of learned categorization is color differentiation, which turns out to be helped by being able to name the colors.31 Although there are universals in some of the color labels between languages, likely due to the neurophysiology of color vision, every language differentiates colors a little differently.32 (The nice thing about color is that one can use pure colors, which means that one can quantitatively measure the difference between the two colors in terms of the wavelength of light.) English speakers are more easily able to distinguish colors at the boundary between blue and green than two colors that are both blue or that are both green, even when faced with color pairs that are separated by the same wavelength difference.33 This is a general property across languages. For example, while English has a single color term for blue (“blue”), Russian has two (синий “seenee,” dark blue, and голубой “goloboi,” light blue). This doesn’t mean that English speakers can’t separate light and dark blue; we often talk about sky blue and navy blue. These are small and subtle effects; certainly, we recognize the differences between the many blues in a big box of crayons. These effects should not be confused with the idea that some concepts do not exist in some languages, and that one cannot think about things one cannot name.
For example, some people have argued that some languages do not represent time or number, and that their speakers live ever in the present.34 This hypothesis is known as the “Sapir-Whorf” hypothesis (named after the linguists Edward Sapir and Benjamin Whorf, not to be confused with the Klingon Worf) and is best known through the use of Newspeak in George Orwell’s 1984, in which concepts are manipulated by changing the language itself. While it is clear that political discourse is changed by the framing of concepts and issues, it is still possible to think about things even without terminology for them.35 Similarly, the myth that the Inuit have twenty words for snow36 should not be taken as so different from English.37 English can also identify twenty different kinds of snow: sleet, ice, rain, snow, fluffy snow, heavy snow, etc. We certainly recognize the differences between these forms of snow (I know I do when I’m trying to shovel my driveway!). The difference is that the Inuit tend to use more precise words for snow because it is an important part of their lives. Similarly, while a casual observer may label a bird as “a bird,” a birdwatcher or a hunter would be more precise, labeling it a “hawk” or even a “red-tailed hawk.”38
Russian speakers have learned to categorize dark blue (синий, seenee) and light blue (голубой, goloboi) separately rather than as a single blue category. Native Russian speakers are faster to identify differences in blue colors that cross the синий (seenee)–голубой (goloboi) boundary than native English speakers are. These abilities depend on language—if one is given a linguistic blocking task (like silently repeating random strings of numbers while doing the taskE), then the difference between color pairs crossing categories and color pairs within categories disappears.39
Presumably, the difference between native English and native Russian speakers is that the native Russian speakers have developed two basins of attraction to separate light and dark blue while the native English speakers have only one. What is particularly interesting about these categories is that they are not arbitrary; they relate to universal perceptual experiences and an interaction between our sensory perceptions and the world.40 The cell assemblies in the color-recognition structures in our cortex are working from the specific color signals in the world (the blue of the sky, the blue of the ocean, the red of a sunset, the green of a leaf, the yellow of a sunflower) and the specific color signals that the photoreceptors in the retina of our eyes can detect.F
Neurophysiologically, studying cell assemblies, content-addressable memory, and pattern completion requires the ability to record the activity of large numbers of neurons simultaneously, yet separately. It is not enough to see the broad activity of a network, as, for example, from EEG, LFP, or fMRI; one needs to see the actual pattern of neuronal firing. Additionally, it is not enough to see the activity of a single neuron because what matters is not whether a single neuron fires when a stimulus is presented or an action is taken; what is needed is to see how the activity of a single neuron reflects the activity of its neighbors.
The first major steps toward the identification of cell assemblies in real nervous systems came from work by Bruce McNaughton and his colleagues in the 1990s, who developed techniques capable of recording large ensembles of neurons simultaneously. In a famous paper published in 1993, Matt Wilson and Bruce McNaughton42 recorded from over a hundred hippocampal place cells simultaneously. The hippocampus has been a particularly useful place to examine cell assemblies because hippocampal cells in the rat have a very identifiable representation—they encode the location of the animal in space.43 This has enabled neuroscientists to use mathematics originally derived for spatial reasoning to examine hippocampal cell assemblies.44
More recently, it has been possible to examine patterns of cortical cells during action selection by examining the trajectory of the ensemble neural firing pattern through the high-dimensional space. As discussed above, we can imagine the population as a point in a very high-dimensional space, one dimension for each neuron. The activity of the cells at a moment in time can be described as a point in that space. Several labs studying action-taking have found that just before the animal takes the action, the firing pattern of the population of cells in motor cortex moves to a consistent point in that space and then takes a consistent trajectory to another point in that space.45 Similar analyses have been done for ensembles of prefrontal neurons in rats—the firing pattern of the population of cells clusters into subspaces at important times, say when the animal changes between tasks.46
Just as Hebb predicted, the mental representation of decisions requires pattern completion of cell assemblies and entails trajectories through cell-assembly space.
• George Lakoff (1990/1997). Women, Fire, and Dangerous Things. Chicago: University of Chicago Press.
• Donald O. Hebb (1949/2002). The Organization of Behavior. New York: Wiley / Lawrence Erlbaum.
• John Hertz, Anders Krogh, and Richard G. Palmer (1991). Introduction to the Theory of Neural Computation. Reading, MA: Addison-Wesley.