Chapter 5
IN THIS CHAPTER
Seeing the role the eye and its components play in vision
Examining the vision centers of the brain
Looking at how we see color, depth, and shapes
Uncovering the causes of visual impairment and the secrets behind optical illusions
How do you see? Most people think that, when we look at things, the light coming off those things enters the eye, which sends a camera-like image of what we’re looking at to the brain. However, the retina itself is an extension of the brain and already modifies the camera-like image that it receives, and this image is further modified by the rest of the brain.
Vision occurs when retinal photoreceptors capture photons and the retina and brain perform a complex analysis of this information. Visual input that reaches consciousness is processed in parallel pathways through the retina, thalamus, and occipital lobe of the cortex. About 15 other visual pathways process visual input unconsciously to do things like control pupil diameter and circadian rhythms. Specific classes of neural cells in the retina sense different aspects of the visual image, such as the colors present, whether something is moving, and the location of edges. These neurons act like little agents that “shout out” the presence of these features so that you can recognize objects and determine their distance from you.
This chapter covers the cellular agents involved in sight. Early in the visual system, these agents are very simple-minded and react to qualities like color and intensity. Higher in the visual system these agents become very sophisticated and very picky, some responding only to certain faces. This is their story. Oh — and after you’ve studied all this visual processing neural circuitry, think about the fact that blind people, without any input from the eyes, still can visualize.
You may have heard that the eye is like a camera. As Figure 5-1 shows, this comparison springs from the fact that the eye has evolved tissues that act like the optical elements of a camera: the lens, which acts like a camera's lens, and the pupil, the opening through which light enters, which acts like an aperture.
Illustration by Frank Amthor
FIGURE 5-1: Light entering the eye.
When light enters the eye, the cornea, which is the outermost clear layer at the front of the eye, first focuses it, and then the lens focuses it further, and the pupil, located between the cornea and lens, opens and closes to let in more or less light, like a camera aperture. The image formed by the cornea, lens, and pupil is projected onto the retina, the neural lining inside the eye. The retina is where the real action in vision takes place. The following sections outline what happens to this image once it hits the retina.
The smallest units of light are called photons. When photons hit the retina, they are absorbed by photoreceptors, specialized neural cells in the retina that convert light into electric current that modulates the release of a neurotransmitter (glutamate). This whole process — from photons to electrical current to neurotransmitter release — is called phototransduction.
Two main types of photoreceptors exist:

Rods: Highly sensitive photoreceptors that do most of the work in dim light, such as at night. All rods contain the same photopigment, so their signals carry no color information.

Cones: Photoreceptors that operate in bright daylight and come in three types (red, green, and blue), making color vision possible.
At night, when only your rods are absorbing enough photons to generate signals, you have no color vision because the signal from a rod contains no information about the wavelength of the photon absorbed. During the day, however, three different cone types are active (red, green, and blue). Individually, cones don’t signal wavelength either, but the brain can deduce wavelength from the ratio of activity of different cones. In very blue light, for example, the blue cones are relatively more activated than the green and red cones.
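This ratio code can be sketched in a few lines of Python. The following is a toy model, not real physiology: the cone sensitivity weights are invented for illustration, and the point is only that color can be deduced from the relative activity of the three cone types even though no single cone signals wavelength.

```python
# Toy model: deducing the relative "blueness" of a light from cone activity.
# The sensitivity weights below are illustrative, not measured values.

def cone_responses(light_blue, light_green, light_red):
    """Each cone type sums the light weighted by its (hypothetical) sensitivity."""
    blue_cone  = 0.9 * light_blue + 0.1 * light_green
    green_cone = 0.2 * light_blue + 0.8 * light_green + 0.3 * light_red
    red_cone   = 0.1 * light_green + 0.9 * light_red
    return blue_cone, green_cone, red_cone

def blueness(responses):
    """The brain can compare one cone type's activity to the total: a ratio code."""
    b, g, r = responses
    total = b + g + r
    return b / total if total else 0.0

# A very blue light drives the blue cones relatively more than the others do,
# so its blueness ratio is high; a reddish light yields a low ratio.
blue_light_ratio = blueness(cone_responses(1.0, 0.1, 0.1))
red_light_ratio = blueness(cone_responses(0.1, 0.1, 1.0))
```

Note that no single cone's output identifies the wavelength; only the comparison across cone types does, which is why losing one cone type (discussed later in this chapter) destroys certain color discriminations.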
When a photoreceptor absorbs a photon of light, a cascade of events occurs that result in a message being sent to other neurons in the retina:
A chemical reaction occurs.
The molecule rhodopsin (in rods; similar molecules exist in cones), which absorbs photons, starts out in a kinked form, called 11-cis retinal. When this molecule absorbs a photon of light, a molecular bond in the middle of the 11-cis retinal flips from a kinked to a straight configuration, converting it to what’s called all-trans retinal.
The all-trans retinal is a stereoisomer of 11-cis retinal, which means it has the same chemical composition but a different structure.
All-trans retinal reduces the concentration of cyclic GMP (cGMP).
cGMP is an intracellular messenger inside the photoreceptor that keeps depolarizing ion channels in the cell membrane open.
These depolarizing ion channels are like the channels of metabotropic receptors (refer to Chapter 3) except that they’re triggered by light absorption rather than a neurotransmitter binding to a receptor. When light reduces the internal concentration of cGMP in the photoreceptor, it reduces the number of these channels that are open via a second messenger cascade, hyperpolarizing the receptor.
In all vertebrates (animals with backbones, like humans), photoreceptors hyperpolarize to light by a similar mechanism, using similar photochemistry. Some invertebrates, like barnacles and squid, have photoreceptors that use different photochemistry and depolarize to light.
The hyperpolarization of the photoreceptor causes a structure at its base, called the pedicle, to release less glutamate, the photoreceptor neurotransmitter.
The photoreceptor pedicle is very similar to a conventional axon terminal except that, instead of individual action potentials releasing puffs of neurotransmitter, light absorption continuously modulates the release of neurotransmitter.
The modulation of glutamate release drives other cells in the retina.
The outputs of photoreceptors drive two main types of cells called bipolar and horizontal cells. These cells are discussed in the next section.
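The cascade just described can be summarized in a toy simulation. Everything below is in arbitrary illustrative units (the membrane voltages are only roughly in the physiological ballpark); the point is the chain of sign changes: more light, less cGMP, fewer open channels, a more hyperpolarized cell, less glutamate released.

```python
# Toy model of the phototransduction cascade described above.
# Quantities are arbitrary illustrative units, not physiological measurements.

def phototransduce(photons_absorbed, cgmp_rest=100.0, glutamate_rest=10.0):
    """More photons -> less cGMP -> fewer open channels -> hyperpolarization
    -> less glutamate released at the pedicle."""
    # Light activates the cascade that reduces cGMP concentration.
    cgmp = max(cgmp_rest - 2.0 * photons_absorbed, 0.0)
    # The fraction of depolarizing channels held open tracks cGMP.
    open_channels = cgmp / cgmp_rest
    # In darkness the open channels keep the cell relatively depolarized
    # (around -40 mV); closing them hyperpolarizes it (toward -70 mV).
    membrane_mv = -70.0 + 30.0 * open_channels
    # Glutamate release at the pedicle falls as the cell hyperpolarizes.
    glutamate = glutamate_rest * open_channels
    return membrane_mv, glutamate

dark = phototransduce(0)    # depolarized, high glutamate release
light = phototransduce(40)  # hyperpolarized, low glutamate release
```

Notice that the photoreceptor's "response" to light is a *decrease* in transmitter release: unlike a conventional neuron, it signals by continuous modulation rather than by firing action potentials.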
Photoreceptors do not send an image of the world directly to the brain. Instead, they communicate with other retinal neurons that extract specific information about the image to send to higher brain centers. The following sections explore that communication.
Why doesn’t the eye just send the electrical signal from all the rod and cone photoreceptors directly to the brain? The main reason is that there are well over 100 million rods and cones but only about a million axon transmission lines available to go to the brain (refer to Chapter 3 for more about axons). Even worse, these transmission lines work by sending at most a few hundred action potentials per second along the axon, further limiting what information each line can send.
The retina gets around these limitations on transmission capacity in a few interesting ways, as the following sections explain.
The most common misconception about the retina is that it sends some sort of raw image to the brain. In reality, the retina processes the image and sends information extracted from the image to at least 15 different brain areas by at least 20 parallel pathways.
The retina has more photoreceptors and other retinal cells in its center (a region called the fovea; refer to Figure 5-1) than in the periphery. You notice things happening in your peripheral vision, but to identify something clearly, you have to look directly at it so that its image falls on the high-resolution fovea.
The retina also uses adaptation: photoreceptors respond briskly whenever the light level changes but then settle down and reduce their output after a few seconds. Adaptation saves energy and spikes, because cells don’t have to keep telling the brain, “Yes, the light level is still the same as it was a few seconds ago.” Photoreceptors adjust their dynamics so that their baseline neurotransmitter release centers their responses around the current average light level. This type of temporal adaptation occurs in photoreceptors and other retinal neurons.
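Temporal adaptation amounts to signaling departures from a running average of recent input, so a constant stimulus fades from the output. Here's a minimal sketch; the adaptation rate of 0.3 per time step is an arbitrary illustrative value.

```python
# Toy model of temporal adaptation: the cell reports the difference between
# the current light level and a slowly updated baseline, so a sustained
# change produces a burst that then settles back toward zero.

def adapt(light_levels, rate=0.3):
    """Return the adapted response to a sequence of light levels."""
    baseline = light_levels[0]
    outputs = []
    for level in light_levels:
        outputs.append(level - baseline)       # respond to *change* only
        baseline += rate * (level - baseline)  # baseline drifts toward input
    return outputs

# A step up in brightness: a big initial response that decays as the
# cell adapts to the new (constant) level.
responses = adapt([1.0] * 5 + [5.0] * 10)
```

During the steady stretch before the step, the output sits at zero; at the step it jumps, then decays — exactly the "stop repeating yourself" economy described above.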
Adaptation occurs in another way in the retina (and the brain) via neural circuit interactions. This type of adaptation minimizes information across space. A process called lateral inhibition reduces how much information is transmitted to the brain because photoreceptors and other retinal neurons communicate the difference between the light where they are and the surrounding area rather than the absolute level of the light they receive. The next sections explain the neural circuitry in the retina that makes lateral inhibition possible.
Lateral inhibition is the second major way (after temporal adaptation) that your nervous system overcomes the limits on how much information can travel from the retina to the brain. In lateral inhibition, a photoreceptor’s output conveys not the absolute level of the light it receives but the difference between that level and the light falling on its neighbors. This section explains the neural circuitry that makes the comparison possible.
Photoreceptors connect to two retinal neural cell classes: horizontal cells and bipolar cells (see Figure 5-2). Horizontal cells mediate lateral inhibition, and bipolar cells pass the photoreceptor signal, which has been modified by the horizontal cells, on toward the next retinal layer, which then projects to the brain. The next sections explain how this process works.
© John Wiley & Sons, Inc.
FIGURE 5-2: Photoreceptors connect to bipolar and horizontal cells.
Suppose you’re staring at a stop sign. You don’t need all the cells responding to different parts of the sign to report with high precision that exactly the same shade of red occurs everywhere over the entire sign. The retina can avoid sending redundant spatial information because lateral inhibition uses horizontal cells to allow photoreceptors to communicate the difference between the light they receive and the surrounding light.
Here’s how it works: Horizontal cells receive excitation from surrounding photoreceptors and subtract a percentage of this excitation from the output of the central photoreceptor. This action allows each photoreceptor to report the difference between the light intensity and color it receives and the average intensity and color nearby. The photoreceptor can then signal small differences in intensity or color from those of nearby areas. These highly precise signals go to the next cells, the bipolar cells.
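The subtraction just described can be sketched directly. In this toy model, each "photoreceptor" in a one-dimensional row outputs its own intensity minus a fraction of its neighbors' average, standing in for the horizontal cells; the inhibition strength of 0.8 is an arbitrary illustrative value.

```python
# Toy lateral inhibition in a 1-D row of photoreceptors: each cell's output
# is its own intensity minus a fraction of its neighbors' average (the
# surround signal supplied by horizontal cells).

def lateral_inhibition(intensities, strength=0.8):
    outputs = []
    n = len(intensities)
    for i in range(n):
        neighbors = [intensities[j] for j in (i - 1, i + 1) if 0 <= j < n]
        surround = sum(neighbors) / len(neighbors)
        outputs.append(intensities[i] - strength * surround)
    return outputs

# A uniform bright region next to a uniform dark one (like the interior of
# a stop sign meeting its background): the uniform interiors produce small
# outputs, while the cells at the edge produce the largest signals.
signal = lateral_inhibition([5, 5, 5, 1, 1, 1])
```

Running this, the outputs at positions 2 and 3 (the edge) are larger in magnitude than those in the uniform interiors, so the redundant "still the same shade" information never gets transmitted.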
The signals from the photoreceptors that have been modified by the horizontal cells are then sent to the bipolar cells (refer to Figure 5-2). The bipolar cells then carry these signals to the next retinal processing layer. Bipolar cells come in two major varieties:

Depolarizing (on) bipolar cells: These depolarize in response to light.

Hyperpolarizing (off) bipolar cells: These hyperpolarize in response to light and depolarize in response to dark.
As mentioned, bipolar cells carry the signal forward toward the next layer of retinal processing before the brain. In this second synaptic retinal layer, bipolar cells connect to (synapse on) two kinds of postsynaptic cells:

Amacrine cells: These mediate lateral interactions in this layer, much as horizontal cells do in the first.

Retinal ganglion cells: These carry the retina’s output to the brain along their axons.
I cover both of these cell types in more detail in the next section. (Stick with me through the next section: We’re almost to the brain!)
The visual image that the photoreceptors capture and that the horizontal and bipolar cells modify is received by another group of neurons, the retinal ganglion cells, where yet another set of lateral interactions, mediated by amacrine cells, occurs.
As I explain in the previous sections, the connections within the retina are between cells much closer than one millimeter to each other. But the messages going from your eye to your brain have to travel many centimeters. A few centimeters may not sound like a lot to you, but when you’re a cell, it’s a marathon! Traveling this distance requires axons that conduct action potentials, by which ganglion cells convert their analog bipolar cell input into a digital pulse code for transmission to the brain. (See Chapter 3 for a discussion of action potentials.)
The depolarizing bipolar cells, which are excited by light, are connected to matching ganglion cells called on-center. The hyperpolarizing bipolar cells, which are inhibited by light (but excited by dark), are connected to matching ganglion cells called off-center ganglion cells.
In addition to performing other functions, amacrine cells modulate signals from bipolar cells to ganglion cells much the same way horizontal cells modulate signals from photoreceptors before sending them to the bipolar cells. That is, amacrine cells conduct inhibitory signals from the surrounding bipolar cells so that the ganglion cell responds to the difference between the illumination in its area and the surrounding areas, rather than to the absolute level of illumination (as coded by its bipolar cell inputs). This action reduces the amount of redundant information that must be transmitted over the limited number of ganglion cell axon transmission lines.
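The resulting center-surround comparison at the ganglion cell can be sketched as a toy model. Here an on-center cell fires when its center is brighter than its surround, and an off-center cell fires when the center is darker; all the numbers are illustrative, and real ganglion cells of course compute this over many graded inputs.

```python
# Toy center-surround ganglion cell: its firing reflects the difference
# between its central bipolar input and the surrounding bipolar inputs
# relayed through amacrine cells.

def ganglion_response(center, surround_inputs, on_center=True):
    surround = sum(surround_inputs) / len(surround_inputs)
    contrast = center - surround
    rate = contrast if on_center else -contrast
    return max(rate, 0.0)  # firing rates can't be negative

# A bright spot on a dark surround excites an on-center cell; a dark spot
# on a bright surround excites an off-center cell; uniform illumination
# excites neither, no matter how bright it is.
bright_spot = ganglion_response(1.0, [0.2, 0.2, 0.2], on_center=True)
dark_spot   = ganglion_response(0.1, [0.9, 0.9, 0.9], on_center=False)
uniform     = ganglion_response(0.5, [0.5, 0.5, 0.5], on_center=True)
```

The uniform case returning zero is the whole point: absolute illumination is redundant, so it is simply not sent down the limited number of axons.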
Despite this similarity between amacrine and horizontal cell function, amacrine cells come in more varieties and are more complicated than horizontal cells. As a result, the same bipolar cell inputs create different ganglion cell classes. The two most important ganglion cell classes are

Parvocellular ganglion cells: Small cells that are sensitive to fine detail and color.

Magnocellular ganglion cells: Larger cells that are more sensitive to motion and to low-contrast stimuli.
Parvocellular ganglion cells are by far the most numerous ganglion cells in the retina. Both of these ganglion cell classes have on-center and off-center varieties.
Other types of amacrine cells produce ganglion cell classes that respond only to specific features from the visual input and project to particular areas of the brain. For example, some ganglion cells respond only to motion in a certain direction and help you track moving objects or keep your balance. Other ganglion cells sense only certain colors, helping you tell ripe from unripe fruit, or red stop lights from green lights. Still others indicate the presence of edges in the scene.
The previous section explores how the retina converts light into ganglion cell pulses that signal different things about the visual image. These pulses, called action potentials, can travel the centimeter distances to the brain over the ganglion cell axons. In this section, I finally get to the heart of the (gray) matter and discuss where these pulses go in the brain and what happens after they get there.
The main output of the retina is to an area of the brain called the thalamus (refer to Chapter 2 for a general description of the thalamus). The visual sub-region of the thalamus is called the dorsal lateral geniculate nucleus (dLGN). Both the parvocellular and magnocellular ganglion cell classes — refer to the earlier section “Breaking down into ganglion cell types and classes” — project to the dLGN.
Figure 5-3 shows that a bundle of axons leaves each eye and that the two bundles join a few centimeters later. These bundles of axons are called the optic nerves (the term nerve is a general term for a bundle of axons). The junction point where the optic nerves first meet is called the optic chiasm, which means “optic crossing.”
© John Wiley & Sons, Inc.
FIGURE 5-3: The thalamus and neocortex.
A very interesting thing happens at the optic chiasm. Some of the ganglion cell axons from each eye cross at the chiasm and go to the other side of the brain, and some don’t. Which axons cross and which don’t, you may wonder, and why?
Look carefully at the right eye in Figure 5-3. The part of the right retina closest to the nose (called the nasal retina) receives images from the world on the right side (right visual field), while the part of the right retina farthest from the nose (called the temporal retina) receives input from the left side of the world. In the left eye, the right visual field falls on its temporal retina. What happens at the optic chiasm is that axons sort themselves so that the information received from the right visual field is sent to the left side of the brain, while the right side of the brain gets inputs from the left side of visual space (that the left brain deals with right side should be a familiar theme by now).
This left-right sorting occurs because axons from nasal retinal ganglion cells (that see the visual world on the same side as the eye) cross at the optic chiasm and go to the opposite side of the brain, while the axons of temporal retinal ganglion cells do not cross.
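The crossing rule is simple enough to write as a lookup. This sketch just encodes the rule stated above: nasal-retina axons cross at the chiasm, temporal-retina axons stay on their own side, and the result is that each optic tract carries one visual field from both eyes.

```python
# The sorting rule at the optic chiasm: nasal-retina axons cross to the
# opposite side of the brain; temporal-retina axons stay uncrossed.

def optic_tract_side(eye, retina_half):
    """eye: 'left' or 'right'; retina_half: 'nasal' or 'temporal'.
    Returns which optic tract (brain side) the axon ends up in."""
    if retina_half == "nasal":
        return "right" if eye == "left" else "left"  # crosses at the chiasm
    return eye                                       # stays on its own side

# Both retina halves that see the RIGHT visual field end up in the LEFT
# optic tract: the right eye's nasal retina and the left eye's temporal retina.
assert optic_tract_side("right", "nasal") == "left"
assert optic_tract_side("left", "temporal") == "left"
```

Tracing the other two cases shows the left visual field from both eyes converging on the right optic tract, completing the left-brain/right-field arrangement.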
So the nerves after the optic chiasm contain a different mix of axons than the optic nerves entering it; they also have a different name, the optic tracts. The left optic tract has axons from both eyes that see the right visual field, while the right optic tract has axons from both eyes that see the left visual field. As a result, not only does the left visual cortex deal with the right visual field (and vice versa), but cells in the cortex can be driven by the same visual field location in both eyes.
What happens to the visual signal in the thalamus? Twenty years ago, most researchers would have answered “very little,” because ganglion cell axons synapse on cells in the thalamus that have response properties very similar to those of their ganglion cell inputs. Each thalamic relay cell receives inputs from one or a few similar ganglion cells and then projects to the visual cortex; for this reason, dLGN neurons are often referred to as relay cells. For example, some layers of the dLGN receive inputs from only parvocellular ganglion cells, and within those layers, on-center parvocellular ganglion cells drive on-center parvocellular relay cells. Similarly, off-center parvocellular ganglion cells drive off-center parvocellular relay cells. A corresponding situation exists for on- and off-center magnocellular relay cells in other dLGN layers.
Why do ganglion cells make this relay stop, then? Is it because the ganglion cell axons simply can’t grow far enough? This explanation seems unlikely because all mammals have this visual relay through the thalamus, despite large differences in brain size (consider the distances involved in elephant versus mouse brains). The more likely explanation is that, although the relay cells in the thalamus seem to respond very much like their parvocellular or magnocellular ganglion cell inputs, other inputs to the dLGN from other parts of the brain allow gating functions associated with attention. (A gating function is a modulation of the strength of a neuron’s responses to a particular stimulus based on the context or importance of that stimulus.)
How does attention use a gating function in the thalamus? Imagine you’re meeting someone you’ve never seen before, and you’ve been told this person is wearing a red sweater. As you scan the crowd, you orient to and attend to people wearing red. This task is accomplished at numerous places in your visual system, including your thalamus, because the cells in your thalamus that respond to red things have their responses enhanced by your attention. If you then got a text message saying that your party had taken off the red sweater because the plane was too hot and is now wearing a green shirt, you could switch your attention to green, and the outputs of green-responding cells would be enhanced instead.
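A gating function of this kind can be sketched as a simple multiplicative gain. This is a toy model of the idea, not of any measured circuit: the gain values are invented, and real attentional modulation in the dLGN is far subtler.

```python
# Toy gating function: attention scales a relay cell's stimulus-driven
# response by a gain that depends on whether the cell's preferred feature
# matches what is currently being attended. Gain values are illustrative.

def relay_output(driven_response, preferred_color, attended_color):
    gain = 1.5 if preferred_color == attended_color else 0.8
    return driven_response * gain

# Scanning for the red sweater boosts red-preferring relay cells...
red_attended = relay_output(10.0, "red", attended_color="red")
# ...and after the text message, attention switches and green-preferring
# cells are boosted instead, while red-preferring cells are damped.
green_attended = relay_output(10.0, "green", attended_color="green")
red_unattended = relay_output(10.0, "red", attended_color="green")
```

The key point the sketch captures is that the stimulus-driven input is unchanged; only its transmitted strength is modulated by context.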
The thalamic relay cells in turn send their axons to the visual area of the neocortex at the back of the head, called the occipital lobe (refer to Chapter 2). This fiber tract of axons is called the optic radiation due to the appearance of the axons “fanning out” from a bundle. I discuss cortical processing in later sections starting with “From the thalamus to the occipital lobe.”
Some ganglion cells project to retinal recipient zones other than the thalamus. These zones, explained in the following sections, carry specific information extracted from images for functions like the control of eye movements, pupillary reflexes, and circadian rhythms.
This midbrain region receives axons from almost all ganglion cell classes except the parvocellular cells. The superior colliculus controls eye movements. Our eyes are almost never still; instead, they jump from fixation to fixation about three to four times per second. These large, rapid eye movements are called saccades. Saccades can be voluntary, such as when you’re visually searching for something, or involuntary, as is the case when something appears or moves in your peripheral vision that draws your attention and your gaze (like red sweaters or green shirts). Often, when you make a saccade, your eyes quite accurately move to a new area of interest whose details are below your level of awareness but have been processed by the ganglion cell projections to the superior colliculus.
Several accessory optic and pretectal nuclei in the brainstem receive inputs from ganglion cells that detect self-movement. These visual nuclei are essential for balance and enable you to maintain fixation on a particular object while you or your head moves. They project to motor areas of the brain that control eye muscles so that no retinal slip (or movement of the image across the retina) occurs despite your movement or movement of the object of your attention.
One important function of this pathway is for visual tracking, the ability to follow, for example, the flight of a bird across the sky while keeping the bird image centered on your high acuity fovea. You can do this kind of tracking fixation not only when you are standing still, but also while you are running — a handy skill if you are a wide receiver running to catch a forward pass.
Suprachiasmatic means “above the optic chiasm.” This area regulates circadian rhythms, the body’s intrinsic day-night cycle, which includes being awake and sleeping. Humans are built to be active during daylight hours and to sleep at night.
This natural cycle is activated by a class of ganglion cells that are intrinsically light-sensitive; that is, they have their own photoreceptor molecules and respond to light directly, in addition to being driven by the photoreceptor-bipolar cell sequence (explained in the earlier section “Processing signals from the photoreceptors: Horizontal and bipolar cells”). These intrinsically photoreceptive cells, as they’re called, send information about day versus night light levels to the area of the brain that controls your circadian rhythms. (See Chapter 11 for more about what happens during sleep.)
Like the suprachiasmatic nucleus (refer to the preceding section), the Edinger-Westphal nucleus receives inputs from intrinsically photoreceptive ganglion cells that inform it of the current overall light level. This nucleus controls your pupil’s level of dilation, which controls the amount of light entering the eye.
The cells in the dLGN of the thalamus that receive projections from the retina project to the occipital lobe of the cerebral cortex at the back of your brain. This is the pathway that mediates almost all of the vision you’re conscious of. (Contrast this with vision functions, such as pupil contraction and dilation, that you’re neither conscious of nor able to voluntarily control.) The area of the occipital lobe that receives this thalamic input is called V1 (meaning, “visual area 1”) and is at the bottom of the horizontal brain section in Figure 5-3.
Neurons in area V1 project to other areas of cortex and these to other areas still so that virtually all the occipital lobe and most of the parietal lobe and inferior temporal lobe have cells that respond to certain types of visual inputs. What all these different visual areas (more than 30 at last count) are doing is responding to and analyzing different features of the image on the retinas, enabling you to recognize and interact with objects out there in the world. The way these visual areas accomplish this feat is through neurons in different visual areas that respond to discrete features of the visual input.
Take a brief look at the numbers. A little over one million retinal ganglion cells project to about the same number of relay neurons in the dLGN of the thalamus. However, each thalamic relay neuron projects to over 100 V1 neurons. In other words, the tiny area of the visual image subserved by a few retinal ganglion and thalamic cells drives hundreds of V1 neurons.
What the hundreds of V1 neurons are doing with the output of a much smaller number of ganglion cells is extracting local features that exist across several of their inputs.
As David Hubel and Torsten Wiesel of Harvard University famously showed, V1 cells are almost all sensitive to the orientation of the stimulus that excites them. This means that these cells don’t fire action potentials unless a line or edge of a particular orientation exists in the image, which is represented by several ganglion cells in a line in some direction being activated.
All stimulus orientations (vertical, horizontal, and everything in between) are represented in V1 so that some small group of ganglion cells that respond to local light or dark in some area of the image gives rise to a much larger group of V1 cortical cells that respond only to a particular orientation of an edge going through that area.
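The idea that a V1 cell sums a line of ganglion-cell inputs can be sketched as a toy detector. The grid, cell positions, and threshold below are all invented for illustration; real V1 receptive fields are graded and far more elaborate.

```python
# Toy V1 orientation detector: the cell fires only when the ganglion-cell
# inputs lying along its preferred orientation are active together.

def v1_response(image, input_positions, threshold=3):
    """image: 2-D grid of 0/1 ganglion-cell activity.
    input_positions: the (row, col) line of inputs this V1 cell sums."""
    drive = sum(image[r][c] for r, c in input_positions)
    return drive if drive >= threshold else 0

# A vertical edge at column 1 activates a vertical line of ganglion cells.
image = [[0, 1, 0],
         [0, 1, 0],
         [0, 1, 0]]

vertical_cell   = [(0, 1), (1, 1), (2, 1)]  # prefers vertical edges
horizontal_cell = [(1, 0), (1, 1), (1, 2)]  # prefers horizontal edges

vertical_out = v1_response(image, vertical_cell)      # fires
horizontal_out = v1_response(image, horizontal_cell)  # stays silent
```

The vertically tuned cell fires because all three of its inputs lie on the edge, while the horizontally tuned cell, sampling across the edge, falls below threshold and stays silent.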
Other V1 neurons only respond to certain directions of motion, as though a particular sequence of ganglion cells has to be stimulated in a certain order. As with orientation, all directions are encoded, each by a particular cell or small set of cells. Other V1 cells are sensitive to the relative displacement of image components between the two eyes due to their slightly different viewing position (called binocular disparity).
As mentioned earlier, area V1 is at the posterior pole of the occipital lobe. Just anterior to V1 is (you may have guessed) area V2. Anterior to that is V3. Neurons in these areas tend to have relatively similar response properties. For now, just think of V1–V3 as a complex from which projections to other areas arise (yes, I know this is a gross oversimplification of the undoubtedly important differences in their functions that additional research will make clear).
Understanding the immensely complex visual processing network that takes up nearly half of all the neocortex is one of the most challenging areas of research in neuroscience today. One of the most important organizing principles we currently have is that there is a structural and functional division in the visual processing hierarchy. This is illustrated in Figure 5-4.
Illustration by Frank Amthor
FIGURE 5-4: Visual cortical areas.
The dorsal stream is the projection into the parietal lobe. Cortical areas in the dorsal stream, such as MT (middle temporal) and MST (medial superior temporal), are dominated by cells that respond best to image movement. In MST particularly, there are cells that respond best to the types of visual images that would be produced by self-movement, such as rotation of the entire visual field, and optic flow (the motion pattern generated by translation through the world, with low speeds around the direction in which you are heading but high speeds off to the side). In addition, motion parallax, in which close objects appear to shift more than distant ones when you move your head from side to side, is encoded by motion-selective cells in the dorsal pathway. Many dorsal stream cortical areas project to the frontal lobe via areas such as VIP, LIP, and AIP (ventral, lateral, and anterior intraparietal areas, respectively).
The ventral stream goes to areas along the inferior aspect of the temporal lobe (the so-called infero-temporal cortex). It includes areas such as TE and TEO (temporal and temporal-occipital areas). These areas project to the hippocampus and frontal lobe. The ventral pathway is often called the “what” pathway. Cortical areas in this stream have neurons that prefer particular patterns or colors (almost all neurons in ventral stream area V4, for example, are color selective) and are not generally motion selective.
As you move from posterior to anterior along the inferior temporal lobe, you find cells that respond only to increasingly complex patterns such as the shape of a hand. Near the pole of the temporal lobe on its medial side is an area called the fusiform face area, with cells that respond only to faces. Damage to this area has resulted in patients with normal visual acuity but who cannot recognize any faces, including their own.
Despite the clear segregation of functions between the dorsal and ventral streams, they clearly exist in a network in which there is crosstalk. For example, in structure-from-motion experiments, researchers put reflector spots on various body parts of actors who wore black suits and filmed their movements in very low light so that only the dots were visible on the film. Anyone seeing these films can tell, once the film is set in motion, that the dots are on the bodies of people, what the people are doing, and even their gender. In this case, motion-detecting neurons in the dorsal pathway must communicate with object-detecting neurons in the ventral pathway.
Another example of dorsal-ventral pathway crosstalk is depth perception. The visual system estimates depth or distance to various objects in the environment in a number of ways. Some cues, such as pictorial depth cues, can be represented in pictures and photographs. These include near objects overlapping objects that are farther away and relative size (nearer objects are larger than more distant ones — a tiny car in a picture must be farther away than a large person). Ventral pathway pattern-based cues must work with dorsal pathway motion-based cues to give a unified judgment of depth.
We tend to believe that we see “what is really out there” when, in reality, what we “see” is a construct from a combination of the current images on our retinas and our past experience. If you have a color-vision defect, for example, you may show up for work with a pair of socks that you think match but that your coworkers see as being different (to which your best response might be “Funny, but I have another pair at home just like these!”). There are also things — such as optical illusions — that none of us see as they really are. Visual deficits and illusions both tell neuroscientists a lot about how our visual system is built and works.
As I note in the earlier section “The retina: Converting photons to electrical signals,” the three different cone types (red, green, and blue) enable you to see color. Take away any of those particular cone types, and you have color blindness.
By far the most common forms of color blindness result from the absence of one cone type in the retina. About 1 in 20 men and 1 in 400 women are missing either red cones (a condition called protanopia) or green cones (deuteranopia). People with these conditions can’t discriminate red from green colors.
Even rarer in both men and women is the loss of blue cones, called tritanopia. These folks can’t distinguish blue-green colors.
An example of an acquired form of color blindness involves damage to cortical area V4 in the ventral stream, resulting in achromatopsia. Achromatopsia differs from retinal color blindness: in retinal color blindness, the person simply can’t discriminate between certain hues. In achromatopsia, by contrast, different colors appear as different shades of gray; the person can still discriminate between them but perceives no color.
People’s dread of losing their vision has nearly the same intensity as their dread of contracting cancer. Although many blind people have led productive and very satisfying lives, loss of sight is considered one of the most disabling of all possible injuries. In this section, I discuss some of the most common causes of blindness.
Most blindness, at least in the developed world, originates in the retina. The most common forms of retinal blindness are retinopathies, such as retinitis pigmentosa, macular degeneration, and diabetic retinopathy, which cause photoreceptors to die, but numerous other causes of blindness exist as well.
How is it that we sometimes see something that is not there? Some optical illusions, like mirages and rainbows, are due to optical properties of the atmosphere and can be photographed.
Other illusions, however, seem to be constructions of our brains, so that we perceive something that is not photographable. Typical examples of these include the Ponzo (railroad track) illusion, in which two identical lines appear to be different sizes when placed on parallel lines that converge in the distance, and the Necker cube, where the face of the cube seems to change, depending on which side the viewer focuses on. Another famous visual illusion is the Kanizsa triangle (see Figure 5-5), in which a solid white triangle seems to overlay a black triangle outline. The catch? There is no white triangle.
© John Wiley & Sons, Inc.
FIGURE 5-5: The Kanizsa triangle; there is no solid white triangle.
Each of these illusions can be explained similarly. Our visual system evolved to make sense of images projected onto our retinas by real, three-dimensional objects in the real world. In other words, we see what we expect to see. The illusion image of the Kanizsa triangle, for example, is a very complicated two-dimensional image, what with the three precisely spaced angles and the three precisely arranged circle segments. In the three-dimensional world, such an image is reasonably possible only when a solid white triangle is present. Hence, that’s what we see.