vision
Rule #10
Vision trumps all other senses.
WE DO NOT SEE with our eyes. We see with our brains.
The evidence lies with a group of 54 wine aficionados. Stay with me here. To the untrained ear, the vocabularies that wine tasters use to describe wine may seem pretentious, more reminiscent of a psychologist describing a patient. (“Aggressive complexity, with just a subtle hint of shyness” is something I once heard at a wine-tasting soirée to which I was mistakenly invited—and from which, once picked off the floor rolling with laughter, I was hurriedly escorted out the door).
These words are taken very seriously by the professionals, however. A specific vocabulary exists for white wines and a specific vocabulary for red wines, and the two are never supposed to cross. Given how individually we each perceive any sense, I have often wondered how objective these tasters actually could be. So, apparently, did a group of brain researchers in Europe. They descended upon ground zero of the wine-tasting world, the University of Bordeaux, and asked: “What if we dropped odorless, tasteless red dye into white wines, then gave it to 54 wine-tasting professionals?” With only visual sense altered, how would the enologists now describe their wine? Would their delicate palates see through the ruse, or would their noses be fooled? The answer is “their noses would be fooled.” When the wine tasters encountered the altered whites, every one of them employed the vocabulary of the reds. The visual inputs seemed to trump their other highly trained senses.
Folks in the scientific community had a field day. Professional research papers were published with titles like “The Color of Odors” and “The Nose Smells What the Eye Sees.” That’s about as much frat boy behavior as prestigious brain journals tolerate, and you can almost see the wicked gleam in the researchers’ eyes. Data such as these point to the nuts and bolts of this chapter’s Brain Rule. Visual processing doesn’t just assist in the perception of our world. It dominates the perception of our world. Starting with basic biology, let’s find out why.
a hollywood horde
We see with our brains.
This central finding, after years of study, is deceptively simple. It is made more misleading because the internal mechanics of vision seem easy to understand. First, light (groups of photons, actually) enters our eyes, where it is bent by the cornea, the fluid-filled structure upon which your contacts normally sit. Then the light travels through the eye to the lens, where it is focused and allowed to strike the retina, a group of neurons in the back of the eye. The collision generates electric signals in these cells, and the signals travel deep into the brain via the optic nerve. The brain then interprets the electrical information, and we become visually aware. These steps seem effortless, 100 percent trustworthy, capable of providing a completely accurate representation of what’s actually out there.
Though we are used to thinking about our vision in such reliable terms, nothing in that last sentence is true. The process is extremely complex, seldom provides a completely accurate representation of our world, and is not 100 percent trustworthy. Many people think that the brain’s visual system works like a camera, simply collecting and processing the raw visual data provided by our outside world. Such analogies mostly describe the function of the eye, however, and not particularly well. We actually experience our visual environment as a fully analyzed opinion about what the brain thinks is out there.
For years, we thought that the brain processed information such as color, texture, motion, depth, and form in discrete areas; higher-level structures in the brain then gave meaning to these features, and we suddenly obtained a visual perception. This is very similar to the steps discussed in the Multisensory chapter: sensing, routing, and perception, using bottom-up and top-down methods. It is becoming clearer that we need to amend this notion. We now know that visual analysis starts surprisingly early, at the moment light strikes the retina. In the old days, we thought this collision was a mechanical, automated process: A photon shocked a retinal nerve cell into cracking off some electrical signal, which eventually found its way to the back of our heads. All perceptual heavy lifting was done afterward, deep in the bowels of the brain. There is strong evidence that this is not only a simplistic explanation of what goes on. It is a wrong explanation.
Rather than acting like a passive antenna, the retina appears to quickly process the electrical patterns before it sends anything off to Mission Control. Specialized nerve cells deep within the retina interpret the patterns of photons striking it, assemble the patterns into partial “movies,” and then send these movies off to the back of our heads. The retina, it seems, is filled with teams of tiny Martin Scorseses. These movies are called tracks. Tracks are coherent, though partial, abstractions of specific features of the visual environment. One track appears to transmit a movie you might call Eye Meets Wireframe. It is composed only of outlines, or edges. Another makes a film you might call Eye Meets Motion, processing only the movement of an object (and often in a specific direction). Another makes Eye Meets Shadows. There may be as many as 12 of these tracks operating simultaneously in the retina, sending off interpretations of specific features of the visual field. This new view is quite unexpected. It’s like discovering that the reason your TV gives you feature films is that your cable is infested by a dozen amateur independent filmmakers, hard at work creating the feature while you watch it.
streams of consciousness
These movies now stream out along the optic nerves, one from each eye, and flood the thalamus, that egg-shaped structure in the middle of our heads that serves as a central distribution center for most of our senses. If these streams of visual information can be likened to a large, flowing river, the thalamus can be likened to the beginning of a delta. Once it leaves the thalamus, the information travels along increasingly divided neural streams. Eventually, there will be thousands of small neural tributaries carrying parts of the original information to the back of the brain. The information drains into a large complex region within the occipital lobe called the visual cortex. Put your hand on the back of your head. Your palm is now less than a quarter of an inch away from the area of the brain that is currently allowing you to see this page. It is a quarter of an inch away from your visual cortex.
The visual cortex is a big piece of neural acreage, and the various streams flow into specific parcels. There are thousands of lots, and their functions are almost ridiculously specific. Some parcels respond only to diagonal lines, and only to specific diagonal lines (one region responds to a line tilted at 40 degrees, but not to one tilted at 45). Some process only the color information in a visual signal; others, only edges; others, only motion.
Damage to the region responding to motion results in an extraordinary deficit: the inability to see moving objects as actually moving. This can be very dangerous, observable in the famous case of a Swiss woman we’ll call Gerte. In most respects, Gerte’s eyesight was normal. She could provide the names of objects in her visual field; recognize people, both familiar and unfamiliar, as human; read newspapers with ease. But if she looked at a horse galloping across a field, or a truck roaring down the freeway, she saw no motion. Instead, she saw a sequence of static, strobe-like snapshots of the objects. There was no smooth impression of continuous motion, no effortless perception of instantaneous changes of location. There was no motion of any kind. Gerte became terrified to cross the street. Her strobe-like world did not allow her to calculate the speed or destination of the vehicles. She could not perceive the cars as moving, let alone moving toward her (though she could readily identify the offending objects as automobiles, down to make and license plate). Gerte even said that talking to someone face-to-face was like speaking on the phone. She could not see the changing facial expressions associated with normal conversation. She could not see “changing” at all.
Gerte’s experience shows the modularity of visual processing. But it is not just motion. Thousands of streams feeding into these regions allow for the separate processing of individual features. And if that were the end of the visual story, we might perceive our world with the unorganized fury of a Picasso painting, a nightmare of fragmented objects, untethered colors, and strange, unboundaried edges.
But that’s not what happens, because of what takes place next. At the point where the visual field lies in its most fragmented state, the brain decides to reassemble the scattered information. Individual tributaries start recombining, merging, pooling their information, comparing their findings, and then sending their analysis to higher brain centers. The centers gather these hopelessly intricate calculations from many sources and integrate them at an even more sophisticated level. Higher and higher they go, eventually collapsing into two giant streams of processed information. One of these, called the ventral stream, recognizes what an object is and what color it possesses. The other, termed the dorsal stream, recognizes the location of the object in the visual field and whether it is moving. “Association regions” do the work of integrating the signals. They associate—or, better to say, reassociate—the balkanized electrical signals. Then, you see something. So, the process of vision is not as simple as a camera taking a picture. The process is more complex and more convoluted than anyone could have imagined. There is no real scientific agreement about why this disassembly and reassembly strategy even occurs.
Complex as visual processing is, things are about to get worse. We generally trust our visual apparatus to serve us a faithful, up-to-the-minute, 100 percent accurate representation of what’s actually out there. Why do we believe that? Because our brain insists on helping us create our perceived reality. Two examples explain this exasperating tendency. One involves people who see miniature policemen who aren’t there. The other involves the active perception of camels.
camels and cops
You might wonder whether I’d had too much to drink if I told you right now that you were actively hallucinating. But it’s true. At this very moment, while reading this text, you are perceiving parts of this page that do not exist. Which means you, my friend, are hallucinating. I am about to show you that your brain actually likes to make things up, and that it is not 100 percent faithful to what your eyes broadcast to it.
There is a region in the eye where retinal neurons, carrying visual information, gather together to begin their journey into deep brain tissue. That gathering place is called the optic disk. It’s a strange region, because it contains no photoreceptor cells at all, and each eye has one. Do you ever see two black holes in your field of view that won’t go away? That’s what you should see, but you don’t. Your brain plays a trick on you. As the signals are sent to your visual cortex, the brain detects the presence of the holes and then does an extraordinary thing. It examines the visual information 360 degrees around each spot and calculates what is most likely to be there. Then, like a paint program on a computer, it fills in the spot. The process is called “filling in,” but it could be called “faking it.” Some believe that the brain simply ignores the lack of visual information, rather than calculating what’s missing. Either way, you’re not getting a 100 percent accurate representation.
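The “filling in” idea can be made concrete with a toy sketch. The Python below is purely illustrative: the function name, the one-dimensional “image,” and the choice of linear interpolation are my assumptions, not a model of what neurons actually do. It patches a hole in a row of pixel values by interpolating from the intact values on either side, which is the flavor of computation the paint-program analogy suggests.

```python
# Toy analogy for the brain's "filling in" of the blind spot.
# Illustrative only; the real neural mechanism is unknown.

def fill_blind_spot(row, start, end):
    """Replace row[start:end] with values linearly interpolated
    from the intact pixels on either side of the hole."""
    left, right = row[start - 1], row[end]
    width = end - start + 1
    patched = list(row)
    for i in range(start, end):
        t = (i - start + 1) / width  # fraction of the way across the hole
        patched[i] = left + t * (right - left)
    return patched

# A smooth gradient with a "blind spot" (None marks the missing input):
row = [10, 20, 30, None, None, None, 70, 80]
print(fill_blind_spot(row, 3, 6))  # -> [10, 20, 30, 40.0, 50.0, 60.0, 70, 80]
```

Note that the result is a plausible guess, not recovered data: if the scene behind the blind spot were not smooth, the patch would be confidently wrong, which is exactly the sense in which filling in is “faking it.”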
It should not be surprising that the brain possesses such independent-minded imaging systems. Proof is as close as last night’s dream. But just how much of a loose cannon these systems can be is evidenced in a phenomenon known as Charles Bonnet syndrome. Millions of people suffer from it. Most who have it keep their mouths shut, however, and perhaps with good reason. People with Charles Bonnet syndrome see things that aren’t there. It’s like the blind-spot fill-in apparatus gone horribly wrong. For some patients with Charles Bonnet, everyday household objects suddenly pop into view. For others, unfamiliar people unexpectedly appear next to them at dinner. Neurologist Vilayanur Ramachandran describes the case of a woman who suddenly—and delightfully—observed two tiny policemen scurrying across the floor, guiding an even smaller criminal to a matchbox-size van. Other patients have reported angels, goats in overcoats, clowns, Roman chariots, and elves. The illusions often occur in the evening and are usually quite benign. The syndrome is common among the elderly, especially among those who previously suffered damage somewhere in their visual pathway. Extraordinarily, almost all of the patients experiencing the hallucinations know that they aren’t real. No one really knows why they occur.
This is just one example of the powerful ways brains participate in our visual experience. Far from being a camera, the brain is actively deconstructing the information given to it by the eyes, pushing it through a series of filters, and then reconstructing what it thinks it sees. Or what it thinks you should see.
Yet even this is hardly the end of the mystery. Not only do you perceive things that aren’t there with careless abandon, but exactly how you construct your false information follows certain rules. Previous experience plays an important role in what the brain allows you to see, and the brain’s assumptions play a vital role in our visual perceptions. We consider these ideas next.
Since ancient times, people have wondered why two eyes give rise to a single visual perception. If there is a camel in your left eye and a camel in your right eye, why don’t you perceive two camels? Here’s an experiment to try that illustrates the problem nicely.
1) Close your left eye, then stretch your left arm in front of you.
2) Raise up the index finger of your left hand, as if you were pointing to the sky.
3) Keep the arm in this position while you hold your right arm about six inches in front of your face. Raise your right index finger so that it, too, points to the sky.
4) With your left eye still closed, position your right index finger so that it appears just to the left of your left index finger.
5) Now speedily open your left eye and close the right one. Do this several times.
If you positioned your fingers correctly, your right finger will jump to the other side of your left finger and back again. When you open both eyes, the jumping will stop. This little experiment shows that the two images appearing on each retina always differ. It also shows that both eyes working together somehow give the brain enough information to see non-jumping reality.
Why do you see only one camel? Why do you see two arms with stable, non-jumping fingers? Because the brain interpolates the information coming from both eyes. It makes about a gazillion calculations, then provides you its best guess. And it is a guess. You can actually show that the brain doesn’t really know where things are. Rather, it hypothesizes the probability of what the current event should look like and then, taking a leap of faith, approximates a viewable image. What you experience is not the image. What you experience is the leap of faith. Why does the brain do this? Because it is forced to solve a problem: We live in a three-dimensional world, but the light falls on our retina in a two-dimensional fashion. The brain must deal with this disparity if it is going to accurately portray the world. Just to complicate things, our two eyes give the brain two separate visual fields, and they project their images upside down and backward. To make sense of it all, the brain is forced to start guessing.
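The geometry behind this guessing game can be sketched in a few lines. In a simplified pinhole-camera model (an illustrative analogy only, not a claim about neural computation; the function and numbers below are hypothetical, though 65 mm is roughly the human interpupillary distance), the offset, or disparity, between the two images of an object shrinks as the object recedes, so depth can be estimated as focal length times eye separation divided by disparity.

```python
# Illustrative pinhole-camera analogy for binocular depth estimation.
# Not a model of actual neural processing; all numbers are hypothetical.

def depth_from_disparity(focal_mm, baseline_mm, disparity_mm):
    """Estimate distance to an object from the horizontal offset
    (disparity) between its positions in two images.
    Smaller disparity means a farther object."""
    if disparity_mm <= 0:
        raise ValueError("disparity must be positive")
    return (focal_mm * baseline_mm) / disparity_mm

# Two objects viewed by "eyes" 65 mm apart with a 17 mm focal length:
near = depth_from_disparity(17, 65, 2.0)  # large offset between the two views
far = depth_from_disparity(17, 65, 0.1)   # tiny offset between the two views

print(f"near object ~{near:.0f} mm, far object ~{far:.0f} mm")
```

The formula yields a unique depth only once a rigid geometry is assumed; in that sense the answer is an inference from assumptions rather than a measurement, a computational analog of the “leap of faith” described above.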
Upon what does it base its guesses, at least in part? The answer is bone-chilling: prior experience with events in your past. After adamantly inserting numerous assumptions about the received information (some of these assumptions may be inborn), the brain then offers up its findings for your perusal. It goes to all of this trouble for an important reason dripping with Darwinian good will: so you will see one camel in the room when there really is only one camel in the room (and see its proper depth and shape and size and even hints about whether or not it will bite you). All of this happens in about the time it takes to blink your eyes. Indeed, it is happening right now.
If you think the brain has to devote a lot of its precious thinking resources to vision, you are right on the money. Visual processing takes up about half of the brain’s resources, in fact. This helps explain why snooty wine tasters with tons of professional experience throw out their taste buds so quickly in the thrall of visual stimuli. And that lies at the very heart of this chapter’s Brain Rule.
phantom of the ocular
In the land of sensory kingdoms, there are many ways to show that vision isn’t the benevolent prime minister but the dictatorial emperor. Take phantom-limb experiences. Sometimes, people who have suffered an amputation continue to experience the presence of their limb, even though no limb exists. Sometimes the limb is perceived as frozen into a fixed position. Sometimes it feels pain. Scientists have used phantoms to demonstrate the powerful influence vision has on our senses.
An amputee with a “frozen” phantom arm was seated at a table upon which had been placed a topless, divided box. There were two portals in the front, one for the arm and one for the stump. The divider was a mirror, and the amputee could view a reflection of either his functioning hand or his stump. When he looked at his functioning hand, he could see his right arm present and his left arm missing. But when he looked at the reflection of his right arm in the mirror—what looked like another arm—the phantom limb on the other side of the box suddenly “woke up.” If he moved his normal hand while gazing at its reflection, he could feel his phantom move, too. And when he stopped moving his right arm, his missing left arm “stopped” also. The addition of visual information convinced his brain of a miraculous rebirth of the absent limb. This is vision not only as dictator but as faith healer. The visual-capture effect is so powerful, it can be used to alleviate pain in the phantom.
How do we measure vision’s dominance?
One way is to show its effects on learning and memory. Researchers historically have used two types of memory in their investigations. The first, recognition memory, is a glorified way to explain familiarity. We often deploy recognition memory when looking at old family photographs, such as gazing at a picture of an old aunt not remembered for years. You don’t necessarily recall her name, or the occasion of the photo, but you still recognize her as your aunt. You may not be able to recall certain details, but as soon as you see the picture, you know you have seen it before.
The second type involves the familiar working memory. Explained in greater detail in the Memory chapters, working memory is that collection of temporary storage buffers with fixed capacities and frustratingly short life spans. Visual short-term memory is the slice of that buffer dedicated to storing visual information. Most of us can hold about four objects at a time in that buffer, so it’s a pretty small space. And it appears to be getting smaller. Recent data show that as the complexity of the objects increases, the number of objects we can hold drops. The evidence also suggests that the number of objects and the complexity of objects are handled by different systems in the brain, turning the whole notion of short-term capacity, if you will forgive me, on its head. These limitations make it all the more remarkable—or depressing—that vision is probably the best single tool we have for learning anything.
worth a thousand words
When it comes to memory, researchers have known for more than 100 years that pictures and text follow very different rules. Put simply, the more visual the input becomes, the more likely it is to be recognized—and recalled. The phenomenon is so pervasive, it has been given its own name: the pictorial superiority effect, or PSE.
Human PSE is truly Olympian. Tests performed years ago showed that people could remember more than 2,500 pictures with at least 90 percent accuracy several days post-exposure, even though subjects saw each picture for about 10 seconds. Accuracy rates a year later still hovered around 63 percent. In one paper—adorably titled “Remember Dick and Jane?”—picture recognition information was reliably retrieved several decades later.
Sprinkled throughout these experiments were comparisons with other forms of communication. The favorite target was usually text or oral presentations, and the usual result was “picture demolishes them both.” It still does. Text and oral presentations are not just less efficient than pictures for retaining certain types of information; they are way less efficient. If information is presented orally, people remember about 10 percent, tested 72 hours after exposure. That figure goes up to 65 percent if you add a picture.
The inefficiency of text has received particular attention. One of the reasons that text is less capable than pictures is that the brain sees words as lots of tiny pictures. Data clearly show that a word is unreadable unless the brain can separately identify simple features in the letters. Instead of words, we see complex little art-museum masterpieces, with hundreds of features embedded in hundreds of letters. Like an art junkie, we linger at each feature, rigorously and independently verifying it before moving to the next. The finding has broad implications for reading efficiency. Reading creates a bottleneck. My text chokes you, not because my text is not enough like pictures but because my text is too much like pictures. To our cortex, unnervingly, there is no such thing as words.
That’s not necessarily obvious. After all, the brain is as adaptive as Silly Putty. With years of reading books, writing email, and sending text messages, you might think the visual system could be trained to recognize common words without slogging through tedious additional steps of letter-feature recognition. But that is not what happens. No matter how experienced a reader you become, you will still stop and ponder individual textual features as you plow through these pages, and you will do so until you can’t read anymore.
Perhaps, with hindsight, we could have predicted such inefficiency. Our evolutionary history was never dominated by text-filled billboards or Microsoft Word. It was dominated by leaf-filled trees and saber-toothed tigers. The reason vision means so much to us may be as simple as the fact that most of the major threats to our lives in the savannah were apprehended visually. Ditto with most of our food supplies. Ditto with our perceptions of reproductive opportunity.
The tendency is so pervasive that, even when we read, most of us try to visualize what the text is telling us. “Words are only postage stamps delivering the object for you to unwrap,” George Bernard Shaw was fond of saying. These days, there is a lot of brain science technology to back him up.
a punch in the nose
Here’s a dirty trick you can pull on a baby. It may illustrate something about your personality. It certainly illustrates something about visual processing.
Tie a ribbon around the baby’s leg. Tie the other end to a bell. At first she seems to be randomly moving her limbs. Soon, however, the infant learns that if she moves one leg, the bell rings. Soon she is happily—and preferentially—moving that leg. The bell rings and rings and rings. Now cut the ribbon. The bell no longer rings. Does that stop the baby? No. She still kicks her leg. Something is wrong, so she kicks harder. Still no sound. She tries a series of rapid kicks. Still no success. She gazes up at the bell, even stares at the bell. This visual behavior tells us she is paying attention to the problem. Scientists can measure the brain’s attentional state even with the diaper-and-breast-milk crowd because of this reliance on visual processing.
This story illustrates something fundamental about how brains perceive their world. As babies begin to understand cause-and-effect relationships, we can determine how they pay attention by watching them stare at their world. The importance of this gazing behavior cannot be overstated. Babies use visual cues to show they are paying attention to something—even though nobody taught them to do that. The conclusion is that babies come with a variety of preloaded software devoted to visual processing.
That turns out to be true. Babies display a preference for patterns with high contrast. They seem to understand the principle of common fate: Objects that move together are perceived as part of the same object, such as stripes on a zebra. They can discriminate human faces from non-human equivalents and seem to prefer them. They possess an understanding of size related to distance—that if an object is getting closer (and therefore getting bigger), it is still the same object. Babies can even categorize visual objects by common physical characteristics. The dominance that vision displays behaviorally begins in the tiny world of infants.
And it shows up in the even tinier world of DNA. Our sense of smell and our color vision are fighting violently for evolutionary control, for the right to be consulted first whenever something on the outside happens. And vision is winning. In fact, about 60 percent of our smell-related genes have been permanently damaged in this neural arbitrage, and they are marching toward obsolescence at a rate fourfold faster than in any other species sampled. The reason for this decommissioning is simple: The visual cortex and the olfactory cortex take up a lot of neural real estate. In the crowded zero-sum world of the sub-scalp, something has to give.
Whether looking at behavior, cells, or genes, we can observe how important the visual sense is to the human experience. Striding across our brain like an out-of-control superpower, it consumes giant swaths of biological resources. In return, our visual system creates movies, generates hallucinations, and consults previous experience before allowing us to see the outside world. It happily bends the information from other senses to do its bidding and, at least in the case of smell, seems to be caught in the act of taking it over.
Is there any point in trying to ignore this juggernaut, especially if you are a parent, educator, or business professional? You don’t have to go any further than the wine experts of Bordeaux for proof.
ideas
I owe my career choice to Donald Duck. I am not joking. I even remember the moment he convinced me. I was 8 years old at the time, and my mother trundled the family off to a showing of an amazing 27-minute animated short called Donald in Mathmagic Land. Using visual imagery, a wicked sense of humor, and the wide-eyed wonder of an infant, Donald Duck introduced me to math. Got me excited about it. From geometry to football to playing billiards, the power and beauty of mathematics were made so real for this nerd-in-training, I asked if I could see it a second time. My mother obliged, and the effect was so memorable, it eventually influenced my career choice. I now have a copy of those valuable 27 minutes in my own home and regularly inflict it upon my poor children. Donald in Mathmagic Land was nominated for an Academy Award in 1959. It also should have gotten a “Teacher of the Year” award. The film illustrates—literally—the power of the moving image in communicating complex information to students. It’s one inspiration for these suggestions.
Teachers should learn why pictures grab attention
Educators should know how pictures transfer information. There are things we know about how pictures grab attention that are rock solid. We pay lots of attention to color. We pay lots of attention to orientation. We pay lots of attention to size. And we pay special attention if the object is in motion. Indeed, most of the things that threatened us in the Serengeti moved, and the brain has evolved unbelievably sophisticated trip-wires to detect motion. We even have specialized regions to distinguish when our eyes are moving from when our world is moving. These regions routinely suppress the perception of our own eye movements in favor of perceiving movement in the environment.
Teachers should use computer animations
Animation captures the importance not only of color and placement but also of motion. With the advent of web-based graphics, the days when this knowledge was optional for educators are probably over. Fortunately, the basics are not hard to learn. With today’s software, simple animations can be created by anybody who knows how to draw a square and a circle. Simple, two-dimensional pictures are quite adequate; studies show that if the drawings are too complex or lifelike, they can distract from the transfer of information.
Test the power of images
Though the pictorial superiority effect is a well-established fact for certain types of classroom material, it is not well-established for all material. Data are sparse. Some media are better at communicating some types of information than others. Do pictures communicate conceptual ideas such as “freedom” and “amount” better than, say, a narrative? Are language arts better represented in picture form, or are other media styles more robust? Working out these issues in real-world classrooms would provide the answers, and that takes collaboration between teachers and researchers.
Communicate with pictures more than words
“Less text, more pictures” were almost fighting words in 1982. They were used derisively to greet the arrival of USA Today, a brand-new type of newspaper with, as you know, less text and more pictures. Some predicted the style would never work. Others predicted that if it did, the style would spell the end of Western civilization as the newspaper-reading public knows it. The jury may be out on the latter prediction, but the former has received a powerful and embarrassing verdict. Within four years, USA Today had the second highest readership of any newspaper in the country, and within 10, it was number one. It still is.
What happened? First, we know that pictures are a more efficient delivery mechanism for information than text. Second, the American work force is consistently overworked, with more things being done by fewer people. Third, many Americans still read newspapers. In the helter-skelter world of overworked Americans, the more efficient medium of information transfer may be the preferred one. As the success of USA Today suggests, the attraction may be strong enough to persuade consumers to reach for their wallets. So, pictorial information may be initially more attractive to consumers, in part because it takes less effort to comprehend. Because it is also a more efficient way to glue information to a neuron, there may be strong reasons for entire marketing departments to think seriously about making pictorial presentations their primary way of transferring information.
The initial effect of pictures on attention has been tested. Using infrared eye-tracking technology, 3,600 consumers were tested on 1,363 print advertisements. The conclusion? Pictorial information was superior in capturing attention—independent of its size. Even if the picture was small and crowded with lots of other non-pictorial elements close to it, the eye went to the visual. The researchers in the study, unfortunately, did not check for retention.
Toss your PowerPoint presentations
The presentation software called PowerPoint has become ubiquitous, from corporate boardrooms to college classrooms to scientific conferences. What’s wrong with that? It’s text-based, with six hierarchical levels of chapters and subheads—all words. Professionals everywhere need to know about the incredible inefficiency of text-based information and the incredible effects of images. Then they need to do two things:
1. Burn their current PowerPoint presentations.
2. Make new ones.
Actually, the old ones should be stored, at least temporarily, as useful comparisons. Business professionals should test their new designs against the old and determine which ones work better. A typical PowerPoint business presentation has nearly 40 words per slide. That means we have a lot of work ahead of us.
Summary
Rule #10
Vision trumps all other senses.
• Vision is by far our most dominant sense, taking up half of our brain’s resources.
• What we see is only what our brain tells us we see, and it’s not 100 percent accurate.
• The visual analysis we do has many steps. The retina assembles photons into little movie-like streams of information. The visual cortex processes these streams, some areas registering motion, others registering color, etc. Finally, we combine that information back together so we can see.
• We learn and remember best through pictures, not through written or spoken words.