On his album Songs for Swinging Lovers, Frank Sinatra is awesomely in control of his emotional expression, rhythm, and pitch. Now, I am not a Sinatra fanatic. I only have a half dozen or so of the more than two hundred albums he’s released, and I don’t like his movies. Frankly, I find most of his repertoire to be just plain sappy; in everything post-1980, he sounds too cocky. Years ago Billboard hired me to review the last album he made, duets with popular singers such as Bono and Gloria Estefan. I panned it, writing that Frank “sings with all the satisfaction of a man who just had somebody killed.”
But on Swinging Lovers, every note he sings is perfectly placed in time and pitch. I don’t mean “perfectly” in the strict, as-notated sense; his rhythms and timing are completely wrong in terms of how the music is written on paper, but they are perfect for expressing emotions that go beyond description. His phrasing contains impossibly detailed and subtle nuances—to be able to pay attention to that much detail, to be able to control it, is something I can’t imagine. Try to sing along with any song on Swinging Lovers. I’ve never found anyone who could match his phrasing precisely—it is too nuanced, too quirky, too idiosyncratic.
How do people become expert musicians? And why is it that, of the millions of people who take music lessons as children, relatively few continue to play music as adults? When they find out what I do for a living, many people tell me that they love listening to music, but their music lessons “didn’t take.” I think they’re being too hard on themselves. The chasm between musical experts and everyday musicians that has grown so wide in our culture makes people feel discouraged, and for some reason this is uniquely so with music. Even though most of us can’t play basketball like Shaquille O’Neal, or cook like Julia Child, we can still enjoy playing a friendly backyard game of hoops, or cooking a holiday meal for our friends and family. This performance chasm does seem to be cultural, specific to contemporary Western society. And although many people say that music lessons didn’t take, cognitive neuroscientists have found otherwise in their laboratories. Even just a small exposure to music lessons as a child creates neural circuits for music processing that are enhanced and more efficient than those of people who lack training. Music lessons teach us to listen better, and they accelerate our ability to discern structure and form in music, making it easier for us to tell what music we like and what we don’t like.
But what about that class of people that we all acknowledge are true musical experts—the Alfred Brendels, Sarah Changs, Wynton Marsalises, and Tori Amoses? How did they get what most of us don’t have, an extraordinary facility to play and perform? Do they have a set of abilities—or neural structures—that are of a totally different sort than the rest of us have (a difference of kind) or do they just have more of the same basic stuff all of us are endowed with (a difference of degree)? And do composers and songwriters have a fundamentally different set of skills than players?
The scientific study of expertise has been a major topic within cognitive science for the past thirty years, and musical expertise has tended to be studied within the context of general expertise. In almost all cases, musical expertise has been defined as technical achievement—mastery of an instrument or of compositional skills. The late Michael Howe, and his collaborators Jane Davidson and John Sloboda, launched an international debate when they asked whether the lay notion of “talent” is scientifically defensible. They assumed the following dichotomy: Either high levels of musical achievement are based on innate brain structures (what we refer to as talent) or they are simply the result of training and practice. They define talent as something (1) that originates in genetic structures; (2) that is identifiable at an early stage by trained people who can recognize it even before exceptional levels of performance have been acquired; (3) that can be used to predict who is likely to excel; and (4) that only a minority can be identified as having because if everyone were “talented,” the concept would lose meaning. The emphasis on early identification entails that we study the development of skills in children. They add that in a domain such as music, “talent” might be manifested differently in different children.
It is evident that some children acquire skills more rapidly than others: The age of onset for walking, talking, and toilet training varies widely from one child to another, even within the same household. There may be genetic factors at work, but it is difficult to separate out ancillary factors—with a presumably environmental component—such as motivation, personality, and family dynamics. Similar factors can influence musical development and can mask the contributions of genetics to musical ability. Brain studies, so far, haven’t been of much use in sorting out the issues because it has been difficult to separate cause from effect. Gottfried Schlaug at Harvard collected brain scans of individuals with absolute pitch (AP) and showed that a region in the auditory cortex—the planum temporale—is larger in the AP people than the non-AP people. This suggests that the planum is involved in AP, but it’s not clear if it starts out larger in people who eventually acquire AP, or rather, if the acquisition of AP causes the planum to increase in size. The story is clearer in the areas of the brain that are involved in skilled motor movements. Studies of violin players by Thomas Elbert have shown that the region of the brain responsible for moving the left hand—the hand that requires the most precision in violin playing—increases in size as a result of practice. We do not know yet if the propensity for increase preexists in some people and not others.
The strongest evidence for the talent position is that some people simply acquire musical skills more rapidly than others. The evidence against the talent account—or rather, in favor of the view that practice makes perfect—comes from research on how much training experts and other high achievers actually do. Like experts in mathematics, chess, or sports, experts in music require lengthy periods of instruction and practice in order to acquire the skills necessary to truly excel. In several studies, the very best conservatory students were found to have practiced the most, sometimes twice as much as those who weren’t judged to be as good.
In another study, students were secretly divided into two groups (not revealed to the students so as not to bias them) based on teachers’ evaluations of their ability, or the perception of talent. Several years later, the students who achieved the highest performance ratings were those who had practiced the most, irrespective of which “talent” group they had been assigned to previously. This suggests that practice is the cause of achievement, not merely something correlated with it. It further suggests that talent is a label that we’re using in a circular fashion: When we say that someone is talented, we think we mean that they have some innate predisposition to excel, but in the end, we only apply the term retrospectively, after they have made significant achievements.
Anders Ericsson, at Florida State University, and his colleagues approach the topic of musical expertise as a general problem in cognitive psychology involving how humans become experts in general. In other words, he takes as a starting assumption that there are certain issues involved in becoming an expert at anything; that we can learn about musical expertise by studying expert writers, chess players, athletes, artists, mathematicians, in addition to musicians.
First, what do we mean by “expert”? Generally we mean someone who has reached a high degree of accomplishment relative to other people. As such, expertise is a social judgment; we are making a statement about a few members of a society relative to a larger population. Also, the accomplishment is normally considered to be in a field that we care about. As Sloboda points out, I may become an expert at folding my arms or pronouncing my own name, but this isn’t generally considered the same as becoming, say, an expert at chess, at repairing Porsches, or at stealing the British crown jewels without being caught.
The emerging picture from such studies is that ten thousand hours of practice is required to achieve the level of mastery associated with being a world-class expert—in anything. In study after study, of composers, basketball players, fiction writers, ice skaters, concert pianists, chess players, master criminals, and what have you, this number comes up again and again. Ten thousand hours is equivalent to roughly three hours a day, or twenty hours a week, of practice over ten years. Of course, this doesn’t address why some people don’t seem to get anywhere when they practice, and why some people get more out of their practice sessions than others. But no one has yet found a case in which true world-class expertise was accomplished in less time. It seems that it takes the brain this long to assimilate all that it needs to know to achieve true mastery.
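As a rough check on that arithmetic, here is a minimal sketch in Python. The function name and the convention of 52 practice weeks per year are assumptions made for illustration; they do not come from the studies themselves.

```python
def total_practice_hours(hours_per_week, years, weeks_per_year=52):
    """Cumulative practice hours, assuming a steady weekly schedule.

    weeks_per_year=52 is an assumed convention (no vacations, no gaps).
    """
    return hours_per_week * weeks_per_year * years


# Twenty hours a week, sustained for ten years.
weekly_schedule = total_practice_hours(20, 10)

# Three hours a day, seven days a week, sustained for ten years.
daily_schedule = total_practice_hours(3 * 7, 10)

print(weekly_schedule)  # 10400
print(daily_schedule)   # 10920
```

Both schedules land a little above ten thousand hours, which is why the text describes the equivalence as rough.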
The ten-thousand-hours theory is consistent with what we know about how the brain learns. Learning requires the assimilation and consolidation of information in neural tissue. The more experiences we have with something, the stronger the memory/learning trace for that experience becomes. Although people differ in how long it takes them to consolidate information neurally, it remains true that increased practice leads to a greater number of neural traces, which can combine to create a stronger memory representation. This is true whether you subscribe to multiple-trace theory or any number of variants of theories in the neuroanatomy of memory: The strength of a memory is related to how many times the original stimulus has been experienced.
Memory strength is also a function of how much we care about the experience. Neurochemical tags associated with memories mark them for importance, and we tend to code as important things that carry with them a lot of emotion, either positive or negative. I tell my students if they want to do well on a test, they have to really care about the material as they study it. Caring may, in part, account for some of the early differences we see in how quickly people acquire new skills. If I really like a particular piece of music, I’m going to want to practice it more, and because I care about it, I’m going to attach neurochemical tags to each aspect of the memory that label it as important: The sounds of the piece, the way I move my fingers, if I’m playing a wind instrument the way that I breathe—all these become part of a memory trace that I’ve encoded as important.
Similarly, if I’m playing an instrument I like, and whose sound pleases me in and of itself, I’m more likely to pay attention to subtle differences in tone, and the ways in which I can moderate and affect the tonal output of my instrument. It is impossible to overestimate the importance of these factors; caring leads to attention, and together they lead to measurable neurochemical changes. Dopamine, the neurotransmitter associated with emotional regulation, alertness, and mood, is released, and the dopaminergic system aids in the encoding of the memory trace.
Owing to various factors, some people who take music lessons are less motivated to practice; their practice is less effective because of motivational and attentional factors. The ten-thousand-hours argument is convincing because it shows up in study after study across many domains. Scientists like order and simplicity, so if we see a number or a formula that pops up in different contexts, we tend to favor it as an explanation. But like many scientific theories, the ten-thousand-hours theory has holes in it, and it needs to account for counterarguments and rebuttals.
The classic rebuttal to the ten-thousand-hours argument goes something like this: “Well, what about Mozart? I hear that he was composing symphonies at the age of four! And even if he was practicing forty hours a week since the day he was born, that doesn’t make ten thousand hours.” First, there are factual errors in this account: Mozart didn’t begin composing until he was six, and he didn’t write his first symphony until he was eight. Still, writing a symphony at age eight is unusual, to say the least. Mozart demonstrated precociousness early in his life. But that is not the same as being an expert. Many children write music, and some even write large-scale works when they’re as young as eight. And Mozart had extensive training from his father, who was widely considered to be the greatest living music teacher in all of Europe at the time. We don’t know how much Mozart practiced, but if he started at age two and worked thirty-two hours a week at it (quite possible, given his father’s reputation as a stern taskmaster) he would have made his ten thousand hours by the age of eight. Even if Mozart hadn’t practiced that much, the ten-thousand-hours argument doesn’t say that it takes ten thousand hours to write a symphony. Clearly Mozart became an expert eventually, but did the writing of that first symphony qualify him as an expert, or did he attain his level of musical expertise sometime later?
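The two Mozart scenarios in this passage can be checked with the same kind of back-of-the-envelope calculation. This is only a sketch: the helper name and the 52-weeks-per-year convention are assumptions, and the practice schedules are the hypothetical ones quoted above, not documented facts about Mozart.

```python
WEEKS_PER_YEAR = 52  # assumed convention: practice every week of the year


def hours_between_ages(start_age, end_age, hours_per_week):
    """Practice hours accumulated between two ages at a steady weekly rate."""
    return (end_age - start_age) * WEEKS_PER_YEAR * hours_per_week


# The skeptic's scenario: forty hours a week from birth to age four.
from_birth = hours_between_ages(0, 4, 40)

# The scenario in the text: thirty-two hours a week from age two to eight.
from_age_two = hours_between_ages(2, 8, 32)

print(from_birth)    # 8320
print(from_age_two)  # 9984
```

The first figure confirms that the skeptic's schedule falls short of ten thousand hours by age four; the second comes within a few hours of the mark by age eight.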
John Hayes of Carnegie Mellon asked just this question. Does Mozart’s Symphony no. 1 qualify as the work of a musical expert? Put another way, if Mozart hadn’t written anything else, would this symphony strike us as the work of a musical genius? Maybe it really isn’t very good, and the only reason we know about it is because the child who wrote it grew up to become Mozart—we have a historical interest in it, but not an aesthetic one. Hayes studied the performance programs of the leading orchestras and the catalog of commercial recordings, assuming that better musical works are more likely to be performed and recorded than lesser works. He found that the early works of Mozart were not performed or recorded very often. Musicologists largely regard them as curiosities, compositions that by no means predicted the expert works that were to follow. Those of Mozart’s compositions that are considered truly great are those that he wrote well after he had been at it for ten thousand hours.
As we have seen in the debates about memory and categorization, the truth lies somewhere between the two extremes, a composite of the two hypotheses confronting each other in the nature/nurture debate. To understand how this particular synthesis occurs, and what predictions it makes, we need to look more closely at what the geneticists have to say.
Geneticists seek to find a cluster of genes that are associated with particular observable traits. They assume that if there is a genetic contribution to music, it will show up in families, since brothers and sisters share, on average, 50 percent of their genes with one another. But it can be difficult to separate out the influence of genes from the influence of the environment in this approach. The environment includes the environment of the womb: the food that the mother eats, whether she smokes or drinks, and other factors that influence the amount of nutrients and oxygen the fetus receives. Even identical twins can experience very different environments from one another within the womb, based on the amount of space they have, their room for movement, and their position.
Distinguishing genetic from environmental influences on a skill that has a learned component, such as music, is difficult. Music tends to run in families. But a child with parents who are musicians is more likely to receive encouragement for her early musical leanings than a child in a nonmusical household, and siblings of that musically raised child are likely to receive similar levels of support. By analogy, parents who speak French are likely to raise children who speak French, and parents who do not are unlikely to do so. We can say that speaking French “runs in families,” but I don’t know anyone who would claim that speaking French is genetic.
One way that scientists determine the genetic basis of traits or skills is by studying identical twins, especially those who have been reared apart. The Minnesota twins registry, a database kept by the psychologists David Lykken, Thomas Bouchard, and their colleagues, has followed identical and fraternal twins reared apart and reared together. Because fraternal twins share 50 percent of their genetic material, and identical twins share 100 percent, this allows scientists to tease apart the relative influences of nature versus nurture. If something has a genetic component, we would expect it to show up more often in each individual who is an identical twin than in each who is a fraternal twin. Moreover, we would expect it to show up even when the identical twins have been raised in completely separate environments. Behavioral geneticists look for such patterns and form theories about the heritability of certain traits.
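One common way to turn the 50 percent versus 100 percent comparison into a number is Falconer's formula, which estimates heritability as twice the difference between the identical-twin and fraternal-twin correlations for a trait. The text does not say that the Minnesota researchers used this particular estimate, so treat the sketch below as an illustration only; the correlation values are invented for the example.

```python
def falconer_heritability(r_identical, r_fraternal):
    """Falconer's estimate of heritability: h^2 = 2 * (r_MZ - r_DZ).

    r_identical: trait correlation between identical (monozygotic) twins.
    r_fraternal: trait correlation between fraternal (dizygotic) twins.
    """
    return 2.0 * (r_identical - r_fraternal)


# Hypothetical correlations, invented purely for illustration.
h2 = falconer_heritability(r_identical=0.625, r_fraternal=0.5)
print(h2)  # 0.25
```

The estimate rests on strong simplifications, including the assumption that identical and fraternal twins experience equally similar environments, which is exactly the kind of assumption the alternative explanations later in this chapter call into question.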
The newest approach looks at gene linkages. If a trait appears to be heritable, we can try to isolate the genes that are linked to that trait. (I don’t say “responsible for that trait,” because interactions among genes are very complicated, and we cannot say with certainty that a single gene “causes” a trait.) This is complicated by the fact that we can have a gene for something without its being active. Not all of the genes that we have are “turned on,” or expressed, at all times. Using gene chip expression profiling, we can determine which genes are and which genes aren’t expressed at a given time. What does this mean? Our roughly twenty-five thousand genes control the synthesis of proteins that our bodies and brains use to perform all of our biological functions. They control hair growth, hair color, the creation of digestive fluids and saliva, whether we end up being six feet tall or five feet tall. During our growth spurt around the time of puberty, something needs to tell our body to start growing, and a half dozen years later, something has to tell it to stop. These are the genes, carrying instructions about what to do and how to do it.
Using gene chip expression profiling, I can analyze a sample of your RNA and—if I know what I’m looking for—I can tell whether your growth gene is active—that is, expressed—right now. At this point, the analysis of gene expression in the brain isn’t practical because current (and foreseeable) techniques require that we analyze a piece of brain tissue. Most people find that unpleasant.
Scientists studying identical twins who’ve been reared apart have found remarkable similarities. In some cases, the twins were separated at birth, and not even told of each other’s existence. They might have been raised in environments that differed a great deal in geography (Maine versus Texas, Nebraska versus New York), in financial means, and in religious or other cultural values. When tracked down twenty or more years later, a number of astonishing similarities emerged. One woman liked to go to the beach and when she did, she would back into the water; her twin (whom she had never met) did exactly the same thing. One man sold life insurance for a living, sang in his church choir, and wore Lone Star beer belt buckles; so did his completely-separated-from-birth identical twin. Studies like these suggested that musicality, religiosity, and criminality had a strong genetic component. How else could you explain such coincidences?
One alternative explanation is statistical, and can be stated like this: “If you look hard enough, and make enough comparisons, you’re going to find some really weird coincidences that don’t really mean anything.” Take any two random people off the street who have no relationship to one another, except perhaps through their common ancestors Adam and Eve. If you look at enough traits, you’re bound to find some in common that aren’t obvious. I’m not talking about things like “Oh, my gosh! You breathe the atmosphere too!!” but things like “I wash my hair on Tuesdays and Fridays, and I use an herbal shampoo on Tuesdays—scrubbing with only my left hand, and I don’t use a conditioner. On Fridays I use an Australian shampoo that has a conditioner built in. Afterward, I read The New Yorker while listening to Puccini.” Stories like these suggest that there is an underlying connection between these people, in spite of the scientists’ assurances that their genes and environment are maximally dissimilar. But all of us differ from one another in thousands upon thousands of different ways, and we all have our quirks. Once in a while we find co-occurrences, and we’re surprised. But from a statistical standpoint, it isn’t any more surprising than if I think of a number between one and one hundred and you guess it. You may not guess it the first time, but if we play the game long enough, you’re going to guess it once in a while (1 percent of the time, to be exact).
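The statistical point can be made concrete with a short calculation. This is a sketch under simple assumptions: each comparison is treated as independent and equally likely to match, and the function name is made up for illustration.

```python
def prob_at_least_one_match(p_single, n_comparisons):
    """Chance of at least one match in n independent comparisons,
    each of which matches with probability p_single."""
    return 1.0 - (1.0 - p_single) ** n_comparisons


# The number-guessing game: a 1-in-100 chance on each round.
for rounds in (1, 10, 100, 500):
    chance = prob_at_least_one_match(0.01, rounds)
    print(f"{rounds:>3} rounds: {chance:.1%} chance of at least one hit")
```

A single round succeeds only 1 percent of the time, but after a hundred rounds a hit is more likely than not. The same logic applies to comparing two strangers across hundreds of habits and preferences: any one match is unlikely, yet some match somewhere is close to inevitable.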
A second alternative explanation is social psychological—the way someone looks influences the way that others treat him (with “looks” assumed to be genetic); in general, an organism is acted on by the world in particular ways as a function of its appearance. This intuitive notion has a rich tradition in literature, from Cyrano de Bergerac to Shrek: Shunned by people who were repulsed by their outward appearance, they rarely had the opportunity to show their inner selves and true nature. As a culture we romanticize stories like these, and feel a sense of tragedy about a good person suffering for something he had nothing to do with: his looks. It works in the opposite way as well: good-looking people tend to make more money, get better jobs, and report that they are happier. Even apart from whether someone is considered attractive or not, his appearance affects how we relate to him. Someone who was born with facial features that we associate with trustworthiness—large eyes, for example, with raised eyebrows—is someone people will tend to trust. Someone tall may be given more respect than someone short. The series of encounters we have over our lifetimes are shaped to some extent by the way others see us.
It is no wonder, then, that identical twins may end up developing similar personalities, traits, habits, or quirks. Someone with downturned eyebrows might always look angry, and the world will treat them that way. Someone who looks defenseless will be taken advantage of; someone who looks like a bully may spend a lifetime being asked to fight, and eventually will develop an aggressive personality. We see this principle at work in certain actors. Hugh Grant, Judge Reinhold, Tom Hanks, and Adrien Brody have innocent-looking faces; without doing anything, Grant has an “awww, shucks” look, a face that suggests he has no guile or deceit. This line of reasoning says that some people are born with particular features, and their personalities develop in large part as a reflection of how they look. Genes here are influencing personality, but only in an indirect, secondary way.
It is not difficult to imagine a similar argument applying to musicians, and in particular to vocalists. Doc Watson’s voice sounds completely sincere and innocent; I don’t know if he is that way in person, and at one level it doesn’t matter. It’s possible that he became the successful artist he is because of how people react to the voice that he was born with. I’m not talking about being born with (or acquiring) a “great” voice, like Ella Fitzgerald’s or Placido Domingo’s, I’m talking about expressiveness apart from whether the voice itself is a great instrument. Sometimes as Aimee Mann sings, I hear the traces of a little girl’s voice, a vulnerable innocence that moves me because I feel that she is reaching down deep inside and confessing feelings that normally are expressed only to a close friend. Whether she intends to convey this, or really feels this, I don’t know—she may have been born with a vocal quality that makes listeners invest her with those feelings, whether she is experiencing them or not. In the end, the essence of music performance is being able to convey emotion. Whether the artist is feeling it or was born with an ability to sound as if she’s feeling it may not be important.
I don’t mean to imply that the actors and musicians I’ve mentioned don’t have to work at what they do. I don’t know any successful musicians who haven’t worked hard to get where they are; I don’t know any who had success fall into their laps. I’ve known a lot of artists whom the press has called “overnight sensations,” but who spent five or ten years becoming that! Genetics are a starting point that may influence personality or career, or the specific choices one makes in a career. Tom Hanks is a great actor, but he’s not likely to get the same kinds of roles as Arnold Schwarzenegger, largely owing to the differences in their genetic endowments. Schwarzenegger wasn’t born with a body-builder’s body; he worked very hard at it, but he had a genetic predisposition toward it. Similarly, being six ten creates a predisposition toward becoming a basketball player rather than a jockey. But it is not enough for someone who is six ten to simply stand on the court—he needs to learn the game and practice for years to become an expert. Body type, which is largely (though not exclusively) genetic, creates predispositions for basketball as it does for acting, dancing, and music.
Musicians, like athletes, actors, dancers, sculptors and painters, use their bodies as well as their minds. The role of the body in the playing of a musical instrument or in singing (less so, of course, in composing and arranging) means that genetic predispositions can contribute strongly to the choice of instruments a musician can play well—and to whether a person chooses to become a musician.
When I was six years old, I saw the Beatles on The Ed Sullivan Show, and in what has become a cliché for people of my generation, I decided then that I wanted to play the guitar. My parents, who were of the old school, did not view the guitar as a “serious instrument” and told me to play the family piano instead. But I wanted desperately to play. I would cut out pictures of classical guitarists like Andrés Segovia from magazines and casually leave them around the house. At six, I was still speaking with a prominent lisp that I had had all my life; I didn’t get rid of it until age ten when I was embarrassingly plucked out of my fourth-grade class by the public-school speech therapist who spent a grueling two years (at three hours a week) teaching me to change the way that I said the letter s. I pointed out that the Beatles must be therious to share the stage of The Ed Sullivan Show with such therious artithts as Beverly Thills, Rodgers and Hammerthtein, and John Gielgud. I was relentless.
By 1965, when I was eight, the guitar was everywhere. With San Francisco just fifteen miles away, I could feel a cultural and musical revolution going on, and the guitar was at the center of it all. My parents were still not enthusiastic about me studying the guitar, perhaps because of its association with hippies and drugs, or perhaps as a result of my failure the previous year to practice the piano diligently. I pointed out that by now, the Beatles had been on The Ed Sullivan Show four times and my parents finally quasi-relented, agreeing to ask a friend of theirs for advice. “Jack King plays the guitar,” my mother said at dinner one night to my father. “We could ask him if he thinks Danny is old enough to begin guitar lessons.” Jack, an old college friend of my parents, dropped by the house one day on his way home from work. His guitar sounded different from the ones that had mesmerized me on television and radio; it was a classical guitar, not made for the dark chords of rock and roll. Jack was a big man with large hands, and a short black crew cut. He held the guitar in his arms as one might cradle a baby. I could see the intricate patterns of wood grain bending around the curves of the instrument. He played something for us. He didn’t let me touch the guitar, instead he asked me to hold my hand out, and he pressed his palm against mine. He didn’t talk to me or look at me, but what he said to my mother I can still hear clearly: “His hands are too small for the guitar.”
I now know about three-quarter size and half-size guitars (I even own one), and about Django Reinhardt, one of the greatest guitarists of all time, who only had the full use of two of the fingers on his left hand. But to an eight-year-old, the words of adults can seem unbreachable. By 1966, when I had grown some, and the Beatles were egging me on with electric guitar strains of “Help,” I was playing the clarinet and happy to at least be making music. I finally bought my first guitar when I was sixteen and with practice, I learned to play reasonably well; the rock and jazz that I play don’t require the long reach that classical guitar does. The very first song I learned to play—in what has become another cliché for my generation—was Led Zeppelin’s “Stairway to Heaven” (hey, it was the seventies). Some musical parts that guitarists with different hands can play will always be difficult for me, but that is always the case with every instrument. On Hollywood Boulevard in Hollywood, California, some of the great rock musicians have placed their handprints in the cement. I was surprised last summer when I put my hands in the imprint left by Jimmy Page (of Led Zeppelin), one of my favorite guitarists, that his hands were no bigger than mine.
Some years ago I shook hands with Oscar Peterson, the great jazz pianist. His hands were very large; the largest hands I have ever shaken, at least twice the size of my own. He began his career playing stride piano, a style dating back to the 1920s in which the pianist plays an octave bass with his left hand and the melody with his right. To be a good stride player, you need to be able to reach keys that are far apart with a minimum of hand movements, and Oscar can stretch a whopping octave and a half with one hand! Oscar’s style is related to the kinds of chords he is able to play, chords that someone with smaller hands could not. If Oscar Peterson had been forced to play violin as a child it would have been impossible with those large hands; his wide fingers would make it difficult to play a semitone on the relatively small neck of the violin.
Some people have a biological predisposition toward particular instruments, or toward singing. There may also be a cluster of genes that work together to create the component skills that one must have to become a successful musician: good eye-hand coordination, muscle control, motor control, tenacity, patience, memory for certain kinds of structures and patterns, a sense of rhythm and timing. To be a good musician, one must have these things. Some of these skills are involved in becoming a great anything, especially determination, self-confidence, and patience.
We also know that, on average, successful people have had many more failures than unsuccessful people. This seems counterintuitive. How could successful people have failed more often than everyone else? Failure is unavoidable and sometimes happens randomly. It’s what you do after the failure that is important. Successful people have a stick-to-it-iveness. They don’t quit. From the president of FedEx to the novelist Jerzy Kosinski, from Van Gogh to Bill Clinton to Fleetwood Mac, successful people have had many, many failures, but they learn from them and keep going. This quality might be partly innate, but environmental factors must also play a role.
The best guess that scientists currently have about the role of genes and the environment in complex cognitive behaviors is that each is responsible for about 50 percent of the story. Genes may transmit a propensity to be patient, to have good eye-hand coordination, or to be passionate, but certain life events—life events in the broadest sense, meaning not just your conscious experiences and memories, but the food you ate and the food your mother ate while you were in her womb—can influence whether a genetic propensity will be realized or not. Early life traumas, such as the loss of a parent, or physical or emotional abuse, are only the obvious examples of environmental influences causing a genetic predisposition to become either heightened or suppressed. Because of this interaction, we can only make predictions about human behavior at the level of a population, not an individual. In other words, if you know that someone has a genetic predisposition toward criminal behavior, you can’t make any predictions about whether he will end up in jail in the next five years. On the other hand, knowing that a hundred people have this predisposition, we can predict that some percentage of them will probably wind up in jail; we simply don’t know which ones. And some will never get into any trouble at all.
The same applies to musical genes we may find someday. All we can say is that a group of people with those genes is more likely to produce expert musicians, but we cannot know which individuals will become the experts. This, however, assumes that we’ll be able to identify the genetic correlates of musical expertise, and that we can agree on what constitutes musical expertise. Musical expertise has to be about more than strict technique. Music listening and enjoyment, musical memory, and how engaged with music a person is are also aspects of a musical mind and a musical personality. We should take as inclusive an approach as possible in identifying musicality, so as not to exclude those who, while musical in the broad sense, are perhaps not so in a narrow, technical sense. Many of our greatest musical minds weren’t considered experts in a technical sense. Irving Berlin, one of the most successful composers of the twentieth century, was a lousy instrumentalist and could barely play the piano.
Even among the elite, top-tier classical musicians, there is more to being a musician than having excellent technique. Arthur Rubinstein and Vladimir Horowitz are widely regarded as two of the greatest pianists of the twentieth century, but they made mistakes—little technical mistakes—surprisingly often. A wrong note, a rushed note, a note that isn’t fingered properly. But as one critic wrote, “Rubinstein makes mistakes on some of his records, but I’ll take those interpretations that are filled with passion over the twenty-two-year-old technical wizard who can play the notes but can’t convey the meaning.”
What most of us turn to music for is an emotional experience. We aren’t studying the performance for wrong notes, and so long as they don’t jar us out of our reverie, most of us don’t notice them. So much of the research on musical expertise has looked for accomplishment in the wrong place, in the facility of fingers rather than the expressiveness of emotion. I recently asked the dean of one of the top music schools in North America about this paradox: At what point in the curriculum are emotion and expressivity taught? Her answer was that they aren’t taught. “There is so much to cover in the approved curriculum,” she explained, “repertoire, ensemble, and solo training, sight singing, sight reading, music theory—that there simply isn’t time to teach expressivity.” So how do we get expressive musicians? “Some of them come in already knowing how to move a listener. Usually they’ve figured it out themselves somewhere along the line.” The surprise and disappointment in my face must have been obvious. “Occasionally,” she added, almost in a whisper, “if there’s an exceptional student, there’s time during the last part of their last semester here to coach them on emotion …. Usually this is for people who are already performing as soloists in our orchestra, and we help them to coax out more expressivity from their performance.” So, at one of the best music schools we have, the raison d’être for music is taught to a select few, and then, only in the last few weeks of a four- or five-year curriculum.
Even the most uptight and analytic among us expect to be moved by Shakespeare and Bach. We can marvel at the craft these geniuses have mastered, a facility with language or with notes, but ultimately that facility must be brought into service for a different type of communication. Jazz fans, for example, are especially demanding of their post-big-band-era heroes, starting with the Miles Davis/John Coltrane/Bill Evans era. We say of lesser jazz musicians who appear detached from their true selves and from emotion that their playing is nothing more than “shucking and jiving,” attempts to please the audience through musical obsequiousness rather than through soul.
So—in a scientific sense—why are some musicians superior to others when it comes to the emotional (versus the technical) dimension of music? This is the great mystery, and no one knows for sure. Musicians haven’t yet performed with feeling inside brain scanners, due to technical difficulties. (The scanners we currently use require the subject to stay perfectly still, so as not to blur the brain image; this may change in the coming five years.) Interviews with, and diary entries of, musicians ranging from Beethoven and Tchaikovsky to Rubinstein and Bernstein, B. B. King, and Stevie Wonder suggest that part of communicating emotion involves technical, mechanical factors, and part of it involves something that remains mysterious.
The pianist Alfred Brendel says he doesn’t think about notes when he’s onstage; he thinks about creating an experience. Stevie Wonder told me that when he’s performing, he tries to get himself into the same frame of mind and “frame of heart” that he was in when he wrote the song; he tries to capture the same feelings and sentiment, and that helps him to deliver the performance. What this means in terms of how he sings or plays differently is something no one knows. From a neuroscientific perspective, though, this makes perfect sense. As we’ve seen, remembering music involves setting the neurons that were originally active in the perception of a piece of music back to their original state—reactivating their particular pattern of connectivity, and getting the firing rates as close as possible to their original levels. This means recruiting neurons in the hippocampus, amygdala, and temporal lobes in a neural symphony orchestrated by attention and planning centers in the frontal lobe.
The neuroanatomist Andrew Arthur Abbie speculated in 1934 about a linkage between movement, the brain, and music that is only now being confirmed. He wrote that pathways from the brain stem and cerebellum to the frontal lobes are capable of weaving all sensory experience and accurately coordinated muscular movements into a “homogeneous fabric” and that when this occurs, the result is “man’s highest powers as expressed … in art.” His idea of this neural pathway was that it is dedicated to motor movements that incorporate or reflect a creative purpose. New studies by Marcelo Wanderley of McGill, and by my former doctoral student Bradley Vines (now at Harvard) have shown that nonmusician listeners are exquisitely sensitive to the physical gestures that musicians make. By watching a musical performance with the sound turned off, and attending to things like the musician’s arm, shoulder, and torso movements, ordinary listeners can detect a great deal of the expressive intentions of the musician. Add in the sound, and an emergent quality appears—an understanding of the musician’s expressive intentions that goes beyond what was available in the sound or the visual image alone.
If music serves to convey feelings through the interaction of physical gestures and sound, the musician needs his brain state to match the emotional state he is trying to express. Although the studies haven’t been performed yet, I’m willing to bet that when B.B. is playing the blues and when he is feeling the blues, the neural signatures are very similar. (Of course there will be differences, too, and part of the scientific hurdle will be subtracting out the processes involved in issuing motor commands and listening to music, versus just sitting on a chair, head in hands, and feeling down.) And as listeners, there is every reason to believe that some of our brain states will match those of the musicians we are listening to. In what is a recurring theme of your brain on music, even those of us who lack explicit training in music theory and performance have musical brains, and are expert listeners.
In understanding the neurobehavioral basis of musical expertise and why some people become better performers than others, we need to consider that musical expertise takes many forms, sometimes technical (involving dexterity) and sometimes emotional. The ability to draw us into a performance so that we forget about everything else is also a special kind of ability. Many performers have a personal magnetism, or charisma, that is independent of any other abilities they may or may not have. When Sting is singing, we can’t take our ears off of him. When Miles Davis is playing the trumpet, or Eric Clapton the guitar, an invisible force seems to draw us toward him. This doesn’t have to do so much with the actual notes they’re singing or playing—any number of good musicians can play or sing those notes, perhaps even with better technical facility. Rather, it is what record company executives call “star quality.” When we say of a model that she is photogenic, we’re talking about how this star quality manifests itself in photographs. The same thing is true for musicians, and how their quality comes across on records—I call this phonogenic.
It is also important to distinguish celebrity from expertise. The factors that contribute to celebrity could be different from, maybe wholly unrelated to, those that contribute to expertise. Neil Young told me that he did not consider himself to be especially talented as a musician; rather, he was one of the lucky ones who managed to become commercially successful. Few people get to pass through the turnstiles of a deal with a major record label, and fewer still maintain careers for decades as Neil has done. But Neil, along with Stevie Wonder and Eric Clapton, attributes a lot of his success not to musical ability but to a good break. Paul Simon agrees. “I’ve been lucky to have been able to work with some of the most amazing musicians in the world,” he said, “and most of them are people no one’s ever heard of.”
Francis Crick turned his lack of training into a positive aspect of his life’s work. Unbound by scientific dogma, he was free—completely free, he wrote—to open his mind and discover science. When an artist brings this freedom, this tabula rasa, to music, the results can be astounding. Many of the greatest musicians of our era lacked formal training, including Sinatra, Louis Armstrong, John Coltrane, Eric Clapton, Eddie Van Halen, Stevie Wonder, and Joni Mitchell; in classical music, the same is true of George Gershwin, Mussorgsky, and David Helfgott. And, according to his diaries, Beethoven considered his own musical training poor.
Joni Mitchell had sung in choirs in public school, but had never taken guitar lessons or any other kind of music lessons. Her music has a unique quality that has been variously described as avant-garde, ethereal, and as bridging classical, folk, jazz, and rock. Joni uses a lot of alternate tunings; that is, instead of tuning the guitar in the customary way, she tunes the strings to pitches of her own choosing. This doesn’t mean that she plays notes that other people don’t—there are still only twelve notes in a chromatic scale—but it does mean that she can easily reach with her fingers combinations of notes that other guitarists can’t reach (regardless of the size of their hands).
An even more important difference involves the way the guitar makes sound. Each of the six strings of the guitar is tuned to a particular pitch. When a guitarist wants a different one, of course, she presses one or more strings down against the neck; this makes the string shorter, which causes it to vibrate more rapidly, making a tone with a higher pitch. A string that is pressed on (“fretted”) has a different sound from one that isn’t, due to a slight deadening of the string caused by the finger; the unfretted or “open” strings have a clearer, more ringing quality, and they will keep on sounding for a longer time than the ones that are fretted. When two or more of these open strings are allowed to ring together, a unique timbre emerges. By retuning, Joni changed the configuration of which notes are played when a string is open, so that we hear notes ringing that don’t usually ring on the guitar, and in combinations we don’t usually hear. You can hear it on her songs “Chelsea Morning” and “Refuge of the Roads” for example.
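The relationship between string length and pitch described above can be put into numbers. What follows is only a minimal sketch, and it rests on assumptions that are mine rather than anything measured from a real guitar: equal temperament (each fret raises the pitch by one semitone, a frequency ratio of the twelfth root of two), the textbook frequencies for standard tuning, and a hypothetical scale length of 648 millimeters. The function names are invented for the illustration.

```python
# Open-string frequencies in hertz for standard tuning, low string to high.
# Assumption: equal temperament with the A above middle C at 440 Hz.
STANDARD_TUNING_HZ = {
    "E2": 82.41, "A2": 110.00, "D3": 146.83,
    "G3": 196.00, "B3": 246.94, "E4": 329.63,
}


def fretted_frequency(open_hz, fret):
    """Frequency of a string pressed at the given fret (0 means open)."""
    # Each fret is one semitone, and twelve semitones double the frequency.
    return open_hz * 2 ** (fret / 12)


def vibrating_length(scale_length_mm, fret):
    """Length of string left free to vibrate when pressed at the given fret."""
    # Pitch rises as the vibrating length shrinks, by the same ratio.
    return scale_length_mm / 2 ** (fret / 12)


if __name__ == "__main__":
    open_a = STANDARD_TUNING_HZ["A2"]
    for fret in (0, 5, 12):
        print(fret,
              round(fretted_frequency(open_a, fret), 2),
              round(vibrating_length(648.0, fret), 1))
```

At the twelfth fret the vibrating length is halved and the frequency doubles, which is the octave. Retuning a string amounts to changing its entry in the table of open-string frequencies, so every open note that rings out changes with it.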
But there is something more to it than that—lots of guitarists use their own tunings, such as David Crosby, Ry Cooder, Leo Kottke, and Jimmy Page. One night, when I was having dinner with Joni in Los Angeles, she started talking about bass players that she had worked with. She has worked with some of the very best of our generation: Jaco Pastorius, Max Bennett, Larry Klein, and she wrote an entire album with Charles Mingus. Joni will talk compellingly and passionately about alternate tunings for hours, comparing them to the different colors that van Gogh used in his paintings.
While we were waiting for the main course, she went off on a story about how Jaco Pastorius was always arguing with her, challenging her, and generally creating mayhem backstage before they would go on. For example, when the first Roland Jazz Chorus amplifier was hand-delivered by the Roland Company to Joni to use at a performance, Jaco picked it up, and moved it over to his corner of the stage. “It’s mine,” he growled. When Joni approached him, he gave her a fierce look. And that was that.
We were well into twenty minutes of bass-player stories. Because I was a huge fan of Jaco when he played with Weather Report, I interrupted and asked what it was like musically to play with him. She said that he was different from any other bass player she had ever played with; that he was the only bass player up to that time that she felt really understood what she was trying to do. That’s why she put up with his aggressive behaviors.
“When I first started out,” she said, “the record company wanted to give me a producer, someone who had experience churning out hit records. But [David] Crosby said, ‘Don’t let them—a producer will ruin you. Let’s tell them that I’ll produce it for you; they’ll trust me.’ So basically, Crosby put his name as producer to keep the record company out of my way so that I could make the music the way that I wanted to.
“But then the musicians came in and they all had ideas about how they wanted to play. On my record! The worst were the bass players because they always wanted to know what the root of the chord was.” The “root” of a chord, in music theory, is the note for which the chord is named and around which it is based. A “C major” chord has the note C as its root, for example, and an “E-flat minor” chord has the note E-flat as its root. It is that simple. But the chords Joni plays, as a consequence of her unique composition and guitar-playing styles, aren’t typical chords: Joni throws notes together in such a way that the chords can’t be easily labeled. “The bass players wanted to know the root because that’s what they’ve been taught to play. But I said, ‘Just play something that sounds good, don’t worry about what the root is.’ And they said, ‘We can’t do that—we have to play the root or it won’t sound right.’”
Because Joni hadn’t had music theory and didn’t know how to read music, she couldn’t tell them the root. She had to tell them what notes she was playing on the guitar, one by one, and they had to figure it out for themselves, painstakingly, one chord at a time. But here is where psychoacoustics and music theory collide in an explosive conflagration: The standard chords that most composers use—C major, E-flat minor, D7, and so on—are unambiguous. No competent musician would need to ask what the root of a chord like those is; it is obvious, and there is only one possibility. Joni’s genius is that she creates chords that are ambiguous, chords that could have two or more different roots. When there is no bass playing along with her guitar (as in “Chelsea Morning” or “Sweet Bird”), the listener is left in a state of expansive aesthetic possibilities. Because each chord could be interpreted in two or more different ways, any prediction or expectation that a listener has about what comes next is less grounded in certainty than with traditional chords. And when Joni strings together several of these ambiguous chords, the harmonic complexity greatly increases; each chord sequence can be interpreted in dozens of different ways, depending on how each of its constituents is heard. Since we hold in immediate memory what we’ve just heard and integrate it with the stream of new music arriving at our ears and brains, attentive listeners to Joni’s music—even nonmusicians—can write and rewrite in their minds a multitude of musical interpretations as the piece unfolds; and each new listening brings a new set of contexts, expectations, and interpretations. In this sense, Joni’s music is as close to impressionist visual art as anything I’ve heard.
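The idea that one set of notes can support more than one root can be made concrete with a toy example. This is a minimal sketch under heavy assumptions: it knows only six chord shapes, it spells every note with sharps, and the chords it examines are textbook examples I chose for illustration, not chords taken from Joni's songs. The names `CHORD_TEMPLATES` and `possible_roots` are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Each chord shape is a set of semitone distances above its root.
# Assumption: a deliberately tiny vocabulary, nowhere near a full theory of harmony.
CHORD_TEMPLATES = {
    "major": (0, 4, 7),
    "minor": (0, 3, 7),
    "sus2": (0, 2, 7),
    "sus4": (0, 5, 7),
    "6": (0, 4, 7, 9),
    "minor 7": (0, 3, 7, 10),
}


def possible_roots(notes):
    """Return every (root, quality) reading whose notes exactly match the input."""
    sounding = {NOTE_NAMES.index(name) for name in notes}
    readings = []
    for root_index, root in enumerate(NOTE_NAMES):
        for quality, steps in CHORD_TEMPLATES.items():
            if {(root_index + step) % 12 for step in steps} == sounding:
                readings.append((root, quality))
    return readings


if __name__ == "__main__":
    print(possible_roots(["C", "E", "G"]))       # a plain triad
    print(possible_roots(["C", "E", "G", "A"]))  # same triad with one added note
    print(possible_roots(["C", "D", "G"]))       # a chord with no third
```

The plain triad comes back with a single reading, C major. The other two each come back with two: C 6 or A minor 7 in one case, C sus2 or G sus4 in the other. A bass note on C or on A (or on C or on G) would settle the question, which is exactly the choice a root-playing bassist forces on the listener.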
As soon as a bass player plays a note, he fixes one particular musical interpretation, thus ruining the delicate ambiguity the composer has so artfully constructed. All of the bass players Joni worked with before Jaco insisted on playing roots, or what they perceived to be roots. The brilliance of Jaco, Joni said, is that he instinctively knew to wander around the possibility space, reinforcing the different chord interpretations with equal emphasis, sublimely holding the ambiguity in a delicate, suspended balance. Jaco allowed Joni to have bass guitar on her songs without destroying one of their most expansive qualities. This, then, we figured out at dinner that night, was one of the secrets of why Joni’s music sounds unlike anyone else’s—its harmonic complexity born out of her strict insistence that the music not be anchored to a single harmonic interpretation. Add in her compelling, phonogenic voice, and we become immersed in an auditory world, a soundscape unlike any other.
Musical memory is another aspect of musical expertise. Many of us know someone who remembers all kinds of details that the rest of us can’t. This could be a friend who remembers every joke he’s ever heard in his life, while some of us can’t even retell one we’ve heard that same day. My colleague Richard Parncutt, a well-known musicologist and music cognition professor at the University of Graz in Austria, used to play piano in a tavern to earn money for graduate school. Whenever he comes to Montreal to visit me he sits down at the piano in my living room and accompanies me while I sing. We can play together for a long time: Any song I name, he can play from memory. He also knows the different versions of songs: If I ask him to play “Anything Goes,” he’ll ask if I want the version by Sinatra, Ella Fitzgerald, or Count Basie! Now, I can probably play or sing a hundred songs from memory. That is typical for someone who has played in bands or orchestras, and who has performed. But Richard seems to know thousands and thousands of songs, both the chords and lyrics. How does he do it? Is it possible for mere memory mortals like me to learn to do this too?
When I was in music school, at the Berklee College of Music in Boston, I ran into someone with an equally remarkable form of musical memory, but different from Richard’s. Carla could recognize a piece of music within just three or four seconds and name it. I don’t actually know how good she was at singing songs from memory, because we were always busy trying to come up with a melody to stump her, and this was hard to do. Carla eventually took a job at the American Society of Composers, Authors and Publishers (ASCAP), a composers’ rights organization that monitors radio station playlists in order to collect royalties for ASCAP members. ASCAP workers sit in a room in Manhattan all day, listening to excerpts from radio programs all over the country. To be efficient at their job, and indeed to be hired in the first place, they have to be able to name a song and the performer within just three to five seconds before writing it down in the log and moving on to the next one.
Earlier, I mentioned Kenny, the boy with Williams syndrome who plays the clarinet. Once when Kenny was playing “The Entertainer” (the theme song from The Sting), by Scott Joplin, he had difficulty with a certain passage. “Can I try that again?” he asked me, with an eagerness to please that is typical of Williams syndrome. “Of course,” I said. Instead of backing up just a few notes or a few seconds in the piece, however, he went all the way back to the beginning! I had seen this before, in recording studios, with master musicians from Carlos Santana to the Clash—a tendency to go back, if not to the beginning of the entire piece, to the beginning of a phrase. It is as though the musician is executing a memorized sequence of muscle movements, and the sequence has to begin from the beginning.
What do these three demonstrations of memory for music have in common? What is going on in the brain of someone with a fantastic musical memory like Richard or Carla, or the “finger memory” that Kenny has? How might those operations be different from—or similar to—the normal neural processes in someone with a merely ordinary musical memory? Expertise in any domain is characterized by a superior memory, but only for things within the domain of expertise. My friend Richard doesn’t have a superior memory for everything in life—he still loses his keys just like anyone else. Grandmaster chess players have memorized thousands of board and game configurations. However, their exceptional memory for chess extends only to legal positions of the chess pieces. Asked to memorize random arrangements of pieces on a board, they do no better than novices; in other words, their knowledge of chess-piece positions is schematized, and relies on knowledge of the legal moves and positions that pieces can take. Likewise, experts in music rely on their knowledge of musical structure. Expert musicians excel at remembering chord sequences that are “legal” or make sense within the harmonic systems that they have experience with, but they do no better than anyone else at learning sequences of random chords.
When musicians memorize songs, then, they are relying on a structure for their memory, and the details fit into that structure. This is an efficient and parsimonious way for the brain to function. Rather than memorizing every chord or every note, we build up a framework within which many different songs can fit, a mental template that can accommodate a large number of musical pieces. When learning to play Beethoven’s “Pathétique” Sonata, the pianist can learn the first eight measures and then, for the next eight, simply needs to know that the same theme is repeated but an octave higher. Any rock musician can play “One After 909” by the Beatles even if he’s never played it before, if he is simply told that it is a “standard sixteen-bar blues progression.” That phrase is a framework within which thousands of songs fit. “One After 909” has certain nuances that constitute variations of the framework. The point is that musicians don’t typically learn new pieces one note at a time once they have reached a certain level of experience, knowledge, and proficiency. They can scaffold on the previous pieces they know, and just note any variations from the standard schema.
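Storing a piece as "a standard framework plus a few variations" is the kind of arrangement a programmer would recognize. Here is a minimal sketch of it, with assumptions flagged: I use one common form of the twelve-bar blues as the stored framework, the song and its two departures from the framework are hypothetical, and the names `SCHEMAS` and `recall_song` are invented for the example. It is an analogy, not a model of what the brain does.

```python
# One shared framework, written as a chord (in Roman numerals) for each bar.
# Assumption: this is one common variant of the twelve-bar blues among several.
SCHEMAS = {
    "twelve-bar blues": ["I", "I", "I", "I",
                         "IV", "IV", "I", "I",
                         "V", "IV", "I", "I"],
}


def recall_song(schema_name, variations=None):
    """Rebuild a song's chords from its framework plus its noted variations."""
    bars = list(SCHEMAS[schema_name])  # copy, so the shared framework is untouched
    for bar_number, chord in (variations or {}).items():
        bars[bar_number - 1] = chord   # bar numbers count from 1, as musicians do
    return bars


# A hypothetical song: only its two departures from the framework are stored.
hypothetical_song = {"schema": "twelve-bar blues", "variations": {2: "IV", 12: "V"}}

if __name__ == "__main__":
    print(recall_song(hypothetical_song["schema"], hypothetical_song["variations"]))
```

The song's own memory entry holds two items rather than twelve, and a thousand songs built on the same framework would all share the one stored list.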
Memory for playing a musical piece therefore involves a process very much like that for music listening as we saw in Chapter 4, through establishing standard schemas and expectation. In addition, musicians use chunking, a way of organizing information similar to the way chess players, athletes, and other experts organize information. Chunking refers to the process of tying together units of information into groups, and remembering the group as a whole rather than the individual pieces. We do this all the time without much conscious awareness when we have to remember someone’s long-distance phone number. If you’re trying to remember the phone number of someone in New York City—and if you know other NYC phone numbers and are familiar with them—you don’t have to remember the area code as three individual numerals; rather, you remember it as a single unit: 212. Likewise, you may know that Los Angeles is 213, Atlanta is 404, or that the country code for England is 44. Chunking is important because our brains have limits on how much information they can actively keep track of. There is no practical limit to long-term memory that we know of, but working memory—the contents of our present awareness—is severely limited, generally to nine pieces of information. Encoding a North American phone number as the area code (one unit of information) plus seven digits helps us to avoid that limit. Chess players also employ chunking, remembering board configurations in terms of groups of pieces arranged in standard, easy-to-name patterns.
Musicians also use chunking in several ways. First, they tend to encode in memory an entire chord, rather than the individual notes of the chord; they remember “C major 7” rather than the individual tones C - E - G - B, and they remember the rule for constructing chords, so that they can create those four tones on the spot from just one memory entry. Second, musicians tend to encode sequences of chords, rather than isolated chords. “Plagal cadence,” “aeolian cadence,” “twelve-bar minor blues with a V-I turnaround,” or “rhythm changes” are shorthand labels that musicians use to describe sequences of varying lengths. Having stored the information about what these labels mean allows the musician to recall big chunks of information from a single memory entry. Third, we obtain knowledge as listeners about stylistic norms, and as players about how to produce these norms. Musicians know how to take a song and apply this knowledge—schemas again—to make the song sound like salsa, or grunge, or disco, or heavy metal; each genre and era has stylistic tics or characteristic rhythmic, timbral, or harmonic elements that define it. We can encode those in memory holistically, and then retrieve these features all at once.
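The first two kinds of chunking, a chord name standing in for its notes and a progression label standing in for its chords, can be sketched in a few lines. This is an illustration under stated assumptions, not a description of how musicians' memories are actually organized: notes are spelled with sharps only (so E-flat appears as D#), the rule table covers just four chord types, the only progression included is the plagal cadence (IV to I), and every chord in it is treated as a major triad. All of the names are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# The "rule for constructing chords": semitone steps above the root.
CHORD_RULES = {
    "major": (0, 4, 7),
    "minor": (0, 3, 7),
    "7": (0, 4, 7, 10),
    "major 7": (0, 4, 7, 11),
}

# Semitone distance of each scale degree used below from the key note.
DEGREE_OFFSETS = {"I": 0, "IV": 5, "V": 7}

# A progression label is a single memory entry that unpacks into several chords.
PROGRESSIONS = {"plagal cadence": ["IV", "I"]}


def build_chord(root, quality):
    """Unpack one chord name into its individual notes."""
    start = NOTE_NAMES.index(root)
    return [NOTE_NAMES[(start + step) % 12] for step in CHORD_RULES[quality]]


def expand_progression(label, key):
    """Unpack a progression label into the notes of each chord, in a major key."""
    key_index = NOTE_NAMES.index(key)
    chords = []
    for degree in PROGRESSIONS[label]:
        root = NOTE_NAMES[(key_index + DEGREE_OFFSETS[degree]) % 12]
        chords.append(build_chord(root, "major"))  # assumption: major triads only
    return chords


if __name__ == "__main__":
    print(build_chord("C", "major 7"))
    print(expand_progression("plagal cadence", "C"))
```

Asking for "C major 7" yields the four tones C, E, G, and B from one entry plus one rule, and asking for a plagal cadence in C yields an F major chord followed by a C major chord. The memory load sits in the two small tables, not in the notes themselves.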
These three forms of chunking are what Richard Parncutt uses when he sits at the piano to play thousands of songs. He also knows enough music theory and is acquainted enough with different styles and genres that he can fake his way through a passage he doesn’t really know, just as an actor might substitute words that aren’t in the script if she momentarily forgets her lines. If Richard is unsure of a note or chord, he’ll replace it with one that is stylistically plausible.
Identification memory—the ability that most of us have to identify pieces of music that we’ve heard before—is similar to memory for faces, photos, even tastes and smells, and there is individual variability, with some people simply being better than others; it is also domain specific, with some people—like my classmate Carla—being especially good at music, while others excel in other sensory domains. Being able to rapidly retrieve a familiar piece of music from memory is one skill, but being able to then quickly and effortlessly attach a label to it, such as the song title, artist, and year of recording (which Carla could do) involves a separate cortical network, which we now believe involves the planum temporale (a structure associated with absolute pitch) and regions of the inferior prefrontal cortex that are known to be required for attaching verbal labels to sensory impressions. Why some people are better at this than others is still unknown, but it may result from an innate or hardwired predisposition in the way their brains formed, and this in turn may have a partial genetic basis.
When learning sequences of notes in a new musical piece, musicians sometimes have to resort to the brute-force approach that most of us took as children in learning new sequences of sounds, such as the alphabet, the U.S. Pledge of Allegiance, or the Lord’s Prayer: We simply do everything we can to memorize the information by repeating it over and over again. But this rote memorization is greatly facilitated by a hierarchical organization of the material. Certain words in a text or notes in a musical piece (as we saw in Chapter 4) are more important than others structurally, and we organize our learning around them. This sort of plain old memorization is what musicians do when they learn the muscle movements necessary to play a particular piece; it is part of the reason that musicians like Kenny can’t start playing on just any note, but tend to go to the beginnings of meaningful units, the beginnings of their hierarchically organized chunks.
Being an expert musician thus takes many forms: dexterity at playing an instrument, emotional communication, creativity, and special mental structures for remembering music. Being an expert listener, which most of us are by age six, involves having incorporated the grammar of our musical culture into mental schemas that allow us to form musical expectations, the heart of the aesthetic experience in music. How all these various forms of expertise are acquired is still a neuroscientific mystery. The emerging consensus, however, is that musical expertise is not one thing, but involves many components, and not all musical experts will be endowed with these different components equally—some, like Irving Berlin, may lack what most of us would even consider a fundamental aspect of musicianship, being able to play an instrument well. It seems unlikely from what we now know that musical expertise is wholly different from expertise in other domains. Although music certainly uses brain structures and neural circuits that other activities don’t, the process of becoming a musical expert—whether a composer or performer—requires many of the same personality traits as becoming an expert in other domains, especially diligence, patience, motivation, and plain old-fashioned stick-to-it-iveness.
Becoming a famous musician is another matter entirely, and may not have as much to do with intrinsic factors or ability as with charisma, opportunity, and luck. An essential point bears repeating, however: All of us are expert musical listeners, able to make quite subtle determinations of what we like and don’t like, even when we’re unable to articulate the reasons why. Science does have something to say about why we like the music we do, and that story is another interesting facet of the interplay between neurons and notes.