CHAPTER 3
What’s Involved in Sounding Human?
I hope they’re as human as they sound. DR. BONNELL, The Invasion of the Body Snatchers
Dr. Bonnell, a character in the 1956 film The Invasion of the Body Snatchers,1 finds himself increasingly isolated as alien body snatchers separate people’s souls and bodies, replacing their souls with robotlike minds. At a certain point, Bonnell hears music and is encouraged to think that there might still be other human beings in the vicinity. This is when he utters the statement in the epigraph, “I hope they’re as human as they sound.” He soon discovers to his horror that he has only heard the radio of a truck that is now driven by one of the snatched bodies.
What is it to sound human? Is there even a generic way of sounding human, as there is a generic way of sounding like a gibbon or a humpback whale? Can we use sound to recognize others of our species, as many songbirds do? Ethnomusicologist Bruno Nettl contends that human individuals can almost always recognize other cultures’ music as music, and that they are thereby recognizing what can be termed “musical universals.”2
In 1974 a Sumerian song from approximately 1400 BC (probably a hymn to Nikkal, the wife of the Hurrian moon god), which had been decoded from cuneiform on clay tablets, was performed for an American audience. Sandra Trehub reports, “Listeners at the song’s North American premiere did not hear the exotic melody that they had anticipated; what they heard, instead, sounded like an ordinary lullaby, hymn, or folk song.”3 Surely, one of the remarkable aspects of music is its ability to traverse time. When we hear the music of Josquin, Bach, or Beethoven, we hear living impulses from those very composers, despite their historical distance. We literally hear Josquin or Bach or Beethoven. And surely there is something seemingly magical about music transmitted from 1400 BC. Through it those dead for several millennia sing to us, as if alive.
John Blacking raised the issue of whether there is a universal way of sounding human in his groundbreaking book How Musical Is Man? His answer is that musicality is a basic part of our human inheritance, and that it is only because we have allowed our natural abilities to atrophy that we are not all active performers:
A variety of circumstances and taboos . . . have suppressed the innate talents of millions of people. That is what concerns me most, because I believe that music-making is a special activity that has had, and could continue to have, important consequences for the full development of human potential.4
Despite his distress about our failure to claim our musical inheritance, however, Blacking is sanguine about the prospect for cross-cultural musical communication on the basis of our common perceptual abilities:
Our own experience suggests that there are some possibilities of cross-cultural communication. I am convinced that the explanation is to be found in the fact that at the level of deep structures in music there are elements that are common to the human psyche, although they may not appear in the surface structures.5
As is no doubt obvious, I am sympathetic to Blacking’s views. Indeed, he emphasizes one of the major points that I am defending in this book, the centrality of musical capacity in members of the human species. I also agree that the common character of human music is greater than divergent surface features might indicate. In any event, if music is a central aspect of human experience, cross-cultural evidence should support this. We should be able to find what social scientists call “species-invariant traits” pertaining to musicality in individuals of various cultures. Without a common denominator, the claim that musicality is basic to human nature will not be very convincing.
ACOUSTICS AND UNIVERSALITY
Insofar as music is constructed of sounds displaying certain frequency patterns, we might think that the science of acoustics ensures a common basis for musical experience across cultures. For example, it has sometimes been argued that natural acoustics determines what would be an acceptable cadence, the end to a phrase or melody. How plausible is the idea that acoustics will show us which features of music are universal?
We should first consider the way acoustics explains consonance. In the West Pythagoras is credited with the discovery that vibrating strings of lengths in simple ratios to one another produce pleasing, harmonious intervals. Moreover, a vibrating string produces not only a single tone but also numerous “harmonics,” or overtones. When we hear middle C, for example, we hear, in addition to the fundamental tone (middle C itself), frequencies that are whole-number multiples (2, 3, 4, and so on) of the fundamental frequency. Thus we hear the C above middle C, the G above that, and so on. This is because in addition to vibrating as a whole, each half of the string also vibrates, as does each third, each quarter, and so on. Each of these vibrating subsections of the string generates a tone as well, and these tones generated by the vibrations of subcomponents of the string are called overtones. The overtones are less prominent than the basic tone, the fundamental, but we hear a blend of the fundamental and its overtones as a single sound. Within this blend, a particular overtone is more evident the more closely it relates to the fundamental.6
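The overtone series just described can be illustrated numerically. The following Python sketch is an illustration only, under assumptions that are not part of the discussion above: it takes middle C to be about 261.63 Hz (its value when A above middle C is tuned to 440 Hz), and it labels each harmonic with the nearest equal-tempered pitch. The function names are hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq_hz, a4_hz=440.0):
    """Name the equal-tempered pitch closest to freq_hz (assumes A4 = a4_hz)."""
    # MIDI note 69 is A4; each semitone multiplies frequency by 2**(1/12).
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"

def harmonics(fundamental_hz, count):
    """Return the first `count` whole-number multiples of the fundamental."""
    return [n * fundamental_hz for n in range(1, count + 1)]

MIDDLE_C = 261.63  # assumed value in Hz
names = [nearest_note(f) for f in harmonics(MIDDLE_C, 6)]
print(names)  # ['C4', 'C5', 'G5', 'C6', 'E6', 'G6']
```

On these assumptions, the second and third harmonics fall on the C above middle C and the G above that, as the passage states.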
When two tones are sounded simultaneously, the particular distance between the pitches is called an interval.7 We hear in the interval the fundamental and the overtone series of each of the tones. If one of the fundamental tones corresponds to one of the first several overtones of the other, the interval will tend to be perceived as consonant. Another way of expressing the tones’ relationship is to say that two tones will tend to be heard as consonant if many of the tones in their overtone series overlap. These acoustic principles, dealing as they do with the physics of sound, apply to music throughout the world. Hence, one might expect basic consonances, intervals formed of tones with simple frequency ratios, to be the same in music across the globe.
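The idea that consonant intervals share many overtones can be checked with a small count. This Python sketch is a rough illustration, not a model of perception: it assumes ideal harmonic tones, considers only the first twelve harmonics of each tone (an arbitrary cutoff), and uses the just-intonation ratios that appear in this chapter, including 45:32 for the tritone. The function name is hypothetical.

```python
from fractions import Fraction

def shared_overtones(ratio, count=12):
    """Count harmonics common to two tones whose fundamentals stand in `ratio`.

    Frequencies are expressed relative to the lower fundamental, and only the
    first `count` harmonics of each tone are compared.
    """
    lower = {Fraction(n) for n in range(1, count + 1)}
    upper = {ratio * n for n in range(1, count + 1)}
    return len(lower & upper)

intervals = {
    "octave": Fraction(2, 1),
    "fifth": Fraction(3, 2),
    "fourth": Fraction(4, 3),
    "tritone": Fraction(45, 32),
}
overlap = {name: shared_overtones(r) for name, r in intervals.items()}
print(overlap)  # {'octave': 6, 'fifth': 4, 'fourth': 3, 'tritone': 0}
```

The ordering matches the traditional ranking of these intervals, although, as the following paragraphs argue, such counts do not settle what a given tradition will hear as consonant.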
Even an appeal to these basics as the core of what is universal, however, is not straightforward. Perception of consonance is highly context dependent. The range of acceptable consonances in a musical tradition is subject to change. The perfect consonance of the fourth (strings resonating in the ratio of 3:4) was considered relatively dissonant in some periods of Western music. The Western musical tradition has shifted considerably since 1600, when thirds were considered too dissonant to use in a cadence.
Dowling and Harwood consider the changes in Western perceptions of dissonance to reflect a psychological principle articulated by Harry Helson. Helson endorses the theory that in any sensory modality, we prefer stimuli that differ from our adaptation level (the level to which we are accustomed) by a small amount.8 Dowling and Harwood comment:
As listeners adapt to greater amounts of tonal dissonance, there is a gradual shift along the continuum. Helson’s model also explains why a shift toward greater consonance is sometimes attractive—for example, Stravinsky’s backing away in the 1920s from the dissonances of Rite of Spring.9
The Rite of Spring is the rhythmically jabbing, harmonically jarring ballet of 1913 that provoked a riot when it was first performed in Paris; by contrast, Stravinsky’s works from around 1923 are commonly described as “neoclassical” because of their inclusion of techniques and idiomatic tendencies from earlier, less dissonant times.10
Lerdahl and Jackendoff observe that the analysis of consonance in terms of the overtone series is of limited usefulness. They point out that “there is no way to derive the minor triad from the overtone series,” although the minor triad has been taken as sufficiently consonant to be used in the final cadence in Western classical music since the eighteenth century. They conclude, “derivation from the overtone series is neither a sufficient nor a necessary condition for consonance, even in the musical idiom most familiar to us.”11 Patel reiterates this point, noting that although the octave (an interval with a 2:1 frequency ratio) is interpreted as “the same again” virtually everywhere, and the fifth (an interval with a 3:2 frequency ratio) is an important interval in most cultures’ music, the relationship between the overtone series and the scales of many cultures is not very strong.12
Moreover, pentatonic scales, commonly taken to be the most universal among scales, are not understood the same way in various cultures.13 Some cultures use semitones,14 some require particular ornaments,15 the intervals and their sequence differ, and the relative hierarchy of tones within the pentatonic scale varies across societies.16 Most importantly, different musical cultures use different methods of tuning to determine the exact pitches used in the scale.
During the baroque period, the West began using tempered tuning, which is not based on strict acoustics but modifies acoustically pure intervals slightly in order to facilitate transposition and modulation from one key to another.17 The form of tempering used in Western music is called “equal temperament.”18 In natural acoustics F-sharp and G-flat are not identical, but tempering makes them so. The consequence is that in a scale tuned according to equal temperament, the ratios of frequency vibrations for the intervals among the tones are not precisely the simple ratios that Pythagoras identified. They are, however, close enough that the ear tolerates them. The octave is kept acoustically pure, the fourth and fifth are tuned in a way that slightly fudges the simple ratios between the frequencies of vibrations, and other intervals are modified to a greater degree. Tempered tuning is employed in the now standard tuning of the piano.
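The size of these adjustments can be made concrete by comparing equal-tempered intervals with their simple-ratio counterparts. The Python sketch below is illustrative only. It measures intervals in cents (hundredths of an equal-tempered semitone), a unit not used in the discussion above, and it assumes the ratio 5:4 for the pure major third as an example of one of the "other intervals"; both choices are mine.

```python
import math

def cents(ratio):
    """Convert a frequency ratio to cents (1200 cents = one octave)."""
    return 1200 * math.log2(ratio)

# interval name -> (simple frequency ratio, size in equal-tempered semitones)
PURE_INTERVALS = {
    "octave": (2 / 1, 12),
    "fifth": (3 / 2, 7),
    "fourth": (4 / 3, 5),
    "major third": (5 / 4, 4),  # assumed pure ratio, for illustration
}

# Positive values mean the tempered interval is wider than the pure one.
deviation = {
    name: 100 * semitones - cents(ratio)
    for name, (ratio, semitones) in PURE_INTERVALS.items()
}

for name, diff in deviation.items():
    print(f"{name}: {diff:+.2f} cents")
```

Under these assumptions the octave is unchanged, the fifth and fourth are off by about two cents in opposite directions, and the major third is off by more than thirteen cents.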
The motivation for tempered tuning stems from a surprising fact about the harmonic system. Successive modulation from one key to its nearest neighbor does not, in nature, return one to exactly the same key.
One can get the point by attempting to go through the circle of fifths, the series of tones produced by successive moves up a fifth. If one starts on the tone of C, the series goes C-G-D-A-E-B-F#-C#-G#-D#-A#-E#(F)-B#(C). In order to keep the sequence of tones within an octave, for practicality, this sequence can also be thought of as involving successive leaps up a fifth and down a fourth; the descending leap of a fourth lands one on the lower octave relative of the tone one would reach by leaping up a fifth. Through this process, one returns to the same key on the piano in twelve moves. Without some distortion of natural tuning, however, E# and F would not be equivalent, nor would B# and C. In other words, the pitch one arrived at would not be the same as that produced by the first piano key.19
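The failure of the circle to close can be verified with exact arithmetic. In this Python sketch, which is only an illustration, each move is a pure fifth (a 3:2 ratio) and the result is folded back into a single octave whenever it overshoots, mirroring the leaps "up a fifth and down a fourth" described above. The function name and the use of cents as a unit are my own choices.

```python
from fractions import Fraction
import math

FIFTH = Fraction(3, 2)
OCTAVE = Fraction(2, 1)

def stack_fifths(steps):
    """Go up `steps` pure fifths, folding back into one octave after each move."""
    ratio = Fraction(1)
    for _ in range(steps):
        ratio *= FIFTH
        while ratio >= OCTAVE:
            ratio /= OCTAVE
    return ratio

end = stack_fifths(12)
gap_in_cents = 1200 * math.log2(float(end))

print(end)                     # 531441/524288
print(round(gap_in_cents, 2))  # 23.46
```

After twelve pure fifths the pitch is higher than the starting pitch by the ratio 531441:524288, roughly a quarter of a semitone. Equal temperament distributes this discrepancy evenly across the twelve fifths.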
The circle of fifths can also be characterized in terms of key proximity. Let us consider the keys in the Western diatonic system, that is, the system in which the distances between notes in the scale come in two sizes, distances of whole tones and of semitones, with no two successive half steps and no more than three successive whole steps. Keys in this system are more or less close to each other depending on how many of the tones in the scale of one key are also included in that of the other. The closest keys have only one differing tone. This relationship obtains between keys whose tonics are a fifth apart and a fourth apart. Thus G major and F major are the closest keys to C major, because each requires only one change from the tones used in C major.
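Key proximity in this sense can be counted directly. The Python sketch below is a minimal illustration that represents tones as pitch classes numbered 0 to 11 (with C as 0) and assumes equal temperament, so that, for example, F-sharp and G-flat count as the same tone. The names are hypothetical.

```python
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]  # whole steps (2) and half steps (1)

def major_scale(tonic):
    """Return the set of pitch classes (0 = C ... 11 = B) in the major scale on `tonic`."""
    pitch_classes, current = set(), tonic
    for step in MAJOR_STEPS:
        pitch_classes.add(current % 12)
        current += step
    return pitch_classes

def differing_tones(tonic_a, tonic_b):
    """Count the tones of key A's scale that are absent from key B's scale."""
    return len(major_scale(tonic_a) - major_scale(tonic_b))

C, D, F, G = 0, 2, 5, 7
print(differing_tones(C, G))  # 1
print(differing_tones(C, F))  # 1
print(differing_tones(C, D))  # 2
```

G major and F major each differ from C major by one tone, while D major, two fifths away, differs by two.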
Tempering is possible because our ears, as we will shortly consider further, are willing to interpret intervals that are less than acoustically perfect as close enough. This enables us to hear equal-tempered tuning (the specific tempering system used in the West) without having a sense that something is out of tune. We deliberately tune fifths to produce one beat per second. A beat, in this usage, is an interference of sound waves that produces a recurrent audible intensification of sound, a regular cycle in which the sound gets louder periodically. The faster the beats, the more unpleasant we tend to find them. We in the West take one beat per second as acceptable, even though the beats can be perceived.
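A rough estimate of the beat rate of a tempered fifth can be computed as follows. This Python sketch rests on assumptions that go beyond the text: it treats the audible beats as arising between the third harmonic of the lower tone and the second harmonic of the upper tone (which coincide exactly in a pure 3:2 fifth), and it takes middle C to be about 261.63 Hz. The function name is hypothetical.

```python
def tempered_fifth_beat_rate(lower_hz):
    """Estimate beats per second for an equal-tempered fifth above lower_hz."""
    upper_hz = lower_hz * 2 ** (7 / 12)  # seven equal-tempered semitones
    # In a pure fifth these two harmonics coincide; tempering pulls them apart.
    return abs(3 * lower_hz - 2 * upper_hz)

rate = tempered_fifth_beat_rate(261.63)  # fifth above middle C (assumed frequency)
print(round(rate, 2))  # 0.89
```

On these assumptions the fifth above middle C beats slightly less than once per second, in line with the figure given above. The estimate scales with register: the same interval an octave higher would beat twice as fast.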
The reason those of us accustomed to Western music do not usually notice the beats inherent in tempered tuning is that this pattern of beats conforms to our pattern of expectation. Gaining familiarity with any musical tradition involves developing mental templates, or schemata, for the way the music ought to sound, and the schemata we develop in acquiring familiarity with Western music include those for scales based on equal-tempered tuning.20 Other societies may use different schemata for tuning, constructing scales on different bases than we do. Cultural schemata for tuning may also differ with respect to how standardized tuning is across the range of instruments.21
Unfortunately for cross-cultural musical understanding, we may well hear music from cultures with tuning systems unlike ours as out of tune, just as they might hear our music that way. Alternatively, if the discrepancy is not too vast, we might distort what we are hearing so that it conforms to our schema for pitch. Y. R. Chao, who was both a linguist and a musician, described such a case in his own experience:
The writer once heard a piece of music and interpreted it as here in major and there in minor and its notes as being do, re, mi, etc., only being slightly “off,” but subsequently learned to his surprise that it was a scale of seven equal steps in the octave. The illusion persisted even after he was told. He had forced his own intervals into the new scale. . . .22
Recognition of the role of culture in music should lead us to recognize that acoustics do not ensure that everyone will experience the same piece of music in the same way. As Stephen Davies observes, musically significant features of a piece of music may not even be perceptible if one has internalized a different sense of tuning. Davies takes Javanese music as an example:
To someone raised on the major scale, almost all the intervals of the Javanese slendro scale sound so horribly out of tune on first hearing that no sense can be made of the different relationships holding between them. To such a person, a shift from the mode (or patet) of manyura to nem does not sound like a modulation at all. And, for such a person, unused from Western music to the cadential function of long-lasting, gong-punctuated structures, the colotomic pattern is likely to be missed altogether, even if the sound of gong agung is unmistakable.23
Acoustics will not ensure cross-cultural musical comfort. We need another basis for determining species-invariant musical traits. In other words, we need to ascertain what traits, if any, amount to musical universals.
SEEKING MUSICAL UNIVERSALS
The quest for universal musical characteristics of human beings is fraught with difficulties, however. To begin with, different disciplines have different criteria for universals. Philosophers, sometimes postulating that universals exist on a different, more abstract plane than the material world, will typically refuse to call a trait universal with respect to a particular set or group unless it appears in every possible individual instance. A single counterexample can be sufficient to jettison a firmly accepted assertion.24 By contrast, anthropologists, concerning themselves with real-world cases, are much more comfortable accepting as universals traits that are characteristic of members of a group, though not exhibited in every case. Mantle Hood, for example, straightforwardly proposes,
let us be content in defining “universals in music” as those attributes of music that approach worldwide distribution. . . . In other words, let us accept “universality” in the sense of “high probability of occurrence.” Another analogy: babies are born with two arms and two legs—almost always.25
Hood’s point is that we tend to view cases that depart from the “universal” as involving some disability. The “universal” is thus a normative paradigm. Given that I am concerned with real-world musical experience, I will be employing this weak standard for a universal in what follows. Even using this weak standard of (more or less) species invariance, the number of such universals is small.26
If we decide provisionally to accept as universals what most philosophers would at most allow are “near-universals,” what evidence should we use to establish them? How much evidence is necessary to allow the induction that a particular musical capacity is universal, in a rough sense? And even if we agree on the nature of the evidence, what does it show? Evidence must be subject to analysis, and the most suggestive hints require speculation and extrapolation. But speculation and extrapolation can go wrong. What kind of evidence is sufficient to establish that a right or wrong step has been taken?
Presumably we will be on reasonably firm ground if we seek universals in standard human perceptual mechanisms. To be a member of the human species is fundamentally to fulfill certain biological criteria. As we shall see, many alleged musical universals are rooted in human sensory perception. They therefore seem good candidates for what human beings have in common, at least to the extent that they share the same sensory capabilities. (Some of the universals that have been proposed, however, are not uniquely dependent on auditory perception but would also be involved in tactile pattern recognition. They would therefore apply to the perceptions of deaf individuals as well as to hearing ones.)
For the remainder of this chapter, I will focus on various “musical universals” that empirical evidence seems to support thus far. I will not insert quotation marks around the expression in the remainder of my discussion, although we should keep in mind that cross-cultural research is at a sufficiently preliminary stage that some revisions to the list may become warranted.
Taken together, musical universals provide a degree of perceptual access into foreign music.27 At the same time, some of them ensure a certain amount of difficulty adjusting to music that contrasts with familiar music, particularly in connection with pitch. Even if we all use scales and categories in our perception of music, the fact that cultures make use of different ones means that mental schemata for interpreting incoming musical signals are not universally the same. The use of schemata in our musical processing, in fact, ensures that there are perceptual barriers restricting the ease of access to music that departs sufficiently from what we have learned to expect.
PROCESSING UNIVERSALS
I will initially distinguish between universals of musical perception and universals of musical structure.28 The former we would expect to apply to the basic operations involved in perceiving music (assuming the requisite sensory modalities). The latter may apply on a certain level of generality to instances of music that are stylistically diverse on another level. We will consider proposed universals in each of these categories, beginning with universals of musical processing. I will go on to suggest that these universal processes alone already establish an illusion that has an impact on our sense that music has meaning, the impression that music “moves” in a manner akin to our own activity.
Several universals relating to the perceptual processing of what Lerdahl and Jackendoff call “the musical surface” have been identified.29 Harwood terms these “processing universals.”30 I will list a number of them, briefly explaining where appropriate, and then comment on the extent to which they might ground a sense of common human experience.
1. We distinguish signals from noise.31 Cultures in general distinguish between music and nonmusic. Albert Bregman points out that perceptually, “tones (sounds whose waveforms repeat cyclically) will often segregate from noises.”32 The acoustic characteristics of what is deemed “noise,” however, differ from culture to culture.33 Robert Walker points out that
the sound of an automobile braking may become periodic and may, therefore, have pitch purely acoustically, as would a wolf howl or a dog bark. However, our learning and experience operate on the information from the auditory perceptual mechanism in such a way as to enable us to distinguish clearly between those categories of periodic sounds that we know as being musical, according to our own particular culture, and those that acoustically have similar periodic function but do not belong to the sound world we call “music.” To this extent one may say that just as acoustical dissonance is not the same as musical dissonance, so acoustic pitch is, de facto, not necessarily the same as musical pitch.34
2. Sounds that are candidates for being incorporated into music must be within the vibration range of human pitch perception, which is most accurate between 100 and 1,000 Hz.35
3. We perceive musical information in “chunks.” In other words, when we perceive an unfolding musical stream, our mind grasps it as a sequence of units or events.36
4. We perceive a tone and its counterpart an octave away (tones vibrating in a frequency ratio of 2:1) as “functionally equivalent.”37 Even three-month-old babies accept the substitution of one tone by its octave counterpart without showing signs of being startled.38 This phenomenon is called octave equivalence. One consequence of octave equivalence is that men and women singing an octave apart, in most societies, are taken to be singing in unison.39
5. We stretch octaves. This means that at higher frequency ranges, an interval of frequencies vibrating in a ratio slightly greater than 2:1 (i.e., somewhat larger than an acoustic octave) is perceived as an octave. In lower-frequency ranges, by contrast, the ratio of tones accepted as an octave is slightly smaller than 2:1.40
6. Musical signals are organized in terms of melodic contour.41 We grasp and remember a musical sequence as having a certain shape. Harwood describes contour as a “potent perceptual ‘chunking’ mechanism in memory for melodies.”42 Awareness of contour appears to begin early in life. Chang and Trehub found that five-month-old babies’ heart rates, which decelerate when startled, decelerated when melodies they had become adapted to changed contour.43
7. Melodic fission occurs. This means that a single line of sequential pitches is heard as two lines, a high line and a low line, when the pitches alternate across a relatively wide intervallic distance.44 As an example, we might consider the melody of “Joy to the World,” which is simply the descending major scale. Let us imagine the two tones that begin and end the opening descending melody of “Joy to the World,” to which the words are “Joy to the world, the Lord is come.” If we imagine just those two tones, one after the other, without the intervening ones, what we have in mind is a descending leap of an octave. If the next note we imagine is the tone that is conjoined with the word “to” in this Christmas carol, followed by another tone an octave lower than it, we have begun a sequence that could be continued down the major scale, with each tone followed by its counterpart an octave below it. Melodic fission means that we perceive this going back and forth between a higher set of tones and a lower set of tones as creating two lines, or two descending major scales, instead of a single line that is constantly leaping. Bach strikingly exploits melodic fission in his works for solo violin and solo cello, creating the impression of multiple melodic lines by going rapidly back and forth between registers (i.e., distinct areas of the range of pitches).45
8. We accept an acoustically deviant tone as the nearest pitch in the scale, so long as it is sufficiently close. This can be referred to as the phenomenon of pitch proximity. Our ears are rather tolerant of acoustic imperfection. As long as a tone is sufficiently close to a pitch within the scale, we will hear it as that pitch. Another way of putting this point is that we perceive pitches categorically, that is, we tend to hear pitches as conforming to notes of the scale even if they are somewhat sharp or flat (although past a certain point, we notice that they are not on target, and at some point the discrepancy is great enough that we do not hear it even as a malformed instance of the intended tone).46 This perceptual tendency has enabled those of us accustomed to tempered tuning to develop a schema for pitch that makes the acoustical beats produced unnoticeable.47
9. We prefer, and more easily remember, intervals and sequences of tones with frequencies in small-integer ratios with one another (relative consonances) over those in frequency relationships of larger-integer ratios (relative dissonances).48 We perceive the former as relatively stable and the latter as relatively unstable. (In light of the previous point, we should note that the frequencies heard need only approximate those in small-integer ratios to be perceived as relatively consonant.)
Western musical analysis since the Renaissance has counted as “perfect consonances” those intervals composed of tones whose frequencies of vibration are in simple ratios with one another. These correlate with strings whose lengths stand in simple ratios to one another. The perfect consonances are the octave (with a 1:2 ratio), the (perfect) fifth (with a 2:3 ratio), and the (perfect) fourth (with a 3:4 ratio). At the other end of the spectrum are intervals composed of tones whose frequencies of vibration stand in very large-integer ratios to one another. The frequency ratio for the tritone (the augmented fourth or diminished fifth, an unstable interval referred to as “the devil in music”) is 45:32.49 (One can hear our tempered version of this interval by playing C and the nearest F-sharp/G-flat on the piano.) Experimental studies have provided some evidence of cross-cultural agreement that the fourth and fifth are more consonant than the tritone.50
10. Temporal patterns are more important for processing and remembering musical sequences than are specific timing cues.51 In other words, we pay more attention to overall timing patterns than to the timing of individual musical events.
11. With few exceptions, human music makes use of scales, frameworks of discrete pitches, typically with uneven step size.52 Isabelle Peretz considers “the encoding of pitch along musical scales” to be one of two anchoring points used by the brain to process music. (The other is “the ascription of a regular beat to incoming events.”53) The use of scales with definite steps seems to be a near-universal, at least. This does not imply that all cultures utilize an explicit concept of a scale, but most utilize stable pitch contrasts, even if the number of pitches utilized varies widely.54
Two other universals further characterize the scale frameworks used in music.
12. Virtually all scales are restricted to five to seven tones.55 This is consistent with George Miller’s principle that our short-term memory can manage “7 plus or minus 2” items of information.56 Some psychologists have suggested that the number of items in the scale is as restricted as it is because of this limitation of memory; but Burns and Ward think this is exaggerated, since competent Western musicians can usually keep track of the twelve tones of the chromatic scale. Still, even they think there is an upper limit of how many tones we can keep track of, and the use of scales ranging from five to seven tones per octave is widespread.57
13. Pitches are organized hierarchically within the scale.58 In other words, scales include both a pitch that is most stable, called “the tonic” in Western music, and others that are less stable, each to a different degree.59
14. The temporal lengths of tones are typically uneven.60 This tendency, coupled with the tendency to employ scales of discrete tones, may explain why Western music was for centuries successfully notated by means of neumes, the signs used to notate liturgical chant from around the ninth century. Neumes indicate that the melody should rise or fall, and show only two durations, long and short.61
15. Rhythm is more basic for making judgments of similarity of musical patterns than pitch,62 and we tend to normalize rhythm.63 In other words, as Lerdahl and Jackendoff point out, the listener “normally . . . treats . . . local deviations from the metrical pattern as if they did not exist; a certain amount of metrical inexactness is tolerated in the service of emphasizing grouping or gestural patterns.”64 Thus we tend to hear less than regular intervals as regular, up to a point. We also tend to hear temporal intervals as half or twice as long as previous intervals, even when they approximate such durations very inexactly.65 According to Sloboda, this is because we categorize rhythm and have “a limited set of categories for describing durations of notes.” Given a specific duration for a particular note, “all the other symbols acquire a defined duration which is exactly double, or half of that standard.”66 Thus our musical notation and terminology reflect our categorical perception of duration.
16. Tempo keeping seems to be proportional, on the basis of low integer ratios.67 This means that sections within musical performances display tempos that relate to one another in simple proportions (e.g., 1:1, 2:1, and 3:1). This is the case even when performance is interrupted by rest periods. David Epstein found evidence that these results hold cross-culturally.68 He proposes that proportional tempo keeping may provide an aesthetic constraint on musical performance, establishing one criterion for aesthetic success.69 This pattern suggests that Leibniz is onto something when he writes that music is “an unconscious exercise in arithmetic in which the mind does not know it is counting.”70
In addition to these universals formulated in relation to music, we can include others that are more generally applicable to human cognition. The principles of Gestalt psychology concern the ways that we group perceptions together into objects and coherent shapes, which, Dowling and Harwood suggest, “seem to describe aspects of stimulus organization that arise automatically from the operation of the sensory systems, without involving more complex cognitive systems such as memory.”71 Musicologist Leonard Meyer contends that Gestalt principles should be included in any roster of universals involved in processing music:
The universals central for music theory are not those of physics or acoustics but those of human psychology—principles such as the following: proximity between stimuli tends to create connection, disjunction results in separation; orderly processes imply continuation to a point of relative stability; a return to patterns previously presented enhances closure; and, because of the requirements of memory, music tends to have considerable redundancy and is often hierarchically structured.72
By “hierarchically structured,” Meyer means that music is patterned in layers, with “higher,” more encompassing patterns subsuming those on lower layers. For example, measures subsume individual notes; phrases subsume measures; sections subsume phrases; and so forth.73
Glenn Schellenberg provides experimental support for the claim that Gestalt principles are universal in a study that compared melodic expectancy among American and Chinese listeners, using Chinese and British melodies. Two Gestalt principles that applied to all the subjects, regardless of national background, were the expectation that melodies would be composed of small intervals (“pitch proximity”) and the expectation that a leap in one direction would be followed by pitch movement in the opposite direction (“pitch reversal”).74 Consider, for example, “The Star-Spangled Banner.” The first phrase (“Oh say, can you see”) involves relatively small steps between tones. The second (“By the dawn’s early light”) does, too, except for the leap down as one sings “early.” This leap is followed immediately by two small ascending steps, an instance of pitch reversal. The third phrase similarly involves a leap, this time an ascending leap (at the word “proudly”), which is followed by small steps in the opposite direction.75
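The pattern of a leap followed by motion in the opposite direction can be checked mechanically. The Python sketch below is an illustration only and is not Schellenberg's method. It assumes a transcription of “By the dawn’s early light” in C major as the pitches E5, D5, C5, E4, F-sharp 4, G4, written as MIDI note numbers, and it arbitrarily treats any interval of five semitones or more as a leap. The function name is hypothetical.

```python
def leaps_and_reversals(midi_notes, leap_threshold=5):
    """List each leap as (leap, following_interval, reverses_direction).

    Intervals are in semitones; a leap is any interval whose size is at
    least `leap_threshold` (an assumed cutoff).
    """
    intervals = [b - a for a, b in zip(midi_notes, midi_notes[1:])]
    results = []
    for leap, following in zip(intervals, intervals[1:]):
        if abs(leap) >= leap_threshold:
            # Opposite signs mean the melody turned back after the leap.
            results.append((leap, following, leap * following < 0))
    return results

# Assumed transcription: E5 D5 C5 E4 F#4 G4
phrase = [76, 74, 72, 64, 66, 67]
print(leaps_and_reversals(phrase))  # [(-8, 2, True)]
```

The single leap in the phrase, a descent of eight semitones, is followed by an ascending step, which is the reversal described above.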
The first of Meyer’s principles (disjunction in stimuli results in separation) is evident in melodic fission. This is a specific manifestation of the Gestalt “principle of proximity,” which holds that objects that are close together are perceived as grouped together.76 The second of Meyer’s principles is an application of the basic law of Gestalt psychology, the law of “good continuance,” or prägnanz, which he defines as follows: “A shape or pattern will, other things being equal, tend to be continued in its initial mode of operation.”77 Meyer’s third principle is an instance of the Gestalt “principle of closure,” which holds that objects that are physically incomplete will be perceptually filled in. (For example, we will tend to see an incomplete circle as a circle, even though it is not physically closed up.) The fourth deals with musical structures, which we will consider shortly.
Of course, further empirical evidence might show that some of what are now thought to be universals are the result of incomplete evidence. Patel describes an investigation in which he participated that called into question a long-standing belief about a universal. The belief held that the grouping pattern for linguistic and musical chunks, a short duration followed by a long one, was a universal perceptual law. Inconveniently for the belief, a large number of Japanese subjects indicated that they heard the reverse pattern.78 Huron also points out that evidence has not sustained the theory, premised on Gestalt theory and defended by both Meyer and musicologist Eugene Narmour, that listeners prefer a melody that, having skipped a tone while moving along a scale, backtracks to “fill in” the gap that has been left.79 Such findings indicate the need for considerably more empirical work before we can rest content with our theories about what constitute universals.80
For the moment, however, let us consider the processing universals so far discussed and the way these already suggest a basis for recognizing something familiar in even quite foreign music. I will suggest that even if some of these purported universals prove to be less than ubiquitous, the evidence so far suggests that the list indicates features of musical perception and widespread practices (such as scale use) that accommodate them. These features, I will go on to argue, also underlie our tendency to experience music as “behaving” or “acting” in a manner that we associate with our own activities. If this is so, perceptual tendencies that appear to be common across cultures already take us a long way toward a shared ground for associating meanings from our extramusical experience with music.
MUSICAL PERCEPTION AND HEARING ACTIVITY IN MUSIC
Certain of the seemingly universal perceptual processes we have considered are particularly important for awareness of sharing a common world with other human beings. In the vast majority of instances, music seems to resemble human activity and draws attention to unfolding temporality, both features of our experience that we have in common with other human beings. Both, too, depend on the way we perceive music. I will defer most of my discussion of music’s reflection of our temporal experience until chapter 8. Here I will consider the ways that musical perception overdetermines our tendency to find a strong resemblance between music and human activity.
Certainly the fact that we move when making music would already incline us to associate music with human behavior. In any context in which musical performance is live, performers and listeners experience the music as a result of human movement.81 Even in cases in which music is not live, I suspect that we usually think of the music in this way so long as we have any sense of how it is produced (although the production of music by means of computers might make the connection a bit less obvious).
But we connect music with our nonmusical activity as well, and what I will argue here is that some of the perceptual processes considered above are particularly relevant for this association. First, music suggests movement through the changes of state we refer to in terms of consonance and dissonance. We prefer intervals with frequency ratios of simple integers, which we experience as consonant, and we expect more dissonant intervals to give way to more consonant ones. This is the case not only for dissonant intervals that are performed simultaneously (harmonic dissonances), but also for dissonant intervals that are performed sequentially (melodic dissonances).82 Consonance and dissonance are often defined in terms of relative tendency to movement or repose, stability or instability, with dissonance tending toward resolution in a consonance.83
We tend to experience patterns of dissonance and consonance in terms of tension and relaxation. We feel tension and relaxation ourselves in response to the degrees of tension among the intervals, and we also tend to objectify tension and relaxation as features of the music. Accordingly, we easily sense a similarity between the behavior of music and our activity, for we notice recurrent tensions, followed by partial or complete resolutions of those tensions in music. These resemble our own patterns of exertion and relaxation. For example, we follow moments of exertion (such as lifting a heavy weight) with moments of relaxation. These may be temporary (as when we lift a weight, set it down, lift it again, and so on) or they may be relatively enduring (as when we leave the gym entirely). We should note, however, that a sense of relative tension among intervals depends upon the hierarchical organization of pitches within a scale, which may not be immediately obvious to the listener who is unfamiliar with the system through which the scale is constructed. We have also observed earlier that what counts as consonance, although premised on frequency ratios that are simple, is relative to a musical system.
Nevertheless, besides sequences of consonances and dissonances, one of the most basic ways that human beings organize what they hear as music is melodic contour, and contour is also among the means by which music suggests relative tension and relaxation and the change of states this involves. We sense that a melodic line reveals more and less effort, in accordance with changes in the relative height of pitches. The fact that higher pitches in one’s vocal range are more difficult to sing than lower pitches may be a factor in the correlation of higher pitches with greater effort. The fact that Western and a number of other cultures use the straightforwardly spatial metaphor of height in reference to pitch also supports the tendency to consider music as resembling our effortful activity; we speak of “reaching” for “high” notes, for example.
Patel suggests a third basis for associating music with human activity. He proposes that an inherent tension between two features of perception may be operative in the impression of animation. One is the expectation of proximity in tones that are similar; the other is the fact that neighbors in pitch may be quite different with respect to their hierarchical roles in relation to the centering tone. In the case of Western tonal music, the hierarchical roles amount to stability within a key; but Patel points out that pitch hierarchies arise in any music that is organized around a tonal center. Melody, he submits, is animated in part by the psychological pull between these two principles.84
A fourth ground on which we experience music as active like ourselves is the analogy that we observe between pitch space and the space of our activities. That we do suppose such an analogy is suggested by the very fact that many cultures speak of pitches in spatial terms.85 We also have the impression while listening that music is filling space, the very space in which we literally operate.86 We tend to associate spatial position with sound as well, for we use acoustics to localize sound sources laterally in space.87
Charles Nussbaum indicates a neurophysiological basis for the analogy that we make between musical space and the space in which we act. On the basis of the close relationship between the auditory system and the motor system, according to Nussbaum, music makes use of the same system we use to interact with the external spatial world. In the spatial case, we mentally model the features of the environment to prepare ourselves to act in it. In the case of music, we mentally “represent virtual layouts and scenarios in an imaginary musical space in which the listener acts (off-line).”88 We are exploring a virtual space when we listen to music, and our motor systems are stimulated by it (as is evident from our tendency to swing and sway with music).
The engagement of our motor systems suggests an explanation for the fact that we tend to interpret rhythm in kinetic terms. Soundtracks underscoring the movements of cartoon characters exploit this association of musical rhythms with the rhythms of active agents. This, too, inclines us to consider the music as active in much the way that we are.
Fifth, we locate ourselves within the unfolding music much as we locate ourselves in the space in which we act. The unevenness of both the steps within the vast majority of human scales and the temporal lengths of tones gives us a sense of position within music, a position from which we can move by steps and leaps. We might compare the pitch and rhythmic spectrums to two axes of a grid extending indefinitely in both directions (just as the spatial continuum is often modeled as grids extending indefinitely in three dimensions). If pitch and rhythmic increments were absolutely even in their respective spheres, locations in music would be no more distinctive with respect to each other than are the spaces of a checkerboard. John Sloboda compares the function of scales and rhythms in this respect: “Scale and rhythm perform the same essential function, that of dividing up the pitch and time continua into discrete and re-identifiable locations, on which backdrop all the essential dialectical activities (tension-resolution, motion-rest), can flourish.”89 The fact that in many cultures’ music some tones are more structural in a melody and others ornamental might psychologically reinforce the sense of moving by steps from one definite position to another.90
Sixth, musical entrainment also encourages us to associate music with our actions. Entrainment is the synchronization of one’s biological activity (including one’s physical movements) with an externally produced rhythm, such as that of a metronome or another person.91 Apparently unlike other animals, human beings are able to deliberately adjust the rhythm of their own movements to coincide with the pulse of something outside themselves.92 People can march in formation, perform music in accordance with composers’ tempo indications, and conjoin their strength in common tasks by means of entrainment. Nietzsche makes the point that music can be a tool of politics for this reason: “Rhythm is a compulsion; it engenders an unconquerable urge to yield and join in; not only our feet follow the beat but the soul does, too.”93
Because music can and often does entrain many listeners at once, it is a powerful means of synchronizing our activity. Work songs and marching songs, two globally widespread phenomena, illustrate the impact of music for entraining and coordinating the actions of many human beings. Such interpersonal synchronization also effects a strong sense of connection with other participants. Phenomenologist Alfred Schutz describes the social bond established among performers and listeners through music’s synchronization of their multilayered impression of time. He observes,
On the one hand, there is the inner time in which the flux of musical events unfolds, a dimension in which each performer re-creates in polythetic94 steps the musical thought of the (possibly anonymous) composer and by which he is also connected to the listener. On the other hand, making music together is an event in outer time, presupposing also a face-to-face relationship, that is, a community of space, and it is this dimension which unifies the fluxes of inner time and warrants their synchronization into a vivid present.95
A seventh possible experiential basis for relating music to our own activity is a reference within the performance of musical phrases to the speeding up and slowing down involved in our locomotion. Bruno Repp found that within musical phrases performers tend to speed up at the beginning and to get slower toward the end. He suggests that this derives from our patterns of physical movement.96 If this is so, music literally mimics our physical behavior.
The universal and quasi-universal features of our musical perception not only encourage comparison between the activity of music and our own behavior; their doing so serves as a ground for further associations between music and extramusical meaning. Because musical perception itself leads us to compare music to our own modus operandi, it encourages us to identify with music even as listeners, as Nussbaum explains:
The internal representations employed in recovering the musical structure from the musical surface specify motor hierarchies and action plans, which, in turn, put the listener’s body into off-line motor states that specify virtual movements through a virtual terrain or a scenario possessing certain features.97
We feel as though the movements of the music are our own movements because our mental representations of music involve enacting virtual movements in accordance with it. Music’s relationship to the systems by which we navigate our world also ultimately provides a basis for the many dimensions of meaning that music has for us and for our sense of its relevance to extramusical content. “Musical modeling of nonmusical domains,” Nussbaum claims, “is pervasive, because musical experience is fundamentally bodily, gestural, and simulational in significance.”98 In other words, because music engages our bodily systems for relating to the world, we model many features of the world in terms of it.
The resemblances between music and our own activity are significant for a number of reasons. First, they suggest a basis for recognizing commonalities between ourselves and others whose minds, like ours, relate music to their own activity and exploration of the world. Second, they provide grounding for musical associations with extramusical meaning, a topic that we shall consider further in a later chapter. Third, and most relevant to our topic here, they offer an initial way of relating to music that is disorientingly foreign.
Can we really recognize our own activity in unfamiliar foreign music? We can to some extent. That is, we can recognize in such music activity akin to our own to the extent that we can discover melodies whose contours suggest various levels of exertion and relaxation, entrain to a regular rhythm, locate ourselves within patterns of uneven pitch and rhythmic increments, and mentally represent the musical signal in terms of layouts and scenarios in virtual musical space. Music’s reflection of features of our own activity, largely on the basis of relatively universal features of musical perception, encourages us to identify with its movement. The animation of music, with which we can identify, provides a basis for our feeling to some degree at home in unfamiliar music.
Nevertheless, processing universals are not the only aspects of music that affect our sense of comfort. We have already considered the potential difficulties that arise when the schemata we bring to an experience of music deviate from those utilized in the structuring of the music. Cross-cultural intelligibility is not ensured by the processing universals. In fact, the universals of pitch proximity and the employment of scales almost ensure the opposite—that sometimes we will distort what we hear more or less automatically, conflating it with what we expect to hear.
In the following chapter we will consider further complications for the project of cross-cultural understanding of music. On the one hand, we will note some of the quasi-universal tendencies that shape musical structure, which should help us to recognize familiar features in music that is structured in unfamiliar ways. On the other hand, we will find that evaluative and interpretive approaches to many aspects of music are culturally relative. We will reach the unsurprising conclusion that although we can enjoy much foreign music, we may not be sure what it means, at least to those who produce it. However, I will suggest that some of the obstacles to understanding can be ameliorated and, with persistence, overcome.