CHAPTER 5
The Music of Language
People suppose that words are different from the peeps of baby birds, but is there any difference, or isn’t there?
CHUANG TZU, The Complete Works of Chuang Tzu
I wondered whether music might not be the unique example of what might have been—if the invention of language, the formation of words, the analysis of ideas had not intervened—the means of communication between souls.
MARCEL PROUST, Remembrance of Things Past
From the folk saying with which we began in chapter 1 to the contemporary psychology lab, music has often been compared to a language, particularly in connection with its universal aspects. “Music . . . is a kind of language which we speak and understand yet cannot translate,” observes Eduard Hanslick.1 Claude Lévi-Strauss, too, refers to music as “the only language with the contradictory attributes of being at once intelligible and untranslatable.”2 Wittgenstein is equivocal: “Music, some music at least, makes us want to call it a language; but some music of course doesn’t.”3 Theodor Adorno characterizes music as “the most eloquent of all languages” even as he endeavors to explicate the dissimilarities between the two modalities.4
How plausible is this characterization of music as a language? That depends on what is meant and on how literally it is intended. Clearly, music and language have much in common. They both involve sound addressed to hearing and have rules for combining elements. The resulting strings of sound events are meaningful,5 and both link human beings to the external world. Music and language have enough in common to justify metaphoric comparison.
But despite their similarities, the two modes are also different in ways that neuropsychology increasingly specifies. Patients with amusia (the loss of certain musical abilities) due to brain lesions do not necessarily develop aphasia (an inability to speak), nor do those with aphasia necessarily develop amusia.6 This suggests that the two systems are neurologically dissociable.7 The two modes seem to involve partially distinct brain areas, even if there is considerable overlap in the areas engaged.8 Evolutionary theorists also stress the differences in the way the two systems operate, often arguing that language has evolutionary priority (in terms of importance), although there is much debate about whether either, neither, or both are evolutionary adaptations.9
As modes of communication, speech and music differ in their means and in what is communicated. Wittgenstein proposes, “one says of a piece of music: ‘This is like some sentence, but what sentence is it like?’”10 We can ask other questions, too. What in music resembles vocabulary? Where are music’s parts of speech?
One might, I suppose, claim that music has affinity with a particular part of speech, specifically the preposition. Sequences of music have directionality, and the various ways this directionality proceeds might be described in terms of prepositions (above, beyond, during, etc.). The differences among musical idioms might also be compared to the different usage of equivalent prepositions in various languages. The prepositions used in one’s native language feel “natural,” while divergent usage in foreign languages seems mildly perverse. I imagine that the up in pick up is appropriate. Indeed, I can rationalize it, in that I do lift something when I pick it up. But what of the up in shut up!? Does it literally command one to lift one’s jaw? The German prepositions auf (on, upon, at, in) and zu (to, toward, up) used for open and shut (shortened forms of aufmachen and zumachen) strike me, by contrast, as merely conventional, but I doubt that this is the impression of most native German speakers.11
Even if music might be compared to a language of prepositions, this would be a language in a very strange sense. A language with only one part of speech is unlike the languages people speak. Natural languages involve systematic interconnections among words assuming different roles. How would the terms in our all-preposition language be strung together? “Around, up, below, around.” At most this seems like a series of directions to be followed. This string of words would have meaning only in conjunction with much that is implicitly understood.
And yet the language notion of music has taken root. It is particularly strong in recent times when many linguists and philosophers take language as the model for thought. I am convinced that music’s centrality to human life is often obscured by the dominance of a linguistic model. If one describes music as a language, it is far too easy to assume that its role in human life is sufficiently investigated by elaborating its parallels with linguistic structures and meanings. Music’s uniqueness can too easily be overlooked when it is examined through analytic grids designed in efforts to understand language.
In this chapter I will suggest that the linguistic model obscures some aspects of music’s communicative powers. For example, music more holistically communicates an overall sense of intentional orientation than does language. At the same time, using language as a model for music obscures the extent to which linguistic communication relies on musical characteristics. While it may be true that our experience with language can illuminate certain features of music, it is also true that our experience with music can reveal certain features of language, features that are often underappreciated.
In order to draw attention to these omissions, I will engage in a thought experiment. I will reverse the model of the language-music comparison, suggesting, along with composer and musical semiotician David Lidov, that we might justly call language a music.12 Lidov observes that a reversal of this sort is a bit facetious: “To claim today that musicology should encompass linguistics rather than the reverse can only be a literary ploy.” Nevertheless, he concludes, “the musical aspect of speech is truly of its essence.”13
I grant that my discussion in this chapter is a bit tongue in cheek, but my purposes are serious. One of these is to indicate that music has different communicative strengths than language, even though the two modes draw on some of the same capabilities. The other is to show that these strengths facilitate communication across language groups, supporting the suggestion that “music” can be a means of bridging our cultural divides.
COMMON CHARACTERISTICS
Music and language are both universal human practices, and they share many characteristics.14 Insofar as these are shared characteristics, they offer at least as much a warrant for claiming that language is a music as for claiming that music is a language. In what follows I will take spoken language to be the paradigmatic case, as I take performed music as paradigmatic.15
If we return to the list of musical processing universals discussed in chapter 3, we find that many of those not dealing with precise pitch are relevant to language perception as well. The first three are uncontroversially important for perceiving language.
1. We make a distinction between signals and noise, and we certainly make a distinction between spoken language and noise. Joseph P. Swain observes that languages, like music, exclude certain sounds. “Completely banned from all languages are laughs, whistles, and other vocal products that are simply deemed to be nonlinguistic.”16 We do not make a similar distinction between music and language; we use the same stream of sound as both music and language in any song with words. We do not make the distinction between speech and noise, however, in the same way in every language. A study of brain response to speech and nonspeech suggested a differential response to clicks between Zulu and English speakers. The brains of Zulu speakers, who employ clicks in their language, responded to them as they did other speech sounds, while English speakers’ brains did not.17
2. Human pitch perception is most accurate in the vibration range between 100 and 1,000 Hz. This universal determines the basic range in which spoken language can be understood as well as the basic range for musical sound.
3. We perceive linguistic information in “chunks.” After some familiarization infants are able to distinguish between sequences of consonant-vowel clusters that are commonly used and those that are unusual. This suggests that they recognize spoken language in chunks and develop familiarity with those that occur frequently.18
Several other “universal” features of musical perception are also applicable to language, although this might not be immediately obvious in every case. Restated as linguistic universals, these perceptual factors are as follows:
1. Linguistic signals are organized in terms of melodic contour.
2. Categorical perception is involved in our apprehension of phonemes and temporal intervals.19
3. Frameworks of discrete scale pitches are utilized, typically with uneven step size.
4. Durations of syllables are typically uneven.
5. Rhythm is more basic than pitch for making judgments of pattern similarity.20
6. We employ Gestalt principles in grouping linguistic strings.
Let us consider these one by one.
Melodic Contour
Melodic contour is important in linguistic communication.21 One occasionally has difficulty understanding one’s native language when spoken by a person with a foreign accent. Sometimes this is because the speaker uses nonstandard contour in shaping words and phrases. In Japan I experienced the other side of this situation. I was not getting the reaction I expected when I attempted to say thank you—Domo arigato. I was pronouncing arigato with an accent on the penultimate syllable, a typical accent pattern in Spanish, a language with which I am more familiar. I began to notice that the Japanese pronounce the word with an even stress on all the syllables. When I weighted the syllables more evenly in pronouncing arigato, the Japanese to whom I spoke seemed to understand me.
Melody also appears to express emotion in a similar way in both language and music.22 Patrik Juslin and Petri Laukka, using 1,095 data points, suggest that rising pitch contours may correlate with “‘active’ emotions (e.g., happiness, anger, fear)” and falling contours with sadness and tenderness, which are less active.23 More generally, emotion is conveyed by similar prosodic cues in speech and music.24 Prosody has to do with the melodic nuances of speech, and it involves those features of vocal expression that are described as “tone of voice.” It includes, as Steven Brown summarizes, “the local risings and fallings, quickenings and slowings, and loudenings and softenings that are involved in expressively conveying our meanings in a pragmatic sense.”25
Juslin and Laukka show that similar patterns of “speech rate/vocal intensity, vocal intensity/sound level, and high frequency energy” are used to convey the same emotions in speech and in music. They found that “low pitch was associated with sadness in both vocal expression and musical compositions . . . whereas high pitch was associated with happiness in both vocal expression and musical compositions.” Fear and anger tend to be associated with high pitch, and tenderness with low pitch. They note a discrepancy between music and speech in expression of fear, since it is “commonly associated with high intensity in vocal expression, albeit low intensity (sound level) in music performance.” They suggest, however, that the explanation for this divergence may be a function of experimental structure: studies of music performance may have focused on expressions of mild fear, while vocal expression studies may have focused on panic fear.26
The human neurological system is highly specialized for recognizing tone of voice. Psychologist Isabelle Peretz has established that this recognition is dissociable from both evaluation of the emotional content of facial expression and from recognition of semantic information.27 Patel suggests that the main difference between speech and song contours is that speech contours rarely attract attention and tend not to be very intricate. Citing Simon Shaheen’s characterization of melody as “a group of notes that are in love with each other,” Patel submits that by contrast “a speech melody is a loose affiliation of tones (or pitch movements) that work together to get a job done.”28
In addition to its value for word recognition and emotional expression, prosody in language is also important for conveying information about syntax.29 Prosody apparently helps listeners parse incoming speech, a matter of particular importance in the case of ambiguous sentences (e.g., “The girl met the husband of the woman who was on steroids”). Prosody helps disambiguate such statements by indicating grouping patterns.30 S. G. Nooteboom and J. G. Kruyt asked subjects to indicate the acceptability of accent patterns for presenting new information. Their results indicate that subjects expected new information to be indicated by acoustical emphasis.31
Categorical Perception
Categorical perception applies to language as well as to music. Just as we tend to hear both pitches and durations categorically in music, we also categorically perceive the phonemes of language. Sloboda observes, “From sounds which vary continuously on a number of dimensions we extract a few categories into which all normal speech sounds are assigned.”32 Some training seems to be required before listeners perceive pitch categorically, while exposure (presumably in early life) seems to be sufficient to develop categorical perception of the phonemes of one’s native language.33
We tend to utilize the habits we have for segmenting speech even in cases of listening to a language we don’t understand.34 One consequence is that we have difficulty hearing certain distinctions in languages that divide up the sound continuum in an unfamiliar manner.35 In some languages, duration is also organized categorically. For example, in Dutch, long and short vowels are distinguished in this way. Categorical perception is so important in speech that according to Nelson Goodman, syntax would be impossible without it.36
We normalize irregularities in both language and music by inserting sounds that make the acoustic signal easier to recognize. In experiments conducted by Richard M. Warren and Roslyn P. Warren, subjects were told that a phoneme would be missing from a recorded sentence. The sentence used the word “legislatures,” but in the recording the first /s/ was replaced with noise. The subjects did not register the phoneme as missing, and instead heard the word “legislatures.” Similarly, when subjects were presented with noise before “eel” in various sentential contexts, they variously heard “‘wheel,’ ‘heel,’ ‘peel,’ and ‘meal,’ to complete the sentence in the most sensible way.”37
In both speech and music, listeners similarly insert a sound to clarify the harmonic situation in the phenomenon of the missing fundamental. This is the case in which a number of the harmonics of a tone (the fundamental) are heard, but the tone itself is not presented. Despite the absence of the tone, the human ear hears the fundamental, so long as it is in the frequency range of about 20–2,000 Hz.38 Dowling and Harwood point out that this phenomenon occurs frequently in our everyday experience, for “telephones do not transmit the fundamental frequencies of male speakers, and yet that has no effect on the perceived pitches of their voices over the phone.”39 The number of harmonics necessary to produce this effect varies; fewer are required if one is hearing fairly low-numbered harmonics, that is, early harmonics within the overtone series.40
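The missing fundamental is easy to demonstrate in synthesis. What follows is a minimal sketch in Python, assuming NumPy is available: it builds a two-second tone from harmonics 2 through 5 of a 200 Hz fundamental while deliberately omitting the fundamental itself. Played back, the tone is nevertheless heard at a pitch of 200 Hz.

```python
# A minimal sketch of the "missing fundamental": synthesize a tone
# containing only harmonics 2-5 of a 200 Hz fundamental. Listeners
# nevertheless hear a pitch of 200 Hz, although no energy is present there.
import numpy as np
import wave

SR = 44100                       # sample rate in Hz
f0 = 200.0                       # the absent fundamental
t = np.linspace(0, 2.0, SR * 2, endpoint=False)

# Sum harmonics 2*f0 .. 5*f0; f0 itself is deliberately omitted.
signal = sum(np.sin(2 * np.pi * n * f0 * t) for n in range(2, 6))
signal /= np.max(np.abs(signal))  # normalize to [-1, 1]

# Write a WAV file for listening; the perceived pitch should match a
# pure 200 Hz tone even though the spectrum starts at 400 Hz.
with wave.open("missing_fundamental.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(SR)
    w.writeframes((signal * 32767).astype(np.int16).tobytes())
```

Since the standard telephone band cuts off below roughly 300 Hz, this sketch models the everyday case Dowling and Harwood describe: a male speaker’s fundamental is filtered out, yet the perceived pitch of the voice is unchanged.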
Discrete Scale Pitches
The idea that language utilizes scales of discrete pitches as frameworks may seem improbable, at least for languages such as English that are not “tonal” languages. A tonal language is a language in which the relative pitch and inflection affect the meaning of a word. Mandarin Chinese, or putonghua, is an example of a tonal language. It has four tones, and distinct words are produced when what would in English be the “same” syllable is pronounced in the various tones. Ma when pronounced with the first (high, level) tone can mean, among other things, mother.41 Pronounced with the second (ascending) tone, ma can mean hemp. Pronounced with the third (dipping) tone (a low tone that descends and then quickly reascends), ma can mean horse. Pronounced with the fourth tone (which descends from a high beginning) ma can mean to scold.42
However, according to the consensus theory in phonology, the autosegmental theory of intonation, pitch plays an important role as a semantic device in nontonal languages as well as in tonal ones. The autosegmental theory resolved a debate about whether intonation is a matter of movement between discrete pitch levels or whether it involves pitch movement, without reference to level tones. The autosegmental theory, as Steven Brown characterizes it, claims that “phonological events should be modeled as sequential movements between discrete pitch levels, often only two levels, High and Low, and that all movements between them should be reduced to the status of transitions, rather than primary phonological events of importance.”43 In other words, specific levels of pitch are of primary importance. They serve as targets, with movement between them amounting simply to a means of getting from one to another.
The autosegmental theory implies that level tones are central to intonational languages, and thus that the use of tones is not unique to tonal languages (which, we should note, sometimes employ nonlevel tones). It also regards “all spoken utterances as series of steps from one level tone to the next.”44 Brown cites studies showing that speakers of the same language tend to standardize their pitches when reading the same passage.45 The conclusion to be drawn, he claims, is that “speech, like music, is based on scales consisting of discrete pitch levels. The major difference between speech and music in this regard is that these scales change quite a bit during speech (e.g., when pitch levels change) and thus so do the level tones themselves.”46
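To make the picture concrete, here is a toy model in Python of the autosegmental view just sketched. The two-level H/L scale and the pitch values are illustrative assumptions, not data from the phonological literature: an utterance is a sequence of discrete pitch targets, and everything between targets is treated as mere transition, rendered here as linear interpolation.

```python
# Toy model of the autosegmental view: intonation as a sequence of
# discrete pitch targets (here just High and Low); movement between
# targets is mere transition, rendered as linear interpolation.
import numpy as np

LEVELS = {"H": 220.0, "L": 140.0}   # assumed speaker-specific pitch levels (Hz)

def contour(targets, times, sr=100):
    """Interpolate an F0 track from pitch-level labels and target times (s)."""
    t = np.arange(0, times[-1], 1 / sr)
    return t, np.interp(t, times, [LEVELS[lab] for lab in targets])

# "H L H L": e.g., an utterance with two accents and a final fall.
t, f0 = contour(["H", "L", "H", "L"], [0.0, 0.4, 0.8, 1.2])
```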
As Brown observes, some intonational languages move between just two pitches. This pattern resembles those musical cultures mentioned in chapter 3 that utilize only two tones in their music. In tonal languages, more scale pitches are often utilized. In Mandarin Chinese, employing four tones, we can easily distinguish at least a high, a low, and perhaps two more moderate pitches (the target pitches of the ascending and descending tones). Cantonese, which uses nine tones, includes a middle pitch area as well.47 The number of pitches in a speech scale nevertheless seems to be restricted, as would be expected given the limits of individuals’ comfortable speaking range.
Uneven Syllable Lengths
The duration of syllables in speech is typically uneven. Bruce Richman observes that both music-making and speaking depend on the ability to repeat sequences exactly. He emphasizes that speech, like music, uses formulaic patterns, or “open-slot formulas.”48 To develop the ability for exact repetition in either case, one must attend to a regular beat, which presupposes uneven temporal durations.49 Those of us who used memorized dialogues to learn a foreign language have experienced firsthand the usefulness of such formulas. Once a dialogue is memorized, each of its sentences serves as a template for constructing novel statements in the new language. The uneven syllable lengths in speech, as in music, make possible the distinctive rhythmic patterns on which phrasing and open-slot formulas depend.
Rhythm
Although I know of no direct evidence, it seems plausible that temporal patterns are more important than specific pitch cues for remembering linguistic sequences. Indeed, a frequently taught mnemonic device for memorizing strings of words is to repeat them in a distinctive rhythm. I can still recite about half of a memorized list of English prepositions, and I am fairly certain that my recall depends on the rhythmic pattern my eighth-grade teacher taught our class to use. Janellen Huttenlocher and Deborah Burke show that grouping series of digits rhythmically improved the memory of children four to eleven years of age for the sequences.50
In any case, rhythm is fundamental to linguistic communication. A number of studies indicate that individuals synchronize their “gestures, postures, and rapidly changing body movements” to the speech rhythm, usually without being aware that they are doing so.51 Successful everyday conversation depends on rhythm, as Stanford Gregory demonstrates.52 The rhythm of successful talk generates solidarity among participants, who tune in to a common rhythm of speech. People tend to doubt they are being comprehended if the conversational rhythm is broken.53
On the basis of films of interactions, William Condon and Louis Sander observe such conversational synchrony across cultures.
Interactional synchrony appears with frame-by-frame analysis as the precise dancelike sharing of micro-body-motion patterns of change between speaker and listener. Like self-synchrony, it has been observed in all normal human interaction thus far studied, including films of Mayans, !Kung Bushmen, Eskimos, among others. It has also been observed in group behavior, for example, seven listeners moving in synchrony with an eighth who was talking.54
Gestalt Principles
Gestalt grouping patterns should apply to speech as well as to music.55 Lerdahl applies the principles formulated in his and Jackendoff’s generative musical grammar to poetry. The meter of both poetry and music involves a hierarchy of beats, with beats at one level being subdivided into two or three beats at the level below. This means that although cultures differ in which metrical patterns they prefer, “cultural variation on poetic and musical meters is intrinsically limited,” for “the possible combinations of two and three, across or within levels, are very small.”56 The tendency to construct musical rhythms on the basis of patterns of twos and threes corresponds to a comparable tendency in language.
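Lerdahl’s point about limited combinations can be verified in a few lines. The following enumeration in Python counts the metrical patterns available when each of two or three hierarchical levels subdivides into either two or three beats.

```python
# Enumerate metrical hierarchies in which every level subdivides a beat
# into two or three. Even three levels yield only eight distinct patterns,
# illustrating why cross-cultural metrical variety is intrinsically limited.
from itertools import product

for depth in (2, 3):
    patterns = list(product((2, 3), repeat=depth))
    print(f"{depth} levels: {len(patterns)} patterns, e.g. {patterns[:3]}")
```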
Other Similarities
In addition to these characteristics of perception that are applicable to speech and music, the two modes of communication are linked by certain other similarities. Like music, speech involves rules for conjoining elements, collectively termed “combinatorial syntax.” Brown offers the following characterization of combinatorial syntax: “a limited repertoire of discrete units is chosen out of an infinite number of possible acoustic elements, such that phrases are generated through combinatorial arrangements of these unitary elements.” On the level of syntax, then, music and language both have a grammar.57
To this we can add the relevance of context for the acoustic properties of both musical tones and phonemes. Swain observes that the production of a particular phoneme depends on what is before or after it, and he offers the persuasive example of the way “I have to” sounds in practice like “I hafta,” which he analyzes as follows:
The conversion occurs because the vocal tract is getting ready to pronounce /t/ in the next word, which is a voiceless consonant. If the voicing does not “turn off,” /t/ will come out /d/, and usually the voicing cuts out prematurely to ensure that this does not happen. This extremely common phenomenon, called co-articulation, does not impair listener comprehension in the slightest, but it does imply that sound items cannot operate in a completely primitive manner, from the ground up.58
The many parallels just discussed may suggest some of the ways that music and language are on a par. But is there any ground for tipping the balance toward claiming, as I propose, that language is a music instead of the reverse? Yes, there is. Language presupposes musical sensitivities, both developmentally and operationally.
LANGUAGE PRESUPPOSES MUSICAL CAPABILITIES
Language relies on our musical capacities. To begin with, our first steps toward learning language depend on its musical characteristics. Musical sensitivity is evident earlier than linguistic ability, and we learn language, initially, through its “musical” features, such as pitch and rhythm patterns and melodic contour, features that infants attend to.59 Recognition of these characteristics is developmentally a prerequisite to learning meaningful language. Developmental psychologist Hanus Papousek claims that “musical elements . . . pave the way to linguistic capacities earlier than phonetic elements.”60 According to Dowling and Harwood, “Nine-month-olds have been observed to babble using the sentence intonation contours of adult English.”61 I suspect that many people besides myself have witnessed prelinguistic toddlers mimic telephone conversations on exactly this basis.
Not only do babies pay most attention to the melodic characteristics of language; caregivers also go out of their way to emphasize these musical characteristics of speech when they speak to infants.62 Trehub describes this infant-directed style of speaking, or “motherese.”
Caregivers the world over enhance their vocal messages to prelinguistic infants by making them more musical than usual. They use simple but distinctive pitch contours but articulate words poorly; they raise their pitch level, slow their tempo, and make their utterances more rhythmic and repetitive compared with their conventional speech patterns. . . . In general, playful speech to infants embodies high pitch and expanded pitch contours that are rising or rise-fall in shape; soothing speech involves low pitch, a reduced pitch range, and pitch contours that are level or falling. . . .
The pervasiveness of musical features in infant-directed utterances led several investigators to characterize these utterances as melodies.63
As examples of this kind of infant-directed speech, one might consider the utterances of television’s Teletubbies, a series aimed at children as young as one year old. Although the Teletubbies seem to be speaking, at times their speech is hard to distinguish from singing.64 Some theorists have suggested that babies seem to have innate predispositions to attend to the exaggerated intonation that this infant-directed speech employs.
Such patterns of prosody are strongly correlated with specific communicative intentions. For example, caregivers use descending pitch to soothe a baby; rising pitch to attract attention and provoke a response; and bell-shaped pitch contour to maintain attention. Infant behavioral responses to these prosody patterns vary appropriately. Although comparative linguistic studies have been limited, they have nevertheless supported the hypothesis that prosodic patterns are used to convey communicative intentions across language groups.65
Musical features of caregivers’ speech and behavior, such as rhythm and style of movement, also enable infants to attune themselves to their caregivers.66 Insofar as the development of speech depends upon a sense of interaction with another person, the engagement of these musical characteristics is necessarily a prolegomenon to language. Condon and Sander studied newborns and concluded that they are rhythmically entrained to their caregivers’ speech:
If the infant, from the beginning, moves in precise, shared rhythm with the organization of the speech structure of his culture, then he participates developmentally through complex, sociobiological entrainment processes in millions of repetitions of linguistic forms long before he will later use them in speaking and communicating. By the time he begins to speak he may already have laid down within himself the form and structure of the language system of his culture. This would encompass a multiplicity of interlocking aspects: rhythmic and syntactic “hierarchies,” suprasegmental features, paralinguistic nuances, not to mention body-motion styles and rhythms.67
On this view, our communication with other human beings, whatever system it employs, presupposes the bond effected by the musical capacity for rhythmic entrainment.
In a series of experiments by Jacques Mehler and his colleagues, infants as young as four days old were able to distinguish their native language from a different language, while they were unable to distinguish utterances in two foreign languages.68 The babies were more aroused by utterances in the native language, as indicated by the faster rate at which they sucked on their pacifiers. On the basis of several studies indicating that some sound from speech reaches infants in utero, although reduced in frequency range and intensity,69 Mehler and his colleagues tested very young infants with highly filtered versions of recordings in the native language and in a nonnative one. The infants were still able to distinguish the two, preferring their native language. This suggests that prosodic cues play an important role in the infants’ responses, since those were the only cues available on the filtered tapes. The experimenters conclude that prosody is sufficient for infants to discriminate the two languages.70
Adults, too, understand other speakers largely by virtue of the rhythmic and melodic characteristics of their speech. We have already noted that the importance of melodic contour for speech comprehension is evident in the difficulty we have in understanding someone with a radically different accent. Anne Fernald, Rainer Banse and Klaus Scherer, and others have conducted experiments to determine whether adults could determine communicative intent in speech electronically filtered so that the words were unintelligible.71 Fernald’s experiment demonstrated that adults more easily recognize communicative intent through the exaggerated intonations of infant-directed speech than through adult-directed speech, and that consistent patterns of prosodic cues convey intents such as warning or comforting.72 Banse and Scherer found good recognition of intended emotions in the vocal qualities of actors speaking meaningless “words” composed of Indo-European phonemes, although certain emotions, in particular shame and disgust, were not easily recognized using exclusively vocal cues.
The Neo-Futurists, a Chicago theater group, demonstrate the extent to which intonation conveys communicative intent in a production called Too Much Light Makes the Baby Go Blind, which is actually thirty plays in sixty minutes. In one of these plays, two actors interact for two minutes. Their lines consist only of descriptions of utterance types, spoken with the inflection appropriate to the type. For example, the actors would say, in appropriate intonation, “agreement,” “overconfident statement,” “elaborated defensive excuse,” “self-assured agreement as denial,” and “aggressive childish insult.” One has the impression of a real interaction because the actors use prosody to convey the communicative intent in each utterance.73 The play impresses one as extremely witty because it underscores the near irrelevance of the specific words, so long as their affective intent is conveyed.
That musical factors have priority in matters of comprehension is indicated by cases in which the accents or contour patterns of words are distorted to preserve musical shape and pattern. William Bright cites examples from the Navajo and from the Lushai of northeast India.74
Although the latter have a tonal language, they sometimes allow musical pitch patterns to override word pitch completely, so that a word which has rising pitch in speech may have any type of falling or level pitch in singing. . . . This is true not only in modern songs, which often copy European melodies, but also in traditional songs. The Lushai claim, however, that they can understand the meanings of song lyrics even when the word pitches are effaced in this way.75
A study by Stefan Koelsch and his colleagues contends that music can even suggest specific verbal content by activating brain mechanisms that are involved in processing word meaning.76 Koelsch and his colleagues consider whether music can prime meaning for particular words, as sentences in language can. Priming means presenting a stimulus that facilitates the subsequent processing of another stimulus. A sentence can prime the processing of words that have related meaning. For example, a sentence about boating might prime the word water. In the Koelsch study, experimenters wanted to determine whether presenting a sequence of music could facilitate the processing of particular words, as they suspected. “Intuitively, it seems plausible that certain passages of Beethoven’s symphonies prime the word hero, rather than the word flea.”77 Their findings confirm that music can suggest particular verbal meanings when the musical sounds resemble the sounds or qualities of objects, or when they resemble prosodic or gestural cues. In addition, they found that musical styles or forms could prime meaning for associated words (such as a hymn priming devotion).78 Other musical means that allegedly convey specific verbal content are the use of instruments that mimic speech patterns (such as talking drums) and whistled speech. Patel notes that the latter has been alleged to convey very specific messages about what should be done in a particular situation, and that the “speech” of the krar, a five-stringed instrument of southwest Ethiopia, is used to describe the position of objects by imitating the tones used in speech.79
Besides the fact that musical factors facilitate the acquisition and comprehension of language, another reason suggests that music may be a more deep-seated communication system than language. Extensive evidence shows that many patients who lose the ability to speak can express themselves by singing.80 Words that are unavailable to them in the form of speech can nevertheless be accessed while they are singing. A study by Martin Albert, Robert Sparks, and Nancy Helm, for example, describes patients with Broca’s aphasia (the inability to speak due to lesions in the part of the brain termed Broca’s area) who could nevertheless learn to sing what they wanted to communicate.81
Musical ability is presupposed in language acquisition, is involved in linguistic communication among adults, and provides a means whereby speech-deprived patients can express their thoughts. On the basis of this evidence, I concur with ethnomusicologist Charles Seeger when he claims, “Music, though not a universal language, is without question more nearly universal in all senses of the word, including world-wide perspective, than speech.”82
THE ENTRENCHMENT OF THE MUSIC AS LANGUAGE MODEL
Historical Background
Why, then, does the music as language model continue to hold sway, while David Lidov and I are able to suggest the language as music metaphor only provocatively? A confluence of intellectual developments has promoted the tendency to see music on the model of languages. The rise of instrumental music in the eighteenth century was a development that many found hard to understand. Prior to the eighteenth century, the assumption was that music served important educational purposes, usually accomplished through words set to music. The comparison of music to language gained prominence in this context, for it was offered as an apology for instrumental music on its own.83
Several developments in twentieth-century anglophone philosophy have also encouraged the tendency to think of music as a language, among them Wittgenstein’s analysis of language games. A language, according to Wittgenstein, is like a game in which the significance of each element is established by the entire practice that constitutes the game. Similarly, the meaning of a word comes from the way it is used, not from its correspondence with a thing in the world or some mental idea. Language learning involves becoming initiated into the rules of the “game”; that is, one learns the contexts in which particular expressions are used and how words relate to people’s actions. Meaning is not something to be discovered by philosophical analysis; instead, it is evident in the way people use words within particular communities. If one adopts Wittgenstein’s account of language as a practice within a community’s way of life, the suggestion that music is a language would emphasize music’s acquisition of meaning by virtue of the way it functions in a social context.
Another trend that has encouraged the “language” view of music is the position of some contemporary philosophers that language is a formal system. Like natural languages, on this view, artificial languages have well-defined rules for the use of the symbols of the language and for indicating when strings of symbols can be taken as equivalent. The grammaticality and interpretation of any string of symbols in such a language is established according to whether it follows those rules. An interpretation of a string of symbols in an artificial language is typically itself a string of symbols. Artificial languages invite comparison with music in that neither tracks objects and occurrences in the external world, as natural languages do. They also can be said to refer in the same way: just as strings of formal language refer to other such strings, a musical “utterance” refers to other music.
A comparison of music to formal language in this way, however, illustrates the way that the model emphasizes some features of music at the expense of others. The features of music that seem most relevant if one emphasizes this comparison are its formal, systematic, and rule-governed characteristics. What is left out is the actual performance of music, the context, and the performer’s communicative gestures.
J. L. Austin’s speech act theory is a third development in twentieth-century philosophy of language that has encouraged a comparison of language and music.84 Austin’s claims that language plays many social functions in addition to transferring information, such as performing an action (e.g., making a vow) or affecting the listener (e.g., persuading or irritating the person), strike some philosophers as relevant to music. Music can similarly accomplish certain extramusical goals (e.g., alerting), and it often affects the listener’s state of mind.
In linguistics, Noam Chomsky’s overthrow of the empiricist theory of language acquisition in favor of a theory of an inherited linguistic faculty has also reinforced the language model of music (as well as providing new motivation for considering the universal features of music). Chomsky holds that human minds all recognize certain relational patterns that they express in language, using an innate grammar of transformation rules that operate in all languages. These are rules that enable one to change the surface structure of an utterance without changing its meaning. In addition to enabling us to learn a language, according to Chomsky, they enable us to form novel utterances within it.
The idea of an innate faculty of music, akin to the Chomskian faculty of language, was proposed by Leonard Bernstein in his series of Norton Lectures at Harvard, titled The Unanswered Question.85 He argues that we inherit musical transformation rules as well as linguistic ones, and thus have an innate grammar of music.86 Bernstein compares elements of music to those of language, the note being like a phoneme, the motive like a noun, the notes added to chords like adjectives, and rhythm like a verb.87 The natural sentence is like the musical phrase. Music and language both have syntax, and importantly, a sequence of musical events, like a string of language, has meaning that can be preserved despite changes in the musical surface. Thus composers are able to develop thematic material by deleting, extending, ambiguating, transposing, ornamenting, and such, without listeners losing track of its relationship to the originally stated theme.
Inspired by Bernstein, musicologist Fred Lerdahl and linguist Ray Jackendoff set out to formulate our innate musical grammar. Their impressive study, A Generative Theory of Tonal Music, articulates principles used by experienced listeners to make sense of tonal music. Listeners analyze musical input at several levels, from small-scale patterns of pitch and rhythm to highly abstract levels of structure.88 The grammar Lerdahl and Jackendoff present involves rules for analyzing incoming “musical signals” in terms of structural patterns of time span, meter, grouping, and pitch prolongation. In practice the process is mostly unconscious, Lerdahl and Jackendoff assert, but listeners process the music they hear by applying these principles.
The grammatical rules Lerdahl and Jackendoff articulate are various, including both syntactic rules, which determine when a particular musical sequence is well formed, and preference rules, which determine what tones or structures listeners prefer to hear. They base these rules on principles of Gestalt psychology, which they take to be universal despite the restricted focus of their discussion (tonal music). Lerdahl and Jackendoff present tree analyses of works and passages of music, akin to the type of analysis that linguists give to sentences, showing the relative dominance and subordination relations of the musical elements. Their work reinforces the tendency to compare music to language and to describe it in terms of such linguistic notions as grammar and syntax.89
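To convey the flavor of such preference rules, here is a toy example in Python in the spirit of a Gestalt-based grouping rule; the threshold and the rule itself are illustrative assumptions, not Lerdahl and Jackendoff’s actual formulation. It detects group boundaries from temporal proximity alone: a boundary is heard after a note whose gap to the next onset is markedly longer than the neighboring gaps.

```python
# Toy grouping rule based on temporal proximity (a Gestalt principle):
# place a group boundary where an inter-onset gap is markedly longer
# than the gaps on either side of it.
def group_boundaries(onsets, ratio=1.5):
    """Return indices of onsets that begin a new group."""
    gaps = [b - a for a, b in zip(onsets, onsets[1:])]
    return [i + 1 for i in range(1, len(gaps) - 1)
            if gaps[i] > ratio * gaps[i - 1] and gaps[i] > ratio * gaps[i + 1]]

# Eight evenly spaced onsets with a long gap after the fourth:
# the long gap induces one boundary, splitting the notes into two groups.
print(group_boundaries([0.0, 0.5, 1.0, 1.5, 3.0, 3.5, 4.0, 4.5]))  # -> [4]
```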
Along similar lines, I should mention the inspiration that some have drawn from the field of semiotics, the systematic study of signs and signification, in articulating an interpretation of music as a kind of language.90 Some semioticians (e.g., Roland Barthes) interpret music as a language of gestures, while others (e.g., Jean-Jacques Nattiez) focus on the deep structure that they believe can be ascertained through careful analysis of the surface characteristics of music.91 In any case, the semiotic approach to music encourages the application of the terminology of semantics, syntax, and pragmatics to music.
A further intellectual development that has helped to entrench the language model of music is recent work on metaphor by linguist George Lakoff and philosopher Mark Johnson. They propose that we make use of basic conceptual metaphors in structuring our experience, such as argument is war and mind is a machine.92 Inspired by their work, musicologist Justin London has explicitly suggested that the notion “music is language” is such a metaphor, enabling us to make our way in music.93
Language as More Important
The idea of music as a type of language is also reinforced by the recent growth of interest in the evolution of language. Given that music and language utilize similar human capabilities, efforts to ascertain the evolutionary history of language commonly compare the two systems, often concluding from their dissimilarities that language is a more foundational feature of human life. Steven Pinker takes this tack in his popular book How the Mind Works, where he claims that music is “auditory cheesecake, an exquisite confection crafted to tickle the sensitive spots of at least six of our mental faculties,” which he goes on to itemize.94
As far as biological cause and effect are concerned, music is useless. It shows no signs of design for attaining a goal such as long life, grandchildren, or accurate perception and prediction of the world. Compared with language, vision, social reasoning, and physical know-how, music could vanish from our species and the rest of our lifestyle would be virtually unchanged.95
The claim that music is useless is a veritable mantra. In The Principles of Psychology, William James claims that “susceptibility to music . . . has no zoological utility.”96 In The Life of Reason, George Santayana asserts, “Music is essentially useless, as is life.”97 Charles Darwin remarks in The Descent of Man, “As neither the enjoyment nor the capacity of producing musical notes are faculties of the least use to man in reference to his daily habits of life, they must be ranked amongst the most mysterious with which he is endowed.”98
Music’s supposed uselessness is not necessarily intended as disparaging, however, as Santayana’s comment suggests.99 But most of those who, like Pinker, think that music is an evolutionary by-product stress its costliness from the standpoint of survival. It makes those who practice it more evident at a distance, and hence more vulnerable to predators. Musical expertise takes considerable time and effort to cultivate, both of which might be spent on more obviously useful pursuits.
I will not engage in the debate about music’s evolutionary status here.100 Instead I will consider one of the grounds on which music is sometimes compared unfavorably to language in terms of its usefulness as a mode of communication, the fact that music lacks the kind of semantics that language has. Items in music do not seem to refer to objects and events in the world by means of a systematically meaningful vocabulary.101 Neurologist R. A. Henson observes that “the capacity of musical language to represent exactly is extremely limited in contrast with speech.”102
The upshot for many in the current era is that when it is compared to language, music seems to be a deficient language. The main thing we find useful in language is its ability to refer systematically to particular meanings. Those who consider music to be “useless” usually consider language to be our fundamental tool, seemingly because language can embody precise meanings.
Some theorists are not willing to concede that music lacks semantics. One tack is to claim that music does have a kind of semantics, an “affective semantics” that connects musical sound to specific emotional character. Along these lines Diana Raffman refers to music’s “quasi-semantics” with respect to the emotional feelings that particular musical structures inspire.103 Jean Molino similarly contends that music involves rhythmo-affective semantics, “which involves the body, its movements, and the fundamental emotions that are associated with them.”104 Others acknowledge certain semantic resources in music, such as the ability to take on wordlike meanings (as is the case with Wagner’s leitmotifs) and its ability to refer to other music.105
One’s inclination to consider music as having semantics on any of these grounds probably depends on how strictly one wants to define “semantics.” Graham McFee contends that insofar as music is meaning-bearing, it has semantics: “as with meaning for words, attributing meaning to music is ascribing a semantics.”106 But this is an overly permissive use of the term. On this point I agree with Scruton, who argues that to demonstrate a musical semantics would require showing how musical structures connect with particular meanings.107
If music lacks a full-blown semantics, does that render music an almost-language, a language wannabe that doesn’t mean anything? Not at all. Granted, music does not encode meaning in a manner comparable to the way that language does. But this is not necessarily a disadvantage for music. If we acknowledge the difference between the two systems, we should acknowledge their different virtues.
I am hardly the first to suggest that music’s lack of a robust semantics is no deficiency. Historically, in fact, those who attended to this lack often thought it redounded to music’s stature. The idea of music’s ineffability—its untranslatability into words—has often been taken as a symptom of an inadequacy on the part of language. Although the suggestion that music is a language was offered as a defense of the legitimacy of instrumental music, music’s nonsemantic character became associated by some late eighteenth-century and many Romantic theorists with “spontaneity, immediacy, prereflexivity.” Such thinkers construed these characteristics as grounds for valuing music as superior to language, communicating in immediacy what language could only convey in a mediated way.108 E. T. A. Hoffmann expresses this view, albeit through his own version of the music is language metaphor: “Is not music the mysterious language of a distant realm of spirits, whose wondrous accents, echoing within us, awaken us to a higher, more intensive life?”109
This association of music with prearticulate experience stands in tension with the view, defended by Felix Mendelssohn and Susanne Langer, that music communicates with greater precision than language, though this view could be based on the idea that the sensuous surface of music, but not language, communicates the topography of feeling.110 Importantly, the Romantics’ praise for musical ineffability reverses the evaluation of thinkers such as Hegel and many of his eighteenth-century predecessors, who judged both music and language in terms of their ability to articulate the structures of reality in objective terms and gave the laurels to language.111 As Andrew Bowie sums up this view:
For eighteenth-century representational theories there is always a verbal equivalent of what music says, the apparently non-representational aspect of music being catered for by an underlying representational or mimetic conception of language as that which can render explicit what is only implicit in the music.112
The Romantics’ view, by contrast, celebrated music’s sensuous directness, holding the conceptual clarity of language to be achieved at the sacrifice of the prelinguistic and prereflective. John Hamilton describes the praise of music’s ineffability as preserving a distinction between language and music in order “to protect the nonsemantic from the crime of symbolization.”113 The Romantics’ enthusiasm for music’s seeming resistance to conceptual confinement, as opposed to the tidy classification schemes of language, is akin to their fascination with the sublime as opposed to the beautiful.
In contemporary times, Leonard Bernstein and Diana Raffman are among those who see music’s lack of a languagelike semantics as an asset. Bernstein does not hesitate to use the term “semantics” in reference to music, although he does not use the term in a technical sense.114 He means that music has genuine meanings. In his argument that music shows remarkable parallels with language, he claims that it is more like poetry than everyday speech. Music lacks the denotative character of language but retains the connotative, and this, he argues, is where its poetic character lies.115 Music’s lack of a distinct vocabulary, coupled with its insinuation of tensions that could be resolved in a variety of ways, makes it ambiguous and thereby particularly expressive. Even poetic language is more expressively restricted than music. When poetry involves ambiguity, because multiple alternative meaning structures are at work, the meaning structures must remain more or less compatible to avoid a mixed metaphor. Music has fewer restrictions in this respect, since one does not have explicit denotative meaning structures that must be reconcilable.
Raffman similarly rejects the idea that music’s lack of semantics is a liability for music. She accepts the language metaphor but argues that music is a language that bypasses semantics.116 Indeed, her explicit concern is with musical ineffability, music’s ability to provoke experiences that exceed our linguistic capacity. Raffman distinguishes three types of musical ineffability. Structural ineffability is the impossibility of definitively articulating certain high-level structural characteristics of a piece of music, caused by the fact that at the most abstract level, musical structure is susceptible to multiple analyses (and thus to different interpretations in performance). Feeling ineffability is the listener’s inability to articulate the sensory-perceptual “feel” of a performed piece of music to someone who does not have the same sensory-perceptual experience.
The kind of ineffability of most interest to Raffman is nuance ineffability, the impossibility of linguistically articulating the details of musical performance that occur on the level of differences too fine-grained to be captured by the analytical framework of musical grammar.117 These differences are thus in principle ineffable. The discriminations our perceptual apparatus can make and our memories for nuance are both limited, so we are not able to reidentify nuances or categorize them into types to the extent that would be necessary to systematize what we know of them.
As the very topic of musical ineffability suggests, the breakdown of the analogy between music and language is the point at which Raffman’s model becomes most interesting. She denies that there is a level for language that is comparable to the nuance level of music. Actually, I think there is such a level, but only for spoken language. The nuances of the way a particular individual articulates a given phrase in language are fine-grained in a way that is comparable to musical articulation.118 This is so, in my view, because language at this point is functioning musically. In other words, the nuances occur with respect to the shaping of the sounds rather than the shaping of meaning.
Language functions musically (as opposed to referentially) in some other ways as well. In discussing the relevance of Austin’s speech act theory for music, London claims that even absolute music can perform what Austin defines as the “behabitive” class of speech acts.119 This is a miscellaneous class that includes, according to Austin, speech acts that “have to do with attitudes and social behaviour.”120 London mentions “warnings, threats, greetings, etc.” He notes that these “often involve little or no propositional content” and are performed in the present tense. He also contends that these are
speech acts which are strongly marked by intonation as well as other paralinguistic features. . . . Thus behabitives . . . require the listener to attend to the “musical” qualities (pitch, tone of voice, loudness, rhythm, and articulation) of the locution in order to comprehend the illocutionary act.121
In my opinion, these are cases in which language functions like music. The same is true in the case of some Russian soldiers in Leo Tolstoy’s War and Peace mimicking the phonetic sounds and intonations of French while pseudo-addressing French soldiers for their own amusement. Tolstoy tells us that “the soldiers burst into a roar of such hearty, jovial laughter that the French could not help joining them.”122 A similar example is John Searle’s account of an American soldier who recited a line from Goethe in order to convince his Italian captors that he was a German soldier (a ploy that works only if he is right in assuming that his captors do not speak German).123 Referring to this example, London observes, “Thus while it is important that the American soldier’s utterance have the appropriate phonological form (both in its phonetic content and in its overall intonation), it need not have any particular syntactic or semantic form.”124 The crucial feature of the soldiers’ utterances in both examples is the sound, not the linguistic meaning.
London remarks, “There is not any sense of ‘reference’ involved in the American’s German babbling, but there is a sense of signification: when the American produces German-sounding language (in the given context) this expression counts as a sign of his Germanness.”125 Such “signification” is not specifically linguistic. Making music from an identifiable culture, or even using a particular instrument, can achieve signification (e.g., a bagpipe to show Celtic identification, or a balalaika to indicate being Russian). The characters performing the “Marseillaise” in the movie Casablanca signify their French identification whether they are singing (and using French words) or playing an instrument.126 To adapt a remark from Nietzsche that was cited earlier, we don’t need to take their word for it. We recognize their sound.
An interesting musical case in which words are used in this manner is that of a song in the language of an idealized “old country” that the musical participants do not speak. Ethnomusicologist Ron Emoff asserts:
I was told often by Cajuns, many of whom did not speak Cajun French, that “the words make the song beautiful.” With the marked demise of the use of Cajun French in Louisiana, this aesthetic evaluation inevitably can refer more to the sound shape of the words, to their sound sense, than to their semantic sense. For example, even though he did speak Cajun French, Cajun fiddler Dennis McGee was purported to have used “words for rhythm and sound more than to present a story.”127
Emoff notes a similar pattern among native Hawaiians, who value songs that are sung in Hawaiian even when they do not speak the language themselves.128 One might also consider the use of nonvernacular languages in some religious traditions, for example, Sanskrit, Pali, Hebrew, and Latin. These languages can have a profound emotional impact on the religious participant even in the absence of knowledge of the words’ meaning.
Meaningful words of language can be employed musically in yet another way. Frits Staal notes a particular type of language use in which verbal meaning is of little importance. This is the case of the mantra, an untranslatable word or a string of words repeated over and over in ritual contexts without concern for whether the words are meaningful or meaningless. “While form, in a natural language, is at most as important as meaning, the form of mantras is more important than their meaning.” Staal compares mantras to bird songs, which are similarly hard to explain in terms of a particular function (though functional accounts abound). The meanings of bird songs vary with context.129 Staal cites Konrad Lorenz, who argues that “acquired motor skills . . . are forever being performed for their own sake in the obvious absence of any other motivating or reinforcing factors. Indeed, the very concept of play is based on this fact to a large extent.”130 Staal concludes:
The similarity between mantras and bird songs is due not to common function, but to common non-functionality. Mantras and bird songs share not only certain structural properties, but also lack of an inherent or absolute purpose. It is precisely these features that express the common characteristic of both as essentially satisfying, pleasurable and playful.131
Although music and language diverge where semantics are concerned, I agree with Bernstein and Raffman that music’s lack of a full-blown semantics reflects specific virtues of music. Besides the virtues they describe—music’s poetic suggestiveness, its capacity to convey feelings too precise and nuances too particular for linguistic capture—I see another: music’s openness to acquired and multiple meanings. Thomas Turino describes the way the same musical signs accumulate meaning through their use in various contexts as “semantic snowballing.”132 John Blacking and Alan Merriam both contend that music is intrinsically polysemic, in that the same musical pattern can be conjoined with multiple meanings.133
Ian Cross also stresses music’s polysemy, claiming that “music has the capacity to lack consensual reference; it can be about something, but its aboutness can vary from context to context and even within context.”134 Using cross-modal comparisons of a sort that we consider further in the next chapter, Cross describes music’s inherent polysemy as essential to its role in cognitive development:
If music is about anything, it exhibits . . . a “transposable aboutness.” And it is conceivable that music’s “transposable aboutness” is exploited in infancy and childhood as a means of forming connections and interrelations between different domains of infant and childhood competence such as the social, biological, and mechanical. To give a crude example: the arc of a ball thrown through the air, the prosodic contour of a comforting utterance, the trajectory of a swallow as it hawks an insect, the pendular ballistics of a limb swung in purposive movement, might, for a child, each underlie the significances of a single musical phrase or proto-musical behaviour on different occasions. Indeed, these heterogeneous incidents may be bound together simultaneously. . . . Hence one and the same musical activity might, at one and the same time, be about the trajectory of a body in space, the dynamic emergence or signification of an affective state, the achievement of a goal and the unfolding of an embodied perspective.135
Music’s polysemy stands in tension with the virtue celebrated by the Romantics when they acclaimed music’s freedom from the straitjacket of conceptualization. Nevertheless, music’s lack of an assigned semantics affords it both capacities. It can convey impressions of immediate sensuous experience, and it can acquire multiple layers of meaning. What is remarkable is that it can do both at the same time.136
Music can be replete with meaning because it is not limited by denotation. Its availability for metaphoric and associative meaning enables it to play a remarkable range of roles, some of which we considered in chapter 2. Music can map both the landscape and our various cognitive domains. It is tremendously flexible in terms of what meanings can conjoin with it. Music does not lack meaning by comparison with language; its meaning is less restricted.
WHAT CAN THE MUSICAL MODEL OF LANGUAGE DO FOR US?
Although ample justification exists for reversing the “music is language” framework, this framework endures. One reason is that the characterization of music as a language is reinforced by the kinds of comparisons it activates. Swain’s book Musical Languages abounds in discussions of the many ways one can use the metaphor of music as a language to draw attention to features of music. But what recommends considering language as a music? Does reversing the cliché actually do any work for us?
One thing the music model of language might do is alert us to the acoustic potentials of speech—for example, to the ways in which our speaking can be beautiful and melodious, and ways in which it can fail to be. It can also remind us of the importance of rhythmic pattern in speech communication, emphasized by Gregory, Condon, and others.
A second benefit I see in the music model of language is that it counters the easy assumption that language is an adequate model of thought. This would be a major contribution to the philosophy of language.
Philosophers frequently characterize the contents of thought as propositions, understood in terms of linguistic clauses (e.g., “that Mary likes Fred”) but as conveyable by a variety of sentences. On this view we take various “propositional attitudes” toward these contents, such as “believing that” or “desiring that.” Recent anglophone philosophy has tended to treat these two propositional attitudes, believing and desiring, as the most important kinds.
Music, by contrast, does not typically involve propositional content, and even if Swain is right that Wagner uses the leitmotif at times to construct veritable musical propositions, this is not the typical musical case.137 Does music therefore reflect thought less aptly than language? I would say no. If anything, music reflects thought’s specific modalities more accurately than language. Language can state that a person approaches some content reverently, but music can convey the thought-tone of reverence.
I would hardly deny that some thoughts, some of them spoken, do involve the expression of beliefs and desires in propositional form. But this is far from the full extent of thought. Often we take attitudes toward contents other than propositions. The reigning perspective in philosophy of language is that language is designed to convey propositions. Yet this is only one of language’s functions. Significantly, if language were designed only to convey propositions, its prosodic elements would be of minimal importance. At most, they would clarify the propositional attitude. Even this role is trivialized by much recent philosophical discussion, which has tended to truncate the set of propositional attitudes of concern to philosophers to the small pair of belief and desire.
“A lovely girl—ah!” is certainly a thought, and sometimes a thought expressed in language, but I think it would be ludicrous to restate it as a standard propositional attitude taken toward a proposition. How might one reconstruct the thought in these terms? “I believe that a girl exists and that she is present and that she is lovely, and I desire the continued presence of the girl, etc.” For good reason we do not ordinarily express our ideas in the strung-out mode of predicate calculus. Not only is such a restatement absurdly explicit; it does not even express the statement’s actual content. It fails to reflect the focus of the statement, and it insists on “beliefs” where none may be present. One can imagine the lovely girl, or recall her, or consider the archetype of a lovely girl, without having any particular lovely girl fully in mind.
In my opinion, the nuanced contents of music reflect much of our thinking life more aptly than language does, because music draws attention to various ways of holding mental content in mind. Music, as Lidov puts it, appears to “think out loud.”138 One can meditate upon a motive or theme; one can find its recurrence oppressive; one can linger over it; one can start to see reflections of it everywhere. Focus on propositions may be useful for analyzing the power of language to convey explicit information, but it does not indicate the manner in which a flow of thought occurs. Music can reveal both tonal attitudes of thought and the character of whole streams of thought, whether they are halting or straightforward, meandering or rushing. The pace is a part of the thought, sometimes trivially so, but sometimes importantly. A conclusion drawn hesitantly is not the same conclusion as one articulated in the same words but reached in a leap or as the crest of a volley of ideas. The “language is music” framework can help us attend to these differences and thereby reflect our thought more accurately.
Finally, the “language is music” model can remind us of the contextual layers that modify linguistic meaning, for these are the bases on which musical meaning becomes in any way specific. Feld analyzes the “interpretive moves” that listeners make in the course of relating to a piece of music. These interpretive moves are judgments, involving “the action of pattern discovery as experience is organized by the juxtapositions, interactions, or choices in time when we encounter and engage obviously symbolic objects or performances.”139 In other words, the broader circumstances in which we encounter music, including the many societal decisions about what contexts, occasions, and associations are appropriate to music of a certain sort, affect our sense of its significance.
Among the types of “interpretive moves” Feld envisions are references of particular features of music to locations (understood from a subjective perspective); to categories (which assign the item to particular classes of things, and perhaps subclasses as well); to associations (with what Feld describes as imagery—“visual, musical, or verbal”); to reflections (having to do with “some personal and social conditions [like political attitudes, patriotism, nationalism] and related experiences where things like this can be heard, mediated or live”); and to evaluations (“instantly finding this funny, distasteful, inappropriate, or immoral”).140 Societies (and particular individuals) interpret music in terms of its relationship to locations, categories, associations, reflections, and evaluations relevant to the listener.
These interpretive moves have corollaries in the linguistic case, corollaries that philosophers usually lose sight of. I note a slang word in an utterance, and this leads me to infer where the speaker is from. For example, I judge that someone asking me, “Would you like some brekky?” is Australian (although accent, another musical aspect of language, might make this judgment more definitive). Or perhaps a word pronounced a certain way tips me off, for example, the dropped r in the speech of Bostonians, New Zealanders, or Australians. Certain buzzwords lead me to associate a speaker with certain contexts. A sprinkling of terms such as “propositional attitudes,” “bare particulars,” and “rigid designators” would lead me to associate the speaker with twentieth-century analytic philosophy. “Whatever!” pronounced with a certain exasperated inflection might stir up associations with the San Fernando Valley. It might also lead me to reflect on the oddity that an expression denotatively suggesting the speaker’s deference to another person is actually a retort.141 I might go on to judge the speaker as rather rude or as a hothead on this basis. In My Fair Lady, Eliza Doolittle’s elegant pronunciation of statements in lower-class diction is funny because we relate both the pronunciation and the diction to the circumstances in which each seems appropriate. Only thus are we able to recognize their incongruity. By looking at the way we relate ourselves to music and thereby determine what meanings it has for us, we can become more aware of the way we similarly assess the meanings linguistic utterances have for us, and not just the meanings they have denotatively.
Language is often touted as our great species achievement, as both the product of our intelligence and the way we express this intelligence. Fair enough—but language is not our only aptitude that fits this description. Music is similarly the product and expression of the kinds of minds we have. If we are interested in our species nature, we should not attend to language at the expense of music, for music reveals some features of our minds and souls more clearly than language does, at least when language is understood in terms of entrenched structural models that underplay its musical characteristics. Such models also shortchange the ways in which the meanings associated with linguistic expressions derive from listeners’ (or readers’) interpretive efforts to relate themselves to what is said (or written). To understand our use of language, we would do well to pursue musical models. To understand ourselves, we cannot dispense with music.
Importantly, some of the musical characteristics of language, those involved in prosody, apply cross-culturally, and these same characteristics are evident in music as well. Apart from semantic information, the acoustic codes that are cross-culturally evident in both language and music convey considerable information about attitudes and affect.
Music cross-culturally intimates something further in the absence of semantic information—a sense of a broader world beyond music. I will argue for this claim in the following chapter. I will contend that as a consequence of its biological depth and the salience of its signal to multiple senses, music galvanizes the entire sensorium, the means through which we experience the larger world. One consequence is that music prompts synesthetic associations in those who experience it. Music’s enlivening of our sensory faculties, I will argue, primes us for attending to the larger world that is both our musical and our extramusical environment. Another consequence is our readiness to link music with extramusical content, a tendency that is cross-culturally prevalent. As we shall see, however, this cross-cultural power results in a proliferation of associated meanings that are often not cross-culturally accessible.