What would establish a general basis for the rules of perception, applicable to music as well as languages—and, why not, to the image as well as to sound—is not a miraculous affinity between things but, of course, a similar mental activity when dealing with them. This promising point also promises many difficulties.
Since the perceived object (as an intentional unit) corresponds to a structure (of perceptual experience), we always tend to separate these two aspects: the object, which we think is on one side, and the experience, which we think is on the other; or again, the perceived structure and the constituent activity. We know that this is, in fact, tantamount to ruining the object, forgetting the authenticity of perception. But becoming aware of this experience involves giving ourselves a new object for thought, using a certain distancing from perception in order to examine its mechanism more fully—no longer hearing, but hearing oneself hear. If, in turn, I examine this mechanism, it is by virtue of a structure of reflective consciousness that in turn remains hidden from me . . . and so on, to infinity.
In the same way, at the end of the last chapter we spoke of the embedding of levels of structuring, which form an endless double-linked chain. As soon as I investigate the mechanisms of a perception, I am obliged to refer it to a higher level, in which it appeared to me as an object in a structure, and if I then investigate it in itself, isolated from this structure, it will present itself again as a structure and enable me to identify the objects on the level below. So there is something annoying about the choice of words, since in present-day language it is the word object that seems best suited to describe the grasping of something quite distinct that can be examined at leisure. By a reversal of meaning, this type of object is indeed provided by the structure above it, which allows us to identify it, but its properties, as we have said, are still concealed from us. If we take this object out of the structure to which it belongs, it immediately becomes a structure in itself, and it can only really be evaluated through being resolved into objects belonging to the level below.
If we use two letters to symbolize this double object-structure interplay, we could represent the chain of levels symbolically as follows:
(SO)₁ -------- (SO)₂ -------- (SO)₃ --------
This schema summarizes what we have just been saying: (SO)₂ has been identified as an object in S₃, and it forms the structure for identifying objects at level O₁.
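For readers who find a data structure easier to follow than a diagram, the chain of levels can be sketched as linked nodes. This is only an illustrative model under our own assumptions: the class name `Level` and its methods are hypothetical and are not part of the treatise's notation. Each node is an object for the level above it and a structure for the levels below it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Level:
    """One (SO) link in the chain (hypothetical model for illustration)."""
    index: int
    # Objects on the level below, into which this level resolves as a structure.
    below: List["Level"] = field(default_factory=list)
    # Structure on the level above, in which this level is identified as an object.
    above: Optional["Level"] = field(default=None, repr=False)

    def add(self, child: "Level") -> "Level":
        """Attach a lower level and return it, so that chains can be built."""
        child.above = self
        self.below.append(child)
        return child

    def identified_in(self) -> Optional[int]:
        """Index of the structure that lets us identify this level as an object."""
        return self.above.index if self.above else None

    def resolves_into(self) -> List[int]:
        """Indices of the objects this level yields when taken as a structure."""
        return [child.index for child in self.below]


# Build (SO)3 -> (SO)2 -> (SO)1, following the schema.
level3 = Level(3)
level2 = level3.add(Level(2))
level1 = level2.add(Level(1))
```

In this sketch, `level2.identified_in()` gives the level in which (SO)₂ appears as an object, and `level2.resolves_into()` gives the level whose objects it identifies as a structure. The chain has no natural end in either direction, which a finite sketch can only suggest.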
We should add that a perpendicular chain of mutual reflection (complexity of perceptual structures) may branch off from any level.
(PA)a: Pa representing any of the perceptions represented by one of the pairs SO, Aa the activity giving rise to this perception,
|
(PA)b: which becomes the object of the perception Pb in response to a new activity forming Ab,
|
(PA)c: which in turn becomes, etc.
Is there any need to say that it is easier to get lost in this dimension than in the previous one?
This treatise set out to discover the elementary. Nothing is more ambitious than aiming for this, nothing less certain, these days, than reaching it. At best we can see that levels are embedded and things are reflected in each other. We must check that we are not dreaming.
As we have seen, notes can best be identified in a melody, which, for all that, does not explain the notes. We will suppose that level 3 of complexity is the melodic object-structure. As a structure, it can be explained in terms of level 2 notes, but in its turn it will serve as an object to explain a larger form, phrase, or strophe, a movement from a work, a piece of music, as they say. For it must also be pointed out that the object melody is only noticed within the work by contrast like a figure against a background, or else it is articulated in it along with other melodies, accompaniment, or counterpoint: these melodies are analyzed within the work through a musical sense rather than bringing in significations.1 When we say that the “theme” is recapitulated, reworked or merged with another, this type of identification of level 3 melodies is really based on level 4, whereas the musicologist believes he has explained and described level 4 through the part played on that level by the components of level 3.
But our research is not aiming for these lofty heights of complexity. We have the intuition that the enigma of music, like matter, resides at the opposite extreme: in the smallest significant musical element, the one on which everything will be built from the very beginning.2 Western musicians think it is the note. But what is a note? We can see that, because the question has never been properly put, there are no bases for any explanation.
Our argument could be turned upside down. Once we have managed to explain the note—that is, its structure and the objects on the level below—all we have done is pushed the explanation back. True. So our response will be that what counts is not so much to pry apart one more link in the chain but rather not to consider the note as a terminus. We must unblock the two blind alleys that obscure such an elementary concept: the blind alley of constituent criteria and the blind alley of the perceptual activity on which these are based. All of this goes back, infinitely more than we think, to our individual conditioning and our social conventions.
The (supposed) resistance of some musicians to this sort of awareness is not specific to them. It is shared by everyone who belongs to a system and is accustomed to experiencing it precisely as a value system, considering these values as absolutes, whereas in most cases they are no more than conventions. Hence the moral panic when the values table collapses or is smashed. How can these turnarounds, these breaks in the system, be explained? The meaning of music ultimately remains as impenetrable as the nature of its materials. Each one of us in this respect, and this is his secret, may have his own explanatory system or hypotheses about some way or other of using the conventional signs of music theory and will retain this rather than that, which has an effect on the interpretation of meaning.3 This is true of the words in language and their relationship with ideas: one person knows what the spoken word means, another not at all. How much more true this is of music, because of its implicit nature, and its code, which is both impenetrable and inexhaustible.
We will take a less sensitive example. We can learn good manners from a book of etiquette, which is a music theory of the social code. This sort of book always makes people laugh. Why? Not only because it formalizes and fixes customs that are full of nuances but also because explaining these underlines how arbitrary they are. A code is not learned like this from books; no one would believe in it.
The true code is unconscious. It is no less strict and remarkably detailed. Unconsciously, I bring my behavior into line with it, at the moment when it seems at its most spontaneous. Unconsciously, I apply it to my visitor’s conduct. I don’t say to myself, “He has failed to observe the code”; I say, “He’s vulgar.” I don’t even say, “He’s vulgar,” but I see him spontaneously as vulgar.
How can there be something that is both ideal (since it is a group of conventions separate from the particular acts that conform to it, or any particular evaluations of it) and implicit? We may wish to clarify this. The clarification comes after the event. It is partial. It operates in the form of rules that are given as absolutes: “this is the done thing,” and, above all, “that is not the done thing.”
We may also come across the code by chance. The “new boy” joining a group learns this the hard way, not without playing a number of “wrong notes.” But he assimilates the code directly, without really being aware of it. The one best placed to do this is the one who has experienced enough different environments to have learned about both the relativity of codes and their importance: knowing them to be variable, he will not confuse any of them with the Ten Commandments; knowing them to be binding, he will not attribute to individual traits of character what belongs to the general law.
Having at last observed and compared them, he may ask himself two sorts of questions: either he may wonder about their historical or psychological origins, and the factors that might change them; or else he may consider them at a given moment in time, independently of any value judgment or inquiry into causes, and wonder “how they are made.” This is when we are likely to notice that they are systems, balanced wholes in which, as in perceived structures, changing one element necessitates amending the whole.
We find these various approaches clearly represented and differentiated in the sciences of language: grammar, normative, with its prescriptions and prohibitions; the separation, once this stage is past, between language systems4 and speech, code and conduct, observed by Ferdinand de Saussure: on the one hand, the conventions that enable us to understand one another, and on the other hand, the individual speech acts actually pronounced and heard that derive from them; the separation, in fact, of the study of language systems into two aspects: the study of their development (the historical, diachronic approach) and of their systems at a given moment in time (a synchronic approach).
We can now attempt to draw the parallel with linguistics. Apart from Danhauser’s argument (?) (“music can be written and read as easily as we read and write the words we say”), we have other, perhaps better, reasons for doing this.
1. In no other domain will we see the problem of defining units in relation to structures, and hence in relation to the system and the main intention, presented with such clarity.
2. Like music, language is sound and takes place in time. It is interesting to compare the uses, structures, and perceptions that diverge from this shared basis. It is no less interesting to try to find a viewpoint, beyond these constructs, from which all these can be explored at the same time. We run little risk of getting it wrong if we assume that this viewpoint, if it exists, must be sought at the level of the sound object.
To be complete, our comparison should deal with the “meaning of music.” The structures of language are obviously determined by its communicative function. A definition of musical communication, which immediately appears to be of another kind, would enable us better to understand musical structures through their functions.
Perfectly aware of this dependence, we have simply chosen to proceed from the other direction: the study of its structures, the problem of defining its units, may inform us about the meaning of music. And this indirect approach has the advantage of sparing us from aesthetic dissertations that lead nowhere.
When we listen to speech, how and according to what criteria do we locate units?
At first sight the question appears otiose. The division of speech into sentences and words gives us no problem. Words are separated for us as easily when we listen as when we read, when they are separated by blank spaces. We are, however, happy to agree quite quickly that a foreigner who does not know our system of language would not separate them so easily. His ear will simply not allow him to know whether what he has just heard is two words or three. And we ourselves may hesitate if we do not have enough information about the meaning. “A Frenchman who hears a group such as lavoir will say immediately that it contains two syllables, but he needs to hear it in context to know whether it is two words or one: lavoir (a public washhouse) or l’avoir (to have it). A person who does not know French and who hears a group such as je l’ai vu (I saw it/him) will probably hear the number of syllables it has but will be utterly incapable of telling us the number of words as long as he does not understand the meaning.”5
This, as Saussure observes, means that “language systems do not present as a set of signs delimited in advance, where all we have to do is study their significations and organization. It is an undifferentiated mass where only attentiveness and habit can enable us to distinguish particular components. The unit has no special phonic character, and the only definition that can be given of it is as follows: a chunk of sound which, as distinct from what comes before and what follows it in the chain of speech, is the signifier of a certain concept.”6
But, it seems, this amorphous mass is only so if we try through hearing alone to cut it up into units that are both sound units and meaningful. If we have not been told the significations, it is not surprising that we cannot do this.
Noting that the definition of units that appear so obvious to us, written into the sound itself, is relative to the meaning and our understanding of that meaning, we will continue our research. So what does a foreigner hear?
He hears syllables, as Malmberg has already told us (including, for reasons to be determined, a “probably”). We should add that syllables can be analyzed into phonemes (consonants and vowels). So we may suppose that a foreigner will hear phonemes? Absolutely not. If he applies himself, he will hear sound objects that are only phonemes to us. And if he does not apply himself, he will hear the phonemes of his own tongue, pronounced with a foreign accent.
Here again, we are misled by the written word. The same alphabet serves for English, German, Spanish, and French, although the language systems are different. We naturally assume that an l remains an l in every language system, that an r remains an r, simply admitting that they “are not pronounced in the same way.” And we fall back into the illusion, already criticized by Saussure, that there are “signs defined in advance,” which are organized later. The phoneme is not a reality in itself any more than the word. It is defined in relation to the language system of which it is part. Not only is the limited number of concrete sounds I hear as consonants and vowels, and which combine to form words, not identical from one language system to another, but they are infinitely varied within one single language system:
We do not pronounce a vowel or a consonant in exactly the same way on any two occasions. What surrounds the sound varies every time. The accentuation, the speed of delivery, the register and the qualities of the voice vary from one occasion to another and from individual to individual. There are differences in pronunciation between individuals that can be explained by anatomical differences or individual quirks of speech. Spectrograms show significant differences between men’s, women’s, and children’s vowels.7
Given all this, why and how do we identify these phonemes? Why do they stay the same despite their variations? How is it that we did not even perceive these variations, so much so that we have to use spectrograms to reveal them to us?
And why do we think we hear the same consonant in qui (who—kick) and coup (blow—coo), in tas (heap—tick) and tot (early—tock)?
Spectrograms reveal acoustically different units in the various cases. Palatograms and X-rays show considerable differences in articulation. Why, finally, does a Parisian Frenchman, who pronounces his rs at the back of the throat, immediately identify a word such as rire (laugh) pronounced by a southerner, who rolls his rs? The answer is that the k before i and the c before ou, the masculine and the feminine i, the a after s and the r after l, the rolled and uvular r, are identical from the point of view of their linguistic function. Some features of the sounds in a language are important for identification, others are not. Every vowel and every consonant articulated in a context contains distinctive or relevant features together with a number of nondistinctive or nonrelevant features.8
In other words, the definition of the phoneme is relative to its function in the whole language system. It is “the smallest sound unit needed to discern one word from another.”9 Thus a Frenchman will identify the l in tableau (picture) with the l in peuple (people), whereas the l in peuple is more or less silent, and in tableau it is fully sounded. “For a Welshman, the sounded l and the silent l are two independent units that he will never think the same. The explanation for this is that the consonantal systems in Welsh and French are different. The Frenchman cannot change the meaning of a word by replacing the sounded l with a mute l or vice versa. . . . The two ls are variants of the same phoneme. In Welsh, on the contrary, they are two different phonemes. The difference between the two is relevant.”10
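The relativity of the phoneme to its language system can be made concrete with a small sketch. It assumes, purely for illustration, that each language system can be reduced to a mapping from concrete sounds to phoneme labels. The names `SOUNDED_L`, `MUTE_L`, `FRENCH`, `WELSH`, and `same_phoneme` are hypothetical, and the labels `"l"` and `"ll"` are arbitrary tags rather than phonetic transcriptions.

```python
from typing import Dict

# Hypothetical labels for two concrete sounds (not phonetic transcription).
SOUNDED_L = "sounded l"
MUTE_L = "mute l"

# Each language system maps concrete sounds to the phoneme they realize.
# French: both sounds are variants of one phoneme.
FRENCH: Dict[str, str] = {SOUNDED_L: "l", MUTE_L: "l"}
# Welsh: the two sounds are distinct phonemes (the tags are arbitrary).
WELSH: Dict[str, str] = {SOUNDED_L: "l", MUTE_L: "ll"}


def same_phoneme(system: Dict[str, str], sound_a: str, sound_b: str) -> bool:
    """Return True if the two concrete sounds have the same function in the system."""
    return system[sound_a] == system[sound_b]


french_hears_one_unit = same_phoneme(FRENCH, SOUNDED_L, MUTE_L)
welsh_hears_one_unit = same_phoneme(WELSH, SOUNDED_L, MUTE_L)
```

The same pair of sounds gives `True` under the French mapping and `False` under the Welsh one. Nothing in the sounds themselves decides the answer, only the system to which they are referred.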
So, what seemed to us to be given immediately and even imperiously to perception is indeed given, but to a conditioned, trained perception, which has gradually become very adept at grasping relevant differences, while at the same time being practically deaf to those that are not. So much so that, when we learn a foreign language, we must unlearn to articulate and hear French while at the same time applying ourselves to a different training. This is why the acquisition of a foreign language is easier for a child, who still has the potential capacity to pronounce all phonemes, and a fresher, though as yet unskilled, ear.
This description of the origin of language brings us back to our first chapter. In the sense we have just been describing, and without taking meaning into account, we have acknowledged the deafness of one musical civilization to another, the musical objects of one of them being heard by the citizens of another only as imperfect versions of their own phonemes.
In the same way, we have tried to describe the birth of unconscious musical systems, shaped simultaneously by practice and ear training, which makes the members of a musical civilization so skilled at recognizing features that are relevant (which play a part in the structure) and at the same time makes them practically deaf to nonrelevant features, the former being at the cost of the latter. We can now better measure the power of this training and the whole process of learning needed to unlearn it and hear the music of others.11
As for music theory, we found it in more or less the same state as the grammars of the eighteenth century, codifying the structures that had arisen in the social unconscious after the event and confusing them with the norms of reason, like Lavignac identifying physics with music.
Finally, where the phoneme is concerned, we find the same ambivalence in the “musical note,” which also, like consonants and vowels, is aided by a notation that misleads us, by making us think that, because it is fixed in advance on the score, it is a sign that existed before it was played. Its relevant features are, of course, pitch and duration, which have a functional role in musical structures. Jakobson, moreover, makes this connection explicit in Fundamentals of Language.
“Pitch-duration” values as relevant features have a radically different origin here from the one assigned to them in the previous book. Whereas previously these sound qualities resulted from a comparison of sound objects through reduced listening, with acoustic signals measured in time and frequency, these same qualities (or more precisely similar qualities with the same name) now come from comparisons made through musical listening in the context of a given language. We should not expect the three descriptive systems that are normally conflated—the third system being physics—to coincide nor, however, to be noticeably different. We will therefore keep in mind that the word pitch has different meanings according to whether it refers to physics (signal), reduced listening (sound object), or a cultural phenomenon (musical object).
Thus, listening to phonemes confirms our insensitivity to acoustic variations, which are sometimes considerable. In music, similar experiments have highlighted the no less considerable variations in pitch that an opera singer is likely to produce (see Winckel and Francès). As for ignoring nonrelevant features, we need only point to those clunks that the musician does not hear (the noise of the attack, for example, in a high piano note), whereas objectively (i.e., in reduced listening) they are louder than the tonic sound. This is, moreover, only one example among many. Noise, which is not remembered as a value, is present in all musical sounds, and its discreet presence, in suitable measure, is an indispensable element of sonority.
If, in the musical note, the constituent element of musical structures, we find the equivalent of the phoneme, the constituent element of the speech chain, surely we will find a methodological precedent in phonetics? Perhaps phonetics could give us an example of a theory of verbal objects?
Yes and no. Our approach, at the level of concrete sounds, is indeed the same. The aims of our research, and hence the methods, are different.
(a) Yes, for the phonetician is, in fact, practicing that reduced listening that we are constantly urging our reader to try. The task of the phonologist, faced with a language system unknown to him, will be to locate the phonemes, while identifying their variants: either by consulting the linguistic consciousness—that is, defining units according to what the natives of the country count as the same in spite of variations—or through a general study of the structures of the language system (commutations method). Conversely, the phonetician will make himself listen to the sounds of his own language as if they were unknown to him and notice their variations.
This is, in fact, what we are doing when, making ourselves listen to a musical note with a fresh ear, we consider it as a sound object and in addition to its relevant features, which we call values, find in it many other characteristics (which could perhaps become values in other structures, as a phonetic variant becomes a distinct phoneme in another language system).
(b) No, for phonetics, which is subordinate to linguistics, is by the same token in an ambiguous situation. A natural, experimental science, it studies, using its own methods, objects that it does not define itself but receives already defined from phonology, which is itself a science concerned with relational and differential systems. All the phonetician has to do is record the differences between concrete sounds and phonemes. He is not concerned with the sound object independently of the use made of it by the various language systems. Therefore, to highlight these differences, all he has to do is give a physiological, articulatory description, equivalent to an instrumental and a physical, acoustic description.
Nothing illustrates the ambiguity of his situation more clearly than the way he traditionally classifies sounds, a classification that, according to Malmberg, amounts to a compromise between different principles: “We could say that the traditional classification of the sounds of language is physiological, modified by acoustic or functional considerations. The principle of an articulatory classification has never been fully pursued, and in any case would have led to obvious absurdities. Experienced phoneticians have let themselves be guided by their ear and their linguistic sense.”12
At present, we are feeling the need for an acoustic, more rigorous, classification, but this has not yet been fully achieved. We could, however, object that a classification such as this is equally unwarranted: what is the point of spectrograms when they reveal differences that are eliminated in perception? It is quite obvious that they cannot in any case play a part in any language system. To identify those variations that are remembered in “linguistic listening,” there is no other solution than the ear.
We have seen that the phonetician used and trained his ear even though he did not suggest this practice as an essential research tool. But we must also point out that the phonetician, concerned, like the linguist, with existing language systems and not the origins of possible languages, knows about it indirectly. If a particular variant, which will not fail to be shown up by the spectrogram, is used in another language system as a phoneme distinct from the first one, it is because this variant was also a clearly perceptible variation. Does he need to know more?
(c) The dependence of phonetics on linguistics has another consequence: the phonetician—except to note important individual variations in the articulation of a particular phoneme—is only really concerned with variations that are general enough to lead to modifications of the language system: those that are “combinatory,” owing to the position of the phoneme in relation to the surrounding phonemes, and common to everyone; those that denote contrasting categories (men and women) or groups (regional accents), historical periods (changes in pronunciation, when they run counter to its oppositional system, leading to a transformation in the language system, which creates another system), or different language systems. Where laws are concerned, the level of generalization attained in linguistics is sought in phonetics in statistical regularities. There is no doubt that if linguists single-mindedly pursued that linguistics of speech13 (individual discourse), desired by Saussure, to complete their study of language systems (a shared treasure), phonetics would feel the repercussions, take on a new importance, and perhaps change its methods.
These references to the ideas of the pioneer of general linguistics, and this rather brief summary of the major distinction between language systems and speech, are likely once again to antagonize two groups of readers. To some, our parallels with linguistic disciplines in which they are specialists will appear cursory. To others they will seem esoteric, complicated, perhaps pointless. This is the abiding problem of interdisciplinary research: finding its material wherever it can but finding itself constantly at odds with different levels and types of expertise.
Our reasons for involving language were given in the preceding section and in section 16.5. The (musician) reader will not perhaps find the results very spectacular. What, all this fuss because a (distinctive, relevant) phoneme has the same (linguistic) function, despite its different (phonetic) variants? Is that all this discipline can contribute to music?
Important discoveries and meaningful comparisons cannot be measured by this yardstick. We know there are discrepancies, often very slight, between laws and measurements in physics, which lead to the questioning of a whole system, which then proves to be unsound at the outer limits. We believe this is what is happening here. We notice that the definition of musical values raises not only the strange question of anamorphoses, which we discussed in the last book, but also the question of social obfuscation and mass conditioning. As we said in section 16.8, we are dealing with three value systems: the values of a musical language system, physical measurement, and the laws of perception in reduced listening.
This means that by taking this detour we come back to the three listening intentions, for the meaning, the event, and the sound object, schematized in chapter 8 (fig. 2).
Two of these, as we know, go beyond the sound object, while using it as a vehicle for meanings or a bearer of indicators. So the sound object, ignored, becomes the speaker’s linguistic object, just like the musician’s musical object, even if it has a natural origin or carries the marks of use and the secondary significations related to civilizations. The sound object, equally ignored for the sake of the event, becomes the horse, then the Native North American, the timbre of a beloved voice, or again the physical object. So there are many ways of using sound as a sign (language, Morse code, onomatopoeia, speech intonations). In addition we should not be surprised to find sound used in all manner of ways for analyzing some revealing property (it covers the whole gamut from primitive man to the scholar, from the ordinary user to the specialized practitioner).
Finally, there is the sound object itself, all the more ignored as it has been used to signify so many things or reveal so many others. The event that it is, the values it carries within itself, these are what our hearing intention ultimately targets. If the reader has figure 2 at hand, he will have recognized the plan of our three-part epilogue: at the top on the left the search for meanings (this word, used in a more general sense, includes the system of traditional musical values), at the top on the right the search for events, and, finally, at the bottom, with its two poles reversing the former line of questioning, and going back to the sound object, the apprehension of the object through reduced listening.
Since we have devoted a whole book to a comparison between the physical and the sound object, there should be no further need to spend time here on the boundaries between what comes from the physical event and what is perceived as sound, or music, or the spoken word. But underlying every sound object there is still an event (a musical instrument is being played, someone is playing a particular instrument, a particular language is being spoken, or someone is saying these words), which always stops us from ignoring this natural pole and this in all three possible cases: when all our attention is focused on the event, which is a particular musical or dramatic mode of listening (to the instrument or the performer), when all our attention is focused on the meaning (thus it happens that, with a language system, the speaker is ignored, how can this be possible?), or finally when our attention, focused on the object of reduced listening, uses what it knows of the event, and even the meaning, to reach a better understanding of how the object is made and what value it has. So this leads to a plan that should elucidate each of these three systems concerned with objects of hearing: the cultural system of words or notes, the system of natural sound events, and the system of reduced listening. We will, in fact, devote chapters 18, 19, and 20 to this. First, though, we must finish our comparative examination of linguistics and the musical. For we have not resolved a central enigma: how the science of language can restrict itself to a study of language systems, ignoring speech; and why, in music, a similar methodological bias would not be effective, except in exceptional circumstances when the musicality is minimal enough to do without sonority?