We are all guilty of old habits: hearing with our eyes and wanting to understand before perceiving with our ears.
If we attach both ends of a string to a Greek zither or the soundboard of a Pleyel piano, we know perfectly well that these two points will in future remain fixed and that the string cannot vibrate without one, two, three antinodes, thus tracing out either the half-wave of the fundamental, the full wave of harmonic 2, or the three half-waves of harmonic 3, and so forth. These vibration shapes taken on by the string and corresponding to a given sound, compared with the shapes of other strings of a half, a third, and so forth of the length and giving precisely the same harmonics 2, 3, etcetera, clearly lead to these basic acoustic-musical observations: a sound “contains” other sounds, and these sounds are in simple (harmonic) relationship with each other. By adding strings to each other or piercing holes in vibrating pipes, we repeat the phenomenon of the antinodes, and the whole of traditional music appears as the interaction of various possible and intertwined multiplication tables.
If our musician tends so easily to forget his ear in favor of arithmetic, it is doubtless because his ear has perceived so logically that his eye alone is enough: all these strings—two, three, four times the length of each other—cause us to hear sounds that are so alike, except in the tessitura, that they are notated by the same name: they are all dos or res and so forth, the repetition of the initial sound one, two, three octaves higher. . . .
Moreover, the third harmonic (three antinodes of the string) forms a particularly striking interval with the second (two antinodes), noticed by all musical civilizations, which is in the fraction 3/2 and can be read either as “three antinodes over two” (in terms of string length) or “three vibrations over two vibrations,” if we take the vibration of the fundamental as the unit.
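This multiplication-table arithmetic can be sketched in a few lines; a minimal illustration (the 440 Hz fundamental is an arbitrary assumption — the ratios do not depend on it):

```python
# Harmonics of a vibrating string: n antinodes give frequency n * f0.
f0 = 440.0  # assumed fundamental (A4); any value would do
harmonics = {n: n * f0 for n in range(1, 6)}

# Harmonic 3 against harmonic 2 gives the ratio 3/2 (the fifth),
# whatever the fundamental; harmonic 2 against 1 gives the octave.
fifth = harmonics[3] / harmonics[2]
octave = harmonics[2] / harmonics[1]
print(fifth, octave)  # 1.5 2.0
```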
Is it our aim to work from the opposite direction? Coming to the rescue of those of us who have complained that we have not seen prose composition and translation brought together, Helmholtz brings the intuition of his genius, the inventiveness of his famous experiments and, what is often forgotten, an exceptional “ear.” We will go back to those sources, for they are well worth the trouble.
Helmholtz has a series of resonators, each one tuned to one of the harmonics of the fundamental to be analyzed. They are a series of spheres, with two holes in each one. The first receives the sound and conveys it into the sphere, where only some frequencies can be propagated; if one of these was present in the initial sound, it will be selected to come out of the sphere (second hole); otherwise, the acoustic energy will be dissipated as heat. The resonance frequencies are determined by the geometrical and physical characteristics of the resonator, such as the volume, the diameter of the holes, the temperature, and so forth. Provided these are known, we can of course construct a whole “harmonic” series of spheres.
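The dependence of the resonance frequency on the sphere's geometry can be sketched with the standard lumped-element approximation for a Helmholtz resonator; the dimensions and the speed of sound below are illustrative assumptions, not the measurements of Helmholtz's actual apparatus:

```python
import math

def helmholtz_frequency(volume_m3, neck_area_m2, neck_length_m, c=343.0):
    """Lumped-element approximation: f = (c / 2*pi) * sqrt(A / (V * L_eff)),
    with a common end-correction folded into the effective neck length."""
    r = math.sqrt(neck_area_m2 / math.pi)  # neck radius
    l_eff = neck_length_m + 1.7 * r        # rough end-correction estimate
    return (c / (2 * math.pi)) * math.sqrt(neck_area_m2 / (volume_m3 * l_eff))

# Illustrative resonator: 1-litre cavity, 1 cm opening, 1 cm neck.
f = helmholtz_frequency(volume_m3=1e-3,
                        neck_area_m2=math.pi * 0.005 ** 2,
                        neck_length_m=0.01)
print(round(f))  # on the order of 100 Hz for these assumed dimensions
```

A larger cavity lowers the resonance, a wider opening raises it, which is why a whole "harmonic" series of spheres can be constructed once the geometry is controlled.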
Stimulated by a rich sound of clear pitch adjusted for sphere no. 1, each of these spheres selectively takes up one of the harmonics of the sound. If you put your ear next to the hole, you will clearly perceive the selected harmonic. Play another sound, of the same pitch, but on another instrument—that is, with a different timbre—and the resonators make a similar selection but with a different proportion of harmonics. Besides, Helmholtz will say, apply yourself, work hard at listening to a sound, and you may manage to perceive its harmonics directly: with a little practice. . . .
Several generations of musicians have accepted not only Helmholtz’s theory but his advice as a teacher of music theory: they have “heard the harmonics.” I rest my case. . . .
Moreover, Helmholtz builds on his advantage. The same property of the ear to distinguish harmonics explains, according to him, the ability to listen to simultaneous sounds. We quote:
The movement of air in the auditory canal has no property whereby composite musical sound (coming from several vibrating bodies) can be distinguished from simple musical sound (coming from one vibrating body). Unless the ear is guided by some accidental circumstance, for example one tuning fork beginning to vibrate before the other, so that we can hear them being struck, or perhaps the sound of air against the embouchure of a flute or the slit in the organ pipe, it has no means of deciding whether the musical sound is simple or composite.
Now, how does the ear behave in relation to this movement of air? Does it analyze it or not? Experiments show that when two tuning forks with pitches an octave or a twelfth apart are set in motion together, the ear is entirely capable of distinguishing each simple sound, although it is a little more difficult with these intervals than others. But if the ear is capable of analyzing a composite musical sound produced by two tuning forks, there is no reason why it cannot operate in the same way when the same movement of air is produced by a single flute, or a single organ pipe. And this is indeed what happens. The single musical sound from such instruments, coming from a single source is, as we have already pointed out, analyzed into partial simple sounds, that is, in every case a fundamental sound and a higher partial, the latter different each time.
The analysis of a single musical sound into a series of partials consequently depends on the same property of the ear that enables it to distinguish different musical sounds from each other, and it must necessarily carry out both analyses in accordance with a rule that is independent of whether the sound wave is coming from one or several instruments.
The rule that governs the ear in its analysis was first formulated as a general rule by G.S. Ohm: only the particular movement of air that we called simple vibration, in which the particles move backward and forward according to the law of pendular movement, is capable of giving the ear the sensation of a single simple sound. Therefore, all movement of air arising from a composite group of musical sounds can, according to Ohm’s law, be analyzed into a number of simple pendular vibrations, and each of these simple vibrations is associated with a simple sound, locatable by the ear, and with its pitch determined by the duration of the associated movement of air.1
Helmholtz, however, modifies such categorical assertions. So his words are those of a physicist rather than a psychologist: “The question is very different,” he says,
if we set out to analyze less common examples of perception, and to understand more fully the conditions under which the above distinction can or cannot be made, as we do with the physiology of sounds. We observe then that there are two pathways or levels when we become aware of a sensation. The lower level of this awareness is when the influence of the sensation in question is only felt through the concept of external things or processes we create for ourselves, and that help us to identify them. This can happen without our needing to or even being capable of recognizing which element of our sensations a particular link between our perceptions relates to. In this case, we would say that the expression of the sensation in question is perceived synthetically. The higher level is where we immediately identify the sensation in question as a real part of the sum of the sensations within us. In this case we would say that the sensation is perceived analytically. The two situations must be carefully distinguished from each other.
Seebeck and Ohm agree that the higher partials of a musical sound are perceived synthetically.2
We can see that a physicist of genius, even in acoustics, is not so easily misled. This text already sketches out the fundamental difference between the physicist’s and the musician’s mode of listening. It is to be expected that Helmholtz’s “higher” level should belong to the former and not the latter. But we should pay tribute to Helmholtz’s reservations, which foreshadow the reversal of positions we are advocating:
What is more, the sound of most instruments is usually accompanied by characteristic irregular noises, like the scraping or rubbing of the bow on the violin, the flow of air in the flute and organ-pipes, the vibration of reeds, and so forth. These noises, already familiar to us insofar as they characterize the instrument, physically facilitate our ability to identify them in a composite mass of sounds. Partial sounds in a compound sound do not, of course, have the same characteristic features.
We therefore have no reason to be surprised that the resolution of a compound sound into its partials is not as easy for the ear as is the resolution of a composite mass of musical sounds from many instruments into its immediate constituents, and that even a practiced musical ear needs to apply itself carefully when it is trying to resolve the first of these problems.
It can easily be seen that the secondary circumstances we have mentioned do not always lead to musical sounds being correctly separated. In uniformly sustained sounds, one can be considered as the higher partial of another, and our judgment may very well be led astray.3
The physicist having harmoniously taken over the baton from the musician, we could expect the mathematician, in his turn, to take over from the physicist.
We may indeed easily “explain” the quivering of strings, despite the extreme complexity of the patterns they make all the time, but we have more difficulty in explaining what happens with the eardrum: it deals only with the air pressure that carries the energy from the “acoustic field” created by the movement of strings or reeds. We know, for example, that if we replace the eardrum with a microphone connected to a cathode ray tube, which displays the pressures on the membrane, we can see a spot of light moving around in an apparently highly disordered way. When recorded, this movement creates an absolutely incomprehensible oscillograph line on the time axis: it is indeed the outline of the sound, but it is illegible.
With Fourier, everything becomes easy again. Fourier showed us how to “analyze into series” the most complicated function, even the one that gives, as its single value for each instant t, the pressure or elongation of the eardrum.4
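As a sketch of what "analyzing into series" means in practice, the following builds a periodic pressure signal from three assumed harmonic amplitudes and recovers them by projecting onto sine components — a hand-rolled discrete Fourier analysis for illustration, not any historical procedure:

```python
import math

# A periodic "eardrum" signal over one period: a fundamental plus two
# harmonics with assumed amplitudes (illustrative values).
N = 1024
amps = {1: 1.0, 2: 0.5, 3: 0.25}  # harmonic number -> amplitude
signal = [sum(a * math.sin(2 * math.pi * n * k / N) for n, a in amps.items())
          for k in range(N)]

def fourier_sine_coeff(x, n):
    """n-th sine coefficient: projection of x onto sin(2*pi*n*k/len(x))."""
    m = len(x)
    return 2.0 / m * sum(x[k] * math.sin(2 * math.pi * n * k / m)
                         for k in range(m))

# The assumed amplitudes come back out; absent harmonics give ~0.
for n in range(1, 5):
    print(n, round(fourier_sine_coeff(signal, n), 6))
```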
The different terms of Fourier’s series are sinusoidal, “pendular” vibrations, that is, functions that accurately represent the sounds detected by Helmholtz’s resonators. So now, to perfect the scientific explanation, we only have to imagine an internal ear mechanism that behaves like a series of resonators to reach the conclusion that the ear hears sounds by analyzing them into Fourier series. This is effectively what Helmholtz thought and what he tried to prove, though he surrounded himself with many caveats. It is regrettable that some hasty readers have forgotten his warnings and built a general theory of musical listening that, in addition to gross errors and unwarranted simplifications, contains even more serious methodological errors. These are Helmholtz’s wise words:
Fourier’s theorem, shown here, demonstrates first that it is mathematically possible to consider a musical sound as a sum of simple sounds, in the sense we have given to these words, and, indeed, mathematicians have found it convenient to base their acoustical research on this way of analyzing vibrations. But it certainly does not follow from this that we are obliged to look at things in this way. Rather we should be asking: do these partial components of musical sound, demonstrated by mathematical theory and perceived by the ear, really exist in the mass of air outside the ear? Surely this method of analyzing vibrations, stipulated and made possible by Fourier’s theorem, is simply a mathematical fiction, good for helping with calculations, but not necessarily having any real meaning for things themselves? Why do we find that pendular vibrations, and not others, are the simplest elements of all sound-producing movements? A whole can be divided into parts in all sorts of very different and arbitrary ways. Thus, it may be convenient for a particular calculation to think of the number 12 as the sum of 4 and 8, because the 8, for example, is relevant elsewhere, but it does not follow that 12 must always and necessarily be thought of as the sum of 4 and 8. In a different situation, it may be more convenient to think of 12 as being 7+5. The mathematical possibility of breaking down all periodic vibrations into simple vibrations as demonstrated by Fourier does not further authorize us to conclude that this is the only permissible form of analysis, if we cannot also establish that it has an essential significance in nature as well. That this is in fact the case (that this form of analysis has meaning in nature independently of theory) is made probable by the fact that the ear carries out precisely the same analysis, and also because it happens that, as we have already mentioned, this sort of analysis has more advantages for mathematical investigation than any other.
The approaches to phenomena that are closest to the innermost composition of the matter under consideration are of course those that also lead to the fullest and clearest theoretical presentation. But it is not appropriate to begin the investigation with the functions of the ear, given their great complexity, and the explanations they themselves require. This is why we shall try to find out if the analysis of complex into simple vibrations has any real perceptible significance, independently of the action of the ear, in the external world, and then we shall be fully in a position to show that certain mechanical effects depend on the presence or absence of a particular partial sound in a composite mass of musical sounds. The existence of partial sounds will find a meaning in nature, and in return the understanding of their mechanical effects will shed a new light on their relationships with the human ear.5
Going back to the sources, while giving us the opportunity to pay well-deserved tribute to a great physicist without entering into disagreement with him, also shows us the research pathway that has been discontinued since he passed away, at least as he apparently envisaged it: two ways of hearing, on two distinct levels, and in different ways. In one of them the ear is conditioned as much as “the objects are prepared.” This is experimentation in sensory physics, in “analytical” audition. The other is a total act of listening with those “auxiliary circumstances,” as Helmholtz says, that make it easier for the musician to identify sounds (we would say sound objects).
Keeping our thoughts away from Helmholtzian caveats, what could we say today about the perception of pitch? Even before we turn to the most recent acoustic theories, we should start with some commonsense observations:
1. There is no doubt that the “structure of the signal” is harmonic and can be broken down into the Fourier series, and there is even a probability that in the first stage the physiological ear behaves like an analyzer.
2. Between this and saying that we hear distinct “harmonics,” there is something of a difference. That the observer, his ear glued to Helmholtz resonators, hears them, there is no doubt. And it has been seen that he then rehears the sound, convincing himself that he is separating the harmonics out one from another. It is already more likely that he “separates out” the harmonics of a sound as they are dying away, as harmonics are formed in the course of the duration of the sound. Nothing is very certain in these rather convoluted musicianship exercises. And this is fortunate: if a practiced ear managed to distinguish all the harmonics in this way, we should point out that its musical listening would very rapidly be seriously disturbed. It would no longer separate the clarinet from the oboe, or the violin from the cello . . . or again, in a piano, string, or woodwind chord, especially a consonant chord, it would no longer identify the notes, which have so many harmonics in common. None of this has been borne out; quite the contrary: the whole package of harmonics seems to be fully merged with the sound.
3. When, as (musicians’ and acousticians’) “common sense” suggests, we link the sensation of pitch to the number of vibrations, we forget the object—that is, what actually comes to the ear—and focus entirely on the visible signal, whether on the zither, the oscillograph, or the mathematical expression in the Fourier series. There is, on the zither, one string that is longer than the others, which of course vibrates less rapidly, and which in Fourier’s series is a first term designating the fundamental. But what is going on in the ear? Doubtless, we are tempted to say, the analysis of a group of harmonic frequencies, each one allocated a coefficient. But if, for example, the zither “hardly” vibrates in the fundamental, that is, if the first term of the Fourier series has a very small coefficient, will my ear continue reacting quantitatively to that first term, that fundamental? If this were the case, depending on whether a string is attacked with timbre or not, we should have different perceptions of pitch depending on whether the first, second, or third harmonic dominated. Now, we observe that the ear nearly always assesses pitch in relation to the fundamental, whether this is physically strong or weak, as if it were going back to a sort of “first reason” for spectral data.
We should therefore conclude that the ear does not hear the fundamental but infers the fundamental by perceiving the harmonic network—that is, its internal correlations.
It is time to take a closer look, using two essential experiments.
This is the reverse of Helmholtz’s experiment. His experiment untangled the partials by analyzing “a rich sound.” If we take several partials, will we be able to recompose a fundamental?
Schouten’s experiments throw into confusion any idea of a simple relationship between perceived pitch and the physical presence of fundamentals. In fact, if we listen to three or four equally spaced frequencies in the high register, we perceive, in general, a low pitch. For example, the group 1800, 2020, 2240 Hz will have the pitch of a fundamental sound of 220 Hz (A3), called the “residual sound” of the three initial frequencies: so we have a complex of sounds in the high register with a (perceived) low pitch. If we now observe that this low pitch is sometimes very obvious, sometimes unclear, and that this depends on the phase relationship between the three high-register harmonics, and also the distance between them and their intensity; and if we add that the residual sound is not a phenomenon arising from the nonlinearity of the ear and that, putting aside the questions of phase, it appears in a very specific area in the field of audibility that depends on intensity and frequency; and if we notice, finally, that it sometimes gives rise to the perception of several low pitches—then we believe we have given some idea of the number of problems that must still be resolved before we arrive at a simple theory of the perception of pitch, or even simply of residual sound. We should keep this in mind: our present understanding of psychoacoustics rejects any direct link between frequency and pitch, even if this is often a useful way of presenting them. Rather, we are inclined to see pitch as a perception that depends on both the frequency and the periodicity of a sound. (It is understood that here the frequency of a sound is the frequency of the lowest harmonic of the sound that actually contains energy; the periodicity, on the contrary, is determined by the frequency of the theoretical fundamental harmonic of the sound, independently of questions of energy.)
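A simple trigonometric identity gives some intuition for Schouten's numbers: three equally spaced components behave like a 2020 Hz carrier modulated by an envelope that repeats 220 times per second, which is where the residual pitch is heard. A numerical check of the identity (the sampling instants are arbitrary choices):

```python
import math

fc, df = 2020.0, 220.0  # Schouten's example: components at fc - df, fc, fc + df

def three_tone(t):
    return sum(math.cos(2 * math.pi * f * t) for f in (fc - df, fc, fc + df))

def carrier_times_envelope(t):
    # cos(a - d) + cos(a) + cos(a + d) = cos(a) * (1 + 2*cos(d)):
    # a 2020 Hz carrier under an envelope repeating every 1/220 s.
    return math.cos(2 * math.pi * fc * t) * (1 + 2 * math.cos(2 * math.pi * df * t))

# The two expressions agree at every instant.
for k in range(100):
    t = k / 44100.0  # arbitrary sampling instants
    assert abs(three_tone(t) - carrier_times_envelope(t)) < 1e-9
```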
For example: a sound complex formed by the superimposition of two sinusoidal sounds of 200 and 250 Hz will have a frequency of 200 Hz and a periodicity of 50 Hz. The periodicity is therefore determined by the greatest common divisor of the harmonics of a sound; normally, of course, the two concepts coincide, for in sounds there is nearly always energy at the frequency of the fundamental harmonic.
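For integer-valued component frequencies, the periodicity is thus their greatest common divisor; a minimal sketch reproducing the example above:

```python
from math import gcd

def periodicity(freqs_hz):
    """Greatest common divisor of the component frequencies:
    the frequency of the theoretical fundamental harmonic."""
    g = 0
    for f in freqs_hz:
        g = gcd(g, f)
    return g

print(periodicity([200, 250]))       # 50: the text's example
print(min([200, 250]))               # 200: its "frequency" (lowest component)
print(periodicity([100, 200, 300]))  # 100: harmonic case, the two coincide
```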
Equally, there seem to be two mechanisms in the ear: one peripheral (the internal ear), which analyzes from the frequency, activating a spatial analysis of the sound vibration in the cochlea, where the different frequencies are perceived in relation to their own energy; the other central (nervous), which seems to be linked to the periodicity and not the energy of the sound vibration. These two mechanisms coexist and are interpreted by means of schemas that are not yet entirely clear. We do not have unanimous agreement on this model, however, and, besides, it does not explain every phenomenon. . . . Furthermore, the rules that claim to “find out” the perceived pitch of a sound from the physical knowledge of the signal are complex, often vary from one person to another, and, in a universe of nonharmonic sounds, are nearly always unknown.
For our part we wanted to do a similar experiment on the relationship between pitches and frequencies, taking the perception of unison as our assessment criterion. Our approach is broadly as follows:
The mathematical analysis of a sound A shows that it can be broken down into a number of sinusoidal vibrations with frequencies f, 2f, 3f, and so forth. If, furthermore, we consider the sound A′, containing all the components of A except the first, f, which is taken out by filtering, A′ will have the structure 2f, 3f, . . . The fundamental of A′ is 2f, whereas the fundamental of A is f; but the periodicity is still f, the same as the periodicity of A. If the (perceived) pitch depended only on periodicity, A and A′ would be permanently in unison. In fact, the findings are as follows:
(a) There is indeed unison between A and A′ for low sounds.
(b) But as we go up the register of pitches, the filtered sounds (A′) are more and more perceived as an octave higher than the original sounds (A).
But we were able to go further and carry out a second group of observations using the particular methods we had introduced for the previous experiment: for practical reasons we had, in fact, decided to compare orchestral instrument sounds (sounds A, unfiltered) and sinusoidal sounds S of the same frequency f. The results were quite unexpected:
(a) There is indeed unison, as may be expected, between the sounds A and the sounds S, in the medium and high registers; but,
(b) as we go further down the register, the listener tends more and more to perceive unison between an orchestral instrumental sound A with a frequency f and a sinusoidal sound S with a frequency not of f but of 2f. In other words, unfiltered instrumental bass sounds are perceived an octave higher in relation to the sinusoidal sound that has the same frequency as their fundamental.
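The construction of A and A′ can be sketched numerically: removing the fundamental's component changes the spectrum but leaves the waveform's period at 1/f. The fundamental below is an arbitrary assumption:

```python
import math

f = 100.0  # assumed fundamental frequency, Hz

def wave(t, harmonics):
    return sum(math.sin(2 * math.pi * n * f * t) for n in harmonics)

def A(t):        # the full sound: components f, 2f, 3f, 4f
    return wave(t, (1, 2, 3, 4))

def A_prime(t):  # A' in the text: the fundamental filtered out
    return wave(t, (2, 3, 4))

# Both waveforms still repeat with period 1/f: filtering out the
# fundamental changes the spectrum, not the periodicity.
T = 1.0 / f
for k in range(50):
    t = k / 12345.0  # arbitrary sampling instants
    assert abs(A(t) - A(t + T)) < 1e-9
    assert abs(A_prime(t) - A_prime(t + T)) < 1e-9
```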
We should note, however, that the respective calibrations of the orchestral instrumental sounds and the sinusoidal sounds are coherent in themselves, which means, on the one hand, that a bassoon C1 will not be confused with a C2 nor, on the other hand, will a 50 Hz and a 100 Hz sinusoidal sound. So we must allow that there are several calibrations for the perception of pitches, which coincide in the medium and high registers but diverge in the low register, although each is perfectly coherent in itself.
Here again we could refer to the two mechanisms mentioned at the end of the last section to explain these results. Whatever the explanation may ultimately be, the phenomena observed seem to us to amount to a serious warning not to be too eager to transfer from one field to another concepts that have the same name but turn out to refer to different realities (for more detailed information on these experiments see the appendix at the end of the chapter).
Our normal “musical values,” the foundation of our musical intervals, operate within the framework of a tonal music or at least a music with highly defined degrees, ultimately almost conventions, or at least the result of “specialized” training. The situation is quite different for the experimental psychologist, for whom “interval” designates a perception of “a space” between two pitches, which is conveniently represented by the psychoacoustic unit called a mel. So, from the experimental psychologist’s point of view, a fifth or a third in the bass register is much tighter than in the middle register: it has a smaller number of mels. The reader will perhaps then ask how it is that musically all fifths are identical, since from a subjective point of view they are not. Without claiming to answer this question, we will, in order to make things clearer, simply illustrate how the calibration in mels was constructed by Stevens, based on sinusoidal sounds (fig. 3).
FIGURE 3. Scale for assessment of melodic pitches according to Stevens and Volkmann.
A sound is presented to a listener periodically; during the periods of silence he has to tune an oscillator to a frequency such that the pitch of the sound it gives is in a certain relationship (1/2, 2, or 4, 10, or 1/3, etc.) with the test sound. This process is repeated many times: for different values in the relationships, for different initial test sounds, then closer and closer together to get a succession of responses, and finally, of course, with different participants. Then all the results are statistically analyzed, and the most probable values are used to construct the mels curve (subjective units of pitch obtained from judgments about relationships) in relation to frequency. So mels are based on at least three working hypotheses, the validity of which is demonstrated only by the consistency of the final results:
Hypothesis 1: the concept of pitch always relates to some subjective magnitude of the perceived sound: it is not necessarily the same from one person to another, but everyone has at least one criterion, a sort of unique perception, which for a certain class of stimuli he links to the word pitch.
Hypothesis 2: the expression “half” or “double” or “one-third” pitch, even if it refers to a mathematical relational concept, which is a nonintuitive abstract concept, is a judgment criterion for everyone that has proved to be stable through time and consistent, whatever the initial sound.
Hypothesis 3: so there is for everyone, even if the reason for it is not understood, an idea, quite subjective it is true, of a yardstick for pitch, in relation to which judgments of half, a third, and so forth can be made.
We could question these hypotheses. In any case it has been possible to set up an experimental scale of mels, which coincides with the harmonic scale only in a limited portion of the register. Does this amount to proof of a divergence between the objects of scientific and artistic study, or are we dealing with two distinct phenomena, equally important for music, depending on the experimental conditions and the way the objects for listening are used? We will return to this enigma.6
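For readers who want to compute with the mel scale, a commonly used analytic approximation (O'Shaughnessy's formula, assumed here for illustration; Stevens's original scale was an experimentally tabulated curve) shows why a fifth in the bass spans fewer mels than the same interval higher up:

```python
import math

def mel(f_hz):
    """O'Shaughnessy's analytic approximation to the mel scale
    (an assumption here; not Stevens's original tabulated data)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def fifth_width_mels(f_hz):
    # A just fifth spans the frequencies f .. 1.5 * f.
    return mel(1.5 * f_hz) - mel(f_hz)

low = fifth_width_mels(100.0)   # a fifth in the bass register
mid = fifth_width_mels(1000.0)  # the same interval in the middle/high range
print(round(low), round(mid))   # the bass fifth spans far fewer mels
```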
Some composers, a little too naively smitten with the acoustics of music, are tempted to use uncritically all research that comes from science and, in particular, acousticians’ work on thresholds. This makes them think that the area of audibility is broken up into as many intervals as differential listening to pure frequencies by the ear will allow (cf. Moles’s “slabs of sensation”). We will attempt to clarify the concept of differentiation threshold and to highlight the wrong use of it that is likely to be made in music.
These thresholds have been accurately measured by audiometry, under carefully controlled conditions: a slight change in frequency at a certain level, without any other sound or masking. As soon as a musician uses a natural or a synthetic sound, he must know, as we have seen, that he is mobilizing not only the nominal frequencies on his score. On the one hand, he is providing an object that is already frequency-rich, affecting a whole zone of differentiation thresholds. On the other hand, this sound is with other sounds, which produce a masking effect, and consequently tend to disrupt any predictions that might be made from a simple mathematical hypothesis. But in the calculations on noise, based on elementary stimulus curves, Fletcher and other researchers have had remarkable results; selecting their field of reference on each occasion, they studied the effect of raising or lowering the thresholds when the sound is continuous or discontinuous. The musician, quite overwhelmed by such calculations, is advised to turn to the nearest and also the most accurate electronic machine in music: his ear.
Then he will observe this: if we must take the concept of threshold into account in music, we will probably have to look for it at the two extremities of a “polarization” of the ear induced by the context. Indeed, disrupt the ear with huge gaps in pitch and intensity or the effects of large numbers of objects: perceptions of slight differences in pitch will blur. Train it to perceive more and more subtly, during a pianissimo, with a huge reduction of objects: it will then perceive tiny variations.
Moreover, learning to listen can teach us to hear objects or nuances within objects better. It is with this in mind that another experimenter, Heinz Werner, has attracted our attention. He has his listeners listen five consecutive times to a group of two tonic sounds (i.e., with highly defined pitch) that are very close together in pitch.7 The interval formed by these sounds is markedly smaller than the minimum interval normally perceived by the ear (estimated as a twentieth of a tone in the register under study); so it is not perceived on the first time of listening. But repetition makes it become more and more apparent, until it appears as a well-defined interval. Not only is this experiment simple and convincing, but it is also of fundamental importance, for it gives the lie to theories that fix the norms of listening as an absolute, without taking into account either context or training.
More generally, we should observe that, when the physicist habituates his ear to perceive a stimulus, he behaves no differently from the musician seeking to make a subtle structure perceptible: he must lead his ear to it and not violate it by imagining that, like a printing machine, it will regurgitate a whole network of curves or pour out its “slabs of sensation” with no problem. . . . So perception depends on context. Hence an instrumentalist will instinctively remember Pythagoras’s scale to add an accidental to a note, but also, above and beyond this, the intervals will be more or less “right” in relation to the “vectors” of the music that contains them, as Robert Francès has so clearly demonstrated:
If we take as our base line the tempered accuracy of a piano and lower the pitch of two of its notes, we can predict that this change will be felt less by the listener when these notes form part of a structure in which, as with the tendency described above, they have descending vectors, than the converse. So, for example, if the note is in both a descending and an ascending appoggiatura phrase, the lowering of the note objectively will be better tolerated (i.e., less noticed) in the first than in the second; in the former the lowering is in keeping with the harmonic vector of the note, but in the latter it runs contrary to it.8
Furthermore, we wonder whether the tuning of the piano, in the high and bass registers of this instrument, is not done tendentiously to satisfy musical imperatives, as if to force the ear to perceive as lower or higher sounds whose pitch is in truth very confused at the outer zones. R.W. Young gives a much more scholarly explanation of this fact.9
We could also think of dodgy horn notes, this time beyond the limits of tolerance, but whose charm induces smiles and indulgence in the listener. We should note that this is nothing to do with the dialectic of the musical discourse but the difficulties of the instrument maker.
After these various experiments and observations, a conclusion is called for: the concept of pitch, far from being obvious and linked, as people say, to the frequency of the fundamental, is complex and plural.
It is time to sum up what we have learned, going from the most complex to the most elementary,10 from the musical to the physical.
(a) Instrumental calibrations (register of given instrumental sounds)
Here we mean the musician’s sound, and not the heterodyne sound of the acoustician. Musicians have keyboards, valves, fingering, and so forth, which produce notes that are fixed or at least agreed on in advance. These registers, tuned as well as possible to a “temperament,” are used, in the case of the piano and keyboard instruments, melodically and harmonically. The instrument lends itself just as well to either use, although the ear makes a fundamental distinction between them: for the ear it is not at all the same thing to hear two pitches together (chord) or one after the other (melody). It is to be noted, as we have seen, that the confusion that occurs over a bass A on the piano, when heard an octave higher than a pure frequency nominally of the same value, is not likely to be repeated when the piano is being played or when the piano plays with the orchestra. Experience, just as much as Western orchestral convention, assures the ear that the bass A is indeed a bass A and not the octave above.
The fact that the acoustician can find no acoustic energy in the reference frequency makes no difference. It is a musical fact, distinct from an acoustic fact. So, written in the same place on the musical staff, there are Cs and As that are acoustically different depending on the instrument on which they are played: each one has a particular spectrum, a particular localization of energy, which is “somewhere” in the tessitura, more or less high or low. So we can see to what extent a score can mislead about the “acoustic” content of the work and, reciprocally, how little chance a score that aims to be acoustically accurate has of recording what the ear will really perceive of those spectra that are so accurately defined. The champions of the electronic score would do well to think about this.
(b) Calibration of intervals
A calibration of intervals can only be correctly assessed if instrumental registers are available: in the absence of these, the ear—the musical ear, of course—tends nevertheless to retain a certain number of “relationships” from this group of instrumental pitches. Here again we find the concept of pitch as structural value, as independent as it can be of any characteristics or the nature of the objects that produce it.
When we come to evaluating, describing, or justifying such a calibration of intervals, attitudes vary considerably: will it be harmonic or melodic, depending on whether the ear hears simultaneous or successive sounds? Will the perception of intervals be based on consonance? Or (as the capricious tuning of the African balaphone might suggest) on custom?
This type of question seems to us to be a waste of time for the researcher, and we would suggest, rather, that he should consider these two pieces of information based on musical experience:
1. The perception of intervals is a cultural fact, conditioned by a specific practice and a specific number of conventions about the use of pitches, which provide waymarks for perception. The diatonic scale, a horrible arithmetical compromise between a number of simple relationships, is perfectly well tolerated by us. For an Indian it is only a crude calibration, and he, for his part, places other scalar degrees between the existing ones.
2. This perception of intervals will still be closely linked to the instrumental context: far from being better with pure sounds, the ear will clamor for sounds full of substance, demonstrating to the acoustician that it prefers to compare spectra rather than waves. In addition, the ear will judge performances by intentions: depending on context, a singer will appear to be singing on the right note, whereas she is a semitone out, but, on other occasions, the ear will not forgive her for being a few commas wrong.
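The "arithmetical compromise" mentioned in point 1 can be made concrete. Equal temperament replaces the simple ratios (3/2, 5/4, 6/5) with powers of the twelfth root of 2, and the discrepancy is conventionally measured in cents (1,200 per octave). The following sketch, not from the source, shows how far each tempered interval drifts from its simple ratio:

```python
import math

def cents(ratio):
    # 1,200 cents per octave: cents = 1200 * log2(ratio)
    return 1200 * math.log2(ratio)

# Simple ("just") ratios vs their twelve-tone equal-tempered stand-ins
for name, just, semitones in [("fifth", 3 / 2, 7),
                              ("major third", 5 / 4, 4),
                              ("minor third", 6 / 5, 3)]:
    print(name, round(cents(just), 1), "vs", semitones * 100)
    # fifth: ~702.0 vs 700; major third: ~386.3 vs 400; minor third: ~315.6 vs 300
```

The tempered fifth is off by only 2 cents, but the thirds are off by 14 and 16 cents, which is one reason a differently conditioned ear can hear the diatonic calibration as crude.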
(c) Functional and experimental calibrations
In the two previous examples the ear situates pitches within a context that is instrumental (registers) or structural (intervals).
But—as in Helmholtz’s experiment with resonators—it is equally prepared to free itself from these contexts and even, when the occasion arises, to break down into harmonics the instrumental unity in which these harmonics “merge.”
Thus, in a rich harmonic sound heard several times, we will initially hear only a pitch taken to be a tonic, crowned with a harmonic timbre. Then we will more clearly identify various “components,” which seem to create points of “condensation” in this sound; then, listening to those resonances, we will perceive that one resonance stands out more than another. Better still, if we compare this sound (which we suppose to be harmonically very ambiguous by now) with a “resolution into a chord” that the piano, or any other registered instrument, suggests, experience shows that there will be attraction or repulsion, and in response we will not always hear the sound at the same pitch: it will be heard in relation to the pattern the piano suggests. And so we observe that a given object, which has certain harmonic characteristics, can assume various values depending on the environment.
Is it legitimate to call such possible calibrations of evaluation by pitch “subjective”? No, if we think that they arise from a relationship between objects that is itself a function of group conditioning and individual training.
(d) Calibration of “gaps” in the tessitura
This is a psychophysical calibration of mels, very strange for a musician. The word mel seems very ill chosen if it suggests a melody of degrees, where all the ears in the world agree in their judgment that a third or a fifth are similar in the high or low register. So we must point out the risk of confusion this terminology may cause. It describes certain particular conditions for the perception of pitch relationships on the part of the experimenter and a particular listening approach on the part of the participants, who must use a type of evaluation quite uncommon in music. Our aim is not to criticize the objectivity of these evaluations but to draw attention to the experimental context that justifies them and, hence, to the potential interpretations. We will make ourselves clear at the end of this work.
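The risk of confusion the text points out can be seen in the mel scale's own arithmetic. Using a common modern curve fit (O'Shaughnessy's formula; an assumption here, since Stevens's original scale was tabulated from experiments rather than given by a formula), two octaves that are the "same interval" for any musician span very different numbers of mels:

```python
import math

def mel(f_hz):
    # A widely used curve fit for the mel scale: equal mel steps are
    # intended to correspond to equal steps of perceived pitch "distance."
    return 2595 * math.log10(1 + f_hz / 700)

# Two musical octaves, identical intervals to the musician's ear,
# cover very different spans on the mel scale:
low_octave = mel(880) - mel(440)      # ~368 mels
high_octave = mel(3520) - mel(1760)   # ~608 mels
print(low_octave, high_octave)
```

So a mel is not a scalar degree: it calibrates "gaps" in the tessitura, not musical intervals, which is exactly the distinction the text insists on.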
Since the concept of musical pitch is dependent, on the one hand, on instrumental registers and, on the other hand, on a particular cultural conditioning, there is nothing to stop us extending the perception of pitch by including in it sounds that are chosen differently or listened to in a new way.
Here we must distinguish between the concept of pitch linked to instrumental registers and the concept of pitch linked to the perception of intervals: register obviously comes from instrument making, intervals from music theory. If we are hoping to make a general statement about the way pitch is perceived, we must of course start with the ear. The very concept of register in the traditional sense loses its meaning as soon as we leave the field of classical instruments: a scale played on the phonogène using any concrete sound no longer has any meaning. But if we listen carefully to this random concrete sound (for example, it may come from a membrane, a metal sheet, a rod . . .), we notice that, without having a clearly locatable pitch like traditional sounds, it nevertheless presents a sound “mass” situated somewhere in the tessitura and is more or less characterized by occupying fairly decipherable intervals. It includes, for example, several sounds of gradually evolving pitch, crowned or surrounded with a conglomeration of partials that also evolve, all of this more or less locatable in a certain pitch zone. The ear soon manages to locate the most prominent components and aspects of these, provided it is trained to do so; such sounds can then become as familiar to it as traditional harmonic sounds: they have a characteristic mass.
If we applied the same criteria to these sounds as to harmonic sounds, we should expect that filtering them would limit them to slices of pitch strictly determined by the frequencies at which the filters cut off. We have already seen, with traditional sounds, that filtering a bass piano note has some surprises in store for experimenters: when it is done in the frequency zone that contains the fundamental, it does not change the perceived pitch of the note. On the contrary, in the high register it changes the piano-like character of the sound but without modifying the perception of pitch. Generally speaking, a symphony can still be recognized on the telephone: this shows us that structural pitch relationships remain indestructible, despite the system’s weak pass band. Similarly, on a small transistor the bass notes are reduced to practically 100 or 200 Hz (because of the smallness of the loudspeaker)—that is, only one or two octaves below the A on the tuning fork: musical works nevertheless continue to be “played” with physically nonexistent bass fundamentals.
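The persistence of pitch without its fundamental (the telephone, the small transistor) can be illustrated numerically. In the following sketch, not from the source, we synthesize only harmonics 2 through 5 of a 110 Hz tone, so that no energy at all lies at 110 Hz, and then look for the signal's strongest repetition period by autocorrelation; it falls at the period of the absent fundamental:

```python
import math

SR = 8000        # sample rate (Hz), roughly telephone quality
F0 = 110.0       # nominal fundamental: deliberately absent from the signal
N = 4000         # half a second of signal

# Harmonics 2..5 only: the spectrum contains 220, 330, 440, 550 Hz,
# but nothing at 110 Hz.
x = [sum(math.sin(2 * math.pi * k * F0 * n / SR) for k in range(2, 6))
     for n in range(N)]

def autocorr(sig, lag):
    # Unnormalized autocorrelation at a given integer lag
    return sum(sig[n] * sig[n + lag] for n in range(len(sig) - lag))

# The lag of maximum self-similarity is the common period of all the
# harmonics, i.e., the period of the "missing" fundamental.
best = max(range(20, 150), key=lambda lag: autocorr(x, lag))
print(SR / best)   # ~109.6 Hz: the pitch of a component that is not there
```

The physically nonexistent fundamental is thus "indestructible" in exactly the sense described above: it survives as a structure of the spectrum, not as a region of energy.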
These first experimental observations on filtering enable us to see that, even for harmonic sounds, which are in theory very localized in the tessitura, the concept of the “mass” of a sound is based in a concrete reality resistant to a number of manipulations that are, nevertheless, theoretically capable of modifying it considerably (proportionately with the changes in the spectrum of frequencies). Nontraditional sounds present the researcher with a permanence that is just as, if not more, obstinate where their occupation of the pitch field is concerned. So we can see that the beginner in experimental music, desperate to filter sounds, is in fact performing very minor surgery. From one incision to another, the sound is indeed transformed, painted in different “colors,” from dark to light; but through all these transformations it is nevertheless the same sound, with a mass that is still identifiable.
So this leads us to a very general conclusion about the correlation between pitch or mass, on the one hand, and the frequency spectrum, on the other: the apparent bulk or mass of a sound, or its precise location in pitch, is not in direct correlation with the physical bulk of the spectrum and its fragmentation, or with the localization of a fundamental.
Very limited fragments of this type of spectrum will, in fact, retain the subjective characteristics of the mass, or localization, or harmonic makeup of the original sound, albeit with “colors” that depend on the filtering: the ear, while recognizing an impoverishment or distortion of this original sound, will tend to reconstruct it with its characteristic individuality. In practice, anyone wishing to make experimental music is advised not to try to give the sound material he uses, concrete or synthetic, any pitch or mass values that are very closely related to the portions of spectrum determined by filtering.
Our aim initially is to find out whether the perception of the pitch of a musical sound depends exclusively on the frequency of its fundamental harmonic.
We know that the pitch of a (harmonic) musical sound is a well-defined value on which practically all Western music has come to depend. But the perceived pitch of a sound is not always determined by its fundamental. Although the fundamental alone sometimes does account for the greater part of the energy of the note, in other cases its energetic importance is, on the contrary, negligible. Thus it would appear that we cannot properly account for the perception of pitch solely by the presence of the fundamental.
To clarify the situation, we devised the following experiment: we played normal musical sounds, then the same sounds after filtering out their fundamental (using an electronic device), and compared the (perceived) pitch of these sounds with the pitch of those pure (sinusoidal) sounds that have a frequency related to the frequency of the fundamentals in question.
1. The sound material used in our experiment consisted of sixty-four instrumental sounds, with pitch varying between E♭1 (f = 38.9 Hz) and G♭7 (f = 2960 Hz).
These sounds came from:
Piano          9 sounds      Viola               4 sounds
Xylophone      3 —           Piccolo             3 —
Vibraphone     3 —           Violin              5 —
Oboe           5 —           Cello               4 —
Clarinet       3 —           Double bass         4 —
Flute          6 —           Bassoon             4 —
Trumpet        3 —           Electronic source   5 —
Trombone       3 —
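The two boundary frequencies quoted for this corpus (38.9 Hz and 2960 Hz) can be checked against twelve-tone equal temperament with A4 = 440 Hz; they correspond to E♭1 and G♭7 (enharmonically F♯7). A small sketch, using MIDI note numbering purely as a modern convenience (it is obviously anachronistic with respect to the source):

```python
def equal_tempered_hz(midi_note, a4=440.0):
    """Twelve-tone equal temperament: each semitone is a ratio of 2**(1/12).
    MIDI note 69 is A4; 27 is E-flat 1; 102 is G-flat 7 (= F-sharp 7)."""
    return a4 * 2 ** ((midi_note - 69) / 12)

print(round(equal_tempered_hz(27), 1))   # E-flat 1 -> 38.9 Hz
print(round(equal_tempered_hz(102)))     # G-flat 7 -> 2960 Hz
```

The quoted pitches are therefore internally consistent with the standard tuning of the period.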
Except for the percussion instruments, these sounds were presented as swelled notes, without vibrato, with a duration of 3–5 s, played mf, and then heard at a similar intensity (80–90 dB, re 0.0002 microbar, i.e., 20 µPa). The five electronic sounds, produced by an appropriately distorted sinusoidal wave, were given a fairly slow attack and a gradual decay: their timbre was somewhat like a violin, and in general their electronic origin was not noticed by the participants. Each sound was listened to twice during the test: once as it had been recorded and once after the fundamental had been filtered out; in total, therefore, 2 × 64 = 128 sounds were played.
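The listening level quoted above can be converted into sound pressure. Assuming the standard reference pressure of 20 µPa (the 0.0002 microbar of the older acoustical literature), a short sketch, not from the source:

```python
P_REF = 20e-6  # reference pressure: 20 micropascals = 0.0002 microbar

def spl_db_to_pascal(db):
    """Convert a sound pressure level in dB SPL to RMS pressure in pascals."""
    return P_REF * 10 ** (db / 20)

print(spl_db_to_pascal(80))  # ~0.2 Pa
print(spl_db_to_pascal(90))  # ~0.63 Pa
```

An 80–90 dB listening level is thus a comfortable mezzoforte in the room, four to five orders of magnitude above the threshold reference.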
2. In practice, these 128 sounds were divided over six tapes, each lasting approximately twenty minutes; there was a break between tapes. The six tapes were played in two separate sessions: three tapes per session.
Each of the 128 sounds whose pitch was being evaluated appeared in a sequence arranged as follows:
the sound played (three times); then the three reference sounds, consisting of pure (sinusoidal) sounds.
The sounds were played randomly; in particular, unfiltered and filtered sounds followed each other in no fixed order.
3. The participants, mainly final-year students from the Paris conservatory, or composers at the Groupe de recherches musicales, had been instructed in the aims of the experiment and introduced to the way it would be run by means of a preliminary training tape. We asked them to indicate, in a given sequence, which of the three reference sounds was in unison with the sound, which was played three times. Twenty-two people agreed to take the test; some of the responses, however, were discarded for reasons that will be explained later.
4. Eliminating the fundamental, in theory easy at the time of recording, had in practice to be done live, that is, by means of an adequate filter inserted into the listening circuit. The intermodulation coefficient of magnetic tape does not, in fact, allow the fundamental of a recorded sound to be attenuated by more than 40 dB, whereas we wanted to go to 50 dB. The efficiency of our device was monitored by a reference microphone positioned in front of the loudspeakers, in the room where the listening took place.
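The 50 dB figure implies a steep filter. The source does not say what filter type was used; as an illustrative sketch only, assume an ideal Butterworth high-pass with its cutoff one octave above the fundamental to be suppressed (hypothetical values of 100 Hz and 50 Hz below):

```python
import math

def butterworth_hp_attenuation_db(f, fc, order):
    """Gain in dB of an ideal Butterworth high-pass of a given order
    at frequency f, for cutoff fc (negative values = attenuation)."""
    return -10 * math.log10(1 + (fc / f) ** (2 * order))

# Suppressing a fundamental one octave below the cutoff by 50 dB
# requires a fairly high filter order (roughly 6 dB per octave per order):
for order in (6, 8, 9):
    print(order, round(butterworth_hp_attenuation_db(50, 100, order), 1))
```

Under this assumption, order 8 falls just short of 50 dB at one octave below cutoff, while order 9 exceeds it, which makes plain why the tape's 40 dB intermodulation floor forced the filtering to be done live.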
5. Findings. It may be asked why we did not make a direct comparison between the pitch of the filtered sounds and that of the homologous unfiltered sounds: the reason is that actual experiments of this type gave only very scattered results, which were difficult to interpret. This may be because the change of timbre that accompanies the elimination of the fundamental, and varies according to the instrumental origin of the sound, disrupts the perception of pitch in an unpredictable manner. Whatever the case may be, the use of sinusoidal reference sounds seemed to us to resolve the difficulty by giving the ear a fixed reference point, valid as much for unfiltered as for filtered sounds. It should be noted that the choice of sinusoidal sounds as “benchmarks” is not obligatory but a matter of convenience. We intend, moreover, to go back over all our experiments later on with reference sounds that will themselves be instrumental.11
Under these conditions we should expect to obtain two categories of findings:
1. The breakdown of the answers for the sixty-four “unfiltered” sounds in effect gives us an insight into the question: “To what extent is the (perceived) pitch of a harmonic sound (such as is found in traditional music) linked to the pitch of a sinusoidal sound with a frequency that is in simple relationship with the fundamental?”
2. Comparing these answers with the answers for the corresponding filtered sounds gives us an insight, although indirectly, into the question: “Do an unfiltered and a filtered sound have the same pitch?”
To ensure maximum reliability in the interpretation of the results, we took the following precautions:
(a) Seventeen of the twenty-two people tested equated musical unison (identity of perceived pitch) with physical “unison” (equality of frequency) in more than 90 percent of cases, which conforms with the most general hypothesis. The other five were not considered to have had the usual conditioning, and their answers were disregarded.
(b) As the listening sessions were relatively long and, most important, boring, we wanted to be sure that the quality of the answers was not affected by tiredness: we could detect no variation in fatigue levels between the beginning and the end of the sessions.
(c) Finally, we checked that the position of the participant in the listening room did not affect the perception of pitch of filtered sounds (we feared phenomena such as intermodulation, which are liable to “restore” the fundamental artificially) by comparing the answers of a control person who moved from place to place during one sequence, repeated several times.
The findings shown in section 10.6 are from the experiment carried out and interpreted in accordance with the directions above.