12

Temporal Anamorphoses I

Timbres and Dynamics

12.1. TIME LOCALIZATION

Sound information does indeed pass through the ear inasmuch as it is a mechanical organ. But the processes that prepare this information and continue it at a higher level elude our reconstructions and models. So only by observing and impartially describing the raw results of perception can we hope to reach a more accurate understanding of hearing phenomena. One of these, practically unnoticed until now, directly questions our sense of time: what comes before or after. In this chapter our approach will be through these “temporal localizations,” building on our first experiments in 1957,1 and so identifying an initial category of temporal anamorphoses.2

So here we are no longer concerned with determining time thresholds. A threshold, of course, has no sense of duration: it is a “speck of time,” the smallest perceptible, more or less describable, temporal event. On the contrary, evaluation in duration involves going in a more or less conscious, controlled, or instinctive way through these “slices of present,” which are no longer thresholds, since they blend and become a whole in the short-term memory, and so give that real-world grasp of the object that we call its temporal form, already in the past although almost present. Between thresholds without duration and those durations not formed of successive points, might we find some distinctive feature of time perception?

We must confess we had hardly ever thought about this. If it were the case, surely our many predecessors would have told us about it? We would still be just as innocent were it not for a decisive experiment that, as sometimes happens, we did by chance. We will give an account of it before moving on to a more systematic analysis, and we would invite the reader to imagine himself at our own stage of work, ignorance, and even ready-made ideas.

Musical listening, as well as practice, lays the emphasis mainly on sound attacks. The contributions of physicists are along similar lines, as we have seen again and again, with “transient phenomena,” which supposedly give the beginnings of sounds both their richness and their mystery. But even if our listening exercises prompted us to question the way duration is evaluated, and led us from a linear concept of the way time is distributed to the idea that not all the moments of sound are equivalent in duration, nothing made us abandon the very Cartesian schema of a succession of moments: the first before the second, the second before the third. Consequently nothing suggested that we should look for a sound’s attack anywhere else but . . . at the beginning. So this is where, in the wake of so many others, we looked for it.

12.2. BEGINNINGS OF SOUNDS

So, convinced that, linked to the famous transients, the first moments of sound held the secret of attacks, and thus also of timbre, and, with the piano, of the “touch” specific to instruments or virtuosi, we started to observe the beginnings of sound with the oscillograph. And so we compared the initial phases of different types of sound: piano, and later wind or bowed instrument sounds. We expected to find not a characteristic vibration line but at least an overall curve that would, for example, have explained the steepness of attack sensed musically. The experimentation was carried out on the first 50 milliseconds of sounds, this length of time being empirically accepted as long enough for all transient phenomena arising from the establishment of the sound to be over.

Our first finding was that the documents we obtained seemed to defy investigation. For example, two open Es on the violin, with attacks identical to the ear, played by the same instrumentalist, gave atypical oscillograms (fig. 4a); the same thing happened for two As played under the same conditions (fig. 4b). A more spectacular experiment involved asking a very good trumpet player to play a staccato with an accuracy appreciable to the ear: none of this sound’s eight impulses gave an oscillogram similar to the others (fig. 5).

FIGURE 4a. Oscillograms of the first 50 milliseconds of two open Es on the violin.

FIGURE 4b. Oscillograms of the first 50 milliseconds of two As on the violin.

FIGURE 5. Oscillograms of eight successive staccato impulses on the trumpet.

What do we understand by a “typical oscillogram”? An oscillogram that would be obtained under the following conditions:

(a) the samples from the same musical object (two As on the violin, for example) would give images with at least some features in common;

(b) two musical objects with different (musical) attacks would generate images that differed in a similarly characteristic way.

In fact, the oscillogram may have given an account of some aspects of the sounds under study, but it remained silent on the main point. What were we to conclude from these astonishing findings?

First, it is important to know what we are looking for and what we hope to understand: is it the transient electroacoustic workings of the complex chain sound object–microphone–tape recorder–oscillograph or the musical perception of attack? But above all, why should we want to explore the first 50 milliseconds and endeavor to find characteristic elements there, when the events that take place in that sliver of time are precisely not isolated by the ear, because of its poor resolving power?

We were not alone in such difficulties. Many other researchers were earnestly working away with infinitely more care and expertise than we (too many, perhaps, for they restricted themselves, it seems, to the comfort zone of their experimentation). Dayton Miller, says Fritz Winckel, was analyzing the first ten harmonics of a piano note (well beyond the first moments), played piano, mezzo forte or forte, and of course did not find the same spectrum; he therefore concluded that . . . timbres are variable, still, of course, in relation to spectra. Winckel continues:

On the other hand, we still have not managed to find a satisfactory explanation for the influence on the sound object of the individual player’s attack on the piano key. We are not unaware, of course, that timbre changes with the strength of the attack, as is shown by the spectra. . . . A medium attack makes the sound harsher, while a powerful attack gives a bright timbre not unlike a wind instrument. This is not enough to account for the various shades of sonority that the pianist can produce through secondary variations of touch; researchers at the University of Pennsylvania (U.S.A.) compared the sound spectra produced on the same piano by a famous pianist’s touch and by a weight dropped on to the note: the oscillograms showed no difference.3

Here we can see the weakness of musical acoustics: problematic measurements and hypotheses, as well as a lack of specific observations. We are failing to see the wood for the trees. We will soldier on.

12.3. THE SPLICED PIANO

Poor laboratories have at least one advantage; they make the researcher go back to simple experiments. Keeping to the experimental method we will describe more fully in chapter 13, we had just the tape recorder and scissors. How was it that we had not thought of this earlier, before embarking on a whole battery of delicate operations on the beginnings of sounds? Because, to our minds, the attack was so bound up with TEMPORAL LOCALIZATION that if we cut off the beginnings of sounds, we were certain to eliminate it from our listening. So it was without any preliminary conviction, and as one might carry out a somewhat absurd check to set one’s mind at rest, that we recorded a bass piano note and, by cutting somewhere after several tenths of a second, eliminated what was certain to be the attack. When we replayed the tape, we expected to hear a sound with its characteristic beginning decapitated. Now, this bass sound, with several tenths of a second, then half a second, indeed one second cut off, reproduced the whole of the piano note, with all its characteristics of timbre and attack.

So, after this first experiment, we could already conclude that, for bass piano sounds, the perception of attack is not linked to the phase when the sound is being physically established, since the beginning can be removed without changing the attack. Hence, our initial approach, which was based on the study of transient regimes, became null and void at least in the bass piano register and was likely to be so in other cases. It should be noted that this strange finding, once we have got over our initial surprise, can be quite easily explained if we consider that the transients at the beginning of a sound occur precisely within a space of time below or, at most, equal to the resolving power of the ear. We have already pointed out this contradiction in the previous chapter. This proof allows us finally to put aside a persistent misunderstanding. We perhaps had no greater understanding of the attack itself, but since it was clear which way we should go, we repeated this experiment first of all with the various piano registers, then with sounds from various other instruments. We will try to give more details about these experiments.

First, with the piano we found that the perception of the steepness of attack varied according to where the cut was made: the steepness was greater if the cut was made in a portion with a steeper descending dynamic. With bass piano notes the dynamic line is perceptibly linear, and, in fact, these cuts can be made a long way beyond the initial moments of the sound without the character of the attack (or indeed the timbre) changing perceptibly: in practice we can cut up to a second into the beginning of the sound (fig. 6). If we go any further, the artificial attack seems less powerful than the original attack. If, however, we cut an A4 on the piano at half a second, or one second, the sound becomes unrecognizable; it is more like a flute than a piano. In keeping with the general rule given above, we also note that the dynamic of this A4, which is fairly steep immediately after the beginning of the sound, is almost flat at the end (fig. 7).

FIGURE 6. Bathygrams of an A1 on the piano with the attacks cut out: (a) the average slope of the decreasing dynamic is constant; (b) and (c) the cut sounds are heard with approximately the same attack as the original sound.

FIGURE 7. Bathygram of an A4 on the piano.

Cuts that do not change the attack are difficult to make in the high register of the piano, since the dynamic gap is very tight because of the brevity of the sounds, and unless we cut very near to the beginning of the sound (barely 50 ms), the slope after the cut is not as strong as it is at the very beginning of the original sound, which explains why we obtain gentler attacks then.

It is tempting to extend this experimentation to all instruments that give the same type of objects as the piano: attack-resonance. Say, for example, that we cut into a vibraphone sound. Now, even when the cut is quite near the beginning of the sound, we have to admit that the attack (as well as the timbre) is clearly different. This countercheck makes us clarify our vocabulary, on the one hand, and the limits of our investigation, on the other. In fact, with the bass piano, we were in the straightforward situation of experimenting on a sound with a stable harmonic content: the cuts therefore only affected the steepness of the attack (the dynamic aspect) but had no effect on the harmonic content, since this was constant. It is not the same with the vibraphone: this instrument—as, in fact, we noted in most percussion instruments—gives a double attack, where the vibration of the metal strip, which seemingly constitutes the main part of the sound, is superimposed on the initial sound of the stick, which rapidly disappears. The experiment in deleting this double attack shows that this short sound is nevertheless part of what characterizes the vibraphone perceptually, and this is why, although the cut does not change the steepness of the attack (the vibraphone’s dynamic is remarkably linear), it changes its timbre. This analysis implies a new type of ear training. Aided by the experiment in cutting, the ear learns to identify both a steepness and a color in an attack.

12.4. SCISSOR ATTACK

At this point, after these various trials, we must make an important comment: in practice, all we have done with our cuts is eliminate the natural beginning of the sound and in each case replaced it with an artificial beginning using scissors: the sound must indeed start somewhere, and we can only replace one beginning with another. To what extent is this “scissor attack” parasitic?

First we must settle a question of terminology. We will use the phrase “the beginning of the sound” for the beginning of the signal, materialized by the tape, and the term attack for the perception localized at the initial moment. Now, to return to our bass piano note: in our first experiment we made straight cuts in the tape. Now we make a 45 degree cut in the same place: the attack is slightly gentler. If we repeat the experiment with different piano notes, we observe that in every case the sloping cuts give gentler attacks than the straight cuts; the latter are therefore the only ones able to reproduce the piano’s percussive attack in the right circumstances (a suitable dynamic slope).

We should note, further, that, for the piano as for the vibraphone, whether the cut is more or less angled seems less of a determining factor in the perception of attack than the dynamic slope of the sound at the place where the cut is made: we would say that the effect of the angle of the cut is secondary to the dynamic slope of the sound.

To summarize what we have learned from this first series of experiments:

• With the bass piano the attacks obtained by straight cuts are identical to the original attack (as, in fact, is the timbre).

• With the midregister piano these cuts give attacks that are more or less steep, depending on whether the slope of the decreasing dynamic of the sound is itself steep at the point where the cut is made; if it is made very near to the beginning of the sound, where the slope is the same as at the very beginning, we obtain the whole original note in both steepness of attack and timbre.

• If the cut is angled, the attack seems slightly gentler, but this effect is secondary to the one above.

• For percussion instruments, such as the vibraphone, or the high notes of the piano, in which there is a marked change in harmonic content in the course of the sound (the disappearance of noise because of the very short initial shock), such cuts give sounds with a different timbre, but the above rules remain valid for the quality now perceived, after ear training, as the steepness of attack.
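For readers who wish to try these cuts without tape and scissors, the manipulation can be imitated on a synthetic signal. What follows is only a minimal sketch under stated assumptions: the sample rate, the exponentially decaying sine standing in for a bass piano note, and the function names decaying_tone and scissor_cut are all hypothetical and not part of our experiments. A straight cut is modeled as an abrupt start, an angled cut as a short linear ramp.

```python
import math

SAMPLE_RATE = 8000  # Hz; an arbitrary value chosen for this sketch


def decaying_tone(freq_hz, decay_db_per_s, duration_s, rate=SAMPLE_RATE):
    """Sine tone whose level falls at a constant number of dB per second.

    This is a crude stand-in for a bass piano note with a linear dynamic;
    it is an assumption of the sketch, not a measured sound.
    """
    n = int(duration_s * rate)
    return [
        10 ** (-decay_db_per_s * (i / rate) / 20)
        * math.sin(2 * math.pi * freq_hz * i / rate)
        for i in range(n)
    ]


def scissor_cut(samples, cut_s, ramp_ms=0.0, rate=SAMPLE_RATE):
    """Discard everything before cut_s and return the remainder.

    ramp_ms = 0 imitates a straight cut (abrupt beginning).
    ramp_ms > 0 imitates an angled cut by a linear fade-in of that length.
    """
    start = int(cut_s * rate)
    tail = list(samples[start:])
    ramp_len = min(int(ramp_ms * rate / 1000), len(tail))
    for i in range(ramp_len):
        tail[i] *= i / ramp_len
    return tail


tone = decaying_tone(freq_hz=55.0, decay_db_per_s=6.0, duration_s=2.0)
straight = scissor_cut(tone, cut_s=0.5)             # straight cut
angled = scissor_cut(tone, cut_s=0.5, ramp_ms=20)   # angled cut
print(len(tone), len(straight), len(angled))
```

The two results differ only inside the ramp; listening to them side by side is one way to judge how far the angle of the cut is secondary to the dynamic slope at the point of cutting.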

We will now test these experiments on the “scissor attack” against the comments we made on the integration threshold of the ear (50 ms): it is easy to calculate that the 45 degree cut, which we have just said produces a gentler attack than the straight cut, gives a time for the sound energy to appear of only about 20 ms. We may wonder how far below its integration threshold the ear is still sensitive to the time it takes for the sound to appear: we observe by way of experiment that, from 0 to 5 milliseconds, the attack obtained from the straight or slightly angled cut retains the same character of steepness and gives rise to a slight sensation of shock (a phenomenon due to the ear’s mechanical inertia; see section 11.6). When it takes more than 5 ms for the sound to appear, the attack becomes progressively gentler.
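The calculation alluded to here can be sketched as follows. The text gives neither the tape width nor the tape speed, so both values below are assumptions (quarter-inch tape running at 38.1 cm/s), as is the convention that the angle is measured from a straight, perpendicular cut. The finite width of the playback head gap is ignored.

```python
import math

TAPE_WIDTH_MM = 6.35     # assumption: quarter-inch tape
TAPE_SPEED_MM_S = 381.0  # assumption: 38.1 cm/s (15 inches per second)


def cut_rise_time_ms(angle_deg, width_mm=TAPE_WIDTH_MM,
                     speed_mm_s=TAPE_SPEED_MM_S):
    """Time taken by a diagonal cut to pass the playback head, in ms.

    angle_deg is measured from a straight cut: 0 means perpendicular
    to the tape, larger values mean a longer, more oblique cut.
    """
    if not 0 <= angle_deg < 90:
        raise ValueError("angle must be in the range [0, 90) degrees")
    # Length of tape covered by the cut, measured along the direction of travel.
    length_mm = width_mm * math.tan(math.radians(angle_deg))
    return 1000.0 * length_mm / speed_mm_s


for angle in (0, 45, 60, 70):
    print(f"{angle:2d} degrees -> {cut_rise_time_ms(angle):5.1f} ms")
```

Under these assumptions the 45 degree cut comes out a little under 20 ms, the same order of magnitude as the figure quoted above; a different tape speed or width would scale the result proportionally.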

12.5. CUTTING SOUNDS OTHER THAN PERCUSSIVE

Now we will attempt to make cuts in sustained sounds and to assess their importance in the perception of these sounds.

We observe, for example, that with a swelled flute sound a straight cut distorts the timbre, giving an explosive attack that has nothing in common with the original attack, whereas a steeply angled (60 degree) cut reproduces the original attack. But this sort of manipulation on an expressive flute sound (with vibrato) gives an appreciably less strange sound. With a very short note, on the contrary, it makes the sound unrecognizable.

Conversely, a straight cut in a trumpet sound restores the tonguing sensation characteristic of this instrument’s attacks on sound reasonably well; this finding is hardly surprising, for we know that the time for a trumpet sound to appear is very short; however, an angled cut gives a gentle attack that, in some cases (e.g., a swelled piano type sound) can cast doubt on the source of the sound and indeed even bring about veritable instrumental mutations. And so we manage practically to “transform” a middle-register trumpet sound into a flute sound.

The importance of the attack as an element in identifying sound with its timbre is therefore very variable depending on the nature of the objects delivered by the instrument:

• With very short sounds the attack plays a decisive role; it is characteristic of the timbre, as in percussive instruments (e.g., the piano).

• With swelled sounds of medium duration the importance of the attack diminishes; attention starts to be directed toward the evolving sound.

• With sustained sounds with vibrato (this is the norm) the role of the attack becomes almost negligible, and we may think that in these cases the ear is mainly directed toward the development of the sound, which constantly fixes its attention.

Cuts in violin or oboe sounds confirm the above findings. To study the influence of cuts, it would be worthwhile to work with fairly short swelled sounds; cuts intended to reproduce the original attack should be more or less angled in keeping with the steepness of the attacks themselves.

If, finally, we make cuts in rich, fluctuating sounds, such as a gong sound, the new objects obtained in this way may be very different from the initial objects: the cut, in fact, brings out a part of the object that was masked by an initial harmonic content, which was particularly overwhelming for the ear. Nevertheless, the steepness of the scissor attacks obeys the general rule that emerged from the above experiments. And the ear “learns” similarly to distinguish between two qualities: the timbre of the attack, a function of the harmonic content “discovered” at the moment of cutting, and the steepness of the attack, always linked to the dynamic slope.

12.6. GENERAL INTERPRETATION OF THESE FINDINGS

We have discovered through making cuts in the magnetic tape that the musical perception of attack correlates, on the one hand, with the general dynamic of the sound—that is, its energetic development—and, on the other hand, with the harmonic content.

So we have reached a first milestone, since these correlations account for all first-order phenomena at least.

Here is an overall review of these results:

In general every sound has three temporal phases (fig. 8):

• an establishment phase A

• a sustainment phase B

• a decay phase C

FIGURE 8. Dynamic phases of sustained sounds.

We should note that these three phases are often so bound together that we have some difficulty separating them. With percussion sounds followed by resonance there is no phase B; A is directly linked to C, which is of variable length (fig. 9).

FIGURE 9. Dynamic phases of resonance-percussion.

We have seen that the musical perception of attack correlated in two ways with the physical structure of the sound signal, involving, on the one hand, the general dynamic of the sound, linked to its energetic history, and, on the other hand, its harmonic content.

1. The general dynamic comes into play through the speed of establishment of the sound (phase A), which brings us to suggest three orders of magnitude:

• very rapid establishment (lasting fewer than 5 to 10 ms), where the ear cannot “follow” the too rapid variations;

• medium duration establishment (about 50 ms);

• and finally very slow establishment.

The general dynamic also comes into play in percussive sounds followed by resonance through its downward slope after the beginning of the sound.

2. On the physical level harmonic content is commonly described in terms of spectrum, using a Fourier series analysis. The ear perceives the greater or lesser richness of sound, the distribution of partials and their development.

We find two different types of perceptions to characterize the attack, corresponding to two sorts of physical variables:

• the first we call the steepness of the attack, related to the dynamic phenomena;

• the second we call the color of the attack, related to the harmonic phenomena.

These two types of perception are in theory independent. Nevertheless, it often happens that an attack is both steep and rich (a sudden shock giving rise to an increased number of partials) or else gentle and poor.

The laws that follow apply first to sustained sounds, then to percussive sounds followed by resonance. In both cases they deal first with the perception of the steepness of attack alone, without taking its color into account. Then we will discuss the overall perception of attacks: steepness + color.

12.7. LAWS OF PERCEPTION OF ATTACK

First law: To describe its perception of the steepness of attack with sustained sounds, the ear, as a general rule, is sensitive to the way in which sound energy appears in time (phase A).

It should be noted that by this we mean the total energy and not the energy of a particular isolated harmonic component of the sound.

There are several possibilities:

1. The energy appears in roughly 3 to 10 ms (see fig. 10). In this case, whatever the sound, the sensation of steepness of attack is always the same: the ear is not capable of following such steep ascents, which then give a sort of attack noise (a result of the spectrum spreading out in the ear): this is a short click, which may disappear if it is masked by a large harmonic content (for example, in an attack with a rosined bow). This clicking sound is more apparent in relatively poor sounds (trumpet). It is the rule in all artificial (straight) tape cuts (see fig. 7).

FIGURE 10. The energy appears in a time less than or equal to 5 ms: all the attacks are perceived as having the same steepness.

2. The sound energy appears in roughly 10 to 50 ms (see fig. 11). Here it seems that the perceived steepness of attack is linked only to this appearance time and not to its fluctuations in detail. Take, for example, a sustained flute sound with an establishment duration of about 40 ms: we observe that a 60 degree or 70 degree cut in the tape quite perceptibly reproduces the flute’s attack; now, with a “scissors” beginning, the attack appears in a strictly linear way, which is not the case with the natural beginning; the ear is therefore not sensitive to the detail but only to the overall duration of establishment of the energy (see fig. 8).

FIGURE 11. The energy appears in a period of time between 10 and 50 ms. To describe its perception of the steepness of attack, the ear is sensitive to the time it takes for the energy to appear but not to the various fluctuations that accompany this.

Experiments such as these can be repeated with violin or clarinet sounds, as well as others.

Moreover, in the two examples 1 and 2, if the harmonic content of the sound is constant throughout their duration, a cut made at a suitable angle reproduces the original attack in its entirety, with its degree of steepness and its color.

Indeed, in both cases the original steepness can be reproduced by making the cut at the appropriate angle, and in theory the same harmonic content would be found at any point in the sound. This rule can be verified with very consistent flute or violin sounds (see fig. 12).

FIGURE 12. The harmonic content is stable; in this example, whatever T is, and if t remains the same, the cut will reproduce an attack identical to the original in steepness and color.

However, as it is practically impossible, even with sounds such as the flute, which can easily be sustained, to obtain rigorous consistency of harmonic content, because a performer always allows tiny fluctuations to form, the cuts always give slight differences of color compared with the original attack. In addition, it is very rare for the beginning of a sound, particularly if it is rapid (example 1), not to contain some parasitic sound (the noise of a peg, tonguing); these noises, although in general not very noticeable, are nevertheless an integral part of the characteristic timbre of instruments, so it is only very rarely that we are really in the situation presumed by these hypotheses.

3. The sound energy takes much longer than 50 ms to appear. In this case the ear is able to follow the dynamic and harmonic developments when the sound appears; this finding logically rounds off our previous conclusions. It should be noted that here the term attack is only used by extrapolation from the earlier examples, for it is difficult to say where the attack finishes and where the continuation of the sound, properly speaking, begins; the concept of steepness of attack no longer has much meaning, since the sound emerges progressively out of silence. On the one hand, the technique of cutting will not now help us to study the beginnings of sounds; on the other hand, it may shed light on our perception of particular moments of the sound, which may be masked by the moments immediately before, by eliminating these.
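The three cases of the first law can be summarized in a small lookup. This is a sketch only: the boundaries of 10 ms and 50 ms come from the cases above, but treating them as sharp cutoffs is an assumption, since the text speaks of orders of magnitude; the function name attack_regime is hypothetical.

```python
def attack_regime(rise_ms):
    """Map an energy appearance time (in ms) to a case of the first law.

    The 10 ms and 50 ms limits are treated as hard boundaries purely
    for illustration; perceptually the transitions are gradual.
    """
    if rise_ms < 0:
        raise ValueError("appearance time must be non-negative")
    if rise_ms <= 10:
        return "case 1: same steepness whatever the sound, possible click"
    if rise_ms <= 50:
        return "case 2: steepness follows the overall appearance time"
    return "case 3: the ear follows the development; steepness loses meaning"


for rise in (5, 40, 200):
    print(rise, "ms ->", attack_regime(rise))
```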

Second law: To describe its perception of steepness of attack with sounds that have a percussive or plucked attack followed by resonance, the ear is sensitive to the way the energy disappears even more than to the way it appears. As we have seen, the steepness of attack is linked in the first place to the slope of the decreasing dynamic immediately after the beginning of the (natural or artificial) sound and only in second place to the upward slope as the sound appears.

Here, in theory, we should come back to the three processes described in 1, 2, and 3 on sustained sounds; in practice, when a string is struck or plucked, the energy moves in very quickly, about 5 to 10 ms; so the original attack can be reproduced by a straight cut, provided that it is made at a point where the descending dynamic has the same slope as immediately after the beginning of the original sound. At a point where the slope is less steep, cutting produces a gentler attack; similarly, an angled cut softens the attack, but this effect is secondary.
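The condition stated by the second law, that the cut be made where the descending dynamic has the same slope as just after the beginning, can be checked numerically on an envelope. The sketch below uses a hypothetical two-stage decay (a fast component followed by a slow one) chosen only to resemble a midregister piano dynamic that is steep at first and almost flat later; the coefficients and function names are assumptions, not measurements.

```python
import math


def envelope_db(t_s):
    """Level in dB of a hypothetical two-stage decay at time t_s (seconds)."""
    fast = 0.8 * math.exp(-t_s / 0.15)  # rapidly vanishing component
    slow = 0.2 * math.exp(-t_s / 1.5)   # long resonance
    return 20 * math.log10(fast + slow)


def decay_slope_db_per_s(t_s, dt=0.01):
    """Forward-difference estimate of the dynamic slope at t_s, in dB/s."""
    return (envelope_db(t_s + dt) - envelope_db(t_s)) / dt


slope_at_start = decay_slope_db_per_s(0.0)
for cut_s in (0.05, 0.5, 1.0):
    slope = decay_slope_db_per_s(cut_s)
    ratio = slope / slope_at_start
    print(f"cut at {cut_s:4.2f} s: slope {slope:7.1f} dB/s, "
          f"{ratio:4.2f} of the initial slope")
```

On such an envelope the slope flattens quickly as the cut moves away from the beginning, which is in keeping with the gentler attacks obtained from late cuts; with a strictly linear dynamic, as in the bass register, the ratio would stay close to 1 wherever the cut is made.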

In addition, and as above, if the harmonic content is constant (as in the bass register of the piano), a straight cut in a part of the sound where the dynamic has the same slope as at the beginning reproduces the original attack in its entirety, with its steepness and color.

We may wonder why, with attack-resonance sounds, the ear is more sensitive to the descending dynamic than the ascending slope; it may be that, as the energy appears quite abruptly in both cases, the most perceptible difference between the sounds is where they decrease; the ear limits itself to accepting delivery, insofar as it is able, of an energy that establishes itself quite suddenly but disappears more or less quickly. The oscillations of the energy are all the more significant in their decay phase if they are still apparently the same in their appearance phase.

Other researchers had preceded us in this line of thought. “Karl Stumpf,” says Winckel, “showed that a sound with a timbre and intensity that are constant in time loses its character to some extent if the characteristic attack is eliminated by any sort of procedure (cutting with scissors). Then, after an abrupt attack, the influence of which can be evaluated, there remains the sound object’s continuation itself, with characteristics that no longer vary in time.”4

We would have to consult Stumpf’s original work to verify the quotation. Oddly, if it is based on experimentation similar to our own, it shows how far, in the view of the person quoting it, the attack is linked to the beginning of the sound.

12.8. EFFECT OF DYNAMIC ON THE PERCEPTION OF TIMBRES

The above experiments have helped us to reach a better understanding of the importance of attack as an element in the identification of instrumental timbre. Although we have been careful to note our findings, we should like at this point to summarize our conclusions. We observed, for example, that it was possible, by means of an exaggeratedly toned-down attack, to transform a piano sound (in the midregister) into a flute sound and that a vibraphone sound, with its natural beginning cut off, becomes unrecognizable. . . . In other words, at least for a certain type of sound, the ear deduces the elements necessary for identifying the instrument from the attack. We have seen that the same is almost true of “swelled” sustained sounds that are short or do not develop; conversely, the attack becomes secondary as an element in the identification of timbre when the sounds have dynamic or harmonic variations in the course of their duration (vibrato, for example), and this all the more so if these variations are numerous and unpredictable. We could, therefore, as a general rule, say that:

1. Every percussion-resonance type of sound has its characteristic timbre immediately from the moment of attack.

2. The timbre of every sustained sound with dynamic or harmonic variations will only be identified secondarily by its attack; the timbre will be the result of a perception that evolves throughout the duration of the sound.

We could also conflate these two statements into one: perceived timbre is a synthesis of the variations in harmonic content and dynamic development; in particular, it exists immediately from the first moment of the attack whenever the rest of the sound flows directly from this.