Chapter 9. Audio and Music Processing

Deep in the back of my mind is an unrealized sound
Every feeling I get from the street says it soon could be found
When I hear the cold lies of the pusher, I know it exists
It’s confirmed in the eyes of the kids, emphasized with their fists . . .

The music must change
For we’re chewing a bone
We soared like the sparrow hawk flied
Then we dropped like a stone
Like the tide and the waves
Growing slowly in range
Crushing mountains as old as the Earth
So the music must change

The Who, “Music Must Change”

Audio and music can be approached in three different ways with Mathematica: (1) as traditional musical notes with associated pitch names and other specifications, such as duration, timbre, loudness, etc.; (2) as abstract mathematical waveforms that represent vibrating systems; and (3) as digitally represented sound—just think of .wav and .aiff files. If nothing else, this chapter should hint at the ease with which Mathematica can be put in the service of the arts. Let’s make some music!

Mathematica allows you to approach music and sound in at least three different ways. You can talk to Mathematica about musical notes such as "C" or "Fsharp", and you can directly specify other traditional concepts, such as timbre and loudness, with Mathematica’s Sound and SoundNote functions. You can ask Mathematica to play analog waveforms with Play. And you can ask Mathematica to interpret digital sound samples with ListPlay.

Mathematica has implemented 60 percussion instruments as specified in the General MIDI (musical instrument digital interface) specification.

Here the percussion instruments are listed in alphabetical order. Some of the names are not obvious. For example, there is no plain triangle or conga; instead there are "MuteTriangle", "OpenTriangle", "HighCongaMute", "HighCongaOpen", and "LowConga".

In[724]:= allPerc = {"BassDrum", "BassDrum2", "BellTree", "Cabasa", "Castanets",
               "ChineseCymbal", "Clap", "Claves", "Cowbell", "CrashCymbal",
               "CrashCymbal2", "ElectricSnare", "GuiroLong", "GuiroShort", "HighAgogo",
               "HighBongo", "HighCongaMute", "HighCongaOpen", "HighFloorTom",
               "HighTimbale", "HighTom", "HighWoodblock", "HiHatClosed", "HiHatOpen",
               "HiHatPedal", "JingleBell", "LowAgogo", "LowBongo", "LowConga",
               "LowFloorTom", "LowTimbale", "LowTom", "LowWoodblock", "Maracas",
               "MetronomeBell", "MetronomeClick", "MidTom", "MidTom2", "MuteCuica",
               "MuteSurdo", "MuteTriangle", "OpenCuica", "OpenSurdo", "OpenTriangle",
               "RideBell", "RideCymbal", "RideCymbal2", "ScratchPull", "ScratchPush",
               "Shaker", "SideStick", "Slap", "Snare", "SplashCymbal", "SquareClick",
               "Sticks", "Tambourine", "Vibraslap", "WhistleLong", "WhistleShort"};

Here’s what each instrument sounds like. The instrument name is fed into SoundNote where, more typically, a note specification would appear. In fact, in the General MIDI specification, each percussion instrument is represented as a single pitch in a "drum" patch. So, for example, "BassDrum" is C0, "BassDrum2" is C#0, "Snare" is D0, and so on. Therefore, it makes sense for Mathematica to treat these instruments as notes, not as "instruments" as was done above for "Piano", "GuitarMuted", and "GuitarOverdriven".

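A minimal sketch of such an audition (the 0.2-second duration per hit is an arbitrary choice):

    (* play all 60 percussion "notes" in sequence *)
    Sound[SoundNote[#, 0.2] & /@ allPerc]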

Here’s a measure’s worth of closed hi-hat:

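One way to write this, assuming a 4/4 measure subdivided into eight eighth notes of 0.25 seconds each:

    Sound[Table[SoundNote["HiHatClosed", 0.25], {8}]]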

And here’s something with a little more pizzazz. Both the choice of instrument and volume are randomized.

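A sketch along those lines, which also defines the groove pattern used in the discussion that follows. Each inner list of groove is a set of drums struck simultaneously; the RandomChoice snare substitution and the SoundVolume range are illustrative assumptions:

    groove = {{"HiHatClosed", "BassDrum", None}, {"HiHatClosed", None, None},
       {"HiHatClosed", None, "Snare"}, {"HiHatClosed", "BassDrum", None},
       {"HiHatClosed", "BassDrum", None}, {"HiHatClosed", None, None},
       {"HiHatClosed", None, "Snare"}, {"HiHatClosed", None, None}};
    Sound[
     SoundNote[# /. "Snare" :> RandomChoice[{"Snare", "ElectricSnare"}], 0.25,
        SoundVolume -> RandomReal[{0.4, 1}]] & /@
      Flatten[Table[groove, {4}], 1]]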

Getting the curly braces just right in Mathematica’s syntax can be a little frustrating. Without Flatten in the example above, SoundNote is confused by the list-within-list result of the Table function: each SoundNote wraps an entire copy of the groove rather than a single hit. Consequently, you get no sound.

In[734]:= Sound[SoundNote[#, 0.25] & /@ Table[groove, {4}]]

Out[734]= Sound[
            {SoundNote[{{"HiHatClosed", "BassDrum", None}, {"HiHatClosed", None, None},
               {"HiHatClosed", None, "Snare"}, {"HiHatClosed", "BassDrum", None},
               {"HiHatClosed", "BassDrum", None}, {"HiHatClosed", None, None},
               {"HiHatClosed", None, "Snare"}, {"HiHatClosed", None, None}}, 0.25`],
             SoundNote[{{"HiHatClosed", "BassDrum", None}, {"HiHatClosed", None, None},
               {"HiHatClosed", None, "Snare"}, {"HiHatClosed", "BassDrum", None},
               {"HiHatClosed", "BassDrum", None}, {"HiHatClosed", None, None},
               {"HiHatClosed", None, "Snare"}, {"HiHatClosed", None, None}}, 0.25`],
             SoundNote[{{"HiHatClosed", "BassDrum", None}, {"HiHatClosed", None, None},
               {"HiHatClosed", None, "Snare"}, {"HiHatClosed", "BassDrum", None},
               {"HiHatClosed", "BassDrum", None}, {"HiHatClosed", None, None},
               {"HiHatClosed", None, "Snare"}, {"HiHatClosed", None, None}}, 0.25`],
             SoundNote[{{"HiHatClosed", "BassDrum", None}, {"HiHatClosed", None, None},
               {"HiHatClosed", None, "Snare"}, {"HiHatClosed", "BassDrum", None},
               {"HiHatClosed", "BassDrum", None}, {"HiHatClosed", None, None},
               {"HiHatClosed", None, "Snare"}, {"HiHatClosed", None, None}}, 0.25`]}]

Conversely, with a complete Flatten wrapped around the Table function, each hit is treated individually; we lose the chordal quality of the drums hitting simultaneously. Go back and notice that the correct idea is to remove just one layer of braces by using Flatten[..., 1].
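For reference, the corrected input would look something like this; each inner triple now reaches SoundNote intact and sounds as a chord:

    Sound[SoundNote[#, 0.25] & /@ Flatten[Table[groove, {4}], 1]]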

Discussion

Modern Western music uses tempered tuning, which is a slight compromise to the vibrations of the natural world, or at least the perfection of the natural world as the Greeks described it 3,000 years ago. The ancient Greeks (and even earlier, the Babylonians) noticed that when objects vibrate in simple, integer ratios to each other, the resulting sound is pleasant. The simple ratio of 2:1 is so pleasant that we perceive it as an equivalence. When two notes vibrate in a ratio of 2:1, we say they have the same pitch but are in different octaves. The history of music has been the history of partitioning the octave.

The first obvious division of the octave is created by the next simplest ratio, 3:1. Consider the following schematic of a vibrating string. The only requirement on the string is that its endpoints remain fixed. The string can vibrate in many different modes, as shown below. Each mode has a characteristic number of still points, called "nodes," that appear symmetrically along the length of the string. Each mode also has a characteristic rate of vibration, which is a simple integer multiple of the lowest, fundamental frequency. Notice that three out of the first four harmonics are octave equivalences. The third harmonic, situated between the second and fourth harmonics, has a ratio of 3:2 to the second harmonic and 3:4 to the fourth. These were the kinds of simple ratios that appealed to the Greeks.

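A few lines of Mathematica can reproduce such a schematic; here is a sketch of the first four modes of a string with fixed endpoints:

    (* harmonic n has n - 1 interior nodes and vibrates at n times the fundamental *)
    GraphicsColumn[
     Table[Plot[Sin[n Pi x], {x, 0, 1}, Ticks -> None,
       PlotLabel -> Row[{"harmonic ", n, ", frequency ratio ", n, ":1"}]], {n, 1, 4}]]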

Successive applications of the 3:2 ratio can be used to build the entire chromatic scale. After 12 applications of this ratio, every note of the modern chromatic scale has been visited once and we are returned to the starting pitch (sort of!).

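A sketch of that construction in code (octaveReduce, which folds each ratio back into a single octave, is a hypothetical helper name):

    (* twelve successive 3:2 steps, each folded back into the octave [1, 2) *)
    octaveReduce[r_] := r/2^Floor[Log2[r]];
    Sort[octaveReduce /@ NestList[3/2 # &, 1, 12]] // N

The first two entries of the sorted result, 1. and 1.01364, differ by the famous Pythagorean comma, 531441/524288; that small discrepancy is the "sort of!" above.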

There’s a problem: (3/2)^12 represents the C seven octaves above the starting C and should equal a C with a frequency ratio of 2^7 = 128, but (3/2)^12 equals 129.75. The equal temperament solution to this problem is to distribute this discrepancy equally over all the intervals. In other words, in equal temperament, every interval is made slightly, and equally, "out of tune." Johann Sebastian Bach composed a series of keyboard pieces in 1722 called "The Well-Tempered Clavier" to demonstrate that this compromise was basically imperceptible and had no negative impact on the beauty of the music.

Mathematically, equal temperament means that the frequency of each pitch should have the same ratio to its immediate lower neighbor’s frequency. Call this ratio α. Since a chromatic scale, which contains 12 pitches, takes you from some frequency to twice that frequency, it must be the case that α^12 = 2, so α = 2^(1/12). The ratio of a semitone in equal temperament is therefore about 1.0595.

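In Mathematica this is one line; keeping α exact lets later computations request as many digits as needed:

    α = 2^(1/12);
    N[α]   (* 1.05946 *)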

However, now that we have the octave in perfect shape, every other interval is slightly "wrong," or at least wrong according to the manner in which the Greeks tried to construct their intervals. For example, the equal-tempered fifth is slightly flat compared to the Pythagorean fifth of 3/2 = 1.5 (the musical interval of a fifth is composed of seven half-steps).

In[756]:= α^7 // N
Out[756]= 1.49831

In[757]:= N[α^7, 7]
Out[757]= 1.498307

Now that we’ve gone through the basics of tuning, how do you use Mathematica to explore alternate tunings?
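One way to begin exploring is to let your ears judge; a sketch using Play on raw sine waves (the A440 reference pitch and one-second duration are arbitrary choices):

    (* a just 3:2 fifth above A440, then the slightly flat equal-tempered fifth *)
    Play[Sin[2 Pi 440 t] + Sin[2 Pi 440 (3/2) t], {t, 0, 1}]
    Play[Sin[2 Pi 440 t] + Sin[2 Pi 440 2^(7/12) t], {t, 0, 1}]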

Mathematica imports many standard file formats. Both AIFF and WAV are in the list.

In[762]:= $ImportFormats
Out[762]= {3DS, ACO, AIFF, ApacheLog, AU, AVI, Base64, Binary, Bit, BMP, Byte, BYU,
           BZIP2, CDED, CDF, Character16, Character8, Complex128, Complex256,
           Complex64, CSV, CUR, DBF, DICOM, DIF, Directory, DXF, EDF, ExpressionML,
           FASTA, FITS, FLAC, GenBank, GeoTIFF, GIF, Graph6, GTOPO30, GZIP,
           HarwellBoeing, HDF, HDF5, HTML, ICO, Integer128, Integer16, Integer24,
           Integer32, Integer64, Integer8, JPEG, JPEG2000, JVX, LaTeX, List, LWO,
           MAT, MathML, MBOX, MDB, MGF, MMCIF, MOL, MOL2, MPS, MTP, MTX, MX, NB,
           NetCDF, NOFF, OBJ, ODS, OFF, Package, PBM, PCX, PDB, PDF, PGM, PLY, PNG,
           PNM, PPM, PXR, QuickTime, RawBitmap, Real128, Real32, Real64, RIB,
           RSS, RTF, SCT, SDF, SDTS, SDTSDEM, SHP, SMILES, SND, SP3, Sparse6, STL,
           String, SXC, Table, TAR, TerminatedString, Text, TGA, TIFF, TIGER,
           TSV, UnsignedInteger128, UnsignedInteger16, UnsignedInteger24,
           UnsignedInteger32, UnsignedInteger64, UnsignedInteger8, USGSDEM, UUE,
           VCF, WAV, Wave64, WDX, XBM, XHTML, XHTMLMathML, XLS, XML, XPORT, XYZ, ZIP}

Using the "Data" specification will save you the aggravation of decoding the syntax of the imported data. Don’t forget the semicolon, which prevents Mathematica from listing all the sample points. The easiest way to access a file is to type Import[], place your cursor between the empty brackets, choose File Path... from the Insert menu, and navigate in the dialog box to the file you want to open.

In[763]:= file = FileNameJoin[{NotebookDirectory[], "..", "data", "JCK_01.aif"}];
          data = Flatten@Import[file, "Data"];

You’ll need to know the sample rate and whether this file is mono or stereo, so do a second Import on the same file, but specify "Options".

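Something along these lines; the precise rules returned (sample rate, channel count, sample depth, and so on) depend on the file:

    Import[file, "Options"]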

If you simply want to play the file, specify "Sound" as the second parameter.

In[766]:=  snd = Import[file, "Sound"];

This returns a Sound object.

In[767]:=  snd // Head
Out[767]=  Sound

It can be played like so:

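One way is EmitSound; evaluating snd by itself also displays a playable sound object:

    EmitSound[snd]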

Typically you’ll start with a digitized signal. The sampling rate will determine the highest frequency that can be investigated. This highest frequency is called the Nyquist frequency and is always exactly one half the sampling rate. For this "Yes we can!" sample, which was digitized at 48 kHz, the highest frequency is 24 kHz. (It’s not coincidental that this frequency is slightly greater than the limits of human hearing.) Notice the plot is symmetric about the Nyquist frequency.

The number of sample points used in any analysis is also critical. Here exactly one second of audio, that is, 48,000 sample points, is being analyzed. The 48,000 points from the time domain yield 48,000 points in the frequency domain, but as you can see, the right side of the plot, between points 24,000 and 48,000, is just a mirror duplication of the points between 0 and 24,000. This is an artifact of the underlying mathematics (the Fourier transform of real-valued data is conjugate-symmetric), and there is no additional information in this half of the plot.

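A sketch of this analysis (assuming the clip is at least one second long and that data holds a mono signal):

    (* magnitude spectrum of the first second; note the mirror symmetry about bin 24,000 *)
    spectrum = Abs[Fourier[Take[data, 48000]]];
    ListLinePlot[spectrum, PlotRange -> All]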

Since this is speech, you can focus on the first 2,000 points, which correspond to frequencies 0 to 2,000 Hz. Later you’ll see that 2,000 points of a Fourier analysis doesn’t always mean frequencies 0 through 2,000 Hz. It does in this case because you started with 48,000 sample points in the time domain, which equals the sampling rate, creating a one-to-one relationship between data points and frequencies in the frequency domain. You can see that this speaker has four significant frequency resonances in his voice at approximately 150 Hz, 300 Hz, 490 Hz, and 700 Hz. These resonances are known as formants. Notice that the Ticks option customizes the labeling of the x-axis.

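A sketch; placing ticks at the four formant frequencies is one plausible choice:

    ListLinePlot[Take[spectrum, 2000], PlotRange -> All,
     Ticks -> {{150, 300, 490, 700}, Automatic}]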

Typically, when analyzing voice, one second is too long a sample. Just think how many syllables you utter in one second of normal speech. A much more appropriate length would be 1/10 or 1/20 or even 1/30 of a second. You can easily identify various phonemes of "yes we can" in the plot below: the "yeh" and "sss" of the "yes," the singular vowel sound of "we," and the hard "c" and "an" of "can."

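A time-domain plot of the one-second clip, scaled to seconds on the x-axis:

    ListLinePlot[Take[data, 48000], DataRange -> {0, 1}, PlotRange -> All]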

Here’s the "we," which is very homogeneous.

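A sketch; the exact position of the "we" within the clip is a guess you would read off the plot above:

    we = data[[19201 ;; 28800]];   (* 9,600 samples = 1/5 second *)
    ListLinePlot[we, PlotRange -> All]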

You’re now looking at 9,600 sample points (9,600/48,000 = 1/5 sec) in the time domain, so each point in the frequency domain represents 48,000/9,600 = 5 Hz. There’s a direct trade-off between using as few sample points as possible, to narrow the analysis to a single phoneme, and sampling enough points to achieve the desired precision in the frequency domain.

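The corresponding spectrum; at 5 Hz per bin, the first 400 bins cover 0 to 2,000 Hz:

    ListLinePlot[Take[Abs[Fourier[we]], 400], PlotRange -> All]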

Here, half as many points (4,800) sampled from the same region focuses our analysis in the time domain, but each point in the frequency domain now represents 10 Hz. Perhaps we’re losing some detail in the 150-200 Hz range, as well as the 300-350 Hz range?

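A sketch of the coarser analysis; at 10 Hz per bin, the first 200 bins cover the same 0 to 2,000 Hz:

    ListLinePlot[Take[Abs[Fourier[Take[we, 4800]]], 200], PlotRange -> All]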