Chapter 11

Frequency Analysis Part I

Fourier Decomposition

Pascal Wallisch


Keywords

Fourier analysis; Fourier transform; fast Fourier transform; Fourier series; power; phase analysis

11.1 Goals of this Chapter

This chapter introduces the most common method of decomposing a time series into frequency components, Fourier analysis. You will learn about the Fourier transform and the associated amplitude and phase spectra. The MATLAB® implementation of the fast Fourier transform (FFT), an efficient algorithm for calculating Fourier transformations, will be introduced and applied to the analysis of human speech sounds.

11.2 Background

Figure 11.1 shows typical recordings of two human vowel sounds. How can you characterize these different sounds? Frequency analysis provides a way to examine the relative contributions of various frequencies to an overall signal. In the case of an auditory signal, the frequency of a component is what a listener perceives as its pitch.

11.2.1 Real Fourier Series

Take some continuous function f. We can approximate such a function with a weighted series of sinusoids of various frequencies. Such a series is termed the real Fourier series:

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos(nt) + \sum_{n=1}^{\infty} b_n \sin(nt) \tag{11.1}$$

Here, the coefficients a_n and b_n represent the relative strength of each frequency component n/2π. [a_0 represents the nonoscillatory component of f(t).] So, given f(t), determining the coefficients a_n and b_n allows f(t) to be represented as a sum of sinusoids.

We will exploit two special properties of the sine and cosine functions to find the Fourier series coefficients a_n and b_n. First, over the interval −π to π, cosine and sine functions with differing integer frequencies are orthogonal. The integral of the product of two mutually orthogonal functions evaluates to zero. So, the integral of the product of two cosine functions or two sine functions with differing frequencies is zero over this interval, as is the integral of the product of any sine function with any cosine function. Second, the integral of the square of a cosine or sine function of nonzero frequency over this interval is π.
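Both properties can be checked numerically. The following is a minimal sketch, not part of the original derivation: the frequencies m and n are arbitrary illustrative choices, and the integrals are approximated with trapz on a dense grid.

```matlab
% Numerical check of the orthogonality properties on [-pi, pi].
% m and n are arbitrary illustrative integer frequencies (assumptions).
t = linspace(-pi, pi, 2001);               % dense grid over the interval
m = 2;
n = 3;

I_diff  = trapz(t, cos(m*t) .* cos(n*t));  % differing frequencies: close to 0
I_mixed = trapz(t, sin(m*t) .* cos(m*t));  % sine times cosine: close to 0
I_same  = trapz(t, cos(m*t) .^ 2);         % squared cosine: close to pi

fprintf('different: %.2e, mixed: %.2e, same: %.6f\n', I_diff, I_mixed, I_same);
```

Changing m and n to other positive integers should leave the pattern unchanged: products of distinct components integrate to roughly zero, and the squared term integrates to roughly π.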

To find the strength, a_m, of a cosine component m, multiply both sides of Equation 11.1 by the corresponding cosine function and integrate:

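$$\int_{-\pi}^{\pi} f(t)\cos(mt)\,dt = \int_{-\pi}^{\pi} \left[\frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos(nt) + \sum_{n=1}^{\infty} b_n \sin(nt)\right]\cos(mt)\,dt \tag{11.2}$$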

All terms on the right side except the cosine term where m=n yield zero:

$$\int_{-\pi}^{\pi} f(t)\cos(mt)\,dt = a_m \int_{-\pi}^{\pi} \cos^2(mt)\,dt \tag{11.3}$$

The right side integral evaluates to π over the integration range, yielding an expression for the Fourier series coefficient:

$$\int_{-\pi}^{\pi} f(t)\cos(mt)\,dt = \pi\, a_m \tag{11.4}$$

$$a_m = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos(mt)\,dt \tag{11.5}$$
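The coefficient expression, a_m = (1/π) ∫ f(t) cos(mt) dt over −π to π, can be tried on a signal whose coefficients are known in advance. This is a sketch under stated assumptions: the test signal f is hypothetical, and trapz stands in for the exact integral.

```matlab
% Recover the cosine coefficients of a known test signal on [-pi, pi].
t = linspace(-pi, pi, 2001);
f = 3*cos(2*t) + 0.5*sin(5*t);   % hypothetical signal with a_2 = 3, b_5 = 0.5

a = zeros(1, 4);
for m = 1:4
    % a_m = (1/pi) * integral of f(t)*cos(m*t) over [-pi, pi]
    a(m) = trapz(t, f .* cos(m*t)) / pi;
end
disp(a);
```

Only the second entry of a should be appreciably different from zero, because the sine term and the cosine terms of other frequencies are orthogonal to cos(2t).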

In general, the interval of f(t) will not be −π to π. For an interval centered on x with length 2L, the expression becomes

$$a_m = \frac{1}{L}\int_{x-L}^{x+L} f(t)\cos\left(\frac{m\pi t}{L}\right)dt \tag{11.6}$$

A similar procedure using sine functions yields the coefficients for the sine terms of the Fourier series.
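For reference, the sine case on the interval −π to π can be written out as follows. This is a sketch that assumes the series is expressed in terms of cos(nt) and sin(nt): multiplying by sin(mt) and integrating removes every term except the sine term with n = m, whose squared integral is π.

```latex
\int_{-\pi}^{\pi} f(t)\sin(mt)\,dt
  = b_m \int_{-\pi}^{\pi} \sin^2(mt)\,dt
  = \pi\, b_m
\quad\Longrightarrow\quad
b_m = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin(mt)\,dt
```

For an interval of length 2L, the same argument applies with sin(mπt/L) in place of sin(mt) and a leading factor of 1/L in place of 1/π.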

11.3 Exercises

Exercise 11.1

Write a MATLAB function to calculate the coefficients of a real Fourier series. Hint: The function will need to shift the interval so that the interval encompasses the entire time series. In other words, x=0 and L=half the range of t.

11.3.4 Amplitude Spectrum

Often when you are using Fourier analysis, the amplitude spectrum is one of the first analyses performed. The amplitude spectrum graphs amplitude against frequency. In terms of the Fourier series representation, the amplitude spectrum depicts the magnitude of the coefficients at various frequencies. As such, it depicts the relative strengths of the various frequency components.

The following code generates a time series composed of 10 sine waves whose frequencies and amplitudes vary systematically.

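L = 1000; % number of samples

X = zeros(1,L); % preallocate the time series

sampling_interval = 0.1; % time between successive samples

t = (1:L) * sampling_interval; % sample times

for N = 1:10

    X = X + N * sin(N*pi*t); % component N: amplitude N, frequency N/2

end

plot(t, X);

Y = fft(X)/L; % complex FFT coefficients, normalized by the number of samples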

Now, the variable Y contains the normalized FFT of X. Note the normalization factor L, the number of samples. Displaying the amplitude spectrum of X requires plotting the amplitudes at various frequencies. Note that fft returns only a single output, the vector of complex transform coefficients; it does not return the frequencies to which they correspond. Now, how do you determine the frequency scale?

The coefficients returned by the FFT correspond to evenly spaced frequencies, starting at 0 and rising to a theoretical maximum called the Nyquist limit. Nyquist demonstrated that a discrete sampling of a continuous process can capture frequencies no higher than half the sampling frequency. Since the code above defines the sampling interval, the Nyquist limit is half the inverse of the sampling interval.

The following code calculates the Nyquist limit for the time series:

NyLimit = (1 / sampling_interval)/ 2;
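To see why frequencies above this limit cannot be captured, consider what happens when one is sampled anyway. In the sketch below, f_high is a hypothetical frequency chosen above the limit; at this sampling interval its samples are indistinguishable from those of a lower frequency, an effect known as aliasing.

```matlab
sampling_interval = 0.1;                  % same value as in the text
NyLimit = (1 / sampling_interval) / 2;    % 5 cycles per unit time
t = (0:99) * sampling_interval;           % 100 sample times

f_high  = 7;                              % hypothetical frequency above the limit
f_alias = 1/sampling_interval - f_high;   % 3: the frequency it masquerades as

x_high  = sin(2*pi*f_high*t);
x_alias = -sin(2*pi*f_alias*t);           % sign flip: the alias has negative frequency

max_diff = max(abs(x_high - x_alias));    % expected to be at rounding-error level
fprintf('Nyquist limit: %g, largest sample difference: %.2e\n', NyLimit, max_diff);
```

Because the two sampled sequences agree to within rounding error, no analysis of the samples alone can tell the two frequencies apart.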

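When viewing the FFT, it is important to remember that the result is the complex transform. Thus, simply using the result of the FFT as a set of real coefficients can cause a number of problems. To display the amplitude spectrum, the absolute value of the complex coefficients will be used. The values returned by fft cover both positive and negative frequencies: the first half of the vector holds the coefficients for frequencies from 0 up to the Nyquist limit, and the second half holds the coefficients for the negative frequencies. If the time series data are purely real, then the resultant transform will have conjugate symmetry, so its amplitude is an even function of frequency. That is, the amplitude spectrum will be symmetrical about zero frequency. So, in this very frequent case, only the first half of the result of fft is used. Because each sine wave is split equally between a positive and a negative frequency, a component of amplitude N appears in this half as a peak of height N/2.

The following code employs linspace to generate frequency values and plots the amplitude spectrum. linspace generates a linearly spaced sequence of values given initial and final values. Here, the initial and final values are 0 and 1, with a value count of L/2. The resultant vector is scaled by the Nyquist limit to generate the frequency vector. Note that this frequency axis is a close approximation: the exact frequency of the kth element of Y is (k−1)/(L × sampling_interval), so the first L/2 elements actually stop one frequency step short of the Nyquist limit.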

F = linspace(0,1,L/2)*NyLimit;

plot(F, abs(Y(1:L/2)));
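As a cross-check on the plot above, the following sketch rebuilds the same time series, verifies the conjugate symmetry of the transform, and computes an exact frequency axis with single-sided amplitudes. The names F_exact, A, sym_err, and peak_bins are introduced here for illustration and are not part of the chapter's code.

```matlab
% Rebuild the time series from the text.
L = 1000;
sampling_interval = 0.1;
t = (1:L) * sampling_interval;
X = zeros(1, L);
for N = 1:10
    X = X + N * sin(N*pi*t);
end
Y = fft(X) / L;

% Conjugate symmetry of a real signal: bin k equals the conjugate of bin L-k.
k = 1:(L/2 - 1);
sym_err = max(abs(Y(k + 1) - conj(Y(L - k + 1))));

% Exact bin frequencies and single-sided amplitudes.
Fs = 1 / sampling_interval;        % sampling frequency
F_exact = (0:L/2) * Fs / L;        % 0, 0.01, ..., 5 (L/2 + 1 values)
A = abs(Y(1:L/2 + 1));
A(2:end-1) = 2 * A(2:end-1);       % fold in the negative-frequency half

% Component N has frequency N/2, which falls exactly on element 50*N + 1.
peak_bins = 50 * (1:9) + 1;
disp([F_exact(peak_bins)' A(peak_bins)']);   % columns: frequency, amplitude
```

With the doubling applied, each peak height should match the amplitude N of the corresponding sine wave, and plot(F_exact, A) displays the result. Only nine peaks appear: the tenth component has a frequency exactly at the Nyquist limit, where sin(10*pi*t) is zero at every sample time, so it leaves no trace in the sampled data.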

11.4 Project

In this project, you will be asked to use Fourier decomposition to analyze vowel sounds produced by human speakers. On the companion website, you will find five examples of vowel sounds as produced by male American English speakers. Each sound corresponds to one of the vowel sounds in Table 11.1, which lists the average formant frequencies of each vowel as spoken by a male speaker of American English. You will use power spectra of these sounds to classify each recording as one of the vowel sounds in the table.

To complete this project, you need to understand how formants relate to frequency analysis. The human vocal tract has multiple cavities in which speech sounds resonate. As such, most sounds have multiple strong frequency components. In classifying speech sounds, the lowest strong frequency band is termed the first formant. The next highest is termed the second formant, and so on.

Vowels lend themselves to a particularly simple characterization through their formants. Typically, vowel sounds have distinguishable first and second formants. Table 11.1 shows first and second formants for four vowel sounds in American English. Thus, the short “i” sound would have strong frequency representation at 342 Hz and at 2322 Hz.
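To illustrate how formants show up in a power spectrum, the following sketch analyzes a synthetic stand-in for a vowel rather than one of the project recordings. The sampling rate Fs, the signal duration, and the relative amplitudes are assumptions; only the two frequencies come from the text.

```matlab
Fs = 8000;                         % hypothetical sampling rate in Hz
n  = Fs;                           % one second of samples, so bins are 1 Hz apart
t  = (0:n-1) / Fs;
x  = sin(2*pi*342*t) + 0.5*sin(2*pi*2322*t);   % synthetic two-formant signal

Y = fft(x) / n;
P = abs(Y(1:n/2 + 1)) .^ 2;        % power at the non-negative frequencies
F = (0:n/2) * Fs / n;              % frequency of each bin in Hz

[~, order] = sort(P, 'descend');   % strongest bins first
formants = sort(F(order(1:2)));    % two strongest frequencies, in ascending order
disp(formants);
```

Real recordings are less tidy: each formant is a broad band containing several harmonics of the voice rather than a single spectral line, so picking the two largest bins is unlikely to be sufficient, and some smoothing or band-wise peak finding would probably be needed.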

MATLAB Functions, Commands, and Operators Covered in this Chapter

fft

ifft

conj