© Springer Nature Switzerland AG 2020
M. La Rocca et al. (eds.), Nonparametric Statistics, Springer Proceedings in Mathematics & Statistics 339, https://doi.org/10.1007/978-3-030-57306-5_42

AutoSpec: Detecting Exiguous Frequency Changes in Time Series

David S. Stoffer
Department of Statistics, University of Pittsburgh, Pittsburgh, PA, USA

Abstract

Most established techniques that search for structural breaks in time series may not be able to identify slight changes in the process, especially changes in frequency. Many of these techniques assume very smooth local spectra and consequently produce overly smooth estimates; such over-smoothing misses slight frequency changes because frequencies that are close together are lumped into one. The goal of this work is to develop techniques that concentrate on detecting slight frequency changes by requiring a high degree of resolution in the frequency domain.

1 Introduction

Many time series are realizations of nonstationary random processes, hence estimating their time varying spectra may provide insight into the physical processes that give rise to these time series. For example, EEG time series are typically nonstationary, and estimating the time varying spectra based on the EEG of epilepsy patients may lead to methods capable of predicting seizure onset; e.g., see [2]. Similarly, analyzing the time varying spectrum of the Southern Oscillation Index (SOI) may further our knowledge of the frequency of the El Niño Southern Oscillation (ENSO) phenomenon and its impact on global climate; e.g., see [3].

Most established techniques that search for structural breaks in time series, however, may not be able to identify slight frequency changes at the resolution of interest; of course, the resolution depends on the particular application. Many of these techniques assume very smooth local spectra and tend to produce overly smooth estimates, which may miss slight frequency changes because frequencies that are close together will be lumped into one frequency; see Sect. 5 for further details. The goal of this work is to develop techniques that concentrate on detecting slight frequency changes by requiring a high degree of resolution in the frequency domain.

The basic assumptions here are that, conditional on the location and number of segments, the time series is piecewise stationary with each piece having a spectral density. A detailed description of the model is given in Sect. 2. In addition to representing time series that have regime changes, the model can be used to approximate slowly varying processes; e.g., see [1], which uses dyadic segmentation to find the approximate location of breakpoints. The approach taken in [8] was to fit piecewise AR models using minimum description length and a genetic algorithm for solving the difficult optimization problem. [13] proposed nonparametric estimators based on dyadic segmentation and smooth local exponential functions. [15] estimated the log of the local spectrum using a Bayesian mixture of splines. The basic idea of this approach is to first partition the data into small sections. It is then assumed that the log spectral density of the evolutionary process in any given partition is a mixture of individual log spectra. A mixture of smoothing splines model with time varying mixing weights is used to estimate the evolutionary log spectrum. Later, [16] improved on the technique of [15] by adaptively selecting breakpoints.

For background, note that spectral analysis has to do with partitioning the variance of a stationary time series, $$\{X_t \}$$, into components of oscillation indexed by frequency $$\omega $$, and measured in cycles per unit of time, for $$-1/2< \omega \le 1/2$$. Given a time series sample, $$\{X_t;\, t = 1,..., n\}$$, that has been centered by its sample mean, the sample spectral density (or periodogram) is defined in terms of frequency $$\omega $$:
$$\begin{aligned} I_n(\omega ) = \Bigl |n^{-1/2} \sum _{t=1}^n X_t \, \text {e}^{-2 \pi i \omega t}\Bigr |^2 \,. \end{aligned}$$
(1)
The periodogram is essentially the squared-correlation of the data with sines and cosines that oscillate at frequency $$\omega $$.
The spectral density, $$f(\omega )$$, of a stationary time series can be defined as the limit ($$n \rightarrow \infty $$) of $$E[I_n(\omega )]$$, provided that the limit exists; details can be found in [17, Chap. 4]. It is worthwhile to note that $$f(\omega ) \ge 0$$, $$f(\omega ) = f(-\omega )$$, and
$$\begin{aligned} \int _{-1/2}^{1/2} f(\omega ) \ d\omega = 2\int _{0}^{1/2} f(\omega ) \ d\omega = \sigma ^2, \end{aligned}$$
(2)
where $$\sigma ^2=\text {var}(X_t) < \infty $$. Thus, the spectral density can be thought of as the variance density of a time series relative to frequency of oscillation. That is, for positive frequencies between 0 and 1/2, the portion of the variance that can be attributed to oscillations in the data at frequency $$\omega $$ is roughly $$2f(\omega )\, d\omega $$. If the time series $$X_t$$ is white noise, that is, $$E(X_t)$$ and $$\text {var}(X_t)=\sigma ^2$$ do not depend on time t, and $$\text {cov}(X_s, X_t)=0$$ for all $$s\ne t$$, then $$f(\omega )\equiv \sigma ^2$$. The designation white originates from the analogy with white light and indicates that all possible periodic oscillations are present with equal strength.
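For illustration, the periodogram (1) can be computed directly with the FFT. The following is a minimal R sketch; the helper name periodogram and the simulated white-noise example are ours, not part of any package, and base R's spec.pgram computes a closely related estimate with its own default scaling, tapering, and detrending conventions.

```r
# A minimal sketch of the periodogram (1); the function name and the
# white-noise example are illustrative, not from any package
periodogram <- function(x) {
  n <- length(x)
  x <- x - mean(x)                 # center by the sample mean
  I <- Mod(fft(x))^2 / n           # |n^{-1/2} sum_t x_t e^{-2 pi i w t}|^2
  list(freq = (0:(n - 1)) / n,     # Fourier frequencies w_k = k/n
       spec = I)
}

# Example: for white noise, f(w) is flat at sigma^2, so the periodogram
# ordinates should average to roughly sigma^2 = 4
set.seed(1)
P <- periodogram(rnorm(1024, sd = 2))
mean(P$spec[-1])                   # close to 4
```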

2 Model and Existing Methods

Let a time series $$\{X_{t};\, t=1,\dots , n\}$$ consist of an unknown number of segments, m, and let $$\xi _{j}$$ be the unknown location of the end of the jth segment, $$j=0,1,\ldots ,m$$, with $$\xi _{0}=0$$ and $$\xi _{m}=n$$. Then conditional on m and $$\pmb \xi =(\xi _{0}, \ldots ,\xi _{m})'$$, we assume that the process $$\{X_{t}\}$$ is piecewise stationary. That is,
$$\begin{aligned} X_{t} = \sum _{j=1}^{m} X_{t,j} \, \delta _{t,j}\,, \end{aligned}$$
(3)
where for $$j=1,\ldots ,m$$, the processes $$X_{t,j}$$ have spectral density $$f^\theta _{j}(\omega )$$ that may depend on parameters $$\theta $$, and $$\delta _{t,j} =1$$ if $$t \in [\xi _{j-1}+1, \xi _{j}]$$ and 0 otherwise.
Consider a realization $$ \pmb x=(x_1,\ldots ,x_n)'$$ from process (3), where the number and locations of the stationary segments are unknown. Let $$n_{j}$$ be the number of observations in the jth segment. We assume that each $$n_{j}$$ is large enough for the local Whittle likelihood (see [20]) to provide a good approximation to the likelihood. Given a partition of the time series $$\pmb x$$, the jth segment consists of the observations $$\pmb x_{j}=\{x_t: \xi _{j-1}+1\le t\le \xi _{j}\}$$, $$j=1,\ldots ,m$$, with underlying spectral densities $$f^\theta _{j}$$ and periodograms $$I_{j}$$, evaluated at frequencies $$\omega _{k_j}=k_j/n_{j}$$, $$0\le k_j\le n_{j}-1$$. For a given partition $$\pmb \xi $$, the approximate likelihood of the time series is given by
$$\begin{aligned}  L(f^\theta _{1},\ldots ,f^\theta _{m} \mid \pmb x, \pmb \xi ) \approx \prod _{j=1}^m (2\pi )^{-n_{j}/2} \prod _{k_j=0}^{n_{j}-1} \exp \Bigl \{-\frac{1}{2}\Bigl [\log f^\theta _{j}(\omega _{k_j})\\ +I_{j}(\omega _{k_j})/f^\theta _{j}(\omega _{k_j})\Bigr ]\Bigr \}. \end{aligned}$$
(4)
Note that in setting up the model, most items depend on the number of regimes, m. For ease, that dependence is understood and dropped from the notation.
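To make the Whittle approximation concrete, the R sketch below evaluates the log of (4) for a single segment, given the segment's data and a candidate spectral density; the function name and the white-noise example are illustrative only.

```r
# A sketch of the log of the local Whittle likelihood (4) for one
# segment xj, given a candidate spectral density function f(w)
whittle_loglik <- function(xj, f) {
  nj   <- length(xj)
  I    <- Mod(fft(xj - mean(xj)))^2 / nj   # periodogram I_j
  freq <- (0:(nj - 1)) / nj                # w_k = k/n_j, k = 0, ..., n_j - 1
  fw   <- f(freq)
  -0.5 * nj * log(2 * pi) - 0.5 * sum(log(fw) + I / fw)
}

# Example: score a white-noise segment against its true flat spectrum
set.seed(2)
whittle_loglik(rnorm(500), function(w) rep(1, length(w)))
```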

2.1 AdaptSpec

The frequency domain approach used in [16] is a Bayesian method that combines (4) with a linear smoothing spline prior on $$\log f^\theta _{j}(\omega )$$ for $$j=1,\ldots ,m$$. In addition, a uniform prior is placed on the breakpoints, $$\Pr (\xi _{j}=t \mid m)=1/p_{j}$$ for $$j=1,\ldots , m-1$$, where $$p_{j}$$ is the number of available locations for split point $$\xi _{j}$$. The prior on the number of segments is also uniform, $$\Pr (m=k)=1/M$$ for $$k=1,\ldots ,M$$, where M is some large but fixed number. The approach uses reversible jump Markov chain Monte Carlo (RJ-MCMC) methods to evaluate the posteriors. The technique is available in an R package called BayesSpec.

2.2 AutoParm

Although this method, which is described in [8], is a time domain approach, the authors argue that AR models are dense in the space of bounded spectral densities (e.g., see [17, Sect. 4.5]) and can thus be used as a frequency domain approach. In this case, each time series $$\{X_{t,j}\}$$ is piecewise stationary with AR$$(p_j)$$ behavior in each segment, $$j=1,\dots , m$$. Then, Minimum Description Length (MDL) as described in [14] is used to find the best combination of the number of segments, m, the breakpoints $$\xi _j$$ (or segment sizes $$n_j$$), and the orders/estimates of the piecewise AR processes. The idea is to minimize the Code Length (CL) necessary to store the data (i.e., the amount of memory required to encode the data), which leads to a BIC-type criterion to find the model that minimizes
$$\begin{aligned} \sum _j \Bigl ( \ell _j + \log p_j + \frac{p_j+2}{2} \log n_j \Bigr )+ \log m + (m+1) \log n, \end{aligned}$$
(5)
where $$\ell _j = -\log \hat{L}_j(\mu _j, \phi _1,\dots ,\phi _{p_j}, \sigma _j^2 \mid x,\xi _m) $$ and $$\hat{L}_j$$ is the maximized value of the usual Gaussian AR($$p_j$$) likelihood for segment $$j=1,\dots ,m$$,
$$\begin{aligned} L_j (\cdot \mid \cdot ) = (2\pi )^{-n_j/2}\vert {\Gamma _j}\vert ^{-1/2}\exp \left\{ -\tfrac{1}{ 2} (\pmb x_j-{\mu _j}\pmb 1)'{\Gamma _j}^{-1}(\pmb x_j-{\mu _j} \pmb 1)\right\} \,, \end{aligned}$$
(6)
where $$\mu _j$$ is the segment’s constant mean value, $$\pmb 1$$ is the corresponding vector of ones, and $$\Gamma _j = \sigma ^2_j V_j$$ is the usual AR$$(p_j)$$ variance-covariance matrix corresponding to $$\pmb x_j$$. Because of the Markov structure of AR models, the likelihood has a simple form; see [6, Prob. 8.7] for details. Fitting the model requires numerical optimization, which is accomplished via a genetic algorithm, a derivative-free search for a minimum based on concepts from evolutionary biology. Basic information on genetic algorithms may be obtained from the Mathworks site [11].
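As a rough illustration of (5) and (6), the following R sketch evaluates the code length for a fixed candidate segmentation, obtaining each segment's maximized Gaussian AR likelihood from arima() and choosing each $$p_j$$ by its contribution to (5). The genetic-algorithm search over the number and location of the breakpoints is not shown, and the function name, the cap p.max, and the candidate breaks are our own illustrative choices.

```r
# A sketch of the code length (5) for a *fixed* candidate segmentation;
# AutoParm's genetic-algorithm search over segmentations is not shown
mdl <- function(x, breaks = NULL, p.max = 10) {
  edges <- c(0, breaks, length(x))        # xi_0 = 0, ..., xi_m = n
  m  <- length(edges) - 1
  cl <- log(m) + (m + 1) * log(length(x))
  for (j in 1:m) {
    xj <- x[(edges[j] + 1):edges[j + 1]]
    nj <- length(xj)
    best <- Inf
    for (p in 1:p.max) {
      # arima() with a mean term gives the Gaussian AR(p) likelihood (6)
      fit  <- arima(xj, order = c(p, 0, 0), method = "ML")
      term <- -fit$loglik + log(p) + 0.5 * (p + 2) * log(nj)
      best <- min(best, term)             # choose p_j by its code length
    }
    cl <- cl + best
  }
  cl
}

# Example usage: compare no break against a single break at t = 1000
# mdl(x)  vs.  mdl(x, breaks = 1000)
```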

2.3 The Problem Is Resolution

The problem with the aforementioned techniques is that they tend to over-smooth the spectral density estimate, so that small frequency shifts cannot be detected. Resolution problems were thoroughly discussed in the literature in the latter half of the twentieth century. Some details of the history of the problem, as well as a simulation example, are given in Sect. 5. For our analysis, we focus on the El Niño–Southern Oscillation (ENSO). The Southern Oscillation Index (SOI) measures changes in air pressure related to sea surface temperatures in the central Pacific Ocean. The central Pacific warms every three to seven years due to the El Niño effect, which has been blamed for various global extreme weather events. It has come to be taken as fact that sometime after 1980, the frequency of the El Niño–La Niña (or ENSO) cycle increased with global warming; e.g., see [19].

Monthly values of the SOI are displayed in Fig. 1 for years 1866–2017 [7]; additionally, the data have been filtered to exhibit the ENSO cycle. Also shown in Fig. 1 are the AutoParm results (vertical dashed line) and AdaptSpec results (vertical solid line) when applied to the SOI series. AutoParm prefers a breakpoint around 1920, whereas AdaptSpec indicates there are no breakpoints because $$\Pr (\text {break} \mid \text {data}) =0.3$$. However, assuming that there is one structural break, the posterior distribution of the breakpoint location (with a vertical line at the mean) is displayed in the figure.
Fig. 1

Monthly values of the SOI for years 1866–2017 with breakpoints (vertical lines) determined by AutoParm (- - -) and by AdaptSpec (|). The solid smooth line is the filtered series that exhibits the ENSO cycle. For AdaptSpec, $$\Pr (\text {break} \mid \text {data}) =0.3$$ indicates there is probably not a breakpoint

Fig. 2

The segmented spectral estimates using AutoParm and AdaptSpec. The vertical lines show the known 3–7 year ENSO cycle range. The frequency scale is in years and is truncated at the annual cycle

Figure 2 shows the estimated spectra for each segment for the AutoParm and AdaptSpec techniques. The vertical lines show the known 3–7 year ENSO cycle (see https://www.weather.gov/mhx/ensowhat for more details). Both methods indicate that in the second segment, the ENSO cycle is much broader, including both slower and faster frequencies than the usual ENSO cycle. One thing that is clear from both methods is that the estimated spectra are too smooth (broad) to reveal whether there has been a decisive frequency shift in the ENSO cycle.

3 AutoSpec—Parametric

Since the interest is in spectra, an obvious extension of AutoParm is to replace the Gaussian AR likelihood in the MDL criterion with the Whittle likelihood. That is, in (6), replace $$L_j$$ with the Whittle likelihood (4), but with
$$\begin{aligned} f^\theta _j(\omega ) = \sigma ^2_j\, |\phi _j(e^{ 2 \pi \, i\, \omega })|^{-2}\,, \end{aligned}$$
(7)
where $$\phi _j(z)$$ is the AR polynomial of order $$p_j$$ given by
$$\phi _j(z) = 1 - \phi _1 z - \dots -\phi _{p_j} z^{p_j}\,, $$
for $$j=1,\dots , m$$. The basic idea is to use periodograms as the data in a move from the time domain to the frequency domain. A similar method would be to use the Bloomfield EXP model [5],
$$\begin{aligned} f^\theta _j(\omega ) = \sigma _j^2 \exp \Bigl (2 \sum _{\ell =1}^{p_j} \theta _{\ell ,j} \cos (2\pi \ell \omega )\Bigr )\,. \end{aligned}$$
However, EXP yields spectral estimates that are very similar to the AR spectra, so these are not displayed. Using the AR form in (7) yields an interesting and different result, which is shown in Fig. 3: the method finds three segments, but no break near 1980. Because the AR spectral estimates shown in Fig. 3 are so smooth, the results are not easy to interpret. However, we computed nonparametric spectral estimates (see [17, Sect. 4]) on the three chosen segments, and those estimates are displayed in Fig. 4. Here we note that segments two and three show an increased frequency around the 2.5 year cycle, and segment three shows an additional slower frequency, which may be interpreted as the existence of longer El Niño/La Niña cycles.
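For reference, the parametric spectral density (7) is easy to evaluate from fitted AR coefficients. The R sketch below shows the computation; the helper name ar_spec, the simulated AR(2) segment, and the frequency grid are illustrative choices and not part of the AutoSpec search itself.

```r
# A sketch of the parametric AR spectral density (7)
ar_spec <- function(phi, sigma2, freq) {
  # phi(z) = 1 - phi_1 z - ... - phi_p z^p evaluated at z = e^{2 pi i w}
  z    <- exp(2i * pi * freq)
  poly <- sapply(z, function(zz) 1 - sum(phi * zz^seq_along(phi)))
  sigma2 / Mod(poly)^2
}

# Example: fit an AR model to a simulated segment and plot its spectrum
set.seed(3)
xj   <- arima.sim(list(ar = c(1.5, -0.75)), n = 500)  # peaked AR(2) spectrum
fit  <- ar(xj, order.max = 10)                        # order chosen by AIC
freq <- seq(0, 0.5, length.out = 256)
plot(freq, ar_spec(fit$ar, fit$var.pred, freq), type = "l",
     xlab = "frequency", ylab = "AR spectrum")
```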
Fig. 3

The SOI series with the two breakpoints found using AutoSpec, and the individual AR spectral estimates for each of the three segments. The vertical lines show the known 3–7 year ENSO cycle range

Fig. 4

Nonparametric estimates of the spectra in the three segments identified by AutoSpec; compare to Fig. 3. The vertical lines show the known 3–7 year ENSO cycle range

4 AutoSpecNP—Nonparametric

Because of the success of the nonparametric approach in the previous section, the natural next step is to develop a fully nonparametric technique. To this end, consider a triangular kernel (Bartlett window), $$\{W(\ell );\, \ell =0, \pm 1, \dots ,\pm b\}$$ with $$W(\ell ) \propto 1 - |\ell |/(b+1)$$ such that $$\sum W(\ell )=1$$. The bandwidth b is chosen by MDL, the kernel is used to smooth the periodogram of the fully tapered data, and the Whittle likelihood is then evaluated at the resulting spectral estimate. That is, to nonparametrically evaluate the likelihood (4) in each segment $$j=1,\dots , m$$, use
$$\begin{aligned} \hat{f}_j(\omega _{k_j}) = \sum _{\ell =-b_j}^{b_j} W(\ell )\, I^{\text {taper}}_{j}(\omega _{k_j}+\ell ), \end{aligned}$$
(8)
where the $$b_j=1,2,\dots $$ are chosen by MDL (similar to AR orders in AutoParm). Here, $$I_j^{\text {taper}}(\cdot )$$ represents the periodogram of the fully cosine tapered data in segment j for $$j=1,\dots , m$$.
For example, if $$\{x_t\}$$ represents the data in a segment, then they are preprocessed as $$y_t= h_t x_t$$ where $$h_t$$ is the cosine bell taper favored by [4],
$$ h_t=.5\biggl [1+\cos \biggl (\frac{2\pi (t-\overline{t}) }{ n}\biggr )\biggr ]\,, $$
where $$\overline{t} = (n+1)/2$$ and n is the number of observations in that segment. In this case, the periodogram is that of the preprocessed data, $$y_t$$. Figure 5 shows an example of the Bartlett window with $$b=4$$; the corresponding spectral window (see [17, Sect. 4]) of the Bartlett kernel is not very good unless the data are tapered. The spectral window corresponding to the Bartlett kernel with tapering is also displayed in Fig. 5.
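The estimate (8) for a single segment can be written in a few lines of R. The sketch below is a literal rendering under our own naming; the MDL choice of $$b_j$$ and the segmentation search are omitted, and a practical implementation would typically also rescale for the power lost to tapering.

```r
# A literal sketch of the segment estimate (8): fully cosine-tapered
# periodogram smoothed with a triangular (Bartlett) kernel of bandwidth b
autospec_np <- function(xj, b) {
  n  <- length(xj)
  tt <- 1:n
  h  <- 0.5 * (1 + cos(2 * pi * (tt - (n + 1) / 2) / n))  # full cosine bell
  y  <- h * (xj - mean(xj))                               # tapered data
  I  <- Mod(fft(y))^2 / n                                 # tapered periodogram
  W  <- 1 - abs(-b:b) / (b + 1)                           # Bartlett weights
  W  <- W / sum(W)
  fhat <- sapply(0:(n - 1), function(k)                   # (8), circularly
    sum(W * I[((k + (-b:b)) %% n) + 1]))
  list(freq = (0:(n - 1)) / n, spec = fhat)
}

# Example: a white-noise segment smoothed with bandwidth b = 4
set.seed(4)
est <- autospec_np(rnorm(512), b = 4)
```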
Fig. 5

Example of the Bartlett kernel and the corresponding Fejér spectral window when a taper is applied

Fig. 6

Nonparametric estimates of the spectra in the three segments identified by AutoSpecNP; compare to Figs. 3 and 4. The vertical lines show the known 3–7 year ENSO cycle range

Figure 6 shows the results of the fully nonparametric method. The figure displays the SOI series along with the estimated breakpoints. The fully nonparametric method finds a breakpoint near 1980, as has been suggested by climatologists, initially in [18]. Although the differences in each segment are subtle, this method has enough resolution to distinguish between minor frequency shifts. We may interpret the findings as follows. From 1866 to 1905, the ENSO cycle was a 3–7 year cycle. After that, there appears to be a shift to a faster ENSO cycle of about 2.5 years in addition to the usual 3–7 year cycle. Finally, after 1980, there appears to be a slower cycle introduced into the system. That is, after 1980, the ENSO cycle included a much slower component, indicating that El Niño events tend to be longer, but not faster than 2.5 years.

5 Spectral Resolution and a Simulation

As previously stated, the problem of resolution was discussed in the literature in the latter half of the twentieth century; e.g., [9, 10]. The basic rule of thumb is that the achievable frequency resolution, $$\Delta \omega $$, should be approximately the reciprocal of the observational time interval, $$\Delta t$$, of the data: $$\Delta \omega \approx 1/{\Delta t}.$$ Two signals can be as close as $$1/\Delta t$$ apart before there is significant overlap in the transform and the separate peaks are no longer distinguishable. For example, a stationary stretch of 1000 observations can resolve frequencies no closer than about 0.001 cycles per observation, and smoothing the periodogram widens that limit further.

For example, we generated a time series of length 2000 where
$$\begin{aligned} X_t = {\left\{ \begin{array}{ll} X_{1t} = 2\cos (2\pi \omega \, t)\cos (2\pi \delta \, t) + Z_{1t}\,, &{} 1\le t\le 1000\,,\\ X_{2t} = \cos (2\pi \omega \, t) + Z_{2t}\,, &{} 1001 \le t\le 2000\,, \end{array}\right. } \end{aligned}$$
(9)
with $$\omega = 1/25$$ and $$\delta =1/150$$, and where $$\{Z_{1t}\}$$ and $$\{Z_{2t}\}$$ are independent i.i.d. standard normal sequences. The difference between the two halves of the data is that $$X_{1t}$$ is a modulated version of $$X_{2t}$$. Modulation is a common occurrence in many signal processing applications, e.g., EEG (see [12]). In addition, note that
$$ X_{1t}= \cos (2\pi [\omega +\delta ]\, t) + \cos (2\pi [\omega -\delta ]\, t) + Z_{1t}\,, $$
so that $$X_{1t}$$ is characterized in the frequency domain by twin peaks at $$\omega \pm \delta $$.
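A realization of (9) is easily generated; the R sketch below uses $$\omega =1/25$$ and $$\delta =1/150$$ as above. Note that the twin peaks of the first half sit at $$\omega \pm \delta $$, which are $$2\delta =1/75\approx 0.013$$ apart; this comfortably exceeds the resolution limit of about 1/1000 for a 1000-point segment, so a raw or lightly smoothed periodogram can separate them, while heavier smoothing merges them into a single broad peak.

```r
# Generate a realization of (9): a modulated sinusoid plus noise for the
# first 1000 observations, an unmodulated sinusoid plus noise for the rest
set.seed(5)
omega <- 1/25
delta <- 1/150
t1 <- 1:1000
t2 <- 1001:2000
x1 <- 2 * cos(2 * pi * omega * t1) * cos(2 * pi * delta * t1) + rnorm(1000)
x2 <- cos(2 * pi * omega * t2) + rnorm(1000)
x  <- c(x1, x2)

# Raw (tapered, unsmoothed) periodograms of each half: twin peaks at
# omega +/- delta in the first half, a single peak at omega in the second
spec.pgram(x1, taper = 0.5, main = "first half")
spec.pgram(x2, taper = 0.5, main = "second half")
```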
Fig. 7

Realization of (9) showing the true breakpoint as a solid vertical line; the dashed vertical line shows the breakpoint identified by AutoSpecNP

Fig. 8

The results of running AutoSpecNP on the data $$X_t$$, which finds one breakpoint at $$t=1087$$. The panels show the estimated AutoSpecNP spectra for each identified segment and the estimate (8) with $$b=4$$ applied to all the data

Figure 7 shows a realization of $$X_t$$ with the changepoint marked. The figure also displays the breakpoint $$t=1087$$ identified by AutoSpecNP. We note, however, that AutoSpec and AutoParm do not identify any breakpoints.

Figure 8 shows the results of running AutoSpecNP, described in Sect. 4, on the data $$X_t$$. As seen from the figure, the procedure is able to distinguish between the two processes (with a breakpoint at $$t=1087$$). The method works because the procedure allows very limited to no smoothing of the periodogram. In addition to showing the spectral estimates of each segment, the figure also displays the estimate (8) with $$b=4$$ applied to the entire realization of $$X_t$$, which helps illustrate why the method works.