9
Multivariate spectral analysis of time series

Just as a univariate time series can be studied through its autocovariance/autocorrelation functions and lag relationships or through its spectral properties, a multivariate time series can be studied through a time domain approach or a frequency domain approach. In the time domain approach, we use the covariance/correlation matrices, and in the frequency domain approach we use the spectral matrices. In this chapter, after a brief review of the univariate frequency domain method, we introduce spectral analysis for both stationary and nonstationary vector time series. Without loss of generality, we assume zero-mean time series in the following discussion.

9.1 Introduction

Recall that for a univariate stationary time series process, Zt, its spectral representation is given by

(9.1) $Z_t = \int_{-\pi}^{\pi} e^{i\omega t}\, dU(\omega)$
where dU(ω) is a complex‐valued orthogonal stochastic process for each ω such that

(9.2) $E[dU(\omega)] = 0 \quad \text{for all } \omega$
(9.3) $E[dU(\omega)\, dU^*(\lambda)] = 0 \quad \text{for } \omega \neq \lambda$

and

(9.4) $E[dU(\omega)\, dU^*(\omega)] = dF(\omega)$

where U*(ω) is the complex conjugate of U(ω). Let γk be the autocovariance function of Zt. Its spectral representation is given by

(9.5) $\gamma_k = \int_{-\pi}^{\pi} e^{i\omega k}\, dF(\omega)$

where F(ω) is the spectral distribution function.

When the autocovariance function is absolutely summable, the spectrum or the spectral density exists and is equal to

(9.6) $f(\omega) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \gamma_k\, e^{-i\omega k}$

In this case, we have

(9.7) $\gamma_k = \int_{-\pi}^{\pi} e^{i\omega k}\, f(\omega)\, d\omega$

with

(9.8) $dF(\omega) = f(\omega)\, d\omega$

Given a time series Z1, Z2, …, Zn, its Fourier transform at the Fourier frequencies ωj = 2πj/n, − [(n − 1)/2] ≤ j ≤ [n/2], is

(9.9) $Z(\omega_j) = \frac{1}{n} \sum_{t=1}^{n} Z_t\, e^{-i\omega_j t}$

It can be shown that the sample spectrum is given by

(9.10) $\hat{f}(\omega_j) = \frac{n}{2\pi} \left| Z(\omega_j) \right|^2 = \frac{1}{2\pi n} \left| \sum_{t=1}^{n} Z_t\, e^{-i\omega_j t} \right|^2$

Expanding the squared modulus in Eq. (9.10) as a double sum over ZtZr,

$\hat{f}(\omega_j) = \frac{1}{2\pi n} \sum_{t=1}^{n} \sum_{r=1}^{n} Z_t Z_r\, e^{-i\omega_j (t - r)}$

and letting k = t − r, we see that

(9.11) $\hat{f}(\omega_j) = \frac{1}{2\pi} \sum_{k=-(n-1)}^{n-1} \hat{\gamma}_k\, e^{-i\omega_j k}$

where

$\hat{\gamma}_k = \frac{1}{n} \sum_{t=1}^{n-|k|} Z_t\, Z_{t+|k|}$

is the sample autocovariance function. If Zt is a Gaussian white noise process with mean 0 and variance σ2, then the sample spectrum ordinates f̂(ωj), for j = 1, 2, …, (n − 1)/2, are distributed independently and identically as a multiple of a Chi-square distribution, that is,

(9.12) $\hat{f}(\omega_j) \sim \frac{\sigma^2}{2\pi} \cdot \frac{\chi_2^2}{2}$

where (σ2/2π) is actually the spectrum of Zt.
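To make the poor behavior of the raw sample spectrum concrete, here is a minimal sketch (our own illustration, not from the text) that computes the sample spectrum of simulated Gaussian white noise with NumPy and compares it with the flat level σ²/2π of Eq. (9.12); all variable names are ours.

```python
import numpy as np

# Sample spectrum of Gaussian white noise: it is centered on the flat
# spectrum sigma^2/(2*pi), but its scatter does not shrink as n grows,
# illustrating why the raw estimate in Eq. (9.12) needs smoothing.
rng = np.random.default_rng(0)
n, sigma = 480, 1.0
z = rng.normal(0.0, sigma, n)

j = np.arange(1, (n - 1) // 2 + 1)            # indices of w_j = 2*pi*j/n
dft = np.fft.fft(z)[j]                        # sum_t z_t e^{-i w_j t}
f_hat = np.abs(dft) ** 2 / (2.0 * np.pi * n)  # sample spectrum at w_j

print("flat level sigma^2/(2*pi):", sigma**2 / (2.0 * np.pi))
print("mean of sample spectrum  :", f_hat.mean())  # close to flat level
print("sd of sample spectrum    :", f_hat.std())   # does not vanish with n
```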

Although f̂(ωj) is an unbiased estimate, since the sample spectrum is defined only at the Fourier frequencies and its variance is independent of the sample size n, it is a rather poor estimate of the spectrum. To solve this problem, we introduce a suitable kernel or spectral window to smooth the sample spectrum, that is,

(9.13) $\hat{f}_S(\omega_j) = \sum_{|k| \le M_n} W_n(\omega_k)\, \hat{f}(\omega_j - \omega_k)$

where ωj = 2πj/n, j = 0, ± 1, …, ± [n/2], are Fourier frequencies; Mn is a function of n such that Mn → ∞ but Mn/n → 0 as n → ∞; and Wn(ωj) is the kernel or spectral window, a weighting function with the following properties:

(9.14) $W_n(\omega_k) = W_n(-\omega_k) \ge 0, \qquad \sum_{|k| \le M_n} W_n(\omega_k) = 1, \qquad \sum_{|k| \le M_n} W_n^2(\omega_k) \to 0 \ \text{as } n \to \infty$

From Eq. (9.7), we see that the spectrum is the Fourier transform of the autocovariance function. So, we can also apply a weighting function to the sample autocovariances, that is,

(9.15) $\hat{f}_S(\omega) = \frac{1}{2\pi} \sum_{k=-M}^{M} W(k)\, \hat{\gamma}_k\, e^{-i\omega k}$

where M is the truncation point, a function of n such that M/n → 0 as n → ∞, and W(k) is the lag window, chosen to be an absolutely summable sequence

(9.16) $W(k) = W\!\left(\frac{k}{M}\right)$

which is derived from a bounded, even, continuous function W(x) satisfying

(9.17) $W(0) = 1, \qquad |W(x)| \le 1, \qquad W(-x) = W(x)$

Some commonly used spectral and lag windows include the Daniell (1946), rectangular, Bartlett (1950), Blackman–Tukey (1959), and Parzen (1961, 1963) windows. For more details, we refer readers to Priestley (1981) and Wei (2006, 2008).
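As a concrete illustration of Eq. (9.15), the following sketch implements the lag-window estimator with the Bartlett window W(x) = 1 − |x|; the function name and defaults are our own choices, and weights for the other windows listed above can be substituted.

```python
import numpy as np

# Lag-window spectral estimator of Eq. (9.15) with the Bartlett window.
def lag_window_spectrum(z, M, freqs):
    """z: series; M: truncation point; freqs: angular frequencies in [0, pi]."""
    z = np.asarray(z, dtype=float) - np.mean(z)
    n = len(z)
    # sample autocovariances gamma_hat_k for k = 0, ..., M
    gamma = np.array([np.sum(z[:n - k] * z[k:]) / n for k in range(M + 1)])
    weights = 1.0 - np.arange(M + 1) / M      # Bartlett lag window W(k/M)
    k = np.arange(1, M + 1)
    f = np.empty(len(freqs))
    for i, w in enumerate(freqs):
        # gamma_hat_k is even in k, so the transform is a cosine series
        f[i] = (gamma[0] + 2.0 * np.sum(weights[1:] * gamma[1:] * np.cos(w * k))) / (2.0 * np.pi)
    return f

freqs = np.linspace(0.0, np.pi, 200)
z = np.random.default_rng(1).normal(size=500)
f_hat = lag_window_spectrum(z, M=20, freqs=freqs)  # roughly flat at 1/(2*pi)
```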

When Zt is a Gaussian process with spectrum f(ω), under a properly chosen lag window we have, approximately,

(9.18) $\frac{\nu\, \hat{f}_S(\omega)}{f(\omega)} \sim \chi_\nu^2$

with

$\nu = \frac{2n}{M \int_{-1}^{1} W^2(x)\, dx}$

where W(x) is the continuous weighting function used in the associated lag window. This smoothed spectrum f̂S(ω) thus has a distribution related to the Chi-square distribution. f̂S(ω) is an asymptotically unbiased estimate of f(ω), so that

$E[\hat{f}_S(\omega)] \simeq f(\omega)$

and

$\mathrm{Var}[\hat{f}_S(\omega)] \simeq \frac{2}{\nu} f^2(\omega) \to 0 \quad \text{as } n \to \infty$

For an estimation procedure for series observed at high sampling frequency, which does not require equally spaced observations, we refer readers to a recent study by Chang, Hall, and Tang (2017).

9.2 Spectral representations of multivariate time series processes

These univariate time series results can be readily generalized to the m‐dimensional vector process. Let Zt = [Z1,t, Z2,t, …, Zm,t]′ be a zero‐mean, jointly stationary, m‐dimensional vector process with covariance matrix function Γ(k) = [γi,j(k)]. The spectral representation of Zt is given by

(9.19) $\mathbf{Z}_t = \int_{-\pi}^{\pi} e^{i\omega t}\, d\mathbf{U}(\omega)$

where dU(ω) = [dU1(ω), dU2(ω), …, dUm(ω)]′ is an m‐dimensional complex‐valued process with dUi(ω), for i = 1, 2, …, m, being both orthogonal and cross‐orthogonal such that

(9.20) $E[dU_i(\omega)] = 0 \ \text{for all } \omega, \qquad E[dU_i(\omega)\, dU_j^*(\lambda)] = 0 \ \text{for } \omega \neq \lambda, \ \text{for all } i, j$

and

(9.21) $E[dU_i(\omega)\, dU_j^*(\omega)] = dF_{i,j}(\omega)$

The spectral representation of the covariance matrix function is given by

(9.22) $\Gamma(k) = \int_{-\pi}^{\pi} e^{i\omega k}\, dF(\omega)$

where

(9.23) $E[d\mathbf{U}(\omega)\, d\mathbf{U}^*(\omega)] = dF(\omega) = [dF_{i,j}(\omega)]$

and F(ω) is the spectral distribution matrix function of Zt. The diagonal elements Fi,i(ω) are the spectral distribution functions of the Zi,t, and the off‐diagonal elements Fi,j(ω) are the cross‐spectral distribution functions between the Zi,t and the Zj,t.

If the covariance matrix function is absolutely summable in the sense that each of the m × m sequences γi,j(k) is absolutely summable, then the spectrum matrix or the spectral density matrix function exists and is given by

(9.24) $f(\omega) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \Gamma(k)\, e^{-i\omega k}$

Thus, we can write

(9.25) $\Gamma(k) = \int_{-\pi}^{\pi} e^{i\omega k}\, f(\omega)\, d\omega$

and

(9.26) $f(\omega) = [f_{i,j}(\omega)]$

where

(9.27) $f_{i,j}(\omega) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \gamma_{i,j}(k)\, e^{-i\omega k}$

When k = 0 in Eq. (9.22), we have

(9.28) $\Gamma(0) = \int_{-\pi}^{\pi} f(\omega)\, d\omega$

So, the area under the multivariate spectrum is the variance–covariance matrix of Zt. The element fi,i(ω) in Eq. (9.26) is the spectrum or the spectral density of Zi,t, and the element fi,j(ω) in Eq. (9.26) is the cross‐spectrum or the cross‐spectral density of Zi,t and Zj,t. It is easily seen that the spectral density matrix function f(ω) is positive semidefinite, that is, c*f(ω)c ≥ 0 for any nonzero m‐dimensional complex vector c, where c* is the conjugate transpose of c. Also, the matrix f(ω) is Hermitian, that is,

(9.29) $f(\omega) = f^*(\omega)$

Hence, $f_{i,j}(\omega) = \bar{f}_{j,i}(\omega)$ for all i and j.

Since fi,j(ω) is in general complex, we can write it as

(9.30) $f_{i,j}(\omega) = c_{i,j}(\omega) - i\, q_{i,j}(\omega)$

where ci,j(ω) and −qi,j(ω) are the real and imaginary parts of fi,j(ω), that is,

(9.31) $c_{i,j}(\omega) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \gamma_{i,j}(k) \cos(\omega k)$

and

(9.32) $q_{i,j}(\omega) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \gamma_{i,j}(k) \sin(\omega k)$

The function ci,j(ω) is known as the co‐spectrum, and qi,j(ω) is known as the quadrature spectrum, between Zi,t and Zj,t. We can also write fi,j(ω) in the polar form

(9.33) $f_{i,j}(\omega) = \alpha_{i,j}(\omega)\, e^{i \phi_{i,j}(\omega)}$

where

(9.34) $\alpha_{i,j}(\omega) = \sqrt{c_{i,j}^2(\omega) + q_{i,j}^2(\omega)}$

and

(9.35) $\phi_{i,j}(\omega) = \arctan\!\left[\frac{-q_{i,j}(\omega)}{c_{i,j}(\omega)}\right]$

The function αi,j(ω) is called the cross‐amplitude spectrum, and the function ϕi,j(ω) is called the phase spectrum.

To properly understand these spectra, we note that for any ω, dUi(ω) is a complex‐valued random variable, and we can write it as

(9.36) $dU_i(\omega) = \alpha_i(\omega)\, e^{i\phi_i(\omega)}$

where αi(ω) and ϕi(ω) are the amplitude spectrum and the phase spectrum of the Zi,t series. From Eqs. (9.23), (9.24), and (9.33), we have

(9.37) $\alpha_{i,j}(\omega)\, e^{i\phi_{i,j}(\omega)}\, d\omega = E[dU_i(\omega)\, dU_j^*(\omega)] = E[\alpha_i(\omega)\, \alpha_j(\omega)]\; E\!\left[e^{i(\phi_i(\omega) - \phi_j(\omega))}\right]$

where for simplicity we assume that the amplitude and phase spectra are independent. Thus, αi,j(ω) can be thought of as the average value of the product of the amplitudes of the ω-frequency components of Zi,t and Zj,t, and the phase spectrum ϕi,j(ω) represents the average phase shift, ϕi(ω) − ϕj(ω), between the ω-frequency components of Zi,t and Zj,t. In terms of a causal relationship, Zj,t = αZi,t − τ + et, or Zi,t = αZj,t − τ + et, where there is no feedback relationship between them, the phase spectrum is a measure of the extent to which each frequency component of one series leads the other. The ω-frequency component of Zi,t leads the ω-frequency component of Zj,t if the phase ϕi,j(ω) is negative, and it lags the ω-frequency component of Zj,t if the phase ϕi,j(ω) is positive. For a given ϕi,j(ω), the shift in time units is ϕi,j(ω)/ω, and the actual time delay of the ω-frequency component of Zj,t is equal to

(9.38) $\tau(\omega) = \frac{-\phi_{i,j}(\omega)}{\omega}$
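The lead–lag interpretation of the phase spectrum can be checked numerically. The sketch below (our own illustration) simulates yt = xt−τ + noise and recovers the delay from the slope of the cross‐spectrum phase estimated with scipy; since sign conventions for the phase differ across software and references, we report the absolute value of the fitted slope.

```python
import numpy as np
from scipy import signal

# For y_t = x_{t-tau} + e_t, the cross-spectrum phase is linear in
# frequency with slope proportional to the delay tau (up to sign).
rng = np.random.default_rng(2)
n, tau = 4096, 5
w = rng.normal(size=n + tau)
y = w[:n] + 0.5 * rng.normal(size=n)  # y_t = x_{t-tau} + noise
x = w[tau:n + tau]                    # x leads y by tau time units

f, pxy = signal.csd(x, y, fs=1.0, nperseg=256)  # Welch cross-spectrum
phase = np.unwrap(np.angle(pxy))
# Fit phase ~ slope * (2*pi*f) over low frequencies; |slope| estimates tau
slope = np.polyfit(2.0 * np.pi * f[1:40], phase[1:40], 1)[0]
print("estimated delay:", abs(slope))  # approximately 5
```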

Other useful functions in multivariate spectral analysis are the gain function, defined as

(9.39) $G_{i,j}(\omega) = \frac{\alpha_{i,j}(\omega)}{f_{i,i}(\omega)} = \frac{|f_{i,j}(\omega)|}{f_{i,i}(\omega)}$

and the squared coherency function, defined as

(9.40) $K_{i,j}^2(\omega) = \frac{|f_{i,j}(\omega)|^2}{f_{i,i}(\omega)\, f_{j,j}(\omega)}$

From Eqs. (9.23) and (9.24), it is easy to see that

(9.41) $K_{i,j}^2(\omega) = \frac{\left| \mathrm{Cov}[dU_i(\omega),\, dU_j(\omega)] \right|^2}{\mathrm{Var}[dU_i(\omega)]\, \mathrm{Var}[dU_j(\omega)]}$

which is actually the square of the correlation coefficient between the ω-frequency components of Zi,t and Zj,t. A value of K²i,j(ω) close to 1 implies that the ω-frequency components of the two series are strongly linearly related, and a value close to 0 implies that they are only weakly linearly related. It should be noted that, just like the correlation coefficient between two random variables, the squared coherency is invariant under linear transformations.
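The squared coherency of Eq. (9.40) is exactly what scipy.signal.coherence computes, namely |f̂i,j(ω)|²/(f̂i,i(ω)f̂j,j(ω)). The following sketch (our own construction) builds two series that share only a slowly varying component, so their coherency is high at low frequencies and near zero elsewhere.

```python
import numpy as np
from scipy import signal

# Squared coherency as a frequency-by-frequency squared correlation:
# two series sharing a low-frequency AR(1) component are coherent only
# at low frequencies.
rng = np.random.default_rng(3)
n = 2048
common = signal.lfilter([1.0], [1.0, -0.95], rng.normal(size=n))
z1 = common + rng.normal(size=n)
z2 = 0.8 * common + rng.normal(size=n)

f, coh = signal.coherence(z1, z2, fs=1.0, nperseg=256)  # K^2(f) in [0, 1]
print("coherency near f = 0  :", coh[1:5].round(2))     # close to 1
print("coherency near f = 0.5:", coh[-5:].round(2))     # close to 0
```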

9.3 The estimation of the spectral density matrix

9.3.1 The smoothed spectrum matrix

Given a zero‐mean m‐dimensional time series, Z1, Z2, …, Zn, its Fourier transform at the Fourier frequencies ωp = 2πp/n, − [(n − 1)/2] ≤ p ≤ [n/2], is

(9.44) $\mathbf{Z}(\omega_p) = \frac{1}{n} \sum_{t=1}^{n} \mathbf{Z}_t\, e^{-i\omega_p t}$

Then, the m × m sample spectrum matrix, also known as the periodogram matrix, is simply the extension of Eqs. (9.9)–(9.11). Thus,

(9.45) $\hat{f}(\omega_p) = \frac{n}{2\pi}\, \mathbf{Z}(\omega_p)\, \mathbf{Z}^*(\omega_p) = [\hat{f}_{i,j}(\omega_p)]$

where

$\hat{f}_{i,i}(\omega_p) = \frac{1}{2\pi} \sum_{k=-(n-1)}^{n-1} \hat{\gamma}_{i,i}(k)\, e^{-i\omega_p k}$
$\hat{f}_{i,j}(\omega_p) = \frac{1}{2\pi} \sum_{k=-(n-1)}^{n-1} \hat{\gamma}_{i,j}(k)\, e^{-i\omega_p k}$

and

$\hat{\gamma}_{i,j}(k) = \frac{1}{n} \sum_{t=1}^{n-k} Z_{i,t}\, Z_{j,t+k}, \quad k \ge 0, \qquad \hat{\gamma}_{i,j}(-k) = \hat{\gamma}_{j,i}(k)$

When Zt is a multivariate Gaussian process with mean vector 0 and variance–covariance matrix Σ, the sample spectrum matrix has a distribution related to that of the sample variance–covariance matrix, known as the Wishart distribution with n degrees of freedom, which is the multivariate analog of the Chi‐square distribution. We refer readers to Goodman (1963), Hannan (1970), and Brillinger (2002) for further discussion of the properties of the periodogram and the Wishart distribution.

Similar to the univariate sample spectral density discussed in Section 9.1, the sample spectrum matrix or periodogram matrix is also a poor estimate. So, we replace it by the following smoothed spectrum matrix

(9.46) $\hat{f}_S(\omega_p) = [\hat{f}_{S,i,j}(\omega_p)]$

where

(9.47) $\hat{f}_{S,i,i}(\omega_p) = \sum_{|k| \le M_i} W_i(\omega_k)\, \hat{f}_{i,i}(\omega_p - \omega_k)$

Wi(ω) is a smoothing function, also known as a kernel or spectral window, and Mi is the bandwidth of the spectral window, and

(9.48) $\hat{f}_{S,i,j}(\omega_p) = \sum_{|k| \le M_{i,j}} W_{i,j}(\omega_k)\, \hat{f}_{i,j}(\omega_p - \omega_k)$

where Wi,j(ω) is a spectral window and Mi,j is the corresponding bandwidth. As in the univariate case, the distribution of the smoothed spectrum matrix can also be approximated by a Wishart distribution.

Once fi,i(ω) and fi,j(ω) are estimated, we can estimate the co‐spectrum ci,j(ω), the quadrature spectrum qi,j(ω), the cross‐amplitude spectrum αi,j(ω), the phase spectrum ϕi,j(ω), the gain function Gi,j(ω), and the squared coherency function K²i,j(ω).

Note that the spectrum matrix is the Fourier transform of the covariance matrix function, Γ(k) = [γi,j(k)], and the sample spectrum matrix is

(9.49) $\hat{f}(\omega) = \frac{1}{2\pi} \sum_{k=-(n-1)}^{n-1} \hat{\Gamma}(k)\, e^{-i\omega k}$

Instead of spectrum smoothing, we can also apply the smoothing function to the sample covariance matrices, that is,

(9.50) $\hat{f}_S(\omega) = [\hat{f}_{S,i,j}(\omega)]$

where

(9.51) $\hat{f}_{S,i,i}(\omega) = \frac{1}{2\pi} \sum_{k=-M_i}^{M_i} W_i(k)\, \hat{\gamma}_{i,i}(k)\, e^{-i\omega k}$

γ̂i,i(k) is the sample autocovariance for the Zi,t series, Wi(k) is a suitable lag window, and Mi is the truncation point, and

(9.52) $\hat{f}_{S,i,j}(\omega) = \frac{1}{2\pi} \sum_{k=-M_{i,j}}^{M_{i,j}} W_{i,j}(k)\, \hat{\gamma}_{i,j}(k)\, e^{-i\omega k}$

γ̂i,j(k) is the sample cross‐covariance function between Zi,t and Zj,t, Wi,j(k) is a suitable lag window, and Mi,j is the corresponding truncation point. In this case, the estimates of the co‐spectrum, the quadrature spectrum, the cross‐amplitude spectrum, the phase spectrum, the gain function, and the squared coherency function are given by

(9.53) $\hat{c}_{i,j}(\omega) = \frac{1}{2\pi} \sum_{k=-M_{i,j}}^{M_{i,j}} W_{i,j}(k)\, \hat{\gamma}_{i,j}(k) \cos(\omega k)$

(9.54) $\hat{q}_{i,j}(\omega) = \frac{1}{2\pi} \sum_{k=-M_{i,j}}^{M_{i,j}} W_{i,j}(k)\, \hat{\gamma}_{i,j}(k) \sin(\omega k)$
(9.55) $\hat{\alpha}_{i,j}(\omega) = \sqrt{\hat{c}_{i,j}^2(\omega) + \hat{q}_{i,j}^2(\omega)}$
(9.56) $\hat{\phi}_{i,j}(\omega) = \arctan\!\left[\frac{-\hat{q}_{i,j}(\omega)}{\hat{c}_{i,j}(\omega)}\right]$
(9.57) $\hat{G}_{i,j}(\omega) = \frac{\hat{\alpha}_{i,j}(\omega)}{\hat{f}_{S,i,i}(\omega)}$

and

(9.58) $\hat{K}_{i,j}^2(\omega) = \frac{\hat{c}_{i,j}^2(\omega) + \hat{q}_{i,j}^2(\omega)}{\hat{f}_{S,i,i}(\omega)\, \hat{f}_{S,j,j}(\omega)}$

For commonly used spectral or lag windows and their properties, we refer readers to Priestley (1981), Brillinger (2002), and Wei (2006).

Some important remarks are in order here.

  1. The lag windows and the associated truncation points used in estimating fi,i(ω) and fi,j(ω) are not necessarily the same for all i and j. Even if the same lag window is used, the associated truncation points can be different. However, using the same lag window and truncation point for all estimates will in general make the study of the sampling properties of the estimates simpler.
  2. In estimating the cross‐spectrum, the lag window places heavier weight on the sample cross‐covariances around the zero lag. If one series leads another and the cross‐covariance does not peak at zero lag, then the estimated cross‐spectrum will be biased. This bias is especially severe for the estimated squared coherency. Because the squared coherency is invariant under linear transformations, to reduce the bias one can properly align the two series by shifting one of them by a time lag τ, so that the peak of the cross‐covariance function of the aligned series occurs at the zero lag. In practice, the time lag τ can be estimated by the lag corresponding to the maximum sample cross‐correlation; a sketch of this alignment step follows this list.
  3. The estimates of the cross‐amplitude, phase, and gain functions are not reliable when the square coherency is small.
  4. It should be noted that when we write the spectral density as $f(\omega) = \frac{1}{2\pi}\sum_{k=-\infty}^{\infty} \gamma_k e^{-i\omega k}$, it implies that the range of the frequencies is from −π to π. Given a time series of length n, due to symmetry, the Fourier frequencies used in the sample spectrum will be ωj = 2πj/n, with j = 0, 1, …, [n/2]. However, if one writes the spectral density as $f(\omega) = \sum_{k=-\infty}^{\infty} \gamma_k e^{-i 2\pi \omega k}$, it implies that the range of the frequencies is from −1/2 to 1/2. In this case, the frequencies used in the sample spectrum will be ωj = j/n, with j = 0, 1, …, [n/2]. In the following discussions, we may use either one, depending on what is used in the software and the referenced papers.
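The alignment step in remark 2 can be sketched as follows; `align` is a hypothetical helper of our own, which shifts one series by the lag of maximum absolute sample cross-correlation before the cross-spectrum and coherency are estimated.

```python
import numpy as np

# Align two equal-length series by the lag of maximum |cross-correlation|.
def align(x, y, max_lag):
    x = np.asarray(x, float) - np.mean(x)
    y = np.asarray(y, float) - np.mean(y)
    lags = np.arange(-max_lag, max_lag + 1)
    ccf = [np.corrcoef(x[max(0, -l):len(x) - max(0, l)],
                       y[max(0, l):len(y) - max(0, -l)])[0, 1] for l in lags]
    tau = int(lags[np.argmax(np.abs(ccf))])  # estimated alignment lag
    if tau >= 0:
        return x[:len(x) - tau], y[tau:]     # pair x_t with y_{t+tau}
    return x[-tau:], y[:len(y) + tau]
```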

9.3.2 Multitaper smoothing

Developed by Thomson (1982) for univariate processes and extended by Walden (2000) to multivariate processes, multitaper smoothing is another useful way to estimate the power spectral density that balances the bias and variance of nonparametric spectral estimation. The multitaper method reduces estimation bias by averaging modified periodograms obtained using a family of mutually orthogonal tapers on the same sample data. Let hj(t), for t = 1, …, n and j = 1, …, n, be n orthonormal tapers such that

(9.59) $\sum_{t=1}^{n} h_j(t)\, h_k(t) = \begin{cases} 1, & j = k \\ 0, & j \neq k \end{cases}$

From Eq. (9.45), we note that

(9.60) $\hat{f}(\omega) = \mathbf{y}(\omega)\, \mathbf{y}^*(\omega)$

where $\mathbf{y}(\omega) = \frac{1}{\sqrt{2\pi n}} \sum_{t=1}^{n} \mathbf{Z}_t\, e^{-i\omega t}$ is the (scaled) discrete Fourier transform of Zt. The multitaper power spectral estimator at frequency ω is

(9.61) $\hat{f}_{MT}(\omega) = \frac{1}{K} \sum_{j=1}^{K} \mathbf{y}_j(\omega)\, \mathbf{y}_j^*(\omega)$

where K is chosen through the method shown below and yj(ω) is the tapered Fourier transform such that

(9.62) $\mathbf{y}_j(\omega) = \frac{1}{\sqrt{2\pi}} \sum_{t=1}^{n} h_j(t)\, \mathbf{Z}_t\, e^{-i\omega t}$

The multitaper spectral estimator is asymptotically unbiased and is consistent when the number of tapers K increases with the number of observations n.
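A minimal univariate version of the multitaper estimator in Eq. (9.61) can be written with scipy's DPSS (Slepian) tapers; the function below is a sketch under the normalization conventions used above, with the time-bandwidth product NW and the number of tapers K as illustrative user choices.

```python
import numpy as np
from scipy.signal import windows

# Multitaper spectral estimate: average the periodograms of the series
# multiplied by K orthonormal Slepian tapers.
def multitaper_spectrum(z, NW=4.0, K=7):
    z = np.asarray(z, dtype=float) - np.mean(z)
    n = len(z)
    tapers = windows.dpss(n, NW, Kmax=K)   # shape (K, n), orthonormal rows
    yk = np.fft.rfft(tapers * z, axis=1)   # tapered Fourier transforms
    return np.mean(np.abs(yk) ** 2, axis=0) / (2.0 * np.pi)

z = np.random.default_rng(4).normal(size=1024)
f_mt = multitaper_spectrum(z)  # roughly flat at 1/(2*pi) for white noise
```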

One of the most commonly used families of multitapers is the discrete prolate spheroidal sequences (DPSS) suggested by Thomson (1982), also known as Slepian sequences or tapers because the term DPSS was introduced in Slepian (1978). The Slepian tapers are designed to maximize the spectral concentration defined as

(9.63) $\lambda(n, W) = \frac{\int_{-W}^{W} |H(\omega)|^2\, d\omega}{\int_{-\pi}^{\pi} |H(\omega)|^2\, d\omega}$

for a bandwidth W, |W| < π/2, which defines the local frequency interval and is normally of the order 1/n, where $H(\omega) = \sum_{t=1}^{n} h(t)\, e^{-i\omega t}$ is the transfer function of a taper {h(t)}. Consequently, the first of the sequences, {h1(t), t = 1, …, n}, is chosen such that its corresponding spectral window maximizes the concentration ratio in Eq. (9.63) over the interval (−W, W); the maximized ratio equals λ1(n,W). This is done by maximizing the power

$\int_{-W}^{W} |H_1(\omega)|^2\, d\omega$

subject to the constraint that the total power is normalized such that

$\int_{-\pi}^{\pi} |H_1(\omega)|^2\, d\omega = 1$

which leads to the following eigenvalue equation,

(9.64) $\sum_{t=1}^{n} \frac{\sin[2W(t - t')]}{\pi (t - t')}\, h_1(t) = \lambda_1(n, W)\, h_1(t')$

for t′ = 1, …, n. So, λ1(n,W) is the largest eigenvalue and {h1(t), t = 1, …, n} is the first taper. The second taper, {h2(t), t = 1, …, n}, is selected to maximize the corresponding concentration ratio subject to being orthogonal to the first. Going further, the third taper maximizes the concentration ratio subject to being orthogonal to the first two tapers. In general, the jth taper corresponds to the jth largest eigenvalue, λj(n,W).

By maximizing the concentration ratio, we are essentially minimizing the sidelobe energy outside the frequency band (−W, W). The eigenvalues should be close to unity for well‐behaved tapers. Let M be the matrix formed by these orthogonal tapers. Then M is an n × n positive‐definite matrix with eigenvalues λ1(n,W) > λ2(n,W) > ⋯ > λK(n,W) > ⋯ > λn(n,W), the K dominant ones being close to 1.

Another widely used set of tapers is the sine tapers (Riedel and Sidorenko, 1995), which have the form

(9.65) $h_j(t) = \sqrt{\frac{2}{n+1}}\, \sin\!\left(\frac{\pi j t}{n+1}\right)$

for t = 1, …, n. They were used by Dai and Guo (2004) to obtain a preliminary estimate of the spectral matrix in the smoothing spline framework discussed in the next section.

9.3.3 Smoothing spline

The main disadvantage of the kernel smoothing and multitaper smoothing methods is that they cannot guarantee a final estimate that is positive semidefinite while still allowing flexible smoothing for each element of the spectral matrix; thus, the same bandwidth is often applied to smooth all the spectral components. However, in many applications, different components of the spectral matrix may need different degrees of smoothness, and hence different smoothing parameters, to obtain optimal estimates. To overcome this difficulty, Dai and Guo (2004) proposed a Cholesky decomposition based smoothing spline method for spectrum estimation. The method models each Cholesky component separately, using different smoothing parameters. It first obtains a positive‐definite and asymptotically unbiased initial spectral estimator through the sine multitapers shown in Eq. (9.65). Then, it further smooths the Cholesky components of the spectral matrix via smoothing splines and a penalized sum of squares, which allows different degrees of smoothness for different Cholesky elements.

Suppose the spectral matrix f(ω) has the Cholesky decomposition f(ω) = Γ(ω)Γ*(ω), where Γ is an m × m lower triangular matrix. To obtain a unique decomposition, the diagonal elements of Γ are constrained to be positive. The diagonal elements γj,j, j = 1, …, m, the real parts ℜ(γj,k), and the imaginary parts ℑ(γj,k), j > k, are smoothed by splines with different smoothing parameters. Suppose γ ∈ {γj,j, ℜ(γj,k), ℑ(γj,k), j > k, for j,k = 1, …, m}; we have

(9.66) $\gamma(\omega_\ell) = a(\omega_\ell) + e_\ell(\omega_\ell), \quad \ell = 1, \ldots, n$

where a(ωℓ) = E{γ(ωℓ)}, and the eℓ(ωℓ), ℓ = 1, …, n, are independent errors with zero means and variances depending on the frequency point ωℓ. a(⋅) is periodic and is fitted by a periodic smoothing spline (Wahba, 1990) of the form,

(9.67)equation

The estimator of the power spectrum component is obtained by minimizing the following penalized sum of squares:

(9.68)equation

where, because of issues related to asymptotic distributions, the frequencies ω = 0, 0.5, and 1 are not used in the estimation; τ = n − 1 if n is odd, and τ = n − 2 if n is even; $\hat{\gamma}(\omega)$ is the estimator of γ(ω); and λ is a smoothing parameter. The final form of the estimator is given by

(9.69)equation

where [⋅] is the greatest integer function. The smoothing parameter λ can be chosen by the generalized cross‐validation or the generalized maximum likelihood. For more details, we refer readers to Wahba (1990) and Dai and Guo (2004). Qin and Wang (2008) suggested, through simulations, that the generalized maximum likelihood is stable and performs better than the generalized cross‐validation.
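The following sketch captures the idea behind the method, though not the authors' exact algorithm: take Cholesky factors of an initial spectral estimate, smooth each component separately across frequency, and rebuild the spectrum, which is positive semidefinite by construction. scipy's UnivariateSpline stands in for the periodic smoothing spline, and a single smoothing level s per component replaces the data-driven choices discussed above.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Smooth the Cholesky components of an initial spectral estimate and
# rebuild a positive semidefinite spectral matrix estimate.
# f_init: (n_freq, m, m) Hermitian positive-definite initial estimates.
def smooth_via_cholesky(f_init, freqs, s=1.0):
    L = np.linalg.cholesky(f_init)         # lower triangular per frequency
    m = L.shape[1]
    L_sm = np.zeros_like(L)
    for j in range(m):
        for k in range(j + 1):
            # real and imaginary parts smoothed separately; each component
            # could be given its own smoothing parameter
            re = UnivariateSpline(freqs, L[:, j, k].real, s=s)(freqs)
            im = UnivariateSpline(freqs, L[:, j, k].imag, s=s)(freqs)
            L_sm[:, j, k] = re + 1j * im
    return L_sm @ np.conj(np.transpose(L_sm, (0, 2, 1)))
```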

9.3.4 Bayesian method

Recall that the discrete Fourier transforms $\mathbf{y}_\ell = n^{-1/2} \sum_{t=1}^{n} \mathbf{Z}_t\, e^{-2\pi i \ell t/n}$, ℓ = 0, …, (n − 1)/2, are approximately independent complex multivariate normal random variables. The large‐sample distribution of the yℓ leads to the Whittle likelihood (Whittle, 1953, 1954),

(9.70) $L(f \mid \mathbf{y}) \propto \prod_{\ell=1}^{L} |f(\omega_\ell)|^{-1} \exp\left\{ -\mathbf{y}_\ell^*\, f(\omega_\ell)^{-1}\, \mathbf{y}_\ell \right\}$

where ωℓ = ℓ/n and L = [(n − 1)/2]. Based on the Whittle likelihood, Rosen and Stoffer (2007) proposed a Bayesian method to estimate the spectrum of a second‐order stationary multivariate time series. They model the spectrum f(ω) by the modified complex Cholesky factorization

(9.71) $f^{-1}(\omega_\ell) = \Gamma_\ell\, D_\ell^{-1}\, \Gamma_\ell^*$

where Γℓ is a complex‐valued lower triangular matrix with ones on its diagonal,

$\Gamma_\ell = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ \theta_{2,1,\ell} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ \theta_{m,1,\ell} & \theta_{m,2,\ell} & \cdots & 1 \end{pmatrix}$

and Dℓ is a diagonal matrix with positive real entries, $D_\ell = \mathrm{diag}(d_{1,\ell}, \ldots, d_{m,\ell})$. Let θℓ collect the elements {θj,k,ℓ, j > k}, Θ = (θ1, …, θL), and D = {D1, …, DL}. The modified Cholesky representation facilitates the development of a Bayesian sampler by noticing that the Whittle likelihood can be rewritten as

(9.72) $L(\mathbf{Y} \mid \Theta, D) \propto \prod_{\ell=1}^{L} |D_\ell|^{-1} \exp\left\{ -\left( \mathbf{y}_\ell - R_\ell \theta_\ell \right)^* D_\ell^{-1} \left( \mathbf{y}_\ell - R_\ell \theta_\ell \right) \right\}$

where θℓ is an m(m − 1)/2–dimensional vector and Rℓ is an m × m(m − 1)/2 design matrix such that

equation

where Yj,ℓ is the jth entry of yℓ. Each component of the Cholesky decomposition is fitted by the Demmler–Reinsch basis functions for linear smoothing splines of Eubank and Hsing (2008) as follows

(9.73)equation

for each frequency ω. Let X be the design matrix of the basis functions and let c denote the associated parameters. We then have

(9.74)equation

for j = 1, …, m, k = 1, …, j − 1. The priors on cj, cj,k,(re), and cj,k,(im) are chosen to be bivariate normal distributions, and the priors on dj, dj,k,(re), and dj,k,(im) are chosen to be L–dimensional normal distributions whose covariances are scaled by the hyperparameters τ²j, τ²j,k,(re), and τ²j,k,(im), respectively. These hyperparameters are smoothing parameters, which control the amount of smoothness. As the smoothing parameters tend to zero, the spline becomes a linear fit; as they tend to infinity, the spline becomes an interpolating spline. Gibbs sampling with the Metropolis–Hastings algorithm is used to draw the parameters from the posterior distribution.

9.3.5 Penalized Whittle likelihood

Penalized methods have been among the most popular research topics of the past decade. In the area of spectrum estimation, Pawitan and O'Sullivan (1994) developed a penalized likelihood method for the nonparametric estimation of the power spectrum of a univariate time series, and Pawitan (1996) developed a penalized Whittle likelihood estimator for a bivariate time series. However, their approach cannot easily be generalized to higher dimensions. More recently, Krafty and Collinge (2013) proposed a penalized Whittle likelihood method to estimate the power spectrum of a vector‐valued time series. The method allows for varying levels of smoothness among spectral components while accounting for the positive definiteness of spectral matrices and the Hermitian and periodic structures of power spectra as functions of frequency. In this section, we briefly discuss the method of Krafty and Collinge (2013).

Recall that yℓ, ℓ = 1, …, L, are the discrete Fourier transforms of the time series, and define the negative log Whittle likelihood as

(9.75) $L(f) = \sum_{\ell=1}^{L} \left\{ \log|f(\omega_\ell)| + \mathbf{y}_\ell^*\, f(\omega_\ell)^{-1}\, \mathbf{y}_\ell \right\}$

where ωℓ = ℓ/n and L = [(n − 1)/2]. Based on this likelihood, the method penalizes the roughness of the estimated spectrum. The spectral matrix f(ω) is modeled by the Cholesky decomposition f(ω) = [ΓΓ*]−1, where Γ is an m × m lower triangular matrix with real‐valued diagonal elements. Further, denote by Γi,j,R(ω;f) and Γi,j,I(ω;f) the real and imaginary parts of the (i,j) element of Γ. The method measures the roughness of a power spectrum through the integrated squared kth derivatives of the m(m + 1)/2 real and m(m − 1)/2 imaginary components of the Cholesky decomposition. Let λ = {ρi,j, θi,j : i ≤ j = 1, …, m}, with ρi,j > 0 and θi,j > 0, be the smoothing parameters of Γi,j,R and Γi,j,I that control the roughness penalty. The roughness measure for a spectrum f is

(9.76)equation

The interpretation is that the penalty function J shrinks the estimates of power spectra toward real‐valued matrix functions that are constant across frequency. Consequently, we consider minimizing the penalized Whittle negative loglikelihood

$\hat{f} = \underset{f}{\arg\min}\ \left\{ L(f) + J_\lambda(f) \right\}$

The estimation procedure is based on an iterative algorithm adapted from Wood (2011), and the confidence intervals are obtained via the bootstrap.

9.3.6 VARMA spectral estimation

As shown in Chapter 2, the underlying vector time series process is often described by a vector autoregressive moving average (VARMA) model. Specifically, the m‐dimensional VARMA process of order p and q, VARMA(p,q), is given by

(9.77) $\Phi_p(B)\, \mathbf{Z}_t = \Theta_q(B)\, \mathbf{a}_t$

or

(9.78) $\mathbf{Z}_t - \Phi_1 \mathbf{Z}_{t-1} - \cdots - \Phi_p \mathbf{Z}_{t-p} = \mathbf{a}_t - \Theta_1 \mathbf{a}_{t-1} - \cdots - \Theta_q \mathbf{a}_{t-q}$

where B is the backshift operator such that BZt = Zt−1, at is a sequence of m‐dimensional vector white noise processes, VWN(0, Σ), and

(9.79) $\Phi_p(B) = I - \Phi_1 B - \cdots - \Phi_p B^p, \qquad \Theta_q(B) = I - \Theta_1 B - \cdots - \Theta_q B^q$

We assume that all zeros of ∣Φp(B)∣ and ∣Θq(B)∣ lie outside the unit circle, so that we can also represent the process as

(9.80) $\mathbf{Z}_t = \Psi(B)\, \mathbf{a}_t = \sum_{j=0}^{\infty} \Psi_j\, \mathbf{a}_{t-j}$

where $\Psi(B) = \Phi_p^{-1}(B)\, \Theta_q(B) = \sum_{j=0}^{\infty} \Psi_j B^j$ with Ψ0 = I, and the sequence Ψj is square summable. The spectral density matrix of the VARMA(p,q) model is given by

(9.81) $f(\omega) = \frac{1}{2\pi}\, \Psi(e^{-i\omega})\, \Sigma\, \Psi^*(e^{-i\omega})$

where $\Psi(e^{-i\omega}) = \sum_{j=0}^{\infty} \Psi_j\, e^{-ij\omega}$ and Ψ*(e−iω) is its conjugate transpose.

Given a vector time series of n observations, Z = (Z1, Z2, …, Zn), we first build its VARMA(p,q) model, including the estimation of its parameter matrices. Let $\hat{\Phi}_1, \ldots, \hat{\Phi}_p$, $\hat{\Theta}_1, \ldots, \hat{\Theta}_q$, and $\hat{\Sigma}$ be the corresponding estimates of the parameter matrices. We have

(9.82) $\hat{\Phi}_p(B)\, \mathbf{Z}_t = \hat{\Theta}_q(B)\, \mathbf{a}_t$

where $\hat{\Phi}_p(B) = I - \hat{\Phi}_1 B - \cdots - \hat{\Phi}_p B^p$, $\hat{\Theta}_q(B) = I - \hat{\Theta}_1 B - \cdots - \hat{\Theta}_q B^q$, and at is a sequence of m‐dimensional vector white noise, VWN(0, $\hat{\Sigma}$). Then, the spectral density matrix estimate of the underlying process is given by

(9.83) $\hat{f}(\omega) = \frac{1}{2\pi}\, \hat{\Psi}(e^{-i\omega})\, \hat{\Sigma}\, \hat{\Psi}^*(e^{-i\omega})$

where

(9.84) $\hat{\Psi}(e^{-i\omega}) = \hat{\Phi}_p^{-1}(e^{-i\omega})\, \hat{\Theta}_q(e^{-i\omega})$

In practice, given a vector time series Zt, t = 1, 2, …, n, one can often approximate the underlying process with a VAR(p) model,

(9.85) $\mathbf{Z}_t = \Phi_1 \mathbf{Z}_{t-1} + \cdots + \Phi_p \mathbf{Z}_{t-p} + \mathbf{a}_t$

So, the commonly used parametric estimate of the spectral matrix is

(9.86) $\hat{f}(\omega) = \frac{1}{2\pi}\, \hat{\Phi}_p^{-1}(e^{-i\omega})\, \hat{\Sigma}\, \left[\hat{\Phi}_p^{-1}(e^{-i\omega})\right]^*$

where $\hat{\Phi}_p(e^{-i\omega}) = I - \sum_{j=1}^{p} \hat{\Phi}_j\, e^{-ij\omega}$, and the order p can be determined through model selection criteria such as Akaike’s information criterion (AIC) or the Bayesian information criterion (BIC).
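A sketch of this VAR spectral estimator, assuming the statsmodels package is available for the VAR fit (VAR(Z).fit(p) performs ordinary least squares estimation), is given below; it implements Eq. (9.86) directly, with our own function name.

```python
import numpy as np
from statsmodels.tsa.api import VAR

# Parametric VAR(p) spectral matrix estimate of Eq. (9.86).
def var_spectrum(Z, p, freqs):
    """Z: (n, m) array; p: VAR order; freqs: angular frequencies."""
    res = VAR(Z).fit(p)
    Phi, Sigma = res.coefs, res.sigma_u    # (p, m, m) and (m, m)
    m = Sigma.shape[0]
    f = np.empty((len(freqs), m, m), dtype=complex)
    for i, w in enumerate(freqs):
        A = np.eye(m, dtype=complex)       # Phi_p(e^{-iw}) = I - sum_j Phi_j e^{-ijw}
        for j in range(p):
            A -= Phi[j] * np.exp(-1j * w * (j + 1))
        Ainv = np.linalg.inv(A)
        f[i] = Ainv @ Sigma @ np.conj(Ainv.T) / (2.0 * np.pi)
    return f

# Example: f_hat = var_spectrum(Z, p=4, freqs=np.linspace(0, np.pi, 128))
```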

9.4 Empirical examples of stationary vector time series

9.4.1 Sample spectrum

Let us first examine the sample spectrum of each series, which, as shown in Wei (2006) and Priestley (1981), can be used to test for hidden periodic components. The sample spectra are displayed in Figure 9.2, and they are clearly noisy.


Figure 9.2 The sample spectrum of the five U.S. monthly sales changes.

So, we will compute a smoothed kernel sample spectrum matrix

(9.88) $\hat{f}_S(\omega) = [\hat{f}_{S,i,j}(\omega)]$

where

(9.89) $\hat{f}_{S,i,j}(\omega_p) = \sum_{|k| \le M} W(\omega_k)\, \hat{f}_{i,j}(\omega_p - \omega_k)$

W(ω) is a spectral window, and M is the corresponding bandwidth. Let us simply consider Daniell's window, that is,

(9.90) $W(\omega_k) = \frac{1}{2M + 1}, \quad -M \le k \le M$

where

(9.91) $\omega_k = \frac{2\pi k}{n}$

The five individual smoothed sample spectra with M = 5 are shown in the first row of Figure 9.3. We then increase the bandwidth M to 7. The corresponding five smoothed sample spectra are shown in the second row of Figure 9.3, and they are much smoother.
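The Daniell smoothing used here amounts to a centered moving average of the raw spectrum ordinates; a minimal sketch follows, in which the circular padding at the ends is a simplification of ours rather than part of the window's definition.

```python
import numpy as np

# Daniell smoothing of Eq. (9.90): average the 2M + 1 neighboring
# sample-spectrum ordinates at each Fourier frequency.
def daniell_smooth(f_hat, M):
    kernel = np.full(2 * M + 1, 1.0 / (2 * M + 1))
    padded = np.concatenate([f_hat[-M:], f_hat, f_hat[:M]])  # circular pad
    return np.convolve(padded, kernel, mode="valid")

z = np.random.default_rng(5).normal(size=360)
raw = np.abs(np.fft.rfft(z)) ** 2 / (2.0 * np.pi * len(z))   # raw spectrum
f5, f7 = daniell_smooth(raw, 5), daniell_smooth(raw, 7)      # M = 5 and 7
```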


Figure 9.3 The five individual smoothed sample spectra using the Daniell window with M = 5 (first row) and M = 7 (second row).

It is clear that, except for AUT, all the other estimated spectra contain a large peak at two cycles per year, which corresponds to a half‐year periodic pattern. In addition, the estimated spectra of AUT and BUM have a peak at one cycle per year, which is related to a yearly pattern. Finally, the spectra of GEM and COM also suggest a peak at four cycles per year, which is related to seasonal effects. In general, the power of AUT and GRO concentrates at higher frequencies; the power of BUM concentrates at lower frequencies; and the power of GEM and COM seems evenly spread from medium to high frequencies.

From Eqs. (9.31) and (9.32), we have

(9.92) $c_{i,j}(\omega) = \frac{1}{2}\left[ f_{i,j}(\omega) + \bar{f}_{i,j}(\omega) \right]$

and

(9.93) $q_{i,j}(\omega) = \frac{i}{2}\left[ f_{i,j}(\omega) - \bar{f}_{i,j}(\omega) \right]$

Once the smoothed f̂S,i,j(ω) are obtained, we can compute the estimated co‐spectra ĉi,j(ω) and the estimated quadrature spectra q̂i,j(ω) as follows:

(9.94) $\hat{c}_{i,j}(\omega) = \frac{1}{2}\left[ \hat{f}_{S,i,j}(\omega) + \bar{\hat{f}}_{S,i,j}(\omega) \right]$

and

(9.95) $\hat{q}_{i,j}(\omega) = \frac{i}{2}\left[ \hat{f}_{S,i,j}(\omega) - \bar{\hat{f}}_{S,i,j}(\omega) \right]$

for i = 1, …, 5 and j = 1, …, 5. For illustration, we use the Daniell window with M = 7, and Figure 9.4 shows the estimated co‐spectra, ĉ5,j(ω), and quadrature spectra, q̂5,j(ω), for j = 1, …, 4.


Figure 9.4 Estimated co‐spectra (a–d): ĉ5,1(ω), ĉ5,2(ω), ĉ5,3(ω), and ĉ5,4(ω); estimated quadrature spectra (e–h): q̂5,1(ω), q̂5,2(ω), q̂5,3(ω), and q̂5,4(ω).

The estimated squared coherences can be obtained by

(9.96) $\hat{K}_{i,j}^2(\omega) = \frac{\hat{c}_{i,j}^2(\omega) + \hat{q}_{i,j}^2(\omega)}{\hat{f}_{S,i,i}(\omega)\, \hat{f}_{S,j,j}(\omega)}$

They are shown in Figure 9.5. An unexpected observation is that the squared coherence between COM and GEM is large and constant across all frequencies. This indicates a strong relationship between the sales changes of consumer materials and general merchandise.


Figure 9.5 The estimated squared coherences with the Daniell window and bandwidth of 7.

9.4.2 Bayesian method

9.4.3 Penalized Whittle likelihood method

9.4.4 Example of VAR spectrum estimation

According to Eq. (9.86), the estimated spectral matrix is

(9.98) $\hat{f}(\omega) = \frac{1}{2\pi}\, \hat{\Phi}_4^{-1}(e^{-i\omega})\, \hat{\Sigma}\, \left[\hat{\Phi}_4^{-1}(e^{-i\omega})\right]^*$

where $\hat{\Phi}_4(e^{-i\omega}) = I - \sum_{j=1}^{4} \hat{\Phi}_j\, e^{-ij\omega}$.

Figure 9.8 shows the estimated spectra from the VAR(4) model. Compared with the kernel smoothed estimates given in Figure 9.3, we see some similarities but also some differences. Similar to the nonparametric estimates, only the spectra of GEM and COM have a peak at four cycles per year, which is related to seasonal effects; BUM concentrates at lower frequencies, GRO concentrates at higher frequencies, and the powers of GEM and COM are more evenly spread out in frequency. From Figure 9.3, we saw that the estimated spectra, except for AUT, contain a large peak at two cycles per year; in the VAR estimates shown in Figure 9.8, all the estimated spectra, including AUT, contain a large peak at two cycles per year. Also, in the VAR estimates all spectra have a peak at one cycle per year, which is related to a yearly pattern, whereas in the kernel estimates only AUT and BUM have a peak at one cycle per year. Overall, the kernel window smoothed estimates are much smoother than those of the VAR model. One possible reason is that the sample size of this particular example is small, which leads to unstable VAR parameter estimation.


Figure 9.8 The estimated power spectra by using the VAR(4) model.

Figure 9.9 shows the estimated co‐spectra, ĉ5,j(ω), and quadrature spectra, q̂5,j(ω), for j = 1, …, 4, from the VAR(4) model. Compared with the kernel smoothed estimates given in Figure 9.4, we again see some similarities and differences. However, the VAR estimates of the squared coherences shown in Figure 9.10 are quite similar to the kernel estimates in Figure 9.5. Both estimation methods show that the estimated squared coherence between COM and GEM is large and constant across all frequencies, indicating a strong relationship between the sales changes of consumer materials and general merchandise.


Figure 9.9 Estimated co‐spectra (a–d): ĉ5,1(ω), ĉ5,2(ω), ĉ5,3(ω), and ĉ5,4(ω), and estimated quadrature spectra (e–h): q̂5,1(ω), q̂5,2(ω), q̂5,3(ω), and q̂5,4(ω), from the VAR(4) model.


Figure 9.10 The estimated squared coherences by the VAR(4) model.

9.5 Spectrum analysis of a nonstationary vector time series

9.5.1 Introduction

For a nonstationary time series, we normally use some transformation, such as variance stabilization and/or differencing, to reduce it to stationarity before performing spectral matrix estimation. However, there are many kinds of nonstationary time series that cannot be reduced to stationarity by these transformations. Let us first consider the univariate case. There are many univariate nonstationary processes Zt that cannot be represented by the spectral representation given in Eq. (9.1), because the function ϕt(ω) = eiωt, a combination of sine and cosine waves, corresponds to a stationary process. Priestley (1965, 1966, 1967) pointed out that in this case, instead of using ϕt(ω) = eiωt, we need to consider an oscillatory function, which leads to a generalized Fourier transform,

(9.99) $\phi_t(\omega) = A(t, \omega)\, e^{i\omega t}$

so that

(9.100) $Z_t = \int_{-\pi}^{\pi} A(t, \omega)\, e^{i\omega t}\, dU(\omega)$

where A(t,ω) is a time‐varying modulating or transfer function with absolute maximum at zero frequency. In other words, Zt is an oscillatory process with an evolutionary spectrum, which has the same type of physical interpretation as the spectrum of a stationary process. The main difference is that while the spectrum of a stationary process describes the power distribution across frequencies over all time, the evolutionary spectrum describes the power distribution over frequency at an instantaneous time. However, within this framework, letting the length of the series n → ∞ does not increase our knowledge about the local behavior of the spectrum, since the pattern of the forthcoming series may be different. As a result, the formulation does not allow the development of a rigorous asymptotic theory of statistical inference. To overcome this problem, Dahlhaus (1996, 2000) introduced locally stationary time series, which allow a theoretical asymptotic analysis of the evolutionary spectrum in the univariate case, and further extended this framework to the multivariate case.

9.5.2 Spectrum representations of a nonstationary multivariate process

Let {Zt : t = 1, …, n} be an m‐dimensional time series. The idea of a locally stationary process is to rescale the transfer function to the unit time scale, such that

(9.101) $\mathbf{Z}_t = \int_{-\pi}^{\pi} A^0\!\left(\frac{t}{n}, \omega\right) e^{i\omega t}\, d\mathbf{U}(\omega)$

More rigorously, we have the following definition.

Based on Definition 9.1, the time‐varying power spectrum of the process is given by

(9.103) $f(u, \omega) = A(u, \omega)\, A(u, \omega)^*$

Developed from locally stationary time series, methods for estimating the time‐varying spectrum of a multivariate time series can be roughly grouped into three categories. The first category consists of estimators in which second‐order frequency domain structures evolve continuously over time. This includes the slowly evolving multivariate locally stationary process, which can be analyzed parametrically by fitting time series models with time‐varying parameters (Dahlhaus, 2000), and nonparametrically via the bivariate smoothing of spectral components as functions of frequency and time (Guo and Dai, 2006). The second category consists of estimators that are piecewise stationary. Approaches within this category typically divide a time series into approximately stationary segments and then estimate the local spectrum within segments. These methods include both parametric approaches, such as fitting piecewise vector autoregressive models (Davis et al., 2006), and nonparametric approaches, such as using the multivariate smooth localized complex exponential (SLEX) library (Ombao, von Sachs, and Guo, 2005). The final category consists of methods that can automatically approximate both abrupt and slowly varying changes by averaging over piecewise stationary models; these include the smoothing stochastic approximation Monte Carlo (SSAMC) based method of Zhang (2016) and the reversible jump Markov chain Monte Carlo and Hamiltonian Monte Carlo (HMC) based method of Li and Krafty (2018).

9.5.2.1 Time‐varying autoregressive model

One of the methods used in Dahlhaus (2000) is the time‐varying VARMA(p,q) model. The time‐varying vector autoregressive model VAR(p) is defined by

(9.104) $\mathbf{Z}_t = \sum_{j=1}^{p} \Phi_j\!\left(\frac{t}{n}\right) \mathbf{Z}_{t-j} + \Sigma\!\left(\frac{t}{n}\right) \boldsymbol{\varepsilon}_t$

where the εt are m‐dimensional independent random variables with mean zero and identity covariance matrix I. In addition, we assume some smoothness conditions on Σ(⋅) and Φj(⋅). In some neighborhood of a fixed time point u0 = t0/n, the process Zt can be approximated by the stationary process Zt(u0) given by

(9.105) $\mathbf{Z}_t(u_0) = \sum_{j=1}^{p} \Phi_j(u_0)\, \mathbf{Z}_{t-j}(u_0) + \Sigma(u_0)\, \boldsymbol{\varepsilon}_t$

Zt has a unique time‐varying power spectrum, which is locally the same as the power spectrum of Zt(u), that is,

(9.106) $f(u, \omega) = \frac{1}{2\pi}\, \Phi^{-1}(u, e^{-i\omega})\, \Sigma(u)\, \Sigma(u)^*\, \left[\Phi^{-1}(u, e^{-i\omega})\right]^*$

where u = t/n, and

$\Phi(u, e^{-i\omega}) = I - \sum_{j=1}^{p} \Phi_j(u)\, e^{-ij\omega}$

Similarly, the local covariance matrix is

(9.107) $\Gamma(u, k) = \int_{-\pi}^{\pi} e^{i\omega k}\, f(u, \omega)\, d\omega$

Based on this statement, the time‐varying spectrum can be obtained by estimating the time‐varying parameters of the VAR model. For more properties of the estimation based on time‐varying VARMA(p,q), VAR(p), and VMA(q) models, we refer readers to Dahlhaus (2000).
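In the simplest case, this idea can be sketched by refitting a stationary VAR over a sliding window and evaluating its spectral matrix at the window midpoint, reusing the var_spectrum() sketch from Section 9.3.6; the window width and step below are illustrative choices, and this rolling scheme is only a crude stand-in for the local estimation theory of Dahlhaus (2000).

```python
# Rolling-window approximation to the time-varying VAR spectrum of
# Eq. (9.106); var_spectrum() is the sketch from Section 9.3.6.
def tv_var_spectrum(Z, p, width, step, freqs):
    out = []
    for start in range(0, len(Z) - width + 1, step):
        block = Z[start:start + width]         # locally stationary block
        out.append((start + width // 2, var_spectrum(block, p, freqs)))
    return out  # list of (midpoint index, local spectral matrices)
```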

9.5.2.2 Smoothing spline ANOVA model

Based on the locally stationary process framework, the time‐varying spectrum can also be estimated nonparametrically via the smoothing spline analysis of variance (ANOVA) model of Guo and Dai (2006). However, their definition of a locally stationary process is slightly different from that of Dahlhaus (2000). In Section 9.5.2, we mentioned that Dahlhaus (2000) assumes a sequence of transfer functions A0(t/n,ω) that converge to a large‐sample transfer function A(u,ω) in order to allow the fitting of parametric models. Since Guo and Dai (2006) considered nonparametric estimation, they used A(u,ω) directly.

Based on this definition, the smoothing ANOVA model takes a two‐stage estimation procedure. At the first stage, the locally stationary process is approximated by piecewise stationary time series with small blocks to obtain initial spectrum estimates and the Cholesky decomposition. The initial spectrum estimates are obtained by the multitaper method to reduce variance. At the second stage, each element of the Cholesky decomposition is treated as a bivariate smooth function of time and frequency and is modeled by the smoothing spline ANOVA model by Gu and Wahba (1993). The final estimated time‐varying spectrum is reconstructed from the smoothed elements of the Cholesky decomposition. Thus, the method provides a way to ensure the final estimate of the multivariate spectrum is positive–definite while allowing enough flexibility in the smoothness of its elements.

We shall briefly discuss the smoothing spline ANOVA step. Suppose the spectral matrix f(u,ω) has the Cholesky decomposition f(u,ω) = L(u,ω)L*(u,ω), where L(u,ω) is an m × m lower triangular matrix with elements γj,k(u,ω). The method smooths the diagonal elements γj,j(u,ω), j = 1, …, m, the real parts ℜ{γj,k(u,ω)}, and the imaginary parts ℑ{γj,k(u,ω)}, for j > k, separately, with their own smoothing parameters. Let γ(u,ω) ∈ {γj,j(u,ω), ℜ{γj,k(u,ω)}, ℑ{γj,k(u,ω)}, j > k, for j, k = 1, …, m}. We have

$\gamma(u, \omega) = a(u, \omega) + \varepsilon(u, \omega)$

where a(u,ω) is the corresponding Cholesky decomposition element of the spectrum, such that a(u,ω) = E{γ(u,ω)}, and the ε(u,ω) are independent errors with zero mean and variances depending on the time‐frequency point (u,ω).

The smoothing spline ANOVA model is defined by the corresponding reproducing kernels (RKs) for the time and frequency domains. In the frequency domain, the reproducing kernel Hilbert space (RKHS) W1 for ω can be decomposed as W1 = 1 ⊕ H1. The reproducing kernel for H1 is R1(ω1, ω2) = −K4(∣ω1 − ω2∣)/24, where Kk(⋅) is the kth order Bernoulli polynomial. In the time domain, the RKHS W2 can be decomposed as W2 = 1 ⊕ (u − 0.5) ⊕ H2. The reproducing kernel for H2 is R2(u1, u2) = K2(u1)K2(u2)/4 − K4(∣u1 − u2∣)/24. The full tensor product RKHS for (u,ω) is then

(9.108) $W = W_1 \otimes W_2$
(9.109) $= [1 \oplus H_1] \otimes [1 \oplus (u - 0.5) \oplus H_2]$
(9.110) $= 1 \oplus (u - 0.5) \oplus H_2 \oplus H_1 \oplus H_3 \oplus H_4$

where the RK for H3 = H1 ⊗ (u − 0.5) is R3{(u1, ω1), (u2, ω2)} = R1(ω1, ω2)(u1 − 0.5)(u2 − 0.5), and the RK for H4 = H1 ⊗ H2 is R4{(u1, ω1), (u2, ω2)} = R1(ω1, ω2)R2(u1, u2). Thus, a(u,ω) has the ANOVA‐like decomposition

(9.111) $a(u, \omega) = d_1 + d_2(u - 0.5) + a_1(\omega) + a_2(u) + a_3(u, \omega) + a_4(u, \omega)$

where d1 + d2(u − 0.5) is the linear trend, a1(ω) is the smooth main effect in frequency, a2(u) is the smooth main effect in time, a3(u,ω) is the component that is smooth in frequency and linear in time, and a4(u,ω) is the interaction that is smooth in both time and frequency. Least squares estimation can be used to obtain the estimates of these components. For details, see Guo and Dai (2006).

9.5.2.3 Piecewise vector autoregressive model

Davis, Lee, and Rodriguez‐Yam (2006) considered modeling nonstationary time series using piecewise VAR processes. The number and locations of the piecewise VAR segments, and the orders of the corresponding VAR processes, are assumed to be unknown. Based on the minimum description length (MDL) principle, the method penalizes the complexity of the model, and thus provides a criterion to define the best fitting model. We refer readers to Rissanen (1989) and Hansen and Yu (2001) for more details about the MDL principle.

Let Zt be an m‐dimensional time series with k segments, and assume that there are partitions or changepoints δ = (δ0, …, δk)′ with δ0 = 0 and δk = n. The time series within the jth segment is modeled by a VAR(pj) process, such that

(9.112) $\mathbf{Z}_t = \Phi_1^{(j)} \mathbf{Z}_{t-1} + \cdots + \Phi_{p_j}^{(j)} \mathbf{Z}_{t-p_j} + \mathbf{a}_t^{(j)}, \qquad \delta_{j-1} < t \le \delta_j$

where the $\Phi_i^{(j)}$, i = 1, …, pj, are m × m coefficient matrices of the VAR process. We define the entire class of piecewise VAR models as M and a model from this class as F ∈ M. The principle of MDL is to find the best fitting model from M as the one that produces the shortest code length, where the code length of an object is the amount of memory space required to store it. The MDL has two components: one for a fitted model $\hat{F}$, and one for the portion of the data that is unexplained by $\hat{F}$, which can be defined as the residuals $\hat{\mathbf{e}}_t$. Let $L(\hat{F})$ be the code length of the fitted model $\hat{F}$ and $L(\hat{\mathbf{e}}_j \mid \hat{F})$ be the code length of the residuals in the jth segment. Then, the total code length of the data can be decomposed as

(9.113) $L(\mathbf{Z}_t) = L(\hat{F}) + \sum_{j=1}^{k} L(\hat{\mathbf{e}}_j \mid \hat{F})$

The MDL principle selects as best fitting model the one that minimizes L(Zt).

Let nj be the sample size of the jth segment, and recall that δ carries the information about the partition of segments. Since the piecewise VAR model can be characterized by the parameters k, δ, p = (p1, …, pk), and the estimated coefficient matrices $\hat{\Phi}^{(1)}, \ldots, \hat{\Phi}^{(k)}$, we can further decompose $L(\hat{F})$ by

(9.114) $L(\hat{F}) = L(k) + L(\boldsymbol{\delta}) + \sum_{j=1}^{k} L(p_j) + \sum_{j=1}^{k} L(\hat{\Phi}^{(j)})$

(9.115) $= L(k) + L(n_1, \ldots, n_k) + \sum_{j=1}^{k} L(p_j) + \sum_{j=1}^{k} L(\hat{\Phi}^{(j)})$

The equality is due to the fact that complete knowledge of δ implies complete knowledge of n1, …, nk. In order to store an integer I whose value is not bounded, approximately log2I bits are needed. Thus, we have L(k) = log2k and L(pj) = log2pj. On the other hand, if an integer I is upper‐bounded, say by IU, then log2IU bits are needed. Since nj is bounded by n, we have L(nj) = log2n and L(n1, …, nk) = klog2n (Hansen and Yu, 2001). It is known that a maximum likelihood estimate of a real parameter computed from n observations can be effectively encoded with (1/2)log2n bits. Because the total number of parameters in $\hat{\Phi}^{(j)}$ is (pj + 1)m2, we have

(9.116) $L(\hat{\Phi}^{(j)}) = \frac{(p_j + 1)\, m^2}{2} \log_2 n_j$

Combining these results, we have

(9.117) $L(\hat{F}) = \log_2 k + k \log_2 n + \sum_{j=1}^{k} \left[ \log_2 p_j + \frac{(p_j + 1)\, m^2}{2} \log_2 n_j \right]$

Rissanen (1989) showed that the code length of the residuals is well approximated by the negative of the log likelihood of the fitted model. Thus, we have

(9.118) $L(\hat{\mathbf{e}}_j \mid \hat{F}) \approx -\log_2 \mathcal{L}(\hat{\Phi}^{(j)})$

where 𝓛 denotes the Whittle likelihood. Consequently, from Eqs. (9.114) and (9.115), the MDL objective function is defined as

(9.119) $\mathrm{MDL}(k, \boldsymbol{\delta}, \mathbf{p}) = \log_2 k + k \log_2 n + \sum_{j=1}^{k} \left[ \log_2 p_j + \frac{(p_j + 1)\, m^2}{2} \log_2 n_j - \log_2 \mathcal{L}(\hat{\Phi}^{(j)}) \right]$

The best fitting model is the one that minimizes the MDL. To overcome the difficulties that arise in optimizing this criterion, the method uses a genetic algorithm; for a fixed segmentation, the criterion itself can be evaluated as in the sketch below.
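As a simplification of ours, the Gaussian log likelihood reported by statsmodels' VAR fit replaces the Whittle likelihood in Eq. (9.118), and a full implementation would search over segmentations, for example with the genetic algorithm mentioned above.

```python
import numpy as np
from statsmodels.tsa.api import VAR

# Evaluate the MDL of Eq. (9.119) for a fixed segmentation.
def mdl(Z, breaks, orders):
    """Z: (n, m) array; breaks: (d_0=0, ..., d_k=n); orders: (p_1, ..., p_k)."""
    n, m = Z.shape
    k = len(orders)
    total = np.log2(k) + k * np.log2(n)
    for j in range(k):
        seg = Z[breaks[j]:breaks[j + 1]]
        nj, pj = len(seg), orders[j]
        res = VAR(seg).fit(pj)                 # local VAR(p_j) fit
        total += (np.log2(pj) + 0.5 * (pj + 1) * m**2 * np.log2(nj)
                  - res.llf / np.log(2.0))     # -log2 likelihood term
    return total
```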

9.5.2.4 Bayesian methods

Two Bayesian methods have recently been proposed to estimate the multivariate time‐varying spectrum: Zhang (2016) and Li and Krafty (2018). In this section, we focus on the method of Li and Krafty (2018). The method is also based on locally stationary time series and estimates the time‐varying spectrum

(9.120) $f(u, \omega) = A(u, \omega)\, A(u, \omega)^*$

The method assumes that, for every u ∈ (0, 1), each component of f(u, ⋅) possesses a square‐integrable first derivative as a function of frequency, and that, for every ω, each component of f(⋅, ω) is continuous as a function of scaled time at all but possibly a finite number of points. The assumptions on the transfer function and the time‐varying spectrum differ slightly from Definitions 9.1 and 9.2 in two ways. First, Definition 9.1 assumes a sequence of transfer functions A0(u,ω) that converge to a large‐sample transfer function A(u,ω) in order to allow the fitting of parametric models. Since the method considers nonparametric estimation, in a manner similar to the smoothing spline ANOVA method, A(u,ω) is used directly. Second, Definitions 9.1 and 9.2 require the time‐varying spectrum to be continuous in both time and frequency. The method is more flexible and allows components of the spectrum to evolve not only continuously but also abruptly in time.

The analysis of a locally stationary time series {Zt : t = 1, …, n} begins by using a piecewise stationary approximation. Consider a partition of the time series into k segments defined by partition points δ = (δ0, …, δk) with δ0 = 0 and δk = n, such that Zt is approximately stationary within the segments {t : δq − 1 < t ≤ δq} for q = 1, …, k. Then

(9.121) $\mathbf{Z}_t \approx \int_{-\pi}^{\pi} \sum_{q=1}^{k} A_q(\omega)\, e^{i\omega t}\, d\mathbf{U}(\omega)$

where Aq(ω) = A(uq, ω)I(δq − 1 < t ≤ δq), I(⋅) is the indicator function, and uq = (δq + δq − 1)/2n is the scaled midpoint of the qth segment. Within the qth segment, the time series is approximately second‐order stationary with local power spectrum f(uq, ω) = Aq(ω)Aq(ω)*.

Conditional on an approximately stationary partition δ, known approaches and properties for stationary time series can be applied. In particular, the large‐sample distribution of the discrete Fourier transform of a stationary time series that provides the Whittle likelihood allows for the formulation of a product of Whittle likelihoods. Let us define the local discrete Fourier transform at frequency ℓ within segment q as

(9.122) $\mathbf{y}_{q,\ell} = n_q^{-1/2} \sum_{t=\delta_{q-1}+1}^{\delta_q} \mathbf{Z}_t\, e^{-2\pi i \omega_{q,\ell} t}$

where nq = δq − δq − 1 is the number of observations in segment q, ωq,ℓ = ℓ/nq are the Fourier frequencies, and Lq = [(nq − 1)/2]. Given a partition of k segments, δ, the yq,ℓ are approximately independent zero‐mean complex multivariate Gaussian random variables with covariance matrices equal to the spectral matrices f(uq, ωq,ℓ). This leads to a loglikelihood that can be approximated by a sum of log Whittle likelihoods,

(9.123) $\log L(\mathbf{Y} \mid \boldsymbol{\delta}, f) \approx \sum_{q=1}^{k} \sum_{\ell=1}^{L_q} \left\{ -\log\left| f(u_q, \omega_{q,\ell}) \right| - \mathbf{y}_{q,\ell}^*\, f(u_q, \omega_{q,\ell})^{-1}\, \mathbf{y}_{q,\ell} \right\}$

where Y is the collection of Fourier transforms.

The spectral matrix is positive–definite and, to nonparametrically allow for flexible smoothing among the different components while preserving positive definiteness, the procedure models the Cholesky components of the local spectra via linear penalized splines. The modified Cholesky decomposition represents a time‐varying spectral matrix as

(9.124) $f^{-1}(u, \omega) = \Theta(u, \omega)\, \Psi(u, \omega)^{-1}\, \Theta(u, \omega)^*$

for a complex‐valued m × m lower triangular matrix Θ(u,ω) with ones on the diagonal and a positive diagonal matrix Ψ(u,ω). For a piecewise stationary approximation with partition δ into k segments, we define the local modified Cholesky decomposition as f−1(uq, ω) = Θ(uq, ω)Ψ(uq, ω)−1Θ(uq, ω)*, for q = 1, …, k, and let θj,k,q(ω) and ψj,j,q(ω) be the (j,k) and (j,j) elements of Θ(uq, ω) and Ψ(uq, ω), respectively. Then, for each segment, there are m2 components to estimate: ℜ{θj,k,q(ω)} for j > k, ℑ{θj,k,q(ω)} for j > k, and ψj,j,q(ω) for j = 1, …, m. The Cholesky components are modeled by periodic even and odd linear splines by considering

(9.125)equation
(9.126)equation

and

(9.127)equation

The Fourier frequencies for each segment form an equally spaced grid, so that Demmler–Reinsch bases for periodic even and odd smoothing splines for local periodograms are given by {cos(2πsω) : s = 0, 1, …, (Lq − 1)} and {sin(2πsω) : s = 1, …, Lq}, respectively. For the details, we refer readers to Schwarz and Krivobokova (2016).

The method relies on reversible jump Markov chain Monte Carlo and HMC methods to sample from the posterior distributions. The estimates of the time‐varying spectrum components are obtained by averaging over the distribution of partitions, so that both abrupt and slowly varying changes can be recovered.

9.6 Empirical spectrum example of nonstationary vector time series

We first applied the method of Li and Krafty (2018) to the time series. Figures 9.12 and 9.13 display the estimated spectra and pairwise squared coherences, respectively. There are several obvious changes in the spectra and coherences, including (i) the period 1997–1998, which corresponds to the Asian financial crisis of 1997, which also affected the U.S. economy; (ii) the year 2003, which is the period of the U.S. invasion of Iraq; and (iii) the period 2007–2009, which corresponds to the global financial crisis that began in 2007. All the coherences had a big jump during this last period, instead of a drop as in 1997–1998, indicating that the financial crisis of 2007–2009 had broad impacts on both the DJIA and the NASDAQ.


Figure 9.12 Estimated spectrum of Dow Jones Industrial Average (DJIA) (left), NASDAQ (middle), and S&P 500 (right).


Figure 9.13 Estimated squared pairwise coherence between DJIA and NASDAQ (left), DJIA and S&P 500 (middle), and NASDAQ and S&P 500 (right).

When a finite number of stationary segments is identified with high probability, the results can be displayed as estimated local spectra at a given time point. For example, we obtain the local spectra of the DJIA and NASDAQ and their coherence at the week of October 18, 2007 in Figure 9.14. The spectra and coherence are all flat, and they clearly indicate the well‐known phenomenon that the log returns of stock index series behave like white noise.


Figure 9.14 Local spectrum of DJIA (top left), NASDAQ (top right), and their coherences (bottom) at October 18, 2007.

We then applied the piecewise vector autoregressive method of Davis, Lee, and Rodriguez‐Yam (2006) to the same data set. The procedure produces changepoints at the following dates: July 13, 1998, July 15, 2002, September 8, 2003, and June 25, 2007. Figure 9.15 presents the DJIA time series along with the changepoints found by the approach of Davis et al. (2006). The changepoints found by the piecewise autoregressive model are somewhat similar to those found by the Bayesian method. However, one of the main differences between the assumptions of the two methods is that the number and locations of the partitions (changepoints) are random in the Bayesian method, while they are fixed for the piecewise autoregressive method.


Figure 9.15 Weekly log returns for the Dow Jones from April 1990 to December 2011 along with the changepoints found by the piecewise vector autoregressive model of Davis, Lee, and Rodriguez‐Yam (2006).

Before concluding this chapter on multivariate spectral analysis and its applications, I would like to mention some more recent references, including Grant and Quinn (2017), Ray et al. (2017), and Wilson (2017), among others, and point out that multivariate spectral analysis can also be used to analyze space–time data sets. However, it should be noted that, so far, the developed procedures apply only to a stationary or properly transformed stationary space–time series. For references on spectral analysis of space–time series, we refer readers to Bandyopadhyay, Jentsch, and Rao (2017) and Rao and Terdick (2017), among others.

Projects

  1. Find an m-dimensional stationary vector time series data set of your interest with m no less than 5. Complete a detailed nonparametric analysis of the multivariate spectrum with a written report and analysis software code.
  2. Find an m-dimensional stationary vector time series data set of your interest with m no less than 5. Complete a detailed VARMA spectral analysis with a written report and analysis software code.
  3. Use the data set from Project 1 to perform a VARMA spectral analysis, and compare the results from the two methods.
  4. Use the data set from Project 2 to perform a nonparametric analysis of the multivariate spectrum, and compare the results from the two methods.
  5. Find an m-dimensional nonstationary vector time series data set of your interest and complete a detailed multivariate spectral analysis with a written report and analysis software code.

References

  1. Bandyopadhyay, S., Jentsch, C., and Rao, S.S. (2017). A spectral domain test for stationarity of spatio‐temporal data. Journal of Time Series Analysis 38: 326–351.
  2. Bartlett, M.S. (1950). Periodogram analysis and continuous spectra. Biometrika 37: 1–16.
  3. Blackman, R.B. and Tukey, J.W. (1959). The Measurement of Power Spectra from the Point of View of Communications Engineering. Dover Publications.
  4. Brillinger, D.R. (2002). Time Series: Data Analysis and Theory. Philadelphia: SIAM.
  5. Chang, J., Hall, P., and Tang, C.Y. (2017). A frequency domain analysis of the error distribution from noisy high‐frequency data. Biometrika 103: 1–16.
  6. Dahlhaus, R. (1996). Fitting time series models to nonstationary processes. Annals of Statistics 25: 1–37.
  7. Dahlhaus, R. (2000). A likelihood approximation for locally stationary processes. Annals of Statistics 28: 1762–1794.
  8. Dai, M. and Guo, W. (2004). Multivariate spectral analysis using Cholesky decomposition. Biometrika 91: 629–643.
  9. Daniell, P.J. (1946). Discussion on symposium on autocorrelation in time series. Journal of the Royal Statistical Society (Suppl. 8): 88–90.
  10. Davis, R.A., Lee, T.C.M., and Rodriguez‐Yam, G.A. (2006). Structural break estimation for nonstationary time series models. Journal of the American Statistical Association 101: 223–239.
  11. Eubank, R.L. and Hsing, T. (2008). Canonical correlation for stochastic processes. Stochastic Processes and their Applications 118: 1634–1661.
  12. Goodman, N. (1963). Statistical analysis based on a certain multivariate complex Gaussian distribution. Annals of Mathematical Statistics 34: 152–177.
  13. Grant, A.J. and Quinn, B.G. (2017). Parametric spectral discrimination. Journal of Time Series Analysis 38: 838–864.
  14. Gu, C. and Wahba, G. (1993). Semiparametric analysis of variance with tensor product thin plate splines. Journal of the Royal Statistical Society, Series B 55: 353–368.
  15. Guo, W. and Dai, M. (2006). Multivariate time‐dependent spectral analysis using Cholesky decomposition. Statistica Sinica 16: 825–845.
  16. Hannan, E.J. (1970). Multiple Time Series. Wiley.
  17. Hansen, M.H. and Yu, B. (2001). Model selection and the principle of minimum description length. Journal of the American Statistical Association 96: 746–774.
  18. Krafty, R.T. and Collinge, W.O. (2013). Penalized multivariate Whittle likelihood for power spectrum estimation. Biometrika 100: 447–458.
  19. Li, Z. and Krafty, R.T. (2018). Adaptive Bayesian time‐frequency analysis of multivariate time series. To appear in Journal of the American Statistical Association.
  20. Ombao, H., von Sachs, R., and Guo, W. (2005). SLEX analysis of multivariate nonstationary time series. Journal of the American Statistical Association 100: 519–531.
  21. Parzen, E. (1961). Mathematical considerations in the estimation of spectra. Technometrics 3: 167–190.
  22. Parzen, E. (1963). Notes on Fourier analysis and spectral windows. Technical report No. 48, Office of Naval Research.
  23. Pawitan, Y. (1996). Automatic estimation of the cross‐spectrum of a bivariate time series. Biometrika 83: 419–432.
  24. Pawitan, Y. and O'Sullivan, F. (1994). Nonparametric spectral density estimation using penalized Whittle likelihood. Journal of the American Statistical Association 89: 600–610.
  25. Priestley, M.B. (1965). Evolutionary spectra and non‐stationary processes. Journal of the Royal Statistical Society, Series B 27: 204–237.
  26. Priestley, M.B. (1966). Design relations for non‐stationary processes. Journal of the Royal Statistical Society, Series B 28: 228–240.
  27. Priestley, M.B. (1967). Power spectral analyses of non‐stationary random processes. Journal of Sound and Vibration 6: 86–97.
  28. Priestley, M.B. (1981). Spectral Analysis and Time Series, vol. 1 and 2. Academic Press.
  29. Qin, L. and Wang, Y. (2008). Nonparametric spectral analysis with applications to seizure characterization using EEG time series. Annals of Applied Statistics 2: 1432–1451.
  30. Rao, T.S. and Terdick, G. (2017). On the frequency variogram and on frequency domain methods for the analysis of spatio‐temporal data. Journal of Time Series Analysis 38: 308–325.
  31. Ray, E.L., Sakrejda, K., Lauer, S.A., Johansson, M.A., and Reich, N.G. (2017). Infectious disease prediction with kernel conditional density estimation. Statistics in Medicine 36: 4908–4929.
  32. Riedel, K. and Sidorenko, A. (1995). Minimum bias multiple taper spectral estimation. IEEE Transaction on Signal Processing 43: 188–195.
  33. Rissanen, J. (1989). Stochastic Complexity in Statistical Inquiry. Singapore: World Scientific.
  34. Rosen, O. and Stoffer, D.S. (2007). Automatic estimation of multivariate spectra via smoothing splines. Biometrika 94: 335–345.
  35. Schwarz, K. and Krivobokova, T. (2016). A unified framework for spline estimators. Biometrika 103: 103–120.
  36. Slepian, D. (1978). Prolate spheroidal wave functions, Fourier analysis, and uncertainty – V: the discrete case. Bell System Technical Journal 57: 1371–1430.
  37. Thomson, D.J. (1982). Spectrum estimation and harmonic analysis. Proceedings of the IEEE 70: 1055–1096.
  38. Wahba, G. (1990). Spline Models for Observational Data, CBMS‐NSF Regional Conference Series in Applied Mathematics. Philadelphia: SIAM.
  39. Walden, A.T. (2000). A unified view of multitaper multivariate spectral estimation. Biometrika 87: 767–788.
  40. Wei, W.W.S. (2006). Time Series Analysis: Univariate and Multivariate Methods, 2e. Pearson Addison‐Wesley.
  41. Wei, W.W.S. (2008). Spectral analysis. In: Handbook of Longitudinal Research, Design, Measurement, and Analysis (ed. S. Menard), 601–620. Academic Press.
  42. Whittle, P. (1953). Estimation and information in stationary time series. Arkiv för Matematik 2: 423–434.
  43. Whittle, P. (1954). Some recent contributions to the theory of stationary processes. In: A Study in the Analysis of Stationary Time Series, 2e (ed. H.O. Wold), 196–228. Stockholm: Almqvist and Wiksell.
  44. Wilson, G.T. (2017). Spectral estimation of the multivariate impulse response. Journal of Time Series Analysis 38: 381–391.
  45. Wood, S.N. (2011). Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society, Series B 73: 3–36.
  46. Zhang, S. (2016). Adaptive spectral estimation for nonstationary multivariate time series. Computational Statistics & Data Analysis 103: 330–349.