14
Spectral Analysis

In time series analysis, the estimation of the correlation properties of random samples is referred to as spectral analysis. If a process x[n] is stationary, its correlation properties are fully defined by the autocorrelation or equivalently by its Fourier transform, in term of power spectral density (PSD):

images

Spectral analysis deals with the estimation of a PSD from a limited set of N observations images:

images

There are two fundamental approaches to spectral analysis: parametric or non‐parametric, depending if there is a parametric model for the PSD or not. Non‐parametric spectral analysis makes no assumptions on the PSD and thus it is more general and flexible as it can be used in every situation when there is no a‐priori knowledge on the mechanism that correlates the samples in the time series. Parametric methods model (or guess) the generation mechanism of the process and thus the spectral analysis reduces to the estimate of these parameters. Except for a few cases, spectral analysis is basically a search for the most appropriate (non‐parametric/parametric) method to better estimate the PSD and some features that are of primary interest for the application.

There is no general rule to carrying out spectral analysis except for common sense related to the needs of the specific application. Therefore, the organization of this chapter aims to offer a set of tools and to establish the pros and cons of each method based on statistical performance metrics. The lack of a general method that is optimal for every application makes spectral analysis a professional expertise itself where experience particularly matters. For this reason, spectral analysis is considered “an art”, and the experts are “artisans” of spectral analysis who specialize in different application areas. All the principal methods are discussed herein with the usual pragmatic approach aiming to provide the main tools to deal with the most common problems and to refine the intuition, while avoiding entering into excessive detail in an area that could easily fill a whole dedicated book, see e.g., [52].

14.1 Periodogram

Spectral analysis estimates the PSD images from the observations images, and performance is evaluated by the bias images and the variance images. As for any estimator, it is desirable that the PSD estimate is unbiased (at least asymptotically as images) and consistent. Non‐parametric spectral analysis is the periodogram, and it is historically based on the concept of the filter‐bank revised in Section 14.1.3, as every value of the PSD represents the power collected by a band‐pass filter centered at the specific frequency bin.

Let

be the discrete Fourier transform (DFT) of the sequence x evaluated for the k th bin of the angular frequency images (or k th frequency bin images). The periodogram is the estimator

and it corresponds to the energy density of the sequence of N samples |X(ωk)|2 divided by the number of samples. If considering unit‐time sampling, the periodogram divides the energy density by the observation time and reflects the definition of power density (power per unit of frequency bin). The periodogram has the benefit of simplicity and computational efficiency and this motivates its widespread use. However, an in‐depth knowledge of its performance is mandatory in order to establish the limits of this simple and widely adopted estimator.

14.1.1 Bias of the Periodogram

A careful inspection of the definition of the periodogram shows that the periodogram can be rewritten as an FT of the sequence of N samples images sampled in regular frequency bins:

images

but

images

is the FT of the sample autocorrelation

images

The properties of images govern the bias, and since (see Appendix A)

images

this proves that the periodogram (a sampled version of images) is biased, as

where the convolution images is periodic over 2π. The periodgram is a biased estimator of the PSD that is asymptotically unbiased as

images

Remark 1: The properties of the DFT makes it possible to evaluate the FT in other frequency bins, not necessarily spaced by 1/N (or 2π/N if angular frequency bins). This is obtained by padding with zeros the sequence images up to a new value, say images:

images

From signal theory [12,13] , the DFT over the M samples is the FT of the sequence sampled over the new frequency bin, which in turn is correlated (i.e., not statistically independent or uncorrelated) with the DFT X(ωk):

images

The DFT of y[n] is the interpolated values of DFT of x[n] (14.1) using the periodic sinc as interpolating function.

A common practice is to pad to multiple values of N, say images, so that the interpolating function images) honors the DFT X(ωk) and thus the periodogram images coincides with images as

images

with images interpolated values between the samples of the images.

Remark 2: The bias (14.3) vanishes when considering some special cases. Let the PSD of one (or more) real (or complex) sinusoids with amplitude A, random phase, and frequency ωo be

images

The bias becomes

images

and the periodogram is biased as shown in Figure 14.1. However, if the frequency

images

for some integer value o (i.e., the sinusoid has a period that makes o cycles into the N samples of the observation window), the periodogram is unbiased, as

images

Since the value of periodogram in o is images, apparently the periodogram seems to yield a wrong value for the power of the sinusoid, as it should be A2/2. Recall that the periodogram is the estimator of the power spectrum “density” and thus the power is obtained by multiplying each bin by the corresponding bandwidth, which in this case it is the frequency bin spacing 1/N, and thus

images

as expected

Another interesting case is when the PSD is for a white WSS random process:

images

when the bias becomes

images

that is, still unbiased. Once again, the power is still the same as summing N terms scaled by the bandwidth 1/N of the periodogram bin; the power is imagesimages.

Bias of periodogram for sinusoids, illustrated a vertical line coinciding (top) and not coinciding (bottom) with an upward arrow at middle of a bell-shaped curve with circle markers.

Figure 14.1 Bias of periodogram for sinusoids.

14.1.2 Variance of the Periodogram

The exact evaluation of the variance of the periodogram is not straightforward, but some simplifications in the simple case can be drawn before providing the general equation. For a real‐valued process, it can be written as

The computation needs the 4th order moments, which are not straightforward. However, the computation becomes manageable if the process is Gaussian and white (in this way all the conclusions can be extended to every bin, regardless of the bias effect in (14.3)) so that images. For stationary processes involving stationarity up to the 4th order moment of the Gaussian process (this condition is stricter than wide‐sense stationary condition, but not uncommonly seen to be fulfilled):1

images

and furthermore from the bias:

These two identities let us simplify the variance (14.4) as

images

and the first term can be further simplified from (14.5) into the sum of two terms that from complex conjugate symmetry are

images

The first term becomes (from (14.5))

images

while the second term can be evaluated in closed form from the statistical uncorrelation of the samples

images

To summarize, the variance of the periodogram for an observation of N samples of a white Gaussian process with power images is

images

and for the samples of the periodogram it is

images

The periodogram is not consistent as its variance does not vanish for images and some remedies are mandatory in order to use the periodogram as a PSD estimator, since in this simple form the variance of the periodogram is intolerably high and, surprisingly, it is a very bad estimator!

The variance derived so far can be used as a good approximation for the variance in any context (not necessarily for white Gaussian processes), and it follows the routinely adopted variance of the periodogram for real‐valued discrete‐time signals:

without the effect of bias. Control of the variance is mandatory in order to use the periodogram effectively as an estimator of the PSD, and this will be discussed next.

Remark: The periodogram is based on the DFT and thus all the properties of symmetry of Fourier transforms hold. Namely, if the process is real, images, and this motivates the larger variance for the samples images. However, for a complex process the DFT is no longer symmetric: images, and the variance is the same for all frequency bins:

images

A justification is offered in the next section.

14.1.3 Filterbank Interpretation

The relationship (14.6) is not surprising if one considers the effective measurements carried out by every frequency bin. The periodogram can be interpreted as a collection of signals from an array of band‐pass filters (filter‐bank) that separates the signal x[n] into the components images such that (Figure 14.2)

images

where the kth filter is centered at the frequency images with a bandwidth images to guarantee a uniform filter‐bank (see [55] for multirate filtering and filter‐banks).

The samples yk[n] are temporally correlated as the bandwidth of the process is as wide as the bandwidth of the filter Hk(ω); these samples can be extracted with a coarser sampling rate that extracts one sample every N (decimation 1:N) so that the samples after the filter‐bank are a set of components images; these components are statistically uncorrelated (as maximally decimated [55]) and can be used to extract the power of each component

images

as estimate of the corresponding power. Since the samples are random

images

it follows that2

images

thus showing that the power from every filter is the estimate of the PSD with variance S2(ωk). The filter‐bank is not just a model, but it coincides with the periodogram where the filter‐bank implements the DFT as in Figure 14.2 and the filters are chosen as

images

This gives an interesting interpretation to the bias of the periodogram as being due to the poor selectivity of the band‐pass filtering, which can be significantly improved by choosing a different filter response as discussed below in Section 14.1.5.

Filter-bank model of periodogram with a dashed portion depicting the DFT (top), with a descending solid curve labeled Sx(ω) on top of 4 overlapping curves for ωk−2, ωk−1, ωk, and ωk+1 (bottom).

Figure 14.2 Filter‐bank model of periodogram.

14.1.4 Pdf of the Periodogram (White Gaussian Process)

As the periodogram is an rv that in every cell estimates the corresponding PSD, in order to gain insight, it is useful to evaluate the corresponding pdf for the same context of Section 14.1.2. Let the data be

images

The DFT that is a linear transformation of x (see Remark 3 of Appendix B)

images

so that highlighting the bins X[0 : N/2] as statistically independent, we have:

images
images

or equivalently

images

Each bin of the periodogram is the square of the DFT so that

images

which is the sum of the square of two independent and Gaussian rvs, and the pdf is exponential (central Chi‐Square images with 2 degrees of freedom):

images

with rate parameter images. From the properties of the exponential pdf:

images

where we included the results for images. Using the definition of the periodogram (14.2), all results derived so far should be scaled by 1/N and this coincides with Section 14.1.2.

For the case

images

(there is no correlation between X[k] and images as for real valued processes) and thus

images

that is, there is no difference for the variance among samples of the periodogram.

14.1.5 Bias and Resolution

The periodogram is biased, and the total bias depends on the length N. The main consequences of bias is loss of resolution,3 smoothing of the PSD, and spectral leakage due to the side‐lobes of the biasing periodic sinc(.) function in (14.3). To gain insight into this, consider the periodogram to estimate the PSD in the case of a selective PSD as in Figure 14.3. In bias (14.3) the periodic sinc

images

that governs the convolution with the PSD Sx(ω) can be divided into two terms over disjoint intervals (i.e., images): the main lobe BP(ω) over the interval images and the side‐lobes BL(ω):

images

as sketched in Figure 14.3. The main lobe has the effect of smoothing the PSD in the case Sx(ω) as a rapid transition with a loss of selectivity and resolution as illustrated when evaluating images (solid line) compared to the PSD Sx(ω) (dashed line). The side‐lobes have the effect of spreading the PSD over the frequency axis where the PSD Sx(ω) is very small (spectral leakage), thus creating artifacts such as images that are not in the PSD. The overall effect is a “distorted” PSD (solid line) that in some cases can be unacceptable.

A detailed inspection of the reasons for bias proves that it is a consequence of the abrupt (rectangular‐like) truncation of the data that can be mitigated by a smooth windowing. Re‐interpreting the periodogram

images

as the DFT of the samples

images

obtained after the rectangular windowing w[n] (that is, images for images) gives the interpretation of the bias as the consequence of the truncation. Any arbitrary window w[n] on the data x[n] before DFT affects the autocorrelation, as if the windowing is images, and the bias is

images

which is to be evaluated for images. The window w[n] can be smooth to reduce the side‐lobes of the corresponding Fourier transform, and thus it mitigates spectral leakage. However, this is at the expense of a loss of resolution, as windows with small side‐lobes have a broader main lobe. The Figure 14.4 shows an example of the most common windows and their Fourier transforms; the choice of the data‐windowing is a trade‐off between resolution and spectral leakage. Needless to say that windowing changes the total power of the data as the size of the effective data is smaller (edge samples are down weighted), and compensation is necessary to recover the values of the PSD. One side effect of the windowing is the increased variance due to the reduction of the effective data length.

Bias and spectral leakage. Top: A solid curve, along with Sx(ω) dashed curve, intersects at the peak of a bell curve. Bottom: A solid curve, along with a dashed curve, intersects at a point under the peak of a bell curve.

Figure 14.3 Bias and spectral leakage.

Segmentation of data into M disjoint blocks of N samples each, depicted by 3 curves with circle markers for X1, X2, and XM along a rightward arrow labeled n, with a horizontal two-headed arrow labeled N x M at the top.

Figure 14.4 Rectangular, Bartlett (or triangular) and Hanning windows w[m] with M = 31, and their Fourier transforms images on frequency axis ω/2π.

14.1.6 Variance Reduction and WOSA

The variance of the periodogram is intolerably high and it prevents its reliable use as a PSD estimator. A simple method for variance reduction is by averaging independent simple periodograms to reduce the variance accordingly (see the examples in Section 6.7). The WOSA method (window overlap spectral analysis) is based on averaging of the periodograms after the segmentation of the data into smaller segments, with some overlap if windowing is adopted (to control the bias as discussed above).

The overall observations can be segmented into M blocks of N samples each as in Figure 14.5. The periodogram is the average of M simple periodograms over N samples:

images

with a requirement for images observations in place of N. The variance is

images

and the choice of M can control the accuracy that is needed for the specific application. In practice, the overall observations Ntot are given and the choice of M and N is a trade‐off between the need to have small variance (with M large) and small bias (with N large). Since resolution and spectral leakage can be also controlled by data‐windowing, the overall data can be windowed before computing the simple periodogram. As mentioned in the previous section, the windowing reduces the effective data used, and some overlapping can be tolerated to guarantee that the averaging is for independent periodograms. In WOSA (see Figure 14.6) the overlapping factor

images

highlights the degree of overlapping in the windowed segmentation, and for a total set of samples Ntot the WOSA periodogram becomes

images

where the number of overlapping windows is

images

Once again, the choice of η should guarantee the averaging of independent periodograms. If using a Bartlett window, a reasonable choice is images; smoothed windows tolerate a larger overlapping η at the price of some spectral leakage.

WOSA spectral analysis, depicted by 3 curves with circles markers along a rightward arrow labeled n with a row of overlapping bell-shaped curves on the top with brackets for Sˆx,1 (ωk), Sˆx,2 (ωk), Sˆx,3 (ωk), etc.

Figure 14.5 Segmentation of data into M disjoint blocks of N samples each.

Graph of PSD vs. frequency depicting the periodogram (WOSA method) for rectangular, Bartlett, and Hamming window, displaying 3 groups of curves with a dashed line and two filled dots.

Figure 14.6 WOSA spectral analysis.

Inspired by the variance mitigation of WOSA with data‐windowing, the variance can be reduced by smoothing a simple periodogram over N samples with a proper smoothing W(ωk) over the whole range of the N samples; this averaging of independent samples is by circular convolution (periodogram smoothing method)

images

The smoothing function W(ωk) should be positive definite and can be chosen to be rectangular, or triangular (Bartlett shape), or any other shape. Even if the smoothing reduces the variance on the order of the smoothing interval, it introduces a bias that can be seen by the loss of resolution due to the smoothing itself. More specifically, the bias is

images

where the last approximation is a consequence of the choice of a smoothing window that is large compared to the periodogram spacing 1/N. The choice images guarantees that the smoothing does not add any bias, not even asymptotically. The variance is

images

which depends on the variance of the other terms as consequence of the smoothing. Since the influence of the smoothing is to average in the neighborhood of the frequency ωk, it is expected that the PSD is approximately constant within the smoothing window so that the ratio

images

depends approximately only on the window choice and it can be evaluated for some typical window choices that can be applied either in time (Table 14.1) or as periodogram smoothing in frequency (Table 14.2). Recall that a raised cosine with images is a general window, and it can be specialized to the Hann (or Hanning) window for images, and the Hamming window for images.

Table 14.1 Time windows and variance reduction.

Time window: w[m] (support images) images
rectangular: images images
Bartlett: images images
raised cosine: images images
images

Table 14.2 Frequency smoothing and variance reduction.

Frequency smoothing: W(ωk) images
(support imageswith images)
rectangular: images images
Bartlett: images images
raised cosine: images images

14.1.7 Numerical Example: Bandlimited Process and (Small) Sinusoid

The periodogram is used as the first step in spectral analysis when it would be uncertain to use any parametric method, but bias and variance should be used judiciously. Let us consider as observations

images

modeled as the sum of a bandlimited process z[n] with bandwidth B and with a small sinusoid superimposed with frequency images close to the edge of the bandlimited process (images). The periodogram is used here to estimate the PSD and highlight the existence, or not, of the sinusoid (intentionally or accidentally) “hidden” on the side of the PSD Sz(f). A numerical analysis can ease the discussion based on the Matlab code below.

The bandlimited process has been generated by filtering a white Gaussian process with a Chebyshev filter that is chosen to have a high order (here 15th order) to guarantee good selectivity with sharp transition between pass‐band and stop‐band; cutoff frequency is chosen as images. The sinusoid has frequency images and amplitude images that is much smaller than the in‐band PSD of Sz(f).

The WOSA method has been applied over images segments of images samples/segment without overlapping by using unwindowed observations (plain periodogram), or by windowing using either Hanning or Bartlett windows. Figure 14.7 shows the superposition of 10 independent simulations of averaged periodograms for different windowing superimposed to the “true” PSD Sz(f) (dashed line). Without windowing, the PSD is biased with a smooth transition toward the edge of the band (f = 1/2) that completely hides the PSD of the sinusoid. A similar conclusion can be drawn for the use of Hanning windowing on the data. On the other hand, the periodogram using Bartlett windows on the observations still has the effect of reducing the selectivity, but this bias does not hide the PSD of the sinusoid and the value of the periodogram at the frequency of the sinusoid restores the true value of the PSD of the sinusoid (filled dot in Figure 14.7).

Graph of PSD vs. frequency depicting periodogram for varying N = 128, 256, 512, and 1024, and Bartlett window, displaying 4 groups of curves with a dashed line and 6 filled dots.

Figure 14.7 Periodogram (WOSA method) for rectangular, Bartlett, and Hamming window.

The effect of the number of samples is equally important to highlight the sinusoid in the PSD as the periodogram for the sinusoid increases with N (see Section 14.1.5). This is shown in Figure 14.8 where the Bartlett window has been used on observations with increasing length thus showing that N = 128 or 256 samples are not enough to avoid the bias of the periodogram hiding the line spectrum of the sinusoid, while any choice of N = 512 or larger can allow the lines of the sinusoid to exceed the biased PSD of Sz(f).

14.2 Parametric Spectral Analysis

In parametric spectral analysis, the PSD is modeled in terms of a set of parameters θ so that images accounts for the PSD and the estimate of θ yields the estimate of the PSD:

images

Accuracy depends on the compliance of the parametric model to the PSD of the specific process at hand. To exemplify, if the PSD contains resonances, the parametric model should account for these resonances in a number that is the same as the PSD, otherwise there could be errors due to under‐ or over‐parameterization. Parametric models are based on AR, MA, or ARMA processes (Section 4.4) where the N samples are obtained by properly filtering an uncorrelated Gaussian process w[n] with power (unknown) images. The most general case is the parametric spectral analysis for an ARMA model (ARMA spectral analysis), where the corresponding z‐transform is

Model for AR spectral analysis, depicted by a rectangle labeled 1/A(ʐ | a) with a rightward arrow at the left labeled w[n] atop and σ2AR(N ₐ) below, and another rightward arrow at the right labeled x[n].

Figure 14.8 Periodogram (WOSA method) for varying images and Bartlett window.

images

The PSD

images

is parametric as it depends on the set of images complex‐valued coefficients a and b, and on the power images acting as a scaling term. The overall parameters

images

completely define the PSD, and the model can be reduced to AR (by choosing images) and MA (by choosing images) spectral analysis.

14.2.1 MLE and CRB

The MLE is based on the parametric model of the distribution of x that is based on the Gaussian model:

images

First of all, it is necessary to evaluate the relationship between the structure of the covariance Cxx and the PSD. Starting from the log‐pdf of the data (see also Section 7.2.4)

images

each of these terms can be made dependent on the PSD by exploiting the relationships between eigenvalues/eigenvectors of the covariance matrix

images

and its PSD (Appendix B):

images

as in spectral analysis a common assumption is to have a large value of N. The k th eigenvector can be approximated by a complex sinusoid with k cycles over N samples:

images

By substituting into the pdf we get:

images

and since images, the term

images

is again the periodogram. The pdf can be rewritten now in terms of the PSD values

images

Since 1/N is the normalized bandwidth of the frequency bin in the periodogram (recall that images), the log‐likelihood can be rewritten as

images

The latter approximation holds for large enough N —in principle for images, and it is the pdf of data for spectral analysis.

14.2.2 General Model for AR, MA, ARMA Spectral Analysis

For parametric spectral analysis, the PSD images depends on a set of non random parameters θ, so the log‐likelihood is

and the ML estimate is

images

that is specialized below for AR, MA, and ARMA models.

The asymptotic equation for the conditional pdf (14.7) is the basis for the CRB as the FIM entry is

images

from the derivatives

images

Since the periodogram is asymptotically unbiased

images

the first term of the FIM is zero (for images) and the FIM reduces to the celebrated Whittle formula for the CRB [56] :

that is generally applicable for the CRB in a broad set of spectral analysis methods.

14.3 AR Spectral Analysis

AR spectral analysis models the PSD Sx(ω) by a generation model with poles as in Figure 14.9, so that the PSD is assumed to be

images

for a certain order Na and excitation power images. The positions of the poles in the z‐plane define the frequency and strength of the resonances, and AR is the appropriate description when the PSD shows resonances only, either because the generation mechanism is known or just assumed.

14.3.1 MLE and CRB

In Na th order AR spectral analysis, in short AR(Na), the unknown terms are the recursive term of the filter images and the excitation power images that depends on the specific AR order Na highlighted in the subscript:

images

The log‐likelihood (14.7) is (apart from a scaling)

images
|z|vs. A2/σ2 graph depicting the radial position of pole, displaying 2 sets of 3 discrete ascending curves for N = 10 and N = 50 with curves for AR(1) (solid), AR(2) (dashed), and analytic model (dashed shaded). Var1/2(z) vs. A2/σ2 graph depicting the radial position of dispersion, having 2 sets of 3 discrete descending curves for N = 10 and N = 50 with curves for AR(1) (solid), AR(2) (dashed), and analytic model (dashed shaded).

Figure 14.9 Model for AR spectral analysis.

Since for monic polynomials, images (Appendix C), the log‐likelihood simplifies to

images

The MLE follows by nulling the gradients in order, starting from the gradient wrt images:

images

The ML estimate of the excitation power depends on the set of parameters a. Plugging into the log‐likelihood

images

its maximization is equivalent to minimizing images wrt a. From the gradients it is (to simplify images):

images

Recalling that

images

the following condition holds:

images

Since Ŝx(f, x) is real‐valued, images, it is sufficient to solve for

images

Substituting:

images

but recalling that the periodogram is the Fourier transform of the sample (biased) estimate of the autocorrelation, the ML estimate follows from a set of Na equations

images

These are the Yule–Walker equations in Section 12.3.1, here rewritten wrt the sample autocorrelation that recalls the LS approach to linear prediction (Section 12.4). Furthermore

images

which is the power of the excitation from the sample autocorrelation as the power of the prediction error from the Yule–Walker equations (12.4) in Section 12.3.1.

Remark: The coincidence of AR spectral analysis with the Yule–Walker equations of linear prediction is not surprising. AR spectral analysis can be interpreted as a search for the linear predictor that fully uncorrelates the process under analysis. The sample autcorrelation replaces the ensemble autocorrelation as this estimates the correlation properties from a limited set of observations with positive definite Fourier transform (while still being biased).

The CRB can be evaluated from the Whittle formula (14.8) adapted for the AR model:

images

Computing first the terms

images
images

and then the elements of the FIM from the Whittle formula, recalling that images is causal and minimum phase4

images

The FIM is block‐partitioned

images

so the estimation of the excitation images is decoupled from the coefficients a of the AR model polynomial. The CRB follows from the inversion of the FIM with covariance

images

The CRB of the coefficients a depends on the inverse of the covariance matrix Cxx. Note that all coefficients do not have the same variance, but they are unbalanced and they depend on the specific problem as discussed in the example in the next section.

14.3.2 A Good Reason to Avoid Over‐Parametrization in AR

AR spectral analysis can be carried out regardless of the true model, or even when the PSD does not show resonances due to the presence of poles. Since the generation mechanism is not always known, the order Na of the AR spectral analysis is often another unknown of the problem at hand. Even if in principle the estimates â could account for the order as those terms larger that the true order should be zero, the augmentation of the model order in AR spectral analysis has to be avoided as the parameters a are not at all decoupled, and the covariance cov{â} could be unacceptably large.

A simple example can help to gain insight into the concept of model over‐parameterization (set an AR order in spectral analysis larger than the true AR order). Let a set of observations be generated as an AR(1) with parameters images for images (the complex‐valued case will be considered later), the autocorrelation is

images

and the Yule–Walker equations to estimate the parameters a leads to the estimates â such that images and images for images (the bias images is due to the sample autocorrelation that vanishes for N large). The CRB depends on the correlation matrix Cxx that for an arbitrary order Na is structured as

images

and its inverse is known (Section 4.5). The CRB for different order selection is:

images

thus showing that diagonal terms are not the same and that (for this specific problem) the CRB matrix is diagonal banded. Furthermore:

images

that is, the over‐parameterization increases the variance of the AR spectral analysis and the additional poles due to âk because the choice images perturbs the PSD estimates that in turn depend on the dispersion of the poles.

The conclusion on over‐parameterization is general: using more parameters than necessary reduces the accuracy of the PSD estimate so that sometimes it is better to under‐parameterize AR spectral analysis. Alternatively, one can start cautiously with low order and increase the order in the hunt for a better trade‐off, being aware that every augmentation step increases the variance of the PSD estimate.

14.3.3 Cramér–Rao Bound of Poles in AR Spectral Analysis

An AR model can be rewritten wrt the roots of the polynomial A(z) as

images

where μ collects all the roots images as detailed below. The z‐transform of the PSD becomes

images

and the Whittle relationship for the CRB of an AR spectral analysis can be rewritten in terms of the roots of the z‐transform. The importance of this model stems from the use of AR spectral analysis to estimate the angular position of the poles, and these in turn are related to the frequency of the resonances in PSD.

To simplify the reasoning, let the process x be real‐valued so that poles are complex conjugates μk and images; in addition poles are either real or complex but are assumed to have multiplicity one. Let the poles be decoupled into those that are real (ℛ) and those that are complex conjugates (images) so that the z‐transform of the PSD is

images

The Whittle formula needs the derivatives evaluated separately for the two sets

images

Integrating over the set for the elements of the FIM:5

images
images

and using the identity images (Appendix C). The entries of the FIM partitioned either into real (ℛ) or complex conjugate (images) poles have the same value for the accounted pole. To summarize, it is convenient to order the Na poles into real (μ) and complex (images) with their conjugate (images)

images

such that images and images; so the FIM is

images

Example: Pole Scattering in AR(2)

Consider an AR(2) with poles images. From the relationship above, the FIM is

images

To derive the covariance for real and imaginary component of the poles it is useful to recall the following identity:

images

so that the CRB for the transformed variables (Section 8.3) is

images

After some algebra it follows that

images

The variance of μR is independent of the frequency of the poles, and the variance for the imaginary term is very large when images (both poles are coincident) and vanishes for images. The error ellipsoid is not referred to the orthogonal axes, as images; the following numerical example for AR(1) can clarify.

14.3.4 Example: Frequency Estimation by AR Spectral Analysis

Let a sequence be modeled as a complex sinusoid in white Gaussian noise with power images

images

The angular frequency ω and the amplitude images are not known and should be estimated as poles in AR(1) spectral analysis. The phase of the sinusoid is denoted as images and imagesimages. AR(1) spectral analysis models observations as

images

such that images and images as the pole should approximate at best the true PSD: images. AR(1) follows from the Yule–Walker equation (12.2)

images

For the sample estimate of the autocorrelation:

images

where images just to ease the notation for the computations below. The objective is to evaluate the performance (bias and variance) of AR(1) spectral analysis to estimate the frequency of the sinusoid as an alternative to the conventional MLE in Section 9.3. The frequency estimate accuracy depends on the small deviations of the radial and angular position of the pole p due to noise.

Since the pole depends on the sample autocorrelations, images is decomposed into (Appendix A)

images

and

images

For large N and large SNR defined as

images

the pole can be approximated as

images

The bias is

This proves that the estimate of the frequency is unbiased, but the radial position of the pole images depends on γ —however for images and images the pole attains the unitary circle. The variance is evaluated from the deviations δrk:

images

Since images it is convenient to group the terms as follows

images

From the properties in Appendix D, the variance is

where the last approximation is for images. For images, Figure 14.10 shows the comparison between the simulated (black solid line) and the analytical (gray dashed line) bias (14.9) (upper figure) and standard deviation images from (14.10) that perfectly superimpose for AR(1) and therefore validate the approximations in the sensitivity analysis for images and 50 samples. Figure 14.10 also shows the dispersion for AR(2) (dash dot line) when extracting only one pole corresponding to the frequency close to images, that is less dispersed and closer to the unit circle for images.

4 Scatterplots of poles of Montecarlo simulations for AR(1) (top left), AR(2) (top right), AR(3) (bottom left), and AR(4) (bottom right)spectral analyses.

Figure 14.10 Radial position of pole (upper figure) and dispersion (lower figure) for AR(1) (solid line) and AR(2) (dash dot), compared to analytic model (14.1014.11) (dashed gray line)

In practice, the number of sinusoids is not known and it is quite common practice to “search” for the order by trial‐and‐error on the poles. Figure 14.11 shows the poles from a set of independent AR spectral analyses with order images (superposition of 50 independent trails with images and images) and images. Notice that for AR(1), the poles in every trial superimpose around the true angular position with a radial location set by the bias (14.9). When increasing the order of the AR spectral analysis, the poles cluster around the angular position in images thus providing an estimate of the frequency, while the other poles disperse within the unit circle far from the sinusoid thus allowing clear identification of the frequency of the sinusoid. However, each additional pole can be a candidate to model (and estimate) a second, or third, or forth sinusoid in observation thus providing a tool to better identify the number of sinusoids in the data (if unknown). Needless to say, ranking the poles with decreasing radial distance from the unit circle offers a powerful tool to identify the number of sinusoids, if necessary, when the number is not clear from the practical problem at hand.

Model for MA spectral analysis, depicted by a rectangle labeled B(ʐ|b) with a rightward arrow at the left labeled w[n] atop and σ2MA(N b) below, and another rightward arrow at the right labeled x[n].

Figure 14.11 Scatter‐plot of poles of Montecarlo simulations for AR(1), AR(2), AR(3), AR(4) spectral analysis.

14.4 MA Spectral Analysis

The PSD of the Nb th order moving average (MA) process is the one generated by filtering a white Gaussian noise with power images with a limited support filter (or averaging filter):

images

as illustrated in Figure 14.11. The MA spectral analysis estimates the set of generation parameters images from the limited observations so that the PSD of the MA(Nb) model is

images
Model for ARMA spectral analysis, depicted by a rectangle labeled B(ʐ|b)/ A(ʐ|a) with a rightward arrow at the left labeled w[n] atop and σ2ARMA below and another a rightward arrow at the right labeled x[n].

Figure 14.12 Model for MA spectral analysis.

From the sample estimate of the autocorrelation images over the support of images samples, it is sufficient to use the identity over the z‐transform

images

and solve wrt b and images, the latter acts as a scale factor of the estimated PSD. The overall problem cast in this way is strongly non‐linear and does not have a unique solution for images. From the sample estimate of the autocorrelation images one can extract the 2Nb roots, which are in all reciprocal positions according to the Paley–Wiener theorem (Section 4.4). These roots can be paired in any arbitrary way and collected into a set images provided that each selection does not contain the reciprocal set. Even if any collection of zeros in the filter images is an MA estimate, the preferred one is the minimum phase arrangement images so that

images

This is essential for the MA analysis of the sample autocorrelation, and some remarks are necessary to highlight some practical implications.

Remark 1: To reduce the variance of the sample estimate of the autocorrelation, it is customary to average the sample estimate after segmenting the set of observations into M subsets of disjoint time series of images samples each. Assuming images to be the mth segment, the sample autocorrelation is

images

and the variance for disjoint sets decreases with 1/M. Of course, the order of the MA analysis should be traded with the variance of the sample autocorrelation.

Remark 2: Given the sample autocorrelation over an arbitrary interval, say images, MA(Nb) cannot be carried out by simply using the first Nb samples (that would imply a truncation of the sample autocorrelation) but must rather be by windowing the autocorrelation by a window images that has a positive definite FT (images for images but images for any ω). To justify this, let

images

be the sample autocorrelation over N samples; the FT images has the following property (Appendix A):

images

To carry out MA(Nb) spectral analysis, the sample autocorrelation images is windowed and the MA spectral analysis becomes

images

for some choice of images. The MA PSD is biased, as

images

and it is biased for images as images. The choice of the window should guarantee that images to make the PSD positive definite. Truncation images for images would not satisfy this constraint, while images for images (triangular window) fulfills the constraints and is the most common choice.

14.5 ARMA Spectral Analysis

ARMA(Na, Nb) spectral analysis is the composition of the AR(Na) and MA(Nb) systems. Namely, the ARMA process is obtained from the causal linear system excited by white Gaussian process w[n] as described by the difference equation (Figure 14.13)

The constitutive relationship of ARMA can be derived from the definition of the autocorrelation

images

using the second form of the finite difference equation (14.11) by multiplying for images and computing the expectation:

images
image

Figure 14.13 Model for ARMA spectral analysis.

Since the system is causal, the impulse response is h[n] and the process x[n] is the filtered excitation:

images

Substituting into the second term of the relationship above:

images

and thus

images

In this form it is still dependent on the impulse response, but since images for images (causal system), the second term is zero for images and also for images. It follows the modified Yule–Walker equations:

which is a generalization of the Yule–Walker equations for the ARMA processes to derive the AR(Na) component â of the ARMA(Na, Nb). Of course, for ARMA analysis from a limited set of samples x[n], the autocorrelation is replaced by the sample (and biased) estimate of the autocorrelation (Appendix A).

After solving for the modified Yule–Walker equations (14.12) to get â, one solves for MA(Nb). This is obtained by noticing that when filtering the observations with images (incidentally, this is the prediction error using a linear predictor based on the modified Yule–Walker equations) obtained so far, the residual depends on the MA component only as

images

The autocorrelation of the MA component images:

images

rewritten in the form of sample autocorrelation (recall that images)

images

Therefore, the images samples of the autocorrelation images represent the MA component of the ARMA process and can be estimated following the same rules as MA(Nb) applied now on images. By using the identity

images

it can be solved for b and the excitation power images that is now the scaling factor for the whole ARMA process.

Remark 1: Note that there is no guarantee that the sample autocorrelation images holds over the range images, so the PSD of the MA component may not be positive semidefinite. In this case it is a good rule when handling sample estimates (that could have some non‐zero samples outside of the range of the MA process) to window images with a triangular window over the range of the order of the MA component, as this guarantees that the PSD is positive definite.

Remark 2: The estimate of the AR portion by the modified Yule–Walker equation (14.12) with the sample autocorrelation could be slow, as the system of equations uses samples with lower confidence (as for images). Some alternative methods could be explored to make a better (and more efficient) use of the available samples.

14.5.1 Cramér–Rao Bound for ARMA Spectral Analysis

Let the PSD for the ARMA be

images

where images and images are both minimum phase factorizations (a and b are real‐valued vectors). The CRB is obtained from the asymptotic Whittle formula starting from:

images

The second order derivatives for the FIM elements are (recall the causality condition for the factorizations images and images):

images

where raa[k], rbb[k], rab[k] are the auto/cross‐correlation sequences for two equivalent AR processes and a mixed one, all excited by a unit‐power white process:

images

Autocorrelation raa[k] and rbb[k] of the AR processes can be numerically evaluated from the Yule–Walker equations for the polynomials images and images, respectively. The numerical computation of the cross‐correlation rab[k] deserves a comment as the following factorization of

images

highlights the way to compute it. Namely, first the Yule–Walker equations are applied to compute the autocorrelation for the equivalent AR process images represented by the first term using the minimum phase factorization images, and then the cross‐correlation terms of interests are obtained by filtering the autocorrelation computed thus far with a finite impulse non‐causal (as images is non‐causal) filter images.

The FIM is block‐partitioned:

images

with blocks

images

The structure of the FIM confirms that for ARMA spectral analysis the AR and MA parameters are coupled, and this offers room to find different estimation procedures than in Sections 14.314.4, but the power of the excitation acting as a scale parameter is always decoupled. This decoupling is not surprising as the structure of the PSD in terms of resonances and nulls depends on the poles and zeros of the AR and MA components, but not on the scale factor. The CRB for ARMA spectral analysis can be considered the more general one as the CRB for AR and MA can be derived from the CRB of ARMA by removing the corresponding blocks.

Appendix A: Which Sample Estimate of the Autocorrelation to Use?

Given N samples of a random process images, the sample estimate of the autocorrelation is:

images

However, this sample estimate is biased as for every value of the shift k the product terms that are summed up are images; analytically it is

images

Alternatively, one can avoid the bias by scaling the sum for the number of terms, and it follows the definition of the sample autocorrelation

images

which is unbiased as

images

Herein we prove that even if the estimate images is apparently the preferred choice being unbiased, the biased estimate of the sample autocorrelation images is the choice to be adopted in all spectral analysis methods.

The Fourier transform of the two sample estimates are

images

The expectations are related to the PSD Sx(ω) as follows

where the limited support autocorrelation has been replaced by rectangular and triangular (or Bartlett) windowing

images

Inspection of the relationships (14.13) proves that the FT of the unbiased sample autocorrelation images is not positive definite as W(R)(ω) could be negative valued, for some spectral behaviors. On the other hand, the Fourier transform of the biased sample autocorrelation images is positive definite (in the mean) as the sample estimate is implicitly windowed by a Bartlett window w(B)[k] that has positive definite FT. Furthermore, it can be seen that for images, the Bartlett windowing becomes a constant and images, therefore

images

is the unbiased estimate of the PSD. Incidentally, the biased sample autocorrelation is the implicit estimator used by the periodogram in PSD estimation.

Appendix B: Eigenvectors and Eigenvalues of Correlation Matrix

The objective here is to show the equivalence between the eigenvectors of a correlation matrix with Toeplitz structure and the basis of the discrete Fourier transform (DFT). A few preliminary remarks are in order. Given a periodic sequence {e[n]} with period of N samples, the circular (or periodic) convolution over the period N with a filter {h[n]} is equivalent to the linear convolution images made periodic with period N

images

The DFT of the circular convolution is

images

where

images

is the DFT of the sequence images (recall that images is the Nth root of unitary value). In compact notation, the DFT is

images

where WN is the matrix of forward DFT (it has a Vandermonde structure). Since the inverse DFT is

images

it follows that

images

Here, the columns of WN are a complete basis ℂN and each column represents a complex exponential exp(j2πnk/N) of the DFT transformation. The circular convolution after DFT is

images

where

images

The properties of an arbitrary covariance matrix Cxx are derived by assuming that random process x is the output of a linear system with Gaussian white input

images

so that the following property holds for its DFT:

images

Since images, the covariance matrix is

images

but still from the definitions of covariance matrix and DFT:

images

leading to the following equality:

images

Each column of the DFT basis is one eigenvector of the covariance matrix Cxx obtained after circular convolution, and the entry images is the corresponding eigenvalue. Resuming the definition of eigenvectors/eigenvalues with an orthonormal basis:

images

where the k th eigenvector and eigenvalue are

images

notice that the k th eigenvalue coincides with the power spectral density

images

Based on these numerical equalities, some remarks are now necessary to better focus on the importance of the results derived so far.

Remark 1: For a stationary random process, the eigenvector decomposition of the covariance matrix Cyy is

images

In general, the kth eigenvector images, and similarly images as the process x[n] is not periodic, or it was not generated as circular convolution over a period N. However, for a very large period N, the periodic and linear convolution coincides, except for edge effects that become negligible for large N. The same conclusion holds for the covariance if images (in practice, it is enough that N is larger than the coherence interval of the process x[n], or equivalently the covariance matrix Cxx is a diagonal banded matrix with a band small enough compared to the size of the covariance matrix itself):

images

Remark 2: In software programs for numerical analysis, the eigenvectors of the covariance matrix are ordered for increasing (or decreasing) eigenvalues while power spectral density images is ordered wrt increasing frequency as the eigenvectors are predefined as the basis of the Fourier transform.

Remark 3: In the case of a complex‐valued process, it is convenient to partition the DFT into real and imaginary components:

images

where each of the vectors are (Section 5.4)

images

and the following properties of orthogonality hold:

images

In summary

images

and thus

images

as expected.

Appendix C: Property of Monic Polynomial

This Appendix extends some properties of the z‐transform in Appendix B of Chapter 4. Let H(z) be the monic polynomial of a minimum phase filter images excited by a white Gaussian noise with unitary power (images); the z‐transform of the autocorrelation sequence is

images

Since along the unitary circle the Fourier transforms images and images are positive semidefinite, after the log as monotonic transformation:

images

The careful inspection of

images

proves that since H(z) is minimum phase, it is analytic for images (poles/zeros are for images), and from the properties of the log it follows that poles/zeros are transformed as follows: zeros of images poles of ln H(z); poles of images poles of ln H(z). Since H(z) is minimum phase, the polynomial images is minimum phase also. From the initial value theorem images (monic polynomial) and thus images, hence6

images

and thus

images

Appendix D: Variance of Pole in AR(1)

The biased estimate of the autocorrelation is (sample index is by subscript):

images

and the mean value is

images

To compute the variance it is necessary to evaluate the expectation for all mixed terms images. After some algebra:

images

and the expectations are

images

Under the assumption that images and images, one can specify the following moments:

images
images

Recalling that the random term in the sample estimate is redefined as deviation wrt the mean value images, it follows that:

images

these are useful for the derivation in the main text.

Notes

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.14.80.45