Chapter 11
Applications of EBMs: Optimal Estimation

11.1 Introduction

An estimator is an algorithm making use of an observation or combination of observations in order to provide a useful estimate of some system parameter such as the global average temperature. In order to conduct such a procedure, we have to make a statistical model of our process. Once we have a statistical model in place, we can examine it and the estimation procedure to learn several things about the underlying system parameter. Usually, the estimator is a random variable that has some probability density function (pdf) due to errors in the measurement process or perhaps sampling error. The following are some questions of interest: (i) Does the mean of the pdf of the estimator coincide with the actual value of the system parameter? If it does, we say the estimator is unbiased. (ii) What is the variance of the estimator? For an unbiased estimator, the square root of this variance is the standard deviation, or root mean square (RMS) error.

EBMs can be of service in some estimation problems. The usefulness of an EBM for a particular application comes in two forms:

  1. Many estimation problems can be evaluated with the assistance of general circulation model (GCM) simulations. In this chapter, we use the EBM to show how a few of these work in the simpler context. In many cases with the EBM, we are able to solve the problem analytically or nearly so. This puts aside the issue of whether the more realistic model actually has a solution or whether it is over-fitted, and so on. In this application, we can focus on the estimation process from its beginning to its end without letting the details or mathematical transgressions cloud the picture. The main point is the understanding of the estimation process in a simple and more heuristic context. It might be that the lessons learned can be applied with much more complicated models.
  2. There are some estimation problems where it might not be feasible to solve the problem in the more complicated models because of lack of computing or data resources. For example, how many fully coupled GCMs have a 10 000 year control run with which to assess the low-frequency statistical parameters of natural variability? How can our intuition about such problems be enhanced? To paraphrase a statement by John Maynard Keynes, “it might be better to have a rough idea of the truth than a very precise [or satisfying] answer which is wrong.”

In a typical case, there may be many unbiased estimators for a given problem, often combining lots of measurements, such as areal or temporal averages. Think of the estimate of the global average temperature. We might take gauge (point) measurements from a number of different stations or perhaps other observing systems (e.g., satellites). We suspect that if we simply average the data, then these averages might form an unbiased estimate of the global average temperature. We might ask whether a straight arithmetic average is actually the best unbiased estimator in terms of its RMS error. Perhaps some nonuniform weighting would improve the estimate. This general class of problems is the subject of this chapter. We will apply the method with the help of the EBM to two different examples: Section 11.3 on estimating the global average temperature; Section 11.4 on detecting faint deterministic signals (such as the greenhouse warming or episodic cooling by volcanic dust veils) in the climate system. We start with a simple problem involving two imperfect observations of a heat reservoir.

11.2 Independent Estimators

Consider estimating the temperature of a reservoir with two devices. Let the estimators c011-math-001 and c011-math-002 be unbiased; that is, c011-math-003 where c011-math-004 is the true temperature, and c011-math-005 means ensemble average.

The individual estimators are assumed to be of the form c011-math-006; where the errors c011-math-007 are assumed to be random variables taking on different values in each realization of the measurement process. The errors or noise are assumed to have mean zero when considered over a large number of trials: c011-math-008, and the covariances of the errors are given by c011-math-009. The previous expression states that the errors of the separate devices are assumed to be uncorrelated and that their individual variances are given by c011-math-010 and c011-math-011. We assume that these characteristics of the errors are known beforehand. Our task is to take one realization of the measurement process and obtain an optimal estimate of the true reservoir temperature. We wish to make maximal use of the data collected from each device in an appropriate linear combination. The question is, what is the appropriate weighting to assign to each measurement? We form the estimate

11.1 equation

where c011-math-013 is a weight to be adjusted to make the mean square error (MSE) the minimum. The estimator c011-math-014 is clearly unbiased if the individual estimators are. We can form the MSE for the measurement as

11.2 equation
11.3 equation

The latter is a quadratic in c011-math-017 and is shown in Figure 11.1 for a choice of c011-math-018. The point of this figure is that the MSE is rather insensitive to the choice of c011-math-019 so long as it is near its optimum value. This is an important point to be stressed later in the climate signal detection exercises.
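The flatness of the MSE near its minimum can be checked numerically. The sketch below assumes the quadratic takes the standard form for two uncorrelated, unbiased estimators, namely weight² × var1 + (1 − weight)² × var2. The function name `mse_two_estimators` and the variances 1.0 and 4.0 are illustrative choices, not values taken from Figure 11.1.

```python
def mse_two_estimators(weight, var1, var2):
    """MSE of weight*T1 + (1 - weight)*T2 for uncorrelated, unbiased T1 and T2.

    Assumed form: weight**2 * var1 + (1 - weight)**2 * var2.
    """
    return weight ** 2 * var1 + (1.0 - weight) ** 2 * var2


# Illustrative (hypothetical) error variances for the two devices.
VAR1, VAR2 = 1.0, 4.0

# Scan the weight on a fine grid and locate the minimum numerically.
grid = [i / 1000.0 for i in range(1001)]
best_weight = min(grid, key=lambda w: mse_two_estimators(w, VAR1, VAR2))
best_mse = mse_two_estimators(best_weight, VAR1, VAR2)

# Shift the weight by 0.1 away from the optimum and recompute the MSE.
nearby_mse = mse_two_estimators(best_weight + 0.1, VAR1, VAR2)

print(best_weight, best_mse, nearby_mse)
```

For these assumed variances, shifting the weight by 0.1 raises the MSE by only a few percent, which is the insensitivity referred to above.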


Figure 11.1 Error squared c011-math-020 versus weighting c011-math-021 for two unbiased estimators.

The minimum of the quadratic above is easily found, and it yields the familiar and very important result:

11.4 equation

where

11.5 equation
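For readers who want the intermediate step, the minimization can be sketched as follows. The symbols are assumed here for illustration: $w$ is the weight on the first estimator, $\sigma_1^2$ and $\sigma_2^2$ are the two error variances, and $\bar{\sigma}^2$ is the error variance of the combined estimate.

```latex
\begin{align}
\epsilon^2(w) &= w^2\sigma_1^2 + (1-w)^2\sigma_2^2, \\
\frac{d\epsilon^2}{dw} &= 2w\sigma_1^2 - 2(1-w)\sigma_2^2 = 0
\;\Longrightarrow\;
w_{\rm opt} = \frac{\sigma_2^2}{\sigma_1^2+\sigma_2^2}
            = \frac{1/\sigma_1^2}{1/\sigma_1^2 + 1/\sigma_2^2}, \\
\epsilon^2(w_{\rm opt}) &= \frac{\sigma_1^2\sigma_2^2}{\sigma_1^2+\sigma_2^2}
\equiv \bar{\sigma}^2,
\qquad
\frac{1}{\bar{\sigma}^2} = \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}.
\end{align}
```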

The result is easily generalized to include c011-math-024 independent unbiased estimators:

11.6 equation

and

11.7 equation

An interesting way of expressing these last results is

11.8 equation

with the column vector c011-math-028 and

11.9 equation

This last form gives us a convenient way of viewing the optimal estimation procedure in the form of an optimal filter of the raw data. The filter loads each observation with a weight inversely proportional to its individual error variance. The factor c011-math-030 assures the normalization necessary for unbiasedness.

After some algebra, it can be shown that the optimal error variance is just

11.10 equation

This shows that adding another device always improves the signal-to-noise ratio indicator no matter how poor its quality.
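A minimal sketch of the inverse-variance filter for several independent devices is given below. The readings and variances are hypothetical numbers chosen only for illustration, and the helper name `inverse_variance_combine` is ours.

```python
def inverse_variance_combine(estimates, variances):
    """Combine independent unbiased estimates with weights proportional to 1/variance.

    Returns (combined_estimate, combined_variance).
    """
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    combined_variance = 1.0 / total_precision
    combined = combined_variance * sum(p * t for p, t in zip(precisions, estimates))
    return combined, combined_variance


# Hypothetical readings and error variances for two good devices.
readings = [14.9, 15.2]
variances = [0.04, 0.09]
est2, var2 = inverse_variance_combine(readings, variances)

# Add a third, much poorer device: the combined variance still decreases.
est3, var3 = inverse_variance_combine(readings + [16.0], variances + [4.0])

print(est2, var2)
print(est3, var3)
```

The combined variance is smaller than that of the best single device, and it shrinks again, if only slightly, when the poor third device is included.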

The derivation presented above required that the individual errors be uncorrelated with one another. If this were not so, the coordinate axes could simply be rotated to the principal axes of the error or noise covariance matrix. Then the entire formalism goes through as before except in the rotated coordinate system. In climatology, this is the transformation to the empirical orthogonal function (EOF) basis set, which we will return to in later sections.

In two dimensions, this is easily spelled out explicitly. Let the covariance matrix of the noise be given by

11.11 equation

Taking the noise to be distributed bivariate normally, the contours of equal probability of occurrence of pairs of values of c011-math-033 are given by (see Thiébaux, 1994):

11.12 equation

which is an ellipse in the c011-math-035 plane. In this two-dimensional case, we can find an angle c011-math-036 to rotate the coordinate axes through, such that the principal axes of the ellipse coincide with new coordinate axes c011-math-037. In the new coordinate system, c011-math-038 and c011-math-039 are uncorrelated. In the case of c011-math-040 dimensions, the figure is an ellipsoid in the c011-math-041-dimensional space and a simple length-preserving rotation can also be used to find the appropriate coordinate system. This rotation of the coordinate axes is familiar in data analysis as the transformation from spatial coordinates to the EOF basis set.
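The two-dimensional rotation can be sketched in a few lines. We assume a symmetric noise covariance matrix with entries `c11`, `c12`, `c22` (hypothetical names and values) and choose the angle from the standard condition tan 2θ = 2 c12 / (c11 − c22), which makes the rotated components uncorrelated.

```python
import math


def principal_axis_rotation(c11, c12, c22):
    """Return (theta, var_a, var_b, cov_ab) after rotating the axes by theta.

    theta is chosen so that tan(2*theta) = 2*c12 / (c11 - c22), which
    removes the covariance between the rotated noise components.
    """
    theta = 0.5 * math.atan2(2.0 * c12, c11 - c22)
    c, s = math.cos(theta), math.sin(theta)
    # Variances and covariance of a = c*n1 + s*n2 and b = -s*n1 + c*n2.
    var_a = c * c * c11 + 2.0 * c * s * c12 + s * s * c22
    var_b = s * s * c11 - 2.0 * c * s * c12 + c * c * c22
    cov_ab = (c * c - s * s) * c12 - c * s * (c11 - c22)
    return theta, var_a, var_b, cov_ab


# Hypothetical covariance matrix [[2.0, 0.5], [0.5, 1.0]].
theta, var_a, var_b, cov_ab = principal_axis_rotation(2.0, 0.5, 1.0)
print(math.degrees(theta), var_a, var_b, cov_ab)
```

The rotation preserves the total variance (the trace), and the two rotated variances are the eigenvalues of the covariance matrix, so the independent-estimator formalism applies directly in the new coordinates.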

The simple derivation of optimal weighting of independent estimators is familiar to many researchers. The result is very intuitive. We simply weight each estimator inversely according to its individual error variance.

11.3 Estimating Global Average Temperature

We will follow the method of Shen et al. (1994). The global average c011-math-042 changes over time and it is smoothed over an interval c011-math-043 centered at c011-math-044 (it is a running average):

11.13 equation

where c011-math-046 refers to solid angle, c011-math-047 is a unit vector originating at the Earth's center and pointing to a location on the sphere, and

11.14 equation

An anomaly at point c011-math-049 is defined by

11.15 equation

and

11.16 equation

and by definition, c011-math-052.

The global average temperature at time c011-math-053 can be estimated for a given fixed network of stations c011-math-054:

11.17 equation

and to ensure unbiasedness:

11.18 equation

Our estimator is

11.19 equation

where

11.20 equation

and the MSE is given by

11.21 equation

After multiplying the factors together,

11.22 equation

and we have introduced the temporally smoothed covariance

11.23 equation

To choose the optimal weighting coefficients, we need to use the method of Lagrange multipliers (e.g., Arfken and Weber, 2005). We minimize the function

11.24 equation

where c011-math-063 is a Lagrange multiplier.

Next, we take partial derivatives

11.25 equation

and

11.26 equation

After inserting the expression for c011-math-066 and rearranging, we find

11.27 equation
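The constrained minimization can be sketched numerically for a small network. We assume here that the MSE is a quadratic form in the weights, built from a station-to-station covariance matrix `cov` and a vector `cross` of covariances between each station and the true average, and that the weights must sum to unity. Setting the derivatives of the Lagrange function to zero then gives a linear system in the weights and the multiplier. All names and numbers below are hypothetical, and the small Gaussian elimination routine is included only to keep the sketch self-contained.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def constrained_weights(cov, cross):
    """Minimize sum_ij w_i cov_ij w_j - 2 sum_j w_j cross_j subject to sum_j w_j = 1.

    Stationarity of the Lagrange function gives
        sum_j cov_ij w_j + mu = cross_i,   sum_j w_j = 1,
    which is solved here as one (n+1) x (n+1) linear system.
    Returns (weights, mu).
    """
    n = len(cross)
    a = [cov[i][:] + [1.0] for i in range(n)] + [[1.0] * n + [0.0]]
    rhs = list(cross) + [1.0]
    sol = solve_linear(a, rhs)
    return sol[:n], sol[n]


# Hypothetical network: stations 0 and 1 are strongly correlated, station 2 is isolated.
cov = [[1.0, 0.6, 0.1],
       [0.6, 1.0, 0.1],
       [0.1, 0.1, 1.0]]
cross = [0.5, 0.5, 0.5]
weights, mu = constrained_weights(cov, cross)
print(weights, mu)
```

In this example, the isolated station receives more weight than either member of the correlated pair, because the pair carries partly redundant information.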

Our result is similar to the c011-math-068 thermometer case of the last section. The problem here is that the temperatures at one location c011-math-069 and another c011-math-070 are correlated. To proceed, we need to know how to find variables that are not correlated. Thus the next section introduces EOFs.

11.3.1 Karhunen–Loève Functions and Empirical Orthogonal Functions

We begin by noting that c011-math-071 is a real symmetric function.1 Such functions play an important role in mathematical analysis. Consider, for example, the kernel of the integral in the eigenvalue problem:2

11.28 equation

This equation is in the form of a Sturm–Liouville system (introduced in Chapter 7). The functions c011-math-073 are the eigenfunctions for integer index c011-math-074 and the c011-math-075 are the (real and positive) eigenvalues. Properties of Sturm–Liouville systems can be found in most books on mathematical methods for physicists and engineers (e.g., Arfken and Weber, 2005 and later editions). Additional properties include the orthogonality relation:

11.29 equation

and the completeness relation:

11.30 equation

The first of these tells us how to expand any reasonably well-behaved function3 on the sphere into these basis functions. The functions which are the eigenfunctions of the covariance kernel (11.28) are called the Karhunen–Loève functions (K-LFs). They form a convenient basis set into which many useful decompositions might be derived.

Consider developing the function c011-math-078 into a series of these basis functions:

11.31 equation

The coefficients c011-math-080 can be calculated from (11.29):

11.32 equation

The completeness relation (11.30) assures that any well-behaved function on the sphere can be represented in the series.
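The reason this basis is so convenient is that the expansion coefficients are mutually uncorrelated. A short sketch of the argument follows, in notation assumed for illustration: $\rho(\hat{\mathbf r},\hat{\mathbf r}') = \langle T(\hat{\mathbf r})T(\hat{\mathbf r}')\rangle$ is the covariance of the anomaly field, $\psi_n$ and $\lambda_n$ are its eigenfunctions and eigenvalues, and the eigenfunctions are taken to be normalized to unity over the sphere (the normalization convention may differ by a constant factor from the one used in the numbered equations).

```latex
\begin{align}
\int \rho(\hat{\mathbf r},\hat{\mathbf r}')\,\psi_n(\hat{\mathbf r}')\,d\Omega'
  &= \lambda_n\,\psi_n(\hat{\mathbf r}),
\qquad
\int \psi_n(\hat{\mathbf r})\,\psi_m(\hat{\mathbf r})\,d\Omega = \delta_{nm}, \\
T_n &= \int T(\hat{\mathbf r})\,\psi_n(\hat{\mathbf r})\,d\Omega, \\
\langle T_n T_m\rangle
  &= \int\!\!\int \psi_n(\hat{\mathbf r})\,
     \langle T(\hat{\mathbf r})T(\hat{\mathbf r}')\rangle\,
     \psi_m(\hat{\mathbf r}')\,d\Omega\,d\Omega'
   = \lambda_m \int \psi_n(\hat{\mathbf r})\,\psi_m(\hat{\mathbf r})\,d\Omega
   = \lambda_n\,\delta_{nm}.
\end{align}
```

Each coefficient therefore behaves like one of the independent devices of Section 11.2, with the eigenvalue playing the role of its variance.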

Using these relations, we find that a function such as c011-math-082 can be represented as

11.33 equation

We can insert this last expression into (11.22) to obtain a matrix equation for the MSE, c011-math-084. Inspection of the result reveals that the problem can be cast into the form of a filter through which data in the form of correlation information can be entered. Shen et al. (1994) proceed by expanding each EOF, c011-math-085, into a spherical harmonic4 series:

11.34 equation

where c011-math-087 is a truncation level, typically set at spherical harmonic degree 11 or 15 in the experiments to be described.

Let us now ask why optimal weighting helps. The key is the understanding of how the annual averaged surface temperature data are correlated from one location to another. This was examined in an important paper by Hansen and Lebedeff (1987). Figure 11.2 shows scatter diagrams of the correlation of the annual averaged surface temperatures in different latitude belts (the figure concentrates on the polar and mid-latitude belts; tropical spatial autocorrelations are not well defined, but in general are much longer). The correlations (on the average) fall off with separation distance to a level 1/e at about 1500 km.5


Figure 11.2 Spatial autocorrelation diagrams for annually averaged temperatures at stations separated by distances c011-math-088. The solid lines are averages of the scattered points. The vertical and horizontal gray lines indicate the 1/e point and their values on the abscissa. Note that in the polar and mid-latitudes, the correlation lengths are about 1500 km.

(Hansen and Lebedeff (1987). © American Meteorological Society. Used with permission.)

The tropical temperatures do not follow this scheme. The reason is that in polar and mid-latitudes, the weather with autocorrelation times of the order of a few days drives the surface temperature field, which, for large scales, has an autocorrelation time of the order of weeks to a month over land and much longer over ocean. The conditions are right for the Langevin approximation used in Chapter 9. However, in the tropics, there is no weather noise and the dynamics smear out heat very quickly via direct circulations rather than in a kind of thermal diffusion.

We follow Shen et al. (1994) here to show that a relatively small number of gauges or sites (c011-math-089) are needed to achieve pretty good accuracy for estimating the global average temperature. Figure 11.3 shows the MSE for several gauge configurations: c011-math-090, a 63 station well-dispersed gauge network used by Angell and Korshover (1983); and finally a c011-math-091 array. The EOFs (or K–LFs) were computed as indicated above using a spherical harmonic basis truncated at degree 11 using the data itself, and using the noise-forced 2-D model of Chapter 9. Figure 11.3 shows a graph of the MSE versus the number of EOF modes retained (c011-math-092) (the number of terms retained in (11.33)). This latter determines the dimension of the matrix problem. Note in the figure that the MSE levels off at a particular value of c011-math-093 for each network configuration. Coarser networks require more modes than dense networks.


Figure 11.3 The mean squared error for estimation of the global average temperature as a function of the total number of spherical harmonic modes retained.

(Shen et al. (1994). © American Meteorological Society. Used with permission.)

To be complete, we must mention that the Shen et al. (1994) paper implicitly assumes that there is no power beyond the truncation level of the spherical harmonic expansion in the data. This is not quite true. But given the snugness of the fit in the figures, this might not be a bad approximation when the data are smoothed by time averaging (Figures 11.4 and 11.5).


Figure 11.4 Plots of the temperature estimate based on only 16 geometrically symmetrically located gauges (dashed line) together with the best estimate of the data from the UK CRU (Climate Research Unit) data set. The EOFs used in the optimal weighting were based on the UK data. (a) Uniform weighting. (b) Optimal weighting.

(Shen et al. (1994). © American Meteorological Society. Used with permission.)


Figure 11.5 Similar to Figure 11.4 except for the 63 well-dispersed Angell–Korshover network. Note that the agreement is nearly perfect for the optimal weighting case.

(Shen et al. (1994). © American Meteorological Society. Used with permission.)

11.3.2 Relationship with EBMs

So what does this have to do with EBMs? The answer lies in the functional form of the spatial autocorrelation functions in Figure 11.2. We can compare with Figure 11.6, where the annually averaged data are from Siberia. The autocorrelation length in the latter figure is about 50% larger than in Figure 11.2. Both are based on annually averaged data. The reason for the longer autocorrelation length is that the latter figure is over a land mass, while the data from Hansen and Lebedeff are mixed over land and ocean. Ocean correlation lengths for annually averaged data are shown in Figures 9.8 and 9.9. From Figure 11.7, we can see that over land, where c011-math-094, the autocorrelation length c011-math-095, whereas over ocean, where c011-math-096, and thus c011-math-097. Hence, the autocorrelation length for all-land areas is large, while that over ocean is smaller, as suggested in both data and models of Figures 9.8 and 9.9.


Figure 11.6 Scatter diagram of spatial autocorrelation data from eastern Siberia. The data were annually averaged. The solid curve is based on the well-known model form c011-math-098, where c011-math-099 is the separation of stations in kilometers, c011-math-100 is the modified Bessel function of the second kind and of degree unity (see Arfken and Weber, 2005), and the broken line is the average over the sample estimates.

(North et al. (2011). © American Meteorological Society. Used with permission.)


Figure 11.7 Theoretical (noise-forced EBM-based) spatial autocorrelation for different angular frequencies c011-math-101. The relaxation time for the surface is c011-math-102 (typically, a few years for a mixed-layer model) and c011-math-103 is the characteristic length scale for c011-math-104. Over land, the characteristic length (c011-math-105) is roughly twice the low-frequency limit over ocean (c011-math-106).

(North et al. (2011). © American Meteorological Society. Used with permission.)

11.4 Deterministic Signals in the Climate System

The problem of detecting climate signals in the noisy background was first advanced by Hasselmann (1979), but also see Hasselmann (1993, 1997). In this section, we take on the problem of detecting a faint signal in the noisy background of the climate system using a two-dimensional EBM. By signal, we mean a deterministic pattern in space–time, that is, a response to a forced energy imbalance. The noise here is the natural variability as in our noise-forced EBMs of Chapter 9 or the natural variability in a GCM simulation. The statistical model we have in mind is that the data are a linear sum of signal and noise:

11.35 equation

This would be true in the EBM world of a linear-sampled diffusive model driven by stationary random noise. It appears to be true also for large GCMs if c011-math-108 is small enough. We have included a coefficient c011-math-109 in front of the signal because we usually want to estimate the strength of the signal. In many applications, we know the space–time shape of the signal, but we do not know how strong it is. Often, the strength of such a signal depends on feedback factors that are only poorly known. We begin with a single signal and its characterization. To begin our discussion consider a sinusoidal wave in one dimension:

11.36 equation

The coefficients c011-math-111 and c011-math-112 determine the amplitude and phase of the wave. We can characterize the signal as a vector in a two-dimensional space:

11.37 equation

where c011-math-114 are orthogonal unit vectors in the plane. The above result can, of course, be generalized to any number of dimensions, if, for example, the signal is composed of many harmonics. For example,

11.38 equation

Then the vector representation of c011-math-116 is

11.39 equation

or

11.40 equation

Hence, if the signal is composed of c011-math-119 harmonics, there will be c011-math-120 coefficients representing the amplitude and phase of each, with the exception of the zero-frequency harmonic which has no phase.
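As a concrete illustration of representing a waveform by its harmonic coefficients, the sketch below samples one period of a single sinusoid and recovers its cosine and sine coefficients by projection. The sample count and the coefficient values 3 and 4 are hypothetical, and `harmonic_coefficients` is our own helper name.

```python
import math


def harmonic_coefficients(samples):
    """Project one period of evenly spaced samples onto the fundamental cosine and sine."""
    n = len(samples)
    a = (2.0 / n) * sum(x * math.cos(2.0 * math.pi * k / n) for k, x in enumerate(samples))
    b = (2.0 / n) * sum(x * math.sin(2.0 * math.pi * k / n) for k, x in enumerate(samples))
    return a, b


# Hypothetical signal: 3 cos(wt) + 4 sin(wt), sampled 64 times over one period.
N = 64
A_TRUE, B_TRUE = 3.0, 4.0
signal = [A_TRUE * math.cos(2.0 * math.pi * k / N) + B_TRUE * math.sin(2.0 * math.pi * k / N)
          for k in range(N)]

a, b = harmonic_coefficients(signal)
amplitude = math.hypot(a, b)   # length of the signal vector in the (cos, sin) plane
phase = math.atan2(b, a)       # direction of the signal vector in that plane
print(a, b, amplitude, phase)
```

The pair (a, b) is the two-dimensional signal vector: its length is the amplitude of the wave and its direction encodes the phase.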

11.4.1 Signal and Noise

The additive noise can also be decomposed into frequency components (dropping the awkward (+) and (c011-math-121) superscripts)

11.41 equation

In this case, the components c011-math-123 are random variables and uncorrelated. (If they were correlated, we would rotate the axes to new coordinates such that there is no correlation; that is, in climate, we would use the EOF basis set.) For each realization of the process, a new value of c011-math-124 and c011-math-125 must be drawn from a distribution function and the distribution of c011-math-126 and that of c011-math-127 are independent. If the same frequency component of noise is added to that of the signal, we can write

11.42 equation

where we have used c011-math-129 to indicate “data.” It is worth noting that if the noise process is a stationary time series, the noise from one frequency component to another is uncorrelated. Hence, in this simple case, no rotation is required.

In all the applications that follow, we assume the signals are linearly added to one another and to the natural variability background (the “noise”). We are now in a position to form some estimators of interesting quantities. For example, an unbiased estimator of c011-math-130 is simply c011-math-131, as c011-math-132 and thus c011-math-133.

11.4.2 Fingerprint Estimator of Signal Amplitude

A common problem in climatology is that we know the waveform of the signal (in the above example, the frequency and phase) but want to know its strength. In other words, we know the direction of the signal vector (indicated by the unit vector c011-math-134). An unbiased estimator of c011-math-135 is

11.43 equation
11.44 equation
11.45 equation

where the subscript “rf” indicates “raw fingerprint.” In other words, we find the length of the component of the data vector which lies along the direction of c011-math-139. The raw fingerprint estimator has an MSE

11.46 equation

The raw fingerprint method is very easy to implement and has attracted some users. On the other hand, it does not take advantage of the fact that c011-math-141 and c011-math-142 might be quite different. Hence, we might want to weigh the information from the two component directions optimally. The way to do this is presented in the next section.
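A minimal sketch of a raw fingerprint estimate follows. It assumes that the data components are the signal strength times known signal components plus uncorrelated noise, and it estimates the strength by projecting the data onto the signal with no regard to the noise variances. The component values are hypothetical, and the MSE expression is the standard variance of such a projection under these assumptions rather than a transcription of (11.46).

```python
def raw_fingerprint(data, signal):
    """Estimate the signal strength by projecting the data onto the known signal pattern."""
    num = sum(d * s for d, s in zip(data, signal))
    den = sum(s * s for s in signal)
    return num / den


def raw_fingerprint_mse(signal, noise_var):
    """Variance of the raw projection for uncorrelated noise with the given variances."""
    den = sum(s * s for s in signal)
    return sum(s * s * v for s, v in zip(signal, noise_var)) / den ** 2


# Hypothetical two-component example: equal signal components, unequal noise.
signal = [1.0, 1.0]
noise_var = [1.0, 9.0]
alpha_true = 0.7

# With noise-free data the estimator returns the true strength exactly.
noise_free_data = [alpha_true * s for s in signal]
alpha_rf = raw_fingerprint(noise_free_data, signal)
mse_rf = raw_fingerprint_mse(signal, noise_var)
print(alpha_rf, mse_rf)
```

Note that the noisy second component contributes to the estimate just as heavily as the quiet first one, which inflates the MSE.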

11.4.3 Optimal Weighting

Consider a two-dimensional case in which we do know the direction of the signal vector (c011-math-143) and the angle c011-math-144 that it makes with the c011-math-145 axis. Then we can write for the component of c011-math-146 along the c011-math-147 axis:

11.47 equation

or

11.48 equation

If there were no noise, we could calculate c011-math-150 by first obtaining its component in the c011-math-151 direction from data, then dividing by the direction cosine of the known signal vector and the c011-math-152 axis. This means we can form an unbiased estimate of c011-math-153:

11.49 equation

(note that c011-math-155). Hence, the data vector is to be projected along the 1-axis and inversely weighted by the direction cosine of the signal vector to the 1-axis. This unbiased estimator of c011-math-156 has an error variance of

11.50 equation

But we have many statistically independent unbiased estimators of c011-math-158, one for each component direction. The problem has been reduced to the same one as the thermometers in the reservoir analyzed in Section 11.2. Hence, the optimal estimator of c011-math-159 is

11.51 equation
11.52 equation

with

11.53 equation

In the last expression for c011-math-163, we show the data vector c011-math-164 factored out to emphasize that the procedure is a linear operation or projection of the data vector; hence, the term filter. Each term in the expression for the filter is an independent estimator for the signal's component along c011-math-165, and each estimator is inversely proportional to the variance c011-math-166, as this variance is the eigenvalue of the corresponding EOF.

The matrix form of (11.53) will occur later:

11.54 equation

The form c011-math-168 is a kind of indicator of the signal-to-noise ratio squared. The numerator of each term is the projection of the signal onto the EOF (ec011-math-169) squared. The denominator is the variance corresponding to the natural variability of that EOF mode. Presumably, this series converges, but one must consider that the denominator will decrease toward zero because the EOFs are ordered by the magnitude of the eigenvalues. On the other hand, the numerator should tend toward zero as well, as the projection of the signals on the EOFs should diminish as the mode indices increase. The convergence will depend on the problem being investigated. In the case of signal detection in the climate system, we will see that the convergence is satisfactory.
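The optimal filter can be sketched by treating each component as an independent estimator and combining them with inverse-variance weights, as in Section 11.2. The sketch assumes data components equal to the signal strength times known signal components plus uncorrelated noise with known variances (the EOF eigenvalues). The names `optimal_fingerprint` and `snr_squared`, and all numbers, are hypothetical.

```python
def optimal_fingerprint(data, signal, noise_var):
    """Inverse-variance combination of the per-component estimates data_n / signal_n.

    Returns (alpha_hat, mse), where mse = 1 / snr_squared and
    snr_squared = sum_n signal_n**2 / noise_var_n.
    """
    snr_squared = sum(s * s / v for s, v in zip(signal, noise_var))
    alpha_hat = sum(s * d / v for s, d, v in zip(signal, data, noise_var)) / snr_squared
    return alpha_hat, 1.0 / snr_squared


def raw_projection_mse(signal, noise_var):
    """Variance of the unweighted projection, for comparison."""
    den = sum(s * s for s in signal)
    return sum(s * s * v for s, v in zip(signal, noise_var)) / den ** 2


# Hypothetical two-component example: equal signal components, unequal noise.
signal = [1.0, 1.0]
noise_var = [1.0, 9.0]
alpha_true = 0.7
noise_free_data = [alpha_true * s for s in signal]

alpha_opt, mse_opt = optimal_fingerprint(noise_free_data, signal, noise_var)
mse_raw = raw_projection_mse(signal, noise_var)
print(alpha_opt, mse_opt, mse_raw)
```

In this example the optimal weighting relies mostly on the quiet component, and its MSE is well below that of the unweighted projection.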

Let us recall a few key assumptions. First and foremost, we assumed the linear additivity of the signal and the noise. This is likely to hold for weak signals that we expect in climate change problems. We have used the principal component directions of the natural variability to formulate the problem from the beginning; that is, we chose the coordinate axes to be the principal axes of the covariance ellipsoid of the noise vector. We had to assume knowledge of the direction of the signal waveform, and this had to be based on a model estimate itself. Our job is to estimate its strength given this information.

The quantity c011-math-170 is an a priori measure of the quality of the procedure, since for a signal strength of unity it is the square of the signal-to-noise ratio. We can use c011-math-171 as computed with models to tell which vector components are most important in the estimation problem without really invoking the data. This is very important as we can use our climate models to condition our choice of the subspace within which we can make a reliable estimation of signal strength without involving the data (which would be cheating).

Consider the error involved in the use of imperfect models in constructing the filter. The first type of error is in choosing an incorrect fingerprint. In the present context, this means the vector c011-math-172 has the wrong direction in the state space. An equivalent statement is that the direction cosines c011-math-173 are incorrect. The single constraint is that the squares of the direction cosines must add up to unity. An incorrect fingerprint can lead to a bias in the estimation of the signal strength. For this reason, it is well to find ways to eliminate aspects of the model-predicted signal which may lead to incorrect signal waveform prediction. This could be done by eliminating certain subspaces of the state space, but this is probably not a good approach as the EOFs are very irregular functions over the globe and it is not easy to relate these shapes to the areas that we know are weak in signal generation. Instead, it might be better to mask off certain regions on the globe, such as the polar regions where we know the models perform poorly. Once we have masked off certain areas (with tapered edges), we completely redo the problem including the EOFs on the newly masked planet. We do not pursue this possibility further in this book.

Another type of error comes from the optimal weights as generated from models. This type of error is less egregious than error in the signal waveform. Since the estimator is composed of c011-math-174 independent estimators which are assumed to be unbiased, the weighting does not introduce a bias. If erroneous, they can lead to a suboptimal estimator. In addition, they can lead to an underestimation of the theoretical MSE (c011-math-175). It turns out that as the minimum in the MSE as a function of the weights is the minimum of a multidimensional parabolic surface (actually, the intersection of this parabolic surface with the plane c011-math-176), the MSE is not sensitive to the exact choice of the weights (c011-math-177).

11.4.4 Interfering Signals

Following North and Wu (2001), four signals have been identified for climate signal detection. These are c011-math-178, the cooling due to atmospheric aerosols; c011-math-179, the greenhouse warming signal; c011-math-180, the volcanic dust veil episodic cooling; and S, the solar cycle. Consider the case of two signals. If the unit vectors describing them are not orthogonal, we have to do some additional filtering. Suppose the signals c011-math-181 (the greenhouse gas signal) and c011-math-182 (the aerosol particle signal) are turned off. We want estimates of the amplitudes of c011-math-183 (the solar change signal) and c011-math-184 (the volcanic dust veil signal). Let us start with c011-math-185. If the direction of c011-math-186 and interfering signal c011-math-187 (i.e., their space–time patterns or fingerprints) are known, we can obtain independent estimates of their strengths by estimating the components of each which are perpendicular to the other. For example, consider the component of c011-math-188 which is perpendicular to c011-math-189:

11.55 equation
11.56 equation

where c011-math-192 is a unit vector along c011-math-193. Hence, using this projection procedure (operator), we can now proceed to estimate the strength of c011-math-194 and therefore find the strength of c011-math-195, as c011-math-196. The problem, of course, is that c011-math-197 will be shorter than c011-math-198 with a corresponding loss of performance (signal-to-noise ratio c011-math-199) in the procedure.

We can now use the same procedure to find the length of c011-math-200 and therefore the length of c011-math-201. As a consistency check, we could then proceed to look at the parallel components of each signal, which, in principle, are now known.

11.57 equation
11.58 equation

It is of interest to know the angle between c011-math-204 and c011-math-205,

11.59 equation

If the two signals are orthogonal to one another, there is no interference. If there is a significant alignment or anti-alignment of the two signals, there will be trouble discriminating between them. This condition is known in multiple regression as collinearity. If the length and direction of the interfering signals are both known, we have an unbiased estimator of the length of c011-math-207, which when divided by c011-math-208 becomes an unbiased estimator of c011-math-209. We can optimally combine this with the independent estimate based upon the parallel component which can be found by first subtracting (the known) c011-math-210 from the data stream.
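The projection and the loss of signal length can be illustrated with two plane vectors. The sketch below builds two unit vectors separated by 153.3 degrees (the largest angle listed in Table 11.1), removes from the first its component along the second, and compares the length of what remains with the sine of the angle. The vectors themselves are hypothetical and the helper names are ours.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def angle_degrees(u, v):
    """Angle between two vectors, in degrees."""
    cosine = dot(u, v) / (norm(u) * norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def perpendicular_component(signal, other):
    """Part of `signal` that is orthogonal to `other`."""
    length = norm(other)
    unit = [x / length for x in other]
    along = dot(signal, unit)
    return [s - along * e for s, e in zip(signal, unit)]


# Two hypothetical unit vectors separated by 153.3 degrees.
theta = math.radians(153.3)
first = [1.0, 0.0]
second = [math.cos(theta), math.sin(theta)]

perp = perpendicular_component(first, second)
remaining_fraction = norm(perp) / norm(first)
print(angle_degrees(first, second), remaining_fraction)
```

For nearly anti-collinear signals such as these, less than half of the signal length survives the projection, which is the loss of performance noted above.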

Some interesting examples of the angles between signal vectors are given in North and Stevens (1998) for a narrow band of eight discrete frequencies centered at a period of one decade. In Table 11.1, we see that for most combinations of the four signals (in the narrow frequency band used by North and Stevens), the vectors are reasonably perpendicular, except for c011-math-211 and c011-math-212. This is hardly a surprise, as these two vectors are clearly nearly anti-collinear expressions of linear global warming from the greenhouse effect and a similar linear cooling effect due to aerosols.

Table 11.1 Angles between possible pairs of signal vectors

Vector pair Angle (c011-math-213)
c011-math-214 77.9
c011-math-215 88.0
c011-math-216 101.0
c011-math-217 84.2
c011-math-218 93.8
c011-math-219 153.3

North and Stevens (1998).

11.4.5 All Four Signals Simultaneously

We can cast the problem in the following form:

11.60 equation

where the subscript c011-math-221 is an index running over all space–time points in the record. For instance, in the published papers of North and Stevens (1998) and North and Wu (2001), the number may be 100 years (of annual averages) c011-math-222 36 sites (see Figure 11.8). North and Wu (2001) used several different space–time combinations.


Figure 11.8 Black squares showing the 36 stations used by North and Stevens (1998). Each of the 36 c011-math-223 detection boxes is composed of four c011-math-224 boxes from the Climate Research Unit (UK) data set, each of which has 1200 months of data (1894–1993). These boxes were chosen so that there were sufficient data, spatial sampling was maximized, and correlation between boxes was minimized. The sites designated by black disks were added by North and Wu (2001). These latter each contain 50 years of data.

(North and Wu (2001). © American Meteorological Society. Used with permission.)

The problem has been discretized for c011-math-225 stations and c011-math-226 time steps (see Figure 11.8). We have introduced the notation c011-math-227, c011-math-228 for the four signals. The c011-math-229 are the four unknown coefficients that are to be estimated from the data stream. c011-math-230 is a Gaussian random field denoting the so-called natural variability. In order to build a set of statistically independent estimators of the c011-math-231, we use space–time EOFs (we will refer to them as EOFs from here on instead of Karhunen–Loève functions). These are the eigenvectors of the space–time 3600 c011-math-232 3600 covariance matrix:

11.61 equation

The angular brackets here imply an infinite-member ensemble average. Since we obtain these EOF basis vectors from very long runs of GCMs (or stochastic EBMs), we can assume the sampling errors in taking these averages are negligible. The eigenvector problem is posed as follows:

11.62 equation

where c011-math-235 is the c011-math-236 eigenvector and c011-math-237 is the corresponding (positive and real) eigenvalue. In what follows, we assume the c011-math-238 and the c011-math-239 are not random numbers because of the large number of realizations in determining them. The first step is to expand all quantities in (11.60) into the eigenvectors.

11.63 equation
11.64 equation
11.65 equation
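
The expansion into EOFs can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it replaces the 3600-point space–time record by a six-point toy field, replaces the long control run by synthetic zero-mean Gaussian noise, and uses hypothetical variable names throughout. It estimates the covariance matrix as an ensemble average, obtains the EOFs as its eigenvectors, and projects the noise onto them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a long control run: n_real realizations of a
# correlated, zero-mean noise field at n_pts space-time points.
n_pts, n_real = 6, 50_000
mixing = rng.normal(size=(n_pts, n_pts))
noise = rng.normal(size=(n_real, n_pts)) @ mixing.T

# Covariance matrix estimated as an ensemble average over realizations.
cov = noise.T @ noise / n_real

# eigh is meant for symmetric matrices and returns eigenvalues in ascending
# order, so reorder to the conventional descending-variance arrangement.
eigvals, eofs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eofs = eigvals[order], eofs[:, order]

# Project every realization onto the EOFs (the columns of `eofs`).
coeffs = noise @ eofs

# Covariance of the expansion coefficients: diagonal, with the eigenvalues
# on the diagonal, so different EOF components are uncorrelated.
coeff_cov = coeffs.T @ coeffs / n_real
```

Because the same ensemble is used to build the covariance matrix and to form the coefficients, the off-diagonal entries of `coeff_cov` vanish to round-off here; with an independent data record they would vanish only in expectation.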

To summarize, in what follows, the c011-math-243 and c011-math-244 are not random variables. Because of sampling error in the actual data record, the quantities c011-math-245 are random variables. The c011-math-246 are zero-mean, normally distributed variates representing natural climate variability with the property

11.66 equation

which means that when referred to the EOF basis set, the c011-math-248 are uncorrelated from one component to another. After multiplying (11.60) through with c011-math-249 and summing over c011-math-250 we arrive at

11.67 equation

The last equation indicates that the equations for c011-math-252 are statistically independent of one another. We would like to make estimates of the strength coefficients as a function of the number of EOFs retained, c011-math-253. We can make this into a standard regression model by first normalizing the errors to white noise:

11.68 equation

where the c011-math-255 implies that the variable is divided by c011-math-256. Now we form the MSE and minimize it with respect to c011-math-257. What we understand by “mean” here is

11.69 equation
11.70 equation

with

11.71 equation

Now we can invert the matrix to obtain our estimator:

11.72 equation

It is important to realize that c011-math-262 has to be larger than or equal to 4, otherwise the matrix c011-math-263 will not have an inverse. As expressed in the conventional notation of unnormalized variables,

11.73 equation

This last is our optimal estimator of the four signal strengths. The above derivation is equivalent to a multiple regression model with four unknown coefficients. The value of our approach or decomposition is that it provides the solution as a function of the number of EOFs retained in the analysis as we will see later in the numerical example from North and Wu (2001). If the estimates are stable as the value of c011-math-265 is increased, we have more confidence in the procedure. This might not always be the case as some of the series leading up to c011-math-266 terms have c011-math-267 in the denominator. This is a problem because the eigenvectors (EOFs) and their eigenvalues are traditionally arranged in descending order as a function of c011-math-268. Hence, c011-math-269 is likely to approach zero as c011-math-270. This is always a problem in optimal estimation as can be seen in the simple case of c011-math-271 thermometers.
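
The procedure just described (whiten each EOF component, form the normal equations, invert) can be sketched in a few lines of Python. Everything in the example is an assumption made for illustration: the function name `estimate_amplitudes`, the variable name `gamma` for the matrix that must be inverted, the synthetic eigenvalue spectrum, and the random "signals". It is not the code or the data of North and Wu (2001).

```python
import numpy as np

def estimate_amplitudes(data_k, signals_k, eigvals, n_modes):
    """Least-squares signal amplitudes from the first n_modes EOF components.

    data_k    : (n_eof,) data projected onto the EOFs
    signals_k : (n_eof, n_sig) signals projected onto the EOFs
    eigvals   : (n_eof,) noise variance of each EOF component
    """
    n_sig = signals_k.shape[1]
    if n_modes < n_sig:
        raise ValueError("need at least as many EOF modes as signals")
    scale = np.sqrt(eigvals[:n_modes])
    # Dividing by sqrt(eigenvalue) turns the noise into unit-variance white noise.
    d = data_k[:n_modes] / scale
    s = signals_k[:n_modes] / scale[:, None]
    gamma = s.T @ s                       # normal-equations matrix
    alpha_hat = np.linalg.solve(gamma, s.T @ d)
    cov_alpha = np.linalg.inv(gamma)      # covariance of the estimators
    return alpha_hat, cov_alpha

# Synthetic test case: every number below is invented for illustration.
rng = np.random.default_rng(1)
n_eof, n_sig = 400, 4
eigvals = 5.0 / np.arange(1, n_eof + 1)          # descending spectrum
signals_k = rng.normal(size=(n_eof, n_sig))
alpha_true = np.array([1.0, 0.65, 1.0, 0.0])
data_k = signals_k @ alpha_true + rng.normal(size=n_eof) * np.sqrt(eigvals)

# Estimates as a function of the number of EOFs retained.
history = {K: estimate_amplitudes(data_k, signals_k, eigvals, K)[0]
           for K in (4, 50, 400)}
alpha_400, cov_400 = estimate_amplitudes(data_k, signals_k, eigvals, 400)
```

Inspecting `history` for increasing numbers of retained modes is the numerical counterpart of checking whether the estimates are stable. Note that modes with small eigenvalues receive large weights after whitening, which is the sensitivity to small denominators mentioned above.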

11.4.6 EBM-Generated Signals

A persistent difficulty in detection and attribution studies is to characterize the signals accurately. In GCM studies, one can run many realizations of perturbed runs with only a single forcing applied, then average across the realizations to obtain the signal fingerprint in space–time. Presumably, the natural variability cancels out and one is left with the bare signal. In linear EBM studies, one can simply turn off the noise forcing and the signal will be evident (Figure 11.9). All four of the forcings can be superimposed because the problem is taken to be linear with no time-dependent coefficients (Figure 11.9d). This is the method used by Stevens and North (1996), North and Stevens (1998), and North and Wu (2001). The natural variability statistics can be gathered from long control runs of a GCM (usually 1000 years or so) or an EBM (Stevens calculated EOFs for a 10 000 year control run).
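
The superposition property can be checked with a drastically simplified stand-in for the EBM. The sketch below assumes a zero-dimensional linear model, C dT/dt + B T = F(t), with no noise forcing, rather than the two-dimensional model used in the cited studies. The parameter values, the forcing shapes, and the function name `ebm_response` are all invented for illustration.

```python
import numpy as np

def ebm_response(forcing, dt=1.0 / 12.0, tau=5.0, b=2.0):
    """Noise-free response of C dT/dt + B T = F(t) by backward Euler.

    forcing : forcing time series in W m^-2, one value per time step
    dt, tau : time step and relaxation time C/B in years (assumed values)
    b       : damping coefficient B in W m^-2 K^-1 (assumed value)
    """
    temp = np.zeros(len(forcing))
    for n in range(1, len(forcing)):
        # Backward Euler: (T_n - T_{n-1}) / dt = (F_n / B - T_n) / tau
        temp[n] = (temp[n - 1] + dt / tau * forcing[n] / b) / (1.0 + dt / tau)
    return temp

years = np.arange(0.0, 100.0, 1.0 / 12.0)
# Invented forcings, loosely patterned on the four types in the text.
solar = 0.1 * np.sin(2 * np.pi * years / 11.0)
greenhouse = 0.02 * years
aerosol = -0.008 * years
volcanic = np.where((years % 30.0) < 1.0, -2.0, 0.0)

# Sum of the four individual responses versus the response to the summed forcing.
separate = sum(ebm_response(f) for f in (solar, greenhouse, aerosol, volcanic))
combined = ebm_response(solar + greenhouse + aerosol + volcanic)
```

For a linear model with constant coefficients, `separate` and `combined` agree to round-off, which is why each signal can be generated on its own and the results superimposed.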


Figure 11.9 The four global average forcings. The abscissa is time in years. The ordinate for each panel is watts per square meter. (a) The solar cycle signal. (b) The stratospheric aerosol remaining in the stratosphere following volcanic eruptions. (c) The greenhouse and tropospheric aerosol forcing. (d) The sum of all four forcings globally averaged.

(North and Stevens (1998). © American Meteorological Society. Used with permission.)

Before proceeding, it is useful to show how the North and Wu (2001) signals compare with four realizations of a GCM (HadCM2) (dotted line in Figures 11.10–11.12) of the same era and with observational data (light solid line in the same three figures), which include natural variability. In the same figures, the heavy solid line represents the evaluation of the EBM greenhouse signal (reduced by the factor c011-math-272 = 0.65 to conform with our results shown in this section). Each box in the three figures represents one of the 36 boxes used in the analysis (Figure 11.8). Note the fluctuations of the observations about the heavy black curve, which indicate natural variability (see the caption of Figure 11.10). Also note the agreement of the (scaled) shape of the EBM curve with the HadCM2 curve. Note also that if the average over the four realizations is taken to be the signal used in a detection study, there will still be a fair amount of noise in the signal pattern. Being satisfied with the space–time pattern as shown in the three figures, in what follows, we will use the EBM to generate the four signals.


Figure 11.10 Each panel shows modeled and observed time series from a different observational site as indicated in Figure 11.8. The greenhouse gas signal from the EBCM (thick solid line) has been multiplied by 0.65 (in conformity with our detection results). The dotted line is an average across a four-member ensemble of HadCM2 forced by greenhouse gases (also multiplied by 0.65 to conform with our detection results). Observational data from Jones are shown by the thin solid line.

(North and Wu (2001). © American Meteorological Society. Used with permission.)


Figure 11.11 Continued from Figure 11.10.

(North and Wu (2001). © American Meteorological Society. Used with permission.)


Figure 11.12 Continued from Figure 11.11.

(North and Wu (2001). © American Meteorological Society. Used with permission.)

The signals used in the North and Wu (2001) paper were taken from earlier work in a dissertation by Stevens (1997). We show here a few figures which illustrate the time dependence of the signals. Figure 11.9 shows the global average signal (actually the average over the 36 boxes shown as black squares in Figure 11.8). The topmost panel shows the faint solar signal; the next lower is the stratospheric aerosol signal from volcanic particles left after eruptions. The sharp dips, going backward in time, are Mt Pinatubo in 1992, then El Chichón, then Mt Agung. The next lower panel shows both the greenhouse gas and tropospheric aerosol signals. Note the anti-collinearity of these two global signals, which makes it very difficult to discriminate between them in a detection scheme. The lowermost panel shows the time dependence of the sum of all four signals.

Figure 11.13 shows the same 36 stations. In this figure, signals are band-pass filtered in what Stevens called the “solar band,” a frequency band straddling the frequency 1/10 c011-math-273. The squares of the real and imaginary parts of the Fourier frequency components are shown as columns above the stations. The amplitude is the square root of the sum of these two parts (site by site). Note that both of the imaginary parts are dominant over the real parts. This simply means that the phase lag is nearly c011-math-274, as expected from a gradual temporal increase in the signal. The important thing about this figure is that the two panels show a strong asymmetry between the hemispheres. This asymmetry should assert some discriminating power between the two signals and help us to distinguish one from the other in our detection process.


Figure 11.13 The real and imaginary parts squared for the Fourier component of the greenhouse gas forcing (a) and the tropospheric aerosol forcing (b). The Fourier frequency component is at a period of one decade. The imaginary part dominates in both upper and lower panels, suggesting that the phase lag is c011-math-275. The important point is that there is considerable asymmetry between the two hemispheres, suggesting a good possibility of discriminating between the two signals.

(North and Stevens (1998). © American Meteorological Society. Used with permission.)
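
The dominance of the imaginary part for a gradually increasing signal can be reproduced with a toy calculation. The sketch below assumes a purely linear 100-year ramp with an invented slope and uses the sign convention of `numpy.fft.fft`, which may differ from the convention of the original study; only the relative size of the two parts is of interest.

```python
import numpy as np

n_years = 100
t = np.arange(n_years)
ramp = 0.01 * t                       # hypothetical linear trend, arbitrary units
ramp = ramp - ramp.mean()             # removing the mean affects only frequency zero

# Fourier component at a period of one decade: 10 cycles in 100 years.
k = n_years // 10
component = np.fft.fft(ramp)[k]

real_sq = component.real ** 2         # squared real part
imag_sq = component.imag ** 2         # squared imaginary part
amplitude = np.sqrt(real_sq + imag_sq)
```

For this ramp the squared imaginary part exceeds the squared real part by a wide margin, in line with the behavior described for the slowly increasing greenhouse and aerosol signals.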

11.4.7 Characterizing Natural Variability

When Stevens selected the 36 boxes of Figure 11.8, he tried to space them so that there was as little correlation as possible. If there were no correlation, the 36-dimensional vector with unity in position one and zeroes elsewhere would be the first normalized EOF, and so on. This suggests that only a small rotation of the natural variability field will be necessary to generate the EOFs. It might also mean that the procedure will not be very sensitive to our choice of model-generated fields to use in our study. Nevertheless, North and Wu (2001) decided to use not only the EBM-generated EOFs but also several GCM-generated sets: one 1000 year run from the Max Planck Institute (ECHAM1/LSG), two 1000 year runs from different GFDL models, and one 1000 year run from the Hadley Centre (HadCM2). We could then compare them to see if there is much difference. Recall that even if we do not use the best set, we do not bias the result, but our estimators might be slightly suboptimal with respect to error variance. That comparison is indicated in the figures that follow in this chapter.

With four signals, it is best to recognize that the problem is equivalent to multiple regression for the signal amplitudes. The filter formalism we have used so far gets more complicated because we must make sure that for each of the four signals, the other three have no component parallel to that which is passed through and optimally weighted. On the basis of standard multiple regression analysis, the optimal estimator for a particular c011-math-276 is given by

11.74 equation

where c011-math-278. In (11.74), the quantity c011-math-279 is a diagonal matrix with diagonal entries given by the components of the signal vector. The matrix c011-math-280 is the inverse space–time lagged covariance matrix of the natural variability in its EOF or diagonal form, c011-math-281, where c011-math-282 is the covariance matrix. It forms a metric tensor in space (Hasselmann, 1993; also see the Appendix of North and Wu, 2001).

The formalism leads to a similar expression to that of the single-signal case:

11.75 equation

The matrix c011-math-284 can be formed as the array of the c011-math-285,

11.76 equation

Then the covariance matrix of the estimators c011-math-287 and c011-math-288 is just

11.77 equation

11.4.8 Detection Results

The results of the North and Wu (2001) paper are compactly summarized in chart form in Figure 11.14. This graphic shows the results for a total of five experiments, the asterisk indicating that the estimate is based on 20 tropical stations with 100 year records. The c011-math-290 symbol indicates the 36 stations and 100 years of data as in the previous figure. The c011-math-291 symbol indicates that the experiment was for 43 stations (20 with 100 year records and 23 with only 50 year records); the c011-math-292 symbol indicates the experiment was conducted with 72 stations (36 with 50 year records and 36 with 100 year records); c011-math-293 is based on 72 stations with 50 years of data (1944–1993). The error bars in the figure represent a 90% confidence region. If an error bar reaches below the dotted line (zero), the corresponding c011-math-294 coefficient is not significant at the 90% level. The clusters labeled GFDLc, GFDLml, EBCM, ECHAM1/LSG, and HadCM2 indicate that these experiments were conducted with the EOFs generated from long control runs of the respective models (usually 1000 years, but 10 000 years for the EBM).


Figure 11.14 Estimates of signals using natural variability from GFDLc, GFDLml, EBM, MPI, and HadCM2, and using the EBM-generated signals for c011-math-295, and c011-math-296. This graphic also shows the results for a total of five experiments. The asterisk indicates that the estimate is based on 20 tropical stations with 100 year records. The c011-math-297 symbol indicates 36 stations and 100 years of data as in the previous figure. The c011-math-298 symbol indicates that the experiment was for 43 stations (20 with 100 year records and 23 with only 50 year records); the c011-math-299 symbol indicates the experiment was conducted with 72 stations (36 with 50 year records and 36 with 100 year records); c011-math-300 is based on 72 stations with 50 years of data (1944–1993).

(Figure from North and Wu (2001). (©Amer. Meteorol. Soc., with permission).)

We can examine each row individually to sort out the main features. The top row representing the detection of the solar cycle shows quite a few dips below the dotted line, indicating that in many experiments it was not significantly different from zero (i.e., the zero-amplitude hypothesis could not be rejected). As we scan the different natural variability choices, we see there is rather good consistency; further, it is the rightmost experiment with only 50 years of data that is most unstable, which is hardly surprising because temporally, less data have been included (less than five solar cycles). The greenhouse gas (second) row shows very tight error bars and the estimates of the c011-math-301-signal strength seem very robust across the different experiments and across the different choices of natural variability. The volcanic signal (third row) shows wider error bars but with unanimous statistical significance. There are not that many volcanic events in the record. Finally, the fourth row showing the aerosol signal strength shows many overlaps with the zero line and very unstable results across all possible configurations. Each estimate seems to have very precise error bars, but there is strong dependence on all factors. We would have to conclude that no aerosol signal has been detected.

Next consider the ellipses in Figure 11.15, which represent the different elements of c011-math-302, the covariance of the estimators for signals c011-math-303 and c011-math-304. Look first at the upper left corner. The collinearity of c011-math-305 and c011-math-306 is quite evident across all five of the ellipses in the box. An error on the small side of c011-math-307 is correlated with a similarly small estimate of c011-math-308. Note that all the ellipses intersect c011-math-309, meaning that it fails the significance test. Other panels in the diagram can be interpreted in a similar way.
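
An error ellipse of this kind follows directly from the 2 × 2 covariance matrix of a pair of estimators. The sketch below assumes jointly Gaussian estimators and uses invented numbers, not values read from Figure 11.15; the helper names are hypothetical. It computes the semi-axes of the 90% ellipse and checks whether the ellipse reaches the zero line of one of the two amplitudes.

```python
import numpy as np

def chi2_two_dof(level):
    """Chi-square quantile for two degrees of freedom (closed form)."""
    return -2.0 * np.log(1.0 - level)

def ellipse_semi_axes(cov, level=0.90):
    """Semi-axis lengths of the confidence ellipse, shortest first."""
    return np.sqrt(chi2_two_dof(level) * np.linalg.eigvalsh(cov))

def crosses_zero(center, cov, index, level=0.90):
    """True if the ellipse reaches the line where amplitude `index` is zero."""
    # The ellipse extends center[index] +/- sqrt(chi2 * variance) along that axis.
    half_width = np.sqrt(chi2_two_dof(level) * cov[index, index])
    return abs(center[index]) <= half_width

# Invented numbers: two positively correlated amplitude estimates.
center = np.array([0.6, 0.3])
cov = np.array([[0.04, 0.035],
                [0.035, 0.04]])

semi_axes = ellipse_semi_axes(cov)
first_crosses = crosses_zero(center, cov, 0)
second_crosses = crosses_zero(center, cov, 1)
```

With these made-up values the ellipse is strongly elongated along the diagonal and reaches the zero line of the second amplitude but not of the first, so only the first amplitude would count as detected by this criterion.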


Figure 11.15 Error ellipses of pairs of signals, given five different model prescriptions for the natural variability: GFDLc (solid line); GFDLml (dotted line); MPI (dashed line); EBM (dashed-dotted line); HadCM2 (dashed-dotted-dotted line). Here (a), (b), and (c) are EBCM signals for 72 stations, 36 with 100 years of data, 36 with 50 years of data; (d)–(f) are EBM signals for 72 global sites all with 50 years of data; (g)–(i) are HadCM2 c011-math-310 and c011-math-311 and EBM c011-math-312 and c011-math-313 signals for 72 global sites all with 50 years of data.

(Figure from North and Wu (2001). (© Amer. Meteorol. Soc., with permission).)

11.4.8.1 Convergence

In this section, we examine the convergence of the estimation process. Figures 11.16 and 11.17 show the convergence results. The abscissa shows the number of space–time EOF modes included in the partial sum up to that many terms. The EOFs are arranged in order of descending variance (eigenvalue). We can see from (1a), (1b), (1c), and (1d) that all of the series converge. We already know that c011-math-314 and c011-math-315 are unstable statistically. Panels (1e) and (4e) show this by the irregularity of the amplitude estimate c011-math-316 as a function of EOF number. The estimate of c011-math-317 even drops below zero. On the other hand, the estimates of c011-math-318 and c011-math-319 are quite stable, both converging consistently to a value around 0.60.


Figure 11.16 (1a)–(1e) For solar cycle (S), (2a)–(2e) for greenhouse gas (G), (3a)–(3e) for volcanic (V), and (4a)–(4e) for aerosol (A) in the case of the 36 global stations over 100 years. The space–time EOF modes are arranged in order of descending variance (EOFs from 10,000 year EBM control run). (1a)–(4a) The normalized cumulative fraction of variance of the signal, c011-math-320 with c011-math-321. (1b)–(4b) indicate the eigenvalue of each spatial–temporal mode; (1c)–(4c) indicate the contributions to c011-math-322 from the individual EOF modes; (1d)–(4d) indicate the cumulative c011-math-323. (1e)–(4e) The cumulative estimate of c011-math-324 including EOFs up to c011-math-325.

(Figure from North and Wu (2001). (©Amer. Meteorol. Soc., with permission).)


Figure 11.17 Continued from the previous figure.

(Figure from North and Wu (2001). (©Amer. Meteorol. Soc., with permission).)

11.4.9 Discussion of the Detection Results

We suggest two reasons that the North and Wu (2001) study yields somewhat lower than expected amplitude estimates for c011-math-326 and c011-math-327 as well as the near-zero amplitude for c011-math-328. The Appendices of North and Wu (2001) contain a number of interesting tests of the detection program. For example, Figure 11.18 shows results of a Monte Carlo experiment in which the EOFs from the 10 000 year run of the EBM represent the natural variability and all four signals are included with c011-math-329. Then 200 of the EBM 50 year runs are used as “data” inserted into the 72 data sites. In the figure, the error ellipses for 90% confidence are drawn along with the individual points representing each run. Note that each ellipse has (1, 1) at its center. In the left panel, the correlation of c011-math-330 and c011-math-331 is evident. In the right panel, the orthogonality is expressed as virtually no correlation between the errors in c011-math-332 and c011-math-333.
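
A Monte Carlo test of this type is easy to imitate on synthetic data. In the sketch below only the number of runs (200) and the truncation at 500 modes echo the experiment described above; the eigenvalue spectrum, the two partly anti-collinear "signals", and all variable names are assumptions made for illustration. Each run adds a fresh noise realization to signals of unit true strength, estimates the two amplitudes by whitened least squares, and the scatter of the estimates is then compared with the theoretical covariance.

```python
import numpy as np

rng = np.random.default_rng(2)
n_eof, n_runs = 500, 200

# Invented EOF-space quantities (not the EBM control-run statistics).
eigvals = 3.0 / np.arange(1, n_eof + 1)
base = rng.normal(size=n_eof)
signals = np.column_stack([
    base,                                        # first signal
    -0.8 * base + 0.6 * rng.normal(size=n_eof),  # partly anti-collinear second signal
])
alpha_true = np.array([1.0, 1.0])

# Whitened design matrix and the theoretical covariance of the estimators.
s = signals / np.sqrt(eigvals)[:, None]
cov_theory = np.linalg.inv(s.T @ s)

estimates = np.empty((n_runs, 2))
for run in range(n_runs):
    noise = rng.normal(size=n_eof) * np.sqrt(eigvals)
    data = signals @ alpha_true + noise
    d = data / np.sqrt(eigvals)
    estimates[run] = np.linalg.solve(s.T @ s, s.T @ d)

mean_estimate = estimates.mean(axis=0)
corr_empirical = np.corrcoef(estimates.T)[0, 1]
corr_theory = cov_theory[0, 1] / np.sqrt(cov_theory[0, 0] * cov_theory[1, 1])
```

The cloud of estimates is centered on (1, 1), as expected for an unbiased estimator, and the errors in the two amplitudes are positively correlated because the two signals are partly anti-collinear.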


Figure 11.18 (left) Scatter plot of Monte Carlo studies and 90% error ellipse of detection studies for the signal pair G–A for 72 boxes all with 50-yr (1944–93) observational data. In the Monte Carlo studies, the artificial data are constructed by adding 200 50-yr EBCM control runs to the four EBCM signals S, G, V, and A. The eigenmode truncation is at 500 modes of the 10 k yr EBCM control run. (right) Same as (left) except for the signal pair G–V.

(Figure from North and Wu (2001). (©Amer. Meteorol. Soc., with permission).)

Another interesting test is shown in Figure 11.19. Here, the time dependence of c011-math-334 is compared with that of c011-math-335 (c011-math-336 and c011-math-337 are omitted in the experiment); the heavy solid line is the EBM signal, the light solid line indicates the GCM HadCM2, and the dotted line represents the GCM ECHAM4. The EBM signal is very close to those of the two GCMs. One can notice a very slight difference between the two signals (panels (a) and (b)), with the aerosols making a slight bend in the c011-math-338 curve. The statistical collinearity is evident in the global average curves. The only difference that can be used for discrimination between them must be in the inter-hemispheric difference (presumably there is higher c011-math-339 in the NH). Most GCM simulations have indicated a larger value of c011-math-340 that cancels part of a larger c011-math-341. This latter would mean that the data point would lie in the upper right end of the ellipses in Figure 11.15a,d. In this case, the “equilibrium sensitivity” would be greater than the nominal 2.3 K that we assume for the EBM used in this book. We cannot rule out this case as it would lie within the 90% confidence area of those two panels of Figure 11.15.


Figure 11.19 (a) First principal component time series of annual mean climate change signal for greenhouse-gas-only forcing from EBCM (heavy solid line), HadCM2 GCM (light solid line), and ECHAM4 GCM (dotted line); (b) same as (a) except for greenhouse-gas-plus-aerosol forcing.

(Figure from North and Wu (2001). (©Amer. Meteorol. Soc., with permission).)

Another possible reason for the discrepancy between the EBM detection study and that of many IPCC GCMs is that the EBM uses only a mixed-layer ocean in its EBM-generated signal. Compared to a deeper ocean coupling, this would make the EBM signal larger than the one which might have been generated from a coupled ocean–atmosphere GCM. The latter would hold down the signal from such a coupled model as we discussed in Chapter 10.

A final possible criticism of the North and Wu (2001) study lies in the fact that so many EOFs are used. Using very large numbers of EOFs often raises a red flag in statistical studies because the eigenvalues of the higher modes are necessarily close together, which means that the corresponding EOFs will “mix” (if two eigenvalues are equal, any linear combination of the associated eigenvectors is also an eigenvector). But this criticism is misplaced here. The (sample) PC time series associated with a particular EOF are orthogonal by construction and the EOFs form a good basis set.
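
The statement about mixing can be made concrete with a small example. The sketch below builds a hypothetical 3 × 3 covariance matrix with a repeated eigenvalue (the numbers are invented) and shows that a combination of the two degenerate eigenvectors is again an eigenvector, while the rotated pair remains orthonormal and therefore still serves as a basis for the same subspace.

```python
import numpy as np

# Hypothetical 3x3 covariance matrix with a repeated eigenvalue (2, 2, 1).
basis, _ = np.linalg.qr(np.random.default_rng(3).normal(size=(3, 3)))
cov = basis @ np.diag([2.0, 2.0, 1.0]) @ basis.T

e1, e2 = basis[:, 0], basis[:, 1]
mixed = (e1 + e2) / np.sqrt(2.0)      # arbitrary combination of the degenerate pair
other = (e1 - e2) / np.sqrt(2.0)      # its orthogonal partner in the same subspace

# `mixed` is still an eigenvector with the same eigenvalue...
still_eigenvector = np.allclose(cov @ mixed, 2.0 * mixed)
# ...and the rotated pair is still orthonormal.
orthonormal = bool(np.isclose(mixed @ other, 0.0) and np.isclose(mixed @ mixed, 1.0))
```

Mixing therefore changes which individual patterns are labeled as EOFs within a nearly degenerate group, but not the subspace they span.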

Notes for Further Reading

Surface temperature data sets can be found at the following websites. In many cases, there are descriptions of how the data are processed and how the estimation procedure works:

  1. The NASA/Goddard Institute for Space Studies (GISS) provides very accessible digital as well as graphical (including maps) surface temperature data. http://data.giss.nasa.gov/gistemp/.
  2. The Climatic Research Unit (CRU) of the University of East Anglia provides both digital and colorful graphical information. They also provide a list of publications by Dr. Philip D. Jones and his colleagues. http://www.cru.uea.ac.uk/cru/data/temperature/.
  3. The NOAA/National Climatic Data Center (NCDC) website is a little more difficult to navigate, but it has the data. http://www.ncdc.noaa.gov/temp-and-precip/global-maps/.
  4. NOAA also publishes online the pdfs of annual issues of the Bulletin of the American Meteorological Society on the State of the Climate for each year starting in 1991; there is also a decadal review covering 1981–1990. http://www.ncdc.noaa.gov/bams/past-reports.

Exercises

  1. 11.1
    1. a. Consider two measurements of surface temperature at a location. They are described as
      equation

      where c011-math-342 is the true temperature and c011-math-343 and c011-math-344 are measurement errors. Assume that measurements are unbiased, that is, c011-math-345, and error variance is given by c011-math-346 and c011-math-347. Further assume that the errors in the two measurements are uncorrelated, that is, c011-math-348. Show that an unbiased optimal estimator with the least MSE is given by

      equation
    2. b. Show that an unbiased optimal estimator for three independent unbiased measurements is given by
      equation
    3. c. For c011-math-349 independent unbiased measurements, the optimal unbiased estimator is obtained by minimizing the so-called error functional defined by
      equation

      where c011-math-350 is called a Lagrange multiplier. Show that the optimal unbiased estimator is given by

      equation
    4. d. Show that the error variance of the optimal estimator is given by
      equation
  2. 11.2 Consider two measurements, c011-math-351, of which the errors are correlated. Assume that error covariance matrix is given by
    equation
    1. a. Find the principal axes for which two new measurements are uncorrelated.
    2. b. Show that the two measurements rotated according to the eigenvectors become uncorrelated.
    3. c. Determine the optimal unbiased estimator for c011-math-352 based on the two measurements c011-math-353.
  3. 11.3 As discussed in Exercise 11.2, two or more measurements on the surface of the Earth are typically correlated. Let us consider c011-math-354 measurements of length c011-math-355 of a variable c011-math-356 (say, temperature), c011-math-357. Then, the covariance matrix of the c011-math-358 measurements is given by
    equation

    The eigenvalues and eigenfunctions of the covariance matrix are determined by solving the Karhunen–Loève equation

    equation

    where c011-math-359 is the c011-math-360th eigenvector with corresponding eigenvalue c011-math-361. Then, the c011-math-362 measurements can be written as a unique linear combination of eigenvectors as

    equation

    This procedure is called the EOF analysis. The unique amplitude time series, c011-math-363, is often called the ith principal component (PC) time series; for this reason, this decomposition is also called PCA (principal component analysis). The eigenfunctions are orthogonal to each other and the eigenvalues are real and nonnegative, as the covariance matrix is real, symmetric, and positive semidefinite. In addition, the eigenvectors (also called EOF loading vectors) and PC time series should satisfy the following:

    equation
    equation

    where c011-math-364 is a proportionality constant.

    1. a. Show that the covariance matrix, c011-math-365, can be written as
      equation
    2. b. Show that the EOF decomposition indeed satisfies the Karhunen–Loève equation and the proportionality constant c011-math-366.
    3. c. Show how you can calculate the PC time series from the eigenfunctions of the covariance matrix.
  4. 11.4 Global average temperature is defined by
    equation

    where c011-math-367 denotes the surface of the Earth, and c011-math-368 and c011-math-369 represent longitude and latitude of a location c011-math-370 on the surface of the Earth. The factor c011-math-371 is introduced to properly scale the result.

    1. a. Let us consider the problem of estimating global average temperature based on a small number of samples on the surface of the Earth. Set up an optimal estimation problem based on samples c011-math-372.
    2. b. Solve the optimal estimation problem for the weights.
  5. 11.5 Consider a noise-forced one-dimensional energy balance model of the form
    equation
    1. a. Calculate the spectrum of c011-math-373 using the spectrum of noise forcing c011-math-374.
    2. b. What is the nature of the spectrum of c011-math-375 when the model is forced by a white-noise forcing, that is, c011-math-376, regardless of the frequency c011-math-377? Plot the spectrum for ocean and land responses by using c011-math-378, c011-math-379 W mc011-math-380C)c011-math-381, c011-math-382=10 years, c011-math-383.
    3. c. Express the variance of the temperature response in terms of the spectrum of temperature and in terms of the spectrum of forcing.
    4. d. Calculate the autocovariance function for the temperature response when the model is driven by a white-noise forcing. Then, determine the e-folding timescale of response.
  6. 11.6 Consider a signal detection problem, where the normalized signal at two stations is given in the form
    equation

    where c011-math-384 are orthogonal unit vectors. Actual data at the two stations are given by

    equation

    where c011-math-385 is the true strength of the signal and c011-math-386 are the natural variability at the two stations. Thus, the signal of constant strength is embedded amid randomly varying natural variability. Further, assume that

    equation

    that is, natural variability at each station has mean zero and the natural variability at the two stations is uncorrelated.

    1. a. One way to determine the signal strength is to project the normalized signal on the data set, that is, c011-math-387. Show that this is an unbiased estimator of the signal strength. What is the error variance of the signal strength c011-math-388?
    2. b. Note that the estimator in Part (a) is not optimal, as it did not consider that the magnitude of natural variability differs at the two stations. One way to account for this is to estimate the signal strength at each station and weigh the two estimates optimally. Develop an optimal estimator based on this idea. Calculate the error variance and compare it with that in Part (a).
    3. c. Show for the signal and mutually uncorrelated noise background (natural variability) at c011-math-389 stations that
      equation
      where the signal is determined by
      equation
  7. 11.7 Consider data consisting of four different temporally varying signals on top of natural variability:
    equation

    where c011-math-390 denote different signals, c011-math-391 represents natural variability, and c011-math-392 space–time points. Assume that the signals are not necessarily orthogonal to each other.

    1. a. Show that this problem can be recast in terms of EOFs of natural variability in the form
      equation
      where c011-math-393 represents the EOF mode number.
    2. b. Show that an unbiased estimator for the signal strength can be derived from each EOF mode equation in Part (a) as
      equation