3
State‐Space Models for Identification

3.1 Introduction

State‐space models are easily generalized to multichannel, nonstationary, and nonlinear processes [1–23]. They are very popular for model‐based signal processing primarily because most physical phenomena modeled by mathematical relations naturally occur in state‐space form (see [15] for details). With this motivation in mind, let us proceed to investigate the state‐space representation in a more general form to at least “touch” on its inherent richness. We start with continuous‐time systems and then proceed to the sampled‐data system that evolves from digitization followed by the purely discrete‐time representation – the primary focus of this text.

3.2 Continuous‐Time State‐Space Models

We begin by formally defining the concept of state [1]. The state of a system at time $t$ is the “minimum” set of variables (state variables) that, along with the input, is sufficient to uniquely specify the dynamic system behavior for all $\tau$ over the interval $\tau \in [t, \infty)$. The state vector is the collection of state variables into a single vector. The idea of a minimal set of state variables is critical, and all techniques to define them must ensure that the smallest number of “independent” states has been defined in order to avoid possible violation of some important system theoretic properties [2,3]. In short, one can think of states mathematically as simply converting an $N_x$th‐order differential equation into a set of $N_x$ first‐order equations, each of which defines a state variable. From a systems perspective, the states are the internal variables that may not be measured directly, but provide the critical information about system performance. For instance, when measuring only the input/output voltages or currents of an integrated circuit board with a million embedded (internal) transistors, the internal voltages/currents would be the unmeasured states of this system.

Let us consider a general deterministic formulation of a nonlinear dynamic system including the output (measurement) model in state‐space form (continuous‐time)¹

$$\dot{x}_t = a[x_t] + b[u_t], \qquad y_t = c[x_t] + d[u_t]$$

for $x_t$, $y_t$, and $u_t$ the respective $N_x$‐state, $N_y$‐output, and $N_u$‐input vectors with corresponding system (process), input, measurement (output), and feedthrough functions. The $N_x$‐dimensional system and input functions are defined by $a[\cdot]$ and $b[\cdot]$, while the $N_y$‐dimensional output and feedthrough functions are given by $c[\cdot]$ and $d[\cdot]$.

In order to specify the solution of the $N_x$th‐order differential equations completely, we must specify the above‐noted functions along with a set of $N_x$ initial conditions at time $t_0$ and the input for all $t \ge t_0$. Here $N_x$ is the dimension of the “minimal” set of state variables.

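If we constrain the state‐space representation to be linear in the states, then we obtain the generic continuous‐time, linear, time‐varying state‐space model given by

$$\dot{x}_t = A_t x_t + B_t u_t, \qquad y_t = C_t x_t + D_t u_t \tag{3.1}$$

where $x_t \in \mathcal{R}^{N_x \times 1}$, $u_t \in \mathcal{R}^{N_u \times 1}$, $y_t \in \mathcal{R}^{N_y \times 1}$, and the respective system, input, output, and feedthrough matrices are: $A_t \in \mathcal{R}^{N_x \times N_x}$, $B_t \in \mathcal{R}^{N_x \times N_u}$, $C_t \in \mathcal{R}^{N_y \times N_x}$, and $D_t \in \mathcal{R}^{N_y \times N_u}$.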

An interesting property of the state‐space representation is to realize that these models represent a complete generic form for almost any physical system. That is, if we have an RLC circuit or a mass‐damper‐spring (MCK) mechanical system, their dynamics are governed by the identical set of differential equations – only their coefficients differ. Of course, the physical meaning of the states is different, but this is the idea behind state‐space – many physical systems can be captured by this generic form of differential equations, even though the systems are physically different.

Systems theory, which is essentially the study of dynamic systems, is based on the study of state‐space models and is rich with theoretical results exposing the underlying properties of the system under investigation. This is one of the major reasons why state‐space models are employed in signal processing, especially when the system is multivariable, having multiple inputs and multiple outputs (MIMO). Next we develop the relationship between the state‐space representation and input/output relations – the transfer function.

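For this development, we constrain the state‐space representation above to be a linear time‐invariant (LTI) state‐space model given by

$$\dot{x}_t = A_c x_t + B_c u_t, \qquad y_t = C_c x_t + D_c u_t \tag{3.2}$$

where $A_t \to A_c$, $B_t \to B_c$, $C_t \to C_c$, and $D_t \to D_c$ are the time‐invariant counterparts, with the subscript “$c$” annotating continuous‐time matrices.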

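This LTI model corresponds to a set of constant‐coefficient differential equations, which can be solved using Laplace transforms. Taking the Laplace transform² of these equations and solving for $X(s)$, we have that

$$X(s) = (sI - A_c)^{-1} x_{t_0} + (sI - A_c)^{-1} B_c U(s) \tag{3.3}$$

where $I \in \mathcal{R}^{N_x \times N_x}$ is the identity matrix. The corresponding output is

$$Y(s) = C_c X(s) + D_c U(s) \tag{3.4}$$

From the definition of transfer function (zero initial conditions), we have the desired result

$$H(s) = C_c (sI - A_c)^{-1} B_c + D_c \tag{3.5}$$

Taking the inverse Laplace transform of this equation gives us the corresponding impulse response matrix of the LTI system as [1]

$$H(t, \tau) = C_c e^{A_c (t - \tau)} B_c + D_c\, \delta(t - \tau) \quad \text{for } t \ge \tau \tag{3.6}$$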

So we see that the state‐space representation enables us to express the input–output relations in terms of the internal variables or states. Note also that this is a multivariable representation compared to the usual single‐input‐single‐output (SISO) (scalar) systems models that frequently appear in the signal processing literature.

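Now that we have the multivariable transfer function representation of our LTI system, we can solve the state equations directly using inverse transforms to obtain the time‐domain solutions. First, we simplify the notation by defining the Laplace transform of the state transition matrix or the so‐called resolvent matrix of systems theory [1, 3] as

$$\Phi_c(s) := (sI - A_c)^{-1} \tag{3.7}$$

with

$$\Phi_c(t) = \mathcal{L}^{-1}[\Phi_c(s)] = e^{A_c t} \tag{3.8}$$

Therefore, the state transition matrix is a critical component in the solution of the state equations of an LTI system given by

$$\dot{x}_t = A_c x_t + B_c u_t \tag{3.9}$$

and we can rewrite the transfer function matrix as

$$H(s) = C_c \Phi_c(s) B_c + D_c \tag{3.10}$$

with the corresponding state‐input transfer matrix given by

$$S(s) = \Phi_c(s) B_c \tag{3.11}$$

Taking the inverse Laplace transformation gives the time‐domain solution

$$x_t = \Phi_c(t, t_0)\, x_{t_0} + \int_{t_0}^{t} \Phi_c(t, \alpha) B_c u_\alpha \, d\alpha \tag{3.12}$$

with the corresponding output solution

$$y_t = C_c \Phi_c(t, t_0)\, x_{t_0} + \int_{t_0}^{t} C_c \Phi_c(t, \alpha) B_c u_\alpha \, d\alpha + D_c u_t \tag{3.13}$$

Revisiting the continuous‐time system of Eq. 3.12 and substituting the matrix exponential for the state transition matrix gives the LTI solution as

$$x_t = e^{A_c (t - t_0)} x_{t_0} + \int_{t_0}^{t} e^{A_c (t - \alpha)} B_c u_\alpha \, d\alpha \tag{3.14}$$

and the corresponding measurement system

$$y_t = C_c x_t + D_c u_t \tag{3.15}$$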

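In general, the continuous state transition matrix satisfies the following properties [1, 2]:

1. $\Phi_c(t, t_0)$ is uniquely defined for $t, t_0 \in [0, \infty)$ (Unique)
2. $\Phi_c(t, t) = I$ (Identity)
3. $\Phi_c(t, t_0)$ satisfies the matrix differential equation:
   $$\dot{\Phi}_c(t, t_0) = A_t \Phi_c(t, t_0), \qquad \Phi_c(t_0, t_0) = I, \quad t \ge t_0 \tag{3.16}$$
4. $\Phi_c(t, t_0) = \Phi_c(t, \tau)\, \Phi_c(\tau, t_0)$ (Semi‐Group)
5. $\Phi_c(t, \tau)^{-1} = \Phi_c(\tau, t)$ (Inverse)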

Thus, the transition matrix plays a pivotal role in LTI systems theory for the analysis and prediction of the response of LTI and time‐varying systems [2]. For instance, the poles of an LTI system govern important properties such as stability and response time. The poles are the roots of the characteristic (polynomial) equation of $A_c$, which are found by solving for the roots of the determinant of $(sI - A_c)$, the inverse of the resolvent, that is,

$$\det(sI - A_c)\big|_{s = p_i} = 0 \tag{3.17}$$

Stability is determined by assuring that all of the poles $p_i$ lie strictly within the left half of the $s$‐plane. Next we consider the sampled‐data state‐space representation.
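The transfer function relation of Eq. 3.5 and the pole test above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the second‐order system (a mass‐damper‐spring with unit mass, damping 3, stiffness 2) and the function names are hypothetical choices made only for illustration.

```python
import numpy as np

def transfer_function(A, B, C, D, s):
    """Evaluate H(s) = C (sI - A)^{-1} B + D at a single frequency s."""
    n = A.shape[0]
    # Solve (sI - A) X = B instead of forming the inverse explicitly.
    X = np.linalg.solve(s * np.eye(n) - A, B)
    return C @ X + D

def is_stable_continuous(A):
    """True when every pole (eigenvalue of A) has a negative real part."""
    return bool(np.all(np.linalg.eigvals(A).real < 0.0))

# Hypothetical system: x1 = position, x2 = velocity (m = 1, c = 3, k = 2)
A_c = np.array([[0.0, 1.0], [-2.0, -3.0]])
B_c = np.array([[0.0], [1.0]])
C_c = np.array([[1.0, 0.0]])
D_c = np.array([[0.0]])

poles = np.sort(np.linalg.eigvals(A_c).real)        # roots of s^2 + 3s + 2
H_dc = transfer_function(A_c, B_c, C_c, D_c, 0.0)   # static gain, 1/k
print(poles, H_dc, is_stable_continuous(A_c))
```

Solving the linear system rather than inverting $(sI - A_c)$ is a common numerical preference; either route evaluates the same expression.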

3.3 Sampled‐Data State‐Space Models

Sampling a continuous‐time system is commonplace with the advent of high‐speed analog‐to‐digital converters (ADC) and modern computers. A sampled‐data system lies somewhere between the continuous analog domain (physical system) and the purely discrete domain (stock market prices). Since we are strictly sampling a continuous‐time process, we must ensure that all of its properties are preserved. The well‐known Nyquist sampling theorem precisely expresses the required conditions (a sampling rate greater than twice the highest frequency present) to achieve “perfect” reconstruction of the process from its samples [15].

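Thus, if we have a physical system governed by continuous‐time dynamics and we “sample” it at given time instants, then a sampled‐data model can be obtained directly from the solution of the continuous‐time state‐space model. That is, we know from Section 3.2 that

$$x_t = \Phi_c(t, t_0)\, x_{t_0} + \int_{t_0}^{t} \Phi_c(t, \alpha) B_c(\alpha) u_\alpha \, d\alpha$$

where $\Phi_c(\cdot, \cdot)$ is the continuous‐time state transition matrix that satisfies the matrix differential equation of Eq. 3.16, that is,

$$\dot{\Phi}_c(t, t_0) = A_c(t)\, \Phi_c(t, t_0), \qquad \Phi_c(t_0, t_0) = I$$

Sampling this system such that $t \to t_k$ over the interval $(t_{k-1}, t_k]$, we have the corresponding sampling interval defined by $\Delta t_k := t_k - t_{k-1}$. Note that the samples need not be equally spaced, which is another important property of the state‐space representation. Thus, the sampled solution becomes

$$x(t_k) = \Phi(t_k, t_{k-1})\, x(t_{k-1}) + \int_{t_{k-1}}^{t_k} \Phi(t_k, \alpha) B_c(\alpha) u_\alpha \, d\alpha \tag{3.18}$$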

and therefore from the differential equation of Eq. 3.16 , we have the solution

(3.19)

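where $\Phi(t_k, t_{k-1})$ is the sampled‐data state transition matrix, the critical component in the solution of the state equations enabling us to calculate state evolution in time. Note that for an LTI sampled‐data system, the state‐transition matrix is $\Phi(t_k, t_{k-1}) = e^{A_c (t_k - t_{k-1})}$ where $A_c$ is the system (process) matrix of the continuous‐time model being sampled.

If we further assume that the input excitation is piecewise constant ($u_\alpha \to u(t_{k-1})$) over the interval $(t_{k-1}, t_k]$, then it can be removed from under the superposition integral in Eq. 3.18 to give

$$x(t_k) = \Phi(t_k, t_{k-1})\, x(t_{k-1}) + \left( \int_{t_{k-1}}^{t_k} \Phi(t_k, \alpha) B_c(\alpha)\, d\alpha \right) u(t_{k-1}) \tag{3.20}$$

Under this assumption, we can define the sampled‐data input transmission matrix as

$$B(t_{k-1}) := \int_{t_{k-1}}^{t_k} \Phi(t_k, \alpha) B_c(\alpha)\, d\alpha \tag{3.21}$$

and therefore the sampled‐data state‐space system with equally or unequally sampled data is given by

$$x(t_k) = \Phi(t_k, t_{k-1})\, x(t_{k-1}) + B(t_{k-1})\, u(t_{k-1}), \qquad y(t_k) = C(t_k)\, x(t_k) + D(t_k)\, u(t_k) \tag{3.22}$$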

Computationally, sampled‐data systems pose no particular problems when care is taken, especially since reasonable approximation and numerical integration methods exist [19]. This completes the discussion of sampled‐data systems and approximations. Next we consider the discrete‐time systems.
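For the LTI case with a piecewise‐constant input, the sampled‐data transition and input matrices can be computed together from one matrix exponential of an augmented matrix. The sketch below is one standard numerical route, not necessarily the one intended in [19]; it assumes NumPy, a single fixed sampling interval `dt`, and a small hand‐written exponential routine (`expm_series`, a hypothetical helper) so that no other library is needed.

```python
import numpy as np

def expm_series(M, terms=30):
    """Matrix exponential via a truncated Taylor series with scaling and squaring."""
    M = np.asarray(M, dtype=float)
    norm = np.linalg.norm(M, ord=np.inf)
    # Halve the matrix until its norm is small, then square the result back.
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    Ms = M / (2 ** squarings)
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ Ms / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def sample_zoh(A_c, B_c, dt):
    """Sampled-data (A, B) when the input is held constant over each interval."""
    n, m = B_c.shape
    # exp([[A_c, B_c], [0, 0]] * dt) = [[A, B], [0, I]]
    M = np.zeros((n + m, n + m))
    M[:n, :n] = A_c
    M[:n, n:] = B_c
    E = expm_series(M * dt)
    return E[:n, :n], E[:n, n:]

# Hypothetical first-order system: dx/dt = -x + u, sampled every 0.1 s
A_c = np.array([[-1.0]])
B_c = np.array([[1.0]])
A, B = sample_zoh(A_c, B_c, dt=0.1)
print(A, B)   # compare with exp(-0.1) and 1 - exp(-0.1)
```

For unequally spaced samples the same function would simply be called once per interval with the corresponding $\Delta t_k$.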

3.4 Discrete‐Time State‐Space Models

Discrete state‐space models, the focus of subspace identification, evolve in two distinct ways: naturally from the problem or from sampling a continuous‐time dynamical system. An example of a natural discrete system is the dynamics of balancing our own checkbook. Here the state is the evolving balance given the past balance and the amount of the previous check. There is “no information” between time samples, so this model represents a discrete‐time system that evolves naturally from the underlying problem. On the other hand, if we have a physical system governed by continuous‐time dynamics, then we “sample” it at given time instants. So we see that discrete‐time dynamical systems can evolve from a wide variety of problems both naturally (checkbook) or physically (circuit). In this text, we are primarily interested in physical systems (physics‐based models), so we will concentrate on sampled systems reducing them to a discrete‐time state‐space model.

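We can use a first‐difference approximation and apply it to the general LTI continuous‐time state‐space model of Eq. 3.2 to obtain a discrete‐time system, that is,

$$\dot{x}_t \approx \frac{x(t + \Delta t) - x(t)}{\Delta t} = A_c x(t) + B_c u(t), \qquad y(t) = C_c x(t) + D_c u(t)$$

Solving for $x(t + \Delta t)$, we obtain

$$x(t + \Delta t) = (I + A_c \Delta t)\, x(t) + B_c \Delta t\, u(t), \qquad y(t) = C_c x(t) + D_c u(t) \tag{3.23}$$

Recognizing that the first‐difference approximation is equivalent to a first‐order Taylor series approximation of $e^{A_c \Delta t}$ gives the discrete system, input, output, and feedthrough matrices as

$$A \approx I + A_c \Delta t, \qquad B \approx B_c \Delta t, \qquad C \approx C_c, \qquad D \approx D_c \tag{3.24}$$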
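To see how good the first‐difference approximation is, the sketch below compares $I + A_c \Delta t$ with the exact exponential for a scalar system. It assumes NumPy; the system and the two step sizes are hypothetical values chosen for illustration.

```python
import numpy as np

def first_difference(A_c, B_c, dt):
    """First-difference (Euler) discretization: A = I + A_c*dt, B = B_c*dt."""
    n = A_c.shape[0]
    return np.eye(n) + A_c * dt, B_c * dt

# Hypothetical scalar system: dx/dt = -x + u
A_c = np.array([[-1.0]])
B_c = np.array([[1.0]])

errors = {}
for dt in (0.1, 0.01):
    A, B = first_difference(A_c, B_c, dt)
    exact = np.exp(A_c[0, 0] * dt)            # scalar e^{A_c dt}
    errors[dt] = abs(A[0, 0] - exact)
    print(dt, A[0, 0], errors[dt])
```

The error in the system matrix shrinks roughly with the square of the step, which is what a first‐order Taylor truncation predicts.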

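The discrete, linear, time‐varying state‐space representation is given by the system or process model as

$$x(t) = A(t-1)\, x(t-1) + B(t-1)\, u(t-1) \tag{3.25}$$

and the corresponding discrete output or measurement model as

$$y(t) = C(t)\, x(t) + D(t)\, u(t) \tag{3.26}$$

where $x, u, y$ are the respective $N_x$‐state, $N_u$‐input, $N_y$‐output and $A, B, C, D$ are the ($N_x \times N_x$)‐system, ($N_x \times N_u$)‐input, ($N_y \times N_x$)‐output, and ($N_y \times N_u$)‐feedthrough matrices.

The state‐space representation for LTI discrete systems is characterized by constant system, input, output, and feedthrough matrices, that is,

$$A(t) \to A, \qquad B(t) \to B, \qquad C(t) \to C, \qquad D(t) \to D$$

and is given by the LTI system

$$x(t) = A\, x(t-1) + B\, u(t-1), \qquad y(t) = C\, x(t) + D\, u(t) \tag{3.27}$$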

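The discrete system representation replaces the Laplace transform with the $\mathcal{Z}$‐transform defined by the transform pair:

$$X(z) := \sum_{t=-\infty}^{\infty} x(t)\, z^{-t}, \qquad x(t) = \frac{1}{2\pi j} \oint X(z)\, z^{t-1}\, dz \tag{3.28}$$

Time‐invariant state‐space discrete systems can also be represented in input/output or transfer function form using the $\mathcal{Z}$‐transform to give

$$H(z) = C (zI - A)^{-1} B + D = C\, \frac{\operatorname{adj}(zI - A)}{\det(zI - A)}\, B + D \tag{3.29}$$

where recall “adj” is the matrix adjoint (transpose of the cofactor matrix) and “det” is the matrix determinant.

We define the characteristic equation or characteristic polynomial of the $N_x$‐dimensional system matrix $A$ as

$$\alpha(z) := \det(zI - A) = z^{N_x} + \alpha_1 z^{N_x - 1} + \cdots + \alpha_{N_x - 1} z + \alpha_{N_x} \tag{3.30}$$

with roots corresponding to the poles of the underlying system that are also obtained from the eigenvalues of $A$ defined by

$$\det(zI - A) = \prod_{i=1}^{N_x} (z - \lambda_i) \tag{3.31}$$

for $\lambda_i$ the $i$th system pole.

Taking inverse $\mathcal{Z}$‐transforms of Eq. 3.29, we obtain the discrete impulse (or pulse) response matrix as

$$H(t) = \begin{cases} D\, \delta(t), & t = 0 \\ C A^{t-1} B, & t > 0 \end{cases} \tag{3.32}$$

for $\delta(t)$ the Kronecker delta function.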

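The solution to the state‐difference equations can easily be derived by induction [3] and is given by

$$x(t) = \Phi(t, k)\, x(k) + \sum_{i=k+1}^{t} \Phi(t, i)\, B(i-1)\, u(i-1) \quad \text{for } t > k \tag{3.33}$$

where $\Phi(t, k)$ is the discrete‐time state‐transition matrix. For time‐varying systems, it can be shown (by induction) that the state‐transition matrix satisfies

$$\Phi(t, k) = A(t-1) \cdot A(t-2) \cdots A(k)$$

while for time‐invariant systems the state‐transition matrix is given by

$$\Phi(t, k) = A^{t-k} \quad \text{for } t > k$$

The discrete state‐transition matrix possesses properties analogous to its continuous‐time counterpart, that is,

1. $\Phi(t, k)$ is uniquely defined (Unique)
2. $\Phi(t, t) = I$ (Identity)
3. $\Phi(t, k)$ satisfies the matrix difference equation:
   $$\Phi(t, k) = A(t-1)\, \Phi(t-1, k), \qquad \Phi(k, k) = I, \quad t \ge k + 1 \tag{3.34}$$
4. $\Phi(t, k) = \Phi(t, m)\, \Phi(m, k)$ (Semi‐Group)
5. $\Phi^{-1}(t, k) = \Phi(k, t)$ (Inverse)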

3.4.1 Linear Discrete Time‐Invariant Systems

In this section, we concentrate on the discrete LTI system that is an integral component of the subspace identification techniques to follow. Here we develop a compact vector–matrix form of the system input/output relations that will prove useful in subsequent developments.

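For a discrete LTI system, we have that the state transition matrix becomes

$$\Phi(t, k) = A^{t-k}$$

The discrete LTI solution of Eq. 3.33 is therefore

$$x(t) = A^{t} x(0) + \sum_{k=0}^{t-1} A^{t-k-1} B\, u(k), \qquad t = 1, 2, \ldots \tag{3.35}$$

with the measurement or output system given by

$$y(t) = C A^{t} x(0) + \sum_{k=0}^{t-1} C A^{t-k-1} B\, u(k) + D\, u(t) \tag{3.36}$$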
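The closed‐form LTI solution can be checked against a direct recursion of the state equations. The following is a minimal sketch, assuming NumPy; the two‐state system, the initial state, and the unit‐step input are hypothetical values chosen only to exercise the formulas.

```python
import numpy as np

def simulate(A, B, C, D, x0, u):
    """Iterate x(t+1) = A x(t) + B u(t) and y(t) = C x(t) + D u(t)."""
    x = x0.copy()
    xs, ys = [], []
    for u_t in u:
        xs.append(x)
        ys.append(C @ x + D @ u_t)
        x = A @ x + B @ u_t
    return np.array(xs), np.array(ys)

def closed_form_state(A, B, x0, u, t):
    """x(t) = A^t x(0) + sum_{k=0}^{t-1} A^{t-k-1} B u(k)."""
    x = np.linalg.matrix_power(A, t) @ x0
    for k in range(t):
        x = x + np.linalg.matrix_power(A, t - k - 1) @ B @ u[k]
    return x

# Hypothetical two-state, single-input, single-output system
A = np.array([[0.9, 0.1], [0.0, 0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
x0 = np.array([1.0, -1.0])
u = np.ones((10, 1))                      # unit-step input, one channel

xs, ys = simulate(A, B, C, D, x0, u)
print(xs[7], closed_form_state(A, B, x0, u, 7))
```

In practice the recursion is what one would run; the closed form is mainly useful for analysis, since it exposes the powers of $A$ that reappear in the observability matrix and Markov parameters.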

Expanding this relation further over ‐samples and collecting terms, we obtain

(3.37)

where is the observability matrix and is a Toeplitz matrix [15] .

Shifting these relations in time () yields

(3.38)

leading to the vector input/output relation

(3.39)

Catenating these ‐vectors () of Eq. 3.39 to create a batch‐data (block Hankel) matrix over the ‐samples, we can obtain the “data equation,” that is, defining the block matrices as

(3.40)

(3.42)

and, therefore, we have the vector–matrix data equation that relates the system model to the data (input and output matrices)

(3.43)

This expression represents the fundamental relationship for the input–state–output of an LTI state‐space system. Next we discuss some of the pertinent system theoretic results that will enable us to comprehend much of the subspace realization algorithms to follow.
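A small numerical check can make the data equation concrete. The sketch below builds an extended observability matrix, a lower‐triangular block Toeplitz matrix of Markov parameters, and block Hankel data matrices, then verifies that the stacked outputs equal the observability term plus the Toeplitz term. It assumes NumPy; the function names, the system, the window length `K`, and the indexing convention (windows starting at sample 0) are all assumptions made for illustration and may differ from the notation of Eqs. 3.37 to 3.43.

```python
import numpy as np

def observability(A, C, K):
    """Stack C, CA, ..., CA^(K-1)."""
    return np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(K)])

def toeplitz_markov(A, B, C, D, K):
    """Lower-triangular block Toeplitz matrix: D on the diagonal, CA^(i-j-1)B below."""
    ny, nu = D.shape
    T = np.zeros((K * ny, K * nu))
    for i in range(K):
        for j in range(i + 1):
            blk = D if i == j else C @ np.linalg.matrix_power(A, i - j - 1) @ B
            T[i * ny:(i + 1) * ny, j * nu:(j + 1) * nu] = blk
    return T

def block_hankel(seq, K, J):
    """Column k holds the K stacked samples seq[k], ..., seq[k+K-1]."""
    return np.column_stack([seq[k:k + K].reshape(-1) for k in range(J)])

# Hypothetical system and random input record
A = np.array([[0.9, 0.1], [0.0, 0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.5]])
N, K = 30, 4
J = N - K + 1
rng = np.random.default_rng(0)
u = rng.standard_normal((N, 1))
x = np.zeros((N, 2))
x[0] = [1.0, -1.0]
y = np.zeros((N, 1))
for t in range(N):
    y[t] = C @ x[t] + D @ u[t]
    if t + 1 < N:
        x[t + 1] = A @ x[t] + B @ u[t]

Y = block_hankel(y, K, J)
U = block_hankel(u, K, J)
X = x[:J].T                               # one state column per window start
residual = Y - (observability(A, C, K) @ X + toeplitz_markov(A, B, C, D, K) @ U)
print(np.max(np.abs(residual)))
```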

3.4.2 Discrete Systems Theory

In this section we investigate the discrete state‐space model from a systems theoretic viewpoint. There are certain properties that a dynamic system must possess in order to assure a consistent representation of the dynamics under investigation. For instance, it can be shown [2] that a necessary requirement of a measurement system is that it is observable, that is, measurements of available variables or parameters of interest provide enough information to reconstruct the internal variables or states.

Mathematically, a system is said to be completely observable, if for any initial state, say , in the state‐space, there exists a finite such that knowledge of the input and the output is sufficient to specify uniquely. Recall that the linear state‐space representation of a discrete system is defined by the following set of equations:

with the corresponding measurement system or output defined by

Using this representation, the simplest example of an observable system is one in which each state is measured directly, therefore, and the measurement matrix is a matrix. In order to reconstruct from its measurements , then from the measurement system model, must be invertible. In this case, the system is said to be completely observable; however, if is not invertible, then the system is said to be unobservable.

The next level of complexity involves the solution to this same problem when is a matrix, then a pseudo‐inverse must be performed instead [ 1 , 2 ]. In the general case, the solution gets more involved because we are not just interested in reconstructing , but over all finite values of ; therefore, we must include the state model, that is, the dynamics as well.

With this motivation in mind, we now formally define the concept of observability. The solution to the state representation is governed by the state‐transition matrix, , where recall that the solution of the state equation is [3]

Therefore, premultiplying by the measurement matrix, the output relations are

(3.44)

or rearranging we define

(3.45)

The problem is to solve this resulting equation for the initial state; therefore, multiplying both sides by , we can infer the solution from the relation

Thus, the observability question now becomes under what conditions can this equation uniquely be solved for ? Equivalently, we are asking if the null space of is . It has been shown [ 2 ,4] that the following observability Gramian has the identical null space, that is,

(3.46)

which is equivalent to determining that is nonsingular or rank .

Further assuming that the system is LTI leads, over a finite time interval, to the observability matrix [4] given by

$$\mathcal{O} := \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{N_x - 1} \end{bmatrix} \tag{3.47}$$

Therefore, a necessary and sufficient condition for a system to be completely observable is that the rank of $\mathcal{O}$, written $\rho[\mathcal{O}]$, must be $N_x$. Thus, for the LTI case, checking that all of the measurements contain the essential information to reconstruct the states reduces to checking the rank of the observability matrix. Although this is a useful mathematical concept, it is primarily used as a rule of thumb in the analysis of complicated measurement systems.

Analogous to the system theoretic property of observability is that of controllability, which is concerned with the effect of the input on the states of the dynamic system. A discrete system is said to be completely controllable if for any , , there exists an input sequence, such that the solution to the state equations with initial condition is for some finite . Following the same approach as for observability, we obtain that the controllability Gramian defined by

(3.48)

is nonsingular or

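Again for the LTI system, over a finite time interval the controllability matrix defined by

$$\mathcal{C} := \begin{bmatrix} B & AB & \cdots & A^{N_x - 1} B \end{bmatrix} \tag{3.49}$$

must satisfy the rank condition, $\rho[\mathcal{C}] = N_x$, for the system to be completely controllable [4].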
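Both rank tests are easy to try numerically. The sketch below, assuming NumPy, builds the observability and controllability matrices for a hypothetical diagonal two‐mode system and shows how a measurement matrix that ignores one mode loses rank.

```python
import numpy as np

def observability_matrix(A, C):
    """Stack C, CA, ..., CA^(n-1) for an n-state system."""
    n = A.shape[0]
    return np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])

def controllability_matrix(A, B):
    """Concatenate B, AB, ..., A^(n-1)B for an n-state system."""
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])

# Hypothetical system with two decoupled modes
A = np.array([[0.9, 0.0], [0.0, 0.5]])
B = np.array([[1.0], [1.0]])
C_full = np.array([[1.0, 1.0]])    # measurement sees both modes
C_blind = np.array([[1.0, 0.0]])   # measurement never sees the second mode

rank = np.linalg.matrix_rank
print(rank(observability_matrix(A, C_full)))    # full rank: observable
print(rank(observability_matrix(A, C_blind)))   # rank deficient: unobservable
print(rank(controllability_matrix(A, B)))       # full rank: controllable
```

With noisy or poorly scaled models, inspecting the singular values of these matrices is usually more informative than the integer rank alone.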

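If we continue with the LTI system description, we know from $\mathcal{Z}$‐transform theory that the discrete transfer function can be represented by an infinite power series [4], that is,

$$H(z) = C (zI - A)^{-1} B + D = \sum_{k=0}^{\infty} H_k z^{-k} \quad \text{for } H_k = C A^{k-1} B, \ k \ge 1 \tag{3.50}$$

where $H_k$ is the unit impulse response matrix with $H_0 = D$. Here $\{H_k\}$ is defined as the Markov sequence with the corresponding set of Markov parameters given by the embedded system $\Sigma_{ABCD} = \{A, B, C, D\}$.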

If the MIMO transfer function matrix is available, then the impulse response (matrix) sequence can be determined simply by long division of each matrix component transfer function, that is, . Consider the following example to illustrate this calculation that will prove useful in the classical realization theory of Chapter 6.

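We know from the Cayley–Hamilton theorem of linear algebra that an $N_x$‐dimensional matrix satisfies its own characteristic equation [4]

$$A^{N_x} + \alpha_1 A^{N_x - 1} + \alpha_2 A^{N_x - 2} + \cdots + \alpha_{N_x - 1} A + \alpha_{N_x} I = 0 \tag{3.51}$$

Therefore, pre‐ and postmultiplying this relation by $C A^{k-1}$ and $B$, which involve the measurement and input transmission matrices $C$ and $B$, respectively, it follows that the Markov parameters satisfy the recursion for the $N_x$‐degree characteristic polynomial

$$H_{N_x + k} = -\sum_{j=1}^{N_x} \alpha_j H_{N_x + k - j} \quad \text{for } k \ge 1 \tag{3.52}$$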

This result will have critical realizability conditions subsequently in Chapter 6.
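The Markov parameter recursion implied by the Cayley–Hamilton theorem can be verified directly. This is a minimal sketch, assuming NumPy, a hypothetical two‐state system, and Markov parameters defined as $H_k = C A^{k-1} B$ for $k \ge 1$; `np.poly` is used to obtain the characteristic polynomial coefficients of the system matrix.

```python
import numpy as np

# Hypothetical two-state system
A = np.array([[0.9, 0.1], [0.0, 0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
n = A.shape[0]

# Coefficients [1, alpha_1, ..., alpha_n] of det(zI - A)
alpha = np.poly(A)

def markov(k):
    """Markov parameter H_k = C A^(k-1) B for k >= 1."""
    return C @ np.linalg.matrix_power(A, k - 1) @ B

# Recursion: H_{n+k} = -sum_{j=1}^{n} alpha_j H_{n+k-j}, k >= 1
max_error = 0.0
for k in range(1, 6):
    predicted = -sum(alpha[j] * markov(n + k - j) for j in range(1, n + 1))
    max_error = max(max_error, float(np.max(np.abs(predicted - markov(n + k)))))
print(alpha, max_error)
```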

The problem of determining the internal description from the external description ( or ) of Eq. 3.32 is called the realization problem. Out of all possible realizations, having the same Markov parameters, those of smallest dimension are defined as minimal realizations. Thus, the dimension of the minimal realization is identical to the degree of the characteristic polynomial (actually the minimal polynomial for multivariable systems) or equivalently the degree of the transfer function (number of system poles).

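In order to develop these relations, we define the Hankel matrix by

$$\mathcal{H}_{K, K'} := \begin{bmatrix} H_1 & H_2 & \cdots & H_{K'} \\ H_2 & H_3 & \cdots & H_{K'+1} \\ \vdots & \vdots & & \vdots \\ H_K & H_{K+1} & \cdots & H_{K+K'-1} \end{bmatrix} \tag{3.53}$$

Suppose the dimension of the system is $N_x$, then the Hankel matrix could be constructed such that $K = K' = N_x$ using $2N_x$ impulse response matrices. Knowledge of the order $N_x$ indicates the minimum number of terms required to exactly “match” the Markov sequence and extract the Markov parameters. Also, $\rho(\mathcal{H}_{K,K'}) = N_x$ is the dimension of the minimal realization. If we did not know the dimension of the system, then we would let (in theory) $K, K' \to \infty$ and determine the rank of $\mathcal{H}_{\infty, \infty}$. Therefore, the minimal dimension of an “unknown” system is the rank of the Hankel matrix.

In order for a system to be minimal, it must be completely controllable and completely observable [4]. This can be seen from the fact that the Hankel matrix factors as

$$\mathcal{H}_{K, K'} = \begin{bmatrix} CB & CAB & \cdots & CA^{K'-1}B \\ \vdots & \vdots & & \vdots \\ CA^{K-1}B & CA^{K}B & \cdots & CA^{K+K'-2}B \end{bmatrix} = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{K-1} \end{bmatrix} \begin{bmatrix} B & AB & \cdots & A^{K'-1}B \end{bmatrix} \tag{3.54}$$

or simply

$$\mathcal{H}_{K, K'} = \mathcal{O}_K \times \mathcal{C}_{K'} \tag{3.55}$$

From this factorization, it follows that $\rho(\mathcal{H}_{K,K'}) = \min[\rho(\mathcal{O}_K), \rho(\mathcal{C}_{K'})] = N_x$ for a completely controllable and observable system. Therefore, we see that the properties of controllability and observability are carefully woven into that of minimality, and testing the rank of the Hankel matrix yields the dimensionality of the underlying dynamic system. This fact will prove crucial when we “identify” a system, $\Sigma_{ABCD} = \{A, B, C, D\}$, from noisy measurement data. We shall discuss realization theory in more detail in Chapter 6 using these results.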
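The rank property and the factorization can be confirmed on a small example. The sketch below, assuming NumPy and a hypothetical minimal two‐state system, builds a $4 \times 4$ Hankel matrix from Markov parameters $H_k = C A^{k-1} B$ and compares it with the product of the extended observability and controllability matrices.

```python
import numpy as np

# Hypothetical minimal (controllable and observable) two-state system
A = np.array([[0.9, 0.1], [0.0, 0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
K = 4                                    # deliberately larger than the order

def markov(k):
    """Markov parameter H_k = C A^(k-1) B for k >= 1."""
    return C @ np.linalg.matrix_power(A, k - 1) @ B

# Block Hankel matrix whose (i, j) block is H_{i+j+1}, i, j = 0..K-1
H = np.block([[markov(i + j + 1) for j in range(K)] for i in range(K)])

O_K = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(K)])
C_K = np.hstack([np.linalg.matrix_power(A, j) @ B for j in range(K)])

order = np.linalg.matrix_rank(H)
print(order, np.max(np.abs(H - O_K @ C_K)))
```

With noisy impulse response data the Hankel matrix is generally full rank, so in practice the order is judged from a gap in its singular values rather than from an exact rank.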

3.4.3 Equivalent Linear Systems

Equivalent systems are based on the concept that there are an infinite number of state‐space systems that possess “identical” input/output responses. In the state‐space, this is termed a set of coordinates; that is, we can change the state vectors describing a system by a similarity transformation such that the system matrices are transformed to a different set of coordinates by the transformation matrix [ 1 –5], that is,

where we have

yields an “equivalent” system from an input/output perspective, that is, the transfer functions are identical

as well as the corresponding impulse response matrices

There do exist unique representations of state‐space systems termed “canonical forms,” which we will see in Section 3.7 for SISO systems as well as for MIMO systems discussed subsequently in Chapter 6. In fact, much of control theory is based on designing controllers in a modal coordinate system with a diagonal system or modal matrix and then transforming the modal system back to the physical coordinates using the inverse transform [4].
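The input/output equivalence under a change of coordinates can be demonstrated numerically. The sketch below assumes NumPy and one particular convention, namely new state $\tilde{x} = T x$ so that $\tilde{A} = T A T^{-1}$, $\tilde{B} = T B$, $\tilde{C} = C T^{-1}$; the text's own convention may place $T$ and $T^{-1}$ the other way around. The system and the transformation matrix are hypothetical.

```python
import numpy as np

# Hypothetical system and an arbitrary invertible transformation
A = np.array([[0.9, 0.1], [0.0, 0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
T = np.array([[1.0, 1.0], [0.0, 2.0]])
T_inv = np.linalg.inv(T)

# Assumed convention: x_tilde = T x
A_t = T @ A @ T_inv
B_t = T @ B
C_t = C @ T_inv

def markov(A, B, C, k):
    """Markov parameter H_k = C A^(k-1) B for k >= 1."""
    return C @ np.linalg.matrix_power(A, k - 1) @ B

same_response = all(
    np.allclose(markov(A, B, C, k), markov(A_t, B_t, C_t, k)) for k in range(1, 10)
)
same_poles = np.allclose(np.sort(np.linalg.eigvals(A).real),
                         np.sort(np.linalg.eigvals(A_t).real))
print(same_response, same_poles)
```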

3.4.4 Stable Linear Systems

Stability of a linear system can also be cast in conjunction with the properties of controllability and observability [4–10]. For a homogeneous, discrete system, it has been shown that asymptotic stability follows directly from the eigenvalues of the system matrix expressed as $\lambda(A)$. These eigenvalues must lie within the unit circle or equivalently $|\lambda_i(A)| < 1$ for all $i$.

Besides determining eigenvalues of $A$ or equivalently factoring the characteristic equation to extract the system poles, another way to evaluate this condition follows directly from Lyapunov stability theory [9, 10] that incorporates observability.

An observable system is stable if and only if the Lyapunov equation

$$P = A' P A + C' C \tag{3.56}$$

has a unique positive definite solution, $P$. The proof is available in [9] or [10]. As we shall see, this is an important property that will appear in subspace identification theory when extracting a so‐called balanced realization as well as a unique stochastic realization [10].
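The Lyapunov test can be tried on small systems by solving the equation as a linear system in the entries of $P$. The sketch below assumes NumPy and the form $P = A' P A + C' C$ of the Lyapunov equation; the Kronecker‐product solve is a simple approach suited to small state dimensions rather than a production solver, and the two example systems are hypothetical.

```python
import numpy as np

def solve_discrete_lyapunov(A, Q):
    """Solve P = A' P A + Q by vectorizing: (I - A' kron A') vec(P) = vec(Q)."""
    n = A.shape[0]
    lhs = np.eye(n * n) - np.kron(A.T, A.T)
    return np.linalg.solve(lhs, Q.reshape(-1)).reshape(n, n)

def is_positive_definite(P):
    """Check the eigenvalues of the symmetric part of P."""
    return bool(np.all(np.linalg.eigvalsh(0.5 * (P + P.T)) > 0.0))

# Hypothetical stable, observable system
A = np.array([[0.9, 0.1], [0.0, 0.5]])
C = np.array([[1.0, 0.0]])
P = solve_discrete_lyapunov(A, C.T @ C)
print(is_positive_definite(P))            # stable: P is positive definite

# Hypothetical unstable scalar system (pole at z = 2)
A_u = np.array([[2.0]])
C_u = np.array([[1.0]])
P_u = solve_discrete_lyapunov(A_u, C_u.T @ C_u)
print(P_u, is_positive_definite(P_u))     # solution exists but is negative
```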

This completes the section on discrete systems theory. It should be noted that all of the properties discussed in this section exist for continuous‐time systems as well (see [2] for details).

3.5 Gauss–Markov State‐Space Models

In this section, we extend the state‐space representation to incorporate random inputs or noise sources along with random initial conditions. The discrete‐time Gauss–Markov model will be applied extensively throughout this text.

3.5.1 Discrete‐Time Gauss–Markov Models

Here we investigate the case when random inputs are applied to a discrete state‐space system with random initial conditions. If the excitation is a random signal, then the state is also random. Restricting the input to be deterministic $u(t-1)$ and the noise to be zero‐mean, white, Gaussian $w(t-1)$, the Gauss–Markov model evolves as

$$x(t) = A(t-1)\, x(t-1) + B(t-1)\, u(t-1) + W(t-1)\, w(t-1) \tag{3.57}$$

where $w \sim \mathcal{N}(0, R_{ww}(t-1))$ and $x(0) \sim \mathcal{N}(\overline{x}(0), P(0))$.

The solution to the Gauss–Markov equations can easily be obtained by induction to give

$$x(t) = \Phi(t, k)\, x(k) + \sum_{i=k}^{t-1} \Phi(t, i+1)\, B(i)\, u(i) + \sum_{i=k}^{t-1} \Phi(t, i+1)\, W(i)\, w(i) \tag{3.58}$$

which is Markov depending only on the previous state. Since $x(t)$ is just a linear transformation of Gaussian processes, it is also Gaussian. Thus, we can represent a Gauss–Markov process easily employing the state‐space models.

When the measurement model is also included, we have

$$y(t) = C(t)\, x(t) + D(t)\, u(t) + v(t) \tag{3.59}$$

where $v \sim \mathcal{N}(0, R_{vv}(t))$. The model is shown diagrammatically in Figure 3.1.

Figure 3.1 Gauss–Markov model of a discrete process.

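Since the Gauss–Markov model of Eq. 3.57 is characterized by a Gaussian distribution, it is completely specified (statistically) by its mean and variance. Therefore, if we take the expectation of Eqs. 3.57 and 3.59 respectively, we obtain the state mean vector $m_x$ as

$$m_x(t) = A(t-1)\, m_x(t-1) + B(t-1)\, u(t-1) \tag{3.60}$$

and the measurement mean vector $m_y$ as

$$m_y(t) = C(t)\, m_x(t) + D(t)\, u(t) \tag{3.61}$$

The state variance³ $P(t) := \operatorname{var}\{x(t)\}$ is given by the discrete Lyapunov equation:

$$P(t) = A(t-1)\, P(t-1)\, A'(t-1) + W(t-1)\, R_{ww}(t-1)\, W'(t-1) \tag{3.62}$$

and the measurement variance, $R_{yy}(t) := \operatorname{var}\{y(t)\}$, is

$$R_{yy}(t) = C(t)\, P(t)\, C'(t) + R_{vv}(t) \tag{3.63}$$

Similarly, it can be shown that the state covariance propagates according to the following equations:

$$P(t, k) = \begin{cases} \Phi(t, k)\, P(k), & t \ge k \\ P(t)\, \Phi'(k, t), & t \le k \end{cases} \tag{3.64}$$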

We summarize the Gauss–Markov and corresponding statistical models in Table 3.1.

If we restrict the Gauss–Markov model to the stationary case, then

and the variance equations become

with

(3.65)

At steady state (), we have

and therefore, the measurement covariance relations become

(3.66)

By induction, it can be shown that

(3.67)

The measurement power spectrum is easily obtained by taking the ‐transform of this equation to obtain

(3.68)

where

with

Thus, using the spectrum is given by

(3.69)

So we see that the Gauss–Markov state‐space model enables us to have a more general representation of a multichannel stochastic signal. In fact, we are able to easily handle the multichannel and nonstationary statistical cases within this framework. Generalizations are also possible with the vector models, but those forms become quite complicated and require some knowledge of multivariable systems theory and canonical forms (see [1] for details). Before we leave this subject, let us consider a simple input/output example with Gauss–Markov models.

Example 3.2

Consider the following difference equation driven by random (white) noise:

The corresponding state‐space representation is obtained as

Taking ‐transforms (ignoring the randomness), we obtain the transfer function

Using Eq. 3.62, the variance equation for the above model is

Assume the process is stationary, then for all and solving for it follows that

Therefore,

for .

Choosing gives . Taking ‐transforms the discrete power spectrum is given by

Therefore, we conclude that for stationary processes these models are equivalent.

Now if we assume a nonstationary process and let , and , then the Gauss–Markov model is given by

The corresponding statistics are given by the mean relations

and the variance equations

We apply the simulator available in MATLAB [15] to obtain a 100‐sample realization of the process. The results are shown in Figure 3.2a through c. In Figure 3.2 a,b we see the mean and simulated states with the corresponding confidence interval about the mean, that is,

and

Using the above confidence interval, we expect 95% of the samples to lie within the bounds. From the figure, we see that only 2 of the 100 samples exceed this bound, indicating a statistically acceptable simulation. We observe similar results for the simulated measurements. The steady‐state variance is given by

Therefore, we expect 95% of the measurement samples to lie within the corresponding bounds at steady state. This completes the example.

Figure 3.2 Gauss–Markov simulation of first‐order process: (a) state/measurement means, (b) state with 95% confidence interval about its mean, and (c) measurement with 95% confidence interval about its mean.

It should be noted that if more than 5% of the samples fall outside the bounds (that is, fewer than 95% lie within), then we must select a new seed and reexecute the simulation until the bound conditions are satisfied, signifying a valid Gauss–Markov simulation.
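The simulation and bound check described above can be sketched in a few lines. The following assumes NumPy and a scalar Gauss–Markov model with no deterministic input; all parameter values, the seed, and the record length are hypothetical and are not those of Example 3.2 (a longer record than 100 samples is used so that the inside‐the‐bounds fraction is less sensitive to the seed).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical scalar model: x(t) = a x(t-1) + w(t-1), y(t) = c x(t) + v(t)
a, c = 0.5, 2.0
R_ww, R_vv = 0.1, 0.5
m0, P0 = 2.0, 0.01
N = 1000

m = np.zeros(N)      # state mean
P = np.zeros(N)      # state variance
x = np.zeros(N)      # one simulated realization
m[0], P[0] = m0, P0
x[0] = m0 + np.sqrt(P0) * rng.standard_normal()
for t in range(1, N):
    m[t] = a * m[t - 1]                       # mean propagation
    P[t] = a * P[t - 1] * a + R_ww            # variance (Lyapunov) recursion
    x[t] = a * x[t - 1] + np.sqrt(R_ww) * rng.standard_normal()

y = c * x + np.sqrt(R_vv) * rng.standard_normal(N)
m_y = c * m                                   # measurement mean
R_yy = c * P * c + R_vv                       # measurement variance

# Fraction of state samples inside the 95% bounds m(t) +/- 1.96 sqrt(P(t))
frac_inside = np.mean(np.abs(x - m) <= 1.96 * np.sqrt(P))
P_ss = R_ww / (1.0 - a ** 2)                  # steady-state variance
print(frac_inside, P[-1], P_ss)
```

If the printed fraction fell well below 0.95, the reseed‐and‐rerun rule stated above would apply.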

3.6 Innovations Model

In this section, we briefly develop the innovations model that is related to the Gauss–Markov representation. The significance of this model will be developed throughout the text, but we take the opportunity now to show its relationship to the basic Gauss–Markov representation. We start by extending the original Gauss–Markov representation to the correlated process and measurement noise case and then show how the innovations model is a special case of this structure.

The standard Gauss–Markov model for correlated process and measurement noise is given by

(3.70)

where and

Here we observe that in the standard Gauss–Markov model, the block covariance matrix, , is full with cross‐covariance matrices on its off‐diagonals. The usual standard model assumes that they are null (uncorrelated). To simulate a system with correlated and is more complicated using this form of the Gauss–Markov model because must first be factored such that

(3.71)

where are matrix square roots [6,7]. Once the factorization is performed, then the correlated noise is synthesized “coloring” the uncorrelated noise sources, and as

(3.72)
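The factor‐then‐color procedure can be illustrated with a Cholesky factor, which is one valid choice of matrix square root (the text does not say which factorization is intended). The sketch assumes NumPy and scalar process and measurement noises; the block covariance values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical block covariance [[R_ww, R_wv], [R_vw, R_vv]] for scalar w and v
R = np.array([[1.0, 0.3],
              [0.3, 0.5]])

# Lower-triangular square root: R = L L'
L = np.linalg.cholesky(R)

# Uncorrelated, unit-variance white sources, one row per noise channel
white = rng.standard_normal((2, 50_000))

# "Color" the white sources to obtain correlated w and v
colored = L @ white
w, v = colored

R_hat = np.cov(colored)          # sample covariance, rows are variables
print(R_hat)
```

Any other factor $S$ with $S S' = R$ (for example, one from an eigendecomposition) would produce noise with the same second‐order statistics.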

The innovations model is a constrained version of the correlated Gauss–Markov characterization. If we assume that is a zero‐mean, white, Gaussian sequence, that is, , then the innovations model [11– 15 ] evolves as

(3.73)

where is the ‐dimensional innovations vector and is the weighting matrix with the innovations covariance specified by

It is important to note that the innovations model has implications in Wiener–Kalman filtering (spectral factorization) because can be represented in factored or square‐root form () directly in terms of the weight and innovations covariance matrix as

(3.74)

Comparing the innovations model to the Gauss–Markov model, we see that they are both equivalent to the case when and are correlated. Next we show the equivalence of the various model sets to this family of state‐space representations.

3.7 State‐Space Model Structures

In this section, we discuss special state‐space structures usually called “canonical forms” in the literature, since they represent unique state constructs that are particularly useful. We will confine the models to SISO forms, while the more complicated multivariable structures will be developed in Chapter 6. Here we will first recall the autoregressive moving average model with exogenous inputs (ARMAX) of Chapter 2 and then develop its equivalent representation in state‐space form.

3.7.1 Time‐Series Models

Time‐series models are particularly useful representations used frequently by statisticians and signal processors to represent time sequences when no physics is available to employ directly. They form the class of black‐box or gray‐box models [15] , which are useful in predicting data. These models have an input/output structure, but they can be transformed to an equivalent state‐space representation. Each model set has its own advantages: the input/output models are easy to use, while the state‐space models are easily generalized and usually evolve when physical phenomenology can be described in a model‐based sense [15] .

We have a difference equation relating the output sequence to the input sequence in terms of the backward shift operator as ⁴

(3.75)

(3.76)

Recall from Section 2.4 that when the system is excited by random inputs, the model can be represented by an ARMAX model, abbreviated ARMAX($N_a, N_b, N_c$).

(3.77)

where , , , are polynomials in the backward‐shift operator such that and is a white noise source with coloring filter

Since the ARMAX model is used to characterize a random signal, we are interested in its statistical properties (see [15] for details). We summarize these properties in Table 3.2.

This completes the section on ARMAX models.

3.7.2 State‐Space and Time‐Series Equivalence Models

In this section, we show the equivalence between the ARMAX and state‐space models (for scalar processes). That is, we show how to obtain the state‐space model given the ARMAX models by inspection. We choose particular coordinate systems in the state‐space (canonical form) and obtain a relationship between entries of the state‐space system to coefficients of the ARMAX model. An example is presented that shows how these models can be applied to realize a random signal. First, we consider the ARMAX to state‐space transformation.

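Recall from Eq. 3.77 that the general difference equation form of the ARMAX model is given by

$$y(t) = -\sum_{i=1}^{N_a} a_i\,y(t-i) + \sum_{i=0}^{N_b} b_i\,u(t-i) + \sum_{i=0}^{N_c} c_i\,e(t-i) \tag{3.78}$$

or equivalently in the frequency domain as

$$Y(z) = \left[\frac{b_0 + b_1 z^{-1} + \cdots + b_{N_b} z^{-N_b}}{1 + a_1 z^{-1} + \cdots + a_{N_a} z^{-N_a}}\right] U(z) + \left[\frac{c_0 + c_1 z^{-1} + \cdots + c_{N_c} z^{-N_c}}{1 + a_1 z^{-1} + \cdots + a_{N_a} z^{-N_a}}\right] E(z) \tag{3.79}$$

where $N_a \geq N_b$ and $N_a \geq N_c$, and $\{e(t)\}$ is a zero‐mean, white sequence with spectrum given by $R_{ee}$.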

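It is straightforward to show (see [5]) that the ARMAX model can be represented in observer canonical form:

$$\begin{aligned} x(t) &= A_0\,x(t-1) + B_0\,u(t-1) + W_0\,e(t-1) \\ y(t) &= C_0'\,x(t) + b_0\,u(t) + c_0\,e(t) \end{aligned} \tag{3.80}$$

where $x$, $u$, $e$, and $y$ are the $N_a$‐state vector, scalar input, noise, and output with

$$A_0 := \begin{bmatrix} 0 & \cdots & 0 & -a_{N_a} \\ 1 & & & -a_{N_a-1} \\ & \ddots & & \vdots \\ 0 & & 1 & -a_1 \end{bmatrix}, \quad B_0 := \begin{bmatrix} b_{N_a} - a_{N_a} b_0 \\ \vdots \\ b_1 - a_1 b_0 \end{bmatrix}, \quad W_0 := \begin{bmatrix} c_{N_a} - a_{N_a} c_0 \\ \vdots \\ c_1 - a_1 c_0 \end{bmatrix}, \quad C_0' := [\,0 \;\cdots\; 0 \;\; 1\,]$$

Noting this structure we see that each of the matrix or vector elements can be determined from the relations (with $i = 1, \ldots, N_a$ the row index)

$$\begin{aligned} (A_0)_{i,N_a} &= -a_{N_a+1-i} \\ (B_0)_i &= b_{N_a+1-i} - a_{N_a+1-i}\,b_0 \\ (W_0)_i &= c_{N_a+1-i} - a_{N_a+1-i}\,c_0 \\ (C_0)_i &= \delta(N_a - i) \end{aligned} \tag{3.81}$$

where

$$b_i = 0 \ \text{for} \ i > N_b, \qquad c_i = 0 \ \text{for} \ i > N_c, \qquad \delta(i-j) \ \text{is the Kronecker delta}$$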

Consider the following example to illustrate these relations.
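As a numerical check on the observer canonical form (this is a sketch, not the worked example of the text), the following plain‐Python code builds $A_0$, $B_0$, $W_0$, $C_0$ from ARMAX coefficients and compares the state‐space output with the direct difference equation. The function names, the coefficient values, and the zero initial state are all assumptions made for illustration.

```python
def armax_difference(a, b, c, u, e):
    """Direct ARMAX recursion; a = [a_1..a_Na], b = [b_0..], c = [c_0..]."""
    y = []
    for t in range(len(u)):
        acc = -sum(a[i - 1] * y[t - i] for i in range(1, len(a) + 1) if t - i >= 0)
        acc += sum(b[i] * u[t - i] for i in range(len(b)) if t - i >= 0)
        acc += sum(c[i] * e[t - i] for i in range(len(c)) if t - i >= 0)
        y.append(acc)
    return y


def armax_to_observer_form(a, b, c):
    """Return (A0, B0, W0, C0, b0, c0); requires Nb <= Na and Nc <= Na."""
    n = len(a)
    if len(b) > n + 1 or len(c) > n + 1:
        raise ValueError("need Nb <= Na and Nc <= Na")
    bp = list(b) + [0.0] * (n + 1 - len(b))   # b_i = 0 for i > Nb
    cp = list(c) + [0.0] * (n + 1 - len(c))   # c_i = 0 for i > Nc
    b0, c0 = bp[0], cp[0]
    A0 = [[0.0] * n for _ in range(n)]
    for r in range(1, n):
        A0[r][r - 1] = 1.0                    # identity block below the first row
    for r in range(n):
        A0[r][n - 1] = -a[n - 1 - r]          # last column: -a_Na, ..., -a_1
    B0 = [bp[n - r] - a[n - 1 - r] * b0 for r in range(n)]
    W0 = [cp[n - r] - a[n - 1 - r] * c0 for r in range(n)]
    C0 = [0.0] * (n - 1) + [1.0]
    return A0, B0, W0, C0, b0, c0


def simulate_observer_form(model, u, e):
    """Run x(t) = A0 x(t-1) + B0 u(t-1) + W0 e(t-1), y(t) = C0'x(t) + b0 u(t) + c0 e(t)."""
    A0, B0, W0, C0, b0, c0 = model
    n = len(A0)
    x = [0.0] * n                              # zero initial state (assumption)
    y = []
    for t in range(len(u)):
        if t > 0:
            x = [sum(A0[r][k] * x[k] for k in range(n))
                 + B0[r] * u[t - 1] + W0[r] * e[t - 1] for r in range(n)]
        y.append(sum(C0[k] * x[k] for k in range(n)) + b0 * u[t] + c0 * e[t])
    return y


# Hypothetical ARMAX(2, 1, 2) coefficients and short test sequences.
a, b, c = [-1.5, 0.7], [1.0, 0.5], [1.0, 0.3, 0.1]
u = [1.0, 0.0, -1.0, 2.0, 0.5, 0.0, 1.0, -0.5]
e = [0.2, -0.1, 0.4, 0.0, -0.3, 0.1, 0.0, 0.25]
model = armax_to_observer_form(a, b, c)
max_gap = max(abs(p - q) for p, q in
              zip(armax_difference(a, b, c, u, e), simulate_observer_form(model, u, e)))
```

The value `max_gap` is the largest difference between the two output sequences; it should be at the level of floating‐point round‐off if the canonical form has been assembled consistently.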

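It is important to realize that if we assume that $\{e(t)\}$ is Gaussian, then the ARMAX model is equivalent to the innovations representation of Section 3.6, that is,

$$\begin{aligned} x(t) &= A\,x(t-1) + B\,u(t-1) + K\,e(t-1) \\ y(t) &= C\,x(t) + D\,u(t) + e(t) \end{aligned} \tag{3.83}$$

where, in this case, $K \rightarrow W_0$, $D \rightarrow b_0$, and $1 \rightarrow c_0$. Also, the corresponding covariance matrix becomes

$$R^{*}_{ee} := \operatorname{cov}\begin{bmatrix} W_0\,e(t) \\ e(t) \end{bmatrix} = \begin{bmatrix} W_0 R_{ee} W_0' & W_0 R_{ee} \\ R_{ee} W_0' & R_{ee} \end{bmatrix}$$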

This completes the discussion on the equivalence of the general ARMAX to state‐space. Next let us develop the state‐space equivalent models for some of the special cases of the ARMAX model presented in Section 2.4.

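We begin with the moving average (MA) model

$$y(t) = \sum_{i=0}^{N_c} c_i\,e(t-i)$$

Define the state variable as

$$x_i(t-1) := e(t-i-1), \quad i = 1, \ldots, N_c \tag{3.84}$$

and therefore,

$$x_i(t) = e(t-i) = x_{i-1}(t-1), \quad i = 2, \ldots, N_c \tag{3.85}$$

Expanding this expression, we obtain

$$\begin{aligned} x_1(t) &= e(t-1) \\ x_2(t) &= e(t-2) = x_1(t-1) \\ &\;\;\vdots \\ x_{N_c}(t) &= e(t-N_c) = x_{N_c-1}(t-1) \end{aligned} \tag{3.86}$$

or in vector–matrix form

$$\begin{aligned} x(t) &= \begin{bmatrix} 0 & \cdots & 0 & 0 \\ 1 & \cdots & 0 & 0 \\ \vdots & \ddots & \vdots & \vdots \\ 0 & \cdots & 1 & 0 \end{bmatrix} x(t-1) + \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} e(t-1) \\ y(t) &= [\,c_1 \;\; c_2 \;\cdots\; c_{N_c}\,]\,x(t) + c_0\,e(t) \end{aligned} \tag{3.87}$$

Thus, the general form for the MA state‐space is given by

$$\begin{aligned} x(t) &= \begin{bmatrix} 0' & 0 \\ I_{N_x-1} & 0 \end{bmatrix} x(t-1) + b\,e(t-1) \\ y(t) &= c'\,x(t) + c_0\,e(t) \end{aligned} \tag{3.88}$$

with $N_x = N_c$, $b, c \in \mathcal{R}^{N_x \times 1}$.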
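The MA construction can be sketched in a few lines of plain Python. The code below assumes the state definition $x_i(t) = e(t-i)$, a coefficient list `c = [c_0, c_1, ..., c_Nc]` with $N_c \geq 1$, and a zero initial state; the names `ma_state_space` and `simulate_ma` are illustrative only.

```python
def ma_state_space(c):
    """Build (A, b, cvec, c0) for an MA model with c = [c_0, c_1, ..., c_Nc].

    A is the lower shift matrix, b = [1, 0, ..., 0]', cvec = [c_1, ..., c_Nc].
    """
    n = len(c) - 1
    A = [[1.0 if r == k + 1 else 0.0 for k in range(n)] for r in range(n)]
    b = [1.0] + [0.0] * (n - 1)
    return A, b, list(c[1:]), c[0]


def simulate_ma(c, e):
    """Run x(t) = A x(t-1) + b e(t-1), y(t) = cvec' x(t) + c0 e(t) from x = 0."""
    A, b, cvec, c0 = ma_state_space(c)
    n = len(b)
    x = [0.0] * n
    y = []
    for t in range(len(e)):
        if t > 0:
            x = [sum(A[r][k] * x[k] for k in range(n)) + b[r] * e[t - 1]
                 for r in range(n)]
        y.append(sum(cvec[k] * x[k] for k in range(n)) + c0 * e[t])
    return y


# A unit impulse in e(t) reads the MA coefficients back out, one per sample.
impulse_out = simulate_ma([1.0, 0.5, 0.25], [1.0, 0.0, 0.0, 0.0])
```

Driving the model with a unit impulse is a quick sanity check: since an MA model is a finite impulse response filter, the output should reproduce the coefficient list followed by zeros.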

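Next consider the autoregressive (AR) model (all‐pole) given by

$$y(t) = -\sum_{i=1}^{N_a} a_i\,y(t-i) + \sigma\,e(t) \tag{3.89}$$

Here the state vector is defined by $x_i(t) := y(t - N_a + i)$ and therefore, $x_i(t) = x_{i+1}(t-1)$, $i = 1, \ldots, N_a - 1$, with $x_{N_a}(t) = y(t)$. Expanding over $i$, we obtain the vector–matrix state‐space model (the white driving sequence is indexed one sample earlier to match the Gauss–Markov form; its statistics are unchanged)

$$\begin{aligned} x(t) &= \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_{N_a} & -a_{N_a-1} & -a_{N_a-2} & \cdots & -a_1 \end{bmatrix} x(t-1) + \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ \sigma \end{bmatrix} e(t-1) \\ y(t) &= [\,0 \;\; 0 \;\cdots\; 1\,]\,x(t) \end{aligned} \tag{3.90}$$

In general, we have the AR (all‐pole) state‐space model

$$\begin{aligned} x(t) &= \begin{bmatrix} 0 & I_{N_a-1} \\ -a_{N_a} & -a_{N_a-1} \;\cdots\; -a_1 \end{bmatrix} x(t-1) + b\,e(t-1) \\ y(t) &= c'\,x(t) \end{aligned} \tag{3.91}$$

with $N_x = N_a$, $b, c \in \mathcal{R}^{N_x \times 1}$.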
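A minimal plain‐Python sketch of the AR (all‐pole) state‐space structure follows. It assumes the companion form with the $-a_i$ coefficients in the last row, a noise gain `sigma` entering the last state, the driving noise indexed as $e(t-1)$ in the state equation, and a zero initial state; the function names are illustrative.

```python
def ar_state_space(a, sigma):
    """Build (A, b, cvec) for an AR model with a = [a_1, ..., a_Na]."""
    n = len(a)
    A = [[0.0] * n for _ in range(n)]
    for r in range(n - 1):
        A[r][r + 1] = 1.0                           # superdiagonal shift
    A[n - 1] = [-a[n - 1 - k] for k in range(n)]    # last row: -a_Na, ..., -a_1
    b = [0.0] * (n - 1) + [sigma]
    cvec = [0.0] * (n - 1) + [1.0]                  # output is the last state
    return A, b, cvec


def simulate_ar(a, sigma, e):
    """Run x(t) = A x(t-1) + b e(t-1), y(t) = cvec' x(t) from x = 0."""
    A, b, cvec = ar_state_space(a, sigma)
    n = len(a)
    x = [0.0] * n
    y = []
    for t in range(len(e)):
        if t > 0:
            x = [sum(A[r][k] * x[k] for k in range(n)) + b[r] * e[t - 1]
                 for r in range(n)]
        y.append(sum(cvec[k] * x[k] for k in range(n)))
    return y


# Second-order example: y(t) = y(t-1) - 0.5 y(t-2) + 2 e(t-1), unit impulse in e.
ar_impulse = simulate_ar([-1.0, 0.5], 2.0, [1.0, 0.0, 0.0, 0.0, 0.0])
```

Because the noise is applied with a one‐sample delay in this sketch, the impulse response starts at $t = 1$ rather than $t = 0$.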

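Another useful state‐space representation is the normal form that evolves by performing a partial fraction expansion of a rational discrete transfer function model (ARMA) to obtain

$$Y(z) = \sum_{i=1}^{N_p} \frac{R_i}{z - p_i}\,E(z) \tag{3.92}$$

for $\{R_i, p_i\}$, $i = 1, \ldots, N_p$, the set of residues and poles of $H(z)$. Note that the normal form model is the decoupled or parallel system representation based on the following set of relations:

$$y_i(t) = p_i\,y_i(t-1) + e(t-1), \quad i = 1, \ldots, N_p$$

Defining the state variable as $x_i(t) := y_i(t)$, then equivalently

$$x_i(t) = p_i\,x_i(t-1) + e(t-1), \quad i = 1, \ldots, N_p \tag{3.93}$$

and therefore, the output is given by

$$y(t) = \sum_{i=1}^{N_p} R_i\,y_i(t) = \sum_{i=1}^{N_p} R_i\,x_i(t) \tag{3.94}$$

Expanding these relations over $i$, we obtain

$$\begin{aligned} x(t) &= \begin{bmatrix} p_1 & 0 & \cdots & 0 \\ 0 & p_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & p_{N_p} \end{bmatrix} x(t-1) + \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} e(t-1) \\ y(t) &= [\,R_1 \;\; R_2 \;\cdots\; R_{N_p}\,]\,x(t) \end{aligned} \tag{3.95}$$

Thus, the general decoupled form of the normal state‐space model is given by

$$\begin{aligned} x(t) &= \operatorname{diag}(p_1, \ldots, p_{N_p})\,x(t-1) + b\,e(t-1) \\ y(t) &= c'\,x(t) \end{aligned} \tag{3.96}$$

for $i = 1, \ldots, N_p$ with $b = \mathbf{1}$, an $N_p$‐vector of unit elements. Here $N_x = N_p$ and $c' = [\,R_1 \;\cdots\; R_{N_p}\,]$.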
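The decoupled (normal) form is simple enough to verify numerically. The sketch below, in plain Python, assumes real distinct poles, a partial fraction expansion of the form $\sum_i R_i/(z - p_i)$ (so the impulse response is $\sum_i R_i\,p_i^{\,t-1}$ for $t \geq 1$), and a zero initial state; the pole and residue values are hypothetical.

```python
def normal_form(poles, residues):
    """Build (A, b, cvec): A = diag(poles), b = vector of ones, cvec = residues."""
    n = len(poles)
    A = [[poles[r] if r == k else 0.0 for k in range(n)] for r in range(n)]
    b = [1.0] * n
    return A, b, list(residues)


def normal_form_impulse_response(poles, residues, steps):
    """Impulse response of x(t) = A x(t-1) + b e(t-1), y(t) = cvec' x(t)."""
    A, b, cvec = normal_form(poles, residues)
    n = len(poles)
    x = [0.0] * n
    e = [1.0] + [0.0] * (steps - 1)          # unit impulse at t = 0
    h = []
    for t in range(steps):
        if t > 0:
            x = [sum(A[r][k] * x[k] for k in range(n)) + b[r] * e[t - 1]
                 for r in range(n)]
        h.append(sum(cvec[k] * x[k] for k in range(n)))
    return h


# Hypothetical two-pole example.
poles, residues = [0.5, -0.25], [2.0, 1.0]
h = normal_form_impulse_response(poles, residues, 4)

# Closed-form check: h(t) = sum_i R_i * p_i**(t-1) for t >= 1, and h(0) = 0.
h_closed = [0.0] + [sum(R * p ** (t - 1) for R, p in zip(residues, poles))
                    for t in range(1, 4)]
```

Since the state matrix is diagonal, each mode evolves independently, which is why the closed‐form sum of decaying exponentials matches the simulated response term by term.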

3.8 Nonlinear (Approximate) Gauss–Markov State‐Space Models

Many processes in practice are nonlinear rather than linear. Coupling the nonlinearities with noisy data makes the signal processing problem a challenging one. In this section, we develop an approximate solution to the nonlinear modeling problem by linearizing the nonlinear process about a “known” reference trajectory. We limit our discussion to discrete nonlinear systems; continuous solutions to this problem are developed in [6–15].

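Suppose we model a process by a set of nonlinear stochastic vector difference equations in state‐space form as

$$x(t) = a[x(t-1)] + b[u(t-1)] + w(t-1) \tag{3.97}$$

with the corresponding measurement model

$$y(t) = c[x(t)] + d[u(t)] + v(t) \tag{3.98}$$

where $a[\cdot]$, $b[\cdot]$, $c[\cdot]$, $d[\cdot]$ are nonlinear vector functions of $x$, $u$, with $x, a, b \in \mathcal{R}^{N_x \times 1}$, $y, c, d \in \mathcal{R}^{N_y \times 1}$ and $w \sim \mathcal{N}(0, R_{ww}(t))$, $v \sim \mathcal{N}(0, R_{vv}(t))$.

Ignoring the additive noise sources, we “linearize” the process and measurement models about a known deterministic reference trajectory defined by $[x^*(t), u^*(t)]$ as illustrated in Figure 3.3,⁵ that is,

$$x^*(t) = a[x^*(t-1)] + b[u^*(t-1)] \tag{3.99}$$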

Figure 3.3 Linearization of a deterministic system using the reference trajectory defined by ($x^*(t)$, $u^*(t)$).

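Deviations or perturbations from this trajectory are defined by

$$\delta x(t) := x(t) - x^*(t), \qquad \delta u(t) := u(t) - u^*(t)$$

Substituting the previous equations into these expressions, we obtain the perturbation trajectory as

$$\delta x(t) = a[x(t-1)] - a[x^*(t-1)] + b[u(t-1)] - b[u^*(t-1)] \tag{3.100}$$

The nonlinear vector functions $a[\cdot]$ and $b[\cdot]$ can be expanded into a first‐order Taylor series about the reference trajectory as⁶

$$\begin{aligned} a[x(t-1)] &= a[x^*(t-1)] + \frac{d a[x^*(t-1)]}{d x^*(t-1)}\,\delta x(t-1) + \text{HOT} \\ b[u(t-1)] &= b[u^*(t-1)] + \frac{d b[u^*(t-1)]}{d u^*(t-1)}\,\delta u(t-1) + \text{HOT} \end{aligned} \tag{3.101}$$

We define the first‐order Jacobian matrices as

$$A[x^*(t-1)] := \frac{d a[x^*(t-1)]}{d x^*(t-1)}, \qquad B[u^*(t-1)] := \frac{d b[u^*(t-1)]}{d u^*(t-1)} \tag{3.102}$$

Incorporating the definitions of Eq. 3.102 and neglecting the higher order terms (HOT) in Eq. 3.101, the linearized perturbation process model in Eq. 3.100 can be expressed as

$$\delta x(t) = A[x^*(t-1)]\,\delta x(t-1) + B[u^*(t-1)]\,\delta u(t-1) \tag{3.103}$$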

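Similarly, the measurement system can be linearized by using the reference measurement

$$y^*(t) = c[x^*(t)] + d[u^*(t)] \tag{3.104}$$

and applying the Taylor series expansion to the nonlinear measurement model

$$\begin{aligned} c[x(t)] &= c[x^*(t)] + \frac{d c[x^*(t)]}{d x^*(t)}\,\delta x(t) + \text{HOT} \\ d[u(t)] &= d[u^*(t)] + \frac{d\,d[u^*(t)]}{d u^*(t)}\,\delta u(t) + \text{HOT} \end{aligned} \tag{3.105}$$

The corresponding measurement perturbation model is defined by

$$\delta y(t) := y(t) - y^*(t) = c[x(t)] + d[u(t)] - c[x^*(t)] - d[u^*(t)] \tag{3.106}$$

Substituting the first‐order approximations for $c[x(t)]$ and $d[u(t)]$ leads to the linearized measurement perturbation as

$$\delta y(t) = C[x^*(t)]\,\delta x(t) + D[u^*(t)]\,\delta u(t) \tag{3.107}$$

where $C[x^*(t)]$ is defined as the measurement Jacobian.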

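Summarizing, we have linearized a deterministic nonlinear model using a first‐order Taylor series expansion for the model functions, $a$, $b$, and $c$, and then developed a linearized Gauss–Markov perturbation model valid for small deviations from the reference trajectory given by

$$\begin{aligned} \delta x(t) &= A[x^*(t-1)]\,\delta x(t-1) + B[u^*(t-1)]\,\delta u(t-1) + w(t-1) \\ \delta y(t) &= C[x^*(t)]\,\delta x(t) + D[u^*(t)]\,\delta u(t) + v(t) \end{aligned} \tag{3.108}$$

with $A$, $B$, $C$, and $D$ the corresponding Jacobian matrices with $w$, $v$ zero‐mean, Gaussian.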
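The quality of the first‐order perturbation model can be checked numerically. The following plain‐Python sketch assumes a scalar, noise‐free system with no input, uses hypothetical nonlinear functions `a(x)` and `c(x)` chosen only for illustration, and approximates the Jacobians by central differences rather than analytically. It propagates a small initial deviation through both the nonlinear model and the linearized perturbation model.

```python
def a(x):
    """Hypothetical nonlinear state transition (illustrative only)."""
    return 0.95 * x + 0.04 * x * x


def c(x):
    """Hypothetical nonlinear measurement function (illustrative only)."""
    return x * x + x ** 3


def jacobian(f, x, h=1e-6):
    """Central-difference approximation of df/dx at x (scalar case)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def linearize_and_compare(x0_ref, dx0, steps):
    """Return per-step tuples (dx_true, dx_lin, dy_true, dy_lin)."""
    x_ref, x_true, dx = x0_ref, x0_ref + dx0, dx0
    rows = []
    for _ in range(steps):
        A = jacobian(a, x_ref)       # process Jacobian at x*(t-1)
        dx = A * dx                  # linearized perturbation, cf. Eq. 3.103
        x_ref = a(x_ref)             # reference trajectory, cf. Eq. 3.99
        x_true = a(x_true)           # perturbed nonlinear trajectory
        C = jacobian(c, x_ref)       # measurement Jacobian at x*(t)
        dy_lin = C * dx              # linearized measurement perturbation
        dy_true = c(x_true) - c(x_ref)
        rows.append((x_true - x_ref, dx, dy_true, dy_lin))
    return rows


rows = linearize_and_compare(x0_ref=2.0, dx0=0.01, steps=5)
dx_true, dx_lin, dy_true, dy_lin = rows[-1]
```

The gap between the true and linearized deviations grows with the square of the deviation, which is the numerical counterpart of the neglected higher order terms; increasing `dx0` makes the approximation visibly worse.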

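We can also use linearization techniques to approximate the statistics of the process and measurements. If we use the first‐order Taylor series expansion and expand about the mean, $m_x(t)$, rather than $x^*(t)$, then taking expected values

$$m_x(t) = E\{a[x(t-1)]\} + E\{b[u(t-1)]\} + E\{w(t-1)\} \tag{3.109}$$

gives

$$m_x(t) = a[m_x(t-1)] + b[u(t-1)] \tag{3.110}$$

which follows by linearizing $a[\cdot]$ about $m_x$ and taking the expected value.

The variance equations can also be developed in a similar manner (see [7] for details) to give

$$P(t) = A[m_x(t-1)]\,P(t-1)\,A'[m_x(t-1)] + R_{ww}(t-1) \tag{3.111}$$

Using the same approach, we arrive at the accompanying measurement statistics

$$\begin{aligned} m_y(t) &= c[m_x(t)] + d[u(t)] \\ R_{yy}(t) &= C[m_x(t)]\,P(t)\,C'[m_x(t)] + R_{vv}(t) \end{aligned} \tag{3.112}$$

We summarize these results in the “approximate” Gauss–Markov model of Table 3.3.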
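The approximate mean and variance recursions can be sketched for a scalar system as follows. This plain‐Python example assumes no input terms ($b = d = 0$), constant noise variances, analytic Jacobians, and hypothetical functions `a(x)` and `c(x)` that are not taken from the text; it is an illustration of the propagation pattern rather than the book's example.

```python
def a(x):
    """Hypothetical nonlinear state transition (illustrative only)."""
    return 0.5 * x + 0.25 * x * x


def A_jac(x):
    """Analytic Jacobian da/dx of the hypothetical transition."""
    return 0.5 + 0.5 * x


def c(x):
    """Hypothetical nonlinear measurement function (illustrative only)."""
    return x * x


def C_jac(x):
    """Analytic Jacobian dc/dx of the hypothetical measurement."""
    return 2.0 * x


def propagate(m0, P0, Rww, Rvv, steps):
    """Approximate Gauss-Markov statistics for a scalar system.

    Returns a list of (m_x, P, m_y, R_yy) tuples, one per time step.
    """
    m, P = m0, P0
    out = []
    for _ in range(steps):
        A = A_jac(m)                 # Jacobian evaluated at the previous mean
        m = a(m)                     # mean propagation, cf. Eq. 3.110
        P = A * P * A + Rww          # variance propagation, cf. Eq. 3.111
        C = C_jac(m)                 # measurement Jacobian at the new mean
        out.append((m, P, c(m), C * P * C + Rvv))   # cf. Eq. 3.112
    return out


stats = propagate(m0=1.0, P0=0.0625, Rww=0.015625, Rvv=0.25, steps=2)
```

Note the ordering inside the loop: the process Jacobian is evaluated at the previous mean before the mean is updated, while the measurement Jacobian uses the updated mean.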

Before we close, consider the following example to illustrate the approximation.

Although the linearization approach discussed here may seem somewhat extraneous relative to the previous sections, it becomes a crucial ingredient in the classical approach to (approximate) nonlinear estimation in the subsequent chapters. We discuss the linear state‐space approach (Kalman filter) to the estimation problem in Chapter 4 and then show how these linearization concepts can be used to solve the nonlinear estimation problem. There the popular “extended” Kalman filter processor relies heavily on the linearization techniques developed in this section.

3.9 Summary

In this chapter, we have discussed the development of continuous‐time, sampled‐data, and discrete‐time state‐space models. The stochastic variants of the deterministic models were presented leading to the Gauss–Markov representations for both linear and (approximate) nonlinear systems. The discussion of both the deterministic and stochastic state‐space models included a brief development of their second‐order statistics. We also discussed the underlying discrete systems theory as well as a variety of time‐series models (ARMAX, AR, MA, etc.) and showed that they can easily be represented in state‐space form through the use of canonical forms (models). These models form the embedded structure incorporated into the model‐based processors that will be discussed in subsequent chapters. We concluded the chapter with a brief development of a “linearized” nonlinear model leading to an approximate Gauss–Markov representation.

MATLAB Notes

MATLAB has many commands to convert state‐space models to and from other forms useful in signal processing. Many of them reside in the Signal Processing and Control System toolboxes. The matrix exponential is invoked by the expm command and is determined from Taylor/Padé approximants using the scaling and squaring approach. The commands expmdemo1, expmdemo2, and expmdemo3 demonstrate the trade‐offs of the Padé, Taylor, and eigenvector approaches to calculating the matrix exponential. The ordinary differential equation method is available through a wide variety of numerical integrators (ode*). Converting between state‐space and transfer function forms is accomplished using the ss2tf and tf2ss commands. ARMAX simulations are easily accomplished using the filter command, with a variety of options for converting ARMAX models to and from transfer functions. The System Identification Toolbox converts polynomial‐based models to state‐space form and continuous parameters, including Gauss–Markov, to discrete parameters (th2ss, thc2thd, thd2thc).

References

1 Kailath, T. (1980). Linear Systems. Englewood Cliffs, NJ: Prentice‐Hall.
2 Szidarovszky, F. and Bahill, A. (1980). Linear Systems Theory. Boca Raton, FL: CRC Press.
3 DeCarlo, R. (1989). Linear Systems: A State Variable Approach. Englewood Cliffs, NJ: Prentice‐Hall.
4 Chen, C. (1984). Introduction to Linear System Theory. New York: Holt, Rinehart, and Winston.
5 Tretter, S. (1976). Introduction to Discrete‐Time Signal Processing. New York: Wiley.
6 Jazwinski, A. (1970). Stochastic Processes and Filtering Theory. New York: Academic Press.
7 Sage, A. and Melsa, J. (1971). Estimation Theory with Applications to Communications and Control. New York: McGraw‐Hill.
8 Maybeck, P. (1979). Stochastic Models, Estimation and Control, vol. 1. New York: Academic Press.
9 Anderson, B. and Moore, J. (2005). Optimal Filtering. Mineola, NY: Dover Publications.
10 Katayama, T. (2005). Subspace Methods for System Identification. London: Springer.
11 Goodwin, G. and Payne, R.L. (1976). Dynamic System Identification. New York: Academic Press.
12 Goodwin, G. and Sin, K. (1984). Adaptive Filtering, Prediction and Control. Englewood Cliffs, NJ: Prentice‐Hall.
13 Mendel, J. (1995). Lessons in Estimation Theory for Signal Processing, Communications, and Control. Englewood Cliffs, NJ: Prentice‐Hall.
14 Brown, R. and Hwang, P.C. (1997). Introduction to Random Signals and Applied Kalman Filtering. New York: Wiley.
15 Candy, J. (2006). Model‐Based Signal Processing. Hoboken, NJ: Wiley/IEEE Press.
16 Robinson, E. and Silvia, M. (1979). Digital Foundations of Time Series Analysis, vol. 1. San Francisco, CA: Holden‐Day.
17 Simon, D. (2006). Optimal State Estimation: Kalman, H∞, and Nonlinear Approaches. Hoboken, NJ: Wiley.
18 Grewal, M.S. and Andrews, A.P. (1993). Kalman Filtering: Theory and Practice. Englewood Cliffs, NJ: Prentice‐Hall.
19 Moler, C. and Van Loan, C. (2003). Nineteen dubious ways to compute the exponential of a matrix, twenty‐five years later. SIAM Rev. 45 (1): 3–49.
20 Golub, G. and Van Loan, C. (1989). Matrix Computations. Baltimore, MD: Johns Hopkins University Press.
21 Ho, B. and Kalman, R. (1966). Effective reconstruction of linear state variable models from input/output data. Regelungstechnik 14: 545–548.
22 Candy, J., Warren, M., and Bullock, T. (1977). Realization of an invariant system description from Markov sequences. IEEE Trans. Autom. Control 23 (7): 93–96.
23 Kung, S., Arun, K., and Bhaskar Rao, D. (1983). State‐space and singular‐value decomposition‐based approximation methods for the harmonic retrieval problem. J. Opt. Soc. Am. 73 (12): 1799–1811.

Problems

3.1 Derive the following properties of conditional expectations:

if and are independent.
.
.
.
.
.
.

(Hint: See Appendix A.1.)

3.2 Suppose , , are Gaussian random variables with corresponding means , , and variances , , show that:
1. If , constants, then .
2. If and are uncorrelated, then they are independent.
3. If are Gaussian with mean and variance , then for
4. If and are jointly (conditionally) Gaussian, then
5. The random variable is orthogonal to .
6. If and are independent, then
7. If and are not independent, show that
  
  for .
(Hint: See Appendices A.1–A.3.)
3.3 Suppose we are given the factored power spectrum with
1. Develop the ARMAX model for the process.
2. Develop the corresponding Gauss–Markov model for both the standard and innovations representation of the process.
3.4 We are given the following Gauss–Markov model
1. Calculate the state power spectrum, .
2. Calculate the measurement power spectrum, .
3. Calculate the state covariance recursion, .
4. Calculate the steady‐state covariance, .
5. Calculate the output covariance recursion, .
6. Calculate the steady‐state output covariance, .
3.5 Suppose we are given the Gauss–Markov process characterized by the state equations

for a step of amplitude 0.03 and and .
1. Calculate the covariance of , i.e. .
2. Since the process is stationary, we know that
  
  What is the steady‐state covariance, , of this process?
3. Develop a MATLAB program to simulate this process.
4. Plot the process with the corresponding confidence limits for 100 data points. Do 95% of the samples lie within the bounds?
3.6 Suppose we are given the ARMAX model
1. What is the corresponding innovations model in state‐space form for ?
2. Calculate the corresponding covariance matrix .
3.7 Given a continuous–discrete Gauss–Markov model

where and are zero‐mean and white with respective covariances, and , along with a piecewise constant input, .
1. Develop the continuous–discrete mean and covariance propagation models for this system.
2. Suppose is processed by a coloring filter that exponentially correlates it, . Develop the continuous–discrete Gauss–Markov model in this case.
3.8 Develop the continuous–discrete Gauss–Markov models for the following systems:
1. Wiener process: ; , is zero‐mean, white with .
2. Random bias: ; where .
3. Random ramp: ; .
4. Random oscillation: ; .
5. Random second order: ; .
3.9 Develop the continuous–discrete Gauss–Markov model for correlated process noise, that is,
3.10 Develop the approximate Gauss–Markov model for the nonlinear state transition and measurement models given by

where and . The initial state is Gaussian distributed with .
3.11 Consider the discrete nonlinear process given by

with corresponding measurement model

where and . The initial state is Gaussian distributed with .

Develop the approximate Gauss–Markov process model for this nonlinear system.

Notes


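Table 3.2 ARMAX representation

Output propagation: $y(t) = \left(1 - A(q^{-1})\right) y(t) + B(q^{-1})\,u(t) + C(q^{-1})\,e(t)$

Mean propagation: $m_y(t) = \left(1 - A(q^{-1})\right) m_y(t) + B(q^{-1})\,u(t) + C(q^{-1})\,m_e(t)$

Impulse propagation: $h(t) = \left(1 - A(q^{-1})\right) h(t) + C(q^{-1})\,\delta(t)$

Variance/covariance propagation: $R_{yy}(k) = \sum_{i=0}^{\infty} h(i)\,h(i+k), \quad k \geq 0$

where
$y$ = the output or measurement sequence
$u$ = the input sequence
$e$ = the process (white) noise sequence with variance $R_{ee}$
$h$ = the impulse response sequence
$\delta$ = the impulse input of amplitude $\sqrt{R_{ee}}$
$m_y$ = the mean output or measurement sequence
$m_e$ = the mean process noise sequence
$R_{yy}(k)$ = the stationary output covariance at lag $k$
$A$ = the $N_a$th‐order system characteristic (poles) polynomial
$B$ = the $N_b$th‐order input (zeros) polynomial
$C$ = the $N_c$th‐order noise (zeros) polynomial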
