Chapter 1
First steps

1.1 Preliminaries

This book focuses on a class of random evolutions, in discrete time (by successive steps) and on a discrete state space (finite or countable, with isolated elements), which satisfy a fundamental assumption called the Markov property. This property can be described informally as follows: the evolution “forgets” its past and is “regenerated” at each step, and the only information about its past that it retains for its future evolution is its present state. The probabilistic description of such an evolution requires

  • a law (probability measure) for drawing its initial state and
  • a family of laws for drawing iteratively its state at the “next future instant” given its “present state,” indexed by the state space.

Such a random evolution will be called a Markov chain.

Precise definitions can be found in the Appendix, Section A.3, but we give now the probabilistic framework. A probability space $(\Omega, \mathcal{F}, \mathbb{P})$ will be considered throughout. When a state space $\mathcal{V}$ is discrete, its measurable structure is usually given by the collection of all its subsets, and all functions with values in $\mathcal{V}$ are then assumed to be measurable.

A random variable (r.v.) with values in a measurable state space $\mathcal{V}$ is a measurable function

$X : \omega \in \Omega \mapsto X(\omega) \in \mathcal{V}.$

Intuitively, the output $X(\omega)$ varies randomly with the input $\omega$, which is drawn in $\Omega$ according to $\mathbb{P}$, and the measurability assumptions make it possible to assign a probability to events defined through $X$.

For the random evolutions under investigation, the natural random elements are sequences $(X_n)_{n \in \mathbb{N}}$ taking values in the same discrete state space $\mathcal{V}$, which are called (random) chains or (discrete-time) processes. Each $X_n$ should be an r.v., and its law $\mathcal{L}(X_n)$ on the discrete space $\mathcal{V}$ is then given by

$\mathcal{L}(X_n) : A \subset \mathcal{V} \mapsto \mathbb{P}(X_n \in A) = \sum_{x \in A} \mathbb{P}(X_n = x),$

and hence can be identified in a natural way with $(\mathbb{P}(X_n = x))_{x \in \mathcal{V}}$.

Finite-dimensional marginals

More generally, for any $k \ge 1$ and $n_1 < \cdots < n_k$ in $\mathbb{N}$, the random vector

$(X_{n_1}, \ldots, X_{n_k}) : \omega \in \Omega \mapsto (X_{n_1}(\omega), \ldots, X_{n_k}(\omega))$

takes values in the discrete space $\mathcal{V}^k$, and its law $\mathcal{L}(X_{n_1}, \ldots, X_{n_k})$ can be identified with the collection of the

$\mathbb{P}(X_{n_1} = x_1, \ldots, X_{n_k} = x_k), \qquad x_1, \ldots, x_k \in \mathcal{V}.$

All these laws, for $k \ge 1$ and $n_1 < \cdots < n_k$ in $\mathbb{N}$, constitute the family of the finite-dimensional marginals of the chain $(X_n)_{n \in \mathbb{N}}$ or of its law.

Law of the chain

The r.v.

$(X_n)_{n \in \mathbb{N}} : \omega \in \Omega \mapsto (X_n(\omega))_{n \in \mathbb{N}}$

takes values in $\mathcal{V}^{\mathbb{N}}$, which is uncountable as soon as $\mathcal{V}$ contains at least two elements. Hence, its law cannot, in general, be defined by the values it takes on the elements of $\mathcal{V}^{\mathbb{N}}$. In the Appendix, Section A.3 contains some mathematical results defining the law of $(X_n)_{n \in \mathbb{N}}$ from its finite-dimensional marginals.

Section A.1 contains some more elementary mathematical results used throughout the book, and Section A.2 a discussion on the total variation norm and on weak convergence of laws.

1.2 First properties of Markov chains

1.2.1 Markov chains, finite-dimensional marginals, and laws

1.2.1.1 First definitions

We now provide rigorous definitions.

Markov chain evolution

A family $(P(x, \cdot))_{x \in \mathcal{V}}$ of laws on $\mathcal{V}$ indexed by $\mathcal{V}$ is defined by

$P(x, \cdot) = (P(x, y))_{y \in \mathcal{V}}, \qquad x \in \mathcal{V}.$

The evolution of $(X_n)_{n \in \mathbb{N}}$ can be obtained by independent draws, first of $X_0$ according to the initial law $\pi_0$, and then iteratively of $X_{n+1}$ according to $P(X_n, \cdot)$ for $n \ge 0$, without taking any further notice of the evolution before the present time $n$ or of its actual value.
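As an illustration, here is a minimal simulation sketch of this iterative draw, assuming a finite state space encoded as $\{0, \ldots, d-1\}$ and NumPy available; the function name simulate_chain and the example matrix are illustrative, not taken from the text.

    import numpy as np

    def simulate_chain(pi0, P, n_steps, rng=None):
        """Draw X_0 according to pi0, then X_{n+1} according to P(X_n, .)."""
        rng = np.random.default_rng() if rng is None else rng
        d = len(pi0)
        path = np.empty(n_steps + 1, dtype=int)
        path[0] = rng.choice(d, p=pi0)                 # initial draw
        for n in range(n_steps):
            path[n + 1] = rng.choice(d, p=P[path[n]])  # depends on the present state only
        return path

    # Example: a two-state chain.
    P = np.array([[0.9, 0.1],
                  [0.5, 0.5]])
    pi0 = np.array([1.0, 0.0])
    print(simulate_chain(pi0, P, 10))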

Inhomogeneous Markov chains

A more general and complex evolution can be obtained by letting the law of the steps depend on the present instant of time, that is, using the analogous formulae with a time-dependent matrix $P_n$ instead of $P$; this corresponds to a time-inhomogeneous Markov chain, but we will seldom consider this generalization.

Markov chain graph

The graph of the transition matrix $P$, or of a Markov chain with matrix $P$, is the oriented marked graph with nodes given by the elements of $\mathcal{V}$ and directed links given by the ordered pairs $(x, y)$ of elements of $\mathcal{V}$ such that $P(x, y) > 0$, each link being marked by the value of $P(x, y)$. The restriction of the graph to two states $x \ne y$ in $\mathcal{V}$ is of the form [if all these quantities are positive]

[Figure: nodes $x$ and $y$, with self-loops marked $P(x, x)$ and $P(y, y)$ and directed links $x \to y$ and $y \to x$ marked $P(x, y)$ and $P(y, x)$.]

The graph and the matrix are equivalent descriptions of the random evolution. The links from $x$ to $x$ in the graph are redundant, as they are marked by $P(x, x) = 1 - \sum_{y \ne x} P(x, y)$, but illustrate graphically the possible transitions from $x$.
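The correspondence between a transition matrix and its marked graph can be sketched as follows, under the same finite-state encoding as above; the helper transition_graph is hypothetical.

    import numpy as np

    def transition_graph(P, states=None):
        """Directed links (x, y) with P(x, y) > 0, each marked by the value P(x, y)."""
        d = len(P)
        states = list(range(d)) if states is None else states
        return {(states[i], states[j]): float(P[i][j])
                for i in range(d) for j in range(d) if P[i][j] > 0}

    P = np.array([[0.9, 0.1],
                  [0.5, 0.5]])
    print(transition_graph(P, states=["x", "y"]))
    # {('x', 'x'): 0.9, ('x', 'y'): 0.1, ('y', 'x'): 0.5, ('y', 'y'): 0.5}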

1.2.1.2 Conditional formulations

The last formula in Definition 1.2.1, which expresses $\mathbb{P}(X_0 = x_0, \ldots, X_{n-1} = x_{n-1}, X_n = x, X_{n+1} = y)$ as $\mathbb{P}(X_0 = x_0, \ldots, X_{n-1} = x_{n-1}, X_n = x)\, P(x, y)$, can be written as

$\mathbb{P}(X_{n+1} = y \mid X_0 = x_0, \ldots, X_{n-1} = x_{n-1}, X_n = x) = P(x, y), \qquad (1.2.1)$

whenever $\mathbb{P}(X_0 = x_0, \ldots, X_{n-1} = x_{n-1}, X_n = x) > 0$, which is often used as the definition. Moreover, if $f$ is a nonnegative or bounded function on $\mathcal{V}$, then

$\mathbb{E}(f(X_{n+1}) \mid X_0 = x_0, \ldots, X_{n-1} = x_{n-1}, X_n = x) = \sum_{y \in \mathcal{V}} P(x, y) f(y).$

For the sake of mathematical efficiency and simplicity, nonconditional expressions will be stressed, before possibly being translated into equivalent conditional formulations. As an example, Definition 1.2.1 immediately yields, by summing over $x_0, \ldots, x_{n-1}$, that

$\mathbb{P}(X_n = x, X_{n+1} = y) = \mathbb{P}(X_n = x)\, P(x, y),$

which is not quite so obvious starting from (1.2.1).

1.2.1.3 Initial law, instantaneous laws

For a Markov chain $(X_n)_{n \in \mathbb{N}}$, the law $\pi_n$ of $X_n$ is called the instantaneous law at time $n$, and $\pi_0$ the initial law. The notations $\mathbb{P}$ and $\mathbb{E}$ implicitly imply that $\pi_0$ is given and arbitrary, $\mathbb{P}_\mu$ and $\mathbb{E}_\mu$ for some law $\mu$ on $\mathcal{V}$ indicate that $\pi_0 = \mu$, and $\mathbb{P}_x$ and $\mathbb{E}_x$ indicate that $\pi_0 = \delta_x$. By linearity,

$\mathbb{P}_\mu = \sum_{x \in \mathcal{V}} \mu(x)\, \mathbb{P}_x, \qquad \mathbb{E}_\mu = \sum_{x \in \mathcal{V}} \mu(x)\, \mathbb{E}_x.$

A frequent abuse of notation is to write $\mathbb{P}_x = \mathbb{P}(\,\cdot \mid X_0 = x)$, and so on.

1.2.1.4 Law on the canonical space of the chain

The notions in the Appendix, Section A.3.4, will now be used.

Definition 1.2.1 is actually a statement on the law of the Markov chain $(X_n)_{n \in \mathbb{N}}$, which it characterizes by giving an explicit expression for its finite-dimensional marginals in terms of its initial law $\pi_0$ and transition matrix $P$.

Indeed, some rather simple results in measure theory show that there is at most one law on the canonical probability space $\mathcal{V}^{\mathbb{N}}$ with the product $\sigma$-field having a given finite-dimensional marginal collection.

It is immediate to check that this collection is consistent [with respect to (w.r.t.) projections], and then the Kolmogorov extension theorem (Theorem A.3.10) implies that there exists a law $\mathbb{P}_{\pi_0}$ on the canonical probability space $\mathcal{V}^{\mathbb{N}}$ with the product $\sigma$-field such that the canonical (projection) process $(X_n)_{n \in \mathbb{N}}$ has the given finite-dimensional marginal collection, and hence is a Markov chain with initial law $\pi_0$ and transition matrix $P$ (see Corollary A.3.11).

The Kolmogorov extension theorem follows from a deep and general result in measure theory, the Carathéodory extension theorem.

1.2.2 Transition matrix action and matrix notation

1.2.2.1 Nonnegative and signed measures, total variation measure, and norm

A (nonnegative) measure $\mu$ on $\mathcal{V}$ is defined by (and can be identified with) a collection $(\mu(x))_{x \in \mathcal{V}}$ of nonnegative real numbers and, in the sense of nonnegative series,

$\mu(A) = \sum_{x \in A} \mu(x), \qquad A \subset \mathcal{V}.$

A measure $\mu$ is finite if its total mass $\mu(\mathcal{V})$ is finite, and then $\mu(A) \le \mu(\mathcal{V}) < \infty$ for all $A \subset \mathcal{V}$. A measure is a probability measure, or a law, if $\mu(\mathcal{V}) = 1$.

For $r$ in $\mathbb{R}$, let $r^+ = \max(r, 0)$ and $r^- = \max(-r, 0)$ denote the nonnegative and nonpositive parts of $r$, which satisfy $r = r^+ - r^-$ and $|r| = r^+ + r^-$.

For $\mu = (\mu(x))_{x \in \mathcal{V}}$ with $\mu(x) \in \mathbb{R}$, the measures $\mu^+ = (\mu(x)^+)_{x \in \mathcal{V}}$, $\mu^- = (\mu(x)^-)_{x \in \mathcal{V}}$, and $|\mu| = (|\mu(x)|)_{x \in \mathcal{V}}$ can be defined termwise. Then,

$\mu = \mu^+ - \mu^-$

is the minimal decomposition of $\mu$ as a difference of (nonnegative) measures, which have disjoint supports, and

$|\mu| = \mu^+ + \mu^-$

is called the total variation measure of $\mu$.

If $\mu$ is such that $|\mu|(\mathcal{V})$ is finite (equivalently, if both $\mu^+(\mathcal{V})$ and $\mu^-(\mathcal{V})$ are finite), then we can extend it to a signed measure $\mu$ acting on the subsets of $\mathcal{V}$ by setting, in the sense of absolutely converging series,

$\mu(A) = \sum_{x \in A} \mu(x) = \mu^+(A) - \mu^-(A), \qquad A \subset \mathcal{V},$

and we can define its total variation norm by

$\|\mu\| = |\mu|(\mathcal{V}) = \sum_{x \in \mathcal{V}} |\mu(x)| = \mu^+(\mathcal{V}) + \mu^-(\mathcal{V}).$

Note that $|\mu(A)| \le \|\mu\|$ for all $A \subset \mathcal{V}$.
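On a finite state space, these objects reduce to coordinate-wise operations, as in the following sketch (the signed measure mu is an arbitrary illustrative example).

    import numpy as np

    mu = np.array([0.3, -0.1, 0.5, -0.2])   # a signed measure on a 4-point space

    mu_plus = np.maximum(mu, 0.0)            # mu^+
    mu_minus = np.maximum(-mu, 0.0)          # mu^-
    tv_measure = mu_plus + mu_minus          # total variation measure |mu|
    tv_norm = tv_measure.sum()               # total variation norm ||mu|| = |mu|(V)

    assert np.allclose(mu, mu_plus - mu_minus)   # minimal decomposition mu = mu^+ - mu^-
    print(tv_norm)                               # 1.1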

The space $\mathcal{M}(\mathcal{V})$ of all signed measures, furnished with the total variation norm, is a Banach space, which is isomorphic to the Banach space $\ell^1(\mathcal{V})$ of summable sequences with its natural norm.

Probability measures or laws

The space of probability measures

$\mathcal{P}(\mathcal{V}) = \{\mu \in \mathcal{M}(\mathcal{V}) : \mu \ge 0,\ \mu(\mathcal{V}) = 1\}$

is the intersection of the cone of nonnegative measures with the unit sphere. It is a closed subset of $\mathcal{M}(\mathcal{V})$ and hence is complete for the induced metric.

Some properties of $\mathcal{M}(\mathcal{V})$ and $\mathcal{P}(\mathcal{V})$ are developed in the Appendix, Section A.2. Note that, according to the definition taken here, nonnegative measures with infinite mass are not signed measures.

Complex measures

Spectral theory naturally involves complex extensions. For its purposes, complex measures can be readily defined, and the corresponding space $\mathcal{M}_{\mathbb{C}}(\mathcal{V})$, where the modulus in $\mathbb{C}$ is again denoted by $|\cdot|$, again allows a total variation measure $|\mu|$ and a total variation norm $\|\mu\|$ to be defined for $\mu$ in $\mathcal{M}_{\mathbb{C}}(\mathcal{V})$. This Banach space is isomorphic to $\ell^1_{\mathbb{C}}(\mathcal{V})$. The real and imaginary parts of a complex measure are signed measures.

1.2.2.2 Line and column vectors, measure-function duality

In matrix notation, the functions $f$ from $\mathcal{V}$ to $\mathbb{R}$ are considered as column vectors $(f(x))_{x \in \mathcal{V}}$, and nonnegative or signed measures $\mu$ on $\mathcal{V}$ as line vectors $(\mu(x))_{x \in \mathcal{V}}$, of infinite length if $\mathcal{V}$ is infinite. The integral of a function $f$ by a measure $\mu$ is denoted by $\mu f$, in accordance with the matrix product

$\mu f = \sum_{x \in \mathcal{V}} \mu(x) f(x),$

defined in $[0, \infty]$ in the sense of nonnegative series if $\mu \ge 0$ and $f \ge 0$, and in $\mathbb{R}$ in the sense of absolutely converging series if $\mu \in \mathcal{M}(\mathcal{V})$ and $f \in L^\infty(\mathcal{V})$, the Banach space of bounded functions on $\mathcal{V}$ with the uniform norm.

For a subset $A$ of $\mathcal{V}$, the indicator function $\mathbb{1}_A$ is defined by

$\mathbb{1}_A(x) = 1 \text{ if } x \in A, \qquad \mathbb{1}_A(x) = 0 \text{ if } x \notin A.$

For $x$ in $\mathcal{V}$, the Dirac mass at $x$ is the probability measure $\delta_x$ such that $\delta_x(A) = \mathbb{1}_A(x)$, that is,

$\delta_x(y) = 1 \text{ if } y = x, \qquad \delta_x(y) = 0 \text{ if } y \ne x.$

For $x$ and $y$ in $\mathcal{V}$, it holds that

$\delta_x(y) = \mathbb{1}_{\{x\}}(y),$

but $\delta_x$ will be represented by a line vector and $\mathbb{1}_{\{x\}}$ by a column vector.

If $\mu$ is a nonnegative or signed measure, then

$\mu(A) = \mu \mathbb{1}_A, \qquad A \subset \mathcal{V}.$
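In the finite case, these conventions amount to the following sketch, in which Dirac masses are one-hot line vectors and indicator functions are 0-1 column vectors; the three-point space and its labels are illustrative.

    import numpy as np

    V = ["a", "b", "c"]
    idx = {x: i for i, x in enumerate(V)}

    def dirac(x):
        """Dirac mass at x, as a line vector."""
        e = np.zeros(len(V)); e[idx[x]] = 1.0
        return e

    def indicator(A):
        """Indicator function of A, as a column vector."""
        return np.array([1.0 if x in A else 0.0 for x in V])

    mu = np.array([0.2, 0.5, 0.3])      # a law on V, as a line vector
    A = {"a", "c"}
    print(mu @ indicator(A))            # mu(A) = mu 1_A = 0.5
    print(dirac("b") @ indicator(A))    # delta_b(A) = 1_A(b) = 0.0
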
Duality and total variation norm

A natural duality bracket between the Banach spaces $\mathcal{M}(\mathcal{V})$ and $L^\infty(\mathcal{V})$ is given by

$\langle \mu, f \rangle = \mu f = \sum_{x \in \mathcal{V}} \mu(x) f(x),$

and for $\mu$ in $\mathcal{M}(\mathcal{V})$ and $f$ in $L^\infty(\mathcal{V})$, it holds that

$|\mu f| \le \sum_{x \in \mathcal{V}} |\mu(x)|\, |f(x)| \le \|\mu\|\, \|f\|_\infty,$

and hence that

$\|\mu\| = \sup\{ |\mu f| : f \in L^\infty(\mathcal{V}),\ \|f\|_\infty \le 1 \}. \qquad (1.2.2)$

Thus, $\mathcal{M}(\mathcal{V})$ can be identified with a closed subspace of the dual of $L^\infty(\mathcal{V})$ with the strong dual norm, which is the norm as an operator on $L^\infty(\mathcal{V})$.

The space $L^\infty(\mathcal{V})$ can be identified with the space $\ell^\infty(\mathcal{V})$ of bounded sequences, and this duality between $\mathcal{M}(\mathcal{V})$ and $L^\infty(\mathcal{V})$ corresponds to the natural duality between $\ell^1(\mathcal{V})$ and $\ell^\infty(\mathcal{V})$.

The operations between complex measures and complex functions are performed by separating the real and imaginary parts.

1.2.2.3 Transition matrices, actions on measures, and functions

A matrix $P = (P(x, y))_{x, y \in \mathcal{V}}$ is a transition matrix if and only if each of its line vectors $P(x, \cdot) = (P(x, y))_{y \in \mathcal{V}}$ corresponds to a probability measure. Then, each of its column vectors $P(\cdot, y) = (P(x, y))_{x \in \mathcal{V}}$ defines a nonnegative function, which is bounded by $1$.

A transition matrix can be multiplied on its right by nonnegative functions and on its left by nonnegative measures, or on its right by bounded functions and on its left by signed measures. The order of these operations does not matter. The function $Pf$, the nonnegative or signed measure $\mu P$, and the number $\mu P f$ are given for $x$ in $\mathcal{V}$ by

$Pf(x) = \sum_{y \in \mathcal{V}} P(x, y) f(y) = \mathbb{E}_x[f(X_1)], \quad \mu P(x) = \sum_{y \in \mathcal{V}} \mu(y) P(y, x) = \mathbb{P}_\mu(X_1 = x), \quad \mu P f = \sum_{x, y \in \mathcal{V}} \mu(x) P(x, y) f(y) = \mathbb{E}_\mu[f(X_1)],$

in which the notations $\mathbb{P}_\mu$ and $\mathbb{E}_\mu$ are used only when $\mu$ is a law.

In matrix notation,

$\mu P f = (\mu P) f = \mu (P f),$

the product of the line vector $\mu$, the matrix $P$, and the column vector $f$, in either order of evaluation.
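In the finite case, these products are ordinary matrix-vector products, as in this sketch; the matrix, law, and function below are illustrative.

    import numpy as np

    P = np.array([[0.5, 0.5, 0.0],
                  [0.1, 0.6, 0.3],
                  [0.0, 0.2, 0.8]])
    assert np.allclose(P.sum(axis=1), 1.0)   # each line vector is a probability measure

    mu = np.array([0.2, 0.3, 0.5])           # law, line vector
    f = np.array([1.0, 0.0, 2.0])            # bounded function, column vector

    print(mu @ P)       # the law mu P
    print(P @ f)        # the function P f
    print(mu @ P @ f)   # mu P f; the order of the operations does not matter
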
Intrinsic notation

The linear mapping

$f \in L^\infty(\mathcal{V}) \mapsto Pf \in L^\infty(\mathcal{V})$

has matrix $P$ in the canonical basis. Its dual, or adjoint, mapping on $\mathcal{M}(\mathcal{V})$, w.r.t. the duality bracket $\langle \mu, f \rangle = \mu f$, is given by

$\mu \in \mathcal{M}(\mathcal{V}) \mapsto \mu P \in \mathcal{M}(\mathcal{V}),$

and has the adjoint (or transpose) matrix, also denoted by $P^*$. In order to respect the vector space structure and identify linear mappings and their matrices in the canonical bases, we could write $\mu^*$ instead of $\mu$ (a column vector), $P^*$ instead of $P$, and $P^* \mu^*$ instead of $\mu P$.

1.2.2.4 Transition matrix products, many-step transition

If $P_1$ and $P_2$ are both transition matrices on $\mathcal{V}$, then it is easy to check that the matrix product $P_1 P_2$ with generic term

$P_1 P_2(x, y) = \sum_{z \in \mathcal{V}} P_1(x, z) P_2(z, y)$

is a transition matrix on $\mathcal{V}$. Let

$P^0 = I \ (\text{the identity matrix}), \qquad P^n = P P^{n-1} = P^{n-1} P, \quad n \ge 1.$

Then,

$P^n(x, y) = \sum_{x_1, \ldots, x_{n-1} \in \mathcal{V}} P(x, x_1) P(x_1, x_2) \cdots P(x_{n-1}, y),$

in which each factor is the probability for the chain to go in one step from one intermediate state to the next, and hence $P^n(x, y)$ is the probability for the chain to go from $x$ to $y$ in $n$ steps. In particular,

$\mathbb{P}_x(X_n = y) = P^n(x, y), \qquad x, y \in \mathcal{V}.$
Chapman–Kolmogorov formula

As $P^{m+n} = P^m P^n$ for $m, n \ge 0$, this yields the Chapman–Kolmogorov formula

$P^{m+n}(x, y) = \sum_{z \in \mathcal{V}} P^m(x, z) P^n(z, y), \qquad x, y \in \mathcal{V}.$
Probabilistic interpretation

These algebraic formulae have simple probabilistic interpretations: the probability of going from $x$ to $y$ in $m + n$ steps can be obtained as the sum of the probabilities of taking each $(m+n)$-step path that does so, as well as the sum over all intermediate positions after $m$ steps.
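A quick numerical check of the many-step transition matrices and of the Chapman–Kolmogorov formula, on an illustrative three-state matrix.

    import numpy as np
    from numpy.linalg import matrix_power

    P = np.array([[0.5, 0.5, 0.0],
                  [0.1, 0.6, 0.3],
                  [0.0, 0.2, 0.8]])

    m, n = 2, 3
    assert np.allclose(matrix_power(P, m + n),
                       matrix_power(P, m) @ matrix_power(P, n))   # Chapman-Kolmogorov

    Pn = matrix_power(P, n)                  # P^n(x, y): from x to y in n steps
    assert np.allclose(Pn.sum(axis=1), 1.0)  # P^n is again a transition matrix
    print(Pn[0, 2])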

1.2.3 Random recursion and simulation

Many Markov chains are obtained in a natural way as a random recursion, or random iterative sequence, in which $X_{n+1}$ is a fixed function of $X_n$ and of a fresh random input independent of the past.

When a random sequence $(X_n)_{n \in \mathbb{N}}$ is defined in some particular way, there is often a natural interpretation in terms of a random recursion of the previous kind. This allows one to prove that $(X_n)_{n \in \mathbb{N}}$ is a Markov chain without having to check the definition directly, or even to make its transition matrix explicit. Moreover, such a pathwise representation of the Markov chain may be used for its study or its simulation.

Any Markov chain, with arbitrary initial law $\pi_0$ and transition matrix $P$, can be thus represented, using an i.i.d. sequence $(U_n)_{n \in \mathbb{N}}$ of uniform r.v. on $[0, 1]$. The state space is first enumerated as $\mathcal{V} = \{y_j : j \ge 1\}$.

For the initial value, if $k \ge 1$ is determined by

$\sum_{j < k} \pi_0(y_j) \le U_0 < \sum_{j \le k} \pi_0(y_j),$

then $X_0 := y_k$ has law $\pi_0$. For the transitions, for $n \ge 0$, if $k \ge 1$ is determined by

$\sum_{j < k} P(X_n, y_j) \le U_{n+1} < \sum_{j \le k} P(X_n, y_j),$

then $X_{n+1} := y_k$, and $(X_n)_{n \in \mathbb{N}}$ is a Markov chain with initial law $\pi_0$ and transition matrix $P$.
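A minimal sketch of this construction, assuming a finite enumerated state space; find_index implements the cumulative-sum search described above, and all names are illustrative.

    import numpy as np

    def find_index(u, weights):
        """Smallest k such that weights[0] + ... + weights[k] > u."""
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w
            if u < acc:
                return k
        return len(weights) - 1      # guard against rounding of the cumulative sums

    def chain_from_uniforms(pi0, P, U):
        """Build X_0, X_1, ... from i.i.d. uniform draws U on [0, 1]."""
        X = [find_index(U[0], pi0)]              # X_0 has law pi0
        for u in U[1:]:
            X.append(find_index(u, P[X[-1]]))    # X_{n+1} drawn from P(X_n, .)
        return X

    P = [[0.9, 0.1], [0.5, 0.5]]
    pi0 = [0.5, 0.5]
    U = np.random.default_rng(0).uniform(size=11)
    print(chain_from_uniforms(pi0, P, U))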

In theory, this allows for the simulation of the Markov chain, but in practice this is not necessarily the best way to do so.

Note that this representation yields a construction for an arbitrary Markov chain, starting from a rigorous construction of i.i.d. sequences of uniform r.v. on $[0, 1]$, without having to use the more general Kolmogorov extension theorem (Theorem A.3.10).

1.2.4 Recursion for the instantaneous laws, invariant laws

The instantaneous laws $(\pi_n)_{n \in \mathbb{N}}$ satisfy the recursion

$\pi_{n+1} = \pi_n P, \qquad n \ge 0,$

with solution $\pi_n = \pi_0 P^n$. This is a linear recursion in dimension $\mathrm{Card}(\mathcal{V})$, and the affine constraint $\pi_n(\mathcal{V}) = 1$ allows it to be reduced to an affine recursion in dimension $\mathrm{Card}(\mathcal{V}) - 1$. Note that $\pi_{n+m} = \pi_n P^m$ for $m, n \ge 0$.

An elementary study of this recursion starts by searching for its fixed points. These are the only possible limits of the instantaneous laws as $n$ goes to infinity, for any topology for which $\mu \mapsto \mu P$ is continuous.

By definition, a fixed point for this recursion is a law (probability measure) $\pi$ such that $\pi P = \pi$, and is called an invariant law or a stationary distribution for (or of) the matrix $P$, or of the Markov chain $(X_n)_{n \in \mathbb{N}}$.

Stationary chain, equilibrium

If $\pi$ is an invariant law and $\pi_0 = \pi$, then $\pi_n = \pi$ for all $n$ in $\mathbb{N}$, and $(X_{k+n})_{n \in \mathbb{N}}$ is again a Markov chain with initial law $\pi$ and transition matrix $P$, for every $k \ge 0$. Then, $(X_n)_{n \in \mathbb{N}}$ is said to be stationary, or in equilibrium.

Invariant measures and laws

In order to find an invariant law, one must (a numerical sketch is given below):

  1. Solve the linear equation $\mu P = \mu$.
  2. Find which solutions $\mu$ are nonnegative, that is, such that $\mu(x) \ge 0$ for every $x$ in $\mathcal{V}$.
  3. Normalize such a $\mu$ by setting $\pi = \mu / \mu(\mathcal{V})$, which is possible only if
    $0 < \mu(\mathcal{V}) = \sum_{x \in \mathcal{V}} \mu(x) < \infty.$

A nonnegative measure $\mu$ such that $\mu P = \mu$ is called an invariant measure. An invariant measure is said to be unique if it is unique up to a multiplicative factor, that is, if all invariant measures are proportional (to it).
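In the finite case, the three steps above can be carried out numerically, for instance by solving $\mu P = \mu$ together with the normalization, as in this sketch; the matrix is illustrative and the least-squares formulation is one choice among others.

    import numpy as np

    P = np.array([[0.5, 0.5, 0.0],
                  [0.1, 0.6, 0.3],
                  [0.0, 0.2, 0.8]])
    d = len(P)

    # Stack the linear system pi P = pi (written with the transpose) with pi(V) = 1.
    A = np.vstack([P.T - np.eye(d), np.ones(d)])
    b = np.concatenate([np.zeros(d), [1.0]])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)

    assert np.all(pi >= -1e-12) and np.allclose(pi @ P, pi)   # nonnegative fixed point
    print(pi)                                                  # an invariant law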

Algebraic interpretation

The invariant measures are the left eigenvectors for the eigenvalue $1$ of the matrix $P$ acting on nonnegative measures, that is, of the adjoint (or transposed) matrix $P^*$ acting on nonnegative column vectors. Note that the constant function $\mathbb{1} = \mathbb{1}_{\mathcal{V}}$ is a right eigenvector for the eigenvalue $1$, as

$P \mathbb{1}(x) = \sum_{y \in \mathcal{V}} P(x, y) = 1 = \mathbb{1}(x), \qquad x \in \mathcal{V}.$

The possible convergence of $\pi_n = \pi_0 P^n$ to an invariant law $\pi$ is related to the moduli of the elements of the spectrum of the restriction of the action of $P$ to the signed measures with null total mass.

More generally, the exact or approximate computation of $\pi_0 P^n$ depends in a more or less explicit way on a spectral decomposition of $P$.
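A sketch of this spectral point of view on an illustrative finite matrix: the eigenvalue $1$ carries the invariant law, and the second-largest eigenvalue modulus indicates how fast $\pi_0 P^n$ can converge.

    import numpy as np

    P = np.array([[0.5, 0.5, 0.0],
                  [0.1, 0.6, 0.3],
                  [0.0, 0.2, 0.8]])

    # Left eigenvectors of P are eigenvectors of its transpose.
    eigvals, eigvecs = np.linalg.eig(P.T)
    order = np.argsort(-np.abs(eigvals))
    print(np.abs(eigvals[order]))            # leading modulus equals 1

    v = np.real(eigvecs[:, order[0]])        # left eigenvector for the eigenvalue 1
    pi = v / v.sum()                         # normalized to an invariant law
    print(pi, np.allclose(pi @ P, pi))

    print(np.abs(eigvals[order[1]]))         # second-largest modulus, related to the convergence rate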
