An algebraic study based on the natural duality between the space of signed measures and the space of bounded functions will provide some structural results. These results are quite complete for finite state spaces. The complete study for arbitrary discrete state spaces will be carried out later using probabilistic techniques. A reader for whom this is the main interest may go directly to Section 1.4.
The eigenvalues of the operator P are given by all λ in ℂ such that λI − P is not injective as an operator on the space under consideration. The eigenspace of λ is the kernel of λI − P, and its nonzero elements are called eigenvectors. Hence, f ≠ 0 is an eigenvector of P if and only if Pf = λf, and then P^n f = λ^n f. The generalized eigenspace of λ is given by the union of the kernels of (λI − P)^k for k ≥ 1. If it contains strictly the eigenspace, then it contains some g and some eigenvector f such that Pg = λg + f, and then P^n g = λ^n g + nλ^{n−1} f.
An eigenvalue is said to be
Similar definitions are given for the adjoint operator, acting on measures by μ ↦ μP. Its eigenspaces are often said to be eigenspaces on the left, or left eigenspaces, of P. Those of the action f ↦ Pf may accordingly be called eigenspaces on the right, or right eigenspaces, of P.
Hence, μ ≠ 0 is a left eigenvector of P if and only if μP = λμ, and then μP^n = λ^n μ. If the generalized left eigenspace of λ contains strictly the eigenspace, then it contains some ν and some left eigenvector μ such that νP = λν + μ, and then νP^n = λ^n ν + nλ^{n−1} μ.
The spectrum of P is given by all λ in ℂ such that λI − P is not invertible. It is a simple matter to check that, using the adjoint operator in the definition, the spectrum is unchanged, and mentions of “left” or “right” are useless.
The spectrum contains both the left and the right eigenvalues. In finite dimensions, invertibility of an operator is equivalent to its injectivity, and hence the spectrum, the set of left eigenvalues, and the set of right eigenvalues coincide, but this is not so in general.
If λ is in the spectrum of an operator on a real vector space, such as the space of signed measures or the space of bounded functions, then its complex conjugate λ̄ is also in the spectrum, and if moreover λ is an eigenvalue, then λ̄ is an eigenvalue, and the corresponding (generalized) eigenspaces are conjugate. If λ is real, then real vector spaces can be considered instead of complex ones in all definitions. Moreover, the real and complex (generalized) eigenspaces have the same dimension.
A state a in the state space is absorbing if P(a, a) = 1, and then the Dirac mass δ_a is an invariant law for P.
If the state space contains disjoint subsets V_i, for i in some index set, such that P(x, V_i) = 1 for every x in V_i, these are said to be absorbing or closed. The restriction of P to each V_i is Markovian; if it has an invariant measure μ_i, then any convex combination of the μ_i is an invariant measure on the whole state space, and if the μ_i are laws, then this combination is an invariant law. (By abuse of notation, μ_i denotes the extension of the measure which vanishes outside of V_i.)
Hence, any uniqueness result for invariant measures or laws requires adequate assumptions excluding the above situation.
The standard hypothesis for this is that of irreducibility: a transition matrix P on V is irreducible if, for every x and y in V, there exists n ≥ 1 such that P^n(x, y) > 0. Equivalently, the oriented graph of the matrix is strongly connected: it contains an oriented path from any vertex to any other. This notion will be further developed in due time.
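For a finite state space, irreducibility can be checked mechanically as strong connectivity of the oriented graph of the matrix. The following sketch (function names and matrices are ours, purely illustrative) performs a breadth-first search from each state:

```python
from collections import deque

def reachable(P, start):
    """States reachable from `start` along edges of positive probability."""
    n = len(P)
    seen = {start}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for y in range(n):
            if P[x][y] > 0 and y not in seen:
                seen.add(y)
                queue.append(y)
    return seen

def is_irreducible(P):
    """P is irreducible iff every state reaches every other state."""
    n = len(P)
    return all(len(reachable(P, x)) == n for x in range(n))
```

For instance, a matrix with an absorbing state, such as [[1, 0], [0.5, 0.5]], fails the test, while a cyclic matrix such as [[0, 1], [1, 0]] passes it.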
This proof heavily uses techniques that are referred to under the terminology of “the maximum principle,” which we will try to explain in Section 1.3.3.
A transition matrix P is strongly irreducible if there exists n ≥ 1 such that P^n(x, y) > 0 for all x and y.
Note that and that only in the trivial case in which is a sequence of i.i.d. r.v. of law .
The Doeblin condition (or strong irreducibility) is seldom satisfied when the state space is infinite. For a finite state space, Section 4.2.1 will give verifiable conditions for strong irreducibility. The following result is interesting in this perspective.
If the state space V is finite, then the vector spaces of signed measures and of bounded functions both have finite dimension Card(V). Then, the eigenvalues and the dimensions of the eigenspaces and generalized eigenspaces of P, which are by definition those of its left and right actions, are identical, and the spectrum consists of the eigenvalues.
A function h is harmonic if Ph = h. It is then a right eigenvector for the eigenvalue 1.
Theorem 4.2.7 will provide an extension of these results for infinite state spaces, by wholly different methods. Saloff-Coste (1997, Section 1.2) provides a detailed commentary on the Doeblin condition and the Perron–Frobenius theorem.
This terminology comes from the following short direct proof of the fact that if the state space V is finite and P is irreducible, then every harmonic function is constant. If h is harmonic on V, then it attains its maximum M at at least one state x. Moreover,

M = h(x) = ∑_y P(x, y) h(y),

and as P(x, ·) is a probability measure, h(y) = M for all y such that P(x, y) > 0. Thus, irreducibility yields that h(y) = M for every y, and thus that h is constant.
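This maximum-principle argument can be checked numerically on a small example: for a finite irreducible transition matrix, the kernel of P − I should be one-dimensional and spanned by the constant functions. A sketch with an illustrative matrix (notation ours):

```python
import numpy as np

# An irreducible transition matrix on three states (values illustrative).
P = np.array([[0.0, 0.5, 0.5],
              [0.3, 0.0, 0.7],
              [0.6, 0.4, 0.0]])

# Harmonic functions solve (P - I) h = 0; the right-singular vectors of
# P - I with zero singular value form a basis of this kernel.
u, s, vt = np.linalg.svd(P - np.eye(3))
kernel = vt[np.isclose(s, 0.0)]
h = kernel[0]
print(h / h[0])          # proportional to (1, 1, 1): h is constant
```

Exactly one singular value vanishes here, reflecting the uniqueness (up to constants) of harmonic functions for an irreducible finite chain.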
We are now going to solve the recursion for the instantaneous laws π_n, and see how the situation deteriorates in practice very quickly as the size of the state space increases.
Let us denote the states by 1 and 2. There exist a and b in [0, 1] such that the transition matrix and its graph are given by

P = ( 1 − a    a
        b    1 − b ).

The recursion formula π_n = π_{n−1} P then writes out coordinate-wise, and the affine constraint π_n(1) + π_n(2) = 1 allows to reduce this linear recursion in dimension 2 to the affine recursion, in dimension 1,

π_n(1) = (1 − a − b) π_{n−1}(1) + b.

If a = b = 0, then P = I and every law is invariant, and P is not irreducible as no state leads to the other. Else, the unique fixed point is b/(a + b), the unique invariant law is π = (b/(a + b), a/(a + b)), and

π_n(1) = (1 − a − b)^n (π_0(1) − b/(a + b)) + b/(a + b),

and the formula for π_n(2) is obtained by symmetry or as π_n(2) = 1 − π_n(1).

If a = b = 1, then P has eigenvalues 1 and −1, the latter with eigenvector (1, −1), and the chain alternates between the states 1 and 2; π_n is equal to π_0 for even n and to π_0 P for odd n.

If a + b is neither 0 nor 2, then |1 − a − b| < 1 and π_n converges to π with geometric rate with ratio |1 − a − b|.
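As a sanity check, the closed-form solution can be compared with the direct iteration of π_n = π_{n−1} P; the parameter values below are illustrative:

```python
import numpy as np

a, b = 0.3, 0.1                          # illustrative parameters
P = np.array([[1 - a, a],
              [b, 1 - b]])

pi = np.array([1.0, 0.0])                # pi_0: start in state 1
n = 50
for _ in range(n):
    pi = pi @ P                          # pi_n = pi_{n-1} P

fixed = b / (a + b)                      # unique fixed point, here 0.25
closed = (1 - a - b) ** n * (1.0 - fixed) + fixed
print(pi[0], closed)                     # identical up to rounding
```

With ratio |1 − a − b| = 0.6, both values are already indistinguishable from the invariant value b/(a + b) = 0.25 after 50 steps.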
Let us denote the states by 1, 2, and 3. There exist six parameters in [0, 1], satisfying the three constraints that each row of the matrix sums to 1, such that the transition matrix and its graph are given by
As discussed above, we could reduce the linear recursion in dimension 3 to an affine recursion in dimension 2. Instead, we give the elements of a vectorial computation in dimension 3, which can be generalized to all dimensions. This exploits the fact that 1 is an eigenvalue of P and hence a root of its characteristic polynomial χ(X) = det(XI − P). Hence, by developing this polynomial and using the fact that 1 is a root, χ factorizes into

χ(X) = (X − 1) Q(X).

The polynomial Q of degree 2 is the characteristic polynomial of the affine recursion in dimension 2. It has two possibly equal roots λ_1 and λ_2 in ℂ, and if they are not real, then λ_2 = λ̄_1. Their exact theoretical expression is not very simple, as the discriminant of this polynomial does not simplify in general, but they can easily be computed on a case-by-case basis.
In order to compute P^n, we will use the Cayley–Hamilton theorem, according to which χ(P) = 0 (null matrix). The Euclidean division of X^n by χ yields

X^n = χ(X) q_n(X) + a_n X^2 + b_n X + c_n,

and in order to effectively compute a_n, b_n, and c_n, we take the values of the polynomials at the roots of χ, which yields the linear system

λ_i^n = a_n λ_i^2 + b_n λ_i + c_n  for the roots λ_i of χ.

This system has rank 3 if the three roots are distinct. If there is a double root λ, then two of these equations are identical, but as the double root is also a root of the derivative χ′, a simple derivative yields a third equation, nλ^{n−1} = 2 a_n λ + b_n, which is linearly independent of the two others. If the three roots of χ are equal, then they are equal to 1 and P = I.
If P is irreducible, then there exists a unique invariant law π, given by
Let now the state space have finite cardinality N ≥ 2. The above-mentioned method can be extended without any theoretical problem. The Euclidean division of X^n by the characteristic polynomial χ and the Cayley–Hamilton theorem yield that

P^n = a_{N−1,n} P^{N−1} + ⋯ + a_{1,n} P + a_{0,n} I.

If λ_1, …, λ_k are the distinct roots of χ and m_1, …, m_k are their multiplicities, then taking the values of X^n and of its first m_i − 1 derivatives at each root λ_i yields a system of N linearly independent equations for the N unknowns a_{0,n}, …, a_{N−1,n}, which thus has a unique solution.
The enormous obstacle for the effective implementation of this method for computing P^n is that we must first compute the roots of χ. The main information we have is that 1 is a root, and in general, computing the roots becomes a considerable problem as soon as N is large. Once the roots are found, solving the linear system and finding the invariant laws is a problem only when N is much larger.
This general method is simpler than finding the reduced Jordan form J for P, which also requires finding the roots of the characteristic polynomial χ, and then solving a linear system to find the change-of-basis matrix Q and its inverse Q^{−1}. Then, P^n = Q J^n Q^{−1}, where J^n can be made explicit.
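In the case of distinct roots, the method can be sketched numerically: obtain the roots of χ, solve the Vandermonde system for the coefficients a_n, b_n, c_n, and assemble P^n as a polynomial of degree 2 in P. The matrix below is an illustrative example whose three eigenvalues are distinct (two of them complex conjugate):

```python
import numpy as np

P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.4, 0.2, 0.4]])
n = 20

lam = np.linalg.eigvals(P)           # roots of chi; 1 is always among them

# Solve a_n lam^2 + b_n lam + c_n = lam^n at the three (distinct) roots.
V = np.vander(lam, 3)                # columns: lam^2, lam, 1
a_n, b_n, c_n = np.linalg.solve(V, lam ** n)

Pn = (a_n * (P @ P) + b_n * P + c_n * np.eye(3)).real
print(np.allclose(Pn, np.linalg.matrix_power(P, n)))   # True
```

The Vandermonde system is exactly the one described above; for repeated roots one would replace the duplicated equations by derivative conditions.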
We are going to describe in an informal manner some problems concerning random evolutions, for which the answers will obviously depend on some data or parameters. We will then model these problems using Markov chains.
These models will be studied in detail all along our study of Markov chains, which they will help to illustrate.
In these descriptions, random variables and draws will be supposed to be independent if not stated otherwise.
A particle evolves on a network V, that is, on a discrete additive subgroup of ℝ^d, such as ℤ^d. From x in V, it chooses to go to x + y in V with probability p(y), where the p(y) satisfy p(y) ≥ 0 and ∑_{y∈V} p(y) = 1. This can be, for instance, a model for the evolution of an electron in a network of crystals.
Some natural questions are the following:
Let (Y_n)_{n≥1} be a sequence of i.i.d. random variables such that P(Y_n = y) = p(y), and

X_n = X_0 + Y_1 + ⋯ + Y_n.

Theorem 1.2.3 shows that (X_n)_{n≥0} is a Markov chain on V, with a transition matrix, which is spatially homogeneous, or invariant by translation, given by P(x, x + y) = p(y). The matrix restricted to the network generated by all y such that p(y) > 0 is irreducible. The constant measures are invariant, as ∑_x P(x, y) = ∑_x p(y − x) = 1 for every y.
If E[|Y_1|] < ∞, then the strong law of large numbers yields that X_n/n converges a.s. to E[Y_1], and for E[Y_1] ≠ 0 the chain goes to infinity in the direction of E[Y_1]. The case E[Y_1] = 0 is problematic, and if E[|Y_1|^2] < ∞, then the central limit theorem shows that X_n/√n converges in law to a centered Gaussian, which gives some hints on the long-time behavior of the chain.
For V = ℤ^d, this Markov chain is called a nearest-neighbor random walk when p(y) = 0 unless |y| = 1, and the symmetric nearest-neighbor random walk when moreover p(y) = 1/(2d) for |y| = 1. These terminologies are used for other regular networks, such as the one in Figure 1.1.
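These asymptotics are easy to observe by simulation; the sketch below draws many symmetric nearest-neighbor walks on ℤ and checks the law-of-large-numbers and central-limit scalings (the sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
walks, n = 2000, 1000
steps = rng.choice([-1, 1], size=(walks, n))   # i.i.d. uniform +-1 steps
X = steps.cumsum(axis=1)                       # X_k for k = 1, ..., n

print(np.abs(X[:, -1] / n).mean())     # X_n / n -> E[Y_1] = 0
print(X[:, -1].std() / np.sqrt(n))     # std(X_n) / sqrt(n) -> 1 (CLT)
```

The first quantity is small (of order 1/√n), while the second stabilizes near the standard deviation of a single step, here 1.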
Two gamblers A and B play a game of heads or tails. Gambler A starts with a fortune of a units of money and Gambler B with a fortune of b units. At each toss, each gambler bets 1 unit, Gambler A wins with probability p and loses with probability q = 1 − p, and the total of the bets is given to the winner; a gambler thus either wins or loses 1 unit. The game continues until one of the gamblers is ruined: he or she is left with a fortune of 0 units, the global winner with a fortune of a + b units, and the game stops. This is illustrated in Figure 1.2. When p = 1/2, the game is said to be fair, and otherwise to be biased.
As an example, Gambler A goes to a casino (Gambler B). He or she decides to gamble 1 unit at each draw of red or black at roulette and to stop either after having won a total of b units (what he or she would like to gain) or lost a total of a units (the maximal loss he or she allows himself or herself). Owing to the 0 and (most usually) the double 0 on the roulette, which are neither red nor black, the game is biased against him or her, and p is worth either 18/37 if there is no double 0 or 18/38 if there is one.
From a formal point of view, there is a symmetry in the game obtained by switching p and q simultaneously with a and b. In practice, no casino allows a bias in favor of the gambler, nor even a fair game. A unilateral case will also be considered, in which a is finite and b is infinite. In the casino example, this corresponds to a compulsive gambler, who will stop only when ruined. In all cases, the evolution of the gambler's fortune is given by a nearest-neighbor random walk on ℤ, stopped when it hits a certain boundary.
Some natural questions are the following:
The evolution of the fortune of Gambler A can be represented using a sequence (Y_n)_{n≥1} of i.i.d. r.v. satisfying P(Y_n = 1) = p and P(Y_n = −1) = q = 1 − p, by

X_n = X_{n−1} + Y_n if 0 < X_{n−1} < a + b,  and X_n = X_{n−1} otherwise,

where X_0 is its initial fortune a, or more generally a r.v. with values in {0, 1, …, a + b} and independent of (Y_n)_{n≥1}. Gambler B's fortune at time n is a + b − X_n. Theorem 1.2.3 yields that (X_n)_{n≥0} is a Markov chain on {0, 1, …, a + b} with matrix and graph given by

P(x, x + 1) = p and P(x, x − 1) = q for 0 < x < a + b,  P(0, 0) = P(a + b, a + b) = 1,

the other terms of P being 0. The states 0 and a + b are absorbing; hence, P is not irreducible. The invariant measures are of the form αδ_0 + βδ_{a+b} with α ≥ 0 and β ≥ 0 arbitrary, and uniqueness does not hold (uniqueness being understood as “up to proportionality”).
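The absorption behavior is easy to explore by simulation. The sketch below estimates the probability that Gambler A is ruined and compares it with the classical ruin-probability formula ((q/p)^a − (q/p)^{a+b})/(1 − (q/p)^{a+b}) for p ≠ 1/2, a standard result for this chain; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
p, a, b = 0.45, 5, 5            # illustrative: biased game, fortunes 5 and 5
N = a + b

def ruined():
    x = a                        # Gambler A's current fortune
    while 0 < x < N:             # play until one gambler is ruined
        x += 1 if rng.random() < p else -1
    return x == 0

trials = 20000
estimate = sum(ruined() for _ in range(trials)) / trials

r = (1 - p) / p                  # q / p
exact = (r**a - r**N) / (1 - r**N)
print(estimate, exact)           # both close to 0.73
```

Even this mild bias against the gambler makes ruin much more likely than in the fair case, where the ruin probability would be b/(a + b) = 1/2.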
We study the successive generations of a population, constituted for instance of viruses in an organism, of infected people during an epidemic, or of neutrons during an atomic reaction. The individuals of one generation disappear in the following one, each giving birth there to k descendants with probability p_k, where ∑_{k≥0} p_k = 1. A classic subcase is that of a binary division: p_2 = p and p_0 = 1 − p for some p in (0, 1).
The result of this random evolution mechanism is called a branching process. It is also called a Galton–Watson process; the initial study of Galton and Watson, preceded by a similar study of Bienaymé, bore on family names in Great Britain.
Some natural questions are the following:
We shall construct a Markov chain corresponding to the sizes (numbers of individuals) of the population along the generations.
Let (ξ_n^i)_{n,i≥1} be i.i.d. r.v. such that P(ξ_n^i = k) = p_k for k in ℕ. We assume that the individuals of generation n − 1 are numbered 1, …, X_{n−1} and that the ith one yields ξ_n^i descendants in generation n, so that

X_n = ξ_n^1 + ⋯ + ξ_n^{X_{n−1}}.

(An empty sum being null by convention.)
Figure 1.3 illustrates this using the genealogical tree of a population, which gives the relationships between individuals in addition to the sizes of the generations, and explains the term “branching.”
The state space of (X_n)_{n≥0} is ℕ, and Theorem 1.2.3 applied to this random recursion yields that it is a Markov chain. The transition matrix is given by

P(x, y) = P(ξ^1 + ⋯ + ξ^x = y),

and the state 0 is absorbing. The matrix is not practical to use in this form.
It is much more practical to use generating functions. If

g(s) = ∑_{k≥0} p_k s^k,  0 ≤ s ≤ 1,

denotes the generating function of the reproduction law, the i.i.d. manner in which individuals reproduce yields that E[s^{X_n} | X_{n−1}] = g(s)^{X_{n−1}}. For n in ℕ, let

g_n(s) = E[s^{X_n}]

denote the generating function of the size of generation n. An elementary probabilistic computation yields that, for n ≥ 1,

g_n = g_{n−1} ∘ g,

and hence that g_n is given by g_0 composed with the n-fold composition of g.
We will later see how to obtain this result by Markov chain techniques and then how to exploit it.
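The composition formula already yields numbers: if X_0 = 1, then P(X_n = 0) = g_n(0), so iterating g at 0 gives the probability of extinction by generation n, and its limit is the extinction probability. A sketch for the binary-division subcase (the value of p is illustrative):

```python
p = 0.75                              # binary division: p_2 = p, p_0 = 1 - p

def g(s):
    """Generating function of the reproduction law."""
    return (1 - p) + p * s * s

# With X_0 = 1, P(X_n = 0) = g(g(...g(0)...)), the n-fold composition at 0.
s = 0.0
for n in range(200):
    s = g(s)
print(s)    # extinction probability: smallest root of g(s) = s, here 1/3
```

The iterates increase monotonically to the smallest fixed point of g in [0, 1]; here g(s) = s has roots 1/3 and 1, and the limit is 1/3.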
A container (urn, …) consists of two communicating compartments and contains a large number of particles (such as gas molecules). These are initially distributed between the two compartments according to some law, move around, and can switch compartments.
Tatiana and Paul Ehrenfest proposed a statistical mechanics model for this phenomenon. It is a discrete time model, in which at each step a particle is chosen uniformly among all particles and changes compartment. See Figure 1.4.
Some natural questions are the following:
Let N be the number of molecules, and let the compartments be numbered 0 and 1. A microscopic description of the system at time n is given by

Z_n = (Z_n^1, …, Z_n^N) in {0, 1}^N,

where the ith coordinate Z_n^i is the number of the compartment in which the ith particle is located.
Starting from a sequence (U_n)_{n≥1} of i.i.d. r.v. which are uniform on {1, …, N}, and an initial r.v. Z_0 independent of this sequence, we define recursively Z_n for n ≥ 1 by changing the coordinate of rank U_n of Z_{n−1}. This random recursion is a faithful rendering of the particle dynamics.
Theorem 1.2.3 implies that (Z_n)_{n≥0} is a Markov chain on {0, 1}^N with matrix given by

P(z, z′) = 1/N if z and z′ differ at exactly one coordinate, and P(z, z′) = 0 otherwise.

This is the symmetric nearest-neighbor random walk on the unit hypercube {0, 1}^N. This chain is irreducible. Its unique invariant law is the uniform law, with density 2^{−N}.
As the typical magnitude of N is comparable to the Avogadro number, approximately 6.02 × 10^23, the number 2^N of configurations is enormously huge. Any computation, even for the invariant law, is of a combinatorial nature and will most likely be intractable.
According to statistical mechanics, we should take advantage of the symmetries of the system, in order to stop following individual particles and consider collective behaviors instead.
A reduced macroscopic description of the system is given by the number of particles in compartment 1 at time n, expressed in terms of the microscopic description by

S_n = Z_n^1 + ⋯ + Z_n^N.

The information carried by (S_n)_{n≥0} being less than that carried by (Z_n)_{n≥0}, it is not clear that (S_n)_{n≥0} is a Markov chain, but the symmetry of the particle dynamics will allow us to prove that it is.
For z in {0, 1}^N, let σ_z be the permutation of {1, …, N} obtained by first placing in increasing order the i such that z^i = 0 and then, in increasing order, the i such that z^i = 1. Setting
it holds that
For some deterministic and , using the random recursion for ,
and hence, for all and ,
as is uniform and independent of . Hence, is uniform on and independent of . By a simple recursion, we conclude that the are i.i.d. r.v. which are uniform on and independent of and hence of .
Thus, Theorem 1.2.3 yields that (S_n)_{n≥0} is a Markov chain on {0, 1, …, N} with matrix and graph given by

P(k, k − 1) = k/N and P(k, k + 1) = (N − k)/N,

all other terms of P being zero. As (Z_n)_{n≥0} is irreducible on {0, 1}^N, it is clear that (S_n)_{n≥0} is irreducible on {0, 1, …, N}, and this can be readily checked.
As the uniform law on {0, 1}^N, with density 2^{−N}, is invariant for (Z_n)_{n≥0}, a simple combinatorial computation yields that the invariant law for (S_n)_{n≥0} is the binomial law B(N, 1/2), given by

π(k) = (N choose k) 2^{−N},  0 ≤ k ≤ N.

This law distributes the particles uniformly between the two compartments, and this is preserved by the random evolution.
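The invariance can be verified directly: the dynamics described above give the macroscopic transition probabilities P(k, k − 1) = k/N and P(k, k + 1) = (N − k)/N, and one checks πP = π for the binomial law. A sketch for a small illustrative N:

```python
import numpy as np
from math import comb

N = 10
P = np.zeros((N + 1, N + 1))
for k in range(N + 1):
    if k > 0:
        P[k, k - 1] = k / N          # a particle leaves compartment 1
    if k < N:
        P[k, k + 1] = (N - k) / N    # a particle enters compartment 1

pi = np.array([comb(N, k) / 2**N for k in range(N + 1)])
print(np.allclose(pi @ P, pi))       # True: the binomial law is invariant
```

The check amounts to the identity C(N, k−1)(N − k + 1) + C(N, k+1)(k + 1) = N C(N, k), which is the combinatorial computation mentioned above.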