Chapter 1

Introduction to Stochastic Processes

In this introductory chapter we present some notions regarding sequences of random variables (r.v.) and the most important classes of random processes.

1.1. Sequences of random variables

Let (Ω, F, P) be a probability space and (An)n≥1 a sequence of events in F.

– The set of ω ∈ Ω belonging to infinitely many of the An, that is,

lim supn An = ⋂n≥1 ⋃k≥n Ak,

is called the upper limit of the sequence (An).

– The set of ω ∈ Ω belonging to all An except possibly a finite number of them, that is,

lim infn An = ⋃n≥1 ⋂k≥n Ak,

is called the lower limit of the sequence (An).

THEOREM 1.1.– (Borel-Cantelli lemma) Let (An) be a sequence of events in F.

1. If Σn≥1 P(An) < ∞, then P(lim supn An) = 0.

2. If Σn≥1 P(An) = ∞ and (An) is a sequence of independent events, then P(lim supn An) = 1.

Let (Xn) be a sequence of r.v. and X an r.v., all of them defined on the same probability space (Ω, F, P). The convergence of the sequence (Xn) to X will be defined as follows:

1. Almost sure: Xn →a.s. X if P(limn→∞ Xn = X) = 1.

2. In probability: Xn →P X if for any ε > 0,

limn→∞ P(|Xn − X| > ε) = 0.

3. In distribution (or weakly, or in law): Xn →d X if FXn → FX pointwise in every continuity point of FX, where FXn and FX are the distribution functions of Xn and X, respectively.

4. In mean of order p: Xn →Lp X if E|Xn|^p < ∞ for all n ∈ N+ and limn→∞ E|Xn − X|^p = 0. The most commonly used are the cases p = 1 (convergence in mean) and p = 2 (mean square convergence).

The relations between these types of convergence are as follows:

[1.1] Xn →a.s. X ⟹ Xn →P X ⟹ Xn →d X, and Xn →Lp X ⟹ Xn →P X.

The convergence in distribution of r.v. is a convergence property of their distributions (i.e. of their laws) and it is the type most often used in probability applications. This convergence can be expressed by means of characteristic functions.

THEOREM 1.2.– (Lévy continuity theorem) The sequence (Xn) converges in distribution to X if and only if the sequence of characteristic functions of (Xn) converges pointwise to the characteristic function of X.

EXAMPLE 1.3.– Let X1, X2, … be independent r.v. such that P(Xn = 1) = 1 − 1/n and P(Xn = 0) = 1/n, n ≥ 1. We have Xn →P 1, but Xn is not almost surely convergent to 1 as n → ∞.

Indeed, for any ε > 0 we have

P(|Xn − 1| > ε) ≤ P(Xn = 0) = 1/n → 0 as n → ∞.

Consequently, Xn →P 1.

To analyze the convergence a.s., we recall the following condition: Xn →a.s. X if and only if, for any ε > 0 and 0 < δ < 1, there exists an n0 such that for any n > n0

[1.2] P(⋃k≥n {|Xk − X| ≥ ε}) < δ.

As for any ε ∈ (0, 1), δ ∈ (0, 1), and n ≥ 1 we have, by the independence of the Xk,

P(⋃k≥n {|Xk − 1| ≥ ε}) = 1 − Πk≥n (1 − 1/k) = 1

(the partial products telescope: Πn≤k≤m (1 − 1/k) = (n − 1)/m → 0 as m → ∞), there is no n0 such that relation [1.2] is satisfied, so we conclude that Xn does not converge a.s. to 1.
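Both regimes in this example can be illustrated numerically; the following sketch (with assumed sample sizes) draws independent Xn with P(Xn = 0) = 1/n and checks the vanishing frequency of {Xn ≠ 1} as well as the persistence of zeros along a single sample path.

```python
import random

random.seed(42)

# Empirical illustration of Example 1.3 (sample sizes are illustrative).
# (a) Convergence in probability: for a fixed large n, P(X_n != 1) = 1/n,
#     so the empirical frequency of {X_n = 0} over many copies is small.
n = 1000
trials = 20_000
freq = sum(random.random() < 1 / n for _ in range(trials)) / trials

# (b) No a.s. convergence: sum_n 1/n diverges, so along a single sample
#     path zeros keep occurring at arbitrarily late indices (Borel-Cantelli).
N = 1_000_000
zeros_at = [k for k in range(1, N + 1) if random.random() < 1 / k]

print(freq)           # close to 1/1000
print(len(zeros_at))  # about log(N) zeros, some at very late indices
```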

EXAMPLE 1.4.– Let us consider the probability space (Ω, F, P), with Ω = [0, 1], F = B([0, 1]), and P the Lebesgue measure on [0, 1]. Let the r.v. X and the sequence of r.v. (Xn, n ≥ 1) be defined by

X(ω) = 1[0,1/2](ω)

and

Xn(ω) = X(ω) if n is even, Xn(ω) = 1 − X(ω) if n is odd.

On the one hand, as

P(|Xn+1 − Xn| = 1) = 1 for all n ≥ 1,

we obtain that (Xn) is not Cauchy in probability. Thus (Xn) does not converge in probability. On the other hand, the characteristic functions of the r.v. Xn, n ≥ 1, and X are

φXn(t) = (1 + e^{it})/2, n ≥ 1,

and, respectively,

φX(t) = (1 + e^{it})/2,

so Xn →d X.

Let us consider a sequence of r.v. (Xn, n ≥ 1) and suppose that the expected values E Xn, n ≥ 1, exist. Let

[1.3] Yn = (1/n) Σ1≤k≤n (Xk − E Xk).

DEFINITION 1.5.– The sequence (Xn, n ≥ 1) is said to satisfy the weak or the strong law of large numbers if Yn →P 0 or Yn →a.s. 0, respectively.

Obviously, if a sequence of r.v. satisfies the strong law of large numbers, then it also satisfies the weak law of large numbers.

The traditional name "law of large numbers," still in use nowadays, should be clearly understood: it is indeed a genuine mathematical theorem. We still speak of laws of large numbers to express, vaguely but correctly, the convergence of frequencies toward probabilities; however, we also tend to believe sometimes that a return to equilibrium must always occur in the end, and this is a serious error of which we are often unaware.

Let us present now some “classic laws of large numbers.”

THEOREM 1.6.– (Markov) Let (Xn, n ≥ 1) be a sequence of r.v. such that E Xn and Var Xn exist for all n ≥ 1. If

[1.4] limn→∞ (1/n²) Var(Σ1≤k≤n Xk) = 0,

then (Xn, n ≥ 1) satisfies the weak law of large numbers.

PROOF.– Chebyshev's inequality applied to Yn defined in equation [1.3] yields

P(|Yn| ≥ ε) ≤ Var Yn / ε² = (1/(n²ε²)) Var(Σ1≤k≤n Xk),

and thus we obtain

limn→∞ P(|Yn| ≥ ε) = 0 for all ε > 0.

COROLLARY 1.7.– (Chebyshev) Let (Xn, n ≥ 1) be a sequence of independent r.v. such that E Xn and Var Xn exist for all n ≥ 1. If there exists an M, 0 < M < ∞, such that Var Xn ≤ M for all n ≥ 1, then (Xn, n ≥ 1) satisfies the weak law of large numbers.

PROOF.– In our case we have

(1/n²) Var(Σ1≤k≤n Xk) = (1/n²) Σ1≤k≤n Var Xk ≤ M/n → 0 as n → ∞,

and condition [1.4] is verified.

THEOREM 1.8.– (Kolmogorov) Let (Xn, n ≥ 1) be a sequence of independent r.v. such that E Xn and Var Xn exist for all n ≥ 1. If Σn≥1 Var Xn / n² < ∞, then (Xn, n ≥ 1) satisfies the strong law of large numbers.

COROLLARY 1.9.– In the settings of Corollary 1.7, the sequence of r.v. (Xn,n ≥ 1) satisfies the strong law of large numbers.

EXAMPLE 1.10.– Let (Xn) be a sequence of i.i.d. r.v. with common distribution P. For any Borel set B, the r.v. 1(Xn ∈ B) are i.i.d., with common expected value P(B). Using the law of large numbers (Corollary 1.9), we have

(1/n) Σ1≤k≤n 1(Xk ∈ B) →a.s. P(B) as n → ∞.

This argument justifies the estimation of probabilities by frequencies.
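A minimal Monte Carlo sketch of this estimation, with an assumed distribution P (uniform on [0, 1]) and the Borel set B = [0, 0.3], so that P(B) = 0.3:

```python
import random

random.seed(0)

# Estimate P(B) by the empirical frequency (1/n) sum_k 1(X_k in B),
# with X_k i.i.d. uniform on [0, 1] and B = [0, 0.3] (assumed choices).
n = 200_000
hits = sum(random.random() <= 0.3 for _ in range(n))
freq = hits / n

print(freq)  # close to P(B) = 0.3
```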

EXAMPLE 1.11.– Let (Xn) be a sequence of independent r.v. such that Xn takes the values −√n, 0, and √n, with probabilities 1/n, 1 − 2/n, and 1/n, respectively, n ≥ 2.

We have E Xk = 0 and Var(Xk) = 2, k = 2, 3, .... Consequently, the sequence (Xn) satisfies the strong law of large numbers (see Corollary 1.9).

EXAMPLE 1.12.– Let (Xn) be a sequence of independent r.v. such that Xn takes the values −n^{1/4} and n^{1/4}, with probabilities 1/2 and 1/2.

We have E Xk = 0 and Var(Xk) = √k, which yields

Var(Σ1≤k≤n Xk) = Σ1≤k≤n √k ≤ n√n

and

limn→∞ (1/n²) Var(Σ1≤k≤n Xk) ≤ limn→∞ 1/√n = 0.

Consequently, the hypotheses of Theorem 1.6 are fulfilled and the sequence (Xn) satisfies the weak law of large numbers.

It is easy to see that in applications we often have to deal with random variables that are sums of a large number of independent components. Under very general conditions, such sum variables are approximately normally distributed. This fundamental fact is expressed mathematically in one of the forms (depending on the conditions) of the central limit theorem. To state the central limit theorem, we consider a sequence (Xn, n ≥ 1) of r.v. and we introduce the following notation:

ak = E Xk, σk² = Var Xk, Sn = Σ1≤k≤n Xk, Bn² = Σ1≤k≤n σk².

A central limit theorem gives sufficient conditions for the random variable

Zn = (Sn − E Sn)/√(Var Sn)

to converge in distribution to the standard normal distribution N(0, 1).

We see that, if the r.v. (Xn, n ≥ 1) are independent, then Var Sn = Bn² and Zn = (Sn − Σ1≤k≤n ak)/Bn is a standard (centered and reduced) r.v., because E Zn = 0 and Var Zn = 1.

We now give a central limit theorem that is important in probability and mathematical statistics.

THEOREM 1.13.– Let (Xn, n ≥ 1) be a sequence of independent identically distributed r.v. with mean a and finite positive variance σ². Then, the sequence of r.v. Zn = (Sn − na)/(σ√n), n ≥ 1, is convergent in distribution to the normal distribution N(0, 1). Zn is said to be asymptotically normal.

PROOF.– Let φ(t) be the characteristic function of X1 − a (hence of all Xn − a, n ≥ 1). Then the characteristic function of Zn is

φZn(t) = [φ(t/(σ√n))]^n = [1 − t²/(2n) + o(1/n)]^n.

Therefore, limn→∞ φZn(t) = e^{−t²/2} and, using Theorem 1.2, we obtain Zn →d N(0, 1).

COROLLARY 1.14.– (De Moivre-Laplace) If X ~ Bi(n, p) and q = 1 − p, then (X − np)/√(npq) is asymptotically normal N(0, 1), as n → ∞.

PROOF.– This result is a direct consequence of Theorem 1.13, because X = Σ1≤k≤n Xk, with (Xn, n ≥ 1) i.i.d. random variables, Xn ~ Be(p).
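The quality of this normal approximation can be checked numerically; the sketch below compares the exact binomial CDF with its N(0, 1) approximation (a continuity correction is used; the parameter values are illustrative).

```python
import math

def binom_cdf(k, n, p):
    # exact binomial CDF computed from the probability mass function
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def phi(x):
    # standard normal CDF expressed through the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

n, p = 600, 1 / 6              # e.g. 600 die rolls, X = number of sixes
q = 1 - p
k = 110
exact = binom_cdf(k, n, p)
approx = phi((k + 0.5 - n * p) / math.sqrt(n * p * q))  # continuity correction

print(exact, approx)  # the two values agree to about two decimal places
```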

EXAMPLE 1.15.– We roll a die n times and we consider the r.v. N = number of sixes obtained. Starting from which value of n do we have 9 chances out of 10 to get image?

As N is a binomial r.v. Bi(n, 1/6), for n large enough we can use Corollary 1.14 to approximate the r.v. (N − n/6)/√(5n/36) by a normal r.v. N(0, 1). Then, the condition image can be written as image. From the relation image, which can be written as 2Φ(x) − 1 = 0.9, we obtain x = 1.645 (using the table of the standard normal distribution). Thus we have 9 chances out of 10 to have image if the inequality image ≥ 1.645 is satisfied, that is, if

EXAMPLE 1.16.– Suppose that the lifetime of a component is an exponentially distributed r.v. with parameter λ = 0.2 × 10⁻³ h⁻¹. When a component breaks down, it is immediately replaced with a new identical one. What is the probability that the component working after 50,000 hours is at least the 10th used from the beginning?

Let us denote by Sn the sum of n independent exponential random variables that represent the lifetimes of the first n components. From the properties of the exponential distribution, we have E Sn = n/λ and Var Sn = n/λ². Using Theorem 1.13, we can approximate (Sn − n/λ)/(√n/λ) by a standard normal random variable, which is equivalent to saying that we can approximate Sn by a normal r.v. N(n/λ, √n/λ) = N(5000n, 5000√n). The fact that the component working after 50,000 hours is at least the 10th used means that we have S9 < 50,000, so we need only to compute P(S9 < 50,000), with S9 ~ N(45,000, 15,000). Thus we obtain the required probability

P(S9 < 50,000) = Φ((50,000 − 45,000)/15,000) = Φ(1/3) ≈ 0.63.
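The numerical value above can be reproduced with the standard normal CDF written in terms of the error function:

```python
import math

def phi(x):
    # standard normal distribution function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

lam = 0.2e-3                        # failure rate (per hour)
mean = 9 / lam                      # E S_9 = 45,000 h
std = math.sqrt(9) / lam            # standard deviation of S_9 = 15,000 h
prob = phi((50_000 - mean) / std)   # Phi(1/3)

print(round(prob, 4))  # ~ 0.6306
```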

EXAMPLE 1.17.– A restaurant can serve 75 meals a day. We know from experience that 20% of the clients that reserved would not eventually show up.

a) The owner of the restaurant accepts 90 reservations. What is the probability that more than 50 clients show up?

b) How many reservations should the owner of the restaurant accept in order to have a probability greater than or equal to 0.9 that he will be able to serve all the clients who will show up?

If n is the number of clients who reserved, the number N of clients who will come has a binomial distribution with parameters n and p = 0.8. Using Corollary 1.14, this binomial distribution is approximated by a normal distribution with the same mean and variance.

a) For n = 90 we have E N = np = 72 and Var N = npq = 14.4. Consequently,

P(N > 50) ≈ 1 − Φ((50 − 72)/√14.4) = Φ(5.80) ≈ 1.

b) We have to solve the inequality P(N ≤ 75) ≥ 0.9 with respect to n. On the one hand, we have

P(N ≤ 75) ≈ Φ((75 − 0.8n)/√(0.16n)),

and, on the other hand, we have Φ(1.281) = 0.9. So we need to have (75 − 0.8n)/(0.4√n) ≥ 1.281, that is, 0.8n + 0.5124√n − 75 ≤ 0. By letting x = √n we get the inequality 0.8x² + 0.5124x − 75 ≤ 0 and finally obtain x ≤ 9.3675, i.e. n ≤ 87.
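The quadratic inequality can be solved directly; a quick check of the numbers above:

```python
import math

# Solve 0.8 x^2 + 0.5124 x - 75 <= 0 with x = sqrt(n), from Example 1.17 b).
a, b, c = 0.8, 0.5124, -75.0
x_max = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)  # positive root
n_max = math.floor(x_max ** 2)                         # largest admissible n

print(x_max, n_max)  # ~ 9.3675 and 87
```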

1.2. The notion of stochastic process

A stochastic or a random process is a family of r.v. X = (Xt, t ∈ I) defined on the same probability space (Ω, F, P), with values in a measurable space (E, ℰ).

The set E can be either ℝ or ℝ^d, and in this case ℰ is the σ-algebra of Borel sets, or an arbitrary finite or countably infinite discrete set, and in this case ℰ = P(E), the set of all subsets of E.

The index set I is usually ℕ (in this case the process is called a chain) or ℝ+ = [0, ∞), and the parameter t is interpreted as being the time. The function t ↦ Xt(ω) is called a realization or a sample path of the process (see Figure 1.1).

If the evolution of a stochastic process is done by jumps from state to state and if almost all its sample paths are constant except at isolated jump times, then the process is called a jump process.

EXAMPLE 1.18.– Let us consider the process X = (Xt, t ≥ 0), with state space E = ℕ, describing the evolution of a population, the number of failures of a component, etc. Then Xt(ω) = X(t, ω) = i ∈ E means that the population has i individuals at time t, or that the component has failed i times during the time interval (0, t].

The times T0(ω),T1(ω), ... (Figure 1.2) are the jump times (transition times) for the particular sample path ω ∈ Ω. The r.v. ζ = supn Tn is the lifetime of the process.

Figure 1.1. Process with continuous sample path


The stochastic processes can be studied either by means of their finite-dimensional distributions or by considering the type of dependence between the r.v. of the process. In the latter case, the nature of the process is given by this type of dependence.

The distribution of a stochastic process X, i.e. the probability P ∘ X⁻¹, is specified by the knowledge of its finite-dimensional distributions. For a real-valued

Figure 1.2. Sample path of a jump process


process we define

[1.5] Ft1,...,tn(x1, ..., xn) = P(Xt1 ≤ x1, ..., Xtn ≤ xn)

for any t1, ..., tn ∈ I and x1, ..., xn ∈ ℝ.

The process is said to be given if all the finite-dimensional distributions Ft1,...,tn(·, ..., ·), t1, ..., tn ∈ I, are given.

The finite-dimensional distributions must satisfy the following properties:

1) for any permutation (i1, ..., in) of (1, ..., n),

Fti1,...,tin(xi1, ..., xin) = Ft1,...,tn(x1, ..., xn);

2) for any 1 ≤ k ≤ n and x1, ..., xk ∈ ℝ,

Ft1,...,tk(x1, ..., xk) = limxk+1→∞, ..., xn→∞ Ft1,...,tn(x1, ..., xn).

Let X = (Xt; t ∈ I) and Y = (Yt; t ∈ I) be two stochastic processes on the same probability space (Ω, F, P), with values in the same measurable space (E, ℰ).

DEFINITION 1.19.– (Stochastically equivalent processes in the wide sense) If two stochastic processes X and Y satisfy

P(Xt1 ∈ A1, ..., Xtn ∈ An) = P(Yt1 ∈ A1, ..., Ytn ∈ An)

for all n ∈ ℕ, t1, ..., tn ∈ I, and A1, ..., An ∈ ℰ, then they are called stochastically equivalent in the wide sense.

DEFINITION 1.20.– (Stochastically equivalent processes) If two stochastic processes X and Y satisfy

P(Xt = Yt) = 1, for all t ∈ I,

then they are called stochastically equivalent.

DEFINITION 1.21.– (Indistinguishable processes) If two stochastic processes X and Y satisfy

P(Xt = Yt, for all t ∈ I) = 1,

then they are called indistinguishable.

PROPOSITION 1.22.– We have the following implications: X, Y indistinguishable ⟹ X, Y stochastically equivalent ⟹ X, Y stochastically equivalent in the wide sense.

PROPOSITION 1.23.– If the processes X and Y are stochastically equivalent and right continuous, then they are indistinguishable.

DEFINITION 1.24.– (Version or modification of a process) If the process Y is stochastically equivalent to the process X, then we say that Y is a version or a modification of the process X.

DEFINITION 1.25.– (Continuous process) A process X with values in a Borel space (E, ℰ) is said to be continuous a.s. if, for almost all ω, the function t ↦ Xt(ω) is continuous.

PROPOSITION 1.26.– (Kolmogorov continuity criterion) A real process X has a continuous modification if there exist constants α, β, C > 0 such that E|Xt − Xs|^α ≤ C|t − s|^{1+β} for all t and s.

1.3. Martingales

1.3.1. Stopping time

Let (Ω, F, P) be a probability space. An increasing sequence of σ-algebras, F = (Fn, n ∈ ℕ), Fn ⊆ F, is called a filtration of F. A sequence (Xn, n ∈ ℕ) of r.v. is said to be F-adapted if Xn is Fn-measurable for all n ∈ ℕ. Usually, a filtration is associated with the sequence of r.v. (Xn, n ∈ ℕ) itself, that is, we take Fn = σ(X0, ..., Xn), and the sequence is then F-adapted. This is called the natural filtration of (Xn, n ∈ ℕ).

DEFINITION 1.27.– Let F = (Fn, n ∈ ℕ) be a filtration of F. A stopping time for F (or F-adapted, or F-stopping time) is an r.v. T with values in ℕ ∪ {+∞} satisfying one of the following (equivalent) conditions:

{T ≤ n} ∈ Fn for all n ∈ ℕ, or {T = n} ∈ Fn for all n ∈ ℕ.

If Fn = σ(X0, ..., Xn), T is said to be adapted to the sequence (Xn, n ∈ ℕ).

EXAMPLE 1.28.– (Hitting time) Let (Xn, n ∈ ℕ) be a sequence of r.v. with values in ℝ^d and B ∈ B(ℝ^d). The r.v. T = inf{n ∈ ℕ : Xn ∈ B} is a stopping time, adapted to the sequence (Xn, n ∈ ℕ), and is called the hitting time of B. Indeed,

{T = n} = {X0 ∉ B, ..., Xn−1 ∉ B, Xn ∈ B} ∈ σ(X0, ..., Xn).

Properties of stopping times

1. The set FT = {A ∈ F : A ∩ {T ≤ n} ∈ Fn for all n ∈ ℕ} is a σ-algebra called the σ-algebra of events prior to T.

2. If S and T are stopping times, then S + T, S ∧ T, and S ∨ T are also stopping times.

3. If (Tn, n ∈ ℕ) is a sequence of stopping times, then supn Tn is also a stopping time.

4. If S and T are stopping times such that S ≤ T, then FS ⊆ FT.

PROPOSITION 1.29.– (Wald identity) If T is a stopping time with finite expected value, adapted to an i.i.d. and integrable random sequence (Xn, n ∈ ℕ), then

E(Σ1≤n≤T Xn) = E(T) E(X1).
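The Wald identity can be checked by simulation; the sketch below (an assumed setup, not from the text) uses die rolls Xn uniform on {1, …, 6} and the stopping time T = first index with Xn = 6, so that E T = 6, E X1 = 3.5, and E(X1 + ··· + XT) = 21.

```python
import random

random.seed(7)

# Monte Carlo check of E(X_1 + ... + X_T) = E(T) E(X_1) for die rolls and
# T = first time a six appears (T is geometric with parameter 1/6).
trials = 100_000
total_sum = total_len = 0
for _ in range(trials):
    s = t = 0
    while True:
        x = random.randint(1, 6)
        s += x
        t += 1
        if x == 6:
            break
    total_sum += s
    total_len += t

mean_sum = total_sum / trials  # ~ 6 * 3.5 = 21
mean_len = total_len / trials  # ~ 6

print(mean_sum, mean_len)
```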

1.3.2. Discrete-time martingales

We will consider in the following that every filtration F is associated with the corresponding random sequence.

DEFINITION 1.30.– A sequence of r.v. (Xn, n ∈ ℕ) is an:

1) F-martingale if

(a) Xn is F-adapted for all n ∈ ℕ;

(b) E|Xn| < ∞ for all n ∈ ℕ;

(c) Xn = E(Xn+1 | Fn) a.s. for all n ∈ ℕ.

2) F-submartingale if it satisfies (a), (b), and (c) with ≤ instead of =.

3) F-supermartingale if it satisfies (a), (b), and (c) with ≥ instead of =.

Note that condition (c) is equivalent to E(Xn+1 − Xn | Fn) = 0 a.s.

EXAMPLE 1.31.– Let (ζn, n ≥ 1) be a sequence of i.i.d. r.v. such that P(ζn = 1) = p and P(ζn = −1) = 1 − p. The random walk (see section 2.2) Xn = ζ1 + ··· + ζn, n ≥ 1, is a submartingale for the natural filtration of (ζn) if p > 1/2, a martingale if p = 1/2, and a supermartingale if p < 1/2. Indeed, we have E(Xn+1 | Fn) = Xn + E ζn+1 = Xn + 2p − 1. Consequently, the three cases follow from the sign of 2p − 1.

EXAMPLE 1.32.– Let X be a real r.v. with E|X| < ∞ and let F be an arbitrary filtration. Then, the sequence Xn = E(X | Fn), n ∈ ℕ, is a martingale. Indeed,

E(Xn+1 | Fn) = E(E(X | Fn+1) | Fn) = E(X | Fn) = Xn.

EXAMPLE 1.33.– (A martingale at the casino) A gambler bets 1 euro the first time; if he loses, he bets 2 euros the second time, and, in general, 2^{k−1} euros the kth time. He stops gambling when he wins for the first time. At every play, he wins or loses his bet with a probability of 1/2. This strategy will make him eventually win the game. Indeed, if he stops playing at the random time N, he will have won 2^{N−1} − (1 + 2 + ··· + 2^{N−2}) = 1 euro.

If Xn is the r.v. defined as the fortune of the gambler after the nth game, we have

Xn = X0 − (1 + 2 + ··· + 2^{n−1}) = X0 − (2^n − 1)

if he loses up to the nth game. Consequently, E(Xn+1 | Xn, ..., X1) = Xn.

On the other hand, the expected value of the loss accumulated before the final win is

E(2^{N−1} − 1) = Σn≥1 (2^{n−1} − 1) 2^{−n} = ∞,

because N is a geometrically distributed r.v. with parameter 1/2 and Xn = XN for n ≥ N. Consequently, the strategy of the player is valid only if his initial fortune is greater than the casino's.
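A short simulation of this doubling strategy (with the stakes of the example) confirms both features: every completed round nets exactly 1 euro, while the capital that must be risked before the final win is very large in practice.

```python
import random

random.seed(1)

def play_round():
    # double the stake after every loss; stop at the first win
    bet, lost = 1, 0
    while True:
        if random.random() < 0.5:          # the gambler wins this play
            return bet - lost, lost + bet  # net gain, total money risked
        lost += bet
        bet *= 2

nets, max_risked = [], 0
for _ in range(10_000):
    net, risked = play_round()
    nets.append(net)
    max_risked = max(max_risked, risked)

print(set(nets), max_risked)  # the net gain is always 1; max_risked is large
```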

1.3.3. Martingale convergence

THEOREM 1.34.– (Doob’s convergence theorem) Every supermartingale, submartingale or martingale bounded in L1 converges a.s. to an integrable r.v.

EXAMPLE 1.35.– If X is an r.v. on (Ω, F, P) with finite expected value and (Fn, n ∈ ℕ) is a filtration of F, then

E(X | Fn) →a.s. E(X | F∞) as n → ∞, where F∞ = σ(⋃n Fn).

THEOREM 1.36.– If (Xn) is a uniformly integrable martingale, i.e.

lima→∞ supn E(|Xn| 1(|Xn| > a)) = 0,

then there exists an integrable r.v. X such that Xn → X a.s. and in L¹, and we have Xn = E(X | Fn).

A martingale (Xn) for (Fn) is said to be square integrable if E(Xn²) < ∞ for all n ≥ 1.

THEOREM 1.37.– (Strong convergence theorem) If (Xn) is a square integrable martingale bounded in L², then there exists an r.v. X such that Xn → X a.s. and in L².

THEOREM 1.38.– (Stopping theorem) Let (Xn) be a martingale (resp. a submartingale) for (Fn). If S and T are two stopping times adapted to (Fn) such that

1) E|XS| < ∞ and E|XT| < ∞;

2) P(S ≤ T < ∞) = 1 and limn→∞ E(|Xn| 1(T > n)) = 0;

then E(XT | FS) = XS (resp. E(XT | FS) ≥ XS) a.s.

1.3.4. Square integrable martingales

Let (Mn, n ≥ 0) be a square integrable martingale, i.e.

E(Mn²) < ∞ for all n ≥ 0.

The process (Mn², n ≥ 0) is a submartingale and Doob's decomposition gives

Mn² = Xn + ⟨M⟩n,

where Xn is a martingale, and ⟨M⟩n is a predictable increasing process, that is, Fn−1-measurable for all n ≥ 1.

THEOREM 1.39.– Let (Mn) be a square integrable martingale with predictable process ⟨M⟩n, and let (an) be a sequence increasing to infinity. If

1) limn→∞ ⟨M⟩n = ∞ a.s.;

2) Σn≥1 an⁻² E[(Mn − Mn−1)² | Fn−1] < ∞ a.s.;

then

Mn/⟨M⟩n → 0 a.s. as n → ∞

and

Mn/an → 0 a.s. as n → ∞.

1.4. Markov chains

Markov chains and processes represent probabilistic models of great importance for the analysis and study of complex systems. The fundamental concepts of Markov modeling are the state and the transition.

1.4.1. Markov property

It is clear that the situation of a physical system at a certain moment can be completely specified by giving the values of a certain number of variables that describe the system. For instance, a physical system can often be specified by giving the values of its temperature, pressure, and volume; in a similar way, a particle can be specified by its coordinates with respect to a coordinate system, by its mass and speed. The set of such variables is called the state of the system, and the knowledge of the values of these variables at a fixed moment allows us to specify the state of the system, and, consequently, to describe the system at that precise moment.

Usually, a system evolves in time from one state to another, and is thus characterized by its own dynamics. For instance, the state of a chemical system can change due to a modification in the environment temperature and/or pressure, whereas the state of a particle can change because of interaction with other particles. These state modifications are called transitions.

In many applications, the states are described by continuous variables and the transitions may occur at any instant. To simplify, we will consider a system with a finite number of states, denoted by E = {1, 2, . . ., N}, and with transitions that can occur at discrete-time moments. So, if we set X(n) for the state of the system at time n, then the sequence X(0), X(1), . . ., X(n) describes the “itinerary” of the system in the state space, from the beginning of the observation, up to the fixed time n; this sequence is called a sample path (realization or trajectory) of the process (see section 1.2).

In most concrete situations, the observation of the process leads us to the conclusion that the process is random. Consequently, to a sample path of the process a certain probability

P(X(0) = i0, X(1) = i1, ..., X(n) = in)

needs to be associated.

needs to be associated. Elementary techniques of probability theory show that these probabilities can be expressed in terms of the conditional probabilities

P(X(n + 1) = in+1 | X(0) = i0, X(1) = i1, ..., X(n) = in)

for all n ∈ ℕ and for any states i0, i1, ..., in+1 ∈ E. This means that it is necessary to know the probability that the system is in a certain state in+1 ∈ E after the (n + 1)th transition, knowing its history up to time n. Computing all these conditional probabilities renders the study of a real phenomenon modeled in this way very complicated. The statement that the process is Markovian is equivalent to the simplifying hypothesis that only the last state (i.e. the current state) counts for its future evolution. In other words, for a Markov process we have (the Markov property)

[1.6] P(X(n + 1) = in+1 | X(0) = i0, ..., X(n) = in) = P(X(n + 1) = in+1 | X(n) = in).

DEFINITION 1.40.– The sequence of r.v. X = (Xn, n ∈ ℕ) defined on (Ω, F, P), with values in the set E, is called a Markov chain (or discrete-time Markov process) with a finite or countable state space, if the Markov property is satisfied.

A Markov chain is called nonhomogeneous or homogeneous (with respect to time) according to whether or not the common value of the two members of [1.6] (i.e. the function p(n, i, j)) depends on n. This probability is called the transition function (or transition probability) of the chain.

For more details on the study of discrete-time Markov processes with finite or countable state space, see [CHU 67, IOS 80, KEM 60, RES 92].

So, Markovian modeling is adapted to physical phenomena or systems whose behavior is characterized by a certain memoryless property, in the sense specified in [1.6]. For real applications, it is often very difficult, even impossible, to know whether a physical system has a Markovian behavior or not; in fact, it is important to be able to justify this Markovian behavior (at least as a first approximation of the phenomenon) and thus to obtain a model useful for the study of the phenomenon.

Note that Markovian modeling can also be used in the case in which a fixed number of states, not only the last one, determines the future evolution of the phenomenon. For instance, suppose that we take into account the last two visited states. Then, we can define a new process with N² states, where the states are defined as couples of states of the initial process; this new process satisfies property [1.6] and the Markovian model is good enough, but with the obvious drawback of computational complexity.

PROPOSITION 1.41.– If the Markov property [1.6] is satisfied, then

[1.7] P(Xn+1 = j1, ..., Xn+m = jm | X0 = i0, ..., Xn = in) = P(Xn+1 = j1, ..., Xn+m = jm | Xn = in)

for all n, m ∈ ℕ and i0, i1, ..., in, j1, ..., jm ∈ E.

REMARK 1.42.– The more general relation

[1.8] P(A | X0 = i0, ..., Xn = in) = P(A | Xn = in)

can be proved for any A ∈ σ(Xk; k ≥ n + 1). The equalities between conditional probabilities have to be understood in the sense of almost sure equality. We will often write P(· | Xn = in) instead of P(· | X0 = i0, ..., Xn = in).

1.4.2. Transition function

Throughout this chapter, we will be concerned only with homogeneous Markov chains, and the transition function will be denoted by p(i, j). It satisfies the following properties:

[1.9] p(i, j) ≥ 0 for all i, j ∈ E;

[1.10] Σj∈E p(i, j) = 1 for all i ∈ E.

A matrix that satisfies [1.9] and [1.10] is said to be stochastic.

From [1.8], taking A = (Xn+1 = j), we obtain

[1.11] P(Xn+1 = j | X0 = i0, ..., Xn = i) = p(i, j).

The probability P(Xn+m = j | Xn = i) does not depend on n and will be denoted by p(m)(i, j).

The matrix p = (p(i, j); i, jE) is called the transition matrix of the chain, and p(m)(i, j) is the (i, j) element of the matrix pm.

From the matrix equality

[1.12] p^{m+n} = p^m p^n

we obtain

[1.13] p(m+n)(i, j) = Σk∈E p(m)(i, k) p(n)(k, j)

for all m, n ∈ ℕ. Equality [1.12] (or [1.13]) is called the Chapman-Kolmogorov identity (or equality).
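The identity [1.13] is easy to verify numerically for any stochastic matrix; a sketch with an arbitrarily chosen 3-state matrix (values are illustrative only):

```python
# Check p^(m+n)(i, j) = sum_k p^(m)(i, k) p^(n)(k, j) for a 3-state chain.
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(p, m):
    n = len(p)
    r = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(m):
        r = mat_mul(r, p)
    return r

p = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.4, 0.4, 0.2]]

lhs = mat_pow(p, 5)                          # p^(2+3)
rhs = mat_mul(mat_pow(p, 2), mat_pow(p, 3))  # p^2 p^3
```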

EXAMPLE 1.43.– Binary component. Consider a binary component starting to work at time n = 0. The lifetime of the component has a geometric distribution on image of parameter p. When a failure occurs, it is replaced with a new, identical component. The replacement time is a geometric random variable of parameter q. Denote by 0 the working state, by 1 the failure state, and let Xn be the state of the component at time n ≥ 0. Then Xn is an r.v. with values in E = {0, 1}. It can be shown that image is a Markov chain with state space E and transition matrix

p = ( 1 − p    p
        q    1 − q )

Using the eigenvalues of the matrix p, which are 1 and 1 − p − q, we obtain

p^n(0, 0) = q/(p + q) + (p/(p + q))(1 − p − q)^n,  p^n(0, 1) = p/(p + q) − (p/(p + q))(1 − p − q)^n,

p^n(1, 0) = q/(p + q) − (q/(p + q))(1 − p − q)^n,  p^n(1, 1) = p/(p + q) + (q/(p + q))(1 − p − q)^n.
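For illustrative values of p and q, the powers p^n can be iterated numerically; as n → ∞, the factor (1 − p − q)^n vanishes and both rows converge to the limit distribution (q/(p + q), p/(p + q)).

```python
# Two-state chain of Example 1.43 with assumed values p = 0.1, q = 0.3.
p_fail, q_rep = 0.1, 0.3
P = [[1 - p_fail, p_fail],
     [q_rep, 1 - q_rep]]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Pn = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(200):
    Pn = mat_mul(Pn, P)          # P^200: (1 - p - q)^200 is negligible

s = p_fail + q_rep
limit_row = [q_rep / s, p_fail / s]  # (0.75, 0.25)

print(Pn, limit_row)
```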

EXAMPLE 1.44.– A queuing model. A service unit (health center, civil service, tasks arriving in a computer, etc.) can serve one customer (if any) at each of the times 0, 1, 2, .... We suppose that during the time interval (n, n + 1] there are ξn clients arriving, n ∈ ℕ, where the r.v. ξ0, ξ1, ξ2, ... are i.i.d. and P(ξn = k) = pk, k ≥ 0, Σk≥0 pk = 1. Let m be the number of places available in the queue (m can also take the value +∞). When a customer arrives and sees m clients waiting to be served, he leaves without waiting. Let Xn be the number of customers in the queue at time n (including also the one that is served at that moment). We have

Xn+1 = min((Xn − 1)+ + ξn, m),

where x+ = max(x, 0). The process (Xn, n ∈ ℕ) is a Markov chain with state space E = {0, 1, …, m}, because Xn+1 depends only on the independent r.v. Xn and ξn. The transition function is given by

p(0, j) = pj for 0 ≤ j < m,  p(0, m) = Σk≥m pk,

and, for 1 ≤ i ≤ m,

p(i, j) = pj−i+1 for i − 1 ≤ j < m,  p(i, m) = Σk≥m−i+1 pk,  p(i, j) = 0 for j < i − 1.
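The recursion above translates directly into a simulation; the following sketch uses an assumed arrival distribution p0 = 0.5, p1 = 0.3, p2 = 0.2 and m = 5 waiting places, and records the empirical distribution of the queue length.

```python
import random

random.seed(3)

arrival_probs = [0.5, 0.3, 0.2]   # assumed distribution of xi_n
m = 5                             # number of places in the queue

def arrivals():
    # sample xi_n from (p_0, p_1, p_2) by inversion
    u, acc = random.random(), 0.0
    for k, pk in enumerate(arrival_probs):
        acc += pk
        if u < acc:
            return k
    return len(arrival_probs) - 1

x, steps = 0, 100_000
visits = [0] * (m + 1)
for _ in range(steps):
    x = min(max(x - 1, 0) + arrivals(), m)  # X_{n+1} = min((X_n - 1)^+ + xi_n, m)
    visits[x] += 1

occupancy = [v / steps for v in visits]     # empirical distribution of X_n
print(occupancy)
```

Since the mean number of arrivals per step (0.7) is below the service rate (one per step), the queue spends most of its time near the empty state.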

EXAMPLE 1.45.– Storage model. A certain product is stocked in order to face a random demand. The stockpile can be replenished at times 0, 1, 2, … and the demand in the time interval (n, n + 1] is considered to be a discrete ℕ-valued r.v. ξn. The r.v. ξ0, ξ1, ξ2, ... are supposed to be i.i.d. and P(ξn = k) = pk, k ≥ 0, Σk≥0 pk = 1.

The stocking strategy is the following: let m, M ∈ ℕ be such that m < M; if, at an arbitrary time n ∈ ℕ, the inventory level is less than or equal to m, then the inventory is brought up to M. Whereas, if the inventory level is greater than m, no replenishment is undertaken.

If Xn denotes the stock level just before the possible supply at time n, we have

Xn+1 = (M − ξn)+ if Xn ≤ m,  and  Xn+1 = (Xn − ξn)+ if Xn > m.

The same approach as in example 1.44 can be used in order to show that image is a Markov chain and subsequently obtain its transition function.

1.4.3. Strong Markov property

Let X = (Xn, n ∈ ℕ) be a Markov chain defined on (Ω, F, P), with values in E. Consider the filtration Fn = σ(X0, ..., Xn), n ∈ ℕ, let τ be a stopping time for the chain X (i.e. for the filtration (Fn, n ∈ ℕ)), and let Fτ be the σ-algebra of events prior to τ. We want to obtain the Markov property (relation [1.6]) in the case where the "present moment" is random.

PROPOSITION 1.46.– For all i, j ∈ E such that P(τ < ∞, Xτ = i) > 0, we have

[1.14] P(Xτ+1 = j | Fτ, Xτ = i) = p(i, j).

REMARK 1.47.– We can prove more general relations as follows: if B is an event subsequent to τ, i.e. B ∈ σ(Xτ+k, k ≥ 0), then

[1.15] P(B | Fτ, Xτ = i) = P(B | Xτ = i);

[1.16] P(Xτ+k = j | Xτ = i) = p(k)(i, j), k ≥ 0.

DEFINITION 1.48.– Relation [1.15] is called the strong Markov property.

If X0 = j a.s., j ∈ E, then the r.v.

Tj = inf{n ≥ 1 : Xn ≠ j},

with the convention inf Ø = ∞, is called the sojourn time in state j and it is a stopping time for the family (Fn = σ(X0, ..., Xn), n ∈ ℕ).

PROPOSITION 1.49.– The sojourn time in a state j ∈ E is a geometric r.v. of parameter 1 − p(j, j) with respect to the probability Pj = P(· | X0 = j), for p(j, j) < 1.

1.5. State classification

DEFINITION 1.50.– Let X = (Xn, n ∈ ℕ) be a Markov chain defined on (Ω, F, P), with values in E (finite or countable). We call a state j ∈ E accessible from state i ∈ E (we write i → j) if there exists an n ∈ ℕ such that p(n)(i, j) > 0. The states i and j are said to communicate if i → j and j → i; this will be denoted by i ↔ j.

Note that the relation i ↔ j is transitive, i.e. if i ↔ j and j ↔ k, then i ↔ k.

DEFINITION 1.51.– A class of states is a subset C of E that satisfies one of the two following properties:

1) The set C contains only one state i ∈ E and the relation i ↔ i does not hold.

2) For all i, j ∈ C we have i ↔ j, and C is maximal with this property, i.e. it is not possible to enlarge C by another state that communicates with all the other states of C.

EXAMPLE 1.52.– Let X = (Xn, nimage) be a Markov chain with values in E = {0, 1, 2, 3} and of transition matrix

image

The matrices p^m, m ≥ 2, have all the entries of the second column equal to zero. Consequently, the classes are C1 = {0}, C2 = {1}, and C3 = {2, 3}.

DEFINITION 1.53.– 1. A property concerning the states of E, such that its validity for a state i ∈ E implies its validity for all the states of the class of i, is called a class property.

2. A class of states C is called closed if Σj∈C p(i, j) = 1 for all i ∈ C. In other words, the matrix (p(i, j); i, j ∈ C) is a stochastic matrix; thus, Σj∈C p(n)(i, j) = 1 for all i ∈ C and n ∈ ℕ.

3. If a closed class contains only one state, then this state is called absorbing.

4. If the state space E consists of only one closed class, then the chain is called irreducible.

5. A state i ∈ E is said to be essential if i → j implies j → i; otherwise, i is called inessential.

6. If i ↔ i, then the greatest common divisor of the n ∈ ℕ such that p(n)(i, i) > 0 is called the period of i and will be denoted by di. If di = 1, then the state i is called aperiodic.

PROPOSITION 1.54.– Let C be a closed class of period d and C0, C1, …, Cd−1 be the cyclic subclasses.

1. If i ∈ Cr and p(n)(i, j) > 0, then j ∈ Cn+r.

2. If i ∈ Cr, we have

Σj∈Cr+1 p(i, j) = 1.

(The class subscripts are considered mod d.)

We will denote by Pi, i ∈ E, the probabilities on σ(Xn; n ∈ ℕ) defined by

Pi(·) = P(· | X0 = i), i ∈ E,

and by Ei the corresponding expected values.

If α = (α(i), i ∈ E) is a probability on E, i.e. α(i) ≥ 0 for all i ∈ E and Σi∈E α(i) = 1, then Pα(·) = Σi∈E α(i) Pi(·), and Eα is the corresponding expected value.

DEFINITION 1.55.– Let ηi, i ∈ E, be the first-passage time to state i, ηi = inf{n ≥ 1 : Xn = i}. The state i ∈ E is said to be recurrent (or persistent) if Pi(ηi < ∞) = 1. In the opposite case, i.e. if Pi(ηi < ∞) < 1, the state i is said to be transient. If i is a recurrent state and μi = Ei(ηi), then, if μi < ∞, i is said to be positive recurrent, and, if μi = ∞, then i is said to be null recurrent.

PROPOSITION 1.56.– A state i ∈ E is recurrent or transient according to whether the series Σn≥1 p(n)(i, i) is divergent or convergent, respectively.

PROPOSITION 1.57.– Let (Xn, nimage) be a Markov chain with finite state space E. Then:

1) there exists at least one recurrent state;

2) a class is recurrent iff it is closed.

PROPOSITION 1.58.– Let X = (Xn, nimage) be a Markov chain with finite state space. Then any recurrent class C of X is positive. If the chain X is irreducible, then it is positive recurrent (i.e. all states are positive recurrent).

1.5.1. Stationary probability

DEFINITION 1.59.– A probability distribution π on E is said to be stationary or invariant for the Markov chain X = (Xn, nimage) with transition matrix P = (p(i, j); i, jE) if, for all jE,

[1.17] π(j) = Σi∈E π(i) p(i, j).

Relation [1.17] can be written in matrix form

[1.18] π = πP,

where π = (π(i); iE) is a row vector. From [1.18] we can easily prove that

[1.19] π = πP^n for all n ∈ ℕ

and

[1.20] Pπ(Xn = j) = π(j) for all n ∈ ℕ and j ∈ E.

PROPOSITION 1.60.– We suppose that the transition matrix is such that the limits

[1.21] limn→∞ p(n)(i, j) = π(j)

exist for all i, jE and do not depend on i. Then, there are two possibilities:

1) π(j) = 0 for all jE and, in this case, there is no stationary probability;

2) Σj∈E π(j) = 1 and, in this case, π = (π(j); jE) is a stationary probability and it is unique.

DEFINITION 1.61.– A Markov chain is said to be ergodic if, for all i, jE, the limits limn→∞p(n)(i, j) exist, are positive, and do not depend on i.

PROPOSITION 1.62.– An irreducible, aperiodic, and positive recurrent Markov chain is ergodic.

PROPOSITION 1.63.– A finite Markov chain is ergodic if and only if

[1.22] there exists an n0 ≥ 1 such that p(n0)(i, j) > 0 for all i, j ∈ E.

A Markov chain verifying [1.22] is called regular.

PROPOSITION 1.64.– A Markov chain with finite or countable state space E has a unique stationary probability if and only if the state space contains one and only one positive recurrent class.

COROLLARY 1.65.– A finite ergodic Markov chain (i.e. regular) has a unique stationary probability.

1.6. Continuous-time Markov processes

The concept of Markovian dependence in continuous time was introduced by A. N. Kolmogorov in 1931. The first important contributions are those of B. Pospisil, W. Feller, and W. Doeblin, and the standard references are [BLU 68, DYN 65, GIH 74].

1.6.1. Transition function

Let E be a finite or countable set and P(t) = (pij(t); i, j ∈ E), t ≥ 0, be a matrix function with the properties:

[1.23] pij(t) ≥ 0; Σj∈E pij(t) = 1; limt↓0 pij(t) = pij(0) = δij,

where δij is the Kronecker symbol.

A family of r.v. image with values in E is called a Markov process homogenous with respect to time, with state space E, if

[1.24] image

for all 0 ≤ t0 < t1 << tn, t ≥ 0, i0, i1, …,in−1, i, jE, n ≤ 1. We will suppose that the sample paths tX(t, ω) are right-continuous with left limits a.s. The matrix P(t), called the transition matrix function of the process, satisfies the Chapman-Kolmogorov equation

[1.25] pij(s + t) = Σk∈E pik(s)pkj(t), s, t ≥ 0, i, j ∈ E,

which can be written in matrix form

P(s + t) = P(s)P(t), s, t ≥ 0.

The functions pij(t) have some remarkable properties:

1) For all i, j ∈ E, the function pij(t) is uniformly continuous on [0, ∞).

2) For all i, j ∈ E, the function pij(t) is either identically zero, or positive, on (0, ∞).

3) For all i, j ∈ E, i ≠ j, the limit qij = limt↓0 pij(t)/t exists and is finite.

4) For all i ∈ E, the limit qi = limt↓0 (1 − pii(t))/t exists and qi ≤ ∞. A state i is said to be stable if qi < ∞, instantaneous if qi = ∞, and absorbing if qi = 0.

PROPOSITION 1.66.– If E is finite, then there is no instantaneous state.

PROPOSITION 1.67.– The sample paths of the process are right continuous a.s. if and only if there is no instantaneous state.

1.6.2. Kolmogorov equations

If qi < ∞ for all i ∈ E, then the matrix Q = (qij; i, j ∈ E), where qii = −qi, is called the infinitesimal generator of the process or transition intensity matrix. Generally, we have qij ≥ 0 for i ≠ j and qii ≤ 0. Additionally,

[1.26] Σj≠i qij ≤ qi, i ∈ E.

If E is finite, then [1.26] becomes

[1.27] Σj∈E qij = 0, i ∈ E.

Relation [1.27] is a necessary and sufficient condition for the pij(t) to satisfy the differential equations

[1.28] p′ij(t) = Σk∈E qik pkj(t), i, j ∈ E, t ≥ 0,

or, in matrix form,

P′(t) = QP(t).

Equations [1.28] are called Kolmogorov backward equations. In addition, if qj < ∞, j ∈ E, and the limit

limt↓0 pij(t)/t = qij

is uniform with respect to i, i ≠ j, then we have

[1.29] p′ij(t) = Σk∈E pik(t)qkj, i, j ∈ E, t ≥ 0.

Equations [1.29] are called Kolmogorov forward equations.

Note the important fact that, if E is finite, then the two equations [1.28] and [1.29] are both satisfied. In this case, from these two systems of equations, with initial conditions pij(0) = δij, i, j ∈ E (or, in matrix form, P(0) = I), we obtain

[1.30] P(t) = etQ = Σn≥0 (tQ)n/n!, t ≥ 0.

Consequently, the transition intensity matrix Q uniquely determines the transition matrix function P(t).
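For a finite state space, relation [1.30] can be checked numerically. The sketch below (assuming a hypothetical generator Q on three states, and using SciPy's matrix exponential) verifies that P(t) = e^{tQ} is stochastic, satisfies the Chapman-Kolmogorov equation [1.25], and solves the backward equation [1.28].

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical generator on E = {0, 1, 2}: q_ij >= 0 off the diagonal and
# rows summing to 0, as in [1.27].
Q = np.array([[-3.0,  2.0,  1.0],
              [ 1.0, -1.0,  0.0],
              [ 2.0,  2.0, -4.0]])

def P(t):
    # [1.30]: the transition matrix function is the matrix exponential
    return expm(t * Q)

s, t = 0.3, 0.7
# P(t) is a stochastic matrix and satisfies Chapman-Kolmogorov [1.25]
assert np.allclose(P(t).sum(axis=1), 1.0)
assert np.allclose(P(s + t), P(s) @ P(t))

# Backward equation [1.28], checked by a finite difference in t
h = 1e-6
assert np.allclose((P(t + h) - P(t)) / h, Q @ P(t), atol=1e-4)
```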

The first jump time, or the sojourn time in a state, is defined by

T1 = inf{t ≥ 0 | X(t) ≠ X(0)} (with inf ∅ = +∞).

Then Fi(t) = Pi(T1 ≤ t), where Pi(·) = P(· | X(0) = i), is the distribution function of the sojourn time in state i.

PROPOSITION 1.68.– For i, jE, we have

[1.31] Pi(T1 > t, XT1 = j) = e−qit qij/qi, t ≥ 0, j ≠ i,

and the r.v. T1 and XT1 are Pi-independent.

The successive jump times of the process can be defined as follows:

T0 = 0, Tn+1 = inf{t > Tn | X(t) ≠ X(Tn)}, n ≥ 0.

The lifetime of the process is the r.v. T∞ = limn→∞ Tn. If T∞ = ∞ a.s., then the process is regular (non-explosive); in the opposite case, if P(T∞ < ∞) > 0, the process is called non-regular (explosive).

If Tn+1 > Tn a.s. for all n ≥ 1, then (X(t),t ≥ 0) is said to be a jump process.

PROPOSITION 1.69.– There exists a stochastic matrix (aij; i, j ∈ E), with aii = 0, i ∈ E, such that

[1.32] P(X(Tn+1) = j | X(Tn) = i) = aij, n ∈ ℕ, i, j ∈ E.

This means that the successively visited states form a Markov chain Yn = XTn, called the embedded chain, with transition function A. Given the successively visited states of the process, the sojourn times in different states are mutually independent.
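The embedded-chain description gives the standard way to simulate such a process: starting from i, wait an exponential sojourn time with rate q_i, then jump according to the embedded chain. A minimal sketch (the simulation scheme is standard; the generator matrix is a hypothetical example):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical generator; q_i = -q_ii and embedded chain a_ij = q_ij / q_i.
Q = np.array([[-3.0,  2.0,  1.0],
              [ 1.0, -1.0,  0.0],
              [ 2.0,  2.0, -4.0]])
q = -np.diag(Q)
A = Q / q[:, None]
np.fill_diagonal(A, 0.0)          # a_ii = 0

def simulate(i, horizon):
    """Path of (X(t)) up to `horizon`: exponential sojourn with rate q_i,
    then a jump drawn from the embedded chain A (states Y_n = X(T_n))."""
    t, path = 0.0, [(0.0, i)]
    while True:
        t += rng.exponential(1.0 / q[i])   # sojourn time in state i
        if t >= horizon:
            return path
        i = int(rng.choice(len(q), p=A[i]))
        path.append((t, i))

path = simulate(0, 10.0)
assert all(s < t for (s, _), (t, _) in zip(path, path[1:]))   # jump times increase
assert all(a != b for (_, a), (_, b) in zip(path, path[1:]))  # state changes at jumps
```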

REMARK 1.70.– We have

[1.33] aij = qij/qi, i ≠ j, for any non-absorbing state i.

PROPOSITION 1.71.– (Reuter’s explosion condition) A jump Markov process is regular iff the only non-negative bounded solution of the equation Qy = y is y = 0.

PROPOSITION 1.72.– A necessary and sufficient condition for a Markov process (X(t), t ≥ 0) to be regular is Σn≥0 1/qYn = ∞ a.s., where Yn = X(Tn).

PROPOSITION 1.73.– (Kolmogorov integral equation) For i, j ∈ E, we have

[1.34] pij(t) = δij e−qit + ∫0t qi e−qis Σk≠i aik pkj(t − s) ds

for i non-absorbing, and pij(t) = δij for i absorbing.

Let us denote by ηi the first-passage time to state i, i.e. ηi = inf{t ≥ T1 | X(t) = i}. A state i ∈ E is said to be:

– recurrent if Pi(ηi < ∞) = 1;

– transient if Pi(ηi < ∞) < 1;

– positive recurrent if it is recurrent and μii := Ei(ηi) < ∞;

– null recurrent if it is recurrent and μii = ∞.

For α = (αi; i ∈ E) a probability on E, we define

Pα(·) = Σi∈E αi Pi(·).

The probabilities pj(t) = Pα(Xt = j), j ∈ E, t ≥ 0, are called state probabilities. If pj(0) = Pα(X0 = j) = αj, j ∈ E, then the state probabilities satisfy the equations

[1.35] p′j(t) = Σi∈E pi(t)qij, j ∈ E, t ≥ 0.

A probability π = (πj; j ∈ E) on E is said to be stationary or invariant if Pπ(Xt = j) = πj, for all j ∈ E, t ≥ 0. This condition can be written in the matrix form πP(t) = π for all t ≥ 0. From [1.35] it can be inferred that (πj; j ∈ E) is stationary if and only if πQ = 0.
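Numerically, the stationarity condition πQ = 0 together with Σj πj = 1 is a small linear system. A minimal sketch for a hypothetical irreducible 3-state generator:

```python
import numpy as np

# Hypothetical generator of an irreducible 3-state process.
Q = np.array([[-3.0,  2.0,  1.0],
              [ 1.0, -1.0,  0.0],
              [ 2.0,  2.0, -4.0]])

# Solve pi Q = 0 with sum(pi) = 1 by appending the normalization row to the
# transposed generator and solving in the least-squares sense.
n = Q.shape[0]
M = np.vstack([Q.T, np.ones(n)])
b = np.zeros(n + 1)
b[-1] = 1.0
pi, *_ = np.linalg.lstsq(M, b, rcond=None)

assert np.allclose(pi @ Q, 0.0, atol=1e-8)   # invariance
assert np.isclose(pi.sum(), 1.0)             # probability
assert (pi > 0).all()                        # positive, as the chain is irreducible
```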

A Markov process (X(t), t ≥ 0) is said to be ergodic if there exists a probability π on E such that, for all i, j ∈ E, we have

limt→∞ pij(t) = πj

or, equivalently,

image

for all i ∈ E.
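Ergodicity can be observed directly for a finite process: the rows of P(t) = e^{tQ} become identical as t grows, and the common limit is the stationary probability. A minimal sketch with a hypothetical generator (its eigenvalues are 0, −3 and −5, so convergence is fast):

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical generator with eigenvalues 0, -3, -5.
Q = np.array([[-3.0,  2.0,  1.0],
              [ 1.0, -1.0,  0.0],
              [ 2.0,  2.0, -4.0]])

Pt = expm(10.0 * Q)
# Ergodicity: lim p_ij(t) = pi_j does not depend on the starting state i.
assert np.allclose(Pt[0], Pt[1], atol=1e-9)
assert np.allclose(Pt[0], Pt[2], atol=1e-9)

pi = Pt[0]
# The common limit is the stationary probability: pi Q = 0, sum(pi) = 1.
assert np.allclose(pi @ Q, 0.0, atol=1e-9)
assert np.isclose(pi.sum(), 1.0)
```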

PROPOSITION 1.74.– For any state i of E, we have

image

PROPOSITION 1.75.– If the Markov process (X(t), t ≥ 0) is irreducible (i.e. the chain (Yn, n ∈ ℕ) is so) and it has an invariant probability π, then for any positive and bounded function g on E, we have

limt→∞ (1/t) ∫0t g(X(s)) ds = Σj∈E πj g(j) a.s.

1.7. Semi-Markov processes

A semi-Markov process is a natural generalization of a Markov process: its future evolution depends on the current state and on the time elapsed since the last transition.

The semi-Markov processes that we will present are minimal semi-Markov processes associated with Markov renewal processes, with finite state spaces E. Consequently, we do not have to take into account instantaneous states, and the jump times of the process a.s. do not form a dense subset of ℝ+.

1.7.1. Markov renewal processes

Let the functions Qij, i, j ∈ E, defined on the real line, be non-decreasing, right continuous, and such that Qij(∞) ≤ 1; they will be called mass functions. Besides, if we have

Qij(t) = 0 for t ≤ 0, i, j ∈ E,

with Σj∈E Qij(∞) = 1, i ∈ E, the matrix function Q(t) = (Qij(t); i, j ∈ E), t ∈ ℝ+, is called a semi-Markov kernel (matrix) on the state space E. On E we consider the σ-algebra ℰ of all its subsets.

Let us consider the Markov transition function P((i, s), {j} × [0, t]) defined on the measurable space (E × ℝ+, ℰ ⊗ ℬ(ℝ+)) by

image

Then [BLU 68, DYN 65, GIH 74], there exists a Markov chain ((Jn, Tn), n ∈ ℕ) with values in E × ℝ+, such that the transition function is given by

[1.36] image

If we set X0 = T0, Xn = Tn − Tn−1, n ≥ 1, then the process ((Jn, Xn), n ∈ ℕ) is a Markov chain with values in E × ℝ+ and a transition function given by

[1.37] image

Obviously,

image

where image

DEFINITION 1.76.– The processes ((Jn, Tn), n ∈ ℕ) and ((Jn, Xn), n ∈ ℕ) are called Markov renewal process (MRP) and J–X process, respectively.

Note that (Jn, n ∈ ℕ) is a Markov chain with values in E and transition matrix p = (pij; i, j ∈ E), with pij = Qij(∞).

The n-fold matrix Stieltjes convolution Q(n)(t) is defined by

Q(0)ij(t) = δij 1{t≥0}, Q(n)ij(t) = Σk∈E ∫0t Q(n−1)kj(t − s) dQik(s), n ≥ 1,

and we have

[1.38] P(Jn = j, Tn ≤ t | J0 = i) = Q(n)ij(t), n ∈ ℕ.

1.7.2. Semi-Markov processes

We will suppose that

supn∈ℕ Tn = +∞ a.s.

and define the counting process of jumps

N(t) = sup{n ∈ ℕ | Tn ≤ t}, t ∈ ℝ+.

DEFINITION 1.77.– The jump stochastic process (Zt, t ∈ ℝ+), where

[1.39] Zt = JN(t),

is said to be a semi-Markov process associated with the MRP (Jn, Tn).

We only present some notions regarding semi-Markov processes here, for they will be detailed in Chapter 5.

The jump times are T0 < T1 < T2 < · · · < Tn < … and the inter-jump times are X1, X2, ….

Obviously,

image

For any i, j ∈ E we define the distribution function Fij by

Fij(t) = P(Xn+1 ≤ t | Jn = i, Jn+1 = j) = Qij(t)/pij if pij ≠ 0 (and Fij(t) = 1{t≥0} if pij = 0).

Note that Fij is the distribution function of the sojourn time in state i, knowing that the next visited state is j. These distribution functions are called conditional transition functions.
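These conditional transition functions give the standard recipe for simulating a semi-Markov path: draw Jn+1 from the embedded chain p, then draw the sojourn Xn+1 from the corresponding conditional law. A minimal sketch, assuming a hypothetical 3-state embedded matrix and uniform (hence non-exponential) sojourn laws:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 3-state embedded transition matrix p = (p_ij) with p_ii = 0.
p = np.array([[0.0, 0.7, 0.3],
              [0.5, 0.0, 0.5],
              [1.0, 0.0, 0.0]])

def sample_sojourn(i, j):
    """Draw X_{n+1} from a hypothetical F_ij: a uniform law, which a Markov
    process (exponential sojourns) could not have."""
    return rng.uniform(0.5, 1.5 + i + j)

def simulate_mrp(j0, n_steps):
    """Markov renewal process (J_n, T_n): states from p, sojourns from F."""
    J, T = [j0], [0.0]
    for _ in range(n_steps):
        nxt = int(rng.choice(3, p=p[J[-1]]))
        T.append(T[-1] + sample_sojourn(J[-1], nxt))
        J.append(nxt)
    return J, T

def Z(t, J, T):
    # Semi-Markov process [1.39]: Z_t = J_{N(t)}, N(t) = sup{n : T_n <= t}
    return J[np.searchsorted(T, t, side="right") - 1]

J, T = simulate_mrp(0, 200)
assert all(s < t for s, t in zip(T, T[1:]))  # jump times strictly increase
assert Z(0.0, J, T) == J[0]
```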

The transition functions of the semi-Markov process are defined by

[1.40] Pij(t) = P(Zt = j | Z0 = i), i, j ∈ E, t ∈ ℝ+.

We denote by Nj(t) the number of times the semi-Markov process visits state j during the time interval (0, t]. For i, j ∈ E we let

Ψij(t) = Ei(Nj(t)), t ∈ ℝ+.

The function t → Ψij(t) is called the Markov renewal function. This function is the solution of the Markov renewal equation

[1.41] image

The transition probabilities [1.40] satisfy the Markov renewal equation

[1.42] image


1. In fact, the distribution (or law) of an r.v. is a probability on image, i.e. on the Borel sets of image.

2. Φ is the d.f. of N (0, 1).
