Chapter 3

Conditional Expectation

3.1. Definition

Let (X, Y) be a pair of random variables defined on the probability space $(\Omega, \mathcal{A}, P)$ in which only X is observed. We wish to know what information X carries about Y: this is the filtering problem defined in Chapter 1.

This problem may be formalized in the following way: supposing Y to be real and square integrable, construct a real random variable of the form r(X) that gives the best possible approximation of Y with respect to the quadratic error, i.e. such that $E[(Y - r(X))^2]$ is minimal.

If we identify the random variables with their P-equivalence classes, we deduce that r(X) exists and is unique, since it is the orthogonal projection (in the Hilbert space $L^2(P)$) of Y onto the closed vector subspace of $L^2(P)$ constituted by the real random variables of the form h(X) such that $E[(h(X))^2] < +\infty$.

From Doob’s lemma, the real random variables of the form h(X) are those that are measurable with respect to the σ-algebra generated by X. We say that r(X) is the conditional expectation of Y with respect to the σ-algebra generated by X (or with respect to X), and that r is the regression of Y on X. We write:

$$ E(Y \mid X) = E^{\sigma(X)}(Y) = r(X). $$

The above equation leads us to the following definition.
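As a toy illustration (the joint law on $\{0,1\}^2$ below is an assumption for the sketch), the regression $r(x) = E(Y \mid X = x)$ can be tabulated and checked to minimize the quadratic error among functions of X:

```python
# Hypothetical discrete example: tabulate a joint pmf p(x, y), compute the
# regression r(x) = E[Y | X = x], and check that r minimizes E[(Y - h(X))^2]
# over a grid of competing functions h of X.

p = {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.6}  # assumed joint pmf

def marginal_x(x):
    return sum(q for (a, b), q in p.items() if a == x)

def r(x):  # regression of Y on X
    return sum(b * q for (a, b), q in p.items() if a == x) / marginal_x(x)

def quad_error(h):  # E[(Y - h(X))^2]
    return sum(q * (b - h(a)) ** 2 for (a, b), q in p.items())

err_r = quad_error(r)
# every other function of X (determined here by its two values) does at least as badly
for c0 in (0.0, 0.25, 0.5, 1.0):
    for c1 in (0.0, 0.5, 0.75, 1.0):
        assert quad_error(lambda x: c0 if x == 0 else c1) >= err_r - 1e-12
```

Here $r(0) = 1/3$ and $r(1) = 6/7$, the conditional averages of Y on each slice $\{X = x\}$.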

DEFINITION 3.1.– Let $(\Omega, \mathcal{A}, P)$ be a probability space and let $\mathcal{B}$ be a sub-σ-algebra of $\mathcal{A}$. We call the orthogonal projection of $L^2(\Omega, \mathcal{A}, P)$ onto $L^2(\Omega, \mathcal{B}, P)$ the conditional expectation with respect to $\mathcal{B}$, denoted by $E^{\mathcal{B}}$ or $E(\cdot \mid \mathcal{B})$.

CHARACTERIZATION: Following from the definition of an orthogonal projection, $E^{\mathcal{B}}(Y)$ is characterized by:

1) $E^{\mathcal{B}}(Y) \in L^2(\Omega, \mathcal{B}, P)$;

2) $E\big[\big(Y - E^{\mathcal{B}}(Y)\big) Z\big] = 0, \quad Z \in L^2(\Omega, \mathcal{B}, P)$.

We may replace (2) by

2′) $\displaystyle\int_B E^{\mathcal{B}}(Y)\, dP = \int_B Y\, dP, \quad B \in \mathcal{B},$

which is easily seen using the linearity and the monotone continuity of the integral.
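Characterization (2′) can be checked directly on a finite probability space, where a sub-σ-algebra generated by a partition makes $E^{\mathcal{B}}(Y)$ a block average. A minimal sketch (Ω, P, Y, and the partition are all assumed data):

```python
# On a finite Omega with a sub-sigma-algebra B generated by a partition,
# E^B(Y) is the P-average of Y over each atom, and its integral over every
# B-measurable set equals that of Y (characterization (2')).

omega = range(6)
P = {w: 1 / 6 for w in omega}           # uniform probability (assumed)
Y = {w: float(w ** 2) for w in omega}   # an arbitrary square-integrable Y
partition = [{0, 1, 2}, {3, 4, 5}]      # atoms generating B (assumed)

def cond_exp(w):  # E^B(Y): constant on each atom, equal to the P-average there
    block = next(b for b in partition if w in b)
    return sum(Y[v] * P[v] for v in block) / sum(P[v] for v in block)

for B in partition:  # ∫_B E^B(Y) dP = ∫_B Y dP on each generating atom
    lhs = sum(cond_exp(w) * P[w] for w in B)
    rhs = sum(Y[w] * P[w] for w in B)
    assert abs(lhs - rhs) < 1e-12
```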

3.2. Properties and extension

1) $E^{\mathcal{B}}$ is a contracting and idempotent linear map of $L^2(\Omega, \mathcal{A}, P)$ onto $L^2(\Omega, \mathcal{B}, P)$. Moreover, it is positive and it conserves constants.

The first three properties (contraction (i.e. $\|E^{\mathcal{B}}(Y)\|_{L^2} \le \|Y\|_{L^2}$), idempotence (i.e. $E^{\mathcal{B}} \circ E^{\mathcal{B}} = E^{\mathcal{B}}$), and linearity) are characteristics of orthogonal projections.

Its positivity (i.e. $Y \ge 0 \Rightarrow E^{\mathcal{B}}(Y) \ge 0$ a.s.) is established by noting that, for $Y \ge 0$,

$$ \int_{\{E^{\mathcal{B}}(Y) < 0\}} E^{\mathcal{B}}(Y)\, dP = \int_{\{E^{\mathcal{B}}(Y) < 0\}} Y\, dP \ge 0, $$

which implies that $E^{\mathcal{B}}(Y) \ge 0$ a.s., since the event $\{E^{\mathcal{B}}(Y) < 0\}$ belongs to $\mathcal{B}$.

Finally, it is clear that $E^{\mathcal{B}}(\mathbf{1}) = \mathbf{1}$.

COMMENT 3.1.– We may show that the above five properties characterize the operators of $L^2(\Omega, \mathcal{A}, P)$ that are conditional expectations.

2) $E^{\mathcal{B}}(UY) = U\,E^{\mathcal{B}}(Y)$ for $Y \in L^2(\Omega, \mathcal{A}, P)$ and U $\mathcal{B}$-measurable and bounded.

In effect, $U\,E^{\mathcal{B}}(Y) \in L^2(\Omega, \mathcal{B}, P)$, and

$$ E\big[\big(UY - U E^{\mathcal{B}}(Y)\big) Z\big] = E\big[\big(Y - E^{\mathcal{B}}(Y)\big) UZ\big] = 0, \quad Z \in L^2(\Omega, \mathcal{B}, P), $$

therefore $U\,E^{\mathcal{B}}(Y)$ is indeed the orthogonal projection of UY onto $L^2(\Omega, \mathcal{B}, P)$.

3) If $Y_n, Y \in L^2(\Omega, \mathcal{A}, P)$ and $Y_n \uparrow Y$, then $E^{\mathcal{B}}(Y_n) \uparrow E^{\mathcal{B}}(Y)$ a.s. The linearity and positivity of $E^{\mathcal{B}}$ affirm that $\lim \uparrow E^{\mathcal{B}}(Y_n)$ exists a.s. Yet

$$ \int_B E^{\mathcal{B}}(Y_n)\, dP = \int_B Y_n\, dP, \quad B \in \mathcal{B}, $$

and since $|Y_n| \le |Y_1| + |Y|$ and $|Y_1| + |Y| \in L^2(\Omega, \mathcal{A}, P)$, by twice applying the dominated convergence theorem, we obtain:

$$ \int_B \lim \uparrow E^{\mathcal{B}}(Y_n)\, dP = \int_B Y\, dP, \quad B \in \mathcal{B}. $$

Since $\lim \uparrow E^{\mathcal{B}}(Y_n)$ is in $L^2(\Omega, \mathcal{B}, P)$, we have $\lim \uparrow E^{\mathcal{B}}(Y_n) = E^{\mathcal{B}}(Y)$.

4) If $Y^{-1}(\mathcal{B}_{\mathbb{R}})$ and $\mathcal{B}$ are independent, $E^{\mathcal{B}}(Y) = E(Y)$. In effect:

$$ \int_B Y\, dP = E\big(Y \mathbf{1}_B\big) = E(Y)\,P(B) = \int_B E(Y)\, dP, \quad B \in \mathcal{B}. $$

5) If $\mathcal{B}_1$ and $\mathcal{B}_2$ are two sub-σ-algebras such that $\mathcal{B}_1 \subseteq \mathcal{B}_2$, then $E^{\mathcal{B}_1} E^{\mathcal{B}_2} = E^{\mathcal{B}_2} E^{\mathcal{B}_1} = E^{\mathcal{B}_1}$.

This is a known property of orthogonal projections.
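Property (5) can be illustrated on a finite space with nested partitions (all the data below are assumptions for the sketch): projecting onto the finer σ-algebra and then onto the coarser one gives the same result as projecting directly onto the coarser one.

```python
# With B1 ⊂ B2 generated by nested partitions of a finite Omega,
# E^{B1}(E^{B2} Y) = E^{B1} Y: successive block averaging over nested
# partitions agrees with averaging directly over the coarse partition.

omega = range(8)
P = {w: 1 / 8 for w in omega}                 # uniform probability (assumed)
Y = {w: float((w * 7) % 5) for w in omega}    # an arbitrary Y (assumed)
part2 = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]      # fine partition: generates B2
part1 = [{0, 1, 2, 3}, {4, 5, 6, 7}]          # coarse partition: generates B1 ⊂ B2

def project(Z, partition):  # conditional expectation given a partition
    out = {}
    for block in partition:
        avg = sum(Z[w] * P[w] for w in block) / sum(P[w] for w in block)
        for w in block:
            out[w] = avg
    return out

lhs = project(project(Y, part2), part1)   # E^{B1}(E^{B2} Y)
rhs = project(Y, part1)                   # E^{B1} Y
assert all(abs(lhs[w] - rhs[w]) < 1e-12 for w in omega)
```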

6) Extension: We will now define $E^{\mathcal{B}}(Y)$ when Y is only positive or integrable.

For Y positive, we note that there exists a sequence $(Y_n)$ of positive, bounded (and therefore square integrable) real random variables such that $Y_n \uparrow Y$. We then set $E^{\mathcal{B}}(Y) = \lim \uparrow E^{\mathcal{B}}(Y_n)$. It is straightforward to see that $E^{\mathcal{B}}(Y)$ is unique, and that it is characterized by:

1 bis) $E^{\mathcal{B}}(Y)$ is $\mathcal{B}$-measurable and positive;

2 bis) $E\big[E^{\mathcal{B}}(Y)\, Z\big] = E[YZ]$ for every positive, $\mathcal{B}$-measurable Z.

(2 bis) may be replaced by:

2′ bis) $\displaystyle\int_B E^{\mathcal{B}}(Y)\, dP = \int_B Y\, dP, \quad B \in \mathcal{B}.$

Among the properties of $E^{\mathcal{B}}$, we may cite the following:

For positive Y, and positive and images-measurable U, we have:

$$ E^{\mathcal{B}}(UY) = U\,E^{\mathcal{B}}(Y) \quad \text{a.s.} $$

Now, for $Y \in L^1(\Omega, \mathcal{A}, P)$, we note that $E^{\mathcal{B}}(Y^+)$ and $E^{\mathcal{B}}(Y^-)$ are integrable, and we set:

$$ E^{\mathcal{B}}(Y) = E^{\mathcal{B}}(Y^+) - E^{\mathcal{B}}(Y^-). $$

Again, we have uniqueness, and the characterizations (1)–(2) and (1)–(2′ bis), where it is necessary to replace $L^2(\Omega, \mathcal{A}, P)$ and $L^2(\Omega, \mathcal{B}, P)$ with $L^1(\Omega, \mathcal{A}, P)$ and $L^1(\Omega, \mathcal{B}, P)$, respectively. Furthermore, properties (1)–(5) are still valid, with slight modifications. In particular, we have the following important property:

$$ E\big(E^{\mathcal{B}}(Y)\big) = E(Y). $$

The proofs in this section are left to the reader, as are the extension and the properties of $E^{\mathcal{B}}(Y)$ for random variables with values in $\mathbb{R}^d$.

3.3. Conditional probabilities and conditional distributions

DEFINITION 3.2.– Let $\mathcal{B}$ be a sub-σ-algebra of $\mathcal{A}$ and let $A \in \mathcal{A}$. $E^{\mathcal{B}}(\mathbf{1}_A)$ is called the conditional probability of A with respect to $\mathcal{B}$ and is written as $P^{\mathcal{B}}(A)$ or $P(A \mid \mathcal{B})$. The mapping $A \mapsto P^{\mathcal{B}}(A)$, $A \in \mathcal{A}$, is called the conditional probability with respect to $\mathcal{B}$ and is written as $P^{\mathcal{B}}$ or $P(\cdot \mid \mathcal{B})$.

CHARACTERIZATION: Following from the above definition, $P^{\mathcal{B}}(A)$ is characterized by its $\mathcal{B}$-measurability and the formula:

$$ \int_B P^{\mathcal{B}}(A)\, dP = P(A \cap B), \quad B \in \mathcal{B}. $$

3.3.1. Regular version of the conditional probability

We say that a map $P^{\mathcal{B}}(\cdot \mid \cdot)$ from $\mathcal{A} \times \Omega$ into $[0, 1]$ is a version of $P^{\mathcal{B}}$ if, for all $A \in \mathcal{A}$, $P^{\mathcal{B}}(A \mid \cdot)$ is a version of $P^{\mathcal{B}}(A)$.

Furthermore, given a sub-σ-algebra $\mathcal{C}$ of $\mathcal{A}$, if, for almost all $\omega \in \Omega$, $P^{\mathcal{B}}(\cdot \mid \omega)$ is a probability on $\mathcal{C}$, then we say that $P^{\mathcal{B}}(\cdot \mid \cdot)$ is a regular version of the conditional probability with respect to $\mathcal{B}$ on $\mathcal{C}$. Such a version does not always exist.

If $P^{\mathcal{B}}$ is regular on $\mathcal{C}$, we may write:

$$ P^{\mathcal{B}}(A)(\omega) = \int_\Omega \mathbf{1}_A(\omega')\, P^{\mathcal{B}}(d\omega' \mid \omega), \quad A \in \mathcal{C}. $$

By linearity and monotone continuity, it follows that, for positive or integrable and $\mathcal{C}$-measurable Y:

$$ E^{\mathcal{B}}(Y)(\omega) = \int_\Omega Y(\omega')\, P^{\mathcal{B}}(d\omega' \mid \omega) \quad \text{a.s.} $$

3.3.2. Conditional distributions

Let (X, Y) be a pair of random variables with values in $(E_1 \times E_2, \mathcal{E}_1 \otimes \mathcal{E}_2)$. A regular version of $P^{\sigma(X)}$ on $\sigma(Y)$ will be, for all A fixed in $\sigma(Y)$, a function of X, which we will write N(A, X)¹. The image of $N(\cdot, x)$ by Y is then called the conditional distribution of Y knowing that X = x and is written as $P_Y^{X=x}$ or $P_Y(\cdot \mid X = x)$. The mapping $x \mapsto P_Y(\cdot \mid X = x)$ is then written as $P_Y^X$ or $P_Y(\cdot \mid X)$ and is called the conditional distribution of Y with respect to X; it is defined by the formula:

$$ P_Y(B \mid X = x) = N\big(Y^{-1}(B), x\big), \quad B \in \mathcal{E}_2. $$

Now, if Y is a positive or integrable real random variable, the transfer theorem states that:

$$ E(Y \mid X = x) = \int y\, P_Y(dy \mid X = x), \quad P_X\text{-a.s.} $$

3.3.3. Theorem for integration with respect to the conditional distribution

THEOREM 3.1.– Let φ be a $\mathcal{E}_1 \otimes \mathcal{E}_2$-measurable and positive or $P_{(X,Y)}$-integrable real function defined on $E_1 \times E_2$. Then

$$ E\,\varphi(X, Y) = \int_{E_1} \left[ \int_{E_2} \varphi(x, y)\, P_Y(dy \mid X = x) \right] P_X(dx), $$

where the function $x \mapsto \int_{E_2} \varphi(x, y)\, P_Y(dy \mid X = x)$ is defined $P_X$-a.s.

This theorem is proved in the following way: the definition of a conditional distribution shows that it is true for $\varphi = \mathbf{1}_{A_1 \times A_2}$, $A_1 \in \mathcal{E}_1$, $A_2 \in \mathcal{E}_2$. We deduce from this that the theorem is true for $\varphi = \mathbf{1}_C$, $C \in \mathcal{E}_1 \otimes \mathcal{E}_2$, and we conclude the demonstration as in the usual Fubini theorem. Details are left to the reader.
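Theorem 3.1 can be verified exactly on a discrete toy law (the pmf and the function φ below are assumptions for the illustration): the direct expectation agrees with first integrating against the conditional distribution and then against $P_X$.

```python
# Discrete sketch of Theorem 3.1: E[phi(X, Y)] computed directly against the
# joint pmf agrees with the iterated integral against P_Y(. | X = x) and P_X.

from fractions import Fraction as F

p = {(0, 0): F(1, 5), (0, 1): F(1, 5), (1, 0): F(1, 10), (1, 1): F(1, 2)}  # assumed joint pmf

def phi(x, y):  # an assumed test function
    return (x + 1) * (y + 2)

pX = {x: sum(q for (a, b), q in p.items() if a == x) for x in (0, 1)}

direct = sum(q * phi(a, b) for (a, b), q in p.items())
# inner integral: ∫ phi(x, y) P_Y(dy | X = x)
inner = {x: sum(q * phi(a, b) for (a, b), q in p.items() if a == x) / pX[x]
         for x in (0, 1)}
iterated = sum(pX[x] * inner[x] for x in (0, 1))  # then integrate against P_X
assert direct == iterated
```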

3.3.4. Determination of the conditional distributions in the usual cases

1) If X is a discrete random variable with values in $\{x_1, x_2, \ldots\}$, we may set, for example,

$$ P(A \mid X = x_j) = \frac{P\big(A \cap \{X = x_j\}\big)}{P(X = x_j)} \quad \text{if } P(X = x_j) > 0, $$

and $P(A \mid X = x_j) = P(A)$ if $P(X = x_j) = 0$.

It is clear that we thus obtain a regular version of the conditional probability with respect to $\sigma(X)$ on $\mathcal{A}$.

For positive or integrable Y, we have:

$$ E(Y \mid X = x_j) = \frac{E\big(Y \mathbf{1}_{\{X = x_j\}}\big)}{P(X = x_j)} \quad \text{if } P(X = x_j) > 0. $$
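As a concrete instance of this formula, consider two fair dice (an assumed toy model), with X the first die and Y the sum of the two: then $E(Y \mid X = x) = x + 7/2$.

```python
# Discrete conditioning with two fair dice: X is the first die, Y the sum.
# E(Y | X = x) = E[Y 1_{X=x}] / P(X = x) = x + 7/2, computed exactly.

from fractions import Fraction

outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]  # 36 equally likely pairs
P = Fraction(1, 36)

def cond_exp_sum(x):  # E[Y 1_{X=x}] / P(X = x)
    num = sum((i + j) * P for (i, j) in outcomes if i == x)
    den = sum(P for (i, j) in outcomes if i == x)
    return num / den

assert all(cond_exp_sum(x) == Fraction(2 * x + 7, 2) for x in range(1, 7))
```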

2) Let (X, Y) be a pair of random variables with values in $\mathbb{R}^2$ and density f(x, y) with respect to the Lebesgue measure $dx\,dy$ on $\mathbb{R}^2$. The density of X is then $f_X(x) = \int f(x, y)\, dy$, and we may set:

$$ f(y \mid x) = \frac{f(x, y)}{f_X(x)} \quad \text{if } f_X(x) > 0 $$

(and $f(\cdot \mid x)$ an arbitrary fixed density if $f_X(x) = 0$).

The function $f(\cdot \mid x)$ is a density on $\mathbb{R}$ called the density of Y knowing that X = x, and this is the density of the conditional distribution of Y knowing that X = x.

We therefore have, for positive or integrable Y:

$$ E(Y \mid X = x) = \int_{\mathbb{R}} y\, f(y \mid x)\, dy. $$
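The density formula can be checked numerically; the density $f(x, y) = x + y$ on $[0,1]^2$ is an assumed toy example for which everything is explicit: $f_X(x) = x + 1/2$ and $E(Y \mid X = x) = (x/2 + 1/3)/(x + 1/2)$.

```python
# Numerical sketch of the density formula for the assumed toy density
# f(x, y) = x + y on [0, 1]^2, using a midpoint Riemann sum in y.

N = 20000
dy = 1.0 / N
ys = [(k + 0.5) * dy for k in range(N)]  # midpoints of the y-grid

def f(x, y):
    return x + y

def cond_exp(x):  # ∫ y f(y | x) dy = ∫ y f(x, y) dy / f_X(x)
    fx = sum(f(x, y) for y in ys) * dy           # f_X(x) = x + 1/2
    return sum(y * f(x, y) for y in ys) * dy / fx

for x in (0.1, 0.5, 0.9):
    exact = (x / 2 + 1 / 3) / (x + 0.5)
    assert abs(cond_exp(x) - exact) < 1e-6
```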

EXAMPLE 3.1.– Let (X, Y) be a non-degenerate two-dimensional Gaussian variable. The conditional distribution of Y knowing that X = x is a Gaussian distribution with expectation $E(Y) + \dfrac{\operatorname{Cov}(X, Y)}{\operatorname{Var} X}\,(x - E(X))$ and standard deviation $\sigma_Y \sqrt{1 - \rho^2}$, where ρ is the correlation coefficient of X and Y. Consequently, the regression of Y on X is an affine function:

$$ r(x) = E(Y) + \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var} X}\,\big(x - E(X)\big). $$
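A Monte Carlo sketch of this example (the construction $Y = aX + \sigma Z$ with X, Z independent standard Gaussians and the parameter values are assumptions): the empirical slope $\operatorname{Cov}(X,Y)/\operatorname{Var} X$ should recover the coefficient a of the affine regression.

```python
# For the assumed Gaussian pair Y = a X + sigma Z (X, Z independent N(0, 1)),
# E(Y | X = x) = a x, so the regression slope Cov(X, Y) / Var X estimates a.

import random

random.seed(0)
a, sigma = 2.0, 0.5   # assumed parameters
n = 100000
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [a * x + sigma * random.gauss(0, 1) for x in xs]

mx = sum(xs) / n
my = sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
var = sum((x - mx) ** 2 for x in xs) / n
slope = cov / var  # empirical Cov(X, Y) / Var X

assert abs(slope - a) < 0.05  # close to a, up to Monte Carlo error
```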

3.4. Exercises

EXERCISE 3.1.– Give a proof of the properties indicated in section 3.2. We may, in particular, define $E^{\mathcal{B}}(Y)$ for $Y = (Y_1, \ldots, Y_d)$ with values in $\mathbb{R}^d$ by setting:

$$ E^{\mathcal{B}}(Y) = \big(E^{\mathcal{B}}(Y_1), \ldots, E^{\mathcal{B}}(Y_d)\big). $$

EXERCISE 3.2. (martingale).– Let $(\Omega, \mathcal{A}, P)$ be a probability space and $(\mathcal{F}_n, n \ge 1)$ be a sequence of sub-σ-algebras of $\mathcal{A}$, increasing for inclusion. We consider a sequence $(X_n, n \ge 1)$ of integrable and $(\mathcal{F}_n)$-adapted (i.e. each $X_n$ is $\mathcal{F}_n$-measurable) real random variables. We say that $(X_n)$ is a martingale if:

$$ E^{\mathcal{F}_n}(X_{n+1}) = X_n \quad \text{a.s.}, \quad n \ge 1. $$

1) Show that $(X_n)$ is a martingale if and only if there exists an integrable and $(\mathcal{F}_n)$-adapted sequence $(Y_n)$ such that:

$$ X_n = Y_1 + \cdots + Y_n, \quad n \ge 1, $$

and

$$ E^{\mathcal{F}_n}(Y_{n+1}) = 0 \quad \text{a.s.}, \quad n \ge 1. $$

2) Show that, if the Xn are square integrable, the Yn are orthogonal two-by-two.

3) Let X be an integrable, real random variable. Show that $\big(E^{\mathcal{F}_n}(X), n \ge 1\big)$ is a martingale.

4) Let (ξn, n ≥ 1) be a sequence of zero-mean, integrable, and independent real random variables with the same distribution. We set Xn = ξ1 + … + ξn, n ≥ 1, and we denote by $\mathcal{F}_n$ the σ-algebra generated by ξ1, …, ξn, n ≥ 1. Show that (Xn) is an $(\mathcal{F}_n)$-adapted martingale.
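Question 4 can be verified exhaustively for symmetric ±1 steps on a three-step sample space (an assumed toy case): averaging $X_{n+1}$ over each atom of $\mathcal{F}_n$ returns $X_n$.

```python
# Omega = {-1, 1}^3 with uniform probability, F_n generated by the first n
# coordinates, X_n the n-th partial sum: check E^{F_n}(X_{n+1}) = X_n by
# averaging over each atom of F_n (outcomes sharing the first n coordinates).

from itertools import product

omega = list(product([-1, 1], repeat=3))  # 8 equally likely outcomes

def X(n, w):  # X_n = xi_1 + ... + xi_n
    return sum(w[:n])

for n in (1, 2):
    for prefix in product([-1, 1], repeat=n):
        atom = [w for w in omega if w[:n] == prefix]
        avg = sum(X(n + 1, w) for w in atom) / len(atom)
        assert avg == X(n, atom[0])  # E^{F_n}(X_{n+1}) = X_n on this atom
```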

EXERCISE 3.3.– Let X, Y, Z be random variables taking values in countable sets. The probabilities (conditional or otherwise) of the events below are all assumed to be strictly positive. We make the following assumption:

$$ P(Z = z \mid X = x, Y = y) = P(Z = z \mid Y = y). $$

Show that:

$$ P(X = x, Z = z \mid Y = y) = P(X = x \mid Y = y)\, P(Z = z \mid Y = y), $$

that is, X and Z are independent given Y.
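The implication of this exercise can be checked exactly on an assumed toy chain X → Y → Z on {0, 1} (all the probabilities below are invented for the illustration): when $P(Z = z \mid X = x, Y = y) = P(Z = z \mid Y = y)$, the joint conditional law of (X, Z) given Y factorizes.

```python
# Toy check of Exercise 3.3 on {0, 1}: build a joint law in which Z depends
# on (X, Y) only through Y, then verify P(X, Z | Y) = P(X | Y) P(Z | Y).

from fractions import Fraction as F

pX = {0: F(1, 3), 1: F(2, 3)}                                            # assumed law of X
pY_X = {(0, 0): F(1, 4), (1, 0): F(3, 4), (0, 1): F(1, 2), (1, 1): F(1, 2)}  # P(Y=y|X=x), keyed (y, x)
pZ_Y = {(0, 0): F(2, 5), (1, 0): F(3, 5), (0, 1): F(1, 5), (1, 1): F(4, 5)}  # P(Z=z|Y=y), keyed (z, y)

def joint(x, y, z):  # the assumption forces this product form
    return pX[x] * pY_X[(y, x)] * pZ_Y[(z, y)]

for y in (0, 1):
    py = sum(joint(x, y, z) for x in (0, 1) for z in (0, 1))
    for x in (0, 1):
        for z in (0, 1):
            lhs = joint(x, y, z) / py                            # P(X=x, Z=z | Y=y)
            px_y = sum(joint(x, y, z2) for z2 in (0, 1)) / py    # P(X=x | Y=y)
            pz_y = sum(joint(x2, y, z) for x2 in (0, 1)) / py    # P(Z=z | Y=y)
            assert lhs == px_y * pz_y
```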

EXERCISE 3.4. (Markov chain).– Let (Xn, n ≥ 1) be a sequence of random variables taking values in a countable set D. We say that this is a Markov chain if:

$$ P(X_{n+1} = x_{n+1} \mid X_1 = x_1, \ldots, X_n = x_n) = P(X_{n+1} = x_{n+1} \mid X_n = x_n), $$

$x_1, \ldots, x_{n+1} \in D$, $n \ge 1$. Use Exercise 3.3 to show that $(X_1, \ldots, X_{n-1})$ and $(X_{n+1}, \ldots, X_{n+k})$ are independent given that $X_n = x_n$.

EXERCISE 3.5. (Markov process).– Let $(X_t, t \in T)$ be a family of real random variables defined on the probability space $(\Omega, \mathcal{A}, P)$, where $T \subseteq \mathbb{R}$. We set $\tau_n = \{t_1, \ldots, t_n\}$ $(t_1 < \cdots < t_n)$ and we denote by $\mathcal{F}_{\tau_n}$ the σ-algebra generated by $X_{t_1}, \ldots, X_{t_n}$, and by $\mathcal{F}_t$ the σ-algebra generated by $X_t$. Show that the following conditions are equivalent:

1) For all t1 < t2 < … < tn < tn+1 (tjT), n ≥ 1,

$$ P^{\mathcal{F}_{\tau_n}}(A) = P^{\mathcal{F}_{t_n}}(A) \quad \text{a.s.}, \quad A \in \sigma(X_{t_{n+1}}). $$

2) For all $t \in T$, if we denote by $\mathcal{F}_{\le t}$ the σ-algebra generated by $X_s$, $s \le t$, $s \in T$, and by $\mathcal{F}_{\ge t}$ the σ-algebra generated by $X_{s'}$, $s' \ge t$, $s' \in T$, then we have:

$$ P^{\mathcal{F}_t}(A \cap B) = P^{\mathcal{F}_t}(A)\, P^{\mathcal{F}_t}(B) \quad \text{a.s.}, \quad A \in \mathcal{F}_{\le t}, \; B \in \mathcal{F}_{\ge t}. $$


1 With the above notation, we have $N(A, X) = P^{\sigma(X)}(A)$ a.s.
