A.6 Spaces of measures

Let S be a topological space. S is called metrizable if there exists a metric d on S which generates the topology of S. That is, the open d-balls

Bε(x) := {y ∈ S | d(x, y) < ε}, x ∈ S, ε > 0,

form a base for the topology of S in the sense that a set U ⊂ S is open if and only if it can be written as a union of such d-balls. A convenient feature of metrizable spaces is that their topological properties can be characterized via convergent sequences. For instance, a subset A of the metrizable space S is closed if and only if for every convergent sequence in A its limit point is also contained in A. Moreover, a function f : S → ℝ is continuous at y ∈ S if and only if f(yn) converges to f(y) for every sequence (yn) converging to y. We write Cb(S) for the set of all bounded and continuous functions on S.

The metrizable space S is called separable if there exists a countable dense subset {x1, x2, . . . } of S. In this case, the Borel σ-algebra 𝒮 of S is generated by the open d-balls Bε(x) with radii ε > 0, ε ∈ ℚ, and centered at x ∈ {x1, x2, . . . }. In what follows, we will always assume that S is separable and metrizable. If, moreover, the metric d can be chosen to be complete, i.e., if every Cauchy sequence with respect to d converges to some point in S, then S is called a Polish space. Clearly, ℝ^d with the Euclidean distance is a complete and separable metric space, hence a Polish space.

Let us denote by

M(S) := M(S, 𝒮)

the set of all nonnegative finite measures on (S, 𝒮). Every μ ∈ M(S) is of the form μ = α ν for some factor α ∈ [0, ∞) and some probability measure ν on the measurable space (S, 𝒮). The space of all probability measures on (S, 𝒮) is denoted by

M1(S) = M1(S, 𝒮).

Definition A.40. The weak topology on M(S) is the coarsest topology for which all mappings

$$M(S) \ni \mu \longmapsto \int f \, d\mu, \qquad f \in C_b(S),$$

are continuous.

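It follows from this definition that the sets

$$U_\varepsilon(\mu; f_1, \dots, f_n) := \bigcap_{i=1}^{n} \Big\{ \nu \in M(S) \;\Big|\; \Big| \int f_i \, d\nu - \int f_i \, d\mu \Big| < \varepsilon \Big\} \tag{A.32}$$

for μ ∈ M(S), ε > 0, n ∈ ℕ, and f1, . . . , fn ∈ Cb(S) form a base for the weak topology on M(S); for details see, e.g., Section 2.13 of [3]. Since the constant function 1 is continuous,

$$M_1(S) = \Big\{ \mu \in M(S) \;\Big|\; \int 1 \, d\mu = 1 \Big\}$$

is a closed subset of M(S). A well-known example of weak convergence of probability measures is the classical central limit theorem; the following version is needed in Section 5.7.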

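Theorem A.41. Suppose that for each N ∈ ℕ we are given N independent random variables Y_1^(N), . . . , Y_N^(N) on (ΩN, ℱN, PN) which satisfy the following conditions:

There are constants γN such that γN → 0 and |Y_k^(N)| ≤ γN PN-a.s. for all k, and

$$\sum_{k=1}^{N} E_N\big[\, Y_k^{(N)} \,\big] \longrightarrow m, \qquad \sum_{k=1}^{N} \mathrm{var}_N\big( Y_k^{(N)} \big) \longrightarrow \sigma^2,$$

where EN and varN denote the expectation and the variance with respect to PN.

Then the distributions of

$$Z_N := \sum_{k=1}^{N} Y_k^{(N)}, \qquad N = 1, 2, \dots,$$

converge weakly to the normal distribution with mean m and variance σ².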

Proof. See, for instance, the corollary to Theorem 7.1.2 of [58].
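
As a rough numerical illustration of Theorem A.41, the following Python sketch uses a hypothetical triangular array that is not taken from the text: Y_k^(N) = m/N + σ εk/√N with independent signs εk = ±1, each of probability 1/2. These variables are bounded by γN = |m|/N + σ/√N, their expectations sum to m, and their variances sum to σ². The standardized sum (Z_N − m)/σ then equals (2B − N)/√N with a binomial variable B, so its distribution function can be computed exactly and compared with the standard normal one. The function names and the grid of evaluation points are assumptions made for this example.

```python
import math

def cdf_standardized_sum(N, t):
    """P[(Z_N - m) / sigma <= t], where (Z_N - m) / sigma = (2B - N) / sqrt(N)
    and B is binomial with parameters N and 1/2."""
    j_max = math.floor((N + t * math.sqrt(N)) / 2)
    if j_max < 0:
        return 0.0
    j_max = min(j_max, N)
    # Exact integer sum of binomial coefficients, divided once at the end.
    return sum(math.comb(N, j) for j in range(j_max + 1)) / 2 ** N

def cdf_standard_normal(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def max_gap(N, grid):
    """Largest distance between the two distribution functions on the grid."""
    return max(abs(cdf_standardized_sum(N, t) - cdf_standard_normal(t)) for t in grid)

grid = [k / 2 for k in range(-6, 7)]   # evaluation points -3.0, -2.5, ..., 3.0
for N in (4, 100, 400):
    print(N, round(max_gap(N, grid), 4))
```

The gap shrinks as N grows, in line with weak convergence; since the normal distribution function is continuous, every point of the grid is a continuity point of the limit.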

The following theorem allows us to examine the weak topology in terms of weakly converging sequences of measures.

Theorem A.42. The space M(S) is separable and metrizable for the weak topology. If S is Polish, then so is M(S). Moreover, if S0 is a dense subset of S, then the set

$$\Big\{ \sum_{i=1}^{n} \alpha_i \delta_{x_i} \;\Big|\; \alpha_i \in \mathbb{Q}_+,\ x_i \in S_0,\ n \in \mathbb{N} \Big\}$$

of simple measures on S0 with rational weights is dense in M(S) for the weak topology.

Proof. In most textbooks on measure theory, the previous result is proved for M1(S) instead of M(S); see, e.g., Theorem 14.12 of [3]. The general case requires only minor modifications. It is treated in full generality in Chapter IX, §5, of [34].

The following characterization of weak convergence in M(S) is known as the portmanteau theorem.

Theorem A.43. For any sequence μ, μ1, μ2, . . . of measures in M(S), the following conditions are equivalent:

(a) The sequence (μn)n∈ℕ converges weakly to μ.

(b) μn(S) → μ(S) and

$$\limsup_{n \uparrow \infty} \mu_n(A) \le \mu(A) \quad \text{for every closed set } A \subset S.$$

(c) μn(S) → μ(S) and

$$\liminf_{n \uparrow \infty} \mu_n(U) \ge \mu(U) \quad \text{for every open set } U \subset S.$$

(d) μn(B) → μ(B) for every Borel set B whose boundary ∂B is not charged by μ in the sense that μ(∂B) = 0.

(e) ∫ f dμn → ∫ f dμ for every bounded measurable function f which is μ-a.e. continuous.

(f) ∫ f dμn → ∫ f dμ for every bounded and uniformly continuous function f.

Proof. The result is proved for M1(S) in [3], Theorem 14.3. The general case requires only minor modifications; see Chapter IX of [34].
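
To see why the boundary condition in (d) cannot be dropped, consider the Dirac measures μn = δ_{1/n} on the real line, which converge weakly to μ = δ0 because f(1/n) → f(0) for every continuous f. The following Python sketch (the helper names are illustrative and not from the text) evaluates these measures on two intervals: B1 = (0, 1], whose boundary point 0 is charged by μ, and B2 = (−1, 1/2], whose boundary is not charged by μ.

```python
from fractions import Fraction

def dirac(x):
    """Dirac measure at x, acting on sets that are given by indicator predicates."""
    return lambda indicator: 1 if indicator(x) else 0

# B1 = (0, 1]: the boundary {0, 1} contains the atom of the limit delta_0.
in_B1 = lambda x: 0 < x <= 1
# B2 = (-1, 1/2]: the boundary {-1, 1/2} carries no mass under delta_0.
in_B2 = lambda x: -1 < x <= Fraction(1, 2)

mu = dirac(Fraction(0))
mass_B1 = [dirac(Fraction(1, n))(in_B1) for n in range(2, 50)]
mass_B2 = [dirac(Fraction(1, n))(in_B2) for n in range(2, 50)]

print(set(mass_B1), mu(in_B1))   # values of mu_n(B1) versus mu(B1)
print(set(mass_B2), mu(in_B2))   # values of mu_n(B2) versus mu(B2)
```

On B1 the masses μn(B1) stay at 1 while μ(B1) = 0, so convergence fails exactly where μ(∂B) > 0; on B2 the values agree.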

Exercise A.6.1. Use the portmanteau theorem to show that the following conditions are equivalent for any sequence (μn)n=0,1,... in M1(ℝ).

(a) μn converges weakly to μ0.

(b) The corresponding distribution functions satisfy

$$F_{\mu_0}(x-) \le \liminf_{n \uparrow \infty} F_{\mu_n}(x) \le \limsup_{n \uparrow \infty} F_{\mu_n}(x) \le F_{\mu_0}(x)$$

for all x ∈ ℝ.

(c) Fμn(x) → Fμ0(x) for any continuity point x of Fμ0.

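(d) For any choice of quantile functions qμn and all t ∈ (0, 1),

$$q^-_{\mu_0}(t) \le \liminf_{n \uparrow \infty} q_{\mu_n}(t) \le \limsup_{n \uparrow \infty} q_{\mu_n}(t) \le q^+_{\mu_0}(t).$$

(e) We have qμn(t) → qμ0(t) for any continuity point t of qμ0.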
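
The convergence of distribution and quantile functions in this exercise can be observed on a simple example that is not part of the text: the uniform distribution on the grid {1/n, 2/n, . . . , 1} converges weakly to the uniform distribution on [0, 1], whose distribution function is F(x) = x on [0, 1] and whose quantile function is q(t) = t. The Python sketch below uses exact rational arithmetic; the function names and the test points are assumptions chosen for illustration.

```python
import math
from fractions import Fraction

def cdf_grid(n, x):
    """Distribution function of the uniform distribution on {1/n, 2/n, ..., 1}."""
    k = min(max(math.floor(n * x), 0), n)
    return Fraction(k, n)

def quantile_grid(n, t):
    """Lower quantile function inf{x : F(x) >= t} of the same distribution, 0 < t < 1."""
    return Fraction(math.ceil(n * t), n)

points = [Fraction(1, 7), Fraction(1, 3), Fraction(9, 10)]
for n in (10, 100, 1000):
    cdf_err = max(abs(cdf_grid(n, x) - x) for x in points)
    q_err = max(abs(quantile_grid(n, t) - t) for t in points)
    print(n, cdf_err, q_err)   # both errors are bounded by 1/n
```

Every point of (0, 1) is a continuity point of the limiting distribution and quantile functions, so conditions (c) and (e) predict convergence at all of the chosen points.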

The next theorem can be regarded as a stability result for weak convergence.

Theorem A.44 (Slutsky). Suppose that, for n ∈ ℕ, Xn and Yn are real-valued random variables on (Ωn, ℱn, Pn) such that the laws of Xn converge weakly to the law of X, and the laws of Yn converge weakly to δy for some y ∈ ℝ. Then:

(a) The laws of Xn + Yn converge weakly to the law of X + y.

(b) The laws of Xn · Yn converge weakly to the law of X · y.

Proof. See, for instance, Section 8.1 of [57].

We turn now to the fundamental characterization of the relatively compact subsets of M(S) known as Prohorov's theorem.

Theorem A.45 (Prohorov). Let S be a Polish space. A subset M of M(S) is relatively compact for the weak topology if and only if

$$\sup_{\mu \in M} \mu(S) < \infty$$

and if M is tight, i.e., if for every ε > 0 there exists a compact subset K of S such that

$$\sup_{\mu \in M} \mu(K^c) \le \varepsilon.$$

In particular, M1(S) is weakly compact if S is a compact metric space.

Proof. For a proof in the context of probability measures, see for instance Theorem 1 in §III.2 of [262]. The general case requires only minor modifications; see Chapter IX of [34].
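
Tightness can be made concrete for two simple families of probability measures on S = [0, ∞); both families are chosen here for illustration. In the following Python sketch (with hypothetical helper names), the family {δn | n ∈ ℕ} places full mass outside any compact interval [0, R], whereas the family μn = ((n − 1)/n) δ0 + (1/n) δn keeps the mass outside [0, R] below 1/R. The supremum is evaluated over a finite range of n that extends beyond R, which is where the largest tail mass occurs for these two families.

```python
from fractions import Fraction

def tail_dirac(n, R):
    """Mass that delta_n assigns to the complement of [0, R]."""
    return Fraction(1) if n > R else Fraction(0)

def tail_mixture(n, R):
    """Mass that (1 - 1/n) delta_0 + (1/n) delta_n assigns to the complement of [0, R]."""
    return Fraction(1, n) if n > R else Fraction(0)

def sup_tail(tail, R, n_max):
    """Largest tail mass over n = 1, ..., n_max (n_max should exceed R)."""
    return max(tail(n, R) for n in range(1, n_max + 1))

for R in (10, 100, 1000):
    n_max = 10 * R
    print(R, sup_tail(tail_dirac, R, n_max), sup_tail(tail_mixture, R, n_max))
```

Since every compact subset of [0, ∞) is contained in some interval [0, R], the first family is not tight, while the second one is.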

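Example A.46. Take for S the positive half axis [0, ∞) and define

$$\mu_n := \frac{n-1}{n}\,\delta_0 + \frac{1}{n}\,\delta_n \quad \text{and} \quad \mu := \delta_0,$$

where δx denotes the Dirac point mass in x ∈ S, i.e., δx(A) = 1_A(x). Clearly,

$$\int f \, d\mu_n \longrightarrow \int f \, d\mu \quad \text{for all } f \in C_b(S),$$

so that μn converges weakly to μ. However, if we take the continuous but unbounded function f(x) = x, then ∫ f dμn = 1 for all n so that

$$\lim_{n \uparrow \infty} \int f \, d\mu_n \ne \int f \, d\mu.$$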

The preceding example shows that the weak topology is not an appropriate topology for ensuring the convergence of integrals against unbounded test functions. Let us introduce a suitable transformation of the weak topology which will allow us to deal with certain classes of unbounded functions.
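
The effect described in Example A.46 can be reproduced with exact arithmetic. The sketch below assumes the measures μn = ((n − 1)/n) δ0 + (1/n) δn with limit μ = δ0, which is consistent with the value ∫ f dμn = 1 stated for f(x) = x; the representation of a measure as a dictionary of atoms and weights, and the bounded test function x/(1 + x), are choices made for this illustration.

```python
from fractions import Fraction

def mu_n(n):
    """Atoms and weights of ((n - 1)/n) * delta_0 + (1/n) * delta_n."""
    return {0: Fraction(n - 1, n), n: Fraction(1, n)}

def integrate(f, measure):
    """Integral of f with respect to a finitely supported measure."""
    return sum(weight * f(x) for x, weight in measure.items())

bounded = lambda x: Fraction(x, 1 + x)   # continuous and bounded on [0, oo)
identity = lambda x: x                   # continuous but unbounded

for n in (1, 10, 100, 1000):
    print(n, integrate(bounded, mu_n(n)), integrate(identity, mu_n(n)))
```

The integrals of the bounded function equal 1/(n + 1) and tend to its integral 0 under δ0, while the integrals of the identity stay at 1 although its integral under δ0 is 0.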

We fix a continuous function

$$\psi : S \longrightarrow [1, \infty)$$

which will serve as a gauge function, and we denote by

Cψ(S)

the linear space of all continuous functions f on S for which there exists a constant c such that

|f(x)| ≤ c · ψ(x) for all x ∈ S.

Furthermore, we denote by

Mψ(S)

the set of all measures μ ∈ M(S) such that ∫ ψ dμ < ∞.

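Definition A.47. The ψ-weak topology on Mψ(S) is the coarsest topology for which all mappings

$$M^\psi(S) \ni \mu \longmapsto \int f \, d\mu, \qquad f \in C_\psi(S),$$

are continuous.

Since the gauge function ψ takes values in [1, ∞), every bounded continuous function f belongs to Cψ(S). It follows that all mappings

$$M^\psi(S) \ni \mu \longmapsto \int f \, d\mu, \qquad f \in C_b(S),$$

are continuous. In particular, the set

$$M_1^\psi(S) := \big\{ \mu \in M^\psi(S) \;\big|\; \mu(S) = 1 \big\}$$

of all Borel probability measures in Mψ(S) is closed for the ψ-weak topology. As in the case of the weak topology, it follows that the sets

$$U^\psi_\varepsilon(\mu; f_1, \dots, f_n) := \bigcap_{i=1}^{n} \Big\{ \nu \in M^\psi(S) \;\Big|\; \Big| \int f_i \, d\nu - \int f_i \, d\mu \Big| < \varepsilon \Big\}$$

for μ ∈ Mψ(S), ε > 0, n ∈ ℕ, and f1, . . . , fn ∈ Cψ(S) form a base for the ψ-weak topology on Mψ(S).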

Consider the map

$$\Psi : M(S) \longrightarrow M^\psi(S)$$

defined through

$$d\Psi(\mu) = \frac{1}{\psi} \, d\mu, \qquad \mu \in M(S).$$

Clearly, Ψ is a bijective mapping between the two sets M(S) and Mψ(S). Moreover, if we apply Ψ to an open neighborhood for the weak topology as in (A.32), we get

$$\Psi\big( U_\varepsilon(\mu; f_1, \dots, f_n) \big) = U^\psi_\varepsilon\big( \Psi(\mu); f_1 \psi, \dots, f_n \psi \big).$$

Since fψ ∈ Cψ(S) for each bounded and continuous function f, and since every function in Cψ(S) arises in this way, we conclude that a subset U of M(S) is weakly open if and only if Ψ(U) is open for the ψ-weak topology. Hence, Ψ is a homeomorphism. This observation allows us to translate statements for the weak topology into results for the ψ-weak topology:
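
The change of measure behind Ψ can be sketched for measures with finitely many atoms. The Python code below assumes that Ψ(μ) has density 1/ψ with respect to μ, and it uses the gauge function ψ(x) = 1 + x² purely as an example; the dictionary representation of measures and all function names are likewise assumptions. It checks that integrating f·ψ against Ψ(μ) gives the same value as integrating f against μ, which is the identity that carries weak neighborhoods to ψ-weak neighborhoods.

```python
from fractions import Fraction

def psi(x):
    """Example gauge function with values in [1, oo); an assumption for this sketch."""
    return 1 + x * x

def Psi(mu):
    """Image of a finitely supported measure (dict: atom -> weight) under d Psi(mu) = (1/psi) d mu."""
    return {x: w / psi(x) for x, w in mu.items()}

def Psi_inverse(nu):
    """Inverse map: multiply the weights by psi again."""
    return {x: w * psi(x) for x, w in nu.items()}

def integrate(f, mu):
    return sum(w * f(x) for x, w in mu.items())

mu = {Fraction(0): Fraction(1, 2), Fraction(3): Fraction(1, 3), Fraction(-2): Fraction(5, 4)}
f = lambda x: 1 / (1 + abs(x))     # bounded and continuous
f_psi = lambda x: f(x) * psi(x)    # dominated by a multiple of psi

print(integrate(f, mu), integrate(f_psi, Psi(mu)))
```

Because the weights are rational, both integrals agree exactly, and applying Psi_inverse after Psi returns the original measure.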

Corollary A.48. For separable and metrizable S, the space Mψ(S) is separable and metrizable for the ψ-weak topology. If S is Polish, then so is Mψ(S). Moreover, if S0 is a dense subset of S, then the set

$$\Big\{ \sum_{i=1}^{n} \alpha_i \delta_{x_i} \;\Big|\; \alpha_i \in \mathbb{Q}_+,\ x_i \in S_0,\ n \in \mathbb{N} \Big\}$$

of simple measures on S0 with rational weights is dense in Mψ(S) for the ψ-weak topology.

The preceding corollary implies in particular that it suffices to consider ψ-weakly converging sequences when studying the ψ-weak topology. The following corollary is implied by the portmanteau theorem.

Corollary A.49. A sequence (μn)n∈ℕ in Mψ(S) converges ψ-weakly to μ if and only if

$$\int f \, d\mu_n \longrightarrow \int f \, d\mu$$

for every measurable function f which is μ-a.e. continuous and for which there exists a constant c such that |f| ≤ c · ψ μ-almost everywhere.

Prohorov's theorem translates as follows to our present setting:

Corollary A.50. Let S be a Polish space and M be a subset of Mψ(S). The following conditions are equivalent:

(a) M is relatively compact for the ψ-weak topology.

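(b) We have

$$\sup_{\mu \in M} \int \psi \, d\mu < \infty,$$

and for every ε > 0 there exists a compact subset K of S such that

$$\sup_{\mu \in M} \int_{K^c} \psi \, d\mu \le \varepsilon.$$

(c) There exists a measurable function ϕ : S → [1, ∞] such that each set

{x ∈ S | ϕ(x) ≤ n ψ(x)}, n ∈ ℕ,

is relatively compact in S, and such that

$$\sup_{\mu \in M} \int \phi \, d\mu < \infty.$$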

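Proof. (a)⇔(b): This follows immediately from Theorem A.45 and the fact that Ψ is a homeomorphism.

(b)⇒(c): Take an increasing sequence K1 ⊂ K2 ⊂ · · · of compact sets in S such that

$$\sup_{\mu \in M} \int_{K_n^c} \psi \, d\mu \le 2^{-n},$$

and define ϕ by

$$\phi(x) := \psi(x) + \sum_{n=1}^{\infty} \mathbf{1}_{K_n^c}(x) \, \psi(x).$$

Then {ϕ ≤ nψ} ⊂ Kn. Moreover,

$$\sup_{\mu \in M} \int \phi \, d\mu \le \sup_{\mu \in M} \int \psi \, d\mu + \sum_{n=1}^{\infty} 2^{-n} < \infty.$$

(c)⇒(b): Since {ϕ ≤ ψ} is relatively compact and ψ is continuous, we have that

$$c := \sup\big\{ \psi(x) \;\big|\; \phi(x) \le \psi(x) \big\} < \infty.$$

Hence, ψ ≤ c + ϕ ≤ (1 + c)ϕ, and therefore

$$\sup_{\mu \in M} \int \psi \, d\mu \le (1 + c) \sup_{\mu \in M} \int \phi \, d\mu < \infty.$$

Moreover, for n ≥ ε⁻¹ sup_{μ∈M} ∫ ϕ dμ, the relatively compact set K := {ϕ ≤ nψ} satisfies

$$\sup_{\mu \in M} \int_{K^c} \psi \, d\mu \le \frac{1}{n} \sup_{\mu \in M} \int_{K^c} \phi \, d\mu \le \varepsilon,$$

and so condition (b) is satisfied.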

We turn now to the task of identifying a linear functional on a space of functions as the integral with respect to a suitable measure.

Theorem A.51 (Riesz). Let S be a compact metric space and suppose that I is a linear functional on C(S) that is nonnegative in the sense that f ≥ 0 everywhere on S implies I(f) ≥ 0. Then there exists a unique positive Borel measure μ on S such that

$$I(f) = \int f \, d\mu \quad \text{for all } f \in C(S).$$

The preceding theorem can be deduced from the next result; see Exercise A.6.2 below. To state this result, we need the notion of a vector lattice of real-valued functions on an arbitrary set S. This is a linear space L that is stable under the operation of taking the pointwise maximum: for f, g ∈ L also f ∨ g ∈ L. The vector lattice L is called a Stone vector lattice if f ∧ 1 ∈ L for all f ∈ L. One example is the space of all bounded measurable functions on (S, 𝒮). Another one is the space Cb(S) of all bounded continuous functions on a separable metric space S. In this case, the σ-algebra σ(L) generated by L = Cb(S) coincides with the Borel σ-algebra of S.

Theorem A.52 (Daniell–Stone). Let I be a linear functional on a Stone vector lattice L of functions on S such that the following conditions hold:

(a) I is nonnegative in the sense that f ≥ 0 everywhere on S implies I(f) ≥ 0.

(b) If (fn) is a sequence in L such that fn ↘ 0 pointwise on S, then I(fn) ↘ 0.

Then there exists a unique positive measure μ on the measurable space (S, σ(L)) such that

$$I(f) = \int f \, d\mu \quad \text{for all } f \in L.$$

Proof. See, e.g., Theorem 4.5.2 of [100].

Exercise A.6.2. Use Dini's lemma, as recalled in Lemma 4.24, to deduce Theorem A.51 from Theorem A.52.

Without the continuity assumption (b) in Theorem A.52, the representation of positive linear functionals on the space of bounded measurable functions takes a different form, as we will discuss now.

Definition A.53. Let (Ω, ℱ) be a measurable space. A mapping μ : ℱ → ℝ is called a finitely additive set function if μ(∅) = 0, and if for any finite collection A1, . . . , An ∈ ℱ of mutually disjoint sets

$$\mu\Big( \bigcup_{i=1}^{n} A_i \Big) = \sum_{i=1}^{n} \mu(A_i).$$

We denote by M1,f := M1,f(Ω, ℱ) the set of all those finitely additive set functions μ : ℱ → [0, 1] which are normalized to μ(Ω) = 1. The total variation of a finitely additive set function μ is defined as

$$\|\mu\|_{\mathrm{var}} := \sup\Big\{ \sum_{i=1}^{n} |\mu(A_i)| \;\Big|\; A_1, \dots, A_n \text{ disjoint sets in } \mathcal{F},\ n \in \mathbb{N} \Big\}.$$

The space of all finitely additive measures μ whose total variation is finite is denoted by ba(Ω, ℱ).

We will now give a brief outline of the integration theory with respect to a measure μ ∈ ba := ba(Ω, ℱ); for details we refer to Chapter III in [104]. The space 𝒳 of all bounded measurable functions on (Ω, ℱ) is a Banach space if endowed with the supremum norm,

$$\|F\| := \sup_{\omega \in \Omega} |F(\omega)|, \qquad F \in \mathcal{X}.$$

Let 𝒳0 denote the linear subspace of all finitely valued step functions which can be represented in the form

$$F = \sum_{i=1}^{n} \alpha_i \mathbf{1}_{A_i},$$

for some n ∈ ℕ, αi ∈ ℝ, and disjoint sets A1, . . . , An ∈ ℱ. For this F we define

$$\int F \, d\mu := \sum_{i=1}^{n} \alpha_i \, \mu(A_i),$$

and one can check that this definition is independent of the particular representation of F. Moreover,

$$\Big| \int F \, d\mu \Big| \le \|F\| \cdot \|\mu\|_{\mathrm{var}}. \tag{A.33}$$

Since 𝒳0 is dense in 𝒳 with respect to ‖·‖, this inequality allows us to define the integral on the full space 𝒳 as the extension of the continuous linear functional 𝒳0 ∋ F ↦ ∫ F dμ. Clearly, M1,f is contained in ba, and we will denote the integral of a function F ∈ 𝒳 with respect to Q ∈ M1,f by

$$E_Q[\,F\,] := \int F \, dQ.$$
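
The integral of a step function and the norm inequality can be checked by hand on a finite set. The following Python sketch assumes Ω = {0, . . . , 5} with all subsets measurable and a signed set function given by hypothetical point masses; on such a finite set the total variation is the sum of the absolute point masses. It evaluates the integral for two different representations of the same step function and compares it with the bound ‖F‖ · ‖μ‖var.

```python
from fractions import Fraction

# Hypothetical point masses on Omega = {0, ..., 5}; they determine mu on all subsets.
point_mass = {0: Fraction(1, 2), 1: Fraction(-1, 4), 2: Fraction(1, 8),
              3: Fraction(0), 4: Fraction(-3, 8), 5: Fraction(1)}

def mu(A):
    return sum(point_mass[w] for w in A)

def total_variation():
    # On a finite set the supremum over partitions is attained by the singletons.
    return sum(abs(m) for m in point_mass.values())

def integrate_step(representation):
    """Integral of F = sum_i alpha_i 1_{A_i}, given as a list of (alpha_i, A_i) with disjoint A_i."""
    return sum(alpha * mu(A) for alpha, A in representation)

def sup_norm(representation):
    return max(abs(alpha) for alpha, A in representation if A)

# Two representations of the same step function F.
rep1 = [(Fraction(2), {0, 1, 2}), (Fraction(-1), {3, 4, 5})]
rep2 = [(Fraction(2), {0}), (Fraction(2), {1, 2}), (Fraction(-1), {3}), (Fraction(-1), {4, 5})]

print(integrate_step(rep1), integrate_step(rep2), sup_norm(rep1) * total_variation())
```

Both representations give the same integral, and its absolute value stays below the product of the supremum norm and the total variation.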

Theorem A.54. The integral

$$\ell(F) = \int F \, d\mu, \qquad F \in \mathcal{X},$$

defines a one-to-one correspondence between continuous linear functionals ℓ on 𝒳 and finitely additive set functions μ ∈ ba.

Proof. By definition of the integral and by (A.33), it is clear that any μ ∈ ba defines a continuous linear functional on 𝒳. Conversely, if a continuous linear functional ℓ is given, then we can define a finitely additive set function μ on (Ω, ℱ) by

$$\mu(A) := \ell(\mathbf{1}_A), \qquad A \in \mathcal{F}.$$

If L ≥ 0 is such that |ℓ(F)| ≤ L for ‖F‖ ≤ 1, then ‖μ‖var ≤ L, and so μ ∈ ba. One then checks that the integral with respect to μ coincides with ℓ on 𝒳0. Since 𝒳0 is dense in 𝒳, we see that ∫ F dμ and ℓ(F) coincide for all F ∈ 𝒳.
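
The passage from a linear functional to a set function can be traced on a finite set, where every function is a step function. In the Python sketch below, the functional ℓ is a hypothetical example given by fixed rational weights; the set function is recovered by evaluating ℓ on indicator functions, μ(A) := ℓ(1_A), and the integral with respect to μ is then compared with ℓ itself.

```python
from fractions import Fraction

OMEGA = range(4)
WEIGHTS = [Fraction(1, 2), Fraction(-1, 3), Fraction(0), Fraction(5, 6)]  # hypothetical

def ell(F):
    """A linear functional on functions F: OMEGA -> Q, used only as an example."""
    return sum(c * F(w) for c, w in zip(WEIGHTS, OMEGA))

def indicator(A):
    return lambda w: 1 if w in A else 0

def mu(A):
    """Set function recovered from the functional via mu(A) := ell(1_A)."""
    return ell(indicator(A))

def integral(F):
    # Every F on a finite set is a step function: F = sum_w F(w) 1_{w}.
    return sum(F(w) * mu({w}) for w in OMEGA)

F = lambda w: w * w - 2
print(ell(F), integral(F))                          # the two values coincide
print(mu({0, 1}) + mu({2, 3}), mu({0, 1, 2, 3}))    # finite additivity
```

Linearity of ℓ is what makes μ finitely additive; nothing in the construction yields σ-additivity, which is why the correspondence is with ba rather than with a space of measures.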

Remark A.55. Theorem A.54 yields in particular a one-to-one correspondence between set functions Q ∈ M1,f and continuous linear functionals ℓ on 𝒳 such that ℓ(1) = 1 and ℓ(X) ≥ 0 for X ≥ 0.

Example A.56. Clearly, the set M1,f coincides with the set M1 := M1(Ω, ℱ) of all σ-additive probability measures if (Ω, ℱ) can be reduced to a finite set, in the sense that ℱ is generated by a finite partition of Ω. Otherwise, M1,f is strictly larger than M1. Suppose in fact that there are infinitely many disjoint sets A1, A2, . . . ∈ ℱ, take ωn ∈ An, and define

$$\ell_n(X) := \frac{1}{n} \sum_{i=1}^{n} X(\omega_i), \qquad n = 1, 2, \dots$$

The continuous linear functionals ℓn on 𝒳 belong to the unit ball B1 in the dual Banach space 𝒳′. By Theorem A.66, there exists a cluster point ℓ of (ℓn). For any X ∈ 𝒳 there is a subsequence (nk) such that ℓnk(X) → ℓ(X). This implies that ℓ(X) ≥ 0 for X ≥ 0 and ℓ(1) = 1. Hence, Theorem A.54 allows us to write ℓ(X) = EQ[ X ] for some Q ∈ M1,f. But Q is not σ-additive, since Q[ An ] = ℓ(1_{An}) = 0 and
