Let S be a topological space. S is called metrizable if there exists a metric d on S which generates the topology of S. That is, the open d-balls
Bε(x) := {y ∈ S | d(x, y) < ε}, x ∈ S, ε > 0,
form a base for the topology of S in the sense that a set U ⊂ S is open if and only if it can be written as a union of such d-balls. A convenient feature of metrizable spaces is that their topological properties can be characterized via convergent sequences. For instance, a subset A of the metrizable space S is closed if and only if the limit of every convergent sequence in A is again contained in A. Moreover, a function f : S → ℝ is continuous at y ∈ S if and only if f(yn) converges to f(y) for every sequence (yn) converging to y. We write Cb(S) for the set of all bounded and continuous functions on S.
The metrizable space S is called separable if there exists a countable dense subset {x1, x2, . . . } of S. In this case, the Borel σ-algebra 𝒮 of S is generated by the open d-balls Bε(x) with rational radii ε > 0 and centers x ∈ {x1, x2, . . . }. In what follows, we will always assume that S is separable and metrizable. If, moreover, the metric d can be chosen to be complete, i.e., such that every Cauchy sequence with respect to d converges to some point in S, then S is called a Polish space. Clearly, ℝ^d with the Euclidean distance is a complete and separable metric space, hence a Polish space.
Let us denote by
M(S) := M(S, 𝒮)
the set of all nonnegative finite measures on (S, 𝒮). Every μ ∈ M(S) is of the form μ = α ν for some factor α ∈ [0,∞) and some probability measure ν on the measurable space (S, 𝒮). The space of all probability measures on (S, 𝒮) is denoted by
M1(S) = M1(S, 𝒮).
Definition A.40. The weak topology on M(S) is the coarsest topology for which all mappings
M(S) ∋ μ ↦ ∫ f dμ,  f ∈ Cb(S),
are continuous.
◊
It follows from this definition that the sets
U_ε(μ; f_1, . . . , f_n) := ⋂_{i=1}^{n} { ν ∈ M(S) | |∫ f_i dν − ∫ f_i dμ| < ε }   (A.32)
for μ ∈ M(S), ε > 0, n ∈ ℕ, and f1, . . . , fn ∈ Cb(S) form a base for the weak topology on M(S); for details see, e.g., Section 2.13 of [3]. Since the constant function 1 is continuous,
M1(S) = { μ ∈ M(S) | μ(S) = 1 }
is a closed subset of M(S). A well-known example of weak convergence of probability measures is the classical central limit theorem; the following version is needed in Section 5.7.
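Theorem A.41. Suppose that for each N ∈ ℕ we are given N independent random variables Y_1^(N), . . . , Y_N^(N) on (ΩN, FN, PN) which satisfy the following conditions:
– There are constants γN such that γN → 0 and |Y_k^(N)| ≤ γN PN-a.s. for all k.
– ∑_{k=1}^{N} E_N[ Y_k^(N) ] → m.
– ∑_{k=1}^{N} var_N( Y_k^(N) ) → σ², where E_N and var_N denote the expectation and the variance with respect to PN.
Then the distributions of
Z_N := ∑_{k=1}^{N} Y_k^(N),  N = 1, 2, . . . ,
converge weakly to the normal distribution with mean m and variance σ².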
Proof. See, for instance, the corollary to Theorem 7.1.2 of [58].
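As a numerical illustration (a sketch under assumptions that are not part of the text), take N independent summands of the form m/N + σ ε_k/√N with fair signs ε_k = ±1. These summands are uniformly bounded by |m|/N + σ/√N → 0, their means add up to m, and their variances add up to σ². The sum is an affine function of a binomial random variable, so its distribution function can be computed exactly and compared with the normal limit. The function names and the default parameter values are illustrative choices.

```python
from math import comb, erf, sqrt


def normal_cdf(x, m, sigma):
    """Distribution function of the normal law with mean m and variance sigma**2."""
    return 0.5 * (1.0 + erf((x - m) / (sigma * sqrt(2.0))))


def sum_cdf(x, N, m, sigma):
    """Exact P[Z_N <= x] for Z_N = sum of N terms m/N + sigma*eps_k/sqrt(N).

    If B counts the signs eps_k equal to +1, then B is Binomial(N, 1/2) and
    Z_N = m + sigma * (2B - N) / sqrt(N).
    """
    favourable = 0
    for b in range(N + 1):
        z = m + sigma * (2 * b - N) / sqrt(N)
        if z <= x:
            favourable += comb(N, b)
    return favourable / 2 ** N


def cdf_gap(x, N, m=0.2, sigma=1.5):
    """Absolute difference between the exact and the limiting distribution function."""
    return abs(sum_cdf(x, N, m, sigma) - normal_cdf(x, m, sigma))
```

Since the normal distribution function is continuous, weak convergence amounts here to pointwise convergence of the distribution functions, so `cdf_gap(x, N)` should be small for large N at any fixed x.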
The following theorem allows us to examine the weak topology in terms of weakly converging sequences of measures.
Theorem A.42. The space M(S) is separable and metrizable for the weak topology. If S is Polish, then so is M(S). Moreover, if S0 is a dense subset of S, then the set
{ ∑_{i=1}^{n} α_i δ_{x_i} | α_i ∈ ℚ_+, x_i ∈ S0, n ∈ ℕ }
of simple measures on S0 with rational weights is dense in M(S) for the weak topology.
Proof. In most textbooks on measure theory, the previous result is proved for M1(S) instead of M(S); see, e.g., Theorem 14.12 of [3]. The general case requires only minor modifications. It is treated in full generality in Chapter IX, §5, of [34].
The following characterization of weak convergence in M(S) is known as the “portmanteau theorem”.
Theorem A.43. For any sequence μ, μ1, μ2, . . . of measures in M(S), the following conditions are equivalent:
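(a) The sequence (μn)n∈ℕ converges weakly to μ.
(b) μn(S) → μ(S) and limsup_{n↑∞} μn(A) ≤ μ(A) for every closed set A ⊂ S.
(c) μn(S) → μ(S) and liminf_{n↑∞} μn(U) ≥ μ(U) for every open set U ⊂ S.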
(d) μn(B) → μ(B) for every Borel set B whose boundary ∂B is not charged by μ in the sense that μ(∂B) = 0.
(e) ∫ f dμn → ∫ f dμ for every bounded measurable function f which is μ-a.e. continuous.
(f) ∫ f dμn → ∫ f dμ for every bounded and uniformly continuous function f .
Proof. The result is proved for M1(S) in [3], Theorem 14.3. The general case requires only minor modifications; see Chapter IX of [34].
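A simple worked example, added here only as an illustration, shows why the boundary condition in (d) cannot be dropped. Take S = ℝ, μn = δ_{1/n}, and μ = δ_0; the set B below is an arbitrary choice.

```latex
\[
\int f \, d\mu_n = f(1/n) \longrightarrow f(0) = \int f \, d\mu
\qquad \text{for all } f \in C_b(\mathbb{R}),
\]
so $\mu_n \to \mu$ weakly. For $B = (0,1]$, however,
\[
\mu_n(B) = 1 \ \text{ for all } n \ge 1, \qquad \mu(B) = 0, \qquad
\mu(\partial B) = \mu(\{0,1\}) = 1 .
\]
```

The convergence μn(B) → μ(B) fails precisely because μ charges the boundary point 0 of B.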
Exercise A.6.1. Use the portmanteau theorem to show that the following conditions are equivalent for any sequence (μn)n=0,1,... in M1(ℝ).
(a) μn converges weakly to μ0.
(b) The corresponding distribution functions F_μn satisfy
F_μ0(x−) ≤ liminf_{n↑∞} F_μn(x) ≤ limsup_{n↑∞} F_μn(x) ≤ F_μ0(x)
for all x ∈ ℝ.
(c) F_μn(x) → F_μ0(x) for any continuity point x of F_μ0.
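(d) For any choice of quantile functions q_μn and all t ∈ (0, 1),
q⁻_μ0(t) ≤ liminf_{n↑∞} q_μn(t) ≤ limsup_{n↑∞} q_μn(t) ≤ q⁺_μ0(t),
where q⁻_μ0 and q⁺_μ0 denote the lower and upper quantile functions of μ0.
(e) We have q_μn(t) → q_μ0(t) for any continuity point t of q_μ0.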
◊
The next theorem can be regarded as a stability result for weak convergence.
Theorem A.44 (Slutsky). Suppose that, for n ∈ ℕ, Xn and Yn are real-valued random variables on (Ωn ,Fn , Pn) such that the laws of Xn converge weakly to the law of X, and the laws of Yn converge weakly to δy for some y ∈ ℝ. Then:
(a) The laws of Xn + Yn converge weakly to the law of X + y.
(b) The laws of Xn · Yn converge weakly to the law of X · y.
Proof. See, for instance, Section 8.1 of [57].
We turn now to the fundamental characterization of the relatively compact subsets of M(S) known as Prohorov’s theorem.
Theorem A.45 (Prohorov). Let S be a Polish space. A subset M of M(S) is relatively compact for the weak topology if and only if
sup_{μ∈M} μ(S) < ∞
and if M is tight, i.e., if for every ε > 0 there exists a compact subset K of S such that
sup_{μ∈M} μ(K^c) ≤ ε.
In particular, M1(S) is weakly compact if S is a compact metric space.
Proof. For a proof in the context of probability measures, see for instance Theorem 1 in §III.2 of [262]. The general case requires only minor modifications; see Chapter IX of [34].
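To see that tightness cannot be omitted, consider the following standard example, which is not taken from the text: the family M = {δ_n | n ∈ ℕ} of point masses on S = ℝ. Tightness is read here as the requirement that sup_{μ∈M} μ(K^c) ≤ ε for some compact K.

```latex
\[
\sup_{n} \delta_n(\mathbb{R}) = 1 < \infty,
\qquad\text{but}\qquad
\sup_{n} \delta_n(K^c) = 1 \quad \text{for every compact } K \subset \mathbb{R},
\]
since $K \subset [-R, R]$ for some $R$ and $\delta_n(K^c) = 1$ as soon as $n > R$.
Moreover, for every continuous $f$ with compact support,
\[
\int f \, d\delta_n = f(n) \longrightarrow 0 ,
\]
so a weak limit $\mu$ of a subsequence would satisfy $\int f \, d\mu = 0$ for all such $f$,
hence $\mu = 0$, in contradiction to $\mu(\mathbb{R}) = \lim_k \delta_{n_k}(\mathbb{R}) = 1$.
```

The family is bounded in total mass but not relatively compact: the mass escapes to infinity.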
Example A.46. Take for S the positive half axis [0,∞) and define
μ_n := ((n − 1)/n) δ_0 + (1/n) δ_n  and  μ := δ_0,
where δx denotes the Dirac point mass in x ∈ S, i.e., δx(A) = 1_A(x). Clearly,
∫ f dμ_n → ∫ f dμ  for all f ∈ Cb(S),
so that μn converges weakly to μ. However, if we take the continuous but unbounded function f(x) = x, then ∫ f dμn = 1 for all n so that
lim_{n↑∞} ∫ f dμ_n = 1 ≠ 0 = ∫ f dμ.
◊
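The example can be checked numerically. The sketch below assumes the choice μn = (1 − 1/n) δ0 + (1/n) δn with limit μ = δ0, represents a finitely supported measure as a list of (point, weight) pairs, and uses hypothetical helper names; the bounded test function arctan is an arbitrary choice.

```python
from math import atan


def mu(n):
    """Atoms (point, weight) of mu_n = (1 - 1/n) * delta_0 + (1/n) * delta_n."""
    return [(0.0, 1.0 - 1.0 / n), (float(n), 1.0 / n)]


def integrate(f, atoms):
    """Integral of f with respect to a finitely supported measure."""
    return sum(weight * f(point) for point, weight in atoms)


def bounded(x):
    """A bounded continuous test function on [0, oo)."""
    return atan(x)


def identity(x):
    """The continuous but unbounded function f(x) = x."""
    return x


# Integrals against the bounded function shrink towards atan(0) = 0,
# while the integrals of the identity stay at 1.
for n in (10, 100, 1000):
    print(n, integrate(bounded, mu(n)), integrate(identity, mu(n)))
```

The integral of the identity against the limit δ0 is 0, so the second column does not converge to the integral against the weak limit.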
The preceding example shows that the weak topology is not an appropriate topology for ensuring the convergence of integrals against unbounded test functions. Let us introduce a suitable transformation of the weak topology which will allow us to deal with certain classes of unbounded functions.
We fix a continuous function
ψ : S → [1,∞)
which will serve as a gauge function, and we denote by
Cψ(S)
the linear space of all continuous functions f on S for which there exists a constant c such that
|f (x)| ≤ c · ψ(x) for all x ∈ S.
Furthermore, we denote by
Mψ(S)
the set of all measures μ ∈ M(S) such that ∫ ψ dμ < ∞.
Definition A.47. The ψ-weak topology on Mψ(S) is the coarsest topology for which all mappings
Mψ(S) ∋ μ ↦ ∫ f dμ,  f ∈ Cψ(S),
are continuous.
◊
Since the gauge function ψ takes values in [1,∞), every bounded continuous function f belongs to Cψ(S). It follows that all mappings
Mψ(S) ∋ μ ↦ ∫ f dμ,  f ∈ Cb(S),
are continuous. In particular, the set
M1^ψ(S) := { μ ∈ Mψ(S) | μ(S) = 1 }
of all Borel probability measures in Mψ(S) is closed for the ψ-weak topology. As in the case of the weak topology, it follows that the sets
U^ψ_ε(μ; f_1, . . . , f_n) := ⋂_{i=1}^{n} { ν ∈ Mψ(S) | |∫ f_i dν − ∫ f_i dμ| < ε }
for μ ∈ Mψ(S), ε > 0, n ∈ ℕ, and f1, . . . , fn ∈ Cψ(S) form a base for the ψ-weak topology on Mψ(S).
Consider the map
Ψ : M(S) → Mψ(S)
defined through
dΨ(μ) = (1/ψ) dμ,  μ ∈ M(S).
Clearly, Ψ is a bijective mapping between the two sets M(S) and Mψ(S). Moreover, if we apply Ψ to an open neighborhood U_ε(μ; f_1, . . . , f_n) for the weak topology as in (A.32), we get
Ψ( U_ε(μ; f_1, . . . , f_n) ) = U^ψ_ε( Ψ(μ); f_1ψ, . . . , f_nψ ).
Since fψ ∈ Cψ(S) for each bounded and continuous function f , and since every function in Cψ(S) arises in this way, we conclude that a subset U of M(S) is weakly open if and only if Ψ(U) is open for the ψ-weak topology. Hence, Ψ is a homeomorphism. This observation allows us to translate statements for the weak topology into results for the ψ-weak topology:
Corollary A.48. For separable and metrizable S, the space Mψ(S) is separable and metrizable for the ψ-weak topology. If S is Polish, then so is Mψ(S). Moreover, if S0 is a dense subset of S, then the set
{ ∑_{i=1}^{n} α_i δ_{x_i} | α_i ∈ ℚ_+, x_i ∈ S0, n ∈ ℕ }
of simple measures on S0 with rational weights is dense in Mψ(S) for the ψ-weak topology.
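The homeomorphism property of Ψ rests on a short computation. The following lines are a sketch that assumes, as the notation suggests, that Ψ(ν) is the measure with density 1/ψ with respect to ν; here ν ∈ M(S) and f ∈ Cb(S).

```latex
\[
\int \psi \, d\Psi(\nu) = \int \psi \cdot \frac{1}{\psi} \, d\nu = \nu(S) < \infty ,
\qquad
\int f\psi \, d\Psi(\nu) = \int f \, d\nu .
\]
Conversely, every $g \in C_\psi(S)$ can be written as $g = f\psi$ with
$f := g/\psi \in C_b(S)$, and the inverse map sends $\lambda \in \mathcal{M}^\psi(S)$
to the finite measure with density $\psi$ with respect to $\lambda$.
```

The first identity shows that Ψ(ν) indeed belongs to Mψ(S); the second shows that testing Ψ(ν) against fψ is the same as testing ν against f.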
The preceding corollary implies in particular that it suffices to consider ψ-weakly converging sequences when studying the ψ-weak topology. The following corollary is implied by the portmanteau theorem.
Corollary A.49. A sequence (μn)n∈ℕ in Mψ(S) converges ψ-weakly to μ if and only if
∫ f dμ_n → ∫ f dμ
for every measurable function f which is μ-a.e. continuous and for which there exists a constant c such that |f| ≤ c · ψ μ-almost everywhere.
Prohorov’s theorem translates as follows to our present setting:
Corollary A.50. Let S be a Polish space and M be a subset of Mψ(S). The following conditions are equivalent:
(a) M is relatively compact for the ψ-weak topology.
(b) We have
sup_{μ∈M} ∫ ψ dμ < ∞,
and for every ε > 0 there exists a compact subset K of S such that
sup_{μ∈M} ∫_{K^c} ψ dμ ≤ ε.
(c) There exists a measurable function ϕ : S → [1,∞] such that each set
{x ∈ S | ϕ(x) ≤ n ψ(x)} , n ∈ ℕ ,
is relatively compact in S, and such that
sup_{μ∈M} ∫ ϕ dμ < ∞.
Proof. (a)⇔(b): This follows immediately from Theorem A.45 and the fact that Ψ is a homeomorphism.
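(b)⇒(c): Take an increasing sequence K1 ⊂ K2 ⊂ · · · of compact sets in S such that
sup_{μ∈M} ∫_{K_n^c} ψ dμ ≤ 2^{−n},
and define ϕ by
ϕ(x) := ψ(x) + ∑_{n=1}^{∞} 1_{K_n^c}(x) ψ(x).
Then {ϕ ≤ nψ} ⊂ Kn. Moreover,
sup_{μ∈M} ∫ ϕ dμ ≤ sup_{μ∈M} ∫ ψ dμ + ∑_{n=1}^{∞} 2^{−n} < ∞.
(c)⇒(b): Since {ϕ ≤ ψ} is relatively compact, we have that
c := sup{ ψ(x) | ϕ(x) ≤ ψ(x) } < ∞.
Hence, ψ ≤ c + ϕ ≤ (1 + c)ϕ, and therefore
sup_{μ∈M} ∫ ψ dμ ≤ (1 + c) sup_{μ∈M} ∫ ϕ dμ < ∞.
Moreover, for n ≥ ε^{−1} sup_{μ∈M} ∫ ϕ dμ, the relatively compact set K := {ϕ ≤ nψ} satisfies
sup_{μ∈M} ∫_{K^c} ψ dμ ≤ (1/n) sup_{μ∈M} ∫ ϕ dμ ≤ ε,
and so condition (b) is satisfied.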
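Two inequalities in the step (c)⇒(b) deserve a line of justification. The sketch below assumes ϕ ≥ 1 as in the statement and writes c ≥ 0 for a finite constant bounding the continuous function ψ on the relatively compact set {ϕ ≤ ψ}.

```latex
\[
\psi \le c \ \text{ on } \{\varphi \le \psi\}, \qquad
\psi < \varphi \ \text{ on } \{\varphi > \psi\},
\qquad\text{hence}\qquad
\psi \le c + \varphi \le (1+c)\,\varphi ,
\]
where the last step uses $\varphi \ge 1$. Similarly, on $K^c = \{\varphi > n\psi\}$ we have
$\psi < \varphi / n$, so that for every $\mu \in M$
\[
\int_{K^c} \psi \, d\mu \le \frac{1}{n} \int \varphi \, d\mu \le \varepsilon
\qquad \text{whenever } n \ge \varepsilon^{-1} \sup_{\mu \in M} \int \varphi \, d\mu .
\]
```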
We turn now to the task of identifying a linear functional on a space of functions as the integral with respect to a suitable measure.
Theorem A.51 (Riesz). Let S be a compact metric space and suppose that I is a linear functional on C(S) that is nonnegative in the sense that f ≥ 0 everywhere on S implies I(f) ≥ 0. Then there exists a unique positive Borel measure μ on S such that
I(f) = ∫ f dμ  for all f ∈ C(S).
The preceding theorem can be deduced from the next result; see Exercise A.6.2 below. To state this result, we need the notion of a vector lattice of real-valued functions on an arbitrary set S. This is a linear space L that is stable under the operation of taking the pointwise maximum: for f, g ∈ L also f ∨ g ∈ L. The vector lattice L is called a Stone vector lattice if f ∧ 1 ∈ L for all f ∈ L. One example is the space of all bounded measurable functions on a measurable space (S, 𝒮). Another one is the space Cb(S) of all bounded continuous functions on a separable metric space S. In this case, the σ-algebra σ(L) generated by L = Cb(S) coincides with the Borel σ-algebra of S.
Theorem A.52 (Daniell–Stone). Let I be a linear functional on a Stone vector lattice L of functions on S such that the following conditions hold:
(a) I is nonnegative in the sense that f ≥ 0 everywhere on S implies I (f) ≥ 0.
(b) If (fn) is a sequence in L such that fn ↘ 0 pointwise on S, then I (fn) ↘ 0.
Then there exists a unique positive measure μ on the measurable space (S, σ(L)) such that
I(f) = ∫ f dμ  for all f ∈ L.
Proof. See, e.g., Theorem 4.5.2 of [100].
Exercise A.6.2. Use Dini’s lemma, as recalled in Lemma 4.24, to deduce Theorem A.51 from Theorem A.52.
◊
Without the continuity assumption (b) in Theorem A.52, the representation of positive linear functionals on the space of bounded measurable functions takes a different form, as we will discuss now.
Definition A.53. Let (Ω,F) be a measurable space. A mapping μ : F → ℝ is called a finitely additive set function if μ(∅) = 0, and if for any finite collection A1, . . . , An ∈ F of mutually disjoint sets
μ( ⋃_{i=1}^{n} A_i ) = ∑_{i=1}^{n} μ(A_i).
We denote by M1,f := M1,f (Ω,F) the set of all those finitely additive set functions μ : F → [0, 1] which are normalized to μ(Ω) = 1. The total variation of a finitely additive set function μ is defined as
‖μ‖var := sup { ∑_{i=1}^{n} |μ(A_i)| | A_1, . . . , A_n disjoint sets in F, n ∈ ℕ }.
The space of all finitely additive set functions μ whose total variation is finite is denoted by ba(Ω,F).
◊
We will now give a brief outline of the integration theory with respect to a measure μ ∈ ba := ba(Ω,F); for details we refer to Chapter III in [104]. The space X of all bounded measurable functions on (Ω,F) is a Banach space if endowed with the supremum norm,
‖F‖ := sup_{ω∈Ω} |F(ω)|,  F ∈ X.
Let X0 denote the linear subspace of all finitely valued step functions which can be represented in the form
F = ∑_{i=1}^{n} α_i 1_{A_i}
for some n ∈ ℕ, αi ∈ ℝ, and disjoint sets A1, . . . , An ∈ F. For such a step function F we define
∫ F dμ := ∑_{i=1}^{n} α_i μ(A_i),
and one can check that this definition is independent of the particular representation of F. Moreover,
| ∫ F dμ | ≤ ‖F‖ · ‖μ‖var.   (A.33)
Since X0 is dense in X with respect to ‖ · ‖, this inequality allows us to define the integral on the full space X as the extension of the continuous linear functional X0 ∋ F ↦ ∫ F dμ. Clearly, M1,f is contained in ba, and we will denote the integral of a function F ∈ X with respect to Q ∈ M1,f by
E_Q[ F ] := ∫ F dQ.
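On a finite set Ω every additive set function is determined by its point weights, which makes these definitions easy to test. The following sketch (all names hypothetical, weights chosen arbitrarily, and the σ-algebra taken to be the power set) computes the integral of a step function from two different representations and checks the bound |∫ F dμ| ≤ ‖F‖ · ‖μ‖var.

```python
def set_function(weights):
    """Finitely additive set function mu(A) = sum of the point weights in A."""
    return lambda A: sum(weights[w] for w in A)


def total_variation(weights):
    """Total variation on a finite Omega: attained by the partition into points."""
    return sum(abs(v) for v in weights.values())


def integral(step, mu):
    """Integral of F = sum_i alpha_i * 1_{A_i}, given as pairs (alpha_i, A_i)."""
    return sum(alpha * mu(A) for alpha, A in step)


def sup_norm(step):
    """Supremum norm of a step function with disjoint, nonempty sets A_i."""
    return max(abs(alpha) for alpha, _ in step)


weights = {0: 0.5, 1: -0.25, 2: 1.0, 3: -0.75}   # signed point weights
mu = set_function(weights)

# Two representations of the same function F on Omega = {0, 1, 2, 3}.
F_coarse = [(2.0, {0, 1}), (-1.0, {2, 3})]
F_fine = [(2.0, {0}), (2.0, {1}), (-1.0, {2}), (-1.0, {3})]
```

Both representations give the same integral, in line with the independence of the definition from the chosen representation.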
Theorem A.54. The integral
ℓ(F) = ∫ F dμ,  F ∈ X,
defines a one-to-one correspondence between continuous linear functionals ℓ on X and finitely additive set functions μ ∈ ba.
Proof. By definition of the integral and by (A.33), it is clear that any μ ∈ ba defines a continuous linear functional on X. Conversely, if a continuous linear functional ℓ is given, then we can define a finitely additive set function μ on (Ω,F) by
μ(A) := ℓ(1_A),  A ∈ F.
If L ≥ 0 is such that ℓ(F) ≤ L for ‖F‖ ≤ 1, then ‖μ‖var ≤ L, and so μ ∈ ba. One then checks that the integral with respect to μ coincides with ℓ on X0. Since X0 is dense in X, we see that ∫ F dμ and ℓ(F) coincide for all F ∈ X.
Remark A.55. Theorem A.54 yields in particular a one-to-one correspondence between set functions Q ∈ M1,f and continuous linear functionals ℓ on X such that ℓ(1) = 1 and ℓ(F) ≥ 0 for every function F ≥ 0 in X.
◊
Example A.56. Clearly, the set M1,f coincides with the set M1 := M1(Ω,F) of all σ-additive probability measures if (Ω,F) can be reduced to a finite set, in the sense that F is generated by a finite partition of Ω. Otherwise, M1,f is strictly larger than M1. Suppose in fact that there are infinitely many disjoint sets A1, A2, . . . ∈ F, take ωn ∈ An, and define
ℓ_n(F) := (1/n) ∑_{i=1}^{n} F(ω_i),  n = 1, 2, . . . ,  F ∈ X.
The continuous linear functionals ℓ_n on X belong to the unit ball B1 in the dual Banach space Xʹ. By Theorem A.66, there exists a cluster point ℓ of (ℓ_n). For any F ∈ X there is a subsequence (nk) such that ℓ_{n_k}(F) → ℓ(F). This implies that ℓ(F) ≥ 0 for F ≥ 0 and ℓ(1) = 1. Hence, Theorem A.54 allows us to write ℓ(F) = EQ[ F ] for some Q ∈ M1,f. But Q is not σ-additive, since Q[ An ] = ℓ(1_{A_n}) = 0 for every n and
Q[ ⋃_{n} A_n ] = ℓ( 1_{⋃_n A_n} ) = 1.
◊
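The mechanism of the example is visible in a concrete case. The sketch below assumes Ω = {1, 2, . . . }, A_n = {n}, ω_n = n, and the averaging functionals ℓ_n(F) = (1/n) ∑_{i≤n} F(ω_i). Only the functionals ℓ_n are computed; the cluster point ℓ exists by a compactness argument and is not constructed here. All names are hypothetical.

```python
from fractions import Fraction


def ell(n, F):
    """Averaging functional ell_n(F) = (1/n) * sum_{i=1}^n F(omega_i), omega_i = i."""
    return sum(Fraction(F(i)) for i in range(1, n + 1)) / n


def indicator_of_point(m):
    """Indicator function of the set A_m = {m}."""
    return lambda omega: 1 if omega == m else 0


def indicator_of_union(omega):
    """Indicator of the union of all A_m, which is the whole of Omega here."""
    return 1


for n in (1, 10, 100):
    print(n, ell(n, indicator_of_point(3)), ell(n, indicator_of_union))
```

For each fixed m the value of ℓ_n on the indicator of A_m equals 1/n once n ≥ m and therefore tends to 0, whereas the value on the indicator of the union is 1 for all n. Any cluster point thus assigns mass 0 to each A_m and mass 1 to their union.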