A.3 Quantile functions

Suppose that F : (a, b) → ℝ is an increasing function which is not necessarily strictly increasing. Let

c := lim_{x↓a} F(x) and d := lim_{x↑b} F(x).

Definition A.18. A function q : (c, d) → (a, b) is called an inverse function for F if

F(q(s)−) ≤ s ≤ F(q(s)+) for all s ∈ (c, d). (A.13)

The functions

q⁻(s) := sup{x ∈ (a, b) | F(x) < s} and q⁺(s) := inf{x ∈ (a, b) | F(x) > s}

are called the left- and right-continuous inverse functions.
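For orientation, here is a small numerical sketch (the example and all names are ours, not part of the text). For a purely atomic F, the two inverses can be computed by scanning cumulative weights, using the equivalent representations q⁻(s) = inf{x | F(x) ≥ s} (see Remark A.20) and q⁺(s) = inf{x | F(x) > s}:

```python
# Sketch: left- and right-continuous inverses of the distribution function of
# a fair coin, F(x) = 0 for x < 0, 1/2 for 0 <= x < 1, and 1 for x >= 1.
# Atoms are given as positions xs (sorted) with weights ps.

def q_minus(s, xs, ps):
    # q^-(s) = inf{x | F(x) >= s}: first atom where the cumulative weight reaches s
    cum = 0.0
    for x, p in zip(xs, ps):
        cum += p
        if cum >= s:
            return x
    return xs[-1]

def q_plus(s, xs, ps):
    # q^+(s) = inf{x | F(x) > s}: first atom where the cumulative weight exceeds s
    cum = 0.0
    for x, p in zip(xs, ps):
        cum += p
        if cum > s:
            return x
    return xs[-1]

xs, ps = [0.0, 1.0], [0.5, 0.5]
print(q_minus(0.5, xs, ps), q_plus(0.5, xs, ps))  # 0.0 1.0: the two differ at s = F(0)
print(q_minus(0.3, xs, ps), q_plus(0.3, xs, ps))  # 0.0 0.0: they agree away from such levels
```

The level s = 1/2, where the two inverses differ, is exactly a value taken by F, in line with the a.e. uniqueness stated in Lemma A.19.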

The following lemma explains the reason for calling q⁻ and q⁺ the left- and right-continuous inverse functions of F.

Lemma A.19. A function q : (c, d) → (a, b) is an inverse function for F if and only if

q⁻(s) ≤ q(s) ≤ q⁺(s) for all s ∈ (c, d).

In particular, q⁻ and q⁺ are inverse functions. Moreover, q⁻ is left-continuous, q⁺ is right-continuous, and every inverse function q is increasing and satisfies q(s−) = q⁻(s) and q(s+) = q⁺(s) for all s ∈ (c, d). In particular, any two inverse functions coincide a.e. on (c, d).

Proof. We have q⁻ ≤ q⁺, and any inverse function q satisfies q⁻ ≤ q ≤ q⁺, due to the definitions of q⁻ and q⁺. Hence, the first part of the assertion follows if we can show that F(q⁺(s)−) ≤ s ≤ F(q⁻(s)+) for all s. But x < q⁺(s) implies F(x) ≤ s, and y > q⁻(s) implies F(y) ≥ s, which gives the result.

Next, it is clear that both q⁻ and q⁺ are increasing. Moreover, the set {x | F(x) > s} is the union of the sets {x | F(x) > s + ε} for ε > 0, and so q⁺ is right-continuous. An analogous argument shows the left-continuity of q⁻.

Remark A.20. The left- and right-continuous inverse functions can also be represented as

q⁻(s) = inf{x ∈ (a, b) | F(x) ≥ s} and q⁺(s) = sup{x ∈ (a, b) | F(x) ≤ s}.

To see this, note first that q⁻(s) is clearly dominated by the infimum on the right. On the other hand, y > q⁻(s) implies F(y) ≥ s, and we get q⁻(s) ≥ inf{x | F(x) ≥ s}. The proof for q⁺ is analogous.

Lemma A.21. Let q be an inverse function for F. Then F is an inverse function for q. In particular,

q(F(x)−) ≤ x ≤ q(F(x)+) for all x ∈ (a, b). (A.14)
Proof. If s > F(x), then q(s) ≥ q⁻(s) ≥ x, and hence q(F(x)+) ≥ x. Conversely, s < F(x) implies q(s) ≤ q⁺(s) ≤ x, and thus q(F(x)−) ≤ x. This proves that F is an inverse function for q.

Remark A.22. By defining q(d) := b we can extend (A.14) to

q(F(x)−) ≤ x ≤ q(F(x)+) for all x ∈ (a, b),

where now also the case F(x) = d is covered.
From now on we will assume that

a = −∞ and b = +∞,

and that F is normalized in the sense that c = 0 and d = 1. This assumption always holds if F is the distribution function of a random variable X on some probability space (Ω, F, P), i.e., F is given by F(x) = P[ X ≤ x ]. The following lemma shows in particular that the converse is also true: any normalized increasing right-continuous function F : ℝ → [0, 1] is the distribution function of some random variable. By considering the laws of random variables, we also obtain the one-to-one correspondence F(x) = μ((−∞, x]) between all Borel probability measures μ on ℝ and all normalized increasing right-continuous functions F : ℝ → [0, 1].

Lemma A.23. Let U be a random variable on a probability space (Ω, F, P) with a uniform distribution on (0, 1), i.e., P[ U ≤ s ] = s for all s ∈ (0, 1). If q is an inverse function of a normalized increasing right-continuous function F : ℝ → [0, 1], then

X(ω) := q(U(ω))

has the distribution function F.

Proof. First note that any inverse function for F is measurable because it is increasing. Since q(F(x)−) ≤ x by Lemma A.21, we have q(s) ≤ x for s < F(x). Moreover, the monotonicity of F and (A.13) yield that q(s) ≤ x implies F(x) ≥ F(q(s)) = F(q(s)+) ≥ s. It follows that

(0, F(x)) ⊆ {s ∈ (0, 1) | q(s) ≤ x} ⊆ (0, F(x)].

Hence,

P[ U ∈ {s | q(s) ≤ x} ] = F(x).

The assertion now follows from the identity P[ U ∈ {s | q(s) ≤ x} ] = P[ X ≤ x ].
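Lemma A.23 is the basis of inverse transform sampling. The following simulation sketch is ours (the exponential law with rate 1, whose inverse is q(s) = −ln(1 − s), is an arbitrary choice of example):

```python
# Sketch of Lemma A.23: X = q(U) with U uniform on (0, 1) has distribution F.
import math
import random

random.seed(0)
N = 200_000
q = lambda s: -math.log(1.0 - s)  # inverse function of F(x) = 1 - exp(-x)

samples = [q(random.random()) for _ in range(N)]

# The empirical distribution function of X = q(U) at x = 1 should be close
# to F(1) = 1 - e^{-1} ~ 0.632.
x = 1.0
emp = sum(1 for v in samples if v <= x) / N
print(abs(emp - (1.0 - math.exp(-1.0))) < 0.01)  # True
```

The same recipe works for any distribution function once some inverse function for it is available, which is exactly the point of the lemma.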

Definition A.24. An inverse function q : (0, 1) → ℝ of a distribution function F is called a quantile function. That is, q is a function with

F(q(s)−) ≤ s ≤ F(q(s)) for all s ∈ (0, 1).

The left- and right-continuous inverses,

q⁻(s) = inf{x | F(x) ≥ s} and q⁺(s) = sup{x | F(x) ≤ s},

are called the lower and upper quantile functions.

We will often use the generic notation F_X for the distribution function of a random variable X. When the emphasis is on the law μ of X, we will also write F_μ. In the same manner, we will write q_X or q_μ for the corresponding quantile functions. The value q_X(λ) of a quantile function at a given level λ ∈ (0, 1) is often called a λ-quantile of X.

Exercise A.3.1. Compute the upper and lower quantile functions for the following distribution functions.

(a) The distribution function of a Dirac point mass: F(x) = 1_{{x ≥ x₀}} for some x₀ ∈ ℝ.

(b) The exponential distribution function: F(x) = 0 for x < 0 and F(x) = ∫₀ˣ λ e^{−λy} dy for x ≥ 0, where λ > 0.

(c) The distribution function of a Gumbel distribution: F(x) = exp(−e^{−(x−μ)/β}), where μ ∈ ℝ and β > 0.
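The following sketch is not a solution to the exercise, but a numerical cross-check for parts (b) and (c): these distribution functions are continuous and strictly increasing on their supports, so the lower and upper quantile functions coincide and satisfy F(q(t)) = t. The parameter values are arbitrary choices of ours.

```python
# Sanity check: F(q(t)) = t for the exponential and Gumbel distributions.
import math

lam, mu, beta = 2.0, 1.0, 3.0  # sample parameters (our choice)

F_exp = lambda x: 1.0 - math.exp(-lam * x) if x >= 0 else 0.0
q_exp = lambda t: -math.log(1.0 - t) / lam          # candidate quantile function
F_gum = lambda x: math.exp(-math.exp(-(x - mu) / beta))
q_gum = lambda t: mu - beta * math.log(-math.log(t))  # candidate quantile function

for t in (0.1, 0.5, 0.9):
    assert abs(F_exp(q_exp(t)) - t) < 1e-12
    assert abs(F_gum(q_gum(t)) - t) < 1e-12
print("ok")
```

Part (a) is the genuinely discontinuous case, where the lower and upper quantile functions are constant but the defining inequalities of Definition A.24 still have to be checked by hand.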

The following result complements Lemma A.23. It implies that a probability space supports a random variable with uniform distribution on (0, 1) if and only if it supports some nonconstant random variable X with a continuous distribution.

Lemma A.25. Let X be a random variable with a continuous distribution function F_X and with quantile function q_X. Then U := F_X(X) is uniformly distributed on (0, 1), and X = q_X(U) P-almost surely.

Proof. Let (Ω̃, F̃, P̃) be any probability space that supports a random variable Ũ with a uniform distribution on (0, 1). Then X̃ := q_X(Ũ) has the same distribution as X due to Lemma A.23. Hence, F_X(X) and F_X(X̃) also have the same distribution. But if F_X is continuous, then F_X(q_X(s)) = s, and thus F_X(X̃) = Ũ. This shows that F_X(X) is uniformly distributed.

To show that X = q_X(U) P-a.s., note first that q_X(F_X(x)−) ≤ x for all x by Lemma A.21. Since U is uniformly distributed and the increasing function q_X has at most countably many discontinuities, it follows that q_X(U) = q_X(U−) ≤ X P-almost surely. Now let f : ℝ → (0, 1) be any strictly increasing function. Since q_X(U) and X have the same law, we have E[ f(q_X(U)) ] = E[ f(X) ] and get

E[ f(X) − f(q_X(U)) ] = 0.

As f(X) ≥ f(q_X(U)) P-a.s. and f is strictly increasing, this implies X = q_X(U) P-almost surely.

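The identity X = q_X(F_X(X)) can be seen very concretely when F_X is continuous and strictly increasing; the standard exponential distribution below is our choice of example:

```python
# Sketch of Lemma A.25's second assertion: q_X(F_X(x)) = x pointwise when
# F_X is continuous and strictly increasing (here the standard exponential).
import math

F = lambda x: 1.0 - math.exp(-x)   # distribution function, x >= 0
q = lambda s: -math.log(1.0 - s)   # its (lower = upper) quantile function

for x in (0.25, 1.0, 4.0):
    assert abs(q(F(x)) - x) < 1e-9
print("ok")
```

When F_X has flat pieces or jumps this pointwise identity can fail on a null set or require the modifications discussed next, which is why the lemma assumes continuity.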
There are several ways in which the preceding lemma can be generalized to the case of a discontinuous distribution function F_X. A first possibility is provided in the following exercise; it requires the existence of an independent random variable with a uniform distribution on (0, 1). A second possibility will be given in Lemma A.32, where we will only assume the existence of some uniformly distributed random variable, not its independence of X.

Exercise A.3.2. Let X be a random variable with distribution function F_X. The modified distribution function of X is defined by

F̃(x, λ) := P[ X < x ] + λ P[ X = x ], x ∈ ℝ, λ ∈ [0, 1].

Suppose that Ũ is a random variable that is independent of X and uniformly distributed on (0, 1). Show that

U := F̃(X, Ũ)

is uniformly distributed on (0, 1) and that

X = q_X(U) P-a.s.
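A simulation sketch of the exercise's construction (our code, not a solution): we take X to be a fair coin, precisely a case in which the unmodified transform F_X(X) is clearly not uniform, and check that the randomized transform looks uniform.

```python
# Randomized probability integral transform for a fair coin X in {0, 1}.
import random

random.seed(1)
N = 100_000

def modified_U():
    x = random.randint(0, 1)   # P[X = 0] = P[X = 1] = 1/2
    u = random.random()        # independent uniform, playing the role of U-tilde
    # F~(x, u) = P[X < x] + u * P[X = x]
    return (0.0 if x == 0 else 0.5) + 0.5 * u

us = [modified_U() for _ in range(N)]
frac = sum(1 for v in us if v <= 0.25) / N
print(abs(frac - 0.25) < 0.01)  # True: U behaves like a uniform on (0, 1)
```

Intuitively, the extra randomization spreads the mass of each atom of X uniformly over the corresponding jump of F_X.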

The following lemma uses the concept of the Fenchel–Legendre transform of a convex function as introduced in Definition A.8.

Lemma A.26. Let X be a random variable with distribution function F_X and quantile function q_X such that E[ |X| ] < ∞. Then the Fenchel–Legendre transform of the convex function

Ψ(x) := E[ (x − X)⁺ ]

is given by

Ψ*(y) = sup_{x∈ℝ} (xy − Ψ(x)) = ∫₀ʸ q_X(t) dt for 0 ≤ y ≤ 1,

and Ψ*(y) = +∞ for y ∉ [0, 1]. Moreover, for 0 < y < 1, the supremum above is attained in x if and only if x is a y-quantile of X.

Proof. Note first that, by Fubini's theorem and Lemma A.23,

Ψ(x) = E[ (x − X)⁺ ] = ∫₀¹ (x − q_X(t))⁺ dt. (A.15)

It follows that Ψ*(y) = +∞ for y < 0, that Ψ*(0) = −inf_x Ψ(x) = 0, and that Ψ*(y) = +∞ for y > 1. To prove our formula for 0 < y < 1, note that the right-hand and left-hand derivatives of the concave function f(x) := xy − Ψ(x) are given by f′(x+) = y − F_X(x) and f′(x−) = y − F_X(x−). A point x is a maximizer of f if f′(x+) ≤ 0 and f′(x−) ≥ 0, which is equivalent to F_X(x−) ≤ y ≤ F_X(x), i.e., to x being a y-quantile of X. Taking x = q_X(y) and using (A.15) gives

Ψ*(y) = y q_X(y) − ∫₀¹ (q_X(y) − q_X(t))⁺ dt = ∫₀ʸ q_X(t) dt,

and our formula follows.
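A brute-force numerical check of the lemma for a fair coin X with values 0 and 1 (our toy example): here q_X = 0 on (0, 1/2) and q_X = 1 on (1/2, 1), so the integral of q_X over (0, y) equals 0 for y ≤ 1/2 and y − 1/2 for y > 1/2.

```python
# Psi(x) = E[(x - X)^+] for a fair coin X in {0, 1}.
Psi = lambda x: 0.5 * max(x, 0.0) + 0.5 * max(x - 1.0, 0.0)

def legendre(y, grid):
    # brute-force approximation of sup_x (x*y - Psi(x)) over a finite grid
    return max(x * y - Psi(x) for x in grid)

grid = [i / 1000.0 for i in range(-2000, 3001)]  # search over [-2, 3]
# Expected values: integral of q_X over (0, y).
for y, expected in ((0.25, 0.0), (0.75, 0.25)):
    assert abs(legendre(y, grid) - expected) < 1e-9
print("ok")
```

One can also observe the attainment statement here: for y = 0.75 the grid maximum occurs at x = 1, which is indeed a 0.75-quantile of the coin.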

Lemma A.27. If X = f(Y) for an increasing function f, and q_Y is a quantile function for Y, then f(q_Y(t)) is a quantile function for X. In particular,

q_X(t) = f(q_Y(t)) for a.e. t ∈ (0, 1)

for any quantile function q_X of X.

If f is decreasing, then f(q_Y(1 − t)) is a quantile function for X. In particular,

q_X(t) = f(q_Y(1 − t)) for a.e. t ∈ (0, 1).

Proof. If f is decreasing, then q(t) := f(q_Y(1 − t)) satisfies

F_X(q(t)−) = P[ f(Y) < f(q_Y(1 − t)) ] ≤ P[ Y > q_Y(1 − t) ] ≤ t ≤ P[ Y ≥ q_Y(1 − t) ] ≤ P[ f(Y) ≤ f(q_Y(1 − t)) ] = F_X(q(t)),

since F_Y(q_Y(1 − t)−) ≤ 1 − t ≤ F_Y(q_Y(1 − t)) by definition. Hence q(t) = f(q_Y(1 − t)) is a quantile function. A similar argument applies to an increasing function f.

Exercise A.3.3. Let X, Y, and Z be random variables such that X = f(Z) and Y = g(Z) for two increasing functions f and g. Show that if q_X, q_Y, and q_{X+Y} are quantile functions for X, Y, and X + Y, then

q_{X+Y}(t) = q_X(t) + q_Y(t) for a.e. t ∈ (0, 1).
Note: Two random variables X and Y in the form considered in this exercise are called comonotone. See Section 4.7 for more background on the notion of comonotonicity.
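A small numerical illustration of the exercise (our sketch): we take Z uniform on (0, 1), so that q_Z(t) = t, and two increasing functions of our choosing; by Lemma A.27, t ↦ f(t) + g(t) is then a quantile function of X + Y, which we compare against empirical quantiles.

```python
# Quantile additivity for the comonotone pair X = Z**2, Y = exp(Z), Z uniform.
import math

f = lambda z: z * z        # increasing on (0, 1)
g = lambda z: math.exp(z)  # increasing
h = lambda z: f(z) + g(z)  # increasing, hence h(t) is a quantile function of X + Y

N = 10_000
# "Samples" of X + Y from Z on a uniform grid; h increasing => already sorted.
samples = sorted(h((i + 0.5) / N) for i in range(N))
for t in (0.2, 0.5, 0.8):
    k = int(t * N)
    assert abs(samples[k] - (f(t) + g(t))) < 1e-3   # empirical t-quantile ~ f(t) + g(t)
print("ok")
```

For a non-comonotone pair (say Y = g(1 − Z) with g increasing) the additivity generally fails, which is why comonotonicity is the operative assumption.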

The following theorem is a version of the Hardy–Littlewood inequalities. They estimate the expectation E[ XY ] in terms of the quantile functions q_X and q_Y.

Theorem A.28. Let X and Y be two random variables on (Ω, F, P) with quantile functions q_X and q_Y. Then

∫₀¹ q_X(1 − s) q_Y(s) ds ≤ E[ XY ] ≤ ∫₀¹ q_X(s) q_Y(s) ds,
provided that all integrals are well defined. If X = f (Y) and the lower (upper) bound is finite, then the lower (upper) bound is attained if and only if f can be chosen as a decreasing (increasing) function.

Proof. We first prove the result for X, Y ≥ 0. By Fubini's theorem,

E[ XY ] = ∫₀^∞ ∫₀^∞ P[ X > x, Y > y ] dx dy.

Since

P[ X > x, Y > y ] ≥ (P[ X > x ] + P[ Y > y ] − 1)⁺

and since

P[ Z > z ] = ∫₀¹ 1_{{q_Z(t) > z}} dt

for any random variable Z ≥ 0 and z ≥ 0, another application of Fubini's theorem yields

E[ XY ] ≥ ∫₀¹ q_X(1 − t) q_Y(t) dt.

In the same way, the upper estimate follows from the inequality

P[ X > x, Y > y ] ≤ P[ X > x ] ∧ P[ Y > y ].
For X = f(Y),

E[ XY ] = E[ f(Y)Y ] = ∫₀¹ f(q_Y(t)) q_Y(t) dt (A.16)

due to Lemma A.23, and so Lemma A.27 implies that the upper and lower bounds are attained for increasing and decreasing functions f, respectively.

Conversely, assume that X = f(Y), and that the upper bound is attained and finite:

E[ XY ] = ∫₀¹ q_X(t) q_Y(t) dt < ∞. (A.17)
Our aim is to show that

X = f̂(Y) P-a.s., (A.18)
where f̂ is an increasing function on [0, ∞) such that

f̂(x) = f(x)
if 0 < F_Y(x) < 1 and x is a continuity point of F_Y, and

if F_Y(x−) < F_Y(x). It is shown in Exercise 3.4.1 that

where E_λ[ · | q_Y ] denotes the conditional expectation with respect to q_Y under the Lebesgue measure λ on (0, 1). Therefore, (A.17) implies that

where we have used Lemma A.23 in the first identity. After these preparations, we can proceed to proving (A.18). Let ν denote the distribution of Y. By introducing the positive measures dμ := f dν and dμ̂ := f̂ dν, (A.20) can be written as

On the other hand, with g denoting the increasing function 1_{[y,∞)}, the upper Hardy–Littlewood inequality, Lemma A.27, and (A.19) yield

In view of (A.21), we obtain μ = μ̂, hence f = f̂ ν-a.s., and X = f̂(Y) P-almost surely. An analogous argument applies to the lower bound, and the proof for X, Y ≥ 0 is concluded.

The result for general X and Y is reduced to the case of nonnegative random variables by separately considering the positive and negative parts of X and Y:

E[ XY ] = E[ X⁺Y⁺ ] + E[ X⁻Y⁻ ] − E[ X⁺Y⁻ ] − E[ X⁻Y⁺ ]
 ≤ ∫₀¹ q_{X⁺}(s) q_{Y⁺}(s) ds + ∫₀¹ q_{X⁻}(s) q_{Y⁻}(s) ds − ∫₀¹ q_{X⁺}(1 − s) q_{Y⁻}(s) ds − ∫₀¹ q_{X⁻}(1 − s) q_{Y⁺}(s) ds, (A.22)

where we have used the upper Hardy–Littlewood inequality on the positive terms and the lower one on the negative terms. Since q_{Z⁺}(t) = (q_Z(t))⁺ and q_{Z⁻}(t) = (q_Z(1 − t))⁻ for all random variables Z due to Lemma A.27, one checks that the right-hand side of (A.22) is equal to ∫₀¹ q_X(s) q_Y(s) ds, and we obtain the general form of the upper Hardy–Littlewood inequality. The same argument also works for the lower one.

Now suppose that X = f(Y). We first note that (A.16) still holds, and so Lemma A.27 implies that the upper and lower bounds are attained for increasing and decreasing functions, respectively. Conversely, let us assume that the upper Hardy–Littlewood inequality is an identity. Then all four inequalities used in (A.22) must also be equalities. Using the facts that XY⁺ = f(Y⁺)Y⁺ and XY⁻ = f(−Y⁻)Y⁻, the assertion is reduced to the case of nonnegative random variables, and one checks that f can be chosen as an increasing function. The same argument applies if the lower Hardy–Littlewood inequality is attained.

Remark A.29. For indicator functions of two sets A and B in F, the Hardy–Littlewood inequalities reduce to the elementary inequalities

(P[ A ] + P[ B ] − 1)⁺ ≤ P[ A ∩ B ] ≤ P[ A ] ∧ P[ B ]; (A.23)

note that these estimates were used in the preceding proof. Applied to the sets {X ≤ x} and {Y ≤ y}, where X and Y are random variables with distribution functions F_X and F_Y and joint distribution function F_{X,Y} defined by F_{X,Y}(x, y) = P[ X ≤ x, Y ≤ y ], they take the form

(F_X(x) + F_Y(y) − 1)⁺ ≤ F_{X,Y}(x, y) ≤ F_X(x) ∧ F_Y(y). (A.24)

The estimates (A.23) and (A.24) are often called Fréchet bounds. The Hardy–Littlewood inequalities, which are also called Hoeffding–Fréchet bounds, provide their natural extension from sets to random variables.
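The attainment statement of Theorem A.28 can also be watched numerically. In the sketch below (our example), Z is uniform on (0, 1), X = Z, and Y = 1 − Z, an anticomonotone pair; the bounds ∫₀¹ (1 − s)s ds = 1/6 and ∫₀¹ s² ds = 1/3 are computed by hand.

```python
# Monte Carlo check of the Hardy-Littlewood bounds for X = Z, Y = 1 - Z.
import random

random.seed(2)
N = 200_000
zs = [random.random() for _ in range(N)]
exy = sum(z * (1.0 - z) for z in zs) / N   # estimate of E[XY]

lower, upper = 1.0 / 6.0, 1.0 / 3.0        # exact values of the two bounds
print(abs(exy - lower) < 0.005)            # True: the lower bound is attained
print(exy <= upper)                        # True
```

Replacing Y by Z itself (a comonotone pair) makes E[XY] = E[Z²] = 1/3 and attains the upper bound instead, as predicted by the equality cases of the theorem.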

Exercise A.3.4. Let X and Y be two random variables on (Ω, F, P) for which all terms appearing in the Hardy–Littlewood inequalities make sense. Suppose moreover that there exist a third random variable Z and functions f and g such that X = f(Z) and Y = g(Z).

(a) Show that the upper Hardy–Littlewood inequality reduces to an equality if X and Y are comonotone in the sense that f and g are both increasing or both decreasing.

(b) Show that the lower Hardy–Littlewood inequality reduces to an equality if X and Y are anticomonotone in the sense that one of the functions f and g is increasing and the other one is decreasing.

Note: See Section 4.7 for more background on the notion of comonotonicity.

Exercise A.3.5. This exercise complements the Fréchet bounds (A.23) and (A.24).

(a) Derive bounds similar to (A.23) for the probability P[ A ∪ B ] of the union of two events A, B ∈ F.

(b) Show that the bounds in (A.23) admit the following extension to the case of n events:

(Σᵢ₌₁ⁿ P[ Aᵢ ] − (n − 1))⁺ ≤ P[ A₁ ∩ · · · ∩ Aₙ ] ≤ minᵢ P[ Aᵢ ]

for A₁, . . . , Aₙ ∈ F. Then derive an extension of (A.24) to the case of n random variables X₁, . . . , Xₙ.
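A quick sanity check of the n-event bounds in part (b) (our code, not a proof): we take events given by random subintervals of (0, 1) under the Lebesgue measure, where intersections can be computed exactly.

```python
# Check max(sum P[A_i] - (n-1), 0) <= P[A_1 n ... n A_n] <= min_i P[A_i]
# for random intervals A_i = (l_i, r_i) in (0, 1) under Lebesgue measure.
import random

random.seed(3)
for _ in range(100):
    ivs = [tuple(sorted((random.random(), random.random()))) for _ in range(4)]
    probs = [r - l for l, r in ivs]
    # the intersection of intervals is again an interval: (max l_i, min r_i)
    inter = max(0.0, min(r for _, r in ivs) - max(l for l, _ in ivs))
    assert max(sum(probs) - (len(ivs) - 1), 0.0) <= inter + 1e-12
    assert inter <= min(probs) + 1e-12
print("ok")
```

The small epsilons guard against floating-point rounding only; the inequalities themselves hold exactly.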

Definition A.30. A probability space (Ω, F, P) is called atomless if it contains no atoms. That is, there is no set A ∈ F such that P[ A ] > 0 and such that P[ B ] = 0 or P[ B ] = P[ A ] whenever B ∈ F is a subset of A.

Proposition A.31. For any probability space, the following conditions are equivalent.

(a) (Ω,F, P) is atomless.

(b) There exists an i.i.d. sequence X1, X2, . . . of random variables with symmetric Bernoulli distribution:

P[ X1 = 1 ] = P[ X1 = 0 ] = 1/2.

(c) For any μ ∈ M₁(ℝ) there exist i.i.d. random variables Y1, Y2, . . . with common distribution μ.

(d) (Ω,F, P) supports a random variable with a continuous distribution.

Proof. (a)⇒(b): We need the following intuitive fact from measure theory: if (Ω, F, P) is atomless, then for every A ∈ F and all δ with 0 ≤ δ ≤ P[ A ] there exists a measurable set B ⊆ A such that P[ B ] = δ; see Theorem 9.51 of [3] for a proof. Thus, we may take a set A ∈ F such that P[ A ] = 1/2 and define X1 := 1 on A and X1 := 0 on Aᶜ. Now suppose that X1, . . . , Xn have already been constructed. Then

P[ X1 = x1, . . . , Xn = xn ] = 2⁻ⁿ

for all x1, . . . , xn ∈ {0, 1}, and this property is equivalent to X1, . . . , Xn being independent with the desired symmetric Bernoulli distribution. For all x1, . . . , xn ∈ {0, 1} we may choose a set

B ⊆ {X1 = x1, . . . , Xn = xn}

such that P[ B ] = 2⁻⁽ⁿ⁺¹⁾ and define Xn+1 := 1 on B and Xn+1 := 0 on Bᶜ ∩ {X1 = x1, . . . , Xn = xn}. Clearly, the collection X1, . . . , Xn+1 is again i.i.d. with a symmetric Bernoulli distribution.

(b)⇒(c): By relabeling the sequence X1, X2, . . . , we may obtain a double-indexed sequence (X_{i,j})_{i,j∈ℕ} of independent Bernoulli-distributed random variables. If we let

U_i := Σ_{j=1}^∞ X_{i,j} 2⁻ʲ,

then it is straightforward to check that each U_i has a uniform distribution on (0, 1). Let q be a quantile function for μ. Lemma A.23 shows that the i.i.d. sequence Y_i := q(U_i), i = 1, 2, . . . , has common distribution μ.

The proofs of the implications (c)⇒(d) and (d)⇒(a) are straightforward.
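The digit construction used in step (b)⇒(c) can be sketched as follows (our code; a truncation to 32 binary digits stands in for the infinite series):

```python
# Building an (approximately) uniform variable from independent fair coin flips.
import random

random.seed(4)

def uniform_from_coins(nbits=32):
    # U = sum_j X_j * 2^{-j} with independent symmetric Bernoulli digits X_j
    return sum(random.randint(0, 1) * 2.0 ** -(j + 1) for j in range(nbits))

N = 50_000
us = [uniform_from_coins() for _ in range(N)]
frac = sum(1 for u in us if u <= 0.3) / N
print(abs(frac - 0.3) < 0.02)  # True: the empirical law looks uniform on (0, 1)
```

Composing such a U with a quantile function q, as in Lemma A.23, then produces a sample with the prescribed law μ, which is exactly the argument of the proposition.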

Lemma A.32. If X is a random variable on an atomless probability space, then there exists a random variable U with uniform distribution on (0, 1) such that X = q_X(U) P-a.s. for any quantile function q_X of X.

Proof. Let us write q := q_X and denote by λ the Lebesgue measure on (0, 1). For each x ∈ ℝ, I_x := {t ∈ (0, 1) | q(t) = x} is a (possibly empty or degenerate) real interval, which has Lebesgue measure λ(I_x) = P[ X = x ] by Lemma A.23. Consider the set D := {x ∈ ℝ | P[ X = x ] > 0}, which is at most countable. For each x ∈ D, the probability space (Ω, F, P[ · | X = x ]) is again atomless and hence supports a random variable U_x : Ω → I_x with a uniform law on I_x. That is, P[ U_x ∈ A | X = x ] = λ(A ∩ I_x)/λ(I_x) or, equivalently,

P[ U_x ∈ A, X = x ] = λ(A ∩ I_x) for all measurable A ⊆ (0, 1). (A.25)

Let F := F_X be the distribution function of X and define the random variable

U := F(X) 1_{{X ∉ D}} + Σ_{x∈D} U_x 1_{{X = x}}.

Since q(U_x(ω)) = x for any x ∈ D, we have q(U(ω)) = X(ω) for all ω ∈ {X ∈ D}. Let us show next that q(F(X)) = X P-a.s. on {X ∉ D}, which will then yield that q(U) = X P-a.s. as desired. To this end, let Δ := {t ∈ (0, 1) | q(t+) > q(t−)} denote the set of discontinuities of q. If F(x) ∉ Δ, then q(F(x)+) = q(F(x)−), while Lemma A.21 implies that q(F(x)−) ≤ x ≤ q(F(x)+). Combining these two facts gives

q(F(x)) = x whenever F(x) ∉ Δ. (A.26)

Thus, we will have q(F(X)) = X P-a.s. on {X ∉ D} if we can prove that

P[ F(X) ∈ Δ, X ∉ D ] = 0. (A.27)

To prove (A.27), fix t ∈ Δ. Then Remark A.20 and Lemma A.19 yield that the set J_t := {x | F(x) = t} is a real interval with endpoints q(t−) and q(t+). On the one hand, by the right-continuity of F,

P[ X ∈ (q(t−), q(t+)) ] = F(q(t+)−) − F(q(t−)) = t − t = 0.

On the other hand, P[ X ∈ {q(t−), q(t+)}, X ∉ D ] = 0. Thus, P[ X ∈ J_t, X ∉ D ] = 0 for each t ∈ Δ. Since Δ is countable, our claim (A.27) follows.

It remains to show that U has a uniform law. To this end, take a measurable subset A of (0, 1). Using (A.25) we get

P[ U ∈ A, X ∈ D ] = Σ_{x∈D} P[ U_x ∈ A, X = x ] = Σ_{x∈D} λ(A ∩ I_x) = λ(A ∩ I),

where I := ∪_{x∈D} I_x. Thus, the proof is completed by showing that P[ F(X) ∈ A, X ∉ D ] = λ(A ∩ Iᶜ), where Iᶜ := (0, 1) \ I. To this end, recall first that λ ∘ q⁻¹ = P ∘ X⁻¹ by Lemma A.23. Now, by (A.27),

P[ F(X) ∈ A, X ∉ D ] = P[ F(X) ∈ A, F(X) ∉ Δ, X ∉ D ] = λ({F(q) ∈ A, F(q) ∉ Δ} ∩ Iᶜ), (A.28)

as {t | q(t) ∉ D} = Iᶜ. We now claim that

{F(q) ∈ A, F(q) ∉ Δ} ∩ Iᶜ = A ∩ {F(q) ∉ Δ} ∩ Iᶜ. (A.29)

To prove this claim, note first that {F(q) ∈ A} ⊆ {q(F(q)) ∈ q(A)} and that q(F(q)) = q on {F(q) ∉ Δ} by (A.26). Therefore, the set on the left-hand side of (A.29) is contained in {q ∈ q(A)} ∩ {F(q) ∉ Δ} ∩ Iᶜ. Next, suppose that q(t) ∈ q(A) for some t ∈ {F(q) ∉ Δ} ∩ Iᶜ. Then q(t) = q(a) for some a ∈ A. But since t ∈ Iᶜ, we have q(t) ∉ D and therefore λ(I_{q(t)}) = 0. Hence, there is no other s ∈ (0, 1) with q(s) = q(t), which implies t = a and in turn t ∈ A. This gives the inclusion "⊆" in (A.29). Conversely, suppose that t belongs to the set on the right-hand side of (A.29). Since t ∈ Iᶜ, we have q(t) ∉ D and hence F(q(t)−) = F(q(t)). Combining the latter fact with F(q(t)−) ≤ t ≤ F(q(t)) gives F(q(t)) = t ∈ A, and in turn that t belongs to the left-hand side, which completes the proof of (A.29). Plugging (A.29) into (A.28) gives

P[ F(X) ∈ A, X ∉ D ] = λ(A ∩ {F(q) ∉ Δ} ∩ Iᶜ).

Applying Lemma A.23 to (A.27) yields

λ({F(q) ∈ Δ} ∩ Iᶜ) = 0.

Therefore, λ(A ∩ {F(q) ∉ Δ} ∩ Iᶜ) = λ(A ∩ Iᶜ), and the proof is complete.
