Conditional Probability
A few facts concerning conditional probability needed in Chapter 3 are reviewed here. For further details, consult Chapters 6 and 7 of the book by Ross [99].
Let A and B denote events in a sample space S. The conditional probability of A given B, written prob(A|B), is defined by
$$\operatorname{prob}(A \mid B) = \frac{\operatorname{prob}(AB)}{\operatorname{prob}(B)}, \tag{D.1}$$
where AB is the joint event of A and B. When prob(A|B) = prob(A), we say that A and B are independent.
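The definition of conditional probability and the independence test can be checked on a small finite sample space. The following is a minimal sketch, assuming a fair six-sided die with equally likely outcomes; the events A, B, and C and the helper names `prob` and `cond_prob` are illustrative choices, not taken from the text.

```python
from fractions import Fraction

# Sample space S: outcomes of one roll of a fair die (an assumed example).
S = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event & S), len(S))

def cond_prob(a, b):
    """prob(A|B) = prob(AB) / prob(B); requires prob(B) > 0."""
    return prob(a & b) / prob(b)

A = {2, 4, 6}        # the roll is even
B = {4, 5, 6}        # the roll exceeds 3
C = {1, 2, 3, 4}     # the roll is at most 4

print(cond_prob(A, B))   # 2/3, which differs from prob(A) = 1/2
print(cond_prob(A, C))   # 1/2, which equals prob(A)
```

In this sketch A and B are dependent, because knowing that the roll exceeds 3 changes the chance that it is even, whereas A and C are independent.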
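As an example, suppose that X and Y are discrete random variables taking on non-negative integer values. If A is the event “X = k” and B is “Y = i,” then (D.1) becomes
$$\operatorname{prob}(X = k \mid Y = i) = \frac{\operatorname{prob}(X = k,\, Y = i)}{\operatorname{prob}(Y = i)}.$$
Let B_i be a collection of disjoint events indexed by i whose union is S. Then
$$\operatorname{prob}(A) = \sum_i \operatorname{prob}(AB_i). \tag{D.2}$$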
(From now on, all sums are taken over the indicated index from 0 to ∞.)
In terms of X and Y, (D.2) means that
$$\operatorname{prob}(X = k) = \sum_i \operatorname{prob}(X = k,\, Y = i).$$
Because of (D.1), we can now write (D.2) as
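$$\operatorname{prob}(A) = \sum_i \operatorname{prob}(A \mid B_i)\, \operatorname{prob}(B_i), \tag{D.3}$$
and so, for the variables X and Y, we obtain
$$\operatorname{prob}(X = k) = \sum_i \operatorname{prob}(X = k \mid Y = i)\, \operatorname{prob}(Y = i).$$
The expected value (also called the mean value) of X is defined by
$$E(X) = \sum_k k\, \operatorname{prob}(X = k),$$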
and if h(X) is some function of X, then the expected value of the random variable h(X) is given by
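$$E(h(X)) = \sum_k h(k)\, \operatorname{prob}(X = k). \tag{D.4}$$
For example, if h(X) = X^2, then
$$E(X^2) = \sum_k k^2\, \operatorname{prob}(X = k).$$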
The conditional expectation (conditional mean) of X, given that Y = i, is defined by
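$$E(X \mid Y = i) = \sum_k k\, \operatorname{prob}(X = k \mid Y = i). \tag{D.5}$$
Relation (D.5) enables us to define E(X|Y) as a function of Y, call it h(Y), whose value when Y = i is given by (D.5). From (D.3) and (D.4), we therefore obtain the unconditional expectation of X as
$$E(X) = \sum_k k \sum_i \operatorname{prob}(X = k \mid Y = i)\, \operatorname{prob}(Y = i) = \sum_i E(X \mid Y = i)\, \operatorname{prob}(Y = i) = E(h(Y)).$$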
It follows that
$$E(X) = E(E(X \mid Y)). \tag{D.6}$$
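Relation (D.6) can be verified by direct computation on a small joint distribution. The sketch below assumes a hypothetical joint probability table for X and Y, chosen only for illustration; the names `joint`, `prob_y`, `prob_x_given_y`, and `cond_mean_x` are not from the text. It computes E(X) once directly and once by averaging the conditional means E(X|Y = i) against prob(Y = i).

```python
from fractions import Fraction

# Hypothetical joint probabilities prob(X = k, Y = i), keyed by (k, i).
# The four entries sum to 1; all other pairs have probability 0.
joint = {
    (0, 0): Fraction(1, 8), (1, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 8), (2, 1): Fraction(1, 2),
}

def prob_y(i):
    """Marginal prob(Y = i), summing the joint probabilities over k."""
    return sum(p for (_, j), p in joint.items() if j == i)

def prob_x_given_y(k, i):
    """prob(X = k | Y = i), following (D.1)."""
    return joint.get((k, i), Fraction(0)) / prob_y(i)

def cond_mean_x(i):
    """Conditional mean E(X | Y = i), following (D.5)."""
    ks = {k for (k, _) in joint}
    return sum(k * prob_x_given_y(k, i) for k in ks)

ys = {i for (_, i) in joint}

# E(X) computed directly from the joint table.
mean_direct = sum(k * p for (k, _), p in joint.items())

# E(X) computed as the sum over i of E(X | Y = i) prob(Y = i).
mean_tower = sum(cond_mean_x(i) * prob_y(i) for i in ys)

print(mean_direct, mean_tower)   # 5/4 5/4
```

Exact rational arithmetic is used so that the two results agree exactly rather than only up to rounding error.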
When X and Y are continuous random variables taking on real non-negative values, the sums are replaced by integrals over a continuum of events indexed by the non-negative real numbers. Moreover, the discrete probabilities prob(X = k) are now represented by a continuous density function f(s). Consider, for example, X and Y to be exponentially distributed random variables. The event “X < Y” means that the values assumed by X are less than the values taken on by Y. Then (D.2) and (D.3) become
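$$\operatorname{prob}(X < Y) = \int \operatorname{prob}(X < Y,\, Y = s)\, ds$$
and
$$\operatorname{prob}(X < Y) = \int \operatorname{prob}(X < Y \mid Y = s)\, f(s)\, ds,$$
where f(s) is the density function of Y.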
(From now on, all integrals are taken from 0 to ∞.)
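As a numerical illustration of conditioning on a continuous variable, the sketch below evaluates prob(X < Y) by integrating prob(X < s) against the density of Y. It assumes, beyond what the text states, that X and Y are independent exponential variables with rates `lam` and `mu`, so that prob(X < Y | Y = s) = 1 − exp(−lam·s) and the density of Y is mu·exp(−mu·s). Under that assumption the integral has the closed form lam / (lam + mu). The function name, the truncation point `upper`, and the step count are illustrative choices.

```python
import math

def prob_x_less_than_y(lam, mu, upper=60.0, steps=200_000):
    """Trapezoid-rule estimate of the integral over s of
    prob(X < s) * f(s), with the infinite range truncated at `upper`.

    Assumes X ~ exponential(lam) and Y ~ exponential(mu), independent.
    """
    h = upper / steps

    def integrand(s):
        # prob(X < s) times the density of Y at s.
        return (1.0 - math.exp(-lam * s)) * mu * math.exp(-mu * s)

    total = 0.5 * (integrand(0.0) + integrand(upper))
    for n in range(1, steps):
        total += integrand(n * h)
    return total * h

lam, mu = 2.0, 3.0
estimate = prob_x_less_than_y(lam, mu)
exact = lam / (lam + mu)   # closed form under the independence assumption
print(abs(estimate - exact) < 1e-6)   # True
```

Truncating the range at 60 is harmless here because the integrand decays like exp(−mu·s); for much smaller rates a larger `upper` would be needed.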
Relation (D.6) is now expressed as
$$E(X) = \int E(X \mid Y = s)\, f(s)\, ds.$$
Now let X = 1 if event E occurs and X = 0 otherwise. It follows immediately that E(X) = prob(E) and E(X|Y = i) = prob(E|Y = i). Therefore
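$$\operatorname{prob}(E) = \sum_i \operatorname{prob}(E \mid Y = i)\, \operatorname{prob}(Y = i). \tag{D.7}$$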
This result is used in Appendix B.