CHAPTER 7

Large Sample Theory for Estimation and Testing

PART I: THEORY

We have seen in the previous chapters several examples in which the exact sampling distribution of an estimator or of a test statistic is difficult to obtain analytically. Large samples yield approximations, called asymptotic approximations, which are easy to derive and whose error decreases to zero as the sample size grows. In this chapter, we discuss asymptotic properties of estimators and of test statistics, such as consistency, asymptotic normality, and asymptotic efficiency. In Chapter 1, we presented results from probability theory that are necessary for the development of the theory of asymptotic inference. Section 7.1 is devoted to the concept of consistency of estimators and test statistics. Section 7.2 presents conditions for the strong consistency of the maximum likelihood estimator (MLE). Section 7.3 is devoted to the asymptotic normality of MLEs and discusses the notion of best asymptotically normal (BAN) estimators. In Section 7.4, we discuss second and higher order efficiency. In Section 7.5, we present asymptotic confidence intervals. Section 7.6 is devoted to Edgeworth and saddlepoint approximations to the distribution of the MLE, in the one–parameter exponential case. Section 7.7 is devoted to the theory of asymptotically efficient test statistics. Section 7.8 discusses Pitman's asymptotic efficiency of tests.

7.1 CONSISTENCY OF ESTIMATORS AND TESTS

Consistency of an estimator is a property that guarantees that, in large samples, the estimator yields values close to the true value of the parameter, with probability close to one. More formally, we define consistency as follows.

Definition 7.1.1. Let {θ̂n; n = n0, n0 + 1, … } be a sequence of estimators of a parameter θ. θ̂n is called consistent if θ̂n → θ in probability, as n → ∞, for all θ. The sequence is called strongly consistent if θ̂n → θ almost surely (a.s.), as n → ∞, for all θ.
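The following simulation sketch illustrates Definition 7.1.1; the normal model, the value of θ, and the sample sizes are our own illustrative choices, not part of the text.

```python
import numpy as np

# Sketch of Definition 7.1.1: for X_i ~ N(theta, 1) (our illustrative
# model), the sample mean theta_hat_n = Xbar_n satisfies
# P{|theta_hat_n - theta| < eps} -> 1 as n -> infinity.
rng = np.random.default_rng(0)
theta = 2.0

def freq_within(n, eps=0.1, reps=500):
    """Fraction of replications with |theta_hat_n - theta| < eps."""
    x = rng.normal(theta, 1.0, size=(reps, n))
    return np.mean(np.abs(x.mean(axis=1) - theta) < eps)

print([freq_within(n) for n in (10, 100, 1000, 10000)])   # increases to 1
```

The printed frequencies approach 1 as n grows, which is exactly the consistency property.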

Different estimators of a parameter θ might be consistent. Among the consistent estimators, we would prefer those having, asymptotically, the smallest mean squared error (MSE). This is illustrated in Example 7.2.

As we shall see later, the MLE is an asymptotically most efficient estimator under general regularity conditions.

We conclude this section by defining the consistency property for test functions.

Definition 7.1.2. Let {φn} be a sequence of test functions, for testing H0: θ ∈ Θ0 versus H1: θ ∈ Θ1. The sequence {φn} is called consistent if

(i) lim_{n→∞} sup_{θ∈Θ0} Eθ{φn(Xn)} = α, 0 < α < 1,
and
(ii) lim_{n→∞} Eθ{φn(Xn)} = 1, for all θ ∈ Θ1.

A test function φn satisfying property (i) is called an asymptotically size-α test.

All test functions discussed in Chapter 4 are consistent. We illustrate in Example 7.3 a test which is not based on an explicit parametric model of the distribution F(x). Such a test is called a distribution-free test, or a nonparametric test. We show that the test is consistent.

As in the case of estimation, it is not sufficient to have consistent test functions. One should consider asymptotically efficient tests, in a sense that will be defined later.

7.2 CONSISTENCY OF THE MLE

The question we address here is whether the MLE is consistent. We have seen in Example 5.22 a case where the MLE is not consistent; thus, one needs conditions for consistency of the MLE. Often we can prove the consistency of the MLE immediately, as in the case of the MLE of θ = (μ, σ) in the normal case, or in the Binomial and Poisson distributions.

Let X1, X2, …, Xn, … be independent identically distributed (i.i.d.) random variables having a p.d.f. f(x; θ), θ ∈ Θ. Let

Λn(θ) = (1/n) Σ_{i=1}^n log [f(Xi; θ)/f(Xi; θ0)].

If θ0 is the parameter value of the distribution of the Xs, then from the strong law of large numbers (SLLN)

(7.2.1)  Λn(θ) → −I(θ0, θ) a.s.,

as n → ∞, where I(θ0, θ) is the Kullback–Leibler information. Assume that I(θ0, θ′) > 0 for all θ′ ≠ θ0. Since the MLE, θ̂n, maximizes the left-hand side of (7.2.1) and since I(θ0, θ0) = 0, we can immediately conclude that if Θ contains only a finite number of points, then the MLE is strongly consistent. This result is generalized in the following theorem.

Theorem 7.2.1. Let X1, …, Xn be i.i.d. random variables having a p.d.f. f(x; θ), θ ∈ Θ, and let θ0 be the true value of θ. If

(i) Θ is compact;
(ii) f(x; θ) is upper semi-continuous in θ, for all x;
(iii) there exists a function K(x), such that Eθ0{|K(X)|} < ∞ and log f(x; θ) − log f(x; θ0) ≤ K(x), for all x and θ;
(iv) for all θ ∈ Θ and sufficiently small δ > 0, sup_{θ′: |θ′−θ|<δ} f(x; θ′) is measurable in x;
(v) f(x; θ) = f(x; θ0) for almost all x implies that θ = θ0 (identifiability);

then the MLE θ̂n → θ0 a.s., as n → ∞.

The proof is only outlined. For δ > 0, let Θδ = {θ: |θ − θ0| ≥ δ}. Since Θ is compact, so is Θδ. Let U(X; θ) = log f(X; θ) − log f(X; θ0). The conditions of the theorem imply (see Ferguson, 1996, p. 109) that

sup_{θ∈Θδ} (1/n) Σ_{i=1}^n U(Xi; θ) → sup_{θ∈Θδ} μ(θ) a.s., as n → ∞,

where μ(θ) = −I(θ0, θ) < 0, for all θ ∈ Θδ. Thus, with probability one, for n sufficiently large,

sup_{θ∈Θδ} (1/n) Σ_{i=1}^n U(Xi; θ) < 0.

But,

(1/n) Σ_{i=1}^n U(Xi; θ̂n) = sup_{θ∈Θ} (1/n) Σ_{i=1}^n U(Xi; θ) ≥ (1/n) Σ_{i=1}^n U(Xi; θ0) = 0.

Thus, with probability one, for n sufficiently large, θ̂n ∉ Θδ, that is, |θ̂n − θ0| < δ. This demonstrates the strong consistency of the MLE.

For consistency theorems that require weaker conditions, see Pitman (1979, Ch. 8). For additional reading, see Huber (1967), Le Cam (1986), and Schervish (1995, p. 415).
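The convergence in (7.2.1), which drives these consistency results, can be checked numerically. In the following sketch the normal model and the parameter values are our own illustrative choices; for unit-variance normals the Kullback–Leibler information is I(θ0, θ) = (θ − θ0)²/2.

```python
import numpy as np

# Numerical sketch of (7.2.1) for X_i ~ N(theta0, 1): the average
# log-likelihood ratio tends a.s. to -I(theta0, theta), where
# I(theta0, theta) = (theta - theta0)**2 / 2 for unit-variance normals.
rng = np.random.default_rng(1)
theta0, theta, n = 0.0, 1.5, 200_000
x = rng.normal(theta0, 1.0, n)

# log f(x; t) for N(t, 1), up to a constant that cancels in the ratio.
avg_ratio = np.mean(-(x - theta) ** 2 / 2 + (x - theta0) ** 2 / 2)
kl = (theta - theta0) ** 2 / 2
print(avg_ratio, -kl)   # the two values nearly agree
```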

7.3 ASYMPTOTIC NORMALITY AND EFFICIENCY OF CONSISTENT ESTIMATORS

The presentation of concepts and theory is done in terms of a real parameter θ. The results generalize to the k-parameter case in a straightforward manner.

A consistent estimator θ̂(Xn) of θ is called asymptotically normal if there exists an increasing sequence {cn}, cn ↑ ∞ as n → ∞, so that

(7.3.1)  cn(θ̂(Xn) − θ) → N(0, v²(θ)) in distribution, as n → ∞.

The function AV{θ̂n} = v²(θ)/cn² is called the asymptotic variance of θ̂(Xn). Let S(Xn; θ) = Σ_{i=1}^n (∂/∂θ) log f(Xi; θ) be the score function and I(θ) the Fisher information. An estimator θ̂n that, under the Cramér–Rao (CR) regularity conditions, satisfies

(7.3.2)  √n(θ̂n − θ) − (1/(I(θ)√n)) S(Xn; θ) → 0 in probability,

as n → ∞, is called asymptotically efficient. Recall that, by the Central Limit Theorem (CLT), (1/√n) S(Xn; θ) → N(0, I(θ)) in distribution. Thus, efficient estimators satisfying (7.3.2) have the asymptotic property that

(7.3.3)  √n(θ̂n − θ) → N(0, 1/I(θ)) in distribution.

For this reason, such asymptotically efficient estimators are also called BAN estimators.

We show now a set of conditions under which the MLE inlinen is a BAN estimator.

In Example 1.24, we considered a sequence {Xn} of i.i.d. random variables, with X1 ∼ B(1, θ), 0 < θ < 1. In this case, θ̂n = (1/n) Σ_{i=1}^n Xi is a strongly consistent estimator of θ. The variance stabilizing transformation g(θ̂n) = 2 sin⁻¹(θ̂n^{1/2}) is a (strongly) consistent estimator of ω = 2 sin⁻¹(θ^{1/2}). This estimator is asymptotically normal with an asymptotic variance AV{g(θ̂n)} = 1/n for all ω.
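The variance-stabilizing property can be verified by simulation; the sample size, replication count, and θ values below are our own choices.

```python
import numpy as np

# Sketch: for X_i ~ B(1, theta), the transformed estimator
# g(theta_hat_n) = 2*arcsin(sqrt(theta_hat_n)) has variance close to
# 1/n for every theta (theta values and n below are our choices).
rng = np.random.default_rng(2)
n, reps = 400, 20_000

results = {}
for theta in (0.1, 0.5, 0.9):
    p_hat = rng.binomial(n, theta, reps) / n
    g = 2 * np.arcsin(np.sqrt(p_hat))
    results[theta] = n * g.var()   # should be close to 1 for each theta
print(results)
```

Each normalized variance n·V{g(θ̂n)} is close to 1 regardless of θ, which is what "variance stabilizing" means.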

Although consistent estimators satisfying (7.3.3) are called BAN, one can sometimes construct asymptotically normal consistent estimators which, at some θ values, have an asymptotic variance with v²(θ) < 1/I(θ). Such estimators are called super-efficient. In Example 7.5, we illustrate such an estimator.

Le Cam (1953) proved that the set of points at which v²(θ) < 1/I(θ) has Lebesgue measure zero, as in Example 7.5.

The following are sufficient conditions for a consistent MLE to be a BAN estimator.

C.1. The CR regularity conditions hold (see Theorem 5.2.2);
C.2. (1/n) S(Xn; θ) is continuous in θ, a.s.;
C.3. θ̂n exists, and S(Xn; θ̂n) = 0, with probability greater than 1 − δ, 0 < δ arbitrary, for n sufficiently large;
C.4. −(1/n)(∂/∂θ) S(Xn; θ)|_{θ=θ̂n} → I(θ) in probability, as n → ∞.

Theorem 7.3.1 (Asymptotic efficiency of MLE). Let θ̂n be an MLE of θ. Then, under conditions C.1.–C.4.,

√n(θ̂n − θ) → N(0, 1/I(θ)) in distribution, as n → ∞.

Sketch of the Proof. Let Bδ,θ,n be a Borel set in the sample space such that, for all Xn ∈ Bδ,θ,n, θ̂n exists and S(Xn; θ̂n) = 0. Moreover, Pθ(Bδ,θ,n) ≥ 1 − δ. For Xn ∈ Bδ,θ,n, consider the expansion

0 = S(Xn; θ̂n) = S(Xn; θ) + (θ̂n − θ)(∂/∂θ) S(Xn; θ*n),

where |θ*n − θ| ≤ |θ̂n − θ|.

According to conditions (iii)–(v) in Theorem 7.2.1, and Slutsky's Theorem,

√n(θ̂n − θ) = [(1/√n) S(Xn; θ)] / [−(1/n)(∂/∂θ) S(Xn; θ*n)] → N(0, 1/I(θ)) in distribution,

as n → ∞, since by the CLT

(1/√n) S(Xn; θ) → N(0, I(θ)) in distribution,

as n → ∞.

7.4 SECOND–ORDER EFFICIENCY OF BAN ESTIMATORS

Often BAN estimators θ̂n are biased, with

(7.4.1)  Eθ{θ̂n} = θ + B(θ)/n + o(1/n).

The problem then is how to compare two different BAN estimators of the same parameter. Due to the bias, the asymptotic variance may not represent their precision correctly when the sample size is not extremely large. Rao (1963) suggested first adjusting an estimator θ̂n so as to reduce its bias to an order of magnitude of 1/n². Let θ̃n be the adjusted estimator, and let

(7.4.2)  Vθ{θ̃n} = 1/(nI(θ)) + D/n² + o(1/n²).

The coefficient D of 1/n2 is called the second–order deficiency coefficient. Among two BAN estimators, we prefer the one having a smaller second–order deficiency coefficient.

Efron (1975) analyzed the structure of the second-order coefficient D in exponential families in terms of their curvature, the Bhattacharyya second-order lower bound, and the bias of the estimators. Akahira and Takeuchi (1981) and Pfanzagl (1985) established the structure of the distributions of higher-order asymptotically most efficient estimators. They have shown that under the CR regularity conditions, the distribution of the most efficient second-order estimator θ̃n is

(7.4.3) numbered Display Equation

where

(7.4.4) numbered Display Equation

and

(7.4.5) numbered Display Equation

For additional reading, see also Barndorff–Nielsen and Cox (1994).

7.5 LARGE SAMPLE CONFIDENCE INTERVALS

Generally, the large sample approximations to confidence limits are based on the MLEs of the parameter(s) under consideration. This approach is meaningful in cases where the MLEs are known. Moreover, under the regularity conditions given in the theorem of Section 7.3, the MLEs are BAN estimators. Accordingly, if the sample size is large, one can in regular cases employ the BAN property of the MLE to construct confidence intervals around the MLE. This is done by using the quantiles of the standard normal distribution, and the square root of the inverse of the Fisher information function as the standard deviation of the (asymptotic) sampling distribution. In many situations, the inverse of the Fisher information function depends on the unknown parameters. The practice is to substitute for the unknown parameters their respective MLEs. If the samples are very large, this approach may be satisfactory. However, as will be shown later, if the samples are not very large it may be useful to apply first a variance stabilizing transformation g(θ) and derive the confidence limits of g(θ).

A transformation g(θ) is called variance stabilizing if g′(θ) = I^{1/2}(θ). If θ̂n is an MLE of θ, then g(θ̂n) is an MLE of g(θ). The asymptotic variance of g(θ̂n) under the regularity conditions is (g′(θ))²/nI(θ). Accordingly, if g′(θ) = I^{1/2}(θ), then the asymptotic variance of g(θ̂n) is 1/n. For example, suppose that X1, …, Xn is a sample of n i.i.d. binomial random variables, B(1, θ). Then, the MLE of θ is θ̂n = X̄n. The Fisher information function is In(θ) = n/θ(1 − θ). If g(θ) = 2 sin⁻¹(θ^{1/2}) then g′(θ) = 1/(θ(1 − θ))^{1/2}. Hence, the asymptotic variance of g(θ̂n) = 2 sin⁻¹(θ̂n^{1/2}) is 1/n. Transformations stabilizing whole covariance matrices are discussed in the paper of Holland (1973).

Let θ = t(g) be the inverse of a variance stabilizing transformation g(θ), and suppose (without loss of generality) that t(g) is strictly increasing. For cases satisfying the BAN regularity conditions, if θ̂n is the MLE of θ,

(7.5.1)  √n(g(θ̂n) − g(θ)) → N(0, 1) in distribution, as n → ∞.

A (1 − α) confidence interval for g(θ) is given asymptotically by (g(θ̂n) − z1−α/2/√n, g(θ̂n) + z1−α/2/√n), where z1−α/2 = Φ⁻¹(1 − α/2). Let gL and gU denote these lower and upper confidence limits. We assume that both limits are within the range of the function g(θ); otherwise, we can always truncate them in an appropriate manner. After obtaining the limits gL and gU, we apply the inverse transformation to these limits and thus obtain the limits θL = t(gL) and θU = t(gU). Indeed, since t(g) is a one-to-one increasing transformation,

(7.5.2)  P{θL ≤ θ ≤ θU} = P{gL ≤ g(θ) ≤ gU} → 1 − α, as n → ∞.

Thus, (θL, θU) is asymptotically a (1 − α)-confidence interval.

7.6 EDGEWORTH AND SADDLEPOINT APPROXIMATIONS TO THE DISTRIBUTION OF THE MLE: ONE–PARAMETER CANONICAL EXPONENTIAL FAMILIES

The asymptotically normal distributions for the MLE often require large samples to be effective. If the samples are not very large, one could try to modify or correct the approximation by the Edgeworth expansion. We restrict attention in this section to the one-parameter exponential type families in canonical form.

According to (5.6.2), the MLE, ψ̂n, of the canonical parameter ψ satisfies the equation

(7.6.1)  K′(ψ̂n) = (1/n) Σ_{i=1}^n U(Xi).

The cumulant generating function K(ψ) is analytic. Let G(x) be the inverse function of K′(ψ). G(x) is also analytic and one can write, for large samples,

(7.6.2) numbered Display Equation

Recall that K″(ψ) = I(ψ) is the Fisher information function, and for large samples,

(7.6.3) numbered Display Equation

Moreover, E{Ūn} = K′(ψ) and V{√n Ūn} = I(ψ), where Ūn = (1/n) Σ_{i=1}^n U(Xi). Thus, by the CLT,

(7.6.4)  √n(Ūn − K′(ψ)) → N(0, I(ψ)) in distribution, as n → ∞.

Equivalently,

(7.6.5)  √n(ψ̂n − ψ) → N(0, 1/I(ψ)) in distribution, as n → ∞.

This is a version of Theorem 7.3.1 in the present special case.

If the sample is not very large, we can add correction terms to the asymptotic distribution of the MLE according to the Edgeworth expansion. We obtain

(7.6.6) numbered Display Equation

where

(7.6.7) numbered Display Equation

and

(7.6.8) numbered Display Equation

Let Tn = Σ_{i=1}^n U(Xi); Tn is the likelihood statistic. As shown in Reid (1988), the saddlepoint approximation to the p.d.f. of the MLE, ψ̂n, is

(7.6.9) numbered Display Equation

where cn is a factor of proportionality, such that ∫ g_{ψ̂n}(x; ψ) dμ(x) = 1.

Let L(θ; Xn) and l(θ; Xn) denote the likelihood and log-likelihood functions. Let θ̂n denote the MLE of θ, and

(7.6.10)  Jn(θ) = −(1/n)(∂²/∂θ²) l(θ; Xn).

We have seen that Eθ{Jn(θ)} = I(θ). Thus, Jn(θ) → I(θ) in probability, as n → ∞ (the Fisher information function). Jn(θ̂n) is the MLE of Jn(θ). Thus, if θ̂n → θ in probability, as n → ∞, then, as in condition C.4. of Theorem 7.3.1, Jn(θ̂n) → I(θ) in probability, as n → ∞. The saddlepoint approximation to g_{θ̂n}(x; θ) in the general regular case is

(7.6.11) numbered Display Equation

Formula (7.6.11) is called the Barndorff–Nielsen p*–formula. The order of magnitude of its error, in large samples, is O(n−3/2).

7.7 LARGE SAMPLE TESTS

For testing two simple hypotheses there exists a most powerful test of size α. We have seen examples in which it is difficult to determine the exact critical level kα of the test. Such a case was demonstrated in Example 4.4. In that example, we used the asymptotic distribution of the test statistic to approximate kα. Generally, if X1, …, Xn are i.i.d. with common p.d.f. f(x; θ), let

(7.7.1)  R(x) = f(x; θ1)/f(x; θ0),

where the two simple hypotheses are H0: θ = θ0 and H1: θ = θ1. The most powerful test of size α can be written as

Unnumbered Display Equation

Thus, in large samples we can consider the test function φ(Sn) = I{Sn ≥ kα}, where Sn = Σ_{i=1}^n log R(Xi). Note that under H0, Eθ0{Sn} = −nI(θ0, θ1), while under H1, Eθ1{Sn} = nI(θ1, θ0), where I(θ, θ′) is the Kullback–Leibler information.

Let σ0² = Vθ0{log R(X1)}. Assume that 0 < σ0² < ∞. Then, by the CLT, lim_{n→∞} Pθ0{(Sn + nI(θ0, θ1))/(σ0√n) ≤ x} = Φ(x). Hence, a large sample approximation to kα is

(7.7.2)  kα ≈ −nI(θ0, θ1) + z1−α σ0 √n.
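This approximation can be checked numerically. In the following sketch the model and the hypothesized values are our own illustrative choices: for f(x; θ) = N(θ, 1), log R(x) = (θ1 − θ0)x − (θ1² − θ0²)/2, so that I(θ0, θ1) = (θ1 − θ0)²/2 and σ0 = |θ1 − θ0|.

```python
import math
import numpy as np

# Sketch of the k_alpha approximation for N(theta, 1), testing
# theta0 = 0 against theta1 = 1 at alpha = 0.05 (our choices).
theta0, theta1, n, z95 = 0.0, 1.0, 25, 1.644854
kl = (theta1 - theta0) ** 2 / 2          # I(theta0, theta1)
sigma0 = abs(theta1 - theta0)
k_alpha = -n * kl + z95 * math.sqrt(n) * sigma0

# Monte Carlo check: the test {S_n >= k_alpha} should have size ~ 0.05.
rng = np.random.default_rng(3)
x = rng.normal(theta0, 1.0, size=(50_000, n))
s = (theta1 - theta0) * x.sum(axis=1) - n * (theta1**2 - theta0**2) / 2
size = (s >= k_alpha).mean()
print(size)   # close to 0.05
```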

The large sample approximation to the power of the test is

(7.7.3)  Φ((nI(θ1, θ0) − kα)/(σ1√n)),

where σ1² = Vθ1{log R(X1)}. Generally, for testing H0: θ = θ0 versus H1: θ ≠ θ0, where θ is a k-parameter vector, the following three test statistics are in common use, in cases satisfying the CR regularity conditions:

1. The Wald Statistic

(7.7.4) numbered Display Equation

where θ̂n is the MLE of θ, and

(7.7.5) numbered Display Equation

Here, H(θ) is the matrix of partial derivatives

Unnumbered Display Equation

An alternative statistic, which is asymptotically equivalent to Qw, is

(7.7.6) numbered Display Equation

One could also use the Fisher information matrix (FIM), I(θ0), instead of J(θ0).

2. The Wilks’ Likelihood Ratio Statistic:

(7.7.7) numbered Display Equation

3. Rao’s Efficient Score Statistic:

(7.7.8) numbered Display Equation

where S(Xn; θ) is the score function, namely, the gradient vector ∇θ Σ_{i=1}^n log f(Xi; θ). QR does not require the computation of the MLE θ̂n.

On the basis of the multivariate asymptotic normality of θ̂n, one can show that, in the regular cases, all three test statistics have under H0 an asymptotic χ²[k] distribution. The asymptotic power function can be computed on the basis of the noncentral χ²[k; λ] distribution.
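As a numerical sketch, the three statistics can be computed side by side in the one-parameter (k = 1) Bernoulli case; the model, the null value θ0 = 0.5, and the simulated data are our own choices, not from the text.

```python
import numpy as np

# Sketch: Wald, Wilks, and Rao statistics for a Bernoulli sample,
# testing H0: theta = 0.5 (one-parameter case, simulated data).
rng = np.random.default_rng(4)
n, theta0 = 200, 0.5
x = rng.binomial(1, 0.6, n)
t_hat = x.mean()                                   # MLE of theta

def loglik(t):
    return np.sum(x * np.log(t) + (1 - x) * np.log(1 - t))

def fisher(t):                                     # I(theta), one observation
    return 1.0 / (t * (1 - t))

def score(t):                                      # S(X_n; theta)
    return (x.sum() - n * t) / (t * (1 - t))

Q_W = n * (t_hat - theta0) ** 2 * fisher(t_hat)    # Wald
Q_L = 2 * (loglik(t_hat) - loglik(theta0))         # Wilks likelihood ratio
Q_R = score(theta0) ** 2 / (n * fisher(theta0))    # Rao efficient score
print(Q_W, Q_L, Q_R)   # three nearby values; each ~ chi^2[1] under H0
```

The three values agree to first order, reflecting their asymptotic equivalence.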

7.8 PITMAN’S ASYMPTOTIC EFFICIENCY OF TESTS

Pitman's asymptotic efficiency is an index of the relative performance of test statistics in large samples. This index is called the Pitman asymptotic relative efficiency (ARE). It was introduced by Pitman in 1948.

Let X1, …, Xn be i.i.d. random variables, having a common distribution F(x; θ), θ ∈ Θ. Let Tn be a statistic. Suppose that there exist functions μ(θ) and σn(θ) so that, for each θ ∈ Θ, Zn = (Tn − μ(θ))/σn(θ) → N(0, 1) in distribution, as n → ∞. Often σn(θ) = c(θ)w(n), where w(n) = n^{−α} for some α > 0.

Consider the problem of testing the hypotheses H0: θ ≤ θ0 against H1: θ > θ0, at levels αn → α, as n → ∞. Let the sequence of test functions be

(7.8.1) numbered Display Equation

where kn → z1−α. The corresponding power functions are

(7.8.2) numbered Display Equation

We assume that

1. μ (θ) is continuously differentiable in the neighborhood of θ0, and μ′(θ0) > 0;
2. c(θ) is continuous in the neighborhood of θ0, and c(θ0) > 0.

Under these assumptions, if θn = θ0 + δw(n), with δ > 0, then

(7.8.3) numbered Display Equation

The function

(7.8.4)  J(θ0; T) = (μ′(θ0))²/c²(θ0)

is called the asymptotic efficacy of Tn.

Let Vn be an alternative test statistic, with Wn = (Vn − η(θ))/(v(θ)w(n)) → N(0, 1) in distribution, as n → ∞. The asymptotic efficacy of Vn is J(θ; V) = (η′(θ))²/v²(θ). Consider the case of w(n) = n^{−1/2}, and let θn = θ0 + δ/√n, δ > 0, be a sequence of local alternatives. Let n′(n) be sample sizes for the Vn-test, chosen so that its power at θn converges to the same limit as the power of the Tn-test based on n observations. For this,

(7.8.5) numbered Display Equation

and

(7.8.6)  lim_{n→∞} n/n′(n) = J(θ0; V)/J(θ0; T).

This limit (7.8.6) is the Pitman ARE of Vn relative to Tn.

We remark that the asymptotic distributions of Zn and Wn do not have to be N(0,1), but they should be the same. If Zn and Wn converge to two different distributions, the Pitman ARE is not defined.
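A classical worked instance of these efficacies (our illustrative example, not from the text) is the sign test versus the sample-mean test for the mean of a N(θ, 1) population:

```python
import math

# Sketch: efficacies at theta0 = 0 for testing the mean of N(theta, 1).
# Sample-mean test: mu(theta) = theta, c(theta) = 1, w(n) = n**(-1/2),
# so J(theta0; T) = 1.  Sign test (fraction of positive observations):
# mu(theta) = Phi(theta), so mu'(0) = phi(0), and c(0) = 1/2.
phi0 = 1.0 / math.sqrt(2 * math.pi)   # standard normal density at 0
J_mean = 1.0 ** 2 / 1.0 ** 2
J_sign = phi0 ** 2 / 0.5 ** 2

are_sign_vs_mean = J_sign / J_mean    # Pitman ARE of sign test vs mean
print(are_sign_vs_mean)               # 2/pi ≈ 0.6366
```

The ratio 2/π is the well-known Pitman ARE of the sign test relative to the mean test under normality.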

7.9 ASYMPTOTIC PROPERTIES OF SAMPLE QUANTILES

Given a random sample of n i.i.d. random variables, the empirical distribution function of the sample is

(7.9.1)  Fn(x) = (1/n) Σ_{i=1}^n I{Xi ≤ x}, −∞ < x < ∞.

This is a step function, with jumps of size 1/n at the locations of the sample random variables {Xi, i = 1, …, n}. The pth quantile of a distribution F is defined as

(7.9.2)  ξp = F⁻¹(p) = inf{x: F(x) ≥ p}.

According to this definition, the quantiles are unique. Similarly, the pth sample quantile is defined as ξn,p = Fn⁻¹(p).

Theorem 7.9.1. Let 0 < p < 1. Suppose that F is differentiable at the pth quantile ξp, and F′(ξp) > 0. Then ξn,p → ξp a.s. as n → ∞.

Proof. Let ε > 0. Then F(ξp − ε) < p < F(ξp + ε). By the SLLN,

Fn(ξp − ε) → F(ξp − ε) a.s.

and

Fn(ξp + ε) → F(ξp + ε) a.s.

Hence,

P{Fn(ξp − ε) < p < Fn(ξp + ε), for all n sufficiently large} = 1

as n → ∞. Thus,

P{ξp − ε ≤ ξn,p ≤ ξp + ε, for all n sufficiently large} = 1

as n → ∞. That is,

ξn,p → ξp a.s., as n → ∞.        QED

Note that if 0 < F(ξ) < 1 then, by the CLT,

(7.9.3)  lim_{n→∞} P{√n(Fn(ξ) − F(ξ))/(F(ξ)(1 − F(ξ)))^{1/2} ≤ t} = Φ(t),

for all −∞ < t < ∞. We show now that, under certain conditions, ξn,p is asymptotically normal.

Theorem 7.9.2. Let 0 < p < 1. Suppose that F is continuous at ξp = F⁻¹(p). Then,

(i) If F′(ξp−) > 0 then, for all t < 0,

(7.9.4)  lim_{n→∞} P{√n(ξn,p − ξp) ≤ t} = Φ(t F′(ξp−)/(p(1 − p))^{1/2}).

(ii) If F′(ξp+) > 0 then, for all t > 0,

(7.9.5)  lim_{n→∞} P{√n(ξn,p − ξp) ≤ t} = Φ(t F′(ξp+)/(p(1 − p))^{1/2}).

Proof. Fix t. Let A > 0 and define

(7.9.6) numbered Display Equation

Thus,

(7.9.7) numbered Display Equation

Moreover, since nFn(ξp) ∼ B(n, F(ξp)),

(7.9.8) numbered Display Equation

By CLT,

(7.9.9) numbered Display Equation

as n → ∞, where F̄(ξp) = 1 − F(ξp). Let

(7.9.10) numbered Display Equation

and

(7.9.11) numbered Display Equation

Then

(7.9.12) numbered Display Equation

where

(7.9.13) numbered Display Equation

Since F is continuous at ξp,

Unnumbered Display Equation

Hence, if t > 0,

(7.9.14) numbered Display Equation

Similarly, if t < 0

(7.9.15) numbered Display Equation

Thus, let

Unnumbered Display Equation

Then, lim_{n→∞} Cn(t) = t. Hence, from (7.9.12),

Unnumbered Display Equation        QED

Corollary. If F is differentiable at ξp, and f(ξp) = (d/dx) F(x)|_{x=ξp} > 0, then ξn,p is asymptotically N(ξp, p(1 − p)/(n f²(ξp))).
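The Corollary can be checked by Monte Carlo for the sample median of standard normal data (our illustrative choice): there p = 1/2 and f(ξp) = 1/√(2π), so the asymptotic variance is π/(2n).

```python
import numpy as np

# Sketch: for the median of N(0, 1) samples, n * Var(xi_{n,1/2})
# should be close to pi/2, per the Corollary.
rng = np.random.default_rng(5)
n, reps = 500, 20_000
medians = np.median(rng.normal(0.0, 1.0, size=(reps, n)), axis=1)

print(n * medians.var(), np.pi / 2)   # the two values nearly agree
```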

PART II: EXAMPLES

Example 7.1. Let X1, X2, … be a sequence of i.i.d. random variables, such that E{|X1|} < ∞. By the SLLN, X̄n = (1/n) Σ_{i=1}^n Xi → μ a.s., as n → ∞, where μ = E{X1}. Thus, the sample mean X̄n is a strongly consistent estimator of μ. Similarly, if E{|X1|^r} < ∞, r ≥ 1, then the rth sample moment Mn,r = (1/n) Σ_{i=1}^n Xi^r is a strongly consistent estimator of μr = E{X1^r}, i.e.,

Unnumbered Display Equation

Thus, if σ2 = V{X1}, and 0 < σ2 < ∞,

Unnumbered Display Equation

That is, Mn,2 − X̄n² is a strongly consistent estimator of σ². It follows that σ̂n² = (1/n) Σ_{i=1}^n (Xi − X̄n)² is also a strongly consistent estimator of σ². Note that, since Mn,r → μr a.s., as n → ∞, whenever E{|X1|^r} < ∞, then for any continuous function g(·), g(Mn,r) → g(μr) a.s., as n → ∞. Thus, if

Unnumbered Display Equation

is the coefficient of skewness, the sample coefficient of skewness is a strongly consistent estimator of β1, i.e.,

Unnumbered Display Equation inline

Example 7.2. Let X1, X2, … be a sequence of i.i.d. random variables having a rectangular distribution R(0, θ), 0 < θ < ∞. Since μ1 = θ/2, θ̂1,n = 2X̄n is a strongly consistent estimator of θ. The MLE θ̂2,n = X(n) is also a strongly consistent estimator of θ. Indeed, for any 0 < ε < θ,

P{θ − θ̂2,n ≥ ε} = (1 − ε/θ)^n.

Hence, by the Borel–Cantelli Lemma, P{θ − θ̂2,n ≥ ε, i.o.} = 0. This implies that θ̂2,n → θ a.s., as n → ∞. The MLE is strongly consistent. The expected value of the MLE is E{θ̂2,n} = (n/(n + 1))θ. The variance of the MLE is

V{θ̂2,n} = nθ²/((n + 1)²(n + 2)).

The MSE of θ̂2,n is V{θ̂2,n} + Bias²{θ̂2,n}, i.e.,

MSE{θ̂2,n} = nθ²/((n + 1)²(n + 2)) + θ²/(n + 1)² = 2θ²/((n + 1)(n + 2)).

The variance of θ̂1,n is V{θ̂1,n} = θ²/(3n). The relative efficiency of θ̂1,n against θ̂2,n is

MSE{θ̂2,n}/V{θ̂1,n} = 6n/((n + 1)(n + 2)) → 0,

as n → ∞. Thus, in large samples, 2X̄n is a very inefficient estimator relative to the MLE.
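The two MSE formulas of Example 7.2 can be confirmed by simulation; θ = 1, n = 100, and the replication count are our own choices.

```python
import numpy as np

# Sketch: Monte Carlo MSEs of the two estimators of theta in R(0, theta),
# compared with MSE{X_(n)} = 2 theta^2/((n+1)(n+2)) and
# V{2 Xbar_n} = theta^2/(3n).
rng = np.random.default_rng(6)
theta, n, reps = 1.0, 100, 40_000
x = rng.uniform(0, theta, size=(reps, n))

mse_mle = np.mean((x.max(axis=1) - theta) ** 2)     # MLE X_(n)
mse_mom = np.mean((2 * x.mean(axis=1) - theta) ** 2)  # 2 * Xbar_n
print(mse_mle, 2 * theta**2 / ((n + 1) * (n + 2)))  # nearly equal
print(mse_mom, theta**2 / (3 * n))                  # nearly equal
```

The ratio of the two MSEs is roughly 6/n, matching the vanishing relative efficiency of 2X̄n.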

Example 7.3. Let X1, X2, …, Xn be i.i.d. random variables having a continuous distribution F(x), symmetric around a point θ; θ is obviously the median of the distribution. We index these distributions by θ and consider the location family ℱs = {Fθ: Fθ(x) = F(x − θ), and F(−z) = 1 − F(z); −∞ < θ < ∞}. The functional form of F is not specified in this model. Thus, ℱs is the family of all symmetric, continuous distributions. We wish to test the hypotheses

Unnumbered Display Equation

The following test is the Wilcoxon signed–rank test:

Let Yi = Xi − θ0, i = 1, …, n, and let S(Yi) = I{Yi > 0}, i = 1, …, n. We consider now the ordered absolute values of the Yi, i.e.,

|Y|(1) < |Y|(2) < ⋯ < |Y|(n),

and let R(Yi) be the index (j) denoting the place of |Yi| among the ordered absolute values, i.e., R(Yi) = j, j = 1, …, n, if, and only if, |Yi| = |Y|(j). Define the test statistic

Tn = Σ_{i=1}^n S(Yi) R(Yi).

The test of H0 versus H1 based on Tn, which rejects H0 if Tn is sufficiently large, is called the Wilcoxon signed-rank test. We show that this test is consistent. Note that under H0, P0{S(Yi) = 1} = 1/2. Moreover, for each i = 1, …, n, under H0,

P0{S(Yi) = 1 | |Yi|} = 1/2.

Thus, S(Yi) and |Yi| are independent. This implies that, under H0, S(Y1), …, S(Yn) are independent of R(Y1), …, R(Yn), and the distribution of Tn, under H0, is like that of Tn* = Σ_{j=1}^n j Wj, where W1, …, Wn are i.i.d. B(1, 1/2). It follows that, under H0,

E0{Tn} = (1/2) Σ_{j=1}^n j = n(n + 1)/4.

Similarly, under H0,

V0{Tn} = (1/4) Σ_{j=1}^n j² = n(n + 1)(2n + 1)/24.

According to Problem 3 of Section 1.12, the CLT holds, and

(Tn − n(n + 1)/4)/(n(n + 1)(2n + 1)/24)^{1/2} → N(0, 1) in distribution, as n → ∞.

Thus, the test function

φn = I{Tn ≥ n(n + 1)/4 + z1−α (n(n + 1)(2n + 1)/24)^{1/2}}

has, asymptotically, size α, 0 < α < 1. This establishes part (i) of the definition of consistency.

When θ > θ0 (under H1), the distribution of Tn is more complicated. We can consider the normalized test statistic Vn = 2Tn/(n(n + 1)). One can show (see Hettmansperger, 1984, p. 47) that the asymptotic mean of Vn, as n → ∞, is p2(θ), and the asymptotic variance of Vn is

Unnumbered Display Equation

where

Unnumbered Display Equation

In addition, one can show that the asymptotic distribution of Vn (under H1) is normal (see Hettmansperger, 1984).

Unnumbered Display Equation

Finally, when θ > θ0, p2(θ) > 1/2 and

Unnumbered Display Equation

for all θ > θ0. Thus, the Wilcoxon signed-rank test is consistent.
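The null moments of Tn used above can be checked by simulation; the symmetric distribution under H0 (standard normal) and the sample size are our own choices.

```python
import numpy as np

# Sketch: the Wilcoxon signed-rank statistic T_n, with its null moments
# E0{T_n} = n(n+1)/4 and V0{T_n} = n(n+1)(2n+1)/24 checked by simulation.
rng = np.random.default_rng(7)

def signed_rank_stat(y):
    """T_n = sum of ranks of |Y_i| over the indices with Y_i > 0."""
    ranks = np.argsort(np.argsort(np.abs(y))) + 1   # ranks of |Y_i|
    return ranks[y > 0].sum()

n, reps = 50, 5_000
t = np.array([signed_rank_stat(rng.normal(0.0, 1.0, n))
              for _ in range(reps)])
print(t.mean(), n * (n + 1) / 4)                    # ≈ 637.5
print(t.var(), n * (n + 1) * (2 * n + 1) / 24)      # ≈ 10731.25
```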

Example 7.4. Let T1, T2, …, Tn be i.i.d. random variables having an exponential distribution with mean β, 0 < β < ∞. The observable random variables are Xi = min (Ti, t*), i = 1, …,n; 0 < t* < ∞. This is the case of Type I censoring of the random variables T1, …, Tn.

The likelihood function of β, 0 < β < ∞, is

Unnumbered Display Equation

where Kn = Σ_{i=1}^n I{Xi < t*}. Note that the MLE of β does not exist if Kn = 0. However, P{Kn = 0} = e^{−nt*/β} → 0 as n → ∞. Thus, for sufficiently large n, the MLE of β is

β̂n = (1/Kn) Σ_{i=1}^n Xi.

Note that, by the SLLN,

(1/n) Σ_{i=1}^n Xi → E{X1} = β(1 − e^{−t*/β}) a.s.,

and

Kn/n → P{T1 < t*} = 1 − e^{−t*/β} a.s.

Moreover,

β̂n = [(1/n) Σ_{i=1}^n Xi]/(Kn/n).

Thus, β̂n → β a.s., as n → ∞. This establishes the strong consistency of β̂n.
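The censored-data MLE of Example 7.4 is easy to check numerically; β = 2 and t* = 3 are our own illustrative choices.

```python
import numpy as np

# Sketch of Example 7.4: Type I censoring at t* of T_i, exponential
# with mean beta.  The MLE is (sum of censored observations) / K_n.
rng = np.random.default_rng(8)
beta, t_star, n = 2.0, 3.0, 100_000

t = rng.exponential(beta, n)
x = np.minimum(t, t_star)            # observed, censored variables
k = np.sum(t < t_star)               # K_n = number of uncensored values
beta_hat = x.sum() / k               # MLE, defined for K_n > 0
print(beta_hat)                      # should be close to beta = 2
```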

Example 7.5. Let {Xn} be a sequence of i.i.d. random variables, X1 ∼ N(θ, 1), −∞ < θ < ∞. Given a sample of n observations, the minimal sufficient statistic is X̄n = (1/n) Σ_{j=1}^n Xj. The Fisher information function is I(θ) = 1, and X̄n is a BAN estimator. Consider the estimator,

Unnumbered Display Equation

Let

Unnumbered Display Equation

Now,

Unnumbered Display Equation

Thus,

Unnumbered Display Equation

We show now that θ̂n is consistent. Indeed, for any δ > 0,

Unnumbered Display Equation

If θ = 0 then

Unnumbered Display Equation

since X̄n is consistent. Similarly, if θ ≠ 0,

Unnumbered Display Equation

Thus, θ̂n is consistent. Furthermore,

Unnumbered Display Equation

Hence,

Unnumbered Display Equation

as n → ∞. This shows that θ̂n is asymptotically normal, with asymptotic variance

Unnumbered Display Equation

θ̂n is super-efficient.
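Super-efficiency at a single point can be seen by simulation. The exact estimator of Example 7.5 is given in its displayed equations; the sketch below uses the classic Hodges-type construction, which is our own stand-in and may differ in detail: θ̌n = X̄n if |X̄n| ≥ n^{−1/4}, and 0 otherwise.

```python
import numpy as np

# Sketch: a Hodges-type estimator (our reconstruction).  At theta = 0
# its normalized MSE n*E(theta_check - theta)^2 tends to 0, beating the
# information bound 1/I(theta) = 1; at theta != 0 it is about 1.
rng = np.random.default_rng(9)
n, reps = 1_000, 20_000

def hodges(xbar, n):
    return np.where(np.abs(xbar) >= n ** (-0.25), xbar, 0.0)

results = {}
for theta in (0.0, 1.0):
    xbar = rng.normal(theta, 1.0 / np.sqrt(n), reps)  # dist. of Xbar_n
    results[theta] = n * np.mean((hodges(xbar, n) - theta) ** 2)
print(results)   # near 0 at theta = 0, near 1 at theta = 1
```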

Example 7.6. Let X1, X2, …, Xn be i.i.d. random variables, X1 ∼ B(1, e^{−θ}), 0 < θ < ∞. The MLE of θ after n observations is

θ̂n = −log X̄n.

θ̂n does not exist if Σ Xi = 0. The probability of this event is (1 − e^{−θ})^n. Thus, if n ≥ N(δ, θ) = log δ/log(1 − e^{−θ}), then Pθ{Σ Xi = 0} < δ. For n ≥ N(δ, θ), let

Bn = {Xn: Σ_{i=1}^n Xi ≥ 1};

then Pθ{Bn} > 1 − δ. On the set Bn, θ̂n = −log p̂n, where p̂n = (1/n) Σ_{i=1}^n Xi is the MLE of p = e^{−θ}. Finally, the Fisher information function is I(θ) = e^{−θ}/(1 − e^{−θ}), and

1/I(θ) = e^θ − 1.

All the conditions of Theorem 7.3.1 hold, and

√n(θ̂n − θ) → N(0, e^θ − 1) in distribution, as n → ∞.

Example 7.7. Consider again the MLEs of the parameters of a Weibull distribution G^{1/β}(λ, 1); 0 < β, λ < ∞, which have been developed in Example 5.19. The likelihood function L(λ, β; Xn) is specified there. We derive here the asymptotic covariance matrix of the MLEs λ̂ and β̂. Note that the Weibull distributions satisfy all the required regularity conditions.

Let Iij, i = 1,2, j = 1,2 denote the elements of the Fisher information matrix. These elements are defined as

Unnumbered Display Equation

We will derive the formulae for these elements under the assumption of n = 1 observation. The resulting information matrix can then be multiplied by n to yield that of a random sample of size n. This is due to the fact that the random variables are i.i.d.

The partial derivatives of the log–likelihood are

Unnumbered Display Equation

Thus,

Unnumbered Display Equation

since X^β ∼ E(λ). It is much more complicated to derive the other elements of I(θ). For this purpose, we introduce first a few auxiliary results. Let M(t) be the moment generating function of the extreme-value distribution. We note that

Unnumbered Display Equation

Accordingly,

Unnumbered Display Equation

Similarly,

Unnumbered Display Equation

These identities are used in the following derivations:

Unnumbered Display Equation

where γ = 0.577216… is the Euler constant. Moreover, as compiled from the tables of Abramowitz and Stegun (1968, p. 253)

Unnumbered Display Equation

We also obtain

Unnumbered Display Equation

where Γ ″(2) = 0.82367 and Γ ″(3) = 2.49293. The derivations of formulae for I12 and I22 are lengthy and tedious. We provide here, for example, the derivation of one expectation:

Unnumbered Display Equation

However, X^β ∼ E(λ), which is distributed like (1/λ)U, where U ∼ E(1). Therefore,

Unnumbered Display Equation

The reader can derive other expressions similarly.

For each value of λ and β, we evaluate I11, I12, and I22. The asymptotic variances and covariances of the MLEs, designated by AV and AC, are determined from the inverse of the Fisher information matrix by

Unnumbered Display Equation

and

Unnumbered Display Equation

Applying these formulae to determine the asymptotic variances and asymptotic covariance of λ̂ and β̂ of Example 5.20, we obtain, for λ = 1 and β = 1.75, the numerical results I11 = 1, I12 = 0.901272, and I22 = 1.625513. Thus, for n = 50, we have AV{λ̂} = 0.0246217, AV{β̂} = 0.0245935, and AC(λ̂, β̂) = −0.0221655. The asymptotic standard errors (square roots of AV) of λ̂ and β̂ are 0.1569 and 0.1568, respectively. Thus, the estimates λ̂ = 0.839 and β̂ = 1.875 are not significantly different from the true values λ = 1 and β = 1.75.

Example 7.8. Let X1, …, Xn be i.i.d. random variables, exponentially distributed with mean ξ, 0 < ξ < ∞. Let Y1, …, Yn be i.i.d. random variables, exponentially distributed with mean η, 0 < η < ∞, and assume that the Y-sample is independent of the X-sample.

The parameter to estimate is θ = η/ξ. The MLE of θ is θ̂n = Ȳn/X̄n, where X̄n and Ȳn are the corresponding sample means. For each n ≥ 1, θ̂n/θ ∼ F[2n, 2n]. The asymptotic distribution of θ̂n is N(θ, 2θ²/n), 0 < θ < ∞. θ̂n is a BAN estimator. To find the asymptotic bias of θ̂n, verify that

E{θ̂n} = (n/(n − 1))θ = θ(1 + 1/(n − 1)).

The bias of the MLE is B(θ̂n) = θ/(n − 1), which is of O(1/n). Thus, we adjust θ̂n by θ̃n = ((n − 1)/n)θ̂n. The bias of θ̃n is B(θ̃n) = 0. The variance of θ̃n is

V{θ̃n} = θ²(2n − 1)/(n(n − 2)) = 2θ²/n + 3θ²/n² + o(1/n²).

Thus, the second-order deficiency coefficient of θ̃n is D = 3θ². Note that θ̃n is the UMVU estimator of θ.

Example 7.9. Let X1, X2, …, Xn be i.i.d. Poisson random variables, with mean λ, 0 < λ < ∞. We consider the parameter θ = e^{−λ}, 0 < θ < 1.

The UMVU of θ is

θ̃n = (1 − 1/n)^{Tn},

where Tn = Σ_{i=1}^n Xi. The MLE of θ is

θ̂n = e^{−λ̂n},

where λ̂n = Tn/n. Note that θ̃n − θ̂n → 0 in probability. The two estimators are asymptotically equivalent. Using moment generating functions, we prove that

Eλ{θ̂n} = e^{nλ(e^{−1/n} − 1)} = θ(1 + λ/(2n) + O(1/n²)).

Thus, adjusting the MLE for the bias, let

θ̂n* = θ̂n(1 − λ̂n/(2n)).

Note that, by the delta method,

Unnumbered Display Equation

Thus,

Unnumbered Display Equation

The variance of the bias adjusted estimator inline is

Unnumbered Display Equation

Continuing the computations, we find

Unnumbered Display Equation

Similarly,

Unnumbered Display Equation

It follows that

Unnumbered Display Equation

In the present example, the bias adjusted MLE is the most efficient second–order estimator. The variance of the UMVU estimator inlinen is

Unnumbered Display Equation

Thus, inlinen has the deficiency coefficient D = λ2 e−2λ/2.
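The asymptotic equivalence and the bias correction can be illustrated numerically. The sketch below (illustrative, not from the text) simulates T ∼ Poisson(nλ) and compares the Monte Carlo means of the UMVU estimator (1 − 1/n)T, which is exactly unbiased for e−λ, with the MLE e−T/n, whose bias is positive and of order 1/n:

```python
import math, random

def rpois(mu, rng):
    # Knuth's method: count uniforms until their running product drops below e^{-mu}
    limit, k, prod = math.exp(-mu), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def compare_estimators(lam, n, reps, seed=7):
    """Monte Carlo means of the UMVU (1 - 1/n)^T and the MLE exp(-T/n) of exp(-lam),
    where T is the sum of n i.i.d. Poisson(lam) observations, so T ~ Poisson(n*lam)."""
    rng = random.Random(seed)
    umvu = mle = 0.0
    for _ in range(reps):
        t = rpois(n * lam, rng)
        umvu += (1.0 - 1.0 / n) ** t
        mle += math.exp(-t / n)
    return umvu / reps, mle / reps
```

For λ = 1 and n = 50 the UMVU mean sits on e−1 while the MLE mean exceeds it, in line with the bias adjustment above.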

Example 7.10

(a) In n = 100 Bernoulli trials, we observe 56 successes. The model is XB(100,θ). The MLE of θ is inline = 0.56 and that of g(θ) = 2 sin −1(inline) is g(0.56) = 1.69109. The 0.95–confidence limits for g(θ) are gL = g(0.56) − z.975/10 = 1.49509 and gU = 1.88709. The function g(θ) varies in the range [0,π]. Thus, let inline = max (0,gL) and inline = min (gU, π). In the present case, both gL and gU are in (0,π). The inverse transformation is

Unnumbered Display Equation

In the present case, θL = 0.462 and θU = 0.656. We can also, as mentioned earlier, determine approximate confidence limits directly on θ by estimating the variance of the MLE. In this case, we obtain the limits

Unnumbered Display Equation

The two approaches yield close results here, since the sample is sufficiently large.
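The computation in part (a) can be reproduced directly. The sketch below (function name illustrative) implements the arcsine variance-stabilized interval and recovers θL ≈ 0.462 and θU ≈ 0.656:

```python
import math
from statistics import NormalDist

def arcsine_ci(x, n, alpha=0.05):
    """CI for a binomial theta via the variance-stabilizing map g(theta) = 2*arcsin(sqrt(theta)),
    whose asymptotic variance is 1/n; invert with theta = sin(g/2)**2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    g = 2.0 * math.asin(math.sqrt(x / n))
    g_lo = max(0.0, g - z / math.sqrt(n))        # g(theta) ranges over [0, pi]
    g_hi = min(math.pi, g + z / math.sqrt(n))
    return math.sin(g_lo / 2.0) ** 2, math.sin(g_hi / 2.0) ** 2
```

For x = 56 successes in n = 100 trials, this gives the limits quoted in the text.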

(b) Let (X1, Y1), …, (Xn, Yn) be i.i.d. vectors having the bivariate normal distribution, with expectation vector (ξ, η) and covariance matrix

Unnumbered Display Equation

The MLE of ρ is the sample coefficient of correlation r = Σ (Xi − inline)(Yi − inline)/[Σ (Xi − inline)2 · Σ (Yi − inline)2]1/2. By determining the inverse of the Fisher information matrix, one obtains that the asymptotic variance of r is AV{r} = inline(1 − ρ2)2. Thus, if we make the transformation g(ρ) = inline log inline then g′(ρ) = inline, and g(r) = inline log ((1 + r)/(1 − r)) is a variance stabilizing transformation for r, with an asymptotic variance of 1/n. Suppose that in a sample of n = 100 we find a coefficient of correlation r = 0.79. Make the transformation

Unnumbered Display Equation

We obtain on the basis of this transformation the asymptotic limits

Unnumbered Display Equation

The inverse transformation is ρ = (e2g−1)/(e2g + 1). Thus, the confidence interval of ρ has the limits ρL = 0.704 and ρU = 0.853. On the other hand, if we use the formula

Unnumbered Display Equation

we obtain the limits ρL = 0.716 and ρU = 0.864. The two methods yield confidence intervals that are close but not identical; a sample of size 100 is apparently not large enough for full agreement. inline
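Part (b) is the Fisher z-transform interval; a sketch reproducing ρL ≈ 0.704 and ρU ≈ 0.853 (using the asymptotic variance 1/n as in the text, rather than the common finite-sample refinement 1/(n − 3)):

```python
import math
from statistics import NormalDist

def fisher_z_ci(r, n, alpha=0.05):
    """CI for rho via g(r) = arctanh(r) = 0.5*log((1+r)/(1-r)),
    treated as approximately N(g(rho), 1/n); invert with tanh."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    g = math.atanh(r)
    half = z / math.sqrt(n)
    return math.tanh(g - half), math.tanh(g + half)
```

For r = 0.79 and n = 100 this reproduces the interval in the text.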

Example 7.11. In Example 6.6, we determined the confidence limits for the cross–ratio product ρ. We develop here the large sample approximation, according to the two approaches discussed above. Let

Unnumbered Display Equation

Let inlineij = Xij/nij (i, j = 1,2). inlineij is the MLE of θij. Let inlineij = log (θij/(1 − θij)). The MLE of inlineij is inlineij = log (inlineij/(1 − inlineij)). The asymptotic distribution of inlineij is normal with mean inlineij and

Unnumbered Display Equation

Furthermore, the MLE of ω is

Unnumbered Display Equation

Since the Xij, i, j = 1, 2, are mutually independent, so are the terms on the RHS of inline. Accordingly, the asymptotic distribution of inline is normal with expectation ω and asymptotic variance

Unnumbered Display Equation

Since the values θij are unknown we substitute their MLEs. We thus define the standard error of inline as

Unnumbered Display Equation

According to the asymptotic normal distribution of inline, the asymptotic confidence limits for ρ are

Unnumbered Display Equation

where inline is the MLE of ρ = eω. These limits can be easily computed. For a numerical example, consider the following table (Fleiss, 1973, p. 126), which presents the proportions of patients diagnosed as schizophrenic in two studies, one performed in New York and the other in London.

Unnumbered Table

These samples yield the MLE inline = 2.9. The asymptotic confidence limits at level 1 − α = 0.95 are inline(1) = 1.38 and inline(2) = 6.08. This result indicates that the interaction parameter ρ is significantly greater than 1. We now illustrate the other approach, using the variance stabilizing transformation 2 sin −1(inline). Let inlineij = (Xij + 0.5)/(nij + 1) and Yij = 2 sin −1(inlineij). On the basis of these variables, we set the 1 − α confidence limits for ηij = 2 sin −1(inline). These are

Unnumbered Display Equation

For these limits, we directly obtain the asymptotic confidence limits for inlineij that are

Unnumbered Display Equation

where

Unnumbered Display Equation

and

Unnumbered Display Equation

We now show how to construct asymptotic confidence limits for ρ from these asymptotic confidence limits for inlineij.

Define

Unnumbered Display Equation

and

Unnumbered Display Equation

D is approximately equal to inlineAV{inline}. Indeed, from the asymptotic theory of MLEs, inline/4 is approximately the asymptotic variance of inlineij times inline. Accordingly,

Unnumbered Display Equation

and by employing the normal approximation, the asymptotic confidence limits for ρ are

Unnumbered Display Equation

Thus, we obtain the approximate confidence limits for Fleiss’ example, ρ(1) = 1.40 and ρ(2) = 6.25. These limits are close to the ones obtained by the other approach. For further details, see Zacks and Solomon (1976). inline
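The first approach can be written as a short routine. Since the original table is not reproduced here, the counts below are hypothetical and serve only to illustrate the computation (function name illustrative):

```python
import math
from statistics import NormalDist

def log_cross_ratio_ci(x, n, alpha=0.05):
    """CI for rho = exp(omega), where omega is the alternating sum of the logits
    of four independent binomials X[i][j] ~ B(n[i][j], theta[i][j]);
    SE(omega_hat)^2 = sum over cells of 1/(n_ij * theta_hat_ij * (1 - theta_hat_ij))."""
    signs = {(0, 0): 1, (0, 1): -1, (1, 0): -1, (1, 1): 1}
    omega, var = 0.0, 0.0
    for (i, j), s in signs.items():
        t = x[i][j] / n[i][j]
        omega += s * math.log(t / (1.0 - t))
        var += 1.0 / (n[i][j] * t * (1.0 - t))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = math.sqrt(var)
    return math.exp(omega), math.exp(omega - z * se), math.exp(omega + z * se)

# Hypothetical 2x2 counts (NOT the Fleiss data): x successes out of n per cell.
rho_hat, rho_lo, rho_hi = log_cross_ratio_ci([[40, 20], [15, 30]],
                                             [[100, 100], [100, 100]])
```

The interval is symmetric on the log scale, as in the development above.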

Example 7.12. Let X1, X2, …, Xn be i.i.d. random variables having the gamma distribution G(1,ν), 0 < ν < ∞. This is a one–parameter exponential type family, with canonical p.d.f.

Unnumbered Display Equation

Here, K(ν) = log Γ (ν).

The MLE of ν is the root of the equation

Unnumbered Display Equation

where inlinen = inline inline log (Xi). The function Γ′(ν)/Γ (ν) is known as the di–gamma, or psi, function inline (ν) (see Abramowitz and Stegun, 1968, p. 259). inline (ν) is tabulated for 1 ≤ ν ≤ 2 in increments of Δ = 0.05. For values of ν smaller than 1 or greater than 2, use the recursive equation

Unnumbered Display Equation

The values of inlinen can be determined by numerical interpolation.
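A numerical alternative to table interpolation is to evaluate ψ(ν) by the recursion ψ(ν) = ψ(ν + 1) − 1/ν together with the standard asymptotic series, and then to solve the likelihood equation ψ(ν) = (1/n) Σ log Xi by bisection (ψ is increasing). A sketch under these assumptions:

```python
import math

def digamma(x):
    """psi(x) via the recursion psi(x) = psi(x + 1) - 1/x and an asymptotic series."""
    acc = 0.0
    while x < 6.0:            # shift the argument into the range where the series is accurate
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1/12 - inv2 * (1/120 - inv2 / 252))

def mle_nu(mean_log_x, lo=1e-6, hi=1e6):
    """Solve psi(nu) = (1/n) * sum(log X_i) by bisection; psi is strictly increasing."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if digamma(mid) < mean_log_x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

As a check, ψ(1) = −γ ≈ −0.57722, and feeding ψ(2) back into the solver returns ν = 2.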

The function inline (ν) is analytic on the complex plane, excluding the points ν = 0, −1, −2, …. The nth order derivative of inline (ν) is

Unnumbered Display Equation

Accordingly,

Unnumbered Display Equation

To assess the normal and the Edgeworth approximations to the distribution of inlinen, we simulated 1000 independent random samples of size n = 20 from the gamma distribution with ν = 1. In this case I(1) = 1.64493, β1 = −1.1395, and β2 − 3 = 2.4. In Table 7.1, we present some empirical quantiles of the simulations. We see that the Edgeworth approximation is better than the normal one for all standardized values of inlinen between the 0.2 and 0.8 quantiles. In the tails of the distribution, better results could be obtained by the saddlepoint approximation. inline

Table 7.1 Normal and Edgeworth Approximations to the Distribution of inline20, n = 20, ν = 1

Table07-1

Example 7.13. Let X1, X2, …, Xn be i.i.d. random variables having the exponential distribution X1E(inline), 0 < inline < ∞. This is a one–parameter exponential family with canonical p.d.f.

Unnumbered Display Equation

where U(x) = −x and K(inline) = −log (inline).

The MLE of inline is inlinen = 1/inlinen. The p.d.f. of inlinen is obtained from the density of inlinen and is

Unnumbered Display Equation

for 0 < x < ∞.

The approximation to the p.d.f. according to (7.6.9) yields

Unnumbered Display Equation

Substituting cn = nne−n/Γ (n), we recover the exact density. inline

Example 7.14. Let X1, X2, …, Xn be i.i.d. random variables, having a common normal distribution N(θ, 1). Consider the problem of testing the hypothesis H0: θ ≤ 0 against H1: θ > 0. We have seen that the uniformly most powerful (UMP) test of size α is

Unnumbered Display Equation

where inlinen = inline inline Xi and Z1 − α = Φ −1 (1 − α). The power function of this UMP test is

Unnumbered Display Equation

Let θ1 > 0 be specified. The number of observations required so that inlinen(θ1) ≥ γ is

Unnumbered Display Equation

Note that

(i) inline inlinen(θ1) = 1 for each θ1 > 0
and
(ii) if δ > 0 and θ1 = inline then

Unnumbered Display Equation

where 0 < α < inline < 1.

Suppose that one wishes to consider a more general model, in which the p.d.f. of X1 is f(x; θ), −∞ < θ < ∞, where f(x; θ) is symmetric about θ but not necessarily equal to inline (x), and Vθ {X} = σ2 for all −∞ < θ < ∞. We consider the hypotheses H0: θ ≤ 0 against H1: θ > 0.

Due to the CLT, one can consider the sequence of test statistics

Unnumbered Display Equation

where an inline Z1 − α as n→ ∞, and the alternative sequence

Unnumbered Display Equation

where Me is the sample median

Unnumbered Display Equation

According to Theorem 1.13.7, inline(Meθ) inline N inline as n→ ∞. Thus, the asymptotic power functions of these tests are

Unnumbered Display Equation

and

Unnumbered Display Equation

Both inline(θ1) and inline (θ1) converge to 1 as n→ ∞, for any θ1 > 0, which establishes their consistency. We wish, however, to compare the behavior of the sequences of power functions for θn = inline. Note that each θ1,n = inline belongs to the alternative; but since θ1,n → 0 as n→ ∞, such alternatives are called local alternatives. Here we get

Unnumbered Display Equation

and

Unnumbered Display Equation

To ensure that inline* = inline** one has to consider for inline(2) a sequence of alternatives inline with sample size n′ = inline so that

Unnumbered Display Equation

The Pitman ARE of inline to inline is defined as the limit of n/n′(n) as n→ ∞. In the present example,

Unnumbered Display Equation

Under the original model X ∼ N(θ, 1), f(0) = inline, and the ARE of inline(2) to inline(1) is 2/π ≈ 0.637. On the other hand, if f(x) = inline e−|x|, which is the Laplace distribution, then the ARE of inline(2) to inline(1) is 2. inline
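The normal-model value 0.637 quoted above is 2/π, which follows from the efficacy ratio 4σ2f2(0); a minimal check:

```python
import math

def pitman_are_median_vs_mean(f0, sigma2):
    """Pitman ARE of the median test to the mean test: the efficacy ratio
    4 * sigma^2 * f(0)^2, where f is the density of X - theta and sigma^2 its variance."""
    return 4.0 * sigma2 * f0 ** 2

# Standard normal model: f(0) = 1/sqrt(2*pi), sigma^2 = 1, so the ARE is 2/pi.
are_normal = pitman_are_median_vs_mean(1.0 / math.sqrt(2.0 * math.pi), 1.0)
```

The value 2/π ≈ 0.637 quantifies the efficiency loss of the median test under normality.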

Example 7.15. In Example 7.3, we discussed the Wilcoxon signed–rank test of H0: θ ≤ θ0 versus H1: θ > θ0, when the distribution function F is absolutely continuous and symmetric around θ. We derive here Pitman's asymptotic efficiency of this test relative to the t–test. The t–test is valid only in cases where inline = V{X} and 0 < inline < ∞. The t–statistic is tn = inline, where inline is the sample variance. Since Sn inline σf, as n→ ∞, we consider

Unnumbered Display Equation

The asymptotic efficacy of the t–test is

Unnumbered Display Equation

where inline is the variance of X, under the p.d.f. f(x). Indeed, μ (θ) = θ.

Consider the Wilcoxon signed–rank statistic Tn, given by (7.1.3). The test function, for large n, is given by (7.1.8). For this test

Unnumbered Display Equation

where p2(θ) is given in Example 7.3. Thus,

Unnumbered Display Equation

Hence,

Unnumbered Display Equation

Using σ2(0) = inline, we obtain the asymptotic efficacy of

Unnumbered Display Equation

as n→ ∞. Thus, the Pitman ARE of Tn versus tn is

(7.9.16) numbered Display Equation

Thus, if f(x) = inline (x) (standard normal), then ARE (Tn, tn) = 3/π ≈ 0.9549. On the other hand, if f(x) = inline exp {−|x|} (standard Laplace), then ARE (Tn, tn) = 1.5. These results show that the Wilcoxon signed–rank test is an asymptotically highly efficient nonparametric test. inline
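The two quoted ARE values can be verified numerically from the formula ARE(Tn, tn) = 12σ2(∫ f2(x)dx)2, using simple numerical integration (the integration range and grid size below are arbitrary choices):

```python
import math

def integral_f_squared(f, a=-30.0, b=30.0, m=60000):
    """Trapezoidal approximation of the integral of f(x)^2 over [a, b]."""
    h = (b - a) / m
    s = 0.5 * (f(a) ** 2 + f(b) ** 2) + sum(f(a + i * h) ** 2 for i in range(1, m))
    return s * h

def are_wilcoxon_vs_t(f, sigma2):
    """Pitman ARE of the Wilcoxon signed-rank test to the t-test:
    12 * sigma^2 * (integral of f^2)^2."""
    return 12.0 * sigma2 * integral_f_squared(f) ** 2

phi = lambda x: math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)  # standard normal density
laplace = lambda x: 0.5 * math.exp(-abs(x))                        # standard Laplace, variance 2
```

This recovers 3/π ≈ 0.9549 for the normal density and 1.5 for the Laplace density.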

Example 7.16. Let X1, …, Xn be i.i.d. random variables, having a common Cauchy distribution, with a location parameter θ, −∞ < θ < ∞, i.e.,

Unnumbered Display Equation

We derive an asymptotic (large n) confidence interval for θ. Let inlinen be the sample median, i.e.,

Unnumbered Display Equation

Note that, due to the symmetry of f(x;θ) around θ, θ = F−1inline. Moreover,

Unnumbered Display Equation

Hence, the (1 − α) confidence limits for θ are

Unnumbered Display Equation inline
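Since f(θ; θ) = 1/π for the Cauchy density, the asymptotic variance of the median is π2/(4n) and the interval has half-width z1−α/2 π/(2√n). This can be checked by simulation; the sketch below (sample size and replication count are arbitrary) estimates the coverage probability under the Cauchy model:

```python
import math, random

def cauchy_median_ci(xs, alpha=0.05):
    """Asymptotic CI for the Cauchy location theta based on the sample median:
    sqrt(n)(Me - theta) -> N(0, 1/(4 f(theta)^2)) with f(theta) = 1/pi,
    so the half-width is z_{1-alpha/2} * pi / (2 sqrt(n))."""
    n = len(xs)
    me = sorted(xs)[n // 2]          # n is odd below, so this is the sample median
    half = 1.959964 * math.pi / (2.0 * math.sqrt(n))   # z_{0.975} hard-coded
    return me - half, me + half

def coverage(theta, n, reps, seed=3):
    """Monte Carlo coverage of the interval; Cauchy draws via tan(pi*(U - 1/2))."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xs = [theta + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]
        lo, hi = cauchy_median_ci(xs)
        hits += lo <= theta <= hi
    return hits / reps
```

For moderate n the empirical coverage is close to the nominal 0.95.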

PART III: PROBLEMS

Section 7.1

7.1.1 Let Xi = α + β zi + inlinei, i = 1, …,n, be a simple linear regression model, where z1, …,zn are prescribed constants, and inline1, …, inlinen are independent random variables with E{inlinei} = 0 and V{inlinei} = σ2, 0 < σ2 < ∞, for all i = 1, …,n. Let inline and inlinen be the LSEs of α and β.

(i) Show that if inline (ziinlinen)2 → ∞, as n→ ∞, then inlinen inline β, i.e., inlinen is consistent.
(ii) What is a sufficient condition for the consistency of inline?

7.1.2 Suppose that X1, X2, …, Xn, … are i.i.d. random variables and 0 < E{inline} < ∞. Give a strongly consistent estimator of the kurtosis coefficient β2 = inline.

7.1.3 Let X1, …, Xk be independent random variables having binomial distributions B(n, θi), i = 1, …, k. Consider the null hypothesis H0: θ1 = ··· = θk against the alternative H1: inline (θiinline)2 > 0, where inline= inline θi. Let pi = Xi/n and inline. Show that the test function

Unnumbered Display Equation

has a size αn converging to α as n→ ∞. Show that this test is consistent.

7.1.4 In continuation of Problem 3, define Yi = 2 sin −1inline, i = 1, …, k.

(i) Show that the asymptotic distribution of Yi, as n→ ∞, is N(ηi, inline), where ηi = 2 sin −1 inline.
(ii) Show that Q = n inline (Yi − inline)2, where inline = inlineYi, is distributed asymptotically (as n→ ∞) like χ2[k−1; λ(θ)], where λ(θ) = inline inline (ηi − inline)2; inline = inlineηi. Thus, prove the consistency of the test.
(iii) Derive the formula for computing the asymptotic power of the test inline (X) = I{Qinline [k−1]}.
(iv) Assuming that inline(ηi − inline)2 is independent of n, how large should n be so that the probability of rejecting H0 when inline(ηi − inline)2 ≥ 10−1 is not smaller than 0.9?

Section 7.2

7.2.1 Let X1, X2, …, Xn, … be i.i.d. random variables, X1G(1,ν), 0 < νν* < ∞. Show that all conditions of Theorem 7.2.1 are satisfied, and hence the MLE, inlinen inline ν as n→ ∞ (strongly consistent).

7.2.2 Let X1, X2, …, Xn, … be i.i.d. random variables, X1β (ν, 1), 0 < ν < ∞. Show that the MLE, inlinen, is strongly consistent.

7.2.3 Consider the Hardy–Weinberg genetic model, in which (J1, J2) ∼ MN(n,(p1(θ),p2(θ))), where p1(θ) = θ2 and p2(θ) = 2θ (1 − θ), 0 < θ < 1. Show that the MLE of θ, inlinen, is strongly consistent.

7.2.4 Let X1, X2, …, Xn be i.i.d. random variables from G(λ, 1), 0 < λ < ∞. Show that the following estimators inline (inlinen) are consistent estimators of ω (λ):

(i) inline(inlinen) = −log inlinen, ω (λ) = log λ;
(ii) inline(inlinen) = inline, ω (λ) = 1/λ2;
(iii) inline(inlinen) = exp {−1/inlinen}, ω (λ) = exp { −λ}.

7.2.5 Let X1, …, Xn be i.i.d. from N(μ, σ2), −∞ < μ < ∞, 0 < σ < ∞. Show that

(i) log (1 + inline) is a consistent estimator of log (1 + μ2);
(ii) inline (inlinen/S) is a consistent estimator of inline (μ /σ), where S2 is the sample variance.

Section 7.3

7.3.1 Let (Xi, Yi), i = 1, …,n be i.i.d. random vectors, where

Unnumbered Display Equation

−∞ < ξ < ∞, 0 < η < ∞, 0 < σ1, σ2 < ∞, −1 < ρ < 1. Find the asymptotic distribution of Wn = inlinen/ inlinen, where inlinen = inline inline Xi and inlinen = inline inline Yi.

7.3.2 Let X1, X2, …, Xn, … be i.i.d. random variables having a Cauchy distribution with location parameter θ, i.e.,

Unnumbered Display Equation

Let Me be the sample median, or Me = inlineinline. Is Me a BAN estimator?

7.3.3 Derive the asymptotic variances of the MLEs of Problems 1–3 of Section 5.6 and compare the results with the large sample approximations of Problem 4 of Section 5.6.

7.3.4 Let X1, …, Xn be i.i.d. random variables, where the distribution of X1 is N(μ, σ2). Derive the asymptotic variance of the MLE of Φ(μ /σ).

7.3.5 Let X1, …, Xn be i.i.d. random variables having a log–normal distribution LN(μ, σ2). What is the asymptotic covariance matrix of the MLEs of ξ = exp {μ + σ2/2} and D2 = ξ2(exp {σ2} − 1)?

Section 7.4

7.4.1 Let X1, X2, …, Xn be i.i.d. random variables having a normal distribution N(μ, σ2), −∞ < μ < ∞, 0 < σ < ∞. Let θ = eμ.

(i) What is the bias of the MLE inlinen?
(ii) Let inlinen be the bias adjusted MLE. What is inlinen, and what is the order of its bias, in terms of n?
(iii) What is the second order deficiency coefficient of inlinen?

7.4.2 Let X1, X2, …, Xn be i.i.d. random variables, X1 ∼ Ginline, 0 < β < ∞. Let θ = e−1/β, 0 < θ < 1.

(i) What is the MLE of θ?
(ii) Use the delta method to find the bias of the MLE, inlinen, up to inline.
(iii) What is the second–order deficiency coefficient of the bias adjusted MLE?

7.4.3 Let X1, X2, …, Xn be i.i.d. random variables having a one–parameter canonical exponential type p.d.f. Show that the first order bias term of the MLE inlinen is

Unnumbered Display Equation

Section 7.5

7.5.1 In a random sample of size n = 50 of random vectors (X, Y) from a bivariate normal distribution, −∞ < μ, η < ∞, 0 < σ1, σ2 < ∞, −1 < ρ < 1, the MLE of ρ is inline = 0.85. Apply the variance stabilizing transformation to determine asymptotic confidence limits of inline = sin −1(ρ); −inline < inline < inline.

7.5.2 Let inline be the sample variance in a random sample from a normal distribution N(μ, σ2). Show that the asymptotic variance of

Unnumbered Display Equation

Suppose that n = 250 and inline = 17.39. Apply the above transformation to determine asymptotic confidence limits, at level 1 − α = 0.95, for σ2.

7.5.3 Let X1, …, Xn be a random sample (i.i.d.) from N(μ, σ2); −∞ < μ < ∞, 0 < σ2 < ∞.

(i) Show that the asymptotic variance of the MLE of σ is σ2/2n.
(ii) Determine asymptotic confidence intervals at level (1 − α) for ω = μ + Zγ σ.
(iii) Determine asymptotic confidence intervals at level (1 − α) for μ /σ and for Φ (μ /σ).

7.5.4 Let X1, …, Xn be a random sample from a location parameter Laplace distribution; −∞ < μ < ∞. Determine a (1 − α)–level asymptotic confidence interval for μ.

Section 7.6

7.6.1 Let X1, X2, …, Xn be i.i.d. random variables having a one–parameter Beta (ν, ν) distribution.

(i) Write the common p.d.f. f(x;ν) in a canonical exponential type form.
(ii) What is the MLE, inlinen?
(iii) Write the Edgeworth expansion approximation to the distribution of the MLE inlinen.

7.6.2 In continuation of the previous problem, derive the p*–formula of the density of the MLE, inlinen?

PART IV: SOLUTION OF SELECTED PROBLEMS

7.1.3 Let θ′ = (θ1, …, θk) and p′ = (p1, …, pk). Let D = (diag(θi(1 − θi)), i = 1, …,k) be a k × k diagonal matrix. Generally, we denote XnAN inline if inline(Xnξ) inlineN(0, V) as n→ ∞. [AN(·, ·) stands for ‘asymptotically normal’]. In the present case, inline.

Let H0: θ1 = θ2 = ··· =θk = θ. Then, if H0 is true,

Unnumbered Display Equation

Now, inline (piinlinek)2 = pinlinep, where Jk = 1k1′k. Since inline is idempotent, of rank (k−1), npinlinep inline θ (1 − θ)χ2[k−1]. Moreover, inlinek = inline piθ a.s., as n→ ∞. Thus, by Slutsky’s Theorem

Unnumbered Display Equation

and

Unnumbered Display Equation

If H0 is not true, inline (θiinlinek)2 > 0. Also,

Unnumbered Display Equation

Thus, under H1,

Unnumbered Display Equation

Thus, the test is consistent.

7.2.3 The MLE of θ is inlinen =inline. J1B(n, θ2). Hence, inline and inline. Thus, inlineinlinen = θ a.s.

7.3.1 By SLLN, inline a.s.

Unnumbered Display Equation

where

Unnumbered Display Equation

Thus, inline.

7.3.2 As shown in Example 7.16,

Unnumbered Display Equation

Also,

Unnumbered Display Equation

On the other hand, the Fisher information is In(θ) = inline. Thus, AV(inlinen) = inline. Thus, inlinen is not a BAN estimator.
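The comparison can be made explicit: AV(Me) = π2/(4n) while the BAN variance implied by I(θ) is 2/n, so the asymptotic efficiency of the sample median is 8/π2 ≈ 0.81 < 1. A minimal sketch:

```python
import math

# Asymptotic variance of the sample median for the Cauchy location model:
# n * AV(Me) = 1 / (4 f^2(theta)) = pi^2 / 4, since f(theta) = 1/pi.
av_median_times_n = math.pi ** 2 / 4.0

# The Fisher information per observation for the Cauchy location parameter is 1/2,
# so a BAN estimator has n * asymptotic variance = 2.
crlb_times_n = 2.0

# Asymptotic efficiency of the median: 8/pi^2, about 0.81, strictly below 1,
# confirming that Me is not a BAN estimator.
eff = crlb_times_n / av_median_times_n
```

The efficiency deficit is the price of the median's robustness here.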

7.4.2

(i) X/β ∼ G(1, 1). Hence, the MLE of β is inlinen. It follows that the MLE of θ = e−1/β is inlinen = e−1/inlinen.
(ii)

Unnumbered Display Equation

Hence,

Unnumbered Display Equation

The bias adjusted estimator is

Unnumbered Display Equation

(iii) Let f(x) = e−1/x inline. Then

Unnumbered Display Equation

It follows that

Unnumbered Display Equation

Accordingly, the second–order deficiency coefficient of inlinen is

Unnumbered Display Equation

7.6.1 X1, …, Xn are i.i.d. like Beta (ν, ν), 0 < ν < ∞.

(i)

Unnumbered Display Equation

where K(ν) = log B(ν, ν).

(ii) The likelihood function is equivalent to

Unnumbered Display Equation

The log likelihood is

Unnumbered Display Equation

Note that the derivative of K(ν) is

Unnumbered Display Equation

It follows that the MLE of ν is the root of

Unnumbered Display Equation

The function inline log Γ (ν) is also called the psi function, i.e., inline (ν) = inline log Γ (ν). As shown in Abramowitz and Stegun (1968, p. 259), inline (2ν) − inline (ν) = inline(inline inlineinline (ν))+log 2. Also, −inlineinline log (Xi(1 − Xi)) > log 4. Thus, the MLE is the value of ν for which

Unnumbered Display Equation

(iii) Since the Beta (ν, ν) distribution is symmetric around x = inline, X and 1 − X have the same distribution, and hence

Unnumbered Display Equation

Thus, since X1, …, Xn are i.i.d., the Fisher information is I(ν) = 4V{log X}. The first four central moments of Beta (ν, ν) are inline = 0; inline = inline; inline = 0 and inline. Thus,

Unnumbered Display Equation

and

Unnumbered Display Equation

It follows that the Edgeworth asymptotic approximation to the distribution of the MLE, inlinen, is

Unnumbered Display Equation
