13
NONPARAMETRIC STATISTICAL INFERENCE

13.1 INTRODUCTION

In all the problems of statistical inference considered so far, we assumed that the distribution of the random variable being sampled is known except, perhaps, for some parameters. In practice, however, the functional form of the distribution is seldom, if ever, known. It is therefore desirable to devise methods that are free of this assumption concerning the distribution. In this chapter we study some procedures that are commonly referred to as distribution-free or nonparametric methods. The term “distribution-free” refers to the fact that no assumptions are made about the underlying distribution except that the distribution function being sampled is absolutely continuous. The term “nonparametric” refers to the fact that there are no parameters involved in the traditional sense of the term “parameter” used thus far. To be sure, there is a parameter which indexes the family of absolutely continuous DFs, but it is not numerical, and hence the parameter set cannot be represented as a subset of images for any images. The restriction to absolutely continuous distribution functions is a simplifying assumption that allows us to use the probability integral transformation (Theorem 5.3.1) and the fact that ties occur with probability 0.

Section 13.2 is devoted to the problem of unbiased (nonparametric) estimation. We develop the theory of U-statistics since many estimators and test statistics may be viewed as U-statistics. Sections 13.3 through 13.5 deal with some common hypothesis-testing problems. In Section 13.6 we investigate applications of order statistics in nonparametric methods. Section 13.7 considers underlying assumptions in some common parametric problems and the effect of relaxing these assumptions.

13.2 U-STATISTICS

In Chapter 6 we encountered several nonparametric estimators. For example, the empirical DF defined in Section 6.3 as an estimator of the population DF is distribution-free, and so also are the sample moments as estimators of the population moments. These are examples of what are known as U-statistics which lead to unbiased estimators of population characteristics. In this section we study the general theory of U-statistics. Although the thrust of this investigation is unbiased estimation, many of the U-statistics defined in this section may be used as test statistics.

Let X1, X2,…, Xn be iid RVs with common law images(X), and let images be the class of all possible distributions of X, consisting of all absolutely continuous distributions, all discrete distributions, or subclasses of these.

We have already encountered many examples of complete statistics or complete families of distributions in Chapter 8.

The following result is stated without proof. For the proof we refer to Fraser [32, pp. 27–30, 139–142].

Clearly, the U-statistic defined in (3) is symmetric in the Xi’s, and

(4)images

Moreover, U(X) is a function of the complete sufficient statistic X(1), X(2),…,X(n). It follows from Theorem 8.4.6 that it is the UMVUE of its expected value.

For estimating μ3(F), a symmetric kernel is images so that the corresponding U-statistic is

images

For estimating F(x), a symmetric kernel is images, so the corresponding U-statistic is

images

and for estimating images the U-statistic is

images

Finally, for estimating images the U-statistic is

images

Now note that the numerator has images factors involving n, while the denominator has m such factors, so that for images the ratio involving n goes to 0 as images. For images, this ratio images and

images

as images
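To make the definition concrete, a U-statistic is simply the average of a symmetric kernel over all subsets of the sample whose size equals the kernel’s degree. The following minimal Python sketch (brute force, suitable only for small n; the data are hypothetical) computes the U-statistic for the degree-2 variance kernel and for the degree-1 indicator kernel used above to estimate F(x):

```python
from itertools import combinations

def u_statistic(xs, kernel, m):
    """Average of a symmetric kernel over all subsets of size m (its degree)."""
    subsets = list(combinations(xs, m))
    return sum(kernel(*s) for s in subsets) / len(subsets)

xs = [2.1, 0.4, 1.7, 3.0, 2.5]   # hypothetical data

# Degree-2 kernel h(x1, x2) = (x1 - x2)**2 / 2: its U-statistic is the
# usual unbiased sample variance S^2.
s2 = u_statistic(xs, lambda a, b: (a - b) ** 2 / 2, 2)

# Degree-1 kernel I(X <= x0): its U-statistic is the empirical DF at x0.
x0 = 2.0
f_hat = u_statistic(xs, lambda a: 1.0 if a <= x0 else 0.0, 1)
print(s2, f_hat)
```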

Finally we state, without proof, the following result due to Hoeffding [45], which establishes the asymptotic normality of a suitably centered and normed U-statistic. For proof we refer to Lehmann [61, pp. 364–365] or Randles and Wolfe [85, p. 82].

The concept of U-statistics can be extended to multiple random samples. We will restrict ourselves to the case of two samples. Let images and images be two independent random samples from DFs F and G, respectively.

The statistic T in Definition 8 is called a kernel of g, and a symmetrized version of T, Ts, is called a symmetric kernel of g. Without loss of generality, therefore, we assume that the two-sample kernel T in (9) is a symmetric kernel.

Finally we state, without proof, the two-sample analog of Theorem 3 which establishes the asymptotic normality of the two-sample U-statistic defined in (10).

PROBLEMS 13.2

  1. Let images be a probability space, and let images. Let A be a Borel subset of images, and consider the parameter images. Is d estimable? If so, what is the degree? Find the UMVUE for d, based on a sample of size n, assuming that images is the class of all continuous distributions.
  2. Let X1, X2,…, Xm and Y1, Y2,…, Yn be independent random samples from two absolutely continuous DFs. Find the UMVUEs of (a) E{XY} and (b) images.
  3. Let (X1, Y1), (X2, Y2),…,(Xn, Yn) be a random sample from an absolutely continuous distribution. Find the UMVUEs of (a) E(XY) and (b) images.
  4. Let T(X1, X2,…, Xn) be a statistic that is symmetric in the observations. Show that T can be written as a function of the order statistic. Conversely, show that if T(X1, X2,…,Xn) can be written as a function of the order statistic, then T is symmetric in the observations.
  5. Let X1, X2, …,Xn be a random sample from an absolutely continuous images. Find U-statistics for images. Find the corresponding expressions for the variance of the U-statistic in each case.
  6. In Example 3, show that μ2(F) is not estimable with one observation. That is, show that the degree of μ2(F) where images, the class of all distributions with finite second moment, is 2.
  7. Show that for images.
  8. Let X1, X2, …,Xn be a random sample from an absolutely continuous images. Let
    images

    Find the U-statistic estimator of g(F) and its variance.

13.3 SOME SINGLE-SAMPLE PROBLEMS

Let X1, X2,…,Xn be a random sample from a DF F. In Section 13.2 we studied properties of U-statistics as nonparametric estimators of parameters g(F). In this section we consider some nonparametric tests of hypotheses. Often the test statistic may be viewed as a function of a U-statistic.

13.3.1 Goodness-of-Fit Problem

The problem of fit is to test the hypothesis that the sample comes from a specified DF F0 against the alternative that it is from some other DF F, where images for some images. In Section 10.3 we studied the chi-square test of goodness of fit for testing images. Here we consider the Kolmogorov–Smirnov test of H0. Since H0 concerns the underlying DF of the X’s, it is natural to compare the U-statistic estimator of images with the specified DF F0 under H0. The U-statistic for images is the empirical images.

Since F(X(i)) is the ith order statistic of a sample from U(0,1) irrespective of what F is, as long as it is continuous, we see that the distribution of images is independent of F. Similarly,

images

and the result follows.

Without loss of generality, therefore, we assume that F is the DF of a U(0,1) RV.

We will not prove this result here. Let Dn, α be the upper α-percent point of the distribution of Dn, that is, images. The exact distribution of Dn for selected values of n and α has been tabulated by Miller [74], Owen [79], and Birnbaum [9]. The large-sample distribution of Dn was derived by Kolmogorov [53], and we state it without proof.
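Because images is a step function, the supremum defining Dn is attained at the sample points, which makes the statistic easy to compute exactly. A minimal Python sketch, with a hypothetical sample tested against the U(0,1) DF F0(x) = x:

```python
def ks_statistic(xs, f0):
    """Two-sided statistic D_n = sup_x |F*_n(x) - F0(x)| for a continuous F0."""
    xs = sorted(xs)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        u = f0(x)
        d = max(d, i / n - u, u - (i - 1) / n)  # the sup occurs at sample points
    return d

sample = [0.12, 0.27, 0.41, 0.58, 0.66, 0.80, 0.91]  # hypothetical data
print(ks_statistic(sample, lambda x: x))             # F0 = U(0,1)
```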

The statistics images and images have the same distribution because of symmetry, and their common distribution is given by the following theorem.

Tables for the critical values images where images, are also available for selected values of n and α; see Birnbaum and Tingey [8]. Table ST7 at the end of this book gives images and Dn, α for some selected values of n and α. For large samples Smirnov [108] showed that

In fact, in view of (9), the statistic images has a limiting χ2(2) distribution. Indeed, images if and only if images, and the result follows since

images

so that

images

which is the DF of a χ2 (2) RV.

It is worthwhile to compare the chi-square test of goodness of fit and the Kolmogorov–Smirnov test. The latter treats individual observations directly, whereas the former discretizes the data and sometimes loses information through grouping. Moreover, the Kolmogorov–Smirnov test is applicable even in the case of very small samples, but the chi-square test is essentially for large samples.

The chi-square test can be applied when the data are discrete or continuous, but the Kolmogorov–Smirnov test assumes continuity of the DF. This means that the latter test provides a more refined analysis of the data. If the distribution is actually discontinuous, the Kolmogorov–Smirnov test is conservative in that it favors H0.

We next turn our attention to some other uses of the Kolmogorov–Smirnov statistic. Let X1, X2,…,Xn be a sample from a DF F, and let images be the sample DF. The estimate images of F for large n should be close to F. Indeed,

(10)images

and, since images, we have

(11)images

Thus images can be made close to F with high probability by choosing λ and n large enough. The Kolmogorov–Smirnov statistic enables us to determine the smallest n such that the error in estimation never exceeds a fixed value ε with a large probability 1 – α. Since

(12)images

images; and, given ε and α, we can read n from the tables. For large n, we can use the asymptotic distribution of Dn and solve images for n.
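As a sketch of that sample-size calculation, the following Python code bisects the Kolmogorov limiting DF K(λ) = 1 − 2Σ(−1)^(k−1) exp(−2k²λ²) (the large-sample law referred to above) to find the critical value λα with K(λα) = 1 − α, and then returns the smallest n with λα/√n ≤ ε. The series truncation and the bracketing interval are implementation choices, not part of the theory:

```python
import math

def kolmogorov_cdf(lam, terms=100):
    """Limiting DF of sqrt(n) * D_n (Kolmogorov), truncated after `terms` terms."""
    return 1 - 2 * sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                       for k in range(1, terms + 1))

def smallest_n(eps, alpha):
    """Smallest n with P(D_n > eps) <= alpha, via the asymptotic critical value."""
    lo, hi = 0.3, 3.0                      # bracket for lambda_alpha
    while hi - lo > 1e-10:                 # bisect K(lambda) = 1 - alpha
        mid = (lo + hi) / 2
        if kolmogorov_cdf(mid) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return math.ceil((hi / eps) ** 2)

print(smallest_n(0.05, 0.05))   # about (1.3581 / 0.05)**2, i.e., 738
```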

We can also form confidence bounds for F. Given α and n, we first find Dn,α such that

(13)images

which is the same as

images

Thus

(14)images

Define

(15)images

and

(16)images

Then the region between Ln(x) and Un(x) can be used as a confidence band for F(x) with associated confidence coefficient 1 – α.
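A small sketch of the band (15)–(16) in Python, evaluated at the order statistics (where images jumps); the critical value dα = Dn,α would be taken from tables such as Table ST7, and the numbers below are purely illustrative:

```python
def confidence_band(xs, d_alpha):
    """(x, L_n(x), U_n(x)) at each order statistic, per (15) and (16)."""
    xs = sorted(xs)
    n = len(xs)
    return [(x, max(i / n - d_alpha, 0.0), min(i / n + d_alpha, 1.0))
            for i, x in enumerate(xs, start=1)]

# d_alpha = D_{n, alpha} would come from Table ST7; 0.41 is illustrative only.
print(confidence_band([0.2, 0.5, 0.9], 0.41))
```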

13.3.2 Problem of Location

Let X1,X2,…,Xn be a sample of size n from some unknown DF F. Let p be a positive real number, images, and let images denote the quantile of order p for the DF F. In the following analysis we assume that F is absolutely continuous. The problem of location is to test images, a given number, against one of the alternatives images and images. The problem of location and symmetry is to test images and F is symmetric, against images or F is not symmetric.

We consider two tests of location. First, we describe the sign test.

13.3.2.1 The Sign Test

Let X1,X2,…,Xn be iid RVs with common PDF f. Consider the hypothesis testing problem

(17)images

where images is the quantile of order p of the PDF f, images. Let images. Then the corresponding U-statistic is given by

images

the number of positive elements in X1 − images, X2 − images,…, Xn − images. Clearly, P(Xi = images) = 0. Fraser [32, pp. 167–170] has shown that a UMP test of H0 against H1 is given by

(18)images

where c and γ are chosen from the size restriction

Note that, under H0, images, so that images and images. The same test is UMP for images against images. For the two-sided case, Fraser [32, p. 171] shows that the two-sided sign test is UMP unbiased.

If, in particular, images is the median of f, then images under H0. In this case one can also use the sign test to test images, F is symmetric.

For large n one can use the normal approximation to binomial to find c and γ in (19).

We have to find c and γ such that

images

From the table of cumulative binomial distribution (Table ST1) for images, images, we see that images. Then γ is given by

images

Thus

images

In our case the number of positive signs among xi – 195, i = 1, 2,..., 12, is 7, so we reject H0 that the upper quartile is ≤ 195.
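As a check on this conclusion, note that under H0 the number of positive signs is (at most) b(12, 1/4) in distribution, so the attained P-value is the binomial upper tail P{S ≥ 7}. A Python sketch, taking the example’s n = 12 and hypothesized upper quartile 195 as given:

```python
from math import comb

def binom_upper_tail(k, n, p):
    """P(S >= k) for S ~ b(n, p)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))

# Under H0 (upper quartile <= 195), each observation exceeds 195 with
# probability at most 1/4, so S is (at most) b(12, 1/4).
print(binom_upper_tail(7, 12, 0.25))   # ~= 0.0143, consistent with rejection
```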

The single-sample sign test described above can easily be modified to apply to sampling from a bivariate population. Let (X1, Y1), (X2, Y2),…,(Xn, Yn) be a random sample from a bivariate population. Let images, and assume that Zi has an absolutely continuous DF. Then one can test hypotheses concerning the order parameters of Z by using the sign test. A hypothesis of interest here is that Z has a given median images. Without loss of generality let images. Then images, that is, images. Note that med(Z) is not necessarily equal to med(X) – med(Y), so that H0 is not that images but that images. The sign test is UMP against one-sided alternatives and UMP unbiased against two-sided alternatives.

Using the two-sided sign test, we cannot reject H0 at level α = 0.05, since images. The RVs Zi can be considered to be normally distributed, so that under H0 the common mean of the Zi's is 0. Using a paired-comparison t-test on the data, we can show that images for 9 d.f., so we cannot reject the hypothesis of equality of means of X and Y at level images.

Finally, we consider the Wilcoxon signed-ranks test.

13.3.2.2 The Wilcoxon Signed-Ranks Test

The sign test for median and symmetry loses information since it ignores the magnitude of the difference between the observations and the hypothesized median. The Wilcoxon signed-ranks test provides an alternative test of location (and symmetry) that also takes into account the magnitudes of these differences.

Let X1, X2,…, Xn be iid RVs with common absolutely continuous DF F, which is symmetric about the median images. The problem is to test images against the usual one- or two-sided alternatives. Without loss of generality, we assume that images. Then images for all images. To test images or images, we first arrange |X1|, |X2|,…,|Xn| in increasing order of magnitude and assign ranks 1, 2,…,n, keeping track of the original signs of the Xi. For example, if images and images, the rank of |X1| is 3, of |X2| is 1, of |X3| is 4, and of |X4| is 2.

Let

(20)images

Then, under H0, we expect T+ and T- to be the same. Note that

(21)images

so that T+ and T− are linearly related and offer equivalent criteria. Let us define

(22)images

and write images for the rank of |Xi|. Then images and images. Also,

(23)images

The statistic T+ (or T−) is known as the Wilcoxon statistic. A large value of T+ (or, equivalently, a small value of T−) means that most of the large deviations from 0 are positive, and therefore we reject H0 in favor of the alternative images.

A similar analysis applies to the other two alternatives. We record the results as follows:

Test
H0 H1 Reject H0 if
images images images
images images images
images images images or images

We now show how the Wilcoxon signed-ranks test statistic is related to the U-statistic estimate of images. Recall from Example 13.2.6 that the corresponding U-statistic is

(24)images

First note that

(25)images

Next note that, for images, images if and only if images and |X(i)| < |X(j)|. It follows that images is the signed rank of X(j). Consequently,

where U1 is the U-statistic for images.

We next compute the distribution of T+ for small samples. The distribution of T+ is tabulated by Kraft and Van Eeden [55, pp. 221–223].

Let

images

Note that images if all differences have negative signs, and images if all differences have positive signs. Here a difference means a difference between an observation and the postulated value of the median. T+ is completely determined by the indicators Z(i), so that the sample space can be considered as a set of 2n n-tuples (z1, z2,…, zn), where each zi is 0 or 1. Under H0, images, and each arrangement is equally likely. Thus

(27)images

Note that every assignment has a conjugate assignment with plus and minus signs interchanged, so that for this conjugate T+ is given by

(28)images

Thus under H0 the distribution of T+ is symmetric about the mean images.
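The representation T+ = Σ i Z(i), with the Z(i) iid b(1, 1/2) under H0, makes the exact null distribution (27) easy to tabulate by direct enumeration for small n. A Python sketch (the sample in the last lines is hypothetical):

```python
from itertools import product

def t_plus(xs, theta0=0.0):
    """Signed-rank statistic: sum of the ranks of |x - theta0| for positive deviations."""
    devs = sorted((abs(x - theta0), x - theta0 > 0) for x in xs)
    return sum(rank for rank, (_, positive) in enumerate(devs, start=1) if positive)

def null_pmf(n):
    """Exact null PMF of T+ from the 2**n equally likely sign vectors, as in (27)."""
    counts = {}
    for signs in product((0, 1), repeat=n):
        t = sum(i * z for i, z in enumerate(signs, start=1))
        counts[t] = counts.get(t, 0) + 1
    return {t: c / 2 ** n for t, c in sorted(counts.items())}

print(t_plus([1.8, -0.5, 2.3, 0.9]))   # hypothetical sample: T+ = 9
print(null_pmf(4))
```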

Remark 2. If we have n independent pairs of observations (X1,Y1), (X2,Y2),…,(Xn,Yn) from a bivariate DF, we form the differences images, images. Assuming that Z1, Z2,…,Zn are (independent) observations from a population of differences with absolutely continuous DF F that is symmetric with median images, we can use the Wilcoxon statistic to test images.

We present some examples.

From Table ST10, we reject H0 at images if either T+ > 46 or T+ < 9. Since the observed T+ lies between 9 and 46, we accept H0. Note that hypothesis H0 was also accepted by the sign test.

For large samples we use the normal approximation. In fact, from (26) we see that

images

Clearly, images, and since images, the first term → 0 in probability as images. By Slutsky’s theorem (Theorem 7.2.15) it follows that

images

have the same limiting distribution. From Theorem 13.2.3 and Example 13.2.7 it follows that images, and hence images, has a limiting normal distribution with mean 0 and variance

images

Under H0, the RVs Z(i) are independent b(1, 1/2), so

images

Also, under H0, F is continuous and symmetric so

images

and

images

Thus images so that

images

However,

images

as n→∞. Consequently, under H0

images

Thus, for large enough n we can determine the critical values for a test based on T+ by using normal approximation.
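A sketch of that approximation in Python, using the null mean n(n + 1)/4 and variance n(n + 1)(2n + 1)/24 derived above (no continuity correction is applied; adding one is a common refinement):

```python
import math

def wilcoxon_normal_pvalue(t_plus_val, n):
    """Two-sided large-sample P-value for T+ via the normal approximation."""
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (t_plus_val - mean) / sd
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

print(wilcoxon_normal_pvalue(40, 12))  # hypothetical T+ = 40 with n = 12
```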

As an example, take images. From Table ST10 the P-value associated with images is 0.10. Using the normal approximation,

images

PROBLEMS 13.3

  1. Prove Theorem 4.
  2. A random sample of size 16 from a continuous DF on [0,1] yields the following data: 0.59,0.72,0.47,0.43,0.31,0.56,0.22,0.90,0.96,0.78,0.66,0.18,0.73,0.43,0.58,0.11. Test the hypothesis that the sample comes from U[0,1].
  3. Test the goodness of fit of normality for the data of Problem 10.3.6, using the Kolmogorov–Smirnov test.

    Do not reject H0.

  4. For the data of Problem 10.3.6 find a 0.95 level confidence band for the distribution function.
  5. The following data represent a sample of size 20 from U[0,1]: 0.277,0.435,0.130, 0.143, 0.853, 0.889, 0.294, 0.697, 0.940, 0.648, 0.324, 0.482, 0.540, 0.152, 0.477, 0.667, 0.741, 0.882, 0.885, 0.740. Construct a .90 level confidence band for F(x).
  6. In Problem 5 test the hypothesis that the distribution is U[0,1]. Take images.
  7. For the data of Example 2 test, by means of the sign test, the null hypothesis images against images.

    Reject H0.

  8. For the data of Problem 5 test the hypothesis that the quantile of order images is 0.20.
  9. For the data of Problem 10.4.8 use the sign test to test the hypothesis of no difference between the two averages.
  10. Use the sign test for the data of Problem 10.4.9 to test the hypothesis of no difference in grade-point averages.

    Do not reject H0 at 0.05 level.

  11. For the data of Problem 5 apply the signed-rank test to test images against images.

    images, do not reject H0.

  12. For the data of Problems 10.4.8 and 10.4.9 apply the signed-rank test to the differences to test images against images.

    (Second part) images, do not reject H0 at images.

13.4 SOME TWO-SAMPLE PROBLEMS

In this section we consider some two-sample tests. Let X1,X2,…,Xm and Y1,Y2 ,…,Yn be independent samples from two absolutely continuous distribution functions FX and FY, respectively. The problem is to test the null hypothesis images for all images against the usual one- and two-sided alternatives.

Tests of H0 depend on the type of alternative specified. We state some of the alternatives of interest even though we will not consider all of these in this text.

  1. Location alternative: images, images
  2. Scale alternative: images, images
  3. Lehmann alternative: images, images.
  4. Stochastic alternative: images for all x, and images for at least one x.
  5. General alternative: images for some x.

Some comments are in order. Clearly, I through IV are special cases of V. Alternatives I and II show differences in FX and FY in location and scale, respectively. Alternative III states that images. In the special case when θ is an integer, it states that Y has the same distribution as the smallest of images X-variables. A similar alternative that is sometimes used is images for some images and all x. When α is an integer, this states that Y is distributed as the largest of α X-variables. Alternative IV refers to the relative magnitudes of the X’s and Y’s. It states that

images

so that

for all x. In other words, X’s tend to be larger than the Y’s.

A similar interpretation may be given to the one-sided alternative images. In the special case where both X and Y are normal RVs with means μ1, μ2 and common variance σ2, images corresponds to images, and images corresponds to images.

In this section we consider some common two-sample tests for location (Case I) and stochastic ordering (Case IV) alternatives. First, note that a test of stochastic ordering may also be used as a test of less restrictive location alternatives since, for example, images corresponds to larger Y’s and hence larger location for Y. Second, we note that the chi-square test of homogeneity described in Section 10.3 can be used to test general alternatives (Case V) images for some x. Briefly, one partitions the real line into Borel sets A1,A2,…,Ak. Let

images

images. Under images, images, images, which is the problem of testing equality of two independent multinomial distributions discussed in Section 10.3.

We first consider a simple test of location. This test, based on the sample median of the combined sample, is a test of the equality of medians of the two DFs. It will tend to accept images even if the shapes of F and G are different as long as their medians are equal.

13.4.1 Median Test

The combined sample X1,X2,…,Xm, Y1,Y2,…,Yn is ordered and a sample median is found. If images is odd, the median is the images th value in the ordered arrangement. If images is even, the median is any number between the two middle values. Let V be the number of observed values of X that are less than or equal to the sample median for the combined sample. If V is large, it is reasonable to conclude that the actual median of X is smaller than the median of Y. One therefore rejects images in favor of images for all x and images for some x if V is too large, that is, if images. If, however, the alternative is images for all x and images for some x, the median test rejects H0 if images.

For the two-sided alternative that images for some x, we use the two-sided test.

We next compute the null distribution of the RV V. If images, p a positive integer, then

(2)images

Here images. If images is an integer, the imagesth value is the median in the combined sample, and

(3)images

Remark 1. Under H0 we expect images observations above the median and images below the median. One can therefore apply the chi-square test with 1 d.f. to test H0 against the two-sided alternative.
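A computational sketch of the test: V counts the x’s at or below the combined-sample median, and its null PMF is the hypergeometric law of (2) and (3). In the helper below, t denotes the number of combined observations at or below the median (t = p when m + n = 2p); for even m + n we take the lower middle value as the median, one of the admissible choices noted above:

```python
from math import comb

def median_test_v(xs, ys):
    """V = number of x's at or below the combined-sample median."""
    combined = sorted(xs + ys)
    med = combined[(len(combined) - 1) // 2]  # lower middle value if m + n is even
    return sum(1 for x in xs if x <= med)

def null_pmf_v(v, m, n, t):
    """Hypergeometric null PMF: P(V = v) when t combined values lie at or below the median."""
    return comb(m, v) * comb(n, t - v) / comb(m + n, t)

print(median_test_v([1, 3, 5, 7], [2, 4, 6, 8]))  # hypothetical data
print(null_pmf_v(2, 4, 4, 4))                     # P(V = 2) = 36/70
```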

We now consider two tests of the stochastic alternatives. As mentioned earlier they may also be used as tests of location.

13.4.2 Kolmogorov–Smirnov Test

Let X1,X2,…,Xm and Y1,Y2,…,Yn be independent random samples from continuous DFs F and G, respectively. Let images and images, respectively, be the empirical DFs of the X’s and the Y’s. Recall that images is the U-statistic for F and images that for G. Under images for all x, we expect a reasonable agreement between the two sample DFs. We define

(4)images

Then Dm, n may be used to test H0 against the two-sided alternative images for some x. The test rejects H0 at level α if

(5)images

where images.

Similarly, one can define the one-sided statistics

(6)images

and

(7)images

to be used against the one-sided alternatives

(8)images

and

(9)images

respectively.

For small samples tables due to Massey [72] are available. In Table ST9, we give the values of Dm,n,α and images for some selected values of m, n, and α. Table ST8 gives the corresponding values for the images case.

For large samples we use the limiting result due to Smirnov [107]. Let images.

Then

Relations (10) and (11) give the distribution of images and Dm,n, respectively, under images for all images.
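A minimal Python sketch of Dm,n: since images and images are step functions, the supremum in (4) is attained at one of the m + n sample points, so a finite search suffices (the data below are hypothetical):

```python
def ks_two_sample(xs, ys):
    """Two-sided statistic D_{m,n} = sup_x |F*_m(x) - G*_n(x)|."""
    m, n = len(xs), len(ys)
    fm = lambda t: sum(x <= t for x in xs) / m
    gn = lambda t: sum(y <= t for y in ys) / n
    return max(abs(fm(t) - gn(t)) for t in xs + ys)

print(ks_two_sample([30, 40, 45, 55], [35, 50, 60, 65]))  # hypothetical data
```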

Let us first apply the Kolmogorov–Smirnov test to test H0 that the population distribution of length of life for the two brands is the same.

x F*m(x) G*n(x) |F*m(x) − G*n(x)|
30 images images images
40 images images images
45 images images images
50 images images images
55 1 images images
60 1 1 0
images

From Table ST8, the critical value for images at level images is images. Since images, we accept H0 that the population distribution for the length of life for the two brands is the same.

Let us next apply the two-sample t-test. We have images, images, images, images, images. Thus

images

Since images, we accept the hypothesis that the two samples come from the same (normal) population.

The second test of stochastic ordering alternatives we consider is the Mann–Whitney–Wilcoxon test which can be viewed as a test based on a U-statistic.

13.4.3 The Mann–Whitney–Wilcoxon Test

Let X1,X2,…,Xm and Y1,Y2,…,Yn be independent samples from two continuous DFs, F and G, respectively. As in Example 13.2.10, let

images

for images, images. Recall that T(Xi, Yj) is an unbiased estimator of images and the two-sample U-statistic for g is given by images. For notational convenience, let us write

(12)images

Then U is the number of pairs (Xi, Yj) for which Xi is smaller than Yj. The statistic U is called the Mann–Whitney statistic. An alternative equivalent form using Wilcoxon scores is the linear rank statistic given by

(13)images

where Qj = rank of Yj among the combined images observations. Indeed,

images

Thus

(14)images

so that U and W are equivalent test statistics. Hence the name Mann–Whitney–Wilcoxon Test. We will restrict attention to U as the test statistic.

Note that images if all the Xi’s are larger than all the Yj’s, and images if all the Xi’s are smaller than all the Yj’s, because then there are m images, images, and so on. Thus images. If U is large, the values of Y tend to be larger than the values of X (Y is stochastically larger than X), and this supports the alternative images for all x and images for some x. Similarly, if U is small, the Y values tend to be smaller than the X values, and this supports the alternative images for all x and F(x) images G(x) for some x. We summarize these results as follows:

H0 H1 Reject H0 if
images images images
images images images
images images images

To compute the critical values we need the null distribution of U. Let

(15)images

We will set up a difference equation relating pm, n to pm–1,n and pm, n–1. If the observations are arranged in increasing order of magnitude, the largest value can be either an x value or a y value. Under H0, all images values are equally likely, so the probability that the largest value will be an x value is images and that it will be a y value is images.

Now, if the largest value is an x, it does not contribute to U, and the remaining images values of x and n values of y can be arranged to give the observed value images with probability pm–1,n(u). If the largest value is a y, it is larger than all the m x’s. Thus, to get images, the remaining images values of y and m values of x must contribute images. It follows that

(16)images

If images, then for images

(17)images

If images, images, then

(18)images

and

(19)images

For small values of m and n one can easily compute the null PMF of U. Thus, if images, then

images

If images, images, then

images
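The recursion (16), together with the boundary conditions (17)–(19), translates directly into code. A memoized Python sketch, checked against the small-sample case m = n = 2 worked out above:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def p_mn(m, n, u):
    """Null PMF of U from recursion (16):
    p_{m,n}(u) = [m p_{m-1,n}(u) + n p_{m,n-1}(u - m)] / (m + n)."""
    if u < 0 or u > m * n:
        return 0.0
    if m == 0 or n == 0:
        return 1.0 if u == 0 else 0.0
    return (m * p_mn(m - 1, n, u) + n * p_mn(m, n - 1, u - m)) / (m + n)

print([p_mn(2, 2, u) for u in range(5)])  # [1/6, 1/6, 1/3, 1/6, 1/6]
```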

Tables for critical values are available for small values of m and n, images. See, for example, Auble [3] or Mann and Whitney [71]. Table ST11 gives the values of uα for which images for some selected values of m, n, and α.

If m, n are large we can use the asymptotic normality of U. In Example 13.2.11 we showed that, under H0,

images

as images such that images remains constant. The approximation is fairly good for images.

PROBLEMS 13.4

  1. For the data of Example 4 apply the median test.
  2. Twelve 4-year-old boys and twelve 4-year-old girls were observed during two 15-minute play sessions, and each child’s play during these two periods was scored as follows for incidence and degree of aggression:
    • Boys: 86, 69,72, 65, 113, 65, 118, 45, 141, 104, 41, 50
    • Girls: 55, 40, 22, 58, 16, 7, 9, 16, 26, 36, 20, 15

    Test the hypothesis that there were sex differences in the amount of aggression shown, using (a) the median test and (b) the Mann-Whitney-Wilcoxon test (Siegel [105]).

  3. To compare the variability of two brands of tires, the following mileages (1000 miles) were obtained for eight tires of each kind:
    • Brand A:32.1, 20.6, 17.8, 28.4, 19.6, 21.4, 19.9, 30.1
    • Brand B:19.8, 27.6, 30.8, 27.6, 34.1, 18.7, 16.9, 17.9

    Test the null hypothesis that the two samples come from the same population, using the Mann–Whitney–Wilcoxon test.

  4. Use the data of Problem 2 to apply the Kolmogorov−Smirnov test.
  5. Apply the Kolmogorov−Smirnov test to the data of Problem 3.
  6. Yet another test for testing images against general alternatives is the so-called runs test. A run is a succession of one or more identical symbols which are preceded and followed by a different symbol (or no symbol). The length of a run is the number of like symbols in a run. The total number of runs, R, in the combined sample of X’s and Y’s when arranged in increasing order can be used as a test of H0. Under H0 the X and Y symbols are expected to be well-mixed. A small value of R supports images. A test based on R is appropriate only for two-sided (general) alternatives. Tables of critical values are available. For large samples, one uses normal approximation: images.
    1. Let images of X-runs, images -runs, and images. Under H0, show that
      images

where images if images, images if images, images and images.

    2. Show that
      images
  7. Fifteen 3-year-old boys and 15 3-year-old girls were observed during two sessions of recess in a nursery school. Each child’s play was scored for incidence and degree of aggression as follows:
    • Boys: 96 65 74 78 82 121 68 79 111 48 53 92 81 31 40
    • Girls: 12 47 32 59 83 14 32 15 17 82 21 34 9 15 51

    Is there evidence to suggest that there are sex differences in the incidence and amount of aggression? Use both Mann–Whitney–Wilcoxon and runs tests.

13.5 TESTS OF INDEPENDENCE

Let X and Y be two RVs with joint DF F(x, y), and let F1 and F2, respectively, be the marginal DFs of X and Y. In this section we study some tests of the hypothesis of independence, namely,

images

against the alternative

images

If the joint distribution function F is bivariate normal, we know that X and Y are independent if and only if the correlation coefficient images. In this case, the test of independence is to test images.

In the nonparametric situation the most commonly used test of independence is the chi-square test, which we now study.

13.5.1 Chi-square Test of Independence—Contingency Tables

Let X and Y be two RVs, and suppose that we have n observations on (X,Y). Let us divide the space of values assumed by X (the real line) into r mutually exclusive intervals A1, A2,…,Ar. Similarly, the space of values of Y is divided into c disjoint intervals B1, B2,…,Bc. As a rule of thumb, we choose the length of each interval in such a way that the probability that X (respectively, Y) lies in an interval is approximately 1/r (respectively, 1/c). Moreover, it is desirable to have n/r and n/c at least equal to 5. Let Xij denote the number of pairs (Xk, Yk), images, that lie in Ai × Bj, and let

(1)images

where images, images. If each pij is known, the quantity

(2)images

has approximately a chi-square distribution with images d.f., provided that n is large (see Theorem 10.3.2). If X and Y are independent, images. Let us write images and images. Then under images, images, images. In practice, pij will not be known. We replace pij by their estimates. Under H0, we estimate pi· by

(3)images

and p·j by

(4)images

Since images we have estimated only images parameters. It follows (see Theorem 10.3.4) that the RV

(5)images

is asymptotically distributed as χ2 with images d.f. under H0. The null hypothesis is rejected if the computed value of U exceeds χ2(r–1)(c–1),α.

It is frequently convenient to list the observed and expected frequencies of the rc events Ai × Bj in an r × c table, called a contingency table, as follows:

Observed Frequencies, Oij Expected Frequencies, Eij
B1 B2 … Bc B1 B2 … Bc
A1 X11 X12 … X1c images A1 np1.p.1 np1.p.2 … np1.p.c np1.
A2 X21 X22 … X2c images A2 np2.p.1 np2.p.2 … np2.p.c np2.
. . . … . . . . … .
Ar Xr1 Xr2 … Xrc images Ar npr.p.1 npr.p.2 … npr.p.c npr.
images images … images n np.1 np.2 … np.c n

Note that the Xij’s in the table are frequencies. Once the category Ai × Bj is determined for an observation (X, Y), numerical values of X and Y are irrelevant. Next, we need to compute the expected frequency table. This is done quite simply by multiplying the row and column totals for each pair (i, j) and dividing the product by n. Then we compute the quantity

images

and compare it with the tabulated images value. In this form the test can be applied even to qualitative data. A1, A2, …,Ar and B1, B2,…,Bc represent the two attributes, and the null hypothesis to be tested is that the attributes A and B are independent.
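A compact sketch of this computation in Python, applied to the 3 × 4 contingency table of Problem 1 below; the resulting statistic is referred to the χ2 distribution with (r − 1)(c − 1) = 6 d.f.:

```python
def chi_square_statistic(table):
    """Sum of (O - E)^2 / E with E_ij = (row_i total)(col_j total) / n."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return sum((table[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
               for i in range(len(rows)) for j in range(len(cols)))

obs = [[12, 25, 32, 11],   # the 3 x 4 table of Problem 1
       [17, 18, 22, 23],
       [21, 17, 16, 26]]
print(chi_square_statistic(obs))
```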

13.5.2 Kendall’s Tau

Let (X1, Y1), (X2, Y2),…,(Xn, Yn) be a sample from a bivariate population.

Writing πc and πd for the probability of perfect concordance and of perfect discordance, respectively, we have

(8)images

and

(9)images

and, if the marginal distributions of X and Y are continuous,

(10)images

(11)images

If the marginal distributions of X and Y are continuous, we may rewrite (11), in view of (10), as follows:

(12)images

In particular, if X and Y are independent and continuous RVs, then

images

since then images is a symmetric RV. Then

images

and it follows that images for independent continuous RVs.

Note that, in general, images does not imply independence. However, for the bivariate normal distribution, images if and only if the correlation coefficient ρ between X and Y is 0, so that images if and only if X and Y are independent (Problem 6).

Let

Then images, and we see that images is estimable of degree 2, with symmetric kernel ψ defined in (13). The corresponding one-sample U-statistic is given by

(14)images

Then the corresponding estimator of Kendall’s tau is

(15)images

and is called Kendall’s sample correlation coefficient.

Note that images. To test H0 that X and Y are independent against H1 : X and Y are dependent, we reject H0 if |T| is large. Under H0, images, so that the null distribution of T is symmetric about 0. Thus we reject H0 at level α if the observed value t of T satisfies |t| images tα/2, where images.

For small values of n the null distribution can be directly evaluated. Values for images are tabulated by Kendall [51]. Table ST12 gives the values of Sα for which images, where images, for selected values of n and α.

For a direct evaluation of the null distribution we note that the numerical value of T is clearly invariant under all order-preserving transformations. It is therefore convenient to order X and Y values and assign them ranks. If we write the pairs from the smallest to the largest according to, say, X values, then the number of pairs of values of images for which images is the number of concordant pairs, P.
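Operationally, then, T can be computed by counting concordant and discordant pairs directly. A Python sketch for tie-free data (the paired ranks in the example call are hypothetical):

```python
from itertools import combinations

def kendall_t(pairs):
    """T = (concordant - discordant) / C(n, 2), assuming no ties."""
    n = len(pairs)
    conc = disc = 0
    for (x1, y1), (x2, y2) in combinations(pairs, 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            conc += 1
        elif s < 0:
            disc += 1
    return (conc - disc) / (n * (n - 1) / 2)

print(kendall_t([(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]))  # hypothetical ranks
```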

For large n we can use an extension of Theorem 13.3.3 to the bivariate case to conclude that images, where

images

Under H0, it can be shown that

images

See, for example, Kendall [51], Randles and Wolfe [85], or Gibbons [35]. The approximation is good for images.

13.5.3 Spearman’s Rank Correlation Coefficient

Let (X1, Y1), (X2, Y2),…, (Xn,Yn) be a sample from a bivariate population. In Section 6.3 we defined the sample correlation coefficient by

where

images

If the sample values X1,X2,…,Xn and Y1,Y2,…,Yn are each ranked from 1 to n in increasing order of magnitude separately, and if the X’s and Y’s have continuous DFs, we get a unique set of rankings. The data will then reduce to n pairs of rankings. Let us write

images

Then Ri, Si ∈ {1, 2,…, n}. Also

(17)images
(18)images

and

(19)images

Substituting in (16), we obtain

Writing images, we have

images

and it follows that

The statistic R defined in (20) and (21) is called Spearman’s rank correlation coefficient (see also Example 4.5.2).

From (20) we see that

(22)images

Under H0, the RVs X and Y are independent, so that the ranks Ri and Si are also independent. It follows that

images

and

(23)images

Thus we should reject H0 if the absolute value of R is large, that is, reject H0 if

(24)images

where images. To compute Rα we need the null distribution of R. For this purpose it is convenient to assume, without loss of generality, that images. Then images, images. Under H0, X and Y being independent, the n! pairs (i,Si) of ranks are equally likely. It follows that

(25)images

Note that images, and the extreme values can occur only when either the rankings match, that is, images, in which case images, or images, in which case images. Moreover, one need compute only one half of the distribution, since it is symmetric about 0 (Problem 7).

In the following example we will compute the distribution of R for images and 4. The complete exact distribution of images, and hence of R, for images has been tabulated by Kendall [51]. Table ST13 gives the values of Rα for some selected values of n and α.

Since images, we cannot reject H0 at images or images.
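For a quick computation of R itself, the form in (20)/(21) based on the differences Di = Ri − Si is convenient. A Python sketch assuming no ties; the data are the heights of the first five couples from Problem 9 below:

```python
def spearman_r(xs, ys):
    """R = 1 - 6 * sum(D_i^2) / (n (n^2 - 1)), D_i = R_i - S_i; no ties assumed."""
    n = len(xs)
    ranks = lambda v: {value: i for i, value in enumerate(sorted(v), start=1)}
    rx, ry = ranks(xs), ranks(ys)
    d2 = sum((rx[x] - ry[y]) ** 2 for x, y in zip(xs, ys))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Heights of the first five couples of Problem 9 (husband, wife):
print(spearman_r([80, 70, 73, 72, 62], [72, 60, 76, 62, 63]))  # 0.6
```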

For large samples it is possible to use a normal approximation. It can be shown (see, e.g., Fraser [32, pp. 247–248]) that under H0 the RV

images

or, equivalently,

images

has approximately a standard normal distribution. The approximation is good for images.

PROBLEMS 13.5

  1. A sample of 240 men was classified according to characteristics A and B. Characteristic A was subdivided into four classes A1, A2, A3, and A4, while B was subdivided into three classes B1, B2, and B3, with the following result:
    A1 A2 A3 A4
    B1 12 25 32 11 80
    B2 17 18 22 23 80
    B3 21 17 16 26 80
    50 60 70 60 240

    Is there evidence to support the theory that A and B are independent?

  2. The following data represent the blood types and ethnic groups of a sample of Iraqi citizens:
    Blood Type
    Ethnic Group O A B AB
    Kurd 531 450 293 226
    Arab 174 150 133 36
    Jew 42 26 26 8
    Turkoman 47 49 22 10
    Ossetian 50 59 26 15

    Is there evidence to conclude that blood type is independent of ethnic group?

  3. In a public opinion poll, a random sample of 500 American adults across the country was asked the following question: “Do you believe that there was a concerted effort to cover up the Watergate scandal? Answer yes, no, or no opinion.” The responses according to political beliefs were as follows:
    Political Affiliation Response
    Yes No No Opinion
    Republican 45 75 30 150
    Independent 85 45 20 150
    Democrat 140 30 30 200
    270 150 80 500

    Test the hypothesis that attitude toward the Watergate cover-up is independent of political party affiliation.

  4. A random sample of 100 families in Bowling Green, Ohio, showed the following distribution of home ownership by family income:
    Residential Status Annual Income (dollars)
    Less than 30,000 30,000-50,000 50,000or Above
    Home Owner 10 15 30
    Renter 8 17 20

    Is home ownership in Bowling Green independent of family income?

  5. In a flower show the judges agreed that five exhibits were outstanding, and these were numbered arbitrarily from 1 to 5. Three judges each arranged these five exhibits in order of merit, giving the following rankings:
    Judge A: 5 3 1 2 4
    Judge B: 3 1 5 4 2
    Judge C: 5 2 3 1 4

    Compute the average values of Spearman’s rank correlation coefficient R and Kendall’s sample tau coefficient T from the three possible pairs of rankings.

  6. For the bivariate normally distributed RV (X, Y) show that images if and only if X and Y are independent. [Hint: Show that images, where ρ is the correlation coefficient between X and Y.]
  7. Show that the distribution of Spearman's rank correlation coefficient R is symmetric about 0 under H0.
  8. In Problem 5 test the null hypothesis that rankings of judge A and judge C are independent. Use both Kendall’s tau and Spearman's rank correlation tests.
  9. A random sample of 12 couples showed the following distribution of heights:
    Couple Height (in.) Couple Height (in.)
    Husband Wife Husband Wife
    1 80 72 7 74 68
    2 70 60 8 71 71
    3 73 76 9 63 61
    4 72 62 10 64 65
    5 62 63 11 68 66
    6 65 46 12 67 67
    1. Compute T.
    2. Compute R.
    3. Test the hypothesis that the heights of husband and wife are independent, using T as well as R. In each case use the normal approximation.
    1. images;
    2. images;
    3. Reject H0 in each case.

13.6 SOME APPLICATIONS OF ORDER STATISTICS

In this section we consider some applications of order statistics. We are mainly interested in three applications, namely, tolerance intervals for distributions, coverages, and confidence interval estimates for quantiles and location parameters.

Let X1, X2,…,Xn be a sample of size n from F, and let X(1), X(2), …,X(n) be the corresponding set of order statistics. If the end points of the tolerance interval are two order statistics X(r), X(s), r images s, we have

Since F is continuous, F(X) is U(0, 1), and we have

(2)images

where U(r), U(s) are the order statistics from U(0,1). Thus (1) reduces to

The statistic images, images, is called the coverage of the interval (X(r), X(s)). More precisely, the differences images, for images, where images and images, are called elementary coverages.

Since the joint PDF of U(1), U(2),…, U(n) is given by

images

the joint PDF of V1, V2,…, Vn is easily seen to be

(4)images

Note that h is symmetric in its arguments. Consequently, the Vi’s are exchangeable RVs, and the distribution of every sum of r, r images n, of these coverages is the same; in particular, it is the distribution of images, namely,

(5)images

The common distribution of elementary coverages is

images

Thus images and images. This may be interpreted as follows: the order statistics X(1), X(2),…,X(n) partition the area under the PDF into images parts such that each part has the same average (expected) area.

The sum of any r successive elementary coverages Vi+1, Vi+2,…, Vi+r is called an r-coverage. Clearly

(6)images

and, in particular, images. Since the V’s are exchangeable, it follows that

(7)images

with PDF

images

From (3), therefore

where the last equality follows from (5.3.48). Given n, p, and γ, it may not always be possible to find s − r to satisfy (8).

In general, given p, 0 < p < 1, it is possible to choose a sufficiently large sample size n and a corresponding value of images such that, with probability ≥ γ, an interval of the form (X(r), X(s)) covers at least 100p percent of the distribution. If images is specified as a function of n, one chooses the smallest sample size n.

We next consider the use of order statistics in constructing confidence intervals for population quantiles. Let X be an RV with a continuous DF F, and let 0 < p < 1. Then the quantile of order p satisfies

(9)images

Let X1, X2,…,Xn be n independent observations on X. Then the number of Xi's < images is an RV that has a binomial distribution with parameters n and p. Similarly, the number of Xi's that are at least images has a binomial distribution with parameters n and images.

Let X(1),X(2),…,X(n) be the set of order statistics for the sample. Then

Similarly

It follows from (10) and (11) that

It is easy to determine a confidence interval for images from (12), once the confidence level is given. In practice, one determines r and s such that images is as small as possible, subject to the condition that the level is images.
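A sketch of that determination in Python: by (12) the coverage probability of (X(r), X(s)) is a binomial sum, so candidate pairs (r, s) can be scanned directly:

```python
from math import comb

def coverage(n, p, r, s):
    """P(X_(r) < zeta_p < X_(s)) = sum_{k=r}^{s-1} C(n, k) p^k (1-p)^(n-k), as in (12)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(r, s))

# Symmetric interval for the median from n = 10 observations:
print(coverage(10, 0.5, 2, 9))   # P(X_(2) < median < X_(9)) ~= 0.9785
```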

Finally we consider applications of order statistics to constructing confidence intervals for a location parameter. For this purpose we will use the method of test inversion discussed in Chapter 11. We first consider confidence estimation based on the sign test of location.

Let X1,X2,…,Xn be a random sample from a symmetric, continuous images, and suppose we wish to find a confidence interval for θ. Let images, the number of images, be the sign-test statistic for testing images against images. Clearly, images under H0. The sign test rejects H0 if

for some integer c to be determined from the level of the test. Let images. Then any value of θ is acceptable provided it is greater than the rth smallest observation and smaller than the rth largest observation, giving as confidence interval

If we want level images to be associated with (14), we choose c so that the level of test (13) is α.

We next consider the Wilcoxon signed-ranks test of images to construct a confidence interval for θ. The test statistic in this case is T+ = sum of the ranks of the positive images's among the ordered images's. From (13.3.4)

images

Let images, images, and order the images Tij's in increasing order of magnitude:

images

Then, using the argument that converts (13) to (14), we see that a confidence interval for θ is given by

(15)images

Critical values c are taken from Table ST10.
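The ordered averages Tij in (15) are the Walsh averages (Xi + Xj)/2 for i ≤ j. The sketch below lists them and also computes the median of the Walsh averages, the Hodges–Lehmann point estimate of θ associated with this confidence procedure (the data are hypothetical):

```python
from statistics import median

def walsh_averages(xs):
    """All (X_i + X_j) / 2 for i <= j; ordering them gives the T_(k) of (15)."""
    n = len(xs)
    return sorted((xs[i] + xs[j]) / 2 for i in range(n) for j in range(i, n))

def hodges_lehmann(xs):
    """Point estimate of theta: the median of the Walsh averages."""
    return median(walsh_averages(xs))

print(hodges_lehmann([1.1, 0.3, 2.2, 1.7, 0.9]))  # hypothetical data
```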

PROBLEMS 13.6

  1. Find the smallest values of n such that the intervals (a) (X(1),X(n)) and (b) (X(2),X(n-1)) contain the median with probability images 0.90.
    1. 5;
    2. 8.
  2. Find the smallest sample size required such that (X(1), X(n)) covers at least 90 percent of the distribution with probability > 0.98.
  3. Find the relation between n and p such that (X(1),X(n)) covers at least 100p percent of the distribution with probability images.

    images.

  4. Given γ, δ, p0, p1 with images, find the smallest n such that
    images

    and

    images

    Find also images.

    [Hint: Use the normal approximation to the binomial distribution.]

    images.

  5. In Problem 4 find the smallest n and the associated value of images if images, images, images, images.
  6. Let X1 , X2,… , X7 be a random sample from a continuous DF F. Compute:
    1. images.
    2. images
    3. images.
  7. Let X1,X2,…,Xn be iid with common continuous DF F.
    1. What is the distribution of
      images

      for images?

    2. What is the distribution of images?

13.7 ROBUSTNESS

Most of the statistical inference problems treated in this book are parametric in nature. We have assumed that the functional form of the distribution being sampled is known except for a finite number of parameters. It is to be expected that any estimator or test of hypothesis concerning the unknown parameter constructed on this assumption will perform better than the corresponding nonparametric procedure, provided that the underlying assumptions are satisfied. It is therefore of interest to know how well the parametric optimal tests or estimators constructed for one population perform when the basic assumptions are modified. If we can construct tests or estimators that perform well for a variety of distributions, for example, there would be little point in using the corresponding nonparametric method unless the assumptions are seriously violated.

In practice, one makes many assumptions in parametric inference, and any one or all of these may be violated. Thus one seldom has accurate knowledge about the true underlying distribution. Similarly, the assumption of mutual independence or even identical distribution may not hold. Any test or estimator that performs well under modifications of underlying assumptions is usually referred to as robust.

In this section we will first consider the effect that slight variation in model assumptions have on some common parametric estimators and tests of hypotheses. Next we will consider some corresponding nonparametric competitors and show that they are quite robust.

13.7.1 Effect of Deviations from Model Assumptions on Some Parametric Procedures

Let us first consider the effect of contamination on sample mean as an estimator of the population mean.

The most commonly used estimator of the population mean μ is the sample mean images. It has the property of unbiasedness for all populations with finite mean. For many parent populations (normal, Poisson, Bernoulli, gamma, etc.) it is a complete sufficient statistic and hence a UMVUE. Moreover, it is consistent and has an asymptotic normal distribution whenever the conditions of the central limit theorem are satisfied. Nevertheless, the sample mean is affected by extreme observations, and a single observation that is either too large or too small may make images worthless as an estimator of μ. Suppose, for example, that X1, X2,…,Xn is a sample from some normal population. Occasionally something happens to the system, and a wild observation is obtained; that is, suppose one is sampling from images(μ, σ2), say, 100α percent of the time and from images(μ, kσ2), images, the remaining images percent of the time. Here both μ and σ2 are unknown, and one wishes to estimate μ. In this case one is really sampling from the density function

(1)images

where f0 is the PDF of images(μ, σ2), and f1, the PDF of images(μ, kσ2). Clearly,

(2)images

is still unbiased for μ. If α is nearly 1, there is no problem, since the underlying distribution is nearly images(μ, σ2), and images is nearly the UMVUE of μ with variance σ2/n. If images is large (that is, not nearly 0), then, since one is sampling from f, the variance of X1 is σ2 with probability α and is kσ2 with probability images, and we have

(3)images

If images is large, images is large, and we see that even an occasional wild observation makes images subject to a sizable error. The presence of an occasional observation from images(μ, kσ2) is frequently referred to as contamination. The problem is that we do not know, in practice, the distribution of the wild observations and hence we do not know the PDF f. It is known that the sample median is a much better estimator than the mean in the presence of extreme values. In the contamination model discussed above, if we use Z1/2, the sample median of the Xi’s, as an estimator of μ (which is the population median), then for large n

(4)images

(See Theorem 7.5.2 and Remark 7.5.7.) Since

images

we have

(5)images

as images. If there is no contamination, images and images. Also,

images

which will be close to 1 if α is close to 1. Thus the estimator Z1/2 will not be greatly affected by how large k is, that is, how wild the observations are. We have

images

Indeed, images as images, whereas images as images. One can check that, when images and images, the two variances are (approximately) equal. As k becomes larger than 9 or α smaller than 0.915, Z1/2 becomes a better estimator of μ than images.
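The crossover just described can be reproduced numerically. Under the contamination model (1), n Var(images) = σ2[α + (1 − α)k] by (3), while the large-sample variance of the median is governed by 1/[4f2(μ)] as in (4), with f(μ) the contaminated normal density at its center. A Python sketch with σ2 = 1:

```python
import math

def n_var_mean(alpha, k, sigma2=1.0):
    """n * Var(sample mean) under contamination: sigma^2 (alpha + (1 - alpha) k)."""
    return sigma2 * (alpha + (1 - alpha) * k)

def n_var_median(alpha, k, sigma2=1.0):
    """n * asymptotic Var(sample median): 1 / (4 f(mu)^2), f the mixture density at mu."""
    f_mu = (alpha / math.sqrt(2 * math.pi * sigma2)
            + (1 - alpha) / math.sqrt(2 * math.pi * k * sigma2))
    return 1.0 / (4.0 * f_mu ** 2)

# Near alpha = 0.915, k = 9 the two are roughly comparable, matching the
# crossover described in the text.
print(n_var_mean(0.915, 9), n_var_median(0.915, 9))
```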

There are other flaws as well. Suppose, for example, that X1, X2,…,Xn is a sample from images. Then both images and images, where images, are unbiased for images. Also, images, and one can show that images. It follows that the efficiency of images relative to that of T is

images

In fact, images as images, so that in sampling from a uniform parent images is much worse than T, even for moderately large values of n.

Let us next turn our attention to the estimation of standard deviation. Let X1, X2, …,Xn be a sample from images(μ, σ2). Then the MLE of σ is

(6)images

Note that the lower bound for the variance of any unbiased estimator for σ is σ2/2n. Although images is not unbiased, the estimator

(7)images

is unbiased for σ. Also,

(8)images

Thus the efficiency of S1 (relative to the estimator with least variance = σ2/2n) is

images

and →1 as n → ∞. For small n, the efficiency of S1 is considerably smaller than 1. Thus, for images, images and, for images, images.

Yet another estimator of σ is the sample mean deviation

(9)images

Note that

images

and

(10)images

If n is large enough so that images, we see that images is nearly unbiased for σ with variance images. The efficiency of S3 is

images

For large n, the efficiency of S1 relative to S3 is

images

Now suppose that there is some contamination. As before, let us suppose that for a proportion α of the time we sample from images(μ, σ2) and for a proportion images of the time we get a wild observation from images(μ, kσ2), images. Assuming that both μ and σ2 are unknown, suppose that we wish to estimate σ. In the notation used above, let

images

where f0 is the PDF of images(μ, σ2), and f1, the PDF of images(μ, kσ2). Let us see how even small contamination can make the maximum likelihood estimate images of σ quite useless.

If images is the MLE of θ, and ϕ is a function of θ, then images is the MLE of ϕ(θ). In view of (7.5.7) we get

Using Theorem 7.3.5, we see that

(12)images

(dropping the other two terms, with n2 and n3 in the denominator), so that

(13)images

For the density f, we see that

(14)images

and

(15)images

It follows that

(16)images

If we are interested in the effect of very small contamination, images and images. Assuming that images, we see that

(17)images

In the normal case, images and images, so that from (11)

images

Thus we see that the mean square error due to a small contamination is now multiplied by a factor images. If, for example, images, then images. If images, then images, and so on.

A quick comparison with S3 shows that, although S1 (or even images) is a better estimator of σ than S3 if there is no contamination, S3 becomes a much better estimator in the presence of contamination as k becomes large.

Next we consider the effect of deviation from model assumptions on tests of hypotheses. One of the most commonly used tests in statistics is Student’s t-test for testing the mean of a normal population when the variance is unknown. Let X1, X2,…,Xn be a sample from some population with mean μ and finite variance σ2. As usual, let images denote the sample mean, and S2, the sample variance. If the population being sampled is normal, the t-test rejects images against images at level α if images. If n is large, we replace images by the corresponding critical value, zα/2, under the standard normal law. If the sample does not come from a normal population, the statistic images is no longer distributed as a t images statistic. If, however, n is sufficiently large, we know that T has an asymptotic normal distribution irrespective of the population being sampled, as long as it has a finite variance. Thus, for large n, the distribution of T is independent of the form of the population, and the t-test is stable. The same considerations apply to testing the difference between two means when the two variances are equal. Although we assumed that n is sufficiently large for Slutsky’s result (Theorem 7.2.15) to hold, empirical investigations have shown that the test based on Student’s statistic is robust. Thus a significant value of t may not be interpreted to mean a departure from normality of the observations.

Let us next consider the effect of departure from independence on the t-distribution. Suppose that the observations X1, X2,…,Xn have a multivariate normal distribution with images, and ρ as the common correlation coefficient between any Xi and Xj, images. Then

(18)images

and since Xi’s are exchangeable it follows from Remark 6.3.1 that

(19)images

For large n, the statistic images will be asymptotically distributed as images, instead of images(0, 1). Under H0, images and images is distributed as images. Consider the ratio

(20)images

The ratio equals 1 if images but is > 1 for images and → ∞ as ρ → 1. It follows that a large value of T is likely to occur when images is large, even though μ0 is the true value of the mean. Thus a significant value of t may be due to departure from independence, and the effect can be serious.

Next, consider a test of the null hypothesis images against images. Under the usual normality assumptions on the observations X1, X2,…, Xn, the test statistic used is

(21)images

which has a images distribution under H0. The usual test is to reject H0 if

(22)images

Let us suppose that X1, X2,…, Xn are not normal. It follows from Corollary 2 of Theorem 7.3.4 that

(23)images

so that

(24)images

Writing images, we have

(25)images

when the Xi’s are not normal, and

(26)images

when the Xi’s are normal images. Now images is the sum of n identically distributed but dependent images, j = 1, 2,…,n. Using a version of the central limit theorem for dependent RVs (see, e.g., Cramér [17, p. 365]), it follows that

images

under H0, is asymptotically images, and not images(0, 1) as under the normal theory. As a result, the size of the test based on the statistic V0 will be different from the stated level of significance if γ2 differs greatly from 0. It is clear that the effect of violation of the normality assumption can be quite serious on inferences about variances, and the chi-square test is not robust.

In the above discussion we have used somewhat crude calculations to investigate the behavior of the most commonly used estimators and test statistics when one or more of the underlying assumptions are violated. Our purpose here was to indicate that some tests or estimators are robust whereas others are not. The moral is clear: One should check carefully to see that the underlying assumptions are satisfied before using parametric procedures.

13.7.2 Some Robust Procedures

Let X1, X2, …, Xn be a random sample from a continuous PDF $f(x - \theta)$, where f is symmetric about 0, so that the Xi's are symmetric about θ. We shall be interested in estimation or tests of hypotheses concerning θ. Our objective is to find procedures that perform well for several different types of distributions but need not be optimal for any particular distribution. We will call such procedures robust. We first consider estimation of θ.

The estimators fall under one of the following three types:

  1. Estimators that are functions of $(R_1, R_2, \ldots, R_n)$, where Rj is the rank of Xj, are known as R-estimators. Hodges and Lehmann [44] devised a method of deriving such estimators from rank tests. These include the sample median $\tilde{X}$ (based on the sign test) and the Hodges–Lehmann estimator $W = \operatorname{med}\{(X_i + X_j)/2 : i \le j\}$, based on the Wilcoxon signed-rank test.
  2. Estimators of the form $\sum_{j=1}^{n} a_{jn} X_{(j)}$ are called L-estimators, being linear combinations of order statistics. This class includes the median, the mean, and the trimmed mean $\bar{X}_\alpha$, obtained by dropping a prespecified proportion α of the extreme observations from each end.
  3. Maximum likelihood type estimators, obtained as solutions $\hat\theta$ of equations of the form $\sum_{i=1}^{n} \psi(X_i - \theta) = 0$, are called M-estimators. The choice $\psi = -f'/f$ gives MLEs.

Two extreme examples of trimmed means are the sample mean $\bar{X}$ (α = 0) and the median $\tilde{X}$, obtained when all except the central (n odd) or the two central (n even) observations are excluded. (A computational sketch of these estimators is given below.)
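The following sketch is an illustration of my own, not part of the text; it assumes NumPy, and the helper names are mine. It computes the four location estimators compared below: the sample mean, the sample median, an α-trimmed mean, and the Hodges–Lehmann estimator as the median of the Walsh averages $(X_i + X_j)/2$, $i \le j$.

    import numpy as np

    def trimmed_mean(x, alpha):
        """alpha-trimmed mean: drop [n * alpha] order statistics from each end."""
        x = np.sort(x)
        k = int(len(x) * alpha)
        return x[k:len(x) - k].mean()

    def hodges_lehmann(x):
        """Median of the Walsh averages (x_i + x_j)/2 over all pairs i <= j."""
        i, j = np.triu_indices(len(x))     # includes the diagonal, i.e. i = j
        return np.median((x[i] + x[j]) / 2)

    rng = np.random.default_rng(2)
    x = rng.standard_normal(25) + 3.0      # sample symmetric about theta = 3
    print("mean          :", x.mean())
    print("median        :", np.median(x))
    print("trimmed (0.1) :", trimmed_mean(x, 0.1))
    print("Hodges-Lehmann:", hodges_lehmann(x))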

We will limit this discussion to four estimators of location, namely, the sample median $\tilde{X}$, the trimmed mean $\bar{X}_\alpha$, the sample mean $\bar{X}$, and the Hodges–Lehmann estimator W based on the Wilcoxon signed-rank test. In order to compare the performance of two procedures A and B we will use a (large-sample) measure of relative efficiency due to Pitman. Pitman's asymptotic relative efficiency (ARE) of procedure B relative to procedure A is the limit of the ratio of sample sizes $n_A/n_B$, where $n_A$, $n_B$ are the sample sizes needed for procedures A and B to perform equivalently with respect to a specified criterion. For example, suppose $\{T_n(B)\}$ and $\{T_n(A)\}$ are two sequences of estimators for ψ(θ) such that

$\sqrt{n}\,(T_n(A) - \psi(\theta)) \xrightarrow{L} \mathcal{N}(0,\, \sigma_A^2(\theta))$

and

$\sqrt{n}\,(T_n(B) - \psi(\theta)) \xrightarrow{L} \mathcal{N}(0,\, \sigma_B^2(\theta)).$

Suppose further that A and B perform equivalently if their asymptotic variances are the same, that is,

$\frac{\sigma_A^2(\theta)}{n_A} = \frac{\sigma_B^2(\theta)}{n_B}.$

Then

$e(B, A) = \lim \frac{n_A}{n_B} = \frac{\sigma_A^2(\theta)}{\sigma_B^2(\theta)}.$

Clearly, different performance measures may lead to different measures of ARE.

Similarly, if procedures A and B lead to two sequences of tests, then the ARE is the limiting ratio of the sample sizes needed by the two tests to attain a given power β0 against the same alternative and at the same limiting level α.

Accordingly, let e(B, A) denote the ARE of B relative to A. If $e(B, A) = 1/2$, say, then procedure A requires (approximately) half as many observations as procedure B. We will write $e_F(B, A)$ whenever necessary to indicate the dependence of the ARE on the underlying DF F.
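The variance-ratio form of the ARE can be checked by simulation. A small sketch of my own (assuming NumPy): estimate the variances of the sample mean and the sample median under N(0, 1) and form the ratio, which should be close to $e_F(\tilde{X}, \bar{X}) = 2/\pi \approx 0.637$ as given in the table below.

    import numpy as np

    rng = np.random.default_rng(3)
    n, reps = 200, 20_000
    samples = rng.standard_normal((reps, n))
    var_mean = samples.mean(axis=1).var()          # variance of the mean over replications
    var_median = np.median(samples, axis=1).var()  # variance of the median over replications
    # ARE of the median relative to the mean = ratio of the variances (mean/median)
    print("simulated ARE:", var_mean / var_median)  # approximately 2/pi = 0.637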

For a detailed discussion of Pitman efficiency we refer to Lehmann [63, Section 5.2], Serfling [102, Chapter 10], Randles and Wolfe [85, Chapter 5], and Zacks [121]. The expressions for the AREs of the median $\tilde{X}$ and the Hodges–Lehmann estimator W of the location parameter θ with respect to the sample mean $\bar{X}$ are

(28) $e_F(\tilde{X}, \bar{X}) = 4\sigma_F^2 f^2(0),$

(29) $e_F(W, \bar{X}) = 12\sigma_F^2\left(\int_{-\infty}^{\infty} f^2(x)\,dx\right)^2,$

and hence

(30) $e_F(\tilde{X}, W) = \frac{f^2(0)}{3\left(\int_{-\infty}^{\infty} f^2(x)\,dx\right)^2},$

where f is the PDF corresponding to F and $\sigma_F^2$ is the variance of F. In order to get $e_F(W, \bar{X})$ we use the fact that

$\sqrt{n}\,(W - \theta) \xrightarrow{L} \mathcal{N}\left(0,\, \frac{1}{12\left(\int f^2(x)\,dx\right)^2}\right).$

Bickel [5] showed that

(31) $\sqrt{n}\,(\bar{X}_\alpha - \theta) \xrightarrow{L} \mathcal{N}(0,\, \sigma^2(\alpha, F)),$

where

(32) $\sigma^2(\alpha, F) = \frac{1}{(1 - 2\alpha)^2}\left[\int_{\zeta_\alpha}^{\zeta_{1-\alpha}} x^2 f(x)\,dx + 2\alpha\,\zeta_{1-\alpha}^2\right]$

(taking θ = 0), and $\zeta_\alpha$ is the unique αth quantile of F. It is clear from (32) that no closed-form expression for $e_F(\bar{X}_\alpha, \bar{X})$ is possible for most DFs F.

In the following table we give the AREs for some selected F.

ARE Computations for Selected F

    F                      e_F(X̃, X̄)      e_F(W, X̄)      e_F(X̃, W)
    Uniform                1/3             1               1/3
    N(0, 1)                2/π ≈ 0.637     3/π ≈ 0.955     2/3
    Logistic               π²/12 ≈ 0.822   π²/9 ≈ 1.10     3/4
    Double exponential     2               1.5             4/3
    Cauchy C(0, 1)         —               —               4/3

(For the Cauchy, $\sigma_F^2 = \infty$, so the AREs of $\tilde{X}$ and W relative to $\bar{X}$ are not finite.)
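These entries can be checked directly from (28)–(30) (this is Problem 4 below). The following short check is my own illustration, assuming NumPy and SciPy; it verifies the normal and double exponential rows.

    import numpy as np
    from scipy.integrate import quad

    def ares(pdf, var):
        """Return (e(median, mean), e(W, mean), e(median, W)) from (28)-(30)."""
        int_f2, _ = quad(lambda x: pdf(x) ** 2, -np.inf, np.inf)
        e_med = 4 * var * pdf(0.0) ** 2                 # (28)
        e_w = 12 * var * int_f2 ** 2                    # (29)
        return e_med, e_w, e_med / e_w                  # (30)

    normal = lambda x: np.exp(-x * x / 2) / np.sqrt(2 * np.pi)
    dexp = lambda x: 0.5 * np.exp(-abs(x))              # double exponential, variance 2
    print("N(0,1):", ares(normal, 1.0))                 # (2/pi, 3/pi, 2/3)
    print("DE    :", ares(dexp, 2.0))                   # (2, 1.5, 4/3)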

It can be shown that $e_F(\tilde{X}, \bar{X}) \ge 1/3$ for all unimodal symmetric F, with the lower bound attained at the uniform, so $\tilde{X}$ can be quite inefficient compared with $\bar{X}$ for light-tailed F. Even for normal f, $\tilde{X}$ would require 157 observations to achieve the same accuracy that $\bar{X}$ achieves with 100 observations. For heavier-tailed distributions, however, $\tilde{X}$ provides more protection than $\bar{X}$.

The values of $e_F(W, \bar{X})$, on the other hand, are quite high for most F and, in fact, $e_F(W, \bar{X}) \ge 0.864$ for all symmetric F. Even for normal F one loses little (about 4.5%) in using W instead of $\bar{X}$. Thus W is more robust as an estimator of θ.

A look at the values of $e_F(\tilde{X}, W)$ shows that $\tilde{X}$ is worse than W for distributions with light tails but does slightly better than W for heavier-tailed F.

Let us now compare the AREs of $\bar{X}_\alpha$, $\bar{X}$, and W. The following AREs for selected α are due to Bickel [5].

ARE Comparisons

                           α = 0.01                    α = 0.05
    F                      e(X̄_α, X̄)    e(W, X̄_α)    e(X̄_α, X̄)    e(W, X̄_α)
    Uniform                0.96          1.04          0.83
    Normal                 0.995         0.96          0.97          0.985
    Double exponential     1.06          1.41          1.21          1.24
    Cauchy                 —             6.72          —             2.67

We note that $\bar{X}_\alpha$ performs quite well compared with $\bar{X}$. In fact, for the normal distribution the efficiency is quite close to 1, so there is little loss in using $\bar{X}_\alpha$. For heavier-tailed distributions $\bar{X}_\alpha$ is preferable. For small values of α, it should be noted that $\bar{X}_\alpha$ does not differ much from $\bar{X}$; nevertheless, $\bar{X}_\alpha$ is more robust, since it cannot do much worse than $\bar{X}$ but can do much better. Compared with the Hodges–Lehmann estimator, however, $\bar{X}_\alpha$ does not perform as well; W provides better protection against outliers (heavy tails) and gives up little in the normal case.

Finally we consider testing $H_0: \theta = \theta_0$ against $H_1: \theta \ne \theta_0$. Recall that X1, X2, …, Xn are iid with common continuous symmetric DF $F(x - \theta)$ and PDF $f(x - \theta)$. Suppose $\theta_0 = 0$. Let S denote the sign test based on the statistic $S^+ = \sum_{i=1}^{n} I(X_i > 0)$, W denote the Wilcoxon signed-rank test based on the statistic $T^+$, the sum of the ranks of the $|X_i|$ for which $X_i > 0$, M denote the test based on the Z-statistic $\sqrt{n}\,\bar{X}/\sigma$, and t denote Student's t-test based on the statistic $\sqrt{n}\,\bar{X}/S$, where S² is the sample variance.

First note that the tests S, W, and t correspond, respectively, to the estimators $\tilde{X}$, the Hodges–Lehmann estimator W, and $\bar{X}$. Next we note that the Pitman ARE of each test coincides with the ARE of the corresponding estimator, so that the AREs are the same as given in (28), (29), and (30), and the values of the ARE given in the table above for various F remain the same for the corresponding tests.

Similar remarks apply as in the case of estimation of θ. The sign test is not as efficient as the Wilcoxon signed-rank test for light-tailed distributions, but for heavier-tailed distributions such as the Cauchy and the double exponential the sign test does better than the Wilcoxon signed-rank test.
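All three of the compared tests are easy to carry out in practice. The following sketch is my own (assuming SciPy's stats module); it applies the sign, Wilcoxon signed-rank, and t tests to a single sample for H0: θ = 0.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    x = rng.laplace(loc=0.4, size=30)        # heavy-tailed sample, true theta = 0.4

    # Sign test: S+ = #{X_i > 0} is Binomial(n, 1/2) under H0.
    s_plus = int((x > 0).sum())
    print("sign    :", stats.binomtest(s_plus, n=len(x), p=0.5).pvalue)
    # Wilcoxon signed-rank test of symmetry about 0.
    print("Wilcoxon:", stats.wilcoxon(x).pvalue)
    # Student's t-test of mean 0.
    print("t-test  :", stats.ttest_1samp(x, popmean=0.0).pvalue)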

PROBLEMS 13.7

  1. Let (X1, X2, …, Xn) be jointly normal with $EX_i = \mu$, $\operatorname{var}(X_i) = \sigma^2$, $\operatorname{cov}(X_i, X_j) = \rho\sigma^2$ for $|i - j| = 1$, and $\operatorname{cov}(X_i, X_j) = 0$ otherwise.
    1. Show that
      $E\bar{X} = \mu, \qquad \operatorname{var}(\bar{X}) = \frac{\sigma^2}{n}\left[1 + \frac{2\rho(n-1)}{n}\right],$

      and

      $ES^2 = \sigma^2\left(1 - \frac{2\rho}{n}\right).$
    2. Show that the t-statistic $T = \sqrt{n}(\bar{X} - \mu)/S$ is asymptotically normally distributed with mean 0 and variance 1 + 2ρ. Conclude that the significance of t is overestimated for positive values of ρ and underestimated for ρ < 0 in large samples.
    3. For finite n, consider the statistic
      $T^2 = \frac{n(\bar{X} - \mu)^2}{S^2}.$

      Compare the expected values of the numerator and the denominator of T² and study the effect of ρ ≠ 0 on the interpretation of significant t values (Scheffé [101, p. 338]).

  2. Let X1, X2, …, Xn be a random sample from images:
    1. Show that
      images
    2. Show that
      images
    3. Show that the large-sample distribution of $\sqrt{n}(S^2 - \sigma^2)$ is normal.
    4. Compare the large-sample test of $H_0: \sigma^2 = \sigma_0^2$ based on the asymptotic normality of S² with the large-sample test based on the same statistic when the observations are taken from a normal population. In particular, take images.
  3. Let X1, X2, …, Xm and Y1, Y2, …, Yn be two independent random samples from populations with means μ1 and μ2 and variances $\sigma_1^2$ and $\sigma_2^2$, respectively. Let $\bar{X}$, $\bar{Y}$ be the two sample means, and $S_1^2$, $S_2^2$ be the two sample variances. Write $H_0: \mu_1 = \mu_2$. The usual normal-theory test of H0 is the t-test based on the statistic

    $T = \frac{\bar{X} - \bar{Y}}{S_p\sqrt{1/m + 1/n}},$

    where

    $S_p^2 = \frac{(m-1)S_1^2 + (n-1)S_2^2}{m + n - 2}.$

    Under H0, the statistic T has a t-distribution with m + n − 2 d.f., provided that $\sigma_1^2 = \sigma_2^2$.

    Show that the asymptotic distribution of T in the nonnormal case is $\mathcal{N}\left(0, \frac{n\sigma_1^2 + m\sigma_2^2}{m\sigma_1^2 + n\sigma_2^2}\right)$ for large m and n. Thus, if m = n, T is asymptotically $\mathcal{N}(0, 1)$ as in the normal-theory case assuming equal variances, even though the two samples come from nonnormal populations with unequal variances. Conclude that the test is robust in the case of large, equal sample sizes (Scheffé [101, p. 339]). (A simulation sketch of this robustness appears after these problems.)

  4. Verify the computations in the first table above using the expressions for the AREs in (28), (29), and (30).
  5. Suppose F is the DF of a G(α, β) RV. Show that
    images

    (Note that F is not symmetric.)

  6. Suppose F has PDF
    images

    for images. Compute $e_F(\tilde{X}, \bar{X})$, $e_F(W, \bar{X})$, and $e_F(\tilde{X}, W)$. (From Problem 3.2.3, images.)
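As a numerical companion to Problem 3, the following sketch of my own (assuming NumPy and SciPy, not part of the problem set) checks that with equal sample sizes the pooled t-test holds its level even for nonnormal data with unequal variances.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    m = n = 100                              # equal sample sizes: the robust case
    alpha, reps = 0.05, 10_000
    rejections = 0
    for _ in range(reps):
        x = rng.exponential(scale=1.0, size=m) - 1.0          # mean 0, variance 1
        y = 3.0 * (rng.exponential(scale=1.0, size=n) - 1.0)  # mean 0, variance 9
        _, p = stats.ttest_ind(x, y, equal_var=True)          # pooled (normal-theory) t-test
        rejections += p < alpha
    print("empirical size:", rejections / reps)   # close to the nominal 0.05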
