9
NEYMAN–PEARSON THEORY OF TESTING OF HYPOTHESES

9.1 INTRODUCTION

Let X1, X2, …, Xn be a random sample from a population distribution Fθ, θ ∈ Θ, where the functional form of Fθ is known except, perhaps, for the parameter θ. Thus, for example, the Xi’s may be a random sample from images (θ,1), where images is not known. In many practical problems the experimenter is interested in testing the validity of an assertion about the unknown parameter θ. For example, in a coin-tossing experiment it is of interest to test, in some sense, whether the (unknown) probability of heads p equals a given number p0. Similarly, it is of interest to check the claim of a car manufacturer about the average mileage per gallon of gasoline achieved by a particular model. A problem of this type is usually referred to as a problem of testing of hypotheses and is the subject of discussion in this chapter. We will develop the fundamentals of Neyman–Pearson theory. In Section 9.2 we introduce the various concepts involved. In Section 9.3 the fundamental Neyman–Pearson lemma is proved, and Sections 9.4 and 9.5 deal with some basic results in the testing of composite hypotheses. Section 9.6 deals with locally optimal tests.

9.2 SOME FUNDAMENTAL NOTIONS OF HYPOTHESES TESTING

In Chapter 8 we discussed the problem of point estimation in sampling from a population whose distribution is known except for a finite number of unknown parameters. Here we consider another important problem in statistical inference, the testing of statistical hypotheses. We begin by considering the following examples.

To fix ideas, let us define formally the concepts involved. As usual, images and let images . It will be assumed that the functional form of Fθ is known except for the parameter θ. Also, we assume that Θ contains at least two points.

Usually the null hypothesis is chosen to correspond to the smaller or simpler subset Θ0 of Θ and is a statement of “no difference,” whereas the alternative represents change.

The problem of testing of hypotheses may be described as follows: Given the sample point x = (x1, x2, …, xn), find a decision rule (function) that will lead to a decision to reject or fail to reject the null hypothesis. In other words, partition the sample space into two disjoint sets C and Cc such that, if x ∈ C, we reject H0, and if x ∈ Cc, we fail to reject H0. In the following we will write accept H0 when we fail to reject H0. We emphasize that when the sample point x ∈ Cc and we fail to reject H0, it does not mean that H0 gets our stamp of approval. It simply means that the sample does not have enough evidence against H0.

There are two types of errors that can be made if one uses such a procedure. One may reject H0 when in fact it is true, called a type I error, or accept H0 when it is false, called a type II error:

                 H0 true          H0 false
    Accept H0    Correct          Type II error
    Reject H0    Type I error     Correct

If C is the critical region of a rule, Pθ(C), θ ∈ Θ0, is a probability of type I error, and Pθ(Cc), θ ∈ Θ1, is a probability of type II error. Ideally, one would like to find a critical region for which both these probabilities are 0. This will be the case if we can find a subset images such that images for every images and images for every images . Unfortunately, situations such as this do not arise in practice, although they are conceivable. For example, let images under H0 and images under H1. Usually, if a critical region is such that the probability of type I error is 0, it will be of the form “do not reject H0” and the probability of type II error will then be 1.

The procedure used in practice is to limit the probability of type I error to some pre-assigned level α (usually 0.01 or 0.05) that is small and to minimize the probability of type II error. To restate our problem in terms of this requirement, let us formulate these notions.
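To make the two error probabilities concrete, the following Python sketch evaluates them for the coin-tossing problem mentioned in the introduction. The specific setup (n = 20 tosses, H0: p = 0.5, H1: p = 0.7, and the rule "reject when the number of heads is at least c") is an illustrative assumption and is not taken from the text.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(c, n, p):
    """P(X >= c) for X ~ binomial(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(c, n + 1))

def error_probabilities(c, n=20, p0=0.5, p1=0.7):
    """Errors of the hypothetical rule 'reject H0: p = p0 when X >= c'."""
    type_1 = upper_tail(c, n, p0)        # reject H0 although p = p0
    type_2 = 1.0 - upper_tail(c, n, p1)  # accept H0 although p = p1
    return type_1, type_2

for c in (13, 14, 15, 16):
    a, b = error_probabilities(c)
    print(f"c = {c}: P(type I) = {a:.4f}, P(type II) = {b:.4f}")
```

Raising the cutoff c lowers the probability of type I error and raises the probability of type II error, which is why the level α is fixed first and the type II error is minimized subject to it.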

Some simple examples of test functions are images for all images for all images , or images , for all images . In fact, Definition 4 includes Definition 3 in the sense that, whenever φ is the indicator function of some Borel subset A of images , A is called the critical region (of the test φ).

The following interpretation may be given to all tests φ satisfying images for all images . To every images we assign a number images , which is the probability of rejecting H0 that images , if x is observed. The restriction images then says that, if H0 were true, φ rejects it with a probability ≤ α. We will call such a test a randomized test function. If images will be called a nonrandomized test. If images , we reject H0 with probability 1; and if images , this probability is 0. Needless to say, images .
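With a discrete statistic a nonrandomized test usually cannot attain a prescribed size exactly, and randomizing on the boundary closes the gap. The sketch below constructs such a test for a binomial count. The setup (n = 10, H0: p = 0.5, rejection for large counts, α = 0.05) and the names `randomized_cutoff`, `phi`, and `size` are assumptions chosen for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def randomized_cutoff(n, p0, alpha):
    """Find (c, gamma) so that the test below has size exactly alpha (0 < alpha < 1)."""
    tail = 0.0  # running value of P(X > c) under H0
    for c in range(n, -1, -1):
        pc = binom_pmf(c, n, p0)
        if tail + pc > alpha:
            return c, (alpha - tail) / pc
        tail += pc
    raise ValueError("alpha must be smaller than 1")

def phi(x, c, gamma):
    """Test function: probability of rejecting H0 when x is observed."""
    if x > c:
        return 1.0
    if x == c:
        return gamma
    return 0.0

def size(n, p0, c, gamma):
    """Expected value of phi(X) under H0."""
    return sum(phi(x, c, gamma) * binom_pmf(x, n, p0) for x in range(n + 1))

c, gamma = randomized_cutoff(n=10, p0=0.5, alpha=0.05)
print(c, round(gamma, 4), round(size(10, 0.5, c, gamma), 6))
```

Here φ takes the values 0 and 1 away from the boundary point c and the value γ at c, so it is a randomized test in the sense described above.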

We next turn our attention to the type II error.

In view of Definitions 5 and 6 the problem of testing of hypotheses may now be reformulated. Let images . Also, let images be given. Given a sample point x, find a test φ(x) such that images , and βφ(θ) is a maximum for images .

Note that if φ1, φ2 are two tests and λ is a real number, 0 ≤ λ ≤ 1, then λφ1 + (1 − λ)φ2 is also a test function, and it follows that the class of all test functions Φα is convex.
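The convexity claim follows from linearity of expectation. A short derivation, assuming that Φα denotes the tests φ with Eθφ(X) ≤ α for all θ ∈ Θ0 and that 0 ≤ λ ≤ 1:

```latex
\begin{aligned}
0 \le \lambda\varphi_1(x) + (1-\lambda)\varphi_2(x) &\le 1
  \quad \text{for all } x, \\
E_\theta\{\lambda\varphi_1(X) + (1-\lambda)\varphi_2(X)\}
  &= \lambda E_\theta\varphi_1(X) + (1-\lambda)E_\theta\varphi_2(X) \\
  &\le \lambda\alpha + (1-\lambda)\alpha = \alpha,
  \qquad \theta \in \Theta_0 .
\end{aligned}
```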

Remark 1. The problem of testing of hypotheses may be considered as a special case of the general decision problem described in Section 8.8. Let images , where a0 represents the decision to accept images and a1 represents the decision to reject H0. A decision function δ is a mapping of images into images . Let us introduce the following loss functions:

images

and

images

Then the minimization of EθL2(θ, δ(X)) subject to images is the hypotheses testing problem discussed above. We have

images

and

images

Remark 2. In Example 6 we saw that the chosen size α is often unattainable. The choice of a specific value of α is completely arbitrary and is determined by nonstatistical considerations such as the possible consequences of rejecting H0 falsely, and the economic and practical implications of the decision to reject H0. An alternative, and somewhat subjective, approach wherever possible is to report the so-called P-value of the observed test statistic. This is the smallest level at which the observed sample statistic is significant. In Example 6, let images is observed, then images . By symmetry, if we reject H0 for images we should do so also for images so the probability of interest is images which is the P-value. If images is observed and we decide to reject H0, then we would do so also for images because images is more extreme than images . By symmetry considerations

images

This discussion motivates Definition 9 below. Suppose the appropriate critical region for testing H0 against H1 is one-sided. That is, suppose C is either of the form {T ≥ c} or {T ≤ c}, where T is the test statistic.

If α is given, then we reject H0 if the P-value is ≤ α and do not reject H0 if the P-value is > α. In the two-sided case when the critical region is of the form {|T| ≥ c}, the one-sided P-value is doubled to obtain the P-value. If the distribution of T is not symmetric, then the P-value is not well-defined in the two-sided case, although many authors recommend doubling the one-sided P-value.
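The P-value rule can be sketched in a few lines of Python. The example assumes, purely for illustration, that T is a binomial count with n = 20 and p0 = 0.5 under H0 and that the one-sided critical region has the form {T ≥ c}; the two-sided function applies the doubling convention mentioned above, capped at 1.

```python
from math import comb

def upper_tail(t, n, p):
    """P(T >= t) for T ~ binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(t, n + 1))

def one_sided_p_value(t_obs, n, p0):
    """P-value when the critical region has the form {T >= c}."""
    return upper_tail(t_obs, n, p0)

def two_sided_p_value(t_obs, n, p0):
    """Doubling convention: twice the smaller one-sided tail, capped at 1."""
    upper = upper_tail(t_obs, n, p0)
    lower = 1.0 - upper_tail(t_obs + 1, n, p0)  # P(T <= t_obs)
    return min(1.0, 2.0 * min(upper, lower))

def reject(p_value, alpha):
    """Reject H0 exactly when the P-value does not exceed alpha."""
    return p_value <= alpha

p1 = one_sided_p_value(15, 20, 0.5)
p2 = two_sided_p_value(15, 20, 0.5)
print(f"one-sided: {p1:.4f}, two-sided: {p2:.4f}")
print(reject(p1, 0.05), reject(p1, 0.01))
```

Reporting the P-value lets a reader apply any level α after the fact, rather than being tied to one prechosen level.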

PROBLEMS 9.2

  1. A sample of size 1 is taken from a population distribution P(λ). To test H0: λ = 1 against H1: λ = 2, consider the nonrandomized test φ(x) = 1 if x > 3, and φ(x) = 0 if x ≤ 3. Find the probabilities of type I and type II errors and the power of the test against λ = 2. If it is required to achieve a size equal to 0.05, how should one modify the test φ?

    0.019, 0.857.

  2. Let X1, X2, …,Xn be a sample from a population with finite mean μ and finite variance σ2. Suppose that μ is not known, but σ is known, and it is required to test images against images . Let n be sufficiently large so that the central limit theorem holds, and consider the test
    images

    where images . Find k such that the test has (approximately) size α. What is the power of this test at images ? If the probabilities of type I and type II errors are fixed at α and β, respectively, find the smallest sample size needed.

    images .

  3. In Problem 2, if σ is not known, find k such that the test φ has size α.
  4. Let X1, X2, …, Xn be a sample from images (μ, 1). For testing images against images consider the test function
    images

    Show that the power function of φ is a nondecreasing function of μ. What is the size of the test?

  5. A sample of size 1 is taken from an exponential PDF with parameter θ, that is, images . To test images against images , the test to be used is the nonrandomized test
    images

    Find the size of the test. What is the power function?

  6. Let X1, X2, …,Xn be a sample from images (0, σ2). To test images against images , it is suggested that the test
    images

    be used. How will you find c1 and c2 such that the size of φ is a preassigned number images ? What is the power function of this test?

  7. An urn contains 10 marbles, of which M are white and 10–M are black. To test that images against the alternative hypothesis that images , one draws 3 marbles from the urn without replacement. The null hypothesis is rejected if the sample contains 2 or 3 white marbles; otherwise it is accepted. Find the size of the test and its power.

9.3 NEYMAN–PEARSON LEMMA

In this section we prove the fundamental lemma due to Neyman and Pearson [76], which gives a general method for finding a best (most powerful) test of a simple hypothesis against a simple alternative. Let {fθ, θ ∈ Θ}, where Θ = {θ0, θ1}, be a family of possible distributions of X. Also, fθ represents the PDF of X if X is a continuous type RV, and the PMF of X if X is of the discrete type. Let us write f0(x) = fθ0(x) and f1(x) = fθ1(x) for convenience.

Remark 1. It is possible to show (see Problem 6) that the test given by (1) or (2) is unique (except on a null set), that is, if φ is an MP test of size α of H0 against H1, it must have form (1) or (2), except perhaps for a set A with images .

Remark 2. An analysis of the proof of part (a) of Theorem 1 shows that test (1) is MP even if f1 and f0 are not necessarily densities.

Remark 3. If the family {fθ: θ ∈ Θ} admits a sufficient statistic, one can restrict attention to tests based on the sufficient statistic, that is, to tests that are functions of the sufficient statistic. If φ is a test function and T is a sufficient statistic, E{φ(X) | T} is itself a test function, 0 ≤ E{φ(X) | T} ≤ 1, and

Eθ{E{φ(X) | T}} = Eθφ(X),

so that φ and E{φ(X) | T} have the same power function.
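As an illustration of a likelihood-ratio test of a simple hypothesis against a simple alternative, consider the following sketch. It assumes an iid sample from a normal population with unit variance, H0: μ = μ0 against H1: μ = μ1 with μ1 > μ0; this setup and all function names are assumptions made for the example. In this case the ratio f1(x)/f0(x) equals exp{n(μ1 − μ0)x̄ − n(μ1² − μ0²)/2}, which increases in the sample mean x̄, so rejecting for a large ratio is the same as rejecting for a large x̄.

```python
from math import exp, sqrt
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()

def likelihood_ratio(xs, mu0, mu1):
    """f1(x)/f0(x) for an iid N(mu, 1) sample; depends on xs only through the mean."""
    n, xbar = len(xs), fmean(xs)
    return exp(n * (mu1 - mu0) * xbar - n * (mu1**2 - mu0**2) / 2)

def np_cutoff(n, mu0, alpha):
    """Cutoff c with P(Xbar > c) = alpha when mu = mu0."""
    return mu0 + STD_NORMAL.inv_cdf(1 - alpha) / sqrt(n)

def np_test(xs, mu0, alpha):
    """Return 1 (reject H0) or 0 (do not reject H0)."""
    return 1 if fmean(xs) > np_cutoff(len(xs), mu0, alpha) else 0

def power(n, mu0, mu1, alpha):
    """P(Xbar > c) when mu = mu1."""
    c = np_cutoff(n, mu0, alpha)
    return 1 - STD_NORMAL.cdf((c - mu1) * sqrt(n))

print(round(power(n=25, mu0=0.0, mu1=0.5, alpha=0.05), 3))
```

The cutoff depends on μ0 and α but not on μ1, and the test uses the data only through x̄, a sufficient statistic in this model, in line with Remark 3.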

PROBLEMS 9.3

  1. A sample of size 1 is taken from PDF
    images

    Find an MP test of images against images .

  2. Find the Neyman–Pearson size α test of images against images based on a sample of size 1 from the PDF
    images
  3. Find the Neyman–Pearson size α test of images against images based on a sample of size 1 from
    images
  4. Find an MP size α test of images , where images , against images where images , based on a sample of size 1.
  5. For the PDF images , find an MP size α test of images against images , based on a sample of size n.
  6. If φ* is an MP size α test of images against images show that it has to be either of form (1) or form (2) (except for a set of x that has probability 0 under H0 and H1).
  7. Let φ* be an MP size α images test of H0 against H1, and let k(α) denote the value of k in (1). Show that if images , then images .
  8. For the family of Neyman–Pearson tests show that the larger the α the smaller the images .
  9. Let images be the power of an MP size α test, where images . Show that images unless images .
  10. Let α be a real number, images , and images be an MP size α test of H0 against H1. Also, let images . Show that images is an MP test for testing H1 against H0 at level images .
  11. Let X1, X2, …,Xn be a random sample from PDF
    images

    Find an MP test of images against images .

  12. Let X be an observation in (0,1). Find an MP size α test of images if images , and images , against images . Find the power of your test.
  13. In each of the following cases of simple versus simple hypotheses images , draw a graph of the ratio images and find the form of the Neyman–Pearson test:
    1. images
    2. images
    3. images
  14. Let X1, X2, …, Xn be a random sample with common PDF
    images

    Find a size α MP test for testing images versus images .

  15. Let images , where
    images
    1. Find the form of the MP test of its size.
    2. Find the size and the power of your test for various values of the cutoff point.
    3. Consider now a random sample of size n from f0 under H0 or f1 under H1. Find the form of the MP test of its size.

9.4 FAMILIES WITH MONOTONE LIKELIHOOD RATIO

In this section we consider the problem of testing one-sided hypotheses on a single real-valued parameter. Let {fθ, θ ∈ Θ} be a family of PDFs (PMFs), Θ ⊆ ℝ, and suppose that we wish to test H0: θ ≤ θ0 against the alternatives H1: θ > θ0 or its dual, H0′: θ ≥ θ0, against H1′: θ < θ0. In general, it is not possible to find a UMP test for this problem. The MP test of images , say, against the alternative images depends on θ1 and cannot be UMP. Here we consider a special class of distributions that is large enough to include the one-parameter exponential family, for which a UMP test of a one-sided hypothesis exists.

It is also possible to define families of densities with nonincreasing MLR in T(x), but such families can be treated by symmetry.

Remark 1. The nondecreasingness of Q(θ) can be obtained by a reparametrization, putting images , if necessary.

Theorem 1 includes normal, binomial, Poisson, gamma (one parameter fixed), beta (one parameter fixed), and so on. In Example 1 we have already seen that U[0, θ], which is not an exponential family, has an MLR.
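A monotone likelihood ratio can be checked numerically on a grid. The sketch below does this for the Poisson family, one of the families listed above, taking T(x) = x, and contrasts it with the Cauchy location family, which is known not to have an MLR. The parameter values, the grids, and the helper names are assumptions for illustration; a grid check is a sanity check and not a proof.

```python
from math import exp, factorial, pi

def poisson_pmf(x, lam):
    """Poisson PMF with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

def cauchy_pdf(x, theta):
    """Cauchy density with location theta and unit scale."""
    return 1.0 / (pi * (1.0 + (x - theta) ** 2))

def is_nondecreasing(values, tol=1e-12):
    """True if each value is at least the previous one, up to tol."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))

def ratio_on_grid(density, theta1, theta2, grid):
    """Likelihood ratio f_theta2(x) / f_theta1(x) on a grid, with theta1 < theta2."""
    return [density(x, theta2) / density(x, theta1) for x in grid]

poisson_ratio = ratio_on_grid(poisson_pmf, 1.0, 2.5, range(0, 30))
cauchy_ratio = ratio_on_grid(cauchy_pdf, 0.0, 1.0, [x / 10 for x in range(-100, 101)])

print(is_nondecreasing(poisson_ratio))  # Poisson: ratio is exp(-1.5) * 2.5**x
print(is_nondecreasing(cauchy_ratio))   # Cauchy: ratio tends to 1 in both tails
```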

Remark 3. We caution the reader that UMP tests for testing images and images for the one-parameter exponential family do not exist. An example will suffice.

PROBLEMS 9.4

  1. For the following families of PMFs (PDFs) images , find a UMP size α test of images against images , based on a sample of n observations.
    1. images .
    2. images .
    3. images .
    4. images .
    5. images .
    6. images .
  2. Let X1, X2, …, Xn be a sample of size n from the PMF
    images
    1. Show that test
      images
      is UMP size α for testing images against images .
    2. Show that
      images
      is a UMP size α test of images against images .
  3. Let X1, X2, …,Xn be a sample of size n from images . Show that the test
    images

    is UMP size α for testing images against images and that the test

    images

    is UMP size α for images against images .

  4. Does the Laplace family of PDFs
    images

    possess an MLR?

  5. Let X have logistic distribution with PDF
    images

    Does {fθ} belong to the exponential family? Does {fθ} have MLR?

    1. Let fθ be the PDF of a (θ, θ) RV. Does {fθ} have MLR?
    2. Do the same as in part 1 if images .

9.5 UNBIASED AND INVARIANT TESTS

We have seen that, if we restrict ourselves to the class Φα of all size α tests, there do not exist UMP tests for many important hypotheses. This suggests that we reduce the class of tests under consideration by imposing certain restrictions.

Clearly Uα ⊂ Φα. If a UMP test exists in Φα, it is UMP in Uα. This follows by comparing the power of the UMP test with that of the trivial test φ(x) ≡ α, which shows that the UMP test is itself unbiased. It is convenient to introduce another class of tests.
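The usual unbiasedness requirement is that the power never falls below α on the alternative. The following sketch compares two size α tests of H0: μ = 0 against H1: μ ≠ 0 based on the mean of n observations from a normal population with unit variance. The model, the sample size, and the function names are assumptions for illustration. The one-sided test is biased because its power drops below α for μ < 0, while the equal-tailed two-sided test is not.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()

def power_one_sided(mu, n, alpha):
    """Power of 'reject when sqrt(n) * Xbar > z_alpha'."""
    z = Z.inv_cdf(1 - alpha)
    return 1 - Z.cdf(z - sqrt(n) * mu)

def power_two_sided(mu, n, alpha):
    """Power of 'reject when |sqrt(n) * Xbar| > z_{alpha/2}'."""
    z = Z.inv_cdf(1 - alpha / 2)
    shift = sqrt(n) * mu
    return (1 - Z.cdf(z - shift)) + Z.cdf(-z - shift)

for mu in (-0.5, -0.1, 0.0, 0.1, 0.5):
    print(f"mu = {mu:+.1f}: one-sided {power_one_sided(mu, 16, 0.05):.4f}, "
          f"two-sided {power_two_sided(mu, 16, 0.05):.4f}")
```

For μ > 0 the one-sided test has the larger power, which is the reason no single test can be UMP against the two-sided alternative within the class of all level α tests.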

It is clear that there exists at least one similar test on every Θ*, namely, φ(x) ≡ α.

Remark 1. Thus, if βφ(θ) is continuous in θ for any φ, an unbiased size α test of H0 against H1 is also α-similar for the PDFs (PMFs) of Λ, that is, for images . If we can find an MP similar test of images against H1 and if this test is unbiased size α, then necessarily it is MP in the smaller class.

It is frequently easier to find a UMP α-similar test. Moreover, tests that are UMP similar on the boundary are often UMP unbiased.

Remark 2. The continuity of the power function βφ(θ) is not always easy to check, but sufficient conditions may be found in most advanced calculus texts. See, for example, Widder [117, p. 356]. If the family of PDFs (PMFs) fθ is an exponential family, then a proof is given in Lehmann [64, p. 59].

Theorem 2 can be used only if it is possible to find a UMP α-similar test. Unfortunately this requires heavy use of conditional expectation, and we will not pursue the subject any further. We refer to Lehmann [64, chapters 4 and 5] and Ferguson [28, pp. 224–233] for further details.

Yet another reduction is obtained if we apply the principle of invariance to hypothesis testing problems. We recall that a class of distributions is invariant under a group of transformations 𝒢 if for every g ∈ 𝒢 and every θ ∈ Θ there exists a unique θ′ ∈ Θ such that g(X) has distribution Pθ′, whenever X ~ Pθ. We rewrite θ′ = ḡθ.

In a hypothesis testing problem we need to reformulate the principle of invariance. First, we need to ensure that under transformations images not only does images remain invariant but also the problem of testing images against images remains invariant. Second, since the problem has not changed by application of images , the decision also must not change.

The search for UMP invariant tests is greatly facilitated by the use of the following result.

Remark 3. The use of Theorem 3 is obvious. If a hypothesis testing problem is invariant under a group images , the principle of invariance restricts attention to invariant tests. According to Theorem 3, it suffices to restrict attention to test functions that are functions of a maximal invariant T.
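A small numerical sketch may help fix the idea of a maximal invariant. It assumes the group of common shifts g_c(x) = (x1 + c, …, xn + c), for which T(x) = (x1 − xn, …, x_{n−1} − xn) is a standard maximal invariant. The group, the statistic, and the test below are illustrative assumptions and are not the examples of the text.

```python
def shift(xs, c):
    """Group action g_c(x) = (x1 + c, ..., xn + c)."""
    return [x + c for x in xs]

def maximal_invariant(xs):
    """T(x) = (x1 - xn, ..., x_{n-1} - xn), constant on orbits of the shift group."""
    return [x - xs[-1] for x in xs[:-1]]

def invariant_test(xs, k):
    """A hypothetical test that depends on x only through T(x)."""
    t = maximal_invariant(xs)
    return 1 if sum(d * d for d in t) > k else 0

def orbit_representative(xs):
    """Shift by -xn; points with equal T(x) map to the same representative."""
    return shift(xs, -xs[-1])

x = [1.0, 4.0, 2.5]
y = shift(x, 3.0)
print(maximal_invariant(x) == maximal_invariant(y))        # invariance of T
print(invariant_test(x, 2.0) == invariant_test(y, 2.0))    # same decision on the orbit
print(orbit_representative(x) == orbit_representative(y))  # equal T implies same orbit
```

Invariance means T does not change along an orbit; maximality means that two points with the same value of T must lie on the same orbit, which is what the last check illustrates.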

A particular case of Example 4 will be, for instance, to test images , against images . See Problem 1.

PROBLEMS 9.5

  1. To test images against images a sample of size 2 is available on X. Find a UMP invariant test of H0 against H1.
  2. Let X1, X2, …, Xn be a sample from P(λ). Find a UMP unbiased size α test for the null hypothesis images against alternatives images by the methods of this section.
  3. Let images . By the methods of this section find a UMP unbiased size α test of images against images .
  4. Let X1, X2, …, Xn be iid images (μ,σ2) RVs. Consider the problem of testing images against images :
    1. It suffices to restrict attention to sufficient statistic (U,V) where images and images . Show that the problem of testing H0 is invariant under images and a maximal invariant is images .
    2. Show that the distribution of T has MLR and a UMP invariant test rejects H0 when images .
  5. Let X1, X2, …,Xn be iid RVs and let H0 be that images , and H1 be that the common PDF is images . Find the form of the UMP invariant test of H0 against H1.
  6. Let X1,X2, …,Xn be iid RVs and suppose images and images :
    1. Show that the problem of testing H0 against H1 is invariant under scale changes images and a maximal invariant is images .
    2. Show that the MP invariant test rejects H0 when images < k where images , or equivalently when
      images

9.6 LOCALLY MOST POWERFUL TESTS

In the previous section we argued that whenever a UMP test does not exist, we restrict the class of tests under consideration and then find a UMP test in the subclass. Yet another approach when no UMP test exists is to restrict the parameter set to a subset of Θ1. In most problems, the parameter values that are close to the null hypothesis are the hardest to detect. Tests that have good power properties for “local alternatives” may also retain good power properties for “nonlocal” alternatives.

We assume that the tests under consideration have continuously differentiable power function at θ = θ0 and the derivative may be taken under the integral sign. In that case, an LMP test maximizes

(3)images

subject to the size constraint (1). A slight extension of the Neyman–Pearson lemma (Remark 9.3.2) implies that a test satisfying (1) and given by

(4)images

will maximize β′φ(θ0). It is possible that a test that maximizes β′φ(θ0) is not LMP, but if the test maximizes β′(θ0) and is unique, then it must be an LMP test (see Kallenberg et al. [49, p. 290] and Lehmann [64, p. 528]).
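In symbols, the argument can be sketched as follows for the alternative θ > θ0, assuming differentiation under the integral sign is permitted. The symbols φ0, k, and the randomization constant γ on the boundary are notation assumed for this sketch rather than a transcription of displays (3) and (4).

```latex
\begin{aligned}
\beta_\varphi'(\theta_0)
  &= \frac{\partial}{\partial\theta} E_\theta \varphi(X)\Big|_{\theta=\theta_0}
   = \int \varphi(x)\,
     \frac{\partial}{\partial\theta} f_\theta(x)\Big|_{\theta=\theta_0}\, dx, \\[4pt]
\varphi_0(x) &=
\begin{cases}
1, & \dfrac{\partial}{\partial\theta} f_\theta(x)\Big|_{\theta=\theta_0}
      > k\, f_{\theta_0}(x), \\[6pt]
\gamma, & \dfrac{\partial}{\partial\theta} f_\theta(x)\Big|_{\theta=\theta_0}
      = k\, f_{\theta_0}(x), \\[6pt]
0, & \dfrac{\partial}{\partial\theta} f_\theta(x)\Big|_{\theta=\theta_0}
      < k\, f_{\theta_0}(x),
\end{cases}
\qquad E_{\theta_0}\varphi_0(X) = \alpha .
\end{aligned}
```

Wherever f_{θ0}(x) > 0, the rejection condition is equivalent to requiring that the derivative of log fθ(x) with respect to θ, evaluated at θ0, exceed k.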

Note that for x for which images we can write

images

and then

In each case the power function is differentiable and the derivatives may be taken inside the integral sign because the PDF is a one–parameter exponential type PDF.

In this case {fθ} does not have MLR. A direct computation using the Neyman–Pearson lemma shows that an MP test of images against images depends on θ1 and hence cannot be MP for testing images against images . Hence a UMP test of H0 against H1 does not exist. An LMP test of H0 against H1 is of the form

images

where k is chosen so that the size of φ0 is α. For small n it is hard to compute k, but for large n it is easy to compute k using the central limit theorem. Indeed images are iid RVs with mean 0 and finite variance images so that images will give an (approximate) level α test for large n.

The test φ0 is good at detecting small departures from images but it is quite unsatisfactory in detecting values of θ away from 0. In fact, for images as images .
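The behavior described above can be explored by simulation. The sketch below assumes the Cauchy location family with unit scale, H0: θ = 0 against H1: θ > 0, and the statistic Σ 2xi/(1 + xi²), which is the derivative of the log-likelihood at θ = 0. Under H0 each term has mean 0 and variance 1/2, so the central limit theorem suggests the cutoff k ≈ z_α √(n/2). The family, the sample size, the number of replications, and the seed are all assumptions made for this illustration.

```python
import random
from math import pi, sqrt, tan
from statistics import NormalDist

def cauchy_sample(n, theta, rng):
    """n draws from the density 1 / (pi * (1 + (x - theta)^2)) by inversion."""
    return [theta + tan(pi * (rng.random() - 0.5)) for _ in range(n)]

def lmp_statistic(xs):
    """Sum of 2x / (1 + x^2), the score of the Cauchy location model at theta = 0."""
    return sum(2 * x / (1 + x * x) for x in xs)

def clt_cutoff(n, alpha):
    """Approximate k: under H0 each term has mean 0 and variance 1/2."""
    return NormalDist().inv_cdf(1 - alpha) * sqrt(n / 2)

def rejection_rate(theta, n, alpha, reps, seed=0):
    """Monte Carlo estimate of the power of the approximate test at theta."""
    rng = random.Random(seed)
    k = clt_cutoff(n, alpha)
    hits = sum(lmp_statistic(cauchy_sample(n, theta, rng)) > k for _ in range(reps))
    return hits / reps

size_hat = rejection_rate(0.0, n=30, alpha=0.05, reps=4000)
near_hat = rejection_rate(0.5, n=30, alpha=0.05, reps=4000)
far_hat = rejection_rate(50.0, n=30, alpha=0.05, reps=4000)
print(size_hat, near_hat, far_hat)
```

In this setup the estimated rejection rate should be close to α at θ = 0, well above α for a nearby alternative, and close to 0 for a distant one, since each term 2x/(1 + x²) is small when x is large. That is consistent with the remark that the test detects small departures well but is unsatisfactory far from the null value.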

This procedure for finding locally best tests has applications in nonparametric statistics. We refer the reader to Randles and Wolfe [85, section 9.1] for details.

PROBLEMS 9.6

  1. Let X1, X2, …, Xn be iid images (1, θ) RVs. Show that images . Hence or otherwise show that images .
  2. Let X1,X2, …,Xn be a random sample from logistic PDF
    images

    Show that the LMP test of images against images rejects H0 if images .

  3. Let X1,X2, …,Xn be iid RVs with common Laplace PDF
    images

    For images show that UMP size images test of images against images does not exist. Find the form of the LMP test.
