Chapter 9

Chi-Squared Tests for Specific Distributions

9.1 Tests for Poisson, binomial, and “binomial” approximation of Feller’s distribution

Let image be independent and identically distributed random variables. Consider the problem of testing the composite hypothesis image according to which the distribution of image belongs to the parametric family of Poisson distributions

image (9.1)

Let image be the observed frequencies, that is, the numbers of realized values of image that fall into a specific class or interval image, where the fixed integer grouping classes image are such that image. As before, let image be a column vector of standardized grouped frequencies with its components as

image

If the nuisance parameter image is estimated effectively from grouped data by image, then the standard Pearson’s sum image will follow in the limit the chi-square distribution with image degrees of freedom. If, on the other hand, the parameter image is estimated from raw (ungrouped) data, for example, by the maximum likelihood estimate (MLE), then the standard Pearson test must be modified. Let image be a image (s being the dimensionality of the parameter space) matrix with its elements as

image

For the null hypothesis in (9.1), the image matrix image possesses its elements as

image

Let image be the MLE of image based on the raw data. Then, the well-known Nikulin-Rao-Robson (NRR) modified chi-squared test statistic, distributed in the limit as image, can be expressed as

image (9.2)

where image and image are the estimators of the Fisher information matrix (scalar) and of the column vector image, respectively.
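The construction above can be sketched numerically. The explicit form of (9.2) is not reproduced in this text, so the sketch below assumes the general scalar-parameter NRR form Y² = X² + (BᵀV)² / (J − BᵀB), with the Poisson Fisher information J = 1/θ and b_j = p_j^(-1/2) ∂p_j/∂θ. The function name `nrr_poisson` and the argument `cuts` (upper end points of the first r − 1 integer classes) are hypothetical names chosen for illustration.

```python
import math


def poisson_pmf(x, theta):
    """Poisson probability mass at integer x for mean theta > 0."""
    return math.exp(-theta + x * math.log(theta) - math.lgamma(x + 1))


def nrr_poisson(sample, cuts):
    """Return (theta_hat, Pearson sum X^2, NRR-type statistic Y^2).

    Classes are {0..cuts[0]}, {cuts[0]+1..cuts[1]}, ..., {cuts[-1]+1, ...}.
    Assumed form: Y^2 = X^2 + (B'V)^2 / (J - B'B), J = 1/theta.
    """
    n = len(sample)
    theta = sum(sample) / n              # MLE from raw (ungrouped) data
    r = len(cuts) + 1
    p = [0.0] * r                        # cell probabilities
    dp = [0.0] * r                       # derivatives d p_j / d theta
    lo = 0
    for j, c in enumerate(cuts):
        for x in range(lo, c + 1):
            f = poisson_pmf(x, theta)
            p[j] += f
            dp[j] += f * (x / theta - 1.0)
        lo = c + 1
    p[r - 1] = 1.0 - sum(p[: r - 1])     # open-ended last class
    dp[r - 1] = -sum(dp[: r - 1])        # derivatives sum to zero
    counts = [0] * r
    for x in sample:
        counts[sum(1 for c in cuts if x > c)] += 1
    v = [(counts[j] - n * p[j]) / math.sqrt(n * p[j]) for j in range(r)]
    b = [dp[j] / math.sqrt(p[j]) for j in range(r)]
    x2 = sum(vj * vj for vj in v)
    bv = sum(bj * vj for bj, vj in zip(b, v))
    bb = sum(bj * bj for bj in b)
    y2 = x2 + bv * bv / (1.0 / theta - bb)
    return theta, x2, y2
```

Because the correction term is nonnegative, the value returned as `y2` is never smaller than the Pearson sum `x2` computed from the same cells.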

Next, let us consider the binomial null hypothesis. Let the probability mass function be specified as

image (9.3)

where image and image. In this case, the Fisher information matrix (scalar) is image and the elements of the matrix image are

image

Let image be the MLE of image. Then, the modified chi-squared test statistic, distributed in the limit as image, can be expressed as

image (9.4)

where image and image.

Now, let the probability distribution of the null hypothesis follow Feller’s (1948, pp. 105–115) discrete distribution with cumulative distribution function

image (9.5)

where image is the complement gamma function.

If the parameter image and image, then the distribution function in (9.5) can be approximated as

image

Using the results of Bol’shev (1963), a more accurate approximation of (9.5) can be obtained as

image (9.6)

where image and image. Note that (9.6) looks like the binomial distribution function with the exception that the parameter image can be any real positive number. Consider the probability mass function of (9.6) given by

image (9.7)

where the parameter image. In this case, there are three possibilities to construct a modified chi-squared test for testing a composite null hypothesis about the distribution in (9.6). The first is to use the MLEs of image and image and the NRR statistic image. Since the MLEs of image and image cannot be derived easily, the other two options, namely the modified test image based on MMEs (see Eq. (4.9)) or Singh’s image (see Eq. (3.25)), can be used instead.

For the model in (9.7) and image, the DN statistic image will follow in the limit image and image. If image, then, as before (see Eq. (4.12)), we will have

image

In this case, image, image, and

image

To specify the above tests, we of course will need explicit expressions of all the matrices involved.

The elements of the Fisher information matrix image for the model in (9.7) are

image

where image is the largest integer contained in image and image is the psi-function. It is known that series expansions for the psi-function converge very slowly. But, for integer values of x, a recurrence image can be used, from which it follows that

image

This result permits us to calculate all expressions containing image with a very high accuracy.
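As an illustration of the integer-argument shortcut, the sketch below assumes the standard recurrence ψ(x + 1) = ψ(x) + 1/x together with ψ(1) = −γ, which gives ψ(n) = −γ + Σ_{k=1}^{n−1} 1/k. The exact expressions used in the source are not reproduced in this text, and the name `psi_int` is hypothetical.

```python
EULER_GAMMA = 0.5772156649015329  # Euler's constant


def psi_int(n):
    """Psi (digamma) function at a positive integer n.

    Uses psi(1) = -gamma and psi(x + 1) = psi(x) + 1/x, so that
    psi(n) = -gamma + sum_{k=1}^{n-1} 1/k (a finite sum, no series needed).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))
```

The sum is finite, so the only error is floating-point rounding, which is the sense in which such expressions can be evaluated with very high accuracy.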

The elements of the matrix image are

image

The elements of the matrix image are

image

The elements of the matrix image are

image

The components image and image of the vector image for Singh’s test image for the model in (9.7) are

image

and

image

It has to be noted that the test in (3.25) is computationally much more complicated than the statistic image for large samples.

For the model in (9.7), image and image. Denoting the first two sample moments by image and image and equating them to the population moments, the MMEs of image and image are obtained as

image (9.8)

From (9.8), we see that negative values of image and image are possible, but the proportion of such estimates will be almost negligible for samples of size image. It seems that the image test can be used for analyzing Rutherford’s data, but the question of the image-consistency of the MMEs in (9.8) is still open.

To examine the rate of convergence of estimators image and image for sample sizes image, we simulated 3000 estimates of image and image assuming that image and image, values that correspond to Rutherford’s data. The power curve fit of image, the average value of estimates image for 3000 runs, in Figure 9.1 shows that image and image. The power curve fit of image in Figure 9.2 gives image and image.

To check for the distribution of the statistic image under the null “Feller’s” distribution (image and image), we simulated image values of image. The histogram of these values is well described by the image distribution (see Figure 9.3). The average value image also does not contradict the assumption that the statistic image follows in the limit the chi-squared distribution with one degree of freedom.

Another important property of any test statistic is the independence of its limiting distribution from the unknown parameters. To check for this feature of the test image, we simulated image values of image assuming that image (half the value under the null hypothesis image) and image (twice the value under the null hypothesis image). The results (Figure 9.4) show that the simulated values do not contradict the independence, because the histogram is again well described by the image distribution.

image

Figure 9.1 Simulated average value of image (circles) and the power function fit (solid line) as a function of the sample size n.

image

Figure 9.2 Simulated average value of image (circles) and the power function fit (solid line) as a function of the sample size n.

image

Figure 9.3 The histogram of the 1000 simulated values of image for the null hypothesis (image and image) and the image distribution (solid line).

image

Figure 9.4 The histogram of the 1000 simulated values of image for image and the image distribution (solid line).

The above results evidently allow us to use the HRM statistic image for Rutherford’s data analysis.

9.2 Elements of matrices K, B, C, and V for the three-parameter Weibull distribution

For r equiprobable cells of the model in (4.25), the borders of equiprobable intervals are defined as:

image

Then, the elements of the matrix image are as follows:

image

where image and image is the psi-function. For the required calculation of image, we used the series expansion

image

where image is Euler’s constant.
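For non-integer arguments, a series involving Euler’s constant can be evaluated directly. The specific expansion used in the source is not reproduced in this text; the sketch below assumes one standard form, ψ(x) = −γ + Σ_{k≥0} [1/(k + 1) − 1/(k + x)], truncated after a fixed number of terms. The name `psi_series` and the default truncation length are illustrative choices.

```python
EULER_GAMMA = 0.5772156649015329  # Euler's constant


def psi_series(x, terms=200000):
    """Truncated series for the psi (digamma) function at real x > 0.

    Assumed expansion: psi(x) = -gamma + sum_{k>=0} [1/(k+1) - 1/(k+x)].
    Each bracket is rewritten as (x - 1) / ((k + 1)(k + x)) to avoid
    subtracting two nearly equal numbers. The truncation error is roughly
    (x - 1) / terms, so convergence is slow.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    total = 0.0
    for k in range(terms):
        total += (x - 1.0) / ((k + 1.0) * (k + x))
    return -EULER_GAMMA + total
```

The error decays only like 1/terms, so a longer truncation or a tail correction would be needed where more than a few digits matter.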

Similarly, the elements of the matrices image and image are as follows:

image

where the population moments are

image

and image is the incomplete gamma function. For the required calculation of image, we used the following series expansion (Prudnikov et al., 1981, p. 705):

image

Finally, the elements of the matrix image are as follows:

image

9.3 Elements of matrices J and B for the Generalized Power Weibull distribution

Elements image of the Fisher information matrix image are as follows:

image

Next, the elements of matrix image are as follows:

image

9.4 Elements of matrices J and B for the two-parameter exponential distribution

Consider the two-parameter exponential distribution with cumulative distribution function

image (9.9)

where the unknown parameter image. It is easily verified that the matrix image for the model in (9.9) is

image (9.10)

Based on the set of n i.i.d. random variables image, the MLE image of the parameter image equals image, where

image (9.11)

Consider r disjoint equiprobable intervals

image

For these intervals, the elements of the matrix image (see Eq. (3.4)) are

image

Using the matrix in (9.10) and the above elements of the matrix image with image replaced by the MLE image in (9.11), the NRR test image (see Eq. (3.8)) can be used. When using Microsoft Excel, calculations in double precision are recommended.
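The estimation and grouping steps of this section can be sketched as follows. Since (9.9) and (9.11) are not reproduced in this text, the sketch assumes the usual parameterization F(x) = 1 − exp(−(x − μ)/θ) for x ≥ μ, for which the MLEs are the sample minimum and the sample mean minus the sample minimum, and the borders of r equiprobable intervals are μ − θ ln(1 − j/r), j = 1, …, r − 1. All function names are hypothetical.

```python
import math


def exp2_mle(sample):
    """MLEs (mu_hat, theta_hat) under the assumed model
    F(x) = 1 - exp(-(x - mu) / theta), x >= mu."""
    mu = min(sample)
    theta = sum(sample) / len(sample) - mu
    return mu, theta


def equiprobable_borders(mu, theta, r):
    """Inner borders x_1 < ... < x_{r-1} with F(x_j) = j / r."""
    return [mu - theta * math.log(1.0 - j / r) for j in range(1, r)]


def cell_counts(sample, borders):
    """Observed frequencies in the r cells defined by the inner borders."""
    counts = [0] * (len(borders) + 1)
    for x in sample:
        counts[sum(1 for b in borders if x > b)] += 1
    return counts
```

With the borders evaluated at the MLEs, each cell has estimated probability 1/r, so the observed frequencies returned by `cell_counts` are compared with n/r when the statistic is assembled.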

9.5 Elements of matrices B, C, K, and V to test the logistic distribution

Let image, and image, be the borders of r equiprobable random grouping intervals. Then, the probability of falling into each interval is image.

The elements of the image matrix image, for image, are as follows:

image

Next, the elements of the image matrix image are as follows:

image

where image is Euler’s dilogarithm function that can be computed by the series expansion

image

and by the expansion

image

for image (Prudnikov et al., 1986, p. 763).
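A series evaluation of the dilogarithm can be sketched as below. The two expansions cited from Prudnikov et al. are not reproduced in this text, so the sketch assumes the common definition Li₂(x) = Σ_{k≥1} x^k / k², valid for |x| ≤ 1; the source may use a different convention or a faster-converging form. The name `dilog_series` and its tolerance arguments are illustrative.

```python
import math


def dilog_series(x, tol=1e-16, max_terms=100000):
    """Dilogarithm via the power series sum_{k>=1} x^k / k^2, |x| <= 1.

    Convergence is fast for small |x| and slow near |x| = 1, where the
    loop stops at max_terms with a correspondingly larger error.
    """
    if abs(x) > 1.0:
        raise ValueError("power series converges only for |x| <= 1")
    total = 0.0
    power = 1.0
    for k in range(1, max_terms + 1):
        power *= x
        term = power / (k * k)
        total += term
        if abs(term) < tol:
            break
    return total


# Known closed form used as a check: Li2(1/2) = pi^2/12 - (ln 2)^2 / 2
print(dilog_series(0.5), math.pi ** 2 / 12 - math.log(2) ** 2 / 2)
```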

Finally, we have the matrices image and image as

image

9.6 Testing for normality

The software of Sections 9.6–9.10 requires Windows XP or Windows 7 and MS Office 2003, 2007, or 2010.

1. Open file Testing Normality.xls;

2. Enter your sample data in column “I” starting from cell 1;

3. Click the button “Compute,” then enter the sample size and the desired number of equiprobable intervals image. The recommended number of intervals for the NRR test image in (3.8), under close alternatives (such as the logistic), is image. The recommended number of intervals for the test image in (3.24) is image (see Section 4.4.1). Note that the power of image can exceed that of the NRR test;

4. Click OK;

5. Numerical values of image and image are in cells F2 and G2, respectively. Cells F3 and G3 contain the corresponding percentage points at level 0.05. The P-values of image and image are in cells F4 and G4, respectively.
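The grouping that underlies the steps above can be sketched outside the workbook. This is not the workbook’s code: it only shows how r equiprobable random intervals could be formed from the fitted normal distribution and how the observed frequencies and the plain Pearson sum follow. The two statistics reported in cells F2 and G2 add correction terms that are not reproduced here. The function name is hypothetical, and Python 3.8 or later is assumed for `statistics.NormalDist`.

```python
import math
from statistics import NormalDist, fmean


def normal_equiprobable_cells(sample, r):
    """Return (inner borders, cell counts, Pearson sum) for r cells that
    are equiprobable under the normal distribution fitted by the MLEs."""
    n = len(sample)
    mean = fmean(sample)
    # The MLE of the standard deviation uses divisor n, not n - 1.
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / n)
    fitted = NormalDist(mean, sd)
    borders = [fitted.inv_cdf(j / r) for j in range(1, r)]
    counts = [0] * r
    for x in sample:
        counts[sum(1 for b in borders if x > b)] += 1
    expected = n / r                     # same expected count in every cell
    pearson = sum((c - expected) ** 2 / expected for c in counts)
    return borders, counts, pearson
```

Because the borders depend on the estimated parameters, the intervals are random, which is why the unmodified Pearson sum does not have the usual chi-squared limit here.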

9.7 Testing for exponentiality

9.7.1 Test of Greenwood and Nikulin (see Section 3.6.1)

1. Open file Testing Exp GrNik.xls;

2. Enter your sample data in column “I” starting from cell 1;

3. Click the button “Compute,” then enter the sample size and the desired number of equiprobable intervals. The recommended number of equiprobable intervals is image;

4. Click OK;

5. The numerical value of image (see Eq. (3.44)) is in cell F2. The percentage point at level 0.05 and the P-value are in cells F3 and F4, respectively.

9.7.2 Nikulin-Rao-Robson test (see Eq. (3.8) and Section 9.4)

1. Open file Testing NRR 2-param EXP.xls;

2. Enter your sample data in column “I” starting from cell 1;

3. Click the button “Compute,” then enter the sample size and the desired number of equiprobable intervals. The recommended number of equiprobable intervals is image;

4. Click OK;

5. The numerical value of image is in cell F2. The percentage point at level 0.05 and the P-value are in cells F3 and F4, respectively.
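The P-value reported in step 5 is an upper-tail chi-squared probability. As a rough cross-check outside Excel, it could be computed as sketched below, given the observed statistic and the degrees of freedom of its limiting distribution. The degrees of freedom must be supplied by the reader from the relevant section; the function name `chi2_sf` is hypothetical and the method (a series for the regularized lower incomplete gamma function) is a choice made here, not the workbook’s.

```python
import math


def chi2_sf(x, df, max_terms=10000):
    """Upper-tail probability P(chi2_df > x) for df > 0.

    Uses gamma(a, z) = z^a e^{-z} sum_{k>=0} z^k / (a (a+1) ... (a+k))
    with a = df / 2 and z = x / 2. Accuracy degrades for very small
    P-values because the result is formed as 1 minus a number near 1.
    """
    if x <= 0.0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    term = 1.0 / a
    total = term
    for k in range(1, max_terms + 1):
        term *= z / (a + k)
        total += term
        if term < total * 1e-17:
            break
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)
```

For example, `chi2_sf(statistic, df)` below 0.05 corresponds to the statistic exceeding the percentage point at level 0.05.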

9.8 Testing for the logistic

1. Open file Testing Logistic.xls;

2. Enter your sample data in column “I” starting from cell 1;

3. Click the button “Compute,” then enter the sample size and the desired number of equiprobable intervals image. The recommended number of equiprobable intervals, for close alternatives (such as the normal), is image;

4. Click OK;

5. Numerical values of image in (4.9) and image in (4.13) are in cells E2 and F2, respectively. Cells E3 and F3 contain the corresponding percentage points at level 0.05. The P-values of image and image are in cells E4 and F4, respectively.

9.9 Testing for the three-parameter Weibull

1. Open file Testing Weibull3.xls;

2. Enter your sample data in column “I” starting from cell 1;

3. Click the button “Compute,” then enter the sample size and the desired number of equiprobable intervals image. The recommended number of equiprobable intervals for the Exponentiated Weibull and Power Generalized Weibull alternatives is image;

4. Click OK;

5. Numerical values of image in (4.9) and image in (4.14) (see also Section 9.2) are in cells F2 and G2, respectively. Note that the power of image is usually higher than that of image. Cells F3 and G3 contain the corresponding percentage points at level 0.05. The P-values of image and image are in cells F4 and G4, respectively.

9.10 Testing for the Power Generalized Weibull

1. Open file Test for PGW (Left-tailed).xls;

2. Enter your sample data in column “I” starting from cell 1;

3. Click the button “Run,” then enter the sample size and the desired number of equiprobable intervals image. The recommended number of equiprobable intervals for the Exponentiated Weibull, Generalized Weibull, and Three-Parameter Weibull alternatives is image;

4. Click OK. Note that the power of image in (3.50) is usually higher than that of image in (3.48);

5. Numerical values of image and image (see Eqs. (3.48), (3.50) and Section 9.3) are in cells F2 and G2, respectively. Cells F3 and G3 contain the corresponding percentage points at level 0.05. The P-values of image and image are in cells F4 and G4, respectively.

9.11 Testing for two-dimensional circular normality

1. Open file Testing Circular Normality.xls;

2. Enter your sample data in columns “I” and “J” starting from cell 1;

3. Click the button “Compute,” then enter the sample size and the desired number of equiprobable intervals. The recommended number of intervals for the two-dimensional logistic alternative is image, while the recommended number of intervals for the two-dimensional normal alternative is 3;

4. Click OK;

5. Numerical values of image and image (see Section 3.5.3) are in cells F2 and G2, respectively. Cells F3 and G3 contain the corresponding percentage points at level 0.05. The P-values of image and image are in cells F4 and G4, respectively.

References

1. Bol’shev LN. Asymptotical Pearson’s transformations. Theory of Probability and its Applications. 1963;8:129–155.

2. Feller W. On probability problems in the theory of counters. In: Courant Anniversary Volume. New York: Interscience Publishers; 1948:105–115.

3. Prudnikov AP, Brychkov YA, Marichev OI. Integrals and Series: Elementary Functions. Moscow: Nauka; 1981.

4. Prudnikov AP, Brychkov YA, Marichev OI. Integrals and Series: Additional Chapters. Moscow: Nauka; 1986.
