Chapter 10 Other Tests

In this chapter we present a well-known test for the problem of whether two independent samples are drawn from the same population. The test rests on very few assumptions; for example, the distributions need not be specified beyond the fact that they are continuous.

10.1 Two-sample tests

10.1.1 Kolmogorov–Smirnov two-sample test (Smirnov test)

Description:	Tests if two independent samples are sampled from the same distribution.
Assumptions:	Data are at least measured on an ordinal scale. The random variables $c10-math-0001$ and $c10-math-0002$ are independent with continuous distribution functions $c10-math-0003$ and $c10-math-0004$ . Samples $c10-math-0005$ and $c10-math-0006$ are independently drawn from the two populations.
Hypotheses:	(A) $c10-math-0007$ vs $c10-math-0008$ for at least one $c10-math-0009$
	(B) $c10-math-0010$ vs $c10-math-0011$ with $c10-math-0012$ for at least one $c10-math-0013$
	(C) $c10-math-0014$ vs $c10-math-0015$ with $c10-math-0016$ for at least one $c10-math-0017$
Test statistic:	(A) $c10-math-0018$
	(B) $c10-math-0019$
	(C) $c10-math-0020$
	where $c10-math-0021$ and $c10-math-0022$ denote the empirical distribution functions based on the two samples.

Test decision:	Reject $c10-math-0023$ if for the observed value $c10-math-0024$ of $c10-math-0025$
	(A) $c10-math-0026$
	(B) $c10-math-0027$
	(C) $c10-math-0028$
	The critical values $c10-math-0029$, $c10-math-0030$, $c10-math-0031$ can be found, for instance, in Sheskin (2007, Table A.23).
p-values:	(A) $c10-math-0032$
	(B) $c10-math-0033$
	(C) $c10-math-0034$
Annotations:	The test statistics evaluate the maximum distances between the two empirical distribution functions. The Smirnov test can be presented as a rank test, as the statistics can be written as the supremum of linear rank statistics (Steck 1969). The test is known both as the Kolmogorov–Smirnov test and as the Smirnov test for two samples.
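The maximum distances described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the vectors x and y hold made-up values (they are not the data of Table A.1), and the names F1, F2, D_plus and D_minus are chosen here to mirror the labels "D+ = max (F1 - F2)" and "D- = max (F2 - F1)" used in the SAS output of the example.

```r
# Hypothetical data, for illustration only (NOT the data of Table A.1)
x <- c(112, 118, 121, 125, 127, 130)        # assumed sample 1
y <- c(124, 133, 138, 141, 146, 150, 155)   # assumed sample 2

F1 <- ecdf(x)   # empirical distribution function of sample 1
F2 <- ecdf(y)   # empirical distribution function of sample 2

# Both step functions jump only at observed values, so it is enough
# to compare them on the pooled set of observations.
grid  <- sort(unique(c(x, y)))
diffs <- F1(grid) - F2(grid)

D       <- max(abs(diffs))   # two-sided distance
D_plus  <- max(diffs)        # largest amount by which F1 lies above F2
D_minus <- max(-diffs)       # largest amount by which F2 lies above F1

print(c(D = D, D_plus = D_plus, D_minus = D_minus))

# Cross-check of the two-sided distance against the built-in test
print(ks.test(x, y)$statistic)
```

Evaluating the difference only at the pooled observations is a design choice that relies on both empirical distribution functions being right-continuous step functions; between observations the difference does not change.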

Example

We test the hypothesis that the two populations of healthy subjects (status=0) and subjects with hypertension (status=1) do not differ with respect to the distribution of their systolic blood pressure. The dataset contains $c10-math-0035$ observations for status=0 and $c10-math-0036$ observations for status=1 (dataset in Table A.1).

SAS code

proc npar1way data=blood_pressure D;
 class status;
 var mmhg;
 exact edf;
run;

SAS output

             The NPAR1WAY Procedure
    Kolmogorov–Smirnov Test for Variable mmhg
          Classified by Variable status
                     EDF at    Deviation from Mean
status       N       Maximum        at Maximum
---------------------------------------------------
0           25      0.880000          2.218182
1           30      0.066667         -2.024914
Total       55      0.436364
     Maximum Deviation Occurred at Observation 25
          Value of mmhg at Maximum = 125.0
              KS  0.4050    KSa  3.0034
    Kolmogorov–Smirnov Two-Sample Test (Asymptotic)
            D = max |F1 - F2|     0.8133
            Pr > D                <.0001
            D+ = max (F1 - F2)    0.8133
            Pr > D+               <.0001
            D- = max (F2 - F1)    0.0000
            Pr > D-               1.0000

Remarks:

The option D enables the one-sided tests (B) and (C) in addition to the two-sided test (A). If only the two-sided test is desired, either omit the option or use the option EDF instead.
exact edf is optional and requests an additional exact test. Note that computing an exact test can be very time-consuming. Although this statement appears in the listing, the output shown was generated without it, because calculating the exact p-values would have taken too long even for this small dataset.
$c10-math-0037$ is the test statistic for hypothesis (B) and $c10-math-0038$ is the test statistic for hypothesis (C). From Figure 10.1 it can be seen that the cumulative distribution function of the healthy subjects lies above the cumulative distribution function of the subjects with hypertension. Accordingly, the null hypothesis of (B) is rejected while that of (C) is not.
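The remaining figures in the SAS output can be reproduced from the two "EDF at Maximum" values and the group sizes. The following R sketch is a hedged reconstruction: the formulas for the deviations, KS and KSa are an assumption about how PROC NPAR1WAY defines these quantities, checked here only by their agreement with the printed numbers, and the variable names are made up for this illustration. It also assumes that the printed 0.066667 is the rounded value of 2/30.

```r
# Values read off the SAS output above
n  <- c(25, 30)          # group sizes (column N)
Fi <- c(22/25, 2/30)     # EDF at maximum: 0.880000 and 0.066667 (assumed 2/30)
N  <- sum(n)

# Pooled EDF at the maximum (row "Total")
F_pooled <- sum(n * Fi) / N

# "Deviation from Mean at Maximum" (assumed definition)
dev <- sqrt(n) * (Fi - F_pooled)

# KS and its asymptotic version KSa (assumed definitions)
KS  <- sqrt(sum(n * (Fi - F_pooled)^2) / N)
KSa <- KS * sqrt(N)

# Two-sided distance D = max |F1 - F2|
D <- abs(Fi[1] - Fi[2])

print(round(c(F_pooled = F_pooled, dev1 = dev[1], dev2 = dev[2]), 6))
print(round(c(KS = KS, KSa = KSa, D = D), 4))
```

Rounded as in the listing, these values should agree with the Total row, the two deviations, KS, KSa and D shown in the output.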

R code

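# Split systolic blood pressure by hypertension status
x <- blood_pressure$mmhg[blood_pressure$status == 0]  # healthy subjects
y <- blood_pressure$mmhg[blood_pressure$status == 1]  # subjects with hypertension
# Two-sided test with the asymptotic p-value
ks.test(x, y, alternative = "two.sided", exact = FALSE)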

R output

Two-sample Kolmogorov–Smirnov test
data:  x and y
D = 0.8133, p-value = 2.923e-08
alternative hypothesis: two-sided

Remarks:

alternative="value" is optional and defines the type of alternative hypothesis: "two.sided"= the cumulative distribution functions of $c10-math-0039$ and $c10-math-0040$ differ (A); "greater"= the cumulative distribution function of $c10-math-0041$ lies above $c10-math-0042$ (B); "less"= the cumulative distribution function of $c10-math-0043$ lies below $c10-math-0044$ (C). Default is "two.sided".
exact=value is optional. If value is not specified or is TRUE, an exact p-value is computed if the product of the sample sizes is less than 10 000; otherwise only the approximate p-value is computed. In the case of ties or a one-sided alternative, no exact test is computed.
$c10-math-0045$ is the test statistic for hypothesis (B) with option alternative="greater" and $c10-math-0046$ is the test statistic for hypothesis (C) with option alternative="less". From Figure 10.1 it can be seen that the cumulative distribution function of the healthy subjects lies above the cumulative distribution function of the subjects with hypertension. Accordingly, the null hypothesis of (B) is rejected while that of (C) is not.
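The asymptotic two-sided p-value in the output can be checked by hand. The sketch below assumes that the approximation is the limiting Kolmogorov distribution, P(K > z) = 2 * sum over k >= 1 of (-1)^(k-1) * exp(-2 k^2 z^2), evaluated at z = D * sqrt(n1 * n2 / (n1 + n2)). The function name ks_asymptotic_p and its terms argument are hypothetical helpers introduced here, and D is taken as 22/25 - 2/30 on the assumption that these are the unrounded EDF values behind the SAS output.

```r
# Asymptotic two-sided p-value from the limiting Kolmogorov distribution.
# ks_asymptotic_p is a hypothetical helper, not part of base R.
ks_asymptotic_p <- function(D, n1, n2, terms = 100) {
  z <- D * sqrt(n1 * n2 / (n1 + n2))   # scaled test statistic
  k <- seq_len(terms)
  # Truncated alternating series for P(K > z)
  p <- 2 * sum((-1)^(k - 1) * exp(-2 * k^2 * z^2))
  min(max(p, 0), 1)                    # guard against truncation error
}

D <- 22/25 - 2/30                      # assumed unrounded value of 0.8133
p <- ks_asymptotic_p(D, n1 = 25, n2 = 30)
print(signif(p, 4))
```

With the sample sizes 25 and 30 from the example, the result should come out close to the reported p-value of 2.923e-08. Using the rounded D = 0.8133 instead shifts the result slightly, which is why the unrounded difference is used here.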

Figure 10.1 Cumulative empirical distribution functions of the blood pressure of healthy subjects (bold lines) and subjects with hypertension (non-bold lines).

References

Sheskin D. 2007 Handbook of Parametric and Nonparametric Statistical Procedures, 4th edn. Chapman & Hall.

Steck G.P. 1969 The Smirnov two sample tests as rank tests. The Annals of Mathematical Statistics 40, 1449–1466.
