Chapter 9

Tests on scale difference

In this chapter we present nonparametric tests for the scale parameter. More precisely, the tests check whether two samples come from the same population, with alternatives characterized by differences in dispersion. Such tests are called tests on scale, spread, or dispersion. The best known is the Siegel–Tukey test (Test 9.1.1). The tests presented here can be used when the samples are not normally distributed, but the assumption of equal medians is crucial.

9.1 Two-sample tests

9.1.1 Siegel–Tukey test

Description: Tests if the scale (variance) of two independent populations is the same.
Assumptions:
  • Data are measured at least on an ordinal scale.
  • Samples c09-math-0001 and c09-math-0002 are independently drawn from the two populations, c09-math-0003.
  • The random variables c09-math-0004 and c09-math-0005 are independent with continuous distribution functions c09-math-0006 and c09-math-0007, scale parameters c09-math-0008 and median c09-math-0009. It holds that c09-math-0010.
  • c09-math-0011 and c09-math-0012 belong to the same distribution function with possible differences in scale and location. Under the assumption of equal medians, the hypothesis c09-math-0013 reduces to c09-math-0014.
Hypotheses: (A) c09-math-0015 vs c09-math-0016
(B) c09-math-0017 vs c09-math-0018
(C) c09-math-0019 vs c09-math-0020
Test statistic: For c09-math-0021 the test statistic is given by:
c09-math-0022
c09-math-0023

equation

If c09-math-0025 is odd, the above ranking is applied after the middle observation of the combined and ordered sample has been discarded and the sample size has been reduced to c09-math-0026.
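
As a sketch, the ranking scheme can be written out as follows. It is read off the R implementation given later in this section rather than taken from the typeset formula, so the notation is an assumption: $N$ denotes the (even) combined sample size, $a_N(i)$ the score of the $i$th smallest observation of the combined sample, and $V_i$ indicates whether that observation belongs to the sample whose scores are summed.

```latex
a_N(i) =
\begin{cases}
2i,        & i \text{ even},\ 1 < i \le N/2,\\
2(N-i)+2,  & i \text{ even},\ N/2 < i \le N,\\
2i-1,      & i \text{ odd},\ 1 \le i \le N/2,\\
2(N-i)+1,  & i \text{ odd},\ N/2 < i < N,
\end{cases}
\qquad
S = \sum_{i=1}^{N} a_N(i)\, V_i ,
\quad
V_i \in \{0, 1\}.
```

For $N = 8$ this gives the scores $1, 4, 5, 8, 7, 6, 3, 2$ for the ordered observations.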

Test decision: Reject c09-math-0027 if for the observed value c09-math-0028 of c09-math-0029
(A) c09-math-0030 or c09-math-0031
(B) c09-math-0032
(C) c09-math-0033
p-value: (A) c09-math-0034
(B) c09-math-0035
(C) c09-math-0036
Annotations:
  • Tables with critical values c09-math-0037 can be found in Siegel and Tukey (1960). Because of the ranking procedure used, the tables of critical values for the Wilcoxon rank sum test for location can also be used.
  • For the calculation of the test statistic, first combine both samples, order the combined sample from the lowest to the highest value, and assign ranks according to the above ranking scheme. Hence, the lowest value gets the rank c09-math-0038, the highest value the rank c09-math-0039, the second highest value the rank c09-math-0040, the second lowest value the rank c09-math-0041, the third lowest value the rank c09-math-0042, and so forth. The above test statistic c09-math-0043 is the sum of the ranks of the sample of c09-math-0044 based on the assumption c09-math-0045. The test can also be based on the ranks of c09-math-0046-observations in the combined sample. Usually the sum of ranks of the sample with the smaller sample size is used for arithmetic convenience (Siegel and Tukey 1960).
  • The distribution with the larger scale will have the lower sum of ranks, because the lower ranks are on both ends of the combined sample.
  • Removing the middle observation when the combined sample size is odd is not strictly necessary. Its advantage is that the sum of the ranks of adjacent observations is always the same, so the sum of ranks has a symmetric distribution under c09-math-0047.
  • For large samples the test statistic c09-math-0048 can be used, which approximately follows a standard normal distribution. The sign has to be chosen such that c09-math-0049 is smaller (Siegel and Tukey 1960).
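
The ranking scheme described in the annotations can be sketched in R as follows. The function name siegel_tukey_scores is made up for this illustration and the combined sample size n is assumed to be even, that is, the middle observation has already been dropped if necessary.

```r
# Siegel-Tukey scores for the i-th smallest of n ordered observations.
# Assumes n is even (drop the middle observation first if it is odd).
siegel_tukey_scores <- function(n) {
  stopifnot(n >= 2, n %% 2 == 0)
  i <- seq_len(n)
  lower <- i <= n / 2      # position in the lower half of the sample
  even  <- i %% 2 == 0     # even position
  ifelse(lower,
         ifelse(even, 2 * i, 2 * i - 1),
         ifelse(even, 2 * (n - i) + 2, 2 * (n - i) + 1))
}

scores <- siegel_tukey_scores(8)
print(scores)         # low scores sit at both ends of the ordered sample
print(sort(scores))   # the scores are a permutation of 1..n
```

Because the scores are a permutation of 1, ..., n, the rank sum has the same null distribution as the Wilcoxon rank sum statistic, which is why the same tables of critical values apply.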

Example
We test the hypothesis that the dispersion of the systolic blood pressure is the same in the two populations of healthy subjects (status=0) and subjects with hypertension (status=1). The dataset contains 25 observations for status=0 and 30 observations for status=1 (dataset in Table A.1).


SAS code
proc npar1way data=blood_pressure correct=no st;
 var mmhg;
 class status;
 exact st;
run;
SAS output
                   The NPAR1WAY Procedure
           Siegel–Tukey Scores for Variable mmhg
              Classified by Variable status
                 Sum of    Expected   Std Dev     Mean
status    N      Scores    Under H0   Under H0    Score
-------------------------------------------------------
0        25      655.0      700.0     59.001584   26.20
1        30      885.0      840.0     59.001584   29.50
              Average scores were used for ties.
                 Siegel–Tukey Two-Sample Test
                 Statistic             655.0000
                 Z                      -0.7627
                 One-Sided Pr <  Z       0.2228
                 Two-Sided Pr > |Z|      0.4456
Remarks:
  • The parameter st enables the Siegel–Tukey test of the procedure NPAR1WAY.
  • correct=value is optional. If value is YES then a continuity correction for the normal approximation is used. The default is NO.
  • exact st is optional and applies an additional exact test. Note, the computation of an exact test can be very time consuming. This is the reason why in this example no exact p-values are given in the output.
  • Besides the two-sided p-value SAS also reports a one-sided p-value; which one is printed depends on the Z-statistic. If it is greater than zero the right-sided p-value is printed. If it is less than or equal to zero the left-sided p-value is printed.
  • In this example the sum of scores for the healthy subjects is 655.0, below its expected value of 700.0 under H0, compared with 885.0 for the people with hypertension. Because the sample with the larger scale receives the lower sum of ranks, this points towards a larger scale for the healthy subjects, although the two-sided p-value of 0.4456 gives no significant evidence of a difference. In fact the variance of the healthy subjects is 124.41 and the variance of the subjects with hypertension is 120.05. Therefore the p-value for hypothesis (C) is c09-math-0052 and the p-value for hypothesis (B) is c09-math-0053.
  • In the case of odd sample sizes SAS does not delete the middle observation.
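
As a cross-check, the Z-statistic and the asymptotic p-values of the SAS output can be recomputed from the printed sum of scores, expected value and standard deviation. The following R sketch uses only numbers from the output above; the variable names are chosen for this illustration. It also shows the moments that would apply without ties, which are those of the Wilcoxon rank sum statistic.

```r
# Values taken from the SAS output above (row status = 0)
s_obs    <- 655.0       # sum of Siegel-Tukey scores
s_mean   <- 700.0       # expected value under H0
s_stddev <- 59.001584   # standard deviation under H0 (with ties)

z       <- (s_obs - s_mean) / s_stddev  # no continuity correction (correct=no)
p_left  <- pnorm(z)                     # left-sided p-value, printed as z <= 0
p_two   <- 2 * pnorm(-abs(z))           # two-sided p-value

# Moments without ties for sample sizes 25 and 30
n1 <- 25; n2 <- 30; N <- n1 + n2
mean_no_ties <- n1 * (N + 1) / 2
sd_no_ties   <- sqrt(n1 * n2 * (N + 1) / 12)

print(round(c(z = z, p_left = p_left, p_two = p_two), 4))
print(c(mean = mean_no_ties, sd = sd_no_ties))
```

Up to rounding this should reproduce Z, the one-sided and the two-sided p-value of the output. The standard deviation without ties is slightly larger than the value reported by SAS, which accounts for the ties in the data.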


R code
# Helper functions to find even or odd numbers
is.even <- function(x) x %% 2 == 0
is.odd  <- function(x) x %% 2 == 1
# Create a sorted matrix with first column the blood
# pressure and second column the status
data <- blood_pressure[order(blood_pressure$mmhg), ]
x <- cbind(data$mmhg, data$status)
# If the sample size is odd then remove the observation
# in the middle
if (is.odd(nrow(x))) x <- x[-((nrow(x) + 1) / 2), ]
# Calculate the (remaining) sample size
n <- nrow(x)
# y holds the Siegel–Tukey scores
y <- rep(0, times = n)
# Assign the scores by position i in the ordered sample
for (i in seq_len(n)) {
  if (i <= n / 2 && is.even(i)) {
    y[i] <- 2 * i
  } else if (i > n / 2 && is.even(i)) {
    y[i] <- 2 * (n - i) + 2
  } else if (i <= n / 2 && is.odd(i)) {
    y[i] <- 2 * i - 1
  } else {
    y[i] <- 2 * (n - i) + 1
  }
}
# In the case of ties replace the scores by their mean
# within each group of tied values
r <- ave(y, x[, 1])
# Test statistics S_0 (status 0) and S_1 (status 1):
# sums of the (mean) scores in both samples
S_0 <- sum(r[x[, 2] == 0])
S_1 <- sum(r[x[, 2] == 1])
# Sample sizes for status=0 and status=1
n1 <- sum(x[, 2] == 0)
n2 <- sum(x[, 2] == 1)
# Use the test statistic of the smaller sample. The +1/-1
# is a continuity correction; its sign is chosen such
# that |z| becomes smaller
if (n1 <= n2) {
  z1 <- (2 * S_0 - n1 * (n + 1) + 1) / sqrt(n1 * n2 * (n + 1) / 3)
  z2 <- (2 * S_0 - n1 * (n + 1) - 1) / sqrt(n1 * n2 * (n + 1) / 3)
  z <- if (abs(z1) <= abs(z2)) z1 else z2
  # One-sided p-values keep the sign of z
  pvalue_B <- 1 - pnorm(z)
  pvalue_C <- pnorm(z)
} else {
  z1 <- (2 * S_1 - n2 * (n + 1) + 1) / sqrt(n1 * n2 * (n + 1) / 3)
  z2 <- (2 * S_1 - n2 * (n + 1) - 1) / sqrt(n1 * n2 * (n + 1) / 3)
  z <- if (abs(z1) <= abs(z2)) z1 else z2
  # The roles of the two samples are exchanged
  pvalue_B <- pnorm(z)
  pvalue_C <- 1 - pnorm(z)
}
pvalue_A <- 2 * pnorm(-abs(z))
# Output results
print("Siegel–Tukey test")
n
S_0
S_1
z
pvalue_A
pvalue_B
pvalue_C
R output
[1] "Siegel–Tukey test"
> n
[1] 54
> S_0
[1] 600.5
> S_1
[1] 884.5
> z
[1] -1.027058
> pvalue_A
[1] 0.3043931
> pvalue_B
[1] 0.8478035
> pvalue_C
[1] 0.1521965
Remarks:
  • There is no basic R function to calculate this test directly.
  • In this implementation of the test, the observation in the middle of the sorted sample is removed. This differs from SAS and therefore the calculated values of the test statistic are not the same.
  • In the case of ties, as in the above sample, the construction of ranks must be made in two passes. First the ranks are constructed in the ordered combined sample. Afterwards the mean of the ranks of the tied observations is calculated.
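
The two passes for tied observations can be illustrated on a small made-up sample. The values below are hypothetical and only serve to show the mechanics; the base R function ave replaces each score by the mean score of its group of tied values.

```r
# Hypothetical ordered sample of size 8 with ties at 2 and at 4
values <- c(1, 2, 2, 3, 4, 4, 4, 5)
# First pass: Siegel-Tukey scores by position in the ordered sample
scores <- c(1, 4, 5, 8, 7, 6, 3, 2)
# Second pass: mean score within each group of tied values
mean_scores <- ave(scores, values)
print(mean_scores)
# Averaging does not change the total of all scores
print(c(sum(scores), sum(mean_scores)))
```

The tied values 2 share the mean of the scores 4 and 5, and the tied values 4 share the mean of the scores 7, 6 and 3.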

9.1.2 Ansari–Bradley test

Description: Tests if the scale (variance) of two independent populations is the same.
Assumptions:
  • Data are measured at least on an ordinal scale.
  • Samples c09-math-0054 and c09-math-0055 are independently drawn from the two populations, c09-math-0056.
  • The random variables c09-math-0057 and c09-math-0058 are independent with continuous distribution functions c09-math-0059 and c09-math-0060, scale parameters c09-math-0061 and median c09-math-0062. It holds that c09-math-0063.
  • c09-math-0064 and c09-math-0065 belong to the same distribution function with possible differences in scale and location. Under the assumption of equal medians, the hypothesis c09-math-0066 reduces to c09-math-0067.
Hypotheses: (A) c09-math-0068 vs c09-math-0069
(B) c09-math-0070 vs c09-math-0071
(C) c09-math-0072 vs c09-math-0073
Test statistic: For c09-math-0074 the test statistic is given by:
c09-math-0075 sum of ranks of c09-math-0076 in the combined sample.

equation

Test decision: Reject c09-math-0078 if for the observed value c09-math-0079 of c09-math-0080
(A) c09-math-0081 or c09-math-0082 with c09-math-0083
(B) c09-math-0084
(C) c09-math-0085
p-value: (A) c09-math-0086
(B) c09-math-0087
(C) c09-math-0088
Annotations:
  • For the calculation of the test statistic, first combine both samples, order the combined sample from the lowest to the highest value, and assign ranks according to the above ranking scheme. This means that for an even sample size the series of ranks will be c09-math-0089 and for an odd sample size it will be c09-math-0090 (Ansari and Bradley 1960). The distribution with the larger scale will have the lower sum of ranks because the lower ranks are on both ends of the combined sample.
  • Here, c09-math-0091 denotes the upper-tail probability for the null distribution of the Ansari–Bradley statistic calculated for the sample with the smaller sample size; tables are given in Ansari and Bradley (1960) as well as in Hollander and Wolfe (1999, Table A.8). In general, the test can alternatively be set up by using the sum of ranks of the sample with the larger sample size as the test statistic.
  • In the case of tied observations mean ranks are used.
  • For large sample sizes (c09-math-0092 and c09-math-0093) the test statistic c09-math-0094 is asymptotically normally distributed. If no ties are present and c09-math-0095 is even, then c09-math-0096 and c09-math-0097. If no ties are present and c09-math-0098 is odd, then c09-math-0099 and c09-math-0100 c09-math-0100a. In the case of ties the expectation is the same, but the variance is somewhat different. Let c09-math-0101 be the number of tied groups, c09-math-0102 the number of tied observations in group c09-math-0103, and c09-math-0104 the middle range in group c09-math-0105. If c09-math-0106 is even, then c09-math-0107. If c09-math-0108 is odd, then c09-math-0109 (Hollander and Wolfe 1999, p. 145).
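
The Ansari–Bradley scores and the moments of the statistic without ties can be sketched in R as follows. The function names ab_scores and ab_moments are made up for this illustration, and the closed forms used are the standard ones for the case without ties, which are assumed here to correspond to the formulas referenced above (Hollander and Wolfe 1999).

```r
# Ansari-Bradley scores: ranks counted from both ends towards the middle
ab_scores <- function(N) pmin(seq_len(N), N - seq_len(N) + 1)

# Null mean and variance of the score sum of the sample of size n1
# (no ties), separately for even and odd combined sample size N
ab_moments <- function(n1, n2) {
  N <- n1 + n2
  if (N %% 2 == 0) {
    m <- n1 * (N + 2) / 4
    v <- n1 * n2 * (N + 2) * (N - 2) / (48 * (N - 1))
  } else {
    m <- n1 * (N + 1)^2 / (4 * N)
    v <- n1 * n2 * (N + 1) * (3 + N^2) / (48 * N^2)
  }
  c(mean = m, var = v)
}

print(ab_scores(8))        # even combined sample size
print(ab_scores(7))        # odd combined sample size
print(ab_moments(25, 30))  # sample sizes of the example below
```

For the sample sizes 25 and 30 of the blood pressure example the mean agrees with the expected value of 356.363636 in the SAS output, while the standard deviation reported there is slightly smaller because of the ties.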

Example
We test the hypothesis that the dispersion of the systolic blood pressure is the same in the two populations of healthy subjects (status=0) and subjects with hypertension (status=1). The dataset contains 25 observations for status=0 and 30 observations for status=1 (dataset in Table A.1).


SAS code
proc npar1way data=blood_pressure correct=no ab;
 var mmhg;
 class status;
 exact ab;
run;
SAS output
                  The NPAR1WAY Procedure
           Ansari–Bradley Scores for Variable mmhg
               Classified by Variable status
                  Sum of    Expected     Std Dev     Mean
status    N       Scores    Under H0     Under H0    Score
----------------------------------------------------------
0        25       334.0     356.363636   29.533137   13.360
1        30       450.0     427.636364   29.533137   15.000
             Average scores were used for ties.
              Ansari–Bradley Two-Sample Test
              Statistic             334.0000
              Z                      -0.7572
              One-Sided Pr < Z        0.2245
              Two-Sided Pr > |Z|      0.4489
Remarks:
  • The parameter ab enables the Ansari–Bradley test of the procedure NPAR1WAY.
  • correct=value is optional. If value is YES then a continuity correction for the normal approximation is used. The default is NO.
  • exact ab is optional and applies an additional exact test. Note, the computation of an exact test can be very time consuming. This is the reason why in this example no exact p-values are given in the output.
  • Besides the two-sided p-value SAS also reports a one-sided p-value; which one is printed depends on the Z-statistic. If the value of the Z-statistic is greater than zero the right-sided p-value is printed. If it is less than or equal to zero the left-sided p-value is printed.
  • In this example the sum of scores for the healthy subjects is 334.0, below its expected value of 356.36 under H0, compared with 450.0 for the people with hypertension. Because the sample with the larger scale receives the lower sum of ranks, this points towards a larger scale for the healthy subjects, although the two-sided p-value of 0.4489 gives no significant evidence of a difference. In fact the variance of the healthy subjects is 124.41 and the variance of the subjects with hypertension is 120.05. Therefore the p-value for hypothesis (C) is c09-math-0116 and the p-value for hypothesis (B) is c09-math-0117.


R code
x<-blood_pressure$mmhg[blood_pressure$status==0]
y<-blood_pressure$mmhg[blood_pressure$status==1]
ansari.test(x, y, exact = NULL, alternative = "two.sided")
R output
       Ansari–Bradley test
data:  x and y
AB = 334, p-value = 0.4489
alternative hypothesis: true ratio of scales is not
                                                equal to 1
Remarks:
  • exact=value is optional. If value is not specified (NULL, the default), R computes an exact p-value if both samples contain fewer than 50 finite values and there are no ties, and uses the normal approximation otherwise. If value is TRUE an exact p-value is requested; if it is FALSE the approximate p-value is computed. In the case of ties R cannot compute an exact test.
  • R tests equivalent hypotheses of the type c09-math-0119 vs c09-math-0120 for hypothesis (A), and so on.
  • alternative="value" is optional and defines the type of alternative hypothesis: "two.sided"=true ratio of scales is not equal to 1 (A); "greater"=true ratio of scales is greater than 1 (C); "less"=true ratio of scales is less than 1 (B). Default is "two.sided".
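
The following sketch shows how the statistic reported by ansari.test relates to the ranking scheme and how a one-sided alternative is requested. The two samples are made up for illustration and contain no ties; they are not the blood pressure data. The values accepted by the alternative argument are "two.sided", "less" and "greater".

```r
# Small made-up samples without ties (illustration only)
x <- c(1.2, 3.4, 5.6, 7.8, 9.1)
y <- c(2.3, 4.5, 4.9, 5.1, 6.7, 8.0)

# AB statistic by hand: scores of the x-observations in the combined sample
N <- length(x) + length(y)
r <- rank(c(x, y))
ab_manual <- sum(pmin(r, N - r + 1)[seq_along(x)])

res_two  <- ansari.test(x, y)                        # two-sided (default)
res_less <- ansari.test(x, y, alternative = "less")  # ratio of scales < 1

print(c(manual = ab_manual, ansari.test = unname(res_two$statistic)))
print(c(two.sided = res_two$p.value, less = res_less$p.value))
```

Because the samples are small and free of ties, R computes exact p-values here by default.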

9.1.3 Mood test

Description: Tests if the scale (variance) of two independent populations is the same.
Assumptions:
  • Data are measured at least on an ordinal scale.
  • Samples c09-math-0121 and c09-math-0122 are independently drawn from the two populations, c09-math-0123.
  • The random variables c09-math-0124 and c09-math-0125 are independent with continuous distribution functions c09-math-0126 and c09-math-0127, scale parameters c09-math-0128 and median c09-math-0129. It holds that c09-math-0130.
  • c09-math-0131 and c09-math-0132 belong to the same distribution function with possible differences in scale and location. Under the assumption of equal medians, the hypothesis c09-math-0133 reduces to c09-math-0134.
Hypotheses: (A) c09-math-0135 vs c09-math-0136
(B) c09-math-0137 vs c09-math-0138
(C) c09-math-0139 vs c09-math-0140
Test statistic: For c09-math-0141 the test statistic is given by:
c09-math-0142
where c09-math-0143 is the rank of the c09-math-0144th c09-math-0145-observation in the combined sample
Test decision: Reject c09-math-0146 if for the observed value c09-math-0147 of c09-math-0148
(A) c09-math-0149 or c09-math-0150 with c09-math-0151
(B) c09-math-0152
(C) c09-math-0153
p-value: (A) c09-math-0154
(B) c09-math-0155
(C) c09-math-0156
Annotations:
  • Tables with critical values c09-math-0157 can be found in Laubscher et al. (1968).
  • For the calculation of the test statistic, first combine both samples and rank the combined sample from the lowest to the highest value. The above test statistic c09-math-0158 is the sum of the squared distances of the ranks of the c09-math-0159-observations from the median of all ranks, based on the assumption c09-math-0160. The test can also be based on the ranks of the c09-math-0161-observations in the combined sample. Usually the statistic is computed for the sample with the smaller sample size.
  • In the case of tied observations mid ranks are used. However, tied observations only influence the test statistics if they are between the c09-math-0162- and c09-math-0163-observations.
  • For large sample sizes (c09-math-0164) the test statistic is asymptotically normally distributed with c09-math-0165 and c09-math-0166 (Mood 1954).
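
A minimal R sketch of the Mood statistic and its normal approximation, assuming no ties. The function names mood_stat and mood_moments and the two samples are made up for this illustration; the moments are the standard ones for the case without ties and are the same as those used by the R function mood.test.

```r
# Mood statistic for the x-sample: squared distances of its ranks
# from the middle rank (N + 1) / 2 of the combined sample (no ties)
mood_stat <- function(x, y) {
  N <- length(x) + length(y)
  r <- rank(c(x, y))[seq_along(x)]
  sum((r - (N + 1) / 2)^2)
}

# Null mean and variance of the statistic without ties
mood_moments <- function(n1, n2) {
  N <- n1 + n2
  c(mean = n1 * (N^2 - 1) / 12,
    var  = n1 * n2 * (N + 1) * (N^2 - 4) / 180)
}

# Made-up samples without ties (illustration only)
x <- c(1.2, 3.4, 5.6, 7.8, 9.1)
y <- c(2.3, 4.5, 4.9, 5.1, 6.7, 8.0)
m <- mood_moments(length(x), length(y))
z <- (mood_stat(x, y) - m[["mean"]]) / sqrt(m[["var"]])

print(z)
print(unname(mood.test(x, y)$statistic))  # should agree with z
print(mood_moments(25, 30))               # sample sizes of the example below
```

For the sample sizes 25 and 30 the mean equals the expected value of 6300.0 shown in the SAS output below; the standard deviation reported by SAS is slightly smaller because of the ties in the data.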

Example
We test the hypothesis that the dispersion of the systolic blood pressure is the same in the two populations of healthy subjects (status=0) and subjects with hypertension (status=1). The dataset contains 25 observations for status=0 and 30 observations for status=1 (dataset in Table A.1).


SAS code
proc npar1way data=blood_pressure correct=no mood;
 var mmhg;
 class status;
 exact mood;
run;
SAS output
                 The NPAR1WAY Procedure
             Mood Scores for Variable mmhg
             Classified by Variable status
              Sum of     Expected    Std Dev       Mean
status   N    Scores     Under H0    Under H0      Score
----------------------------------------------------------
0        25   6864.0     6300.0      837.786511    274.560
1        30   6996.0     7560.0      837.786511    233.200
               Average scores were used for ties.
                     Mood Two-Sample Test
               Statistic             6864.0000
               Z                        0.6732
               One-Sided Pr >  Z       0.2504
               Two-Sided Pr > |Z|      0.5008
Remarks:
  • The parameter mood enables the Mood test of the procedure NPAR1WAY.
  • correct=value is optional. If value is YES then a continuity correction for the normal approximation is used. The default is NO.
  • exact mood is optional and applies an additional exact test. Note, the computation of an exact test can be very time consuming. This is the reason why in this example no exact p-values are given in the output.
  • Besides the two-sided p-value SAS also reports a one-sided p-value; which one is printed depends on the Z-statistic. If the observed value of the Z-statistic is greater than zero the right-sided p-value is printed. If it is less than or equal to zero the left-sided p-value is printed.
  • In this example the sum of scores for the healthy subjects is 6864.0, above its expected value of 6300.0 under H0, compared with 6996.0 for the people with hypertension. Because the Mood scores are squared distances from the middle rank, a larger than expected sum points towards a larger scale for the healthy subjects, although the two-sided p-value of 0.5008 gives no significant evidence of a difference. In fact the variance of the healthy subjects is 124.41 and the variance of the subjects with hypertension is 120.05. Therefore the p-value for hypothesis (C) is c09-math-0173 and the p-value for hypothesis (B) is c09-math-0174.


R code
x<-blood_pressure$mmhg[blood_pressure$status==0]
y<-blood_pressure$mmhg[blood_pressure$status==1]
mood.test(x, y, alternative = "two.sided")
R output
 Mood two-sample test of scale
data:  x and y
Z = 0.6765, p-value = 0.4987
alternative hypothesis: two.sided
Remarks:
  • R handles ties differently from SAS. Instead of mid ranks a procedure by Mielke is used (Mielke 1967).
  • alternative="value" is optional and defines the type of alternative hypothesis: "two.sided"=true ratio of scales is not equal to 1 (A); "greater"=true ratio of scales is greater than 1 (C); "less"=true ratio of scales is less than 1 (B). Default is "two.sided".

References

Ansari A.R. and Bradley R.A. 1960 Rank-sum tests for dispersions. Annals of Mathematical Statistics 31, 1174–1189.

Hollander M. and Wolfe D.A. 1999 Nonparametric Statistical Methods, 2nd edn. John Wiley & Sons, Ltd.

Laubscher N.F., Steffens F.E. and DeLange E.M. 1968 Exact critical values for Mood's distribution-free test statistic for dispersion and its normal approximation. Technometrics 10, 497–508.

Mielke P.W. 1967 Note on some squared rank tests with existing ties. Technometrics 9, 312–314.

Mood A.M. 1954 On the asymptotic efficiency of certain nonparametric two-sample tests. Annals of Mathematical Statistics 25, 514–522.

Siegel S. and Tukey J.W. 1960 A nonparametric sum of ranks procedure for relative spread in unpaired samples. Journal of the American Statistical Association 55, 429–445.
