Chapter 14

Tests on contingency tables

In this chapter we deal with the question of whether there is an association between two random variables. This question can be formulated in different ways: we can ask if the two random variables are independent, or test for homogeneity. The corresponding tests are presented in Section 14.1. These are foremost the well-known Fisher's exact test and Pearson's χ²-test. In Section 14.2 we test if two raters agree on their rating of the same issue. Section 14.3 deals with two risk measures, namely the odds ratio and the relative risk.

14.1 Tests on independence and homogeneity

In this section we deal with the two null hypotheses of independence and homogeneity. While a test of independence examines whether there is an association between two random variables, a test of homogeneity tests if the marginal proportions are the same for different random variables. The test problems in this chapter can be described for the homogeneity hypothesis as well as for the independence hypothesis.

14.1.1 Fisher's exact test

Description: Tests the hypothesis of independence or homogeneity in a 2×2 contingency table.
Assumptions:
  • Data are at least measured on a nominal scale with two possible categories, labeled as 1 and 2, for each of the two variables X and Y of interest.
  • The random sample follows a Poisson, Multinomial or Product-Multinomial sampling distribution.
  • A dataset of N observations is available and presented as a 2×2 contingency table.
Hypotheses: (A) H0: p11 = p1+ p+1 vs H1: p11 ≠ p1+ p+1
(B) H0: p11 ≤ p1+ p+1 vs H1: p11 > p1+ p+1
(C) H0: p11 ≥ p1+ p+1 vs H1: p11 < p1+ p+1
with pij = P(X = i, Y = j),
pi+ = P(X = i) and p+j = P(Y = j)
Test statistic: T = N11, the count in cell (1, 1)
Test decision: Reject H0 if for the observed value n11 of T
(A) P(T ≤ n11) ≤ α/2
or P(T ≥ n11) ≤ α/2
(B) P(T ≥ n11) ≤ α
(C) P(T ≤ n11) ≤ α
p-values: (A) p = Σ over all t with P(T = t) ≤ P(T = n11) of P(T = t)
(B) p = P(T ≥ n11)
(C) p = P(T ≤ n11)
with P(T = t) = C(n1+, t) C(n2+, n+1 − t) / C(N, n+1), the hypergeometric probability of t given the observed margins, where C(a, b) denotes the binomial coefficient
Annotations:
  • The test is based on the exact distribution of the test statistic T = N11 conditional on all marginal frequencies n1+, n2+, n+1, n+2, which is for all three sampling distributions the hypergeometric distribution. Given the marginal totals, T can take values from max(0, n1+ + n+1 − N) to min(n1+, n+1) (Agresti 1990).
  • This test has its origin in Fisher (1934, 1935) and Irwin (1935) and is also called the Fisher–Irwin test.
  • When testing for homogeneity let row variable X indicate to which of two populations each observation belongs. The test problem considers the probabilities to observe characteristic 1 of variable Y in the two populations, usually denoted by p1 and p2 for the two populations. Hence p1 = P(Y = 1|X = 1) and p2 = P(Y = 1|X = 2). Thereby we have the three test problems (A) H0: p1 = p2, (B) H0: p1 ≤ p2, and (C) H0: p1 ≥ p2. The test procedure is just the same as given above. All three hypotheses can also be expressed in terms of the odds ratio, see Agresti (1990) for details.
  • Fisher's exact test was originally developed for 2×2 tables. Freeman and Halton (1951) extended it to any I×J table and multinomial distributed random variables. This test is called the Freeman–Halton test as well as just Fisher's exact test like the original test.

Example
We test if there is an association between the malfunction of workpieces and which of two companies, A and B, produced them. A sample of N = 40 workpieces has been checked, coded 0 for functioning and 1 for defective (dataset in Table A.4).


SAS code
proc freq data=malfunction;
 tables company*malfunction /fisher;
run;
SAS output
         Fisher's Exact Test
---------------------------------------
Cell (1,1) Frequency (F)         9
Left-sided Pr <= F          0.0242
Right-sided Pr >= F         0.9960
Table Probability (P)       0.0202
Two-sided Pr <= P           0.0484
Remarks:
  • The procedure proc freq enables Fisher's exact test. After the tables statement the two variables must be specified, separated by a star (*).
  • The option fisher invokes Fisher's exact test. Alternatively the option chisq can be used, which also returns Fisher's exact test in the case of 2×2 tables.
  • Instead of using the raw data as in the example above, it is also possible to use the counts directly by constructing the 2×2 table as a dataset of cell counts and passing the frequencies via a weight statement:
    data counts;
     input r c counts;
     datalines;
     1 1 9
     1 2 11
     2 1 16
     2 2 4
     ;
    run;
    proc freq;
     tables r*c /fisher;
     weight counts;
    run;
    Here the first variable r holds the first index (the rows) and the second variable c holds the second index (the columns). The variable counts holds the frequencies for each cell. The weight statement indicates the variable that holds the frequencies.
  • SAS arranges the factors into the 2×2 table according to the (internal) order unless the weight method is used. The interpretation of the one-sided hypotheses (B) and (C) depends on the way the data are arranged in the table, so which table is finally analyzed needs to be checked carefully.


R code
# Read the two variables company and malfunction
x<-malfunction$company
y<-malfunction$malfunction
# Invoke the test
fisher.test(x,y,alternative="two.sided")
R output
Fisher's Exact Test for Count Data
data:  x and y
p-value = 0.04837
Remarks:
  • alternative="value" is optional and defines the type of alternative hypothesis: "two.sided" = two sided (A); "greater" = one sided (B); "less" = one sided (C). Default is "two.sided".
  • Instead of using the raw data as in the example above, it is also possible to use the counts directly by constructing a 2×2 table and handing this over to the function as first parameter:
    fisher.test(matrix(c(9,11,16,4), ncol = 2))
  • It is not clear how R arranges the factors into the 2×2 table if the "table" method is not used. For the two-sided hypothesis this does not matter, but for the directional hypotheses it is important. So in the latter case we recommend constructing the 2×2 table and handing it over to the function.
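As a cross-check outside SAS and R, the same conditional test can be sketched in Python, assuming SciPy is available; scipy.stats.fisher_exact handles the 2×2 case, with the directional alternatives referring to the odds ratio of the table as entered:

```python
from scipy.stats import fisher_exact

# 2x2 table of the malfunction example (cell (1,1) count is 9,
# matching the SAS output above)
table = [[9, 11], [16, 4]]

# Two-sided test, hypothesis (A)
odds_ratio, p_two = fisher_exact(table, alternative="two-sided")

# Directional tests condition on the same margins
_, p_less = fisher_exact(table, alternative="less")
_, p_greater = fisher_exact(table, alternative="greater")

print(round(p_two, 5), round(p_less, 4))
```

The two-sided p-value agrees with the R output above (0.04837), and the "less" alternative reproduces the SAS left-sided value of 0.0242.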

14.1.2 Pearson's χ²-test

Description: Tests the hypothesis of independence or homogeneity in a two-dimensional contingency table.
Assumptions:
  • Data are at least measured on a nominal scale with I and J possible outcomes of the two variables X and Y of interest.
  • The random sample follows a Poisson, Multinomial or Product-Multinomial sampling distribution.
  • A dataset of N observations is available and presented as an I×J contingency table.
Hypotheses: H0: X and Y are independent
vs H1: X and Y are not independent
Test statistic: χ² = Σi Σj (Nij − Eij)² / Eij
with Nij the random variable of cell counts of combination (i, j) and Eij = ni+ n+j / N the expected cell count.
Test decision: Reject H0 if for the observed value χ²0 of χ²
χ²0 > χ²(1−α; (I−1)(J−1))
p-values: p = 1 − P(χ²(I−1)(J−1) ≤ χ²0)
Annotations:
  • This test was introduced by Pearson (1900). Fisher (1922) corrected the degrees of freedom of this test, which Pearson incorrectly took to be IJ − 1.
  • The test problem can also be stated as: H0: pij = pi+ p+j for all (i, j) vs H1: pij ≠ pi+ p+j for at least one pair (i, j), i = 1, …, I, j = 1, …, J.
  • The test statistic χ² is asymptotically χ²-distributed with (I−1)(J−1) degrees of freedom.
  • χ²(1−α; (I−1)(J−1)) is the (1−α)-quantile of the χ²-distribution with (I−1)(J−1) degrees of freedom.
  • For 2×2 tables, Yates (1934) proposed a continuity correction for a better approximation to the χ²-distribution. In this case the test statistic is: χ² = Σi Σj (|Nij − Eij| − 0.5)² / Eij.
  • The expected frequency in each cell of the contingency table should be at least 5 to ensure the approximate χ²-distribution. If this condition is not fulfilled an alternative is Fisher's exact test (Test 14.1.1).
  • Special versions of this test are the χ² goodness-of-fit test (Test 12.2.1) and the K-sample binomial test (Test 4.3.1).
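As an illustration of the statistic and its Yates-corrected version, here is a short cross-check in Python, assuming SciPy; scipy.stats.chi2_contingency returns the statistic, p-value, degrees of freedom and expected counts:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Counts of the malfunction example
table = np.array([[9, 11], [16, 4]])

# Uncorrected Pearson statistic
stat, p, dof, expected = chi2_contingency(table, correction=False)

# With the Yates continuity correction (2x2 tables only)
stat_yates, p_yates, _, _ = chi2_contingency(table, correction=True)

print(round(stat, 4), round(stat_yates, 2), dof)
```

The values match the SAS output of the example below: 5.2267 uncorrected and 3.84 with the continuity correction, both on 1 degree of freedom; all expected counts (12.5 and 7.5) exceed 5, so the χ² approximation is adequate.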

Example
We test if there is an association between the malfunction of workpieces and which of two companies, A and B, produced them. A sample of N = 40 workpieces has been checked, coded 0 for functioning and 1 for defective (dataset in Table A.4).


SAS code
proc freq data=malfunction;
 tables company*malfunction /chisq;
run;
SAS output
     Statistics for Table of company by malfunction
Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      5.2267    0.0222
Continuity Adj. Chi-Square     1      3.8400    0.0500
Remarks:
  • The procedure proc freq enables Pearson's χ²-test. Following the tables statement the two variables must be specified, separated by a star (*).
  • The option chisq invokes the test.
  • SAS prints the value and the p-value of the χ²-test statistic as well as of the Yates corrected χ²-test statistic.
  • Instead of using the raw data, it is also possible to use the counts directly. See Test 14.1.1 for details.


R code
# Read the two variables company and malfunction
x<-malfunction$company
y<-malfunction$malfunction
# Invoke the test
chisq.test(x,y,correct=TRUE)
R output
Pearson's Chi-squared test with Yates' continuity correction
data:  x and y
X-squared = 3.84, df = 1, p-value = 0.05004
Remarks:
  • correct="value" is optional and determines if Yates' continuity correction is used (value=TRUE) or not (value=FALSE). Default is TRUE.
  • Instead of using the raw data as in the example above, it is also possible to use the counts directly by constructing a 2×2 table and handing this over to the function as first parameter:
    chisq.test(matrix(c(9,11,16,4), ncol = 2))

14.1.3 Likelihood-ratio χ²-test

Description: Tests the hypothesis of independence or homogeneity in a two-dimensional contingency table.
Assumptions:
  • Data are at least measured on a nominal scale with I and J possible outcomes of the two variables X and Y of interest.
  • The random sample follows a Poisson, Multinomial or Product-Multinomial sampling distribution.
  • A dataset of N observations is available and presented as an I×J contingency table.
Hypotheses: H0: X and Y are independent
vs H1: X and Y are not independent
Test statistic: G² = 2 Σi Σj Nij ln(Nij / Eij)
with Nij the random variable of cell counts of combination (i, j) and Eij = ni+ n+j / N the expected cell count.
Test decision: Reject H0 if for the observed value g² of G²
g² > χ²(1−α; (I−1)(J−1))
p-values: p = 1 − P(χ²(I−1)(J−1) ≤ g²)
Annotations:
  • The test statistic G² is asymptotically χ²-distributed.
  • χ²(1−α; (I−1)(J−1)) is the (1−α)-quantile of the χ²-distribution with (I−1)(J−1) degrees of freedom.
  • This test is an alternative to Pearson's χ²-test (Test 14.1.2).
  • The approximation to the χ²-distribution is usually good if N/(IJ) exceeds 5. See Agresti (1990) for more details on this test.
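The statistic is easy to compute once the expected counts are known; a minimal Python sketch for the malfunction example, assuming SciPy for the tail probability:

```python
import numpy as np
from scipy.stats import chi2

observed = np.array([[9, 11], [16, 4]])
n = observed.sum()

# Expected counts under independence: e_ij = row_i * col_j / n
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n

# Likelihood-ratio statistic G^2 = 2 * sum o * ln(o / e)
g2 = 2 * np.sum(observed * np.log(observed / expected))

# p-value from the chi-square distribution with (I-1)(J-1) = 1 df
p_value = chi2.sf(g2, df=1)

print(round(g2, 4), round(p_value, 4))
```

This reproduces the values of the example below (G² = 5.3834, p = 0.0203).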

Example
We test if there is an association between the malfunction of workpieces and which of two companies, A and B, produced them. A sample of N = 40 workpieces has been checked, coded 0 for functioning and 1 for defective (dataset in Table A.4).


SAS code
proc freq data=malfunction;
 tables company*malfunction /chisq;
run;
SAS output
Statistics for Table of company by malfunction
Statistic                     DF       Value      Prob
------------------------------------------------------
Likelihood Ratio Chi-Square    1      5.3834    0.0203
Remarks:
  • The procedure proc freq enables the likelihood-ratio χ²-test. Following the tables statement the two variables must be specified, separated by a star (*).
  • The option chisq invokes the test.
  • Instead of using the raw data, it is also possible to use the counts directly. See Test 14.1.1 for details.


R code
# Read the two variables company and malfunction
x<-malfunction$company
y<-malfunction$malfunction
# Get the observed and expected cases
e<-chisq.test(x,y)$expected
o<-chisq.test(x,y)$observed
# Calculate the test statistic
g2<-2*sum(o*log(o/e))
# Get degrees of freedom from function chisq.test()
df<-chisq.test(x,y)$parameter
# Calculate the p-value
p_value<-1-pchisq(g2,df)
# Output results
cat("Likelihood-Ratio Chi-Square Test\n",
    "test statistic   ","p-value","\n",
    "--------------   ----------","\n",
    "  ",g2,"     ",p_value,"\n")
R output
Likelihood-Ratio Chi-Square Test
 test statistic    p-value
 --------------   ----------
    5.38341       0.02032911
Remarks:
  • There is no basic R function to calculate the likelihood-ratio χ²-test directly.
  • We used the R function chisq.test() to calculate the expected and observed observations as well as the degrees of freedom. See Test 14.1.2 for details on this function.

14.2 Tests on agreement and symmetry

Often categorical data are observed in so-called matched pairs, for example, as ratings of two raters on the same objects. Then it is of interest to analyze the agreement of the classification of objects into the categories. We present a test on the kappa coefficient, which is a measurement of agreement. Another question would be if the two raters classify objects into the same classes by the same proportion. For 2×2 tables the McNemar test is given, in which case the hypothesis of marginal homogeneity is equivalent to that of axial symmetry.

14.2.1 Test on Cohen's kappa

Description: Tests if the kappa coefficient, as a measure of agreement, differs from zero.
Assumptions:
  • Data are at least measured on a nominal scale.
  • Measurements are taken by letting two raters classify objects into k categories.
  • The raters make their judgement independently.
  • The two random variables X and Y describe the rating of the two raters for one subject, respectively, with the k categories as possible outcomes.
  • Data are summarized in a k×k contingency table counting the number of occurrences of the possible combinations of ratings in the sample.
  • A sample of size N is given, which follows the multinomial sampling scheme.
Hypotheses: (A) H0: κ = 0 vs H1: κ ≠ 0
(B) H0: κ ≤ 0 vs H1: κ > 0
(C) H0: κ ≥ 0 vs H1: κ < 0
where κ is the kappa coefficient
given by κ = (po − pe)/(1 − pe) with po = Σi pii the probability of agreement and pe = Σi pi+ p+i the probability of agreement by chance
Test statistic: Z = K / sqrt(Var0(K))
where K = (po − pe)/(1 − pe) is computed from the observed relative frequencies,
po = Σi nii / N, pe = Σi (ni+/N)(n+i/N)
and Var0(K) = [pe + pe² − Σi (ni+/N)(n+i/N)((ni+ + n+i)/N)] / (N (1 − pe)²)
Test decision: Reject H0 if for the observed value z of Z
(A) z ≤ z(α/2) or z ≥ z(1−α/2)
(B) z ≥ z(1−α)
(C) z ≤ z(α)
p-values: (A) p = 2 Φ(−|z|)
(B) p = 1 − Φ(z)
(C) p = Φ(z), with Φ the standard normal cumulative distribution function
Annotations:
  • The kappa coefficient was introduced by Cohen (1960) and is therefore known as Cohen's kappa.
  • K is under the null hypothesis asymptotically normally distributed with mean 0 and variance Var0(K).
  • In the case of a perfect agreement κ takes the value 1. It becomes 0 if the agreement is equal to that given by chance. A higher positive value indicates a stronger agreement, whereas negative values suggest that the agreement is weaker than expected by chance (Agresti 1990).
  • The above variance formula Var0(K) is different from the formula Cohen published. SAS uses the formula from Fleiss et al. (2003), which we present here.
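The computation of K and its null variance can be sketched as follows; Python is used here as a neutral illustration, and the 2×2 rating table is a made-up example, not the silicosis data:

```python
import math

# Hypothetical 2x2 rating table (rows: rater 1, columns: rater 2)
table = [[7, 3], [2, 8]]
n = sum(sum(row) for row in table)

# Marginal proportions and agreement probabilities
row = [sum(r) / n for r in table]            # row proportions
col = [sum(t) / n for t in zip(*table)]      # column proportions
po = sum(table[i][i] for i in range(2)) / n  # observed agreement
pe = sum(row[i] * col[i] for i in range(2))  # agreement by chance

# Kappa and its variance under H0 (Fleiss et al. 2003 formula)
kappa = (po - pe) / (1 - pe)
var0 = (pe + pe**2 - sum(row[i] * col[i] * (row[i] + col[i])
                         for i in range(2))) / (n * (1 - pe)**2)
z = kappa / math.sqrt(var0)

print(round(kappa, 3), round(z, 3))
```

For this table po = 0.75 and pe = 0.5, giving K = 0.5; the same steps are carried out on the real data in the R code below.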

Example
We test if two reviewers of X-rays of the lung agree on their rating of the lung disease silicosis. Judgements from both reviewers are available for each patient, rated as silicosis or no silicosis (dataset in Table A.9).


SAS code
proc freq data=silicosis;
 tables reviewer1*reviewer2;
 exact kappa;
run;
SAS output
     Simple Kappa Coefficient
--------------------------------
Kappa (K)                 0.3000
ASE                       0.2122
95% Lower Conf Limit     -0.1160
95% Upper Conf Limit      0.7160
      Test of H0: Kappa = 0
ASE under H0              0.2225
Z                         1.3484
One-sided Pr>  Z         0.0888
Two-sided Pr> |Z|        0.1775
Exact Test
One-sided Pr>=  K        0.1849
Two-sided Pr>= |K|       0.3698
Remarks:
  • The procedure proc freq enables this test. After the tables statement the two variables must be specified, separated by a star (*).
  • The option exact kappa invokes the test with asymptotic and exact p-values.
  • Instead of using the raw data, it is also possible to use the counts directly. See Test 14.1.1 for details.
  • Alternatively the code
    proc freq data=silicosis;
     tables reviewer1*reviewer2 /agree;
     test agree;
    run;
    can be used, but this will only give the p-values based on the Gaussian approximation.
  • The p-value of hypothesis (C) is not reported and must be calculated as one minus the p-value of hypothesis (B).


R code
# Get the number of observations
n<-length(silicosis$patient)
# Construct a 2x2 table
freqtable <- table(silicosis$reviewer1,silicosis$reviewer2)
# Calculate the observed frequencies
po<-(freqtable[1,1]+freqtable[2,2])/n
# Calculate the expected frequencies
row<-margin.table(freqtable,1)/n
col<-margin.table(freqtable,2)/n
pe<-row[1]*col[1]+row[2]*col[2]
# Calculate the simple kappa coefficient
k<-(po-pe)/(1-pe)
# Calculate the variance under the null hypothesis
var0<-(pe+pe^2-(row[1]*col[1]*(row[1]+col[1])+
                row[2]*col[2]*(row[2]+col[2])))/(n*(1-pe)^2)
# Calculate the test statistic
z<-k/sqrt(var0)
# Calculate p_values
p_value_A<-2*pnorm(-abs(z))
p_value_B<-1-pnorm(z)
p_value_C<-pnorm(z)
# Output results
k
z
p_value_A
p_value_B
p_value_C
R output
> k
0.3
> z
1.3484
> p_value_A
0.1775299
> p_value_B
0.08876493
> p_value_C
0.9112351
Remarks:
  • There is no basic R function to calculate the test directly.
  • The R function table is used to construct the basic 2×2 table and the R function margin.table is used to get the marginal frequencies of this table.

14.2.2 McNemar's test

Description: Test on axial symmetry or marginal homogeneity in a 2×2 table.
Assumptions:
  • Data are at least measured on a nominal scale.
  • Measurements are taken in matched pairs, for example, by letting two raters classify objects into two categories labeled 1 and 2.
  • The random variable X states the first rating and Y the second rating.
  • Data are summarized in a 2×2 contingency table counting the number of occurrences of the four possible combinations of ratings in the sample.
  • A sample of size N is given, which follows the multinomial sampling scheme.
Hypotheses: H0: p12 = p21 vs H1: p12 ≠ p21
with p12 = P(X = 1, Y = 2) and
p21 = P(X = 2, Y = 1).
Test statistic: χ² = (N12 − N21)² / (N12 + N21)
Test decision: Reject H0 if for the observed value χ²0 of χ²
χ²0 > χ²(1−α; 1)
p-values: p = 1 − P(χ²1 ≤ χ²0)
Annotations:
  • The test goes back to McNemar (1947).
  • The hypothesis of symmetry of the probabilities p12 and p21 is equivalent to that of marginal homogeneity, that is, p1+ = p+1 and p2+ = p+2.
  • The test statistic χ² is asymptotically χ²-distributed with one degree of freedom (Agresti 1990, p. 350).
  • χ²(1−α; 1) is the (1−α)-quantile of the χ²-distribution with one degree of freedom.
  • Sometimes a continuity correction for the better approximation to the χ²-distribution is proposed. In this case the test statistic is: χ² = (|N12 − N21| − 1)² / (N12 + N21).
  • This test is a large sample test as it is based on the asymptotic χ²-distribution of the test statistic. For small samples an exact test can be based on the binomial distribution of N12 conditional on the off-main diagonal total N12 + N21, with success probability 1/2 under H0. Alternatively the test decision can be based on Markov chain Monte Carlo methods, see Krampe and Kuhnt (2007), which also cover Bowker's test for symmetry as an extension to k×k tables.
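The exact version conditions on the discordant total; a Python sketch assuming SciPy, using scipy.stats.binomtest. The discordant counts 6 and 0 below are those implied by the SAS output of the IQ example (S = 6, exact p = 0.0313):

```python
from scipy.stats import binomtest

# Discordant cell counts (off-diagonal of the 2x2 table)
n12, n21 = 6, 0

# Asymptotic McNemar statistic
s = (n12 - n21) ** 2 / (n12 + n21)

# Exact test: under H0, N12 | (N12 + N21) ~ Binomial(n12 + n21, 1/2)
p_exact = binomtest(n12, n12 + n21, 0.5).pvalue

print(s, p_exact)
```

With only six discordant pairs the exact p-value (0.03125) is the more trustworthy one, which is why SAS reports it next to the asymptotic value.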

Example
Of interest is the marginal homogeneity of intelligence quotients before training (IQ1) and after training (IQ2). The dataset contains measurements of the subjects (dataset in Table A.2), which first need to be transformed into a binary variable given by the cut point of an intelligence quotient of 100.


SAS code
* Dichotomize the variables iq1 and iq2;
data temp;
 set iq;
  if iq1<=100 then iq_before=0;
  if iq1> 100 then iq_before=1;
  if iq2<=100 then iq_after=0;
  if iq2> 100 then iq_after=1;
run;
* Apply the test;
proc freq;
 tables iq_before*iq_after;
 exact mcnem;
run;
SAS output
   Statistics for Table of iq_before by iq_after
                McNemar's Test
        ----------------------------
        Statistic (S)         6.0000
        DF                         1
        Asymptotic Pr >  S    0.0143
        Exact      Pr >= S    0.0313
Remarks:
  • The procedure proc freq enables this test. After the tables statement the two variables must be specified, separated by a star (*).
  • The option exact mcnem invokes the test with asymptotic and exact p-values.
  • Instead of using the raw data, it is also possible to use the counts directly. See Test 14.1.1 for details.
  • SAS does not provide a continuity correction.


R code
# Dichotomize the variables IQ1 and IQ2
iq_before <- ifelse(iq$IQ1<=100, 0, 1)
iq_after  <- ifelse(iq$IQ2<=100, 0, 1)
# Apply the test
mcnemar.test(iq_before, iq_after, correct = FALSE)
R output
McNemar's Chi-squared test
data:  iq_before and iq_after
McNemar's chi-squared = 6, df = 1, p-value = 0.01431
Remarks:
  • correct="value" is optional and determines if a continuity correction is used (value=TRUE) or not (value=FALSE). Default is TRUE.
  • Instead of using the raw data as in the example above, it is also possible to use the counts directly by constructing the 2×2 table and handing this over to the function as first parameter:
    freqtable<-table(iq_before, iq_after)
    mcnemar.test(freqtable, correct = FALSE)

14.2.3 Bowker's test for symmetry

Description: Test on symmetry in a k×k table.
Assumptions:
  • Data are at least measured on a nominal scale.
  • Measurements are taken in matched pairs, for example, by letting two raters classify objects into k categories labeled with 1 to k.
  • The random variable X states the first rating and Y the second rating for an individual object.
  • Data are summarized in a k×k contingency table counting the number of occurrences of the possible combinations of ratings in the sample.
  • A sample of size N is given, which follows the multinomial sampling scheme.
Hypotheses: H0: pij = pji for all i < j
vs H1: pij ≠ pji for at least one pair (i, j), i < j,
with pij = P(X = i, Y = j).
Test statistic: χ² = Σ over pairs i < j of (Nij − Nji)² / (Nij + Nji)
Test decision: Reject H0 if for the observed value χ²0 of χ²
χ²0 > χ²(1−α; k(k−1)/2)
p-values: p = 1 − P(χ²k(k−1)/2 ≤ χ²0)
Annotations:
  • The test was introduced by Bowker (1948) as an extension of McNemar's test for symmetry in 2×2 tables to higher dimensional tables.
  • The test statistic χ² is asymptotically χ²-distributed with k(k−1)/2 degrees of freedom (Bowker 1948).
  • χ²(1−α; k(k−1)/2) is the (1−α)-quantile of the χ²-distribution with k(k−1)/2 degrees of freedom.
  • Sometimes a continuity correction of the test statistic for the better approximation to the χ²-distribution is proposed. Edwards (1948) suggested a correction for the McNemar test, which extended to Bowker's test reads χ²c = Σ over pairs i < j of (|Nij − Nji| − 1)² / (Nij + Nji). Under the null hypothesis of symmetry χ²c is also approximately χ²-distributed.
  • This test is a large sample test as it is based on the asymptotic χ²-distribution of the test statistic. For small samples test decisions can be based on Markov chain Monte Carlo methods, see Krampe and Kuhnt (2007).
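The statistic only involves the off-diagonal cells; a short Python sketch for the 3×3 health-rating table of the example below, assuming SciPy for the tail probability:

```python
import numpy as np
from scipy.stats import chi2

# 3x3 health-rating table (rows: GP 1, columns: GP 2)
table = np.array([[10, 8, 12],
                  [13, 14, 6],
                  [1, 10, 20]])
k = table.shape[0]

# Bowker's statistic: sum over all pairs i < j
s = sum((table[i, j] - table[j, i]) ** 2 / (table[i, j] + table[j, i])
        for i in range(k) for j in range(i + 1, k))

# Degrees of freedom k(k-1)/2 and p-value
df = k * (k - 1) // 2
p_value = chi2.sf(s, df)

print(round(s, 4), df, round(p_value, 6))
```

This reproduces the SAS and R outputs of the example (S = 11.4982, 3 degrees of freedom, p = 0.0093).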

Example
Of interest is the symmetry of the health rating of two general practitioners. The ratings can range from poor (=1) through fair (=2) to good (=3). Ratings of N = 94 patients are observed in the given sample (dataset in Table A.13).


SAS code
* Construct the contingency table;
data counts;
  input gp1 gp2 counts;
  datalines;
  1 1 10
  1 2  8
  1 3 12
  2 1 13
  2 2 14
  2 3  6
  3 1  1
  3 2 10
  3 3 20
  ;
run;
* Apply the test;
proc freq;
 tables gp1*gp2;
 weight counts;
 exact agree;
run;
SAS output
Statistics for Table of gp1 by gp2
      Test of Symmetry
   ------------------------
   Statistic (S)    11.4982
   DF                     3
   Pr> S            0.0093
Remarks:
  • The procedure proc freq enables this test. After the tables statement the two variables must be specified, separated by a star (*).
  • The first variable gp1 holds the rating index of the first physician, and the second variable gp2 the rating index of the second physician. The variable counts holds the frequency for each cell of the contingency table.
  • The option exact agree invokes Bowker's test if applied to tables larger than 2×2, stating asymptotic and exact p-values.
  • It is also possible to use raw data, see Test 14.1.1 for details.
  • SAS does not provide a continuity correction.


R code
# Construct the contingency table
table<-matrix(c(10,13,1,8,14,10,12,6,20),ncol=3)
# Apply the test
mcnemar.test(table)
R output
   McNemar's Chi-squared test
data:  table
McNemar's chi-squared = 11.4982, df = 3, p-value = 0.009316
Remarks:
  • R uses the function mcnemar.test to apply Bowker's test for symmetry, but a continuity correction is not provided.
  • It is also possible to use raw data, see Test 14.1.1 for details.

14.3 Test on risk measures

In this section we introduce tests for two common risk measures in 2×2 tables. The odds ratio and the relative risk are mainly used in epidemiology to identify risk factors for a health outcome. Note, for risk estimates a confidence interval is in most cases more meaningful than a test, because the confidence interval reflects the variability of an estimator.

14.3.1 Large sample test on the odds ratio

Description: Tests if the odds ratio in a 2×2 contingency table differs from unity.
Assumptions:
  • Data are at least measured on a nominal scale with two possible categories, labeled as 1 and 2, for each of the two variables X and Y of interest.
  • The random sample follows a Poisson, Multinomial or Product-Multinomial sampling distribution.
  • A dataset of N observations is available and presented as a 2×2 contingency table.
Hypotheses: (A) H0: θ = 1 vs H1: θ ≠ 1
(B) H0: θ ≤ 1 vs H1: θ > 1
(C) H0: θ ≥ 1 vs H1: θ < 1
where θ = (p11 p22)/(p12 p21) is the odds ratio.
Test statistic: Z = ln(OR) / SD
with OR = (N11 N22)/(N12 N21) the sample odds ratio
and SD = sqrt(1/N11 + 1/N12 + 1/N21 + 1/N22)
Test decision: Reject H0 if for the observed value z of Z
(A) z ≤ z(α/2) or z ≥ z(1−α/2)
(B) z ≥ z(1−α)
(C) z ≤ z(α)
p-values: (A) p = 2 Φ(−|z|)
(B) p = 1 − Φ(z)
(C) p = Φ(z)
Annotations:
  • The statistic ln(OR) is asymptotically Gaussian distributed and SD is an estimator of its asymptotic standard error (Agresti 1990, p. 54).
  • z(γ) is the γ-quantile of the standard normal distribution and Φ denotes its cumulative distribution function.
  • The odds ratio is also called the cross-product ratio as it can be expressed as the ratio of products of probabilities diagonally opposite in the table, θ = (p11 p22)/(p12 p21).
  • θ > 1 means that in row 1 response 1 is more likely than in row 2, and θ < 1 means that response 1 is less likely in row 1 than in row 2. The further away the odds ratio lies from unity, the stronger the association. If θ = 1 rows and columns are independent.
  • This is a large sample test. In the case of small sample sizes Fisher's exact test can be used (Test 14.1.1), as θ = 1 is equivalent to independence.
  • Cornfield (1951) showed that the odds ratio is an estimate for the relative risk in case-control studies.
  • The concept of odds ratios can be extended to larger contingency tables. Furthermore it is possible to adjust for other variables by using logistic regression.
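The large sample test amounts to a few lines; here is a Python sketch assuming SciPy for the normal probabilities, using the cell counts implied by the example output below (company A: 11 defective, 9 functioning; company B: 4 defective, 16 functioning):

```python
import math
from scipy.stats import norm

# Cell counts of the malfunction example
n11, n12, n21, n22 = 11, 9, 4, 16

# Odds ratio and standard error of ln(OR)
odds_ratio = (n11 * n22) / (n12 * n21)
se = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)

# Test statistic and p-values for hypotheses (A), (B), (C)
z = math.log(odds_ratio) / se
p_a = 2 * norm.cdf(-abs(z))   # two-sided
p_b = 1 - norm.cdf(z)         # H1: OR > 1
p_c = norm.cdf(z)             # H1: OR < 1

print(round(odds_ratio, 5), round(z, 5), round(p_a, 6))
```

The values agree with the SAS and R outputs of the example (OR = 4.88889, z = 2.21241, two-sided p = 0.026938).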

Example
We test the odds ratio of companies A and B with respect to the malfunction of workpieces produced by them. A sample of N = 40 workpieces has been checked, coded 0 for functioning and 1 for defective (dataset in Table A.4).


SAS code
* Sort the dataset in the right order;
proc sort data=malfunction;
 by company descending malfunction;
run;
* Use proc freq to get the counts saved into freq_table;
proc freq order=data;
 tables company*malfunction /out=freq_table;
run;
* Get the counts out of freq_table;
data n11 n12 n21 n22;
 set freq_table;
 if company='A' and malfunction=1 then do;
    keep count; output n11;
 end;
 if company='A' and malfunction=0 then do;
    keep count; output n12;
 end;
 if company='B' and malfunction=1 then do;
    keep count; output n21;
 end;
 if company='B' and malfunction=0 then do;
    keep count; output n22;
 end;
run;
* Rename counts;
 data n11; set n11; rename count=n11; run;
 data n12; set n12; rename count=n12; run;
 data n21; set n21; rename count=n21; run;
 data n22; set n22; rename count=n22; run;
* Merge counts together and calculate test statistic;
data or_table;
 merge n11 n12 n21 n22;
 * Calculate the Odds Ratio;
 OR=(n11*n22)/(n12*n21);
 * Calculate the standard deviation of ln(OR);
 SD=sqrt(1/n11+1/n12+1/n22+1/n21);
 * Calculate test statistic;
 z=log(OR)/SD;
 * Calculate p-values;
 p_value_A=2*probnorm(-abs(z));
 p_value_B=1-probnorm(z);
 p_value_C=probnorm(z);
run;
* Output results;
proc print split='*' noobs;
 var OR z p_value_A p_value_B p_value_C;
 label OR='Odds Ratio*----------'
       z='Test Statistic*--------------'
       p_value_A='p-value A*---------'
    p_value_B='p-value B*---------'
    p_value_C='p-value C*---------';
 title 'Test on the Odds Ratio';
run;
SAS output
                         Test on the Odds Ratio
Odds Ratio    Test Statistic    p-value A    p-value B
----------    --------------    ---------    ---------
 4.88889         2.21241        0.026938     0.013469
p-value C
---------
 0.98653
Remarks:
  • The above code calculates the odds ratio for the malfunctions of company A vs B. An odds ratio of 4.889 means that the odds of a malfunction in company A are 4.889 times the odds in company B. Changing the rows of the table results in an estimated odds ratio of 1/4.889 ≈ 0.2045, which means that the odds of a malfunction in company B are about one fifth of those in company A.
  • There is no generic SAS function to calculate the p-value in a c14-math-0300 table directly, but logistic regression can be used as in the following code:
    proc logistic data=malfunction;
class company (PARAM=REF REF='B');
     model malfunction (event='1') = company;
    run;
    Note, this code correctly returns the above two-sided p-value and also the odds ratio of 4.889, because with the statement class company (PARAM=REF REF='B'); we tell SAS to use company B as reference. One-sided p-values are not given.
  • Also with proc freq the odds ratio itself can be calculated.
    * Sort the dataset in the right order;
    proc sort data=malfunction;
     by company descending malfunction;
    run;
    * Apply the test;
    proc freq order=data;
     tables company*malfunction /relrisk;
     exact comor;
    run;
    However, no p-values are reported.


R code
# Get the cell counts for the 2x2 table
n11<-sum(malfunction$company=='A' &
                        malfunction$malfunction==1)
n12<-sum(malfunction$company=='A' &
                        malfunction$malfunction==0)
n21<-sum(malfunction$company=='B' &
                        malfunction$malfunction==1)
n22<-sum(malfunction$company=='B' &
                        malfunction$malfunction==0)
# Calculate the Odds Ratio
OR=(n11*n22)/(n12*n21)
# Calculate the standard deviation of ln(OR)
SD=sqrt(1/n11+1/n12+1/n22+1/n21)
# Calculate test statistic
z=log(OR)/SD
# Calculate p-values
p_value_A<-2*pnorm(-abs(z));
p_value_B<-1-pnorm(z);
p_value_C<-pnorm(z);
# Output results
OR
z
p_value_A
p_value_B
p_value_C
R output
> OR
[1] 4.888889
> z
[1] 2.212413
> p_value_A
[1] 0.02693816
> p_value_B
[1] 0.01346908
> p_value_C
[1] 0.986531
Remarks:
  • The above code calculates the odds ratio for the malfunctions of company A vs company B. An odds ratio of 4.889 means that the odds of a malfunction in company A are 4.889 times the odds in company B. Interchanging the rows of the table yields an odds ratio of 1/4.889 = 0.205, which means that the odds of a malfunction in company B are about one fifth of those in company A.
  • There is no generic R function to calculate the odds ratio in a 2 × 2 table, but logistic regression can be used as in the following code:
    x<-malfunction$company
    y<-malfunction$malfunction
    summary(glm(x~y,family=binomial(link="logit")))
    Note that this code correctly returns the above two-sided p-value, but not the odds ratio of 4.889, because in this specification R uses company A as the reference level. Here, R returns a log(odds ratio) of -1.587, which equals an odds ratio of 0.205 (see first remark). One-sided p-values are not given.

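The hand calculation above is easy to port to other languages. As a cross-check, the following Python sketch implements the same formulas (OR, the standard deviation of ln(OR), z, and the three p-values). Note that the cell counts 11, 9, 4, 16 are an assumption: the raw data are not listed in this section, and these counts were chosen because they reproduce the reported estimates.

```python
import math
from statistics import NormalDist  # standard normal CDF, Python 3.8+

def odds_ratio_test(n11, n12, n21, n22):
    """Large-sample z-test on the odds ratio of a 2x2 table.

    Mirrors the SAS/R code above:
      OR = (n11*n22)/(n12*n21)
      SD = sqrt(1/n11 + 1/n12 + 1/n21 + 1/n22)   # SD of ln(OR)
      z  = ln(OR)/SD
    """
    OR = (n11 * n22) / (n12 * n21)
    SD = math.sqrt(1/n11 + 1/n12 + 1/n21 + 1/n22)
    z = math.log(OR) / SD
    Phi = NormalDist().cdf
    return {
        "OR": OR,
        "z": z,
        "p_A": 2 * Phi(-abs(z)),  # (A) two-sided
        "p_B": 1 - Phi(z),        # (B) H1: OR > 1
        "p_C": Phi(z),            # (C) H1: OR < 1
    }

# Assumed counts (n11, n12, n21, n22) reproducing the output above
res = odds_ratio_test(11, 9, 4, 16)
```

With these counts the function returns OR = 4.8889 and z = 2.2124, in agreement with the SAS and R output shown above.
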
14.3.2 Large sample test on the relative risk

Description: Tests if the relative risk in a 2 × 2 contingency table differs from unity.
Assumptions:
  • Data are at least measured on a nominal scale with two possible categories, labeled as 0 and 1, for each of the two variables X and Y of interest.
  • The random sample follows a Poisson, Multinomial or Product-Multinomial sampling distribution.
  • A dataset of n observations is available and presented as a 2 × 2 contingency table.
Hypotheses: (A) H0: RR = 1 vs H1: RR ≠ 1
(B) H0: RR ≤ 1 vs H1: RR > 1
(C) H0: RR ≥ 1 vs H1: RR < 1
with RR the relative risk of response 1 in row 1 compared with row 2.
Test statistic: Z = ln(RR_hat)/SD
with RR_hat = (n11/(n11+n12))/(n21/(n21+n22))
and SD = sqrt(1/n11 - 1/(n11+n12) + 1/n21 - 1/(n21+n22))
Test decision: Reject H0 if for the observed value z of Z
(A) z < z_(α/2) or z > z_(1-α/2)
(B) z > z_(1-α)
(C) z < z_α
p-values: (A) p = 2Φ(-|z|)
(B) p = 1 - Φ(z)
(C) p = Φ(z)
Annotations:
  • The statistic Z is asymptotically standard normally distributed, and SD is an estimator of the asymptotic standard error of ln(RR_hat) (Agresti 1990, p. 55).
  • z_α is the α-quantile of the standard normal distribution.
  • RR > 1 means that in row 1 of the table the risk of response 1 is higher than in row 2, and RR < 1 means that the risk of response 1 is lower in row 1 than in row 2. The further the relative risk is from unity, the stronger the association. If RR = 1, rows and columns are independent and neither row carries an increased risk. The relative risk can also be defined in terms of columns instead of rows.
  • This is a large sample test.
  • The concept of relative risk can be extended to larger contingency tables and it is possible to adjust for other variables by using generalized linear models.

Example
To test the relative risk of a malfunction in workpieces produced in company A compared with company B, a sample of 40 workpieces has been checked, with 0 for functioning and 1 for defective (dataset in Table A.4).


SAS code
* Sort the dataset in the right order;
proc sort data=malfunction;
 by company descending malfunction;
run;
* Use proc freq to get the counts saved into freq_table;
proc freq order=data;
 tables company*malfunction /out=freq_table;
run;
* Get the counts out of freq_table;
data n11 n12 n21 n22;
 set freq_table;
 if company='A' and malfunction=1 then do;
    keep count; output n11;
 end;
 if company='A' and malfunction=0 then do;
    keep count; output n12;
 end;
 if company='B' and malfunction=1 then do;
    keep count; output n21;
 end;
 if company='B' and malfunction=0 then do;
    keep count; output n22;
 end;
run;
* Rename counts;
 data n11; set n11; rename count=n11; run;
 data n12; set n12; rename count=n12; run;
 data n21; set n21; rename count=n21; run;
 data n22; set n22; rename count=n22; run;
* Merge counts and calculate test statistic;
data rr_table;
 merge n11 n12 n21 n22;
 * Calculate the Relative Risk;
 RR=(n11/(n11+n12))/(n21/(n21+n22));
 * Calculate the standard deviation of ln(RR);
 SD=sqrt(1/n11-1/(n11+n12)+1/n21-1/(n21+n22));
 * Calculate test statistic;
 z=log(RR)/SD;
 * Calculate p-values;
 p_value_A=2*probnorm(-abs(z));
 p_value_B=1-probnorm(z);
 p_value_C=probnorm(z);
run;
* Output results;
proc print split='*' noobs;
 var RR z p_value_A p_value_B p_value_C;
 label RR='Relative Risk*-------------'
       z='Test Statistic*--------------'
       p_value_A='p-value A*---------'
    p_value_B='p-value B*---------'
    p_value_C='p-value C*---------';
 title 'Test on the Relative Risk';
run;
SAS output
                  Test on the Relative Risk
Relative Risk    Test Statistic    p-value A    p-value B    p-value C
-------------    --------------    ---------    ---------    ---------
    2.75            2.06102        0.039301     0.019650      0.98035
Remarks:
  • The above code calculates the relative risk of malfunctions in products from company A vs company B. The risk of a malfunction is 2.75 times higher in company A than in company B. Interchanging the rows of the table yields an estimated relative risk of 1/2.75 = 0.364, which means that a malfunction in a product from company B is only 0.364 times as likely as in one from company A.
  • There is no generic SAS function to calculate the p-values of a relative risk in a 2 × 2 table, but generalized linear models can be used as in the following code:
    proc genmod data = malfunction descending;
     class company (PARAM=REF REF='B');
     model malfunction=company /dist=binomial link=log;
     run;
    Note that this code correctly returns the above two-sided p-value and also the relative risk of 2.75, as with the code class company (PARAM=REF REF='B'); we tell SAS to use company B as the reference level. SAS returns a log(relative risk) of 1.012, which equals a relative risk of 2.75 (see first remark). One-sided p-values are not given.
  • However, with proc freq the relative risk itself can be calculated but not the p-values:
    * Sort the dataset in the right order;
    proc sort data=malfunction;
     by company descending malfunction;
    run;
    * Apply the test;
    proc freq order=data;
     tables company*malfunction /relrisk;
    run;
    In the output, the line Cohort (Col1 Risk) states the wanted relative risk estimate, as we are interested in the risk of a malfunction (column 1) in row 1 compared with row 2.


R code
# Get the cell counts for the 2x2 table
n11<-sum(malfunction$company=='A' &
                        malfunction$malfunction==1)
n12<-sum(malfunction$company=='A' &
                        malfunction$malfunction==0)
n21<-sum(malfunction$company=='B' &
                        malfunction$malfunction==1)
n22<-sum(malfunction$company=='B' &
                        malfunction$malfunction==0)
# Calculate the Relative Risk
RR=(n11/(n11+n12))/(n21/(n21+n22))
# Calculate the standard deviation of ln(RR)
SD=sqrt(1/n11-1/(n11+n12)+1/n21-1/(n21+n22))
# Calculate test statistic
z=log(RR)/SD
# Calculate p-values
p_value_A<-2*pnorm(-abs(z));
p_value_B<-1-pnorm(z);
p_value_C<-pnorm(z);
# Output results
RR
z
p_value_A
p_value_B
p_value_C
R output
> RR
[1] 2.75
> z
[1] 2.061022
> p_value_A
[1] 0.03930095
> p_value_B
[1] 0.01965047
> p_value_C
[1] 0.9803495
Remarks:
  • The above code calculates the relative risk of malfunctions in products from company A vs company B. The risk of a malfunction in a product is 2.75 times higher in company A than in company B. Interchanging the rows of the table yields an estimated relative risk of 1/2.75 = 0.364, which means that a malfunction in a product from company B is only 0.364 times as likely as in one from company A.
  • There is no generic R function to calculate the relative risk in a 2 × 2 table, but generalized linear models can be used. The following code will do that:
    x<-malfunction$company
    y<-malfunction$malfunction
    summary(glm(y~x,family=binomial(link="log")))
    Note that the log link (not the logit link) is needed to estimate the relative risk. This code correctly returns the above two-sided p-value, but not the relative risk of 2.75, because R uses company A as the reference level: R returns a log(relative risk) of -1.012, which equals a relative risk of 0.364 (see first remark). One-sided p-values are not given.
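Analogously to the odds ratio, the relative risk test can be sketched in a few lines of Python. Again, the cell counts 11, 9, 4, 16 are an assumption chosen to reproduce the reported output; the formulas are exactly those of the SAS and R code above.

```python
import math
from statistics import NormalDist  # standard normal CDF, Python 3.8+

def relative_risk_test(n11, n12, n21, n22):
    """Large-sample z-test on the relative risk of a 2x2 table.

    RR = (n11/(n11+n12)) / (n21/(n21+n22)), and the standard
    deviation of ln(RR) is
    SD = sqrt(1/n11 - 1/(n11+n12) + 1/n21 - 1/(n21+n22)).
    """
    RR = (n11 / (n11 + n12)) / (n21 / (n21 + n22))
    SD = math.sqrt(1/n11 - 1/(n11 + n12) + 1/n21 - 1/(n21 + n22))
    z = math.log(RR) / SD
    Phi = NormalDist().cdf
    p_two_sided = 2 * Phi(-abs(z))  # hypothesis (A)
    return RR, z, p_two_sided

# Assumed counts (n11, n12, n21, n22) reproducing the output above
RR, z, p = relative_risk_test(11, 9, 4, 16)
```

With these counts the function returns RR = 2.75 and z = 2.0610, in agreement with the output shown above.
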

References

Agresti A. 1990 Categorical Data Analysis. John Wiley & Sons, Ltd.

Bowker A.H. 1948 A test for symmetry in contingency tables. Journal of the American Statistical Association 43, 572–574.

Cohen J. 1960 A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20, 37–46.

Cornfield J. 1951 A method of estimating comparative rates from clinical data. Applications to cancer of the lung, breast and cervix. Journal of the National Cancer Institute 11, 1229–1275.

Edwards A.L. 1948 Note on the correction for continuity in testing the significance of the difference between correlated proportions. Psychometrika 13, 185–187.

Fisher R.A. 1922 On the interpretation of chi-square from contingency tables, and the calculation of P. Journal of the Royal Statistical Society 85, 87–94.

Fisher R.A. 1934 Statistical Methods for Research Workers, 5th edn. Oliver & Boyd.

Fisher R.A. 1935 The logic of inductive inference. Journal of the Royal Statistical Society, Series A 98, 39–54.

Fleiss J.L., Levin B. and Paik M.C. 2003 Statistical Methods for Rates and Proportions, 3rd edn. John Wiley & Sons, Ltd.

Freeman G.H. and Halton J.H. 1951 Note on an exact treatment of contingency, goodness of fit and other problems of significance. Biometrika 38, 141–149.

Irwin J.O. 1935 Tests of significance for differences between percentages based on small numbers. Metron 12, 83–94.

Krampe A. and Kuhnt S. 2007 Bowker's test for symmetry and modifications within the algebraic framework. Computational Statistics & Data Analysis 51, 4124–4142.

McNemar Q. 1947 Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12, 153–157.

Pearson K. 1900 On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine 50, 157–175.

Yates F. 1934 Contingency tables involving small numbers and the chi-square test. Supplement to the Journal of the Royal Statistical Society 1, 217–235.
