Chapter 13
Experimental Designs

Package(s): BHH2, AlgDesign, granova, multcomp, car, agricolae, phia

Dataset(s): olson, tensile, girder, Hardness, reaction, rocket, rocket_Graeco, battery, bottling, SP, intensity

13.1 Introduction

Experimental Designs, also known as Design of Experiments (DOE), is one of the most important pillars of statistics.

Section 13.2 introduces the important principles of experimental design, beginning with an interesting real experiment. The first experimental design model is discussed in Section 13.3, and its extension to block designs is detailed in Section 13.4. Factorial designs, an effective extension and an important class of models, are taken up in Section 13.5.

13.2 Principles of Experimental Design

Salsburg (2001) gives a very interesting historical account in his book titled “The Lady Tasting Tea”. The first chapter, with the same title as the book, tells the story of a lady who declared in a cafeteria that she could clearly distinguish between two variants of tea: (i) tea poured into milk, and (ii) milk poured into tea. As Salsburg puts it, “A thin, short man, with thick glasses and a Vandyke beard beginning to turn gray, pounced on the problem”, and a live experiment rolled on. The lady was sent a sequence of cups, some prepared by pouring tea into milk and others by pouring milk into tea. For each cup, the lady would take one sip and declare which preparation she believed had been used, and the result would be noted down. She was not told after any cup whether her answer was correct. This experiment, performed by the Vandyke-bearded scientist, became very famous and laid the foundations of DOE.

The reader may be curious about two points: (i) Who was this Vandyke-bearded scientist? and (ii) What was the result of the experiment? It was, of course, Sir Ronald Fisher, and he put the importance of randomization into action for the tea-tasting lady. It is important to note that the goal of the experiment was to test whether the lady's claim was correct. Randomization guards the results against the effect of false guesses. Had only the results of the experiment been declared, people might have missed the concept of randomization and the fact that the experiment was about the lady's claim. The lady was correct with all ten of her guesses, and this was because she was indeed an expert at distinguishing between the two methods of tea preparation. Salsburg (2001) tells this story in a far more fascinating manner, and the reader should refer to the book for more details.

The three important concepts of DOE are (i) Randomization, (ii) Replication, and (iii) Blocking.

Randomization. An experimenter may or may not have biases while conducting an experiment. Any such influence needs to be removed before we run the experiment. For example, if Fisher had first sent five cups of tea poured into milk, and then five cups of milk poured into tea, we would effectively have just two observations and not ten. Similarly, sending the two types of tea alternately also introduces predictability. Thus, it is necessary to mix things up, and what can be better for removing the experimenter's bias than sending the cups in a random order?

Replication. In “The Lady Tasting Tea” example, randomization alone does not help us if we were sending just two cups of tea to the lady. We need to ensure that the number of cups of “tea poured into milk” and “milk poured into tea” is large enough to support our randomization technique mentioned earlier.

Blocking. Consider an artificial example where we have to decide if the students learn better in classrooms with or without air conditioning (AC). For students from Classes I to IV, we have been careful enough to allocate enough numbers of students to the classrooms with and without AC. Despite our caution, suppose that we have accidentally put all the boys in the classrooms with AC and all the girls in the classrooms without AC. Assume that the result shows that boys perform better than girls. Is the result acceptable? Or assume that the results show that AC students get higher marks than non-AC students. Are the results still acceptable? As we know that the results are not acceptable, we must understand that there are some natural obstacles/restrictions in the nature of the experimental units. This variation in the experimental units cannot be allowed to ruin the results of the experiments. Thus, we need to form blocks which remove this kind of bias/error from the experiment.

We will now consider the simple experimental design, where the importance of randomization will play the central role, in the next section.

13.3 Completely Randomized Designs

The Completely Randomized Design (CRD) is one of the first steps in DOE. It is a simple setup which involves replication and randomization. If the treatments are the only source of variation in the output, the CRD is appropriate for identifying the more effective treatments.

13.3.1 The CRD Model

Let $Y_{ij}$ denote the response of the $j$-th experimental unit for the $i$-th treatment, with $i = 1, \dots, v$, and $j = 1, \dots, r$. We assume that the number of observations is the same for every treatment. Suppose that the experimenter believes that the average yield due to treatment $i$ is $\mu_i$. The CRD model is then expressed by

\[ Y_{ij} = \mu_i + \epsilon_{ij}, \tag{13.1} \]

where $\epsilon_{ij}$ denotes the error term. This model 13.1 is known as the means model. The more general and useful mathematical model for the CRD is

\[ Y_{ij} = \mu + \tau_i + \epsilon_{ij}. \tag{13.2} \]

The CRD model 13.2 in this form is known as the effects model. The effects model is more convenient from a practical point of view. In this form the mean $\mu$ is thought of as some guaranteed yield in the absence of any kind of treatment. This parameter is also known as the baseline or control treatment. The parameter values $\tau_i$ reflect the effect due to treatment $i$. We will consider the second format of the model throughout the rest of this section. The error components $\epsilon_{ij}$ are the noise factors associated with the $j$-th experimental unit of the $i$-th treatment. We will assume that the errors are iid as $N(0, \sigma^2)$, where the variance of the normal distribution is not known. Thus, the CRD model says that the probability distribution of the experimental units $Y_{ij}$ is $N(\mu + \tau_i, \sigma^2)$. The means and effects models are related through the definition $\mu_i = \mu + \tau_i$.

In both models 13.1 and 13.2, each treatment receives an equal number of experimental units, that is, each treatment $i$ receives $r$ units. Such designs are called balanced designs. In practical setups, it may not be feasible to allocate equal numbers of units, and we then allow the $i$-th treatment $r_i$ units, $i = 1, \dots, v$. In this case, the design is called an unbalanced design. The inferential aspects of balanced and unbalanced models do not vary drastically from each other, at least for the CRD model, and the coverage for the balanced model is provided in the rest of this section.

The movement to Design Matrix from Covariate Matrix. The covariates c13-math-0024's are missing in the effects model 13.2! In fact, the c13-math-0025's will not appear in the rest of this chapter either. Recollect that the median polish model 4.14 also did not have the covariates c13-math-0026's. However, the covariates are very much present in these models and they have a well-defined format. Indeed, the covariates are designed to appear in a specific way and this is the reason why we call the covariate matrix the Design Matrix. The models in this broad area completely determine the exact structure of the covariate matrix. In R, the function model.matrix will generate the exact design matrix, and we will come to this function later.

Consider the effects model 13.2. Suppose we have $v = 3$ treatments and $r = 4$ observations for each of the treatments. Let the treatment effect be denoted by $\tau_i$, and let $x_i$ take the value 1 if treatment $i$ is assigned to the observation, and 0 otherwise, $i = 1, 2, 3$. The design matrix $X$ is then defined as in Table 13.1. Note that it is important to drop one of the $x_i$: if all three indicator columns were kept, they would add up to the intercept column and the design matrix would not have full column rank.

Table 13.1 Design Matrix of a CRD with 3 Treatments and 4 Observations per Treatment

Observation Intercept $x_1$ $x_2$
1 1 1 0
2 1 1 0
3 1 1 0
4 1 1 0
5 1 0 1
6 1 0 1
7 1 0 1
8 1 0 1
9 1 0 0
10 1 0 0
11 1 0 0
12 1 0 0
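
The design matrix of Table 13.1 can be generated with model.matrix, the function mentioned earlier. The following is a minimal sketch under assumptions that are not from the text: the factor name trt is hypothetical, and relevel is used so that the third treatment is the dropped (reference) level as in Table 13.1, since R drops the first level by default.

```r
# Three treatments, four observations each, in the order of Table 13.1
trt <- factor(rep(1:3, each = 4))
# R's default treatment contrasts drop the first level; make level 3 the reference instead
trt <- relevel(trt, ref = "3")
# Columns: intercept, indicator of treatment 1, indicator of treatment 2
X <- model.matrix(~ trt)
X
```

Without the call to relevel the same model is fitted, only with the first treatment playing the role of the baseline.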

The next small sub-section will help in random allocation of the treatments to the experimental units.

13.3.2 Randomization in CRD

Suppose that we have $v$ treatments and that the $i$-th treatment, $i = 1, \dots, v$, is allocated $r_i$ experimental units. Let $N = \sum_{i=1}^{v} r_i$ be the total number of available experimental units.
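
A random allocation of this kind may be sketched in base R as follows. The treatment labels and the replication counts are illustrative assumptions, and the sketch is one simple way of randomizing rather than a reproduction of the book's own steps.

```r
set.seed(123)                       # for a reproducible allocation
treatments <- c("A", "B", "C")      # hypothetical treatment labels
r <- c(4, 4, 4)                     # hypothetical numbers of units per treatment
N <- sum(r)                         # total number of experimental units
# Write out each label as many times as it has units, then shuffle over units 1, ..., N
allocation <- data.frame(unit = seq_len(N),
                         treatment = sample(rep(treatments, times = r)))
allocation
table(allocation$treatment)         # each treatment keeps its allotted number of units
```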

The statistical inference for the CRD model will now be discussed.

13.3.3 Inference for the CRD Models

A few standard notations are in order. The $i$-th treatment sample sum (mean), denoted by $y_{i\cdot}$ ($\bar{y}_{i\cdot}$), and the total sample sum (mean), $y_{\cdot\cdot}$ ($\bar{y}_{\cdot\cdot}$), are defined by

\[ y_{i\cdot} = \sum_{j=1}^{r} y_{ij}, \quad \bar{y}_{i\cdot} = \frac{y_{i\cdot}}{r}, \quad y_{\cdot\cdot} = \sum_{i=1}^{v} \sum_{j=1}^{r} y_{ij}, \quad \bar{y}_{\cdot\cdot} = \frac{y_{\cdot\cdot}}{N}, \quad N = vr. \tag{13.4} \]

Define the total (corrected) sum of squares, denoted by $SS_T$, as

\[ SS_T = \sum_{i=1}^{v} \sum_{j=1}^{r} (y_{ij} - \bar{y}_{\cdot\cdot})^2. \tag{13.5} \]

The ANOVA technique partitions $SS_T$ as the sum of two components: (i) the sum of squares due to treatments $SS_{Tr}$, and (ii) the sum of squares due to error $SS_E$. Here, $SS_{Tr}$ and $SS_E$ are defined by

\[ SS_{Tr} = r \sum_{i=1}^{v} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2, \tag{13.6} \]

\[ SS_E = \sum_{i=1}^{v} \sum_{j=1}^{r} (y_{ij} - \bar{y}_{i\cdot})^2. \tag{13.7} \]

Note that $SS_{Tr}$ accounts for the between treatments effect, and $SS_E$ accounts for the within treatments difference. Here, $SS_T$ has $N - 1$ degrees of freedom, whereas $SS_{Tr}$ and $SS_E$ have respectively $v - 1$ and $N - v$ degrees of freedom. It is easy to verify that

\[ SS_T = SS_{Tr} + SS_E. \]

Define

\[ S_i^2 = \frac{1}{r - 1} \sum_{j=1}^{r} (y_{ij} - \bar{y}_{i\cdot})^2, \quad i = 1, \dots, v. \]

That is, $S_i^2$ is the sampling variance of the $i$-th treatment. We can pool these $v$ sampling variances and obtain the following:

\[ \frac{(r-1) S_1^2 + \dots + (r-1) S_v^2}{(r-1) + \dots + (r-1)} = \frac{SS_E}{N - v} = MS_E, \]

where $MS_E$ denotes the mean error sum of squares. Note that $S_i^2$ is an estimator of the variance $\sigma^2$ for the $i$-th treatment, and it may be seen that $MS_E$ is an estimator of the common variance within each of the treatments. Similarly, we can also use the variation of the treatment averages from the grand average, under the assumption that there is no difference among the treatment means, for estimation of $\sigma^2$. That is, the mean treatment sum of squares is given by

\[ MS_{Tr} = \frac{SS_{Tr}}{v - 1}. \]

An interesting hypothesis testing problem concerns the equality of the treatment effects: $H: \tau_1 = \tau_2 = \dots = \tau_v = 0$ against the alternative $K: \tau_i \neq 0$ for at least one $i$. The details can then be presented in the ANOVA Table 13.2.

Table 13.2 ANOVA for the CRD Model

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | $F$-Statistic
Between Treatments | $SS_{Tr}$ | $v - 1$ | $MS_{Tr}$ | $MS_{Tr}/MS_E$
Error within Treatments | $SS_E$ | $N - v$ | $MS_E$ |
Total | $SS_T$ | $N - 1$ | |

In light of Theorem 6.6.2, under the hypothesis $H$ the sampling distributions of $SS_{Tr}/\sigma^2$ and $SS_E/\sigma^2$ may be seen to be $\chi^2$-distributions with $v - 1$ and $N - v$ degrees of freedom respectively. Finally, the sampling distribution of $MS_{Tr}/MS_E$ is seen to be an $F$-distribution, see Theorem 6.3.4, with $(v - 1, N - v)$ degrees of freedom.
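
The decomposition and the $F$-statistic can be checked numerically. The sketch below is an illustration under an assumption that is not from the text: it uses the PlantGrowth data shipped with R (three groups of ten plants) as a stand-in balanced one-way layout, computes the sums of squares by hand, and compares them with the output of aov.

```r
# PlantGrowth (shipped with R): 3 groups with 10 plants each, a balanced one-way layout
y   <- PlantGrowth$weight
trt <- PlantGrowth$group
v <- nlevels(trt); N <- length(y); r <- N / v

trt_means  <- tapply(y, trt, mean)
grand_mean <- mean(y)

ss_total <- sum((y - grand_mean)^2)               # total (corrected) sum of squares
ss_trt   <- r * sum((trt_means - grand_mean)^2)   # between treatments
ss_error <- sum((y - trt_means[trt])^2)           # within treatments

ms_trt   <- ss_trt / (v - 1)
ms_error <- ss_error / (N - v)
f_stat   <- ms_trt / ms_error
p_value  <- pf(f_stat, v - 1, N - v, lower.tail = FALSE)

c(SS_total = ss_total, SS_trt = ss_trt, SS_error = ss_error, F = f_stat, p = p_value)
summary(aov(weight ~ group, data = PlantGrowth))  # should agree with the hand computation
```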

Before a formal illustration of the CRD model, let us take an EDA route for understanding ANOVA. The package granova has a very interesting graphical tool granova.1w, where 1w stands for “one-way layout”. The illustration through example(granova.1w) is first considered. The granova plots may be useful for identifying outliers, skewness, etc.

In the next example, we will find out how R handles the covariates for the CRD model!

We will consider one more example for the CRD model 13.2 from Dean and Voss (1999).

Validation of the model assumptions is considered next.

13.3.4 Validation of Model Assumptions

The CRD model is a linear regression model, as in 12.1. The assumptions for the model are the same as those detailed in Section 12.2, that is, the linearity, independence, and normality assumptions. For the model under consideration, the plots, as in Section 12.2.6, give us all that is required. The demonstration is carried out for Example 13.3.3.
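
A generic version of such a check may look as follows. This is a sketch on the PlantGrowth data shipped with R, used here as a stand-in because the data of Example 13.3.3 are not reproduced; the Bartlett test of equal variances is an extra check that is not part of the list of assumptions above.

```r
fit <- aov(weight ~ group, data = PlantGrowth)   # stand-in one-way layout shipped with R
res <- residuals(fit)

op <- par(mfrow = c(1, 2))
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs fitted")               # look for patterns or unequal spread
abline(h = 0, lty = 2)
qqnorm(res); qqline(res)                         # look for departures from normality
par(op)

shapiro.test(res)                                  # formal check of normality
bartlett.test(weight ~ group, data = PlantGrowth)  # formal check of equal variances
```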

13.3.5 Contrasts and Multiple Testing for the CRD Model

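Let us begin with a definition. A contrast is a linear combination of the treatment effects, $C = \sum_{i=1}^{v} c_i \tau_i$, whose coefficients satisfy $\sum_{i=1}^{v} c_i = 0$.

The statistical interest is in the problem of testing the hypothesis $H: \sum_{i=1}^{v} c_i \tau_i = 0$ against the alternative $K: \sum_{i=1}^{v} c_i \tau_i \neq 0$. An estimate of the contrast $C$ is given by

\[ \hat{C} = \sum_{i=1}^{v} c_i \bar{y}_{i\cdot}. \]

The test procedure is to reject $H$ if $|t_0| > t_{N-v, \alpha/2}$, the upper $\alpha/2$ percentage point of the $t$-distribution with $N - v$ degrees of freedom, where

\[ t_0 = \frac{\sum_{i=1}^{v} c_i \bar{y}_{i\cdot}}{\sqrt{\frac{MS_E}{r} \sum_{i=1}^{v} c_i^2}}. \]

The test statistic $t_0$ is rewritten as

\[ F_0 = t_0^2 = \frac{MS_C}{MS_E}, \]

where $MS_C = SS_C/1$ and $SS_C$ is defined by

\[ SS_C = \frac{\left( \sum_{i=1}^{v} c_i \bar{y}_{i\cdot} \right)^2}{\frac{1}{r} \sum_{i=1}^{v} c_i^2}. \]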

Some contrasts are considered in the next example.
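
As a stand-in for that example, the sketch below tests a single contrast by hand on the PlantGrowth data shipped with R. The choice of data and of the contrast (the control against the average of the two treatments) is an assumption made for illustration only.

```r
fit  <- aov(weight ~ group, data = PlantGrowth)   # stand-in data shipped with R
ybar <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
r    <- 10                                        # observations per treatment
mse  <- deviance(fit) / df.residual(fit)          # mean error sum of squares

cc <- c(ctrl = 1, trt1 = -0.5, trt2 = -0.5)       # control versus the average of the two treatments
stopifnot(abs(sum(cc)) < 1e-12)                   # the coefficients of a contrast sum to zero

c_hat  <- sum(cc * ybar)                          # estimate of the contrast
t_stat <- c_hat / sqrt(mse * sum(cc^2) / r)       # t-statistic with N - v degrees of freedom
ss_c   <- c_hat^2 / (sum(cc^2) / r)               # contrast sum of squares, 1 degree of freedom
f_stat <- ss_c / mse                              # equals t_stat^2
p_val  <- 2 * pt(abs(t_stat), df.residual(fit), lower.tail = FALSE)
c(estimate = c_hat, t = t_stat, F = f_stat, p = p_val)
```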

The problem of multiple testing was dealt with in Section 17.15. Here, the results are specialized to the CRD model.

If the hypothesis c13-math-0129 is rejected, we know that at least two treatment levels have significantly different effects. This calls for techniques which will help the reader to identify which treatments are significantly different. The details of the multiple comparison problem may be found in Miller (1981), Hsu (1996), and Bretz, et al. (2011).

For the multiple testing problem, we are already familiar with Bonferroni's method and Holm's method. Tukey's honest significant differences (HSD) and Dunnett's procedure are detailed next.

In a CRD model with $v$ treatment levels, the interest is in testing equality for each possible pair of levels, that is, there are $\binom{v}{2} = v(v-1)/2$ possible hypotheses. Tukey's procedure uses the distribution of the Studentized range statistic:

\[ q = \frac{\bar{y}_{\max} - \bar{y}_{\min}}{\sqrt{MS_E / r}}, \tag{13.9} \]

where $\bar{y}_{\max}$ and $\bar{y}_{\min}$ are the largest and smallest sample means out of the $v$ possible means. Tukey's HSD procedure declares two means to be significantly different if the absolute value of their difference exceeds

\[ T_{\alpha} = q_{\alpha}(v, f) \sqrt{\frac{MS_E}{r}}, \]

where $q_{\alpha}(v, f)$ is the upper $\alpha$ percentage point of $q$ and $f$ is the number of degrees of freedom associated with the $MS_E$. Thus, a $100(1 - \alpha)$% confidence interval for all possible pairs of means is given by

\[ \bar{y}_{i\cdot} - \bar{y}_{j\cdot} \pm q_{\alpha}(v, f) \sqrt{\frac{MS_E}{r}}, \quad i \neq j. \]

Suppose that one of the $v$ treatment levels is a control level and comparisons are required with respect to this control level. Dunnett's procedure offers a solution in this case. Taking the $v$-th level as the control, the hypotheses are then $H_i: \mu_i = \mu_v$, $i = 1, \dots, v - 1$, which need to be tested against the set of hypotheses $K_i: \mu_i \neq \mu_v$. Dunnett's procedure begins with computation of the differences:

\[ |\bar{y}_{i\cdot} - \bar{y}_{v\cdot}|, \quad i = 1, \dots, v - 1. \]

The test procedure is to reject the hypothesis $H_i$ at size $\alpha$ if

\[ |\bar{y}_{i\cdot} - \bar{y}_{v\cdot}| > d_{\alpha}(v - 1, f) \sqrt{MS_E \left( \frac{1}{r_i} + \frac{1}{r_v} \right)}, \]

where $d_{\alpha}(v - 1, f)$ corresponds to Dunnett's distribution.

The four methods of multiple testing are next illustrated for the tensile strength experiment.
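
Since the tensile strength data are not reproduced here, the sketch below uses the PlantGrowth data shipped with R as a stand-in balanced layout; that substitution is an assumption made for illustration. It covers Bonferroni, Holm, and Tukey's HSD with base R functions only. Dunnett's procedure needs a contributed package such as multcomp, listed at the start of the chapter, and is not shown.

```r
fit <- aov(weight ~ group, data = PlantGrowth)   # stand-in balanced one-way layout
v <- 3; r <- 10                                  # treatments and observations per treatment
f   <- df.residual(fit)                          # degrees of freedom of the error mean square
mse <- deviance(fit) / f

# Tukey's HSD: critical difference for any pair of treatment means
hsd <- qtukey(0.95, nmeans = v, df = f) * sqrt(mse / r)
tuk <- TukeyHSD(fit, conf.level = 0.95)
hsd
tuk

# Bonferroni and Holm adjustments of the pairwise t-test p-values
pairwise.t.test(PlantGrowth$weight, PlantGrowth$group, p.adjust.method = "bonferroni")
pairwise.t.test(PlantGrowth$weight, PlantGrowth$group, p.adjust.method = "holm")
```

In a balanced design, the half-width of each interval reported by TukeyHSD coincides with the critical difference hsd computed by hand.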

The tensile strength experiment is an example of a balanced design. For an unbalanced design, the only change in the formulas is the use of c13-math-0153 instead of c13-math-0154, and fortunately all the results continue to hold. In R, there are no changes in the structure of the commands or in their functionality. Thus, the reader can easily solve the multiple testing problem for the olson dataset, which is an example of an unbalanced design.


13.4 Block Designs

In the CRD model, the source of variation arises due to the different treatment levels. It is also assumed that the nuisance factor is completely unknown and uncontrollable. In the case of the nuisance factor being known and controllable, the experiment can be designed to account for such a source of error through blocking.

13.4.1 Randomization and Analysis of Balanced Block Designs

In a block model, comparisons need to be carried out across the different treatment levels and blocks. In a balanced block design, each block has one observation per treatment level. Note that randomization is carried out only on the treatments within a block. The effects model for a balanced block design is specified by

\[ Y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}, \quad i = 1, \dots, v, \; j = 1, \dots, b, \tag{13.10} \]

where $\mu$ is the overall mean, $\tau_i$ is the effect of the $i$-th treatment, and $\beta_j$ is the effect of the $j$-th block. Also, define $N = vb$. As previously, we assume that the error term $\epsilon_{ij}$ follows a Gaussian distribution $N(0, \sigma^2)$.

The randomization for the above design 13.10 is illustrated next.
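
A minimal base R sketch of randomization within blocks is given below. The treatment labels and the number of blocks are hypothetical, and the sketch only illustrates the principle that each block receives its own independent random ordering of the treatments.

```r
set.seed(2024)                                  # for a reproducible layout
treatments <- c("A", "B", "C", "D")             # hypothetical treatment labels
b <- 5                                          # hypothetical number of blocks
# One independent random ordering of the treatments per block (one column per block)
layout <- sapply(seq_len(b), function(j) sample(treatments))
dimnames(layout) <- list(paste0("unit", seq_along(treatments)),
                         paste0("block", seq_len(b)))
layout
```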

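Towards inference for the randomized block design, define the following quantities:

\[ \bar{y}_{i\cdot} = \frac{1}{b} \sum_{j=1}^{b} y_{ij}, \quad \bar{y}_{\cdot j} = \frac{1}{v} \sum_{i=1}^{v} y_{ij}, \quad \bar{y}_{\cdot\cdot} = \frac{1}{N} \sum_{i=1}^{v} \sum_{j=1}^{b} y_{ij}. \]

The decomposition of the total sum of squares $SS_T$ is given below:

\[ SS_T = SS_{Tr} + SS_{Bl} + SS_E, \tag{13.11} \]

where

\[ SS_T = \sum_{i=1}^{v} \sum_{j=1}^{b} (y_{ij} - \bar{y}_{\cdot\cdot})^2, \quad SS_{Tr} = b \sum_{i=1}^{v} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2, \quad SS_{Bl} = v \sum_{j=1}^{b} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2, \quad SS_E = \sum_{i=1}^{v} \sum_{j=1}^{b} (y_{ij} - \bar{y}_{i\cdot} - \bar{y}_{\cdot j} + \bar{y}_{\cdot\cdot})^2. \]

Observe that $SS_T$ has $N - 1$ degrees of freedom, $SS_{Tr}$ has $v - 1$ df, $SS_{Bl}$ has $b - 1$ df, and finally, $SS_E$ has $(v - 1)(b - 1)$ df. The ANOVA table is then set up as given in Table 13.3. The balanced block design will be illustrated now.

Table 13.3 ANOVA for the Randomized Balanced Block Model

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | $F$-Statistic
Between Treatments | $SS_{Tr}$ | $v - 1$ | $MS_{Tr}$ | $MS_{Tr}/MS_E$
Between Blocks | $SS_{Bl}$ | $b - 1$ | $MS_{Bl}$ | $MS_{Bl}/MS_E$
Error | $SS_E$ | $(v - 1)(b - 1)$ | $MS_E$ |
Total | $SS_T$ | $N - 1$ | |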
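
The layout of Table 13.3 can be reproduced with aov. The sketch below runs on simulated data: the numbers of treatments and blocks, the effect sizes, and the noise level are all made up for illustration and do not come from any example in the text.

```r
set.seed(1)
v <- 4; b <- 5                                   # hypothetical numbers of treatments and blocks
d <- expand.grid(trt = factor(LETTERS[1:v]), block = factor(1:b))
# Simulated responses: made-up treatment and block effects plus Gaussian noise
tau  <- c(0, 1, 2, 3)
beta <- c(-2, -1, 0, 1, 2)
d$y <- 10 + tau[d$trt] + beta[d$block] + rnorm(nrow(d), sd = 1)

fit <- aov(y ~ trt + block, data = d)            # additive treatment and block effects
summary(fit)                                     # rows: treatments, blocks, error
```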

As seen in Models 13.2, 13.10, etc., these linear models need to be investigated for model assumptions and other regression diagnostics too. Can it be said without loss of generality that leverages are not a concern in DOE since all the covariate values are fixed? Now, we consider an example from Montgomery (2005), wherein we will investigate issues related to model adequacy.

13.4.2 Incomplete Block Designs

The constraints of the real world may not allow each block to have an experimental unit for every treatment level. The restriction may force allocation of only $k < v$ units within each block. Such designs are called incomplete block designs. If, in an incomplete block design, every pair of treatments appears together in the same block an equal number of times across the blocks, the design is called a balanced incomplete block design, abbreviated as BIBD. It may be seen that a BIBD may be constructed in c13-math-0196 different ways.

The setup is now recalled. We have $v$ treatments, $b$ distinct blocks, $k$ is the number of units appearing in a block, and $N = vr = bk$ is the total number of observations, where $r$ is the number of times a treatment appears in the design. Then, the number of times each pair of treatments appears in the same block is given by

\[ \lambda = \frac{r(k - 1)}{v - 1}. \tag{13.12} \]
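
The count in 13.12 can be verified on a small case. The sketch below assumes a hypothetical design with four treatments in blocks of size three, built by taking every subset of three treatments as a block, and then counts directly how often each pair of treatments shares a block.

```r
v <- 4; k <- 3                                  # hypothetical: 4 treatments, blocks of size 3
blocks <- combn(v, k)                           # each column is one block; all choose(v, k) of them
b <- ncol(blocks)                               # number of blocks
r <- sum(blocks == 1)                           # number of blocks containing treatment 1
lambda <- r * (k - 1) / (v - 1)                 # value given by the formula

# Count directly how often each pair of treatments appears in the same block
pairs <- combn(v, 2)
together <- apply(pairs, 2, function(p) sum(apply(blocks, 2, function(bl) all(p %in% bl))))
c(b = b, r = r, lambda = lambda)
together                                        # every entry should equal lambda
```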

The statistical model for BIBD is given by

\[ Y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}, \tag{13.13} \]

where $Y_{ij}$ is the $i$-th observation in the $j$-th block, $\mu$ is the overall mean, $\tau_i$ is the effect of the $i$-th treatment, and $\beta_j$ is the effect of the $j$-th block. The error term $\epsilon_{ij}$ is assumed to follow $N(0, \sigma^2)$. The total variability needs to be handled in a slightly different manner. Define

13.14 equation

where

equation

Using the c13-math-0216's, the adjusted treatment sum of squares is defined by

equation

Thus, the total variability is partitioned by

13.15 equation

where

equation

The ANOVA table for BIBD model is given in Table 13.4.

Table 13.4 ANOVA for the BIBD Model

Source of Variation Sum of Squares Degrees of Freedom Mean Square c13-math-0220-Statistic
Between Treatments c13-math-0221 c13-math-0222 c13-math-0223 c13-math-0224
Between Blocks c13-math-0225 c13-math-0226 c13-math-0227 c13-math-0228
Error c13-math-0229 c13-math-0230 c13-math-0231
Total c13-math-0232 c13-math-0233

The function BIB.test from the agricolae package is useful to fit a BIBD model.

Two very important variations of the block design will be considered in the rest of the section.

13.4.3 Latin Square Design

In each of the Examples 13.4.1 to 13.4.4, and in the models 13.2 and 13.10, we had at most a single known and controllable source of variation, and we used the blocking principle to overcome it. Suppose now that there are two sources of variation which may be controlled through blocking. Consider the following example from Montgomery (2001), page 144.

In the above example, we need to ensure that each formulation occurs exactly once for each type of raw material and also that each formulation is prepared exactly once by each of the five operators. This type of problem is handled by the Latin Square Design, abbreviated as LSD, and we use Latin letters to represent the formulations. A simple technique to set up an LSD for, say, $p$ formulations is to write the first row of the design as 1, 2, …, $p$, the second row as 2, 3, …, $p$, 1, the third row as 3, 4, …, $p$, 1, 2, and so forth until the last row as $p$, 1, 2, …, $p - 1$. Each row is thus a cyclic shift of the previous one, so that every formulation appears exactly once in each row and in each column. The rows and columns of such a square are then randomized, and the random number generator used for this purpose may be chosen from Wichmann-Hill, Marsaglia-Multicarry, Mersenne-Twister, Super-Duper, etc. A couple of illustrations are given next using the design.lsd function from the agricolae package.
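
The cyclic construction can also be sketched in base R alone, without the agricolae package. In the sketch below the value of $p$, the seed, and the use of letters as labels are illustrative choices; it builds the cyclic square and then shuffles its rows and columns, which keeps the Latin property intact.

```r
set.seed(7)
p <- 5                                           # number of formulations
# Cyclic square: row i is the first row shifted left by i - 1 positions
idx <- (outer(0:(p - 1), 0:(p - 1), "+") %% p) + 1
# Randomize the order of the rows and of the columns; the Latin property is preserved
idx <- idx[sample(p), sample(p)]
lsd <- matrix(LETTERS[idx], nrow = p)
lsd
```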

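The LSD model and its statistical analysis are now detailed. The LSD statistical model is specified by

\[ Y_{ijk} = \mu + \alpha_i + \tau_j + \beta_k + \epsilon_{ijk}, \quad i, j, k = 1, \dots, p. \tag{13.16} \]

Here, $\tau_j$ will continue to represent the effect of the $j$-th treatment (or formulation), $\alpha_i$ stands for the row (block) effect (raw material), and $\beta_k$ for the column (block) effect (operator). The errors are assumed to follow the normal distribution $N(0, \sigma^2)$. The ANOVA decomposition of the sum of squares for the LSD model 13.16 will be as follows:

\[ SS_T = SS_{Rows} + SS_{Columns} + SS_{Tr} + SS_E, \tag{13.17} \]

where, with $N = p^2$ and the sums running over the $p^2$ observed cells,

\[ SS_T = \sum y_{ijk}^2 - \frac{y_{\cdots}^2}{N}, \quad SS_{Tr} = \frac{1}{p} \sum_{j=1}^{p} y_{\cdot j \cdot}^2 - \frac{y_{\cdots}^2}{N}, \quad SS_{Rows} = \frac{1}{p} \sum_{i=1}^{p} y_{i \cdot\cdot}^2 - \frac{y_{\cdots}^2}{N}, \quad SS_{Columns} = \frac{1}{p} \sum_{k=1}^{p} y_{\cdot\cdot k}^2 - \frac{y_{\cdots}^2}{N}, \]

and $SS_E$ is obtained by subtraction.

The ANOVA table for the LSD is given in Table 13.5. For Example 13.4.5, the R method is illustrated next.

Table 13.5 ANOVA for the LSD Model

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | $F$-Statistic
Between Treatments | $SS_{Tr}$ | $p - 1$ | $MS_{Tr}$ | $MS_{Tr}/MS_E$
Between Rows | $SS_{Rows}$ | $p - 1$ | $MS_{Rows}$ | $MS_{Rows}/MS_E$
Between Columns | $SS_{Columns}$ | $p - 1$ | $MS_{Columns}$ | $MS_{Columns}/MS_E$
Error | $SS_E$ | $(p - 2)(p - 1)$ | $MS_E$ |
Total | $SS_T$ | $p^2 - 1$ | |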
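
The rocket propellant data of Example 13.4.5 are not reproduced here. As a stand-in, the sketch below analyses the OrchardSprays data shipped with R, which come from an 8 x 8 Latin square; the substitution is an assumption made for illustration, and only the structure of the call carries over to other Latin squares.

```r
# OrchardSprays (shipped with R): an 8 x 8 Latin square with 8 treatments
d <- transform(OrchardSprays, rowpos = factor(rowpos), colpos = factor(colpos))
# Each treatment occurs exactly once in every row and in every column
stopifnot(all(table(d$rowpos, d$treatment) == 1),
          all(table(d$colpos, d$treatment) == 1))

fit <- aov(decrease ~ treatment + rowpos + colpos, data = d)
summary(fit)                                     # rows: treatments, rows, columns, error
```

With $p = 8$, the degrees of freedom should read 7 for each of treatments, rows, and columns, and 42 for the error, in line with Table 13.5.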

There are many other useful variants of LSD and the reader may refer to Chapter 10 of Hinkelmann and Kempthorne (2008) for the same. An extension of the LSD is considered in the next topic.

13.4.4 Graeco Latin Square Design

Suppose that there is an extra source of variation for the Rocket Propellant problem in Example 13.4.5, namely the test assemblies, which forms an additional type of treatment. This second treatment may again be addressed by another LSD. Let us denote the levels of the second treatment by the Greek letters $\alpha$, $\beta$, $\gamma$, $\delta$, and $\epsilon$. There is some symbolic confusion here, in the sense that some of these Greek letters also appear as notation in Model 13.16. However, we will use them here as elements of the LSD matrix, and it should be clear from the context whether a Greek letter denotes an element of the design or a term of the statistical model. Now, the LSD matrices for the two treatments are superimposed on each other to obtain the Graeco-Latin Square Design model. Table 13.6 shows how the superimposing is done for two simple LSDs to obtain the GLSD.

Table 13.6 The GLSD Model

LSD | GSD | GLSD
A B C | $\alpha$ $\beta$ $\gamma$ | A$\alpha$ B$\beta$ C$\gamma$
B C A | $\gamma$ $\alpha$ $\beta$ | B$\gamma$ C$\alpha$ A$\beta$
C A B | $\beta$ $\gamma$ $\alpha$ | C$\beta$ A$\gamma$ B$\alpha$
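
The superimposition can be sketched in base R. The construction below is one standard way of obtaining a pair of orthogonal squares when the order $p$ is odd, and it is an assumption of this sketch rather than the method used by any particular package: the Latin square places $(i + j) \bmod p$ in cell $(i, j)$ and the Greek square places $(2i + j) \bmod p$ there.

```r
p <- 3                                           # this construction needs an odd p
i <- 0:(p - 1)
latin <- (outer(i, i, "+") %% p) + 1             # cell (i, j) holds (i + j) mod p
greek <- (outer(2 * i, i, "+") %% p) + 1         # cell (i, j) holds (2i + j) mod p
greek_letters <- c("alpha", "beta", "gamma")     # labels for the second treatment
# Superimpose the two squares cell by cell
glsd <- matrix(paste0(LETTERS[latin], ":", greek_letters[greek]), nrow = p)
glsd
```

Every combination of a Latin and a Greek letter appears exactly once in the result, which is what makes the two squares orthogonal.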

In more formal terms, the Graeco-Latin square design model, abbreviated as GLSD, is specified by

where c13-math-0297 now denotes the effect of the additional treatment. A couple of GLSDs will be first set up.

The ANOVA decomposition of the sum of squares for the GLSD model 13.18 will be as follows:

13.19 equation

where

equation

The ANOVA table for the GLSD is given in Table 13.7. For Example 13.4.5, the R method is illustrated next.

Table 13.7 ANOVA for the GLSD Model

Source of Variation Sum of Squares Degrees of Freedom Mean Square c13-math-0401-Statistic
Between Latin Treatments c13-math-0402 c13-math-0403 c13-math-0404 c13-math-0405
Between Greek Treatments c13-math-0406 c13-math-0407 c13-math-0408 c13-math-0409
Between Rows c13-math-0410 c13-math-0411 c13-math-0412 c13-math-0413
Between Columns c13-math-0414 c13-math-0415 c13-math-0416 c13-math-0417
Error c13-math-0418 c13-math-0419 c13-math-0420
Total c13-math-0421 c13-math-0422

We now move to more important and complex experimental designs.

13.5 Factorial Designs

Factorial designs are useful in experiments involving two or more factor variables. Suppose that there are two factor variables, with variable 1 having $a$ levels and variable 2 having $b$ levels. A factorial design investigates all possible combinations of the levels of the factors. In simple words, if factor $A$ is at $a$ levels, factor $B$ at $b$ levels, and factor $C$ at $c$ levels, a factorial design will investigate all $abc$ possible combinations of levels. It is then common practice to say that the factors are crossed, implying that each level of a factor is observed together with every level of the other factors.

In factorial designs, the interest is more often in determining whether there is an interaction between the factors. What does interaction really mean? In general, the main effect of a factor is the change in the expected value of the regressand due to a change in the level of that factor. However, if this change also depends on the levels of other factor variables, we say that there is an interaction effect between the factor variables. In the previously discussed designs in Sections 13.3 and 13.4, the factors were implicitly assumed not to have any kind of interaction among their different levels. Recollect from the three-dimensional scatter plot in Figure 12.6 and the contour plot in Figure 12.7 that the model 12.31 consisted of straight lines or planes only. However, for the models 12.31 and Figure 13.6, the three-dimensional plots and contour plots had curvilinear shapes in them, which is an indication of the presence of interaction among the variables.

Figure 13.6 Design and Interaction Plots for 2-Factorial Design. Panel A: Design of the battery factorial experiment (Mean of L against Factors). Panel B: Interaction effect of the battery factorial experiment (Average life against Temperature).

Figure 13.7 Understanding Interactions for the Bottling Experiment. Four panels under the heading “Understanding height of bottling - interaction plots”; the recoverable axes are Mean of deviation against the Carbonation, Pressure, and Speed factors, and Deviation against Pressure.
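
Plots of this kind can be drawn with the base R function interaction.plot. The battery and bottling data are not reproduced here, so the sketch below uses the warpbreaks data shipped with R as a stand-in; that choice, and the axis labels, are assumptions made for illustration.

```r
# warpbreaks (shipped with R): breaks per loom for 2 wool types x 3 tension levels
with(warpbreaks,
     interaction.plot(x.factor = tension, trace.factor = wool, response = breaks,
                      fun = mean, type = "b",
                      xlab = "Tension", ylab = "Mean number of breaks",
                      trace.label = "Wool"))
# Cell means behind the plot: roughly parallel profiles would suggest no interaction
cell_means <- with(warpbreaks, tapply(breaks, list(wool, tension), mean))
cell_means
```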

Sir R.A. Fisher advocated the use of complex designs such as factorial designs with the words: “No aphorism is more frequently repeated in connection with field trials, than that we must ask Nature a few questions, or, ideally, one question, at a time. The writer is convinced that this view is wholly mistaken.” To understand this advantage, consider two factor variables at two levels each. Suppose that the two factor variables, $A$ and $B$, are both available at levels high ($+$) and low ($-$). Then the four possible treatment combinations are $A^-B^-$, $A^+B^-$, $A^-B^+$, and $A^+B^+$. If low refers to the absence of a factor, it is also common practice to denote these respective combinations by $(1)$, $a$, $b$, and $ab$. Suppose we are interested in finding the effect of changing the factors $A$ and $B$. In the presence of experimental error, a rule of thumb is to take two observations at each combination of the factor levels. Now, the effect of factor $A$ is found from the difference $a - (1)$, while that of $B$ is found from $b - (1)$. Thus, we require data on the three combinations $(1)$, $a$, and $b$, or equivalently six observations only. In a full factorial experiment, we would also have two more data points at the combination $ab$. Now, we can obtain two estimates of the main effect of $A$, from $a - (1)$ and $ab - b$, and two estimates of the main effect of $B$ too, from $b - (1)$ and $ab - a$.

In the remainder of this section, we will consider some useful factorial designs. The focus will be only on fixed effects models.

13.5.1 Two Factorial Experiment

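Consider the case of two factor variables. Suppose that factor $A$ is at $a$ levels, and $B$ at $b$ levels. The two factorial design model is given by

\[ Y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}, \quad i = 1, \dots, a, \; j = 1, \dots, b, \; k = 1, \dots, n, \tag{13.20} \]

where $n$ is the number of replicates at each combination of the factor levels. Here, $\mu$ is the overall mean effect, $\tau_i$ is the effect of the $i$-th factor level of $A$, $\beta_j$ is the $j$-th effect of the factor $B$, and $(\tau\beta)_{ij}$ is the interaction effect between the variables. The ANOVA decomposition of the sum of squares for the two factorial experiments is given by

\[ SS_T = SS_A + SS_B + SS_{AB} + SS_E, \tag{13.21} \]

where

\[ SS_A = bn \sum_{i=1}^{a} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdots})^2, \quad SS_B = an \sum_{j=1}^{b} (\bar{y}_{\cdot j \cdot} - \bar{y}_{\cdots})^2, \quad SS_{AB} = n \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j \cdot} + \bar{y}_{\cdots})^2, \]

\[ SS_E = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij\cdot})^2, \quad SS_T = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (y_{ijk} - \bar{y}_{\cdots})^2. \]

The computation of $SS_{AB}$ by hand involves two stages. Since such issues do not exist when we use R, we skip this detail. The ANOVA table for the two-factorial model is given in Table 13.8. The hypothesis testing problems of interest here are the following:

  1. Testing for Factor $A$: $H: \tau_1 = \dots = \tau_a = 0$ against $K: \tau_i \neq 0$ for at least one $i$.
  2. Testing for Factor $B$: $H: \beta_1 = \dots = \beta_b = 0$ against $K: \beta_j \neq 0$ for at least one $j$.
  3. Testing for the Interaction of factors: $H: (\tau\beta)_{ij} = 0$ for all $i, j$ against $K: (\tau\beta)_{ij} \neq 0$ for at least one pair $(i, j)$.

Table 13.8 ANOVA for the Two Factorial Model

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | $F$-Statistic
Treatment A | $SS_A$ | $a - 1$ | $MS_A$ | $MS_A/MS_E$
Treatment B | $SS_B$ | $b - 1$ | $MS_B$ | $MS_B/MS_E$
Interaction | $SS_{AB}$ | $(a - 1)(b - 1)$ | $MS_{AB}$ | $MS_{AB}/MS_E$
Error | $SS_E$ | $ab(n - 1)$ | $MS_E$ |
Total | $SS_T$ | $abn - 1$ | |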
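
The structure of Table 13.8 can be seen in the output of aov. The sketch below again uses the warpbreaks data shipped with R as a stand-in two-factor experiment, which is an assumption made for illustration; the formula wool * tension expands to both main effects and their interaction.

```r
# warpbreaks: 2 wool types (a = 2), 3 tension levels (b = 3), n = 9 looms per combination
fit <- aov(breaks ~ wool * tension, data = warpbreaks)   # main effects plus interaction
tab <- summary(fit)[[1]]
tab

a <- nlevels(warpbreaks$wool); b <- nlevels(warpbreaks$tension); n <- 9
expected_df <- c(a - 1, b - 1, (a - 1) * (b - 1), a * b * (n - 1))
rbind(observed = tab$Df, expected = expected_df)         # degrees of freedom of Table 13.8
```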

The two-factorial experiment is now illustrated with an example from Montgomery (2005).

The extension of the two-factorial experiment is discussed in the next subsection.

13.5.2 Three-Factorial Experiment

A natural extension of the two-factorial experiment is the three-factorial experiment. Let the three factors be denoted by $A$, $B$, and $C$, at $a$, $b$, and $c$ factor levels respectively. Consequently, there are now three more interaction terms to be modeled, namely $AC$, $BC$, and $ABC$. Towards a completely crossed model, we need each replicate (of index $l$) to cover all possible factor combinations. Thus, the three-way factorial model is given by

\[ Y_{ijkl} = \mu + \tau_i + \beta_j + \gamma_k + (\tau\beta)_{ij} + (\tau\gamma)_{ik} + (\beta\gamma)_{jk} + (\tau\beta\gamma)_{ijk} + \epsilon_{ijkl}, \tag{13.22} \]

with $i = 1, \dots, a$, $j = 1, \dots, b$, $k = 1, \dots, c$, and $l = 1, \dots, n$.

The total sum of squares decomposition equation is then

\[ SS_T = SS_A + SS_B + SS_C + SS_{AB} + SS_{AC} + SS_{BC} + SS_{ABC} + SS_E, \tag{13.23} \]

where

equation

The ANOVA table for the three-factorial design model is given in Table 13.9.

Table 13.9 ANOVA for the Three-Factorial Model

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | c13-math-0516-Statistic
Treatment A | c13-math-0517 | c13-math-0518 | c13-math-0519 | c13-math-0520
Treatment B | c13-math-0521 | c13-math-0522 | c13-math-0523 | c13-math-0524
Treatment C | c13-math-0525 | c13-math-0526 | c13-math-0527 | c13-math-0528
Interaction (AB) | c13-math-0529 | c13-math-0530 | c13-math-0531 | c13-math-0532
Interaction (AC) | c13-math-0533 | c13-math-0534 | c13-math-0535 | c13-math-0536
Interaction (BC) | c13-math-0537 | c13-math-0538 | c13-math-0539 | c13-math-0540
Interaction (ABC) | c13-math-0541 | c13-math-0542 | c13-math-0543 | c13-math-0544
Error | c13-math-0545 | c13-math-0546 | c13-math-0547
Total | c13-math-0548 | c13-math-0549
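
The row structure of Table 13.9 can be reproduced with a three-way aov fit. The sketch below uses simulated data; the data frame dat3, the level counts a, b, c_lev, the number of replicates n, and the object names are assumptions made only for illustration.

```r
# Hypothetical balanced three-factor data with n replicates per combination
set.seed(139)
a <- 2; b <- 3; c_lev <- 2; n <- 2
dat3 <- expand.grid(r = 1:n,
                    A = factor(1:a),
                    B = factor(1:b),
                    C = factor(1:c_lev))
dat3$y <- rnorm(nrow(dat3), mean = 20, sd = 2)

# A * B * C expands to all main effects, all two-way interactions,
# and the three-way interaction A:B:C
fit3 <- aov(y ~ A * B * C, data = dat3)
tab3 <- summary(fit3)[[1]]   # rows: A, B, C, A:B, A:C, B:C, A:B:C, Residuals
print(tab3)

# Degrees of freedom expected for the balanced three-factor layout
df3_expected <- c(a - 1, b - 1, c_lev - 1,
                  (a - 1) * (b - 1), (a - 1) * (c_lev - 1), (b - 1) * (c_lev - 1),
                  (a - 1) * (b - 1) * (c_lev - 1),
                  a * b * c_lev * (n - 1))
```

With a single replicate (n = 1) the error line has zero degrees of freedom, which is why each factor combination needs to be replicated if the three-way interaction is to be tested.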

Since the details of the two-factorial experiment extend in a fairly straightforward manner to the current model, the model will be illustrated with two examples from Montgomery (2005).

The principle of blocking in the context of factorial experiments is taken up in the next subsection.

13.5.3 Blocking in Factorial Experiments

The two- and three-factorial experiments discussed above are extensions of the basic CRD model. In Section 13.4, we saw that replication alone does not eliminate the source of error and that the principle of blocking helps improve the efficiency of the experiment. Similarly, in the case of factorial experiments, practical considerations sometimes dictate that the replication principle cannot be implemented uniformly, and the factorial experiment may then be improved by introducing blocks. The blocking model for a two-factorial experiment is stated next:

13.24 equation

The total sum of squares decomposition equation for the model 13.24 is then

13.25 equation

where

equation

The corresponding ANOVA table for the blocking factorial experiment 13.24 is given in Table 13.10.

Table 13.10 ANOVA for Factorial Models with Blocking

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | c13-math-0555-Statistic
Blocks | c13-math-0556 | c13-math-0557 | c13-math-0558 | c13-math-0559
Treatment A | c13-math-0560 | c13-math-0561 | c13-math-0562 | c13-math-0563
Treatment B | c13-math-0564 | c13-math-0565 | c13-math-0566 | c13-math-0567
Interaction (AB) | c13-math-0568 | c13-math-0569 | c13-math-0570 | c13-math-0571
Error | c13-math-0572 | c13-math-0573 | c13-math-0574
Total | c13-math-0575 | c13-math-0576
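
A fit with the row structure of Table 13.10 is sketched below on simulated data. It assumes that each replicate of the full factorial is run within one block and that the block effect is additive, with no block-by-treatment interaction. The data frame datb, the block shifts, and the object names are hypothetical.

```r
# Hypothetical data: an a x b factorial run once in each of n blocks
set.seed(1310)
a <- 3; b <- 2; n <- 4
datb <- expand.grid(A = factor(1:a), B = factor(1:b), block = factor(1:n))

# Assumed additive block shifts (illustrative values only)
block_shift <- c(-2, 0, 1, 3)
datb$y <- 30 + block_shift[as.integer(datb$block)] + rnorm(nrow(datb))

# The block enters as an additive term ahead of the factorial structure
fitb <- aov(y ~ block + A * B, data = datb)
tabb <- summary(fitb)[[1]]   # rows: block, A, B, A:B, Residuals
print(tabb)

# Degrees of freedom expected under the additive blocking model
dfb_expected <- c(n - 1, a - 1, b - 1, (a - 1) * (b - 1), (a * b - 1) * (n - 1))
```

Compared with the unblocked two-factor fit, the n - 1 degrees of freedom for blocks are taken out of the error line, so the error sum of squares no longer contains the block-to-block variation.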

The theory and applications of Experimental Designs extend far beyond the models discussed here; the purpose of the current chapter has been to consider some of the important models among them.

13.6 Further Reading

Kempthorne (1952), Cochran and Cox (1958), Box, Hunter, and Hunter (2005), Federer (1955), and of course, Fisher (1971) are some of the earliest treatises in this area.

Montgomery (2005), Wu and Hamada (2000–9), and Dean and Voss (1999) are some of the modern accounts of DOE. Casella (2008) is a very useful source with emphasis on computations using R. The web link http://cran.r-project.org/web/views/ExperimentalDesign.html is very useful for the reader interested in a nearly exhaustive list of options for DOE analysis using R.

13.7 Complements, Problems, and Programs

  1. Problem 13.1 Identify which of the design models studied in this chapter are appropriate for the datasets available in the BHH2 package. The list of datasets available in the package may be found with try(data(package="BHH2")). The exercise may also be repeated for design-related packages such as agricolae, AlgDesign, and granova.

  2. Problem 13.2 Carry out the diagnostic tests for the olson_crd fitted model in Example 13.3.4. Repeat a similar exercise for the ANOVA model fitted in Example 13.3.2.

  3. Problem 13.3 The multiple comparison tests of Dunnett, Tukey, Holm, and Bonferroni have been explored in Example 13.3.8. The confidence intervals are reported only for TukeyHSD. The reader should obtain the confidence intervals for the contrasts of the remaining multiple comparison tests.

  4. Problem 13.4 Explore the use of the functions design.crd and design.rcbd from the agricolae package for setting up CRD and block designs.

  5. Problem 13.5 The function granova.2w may be applied on the girdernew dataset with a slight modification. Create a new data frame girdernew2 <- girdernew[,c(3,1,2)]. Note that an additional R package, rgl, will be required. Test the code granova.2w(girdernew2, ss ~ gf + mf) and make an attempt to interpret the output.

  6. Problem 13.6 In the fitted models ssaov, hardness_aov, rocket_aov, rocket.aov, and rocket.glsd.aov discussed in Section 13.4, investigate the presence of an interaction effect through the use of the interaction.plot graphical function.

  7. Problem 13.7 Perform the diagnostic tests on the BIBD model in Example 13.4.4.

  8. Problem 13.8 Investigate the presence of outliers and influential measures for the fitted models rocket.lm and rocket.glsd.lm in the respective Examples 13.4.7 and 13.4.9.

  9. Problem 13.9 Obtain the confidence intervals for the contrasts of the fitted model battery.aov in Example 13.5.1.

  10. Problem 13.10 Carry out the diagnostic tests for the fitted models bottling.aov, SP.aov, and intensity.aov.
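
Several of the problems above ask for diagnostic tests on a fitted aov model. One possible starting point is sketched below on simulated data. The data frame datd, the object names, and the choice of the Shapiro-Wilk and Bartlett tests are illustrative assumptions; the fitted objects named in the problems (such as battery.aov) are not reproduced here.

```r
# Hypothetical balanced two-factor data standing in for a fitted example
set.seed(1311)
datd <- expand.grid(r = 1:4, A = factor(1:3), B = factor(1:3))
datd$y <- rnorm(nrow(datd), mean = 100, sd = 10)
fitd <- aov(y ~ A * B, data = datd)

# Normality of the residuals
res <- residuals(fitd)
sw  <- shapiro.test(res)

# Homogeneity of variance across the treatment combinations (cells)
datd$cell <- interaction(datd$A, datd$B)
bt <- bartlett.test(y ~ cell, data = datd)

# Collect the two p-values
c(shapiro = sw$p.value, bartlett = bt$p.value)
```

The same residuals can be inspected graphically with plot(fitd), and with(datd, interaction.plot(A, B, y)) gives the interaction plot mentioned in Problem 13.6.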
