Response surface modeling is essentially a regression analysis problem. When data are collected on several response (dependent) variables under controlled levels of process or recipe (independent) variables, the responses are usually correlated. Moreover, as often happens, the levels of the process or recipe variables that are optimum for one response may not be optimum for the others. It is therefore important to investigate the response variables simultaneously, rather than individually or independently of one another, so that their interrelationships are also taken into account. Consequently, a "best model" search carried out separately for each response variable may not be meaningful. What is desired is a best set of models for these responses. One way to obtain it is to fit the multivariate regression models simultaneously and to test the significance of the various terms corresponding to the independent variables using multivariate methods. This can be done by the repeated use of the MTEST statement in PROC REG.
EXAMPLE 5 Quality Improvement of Mullet Flesh
Tseo, Deng, Cornell, Khuri, and Schmidt (1983) considered a study of quality improvement of minced mullet flesh in which three process variables, namely washing temperature (TEMP), washing time (TIME), and washing ratio (WRATIO), were varied in a controlled manner in a designed experiment. The four responses, springiness (SPRNESS) in mm, TBA number (TBA), cooking loss (COOKLOSS) in %, and whiteness index (WHITNESS), were observed for all 17 experiments. For the analysis, all the variables are standardized to have zero means. The independent variables are standardized so that the levels of the three variables are +1 and −1 in the 2^3 full factorial part of the design. To do so, we define (as coded in Program 3.6)
X1 = (TEMP − 33)/7.0, X2 = (TIME − 5.5)/2.7, X3 = (WRATIO − 22.5)/4.5.
The four response variables are standardized to have sample variances equal to 1. The standardized variables are respectively denoted by y1, y2, y3, and y4. The multiple response surfaces are fitted for y1, y2, y3, and y4 as functions of X1, X2, and X3.
Suppose the interest is in simultaneously fitting models that contain effects only up to the second degree in X1, X2, and X3, with interactions of at most two variables. For that, within the DATA step for the data set WASH, we first obtain the values of X1, X2, and X3 as indicated above, then their two-variable interactions X1X2=X1*X2, X1X3=X1*X3, and X2X3=X2*X3, and finally the quadratic effects X1SQ=X1*X1, X2SQ=X2*X2, and X3SQ=X3*X3. The standardized values of Y1,...,Y4 are obtained with the STANDARD procedure, where the options MEAN=0 and STD=1 set the sample means to zero and the sample standard deviations to unity for the variables listed in the VAR statement. The output is stored in the data set WASH2. We then perform a multivariate regression analysis on WASH2 and use the various significance tests to arrive at appropriate response surfaces.
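As a quick sanity check on the coding and standardization just described, one could summarize the variables once WASH2 has been created by Program 3.6. The step below is only a sketch and is not part of the original program; it assumes the variable names used in Program 3.6.

```sas
/* Sketch (not part of Program 3.6): check the standardization.     */
/* Assumes WASH2 already exists with Y1-Y4 and X1-X3 as defined in  */
/* Program 3.6.                                                      */
proc means data = wash2 n mean std min max;
   var y1 y2 y3 y4 x1 x2 x3;
run;
```

If the steps worked as intended, Y1 through Y4 should show means of zero and standard deviations of one, and X1 through X3 should take the values −1 and +1 at the factorial points.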
A complete second-order model would involve a total of nine terms, namely X1, X2, X3, X1X2, X1X3, X2X3, X1SQ, X2SQ, and X3SQ, apart from the intercept. We will confine our discussion to only three specific hypotheses:
H0(1) The multivariate model contains only the linear terms plus an intercept vector.
H0(2) The multivariate model is quadratic without any interaction terms.
H0(3) The multivariate model has only linear terms, two-variable interaction terms, and an intercept vector, but no quadratic terms.
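In symbols, the three hypotheses can be written as constraints on rows of the regression coefficient matrix. The notation below (the matrix B, the row vectors β, and their subscripts) is introduced here only for illustration and is not taken from the text; it is a sketch of the usual multivariate linear model formulation.

```latex
% Full second-order model: 17 runs, 4 responses, 10 coefficients per response.
\[
  \mathbf{Y}_{17\times 4}
  = \mathbf{X}_{17\times 10}\,\mathbf{B}_{10\times 4} + \boldsymbol{\mathcal{E}}_{17\times 4},
\]
% Each row of B is a 1 x 4 vector holding one term's coefficients for y1,...,y4:
% beta_0 (intercept); beta_1, beta_2, beta_3 (X1, X2, X3);
% beta_11, beta_22, beta_33 (X1SQ, X2SQ, X3SQ);
% beta_12, beta_13, beta_23 (X1X2, X1X3, X2X3).
\begin{align*}
  H_0^{(1)} &: \boldsymbol{\beta}_{11} = \boldsymbol{\beta}_{22} = \boldsymbol{\beta}_{33}
             = \boldsymbol{\beta}_{12} = \boldsymbol{\beta}_{13} = \boldsymbol{\beta}_{23} = \mathbf{0},\\
  H_0^{(2)} &: \boldsymbol{\beta}_{12} = \boldsymbol{\beta}_{13} = \boldsymbol{\beta}_{23} = \mathbf{0},\\
  H_0^{(3)} &: \boldsymbol{\beta}_{11} = \boldsymbol{\beta}_{22} = \boldsymbol{\beta}_{33} = \mathbf{0}.
\end{align*}
```

Each hypothesis sets the listed coefficients to zero for all four responses at once, which is what distinguishes a multivariate test from four separate univariate tests.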
The three null hypotheses stated above can respectively be tested using the following three MTEST statements:
linear: mtest x1sq, x2SQ, x3sq, x1x2, x1x3, x2x3/print;
nointctn: mtest x1x2, x1x3, x2x3/print;
noquad: mtest x1sq, x2SQ, x3sq/print;
The names LINEAR, NOINTCTN and NOQUAD before the colon (:) in the respective three statements are used only for labeling purposes and are optional. The SAS code and resulting output are presented in Program 3.6 and Output 3.6.
/* Program 3.6 */
options ls=64 ps=45 nodate nonumber;
title1 'Output 3.6';
data wash;
   input temp time wratio sprness tba cookloss whitness ;
   /* Coded process variables: -1 and +1 at the factorial points */
   x1 = (temp - 33)/7.0 ;
   x2 = (time - 5.5)/2.7 ;
   x3 = (wratio - 22.5)/4.5 ;
   /* Quadratic and two-variable interaction terms */
   x1sq = x1*x1; x2SQ = x2*x2; x3sq = x3*x3;
   x1x2 = x1*x2; x1x3 = x1*x3; x2x3 = x2*x3;
   y1 = sprness; y2 = tba; y3 = cookloss; y4 = whitness ;
   lines;
26.0  2.8 18.0 1.83 29.31 29.50 50.36
40.0  2.8 18.0 1.73 39.32 19.40 48.16
26.0  8.2 18.0 1.85 25.16 25.70 50.72
40.0  8.2 18.0 1.67 40.81 27.10 49.69
26.0  2.8 27.0 1.86 29.82 21.40 50.09
40.0  2.8 27.0 1.77 32.20 24.00 50.61
26.0  8.2 27.0 1.88 22.01 19.60 50.36
40.0  8.2 27.0 1.66 40.02 25.10 50.42
21.2  5.5 22.5 1.81 33.00 24.20 29.31
44.8  5.5 22.5 1.37 51.59 30.60 50.67
33.0  1.0 22.5 1.85 20.35 20.90 48.75
33.0 10.0 22.5 1.92 20.53 18.90 52.70
33.0  5.5 14.9 1.88 23.85 23.00 50.19
33.0  5.5 30.1 1.90 20.16 21.20 50.86
33.0  5.5 22.5 1.89 21.72 18.50 50.84
33.0  5.5 22.5 1.88 21.21 18.60 50.93
33.0  5.5 22.5 1.87 21.55 16.80 50.98
;
/* Source: Tseo et al. (1983). Reprinted by permission of the
   Institute of Food Technologists. */
/* Standardize the responses to mean 0 and standard deviation 1 */
proc standard data=wash mean=0 std=1 out=wash2 ;
   var y1 y2 y3 y4 ;
run;
/* Full second-order multivariate model with the three tests */
proc reg data = wash2;
   model y1 y2 y3 y4 = x1 x2 x3 x1sq x2SQ x3sq x1x2 x1x3 x2x3 ;
   Linear: mtest x1sq, x2SQ, x3sq, x1x2, x1x3, x2x3/print;
   Nointctn: mtest x1x2,x1x3,x2x3/print;
   Noquad: mtest x1sq, x2SQ, x3sq/print;
   title2 'Quality Improvement in Mullet Flesh';
run;
/* Quadratic model without the interaction terms */
proc reg data = wash2;
   model y1 y2 y3 y4 = x1 x2 x3 x1sq x2SQ x3sq ;
run;
Quality Improvement in Mullet Flesh

Multivariate Test: LINEAR

E, the Error Matrix
 0.8657014633   0.0034272456  -0.815101433   -1.634723273
 0.0034272456   0.661373219   -0.17637356     0.6463179546
-0.815101433   -0.17637356     1.9374527491   2.2687768511
-1.634723273    0.6463179546   2.2687768511   6.765404456

H, the Hypothesis Matrix
 8.0041171357  -8.822976865   -7.893007279    6.5980348245
-8.822976865   10.054652572    9.1165224684  -7.103378577
-7.893007279    9.1165224684  12.622482263   -5.15027833
 6.5980348245  -7.103378577   -5.15027833     5.9966106878

Multivariate Statistics and F Approximations
S=4 M=0.5 N=1
Statistic                  Value        F   Num DF   Den DF   Pr > F
Wilks' Lambda           0.001812    3.228       24   15.164   0.0107
Pillai's Trace          2.001528   1.1685       24       28   0.3436
Hotelling-Lawley Trace   88.2617   9.1939       24       10   0.0004
Roy's Greatest Root     83.82687   97.798        6        7   0.0001
NOTE: F Statistic for Roy's Greatest Root is an upper bound.

Multivariate Test: NOINTCTN

E, the Error Matrix
 0.8657014633   0.0034272456  -0.815101433   -1.634723273
 0.0034272456   0.661373219   -0.17637356     0.6463179546
-0.815101433   -0.17637356     1.9374527491   2.2687768511
-1.634723273    0.6463179546   2.2687768511   6.765404456

H, the Hypothesis Matrix
 0.3238160164  -0.440143047   -0.758060209   -0.030667821
-0.440143047    0.7018039785   0.6864395048  -0.025289745
-0.758060209    0.6864395048   3.9583870894   0.4937367103
-0.030667821   -0.025289745    0.4937367103   0.0844373123

Multivariate Statistics and F Approximations
S=3 M=0 N=1
Statistic                  Value        F   Num DF   Den DF   Pr > F
Wilks' Lambda           0.061574   1.6927       12   10.875   0.1973
Pillai's Trace          1.443396   1.3909       12       18   0.2558
Hotelling-Lawley Trace  7.205689   1.6013       12        8   0.2566
Roy's Greatest Root      5.86998    8.805        4        6   0.0110
NOTE: F Statistic for Roy's Greatest Root is an upper bound.

Multivariate Test: NOQUAD

E, the Error Matrix
 0.8657014633   0.0034272456  -0.815101433   -1.634723273
 0.0034272456   0.661373219   -0.17637356     0.6463179546
-0.815101433   -0.17637356     1.9374527491   2.2687768511
-1.634723273    0.6463179546   2.2687768511   6.765404456

H, the Hypothesis Matrix
 7.6803011193  -8.382833817   -7.134947069    6.6287026452
-8.382833817    9.3528485931   8.4300829636  -7.078088833
-7.134947069    8.4300829636   8.6640951737  -5.644015041
 6.6287026452  -7.078088833   -5.644015041    5.9121733755

Multivariate Statistics and F Approximations
S=3 M=0 N=1
Statistic                  Value        F   Num DF   Den DF   Pr > F
Wilks' Lambda           0.004154   6.2945       12   10.875   0.0024
Pillai's Trace          1.659428   1.8568       12       18   0.1141
Hotelling-Lawley Trace  81.05601   18.012       12        8   0.0002
Roy's Greatest Root     79.06247   118.59        4        6   0.0001
NOTE: F Statistic for Roy's Greatest Root is an upper bound.
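For reading the output above, it may help to recall how the four statistics and the constants S, M, and N relate to the matrices E and H. The summary below is a sketch using the standard definitions; the symbols p, q, and ν (number of responses, number of coefficients tested per response, and error degrees of freedom) are notation introduced here, not taken from the text.

```latex
% Test statistics as functions of the error matrix E and hypothesis matrix H
\begin{align*}
  \Lambda &= \frac{|\mathbf{E}|}{|\mathbf{E}+\mathbf{H}|}
     && \text{(Wilks' Lambda)}\\
  V &= \operatorname{tr}\!\left[\mathbf{H}(\mathbf{H}+\mathbf{E})^{-1}\right]
     && \text{(Pillai's trace)}\\
  U &= \operatorname{tr}\!\left(\mathbf{E}^{-1}\mathbf{H}\right)
     && \text{(Hotelling-Lawley trace)}\\
  \Theta &= \lambda_{\max}\!\left(\mathbf{E}^{-1}\mathbf{H}\right)
     && \text{(Roy's greatest root)}
\end{align*}
% Constants printed with each test
\[
  S = \min(p, q), \qquad M = \frac{|p-q|-1}{2}, \qquad N = \frac{\nu-p-1}{2}.
\]
% Here p = 4 and nu = 17 - 10 = 7 (17 runs, 10 coefficients per response).
% LINEAR (q = 6):              S = 4, M = 0.5, N = 1.
% NOINTCTN and NOQUAD (q = 3): S = 3, M = 0,   N = 1.
% Check of the F approximation for Pillai's trace in the LINEAR test:
\[
  F = \frac{2N+S+1}{2M+S+1}\cdot\frac{V}{S-V}
    = \frac{7}{6}\cdot\frac{2.001528}{4-2.001528} \approx 1.1685,
\]
% with S(2M+S+1) = 24 and S(2N+S+1) = 28 degrees of freedom.
```

These values agree with the S, M, N lines and with the Pillai's trace row printed for the LINEAR test.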
The output from the MTEST statements shows that the null hypothesis H0(1), which states that the models contain only linear terms and intercepts, is probably not true. The p value corresponding to Wilks' Λ is 0.0107, and all the other multivariate tests except Pillai's trace (p value 0.3436) also produce small p values. Thus at least some of the quadratic and/or interaction terms may be important and may need to be included in the model. It is therefore of interest to test the hypotheses H0(2) and H0(3) (among many others), which test specifically for the absence of two-variable interaction effects and the absence of quadratic effects, respectively. As Output 3.6 shows, the null hypothesis H0(2) is not rejected by Wilks' Λ, Pillai's trace, or the Hotelling-Lawley trace (p values 0.1973, 0.2558, and 0.2566), and hence we may probably drop all the two-variable interaction terms from the model. In view of the small p values corresponding to all multivariate tests except Pillai's trace, H0(3) is rejected, leading to the conclusion that at least some quadratic effects are present. As a result, the equations of the four estimated response surfaces, obtained from the corresponding univariate analyses (output not shown), are
Ŷ1 = 0.5831 − 0.7187*X1 − 0.0073*X2 + 0.0667*X3 − 0.7569*X1SQ + 0.0099*X2SQ + 0.0227*X3SQ,
Ŷ2 = −0.8297 − 0.6071*X1 − 0.0186*X2 − 0.1314*X3 + 0.8736*X1SQ + 0.0510*X2SQ + 0.1065*X3SQ,
Ŷ3 = −1.1671 + 0.1854*X1 − 0.0025*X2 − 0.2660*X3 + 0.856*X1SQ + 0.2044*X2SQ + 0.3904*X3SQ,
Ŷ4 = 0.2979 − 0.4684*X1 + 0.1212*X2 + 0.0516*X3 − 0.6040*X1SQ + 0.1275*X2SQ + 0.1075*X3SQ.
The terms X2, X3, X2SQ, and X3SQ can also be dropped (the output is not shown here), leaving the four quadratic response surfaces as functions of X1 only, that is, of temperature alone. In that sense, the four responses appear to be robust with respect to washing time and washing ratio.
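The test behind this last step is not shown in the text. One way it could be carried out, following the same MTEST pattern as in Program 3.6, is sketched below. The label TEMPONLY and the final refit are illustrative assumptions, not part of the original program.

```sas
/* Sketch: test whether the washing time and washing ratio terms    */
/* can be dropped from the quadratic model without interactions.    */
/* Assumes WASH2 from Program 3.6. TEMPONLY is a hypothetical,      */
/* optional label.                                                   */
proc reg data = wash2;
   model y1 y2 y3 y4 = x1 x2 x3 x1sq x2sq x3sq ;
   Temponly: mtest x2, x3, x2sq, x3sq / print;
run;
quit;

/* If the hypothesis is not rejected, refit with temperature only.  */
proc reg data = wash2;
   model y1 y2 y3 y4 = x1 x1sq ;
run;
quit;
```

The first step tests the four terms jointly across all four responses. The second step would give the reduced response surfaces in X1 and X1SQ alone.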