Example 30.3: Unbalanced ANOVA for Two-Way Design with Interaction
This example uses data from Kutner (1974, p. 98) to
illustrate a two-way analysis of variance.
The original data source is Afifi and Azen (1972, p. 166).
These statements produce Output 30.3.1.
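   /*---------------------------------------------------------*/
   /* Note: Kutner's 24 for drug 2, disease 1 changed to 34.  */
   /*---------------------------------------------------------*/
   title 'Unbalanced Two-Way Analysis of Variance';
   data a;
      input drug disease @;
      do i=1 to 6;
         input y @;
         output;
      end;
      datalines;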
   1 1 42 44 36 13 19 22
   1 2 33  . 26  . 33 21
   1 3 31 -3  . 25 25 24
   2 1 28  . 23 34 42 13
   2 2  . 34 33 31  . 36
   2 3  3 26 28 32  4 16
   3 1  .  .  1 29  . 19
   3 2  . 11  9  7  1 -6
   3 3 21  1  .  9  3  .
   4 1 24  .  9 22 -2 15
   4 2 27 12 12 -5 16 15
   4 3 22  7 25  5 12  .
;
   proc glm;
      class drug disease;
      model y=drug disease drug*disease / ss1 ss2 ss3 ss4;
   run;
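To see where the imbalance comes from, it can help to tabulate the number of nonmissing responses in each drug-by-disease cell. The following is a minimal sketch added for illustration, not part of the original example; it assumes the data set a created by the DATA step above.

```sas
/* Sketch: count nonmissing responses per drug-by-disease cell. */
/* Assumes the data set a created by the DATA step above.       */
proc freq data=a;
   where y ne .;                                 /* keep only observed responses */
   tables drug*disease / norow nocol nopercent;  /* show cell counts only        */
run;
```

Unequal cell counts are what make the design unbalanced, and the counts should add up to the number of usable observations that PROC GLM reports.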
Output 30.3.1: Unbalanced ANOVA for Two-Way Design with Interaction

Unbalanced Two-Way Analysis of Variance

Class Level Information
Class      Levels   Values
drug            4   1 2 3 4
disease         3   1 2 3

Number of observations   72

NOTE: Due to missing values, only 58 observations can be used in this analysis.


Unbalanced Two-Way Analysis of Variance
The GLM Procedure
Dependent Variable: y

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              11      4259.338506    387.212591      3.51   0.0013
Error              46      5080.816667    110.452536
Corrected Total    57      9340.155172

R-Square   Coeff Var   Root MSE     y Mean
0.456024    55.66750   10.50964   18.87931

Source          DF     Type I SS   Mean Square   F Value   Pr > F
drug             3   3133.238506   1044.412835      9.46   <.0001
disease          2    418.833741    209.416870      1.90   0.1617
drug*disease     6    707.266259    117.877710      1.07   0.3958

Source          DF    Type II SS   Mean Square   F Value   Pr > F
drug             3   3063.432863   1021.144288      9.25   <.0001
disease          2    418.833741    209.416870      1.90   0.1617
drug*disease     6    707.266259    117.877710      1.07   0.3958

Source          DF   Type III SS   Mean Square   F Value   Pr > F
drug             3   2997.471860    999.157287      9.05   <.0001
disease          2    415.873046    207.936523      1.88   0.1637
drug*disease     6    707.266259    117.877710      1.07   0.3958

Source          DF    Type IV SS   Mean Square   F Value   Pr > F
drug             3   2997.471860    999.157287      9.05   <.0001
disease          2    415.873046    207.936523      1.88   0.1637
drug*disease     6    707.266259    117.877710      1.07   0.3958

Note the differences among the four types of sums of squares.
The Type I sum of squares for drug essentially tests for
differences between the expected values of the arithmetic mean
response for different drugs, unadjusted for the effect of disease.
By contrast, the Type II sum of squares for drug measures the
differences between arithmetic means for each drug after adjusting for
disease. The Type III sum of squares measures the differences
between predicted drug means over a balanced drug-by-disease
population; that is, between the LS-means for drug.
Finally, the Type IV sum of squares is the same as the Type III sum
of squares in this case, since there are data for every drug-by-disease
combination.
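Because Type I sums of squares are sequential, they depend on the order in which effects appear in the MODEL statement. As an illustrative sketch (an addition to the original example, assuming the data set a from above), listing disease first leaves the Type I test for disease unadjusted and adjusts the Type I test for drug for disease:

```sas
/* Sketch: same model with the two main effects in the opposite order. */
proc glm data=a;
   class drug disease;
   /* Type I SS are now: disease alone, then drug given disease, */
   /* then the interaction given both main effects.              */
   model y=disease drug drug*disease / ss1 ss2;
run;
quit;
```

In this ordering the Type I sum of squares for drug should agree with the Type II value shown in Output 30.3.1 (3063.432863), since both adjust drug for disease only. The Type II, III, and IV rows do not depend on the order of the effects.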
No matter which sum of squares you prefer to use, this analysis shows
a significant difference among the four drugs, while the disease effect
and the drug-by-disease interaction are not significant.
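Each F value in Output 30.3.1 is the effect's mean square divided by the error mean square, referred to an F distribution with the effect and error degrees of freedom. The following DATA step is a small sketch of that arithmetic for the Type III test of drug; the variable names are chosen here for illustration, and the numbers are copied from the output.

```sas
/* Sketch: rebuild the Type III F test for drug from Output 30.3.1. */
data _null_;
   ms_drug = 999.157287;          /* Type III mean square for drug (3 DF) */
   mse     = 110.452536;          /* error mean square (46 DF)            */
   f = ms_drug / mse;             /* F statistic                          */
   p = 1 - probf(f, 3, 46);       /* upper-tail probability of F(3, 46)   */
   put f= 8.2 p= pvalue6.4;       /* write both values to the SAS log     */
run;
```

The values written to the log should match the 9.05 and <.0001 reported for drug in the Type III table.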
As the previous discussion indicates, Type III sums of squares correspond to
differences between LS-means, so you can follow up the Type III tests with
a multiple comparisons analysis of the drug LS-means.
Since the GLM procedure is interactive, you can accomplish this by
submitting the following statements after the previous ones that
performed the ANOVA.
lsmeans drug / pdiff=all adjust=tukey;
run;
Both the LS-means themselves and a matrix of adjusted p-values for
pairwise differences between them are displayed; see Output 30.3.2.
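For a model like this one, with both main effects, their interaction, and no empty cells, the LS-mean for a drug is the unweighted average of its three drug-by-disease cell means. The following sketch is an added illustration that assumes the data set a from above; the names cellmeans and cellmean are hypothetical. It computes that average directly so the result can be compared with the LSMEANS output.

```sas
/* Step 1: mean response in each drug-by-disease cell (missing y ignored). */
proc means data=a noprint nway;
   class drug disease;
   var y;
   output out=cellmeans mean=cellmean;
run;

/* Step 2: unweighted average of the cell means within each drug. */
proc means data=cellmeans mean maxdec=4;
   class drug;
   var cellmean;
run;
```

The unweighted average differs from the ordinary arithmetic mean for a drug, which weights each cell by its number of observations. That difference is why the Type I and Type III tests for drug do not agree in an unbalanced design.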
Output 30.3.2: LS-Means for Unbalanced ANOVA

Unbalanced Two-Way Analysis of Variance
The GLM Procedure
Least Squares Means
Adjustment for Multiple Comparisons: Tukey-Kramer

drug     y LSMEAN   LSMEAN Number
1      25.9944444               1
2      26.5555556               2
3       9.7444444               3
4      13.5444444               4

Least Squares Means for effect drug
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: y

i/j        1        2        3        4
1               0.9989   0.0016   0.0107
2     0.9989             0.0011   0.0071
3     0.0016   0.0011             0.7870
4     0.0107   0.0071   0.7870

The multiple comparisons analysis shows that drugs 1 and 2 have very
similar effects, and that drugs 3 and 4 are also not significantly different
from each other. Evidently, the main contribution to the
significant drug effect is the difference between the 1/2 pair and
the 3/4 pair.
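One way to examine the difference between the two pairs directly is a single-degree-of-freedom ESTIMATE statement. The following is a hedged sketch rather than part of the original example: it assumes the data set a from above, the label is arbitrary, and it relies on PROC GLM supplying suitable coefficients for the drug*disease interaction, which the E option lets you check.

```sas
/* Sketch: average of drugs 1 and 2 minus the average of drugs 3 and 4. */
proc glm data=a;
   class drug disease;
   model y=drug disease drug*disease / ss3;
   /* DIVISOR=2 turns the coefficients into 1/2 1/2 -1/2 -1/2. */
   /* E prints the full coefficient vector for inspection.     */
   estimate 'drugs 1,2 vs 3,4' drug 1 1 -1 -1 / divisor=2 e;
run;
quit;
```

If the printed coefficients average evenly over the three diseases, the estimate should equal the same combination of the LS-means in Output 30.3.2, (25.9944 + 26.5556)/2 - (9.7444 + 13.5444)/2, or about 14.63. Unlike the Tukey-Kramer p-values, the test of this single comparison is not adjusted for multiplicity.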
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.