Example 67.1: Comparing Group Means Using Input Data Set of Summary Statistics
The following example, taken from
Huntsberger and Billingsley (1989), compares two grazing
methods using 32 steers. Half of the steers are
allowed to graze continuously, while the other half are
subjected to controlled grazing time. The researchers
want to know whether these two grazing methods affect weight
gain differently. The data are read by the following
DATA step.
title 'Group Comparison Using Input Data Set of Summary Statistics';
data graze;
   length GrazeType $ 10;
   input GrazeType $ WtGain @@;
   datalines;
datalines;
controlled 45 controlled 62
controlled 96 controlled 128
controlled 120 controlled 99
controlled 28 controlled 50
controlled 109 controlled 115
controlled 39 controlled 96
controlled 87 controlled 100
controlled 76 controlled 80
continuous 94 continuous 12
continuous 26 continuous 89
continuous 88 continuous 96
continuous 85 continuous 130
continuous 75 continuous 54
continuous 112 continuous 69
continuous 104 continuous 95
continuous 53 continuous 21
;
run;
The variable GrazeType denotes the grazing method:
'controlled' is controlled grazing and 'continuous' is continuous grazing.
The dollar sign ($) following GrazeType in the INPUT statement makes it a
character variable, and the LENGTH statement gives it room for all 10
characters of each value (the default length of 8 would truncate them).
The trailing at signs
(@@) tell the DATA step that there is more than one observation per line.
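To see what the double trailing at sign changes, the sketch below reads a few of the same values with one observation per input line, so no @@ is needed. The data set name graze_demo is hypothetical and not part of the original example, and only four of the 32 observations are shown.

```sas
/* Illustration only: graze_demo is a hypothetical data set name. */
data graze_demo;
   length GrazeType $ 10;
   input GrazeType $ WtGain;   /* no @@: each input line holds one observation */
   datalines;
controlled 45
controlled 62
continuous 94
continuous 12
;
run;
```

Without @@, the DATA step moves to a new input line for each observation, which is why the original step needs it to read two observations from every line.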
The MEANS procedure is invoked to create a data set of summary statistics
with the following statements:
proc sort data=graze;
   by GrazeType;
run;
proc means data=graze noprint;
   var WtGain;
   by GrazeType;
   output out=newgraze;
run;
PROC SORT first orders the data by GrazeType, as BY-group processing
requires. The NOPRINT option suppresses all displayed output from the MEANS
procedure. The VAR statement tells PROC MEANS to compute summary
statistics for the WtGain variable, and the BY statement requests
a separate set of summary statistics for each level of GrazeType.
The OUT= option in the OUTPUT statement tells PROC MEANS to put the summary
statistics into a data set called newgraze so that it can be used
in subsequent procedures. Because no statistics are named in the OUTPUT
statement, the data set holds the default set (N, MIN, MAX, MEAN, and STD).
This new data set is displayed in
Output 67.1.1 by using PROC PRINT as follows:
proc print data=newgraze;
run;
The _STAT_ variable contains the names of the statistics,
and the GrazeType variable indicates which group the statistic is from.
Output 67.1.1: Output Data Set of Summary Statistics
Group Comparison Using Input Data Set of Summary Statistics

Obs  GrazeType   _TYPE_  _FREQ_  _STAT_   WtGain
  1  continuous       0      16  N        16.000
  2  continuous       0      16  MIN      12.000
  3  continuous       0      16  MAX     130.000
  4  continuous       0      16  MEAN     75.188
  5  continuous       0      16  STD      33.812
  6  controlled       0      16  N        16.000
  7  controlled       0      16  MIN      28.000
  8  controlled       0      16  MAX     128.000
  9  controlled       0      16  MEAN     83.125
 10  controlled       0      16  STD      30.535
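Because all statistics are stacked in the WtGain column and labeled by _STAT_, a single statistic can be pulled out with a WHERE statement. The following is a minimal sketch, not part of the original example, that lists only the group means and standard deviations from newgraze:

```sas
/* Illustration only: subset the summary data set by statistic name. */
proc print data=newgraze;
   where _STAT_ in ('MEAN', 'STD');   /* keep only these two statistics */
   var GrazeType _STAT_ WtGain;
run;
```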

The following statements invoke PROC TTEST on the newgraze
data set, which is specified by the DATA= option.
proc ttest data=newgraze;
   class GrazeType;
   var WtGain;
run;
The CLASS statement contains the variable that distinguishes
between the groups being compared, in this case GrazeType.
The summary statistics and confidence intervals are displayed first,
as shown in Output 67.1.2.
Output 67.1.2: Summary Statistics
Group Comparison Using Input Data Set of Summary Statistics

Statistics

                          Lower CL              Upper CL  Lower CL              Upper CL
Variable  Class        N      Mean      Mean      Mean   Std Dev   Std Dev   Std Dev   Std Err   Minimum   Maximum
WtGain    continuous  16    57.171    75.188    93.204         .    33.812         .    8.4529        12       130
WtGain    controlled  16    66.854    83.125    99.396         .    30.535         .    7.6337        28       128
WtGain    Diff (1-2)         -31.2    -7.938    15.323    25.743    32.215    43.061     11.39

In Output 67.1.2,
the Variable column states the variable used in computations
and the Class column specifies the group for which the statistics
are computed.
For each class, the sample size, mean, standard deviation, standard
error, and minimum and maximum values are displayed. The confidence
bounds for the mean are also displayed; however, since summary statistics
are used as input, the confidence bounds for the standard deviation of
the groups are not calculated and appear as missing values (.).
The last row (Diff) gives the statistics for the difference between the
continuous and controlled group means, based on the pooled standard deviation.
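For comparison, PROC TTEST can also be run directly on the raw observations. The sketch below is not part of the original example; it assumes that the graze data set created earlier is still available. With the individual observations as input, the confidence bounds for the group standard deviations would be expected to be filled in, while the means and t tests should agree with the summary-based results.

```sas
/* Illustration only: same analysis, but on the raw data instead of newgraze. */
proc ttest data=graze;
   class GrazeType;
   var WtGain;
run;
```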
Output 67.1.3: t Tests
Group Comparison Using Input Data Set of Summary Statistics

T-Tests

Variable  Method         Variances    DF  t Value  Pr > |t|
WtGain    Pooled         Equal        30    -0.70    0.4912
WtGain    Satterthwaite  Unequal    29.7    -0.70    0.4913

Equality of Variances

Variable  Method    Num DF  Den DF  F Value  Pr > F
WtGain    Folded F      15      15     1.23  0.6981
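The pooled t value and the folded F value in Output 67.1.3 can be reproduced by hand from the summary statistics in Output 67.1.2. The following worked computation is a sketch: the symbols are introduced here for illustration and do not appear in the original example. Group 1 is continuous and group 2 is controlled, with sample sizes n_1 = n_2 = 16, means \bar{x}_1 and \bar{x}_2, and standard deviations s_1 and s_2.

```latex
\begin{align*}
s_p^2 &= \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}
       = \frac{15(33.812)^2 + 15(30.535)^2}{30} \approx 1037.8,
       \qquad s_p \approx 32.215 \\[4pt]
t &= \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{1/n_1 + 1/n_2}}
   = \frac{75.188 - 83.125}{32.215\sqrt{1/16 + 1/16}}
   \approx \frac{-7.937}{11.39} \approx -0.70,
   \qquad \mathrm{DF} = n_1 + n_2 - 2 = 30 \\[4pt]
F' &= \frac{\max(s_1^2,\, s_2^2)}{\min(s_1^2,\, s_2^2)}
    = \frac{(33.812)^2}{(30.535)^2} \approx 1.23
\end{align*}
```

The pooled standard deviation (32.215) and the standard error of the difference (11.39) match the Diff row of Output 67.1.2.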

Output 67.1.3 shows the results of tests for
equal group means and equal variances.
A test statistic for the equality of the group means is reported
under both the equal-variance and unequal-variance assumptions. Before
deciding which test is appropriate, you should look at the test for
equality of variances; this test does not indicate a significant
difference in the two variances (F' = 1.23, p = 0.6981), so the
pooled t statistic should be used. Based on the pooled
statistic, the two grazing methods are not significantly
different (t = -0.70, p = 0.4912). Note that this test assumes
that the observations in both groups are normally distributed;
this assumption can be checked in PROC UNIVARIATE using the raw data.
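One way to carry out that check is sketched below. It is an illustration rather than part of the original example: the NORMAL option requests tests for normality, and the BY statement assumes that graze is still sorted by GrazeType from the earlier PROC SORT step.

```sas
/* Illustration only: normality tests for WtGain within each grazing group. */
proc univariate data=graze normal;
   by GrazeType;      /* requires graze to be sorted by GrazeType */
   var WtGain;
run;
```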
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.