The GLM Procedure

## Specification of Effects

Each term in a model, called an effect, is a variable or combination of variables. Effects are specified with a special notation using variable names and operators. There are two kinds of variables: classification (or class) variables and continuous variables. There are two primary operators: crossing and nesting. A third operator, the bar operator, is used to simplify effect specification.

In an analysis-of-variance model, independent variables must be variables that identify classification levels. In the SAS System, these are called class variables and are declared in the CLASS statement. (They can also be called categorical, qualitative, discrete, or nominal variables.) Class variables can be either numeric or character. The values of a class variable are called levels. For example, the class variable Sex has the levels "male" and "female."

In a model, an independent variable that is not declared in the CLASS statement is assumed to be continuous. Continuous variables, which must be numeric, are used for response variables and covariates. For example, the heights and weights of subjects are continuous variables.
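The distinction between class and continuous variables can be sketched in a short program. The data set name `subjects` and the data values below are invented for illustration only; the point is that `Sex` appears in the CLASS statement while `Height` does not.

```sas
/* Hypothetical data: Sex is a character class variable;  */
/* Height and Weight are numeric. All values are made up. */
data subjects;
   input Sex $ Height Weight;
   datalines;
female 62.5 112.0
female 65.0 128.5
female 66.5 131.0
male 69.0 150.0
male 71.5 165.5
male 68.0 144.0
;
run;

proc glm data=subjects;
   class Sex;                   /* Sex is treated as a classification variable  */
   model Weight = Sex Height;   /* Height is not in CLASS, so it is continuous  */
run;
quit;
```

Because `Height` is omitted from the CLASS statement, it enters the model as a covariate rather than as a set of levels.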

### Types of Effects

There are seven different types of effects used in the GLM procedure. In the following list, assume that A, B, C, D, and E are class variables and that X1, X2, and Y are continuous variables:

• Regressor effects are specified by writing continuous variables by themselves: X1    X2.
• Polynomial effects are specified by joining two or more continuous variables with asterisks: X1*X1    X1*X2.
• Main effects are specified by writing class variables by themselves: A    B    C.
• Crossed effects (interactions) are specified by joining class variables with asterisks: A*B    B*C    A*B*C.
• Nested effects are specified by following a main effect or crossed effect with a class variable or list of class variables enclosed in parentheses. The main effect or crossed effect is nested within the effects listed in parentheses:

B(A)    C(B*A)    D*E(C*B*A).

In this example, B(A) is read "B nested within A."

• Continuous-by-class effects are written by joining continuous variables and class variables with asterisks: X1*A.
• Continuous-nesting-class effects consist of continuous variables followed by a class variable interaction enclosed in parentheses: X1(A)    X1*X2(A*B).
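As an illustration of the nested-effect notation in the list above, the following sketch fits a model with a main effect and a nested effect. The data set `nested`, the number of levels, and the simulated response are assumptions made only so that the step can run.

```sas
/* Hypothetical data: 3 levels of B within each of 2 levels of A, */
/* 4 replicates per cell. The response Y is simulated.            */
data nested;
   call streaminit(2024);
   do A = 1 to 2;
      do B = 1 to 3;
         do rep = 1 to 4;
            Y = 10 + 2*A + rand('normal');
            output;
         end;
      end;
   end;
run;

proc glm data=nested;
   class A B;
   model Y = A B(A);   /* main effect A, and B nested within A */
run;
quit;
```

The level labels of `B` are reused across levels of `A` here, which is the usual reason for writing `B(A)` instead of a crossed effect.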

One example of the general form of an effect involving several variables is

X1*X2*A*B*C(D*E)

This example contains crossed continuous terms by crossed classification terms nested within more than one class variable. The continuous list comes first, followed by the crossed list, followed by the nesting list in parentheses. Note that asterisks can appear within the nested list but not immediately before the left parenthesis.

For details on how the design matrix and parameters are defined with respect to the effects specified in this section, see the section "Parameterization of PROC GLM Models". The MODEL statement and several other statements use these effects. Some examples of MODEL statements using various kinds of effects are shown in the following table; a, b, and c represent class variables, and y, y1, y2, x, and z represent continuous variables.

| Specification | Kind of Model |
|---|---|
| `model y=x;` | simple regression |
| `model y=x z;` | multiple regression |
| `model y=x x*x;` | polynomial regression |
| `model y1 y2=x z;` | multivariate regression |
| `model y=a;` | one-way ANOVA |
| `model y=a b c;` | main effects model |
| `model y=a b a*b;` | factorial model (with interaction) |
| `model y=a b(a) c(b a);` | nested model |
| `model y1 y2=a b;` | multivariate analysis of variance (MANOVA) |
| `model y=a x;` | analysis-of-covariance model |
| `model y=a x(a);` | separate-slopes model |
| `model y=a x x*a;` | homogeneity-of-slopes model |
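The last two rows of the table can be compared directly in a small sketch. The data set `slopes` and the simulated values are hypothetical. Each model is fit in its own PROC GLM step because a step accepts only one MODEL statement.

```sas
/* Hypothetical data: 3 levels of a, 10 observations per level. */
/* The slope of y on x is simulated to differ by level of a.    */
data slopes;
   call streaminit(1234);
   do a = 1 to 3;
      do i = 1 to 10;
         x = rand('uniform') * 10;
         y = 2 + a * x + rand('normal');
         output;
      end;
   end;
   drop i;
run;

/* Homogeneity-of-slopes model: x*a measures how slopes differ across levels of a */
proc glm data=slopes;
   class a;
   model y = a x x*a;
run;
quit;

/* Separate-slopes model: one slope for x within each level of a */
proc glm data=slopes;
   class a;
   model y = a x(a);
run;
quit;
```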

### The Bar Operator

You can shorten the specification of a large factorial model using the bar operator. For example, two ways of writing the model for a full three-way factorial model are

```
proc glm;
   class A B C;
   model Y=A B C A*B A*C B*C A*B*C;
run;
```

and

```
proc glm;
   class A B C;
   model Y=A|B|C;
run;
```

When the bar (|) is used, the right- and left-hand sides become effects, and the cross of them becomes an effect. Multiple bars are permitted. The expressions are expanded from left to right, using rules 2-4 given in Searle (1971, p. 390):

• Multiple bars are evaluated left to right. For instance, A|B|C is evaluated as follows.

  `A | B | C` → `{ A | B } | C` → `{ A  B  A*B } | C` → `A  B  A*B  C  A*C  B*C  A*B*C`
• Crossed and nested groups of variables are combined. For example, A(B) | C(D) generates A*C(B D), among other terms.
• Duplicate variables are removed. For example, A(C) | B(C) generates A*B(C C), among other terms, and the extra C is removed.
• Effects are discarded if a variable occurs on both the crossed and nested parts of an effect. For instance, A(B) | B(D E) generates A*B(B D E), but this effect is eliminated immediately.

You can also limit the number of variables involved in any effect that results from bar evaluation by specifying the maximum number, preceded by an @ sign, at the end of the bar effect. For example, the specification A | B | C@2 results in only those effects that contain two or fewer variables: in this case, A  B  A*B  C  A*C  and  B*C.
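A minimal sketch of the @ operator in a MODEL statement follows. The data set `factorial` and its simulated response are assumptions made for illustration; only the MODEL syntax comes from the text above.

```sas
/* Hypothetical 2x2x2 factorial with 3 replicates per cell; Y is simulated */
data factorial;
   call streaminit(42);
   do A = 1 to 2;
      do B = 1 to 2;
         do C = 1 to 2;
            do rep = 1 to 3;
               Y = A + B + C + rand('normal');
               output;
            end;
         end;
      end;
   end;
run;

proc glm data=factorial;
   class A B C;
   model Y = A|B|C@2;   /* same as: A B A*B C A*C B*C (no A*B*C term) */
run;
quit;
```

Writing the model this way drops the three-way interaction without listing each two-way term by hand.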

The following list gives more examples of using the bar (|) and at (@) operators.

• `A | C(B)` is equivalent to `A   C(B)   A*C(B)`
• `A(B) | C(B)` is equivalent to `A(B)   C(B)   A*C(B)`
• `A(B) | B(D E)` is equivalent to `A(B)   B(D E)`
• `A | B(A) | C` is equivalent to `A   B(A)   C   A*C   B*C(A)`
• `A | B(A) | C@2` is equivalent to `A   B(A)   C   A*C`
• `A | B | C | D@2` is equivalent to `A  B  A*B  C  A*C  B*C  D  A*D  B*D  C*D`
• `A*B(C*D)` is equivalent to `A*B(C D)`