Chapter Contents
Chapter Contents
The GENMOD Procedure

Generalized Estimating Equations

Let Yij, j = 1, ... ,ni, i = 1, ... ,K represent the jth measurement on the ith subject. There are ni measurements on subject i and \sum_{i=1}^K n_i total measurements.

Correlated data are modeled using the same link function and linear predictor setup (systematic component) as the independence case. The random component is described by the same variance functions as in the independence case, but the covariance structure of the correlated measurements must also be modeled. Let the vector of measurements on the ith subject be Y = [Yi1, ... , Yi ni]' with corresponding vector of means \bdmu=[\mu_{i1},
 ... , \mu_{i n_i}]^' and let V be the covariance matrix of Y. Let the vector of independent, or explanatory, variables for the jth measurement on the ith subject be

Xij = [xij1, ... , xijp]'
The Generalized Estimating Equation of Liang and Zeger (1986) for estimating the p ×1 vector of regression parameters {\beta} is an extension of the independence estimating equation to correlated data and is given by
S({\beta}) = \sum_{i=1}^K \frac{\partial \bdmu^'}{\partial {\beta}}
\bdV^{-1} (\bdY-\bdmu({\beta})) = 0


g(\mu_{ij}) = \xdpbeta
where g is the link function, the p ×ni matrix of partial derivatives of the mean with respect to the regression parameters for the ith subject is given by
\frac{\partial \bdmu^'}{\partial {\beta}} =
 ...p}}{g^'(\mu_{i1})} &  ...  &
 \displaystyle\frac{x_{in_ip}}{g^'(\mu_{in_i})} \ ]

Working Correlation Matrix

Let R_i({\alpha}) be an ni ×ni "working" correlation matrix that is fully specified by the vector of parameters {\alpha}.The covariance matrix of Y is modeled as
\bdV = \phi A_i^{\frac{1}2}R({\alpha})
where Ai is an ni ×ni diagonal matrix with v(\mu_{ij}) as the jth diagonal element. If R_i({\alpha}) is the true correlation matrix of Y, then V is the true covariance matrix of Y.

The working correlation matrix is usually unknown and must be estimated. It is estimated in the iterative fitting process using the current value of the parameter vector {\beta} to compute appropriate functions of the Pearson residual

If you specify the working correlation as R0 = I, which is the identity matrix, the GEE reduces to the independence estimating equation.

Following are the structures of the working correlation supported by the GENMOD procedure and the estimators used to estimate the working correlations.

Working Correlation Structure   Estimator
 Corr(Yij,Yik) = rjk where rjk is the jkth element of a constant, user-specified correlation matrix R0.The working correlation is not estimated in this case.
 \{1 & j = k \ 0 & j \ne k
 . }The working correlation is not estimated in this case.
 \{1 & t = 0 \ \alpha_{t} & t=1,2, ... ,m \ 0 & t \gt m . }{\displaystyle
 \hat{\alpha}_{t} =
 \sum_{j\leq n_i-t}e_{ij}e_{i,j+t} }
  {\displaystyle K_t = \sum_{i=1}^K (n_i - t)}
 {Corr}(Y_{ij},Y_{ik})=\{1 & j = k \ \alpha & j \neq k \ . {\displaystyle
 \hat{\alpha} =
 \sum_{j\neq k}e_{ij}e_{ik} }
  {\displaystyle N^{*}=\sum_{i=1}^K n_i(n_i-1)}
 {Corr}(Y_{ij},Y_{ik})= \{
 1 & j = k \ \alpha_{jk} & j \neq k \ . \displaystyle
 \hat{\alpha}_{jk} = \frac{1}{(K-p)\phi}\sum_{i=1}^Ke_{ij}e_{ik}
Autoregressive AR(1)  
 \alpha^t for t = 0,1,2, ... ,ni-j{\displaystyle
 \hat{\alpha} = \frac{1}{(K_1-p)\phi}\sum_{i=1}^K
 \sum_{j\leq n_i-1}e_{ij}e_{i,j+1} }
  {\displaystyle K_1 = \sum_{i=1}^K (n_i - 1)}

Dispersion Parameter

The dispersion parameter \phi is estimated by
\hat{\phi} = \frac{1}{N-p} \sum_{i=1}^K
 \sum_{j=1}^{n_i} e_{ij}^2
where N = \sum_{i=1}^K n_i is the total number of measurements and p is the number of regression parameters.

The square root of \hat{\phi} is reported by PROC GENMOD as the scale parameter in the "Analysis of GEE Parameter Estimates Model-Based Standard Error Estimates" output table.

Fitting Algorithm

The following is an algorithm for fitting the specified model using GEEs. Note that this is not in general a likelihood-based method of estimation, so that inferences based on likelihoods are not possible for GEE methods.
  1. Compute an initial estimate of {\beta} with an ordinary generalized linear model assuming independence.
  2. Compute the working correlations R based on the standardized residuals, the current {\beta}, and the assumed structure of R.
  3. Compute an estimate of the covariance:
    \bdV = \phi A_i^{\frac{1}2}{\hat{R}}
 ({\alpha}) A_i^{\frac{1}2}
  4. Update {\beta}:
    {\beta}_{r+1} = {\beta}_{r} + [\sum_{i=1}^K
 \frac{\partial \bdmu}{\partial {\be...
 ...m_{i=1}^K \frac{\partial \bdmu}
 {\partial {\beta}}^'\bdV^{-1}
 (\bdY-\bdmu) ]
  5. Iterate steps 2-4 until convergence

Missing Data

Refer to Diggle, Liang, and Zeger (1994, Chapter 11) for a discussion of missing values in longitudinal data. Suppose that you intend to take measurements Yi1, ... , Yin for the ith unit. Missing values for which Yij are missing whenever Yik is missing for all j \ge k are called dropouts. Otherwise, missing values that occur intermixed with nonmissing values are intermittent missing values. The GENMOD procedure can estimate the working correlation from data containing both types of missing values using the all available pairs method, in which all nonmissing pairs of data are used in the moment estimators of the working correlation parameters defined previously.

For example, for the unstructured working correlation model,

\hat{\alpha}_{jk} = \frac{1}{(K^'-p) \phi} \sum e_{ij} e_{ik}
where the sum is over the units that have nonmissing measurements at times j and k, and K' is the number of units with nonmissing measurements at j and k. Estimates of the parameters for other working correlation types are computed in a similar manner, using available nonmissing pairs in the appropriate moment estimators.

The contribution of the ith unit to the parameter update equation is computed by omitting the elements of (\bdY - \bdmu), the columns of D_i^'=\frac{\partial {\mu}}{\partial {\beta}}^', and the rows and columns of V corresponding to missing measurements.

Parameter Estimate Covariances

The model-based estimator of {Cov}(\hat{{\beta}}) is given by
{{\Sigma}}_{m}(\hat{{\beta}}) = I_0^{-1}
I_0 = \sum_{i=1}^K \frac{\partial \bdmu}
 {\partial {\beta}}^'\bdV^{-1}
 \frac{\partial \bdmu}{\partial {\beta}}
This is the GEE equivalent of the inverse of the Fisher information matrix that is often used in generalized linear models as an estimator of the covariance estimate of the maximum likelihood estimator of {\beta}.It is a consistent estimator of the covariance matrix of \hat{{\beta}} if the mean model and the working correlation matrix are correctly specified.

The estimator

{{\Sigma}}_{e} = I_0^{-1} I_1 I_0^{-1}
is called the empirical, or robust, estimator of the covariance matrix of \hat{{\beta}} where
I_{1} = \sum_{i=1}^K \frac{\partial \bdmu}
 {\partial {\beta}}^'\bdV^{-1}
 {{\rm Cov}}(\bdY)
 \bdV^{-1} \frac{\partial \bdmu}{\partial {\beta}}
It has the property of being a consistent estimator of the covariance matrix of \hat{{\beta}}, even if the working correlation matrix is misspecified, that is, if {Cov}(\bdY) \neq \bdV.In computing M, {\beta} and \phi are replaced by estimates, and Cov(Y) is replaced by the estimate
(\bdY - \bdmu(\hat{{\beta}}))(\bdY -

Multinomial GEEs

Lipsitz, Kim, and Zhao (1994) and Miller, Davis, and Landis (1993) describe how to extend GEEs to multinomial data. Currently, only the independent working correlation is available for multinomial models in PROC GENMOD.

Alternating Logistic Regressions

If the responses are binary (that is, they take only two values), then there is an alternative method to account for the association among the measurements. The Alternating Logistic Regressions (ALR) algorithm of Carey, Zeger, and Diggle (1993) models the association between pairs of responses with log odds ratios, instead of with correlations, as ordinary GEEs do.

For binary data, the correlation between the jth and kth response is, by definition,

Corr(Y_{ij},Y_{ik}) = \frac{\Pr(Y_{ij}=1,Y_{ik}=1)-\mu_{ij}\mu_{ik}}
The joint probability in the numerator satisfies the following bounds, by elementary properties of probability, since \mu_{ij}=\Pr(Y_{ij}=1):
\max(0,\mu_{ij}+\mu_{ik}-1) \leq \Pr(Y_{ij}=1,Y_{ik}=1) \leq
 \min( \mu_{ij},\mu_{ik})
The correlation, therefore, is constrained to be within limits that depend in a complicated way on the means of the data.

The odds ratio, defined as

OR(Yij,Yik) = [(Pr(Yij = 1,Yik = 1)Pr(Yij = 0,Yik = 0))/(Pr(Yij = 1,Yik = 0)Pr(Yij = 0,Yik = 1))]
is not constrained by the means and is preferred, in some cases, to correlations for binary data.

The ALR algorithm seeks to model the logarithm of the odds ratio, {\gamma_{ijk} = \log(OR(Y_{ij},Y_{ik}))}, as

\gamma_{ijk} = z_{ijk}'{\alpha}
where {\alpha} is a q ×1 vector of regression parameters and zijk is a fixed, specified vector of coefficients.

The parameter \gamma_{ijk} can take any value in (-\infty, \infty) with \gamma_{ijk} = 0 corresponding to no association.

The log odds ratio, when modeled in this way with a regression model, can take different values in subgroups defined by zijk. For example, zijk can define subgroups within clusters, or it can define `block effects' between clusters.

You specify a GEE model for binary data using log odds ratios by specifying a model for the mean, as in ordinary GEEs, and a model for the log odds ratios. You can use any of the link functions appropriate for binary data in the model for the mean, such as logistic, probit, or complementary log-log. The ALR algorithm alternates between a GEE step to update the model for the mean and a logistic regression step to update the log odds ratio model. Upon convergence, the ALR algorithm provides estimates of the regression parameters for the mean, {\beta}, the regression parameters for the log odds ratios, {\alpha}, their standard errors, and their covariances.

Specifying Log Odds Ratio Models

Specifying a regression model for the log odds ratio requires you to specify rows of the z-matrix zijk for each cluster i and each unique within-cluster pair (j, k). The GENMOD procedure provides several methods of specifying zijk. These are controlled by the LOGOR=keyword and associated options in the REPEATED statement. The supported keywords and the resulting log odds ratio models are described as follows.

exchangeable log odds ratios. In this model, the log odds ratio is a constant for all clusters i and pairs (j,k). The parameter \alpha is the common log odds ratio.
zijk = 1      for all     i, j, k

fully parameterized clusters. Each cluster is parameterized in the same way, and there is a parameter for each unique pair within clusters. If a complete cluster is of size n, then there are [(n(n-1))/2] parameters in the vector {\alpha}. For example, if a full cluster is of size 4, then there are [4 ×3/2] = 6 parameters, and the z-matrix is of the form
Z = [ 1 & 0 & 0 & 0 & 0 & 0 \ 0 & 1 & 0 & 0 & 0 & 0 \ 0 & 0 & 1 & 0 & 0 & 0 \ 0 & 0 & 0 & 1 & 0 & 0 \ 0 & 0 & 0 & 0 & 1 & 0 \ 0 & 0 & 0 & 0 & 0 & 1 \ ]
The elements of {\alpha} correspond to log odds ratios for cluster pairs in the following order:

Pair Parameter
log odds ratios by cluster. The argument variable is a variable name that defines the `block effects' between clusters. The log odds ratios are constant within clusters, but they take a different value for each different value of the variable. For example, if Center is a variable in the input data set taking a different value for k treatment centers, then specifying LOGOR=LOGORVAR(Center) requests a model with different log odds ratios for each of the k centers, constant within center.

k-nested log odds ratios. You must also specify the SUBCLUST=variable option to define subclusters within clusters. Within each cluster, PROC GENMOD computes a log odds ratio parameter for pairs having the same value of variable for both members of the pair and one log odds ratio parameter for each unique combination of different values of variable.

1-nested log odds ratios. You must also specify the SUBCLUST=variable option to define subclusters within clusters. There are two log odds ratio parameters for this model. Pairs having the same value of variable correspond to one parameter; pairs having different values of variable correspond to the other parameter. For example, if clusters are hospitals and subclusters are wards within hospitals, then patients within the same ward have one log odds ratio parameter, and patients from different wards have the other parameter.

specifies the full z-matrix. You must also specify a SAS data set containing the z-matrix with the ZDATA=data-set-name option. Each observation in the data set corresponds to one row of the z-matrix. You must specify the ZDATA data set as if all clusters are complete, that is, as if all clusters are the same size and there are no missing observations. The ZDATA data set has K [nmax(nmax-1)/2] observations, where K is the number of clusters and nmax is the maximum cluster size. If the members of cluster i are ordered as 1,2, ... ,n, then the rows of the z-matrix must be specified for pairs in the order (1,2),(1,3), ... ,(1,n),(2,3), ... ,(2,n), ... ,(n-1,n). The variables specified in the REPEATED statement for the SUBJECT effect must also be present in the ZDATA= data set to identify clusters. You must specify variables in the data set that define the columns of the z-matrix by the ZROW=variable-list option. If there are q columns, (q variables in variable-list), then there are q log odds ratio parameters. You can optionally specify variables indicating the cluster pairs corresponding to each row of the z-matrix with the YPAIR=(variable1, variable2) option. If you specify this option, the data from the ZDATA data set is sorted within each cluster by variable1 and variable2. See Example 29.6 for an example of specifying a full z-matrix.

replicated z-matrix. You specify z-matrix data exactly as you do for the ZFULL case, except that you specify only one complete cluster. The z-matrix for the one cluster is replicated for each cluster. The number of observations in the ZDATA data set is [(nmax(nmax-1))/2], where nmax is the size of a complete cluster (a cluster with no missing observations). See Example 29.6 for an example of specifying a replicated z-matrix.

direct input of the replicated z-matrix. You specify the z-matrix for one cluster with the syntax LOGOR=ZREP ( (y1    y2) z1    z2 ... zq, ... ), where y1 and y2 are numbers representing a pair of observations and the values z1, z2, ... , zq make up the corresponding row of the z-matrix. The number of rows specified is [(nmax(nmax-1))/2], where nmax is the size of a complete cluster (a cluster with no missing observations). For example,
   LOGOR =  ZREP((1 2) 1 0,
                 (1 3) 1 0,
                 (1 4) 1 0,
                 (2 3) 1 1,
                 (2 4) 1 1,
                 (3 4) 1 1)
specifies the [4 ×3/2] = 6 rows of the z-matrix for a cluster of size 4 with q=2 log odds ratio parameters. The log odds ratio for pairs (1 2), (1 3), (1 4) is \alpha_1, and the log odds ratio for pairs (2 3), (2 4), (3 4) is \alpha_1 + \alpha_2.

Generalized Score Statistics

Boos (1992) and Rotnitzky and Jewell (1990) describe score tests applicable to testing {L{\beta}} = 0 in GEEs, where L is a user-specified r ×p contrast matrix or a contrast for a Type 3 test of hypothesis.

Let \tilde{{\beta}} be the regression parameters resulting from solving the GEE under the restricted model {L{\beta}} = 0, and let S(\tilde{{\beta}}) be the generalized estimating equation values at \tilde{{\beta}}.

The generalized score statistic is

T =
where {{\Sigma}}_{m} is the model-based covariance estimate and {{\Sigma}}_{e} is the empirical covariance estimate. The p-values for T are computed based on the chi-square distribution with r degrees of freedom.

Chapter Contents
Chapter Contents

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.