The LOGISTIC Procedure 
Antibodies produced in response to an infectious disease such as malaria remain in the body after the individual has recovered from the disease. A serological test detects the presence or absence of such antibodies. An individual with such antibodies is termed seropositive. In areas where the disease is endemic, the inhabitants are at fairly constant risk of infection. The probability of an individual never having been infected in $Y$ years is $\exp(-\mu Y)$, where $\mu$ is the mean number of infections per year (refer to the appendix of Draper et al. 1972). Rather than estimating the unknown $\mu$, it is of interest to epidemiologists to estimate the probability of a person living in the area being infected in one year. This infection rate $\gamma$ is given by

$$\gamma = 1 - e^{-\mu}$$
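As a small worked illustration of the relationship between the mean number of infections per year and the one-year infection rate, the block below plugs in a purely hypothetical value $\mu = 0.01$ (not taken from the survey). It assumes the symbols $\mu$ for the mean number of infections per year and $\gamma$ for the infection rate.

```latex
\begin{align*}
P(\text{never infected in } Y \text{ years}) &= e^{-\mu Y} \\
\gamma = P(\text{infected within one year})  &= 1 - e^{-\mu \cdot 1} = 1 - e^{-\mu} \\
\mu = 0.01 \;\Rightarrow\; \gamma &= 1 - e^{-0.01} \approx 0.00995
\end{align*}
```

When $\mu$ is small, $\gamma$ is close to $\mu$ itself, which is why the rate is conveniently reported as infections per thousand individuals.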
The following SAS statements create the data set sero, which contains the results of a serological survey of malarial infection. Individuals of nine age groups were tested. Variable A represents the midpoint of the age range for each age group. Variable N represents the number of individuals tested in each age group, and variable R represents the number of individuals that are seropositive.
data sero;
   input group A N R;
   X=log(A);
   label X='Log of Midpoint of Age Range';
   datalines;
1  1.5  123   8
2  4.0  132   6
3  7.5  182  18
4 12.5  140  14
5 17.5  138  20
6 25.0  161  39
7 35.0  133  19
8 47.0   92  25
9 60.0   74  44
;
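For the ith group with age midpoint $A_i$, the probability of being seropositive is $p_i = 1 - \exp(-\mu A_i)$. It follows that

$$\log(-\log(1 - p_i)) = \log(\mu) + \log(A_i)$$

so $\beta_0 = \log(\mu)$ can be estimated as the intercept of a binomial model with a complementary log-log link and X=log(A) as an offset term. The LINK=CLOGLOG option requests the complementary log-log link, and the CLPARM=PL option requests profile likelihood confidence limits for the parameter.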
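The intermediate algebra behind the complementary log-log form can be sketched as follows, assuming the constant-risk probability $p_i = 1 - \exp(-\mu A_i)$ for a group with age midpoint $A_i$.

```latex
\begin{align*}
1 - p_i               &= e^{-\mu A_i} \\
-\log(1 - p_i)        &= \mu A_i \\
\log(-\log(1 - p_i))  &= \log(\mu) + \log(A_i)
\end{align*}
```

The term $\log(A_i)$ enters with a fixed coefficient of 1, which is what makes it an offset rather than a regressor, and the only quantity left to estimate is $\log(\mu)$.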
proc logistic data=sero;
   model R/N= / offset=X
                link=cloglog
                clparm=pl
                scale=none;
   title 'Constant Risk of Infection';
run;

Output 39.10.1: Modeling Constant Risk of Infection
Results of fitting this constant risk model are shown in Output 39.10.1. The maximum likelihood estimate of $\beta_0 = \log(\mu)$ and its estimated standard error are $\hat{\beta}_0 = -4.6605$ and $\hat{\sigma}_{\hat{\beta}_0} = 0.0725$, respectively. The infection rate is estimated as

$$\hat{\gamma} = 1 - e^{-\hat{\mu}} = 1 - e^{-e^{\hat{\beta}_0}} = 1 - e^{-e^{-4.6605}} = 0.00942$$

The 95% confidence interval for $\gamma$, obtained by back-transforming the 95% confidence interval for $\beta_0$, is (0.0082, 0.011). That is, an interval constructed in this way would contain the true infection rate in 95% of repeated samples; here it runs from 8 to 11 infections per thousand individuals.
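The back-transformation described above can be sketched in a short DATA step. This is an illustration under stated assumptions, not the procedure's own computation: the values of `beta0` and `se` are taken as given (an estimate of -4.6605 with standard error 0.0725), and the limits use a simple Wald interval, whereas the CLPARM=PL option in the example requests profile likelihood limits, so the two sets of limits may differ slightly. The variable names are hypothetical.

```sas
/* Sketch: back-transform an estimate of log(mu) to a one-year infection rate. */
/* beta0 and se are assumed inputs; the limits are Wald, not profile likelihood. */
data _null_;
   beta0 = -4.6605;                        /* assumed estimate of log(mu)        */
   se    = 0.0725;                         /* assumed standard error             */
   z     = probit(0.975);                  /* standard normal 97.5th percentile  */
   gamma = 1 - exp(-exp(beta0));           /* point estimate of infection rate   */
   lower = 1 - exp(-exp(beta0 - z*se));    /* back-transformed lower Wald limit  */
   upper = 1 - exp(-exp(beta0 + z*se));    /* back-transformed upper Wald limit  */
   put gamma= 8.5 lower= 8.5 upper= 8.5;   /* write the three values to the log  */
run;
```

Because the transformation $\beta \mapsto 1 - \exp(-\exp(\beta))$ is increasing, the limits for the intercept map directly to limits for the infection rate in the same order.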
The goodness-of-fit statistics for the constant risk model are statistically significant (p < 0.0001), indicating that the assumption of constant risk of infection is not correct. You can fit a more extensive model by allowing a separate risk of infection for each age group. Suppose $\mu_i$ is the mean number of infections per year for the ith age group. The probability of being seropositive for the ith group with age midpoint $A_i$ is $p_i = 1 - \exp(-\mu_i A_i)$, so that

$$\log(-\log(1 - p_i)) = \log(\mu_i) + \log(A_i)$$
In the following SAS statements, nine dummy variables (agegrp1-agegrp9) are created as the design variables for the age groups. PROC LOGISTIC is invoked to fit a complementary log-log model that contains agegrp1-agegrp9 as the only explanatory variables, with no intercept term and with X=log(A) as an offset term. Note that $\log(\mu_i)$ is the regression parameter associated with agegrpi.
data two;
   array agegrp[9] agegrp1-agegrp9 (0 0 0 0 0 0 0 0 0);
   set sero;
   agegrp[group]=1;    /* flag the current age group          */
   output;
   agegrp[group]=0;    /* reset before the next observation   */
run;

proc logistic data=two;
   model R/N=agegrp1-agegrp9 / offset=X
                               noint
                               link=cloglog
                               clparm=pl;
   title 'Infectious Rates and 95% Confidence Intervals';
run;

Output 39.10.2: Modeling Separate Risk of Infection

Table 39.3: Infection Rate in One Year

           Number Infected per 1000 People
 Age       Point        95% Confidence Limits
 Group     Estimate     Lower       Upper
 1         44           20          80
 2         12            5          23
 3         14            8          21
 4          8            5          14
 5          9            6          13
 6         11            8          15
 7          4            3           7
 8          7            4          10
 9         15           11          20

Results of fitting the model for separate risk of infection are shown in Output 39.10.2. For the first age group, the point estimate of $\log(\mu_1)$ is -3.1048. This translates into an infection rate of 1 - exp(-exp(-3.1048)) = 0.0438. A 95% confidence interval for the infection rate is obtained by transforming the 95% confidence interval for $\log(\mu_1)$. For the first age group, the lower and upper confidence limits are 1 - exp(-exp(-3.8880)) = 0.0203 and 1 - exp(-exp(-2.4833)) = 0.0801, respectively. Table 39.3 shows the estimated infection rate in one year's time for each age group. Note that the infection rate for the first age group is high compared to the other age groups.
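The point estimates in the separate-risk model can be cross-checked without refitting. Because that model has one parameter per age group, its fitted probabilities should equal the observed proportions R/N (this relies on every group having 0 < R < N, which holds in the sero data), so each group's mean number of infections per year is -log(1 - R/N)/A. The following DATA step is a minimal sketch of that check; the data set name `check` and the variables `p`, `mu`, and `per1000` are hypothetical names introduced for illustration.

```sas
/* Sketch: closed-form check of the separate-risk point estimates.       */
/* Assumes the sero data set created earlier; new names are hypothetical. */
data check;
   set sero;
   p       = R / N;                    /* observed seropositive proportion   */
   mu      = -log(1 - p) / A;          /* implied mean infections per year   */
   per1000 = 1000 * (1 - exp(-mu));    /* one-year infection rate per 1000   */
run;

proc print data=check noobs;
   var group A N R per1000;
   format per1000 5.1;
run;
```

For the first age group this gives about 44 infections per thousand, in line with the point estimate quoted above. The check reproduces point estimates only; the confidence limits still come from the profile likelihood limits computed by PROC LOGISTIC.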
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.