 The LOGISTIC Procedure

# Getting Started

The LOGISTIC procedure is similar in use to the other regression procedures in the SAS System. To demonstrate the similarity, suppose the response variable y is binary or ordinal, and x1 and x2 are two explanatory variables of interest. To fit a logistic regression model, you can use a MODEL statement similar to that used in the REG procedure:

   proc logistic;
      model y=x1 x2;
   run;


The response variable y can be either character or numeric. PROC LOGISTIC enumerates the total number of response categories and orders the response levels according to the ORDER= option in the PROC LOGISTIC statement. The procedure also allows the input of binary response data that are grouped:

   proc logistic;
      model r/n=x1 x2;
   run;


Here, n represents the number of trials and r represents the number of events.

The following example illustrates the use of PROC LOGISTIC. The data, taken from Cox and Snell (1989, pp. 10-11), consist of the number, r, of ingots not ready for rolling, out of n tested, for a number of combinations of heating time and soaking time. The following invocation of PROC LOGISTIC fits the binary logit model to the grouped data:

   data ingots;
      input Heat Soak r n @@;
      datalines;
   7 1.0 0 10  14 1.0 0 31  27 1.0 1 56  51 1.0 3 13
   7 1.7 0 17  14 1.7 0 43  27 1.7 4 44  51 1.7 0  1
   7 2.2 0  7  14 2.2 2 33  27 2.2 0 21  51 2.2 0  1
   7 2.8 0 12  14 2.8 0 31  27 2.8 1 22  51 4.0 0  1
   7 4.0 0  9  14 4.0 0 19  27 4.0 1 16
   ;

   proc logistic data=ingots;
      model r/n=Heat Soak;
   run;


The results of this analysis are shown in the following tables.

The SAS System

The LOGISTIC Procedure

Model Information

| | |
|---|---|
| Data Set | WORK.INGOTS |
| Response Variable (Events) | r |
| Response Variable (Trials) | n |
| Number of Observations | 19 |
| Link Function | Logit |
| Optimization Technique | Fisher's scoring |

PROC LOGISTIC first lists background information about the fitting of the model. Included are the name of the input data set, the response variable(s) used, the number of observations used, and the link function used.

Response Profile

| Ordered Value | Binary Outcome | Total Frequency |
|---|---|---|
| 1 | Event | 12 |
| 2 | Nonevent | 375 |

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

The "Response Profile" table lists the response categories (which are Event and Nonevent when grouped data are input), their ordered values, and their total frequencies for the given data.

Model Fit Statistics

| Criterion | Intercept Only | Intercept and Covariates |
|---|---|---|
| AIC | 108.988 | 101.346 |
| SC | 112.947 | 113.221 |
| -2 Log L | 106.988 | 95.346 |

Testing Global Null Hypothesis: BETA=0

| Test | Chi-Square | DF | Pr > ChiSq |
|---|---|---|---|
| Likelihood Ratio | 11.6428 | 2 | 0.0030 |
| Score | 15.1091 | 2 | 0.0005 |
| Wald | 13.0315 | 2 | 0.0015 |

The "Model Fit Statistics" table contains the Akaike Information Criterion (AIC), the Schwarz Criterion (SC), and the negative of twice the log likelihood (-2 Log L) for the intercept-only model and the fitted model. AIC and SC can be used to compare different models fitted to the same data; models with smaller values are preferred. Results of the likelihood ratio test, the efficient score test, and the Wald test for the joint significance of the explanatory variables (Soak and Heat) are included in the "Testing Global Null Hypothesis: BETA=0" table.
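
As a check on the "Model Fit Statistics" values, the criteria for the fitted model can be reproduced from -2 Log L. This is a sketch that assumes the usual definitions AIC = -2 Log L + 2k and SC = -2 Log L + k log(N), where k is the number of estimated parameters (here 3: the intercept, Heat, and Soak) and N is taken to be the total frequency, 12 + 375 = 387. The symbols k and N are introduced here for illustration only.

$$
\begin{aligned}
\text{AIC} &= 95.346 + 2(3) = 101.346 \\
\text{SC} &= 95.346 + 3\log(387) \approx 95.346 + 17.875 = 113.221 \\
\text{Likelihood ratio chi-square} &= 106.988 - 95.346 = 11.642
\end{aligned}
$$

The last value agrees with the displayed 11.6428 up to rounding of the -2 Log L entries. Note that AIC decreases when the covariates are added while SC increases slightly, because SC applies the heavier penalty per parameter.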

Analysis of Maximum Likelihood Estimates

| Parameter | DF | Estimate | Standard Error | Chi-Square | Pr > ChiSq |
|---|---|---|---|---|---|
| Intercept | 1 | -5.5592 | 1.1197 | 24.6503 | <.0001 |
| Heat | 1 | 0.0820 | 0.0237 | 11.9454 | 0.0005 |
| Soak | 1 | 0.0568 | 0.3312 | 0.0294 | 0.8639 |

Odds Ratio Estimates

| Effect | Point Estimate | 95% Wald Lower Confidence Limit | 95% Wald Upper Confidence Limit |
|---|---|---|---|
| Heat | 1.085 | 1.036 | 1.137 |
| Soak | 1.058 | 0.553 | 2.026 |

The "Analysis of Maximum Likelihood Estimates" table lists the parameter estimates, their standard errors, and the results of the Wald test for individual parameters. The odds ratio for each slope parameter, estimated by exponentiating the corresponding parameter estimate, is shown in the "Odds Ratio Estimates" table, along with 95% Wald confidence intervals.
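
The odds ratio entries can be reproduced by hand from the estimates and standard errors. The following worked calculation is a sketch that assumes the Wald limits are formed as the estimate plus or minus 1.96 standard errors on the logit scale, then exponentiated.

$$
\begin{aligned}
\text{Heat:}\quad & e^{0.0820} \approx 1.085, \qquad e^{0.0820 \pm 1.96 \times 0.0237} \approx (1.036,\ 1.137) \\
\text{Soak:}\quad & e^{0.0568} \approx 1.058, \qquad e^{0.0568 \pm 1.96 \times 0.3312} \approx (0.553,\ 2.026)
\end{aligned}
$$

Under this reading, each one-unit increase in Heat multiplies the estimated odds of an ingot not being ready by about 1.085. The interval for Soak contains 1, which is consistent with its nonsignificant Wald test (p = 0.8639).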

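Using the parameter estimates, you can calculate the estimated logit of p as

$$\operatorname{logit}(\hat{p}) = -5.5592 + 0.082 \times \text{Heat} + 0.0568 \times \text{Soak}$$

If Heat=7 and Soak=1, then $\operatorname{logit}(\hat{p}) = -5.5592 + 0.082 \times 7 + 0.0568 \times 1 = -4.9284$. Using this logit estimate, you can calculate $\hat{p}$ as follows:

$$\hat{p} = \frac{1}{1 + e^{4.9284}} = 0.0072$$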

This gives the predicted probability of the event (ingot not ready for rolling) for Heat=7 and Soak=1. Note that PROC LOGISTIC can calculate these statistics for you; use the OUTPUT statement with the P= option.
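
As a sketch of the OUTPUT statement with the P= option, the following statements refit the grouped-data model and save the predicted probabilities. The data set name pred and the variable name phat are illustrative choices, not names used elsewhere in this chapter.

```sas
proc logistic data=ingots;
   model r/n=Heat Soak;
   output out=pred p=phat;   /* pred and phat are hypothetical names */
run;

proc print data=pred;
   where Heat=7 and Soak=1;  /* the combination worked out by hand */
   var Heat Soak r n phat;
run;
```

The value of phat printed for this row should agree with the hand calculation, up to rounding of the parameter estimates.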

Association of Predicted Probabilities and Observed Responses

| | | | |
|---|---|---|---|
| Percent Concordant | 64.4 | Somers' D | 0.460 |
| Percent Discordant | 18.4 | Gamma | 0.555 |
| Percent Tied | 17.2 | Tau-a | 0.028 |
| Pairs | 4500 | c | 0.730 |

Finally, the "Association of Predicted Probabilities and Observed Responses" table contains four measures of association for assessing the predictive ability of a model. They are based on the number of pairs of observations with different response values, the number of concordant pairs, and the number of discordant pairs, which are also displayed. Formulas for these statistics are given in the "Rank Correlation of Observed Responses and Predicted Probabilities" section.
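
The four measures can be approximately recovered from the displayed percentages. This sketch assumes the standard rank-correlation definitions, written here with illustrative symbols: $n_c$ and $n_d$ are the numbers of concordant and discordant pairs, $t$ is the number of pairs with different responses (12 × 375 = 4500), and $N$ = 387 is the total frequency. The referenced section remains the authoritative source for the formulas.

$$
\begin{aligned}
\text{Somers' } D &= \frac{n_c - n_d}{t} = 0.644 - 0.184 = 0.460 \\
\text{Gamma} &= \frac{n_c - n_d}{n_c + n_d} = \frac{0.460}{0.828} \approx 0.556 \\
\text{Tau-}a &= \frac{n_c - n_d}{0.5\,N(N-1)} = \frac{0.460 \times 4500}{0.5 \times 387 \times 386} \approx 0.028 \\
c &= \frac{n_c + 0.5\,(t - n_c - n_d)}{t} = 0.644 + 0.5 \times 0.172 = 0.730
\end{aligned}
$$

The small difference for Gamma (0.556 here, 0.555 in the table) comes from using the rounded percentages instead of the exact pair counts.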

To illustrate the use of an alternative form of input data, the following program creates the INGOTS data set with new variables NotReady and Freq instead of n and r. The variable NotReady represents the response of individual units; it has a value of 1 for units not ready for rolling (event) and a value of 0 for units ready for rolling (nonevent). The variable Freq represents the frequency of occurrence of each combination of Heat, Soak, and NotReady. Note that, compared to the previous data set, NotReady=1 implies Freq=r, and NotReady=0 implies Freq= n-r.

   data ingots;
      input Heat Soak NotReady Freq @@;
      datalines;
   7 1.0 0 10  14 1.0 0 31  14 4.0 0 19  27 2.2 0 21  51 1.0 1  3
   7 1.7 0 17  14 1.7 0 43  27 1.0 1  1  27 2.8 1  1  51 1.0 0 10
   7 2.2 0  7  14 2.2 1  2  27 1.0 0 55  27 2.8 0 21  51 1.7 0  1
   7 2.8 0 12  14 2.2 0 31  27 1.7 1  4  27 4.0 1  1  51 2.2 0  1
   7 4.0 0  9  14 2.8 0 31  27 1.7 0 40  27 4.0 0 15  51 4.0 0  1
   ;


The following SAS statements invoke PROC LOGISTIC to fit the same model using the alternative form of the input data set.

   proc logistic data=ingots descending;
      model NotReady=Heat Soak;
      freq Freq;
   run;


The results of this analysis are the same as those of the previous one. The displayed output for the two runs is identical except for the background information about the model fit and the "Response Profile" table.
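
One way to see why the two runs agree is to derive the frequency form from the grouped form instead of entering it by hand. The following DATA step is a minimal sketch. It assumes the grouped data (with variables r and n) were saved under the hypothetical name ingots_grouped, and it writes a hypothetical data set ingots_freq. Combinations with zero frequency are dropped, as they are in the hand-entered data.

```sas
data ingots_freq;                 /* hypothetical output data set */
   set ingots_grouped;            /* hypothetical copy of the grouped data */
   NotReady=1; Freq=r;            /* events: units not ready for rolling */
   if Freq>0 then output;
   NotReady=0; Freq=n-r;          /* nonevents: units ready for rolling */
   if Freq>0 then output;
   keep Heat Soak NotReady Freq;
run;
```

Applied to the 19 grouped observations, this step should produce the same 25 Heat, Soak, NotReady, and Freq combinations as the hand-entered data, although in a different order.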

PROC LOGISTIC models the probability of the response level that corresponds to the Ordered Value 1 as displayed in the "Response Profile" table. By default, Ordered Values are assigned to the sorted response values in ascending order.

The DESCENDING option reverses the default ordering of the response values so that NotReady=1 corresponds to the Ordered Value 1 and NotReady=0 corresponds to the Ordered Value 2, as shown in the following table:

Response Profile

| Ordered Value | NotReady | Total Frequency |
|---|---|---|
| 1 | 1 | 12 |
| 2 | 0 | 375 |

If the ORDER= option and the DESCENDING option are specified together, the response levels are ordered according to the ORDER= option and then reversed. You should always check the "Response Profile" table to ensure that the outcome of interest has been assigned Ordered Value 1. See the "Response Level Ordering" section for more detail.
