 The CATMOD Procedure

## Example 22.11: Predicted Probabilities

Suppose you have collected marketing research data to examine the relationship between a prospect's likelihood of buying your product and their education and income. Specifically, the variables are as follows.

| Variable | Levels | Interpretation |
|---|---|---|
| Education | high, low | prospect's education level |
| Income | high, low | prospect's income level |
| Purchase | yes, no | Did prospect purchase product? |

The following statements create a data set, loan, that contains the marketing research data and then use the CATMOD procedure to fit a model, obtain the parameter estimates, and obtain the predicted probabilities of interest. These statements produce Output 22.11.1 through Output 22.11.5.

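```sas
/* Cell counts (wt) for each Education x Income x Purchase combination */
data loan;
   input Education $ Income $ Purchase $ wt;
   datalines;
high  high  yes    54
high  high  no     23
high  low   yes    41
high  low   no     12
low   high  yes    35
low   high  no     42
low   low   yes    19
low   low   no      8
;

/* Capture the predicted-values table as a data set through ODS */
ods output PredictedValues=Predicted
           (keep=Education Income PredFunction);

/* ORDER=DATA keeps "yes" as the first response level */
proc catmod data=loan order=data;
   weight wt;
   response marginals;
   model Purchase=Education Income / pred;
run;

/* Rank the populations by predicted probability of purchase */
proc sort data=Predicted;
   by descending PredFunction;
run;

proc print data=Predicted;
run;
```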

Notice that the preceding statements use the Output Delivery System (ODS) to output the predicted values instead of the OUT= option in the RESPONSE statement, though either can be used.
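For comparison, the OUT= route might look like the following sketch. It assumes the data set loan created earlier; the data set name PredOut is hypothetical, and the code assumes that the OUT= data set stores the predicted response function in a variable named _PRED_, which you should confirm against the PROC CATMOD documentation for your release.

```sas
/* Sketch only: same model, but the predicted values are written by OUT= */
proc catmod data=loan order=data;
   weight wt;
   response marginals / out=PredOut;   /* PredOut is a hypothetical name */
   model Purchase=Education Income / pred;
run;
quit;

/* Assumption: _PRED_ holds the predicted response function */
proc sort data=PredOut;
   by descending _PRED_;
run;

proc print data=PredOut;
run;
```

The ODS approach has the advantage that the KEEP= data set option trims the output to the columns of interest in a single step.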

Output 22.11.1: Marketing Research Data: Obtaining Predicted Probabilities

 The CATMOD Procedure

| | | | |
|---|---|---|---|
| Response | Purchase | Response Levels | 2 |
| Weight Variable | wt | Populations | 4 |
| Data Set | LOAN | Total Frequency | 234 |
| Frequency Missing | 0 | Observations | 8 |

Output 22.11.2: Profiles and Design Matrix

 The CATMOD Procedure

Population Profiles

| Sample | Education | Income | Sample Size |
|---|---|---|---|
| 1 | high | high | 77 |
| 2 | high | low | 53 |
| 3 | low | high | 77 |
| 4 | low | low | 27 |

Response Profiles

| Response | Purchase |
|---|---|
| 1 | yes |
| 2 | no |

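| Sample | Response Function | Design Matrix 1 | Design Matrix 2 | Design Matrix 3 |
|---|---|---|---|---|
| 1 | 0.70130 | 1 | 1 | 1 |
| 2 | 0.77358 | 1 | 1 | -1 |
| 3 | 0.45455 | 1 | -1 | 1 |
| 4 | 0.70370 | 1 | -1 | -1 |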
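The sample sizes and response functions in Output 22.11.2 can be checked directly against the raw counts: each response function is the observed proportion of "yes" responses in its population (for example, 54/77 = 0.70130 for sample 1). The following PROC SQL step is a minimal sketch of that check, not part of the original example; it assumes the data set loan created earlier, and the column names SampleSize and ObservedFunction are illustrative.

```sas
/* Sketch: recompute sample sizes and observed proportions of "yes" */
proc sql;
   select Education,
          Income,
          sum(wt) as SampleSize,
          /* (Purchase='yes') evaluates to 1 or 0, so this sums the "yes" counts */
          sum(wt*(Purchase='yes')) / sum(wt) as ObservedFunction format=8.5
   from loan
   group by Education, Income;
quit;
```

The four proportions should agree with the Response Function column above. The three design matrix columns correspond to the intercept, Education, and Income, with the first level of each factor (high) coded as 1 and the second level (low) coded as -1.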

Output 22.11.3: ANOVA Table and Parameter Estimates

 The CATMOD Procedure

Analysis of Variance

| Source | DF | Chi-Square | Pr > ChiSq |
|---|---|---|---|
| Intercept | 1 | 418.36 | <.0001 |
| Education | 1 | 8.85 | 0.0029 |
| Income | 1 | 4.70 | 0.0302 |
| Residual | 1 | 1.84 | 0.1745 |

Analysis of Weighted Least Squares Estimates

| Effect | Parameter | Estimate | Standard Error | Chi-Square | Pr > ChiSq |
|---|---|---|---|---|---|
| Intercept | 1 | 0.6481 | 0.0317 | 418.36 | <.0001 |
| Education | 2 | 0.0924 | 0.0311 | 8.85 | 0.0029 |
| Income | 3 | -0.0675 | 0.0312 | 4.70 | 0.0302 |

Output 22.11.4: Predicted Values and Residuals

 The CATMOD Procedure

Predicted Values for Response Functions

| Sample | Education | Income | Function Number | Observed Function | Observed Standard Error | Predicted Function | Predicted Standard Error | Residual |
|---|---|---|---|---|---|---|---|---|
| 1 | high | high | 1 | 0.7012987 | 0.052158 | 0.67293982 | 0.047794 | 0.02835888 |
| 2 | high | low | 1 | 0.77358491 | 0.057487 | 0.80803395 | 0.051586 | -0.034449 |
| 3 | low | high | 1 | 0.45454545 | 0.056744 | 0.48811031 | 0.051077 | -0.0335649 |
| 4 | low | low | 1 | 0.7037037 | 0.087877 | 0.62320444 | 0.064867 | 0.08049927 |

Output 22.11.5: Predicted Probabilities Data Set

| Obs | Education | Income | PredFunction |
|---|---|---|---|
| 1 | high | low | 0.80803395 |
| 2 | high | high | 0.67293982 |
| 3 | low | low | 0.62320444 |
| 4 | low | high | 0.48811031 |

You can use the predicted values (values of PredFunction in Output 22.11.5) as scores representing the likelihood that a randomly chosen subject from one of these populations will purchase the product. The Response Profiles in Output 22.11.2 show that the first level of Purchase is "yes" (the ORDER=DATA option keeps the levels in the order in which they appear in the data), so the predicted probabilities are for Pr(Purchase='yes'). For example, someone with high education and low income has an estimated probability of purchase of 0.808. As with any response function estimate given by PROC CATMOD, this estimate can be obtained by multiplying the row of the design matrix that corresponds to the sample (sample number 2 in this case) by the vector of parameter estimates: (1 × 0.6481) + (1 × 0.0924) + (-1 × (-0.0675)) = 0.8080.
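The same arithmetic can be repeated for all four samples. The following DATA step is a minimal sketch rather than part of the original example: the parameter estimates are typed in from Output 22.11.3 and the design matrix rows from Output 22.11.2, and the names check, b0, b_educ, b_inc, x0 through x2, and Pred are illustrative.

```sas
/* Sketch: predicted response function = design matrix row times estimates */
data check;
   /* Parameter estimates copied by hand from Output 22.11.3 */
   b0     =  0.6481;   /* Intercept */
   b_educ =  0.0924;   /* Education */
   b_inc  = -0.0675;   /* Income    */

   /* Design matrix rows copied by hand from Output 22.11.2 */
   input Sample x0 x1 x2;
   Pred = x0*b0 + x1*b_educ + x2*b_inc;
   format Pred 8.4;
   datalines;
1  1  1  1
2  1  1 -1
3  1 -1  1
4  1 -1 -1
;

proc print data=check noobs;
   var Sample Pred;
run;
```

Because the estimates are rounded to four decimal places, the results (approximately 0.6730, 0.8080, 0.4882, and 0.6232) agree with the PredFunction values only to about three decimal places.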

This ranking of scores can help in decision making (for example, with respect to allocation of advertising dollars, choice of advertising media, choice of print media, and so on).
