The PLS Procedure

The techniques implemented by the PLS procedure are

- principal components regression, which extracts factors to explain as much predictor sample variation as possible.
- reduced rank regression, which extracts factors to explain as much response variation as possible. This technique, also known as (maximum) redundancy analysis, differs from multivariate linear regression only when there are multiple responses.
- partial least squares regression, which balances the two objectives of explaining response variation and explaining predictor variation. Two different formulations for partial least squares are available: the original method of Wold (1966) and the SIMPLS method of de Jong (1993).
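The difference between the first and third objectives can be seen on a small example. The following Python sketch is only an illustration and is not how the PLS procedure computes its factors. The helper names (`first_pcr_direction`, `first_pls_direction`) and the toy data are hypothetical. The data are built so that the first predictor has large variance but is unrelated to the response, while the second predictor has small variance and determines the response exactly.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = dot(v, v) ** 0.5
    return [x / norm for x in v]

def first_pcr_direction(X, n_iter=200):
    """Leading eigenvector of X'X by power iteration (X assumed centered).

    This is the direction that explains the most predictor variation.
    """
    cols = list(zip(*X))
    w = normalize([1.0] * len(cols))
    for _ in range(n_iter):
        t = [dot(row, w) for row in X]                 # scores X w
        w = normalize([dot(col, t) for col in cols])   # X'(X w), rescaled
    return w

def first_pls_direction(X, y):
    """Direction of maximal covariance with y, proportional to X'y
    (X and y assumed centered)."""
    return normalize([dot(col, y) for col in zip(*X)])

# Hypothetical toy data: two orthogonal, mean-zero patterns.
a = [1, -1, 1, -1, 1, -1, 1, -1]
b = [1, 1, -1, -1, 1, 1, -1, -1]
X = [[5.0 * ai, float(bi)] for ai, bi in zip(a, b)]  # x1 = 5a, x2 = b
y = [float(bi) for bi in b]                           # y depends only on x2

w_pcr = first_pcr_direction(X)   # weights the high-variance predictor x1
w_pls = first_pls_direction(X, y)  # weights the predictor related to y, x2
```

In this constructed case the first principal component factor loads on the first predictor, which carries no information about the response, whereas the first partial least squares factor loads on the second predictor. With real data the two directions usually differ less sharply.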

The number of factors to extract depends on the data. Basing the
model on more extracted factors improves the model fit to the observed
data, but extracting too many factors can cause *over-fitting*,
that is, tailoring the model too much to the current data, to the
detriment of future predictions. The PLS procedure enables you to
choose the number of extracted factors by *cross validation*, that
is, fitting the model to part of the data and minimizing the
prediction error for the unfitted part. Various methods of
cross validation are available, including one-at-a-time validation,
splitting the data into blocks, and test set validation.
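As a rough illustration of one-at-a-time cross validation, the following Python sketch fits a single-response partial least squares model by NIPALS-style deflation and accumulates the squared prediction error for each held-out observation. It is a minimal sketch under stated assumptions, not the procedure's implementation: the function names, the simulated data, and the rule of picking the factor count with the smallest error sum are all assumptions made for the example.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pls1_fit(X, y, n_factors):
    """Single-response PLS by NIPALS-style deflation (illustrative only)."""
    n = len(X)
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]  # predictor residuals
    f = [v - y_mean for v in y]                              # response residuals
    factors = []
    for _ in range(n_factors):
        # Weight vector: predictor direction with maximal covariance with f.
        w = [dot(col, f) for col in zip(*E)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                 # factor scores
        tt = dot(t, t)
        if tt < 1e-12:
            break
        p = [dot(col, t) / tt for col in zip(*E)]      # predictor loadings
        q = dot(f, t) / tt                             # response loading
        # Deflate: remove what this factor explains from X and y.
        E = [[e - ti * pj for e, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        factors.append((w, p, q))
    return x_mean, y_mean, factors

def pls1_predict(model, x):
    x_mean, y_mean, factors = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in factors:
        t = dot(e, w)
        y_hat += q * t
        e = [ei - t * pj for ei, pj in zip(e, p)]
    return y_hat

def press_one_at_a_time(X, y, n_factors):
    """Sum of squared prediction errors, leaving out one observation at a time."""
    total = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_factors)
        total += (y[i] - pls1_predict(model, X[i])) ** 2
    return total

# Simulated data (an assumption for this example): four predictors driven
# by two latent variables, and a response that depends on both.
random.seed(0)
X, y = [], []
for _ in range(30):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    noise = [random.gauss(0, 0.05) for _ in range(4)]
    X.append([3 * z1 + noise[0], 3 * z1 + noise[1], z2 + noise[2], z2 + noise[3]])
    y.append(z1 - 2 * z2 + random.gauss(0, 0.1))

# Prediction error sum for 0 to 4 extracted factors.
press = [press_one_at_a_time(X, y, k) for k in range(5)]
best = min(range(5), key=lambda k: press[k])
```

Because the simulated response depends on two latent variables, the error sum drops sharply once two factors are extracted; additional factors mostly fit noise. Splitting the data into blocks follows the same pattern, with several observations held out at a time instead of one.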

You can use the general linear modeling approach of the GLM procedure to specify a model for your design, allowing for general polynomial effects as well as classification or ANOVA effects. You can save the model fit by the PLS procedure in a data set and apply it to new data by using the SCORE procedure.

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.