Regression Methods
All of the predictive methods implemented in PROC PLS work essentially
by finding linear combinations of the predictors (factors) to use to
predict the responses linearly. The methods differ only in how the
factors are derived, as explained in the following sections.
Partial Least Squares
Partial least squares (PLS) works by extracting one factor at a time. Let
X=X_{0} be the centered and scaled matrix of predictors and Y=Y_{0}
the centered and scaled matrix of response values. The PLS method starts with a
linear combination t = X_{0}w of the predictors, where
t is called a score vector and w is its associated
weight vector. The PLS method predicts both X_{0} and Y_{0} by regression
on t:

  \hat{X}_{0} = tp', where p' = (t't)^{-1}t'X_{0}
  \hat{Y}_{0} = tc', where c' = (t't)^{-1}t'Y_{0}

The vectors p and c are called the X- and Y-loadings,
respectively.
The specific linear combination t = X_{0}w is the one that
has maximum covariance t'u with some response linear
combination u = Y_{0}q. Another characterization is that
the X- and Y-weights w and q are proportional to the
first left and right singular vectors of the covariance matrix
X_{0}'Y_{0} or, equivalently, the first eigenvectors of X_{0}'Y_{0}Y_{0}'X_{0}
and Y_{0}'X_{0}X_{0}'Y_{0}, respectively.
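For a single response, this characterization can be checked numerically. The following is a minimal Python sketch, not the PROC PLS implementation; the toy data and the helper names center_scale, dot, and score are assumptions made for illustration. With one response, X_{0}'Y_{0} has a single column, so its first left singular vector is simply X_{0}'y_{0} scaled to unit length.

```python
import math

def center_scale(col):
    """Center a column to mean 0 and scale it to unit sample standard deviation."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
    return [(v - mean) / sd for v in col]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def score(X, w):
    """Return t = Xw for X stored as a list of columns."""
    return [sum(w[j] * X[j][i] for j in range(len(X))) for i in range(len(X[0]))]

# Hypothetical toy data (not the data set used later in this section).
x1 = [2.0, 1.0, 4.0, 3.0, 5.0, 6.0]
x2 = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]
y = [1.5, 0.5, 3.5, 2.0, 4.5, 4.0]

X0 = [center_scale(x1), center_scale(x2)]
y0 = center_scale(y)

# X-weight: X0'y0 scaled to unit length.
s = [dot(col, y0) for col in X0]
length = math.sqrt(dot(s, s))
w = [v / length for v in s]

# Score vector, then the loadings from regressing X0 and y0 on t.
t = score(X0, w)
tt = dot(t, t)
p = [dot(col, t) / tt for col in X0]   # X-loadings
c = dot(y0, t) / tt                    # Y-loading (a scalar for one response)

# Criterion t'y0 for the PLS weight and for two other unit-length weights.
cov_pls = dot(t, y0)
cov_x1_only = dot(score(X0, [1.0, 0.0]), y0)
cov_equal = dot(score(X0, [math.sqrt(0.5), math.sqrt(0.5)]), y0)
```

In this sketch, cov_pls is at least as large as the value for any other unit-length weight vector, which is the sense in which the PLS weight maximizes the covariance.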
This accounts for how the first PLS factor is extracted. The
second factor is extracted in the same way by replacing X_{0} and
Y_{0} with the X- and Y-residuals from the first factor:

  X_{1} = X_{0} - tp'
  Y_{1} = Y_{0} - tc'

These residuals are also called the deflated X and Y blocks.
The process of extracting a score vector and deflating the data
matrices is repeated for as many extracted factors as are desired.
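The extract-and-deflate cycle can be sketched as a short loop. This is an illustrative Python sketch for a single response under stated assumptions: the toy data and the function names pls_factor and deflate are hypothetical and are not part of PROC PLS.

```python
import math

def center_scale(col):
    """Center a column to mean 0 and scale it to unit sample standard deviation."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
    return [(v - mean) / sd for v in col]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def pls_factor(X, y):
    """Extract one PLS factor for a single response; X is a list of columns."""
    s = [dot(col, y) for col in X]
    length = math.sqrt(dot(s, s))
    w = [v / length for v in s]
    t = [sum(w[j] * X[j][i] for j in range(len(X))) for i in range(len(y))]
    tt = dot(t, t)
    p = [dot(col, t) / tt for col in X]
    c = dot(y, t) / tt
    return w, t, p, c

def deflate(X, y, t, p, c):
    """Subtract the part of X and y that is predicted by the score t."""
    X_res = [[col[i] - t[i] * p[j] for i in range(len(t))] for j, col in enumerate(X)]
    y_res = [y[i] - t[i] * c for i in range(len(t))]
    return X_res, y_res

# Hypothetical toy data: three predictors and one response.
x1 = [2.0, 1.0, 4.0, 3.0, 5.0, 6.0]
x2 = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]
x3 = [3.0, 1.0, 5.0, 2.0, 6.0, 4.0]
y = [1.5, 0.5, 3.5, 2.0, 4.5, 4.0]

X_cur = [center_scale(x1), center_scale(x2), center_scale(x3)]
y_cur = center_scale(y)

scores = []
for _ in range(2):                      # extract two factors
    w, t, p, c = pls_factor(X_cur, y_cur)
    scores.append(t)
    X_cur, y_cur = deflate(X_cur, y_cur, t, p, c)

t1, t2 = scores
```

Because each deflation removes everything that the current score can predict, the second score t2 is orthogonal to t1, and the final residual blocks are orthogonal to both scores.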
SIMPLS
Note that each extracted PLS factor is defined in terms of different
X-variables X_{i}. This leads to difficulties in comparing different
scores, weights, and so forth. The SIMPLS method of de Jong (1993)
overcomes these difficulties by computing each score t_{i} = Xr_{i}
in terms of the original (centered and scaled) predictors X.
The SIMPLS X-weight vectors r_{i} are similar to the eigenvectors of
SS' = X'YY'X, but they satisfy a different orthogonality condition.
The r_{1} vector is just the first eigenvector e_{1} (so that the
first SIMPLS score is the same as the first PLS score), but whereas the
second eigenvector maximizes

  e_{2}'SS'e_{2} subject to e_{1}'e_{2} = 0

the second SIMPLS weight r_{2} maximizes

  r_{2}'SS'r_{2} subject to r_{1}'X'Xr_{2} = t_{1}'t_{2} = 0
The SIMPLS scores are identical to the PLS scores for one response but
slightly different for more than one response; refer to de Jong (1993)
for details. The X- and Y-loadings are defined as in PLS, but since
the scores are all defined in terms of X, it is easy to compute the
overall model coefficients B:

  B = R(T'T)^{-1}T'Y

where R is the matrix with columns r_{i} and T = XR is the matrix of scores.
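The following Python sketch illustrates the SIMPLS idea for a single response. It is an assumption-laden illustration rather than the PROC PLS implementation: the toy data and the function name simpls are hypothetical, and the steps follow the usual description of de Jong's algorithm, in which the cross-product vector S = X'y is deflated instead of X.

```python
import math

def center_scale(col):
    """Center a column to mean 0 and scale it to unit sample standard deviation."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
    return [(v - mean) / sd for v in col]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def simpls(X, y, nfac):
    """SIMPLS sketch for one response; X is a list of columns."""
    n, m = len(y), len(X)
    s = [dot(col, y) for col in X]                 # S = X'y
    R, T, V = [], [], []
    for _ in range(nfac):
        length = math.sqrt(dot(s, s))
        r = [v / length for v in s]                # weight in terms of the original X
        t = [sum(r[j] * X[j][i] for j in range(n and m)) for i in range(n)]
        tt = dot(t, t)
        p = [dot(col, t) / tt for col in X]        # X-loadings
        v = p[:]
        for v_old in V:                            # orthogonalize against earlier loadings
            proj = dot(v_old, v)
            v = [a - proj * b for a, b in zip(v, v_old)]
        v_len = math.sqrt(dot(v, v))
        v = [a / v_len for a in v]
        proj = dot(v, s)
        s = [a - proj * b for a, b in zip(s, v)]   # deflate S, not X
        R.append(r)
        T.append(t)
        V.append(v)
    return R, T

# Hypothetical toy data: three predictors and one response.
x1 = [2.0, 1.0, 4.0, 3.0, 5.0, 6.0]
x2 = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]
x3 = [3.0, 1.0, 5.0, 2.0, 6.0, 4.0]
y = [1.5, 0.5, 3.5, 2.0, 4.5, 4.0]

X = [center_scale(x1), center_scale(x2), center_scale(x3)]
y0 = center_scale(y)
R, T = simpls(X, y0, nfac=2)

# The scores are orthogonal, so T'T is diagonal and B = R (T'T)^{-1} T'y
# reduces to a sum over the factors.
coef = [sum(R[a][j] * dot(T[a], y0) / dot(T[a], T[a]) for a in range(2))
        for j in range(3)]
fitted = [sum(coef[j] * X[j][i] for j in range(3)) for i in range(len(y0))]
resid = [a - b for a, b in zip(y0, fitted)]
```

Here coef plays the role of B for the centered and scaled data, and the residuals are orthogonal to both scores because the fit is a regression on T.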
Principal Components Regression
Like the SIMPLS method, principal components regression (PCR) defines all the scores in
terms of the original (centered and scaled) predictors X. However,
unlike both the PLS and SIMPLS methods, the PCR method chooses the X-weights/X-scores without
regard to the response data. The X-scores are chosen to explain as
much variation in X as possible; equivalently, the X-weights for
the PCR method are the eigenvectors of the predictor covariance matrix X'X.
Again, the X- and Y-loadings are defined as in PLS; but, as in SIMPLS,
it is easy to compute overall model coefficients for the original
(centered and scaled) responses Y in terms of the original predictors
X.
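As an illustration, the first PCR weight can be found by power iteration on X'X. The Python sketch below uses hypothetical toy data and a hypothetical helper name, first_principal_weight; power iteration is a stand-in chosen for brevity and is not necessarily the eigenvalue method that PROC PLS uses. The response enters only after the weight has been chosen.

```python
import math

def center_scale(col):
    """Center a column to mean 0 and scale it to unit sample standard deviation."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
    return [(v - mean) / sd for v in col]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def first_principal_weight(X, iters=500):
    """Power iteration for the dominant eigenvector of X'X; X is a list of columns."""
    m = len(X)
    A = [[dot(X[i], X[j]) for j in range(m)] for i in range(m)]
    w = [1.0 / math.sqrt(m)] * m
    for _ in range(iters):
        Aw = [dot(row, w) for row in A]
        length = math.sqrt(dot(Aw, Aw))
        w = [v / length for v in Aw]
    eigenvalue = dot(w, [dot(row, w) for row in A])
    return w, eigenvalue, A

# Hypothetical toy data: three predictors and one response.
x1 = [2.0, 1.0, 4.0, 3.0, 5.0, 6.0]
x2 = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]
x3 = [3.0, 1.0, 5.0, 2.0, 6.0, 4.0]
y = [1.5, 0.5, 3.5, 2.0, 4.5, 4.0]

X = [center_scale(x1), center_scale(x2), center_scale(x3)]
y0 = center_scale(y)

# The weight depends on X only.
w, eigenvalue, A = first_principal_weight(X)
t = [sum(w[j] * X[j][i] for j in range(3)) for i in range(len(y0))]

# Fraction of predictor variation explained by the first score.
frac_x = eigenvalue / sum(A[j][j] for j in range(3))

# Fraction of response variation explained by regressing y0 on that score.
frac_y = dot(t, y0) ** 2 / (dot(t, t) * dot(y0, y0))
```

The value frac_x is as large as any single score can achieve, whereas frac_y can be small, because the response played no part in choosing w.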
Reduced Rank Regression
As discussed in the preceding sections, partial least squares depends on
selecting factors t = Xw of the predictors and
u = Yq of the responses
that have maximum covariance, whereas principal components regression
effectively ignores u and selects t to have maximum
variance, subject to orthogonality constraints. In contrast, reduced
rank regression selects u to account for as much variation in
the predicted responses as possible, effectively ignoring the
predictors for the purposes of factor extraction. In reduced rank
regression, the Y-weights q_{i} are the eigenvectors of the
covariance matrix \hat{Y}_{LS}'\hat{Y}_{LS} of the responses
predicted by ordinary least squares regression; the X-scores are the
projections of the Y-scores Yq_{i} onto the X space.
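A small Python sketch of the first reduced rank regression factor follows, for two predictors and two responses so that the 2x2 algebra can be written out in closed form. The toy data and the helper name ols_fit are assumptions for illustration only, and the closed-form eigenvector assumes that the off-diagonal element of the 2x2 matrix is nonzero.

```python
import math

def center_scale(col):
    """Center a column to mean 0 and scale it to unit sample standard deviation."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
    return [(v - mean) / sd for v in col]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def ols_fit(X, y):
    """Least squares fitted values of y on two predictor columns (2x2 normal equations)."""
    a, b, d = dot(X[0], X[0]), dot(X[0], X[1]), dot(X[1], X[1])
    g0, g1 = dot(X[0], y), dot(X[1], y)
    det = a * d - b * b
    b0 = (d * g0 - b * g1) / det
    b1 = (a * g1 - b * g0) / det
    return [b0 * u + b1 * v for u, v in zip(X[0], X[1])]

# Hypothetical toy data: two predictors and two responses.
x1 = [2.0, 1.0, 4.0, 3.0, 5.0, 6.0]
x2 = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]
y1 = [1.5, 0.5, 3.5, 2.0, 4.5, 4.0]
y2 = [2.0, 2.5, 1.0, 4.0, 3.0, 5.5]

X = [center_scale(x1), center_scale(x2)]
Y = [center_scale(y1), center_scale(y2)]

# Responses predicted by ordinary least squares, one column at a time.
Yhat = [ols_fit(X, col) for col in Y]

# 2x2 cross-product matrix of the predicted responses.
a = dot(Yhat[0], Yhat[0])
b = dot(Yhat[0], Yhat[1])
d = dot(Yhat[1], Yhat[1])

# Dominant eigenvalue and eigenvector in closed form (assumes b != 0).
lam = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b * b)
q_raw = [b, lam - a]
q_len = math.sqrt(dot(q_raw, q_raw))
q = [v / q_len for v in q_raw]          # first Y-weight

# Y-score u = Yq and X-score t = projection of u onto the X space.
u = [q[0] * Y[0][i] + q[1] * Y[1][i] for i in range(len(x1))]
t = ols_fit(X, u)
resid = [ui - ti for ui, ti in zip(u, t)]
```

The X-score t equals the same linear combination q of the predicted responses, and u - t is orthogonal to every predictor column, as expected for a projection.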
Relationships Between Methods
When you develop a predictive model, it is important to consider not
only the explanatory power of the model for current responses, but
also how well sampled the predictive functions are, since this
impacts how well the model can extrapolate to future observations. All
of the techniques implemented in the PLS procedure work by extracting
successive factors, or linear combinations of the predictors, that
optimally address one or both of these two goals: explaining response
variation and explaining predictor variation. In particular,
principal components regression selects factors that explain as
much predictor variation as possible, reduced rank regression selects
factors that explain as much response variation as possible,
and partial least squares balances the two objectives, seeking
factors that explain both response and predictor variation.
To see the relationships between these methods, consider how each one
extracts a single factor from the following artificial
data set consisting of two predictors and one response:
data data;
   input x1 x2 y;
   datalines;
3.37651 2.30716 0.75615
0.74193 0.88845 1.15285
4.18747 2.17373 1.42392
0.96097 0.57301 0.27433
1.11161 0.75225 0.25410
1.38029 1.31343 0.04728
1.28153 0.13751 1.00341
1.39242 2.03615 0.45518
0.63741 0.06183 0.40699
2.52533 1.23726 0.91080
2.44277 3.61077 0.82590
;

/* Extract a single factor with each method so that the results can be compared. */
proc pls data=data nfac=1 method=rrr;
   title "Reduced Rank Regression";
   model y = x1 x2;
run;

proc pls data=data nfac=1 method=pcr;
   title "Principal Components Regression";
   model y = x1 x2;
run;

proc pls data=data nfac=1 method=pls;
   title "Partial Least Squares Regression";
   model y = x1 x2;
run;
The amount of model and response variation explained by the first factor
for each method is shown in Figure 51.7 through Figure 51.9.
Reduced Rank Regression

Percent Variation Accounted for by Reduced Rank Regression Factors

Number of
Extracted       Model Effects        Dependent Variables
  Factors    Current      Total      Current       Total
        1    15.0661    15.0661     100.0000    100.0000

Figure 51.7: Variation Explained by First Reduced Rank Regression Factor
Principal Components Regression

Percent Variation Accounted for by Principal Components

Number of
Extracted       Model Effects        Dependent Variables
  Factors    Current      Total      Current       Total
        1    92.9996    92.9996       9.3787      9.3787

Figure 51.8: Variation Explained by First Principal Components Regression Factor
Partial Least Squares Regression

Percent Variation Accounted for by Partial Least Squares Factors

Number of
Extracted       Model Effects        Dependent Variables
  Factors    Current      Total      Current       Total
        1    88.5357    88.5357      26.5304     26.5304

Figure 51.9: Variation Explained by First Partial Least Squares Regression Factor
Notice that, while the first reduced rank regression factor explains
all of the response variation, it accounts for only about 15%
of the predictor variation. In contrast, the first principal
components regression factor accounts for most of the predictor
variation (93%) but only 9% of the response variation. The first
partial least squares factor accounts for only slightly less predictor
variation than principal components but about three times as much
response variation.
Figure 51.10 illustrates how partial least squares balances the
goals of explaining response and predictor variation in this case.
Figure 51.10: Depiction of First Factors for Three Different Regression Methods
The ellipse shows the general shape of the 11 observations in the
predictor space, with the contours of increasing y overlaid. Also
shown are the directions of the first factor for each of the three
methods. Notice that, while the predictors vary most in the
x1 = x2 direction, the response changes most in the orthogonal
x1 = -x2 direction. This explains why the first principal component accounts
for little variation in the response and why the first reduced rank
regression factor accounts for little variation in the predictors.
The direction of the first partial least squares factor represents a
compromise between the other two directions.
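The compromise can also be seen numerically for two predictors and one response. The Python sketch below uses hypothetical toy data, not the 11 observations above, and takes the first-factor directions to be the dominant eigenvector of X'X for PCR, the ordinary least squares coefficient vector for reduced rank regression (the X-score for a single response is the least squares prediction), and X'y for PLS. The helper names are assumptions for illustration.

```python
import math

def center_scale(col):
    """Center a column to mean 0 and scale it to unit sample standard deviation."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
    return [(v - mean) / sd for v in col]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def angle(u, v):
    """Angle in degrees between two directions, ignoring their signs."""
    cosine = abs(dot(u, v)) / math.sqrt(dot(u, u) * dot(v, v))
    return math.degrees(math.acos(min(1.0, cosine)))

# Hypothetical toy data: two predictors and one response.
x1 = [2.0, 1.0, 4.0, 3.0, 5.0, 6.0]
x2 = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]
y = [1.5, 0.5, 3.5, 2.0, 4.5, 4.0]

X0 = [center_scale(x1), center_scale(x2)]
y0 = center_scale(y)

# Elements of X'X and X'y.
a, b, d = dot(X0[0], X0[0]), dot(X0[0], X0[1]), dot(X0[1], X0[1])
g = [dot(X0[0], y0), dot(X0[1], y0)]

# PCR: dominant eigenvector of X'X in closed form (assumes b != 0).
lam = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b * b)
pcr_dir = [b, lam - a]

# RRR with one response: ordinary least squares coefficients.
det = a * d - b * b
rrr_dir = [(d * g[0] - b * g[1]) / det, (a * g[1] - b * g[0]) / det]

# PLS: X'y.
pls_dir = g

angle_pls_pcr = angle(pls_dir, pcr_dir)
angle_rrr_pcr = angle(rrr_dir, pcr_dir)
angle_rrr_pls = angle(rrr_dir, pls_dir)
```

Because X'y equals X'X times the least squares coefficients, multiplying by X'X rotates the reduced rank direction toward the first principal component. In this two-predictor sketch the PLS direction therefore lies between the other two, and the two smaller angles add up to angle_rrr_pcr.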
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.