Multivariate Analyses

Principal component analysis reduces the dimensionality
of a set of data while trying to preserve the structure.
Given a data set with *n*_{y} **Y** variables,
*n*_{y} eigenvalues and their associated eigenvectors
can be computed from its covariance or correlation matrix.
The eigenvectors are standardized to unit length.
The principal components are linear
combinations of the **Y** variables.
The coefficients of the linear combinations are the
eigenvectors of the covariance or correlation matrix.
Principal components are formed as follows:

- The first principal component is the linear combination of the **Y** variables that accounts for the greatest possible variance.
- Each subsequent principal component is the linear combination of the **Y** variables that has the greatest possible variance and is uncorrelated with the previously defined components.
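The construction above can be sketched numerically. The following is a minimal NumPy illustration, not SAS/INSIGHT code: the data matrix `Y`, the random seed, and all variable names are hypothetical. It assumes the covariance matrix is used and that scores are computed from centered variables.

```python
import numpy as np

# Hypothetical data: 200 observations of 3 correlated Y variables.
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.5]])
Y = rng.normal(size=(200, 3)) @ mixing

# Covariance matrix of the Y variables (columns are variables).
S = np.cov(Y, rowvar=False)

# eigh is intended for symmetric matrices; it returns eigenvalues in
# ascending order and unit-length eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(S)

# Reorder so that the first component has the largest variance.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Principal component scores: linear combinations of the centered
# Y variables, with the eigenvectors as coefficients.
scores = (Y - Y.mean(axis=0)) @ eigenvectors

print("eigenvector lengths:", np.linalg.norm(eigenvectors, axis=0))
print("score variances:    ", scores.var(axis=0, ddof=1))
print("eigenvalues:        ", eigenvalues)
```

In this sketch the sample variance of each column of `scores` agrees with the corresponding eigenvalue, which is the sense in which the first component "accounts for the greatest possible variance."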

For a covariance or correlation matrix, the sum of
its eigenvalues equals the *trace* of the matrix,
that is, the sum of the variances of the *n*_{y}
variables for a covariance matrix, and *n*_{y}
for a correlation matrix.
The principal components are sorted by descending order of
their variances, which are equal to the associated eigenvalues.
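As a quick check of the trace identity, the sketch below compares the eigenvalue sums with the traces of both matrices. It uses NumPy with hypothetical data on deliberately different scales. The names `S`, `R`, and `n_y` are chosen here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: 4 variables measured on very different scales.
Y = rng.normal(size=(150, 4)) * np.array([1.0, 5.0, 0.2, 10.0])
n_y = Y.shape[1]

S = np.cov(Y, rowvar=False)        # covariance matrix
R = np.corrcoef(Y, rowvar=False)   # correlation matrix

# eigvalsh returns ascending eigenvalues; reverse for descending order.
cov_eigenvalues = np.linalg.eigvalsh(S)[::-1]
corr_eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Covariance matrix: eigenvalue sum = trace = sum of the variances.
print(cov_eigenvalues.sum(), np.trace(S), Y.var(axis=0, ddof=1).sum())

# Correlation matrix: eigenvalue sum = trace = number of variables.
print(corr_eigenvalues.sum(), n_y)
```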

Principal components can be used to reduce the number of variables in statistical analyses. Different methods for selecting the number of principal components to retain have been suggested. One simple criterion is to retain components with associated eigenvalues greater than the average eigenvalue (Kaiser 1958). For a correlation matrix, the average eigenvalue is 1, so this amounts to retaining components with eigenvalues greater than 1. SAS/INSIGHT software offers this criterion as an option for selecting the numbers of eigenvalues, eigenvectors, and principal components in the analysis.

Principal components have a variety of useful properties (Rao 1964; Kshirsagar 1972):

- The eigenvectors are orthogonal, so the principal components represent jointly perpendicular directions through the space of the original variables.
- The principal component scores are jointly uncorrelated. Note that this property is quite distinct from the previous one.
- The first principal component has the largest variance of any unit-length linear combination of the observed variables. The *j*th principal component has the largest variance of any unit-length linear combination orthogonal to the first *j*-1 principal components. The last principal component has the smallest variance of any unit-length linear combination of the original variables.
- The scores on the first *j* principal components have the highest possible generalized variance of any set of unit-length linear combinations of the original variables.
- In geometric terms, the *j*-dimensional linear subspace spanned by the first *j* principal components gives the best possible fit to the data points as measured by the sum of squared perpendicular distances from each data point to the subspace. This is in contrast to the geometric interpretation of least squares regression, which minimizes the sum of squared vertical distances. For example, suppose you have two variables. Then the first principal component minimizes the sum of squared perpendicular distances from the points to the first principal axis, whereas least squares would minimize the sum of squared vertical distances from the points to the fitted line.
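Several of these properties can be checked numerically in the two-variable case. The sketch below is an illustration under stated assumptions, not SAS/INSIGHT code: the data are simulated, and the helper functions `perpendicular_ss` and `vertical_ss` are hypothetical names introduced here. Both lines are taken to pass through the centroid of the data.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical two-variable data: a linear trend plus noise.
x = rng.normal(size=300)
y = 0.8 * x + rng.normal(scale=0.5, size=300)
Yc = np.column_stack([x, y])
Yc = Yc - Yc.mean(axis=0)          # center both variables

eigenvalues, V = np.linalg.eigh(np.cov(Yc, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvalues, V = eigenvalues[order], V[:, order]
scores = Yc @ V

# Eigenvectors are orthogonal: V'V is the identity matrix.
orthogonal = np.allclose(V.T @ V, np.eye(2))

# Scores are uncorrelated: their covariance matrix is diagonal,
# with the eigenvalues on the diagonal.
uncorrelated = np.allclose(np.cov(scores, rowvar=False), np.diag(eigenvalues))

def perpendicular_ss(direction):
    """Sum of squared perpendicular distances from the centered points
    to the line through the origin along `direction`."""
    u = direction / np.linalg.norm(direction)
    residual = Yc - np.outer(Yc @ u, u)
    return float((residual ** 2).sum())

def vertical_ss(slope):
    """Sum of squared vertical distances from the centered points
    to the line through the origin with the given slope."""
    return float(((Yc[:, 1] - slope * Yc[:, 0]) ** 2).sum())

# Least squares slope of y on x, and the slope of the first principal axis.
ols_slope = (Yc[:, 0] @ Yc[:, 1]) / (Yc[:, 0] @ Yc[:, 0])
pc_slope = V[1, 0] / V[0, 0]

print("orthogonal eigenvectors:", orthogonal)
print("uncorrelated scores:    ", uncorrelated)
print("perpendicular SS (PC axis, LS line):",
      perpendicular_ss(V[:, 0]), perpendicular_ss(np.array([1.0, ols_slope])))
print("vertical SS      (PC axis, LS line):",
      vertical_ss(pc_slope), vertical_ss(ols_slope))
```

The two lines generally differ: the first principal axis has the smaller perpendicular sum of squares, and the least squares line has the smaller vertical sum of squares.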

SAS/INSIGHT software computes principal components from either
the correlation or the covariance matrix. The covariance matrix
can be used when the variables are measured on comparable scales.
Otherwise, the correlation matrix should be used.
The new variables that hold the principal component scores have variances
equal either to the corresponding eigenvalues (**Variance=Eigenvalues**)
or to one (**Variance=1**).
You specify the computation method and type of output components
in the method options dialog, as shown in Figure 40.3.
By default, SAS/INSIGHT software uses the correlation matrix
with new variable variances equal to corresponding eigenvalues.
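The two score scalings, and the average-eigenvalue retention criterion described earlier, can be illustrated with a correlation-matrix analysis. This NumPy sketch is an assumption-laden illustration rather than a description of how SAS/INSIGHT software is implemented: the data are simulated, and the names `scores_eig`, `scores_unit`, and `keep` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical data: 5 variables on different scales; the first three
# share a common factor, the last two are independent noise.
common = rng.normal(size=(250, 1))
Y = np.hstack([common + 0.3 * rng.normal(size=(250, 3)),
               rng.normal(size=(250, 2))])
Y = Y * np.array([1.0, 10.0, 100.0, 1.0, 50.0])

# Analyzing the correlation matrix is equivalent to analyzing the
# covariance matrix of the standardized variables.
Z = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
R = np.corrcoef(Y, rowvar=False)

eigenvalues, V = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, V = eigenvalues[order], V[:, order]

# Scores whose variances equal the corresponding eigenvalues.
scores_eig = Z @ V

# The same scores rescaled so that every component has variance one.
scores_unit = scores_eig / np.sqrt(eigenvalues)

# Average-eigenvalue criterion: keep components whose eigenvalue exceeds
# the mean eigenvalue, which is 1 for a correlation matrix.
keep = eigenvalues > eigenvalues.mean()

print("variances (eigenvalue scaling):", scores_eig.var(axis=0, ddof=1))
print("variances (unit scaling):      ", scores_unit.var(axis=0, ddof=1))
print("components retained:", int(keep.sum()), "of", len(eigenvalues))
```

Both scalings describe the same directions; only the spread of each score variable changes, so the choice mainly affects plots and any later analysis that is sensitive to scale.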


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.