The FASTCLUS Procedure
If you specify the IMPUTE option, the OUT= data set also contains a new variable, _IMPUTE_, giving the number of imputed values in each observation.
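As a rough illustration of what _IMPUTE_ records, the following Python sketch fills the missing values of one observation from a cluster seed and counts how many values were filled. The function name `impute_from_seed`, the use of `None` to stand for a missing value, and the toy numbers are all assumptions made for this example; they are not part of PROC FASTCLUS.

```python
from typing import List, Optional, Tuple


def impute_from_seed(
    obs: List[Optional[float]], seed: List[float]
) -> Tuple[List[float], int]:
    """Replace missing entries (None) with the matching seed value.

    Returns the filled observation and the number of values that were
    imputed, which plays the role of _IMPUTE_ in this sketch.
    """
    filled: List[float] = []
    n_imputed = 0
    for value, seed_value in zip(obs, seed):
        if value is None:
            filled.append(seed_value)  # hypothetical replacement rule
            n_imputed += 1
        else:
            filled.append(value)
    return filled, n_imputed


# Toy observation with two missing values and a made-up cluster seed.
row, impute_count = impute_from_seed([1.5, None, 3.0, None], [1.0, 2.0, 3.0, 4.0])
print(row, impute_count)
```

An observation with no missing values would get a count of 0 under this sketch.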
If you specify the LEAST=p option with a value other than 2, the _RMSSTD_ variable is replaced by the _SCALE_ variable, which contains the pooled scale estimate analogous to the root mean square standard deviation but based on pth power deviations instead of squared deviations.
If you specify the OUTITER option, the variables _SCALE_ or _RMSSTD_, _RADIUS_, _NEAR_, and _GAP_ have missing values except for the last pass.
You can use the OUTSEED= data set as a SEED= input data set for a subsequent analysis.
The values of _TYPE_ for all LEAST= options are given in the following table.
Table 27.2: _TYPE_ Values for all LEAST= Options
_TYPE_ | Contents of VAR variables | Contents of OVER_ALL |
INITIAL | Initial seeds | Missing |
CRITERION | Missing | Optimization criterion; see the LEAST= option; this value is displayed just before the "Cluster Summary" table |
CENTER | Cluster centers; see the LEAST= option | Missing |
SEED | Cluster seeds: additional information used for imputation | |
DISPERSION | Dispersion estimates for each cluster; see the LEAST= option; these values are displayed in a separate row with title depending on the LEAST= option | Dispersion estimates pooled over variables; see the LEAST= option; these values are displayed in the "Cluster Summary" table with label depending on the LEAST= option |
FREQ | Frequency of each cluster omitting observations with missing values for the VAR variable; these values are not displayed | Frequency of each cluster based on all observations with any nonmissing value; these values are displayed in the "Cluster Summary" table |
WEIGHT | Sum of weights for each cluster omitting observations with missing values for the VAR variable; these values are not displayed | Sum of weights for each cluster based on all observations with any nonmissing value; these values are displayed in the "Cluster Summary" table |
Observations with _TYPE_='WEIGHT' are included only if you specify the WEIGHT statement.
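The difference between the VAR-variable cells and the OVER_ALL cell of a FREQ row can be sketched in Python as follows. The function name `freq_rows`, the use of `None` for a missing value, and the toy data are assumptions made for this illustration only.

```python
from typing import Dict, List, Optional, Sequence


def freq_rows(
    data: Sequence[Sequence[Optional[float]]], cluster: Sequence[int]
) -> Dict[int, dict]:
    """Count observations per cluster in two ways.

    var_freq[j]: observations in the cluster with a nonmissing value
                 for variable j (the VAR-variable cells).
    over_all:    observations in the cluster with at least one
                 nonmissing value (the OVER_ALL cell).
    """
    n_vars = len(data[0])
    out: Dict[int, dict] = {}
    for obs, c in zip(data, cluster):
        entry = out.setdefault(c, {"var_freq": [0] * n_vars, "over_all": 0})
        for j, value in enumerate(obs):
            if value is not None:
                entry["var_freq"][j] += 1
        if any(value is not None for value in obs):
            entry["over_all"] += 1
    return out


# Toy data: five observations, two variables, two clusters.
toy_data = [[1.0, 2.0], [None, 2.5], [1.2, None], [8.0, 9.0], [8.5, None]]
toy_cluster = [1, 1, 1, 2, 2]
freq = freq_rows(toy_data, toy_cluster)
print(freq)
```

In this toy data, cluster 1 has three observations overall but only two nonmissing values for each variable, which is why the per-variable counts can be smaller than the OVER_ALL count.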
The _TYPE_ values included only for least-squares clustering are given in the following table. Least-squares clustering is obtained by omitting the LEAST= option or by specifying LEAST=2.
Table 27.3: _TYPE_ Values for Least-Squares Clustering
_TYPE_ | Contents of VAR variables | Contents of OVER_ALL |
MEAN | Mean for the total sample; this is not displayed | Missing |
STD | Standard deviation for the total sample; this is labeled "Total STD" in the output | Standard deviation pooled over all the VAR variables; this is labeled "Total STD" in the output |
WITHIN_STD | Pooled within-cluster standard deviation | Within-cluster standard deviation pooled over clusters and all the VAR variables |
RSQ | R^{2} for predicting the variable from the clusters; this is labeled "R-Squared" in the output | R^{2} pooled over all the VAR variables; this is labeled "R-Squared" in the output |
RSQ_RATIO | [(R^{2})/(1-R^{2})]; this is labeled "RSQ/(1-RSQ)" in the output | [(R^{2})/(1-R^{2})]; this is labeled "RSQ/(1-RSQ)" in the output |
PSEUDO_F | Missing | Pseudo F statistic |
ERSQ | Missing | Approximate expected value of R^{2} under the null hypothesis of a single uniform cluster |
CCC | Missing | The cubic clustering criterion |
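The relationship among the RSQ, RSQ_RATIO, and PSEUDO_F rows can be illustrated with a small Python sketch for a single variable. The table gives only the ratio R^{2}/(1-R^{2}); the definitions used here for R^{2} (one minus the within-cluster sum of squares over the total sum of squares) and for the pseudo F statistic ([R^{2}/(c-1)]/[(1-R^{2})/(n-c)] with c clusters and n observations) are the usual ones and are assumptions of this example, as are the function names and the toy data.

```python
from typing import Dict, List, Sequence


def r_squared(values: Sequence[float], cluster: Sequence[int]) -> float:
    """R-squared for predicting one variable from cluster membership."""
    n = len(values)
    grand_mean = sum(values) / n
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    groups: Dict[int, List[float]] = {}
    for v, c in zip(values, cluster):
        groups.setdefault(c, []).append(v)
    ss_within = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        ss_within += sum((v - mean) ** 2 for v in members)
    return 1.0 - ss_within / ss_total


def rsq_ratio(rsq: float) -> float:
    """The RSQ/(1-RSQ) quantity."""
    return rsq / (1.0 - rsq)


def pseudo_f(rsq: float, n_obs: int, n_clusters: int) -> float:
    """Pseudo F statistic under the assumed definition stated above."""
    return (rsq / (n_clusters - 1)) / ((1.0 - rsq) / (n_obs - n_clusters))


# Toy data: six observations of one variable in two clusters.
toy_values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
toy_cluster = [1, 1, 1, 2, 2, 2]
rsq = r_squared(toy_values, toy_cluster)
ratio = rsq_ratio(rsq)
f_stat = pseudo_f(rsq, n_obs=len(toy_values), n_clusters=2)
print(round(rsq, 4), round(ratio, 4), round(f_stat, 4))
```

For these toy values the within-cluster sum of squares is 4 and the total sum of squares is 58, so the ratio is 54/4 = 13.5 and the pseudo F statistic is 54.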
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.