
The MODECLUS Procedure

If you specify the SIMPLE option and the data are coordinates, PROC MODECLUS displays the following simple descriptive statistics for each variable:

- the MEAN
- the standard deviation, STD DEV
- the SKEWNESS
- the KURTOSIS
- a coefficient of BIMODALITY (see Chapter 23, "The CLUSTER Procedure")
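
The statistics in this list can be reproduced by hand for a single variable. The following is a minimal sketch, not the procedure's own code. It assumes the bias-corrected sample skewness and kurtosis commonly used in SAS output and the bimodality coefficient b = (m3² + 1) / (m4 + 3(n−1)² / ((n−2)(n−3))) described in the CLUSTER chapter, where m3 is the skewness and m4 is the kurtosis. The function name `simple_stats` is hypothetical.

```python
import math

def simple_stats(x):
    """Mean, standard deviation, skewness, kurtosis, and bimodality
    coefficient for one variable (assumed bias-corrected definitions)."""
    n = len(x)
    if n < 4:
        raise ValueError("at least 4 observations are needed for kurtosis")
    mean = sum(x) / n
    # Sample standard deviation with divisor n - 1.
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    if std == 0:
        raise ValueError("the variable is constant")
    z = [(v - mean) / std for v in x]
    skew = n / ((n - 1) * (n - 2)) * sum(t ** 3 for t in z)
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
            * sum(t ** 4 for t in z) - correction)
    # Bimodality coefficient as assumed in the lead-in above.
    bimodality = (skew ** 2 + 1) / (kurt + correction)
    return {"mean": mean, "std_dev": std, "skewness": skew,
            "kurtosis": kurt, "bimodality": bimodality}

stats = simple_stats([1.0, 2.0, 3.0, 4.0, 5.0])
print(stats)
```

For the five evenly spaced values used here, the skewness is zero and the kurtosis is negative, so the bimodality coefficient stays small.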

If you specify the NEIGHBOR option, PROC MODECLUS displays a list of the neighbors of each observation. The table contains

- the observation number or ID value of the observation
- the observation number or ID value of each of its neighbors
- the distance to each neighbor
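
To make the layout of the neighbor list concrete, the sketch below builds an equivalent structure for a fixed-radius neighborhood. It is an illustration only: the function name `neighbor_list` is hypothetical, Euclidean distance is assumed, the neighborhood is assumed to be closed (distance less than or equal to the radius), and an observation is assumed not to be listed as its own neighbor.

```python
import math

def neighbor_list(points, radius):
    """Map each 1-based observation number to a list of
    (neighbor number, distance) pairs, nearest neighbor first."""
    result = {}
    for i, p in enumerate(points):
        found = []
        for j, q in enumerate(points):
            if i == j:
                continue  # assumption: an observation is not its own neighbor
            d = math.dist(p, q)
            if d <= radius:
                found.append((j + 1, d))
        found.sort(key=lambda pair: pair[1])
        result[i + 1] = found
    return result

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
neighbors = neighbor_list(points, radius=2.0)
for obs, rows in neighbors.items():
    print(obs, rows)
```

The fourth point lies far from the others, so its list is empty.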

If you specify the CROSSLIST option, PROC MODECLUS produces a table of information regarding cross validation of the density estimates. Each table has a row for each observation. For each observation, the following are displayed:

- the observation number or ID value of the observation
- the radius of the neighborhood
- the number of neighbors
- the estimated log density
- the estimated cross-validated log density
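
The columns of this table can be illustrated with a simple fixed-radius density estimate. The sketch below is a set of assumptions rather than the procedure's documented formulas: it uses a uniform kernel, so the density at an observation is the neighbor count divided by the sample size times the volume of the neighborhood; it counts the observation as a member of its own neighborhood; and it forms the cross-validated estimate by dropping the observation from both the count and the sample size. The normalization that PROC MODECLUS actually uses may differ. The names `ball_volume` and `crosslist_rows` are hypothetical.

```python
import math

def ball_volume(radius, dim):
    """Volume of a dim-dimensional ball with the given radius."""
    return math.pi ** (dim / 2) * radius ** dim / math.gamma(dim / 2 + 1)

def crosslist_rows(points, radius):
    """One row per observation: radius, neighbor count, log density,
    and cross-validated log density (assumed uniform-kernel estimates)."""
    n = len(points)
    volume = ball_volume(radius, len(points[0]))
    rows = []
    for i, p in enumerate(points):
        # Count includes the observation itself (an assumption).
        count = sum(1 for q in points if math.dist(p, q) <= radius)
        log_density = math.log(count / (n * volume))
        left_out = count - 1  # the observation is removed for cross validation
        if left_out > 0:
            cv_log_density = math.log(left_out / ((n - 1) * volume))
        else:
            cv_log_density = float("-inf")  # no other point in the neighborhood
        rows.append({"obs": i + 1, "radius": radius, "neighbors": count,
                     "log_density": log_density,
                     "cv_log_density": cv_log_density})
    return rows

rows = crosslist_rows([(0.0,), (1.0,), (3.0,)], radius=1.5)
for row in rows:
    print(row)
```

An isolated observation has a cross-validated density of zero under these assumptions, which is why its log density is reported here as negative infinity.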

If you specify the LOCAL option, PROC MODECLUS produces a table of information regarding estimates of local dimensionality. Each table has a row for each observation. For each observation, the following are displayed:

- the observation number or ID value of the observation
- the radius of the neighborhood
- the estimated local dimensionality

If you specify the LIST option, PROC MODECLUS produces a table listing the observations within each cluster. The table can include

- the cluster number
- the observation number or ID value of the observation
- the estimated density
- the sum of the density estimates of observations within the neighborhood that belong to the same cluster
- the sum of the density estimates of observations within the neighborhood that belong to a different cluster
- the sum of the density estimates of all the observations within the neighborhood
- the ratio of the sum of the density estimates for the same cluster to the sum of all the density estimates in the neighborhood
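
The four density sums in this list are related by simple arithmetic, which the sketch below spells out for one observation. The function name `list_row`, the dictionary layout, and the sample values are hypothetical, and whether a neighborhood contains the observation itself is left to the caller through the `neighbors` argument.

```python
def list_row(obs, density, cluster, neighbors):
    """Density sums over the neighborhood of one observation,
    split by whether each neighbor shares the observation's cluster."""
    own = cluster[obs]
    same = sum(density[j] for j in neighbors[obs] if cluster[j] == own)
    other = sum(density[j] for j in neighbors[obs] if cluster[j] != own)
    total = same + other
    return {"cluster": own, "obs": obs, "density": density[obs],
            "same_sum": same, "other_sum": other, "total_sum": total,
            "ratio": same / total}

# Hypothetical density estimates, cluster numbers, and neighborhoods.
density = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
cluster = {1: 1, 2: 1, 3: 2, 4: 2}
neighbors = {2: [1, 2, 3]}  # here the neighborhood of 2 includes 2 itself

row = list_row(2, density, cluster, neighbors)
print(row)
```

A ratio near 1 indicates an observation whose neighborhood lies almost entirely inside its own cluster; smaller values indicate an observation close to another cluster.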

If you specify the LIST option and there are unassigned objects, PROC MODECLUS produces a table listing those observations. The table includes

- the observation number or ID value of the observation
- the estimated density
- the ratio of the sum of the density estimates for the same cluster to the sum of the density estimates in the neighborhood for all other clusters

If you specify the BOUNDARY option, PROC MODECLUS produces a table listing the observations in each cluster that have a neighbor belonging to a different cluster. The table includes

- the observation number or ID value of the observation
- the estimated density
- the cluster number
- the ratio of the sum of the density estimates for the same cluster to the sum of the density estimates in the neighborhood for all other clusters

If you do not specify the SHORT option, PROC MODECLUS produces a table of cluster statistics including

- the cluster number
- the cluster frequency (the number of observations in the cluster)
- the maximum estimated density within the cluster
- the number of observations in the cluster having a neighbor that belongs to a different cluster
- the estimated saddle density of the cluster
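
Three of these statistics follow directly from the cluster assignments, the density estimates, and the neighbor lists. The sketch below tallies them under the assumption that every observation has been assigned to a cluster. It does not compute the saddle density, whose definition is not given in this section. The function name `cluster_statistics` and the sample values are hypothetical.

```python
def cluster_statistics(density, cluster, neighbors):
    """Frequency, maximum density, and boundary frequency per cluster."""
    stats = {}
    for obs, c in cluster.items():
        row = stats.setdefault(
            c, {"frequency": 0, "max_density": 0.0, "boundary_frequency": 0})
        row["frequency"] += 1
        row["max_density"] = max(row["max_density"], density[obs])
        # A boundary observation has at least one neighbor in another cluster.
        if any(cluster[j] != c for j in neighbors[obs]):
            row["boundary_frequency"] += 1
    return stats

# Hypothetical density estimates, cluster numbers, and neighborhoods.
density = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
cluster = {1: 1, 2: 1, 3: 2, 4: 2}
neighbors = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}

table = cluster_statistics(density, cluster, neighbors)
for number, row in sorted(table.items()):
    print(number, row)
```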

If you specify the TEST or JOIN option, the table of cluster statistics includes the following items pertaining to the saddle test:

- the number of observations within the fixed-radius density-estimation neighborhood of the modal observation
- the number of observations within the fixed-radius density-estimation neighborhood of the saddle observation
- the number of observations within the overlap of the two preceding neighborhoods
- the *z* statistic for comparing the preceding counts
- the approximate *p*-value

If you do not specify the NOSUMMARY option, PROC MODECLUS produces a table summarizing each cluster solution containing the following items:

- the smoothing parameters and cascade value
- the number of clusters
- the frequency of unclassified objects
- the likelihood cross-validation criterion if you specify the CROSS or CROSSLIST option

If you specify the JOIN option, the summary table also includes

- the number of clusters joined
- the maximum *p*-value of any cluster in the solution

If you specify the TRACE option, PROC MODECLUS produces a table for each cluster solution that lists each observation along with its cluster membership as it is reassigned from the "Old" cluster to the "New" cluster. This reassignment is described in **Step 1** through **Step 3** of the section "METHOD=6". Each table has a row for each observation. For each observation, the following are displayed:

- the observation number or ID value of the observation
- the estimated density
- the "Old" cluster membership, where 0 represents an unassigned observation and -1 represents a seed
- the "New" cluster membership
- "Ratio," which is documented in the section "METHOD=6". The following character values can also be displayed:
  - "M" means the observation is a mode
  - "S" means the observation is a seed
  - "N" means the observation is a neighbor of a mode or seed, for which the ratio is not computed


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.