
The TREE Procedure

The TREE procedure produces a tree diagram, also known as a
*dendrogram* or *phenogram*, using a data set created by the CLUSTER
or VARCLUS procedure. The CLUSTER and VARCLUS procedures create output
data sets that contain the results of hierarchical clustering as a
tree structure. The TREE procedure uses such an output data set to
produce a diagram of the tree structure in the style of Johnson (1967),
with the root at the top. Alternatively, the diagram can be oriented
horizontally, with the root at the left. Any numeric variable in the
output data set can be used to specify the heights of the clusters.
PROC TREE can also create an output data set containing a variable
that indicates the disjoint clusters at a specified level in the tree.
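
The idea of reading disjoint clusters off a tree at a specified level can be illustrated outside SAS. The following Python sketch is not PROC TREE and does not reproduce its output data set; it assumes a hypothetical toy tree stored as a `children` mapping, a hypothetical `height` value for every node that grows toward the root, and a made-up function name `cut_tree`. Walking down from the root, it stops at the first cluster whose height does not exceed the chosen level and reports the leaves under it.

```python
# Hypothetical toy tree (not SAS output): cluster name -> names of its children.
# Leaves, the objects being clustered, have no children.
children = {
    "A": ["B", "C"],
    "B": ["x1", "x2"],
    "C": ["x3", "D"],
    "D": ["x4", "x5"],
    "x1": [], "x2": [], "x3": [], "x4": [], "x5": [],
}

# Assumed heights: leaves sit at 0 and height increases toward the root.
height = {
    "A": 5.0, "B": 1.0, "C": 3.0, "D": 2.0,
    "x1": 0.0, "x2": 0.0, "x3": 0.0, "x4": 0.0, "x5": 0.0,
}


def leaves_under(node, children):
    """Collect every leaf that belongs to the cluster `node`."""
    kids = children[node]
    if not kids:
        return [node]
    found = []
    for kid in kids:
        found.extend(leaves_under(kid, children))
    return found


def cut_tree(root, children, height, level):
    """Return the disjoint clusters (as sorted lists of leaves) at `level`."""
    clusters = []
    stack = [root]
    while stack:
        node = stack.pop()
        if height[node] <= level or not children[node]:
            # This cluster lies at or below the cut, so keep it whole.
            clusters.append(sorted(leaves_under(node, children)))
        else:
            # The cluster is above the cut, so descend into its children.
            stack.extend(children[node])
    return sorted(clusters)


clusters_at_2_5 = cut_tree("A", children, height, 2.5)
```

With these assumed heights, cutting at 2.5 separates the toy tree into three clusters: `x1` and `x2` together, `x3` alone, and `x4` and `x5` together. Raising the level to the height of the root returns a single cluster containing all five objects.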

Tree diagrams are discussed in the context of cluster analysis by Duran and Odell (1974), Hartigan (1975), and Everitt (1980). Knuth (1973) provides a general treatment of tree diagrams in computer programming.

The literature on tree diagrams contains a mixture of botanical and
genealogical terminology. The objects that are clustered are
*leaves*. The cluster containing all objects is the *root*. A
cluster containing at least two objects but not all of them is a
*branch*. The general term for leaves, branches, and roots is
*node*. If a cluster A is the union of clusters B and C, then A is the
*parent* of B and C, and B and C are *children* of A. A leaf
is thus a node with no children, and a root is a node with no
parent. If every cluster has at most two children, the tree diagram is
a *binary tree*. The CLUSTER procedure always produces binary
trees. The VARCLUS procedure can produce tree diagrams with clusters
that have many children.
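
The terminology above maps directly onto a small data structure. The following Python sketch is only an illustration of the definitions, not of how the CLUSTER, VARCLUS, or TREE procedures store trees; the `children` mapping, the node names, and the helper names `parent_map`, `classify`, and `is_binary` are all hypothetical.

```python
# Hypothetical toy tree: cluster name -> names of its children.
# A is the union of B and C, so A is the parent of B and C.
children = {
    "A": ["B", "C"],
    "B": ["x1", "x2"],
    "C": ["x3", "x4"],
    "x1": [], "x2": [], "x3": [], "x4": [],
}


def parent_map(children):
    """Invert the children mapping: child name -> parent name."""
    return {kid: node for node, kids in children.items() for kid in kids}


def classify(children):
    """Label each node as 'leaf', 'root', or 'branch'."""
    parents = parent_map(children)
    kinds = {}
    for node, kids in children.items():
        if not kids:
            kinds[node] = "leaf"      # a node with no children
        elif node not in parents:
            kinds[node] = "root"      # a node with no parent
        else:
            kinds[node] = "branch"    # has children and a parent
    return kinds


def is_binary(children):
    """True if every cluster has at most two children."""
    return all(len(kids) <= 2 for kids in children.values())


kinds = classify(children)

# A second hypothetical tree in which one cluster has three children.
wide = {"R": ["a", "b", "c"], "a": [], "b": [], "c": []}
```

Here `kinds` labels `A` as the root, `B` and `C` as branches, and the four `x` nodes as leaves. `is_binary(children)` holds for the first tree, whereas `is_binary(wide)` does not, because `R` has three children.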


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.