The CLUSTER Procedure

Example 23.6: Size, Shape, and Correlation

The following example shows the analysis of a data set in which size information is detrimental to the classification. Imagine that an archaeologist of the future is excavating a 20th-century grocery store. The archaeologist has discovered a large number of boxes of various sizes, shapes, and colors and wants to do a preliminary classification based on simple external measurements: height, width, depth, weight, and the predominant color of the box. It is known that a given product may have been sold in packages of different sizes, so the archaeologist wants to remove the effect of size from the classification. It is not known whether color is relevant to the use of the products, so the analysis should be done both with and without color information.

Unknown to the archaeologist, the boxes actually fall into six general categories according to the use of the product: breakfast cereals, crackers, laundry detergents, Little Debbie snacks, tea, and toothpaste. These categories are shown in the analysis so that you can evaluate the effectiveness of the classification.

Since there is no reason for the archaeologist to assume that the true categories have equal sample sizes or variances, the centroid method is used to avoid undue bias. Each analysis is done with Euclidean distances after suitable transformations of the data. Color is coded as five dummy variables with values of 0 or 1. The DATA step is as follows:

   options ls=120;
   title 'Cluster Analysis of Grocery Boxes';
   data grocery2;
      length name $35   /* name of product */
             class $16  /* category of product */
             unit $1    /* unit of measurement for weights:
                              g=gram
                              o=ounce
                              l=lb
                           all weights are converted to grams */
             color $8   /* predominant color of box */
             height 8   /* height of box in cm. */
             width 8    /* width of box in cm. */
             depth 8    /* depth of box (front to back) in cm. */
             weight 8   /* weight of box in grams */
             c_white c_yellow c_red c_green c_blue 4; 
                        /* dummy variables */
      retain class;
      drop unit;

      /*--- read name with possible embedded blanks ---*/
      input name & @;

      /*--- if name starts with "---",              ---*/ 
      /*--- it's really a category value            ---*/
      if substr(name,1,3) = '---' then do;
         class = substr(name,4,index(substr(name,4),'-')-1);
         delete;
         return;
      end;

      /*--- read the rest of the variables ---*/
      input height width depth weight unit color;

      /*--- convert weights to grams ---*/
      select (unit);
         when ('l') weight = weight * 454;
         when ('o') weight = weight * 28.3;
         when ('g') ;
         otherwise put 'Invalid unit ' unit;
      end;

      /*--- use 0/1 coding for dummy variables for colors ---*/
      c_white  = (color = 'w');
      c_yellow = (color = 'y');
      c_red    = (color = 'r');
      c_green  = (color = 'g');
      c_blue   = (color = 'b');

   datalines;

   ---Breakfast cereals---

   Cheerios                            32.5 22.4  8.4  567 g y
   Cheerios                            30.3 20.4  7.2  425 g y
   Cheerios                            27.5 19    6.2  283 g y
   Cheerios                            24.1 17.2  5.3  198 g y
   Special K                           30.1 20.5  8.5   18 o w
   Special K                           29.6 19.2  6.7   12 o w
   Special K                           23.4 16.6  5.7    7 o w
   Corn Flakes                         33.7 25.4  8     24 o w
   Corn Flakes                         30.2 20.6  8.4   18 o w
   Corn Flakes                         30   19.1  6.6   12 o w
   Grape Nuts                          21.7 16.3  4.9  680 g w
   Shredded Wheat                      19.7 19.9  7.5  283 g y
   Shredded Wheat, Spoon Size          26.6 19.6  5.6  510 g r
   All-Bran                            21.1 14.3  5.2 13.8 o y
   Froot Loops                         30.2 20.8  8.5 19.7 o r
   Froot Loops                         25   17.7  6.4   11 o r

   ---Crackers---

   Wheatsworth                         11.1 25.2  5.5  326 g w
   Ritz                                23.1 16    5.3  340 g r
   Ritz                                23.1 20.7  5.2  454 g r
   Premium Saltines                    11   25   10.7  454 g w
   Waverly Wafers                      14.4 22.5  6.2  454 g g

   ---Detergent---

   Arm & Hammer Detergent              38.8 30   16.9   25 l y
   Arm & Hammer Detergent              39.5 25.8 11   14.2 l y
   Arm & Hammer Detergent              33.7 22.8  7      7 l y
   Arm & Hammer Detergent              27.8 19.4  6.3    4 l y
   Tide                                39.4 24.8 11.3  9.2 l r
   Tide                                32.5 23.2  7.3  4.5 l r
   Tide                                26.5 19.9  6.3   42 o r
   Tide                                19.3 14.6  4.7   17 o r

   ---Little Debbie---

   Figaroos                            13.5 18.6  3.7   12 o y
   Swiss Cake Rolls                    10.1 21.8  5.8   13 o w
   Fudge Brownies                      11   30.8  2.5   12 o w
   Marshmallow Supremes                 9.4 32    7     10 o w
   Apple Delights                      11.2 30.1  4.9   15 o w
   Snack Cakes                         13.4 32    3.4   13 o b
   Nutty Bar                           13.2 18.5  4.2   12 o y
   Lemon Stix                          13.2 18.5  4.2    9 o w
   Fudge Rounds                         8.1 28.3  5.4  9.5 o w

   ---Tea---

   Celestial Saesonings Mint Magic      7.8 13.8  6.3   49 g b
   Celestial Saesonings Cranberry Cove  7.8 13.8  6.3   46 g r
   Celestial Saesonings Sleepy Time     7.8 13.8  6.3   37 g g
   Celestial Saesonings Lemon Zinger    7.8 13.8  6.3   56 g y
   Bigelow Lemon Lift                   7.7 13.4  6.9   40 g y
   Bigelow Plantation Mint              7.7 13.4  6.9   35 g g
   Bigelow Earl Grey                    7.7 13.4  6.9   35 g b
   Luzianne                             8.9 22.8  6.4    6 o r
   Luzianne                            18.4 20.2  6.9    8 o r
   Luzianne Decaffeinated               8.9 22.8  6.4 5.25 o g
   Lipton Tea Bags                     17.1 20    6.7    8 o r
   Lipton Tea Bags                     11.5 14.4  6.6 3.75 o r
   Lipton Tea Bags                      6.7 10    5.7 1.25 o r
   Lipton Family Size Tea Bags         13.7 24    9     12 o r
   Lipton Family Size Tea Bags          8.7 20.8  8.2    6 o r
   Lipton Family Size Tea Bags          8.9 11.1  8.2    3 o r
   Lipton Loose Tea                    12.7 10.9  5.4    8 o r

   ---Paste, Tooth---

   Colgate                              4.4 22    3.5    7 o r
   Colgate                              3.6 15.6  3.3    3 o r
   Colgate                              4.2 18.3  3.5    5 o r
   Crest                                4.3 21.7  3.7  6.4 o w
   Crest                                4.3 17.4  3.6  4.6 o w
   Crest                                3.5 15.2  3.2  2.7 o w
   Crest                                3.0 10.9  2.8  .85 o w
   Arm & Hammer                         4.4 17    3.7    5 o w
   ;


   data grocery;
      length name $16;   /* truncate product names to 16 characters */
      set grocery2;
   run;

The FORMAT procedure is used to define two formats that make the output easier to read. The STARS. format is used for graphical crosstabulations in the TABULATE procedure. The $COLOR. format displays the names of the colors instead of just the first letter.

       /*------ formats and macros for displaying ------*/
       /*------ cluster results                   ------*/
   proc format; value stars
         0='               '
         1='              #'
         2='             ##'
         3='            ###'
         4='           ####'
         5='          #####'
         6='         ######'
         7='        #######'
         8='       ########'
         9='      #########'
        10='     ##########'
        11='    ###########'
        12='   ############'
        13='  #############'
        14=' ##############'
   15-high='>##############';
   run;

   proc format; value $color
      'w'='White'
      'y'='Yellow'
      'r'='Red'
      'g'='Green'
      'b'='Blue';
   run;
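
As a quick check of the two formats, a DATA step along the following lines writes a few formatted values to the SAS log. This is only a sketch: the variable names N and C are illustrative and are not part of the original example.

```sas
   data _null_;
      /* counts rendered as right-justified bars; 20 falls in the 15-high range */
      do n = 0, 3, 14, 20;
         put n 2. +1 n stars.;
      end;
      /* one-letter color codes rendered as color names */
      do c = 'w', 'y', 'r', 'g', 'b';
         put c $1. +1 c $color.;
      end;
   run;
```

Counts of 15 or more all map to the same label, which begins with '>' to flag that the bar has been truncated.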

Since a full display of the results of each cluster analysis would be very long, a macro is used with five macro variables to select parts of the output. The macro variables are set to select only the PROC CLUSTER output and the crosstabulation of clusters and true categories for the first two analyses. The example could be run with different settings of the macro variables to show the full output or other selected parts.

   %let cluster=1;   /* 1=show CLUSTER output, 0=don't */
   %let tree=0;      /* 1=print TREE diagram, 0=don't */
   %let list=0;      /* 1=list clusters, 0=don't */
   %let crosstab=1;  /* 1=crosstabulate clusters and classes, 
                        0=don't                              */
   %let crosscol=0;  /* 1=crosstabulate clusters and colors, 
                        0=don't                              */

      /*--- define macro with options for TREE ---*/
   %macro treeopt;
      %if &tree %then h page=1;
      %else noprint;
   %mend;

      /*--- define macro with options for CLUSTER ---*/
   %macro clusopt;
      %if &cluster %then pseudo ccc p=20;
      %else noprint;
   %mend;

      /*------ macro for showing cluster results ------*/
   %macro show(n); /* n=number of clusters 
                      to show results for */

   proc tree data=tree %treeopt n=&n out=out;
      id name;
      copy class height width depth weight color;
   run;

   %if &list %then %do;
      proc sort;
         by cluster;
      run;

      proc print;
         var class name height width depth weight color;
         by cluster clusname;
      run;
   %end;

   %if &crosstab %then %do;
      proc tabulate noseps /* formchar='           ' */;
           class class cluster;
           table cluster, class*n=' 
                 '*f=stars./rts=10 misstext=' ';
   run;
   %end;

   %if &crosscol %then %do;
      proc tabulate noseps /* formchar='           ' */;
         class color cluster;
         table cluster, color*n=' 
               '*f=stars./rts=10 misstext=' ';
         format color $color.;
   run;
   %end;
   %mend;
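
For instance, to see the tree diagram and the cluster listing in addition to the crosstabulations, the macro variables can be reset before %SHOW is invoked. The following lines are a usage sketch only; they assume that a PROC CLUSTER step has already created the TREE data set that %SHOW reads.

```sas
   %let tree=1;      /* print the TREE diagram              */
   %let list=1;      /* list the members of each cluster    */
   %let crosscol=1;  /* crosstabulate clusters and colors   */
   %show(10);        /* show results for ten clusters       */
```

Because %TREEOPT and %CLUSOPT are resolved when the procedures run, the new settings take effect without redefining any macro.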

The first analysis uses the variables height, width, depth, and weight in standardized form to show the effect of including size information. The CCC, pseudo F, and pseudo t² statistics indicate 10 clusters. Most of the clusters do not correspond closely to the true categories, and four of the clusters have only one or two observations.

   /**********************************************************/
   /*                                                        */
   /*       Analysis 1: standardized box measurements        */
   /*                                                        */
   /**********************************************************/
   title2 'Analysis 1: Standardized data';
   proc cluster data=grocery m=cen std %clusopt outtree=tree;
      var height width depth weight;
      id name;
      copy class color;
   run;

   %show(10);

Output 23.6.1: Analysis of Standardized Data

Cluster Analysis of Grocery Boxes
Analysis 1: Standardized data

The CLUSTER Procedure
Centroid Hierarchical Cluster Analysis

Eigenvalues of the Correlation Matrix
     Eigenvalue    Difference    Proportion    Cumulative
1    2.44512438    1.64456210        0.6113        0.6113
2    0.80056228    0.33149770        0.2001        0.8114
3    0.46906458    0.18381582        0.1173        0.9287
4    0.28524876                      0.0713        1.0000

The data have been standardized to mean 0 and variance 1

Root-Mean-Square Total-Sample Standard Deviation = 1

Root-Mean-Square Distance Between Observations = 2.828427


Cluster History
NCL  Clusters Joined                     FREQ   SPRSQ   RSQ  ERSQ   CCC   PSF  PST2  Norm Cent Dist  Tie
 20  CL22              Lipton Family Si    11  0.0028  .974     .     .  85.4   4.5  0.3073
 19  CL36              Corn Flakes          5  0.0026  .972     .     .  83.7  15.3  0.3146
 18  CL24              CL41                12  0.0080  .964     .     .  70.2  10.0  0.3316
 17  CL18              CL30                18  0.0144  .949     .     .  53.8  12.7  0.3343
 16  Marshmallow Supr  CL29                 3  0.0024  .947     .     .  55.8   4.7  0.3363
 15  CL50              CL33                 7  0.0055  .941     .     .  55.0  24.4  0.346
 14  CL46              CL15                10  0.0069  .934     .     .  53.7   8.1  0.3192
 13  CL27              Lipton Family Si     6  0.0035  .931     .     .  56.1   6.3  0.362
 12  CL31              CL16                 5  0.0075  .923  .861  8.03  55.8   6.6  0.4416
 11  CL19              CL23                 7  0.0102  .913  .848  7.59  54.6  12.7  0.4713
 10  Arm & Hammer Det  Tide                 2  0.0037  .909  .835  8.36  59.1     .  0.4781
  9  CL11              CL17                25  0.0393  .870  .819  4.72  45.2  19.3  0.4918
  8  CL13              CL14                16  0.0329  .837  .801  2.95  40.4  23.7  0.5215
  7  CL8               CL20                27  0.0629  .774  .779  -.31  32.0  25.9  0.5467
  6  CL7               Crest               28  0.0112  .763  .752  0.61  36.7   2.4  0.6003
  5  CL9               CL6                 53  0.1879  .575  .718  -5.9  19.6  43.4  0.6641
  4  CL5               CL21                55  0.0345  .541  .672  -5.2  23.2   4.5  0.745
  3  CL4               CL12                60  0.1137  .427  .602  -5.3  22.4  14.5  0.8769
  2  CL3               CL10                62  0.1511  .276  .471  -4.3  23.2  15.8  1.5559
  1  CL2               Arm & Hammer Det    63  0.2759  .000  .000  0.00     .  23.2  2.948


         class
CLUSTER  Breakfast cereal          Crackers         Detergent     Little Debbie      Paste, Tooth               Tea
      1                                                                                                 ###########
      2                                  ##                                   #                                 ###
      3             #####                                  ##
      4                                                                     ###           #######
      5       ###########                ##               ###                                                    ##
      6                                                                   #####
      7                                   #                                                                       #
      8                                                    ##
      9                                                                                         #
     10                                                     #


The second analysis uses logarithms of height, width, depth, and the cube root of weight; the cube root is used for consistency with the linear measures. The rows are then centered to remove size information. Finally, the columns are standardized to have a standard deviation of 1. There is no compelling a priori reason to standardize the columns, but if they are not standardized, height dominates the analysis because of its large variance. The STANDARD procedure is used instead of the STD option in PROC CLUSTER so that a subsequent analysis can separately standardize the dummy variables for color.
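
To see why centering the rows of logarithms removes size, consider a box whose linear measurements are all multiplied by a factor k and whose weight is multiplied by k**3. Each of the four logarithms then increases by log(k), so the row mean increases by log(k) as well and the centered values are unchanged. The following DATA step is a minimal sketch of that check. The measurements, the scale factor, and the assumption that weight is proportional to volume are illustrative and are not taken from the grocery data.

```sas
   data _null_;
      /* hypothetical box: height, width, depth in cm, weight in grams */
      array x{4} height width depth weight (30 20 8 500);
      array l{4} l1-l4;
      do k = 1, 2;                          /* k = linear scale factor        */
         do i = 1 to 3;
            l{i} = log(k*x{i});             /* linear measurements scale by k */
         end;
         l{4} = log((k**3*x{4})**(1/3));    /* weight scales by k**3          */
         m = mean(of l{*});                 /* row mean of the logarithms     */
         do i = 1 to 4;
            l{i} = l{i} - m;                /* center the row                 */
         end;
         put k= (l1-l4) (8.4);
      end;
   run;
```

Both iterations should write the same four centered values to the log, apart from floating-point rounding, which is the sense in which the transformation retains shape and discards size.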

   /**********************************************************/
   /*                                                        */
   /*    Analysis 2: standardized row-centered logarithms    */
   /*                                                        */
   /**********************************************************/

   title2 'Row-centered logarithms';
   data shape;
      set grocery;
      array x height width depth weight;
      array l l_height l_width l_depth l_weight; 
                             /* logarithms */
      weight=weight**(1/3);  /* take cube root to conform with
                                the other linear measurements */
      do over l;             /* take logarithms */
         l=log(x);
      end;
      mean=mean( of l(*));   /* find row mean of logarithms */
      do over l;
         l=l-mean;           /* center row */
      end;
   run;

   title2 'Analysis 2: Standardized row-centered logarithms';
   proc standard data=shape out=shapstan m=0 s=1;
      var l_height l_width l_depth l_weight;
   run;


   proc cluster data=shapstan m=cen %clusopt outtree=tree;
      var l_height l_width l_depth l_weight;
      id name;
      copy class height width depth weight color;
   run;

   %show(8);

The results of the second analysis are shown for eight clusters. Clusters 1 through 4 correspond fairly well to tea, toothpaste, breakfast cereals, and detergents. Crackers and Little Debbie products are scattered among several clusters.

Output 23.6.2: Analysis of Standardized Row-Centered Logarithms

Cluster Analysis of Grocery Boxes
Analysis 2: Standardized row-centered logarithms

The CLUSTER Procedure
Centroid Hierarchical Cluster Analysis

Eigenvalues of the Covariance Matrix
     Eigenvalue    Difference    Proportion    Cumulative
1    1.94931049    0.34845395        0.4873        0.4873
2    1.60085654    1.15102358        0.4002        0.8875
3    0.44983296    0.44983296        0.1125        1.0000
4    0.00000000                      0.0000        1.0000

Root-Mean-Square Total-Sample Standard Deviation = 1

Root-Mean-Square Distance Between Observations = 2.828427

Cluster History
NCL  Clusters Joined                     FREQ   SPRSQ   RSQ  ERSQ   CCC   PSF  PST2  Norm Cent Dist  Tie
 20  CL29              All-Bran             4  0.0017  .977     .     .  94.7   2.9  0.2658
 19  CL26              CL27                 8  0.0045  .972     .     .  85.4   8.4  0.3047
 18  Fudge Rounds      Crest                2  0.0016  .971     .     .  87.2     .  0.3193
 17  Fudge Brownies    Snack Cakes          2  0.0018  .969     .     .  89.1     .  0.3331
 16  Arm & Hammer Det  Lipton Loose Tea     2  0.0019  .967     .     .  91.3     .  0.3434
 15  CL23              CL18                 5  0.0050  .962     .     .  86.5   4.8  0.3587
 14  CL37              CL21                 5  0.0051  .957     .     .  83.5  10.4  0.3613
 13  CL30              CL24                 9  0.0068  .950     .     .  79.2  12.9  0.3682
 12  CL32              CL20                16  0.0142  .936  .892  5.75  67.6  29.3  0.3826
 11  CL22              Apple Delights       4  0.0037  .932  .881  6.31  71.4   3.2  0.3901
 10  CL11              CL31                 7  0.0090  .923  .869  6.17  70.8   6.3  0.4032
  9  CL33              CL13                11  0.0092  .914  .853  6.25  71.7   7.6  0.4181
  8  CL19              CL16                10  0.0131  .901  .835  6.12  71.4  10.9  0.503
  7  CL14              CL9                 16  0.0297  .871  .813  4.63  63.1  15.6  0.5173
  6  CL10              CL15                12  0.0329  .838  .785  3.69  59.1  13.6  0.5916
  5  CL6               CL28                19  0.0557  .783  .748  2.01  52.2  15.8  0.6252
  4  CL12              CL8                 26  0.0885  .694  .697  -.16  44.6  48.8  0.6679
  3  CL5               CL17                21  0.0459  .648  .617  1.21  55.3   7.4  0.8863
  2  CL4               CL7                 42  0.2841  .364  .384  -.56  34.9  60.3  0.9429
  1  CL2               CL3                 63  0.3640  .000  .000  0.00     .  34.9  0.8978


         class
CLUSTER  Breakfast cereal          Crackers         Detergent     Little Debbie      Paste, Tooth               Tea
      1                                   #                                                              ##########
      2                                                                                   #######
      3    ##############                ##
      4                 #                            ########                                                     #
      5                                                                      ##                 #                ##
      6                 #                                                                                      ####
      7                                  ##                               #####
      8                                                                      ##


The third analysis is similar to the second analysis except that the rows are standardized rather than just centered. There is a clear indication of seven clusters from the CCC, pseudo F, and pseudo t² statistics. The clusters are listed as well as crosstabulated with the true categories and colors.

   /**********************************************************/
   /*                                                        */
   /*  Analysis 3: standardized row-standardized logarithms  */
   /*                                                        */
   /**********************************************************/

   %let list=1;
   %let crosscol=1;

   title2 'Row-standardized logarithms';
   data std;
      set grocery;
      array x height width depth weight;
      array l l_height l_width l_depth l_weight; 
                            /* logarithms */
      weight=weight**(1/3); /* take cube root to conform with
                               the other linear measurements */
      do over l;
         l=log(x);          /* take logarithms */
      end;
      mean=mean( of l(*));  /* find row mean of logarithms */
      std=std( of l(*));    /* find row standard deviation */
      do over l;
         l=(l-mean)/std;    /* standardize row */
      end;
   run;

   title2 'Analysis 3: Standardized row-standardized logarithms';
   proc standard data=std out=stdstan m=0 s=1;
      var l_height l_width l_depth l_weight;
   run;

   proc cluster data=stdstan m=cen %clusopt outtree=tree;
      var l_height l_width l_depth l_weight;
      id name;
      copy class height width depth weight color;
   run;

   %show(7);

The output from the third analysis shows that cluster 1 contains 9 of the 17 teas. Cluster 2 contains all of the detergents plus Grape Nuts, a very heavy cereal. Cluster 3 includes all of the toothpastes and one Little Debbie product that is of very similar shape, although roughly twice as large. Cluster 4 has most of the cereals, Ritz crackers (which come in a box very similar to most of the cereal boxes), and Lipton Loose Tea (all the other teas in the sample come in tea bags). Clusters 5 and 6 each contain several Luzianne and Lipton teas and one or two miscellaneous items. Cluster 7 includes most of the Little Debbie products and two types of crackers. Thus, the crackers are not identified and the teas are broken up into three clusters, but the other categories correspond to single clusters. This analysis classifies toothpaste and Little Debbie products slightly better than the second analysis.

Output 23.6.3: Analysis of Standardized Row-Standardized Logarithms

Cluster Analysis of Grocery Boxes
Analysis 3: Standardized row-standardized logarithms

The CLUSTER Procedure
Centroid Hierarchical Cluster Analysis

Eigenvalues of the Covariance Matrix
     Eigenvalue    Difference    Proportion    Cumulative
1    2.42684848    0.94583675        0.6067        0.6067
2    1.48101173    1.38887193        0.3703        0.9770
3    0.09213980    0.09213980        0.0230        1.0000
4    -.00000000                     -0.0000        1.0000

Root-Mean-Square Total-Sample Standard Deviation = 1

Root-Mean-Square Distance Between Observations = 2.828427

Cluster History
NCL  Clusters Joined                     FREQ   SPRSQ   RSQ  ERSQ   CCC   PSF  PST2  Norm Cent Dist  Tie
 20  CL35              CL33                 8  0.0024  .990     .     .   229  32.0  0.1923
 19  CL22              Ritz                 5  0.0010  .989     .     .   224   2.9  0.2014
 18  CL44              CL27                 6  0.0018  .987     .     .   206  20.5  0.2073
 17  CL18              CL26                 9  0.0025  .985     .     .   187   6.4  0.1956
 16  Fudge Rounds      Crest                2  0.0009  .984     .     .   192     .  0.24
 15  CL24              CL23                 5  0.0029  .981     .     .   177   7.8  0.2753
 14  CL25              Waverly Wafers       4  0.0021  .979     .     .   175   7.7  0.2917
 13  CL30              CL19                17  0.0101  .969     .     .   130  41.0  0.2974
 12  CL16              CL31                 9  0.0049  .964  .932  5.49   124  20.5  0.3121
 11  CL21              Lipton Family Si     4  0.0029  .961  .924  5.81   129   8.2  0.3445
 10  CL41              CL11                 6  0.0045  .957  .915  5.94   130   5.0  0.323
  9  CL29              Lipton Tea Bags      4  0.0031  .953  .904  6.52   138  20.3  0.3603
  8  CL14              CL15                 9  0.0101  .943  .890  6.08   131  10.7  0.3761
  7  CL20              Lipton Family Si     9  0.0047  .939  .872  6.89   143  11.7  0.4063
  6  CL13              CL9                 21  0.0272  .911  .848  5.23   117  30.0  0.5101
  5  CL6               CL17                30  0.0746  .837  .814  1.30  74.3  42.2  0.606
  4  CL10              CL7                 15  0.0440  .793  .764  1.40  75.3  36.4  0.6152
  3  CL8               CL12                18  0.0642  .729  .681  2.02  80.6  44.0  0.6648
  2  CL3               CL4                 33  0.2580  .471  .470  0.01  54.2  54.4  0.9887
  1  CL5               CL2                 63  0.4707  .000  .000  0.00     .  54.2  0.9636


CLUSTER=1 CLUSNAME=CL7

Obs  class             name              height  width  depth    weight  color
  1  Tea               Bigelow Plantati     7.7   13.4    6.9   3.27107  g
  2  Tea               Bigelow Earl Gre     7.7   13.4    6.9   3.27107  b
  3  Tea               Celestial Saeson     7.8   13.8    6.3   3.65931  b
  4  Tea               Celestial Saeson     7.8   13.8    6.3   3.58305  r
  5  Tea               Bigelow Lemon Li     7.7   13.4    6.9   3.41995  y
  6  Tea               Celestial Saeson     7.8   13.8    6.3   3.82586  y
  7  Tea               Celestial Saeson     7.8   13.8    6.3   3.33222  g
  8  Tea               Lipton Tea Bags      6.7   10.0    5.7   3.28271  r
  9  Tea               Lipton Family Si     8.9   11.1    8.2   4.39510  r

CLUSTER=2 CLUSNAME=CL17

Obs  class             name              height  width  depth    weight  color
 10  Detergent         Tide                26.5   19.9    6.3   10.5928  r
 11  Detergent         Tide                19.3   14.6    4.7    7.8357  r
 12  Detergent         Tide                32.5   23.2    7.3   12.6889  r
 13  Breakfast cereal  Grape Nuts          21.7   16.3    4.9    8.7937  w
 14  Detergent         Arm & Hammer Det    33.7   22.8    7.0   14.7023  y
 15  Detergent         Arm & Hammer Det    27.8   19.4    6.3   12.2003  y
 16  Detergent         Arm & Hammer Det    38.8   30.0   16.9   22.4732  y
 17  Detergent         Tide                39.4   24.8   11.3   16.1045  r
 18  Detergent         Arm & Hammer Det    39.5   25.8   11.0   18.6115  y

CLUSTER=3 CLUSNAME=CL12

Obs  class             name              height  width  depth    weight  color
 19  Paste, Tooth      Colgate              3.6   15.6    3.3   4.39510  r
 20  Paste, Tooth      Crest                3.5   15.2    3.2   4.24343  w
 21  Paste, Tooth      Crest                4.3   17.4    3.6   5.06813  w
 22  Paste, Tooth      Arm & Hammer         4.4   17.0    3.7   5.21097  w
 23  Paste, Tooth      Colgate              4.2   18.3    3.5   5.21097  r
 24  Paste, Tooth      Crest                4.3   21.7    3.7   5.65790  w
 25  Paste, Tooth      Colgate              4.4   22.0    3.5   5.82946  r
 26  Little Debbie     Fudge Rounds         8.1   28.3    5.4   6.45411  w
 27  Paste, Tooth      Crest                3.0   10.9    2.8   2.88670  w

CLUSTER=4 CLUSNAME=CL13

Obs  class             name              height  width  depth    weight  color
 28  Breakfast cereal  Cheerios            27.5   19.0    6.2   6.56541  y
 29  Breakfast cereal  Froot Loops         25.0   17.7    6.4   6.77735  r
 30  Breakfast cereal  Special K           30.1   20.5    8.5   7.98644  w
 31  Breakfast cereal  Corn Flakes         30.2   20.6    8.4   7.98644  w
 32  Breakfast cereal  Special K           29.6   19.2    6.7   6.97679  w
 33  Breakfast cereal  Corn Flakes         30.0   19.1    6.6   6.97679  w
 34  Breakfast cereal  Froot Loops         30.2   20.8    8.5   8.23034  r
 35  Breakfast cereal  Cheerios            30.3   20.4    7.2   7.51847  y
 36  Breakfast cereal  Cheerios            24.1   17.2    5.3   5.82848  y
 37  Breakfast cereal  Corn Flakes         33.7   25.4    8.0   8.79021  w
 38  Breakfast cereal  Special K           23.4   16.6    5.7   5.82946  w
 39  Breakfast cereal  Cheerios            32.5   22.4    8.4   8.27677  y
 40  Breakfast cereal  Shredded Wheat,     26.6   19.6    5.6   7.98957  r
 41  Crackers          Ritz                23.1   16.0    5.3   6.97953  r
 42  Breakfast cereal  All-Bran            21.1   14.3    5.2   7.30951  y
 43  Tea               Lipton Loose Tea    12.7   10.9    5.4   6.09479  r
 44  Crackers          Ritz                23.1   20.7    5.2   7.68573  r

CLUSTER=5 CLUSNAME=CL10

Obs  class             name              height  width  depth    weight  color
 45  Tea               Luzianne             8.9   22.8    6.4   5.53748  r
 46  Tea               Luzianne Decaffe     8.9   22.8    6.4   5.29641  g
 47  Crackers          Premium Saltines    11.0   25.0   10.7   7.68573  w
 48  Tea               Lipton Family Si     8.7   20.8    8.2   5.53748  r
 49  Little Debbie     Marshmallow Supr     9.4   32.0    7.0   6.56541  w
 50  Tea               Lipton Family Si    13.7   24.0    9.0   6.97679  r

CLUSTER=6 CLUSNAME=CL9

Obs  class             name              height  width  depth    weight  color
 51  Tea               Luzianne            18.4   20.2    6.9   6.09479  r
 52  Tea               Lipton Tea Bags     17.1   20.0    6.7   6.09479  r
 53  Breakfast cereal  Shredded Wheat      19.7   19.9    7.5   6.56541  y
 54  Tea               Lipton Tea Bags     11.5   14.4    6.6   4.73448  r

CLUSTER=7 CLUSNAME=CL8

Obs  class             name              height  width  depth    weight  color
 55  Crackers          Wheatsworth         11.1   25.2    5.5   6.88239  w
 56  Little Debbie     Swiss Cake Rolls    10.1   21.8    5.8   7.16545  w
 57  Little Debbie     Figaroos            13.5   18.6    3.7   6.97679  y
 58  Little Debbie     Nutty Bar           13.2   18.5    4.2   6.97679  y
 59  Little Debbie     Apple Delights      11.2   30.1    4.9   7.51552  w
 60  Little Debbie     Lemon Stix          13.2   18.5    4.2   6.33884  w
 61  Little Debbie     Fudge Brownies      11.0   30.8    2.5   6.97679  w
 62  Little Debbie     Snack Cakes         13.4   32.0    3.4   7.16545  b
 63  Crackers          Waverly Wafers      14.4   22.5    6.2   7.68573  g



         class
CLUSTER  Breakfast cereal          Crackers         Detergent     Little Debbie      Paste, Tooth               Tea
      1                                                                                                   #########
      2                 #                            ########
      3                                                                       #          ########
      4    ##############                ##                                                                       #
      5                                   #                                   #                                ####
      6                 #                                                                                       ###
      7                                  ##                             #######


         color
CLUSTER              Blue             Green               Red             White            Yellow
      1                ##                ##               ###                                  ##
      2                                                  ####                 #              ####
      3                                                   ###            ######
      4                                                ######            ######             #####
      5                                   #               ###                ##
      6                                                   ###                                   #
      7                 #                 #                               #####                ##


The last several analyses include color. Obviously, the dummy variables must not be included in calculations to standardize the rows. If the five dummy variables are simply standardized to variance 1.0 and included with the other variables, color dominates the analysis. The dummy variables should be scaled to a smaller variance, which must be determined by trial and error. Four analyses are done using PROC STANDARD to scale the dummy variables to a standard deviation of 0.2, 0.3, 0.4, or 0.8. The cluster listings are suppressed.
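
The effect of the S= value can be gauged with a little arithmetic. A 0/1 dummy variable that equals 1 for m of the n boxes has a sample variance of m(n-m)/(n(n-1)), so after PROC STANDARD rescales it to standard deviation s, two boxes that differ on that dummy are separated by s divided by the original standard deviation. Two boxes of different colors differ on exactly two dummies. The following DATA step is a sketch of that calculation for a red box and a white box; the counts of 22 red and 20 white boxes among the 63 are tallied from the DATALINES shown earlier, and the variable names are illustrative.

```sas
   data _null_;
      n = 63;                                    /* boxes in the sample   */
      sd_red   = sqrt(22*(n-22) / (n*(n-1)));    /* 22 red boxes          */
      sd_white = sqrt(20*(n-20) / (n*(n-1)));    /* 20 white boxes        */
      do s = .2, .3, .4, .8;
         /* squared Euclidean distance contributed by color alone */
         d2 = (s/sd_red)**2 + (s/sd_white)**2;
         d  = sqrt(d2);
         put s= d2= 8.3 d= 8.3;
      end;
   run;
```

Because the contribution grows with the square of s, going from S=0.2 to S=0.8 multiplies the color component of the squared distance by 16, while the four shape variables keep unit variance.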

Since dummy variables drastically violate the normality assumption on which the CCC depends, the CCC tends to indicate an excessively large number of clusters.

   /************************************************************/
   /*                                                          */
   /* Analyses 4-7: standardized row-standardized logs & color */
   /*                                                          */
   /************************************************************/
   %let list=0;
   %let crosscol=1;

   title2
     'Analysis 4: Standardized row-standardized logarithms and color (s=.2)';
   proc standard data=stdstan out=stdstan m=0 s=.2;
      var c_:;
   run;

   proc cluster data=stdstan m=cen %clusopt outtree=tree;
      var l_height l_width l_depth l_weight c_:;
      id name;
      copy class height width depth weight color;
   run;

   %show(7);

   title2
     'Analysis 5: Standardized row-standardized logarithms and color (s=.3)';
   proc standard data=stdstan out=stdstan m=0 s=.3;
      var c_:;
   run;

   proc cluster data=stdstan m=cen %clusopt outtree=tree;
      var l_height l_width l_depth l_weight c_:;
      id name;
      copy class height width depth weight color;
   run;

   %show(6);

   title2
     'Analysis 6: Standardized row-standardized logarithms and color (s=.4)';
   proc standard data=stdstan out=stdstan m=0 s=.4;
      var c_:;
   run;

   proc cluster data=stdstan m=cen %clusopt outtree=tree;
      var l_height l_width l_depth l_weight c_:;
      id name;
      copy class height width depth weight color;
   run;

   %show(3);

   title2
     'Analysis 7: Standardized row-standardized logarithms and color (s=.8)';
   proc standard data=stdstan out=stdstan m=0 s=.8;
      var c_:;
   run;

   proc cluster data=stdstan m=cen %clusopt outtree=tree;
      var l_height l_width l_depth l_weight c_:;
      id name;
      copy class height width depth weight color;
   run;

   %show(10);
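
Since the four analyses differ only in the analysis number, the S= value, and the number of clusters shown, the repeated steps could also be wrapped in a macro. The following is one possible sketch; the macro name COLORAN and its parameter names are hypothetical and do not appear in the original example. It relies on the %CLUSOPT and %SHOW macros and the STDSTAN data set defined earlier.

```sas
   %macro coloran(num, s, n);   /* hypothetical helper macro */
      /* num = analysis number, s = S= value, n = clusters to show */
      title2 "Analysis &num: Standardized row-standardized logarithms and color (s=&s)";

      proc standard data=stdstan out=stdstan m=0 s=&s;
         var c_:;               /* rescale only the color dummies */
      run;

      proc cluster data=stdstan m=cen %clusopt outtree=tree;
         var l_height l_width l_depth l_weight c_:;
         id name;
         copy class height width depth weight color;
      run;

      %show(&n);
   %mend;

   %coloran(4, .2, 7);
   %coloran(5, .3, 6);
   %coloran(6, .4, 3);
   %coloran(7, .8, 10);
```

The title uses double quotation marks so that the macro variables are resolved inside the string.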

Using PROC STANDARD on the dummy variables with S=0.2 causes four of the Little Debbie products to join the toothpastes. Using S=0.3 causes one of the tea clusters to merge with the breakfast cereals while three cereals defect to the detergents. Using S=0.4 produces three clusters consisting of (1) cereals and detergents, (2) Little Debbie products and toothpaste, and (3) teas, with crackers divided among all three clusters and a few other misclassifications. With S=0.8, ten clusters are indicated, each entirely monochrome. So, S=0.2 or S=0.3 degrades the classification, S=0.4 yields a good but perhaps excessively coarse classification, and higher values of the S= option produce clusters that are determined mainly by color.

Output 23.6.4: Analysis of Standardized Row-Standardized Logarithms and Color

Cluster Analysis of Grocery Boxes
Analysis 4: Standardized row-standardized logarithms and color (s=.2)

The CLUSTER Procedure
Centroid Hierarchical Cluster Analysis

Eigenvalues of the Covariance Matrix
     Eigenvalue    Difference    Proportion    Cumulative
1    2.43584975    0.94791932        0.5800        0.5800
2    1.48793042    1.39363531        0.3543        0.9342
3    0.09429511    0.03686218        0.0225        0.9567
4    0.05743293    0.01036136        0.0137        0.9704
5    0.04707157    0.00489503        0.0112        0.9816
6    0.04217654    0.00693298        0.0100        0.9916
7    0.03524355    0.03524355        0.0084        1.0000
8    -.00000000    0.00000000       -0.0000        1.0000
9    -.00000000                     -0.0000        1.0000

Root-Mean-Square Total-Sample Standard Deviation = 0.68313

Root-Mean-Square Distance Between Observations = 2.898275

Cluster History
NCL Clusters Joined FREQ SPRSQ RSQ ERSQ CCC PSF PST2 Norm Cent Dist Tie
20 CL46 Lemon Stix 3 0.0016 .968 . . 67.5 11.9 0.2706  
19 Luzianne Lipton Family Si 2 0.0014 .966 . . 69.7 . 0.2995  
18 CL25 CL37 6 0.0041 .962 . . 67.1 5.0 0.3081  
17 CL33 CL35 16 0.0099 .952 . . 57.2 16.7 0.3196  
16 CL19 Luzianne Decaffe 3 0.0024 .950 . . 59.2 1.7 0.3357  
15 CL30 CL16 5 0.0042 .946 . . 59.5 2.7 0.3299  
14 CL27 CL18 8 0.0057 .940 . . 58.9 4.2 0.3429  
13 CL20 Fudge Brownies 4 0.0031 .937 . . 61.7 3.6 0.3564  
12 CL24 Lipton Tea Bags 4 0.0031 .934 .905 3.23 65.2 4.7 0.359  
11 CL39 CL28 6 0.0068 .927 .896 3.17 65.9 12.1 0.3743  
10 CL13 Snack Cakes 5 0.0036 .923 .886 3.62 70.8 2.3 0.3755  
9 CL11 CL32 13 0.0176 .906 .874 2.70 64.8 16.0 0.4107  
8 CL14 Lipton Family Si 9 0.0052 .900 .859 3.29 71.0 2.6 0.4265  
7 Waverly Wafers CL10 6 0.0052 .895 .841 4.09 79.8 2.4 0.4378  
6 CL17 CL12 20 0.0248 .870 .817 3.52 76.6 19.7 0.4898  
5 CL15 CL8 14 0.0326 .838 .783 3.08 75.0 14.0 0.5607  
4 CL6 CL21 30 0.0743 .764 .734 1.35 63.5 35.6 0.5877  
3 CL9 CL7 19 0.0579 .706 .653 2.17 72.0 22.8 0.6611  
2 CL4 CL3 49 0.3632 .343 .450 -2.6 31.8 73.0 0.9838  
1 CL2 CL5 63 0.3426 .000 .000 0.00 . 31.8 0.9876  


class
CLUSTER  Breakfast cereal  Crackers  Detergent  Little Debbie  Paste, Tooth  Tea
1        ##                          ########
2                          #                    ####           ########
3        #############     ##                                                #
4        #                                                                   ###
5                          #                    #####
6                                                                            #########
7                          #                                                 ####


color
CLUSTER  Blue  Green  Red         White          Yellow
1                     ####        #              #####
2                     ###         ##########
3                     ######      ######         ####
4                     ###                        #
5        #     #                  ##             ##
6        ##    ##     ###                        ##
7              #      ###         #


Cluster Analysis of Grocery Boxes
Analysis 5: Standardized row-standardized logarithms and color (s=.3)

The CLUSTER Procedure
Centroid Hierarchical Cluster Analysis

Eigenvalues of the Covariance Matrix
  Eigenvalue Difference Proportion Cumulative
1 2.44752302 0.95026671 0.5500 0.5500
2 1.49725632 1.36701945 0.3365 0.8865
3 0.13023687 0.02135049 0.0293 0.9157
4 0.10888637 0.00867367 0.0245 0.9402
5 0.10021271 0.00628821 0.0225 0.9627
6 0.09392449 0.02196469 0.0211 0.9838
7 0.07195981 0.07195981 0.0162 1.0000
8 0.00000000 0.00000000 0.0000 1.0000
9 -.00000000   -0.0000 1.0000

Root-Mean-Square Total-Sample Standard Deviation = 0.703167

Root-Mean-Square Distance Between Observations = 2.983287

Cluster History
NCL Clusters Joined FREQ SPRSQ RSQ ERSQ CCC PSF PST2 Norm Cent Dist Tie
20 CL24 CL28 4 0.0038 .953 . . 45.7 2.7 0.3448  
19 Grape Nuts CL23 6 0.0033 .950 . . 46.0 3.5 0.3477  
18 CL46 Lemon Stix 3 0.0027 .947 . . 47.1 21.9 0.3558  
17 CL21 Lipton Tea Bags 4 0.0031 .944 . . 48.2 2.5 0.3577  
16 CL39 CL33 6 0.0064 .937 . . 46.9 12.1 0.3637  
15 CL19 CL29 14 0.0152 .922 . . 40.6 12.4 0.3707  
14 CL18 Fudge Brownies 4 0.0035 .919 . . 42.5 2.5 0.3813  
13 CL16 CL25 13 0.0175 .901 . . 38.0 13.7 0.4103  
12 CL22 Lipton Family Si 5 0.0049 .896 .875 1.76 40.0 3.2 0.4353  
11 CL12 CL37 7 0.0089 .887 .865 1.71 40.9 4.6 0.4397  
10 CL20 Luzianne Decaffe 5 0.0056 .882 .854 2.02 43.9 2.5 0.4669  
9 CL26 CL17 16 0.0222 .859 .841 1.20 41.3 16.6 0.479  
8 CL32 CL11 9 0.0125 .847 .826 1.31 43.5 4.5 0.4988  
7 CL14 Snack Cakes 5 0.0070 .840 .806 1.95 49.0 3.3 0.519  
6 Waverly Wafers CL7 6 0.0077 .832 .782 2.79 56.6 2.3 0.5366  
5 CL9 CL15 30 0.0716 .761 .749 0.54 46.1 28.3 0.5452  
4 CL10 CL8 14 0.0318 .729 .700 1.21 52.9 8.6 0.5542  
3 CL5 CL6 36 0.0685 .660 .622 1.50 58.3 14.2 0.6516  
2 CL13 CL4 27 0.2008 .460 .427 0.90 51.9 46.6 0.9611  
1 CL3 CL2 63 0.4595 .000 .000 0.00 . 51.9 0.9609  


class
CLUSTER  Breakfast cereal  Crackers  Detergent  Little Debbie  Paste, Tooth  Tea
1        ###               ##        ########                                #
2                          #                    ####           ########
3        #############                                                       ###
4                          #                    #####
5                                                                            #########
6                          #                                                 ####


color
CLUSTER  Blue  Green  Red         White          Yellow
1                     ########    #              #####
2                     ###         ##########
3                     #####       ######         #####
4        #     #                  ##             ##
5        ##    ##     ###                        ##
6              #      ###         #


Cluster Analysis of Grocery Boxes
Analysis 6: Standardized row-standardized logarithms and color (s=.4)

The CLUSTER Procedure
Centroid Hierarchical Cluster Analysis

Eigenvalues of the Covariance Matrix
  Eigenvalue Difference Proportion Cumulative
1 2.46469435 0.95296119 0.5135 0.5135
2 1.51173316 1.28149311 0.3149 0.8284
3 0.23024005 0.04306536 0.0480 0.8764
4 0.18717469 0.01766446 0.0390 0.9154
5 0.16951023 0.01827481 0.0353 0.9507
6 0.15123542 0.06582379 0.0315 0.9822
7 0.08541162 0.08541162 0.0178 1.0000
8 -.00000000 0.00000000 -0.0000 1.0000
9 -.00000000   -0.0000 1.0000

Root-Mean-Square Total-Sample Standard Deviation = 0.730297

Root-Mean-Square Distance Between Observations = 3.098387

Cluster History
NCL Clusters Joined FREQ SPRSQ RSQ ERSQ CCC PSF PST2 Norm Cent Dist Tie
20 CL29 CL44 10 0.0074 .955 . . 47.7 8.2 0.3789  
19 CL38 Lipton Family Si 3 0.0031 .952 . . 48.1 9.3 0.3792  
18 CL25 CL41 11 0.0155 .936 . . 38.8 36.7 0.4192  
17 CL23 CL43 10 0.0120 .924 . . 35.0 11.6 0.4208  
16 Grape Nuts CL26 6 0.0050 .919 . . 35.6 5.8 0.4321  
15 CL19 CL31 5 0.0074 .912 . . 35.4 5.3 0.4362  
14 Premium Saltines CL27 4 0.0046 .907 . . 36.8 2.9 0.4374  
13 CL18 CL20 21 0.0352 .872 . . 28.4 19.7 0.4562  
12 CL13 CL16 27 0.0372 .835 .839 -.37 23.4 12.0 0.4968  
11 CL21 CL17 15 0.0289 .806 .828 -1.5 21.6 13.6 0.5183  
10 CL14 CL15 9 0.0200 .786 .815 -1.8 21.6 7.2 0.5281  
9 Waverly Wafers Luzianne Decaffe 2 0.0047 .781 .801 -1.2 24.1 . 0.5425  
8 CL10 CL24 12 0.0243 .757 .785 -1.3 24.5 5.8 0.5783  
7 CL12 CL46 29 0.0224 .735 .765 -1.3 25.8 5.3 0.6105  
6 CL8 CL37 14 0.0220 .712 .740 -1.1 28.3 4.0 0.6313  
5 CL6 CL32 16 0.0251 .687 .707 -.78 31.9 3.9 0.6664  
4 CL11 CL9 17 0.0287 .659 .660 -.04 38.0 7.0 0.7098  
3 CL4 Snack Cakes 18 0.0180 .641 .584 2.21 53.5 3.2 0.7678  
2 CL3 CL5 34 0.2175 .423 .400 0.67 44.8 31.4 0.8923  
1 CL7 CL2 63 0.4232 .000 .000 0.00 . 44.8 0.9156  


class
CLUSTER  Breakfast cereal  Crackers  Detergent  Little Debbie  Paste, Tooth  Tea
1        >##############   ##        ########   ##                           #
2                          ##                   #######        ########      #
3                          #                                                 >##############


color
CLUSTER  Blue  Green  Red         White          Yellow
1                     ##########  #######        ############
2        #     ##     ###         ############
3        ##    ##     #########   #              ##


Cluster Analysis of Grocery Boxes
Analysis 7: Standardized row-standardized logarithms and color (s=.8)

The CLUSTER Procedure
Centroid Hierarchical Cluster Analysis

Eigenvalues of the Covariance Matrix
  Eigenvalue Difference Proportion Cumulative
1 2.61400794 0.93268930 0.3631 0.3631
2 1.68131864 0.77645948 0.2335 0.5966
3 0.90485916 0.22547234 0.1257 0.7222
4 0.67938683 0.00292216 0.0944 0.8166
5 0.67646466 0.12119211 0.0940 0.9106
6 0.55527255 0.46658428 0.0771 0.9877
7 0.08868827 0.08868827 0.0123 1.0000
8 -.00000000 0.00000000 -0.0000 1.0000
9 -.00000000   -0.0000 1.0000

Root-Mean-Square Total-Sample Standard Deviation = 0.894427

Root-Mean-Square Distance Between Observations = 3.794733

Cluster History
NCL Clusters Joined FREQ SPRSQ RSQ ERSQ CCC PSF PST2 Norm Cent Dist Tie
20 CL29 CL44 10 0.0049 .970 . . 72.7 8.2 0.3094  
19 CL38 Lipton Family Si 3 0.0021 .968 . . 73.3 9.3 0.3096  
18 CL21 CL23 12 0.0153 .952 . . 53.0 15.0 0.4029  
17 Waverly Wafers Luzianne Decaffe 2 0.0032 .949 . . 53.8 . 0.443  
16 CL27 CL24 6 0.0095 .940 . . 48.9 10.4 0.444  
15 CL19 CL16 9 0.0136 .926 . . 43.0 6.1 0.4587  
14 CL41 Grape Nuts 7 0.0058 .920 . . 43.6 51.2 0.4591  
13 CL26 CL46 7 0.0105 .910 . . 42.1 22.0 0.4769  
12 CL25 CL13 12 0.0205 .889 .743 16.5 37.3 13.8 0.467  
11 CL18 Premium Saltines 13 0.0093 .880 .726 16.7 38.2 4.0 0.5586  
10 CL17 CL37 4 0.0134 .867 .706 16.5 38.3 7.9 0.6454  
9 CL14 CL20 17 0.0567 .810 .684 11.0 28.8 52.6 0.6534  
8 CL12 CL9 29 0.0828 .727 .659 5.03 20.9 20.7 0.604  
7 CL11 CL43 16 0.0359 .691 .631 4.25 20.9 14.4 0.6758  
6 CL15 CL31 11 0.0263 .665 .598 4.24 22.6 8.0 0.7065  
5 CL7 CL6 27 0.1430 .522 .557 -1.7 15.8 28.2 0.8247  
4 CL8 CL5 56 0.2692 .253 .507 -9.1 6.6 31.5 0.7726  
3 Snack Cakes CL32 3 0.0216 .231 .435 -6.6 9.0 46.0 1.0027  
2 CL4 CL10 60 0.1228 .108 .289 -5.6 7.4 9.5 1.0096  
1 CL2 CL3 63 0.1083 .000 .000 0.00 . 7.4 1.0839  


class
CLUSTER  Breakfast cereal  Crackers  Detergent  Little Debbie  Paste, Tooth  Tea
1        ###               ##        ####                                    #
2                          ##                   ######         #####
3        #######
4        ######                      ####       ##
5                                                              ###
6                                                                            #########
7                          #                                                 ###
8                                                                            ##
9                                                                            ##
10                                              #


color
CLUSTER  Blue  Green  Red         White          Yellow
1                     ##########
2                                 #############
3                                 #######
4                                                ############
5                     ###
6                     #########
7              ####
8        ##
9                                                ##
10       #


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.