Chapter Contents


SAS/ACCESS Software for Relational Databases: Reference

Calculating Statistics

You can also use SAS statistical procedures on DBMS data. This section shows examples using the MEANS and RANK procedures. See Using DBMS Data in Version 7 and Version 8 for an example of the FREQ procedure.

Using the MEANS Procedure

In your analysis of recent orders, suppose you also want to calculate some statistics for each U.S. customer. From the DB2 table SASDEMO.ORDERS, the view descriptor VLIB.USAORDR selects a subset of observations that have a SHIPTO value beginning with a 1, indicating a U.S. customer. The observations are also ordered by the SHIPTO variable.

The following example generates the means and sums of the length of material ordered (in yards) and the fabric charges (in dollars) for each U.S. customer. Also included are the number of observations (N) and the number of missing values (NMISS). The MAXDEC= option specifies the number of decimal places (0-8) for PROC MEANS to use in printing the results.

proc access dbms=db2;
/* create access descriptor    */
    create adlib.order.access;
    rename dateorderd = dateord
           processdby = procesby;
    format dateorderd  date9.
           shipped     date9.
           ordernum    5.0
           length      4.0
           stocknum    4.0
           takenby     6.0
           processdby  6.0
           fabcharges  12.2;
    list all;

/* create vlib.usaordr view  */
    create vlib.usaordr.view;
    select ordernum stocknum length
           fabcharges shipto;
    subset where shipto like '1%';

/* example             */
proc means data=vlib.usaordr mean
     sum n nmiss maxdec=0;
   by shipto;
   var length fabcharg;
title 'Data Described by VLIB.USAORDR';

Statistics on Fabric Length and Charges for Each U.S. Customer shows the output for this example.

Statistics on Fabric Length and Charges for Each U.S. Customer
               Data Described by VLIB.USAORDER

---------------------- SHIPTO=14324742 -----------------------   

 Variable  Label               Mean           Sum   N  Nmiss
 LENGTH    LENGTH              1095          4380   4      0
 FABCHARG  FABCHARGES       1934460       3868920   2      2

---------------------- SHIPTO=14898029 -----------------------

 Variable  Label               Mean           Sum   N  Nmiss
 LENGTH    LENGTH              2500          5000   2      0
 FABCHARG  FABCHARGES       1400825       2801650   2      0

---------------------- SHIPTO=15432147 -----------------------

 Variable  Label               Mean           Sum   N  Nmiss
 LENGTH    LENGTH               725          2900   4      0
 FABCHARG  FABCHARGES        252149        504297   2      2

---------------------- SHIPTO=18543489 -----------------------

 Variable  Label               Mean           Sum   N  Nmiss
 LENGTH    LENGTH               303          1820   6      0
 FABCHARG  FABCHARGES      11063836      44255344   4      2

---------------------- SHIPTO=19783482 -----------------------

 Variable  Label               Mean           Sum   N  Nmiss
 LENGTH    LENGTH               450          1800   4      0
 FABCHARG  FABCHARGES        252149       1008594   4      0

---------------------- SHIPTO=19876078 -----------------------

 Variable  Label               Mean           Sum   N  Nmiss
 LENGTH    LENGTH               690          1380   2      0
 FABCHARG  FABCHARGES             .             .   0      2

The BY statement causes the interface view engine to generate a DBMS-specific SQL ORDER BY clause so that the data from this table is returned as if it were sorted.

Using the RANK Procedure

You can also use more advanced statistical procedures on DBMS data. The following example uses the RANK procedure to calculate the order of birthdays for a set of employees who are listed in the DB2 table SASDEMO.EMPLOYEES. The OUT= option creates a SAS data file, DLIB.RANKEXAM, from the view descriptor VLIB.EMPS so that the data in the SAS file can be sorted by the SORT procedure. The RANKS statement assigns the name DATERANK to the new variable (in the SAS data file) that is created by the procedure.

proc access dbms=db2;
/* create access descriptor  */
   create adlib.employ.access;
   drop salary;
   rename birthdate birthdat;
   list all;

/* create vlib.emps view     */
   create vlib.emps.view;
   select empid jobcode birthdate lastname;
   format birthdate date9.
          empid     6.0;
   subset where jobcode=602;

/* example          */
proc rank data=vlib.emps out=dlib.rankexam;
   var birthdat;
   ranks daterank;

proc sort data=dlib.rankexam;
   by lastname;

proc print data=dlib.rankexam(drop=jobcode);
title 'Order of Dept 602 Employee Birthdays';
The DROP= data set option is used on the PROC PRINT statement because the JOBCODE variable is not needed in the output. Ranking of Employee Birthdays shows the result of this example.

Ranking of Employee Birthdays
              Order of Dept 602 Employee Birthdays


    1    456910    24SEP1958     ARDIS                     5
    2    237642    13MAR1959     BATTERSBY                 6
    3    239185    28AUG1964     DOS REMEDIOS              7
    4    321783    03JUN1940     GONZALES                  2
    5    120591    12FEB1951     HAMMERSTEIN               4
    6    135673    21MAR1966     HEMESLY                   8
    7    456921    12MAY1967     KRAUSE                    9
    8    457232    15OCT1968     LOVELL                   11
    9    423286    31OCT1969     MIFUNE                   12
   10    216382    24JUL1968     PURINTON                 10
   11    234967    21DEC1972     SMITH                    13
   12    212916    29MAY1933     WACHBERGER                1
   13    119012    05JAN1951     WOLF-PROVENZA             3

Chapter Contents



Top of Page

Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.