
The UNIVARIATE Procedure |

The CIBASIC option produces the table of basic confidence measures, which includes the confidence limits for the mean, standard deviation, and variance. The CIPCTLDF and CIPCTLNORMAL options produce tables of confidence limits for the quantiles. The LOCCOUNT option produces the table that shows the number of values greater than, equal to, and less than the value of MU0=. The FREQ option produces the table of frequency counts. The NEXTRVAL= option produces the table with the frequencies of the extreme values. The NORMAL option produces the table with the tests for normality. The TRIMMED=, WINSORIZED=, and ROBUSTSCALE options produce tables with robust estimators.

The table of trimmed or Winsorized means includes the percentage and
the number of observations that are trimmed or Winsorized at each end, the
mean and standard error, confidence limits, and the Student's **t**
test. The table with robust measures of scale includes the interquartile range,
Gini's mean difference **G**, **MAD**, **Qn**, and **Sn**, with their
corresponding estimates of σ.
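The arithmetic behind some of these robust estimators can be sketched in a few lines of Python. This is only an illustration, not PROC UNIVARIATE's implementation: the function names and the sample data are hypothetical, the standard error, confidence limits, and t test are omitted, and the factors used to turn **G** and **MAD** into estimates of the standard deviation are the usual normal-theory constants, assumed here rather than taken from this page.

```python
from itertools import combinations
from math import pi, sqrt
from statistics import mean, median

# Hypothetical sample with one large outlier.
sample = [2, 3, 5, 6, 7, 8, 12, 40]


def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest observations."""
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must satisfy 0 <= k < n/2")
    ordered = sorted(values)
    return mean(ordered[k:len(ordered) - k])


def winsorized_mean(values, k):
    """Mean after replacing the k smallest and k largest observations
    with the nearest retained order statistic."""
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must satisfy 0 <= k < n/2")
    ordered = sorted(values)
    n = len(ordered)
    low, high = ordered[k], ordered[n - k - 1]
    return mean([low] * k + ordered[k:n - k] + [high] * k)


def gini_mean_difference(values):
    """Average absolute difference over all pairs of observations."""
    pairs = list(combinations(values, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def mad(values):
    """Median absolute deviation from the median."""
    center = median(values)
    return median(abs(v - center) for v in values)


# Assumed normal-theory consistency factors (not stated on this page).
sigma_from_g = gini_mean_difference(sample) * sqrt(pi) / 2
sigma_from_mad = 1.4826 * mad(sample)

print(trimmed_mean(sample, 1), winsorized_mean(sample, 1))
print(sigma_from_g, sigma_from_mad)
```

With one observation handled at each end, the outlier 40 no longer affects either mean: trimming drops it, and Winsorizing replaces it with 12.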

Missing Values |

- If a BY or an ID variable value is missing, PROC UNIVARIATE treats
it like any other BY or ID variable value. The missing values form a separate
BY group.
- If the FREQ variable value is missing or nonpositive, PROC UNIVARIATE
excludes the observation from the analysis.
- If the WEIGHT variable value is missing, PROC UNIVARIATE excludes
the observation from the analysis.

PROC UNIVARIATE tabulates the number of missing values and reports it in the procedure output. Before tabulating the missing values, PROC UNIVARIATE excludes observations when

- you use the FREQ statement and the frequencies are nonpositive
- you use the WEIGHT statement and the weights are missing or nonpositive (to exclude nonpositive weights, you must specify the EXCLNPWGT option).
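The exclusion rules above can be read as a filter that runs before missing values are counted. The following Python sketch is an illustration under stated assumptions, not the procedure's code: the record layout (`x`, `freq`, `weight`), the function names, and the keyword arguments are all hypothetical, `None` stands in for a SAS missing value, and each retained record is counted once regardless of its frequency.

```python
def keep_observation(obs, use_freq=False, use_weight=False, exclnpwgt=False):
    """Return True if the record survives the FREQ and WEIGHT exclusions."""
    if use_freq:
        freq = obs.get("freq")
        if freq is None or freq <= 0:
            return False  # missing or nonpositive frequency
    if use_weight:
        weight = obs.get("weight")
        if weight is None:
            return False  # missing weight
        if exclnpwgt and weight <= 0:
            return False  # nonpositive weight, dropped only on request
    return True


def count_missing(observations, var, **options):
    """Count missing analysis values among the retained records.

    Simplifying assumption: one count per record, ignoring frequencies.
    """
    kept = [o for o in observations if keep_observation(o, **options)]
    return sum(1 for o in kept if o.get(var) is None)


# Hypothetical records; None marks a missing value.
records = [
    {"x": 5.0, "freq": 2, "weight": 1.0},
    {"x": None, "freq": 1, "weight": 1.0},    # always counted as missing
    {"x": None, "freq": 0, "weight": 1.0},    # dropped when FREQ is used
    {"x": None, "freq": 1, "weight": None},   # dropped when WEIGHT is used
    {"x": None, "freq": 1, "weight": -2.0},   # dropped only with exclnpwgt
]

print(count_missing(records, "x"))
print(count_missing(records, "x", use_freq=True, use_weight=True))
print(count_missing(records, "x", use_freq=True, use_weight=True, exclnpwgt=True))
```

The three calls show how the reported count shrinks as more exclusion rules apply to the same input.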

Histograms |

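If you request a fitted parametric distribution in the HISTOGRAM statement, PROC UNIVARIATE creates a summary that is organized into the following tables:

- parameters for the fitted curve, estimated mean, and estimated standard deviation
- EDF goodness-of-fit tests
- histogram intervals
- quantiles.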

Output Data Set |

The output data set includes

- BY statement variables
- variables that contain statistics
- variables that contain percentiles.

The BY variables indicate which BY group each observation summarizes. When you omit a BY statement, the procedure computes statistics and percentiles by using all the observations in the input data set. When you use a BY statement, the procedure computes statistics and percentiles by using the observations within each BY group.
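The shape of the resulting data set, one summary observation overall or one per BY group, can be illustrated with a small Python sketch. Everything in it is hypothetical (the `summarize` function, the column names, the data), and it reports only a count, mean, and median; it makes no attempt to reproduce the procedure's percentile definitions. The sketch sorts the rows by the BY variable itself before grouping.

```python
from itertools import groupby
from statistics import mean, median


def summarize(rows, var, by=None):
    """Return one summary dict overall, or one per value of the BY column."""
    if by is None:
        groups = [(None, list(rows))]
    else:
        # groupby only merges adjacent rows, so sort by the BY column first.
        ordered = sorted(rows, key=lambda r: r[by])
        groups = [(key, list(members))
                  for key, members in groupby(ordered, key=lambda r: r[by])]

    output = []
    for key, members in groups:
        values = [r[var] for r in members]
        summary = {"n": len(values), "mean": mean(values), "median": median(values)}
        if by is not None:
            # The BY value identifies which group this observation summarizes.
            summary = {by: key, **summary}
        output.append(summary)
    return output


# Hypothetical input rows.
rows = [
    {"site": "A", "x": 1},
    {"site": "B", "x": 10},
    {"site": "A", "x": 3},
    {"site": "B", "x": 20},
    {"site": "B", "x": 30},
]

print(summarize(rows, "x"))             # one observation for all rows
print(summarize(rows, "x", by="site"))  # one observation per BY group
```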

OUTHISTOGRAM= Data Set |

The data set contains a group of observations for each variable that the HISTOGRAM statement plots. The group contains an observation for each interval of the histogram, beginning with the leftmost interval that contains a value of the variable and ending with the rightmost interval that contains a value of the variable. These intervals will not necessarily coincide with the intervals displayed in the histogram since the histogram may be padded with empty intervals at either end. If you superimpose one or more fitted curves on the histogram, the OUTHISTOGRAM= data set contains multiple groups of observations for each variable (one group for each curve). If you use a BY statement, the OUTHISTOGRAM= data set contains groups of observations for each BY group. ID variables are not saved in the OUTHISTOGRAM= data set.

The variables in the OUTHISTOGRAM= data set are

| Variable | Description |
|---|---|
| _CURVE_ | name of fitted distribution (if requested in HISTOGRAM statement) |
| _EXPPCT_ | estimated percent of population in histogram interval, determined from optional fitted distribution |
| _MIDPT_ | midpoint of histogram interval |
| _OBSPCT_ | percent of variable values in histogram interval |
| _VAR_ | variable name |
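
The rule that the saved intervals run from the leftmost occupied interval to the rightmost occupied one, so that padding intervals at either end are dropped, can be sketched as follows. This is an illustration only: the function name, the half-open interval convention, and the data are assumptions, no fitted curve is involved (so _CURVE_ and _EXPPCT_ are omitted), and percentages are taken over all supplied values.

```python
def outhistogram_rows(var_name, values, midpoints, width):
    """Build one row per histogram interval, trimming empty end intervals.

    Assumes equal-width intervals [midpoint - width/2, midpoint + width/2).
    """
    half = width / 2
    counts = [sum(1 for v in values if m - half <= v < m + half)
              for m in midpoints]

    occupied = [i for i, c in enumerate(counts) if c > 0]
    if not occupied:
        return []
    first, last = occupied[0], occupied[-1]

    n = len(values)
    # Interior empty intervals are kept; only the ends are trimmed.
    return [
        {"_VAR_": var_name,
         "_MIDPT_": midpoints[i],
         "_OBSPCT_": 100 * counts[i] / n}
        for i in range(first, last + 1)
    ]


# Hypothetical data: the displayed histogram has six intervals,
# but the first and last contain no values.
values = [12, 14, 15, 22, 23, 41]
midpoints = [5, 15, 25, 35, 45, 55]

for row in outhistogram_rows("x", values, midpoints, width=10):
    print(row)
```

Here the intervals centered at 5 and 55 are dropped, while the empty interval centered at 35 is kept because it lies between occupied intervals.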


Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.