The SURVEYMEANS Procedure

**PROC SURVEYMEANS** < *options* > < *statistic-keywords* > **;**

In the PROC SURVEYMEANS statement, you also can use *statistic-keywords* to specify the statistics for the procedure to compute.

You can specify the following options in the PROC SURVEYMEANS statement.

**ALPHA=**$\alpha$-
sets the confidence level for confidence limits. The
value of the ALPHA= option must be between 0.0001 and
0.9999, and the default value is 0.05. A confidence
level of $\alpha$ produces $100(1-\alpha)\%$
confidence limits. The default of ALPHA=0.05 produces
95% confidence limits. If $\alpha$ is between 0 and 1
but outside the range of 0.0001 to 0.9999, the
procedure uses the closest range endpoint. For
example, if you specify ALPHA=0.000001, the
procedure uses 0.0001 to determine confidence limits.
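
As a minimal sketch of the ALPHA= option, the following step requests 90% confidence limits for a mean. The data set `work.survey` and the variables `income` and `sampwt` are hypothetical names used only for illustration.

```sas
/* Hypothetical data set and variable names, for illustration only */
proc surveymeans data=work.survey alpha=0.10 mean clm;
   var income;       /* numeric analysis variable */
   weight sampwt;    /* sampling weights */
run;
```

With ALPHA=0.10, the CLM keyword produces $100(1-0.10)\% = 90\%$ confidence limits for the mean.
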
**DATA=***SAS-data-set*-
specifies the SAS data set to be analyzed by PROC SURVEYMEANS. If
you omit the DATA= option, the procedure uses the most recently
created SAS data set.
**MISSING**-
requests that the procedure treat missing values as a valid
category for categorical variables.
**ORDER=DATA | FORMATTED | INTERNAL**-
specifies the order in which the values of the categorical
variables are to be reported. Note that the ORDER= option
applies to all the categorical variables. The exception is
ORDER=FORMATTED (the default) for numeric variables for
which you have supplied no explicit format (that is, for
which there is no corresponding FORMAT statement in
the current PROC SURVEYMEANS run or in the DATA step that
created the data set). In this case, the values of the
numerical categorical variables are ordered by their
internal (numeric) value. The following shows how PROC
SURVEYMEANS interprets values of the ORDER= option.
- DATA
- orders values according to their order in the input data set.
- FORMATTED
- orders values by their formatted values. This order is operating environment dependent. By default, the order is ascending.
- INTERNAL
- orders values by their unformatted values, which yields the same order that the SORT procedure does. This order is operating environment dependent.
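
The MISSING and ORDER= options might be combined as in the following sketch. It assumes a hypothetical data set `work.survey` with a numeric coded variable `region` that is to be analyzed as categorical; all data set and variable names are illustrative.

```sas
/* Hypothetical data set and variable names, for illustration only */
proc surveymeans data=work.survey order=data missing mean;
   class region;        /* analyze the numeric variable region as categorical */
   var region income;   /* region: proportion per level; income: mean */
   weight sampwt;       /* sampling weights */
run;
```

Here the levels of `region` are reported in the order in which they appear in the input data set, and missing values of `region` are treated as a valid category.
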

**RATE=***value* | *SAS-data-set*
**R=***value* | *SAS-data-set*-
specifies the sampling rate as a positive *value*, or names an input data set that contains the stratum sampling rates. The procedure uses this information to compute a finite population correction for variance estimation. If your sample design has multiple stages, you should specify the *first-stage sampling rate*, which is the ratio of the number of PSUs selected to the total number of PSUs in the population.

For a nonstratified sample design, or for a stratified sample design with the same sampling rate in all strata, you should specify a positive *value* for the RATE= option. If your design is stratified with different sampling rates in the strata, then you should name a SAS data set that contains the stratification variables and the sampling rates. See the section "Specification of Population Totals and Sampling Rates" for details.

The sampling rate *value* must be a positive number. You can specify *value* as a number between 0 and 1. Alternatively, you can specify *value* in percentage form as a number between 1 and 100, and PROC SURVEYMEANS converts that number to a proportion. The procedure treats the value 1 as 100%, and not as the percentage form 1%.
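
The two forms of the RATE= option can be sketched as follows. All data set and variable names are hypothetical. The second form assumes that the rates data set holds each stratum's rate in a variable named `_RATE_`, as described in the section "Specification of Population Totals and Sampling Rates", and that `region` has the same type in both data sets.

```sas
/* Form 1: the same 2% sampling rate in every stratum */
proc surveymeans data=work.survey rate=0.02;
   strata region;
   var income;
   weight sampwt;
run;

/* Form 2: different rates per stratum, supplied in a data set */
data work.strata_rates;
   input region $ _rate_;   /* stratification variable and its rate */
   datalines;
East 0.02
West 0.05
;
run;

proc surveymeans data=work.survey rate=work.strata_rates;
   strata region;
   var income;
   weight sampwt;
run;
```

In the first form, RATE=2 would be read as the percentage form and give the same 2% rate, whereas RATE=1 would be read as 100%.
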

If you do not specify the TOTAL= option or the RATE= option, then the variance estimation does not include a finite population correction. You cannot specify both the TOTAL= option and the RATE= option.

**TOTAL=***value* | *SAS-data-set*
**N=***value* | *SAS-data-set*-
specifies the total number of primary sampling units (PSUs) in the study population as a positive *value*, or names an input data set that contains the stratum population totals. The procedure uses this information to compute a finite population correction for variance estimation.

For a nonstratified sample design, or for a stratified sample design with the same population total in all strata, you should specify a positive *value* for the TOTAL= option. If your sample design is stratified with different population totals in the strata, then you should name a SAS data set that contains the stratification variables and the population totals. See the section "Specification of Population Totals and Sampling Rates" for details.
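
A stratified design with different population totals might be specified as in this sketch. The data set and variable names are hypothetical, and the example assumes that the totals data set stores each stratum's total in a variable named `_TOTAL_`, as described in the section "Specification of Population Totals and Sampling Rates".

```sas
/* Hypothetical stratum totals: number of PSUs in each stratum */
data work.strata_totals;
   input region $ _total_;
   datalines;
East 1200
West 2800
;
run;

/* TOTAL= names the data set; RATE= must not be specified as well */
proc surveymeans data=work.survey total=work.strata_totals;
   strata region;
   var income;
   weight sampwt;
run;
```

For a nonstratified design, or when every stratum has the same total, a single number such as TOTAL=4000 would take the place of the data set name.
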

If you do not specify the TOTAL= option or the RATE= option, then the variance estimation does not include a finite population correction. You cannot specify both the TOTAL= option and the RATE= option.

*statistic-keywords*-
specifies the statistics for the procedure to compute.
If you do not specify any statistic-keywords, PROC
SURVEYMEANS computes the NOBS, MEAN, STDERR, and CLM
statistics by default.

PROC SURVEYMEANS performs univariate analysis, analyzing each variable separately. Thus the number of nonmissing and missing observations may not be the same for all analysis variables. See the section "Missing Values" for more information.

The statistics produced depend on the type of the analysis variable. If you name a numeric variable in the CLASS statement, then the procedure analyzes that variable as a categorical variable. The procedure always analyzes character variables as categorical. See the section "CLASS Statement" for more information.

PROC SURVEYMEANS computes MIN, MAX, and RANGE for numeric variables but not for categorical variables. For numeric variables, the keyword MEAN produces the mean, but for categorical variables it produces the proportion in each category or level. Also for categorical variables, the keyword NOBS produces the number of observations for each variable level, and the keyword NMISS produces the number of missing observations for each level. If you request the keyword NCLUSTER for a categorical variable, PROC SURVEYMEANS displays for each level the number of clusters with observations in that level. PROC SURVEYMEANS computes SUMWGT the same for categorical and numeric variables, as the sum of the weights over all nonmissing observations.

The valid statistic-keywords are as follows:

- ALL
- all statistics listed

- CLM
- $100(1-\alpha)\%$ confidence limits for the
MEAN, where $\alpha$ is determined by the
ALPHA= option, and the default is $\alpha = 0.05$

- CLSUM
- $100(1-\alpha)\%$ confidence limits for
the SUM, where $\alpha$ is determined by the
ALPHA= option, and the default is $\alpha = 0.05$

- CV
- coefficient of variation

- DF
- degrees of freedom for the
*t* test

- MAX
- maximum value

- MEAN
- mean for a numeric variable,
or the proportion in each category for a categorical
variable

- MIN
- minimum value

- NCLUSTER
- number of clusters

- NMISS
- number of missing observations

- NOBS
- number of nonmissing observations

- RANGE
- range, MAX-MIN

- STD
- standard deviation of the SUM. When you
request SUM, the procedure computes STD by
default.

- STDERR
- standard error of the MEAN. When you
request MEAN, the procedure computes STDERR by
default.

- SUM
- weighted sum, $\sum_i w_i y_i$, or estimated
population total when the appropriate
sampling weights are used

- SUMWGT
- sum of the weights, $\sum_i w_i$

- T
- *t* value for $H_0$: population MEAN = 0, and its two-tailed *p*-value with DF degrees of freedom

- VAR
- variance of the MEAN

- VARSUM
- variance of the SUM
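
As an illustration of requesting specific statistics, the following sketch lists several keywords in the PROC SURVEYMEANS statement. The data set, the cluster variable `school`, and the other variable names are hypothetical.

```sas
/* Hypothetical data set and variable names, for illustration only */
proc surveymeans data=work.survey total=4000
                 nobs mean stderr clm sum std clsum;
   cluster school;   /* primary sampling units */
   var income;       /* numeric analysis variable */
   weight sampwt;    /* sampling weights */
run;
```

Because keywords are listed explicitly, the procedure computes the requested statistics rather than only the default set of NOBS, MEAN, STDERR, and CLM.
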

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.