 HISTOGRAM Statement

## Kernel Density Estimates

You can use the KERNEL option to superimpose kernel density estimates on histograms. Smoothing the data distribution with a kernel density estimate can be more effective than using a histogram to examine features that might be obscured by the choice of histogram bins or sampling variation. A kernel density estimate can also be more effective than a parametric curve fit when the process distribution is multimodal. See Example 4.5.

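The general form of the kernel density estimator is

$$\hat{f}_\lambda(x) = \frac{1}{n\lambda} \sum_{i=1}^{n} K_0\!\left(\frac{x - x_i}{\lambda}\right)$$

where $K_0(\cdot)$ is a kernel function, $\lambda$ is the bandwidth, $n$ is the sample size, and $x_i$ is the $i$th observation.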
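As an illustration outside SAS, the following Python sketch evaluates an estimator of this form at a single point: it averages the kernel values at $(x - x_i)/\lambda$ over the observations and divides by $\lambda$. The function names `normal_kernel` and `kernel_density` and the sample values are hypothetical, and the code is not the procedure's implementation.

```python
import math


def normal_kernel(t):
    """Standard normal density, used here as the kernel function K0."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def kernel_density(x, data, bandwidth, kernel=normal_kernel):
    """Evaluate (1 / (n * bandwidth)) * sum of K0((x - x_i) / bandwidth)."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(data)
    if n == 0:
        raise ValueError("data must contain at least one observation")
    total = sum(kernel((x - xi) / bandwidth) for xi in data)
    return total / (n * bandwidth)


# Hypothetical measurements, for illustration only.
sample = [9.8, 10.1, 10.4, 11.9, 12.2]
density_at_10 = kernel_density(10.0, sample, bandwidth=0.5)
```

A larger `bandwidth` spreads each observation's contribution over a wider interval and gives a smoother curve; a smaller one follows the individual observations more closely.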

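The KERNEL option provides three kernel functions ($K_0$): normal, quadratic, and triangular. You can specify the function with the K= kernel-option in parentheses after the KERNEL option. Values for the K= option are NORMAL, QUADRATIC, and TRIANGULAR (with aliases of N, Q, and T, respectively). By default, a normal kernel is used. The formulas for the kernel functions are

$$\text{Normal:}\quad K_0(t) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\tfrac{1}{2} t^2\right) \quad \text{for } -\infty < t < \infty$$

$$\text{Quadratic:}\quad K_0(t) = \tfrac{3}{4}\left(1 - t^2\right) \quad \text{for } |t| \le 1$$

$$\text{Triangular:}\quad K_0(t) = 1 - |t| \quad \text{for } |t| \le 1$$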
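The three kernel shapes can be sketched in Python as follows. This assumes the usual definitions: the standard normal density for NORMAL, $\tfrac{3}{4}(1 - t^2)$ on $|t| \le 1$ for QUADRATIC, and $1 - |t|$ on $|t| \le 1$ for TRIANGULAR, each zero outside its interval. The lookup helper `resolve_kernel` is hypothetical; it only mirrors the K= values and their aliases and is not a SAS interface.

```python
import math


def normal_kernel(t):
    """Standard normal density, defined for every real t."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def quadratic_kernel(t):
    """(3/4)(1 - t^2) on |t| <= 1, zero elsewhere."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def triangular_kernel(t):
    """1 - |t| on |t| <= 1, zero elsewhere."""
    return 1.0 - abs(t) if abs(t) <= 1.0 else 0.0


# Full names and one-letter aliases, matching the K= values in the text.
KERNELS = {
    "normal": normal_kernel, "n": normal_kernel,
    "quadratic": quadratic_kernel, "q": quadratic_kernel,
    "triangular": triangular_kernel, "t": triangular_kernel,
}


def resolve_kernel(name="normal"):
    """Return the kernel function for a name or alias (case-insensitive)."""
    try:
        return KERNELS[name.lower()]
    except KeyError:
        raise ValueError(f"unknown kernel: {name!r}") from None
```

Each function integrates to 1, so an average of scaled kernels is itself a density. The quadratic and triangular kernels have bounded support, so an observation farther than one bandwidth from $x$ contributes nothing to the estimate at $x$.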

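The value of $\lambda$, referred to as the bandwidth parameter, determines the degree of smoothness in the estimated density function. You specify $\lambda$ indirectly by specifying a standardized bandwidth $c$ with the C= kernel-option. If $Q$ is the interquartile range and $n$ is the sample size, then $c$ is related to $\lambda$ by the formula

$$\lambda = c\,Q\,n^{-1/5}$$

For a specific kernel function, the discrepancy between the density estimator $\hat{f}_\lambda(x)$ and the true density $f(x)$ is measured by the mean integrated square error (MISE):

$$\mathrm{MISE}(\lambda) = \int_x \left\{ E\!\left(\hat{f}_\lambda(x)\right) - f(x) \right\}^2 dx + \int_x \mathrm{var}\!\left(\hat{f}_\lambda(x)\right) dx$$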

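The MISE is the sum of the integrated squared bias and the integrated variance. An approximate mean integrated square error (AMISE) is

$$\mathrm{AMISE}(\lambda) = \frac{1}{4}\lambda^4 \left( \int_t t^2 K_0(t)\,dt \right)^2 \int_x \left( f''(x) \right)^2 dx + \frac{1}{n\lambda} \int_t K_0(t)^2\,dt$$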

A bandwidth that minimizes AMISE can be derived by treating $f(x)$ as the normal density having parameters $\mu$ and $\sigma$ estimated by the sample mean and standard deviation. If you do not specify a bandwidth parameter or if you specify C=MISE, the bandwidth that minimizes AMISE is used. The value of AMISE can be used to compare different density estimates. For each estimate, the bandwidth parameter $c$, the kernel function type, and the value of AMISE are reported in the SAS log.
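The normal-reference calculation can be sketched in Python under stated assumptions. The sketch takes AMISE in its standard form, $\tfrac{1}{4}\lambda^4 \left(\int t^2 K_0\right)^2 \int (f'')^2 + \tfrac{1}{n\lambda}\int K_0^2$, treats $f$ as a normal density with standard deviation $\sigma$, and assumes the relation $\lambda = c\,Q\,n^{-1/5}$ between the bandwidth and the standardized bandwidth. The kernel constants are worked out by hand for the normal, quadratic, and triangular kernels. All function names are hypothetical, and SAS may compute these quantities differently.

```python
import math

# For each kernel: (integral of t^2 K0(t) dt, integral of K0(t)^2 dt).
KERNEL_CONSTANTS = {
    "normal": (1.0, 1.0 / (2.0 * math.sqrt(math.pi))),
    "quadratic": (1.0 / 5.0, 3.0 / 5.0),
    "triangular": (1.0 / 6.0, 2.0 / 3.0),
}


def normal_curvature(sigma):
    """Integral of f''(x)^2 when f is a normal density with std dev sigma."""
    return 3.0 / (8.0 * math.sqrt(math.pi) * sigma ** 5)


def amise(lam, n, sigma, kernel="normal"):
    """AMISE at bandwidth lam under the normal-reference assumption."""
    mu2, rk = KERNEL_CONSTANTS[kernel]
    bias_term = 0.25 * lam ** 4 * mu2 ** 2 * normal_curvature(sigma)
    variance_term = rk / (n * lam)
    return bias_term + variance_term


def amise_bandwidth(n, sigma, kernel="normal"):
    """Bandwidth that sets the derivative of AMISE with respect to lam to zero."""
    mu2, rk = KERNEL_CONSTANTS[kernel]
    return (rk / (mu2 ** 2 * normal_curvature(sigma) * n)) ** 0.2


def standardized_bandwidth(lam, iqr, n):
    """Invert lam = c * Q * n**(-1/5) to recover c from lam."""
    return lam * n ** 0.2 / iqr
```

For the normal kernel the minimizer reduces to $\lambda = (4/3)^{1/5}\,\sigma\,n^{-1/5}$. Because the first AMISE term grows with $\lambda^4$ and the second shrinks with $1/\lambda$, the minimizing bandwidth balances squared bias against variance.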
