
Distribution Analyses


Both theory and practice suggest that the choice of
a kernel function is not crucial to the statistical
performance of the method (Epanechnikov 1969).
With a specific kernel function, the value of the bandwidth *λ* determines the degree of averaging in the estimate of
the density function and is called a *smoothing parameter*.
You select a bandwidth *λ* for each kernel
estimator by specifying *c* in the formula

$$\lambda = c \, Q \, n^{-1/5}$$

where *n* is the sample size and *Q* is the sample
interquartile range of the **Y** variable.
This formulation makes *c*
independent of the units of **Y**.
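
As a small illustration, the sketch below assumes the bandwidth has the form λ = *c* *Q* *n*^(-1/5) and checks that the same *c* gives a bandwidth that rescales with the units of the data. The function name `bandwidth` and the sample values are made up for illustration, and the quartile method (`statistics.quantiles` with `method="inclusive"`) is an assumption; SAS/INSIGHT may compute sample quartiles differently.

```python
import statistics

def bandwidth(y, c):
    """Return lambda = c * Q * n**(-1/5), with Q the sample interquartile range."""
    q1, _, q3 = statistics.quantiles(y, n=4, method="inclusive")
    return c * (q3 - q1) * len(y) ** (-1 / 5)

# Made-up measurements, first in centimeters, then the same values in meters.
heights_cm = [150.0, 158.0, 161.0, 165.0, 168.0, 170.0, 174.0, 181.0]
heights_m = [h / 100 for h in heights_cm]

lam_cm = bandwidth(heights_cm, 0.7852)
lam_m = bandwidth(heights_m, 0.7852)

# The bandwidth follows the units of Y while c stays the same.
print(lam_cm, lam_m * 100)
```

Because *Q* carries the units of **Y**, the two printed values agree up to rounding, so a given *c* means the same amount of smoothing whatever the measurement scale.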
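For a specific kernel function, the discrepancy between
the density estimator $\hat{f}_\lambda(y)$ and the true density *f*(*y*) can be measured by
the mean integrated square error

$$\mathrm{MISE}(\lambda) = \int \left\{ E\big(\hat{f}_\lambda(y)\big) - f(y) \right\}^2 dy + \int \mathrm{Var}\big(\hat{f}_\lambda(y)\big)\, dy$$

which is the sum of the integrated square bias and the integrated variance. An approximate mean integrated square error based on the bandwidth *λ* is

$$\mathrm{AMISE}(\lambda) = \frac{1}{4}\lambda^4 \left( \int t^2 K_0(t)\, dt \right)^2 \int \left( f''(y) \right)^2 dy + \frac{1}{n\lambda} \int K_0(t)^2\, dt$$

where $K_0$ is the kernel function and *n* is the sample size.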
If *f*(*y*) is assumed normal, then a bandwidth based on
the sample mean and variance can be computed to minimize AMISE.
The resulting bandwidth for a specific kernel
is used when the associated kernel function is
selected in the density estimation options dialog.
This is equivalent to choosing **MISE** from the
normal, triangular, or quadratic kernel menus.
If *f*(*y*) is not roughly normal,
this choice may not be appropriate.
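
To make the normal-reference choice concrete, the sketch below uses the standard result (Silverman 1986) that, for a normal kernel and a normal density with standard deviation σ, the AMISE-minimizing bandwidth is λ = σ (4 / (3*n*))^(1/5). Dividing by *Q* *n*^(-1/5), and assuming the population relation *Q* ≈ 1.349 σ in place of the sample interquartile range, gives the corresponding *c*. The function name `normal_reference_c` is hypothetical, and this is an idealized calculation, not necessarily how SAS/INSIGHT derives its value from the sample.

```python
from statistics import NormalDist

def normal_reference_c():
    """c for a normal kernel when f is normal and Q equals the population IQR."""
    lam_over_sigma = (4.0 / 3.0) ** 0.2                 # lambda * n**(1/5) / sigma
    iqr_over_sigma = 2.0 * NormalDist().inv_cdf(0.75)   # Q / sigma, about 1.349
    return lam_over_sigma / iqr_over_sigma

print(round(normal_reference_c(), 4))  # prints 0.7852
```

The result agrees with the AMISE value of *c* quoted for the normal kernel in Figure 38.26.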
SAS/INSIGHT software divides the range of the data into
128 evenly spaced intervals, then approximates the data
on this grid and uses the fast Fourier transform
(Silverman 1986) to estimate the density.
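
The binning-plus-FFT idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the SAS/INSIGHT implementation: it uses a grid of 128 nodes, moves each observation to its nearest node (simple binning), uses a normal kernel, and includes a small hand-written radix-2 FFT so that the example is self-contained. The names `fft` and `binned_fft_kde` and the simulated sample are hypothetical.

```python
import cmath
import math
import random

def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def binned_fft_kde(data, lam, m=128):
    """Normal-kernel density on an m-node grid via binning and FFT convolution."""
    n = len(data)
    lo, hi = min(data), max(data)
    step = (hi - lo) / (m - 1)  # assumes the data are not all equal
    grid = [lo + i * step for i in range(m)]
    # Simple binning: move each observation to its nearest grid node.
    counts = [0.0] * m
    for y in data:
        counts[min(m - 1, round((y - lo) / step))] += 1.0
    # Kernel evaluated at every node-to-node offset, -(m-1)..(m-1).
    kern = [math.exp(-0.5 * (j * step / lam) ** 2) / (lam * math.sqrt(2.0 * math.pi))
            for j in range(-(m - 1), m)]
    # Zero-pad to a power of two so circular convolution equals linear convolution.
    size = 1
    while size < m + len(kern) - 1:
        size *= 2
    fa = fft(counts + [0.0] * (size - m))
    fb = fft(kern + [0.0] * (size - len(kern)))
    conv = fft([x * y for x, y in zip(fa, fb)], inverse=True)
    # Offset 0 sits at index m-1 of kern, so node g maps to conv[g + m - 1].
    dens = [conv[g + m - 1].real / (size * n) for g in range(m)]
    return grid, dens

random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]  # simulated data
grid, dens = binned_fft_kde(sample, lam=0.4)
```

The FFT computes the same sums as a direct convolution of the bin counts with the kernel, but the cost grows with the grid size rather than with the number of observations times the number of evaluation points.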
If you select a **Weight** variable, the kernel estimator is
modified to include the individual observation weights.
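
The text does not give the weighted form, so the sketch below assumes one common choice: each kernel term is multiplied by its observation weight and the sum is divided by the total weight instead of by *n*. The function names and the normal kernel are illustrative assumptions, and SAS/INSIGHT may scale the weights differently.

```python
import math

def normal_kernel(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def weighted_kde(y, data, weights, lam):
    """Assumed weighted estimator: sum_i w_i K((y - y_i)/lam) / (lam * sum_i w_i)."""
    total = sum(weights)
    terms = (w * normal_kernel((y - yi) / lam) for yi, w in zip(data, weights))
    return sum(terms) / (total * lam)

# Under this form, a weight of 2 acts like a duplicated observation.
a = weighted_kde(1.0, [0.0, 2.0, 3.0], [1.0, 2.0, 1.0], lam=0.5)
b = weighted_kde(1.0, [0.0, 2.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0], lam=0.5)
print(a, b)
```

With all weights equal, this form reduces to the unweighted kernel estimator.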

You can specify the kernel function in the density
estimation options dialog or from the **Curves** menu.
When you specify the kernel function in the density
estimation options dialog, **AMISE** is used.
After choosing **Curves:Kernel Density** from
the menu, you can specify the kernel function and use
either **AMISE** or a specified C value
in the **Kernel Density Estimation** dialog.

The default uses a normal kernel density with
a *c* value that minimizes the AMISE.
Figure 38.26 displays normal kernel estimates
with *c* = 0.7852 (the AMISE value) and
*c* = 0.25.
Small values of *c* (and hence small values
of the smoothing parameter *λ*) provide jagged estimates
as the curve more closely follows the data points.
Large values of *c* provide smoother estimates.
The **Mode** is the point with the largest estimated density.
Use the slider to change the smoothing parameter, *c*.
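
The effect of *c* on smoothness, and the location of the mode, can be explored with a short sketch. It assumes a normal kernel, a bandwidth of the form λ = *c* *Q* *n*^(-1/5), a 128-node evaluation grid, and simulated data; all function names are hypothetical. Counting interior local maxima of the estimate is used here as a rough measure of how jagged the curve is.

```python
import math
import random
import statistics

def normal_kde(y, data, lam):
    """Normal-kernel density estimate at the point y."""
    scale = len(data) * lam * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((y - yi) / lam) ** 2) for yi in data) / scale

def estimate_on_grid(data, c, m=128):
    """Evaluate the estimate on m evenly spaced nodes over the data range."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    lam = c * (q3 - q1) * len(data) ** (-1 / 5)
    lo, hi = min(data), max(data)
    grid = [lo + (hi - lo) * i / (m - 1) for i in range(m)]
    return grid, [normal_kde(g, data, lam) for g in grid]

def count_local_maxima(dens):
    """Number of interior grid nodes that are local maxima of the estimate."""
    return sum(1 for i in range(1, len(dens) - 1)
               if dens[i - 1] < dens[i] >= dens[i + 1])

def mode(grid, dens):
    """Grid node with the largest estimated density."""
    return grid[max(range(len(dens)), key=dens.__getitem__)]

random.seed(1)
sample = [random.gauss(0.0, 1.0) for _ in range(100)]  # simulated data

results = {}
for c in (0.7852, 0.25):
    grid, dens = estimate_on_grid(sample, c)
    results[c] = (count_local_maxima(dens), mode(grid, dens))
    print(c, results[c])
```

A smaller *c* typically produces more local maxima, which is the jaggedness visible in the figure, while the reported mode is simply the grid point where the estimated density is largest.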

**Figure 38.26:** Kernel Density Estimation


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.