## Nonparametric Smoothing Spline

Two criteria can be used to select an estimator $\hat{f}_\lambda$
for the function *f*:
- goodness of fit to the data
- smoothness of the fit

A standard measure of goodness of fit
is the mean residual sum of squares

$$\frac{1}{n}\sum_{i=1}^{n}\left\{y_i - \hat{f}_\lambda(x_i)\right\}^2$$

A measure of the smoothness of a fit is
the integrated squared second derivative

$$\int \left\{\hat{f}_\lambda''(x)\right\}^2 \, dx$$

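A single criterion that combines
the two criteria is then given by

$$S(\lambda) = \frac{1}{n}\sum_{i=1}^{n}\left\{y_i - \hat{f}_\lambda(x_i)\right\}^2 + \lambda \int \left\{\hat{f}_\lambda''(x)\right\}^2 \, dx$$

where $\hat{f}_\lambda$ belongs to the set of
all continuously differentiable functions with square integrable
second derivatives, and $\lambda$ is a positive constant.
The estimator that results from minimizing $S(\lambda)$ is called the *smoothing spline estimator*.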
This estimator fits a cubic polynomial
in each interval between adjacent data points.
At each point $x_i$, the curve and its
first two derivatives are continuous (Reinsch 1967).
The smoothing parameter $\lambda$ controls the amount of
smoothing; that is, it controls the trade-off between the
goodness of fit to the data and the smoothness of the fit.
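
As a rough illustration of this trade-off, the sketch below evaluates the combined criterion (mean residual sum of squares plus the smoothing parameter times the integrated squared second derivative) for two hand-picked candidate curves. The helper name `penalized_criterion`, the toy data, and the trapezoid-rule integration over the observed range of *x* are assumptions made for illustration; they are not part of the software described here, and the sketch does not compute the smoothing spline itself.

```python
def penalized_criterion(xs, ys, f, f2, lam, grid=1000):
    """Evaluate the combined criterion for a candidate curve.

    f   : candidate function
    f2  : its second derivative (supplied analytically)
    lam : smoothing parameter (positive constant)
    """
    n = len(xs)
    # Goodness of fit: mean residual sum of squares.
    fit = sum((y - f(x)) ** 2 for x, y in zip(xs, ys)) / n
    # Smoothness: trapezoid-rule approximation of the integral of
    # f''(x)^2 over the observed range of x (an assumed integration range).
    lo, hi = min(xs), max(xs)
    h = (hi - lo) / grid
    vals = [f2(lo + k * h) ** 2 for k in range(grid + 1)]
    roughness = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return fit + lam * roughness


# Toy data lying exactly on y = x^2.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [x * x for x in xs]


def parabola(x):
    return x * x          # fits the data exactly, but is curved


def line(x):
    return x - 0.125      # least squares line: perfectly smooth, worse fit


def criterion_pair(lam):
    """Return (criterion for the parabola, criterion for the line)."""
    s_parabola = penalized_criterion(xs, ys, parabola, lambda x: 2.0, lam)
    s_line = penalized_criterion(xs, ys, line, lambda x: 0.0, lam)
    return s_parabola, s_line


small = criterion_pair(0.001)   # small penalty: the exact fit scores lower
large = criterion_pair(0.01)    # larger penalty: the straight line scores lower
```

With a small smoothing parameter the curved exact fit has the lower criterion value; with a larger one the straight line does, which is the sense in which the parameter trades fit against smoothness.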
You select a smoothing parameter $\lambda$ by
specifying a constant *c* in the formula

$$\lambda = \left(\frac{Q}{10}\right)^{3} c$$

where *Q* is the interquartile range of the explanatory variable.
This formulation makes *c* independent of the units of **X**.
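
A minimal sketch of this conversion follows, assuming the formula has the form $\lambda = (Q/10)^3 c$ and that *Q* is computed from quartiles obtained by linear interpolation (the `"inclusive"` method of Python's `statistics.quantiles`). Both the function name `smoothing_parameter` and the quartile definition are assumptions; the software may compute quartiles differently.

```python
import statistics


def smoothing_parameter(x_values, c):
    """Convert the unit-free constant c into a smoothing parameter.

    Assumes lambda = (Q / 10)**3 * c, where Q is the interquartile
    range of the explanatory variable.
    """
    # Quartile definition is an assumption; other definitions give
    # slightly different Q for small samples.
    q1, _, q3 = statistics.quantiles(x_values, n=4, method="inclusive")
    iqr = q3 - q1
    return (iqr / 10.0) ** 3 * c


# Hypothetical explanatory variable, recorded in metres and in centimetres.
x_m = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
x_cm = [100.0 * x for x in x_m]

lam_m = smoothing_parameter(x_m, c=1.0)
lam_cm = smoothing_parameter(x_cm, c=1.0)

# Rescaling X by 100 rescales lambda by 100**3, so the same c can be
# used whatever units X is recorded in.
ratio = lam_cm / lam_m
```

Because the integrated squared second derivative scales with the inverse cube of the units of **X**, the cubic factor in *Q* compensates for a change of units, leaving *c* unchanged.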

After choosing **Curves:Spline**, you specify a smoothing
parameter selection method in the **Spline Fit** dialog.

**Figure 39.40:** Spline Fit Dialog

The default **Method:GCV** uses a *c* value that
minimizes the generalized cross validation mean
squared error. Figure 39.41 displays smoothing spline estimates
with *c* values of 0.0017 (the GCV value) and 15.2219 (DF=3).
Use the slider in the table to change
the *c* value of the spline fit.

**Figure 39.41:** Smoothing Spline Estimates

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.