Trimmed and Winsorized Means
When outliers are present in the data, trimmed and Winsorized
means are robust estimators of the population mean that
are relatively insensitive to the outlying values.
Trimming and Winsorization are two methods for
reducing the effects of extreme values in the sample.
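The k-times trimmed mean is calculated as
    \bar{x}_{tk} = \frac{1}{n-2k} \sum_{i=k+1}^{n-k} x_{(i)}
where n is the number of observations and x_{(i)} is the
ith smallest observation (the ith order statistic).
The trimmed mean is computed after the k smallest and
k largest observations are deleted from the sample.
In other words, the observations are trimmed at each end.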
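The trimming step described above can be sketched in a few lines of Python. This is a minimal illustration, not the SAS/INSIGHT implementation; the function name `trimmed_mean` and the sample data are assumptions made for the example.

```python
def trimmed_mean(values, k):
    """k-times trimmed mean: average after dropping the k smallest
    and k largest observations (hypothetical helper)."""
    n = len(values)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= k < n/2")
    ordered = sorted(values)
    kept = ordered[k:n - k]  # order statistics x_(k+1) .. x_(n-k)
    return sum(kept) / len(kept)


# Made-up sample with one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 10, 100]
print(trimmed_mean(sample, 0))  # ordinary mean, pulled up by the outlier
print(trimmed_mean(sample, 1))  # mean of the middle eight observations
```

With k = 0 the function reduces to the ordinary sample mean, which makes it easy to see how much a single outlier moves the estimate.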
The k-times Winsorized mean is calculated as
    \bar{x}_{wk} = \frac{1}{n} \left( (k+1) x_{(k+1)} + \sum_{i=k+2}^{n-k-1} x_{(i)} + (k+1) x_{(n-k)} \right)
The Winsorized mean is computed after the k
smallest observations are replaced by the (k+1)st
smallest observation, x_{(k+1)}, and the k largest observations
are replaced by the (k+1)st largest observation, x_{(n-k)}.
In other words, the observations are Winsorized at each end.
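The replacement step for Winsorization can be sketched as follows. This is a minimal illustration rather than the product's own code; the name `winsorized_mean` and the sample data are assumptions for the example.

```python
def winsorized_mean(values, k):
    """k-times Winsorized mean: the k smallest values are replaced by the
    (k+1)st smallest, the k largest by the (k+1)st largest, and the
    result is averaged over all n observations (hypothetical helper)."""
    n = len(values)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= k < n/2")
    ordered = sorted(values)
    low, high = ordered[k], ordered[n - k - 1]  # x_(k+1) and x_(n-k)
    clamped = [min(max(x, low), high) for x in ordered]
    return sum(clamped) / n


# Made-up sample with one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 10, 100]
print(winsorized_mean(sample, 1))  # 1 becomes 2 and 100 becomes 10 before averaging
```

Unlike trimming, Winsorization keeps the sample size at n; only the extreme values are pulled in.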
For a symmetric distribution, the symmetrically trimmed or
Winsorized mean is an unbiased estimate of the population mean.
However, the trimmed or Winsorized mean does not have a normal
distribution even if the data are from a normal population.
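The Winsorized sum of squared deviations is defined as
    s_{wk}^2 = (k+1)(x_{(k+1)} - \bar{x}_{wk})^2 + \sum_{i=k+2}^{n-k-1} (x_{(i)} - \bar{x}_{wk})^2 + (k+1)(x_{(n-k)} - \bar{x}_{wk})^2
where \bar{x}_{wk} is the k-times Winsorized mean and x_{(i)} is the
ith smallest observation.
A robust estimate of the variance of the trimmed mean can be based on the Winsorized
sum of squared deviations (Tukey and McLaughlin 1963).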
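The resulting trimmed t test
for the hypothesis that the population mean is \mu_0
is given by
    t_{tk} = \frac{\bar{x}_{tk} - \mu_0}{\mathrm{STDERR}(\bar{x}_{tk})}
where \mathrm{STDERR}(\bar{x}_{tk}) is the standard error
of the trimmed mean \bar{x}_{tk}:
    \mathrm{STDERR}(\bar{x}_{tk}) = \frac{s_{wk}}{\sqrt{(n-2k)(n-2k-1)}}
and s_{wk} is the square root of the Winsorized sum of squared deviations.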
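As a rough illustration, the trimmed t statistic can be computed as below. The sketch assumes the Tukey-McLaughlin form of the standard error, s_wk / sqrt((n-2k)(n-2k-1)), where s_wk^2 is the Winsorized sum of squared deviations. The function names and data are hypothetical, and the code does not guard against a zero standard error.

```python
import math


def winsorized_ss(values, k):
    """Return (Winsorized mean, Winsorized sum of squared deviations)."""
    n = len(values)
    if k < 0 or n - 2 * k < 2:
        raise ValueError("need at least two observations left after trimming")
    ordered = sorted(values)
    low, high = ordered[k], ordered[n - k - 1]  # x_(k+1) and x_(n-k)
    clamped = [min(max(x, low), high) for x in ordered]
    w_mean = sum(clamped) / n
    return w_mean, sum((x - w_mean) ** 2 for x in clamped)


def trimmed_t(values, k, mu0):
    """Return (t statistic, standard error, degrees of freedom)."""
    n = len(values)
    _, ss = winsorized_ss(values, k)
    kept = sorted(values)[k:n - k]
    t_mean = sum(kept) / len(kept)
    h = n - 2 * k                           # observations left after trimming
    stderr = math.sqrt(ss / (h * (h - 1)))  # assumed standard error formula
    return (t_mean - mu0) / stderr, stderr, h - 1


# Made-up sample; mu0 = 5 is an arbitrary hypothesized mean.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 10, 100]
t_stat, stderr, df = trimmed_t(sample, k=1, mu0=5.0)
print(t_stat, stderr, df)
```

The returned degrees of freedom, n-2k-1, are what one would pass to a Student's t distribution to obtain an approximate p-value.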
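A Winsorized t test is given by
    t_{wk} = \frac{\bar{x}_{wk} - \mu_0}{\mathrm{STDERR}(\bar{x}_{wk})}
where \mathrm{STDERR}(\bar{x}_{wk}) is the standard error
of the Winsorized mean \bar{x}_{wk}:
    \mathrm{STDERR}(\bar{x}_{wk}) = \frac{n-1}{n-2k-1} \cdot \frac{s_{wk}}{\sqrt{n(n-1)}}
and s_{wk} is the square root of the Winsorized sum of squared deviations.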
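The Winsorized t statistic can be sketched in the same way. The code assumes the standard error takes the form ((n-1)/(n-2k-1)) * s_wk / sqrt(n(n-1)), with s_wk^2 the Winsorized sum of squared deviations. The function name and data are hypothetical, and a zero standard error is not handled.

```python
import math


def winsorized_t(values, k, mu0):
    """Return (t statistic, standard error, degrees of freedom)
    for the k-times Winsorized mean (hypothetical helper)."""
    n = len(values)
    if k < 0 or n - 2 * k < 2:
        raise ValueError("need at least two observations left after trimming")
    ordered = sorted(values)
    low, high = ordered[k], ordered[n - k - 1]  # x_(k+1) and x_(n-k)
    clamped = [min(max(x, low), high) for x in ordered]
    w_mean = sum(clamped) / n
    ss = sum((x - w_mean) ** 2 for x in clamped)  # Winsorized sum of squares
    df = n - 2 * k - 1
    # Assumed standard error formula for the Winsorized mean.
    stderr = (n - 1) / df * math.sqrt(ss / (n * (n - 1)))
    return (w_mean - mu0) / stderr, stderr, df


# Made-up sample; mu0 = 5 is an arbitrary hypothesized mean.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 10, 100]
t_stat, stderr, df = winsorized_t(sample, k=1, mu0=5.0)
print(t_stat, stderr, df)
```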
When the data are from a symmetric distribution,
the distribution of the trimmed t statistic
t_{tk} or the Winsorized t statistic
t_{wk} can be approximated by a Student's
t distribution with n-2k-1 degrees of freedom
(Tukey and McLaughlin 1963,
Dixon and Tukey 1968).
You can specify the number or percentage of observations
to be trimmed or Winsorized from each end either by using
the Trimmed/Winsorized Means options dialog or
by using the Trimmed/Winsorized Means dialog after choosing
Tables:Trimmed/Winsorized Mean:(1/2)N
or Tables:Trimmed/Winsorized Mean:(1/2)Percent from the menus.
Figure 38.15: (1/2)N Menu
Figure 38.16: (1/2)Percent Menu
If you specify a percentage, 100p%, where 0<p<1,
then k is the smallest integer greater than or equal to np,
and k observations are trimmed or Winsorized from each end.
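The conversion from a percentage to a count is a ceiling operation, sketched below. The function name `trim_count` is hypothetical, and the rounding step is an added assumption to absorb floating-point noise in the product n*p; it is not something the documentation describes.

```python
import math


def trim_count(n, p):
    """Number of observations trimmed or Winsorized from each end when
    a fraction p (0 < p < 1) is requested: the smallest integer >= n*p."""
    if not 0 < p < 1:
        raise ValueError("p must satisfy 0 < p < 1")
    # Round first so that tiny binary floating-point errors in n * p
    # do not push the ceiling up by one (an assumption of this sketch).
    return math.ceil(round(n * p, 9))


print(trim_count(10, 0.25))  # n*p = 2.5, so 3 observations per end
print(trim_count(20, 0.10))  # n*p = 2, so 2 observations per end
```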
The Trimmed Mean and Winsorized Mean tables,
as shown in Figure 38.17,
contain the following statistics:
- (1/2)Percent is the percentage of observations
trimmed or Winsorized at each end.
- (1/2)N is the number of observations trimmed or
Winsorized at each end.
- Mean is the trimmed or Winsorized mean.
- Std Mean is the standard error of the
trimmed or Winsorized mean.
- DF is the degrees of freedom used in the Student's
t test for the trimmed or Winsorized mean.
- Confidence Interval includes Level (%): the confidence level,
LCL: lower confidence limit, and UCL: upper confidence limit.
- t for H0: Mean=Mu0 includes Mu0: the location parameter
\mu_0, t Stat: the trimmed or Winsorized t statistic
for testing the hypothesis that the population mean is
\mu_0, and
p-value: the approximate p-value of the trimmed
or Winsorized t statistic.
Figure 38.17: Trimmed Means and Winsorized Means Tables
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.