HISTOGRAM Statement

## Example 4.7: Annotating a Folded Normal Curve

 See FNORM2 in the SAS/QC Sample Library

This example shows how to display a fitted curve that is not supported by the HISTOGRAM statement.

The offset of an attachment point is measured (in mm) for a number of manufactured assemblies, and the measurements are saved in a data set named ASSEMBLY.

   data assembly;
label offset = 'Offset (in mm)';
input offset @@;
datalines;
11.11 13.07 11.42  3.92 11.08  5.40 11.22 14.69  6.27  9.76
9.18  5.07  3.51 16.65 14.10  9.69 16.61  5.67  2.89  8.13
9.97  3.28 13.03 13.78  3.13  9.53  4.58  7.94 13.51 11.43
11.98  3.90  7.67  4.32 12.69  6.17 11.48  2.82 20.42  1.01
3.18  6.02  6.63  1.72  2.42 11.32 16.49  1.22  9.13  3.34
1.29  1.70  0.65  2.62  2.04 11.08 18.85 11.94  8.34  2.07
0.31  8.91 13.62 14.94  4.83 16.84  7.09  3.37  0.49 15.19
5.16  4.14  1.92 12.70  1.97  2.10  9.38  3.18  4.18  7.22
15.84 10.85  2.35  1.93  9.19  1.39 11.40 12.20 16.07  9.23
0.05  2.15  1.95  4.39  0.48 10.16  4.81  8.28  5.68 22.81
0.23  0.38 12.71  0.06 10.11 18.38  5.53  9.36  9.32  3.63
12.93 10.39  2.05 15.49  8.12  9.52  7.77 10.70  6.37  1.91
8.60 22.22  1.74  5.84 12.90 13.06  5.08  2.09  6.41  1.40
15.60  2.36  3.97  6.17  0.62  8.56  9.36 10.19  7.16  2.37
12.91  0.95  0.89  3.82  7.86  5.33 12.92  2.64  7.92 14.06
;


The assembly process is in statistical control, and it is decided to fit a folded normal distribution to the offset measurements. A variable X has a folded normal distribution if X=|Y|, where Y is distributed as . The fitted density is You can use SAS/IML software to compute preliminary estimates of and based on a method of moments given by Elandt (1961). These estimates are computed by solving equation (19) of Elandt (1961), which is given by where is the standard normal distribution function, and Then the estimates of and are given by Begin by using the MEANS procedure to compute the first and second moments and using the DATA step to compute the constant A.

   proc means data=assembly noprint;
var offset;
output out=stat mean=m1 var=var n=n min=min;

* Compute constant A from equation (19) of Elandt (1961) ;
data stat;
keep m2 a min;
set stat;
a  = (m1*m1);
m2 = ((n-1)/n)*var + a;
a  = a/m2;


Next, use the SAS/IML subroutine NLPDD to solve equation (19) by minimizing , and compute and .

   proc iml;
use stat;
read all var {m2}  into m2;
read all var {a}   into a;
read all var {min} into min;

* f(t) is the function in equation (19) of Elandt (1961) ;
start f(t) global(a);
y = 0.39894*exp(-0.5*t*t);
y = (2*y-(t*(1-2*probnorm(t))))**2/(1+t*t);
y = (y-a)**2;
return(y);
finish;

* Minimize (f(t)-A)**2 and estimate mu and sigma ;
if ( min < 0 ) then do;
print "Warning: Observations are not all nonnegative.";
print "The folded normal is inappropriate.";
stop;
end;
if ( a < 0.6374 ) then do;
print "Warning: Estimates may be unreliable";
end;
opt = { 0 0 };
con = { 1e-6 };
x0  = { 2.0 };
tc  = { . . . . . 1e-12 . . . . . . .};
call nlpdd(rc,etheta0,"f",x0,opt,con,tc);
esig0 = sqrt(m2/(1+etheta0*etheta0));
emu0  = etheta0*esig0;

create prelim var {emu0 esig0 etheta0};
append;
close prelim;

The preliminary estimates are saved in the data set PRELIM, as shown in Output 4.7.1.

Output 4.7.1: Preliminary Estimates of , , and The Data Set PRELIM

 EMU0 ESIG0 ETHETA0 6.51735 6.54953 0.99509

Now, using and as initial estimates, call the NLPDD subroutine to maximize the log likelihood, , of the folded normal distribution, where, up to a constant, * Define the log likelihood of the folded normal ;
start g(p) global(x);
y = 0.0;
do i = 1 to nrow(x);
z = exp( (-0.5/p)*(x[i]-p)*(x[i]-p) );
z = z + exp( (-0.5/p)*(x[i]+p)*(x[i]+p) );
y = y + log(z);
end;
y = y - nrow(x)*log( sqrt( p ) );
return(y);
finish;

* Maximize the log likelihood with subroutine NLPDD ;
use assembly;
read all var {offset} into x;
esig0sq = esig0*esig0;
x0      = emu0||esig0sq;
opt     = { 1 0 };
con     = { . 0.0, .  .  };
call nlpdd(rc,xr,"g",x0,opt,con);
emu     = xr;
esig    = sqrt(xr);
etheta  = emu/esig;

create parmest var{emu esig etheta};
append;
close parmest;
quit;


The data set PARMEST saves the maximum likelihood estimates and (as well as ), as shown in Output 4.7.2.

Output 4.7.2: Final Estimates of , , and The Data Set PARMEST

 EMU ESIG ETHETA 6.66761 6.39650 1.04239

To annotate the curve on a histogram, begin by computing the width and endpoints of the histogram intervals. The following statements save these values in an OUTFIT= data set called OUT. Note that a plot is not produced at this point.

   proc capability data=assembly noprint;
histogram offset / outfit=out normal(noprint) noplot;
run;


Output 4.7.3 provides a listing of the data set OUT. The width and endpoints of the histogram bars are saved as values of the variables _WIDTH_, _MIDPT1_, and _MIDPTN_. See "Output Data Sets".

Output 4.7.3: The OUTFIT= Data Set OUT

 OUTFIT= Data Set OUT

 _VAR_ _CURVE_ _LOCATN_ _SCALE_ _CHISQ_ _DF_ _PCHISQ_ _MIDPT1_ _WIDTH_ _MIDPTN_ _EXPECT_ _ESTSTD_ _ADASQ_ _ADP_ _CVMWSQ_ _CVMP_ _KSD_ _KSP_ offset NORMAL 7.62 5.24 31.17 5 0 1.5 3 22.5 7.62 5.24 1.90 0 0.28 0 0.09 0.01

The following statements create an annotate data set named ANNO, which contains the coordinates of the fitted curve:

   data anno;
merge parmest out;
length function color \$ 8;

function = 'point';
color    = 'black';
size     =  2;
xsys     = '2';
ysys     = '2';
when     = 'a';
constant = 39.894*_width_;
left     =  _midpt1_ - 0.5*_width_;
right    =  _midptn_ + 0.5*_width_;
inc      = (right-left)/100;
do x = left to right by inc;
z1 = (x-emu)/esig;
z2 = (x+emu)/esig;
y  = (constant/esig)*(exp(-0.5*z1*z1)+exp(-0.5*z2*z2));
output;
function = 'draw';
end;
run;


The following statements read the ANNOTATE= data set and display the histogram and fitted curve, as shown in Output 4.7.4:

   title "Folded Normal Distribution";
legend1 frame cframe=ligr cborder=black position=center;
proc capability data=assembly noprint;
spec usl=27  cusl=black  lusl=2  wusl=2;
histogram offset / annotate  = anno
cfill     = red
cframe    = ligr
legend    = legend1;
run;


Output 4.7.4: Histogram with Annotated Folded Normal Curve 