
The TRANSREG Procedure

Two vectors of information are needed to produce the
optimally scaled variable: the initial variable scaling
vector **x** and the target vector **y**.
For convenience, both vectors are first sorted
on the values of the initial scaling vector.
If you request an UNTIE transformation, the target vector
is sorted within ties in the initial scaling vector.
The normal SAS System collating sequence
for missing and nonmissing values is used.
Sorting simply allows constraints to be specified
in terms of relations among adjoining coefficients.
The sorting process partitions **x** and
**y** into missing and nonmissing parts,
(**x**_{m}' **x**_{n}')'
and (**y**_{m}' **y**_{n}')'.
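
The sorting and partitioning step can be sketched in a few lines of Python. This is only an illustration, not PROC TRANSREG's implementation: it assumes missing values are represented as the strings `"."`, `"._"`, and `".A"` through `".Z"`, and the function names `sas_key` and `sort_and_partition` are hypothetical. The key function follows the SAS ordering of numeric missing values (`._`, then `.`, then `.A` through `.Z`, then nonmissing numbers).

```python
def sas_key(v):
    """Sort key: ._ < . < .A ... .Z < every nonmissing number."""
    if isinstance(v, str):  # assumption: missing values are stored as strings
        rank = {"._": 0, ".": 1}.get(v)
        if rank is None:
            rank = 2 + ord(v[1].upper()) - ord("A")
        return (0, rank)
    return (1, v)


def sort_and_partition(x, y, untie=False):
    """Sort both vectors on x, then split into missing and nonmissing parts."""
    if untie:
        # UNTIE: the target values are also sorted within ties in x
        pairs = sorted(zip(x, y), key=lambda p: (sas_key(p[0]), p[1]))
    else:
        # sorted() is stable, so tied x values keep their original order
        pairs = sorted(zip(x, y), key=lambda p: sas_key(p[0]))
    missing = [p for p in pairs if isinstance(p[0], str)]
    nonmissing = [p for p in pairs if not isinstance(p[0], str)]
    x_m = [p[0] for p in missing]
    y_m = [p[1] for p in missing]
    x_n = [p[0] for p in nonmissing]
    y_n = [p[1] for p in nonmissing]
    return x_m, y_m, x_n, y_n
```

Calling `sort_and_partition(x, y, untie=True)` differs from the default only in the order of target values that share the same initial scaling value.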

Next, PROC TRANSREG determines category membership.
Every ordinary missing value (.) forms a separate category.
(Three ordinary missing values form three categories.)
Every special missing value within the range specified
in the UNTIE= *a-option* forms a separate category.
(If UNTIE= BC and there are three .B and two .C
missing values, five categories are formed from them.)
For all other special missing values, a separate
category is formed for each different value.
(If there are four .A missing values,
one category is formed from them.)

Each distinct nonmissing value forms a separate category for OPSCORE and MONOTONE transformations (1 1 1 2 2 3 form three categories). Each nonmissing datum forms a separate category for all other transformations (1 1 1 2 2 3 form six categories). Once category membership is determined, category means are computed. Here is an example:

x: | `(. . .A .A .B 1 1 1 2 2 3 3 3 4)'`

y: | `(5 6 2 4 2 1 2 3 4 6 4 5 6 7)'`

OPSCORE and MONOTONE means: | `(5 6 3 2 2 5 5 7)'`

other means: | `(5 6 3 2 1 2 3 4 6 4 5 6 7)'`
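
The category rules and the two rows of means above can be reproduced with a short Python sketch. It assumes the vectors are already sorted, that missing values are stored as strings such as `"."` and `".A"`, and that the UNTIE= range is passed as a string of letters (`untie_letters`). These are illustrative choices, and `category_means` is a hypothetical helper, not part of SAS.

```python
from itertools import groupby


def category_means(x, y, group_nonmissing, untie_letters=""):
    """Return (means, counts), one entry per category, for sorted x and y."""
    keys = []
    for i, v in enumerate(x):
        if isinstance(v, str):
            # ordinary missing (.) and special missings in the UNTIE= range
            # each form their own category; other special missings group by value
            grouped = v != "." and v[1] not in untie_letters
        else:
            # OPSCORE and MONOTONE group equal nonmissing values; others do not
            grouped = group_nonmissing
        keys.append((v,) if grouped else (v, i))
    means, counts = [], []
    for _, grp in groupby(zip(keys, y), key=lambda p: p[0]):
        vals = [yi for _, yi in grp]
        means.append(sum(vals) / len(vals))
        counts.append(len(vals))
    return means, counts


x = [".", ".", ".A", ".A", ".B", 1, 1, 1, 2, 2, 3, 3, 3, 4]
y = [5, 6, 2, 4, 2, 1, 2, 3, 4, 6, 4, 5, 6, 7]
opscore_means, opscore_counts = category_means(x, y, group_nonmissing=True)
other_means, other_counts = category_means(x, y, group_nonmissing=False)
```

With the example data, `opscore_means` has eight entries and `other_means` has thirteen, matching the two rows of means shown above.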

The category means are the coefficients
of a category indicator design matrix.
The category means are the Fisher (1938) optimal scores.
For MONOTONE and UNTIE transformations, order constraints
are imposed on the category means for the nonmissing
partition by merging categories that are out of order.
The algorithm checks upward until an order violation is found,
then averages downward until the order violation is averaged away.
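(The average of a mean *m*_{1} computed from *n*_{1} observations
and a mean *m*_{2} computed from *n*_{2} observations is the weighted mean
(*n*_{1}*m*_{1} + *n*_{2}*m*_{2})/(*n*_{1} + *n*_{2}).)
For a MONOTONE transformation, the means of the nonmissing values
in this example are (2 5 5 7)', and the checks are
2<5:OK, 5=5:OK, 5<7:OK.
The means are already in order, so no categories are merged.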

The UNTIE transformation (Kruskal 1964, primary approach to ties) uses the same algorithm on the means of the nonmissing values (1 2 3 4 6 4 5 6 7)' but with different results for this example: 1<2:OK, 2<3:OK, 3<4:OK, 4<6:OK, 6>4:average 6 and 4 and replace 6 and 4 by the average. The new means of the nonmissing values are (1 2 3 4 5 5 5 6 7)'. The check resumes: 4<5:OK, 5=5:OK, 5=5:OK, 5<6:OK, 6<7:OK.

If some of the special missing values are ordered, the upward-checking, downward-averaging method is applied to them also, independently of the other missing and nonmissing partitions. Once the means conform to any required category or order constraints, an optimally scaled vector is produced from the means. The following example results from a MONOTONE transformation.

x: | `(. . .A .A .B 1 1 1 2 2 3 3 3 4)'`

y: | `(5 6 2 4 2 1 2 3 4 6 4 5 6 7)'`

result: | `(5 6 3 3 2 2 2 2 5 5 5 5 5 7)'`
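
The upward-checking, downward-averaging step and the expansion of category means into a scaled vector can be sketched as follows. This is a generic pool-adjacent-violators routine written for illustration, with merged means weighted by the number of observations behind them. The names `monotone_means` and `expand` are hypothetical, and the missing-value categories are simply passed through unchanged, which assumes none of them is ordered.

```python
def monotone_means(means, counts):
    """Merge out-of-order adjacent categories; return one mean per category."""
    blocks = []  # each block holds [mean, n_observations, n_categories]
    for m, n in zip(means, counts):
        blocks.append([m, n, 1])
        # average downward until the order violation is averaged away
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, n2, k2 = blocks.pop()
            m1, n1, k1 = blocks.pop()
            merged = (n1 * m1 + n2 * m2) / (n1 + n2)
            blocks.append([merged, n1 + n2, k1 + k2])
    out = []
    for m, _, k in blocks:
        out.extend([m] * k)
    return out


def expand(means, counts):
    """Repeat each category mean once per observation in the category."""
    return [m for m, n in zip(means, counts) for _ in range(n)]


# MONOTONE example: missing categories, then nonmissing categories
missing_means, missing_counts = [5, 6, 3, 2], [1, 1, 2, 1]
nonmissing_means, nonmissing_counts = [2, 5, 5, 7], [3, 2, 3, 1]
result = expand(missing_means, missing_counts) + expand(
    monotone_means(nonmissing_means, nonmissing_counts), nonmissing_counts
)

# UNTIE example: every nonmissing datum is its own category
untie_means = monotone_means([1, 2, 3, 4, 6, 4, 5, 6, 7], [1] * 9)
```

Here `result` reproduces the MONOTONE vector shown above, and `untie_means` reproduces the UNTIE means (1 2 3 4 5 5 5 6 7)'.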

The upward checking, downward averaging algorithm is equivalent to creating a category indicator design matrix, solving for least-squares coefficients with order constraints, then computing the linear combination of design matrix columns.
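
To make the design-matrix view concrete, the sketch below builds a category indicator matrix for the MONOTONE example, solves the unconstrained least-squares problem, and forms the linear combination of columns. Because the columns of an indicator matrix do not overlap, the normal equations are diagonal and each coefficient is just a category mean. In this example the nonmissing means are already in order, so the order constraints are inactive and the unconstrained solution matches the MONOTONE result; the sketch does not implement the constrained solve. All function names are hypothetical.

```python
def indicator_matrix(counts):
    """One row per observation, one column per category (sorted data)."""
    n_cat = len(counts)
    rows = []
    for j, n in enumerate(counts):
        for _ in range(n):
            rows.append([1 if k == j else 0 for k in range(n_cat)])
    return rows


def least_squares_coefficients(Z, y):
    """Solve Z'Z b = Z'y; Z'Z is diagonal, so b_j = (column sum of y) / count."""
    n_cat = len(Z[0])
    ztz = [sum(row[j] for row in Z) for j in range(n_cat)]
    zty = [sum(row[j] * yi for row, yi in zip(Z, y)) for j in range(n_cat)]
    return [s / n for s, n in zip(zty, ztz)]


def linear_combination(Z, b):
    """Compute Z b, the scaled vector implied by the coefficients."""
    return [sum(z * bj for z, bj in zip(row, b)) for row in Z]


counts = [1, 1, 2, 1, 3, 2, 3, 1]  # category sizes in the MONOTONE example
y = [5, 6, 2, 4, 2, 1, 2, 3, 4, 6, 4, 5, 6, 7]
Z = indicator_matrix(counts)
b = least_squares_coefficients(Z, y)
scaled = linear_combination(Z, b)
```

The coefficients `b` equal the category means, and `scaled` equals the optimally scaled vector from the example.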

For the optimal transformation LINEAR and for nonoptimal
transformations, missing values are handled as just described.
The nonmissing target values are regressed onto the matrix
defined by the nonmissing initial scaling values and an intercept.
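In this example, the target vector
**y**_{n} = (1 2 3 4 6 4 5 6 7)'
is regressed onto the design matrix whose columns are
an intercept, (1 1 1 1 1 1 1 1 1)', and the nonmissing
initial scaling values, (1 1 1 2 2 3 3 3 4)'.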
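
The regression of the nonmissing target values on an intercept and the nonmissing initial scaling values can be worked through with ordinary least squares. The sketch below uses exact fractions so the coefficients are not blurred by rounding. It assumes integer inputs, and `simple_regression` is a hypothetical helper, not a SAS routine.

```python
from fractions import Fraction


def simple_regression(x, y):
    """Least-squares intercept and slope for y = a + b*x (integer data assumed)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    slope = Fraction(n * sxy - sx * sy, n * sxx - sx * sx)
    intercept = (Fraction(sy) - slope * sx) / n
    return intercept, slope


x_n = [1, 1, 1, 2, 2, 3, 3, 3, 4]  # nonmissing initial scaling values
y_n = [1, 2, 3, 4, 6, 4, 5, 6, 7]  # nonmissing target values
intercept, slope = simple_regression(x_n, y_n)
fitted = [intercept + slope * xi for xi in x_n]
```

For this data the intercept is 36/43 (about 0.837) and the slope is 131/86 (about 1.523). As with any least-squares fit that includes an intercept, the fitted values have the same mean as the target values.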


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.