

SAS Companion for the OS/2 Environment

Special Considerations When Using External DLLs

Using PEEK Functions to Access Character String Arguments

Because the SAS language does not provide pointers as data types, you must use the PIB4. format or informat to represent pointers. You can then use the SAS PEEK functions to access the data that are stored at these address values.

For example, suppose that you have a routine that is named GetPath in a library that is named SERVICES.DLL. It has two arguments, an integer function code and a pointer to a pointer. The function code determines what action GetPath will take, and the second argument points to a pointer that will be updated by GetPath to refer to a system character string. The calling code in C might be

printf("GetPath indicates string is

Using MODULE, the corresponding attribute table entry would be

The attribute table could be invoked as follows:
call module('SERVICES,GetPath',1,stgptr);
put stgptr= stgptr=hex8.;
If the pointer value in STGPTR is 0035F780, STGPTR would actually be set to the decimal value 3536768, which is the decimal equivalent of 0035F780. So the preceding PUT statement would produce
STGPTR=3536768 STGPTR=0035F780
However, you want the data at address 0035F780, not the value of the pointer itself. To access those data, you need to use the PEEKC function.
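
The correspondence between the two PUT results can be checked with a short C sketch. The helper name hex_address_to_decimal is hypothetical and is not part of the SAS System or of SERVICES.DLL. It only illustrates that the hexadecimal and decimal forms are the same number in different notations.

```c
#include <stdlib.h>

/* Hypothetical helper (not a SAS or DLL routine): interpret a pointer
   that is written as hexadecimal text as an unsigned integer value. */
unsigned long hex_address_to_decimal(const char *hex_text)
{
    return strtoul(hex_text, NULL, 16);
}
```

Calling hex_address_to_decimal("0035F780") yields 3536768, the value that the unformatted PUT of STGPTR displays.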

The PEEKC function is given two arguments, a pointer via a numeric variable (such as STGPTR above) and a length in bytes (characters). PEEKC returns a character string of the specified length that contains the characters at the pointer location.

In the example, suppose that GetPath sets the second argument's pointer value to the address of the null-terminated character string C:\XYZ. You can access the character data with the following statements:

call module('SERVICES,GetPath',1,stgptr);
length path $64;
path = peekc(stgptr,64);
i = index(path,'00'x);
if i then substr(path,i)=' ';
/* path now contains the string */

The PEEKC function copies 64 bytes that start at the location that is referred to by the pointer in STGPTR. Because you need only the data up to (but not including) the null terminator, you search for the null terminator with the INDEX function and then overwrite it and every character after it with blanks.

You can also use the $CSTR format in this scenario to simplify your code slightly:

call module('SERVICES,GetPath',1,stgptr);
length path $64;
path = put(peekc(stgptr,64),$cstr64.);
The $CSTR format accepts as input a character string of a specified width. It looks for a null terminator and pads the output string with blanks from that point. For more information, see $CSTRw. Format.
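
As a rough illustration of what the PEEKC call followed by the null-terminator cleanup accomplishes, here is a minimal C sketch. The function name peek_blank_pad is hypothetical and this is not how the SAS System implements PEEKC or $CSTR; it only mirrors the steps that are described above. Note that, like PEEKC, it reads the full width regardless of where the string ends, so the source area must be at least that large.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `width` bytes that start at `addr` (the raw
   copy that PEEKC performs), then replace the first null byte and every
   byte after it with blanks. `out` must hold at least `width` bytes. */
void peek_blank_pad(const void *addr, char *out, size_t width)
{
    memcpy(out, addr, width);                       /* raw fixed-width copy  */
    char *nul = (char *)memchr(out, '\0', width);   /* like INDEX(path,'00'x) */
    if (nul != NULL)
        memset(nul, ' ', width - (size_t)(nul - out)); /* blank the rest     */
}
```

The result is a fixed-width, blank-padded field, which is the form that a SAS character variable such as PATH holds.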

Accessing External DLLs Efficiently

The MODULExy routine reads the attribute table that is referenced by the SASCBTBL fileref once per step (DATA step, PROC IML step, or SCL step). MODULExy parses the table and stores the attribute information for future use during the step. When you use a MODULExy function, the SAS System searches the stored attribute information for the matching routine and module names. The first time that you access a DLL during a step, the SAS System loads the DLL and determines the address of the requested routine. Each DLL that you invoke stays loaded for the duration of the step, and is not reloaded in subsequent calls. All modules and routines are unloaded at the end of the step. For example, suppose the attribute table had the basic form

* routines XYZ and BBB in FIRST.DLL;
* routines ABC and DDD in SECOND.DLL;
Then suppose that the DATA step looked like this:
filename sascbtbl 'myattr.tbl';
data _null_;
   do i=1 to 50;
      /* FIRST.DLL is loaded only once */
      value = modulen('XYZ',i);
      /* SECOND.DLL is loaded only once */
      value2 = modulen('ABC',value);
      put i= value= value2=;
   end;
run;
In this example, MODULEN parses the attribute table during DATA step compilation. In the first loop iteration (i=1), FIRST.DLL is loaded and the XYZ routine is accessed when MODULEN calls for it. Next, SECOND.DLL is loaded and the ABC routine is accessed. For subsequent loop iterations (starting when i=2), FIRST.DLL and SECOND.DLL remain loaded, so the MODULEN function simply accesses the XYZ and ABC routines. The SAS System unloads both DLLs at the end of the DATA step.

Note that the attribute table can contain any number of descriptions for routines that are not accessed for a given step. This does not cause any additional overhead (apart from a few bytes of internal memory to hold the attribute descriptions). In the previous example, BBB and DDD are in the attribute table but are not accessed by the DATA step.
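
The load-once behavior can be pictured as a small lookup table that is consulted before any load is attempted. The following C sketch is an assumption-laden illustration, not the SAS implementation: get_module, unload_all, and fake_load are hypothetical names, and fake_load merely counts calls instead of loading a real DLL.

```c
#include <string.h>

#define MAX_MODULES 8

struct module_entry { char name[32]; int handle; };

static struct module_entry loaded[MAX_MODULES];
static int loaded_count = 0;
static int load_calls = 0;   /* how many times the costly load step ran */

/* Stand-in for the operating system load; it only counts invocations. */
static int fake_load(const char *name) { (void)name; return ++load_calls; }

/* Return the handle for a module, loading it only on first use. */
int get_module(const char *name)
{
    for (int i = 0; i < loaded_count; i++)
        if (strcmp(loaded[i].name, name) == 0)
            return loaded[i].handle;      /* already loaded in this step */
    if (loaded_count == MAX_MODULES)
        return -1;                        /* table full */
    strncpy(loaded[loaded_count].name, name, sizeof loaded[0].name - 1);
    loaded[loaded_count].handle = fake_load(name);
    return loaded[loaded_count++].handle;
}

/* Corresponds to the end of the step: forget every loaded module. */
void unload_all(void) { loaded_count = 0; }
```

Requesting FIRST and SECOND fifty times each runs the load step only twice; after unload_all, the next request loads again.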

Grouping SAS Variables as Structure Arguments

A common need when you call external routines is to pass a pointer to a structure. Some parts of the structure might be used as input to the routine, while other parts might be replaced or filled in by the routine. Even though the SAS System does not have structures in its language, you can indicate to MODULExy that you want a particular set of arguments to be grouped into a single structure. You indicate this by using the FDSTART option of the ARG statement to flag the argument that begins the structure in the attribute table. The SAS System gathers that argument and all the arguments that follow (until it encounters another FDSTART option) into a single contiguous block, and passes a pointer to the block as an argument to the DLL routine.

For example, consider the GetClientRect routine, which is part of the Win32 API in USER32.DLL. This routine retrieves the coordinates of a window's client area. Calling it also requires another routine, GetActiveWindow, which returns the handle of the window whose coordinates you want.

The C prototypes for these routines are

HWND GetActiveWindow(VOID);
BOOL GetClientRect(HWND hWnd, LPRECT lprc);
In C, the code to invoke them is
typedef struct tagRECT {
    int left;
    int top;
    int right;
    int bottom;
    } RECT;
/* RECT is a structure variable */
/* more SAS statements          */
/* Need the window handle first */
/* Function call, passing the address */
/* of RECT                            */
GetClientRect(hWnd, &RECT);

To call these routines using MODULE, you use the following attribute table entries:

routine GetActiveWindow
routine GetClientRect
arg 1 num input byvalue format=pib4.;
arg 2 num update fdstart format=ib4.;
arg 3 num update         format=ib4.;
arg 4 num update         format=ib4.;
arg 5 num update         format=ib4.;
Then use the following DATA step:
filename sascbtbl 'sascbtbl.dat';
data _null_;
   call module('GetClientRect',hwnd,left,
   put left= top= right= bottom=;

The use of the FDSTART option in the ARG statement for argument 2 indicates that argument 2 and all subsequent arguments are to be gathered together into a single parameter block.
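
Conceptually, the grouping amounts to a gather step before the call and a scatter step after it. The C sketch below illustrates that idea under stated assumptions: call_with_grouped_args is a hypothetical name, fill_rect is a stand-in for a routine such as GetClientRect, and the coordinate values that it stores are invented for the illustration.

```c
#include <stdint.h>

/* Stand-in for a routine that fills a four-integer structure.
   The values 640 and 480 are made up for this sketch. */
static int fill_rect(int32_t *block)
{
    block[0] = 0;  block[1] = 0;  block[2] = 640;  block[3] = 480;
    return 1;
}

/* Gather four separate variables into one contiguous block, pass a
   single pointer to the routine, then copy the updated values back. */
int call_with_grouped_args(int32_t *left, int32_t *top,
                           int32_t *right, int32_t *bottom)
{
    int32_t block[4] = { *left, *top, *right, *bottom };  /* gather  */
    int rc = fill_rect(block);                            /* one pointer */
    *left = block[0];  *top = block[1];                   /* scatter */
    *right = block[2]; *bottom = block[3];
    return rc;
}
```

The routine sees one pointer argument, while the caller continues to work with four separate variables, which is the arrangement that FDSTART requests.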

The output in the log from the PUT statement would be as follows:


Using Constants and Expressions as Arguments to MODULE

You can pass any kind of expression as an argument to the MODULExy functions. The attribute table indicates whether the argument is for input, output, or update access.

You can specify input arguments as constants and arithmetic expressions. However, because the routine must be able to modify output and update arguments and return them, you can pass only a variable for these parameters. If you specify a constant or expression where an updatable value is expected, the SAS System issues a warning message that points out the error. Processing continues, but the MODULExy routine cannot update a constant or expression argument, so the value that you wanted returned is lost.

Consider these examples. Here is the attribute table:

* attribute table entry for ABC;
routine abc minarg=2 maxarg=2;
arg 1 input format=ib4.;
arg 2 output format=ib4.;
Here is the DATA step with the MODULE calls:
data _null_;
  /* passing a variable as the    */
  /*   second argument - OK       */
  call module('abc',1,x);
  /* passing a constant as the    */
  /*   second argument - INVALID  */
  call module('abc',1,2);
  /* passing an expression as the */
  /*   second argument - INVALID  */
  call module('abc',1,x+1);
run;

In the preceding example, the first call to MODULE is correct because the variable x is updated with the value that the abc routine returns for the second argument. The second call to MODULE is not correct because a constant is passed. MODULE issues a warning that indicates that you have passed a constant, and MODULE passes a temporary location instead. The third call to MODULE is not correct because an arithmetic expression is passed, and this causes a temporary location from the DATA step to be used. The returned value is lost.
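
The loss of the returned value is the same effect that occurs in C when a routine writes through a pointer to a temporary copy. The sketch below is only an analogy: the abc function here is a hypothetical stand-in whose behavior (storing ten times its input) is invented for the illustration.

```c
/* Hypothetical stand-in for the abc routine: stores a result through
   its second argument. */
static void abc(int in, int *out) { *out = in * 10; }

/* Passing the variable itself: the update is visible afterward. */
int update_variable(int x)
{
    abc(1, &x);
    return x;
}

/* Passing an expression: the routine updates a temporary location,
   and the caller's variable is left unchanged. */
int update_temporary(int x)
{
    int tmp = x + 1;     /* temporary that holds the expression value */
    abc(1, &tmp);
    return x;            /* the update in tmp is discarded */
}
```

In the second function, the routine runs normally, but nothing that it stores reaches the caller's variable.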

Specifying Formats and Informats to Use with MODULE Arguments

You specify the SAS format and informat for each argument of a DLL routine that is named in a ROUTINE statement by specifying the FORMAT attribute within the ARG statement for that argument. The format indicates how numeric and character values should be passed to the DLL routine and how they should be read back upon completion of the routine.

Usually, the format that you use corresponds to a variable type for a given programming language. The following sections describe the proper formats that correspond to different variable types in various programming languages.

C Language Formats

C Type            SAS Format/Informat
double            RB8.
float             FLOAT4.
signed int        IB4.
signed short      IB2.
signed long       IB4.
char *            IB4.
unsigned int      PIB4.
unsigned short    PIB2.
unsigned long     PIB4.
char[w]           $CHARw. or $CSTRw. (see $CSTRw. Format)

Note:   For information about passing character data other than as pointers to character strings, see $BYVALw. Format.
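
The difference between the IBw. rows (signed integer binary) and the PIBw. rows (positive, that is, unsigned integer binary) can be seen by decoding the same four bytes both ways. The following C sketch uses the hypothetical helper names read_ib4 and read_pib4 and assumes two's-complement integers; it is an illustration, not SAS code.

```c
#include <stdint.h>
#include <string.h>

/* Decode four bytes as a signed 32-bit integer (the IB4. view). */
int32_t read_ib4(const unsigned char *p)
{
    int32_t v;
    memcpy(&v, p, 4);
    return v;
}

/* Decode the same four bytes as an unsigned integer (the PIB4. view). */
uint32_t read_pib4(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, 4);
    return v;
}
```

Four bytes of hex FF decode as -1 in the signed view and as 4294967295 in the unsigned view, which is why the format must match the C type that the routine declares.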

FORTRAN Language Formats

FORTRAN Type      SAS Format/Informat
integer*2         IB2.
integer*4         IB4.
real*4            RB4.
real*8            RB8.
character*w       $CHARw.

The MODULE routines can support FORTRAN character arguments only if they are not expected to be passed by a descriptor.

PL/I Language Formats

PL/I Type SAS Format/Informat

The PL/I descriptions are included here for completeness; their inclusion does not guarantee that you will be able to invoke PL/I routines.

COBOL Language Formats

COBOL Format               SAS Format/Informat   Description
PIC Sxxxx BINARY           IBw.                  integer binary
COMP-2                     RB8.                  double-precision floating point
COMP-1                     RB4.                  single-precision floating point
PIC xxxx or Sxxxx          Fw.                   printable numeric
PIC yyyy                   $CHARw.               character

The following COBOL specifications might not properly match with the formats that are supplied by the Institute because zoned and packed decimal formats are not truly defined for systems that are based on Intel architecture.

COBOL Format               SAS Format/Informat   Description
PIC Sxxxx DISPLAY          ZDw.                  zoned decimal
PIC Sxxxx PACKED-DECIMAL   PDw.                  packed decimal

The following COBOL specifications do not have true native equivalents and are usable only in conjunction with the corresponding S370Fxxx informat and format. The specifications allow IBM mainframe-style representations to be read and written in the PC environment.

COBOL Format                                SAS Format/Informat   Description
PIC xxxx DISPLAY                            S370FZDUw.            zoned decimal unsigned
PIC Sxxxx DISPLAY SIGN LEADING              S370FZDLw.            zoned decimal leading sign
PIC Sxxxx DISPLAY SIGN LEADING SEPARATE     S370FZDSw.            zoned decimal leading sign separate
PIC Sxxxx DISPLAY SIGN TRAILING SEPARATE    S370FZDTw.            zoned decimal trailing sign separate
PIC xxxx BINARY                             S370FIBUw.            integer binary unsigned
PIC xxxx PACKED-DECIMAL                     S370FPDUw.            packed decimal unsigned

$CSTRw. Format

If you pass a character argument as a null-terminated string, use the $CSTRw. format. This format looks for the last nonblank character of your character argument and passes a copy of the string with a null terminator after the last nonblank character. For example, consider the following attribute table entry:

* attribute table entry;
routine abc minarg=1 maxarg=1;
arg 1 input char format=$cstr10.;

You can then use the following DATA step:

data _null_;
     call module('abc','my string');
run;

The $CSTR format adds a null terminator to the character string my string before passing it to the abc routine. This action produces an equivalent to the following attribute entry:

* attribute table entry;
routine abc minarg=1 maxarg=1;
arg 1 input char format=$char10.;

You can then use the following DATA step:

data _null_;
     call module('abc','my string'||'00'x);
run;

The first example is easier to understand and easier to use when you use variable or expression arguments.

The $CSTR informat converts a null-terminated string into a blank-padded string of the specified length. If the DLL routine is supposed to update a character argument, use the $CSTR informat in the argument attribute.
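
The format direction (blank-padded SAS value to null-terminated string) can be sketched in C as follows. The function name blank_padded_to_cstr is hypothetical, and the sketch assumes that the output buffer has room for the full width plus the terminator; it illustrates the behavior that is described above and is not the SAS implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the last nonblank character in a
   fixed-width, blank-padded field and write a null-terminated copy.
   `out` must have room for width + 1 bytes. Returns the string length. */
size_t blank_padded_to_cstr(const char *field, size_t width, char *out)
{
    size_t len = width;
    while (len > 0 && field[len - 1] == ' ')
        len--;                    /* skip trailing blanks */
    memcpy(out, field, len);
    out[len] = '\0';              /* terminator after last nonblank */
    return len;
}
```

For the 10-byte field that holds my string followed by one blank, the copy is the 9-character string my string followed by a null terminator.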

$BYVALw. Format

When you use a MODULExy function to pass a single character by value, the argument is automatically promoted to an integer. If you want to use a character expression in the MODULExy call, you must use the special format or informat called $BYVALw. The $BYVALw. format or informat expects a single character and produces a numeric value, the size of which depends on w. $BYVAL2. produces a short, $BYVAL4. produces a long, and $BYVAL8. produces a double. Consider this example, which uses the C language:

long xyz(a,b)
  long a; double b;
{
  static char c = 'Y';
  if (a == 'X')
    return(1);
  else if (b == c)
    return(2);
  else return(3);
}

In this example, the xyz routine expects two arguments, a long and a double. If the long is an X, the actual value of the long is 88 in decimal. This is because an ASCII X is stored as hex 58, and this is promoted to a long, represented as 0x00000058 (or 88 decimal). If the value of a is X, or 88, a 1 is returned. If the second argument, a double, is Y (which is interpreted as 89), then 2 is returned.

Now suppose that you want to pass characters as the arguments to xyz. In C, you would invoke them as follows:

x = xyz('X',(double)'Z');
y = xyz('Q',(double)'Y');
This is because the X and Q values are automatically promoted to ints (which are the same as longs for the sake of this example), and the integer values corresponding to Z and Y are cast to doubles.

To call xyz by using the MODULEN function, your attribute table must reflect the fact that you want to pass characters:

routine xyz minarg=2 maxarg=2 returns=long;
arg 1 input char byvalue format=$byval4.;
arg 2 input char byvalue format=$byval8.;
Note that it is important that the BYVALUE option appear in the ARG statement as well. Otherwise, MODULEN assumes that you want to pass a pointer to the routine, instead of a value.

Here is the DATA step that invokes MODULEN and passes it characters:

data _null_;
     x = modulen('xyz','X','Z');
     put x= ' (should be 1)';
     y = modulen('xyz','Q','Y');
     put y= ' (should be 2)';
run;
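
The promotions that $BYVAL4. and $BYVAL8. request can be mimicked in C with two small conversion helpers. In the sketch below, byval4 and byval8 are hypothetical names, xyz is restated with a modern prototype from the description in the text, and an ASCII character set is assumed.

```c
/* The xyz routine as described in the text, with a modern prototype. */
long xyz(long a, double b)
{
    static char c = 'Y';
    if (a == 'X')
        return 1;
    else if (b == c)
        return 2;
    else
        return 3;
}

/* Hypothetical equivalents of the promotions: one character becomes
   a long ($BYVAL4.) or a double ($BYVAL8.). */
long   byval4(char ch) { return (long)(unsigned char)ch; }
double byval8(char ch) { return (double)(unsigned char)ch; }
```

With ASCII, byval4('X') is 88, so xyz(byval4('X'), byval8('Z')) takes the first branch, and xyz(byval4('Q'), byval8('Y')) takes the second.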

Understanding MODULE Log Messages

If you specify the i option in the control string parameter to MODULE (for example, '*i'), the SAS System prints several informational messages to the log. You can use these messages to determine whether you have passed incorrect arguments or coded the attribute table incorrectly.

Consider this example that uses MODULEIN from within the IML procedure. It uses the MODULEIN function to invoke the changi routine (stored in the theoretical TRYMOD.DLL). In the example, MODULEIN passes the constant 6 and the matrix x2, which is a 4x5 matrix that is to be converted to an integer matrix. The attribute table for changi is as follows:

routine changi module=trymod returns=long;
arg 1 input num format=ib4. byvalue;
arg 2 update num format=ib4.;
The following IML step invokes MODULEIN:
proc iml;
   x1 = J(4,5,0);
   do i=1 to 4;
      do j=1 to 5;
         x1[i,j] = i*10+j+3;
      end;
   end;
   y1= x1;
   x2 = x1;
   y2 = y1;
   rc = modulein('*i','changi',6,x2);
The '*i' control string causes the lines shown in MODULEIN Output to be printed in the log.

CHR PARM 2 885E0AD0 6368616E6769 (changi)
NUM PARM 3 885E0AE0 0000000000001840
NUM PARM 4 885E07F0
---ROUTINE changi LOADED AT ADDRESS 886119B8 (PARMLIST AT 886033A0)---
PARM 1 06000000     <CALL-BY-VALUE>
PARM 2 88604720
PARM 2 88604720
NUM PARM 4 885E07F0

The output is divided into four sections.

The first section describes the arguments that are passed to MODULEIN.

The 'CHR PARM n' portion indicates that character parameter n was passed. In the example, 885E0AA8 is the actual address of the first character parameter that was passed to MODULEIN. The value at the address is hex 2A69, and the ASCII representation of that value ('*i') is in parentheses after the hex value. The second parameter is likewise printed. Only these first two arguments have their ASCII equivalents printed; this is because other arguments might contain unreadable binary data.

The remaining parameters appear with only hex representations of their values (NUM PARM 3 and NUM PARM 4 in the example).

The third parameter to MODULEIN is numeric, and it is at address 885E0AE0. The hex representation of the floating point number 6 is shown. The fourth parameter is at address 885E07F0, which points to an area that contains all the values for the 4x5 matrix. The *i option prints the entire argument; be careful if you use this option with large matrices because the log might become quite large.
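
The hex strings in the log are the bytes of each value in memory order. The following C sketch reproduces three of them; hex_dump is a hypothetical helper, and the results assume a little-endian machine with IEEE 754 doubles, which matches Intel-based systems.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write the bytes of `data`, in memory order, as
   uppercase hex text. `out` must have room for 2*n + 1 characters. */
void hex_dump(const void *data, size_t n, char *out)
{
    const unsigned char *p = (const unsigned char *)data;
    for (size_t i = 0; i < n; i++)
        sprintf(out + 2 * i, "%02X", p[i]);
    out[2 * n] = '\0';
}
```

Under those assumptions, dumping the double 6.0 gives 0000000000001840 (the NUM PARM 3 value), dumping the 4-byte integer 6 gives 06000000 (the call-by-value PARM 1), and dumping the characters changi gives 6368616E6769.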

The second section of the log lists the arguments to be passed to the requested routine and, in this case, changed. This section is important for determining if the arguments are being passed to the routine correctly. The first line of this section contains the name of the routine and its address in memory. It also contains the address of the location of the parameter block that MODULEIN created.

The log contains the status of each argument as it is passed. For example, the first parameter in the example is call-by-value (as indicated in the log). The second parameter is the address of the matrix. The log shows the address, along with the data to which it points.

Note that all the values in the first parameter and in the matrix are long integers because the attribute table states that the format is IB4.

In the third section, the log contains the argument values upon return from changi. The call-by-value argument is unchanged, but the other argument (the matrix) contains different values.

The last section of the log output contains the values of the arguments as they are returned to the MODULEIN calling routine.


Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.