hash and hash iterator objects
enable you to quickly and efficiently store, search, and retrieve data based on lookup
keys. The hash object keys and data are DATA step variables. Key and data values
can be directly assigned constant values or values from a SAS data set. For
information about the hash and hash iterator object language elements, see
“Dictionary of Hash and Hash Iterator Object Language Elements” in SAS
Component Objects: Reference.
Java object
provides a mechanism that is similar to the Java Native Interface (JNI) for
instantiating Java classes and accessing fields and methods on the resultant objects.
For more information, see “Dictionary of Java Object Language Elements” in SAS
Component Objects: Reference.
logger and appender objects
enable you to record logging events and write these events to the appropriate
destination. For more information, see “Component Object Reference” in SAS
Logging: Configuration and Programming Reference.
The DATA step Component Interface enables you to create and manipulate these
component objects using statements, attributes, operators, and methods. You use the
DATA step object dot notation to access the component object's attributes and methods.
For detailed information about dot notation and the DATA step objects' statements,
attributes, methods, and operators, see the Dictionary of Component Language Elements
in SAS Component Objects: Reference.
Note: The DATA step component object statements, attributes, methods, and operators
are limited to those defined for these objects. You cannot use the SAS Component
Language functionality with these predefined DATA step objects.
Using the Hash Object
Why Use the Hash Object?
The hash object provides an efficient, convenient mechanism for quick data storage and
retrieval. The hash object stores and retrieves data based on lookup keys.
To use the DATA step Component Object Interface, follow these steps:
1. Declare the hash object.
2. Create an instance of (instantiate) the hash object.
3. Initialize lookup keys and data.
After you declare and instantiate a hash object, you can perform many tasks, including
these:
• Store and retrieve data.
• Maintain key summaries.
• Replace and remove data.
• Compare hash objects.
• Output a data set that contains the data in the hash object.
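The three steps and several of these tasks can be sketched in one short DATA step. This is a minimal illustration rather than a complete reference: the variable names patient_id and weight, the sample values, and the output data set work.weights are assumptions chosen for the example.

```sas
data _null_;
   length patient_id 8 weight 8;

   /* Steps 1 and 2: declare and instantiate the hash object */
   declare hash h();

   /* Step 3: initialize the lookup key and the data */
   rc = h.defineKey('patient_id');
   rc = h.defineData('patient_id', 'weight');
   rc = h.defineDone();

   /* Store data (assumed sample values) */
   patient_id = 101; weight = 182; rc = h.add();
   patient_id = 102; weight = 145; rc = h.add();

   /* Replace the data for an existing key, then remove another key */
   patient_id = 101; weight = 179; rc = h.replace();
   rc = h.remove(key: 102);

   /* Output a data set that contains the data in the hash object */
   rc = h.output(dataset: 'work.weights');
run;
```

The key variable is listed in DEFINEDATA as well so that it also appears in the output data set.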
518 Chapter 22 Using DATA Step Component Objects
For example, suppose that you have a large data set that contains numeric lab results,
each associated with a unique patient number and a weight, and a small data set that
contains patient numbers (a subset of those in the large data set). You can load the
large data set into a hash object, using the unique patient number as the key and the
weight values as the data. You then make a single pass over the small data set, using
each patient number to look up that patient in the hash object, and write the patients
whose weight is over a certain value to a different data set.
Depending on the number of lookup keys and the size of the data set, the hash object
lookup can be significantly faster than a standard format lookup. If you are only looking
up keys, have a lot of memory available, and want fast performance, load the large data
set into the hash object. If you do not want to use a lot of memory, load the small data
set into the hash object instead.
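The patient lookup described above might look like the following sketch. The data set names (work.labresults, work.subset, work.heavy), the variable names, the sample values, and the weight threshold of 175 are all assumptions made for illustration.

```sas
/* Hypothetical sample data standing in for the large and small data sets */
data work.labresults;
   input patient_id weight;
   datalines;
101 182
102 145
103 210
104 160
;
run;

data work.subset;
   input patient_id;
   datalines;
101
103
104
;
run;

data work.heavy(drop=rc);
   length weight 8;
   if _N_ = 1 then do;
      /* Load the large data set into the hash object once */
      declare hash labs(dataset: 'work.labresults');
      rc = labs.defineKey('patient_id');
      rc = labs.defineData('weight');
      rc = labs.defineDone();
      call missing(weight);
   end;

   /* Single pass over the small data set */
   set work.subset;

   /* FIND returns 0 when the key is found and copies weight from the hash */
   rc = labs.find();
   if rc = 0 and weight > 175 then output;
run;
```

Loading the large data set into the hash object follows the memory-for-speed trade-off described above; swapping the roles of the two data sets would reduce memory use.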
Declaring and Instantiating a Hash Object
You declare a hash object using the DECLARE statement. After you declare the new
hash object, use the _NEW_ operator to instantiate the object. For example:
declare hash myhash;
myhash = _new_ hash();
The DECLARE statement tells the compiler that the object reference MyHash is of type
hash. At this point, you have declared only the object reference MyHash. It has the
potential to hold a component object of type hash. You should declare the hash object
only once. The _NEW_ operator creates an instance of the hash object and assigns it to
the object reference MyHash.
There is an alternative to the two-step process of using the DECLARE statement and the
_NEW_ operator to declare and instantiate a component object. You can use the
DECLARE statement to declare and instantiate the component object in one step.
declare hash myhash();
The above statement is equivalent to the following code:
declare hash myhash;
myhash = _new_ hash();
For more information, see “DECLARE Statement, Hash and Hash Iterator Objects” in
SAS Component Objects: Reference and the “_NEW_ Operator, Hash and Hash Iterator
Objects” in SAS Component Objects: Reference.
Initializing Hash Object Data Using a Constructor
When you create a hash object, you might want to provide initialization data. A
constructor is a method that you can use to instantiate a hash object and initialize the
hash object data.
The hash object constructor can have either of the following formats:
declare hash object_name(argument_tag-1: value-1
<, ...argument_tag-n: value-n>);
object_name = _new_ hash(argument_tag-1: value-1
<, ...argument_tag-n: value-n>);
For more information, see the “DECLARE Statement, Hash and Hash Iterator Objects”
in SAS Component Objects: Reference and the “_NEW_ Operator, Hash and Hash
Iterator Objects” in SAS Component Objects: Reference.
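As a hedged illustration of the two formats, the following sketch passes argument tags to the constructor in each form. ORDERED and HASHEXP are argument tags of the hash object constructor; the values used here and the object names h1 and h2 are arbitrary choices for the example.

```sas
data _null_;
   length k $20 d $20;

   /* Format 1: DECLARE statement with constructor arguments */
   declare hash h1(ordered: 'yes');
   rc = h1.defineKey('k');
   rc = h1.defineData('d');
   rc = h1.defineDone();

   /* Format 2: _NEW_ operator with constructor arguments */
   declare hash h2;
   h2 = _new_ hash(hashexp: 4);
   rc = h2.defineKey('k');
   rc = h2.defineData('d');
   rc = h2.defineDone();

   /* Avoid notes about uninitialized variables */
   call missing(k, d);
run;
```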
Defining Keys and Data
The hash object uses lookup keys to store and retrieve data. The keys and the data are
DATA step variables that you use to initialize the hash object by using dot notation
method calls. A key is defined by passing the key variable name to the DEFINEKEY
method. Data is defined by passing the data variable name to the DEFINEDATA
method. After you have defined all key and data variables, the DEFINEDONE method is
called. Keys and data can consist of any number of character or numeric DATA step
variables.
For example, the following code initializes a character key and a character data variable:
length d $20;
length k $20;
if _N_ = 1 then do;
   declare hash h();
   rc = h.defineKey('k');
   rc = h.defineData('d');
   rc = h.defineDone();
end;
You can have multiple key and data variables, but the entire key must be unique, unless
you create the hash object with the MULTIDATA:"YES" argument tag. For more
information, see “Non-Unique Key and Data Pairs” on page 521.
You can store more than one data item with a particular key. For example, you could
modify the previous example to store auxiliary numeric values with the character key
and data. In this example, each key and each data item consists of a character value and a
numeric value:
length d1 8;
length d2 $20;
length k1 $20;
length k2 8;
if _N_ = 1 then do;
   declare hash h();
   rc = h.defineKey('k1', 'k2');
   rc = h.defineData('d1', 'd2');
   rc = h.defineDone();
end;
For more information, see the “DEFINEDATA Method” in SAS Component Objects:
Reference, “DEFINEDONE Method” in SAS Component Objects: Reference, and the
“DEFINEKEY Method” in SAS Component Objects: Reference.
Note: The hash object does not assign values to key variables (for example,
h.find(key:'abc')), and the SAS compiler cannot detect the data variable
assignments that are performed by the hash object and the hash iterator. Therefore, if
no assignment to a key or data variable appears in the program, SAS issues a note
stating that the variable is uninitialized. To avoid receiving these notes, you can
perform one of the following actions:
• Set the NONOTES system option.
• Provide an initial assignment statement (typically to a missing value) for each
  key and data variable.
• Use the CALL MISSING routine with all the key and data variables as
  parameters. Here is an example.
length d $20;
length k $20;
if _N_ = 1 then do;
   declare hash h();
   rc = h.defineKey('k');
   rc = h.defineData('d');
   rc = h.defineDone();
   call missing(k, d);
end;
Non-Unique Key and Data Pairs
By default, all of the keys in a hash object are unique. This means one set of data
variables exists for each key. In some situations, you might want to have duplicate keys
in the hash object, that is, associate more than one set of data variables with a key.
For example, assume that the key is a patient ID and the data is a visit date. If the patient
were to visit multiple times, multiple visit dates would be associated with the patient ID.
When you create a hash object with the MULTIDATA:"YES" argument tag, multiple
sets of the data variables are associated with the key.
If the data set contains duplicate keys, by default, the first instance is stored in the hash
object and subsequent instances are ignored. To store the last instance in the hash object
instead, specify the DUPLICATE:"REPLACE" argument tag. To write an error to the SAS
log when a duplicate key is found, specify DUPLICATE:"ERROR".
However, the hash object allows storage of multiple values for each key if you use the
MULTIDATA argument tag in the DECLARE statement or _NEW_ operator. The hash
object keeps the multiple values in a list that is associated with the key. This list can be
traversed and manipulated by using several methods such as HAS_NEXT or
FIND_NEXT.
To traverse a multiple data item list, you must know the current list item. Start by calling
the FIND method for a given key. The FIND method sets the current list item. Then to
determine whether the key has multiple data values, call the HAS_NEXT method. After
you have determined that the key has another data value, you can retrieve that value with
the FIND_NEXT method. The FIND_NEXT method sets the current list item to the next
item in the list and sets the corresponding data variable or variables for that item.
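The forward traversal described above can be sketched as follows. The variable names, the sample visit dates, and the result variable more are assumptions made for this illustration.

```sas
data _null_;
   length patient_id visit_date more 8;
   format visit_date date9.;

   declare hash visits(multidata: 'yes');
   rc = visits.defineKey('patient_id');
   rc = visits.defineData('visit_date');
   rc = visits.defineDone();

   /* Assumed sample data: patient 1 has two visits */
   patient_id = 1; visit_date = '05JAN2024'd; rc = visits.add();
   patient_id = 1; visit_date = '19FEB2024'd; rc = visits.add();
   patient_id = 2; visit_date = '11MAR2024'd; rc = visits.add();

   /* FIND sets the current list item for the key */
   patient_id = 1;
   rc = visits.find();
   if rc = 0 then do;
      put patient_id= visit_date=;
      /* HAS_NEXT sets more to a nonzero value while another item exists */
      rc = visits.has_next(result: more);
      do while (more ne 0);
         /* FIND_NEXT advances the current list item and sets visit_date */
         rc = visits.find_next();
         put patient_id= visit_date=;
         rc = visits.has_next(result: more);
      end;
   end;
run;
```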
In addition to moving forward through the list for a given key, you can loop backward
through the list by using the HAS_PREV and FIND_PREV methods in a similar manner.
When you have a hash object that has multiple values for a single key, you can use the
DO_OVER method in an iterative DO loop to traverse through the duplicate keys. The
DO_OVER method reads the key on the first method call and continues to iterate over
the duplicate key list until it reaches the end.
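A minimal sketch of the DO_OVER pattern follows, assuming the same hypothetical patient and visit-date variables as a patient visit scenario and passing the key explicitly through the KEY argument tag.

```sas
data _null_;
   length patient_id visit_date 8;
   format visit_date date9.;

   declare hash visits(multidata: 'yes');
   rc = visits.defineKey('patient_id');
   rc = visits.defineData('visit_date');
   rc = visits.defineDone();

   /* Assumed sample data: patient 1 has two visits */
   patient_id = 1; visit_date = '05JAN2024'd; rc = visits.add();
   patient_id = 1; visit_date = '19FEB2024'd; rc = visits.add();
   patient_id = 2; visit_date = '11MAR2024'd; rc = visits.add();

   /* DO_OVER returns 0 for each item in the list for the given key */
   patient_id = 1;
   do while (visits.do_over(key: patient_id) = 0);
      put patient_id= visit_date=;
   end;
run;
```

Compared with the FIND, HAS_NEXT, and FIND_NEXT sequence, the loop needs only one method call per iteration.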
Note: The items in a multiple data item list are maintained in the order in which you
insert them.
For more information about these and other methods associated with non-unique key and
data pairs, see “Dictionary of Hash and Hash Iterator Object Language Elements” in SAS
Component Objects: Reference.
Storing and Retrieving Data
How to Store and Retrieve Data
After you initialize the hash object's key and data variables, you can store data in the
hash object using the ADD method, or you can use the dataset argument tag to load a
data set into the hash object. If you use the dataset argument tag, and if the data set
contains more than one observation with the same value of the key, by default, SAS
keeps the first observation in the hash table and ignores subsequent observations. To
store the last instance in the hash object or to send an error to the log if there is a
duplicate key, use the DUPLICATE argument tag. To allow duplicate values for each
key, use the MULTIDATA argument tag.
You can then use the FIND method to search and retrieve data from the hash object if
one data value exists for each key. Use the FIND_NEXT and FIND_PREV methods to
search and retrieve data if multiple data items exist for each key.
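As a sketch of loading a data set with the dataset argument tag, the following example keeps the last observation for each key by specifying duplicate: 'replace' and then retrieves it with FIND. The data set work.visits, its variables, and its values are hypothetical.

```sas
/* Hypothetical input data: patient 1 appears twice */
data work.visits;
   input patient_id visit_date :date9.;
   format visit_date date9.;
   datalines;
1 05JAN2024
1 19FEB2024
2 11MAR2024
;
run;

data _null_;
   length patient_id visit_date 8;
   format visit_date date9.;

   /* duplicate: 'replace' stores the last observation for each key */
   declare hash last_visit(dataset: 'work.visits', duplicate: 'replace');
   rc = last_visit.defineKey('patient_id');
   rc = last_visit.defineData('visit_date');
   rc = last_visit.defineDone();

   /* One data value exists for each key, so FIND is sufficient */
   patient_id = 1;
   rc = last_visit.find();
   if rc = 0 then put patient_id= visit_date=;
run;
```

Omitting the DUPLICATE argument tag would keep the first observation for patient 1 instead, and multidata: 'yes' would keep both.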
For more information, see “ADD Method” in SAS Component Objects: Reference,
“FIND Method” in SAS Component Objects: Reference, “FIND_NEXT Method” in SAS
Component Objects: Reference, and the “FIND_PREV Method” in SAS Component
Objects: Reference.
You can consolidate a FIND method and ADD method using the REF method. In the
following example, you can reduce the amount of code from this:
rc = h.find();
if (rc ne 0) then
   rc = h.add();
to a single method call:
rc = h.ref();
For more information, see the “REF Method” in SAS Component Objects: Reference.
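The following sketch shows REF used in place of the FIND and ADD pair. The key and data values are assumptions, and NUM_ITEMS is used only to show that calling REF a second time with the same key does not add a second item.

```sas
data _null_;
   length k $20 d $20;

   declare hash h();
   rc = h.defineKey('k');
   rc = h.defineData('d');
   rc = h.defineDone();

   /* First call: the key is not in the hash object, so REF adds it */
   k = 'Homer';
   d = 'Odyssey';
   rc = h.ref();

   /* Second call: the key is already present, so nothing is added */
   rc = h.ref();

   /* Report how many items the hash object holds */
   n = h.num_items;
   put n=;
run;
```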
Note: You can also use the hash iterator object to retrieve the hash object data, one data
item at a time, in forward and reverse order. For more information, see “Using the
Hash Iterator Object” on page 531.
Example 1: Using the ADD and FIND Methods to Store and Retrieve Data
The following example uses the ADD method to store the data in the hash object and
associate the data with the key. The FIND method is then used to retrieve the data that is
associated with the key value Homer.
data _null_;
length d $20;
length k $20;
/* Declare the hash object and key and data variables */
if _N_ = 1 then do;
   declare hash h();
   rc = h.defineKey('k');
   rc = h.defineData('d');
   rc = h.defineDone();
end;
/* Define constant value for key and data */
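A complete, self-contained sketch of the ADD and FIND pattern that this example describes follows. Only the key value Homer is named in the text; the data values and the second key are assumptions made for illustration.

```sas
data _null_;
   length d $20;
   length k $20;

   /* Declare the hash object and key and data variables */
   if _N_ = 1 then do;
      declare hash h();
      rc = h.defineKey('k');
      rc = h.defineData('d');
      rc = h.defineDone();
   end;

   /* Assumed sample values for the first key and data pair */
   k = 'Homer';
   d = 'Odyssey';
   /* ADD stores the data and associates it with the key */
   rc = h.add();
   if (rc ne 0) then put 'Add failed.';

   /* Assumed sample values for a second key and data pair */
   k = 'Joyce';
   d = 'Ulysses';
   rc = h.add();
   if (rc ne 0) then put 'Add failed.';

   /* FIND retrieves the data that is associated with the key Homer */
   k = 'Homer';
   rc = h.find();
   if (rc = 0) then put d=;
   else put 'Key Homer not found.';
run;
```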