Avoiding Unnecessary Procedure Invocation

Overview

Best practices specify that you avoid unnecessary procedure invocation. One way to do this is to take advantage of procedures that accomplish multiple tasks with one invocation.
Several procedures enable you to perform multiple tasks or create multiple reports by invoking the procedure only once. These include the following:
  • the SQL procedure
  • the DATASETS procedure
  • the FREQ procedure
  • the TABULATE procedure
Note: BY-group processing can also minimize unnecessary invocations of procedures.
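As a minimal sketch of these two ideas, the following program assumes the SASHELP.CLASS sample table that ships with SAS (it is not part of this chapter's Company library) and a hypothetical output data set named work.class_sorted. One PROC FREQ step requests three tables, and one PROC MEANS step with a BY statement takes the place of a separate step for each group.

```sas
/* One PROC FREQ invocation produces three reports */
proc freq data=sashelp.class;
   tables sex;
   tables age;
   tables sex*age;
run;

/* BY-group processing needs sorted (or indexed) input */
proc sort data=sashelp.class out=work.class_sorted;
   by sex;
run;

/* One PROC MEANS invocation summarizes every value of Sex */
proc means data=work.class_sorted;
   by sex;
   var height weight;
run;
```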
To illustrate this principle, examine features of the DATASETS procedure.

Executing the DATASETS Procedure

The DATASETS procedure can use RUN-group processing to process multiple sets of statements. RUN-group processing enables you to submit groups of statements without ending the procedure.
When the DATASETS procedure executes, the following actions occur:
  • SAS reads the program statements that are associated with one task until it reaches a RUN statement or an implied RUN statement.
  • SAS executes all of the preceding statements immediately, and then continues reading until it reaches another RUN statement or an implied RUN statement.
To execute the last task, you must use a RUN statement or a QUIT statement.
proc datasets lib=company;
   modify orders;
      rename quantity=Units_Ordered;
      format costprice_per_unit dollar13.2;
      label delivery_date='Date of Delivery';
   run;
   modify customers;
      format customer_birthdate mmddyy10.;
   run;
quit;
You can terminate the PROC DATASETS execution by submitting a DATA statement, a PROC statement, or a QUIT statement.

RUN-Group Processing

RUN-group processing avoids unnecessary procedure invocation. The procedures that support RUN-group processing include the following:
  • CHART, GCHART
  • PLOT, GPLOT
  • GIS, GMAP
  • GLM
  • REG
  • DATASETS
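RUN-group processing looks much the same in the other procedures on this list. The following sketch uses PROC REG and assumes that SAS/STAT is licensed and that the SASHELP.CLASS sample table is available; neither assumption comes from this chapter. Each RUN statement executes the statements submitted so far, and the procedure stays active until QUIT.

```sas
proc reg data=sashelp.class;
   /* First RUN group: fit a one-predictor model */
   model weight = height;
run;
   /* Second RUN group: fit another model without reinvoking PROC REG */
   model weight = height age;
run;
quit;  /* end the procedure */
```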

Using Different Types of RUN Groups with PROC DATASETS

The DATASETS procedure supports four types of RUN groups. Each RUN group is defined by the statements that compose it and by what causes it to execute.
Some statements in PROC DATASETS act as implied RUN statements because they cause the RUN group that precedes them to execute.
The following list describes the statements that compose each type of RUN group and what causes that RUN group to execute:
  • The PROC DATASETS statement always executes immediately. No other statement is necessary to cause the PROC DATASETS statement to execute. Therefore, the PROC DATASETS statement alone is a RUN group.
  • The MODIFY statement and any of its subordinate statements form a RUN group. These RUN groups always execute immediately. No other statement is necessary to cause a MODIFY RUN group to execute.
  • The APPEND, CONTENTS, and COPY statements (including EXCLUDE and SELECT, if present) form their own separate RUN groups. Every APPEND statement forms a single-statement RUN group, every CONTENTS statement forms a single-statement RUN group, and every COPY step forms a RUN group. Any other statement in the procedure, except those that are subordinate to either the COPY or MODIFY statement, causes the RUN group to execute.
Also, one or more of the following statements form a RUN group:
  • AGE
  • EXCHANGE
  • CHANGE
  • REPAIR
  • DELETE
  • SAVE
If any of these statements appear in sequence in the PROC step, the sequence forms a RUN group. For example, if a REPAIR statement appears immediately after a SAVE statement, the REPAIR statement does not force the SAVE statement to execute; it becomes part of the same RUN group. To execute the RUN group, submit one of the following statements:
  • PROC DATASETS
  • MODIFY
  • APPEND
  • QUIT
  • CONTENTS
  • RUN
  • COPY
  • another DATA or PROC step
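The rules above can be sketched in one short step. The data set names rg_old, rg_new, rg_a, and rg_b are hypothetical scratch members of the Work library, created only for this illustration; the sketch also assumes that no member named rg_new already exists.

```sas
/* Hypothetical scratch data sets, one observation each */
data work.rg_old work.rg_a work.rg_b;
   x = 1;
run;

proc datasets lib=work nolist;
   /* CHANGE and EXCHANGE appear in sequence, so they form one RUN group */
   change rg_old=rg_new;
   exchange rg_a=rg_b;
   /* CONTENTS acts as an implied RUN: the group above executes first */
   contents data=rg_new;
   /* MODIFY and its subordinate statements form their own RUN group */
   modify rg_new;
      label x='Example variable';
quit;  /* QUIT executes any pending RUN group and ends the procedure */
```

Because CONTENTS forces the preceding group to execute, the renamed member rg_new already exists by the time its descriptor is printed.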

Comparative Example: Modifying the Descriptor Portion of SAS Data Sets

Overview

Suppose you want to use the DATASETS procedure to modify the data sets NewCustomer, NewOrders, and NewItems.
The following sample programs compare two techniques. You can use these samples as models for creating benchmark programs in your own environment. Your results might vary depending on the structure of your data, your operating environment, and the resources that are available at your site.
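One way to collect numbers for such a benchmark is the FULLSTIMER system option, which writes detailed resource statistics for each step to the log. The following is only a sketch: the scratch data set work.bench is hypothetical, and the PROC DATASETS step stands in for whichever technique you want to measure.

```sas
options fullstimer;    /* write detailed resource statistics to the log */

/* Hypothetical scratch data set used only for this measurement */
data work.bench;
   birth_date = '01JAN2000'd;
run;

/* The step being measured; substitute your own program here */
proc datasets lib=work nolist;
   modify bench;
      format birth_date date9.;
quit;

options nofullstimer;  /* turn the detailed statistics back off */
```

Run each technique several times and compare the log statistics, because a single run can be skewed by other activity on the system.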

Programming Techniques

1 Multiple DATASETS Procedures
This program invokes PROC DATASETS three times to modify the descriptor portion of the data set NewCustomer, two times to modify the descriptor portion of the data set NewOrders, and once to change the name of the data set NewItems.
proc datasets lib=company nolist;
   modify newcustomer;
   rename Country_ID=Country
          Name=Customer_Name;
quit;

proc datasets lib=company nolist;
   modify newcustomer;
   format birth_date date9.;
quit;

proc datasets lib=company nolist;
   modify newcustomer;
   label birth_date='Date of Birth';
quit;

proc datasets lib=company nolist;
   modify neworders;
   rename order=Order_ID
          employee=Employee_ID
          customer=Customer_ID;
quit;

proc datasets lib=company nolist;
   modify neworders;
   format order_date date9.;
quit;

proc datasets lib=company nolist;
   change newitems=NewOrder_Items;
quit;
2 Single DATASETS Procedure
This program invokes PROC DATASETS once to modify the descriptor portion of the data sets NewCustomer and NewOrders, and to change the name of the data set NewItems. This technique is more efficient.
proc datasets lib=company nolist;
   modify newcustomer;
   rename country_ID=Country
          name=Customer_Name;
   format birth_date date9.;
   label  birth_date='Date of Birth';
   modify neworders;
   rename order=Order_ID
          employee=Employee_ID
          customer=Customer_ID;
   format order_date date9.;
   change newitems=NewOrder_Items;
quit;

General Recommendations

  • Invoke the DATASETS procedure once and process all the changes for a library in one step. This saves CPU and I/O resources, at the cost of memory resources.
  • Use the NOLIST option on the PROC DATASETS statement. The NOLIST option suppresses printing of the library members in the log. Using NOLIST can save I/O.
Note: Because the specified library could change between invocations of the DATASETS procedure, the procedure is reloaded into memory for each invocation.