Using Benchmarks to Compare Techniques

To decide which SAS programming technique is most efficient for a particular task, you can benchmark (measure and compare) the resource usage of each technique. Benchmark with the actual data that the program will process, because results obtained with other data might not identify the most efficient technique.

Guidelines for Benchmarking

Your benchmarking is most likely to yield useful results if you follow these guidelines:
  • Before you test the programming techniques, turn on the SAS system options that report resource usage.
    As explained earlier, to track and report on resource usage, you can use some or all of the system options STIMER, MEMRPT, FULLSTIMER, and STATS. The availability, usage, and functionality of these options vary by operating environment. You can also specify MSGLEVEL=I to display additional notes in the SAS log. Use the FULLSTIMER option to log a complete list of the resources that are used.
    Note: To turn on the FULLSTIMER option, use the following statement:
    options fullstimer;
  • Execute the code for each programming technique in a separate SAS session.
    The first time that program code (including the DATA step, functions, formats, and SAS procedures) is referenced, the operating system might have to load the code into memory or assign virtual address space to it. The first time data is read, it is often loaded into a cache from which it can be retrieved more quickly the next time it is read. The resources that these actions consume are overhead. Using a separate SAS session for each technique can minimize the effect of this overhead on your resource statistics.
  • In each programming technique that you are testing, include only the SAS code that is essential for performing the task.
    If you include too many elements in the code for each technique, you cannot tell which element caused the results. If the program that you are benchmarking is not large, you can optimize it by changing individual programming techniques, one at a time, and running the entire program after each change to measure the effect on resource usage. However, a more complex program might be easier to optimize by identifying the steps that use the most resources and extracting those steps into separate programs. You can then measure the effects of different programming techniques by repeatedly changing, running, and measuring the separate programs. When isolating parts of your program, be careful to measure their resource usage under the conditions in which they are used in the complete program.
  • If your system is doing other work at the same time that you are running your benchmarking tests, be sure to run the code for each programming technique several times.
    Running the code several times reduces any variability in resource consumption that is associated with other work that the system is doing. How you handle multiple measurements depends on the resource, as indicated below:
    • Use the minimum real time and CPU time measurements, because these represent most closely the amount of time your programming technique actually requires. The larger time values (especially in the case of real time) are the result of interference from other work that the computer was doing while your program ran.
    • The amount of memory should not vary from trial to trial. If memory does vary, it is possible that your program sometimes shares a resource with another program. In this situation, you must determine whether the higher or lower memory consumption is more likely to be the case when your program is used in production.
    • I/O can be an especially elusive resource to measure. With modern file systems and storage systems, the effect of your program on the I/O activity of the computer sometimes must be observed by operating system tools, file system tools, or storage system tools because it cannot be captured by your SAS session. Data is often aggressively cached by modern file systems and storage systems, and file caches are greatly affected by other activity in the file system. Be realistic when you measure I/O—it is possible to achieve good performance on a system that is not doing other work, but performance is likely to worsen when the application is deployed in a more realistic environment.
  • Run your benchmarking tests under the conditions in which your final program will run.
    Results might vary under different conditions, so it is important to control the conditions under which your benchmarks are run. For example, if batch execution and large data sets are used in your environment, you should incorporate these conditions into your benchmarking environment.
  • After testing is finished, consider turning off the options that report resource usage.
    The options that report resource usage are themselves consumers of resources. If it is a higher priority in your environment to minimize resource usage than to periodically check an application's resource usage, then it is most efficient to turn off these options.
    Note: To turn off the FULLSTIMER option, use the following statement:
    options nofullstimer;
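
The guidelines above can be combined into a small test program for each technique. The following is a minimal sketch, not a definitive implementation. The file name technique_a.sas, the data set names, the generated stand-in data, and the WHERE condition are all assumptions made for illustration. In a real benchmark, replace the generated data with the actual data, and submit each technique's program in its own SAS session. The second part shows one possible way to pick the minimum real time and CPU time after several runs. Its timing values are made-up placeholders, not measurements, and you would replace them with the values from your own SAS logs.

```sas
/* ---------------------------------------------------------------- */
/* Part 1: hypothetical file technique_a.sas                        */
/* Run this program in its own SAS session, under the conditions    */
/* in which the final program will run (for example, in batch).     */
/* ---------------------------------------------------------------- */

/* Report detailed resource usage and additional notes in the log */
options fullstimer msglevel=i;

/* Stand-in test data so that the sketch is self-contained.         */
/* Assumption: a real benchmark reads the actual data instead.      */
data work.bench_input;
   call streaminit(12345);          /* fixed seed for repeatability */
   do id = 1 to 1000000;
      x = rand('uniform');
      output;
   end;
run;

/* The technique under test: only the code that is essential to    */
/* the task. The log notes for this step are what you compare.      */
data work.result_a;
   set work.bench_input;
   where x > 0.5;
run;

/* Turn the reporting option off again after testing */
options nofullstimer;

/* ---------------------------------------------------------------- */
/* Part 2: summarize repeated runs                                  */
/* The numbers below are PLACEHOLDERS, not measurements. Replace    */
/* them with the real time and CPU time (in seconds) that each run  */
/* wrote to its log.                                                */
/* ---------------------------------------------------------------- */
data work.trials;
   input technique $ realtime cputime;
   datalines;
A 4.21 3.10
A 3.95 3.08
A 5.60 3.12
B 3.40 2.51
B 3.38 2.49
B 4.90 2.55
;
run;

/* Report the minimum of each time measurement for each technique */
proc means data=work.trials min maxdec=2;
   class technique;
   var realtime cputime;
run;
```

A second program for the competing technique would differ only in the step under test, so that any difference in the log statistics can be attributed to that step. Memory and I/O are deliberately left out of the summary, because the guidelines treat those resources differently from time measurements.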