Using the SASFILE Statement

Overview

Another way of improving performance is to use the SASFILE statement to hold a SAS data file in memory so that the data is available to multiple program steps. Keeping the data file open reduces I/O processing and open/close operations, including the allocation and freeing of memory for buffers.
General form, SASFILE statement:
SASFILE SAS-data-file <(password-option(s))> OPEN | LOAD | CLOSE;
Here is an explanation of the syntax:
SAS-data-file
is a valid SAS data file (a SAS data set with the member type DATA).
password-option(s)
specifies one or more password options.
OPEN
opens the file and allocates the buffers, but defers reading the data into memory until a procedure or statement is executed.
LOAD
opens the file, allocates the buffers, and reads the data into memory.
CLOSE
closes the file and frees the buffers.
The SASFILE statement opens a SAS data file and allocates enough buffers to hold the entire file in memory. Once the data file is read, the data is held in memory, and it is available to subsequent DATA and PROC steps or applications until either of the following occurs:
  • a SASFILE CLOSE statement frees the buffers and closes the file
  • the SAS session ends, which automatically frees the buffers and closes the file
In the following program, the first SASFILE statement opens the SAS data file Company.Sales, allocates the buffers, and reads the data into memory.
sasfile company.sales load;
proc print data=company.sales;
   var Customer_Age_Group;
run;
proc tabulate data=company.sales;
   class Customer_Age_Group;
   var Customer_BirthDate;
   table Customer_Age_Group,Customer_BirthDate*(mean median);
run;
sasfile company.sales close;
Note: The SASFILE statement can also be used to reduce CPU time and I/O in SAS programs that repeatedly read one or more SAS data views. Use a DATA step to create a SAS data file in the Work library that contains the view's result set. Then use the SASFILE statement to load that data file into memory.
Note: Although a file that is opened with the SASFILE statement can be used for subsequent input or update processing, it cannot be used for subsequent utility or output processing. For example, you cannot replace the file or rename its variables.

Guidelines for Using the SASFILE Statement

When the SASFILE statement executes, SAS allocates the number of buffers based on the number of pages for the data file and index file. If the file in memory increases in size during processing because of changes or additions to the data, the number of buffers also increases.
It is important to note that I/O processing is reduced only if there is sufficient real memory. If there is not sufficient real memory, the operating environment might use the following:
  • virtual memory
  • the default number of buffers
If SAS uses virtual memory, there might be a degradation in performance.
If you need to repeatedly process part of a SAS data file and the entire file does not fit into memory, use a DATA step with the SASFILE statement to create a subset of the file that does fit into memory. Then process that subset repeatedly. This action saves CPU time in the processing steps because those steps read a smaller file, in addition to the benefit of the file being resident in memory.
Note: When using a SASFILE statement, monitor the paging activity (the I/O activity that is done by the virtual memory management subsystem of your operating environment) while your program runs. If the paging activity increases substantially, consider keeping less data in memory.

Comparative Example: Using the SASFILE Statement

Using Different Data File Sizes

Suppose you want to create multiple reports from SAS data files that vary in size. Using small, medium, and large data files, you can compare the resource usage when the PRINT, TABULATE, MEANS, and FREQ procedures are used with and without the SASFILE statement to create reports.
Name of Data File
Number of Rows
Page Size
Number of Pages
Number of Bytes
Retail.Small
45,876
24,576
540
13,279,232
Retail.Medium
458,765
24,576
5,398
132,669,440
Retail.Large
4,587,654
24,576
53,973
1,326,448,640
The following sample programs compare six techniques for using data file sizes. You can use these samples as models for creating benchmark programs in your own environment. Your results might vary depending on the structure of your data, your operating environment, and the resources that are available at your site.

Programming Techniques

1 Small Data File without the SASFILE Statement
This program creates reports using the PRINT, TABULATE, MEANS, and FREQ procedures. The SAS data file Retail.Small is opened and closed with each procedure.
proc print data=retail.small;
   where cs=100;
   var Customer_Age_Group;
run;
proc tabulate data=retail.small;
   class Customer_Age_Group;
   var Customer_BirthDate;
   table Customer_Age_Group,Customer_BirthDate*(mean median);
run;
proc means data=retail.small;
   var Customer_Age;
   class Customer_Group;
   output out=summary sum=;
run;
proc freq data=retail.small;
   tables Customer_Country;
run;
2 Medium Data File without the SASFILE Statement
This program creates reports using the PRINT, TABULATE, MEANS, and FREQ procedures. The SAS data file Retail.Medium is opened and closed with each procedure.
proc print data=retail.medium;
   where cm=100;
   var Customer_Age_Group;
run;
proc tabulate data=retail.medium;
   class Customer_Age_Group;
   var Customer_BirthDate;
   table Customer_Age_Group,Customer_BirthDate*(mean median);
run;
proc means data=retail.medium;
   var Customer_Age;
   class Customer_Group;
   output out=summary sum=;
run;
proc freq data=retail.medium;
   tables Customer_Country;
run;
3 Large Data File without the SASFILE Statement
This program creates reports using the PRINT, TABULATE, MEANS, and FREQ procedures. The SAS data file Retail.Large is opened and closed with each procedure.
proc print data=retail.large;
   where cl=100;
   var Customer_Age_Group;
run;
proc tabulate data=retail.large;
   class Customer_Age_Group;
   var Customer_BirthDate;
   table Customer_Age_Group,Customer_BirthDate*(mean median);
run;
proc means data=retail.large;
   var Customer_Age;
   class Customer_Group;
   output out=summary sum=;
run;
proc freq data=retail.large;
   tables Customer_Country;
run;
4 Small Data File with the SASFILE Statement
In this program, the SASFILE LOAD statement opens the SAS data file Retail.Small and loads the entire file into memory. The data is then available to the PRINT, TABULATE, MEANS, and FREQ procedures. The SASFILE CLOSE statement closes Retail.Small and frees the buffers.
sasfile retail.small load;
proc print data=retail.small;
   where cs=100;
   var Customer_Age_Group;
run;
proc tabulate data=retail.small;
   class Customer_Age_Group;
   var Customer_BirthDate;
   table Customer_Age_Group,Customer_BirthDate*(mean median);
run;
proc means data=retail.small;
   var Customer_Age;
   class Customer_Group;
   output out=summary sum=;
run;
proc freq data=retail.small;
   tables Customer_Country;
run;
sasfile retail.small close;
5 Medium Data File with the SASFILE Statement
In this program, the SASFILE LOAD statement opens the SAS data file Retail.Medium and loads the entire file into memory. The data is then available to the PRINT, TABULATE, MEANS, and FREQ procedures. The SASFILE CLOSE statement closes Retail.Medium and frees the buffers.
sasfile retail.medium load;
proc print data=retail.medium;
   where cm=100;
   var Customer_Age_Group;
run;
proc tabulate data=retail.medium;
   class Customer_Age_Group;
   var Customer_BirthDate;
   table Customer_Age_Group,Customer_BirthDate*(mean median);
run;
proc means data=retail.medium;
   var Customer_Age;
   class Customer_Group;
   output out=summary sum=;
run;
proc freq data=retail.medium;
   tables Customer_Country;
run;
sasfile retail.medium close;
6 Large Data File with the SASFILE Statement
In this program, the SASFILE LOAD statement opens the SAS data file Retail.Large and loads the entire file into memory. The data is then available to the PRINT, TABULATE, MEANS, and FREQ procedures. The SASFILE CLOSE statement closes Retail.Large and frees the buffers.
sasfile retail.large load;
proc print data=retail.large;
   where cl=100;
   var Customer_Age_Group;
run;
proc tabulate data=retail.large;
   class Customer_Age_Group;
   var Customer_BirthDate;
   table Customer_Age_Group,Customer_BirthDate*(mean median);
run;
proc means data=retail.large;
   var Customer_Age;
   class Customer_Group;
   output out=summary sum=;
run;
proc freq data=retail.large;
   tables Customer_Country;
run;
sasfile retail.large close;

General Recommendations

  • If you need to repeatedly process a SAS data file that fits entirely in memory, use the SASFILE statement to reduce I/O and some CPU usage.
  • If you use the SASFILE statement and the SAS data file does not fit entirely in memory, the code executes, but there might be a degradation in performance.
  • If you need to repeatedly process part of a SAS data file and the entire file does not fit into memory, use a DATA step with the SASFILE statement to create a subset of the file that does fit into memory. Then process that subset repeatedly. This action saves CPU time in the processing steps because those steps read a smaller file, in addition to the benefit of the file being resident in memory.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.91.24