Documenting and Maintaining Indexes

Overview

Indexes are stored in the same SAS library as the data set that they index, but in a separate SAS file from the data set. Index files have a member type of INDEX. There is only one index file per data set; all indexes for a data set are stored together in a single file.
The following figure shows the relationship of SAS data set files and SAS index files in a Windows operating environment. Notice that the index files have the same name as the data set with which they are associated, but they have different file extensions. Also, notice that each index file can contain one or more indexes, and that different index files can contain indexes with identical names.
Figure: SAS data set files and SAS index files
Note: Index files are stored in the same location as the data sets with which they are associated. However, keep the following in mind:
  • Index files do not appear in the SAS Explorer window.
  • Index files do not appear as separate files in z/OS operating environment file lists.
Sometimes, you might want to view a list of the indexes that exist for a data set. You might also want to see information about the indexes such as whether they are unique, and what key variables they use. Let us consider some ways to document indexes.
Information about indexes is stored in the descriptor portion of the data set. You can use either the CONTENTS procedure or the CONTENTS statement in PROC DATASETS to list information from the descriptor portion of a data set.
Output from the CONTENTS procedure or from the CONTENTS statement in PROC DATASETS contains the following information about the data set:
  • general and summary information
  • engine/host dependent information
  • alphabetic list of variables and attributes
  • alphabetic list of integrity constraints
  • alphabetic list of indexes and attributes
General form, PROC CONTENTS:
PROC CONTENTS DATA=<libref.>SAS-data-set-name;
RUN;
Here is an explanation of the syntax:
SAS-data-set-name
specifies the data set for which the information is listed.
General form, PROC DATASETS with the CONTENTS statement:
PROC DATASETS <LIBRARY=libref> <NOLIST>;
CONTENTS DATA=<libref.>SAS-data-set-name;
QUIT;
Here is an explanation of the syntax:
SAS-data-set-name
specifies the data set for which the information is listed.
NOLIST
suppresses the printing of the directory of SAS files in the SAS log and as ODS output.
Note: If you use the LIBRARY= option, you do not need to specify a libref in the DATA= option. Likewise, if you specify a libref in the DATA= option, you do not need to use the LIBRARY= option.

Example

The following example prints information about the Sasuser.Sale2000 data set. Notice that the library is specified in the LIBRARY= option of the PROC DATASETS statement.
proc datasets library=sasuser nolist;
   contents data=sale2000;
quit;
The following example also prints information about the Sasuser.Sale2000 data set. Notice that the library is specified in the CONTENTS statement.
proc datasets nolist;
   contents data=sasuser.sale2000;
quit;
The following example also prints information about the Sasuser.Sale2000 data set:
proc contents data=sasuser.sale2000;
run;
The PROC DATASETS and PROC CONTENTS output from these programs is identical. The last piece of information that is printed in each set of output is a list of the indexes that have been created for Sasuser.Sale2000, as shown below.
Output: Sasuser.Sale2000 (alphabetic list of indexes and attributes)
You can also use either of these methods to list information about an entire SAS library rather than an individual data set. To list the contents of all files in a SAS library with either PROC CONTENTS or with the CONTENTS statement in PROC DATASETS, you specify the keyword _ALL_ in the DATA= option.

Example

The following example prints information about all of the files in the Work data library:
proc contents data=work._all_;
run;
The following example also prints information about all of the files in the Work data library:
proc datasets library=work nolist;
   contents data=_all_;
quit;
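When the index information is needed as data rather than as a printed report, one option is the OUT2= option of PROC CONTENTS, which writes index and integrity constraint information to an output data set. The following is a minimal sketch, assuming that Sasuser.Sale2000 exists and has at least one index; the output data set name Work.Idxinfo is a hypothetical name chosen for illustration.

```sas
/* Write index and integrity constraint information to a data set.   */
/* NOPRINT suppresses the usual printed report.                      */
/* Work.Idxinfo is a hypothetical output name, not from the source.  */
proc contents data=sasuser.sale2000 out2=work.idxinfo noprint;
run;

/* Inspect the captured index information */
proc print data=work.idxinfo;
run;
```
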
Remember that indexes are stored in a separate SAS file. When you perform maintenance tasks on a data set, there might be resulting effects on the index file. If you alter the variables or values within a data set, there might be a resulting effect on the value/identifier pairs within a particular index.
The following table describes the effects on an index or an index file that result from several common maintenance tasks.
| Task | Effect |
| --- | --- |
| Add an observation or observations to a data set. | Value/identifier pairs are added to the index or indexes. |
| Delete an observation or observations from a data set. | Value/identifier pairs are deleted from the index or indexes. |
| Update an observation or observations in a data set. | Value/identifier pairs are updated in the index or indexes. |
| Delete a data set. | The index file is deleted. |
| Rebuild a data set with the DATA step. | The index file is deleted. |
| Sort the data in place with the FORCE option in PROC SORT. | The index file is deleted. |
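The effect of rebuilding a data set with the DATA step can be checked with a small experiment. The following is a sketch only: the data set Work.Flights, its variables, and its values are hypothetical and exist purely for illustration. The first PROC CONTENTS step should list an index on FlightID. After the DATA step rebuilds the data set, the second PROC CONTENTS step should no longer list any indexes.

```sas
/* Hypothetical data set with a simple index on FlightID */
data work.flights (index=(FlightID));
   input FlightID $ Revenue;
   datalines;
IA100 1200
IA200 3400
IA300 560
;
run;

/* The index is expected to appear in this report */
proc contents data=work.flights;
run;

/* Rebuild the data set in place with a DATA step */
data work.flights;
   set work.flights;
   Revenue = Revenue * 1.1;
run;

/* The index is expected to be gone from this report */
proc contents data=work.flights;
run;
```

If the index is still needed after a rebuild, it has to be created again.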
Let us consider some of the other common tasks that you might perform on your data sets, as well as the actions that SAS performs on the index files as a result.

Copying Data Sets

You might want to copy an indexed data set to a new location. You can copy a data set with the COPY statement in a PROC DATASETS step. When you use the COPY statement to copy a data set that has an associated index, a new index file is automatically created for the new data file.
General form, PROC DATASETS with the COPY statement:
PROC DATASETS LIBRARY=old-libref <NOLIST>;
COPY OUT=new-libref;
SELECT SAS-data-set-name;
QUIT;
Here is an explanation of the syntax:
old-libref
names the library from which the data set is copied.
new-libref
names the library to which the data set is copied.
SAS-data-set-name
names the data set that is copied.
You can also use the COPY procedure to copy data sets to a new location. Generally, PROC COPY functions the same as the COPY statement in the DATASETS procedure. When you use PROC COPY to copy a data set that has an associated index, a new index file is automatically created for the new data file. If you use the MOVE option in the COPY procedure, the index file is deleted from the original location and rebuilt in the new location.
General form, PROC COPY step:
PROC COPY OUT=new-libref IN=old-libref <MOVE>;
SELECT SAS-data-set-name(s);
RUN;
Here is an explanation of the syntax:
old-libref
names the library from which the data set is copied.
new-libref
names the library to which the data set is copied.
SAS-data-set-name(s)
names the data set or data sets that are copied.

Examples

The following programs produce the same result. Both programs copy the Sale2000 data set from the Sasuser library and place it in the Work library. Likewise, both of these programs cause a new index file to be created for Work.Sale2000 that contains all indexes that exist in Sasuser.Sale2000.
proc datasets library=sasuser nolist;
   copy out=work;
   select sale2000;
quit;

proc copy out=work in=sasuser;
   select sale2000;
run;
Note: If you copy and paste a data set in either SAS Explorer or in SAS Enterprise Guide, a new index file is automatically created for the new data file.
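The MOVE option described earlier can be tried without touching the Sasuser library. The following sketch makes several assumptions: the libref Archive, its location in a subdirectory of the Work library path, and the data set Work.Flights are all hypothetical, and the DLCREATEDIR system option is assumed to be available so that the LIBNAME statement can create the directory.

```sas
/* Assumption: create a scratch library (hypothetical libref ARCHIVE) */
/* in a subdirectory of the Work library path.                        */
options dlcreatedir;
libname archive "%sysfunc(pathname(work))/archive";

/* Hypothetical data set with a simple index on FlightID */
data work.flights (index=(FlightID));
   input FlightID $ Revenue;
   datalines;
IA100 1200
IA200 3400
;
run;

/* Move the data set: it is removed from Work, and the index */
/* is rebuilt for Archive.Flights in the new location.       */
proc copy out=archive in=work move;
   select flights;
run;

/* The index is expected to be listed for the moved data set */
proc contents data=archive.flights;
run;
```

Because MOVE removes the data set from the input library, it is safest to experiment on a throwaway data set like this one.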

Renaming Data Sets

Another common task is to rename an indexed data set. To preserve the index, you can use the CHANGE statement in PROC DATASETS to rename a data set. The index file is automatically renamed as well.
General form, PROC DATASETS with the CHANGE statement:
PROC DATASETS LIBRARY=libref <NOLIST>;
CHANGE old-data-set-name = new-data-set-name;
QUIT;
Here is an explanation of the syntax:
libref
names the SAS library where the data set is stored.
old-data-set-name
is the current name of the data set.
new-data-set-name
is the new name of the data set.

Example

The following example copies the Revenue data set from Sasuser into Work, and renames the Work.Revenue data set to Work.Income. The index file that is associated with Work.Revenue is also renamed to Work.Income.
proc copy out=work in=sasuser;
   select revenue;
run;

proc datasets library=work nolist;
   change revenue=income;
quit;

Renaming Variables

You have seen how to use PROC DATASETS to rename an indexed data set. Similarly, you might want to rename one or more variables within an indexed data set. In order to preserve any indexes that are associated with the data set, you can use the RENAME statement in the DATASETS procedure to rename variables.
General form, PROC DATASETS with the RENAME statement:
PROC DATASETS LIBRARY=libref <NOLIST>;
MODIFY SAS-data-set-name;
RENAME old-var-name-1 = new-var-name-1
<...old-var-name-n = new-var-name-n>;
QUIT;
Here is an explanation of the syntax:
libref
names the SAS library where the data set is stored.
SAS-data-set-name
is the name of the data set that contains the variables to be renamed.
old-var-name
is the original variable name.
new-var-name
is the new name to be assigned to the variable.
When you use the RENAME statement to change the name of a variable for which there is a simple index, the statement also renames the index. If the variable that you are renaming is used in a composite index, the composite index automatically references the new variable name. However, if you attempt to rename a variable to a name that has already been used for a composite index, you receive an error message.

Example

The following example renames the variable FlightID as FlightNum in the Work.Income data set. If a simple index exists that is named FlightID, the index is renamed FlightNum.
proc datasets library=work nolist;
   modify income;
   rename flightid=FlightNum;
quit;
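The composite index case can be sketched in the same way. In the following example, the data set Work.Routes, its variables, and the index name Route are hypothetical and chosen only for illustration. The INDEX CREATE statement builds a composite index on Origin and Dest, and the RENAME statement then renames Dest. The steps are placed in separate RUN groups so that they execute in the order shown.

```sas
/* Hypothetical data set, for illustration only */
data work.routes;
   input Origin $ Dest $ Revenue;
   datalines;
LHR JFK 1200
CDG LAX 3400
FRA ORD 560
;
run;

proc datasets library=work nolist;
   /* Create a composite index named Route on two key variables */
   modify routes;
   index create Route=(Origin Dest);
run;
   /* Rename one of the key variables of the composite index */
   modify routes;
   rename Dest=Destination;
run;
   /* The Route index is expected to reference Origin and Destination */
   contents data=routes;
quit;
```
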