The IMPORT Procedure

The Basics of PROC IMPORT

The IMPORT procedure reads data from an external data source and writes it to a SAS data set.
In delimited files, a delimiter (such as a blank, comma, or tab) separates columns of data values. If you license SAS/ACCESS Interface to PC Files, additional external data sources can include Microsoft Access database files, Microsoft Excel files, and Lotus spreadsheets.
When you run the IMPORT procedure, it reads the input file and writes the data to the specified SAS data set. By default, the IMPORT procedure expects the variable names to appear in the first row. The procedure scans the first 20 rows to count the variables, and it attempts to determine the correct informat and format for each variable. You can use the IMPORT procedure statements to do the following:
  • indicate how many rows SAS scans for variables to determine the type and length (GUESSINGROWS=)
  • indicate at which row SAS begins to read the data (DATAROW=)
  • modify whether SAS extracts the variable names (GETNAMES=)
You can also use these same statements to change the default values.
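The three statements listed above might be combined as in the following sketch. The file path and the data set name Work.Orders are hypothetical, and the value 500 is an arbitrary illustration rather than a recommended setting.

```sas
/* A minimal sketch, assuming a hypothetical comma-delimited file
   whose first row holds the column names. */
proc import datafile="C:\certdata\orders.csv"   /* hypothetical path */
    out=work.orders                             /* hypothetical data set name */
    dbms=csv
    replace;
    getnames=yes;       /* take variable names from the first row */
    datarow=2;          /* begin reading data values at row 2 */
    guessingrows=500;   /* scan 500 rows instead of the default 20 */
run;
```

Scanning more rows generally makes the type and length guesses more reliable, at the cost of a longer scan on large files.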
When the IMPORT procedure reads a delimited file, it generates a DATA step to import the data. You control the results with options and statements that are specific to the input data source.
The IMPORT procedure generates the specified output SAS data set and writes information about the import to the SAS log. The log displays the DATA step code that is generated by the IMPORT procedure.
If you need to revise your code after the procedure runs, issue the RECALL command (or press F4) to recall the generated DATA step. At this point, you can add or remove options from the INFILE statement and customize the INFORMAT, FORMAT, and INPUT statements to your data.
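As an illustration of that kind of customization, a hand-edited DATA step might look roughly like the sketch below. This is not the exact code that the procedure writes to the log; the path, the variable names (Region, Product, Sales), and their informats and formats are assumptions made for the example.

```sas
/* A simplified sketch of a customized import DATA step.
   All names, lengths, and the path are hypothetical. */
data work.shoes;
    /* DSD treats consecutive delimiters as missing values and removes
       quotation marks; FIRSTOBS=2 skips a header row. */
    infile "C:\certdata\shoes.csv" delimiter=',' missover dsd
           lrecl=32767 firstobs=2;
    informat Region $25. Product $14. Sales comma12.;
    format   Region $25. Product $14. Sales dollar12.;
    input Region $ Product $ Sales;
run;
```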
Note: By default, the IMPORT procedure reads delimited files as varying record-length files. If your external file has a fixed-length format, use a SAS DATA step with an INFILE statement that includes the RECFM=F and LRECL= options.
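A fixed-length file could be read along the following lines. The path, the record length of 20 bytes, and the column layout are hypothetical and would need to match the actual file.

```sas
/* A minimal sketch, assuming a hypothetical file made of 20-byte
   fixed-length records with no line terminators. */
data work.fixed;
    infile "C:\certdata\fixed.dat" recfm=f lrecl=20;   /* hypothetical path */
    /* Column input: each variable occupies a fixed range of bytes. */
    input id $ 1-5 name $ 6-15 amount 16-20;
run;
```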

The PROC IMPORT Statement

The PROC IMPORT statement imports an external data file to a SAS data set.
PROC IMPORT
DATAFILE= "filename" | TABLE= "tablename"
OUT=<libref.>SAS data set <(SAS data set options)>
<DBMS=identifier> <REPLACE>;

DATAFILE= "filename" | "fileref"

specifies the complete path and filename or fileref for the input PC file, spreadsheet, or delimited external file. A fileref is a SAS name that is associated with the physical location of the input file. To assign a fileref, use the FILENAME statement.

If you specify a fileref or if the complete path and filename does not include special characters such as the backslash in a path, lowercase characters, or spaces, then you can omit the quotation marks.
Restrictions The IMPORT procedure does not support device types or access methods for the FILENAME statement except for DISK. For example, the IMPORT procedure does not support the TEMP device type, which creates a temporary external file.
The IMPORT procedure can import data only if SAS supports the data type. SAS supports numeric and character types of data but not (for example) binary objects. If the data that you want to import is a type that SAS does not support, the IMPORT procedure might not be able to import it correctly. In many cases, the procedure attempts to convert the data to the best of its ability. However, conversion is not possible for some types.
Interactions By default, the IMPORT procedure reads delimited files as varying record-length files. If your external file has a fixed-length format, use a SAS DATA step with an INFILE statement that includes the RECFM=F and LRECL= options.
When you use a fileref to specify a delimited file to import, the logical record length (LRECL) defaults to 256, unless you specify the LRECL= option in the FILENAME statement. The maximum LRECL value that the IMPORT procedure supports is 32767.
For delimited files, the first 20 rows are scanned to determine the variable attributes. You can increase the number of rows that are scanned by using the GUESSINGROWS= statement. All values are initially read in as character strings. If a date and time format or a numeric informat can be applied to a data value, the type is declared as numeric. Otherwise, the type remains character.
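The fileref and LRECL= interaction described above can be sketched as follows. The fileref name shoecsv and the path are hypothetical. Because a fileref carries no file extension for the procedure to inspect, DBMS= is given explicitly here.

```sas
/* A minimal sketch: assign a fileref with a larger record length,
   import through it, then inspect the variable types that were chosen. */
filename shoecsv "C:\certdata\shoes.csv" lrecl=32767;   /* hypothetical path */

proc import datafile=shoecsv
    out=work.shoes
    dbms=csv
    replace;
run;

/* PROC CONTENTS lists each variable's type, length, format, and informat. */
proc contents data=work.shoes;
run;
```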

OUT= <libref.>SAS data set

identifies the output SAS data set with either a one-level or a two-level SAS name (library and member name). If the specified SAS data set does not exist, the IMPORT procedure creates it. If you specify a one-level name, by default the IMPORT procedure uses either the USER library (if assigned) or the WORK library (if USER is not assigned).

A SAS data set name can contain a single quotation mark when the VALIDMEMNAME=EXTEND system option is also specified. Using VALIDMEMNAME= expands the rules for the names of certain SAS members, such as a SAS data set name.
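The difference between a one-level and a two-level name in OUT= might look like this in practice. The libref cert and both paths are hypothetical.

```sas
/* A minimal sketch, assuming a hypothetical permanent library. */
libname cert "C:\certdata";   /* hypothetical library location */

/* One-level name: written to USER if assigned, otherwise to WORK. */
proc import datafile="C:\certdata\shoes.csv"
    out=shoes
    dbms=csv
    replace;
run;

/* Two-level name: written to the permanent library Cert. */
proc import datafile="C:\certdata\shoes.csv"
    out=cert.shoes
    dbms=csv
    replace;
run;
```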

TABLE= "tablename"

specifies the name of the input DBMS table. If the name does not include special characters (such as question marks), lowercase characters, or spaces, you can omit the quotation marks. Note that the DBMS table name might be case sensitive.

Requirements You must have a license for SAS/ACCESS Interface to PC Files to import a DBMS table.
When you import a DBMS table, you must specify the DBMS= option.
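For example, importing a table from a Microsoft Access database might be written as in the sketch below. It assumes that SAS/ACCESS Interface to PC Files is licensed. The table name Customers and the database path are hypothetical.

```sas
/* A minimal sketch, assuming SAS/ACCESS Interface to PC Files is
   licensed and a hypothetical Access database exists at this path. */
proc import table="Customers"              /* hypothetical table name */
    out=work.customers
    dbms=access
    replace;
    database="C:\certdata\sales.accdb";    /* hypothetical database file */
run;
```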

<DBMS=identifier>

specifies the type of data to import. You can import delimited files or JMP files (DBMS=JMP) in Base SAS. The JMP file format must be JMP 7 or later, and JMP variable names can be up to 255 characters long. SAS supports importing JMP files that have more than 32,767 variables.

To import a tab-delimited file, specify TAB as the identifier. To import any other delimited file that does not end in .CSV, specify DLM as the identifier. For a comma-separated file with a .CSV extension, DBMS= is optional. The IMPORT procedure recognizes .CSV as an extension for a comma-separated file.
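The TAB and DLM identifiers could be used as shown in this sketch. Both file paths are hypothetical, and the ampersand is only an example of a non-default delimiter, set here with the DELIMITER= statement.

```sas
/* Tab-delimited file: DBMS=TAB sets the delimiter automatically. */
proc import datafile="C:\certdata\class.txt"   /* hypothetical path */
    out=work.class_tab
    dbms=tab
    replace;
run;

/* Any other delimiter: DBMS=DLM plus a DELIMITER= statement. */
proc import datafile="C:\certdata\class.dat"   /* hypothetical path */
    out=work.class_dlm
    dbms=dlm
    replace;
    delimiter='&';
run;
```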

<REPLACE>

overwrites an existing SAS data set. If you omit REPLACE, the IMPORT procedure does not overwrite an existing data set.

CAUTION:
Using the IMPORT procedure with the REPLACE option to write to an existing SAS generation data set causes the most recent generation data set or group of generation data sets to be deleted.
Here are two scenarios:
  • If you specify the GENMAX= data set option to increase or decrease the number of generations, then all existing generations are deleted and replaced with a single new base generation data set.
  • If you omit the GENMAX= data set option, then all existing generations are deleted and replaced with a single new data set by the same name, but it is not a generation data set.
Instead, use a SAS DATA step with the REPLACE= data set option to replace a permanent SAS data set and to maintain the generation group for that SAS data set.
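One way to follow that advice is to import into a temporary data set first and then replace the permanent data set in a DATA step. The sketch below assumes that Cert.Shoes already exists as a generation data set. The libref, paths, and data set names are hypothetical.

```sas
/* A minimal sketch, assuming cert.shoes already exists with
   generations enabled (hypothetical library and names). */
libname cert "C:\certdata";

/* Step 1: import into a temporary data set, not the generation group. */
proc import datafile="C:\certdata\shoes.csv"
    out=work.shoes_new
    dbms=csv
    replace;
run;

/* Step 2: replace the permanent data set with a DATA step
   and the REPLACE= data set option. */
data cert.shoes(replace=yes);
    set work.shoes_new;
run;
```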

<SAS data set options>

specifies SAS data set options. For example, to assign a password to the resulting SAS data set, you can use the ALTER=, PW=, READ=, or WRITE= data set options. To import only data that meets a specified condition, you can use the WHERE= data set option.

Restriction You cannot specify data set options when importing delimited, comma-separated, or tab-delimited external files.
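Because data set options are not available for delimited files, one workaround is to import the whole file and then subset it in a separate step. In the sketch below, the path, the data set names, and the variable Region are hypothetical.

```sas
/* A minimal sketch: import everything, then apply WHERE= afterward. */
proc import datafile="C:\certdata\shoes.csv"   /* hypothetical path */
    out=work.shoes_all
    dbms=csv
    replace;
run;

data work.shoes_asia;
    /* WHERE= is applied on the SET statement, not in PROC IMPORT. */
    set work.shoes_all(where=(Region="Asia"));   /* Region is hypothetical */
run;
```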

Example: Importing a Comma-Delimited File with a CSV Extension

This example imports a comma-delimited file and creates a temporary SAS data set Work.Shoes.
proc import datafile="C:\certdata\test.csv"   /* no semicolon until after REPLACE */
    out=shoes
    dbms=csv
    replace;
    getnames=no;   /* do not read variable names from the first row */
run;
proc print data=work.shoes;
run;
Output 6.3 HTML Output: Work.Shoes Data Set
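Because GETNAMES=NO is specified, the procedure assigns default variable names (VAR1, VAR2, and so on). If more descriptive names are wanted, they could be assigned afterward, as in this sketch. The new names Region and Product are hypothetical, and the code assumes that the file has at least two columns.

```sas
/* A minimal sketch: rename the default variables in place. */
proc datasets library=work nolist;
    modify shoes;
    rename var1=Region var2=Product;   /* hypothetical new names */
quit;
```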
Last updated: January 10, 2018