The IMPORT Procedure

The Basics of PROC IMPORT

The IMPORT procedure reads data from an external data source and writes it to a SAS data set. You can import structured and unstructured data using PROC IMPORT. You can import delimited files (blank, comma, or tab) along with Microsoft Excel files. If you are using SAS 9.4, then you can import JMP 7 or later files as well.
When you run the IMPORT procedure, it reads the input file and writes the data to the specified SAS data set. By default, the IMPORT procedure expects the variable names to appear in the first row. The procedure scans the first 20 rows to count the variables, and it attempts to determine the correct informat and format for each variable. You can use the IMPORT procedure statements to do the following:
  • indicate how many rows SAS scans for variables to determine the type and length (GUESSINGROWS=)
  • modify whether SAS extracts the variable names from the first row of the data set (GETNAMES=)
  • indicate at which row SAS begins to read the data (DATAROW=)
When the IMPORT procedure reads a delimited file, it generates a DATA step to import the data. You control the results with options and statements that are specific to the input data source.
The IMPORT procedure generates the specified output SAS data set and writes information about the import to the SAS log. The log displays the DATA step code that is generated by the IMPORT procedure.

PROC IMPORT Syntax

The IMPORT procedure imports an external data file to a SAS data set.
PROC IMPORT
DATAFILE= “filename” | TABLE= “tablename
OUT=<libref. SAS-data-set><SAS-data-set-options>
<DBMS=identifier><REPLACE>;

DATAFILE= “filename” | “fileref

specifies the complete path and filename or fileref for the input PC file, spreadsheet, or delimited external file. A fileref is a SAS name that is associated with the physical location of the output file. To assign a fileref, use the FILENAME statement.

If you specify a fileref, complete path, and filename does not include special characters, then you can omit the quotation marks.
Restrictions The IMPORT procedure does not support device types or access methods for the FILENAME statement except for DISK. For example, the IMPORT procedure does not support the TEMP device type, which creates a temporary external file.
The IMPORT procedure can import data only if SAS supports the data type. SAS supports numeric and character types of data but not (for example) binary objects. If the data that you want to import is a type that SAS does not support, the IMPORT procedure might not be able to import it correctly. In many cases, the procedure attempts to convert the data to the best of its ability. However, conversion is not possible for some types.
Interactions By default, the IMPORT procedure reads delimited files as varying record-length files. If your external file has a fixed-length format, use the FILENAME statement prior to PROC IMPORT to specify the input filename using the RECFM=F and LRECL= options.
When you use a fileref to specify a delimited file to import, the logical record length (LRECL) defaults to 256, unless you specify the LRECL= option in the FILENAME statement. The maximum LRECL value that the IMPORT procedure supports is 32,767.
For delimited files, the first 20 rows are scanned to determine the variable attributes. You can increase the number of rows that are scanned by using the GUESSINGROWS= statement. All values are read in as character strings. If a Date and Time format or a numeric informat can be applied to the data value, the type is declared as numeric. Otherwise, the type remains character.

OUT= <libref.>SAS-data-set

identifies the output SAS data set with either a one or two-level SAS name (library and member name). If the specified SAS data set does not exist, the IMPORT procedure creates it. If you specify a one-level name, by default the IMPORT procedure uses either the USER library (if assigned) or the WORK library (if USER is not assigned).

A SAS data set name can contain a single quotation mark when the VALIDMEMNAME=EXTEND system option is also specified. Using VALIDMEMNAME= expands the rules for the names of certain SAS members, such as a SAS data set name.
If a SAS data set name contains national characters or special characters, use VALIDMEMNAME=EXTEND system option. The exceptions for special characters are: / * ? “ < > | : —. Using VALIDMEMNAME= expands the rules for the name of certain SAS members, such as a SAS data set name. For more information, see VALIDMEMNAME=System Option.

TABLE= “tablename

specifies the name of the input DBMS table. If the name does not include special characters (such as question marks), lowercase characters, or spaces, you can omit the quotation marks. Note that the DBMS table name might be case sensitive.

Requirement When you import a DBMS table, you must specify the DBMS= option.

<DBMS=identifier>

specifies the type of data to import.

Here are the common DBMS identifiers that are included with Base SAS:
  • CSV — comma-separated values. For a comma-separated file with a .CSV extension, DBMS= is optional.
  • JMP — JMP files. Use JMP 7 or later. Use DBMS=JMP to specify importing JMP files. JMP variable names can be up to 255 characters long. SAS supports importing JMP files that have more than 32,767 variables.
  • TAB — tab-delimited values. Specify DBMS=DLM to import any other delimited file that does not end in .CSV.

<REPLACE>

overwrites an existing SAS data set. If you omit REPLACE, the IMPORT procedure does not overwrite an existing data set.

Instead, use a SAS DATA step with the REPLACE= data set option to replace a permanent SAS data set.

<SAS-data-set-options>

specifies SAS data set options. For example, to assign a password to the resulting SAS data set, you can use the ALTER=, PW=, READ=, or WRITE= data set options. To import only data that meets a specified condition, you can use the WHERE= data set option.

Restriction You cannot specify data set options when importing delimited, comma-separated, or tab-delimited external files.

Example: Importing an Excel File with an XLSX Extension

This example imports an Excel file and creates a temporary SAS data set, Work.BootSales.
options validvarname=v7;                                  /*#1*/
proc import datafile='C:UsersStudent1certoots.xlsx'  /*#2*/
  dbms=xlsx
  out=work.bootsales
  replace;
  sheet=boot;                                             /*#3*/
  getnames=yes;                                           /*#4*/
run;
proc contents data=bootsales;                             /*#5*/
run;
proc print data=bootsales;
run;
1 The VALIDVARNAME=V7 statement forces SAS to convert spaces to underscores when it converts column names to variable names. In SAS Studio, the _ (underscore) in Total_Sale would not be added without the VALIDVARNAME=V7 statement.
2 Specify the input file. DATAFILE= specifies the path for the input file. The DBMS= option specifies the type of data to import. When importing an Excel workbook, specify DBMS=XLSX. The REPLACE option overwrites an existing SAS data set. The OUT= option identifies the output SAS data set.
3 Use the SHEET option to import specific worksheets from an Excel workbook.
4 Set the GETNAMES= statement to YES to generate variable names from the first row of data.
5 Use the CONTENTS procedure to display the descriptor portion of the Work.BootSales data set.
The following is printed to the SAS log. The SAS log notes that the import was successful. It also notes that there is a variable name change from Total Sale (with a space between the two words) to Total_Sale. SAS converted the space to an underscore ( _ ).
Log 4.1 SAS Log
75   options validvarname=v7;
76   proc import datafile='C:UsersStudent1certoots.xlsx'
77      dbms=xlsx
78      out=work.bootsales replace;
79      sheet=boot;
80      getnames=yes;
81   run;

NOTE:    Variable Name Change.  Total Sale -> Total_Sale
NOTE: The import data set has 10 observations and 3 variables.
NOTE: WORK.BOOTSALES data set was successfully created.
Output 4.1 PROC CONTENTS Descriptor Portion (partial output)
PROC CONTENTS Descriptor Portion (Partial Output)
Output 4.2 PROC PRINT Output of the Work.BootSales Data Set
PROC PRINT Output of Work.BootSales Data Set
For an alternate method of reading Microsoft Excel files in SAS, see Reading Microsoft Excel Data with the XLSX Engine.

Example: Importing a Delimited File with a TXT Extension

This example imports a delimited external file and creates a temporary SAS data set, Work.MyData. The delimiter is an ampersand (&).
options validvarname=v7;
proc import datafile='C:UsersStudent1certdelimiter.txt'   /*#1*/
  dbms=dlm                                                    /*#2*/
  out=mydata
  replace;
  delimiter='&';                                              /*#3*/
  getnames=yes;
run;
proc print data=mydata;
run;
1 Specify the input file. DATAFILE= specifies the path for the input file. The DBMS= option specifies the type of data to import. If the delimiter is a character other than TAB or CSV, then the DBMS= option is DLM. The REPLACE option overwrites an existing SAS data set. The OUT= option identifies the output SAS data set.
2 Specify an ampersand (&) for the DELIMITER statement.
3 Set the GETNAMES= statement to YES to generate variable names from the first row of data.
Output 4.3 PROC PRINT Output: Work.MyData Data Set
PROC PRINT Output: Work.MyData Data Set

Example: Importing a Space-Delimited File with a TXT Extension

This example imports a space-delimited file and creates a temporary SAS data set named Work.States.
The following input data illustrates enclosing values in quotation marks when you want to avoid separating their values by the space between the words.
Region State Capital Bird
South Georgia Atlanta 'Brown Thrasher'
South 'North Carolina' Raleigh Cardinal
North Connecticut Hartford Robin
West Washington Olympia 'American Goldfinch'
Midwest Illinois Springfield Cardinal
You can submit the following code to import the file.
options validvarname=v7;
filename stdata 'C:UsersStudent1certstate_data.txt' lrecl=100;  /*#1*/
proc import datafile=stdata                                         /*#2*/
  dbms=dlm
  out=states
  replace;
  delimiter=' ';                                                    /*#3*/
  getnames=yes;
run;
proc print data=states;
run;
1 Specify the fileref and the location of the file. Specify the LRECL= system option if the file has a fixed-length format. The LRECL= system option specifies the default logical record length to use when reading external files.
2 Specify the input file and specify that it is a delimited file. The DBMS= option specifies the type of data to import. If the delimiter type is a character other than TAB or CSV, then the DBMS= option is DLM. The REPLACE option overwrites an existing SAS data set. The OUT= option identifies the output SAS data set.
3 Specify a blank value for the DELIMITER statement. Set the GETNAMES= statement to YES to generate variable names from the first row of data.
Output 4.4 PROC PRINT Output: Work.States Data Set
PROC PRINT Output: Work.States Data Set

Example: Importing a Comma-Delimited File with a CSV Extension

This example imports a comma-delimited file and creates a temporary SAS data set Work.Shoes. Boot.csv is a comma-separated value file that is a delimited-text file and that uses a comma to separate values.
options validvarname=v7;
proc import datafile='C:UsersStudent1certoot.csv'  /*#1*/
  dbms=csv
  out=shoes
  replace;
  getnames=no;                                          /*#2*/
run;
proc print data=work.shoes;
run;
1 Specify the input file. DATAFILE= specifies the input data file, and OUT= specifies the output data set. The DBMS= specifies the type of data to import. If the file type is CSV, then the DBMS= option is CSV. The REPLACE option overwrites an existing SAS data set.
2 Set the GETNAMES= statement to NO to not use the first row of data as variable names.
Output 4.5 PROC PRINT Output: Work.Shoes Data Set
PROC PRINT Output: Work.Shoes Data Set

Example: Importing a Tab-Delimited File

This example imports a tab-delimited file and creates a temporary SAS data set Work.Class.
proc import datafile='C:UsersStudent1certclass.txt'   /*#1*/
  dbms=tab
  out=class
  replace;
  delimiter='09'x;                                        /*#2*/
run;
proc print data=class;
run;
1 Specify the input file. DATAFILE= specifies the input data file, and OUT= specifies the output data set. DBMS= specifies the type of data to import. If the file type is TXT, then the DBMS= option is TAB. The REPLACE option overwrites an existing SAS data set. GETNAMES= statement defaults to YES.
2 Specify the delimiter. On an ASCII platform, the hexadecimal representation of a tab is '09'x. On an EBCDIC platform, the hexadecimal representation of a tab is a '05'x.
Output 4.6 PROC PRINT Output of Work.Class
PROC PRINT Output of Work.Class
Last updated: August 23, 2018
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.116.69.83