Using PROC SQL to Join Data

The SQL Procedure

Another method that you can use to join data sets that do not have a common variable is the SQL procedure. You should already be familiar with using PROC SQL to create a table from the results of an inner join.
In a PROC SQL step, you can choose from each input data set only the specific variables that you want to include in the new data set. The input data sets do not need to contain a common BY variable, nor do they need to be sorted or indexed. However, if the lookup table has an index, the SQL procedure can take advantage of the index to provide faster retrieval of lookup values.
You can join up to 256 tables by using the SQL procedure, to combine data horizontally from sources that have any type of relationship (one-to-one, one-to-many, many-to-many, or nonmatching).
Note: Although numerous types of joins are possible with PROC SQL, only inner joins are discussed in this chapter. Therefore, in the remainder of this chapter, a PROC SQL join refers to an inner join on multiple tables, whose results are stored in a new table. You can learn more about PROC SQL joins in Combining Tables Horizontally Using PROC SQL.
One drawback to using the SQL procedure to perform table lookups is that you cannot use DATA step syntax with PROC SQL. Therefore, complex business logic is difficult to incorporate into the join. However, by using PROC SQL you can often accomplish in one step what it takes multiple PROC SORT and DATA steps to accomplish.

Example: Working with Multiple Lookup Tables

The following example joins Sasuser.Revenue, Sasuser.Expenses, and Sasuser.Acities into a new data set named Work.Sqljoin:
proc sql;
   create table sqljoin as
      select revenue.flightid, revenue.date format=date9.,
             revenue.origin, revenue.dest,
             sum(revenue.rev1st,
                 revenue.revbusiness,
                 revenue.revecon)
             -expenses.expenses as Profit,
             acities.city,
             acities.name
         from sasuser.expenses, sasuser.revenue,
             sasuser.acities
         where expenses.flightid=revenue.flightid
              and expenses.date=revenue.date
              and acities.code=revenue.dest
         order by revenue.dest,
                 revenue.flightid,
                 revenue.date;
quit;

proc print data=work.sqljoin(obs=5);
   title 'Result of Joining Three Data Sets';
run;
The PROC PRINT step produces the first five observations of the Work.Sqljoin data set that is created in the PROC SQL step above, as shown here:
Working with Multiple Lookup Tables
Note: Notice that the Work.Sqljoin data set is identical to the Sasuser.Alldata data set that was previously created in the DATA step merge.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.172.56