The FedSQL Procedure

A Brief Overview

SAS FedSQL is a SAS proprietary implementation of ANSI SQL:1999 core standard. It provides support for new data types and other ANSI 1999 core compliance features and proprietary extensions.
At the highest level, FedSQL provides a common ANSI SQL syntax across all data sources. FedSQL is a vendor-neutral SQL dialect that accesses data from various data sources without having to submit queries in the SQL dialect that is specific to the data source (or DBMS). An ANSI standard SQL allows for as much processing to be pushed into a DBMS as possible.
FedSQL is not a replacement for PROC SQL. On the contrary, both are tools that are used for specific scenarios. They offer different strengths for different situations. However, when you know one, you can easily transition to another.

Comparing PROC FedSQL and PROC SQL

Table 6.1 Comparing PROC FedSQL and PROC SQL
PROC FedSQL
PROC SQL
Complies with ANSI standard 3.
Follows ANSI standard 2.
Processes 17 ANSI data types.
Is limited to SAS data types (numeric or character).
Provides vendor-neutral ANSI SQL.
Provides SAS SQL implementation.
Is fully multi-threaded on the SAS Platform.
Is multi-threaded for sorting and indexing on the SAS Platform.
Includes very few non-ANSI SAS enhancements.
Includes many non-ANSI standard SAS enhancements.
  • PROC FedSQL processes 17 ANSI data types including BIGINT, BINARY(n), CHAR( ), VARCHAR( ). PROC SQL is limited to only SAS data types: numeric or character.
  • As noted, PROC FedSQL enables you to work with ANSI data types. Some ANSI data types allow for numbers larger than 15-16 digits, and FedSQL enables you to work with them accurately.
  • PROC FedSQL provides a scalable, threaded, high-performance way to access, manage, and share relational data in multiple data sources. When possible, PROC FedSQL queries are optimized with multi-threaded algorithms in order to resolve large-scale operations. By contrast, PROC SQL is mostly a single-threaded procedure, and PROC SQL only sorts and indexes multi-threaded on the SAS Platform.

PROC FEDSQL Syntax

The foundation of the PROC FedSQL syntax is similar to PROC SQL. The only major change is the PROC name: Instead of PROC SQL, write PROC FedSQL.
Syntax, PROC FEDSQL step:
PROC FedSQL;
SELECT col-name
FROM input-table
<WHERE clause>
<GROUP BY clause>
<HAVING clause>
<ORDER BY clause>
;
QUIT;

The LIBNAME Statement

PROC FedSQL can process any data set that is accessible via libref with a supported engine. In order to connect to a data source, the FedSQL language requires that a connection string be submitted. The connection string defines how FedSQL can connect to a data source. The FedSQL procedure generates a connection string by using the attributes of currently assigned librefs. First, submit a LIBNAME statement for the data source that you want to access (for example, submit a Base SAS LIBNAME statement or a Hadoop LIBNAME statement). Then run PROC FedSQL.
Syntax, PROC FedSQL step:
LIBNAME libref-engine <SAS/ACCESS-engine-options>;
PROC FedSQL;
SELECT col-name
FROM input-table
<WHERE clause>
<GROUP BY clause>
<HAVING clause>
<ORDER BY clause>
;
QUIT;
To run PROC FedSQL, specify the LIBNAME statement with the correct SAS/ACCESS engine and connection options. Then reference the DBMS table like a SAS library.
libname market oracle user=cert password=student
                              path=localhost schema=Analyst;
proc fedsql;
   select State,
         count(*) as TotalCustomer format=comma14.
      from market.customer
      where CreditScore > 650
      group by State
      order by TotalCustomer desc;
quit;

The LIMIT Clause

Suppose you wanted to limit the number of rows that the SELECT statement displays in the query output. With PROC SQL you could use INOBS= or OUTOBS=, but those options are not available in PROC FedSQL. However, you can use the LIMIT clause to specify the number of rows that the SELECT statement returns. The LIMIT clause is similar to OUTOBS=, but is a much more common approach to limit SQL query output rows in ANSI systems.
Syntax, LIMIT clause:
LIBNAME libref-engine <SAS/ACCESS-engine-options>;
PROC FedSQL;
SELECT col-name
FROM input-table
<LIMIT {count}>
;
QUIT;
count
specifies the number of rows that the SELECT statement returns.
Tip
count can be an integer or any simple expression that resolves to an integer value.
Note: When you use the LIMIT clause, it is recommended that you use an ORDER BY clause to create an ordered sequence. Otherwise, you can get an unpredictable subset of a query's rows.
Suppose you want to create a FedSQL query that contains the variables State, Census_Apr2010, and PopEst_Apr2018. The query is ordered by the state name. However, you want to limit the result to the first ten rows only.
libname certadv v9 'C:UsersStudentcertadv';
proc fedsql;
   select State, Census_Apr2010, PopEst_Apr2018
      from certadv.census
      order by State
      limit 10;
quit;
Output 6.5 PROC FedSQL Query Result
PROC FedSQL Query Result

The PUT Function

Suppose you want to format the output of a FedSQL query. In PROC SQL, you would use the FORMAT statement in the SELECT clause to apply the format. However, in FedSQL, the FORMAT statement does not work in applying a format to the output.
Use the PUT function to apply a format to a value in the FedSQL list to change its appearance. It is not analogous to applying a format using FORMAT= in PROC SQL. Using PUT to format a value in FedSQL also modifies the column's data type in the result set.
Syntax, PUT function:
LIBNAME libref-engine <SAS/ACCESS-engine-options>;
PROC FedSQL;
SELECT col-name <PUT(col-name, format) as col-name>
FROM input-table
;
QUIT;
col-name
identifies the variable or constant whose value you want to reformat.
format
contains the SAS or FedSQL format that you want applied to the variable or constant that is specified in the source.
Formats can be associated with any of the data types that are supported by FedSQL. However, the data types are converted. Any value that is passed to the PUT function with a numeric format is converted to NVARCHAR, VARBINARY, or BINARY. The type conversions are carried out based on the format name. Any value that is passed with a character format to the PUT function is converted to NVARCHAR.
If the result set is saved using a CREATE TABLE statement, columns that are formatted with a PUT function have a new data type (NVARCHAR, VARBINARY, or BINARY) instead of the data type that is found in the source table. Column values and data types in the source table are not affected.
Suppose you need to format your query output. There are two ways to use the PUT function to do that.
The first way is to use the PUT function with no AS keyword. This method replaces the column names with a generic column name such as Column.
libname certadv v9 'C:UsersStudentcertadv';
proc fedsql;
   select SalesRep,
          put(Sales1, dollar10.2),
          put(Sales2, dollar10.2),
          put(Sales3, dollar10.2),
          put(Sales4, dollar10.2)
      from certadv.qsales;
quit;
Output 6.6 PROC FedSQL Query Result
PROC FedSQL Query Result
The second method is to use the AS keyword and include either the current or new variable name for each time a format is associated using the PUT function.
libname certadv v9 'C:UsersStudentcertadv';
proc fedsql;
   select SalesRep,
          put(Sales1, dollar10.2) as Sales1,
          put(Sales2, dollar10.2) as Sales2,
          put(Sales3, dollar10.2) as Sales3,
          put(Sales4, dollar10.2) as Sales4
      from certadv.qsales;
quit;
Output 6.7 PROC FedSQL Query Result
PROC FedSQL Query Result

System Options in FedSQL

FedSQL does not provide any options that affect the processing of an entire SAS program or interactive SAS session from the time the option is specified until it is changed. Nor does FedSQL support SAS system options, with the exception of certain SAS invocation options such as NOPRINT and NUMBER.
Last updated: October 16, 2019
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.15.154.69