SAS In-Memory Analytics Technology
SAS In-Memory Analytics technology takes advantage of the large number of threads
and the large amount of memory available in some specially configured DBMS
appliances, such as Teradata and EMC Greenplum, and on commodity hardware using
the Hadoop Distributed File System (HDFS). The Teradata and EMC Greenplum appliances
are assembled as computing clusters using specific massively parallel processing (MPP)
techniques. With SAS In-Memory Analytics technology, all of the data to be processed is
distributed across the cluster and loaded into memory before the analytic procedure
begins. This contrasts with traditional processing, where data is loaded in blocks
as they are needed. In addition, SAS High-Performance Analytics procedures are
engineered to execute on hundreds of threads, and each thread is responsible for a small
subset of the overall data to be processed. Faster analysis of large data sets leaves time
for more modeling iterations, which results in greater refinement of analytic models.
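
The partition-then-combine pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea, not SAS code and not how the SAS procedures are implemented: the function names (partition, partial_stats, parallel_mean) are hypothetical, the "cluster" is a single process, and the statistic is a simple mean. The whole data set is held in memory first, each worker thread summarizes its own slice, and the partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Split rows into n_parts contiguous slices of near-equal size."""
    size, extra = divmod(len(rows), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        # The first `extra` slices take one additional row each.
        end = start + size + (1 if i < extra else 0)
        parts.append(rows[start:end])
        start = end
    return parts


def partial_stats(part):
    """Work done by one thread: row count and sum for its slice."""
    return len(part), sum(part)


def parallel_mean(rows, n_threads=4):
    """Mean of a non-empty list, computed from per-thread partial results."""
    parts = partition(rows, n_threads)  # data is fully in memory already
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = list(pool.map(partial_stats, parts))
    total_n = sum(n for n, _ in results)
    total_sum = sum(s for _, s in results)
    return total_sum / total_n
```

Calling parallel_mean(list(range(1, 101)), n_threads=4) gives each of the four threads 25 rows. In CPython the threads do not run arithmetic truly in parallel, so the sketch shows only how the work is divided, not the speedup that an MPP appliance would provide.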
Several members of the SAS High-Performance Analytics family of products are based
on SAS In-Memory Analytics technology, including the SAS High-Performance
Analytics Server, SAS Visual Analytics, and SAS High-Performance Risk. For more
detail on the SAS In-Memory Analytics components, see the In-Memory
Analytics website (Products & Solutions/In-Memory Analytics).
SAS High-Performance Analytics Server
The SAS High-Performance Analytics Server is engineered to run in threads and
provides high-performance analytic procedures that focus on predictive model
development with computationally intensive calculations. These procedures are
drawn from the libraries of SAS/STAT, SAS/QC, and SAS/ETS. The procedures
execute in the MPP computing environment provided by EMC Greenplum and
Teradata appliances and Hadoop clusters. High-performance MPP configurations
typically have a minimum of 1.5 TB of memory and upward of 192 cores, with multiple
threads per core.
The SAS High-Performance Analytics procedures are invoked on the requesting
SAS client, where a Base SAS session is executing. The client can be the traditional
Display Manager System, SAS Enterprise Guide, or SAS Enterprise Miner (through
its SAS High-Performance Data Mining tab). In the MPP environment, the
SAS client communicates with the SAS High-Performance Analytics Server nodes,
where a thin SAS environment executes a copy of the requested SAS procedure or
DS2 code. When processing is complete, the analytic results are returned to the
requesting application on the SAS client.
The PERFORMANCE statement in the SAS High-Performance Analytics
procedures enables you to specify parameters that control threading and the mode of
processing: SMP (client mode) or MPP (distributed mode). In SMP mode (a
SAS session on the client machine), CPUCOUNT defaults to the number of CPUs
on the client machine. The CPUCOUNT and NOTHREADS options can override the
SAS system options. If the procedure executes in MPP mode, the default number
of threads is determined by the number of CPUs on the appliance nodes. The
NTHREADS option, which is available only in the PERFORMANCE statement,
throttles the number of threads.
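
The defaulting rules in the preceding paragraph can be summarized as a small decision function. The Python sketch below is a reading aid, not SAS syntax: resolve_thread_count and its parameters are hypothetical names, and it assumes that an NTHREADS value, when given, simply replaces the default in either mode. The text says only that NTHREADS "throttles" the thread count, so the actual behavior may differ.

```python
def resolve_thread_count(mode, client_cpus, node_cpus, nthreads=None):
    """Return the thread count implied by the rules described in the text.

    mode        -- "SMP" (client mode) or "MPP" (distributed mode)
    client_cpus -- number of CPUs on the client machine
    node_cpus   -- number of CPUs on an appliance node
    nthreads    -- optional explicit request, standing in for NTHREADS
    """
    if nthreads is not None:
        # Assumption: an explicit request replaces the default outright.
        if nthreads < 1:
            raise ValueError("nthreads must be a positive integer")
        return nthreads
    if mode == "SMP":
        # Default in client mode: CPUs on the client machine.
        return client_cpus
    if mode == "MPP":
        # Default in distributed mode: CPUs on the appliance nodes.
        return node_cpus
    raise ValueError("mode must be 'SMP' or 'MPP'")
```

For example, with an 8-CPU client and 24-CPU appliance nodes, the sketch yields 8 threads in SMP mode, 24 in MPP mode, and 4 in either mode when nthreads=4 is requested.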
The SAS High-Performance Analytics Server and procedures are documented in the
SAS High-Performance Server Administration Guide. The SAS High-Performance
Analytics Server relies on the SAS LASR Analytic Server to provide a highly
scalable and reliable analytics infrastructure that is optimized for large volumes of
data and complex computations.
216 Chapter 13 Support for Parallel Processing