Apache MADlib

One of the lesser-known but feature-rich platforms is Apache MADlib, which performs analytics and runs algorithms in-database: its functions execute inside the database itself, so no external programming interface is needed to process the data. It supports parallel processing and works with databases such as Greenplum and PostgreSQL, among others.

As an example, an Apriori association rules model can be created by simply running a SQL command, as shown here (from http://madlib.apache.org/docs/latest/group__grp__assoc__rules.html):

SELECT * FROM madlib.assoc_rules(.25,            -- Support
                                 .5,             -- Confidence
                                 'trans_id',     -- id col
                                 'product',      -- Product col
                                 'test_data',    -- Input data
                                 NULL,           -- Output schema
                                 TRUE            -- Verbose output
);
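
The call above expects a table named test_data with one row per (transaction, item) pair. The following is a minimal sketch of how such a table might be set up and how the generated rules might be inspected afterwards. The sample rows are hypothetical, and the output table name (assoc_rules) and its columns (pre, post, support, confidence, lift) are assumptions based on the MADlib documentation, so check them against the version you have installed.

```sql
-- Hypothetical sample data; table and column names match the call above.
DROP TABLE IF EXISTS test_data;
CREATE TABLE test_data (
    trans_id INT,   -- transaction identifier
    product  TEXT   -- one purchased item per row
);

INSERT INTO test_data (trans_id, product) VALUES
    (1, 'beer'), (1, 'diapers'), (1, 'chips'),
    (2, 'beer'), (2, 'diapers'),
    (3, 'beer'), (3, 'diapers'),
    (4, 'beer'), (4, 'chips'),
    (5, 'beer');

-- Run madlib.assoc_rules(...) as shown above, then inspect the results.
-- Assumption: because the output schema was NULL, the rules are written to
-- a table named assoc_rules in the current schema.
SELECT pre, post, support, confidence, lift
FROM assoc_rules
ORDER BY support DESC, confidence DESC;
```

Here pre and post would hold the left-hand and right-hand item sets of each rule, and only rules meeting the support (.25) and confidence (.5) thresholds passed to the function would be returned.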

Further information about Apache MADlib (screenshot of site shown below) is available at http://madlib.apache.org.
