Recommendation system and STING

STatistical INformation Grid (STING) is a grid-based clustering algorithm. The data space is recursively divided into cells that form a hierarchical structure. The whole input dataset serves as the root node of the hierarchy, and each cell in a layer is composed of several cells in the layer below it. An example is shown in the following diagram:

[Diagram: hierarchical grid structure of STING]

To support queries on a dataset, the statistical information of each cell is calculated in advance and stored for further processing; these values are also called statistical parameters.
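As a rough illustration of what precomputing these parameters might look like, the following R sketch builds a small hierarchy for 2-D points. It is not the content of ch_06_sting.R. The function names (leaf_stats, merge_blocks, build_hierarchy), the unit-square domain, the 2 x 2 split of every parent cell, and the choice of parameters (count, sum, minimum, maximum) are all assumptions made for this example.

```r
# Hypothetical helper: statistics of the bottom layer, a g x g grid over [0, 1]^2.
leaf_stats <- function(x, y, value, g) {
  # Map coordinates to cell indices 1..g (the upper edge falls in cell g).
  fx <- factor(pmin(floor(x * g) + 1, g), levels = seq_len(g))
  fy <- factor(pmin(floor(y * g) + 1, g), levels = seq_len(g))
  stat <- function(f, empty) {
    m <- tapply(value, list(fx, fy), f)  # NA where a cell holds no points
    m[is.na(m)] <- empty
    m
  }
  list(n = stat(length, 0), sum = stat(sum, 0),
       min = stat(min, Inf), max = stat(max, -Inf))
}

# Hypothetical helper: combine every 2 x 2 block of child cells with f.
merge_blocks <- function(m, f) {
  g <- nrow(m) / 2
  out <- matrix(0, g, g)
  for (i in seq_len(g)) for (j in seq_len(g)) {
    out[i, j] <- f(m[(2 * i - 1):(2 * i), (2 * j - 1):(2 * j)])
  }
  out
}

# Build all layers bottom-up; the parent statistics are derived from the
# children only, so the raw points are read a single time.
build_hierarchy <- function(x, y, value, depth) {
  layers <- list(leaf_stats(x, y, value, g = 2^depth))
  while (nrow(layers[[1]]$n) > 1) {
    child <- layers[[1]]
    parent <- list(n   = merge_blocks(child$n, sum),
                   sum = merge_blocks(child$sum, sum),
                   min = merge_blocks(child$min, min),
                   max = merge_blocks(child$max, max))
    layers <- c(list(parent), layers)
  }
  layers  # layers[[1]] is the root, layers[[depth + 1]] the bottom layer
}

set.seed(1)
pts <- data.frame(x = runif(200), y = runif(200), value = rnorm(200))
layers <- build_hierarchy(pts$x, pts$y, pts$value, depth = 3)
layers[[1]]$n  # the root cell counts all 200 points
```

The mean of any cell can be recovered as sum / n, which is why the sketch stores the sum rather than the mean: sums and counts of child cells can simply be added to obtain the parent's values.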

The characteristics of the STING algorithm include (but are not limited to) the following:

  • A query-independent structure
  • Intrinsically parallelizable
  • Efficiency

The STING algorithm

The summarized pseudocode for the STING algorithm is as follows:

[Figure: pseudocode for the STING algorithm]
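The top-down idea behind STING query processing can be sketched in R as follows. This is a simplified illustration under stated assumptions, not the book's pseudocode or the code in ch_06_sting.R: the functions count_layers and sting_query are hypothetical, the only stored parameter is the point count, and a cell is treated as relevant when its count reaches a threshold min_count. A full STING implementation would typically decide relevance from the stored statistical parameters instead.

```r
# Hypothetical helper: count matrices for every layer over [0, 1]^2.
# Layer 1 is the 1 x 1 root; each parent cell covers a 2 x 2 block of children.
count_layers <- function(x, y, depth) {
  g <- 2^depth
  ix <- factor(pmin(floor(x * g) + 1, g), levels = seq_len(g))
  iy <- factor(pmin(floor(y * g) + 1, g), levels = seq_len(g))
  layers <- list(matrix(as.vector(table(ix, iy)), g, g))
  while (nrow(layers[[1]]) > 1) {
    m <- layers[[1]]
    odd <- seq(1, nrow(m), by = 2)
    parent <- m[odd, odd, drop = FALSE] + m[odd + 1, odd, drop = FALSE] +
      m[odd, odd + 1, drop = FALSE] + m[odd + 1, odd + 1, drop = FALSE]
    layers <- c(list(parent), layers)
  }
  layers
}

# Top-down query: find bottom-layer cells holding at least min_count points.
# A parent below the threshold cannot contain a qualifying child, so its
# whole subtree is skipped.
sting_query <- function(layers, min_count) {
  relevant <- matrix(c(1, 1), ncol = 2)  # (row, column) of the root cell
  for (k in seq_along(layers)) {
    m <- layers[[k]]
    keep <- m[relevant] >= min_count     # matrix indexing by (row, column)
    relevant <- relevant[keep, , drop = FALSE]
    if (nrow(relevant) == 0 || k == length(layers)) break
    # Expand each surviving cell into its four children in the next layer.
    i <- relevant[, 1]
    j <- relevant[, 2]
    relevant <- rbind(cbind(2 * i - 1, 2 * j - 1), cbind(2 * i - 1, 2 * j),
                      cbind(2 * i, 2 * j - 1), cbind(2 * i, 2 * j))
  }
  relevant
}

set.seed(42)
x <- runif(500)
y <- runif(500)
grid <- count_layers(x, y, depth = 3)
dense_cells <- sting_query(grid, min_count = 12)
```

Because the layers are built once and do not depend on min_count, the same structure can answer many different queries, and sibling subtrees can be examined independently of one another.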

The R implementation

Please take a look at the R code file ch_06_sting.R from the R code bundle for the previously mentioned algorithm. The code can be tested with the following command:

> source("ch_06_sting.R")

Recommendation systems

Relying on statistical, data-mining, and knowledge-discovery techniques, recommendation systems are used by most e-commerce sites to make it easier for consumers to find products to purchase. The three main parts (representation of input data, neighborhood formation, and recommendation generation) are shown in the following diagram:

[Diagram: the three main parts of a recommendation system]
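To make the three parts concrete, here is a minimal R sketch of one common approach, user-based collaborative filtering. The source does not say which technique the chapter uses, so the choice of cosine similarity, the toy rating matrix, the convention that 0 means "not rated", and the function names (cosine_sim, nearest_neighbors, recommend) are all assumptions made for illustration.

```r
# Part 1, representation of input data: a user x item rating matrix.
# Assumption: 0 means the user has not rated the item.
ratings <- matrix(c(5, 4, 0, 0, 1,
                    4, 5, 4, 1, 0,
                    5, 5, 5, 0, 2,
                    1, 0, 2, 5, 4),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(paste0("u", 1:4), paste0("item", 1:5)))

# Part 2, neighborhood formation: cosine similarity between users,
# then the k most similar users (excluding the user itself).
cosine_sim <- function(m) {
  norms <- sqrt(rowSums(m^2))
  (m %*% t(m)) / (norms %o% norms)
}

nearest_neighbors <- function(sim, user, k) {
  s <- sim[user, ]
  s[user] <- -Inf
  names(sort(s, decreasing = TRUE))[seq_len(k)]
}

# Part 3, recommendation generation: score each unrated item by the
# similarity-weighted average of the neighbors' ratings.
recommend <- function(m, user, k = 2, top_n = 1) {
  sim <- cosine_sim(m)
  nb <- nearest_neighbors(sim, user, k)
  w <- sim[user, nb]
  scores <- colSums(m[nb, , drop = FALSE] * w) / sum(w)
  scores <- scores[m[user, ] == 0]  # keep only items the user has not rated
  names(sort(scores, decreasing = TRUE))[seq_len(min(top_n, length(scores)))]
}

recommend(ratings, "u1")  # "item3"
```

In this toy data, u2 and u3 rate items much like u1 does and both rate item3 highly, so item3 is suggested ahead of item4.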