Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Web sentiment analysis and CLIQUE

CLustering In QUEst (CLIQUE) is a bottom-up and grid-based clustering algorithm. The idea behind this algorithm is the Apriori feature, that is, the monotonicity of dense units with respect to dimensionality. If a set of data points, S, is a cluster in a k-dimensional projection of the space, then S is also contained in a cluster in any (k-1)-dimensional projections of this space.

The algorithm proceeds by passes. The one-dimensional dense units are produced by one pass through the data. The candidate k-dimensional units are generated using the candidate-generation procedure and the determined (k-l)-dimensional dense units that are fetched at the (k-1) pass.

The characteristics of the CLIQUE algorithm are as follows:

Efficient for high-dimensional datasets
Interpretability of results
Scalability and usability

The CLIQUE algorithm contains three steps to cluster a dataset. First, a group of subspaces is selected to cluster the dataset. Then, clustering is independently executed in every subspace. Finally, a concise summary of every cluster is produced in the form of a disjunctive normal form (DNF) expression.

The CLIQUE algorithm

The summarized pseudocode for the CLIQUE algorithm is as follows:

Indentification of subspaces that contain clusters.
Indentification of cluster.
Generation minimal description for the cluster.

The candidate-generation algorithm is illustrated as follows:

Here is the algorithm to find the connected components of the graph; this is equivalent to finding clusters:

The R implementation

Please take a look at the R codes file ch_06_clique.R from the bundle of R codes for the previously mentioned algorithm. The codes can be tested with the following command:

> source("ch_06_clique.R")