CLustering In QUEst (CLIQUE) is a bottom-up, grid-based clustering algorithm. The idea behind this algorithm is the Apriori property, that is, the monotonicity of dense units with respect to dimensionality: if a set of data points, S, is a cluster in a k-dimensional projection of the space, then S is also contained in a cluster in every (k-1)-dimensional projection of that space.
The algorithm proceeds in passes. The one-dimensional dense units are produced by one pass through the data. The candidate k-dimensional units are then generated by the candidate-generation procedure from the (k-1)-dimensional dense units determined in pass k-1.
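The first pass can be sketched in R as follows. This is a minimal illustration, not the code from the book's bundle: the function name dense_units_1d and the parameters xi (number of equal-width intervals per dimension) and tau (minimum fraction of points for a unit to count as dense) are assumptions made for the example.

```r
# Hypothetical helper: find the one-dimensional dense units in a single pass.
# xi  = number of equal-width intervals per dimension (assumed parameter)
# tau = minimum fraction of all points a unit must hold to be dense (assumed)
dense_units_1d <- function(data, xi, tau) {
  data <- as.matrix(data)
  n <- nrow(data)
  result <- list()
  for (d in seq_len(ncol(data))) {
    x <- data[, d]
    # Equal-width interval boundaries over the observed range of dimension d
    breaks <- seq(min(x), max(x), length.out = xi + 1)
    # Interval index (1..xi) of every point; the maximum goes into the last interval
    bin <- findInterval(x, breaks, rightmost.closed = TRUE)
    counts <- tabulate(bin, nbins = xi)
    # Keep the intervals whose share of the points exceeds the threshold
    for (b in which(counts / n > tau)) {
      result[[length(result) + 1]] <- c(dim = d, interval = b)
    }
  }
  result
}

# Toy data: two groups of points near opposite corners of the unit square
toy <- cbind(x = c(0.1, 0.15, 0.2, 0.8, 0.85, 0.9, 0.5, 1.0, 0.0),
             y = c(0.1, 0.2, 0.15, 0.9, 0.8, 0.85, 0.5, 0.0, 1.0))
units <- dense_units_1d(toy, xi = 4, tau = 0.3)
```

With this toy data, the first and last intervals of each dimension hold four of the nine points each, so they are the only intervals that pass the 0.3 threshold.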
The characteristics of the CLIQUE algorithm are as follows:
The CLIQUE algorithm contains three steps to cluster a dataset. First, a group of subspaces is selected to cluster the dataset. Then, clustering is independently executed in every subspace. Finally, a concise summary of every cluster is produced in the form of a disjunctive normal form (DNF) expression.
The summarized pseudocode for the CLIQUE algorithm is as follows:
The candidate-generation algorithm is illustrated as follows:
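A minimal sketch of Apriori-style candidate generation in R, under stated assumptions: each unit is represented as a list with sorted dimension indices (dims) and one interval index per dimension (bins), and the names unit_key and generate_candidates are hypothetical. It joins (k-1)-dimensional dense units that agree on their first k-2 dimensions and then prunes any candidate that has a non-dense (k-1)-dimensional projection.

```r
# A unit is list(dims = sorted dimension indices, bins = interval index per dim).
# unit_key builds a string identifier such as "1:1|2:4" for fast lookup.
unit_key <- function(u) paste(u$dims, u$bins, sep = ":", collapse = "|")

generate_candidates <- function(dense_prev) {
  if (length(dense_prev) == 0) return(list())
  k1 <- length(dense_prev[[1]]$dims)            # this is k - 1
  keys <- vapply(dense_prev, unit_key, character(1))
  head_idx <- seq_len(k1 - 1)                   # positions of the first k - 2 dims
  candidates <- list()
  for (a in dense_prev) {
    for (b in dense_prev) {
      # Join step: same first k-2 (dimension, interval) pairs, and the last
      # dimension of a must come before the last dimension of b
      same_head <- all(a$dims[head_idx] == b$dims[head_idx]) &&
        all(a$bins[head_idx] == b$bins[head_idx])
      if (!same_head || a$dims[k1] >= b$dims[k1]) next
      cand <- list(dims = c(a$dims, b$dims[k1]), bins = c(a$bins, b$bins[k1]))
      # Prune step: every (k-1)-dimensional projection must itself be dense
      keep <- TRUE
      for (j in seq_along(cand$dims)) {
        proj <- list(dims = cand$dims[-j], bins = cand$bins[-j])
        if (!(unit_key(proj) %in% keys)) {
          keep <- FALSE
          break
        }
      }
      if (keep) candidates[[length(candidates) + 1]] <- cand
    }
  }
  candidates
}

# Four one-dimensional dense units: intervals 1 and 4 in dimensions 1 and 2
dense1 <- list(list(dims = 1, bins = 1), list(dims = 1, bins = 4),
               list(dims = 2, bins = 1), list(dims = 2, bins = 4))
cands <- generate_candidates(dense1)
```

The candidates returned here are only potentially dense; a further pass over the data is still needed to count the points in each candidate and keep those that meet the density threshold.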
Here is the algorithm to find the connected components of the graph; this is equivalent to finding clusters:
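A minimal sketch of this step in R, assuming the dense units of one subspace are given as a matrix with one row per unit and one column of interval indices per dimension. The function name connected_units is hypothetical, and adjacency is taken to mean that two units share a common face, that is, they differ by one interval in exactly one dimension.

```r
# Label the connected components among the dense units of a single subspace.
# bins: matrix of interval indices, one row per dense unit (assumed layout).
connected_units <- function(bins) {
  bins <- as.matrix(bins)
  n <- nrow(bins)
  label <- integer(n)            # 0 means "not yet visited"
  current <- 0L
  for (start in seq_len(n)) {
    if (label[start] != 0L) next
    current <- current + 1L      # start a new cluster
    label[start] <- current
    stack <- start
    # Depth-first search from the starting unit
    while (length(stack) > 0) {
      u <- stack[length(stack)]
      stack <- stack[-length(stack)]
      for (v in which(label == 0L)) {
        # Units sharing a face differ by one interval in exactly one dimension
        if (sum(abs(bins[u, ] - bins[v, ])) == 1) {
          label[v] <- current
          stack <- c(stack, v)
        }
      }
    }
  }
  label
}

# Five dense units in a two-dimensional subspace
grid_units <- rbind(c(1, 1), c(1, 2), c(2, 2), c(4, 4), c(4, 3))
cluster_id <- connected_units(grid_units)
```

In this example the first three units form one chain of adjacent cells and the last two form another, so they receive two different cluster labels.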
Please take a look at the R code file ch_06_clique.R from the bundle of R code for the previously mentioned algorithm. The code can be tested with the following command:
> source("ch_06_clique.R")
Web sentiment analysis is used to identify the opinion or attitude expressed in a text, for example, in microblog posts such as those on Twitter. One simple approach is to compare the words of a post against a predefined list of labeled words and judge its sentiment from the matches. Another example is classifying a movie review as thumbs up or thumbs down.
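The word-list comparison can be sketched in R as follows. The word lists, the function name score_sentiment, and the sample posts are all made up for illustration; a real application would rely on a curated sentiment lexicon and more careful text preprocessing.

```r
# Hypothetical, tiny word lists used only for this illustration
positive_words <- c("good", "great", "love", "excellent", "happy")
negative_words <- c("bad", "awful", "hate", "terrible", "sad")

# Score = number of positive words minus number of negative words
score_sentiment <- function(text) {
  # Lowercase, then split on anything that is not a letter or an apostrophe
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  sum(words %in% positive_words) - sum(words %in% negative_words)
}

posts <- c("I love this movie, great acting!",
           "Terrible plot and awful pacing.",
           "It was a film.")
scores <- vapply(posts, score_sentiment, integer(1), USE.NAMES = FALSE)
labels <- ifelse(scores > 0, "thumbs up",
                 ifelse(scores < 0, "thumbs down", "neutral"))
```

Counting matches in this way ignores negation and context (for example, "not good"), which is why it serves only as a simple baseline.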
Web sentiment analyses are used in news article analyses of biases pertaining to specific views, newsgroup evaluation, and so on.