Evaluation

A clustering algorithm's quality can be estimated with the log-likelihood measure, which indicates how well the identified clusters explain the data. The dataset is split into multiple folds; for each fold, the model is fitted on the remaining folds and the held-out fold is then scored. The motivation is that, if the clustering algorithm assigns a high probability to similar data that wasn't used to fit parameters, then it has probably done a good job of capturing the data structure. Weka offers the ClusterEvaluation class to estimate it for density-based clusterers, as follows:

double logLikelihood = ClusterEvaluation.crossValidateModel(model, data, 10, new Random(1));
System.out.println(logLikelihood);

It provides the following output:

-8.773410259774291 
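To make the idea behind the cross-validated log-likelihood concrete, the following is a minimal sketch that does not use Weka. It assumes a deliberately simple stand-in model, a single one-dimensional Gaussian fitted to the training folds, rather than a real clusterer; the class name CvLogLikelihoodSketch and its methods are hypothetical and are not part of the Weka API. Weka's own fold assignment and averaging may differ in detail.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class CvLogLikelihoodSketch {

    // Log density of x under a normal distribution with the given mean and variance.
    public static double logDensity(double x, double mean, double variance) {
        double diff = x - mean;
        return -0.5 * Math.log(2 * Math.PI * variance) - diff * diff / (2 * variance);
    }

    // Average log-likelihood per held-out instance over numFolds folds.
    // Assumption: the "model" is one Gaussian, standing in for a density-based clusterer.
    public static double crossValidatedLogLikelihood(double[] data, int numFolds, Random random) {
        List<Double> shuffled = new ArrayList<>();
        for (double d : data) {
            shuffled.add(d);
        }
        Collections.shuffle(shuffled, random);

        double total = 0.0;
        for (int fold = 0; fold < numFolds; fold++) {
            // Fit the mean on every instance that is NOT in the current fold.
            double sum = 0.0;
            int n = 0;
            for (int i = 0; i < shuffled.size(); i++) {
                if (i % numFolds != fold) {
                    sum += shuffled.get(i);
                    n++;
                }
            }
            double mean = sum / n;

            // Fit the variance on the same training instances (floored to avoid log(0)).
            double squares = 0.0;
            for (int i = 0; i < shuffled.size(); i++) {
                if (i % numFolds != fold) {
                    double diff = shuffled.get(i) - mean;
                    squares += diff * diff;
                }
            }
            double variance = Math.max(squares / n, 1e-6);

            // Score the held-out instances, which played no part in fitting.
            for (int i = 0; i < shuffled.size(); i++) {
                if (i % numFolds == fold) {
                    total += logDensity(shuffled.get(i), mean, variance);
                }
            }
        }
        return total / data.length;
    }

    public static void main(String[] args) {
        // Synthetic data, generated only for illustration.
        Random generator = new Random(1);
        double[] data = new double[200];
        for (int i = 0; i < data.length; i++) {
            data[i] = 5.0 + 2.0 * generator.nextGaussian();
        }
        System.out.println(crossValidatedLogLikelihood(data, 10, new Random(1)));
    }
}
```

Values closer to zero mean the held-out data was assigned a higher probability. The number is mainly useful for comparing models on the same dataset, for example when choosing the number of clusters.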