Evaluating a model

Because statistical topic modeling is unsupervised, model selection is difficult. For some applications, there may be an extrinsic task at hand, such as information retrieval or document classification, whose performance can be measured. In general, however, we want to estimate the model's ability to generalize topics regardless of the task.

In 2009, Wallach et al. introduced an approach that measures the quality of a model by computing the log probability of held-out documents under the model. The likelihood of unseen documents can then be used to compare models: higher likelihood implies a better model.
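To build intuition for why held-out log likelihood is a sensible yardstick, consider the simplest possible case: a unigram model that assigns a probability to each word. The following is a toy sketch, not MALLET's left-to-right estimator; the class name, method name, and smoothing constant are made up for illustration:

```java
import java.util.Map;

public class HeldOutLikelihood {

    // Sum of log probabilities of held-out tokens under a unigram model.
    // Tokens never seen in training get a small smoothing probability so
    // the log likelihood stays finite.
    static double logLikelihood(String[] heldOut, Map<String, Double> model) {
        double logLike = 0.0;
        for (String token : heldOut) {
            double p = model.getOrDefault(token, 1e-6);
            logLike += Math.log(p);
        }
        return logLike;
    }

    public static void main(String[] args) {
        Map<String, Double> model =
            Map.of("topic", 0.5, "model", 0.3, "data", 0.2);
        String[] heldOutDoc = {"topic", "model", "data"};
        // A model that assigns higher probability to the held-out tokens
        // yields a higher (less negative) log likelihood.
        System.out.println("Log likelihood: "
            + logLikelihood(heldOutDoc, model));
    }
}
```

A model under which the held-out tokens are likely scores closer to zero; a model that is surprised by them scores far more negative. LDA's held-out evaluation follows the same principle, except that computing the document probability requires summing over latent topic assignments, which is what Wallach's estimators approximate.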

We will evaluate the model using the following steps:

  1. Let's split the documents into training and test sets (that is, held-out documents), where we use 90% for training and 10% for testing:
// Split dataset into 90% training, 10% testing, 0% validation 
InstanceList[] instanceSplit = instances.split(new Randoms(), 
  new double[] {0.9, 0.1, 0.0}); 
  2. Now let's rebuild our model using only 90% of our documents:
// Use the first 90% for training 
model.addInstances(instanceSplit[0]); 
model.setNumThreads(4); 
model.setNumIterations(50); 
model.estimate(); 
  3. We will initialize an estimator object that implements Wallach's log probability of held-out documents, MarginalProbEstimator:
// Get estimator 
MarginalProbEstimator estimator = model.getProbEstimator(); 
An intuitive description of LDA is summarized by Annalyn Ng in her blog at https://annalyzin.wordpress.com/2015/06/21/laymans-explanation-of-topic-modeling-with-lda-2/. To get deeper insight into the LDA algorithm, its components, and how it works, take a look at the original LDA paper by David Blei et al. (2003) at http://jmlr.csail.mit.edu/papers/v3/blei03a.html, or at the summarized presentation by D. Santhanam of Brown University at http://www.cs.brown.edu/courses/csci2950-p/spring2010/lectures/2010-03-03_santhanam.pdf.

The class implements many estimators that require quite deep theoretical knowledge of how the LDA method works. We'll pick the left-to-right evaluator, which is appropriate for a wide range of applications, including text mining and speech recognition. The left-to-right evaluator is implemented as the double evaluateLeftToRight method, accepting the following parameters:

  • Instances heldOutDocuments: This is the set of held-out test instances.
  • int numParticles: This algorithm parameter sets the number of particles used in the left-to-right sampling; the default value is 10.
  • boolean useResampling: This states whether to resample topics in left-to-right evaluation; resampling is more accurate, but leads to quadratic scaling in the length of documents.
  • PrintStream docProbabilityStream: This is the file or stdout to which we write the inferred log probabilities per document.
  4. Let's run the estimator, as follows:
double loglike = estimator.evaluateLeftToRight( 
  instanceSplit[1], 10, false, null); 
System.out.println("Total log likelihood: "+loglike); 

In our particular case, the estimator outputs the following log likelihood. The value is meaningful only when compared against other models constructed with different parameters, pipelines, or data; the higher the log likelihood, the better the model:

    Total time: 3 seconds
    Topic Evaluator: 5 topics, 3 topic bits, 111 topic mask
    Total log likelihood: -360849.4240795393

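Note that the total log likelihood grows in magnitude with the size of the test set, so when comparing models evaluated on different held-out sets, it is common practice to normalize by the number of tokens, or equivalently to report perplexity. The following sketch shows this normalization; the token count used in main is hypothetical, not a value produced by the steps above:

```java
public class LikelihoodNormalization {

    // Per-token log likelihood: comparable across test sets of
    // different sizes.
    static double perTokenLogLikelihood(double totalLogLike, long numTokens) {
        return totalLogLike / numTokens;
    }

    // Perplexity: exp of the negative per-token log likelihood.
    // Lower perplexity means a better model.
    static double perplexity(double totalLogLike, long numTokens) {
        return Math.exp(-totalLogLike / numTokens);
    }

    public static void main(String[] args) {
        double totalLogLike = -360849.4240795393; // value reported above
        long numTokens = 50_000;                  // hypothetical token count
        System.out.printf("Per-token log likelihood: %.4f%n",
            perTokenLogLikelihood(totalLogLike, numTokens));
        System.out.printf("Perplexity: %.2f%n",
            perplexity(totalLogLike, numTokens));
    }
}
```

Either form gives the same ranking as the total log likelihood when the test set is fixed, but the normalized versions stay comparable when it is not.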
Now let's take a look at how to make use of this model.
