Log-likelihood ratios recommendation system method

The log-likelihood ratio (LLR) measures how unlikely it is that two events A and B are independent, that is, whether they occur together more often than their individual frequencies alone would predict. In other words, a high LLR indicates a significant co-occurrence of A and B, with a frequency higher than would be expected if the two event variables were independent.

It has been shown by Ted Dunning (http://tdunning.blogspot.it/2008/03/surprise-and-coincidence.html) that the LLR can be expressed based on binomial distributions for events A and B using a matrix k with the following entries:



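            A       Not A
B           k11     k12
Not B       k21     k22

Here k11 counts the observations in which A and B occur together, k12 those with B but not A, k21 those with A but not B, and k22 those with neither.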

$$\mathrm{LLR} = 2N\left[H(\mathrm{rowSums}(k)) + H(\mathrm{colSums}(k)) - H(k)\right]$$

Here, $N = \sum_{i,j} k_{ij}$ is the total number of observations and $H(p) = -\sum_i p_i \log p_i$ is the Shannon entropy that measures the information contained in the vector p. Each argument of H is divided by N so that its entries sum to 1.

Note: $H(\mathrm{rowSums}(k)) + H(\mathrm{colSums}(k)) - H(k)$ is also called the Mutual Information (MI) of the two event variables A and B, measuring how the occurrences of the two events depend on each other.
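The entropy form of the LLR can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names entropy and llr are hypothetical, the natural logarithm is used, and the entropy is taken with the usual minus sign, so that LLR = 2N [H(row sums) + H(column sums) - H(k)].

```python
import math


def entropy(counts):
    """Shannon entropy (natural log) of counts normalised to probabilities."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # Zero counts contribute nothing (p * log p -> 0 as p -> 0).
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def llr(k11, k12, k21, k22):
    """LLR of a 2x2 count matrix: 2 * N * mutual information."""
    n = k11 + k12 + k21 + k22
    if n == 0:
        return 0.0
    h_cells = entropy([k11, k12, k21, k22])
    h_rows = entropy([k11 + k12, k21 + k22])
    h_cols = entropy([k11 + k21, k12 + k22])
    return 2.0 * n * (h_rows + h_cols - h_cells)


# Counts that factorise exactly (independent events) give an LLR of zero,
# while perfectly co-occurring events give a large value.
independent = llr(10, 10, 10, 10)
associated = llr(20, 0, 0, 20)
```

With these counts, the independent case evaluates to zero (up to floating-point rounding), and the fully associated case to 2 * 40 * log(2), since the mutual information of two perfectly coupled balanced events is log(2).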

This test is also called the G² test (or G-test), and it has proven effective at detecting co-occurrences of rare events (especially in text analysis), so it is useful with sparse databases (or a sparse utility matrix, in our case).
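To give a feeling for the scale of G² values, the statistic can be turned into an approximate p-value. The sketch below assumes the standard asymptotic result that, for a 2x2 matrix of independent events, G² follows a chi-square distribution with one degree of freedom, whose upper tail is erfc(sqrt(x/2)). The function name g2_p_value is hypothetical, and with very small counts the value should be read as a rough guide rather than an exact probability.

```python
import math


def g2_p_value(g2):
    """Approximate p-value of a G² statistic for a 2x2 table.

    Uses the chi-square (1 degree of freedom) upper tail,
    P(X > x) = erfc(sqrt(x / 2)).
    """
    # Clamp tiny negative values that can come from floating-point rounding.
    return math.erfc(math.sqrt(max(g2, 0.0) / 2.0))


p_at_threshold = g2_p_value(3.841459)  # conventional 5% critical value
p_no_evidence = g2_p_value(0.0)        # no departure from independence
```

In a recommender the LLR is typically used for ranking pairs of items rather than for formal hypothesis testing, so a conversion like this mainly helps to choose a cutoff below which associations are ignored.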

In our case, the events A and B are the like or dislike of two movies A and B by a user, where a user likes a movie if the rating is greater than 3 and dislikes it otherwise. The algorithm is implemented as a class, described below.


The constructor takes as input the utility matrix, the list of movie titles, and likethreshold, the rating threshold used to decide whether a user likes a movie (default 3). The function loglikelihood_ratio generates the matrix of LLR values for each pair of movies i and j by calculating the matrix k (calc_k) and the corresponding LLR (calc_llr). The function GetRecItems returns the list of recommended movies for the user whose ratings are given by u_vec (the method does not predict rating values).
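A class with this interface might look roughly like the following sketch. The method names loglikelihood_ratio, calc_k, calc_llr, and GetRecItems and the parameters likethreshold and u_vec come from the description above; everything else is an assumption made for illustration: the class name LogLikelihood, the attribute names, the convention that a rating of 0 means "not rated", and in particular the scoring rule in GetRecItems (each unseen movie is scored by the rating-weighted sum of its LLR values with the movies the user has rated).

```python
import math


class LogLikelihood:
    """Illustrative item-item LLR recommender (pure Python, 0 = unrated)."""

    def __init__(self, utility, movies, likethreshold=3):
        self.utility = utility            # one list of ratings per user
        self.movies = movies              # movie titles, one per column
        self.likethreshold = likethreshold
        self.items_llr = self.loglikelihood_ratio()

    def calc_k(self, a, b):
        """2x2 counts over users who rated both movies.

        Row 0 / 1: movie b liked / disliked; column 0 / 1: movie a liked / disliked.
        """
        k = [[0, 0], [0, 0]]
        for ratings in self.utility:
            ra, rb = ratings[a], ratings[b]
            if ra == 0 or rb == 0:
                continue  # assumption: 0 marks a missing rating
            row = 0 if rb > self.likethreshold else 1
            col = 0 if ra > self.likethreshold else 1
            k[row][col] += 1
        return k

    @staticmethod
    def _entropy(counts):
        total = sum(counts)
        if total == 0:
            return 0.0
        return -sum(c / total * math.log(c / total) for c in counts if c > 0)

    def calc_llr(self, k):
        """LLR = 2 * N * (H(row sums) + H(column sums) - H(k))."""
        n = sum(k[0]) + sum(k[1])
        if n == 0:
            return 0.0
        h_cells = self._entropy(k[0] + k[1])
        h_rows = self._entropy([sum(k[0]), sum(k[1])])
        h_cols = self._entropy([k[0][0] + k[1][0], k[0][1] + k[1][1]])
        # Clamp rounding noise: the true value is never negative.
        return max(0.0, 2.0 * n * (h_rows + h_cols - h_cells))

    def loglikelihood_ratio(self):
        """Symmetric matrix of LLR values for every pair of movies."""
        m = len(self.movies)
        llr = [[0.0] * m for _ in range(m)]
        for i in range(m):
            for j in range(i + 1, m):
                llr[i][j] = llr[j][i] = self.calc_llr(self.calc_k(i, j))
        return llr

    def GetRecItems(self, u_vec):
        """Unseen movies ranked by rating-weighted LLR (assumed scoring rule)."""
        seen = {i for i, r in enumerate(u_vec) if r > 0}
        scores = {
            j: sum(u_vec[i] * self.items_llr[i][j] for i in seen)
            for j in range(len(self.movies))
            if j not in seen
        }
        ranked = sorted(scores, key=scores.get, reverse=True)
        return [self.movies[j] for j in ranked]


# Toy data: ratings of A and B move together, C is unrelated to both.
utility = [
    [5, 5, 5], [5, 5, 1], [1, 1, 5], [1, 1, 1],
    [5, 4, 4], [4, 5, 2], [2, 1, 5], [1, 2, 2],
]
rec = LogLikelihood(utility, ["A", "B", "C"])
recommended = rec.GetRecItems([5, 0, 0])  # user has only rated movie A
```

One design caveat of this sketch: the LLR measures the strength of an association, not its direction, so a pair of movies where liking one goes with disliking the other also receives a high value. A more careful version could, for example, keep only pairs with k11 * k22 > k12 * k21.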
