Association rules for learning recommendation system

Although it is not often used in commercial recommendation systems, association rule learning is worth knowing about for historical reasons, and it can be applied to a wide range of real-world problems. The main idea of this method is to find relationships among items based on a statistical measure of how often the items occur in a database of transactions T (for example, a transaction could be the set of movies seen by a user i or the products bought by i). More formally, a rule has the form {item1,item2} => {item3}, that is, the presence of one set of items ({item1,item2}) implies the presence of another set ({item3}). Two measures are used to characterize each rule X=>Y:

  • Support: Given a set of items X, the support supp(X) is the fraction of all transactions that contain the set X.
  • Confidence: The fraction of the transactions containing the set X that also contain the set Y: conf(X=>Y)=supp(X U Y)/supp(X). Note that conf(X=>Y) can differ substantially from conf(Y=>X).
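The two definitions above can be checked on a small example. The following is a minimal sketch, assuming each transaction is stored as a Python set; the movie titles and the helper names supp and conf are chosen here for illustration and are not taken from the book's code.

```python
# Toy transaction database: each transaction is the set of movies one user liked.
# The titles are illustrative data, not taken from the text.
transactions = [
    {"Alien", "Blade Runner", "Heat"},
    {"Alien", "Blade Runner"},
    {"Alien", "Heat"},
    {"Blade Runner"},
]


def supp(itemset, transactions):
    """Fraction of transactions that contain every item of itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def conf(X, Y, transactions):
    """Confidence of the rule X => Y, that is supp(X U Y) / supp(X)."""
    support_x = supp(X, transactions)
    if support_x == 0:
        return 0.0  # X never occurs, so the rule cannot be evaluated
    return supp(set(X) | set(Y), transactions) / support_x


print(supp({"Alien"}, transactions))             # 3 of 4 transactions
print(conf({"Alien"}, {"Heat"}, transactions))   # 2 of the 3 Alien transactions
print(conf({"Heat"}, {"Alien"}, transactions))   # both Heat transactions
```

On this data the rule {Heat} => {Alien} has confidence 1, while {Alien} => {Heat} has confidence 2/3, which illustrates the asymmetry of the confidence measure.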

Support represents the frequency of a rule in the transaction database, while confidence indicates the probability that set Y occurs given that set X is present. In practice, a minimum support threshold is chosen to limit the number of rules mined from the database (the higher the threshold, the fewer rules satisfy it), while the confidence can be thought of as a similarity measure between sets X and Y.

In the case of the movie recommendation system, the transaction database can be generated from the utility matrix R by considering the movies each user likes, and we look for rules in which the sets X and Y each contain only one item (movie). These rules are collected in a matrix, ass_matrix, in which each entry ass_matrix_ij represents the confidence of the rule i => j. The recommendations for a given user are obtained by simply multiplying ass_matrix by the user's ratings vector u_vec, so that each movie j receives a score equal to the sum over i of u_vec_i * ass_matrix_ij, and then sorting the scores from the largest value (the most recommended movie) to the smallest (the least recommended). This method therefore does not predict ratings but produces a ranked list of movie recommendations; however, it is fast and it also works well with a sparse utility matrix. Note that to find all the possible combinations of items that form sets X and Y as fast as possible, two algorithms have been developed in the literature: Apriori and FP-growth (not discussed here, since we only require rules with one item in each of X and Y).
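The recommendation step can be sketched as follows. This is an illustration only: the confidence values are made up, the helper names rec_scores and rank_movies are hypothetical, and the score of movie j is assumed to be the sum over i of the user's rating of movie i times the confidence of the rule i => j, which follows from the definition of the matrix entries.

```python
movies = ["Alien", "Blade Runner", "Heat"]

# Hypothetical confidence matrix: ass_matrix[i][j] = conf(movie i => movie j).
# The values are invented for illustration.
ass_matrix = [
    [0.0, 0.9, 0.4],
    [0.6, 0.0, 0.2],
    [0.3, 0.7, 0.0],
]

# Ratings of one user; 0 marks an unrated movie (an assumption of this sketch).
u_vec = [5, 0, 3]


def rec_scores(u_vec, ass_matrix):
    """Score of movie j = sum over i of u_vec[i] * ass_matrix[i][j]."""
    n = len(ass_matrix)
    return [sum(u_vec[i] * ass_matrix[i][j] for i in range(n)) for j in range(n)]


def rank_movies(scores, movies):
    """Movie titles sorted from the highest score to the lowest."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return [movies[j] for j in order]


scores = rec_scores(u_vec, ass_matrix)
print(rank_movies(scores, movies))
```

Here the unrated movie "Blade Runner" ends up first because both movies the user rated point to it with high confidence. The scores are ranking weights, not predicted ratings. Whether movies the user has already rated should be dropped from the final list is not stated in the text, so this sketch leaves them in.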

The class that implements the method is as follows:

[Code listing not reproduced in this extract.]

The class constructor takes as input parameters the utility matrix Umatrix, the list of movie titles Movieslist, the minimum support and confidence thresholds min_support and min_confidence (default 0.1), and likethreshold, which is the minimum rating value for a movie to be included in a transaction (default 3). The function combine_lists finds all the possible rules, while filterSet reduces the rules to the subset that satisfies the minimum support threshold. calc_confidence_matrix fills ass_matrix with the confidence values that satisfy the minimum confidence threshold (entries are otherwise set to 0 by default), and GetRecItems returns the list of recommended movies given the user ratings u_vec.
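To make the description concrete, here is a minimal sketch of how such a class could be organized. It is not the book's listing: the class name AssociationRulesSketch and the helper support are hypothetical, the method signatures and bodies are assumptions, and a movie is assumed to enter a transaction when its rating is greater than or equal to likethreshold. Only the method and parameter names mentioned in the text are reused.

```python
from itertools import permutations


class AssociationRulesSketch:
    """Illustrative sketch; signatures and bodies are assumptions."""

    def __init__(self, Umatrix, Movieslist, min_support=0.1,
                 min_confidence=0.1, likethreshold=3):
        self.Movieslist = Movieslist
        self.min_support = min_support
        self.min_confidence = min_confidence
        n_items = len(Movieslist)
        # One transaction per user: indices of the movies the user likes.
        # Assumption: "likes" means rating >= likethreshold.
        self.transactions = [
            {j for j, rating in enumerate(row) if rating >= likethreshold}
            for row in Umatrix
        ]
        # Keep frequent single movies, then frequent one-to-one rules.
        frequent = [s[0] for s in self.filterSet([(i,) for i in range(n_items)])]
        rules = self.filterSet(self.combine_lists(frequent))
        self.ass_matrix = self.calc_confidence_matrix(rules, n_items)

    def support(self, items):
        """Fraction of transactions containing all the given item indices."""
        if not self.transactions:
            return 0.0
        items = set(items)
        hits = sum(1 for t in self.transactions if items <= t)
        return hits / len(self.transactions)

    def combine_lists(self, items):
        """All ordered pairs (i, j) with i != j, read as candidate rules i => j."""
        return list(permutations(items, 2))

    def filterSet(self, candidates):
        """Keep only the candidates that reach the minimum support."""
        return [c for c in candidates if self.support(c) >= self.min_support]

    def calc_confidence_matrix(self, rules, n_items):
        """Entry [i][j] is conf(i => j) if it reaches min_confidence, else 0."""
        matrix = [[0.0] * n_items for _ in range(n_items)]
        for i, j in rules:
            support_i = self.support((i,))
            if support_i == 0:
                continue
            confidence = self.support((i, j)) / support_i
            if confidence >= self.min_confidence:
                matrix[i][j] = confidence
        return matrix

    def GetRecItems(self, u_vec):
        """Movie titles sorted by score, most recommended first."""
        n = len(self.ass_matrix)
        scores = [sum(u_vec[i] * self.ass_matrix[i][j] for i in range(n))
                  for j in range(n)]
        order = sorted(range(n), key=lambda j: scores[j], reverse=True)
        return [self.Movieslist[j] for j in order]


# Toy utility matrix (rows = users, columns = movies, 0 = unrated); invented data.
R = [
    [5, 4, 0],
    [4, 5, 1],
    [5, 0, 4],
    [0, 4, 2],
]
movies = ["Alien", "Blade Runner", "Heat"]
model = AssociationRulesSketch(R, movies)
print(model.GetRecItems([0, 5, 0]))
```

With this toy matrix, a user who has only rated "Blade Runner" gets "Alien" as the top recommendation, because two of the three users who like "Blade Runner" also like "Alien". Plain lists are used to keep the sketch dependency-free; a NumPy array would make the matrix product a single call.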
