
8. Recommender Systems

Ervin Varga, Kikinda, Serbia
When someone asks where machine learning is broadly applied in an industrial context, a typical answer is recommender systems. Indeed, these systems are ubiquitous, and we rely on them a lot. Amazon is perhaps the best example of an e-commerce site that utilizes many types of recommender systems to enhance users’ experience and help them quickly find what they are looking for. Spotify, whose domain is music, is another good example. Despite heavy usage of machine learning, recommender systems differ in two crucial ways from classically trained ML systems:
  • Besides basic content, they try to combine many additional clues to offer more personalized choices.

  • They tend to favor variability over predictability to boost excitement in users. Just imagine a recommender system whose proposals would never contain surprising items.

These traits entail a completely different set of metrics to evaluate recommender systems. If you were to try to optimize prediction accuracy solely by using the standard root mean squared error, then your system would probably not perform well in practice; it would be judged as boring. This chapter briefly introduces you to recommender systems and explains some core concepts and related techniques.

Introduction to Recommender Systems

To familiarize yourself with a recommender system, try out the freely available MovieLens project ( https://movielens.org ) without any commercial baggage. MovieLens provides personalized movie recommendations. After creating an account, it immediately asks you to distribute three points among six possible genres of movies to produce an initial profile. Later, as you rate and tag movies, it learns more about you and offers better-suited movies. This is a typical case in recommender systems: more data allows the system to create a finer-grained profile about you that can be used to filter content more successfully. In this respect, high-quality input should result in high-quality output. Figure 8-1 shows part of the main user interface of MovieLens. Try to rate a couple of movies and watch how the system adapts to your updated profile.

As with MovieLens, most recommender systems collect two types of data:
  • Explicit data: Voluntarily given by users. For example, each time you rate a movie, you intentionally provide feedback. Collecting lots of explicit data requires system designers to find ingenious ways to entice users to respond, such as by offering incentives. Of course, if you properly rate movies, then you should get better recommendations, which is the minimal incentive for anyone to bother rating products.

  • Implicit data: Collected automatically by a system while monitoring activities of users (such as clicking, purchasing, etc.). Even timing events may be valuable; for example, if a user has spent more time on a page showing the script of a particular movie than she has spent on the pages of other movies, that may indicate a higher interest in that movie. Another possibility for collecting implicit data is to trace tweets about movies to evaluate their general popularity. All in all, a multitude of sources could be combined into a single preference formula (a minimal sketch follows Figure 8-1). Naturally, acquiring implicit data is easier than acquiring explicit data, as it doesn’t require active participation by users.
Figure 8-1 MovieLens recommendations are displayed in rows that denote various categories
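To make the notion of a preference formula concrete, the following is a minimal sketch of blending implicit signals into a single score. The signal names and weights are purely illustrative assumptions, not the formula of MovieLens or any real system:
# A hypothetical blend of implicit signals into one preference score.
# Signal names and weights are invented for illustration.
IMPLICIT_WEIGHTS = {
    'page_view': 0.1,        # user opened the movie's page
    'trailer_watched': 0.3,  # user watched the trailer
    'added_to_list': 0.6,    # user put the movie on a watch list
    'purchase': 1.0          # user bought or rented the movie
}

def implicit_preference(events):
    """Combine a user's implicit events for one item into a score in [0, 1]."""
    score = sum(IMPLICIT_WEIGHTS.get(event, 0.0) for event in events)
    return min(score, 1.0)

print(implicit_preference(['page_view', 'trailer_watched']))  # 0.4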

To always give users a chance to discover new stuff, recommendations are a mixture of personalized content and nonpersonalized content (such as overall popular items). The latter may be customized based on general demographic information (for example, age group, gender, etc.). Another consideration is that sometimes you just want to explore things without “disturbing” your profile (remember that every action is tracked, including browsing and clicking, which might influence future offerings). In MovieLens, this is possible via the MovieExplorer feature (which may change over time, as it is experimental at the time of this writing). At any rate, consider in your design a similar possibility to offer an “untracked” option for your customers.

Figure 8-2 highlights another peculiarity of recommender systems compared to classical databases and other information retrieval systems. Recommender systems filter items based on learned preferences, while a database tries to answer ad hoc queries as fast as possible. The set of queries executed by MovieLens is pretty static (for example, list top movies for a user, list the most popular movies, etc.), as shown in Figure 8-1, although the results differ from user to user. If you run the same query against a relational database system, then you will get back the same output (assuming that the underlying content didn’t change) irrespective of your taste.
Figure 8-2 Within an information retrieval system, a user must explicitly identify what she is looking for (for example, in a relational database, this is accomplished by issuing SQL queries)

A recommender system learns what a user likes and dislikes and modifies the underlying queries accordingly. The output reflects the current profile (i.e., the outcome resonates with the user’s taste, assuming a personalized solution). The output categories in a recommender system are rarely altered; they are quite stable over time. A content-based recommender system may use the products database to display details about items, so it can encode taste in terms of attributes of items.
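The following is a toy sketch of the content-based idea: a user profile is built from the attributes of rated items and then used to score candidates. The genre flags and ratings are hypothetical:
import numpy as np

# Hypothetical one-hot genre attributes for three movies: [action, sci-fi, drama].
items = {'Terminator': np.array([1, 1, 0]),
         'Aliens':     np.array([1, 1, 0]),
         'Amadeus':    np.array([0, 0, 1])}
# A taste profile is the rating-weighted average of attributes of rated items;
# here the user gave Terminator 5 stars and Aliens 4 stars.
profile = (5 * items['Terminator'] + 4 * items['Aliens']) / 9
# Score items by dot product with the profile; higher means closer to the taste.
scores = {name: float(attrs @ profile) for name, attrs in items.items()}
print(max(scores, key=scores.get))  # an action/sci-fi title wins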

There are three major interfaces pertaining to recommender systems:
  • Filtering facilities: Try to remove uninteresting topics (according to the user profile) from a streaming data source, such as news, tweets, e-mails, etc. For example, a personalized news suggestion engine may pick only articles that could be significant to you.

  • Recommendation interfaces: Present a selection of items that you may like, such as shown earlier in Figure 8-1. Usually, modern recommenders are hybrid systems that leverage different recommendation algorithms to produce a mix of content. For example, a hybrid can give you both specific items matching your taste and suggestions for items that are currently popular. Furthermore, the system may use its own initial recommendation merely to trigger a dialog-based interaction, where additional feedback from a user helps narrow down the final selection.

  • Prediction interfaces: Attempt to accurately foresee how much you would like some products (for example, the number of stars you would assign to a movie you haven’t viewed). Of course, these values are estimates and may sometimes turn out to be inaccurate. In general, it doesn’t matter that much whether you will give 5 stars or 4.5 stars to an item. It is more important to offer things that you will find valuable.

Nowadays, most recommender systems are based on collaborative filtering techniques. Collaboration means that data from other users is leveraged to craft offerings for you. A system may correlate your taste with that of other users, so that you get recommendations based on what those users liked and disliked. This is the user-user type of filtering. It is also possible to correlate items regarding their ratings, which constitutes the item-item type of filtering; this is the preferred, scalable model in most systems today. Finally, it is also possible to recommend items using product association rules of the form “users who bought this also bought...”
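To make the item-item variant concrete, the following is a minimal sketch that computes cosine similarities between item rating vectors from a tiny, dense matrix; production systems work with sparse, mean-centered data, but the principle is the same:
import numpy as np

# A tiny ratings matrix (rows = users, columns = items); zeros mean "not rated."
ratings = np.array([[5, 3, 0, 1],
                    [4, 0, 0, 1],
                    [1, 1, 0, 5],
                    [1, 0, 4, 4]], dtype=float)

def cosine_sim(a, b):
    """Cosine similarity between two item (column) vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return a @ b / denom if denom else 0.0

n_items = ratings.shape[1]
sim = np.array([[cosine_sim(ratings[:, i], ratings[:, j])
                 for j in range(n_items)] for i in range(n_items)])
print(np.round(sim, 2))  # items 0 and 1 attract similar ratings, so sim[0, 1] is high
To recommend, you would then pick items most similar to those a user has already rated highly.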

Context-based systems also use contextual and environmental data to fine-tune the result (for example, depending on your mood, the system may offer different music on top of your general preference model). Such data may be collected from your smartphone; the system may use location information to reduce the list only to objects that are in your vicinity.

Simple Movie Recommender Case Study

We will build a very simple service to recommend movies similar to a movie entered by a user. The inspiration for this example comes from reference [3]. The program will use two public services and combine their results; such an arrangement is known as a mashup.

TasteDive ( https://tastedive.com , which is also its API’s base URL) is a recommendation engine for suggesting similar music, movies, TV shows, books, authors, games, and podcasts based on your taste. It offers a Resource API (level 2 REST) to acquire similar artifacts in programmatic fashion (the API is documented at /read/api). If you just hit the /api/similar endpoint (without entering an access key or providing any other parameters), you will receive the following abbreviated JSON response (the content may change over time):
{
  "Similar": {
    "Info": [
      {
        "Name": "!!!",
        "Type": "music"
      }
    ],
    "Results": [
      {
        "Name": "Meeting Of Important People",
        "Type": "music"
      },
      {
        "Name": "The Vanity Project",
        "Type": "music"
      },
      ...
    ]
  }
}
You can issue a couple of HTTP requests for free, but you will need an access key to put your future recommender service into production. Listing 8-1 shows the class to retrieve artifacts from TasteDive. This program is totally independent of our final service and can be reused in many other contexts, too. As an additional exercise, change the requests module to requests_with_caching (look it up in Chapter 24 of reference [3]).
import requests
class TasteDiveService:
    SUPPORTED_ARTIFACTS = ['music', 'movies', 'shows', 'podcasts', 'books', 'authors', 'games']
    API_URL = 'https://tastedive.com/api/similar'
    def __init__(self, artifact_type = 'movies'):
        assert artifact_type in TasteDiveService.SUPPORTED_ARTIFACTS, 'Invalid artifact type'
        self._artifact_type = artifact_type
    def _retrieve_artifacts(self, name, limit):
        params = {'q': name, 'type': self._artifact_type, 'limit': limit}
        return requests.get(TasteDiveService.API_URL, params).json()
    @staticmethod
    def _extract_titles(response):
        artifacts = response['Similar']['Results']
        return [artifact['Name'] for artifact in artifacts]
    def similar_titles(self, titles, limit = 5):
        """
        Returns a set of similar titles up to the defined limit. Each instance of this class is supposed to work only with one artifact type. This type is specified during object construction.
        """
        assert 0 < limit <= 50, 'Limit must be in range (0, 50].'
        return {similar_title
                for title in titles
                    for similar_title in TasteDiveService._extract_titles(
                        self._retrieve_artifacts(title, limit))}
Listing 8-1

tastedive_service.py Module in the simple_recommender Folder for Getting Similar Stuff from TasteDive

The following is an example of how to retrieve movies similar to Terminator (after constructing the service object, which defaults to the movies artifact type):
>> td = TasteDiveService()
>> td.similar_titles(['Terminator'])
{'Aliens', 'Commando', 'First Blood', 'Predator', 'Terminator Salvation'}
The OMDb service (see http://www.omdbapi.com ) exposes a Resource API to obtain movie information. We will use it to get ratings for movies, so that we may sort the final result from TasteDive based on this criterion. Listing 8-2 shows the class to communicate with OMDb.
import requests
class OMDbService:
    API_URL = 'http://www.omdbapi.com/'
    def __init__(self, api_key):
        self._api_key = api_key
    def retrieve_info(self, title):
        """Returns information about the movie title in JSON format."""
        params = {'apikey': self._api_key, 't': title, 'type': 'movie', 'r': 'json'}
        return requests.get(OMDbService.API_URL, params).json()
Listing 8-2

omdb_service.py Module to Help Communicate with the OMDb Service

You will need to obtain an API key to be able to use OMDb. Luckily, you can ask for a free key (with a daily limit of 1000 requests) on the web site (click the API Key tab on the home page). The following is an example of how to get information about the movie Terminator:
>> omdb = OMDbService('YOUR API KEY HERE')
>> omdb.retrieve_info('Terminator')
{'Title': 'Terminator',
 'Year': '1991',
 'Rated': 'N/A',
 'Released': 'N/A',
 'Runtime': '39 min',
 'Genre': 'Short, Action, Sci-Fi',
 'Director': 'Ben Hernandez',
 'Writer': 'James Cameron (characters), James Cameron (concept), Ben Hernandez (screenplay)',
 'Actors': 'Loris Basso, James Callahan, Debbie Medows, Michelle Kovach',
 'Plot': 'A cyborg comes from the future, to kill a girl named Sarah Lee.',
 'Language': 'English',
 'Country': 'USA',
 'Awards': 'N/A',
 'Poster': 'N/A',
 'Ratings': [{'Source': 'Internet Movie Database', 'Value': '6.2/10'}],
 'Metascore': 'N/A',
 'imdbRating': '6.2',
 'imdbVotes': '23',
 'imdbID': 'tt5817168',
 'Type': 'movie',
 'DVD': 'N/A',
 'BoxOffice': 'N/A',
 'Production': 'N/A',
 'Website': 'N/A',
 'Response': 'True'}
We are now ready to create our custom movie recommender, which sorts offerings based on their ratings. It is possible to use various rating sources. We will use the Internet Movie Database score as the primary source and fall back to the imdbRating field, as necessary. You can leverage other sources, too (like Rotten Tomatoes). Listing 8-3 shows the simple recommender mashup service.
from tastedive_service import TasteDiveService
from omdb_service import OMDbService
class SimpleMovieRecommender:
    PRIMARY_SOURCE = 'Internet Movie Database'
    def __init__(self, omdb_api_key):
        self._omdb = OMDbService(omdb_api_key)
        self._td = TasteDiveService()
    @staticmethod
    def _retrieve_rating(omdb_response):
        for rating in omdb_response['Ratings']:
            if rating['Source'] == SimpleMovieRecommender.PRIMARY_SOURCE:
                return float(rating['Value'].split('/')[0])
        return float(omdb_response['imdbRating'])
    def recommendations(self, titles, limit = 5):
        """
        Return a list of recommended movie titles up to the specified limit.
        The items are ordered according to their ratings (from top to bottom).
        """
        similar_titles = self._td.similar_titles(titles, limit)
        ratings = map(
            lambda title:
                   SimpleMovieRecommender._retrieve_rating(self._omdb.retrieve_info(title)),
            similar_titles)
        return list(map(lambda item: item[1],
                        sorted(zip(ratings, similar_titles), reverse = True)))
Listing 8-3

simple_movie_recommender.py Module to Offer Similar Movies in Sorted Order

The following is a session to retrieve ten recommendations for Terminator in sorted order:
>> smr = SimpleMovieRecommender('YOUR API KEY')
>> smr.recommendations(['Terminator'], 10)
['Aliens',
 'Die Hard',
 'Predator',
 'First Blood',
 'Robocop',
 'Lethal Weapon',
 'Commando',
 'Terminator Salvation',
 'Alien: Resurrection',
 'Robocop 2']

Introduction to LensKit for Python

LensKit for Python (LKPY) is an open-source framework for conducting offline recommender experiments. It has a highly modular design that leverages the PyData ecosystem ( https://pydata.org ). LensKit enables you to quickly fire up a recommender system and play with various algorithms and evaluation strategies. It also integrates with external recommender tools to provide a common control plane including metrics and configuration. In this section we will demonstrate how easy it is to experiment with LensKit for both research and educational purposes. You can install LensKit by issuing conda install -c lenskit lenskit.

Figure 8-3 depicts the package structure of LensKit, while Figure 8-4 shows a partial UML class diagram of the algorithms package. LensKit is very flexible, enabling you to add new algorithms or metrics that seamlessly integrate with existing artifacts, thanks to the reliance on the standardized data structures and frameworks of PyData. This is a major improvement over the previous Java-based LensKit system, which introduced proprietary data formats and kept things buried inside the tool (such as an opinionated evaluation flow, implicit features, and indirect configuration).
Figure 8-3 LensKit’s package structure (the root package is lenskit) with well-defined APIs between them

The metrics package accepts Pandas Series objects as input, so it may be combined with any external framework. There is a subtle inconsistency in the lack of a simple knn wrapper package, since user_knn and item_knn are just different implementations of nearest neighbor search (something already reflected in the corresponding class names). There are two additional wrapper classes around the implicit external framework, which are omitted here. The Fallback class is a trivial hybrid that returns the first available result from a set of predictors passed as input; this is handy when a more sophisticated algorithm has difficulty computing an output, as a simpler one may provide an alternative answer (see the sketch below). I suggest that you read reference [4] for a good overview of various approaches to implementing recommendation engines (all examples are realized in RapidMiner, with an extension for recommender systems).
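As a quick illustration of the Fallback hybrid, the following sketch backs an item-item predictor with a bias baseline; it assumes a ratings DataFrame with user, item, and rating columns (like the one loaded below), and the exact signatures may differ across LKPY versions:
from lenskit.algorithms import basic, item_knn

# If ItemItem cannot score an item (e.g., too few neighbors), the Bias
# baseline predictor fills in the missing value.
algo = basic.Fallback(item_knn.ItemItem(20), basic.Bias())
algo.fit(ratings)  # ratings: a DataFrame with user, item, and rating columns
preds = algo.predict_for_user(1, [1, 3, 6])  # predictions for user 1, items 1, 3, 6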

The next example uses the small MovieLens dataset, which you can download from http://files.grouplens.org/datasets/movielens/ml-latest-small.zip . You should unpack the archive into the data subfolder (relative to this chapter’s source code directory). I advise you to read the README.txt file, which is part of the archive, to get acquainted with its content.
Figure 8-4 The common API shared by all concrete algorithm classes follows the SciPy style, which enables unified handling of algorithms irrespective of their inner details

We will demo here the benefits of using LensKit to experiment with various recommendation engines (see also reference [2]). Just enter the following statements inside an IPython console (make sure you are inside this chapter’s source directory); all steps are recorded in the lkpy_demo.py file, so you can easily execute them all by issuing %load lkpy_demo.py . We start by importing the necessary packages:
from itertools import tee
import pandas as pd
from lenskit import batch
from lenskit import crossfold as xf
from lenskit.algorithms import funksvd, item_knn, user_knn
from lenskit.metrics import topn
Next, we load our dataset and create a Pandas DataFrame object:
ratings = pd.read_csv('data/ratings.csv')
ratings.rename({'userId': 'user', 'movieId': 'item'}, axis = 'columns', inplace = True)
print(ratings.head())
   user  item  rating  timestamp
0     1     1     4.0  964982703
1     1     3     4.0  964981247
2     1     6     4.0  964982224
3     1    47     5.0  964983815
4     1    50     5.0  964982931
It is important to rename the columns as depicted above, because the framework expects user identifiers in the user column and item identifiers in the item column. Next, we produce a dataset for user-based, five-fold cross-validation:
xf_dataset_batch, xf_dataset_test = tee(
    xf.partition_users(ratings[['user', 'item', 'rating']], 5, xf.SampleFrac(0.2)))
truth = pd.concat([test for _, test in xf_dataset_test], ignore_index = True)

The tee function generates two copies of the underlying generator. One is needed for batch evaluation, while the other is used to assemble the complete test data. The resulting truth data frame is needed later to calculate the ideal DCG for users.
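A tiny demonstration of tee’s behavior (each copy can be consumed independently, which is exactly what we need above):
from itertools import tee

# tee caches values from the shared source, so both iterators see every item.
first, second = tee(x * x for x in range(3))
print(list(first), list(second))  # [0, 1, 4] [0, 1, 4]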

Next we set up a couple of algorithms with different configurations. The MultiEval facility is perfect for sweeping over various configuration parameters and producing recommendations. The outcome from each algorithm is saved in the designated folder inside combined Parquet files (you can also make separate files and collect them by calling collect_results). Afterward, we may calculate various metrics to evaluate the achieved level of sophistication.
runner = batch.MultiEval('result', False, nprocs = 4)
runner.add_algorithms(
    [item_knn.ItemItem(10), item_knn.ItemItem(20), item_knn.ItemItem(30)],
    False,
    ['nnbrs']
)
runner.add_algorithms(
    [user_knn.UserUser(10), user_knn.UserUser(20), user_knn.UserUser(30)],
    True,
    ['nnbrs']
)
runner.add_algorithms(
    [funksvd.FunkSVD(40, damping = 0, range = (1, 5)),
     funksvd.FunkSVD(50, damping = 5, range = (1, 5)),
     funksvd.FunkSVD(60, damping = 10, range = (1, 5))],
    False,
    ['features', 'damping']
)
runner.add_datasets(xf_dataset_batch)
runner.run()
The UserUser algorithm is much slower than the other two. This is why you definitely want to run it in parallel. For each algorithm, you can collect pertinent attributes to trace configuration. The next couple of statements load the results and merge them into a nice, unified record structure:
runs = pd.read_parquet('result/runs.parquet',
                       columns = ('AlgoClass','RunId','damping','features','nnbrs'))
runs.rename({'AlgoClass': 'Algorithm'}, axis = 'columns', inplace = True)
def extract_config(x):
    from math import isnan
    damping, features, nnbrs = x
    result = ""
    if not isnan(damping):
        result = "damping=%.2f " % damping
    if not isnan(features):
        result += "features=%.2f " % features
    if not isnan(nnbrs):
        result += "nnbrs=%.2f" % nnbrs
    return result.strip()
runs['Configuration'] = runs[['damping','features','nnbrs']].apply(extract_config, axis = 1)
runs.drop(columns = ['damping','features','nnbrs'], inplace = True)
recs = pd.read_parquet('result/recommendations.parquet')
recs = recs.merge(runs, on = 'RunId')
recs.drop(columns = ['RunId'], inplace = True)
print(recs.head(10))
     item     score  user  rank  rating Algorithm Configuration
0   98154  5.273100     3     1     0.0  ItemItem   nnbrs=10.00
1    4429  5.027890     3     2     0.0  ItemItem   nnbrs=10.00
2    1341  5.002217     3     3     0.0  ItemItem   nnbrs=10.00
3  165103  4.991935     3     4     0.0  ItemItem   nnbrs=10.00
4    4634  4.871810     3     5     0.0  ItemItem   nnbrs=10.00
5   98279  4.871810     3     6     0.0  ItemItem   nnbrs=10.00
6    7008  4.869243     3     7     0.0  ItemItem   nnbrs=10.00
7    6530  4.777890     3     8     0.0  ItemItem   nnbrs=10.00
8   32770  4.777890     3     9     0.0  ItemItem   nnbrs=10.00
9    4956  4.777890     3    10     0.0  ItemItem   nnbrs=10.00
The extract_config function collects relevant configuration information per algorithm. We will group an algorithm together with its configuration to decide what works best. The next lines compute the nDCG Top-N accuracy metric (see also Exercise 8-1):
user_dcg = recs.groupby(['Algorithm', 'Configuration', 'user']).rating.apply(topn.dcg)
user_dcg = user_dcg.reset_index(name='DCG')
ideal_dcg = topn.compute_ideal_dcgs(truth)
user_ndcg = pd.merge(user_dcg, ideal_dcg)
user_ndcg['nDCG'] = user_ndcg.DCG / user_ndcg.ideal_dcg
user_ndcg = user_ndcg.groupby(['Algorithm', 'Configuration']).nDCG.mean()
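For reference, a common formulation of DCG over the top k recommended items is given below, where rel_i is the rating of the item at rank i; nDCG divides by the DCG of an ideally ordered list (LensKit’s implementation may differ in minor details, such as the treatment of the first rank):
$$ \mathrm{DCG}@k=\sum_{i=1}^{k}\frac{rel_i}{\log_2(i+1)},\qquad \mathrm{nDCG}@k=\frac{\mathrm{DCG}@k}{\mathrm{IDCG}@k} $$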
The next two lines produce the bar plot, as shown in Figure 8-5:
%matplotlib inline
user_ndcg.plot.bar()
Figure 8-5 The average nDCG for each algorithm and configuration pair

LensKit helps you to experiment with a wide range of recommendation algorithms and evaluate them in a standardized fashion. It will be interesting to see whether LensKit will support the Predictive Model Markup Language (see http://dmg.org ) in the future as a means of exchanging recommender models. Instead of coding up the pipeline manually, the pipeline could then come as an XML input formatted according to the PMML schema. PMML-serialized models can be managed by GUI tools (like RapidMiner, which can even tune model parameters via generic optimizers, including one based on a genetic algorithm), which further streamlines the whole process.

Exercise 8-1. Report Prediction Accuracy

Read about prediction accuracy metrics in LKPY’s documentation ( https://lkpy.lenskit.org/en/stable/ ). Notice that we have called our evaluator with batch.MultiEval('result', False, nprocs = 4). Change the second argument to True to turn on predictions. You will need to evaluate accuracy using metrics available in the lenskit.metrics.predict package.

This will also be a good opportunity to try the Fallback algorithm to cope with missing data.

Exercise 8-2. Implement a New Metric

Among the Top-N accuracy metrics, you will find two classification metrics (a.k.a. decision-support metrics): precision (P) and recall (R). The former reports how many recommended elements are relevant, while the latter reports how many relevant items are considered by a recommender system. Recall is crucial, since it speaks to whether a recommender will really be able to “surprise” you with proper offerings (i.e., avoid missing useful stuff). This may alleviate a known problem called the filter bubble, where a recommender becomes biased by the sparse input a user provides for specific items. The fact that there are missing ratings for many items doesn’t imply that those items are worthless for a user.

Usually, we want to balance precision and recall. It is trivial to maximize recall by simply returning everything. F-metrics give us an evenhanded answer. Implement $$ {F}_1=\frac{2PR}{P+R} $$ in a similar fashion as we have done with nDCG. Rank algorithms based on this new metric.

Summary

In an online world abundant with information and product offerings, recommender systems may come as a savior. They can select items (news articles, books, movies, music, etc.) matching our interests (assuming a personalized version) and suggest them to us. To diversify the list, most recommenders are hybrids that mix highly customized items and generic items. Moreover, many systems also take into account the current context to narrow down the possibilities. The overall influence of a recommender system may be judged by performing an A/B test and monitoring whether our activities have been altered in a statistically significant way.

Nonetheless, there are two major topics that must be properly handled:
  • Privacy and confidentiality of users’ preference data, since recommenders may pile up lots of facts based on explicit and implicit input. Users must be informed how this data is handled and why a particular item is recommended (tightly associated with interpretability of machine learning models), and users must be able to manage preference data (for example, you should be able to delete some facts about you from a system). Chapter 9 discusses privacy and confidentiality in depth.

  • Negative biases induced by machine-based taste formation. It is a bit disturbing that there are already artists (and consultancy companies to help them) who try to produce content appealing to leading recommender systems. On the other hand, to bootstrap content-based engines, some systems mechanically preprocess and extract features from content (for example, doing signal processing on audio data to discover the rhythm, pitch, and impression of songs). We definitely wouldn’t like to see items at the long tail of any distribution disappear just because they aren’t recommender-friendly.

References

  1. F. Maxwell Harper and Joseph A. Konstan, “The MovieLens Datasets: History and Context,” ACM Transactions on Interactive Intelligent Systems 5, no. 4, 2015; doi: https://doi.org/10.1145/2827872 .

  2. Michael D. Ekstrand, “The LKPY Package for Recommender Systems Experiments: Next-Generation Tools and Lessons Learned from the LensKit Project,” Computer Science Faculty Publications and Presentations 147, Boise State University, presented at the REVEAL 2018 Workshop on Offline Evaluation for Recommender Systems, Oct. 7, 2018; doi: https://doi.org/10.18122/cs_facpubs/147/boisestate ; arXiv: https://arxiv.org/abs/1809.03125 .

  3. Brad Miller, Paul Resnick, Lauren Murphy, Jeffrey Elkner, Peter Wentworth, Allen B. Downey, Chris Meyers, and Dario Mitchell, Foundations of Python Programming, Runestone Interactive, https://fopp.umsi.education/runestone/static/fopp/index.html .

  4. Vijay Kotu and Bala Deshpande, Data Science: Concepts and Practice, 2nd Edition, Morgan Kaufmann Publishers, 2018.