Machine learning and recommendation engines

Machine learning is the study of methods that enable computers to learn from experience. More formally, a learning task is defined as:

 

"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

 
 --Machine Learning, Tom M. Mitchell

In simple words, suppose T is the task of playing chess. After you have played some games, you have gained experience, which we call E. The performance measure P, for example the fraction of games you win, tells us how well you have learned to play. It follows from this definition that P should not decrease as E grows; otherwise, the learning strategy is not worth our time. For human beings, this learning process comes naturally. Making computers learn, however, is a different game altogether. Machine learning tools and techniques enable computers to learn such strategies. Typically, we want to keep learning until further experience no longer improves performance.

In general, machine learning techniques fall into three broad categories:

  • Supervised learning: When we can make a computer learn from historical data that is labeled with the correct answers
  • Unsupervised learning: When we just want to understand the structure of data we are presented with
  • Reinforcement learning: When we want the computer to learn, by trial and error, the actions that maximize a reward

Covering all machine learning techniques is beyond the scope of this book; however, we will cover some of them in Chapter 4, Machine Learning Algorithms, specifically those that are relevant to creating a recommendation engine.

Our objective with this book is to build a recommendation engine using Scala. How does machine learning fit in here?

A recommendation engine is also called a recommender system. Given the plethora of information present in an information retrieval system, the task of a recommender system is to show a user only what is relevant. Let's take the common example of an e-commerce store. You log on to an e-commerce site and search for a headphone. The website has thousands of headphones in its inventory. Which ones should the website show you? That's one kind of decision a recommender system helps a website make. Of course, our discussion will cover the ways and means of making that decision so as to keep the customer engaged and increase sales at the same time.

The concept of "relevant" data implies a connection between the actor (the user browsing the site) and the object (the headphone). We cannot just magically know how relevant an object is to an actor. We have to find some measure of relevance, and we need data to back that measure. For example:

  • A popular headphone may be relevant
  • A headphone that is both inexpensive and of high quality may be relevant
  • A headphone owned by a user similar to the user logged in may be relevant as well
  • A headphone similar to one that a user already browsed may be relevant

Do you see a pattern here? We need to make lots of decisions to come up with a good recommendation. This is where machine learning algorithms help us with understanding the data better.
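
The four signals listed above can be combined by hand into a single relevance score. The following is only a minimal sketch under stated assumptions, not the approach this book develops: the `Headphone` fields, the `RelevanceSketch` object, the 0 to 5 rating scale, the use of Jaccard similarity over tags, and the equal weighting of the four signals are all hypothetical choices made for illustration.

```scala
// Hypothetical data model for illustration; none of these names come from a library.
final case class Headphone(
  id: String,
  purchases: Int,     // popularity signal: units sold
  price: Double,      // price in a single currency
  rating: Double,     // average quality rating on a 0 to 5 scale (assumed)
  tags: Set[String]   // descriptive features such as "wireless"
)

object RelevanceSketch {

  // Jaccard similarity of two tag sets: size of intersection over size of union.
  def jaccard(a: Set[String], b: Set[String]): Double = {
    val union = a union b
    if (union.isEmpty) 0.0 else (a intersect b).size.toDouble / union.size
  }

  // Combines the four signals from the list above into one score.
  // Equal weights are an arbitrary assumption, not a recommendation.
  def score(
    item: Headphone,
    catalog: Seq[Headphone],
    ownedBySimilarUsers: Set[String],  // ids owned by users similar to this one
    lastBrowsed: Option[Headphone]
  ): Double = {
    // A floor of 1 avoids division by zero on an empty or all-zero catalog.
    val maxPurchases = catalog.map(_.purchases).foldLeft(1)(_ max _)
    val maxPrice = catalog.map(_.price).foldLeft(1.0)(_ max _)

    val popularity = item.purchases.toDouble / maxPurchases
    val value = (item.rating / 5.0) * (1.0 - item.price / maxPrice)
    val social = if (ownedBySimilarUsers.contains(item.id)) 1.0 else 0.0
    val similarity = lastBrowsed.map(b => jaccard(item.tags, b.tags)).getOrElse(0.0)

    (popularity + value + social + similarity) / 4.0
  }

  // Returns the n highest-scoring items, skipping the one just browsed.
  def recommend(
    catalog: Seq[Headphone],
    ownedBySimilarUsers: Set[String],
    lastBrowsed: Option[Headphone],
    n: Int
  ): Seq[Headphone] =
    catalog
      .filterNot(h => lastBrowsed.exists(_.id == h.id))
      .sortBy(h => -score(h, catalog, ownedBySimilarUsers, lastBrowsed))
      .take(n)
}
```

Calling `RelevanceSketch.recommend(catalog, ownedIds, Some(browsedItem), 5)` would return up to five headphones ranked by this hand-tuned score. The weakness of the sketch is exactly the point of the discussion: someone has to pick the signals and their weights by hand, whereas a learning algorithm can estimate them from data.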

Recommendation systems are not limited to e-commerce sites. They show up in many places we see every day:

  • Friend suggestions on Facebook
  • People you may know on LinkedIn
  • News you may be interested in
  • Advertisements you see on your phones (ad placement)
  • Movies you may want to watch (think of Netflix)
  • Music you may want to listen to (think of Spotify)
  • Places you may love to visit
  • Food you may relish at some restaurant

And so on; the possibilities are endless.

We just saw how many applications are possible with recommendation engines. It is machine learning, however, that helps us make our recommendations even better. Machine learning is an interdisciplinary field that integrates scientific techniques from many fields, such as information theory, cognitive science, mathematics, and artificial intelligence, to name a few. For now, let's conclude this section with the statement:

 

"Machine learning is the most exciting field of all the computer sciences. Sometimes I actually think that machine learning is not only the most exciting thing in computer science, but also the most exciting thing in all of human endeavor."

 
 --Andrew Ng, Associate Professor at Stanford and Chief Scientist of Baidu

Well, if you have read this far, you are already part of this exciting field!
