Chapter 11. Brief Tour of Machine Learning

This chapter takes the reader on a whirlwind tour of machine learning, focusing on pandas as a tool for preprocessing the data that machine learning programs consume. It also introduces scikit-learn, one of the most popular machine learning toolkits for Python.

In this chapter, we illustrate machine learning techniques by applying them to a well-known problem: classifying which passengers survived the sinking of the Titanic in 1912. The topics addressed in this chapter include the following:

  • Role of pandas in machine learning
  • Installation of scikit-learn
  • Introduction to machine learning concepts
  • Application of machine learning – Kaggle Titanic competition
  • Data analysis and preprocessing using pandas
  • Naïve approach to Titanic problem
  • scikit-learn ML classifier interface
  • Supervised learning algorithms
  • Unsupervised learning algorithms

Role of pandas in machine learning

The library we will use for machine learning is scikit-learn. This Python library provides an extensive collection of machine learning algorithms that can be used to create adaptive programs that learn from data inputs.

However, before data can be used by scikit-learn, it usually must undergo some preprocessing. This is where pandas comes in: it can be used to preprocess and filter data before passing it to an algorithm implemented in scikit-learn.
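The division of labor described above can be sketched in a few lines. This is only an illustration under stated assumptions: the tiny in-memory table, its column names (Age, Sex, Fare, Survived), and the preprocess helper are hypothetical stand-ins loosely modeled on passenger data, and the decision tree is an arbitrary choice of classifier rather than the method used later in the chapter.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data; the values and column names are illustrative only.
raw = pd.DataFrame({
    "Age": [22.0, 38.0, None, 35.0, 54.0, 2.0],
    "Sex": ["male", "female", "female", "female", "male", "male"],
    "Fare": [7.25, 71.28, 7.92, 53.10, 51.86, 21.08],
    "Survived": [0, 1, 1, 1, 0, 0],
})


def preprocess(df):
    """Return a cleaned copy: no missing ages, and Sex encoded as 0/1."""
    out = df.copy()
    # Fill missing ages with the median of the known ages.
    out["Age"] = out["Age"].fillna(out["Age"].median())
    # scikit-learn estimators expect numeric features, so encode the strings.
    out["Sex"] = out["Sex"].map({"male": 0, "female": 1})
    return out


# pandas handles the cleaning and the selection of columns...
clean = preprocess(raw)
X = clean[["Age", "Sex", "Fare"]]
y = clean["Survived"]

# ...and scikit-learn handles the learning.
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)
predictions = model.predict(X)
```

Here the predictions are made on the same rows the model was trained on, which only demonstrates the hand-off from pandas to scikit-learn; it says nothing about how well the model would do on unseen passengers.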
