Classifying Newsgroup Topics with Support Vector Machines

In the previous chapter, we built a spam email detector with Naïve Bayes. This chapter continues our exploration of supervised learning and classification, focusing on multiclass classification and support vector machine (SVM) classifiers. SVM has been one of the most popular algorithms for text classification. It searches for a decision boundary that separates data from different classes, and we will discuss in detail how that works. We will also implement the algorithm with scikit-learn and TensorFlow and apply it to several real-life problems: newsgroup topic classification, fetal state categorization from cardiotocography, and breast cancer prediction.

We will cover the following topics in detail:

  • What is a support vector machine?
  • The mechanics of SVM through three cases
  • The implementations of SVM with scikit-learn
  • Multiclass classification strategies
  • The kernel method
  • SVM with non-linear kernels
  • How to choose between linear and Gaussian kernels
  • Overfitting and reducing overfitting in SVM
  • Newsgroup topic classification with SVM
  • Tuning with grid search and cross-validation
  • Fetal state categorization using SVM with a non-linear kernel
  • Breast cancer prediction with TensorFlow
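
Before diving in, it may help to see what "searching for a decision boundary" can look like in code. The following is only a minimal sketch, not the scikit-learn or TensorFlow implementations this chapter goes on to use: it fits a linear boundary by sub-gradient descent on the regularized hinge loss, in plain Python. The toy data, the function names (train_linear_svm, decision_value, predict), and the hyperparameter values are all assumptions made for illustration.

```python
# Toy 2-D data invented for illustration: two linearly separable classes.
samples = [
    ((2.0, 2.0), 1), ((3.0, 3.0), 1), ((2.0, 3.0), 1), ((3.0, 2.0), 1),
    ((-2.0, -2.0), -1), ((-3.0, -3.0), -1), ((-2.0, -3.0), -1), ((-3.0, -2.0), -1),
]


def decision_value(w, b, x):
    """Signed score w . x + b; its sign says which side of the boundary x is on."""
    return w[0] * x[0] + w[1] * x[1] + b


def train_linear_svm(data, learning_rate=0.1, reg_strength=0.01, epochs=100):
    """Fit w and b by sub-gradient descent on the regularized hinge loss."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            violates_margin = y * decision_value(w, b, x) < 1
            # The regularization term always shrinks w, which widens the margin.
            w = [wi * (1 - learning_rate * reg_strength) for wi in w]
            if violates_margin:
                # Points inside the margin (or misclassified) pull the boundary.
                w = [wi + learning_rate * y * xi for wi, xi in zip(w, x)]
                b += learning_rate * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the learned boundary."""
    return 1 if decision_value(w, b, x) >= 0 else -1


w, b = train_linear_svm(samples)
predictions = [predict(w, b, x) for x, _ in samples]
```

Here the learned boundary is the line where decision_value equals zero. Only points that fall inside the margin trigger an update, which hints at why a small subset of the data (the support vectors) ends up determining the boundary.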