Book Description

Take a comprehensive and step-by-step approach to understanding machine learning

Key Features

  • Discover how to apply the scikit-learn uniform API in all types of machine learning models
  • Understand the difference between supervised and unsupervised learning models
  • Reinforce your understanding of machine learning concepts by working on real-world examples

Book Description

Machine learning algorithms are an integral part of almost all modern applications. To make the learning process faster and more accurate, you need a tool flexible and powerful enough to help you build machine learning algorithms quickly and easily. With The Machine Learning Workshop, you'll master the scikit-learn library and become proficient in developing clever machine learning algorithms.

The Machine Learning Workshop begins by demonstrating how unsupervised and supervised learning algorithms work by analyzing a real-world dataset of wholesale customers. Once you've got to grips with the basics, you'll develop an artificial neural network using scikit-learn and then improve its performance by fine-tuning hyperparameters. Towards the end of the workshop, you'll study the dataset of a bank's marketing activities and build machine learning models that can list clients who are likely to subscribe to a term deposit. You'll also learn how to compare these models and select the optimal one.

By the end of The Machine Learning Workshop, you'll not only have learned the difference between supervised and unsupervised models and their applications in the real world, but you'll also have developed the skills required to get started with programming your very own machine learning algorithms.

What you will learn

  • Understand how to select an algorithm that best fits your dataset and desired outcome
  • Explore popular real-world algorithms such as K-means, Mean-Shift, and DBSCAN
  • Discover different approaches to solve machine learning classification problems
  • Develop neural network structures using the scikit-learn package
  • Use the neural network (NN) algorithm to create models for predicting future outcomes
  • Perform error analysis to improve your model's performance

Who this book is for

The Machine Learning Workshop is perfect for machine learning beginners. You will need Python programming experience, though no prior knowledge of scikit-learn or machine learning is necessary.

Table of Contents

  1. The Machine Learning Workshop
  2. Second Edition
  3. Preface
    1. About the Book
      1. Audience
      2. About the Chapters
      3. Conventions
      4. Code Presentation
      5. Setting up Your Environment
      6. Installing Python on Windows and MacOS
      7. Installing Python on Linux
      8. Installing pip
      9. Installing Libraries
      10. Opening a Jupyter Notebook
      11. Accessing the Code Files
  4. 1. Introduction to Scikit-Learn
    1. Introduction
    2. Introduction to Machine Learning
      1. Applications of ML
      2. Choosing the Right ML Algorithm
    3. Scikit-Learn
      1. Advantages of Scikit-Learn
      2. Disadvantages of Scikit-Learn
      3. Other Frameworks
    4. Data Representation
      1. Tables of Data
      2. Features and Target Matrices
      3. Exercise 1.01: Loading a Sample Dataset and Creating the Features and Target Matrices
      4. Activity 1.01: Selecting a Target Feature and Creating a Target Matrix
    5. Data Preprocessing
      1. Messy Data
        1. Missing Values
        2. Outliers
      2. Exercise 1.02: Dealing with Messy Data
      3. Dealing with Categorical Features
      4. Feature Engineering
      5. Exercise 1.03: Applying Feature Engineering to Text Data
      6. Rescaling Data
      7. Exercise 1.04: Normalizing and Standardizing Data
      8. Activity 1.02: Pre-processing an Entire Dataset
    6. Scikit-Learn API
      1. How Does It Work?
        1. Estimator
        2. Predictor
        3. Transformer
    7. Supervised and Unsupervised Learning
      1. Supervised Learning
      2. Unsupervised Learning
    8. Summary
  5. 2. Unsupervised Learning – Real-Life Applications
    1. Introduction
    2. Clustering
      1. Clustering Types
      2. Applications of Clustering
    3. Exploring a Dataset – Wholesale Customers Dataset
      1. Understanding the Dataset
    4. Data Visualization
      1. Loading the Dataset Using pandas
      2. Visualization Tools
      3. Exercise 2.01: Plotting a Histogram of One Feature from the Circles Dataset
      4. Activity 2.01: Using Data Visualization to Aid the Pre-processing Process
      5. k-means Algorithm
      6. Understanding the Algorithm
        1. Initialization Methods
        2. Choosing the Number of Clusters
      7. Exercise 2.02: Importing and Training the k-means Algorithm over a Dataset
      8. Activity 2.02: Applying the k-means Algorithm to a Dataset
    5. Mean-Shift Algorithm
      1. Understanding the Algorithm
      2. Exercise 2.03: Importing and Training the Mean-Shift Algorithm over a Dataset
      3. Activity 2.03: Applying the Mean-Shift Algorithm to a Dataset
    6. DBSCAN Algorithm
      1. Understanding the Algorithm
      2. Exercise 2.04: Importing and Training the DBSCAN Algorithm over a Dataset
      3. Activity 2.04: Applying the DBSCAN Algorithm to the Dataset
    7. Evaluating the Performance of Clusters
      1. Available Metrics in Scikit-Learn
      2. Exercise 2.05: Evaluating the Silhouette Coefficient Score and Calinski–Harabasz Index
      3. Activity 2.05: Measuring and Comparing the Performance of the Algorithms
    8. Summary
  6. 3. Supervised Learning – Key Steps
    1. Introduction
    2. Supervised Learning Tasks
    3. Model Validation and Testing
      1. Data Partitioning
      2. Split Ratio
      3. Exercise 3.01: Performing a Data Partition on a Sample Dataset
      4. Cross-Validation
      5. Exercise 3.02: Using Cross-Validation to Partition the Train Set into a Training and a Validation Set
      6. Activity 3.01: Data Partitioning on a Handwritten Digit Dataset
    4. Evaluation Metrics
      1. Evaluation Metrics for Classification Tasks
        1. Confusion Matrix
        2. Accuracy
        3. Precision
        4. Recall
      2. Exercise 3.03: Calculating Different Evaluation Metrics on a Classification Task
      3. Choosing an Evaluation Metric
      4. Evaluation Metrics for Regression Tasks
      5. Exercise 3.04: Calculating Evaluation Metrics on a Regression Task
      6. Activity 3.02: Evaluating the Performance of the Model Trained on a Handwritten Dataset
    5. Error Analysis
      1. Bias, Variance, and Data Mismatch
      2. Exercise 3.05: Calculating the Error Rate on Different Sets of Data
      3. Activity 3.03: Performing Error Analysis on a Model Trained to Recognize Handwritten Digits
    6. Summary
  7. 4. Supervised Learning Algorithms: Predicting Annual Income
    1. Introduction
    2. Exploring the Dataset
      1. Understanding the Dataset
    3. The Naïve Bayes Algorithm
      1. How Does the Naïve Bayes Algorithm Work?
      2. Exercise 4.01: Applying the Naïve Bayes Algorithm
      3. Activity 4.01: Training a Naïve Bayes Model for Our Census Income Dataset
    4. The Decision Tree Algorithm
      1. How Does the Decision Tree Algorithm Work?
      2. Exercise 4.02: Applying the Decision Tree Algorithm
      3. Activity 4.02: Training a Decision Tree Model for Our Census Income Dataset
    5. The Support Vector Machine Algorithm
      1. How Does the SVM Algorithm Work?
      2. Exercise 4.03: Applying the SVM Algorithm
      3. Activity 4.03: Training an SVM Model for Our Census Income Dataset
    6. Error Analysis
      1. Accuracy, Precision, and Recall
    7. Summary
  8. 5. Artificial Neural Networks: Predicting Annual Income
    1. Introduction
    2. Artificial Neural Networks
      1. How Do ANNs Work?
        1. Forward Propagation
        2. Cost Function
        3. Backpropagation
        4. Updating the Weights and Biases
      2. Understanding the Hyperparameters
        1. Number of Hidden Layers and Units
        2. Activation Functions
        3. Regularization
        4. Batch Size
        5. Learning Rate
        6. Number of Iterations
      3. Applications of Neural Networks
      4. Limitations of Neural Networks
    3. Applying an Artificial Neural Network
      1. Scikit-Learn's Multilayer Perceptron
      2. Exercise 5.01: Applying the MLP Classifier Class
      3. Activity 5.01: Training an MLP for Our Census Income Dataset
    4. Performance Analysis
      1. Error Analysis
      2. Hyperparameter Fine-Tuning
      3. Model Comparison
      4. Activity 5.02: Comparing Different Models to Choose the Best Fit for the Census Income Data Problem
    5. Summary
  9. 6. Building Your Own Program
    1. Introduction
    2. Program Definition
      1. Building a Program – Key Stages
        1. Preparation
        2. Creation
        3. Interaction
      2. Understanding the Dataset
      3. Activity 6.01: Performing the Preparation and Creation Stages for the Bank Marketing Dataset
    3. Saving and Loading a Trained Model
      1. Saving a Model
      2. Exercise 6.01: Saving a Trained Model
      3. Loading a Model
      4. Exercise 6.02: Loading a Saved Model
      5. Activity 6.02: Saving and Loading the Final Model for the Bank Marketing Dataset
    4. Interacting with a Trained Model
      1. Exercise 6.03: Creating a Class and a Channel to Interact with a Trained Model
      2. Activity 6.03: Allowing Interaction with the Bank Marketing Dataset Model
    5. Summary
  10. Appendix
    1. 1. Introduction to Scikit-Learn
      1. Activity 1.01: Selecting a Target Feature and Creating a Target Matrix
      2. Activity 1.02: Pre-processing an Entire Dataset
    2. 2. Unsupervised Learning – Real-Life Applications
      1. Activity 2.01: Using Data Visualization to Aid the Pre-processing Process
      2. Activity 2.02: Applying the k-means Algorithm to a Dataset
      3. Activity 2.03: Applying the Mean-Shift Algorithm to a Dataset
      4. Activity 2.04: Applying the DBSCAN Algorithm to the Dataset
      5. Activity 2.05: Measuring and Comparing the Performance of the Algorithms
    3. 3. Supervised Learning – Key Steps
      1. Activity 3.01: Data Partitioning on a Handwritten Digit Dataset
      2. Activity 3.02: Evaluating the Performance of the Model Trained on a Handwritten Dataset
      3. Activity 3.03: Performing Error Analysis on a Model Trained to Recognize Handwritten Digits
    4. 4. Supervised Learning Algorithms: Predicting Annual Income
      1. Activity 4.01: Training a Naïve Bayes Model for Our Census Income Dataset
      2. Activity 4.02: Training a Decision Tree Model for Our Census Income Dataset
      3. Activity 4.03: Training an SVM Model for Our Census Income Dataset
    5. 5. Artificial Neural Networks: Predicting Annual Income
      1. Activity 5.01: Training an MLP for Our Census Income Dataset
      2. Activity 5.02: Comparing Different Models to Choose the Best Fit for the Census Income Data Problem
    6. 6. Building Your Own Program
      1. Activity 6.01: Performing the Preparation and Creation Stages for the Bank Marketing Dataset
      2. Activity 6.02: Saving and Loading the Final Model for the Bank Marketing Dataset
      3. Activity 6.03: Allowing Interaction with the Bank Marketing Dataset Model