Book Description

Cut through the noise and get real results with a step-by-step approach to understanding supervised learning algorithms

Key Features

  • Ideal for those getting started with machine learning for the first time
  • A step-by-step machine learning tutorial with exercises and activities that help build key skills
  • Structured to let you progress at your own pace, on your own terms
  • Use your physical print copy to redeem free access to the online interactive edition

You already know you want to understand supervised learning, and a smarter way to do that is to learn by doing. The Supervised Learning Workshop focuses on building up your practical skills so that you can build and deploy solutions that leverage key supervised learning algorithms. You'll learn from real examples that lead to real results.

Throughout The Supervised Learning Workshop, you'll take an engaging step-by-step approach to understanding supervised learning. You won't have to sit through any unnecessary theory. If you're short on time, you can jump into a single exercise each day or spend an entire weekend learning how to predict future values with autoregressors. It's your choice. Learning on your terms, you'll build up and reinforce key skills in a way that feels rewarding.

Every physical print copy of The Supervised Learning Workshop unlocks access to the interactive edition. With videos detailing all exercises and activities, you'll always have a guided solution. You can also benchmark yourself against assessments, track progress, and receive content updates. You'll even earn a secure credential that you can share and verify online upon completion. It's a premium learning experience that's included with your printed copy. To redeem, follow the instructions located at the start of your book.

Fast-paced and direct, The Supervised Learning Workshop is the ideal companion for those with some Python background who are getting started with machine learning. You'll learn how to apply key algorithms like a data scientist, building expertise along the way. This process means that your new skills will stick, embedded as best practice: a solid foundation for the years ahead.

What you will learn

  • Get to grips with the fundamentals of supervised learning algorithms
  • Discover how to use Python libraries for supervised learning
  • Learn how to load a dataset in pandas for testing
  • Use different types of plots to visually represent the data
  • Distinguish between regression and classification problems
  • Learn how to perform classification using K-NN and decision trees
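To give a flavor of the skills listed above, here is a minimal sketch of the kind of workflow the book walks through: loading a dataset into a pandas DataFrame, splitting it, and fitting a K-NN classifier with scikit-learn. The iris dataset is used here purely as a stand-in; the book works through its own datasets (such as the Titanic dataset) in guided exercises.

```python
# Hypothetical end-to-end sketch: load data into pandas, split, fit K-NN.
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a sample dataset as a pandas DataFrame (features plus a 'target' column)
iris = load_iris(as_frame=True)
df = iris.frame

# Hold out a test set so the model is evaluated on unseen data
X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="target"), df["target"], random_state=0
)

# Fit a K-nearest-neighbors classifier and score it on the held-out set
knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
accuracy = knn.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

The same load-split-fit-score pattern recurs throughout the exercises, with decision trees or other estimators swapped in for `KNeighborsClassifier`.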

Who this book is for

Our goal at Packt is to help you be successful, in whatever it is you choose to do. The Supervised Learning Workshop is ideal for those with a Python background, who are just starting out with machine learning. Pick up a Workshop today, and let Packt help you develop skills that stick with you for life.

Table of Contents

  1. Preface
    1. About the Book
      1. Audience
      2. About the Chapters
      3. Conventions
      4. Before You Begin
      5. Installation and Setup
      6. Installing the Code Bundle
  2. 1. Fundamentals of Supervised Learning Algorithms
    1. Introduction
      1. When to Use Supervised Learning
    2. Python Packages and Modules
      1. Loading Data in Pandas
      2. Exercise 1.01: Loading and Summarizing the Titanic Dataset
      3. Exercise 1.02: Indexing and Selecting Data
      4. Exercise 1.03: Advanced Indexing and Selection
      5. Pandas Methods
      6. Exercise 1.04: Splitting, Applying, and Combining Data Sources
      7. Quantiles
      8. Lambda Functions
      9. Exercise 1.05: Creating Lambda Functions
    3. Data Quality Considerations
      1. Managing Missing Data
      2. Class Imbalance
      3. Low Sample Size
      4. Activity 1.01: Implementing Pandas Functions
    4. Summary
  3. 2. Exploratory Data Analysis and Visualization
    1. Introduction
    2. Exploratory Data Analysis (EDA)
    3. Summary Statistics and Central Values
      1. Exercise 2.01: Summarizing the Statistics of Our Dataset
    4. Missing Values
      1. Finding Missing Values
      2. Exercise 2.02: Visualizing Missing Values
      3. Imputation Strategies for Missing Values
      4. Exercise 2.03: Performing Imputation Using Pandas
      5. Exercise 2.04: Performing Imputation Using Scikit-Learn
      6. Exercise 2.05: Performing Imputation Using Inferred Values
      7. Activity 2.01: Summary Statistics and Missing Values
    5. Distribution of Values
      1. Target Variable
      2. Exercise 2.06: Plotting a Bar Chart
      3. Categorical Data
      4. Exercise 2.07: Identifying Data Types for Categorical Variables
      5. Exercise 2.08: Calculating Category Value Counts
      6. Exercise 2.09: Plotting a Pie Chart
      7. Continuous Data
        1. Skewness
        2. Kurtosis
      8. Exercise 2.10: Plotting a Histogram
      9. Exercise 2.11: Computing Skew and Kurtosis
      10. Activity 2.02: Visually Representing the Distribution of Values
    6. Relationships within the Data
      1. Relationship between Two Continuous Variables
        1. Pearson's Coefficient of Correlation
      2. Exercise 2.12: Plotting a Scatter Plot
      3. Exercise 2.13: Plotting a Correlation Heatmap
        1. Using Pairplots
      4. Exercise 2.14: Implementing a Pairplot
      5. Relationship between a Continuous and a Categorical Variable
      6. Exercise 2.15: Plotting a Bar Chart
      7. Exercise 2.16: Visualizing a Box Plot
      8. Relationship Between Two Categorical Variables
      9. Exercise 2.17: Plotting a Stacked Bar Chart
      10. Activity 2.03: Relationships within the Data
    7. Summary
  4. 3. Linear Regression
    1. Introduction
    2. Regression and Classification Problems
      1. The Machine Learning Workflow
        1. Business Understanding
        2. Data Understanding
        3. Data Preparation
        4. Modeling
        5. Evaluation
        6. Deployment
      2. Exercise 3.01: Plotting Data with a Moving Average
      3. Activity 3.01: Plotting Data with a Moving Average
    3. Linear Regression
      1. Least Squares Method
      2. The Scikit-Learn Model API
      3. Exercise 3.02: Fitting a Linear Model Using the Least Squares Method
      4. Activity 3.02: Linear Regression Using the Least Squares Method
      5. Linear Regression with Categorical Variables
      6. Exercise 3.03: Introducing Dummy Variables
      7. Activity 3.03: Dummy Variables
      8. Polynomial Models with Linear Regression
      9. Exercise 3.04: Polynomial Models with Linear Regression
      10. Activity 3.04: Feature Engineering with Linear Regression
      11. Generic Model Training
      12. Gradient Descent
      13. Exercise 3.05: Linear Regression with Gradient Descent
      14. Exercise 3.06: Optimizing Gradient Descent
      15. Activity 3.05: Gradient Descent
    4. Multiple Linear Regression
      1. Exercise 3.07: Multiple Linear Regression
    5. Summary
  5. 4. Autoregression
    1. Introduction
    2. Autoregression Models
      1. Exercise 4.01: Creating an Autoregression Model
      2. Activity 4.01: Autoregression Model Based on Periodic Data
    3. Summary
  6. 5. Classification Techniques
    1. Introduction
    2. Ordinary Least Squares as a Classifier
      1. Exercise 5.01: Ordinary Least Squares as a Classifier
    3. Logistic Regression
      1. Exercise 5.02: Logistic Regression as a Classifier – Binary Classifier
      2. Exercise 5.03: Logistic Regression – Multiclass Classifier
      3. Activity 5.01: Ordinary Least Squares Classifier – Binary Classifier
        1. Select K Best Feature Selection
      4. Exercise 5.04: Breast Cancer Diagnosis Classification Using Logistic Regression
    4. Classification Using K-Nearest Neighbors
      1. Exercise 5.05: KNN Classification
      2. Exercise 5.06: Visualizing KNN Boundaries
      3. Activity 5.02: KNN Multiclass Classifier
    5. Classification Using Decision Trees
      1. Exercise 5.07: ID3 Classification
        1. Classification and Regression Tree
      2. Exercise 5.08: Breast Cancer Diagnosis Classification Using a CART Decision Tree
      3. Activity 5.03: Binary Classification Using a CART Decision Tree
    6. Artificial Neural Networks
      1. Exercise 5.09: Neural Networks – Multiclass Classifier
      2. Activity 5.04: Breast Cancer Diagnosis Classification Using Artificial Neural Networks
    7. Summary
  7. 6. Ensemble Modeling
    1. Introduction
    2. One-Hot Encoding
      1. Exercise 6.01: Importing Modules and Preparing the Dataset
    3. Overfitting and Underfitting
      1. Underfitting
      2. Overfitting
      3. Overcoming the Problem of Underfitting and Overfitting
    4. Bagging
    5. Bootstrapping
      1. Exercise 6.02: Using the Bagging Classifier
      2. Random Forest
      3. Exercise 6.03: Building the Ensemble Model Using Random Forest
    6. Boosting
      1. Adaptive Boosting
      2. Exercise 6.04: Implementing Adaptive Boosting
      3. Gradient Boosting
      4. Exercise 6.05: Implementing GradientBoostingClassifier to Build an Ensemble Model
    7. Stacking
      1. Exercise 6.06: Building a Stacked Model
      2. Activity 6.01: Stacking with Standalone and Ensemble Algorithms
    8. Summary
  8. 7. Model Evaluation
    1. Introduction
    2. Importing the Modules and Preparing Our Dataset
    3. Evaluation Metrics
      1. Regression Metrics
      2. Exercise 7.01: Calculating Regression Metrics
      3. Classification Metrics
        1. Numerical Metrics
        2. Curve Plots
      4. Exercise 7.02: Calculating Classification Metrics
    4. Splitting a Dataset
      1. Hold-Out Data
      2. K-Fold Cross-Validation
      3. Sampling
      4. Exercise 7.03: Performing K-Fold Cross-Validation with Stratified Sampling
    5. Performance Improvement Tactics
      1. Variation in Train and Test Errors
        1. Learning Curve
        2. Validation Curve
      2. Hyperparameter Tuning
      3. Exercise 7.04: Hyperparameter Tuning with Random Search
      4. Feature Importance
      5. Exercise 7.05: Feature Importance Using Random Forest
      6. Activity 7.01: Final Test Project
    6. Summary
  9. Appendix
    1. 1. Fundamentals of Supervised Learning Algorithms
      1. Activity 1.01: Implementing Pandas Functions
    2. 2. Exploratory Data Analysis and Visualization
      1. Activity 2.01: Summary Statistics and Missing Values
      2. Activity 2.02: Representing the Distribution of Values Visually
      3. Activity 2.03: Relationships within the Data
    3. 3. Linear Regression
      1. Activity 3.01: Plotting Data with a Moving Average
      2. Activity 3.02: Linear Regression Using the Least Squares Method
      3. Activity 3.03: Dummy Variables
      4. Activity 3.04: Feature Engineering with Linear Regression
      5. Activity 3.05: Gradient Descent
    4. 4. Autoregression
      1. Activity 4.01: Autoregression Model Based on Periodic Data
    5. 5. Classification Techniques
      1. Activity 5.01: Ordinary Least Squares Classifier – Binary Classifier
      2. Activity 5.02: KNN Multiclass Classifier
      3. Activity 5.03: Binary Classification Using a CART Decision Tree
      4. Activity 5.04: Breast Cancer Diagnosis Classification Using Artificial Neural Networks
    6. 6. Ensemble Modeling
      1. Activity 6.01: Stacking with Standalone and Ensemble Algorithms
    7. 7. Model Evaluation
      1. Activity 7.01: Final Test Project