
Book Description

Companies are spending billions on machine learning projects, but it’s money wasted if the models can’t be deployed effectively. In this practical guide, Hannes Hapke and Catherine Nelson walk you through the steps of automating a machine learning pipeline using the TensorFlow ecosystem. You’ll learn the techniques and tools that will cut deployment time from days to minutes, so that you can focus on developing new models rather than maintaining legacy systems.

Data scientists, machine learning engineers, and DevOps engineers will discover how to go beyond model development to successfully productize their data science projects, while managers will better understand the role they play in helping to accelerate these projects. The book also explores new approaches for integrating data privacy into machine learning pipelines.

  • Understand the machine learning management lifecycle
  • Implement data pipelines with Apache Airflow and Kubeflow Pipelines
  • Work with data using TensorFlow tools like ML Metadata, TensorFlow Data Validation, and TensorFlow Transform
  • Analyze models with TensorFlow Model Analysis and ship them with the TFX Pusher component once the TFX ModelValidator component has confirmed that the analysis results are an improvement
  • Deploy models in a variety of environments with TensorFlow Serving, TensorFlow Lite, and TensorFlow.js
  • Learn methods for adding privacy, including differential privacy with TensorFlow Privacy and federated learning with TensorFlow Federated
  • Design model feedback loops to increase your data sets and learn when to update your machine learning models


Table of Contents

  1. 1. Introduction
    1. What are Machine Learning Pipelines?
    2. Overview of Machine Learning Pipelines
      1. Experiment Tracking
      2. Data Versioning
      3. Data Validation
      4. Data Preprocessing
      5. Model Training and Tuning
      6. Model Analysis
      7. Model Versioning
      8. Model Deployment
      9. Feedback Loops
      10. Data Privacy
    3. Why Machine Learning Pipelines?
    4. The Business Case for Automated Machine Learning Pipelines
    5. Overview of the Chapters
    6. Our Example Project
      1. Downloading the Dataset
      2. Our Machine Learning Model
    7. Who is this book for?
    8. Summary
  2. 2. Pipeline Orchestration
    1. Why Pipeline Orchestration
    2. Directed Acyclic Graphs
    3. Machine Learning Pipelines with Apache Beam
      1. Setup
      2. Basic Pipeline
      3. Executing your Basic Pipeline
      4. Orchestrating TensorFlow Extended Pipelines with Apache Beam
    4. Machine Learning Pipelines with Apache Airflow
      1. Setup
      2. Basic Pipeline
      3. Orchestrating TensorFlow Extended Pipelines with Apache Airflow
    5. Machine Learning Pipelines with Kubeflow Pipelines
      1. Installation & Setup
      2. Orchestrating TensorFlow Extended Pipelines with Kubeflow Pipelines
    6. Which Orchestration Tool to Choose?
    7. Summary
  3. 3. Data Validation with TensorFlow
    1. Why Data Validation?
    2. TensorFlow Data Validation
      1. Installation
      2. Generating Statistics from your Data
      3. Generating Schema from your Data
      4. Comparing Data Sets
      5. Data Skew Detection
      6. Data Drift Detection
    3. Processing Large Data Sets with Google Cloud Platform
    4. Integrate TensorFlow Data Validation into your Machine Learning Pipeline
    5. Summary
  4. 4. Model Deployment with TensorFlow Serving
    1. A Simple Model Server
      1. Why it isn’t Recommended
    2. TensorFlow Serving
    3. TensorFlow Architecture Overview
    4. Exporting Models for TensorFlow Serving
    5. Model Signatures
    6. Inspecting Exported Models
      1. Inspecting the Model
      2. Testing the Model
    7. Setting up TensorFlow Serving
      1. Docker Installation
      2. Native Ubuntu Installation
      3. Building TensorFlow Serving from Source
    8. Configure a TensorFlow Server
      1. Single Model Configuration
      2. Multi Model Configuration
    9. REST vs gRPC
      1. Representational State Transfer
      2. Google Remote Procedure Calls
    10. Making predictions from the Model Server
      1. Getting model predictions via REST
      2. Inferring TensorFlow Serving via gRPC
    11. Model A/B Testing with TensorFlow Serving
    12. Requesting Model Meta Data from the Model Server
      1. REST Requests for Model Meta Data
      2. gRPC Requests for Model Meta Data
    13. Batching Inference Requests
      1. Configure Batch Predictions
    14. Other TensorFlow Serving Optimizations
    15. TensorFlow Serving Alternatives
      1. Seldon
      2. GraphPipe
      3. Simple TensorFlow Serving
      4. MLflow
    16. Deploying with Cloud Providers
      1. Use Cases
      2. Example Deployment with Google Cloud Platform
    17. Summary
  5. 5. Feedback Loops
    1. Introduction to feedback loops
      1. Explicit and implicit feedback
      2. The data flywheel
      3. Feedback loops in the real world
    2. Design patterns for collecting feedback
      1. Users take some action as a result of the prediction
      2. Users rate the quality of the prediction
      3. Users correct the prediction
      4. Crowdsource the annotations
      5. Expert annotations
      6. Feedback is produced automatically by the system
    3. How to track feedback loops
      1. Tracking explicit feedback
      2. Tracking implicit feedback
    4. Summary