
Deliver huge improvements to your machine learning pipelines without spending hours fine-tuning parameters! This book’s practical case studies reveal feature engineering techniques that upgrade your data wrangling—and your ML results.

In Feature Engineering Bookcamp you will learn how to:

  • Identify and implement feature transformations for your data
  • Build powerful machine learning pipelines with unstructured data like text and images
  • Quantify and minimize bias in machine learning pipelines at the data level
  • Use feature stores to build real-time feature engineering pipelines
  • Enhance existing machine learning pipelines by manipulating the input data
  • Use state-of-the-art deep learning models to extract hidden patterns in data

Feature Engineering Bookcamp guides you through a collection of projects that give you hands-on practice with core feature engineering techniques. You’ll work with feature engineering practices that reduce the time it takes to process data and deliver real improvements in your model’s performance. This instantly useful book skips the abstract mathematical theory and minutely detailed formulas; instead, you’ll learn through interesting code-driven case studies, including tweet classification, COVID detection, recidivism prediction, stock price movement detection, and more.

About the Technology
Get better output from machine learning pipelines by improving your training data! Use feature engineering, a machine learning technique for designing relevant input variables based on your existing data, to simplify training and enhance model performance. While fine-tuning hyperparameters or tweaking models may give you a minor performance bump, feature engineering delivers dramatic improvements by transforming your data pipeline.
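
As a rough illustration of what "designing relevant input variables based on your existing data" can look like, here is a minimal Python sketch. It is not taken from the book: the record layout, the field names (`timestamp`, `text`), and the `engineer_features` helper are all hypothetical, chosen only to show raw values being turned into model-ready features.

```python
from datetime import datetime


def engineer_features(record):
    """Derive new input variables from one raw record.

    The record schema ("timestamp", "text") is an assumption made for
    this sketch, not a format used by the book.
    """
    ts = datetime.fromisoformat(record["timestamp"])
    text = record["text"]
    return {
        # Time-based features extracted from a single raw timestamp
        "hour_of_day": ts.hour,
        "is_weekend": ts.weekday() >= 5,  # Monday is 0, so 5 and 6 are Sat/Sun
        # Simple text-based features extracted from free-form text
        "word_count": len(text.split()),
        "has_exclamation": "!" in text,
    }


# Hypothetical raw record, for illustration only
raw = {"timestamp": "2022-01-15T09:30:00", "text": "Feature engineering works!"}
features = engineer_features(raw)
print(features)
```

None of the derived values add new information to the data; they only expose patterns (time of day, weekend activity, text length) in a form a model can use directly.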

About the Book
Feature Engineering Bookcamp walks you through six hands-on projects where you’ll learn to upgrade your training data using feature engineering. Each chapter explores a new code-driven case study, taken from real-world industries like finance and healthcare. You’ll practice cleaning and transforming data, mitigating bias, and more. The book is full of performance-enhancing tips for all major ML subdomains—from natural language processing to time-series analysis.

What's Inside
  • Identify and implement feature transformations
  • Build machine learning pipelines with unstructured data
  • Quantify and minimize bias in ML pipelines
  • Use feature stores to build real-time feature engineering pipelines
  • Enhance existing pipelines by manipulating input data


About the Reader
For experienced machine learning engineers familiar with Python.

About the Author
Sinan Ozdemir is the founder and CTO of Shiba, a former lecturer of Data Science at Johns Hopkins University, and the author of multiple textbooks on data science and machine learning.

Quotes
An excellent resource for the often-undervalued task of feature engineering. It demonstrates the ‘when, why, and how’ of techniques you can use to improve your models.
- Josh McAdams, Google

The most overlooked step in a data science pipeline, simplified!
- Shaksham Kapoor, UBS

Make major improvements in your models by improving the features.
- Harveen Singh Chadha, Thoughtworks

Detailed and practical.
- Satej Kumar Sahu, Honeywell

The practical case studies reveal core feature engineering techniques that improve process and ML workflow.
- Maria Ana, CG

Table of Contents

  • Feature Engineering Bookcamp
  • Copyright
  • contents
  • front matter
  • 1 Introduction to feature engineering
  • 2 The basics of feature engineering
  • 3 Healthcare: Diagnosing COVID-19
  • 4 Bias and fairness: Modeling recidivism
  • 5 Natural language processing: Classifying social media sentiment
  • 6 Computer vision: Object recognition
  • 7 Time series analysis: Day trading with machine learning
  • 8 Feature stores
  • 9 Putting it all together
  • index