Preface

With the huge amount of data being generated over the internet and the benefits that Machine Learning (ML) predictions bring to businesses, implementing ML has become low-hanging fruit that everyone is reaching for. The complex mathematics behind it, however, can discourage many users. This is where H2O comes in – it automates various repetitive steps, and this encapsulation lets developers focus on results rather than on handling complexity.

You’ll begin by understanding how H2O’s AutoML simplifies the implementation of ML by providing a simple, easy-to-use interface to train and use ML models. Next, you’ll see how AutoML automates the entire process of training multiple models, optimizing their hyperparameters, as well as explaining their performance. As you advance, you’ll find out how to leverage a Plain Old Java Object (POJO) and Model Object, Optimized (MOJO) to deploy your models to production. Throughout this book, you’ll take a hands-on approach to implementation using H2O that’ll enable you to set up your ML systems in no time.

By the end of this H2O book, you’ll be able to train and use your ML models using H2O AutoML, from experimentation all the way to production, without needing to understand complex statistics or data science.

Who this book is for

This book is for engineers and data scientists who want to quickly adopt ML into their products without worrying about the internal intricacies of training ML models.

If you are someone who wants to incorporate ML into your software system but doesn’t know where to start or doesn’t have much expertise in the domain of ML, then you will find this book useful.

What this book covers

Chapter 1, Understanding H2O AutoML Basics, introduces H2O AutoML, the AutoML technology by H2O.ai, and implements a basic setup of the technology to see it in action.

Chapter 2, Working with H2O Flow (H2O’s Web UI), explores H2O’s Web UI called H2O Flow and shows how we can set up our H2O AutoML system using the Web UI without writing a single line of code.

Chapter 3, Understanding Data Processing, explores some of the common data processing functionalities that we can perform using H2O’s built-in dataframe manipulation operations.

Chapter 4, Understanding H2O AutoML Training and Architecture, dives deep into the high-level architecture of H2O technology and explains how H2O AutoML trains all the models and optimizes their hyperparameters.

Chapter 5, Understanding AutoML Algorithms, explores the ML algorithms that H2O AutoML uses to train its models.

Chapter 6, Understanding H2O AutoML Leaderboard and Other Performance Metrics, explores the different performance metrics that are used in the AutoML Leaderboard as well as some additional metrics that are important for users to know.

Chapter 7, Working with Model Explainability, explores the H2O explainability interface and helps us to understand the various explainability features that we get as outputs.

Chapter 8, Exploring Optional Parameters for H2O AutoML, looks at some of the optional parameters that are available to us when configuring our AutoML training and shows how we can use them.

Chapter 9, Exploring Miscellaneous Features in H2O AutoML, explores two unique features of H2O AutoML. The first one is H2O AutoML’s compatibility with the scikit-learn library and the second one is H2O AutoML’s built-in logging system for debugging AutoML training issues.

Chapter 10, Working with Plain Old Java Objects (POJOs), covers model POJOs and how we can extract and use them to make predictions in production environments.

Chapter 11, Working with Model Object, Optimized (MOJO), covers model MOJOs, how they are different from model POJOs, how to view them, and how we can extract and use them to make predictions in production environments.

Chapter 12, Working with H2O AutoML and Apache Spark, explores in detail how H2O AutoML can be used along with Apache Spark using H2O Sparkling Water.

Chapter 13, Using H2O AutoML with Other Technologies, explores how we can use H2O models in collaboration with other commonly used technologies in the ML domain, such as Spring Boot Web applications and Apache Storm.

To get the most out of this book

Basic knowledge of statistics and programming is beneficial. Some understanding of ML and Python will be helpful. You will need either Python (preferably version 3.7 or above) or R (version 4.0 or above) installed on your computer. All code examples have been tested using Python 3.10 and R 4.1.2 on Windows 10 OS and Ubuntu 22.04.1 LTS. They should also work with future releases.

Software/hardware covered in the book: Python 3.10, R 4.1.2, H2O 3.36.1.4, Java 11, Spark 3.2, Scala 2.13, and Maven 3.8.6

Operating system requirements: Windows, macOS, or Linux

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Practical-Automated-Machine-Learning-on-H2O. If there’s an update to the code, it will be updated in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://packt.link/IighZ.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “The only dependency on using POJO models is the h2o-genmodel.jar file.”

A block of code is set as follows:

import h2o
h2o.init()

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

data_frame = h2o.import_file("Dataset/iris.data")

Any command-line input or output is written as follows:

mkdir H2O_POJO

cd H2O_POJO

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “You can simply click the Download POJO button to download the model as a POJO.”

Tips or Important Notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you’ve read Practical Automated Machine Learning using H2O.ai, we’d love to hear your thoughts! Please visit the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
