With the huge amount of data being generated over the internet and the benefits that Machine Learning (ML) predictions bring to businesses, ML implementation has become a low-hanging fruit that everyone is striving for. The complex mathematics behind it, however, can be discouraging for a lot of users. This is where H2O comes in – it automates various repetitive steps, and this encapsulation helps developers focus on results rather than handling complexities.
You’ll begin by understanding how H2O’s AutoML simplifies the implementation of ML by providing a simple, easy-to-use interface to train and use ML models. Next, you’ll see how AutoML automates the entire process of training multiple models, optimizing their hyperparameters, as well as explaining their performance. As you advance, you’ll find out how to leverage a Plain Old Java Object (POJO) and Model Object, Optimized (MOJO) to deploy your models to production. Throughout this book, you’ll take a hands-on approach to implementation using H2O that’ll enable you to set up your ML systems in no time.
By the end of this H2O book, you’ll be able to train and use your ML models using H2O AutoML, right from experimentation all the way to production without a single need to understand complex statistics or data science.
Page count: 477
Year of publication: 2022
Discover the power of automated machine learning, from experimentation through to deployment to production
Salil Ajgaonkar
BIRMINGHAM—MUMBAI
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Publishing Product Manager: Dhruv Jagdish Kataria
Senior Editor: Nathanya Dias
Technical Editor: Devanshi Ayare
Copy Editor: Safis Editing
Project Coordinator: Farheen Fathima
Proofreader: Safis Editing
Indexer: Subalakshmi Govindhan
Production Designer: Ponraj Dhandapani
Marketing Coordinators: Shifa Ansari, Abeer Riyaz Dawe
First published: September 2022
Production reference: 1140922
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80107-452-0
www.packt.com
Salil Ajgaonkar is a software engineer experienced in building and scaling cloud-based microservices and productizing machine learning models. His background includes work in transaction systems, artificial intelligence, and cyber security. He is passionate about solving complex scaling problems, building machine learning pipelines, and data engineering. Salil earned his degree in IT from Xavier Institute of Engineering, Mumbai, India, in 2015 and later earned his master’s degree in computer science from Trinity College Dublin, Ireland, in 2018, specializing in future networked systems. His work history includes the likes of BookMyShow, Genesys, and Vectra AI.
I would like to thank my lovely wife, Oshin, for supporting me and making sure I gave my best effort to writing this book. Also, thanks to my parents, who taught me to always say “yes” to all opportunities that come my way.
I would also like to thank the Packt team for giving me the opportunity to write this book and play my part in giving back to the programming community. Special thanks to Dhruv for coordinating the publication effort, Kirti for scheduling and keeping things on track over the year, and Nathanya and the editorial team for ensuring that the book is of the highest quality.
Last but not least, special thanks to Dr. Emir Muñoz for his valuable insights into the technical aspects of the book.
Emir Muñoz is a senior manager of machine learning at Genesys, where he works on projects to enhance customer experience using artificial intelligence, machine learning, and data science. He has experience in academia and industry, which he uses to leverage emerging technologies and algorithms to deliver innovative solutions. Currently, he leads a team that mines contact center data to train machine learning models to optimize contact center routing.
Emir holds a PhD in computer science with a specialization in machine learning. He also received a BEng in informatics and an MSc in computer engineering. He is the author of several papers and patents on the topics of semantic web, machine learning, knowledge graphs, and contact center analytics.
The objective of this part is to help you implement an easy, bare-bones demo of how to install, set up, and use H2O AutoML, opening up further exploration of and experimentation with the technology.
This section comprises the following chapters:
Chapter 1, Understanding H2O AutoML Basics
Chapter 2, Working with H2O Flow (H2O’s Web UI)
Machine Learning (ML) is more than just code. It involves tons of observations from different perspectives. As powerful as actual coding is, a lot of information gets hidden away behind the Terminal on which you code. Humans have always understood pictures more easily than words. Similarly, as complex as ML is, it can be very easy and fun to implement with the help of interactive User Interfaces (UIs). Working with a colorful UI over the dull black and white pixelated Terminal is always a plus when learning about difficult topics.
H2O Flow is a web-based UI developed by the H2O.ai team. This interface works on the same backend that we learned about in Chapter 1, Understanding H2O AutoML Basics. It is simply a web UI wrapped over the main H2O library: it passes inputs to the backend server, triggers functions there, and displays the results back to the user.
In this chapter, we will learn how to work with H2O Flow. We will perform all the typical steps of the ML pipeline, which we learned about in the Understanding AutoML and H2O AutoML section of Chapter 1, Understanding H2O AutoML Basics, from reading datasets to making predictions using the trained models. Also, we will explore a few metrics and model details to help us ease into more advanced topics later. This chapter is hands-on, and we will learn about the various parts of H2O Flow as we create our ML pipeline.
By the end of this chapter, you will be able to navigate and use the various features of H2O Flow. Additionally, you will be able to train your ML models and use them for predictions without needing to write a single line of code using H2O Flow.
In this chapter, we are going to cover the following topics:
Understanding the basics of H2O Flow
Working with data functions in H2O Flow
Working with model training functions in H2O Flow
Working with prediction functions in H2O Flow
You will require the following:
The latest version of a modern web browser (Chrome, Firefox, or Edge).
H2O Flow is an open source web interface that helps users execute code, plot graphs, and display dataframes on a single page called a Flow notebook or just Flow.
Users of Jupyter notebooks will find H2O Flow very similar. You write your executable code in cells, and the output of the code is displayed below it when you execute the cell. Then, the cursor moves on to the next cell. The best thing about a Flow is that it can be easily saved, exported, and imported between various users. This helps a lot of data scientists share results among various stakeholders, as they just need to save the execution results and share the flow.
In the following sub-sections, we will gain an understanding of the basics of H2O Flow. Let’s begin our journey with H2O Flow by, first, downloading it to our system.
To run H2O Flow, you first need to download the H2O ZIP file, which contains the H2O Java Archive (JAR) file, onto your system, and then run the JAR file once the archive has been unzipped.
You can download and launch H2O Flow using the following steps:
You can download H2O Flow at https://h2o-release.s3.amazonaws.com/h2o/master/latest.html.
Once the ZIP file has been downloaded, open a Terminal and run the following command in the folder where you downloaded the ZIP file:
unzip {name_of_the_h2o_zip_file}
To run H2O Flow, run the following command inside the folder created by unzipping the file:
java -jar h2o.jar
This will start an H2O Web UI on http://localhost:54321.
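The launch steps above can be wrapped in a few small shell helpers. This is a minimal sketch, not part of H2O itself: the function names (`flow_url`, `have_java`, `wait_for_flow`) and the `H2O_PORT` variable are hypothetical. It also assumes that `curl` is installed and that the server answers plain HTTP requests at its root URL once it has started.

```shell
# Sketch only: helpers around the unzip-and-launch steps described above.
# H2O_PORT and all function names are illustrative assumptions, not H2O features.

H2O_PORT="${H2O_PORT:-54321}"   # default port mentioned in the text

# Print the URL where the Flow UI is expected to be served.
# Optional argument: a port that overrides H2O_PORT.
flow_url() {
  echo "http://localhost:${1:-$H2O_PORT}"
}

# Succeed if a Java runtime is on the PATH (needed to run the JAR file).
have_java() {
  command -v java >/dev/null 2>&1
}

# Poll a URL once per second until it answers or the attempts run out.
# Arguments: URL, number of attempts (defaults to 30).
# Returns 0 as soon as a request succeeds, 1 otherwise.
wait_for_flow() {
  wf_url="$1"
  wf_tries="${2:-30}"
  wf_i=0
  while [ "$wf_i" -lt "$wf_tries" ]; do
    if curl --silent --fail --output /dev/null "$wf_url"; then
      return 0
    fi
    wf_i=$((wf_i + 1))
    sleep 1
  done
  return 1
}

# Typical use, from inside the unzipped folder:
#   have_java || { echo "Install a Java runtime first" >&2; exit 1; }
#   java -jar h2o.jar &
#   wait_for_flow "$(flow_url)" && echo "Flow is up at $(flow_url)"
```

Starting the server in the background and polling it means you do not have to guess when the UI is ready before opening the browser. If the port differs on your machine, pass it to `flow_url` or set `H2O_PORT` before sourcing the helpers.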
Now that we have downloaded and launched H2O Flow, let’s briefly explore it to get an understanding of what functionalities it has to offer.
H2O Flow is a very feature-intensive