The book starts with an overview of the deep learning (DL) life cycle and the emerging Machine Learning Ops (MLOps) field, providing a clear picture of the four pillars of deep learning (data, model, code, and explainability) and the role of MLflow in these areas.
From there onward, it guides you step by step in understanding the concept of MLflow experiments and usage patterns, using MLflow as a unified framework to track DL data, code and pipelines, models, parameters, and metrics at scale. You’ll also tackle running DL pipelines in a distributed execution environment with reproducibility and provenance tracking, and tuning DL models through hyperparameter optimization (HPO) with Ray Tune, Optuna, and HyperBand. As you progress, you’ll learn how to build a multi-step DL inference pipeline with preprocessing and postprocessing steps, deploy a DL inference pipeline for production using Ray Serve and AWS SageMaker, and finally create a DL explanation as a service (EaaS) using the popular Shapley Additive Explanations (SHAP) toolbox.
By the end of this book, you’ll have built the foundation and gained the hands-on experience you need to develop a DL pipeline solution from initial offline experimentation to final deployment and production, all within a reproducible and open source framework.
Page count: 323
Year of publication: 2022
Bridge the gap between offline experimentation and online production
Yong Liu
BIRMINGHAM—MUMBAI
Copyright © 2022 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Publishing Product Manager: Dhruv Jagdish Kataria
Senior Editor: Tazeen Shaikh
Content Development Editor: Manikandan Kurup
Technical Editor: Devanshi Ayare
Copy Editor: Safis Editing
Project Coordinator: Farheen Fathima
Proofreader: Safis Editing
Indexer: Rekha Nair
Production Designer: Jyoti Chauhan
Marketing Coordinators: Shifa Ansari and Abeer Riyaz Dawe
First published: July 2022
Production reference: 2200722
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80324-133-3
www.packt.com
To my father and the memory of my mother for their sacrificial love, prayers, and life-long support.
– Yong Liu
I am thrilled to introduce this book, Practical Deep Learning at Scale with MLflow, by Dr. Yong Liu. Deep learning has been revolutionizing many areas of computing in the past decade, but good resources for using it in production applications remain scarce. At the same time, practitioners have realized that designing machine learning (ML) applications to be operable, maintainable, and updateable is one of the hardest parts of using ML in production, leading to the new field of MLOps. Dr. Liu tackles these issues head-on by showing you how to build robust and maintainable deep learning applications using MLflow, a widely used open source MLOps framework, and multiple state-of-the-art methods and software tools.
Dr. Liu brings a wealth of experience in production machine learning that shines through in every chapter of the book. He has been working in large-scale computing since his Ph.D., has built large-scale production ML applications at Microsoft, Maana, and Outreach, and has published multiple research papers on deep learning. This means that each chapter recommends practical approaches that have worked in multiple organizations. Dr. Liu also presents all his material clearly, explains the tradeoffs in each decision, illustrates all the ideas through runnable code, and surveys multiple open source and commercial tools for each task.
As one of the original creators of MLflow, I was very excited that Dr. Liu chose MLflow as the MLOps framework for this book. When we started MLflow in 2018, there was no widely used open-source MLOps framework, so we designed a highly extensible framework that can be integrated with a wide variety of other tools and services and customized to each organization’s workflow. We’ve been thrilled with the fast growth of the MLflow open source community since then and with the powerful integrations that the community has contributed to libraries including PyTorch, SHAP, Delta Lake, and others. Dr. Liu’s team was one of the early users of MLflow, so he is an expert on how to use the framework in practice. I hope that you enjoy learning from his experience and building groundbreaking applications using the latest techniques in deep learning.
Dr. Matei Zaharia
Chief Technologist, Databricks, and Co-Creator of MLflow
Yong Liu has been working in big data science, machine learning, and optimization since his doctoral student years at the University of Illinois at Urbana-Champaign (UIUC) and later as a senior research scientist and principal investigator at the National Center for Supercomputing Applications (NCSA), where he led data science R&D projects funded by the National Science Foundation and Microsoft Research. He then joined Microsoft and AI/ML start-ups in the industry. He has shipped ML and DL models to production and has been a speaker at the Spark/Data+AI summit and NLP summit. He has recently published peer-reviewed papers on deep learning, linked data, and knowledge-infused learning at various ACM/IEEE conferences and journals.
I want to thank my wife and my two teenage kids for their support and encouragement during the writing of this book. I am also grateful to the collaborators, team members, and mentors at Outreach Corporation from whom I have learned a lot.
Dr. Pavel Dmitriev received a B.S. degree in applied mathematics from Moscow State University in 2002, and a Ph.D. degree in computer science from Cornell University in 2008. He previously worked as an engineer and a data scientist at Yahoo and Microsoft. He is currently a vice president of data science at Outreach where he works on enabling data-driven decision-making in sales through machine learning and experimentation. Pavel's research was presented at a number of international conferences such as KDD, ICSE, WWW, CIKM, BigData, and SEAA. A certified yoga and meditation instructor, he actively works on improving physical and mental well-being in corporations through classes and workshops.
Hong Yung (Joey) Yip is a Ph.D. candidate in computer science at the Artificial Intelligence Institute (AIISC), University of South Carolina. His research interests are in the area of knowledge-infused learning, which intertwines AI and knowledge graphs to enhance neural networks in performance, interpretability, and explainability for dynamic and real-time domains. He has co-authored and published at top venues (WWW, ISWC, and IEEE). He has previously interned at the National Library of Medicine, Bethesda MD, on developing scalable approaches for biomedical vocabulary alignment, and with Outreach Corporation, Seattle WA, on conceptualizing a Sales Engagement Graph framework for temporal pattern discovery and contextual understanding in sales processes.
In this section, we will learn about the five stages of the full life cycle of deep learning (DL), and understand the emerging field of machine learning operations (MLOps) and the role of MLflow. We will provide an overview of the challenges in the four pillars of a DL process: data, model, code, and explainability. Then, we will learn how to set up a basic local MLflow development environment and run our first MLflow experiment for a natural language processing (NLP) model built on top of PyTorch Lightning Flash. Finally, we will use this first MLflow experiment to explain foundational MLflow concepts such as experiments and runs, among others.
This section comprises the following chapters:
Chapter 1, Deep Learning Life Cycle and MLOps Challenges
Chapter 2, Getting Started with MLflow for Deep Learning