
Machine Learning Model Serving Patterns and Best Practices (E-Book)

Md Johirul Islam

Description

Serving patterns enable data science and ML teams to bring their models to production. Most ML models are never deployed to consumers, so ML engineers need to know the critical steps involved in serving an ML model.
This book will cover the whole process, from basic concepts such as stateful and stateless serving to the advantages and challenges of each. Batch, real-time, and continuous model serving techniques will also be covered in detail. Later chapters will give detailed examples of keyed prediction techniques and ensemble patterns. Valuable associated technologies such as TensorFlow Serving, BentoML, and Ray Serve will also be discussed, making sure that you have a good understanding of the most important methods and techniques in model serving. You'll then cover topics such as monitoring and performance optimization, as well as strategies for managing model drift and handling updates and versioning. The book provides practical guidance and best practices for ensuring that your model serving pipeline is robust, scalable, and reliable. Additionally, it explores the use of cloud-based platforms and services for model serving with AWS SageMaker, with the help of detailed examples.
By the end of this book, you'll be able to save and serve your model using state-of-the-art techniques.

You can read this e-book in Legimi apps or in any app that supports the following formats:

EPUB
MOBI

Page count: 419

Year of publication: 2022




Machine Learning Model Serving Patterns and Best Practices

A definitive guide to deploying, monitoring, and providing accessibility to ML models in production

Md Johirul Islam

BIRMINGHAM—MUMBAI

Machine Learning Model Serving Patterns and Best Practices

Copyright © 2022 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Ali Abidi

Publishing Product Manager: Ali Abidi

Senior Editor: Tiksha Lad

Technical Editor: Rahul Limbachiya

Copy Editor: Safis Editing

Project Coordinator: Aparna Ravikumar Nair

Proofreader: Safis Editing

Indexer: Pratik Shirodkar

Production Designer: Shyam Sundar Korumilli

Marketing Coordinator: Shifa Ansari

First published: December 2022

Production reference: 3060223

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80324-990-2

www.packtpub.com

Contributors

About the author

Md Johirul Islam is a data scientist and engineer at Amazon. He has a PhD in computer science and is also an adjunct professor at Purdue University. His expertise is focused on designing explainable, maintainable, and robust data science pipelines, applying software design principles, and helping organizations deploy machine learning models into production at scale.

About the reviewer

Quan V. Dang is a machine learning engineer with experience in various domains, including finance, e-commerce, and logistics. He started his professional career as a researcher at the University of Aizu, where he mainly worked on classical machine learning and evolutionary algorithms. After graduating, he shifted his focus to deploying AI products and managing machine learning infrastructure. He is also the founder of the MLOps VN community, in which over 5,000 people discuss MLOps and machine learning engineering. In his free time, he writes technical blogs, hangs out, and travels.

Table of Contents

Preface

Part 1: Introduction to Model Serving

1

Introducing Model Serving

Technical requirements

What is serving?

What are models?

What is model serving?

Understanding the importance of model serving

Using existing tools to serve models

Summary

2

Introducing Model Serving Patterns

Design patterns in software engineering

Understanding the value of model serving patterns

ML serving patterns

Serving philosophy patterns

Patterns of serving approaches

Summary

Further reading

Part 2: Patterns and Best Practices of Model Serving

3

Stateless Model Serving

Technical requirements

Understanding stateful and stateless functions

Stateless functions

Stateful functions

Extracting states from stateful functions

Using stateful functions

States in machine learning models

Using input data as states

Mitigating the impact of states from the ML model

Summary

4

Continuous Model Evaluation

Technical requirements

Introducing continuous model evaluation

What to monitor in model evaluation

Challenges of continuous model evaluation

The necessity of continuous model evaluation

Monitoring errors

Deciding on retraining

Enhancing serving resources

Understanding business impact

Common metrics for training and monitoring

Continuous model evaluation use cases

Evaluating a model continuously

Collecting the ground truth

Plotting metrics on a dashboard

Selecting the threshold

Setting a notification for performance drops

Monitoring model performance when predicting rare classes

Summary

Further reading

5

Keyed Prediction

Technical requirements

Introducing keyed prediction

Exploring keyed prediction use cases

Multi-threaded programming

Multiple instances of the model running asynchronously

Why the keyed prediction model is needed

Exploring techniques for keyed prediction

Passing keys with features from the clients

Removing keys before the prediction

Tagging predictions with keys

Creating keys

Summary

Further reading

6

Batch Model Serving

Technical requirements

Introducing batch model serving

What is batch model serving?

Different types of batch model serving

Manual triggers

Automatic periodic triggers

Using continuous model evaluation to retrain

Serving for offline inference

Serving for on-demand inference

Example scenarios of batch model serving

Case 1 – recommendation

Case 2 – sentiment analysis

Techniques in batch model serving

Setting up a periodic batch update

Storing the predictions in a persistent store

Pulling predictions by the server application

Limitations of batch serving

Summary

Further reading

7

Online Learning Model Serving

Technical requirements

Introducing online model serving

Serving requests

Use cases for online model serving

Case 1 – recommending the nearest emergency center during a pandemic

Case 2 – predicting the favorite soccer team in a tournament

Case 3 – predicting the path of a hurricane or storm

Case 4 – predicting the estimated delivery time of delivery trucks

Challenges in online model serving

Challenges in using newly arrived data for training

Underperformance of the model after online training

Overfitting and class imbalance

Increased latency

Handling concurrent requests

Implementing online model serving

Summary

Further reading

8

Two-Phase Model Serving

Technical requirements

Introducing two-phase model serving

Exploring two-phase model serving techniques

Quantized phase one model

Training and saving an MNIST model

Full integer quantization of the model and saving the converted model

Comparing the size and accuracy of the models

Separately trained phase one model with reduced features

Separately trained different models

Use cases of two-phase model serving

Case 4 – route planners

Summary

Further reading

9

Pipeline Pattern Model Serving

Technical requirements

Introducing the pipeline pattern

A DAG

Stages of the machine learning pipeline

Introducing Apache Airflow

Getting started with Apache Airflow

Creating and starting a pipeline using Apache Airflow

Demonstrating a machine learning pipeline using Airflow

Advantages and disadvantages of the pipeline pattern

Summary

Further reading

10

Ensemble Model Serving Pattern

Technical requirements

Introducing the ensemble pattern

Using ensemble pattern techniques

Model update

Aggregation

Model selection

Combining responses

End-to-end dummy example of serving the model

Summary

11

Business Logic Pattern

Technical requirements

Introducing the business logic pattern

Types of business logic

Technical approaches to business logic in model serving

Data validation

Feature transformation

Prediction post-processing

Summary

Part 3: Introduction to Tools for Model Serving

12

Exploring TensorFlow Serving

Technical requirements

Introducing TensorFlow Serving

Servable

Loader

Source

Aspired versions

Manager

Using TensorFlow Serving to serve models

TensorFlow Serving with Docker

Using advanced model configurations

Summary

Further reading

13

Using Ray Serve

Technical requirements

Introducing Ray Serve

Deployment

ServeHandle

Ingress deployment

Deployment graph

Using Ray Serve to serve a model

Using the ensemble pattern in Ray Serve

Using Ray Serve with the pipeline pattern

Summary

Further reading

14

Using BentoML

Technical requirements

Introducing BentoML

Preparing models

Services and APIs

Bento

Using BentoML to serve a model

Summary

Further reading

Part 4: Exploring Cloud Solutions

15

Serving ML Models Using a Fully Managed AWS SageMaker Cloud Solution

Technical requirements

Introducing Amazon SageMaker

Amazon SageMaker features

Using Amazon SageMaker to serve a model

Creating a notebook in Amazon SageMaker

Serving the model using Amazon SageMaker

Summary

Index

Other Books You May Enjoy

Part 1: Introduction to Model Serving

In this part, we will give an overview of model serving and explain why it is a challenge. We will also introduce you to a naive approach for serving models and discuss its drawbacks.
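As a point of reference before the patterns are introduced, the following is a minimal sketch of what such a naive approach often looks like in practice: a pickled model loaded into a single web process and exposed through one HTTP endpoint. The file name, route, and request format here are illustrative assumptions, not code from the book's chapters.

# Minimal sketch of a naive serving approach (illustrative only).
# Assumes a scikit-learn-style model has been pickled to model.pkl.
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the model once at startup and keep it in process memory.
with open("model.pkl", "rb") as f:  # hypothetical path to a trained model
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a JSON body such as {"features": [[1.0, 2.0, 3.0]]}.
    features = request.get_json()["features"]
    prediction = model.predict(features)
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    # A single development server: no batching, scaling, versioning,
    # or monitoring, which are among the drawbacks discussed in this part.
    app.run(host="0.0.0.0", port=5000)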

This section contains the following chapters:

Chapter 1, Introducing Model Serving
Chapter 2, Introducing Model Serving Patterns