Debugging Machine Learning Models with Python is a comprehensive guide that navigates you through the entire spectrum of mastering machine learning, from foundational concepts to advanced techniques. It goes beyond the basics to arm you with the expertise essential for building reliable, high-performance models for industrial applications. Whether you're a data scientist, analyst, machine learning engineer, or Python developer, this book will empower you to design modular systems for data preparation, accurately train and test models, and seamlessly integrate them into larger technologies.
By bridging the gap between theory and practice, you'll learn how to evaluate model performance, identify and address issues, and harness recent advancements in deep learning and generative modeling using PyTorch and scikit-learn. Your journey to developing high-quality models in practice will also encompass causal and human-in-the-loop modeling and machine learning explainability. With hands-on examples and clear explanations, you'll develop the skills to deliver impactful solutions across domains such as healthcare, finance, and e-commerce.
Page count: 447
Publication year: 2023
Debugging Machine Learning Models with Python
Develop high-performance, low-bias, and explainable machine learning and deep learning models
Ali Madani
BIRMINGHAM—MUMBAI
Copyright © 2023 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Niranjan Naikwadi
Publishing Product Manager: Anant Jain
Book Project Manager: Hemangi Lotlikar
Senior Editor: Rohit Singh
Technical Editor: Sweety Pagaria
Copy Editor: Safis Editing
Proofreader: Safis Editing
Indexer: Sejal Dsilva
Production Designer: Joshua Misquitta
DevRel Marketing Executive: Vinishka Kalra
First published: Sep 2023
Production reference: 1180823
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK.
ISBN 978-1-80020-858-2
www.packtpub.com
To my mother, Fatemeh Bekali, and my father, Razi, whose sacrifices and unwavering support have been my foundation. To my loving partner, Parand, whose constant understanding and love have been my inspiration and strength.
– Ali Madani
Ali Madani is a global expert in ML-based drug discovery, where he has led the development of multiple robust ML products with real-world applications in the life sciences. Ali is a skilled communicator who is passionate about practical applications of ML development. He rose to popularity on social media through his educational series on applied ML, distilling complex state-of-the-art AI research topics into brief descriptions and diagrams that could be easily understood by ML learners and non-technical professionals interested in the scientific and business applications of new technologies. Through his role as the Director of Machine Learning at Cyclica (acquired by Recursion Pharmaceuticals), Ali was involved in all phases of the ML product life cycle, from ideation to continuous development, field testing, and commercialization. He was a mentor to ML-oriented staff developing their technical skill sets as well as to scientifically oriented staff and field experts seeking to reconcile their interpretation of ML model evaluations with real-world applications.
In this book, Debugging Machine Learning Models with Python, Ali shares his first-hand experience with readers, covering the practical elements of ML development that are critical for progressing ML technologies from first-pass data science experiments into refined, commercial ML solutions aimed at real-world performance. The book covers a broad spectrum of topics, from modularizing components of ML life cycles to correctly assessing the performance of ML models and devising improvement strategies. It extends beyond ML model training and testing and provides technical details on how to detect biases in your models and plan to achieve fairness through different techniques, such as methods aiming for local and global ML explainability. You will also practice supervised, generative, and self-supervised deep learning modeling for different data modalities, such as images, text, and graphs. Throughout, you will work with Python libraries such as scikit-learn, PyTorch, Transformers, Ray, imblearn, Shap, AIF360, and many more to gain hands-on experience in implementing these techniques and concepts.
With this book, you’ll learn how to maximize the value of ML technologies, leading the way in developing best-in-class technologies in any domain. Here, Ali covers the engineering aspects of ML technology development, including data and model versioning to achieve reproducibility, data and concept drift detection to keep models reliable in production, and test-driven development to reduce the risk of untrustworthy ML models. You will also learn about different techniques for increasing the security and privacy of your data and models.
Stephen MacKinnon
Vice President, Digital Chemistry
Ali Madani worked as the Director of Machine Learning at Cyclica Inc, leading Cyclica's AI technology development for drug discovery before its acquisition by Recursion Pharmaceuticals, where Ali continues to focus on the applications of machine learning for drug discovery. Ali completed his Ph.D. at the University of Toronto, focusing on machine learning modeling in a cancer setting, and attained a Master of Mathematics degree from the University of Waterloo. As a believer in industry-oriented education and the democratization of knowledge, Ali has actively educated students and professionals through international workshops and courses on basic and advanced high-quality machine learning modeling. When not immersed in machine learning modeling and teaching, Ali enjoys exercising, cooking, and traveling with his partner.
I would like to extend my heartfelt thanks to my partner, Parand, and my parents for their unwavering support and love. I’m also deeply grateful to my mentors throughout the years, whose wisdom and guidance have been invaluable. Thank you all for being an essential part of this journey.
Krishnan Raghavan is an IT professional with over 20 years of experience in software development and delivery excellence across multiple domains and technologies, ranging from C++ to Java, Python, data warehousing, and big data tools and technologies. In his free time, Krishnan likes to spend time with his wife and daughter and to read fiction, non-fiction, and technical books. Krishnan tries to give back to the community by being a part of the GDG – Pune Volunteer Group and helping the team organize events. Currently, he is unsuccessfully trying to learn to play the guitar.
You can connect with him by email or via LinkedIn.
I would like to thank my wife, Anita, and daughter, Ananya, for giving me the time and space to review this book.
Amreth Chandrasehar is a Director at Informatica responsible for ML Engineering, Observability, and SRE teams. Over the last few years, he has played a key role in cloud migration, CNCF architecture, generative AI, observability, and machine learning adoption at various organizations. He is also a co-creator of the Conducktor Platform, serving T-Mobile’s 140+ million customers, and a Tech/Customer Advisory board member at various companies. He has also co-developed and open-sourced Kardio.io. Amreth has been invited to speak at several key conferences and has won several awards within the company. He was recently awarded a Gold Award at the 15th Annual 2023 Golden Bridge Business and Innovation Awards for his contributions to the fields of observability and generative AI.
I would like to thank my wife, Ashwinya Mani, and my son, Athvik A, for their patience and support during my review of this book.
In this part of the book, we will delve into the different aspects of machine learning development that extend beyond traditional paradigms. The first chapter illuminates the nuances between conventional code debugging and the specialized realm of machine learning debugging, emphasizing that the challenges in ML transcend mere code errors. The next chapter provides a comprehensive overview of the machine learning life cycle, highlighting the role of modularization in streamlining and enhancing model development. Finally, we will underscore the importance of model debugging in the pursuit of Responsible AI, emphasizing its role in ensuring ethical, transparent, and effective machine learning solutions.
This part has the following chapters:
Chapter 1, Beyond Code Debugging
Chapter 2, Machine Learning Life Cycle
Chapter 3, Debugging toward Responsible AI