35,99 €
As machine learning practitioners, we often encounter imbalanced datasets in which one class has considerably fewer instances than the other. Many machine learning algorithms assume an equilibrium between majority and minority classes, leading to suboptimal performance on imbalanced data. This comprehensive guide helps you address this class imbalance to significantly improve model performance.
Machine Learning for Imbalanced Data begins by introducing you to the challenges posed by imbalanced datasets and the importance of addressing these issues. It then guides you through techniques that enhance the performance of classical machine learning models when using imbalanced data, including various sampling and cost-sensitive learning methods.
As you progress, you’ll delve into similar and more advanced techniques for deep learning models, employing PyTorch as the primary framework. Throughout the book, hands-on examples will provide working and reproducible code that’ll demonstrate the practical implementation of each technique.
By the end of this book, you’ll be adept at identifying and addressing class imbalances and confidently applying various techniques, including sampling, cost-sensitive techniques, and threshold adjustment, while using traditional machine learning or deep learning models.
Das E-Book können Sie in Legimi-Apps oder einer beliebigen App lesen, die das folgende Format unterstützen:
Seitenzahl: 415
Veröffentlichungsjahr: 2023
Machine Learning for Imbalanced Data
Tackle imbalanced datasets using machine learning and deep learning techniques
Kumar Abhishek
Dr. Mounir Abdelaziz
BIRMINGHAM—MUMBAI
Copyright © 2023 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Niranjan Naikwadi
Publishing Product Manager: Sanjana Gupta
Book Project Manager: Kirti Pisat
Senior Editor: Rohit Singh
Technical Editor: Rahul Limbachiya
Copy Editor: Safis Editing
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Production Designer: Nilesh Mohite
DevRel Marketing Coordinator: Vinishka Kalra
First published: November 2023
Production reference: 2221123
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK.
ISBN 978-1-80107-083-6
www.packtpub.com
Kumar Abhishek is a seasoned senior machine learning engineer at Expedia Group, US, specializing in risk analysis and fraud detection. With over a decade of machine learning and software engineering experience, Kumar has worked for companies such as Microsoft, Amazon, and a Bay Area start-up. Kumar holds a master’s degree in computer science from the University of Florida, Gainesville.
To my incredible wife who has been my rock and constant source of inspiration, our adorable son who fills our lives with joy, my wonderful parents for their unwavering support, and my close friends. Immense thanks to Christian, who has been a pivotal mentor and guide, for his meticulous reviews. My deepest gratitude to my co-author, Mounir, and contributor, Anshul; their dedication and solid contributions were essential in shaping this book. Lastly, I extend my sincere appreciation to Abhiram and the Packt team for their unwavering support.
Dr. Mounir Abdelaziz is a deep learning researcher specializing in computer vision applications. He holds a Ph.D. in computer science and technology from Central South University, China. During his Ph.D. journey, he developed innovative algorithms to address practical computer vision challenges. He has also authored numerous research articles in the field of few-shot learning for image classification.
I would like to thank my family, especially my parents, for their support and encouragement. I also want to thank all the fantastic people I collaborated with, including my co-author, Packt editors, and reviewers. Without their help, writing this book wouldn’t have been possible.
Anshul Yadav is a software developer and trainer with a keen interest in machine learning, web development, and theoretical computer science. He likes to solve technical problems: the slinkier, the better. He has a B.Tech. degree in computer science and engineering from IIT Kanpur. Anshul loves to share the joy of learning with his audience.
Christian Monson has nine years of industry experience working as a machine learning scientist specializing in Natural Language Processing (NLP) and speech recognition. For five of those years, he worked at Amazon improving the Alexa personal assistant. During the 2000s, he was a graduate student at Carnegie Mellon University and a postdoc at Oregon Health and Science University working on NLP. Christian completed his bachelor’s degree in computer science, with minors in math and physics, at Brigham Young University in 2000. In his free time, Christian creates video games and plays with his kids. Currently, he is a full-time tutor and mentor in machine learning. You can find Christian at www.aitalks.art or watch his videos at youtube.com/@_aitalks.
Abhiram Jagarlapudi is a principal software engineer with 10 years of experience in cloud computing and Artificial Intelligence (AI). At Amazon Web Services and Oracle Cloud, Abhiram was part of launching several public cloud services, later specializing in cloud AI services. He was part of a small team that built the software delivery infrastructure of Oracle Cloud, which started in 2016 and has since grown into a multi-billion-dollar business. He also designed and developed AI services for the Oracle Cloud and is passionate about applying that experience to improve and accelerate the delivery of machine learning.
