The Regularization Cookbook

Vincent Vandenbussche
Description

Regularization is a reliable way to produce accurate results on unseen data; however, applying it is challenging because it comes in many forms, and choosing the appropriate technique for each model is essential. The Regularization Cookbook provides you with the appropriate tools and methods to handle any case, with ready-to-use working code as well as theoretical explanations.

After an introduction to regularization and methods to diagnose when to use it, you’ll start implementing regularization techniques on linear models, such as linear and logistic regression, and tree-based models, such as random forest and gradient boosting. You’ll then be introduced to specific regularization methods based on data, high-cardinality features, and imbalanced datasets. In the last five chapters, you’ll discover regularization for deep learning models. After reviewing general methods that apply to any type of neural network, you’ll dive into NLP-specific methods for RNNs and transformers, including the use of BERT and GPT-3. Finally, you’ll explore regularization for computer vision, covering CNN specifics, along with the use of generative models such as Stable Diffusion and DALL-E.

By the end of this book, you’ll be armed with different regularization techniques to apply to your ML and DL models.
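As a taste of what these recipes look like in practice, here is a minimal sketch of L2 regularization with ridge regression, one of the first techniques the book covers. It assumes scikit-learn and NumPy are installed; the data is synthetic and purely illustrative, not taken from the book.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Few samples, many features: a setup prone to overfitting.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 20))
y = X[:, 0] + rng.normal(scale=0.5, size=30)  # only the first feature matters

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)  # alpha controls regularization strength

# Ridge shrinks coefficients toward zero, damping spurious weights
# that plain least squares assigns to the 19 irrelevant features.
print(np.abs(ols.coef_).sum(), np.abs(ridge.coef_).sum())
```

Tuning `alpha` moves the model along the overfitting-to-underfitting spectrum described above: larger values shrink the coefficients more aggressively.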

The e-book can be read in Legimi apps or in any app that supports the following formats:

EPUB
MOBI

Page count: 412

Publication year: 2023




The Regularization Cookbook

Explore practical recipes to improve the functionality of your ML models

Vincent Vandenbussche

BIRMINGHAM—MUMBAI

The Regularization Cookbook

Copyright © 2023 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Ali Abidi

Associate Publishing Product Manager: Aditya Datar

Senior Editor: Tiksha Lad

Technical Editor: Sweety Pagaria

Copy Editor: Safis Editing

Project Coordinator: Farheen Fathima

Proofreader: Safis Editing

Indexer: Manju Arasan

Production Designer: Vijay Kamble

Marketing Coordinator: Vinishka Kalra

First published: July 2023

Production reference: 1200723

Published by Packt Publishing Ltd.

Grosvenor House

11 St Paul’s Square

Birmingham

B3 1RB

ISBN 978-1-83763-408-8

www.packtpub.com

Writing a book mirrors the kaleidoscope of life, encompassing its ups and downs, joys and sorrows, the quest for purpose, fulfillment, and the haunting whispers of inner emptiness. In this swirling storm of intense emotions, Amandine’s unwavering support served as my guiding light, keeping me on course, and it is with deep appreciation that I dedicate this book to her.

– Vincent Vandenbussche

Foreword

In the ever-evolving landscape of machine learning, it can often feel like you’re trying to navigate a complex labyrinth of algorithms, models, and methods. Among the multitude of concepts that make up this domain, regularization plays an essential role – it keeps the balance between bias and variance, helping models to generalize well rather than just memorizing the training data.

As the art and science of machine learning continue to mature, it becomes imperative that we make these technical concepts accessible, intuitive, and practical to practitioners. In this realm, Vincent, with his depth of experience in machine learning, has embarked on a journey to unravel the complexity of regularization in machine learning models. In this cookbook, he offers a structured approach, grounded in practicality, that meticulously covers everything from the basics to advanced regularization techniques.

Beginning with an introduction to the topic, the book lays the groundwork by offering a refresher on machine learning best practices. It then moves on to tackle regularization techniques for both linear and tree-based models, elegantly bridging the gap between theory and practice.

But Vincent’s cookbook does not stop there. It ventures into regularization with data, covering feature aggregation and the handling of imbalanced datasets, followed by insights into regularization for deep learning and recurrent neural networks.

Recognizing the significance of application-specific regularization techniques, Vincent dedicates several chapters to regularization in natural language processing and computer vision. These chapters explore regularization methods specific to these domains, including the use of embeddings, data augmentation, and synthetic image generation.

The final chapter, on synthetic image generation for regularization, is an intriguing intersection of creativity and technology, offering a testament to Vincent’s expertise in cutting-edge techniques.

Throughout this cookbook, Vincent’s deft blend of theory, practice, and cutting-edge knowledge is evident. This book serves not just as a guide, but as a toolbox, filled with practical recipes that you can directly implement and adapt as per your needs.

Whether you are a beginner finding your feet in the world of machine learning or an experienced practitioner looking to deepen your understanding of regularization, this cookbook offers a comprehensive and hands-on approach. It speaks volumes to Vincent’s expertise and his ability to make complex concepts accessible and practical.

This cookbook promises to be an indispensable resource for every machine learning enthusiast’s journey. It is a culmination of Vincent’s vast experience, ingenuity, and desire to share his knowledge with the world. As you embark on this journey with him, it is my hope that this book will guide, inspire, and empower you to harness the true power of regularization in machine learning.

Happy reading and cooking in the world of machine learning!

–Akin Osman Kazakci

Head of Data Innovation Lab at MINES ParisTech and Entrepreneur-in-Residence at Caltech

Contributors

About the author

Since earning a Ph.D. in physics, Vincent Vandenbussche has spent a decade deploying ML solutions at scale for numerous companies, including Renault, L’Oréal, General Electric, Jellysmack, Chanel, and CERN.

He also has a passion for teaching; he co-founded a data science boot camp, was an ML lecturer at MINES ParisTech engineering school and EDHEC Business School, and trained numerous professionals at companies such as ArcelorMittal and Orange.

About the reviewer

Rajat Agrawal is an accomplished data scientist with a keen focus on deep learning and a profound interest in generative AI and language models. With four years of experience in research and two years in industry, he has established himself as a valuable asset in the realm of data science. Rajat’s expertise in statistical modeling and predictive analytics, coupled with his Springer publication on a lane detection and collision prevention system for automated vehicles, demonstrates his innovative approach to computer vision algorithms. He is dedicated to the mission of democratizing AI and ensuring its accessibility to everyone.

My parents’ unwavering support has been instrumental in shaping my career.

Table of Contents

Preface

1

An Overview of Regularization

Technical requirements

Introducing regularization

Examples of models that did not pass the deployment test

Intuition about regularization

Key concepts of regularization

Bias and variance

Underfitting and overfitting

Regularization – from overfitting to underfitting

Unavoidable bias

Diagnosing bias and variance

Regularization – a multi-dimensional problem

Summary

2

Machine Learning Refresher

Technical requirements

Loading data

Getting ready

How to do it…

There’s more…

See also

Splitting data

Getting ready

How to do it…

See also

Preparing quantitative data

Getting ready

How to do it…

There’s more…

See also

Preparing qualitative data

Getting ready

How to do it…

There’s more…

See also

Training a model

Getting ready

How to do it…

See also

Evaluating a model

Getting ready

How to do it…

See also

Performing hyperparameter optimization

Getting ready

How to do it…

3

Regularization with Linear Models

Technical requirements

Training a linear regression model with scikit-learn

Getting ready

How to do it…

There’s more…

See also

Regularizing with ridge regression

Getting ready

How to do it…

There’s more…

See also

Regularizing with lasso regression

Getting ready

How to do it…

There’s more…

See also

Regularizing with elastic net regression

Getting ready

How to do it…

See also

Training a logistic regression model

Getting ready

How to do it…

Regularizing a logistic regression model

Getting ready

How to do it…

There’s more…

Choosing the right regularization

Getting ready

How to do it…

See also

4

Regularization with Tree-Based Models

Technical requirements

Building a classification tree

Disorder measurement

Loss function

Getting ready

How to do it…

There’s more…

See also

Building regression trees

Getting ready

How to do it…

See also

Regularizing a decision tree

Getting ready

How to do it…

How it works…

There’s more…

See also

Training the Random Forest algorithm

Getting ready

How to do it…

See also

Regularization of Random Forest

Getting started

How to do it…

Training a boosting model with XGBoost

Getting ready

How to do it…

See also

Regularization with XGBoost

Getting ready

How to do it…

There’s more…

5

Regularization with Data

Technical requirements

Hashing high cardinality features

Getting started

How to do it…

See also

Aggregating features

Getting ready

How to do it…

There’s more…

Undersampling an imbalanced dataset

Getting ready

How to do it…

There’s more…

See also

Oversampling an imbalanced dataset

Getting ready

How to do it…

There’s more…

See also

Resampling imbalanced data with SMOTE

Getting ready

How to do it…

There’s more…

See also

6

Deep Learning Reminders

Technical requirements

Training a perceptron

Getting started

How to do it…

There’s more…

See also

Training a neural network for regression

Getting started

How to do it…

There’s more…

See also

Training a neural network for binary classification

Getting ready

How to do it…

There’s more…

See also

Training a multiclass classification neural network

Getting ready

How to do it…

There’s more…

See also

7

Deep Learning Regularization

Technical requirements

Regularizing a neural network with L2 regularization

Getting ready

How to do it…

There’s more…

See also

Regularizing a neural network with early stopping

Getting ready

How to do it…

There’s more…

Regularization with network architecture

Getting ready

How to do it…

There’s more…

Regularizing with dropout

Getting ready

How to do it…

There’s more…

See also

8

Regularization with Recurrent Neural Networks

Technical requirements

Training an RNN

Getting started

How to do it…

There’s more…

See also

Training a GRU

Getting started

How to do it…

There’s more…

See also

Regularizing with dropout

Getting ready

How to do it…

There’s more…

Regularizing with the maximum sequence length

Getting ready

How to do it…

There’s more…

9

Advanced Regularization in Natural Language Processing

Technical requirements

Regularization using a word2vec embedding

Getting ready

How to do it…

There’s more…

See also

Data augmentation using word2vec

Getting ready

How to do it…

There’s more…

See also

Zero-shot inference with pre-trained models

Getting ready

How to do it…

There’s more…

See also

Regularization with BERT embeddings

Getting ready

How to do it…

There’s more…

See also

Data augmentation using GPT-3

Getting ready

How to do it…

There’s more…

See also

10

Regularization in Computer Vision

Technical requirements

Training a CNN

Getting started

How to do it…

There’s more…

See also

Regularizing a CNN with vanilla NN methods

Getting started

How to do it…

There’s more…

See also

Regularizing a CNN with transfer learning for object detection

Object detection

Mean average precision

COCO dataset

Getting started

How to do it…

There’s more…

See also

Semantic segmentation using transfer learning

Getting started

How to do it…

There’s more…

See also

11

Regularization in Computer Vision – Synthetic Image Generation

Technical requirements

Applying image augmentation with Albumentations

Spatial-level augmentation

Pixel-level augmentation

Albumentations

Getting started

How to do it…

There’s more…

See also

Creating synthetic images for object detection

Getting started

How to do it…

There’s more…

See also

Implementing real-time style transfer

Stable Diffusion

Perceptual loss

Getting started

How to do it…

There’s more…

See also

Index

Other Books You May Enjoy