


Hyperparameter Tuning with Python

Boost your machine learning model’s performance via hyperparameter tuning

Louis Owen

BIRMINGHAM—MUMBAI

Hyperparameter Tuning with Python

Copyright © 2022 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Gebin George

Publishing Product Manager: Dinesh Chaudhary

Senior Editor: David Sugarman

Technical Editor: Devanshi Ayare

Copy Editor: Safis Editing

Project Coordinator: Farheen Fathima

Proofreader: Safis Editing

Indexer: Pratik Shirodkhar

Production Designer: Ponraj Dhandapani

Marketing Coordinators: Shifa Ansari and Abeer Riyaz Dawe

First published: July 2022

Production reference: 1280722

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80323-587-5

www.packt.com

To Mom and Dad, thanks for everything!

– Louis

Contributors

About the author

Louis Owen is a data scientist/AI engineer from Indonesia who is always hungry for new knowledge. Throughout his career, he has worked in various industries, including NGOs, e-commerce, conversational AI, OTA, smart city, and FinTech. Outside of work, he loves to spend his time helping data science enthusiasts become data scientists, either through his articles or through mentoring sessions. He also loves to spend his spare time on his hobbies: watching movies and working on side projects. Finally, Louis loves to meet new friends, so please feel free to reach out to him on LinkedIn if you have any topics you would like to discuss.

About the reviewer

Jamshaid Sohail is passionate about data science, machine learning, computer vision, and natural language processing and has more than 2 years of experience in the industry. He has worked as a data scientist at FunnelBeam, a Silicon Valley-based start-up whose founders are from Stanford University. Currently, he is working as a data scientist at Systems Limited. He has completed over 66 online courses from different platforms. He authored the book Data Wrangling with Python 3.X for Packt Publishing and has reviewed multiple books and courses. He is also developing a comprehensive data science course at Educative and is in the process of writing books for multiple publishers.

Table of Contents

Preface

Section 1: The Methods

Chapter 1: Evaluating Machine Learning Models

Technical requirements

Understanding the concept of overfitting

Creating training, validation, and test sets

Exploring random and stratified splits

Discovering repeated k-fold cross-validation

Discovering Leave-One-Out cross-validation

Discovering LPO cross-validation

Discovering time-series cross-validation

Summary

Further reading

Chapter 2: Introducing Hyperparameter Tuning

What is hyperparameter tuning?

Demystifying hyperparameters versus parameters

Understanding hyperparameter space and distributions

Summary

Chapter 3: Exploring Exhaustive Search

Understanding manual search

Understanding grid search

Understanding random search

Summary

Chapter 4: Exploring Bayesian Optimization

Introducing BO

Understanding BO GP

Understanding SMAC

Understanding TPE

Understanding Metis

Summary

Chapter 5: Exploring Heuristic Search

Understanding simulated annealing

Understanding genetic algorithms

Understanding particle swarm optimization

Understanding Population-Based Training

Summary

Chapter 6: Exploring Multi-Fidelity Optimization

Introducing MFO

Understanding coarse-to-fine search

Understanding successive halving

Understanding Hyperband

Understanding BOHB

Summary

Section 2: The Implementation

Chapter 7: Hyperparameter Tuning via Scikit

Technical requirements

Introducing Scikit

Implementing Grid Search

Implementing Random Search

Implementing Coarse-to-Fine Search

Implementing Successive Halving

Implementing Hyperband

Implementing Bayesian Optimization Gaussian Process

Implementing Bayesian Optimization Random Forest

Implementing Bayesian Optimization Gradient Boosted Trees

Summary

Chapter 8: Hyperparameter Tuning via Hyperopt

Technical requirements

Introducing Hyperopt

Implementing Random Search

Implementing Tree-structured Parzen Estimators

Implementing Adaptive TPE

Implementing simulated annealing

Summary

Chapter 9: Hyperparameter Tuning via Optuna

Technical requirements

Introducing Optuna

Implementing TPE

Implementing Random Search

Implementing Grid Search

Implementing Simulated Annealing

Implementing Successive Halving

Implementing Hyperband

Summary

Chapter 10: Advanced Hyperparameter Tuning with DEAP and Microsoft NNI

Technical requirements

Introducing DEAP

Implementing the Genetic Algorithm

Implementing Particle Swarm Optimization

Introducing Microsoft NNI

Implementing Grid Search

Implementing Random Search

Implementing Tree-structured Parzen Estimators

Implementing Sequential Model Algorithm Configuration

Implementing Bayesian Optimization Gaussian Process

Implementing Metis

Implementing Simulated Annealing

Implementing Hyperband

Implementing Bayesian Optimization Hyperband

Implementing Population-Based Training

Summary

Section 3: Putting Things into Practice

Chapter 11: Understanding the Hyperparameters of Popular Algorithms

Exploring Random Forest hyperparameters

Exploring XGBoost hyperparameters

Exploring LightGBM hyperparameters

Exploring CatBoost hyperparameters

Exploring SVM hyperparameters

Exploring artificial neural network hyperparameters

Summary

Chapter 12: Introducing Hyperparameter Tuning Decision Map

Getting familiar with HTDM

Case study 1 – using HTDM with a CatBoost classifier

Case study 2 – using HTDM with a conditional hyperparameter space

Case study 3 – using HTDM with prior knowledge of the hyperparameter values

Summary

Chapter 13: Tracking Hyperparameter Tuning Experiments

Technical requirements

Revisiting the usual practices

Using a built-in Python dictionary

Using a configuration file

Using additional modules

Exploring Neptune

Exploring scikit-optimize

Exploring Optuna

Exploring Microsoft NNI

Exploring MLflow

Summary

Chapter 14: Conclusions and Next Steps

Revisiting hyperparameter tuning methods and packages

Revisiting HTDM

What’s next?

Summary

Other Books You May Enjoy


Preface

Hyperparameters are an important element in building useful machine learning models. This book curates numerous hyperparameter tuning methods for Python, one of the most popular coding languages for machine learning. Alongside in-depth explanations of how each method works, you will use a decision map that can help you identify the best tuning method for your requirements.

We will start the book with an introduction to hyperparameter tuning and explain why it's important. You'll learn the best methods of hyperparameter tuning for a variety of use cases and specific algorithm types. The book will not only cover the usual grid or random search but also other powerful underdog methods. Individual chapters are dedicated to each of the four main groups of hyperparameter tuning methods: exhaustive search, heuristic search, Bayesian optimization, and multi-fidelity optimization.

Later in the book, you will learn about top frameworks such as scikit-learn, Hyperopt, Optuna, NNI, and DEAP to implement hyperparameter tuning. Finally, we will cover hyperparameters of popular algorithms and best practices that will help you efficiently tune your hyperparameters.

By the end of the book, you will have the skills you need to take full control over your machine learning models and get the best models for the best results.

Who this book is for

The book is intended for data scientists and machine learning engineers who work with Python and want to further boost their ML models' performance by utilizing the appropriate hyperparameter tuning method. You will need a basic understanding of ML and how to code in Python, but no prior knowledge of hyperparameter tuning in Python is required.

What this book covers

Chapter 1, Evaluating Machine Learning Models, covers all the important things we need to know when it comes to evaluating ML models, including the concept of overfitting, the idea of splitting data into several parts, a comparison between the random and stratified split, and numerous methods on how to split the data.

Chapter 2, Introducing Hyperparameter Tuning, introduces the concept of hyperparameter tuning, starting from the definition and moving on to the goal, several misconceptions, and distributions of hyperparameters.

Chapter 3, Exploring Exhaustive Search, explores each method that belongs to the first out of four groups of hyperparameter tuning, along with the pros and cons. There will be both high-level and detailed explanations for each of the methods. The high-level explanation will use a visualization strategy to help you understand more easily, while the detailed explanation will bring the math to the table.

Chapter 4, Exploring Bayesian Optimization, explores each method that belongs to the second out of four groups of hyperparameter tuning, along with the pros and cons. There will also be both high-level and detailed explanations for each of the methods.

Chapter 5, Exploring Heuristic Search, explores each method that belongs to the third out of four groups of hyperparameter tuning, along with the pros and cons. There will also be both high-level and detailed explanations for each of the methods.

Chapter 6, Exploring Multi-Fidelity Optimization, explores each method that belongs to the fourth out of four groups of hyperparameter tuning, along with the pros and cons. There will also be both high-level and detailed explanations for each of the methods.

Chapter 7, Hyperparameter Tuning via Scikit, covers all the important things about scikit-learn, scikit-optimize, and scikit-hyperband, along with how to utilize each of them to perform hyperparameter tuning.

Chapter 8, Hyperparameter Tuning via Hyperopt, introduces the Hyperopt package, starting from its capabilities and limitations, how to utilize it to perform hyperparameter tuning, and all the other important things you need to know about it.

Chapter 9, Hyperparameter Tuning via Optuna, introduces the Optuna package, starting from its numerous features, how to utilize it to perform hyperparameter tuning, and all the other important things you need to know about it.

Chapter 10, Advanced Hyperparameter Tuning with DEAP and Microsoft NNI, shows how to perform hyperparameter tuning using both the DEAP and Microsoft NNI packages, starting from getting ourselves familiar with the packages and moving on to the important modules and parameters we need to be aware of.

Chapter 11, Understanding the Hyperparameters of Popular Algorithms, explores the hyperparameters of several popular ML algorithms. There will be a broad explanation for each of the algorithms, including (but not limited to) the definition of each hyperparameter, what will be impacted when the value of each hyperparameter is changed, and the priority list of hyperparameters based on the impact.

Chapter 12, Introducing Hyperparameter Tuning Decision Map, introduces the Hyperparameter Tuning Decision Map (HTDM), which summarizes all of the discussed hyperparameter tuning methods as a simple decision map based on six aspects. There will also be three case studies that show how to utilize the HTDM in practice.

Chapter 13, Tracking Hyperparameter Tuning Experiments, covers the importance of tracking hyperparameter tuning experiments, along with the usual practices. You will also be introduced to several open source packages that are available and learn how to utilize each of them in practice.

Chapter 14, Conclusions and Next Steps, summarizes all the important lessons learned in the previous chapters, and also introduces you to several topics or implementations that you may benefit from that we have not covered in detail in this book.

To get the most out of this book

You will need Python version 3.7 (or above) installed on your computer, along with the related packages mentioned in the Technical requirements section of each chapter.

It is worth noting that there is a conflicting version requirement for the Hyperopt package in Chapter 8, Hyperparameter Tuning via Hyperopt, and Chapter 10, Advanced Hyperparameter Tuning with DEAP and Microsoft NNI. You need to install version 0.2.7 for Chapter 8, Hyperparameter Tuning via Hyperopt, and version 0.1.2 for Chapter 10, Advanced Hyperparameter Tuning with DEAP and Microsoft NNI.

It is also worth noting that the Hyperband implementation used in Chapter 7, Hyperparameter Tuning via Scikit, is a modified version of the scikit-hyperband package. You can utilize the modified version by cloning the GitHub repository (a link is available in the next section) and looking in the folder named hyperband.

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

To understand all the content in this book, you will need a basic understanding of ML and how to code in Python, but no prior knowledge of hyperparameter tuning in Python is required. At the end of this book, you will also be introduced to several topics and implementations that you may benefit from but that we have not covered in detail.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Hyperparameter-Tuning-with-Python. If there’s an update to the code, it will be updated in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://packt.link/ExcbH.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: As for criterion and max_depth, we are still using the same configuration as the previous search space.

A block of code is set as follows:

for n_est in n_estimators:
    for crit in criterion:
        for m_depth in max_depth:
            # perform cross-validation here
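The nested loop above can be expanded into a minimal runnable sketch. The candidate values below are illustrative assumptions, and the evaluate function is a stand-in for real cross-validation, which would be used in practice:

```python
# Illustrative candidate values; a real search space depends on your model.
n_estimators = [50, 100, 200]
criterion = ["gini", "entropy"]
max_depth = [3, 5, None]

def evaluate(n_est, crit, m_depth):
    """Stand-in for cross-validation; replace with a real CV score."""
    return n_est / 200 + (0 if m_depth is None else 1 / m_depth)

best_score, best_params = float("-inf"), None
for n_est in n_estimators:
    for crit in criterion:
        for m_depth in max_depth:
            score = evaluate(n_est, crit, m_depth)
            if score > best_score:
                best_score, best_params = score, (n_est, crit, m_depth)
```

After the loops finish, best_params holds the candidate combination with the highest score, which is exactly what grid search does.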

Tips or Important Notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at customercare@packtpub.com and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you’ve read Hyperparameter Tuning with Python, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Section 1: The Methods

This initial section covers concepts and theories you need to know before performing hyperparameter tuning experiments.

This section includes the following chapters:

Chapter 1, Evaluating Machine Learning Models
Chapter 2, Introducing Hyperparameter Tuning
Chapter 3, Exploring Exhaustive Search
Chapter 4, Exploring Bayesian Optimization
Chapter 5, Exploring Heuristic Search
Chapter 6, Exploring Multi-Fidelity Optimization

Chapter 2: Introducing Hyperparameter Tuning

Every machine learning (ML) project should have a clear goal and success metrics. The success metrics can be in the form of business and/or technical metrics. Evaluating business metrics is hard, and often, they can only be evaluated after the ML model is in production. On the other hand, evaluating technical metrics is more straightforward and can be done during the development phase. We, as ML developers, want to achieve the best technical metrics that we can get since this is something that we can optimize.

In this chapter, we'll learn one out of several ways to optimize the chosen technical metrics, called hyperparameter tuning. We will start this chapter by understanding what hyperparameter tuning is, along with its goal. Then, we'll discuss the difference between a hyperparameter and a parameter. We'll also learn the concept of hyperparameter space and possible distributions of hyperparameter values that you may find in practice.

By the end of this chapter, you will understand the concept of hyperparameter tuning and hyperparameters themselves. Understanding these concepts is crucial for you to get a bigger picture of what will be discussed in the next chapters.

In this chapter, we'll be covering the following main topics:

What is hyperparameter tuning?
Demystifying hyperparameters versus parameters
Understanding hyperparameter space and distributions
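As a small preview of the hyperparameter space and distributions topic, the following sketch (not from the book; the parameter names and ranges are illustrative assumptions) contrasts uniform and log-uniform sampling, two distributions commonly used when defining a hyperparameter space:

```python
import math
import random

random.seed(42)

def sample_uniform(low, high):
    # Suits hyperparameters whose sensible values span a narrow range.
    return random.uniform(low, high)

def sample_loguniform(low, high):
    # Suits scale-spanning hyperparameters such as a learning rate,
    # giving each order of magnitude roughly equal probability mass.
    return math.exp(random.uniform(math.log(low), math.log(high)))

learning_rates = [sample_loguniform(1e-4, 1e-1) for _ in range(5)]
depths = [round(sample_uniform(2, 10)) for _ in range(5)]
```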

What is hyperparameter tuning?

Hyperparameter tuning is a process whereby we search for the best set of hyperparameters of an ML model from all of the candidate sets. It is the process of optimizing the technical metrics we care about. The goal of hyperparameter tuning is simply to get the maximum evaluation score on the validation set without causing an overfitting issue.
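To make the definition concrete, here is a minimal sketch using scikit-learn (the implementation library covered in Chapter 7). The dataset, model, and grid are illustrative assumptions, and the mean cross-validated score plays the role of the validation-set score we want to maximize:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Each combination in param_grid is one candidate set of hyperparameters.
search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    cv=3,  # the mean cross-validated score is what we maximize
)
search.fit(X, y)
best = search.best_params_  # the candidate set with the highest score
```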

Hyperparameter tuning is one of the model-centric approaches to optimizing a model's performance. In practice, it is suggested to prioritize data-centric approaches over model-centric ones when it comes to optimizing a model's performance. Data-centric means that we focus on cleaning, sampling, augmenting, or modifying the data, while model-centric means that we focus on the model and its configuration.

To understand why data-centric approaches are prioritized over model-centric ones, let's say you are a cook in a restaurant. When it comes to cooking, no matter how expensive and fancy your kitchen setup is, if the ingredients are not in good condition, it's impossible to serve high-quality food to your customers. In this analogy, the ingredients are the data, and the kitchen setup is the model and its configuration. No matter how fancy and complex our model is, if we do not have good data or features in the first place, then we can't achieve the maximum evaluation score. This is expressed in the famous saying, garbage in, garbage out (GIGO).

In model-centric approaches, hyperparameter tuning is performed after we have found the most suitable model framework or architecture. So, it can be said that hyperparameter tuning is the final step in optimizing the model's performance.


