In the rapidly evolving landscape of machine learning, the ability to accurately quantify uncertainty is pivotal. The book addresses this need by offering an in-depth exploration of Conformal Prediction, a cutting-edge framework to manage uncertainty in various ML applications.
Learn how Conformal Prediction excels at calibrating classification models, producing well-calibrated prediction intervals for regression, and resolving challenges in time series forecasting and imbalanced data. Discover specialised applications of conformal prediction in cutting-edge domains like computer vision and NLP. Each chapter delves into specific aspects, offering hands-on insights and best practices for enhancing prediction reliability. The book concludes with a focus on the nuances of multi-class classification, equipping you with expert-level proficiency to seamlessly integrate Conformal Prediction into diverse industries. With practical examples in Python using real-world datasets, expert insights, and open-source library applications, you will gain a solid understanding of this modern framework for uncertainty quantification.
By the end of this book, you will have mastered Conformal Prediction in Python through a blend of theory and practical application, enabling you to confidently apply this powerful framework to quantify uncertainty in diverse fields.
Practical Guide to Applied Conformal Prediction in Python
Learn and apply the best uncertainty frameworks to your industry applications
Valery Manokhin
BIRMINGHAM—MUMBAI
Copyright © 2023 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Niranjan Naikwadi
Publishing Product Manager: Nitin Nainani
Content Development Editor: Priyanka Soam
Technical Editor: Devanshi Ayare
Copy Editor: Safis Editing
Project Coordinator: Shambhavi Mishra
Proofreader: Safis Editing
Indexer: Rekha Nair
Production Designer: Prafulla Nikalje
Marketing Coordinator: Vinishka Kalra
First published: Nov 2023
Production reference: 2041223
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK.
ISBN 978-1-80512-276-0
www.packtpub.com
In statistical and machine learning, it is rare to encounter a technique that blends deep mathematical rigor with practical simplicity. Conformal Prediction is one such gem. Rooted in solid probability theory, it transcends academic theory to find wide-ranging applications in the real world.
Valery, who studied under the inventor of Conformal Prediction, compiles in this book a treasure trove of practical knowledge, tailored for practicing data scientists. His work makes Conformal Prediction not only accessible but intuitively understandable, bridging the gap between complex theory and practical application.
This book stands out for its unique approach to demystifying Conformal Prediction. It eschews the often esoteric and dense theoretical exposition common in statistical texts, opting instead for clarity and comprehensibility. This approach makes the powerful techniques of Conformal Prediction accessible to a broader range of machine learning practitioners.
The applications of Conformal Prediction are vast and varied, and this book delves into them in meticulous detail, from classification and regression to time series analysis, computer vision, and language models. Each application is explored thoroughly, with examples that provide practitioners with practical guidance on applying these methods in their work.
This book will be an essential reference for machine learning engineers and data scientists who seek to incorporate uncertainty quantification (UQ) into the models they develop and deploy, a critical element that has been missing in machine learning. UQ is critical for understanding prediction reliability, for providing safety during model deployment, and for identifying potential model weaknesses during model development and testing.
For example, the Python Interpretable Machine Learning (PiML) toolkit that my team developed and applied in banking incorporates Conformal Prediction to identify regions of the input space where models are less reliable (that is, where prediction uncertainty is higher).
Agus Sudjianto, PhD
Executive Vice President
Head of Corporate Model Risk, Wells Fargo
Valery Manokhin is a leading expert in the field of machine learning and Conformal Prediction. He holds a Ph.D. in Machine Learning from Royal Holloway, University of London. His doctoral work was supervised by the creator of Conformal Prediction, Vladimir Vovk, and focused on developing new methods for quantifying uncertainty in machine learning models.
Valery has published extensively in leading machine learning journals, and his Ph.D. dissertation, “Machine Learning for Probabilistic Prediction,” is read by thousands of people across the world. He is also the creator of “Awesome Conformal Prediction,” the most popular resource and GitHub repository for all things Conformal Prediction.
Eleftherios (Lefteris) Koulierakis is a senior data scientist with a diverse international working background. He holds an Engineering Doctorate in data science from Eindhoven University of Technology. He has demonstrated a consistent track record of innovation, notably as the lead inventor of several machine learning patents primarily applicable to the semiconductor industry. He has architected and developed numerous machine learning and deep learning solutions for anomaly detection, image processing, forecasting, and predictive maintenance applications. Embracing collaborations, he has experience in guiding data science teams toward successful product deliveries and also has experience in supporting team members to reach their full potential.
Rahul Vishwakarma is a Senior Member of the IEEE and has worked in the industry for over twelve years. During his tenure at Dell Technologies, he drove solutions for data protection and assisted customers in safeguarding data with Data Domain. In his role as a Solution Architect at Hewlett Packard Enterprise (HPE), he designed reference architectures for the Converged System for SAP HANA. He holds more than 65 US patents in the machine learning, data storage, persistent memory, DNA storage, and blockchain domains. His current research interests include addressing bias, explainability, and uncertainty quantification in machine learning models.
This part will introduce you to conformal prediction. It will explain in detail the type of problem that conformal prediction can address and outline the general ideas on which it is based.
This part has the following chapters:
Chapter 1, Introducing Conformal Prediction
Chapter 2, Overview of Conformal Prediction

This book is about conformal prediction, a modern framework for uncertainty quantification that is becoming increasingly popular in industry and academia.
Machine learning and AI applications are everywhere. In the realm of machine learning, prediction is a fundamental task. Given a training dataset, we train a machine learning model to make predictions on new data.
Figure 1.1 – Machine learning prediction model
However, in many real-world applications, the predictions made by statistical, machine learning, and deep learning models are often incorrect or unreliable because of various factors, such as insufficient or incomplete data, issues arising during the modeling process, or simply because of the randomness and complexities of the underlying problem.
Predictions made by machine learning models often come without the uncertainty quantification required for confident and reliable decision-making. This is where conformal prediction comes in. By providing a clear measure of the reliability of its predictions, conformal prediction enhances the trustworthiness and explainability of machine learning models, making them more transparent and user-friendly for decision-makers.
This chapter will introduce conformal prediction and explore how it can be applied in practical settings.
In this chapter, we’re going to cover the following main topics:
Introduction to conformal prediction
The origins of conformal prediction
How conformal prediction differs from traditional machine learning
The p-value and its role in conformal prediction

The chapter will provide a practical understanding of conformal prediction and its applications. By the end of this chapter, you will understand how conformal prediction can be applied to your own machine learning models to improve their reliability and interpretability.
This book uses Python. The code for this book is hosted on GitHub and can be found here: https://github.com/PacktPublishing/Practical-Guide-to-Applied-Conformal-Prediction. You can run the notebooks locally or upload them to Google Colab (https://colab.research.google.com/).
In this section, we will introduce conformal prediction and explain how it can be used to improve the reliability of predictions produced by statistical, machine learning, and deep learning models. We will provide an overview of the key ideas and concepts behind conformal prediction, including its underlying principles and benefits. By the end of this section, you will have a solid understanding of conformal prediction and why it is an important framework to know.
Conformal prediction is a powerful machine learning framework that provides valid confidence measures for individual predictions. This means that when you make a prediction using any model from the conformal prediction framework, you can also measure your confidence in that prediction.
This is incredibly useful in many practical applications where it is crucial to have reliable and interpretable predictions. For example, in medical diagnosis, conformal prediction can provide a confidence level that a tumor is malignant versus benign. This enables physicians to make more informed treatment decisions based on the prediction confidence. In finance, conformal prediction can provide prediction intervals estimating financial risk. This allows investors to quantify upside and downside risks.
Specifically, conformal prediction can report a 95% chance that a tumor is malignant, giving physicians high confidence in a cancer diagnosis. Or, it can predict with 80% probability that a stock price will fall between $50 and $60 next month, providing an estimated trading range. By delivering quantifiable confidence in its predictions, conformal prediction increases trust and is valuable in real-world applications such as medical diagnosis and financial forecasting.
The key benefit of conformal prediction is that it provides valid confidence measures for individual predictions. A conformal prediction model usually provides a prediction in the form of a prediction interval or prediction set with a specified confidence level, for example, 95%. In classification problems, conformal prediction can also calibrate class probabilities, enhancing confidence and informed decision-making.
In conformal prediction, “coverage” denotes the likelihood that the predicted region – whether a set of potential outcomes in classification tasks or a prediction interval in regression tasks – accurately encompasses the true values. Essentially, if you choose a coverage of 95%, it means there’s a 95% chance that the true values fall within the provided prediction set or interval.
We call such prediction regions “valid.” The requirement for validity is crucial to ensure that the model does not produce biased predictions, and it is especially important in consequential applications such as healthcare, finance, and self-driving cars. Valid predictions are a prerequisite for trust in the machine learning model that produced them.
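To make the notion of coverage concrete, here is a minimal sketch of how empirical coverage is checked. The data and interval bounds are made up purely for illustration; they do not come from a conformal predictor:

```python
import numpy as np

# Marginal coverage check: the fraction of points whose true value falls
# inside its prediction interval should be close to the chosen level (95%).
rng = np.random.default_rng(0)
y_true = rng.normal(size=10_000)

# Hypothetical prediction intervals: for standard normal data, an interval
# of +/-1.96 around 0 targets roughly 95% coverage.
lower = np.full_like(y_true, -1.96)
upper = np.full_like(y_true, 1.96)

covered = (y_true >= lower) & (y_true <= upper)
print(f"Empirical coverage: {covered.mean():.3f}")  # approximately 0.95
```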
There are alternative approaches to uncertainty quantification, such as Bayesian methods, Monte Carlo methods, and bootstrapping; however, to provide validity guarantees, these approaches require distribution-specific assumptions about the data – for example, an assumption that the data follows a normal distribution. The true underlying distribution of real-world data is generally unknown. Conformal prediction, by contrast, provides validity guarantees without making any assumptions about the specifics of the data distribution. This makes it more broadly applicable to real data that may not satisfy common statistical assumptions such as normality, smoothness, and so on.
In practice, the need for distribution-specific assumptions limits the ability of methods such as Bayesian inference or bootstrapping to make formally rigorous statements about arbitrary real data sources. There is no guarantee that predictions from such methods will have the claimed confidence level or coverage across all data types, since the assumptions may not hold. This can create a mismatch between the confidence level communicated to users and the actual coverage achieved, leading to inaccurate decisions and misleading users about the reliability of model predictions.
Conformal prediction sidesteps these issues by providing distribution-free finite sample validity guarantees without relying on hard-to-verify distributional assumptions about the data. This makes conformal prediction confidence estimates more trustworthy and robust for real-world applications.
Conformal prediction has multiple benefits:
Guaranteed coverage: Conformal prediction guarantees the validity of prediction regions automatically. Any method from the conformal prediction framework guarantees the validity of prediction regions by mathematical design. In comparison, alternative methods output predictions that do not provide any validity guarantees. By way of an example, the popular NGBoost package does not produce valid prediction intervals (you can read more about it at the following link: https://medium.com/@valeman/does-ngboost-work-evaluating-ngboost-against-critical-criteria-for-good-probabilistic-prediction-28c4871c1bab).

Distribution-free: Conformal prediction is distribution-free and can be applied to any data distribution, regardless of the properties of the distribution, as long as the data is exchangeable. Exchangeability means that the order or index of the data points does not matter – shuffling or permuting the data points will not change the overall data distribution. For example, exchangeability assumes that observations 1, 2, 3 have the same joint distribution as observations 2, 3, 1 or 3, 1, 2. This is a weaker assumption than IID and is all that is required to provide validity guarantees. Unlike many classical statistical models, conformal prediction does not assume that the data follows a normal distribution. The data can have any distribution, even one with irregularities such as fat tails. The only requirement is exchangeability. By relying only on exchangeability rather than strict distributional assumptions, conformal prediction provides finite sample guarantees on prediction coverage that are distribution-free and applicable to any data source.

Model-agnostic: Conformal prediction can be applied to any prediction model that produces point predictions in classification, regression, time series, computer vision, NLP, reinforcement learning, or other statistical, machine learning, and deep learning tasks. Conformal prediction has been successfully applied to many innovative model types, including recent innovations such as diffusion models and large language models (LLMs). The underlying model does not even have to be a statistical, machine learning, or deep learning model – it could be any model of any type, for example, a business heuristic developed by domain experts. If you have a model that makes point predictions, you can use conformal prediction as an uncertainty quantification layer on top of it to obtain a well-calibrated, reliable, and safe probabilistic prediction model.

Non-intrusive: Conformal prediction stands out in its simplicity and efficiency. Rather than overhauling your existing point prediction model, it seamlessly integrates with it. For businesses with established models already in production, this is a game changer. And for data scientists, the process is even more exciting: simply overlay your point prediction model with the uncertainty quantification layer provided by conformal prediction, and you’re equipped with a state-of-the-art probabilistic prediction model.

Dataset size: Conformal prediction stands apart from typical statistical methods that depend on stringent data distribution assumptions, such as normality, or need vast datasets for solid guarantees. It offers inherent mathematical assurances of valid predictions without bias, irrespective of the dataset’s size. While smaller datasets may yield broader prediction intervals in regression tasks (or larger sets in classification), conformal prediction remains consistently valid. The validity is assured no matter the dataset size, underlying prediction model, or data distribution, making it a unique and unmatched method for uncertainty quantification.

Easy to use: A few years back, the adoption of conformal prediction was limited by the scarcity of open source libraries, even though esteemed universities and major corporations such as Microsoft had been utilizing it for years. Fast forward to today, and the landscape has dramatically shifted. There is a rich selection of top-tier Python packages, such as MAPIE and Amazon Fortuna, among others. This means that generating well-calibrated probabilistic predictions via conformal prediction is just a few lines of code away (see the short MAPIE sketch later in this section), making it straightforward to integrate into business applications. Furthermore, platforms such as KNIME have democratized its use, offering conformal prediction through low-code or no-code solutions.

Fast: The most widely embraced conformal prediction variant, inductive conformal prediction, stands out because it operates efficiently without the need to retrain the foundational model. In contrast, other methods, such as Bayesian networks, often necessitate retraining. This distinction means that inductive conformal prediction offers a streamlined approach, eliminating the time and computational costs associated with repeated model retraining.

Non-intrusive: Unlike many uncertainty quantification techniques, conformal prediction integrates seamlessly without altering the underlying point prediction models. Its non-invasive nature is cost-effective and convenient, especially compared to other methods that demand potentially costly and complex adjustments to machine learning or deep learning models.

The benefits of using conformal prediction are truly incredible. You might be interested to know how conformal prediction achieves the unique and powerful benefits that it offers to its users.

The key objective of conformal prediction is to provide valid confidence measures that adapt based on the difficulty of making individual predictions. Conformal prediction uses “nonconformity measures” to assess how well new observations fit with previous observations.
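In practice, you rarely need to implement this machinery by hand. As a taste of the “few lines of code” mentioned in the Easy to use point above, here is a sketch using the classic MAPIE regression interface (MapieRegressor, with alpha passed to predict); the API has evolved across versions, so check the current MAPIE documentation:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from mapie.regression import MapieRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Wrap any scikit-learn compatible point-prediction model; MAPIE adds the
# conformal layer on top.
mapie = MapieRegressor(estimator=RandomForestRegressor(random_state=42), cv=5)
mapie.fit(X_train, y_train)

# alpha=0.05 requests 95% prediction intervals alongside point predictions.
y_pred, y_intervals = mapie.predict(X_test, alpha=0.05)
print(y_pred[:3])
print(y_intervals[:3, :, 0])  # lower and upper bounds for the first 3 points
```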
The overall workflow consists of the following steps:
1. A conformal predictor learns from past training examples to quantify uncertainty around predictions for new observations.
2. When quantifying uncertainty around predictions for new observations, it calculates nonconformity scores, measuring how different or “nonconforming” the new observation is compared to the training set (in the classical transductive version of conformal prediction) or the calibration set (in the most popular variant – inductive conformal prediction).
3. These nonconformity scores are used to determine whether the new observation falls within the range of values expected based on the training data.
4. The model calculates personalized confidence measures and prediction sets (in classification problems) or prediction intervals (in regression problems and time series forecasting) for each prediction.

The magic of conformal prediction lies in these nonconformity measures, which allow the model to evaluate each new prediction in the context of previously seen data. This simple but powerful approach results in finite sample coverage guarantees adapted to the intrinsic difficulty of making a given prediction. The validity holds for any data distribution, prediction algorithm, or dataset size.
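To make these steps concrete, here is a from-scratch sketch of the inductive (split) conformal workflow for regression. The dataset, model, and confidence level are illustrative choices:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=5, noise=15.0, random_state=0)

# Step 1: train the underlying model and hold out a calibration set.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_cal, X_test, y_cal, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)
model = LinearRegression().fit(X_train, y_train)

# Step 2: nonconformity scores on the calibration set – absolute residuals,
# that is, how far each calibration point is from the model's prediction.
cal_scores = np.abs(y_cal - model.predict(X_cal))

# Step 3: the conformal quantile of the calibration scores.
alpha = 0.05  # target 95% coverage
n = len(cal_scores)
q_hat = np.quantile(cal_scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Step 4: prediction intervals for new observations.
y_pred = model.predict(X_test)
lower, upper = y_pred - q_hat, y_pred + q_hat

coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"Empirical coverage: {coverage:.3f}")  # close to 0.95 by design
```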
In this book, we will talk interchangeably about nonconformity and conformity measures; one is the inverse of the other, and depending on the application, it will be more convenient to use either conformity or nonconformity measures.
A conformity measure is a critical component of conformal prediction and is essentially a function that assigns a numerical score (conformity score) to each object in a dataset. The conformity score indicates how well a new observation fits the observed data. When making a new prediction, we can use the conformity measure to calculate a conformity score for the new observation and compare it to the conformity scores of the previous observations. Based on this comparison, we can calculate a measure of confidence for our prediction. The conformity score indicates a degree of confidence in the prediction.
The choice of conformity measure is a key step in conformal prediction. The conformity measure determines how we assess how similar new observations are to past examples. There are many options for defining conformity measures depending on the problem.
In a classification setting, a simple conformity measure could calculate the probability scores assigned to each class by the prediction model for a new observation. The class with the highest probability would have the best conformity or match to the training data.
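As a sketch of this idea, the following code uses one minus the model’s predicted probability of a class as the nonconformity score (the inverse view of the conformity measure just described) and builds prediction sets from a calibration set. The dataset and classifier are illustrative:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.5, random_state=1, stratify=y)
X_cal, X_test, y_cal, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=1, stratify=y_rest)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Nonconformity score: 1 minus the probability assigned to the true class,
# so observations the model fits well receive low scores.
cal_probs = clf.predict_proba(X_cal)
cal_scores = 1.0 - cal_probs[np.arange(len(y_cal)), y_cal]

alpha = 0.1  # target 90% coverage
n = len(cal_scores)
q_hat = np.quantile(cal_scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Prediction set: every class whose nonconformity score is below the threshold.
test_probs = clf.predict_proba(X_test)
prediction_sets = test_probs >= (1.0 - q_hat)  # boolean array (n_test, n_classes)
print(prediction_sets[:5])
```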
The key advantage of conformal prediction is that we obtain valid prediction regions regardless of the conformity measure used. This is because conformal prediction relies only on the order induced by the conformity measure rather than its exact form.
So, we have the flexibility to incorporate domain knowledge when designing an appropriate conformity measure for the problem at hand. As long as the measure ranks how well new observations match past data, conformal prediction can be used to deliver finite sample coverage guarantees.
While all conformal predictors provide valid prediction regions, the choice of conformity measure impacts their efficiency. Efficiency relates to the width of the prediction intervals or sets – narrower intervals contain more valuable information for decision-making.
Though validity holds for any conformity measure, thoughtfully choosing one tailored to the application can improve efficiency and produce narrower, more useful prediction intervals. The intervals should also be adaptable based on the model’s uncertainty – expanding for difficult predictions and contracting for clear ones.
Let’s illustrate this with an example. Say we have a dataset of patients diagnosed with a disease, with features such as age, gender, and test results. We want to predict whether new patients are at risk.
A simple conformity measure could calculate how similar the feature values are between new patients and those in the training data. New patients very different from the data would get low conformity scores and wide prediction intervals, indicating high uncertainty. While this conformity measure would produce valid intervals, we can improve efficiency with a more tailored approach.
By carefully selecting conformity measures aligned to our prediction problem and domain knowledge, we can obtain high-quality conformal predictors that provide both validity and efficiency.
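One common way to obtain such adaptive, efficient intervals is to normalize the nonconformity scores by an estimate of the local error size, so intervals widen for hard inputs and narrow for easy ones. Here is a sketch of that idea on synthetic heteroscedastic data; in a production setting, the residual model would be fitted on out-of-fold residuals rather than training residuals:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
# Heteroscedastic data: the noise grows with x, so uncertainty varies by input.
X = rng.uniform(0, 10, size=(3000, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1 + 0.1 * X[:, 0])

X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.5, random_state=7)
X_cal, X_test, y_cal, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=7)

model = RandomForestRegressor(random_state=7).fit(X_train, y_train)
# A second model estimates the typical residual size at each input
# (simplified here: fitted on training residuals for brevity).
resid_model = RandomForestRegressor(random_state=7).fit(
    X_train, np.abs(y_train - model.predict(X_train)))

alpha, beta = 0.1, 1e-2  # beta stabilizes the normalization
sigma_cal = resid_model.predict(X_cal) + beta
scores = np.abs(y_cal - model.predict(X_cal)) / sigma_cal
n = len(scores)
q_hat = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Intervals widen where the residual model expects large errors.
sigma_test = resid_model.predict(X_test) + beta
y_pred = model.predict(X_test)
lower, upper = y_pred - q_hat * sigma_test, y_pred + q_hat * sigma_test
print(f"Coverage: {np.mean((y_test >= lower) & (y_test <= upper)):.3f}")
print(f"Mean interval width: {np.mean(upper - lower):.3f}")
```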
We will now talk briefly about the origins of conformal prediction.
The origins of conformal prediction are documented in A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification by Anastasios N. Angelopoulos and Stephen Bates (https://arxiv.org/abs/2107.07511).
Note
Conformal prediction was invented by my PhD supervisor, Prof. Vladimir Vovk of Royal Holloway, University of London. Vladimir Vovk graduated from Moscow State University, where he studied mathematics and became a student of one of the most notable mathematicians of the 20th century, Andrey Kolmogorov. It was during this time that the initial ideas that later gave rise to conformal prediction appeared.
The first edition of Algorithmic Learning in a Random World (https://link.springer.com/book/10.1007/b106715) by Vladimir Vovk, Alexander Gammerman, and Glenn Shafer was published in 2005. The second edition of the book was published in 2022 (https://link.springer.com/book/10.1007/978-3-031-06649-8).
Conformal prediction was popularized in United States academia by Professor Larry Wasserman (Carnegie Mellon) and his collaborators, who have published some key papers and introduced conformal prediction to many other researchers in the United States.
Note
In 2022, I finished my PhD in machine learning. In the same year, I created Awesome Conformal Prediction (https://github.com/valeman/awesome-conformal-prediction) – the most comprehensive professionally curated resource on conformal prediction, which has since received thousands of GitHub stars.
Conformal prediction has grown rapidly from a niche research area into a mainstream framework for uncertainty quantification. The field has exploded in recent years, with over 1,000 research papers on conformal prediction estimated to be published in 2023 alone.
This surge of research reflects the increasing popularity and applicability of conformal prediction across academia and industry. Major technology companies such as Microsoft, Amazon, DeepMind, and NVIDIA now conduct research into and apply conformal prediction. The framework has also been adopted in high-stakes domains such as healthcare and finance, where validity and reliability are critical.
In just over two decades since its introduction, conformal prediction has cemented itself as one of the premier and most trusted approaches for uncertainty quantification in machine learning. The field will continue to expand as more practitioners recognize the value of conformal prediction’s finite sample guarantees compared to traditional statistical methods reliant on asymptotic theory and unverifiable distributional assumptions. With growing research and adoption, conformal prediction is poised to become a standard tool for any application requiring rigorous uncertainty estimates alongside point predictions.
At NeurIPS 2022, one of the prominent mathematicians of our time, Emmanuel Candes (Stanford), delivered a key invited talk titled Conformal Prediction in 2022 (https://slideslive.com/38996063/conformal-prediction-in-2022?ref=speaker-43789) to tens of thousands of attendees. In his talk, Emmanuel Candes said:
Conformal inference methods are becoming all the rage in academia and industry alike. In a nutshell, these methods deliver exact prediction intervals for future observations without making any distributional assumption whatsoever other than having IID, and more generally, exchangeable data.
For years, I have promoted conformal prediction as the premier framework for reliable probabilistic predictions. Excitingly, over the last 2-3 years, there has been an explosion of interest in and adoption of conformal prediction, including by major tech leaders such as Amazon, Microsoft, Google, and DeepMind. Many universities and companies are researching conformal prediction, actively developing real-world applications, and releasing open source libraries such as MAPIE and Amazon Fortuna.
