Product managers working with artificial intelligence will be able to put their knowledge to work with this practical guide to applied AI. This book covers everything you need to know to drive product development and growth in the AI industry. From understanding AI and machine learning to developing and launching AI products, it provides the strategies, techniques, and tools you need to succeed.
The first part of the book focuses on establishing a foundation of the concepts most relevant to maintaining AI pipelines. The next part focuses on building an AI-native product, and the final part guides you in integrating AI into existing products.
You’ll learn about the types of AI, how to integrate AI into a product or business, and the infrastructure to support the exhaustive and ambitious endeavor of creating AI products or integrating AI into existing products. You’ll gain practical knowledge of managing AI product development processes, evaluating and optimizing AI models, and navigating complex ethical and legal considerations associated with AI products. With the help of real-world examples and case studies, you’ll stay ahead of the curve in the rapidly evolving field of AI and ML.
By the end of this book, you’ll understand how to navigate the world of AI from a product perspective.
You can read this e-book in Legimi apps or in any app that supports the following format:
Page count: 521
Publication year: 2023
Develop a product that takes advantage of machine learning to solve AI problems
Irene Bratsis
BIRMINGHAM—MUMBAI
Copyright © 2023 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Publishing Product Manager: Dinesh Chaudhary
Senior Editor: Tazeen Shaikh
Technical Editor: Rahul Limbachiya
Copy Editor: Safis Editing
Project Coordinator: Farheen Fathima
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Production Designer: Alishon Mendonca
Marketing Coordinators: Shifa Ansari and Vinishka Kalra
First published: February 2023
Production reference: 2230223
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80461-293-4
www.packtpub.com
For those courageous enough to believe they deserve their heart’s desires… evermore.
– Irene Bratsis
Irene Bratsis is a director of digital product and data at the International WELL Building Institute (IWBI). She has a bachelor's in economics, and after completing various MOOCs in data science and big data analytics, she completed a data science program with Thinkful. Before joining IWBI, Irene worked as an operations analyst at Tesla, a data scientist at Gesture, a data product manager at Beekin, and head of product at Tenacity. Irene volunteers as NYC chapter co-lead for Women in Data, has coordinated various AI accelerators, moderated countless events with a speaker series with Women in AI called WaiTalk, and runs a monthly book club focused on data and AI books.
Akshat Gurnani is a highly qualified individual with a background in the field of computer science and machine learning. He has a master’s degree in computer science and a deep understanding of various machine learning techniques and algorithms. He has experience working on various projects related to natural language processing, computer vision, and deep learning. He has also published several research papers in top-tier journals and conferences and has a proven track record in the field. He has a passion for keeping up to date with the latest developments in his field and a strong desire to continue learning and contributing to the field of artificial intelligence.
It’s hard to come across anyone that doesn’t have strong opinions and reactions about AI these days. I’ve witnessed my own feelings and conclusions about it ebb and flow as the years have gone on. When I was a student, I felt a tremendous amount of excitement and optimism about where AI, and the fourth industrial revolution that accompanies it, would take us. That was quickly tempered when I started my book club, and I started a monthly practice of reading books about how bias and dependence on AI were compromising our lives in seen and unseen ways. Then, I started moderating events, where I brought together people from virtually every corner of AI and machine learning, who spoke not just on how they’re leveraging this technology in their own work but on their own beliefs about how AI will impact us in the future.
This brings us to one of the greatest debates we find ourselves returning to with every major advancement in technology. Do we dare adopt powerful technology even when we’re aware of the risks? As far as I see it, we don’t have a choice, and the debate is only an illusion we indulge ourselves in. AI is here to stay, and nihilistic fears about it won’t save us from any harm it may cause. Pandora’s box is open, and as we peer into what remains of it, we find that hope springs eternal.
AI is holding up a mirror to our biases and inequalities, and so far, it’s not a flattering reflection. It’s my hope that, with time, we will learn how to adopt AI responsibly in order to minimize its harm and optimize its greatest contributions to our modern civilization. I wanted to write a book about AI product management because it’s the makers of products that bring nebulous ideas into the “real” world. Getting into the details about how to ideate, build, manage, and maintain AI products with integrity, to the best of my ability, is the greatest contribution I can make to this field at this present moment. It’s been an honor to write this book.
This book is for people who aspire to be AI product managers, AI technologists, and entrepreneurs, or for people who are casually interested in the considerations of bringing AI products to life. It should serve you if you’re already working in product management and you have a curiosity about building AI products. It should also serve you if you already work in AI development in some capacity and you’re looking to bring those concepts into the discipline of product management and adopt a more business-oriented role. While some chapters in the book are more technically focused, all of the technical content in the book can be considered beginner level and accessible to all.
Chapter 1, Understanding the Infrastructure and Tools for Building AI Products, offers an overview of the main concepts and areas of infrastructure for managing AI products.
Chapter 2, Model Development and Maintenance for AI Products, delves into the nuances of model development and maintenance.
Chapter 3, Machine Learning and Deep Learning Deep Dive, is a broader discussion of the difference between traditional machine learning and deep learning algorithms and their use cases.
Chapter 4, Commercializing AI Products, discusses the major areas of AI products we see in the market, as well as examples of the ethics and success factors that contribute to commercialization.
Chapter 5, AI Transformation and Its Impact on Product Management, explores the ways AI can be incorporated into the major market sectors in the future.
Chapter 6, Understanding the AI-Native Product, gives an overview of the strategies, processes, and team building needed to empower the success of an AI-native product.
Chapter 7, Productizing the ML Service, is an exploration of the trials and tribulations that may come up when building an AI product from scratch.
Chapter 8, Customization for Verticals, Customers, and Peer Groups, is a discussion on how AI products change and evolve over various types of verticals, customer types, and peer groups.
Chapter 9, Macro and Micro AI for Your Product, gives an overview of the various ways you can leverage AI in ways big and small, as well as some of the most successful examples and common mistakes.
Chapter 10, Benchmarking Performance, Growth Hacking, and Cost, explains the benchmarking needed to gauge product success at the product level rather than the model performance level.
Chapter 11, The Rising Tide of AI, is a revisit to the concept of the fourth industrial revolution and a blueprint for products that don’t currently leverage AI.
Chapter 12, Trends and Insights across Industry, dives into the various ways we’re seeing AI trending across industries, based on prominent and respected research organizations.
Chapter 13, Evolving Products into AI Products, is a practical guide on how to deliver AI features and upgrade the existing logic of products to successfully update products for AI commercial success.
The text conventions used throughout this book are as follows:
Tips or important notes
Appear like this.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Once you’ve read The AI Product Manager’s Handbook, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
Thanks for purchasing this book!
Do you like to read on the go but are unable to carry your print books everywhere?
Is your eBook purchase not compatible with the device of your choice?
Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.
Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.
The perks don’t stop there, you can get exclusive access to discounts, newsletters, and great free content in your inbox daily.
Follow these simple steps to get the benefits:
Scan the QR code or visit the link below:

https://packt.link/free-ebook/9781804612934
Submit your proof of purchase

That’s it! We’ll send your free PDF and other benefits to your email directly.

An AI product manager needs to have a comprehensive understanding of AI, along with all the varied components that lead to its success, if they’re going to be successful in commercializing their products.
This first part consists of five cumulative chapters that will cover what the term AI encompasses and how to support infrastructure to make it successful within your organization. It will also cover how to support your AI program from a maintenance perspective, how to navigate the vast areas of machine learning (ML) and deep learning (DL) and choose the best path for your product, and how to understand current and future developments in AI products.
By the end of this part, you will understand AI terms and components, what an AI implementation means from an investment perspective, how to maintain AI products sustainably, and how to choose between the types of AI that would best fit your product and market. You will also learn how to understand success factors for ideating and building a minimal viable product (MVP), and how to make a product that truly serves its market.
This part comprises the following chapters:
Chapter 1, Understanding the Infrastructure and Tools for Building AI Products
Chapter 2, Model Development and Maintenance for AI Products
Chapter 3, Machine Learning and Deep Learning Deep Dive
Chapter 4, Commercializing AI Products
Chapter 5, AI Transformation and Its Impact on Product Management

Laying a solid foundation is an essential part of understanding anything, and the frontier of artificial intelligence (AI) products seems a lot like our universe: ever-expanding. That rate of expansion is increasing with every passing year as we go deeper into a new way to conceptualize products, organizations, and the industries we’re all a part of. Virtually every aspect of our lives will be impacted in some way by AI and we hope those reading will come out of this experience more confident about what AI adoption will look like for the products they support or hope to build someday.
Part 1 of this book will serve as an overview of the lay of the land. We will cover terms, infrastructure, types of AI algorithms, and products done well, and by the end of this section, you will understand the various considerations when attempting to build an AI strategy, whether you’re looking to create a native-AI product or add AI features to an existing product.
Managing AI products is a highly iterative process, and the work of a product manager is to help your organization discover what the best combination of infrastructure, training, and deployment workflow is to maximize success in your target market. The performance and success of AI products lie in understanding the infrastructure needed for managing AI pipelines, the outputs of which will then be integrated into a product. In this chapter, we will cover everything from databases to workbenches to deployment strategies to tools you can use to manage your AI projects, as well as how to gauge your product’s efficacy.
This chapter will serve as a high-level overview of the subsequent chapters in Part 1 but it will foremost allow for a definition of terms, which are quite hard to come by in today’s marketing-heavy AI competitive landscape. These days, it feels like every product is an AI product, and marketing departments are trigger-happy with sprinkling that term around, rendering it almost useless as a descriptor. We suspect this won’t be changing anytime soon, but the more fluency consumers and customers alike have with the capabilities and specifics of AI, machine learning (ML), and data science, the more we should see clarity about how products are built and optimized. Understanding the context of AI is important for anyone considering building or supporting an AI product.
In this chapter, we will cover the following topics:
Definitions – what is and is not AI
ML versus DL – understanding the difference
Learning types in ML
The order – what is the optimal flow and where does every part of the process live?
DB 101 – databases, warehouses, data lakes, and lakehouses
Managing projects – IaaS
Deployment strategies – what do we do with these outputs?
Succeeding in AI – how well-managed AI companies do infrastructure right
The promise of AI – where is AI taking us?

In 1950, the mathematician and World War II hero Alan Turing asked a simple question in his paper Computing Machinery and Intelligence: can machines think? Today, we’re still grappling with that same question. Depending on who you ask, AI can be many things. Many maps exist out there on the internet, from expert systems used in healthcare and finance to facial recognition to natural language processing to regression models. As we continue with this chapter, we will cover many of the facets of AI that apply to products emerging in the market.
For the purposes of applied AI in products across industries, in this book, we will focus primarily on ML and deep learning (DL) models used in various capacities because these are often used in production anywhere AI is referenced in any marketing capacity. We will use AI/ML as a blanket term covering a span of ML applications and we will cover the major areas most people would consider ML, such as DL, computer vision, natural language processing, and facial recognition. These are the methods of applied AI that most people will come across in the industry, and familiarity with these applications will serve any product manager looking to break into AI. If anything, we’d like to help anyone who’s looking to expand into the field from another product management background to choose which area of AI appeals to them most.
We’d also like to cover what is and what isn’t ML. The best way for us to express it as simply as we can is: if a machine is learning from some past behavior and if its success rate is improving as a result of this learning, it is ML! Learning is the active element. No models are perfect but we do learn a lot from employing models. Every model will have some element of hyperparameter tuning, and the use of each model will yield certain results in performance. Data scientists and ML engineers working with these models will be able to benchmark performance and see how performance is improving. If there are fixed, hardcoded rules that don’t change, it’s not ML.
AI is a subset of computer science, and all programmers are effectively doing just that: giving computers a set of instructions to fire away on. If your current program doesn’t learn from the past in any way, if it simply executes on directives it was hardcoded with, we can’t call it ML. You may have heard the terms rules-based engine or expert system thrown around in other programs. These are considered forms of AI, but they’re not ML: the rules effectively replicate the work of a person, and the system itself is not learning or changing on its own.
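To make the distinction concrete, here’s a toy sketch in plain Python. The keyword rules, the word-scoring scheme, and the example subjects are all made up for illustration: the first function executes fixed, hand-written logic (AI of a sort, but not ML), while even this very simple learner derives its parameters from labeled past examples, so adding new training data changes its future predictions.

```python
# A rules-based "spam" check: fixed, hardcoded logic -- a form of AI, but not ML.
def rules_based_spam(subject: str) -> bool:
    banned = {"winner", "free", "prize"}  # rules written by a person; they never change
    return any(word in subject.lower() for word in banned)

# A toy learner: its word weights come from labeled past examples,
# so retraining on new data changes future predictions -- that's the ML part.
def train_spam_model(examples):
    """examples: list of (subject, is_spam) pairs."""
    weights = {}
    for subject, is_spam in examples:
        for word in subject.lower().split():
            weights[word] = weights.get(word, 0) + (1 if is_spam else -1)
    return weights

def learned_spam(weights, subject: str) -> bool:
    score = sum(weights.get(w, 0) for w in subject.lower().split())
    return score > 0

# Hypothetical "historical data points" the model learns its baseline from
history = [("free prize inside", True), ("meeting notes", False), ("free lunch today", False)]
model = train_spam_model(history)
```

Notice that the rules engine will behave identically forever, while appending one more `(subject, is_spam)` pair to `history` and retraining shifts every future score: learning is the active element.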
We find ourselves in a tricky time in AI adoption where it can be very difficult to find information online about what makes a product AI. Marketing is eager to add the AI label to their products but there still isn’t a baseline of explainability with what that means out in the market. This further confuses the term AI for consumers and technologists alike. If you’re confused by the terms, particularly when they’re applied to products you see promoted online, you’re very much not alone.
Another area of confusion is the general term AI itself. For most people, the concept of AI brings to mind the Terminator franchise from the 1980s and other futurist depictions of inescapable technological destruction. While a lot of harm certainly can come from AI, this depiction represents what’s referred to as strong AI or artificial general intelligence (AGI). We still have a long way to go before anything like AGI, but we’ve got plenty of what’s referred to as artificial narrow intelligence (ANI), or narrow AI.
ANI is also commonly referred to as weak AI and is what’s generally meant when you see AI plastered all over products you find online. ANI is exactly what it sounds like: a narrow application of AI. Maybe it’s good at talking to you, at predicting some future value, or at organizing things; maybe it’s an expert at that, but its expertise won’t bleed into other areas. If it could, it would stop being ANI. These major areas of AI are referred to as strong and weak in comparison to human intelligence. Even the most convincing conversational AIs out there, and they are quite convincing, are demonstrating an illusory intelligence. Effectively, all AI that exists at the moment is weak AI, or ANI. Our Terminator days are still firmly in our future, perhaps never to be realized.
For every person out there that’s come across Reddit threads about AI being sentient or somehow having ill will toward us, we want to make the following statement very clear. AGI does not exist and there is no such thing as sentient AI. This does not mean AI doesn’t actively and routinely cause humans harm, even in its current form. The major caveat here is that unethical, haphazard applications of AI already actively cause us both minor inconveniences and major upsets. Building AI ethically and responsibly is still a work in progress. While AI systems may not be sentiently plotting the downfall of humanity, when they’re left untested, improperly managed, and inadequately vetted for bias, the applications of ANI that are deployed already have the capacity to do real damage in our lives.
For now, can machines think like us? No, they don’t think like us. Will they someday? We hope not. It’s my personal opinion that the insufferable aspects of the human condition end with us. But we do very much believe that some of our greatest ails, as well as our wildest curiosities, will be considerably impacted by the benevolence of AI and ML.
As a product manager, you’re going to need to build a lot of trust with your technical counterparts so that, together, you can build an amazing product that works as well as it can technically. If you’re reading this book, you’ve likely come across the terms ML and DL. We will use the following sections titled ML and DL to go over some of the basics, but keep in mind that we will be elaborating on these concepts further in Chapter 3.
In its basic form, ML is made up of two essential components: the models used and the training data it’s learning from. These data are historical data points that effectively teach machines a baseline foundation from which to learn, and every time you retrain the models, the models are theoretically improving. How the models are chosen, built, tuned, and maintained for optimized performance is the work of data scientists and ML engineers. Using this knowledge of performance toward the optimization of the product experience itself is the work of product managers. If you’re working in the field of AI product management, you’re working incredibly closely with your data science and ML teams.
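As a minimal sketch of that train-and-retrain loop, here is a closed-form one-variable linear fit standing in for a real model; the data points are invented for illustration:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b (closed form, one variable)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

def mean_abs_error(model, xs, ys):
    """Benchmark performance: average distance between predictions and truth."""
    a, b = model
    return sum(abs((a * x + b) - y) for x, y in zip(xs, ys)) / len(xs)

# Historical data points teach the model its baseline...
history_x, history_y = [1, 2, 3], [2.1, 3.9, 6.2]
model = fit_line(history_x, history_y)

# ...and each retrain folds newly observed points back into the history.
history_x += [4, 5]
history_y += [8.1, 9.9]
model = fit_line(history_x, history_y)  # the "retraining" step
```

Production retraining pipelines automate exactly this loop at scale: accumulate new labeled data, refit, benchmark the error, and promote the new model if it improves.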
We’d like to also make a distinction about the folks you’ll be working with as an AI product manager. Depending on your organization, you’re either working with data scientists and developers to deploy ML or you’re working with ML engineers who can both train and upkeep the models as well as deploy them into production. We highly suggest maintaining strong relationships with any and all of these impacted teams, along with DevOps.
All ML models can be grouped into the following four major learning categories:
Supervised learning
Unsupervised learning
Semi-supervised learning
Reinforcement learning

These are the four major areas of ML, and each area is going to have its particular models and algorithms that are used in each specialization. The learning type has to do with whether or not you’re labeling the data and the method you’re using to reward the models you’ve used for good performance. These learning types are relevant whether your product is using a DL model or not, so they’re inclusive of all ML models. We will be covering the learning types in more depth in the following section titled Learning types in ML.
DL is a subset of ML, but the terms are often used colloquially as almost separate expressions. The reason for this is DL is based on neural network algorithms and ML can be thought of as… the rest of the algorithms. In the preceding section covering ML, we looked at the process of taking data, using it to train our models, and using that trained model to predict new future data points. Every time you use the model, you see how off it was from the correct answer by getting some understanding of the rate of error so you can iterate back and forth until you have a model that works well enough. Every time, you are creating a model based on data that has certain patterns or features.
This process is the same in DL, but one of the key differences of DL is that patterns or features in your data are largely picked up by the DL algorithm itself through what’s referred to as feature learning, via a hierarchical layered system, rather than being hand-crafted through manual feature engineering. We will go into the various algorithms that are used in the following section because there are a few nuances between each, but as you continue developing your understanding of the types of ML out there, you’ll also start to group the various models that make up these major areas of AI (ML and DL). For marketing purposes, you will for the most part see terms such as ML, DL/neural networks, or just the general umbrella term of AI referenced where DL algorithms are used.
It’s important to know the difference between what these terms mean in practice and at the model level and how they’re communicated by non-technical stakeholders. As product managers, we are toeing the line between the two worlds: what engineering is building and what marketing is communicating. Anytime you’ve heard the term black box model, it’s referring to a neural network model, which is DL. The reason for this is that DL engineers often can’t determine how their models arrive at certain conclusions, which creates an opaque view of what the model is doing. This opacity is double-sided, affecting both the engineers and technologists themselves and the customers and users downstream who experience the effects of these models without knowing how they make certain determinations. DL neural networks mimic the structure of the way humans think using a variety of layers of neural networks.
For product managers, DL poses a concern for explainability because there’s very little we can understand about how and why a model is arriving at conclusions, and, depending on the context of your product, the importance of explainability could vary. Another inherent challenge is that these models essentially learn autonomously: they aren’t waiting for their engineer to choose the features that are most relevant in the data; the neural networks themselves do the feature selection, so the model learns with very little input from an engineer. Think of the models as the what and the following section on learning types as the how. A quick reminder that as we move on to cover the learning styles (whether a model is used in a supervised, unsupervised, semi-supervised, or reinforcement learning capacity), these learning styles apply to both DL and traditional ML models.
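To make the “layers” concrete before moving on, here is a from-scratch toy: a tiny 2-2-1 sigmoid network trained on XOR, a problem no single-layer model can solve. The hidden layer learns its own intermediate features rather than being told which features matter. This is purely illustrative; real DL work uses frameworks, not hand-rolled loops like this one.

```python
import math
import random

def train_xor_mlp(steps=2000, lr=0.5, seed=0):
    """Train a toy 2-2-1 sigmoid network on XOR with plain SGD.
    Returns (first_epoch_loss, last_epoch_loss)."""
    rng = random.Random(seed)
    # Layer 1 learns intermediate features of the inputs; layer 2 combines them.
    w1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
    b1 = [0.0, 0.0]
    w2 = [rng.uniform(-1, 1) for _ in range(2)]
    b2 = 0.0
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    first = last = None
    for _ in range(steps):
        epoch_loss = 0.0
        for (x0, x1), t in data:
            # Forward pass through the hierarchical layers
            h = [sig(w1[j][0] * x0 + w1[j][1] * x1 + b1[j]) for j in range(2)]
            y = sig(w2[0] * h[0] + w2[1] * h[1] + b2)
            epoch_loss += (y - t) ** 2
            # Backpropagation: chain rule through output, then hidden, layers
            dy = 2 * (y - t) * y * (1 - y)
            for j in range(2):
                dh = dy * w2[j] * h[j] * (1 - h[j])
                w2[j] -= lr * dy * h[j]
                w1[j][0] -= lr * dh * x0
                w1[j][1] -= lr * dh * x1
                b1[j] -= lr * dh
            b2 -= lr * dy
        if first is None:
            first = epoch_loss
        last = epoch_loss
    return first, last
```

Note the black-box quality even at this tiny scale: the trained weights in `w1` and `w2` reduce the error, but nothing about their values tells you in human terms why the network answers the way it does.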
Let’s look at the different learning types in ML.
In this section, we will cover the differences between supervised, unsupervised, semi-supervised, and reinforcement learning and how all these learning types can be applied. Again, the learning type has to do with whether or not you’re labeling the data and the method you’re using to reward the models you’ve used for good performance. The ultimate objective is to understand what kind of learning model gets you the kind of performance and explainability you’re going to need when considering whether or not to use it in your product.
If humans are labeling the data and the machine is looking to also correctly label current or future data points, it’s supervised learning. Because we humans know the answer the machines are trying to arrive at, we can see how off they are from finding the correct answer, and we continue this process of training the models and retraining them until we find a level of accuracy that we’re happy with.
Applications of supervised learning models include classification models that are looking to categorize data in the way spam filters do or regression models that are looking for relationships between variables in order to predict future events and find trends. Keep in mind that all models will only work to a certain point, which is why they require constant training and updating and AI teams are often using ensemble modeling or will try various models and choose the best-performing one. It won’t be perfect either way, but with enough hand-holding, it will take you closer and closer to the truth.
The following is a list of common supervised learning models/algorithms you’ll likely use in production for various products:
Naive Bayes classifier: This algorithm naively considers every feature in your dataset as its own independent variable. So, it’s essentially trying to find associations probabilistically without having any assumptions about the data. It’s one of the simpler algorithms out there, and its simplicity is actually what makes it so successful with classification. It’s commonly used for binary values, such as trying to decipher whether or not something is spam.

Support vector machine (SVM): This algorithm is also largely used for classification problems and will essentially try to split your dataset into two classes so that you can use it to group your data and try to predict where future data points will land along these major splits. If you’re not seeing compelling groups in the data, SVMs allow you to add more dimensions to be able to see groupings more easily.

Linear regression models: These have been around since the 1950s, and they’re the simplest models we have for regression problems such as predicting future data points. They essentially use one or more variables in your dataset to predict your dependent variable. The linear part of this model is trying to find the best line to fit your data, and this line is what dictates how it predicts. Here, we once again see a relatively simple model being heavily used because of how versatile and dependable it is.

Logistic regression: This model works a lot like linear regression in that you have independent and dependent variables, but it’s not predicting a numerical value; it’s predicting a future binary categorical state, such as whether or not someone might default on a loan in the future, for instance.

Decision trees: This algorithm works well for predicting something categorical as well as something numerical, so it’s used for both kinds of ML problems, such as predicting a future state or a future price. This versatility is less common, so decision trees are often used for both kinds of problems, which has contributed to their popularity. The comparison to a tree comes from the nodes and branches that effectively function like a flow chart. The model learns from the flow of past data to predict future values.

Random forest: This algorithm builds on decision trees and is also used for both categorical and numerical problems. The way it works is that it splits the data into different random samples, creates decision trees for each sample, and then takes an average or majority vote for its predictions (depending on whether you’re using it for categorical or numerical predictions). It’s hard to understand how a random forest comes to its conclusions, so if interpretability isn’t high on your list of concerns, you can use it.

K-nearest neighbors (KNN): This algorithm works on both categorical and numerical predictions, so it’s looking for a future state, and it offers results in groups. The number of neighbors considered is set by the engineer/data scientist, and the model works by grouping the data, determining the characteristics a data point shares with its neighbors, and giving its best guess for future values based on those neighbors.

Now that we’ve covered supervised learning, let’s discuss unsupervised learning next.
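Of the models above, KNN’s mechanics fit in a few lines, so here is a toy sketch in plain Python. The 2-D points, labels, and query values are made up for illustration; a real product would use a library implementation rather than this.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of ((x, y), label) pairs.
    Classify `query` by majority vote of its k nearest neighbors."""
    dist = lambda a, b: (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2  # squared Euclidean distance
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy labeled data: two loose groups of points
train = [((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
         ((8, 8), "b"), ((8, 9), "b"), ((9, 8), "b")]
```

The hyperparameter `k` is exactly the engineer-chosen “number of neighbors” mentioned above: a query near the first group inherits label "a" from its neighbors, and one near the second inherits "b".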
If the data is unlabeled and we’re using machines to label the data and find patterns we don’t yet know of, it’s unsupervised. Effectively, we humans either know the right answer or we don’t, and that’s how we decipher which camp the ML algorithms belong to. As you might imagine, we take the results of unsupervised learning models with some hesitancy because a model may be finding an organization of the data that isn’t actually helpful or accurate. Unsupervised learning models also require large amounts of data to train on because the results can be wildly inaccurate if a model is trying to find patterns in a small data sample. As the model ingests more and more data, its performance will improve and become more refined over time, but once again, there is no correct answer.
Applications of unsupervised learning models include clustering and dimensionality reduction. Clustering models segment or group data into certain areas. These can be used for things such as looking for patterns in medical trials or drug discovery, for instance, because you're looking for connections and groups in the data where there might not already be obvious answers. Dimensionality reduction removes the features in your dataset that contribute least to the performance you're looking for, simplifying your data so that the most important features drive performance and real signals are separated from the noise.
The following is a list of common unsupervised learning models/algorithms you’ll likely use in production for various products:
- K-means clustering: This algorithm groups data points together so that you can better see patterns (or clusters). Because this is unsupervised learning, the model has to find patterns on its own; it's not given any information (or supervision) to go off from the engineer using it. Note that the number of clusters, k, is a hyperparameter: you will need to choose what number of clusters is optimal.
- Principal component analysis (PCA): Often, the largest problem with using unsupervised ML on very large datasets is that there are too many redundant, correlated features to find meaningful patterns. This is why PCA is used so often: it's a great way to reduce dimensionality without losing much meaningful information. This is especially useful for massive datasets, such as finding patterns in genome sequencing or drug discovery trials.

Next, let's jump into semi-supervised learning.
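The mechanics of k-means can be sketched in a few lines of pure Python (Lloyd's algorithm). The data points are invented, and the initial centroids are supplied explicitly here to keep the example deterministic; real implementations pick them randomly and run multiple restarts:

```python
import math

def kmeans(points, centroids, iters=20):
    """Lloyd's algorithm sketch: repeatedly assign each point to its nearest
    centroid, then move each centroid to the mean of its assigned points."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            i = min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            clusters[i].append(p)
        centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

# Two well-separated groups of 2-D points, and k = 2 starting guesses
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (1, 1)])
```

After a few iterations, the two centroids settle in the middle of the two natural groups, which is exactly the "finding patterns on its own" behavior described above.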
In a perfect world, we'd have massive, well-labeled datasets with which to create optimal models that don't overfit. Overfitting is when you create and tune a model to the dataset you have and it fits a bit too well, meaning it's optimized for that particular dataset and doesn't work well with more diverse data; this is a common problem in data science. We live in an imperfect world, and we can find ourselves in situations where we don't have enough labeled data, or enough data at all. This is where semi-supervised learning comes in handy. We give the model some labeled data and also include a dataset that is unlabeled, essentially giving the model nudges in the right direction as it tries to find patterns on its own.
It doesn't quite have the same level of absolute truth associated with supervised learning, but it does offer the model some helpful clues with which to organize its results so that it can find an easier path to the right answer.
For instance, let's say you're building a model to detect patterns in photos or speech. You might label a few examples and then see how performance improves over time on the examples you don't label. You can use multiple models in semi-supervised learning. The process is a lot like supervised learning, which learns from labeled datasets so that the model knows exactly how far off it is from being correct. The main difference is that in semi-supervised learning, the model predicts labels for a portion of the unlabeled data, essentially checks those predictions against what it has learned from the labeled data, and then adds the confidently labeled new data points into the training set so that it keeps training on the data it has most likely gotten correct.
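One common way to implement this loop is self-training: pseudo-label the unlabeled points the model is most confident about and fold them back into the training set. The following pure-Python sketch uses a 1-nearest-neighbor "model" with a distance threshold standing in for confidence; the seed labels and points are invented for the example:

```python
import math

def self_train(labeled, unlabeled, max_dist=2.0, rounds=3):
    """Self-training sketch: pseudo-label each unlabeled point with the label
    of its nearest labeled neighbor, but only when that neighbor is within
    `max_dist` (a crude confidence proxy); repeat as the labeled set grows."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    for _ in range(rounds):
        deferred = []
        for p in remaining:
            nearest = min(labeled, key=lambda pair: math.dist(pair[0], p))
            if math.dist(nearest[0], p) <= max_dist:
                labeled.append((p, nearest[1]))  # confident: pseudo-label it
            else:
                deferred.append(p)               # not confident yet: wait
        if len(deferred) == len(remaining):
            break                                # no progress; stop early
        remaining = deferred
    return labeled, remaining

seeds = [((0, 0), "a"), ((10, 0), "b")]          # the few hand-labeled points
labeled, leftover = self_train(seeds, [(2, 0), (4, 0), (8, 0)])
```

Notice how the labels propagate: (2, 0) is labeled from the seed at (0, 0), and (4, 0) is then labeled from the freshly pseudo-labeled (2, 0), which is exactly the "nudges in the right direction" described above.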
Finally, to wrap up this section, let’s take a brief look at reinforcement learning.
This area of ML effectively learns through trial and error, so it's learning from past behavior and adapting its approach to find the best performance by itself. There's a sequence to reinforcement learning, and it's really a system based on rewards (and their weights) that reinforce correct results. Eventually, the model tries to optimize for these rewards and gets better with time. We see reinforcement learning used a lot in robotics, for instance, where robots are trained to understand how to operate within and adjust to the parameters of the real world, with all its unpredictability.
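The trial-and-error reward loop can be illustrated with tabular Q-learning, one of the simplest reinforcement learning algorithms. The toy "corridor" environment below is invented for the example: the agent starts at the left end and earns a reward only for reaching the right end, and over many episodes the table of action values comes to prefer stepping right everywhere:

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning on a corridor of states 0..n_states-1.
    Actions: 0 = step left, 1 = step right; reward 1.0 on reaching the end."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # Q-value table: q[state][action]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # Epsilon-greedy: mostly exploit the best-known action, sometimes explore
            if rng.random() < eps:
                a = rng.choice([0, 1])
            else:
                a = 0 if q[s][0] >= q[s][1] else 1
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # Update: nudge Q(s, a) toward reward + discounted best future value
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q = q_learning()
```

Early on, the agent wanders; once the terminal reward has propagated backward through the table, the greedy policy (always step right) emerges on its own, which is the "optimize for these rewards" behavior described above.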
Now that we’ve discussed and understood the different ML types, let’s move on and understand the optimal flow of the ML process.
Companies interested in creating value with AI/ML have a lot to gain compared to their more hesitant competitors. According to McKinsey Global Institute, "Companies that fully absorb AI in their value-producing workflows by 2025 will dominate the 2030 world economy with +120% cash flow growth." The undertaking of embracing AI and productionizing it, whether in your product or for internal purposes, is complex, heavy on technical debt, and expensive. Once your models and use cases are chosen, making them work in production becomes a difficult program to manage, and it's a process many companies will struggle with as companies in industries other than tech start to take on the challenge of embracing AI. Operationalizing the process, updating the models, keeping the data fresh and clean, organizing experiments, validating and testing, and managing the storage associated with all of it are the complicated parts.
In an effort to make this entire process more digestible, we’re going to present this as a step-by-step process because there are varying layers of complexity but the basic components will be the same. Once you have gotten through the easy bit and you’ve settled on the models and algorithms you feel are optimal for your use case, you can begin to refine your process for managing your AI system.
Essentially, you’ll need a central place to store the data that your AI/ML models and algorithms will be learning from. Depending on the databases you invest in or legacy systems you’re using, you might have a need for an ETL pipeline and data engineering to make the layers of data and metadata available for your productionized AI/ML models to ingest and offer insights from. Think of this as creating the pipeline needed to feed your AI/ML system.
AI feeds on data, and if your system for delivering data is clunky or slow, you'll run into issues in production later. Choosing your preferred way of storing data is tricky in and of itself. You don't know how your tech stack will evolve as you scale, so choosing a cost-effective and reliable solution is a mission in and of itself. For example, as we started to add more and more customers at a cybersecurity company we previously worked for, we noticed the load time for certain customer-facing dashboards lagging behind. Part of the issue was that the number of customers, and the metadata that came with them, had grown too large for the pipelines we already had in place to support.
At this point, you have your models and algorithms and you've chosen a system for delivering data to them. Now, you're going to be in the flow of constantly maintaining this system. In DevOps, this is referred to as continuous integration (CI)/continuous delivery (CD). In later chapters, we will cover the concept of AI Operations (AIOps), but for now, the following are the four major components of the continuous maintenance process, tailored for AI pipelines:
- CI: Testing and validating code and components, along with data, data schemas, and models
- CD: Code changes or updates to your model are passed on continuously, so that once you've made changes, they are slated to appear in the testing environment before going to production without pauses
- CT (continuous training): We've mentioned the idea of continuous learning being important for ML, and continuous training productionizes this process so that as your data feeds are refreshed, your models are consistently training and learning from that new data
- CM (continuous monitoring): We can't have ML/AI models continuously running without also continuously monitoring them to make sure something isn't going horribly wrong

You can't responsibly manage an AI program if you aren't constantly iterating on your process. Your models and hyperparameters will become stale. Your data will become stale, and when an iterative process like this stagnates, it will stop being effective. Performance is something you'll constantly stay on top of because a lack of performance will be self-evident, whether it's client-facing or not. With that said, things can also go wrong quietly. For example, lags in performance or in the frequency of model updates can lead to people losing their jobs, not getting a competitive rate on a mortgage, or receiving an unfair prison sentence. Major consequences can arise from downstream effects of improper model maintenance. We recommend exploring the Additional resources section at the end of this chapter for more examples and information on how stagnant AI systems can wreak havoc on environments and people.
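As a tiny illustration of what continuous monitoring can look like in practice, here is a sketch of a data-drift check that flags a feature when an incoming batch's mean sits too many standard deviations away from the training baseline. The threshold and sample values are invented; real monitoring stacks use richer statistical tests than this simple z-score:

```python
import statistics

def drift_alert(baseline, incoming, threshold=3.0):
    """Flag drift when the incoming batch mean sits more than `threshold`
    baseline standard deviations away from the baseline mean."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    z = abs(statistics.mean(incoming) - mu) / sigma
    return z > threshold

training_values = [10, 11, 9, 10, 12, 8, 10, 11]  # feature seen at training time
ok = drift_alert(training_values, [10, 11, 10])   # looks like the baseline
bad = drift_alert(training_values, [25, 26, 24])  # clearly shifted upward
```

A check like this would run on every refresh of the data feed, so that a model quietly training on shifted data raises an alert instead of degrading unnoticed.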
Database 101 – databases, warehouses, data lakes, and lakehouses
AI/ML products run on data. Where and how you store your data is a big consideration that impacts your AI/ML performance, and in this section, we will go through some of the most popular storage vehicles for your data. Figuring out the optimal way to store, access, and train on your data is a specialization in and of itself, but if you're in the business of AI product management, eventually you're going to need to understand the basic building blocks of what makes your AI product work. In a few words, data does.
Because AI requires big data, this is going to be a significant strategic decision for your product and business. If you don’t have a well-oiled machine, pun intended, you’re going to run into snags that will impair the performance of your models and, by extension, your product itself. Having a good grasp of the most cost-effective and performance-driven solution for your particular product, and finding the balance within these various facets, is going to help your success as a product manager. Yes, you will depend on your technical executives for a lot of these decisions, but you’ll be at the table helping make these decisions, so some familiarity is needed here.
Let’s look at some of the different options to store data for AI/ML products.
Depending on your organization's goals and budget, you'll be centralizing your data somehow between a data lake, a database, and a data warehouse, and you might even be considering a newer option: the data lakehouse. If you're just getting your feet wet, you're likely just storing your data in a relational database so that you can access it and query it easily. Databases are a great way to do this if you have a relatively simple setup. With a relational database, however, there's a particular schema you're operating under; if you wanted to combine this data with data that's in another database, you would run into problems aligning those schemas later.
If your primary use of the database is querying to access data and use only a certain subset of your company’s data for general trends, a relational database might be enough. If you’re looking to combine various datasets from disparate areas of your business and you’re looking to accomplish more advanced analytics, dashboards, or AI/ML functions, you’ll need to read on.
If you’re looking to combine data into a location where you can centralize it somewhere and you’ve got lots of structured data coming in, you’re more likely going to use a data warehouse. This is really the first step toward maturity because it will allow you to leverage insights and trends across your various business units quickly. If you’re looking to leverage AI/ML in various ways, rather than one specific specialized way, this will serve you well.
Let’s say, for example, that you want to add AI features to your existing product as well as within your HR function. You’d be leveraging your customer data to offer trends or predictions to your customers based on the performance of others in their peer group, as well as using AI/ML to make predictions or optimizations for your internal employees. Both these use cases would be well served with a data warehouse.
Data warehouses do, however, require some upfront investment to create a plan and design your data structures. They're also costly because they make data available for analysis on demand, so you're paying a premium for keeping that data readily available. Depending on how advanced your internal users are, you could opt for cheaper options, but a warehouse is optimal for organizations where most business users are looking for easily digestible ways to analyze data. Either way, a data warehouse will allow you to create dashboards for your internal users and stakeholder teams.
If you're sitting on lots of raw, unstructured data and you want a more cost-effective place to store it, you'd be looking at a data lake. Here, you can store unstructured, semi-structured, and structured data that can be easily accessed by your more tech-savvy internal users. For instance, data scientists and ML engineers would be able to work with this data because they can create their own data models to transform and analyze it on the fly, but most companies don't have many users with that skill set.
Keeping your data in a data lake would be cheap if you've got lots of data your business users don't need immediately, but you won't ever really be able to replace a warehouse or a database with one. It's more of a "nice to have." If you're sitting on a massive data lake of historical data you want to use in the future for analytics, you'll need to consider another way to store it to get those insights.
You might also come across the term lakehouse. There are many databases, data warehouses, and data lakes out there. However, the only lakehouse we’re aware of has been popularized by a company called Databricks, which offers something like a data lake but with some of the capabilities you get with data warehouses, namely, the ability to showcase data, make it available and ingestible for non-technical internal users, and create dashboards with it. The biggest advantage here is that you’re storing it and paying for the data to be stored upfront with the ability to access and manipulate it downstream.
Regardless of the tech you use to maintain and store your data, you're still going to need to set up pipelines to make sure your data is moving the way it needs to and that your dashboards are refreshing as readily as your business requires. There are also multiple ways of processing and passing data. You might be doing it in batches (batch processing) for large amounts of data moved at various intervals, or in real-time pipelines that deliver data as soon as it's generated. If you're looking to leverage predictive analytics, enable reporting, or have a system in place to move, process, and store data, a data pipeline will likely be enough. However, depending on what your data is doing and how much transformation is required, you'll likely be using both data pipelines and, more specifically, ETL pipelines.
ETL stands for extract, transform, and load. Your data engineers will create specific ETL pipelines for more advanced needs, such as centralizing all your data into one place, enriching data, connecting your data with CRM (customer relationship management) tools, or transforming data and adding structure to it as it moves between systems. This transformation step is necessary when loading data into a data warehouse or database; if you're exclusively using a data lake, you can store the raw data along with all the metadata you need and analyze it to get your insights as you like. In most cases, if you're working on an AI/ML product, you're going to be working with a data engineer who powers the data flow needed to make your product a success, because you're likely using a relational database as well as a data warehouse. The analytics required to enable AI/ML features will most likely be powered by a data engineer focused on the ETL pipeline.
Managing and maintaining this system will also be the work of your data engineer, and we encourage every product manager to have a close relationship with the data engineer(s) who supports their products. One key difference between the two kinds of pipelines is that ETL pipelines are generally updated in batches, not in real time. If you're using an ETL pipeline, for instance, to update daily historical information about how your customers are using your product so you can offer client-facing insights in your platform, it might be optimal to keep this batch updating twice daily. However, if insights need to arrive in real time for a dashboard used by internal business users who rely on that data to make daily decisions, you'll likely need a data pipeline that's updated continuously.
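The extract-transform-load flow itself can be sketched in a few lines. The table name, field names, and sample records below are invented for illustration, and an in-memory SQLite database stands in for the warehouse a real batch pipeline would load into:

```python
import sqlite3

def etl_batch(source_rows, conn):
    """Minimal ETL sketch: extract raw order records, transform them (clean
    names, derive revenue, drop empty orders), and load them in one batch."""
    cleaned = [
        (row["customer"].strip().lower(), row["units"] * row["unit_price"])
        for row in source_rows
        if row["units"] > 0                      # transform: drop empty orders
    ]
    conn.execute("CREATE TABLE IF NOT EXISTS orders (customer TEXT, revenue REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", cleaned)  # load step
    conn.commit()

# Extract step: raw records as they might arrive from a source system
raw = [{"customer": " Acme ", "units": 3, "unit_price": 9.5},
       {"customer": "Globex", "units": 0, "unit_price": 4.0}]
conn = sqlite3.connect(":memory:")
etl_batch(raw, conn)
```

In a twice-daily batch setup, a job scheduler would simply invoke a function like this on each new batch of extracted records.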
Now that we understand the different available options for storing data and how to choose the right one for the business, let's discuss how to manage our projects.
If you're looking to create an AI/ML system in your organization, you'll have to think about it as its own ecosystem that you'll need to constantly maintain. This is why you see MLOps and AIOps teams working in conjunction with DevOps teams. We will increasingly see managed services and infrastructure-as-a-service (IaaS) offerings emerge. There has been a shift in the industry toward companies such as Determined AI and Google's AI platform pipeline tools meeting the needs of the market. At the heart of this need is the desire to ease some of the burden on companies left scratching their heads as they begin to take on the mammoth task of getting started with an AI system.
Just as DevOps teams became popular with at-scale software development, the result of decades of mistakes, we will see something similar with MLOps and AIOps. Developing a solution and putting it into operation are two different key areas that need to work together, and this is doubly true for AI/ML systems. The trend now is toward IaaS. This is an important concept to understand because companies just approaching AI often don't understand the cost, storage, compute power, and investment required to do AI properly, particularly for DL projects that require massive amounts of data to train on.
At this point, most companies haven't been running AI/ML programs for decades and don't have dedicated teams. The big tech companies known as MAANG (Meta, Amazon, Apple, Netflix, and Google) are setting the cultural norms for managing AI/ML, but most companies that will need to embrace AI are not in tech and are largely unprepared for the technical debt that AI adoption will create for their engineering teams to manage.
Shortcuts taken to get AI initiatives off the ground will require code refactoring or changes to how your data is stored and managed, which is why strategizing and planning for AI adoption is so crucial. It's also why so many of these IaaS services are popping up: to help keep engineering teams nimble should they require changes in the future. The infrastructure needed to keep AI teams up and running is going to change as time goes on, and the advantage of using an IaaS provider is that you can run all your projects and only pay for the time your AI developers are actually using data to train models.
Once you're happy with the models you've chosen (including their performance and error rate) and you've got a good level of infrastructure to support your product and your chosen AI model's use case, you're ready for the last step of the process: deploying this code into production. Keeping up with a deployment strategy that works for your product and organization will be part of the continuous maintenance we outlined in the previous section. You'll need to think about things such as how often you'll need to retrain your models and refresh your training data to prevent model decay and data drift. You'll also need a system for continuously monitoring your model's performance. This process will be very specific to your product and business, particularly because these periods of retraining will require some downtime for your system.
Deployment is going to be a dynamic process because your models are, for the most part, trying to make predictions about real-world data, so depending on what's going on in the world of your data, you might have to give deployment more or less of your attention. For instance, when we were working for an ML property-tech company, we were updating, retraining, and redeploying our models almost daily because we worked with real estate data that was experiencing a huge skew due to rapid pandemic-driven changes in migration and housing price data. If those models had been left unchecked, without engineers and business leaders on both sides of this product, on the client's end and internally, we might not have caught some of the egregious liberties the models were taking on the basis of unrepresentative data.
There are also a number of well-known deployment strategies you should be aware of. We will discuss them in the following subsections.
In this deployment strategy (often referred to as shadow mode), you deploy a new model with new features alongside a model that already exists, so that the new model is experienced only as a shadow of the model currently in production. This means the new model handles all the requests it gets, just as the existing model does, but its results are never shown to users. This strategy allows you to see whether the shadow model performs better on the same real-world data without interrupting the model that's actually live in production. Once it's confirmed that the new model performs better and runs without issues, it becomes the predominant model fully deployed in production, and the original model is retired.
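A shadow deployment can be sketched as a request router that scores every request with both models but only ever returns the live model's answer. The model functions and threshold rules below are invented stand-ins for real models:

```python
def shadow_route(request, live_model, shadow_model, log):
    """Score the request with both models, log both answers for offline
    comparison, and return only the live model's answer to the caller."""
    live_pred = live_model(request)
    shadow_pred = shadow_model(request)  # computed but never shown to users
    log.append({"request": request, "live": live_pred, "shadow": shadow_pred})
    return live_pred

log = []
live = lambda score: score >= 50     # current production model (toy rule)
shadow = lambda score: score >= 40   # candidate model being evaluated
answers = [shadow_route(x, live, shadow, log) for x in (30, 45, 60)]
```

The accumulated log is what makes the comparison possible: the team can diff the live and shadow columns offline and only promote the shadow model once its answers look better on real traffic.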
With this strategy, we’re actually seeing two slightly different models with different features