Learn to research, plan, design, and test the UX of AI-powered products
Unlock the future of design with UX for AI—your indispensable guide to not only surviving but thriving in a world powered by artificial intelligence. Whether you're a seasoned UX designer or a budding design student, this book offers a lifeline for navigating the new normal, ensuring you stay relevant, valuable, and indispensable to your organization.
In UX for AI: A Framework for Designing AI-Driven Products, Greg Nudelman—a seasoned UX designer and AI strategist—delivers a battle-tested framework that helps you keep your edge, thrive in your design job, and seize the opportunities AI brings to the table. Drawing on insights from 35 real-world AI projects and acknowledging the hard truth that 85% of AI initiatives fail, this book equips you with the practical skills you need to reverse those odds.
You'll gain powerful tools to research, plan, design, and test user experiences that seamlessly integrate human-AI interactions. From practical design techniques to proven user research methods, this is the essential guide for anyone determined to create AI products that not only succeed but set new standards of value and impact.
Perfect for any UX designer working with AI-enabled and AI-driven products, UX for AI is also a must-read resource for designers-in-training and design students with an interest in artificial intelligence and contemporary design.
Page count: 463
Publication year: 2025
COVER
TABLE OF CONTENTS
INTRODUCTION
References
HOW TO USE THIS BOOK
If You Only Have a Couple of Hours
Do the Exercises
Draw in Pencil on Sticky Notes
PART 1: Framing the Problem
CHAPTER 1: Case Study: How to Completely F*ck Up Your AI Project
A Boiling Pot of Spaghetti
Final Thoughts
Reference
CHAPTER 2: The Importance of Picking the Right Use Case
Presuming That AI Will Be Telling Experts How to Do Their Job Is a Red Flag
Ask a Better Question
Reference
CHAPTER 3: Storyboarding for AI Projects
Why Bother with a Storyboard?
How to Create a Storyboard
Storyboarding for AI
Final Thoughts
Design Exercise: Create Your Own Storyboard
Storyboarding Exercise Example: Death Clock
References
CHAPTER 4: Digital Twin—Digital Representation of the Physical Components of Your System
Digital Twin of a Wind Turbine Motor
The Digital Twin Is an Essential Modeling Exercise for Designing AI-Driven Products
How to Build a Digital Twin: An Example
Wait, There’s More!
Design Exercise: Create Your Own Digital Twin
Design Exercise Example: Life Clock Digital Twin
CHAPTER 5: Value Matrix—AI Accuracy Is Bullshit. Here’s What UX Must Do About It
The Big Secret
Confusion Matrix: How Can Accurate AI Be Wrong?
Value Matrix: The AI Tool for the Real World
Training AI on Real-Life Outcomes to “Think” Like a Human
One More Example
Final Thoughts: The Importance of Human Cost/Benefit
Design Exercise: Create Your Own Value Matrix
Design Exercise Example: Life Clock Value Matrix
References
PART 2: AI Design Patterns
CHAPTER 6: Case Study: What Made Sumo Copilot Successful?
Strong Use Case
Clear Vision
Dedicated Full-Screen UI
AI-Driven Autocomplete
Next-Steps Suggestions
Final Words
References
CHAPTER 7: UX Best Practices for SaaS Copilot Design
The More Important the Task, the More Real Estate Is Required
SaaS Copilot Is Stateful
Specialized Fine-Tuned ChatGPT Model
Plug-Ins: Integrated Continuous Learning About Your Specific System
The IA of the AI Is Straightforward, Focused on Chat
Promptbooks: No Need to Twist into Pretzels to Write Prompts
Final Thoughts
Design Exercise: Create Your Own Mobile Copilot
Design Exercise Example: Life Clock Copilot
References
CHAPTER 8: Reporting—One of the Most Important Copilot Use Cases
Zoom AI Companion
Meeting Summary
Microsoft Security Copilot
Info for Report: Ignore Automatically vs. Pick Manually?
Security and Privacy
Design Exercise: Create Your Own Copilot Report
Design Exercise Example: Life Clock Copilot Report
CHAPTER 9: LLM Design Patterns
Restating
Auto-Complete
Talk-Back
Initial Suggestions
Next Steps
Regen Tweaks
Guardrails
Design Exercise: Try Out the LLM Patterns
Design Exercise Example: “Life Copilot Plus”
CHAPTER 10: Search UX Revolution: LLM AI in Search UIs
The Current State of Search
The “Mysteries That Are Not Scary” Problem
Enter LLMs
Design Exercise: Design Your Own LLM Search UI
Design Exercise Example: Life Copilot LLM Search
CHAPTER 11: AI-Search Part 2: “Eye Meat” and DOI Sort Algorithms
What Are Dynamic Dashboards?
Beware of Bias in AI Recommendations
DOI: Degree of Interest/Sort Algorithms
Design Exercise: Create Your Own Dynamic Dashboards and Sort UI
References
CHAPTER 12: Modern Information Architecture for AI-First Applications
Design Pattern du Jour: The Canvas
Is Information Architecture Dead?
Amazon.com: Conventional Approach
AI-First Amazon.com Redesign
Long Live Information Architecture!
References
CHAPTER 13: Forecasting with Line Graphs
Linear Regression
R-Squared
R vs. R-Squared
Forecasting with AI
Forecasting an Aggregate Variable
Final Words
Design Exercise: Design Your Own Forecasting UI
Design Exercise Example: Life Clock Forecasting
References
CHAPTER 14: Designing for Anomaly Detection
Why Is Detecting Anomalies Important?
Four Main Anomaly Types
References
Getting Ready for AI-pocalypse: Shorthand UX Design Notation as AI Prompt
CHAPTER 15: UX for Agentic AI
What Are AI Agents?
How Do AI Agents Work?
Use Case: CloudWatch Investigation with AI Agents
Final Thoughts
References
PART 3: Research for AI Projects
CHAPTER 16: Case Study: MUSE/Disciplined Brainstorming
Design Idea #1
Design Idea #2
Design Idea #3
Design Idea #4
Design Idea #5
But Wait, Did You Catch That?
Design Exercise: Create Your Novel Designs Using Bookending
Design Exercise Example: Novel Design Ideas for Life Clock
References
CHAPTER 17: The New Normal: AI-Inclusive User-Centered Design Process
In the Beginning …
The Monkey or the Pedestal?
A New Way of User-Centered Thinking
What the Heck Is a Spike?
What Is the Role of Data?
Where Is the Customer in All This?
Why Is This Change Necessary?
Does This Mean I Have to Learn About AI So That I Can Ask My Data Science Teammates Good Questions?
Final Handoff to Dev
Many More Changes to Come
Reference
CHAPTER 18: AI and UX Research
UX Techniques That Will Likely See Full Automation
UX Techniques That Will Be Radically Augmented
UX Techniques That Will Become Increasingly Valuable
AI Bullshit
Final Words
References
CHAPTER 19: RITE, the Cornerstone of Your AI Research
RITE Study vs. Usability Test
How to Conduct a RITE Study
A Few More RITE Rounds
The RITE Design Evolution
Dear Future: AI-Assisted RITE Methodology
Design Exercise: Run Your Own RITE Study
References
PART 4: Bias and Ethics
CHAPTER 20: Case Study: Asking Tough Questions Through Vision Prototyping
References
CHAPTER 21: All AI Is Biased
What Do You Expect When You Ask for “Biologist”?
How About “Basketball Player”?
Third Time’s the Charm: “Depressed Person”
References
CHAPTER 22: AI Ethics
CHAPTER 23: UX Is Dead. Long Live UX for AI!
AI Is Happening for Us, Not to Us
Staying on the Rollercoaster Is Optional
“UX Elitism” Is Over
Designers Are “Ambassadors of Innovation”
Core Skills Are in Demand
Combine Low-Fi UX Tools and Sophisticated AI Models
AI Is a “Wicked Problem”
AI Is Just Too Important to Be Left to Data Scientists
The Best AI Is Augmented Intelligence
References
INDEX
TITLE PAGE
COPYRIGHT
DEDICATION
ACKNOWLEDGMENTS
ABOUT THE AUTHOR
ABOUT THE CONTRIBUTING EDITOR
END USER LICENSE AGREEMENT
Introduction
Figure I.1 MCAS AI forcing the crash of Lion Air Flight JT 610
Chapter 1
Figure 1.1 Digital twin of an AI model of the process
Figure 1.2 Digital twin of a real-life process
Chapter 3
Figure 3.1 Papyrus of Ani
Figure 3.2 “Mental Health Assistant: AI Therapist in Your Pocket” storyboard...
Figure 3.3 “Mental Health Assistant: AI Helping Hand for Mild Social Anxiety...
Figure 3.4 Mental Health Assistant app establishing shot
Figure 3.5 More examples of establishing shot panels for storyboards
Figure 3.6 Drawing things
Figure 3.7 Three styles of simple drawings of a person
Figure 3.8 Using eyebrows to help communicate nuanced feelings in simple fac...
Figure 3.9 Action-to-Action transition in the “Mental Health Assistant” stor...
Figure 3.10 Another example of an Action-to-Action transition
Figure 3.11 Subject-to-Subject transition in the “Mental Health Assistant” s...
Figure 3.12 Scene-to-Scene transition in the “Mental Health Assistant” story...
Figure 3.13 Subject-to-AI transition in the “Mental Health Assistant” storyb...
Figure 3.14 Subject-to-AI transitions example with “Danger Robot” AI
Figure 3.15 Subject-to-AI transitions featuring Alexa
Figure 3.16 This conclusion panel does not fit the story.
Figure 3.17 This conclusion panel is much more reasonable.
Figure 3.18 The current UX for “Answer Phone While Driving” is dangerous.
Figure 3.19 AI-first UX for “Answer Phone While Driving”
Figure 3.20 Storyboarding Exercise Example: Death Clock
Chapter 4
Figure 4.1 Schematic of GE Haliade 150 wind turbine showing seven yaw system...
Figure 4.2 Digital twin model diagram of the wind turbine yaw motor
Figure 4.3 GE-WTMS Parts View showing the EN4 yaw motor schematic and metada...
Figure 4.4 GE-WTMS Parts View showing EN4’s input current
Figure 4.5 GE-WTMS Parts View showing EN4’s temperature
Figure 4.6 GE-WTMS Parts View showing EN4’s remaining lifetime
Figure 4.7 Digital twin iteration 1: smartwatch alone
Figure 4.8 Digital twin iteration 2: smartwatch + smartphone with a GPS trac...
Figure 4.9 Digital twin iteration 3: smartwatch + smartphone with a GPS trac...
Figure 4.10 Complete digital twin exercise for the Life Clock app
Chapter 5
Figure 5.1 AI model selection based on data science metrics: precision, reca...
Figure 5.2 AI model selection based on real-world outcomes, assuming TP (tru...
Figure 5.3 The Confusion Matrix for the Conservative AI model
Figure 5.4 The Confusion Matrix for the Conservative (Accurate) AI model, as...
Figure 5.5 AI model selection based on real-world outcomes, assuming TP of $...
Figure 5.6 AI model selection based on real-world outcomes, assuming TP of $...
Chapter 6
Figure 6.1 Complex Geo mapping query done with a simple natural language com...
Figure 6.2 Dedicated full-screen UI allowed us the room we needed to impleme...
Figure 6.3 Powerful autocomplete helps users ask the right question from the...
Figure 6.4 Next-steps suggestions respond to the user’s journey through the ...
Chapter 7
Figure 7.1 Microsoft Security Copilot implemented as a side panel
Figure 7.2 Amazon Q Copilot implemented as a large overlay
Figure 7.3 Microsoft Security Copilot implemented in a full page
Figure 7.4 Trained MSC vs. untrained ChatGPT
Figure 7.5 Specially trained MSC Copilot can answer near real-time questions...
Figure 7.6 A list of external data sources (plug-ins) available to MSC
Figure 7.7 The information architecture of a typical Copilot
Figure 7.8 MSC Promptbooks provide easy starting points for a security inves...
Figure 7.9 A simple workflow for a Life Copilot app
Figure 7.10 AI Meal Scan feature from MyNetDiary app
Figure 7.11 Life Copilot app with health coach AI commentary
Chapter 8
Figure 8.1 ZAC provides an automated meeting summary out of the box
Figure 8.2 ZAC can answer questions about the meeting
Figure 8.3 One of the benefits of ZAC is that there is not much to set up...
Figure 8.4 ZAC uses a different UI modality by organizing the meeting ideas ...
Figure 8.5 The Executive Summary report is generated in jargon-free English ...
Figure 8.6 The Pinboard report is generated by selecting only the items the ...
Figure 8.7 When seconds matter, the Pinboard feature allows anyone joining t...
Figure 8.8 The Pinboard report is limited only to human-selected information...
Figure 8.9 Life Copilot Daily Summary Report
Figure 8.10 Life Copilot Weekly Report
Figure 8.11 Life Copilot Navigation menu
Chapter 9
Figure 9.1 ChatGPT “understands” the request in context
Figure 9.2 Siri does not understand the context
Figure 9.3 Restating feature in Microsoft Power BI
Figure 9.4 Auto-complete feature in Power BI
Figure 9.5 If the user utilized Auto-complete, Power BI shows the answer wit...
Figure 9.6 Power BI has one of the most sophisticated Auto-complete features...
Figure 9.7 “Remove the squid”—a highly sophisticated example of Talk-Back in...
Figure 9.8 Generic initial suggestions
Figure 9.9 Initial suggestions tuned to the data source
Figure 9.10 Initial suggestions that continue where the user left off
Figure 9.11 Various sophisticated suggestions in Sumo Logic Copilot
Figure 9.12 Midjourney is a great example of a creative generation flow with...
Figure 9.13 ChatGPT o1-preview successfully resists giving the recipe due to...
Figure 9.14 ChatGPT o1-preview yields the answer to a slightly more sophisti...
Figure 9.15 Left Image: Simple suggestions from Chapter 7. Right Image: New ...
Figure 9.16 History, Suggest, Restating, and Next Steps in context of the “L...
Figure 9.17 Example of AI content Guardrails
Chapter 10
Figure 10.1 Google Search with answers
Figure 10.2 Amazon Search with facets
Figure 10.3 Google Search for “Mysteries That Are Not Scary” returns a human...
Figure 10.4 “Authoritative source” for non-scary mysteries
Figure 10.5 A hodgepodge of Amazon search results for “Mysteries That Are No...
Figure 10.6 AP Images conventional search results for “Mysteries That Are No...
Figure 10.7 Jackpot: AP Images AI results for “Mysteries That Are Not Scary”...
Figure 10.8 ChatGPT results for “Mysteries That Are Not Scary”
Figure 10.9 ChatGPT results for movies “Mysteries That Are Not Scary”
Figure 10.10 LLM fuzzy search results for a query: suggest a healthy cocktai...
Chapter 11
Figure 11.1 Figuring out what the customer wants next is a tough problem
Figure 11.2 Hey, Amazon, most of us do not collect toilet seats
Figure 11.3 Jungle Book search on mobile (left) and desktop (right)
Figure 11.4 Movies about bears
Figure 11.5 Search results for query “presidential candidates,” collected Se...
Figure 11.6 Search results for query “presidential candidates,” collected on...
Figure 11.7 Presidential candidates image search results, collected Septembe...
Figure 11.8 The two primary presidential candidates do not show up together ...
Figure 11.9 Web views over time for a specific hashtag
Figure 11.10 Newly trending hashtag (small graph in bottom left, before the ...
Chapter 12
Figure 12.1 Space Crocodiles: Canvas design pattern in action
Figure 12.2 “AI-minus”—Amazon homepage of mostly useless junk
Figure 12.3 Books results for “Mysteries That Are Not Scary” are a hodgepodg...
Figure 12.4 AI-plus features of the Item Detail screen
Figure 12.5 AI-first Analysis page for Black Friday
Figure 12.6 AI-first Analysis Fishing Category page for Black Friday
Figure 12.7 AI-first LLM Search Results page for “Mysteries That Are Not Sca...
Figure 12.8 AI-first Item Detail page
Figure 12.9 Not everything needs to be AI
Chapter 13
Figure 13.1 Typical forecast on a graph
Figure 13.2 The now line and two confidence intervals in a temperature forec...
Figure 13.3 Example of a linear regression forecast
Figure 13.4 R-squared is a measure of how well the line fits the data points...
Figure 13.5 The confidence interval gives the viewer a hint of a possible “g...
Figure 13.6 Chlorine degradation in a product as a function of the time is a...
Figure 13.7 Multiple equations might work almost equally well. You need to u...
Figure 13.8 Not all curves that fit the data match the use case. This graph ...
Figure 13.9 Example of seasonality
Figure 13.10 Aggregate variable is best displayed with a bar graph
Figure 13.11 Complex seasonal forecast for an aggregate variable
Figure 13.12 Line graph forecast with a confidence interval
Figure 13.13 Aggregate metrics (calorie intake, exercise, lifetime increase/...
Chapter 14
Figure 14.1 Four types of common anomalies detected by AI, inspired by Andre...
Figure 14.2 Example of point anomalies
Figure 14.3 The UI for a static threshold point anomaly detection
Figure 14.4 Bollinger Bands are a simple example of a dynamic threshold poin...
Figure 14.5 A more complex AI-driven dynamic threshold for point anomaly det...
Figure 14.6 UI design for fine-tuning AI-driven dynamic threshold settings f...
Figure 14.7 A point anomaly detection alert is only generated if three or mo...
Figure 14.8 A change point anomaly detection alert is only generated if the ...
Figure 14.9 Seasonal shape anomaly detection UI
Figure 14.10 Advanced settings for seasonal shape anomaly detection UI
Figure 14.11 Digital twin of a Horse-Head oil pump. Note the downhole dynaca...
Figure 14.12 Anomalies list in the Life Copilot app
Figure 14.13 Wireframe for a form controlling fluid pound rule for a Horse-H...
Figure 14.14 Shorthand notation as input for AI to create a simple form in R...
Figure 14.15 Wireframe for a table of Horse-Head pump AI rules
Figure 14.16 Simplified shorthand notation as input for AI to create a table...
Chapter 15
Figure 15.1 Multistage agentic AI flow
Figure 15.2 Step 1: The human operator launches a new investigation
Figure 15.3 Step 2: The supervisor agent launches worker agents, which take ...
Figure 15.4 Step 3: Worker agents return with suggested observations concern...
Figure 15.5 Step 4: After receiving user feedback, the agents investigate fu...
Figure 15.6 Step 5: The supervisor agent is now ready to point out the culpr...
Figure 15.7 The supervisor agent also provides the next steps to fix the pro...
Chapter 16
Figure 16.1 MUSE Design Idea #1, inspired by side-panel Copilot
Figure 16.2 MUSE Design Idea #2, inspired by GitHub Copilot
Figure 16.3 MUSE Design Idea #3 starts suggesting when user pauses typing
Figure 16.4 MUSE Design Idea #4 provides ideas even before the user starts t...
Figure 16.5 Scrivener Card Management Interface
Figure 16.6 Grammarly GO offers a long list of initial suggestions and makes...
Figure 16.7 MUSE Design Idea #5, inspired by Scrivener and Grammarly GO
Chapter 17
Figure 17.1 An example of an early linear UX process diagram
Figure 17.2 The “Real”(TM) UX design process: It starts with Idea and Money,...
Figure 17.3 The cyclical nature of the RITE process
Figure 17.4 The new AI-inclusive user-centered design process iterates betwe...
Figure 17.5 In the “new normal,” the “glue” aspect of UX—bringing together c...
Chapter 19
Figure 19.1 RITE Round One: 2–3 screens is enough to get started proving our...
Figure 19.2 A more complete prototype after a few rounds of RITE studies
Chapter 20
Figure 20.1 Ultrasonic pipe inspection
Figure 20.2 Tactical graph showing a typical rate of corrosion over time
Figure 20.3 Precision vs. accuracy. Our customers cared more about precision...
Figure 20.4 AI-driven UX to compare various scenarios to extend the life of ...
Chapter 21
Figure 21.1 Midjourney query for “biologist” yields a vast majority of white...
Figure 21.2 For a query “biologist,” you are statistically as likely to get ...
Figure 21.3 Midjourney query for “basketball player” yields a vast majority ...
Figure 21.4 Midjourney query for “depressed person” yields a vast majority o...
Figure 21.5 Typical output for queries: “biologist,” “basketball player,” “d...
Figure 21.6 Example of survivor bias
Figure 21.7 Typical output for queries: “black trans biologist,” “Indian wom...
Absolutely brilliant!
—LINDA LANE, BFA, MSIM, UX DESIGNER, RESEARCHER, WRITER/EDITOR, TECHNICAL PRODUCT MANAGER AT MICROSOFT & INFOSYS
Insightful perspectives on designing user experiences tailored for AI applications.
—ALEX FAUNDEZ (www.linkedin.com/posts/faundez_uxlx2024-ux-design-activity-7201518211496861696-taCZ)
Forward-thinking ideas for reimagining our UX process, encouraging a shift from repetitive UI tasks to strategic UX design strategy.
—LAURA GRAHAM (www.linkedin.com/posts/laura-graham-765b406_uxlx-userexperience-designcommunity-activity-7201192500655529984-OiJv)
Thank you … Great workshop [material] on UX for AI.
—KATRIN ELLICE HEINTZE (www.linkedin.com/posts/activity-7201154293511376896-0U76)
Another favorite … Showcasing how to storyboard, identify AI use cases, and create digital twins.
—SHAHRUKH KHAN (www.linkedin.com/posts/shahrukhkhan07_just-came-home-from-lisbon-and-what-activity-7200969521086504960-98Fp)
Hands-on, practical, engaging … Equipped us with innovative tools. Loved it!
—SABRINA S. (www.linkedin.com/posts/activity-7200781769015537664-Ym_e)
For Captain Bhavye Suneja and his copilot Harvino, this 6:20 a.m. flight started just like any other. The skies were clear, and the Jakarta Soekarno-Hatta International Airport was not particularly busy.
Yet just two minutes after takeoff, the nearly new jet started behaving erratically. The plane warned pilots it was in a stall and began to dive in response. (A “stall” is when the airflow over a plane’s wings is too weak to generate lift and keep the plane flying.)
The captain fought the controls, trying to get the plane to climb, but the AI, still incorrectly sensing a stall, continued to push the nose down using the plane’s trim system. For the next nine minutes, AI and humans fought each other for control as the massive jet continued to buck, losing altitude and airspeed.
The chart below dispassionately documents the desperate struggle between human pilots and the Maneuvering Characteristics Augmentation System (MCAS) AI, a real-life HAL-9000 from Arthur C. Clarke’s 2001: A Space Odyssey, determined to act according to its programming (see Figure I.1).
Figure I.1 MCAS AI forcing the crash of Lion Air Flight JT 610
Source: Adapted from Komite Nasional Keselamatan Transportasi
The “trim manual” line shows the pilot’s efforts to redirect the plane; the “trim automatic” line below shows the MCAS actions (1).
Just 12 minutes after takeoff, the AI won the battle, and the ill-fated Lion Air flight JT 610 hit the water, killing all 189 people on board. As I write this introduction in the hot California summer of 2024, it is nearly the sixth anniversary of that fatal crash.
Six years later, most people still do not realize that the crash was not, at its core, a hardware or software problem.
It was a UX for AI design problem.
According to The Air Current (TAC), the 737 Max had six “checklists” (predefined procedures pilots have to follow to solve a problem) that pertained to the situation faced by JT 610’s pilots:
Unreliable airspeed
Unreliable altitude
Angle of attack (AoA) disagree
Speed trim failure
Stabilizer out of trim
Runaway stabilizer trim
Can you guess which one concerned checking the runaway AI’s actions? If you guessed “Runaway stabilizer trim,” you are better at guessing than the pilots of JT 610 were on that tragic day.
Greg Bowen, Southwest Airlines Pilots Association (SWAPA) Training & Standards Chair, pointed out that the cutout switches (which would have turned off the hallucinating MCAS AI) are the “fourth or fifth” item on the runaway stabilizer trim checklist. In addition to being down on the list of Boeing’s recommended procedures, the runaway stabilizer trim checklist was a “memory item,” which means the pilots had to memorize it. “So one of the things we’re looking at is redesigning that checklist so that it follows the conscript of what people would normally be expected to remember or not,” Greg Bowen said.
Not surprisingly, the official regulatory guidance from the FAA has discouraged the use of memory items as part of procedures. “Memory items should be avoided whenever possible,” according to a 2017 Advisory Circular from the US aviation regulator. “If the procedure must include memory items, they should be clearly identified, emphasized in training, less than three items, and should not contain conditional decision steps.”
Designers reading this will recognize the problem as a classic “recognition vs. recall” dilemma. It’s much easier to recognize something you see on the screen than to remember something (particularly something you do only rarely) in a stressful situation.
In the only real-world case of pilots recovering from an incorrect MCAS activation, on a Lion Air flight the day before the crash, it took the pilots 3 minutes and 40 seconds to figure out what was wrong, with the help of a third pilot who happened to be present, digging through the three separate checklists it took to resolve the problem. Unfortunately, these pilots did not pass on to the next crew all of the information about the problems they encountered. In contrast, the pilots of JT 610 could not recall the right checklist to turn off the trim switch. As the captain tried in vain to find the right procedure in the handbook, the first officer was unable to control the bucking plane.
“They didn’t seem to know the trim was moving down,” the source close to the investigation said. “They thought only about airspeed and altitude. That was the only thing they talked about.” “It is like a test where there are 100 questions, and when the time is up, you have only answered 75,” a source said. “So you panic. It is a timeout condition” (2).
Let me emphasize this again: The pilots did not even know that the AI was forcing the nose of the airplane down, much less have the time to figure out (remember!) how to turn off the AI.
On the day of the crash, the erroneous data that activated MCAS also set off a number of alerts, including the “stick shaker,” which noisily vibrates the control column, plus other audible warnings. Meanwhile, instructions issued by Indonesian Air Traffic Control, which didn’t realize how serious the issue was, added to the pilots’ workload.
Imagine digging through the manual when the alarms go off all over the place, and the plane is bucking and fighting you at every step of the way! “There was a tsunami of distractions going off in an airplane. Unless you precisely come in and interrupt it, you are heading for a plane that is heading for the ground,” said Dennis Tajer, spokesman for the Allied Pilots Association, adding, “You shouldn’t have to count on superhuman behavior when you’re designing an aircraft.”
To better deal with the runaway MCAS AI in the future, both American and Southwest Airlines today have switched to a Quick Reference Checklist (QRC) card system for the most urgent situations. Both airlines have the runaway stabilizer trim checklist as part of their quick reference card used to fly the 737.
However, as a UX for AI designer, I think that switching to QRC is not enough.
As AI-based systems like the 737 Max’s MCAS, self-driving cars, and autonomous AI-driven industrial processes take over the management of more and more of the complexity of our lives, we need a comprehensive approach to how we design UX interactions with these systems.
We need UX for AI.
“It’s brought back an intellectual aggression to know more about how these aircraft are designed,” said Tajer. “As dark as it’s been, we’re going to be in a better place after this.”
Mica Endsley, a former chief scientist for the US Air Force whose work was cited in the Indonesia report, said, “The problem is … understanding the importance of human-factors science and prioritizing it … ” Endsley continued, “A lot of automation has been ‘silent but deadly,’ acting in the background but not communicating well with the people who ultimately have the responsibility for the safety of the operation … The biggest reason for this is that the engineers designing the automation assume that their system will behave properly … This, of course, is a very bad assumption” (3).
At the heart of the safety crisis facing the airplane on that fatal day was the interaction between humans and AI, and the lack of care in the design of this UX for AI interaction caused this accident and one additional crash, making Boeing’s MCAS AI responsible for killing a total of 346 people. In addition to being held responsible for the tragic loss of life, Boeing also suffered tremendous financial losses and brand damage. The company was sued by every conceivable entity, including the Southwest Airlines Pilots Association (SWAPA), which alleged that Boeing “deliberately misled” the pilots about the differences between the Next Generation and Max iterations of the 737. In 2019, the union was seeking more than $100 million from Boeing for wages lost as a result of the grounding.
While we’d like to believe that Boeing’s experience was the exception, the inconvenient truth is that a full 85 percent of AI and machine learning (ML) projects fail to produce a return for the business, according to technology research firm Gartner, Inc. The reasons often cited for the high failure rate include poor scope definition, bad training data, organizational inertia, lack of process change, mission creep, and insufficient experimentation.
To this list, I would add another reason that I have seen many organizations struggle to achieve value from their AI projects. Companies often have invested heavily in building data science teams to create innovative ML models. However, they have failed to adopt the mindset, team, processes, and tools necessary to efficiently and safely put those models into a production environment where they can actually deliver value (4).
In other words, for AI-driven endeavors to succeed, we must focus our attention on how AI adoption transforms the entire enterprise, starting with how we research, plan, design, and user-test the human-AI interaction of our AI-driven products.
In the past decade, I have had the privilege to be part of 35 projects designing the UX for various AI-driven systems in a variety of industries: oil & gas, network and cloud monitoring, security, analytics, agriculture, CRM, content management, and more. In addition to personally leading the design for real-world AI projects, I have been teaching UX for AI design techniques in a series of sold-out hands-on workshops in multiple countries around the globe.
The good news is that I discovered that core UX design skills can be of tremendous value in the new AI-driven world. Self-starter action and original ideas, together with driving broad project alignment, coordinating conflicting interests, and running a lightweight design process centered on user research—these skills will now be more valuable than ever. Due to the major disruption introduced by AI, they are once again becoming the bread and butter of the UX industry.
The bad news is that many of today’s designers are poorly equipped for the new age being quickly ushered in by functional AI. Designers who wish to remain in the profession will, therefore, need to retrain themselves in skills that include staying humble, asking powerful questions, coming up with options, testing them quickly with customers, conferring with developers and AI specialists, and combining the entirety of the incomplete information to come up with original solutions to barely articulated and poorly understood problems.
In this book, I distill my experience designing and teaching UX for AI-driven systems into a set of general principles and practical techniques you can start replicating immediately in your next project to improve your odds of success, minimize future catastrophic failures, and bring tremendous value to AI-driven product development.
Designers are essential in bringing “Balance to the Force” because AI is simply too important to be left to engineers, business people, and data scientists.
In our rush to adopt and use AI, we urgently need to rediscover and use our humanity.
The book you hold in your hands aims to teach you how to do just that in a practical and accessible manner.
So I hope that this book inspires you in all the right ways, instills an occasional chuckle, gives you a sense of urgency to rethink and retool, and, most importantly, provides you with the complete set of practical techniques to get the job done.
Designers, step up! The world needs you!
With love and hope for the future,
Greg Nudelman (with Daria Kempka, Contributing Editor)
August 2024
1. Komite Nasional Keselamatan Transportasi, Republic of Indonesia (October 29, 2018). Preliminary Aircraft Accident Investigation Report KNKT.18.10.35.04: PT. Lion Mentari Airlines Boeing 737-8 (MAX), PK-LQP, Tanjung Karawang, West Java, Republic of Indonesia. https://web.archive.org/web/20191017072333/http://knkt.dephub.go.id/knkt/ntsc_aviation/baru/pre/2018/2018%20-%20035%20-%20PK-LQP%20Preliminary%20Report.pdf
2. Ostrower, J. (2019). Checklists come into focus as pace-setter for 737 Max return. The Air Current. https://theaircurrent.com/aviation-safety/checklists-come-into-focus-as-pace-setter-for-737-max-return
3. Lion Air crash: Pilots searched flight manual, prayed minutes before plane plunged into sea (2019). Scroll.in. https://scroll.in/latest/917387/lion-air-crash-pilots-searched-flight-manual-prayed-minutes-before-plane-plunged-into-sea
4. Vartak, M. (2023). Achieving next-level value from AI by focusing on the operational side of machine learning. Forbes Technology Council. www.forbes.com/sites/forbestechcouncil/2023/01/17/achieving-next-level-value-from-ai-by-focusing-on-the-operational-side-of-machine-learning
This book is based on our best-selling UX for AI: A Framework for Product Design workshop, with material honed over several years, multiple retellings, and enthusiastic feedback from over 1,000 designers, without whom this book would not be possible. After much trial and error, I (the author) and Daria Kempka (my wonderful contributing editor) settled on the flow of material you see in the book, arranged in the way we found best for understanding. All of the key concepts come with hands-on design exercises where you can practice applying the key principles of this book to your own AI-driven design project.
Together, Daria and I strove to make this book as compact as possible, to enable you to complete all of the material on a plane ride from San Francisco to London. Thus, you can read all the chapters, do all the exercises, and land a mere 10 hours later, ready to take on the world!
The whole of the book is designed to be read in order and to be bigger and better than the sum of its chapters. However, if you only have a couple of hours and want to maximize your time-to-value, I recommend focusing on Chapters 3, 4, 5, 15, 17, 19, 20, and 21. Those chapters alone are worth the price of admission and consistently receive top ratings from workshop attendees. In the few hours it takes to fly between San Francisco and Austin, TX, this minimum set of material will equip you with the basic skills you need to talk to data scientists and AI engineers and immediately add value to most AI projects.
After those chapters, it’s basically “dealer’s choice,” as each chapter can be used in isolation to address a particular problem you are having. For example, Chapter 7 on Copilot best practices helps you (naturally) design a Copilot, whereas Chapters 13 and 14 help you design the UI for predictive forecasting and anomaly detection, and so on. Part 4 of the book is dedicated to a critical examination of AI bias and ethics and provides practical tips for preserving your creative voice and becoming a force for good in the new AI-driven world.
I recommend you read the material in order, immediately practicing the application of the ideas in the book by applying the exercises to your own UX for AI project. If you do not currently have a project on which you are working, I recommend doing the exercises using the enclosed UX for AI use case “Life Clock,” which I use to demonstrate various techniques and concepts throughout the book. The exercises are an essential part of the book and will help you coalesce your understanding into a working knowledge and sharpen your insights.
All design exercises and drawings in this book are rendered in black and red gel pen in a dot-grid paper notebook. I decided to use a pen because detailed pencil drawings on yellow sticky notes reproduce poorly in print, a criticism I received for my $1 Prototype book (which featured many pencil drawings).
I strongly encourage you to use a pencil with a good eraser and sticky notes in your own work.
Why? To begin with, your drawings are not meant to be art pieces! Your drawings are working prototypes, digital twins, storyboard panels, and notes, all subject to change at a moment’s notice. Working drawings are the fastest way to get the design documentation done so that you can move on to solving problems. One of the things I try to teach you in this book is the increasing need to remain flexible and not fall in love with your designs. In the world of functional AI, there are no sacred cows.
All images in the book are Source: Greg Nudelman unless otherwise credited. Again, I used a pen only to ensure the clarity of the reproduced images; in your actual work, get comfortable using a pencil with a good eraser on sticky notes. You’ll be glad you did. We’ll explain what to do in detail in specific chapters.
In Part 1 of the book, I discuss the importance of understanding and framing the problem we are trying to solve with AI. Many products fail not because of technology issues but because we find ourselves solving the wrong problem. AI is a large and powerful hammer, and the temptation for teams to go pound some nails is often nearly irresistible. That is why I start the book with a detailed case study of how to f*ck up your AI project. Then, in Chapter 2, I discuss in more detail the most common way to sink your AI-driven project right out of the gate: picking the wrong use case. After covering these common pitfalls, I show you how to apply a novel twist on traditional lightweight UX techniques to select the right use case and frame it in a way your team can effectively solve: storyboarding (focusing on the quirks of applying storyboarding to AI projects), digital twin modeling, and finally, Value Matrix analysis. Let’s do this!
A lot of folks say hindsight is 20/20, but personally, I love postmortems. Seeing clearly how my team screwed up in the past teaches me how to avoid creating the same messes in the future. As I mentioned earlier in the introduction, 85 percent of all AI projects fail (1). Starting by demonstrating how things might go wrong provides a powerful framework for learning the subject.
In my experience with 35 UX for AI projects, the most common causes of failure were not “AI” or “tech” failures per se—most were failures of UX design, research, testing, and process. Often, multiple issues work together to weaken the team’s effort and cause “the death of a thousand cuts.” Throughout this book, I will be covering AI project failures in detail and providing many real-life examples to help you recognize red flags early and avoid these pitfalls.
The following case study provides a salient real-life example of how our team failed to correctly frame the problem due to multiple issues that worked together to create a fog of confusion and caused the project to go off the rails.
Imagine a complex industrial process involving an acid gas removal unit (AGRU), a crucial part of the Liquefied Natural Gas (LNG) purification process. Without going into too much detail, this process is akin to boiling a giant pot of spaghetti on a stove. If you increase the temperature, the spaghetti will be done faster, and you can cook more pasta in 24 hours. However, increasing the temperature of the stove burner also increases the risk of the pot boiling over, creating a sticky spaghetti mess all over the stove (which requires shutting down the cooking, deploying an expensive industrial cleaning process, and ruining a fun day of boiling pasta in favor of a tense encounter with an angry boss).
One industrial supplier of these giant AGRU “industrial pasta pots” (who shall remain nameless) thought of a brilliant solution: They could use AI to predict when the pot was about to boil over. My team of seven well-paid data science/dev/UX professionals spent 6 months trying to make it work, but sadly, the entire project was a complete and utter failure. Our AI project tanked due to the following five common critical failure principles.
It did not take my team long to figure out that every “industrial pasta pot” (worth millions of dollars) was operated by a dedicated expert technician trained to maintain the right level of boil to achieve a good yield without boiling over. If the technician saw the liquid level rise rapidly, they would lower the heat, avoiding overboiling. After a short time, these technicians became experts in avoiding overboiling in their specific pot installation.
Our team theorized that our AI solution would replace these technicians, rendering them unemployed and saving the company money in the process. While this strategy sounds bulletproof in theory, this business plan was akin to trying to block bullets with a wet tissue.
To begin with, despite being experts, these technicians were not highly paid, and our AI solution would, right out of the gate, cost the customer much more than their existing technician. The AI my team was selling was not even trained on the customer’s specific pot installation. (In fact, the AI was not trained at all, since my team could not get the data for ML training; see point 3 later.) So there was no possibility of the AI performing as well as the technician while actually costing more.
Any time your AI solution tries to replace an existing installed expert operator, take care! This is a huge red flag, and the likelihood of your project failing goes way up. If your AI solution costs more than the installed expert, don’t just walk away. Run.
The detailed guide on how to pick the right use case for your AI project is covered in Chapter 2, “The Importance of Picking the Right Use Case.”
Before engaging in an AI-building exercise, take the time to understand the cost/benefit analysis of your use case. Every AI action is a prediction with a certain probability of success or failure, and the outcome of every prediction has a specific cost and benefit. Our project team failed to quantify the cost/benefit of this project before developing the AI solution.
Don’t make the same mistake.
While avoiding overboiling was relatively easy (just lower the heat), the cost impact of an overboiling event was very high: a single overboiling of the pot cost several times the yearly salary of the expert operator. Thus, to justify the price of the installation, the AI solution needed to be ridiculously accurate at avoiding overboiling. The cost/benefit skew ran roughly 1,000:1 against the AI, meaning a single false negative (failing to catch an overboil) would wipe out the profits from about 1,000 correct guesses, a full year of prevented overboils.
Skewed cost/benefit impact made it very hard to convince the customers that they should replace their proven, installed, and trained solution (a low-cost, full-time expert pot operator) with an expensive, unproven, untrained AI solution.
As an additional disincentive to adopting AI in this case, our company refused to cover the cost of overboiling caused by a faulty AI guess, making the whole thing a complete non-starter.
If the potential cost of a wrong AI guess far exceeds the benefit of a correct AI guess, walk away. If the cost of a bad AI guess is catastrophic, run.
The detailed walk-through on conducting your own cost/benefit analysis with a value matrix is covered in Chapter 5, “Value Matrix—AI Accuracy Is Bullshit. Here’s What UX Must Do About It.”
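To make the skew concrete, here is a minimal sketch of the expected-value math behind a value matrix, the technique Chapter 5 walks through in detail. Every dollar figure and count below is a hypothetical assumption of mine (the real numbers from this project are not public); the point is only to show how a single catastrophic false negative dominates the outcome.

```python
# Value matrix: the real-world dollar value of each prediction outcome.
# All numbers are hypothetical assumptions for illustration.
value_matrix = {
    "TP": 1_000,       # benefit of catching an overboil in time
    "TN": 100,         # benefit of letting a safe batch keep cooking
    "FP": -500,        # cost of a false alarm (heat lowered needlessly)
    "FN": -1_000_000,  # catastrophic cost of a single missed overboil
}

# Hypothetical confusion-matrix counts for one candidate model over a year:
counts = {"TP": 40, "TN": 10_000, "FP": 25, "FN": 1}

total_value = sum(value_matrix[k] * counts[k] for k in counts)
print(f"Expected yearly value of this model: ${total_value:,}")
# => Expected yearly value of this model: $27,500
```

Note how the single false negative erases nearly all of the $1,000,000 earned by 10,000 correct “keep cooking” calls: that is the 1,000:1 skew in action, and no headline “accuracy” number would have revealed it.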
While our company made pots, it did not use them; only our customers did. This made collecting data challenging from the start. The high cost of each pot meant that only a few thousand pots were installed worldwide—not enough to automatically collect generalized machine learning (ML) data.
What made things even worse was that every pot installation was a little different: different pipes, different heat sources, and slight variations in atmospheric temperature, pressure, humidity, rate of flow, fans, and the like made each installation bespoke. (In the same way that my boiling a pot of pasta on my stove tells you nothing about the conditions of boiling pasta on your stove, even if we both use the same pot!) The AI model from installation A could not be used in installation B. This meant that every pot required its own custom AI system.
If you do not have the data to train your AI/ML or have no easy, cheap way to obtain the data, walk away. If your solution requires a custom AI model for each installation, run.
For a detailed discussion on how to spot the bias in your ML training data and techniques for dealing with it, look to Part 4, “Bias and Ethics,” of this book.
While the lack of data alone should have killed the project, the question my team was modeling with AI sealed the project’s demise.
The human operator was tasked with answering the question: How high can I make my temperature before the risk of boiling over is too great?
In contrast, the AI model my team was building was trying to answer a different question: Given the measurement of temperature and pressure at this setting, how long do I have until the next boil-over event?
Now you can see the problem. The operator’s question aimed to increase the customer’s profit because, as you recall, more heat meant more cooked pasta at the end of the day.
In contrast, AI was trying to answer a question that was related to operations but not necessarily directly aimed at increasing profits. However, it was a convenient question for our model to answer, so my employer decided it was good enough.
It wasn’t.
My team was akin to the protagonist in that (in)famous joke about a drunken man looking for his keys:
In the middle of the night, a drunken man is crawling on his hands and knees underneath a streetlight, intently looking for something. A passerby stops to help.
Passerby: “What did you lose?”
Drunk: “My keys.”
Passerby: “Where did you lose them?”
Drunk: “Over there in the bushes.”
Passerby: “Then why are you looking for them here?”
Drunk: “Because here, under the streetlight, I can see what I’m doing!”
If your AI model is trying to answer a question not directly related to maximizing profits but instead is answering a data science question, walk away. If your team insists on looking under a streetlight only because it’s the only place they can see what they are doing, run.
For a detailed write-up on modeling inputs and outputs with a digital twin so that your AI can answer the question your customer actually cares about, see Chapter 4, “Digital Twin—Digital Representation of the Physical Components of Your System.”
Each of the 1,000+ plants where my company’s pots were installed was remote and not readily accessible for user research. As a result, our team made all kinds of assumptions, most of them wrong, that could have been cleared up within an hour of seeing the situation for ourselves.
Recall that our AI was trying to answer the question: Given the measurement of temperature and pressure, how long do I have until the next boil-over event?
Our subject-matter expert (SME) told us that the only two sensor readings available to AI for modeling were
Temperature
Pressure
Thus, our AI-driven system’s digital twin model looked like Figure 1.1. (I will cover digital twins in detail in Chapter 4.)
Figure 1.1 Digital twin of an AI model of the process
Anyone who has ever boiled a covered pot of pasta knows that overboiling is not gradual—it is fast, explosive, and messy. The best way to avoid overboiling the pasta is to look at the surface of the liquid, not at the pressure or temperature.
After many months of toiling at the problem and failing, my team and I discovered that the human operator had an additional sensor that ensured their success: They could look through a small glass window onto the boiling surface of the pot and visually ascertain how the boiling was performing. It was like having a transparent glass lid on your pasta pot—a great help in avoiding overboiling accidents!
Thus, the digital twin of a real-life process looked like Figure 1.2.
Sure enough, the company’s SME knew about this “boiling surface window.” Still, he thought it was not essential to tell us about it because the visual of the boiling surface could not be easily instrumented with a sensor. The other two sensors (temperature and pressure) were convenient numbers already instrumented on every pot, and you could easily feed the readings into the AI model. The visual of the boiling surface was not instrumented. It was “messy,” and it took a trained human to be good at judging whether the pot was about to overboil by looking at the surface of the liquid.
As a result, our AI model had no chance in hell of solving this problem.
Even a single field research session would have told us that we had no chance of success without instrumenting a sensor on the “boiling surface window.” Unfortunately, the leadership deemed such a session unnecessary and over budget.
Figure 1.2 Digital twin of a real-life process
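To make the gap explicit, here is a minimal sketch contrasting the digital twin we actually modeled with the digital twin of the real-life process. The Python form, class names, and field names are my own illustration, not the book’s notation (the book expresses digital twins as diagrams).

```python
from dataclasses import dataclass

@dataclass
class AIModelTwin:
    """Inputs our AI model actually received (Figure 1.1)."""
    temperature: float  # instrumented sensor, easy to feed to the model
    pressure: float     # instrumented sensor, easy to feed to the model

@dataclass
class RealProcessTwin:
    """Inputs the human operator actually used (Figure 1.2)."""
    temperature: float
    pressure: float
    boiling_surface_view: str  # the uninstrumented "window" the SME omitted

# The difference between the two twins is the missing input. No amount of
# modeling can recover a signal that was never collected in the first place.
```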
If you do not have a well-run research program that will help you connect directly with your customers, walk away. If you cannot conduct even a single in-person, on-site interview with your target customers, run.
You can read more about the “new normal” of conducting research for AI-driven projects in Part 3, “Research for AI Projects.”
To summarize, the failure of this AI project came down to the following:
Trying to replace a trained expert with an AI
Forgetting to analyze costs vs. benefits
Not getting the ML training data
Not paying careful attention to the question your AI model was answering
Not doing any user research because we had an SME
Hindsight is 20/20—looking back allows you to see clearly all the ways your team screwed up. While it’s uncomfortable, this learning is essential if we aim to improve and avoid the same mistakes in the future. I hope that by reading about our mistakes, you can avoid making some mistakes of your own. I suggest you write down the five principles in this study and tape them above your monitor so they remain top of mind as you work on your own AI-driven projects.
Subsequent chapters throughout the book will explore these challenges further, provide additional real-life examples, and explain how to avoid critical pitfalls in your project. In the next chapter, I will share another real-life story about a project that initially tried to replace a trained expert with AI. Fortunately, my team and I successfully turned the project around by reframing the problem through strategic user research.
1. Vartak, M. (2023). Achieving next-level value from AI by focusing on the operational side of machine learning. Forbes Technology Council. www.forbes.com/sites/forbestechcouncil/2023/01/17/achieving-next-level-value-from-ai-by-focusing-on-the-operational-side-of-machine-learning