AI for Good - Juan M. Lavista Ferres - E-Book

Description

FOREWORD BY BRAD SMITH, VICE CHAIR AND PRESIDENT OF MICROSOFT

Discover how AI leaders and researchers are using AI to transform the world for the better

In AI for Good: Applications in Sustainability, Humanitarian Action, and Health, a team of veteran Microsoft AI researchers delivers an insightful and fascinating discussion of how one of the world's most recognizable software companies is tackling intractable social problems with the power of artificial intelligence (AI). In the book, you’ll see real in-the-field examples of researchers using AI with replicable methods and reusable AI code to inspire your own uses.

The authors also provide:

  • Easy-to-follow, non-technical explanations of what AI is and how it works
  • Examples of the use of AI for scientists working on mitigating climate change, showing how AI can better analyze data without human bias, remedy pattern recognition deficits, and make use of satellite and other data on a scale never seen before so policy makers can make informed decisions
  • Real applications of AI in humanitarian action, whether in speeding disaster relief with more accurate data for first responders or in helping address populations that have experienced adversity with examples of how analytics is being used to promote inclusivity
  • A deep focus on AI in healthcare where it is improving provider productivity and patient experience, reducing per-capita healthcare costs, and increasing care access, equity, and outcomes
  • Discussions of the future of AI in the realm of social benefit organizations and efforts
Beyond the work of the authors, contributors, and researchers highlighted in the book, AI for Good begins with a foreword from Microsoft Vice Chair and President Brad Smith. There, Smith details Microsoft's rationale behind the creation of, and continued investment in, the AI for Good Lab. The vision is one of hope: AI saving lives in disasters, improving healthcare globally, and Microsoft's mission of making sure AI's benefits are available to all.

An essential guide to impactful social change with artificial intelligence, AI for Good is a must-read resource for technical and non-technical professionals interested in AI’s social potential, as well as policymakers, regulators, NGO professionals, and non-profit volunteers.


Page count: 514

Publication year: 2024




Table of Contents

Cover

Table of Contents

Title Page

Foreword

Introduction

A Call to Action

Part I: Primer on Artificial Intelligence and Machine Learning

Chapter 1: What Is Artificial Intelligence and How Can It Be Used for Good?

What Is Artificial Intelligence?

What If Artificial Intelligence Were Used to Improve Societal Good?

Chapter 2: Artificial Intelligence: Its Application and Limitations

Why Now?

The Challenges and Lessons Learned from Using Artificial Intelligence

Large Language Models

Chapter 3: Commonly Used Processes and Terms

Common Processes

Commonly Used Measures

The Structure of the Book

Part II: Sustainability

Chapter 4: Deep Learning with Geospatial Data

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 5: Nature-Dependent Tourism

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 6: Wildlife Bioacoustics Detection

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 7: Using Satellites to Monitor Whales from Space

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 8: Social Networks of Giraffes

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 9: Data-driven Approaches to Wildlife Conflict Mitigation in the Maasai Mara

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 10: Mapping Industrial Poultry Operations at Scale

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 11: Identifying Solar Energy Locations in India

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 12: Mapping Glacial Lakes

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 13: Forecasting and Explaining Degradation of Solar Panels with AI

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Part III: Humanitarian Action

Chapter 14: Post-Disaster Building Damage Assessment

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 15: Dwelling Type Classification

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 16: Damage Assessment Following the 2023 Earthquake in Turkey

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 17: Food Security Analysis

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 18: BankNote-Net: Open Dataset for Assistive Universal Currency Recognition

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 19: Broadband Connectivity

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 20: Monitoring the Syrian War with Natural Language Processing

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 21: The Proliferation of Misinformation Online

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 22: Unlocking the Potential of AI with Open Data

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Part IV: Health

Chapter 23: Detecting Middle Ear Disease

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 24: Detecting Leprosy in Vulnerable Populations

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 25: Automated Segmentation of Prostate Cancer Metastases

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 26: Screening Premature Infants for Retinopathy of Prematurity in Low-Resource Settings

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 27: Long-Term Effects of COVID-19

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 28: Using Artificial Intelligence to Inform Pancreatic Cyst Management

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 29: NLP-Supported Chatbot for Cigarette Smoking Cessation

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 30: Mapping Population Movement Using Satellite Imagery

Executive Summary

Why Is This Important?

Methods Used

Findings

Discussion

What We Learned

Chapter 31: The Promise of AI and Generative Pre-Trained Transformer Models in Medicine

What Are GPT Models and What Do They Do?

GPT Models in Medicine

Conclusion

Part V: Summary, Looking Forward, and Additional Resources

Epilogue: Getting Good at AI for Good

Communication

Data

Modeling

Impact

Conclusion

Key Takeaways

AI and Satellites: Critical Tools to Help Us with Planetary Emergencies

Amazing Things in the Amazon

Quick Help Saving Lives in Disaster Response

Additional Resources

Endnotes

Acknowledgments

About the Editors

About the Authors

Microsoft's AI for Good Lab

Collaborators

Index

Copyright

Dedication

End User License Agreement

List of Tables

Chapter 5

Table 5.1 Tourism expenditure and visit estimates for beach and coral reef–d...

Chapter 6

Table 6.1 Text prompts to perform zero-shot classification for each benchmar...

Table 6.2 Experiment Results on Text Prompts

Chapter 12

Table 12.1 Comparison of the Performance of Models Using Labeled Recent Imag...

Table 12.2 Comparison of the Performance of Models Using Imagery Labeled in ...

Chapter 17

Table 17.1 Questions and Weights Used to Create Reduced Coping Strategies In...

Table 17.2 Measures Used to Evaluate the Performance of the Model, Their Cal...

Table 17.3 Basic Household Demographics and Livelihood Information

Table 17.4 Performance Scores for the Different Models

Chapter 20

Table 20.1 Typology Used to Classify Events in Syria's Civil War.

Table 20.2 Accuracy of the Four Classification Models

Chapter 22

Table 22.1 Number of Data Assets in Each of the Top Ten Open Data Sites

Table 22.2 Top Ten Data File Formats on GitHub

Table 22.3 Number of Data Assets on GitHub by Research Category

Table 22.4 Percentage of Data Assets Classified by License Category

Table 22.5 Four GitHub Account Owners that Have Published Over 40 Million Da...

Chapter 23

Table 23.1 Summary of Internal and External Performance Between Cohorts to D...

Table 23.2 Summary of Pooled Performance for Differentiating Normal from Abn...

Chapter 26

Table 26.1 Comparison of the Performance of Our Models with Pediatric Ophtha...

Chapter 27

Table 27.1 ICD-10 Codes That Were Observed in a Significantly Higher Proport...

Table 27.2 ICD-10 Codes That Were Significantly Overrepresented in Each of T...

Chapter 28

Table 28.1 Comparison of the EBM Model (With and Without CFA) to Recommended...

Table 28.2 Comparison of the EBM Model to the MOCA Model, Including CFA Resu...

Chapter 29

Table 29.1 Comparisons of Natural Language Processing Model Answers to Three...

List of Illustrations

Chapter 2

Figure 2.1 A scatterplot of home sales price and square footage, with a tren...

Figure 2.2 The drop in the cost of a hard disk per gigabyte, 1960 to 2015.

Figure 2.3 Proportion of left-handed people in the population.

Figure 2.4 The relationship between day temperature and ice cream sales.

Figure 2.5 Directed acyclic graph for three different scenarios with the sam...

Figure 2.6 A phone rocker device (Amazon.com, Inc.).

Figure 2.7 Trends in public interest: the surge of AI curiosity.

Chapter 4

Figure 4.1 An example of heterogenous geospatial layers collected from diffe...

Figure 4.2 Examples of geospatial data that have different projections and s...

Chapter 5

Figure 5.1 The data collection, processing, modeling, and mapping process (F...

Figure 5.2 For five Eastern Caribbean countries, specific use intensity maps...

Chapter 6

Figure 6.1 Contrasting conventional supervised learning, contrastive audio-l...

Figure 6.2 Illustration of fixed window sound event existence classification...

Chapter 7

Figure 7.1 Trends in the type, number, and mean spatial resolution of satell...

Figure 7.2 Examples of how high-resolution panchromatic imagery and lower-re...

Chapter 8

Figure 8.1 Giraffe in Tanzania, August 2023 (A, V, L Lavista-Guzman).

Figure 8.2 Both static (left) and dynamic (right) network clustering demonst...

Figure 8.3 All observations and connections for Masai giraffes in the Tarang...

Figure 8.4 Transition rates among super-communities for all giraffes, male g...

Figure 8.5 Mean distances moved from first detection (km) for male and femal...

Figure 8.6 Spatial distribution of movements from the first position for eve...

Chapter 9

Figure 9.1 The region of interest (left), train/validation splits, and label...

Figure 9.2 Overview of our methodology.

Figure 9.3 The original two images (left/right), train/validation splits (gr...

Chapter 10

Figure 10.1 The labeled Delmarva Peninsula dataset.

Figure 10.2 Examples of false positives (pictures on the left) and a true ne...

Figure 10.3 A map of poultry CAFOs for the continental United States (Robins...

Figure 10.4 Modeled predictions of poultry CAFOs in counties in Delaware and...

Chapter 11

Figure 11.1 Methods used (Anthony Ortiz et al. 2022 / Springer Nature / CC B...

Figure 11.2 Map of solar farm locations in India.

Figure 11.3 Map of alternative land uses.

Chapter 12

Figure 12.1 Examples of two growing glacial lakes at two time points. The hi...

Figure 12.2 Demonstration of the different models' predictions against groun...

Figure 12.3 A volcano plot of estimated glacial lake area changes. Lakes at ...

Chapter 13

Figure 13.1 Overview of DeepDeg: a) The initial hours of degradation of a ph...

Figure 13.2 Forecasting cross-validation: a) High-accuracy degradation predi...

Figure 13.3 Held-out test set: a) molecular structure along with the device ...

Chapter 14

Figure 14.1 In this Siamese U-Net model architecture, pre- and post-disaster...

Figure 14.2 Pre- and post-disaster imagery samples of different disasters fr...

Figure 14.3 Imagery samples with polygons showing building edges and greysca...

Figure 14.4 Full screenshots of pre- and post-disaster images of the 2021 Ha...

Chapter 15

Figure 15.1 An overview of the model development process (Nasir M et al., 20...

Figure 15.2 Examples of our model's results: predicted label from the model ...

Figure 15.3 An overview of the risk score modeling pipeline (Nasir M et al.,...

Chapter 16

Figure 16.1 Cities on which we focused our analysis (Robinson C et al., 2023...

Figure 16.2 Building damage assessment and affected population estimate work...

Figure 16.3 Extensive damage around the Culture Park area of Kahra...

Figure 16.4 Several destroyed apartment buildings in Türkoğlu. G...

Figure 16.5 Demonstration of damage in Nurdağı. Greytones indica...

Figure 16.6 A set of nine destroyed five-story apartment buildings in downto...

Chapter 17

Figure 17.1 Preparation of the machine learning dataset where independent va...

Figure 17.2 Overall process used in the study. HH means households (Gholami ...

Figure 17.3 Variation in Reduced Coping Strategies Index scores (Gholami S e...

Figure 17.4 Shapley Additive Explanations (SHAP) analysis results, overall (...

Figure 17.5 A map of the location of households for which the analyses were ...

Chapter 18

Figure 18.1 Overview of the BankNote-Net dataset, including a) the total num...

Figure 18.2 An overview of the approach we used. We used a convolutional neu...

Figure 18.3 Clustering of individual currencies, which improves predictive p...

Chapter 19

Figure 19.1 Process for estimating the error introduced by differential priv...

Figure 19.2 Map of the United States by ZIP Code with indicators of broadban...

Figure 19.3 Measures of error for several population ranges. Expected mean a...

Chapter 21

Figure 21.1 A simplified example of a navigation graph node: Breitbart.com s...

Chapter 22

Figure 22.1 Number of data assets by year added to GitHub from 2013 to 2022 ...

Chapter 23

Figure 23.1 The processes used to assess internal and external performance o...

Chapter 24

Figure 24.1 Model development and testing. (Barbieri RR, et al., 2022 / With...

Figure 24.2 The ten most important features in the model (Barbieri RR, et al...

Chapter 25

Figure 25.1 The impact of adding additional neighboring slices on detection ...

Figure 25.2 Evaluation metrics comparing Dice loss and weighted Dice loss ap...

Figure 25.3 Model performance as a function of different measures of the les...

Chapter 26

Figure 26.1 Procedure followed to capture a video of the retina using a low-...

Figure 26.2 A retinal image as seen in the smartphone-based video collection...

Figure 26.3 An example of our ROP mobile application output after it was use...

Chapter 27

Figure 27.1 Methods for identifying cohorts examined. In A, comorbidities we...

Figure 27.2 Social determinants of health variables that were statistically ...

Chapter 28

Figure 28.1 Two-step clinical management approach based on an AI model witho...

Chapter 29

Figure 29.1 An overview of the QuitBot's architecture for handling freeform ...

Chapter 30

Figure 30.1 Global population projections under different scenarios of femal...

Figure 30.2 Microsoft Building Footprints paired with Planet mosaics.

Figure 30.3 Building density modeling pipeline.

Figure 30.4 Piecewise smoothing of pixel-level building density over time.

Figure 30.5 Local patterns of childhood wasting in low and middle-income cou...

Figure 30.6 Building density estimates in Tindouf, Algeria.

Figure 30.7 Building density growth in the Maldives, 2018–2023.

Figure 30.8 Population map in Nairobi, Kenya, 2020. This image represents th...

Figure 30.9 Population data in Kenya combined with climate data.



AI for Good

Applications in Sustainability, Humanitarian Action, and Health

 

 

 

 

Edited by

Juan M. Lavista Ferres, PhD, MS

William B. Weeks, MD, PhD, MBA

 

 

 

 

 

Foreword

—Brad Smith, Vice Chair and President of Microsoft

In 2018, when we launched Microsoft's AI for Good Lab, our goal was ambitious yet clear: to harness the transformative power of AI to tackle global challenges and enhance lives worldwide. In just five years, I've watched Juan and his team of researchers, subject matter experts, and data scientists do just that as they formulate and advance that mission into inspiring results. Every day, they bring a sense of wonder and optimism to their work, solving our most pressing problems and improving lives at scale across society.

This book begins with a useful primer on what AI is and how it's used in a variety of applications, followed by a series of applied scientific studies. It explores how AI helps today's leading experts take on global challenges in health, human rights, and climate change. At Microsoft, we talk about the many ways AI offers more potential for the good of humanity than any invention that has preceded it. After reading this book, I hope you'll share my optimism for the possibilities of this powerful new technology.

An example of this collaboration is the lab's work on retinopathy of prematurity (ROP), a condition in children who are born prematurely. ROP occurs when premature birth disrupts the normal development of blood vessels in the retina, often leading to injury of the eye. As more premature babies survive thanks to neonatal care, the prevalence of ROP has increased. However, if detected early, laser surgery can save a baby's vision. The challenge is that there are not enough ophthalmologists trained to diagnose and treat ROP, especially in the Global South, where premature births are rising but healthcare infrastructure is lagging.

By combining the expertise of ophthalmologists with AI, we've developed an app that replaces expensive diagnostic machinery with a smartphone camera powered by AI. This innovation enables healthcare workers in remote areas to swiftly diagnose ROP, increasing access to essential interventions. This simple app not only improves access to important healthcare services, but also eases pressure on a system overwhelmed by an increase in demand.

When disasters strike, AI can help save lives. In 2023, the world faced fires, earthquakes, and flooding. The AI for Good Lab and its partners used geospatial tools that combined satellite images with AI models to map where buildings had been damaged or destroyed. These tools put real-time, actionable data into the hands of rescue teams, significantly enhancing their life-saving efforts.

When speed is critical, AI can aggregate, analyze, and share information with local authorities, but it cannot substitute the work of responders who are conducting rescue efforts on the ground. It's this joint effort between technology and decision-makers that reduces the time needed for making life-saving choices and boosts their capacity to act quickly and efficiently.

As these examples show, AI can play a critical role in addressing global challenges. They also reflect our dedication to ensuring that technology's benefits are available to all.

AI's potential reminds me of electricity, another world-changing invention from more than a century ago. In 1882, Thomas Edison's Pearl Street Station generator lit up homes and businesses in New York City. Yet today, more than 140 years later, access to reliable electricity is still out of reach for more than 700 million people around the world. We must avoid creating a similar gap with AI and ensure that it is available to everyone, everywhere.

The responsibility now falls on us to guide AI's evolution responsibly. I am grateful for the innovative, compassionate individuals at the forefront of using AI for good.

Introduction

—William B. Weeks, MD, PhD, MBA

Writing is a lonely endeavor that, to be honest, is draining. Authors put a lot of themselves into writing. Picking the right next word, getting the phrasing correct, and accurately conveying the material all take effort. To be sure, spell-check helps, as do grammatical suggestions. However, writing about technical processes and research findings requires a lot of second-guessing and ego oversight. It is not enough just to get words on paper: someone reading them might follow your suggestions, and if the words are misleading or inaccurate, they could be more harmful than helpful. There is an ethical imperative to get the work right, to revise and check and confirm the work and the words so that they accurately depict what you did as a researcher, what you found, and what the limitations of your findings are.

Nonetheless, I love to write about and conduct research. Because of its challenges, I find the research process and the conveyance thereof to be highly intellectually stimulating and engaging. But more importantly, good research, when shared, can improve the world.

After a 30-year career at Dartmouth Medical School as a professor, teacher, and health services researcher who studied health systems and how people used them, I joined Microsoft. I love working at Microsoft and have had wonderful managers here—Dr. Jim Weinstein during my time at Microsoft Research, and now Dr. Juan M. Lavista Ferres, the co-editor of this book and the leader of the AI for Good Lab, the work of which fills this book. But, further, I think that Microsoft's top leadership—Satya Nadella and Brad Smith, who fund the AI for Good Lab—seek to use their positions to do good in the world.

In teaching classes on the financial and strategic management of healthcare organizations at Dartmouth, I often contrasted the two Latin phrases that express the ethics of business and medicine: “caveat emptor” and “primum non nocere,” respectively. Caveat emptor means “let the buyer beware.” If an organization produces something and sells it and it does not work out for the customer, too bad—the customer should have done due diligence and might even have anticipated that the product was not going to be useful. Medicine has an antithetical ethic: first do no harm. Healthcare providers have a fiduciary responsibility to their patients: they have an ethical obligation to share the risks and benefits of treatment decisions and collaboratively work with patients to tailor care pathways to achieve patients' goals in a way that is consistent with their values.

With a mission “to empower every person and every organization on the planet to achieve more,” Microsoft's ethic aligns more with the medical one than the business one, which is why I like working there. Much like a provider with a patient, Microsoft seeks to have long-term, helpful relationships with its customers, ones in which customers benefit from Microsoft products in ways that are consistent with their goals and values.

Perhaps Microsoft's ethic of empowerment is most evident within Microsoft's AI for Good Lab. Considered part of Microsoft's philanthropic efforts, the Lab seeks to engage largely not-for-profit organizations in one of two ways. First, it provides Azure cloud credits so that organizations with data science expertise can begin work on a particular project without incurring cloud storage and compute expenses. Second, it provides time-limited, project-specific data science expertise to organizations that have data but lack the advanced analytic skills needed to use those data to improve the world. The Lab also engages in work that addresses social problems without a specific not-for-profit collaborator, such as rapidly assessing damage from natural disasters or war, or providing tools that help researchers and policymakers identify where broadband access or health inequities exist in the United States.

That work is presented in this book. By writing it, we seek to show readers, through examples of the Lab's efforts, how artificial intelligence and advanced data science techniques can be used to solve world problems. Part of the reason to write this book, then, is to share knowledge with others who are interested, so that they might learn about the approaches the Lab has used and, hopefully, apply those methods to address still more problems. The world is complex, and we need as many thoughtful, curious, and motivated people working on its problems as possible. We hope this book reaches them.

Moreover, I worry. As a physician, I worry about the world's health and the massive inequities in care access, quality, and outcomes that drive health disparities, within countries and across countries. As an economist, I worry that, unless efficient and effective solutions to some of the most pressing issues in the world (like climate change, humanitarian action, and health equity) are addressed, incentives that drive market behavior will worsen the plight of the disenfranchised. As a father of six and grandfather of four, I worry that the world I leave to my kids and theirs will be a worse one than the one I inherited.

So, my primary reason for writing a book that demonstrates how artificial intelligence and sophisticated data analytics can be used to solve the world's most pressing problems is because I have hope that these technologies can help, and hope assuages worry. The tools that are described herein are not panaceas—just as with a medical intervention, the choice of the tool, the approach, and the target must be clearly described, cautiously applied, and carefully interpreted.

But I am hopeful that these tools, when judiciously, rigorously, and ethically applied, can empower the world's populations to live in more just societies, avoid unnecessary harms that might otherwise befall them, and live healthier, more fruitful, and more fulfilling lives.

A Call to Action

—Juan M. Lavista Ferres, PhD, MS

Jeffrey Hammerbacher, one of the first data scientists at Facebook, once said, “The best minds of my generation are thinking about how to make people click ads.”

When I first came across Hammerbacher's quote, I was leading the metrics team for Bing. Although I wasn't working directly on ads, part of my job was to understand the trade-offs between ads and relevance, so this statement resonated deeply with me.

To be clear, search engines like Bing and Google have immensely enriched society by granting unparalleled access to information and empowering individuals in novel ways. Yet, while recognizing these contributions, the pressing challenges of our times necessitate collective innovation, creative application of new tools and methods, and solutions that attempt to solve those challenges and not just provide information access.

The problems that we seek to address within the AI for Good Lab are foundational to societal improvement. For example, each year, millions of children die before they reach the age of five, with a significant majority of these deaths being entirely preventable. The climate crisis affects hundreds of millions of people, a staggering 1.6 billion people live with severe disabilities, and half of the world's population has inadequate access to high-quality healthcare. The world needs all the help it can get.

Over the past five years, I've had the opportunity to see the remarkable ways in which artificial intelligence and technology can address some of those challenges. While they aren't silver bullets, artificial intelligence and technology can be instrumental in solving specific issues. However, one challenge is that non-profit organizations and governments, which are at the forefront of addressing these problems, often do not have the capacity to attract or retain the artificial intelligence experts that they need to solve them.

Some might find it surprising, but even though predicting which person will click on an ad and predicting which child has a higher risk of infant mortality are vastly different problems in societal terms, from a pure data science standpoint they are essentially the same. If we can apply AI algorithms to optimize ad clicks, why can't we direct some of our best minds and most advanced technologies toward optimizing human life, well-being, and the health of our planet?
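To make that equivalence concrete, here is a minimal sketch, on entirely synthetic data with hypothetical features, of why the two problems look identical to a data scientist: the same logistic-regression pipeline produces a ranked risk score whether the label means "clicked an ad" or "experienced an adverse health outcome."

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic features: these could be session signals for ad-click
# prediction or household covariates for under-five mortality risk.
# The pipeline is identical; only the meaning of the labels differs.
X = rng.normal(size=(500, 3))
true_w = np.array([1.2, -0.7, 0.4])
y = (1 / (1 + np.exp(-(X @ true_w))) > rng.random(500)).astype(float)

# Plain logistic regression fit by batch gradient descent.
w = np.zeros(3)
for _ in range(2000):
    p = 1 / (1 + np.exp(-(X @ w)))       # predicted probabilities
    w -= 0.1 * (X.T @ (p - y)) / len(y)  # gradient of the log-loss

risk = 1 / (1 + np.exp(-(X @ w)))  # P(click) -- or P(adverse outcome)
accuracy = ((risk > 0.5) == (y == 1)).mean()
```

Nothing in the code knows what the label means; the societal stakes live entirely in how `y` was collected and how `risk` is acted upon.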

I am optimistic and strive to leave the world better than I found it—a goal I believe is more widespread than commonly perceived. I'm profoundly grateful to Microsoft for the chance to lead our AI for Good Lab that embraces this very mission.

I'm not going to lie: demonstrating an impact in this complex world is not easy. Over the years, my successes have taught me a lot, but my failures have taught me even more about making a tangible impact. And I've learned a few lessons.

I learned that there is a huge difference between solving a problem on paper and solving one in a real-world setting. One profound realization I had was that, as humans, we are addicted to complexity: we like complex problems and complex projects. This is the reason we sent a person to the moon before we added wheels to our luggage. However, seeking complexity is the wrong approach. If we want to try to impress people and look smart, our solutions will be complex. But if we want to measurably improve the world, our solutions must be simple. And building simple solutions is much harder.

I've learned that, when working on a problem, it is critical to collaborate with subject matter experts. No matter how much we data scientists think we understand a problem, it is crucial to work with those who fundamentally understand the problems we seek to address. Without insight from these experts, we might not understand potential issues with the data, or we might focus our efforts on solutions that are not meaningful, not pragmatic, or unlikely to be implemented.

There is a difference between applied and pure research. In applied research, we focus on a problem and seek the best hammer to solve it. In basic research, we aim to create a new hammer. While both are essential for scientific advancement, one of my most significant realizations is that there's an imbalance between the two: it's not that the world needs fewer hammers, but it certainly needs more people looking to solve real-world problems.

Don’t try to make the problem fit your tools; get yourself the tools that fit the problem.

—Toby Berger

Academia brims with brilliant individuals. But if the currencies of academia—like publications, citations, or the h-index—are the sole objectives of academic work, they might hinder genuine impact. I strongly believe that, even if it is much harder to measure, the impact that our work has on improving society is the right measure.

My career path differs significantly from that of many of my lab colleagues and academic peers. I did not intend to spend a career in research: with a background in computer science, I spent my early years developing software solutions. But the combination of the two disciplines, which occurred during my tenure at Microsoft, gave me insight into the importance of combining the rigor of research with the utility of creating simple, practical, implementable, and scalable solutions.

In my position as the Director of Microsoft's philanthropic AI for Good Lab, I have been able to recruit some of the brightest minds I've ever worked with. I am profoundly grateful to them: they have taught me nearly everything I know about artificial intelligence.

The intent of this book is different from the many texts that aim to teach the fundamentals of artificial intelligence. By sharing real-world examples of how artificial intelligence and advanced data science methods can be applied for good, we seek to inspire the reader to envision new possibilities for impactful change. We want our work to engage readers by exploring pressing questions, igniting a broader conversation about ethically redirecting our technological capabilities for the greater good, and showing that it can be done.

The smartphones we carry in our pockets surpass the computing power that once sent astronauts to the moon.

With artificial intelligence and data at our fingertips, we now have the tools to address the world’s most pressing problems.

We no longer have excuses.

Part I: Primer on Artificial Intelligence and Machine Learning

We intend for this book to be accessible to a broad audience, in particular those without expertise in artificial intelligence or machine learning.

To make the book more accessible to that audience, we have purposely avoided including too many technical details of the studies. We have avoided the use of mathematical formulae and equations. Interested readers can find those in the references at the end of each chapter.

Nonetheless, artificial intelligence and machine learning tools have their own language and applications and limitations; studies applying artificial intelligence and machine learning have common processes, practices, and terms. Therefore, in the following chapters, we provide a primer that defines artificial intelligence in lay terms, describes the applications and limitations of artificial intelligence, provides a brief glossary of commonly used practices and terms used in the application of artificial intelligence, and presents an overview of the structure that we applied to each of the chapters.

Those versed in artificial intelligence and machine learning should feel free to skip this part and jump right into the meat of the studies that we present (each of which is independent and not reliant on knowledge in any of the other chapters). Those wanting an introduction to or a light refresher on artificial intelligence, its application, and its evaluation should probably read this part before reading the studies.

Chapter 1: What Is Artificial Intelligence and How Can It Be Used for Good?

—William B. Weeks

Defining artificial intelligence begs the question of how one defines intelligence. It's a sticky and complex wicket because whole philosophical arguments and schools hang on the definition of intelligence. But, for the purposes of this book, we define intelligence as the ability to learn.

Learning entails acquisition of knowledge and application of that knowledge. Acquisition of knowledge without application is simply data. As humans, we are constantly obtaining data; however, much of the data that we obtain is filtered out. While our brains are purportedly the most complex entities in the universe, were we not able to filter out data, we would become overwhelmed by it. Imagine driving or taking the subway to work, and capturing and cataloguing every sight, smell, innuendo, look, and feeling that you experienced. I would bet that most readers—if thinking back to their last commute to work—would be hard-pressed to remember any details of it. Perhaps if something out of the ordinary happened, that would stand out. Or maybe if you heard your favorite song being sung by a busker in the metro station, that might penetrate your natural filtering system and persist for a while. But we filter out a lot, in large part to maintain the efficiency of our brains.

As humans have evolved, we have increasingly relied on a true/false, good/bad categorization mechanism for data inputs. Such categorizations might have even become embedded in our thinking process, possibly in an inefficient manner: most people shy away from snakes, even though most snakes are not dangerous. Nonetheless, because some snakes are dangerous, for efficiency, we may categorize all snakes as dangerous. And it really doesn't do us much harm, given that humans have very little or no use for snakes. The depiction of snakes as evil in biblical or mythological stories only reinforces our bias against snakes.

But what if some snakes were useful to humans? Given humans' propensity to disregard or abhor snakes, that would be a challenging question to answer. Only a few humans become herpetologists and can, perhaps, achieve the objectivity necessary to study that question. Resisting the biases instilled by natural training and belief systems that have persisted over millennia, these herpetologists might be able to apply their brains in a very focused manner to determine, without filters and bias, which snakes might be useful to humans and why.

Humans can apply their brains to try to avoid filtering and natural propensities to think in impartial ways. The entire scientific process is designed to be impartial. One considers an answerable question (“Might some snakes be useful to humans?”), designs a study to answer the question (“Let's objectively examine and document the potential harms and benefits of human–snake encounters for 100 randomly selected snake species.”), collects data in a formalized manner (“Here is the list of those harms and benefits, with harms and benefits evaluated on a scale of 1 to 5, for each snake.”), analyzes the data (“We conducted a regression analysis that evaluated the anticipated severity of harms and extent of benefits of 100 snakes based on their color, length, species, and whether they were venomous.”), reports the findings (“Ninety-five percent of the snakes we studied were not dangerous to humans; many species of snakes were indirectly useful to humans by reducing populations of harmful rodents and insects, serving as a food source, and inspiring medical treatments.”), and makes overall and policy conclusions (“A minority of snake species are harmful and many snake species are helpful to humans. Let's not categorically kill or be fearful of snakes.”)

The work described is learning. Our herpetologist acquired data, analyzed it, and applied it in a new way that might change behavior and lead to more fruitful encounters with snakes in the future—not to mention generating less anxiety and a sea change in snake-related biases. That learning took a lot of human effort and was limited by the requirements of research. Our herpetologist crafted the question, informed by the literature, on how to classify snakes. They might have been limited by linear analytic techniques that ignored more nuanced interactions between the variables they collected and the relationships between snakes and humans. They might have wanted to inform their study with more data on snakes (which are nocturnal, and which are tree-dwelling?) and human behaviors (what are the characteristics of humans who are harmed by snakes?). But, until recently, models that could be used to facilitate such complex data analytics have not been available.

What Is Artificial Intelligence?

That brings us to artificial intelligence. Artificial intelligence is intelligence—the ability to learn and apply knowledge—that is conducted by machines.

The term artificial intelligence was coined at a summer research project that was held at Dartmouth College in 1956 and attended by researchers from Dartmouth, Harvard, IBM, and Bell Telephone Laboratories. Organized by John McCarthy, a Dartmouth mathematics professor, the objective of the conference was “to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can, in principle, be so precisely described that a machine can be made to simulate it.”

In the decades since, the field of artificial intelligence has cycled through periods of intense interest (known as “AI summers”) and periods of reduced funding and disillusionment (known as “AI winters”). While the lay public might be under the impression that large language models like ChatGPT formed the nidus of the artificial intelligence revolution, today we are surrounded by artificial intelligence algorithms that collect data, analyze it, and learn from it.

Consider your Amazon, or Walmart, or Netflix account. You buy an item. Based on your purchase—and, likely, your age, sex, and your history of other purchases—algorithms might suggest several other items that you might like. By aggregating data on behaviors that include what you buy, what you watch on YouTube, what you like on Facebook, and what others who have similar interests and characteristics buy and watch, these companies can steer you toward purchases or activities that, indeed, you are likely to enjoy. And with each new purchase or view, the algorithms can better learn your preferences and the preferences of those like you and further enhance your interactive experiences.

Artificial intelligence is behind the development, improvement, and maintenance of these algorithms. When automated, the process requires much less human involvement than our herpetologist's effort. Because of its effectiveness in driving sales, artificial intelligence has been largely used in commercial endeavors—collecting data, learning patterns, and deploying that learning to drive sales.

What If Artificial Intelligence Were Used to Improve Societal Good?

That is the premise of Microsoft's philanthropic AI for Good Lab: that these tools, which are very effective at driving commercial behavior, can be used to attain social good through objectively conducted research. They can do so in three ways:

First, the tools can be used to facilitate data collection. While we are more awash in data than at any other time in humanity, sometimes data is challenging to collect in timely, consistent, and objective ways. In this book, we provide examples of how, when trained, artificial intelligence algorithms can use satellite data to identify whales, or poultry farms, or building damage following an earthquake. Without satellite data or acoustic data or the ability to perform identification tasks in an automated way, it would take an inordinate amount of time for humans to collect data or images, classify them, and enter them into datasets, with each step potentially being complicated by human error. Artificial intelligence can expedite data collation, classification, and analysis in a replicable and consistent way, thereby reducing costs and allowing insights from such data to be generated much more rapidly.

Second, artificial intelligence can be used to classify images so that scarce human resources are used more efficiently and effectively. For example, among children, chronic otitis media is the most common cause of deafness and retinopathy of prematurity is the most common cause of blindness. For both conditions, timely treatment could prevent deafness and blindness, but treatment requires scarce medical specialists: otolaryngologists and ophthalmologists, respectively. Trained on images of normal and abnormal tympanic membranes or normal and abnormal retinae, artificial intelligence algorithms can be applied to videos obtained by unskilled community health workers to identify which frames in the video are of adequate quality for assessment, and then determine the probability that the child has chronic otitis media or retinopathy of prematurity. By doing so, only those children likely to benefit from treatment can be referred to the scarce medical resources for intervention: specialists then spend less time screening and more time treating, and more patients get care that prevents lifelong adverse consequences associated with lack of appropriate and timely treatment.

Finally, artificial intelligence can analyze data in novel ways that explore non-linear relationships among a multitude of variables to develop new insights regarding relationships between those variables and outcomes of interest. In part constrained by data availability and in part by limited computing power, statistical methods used to analyze data have focused on examining linear and discrete relationships between variables and outcomes of interest. There is inherent bias in those traditional approaches: researchers like our herpetologist need to identify the limited variables that they need to collect to conduct their studies and then assume relationships between particular snake characteristics and their potential harms or benefits. However, just like humans' interactions with one another, relationships are complex, non-linear, highly interrelated, contextually determined, and time-dependent. Artificial intelligence algorithms can incorporate a virtually infinite number of variables to identify their individual and collective relationships to outcomes of interest.

Make no mistake—artificial intelligence is no panacea. As the following chapter suggests, there are rules that must be followed in its application and there are limits to how it can be used.

Nonetheless, artificial intelligence holds great promise for addressing the world's most pressing social problems. In this book, we give examples of how Microsoft's philanthropic AI for Good Lab uses artificial intelligence and other advanced data science techniques to conduct research in the areas of sustainability, humanitarian action, and health to generate innovative uses of existing data, novel insights, and new diagnostic pathways that can be used to improve life and address the world's biggest problems.

Chapter 2: Artificial Intelligence: Its Application and Limitations

—Juan M. Lavista Ferres

In many ways, “artificial intelligence” is a marketing term that has evolved over time. Even back in the 1970s, some applications that had a few rules were considered artificial intelligence. Today, there are many debates about what qualifies as artificial intelligence. In practical terms, most of the time when we talk about artificial intelligence, we are referring to machine learning.

So, what is machine learning? It is a technique involving algorithms that transform data into rules. In conventional programming, humans use their intelligence and knowledge to create rules expressed in software code. In contrast, machine learning relies on data and success criteria, using the data to generate rules that optimize those criteria.

Many say that data is the new oil. In reality, data has nothing to do with oil; data is the new code.

Machine learning methods are not new: they are older than computers. We didn't refer to them as machine learning in the past, but as statistical methods. This includes techniques such as linear regression, logistic regression, and linear discriminant analysis (LDA). Some of these methods, like linear regression, date back to the early 19th century.

As an example, imagine you want to predict the price of a house. A basic model might use just square footage. By collecting data on square footage and sale prices, you can plot them on a chart (see Figure 2.1). The machine learning model then attempts to fit a curve that best describes the relationship between the two, while minimizing errors.

Figure 2.1 A scatterplot of home sales price and square footage, with a trend line.

It is important to understand that no matter how sophisticated a machine learning model becomes, its primary objective is to fit a curve. You might ask, “Isn't this something humans can do?” While it's true that humans can conceptualize problems with a small number of dimensions, doing so becomes impossible when dealing with thousands or millions of dimensions: our brains simply cannot handle it. However, given the right data, sufficient computing power, and time, machine learning algorithms can try to find a solution.

In this book, we often use AI algorithms and machine learning interchangeably, although it's worth noting that machine learning is a specific subset of AI focused on data-driven learning and prediction. A model in machine learning is the outcome of training an algorithm on data, where the algorithm learns and encapsulates this knowledge in its parameters. For instance, in a linear model y=ax+b, a and b are parameters that the algorithm learns from the data. Once these parameters are determined, they define the actual model.
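To make "learning parameters from data" concrete, here is a minimal sketch that fits y = ax + b by ordinary least squares, in the spirit of the house-price example above. The (square footage, sale price) pairs are invented for illustration; they are not the data behind Figure 2.1.

```python
# A minimal sketch of fitting y = a*x + b by ordinary least squares.
# The (square footage, sale price) pairs are invented for illustration.
data = [(800, 160_000), (1_200, 230_000), (1_500, 280_000),
        (2_000, 370_000), (2_400, 440_000)]

n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Closed-form solution: a = cov(x, y) / var(x), b = mean_y - a * mean_x.
a = (sum((x - mean_x) * (y - mean_y) for x, y in data)
     / sum((x - mean_x) ** 2 for x, _ in data))
b = mean_y - a * mean_x

def predict(sqft):
    """Apply the learned model to a new square footage."""
    return a * sqft + b

print(f"learned parameters: a = {a:.1f}, b = {b:.1f}")
print(f"predicted price for 1,800 sq ft: ${predict(1_800):,.0f}")
```

The learned a and b are the model: once training determines them, prediction is just evaluating the curve.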

Why Now?

Despite the fact that the term was coined almost 70 years ago, only recently has artificial intelligence become a universal topic. So, the pivotal question is:

Why now?

Two critical ingredients for machine learning and artificial intelligence are data and processing power. The accessibility and cost-effectiveness of both have changed dramatically over the years. Moore's Law anticipated the dramatic increase in processing power per dollar, and the cost of data storage has fallen just as dramatically (see Figure 2.2). The impact of greater processing power and cheaper data storage on the capacity to use artificial intelligence is profound.

Figure 2.2 The drop in the cost of a hard disk per gigabyte, 1960 to 2015.

In the 1990s, storing 1 gigabyte of data—roughly equivalent to one minute of 4K video—cost about $10,000. Storing a two-hour movie that we casually watch at home today would have required a $1.2 million investment. In 2023, a 10-terabyte hard disk (10,000 gigabytes) cost about $118, reducing the storage cost of that same movie to about $1.40.

Processing power has followed a parallel trajectory. Since Gordon Moore's initial observations in 1965, computational capacity has doubled roughly every 18 months. Comprehending the impact of this exponential growth can be challenging. For perspective, in 2004, the most powerful supercomputer was the NEC Earth Simulator, which cost $750 million. By 2023, an NVIDIA GPU card with three times the processing power of the Earth Simulator cost around $3,000.

The Power of Artificial Intelligence

The power of artificial intelligence stems from its ability to use data to solve problems, allowing us to address challenges that were once too intricate for traditional computer programs. In the 1990s, teams of software developers dedicated years to building systems that could understand handwriting. Yet today, with open datasets like MNIST and artificial intelligence libraries like PyTorch, a software developer can build a system to recognize handwritten numbers using just ten lines of code.
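To make the contrast with conventional programming concrete, here is a toy sketch in the same spirit: instead of hand-coding rules for each digit, a tiny nearest-neighbor classifier derives its behavior entirely from labeled example bitmaps. The 5×3 bitmaps are invented for illustration and are, of course, a far cry from the MNIST/PyTorch pipeline described above.

```python
# A toy nearest-neighbor "digit recognizer": its behavior comes entirely
# from labeled example bitmaps, not from hand-written rules.
# The 5x3 bitmaps below are invented for illustration (this is not MNIST).
EXAMPLES = {
    "0": "111" "101" "101" "101" "111",
    "1": "010" "110" "010" "010" "111",
    "7": "111" "001" "010" "010" "010",
}

def classify(bitmap):
    """Return the label whose example bitmap differs in the fewest pixels."""
    def distance(a, b):
        return sum(p != q for p, q in zip(a, b))
    return min(EXAMPLES, key=lambda label: distance(EXAMPLES[label], bitmap))

# A "7" with one pixel flipped is still recognized correctly.
noisy_seven = "111" "001" "010" "011" "010"
print(classify(noisy_seven))  # prints "7"
```

Adding a new digit requires only a new labeled example, not new logic; that is the essence of rules coming from data rather than from code.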

Moreover, there are problems wherein artificial intelligence is the only viable solution. Worldwide, over 450 million people have diabetes, and with changing dietary habits, this number will rise. Up to 21 percent of diabetics have retinopathy—a disease of the retina that impairs vision—when they are first diagnosed; over time, most diabetics will develop retinopathy. And diabetic retinopathy is the primary cause of blindness globally.

Early detection and treatment of diabetic retinopathy can prevent vision loss; however, with only about 200,000 ophthalmologists available in the world to diagnose and treat this growing number of diabetics, it is logistically impossible for every diabetic to be screened for timely treatment.

The good news is that researchers, including those in Microsoft's AI for Good Lab, have developed artificial intelligence models that can detect diabetic retinopathy as accurately as top-tier ophthalmologists. In situations like these, where there simply are not enough medical professionals to screen patients, artificial intelligence isn't merely beneficial—it's the only viable solution. Its potential isn't limited to this condition; artificial intelligence algorithms can also help identify patients at risk for other diseases, improving the efficiency of healthcare providers around the world.

Artificial intelligence enables us to tackle problems that traditional programming couldn't solve and promises worldwide scalability.

The Challenges and Lessons Learned from Using Artificial Intelligence

While the power of artificial intelligence lies in its ability to learn from data, the quality and integrity of the data are paramount to its ethical and effective application. Here, I discuss nine lessons and challenges of applying artificial intelligence to real-world problems.

Models Can Be Fooled by Bias

Just as faulty code can produce inaccurate study results, biased or erroneous data can distort artificial intelligence applications.

In 1991, Halpern and Coren analyzed a random sample of individuals who had died and asked their family members whether the deceased were left- or right-handed. The researchers uncovered a concerning finding: on average, left-handed people died nine years earlier than their right-handed counterparts. This study was published in the New England Journal of Medicine, one of the world's most prestigious medical journals. However, Halpern and Coren failed to consider changes in social norms with regard to left-handedness. In the early 1900s, many parents forced their left-handed children to use their right hands. This social norm artificially reduced the percentage of left-handed people in the population. From 1920 to 1950, parents gradually abandoned this practice (see Figure 2.3), so the proportion of the population reporting themselves as left-handed rose considerably. As a result, the older people in the sample of deaths came from cohorts in which reported left-handedness was rare, so the left-handed deceased were disproportionately young. This cohort artifact gave the illusion that left-handed individuals died younger, when in reality that was not the case.

Figure 2.3 Proportion of left-handed people in the population.

These findings have social ramifications: if a life insurance company were to use handedness as a feature in its actuarial models, it could erroneously conclude that left-handed people are more likely to die younger and might unjustly charge them more for life insurance policies.

The point is that models trained on data are only as good as the data they use, and using data without questioning and understanding it can lead to flawed analytical output; output that can harm people. A significant portion of the data we collect has some bias; if we don't understand these potential biases, our models will not be correct.
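The cohort artifact behind the handedness result is easy to reproduce. The sketch below (with stylized prevalence numbers invented for illustration) gives left- and right-handers identical lifespan distributions, varies only the reported prevalence of left-handedness by birth year, samples deaths around 1990 as the study did, and still "finds" that left-handers die younger.

```python
import random

random.seed(0)

# Stylized prevalence of *reported* left-handedness by birth year:
# about 3% for people born before 1900, rising to ~11% by 1950.
# (Numbers invented for illustration.)
def prevalence(birth_year):
    return min(0.11, 0.03 + 0.0016 * max(0, birth_year - 1900))

# Both groups get the SAME lifespan distribution; handedness has no
# effect whatsoever on how long anyone lives.
deaths = []
for _ in range(300_000):
    birth_year = random.randint(1890, 1970)
    lifespan = random.gauss(72, 10)
    if 1988 <= birth_year + lifespan <= 1992:  # deaths sampled around 1990
        lefty = random.random() < prevalence(birth_year)
        deaths.append((lefty, lifespan))

def mean(xs):
    return sum(xs) / len(xs)

left = mean([age for lefty, age in deaths if lefty])
right = mean([age for lefty, age in deaths if not lefty])
print(f"mean age at death, left-handed:  {left:.1f}")
print(f"mean age at death, right-handed: {right:.1f}")
# Left-handers appear to die a few years younger, purely because the
# older cohorts in the death sample contain few reported left-handers.
```

No causal link exists anywhere in the simulation; the gap is produced entirely by the bias in the data.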

Predictive Power Does Not Imply Causation

People often say, “Correlation does not imply causation.” This means that just because two things correlate—consistently move in the same or opposite directions—doesn't mean one causes the other. In machine learning, it is important to understand that predictive power also does not imply causation. Simply because a machine learning model can predict outcomes based on certain data doesn't indicate that those data cause the predicted outcome.

Consider this analogy: imagine you are inside a windowless building. If people come in carrying umbrellas, you might predict that it's raining outside. However, the presence of umbrellas does not cause the rain.

Supervised machine learning is the process of mapping inputs to outputs using pairs of input and output data. This approach's predictive power is crucial in many scenarios. For instance, when you're driving and running late for a meeting, the ability of an algorithm to accurately predict traffic congestion is crucial. In such cases, the underlying cause of the traffic—whether it's a baseball game or road construction—is irrelevant. However, in other contexts, like cancer research, it's vital not only to predict whether a person will develop cancer but also to understand the causes of cancer.

It is important to note that machine learning, by itself, does not uncover causality. While it can help us identify potential clues, causality cannot be directly inferred from the data alone. Being able to predict an outcome does not automatically grant us insight into the factors that cause it.

Unfortunately, many people fail to understand the difference between causality and predictive power. A few years ago, Gallup conducted a survey with a simple question: “Do you believe correlation implies causation?” Surprisingly, 64 percent of Americans answered “yes.”

For example, in constructing a machine learning model to predict drownings, variables such as season, day temperature, and the number of people swimming possess predictive power. At the same time, ice cream sales, which depend on day temperature, also show a high correlation with both day temperature and drownings, thus demonstrating predictive capability. However, it is important to recognize that not all these correlations imply a causal relationship with drownings (see Figure 2.4).

Figure 2.4 The relationship between day temperature and ice cream sales.

This relationship demonstrates that, in the absence of direct information, if our objective is to predict drownings, we can build a model based on ice cream sales, and it will show predictive power. Of course, this doesn't imply causation: banning ice cream sales would not prevent a single drowning; it would merely break the correlation, rendering ice cream sales useless as a predictor.
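A quick simulation makes this concrete. In the sketch below (all coefficients invented for illustration), temperature causally drives both ice cream sales and the number of swimmers, drownings depend only on swimmers, and yet ice cream sales end up strongly correlated with, and therefore predictive of, drownings.

```python
import random

random.seed(1)

# Simulated daily data: temperature drives BOTH ice cream sales and
# swimming; drownings depend only on how many people swim.
# Ice cream never causes drowning. (Coefficients invented for illustration.)
days = []
for _ in range(1_000):
    temp = random.uniform(0, 35)                # degrees Celsius
    ice_cream = 5 * temp + random.gauss(0, 20)  # sales driven by temperature
    swimmers = 40 * temp + random.gauss(0, 100) # swimming driven by temperature
    drownings = 0.001 * swimmers + random.gauss(0, 0.3)
    days.append((ice_cream, drownings))

def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x, _ in pairs)
    vy = sum((y - my) ** 2 for _, y in pairs)
    return cov / (vx * vy) ** 0.5

# Strong correlation -> real predictive power, zero causation.
print(f"corr(ice cream sales, drownings) = {pearson(days):.2f}")
```

The confounder (temperature) creates the correlation; a model trained on ice cream sales would genuinely predict drownings without ice cream causing any of them.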

In summary, it is critical to understand that correlation and predictive power do not necessarily imply causation. This does not mean that such information is not useful; rather, predicting an event is distinctly different from understanding its causes.

A common misconception, particularly prevalent in the computer science community, is the belief that sophisticated machine learning techniques can directly reveal cause-and-effect relationships. However, no matter the level of sophistication in data analysis or machine learning models, data itself does not inherently encode causality, thus hindering the direct extraction of causal relationships. For example, Figure 2.5 illustrates a directed acyclic graph in three distinct scenarios, each resulting in the same data output for variables A and B.

Figure 2.5 Directed acyclic graph for three different scenarios with the same data output for A and B.

This doesn't mean that AI cannot aid in understanding causality. While we cannot directly infer causality from the data, AI can significantly assist in formulating causal hypotheses. Testing those hypotheses, however, requires combining AI with human expertise, domain knowledge, and a rigorous scientific approach.
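The point about Figure 2.5 can be demonstrated numerically. The sketch below generates (A, B) pairs under three different causal structures: A causes B, B causes A, and a hidden common cause C drives both. The coefficients are chosen (for illustration) so that all three produce the same bivariate distribution, so the observed correlation cannot distinguish them.

```python
import random

random.seed(2)

N = 50_000
r = 0.6  # identical A-B correlation in all three scenarios

def sample_a_causes_b():
    a = random.gauss(0, 1)
    b = r * a + (1 - r ** 2) ** 0.5 * random.gauss(0, 1)
    return a, b

def sample_b_causes_a():
    b = random.gauss(0, 1)
    a = r * b + (1 - r ** 2) ** 0.5 * random.gauss(0, 1)
    return a, b

def sample_common_cause():
    c = random.gauss(0, 1)  # hidden confounder C, never observed
    a = r ** 0.5 * c + (1 - r) ** 0.5 * random.gauss(0, 1)
    b = r ** 0.5 * c + (1 - r) ** 0.5 * random.gauss(0, 1)
    return a, b

def corr(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x, _ in pairs)
    vy = sum((y - my) ** 2 for _, y in pairs)
    return cov / (vx * vy) ** 0.5

observed = {}
for name, sampler in [("A -> B", sample_a_causes_b),
                      ("B -> A", sample_b_causes_a),
                      ("A <- C -> B", sample_common_cause)]:
    observed[name] = corr([sampler() for _ in range(N)])
    print(f"{name}: correlation = {observed[name]:.2f}")
```

All three scenarios yield the same correlation of about 0.6: the data alone cannot tell us which arrow, if any, is real.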

AI Algorithms Can Discriminate

When AI learns from historical data, it is susceptible to the same biases and prejudices that might have existed in the past. In the 1980s, St. George's Hospital Medical School in London experienced this firsthand when an algorithm used to recommend hospital admission was found to be biased against women and racial minorities. The algorithm didn't introduce new biases; it merely mirrored existing ones. It is important to scrutinize and test algorithms for biases, especially when they can cause harm.

While a model can be predictive, it can still be unfair. The likelihood of a person repaying a loan is heavily influenced by their disposable income. It is well understood that, regardless of race or ethnicity, individuals with the same disposable income tend to have similar default rates. However, in countries like the United States, the distribution of disposable income varies based on race or ethnicity. Measuring disposable income is complex, as it requires individuals to disclose details about their spending habits. In contrast, determining someone's race or ethnicity is typically straightforward.