The Handbook of Data Science and AI - Stefan Papp - E-Book

Description

Data Science, Big Data, and Artificial Intelligence are currently some of the most talked-about concepts in industry, government, and society, and yet also the most misunderstood. This book will clarify these concepts and provide you with practical knowledge to apply them. Featuring:

- A comprehensive overview of the various fields of application of data science
- Case studies from practice to make the described concepts tangible
- Practical examples to help you carry out simple data analysis projects
- BONUS in print edition: E-Book inside

The book approaches the topic of data science from several sides. Crucially, it will show you how to build data platforms and apply data science tools and methods. Along the way, it will help you understand - and explain to various stakeholders - how to generate value from these techniques, such as applying data science to help organizations make faster decisions, reduce costs, and open up new markets. Furthermore, it will bring fundamental concepts related to data science to life, including statistics, mathematics, and legal considerations. Finally, the book outlines practical case studies that illustrate how knowledge generated from data is changing various industries over the long term.

Covers these current topics:

- Mathematics basics: Mathematics for Machine Learning to help you understand and utilize various ML algorithms.
- Machine Learning: From statistical to neural and from Transformers and GPT-3 to AutoML, we introduce common frameworks for applying ML in practice
- Natural Language Processing: Tools and techniques for gaining insights from text data and developing language technologies
- Computer vision: How can we gain insights from images and videos with data science?
- Modeling and Simulation: Model the behavior of complex systems, such as the spread of COVID-19, and perform what-if analyses covering different scenarios.
- ML and AI in production: How to turn experimentation into a working data science product?
- Presenting your results: Essential presentation techniques for data scientists




Stefan Papp, Wolfgang Weidinger, Katherine Munro, Bernhard Ortner, Annalisa Cadonna, Georg Langs, Roxane Licandro, Mario Meir-Huber, Danko Nikolić, Zoltan Toth, Barbora Vesela, Rania Wazir, Günther Zauner

The Handbook of Data Science and AI

Generate Value from Data with Machine Learning and Data Analytics

Distributed by:
Carl Hanser Verlag
Postfach 86 04 20, 81631 Munich, Germany
Fax: +49 (89) 98 48 09
www.hanserpublications.com
www.hanser-fachbuch.de

The use of general descriptive names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone. While the advice and information in this book are believed to be true and accurate at the date of going to press, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

The final determination of the suitability of any information for the use contemplated for a given application remains the sole responsibility of the user.

All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying or by any information storage and retrieval system, without permission in writing from the publisher.

© Carl Hanser Verlag, Munich 2022
Cover concept: Marc Müller-Bremer, www.rebranding.de, Munich
Cover design: Max Kostopoulos
Cover image: © gettyimages.de/ValeryBrozhinsky

Print ISBN: 978-1-56990-886-0
E-Book ISBN: 978-1-56990-887-7
ePub ISBN: 978-1-56990-888-4

Table of Contents

Title Page

Imprint

Table of Contents

Foreword

Preface

Acknowledgments

1 Introduction

1.1 What are Data Science, Machine Learning and Artificial Intelligence?

1.2 Data Strategy

1.3 From Strategy to Use Cases

1.3.1 Data Teams

1.3.2 Data and Platforms

1.3.3 Modeling and Analysis

1.4 Use Case Implementation

1.4.1 Iterative Exploration of Use Cases

1.4.2 End-to-End Data Processing

1.4.3 Data Products

1.5 Real-Life Use Case Examples

1.5.1 Value Chain Digitization (VCD)

1.5.2 Marketing Segment Analytics

1.5.3 360° View of the Customer

1.5.4 NGO and Sustainability Use Cases

1.6 Delivering Results

1.7 In a Nutshell

2 Infrastructure

Stefan Papp

2.1 Introduction

2.2 Hardware

2.2.1 Distributed Systems

2.2.2 Hardware for AI Applications

2.3 Linux Essentials for Data Professionals

2.4 Terraform

2.5 Cloud

2.5.1 Basic Services

2.5.2 Cloud-native Solutions

2.6 In a Nutshell

3 Data Architecture

Zoltan C. Toth

3.1 Overview

3.1.1 Maslow’s Hierarchy of Needs for Data

3.1.2 Data Architecture Requirements

3.1.3 The Structure of a Typical Data Architecture

3.1.4 ETL (Extract, Transform, Load)

3.1.5 ELT (Extract, Load, Transform)

3.1.6 ETLT

3.2 Data Ingestion and Integration

3.2.1 Data Sources

3.2.2 Traditional File Formats

3.2.3 Modern File Formats

3.2.4 Summary

3.3 Data Warehouses, Data Lakes, and Lakehouses

3.3.1 Data Warehouses

3.3.2 Data Lakes and the Lakehouse

3.3.3 Summary: Comparing Data Warehouses to Lakehouses

3.4 Data Processing and Transformation

3.4.1 Big Data & Apache Spark

3.4.2 Databricks

3.5 Workflow Orchestration

3.6 A Data Architecture Use Case

3.7 In a Nutshell

4 Data Engineering

Stefan Papp, Bernhard Ortner

4.1 Data Integration

4.1.1 Data Pipelines

4.1.2 Designing Data Pipelines

4.1.3 CI/CD

4.1.4 Programming Languages

4.1.5 Kafka as Reference ETL Tool

4.1.6 Design Patterns

4.1.7 Automation of the Stages

4.1.8 Six Building Blocks of the Data Pipeline

4.2 Managing Analytical Models

4.2.1 Model Delivery

4.2.2 Model Update

4.2.3 Model or Parameter Update

4.2.4 Model Scaling

4.2.5 Feedback into the Operational Processes

4.3 In a Nutshell

5 Data Management

Stefan Papp, Bernhard Ortner

5.1 Data Governance

5.1.1 Data Catalog

5.1.2 Data Discovery

5.1.3 Data Quality

5.1.4 Master Data Management

5.1.5 Data Sharing

5.2 Information Security

5.2.1 Data Classification

5.2.2 Privacy Protection

5.2.3 Encryption

5.2.4 Secrets Management

5.2.5 Defense in Depth

5.3 In a Nutshell

6 Mathematics

Annalisa Cadonna

6.1 Linear Algebra

6.1.1 Vectors and Matrices

6.1.2 Operations between Vectors and Matrices

6.1.3 Linear Transformations

6.1.4 Eigenvalues, Eigenvectors, and Eigendecomposition

6.1.5 Other Matrix Decompositions

6.2 Calculus and Optimization

6.2.1 Derivatives

6.2.2 Gradient and Hessian

6.2.3 Gradient Descent

6.2.4 Constrained Optimization

6.3 Probability Theory

6.3.1 Discrete and Continuous Random Variables

6.3.2 Expected Value, Variance, and Covariance

6.3.3 Independence, Conditional Distributions, and Bayes’ Theorem

6.4 In a Nutshell

7 Statistics – Basics

Rania Wazir, Georg Langs, Annalisa Cadonna

7.1 Data

7.2 Simple Linear Regression

7.3 Multiple Linear Regression

7.4 Logistic Regression

7.5 How Good is Our Model?

7.6 In a Nutshell

8 Machine Learning

Georg Langs, Katherine Munro, Rania Wazir

8.1 Introduction

8.2 Basics: Feature Spaces

8.3 Classification Models

8.3.1 K-Nearest-Neighbor-Classifier

8.3.2 Support Vector Machine

8.3.3 Decision Tree

8.4 Ensemble Methods

8.4.1 Bias and Variance

8.4.2 Bagging: Random Forests

8.4.3 Boosting: AdaBoost

8.5 Artificial Neural Networks and the Perceptron

8.6 Learning without Labels – Finding Structure

8.6.1 Clustering

8.6.2 Manifold Learning

8.6.3 Generative Models

8.7 Reinforcement Learning

8.8 Overarching Concepts

8.9 Into the Depth – Deep Learning

8.9.1 Convolutional Neural Networks

8.9.2 Training Convolutional Neural Networks

8.9.3 Recurrent Neural Networks

8.9.4 Long Short-Term Memory

8.9.5 Autoencoders and U-Nets

8.9.6 Adversarial Training Approaches

8.9.7 Generative Adversarial Networks

8.9.8 Cycle GANs and Style GANs

8.9.9 Other Architectures and Learning Strategies

8.10 Validation Strategies for Machine Learning Techniques

8.11 Conclusion

8.12 In a Nutshell

9 Building Great Artificial Intelligence

Danko Nikolić

9.1 How AI Relates to Data Science and Machine Learning

9.2 A Brief History of AI

9.3 Five Recommendations for Designing an AI Solution

9.3.1 Recommendation No. 1: Be pragmatic

9.3.2 Recommendation No. 2: Make it easier for machines to learn – create inductive biases

9.3.3 Recommendation No. 3: Perform analytics

9.3.4 Recommendation No. 4: Beware of the scaling trap

9.3.5 Recommendation No. 5: Beware of the generality trap (there is no such thing as a free lunch)

9.4 Human-level Intelligence

9.5 In a Nutshell

10 Natural Language Processing (NLP)

Katherine Munro

10.1 What is NLP and Why is it so Valuable?

10.2 NLP Data Preparation Techniques

10.2.1 The NLP Pipeline

10.2.2 Converting the Input Format for Machine Learning

10.3 NLP Tasks and Methods

10.3.1 Rule-Based (Symbolic) NLP

10.3.2 Statistical Machine Learning Approaches

10.3.3 Neural NLP

10.3.4 Transfer Learning

10.4 At the Cutting Edge: Current Research Focuses for NLP

10.5 In a Nutshell

11 Computer Vision

Roxane Licandro

11.1 What is Computer Vision?

11.2 A Picture Paints a Thousand Words

11.2.1 The Human Eye

11.2.2 Image Acquisition Principle

11.2.3 Digital File Formats

11.2.4 Image Compression

11.3 I Spy With My Little Eye Something That Is

11.3.1 Computational Photography and Image Manipulation

11.4 Computer Vision Applications & Future Directions

11.4.1 Image Retrieval Systems

11.4.2 Object Detection, Classification and Tracking

11.4.3 Medical Computer Vision

11.5 Making Humans See

11.6 In a Nutshell

12 Modelling and Simulation – Create your own Models

Günther Zauner, Wolfgang Weidinger

12.1 Introduction

12.2 General Aspects

12.3 Modelling to Answer Questions

12.4 Reproducibility and Model Lifecycle

12.4.1 The Lifecycle of a Modelling and Simulation Question

12.4.2 Parameter and Output Definition

12.4.3 Documentation

12.4.4 Verification and Validation

12.5 Methods

12.5.1 Ordinary Differential Equations (ODEs)

12.5.2 System Dynamics (SD)

12.5.3 Discrete Event Simulation

12.5.4 Agent-Based Modelling

12.6 Modelling and Simulation Examples

12.6.1 Dynamic Modelling of Railway Networks for Optimal Pathfinding Using Agent-based Methods and Reinforcement Learning

12.6.2 Agent-Based Covid Modelling Strategies

12.6.3 Deep Reinforcement Learning Approach for Optimal Replenishment Policy in a VMI Setting

12.7 Summary and Lessons Learned

12.8 In a Nutshell

13 Data Visualization

Barbora Vesela

13.1 History

13.2 Which Tools to Use

13.3 Types of Data Visualizations

13.3.1 Scatter Plot

13.3.2 Line Chart

13.3.3 Column and Bar Charts

13.3.4 Histogram

13.3.5 Pie Chart

13.3.6 Box Plot

13.3.7 Heat Map

13.3.8 Tree Diagram

13.3.9 Other Types of Visualizations

13.4 Select the right Data Visualization

13.5 Tips and Tricks

13.6 Presentation of Data Visualization

13.7 In a Nutshell

14 Data Driven Enterprises

Mario Meir-Huber, Stefan Papp

14.1 The three Levels of a Data Driven Enterprise

14.2 Culture

14.2.1 Corporate Strategy for Data

14.2.2 The Current State Analysis

14.2.3 Culture and Organization of a Successful Data Organization

14.2.4 Core Problem: The Skills Gap

14.3 Technology

14.3.1 The Impact of Open Source

14.3.2 Cloud

14.3.3 Vendor Selection

14.3.4 Data Lake from a Business Perspective

14.3.5 The Role of IT

14.3.6 Data Science Labs

14.3.7 Revolution in Architecture: The Data Mesh

14.4 Business

14.4.1 Buy and Share Data

14.4.2 Analytical Use Case Implementation

14.4.3 Self-service Analytics

14.5 In a Nutshell

15 Legal foundation of Data Science

Bernhard Ortner

15.1 Introduction

15.2 Categories of Data

15.3 General Data Protection Regulation

15.3.1 Fundamental Rights of GDPR

15.3.2 Declaration of Consent

15.3.3 Risk-assessment

15.3.4 Anonymization and Pseudonymization

15.3.5 Types of Anonymization

15.3.6 Lawful and Transparent Data Processing

15.3.7 Right to Data Deletion and Correction

15.3.8 Privacy by Design

15.3.9 Privacy by Default

15.4 ePrivacy-Regulation

15.5 Data Protection Officer

15.5.1 International Data Export in Foreign Countries

15.6 Security Measures

15.6.1 Data Encryption

15.7 CCPA compared to GDPR

15.7.1 Territorial Scope

15.7.2 Opt-in vs. Opt-out

15.7.3 Right of Data Export

15.7.4 Right Not to be Discriminated Against

15.8 In a Nutshell

16 AI in Different Industries

Stefan Papp, Mario Meir-Huber, Wolfgang Weidinger, Thomas Treml, Marek Danis

16.1 Automotive

16.1.1 Vision

16.1.2 Data

16.1.3 Use Cases

16.1.4 Challenges

16.2 Aviation

16.2.1 Vision

16.2.2 Data

16.2.3 Use cases

16.2.4 Challenges

16.3 Energy

16.3.1 Vision

16.3.2 Data

16.3.3 Use Cases

16.3.4 Challenges

16.4 Finance

16.4.1 Vision

16.4.2 Data

16.4.3 Use Cases

16.4.4 Challenges

16.5 Health

16.5.1 Vision

16.5.2 Data

16.5.3 Use Cases

16.5.4 Challenges

16.6 Government

16.6.1 Vision

16.6.2 Data

16.6.3 Use Cases

16.6.4 Challenges

16.7 Art

16.7.1 Vision

16.7.2 Data

16.7.3 Use cases

16.7.4 Challenges

16.8 Manufacturing

16.8.1 Vision

16.8.2 Data

16.8.3 Use Cases

16.8.4 Challenges

16.9 Oil and Gas

16.9.1 Vision

16.9.2 Data

16.9.3 Use Cases

16.9.4 Challenges

16.10 Safety at Work

16.10.1 Vision

16.10.2 Data

16.10.3 Use Cases

16.10.4 Challenges

16.11 Retail

16.11.1 Vision

16.11.2 Data

16.11.3 Use Cases

16.11.4 Challenges

16.12 Telecommunications Provider

16.12.1 Vision

16.12.2 Data

16.12.3 Use Cases

16.12.4 Challenges

16.13 Transport

16.13.1 Vision

16.13.2 Data

16.13.3 Use Cases

16.13.4 Challenges

16.14 Teaching and Training

16.14.1 Vision

16.14.2 Data

16.14.3 Use Cases

16.14.4 Challenges

16.15 The Digital Society

16.16 In a Nutshell

17 Mindset and Community

Stefan Papp

17.1 Data-Driven Mindset

17.2 Data Science Culture

17.2.1 Start-up or Consulting Firm?

17.2.2 Labs Instead of Corporate Policy

17.2.3 Keiretsu Instead of Lone Wolf

17.2.4 Agile Software Development

17.2.5 Company and Work Culture

17.3 Antipatterns

17.3.1 Devaluation of Domain Expertise

17.3.2 IT Will Take Care of It

17.3.3 Resistance to Change

17.3.4 Know-it-all Mentality

17.3.5 Doom and Gloom

17.3.6 Penny-pinching

17.3.7 Fear Culture

17.3.8 Control over Resources

17.3.9 Blind Faith in Resources

17.3.10 The Swiss Army Knife

17.3.11 Over-Engineering

17.4 In a Nutshell

18 Trustworthy AI

Rania Wazir

18.1 Legal and Soft-Law Framework

18.1.1 Standards

18.1.2 Regulations

18.2 AI Stakeholders

18.3 Fairness in AI

18.3.1 Bias

18.3.2 Fairness Metrics

18.3.3 Mitigating Unwanted Bias in AI Systems

18.4 Transparency of AI Systems

18.4.1 Documenting the Data

18.4.2 Documenting the Model

18.4.3 Explainability

18.5 Conclusion

18.6 In a Nutshell

19 The authors

Foreword

“Mathematical science shows what is. It is the language of unseen relations between things. But to use and apply that language, we must be able to fully appreciate, to feel, to seize the unseen, the unconscious.” – Ada Lovelace

As Computer Literacy over a generation ago represented a new set of foundational skills to be acquired, Artificial Intelligence (AI) Literacy represents the same for our current generations and beyond. Over the last two decades, Data Science has come to encompass the mathematical architecture and corresponding language with which we build and interact with systems that extend our senses and decision-making abilities. Thus, it's no longer sufficient to be able to send point-and-click commands into computers; rather, it's vitally important to be able to interpret and interact with AI-enabled recommendations coming out of computers. Currently, machines, as in computers coupled with sensors (in the broadest sense), are processing an increasingly wide array of data, including text, images, video, audio, network graphs, and a multitude of information from the web, private industry, and public sector sources. Considering the diversity of data, the authors of this book approach Data Science as a key underlying topic for society and do so with great insight, from multiple key vantage points, and in an enjoyable style that resonates with novices and experts alike.

To gain value from data is arguably the unifying objective of the 21st century knowledge worker. Even professional areas thought of as classically distant from data such as sales and art, now have data-driven sub-areas such as marketing automation and computational design. For the benefit of readers, the authors bring to bear first-hand experiences and diligent research to provide a compelling narrative on how we all have a role to play when attempting to leverage data for better outcomes. Indeed, the breadth conveyed in this work is impressive, spanning that of hardware performance considerations (e.g. CPU, Network, Memory, I/O, GPU) to that of different team member roles when building machines that can find patterns in data. Moreover, the authors provide important coverage on the ways that machines can now see and read, namely, Computer Vision and Natural Language Processing, with implications across nearly every industry area being profound.

As you read this book, I encourage you to be curious and keep in mind a set of questions about how your professional journey, and society as you see it, are currently being impacted by increasingly advanced machines: from the capabilities available on your smartphone to how jobs are being refashioned in the marketplace with automation tools. Here are some questions to help you get started:

       How does the ratio of what tasks you spend your time on shift with the emergence of increasingly advanced machines in your job area?

       What are the implications of having machines that have perceptive abilities analogous to your own, as in to see, hear, smell, taste, touch and beyond?

       How as society do we grapple with bias in and trust around data?

       How do we make the building and the use of machines that learn more inclusive?

       What distinctly human abilities can you accentuate to help organizations that you care about to be more competitive and sustainable?

I’ve been cautious not to use the term thinking machines, or artificial general intelligence, so as to be wary about overstatements. What I would like to focus your attention on is the wide applicability of what we’re seeing coming out of research surrounding machines that have learning capabilities. From my time in laboratories at Columbia and Cornell Universities, to that of the Princeton Plasma Physics Laboratory, the American University of Armenia and NASA-backed TRISH (Translational Research Institute for Space Health) which is collaborating with TrialX, it’s clear that machines can find patterns in data across a tremendously wide range of domains and alert humans in both regular and mission critical contexts. Thus, the impacts to human experience are multi-faceted, and Data Scientists have an important role in supporting the design of systems where human interaction with machine output is positive sum. I can’t underscore enough that a zero-sum approach to automation is sub-optimal. Entrepreneurs, though, tend to find a way toward maximum sum.

With colleagues and through my work at the BAJ Accelerator and Covenant Venture Capital, I support startups to engage in a type of tandem learning: how a rapidly growing company can transform an industry by spotting market gaps to that of how a company’s invention can learn and provide new capabilities for customers. For example, in the powerful technology area of Computer Vision that is a mainstay in Data Science, three companies stand out as trailblazing in three very different industry areas: Embodied, Scylla and cognaize in health-care, security and finance, respectively.

       Embodied’s flagship product, Moxie, is a robot that supports the emotional well-being and social development of children. To do so, Moxie must see and communicate with family members in a compelling way, understanding visually as well as via other cues the emotional state of people it’s interacting with as to engage in meaningful dialogue. Thus, healthcare providers have a new robotic team member to collaborate with. Embodied has been on the cover of TIME Magazine.

       Scylla enables an organization’s security team to be proactive in improving safety. With real-time detection capabilities, camera networks no longer need to be passive and can be transformed into proactive tools. Applications are numerous, from detecting slip-and-falls in hospitals and stadiums as they happen, improving health outcomes, to intruder alerts at manufacturing facilities and office buildings that better protect staff. Scylla has been featured in Forbes.

       cognaize helps financial institutions and insurance organizations process a tremendous amount of unstructured data when making risk determinations. A key insight is considering documents not only as text, but also considering visual information: style, tables, structure. In addition, cognaize has a human-in-the-loop whereby colleagues and the system overall continually learn. cognaize has been featured on the NASDAQ screen in Times Square.

In the above three examples of rising unicorn startups, Data Scientists work in close collaboration with engineers, analysts, designers, content creators, domain specialists and customers to build machines that learn and interact with humans in nuanced ways. The result is a transformation in the nature of work: humans are alerted to the most important documents or moments in time and human experience is learned from to improve quality. This is representative of a new shift requiring AI Literacy, where jobs in nearly every facet of the economy will have aspects requiring machine interaction: humans making corrections, learning new skills, reacting to and interpreting alerts, and having a faster response time in helping other humans leveraging machines in support. In the years ahead, I’m excited about the role of Data Science in interface research, new algorithms and how humans can have a force multiplication on their work.

As I co-wrote the first edition of The Field Guide to Data Science nearly a decade ago, it’s remarkable how much the discipline has advanced both in terms of what has been technically achieved and in an aspirational sense on what is yet to be. The Handbook of Data Science and AI advances the discipline along both of those dimensions and carries the torch forward.

Read on.

Fall 2021

Armen R. Kherlopian, Ph.D.

Preface

“The job of the data scientist is to ask the right questions.” – Hilary Mason

Reading the foreword written for our first publication two years ago, I couldn’t shake the feeling that some trends essentially stayed the same while others emerged all of a sudden and hit society and companies like an avalanche.

Starting with the changes that struck society profoundly, it is obvious that the pandemic is one of them. Setting aside the myriad consequences it had and continues to have on our lives, I want to focus on the facets which relate to the subject of this book: Data Science and AI.

Put simply, the impact there was that entire societies and our whole way of living became data driven in an instant. Key performance indicators like the seven-day incidence rate or forecasts based on pandemic simulations steered our daily life and temporarily even altered basic rights, like the right to leave our homes. This led to discussions and questions, which every Data Scientist with some experience is familiar with and has encountered repeatedly during their working life:

       Can we trust these models and their predictions?

       Is the chosen KPI really the right one for this purpose?

       Is the underlying data quantity and quality good enough?

and so on.

All of these are valid questions and are, just as they were two years ago, fueled by another trend: Digitization. The engine for this is data. On top of that, Data Scientists are still following the same goal:

Giving understandable answers to questions by using data.

Despite all trends, this purpose stays the same and always will be one of the central pillars of doing Data Science.

But this is not the only trend which has remained or become even stronger. The most important, continuing phenomenon is the still massive hype caused by phrases like “Artificial Intelligence” and “Data Science”. While these fields are incredibly valuable and powerful, discussions around them unfortunately often evoke false promises and skewed expectations, which in turn lead to disappointment. Some companies already started large, ambitious initiatives in the past, which led to underwhelming results, because expectations were set too high and timelines too short. For example, fully autonomous driving is one particularly challenging problem to solve.

Nevertheless, Artificial Intelligence remains the hope for many companies. Investors perceive it as a general-purpose technology that can be applied almost anywhere. The situation is comparable with the development during the nineties when all things related to the ‘Internet’ surged. Suddenly, every company needed a web page, and significant investments were made to train web programmers. Nowadays, a similar thing is happening with everything AI related. Again, the investments in AI are enormous, and we have a rush of courses on the topic. In the end, the development concerning the ‘Internet’ led to a vast ecosystem of companies and applications which influence the lives of billions of people in a profound way, and it seems that AI is following a similar path.

This explains, at least partly, another noticeable trend: the further specialization of data science roles with names like “data translator” or “machine learning engineer.” It is a somewhat natural development, as it is a sign that the field is getting more mature, but it also raises the risk of data science responsibilities being scattered across poorly coordinated organizations and thus not reaching their full potential. Chapters 14 and 17 go into this in further detail.

Finally, “Trustworthy AI” is emerging as another highly important movement within Data Science. This is the field of research that aims to tackle some previously unmet needs, like explainability or fairness. It is therefore included as one of the new chapters in this book (Chapter 18).

Given all these trends in Data Science, one of the reasons for founding the Vienna Data Science Group (VDSG) has become even more important over the last two years: to create a neutral place where interdisciplinary exchange of knowledge between all involved experts can take place internationally. We are still very much dedicated to the development of the entire Data Science ecosystem (education, certification, standardization, societal impact study, and so on), both across Europe and beyond.

A product of the exchange in our community can be found in the 2nd edition of this book, which has been vastly expanded to cover topics like AI (Chapter 9), Machine Learning (Chapter 8), Natural Language Processing (Chapter 10), Computer Vision (Chapter 11) or Modelling and Simulation (Chapter 12) in more depth. To follow our goal to educate society about Data Science and its impacts, a very relevant use case was included in Chapter 12: An agent-based COVID-19 model, which aims to give ideas about the potential impact of certain policies and their combination on the spread of the disease.

To provide our readers with a firm foundation, an introduction to the underlying mathematics (Chapter 6) and statistics (Chapter 7) used in Data Science has been included, and finished with a visualization section (Chapter 13).

Although a lot of content has been added, the goal of this book stays the same and has become even more relevant: to give a realistic picture of Data Science.

Because despite all trends, data science remains the same as well: an interdisciplinary science gathering a very heterogeneous crowd of specialists, which is made up of three major streams:

       Computer Science/IT

       Mathematics/Statistics

       Domain expertise in the industry in which Data Science is applied.

Science aims to generate new knowledge, and this is still used to

       improve existing business processes in a given company (Chapter 16)

       enable completely new business models

Data Science is here to stay and its direct and indirect impact on society is growing at a fast pace, as can be seen during the pandemic. In some areas a bit of disillusionment has set in, but this can be seen as a healthy development to counter the hype. Data Science team roles are becoming more differentiated, and more companies are putting Data Science projects into production.

So, Data Science has grown up and is entering a new era.

Fall 2021

Wolfgang Weidinger

Acknowledgments

We, the authors, would like to take this opportunity to express our sincere gratitude to our families and friends, who helped us to express our thoughts and insights in this book. Without their support and patience, this work would not have been possible.

A special thanks from all the authors goes to Katherine Munro, who contributed a chapter to this book and spent a tremendous amount of time and effort editing our manuscripts.

For my parents, who always said I could do anything. We never expected it would be a thing like this. – Katherine Munro

I’d like to thank my wife and the Vienna Data Science Group for their continuous support through my professional journey. – Zoltan C. Toth

When I think of the people who supported me most, I want to thank my parents, who have always believed in me no matter what, and my partner Verena, who was very patient during the last months when I worked on this book. In addition, I’m very grateful for the support and motivation I got from the people I met through the Vienna Data Science Group. – Wolfgang Weidinger

1 Introduction

“Data really powers everything that we do.”

Jeff Weiner

Questions Answered in this Chapter:

       What makes Data Science, ML, AI, and everything else closely connected to generating value out of data so fascinating?

       Why do organizations need a strategy to become data-driven?

       What are some everyday use cases in the B2B or NGO world?

       How are data projects structured?

       What is the composition of a data team?

Data Science and related technologies have been the center of attention since 2010. Various changes in the ecosystem triggered this trend, such as

       significant advancements in processing a vast amount of unstructured data,

       substantial cost reduction of disk storage,

       the emergence of new data sources such as social media and sensor data.

The HBR called the data scientist the sexiest job of the 21st century while quoting Hal Varian from Google.1 Strategy consultants declared data to be the new oil, and there have been occasional “data rushes” where “enthusiasts in data fever” mined new data sources for yet unknown treasures. This book explores data science and incorporates various views on the discipline.

Figure 1.1 Data Science and related technologies on trends.google.com2

1.1 What are Data Science, Machine Learning and Artificial Intelligence?

There are many views on data science, and stakeholders in data science projects may give different answers to what they consider data science to be. Representatives address various aspects and may use different vocabulary since businesses and NGOs, for example, pursue different insights from data science applications. Perhaps the one common denominator is this: Everyone expects data science to deliver some value, which was not there before, with the help of data.

Table 1.1 Various views on Data Science

Definition from Wikipedia: Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data and apply knowledge and actionable insights from data across a broad range of application domains.3

Application-centered view: We collect data and put it into pandas DataFrames or data frames in RStudio. We also use tools such as TensorFlow or Keras. Our goal is to use these tools to explore the data.

Platform-oriented view: We create value from the data that we loaded onto our SaaS platform in the cloud. Then, depending on the provided data and its structures, we store it in different storage containers, such as blob storage and distributed databases.

Evangelist-oriented view: Data science was the next big thing in 2015. Now, you should look at more specific applications. Looking at the Gartner charts, invest your time exploring cutting-edge trends such as neuromorphic hardware or augmented intelligence.

Management-oriented view: These are the ways of working to bring our company into the 21st century as a data-driven enterprise. During and after our transition, we will penetrate new markets and monetize data as a service.

Career-oriented view: As a senior data scientist at a major company, I can earn a six-digit yearly salary and explore interesting fields in corporate labs.

Use case-oriented view: Tell me your business problem, and we will tell you how we solved it for another customer. From fraud detection to customer retention to social network analysis, feel free to check out our catalog of possible analytics applications.

Entrepreneurial/Optimistic view: Data Science is one way to change the world. Using Data Science, we can prevent climate change and fight poverty and hunger on a global scale.

Pessimist view: Data Science is one way to change the world. But, unfortunately, power-hungry people will use it to spy on us and suppress us. So Big Brother will be watching you.

Statistician’s view: Data Science is just a buzzword. It is just another word for statistics. We might call it statistics on steroids, maybe. But in the end, it’s just another marketing hype to create another buzzword to sell services to someone.

The essentials of data science lie in mathematics. Data scientists apply statistics to generate new knowledge from data. Besides using algorithms on data, a data scientist must understand the scientific process of exploring data, such as creating reproducible experiments and interpreting the results.

There are many different terms related to data science. For example, professionals talk about Artificial Intelligence, machine learning, or deep learning. Sometimes experts also talk about related terms such as analytics or business intelligence and simulation. In the following chapters, we will detail and highlight how we distinguish between analytics and data science. We will also highlight various data science applications, such as gaining insights into a text through Natural Language Processing or extracting objects from images via object recognition or modeling railway networks for optimal pathfinding.

Data Science as Part of a Cultural Shift

Suppose you apply for a job as a data scientist in a company. Imagine that the HR department of this company rejects you because an astrology chart, based on the data you provided in your CV, does not match the position (however unlikely such an answer may be).

Humans decide on what they believe is right. But, unfortunately, human judgment is flawed through bias4, and we have mechanisms, such as confirmation bias, which assure us that we cannot err. For example, some people believe in the flat Earth theory or hollow Earth theory, which shows how powerful mechanisms such as confirmation bias can be.

For many of us, it would be disastrous to realize that a comfortable binary view of the world divided into black and white, good and evil, and right and wrong often does not work out. Modern sociological ideas such as constructivism5 are more connected to data science than many think. The idea is that everyone constructs a reality based on their experience. Within the framework of “our reality”, including its rules and conventions, we make decisions. According to studies, it is not uncommon that we are deeply convinced that we are right even if our choices are questionable to others. For example, suppose we have created mental models for ourselves in which we are confident that astrology must be correct. In that case, it is logical to assume that using zodiac signs for personnel decisions will improve the hiring process. At the same time, people with strong religious beliefs might run into conflicts if they ignore what they might call signs or messages from God. Thanks to the biases mentioned above, our belief systems are often set in stone.

Data Science is not just a method to extract value from data; it also has the potential to be a method for making decisions that avoids or reduces human bias in the process. However, as will be shown in Chapter 18 on Trustworthy AI, data alone cannot solve the problem, because historical data and the model building process itself are often imbued with the very same biases. With that, business leaders can integrate data science and transparent, non-discriminatory practices into corporate culture, and this will substantially impact the company’s DNA. For example, a bias-aware company will adjust processes. Hiring a new employee is a good example. Many companies enlarge the hiring teams that decide on the outcome of candidate interviews in order to ensure that the bias of a single interviewer will not affect a hiring decision too much. In modern hiring processes, data science can be used to generate predictions about candidates to assist the decision-making process. If done with care, these model predictions can help to minimise biases in employment decisions.

In the beginning, every judgment is a theory. A theory is neither right nor wrong but inconclusive until it is proven or disproven.

Therefore, the positive effect of hiring personnel using astrological zodiacs would be nothing more than a theory. As long as we cannot prove that an astrological assessment would benefit a hiring process, the statement is inconclusive and, therefore, not recommended for use. Calling astrology inconclusive rather than wrong might also make the discussion with believers in astrology less emotional.

Investigating the possible effects of astrology using data science is a perfect introduction to the environment we face in data science projects. Astrology claims to divine information about human affairs and terrestrial events by studying celestial objects’ movements and relative positions. In a simplified version, astrology reduces everything to the sun sign, depending on birthdays. Using the simplified model, we could collect data on existing data scientists to determine a correlation between astrological signs and professions. In addition, we could collect the birthdate of a large pool of data scientists. As we need only a birth date and no other personal data, it would even be perfectly legal to collect these datasets from LinkedIn or any other data source containing data scientists’ birthdates. Most of the analysis will consist of finding appropriate data sources, collecting the data from the data source, anonymizing it, and preparing it for examination.

Mathematics on the collected data will not leave much room for interpretation of results. Based on the analysis, we could then conclude whether there is a correlation between professions and astrological signs.
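As a rough illustration, here is a minimal sketch of how such an independence check could look in Python, using SciPy’s chi-squared test on a contingency table of professions versus sun signs. The counts below are synthetic and generated under the null hypothesis; a real study would build the table from the collected birthdates.

```python
# Minimal sketch: test whether profession and sun sign are independent.
# The counts are synthetic placeholders, not real survey data.
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(42)

# Rows = professions, columns = the twelve sun signs, drawn under the
# null hypothesis that sign and profession are independent.
professions = ["data scientist", "accountant", "nurse"]
counts = rng.multinomial(1200, [1 / 12] * 12, size=len(professions))

chi2, p_value, dof, _ = chi2_contingency(counts)
print(f"chi2 = {chi2:.2f}, p-value = {p_value:.3f}, dof = {dof}")

# A large p-value means the data give no evidence of an association,
# so the theory remains inconclusive, exactly as discussed above.
if p_value >= 0.05:
    print("No significant association between sun sign and profession.")
```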

There is, however, a more complex form of astrology. Astrological charts include all planets and other celestial objects such as Lilith, the black moon, which does not exist in astronomy. In addition, many constellations are contradictory. An astrologer might call a person impulsive because of Venus or Lilith in Aries or passive because of Mars in Cancer. Finally, an astrologer might claim that readings require intuitive interpretations, which are, of course, not measurable.

Many data science projects might end with the assessment that there is insufficient data for a definite answer, and being unable to prove or disprove a theory might be unsatisfactory for many stakeholders. Yet, exploring data often helps bring clarity to the stakeholders, as at least many learn that achieving objective truth is not as easy as it seems. Therefore, we should be free to differ in personal or subjective beliefs and be cautious about things we cannot verify objectively. Of course, in the end, there is a good chance that we are right with our personal views if we have spent a lot of time exploring a specific field, even if we cannot prove it. Still, as long as we do not have enough data to prove something one way or another, it is a question of academic politeness to highlight inconclusive outcomes because of insufficient data when talking with others.

As early as 2014, the New York Times6 wrote about the 80/20 rule. This rule means that teams spend 80 % of their time finding and preparing data for data science projects and only 20 % on analytics. This number may vary enormously by industry. In addition to data modeling, we will also address the preparation and management of data in the chapters to follow. We aim to provide a compact introduction to data platforms and engineering.

In the second part of this book, we assume all the data is prepared and ready and focus on analytics. We will present several ways to generate value from data and cover essential topics such as neural networks and machine learning. We will also cover basics such as statistics.

The third and last part of the book is about the application of data science. Here we cover business topics and also address the subject of data protection.

Machine Learning and Deep Learning

Starting from Chapter 6, we will detail the differences between these frequently buzzed-about concepts. Still, as using these terms related to data science often creates confusion, we would like to outline them for you here.

In recent years, many companies prioritized processing vast amounts of data. Consequently, scientific processing, such as formulating the working hypothesis, was pushed into the background. Big Data tries to solve problems with a sufficiently large amount of computer power and data. This fact creates a productivity paradox: More data and better algorithms do not make us more productive; instead, the opposite is often true, as it becomes increasingly difficult to distinguish the signal from the noise. The signal is the information relevant to a question and thus contributes to answering it, while the noise is the irrelevant information.

We attempt to make these signals measurable by evaluating how precisely an algorithm detects the signal (precision) and how much of the actual signal it finds (recall). The harmonic mean of both measurements, the F1 score, expresses the algorithm’s detection quality as a percentage. A high F1 score means a precise answer, while values around 50 % represent a near-random result. Similarly, if an algorithm has an accuracy of, say, 90 %, it means that 90 % of all information is processed correctly.

This number may sound like a lot; however, data with a large volume is the norm in Big Data. For example, imagine we want to classify comments to find hate speech in social media. Let’s say that 510,000 comments were posted per minute on Facebook in 2018. Assuming that 10 % were classified incorrectly, we might fail to detect hate speech in 51,000 posts every minute.
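To make these measurements concrete, here is a small, self-contained sketch with invented counts for a hypothetical binary hate-speech classifier; precision, recall, and the F1 score follow their standard definitions.

```python
# Illustration of precision, recall, and F1 for a binary classifier
# (1 = hate speech, 0 = harmless). The counts are invented.
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    precision = tp / (tp + fp)  # share of flagged posts that are truly hate speech
    recall = tp / (tp + fn)     # share of actual hate speech that was flagged
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of both
    return precision, recall, f1

# Example: the classifier finds 900 of 1,000 hate-speech posts (tp),
# misses 100 (fn), and wrongly flags 50 harmless posts (fp).
p, r, f1 = precision_recall_f1(tp=900, fp=50, fn=100)
print(f"precision = {p:.1%}, recall = {r:.1%}, F1 = {f1:.1%}")

# Even 90% recall leaves 10% of the signal undetected; at social-media
# volumes, this quickly amounts to tens of thousands of missed posts.
```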

To avoid such a situation, deep learning, a group of machine learning algorithms based on neural networks, is currently being applied as an abstract solution to many problems. The advantage of deep learning over classical machine learning is that the former usually scales better with the amount of data and thus provides more accurate results and can be applied to various problems.

The disadvantage of some methods in machine learning is that it can be challenging to interpret a prediction because the solution path is not immediately comprehensible. Furthermore, a statistically generated prediction may or may not be correct, as most models usually have less than 100 % accuracy. Additionally, statistical forecasts cannot be applied, or can be applied only to a limited extent, to data that was not adequately represented in the analysis. This statement may seem trivial, but it is essential since statistical analysis primarily depends on the input data and thus on the modeling skills of the data scientist. It is, therefore, necessary to interpret the result correctly and not to take it as truth.

An excellent example of this is numerical weather forecasting. We know the fundamental physical laws as differential equations, but false predictions repeatedly occur due to non-existent or incorrect data or a simplified model. For example, a result of the solved differential equation can be: “Tomorrow the probability of rain is 10 %”. Statistically, this means that we have created an analytical model based on historical data, and in 10 % of the past cases with matching input data, it rained. 10 % can be a lot or very little; the important thing is to have an appropriate reference set and relate the result to it. In this case, it means that it is quite possible, although not likely, that it will rain tomorrow.
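The frequentist reading of such a forecast can be sketched in a few lines: among historical records whose input conditions match tomorrow’s, we count how often it actually rained. The data and column names below are invented for illustration only.

```python
# Toy sketch of the frequentist reading of "the probability of rain is X %":
# the share of historical days with matching inputs on which it rained.
import pandas as pd

# Invented historical weather records (the reference set).
history = pd.DataFrame({
    "pressure_falling": [True, True, False, True, False, True],
    "humidity_high":    [True, True, True,  True, False, True],
    "rained":           [False, True, False, False, False, False],
})

# Select past days whose conditions match tomorrow's forecast inputs.
matching = history[history["pressure_falling"] & history["humidity_high"]]
rain_probability = matching["rained"].mean()  # 1 rainy day out of 4 -> 25 %
print(f"Empirical rain probability: {rain_probability:.0%}")
```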

Figure 1.2 Differences (https://ai.plainenglish.io/data-science-vs-artificial-intelligence-vs-machine-learning-vs-deep-learning-50d3718d51e5)

Artificial Intelligence

When most people think of AI, they might imagine computers taking over the world, as in Terminator.

Artificial Intelligence is the simulation of human intelligence processes by machines, especially computer systems. There is an overlap between Machine Learning and Data Science, but AI can still be seen as separate from both disciplines.

In Chapter 9, we explore Artificial Intelligence in detail. We explain the relation to Data Science and give a brief overview of the history of AI. We also discuss the problems that one may encounter when using Data Science skills to develop AI. In particular, we provide five pieces of advice: Be pragmatic, Make it easier for machines to learn through inductive biases, Perform analytics before creating AI architecture, Watch for the intelligence scaling trap and Watch for the generality trap. In this chapter you will have a chance to learn how to avoid mistakes and how to effectively use your Data Science tools to create AI solutions. After reading this chapter you will understand well where the limitations of AI technology are today and how to cope with those limitations.

1.2 Data Strategy

Some experts say that only companies with a data strategy have a future. We might agree or disagree with this assessment. However, everyone will admit that not every company feels the pressure to become data-driven. Many departments work mainly with pen and paper, in monopolies with no pressure to evolve or optimize processes. The figure below is just one of many models that can be found with web research to highlight different stages of a transformation from a non-data company to an entirely data-driven one.

As data maturity depends a lot on external pressure, companies often migrate in phases. When the competition gets fiercer, the market forces companies to innovate. However, the luxury of resisting change in the absence of market pressure can also lead to different forms of stress: some monopolies face the problem that no vendor still supports the legacy software they have used for decades.

Introducing data science in organizations, no matter if it is a business, NGO, or governmental institute, starts in most cases with a mission statement. For example, for a global car manufacturer, a strategy could be formulated as follows:

“Our company aims to be the cost leader in the global supply chain by 2025. This measure enables us to bring electric mobility to the mass market with less cost than our competitors. For us to achieve this, we have to cut our supply chain costs by 20 %.”

Other companies simplify the strategy inspired by John F. Kennedy’s speech on landing a man on the moon and returning him safely to the earth within a decade.

“Before this decade is out, all of our manufactured vehicles will be driverless.”

Figure 1.3 Data Maturity Model7

An NGO might have less profit-oriented but no less ambitious goals.

“With the help of our donors, we will use satellite images to explore dry areas in countries to find water points. Using that technology, we hope to be able to decrease the pain of gaining access to water in developing countries.”

The recommended practice for companies is to have an owner for data topics. Commonly, this is the role of a Chief Data Officer (CDO), who needs to ensure that the company can realize its vision with the help of gaining insights from data.

Many companies have established processes to explore the past through business intelligence. For example, in maybe the most classic reference case, retail companies analyze how many products they have sold in the past. As a result, they can learn about which stores did a better or worse job. Based on the insights, leadership can then make changes such as replacing key personnel in poorly performing areas or creating additional incentives for growth in other areas.

Many companies have already reached a high level of optimization through traditional analytics. And it often seems as if conventional methods are at their limits.

Data science often helps to generate new knowledge. In other words, instead of using data science to sell more products, companies often use it to create new products. For example, while traditional analytical methods improve numbers, you get new numbers to work with through data science.

Once a CDO has proposed a strategy to meet the corporate goals, the board will approve the plan and allocate a budget. Using that strategy, the CDO then pools together the various department heads to realize the objective. Then, after a fit/gap analysis of the current situation, they will create hiring plans and plan projects to achieve their goals.

Figure 1.4 The Gartner Analytic Continuum (Source: https://twitter.com/Doug_Laney/status/611172882882916352/photo/1)

This positioning within the business also clarifies the role of IT. The CIO is in charge of providing the necessary platforms to enable the teams of the CDO, but IT does not own the data science topic itself. Therefore, the CIO has to assess whether the current IT infrastructure meets the demands of the data strategy, and if not, they must come up with a plan to create the required platforms.

1.3 From Strategy to Use Cases

Implementing a strategy defines how a business interprets data and the modeling based on it. Based on the strategy, a company can decide which questions the data scientists must answer. Based on these questions, solution architects can design platforms to host the data and data engineers can determine from which data sources they have to extract data.

Most companies have cross-functional teams for data science projects. They work in an agile team to explore new use cases and methods to apply data science.

Without qualified professionals, a company cannot even begin to implement its ambitious plans. Therefore, we want to look first at how data teams could appear from the project’s view. In a corporate world, many of these team members would report to a different department.

1.3.1 Data Teams

We need data professionals to implement a data science strategy or to build up a data-driven start-up. Two groups of experts have evolved in the data world.

The first group, people with a statistical background, usually have academic experience and create models to answer the departments’ questions. The second group consists of people with an engineering background. They are responsible for fully automating the loading of data onto the platform and for continuously running the developed models on that data in the production environment.

In organizations, these two groups have different reporting lines: Business and IT. In most companies, data agendas are a part of the top management board. Therefore, data is associated with the business. Some companies establish the role of a CDO, who directly reports to the CEO and the board. Others create a position, such as Head of Data or Head of Data Science. The authors of this book believe that data should be part of the board. Therefore, we refer to the CDO as the ultimate leader of all data agendas, whereas we refer to the CIO as the position accountable for all IT agendas.

Figure 1.5 shows one of many models describing the different roles per activity and department. Please be aware that we do not cover all roles in detail in this chapter. We cover this topic in more detail in Chapter 14.

Figure 1.5 Role distribution in data programs (Source: https://nix-united.com/blog/data-science-team-structure-roles-and-responsibilities)

1.3.1.1 Subject Matter Expert (Domain Expert)

The SME is an essential person for a data project. Still, this person is often not shown in data teams. A subject matter expert understands, from the inside out, how the company provides its services to its clients. They are often also referred to as a domain expert.

An SME is someone who has been performing a day-to-day job for a long time. For example, in a retail organization, a perfect SME might be the person who has been working in a supermarket in different roles for multiple years. They have seen almost every imaginable scenario and have a good gut feeling about what clients want. They might also spot potential side effects of changes that no one without experience in the field could see.

In some industries, the role of an SME overlaps with an analyst. Finance is a good example. A credit analyst takes all data from a client who applies for credit and calculates the credit risk using a given formula. Unlike data scientists, analysts do not generate new knowledge. However, analysts work with numbers and have a deeper understanding than other types of SMEs.

In an NGO, an SME might be a development worker who fights poverty and plagues in developing countries or works in refugee camps. Therefore, an NGO SME might have a completely different view of what is missing on-site and feasible than those who watch the situation remotely.

SMEs are also often natural authorities in their fields due to their long-term experience. If, for example, a company wants to install a new IT system or new processes on-site, the support of SMEs can be crucial for its successful deployment, as less experienced employees in the field often look up to them.

The actual duties of the SME, therefore, depend on the area of operation but generally include the following activities:

       Provide insights on the existing challenges

       Provide access to possible data sources

       Help formulate goals

       Assist in the release of products and verify their successful outcome

       Guide users towards adopting the new system.

1.3.1.2 Business Analyst

Many projects need a business analyst who acts as a bridge between SMEs and data scientists. The critical skill of a business analyst is to ask the right questions. Their job is to find out which activities make sense from a business perspective.

In start-ups, a business analyst helps to formulate the business plan and the value proposition, showing how the business can make a profit and how success can be measured.

Business analysts, therefore, dedicate their time to the following activities:

       Write business plans

       Analyze business requirements

       Translate business requirements into work packages for the data team

1.3.1.3 Data Scientist

There is a debate about how much statistics a data scientist should understand. Purists claim that you can only be a “real data scientist” if you have a Ph.D. and know scientific methods and statistics inside and out. They sometimes call everyone else “fake data scientists.”

Many modern views differ and see a data scientist as an expert who puts the data into use and creates something new. For example, she can discover a new relationship in the data and build models. It is essential to highlight that good communication and programming skills are helpful to achieve this.

Data scientists should be as versatile as the data they are working with and open to learning about new domains and collaborating with experts from different fields. For example, working with and analyzing imaging data requires specific knowledge in Computer Vision, image processing, and machine learning, as well as specific domain knowledge of differential geometry or medicine. It is important to understand how data are acquired, which false interpretations are possible, and whether an expert is required to create a baseline or to evaluate the designed models (for example, annotations of specific tumor tissue in a computed tomography scan by a medical doctor). In Chapter 11 you will get a deeper insight into the field of Computer Vision and how to work with imaging data as a data scientist.

All in all, every data scientist will have some understanding of science and statistics. But, as with the many autodidactic programmers who never studied software engineering, much can be self-taught. A data science team often consists of people with diverse skill sets: while some members are top-notch mathematicians, others complement them with stronger communication or programming skills and contribute just as much to the outcome.

Mathematics and Statistics

Mathematics and statistics are still the basis of everything we do. Therefore, we dedicate Chapters 5 and 6 to these topics: recapping the basics of probability theory, explaining confidence intervals, and showing how to determine mathematically whether an idea holds.
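
To make this concrete, here is a minimal sketch of computing a 95% confidence interval for a sample mean with NumPy and SciPy; the sample itself is randomly generated for illustration.

import numpy as np
from scipy import stats

# Generate an invented sample of 50 observations for illustration
rng = np.random.default_rng(seed=42)
sample = rng.normal(loc=100.0, scale=15.0, size=50)

mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean

# 95% confidence interval for the mean, based on the t-distribution
ci_low, ci_high = stats.t.interval(0.95, len(sample) - 1, loc=mean, scale=sem)
print(f"Mean: {mean:.2f}, 95% CI: [{ci_low:.2f}, {ci_high:.2f}]")

If such an interval excludes a hypothesized value, that is evidence against the corresponding hypothesis; Chapters 5 and 6 treat this reasoning rigorously.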

The main tasks of data scientists are exciting, sometimes challenging, and highly diverse.

       First, we must prepare our data, often liaising with other departments, such as information systems, and harmonizing various data sources. In many organizations, this is the job of the Data Engineer, especially if these steps need to be automated and have strong SLA requirements.

       Then we engage in exploratory statistical analyses, interpret the results, and use these to gain domain knowledge and conduct further preliminary data investigations.

       Based on these findings, we curate a data set and feed this to a machine learning algorithm, such as those mentioned above, to build a model for a specific task.

       The trained model is tested and fine-tuned to the point where we can use it productively: its outputs, usually predictions for unseen cases, will be acted upon by the data science team and other stakeholders in the company. (A minimal code sketch of this workflow follows the list.)
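
The following is a minimal, illustrative sketch of this workflow in Python with pandas and scikit-learn. The file name, the feature columns, and the “churned” label are invented, and we assume purely numeric features.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# 1. Prepare the data (file and column names are hypothetical)
df = pd.read_csv("customer_data.csv").dropna()

# 2. Explore: summary statistics guide further investigation
print(df.describe())

# 3. Curate a data set: numeric features and the label to predict
X = df.drop(columns=["churned"])
y = df["churned"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# 4. Train a model and test it on unseen cases
model = RandomForestClassifier(random_state=0)
model.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))

In practice, each step is far more involved; the sketch only shows how the stages connect.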

Of course, this process is not a one-time effort. Data and models must be continuously monitored (and often, continuously retrained) to ensure performance remains at an acceptable level. New research projects must be undertaken based on the company’s innovation roadmaps, triggering this process to begin again. As we answer business questions through data, progress and results must be communicated to various departments, often in sophisticated visualizations and presentations (see Chapter 13, ‘Visualisation’).

We will say much more about the job of data scientists throughout this book. Data scientists play an essential role in the development of AI solutions (see Chapter 9), but also in the domain of modeling and simulation (see Chapter 12).

1.3.1.4 Data Engineer

Data engineers build and optimize data platforms so that data scientists and analysts have access to the appropriate data. In addition, they load data into the data platform according to the policy set by the architect.

Data engineers implement this activity using data pipelines: they load data from third-party systems, transform the data, and then store it on the platform. A data pipeline must scale with increasing data volumes and be robust, which means it needs corresponding fault tolerance. It thus forms the foundation that data scientists and analysts can use to generate knowledge.

Unlike other team members, data engineers must have solid programming skills. Most importantly, a data engineer needs to understand the principles of distributed computation and how to write code that can scale. Thus, the data engineer has a fundamental role in every data science team.
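
To illustrate such a pipeline, here is a minimal extract-transform-load sketch in Python with pandas. The source URL, table name, and SQLite database are invented; a production pipeline would add scheduling, scaling, and monitoring on top of the simple retry shown here.

import sqlite3
import time

import pandas as pd

SOURCE_URL = "https://example.com/orders.csv"  # hypothetical third-party source

def extract(retries: int = 3) -> pd.DataFrame:
    # Fault tolerance: retry transient network failures with backoff
    for attempt in range(retries):
        try:
            return pd.read_csv(SOURCE_URL)
        except OSError:
            time.sleep(2 ** attempt)
    raise RuntimeError("Source unavailable after retries")

def transform(raw: pd.DataFrame) -> pd.DataFrame:
    # Harmonize: drop duplicates and normalize column names
    df = raw.drop_duplicates()
    df.columns = [c.strip().lower() for c in df.columns]
    return df

def load(df: pd.DataFrame) -> None:
    # Store the analytical data set on the platform (here: local SQLite)
    with sqlite3.connect("platform.db") as conn:
        df.to_sql("orders", conn, if_exists="replace", index=False)

if __name__ == "__main__":
    load(transform(extract()))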

Core activities include:

       Building various interfaces to enable the reading and writing of data

       Integrating internal or external data into existing pipelines

       Applying data transformations to create analytical datasets

       Monitoring and optimization to ensure the continuous quality of the system (and to improve it if necessary)

       Developing a loading framework to load data efficiently

1.3.1.5 DevOps

DevOps is a role that requires a mixture of development and operations skills. The task of DevOps engineers is to operate the data platform upon which the data engineers and data scientists work.

DevOps engineers implement the architectural design for a project or system and address the change requests made by the data engineers. With the emergence of cloud systems, DevOps engineers have gained popularity and become a scarce resource in many projects.

Their activities include:

       Scaling data platforms

       Identifying performance problems in the software

       Automating redeployments

       Monitoring and logging applications (a minimal monitoring sketch follows this list)

       Identifying resource bottlenecks and problems

       Remediating issues that occur during system operations
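
As a small illustration of the monitoring and bottleneck-detection duties, the following sketch polls system metrics with the third-party psutil library and logs a warning when usage crosses a threshold. The thresholds and polling interval are invented.

import logging
import time

import psutil  # third-party library for reading system metrics

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")

CPU_LIMIT = 90.0     # percent; hypothetical thresholds
MEMORY_LIMIT = 85.0  # percent

def check_resources() -> None:
    cpu = psutil.cpu_percent(interval=1)
    mem = psutil.virtual_memory().percent
    if cpu > CPU_LIMIT or mem > MEMORY_LIMIT:
        logging.warning("Possible bottleneck: CPU %.1f%%, memory %.1f%%", cpu, mem)
    else:
        logging.info("OK: CPU %.1f%%, memory %.1f%%", cpu, mem)

if __name__ == "__main__":
    while True:
        check_resources()
        time.sleep(60)  # poll once per minute

In practice, dedicated tools cover such monitoring; the point is only to show the kind of automation a DevOps engineer builds.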

1.3.1.6 Solution Architect

In the end, someone has to be accountable for everything running smoothly. Only then can the data scientists do their job, and the users can create business value by using the applications developed during the data strategy implementation. In large organizations, this is the solution architect.

Someone must ensure that the proper hardware infrastructure is in place, that appropriate data management and processing software is selected, that data is protected against misuse and theft, and finally, that the data scientists and end-users of a system can do their work.

Many organizations split this responsibility across multiple roles:

       A data architect focuses on data and how data is stored. In addition, she takes care of metadata management and the definition of processes to load data into data management software such as databases or object stores.

       A systems or infrastructure architect focuses on servers and hardware and ensures the hardware is available. If the company hosts the solution in the cloud, they refer to this role as a ‘cloud architect.’

       A data steward or data manager is responsible for ensuring that the project follows the appropriate corporate policies.

       A security architect protects the system against hackers and other intrusion attempts.

In reality, it is hard to isolate these various engineering roles. A data platform must serve multiple purposes and meet multiple functional and non-functional requirements. Without knowing the software, one cannot make a hardware decision, and numerous data platforms have specific hardware requirements. Therefore, there needs to be a generalist who understands the whole picture and can lead the other architects to build cost-effective, scalable, robust, and fast solutions.

In large companies, a CIO leads all streams to create standards that apply to every project, and such companies have their own frameworks or business units to provide platforms to other departments. A solution architect must therefore often consider corporate politics as another factor in building the best platform for their project. In small companies, there are usually fewer restrictions but more chances to fail with a wrong strategy. Chapter 17, ‘Mindset and Community’, also explores a risk known as the ‘Swiss Army knife’, which might apply to a solution architect in a small company: many small companies end up with one person being the single expert for multiple engineering domains.

In many organizations, the result is that one person with a diverse skill set and broad knowledge is fully accountable for realizing the solution. Depending on the size of the project or company, they might be able to delegate responsibilities, but they often still have to cover multiple roles and thus become a bottleneck.

Typical tasks of a solution architect are:

       As the person accountable for the solution, decide on all parameters or lead the decision-making process. These parameters include, among other things, hardware, operating systems, data management software, data processing, user experience, scalability, and cost-effectiveness.

       Ensure that the project meets all requirements and that the project team has everything it needs to build the solution for the ultimate end-users.

       Lead other architects and engineers to implement the solution.

       Ensure that all solutions meet corporate standards for all projects, such as data protection standards.

1.3.1.7 Other Roles

We have not covered BI engineers and business data owners here. In agile teams, we often also add a Scrum Master to the team.

We will outline in Chapter 16 that data teams may face quite different requirements in different industries. Also, small companies and start-ups have different needs than large enterprises. This diversity means there is no single definition of how a data team has to be structured; various roles will exist in one team but not in others.

Data teams in large organizations, especially with regulatory requirements, will incorporate roles such as data managers, security experts, data stewards and more.

1.3.1.8 Team Building

The structure of the team and the operating model depend largely on the company’s data maturity level. In many cases, some team members first have to clear out old legacy systems before creating something new. In some companies, leaders assign individuals to multiple teams.

The success of teams also depends a lot on the corporate culture. We will go into more detail on this in Chapter 17, ‘Mindset and Community.’ Setting up a data-driven organization is the focus of Chapter 14.

1.3.2 Data and Platforms

In most companies, data is currently fragmented: it exists horizontally across different departments and vertically coupled to various functions and silos. In addition, the proportion of critical information generated outside the usual processes is growing. Part of a data strategy, then, is to create a process that can handle various data formats and convert them into a structured and processable format. In this process, we can explore four different properties:

       Volume: Describes the amount of data that organizations collect through daily business processes. Volume is expressed in orders of magnitude, such as gigabytes, terabytes, or petabytes.

       Velocity: