Build a Keras model to scale and deploy on a Kubernetes cluster. We have seen exponential growth in the use of Artificial Intelligence (AI) over the last few years. AI is becoming the new electricity and is touching every industry, from retail to manufacturing to healthcare to entertainment. Within AI, we're seeing particular growth in Machine Learning (ML) and Deep Learning (DL) applications. ML is all about learning relationships from labeled (Supervised) or unlabeled (Unsupervised) data. DL has many layers of learning and can extract patterns from unstructured data like images, video, and audio. Keras to Kubernetes: The Journey of a Machine Learning Model to Production takes you through real-world examples of building DL models in Keras for recognizing product logos in images and extracting sentiment from text. You will then take that trained model and package it as a web application container before learning how to deploy this model at scale on a Kubernetes cluster. You will understand the different practical steps involved in real-world ML implementations, which go beyond the algorithms.
- Find hands-on learning examples
- Learn to use Keras and Kubernetes to deploy Machine Learning models
- Discover new ways to collect and manage your image and text data with Machine Learning
- Reuse examples as-is to deploy your models
- Understand the ML model development lifecycle and deployment to production
If you're ready to learn about one of the most popular DL frameworks and build production applications with it, you've come to the right place!
Page count: 474
Year of publication: 2019
Cover
Introduction
How This Book Is Organized
Conventions Used
Who Should Read This Book
Tools You Will Need
Summary
CHAPTER 1: Big Data and Artificial Intelligence
Data Is the New Oil and AI Is the New Electricity
Applications of Artificial Intelligence
Summary
CHAPTER 2: Machine Learning
Finding Patterns in Data
The Awesome Machine Learning Community
Types of Machine Learning Techniques
Solving a Simple Problem
Analyzing a Bigger Dataset
Comparison of Classification Methods
Bias vs. Variance: Underfitting vs. Overfitting
Reinforcement Learning
Summary
CHAPTER 3: Handling Unstructured Data
Structured vs. Unstructured Data
Making Sense of Images
Dealing with Videos
Handling Textual Data
Listening to Sound
Summary
CHAPTER 4: Deep Learning Using Keras
Handling Unstructured Data
Welcome to TensorFlow and Keras
Bias vs. Variance: Underfitting vs. Overfitting
Summary
CHAPTER 5: Advanced Deep Learning
The Rise of Deep Learning Models
New Kinds of Network Layers
Building a Deep Network for Classifying Fashion Images
CNN Architectures and Hyper‐Parameters
Making Predictions Using a Pretrained VGG Model
Data Augmentation and Transfer Learning
A Real Classification Problem: Pepsi vs. Coke
Recurrent Neural Networks
Summary
CHAPTER 6: Cutting‐Edge Deep Learning Projects
Neural Style Transfer
Generating Images Using AI
Credit Card Fraud Detection with Autoencoders
Summary
CHAPTER 7: AI in the Modern Software World
A Quick Look at Modern Software Needs
How AI Fits into Modern Software Development
Simple to Fancy Web Applications
The Rise of Cloud Computing
Containers and CaaS
Kubernetes: A CaaS Solution for Infrastructure Concerns
Summary
CHAPTER 8: Deploying AI Models as Microservices
Building a Simple Microservice with Docker and Kubernetes
Adding AI Smarts to Your App
Packaging the App as a Container
Pushing a Docker Image to a Repository
Deploying the App on Kubernetes as a Microservice
Summary
CHAPTER 9: Machine Learning Development Lifecycle
Machine Learning Model Lifecycle
Deployment on Edge Devices
Summary
CHAPTER 10: A Platform for Machine Learning
Machine Learning Platform Concerns
Putting the ML Platform Together
Summary
A Final Word …
APPENDIX A: References
Index
End User License Agreement
Chapter 7
Table 7.1: Some Useful Minikube Commands
Table 7.2: Useful Kubectl Commands to Access Local and Remote Kubernetes Clusters
Introduction
Figure 1: Opening a new notebook in Google Colaboratory
Figure 2: Click Connect to start the virtual machine
Figure 3: Example of running code in a notebook
Chapter 1
Figure 1.1: Alexa, can you do my homework?
Figure 1.2: Data volumes on the consumer Internet
Figure 1.3: Data volumes on the industrial Internet
Figure 1.4: AI for computer vision at a railway crossing
Figure 1.5: IBM Watson beating Jeopardy! champions
Figure 1.6: Google's self‐driving autonomous car
Figure 1.7: Google CEO demonstrating Duplex virtual assistant fooling the reservations attendant
Figure 1.8: Expressing Ys as a function of Xs
Figure 1.9: Describe the data to humans
Figure 1.10: Diagnose an issue using data
Figure 1.11: Weather forecasting
Figure 1.12: Route to work
Figure 1.13: Types of analytics
Figure 1.14: Rules‐based analytics models
Figure 1.15: Data‐driven analytics models
Figure 1.16: Person on treadmill
Figure 1.17: Fitbit wrist device
Figure 1.18: Cameras tracking motion
Chapter 2
Figure 2.1: Charting these real numbers shows a pattern
Figure 2.2: Kaggle hosts data science competitions and gives free datasets
Figure 2.3: UCI Machine Learning repository
Figure 2.4: Sample dataset we will analyze
Figure 2.5: Chart of the sample dataset
Figure 2.6: Clusters shown on the initial data chart
Figure 2.7: Linear regression tries to map X and Y values to a straight line
Figure 2.8: Varying slope (w) and intercept (b) values gives us different lines...
Figure 2.9: Gradient descent to find the optimal weight and bias terms
Figure 2.10: Plot of house price versus location
Figure 2.11: Decision based only on location
Figure 2.12: Decision based only on price
Figure 2.13: Linear decision boundary for buy vs. no‐buy decisions
Figure 2.14: Simple network representation of the logistic regression equation
Figure 2.15: Our new dataset with expectation of buy and don't buy
Figure 2.16: Non‐linear decision boundary
Figure 2.17: Sample of the Wine Quality dataset
Figure 2.18: Summary of the wine data frame
Figure 2.19: Precision and recall concepts
Figure 2.20: Precision and recall formula
Figure 2.21: Sample example of a decision tree
Figure 2.22: Sample decision tree
Figure 2.23: Shooting darts with high bias to the top left
Figure 2.24: Shooting darts with high variance across the board
Figure 2.25: Adjusting bias and variance to get your bull's‐eye!
Figure 2.26: Linear regressor underfitting our data
Figure 2.27: Overfitting on the training data
Figure 2.28: A good fit and well‐trained model!
Figure 2.29: How Reinforcement Learning works
Figure 2.30: A simple analogy for Reinforcement Learning
Figure 2.31: Learning from reinforcements received from the environment
Figure 2.32: Bellman equation for calculating long‐term rewards
Figure 2.33: Example showing a Markov Decision Process (MDP) and a sample Q‐Lea...
Figure 2.34: Deep Q Network to predict Q‐Values
Chapter 3
Figure 3.1: Structured data examples—timeseries and tabular data
Figure 3.2: Two paths to handling unstructured data
Figure 3.3: An image of a handwritten digit 5 in 28×28 resolution
Figure 3.4: The image expanded to show the 28×28 pixel array in detail
Figure 3.5: Image array as raw data with pixel intensity values
Figure 3.6: Load image using OpenCV and convert it to grayscale
Figure 3.7: RGB color space source Wikipedia
Figure 3.8: Results of array operations on the image
Figure 3.9: Results of the OpenCV operations on the image
Figure 3.10: Results of thresholding operations on the image
Figure 3.11: Results of applying 2D filters to the image
Figure 3.12: Results of applying Canny edge detection
Figure 3.13: Results of detecting a face using the Haar Cascade Classifier
Figure 3.14: Frequency chart of common words—stemmed
Figure 3.15: Frequency chart of common words—lemmatized
Figure 3.16: PCA to reduce dimensions and plot word embeddings
Figure 3.17: High‐level flow of Alexa answering a question
Figure 3.18: Frequency domain reveals the hidden information inside waves
Figure 3.19: Time domain plot of sound from car engine
Figure 3.20: Frequency domain plot for car engine sound signal
Chapter 4
Figure 4.1: Biological neurons in the human brain
Figure 4.2: Simple neural network where learning units are connected as a network
Figure 4.3: Processing at an individual neuron
Figure 4.4: Neural network with weight values
Figure 4.5: Neural network with weight and bias values
Figure 4.6: How Gradient Descent moves toward the minima
Figure 4.7: Sample from the MNIST training dataset
Figure 4.8: Summary of our neural network, multi‐layered perceptron
Figure 4.9: Darts example to explain bias and variance
Figure 4.10: Training vs. validation set accuracy
Chapter 5
Figure 5.1: Simple convolution filter to extract horizontal lines
Figure 5.2: Simple convolution filter to extract vertical lines
Figure 5.3: Samples from fashion images dataset
Figure 5.4: Simplified architecture of our CNN model
Figure 5.5: Model accuracy increases and loss decreases over the epochs
Figure 5.6: ImageNet categorized images
Figure 5.7: Electric Locomotive image from Wikipedia
Figure 5.8: Typical CNN architecture where early layers extract spatial pattern...
Figure 5.9: General folder structure for our problem of predicting two classes ...
Figure 5.10: Results of data augmentation on a few logos
Figure 5.11: Predictions for test1.jpg: Coca‐Cola and test2.jpg: Pepsi
Figure 5.12: Architecture of a Recurrent Neural Network
Chapter 6
Figure 6.1: General idea of how neural style transfer works
Figure 6.2: Example of neural style transfer
Figure 6.3: Style and content images we will use for this demo
Figure 6.4: Results of neural style transfer
Figure 6.5: Neural network captures encoding of image
Figure 6.6: Art‐forger analogy for generative adversarial networks
Figure 6.7: Displaying the fashion items dataset
Figure 6.8: Results from GAN trained to generate fashion images
Figure 6.9: Credit card transaction dataset with details hidden in V‐features
Figure 6.10: Concept of an autoencoder neural network
Figure 6.11: Model accuracy plot for autoencoder
Figure 6.12: Model loss plot for autoencoder
Figure 6.13: Predictions on testing data using autoencoder
Chapter 7
Figure 7.1: Displaying a web page and HTML code
Figure 7.2: The HTML 2 logo
Figure 7.3: Data center with racks of blade servers
Figure 7.4: IaaS vs. PaaS vs. SaaS, explained through a block diagram
Figure 7.5: Virtual machines vs. containers
Figure 7.6: The simple application shown in a browser
Chapter 8
Figure 8.1: What you see in the browser
Figure 8.2: The new app.py file shown in a browser
Figure 8.3: The result after pressing Submit
Figure 8.4: The new app demo in the browser
Figure 8.5: What you get after pressing Submit
Figure 8.6: Entering a new phrase
Figure 8.7: This one results in a negative result
Figure 8.8: Demo on the local host
Figure 8.9: Result shown on the local host
Figure 8.10: Accessing the application as a Docker app
Chapter 9
Figure 9.1: Steps in a Machine Learning development lifecycle
Figure 9.2: An unofficial generic guideline for model selection
Figure 9.3: The H2O AI workbench allows codeless model development
Figure 9.4: H2O AI example of an AutoML leaderboard
Figure 9.5: Changing from CPU to GPU runtime in Google Colaboratory
Figure 9.6: Sample images from the CIFAR‐10 dataset
Figure 9.7: Change the setting to use GPU
Figure 9.8: Change the setting to use TPU
Figure 9.9: Change the setting to use CPU only
Chapter 10
Figure 10.1: Typical data science concerns and tools that address them
Figure 10.2: This Kafka‐based system for data ingestion includes a Hadoop conne...
Figure 10.3: H2O web user interface (UI)—Flow
Figure 10.4: Uploading and parsing a CSV file—no code needed
Figure 10.5: Checking the parsed data frame and splitting it into training and ...
Figure 10.6: Selecting the model and defining hyper‐parameters for it
Figure 10.7: Running the training job. Evaluating the trained model. Still no c...
Figure 10.8: Selecting Run AutoML from the menu bar
Figure 10.9: Running the AutoML job. Note the leaderboard of all different mode...
Figure 10.10: Images used to validate the model
Dattaraj Jagdish Rao
Welcome! This book introduces the topics of Machine Learning (ML) and Deep Learning (DL) from a practical perspective. I try to explain the basics of how these techniques work and the core algorithms involved. The main focus is on building real‐world systems using these techniques. Many ML and DL books cover the algorithms extensively but do not always show a clear path to deploying those algorithms in production systems. Also, we often see a big gap in understanding around how these Artificial Intelligence (AI) systems can be scaled to handle large volumes of data—also referred to as Big Data.
Today we have systems like Docker and Kubernetes that help us package our code and seamlessly deploy it to large on‐premises or Cloud systems. Kubernetes takes care of all the low‐level infrastructure concerns like scaling, fail‐over, load balancing, networking, storage, security, etc. I show how your ML and DL projects can take advantage of the rich features that Kubernetes provides, focusing on deployment of ML and DL algorithms at scale and offering tips for handling large volumes of data.
I talk about many popular algorithms and show how you can build systems using them. I include code examples that are heavily commented so you can easily follow and possibly reproduce the examples. I use an example of a DL model to read images and classify logos of popular brands. Then this model is deployed on a distributed cluster so it can handle large volumes of client requests. This example shows you an end‐to‐end approach for building and deploying a DL model in production.
I also provide references to books and websites that cover details of items I do not cover fully in this book.
The first half of the book (Chapters 1–5) focuses on Machine Learning (ML) and Deep Learning (DL). I show examples of building ML models with code (in Python) and of tools that automate this process. I also show an example of building an image classifier model using the Keras library and TensorFlow framework. This logo‐classifier model is used to distinguish between the Coca‐Cola and Pepsi logos in images.
In the second half of the book (Chapters 6–10), I talk about how these ML and DL models can actually be deployed in a production environment. We discuss some common concerns that data scientists have and how software developers can implement these models. I walk through an example of deploying our earlier logo‐classifier model at scale using Kubernetes.
Italic terms indicate key concepts that I want to draw attention to and that are good to grasp.
Underlined text indicates references to other books, publications, or external web links.
Code examples in Python will be shown as follows:
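For instance, a minimal snippet looks like this (the file name is an illustrative placeholder, not a dataset from a specific chapter):

import pandas as pd                      # load the Pandas library
data = pd.read_csv('data/example.csv')   # 'example.csv' is a placeholder name
print(data.head())                       # show the first five rows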
Results from code are shown as a picture or in this font below the code box.
This book is intended for software developers and data scientists. I talk about developing Machine Learning (ML) models, connecting these to application code, and deploying them as microservices packaged as Docker containers. Modern software systems are heavily driven by ML and I feel that data scientists and software developers can both benefit by knowing enough about each other's discipline.
Whether you are a beginner at software/data science or an expert in the field, I feel there will be something in this book for you. Although a programming background is best for understanding the examples well, the code and examples are targeted to a very general audience. The code presented is heavily commented as well, so it should be easy to follow. Although I have used Python and specific libraries—Scikit‐Learn and Keras—you should be able to find equivalent functions and convert the code to other languages and libraries like R, MATLAB, Java, SAS, and C++.
My effort is to provide as much theory as I can so you don't need to go through the code to understand the concepts. The code is very practical and helps you adapt the concepts to your data very easily. You are free (and encouraged) to copy the code and try the examples with your own datasets.
All the code is available for free on my GitHub site, listed here. This site also contains sample datasets and images we use in the examples. Datasets are in comma‐separated values (CSV) format and are in the data folder.
https://github.com/dattarajrao/keras2kubernetes
My aim is to provide as much theory about the concepts as possible. The code is practical and commented to help you understand it. Like most data scientists today, I prefer the Python programming language. You can install the latest version of Python from https://www.python.org/.
A popular way to write Python code is using Jupyter Notebooks. A notebook is a browser‐based interface for running your Python code: you open a web page, write Python code, execute it, and see the results right there on the same page. It has an excellent user‐friendly interface and shows you immediate results by executing individual code cells. The examples I present are also small blocks of code that you can quickly run separately in a Jupyter Notebook. Jupyter can be installed from http://jupyter.org.
The big advantage of Python is its rich set of libraries for solving different problems. We particularly use the Pandas library for loading and manipulating the data used to build our ML models. We also use Scikit‐Learn, a popular library that provides implementations of most ML techniques. These libraries are available from the following links:
https://pandas.pydata.org/
https://scikit‐learn.org/
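As a quick taste of how these two libraries work together, here is a minimal sketch that loads a small dataset bundled with Scikit‐Learn into a Pandas data frame and fits a simple classifier. It is illustrative only, not an example from a later chapter:

import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# load a small built-in dataset into a Pandas data frame
iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = iris.target

# hold out some data for testing, then train a simple classifier
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
print('Test accuracy:', model.score(X_test, y_test))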
Specifically, for Deep Learning, we use a framework for building our models. There are multiple frameworks available, but the one we use for examples is Google's TensorFlow. TensorFlow has a good Python interface that we use to write Deep Learning code. We use Keras, a high‐level abstraction library that runs on top of TensorFlow; Keras comes packaged with TensorFlow. You can install TensorFlow for Python from https://www.tensorflow.org.
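To give a feel for what Keras code looks like on top of TensorFlow, here is a minimal sketch of defining and compiling a small network. The layer sizes and input shape are arbitrary choices for illustration:

import tensorflow as tf
from tensorflow import keras

# a tiny fully connected network; the sizes here are arbitrary
model = keras.Sequential([
    keras.layers.Dense(32, activation='relu', input_shape=(10,)),
    keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()   # print the layer-by-layer structure of the model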
One disclaimer: TensorFlow, although production‐ready, is under active development by Google. New versions are released every two to three months, which is unprecedented for normal software development. Because of today's Agile development and continuous integration practices, Google is able to release major functionality in weeks rather than months. Hence the code I show for Deep Learning in Keras and TensorFlow may need updating to the latest version of the library. Usually this is pretty straightforward. The concepts I discuss will still be valid; you just may need to update the code periodically.
If you don't want to set up your own Python environment, you can get a hosted notebook running entirely in the Cloud. That way, all you need to run the Python code is a computer with an active Internet connection. There are no libraries or frameworks to install, all thanks to the magic of Cloud computing. Two popular choices here are Amazon's SageMaker and Google's Colaboratory. I particularly like Colaboratory for its Machine Learning library support.
Let me show you how to set up a notebook using Google's Cloud‐hosted programming environment, called Colaboratory. A special shout‐out to our friends at Google, who made this hosted environment available for free to anyone with a Google account. To set up the environment, make sure you have a Google account (if not, you'll need to create one). Then open your web browser and go to https://colab.research.google.com.
Google Colaboratory is a free (as of this writing) Jupyter environment that lets you create a notebook and easily experiment with Python code. This environment comes pre‐packaged with the best data science and Machine Learning libraries, like Pandas, Scikit‐Learn, TensorFlow, and Keras.
The notebooks (work files) you create will be stored on your Google Drive account. Once you're logged in, open a new Python 3 notebook, as shown in Figure 1.
Figure 1: Opening a new notebook in Google Colaboratory
You will see a screen similar to the one in Figure 2, with your first Python 3 notebook called Untitled1.ipynb. You can change the name to something relevant to you. Click Connect to connect to an environment and get started. This will commission a Cloud machine in the background, and your code will run on that virtual machine. This is the beauty of working in a Cloud‐hosted environment: processing, storage, and memory concerns are handled by the Cloud, and you can focus on your logic. This is an example of the Software‐as‐a‐Service (SaaS) paradigm.
Figure 2: Click Connect to start the virtual machine
Once your notebook is connected to the Cloud runtime, you can add code cells and click the Play button beside each cell to run your code. It's that simple. Once the code runs, you will see outputs popping up below the block. You can also add text blocks for informational material you want to include and format this text.
Figure 3 shows a simple example of a notebook with code snippets for checking the TensorFlow library and downloading a public dataset using the Pandas library. Remember that Python has a rich set of libraries that helps you load, process, and visualize data.
Figure 3: Example of running code in a notebook
Look at the second code block in Figure 3; it loads a CSV file from the Internet and shows the data in a data frame. This dataset shows traffic at different intersections in the city of Chicago. This dataset is maintained by the city.
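The two cells in Figure 3 amount to something like the following sketch. The dataset URL below is a stand‐in, since any public CSV link (for example, one from the data.gov catalog mentioned next) will work:

import tensorflow as tf
print(tf.__version__)    # confirm the TensorFlow library is available

import pandas as pd
# stand-in URL: replace with any public CSV link
url = 'https://example.com/chicago_traffic_counts.csv'
df = pd.read_csv(url)    # download the CSV and load it into a data frame
df.head()                # display the first few rows in the notebook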
Many such datasets are available for free, thanks to the amazing data science community. These datasets are cleansed and contain data in a good format for building models. They can be used to understand different ML algorithms and their effectiveness. You can find a comprehensive list at https://catalog.data.gov/dataset?res_format=CSV. There you can search for datasets and click the CSV icon to download one or copy its link.
Google also now has a dedicated website for searching for datasets that you can use to build your models. Have a look at this site at https://toolbox.google.com/datasetsearch.
We will now embark on a journey of building Machine Learning and Deep Learning models for real‐world use cases. We will use the Python programming language and popular libraries for ML and DL, like Scikit‐Learn, TensorFlow, and Keras. You could build an environment from scratch and work on the code provided in this book, or you could use a hosted notebook in Google's Colaboratory to run the code. Many open datasets are freely available for you to experiment with model building and testing, and you can enhance your data science skills with them; I show examples of these throughout the book. Let's get started!
Chapter 1 provides an overview of some of the hot trends in the industry around Big Data and Artificial Intelligence. We will see how the world is being transformed through digitization, leading to the Big Data phenomenon—both in the consumer and industrial spaces. We see data volumes increasing exponentially, from terabytes to exabytes to zettabytes, and the processing power of computers increasing by orders of magnitude. We will talk about software getting smarter with the application of Artificial Intelligence—whether it's IBM's Watson beating human champions at Jeopardy!, Facebook automatically tagging friends in your photos, or Google's self‐driving car. Finally, the chapter discusses the types of analytics and covers a simple example of building a system driven by analytics to deliver outcomes.
We are living in the Internet age. From shopping on Amazon to booking cabs through Uber to binge‐watching TV shows on Netflix, all these outcomes are enabled by the Internet. They involve huge volumes of data being constantly uploaded and downloaded between our computing devices and remote servers in the Cloud. The computing devices themselves are no longer restricted to personal computers, laptops, and mobile phones. Today, we have many more smart devices or "things" connected to the Internet, like TVs, air conditioners, and washing machines, with more every day. These devices are powered by microprocessors just like a computer and have communication interfaces to transfer data to the Cloud. They can upload their data to the Cloud using protocols like Wi‐Fi, Bluetooth, and cellular, and they can download up‐to‐date content from remote servers, including the latest software updates.
The Internet of Things (IoT) is here to change our lives with outcomes that would easily fit in a science fiction novel from 10 years ago. We have fitness wristbands that suggest exercise routines based on our lifestyle, watches that monitor for heart irregularities, home electronics that listen to voice commands, and of course, the famous self‐driving cars and trucks. These Internet‐connected devices are smart enough to analyze complex data in the form of images, videos, and audio, understand their environments, predict expected results, and either take a recommended action or prescribe one to a human.
My Fitbit checks if I have not done enough exercise in a day and "asks" me politely to get up and start exercising. We have sensors that detect the absence of motion and shut off the lights automatically if a room is empty. The Apple Watch Series 4 has a basic EKG feature to measure your heart condition. Tesla owners get new features delivered directly over the air through software updates, with no need to visit the service shop. Modern IoT devices are not only connected but smart enough to achieve amazing outcomes that were confined to science fiction novels just a few years back.
So great is the impact of this IoT revolution that we are now getting used to expecting such results. This technology is here to stay. The other day, my 4‐year‐old asked our Amazon Echo device, “Alexa, can you do my homework?” (See Figure 1.1.) The modern consumer is now expecting devices to provide these new outcomes. Anything less is becoming unacceptable!
Figure 1.1: Alexa, can you do my homework?
Despite the diverse outcomes, there is a common pattern to these IoT devices or "things." They have sensors to "observe" the environment and collect data. This data may range from simple sensor readings, like temperature measurements, to complex unstructured datatypes like sound and video. Some processing is done on the device itself, which is called edge processing. IoT devices usually have very limited processing and storage capability due to their low cost. For heavier processing and comparison with historical data, these devices upload data to a remote server or the Cloud. Newer advanced IoT devices have built‐in connectivity to the Cloud with options like Wi‐Fi, Bluetooth, or cellular, while low‐power (and low‐cost) devices usually use a gateway to connect and upload data to the Cloud. In the Cloud, the data can be processed on bigger, faster computers, often arranged into large clusters in data centers. We can also combine a device's data with historical data from the same device and from many other devices, which can generate new and more complex outcomes not possible at the edge alone. The results generated are then downloaded back to the device using the same connectivity options. These IoT devices may also need to be managed remotely with timely software updates and configuration—that is also done through the Cloud. Figure 1.2 shows a very high‐level overview with the scale of data handled at each level.
Figure 1.2: Data volumes on the consumer Internet
We are putting billions of smart connected devices on the Internet. We have smartphones capturing, storing, and transferring terabytes of photos and videos. Security cameras collect video feeds 24×7. GPS devices, RFID tags, and fitness trackers continuously monitor, track, and report motion. We have moved our library off the shelves and into hundreds of eBooks on our Kindles. We moved from tapes and CDs to MP3s to downloaded music libraries on apps. Netflix consumes 15% of the world's Internet bandwidth. And all this is only the consumer Internet.
There is a parallel data revolution happening in the industrial world, with even bigger outcomes. This is a whole new Internet being championed by companies like General Electric, Siemens, and Bosch, especially for industrial applications. It's known as the Industrial Internet, or Industry 4.0 in Europe. Instead of smaller consumer devices, heavy machinery like gas turbines, locomotives, and MRI machines is transformed into smart devices and connected to the Internet. These machines are upgraded with advanced sensors, connectivity, and processing power to enable edge analytics and connectivity to the industrial Cloud. Industrial machines generate terabytes and petabytes of data every day, probably much more than consumer devices. This data needs to be processed in real time to understand what a machine is telling us and how we can improve its performance. We need to be able to determine, by observing sensor data, that an aircraft is due for service and should not be sent on a flight. Our MRI scanners should have extremely high accuracy, capturing images that provide enough evidence for a doctor to diagnose a condition.
You can clearly see from Figure 1.3 that the scales of data increase in the industrial world along with the criticality of processing the data and generating outcomes in time. We can wait a couple of seconds for our favorite Black Mirror episode to buffer up. But a few seconds' delay in getting MRI results to a doctor may be fatal for the patient!
Figure 1.3: Data volumes on the industrial Internet
This is the Big Data revolution, and we are all a part of it. All this data is of little use unless we have a way to process it in time and extract value from it. We are seeing unprecedented growth in the processing power of computing devices and a similar rise in storage capacity. Moore's Law states that the processing power of a computing device doubles roughly every two years due to improvements in electronics: basically, we can pack twice the number of transistors into the same form factor. Modern computing technology is making this law pretty much obsolete. We are seeing growth of 10–100 times each year in processing power using advanced processors like NVIDIA GPUs, Google TPUs, and specialized FPGAs integrated using System‐on‐Chip (SoC) technology. A computer is no longer a bulky screen with a keyboard and a CPU tower sitting on a table; we have microprocessors installed in televisions, air conditioners, washing machines, trains, airplanes, and more. Data storage volumes are rising from terabytes to petabytes and exabytes, and now we have a new term to describe Big Data: the zettabyte. We are getting good at improving processing on the device (the edge) and moving the more intensive storage and processing to the Cloud.
This growth in data and processing power is driving improvements in the type of analysis we do on the data. Traditionally, we would program computing devices with specific instructions to follow, and they would diligently run these algorithms without question. Now we expect these devices to be smarter and to use this large data to get better outcomes. We don't just want predefined rules to run all the time; we want to achieve the outcomes we talked about earlier. These devices need to think like a human. We expect computers to develop a visual and audio perception of the world through optical and voice sensors, to plan our schedules like a human assistant would, to tell us in advance if our car will have issues based on the engine overheating, and to respond like a human with answers to the questions we ask.
All this needs a paradigm shift in the way we conceptualize and build analytics. We are moving from predefined rule‐based methods to building Artificial Intelligence (AI) into our processing systems. Our traditional algorithmic methods for building analytics cannot keep up with the tremendous increase in the volume, velocity, and variety of data these systems handle. We now need specialized applications of a kind so far thought possible only for the human brain, not programmable in computers. Today, we have computers learning to do intelligent tasks and even out‐performing humans at them. Dr. Andrew Ng, Stanford professor and co‐founder of Coursera, famously said, "AI is the new electricity." Just as electricity touched and totally transformed every industry and every aspect of human life during the Industrial Revolution, we are seeing AI do the exact same thing. AI is touching so many areas of our lives and enabling outcomes that were considered impossible for computers. Big Data and AI are transforming all aspects of our lives and changing the world!
Examples of AI performing smart tasks include recognizing people in photos (Google Photos), responding to voice commands (Alexa), playing video games, looking at MRI scans to diagnose patients, replying to chat messages, self‐driving cars, detecting fraudulent transactions on credit cards, and many more. These were all considered specialized tasks that only humans could do, but we now have computer systems starting to do them even better than humans. We have examples like IBM's Deep Blue, an AI computer that beat the reigning world chess champion. Self‐driving trucks can take cross‐country trips in the United States. Amazon Alexa can listen to your command, interpret it, and respond with an answer—all in a matter of seconds. The same holds for the industrial Internet. With many recent examples—like autonomous trucks and trains, power plants moving to predictive maintenance, and airlines able to anticipate delays before takeoff—we see AI driving major outcomes in the industrial world. See Figure 1.4.
Figure 1.4: AI for computer vision at a railway crossing
AI is starting to play a role in areas that no one would have thought of just 2 or 3 years ago. Recently there was news about a painting purely generated by AI that sold for a whopping $432,500. The painting sold by Christie's NY was titled “Edmond de Belamy, from La Famille de Belamy.” This painting was generated by an AI algorithm called Generative Adversarial Networks (GAN). You will see examples and code to generate images with AI in Chapter 6. Maybe you can plan your next painting with AI and try to fetch a good price!
Another interesting AI project was done by NVIDIA researchers, who took celebrity face images and generated new ones. The result was some amazing new images that looked absolutely real but did not belong to any celebrity. They were fakes! Using random numbers and patterns learned by "watching" real celebrity photos, the super‐smart AI was able to create indistinguishable fakes. We will see cool AI examples like these in Chapter 6.
Imagine a security camera system at a railway crossing. It captures terabytes of video feeds from multiple cameras 24×7. It synchronizes feeds from several cameras and shows them on a screen along with timing information from each video. Now a human can look at this feed live or play back a specific time to understand what happens. In this case, the computer system handles the capturing and storing of data in the right format, synchronizing several feeds and displaying them on a common dashboard. It performs these tasks extremely efficiently without getting tired or complaining.
A human does the actual interpretation of the videos. If we want to check whether people are crossing the track as a train approaches, we rely on a human to check the feed and report back. Similar surveillance systems are used to detect suspicious behavior in public spaces, fire hazards on a ship, or unattended luggage at an airport. The final analysis needs to be done by the human brain, which picks up the patterns of interest and acts on them. The human brain has amazing processing power and built‐in intelligence: it can process hundreds of images per second and interpret them to look for items of interest (people, fires, etc.). The drawback is that humans are prone to fatigue over time and tend to make errors. A security guard who continuously watches live feeds is bound to get tired and may miss important events.
Artificial Intelligence is all about building human‐like intelligence into computing systems. In the security feed example, a system that not only displays the synchronized video feeds but also recognizes significant activities in them is an AI system. To do this, the system needs more than just large data and processing power. It needs smart algorithms that understand and extract patterns in data and use these patterns to make predictions on new data. These smart algorithms constitute the "brain" of our AI system and help it perform human‐like activities.
Normal computer systems are very good at performing repetitive tasks. They need to be explicitly programmed with the exact instructions to perform actions on data, and they will continuously run these actions on any new data that enters the system. We program these instructions in code, and the computer has no problem executing this code over and over, millions of times. Modern computing systems can also handle parallel processing by running multiple jobs simultaneously on multi‐core processors; however, each job is still a predetermined sequence programmed into it. This is where the earlier activity of processing video feeds and showing them on a display fits perfectly. You can feed the system footage from hundreds of cameras simultaneously, and it will do an excellent job formatting, storing, synchronizing, and displaying the video on‐screen without any loss—as long as the computing resources (CPU, memory, and storage) are adequate.
However, in order to understand these videos and extract valuable knowledge from them, a system needs a totally different capability. This capability, which we as humans take for granted, is known as intelligence…and it is a pretty big deal for computers. Intelligence helps us look at videos and understand what is happening inside them. Intelligence helps us read hundreds of pages of a book and summarize the story to a friend in a few words. Intelligence helps us learn to play a game of chess and get good at it over time. If we can somehow push this intelligence into computers, then we have a lethal combination of speed and intelligence, which can help us do some amazing things. This is what Artificial Intelligence is all about!
AI has found many applications in our lives. As we speak, more AI applications are being developed by smart engineers to improve different aspects of our lives.
A very popular application of AI is knowledge representation. This involves trying to replicate the human brain's super‐ability to store large volumes of information in a manner that is easy to retrieve and correlate, so as to answer a question. If I ask you about your first day at your first ever job, you probably remember it pretty well and hopefully have fond memories. You may not do so well remembering, say, the 15th day, unless something major happened then. Our brain is very good at storing large volumes of relevant information along with a context for it, so when needed it can quickly look up the right information based on that context and retrieve it. Similarly, an AI system needs to convert volumes of raw data into knowledge that can be stored with context and easily retrieved to find answers. A good example of this is IBM's Watson, a supercomputer that is able to learn by reading millions of documents over the Internet and storing this knowledge internally. Watson was able to use this knowledge to answer questions and beat human experts at the game of Jeopardy!. IBM is also teaching Watson medical diagnosis knowledge so that it can help develop medical prescriptions like a doctor. See Figure 1.5.
Figure 1.5: IBM Watson beating Jeopardy! champions
(Source: Wikimedia)
Another popular and even cooler application of AI is building a sense of perception into machines. Here the computer inside a machine collects and interprets data from advanced sensors to help the machine understand its environment. Think of a self‐driving car that uses cameras, LiDAR, RADAR, and ultrasound sensors to locate objects on the road. Self‐driving cars have AI computers that help them look for pedestrians, cars, signs, and signals on the road and make sure they avoid obstacles and follow traffic rules. Figure 1.6 shows Waymo, Google's self‐driving car.
Figure 1.6: Google's self‐driving autonomous car
(Source: Wikimedia)
AI can also be used for strategy and planning, where we have smart agents that know how to interact with real‐world objects and achieve given objectives. This could be an AI beating the Grandmaster at a game of chess or an industrial agent or robot picking up your online orders from an Amazon warehouse and preparing your shipment in the fastest manner.
More applications of AI include recommendation engines like Amazon's, which propose the next items you may be interested in based on your purchase history, or Netflix recommending a movie you will like based on movies you have seen in the past. Online advertising is a huge area where AI is used to understand patterns in human activity and improve the visibility of products for sale. Google and Facebook automatically tagging photos of your friends is also done using AI.
Video surveillance is another area that is being revolutionized by AI. Recently many police teams have started using AI to identify persons of interest from video footage from security cameras and then track these people. AI can do much more than just find people in security footage. We are seeing AI understand human expressions and body posture to detect people with signs of fatigue, anger, acts of violence, etc. Hospitals use camera feeds with AI to see if patients are expressing high levels of stress and inform the doctor. Modern cars, trucks, and trains use driver cameras to detect if a driver is under stress or getting drowsy and then try to avoid accidents.
Last but not least, video gaming was one of the foremost industries to start adopting AI, and it makes the most of the latest advances. Almost all modern games have an AI engine that can build a strategy for gameplay and play against the user. Some modern games have engines so good that they capture a nearly flawless rendition of the real world. For example, in my favorite game, Grand Theft Auto V, the railway crossing interactions are extremely realistic: the AI inside the game captures all aspects of stopping the traffic, flashing the crossing lights, passing the train, and then opening the gates to allow traffic to pass, absolutely perfectly. Using methods like Reinforcement Learning, games can learn different strategies for taking actions and build agents that can compete with humans and keep us entertained.
The field of AI that has really jumped in prominence and attention over the past years is Machine Learning (ML). This will be our area of focus for this book. ML is all about learning from data, extracting patterns, and using these patterns to make predictions. While most people put ML as a category under AI, you will find that modern ML is a major influence across different areas of AI applications. In fact, you may struggle to find an AI application without some learning element from ML. If you think back to the different AI applications we discussed, ML touches all of them in some way or another.
IBM Watson builds a knowledge base and learns from this using Natural Language Processing (an area of ML) to be good at prescribing solutions. Self‐driving cars use ML models—more specifically Deep Learning (DL) models—to process huge volumes of unstructured data to extract valuable knowledge like location of pedestrians, other cars, and traffic signals. An agent playing chess uses Reinforcement Learning, which is again an area of ML. The agent tries to learn different policies by observing games of chess over and over again and finally gets good enough to beat a human. This can be compared to how a child learns to play the game too, but in a highly accelerated fashion. Finally, the robot finding your items and preparing your order is mimicking what 10 or more warehouse workers would be doing—of course, without the lunch break!
One topic gaining a lot of attention in the world of AI is Artificial General Intelligence (AGI). This is an advanced AI that is almost indistinguishable from humans: it can do almost all the intellectual tasks that a human can and can essentially fool humans into thinking it's human. This is the kind of stuff you see on TV shows like Black Mirror or Person of Interest. I remember a 2018 Google event at which CEO Sundar Pichai demonstrated how their virtual assistant could call a restaurant and make a reservation (see Figure 1.7). The reservations attendant could not tell that a computer was on the other end of the line. This demo spun off many AI ethics debates and lots of criticism of Google for misleading people. Sure enough, Google issued an apology and released an AI ethics policy, basically saying they won't use AI for harm. However, the fact remains that AI capability is maturing by the day and will influence our lives more and more.
Figure 1.7: Google CEO demonstrating Duplex virtual assistant fooling the reservations attendant
(Source: Wikimedia)
Development of analytics depends on the problem you are trying to solve. Based on the intended outcome you are chasing, you first need to understand what data is available, what can be made available, and what techniques you can use to process it. Data collected from the system under investigation may be human inputs, sensor readings, existing sources like databases, images and videos from cameras, audio signals, etc. If you are building a system from scratch, you may have the freedom to decide which parameters you want to measure and what sensors to install. However, in most cases you will be dealing with digitizing an existing system with limited scope to measure new parameters. You may have to use whatever existing sensors and data sources are available.
Sensors measure particular physical characteristics of a system under study, like motion, temperature, pressure, images, audio, and video, and convert them into electrical signals. These electrical signals then flow through a signal‐processing circuit and get converted into a series of numbers that you can analyze using a computer. Sensors are usually located at strategic positions so as to give you maximum detail about the system. For example, a security camera should be placed so that it covers the maximum area you want to watch over, and some cars have ultrasound sensors attached at the back that measure distance from objects to help you when you're reversing.
If our system already has sensors collecting data, or existing databases with system data, then we can use this historical data to understand our system. Otherwise, we may have to install sensors and run the system for some time to collect data. Engineering systems also use simulators to generate data very similar to what a real system would produce. We can then use this data to build our processing logic—that is, our analytic. For example, to build temperature control logic for a thermostat, we can simulate different temperature variations in a room and then pass this data through our thermostat analytic, which is designed to increase or decrease heat flow in the room based on a set temperature. Another example of simulation is generating data on different stock market conditions and using it to build an analytic that decides on buying and selling stock. Data collected either from a real system or a simulator can also be used to train an AI system to learn patterns and make decisions on different states of the system.
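As a toy sketch of that thermostat analytic (the setpoint, deadband, and simulated readings below are all invented numbers):

# a minimal rule-based thermostat analytic
def thermostat(reading, setpoint=21.0, deadband=0.5):
    if reading < setpoint - deadband:
        return 'increase heat'
    if reading > setpoint + deadband:
        return 'decrease heat'
    return 'hold'

# simulate temperature variations in a room and run the logic on each reading
for temp in [18.2, 20.8, 21.3, 23.0]:
    print(temp, '->', thermostat(temp))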
Whether you are building an AI‐based or a non‐AI–based analytic, the general pattern is the same: you read inputs from data sources, build the processing logic, test this logic on real or simulated data, and deploy it to the system to generate the desired outputs. Mathematically speaking, all these inputs and outputs, whose values can keep varying over time, are called variables. The inputs are usually called independent variables (or Xs) and the outputs are called dependent variables (or Ys). Our analytic tries to build a relationship between our dependent and independent variables. We will use this terminology in the rest of the book as we describe the different AI algorithms.
Our analytic tries to express or map our Ys as a function of our Xs (see Figure 1.8). This could be a simple math formula or a complex neural network that maps independent variables to dependent ones. We could know the details of the formula—meaning that we know the intrinsic details about how our system behaves. Or the relationship may be a black box, where we don't know any details and only use the black box to predict outputs based on inputs. There may also be internal relationships among our independent variables, the Xs; however, we typically choose to ignore those and focus on the X‐Y relationships.
Figure 1.8: Expressing Ys as a function of Xs
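As a concrete toy illustration of learning such a mapping from data (the numbers below are made up), we can fit a straight line Y = w*X + b to a handful of points:

import numpy as np

X = np.array([1, 2, 3, 4, 5])              # independent variable (inputs)
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])    # dependent variable (outputs)

w, b = np.polyfit(X, Y, 1)                 # fit a degree-1 polynomial: Y = w*X + b
print('Y = %.2f*X + %.2f' % (w, b))        # the learned relationship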
The job of the analytic is to produce outputs by processing input data from the system so humans can make decisions about the system. It is extremely important to first understand the question we want to ask of the system before jumping into building the analytic. Based on the question we are asking, there are four categories of analytics. The following sections explain each with examples of the questions they try to answer.
Descriptive Analytics
These are the simplest kind but are also very important, because they try to clearly describe the data. The outputs here may be statistical summaries like the mean, mode, and median. We could have visual aids like charts and histograms that help humans understand patterns in the data. Many business intelligence and reporting tools, like Tableau, Sisense, QlikView, and Crystal Reports, are based on this concept. The idea is to provide users with a consolidated view of their data to help them make decisions. The example in Figure 1.9 shows in which months we had higher than usual monthly spending.
Figure 1.9: Describe the data to humans
Diagnostic Analytics
Here we try to diagnose something that happened and understand why it happened. The obvious example is a doctor looking at your symptoms and diagnosing the presence of a disease. We have systems like WebMD that try to capture the amazing intelligence that doctors possess and give us a quick initial diagnosis. Similarly, healthcare machines like MRI scanners use diagnostic analytics to try to isolate patterns of disease. This type of analytic is also very popular in industrial applications for diagnosing machines. Using sensor data, industrial control and safety systems apply diagnostic rules to detect a failure occurring and try to stop the machine before major damage occurs.
We may use the same tools used in descriptive analytics, like charts and summaries, to diagnose issues. We may also use techniques like inferential statistics to identify root causes of certain events. In inferential statistics, we establish a hypothesis, or assumption, saying that our event depends on certain Xs in our problem. Then we collect data to see whether we have enough evidence to support this assumption.
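A toy sketch of such a hypothesis test: suppose we assume that engine failures, like the one shown in Figure 1.10 below, are related to oil temperature. With the (invented) readings below, a two-sample t-test checks whether the data supports that assumption:

from scipy import stats

# invented oil-temperature readings: normal runs vs. runs that ended in failure
normal_runs  = [78, 80, 79, 81, 77, 80]
failure_runs = [88, 91, 86, 90, 89, 92]

# is the difference in means statistically significant?
t_stat, p_value = stats.ttest_ind(normal_runs, failure_runs)
print('p-value = %.4f' % p_value)   # a small p-value is evidence for our hypothesis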
The analytic here will normally provide us with evidence regarding a particular event. The human still has to use her intuition to decide why the event occurred and what needs to be done. The example in Figure 1.10 shows how the engine oil temperature kept increasing, which might have caused the engine failure.
Figure 1.10: Diagnose an issue using data
Predictive Analytics
The previous two categories of analytics dealt with what happened in the past, in hindsight. Predictive analytics focuses on the future, or foresight. Here we use techniques like Machine Learning to learn from historical data and build models that predict the future. This is where we will primarily use AI to develop analytics that make predictions. Since we are making predictions, these analytics make extensive use of probability to give us a confidence factor. We will cover this type of analytic in the rest of the book.
The example in Figure 1.11 shows weather websites analyzing history data patterns to predict the weather.
Figure 1.11: Weather forecasting
(Source: weather.com)
Prescriptive Analytics
Now we take prediction one step further and prescribe an action. This is the most complex type of analytic and is still an active area of research, and also of some debate. Prescriptive analytics can be seen as a type of predictive analytics; however, for an analytic to be prescriptive, it must also clearly state an action the human should take. In some cases, if the confidence in the prediction is high enough, we may allow the analytic to take action on its own. This type of analytic depends heavily on the domain for which you are trying to make the prediction. To build impactful prescriptive analytics, we need to explore many advanced AI methods.
The example in Figure 1.12 shows how Google Maps prescribes the fastest route by considering traffic conditions.
Figure 1.12: Route to work
(Source: Google Maps)
Figure 1.13: Types of analytics