Machine Learning in the AWS Cloud - Abhishek Mishra - E-Book

Description

Put the power of AWS Cloud machine learning services to work in your business and commercial applications!

Machine Learning in the AWS Cloud introduces readers to the machine learning (ML) capabilities of the Amazon Web Services ecosystem and provides practical examples to solve real-world regression and classification problems. While readers do not need prior ML experience, they are expected to have some knowledge of Python and a basic knowledge of Amazon Web Services.

Part One introduces readers to fundamental machine learning concepts. You will learn about the types of ML systems, how they are used, and challenges you may face with ML solutions.

Part Two focuses on machine learning services provided by Amazon Web Services. You'll be introduced to the basics of cloud computing and AWS offerings in the cloud-based machine learning space. Then you'll learn to use Amazon Machine Learning to solve a simpler class of machine learning problems, and Amazon SageMaker to solve more complex problems.

* Learn techniques for preprocessing data, basic feature engineering, data visualization, and model building
* Discover common neural network frameworks with Amazon SageMaker
* Solve computer vision problems with Amazon Rekognition
* Benefit from illustrations, source code examples, and sidebars in each chapter

The book appeals to both Python developers and technical/solution architects. Developers will find concrete examples that show them how to perform common ML tasks with Python on AWS. Technical/solution architects will find useful information on the machine learning capabilities of the AWS ecosystem.

Page count: 652

Year of publication: 2019

Table of Contents

Cover

Acknowledgments

About the Author

About the Technical Editor

Introduction

Who This Book Is For

What This Book Covers

How This Book Is Structured

What You Need to Use This Book

Conventions

Source Code

Errata

Part 1: Fundamentals of Machine Learning

Chapter 1: Introduction to Machine Learning

What Is Machine Learning?

Types of Machine Learning Systems

The Traditional Versus the Machine Learning Approach

Summary

Chapter 2: Data Collection and Preprocessing

Machine Learning Datasets

Data Preprocessing Techniques

Summary

Chapter 3: Data Visualization with Python

Introducing Matplotlib

Components of a Plot

Common Plots

Summary

Chapter 4: Creating Machine Learning Models with Scikit-learn

Introducing Scikit-learn

Creating a Training and Test Dataset

Creating Machine Learning Models

Summary

Chapter 5: Evaluating Machine Learning Models

Evaluating Regression Models

Evaluating Classification Models

Choosing Hyperparameter Values

Summary

Part 2: Machine Learning with Amazon Web Services

Chapter 6: Introduction to Amazon Web Services

What Is Cloud Computing?

Cloud Service Models

Cloud Deployment Models

The AWS Ecosystem

Sign Up for an AWS Free-Tier Account

Summary

Note

Chapter 7: AWS Global Infrastructure

Regions and Availability Zones

Edge Locations

Accessing AWS

Summary

Chapter 8: Identity and Access Management

Key Concepts

Common Tasks

Summary

Chapter 9: Amazon S3

Key Concepts

Common Tasks

Summary

Chapter 10: Amazon Cognito

Key Concepts

Common Tasks

User Pools or Identity Pools: Which One Should You Use?

Summary

Chapter 11: Amazon DynamoDB

Key Concepts

Common Tasks

Summary

Chapter 12: AWS Lambda

Common Use Cases for Lambda

Key Concepts

Common Tasks

Summary

Chapter 13: Amazon Comprehend

Key Concepts

Text Analysis Using the Amazon Comprehend Management Console

Interactive Text Analysis with the AWS CLI

Using Amazon Comprehend with AWS Lambda

Summary

Chapter 14: Amazon Lex

Key Concepts

Creating an Amazon Lex Bot

Summary

Chapter 15: Amazon Machine Learning

Key Concepts

Creating Datasources

Viewing Data Insights

Creating an ML Model

Making Batch Predictions

Creating a Real-Time Prediction Endpoint for Your Machine Learning Model

Making Predictions Using the AWS CLI

Using Real-Time Prediction Endpoints with Your Applications

Summary

Chapter 16: Amazon SageMaker

Key Concepts

Creating an Amazon SageMaker Notebook Instance

Preparing Test and Training Data

Training a Scikit-Learn Model on an Amazon SageMaker Notebook Instance

Training a Scikit-Learn Model on a Dedicated Training Instance

Training a Model Using a Built-in Algorithm on a Dedicated Training Instance

Summary

Chapter 17: Using Google TensorFlow with Amazon SageMaker

Introduction to Google TensorFlow

Creating a Linear Regression Model with Google TensorFlow

Training and Deploying a DNN Classifier Using the TensorFlow Estimators API and Amazon SageMaker

Summary

Chapter 18: Amazon Rekognition

Key Concepts

Analyzing Images Using the Amazon Rekognition Management Console

Interactive Image Analysis with the AWS CLI

Using Amazon Rekognition with AWS Lambda

Summary

Appendix A: Anaconda and Jupyter Notebook Setup

Installing the Anaconda Distribution

Creating a Conda Python Environment

Installing Python Packages

Installing Jupyter Notebook

Summary

Appendix B: AWS Resources Needed to Use This Book

Creating an IAM User for Development

Creating S3 Buckets

Appendix C: Installing and Configuring the AWS CLI

Mac OS Users

Windows Users

Appendix D: Introduction to NumPy and Pandas

NumPy

Pandas

Index

End User License Agreement

List of Tables

Chapter 1

TABLE 1.1: Type and Range of Data across 100 Sample Applications

TABLE 1.2: Transforming Categorical Features into Numeric Features

TABLE 1.3: Modified Input Features

Chapter 7

TABLE 7.1: AWS Regions and Availability Zones

Chapter 9

TABLE 9.1: Amazon S3 System-Defined Metadata

Chapter 12

TABLE 12.1: Common Event Sources for AWS Lambda

Chapter 14

TABLE 14.1: ACMEBankAccount Table Items

TABLE 14.2: ACMEAccountTransaction Table Items

TABLE 14.3: ViewTransactionList Intent Slots

Chapter 15

TABLE 15.1: The First Five Rows of the Titanic Dataset

TABLE 15.2: The First Ten Rows of the Batch Prediction Result

Chapter 18

TABLE 18.1: Aggregate Metric Graphs

Appendix C

TABLE C.1: AWS Region Names

TABLE C.2: AWS Region Names

Appendix D

TABLE D.1: Commonly Used Ndarray Attributes

List of Illustrations

Chapter 1

FIGURE 1.1 Supervised learning

FIGURE 1.2 Clustering technique used to find patterns in the data

FIGURE 1.3 Semi-supervised learning

FIGURE 1.4 Architecture of a rule-based decision system

FIGURE 1.5 A flowchart depicting the decision-making process for a rule-base...

FIGURE 1.6 Cross-validation using multiple folds

FIGURE 1.7 The sigmoid function

FIGURE 1.8 Using the sigmoid function for binary classification

Chapter 2

FIGURE 2.1 The head() function displays rows from the beginning of a Pandas ...

FIGURE 2.2 The head() function displays truncated data for large dataframes.

FIGURE 2.3 Impact of the set_index function on a dataframe

FIGURE 2.4 Distribution of values for the Survived attribute

FIGURE 2.5 Histogram of numeric features

FIGURE 2.6 Histogram of numeric feature “Age” using different bin widths (2,...

FIGURE 2.7 Histogram of categorical feature “Embarked”

FIGURE 2.8 Box plot of numeric features

FIGURE 2.9 Linear correlation between numeric columns

FIGURE 2.10 Matrix of scatter plots between pairs of numeric attributes

FIGURE 2.11 Box plot of the Age feature variable

FIGURE 2.12 Dataframe with engineered feature AgeCategory

FIGURE 2.13 Dataframe with engineered feature FareCategory

FIGURE 2.14 Histogram of Age, NormalizedAge, and StandardizedAge

Chapter 3

FIGURE 3.1 Plotting two curves using Matplotlib

FIGURE 3.2 Components of a Matplotlib plot

FIGURE 3.3 A figure object with four axes objects

FIGURE 3.4 Comparison of plots with and without grids

FIGURE 3.5 Histogram of Passenger Age values

FIGURE 3.6 Histograms of Passenger Age values created using different binnin...

FIGURE 3.7 Bar chart of the Embarked attribute

FIGURE 3.8 Grouped bar chart of the Embarked attribute

FIGURE 3.9 Stacked bar chart of the Embarked attribute

FIGURE 3.10 Stacked percentage bar chart of the Embarked attribute

FIGURE 3.11 Pie chart of proportion of passengers embarking from different p...

FIGURE 3.12 Pie charts showing the proportion of survivors from each embarka...

FIGURE 3.13 Box plot showing the distribution of the Age attribute

FIGURE 3.14 Box plots of the Age attribute comparing the distribution of sur...

FIGURE 3.15 Scatter plot of the Age attribute against the Fare attribute

FIGURE 3.16 Scatter plots depicting the ideal strong positive and strong neg...

FIGURE 3.17 Scatter plot matrix of the features of the Iris dataset

FIGURE 3.18 Scatter plot of four clusters of data

Chapter 4

FIGURE 4.1 Scikit-learn's train_test_split() method automatically shuffles t...

FIGURE 4.2 Comparison of the distribution of target variables in the origina...

FIGURE 4.3 Comparison of the distribution of target variables in the origina...

FIGURE 4.4 Cross-validation using k-folds

FIGURE 4.5 Scatter plot of expected vs. predicted house prices

FIGURE 4.6 Scatter plot of synthetic dataset along with regression lines

FIGURE 4.7 Three potential decision boundaries

FIGURE 4.8 Data that cannot be classified using a linear decision boundary i...

FIGURE 4.9 Data that cannot be classified using a linear decision boundary i...

FIGURE 4.10 Nonlinear decision boundary in two-dimensional space

FIGURE 4.11 Effect of kernel choice on decision boundaries

FIGURE 4.12 Linear regression vs. support vector regression

FIGURE 4.13 SVR predictions on Boston housing dataset

FIGURE 4.14 The sigmoid function

FIGURE 4.15 Using the sigmoid function for binary classification

FIGURE 4.16 Softmax logistic regression

FIGURE 4.17 Decision tree visualization

FIGURE 4.18 Decision tree for regression

Chapter 5

FIGURE 5.1 Comparison of predictive accuracies of a linear regression model ...

FIGURE 5.2 Mean squared error and root mean squared error

FIGURE 5.3 A class-wise confusion matrix

FIGURE 5.4 ROC curves for three binary classification models

FIGURE 5.5 Multi-class confusion matrix for a five-class dataset

FIGURE 5.6 Multi-class confusion matrix for two models trained on the Iris f...

Chapter 6

FIGURE 6.1 Common cloud service models

FIGURE 6.2 Brief timeline of Amazon Web Services

FIGURE 6.3 Amazon Web Services home page

FIGURE 6.4 AWS sign-in screen

FIGURE 6.5 Contact Information screen

FIGURE 6.6 Payment Information screen

FIGURE 6.7 Phone Verification screen

FIGURE 6.8 Phone verification PIN

FIGURE 6.9 Completing the identity verification process

FIGURE 6.10 Support plan selection

FIGURE 6.11 Completing the sign-up process

Chapter 7

FIGURE 7.1 Multiple Availability Zones in a single region

FIGURE 7.2 Geographically distant users accessing a video file from Tokyo

FIGURE 7.3 Edge locations can be used to cache frequently used content

FIGURE 7.4 AWS home page

FIGURE 7.5 AWS management console home page

FIGURE 7.6 AWS management console menu bar

FIGURE 7.7 Accessing the Services menu in the AWS management console

FIGURE 7.8 Resource Groups menu

FIGURE 7.9 Creating a resource group

FIGURE 7.10 Tagged resources are visible in the Resource Groups menu.

FIGURE 7.11 Resources in the CustomerAPI-Infrastructure resource group

FIGURE 7.12 Account menu

FIGURE 7.13 Regions menu

Chapter 8

FIGURE 8.1 IAM users exist under the root AWS account.

FIGURE 8.2 Obtaining temporary credentials

FIGURE 8.3 IAM groups contain users and permissions.

FIGURE 8.4 Root account login screen

FIGURE 8.5 IAM user-specific login screen

FIGURE 8.6 AWS management console region selector

FIGURE 8.7 Accessing the IAM management console

FIGURE 8.8 User-specific IAM sign-in link

FIGURE 8.9 IAM resource dashboard

FIGURE 8.10 Creating an IAM user

FIGURE 8.11 User details screen

FIGURE 8.12 Configuring user permissions

FIGURE 8.13 Creating a new group

FIGURE 8.14 The new group appears alongside existing groups.

FIGURE 8.15 The EC2FullAccess policy loaded in the policy editor

FIGURE 8.16 Review user settings screen

FIGURE 8.17 User confirmation screen

FIGURE 8.18 List of groups

FIGURE 8.19 Group permissions summary

FIGURE 8.20 Creating a new role using the IAM console

FIGURE 8.21 Creating a service role for EC2 instances

FIGURE 8.22 Attaching a policy to a role

FIGURE 8.23 You can associate up to 50 optional tags with a role.

FIGURE 8.24 Review new role screen

FIGURE 8.25 Accessing MFA settings

FIGURE 8.26 Configure security credentials warning

FIGURE 8.27 The Activate MFA button is enabled.

FIGURE 8.28 Choosing the MFA device type

FIGURE 8.29 Configuring a step-up authenticator

FIGURE 8.30 IAM password policy settings

Chapter 9

FIGURE 9.1 Accessing the Amazon S3 management console

FIGURE 9.2 Amazon S3 management console welcome page

FIGURE 9.3 List of Amazon S3 buckets

FIGURE 9.4 Specifying the bucket name and region

FIGURE 9.5 Configuring versioning, logging, and cost allocation tags

FIGURE 9.6 Configuring bucket permissions

FIGURE 9.7 Bucket summary page

FIGURE 9.8 List of Amazon S3 buckets in your account

FIGURE 9.9 Contents of an Amazon S3 bucket

FIGURE 9.10 Selecting files in the File Upload dialog box

FIGURE 9.11 Configuring object permissions

FIGURE 9.12 Configuring file storage class and encryption

FIGURE 9.13 File summary page

FIGURE 9.14 Amazon S3 bucket showing a file

FIGURE 9.15 Downloading a file from a bucket

FIGURE 9.16 Locating the Amazon S3 Object URL

FIGURE 9.17 Non-public buckets and files are not accessible using a URL.

FIGURE 9.18 Accessing Amazon S3 bucket permissions

FIGURE 9.19 Configuring Amazon S3 bucket permissions

FIGURE 9.20 Accessing the Make Public option

FIGURE 9.21 Making a file publicly accessible

FIGURE 9.22 Changing the storage class of an object

FIGURE 9.23 Deleting an object from an Amazon S3 bucket

FIGURE 9.24 Enabling bucket versioning

FIGURE 9.25 Making an object publicly accessible while uploading it

FIGURE 9.26 Accessing document versions

FIGURE 9.27 Version selector switch

Chapter 10

FIGURE 10.1 Accessing the S3 management console

FIGURE 10.2 Amazon Cognito splash screen

FIGURE 10.3 Creating a new user pool

FIGURE 10.4 Specifying the name of the new user pool

FIGURE 10.5 User pool attributes

FIGURE 10.6 Adding a custom attribute to a user pool

FIGURE 10.7 Setting up user pool policies

FIGURE 10.8 Multifactor authentication settings for the user pool

FIGURE 10.9 Customizing email and SMS verification messages

FIGURE 10.10 Cost allocation tag setup screen

FIGURE 10.11 You can set up a user pool to remember devices

FIGURE 10.12 Configuring applications that can use the user pool to authenti...

FIGURE 10.13 Create Application screen

FIGURE 10.14 List of client applications in the user pool

FIGURE 10.15 Use triggers to call AWS Lambda functions at specific points in...

FIGURE 10.16 User pool Review screen

FIGURE 10.17 Click the Show Details button to reveal the app client ID and t...

FIGURE 10.18 Amazon Cognito splash screen

FIGURE 10.19 Creating a new identity pool

FIGURE 10.20 List of existing identity pools

FIGURE 10.21 Specifying the Amazon Cognito user pool ID and app client ID

FIGURE 10.22 Cognito, by default, creates new roles for authenticated and un...

FIGURE 10.23 Accessing the credentials needed to access AWS services

Chapter 11

FIGURE 11.1 Accessing the Amazon DynamoDB service home page

FIGURE 11.2 Amazon DynamoDB splash screen

FIGURE 11.3 Amazon DynamoDB dashboard

FIGURE 11.4 Specifying a table name

FIGURE 11.5 Specifying a composite key for a table

FIGURE 11.6 Changing the provisioned I/O capacity

FIGURE 11.7 Amazon DynamoDB table overview

FIGURE 11.8 Creating a new item in the customer table

FIGURE 11.9 Item attributes dialog showing default primary key attribute

FIGURE 11.10 Adding item attributes

FIGURE 11.11 Specifying multiple attributes

FIGURE 11.12 Viewing item attributes as JSON

FIGURE 11.13 Amazon DynamoDB table with one item

FIGURE 11.14 Each item in an Amazon DynamoDB table can have different attrib...

FIGURE 11.15 Creating an index

FIGURE 11.16 Index properties dialog

FIGURE 11.17 Amazon DynamoDB table index list

FIGURE 11.18 Mandatory fields for new items

FIGURE 11.19 Multiple items in an Amazon DynamoDB table

FIGURE 11.20 List of items returned as a result of a scan operation

FIGURE 11.21 Adding a filter expression to a scan

FIGURE 11.22 Indexes can be used while performing a scan.

FIGURE 11.23 Switching from Scan mode to Query mode

FIGURE 11.24 Querying a DynamoDB table based on the partition key

Chapter 12

FIGURE 12.1 AWS Lambda service home page

FIGURE 12.2 AWS Lambda splash screen

FIGURE 12.3 AWS Lambda dashboard

FIGURE 12.4 List of existing AWS Lambda functions

FIGURE 12.5 AWS Lambda Create Function screen

FIGURE 12.6 Lambda function Name and Runtime settings

FIGURE 12.7 Inspecting the permissions policy document associated with the I...

FIGURE 12.8 Lambda function configuration page

FIGURE 12.9 List of AWS Lambda functions

FIGURE 12.10 Updating the code for the AWS Lambda function

FIGURE 12.11 List of AWS Lambda functions

FIGURE 12.12 Configuring a test event

FIGURE 12.13 Configuring a test event

FIGURE 12.14 AWS Lambda function execution results

FIGURE 12.15 Accessing AWS Lambda function execution statistics and logs

FIGURE 12.16 Accessing the Delete function menu item

FIGURE 12.17 Accessing the Amazon CloudWatch dashboard

FIGURE 12.18 List of Amazon CloudWatch log groups

FIGURE 12.19 Accessing the Delete Log Group menu item

Chapter 13

FIGURE 13.1 Accessing the Amazon Comprehend service home page

FIGURE 13.2 Testing the capabilities of Amazon Comprehend

FIGURE 13.3 Analyzing text with Amazon Comprehend

FIGURE 13.4 Amazon Comprehend presents analysis results as insights.

FIGURE 13.5 AWS Lambda splash screen

FIGURE 13.6 AWS Lambda dashboard

FIGURE 13.7 Creating an AWS Lambda function from scratch

FIGURE 13.8 Lambda Function Name and Runtime settings

FIGURE 13.9 Viewing the default policy document associated with the IAM role...

FIGURE 13.10 Updating the default policy document associated with the IAM ro...

FIGURE 13.11 Review Policy screen

FIGURE 13.12 AWS Lambda function designer

FIGURE 13.13 Adding the Amazon S3 trigger to the AWS Lambda function

FIGURE 13.14 Configuring the Amazon S3 event trigger

FIGURE 13.15 Accessing the function code editor

Chapter 14

FIGURE 14.1 Accessing the Amazon DynamoDB service home page

FIGURE 14.2 Amazon DynamoDB splash screen

FIGURE 14.3 Amazon DynamoDB dashboard

FIGURE 14.4 Specifying the table name, partition key, and sort key

FIGURE 14.5 Changing the provisioned I/O capacity

FIGURE 14.6 Amazon DynamoDB table overview

FIGURE 14.7 Settings for the ACMEAccountTransaction table

FIGURE 14.8 Amazon DynamoDB table overview

FIGURE 14.9 Creating a new item in the ACMEBankCustomer table

FIGURE 14.10 ACMEBankCustomer table with two items

FIGURE 14.11 ACMEBankAccount table with five items

FIGURE 14.12 Creating an AWS Lambda function from scratch

FIGURE 14.13 Lambda function name and runtime settings

FIGURE 14.14 Viewing the default policy document associated with the IAM rol...

FIGURE 14.15 Updating the default policy document associated with the IAM ro...

FIGURE 14.16 Review Policy screen

FIGURE 14.17 AWS Lambda function designer

FIGURE 14.18 Accessing the Amazon Lex service home page

FIGURE 14.19 Amazon Lex service splash screen

FIGURE 14.20 Amazon Lex dashboard

FIGURE 14.21 Creating a custom bot

FIGURE 14.22 Amazon Lex bot editor

FIGURE 14.23 Configuring the slots for your new intent.

FIGURE 14.24 Specifying the name of the new intent

FIGURE 14.25 Amazon Lex bot editor with two intents

FIGURE 14.26 The Sample Utterances section of the bot editor

FIGURE 14.27 Utterances associated with the AccountOverview intent

FIGURE 14.28 Specifying the validation function for the AccountOverview inte...

FIGURE 14.29 Slots for the AccountOverview intent

FIGURE 14.30 CustomerIdentifier slot settings

FIGURE 14.31 Specifying the Fulfillment function for the AccountOverview int...

FIGURE 14.32 Specifying the validation function for the ViewTransactionList ...

FIGURE 14.33 Specifying the fulfillment function for the ViewTransactionList...

FIGURE 14.34 Building the bot

FIGURE 14.35 Testing the bot with the integrated chat client

Chapter 15

FIGURE 15.1 Uploading the Titanic dataset to an Amazon S3 bucket

FIGURE 15.2 Accessing the Amazon Machine Learning service home page

FIGURE 15.3 The Amazon Machine Learning service home page

FIGURE 15.4 Accessing the Amazon Machine Learning dashboard

FIGURE 15.5 Accessing the Create Datasource option from the Amazon Machine L...

FIGURE 15.6 Specifying the location of the input file

FIGURE 15.7 Granting Amazon Machine Learning access to your Amazon S3 bucket

FIGURE 15.8 Modifying the default schema generated by Amazon Machine Learnin...

FIGURE 15.9 Specifying the target attribute

FIGURE 15.10 Specifying a row identifier attribute

FIGURE 15.11 Datasource Review screen

FIGURE 15.12 Filtering the items displayed in the Amazon Machine Learning da...

FIGURE 15.13 Specifying the location of the data for the new datasource

FIGURE 15.14 Setting up the schema for the new datasource

FIGURE 15.15 The new datasource does not have a target attribute.

FIGURE 15.16 Specifying a row identifier attribute

FIGURE 15.17 Selecting the datasource from the dashboard

FIGURE 15.18 Histogram of the target attribute

FIGURE 15.19 Summary statistics for categorical values

FIGURE 15.20 Distribution of values of the Embarked attribute

FIGURE 15.21 Rows that do not have a value for the Embarked attribute

FIGURE 15.22 Distribution of the Cabin attribute

FIGURE 15.23 Summary statistics for numeric attributes

FIGURE 15.24 Distribution of values for the Age attribute

FIGURE 15.25 Creating an ML model

FIGURE 15.26 Selecting a datasource

FIGURE 15.27 Specifying ML model settings

FIGURE 15.28 Amazon Machine Learning dashboard showing new data sources, the...

FIGURE 15.29 ML model summary

FIGURE 15.30 ML model evaluation

FIGURE 15.31 Advanced ML model statistics

FIGURE 15.32 A score of 0.37 results in a model accuracy of 0.8507 (85.07%).

FIGURE 15.33 Accessing the option to create a new batch prediction from the ...

FIGURE 15.34 Selecting an ML model for batch predictions

FIGURE 15.35 Selecting a datasource for batch predictions

FIGURE 15.36 Specifying an Amazon S3 bucket where the results of the batch p...

FIGURE 15.37 Batch Prediction Review screen

FIGURE 15.38 Amazon Machine Learning dashboard showing a completed batch pre...

FIGURE 15.39 Amazon S3 Bucket with the results of the batch prediction

FIGURE 15.40 Creating a real-time prediction endpoint for an Amazon Machine ...

FIGURE 15.41 Costs of maintaining a real-time prediction endpoint

FIGURE 15.42 Accessing the real-time prediction endpoint

Chapter 16

FIGURE 16.1 Enabling region-specific Amazon STS endpoints

FIGURE 16.2 Accessing the Amazon SageMaker management console

FIGURE 16.3 Navigating to the list of notebook instances

FIGURE 16.4 Specifying the name of the new Amazon SageMaker notebook instanc...

FIGURE 16.5 Creating a new IAM role for the Amazon SageMaker notebook instan...

FIGURE 16.6 Specifying the permissions policy for the new IAM role for Amazo...

FIGURE 16.7 New IAM role for Amazon SageMaker

FIGURE 16.8 Amazon SageMaker management console showing the new notebook ins...

FIGURE 16.9 Amazon SageMaker notebook instance management

FIGURE 16.10 Accessing the Amazon S3 bucket that will contain the training a...

FIGURE 16.11 Uploading the pre-split training and test data files to the Ama...

FIGURE 16.12 Creating a new Jupyter Notebook on an Amazon SageMaker notebook...

FIGURE 16.13 Changing the title of a Jupyter Notebook file

FIGURE 16.14 Uploading a file to a notebook instance

FIGURE 16.15 Using a notebook instance to create a training job

FIGURE 16.16 List of trained models

FIGURE 16.17 Training a model based on a built-in algorithm using an AWS Sag...

Chapter 17

FIGURE 17.1 Structure of an artificial neural network (ANN)

FIGURE 17.2 A simple neural network

FIGURE 17.3 TensorFlow API architecture

FIGURE 17.4 Accessing the Amazon S3 bucket that will contain the training an...

FIGURE 17.5 Uploading the pre-split training and test data files to the Amaz...

FIGURE 17.6 Amazon SageMaker management console showing the new notebook ins...

FIGURE 17.7 Inspecting the first five rows of the Boston housing dataset

FIGURE 17.8 Mean squared error metric

FIGURE 17.9 Computation graph with two placeholder nodes

FIGURE 17.10 Computation graph with two variable nodes

FIGURE 17.11 Computation graph after the multiplication of w1 and x1 nodes

FIGURE 17.12 Computation graph with y_predicted

FIGURE 17.13 Computation graph that contains nodes to compute the MSE cost f...

FIGURE 17.14 Computation graph that contains the operation to optimize the c...

FIGURE 17.15 Uploading a file to a notebook instance

FIGURE 17.16 Architecture of neural-network–based classification model

FIGURE 17.17 Using a notebook instance to create a training job

FIGURE 17.18 List of trained models

Chapter 18

FIGURE 18.1 Accessing the Amazon Rekognition service home page

FIGURE 18.2 Accessing the Object and Scene Detection demo

FIGURE 18.3 Object labels detected in a sample scene

FIGURE 18.4 Amazon Rekognition aggregate metric graphs

FIGURE 18.5 Accessing the Amazon DynamoDB management console

FIGURE 18.6 Amazon DynamoDB table name and primary key attributes

FIGURE 18.7 Amazon DynamoDB Table Read/Write Capacity Mode section

FIGURE 18.8 Amazon DynamoDB management console displaying a list of tables

FIGURE 18.9 Creating an AWS Lambda function from scratch

FIGURE 18.10 Lambda Function Name and Runtime settings

FIGURE 18.11 Viewing the default policy document associated with the IAM rol...

FIGURE 18.12 Updating the default policy document associated with the IAM ro...

FIGURE 18.13 Review Policy screen

FIGURE 18.14 AWS Lambda function designer

FIGURE 18.15 Configuring the S3 event trigger

FIGURE 18.16 Configuring the AWS Lambda function code

FIGURE 18.17 Examining the results of the AWS Lambda function

FIGURE 18.18 Querying the Amazon DynamoDB table will allow you to search for...

Machine Learning in the AWS Cloud

Add Intelligence to Applications with Amazon SageMaker and Amazon Rekognition

Abhishek Mishra

Copyright © 2019 by John Wiley & Sons, Inc., Indianapolis, Indiana

Published simultaneously in Canada

ISBN: 978-1-119-55671-8

ISBN: 978-1-119-55673-2 (ebk.)

ISBN: 978-1-119-55672-5 (ebk.)

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Web site may provide or recommendations it may make. Further, readers should be aware that Internet Web sites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services or to obtain technical support, please contact our Customer Care Department within the U.S. at (877) 762-2974, outside the U.S. at (317) 572-3993 or fax (317) 572-4002.

Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.

Library of Congress Control Number: 2019940774

TRADEMARKS: Wiley, the Wiley logo, and the Sybex logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. Amazon SageMaker and Amazon Rekognition are registered trademarks of Amazon Technologies, Inc. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

To my wife Sonam, for her love and support through all the years we've been together.

To my daughter Elana, for bringing joy and happiness into our lives.

—Abhishek

Acknowledgments

This book would not have been possible without the support of the team at Wiley, including Jim Minatel, Kenyon Brown, David Clark, Kim Cofer, and Pete Gaughan. I would also like to thank Chaim Krause for his keen eye for detail. It has been my privilege to work with all of you. Thank you.

About the Author

Abhishek Mishra has been active in the IT industry for over 19 years and has extensive experience with a wide range of programming languages, enterprise systems, service architectures, and platforms.

He holds a master's degree in computer science from the University of London and currently provides consultancy services to Lloyds Banking Group in London as a security and fraud solution architect. He is the author of several books, including Amazon Web Services for Mobile Developers.

About the Technical Editor

Chaim Krause is a lover of computers, electronics, animals, and electronic music. He's tickled pink when he can combine two or more in some project. He has come by the vast majority of his knowledge through independent learning. He jokes with everyone that the only difference between what he does at home and what he does at work is the logon he uses. As a lifelong learner he is often frustrated with technical errors in documentation that waste valuable time and cause unnecessary frustration. One of the reasons he works as the technical editor on books is to help others avoid those same pitfalls.

Introduction

Amazon Web Services (AWS) is one of the leading cloud-computing platforms in the industry today. At the time this book was written, AWS offered more than 100 services, each of which resided in one of 18 different service categories. For someone who is new to cloud computing or to the AWS ecosystem, the sheer number of services on offer can be daunting. It can be difficult to know where to begin and what services to focus on.

Developers who are new to machine learning as well as experienced data scientists are often not aware of the power of the public cloud and AWS's offerings in the machine learning space in particular. In the past, cloud-based machine learning offerings have been limited in the types of algorithms they could support and the level of customization that was possible. All of this changed when Amazon announced SageMaker—a service that provided the ability to build machine learning models based on Amazon's implementation of cutting-edge algorithms, as well as the option to build custom models with frameworks such as Scikit-learn and Google TensorFlow.

Real-world use cases of cloud-based machine learning models do not involve using the model in isolation, but instead rely on a number of supporting systems such as databases, load balancers, API gateways, and identity providers, all of which are provided by AWS. This book is written to give seasoned machine learning experts and enthusiasts alike an introduction to a selection of AWS machine learning services that are based on pre-trained models, as well as step-by-step examples of how to train and deploy your own custom models on Amazon SageMaker. For enthusiasts who are new to machine learning, this book also provides a selection of chapters that cover the fundamentals of machine learning such as data preprocessing, visualization, feature engineering, and the use of common Python libraries such as NumPy, Pandas, and Scikit-learn.

This book attempts throughout to balance theory and practice, giving you enough visibility into the underlying concepts while providing best practices and practical advice that you can apply at your workplace right away. I have also made every attempt to keep the content up-to-date and relevant. Even though this makes the book susceptible to being outdated in a few rare instances, I am confident the content will remain useful and relevant through the next versions of the AWS services.

Who This Book Is For

This book is best suited for software developers who wish to learn about machine learning in general and how to leverage machine learning–specific offerings from AWS. The book is also useful to data scientists, system architects, and application architects, who want to get an introduction to some of the commonly used AWS services in the machine learning space.

If you are new to both machine learning and AWS, I advise that you read all chapters from start to finish. If you are an experienced data scientist, you may want to skip ahead to Part 2 to learn about machine learning–specific AWS services.

What This Book Covers

This book covers building and training machine learning models with Python on the AWS cloud, as well as a number of ready-to-use machine learning services such as Amazon Rekognition, Amazon Comprehend, and Amazon Lex.

The book also covers general high-level concepts of machine learning, including feature engineering and data visualization, as well as supporting AWS services that are used to build machine learning systems, such as Amazon IAM, Amazon Cognito, Amazon S3, Amazon DynamoDB, and AWS Lambda.

The model-building and evaluation code in this book is written in Python 3. Services provided by Amazon, Apple, and Google are updated frequently, so you may sometimes encounter a newer version of a screen when you follow the instructions in a chapter.

How This Book Is Structured

This book consists of 18 chapters that are grouped into two parts, and four appendices. The first part consists of five chapters and covers the fundamentals of machine learning using Python. This part covers techniques for feature engineering, data visualization, model building, and model evaluation using Pandas, NumPy, Matplotlib, and Scikit-learn. The examples developed in this part make use of Jupyter Notebook and are aimed at readers who are new to machine learning.

Part 2 covers building machine learning applications using AWS services. This part starts with introducing the basics of commonly used AWS services such as Amazon S3, Amazon DynamoDB, and AWS Lambda. It then proceeds to AWS services that deal specifically with machine learning such as Amazon Comprehend, Amazon Lex, Amazon Machine Learning, and Amazon SageMaker. Two chapters are dedicated to Amazon SageMaker; the first one covers building and deploying models using built-in algorithms and Scikit-learn, and the second one covers building and deploying a model with Google TensorFlow. Not all chapters in this part include source code, but where applicable, you can download the source code that accompanies each chapter using a GitHub link. Some of the chapters in this part require you to upload files to Amazon S3; you will need to substitute the names of buckets in the examples with those from your own account.

The chapters in Part 1 include:

Introduction to Machine Learning (Chapter 1): This is an introduction to the types of machine learning systems, their applications, and tools used to build machine learning systems.

Data Collection and Preprocessing (Chapter 2): This chapter covers sources that can be used to obtain training data, techniques to explore datasets, and basic feature engineering.

Data Visualization with Python (Chapter 3): This chapter covers techniques to visualize datasets using Matplotlib.

Creating Machine Learning Models with Scikit-learn (Chapter 4): This chapter covers techniques to build and train classification and regression models using Scikit-learn.

Evaluating Machine Learning Models (Chapter 5): This chapter covers techniques to evaluate the quality of a machine learning model.

The chapters in Part 2 include:

Introduction to Amazon Web Services (Chapter 6): This chapter is a brief primer on cloud computing and Amazon Web Services. It also covers commonly encountered service and deployment models.

AWS Global Infrastructure (Chapter 7): This chapter introduces AWS regions, availability zones, and edge locations.

Identity and Access Management (Chapter 8): This chapter introduces one of the key services provided by AWS to secure your resources in the Amazon cloud. It also provides instructions to sign up for an account under the AWS free tier.

Amazon S3 (Chapter 9): This chapter introduces one of the most commonly used storage services provided by AWS, Amazon Simple Storage Service (S3).

Amazon Cognito (Chapter 10): This chapter introduces Amazon's cloud-based OAuth 2.0-compliant identity management solution, Amazon Cognito.

Amazon DynamoDB (Chapter 11): This chapter introduces Amazon's managed NoSQL database service, Amazon DynamoDB.

AWS Lambda (Chapter 12): This chapter introduces AWS Lambda, a service designed to allow you to run code in the Amazon cloud without having to provision or manage any infrastructure.

Amazon Comprehend (Chapter 13): This chapter introduces Amazon Comprehend, a cloud-based natural language processing service that you can integrate into your applications to analyze the contents of text documents.

Amazon Lex (Chapter 14): This chapter introduces Amazon Lex, a cloud-based service that you can use to create chatbots and integrate them into your applications.

Amazon Machine Learning (Chapter 15): This chapter introduces Amazon Machine Learning, a fully managed cloud-based service that you can use to build and deploy simple machine learning models without any programming.

Amazon SageMaker (Chapter 16): This chapter introduces Amazon SageMaker, a cloud-based machine learning service that can be used to train and deploy both built-in and custom machine learning models.

Using Google TensorFlow with Amazon SageMaker (Chapter 17): This chapter introduces Google's TensorFlow framework and covers the use of Amazon SageMaker to build and deploy TensorFlow models.

Amazon Rekognition (Chapter 18): This chapter introduces Amazon Rekognition, a fully managed cloud-based service that can be used to add computer vision capabilities to your applications.

The appendices cover the following topics:

Anaconda and Jupyter Notebook Setup (Appendix A): This appendix provides instructions to install the Anaconda distribution and set up a Jupyter Notebook server on your local computer.

AWS Resources Needed to Use This Book (Appendix B): This appendix provides information on the AWS resources that you need to set up in your account in order to follow along with the examples in the book.

Installing and Configuring the AWS CLI (Appendix C): This appendix provides instructions to download and install the AWS CLI tool.

Introduction to NumPy and Pandas (Appendix D): This appendix provides an introduction to two Python libraries commonly used by data scientists: NumPy and Pandas.

What You Need to Use This Book

A suitable Mac or Windows computer for development

Basic knowledge of Python programming

An AWS account that you can administer

Conventions

To help you get the most from the text and keep track of what's happening, we've used a number of conventions throughout the book.

NOTE

  Notes, tips, hints, tricks, and asides to the current discussion are offset like this.

As for styles in the text:

We italicize new terms and important words when we introduce them.

We show keyboard strokes like this: Ctrl+A.

We show filenames, URLs, and code within the text like so: persistence.properties.

We present code in two different ways:

We use a monofont type with no highlighting for most code examples.

We use bold type to emphasize code that is of particular importance in the present context.

Source Code

As you work through the examples in this book, you may choose either to type in all the code manually or to use the source code files that accompany the book. All of the source code used in this book is available for download at www.wiley.com/go/machinelearningawscloud. Also, you can download the code files at GitHub.

Errata

We make every effort to ensure that there are no errors in the text or in the code. However, no one is perfect, and mistakes do occur. If you find an error in one of our books, like a spelling mistake or faulty piece of code, we would be very grateful for your feedback. By sending in errata you may save another reader hours of frustration and at the same time you will be helping us provide even higher quality information.

To report errata, email to [email protected] and include

The book's title and ISBN (Machine Learning in the AWS Cloud, 9781119556718)

The page number of the relevant content

A description of just what's wrong

Part 1: Fundamentals of Machine Learning

Chapter 1: Introduction to Machine Learning

Chapter 2: Data Collection and Preprocessing

Chapter 3: Data Visualization with Python

Chapter 4: Creating Machine Learning Models with Scikit-learn

Chapter 5: Evaluating Machine Learning Models