Put the power of AWS Cloud machine learning services to work in your business and commercial applications!

Machine Learning in the AWS Cloud introduces readers to the machine learning (ML) capabilities of the Amazon Web Services ecosystem and provides practical examples to solve real-world regression and classification problems. While readers do not need prior ML experience, they are expected to have some knowledge of Python and a basic knowledge of Amazon Web Services.

Part One introduces readers to fundamental machine learning concepts. You will learn about the types of ML systems, how they are used, and challenges you may face with ML solutions. Part Two focuses on machine learning services provided by Amazon Web Services. You'll be introduced to the basics of cloud computing and AWS offerings in the cloud-based machine learning space. Then you'll learn to use Amazon Machine Learning to solve a simpler class of machine learning problems, and Amazon SageMaker to solve more complex problems.

* Learn techniques for preprocessing data, basic feature engineering, visualizing data, and building models
* Discover common neural network frameworks with Amazon SageMaker
* Solve computer vision problems with Amazon Rekognition
* Benefit from illustrations, source code examples, and sidebars in each chapter

The book appeals to both Python developers and technical/solution architects. Developers will find concrete examples that show them how to perform common ML tasks with Python on AWS. Technical/solution architects will find useful information on the machine learning capabilities of the AWS ecosystem.
Page count: 652
Publication year: 2019
Cover
Acknowledgments
About the Author
About the Technical Editor
Introduction
Who This Book Is For
What This Book Covers
How This Book Is Structured
What You Need to Use This Book
Conventions
Source Code
Errata
Part 1: Fundamentals of Machine Learning
Chapter 1: Introduction to Machine Learning
What Is Machine Learning?
Types of Machine Learning Systems
The Traditional Versus the Machine Learning Approach
Summary
Chapter 2: Data Collection and Preprocessing
Machine Learning Datasets
Data Preprocessing Techniques
Summary
Chapter 3: Data Visualization with Python
Introducing Matplotlib
Components of a Plot
Common Plots
Summary
Chapter 4: Creating Machine Learning Models with Scikit-learn
Introducing Scikit-learn
Creating a Training and Test Dataset
Creating Machine Learning Models
Summary
Chapter 5: Evaluating Machine Learning Models
Evaluating Regression Models
Evaluating Classification Models
Choosing Hyperparameter Values
Summary
Part 2: Machine Learning with Amazon Web Services
Chapter 6: Introduction to Amazon Web Services
What Is Cloud Computing?
Cloud Service Models
Cloud Deployment Models
The AWS Ecosystem
Sign Up for an AWS Free-Tier Account
Summary
Note
Chapter 7: AWS Global Infrastructure
Regions and Availability Zones
Edge Locations
Accessing AWS
Summary
Chapter 8: Identity and Access Management
Key Concepts
Common Tasks
Summary
Chapter 9: Amazon S3
Key Concepts
Common Tasks
Summary
Chapter 10: Amazon Cognito
Key Concepts
Common Tasks
User Pools or Identity Pools: Which One Should You Use?
Summary
Chapter 11: Amazon DynamoDB
Key Concepts
Common Tasks
Summary
Chapter 12: AWS Lambda
Common Use Cases for Lambda
Key Concepts
Common Tasks
Summary
Chapter 13: Amazon Comprehend
Key Concepts
Text Analysis Using the Amazon Comprehend Management Console
Interactive Text Analysis with the AWS CLI
Using Amazon Comprehend with AWS Lambda
Summary
Chapter 14: Amazon Lex
Key Concepts
Creating an Amazon Lex Bot
Summary
Chapter 15: Amazon Machine Learning
Key Concepts
Creating Datasources
Viewing Data Insights
Creating an ML Model
Making Batch Predictions
Creating a Real-Time Prediction Endpoint for Your Machine Learning Model
Making Predictions Using the AWS CLI
Using Real-Time Prediction Endpoints with Your Applications
Summary
Chapter 16: Amazon SageMaker
Key Concepts
Creating an Amazon SageMaker Notebook Instance
Preparing Test and Training Data
Training a Scikit-Learn Model on an Amazon SageMaker Notebook Instance
Training a Scikit-Learn Model on a Dedicated Training Instance
Training a Model Using a Built-in Algorithm on a Dedicated Training Instance
Summary
Chapter 17: Using Google TensorFlow with Amazon SageMaker
Introduction to Google TensorFlow
Creating a Linear Regression Model with Google TensorFlow
Training and Deploying a DNN Classifier Using the TensorFlow Estimators API and Amazon SageMaker
Summary
Chapter 18: Amazon Rekognition
Key Concepts
Analyzing Images Using the Amazon Rekognition Management Console
Interactive Image Analysis with the AWS CLI
Using Amazon Rekognition with AWS Lambda
Summary
Appendix A: Anaconda and Jupyter Notebook Setup
Installing the Anaconda Distribution
Creating a Conda Python Environment
Installing Python Packages
Installing Jupyter Notebook
Summary
Appendix B: AWS Resources Needed to Use This Book
Creating an IAM User for Development
Creating S3 Buckets
Appendix C: Installing and Configuring the AWS CLI
Mac OS Users
Windows Users
Appendix D: Introduction to NumPy and Pandas
NumPy
Pandas
Index
End User License Agreement
Chapter 1
TABLE 1.1: Type and Range of Data across 100 Sample Applications
TABLE 1.2: Transforming Categorical Features into Numeric Features
TABLE 1.3: Modified Input Features
Chapter 7
TABLE 7.1: AWS Regions and Availability Zones
Chapter 9
TABLE 9.1: Amazon S3 System-Defined Metadata
Chapter 12
TABLE 12.1: Common Event Sources for AWS Lambda
Chapter 14
TABLE 14.1: ACMEBankAccount Table Items
TABLE 14.2: ACMEAccountTransaction Table Items
TABLE 14.3: ViewTransactionList Intent Slots
Chapter 15
TABLE 15.1: The First Five Rows of the Titanic Dataset
TABLE 15.2: The First Ten Rows of the Batch Prediction Result
Chapter 18
TABLE 18.1: Aggregate Metric Graphs
Appendix C
TABLE C.1: AWS Region Names
TABLE C.2: AWS Region Names
Appendix D
TABLE D.1: Commonly Used Ndarray Attributes
Chapter 1
FIGURE 1.1 Supervised learning
FIGURE 1.2 Clustering technique used to find patterns in the data
FIGURE 1.3 Semi-supervised learning
FIGURE 1.4 Architecture of a rule-based decision system
FIGURE 1.5 A flowchart depicting the decision-making process for a rule-base...
FIGURE 1.6 Cross-validation using multiple folds
FIGURE 1.7 The sigmoid function
FIGURE 1.8 Using the sigmoid function for binary classification
Chapter 2
FIGURE 2.1 The head() function displays rows from the beginning of a Pandas ...
FIGURE 2.2 The head() function displays truncated data for large dataframes.
FIGURE 2.3 Impact of the set_index function on a dataframe
FIGURE 2.4 Distribution of values for the Survived attribute
FIGURE 2.5 Histogram of numeric features
FIGURE 2.6 Histogram of numeric feature “Age” using different bin widths (2,...
FIGURE 2.7 Histogram of categorical feature “Embarked”
FIGURE 2.8 Box plot of numeric features
FIGURE 2.9 Linear correlation between numeric columns
FIGURE 2.10 Matrix of scatter plots between pairs of numeric attributes
FIGURE 2.11 Box plot of the Age feature variable
FIGURE 2.12 Dataframe with engineered feature AgeCategory
FIGURE 2.13 Dataframe with engineered feature FareCategory
FIGURE 2.14 Histogram of Age, NormalizedAge, and StandardizedAge
Chapter 3
FIGURE 3.1 Plotting two curves using Matplotlib
FIGURE 3.2 Components of a Matplotlib plot
FIGURE 3.3 A figure object with four axes objects
FIGURE 3.4 Comparison of plots with and without grids
FIGURE 3.5 Histogram of Passenger Age values
FIGURE 3.6 Histograms of Passenger Age values created using different binnin...
FIGURE 3.7 Bar chart of the Embarked attribute
FIGURE 3.8 Grouped bar chart of the Embarked attribute
FIGURE 3.9 Stacked bar chart of the Embarked attribute
FIGURE 3.10 Stacked percentage bar chart of the Embarked attribute
FIGURE 3.11 Pie chart of proportion of passengers embarking from different p...
FIGURE 3.12 Pie charts showing the proportion of survivors from each embarka...
FIGURE 3.13 Box plot showing the distribution of the Age attribute
FIGURE 3.14 Box plots of the Age attribute comparing the distribution of sur...
FIGURE 3.15 Scatter plot of the Age attribute against the Fare attribute
FIGURE 3.16 Scatter plots depicting the ideal strong positive and strong neg...
FIGURE 3.17 Scatter plot matrix of the features of the Iris dataset
FIGURE 3.18 Scatter plot of four clusters of data
Chapter 4
FIGURE 4.1 Scikit-learn's train_test_split() method automatically shuffles t...
FIGURE 4.2 Comparison of the distribution of target variables in the origina...
FIGURE 4.3 Comparison of the distribution of target variables in the origina...
FIGURE 4.4 Cross-validation using k-folds
FIGURE 4.5 Scatter plot of expected vs. predicted house prices
FIGURE 4.6 Scatter plot of synthetic dataset along with regression lines
FIGURE 4.7 Three potential decision boundaries
FIGURE 4.8 Data that cannot be classified using a linear decision boundary i...
FIGURE 4.9 Data that cannot be classified using a linear decision boundary i...
FIGURE 4.10 Nonlinear decision boundary in two-dimensional space
FIGURE 4.11 Effect of kernel choice on decision boundaries
FIGURE 4.12 Linear regression vs. support vector regression
FIGURE 4.13 SVR predictions on Boston housing dataset
FIGURE 4.14 The sigmoid function
FIGURE 4.15 Using the sigmoid function for binary classification
FIGURE 4.16 Softmax logistic regression
FIGURE 4.17 Decision tree visualization
FIGURE 4.18 Decision tree for regression
Chapter 5
FIGURE 5.1 Comparison of predictive accuracies of a linear regression model ...
FIGURE 5.2 Mean squared error and root mean squared error
FIGURE 5.3 A class-wise confusion matrix
FIGURE 5.4 ROC curves for three binary classification models
FIGURE 5.5 Multi-class confusion matrix for a five-class dataset
FIGURE 5.6 Multi-class confusion matrix for two models trained on the Iris f...
Chapter 6
FIGURE 6.1 Common cloud service models
FIGURE 6.2 Brief timeline of Amazon Web Services
FIGURE 6.3 Amazon Web Services home page
FIGURE 6.4 AWS sign-in screen
FIGURE 6.5 Contact Information screen
FIGURE 6.6 Payment Information screen
FIGURE 6.7 Phone Verification screen
FIGURE 6.8 Phone verification PIN
FIGURE 6.9 Completing the identity verification process
FIGURE 6.10 Support plan selection
FIGURE 6.11 Completing the sign-up process
Chapter 7
FIGURE 7.1 Multiple Availability Zones in a single region
FIGURE 7.2 Geographically distant users accessing a video file from Tokyo
FIGURE 7.3 Edge locations can be used to cache frequently used content
FIGURE 7.4 AWS home page
FIGURE 7.5 AWS management console home page
FIGURE 7.6 AWS management console menu bar
FIGURE 7.7 Accessing the Services menu in the AWS management console
FIGURE 7.8 Resource Groups menu
FIGURE 7.9 Creating a resource group
FIGURE 7.10 Tagged resources are visible in the Resource Groups menu.
FIGURE 7.11 Resources in the CustomerAPI-Infrastructure resource group
FIGURE 7.12 Account menu
FIGURE 7.13 Regions menu
Chapter 8
FIGURE 8.1 IAM users exist under the root AWS account.
FIGURE 8.2 Obtaining temporary credentials
FIGURE 8.3 IAM groups contain users and permissions.
FIGURE 8.4 Root account login screen
FIGURE 8.5 IAM user-specific login screen
FIGURE 8.6 AWS management console region selector
FIGURE 8.7 Accessing the IAM management console
FIGURE 8.8 User-specific IAM sign-in link
FIGURE 8.9 IAM resource dashboard
FIGURE 8.10 Creating an IAM user
FIGURE 8.11 User details screen
FIGURE 8.12 Configuring user permissions
FIGURE 8.13 Creating a new group
FIGURE 8.14 The new group appears alongside existing groups.
FIGURE 8.15 The EC2FullAccess policy loaded in the policy editor
FIGURE 8.16 Review user settings screen
FIGURE 8.17 User confirmation screen
FIGURE 8.18 List of groups
FIGURE 8.19 Group permissions summary
FIGURE 8.20 Creating a new role using the IAM console
FIGURE 8.21 Creating a service role for EC2 instances
FIGURE 8.22 Attaching a policy to a role
FIGURE 8.23 You can associate up to 50 optional tags with a role.
FIGURE 8.24 Review new role screen
FIGURE 8.25 Accessing MFA settings
FIGURE 8.26 Configure security credentials warning
FIGURE 8.27 The Activate MFA button is enabled.
FIGURE 8.28 Choosing the MFA device type
FIGURE 8.29 Configuring a step-up authenticator
FIGURE 8.30 IAM password policy settings
Chapter 9
FIGURE 9.1 Accessing the Amazon S3 management console
FIGURE 9.2 Amazon S3 management console welcome page
FIGURE 9.3 List of Amazon S3 buckets
FIGURE 9.4 Specifying the bucket name and region
FIGURE 9.5 Configuring versioning, logging, and cost allocation tags
FIGURE 9.6 Configuring bucket permissions
FIGURE 9.7 Bucket summary page
FIGURE 9.8 List of Amazon S3 buckets in your account
FIGURE 9.9 Contents of an Amazon S3 bucket
FIGURE 9.10 Selecting files in the File Upload dialog box
FIGURE 9.11 Configuring object permissions
FIGURE 9.12 Configuring file storage class and encryption
FIGURE 9.13 File summary page
FIGURE 9.14 Amazon S3 bucket showing a file
FIGURE 9.15 Downloading a file from a bucket
FIGURE 9.16 Locating the Amazon S3 Object URL
FIGURE 9.17 Non-public buckets and files are not accessible using a URL.
FIGURE 9.18 Accessing Amazon S3 bucket permissions
FIGURE 9.19 Configuring Amazon S3 bucket permissions
FIGURE 9.20 Accessing the Make Public option
FIGURE 9.21 Making a file publicly accessible
FIGURE 9.22 Changing the storage class of an object
FIGURE 9.23 Deleting an object from an Amazon S3 bucket
FIGURE 9.24 Enabling bucket versioning
FIGURE 9.25 Making an object publicly accessible while uploading it
FIGURE 9.26 Accessing document versions
FIGURE 9.27 Version selector switch
Chapter 10
FIGURE 10.1 Accessing the S3 management console
FIGURE 10.2 Amazon Cognito splash screen
FIGURE 10.3 Creating a new user pool
FIGURE 10.4 Specifying the name of the new user pool
FIGURE 10.5 User pool attributes
FIGURE 10.6 Adding a custom attribute to a user pool
FIGURE 10.7 Setting up user pool policies
FIGURE 10.8 Multifactor authentication settings for the user pool
FIGURE 10.9 Customizing email and SMS verification messages
FIGURE 10.10 Cost allocation tag setup screen
FIGURE 10.11 You can set up a user pool to remember devices
FIGURE 10.12 Configuring applications that can use the user pool to authenti...
FIGURE 10.13 Create Application screen
FIGURE 10.14 List of client applications in the user pool
FIGURE 10.15 Use triggers to call AWS Lambda functions at specific points in...
FIGURE 10.16 User pool Review screen
FIGURE 10.17 Click the Show Details button to reveal the app client ID and t...
FIGURE 10.18 Amazon Cognito splash screen
FIGURE 10.19 Creating a new identity pool
FIGURE 10.20 List of existing identity pools
FIGURE 10.21 Specifying the Amazon Cognito user pool ID and app client ID
FIGURE 10.22 Cognito, by default, creates new roles for authenticated and un...
FIGURE 10.23 Accessing the credentials needed to access AWS services
Chapter 11
FIGURE 11.1 Accessing the Amazon DynamoDB service home page
FIGURE 11.2 Amazon DynamoDB splash screen
FIGURE 11.3 Amazon DynamoDB dashboard
FIGURE 11.4 Specifying a table name
FIGURE 11.5 Specifying a composite key for a table
FIGURE 11.6 Changing the provisioned I/O capacity
FIGURE 11.7 Amazon DynamoDB table overview
FIGURE 11.8 Creating a new item in the customer table
FIGURE 11.9 Item attributes dialog showing default primary key attribute
FIGURE 11.10 Adding item attributes
FIGURE 11.11 Specifying multiple attributes
FIGURE 11.12 Viewing item attributes as JSON
FIGURE 11.13 Amazon DynamoDB table with one item
FIGURE 11.14 Each item in an Amazon DynamoDB table can have different attrib...
FIGURE 11.15 Creating an index
FIGURE 11.16 Index properties dialog
FIGURE 11.17 Amazon DynamoDB table index list
FIGURE 11.18 Mandatory fields for new items
FIGURE 11.19 Multiple items in an Amazon DynamoDB table
FIGURE 11.20 List of items returned as a result of a scan operation
FIGURE 11.21 Adding a filter expression to a scan
FIGURE 11.22 Indexes can be used while performing a scan.
FIGURE 11.23 Switching from Scan mode to Query mode
FIGURE 11.24 Querying a DynamoDB table based on the partition key
Chapter 12
FIGURE 12.1 AWS Lambda service home page
FIGURE 12.2 AWS Lambda splash screen
FIGURE 12.3 AWS Lambda dashboard
FIGURE 12.4 List of existing AWS Lambda functions
FIGURE 12.5 AWS Lambda Create Function screen
FIGURE 12.6 Lambda function Name and Runtime settings
FIGURE 12.7 Inspecting the permissions policy document associated with the I...
FIGURE 12.8 Lambda function configuration page
FIGURE 12.9 List of AWS Lambda functions
FIGURE 12.10 Updating the code for the AWS Lambda function
FIGURE 12.11 List of AWS Lambda functions
FIGURE 12.12 Configuring a test event
FIGURE 12.13 Configuring a test event
FIGURE 12.14 AWS Lambda function execution results
FIGURE 12.15 Accessing AWS Lambda function execution statistics and logs
FIGURE 12.16 Accessing the Delete function menu item
FIGURE 12.17 Accessing the Amazon CloudWatch dashboard
FIGURE 12.18 List of Amazon CloudWatch log groups
FIGURE 12.19 Accessing the Delete Log Group menu item
Chapter 13
FIGURE 13.1 Accessing the Amazon Comprehend service home page
FIGURE 13.2 Testing the capabilities of Amazon Comprehend
FIGURE 13.3 Analyzing text with Amazon Comprehend
FIGURE 13.4 Amazon Comprehend presents analysis results as insights.
FIGURE 13.5 AWS Lambda splash screen
FIGURE 13.6 AWS Lambda dashboard
FIGURE 13.7 Creating an AWS Lambda function from scratch
FIGURE 13.8 Lambda Function Name and Runtime settings
FIGURE 13.9 Viewing the default policy document associated with the IAM role...
FIGURE 13.10 Updating the default policy document associated with the IAM ro...
FIGURE 13.11 Review Policy screen
FIGURE 13.12 AWS Lambda function designer
FIGURE 13.13 Adding the Amazon S3 trigger to the AWS Lambda function
FIGURE 13.14 Configuring the Amazon S3 event trigger
FIGURE 13.15 Accessing the function code editor
Chapter 14
FIGURE 14.1 Accessing the Amazon DynamoDB service home page
FIGURE 14.2 Amazon DynamoDB splash screen
FIGURE 14.3 Amazon DynamoDB dashboard
FIGURE 14.4 Specifying the table name, partition key, and sort key
FIGURE 14.5 Changing the provisioned I/O capacity
FIGURE 14.6 Amazon DynamoDB table overview
FIGURE 14.7 Settings for the ACMEAccountTransaction table
FIGURE 14.8 Amazon DynamoDB table overview
FIGURE 14.9 Creating a new item in the ACMEBankCustomer table
FIGURE 14.10 ACMEBankCustomer table with two items
FIGURE 14.11 ACMEBankAccount table with five items
FIGURE 14.12 Creating an AWS Lambda function from scratch
FIGURE 14.13 Lambda function name and runtime settings
FIGURE 14.14 Viewing the default policy document associated with the IAM rol...
FIGURE 14.15 Updating the default policy document associated with the IAM ro...
FIGURE 14.16 Review Policy screen
FIGURE 14.17 AWS Lambda function designer
FIGURE 14.18 Accessing the Amazon Lex service home page
FIGURE 14.19 Amazon Lex service splash screen
FIGURE 14.20 Amazon Lex dashboard
FIGURE 14.21 Creating a custom bot
FIGURE 14.22 Amazon Lex bot editor
FIGURE 14.23 Configuring the slots for your new intent.
FIGURE 14.24 Specifying the name of the new intent
FIGURE 14.25 Amazon Lex bot editor with two intents
FIGURE 14.26 The Sample Utterances section of the bot editor
FIGURE 14.27 Utterances associated with the AccountOverview intent
FIGURE 14.28 Specifying the validation function for the AccountOverview inte...
FIGURE 14.29 Slots for the AccountOverview intent
FIGURE 14.30 CustomerIdentifier slot settings
FIGURE 14.31 Specifying the Fulfillment function for the AccountOverview int...
FIGURE 14.32 Specifying the validation function for the ViewTransactionList ...
FIGURE 14.33 Specifying the fulfillment function for the ViewTransactionList...
FIGURE 14.34 Building the bot
FIGURE 14.35 Testing the bot with the integrated chat client
Chapter 15
FIGURE 15.1 Uploading the Titanic dataset to an Amazon S3 bucket
FIGURE 15.2 Accessing the Amazon Machine Learning service home page
FIGURE 15.3 The Amazon Machine Learning service home page
FIGURE 15.4 Accessing the Amazon Machine Learning dashboard
FIGURE 15.5 Accessing the Create Datasource option from the Amazon Machine L...
FIGURE 15.6 Specifying the location of the input file
FIGURE 15.7 Granting Amazon Machine Learning access to your Amazon S3 bucket
FIGURE 15.8 Modifying the default schema generated by Amazon Machine Learnin...
FIGURE 15.9 Specifying the target attribute
FIGURE 15.10 Specifying a row identifier attribute
FIGURE 15.11 Datasource Review screen
FIGURE 15.12 Filtering the items displayed in the Amazon Machine Learning da...
FIGURE 15.13 Specifying the location of the data for the new datasource
FIGURE 15.14 Setting up the schema for the new datasource
FIGURE 15.15 The new datasource does not have a target attribute.
FIGURE 15.16 Specifying a row identifier attribute
FIGURE 15.17 Selecting the datasource from the dashboard
FIGURE 15.18 Histogram of the target attribute
FIGURE 15.19 Summary statistics for categorical values
FIGURE 15.20 Distribution of values of the Embarked attribute
FIGURE 15.21 Rows that do not have a value for the Embarked attribute
FIGURE 15.22 Distribution of the Cabin attribute
FIGURE 15.23 Summary statistics for numeric attributes
FIGURE 15.24 Distribution of values for the Age attribute
FIGURE 15.25 Creating an ML model
FIGURE 15.26 Selecting a datasource
FIGURE 15.27 Specifying ML model settings
FIGURE 15.28 Amazon Machine Learning dashboard showing new data sources, the...
FIGURE 15.29 ML model summary
FIGURE 15.30 ML model evaluation
FIGURE 15.31 Advanced ML model statistics
FIGURE 15.32 A score of 0.37 results in a model accuracy of 0.8507 (85.07%).
FIGURE 15.33 Accessing the option to create a new batch prediction from the ...
FIGURE 15.34 Selecting an ML model for batch predictions
FIGURE 15.35 Selecting a datasource for batch predictions
FIGURE 15.36 Specifying an Amazon S3 bucket where the results of the batch p...
FIGURE 15.37 Batch Prediction Review screen
FIGURE 15.38 Amazon Machine Learning dashboard showing a completed batch pre...
FIGURE 15.39 Amazon S3 Bucket with the results of the batch prediction
FIGURE 15.40 Creating a real-time prediction endpoint for an Amazon Machine ...
FIGURE 15.41 Costs of maintaining a real-time prediction endpoint
FIGURE 15.42 Accessing the real-time prediction endpoint
Chapter 16
FIGURE 16.1 Enabling region-specific Amazon STS endpoints
FIGURE 16.2 Accessing the Amazon SageMaker management console
FIGURE 16.3 Navigating to the list of notebook instances
FIGURE 16.4 Specifying the name of the new Amazon SageMaker notebook instanc...
FIGURE 16.5 Creating a new IAM role for the Amazon SageMaker notebook instan...
FIGURE 16.6 Specifying the permissions policy for the new IAM role for Amazo...
FIGURE 16.7 New IAM role for Amazon SageMaker
FIGURE 16.8 Amazon SageMaker management console showing the new notebook ins...
FIGURE 16.9 Amazon SageMaker notebook instance management
FIGURE 16.10 Accessing the Amazon S3 bucket that will contain the training a...
FIGURE 16.11 Uploading the pre-split training and test data files to the Ama...
FIGURE 16.12 Creating a new Jupyter Notebook on an Amazon SageMaker notebook...
FIGURE 16.13 Changing the title of a Jupyter Notebook file
FIGURE 16.14 Uploading a file to a notebook instance
FIGURE 16.15 Using a notebook instance to create a training job
FIGURE 16.16 List of trained models
FIGURE 16.17 Training a model based on a built-in algorithm using an AWS Sag...
Chapter 17
FIGURE 17.1 Structure of an artificial neural network (ANN)
FIGURE 17.2 A simple neural network
FIGURE 17.3 TensorFlow API architecture
FIGURE 17.4 Accessing the Amazon S3 bucket that will contain the training an...
FIGURE 17.5 Uploading the pre-split training and test data files to the Amaz...
FIGURE 17.6 Amazon SageMaker management console showing the new notebook ins...
FIGURE 17.7 Inspecting the first five rows of the Boston housing dataset
FIGURE 17.8 Mean squared error metric
FIGURE 17.9 Computation graph with two placeholder nodes
FIGURE 17.10 Computation graph with two variable nodes
FIGURE 17.11 Computation graph after the multiplication of w1 and x1 nodes
FIGURE 17.12 Computation graph with y_predicted
FIGURE 17.13 Computation graph that contains nodes to compute the MSE cost f...
FIGURE 17.14 Computation graph that contains the operation to optimize the c...
FIGURE 17.15 Uploading a file to a notebook instance
FIGURE 17.16 Architecture of neural-network–based classification model
FIGURE 17.17 Using a notebook instance to create a training job
FIGURE 17.18 List of trained models
Chapter 18
FIGURE 18.1 Accessing the Amazon Rekognition service home page
FIGURE 18.2 Accessing the Object and Scene Detection demo
FIGURE 18.3 Object labels detected in a sample scene
FIGURE 18.4 Amazon Rekognition aggregate metric graphs
FIGURE 18.5 Accessing the Amazon DynamoDB management console
FIGURE 18.6 Amazon DynamoDB table name and primary key attributes
FIGURE 18.7 Amazon DynamoDB Table Read/Write Capacity Mode section
FIGURE 18.8 Amazon DynamoDB management console displaying a list of tables
FIGURE 18.9 Creating an AWS Lambda function from scratch
FIGURE 18.10 Lambda Function Name and Runtime settings
FIGURE 18.11 Viewing the default policy document associated with the IAM rol...
FIGURE 18.12 Updating the default policy document associated with the IAM ro...
FIGURE 18.13 Review Policy screen
FIGURE 18.14 AWS Lambda function designer
FIGURE 18.15 Configuring the S3 event trigger
FIGURE 18.16 Configuring the AWS Lambda function code
FIGURE 18.17 Examining the results of the AWS Lambda function
FIGURE 18.18 Querying the Amazon DynamoDB table will allow you to search for...
Abhishek Mishra
Copyright © 2019 by John Wiley & Sons, Inc., Indianapolis, Indiana
Published simultaneously in Canada
ISBN: 978-1-119-55671-8
ISBN: 978-1-119-55673-2 (ebk.)
ISBN: 978-1-119-55672-5 (ebk.)
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Web site may provide or recommendations it may make. Further, readers should be aware that Internet Web sites listed in this work may have changed or disappeared between when this work was written and when it is read.
For general information on our other products and services or to obtain technical support, please contact our Customer Care Department within the U.S. at (877) 762-2974, outside the U.S. at (317) 572-3993 or fax (317) 572-4002.
Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.
Library of Congress Control Number: 2019940774
TRADEMARKS: Wiley, the Wiley logo, and the Sybex logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. Amazon SageMaker and Amazon Rekognition are registered trademarks of Amazon Technologies, Inc. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
To my wife Sonam, for her love and support through all the years we've been together.
To my daughter Elana, for bringing joy and happiness into our lives.
—Abhishek
This book would not have been possible without the support of the team at Wiley, including Jim Minatel, Kenyon Brown, David Clark, Kim Cofer, and Pete Gaughan. I would also like to thank Chaim Krause for his keen eye for detail. It has been my privilege to work with all of you. Thank you.
Abhishek Mishra has been active in the IT industry for over 19 years and has extensive experience with a wide range of programming languages, enterprise systems, service architectures, and platforms.
He holds a master's degree in computer science from the University of London and currently provides consultancy services to Lloyds Banking Group in London as a security and fraud solution architect. He is the author of several books, including Amazon Web Services for Mobile Developers.
Chaim Krause is a lover of computers, electronics, animals, and electronic music. He's tickled pink when he can combine two or more in some project. He has come by the vast majority of his knowledge through independent learning. He jokes with everyone that the only difference between what he does at home and what he does at work is the logon he uses. As a lifelong learner he is often frustrated with technical errors in documentation that waste valuable time and cause unnecessary frustration. One of the reasons he works as the technical editor on books is to help others avoid those same pitfalls.
Amazon Web Services (AWS) is one of the leading cloud-computing platforms in the industry today. At the time this book was written, AWS offered more than 100 services, each of which resided in one of 18 different service categories. For someone who is new to cloud computing or to the AWS ecosystem, the sheer number of services on offer can be daunting. It can be difficult to know where to begin and what services to focus on.
Developers who are new to machine learning, as well as experienced data scientists, are often unaware of the power of the public cloud, and of AWS's machine learning offerings in particular. In the past, cloud-based machine learning offerings were limited in the types of algorithms they could support and the level of customization that was possible. All of this changed when Amazon announced SageMaker, a service that provides the ability to build machine learning models based on Amazon's implementation of cutting-edge algorithms, as well as the option to build custom models with frameworks such as Scikit-learn and Google TensorFlow.
Real-world use cases of cloud-based machine learning models do not use the model in isolation; they rely on a number of supporting systems such as databases, load balancers, API gateways, and identity providers, all of which are provided by AWS. This book gives seasoned machine learning experts and enthusiasts alike an introduction to a selection of AWS machine learning services that are based on pre-trained models, as well as step-by-step examples of how to train and deploy your own custom models on Amazon SageMaker. For enthusiasts who are new to machine learning, this book also provides a selection of chapters that cover the fundamentals of machine learning, such as data preprocessing, visualization, feature engineering, and the use of common Python libraries such as NumPy, Pandas, and Scikit-learn.
This book attempts throughout to balance theory and practice, giving you enough visibility into the underlying concepts while providing best practices and practical advice that you can apply at your workplace right away. I have also made every attempt to keep the content up-to-date and relevant. Even though this makes the book susceptible to becoming outdated in a few rare instances, I am confident the content will remain useful and relevant through the next versions of the AWS services.
This book is best suited for software developers who wish to learn about machine learning in general and how to leverage machine learning–specific offerings from AWS. The book is also useful to data scientists, system architects, and application architects who want an introduction to some of the commonly used AWS services in the machine learning space.
If you are new to both machine learning and AWS, I advise that you read all chapters from start to finish. If you are an experienced data scientist, you may want to skip ahead to Part 2 to learn about machine learning–specific AWS services.
This book covers building and training machine learning models with Python on the AWS cloud, as well as a number of ready-to-use machine learning services such as Amazon Rekognition, Amazon Comprehend, and Amazon Lex.
The book also covers general high-level concepts of machine learning, including feature engineering and data visualization, as well as supporting AWS services that are used to build machine learning systems, such as AWS IAM, Amazon Cognito, Amazon S3, Amazon DynamoDB, and AWS Lambda.
The model-building and evaluation code in this book is written in Python 3. Services provided by Amazon, Apple, and Google are updated frequently and therefore sometimes you may encounter a newer version of a screen when you follow the instructions in a chapter.
This book consists of 18 chapters that are grouped into two parts, and four appendices. The first part consists of five chapters and covers the fundamentals of machine learning using Python. This part covers techniques for feature engineering, data visualization, model building, and model evaluation using Pandas, NumPy, Matplotlib, and Scikit-learn. The examples developed in this part make use of Jupyter Notebook and are aimed at readers who are new to machine learning.
Part 2 covers building machine learning applications using AWS services. This part starts by introducing the basics of commonly used AWS services such as Amazon S3, Amazon DynamoDB, and AWS Lambda. It then proceeds to AWS services that deal specifically with machine learning, such as Amazon Comprehend, Amazon Lex, Amazon Machine Learning, and Amazon SageMaker. Two chapters are dedicated to Amazon SageMaker; the first one covers building and deploying models using built-in algorithms and Scikit-learn, and the second one covers building and deploying a model with Google TensorFlow. Not all chapters in this part include source code, but where applicable, you can download the source code that accompanies each chapter using a GitHub link. Some of the chapters in this part require you to upload files to Amazon S3; you will need to substitute the names of buckets in the examples with those from your own account.
The chapters in Part 1 include:
Introduction to Machine Learning (Chapter 1): This is an introduction to the types of machine learning systems, their applications, and tools used to build machine learning systems.
Data Collection and Preprocessing (Chapter 2): This chapter covers sources that can be used to obtain training data, techniques to explore datasets, and basic feature engineering.
Data Visualization with Python (Chapter 3): This chapter covers techniques to visualize datasets using Matplotlib.
Creating Machine Learning Models with Scikit-learn (Chapter 4): This chapter covers techniques to build and train classification and regression models using Scikit-learn.
Evaluating Machine Learning Models (Chapter 5): This chapter covers techniques to evaluate the quality of a machine learning model.
The chapters in Part 2 include:
Introduction to Amazon Web Services (Chapter 6): This chapter is a brief primer on cloud computing and Amazon Web Services. It also covers commonly encountered service and deployment models.
AWS Global Infrastructure (Chapter 7): This chapter introduces AWS regions, availability zones, and edge locations.
Identity and Access Management (Chapter 8): This chapter introduces one of the key services provided by AWS to secure your resources in the Amazon cloud. It also provides instructions to sign up for an account under the AWS free tier.
Amazon S3 (Chapter 9): This chapter introduces one of the most commonly used storage services provided by AWS, Amazon Simple Storage Service (S3).
Amazon Cognito (Chapter 10): This chapter introduces Amazon's cloud-based OAuth 2.0-compliant identity management solution, Amazon Cognito.
Amazon DynamoDB (Chapter 11): This chapter introduces Amazon's managed NoSQL database service, Amazon DynamoDB.
AWS Lambda (Chapter 12): This chapter introduces AWS Lambda, a service designed to allow you to run code in the Amazon cloud without having to provision or manage any infrastructure.
Amazon Comprehend (Chapter 13): This chapter introduces Amazon Comprehend, a cloud-based natural language processing service that you can integrate into your applications to analyze the contents of text documents.
Amazon Lex (Chapter 14): This chapter introduces Amazon Lex, a cloud-based service that you can use to create chatbots and integrate them into your applications.
Amazon Machine Learning (Chapter 15): This chapter introduces Amazon Machine Learning, a fully managed cloud-based service that you can use to build and deploy simple machine learning models without any programming.
Amazon SageMaker (Chapter 16): This chapter introduces Amazon SageMaker, a cloud-based machine learning service that can be used to train and deploy both built-in and custom machine learning models.
Using Google TensorFlow with Amazon SageMaker (Chapter 17): This chapter introduces Google's TensorFlow framework and covers the use of Amazon SageMaker to build and deploy TensorFlow models.
Amazon Rekognition (Chapter 18): This chapter introduces Amazon Rekognition, a fully managed cloud-based service that can be used to add computer vision capabilities to your applications.
The appendices cover the following topics:
Anaconda and Jupyter Notebook Setup (Appendix A): This appendix provides instructions to install the Anaconda distribution and set up a Jupyter Notebook server on your local computer.
AWS Resources Needed to Use This Book (Appendix B): This appendix provides information on the AWS resources that you need to set up in your account in order to follow along with the examples in the book.
Installing and Configuring the AWS CLI (Appendix C): This appendix provides instructions to download and install the AWS CLI tool.
Introduction to NumPy and Pandas (Appendix D): This appendix provides an introduction to two Python libraries commonly used by data scientists: NumPy and Pandas.
To use this book, you need the following:
A suitable Mac or Windows computer for development
Basic knowledge of Python programming
An AWS account that you can administer
To help you get the most from the text and keep track of what's happening, we've used a number of conventions throughout the book.
NOTE
Notes, tips, hints, tricks, and asides to the current discussion are offset like this.
As for styles in the text:
We italicize new terms and important words when we introduce them.
We show keyboard strokes like this: Ctrl+A.
We show filenames, URLs, and code within the text like so: persistence.properties.
We present code in two different ways:
We use a monofont type with no highlighting for most code examples.
We use bold type to emphasize code that is of particular importance in the present context.
As you work through the examples in this book, you may choose either to type in all the code manually or to use the source code files that accompany the book. All of the source code used in this book is available for download at www.wiley.com/go/machinelearningawscloud. You can also download the code files from GitHub.
We make every effort to ensure that there are no errors in the text or in the code. However, no one is perfect, and mistakes do occur. If you find an error in one of our books, like a spelling mistake or faulty piece of code, we would be very grateful for your feedback. By sending in errata you may save another reader hours of frustration and at the same time you will be helping us provide even higher quality information.
To report errata, send an email to [email protected] and include the following:
The book's title and ISBN (Machine Learning in the AWS Cloud, 9781119556718)
The page number of the relevant content
A description of just what's wrong
Chapter 1: Introduction to Machine Learning
Chapter 2: Data Collection and Preprocessing
Chapter 3: Data Visualization with Python
Chapter 4: Creating Machine Learning Models with Scikit-learn
Chapter 5: Evaluating Machine Learning Models