
AWS Certified Machine Learning Study Guide E-Book

Shreyas Subramanian


Publisher: John Wiley & Sons
Category: Science and New Technologies
Language: English

Description

Succeed on the AWS Machine Learning exam or in your next job as a machine learning specialist on the AWS Cloud platform with this hands-on guide.

As the most popular cloud service in the world today, Amazon Web Services offers a wide range of opportunities for those interested in the development and deployment of artificial intelligence and machine learning business solutions. The AWS Certified Machine Learning Study Guide: Specialty (MLS-C01) Exam delivers hyper-focused, authoritative instruction for anyone considering the pursuit of the prestigious Amazon Web Services Machine Learning certification or a new career as a machine learning specialist working within the AWS architecture.

From exam to interview to your first day on the job, this study guide provides the domain-by-domain specific knowledge you need to build, train, tune, and deploy machine learning models with the AWS Cloud. And with the practice exams and assessments, electronic flashcards, and supplementary online resources that accompany this Study Guide, you'll be prepared for success in every subject area covered by the exam.

You'll also find:

* An intuitive and organized layout perfect for anyone taking the exam for the first time or seasoned professionals seeking a refresher on machine learning on the AWS Cloud
* Authoritative instruction on a widely recognized certification that unlocks countless career opportunities in machine learning and data science
* Access to the Sybex online learning resources and test bank, with chapter review questions, a full-length practice exam, hundreds of electronic flashcards, and a glossary of key terms

AWS Certified Machine Learning Study Guide: Specialty (MLS-C01) Exam is an indispensable guide for anyone seeking to prepare themselves for success on the AWS Certified Machine Learning Specialty exam or for a job interview in the field of machine learning, or who wishes to improve their skills in the field as they pursue a career in AWS machine learning.

Page count: 543

Publication year: 2021

Cover

Title Page

Dedication

Acknowledgments

About the Authors

About the Technical Editor

Introduction

The AWS Certified Machine Learning Specialty Exam

Who Should Buy This Book

Study Guide Features

AWS Certified Machine Learning Specialty Exam Objectives

Assessment Test

Answers to Assessment Test

PART I: Introduction

Chapter 1: AWS AI ML Stack

Amazon Rekognition

Amazon Textract

Amazon Transcribe

Amazon Translate

Amazon Polly

Amazon Lex

Amazon Kendra

Amazon Personalize

Amazon Forecast

Amazon Comprehend

Amazon CodeGuru

Amazon Augmented AI

Amazon SageMaker

AWS Machine Learning Devices

Summary

Exam Essentials

Review Questions

Chapter 2: Supporting Services from the AWS Stack

Storage

Amazon VPC

AWS Lambda

AWS Step Functions

AWS RoboMaker

Summary

Exam Essentials

Review Questions

PART II: Phases of Machine Learning Workloads

Chapter 3: Business Understanding

Phases of ML Workloads

Business Problem Identification

Summary

Exam Essentials

Review Questions

Chapter 4: Framing a Machine Learning Problem

ML Problem Framing

Recommended Practices

Summary

Exam Essentials

Review Questions

Chapter 5: Data Collection

Basic Data Concepts

Data Repositories

Data Migration to AWS

Summary

Exam Essentials

Review Questions

Chapter 6: Data Preparation

Data Preparation Tools

Summary

Exam Essentials

Review Questions

Chapter 7: Feature Engineering

Feature Engineering Concepts

Feature Engineering Tools on AWS

Summary

Exam Essentials

Review Questions

Chapter 8: Model Training

Common ML Algorithms

Local Training and Testing

Remote Training

Distributed Training

Monitoring Training Jobs

Debugging Training Jobs

Hyperparameter Optimization

Summary

Exam Essentials

Review Questions

Chapter 9: Model Evaluation

Experiment Management

Metrics and Visualization

Summary

Exam Essentials

Review Questions

Chapter 10: Model Deployment and Inference

Deployment for AI Services

Deployment for Amazon SageMaker

Advanced Deployment Topics

Summary

Exam Essentials

Review Questions

Chapter 11: Application Integration

Integration with On-Premises Systems

Integration with Cloud Systems

Integration with Front-End Systems

Summary

Exam Essentials

Review Questions

PART III: Machine Learning Well-Architected Lens

Chapter 12: Operational Excellence Pillar for ML

Operational Excellence on AWS

Summary

Exam Essentials

Review Questions

Chapter 13: Security Pillar

Security and AWS

Secure SageMaker Environments

AI Services Security

Summary

Exam Essentials

Review Questions

Chapter 14: Reliability Pillar

Reliability on AWS

Change Management for ML

Failure Management for ML

Summary

Exam Essentials

Review Questions

Chapter 15: Performance Efficiency Pillar for ML

Performance Efficiency for ML on AWS

Summary

Exam Essentials

Review Questions

Chapter 16: Cost Optimization Pillar for ML

Common Design Principles

Cost Optimization for ML Workloads

Summary

Exam Essentials

Review Questions

Chapter 17: Recent Updates in the AWS AI/ML Stack

New Services and Features Related to AI Services

New Features Related to Amazon SageMaker

Summary

Exam Essentials

Appendix Answers to the Review Questions

Chapter 1: AWS AI ML Stack

Chapter 2: Supporting Services from the AWS Stack

Chapter 3: Business Understanding

Chapter 4: Framing a Machine Learning Problem

Chapter 5: Data Collection

Chapter 6: Data Preparation

Chapter 7: Feature Engineering

Chapter 8: Model Training

Chapter 9: Model Evaluation

Chapter 10: Model Deployment and Inference

Chapter 11: Application Integration

Chapter 12: Operational Excellence Pillar for ML

Chapter 13: Security Pillar

Chapter 14: Reliability Pillar

Chapter 15: Performance Efficiency Pillar for ML

Chapter 16: Cost Optimization Pillar for ML

Index

End User License Agreement

List of Tables

Chapter 1

TABLE 1.1 Various features of SageMaker corresponding to the different phase...

Chapter 2

TABLE 2.1 AWS Lambda limits

Chapter 5

TABLE 5.1 Table of housing data

Chapter 8

TABLE 8.1 Services relevant to an end-to-end machine learning workflow that ...

List of Illustrations

Chapter 1

FIGURE 1.1 Document analysis with human review flow

FIGURE 1.2 Flow showing how to translate customer service calls followed by ...

FIGURE 1.3 The AppointmentBot can be built using Amazon Lex and backend ...

FIGURE 1.4 The end-to-end flow with Amazon Personalize (text on top) and how...

Chapter 2

FIGURE 2.1 Pattern for using FSx for Lustre with Amazon SageMaker for traini...

FIGURE 2.2 Architecture showing the use of VPC endpoints to connect to vario...

FIGURE 2.3 An example ML pipeline constructed using step functions that orch...

Chapter 3

FIGURE 3.1 Diagram showing the phases of the machine learning lifecycle

Chapter 5

FIGURE 5.1 Various data sources you can use with AWS Data Pipeline to land d...

FIGURE 5.2 Various data sources you can use with AWS DMS to land data in S3...

FIGURE 5.3 Conceptual diagram of Kinesis Data Streams

FIGURE 5.4 Conceptual diagram of Kinesis Data Firehose showing how data can ...

FIGURE 5.5 Diagram showing streaming data flow pattern for Kinesis Data Anal...

Chapter 6

FIGURE 6.1 Diagram showing SageMaker Ground Truth data labeling tool

FIGURE 6.2 Diagram showing AWS Glue as an ETL tool

Chapter 7

FIGURE 7.1 Diagram showing how you can deal with skewed distributions using ...

FIGURE 7.2 Diagram showing how you backtest on time series data

Chapter 8

FIGURE 8.1 Linear regression example. The error terms shown by the vertical ...

FIGURE 8.2 Example showing lack of constant variance. The error terms shown ...

FIGURE 8.3 Example showing violation of linearity assumption

FIGURE 8.4 SVM conceptual example showing the separatrix by the solid line a...

FIGURE 8.5 You can use a kernel SVM to separate the points. Although they cl...

FIGURE 8.6 Sequential learning of XGBoost to combine many weak learners into...

FIGURE 8.7 Possible output of a clustering analysis that splits data into th...

FIGURE 8.8 Elbow curve analysis of PCA to determine optimal number of cluste...

FIGURE 8.9 Values of

and

explored with grid search

FIGURE 8.10 Values of

and

explored with random search

Chapter 9

FIGURE 9.1 Data from a toy two-class classification problem

FIGURE 9.2 Example SVM hyperplane separating the two classes of data

FIGURE 9.3 Example ROC curve

FIGURE 9.4 Example showing comparison of two ROC curves by calculating the A...

FIGURE 9.5 Example precision vs. recall curve

Chapter 10

FIGURE 10.1 SageMaker real-time endpoints under the hood

FIGURE 10.2 SageMaker Batch transform under the hood

FIGURE 10.3 Re-create strategy showing how to stop endpoint A and start endp...

FIGURE 10.4 Ramped strategy showing how to gradually shift from endpoint A t...

Chapter 11

FIGURE 11.1 Typical architecture for connecting Amazon API Gateway to AWS La...

Chapter 12

FIGURE 12.1 Diagram showing the ML workflow with different ways you can use ...

FIGURE 12.2 Diagram showing a typical CI/CD workflow that can be used as par...

Chapter 13

FIGURE 13.1 Diagram showing the AWS shared responsibility model. Understand ...

FIGURE 13.2 Diagram showing tag-based controls that can be applied using att...

FIGURE 13.3 Authentication flow using SAML 2.0 to access the AWS console

FIGURE 13.4 Different IAM roles applicable to SageMaker

FIGURE 13.5 Private network traffic to SageMaker Studio


AWS Certified Machine Learning Study Guide

Specialty (MLS-C01) Exam

Shreyas Subramanian

Stefan Natu

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.

Published simultaneously in Canada and the United Kingdom.

ISBN: 978-1-119-82100-7
ISBN: 978-1-119-82102-1 (ebk.)
ISBN: 978-1-119-82101-4 (ebk.)

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Website is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Website may provide or recommendations it may make. Further, readers should be aware the Internet Websites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Control Number: 2021944004

Trademarks: WILEY, the Wiley logo, Sybex, and the Sybex logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. Amazon Web Services and AWS are trademarks of Amazon, Inc. or its affiliates in the United States and/or other countries. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

Cover image: © Jeremy Woodhouse/Getty Images
Cover design: Wiley

To our parents.

Acknowledgments

Although this book bears our names as authors, many other people contributed to its creation. Without their help, this book wouldn't exist, or at best would exist in a lesser form. Kenyon Brown was the acquisitions editor and so helped get the book started. Christine O'Connor, the managing editor, and Caroline Define, project manager, oversaw the book as it progressed through all its stages. Sonam Mishra was the technical editor, who checked the text for technical errors and omissions—but any mistakes that remain are our own. We would also like to thank Matt Wagner from FreshBooks, who helped connect us with Wiley to write this book. Finally, we would like to thank our wives for their patience as we spent many weekend hours researching the content and writing this book over the past 8 months.

About the Authors

Shreyas Subramanian has a PhD in multilevel systems optimization and the application of machine learning to large-scale optimization. He is currently a principal machine learning specialist at Amazon Web Services, and he has worked with several large-scale companies on their business-critical machine learning and optimization problems. Subramanian is passionate about simplifying difficult concepts within optimization, and he holds two patents in areas connected to aviation-related tools and techniques for improving efficiency and security of the airspace. He has also published over 20 conference and journal papers on the topics of aircraft design, evolutionary optimization, distributed optimization, and multilevel systems or systems optimization. He has several years of experience building machine learning and optimization models for customers ranging from large enterprises to small startups, while taking part in and winning hackathons on the side. Subramanian is passionate about teaching practical machine learning to citizen data scientists and has trained hundreds of customers in private, hands-on environments. He has also helped customers build proofs-of-concept that are now in production, providing millions of dollars’ worth of revenue to the AWS business as well as to customers.

Stefan Natu is a principal machine learning (ML) architect at Alexa AI, where he is building an ML platform for Alexa scientists and engineers. Prior to that, Natu was the lead ML architect at Amazon Web Services, where he focused on financial services and helped major investment banking, asset management, and insurance customers build and operationalize ML use cases on AWS, with an emphasis on security, enterprise data, and model governance. Natu has developed and evangelized common ML architecture and infrastructure patterns globally across highly regulated AWS customers, leading to numerous production ML deployments and millions of dollars in AWS cloud revenue. He has authored over 25 AWS machine learning blogs, code samples, and whitepapers, and is a frequent speaker at conferences such as AWS re:Invent. He completed his PhD in atomic and condensed matter physics at Cornell University, and he worked as a research physicist at ExxonMobil, submitting two patents and over 25 peer-reviewed publications. Natu is passionate about mentorship and has served as a technical adviser at Insight Data Science, where he guided students in their transition from careers in academia to industry.

About the Technical Editor

Sonam Mishra is an IT consultant with several years of experience in diverse roles ranging from software development, to application testing, to technical content creation. She is passionate about new and emerging technologies, particularly in the area of cloud computing. She lives in the United Kingdom with her family.

Introduction

Machine learning (ML) is one of the most popular and rapidly growing fields in the technology industry today, with far-reaching business implications. The market for ML solutions and products is expected to grow annually by tens of billions of dollars, and with it, the demand for professionals who understand how to analyze data and build ML solutions is expected to grow as well.

ML is a highly technical field, and successful ML professionals need a foundation in mathematics, statistics, and data analysis. They must be able to code and have a fundamental understanding of infrastructure and software development best practices. In the past, the practitioners of machine learning were academics and PhDs, but the industry demand for ML is much larger than the supply of new PhDs emerging from academic institutions.

The purpose of this book is for you to understand the concepts and principles behind ML, with the practical goal of passing the AWS Certified Machine Learning Specialty exam. As practicing ML solution architects, we go well beyond the scope of the test in this book and incorporate architecture patterns and best practices that we have seen employed in the industry today. Reading this book will also give you an understanding of what is required to be a successful machine learning architect.

This is not a book on ML foundations. That is simply too vast a field for us to do it justice in this book and also is not our intention. There are a number of excellent textbooks and online resources you can use to develop a foundation on ML algorithms, deep learning, and similar topics. However, we will cover the concepts that you will need for the test.

Finally, one of our favorite leadership principles here at Amazon that widely applies to the solution architect role is learn and be curious. We have found that the best way to learn a topic is to get hands-on, and we highly recommend that you go beyond this book and get hands-on experience in ML. Download and explore some public datasets, and train some simple predictive models. Build a neural network from scratch using TensorFlow/PyTorch or just native Python. Explore AWS services such as Amazon SageMaker by running some of the sample Jupyter Notebooks. We highly recommend getting some hands-on knowledge before taking the test. Check out the AWS Training and Certification web page for helpful courses: www.aws.training.

Don't just study the questions and answers! The questions on the actual exam will be different from the practice questions included in this book. The exam is designed to test your knowledge of a concept or objective, so use this book to learn the objectives behind the questions.

The ML space is maturing and growing very quickly; what this means is that our book is just a snapshot in time of our understanding of the industry and certification requirements. We highly recommend that you read the SageMaker home page to review the latest releases that may appear on the test.

The AWS Certified Machine Learning Specialty Exam

The AWS Certified Machine Learning Specialty exam is intended for professionals who perform a data science or machine learning engineering role. The official details of the test can be found here: https://aws.amazon.com/certification/certified-machine-learning-specialty.

The test validates your understanding of foundational ML concepts, foundations of statistics, data analysis, exploration, feature engineering, and common ML algorithms. This is required knowledge for anyone performing this role in industry today. Beyond that, the certification focuses on your ability to deploy those solutions on AWS and to architect an end-to-end solution, from data ingestion to model deployment and monitoring, using the relevant AWS services for a given business use case.

Why Become AWS Machine Learning Specialty Certified?

There are several good reasons to get your AWS Certified Machine Learning certification:

It provides proof of professional achievement.

Certifications are quickly becoming status symbols in the computer service industry. Organizations, including members of the computer service industry, are recognizing the benefits of cloud certifications such as the AWS Certified Solutions Architect - Professional, AWS Certified Security - Specialty, and AWS Certified Advanced Networking - Specialty. As ML becomes increasingly popular, these certifications provide proof of your understanding of ML and your ability to practically deploy ML solutions on AWS.

It provides an opportunity for advancement.

The solution architect role is one of the most coveted roles in the tech industry today due to the breadth and depth of the knowledge you gain, while having an outsized impact on customers’ business. The Machine Learning Specialty Certification could provide you with an opportunity to specialize in ML and become a practicing ML architect, a unique role that many employers are looking to hire.

It helps you develop an industry understanding of ML.

ML education is rapidly becoming a crowded space, with blogs, textbooks, and online courses that cover the foundations of ML, statistics and data science, and even ML tooling. However, there is no substitute for experience, and there isn't much material on actual industry use cases with solutions and best practices (with the exception of some fantastic tech blogs published by companies like Uber, Google, Netflix, Lyft, Airbnb, and many others). This book aims to fill some of that gap by providing you with a practical understanding of building real-life ML solutions on AWS.

It will satisfy your curiosity.

As technologists and technology enthusiasts, we are constantly learning new areas and expanding our knowledge. One of the best and most fulfilling reasons to take this certification is simply to satiate your curiosity to learn how to build ML solutions on AWS.

How to Become AWS Machine Learning Specialty Certified

The AWS Certified Machine Learning Specialty exam is available to anyone and does not require other AWS certifications as prerequisites. It is recommended, however, that you have 1–2 years of experience developing and architecting ML and deep learning workloads on AWS prior to taking the test. Because it is a specialty certification, it also assumes prior foundational understanding of AWS services for storage, networking, security, databases, and so forth; however, these are not tested in detail.

The exam is administered by Pearson VUE and PSI. To register with PSI, go to https://awsavailability.psiexams.com. To register with Pearson VUE, go to https://home.pearsonvue.com/Clients/Amazon-Web-Services.aspx.

Exam policies can change from time to time. We highly recommend that you check both the PSI and Pearson VUE sites for the most up-to-date information when you begin preparing, when you register, and again a few days before your scheduled exam date.

Who Should Buy This Book

Anybody who wants to pass the AWS Certified Machine Learning Specialty exam may benefit from this book. This book is also helpful for business and IT professionals who want to learn how ML is practically used in the industry and pivot their careers toward an ML-centric role such as a data scientist or ML engineer working on AWS. We include a number of practical case studies, industry best practices, and architecture patterns that we have seen used in industry today from our engagements with hundreds of AWS customers. This book is also essential for data scientists, engineers, and other data professionals who are curious about how you can build, train, and deploy models at scale on AWS.

This book assumes some familiarity with ML and with AWS. If you are completely new to machine learning, we recommend that you first learn some basic ML concepts, since this book is mainly focused on the practical aspects of building ML solutions. There are several great resources that cover ML foundations, particularly for building statistical models and for deep learning. Two of our favorites are Aurélien Géron's Hands-On Machine Learning with Scikit-Learn and TensorFlow (O'Reilly) and François Chollet's Deep Learning with Python (Manning, 2017). There are also several awesome blogs on Medium.com and TowardsDataScience.com. Finally, to gain a holistic understanding of the industry landscape, we also recommend the industry blogs from leading tech companies like Uber, Google, Facebook, Amazon, Airbnb, and others on how they deploy large-scale ML solutions.

As a practical matter, you'll need a laptop or desktop with which to practice and learn in a hands-on way. This book does not cover labs, and there is no substitute for hands-on experience. Go get familiar with AWS ML services such as SageMaker, as well as the AI services, before taking the test. We also recommend that you explore some public datasets, engineer features, and train simple models as well as some deep learning models.

Study Guide Features

This study guide uses a number of common elements to help you prepare. These include the following:

Summaries

The summary section of each chapter briefly explains the chapter, allowing you to easily understand what it covers.

Exam Essentials

The Exam Essentials focus on major exam topics and critical knowledge that you should take into the test. They focus on the exam objectives provided by AWS.

Chapter Review Questions

A set of questions at the end of each chapter will help you assess your knowledge of that chapter's topics and whether you are ready to take the exam.

The review questions, assessment test, and other testing elements included in this book are not derived from the actual exam questions, so don't memorize the answers to these questions and assume that doing so will enable you to pass the exam. You should learn the underlying topic, as described in the text of the book. This will let you answer the questions provided with this book and pass the exam. Learning the underlying topic is also the approach that will serve you best in the workplace—the ultimate goal of a certification.

Interactive Online Learning Environment and Test Bank

We’ve worked hard to provide some really great tools to help you with your certification process. The interactive online learning environment that accompanies the AWS Certified Machine Learning Study Guide: Specialty (MLS-C01) Exam provides a test bank with study tools to help you prepare for the certification exam—and increase your chances of passing it the first time! The test bank includes the following:

Sample Tests: All the questions in this book are provided, including the assessment test at the end of this introduction and the review questions at the end of each chapter. In addition, there is a practice exam with 76 questions. Use these questions to test your knowledge of the study guide material. The online test bank runs on multiple devices.

Flashcards: The online test bank includes flashcards specifically written to challenge you, so don’t get discouraged if you don’t ace your way through them at first. They’re there to ensure that you’re really ready for the exam. And no worries: armed with the book, reference material, review questions, practice exams, and flashcards, you’ll be more than prepared when exam day comes. Questions are provided in digital flashcard format (a question followed by a single correct answer). You can use the flashcards to reinforce your learning and provide last-minute test prep before the exam.

Glossary: A glossary of key terms from this book is available as a fully searchable PDF.

Go to www.wiley.com/go/sybextestprep, register your book to receive your unique PIN, and then once you have the PIN, return to www.wiley.com/go/sybextestprep and register a new account or add this book to an existing account.

Conventions Used in This Book

This book uses certain typographic styles in order to help you quickly identify important information and to avoid confusion over the meaning of words such as on-screen prompts. In particular, look for the following styles:

Italicized text indicates key terms that are described at length for the first time in a chapter. (Italics are also used for emphasis.)

A monospaced font indicates the contents of configuration files, messages displayed at a text-mode Linux shell prompt, filenames, text-mode command names, and Internet URLs.

Italicized monospaced text indicates a variable, that is, information that differs from one system or command run to another, such as the name of a client computer or a process ID number.

Bold monospaced text is information that you're to type into the computer, such as at a shell prompt. This text can also be italicized to indicate that you should substitute an appropriate value for your system.

In addition to these text conventions, which can apply to individual words or entire paragraphs, a few conventions highlight segments of text:

A note indicates information that's useful or interesting but that's somewhat peripheral to the main text. A note might be relevant to a small number of networks, for instance, or it may refer to an outdated feature.

A tip provides information that can save you time or frustration and that may not be entirely obvious. A tip might describe how to get around a limitation or how to use a feature to perform an unusual task.

Warnings describe potential pitfalls or dangers. If you fail to heed a warning, you may end up spending a lot of time recovering from a bug, or you may even end up restoring your entire system from scratch.

Real-World Scenario

A real-world scenario is a type of sidebar that describes a task or an example that's particularly grounded in the real world. This may be a situation we or somebody we know has encountered, or it may be advice on how to work around problems that are common in real, working ML environments.

AWS Certified Machine Learning Specialty Exam Objectives

AWS Certified Machine Learning Study Guide has been written to cover every AWS exam objective at a level appropriate to its exam weighting. The following table provides a breakdown of this book's exam coverage, showing you the weight of each section and the chapter where each objective or subobjective is covered:

Subject Area | % of Exam
Domain 1: Data Engineering | 20%
Domain 2: Exploratory Data Analysis | 24%
Domain 3: Modeling | 36%
Domain 4: Machine Learning Implementation and Operations | 20%
Total | 100%

Domain 1: Data Engineering

Subdomain 1.1: Create Data Repositories for Machine Learning

Exam Objective

Chapter

1.1-1. Create data repositories for machine learning

Identify data sources

Determine storage mediums

Subdomain 1.2: Identify and Implement a Data Ingestion Solution

Exam Objective

Chapter

1.2-1. Data job styles/types (batch load/streaming)

1.2-2. Data ingestion pipelines

Kinesis

Kinesis Analytics

Kinesis Firehose

EMR

Glue

1.2-3. Job scheduling

Subdomain 1.3: Identify and Implement a Data Transformation Solution

Exam Objective

Chapter

1.3-1. Transforming data transit (ETL: Glue, EMR, AWS Batch)

1.3-2. Handle ML-specific data using map reduce (Hadoop, Spark, Hive)

Domain 2: Exploratory Data Analysis

Subdomain 2.1: Sanitize and Prepare Data for Modeling

Exam Objective

Chapter

2.1-1. Identify and handle missing data, corrupt data, stop words, etc.

2.1-2. Formatting, normalizing, augmenting, and scaling data

2.1-3. Labeled data (recognizing when you have enough labeled data and identifying mitigation strategies [Data labeling tools (Mechanical Turk, manual labor)])

Subdomain 2.2: Perform Feature Engineering

Exam Objective

Chapter

2.2-1. Identify and extract features from datasets, including from data sources such as text, speech, image, public datasets, etc.

2.2-2. Analyze/evaluate feature engineering concepts (binning, tokenization, outliers, synthetic features, One-hot encoding, reducing dimensionality of data)

Subdomain 2.3: Analyze and Visualize Data for Machine Learning

Exam Objective

Chapter

2.3-1. Graphing (scatter plot, time series, histogram, box plot)

2.3-2. Interpreting descriptive statistics (correlation, summary statistics, p value)

2.3-3. Clustering (hierarchical, diagnosing, elbow plot, cluster size)

Domain 3: Modeling

Subdomain 3.1: Frame Business Problems as Machine Learning Problems

Exam Objective

Chapter

3.1-1. Determine when to use/when not to use ML

3.1-2. Know the difference between supervised and unsupervised learning

3.1-3. Selecting from among classification, regression, forecasting, clustering, recommendation, etc.

Subdomain 3.2: Select the Appropriate Model(s) for a Given Machine Learning Problem

Exam Objective

Chapter

3.2-1. XGBoost, logistic regression, K-means, linear regression, decision trees, random forests, RNN, CNN, Ensemble, Transfer learning

3.2-2. Express intuition behind models

Subdomain 3.3: Train Machine Learning Models

Exam Objective

Chapter

3.3-1. Train validation test split, cross-validation

3.3-2. Optimizer, gradient descent, loss functions, local minima, convergence, batches, probability, etc.

3.3-3. Compute choice (GPU vs. CPU, distributed vs. non-distributed, platform [Spark vs. non-Spark])

3.3-4. Model updates and retraining

Subdomain 3.4: Perform Hyperparameter Optimization

Exam Objective

Chapter

3.4-1. Regularization

3.4-2. Cross validation

3.4-3. Model initialization

3.4-4. Neural network architecture (layers/nodes), learning rate, activation functions

3.4-5. Tree-based models (# of trees, # of levels)

3.4-6. Linear models (learning rate)

Subdomain 3.5: Evaluate Machine Learning Models

Exam Objective

Chapter

3.5-1. Avoid overfitting/underfitting (detect and handle bias and variance)

3.5-2. Metrics (AUC-ROC, accuracy, precision, recall, RMSE, F1 score)

3.5-3. Confusion matrix

3.5-4. Offline and online model evaluation, A/B testing

3.5-5. Compare models using metrics (time to train a model, quality of model, engineering costs)

3.5-6. Cross validation

Domain 4: Machine Learning Implementation and Operations

Subdomain 4.1: Build Machine Learning Solutions for Performance, Availability, Scalability, Resiliency, and Fault Tolerance

Exam Objective

Chapter

4.1-1. AWS environment logging and monitoring

CloudTrail and CloudWatch

Build Error Monitoring

4.1-2. Multiple regions, Multiple AZs

4.1-3. Docker containers

4.1-4. Auto Scaling groups

4.1-5. Rightsizing

4.1-6. Load balancing

4.1-7. AWS best practices

Subdomain 4.2: Recommend and Implement the Appropriate Machine Learning Services and Features for a Given Problem

Exam Objective

Chapter

4.2-1. ML on AWS (application services)

4.2-2. AWS service limits

4.2-3. Build your own model vs. SageMaker built-in algorithms

4.2-4. Infrastructure: Instances types for ML and cost considerations

Subdomain 4.3: Apply Basic AWS Security Practices to Machine Learning Solutions

Exam Objective

Chapter

4.3-1. IAM

4.3-2. S3 Bucket Policies

4.3-3. Security groups

4.3-4. VPC

4.3-5. Encryption/anonymization

Subdomain 4.4: Deploy and Operationalize Machine Learning Solutions

Exam Objective

Chapter

4.4-1. Exposing endpoints and interacting with them

4.4-2. ML model versioning

4.4-3. A/B testing

4.4-4. Retrain pipelines

4.4-5. ML debugging/troubleshooting

Detect and mitigate drop in performance

Monitor performance of the model

Exam domains and objectives are subject to change at any time without prior notice and at AWS's sole discretion. Please visit their website (https://aws.amazon.com/certification/certified-machine-learning-specialty) for the most current information.

PART I Introduction

Chapter 1 AWS AI ML Stack

THE AWS CERTIFIED MACHINE LEARNING (ML) SPECIALTY EXAM OBJECTIVES COVERED IN THIS CHAPTER INCLUDE BUT ARE NOT LIMITED TO THE FOLLOWING:

Domain 3.0: Modeling

3.1 Frame business problems as machine learning problems

3.2 Select the appropriate model(s) for a given machine learning problem

Sample ML architectures for common business workflows such as video analysis, text mining, and others

Details about some common algorithms used in solving complex problems involving unstructured data like text and image

Domain 4.0: Machine Learning Implementation and Operations

4.2 Recommend and implement the appropriate machine learning services and features for a given problem

Details about algorithms for different ML use cases

Details about when to use the proper AWS AI/ML Service

In this chapter, you will learn about different AWS Services for Machine Learning, starting with the artificial intelligence (AI) services for common machine learning (ML) tasks such as image and video analysis, natural language processing, text-to-speech conversion or vice versa, or building recommendation systems or time-series forecasting into your applications. These services make it easy for you to build ML-powered applications without machine learning experience. You will then learn about Amazon SageMaker, which is a fully managed service for data scientists and machine learning developers to build, train, and deploy ML models in the AWS cloud for various business applications. For reference, the exam guide can be found at https://d1.awsstatic.com/training-and-certification/docs-ml/AWS-Certified-Machine-Learning-Specialty_Exam-Guide.pdf.

Amazon Rekognition

Amazon Rekognition is an AI service that makes it easy for users to add image or video analysis workflows to their applications. Amazon Rekognition aims to leverage Amazon's vast experience in using deep learning for various image-based workloads such as image classification, object detection, detection of text in images, facial recognition, sentiment, and most recently, public safety.

Although there is a vast amount of deep learning research behind developing models for image analytics, training these deep learning models is often computationally expensive and can take several cycles of data scientist or developer time. That's where Amazon Rekognition comes in. With Amazon Rekognition, developers can simply leverage pretrained models or train custom machine learning models without having to worry about writing the algorithm code, or about setting up or managing the infrastructure to train and deploy a deep learning model. More importantly, you don't require any prior machine learning or deep learning knowledge to use this service.

Before diving into Amazon Rekognition, let's quickly survey how deep learning handles images and videos. Image recognition typically relies on convolutional neural network (CNN) architectures. CNNs are deep learning models built from convolutional layers, which apply various filters to the input data to capture different information at different scales, alternating with pooling layers, which reduce the spatial size of the representation and, with it, the number of parameters in the network. The initial layers capture low-level features like edges and curves, whereas later layers build up to higher-level features that eventually identify the object. There are many popular CNN architectures, such as ResNet and Inception V4, but it is more important to understand the basic concept.
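To make the convolution-then-pooling idea concrete, here is a minimal sketch in plain Python. It is only an illustration under simple assumptions: the 5x5 image, the 2x2 vertical-edge filter, and the function names conv2d and max_pool2d are all made up for this example, and real CNNs learn their filter values during training rather than having them written by hand.

```python
# Toy illustration only: one convolution followed by 2x2 max pooling.
# The image and the hand-written filter are made-up values.

def conv2d(image, kernel):
    """Valid (no padding), stride-1 cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; shrinks each spatial dimension by `size`."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# 5x5 image: dark columns (0) on the left, bright columns (1) on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Responds strongly where brightness increases from left to right.
edge_filter = [[-1, 1],
               [-1, 1]]

features = conv2d(image, edge_filter)   # 4x4 map, peaks at the edge column
pooled = max_pool2d(features)           # 2x2 map, edge response preserved

print(features)  # [[0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]
print(pooled)    # [[2, 0], [2, 0]]
```

The filter fires only at the column where dark meets bright (a low-level edge feature), and pooling keeps that response while halving each spatial dimension.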

It is also useful to understand the concept of transfer learning. Transfer learning refers to taking a model that was pretrained on one dataset, freezing the initial layers, and letting it relearn the last few layers of the model on a different dataset. The benefits of this are that:

It is computationally less expensive than training a full neural network from scratch.

When you don't have a lot of data or data labeling is expensive, using a pretrained model can provide better model performance than training a model from scratch.

Both Inception V4 and ResNet models are popular algorithms for transfer learning in the image classification space. Transfer learning can be used in many deep learning applications—not just image or video data use cases, but also in natural language processing (NLP).
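The freeze-and-retrain idea can be sketched without any deep learning framework. The toy below is an assumption-laden illustration, not how ResNet or Inception are fine-tuned in practice: FROZEN_WEIGHTS stands in for layers "pretrained" elsewhere (the values are made up), and only a small logistic-regression head is trained on the new dataset.

```python
import math

# Stand-in for pretrained layers (made-up values). These are never updated.
FROZEN_WEIGHTS = [[1.0, -1.0],
                  [0.5, 0.5]]


def extract_features(x):
    """Frozen layer: linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in FROZEN_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, head, bias):
    """Probability of class 1 from frozen features plus the trainable head."""
    feats = extract_features(x)
    return sigmoid(sum(h * f for h, f in zip(head, feats)) + bias)


def fine_tune_head(samples, labels, epochs=200, lr=0.5):
    """Train only the final layer with stochastic gradient descent on log loss."""
    head = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            feats = extract_features(x)
            err = predict(x, head, bias) - y   # gradient w.r.t. the logit
            head = [h - lr * err * f for h, f in zip(head, feats)]
            bias -= lr * err
    return head, bias


# Small "new" dataset: label is 1 when the first input exceeds the second.
samples = [[2, 0], [3, 1], [0, 2], [1, 3]]
labels = [1, 1, 0, 0]

head, bias = fine_tune_head(samples, labels)
for x, y in zip(samples, labels):
    print(x, y, round(predict(x, head, bias), 3))
```

Because only three numbers (two head weights and a bias) are learned, training is cheap and works with very few labeled examples, which mirrors the two benefits listed above.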

For object detection, the fundamental architecture is similar, but instead of only assigning a fixed label to the whole image (such as cat versus dog), the model also predicts a bounding box that encloses each object of interest. Common algorithms include single-shot detector (SSD), R-CNN or Faster R-CNN, and YOLO v4.
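A predicted bounding box is usually judged by how well it overlaps the true box. The text above does not name a metric, so treat the following as a supplementary sketch: intersection over union (IoU) is one widely used overlap score, and the box coordinates and the function name iou below are made up for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union for boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


predicted = (0, 0, 10, 10)      # hypothetical model output
ground_truth = (5, 5, 15, 15)   # hypothetical labeled box

# Overlap is 5 x 5 = 25, union is 100 + 100 - 25 = 175.
print(round(iou(predicted, ground_truth), 4))  # 0.1429
```

An IoU of 1.0 means the boxes coincide exactly, and 0.0 means they do not overlap at all.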

Finally, semantic segmentation isolates the object of interest in an image by classifying, pixel by pixel, whether each pixel belongs to the object. An example is detecting a tumor in human tissue. To be useful for doctors, it is not sufficient to draw a bounding box; you also need to accurately separate the tumor from healthy tissue.
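The difference between a bounding box and a segmentation mask is easy to see on a tiny example. In the sketch below, the 6x6 mask is a made-up stand-in for a model's per-pixel output (1 means the pixel belongs to the object), and the helper names are hypothetical.

```python
# Hypothetical per-pixel output: 1 = object (e.g. tumor), 0 = background.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]


def mask_area(mask):
    """Number of pixels classified as belonging to the object."""
    return sum(sum(row) for row in mask)


def bounding_box(mask):
    """Tightest box around the mask as (row_min, col_min, row_max, col_max), inclusive."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    return min(rows), min(cols), max(rows), max(cols)


def box_area(box):
    """Pixel count of an inclusive (row_min, col_min, row_max, col_max) box."""
    row_min, col_min, row_max, col_max = box
    return (row_max - row_min + 1) * (col_max - col_min + 1)


box = bounding_box(mask)
print(box)                               # (1, 1, 4, 4)
print(mask_area(mask), box_area(box))    # 8 16
```

Here the tightest box covers 16 pixels but only 8 of them belong to the object, so half of the box is healthy background that a pixel-level mask would correctly exclude.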

You can use Amazon Rekognition with the following key use cases:

Image Labeling

This refers to labeling whether an image consists of certain objects (popular objects in nature), events (party, graduation, etc.), concepts (landscape, nature, evening), or activities.

Custom Image Labeling

Imagine that you are a manufacturer and you need to detect whether or not parts on your assembly line are defective. Since your parts do not correspond to common objects found in nature, you may need to train a custom model. We will discuss this in more detail later, but Amazon Rekognition allows you to train a custom model for use cases of this kind.

Face Detection and Search

Amazon Rekognition can not only detect faces in images but also search for faces from an existing collection. Imagine you are a company that wants to implement face detection for your employees to access your corporate buildings. You can store pictures of your employees in a collection, and call Amazon Rekognition APIs to recognize employees from that collection.

People Paths

Amazon Rekognition can track the movement of people in a video. For example, you may want to track the movement of players on a field during a game for postprocessing, stats, and analytics for fans.

Text Detection

Amazon Rekognition can detect text in images and convert it to machine-readable text that you can use for downstream actions.

Celebrity Detection

Amazon Rekognition recognizes celebrities from images and stored videos.

Personal Protective Equipment (PPE)

Amazon Rekognition can now detect PPE on persons in an image.

Look out for key phrases like “without any prior machine learning/deep learning knowledge” or “cost effective” or any of the use cases just described to think of Amazon Rekognition as the solution.

If the question contains a phrase like “custom model,” unless it has to do with image labeling, usually Amazon Rekognition is not the answer.
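As a rough sketch of what calling the service looks like, the helpers below wrap two Amazon Rekognition operations from the AWS SDK for Python (boto3): detect_labels for image labeling and search_faces_by_image for searching a face collection. The helper names, default thresholds, bucket, key, and collection names are assumptions made for illustration, and the client is passed in so the functions can be exercised without AWS credentials.

```python
def detect_image_labels(client, bucket, key, min_confidence=80.0, max_labels=10):
    """Return (label name, confidence) pairs for an image stored in Amazon S3."""
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]


def search_employee_face(client, collection_id, image_bytes, threshold=90.0):
    """Return (external image ID, similarity) of the best match, or None."""
    response = client.search_faces_by_image(
        CollectionId=collection_id,
        Image={"Bytes": image_bytes},
        FaceMatchThreshold=threshold,
        MaxFaces=1,
    )
    matches = response["FaceMatches"]
    if not matches:
        return None
    best = matches[0]  # matches are returned with the highest similarity first
    return best["Face"].get("ExternalImageId"), best["Similarity"]


# Usage sketch (needs the boto3 package, AWS credentials, and existing
# resources; the bucket, key, and collection names here are hypothetical):
#
#   import boto3
#   client = boto3.client("rekognition")
#   print(detect_image_labels(client, "my-example-bucket", "photos/party.jpg"))
#   with open("badge_photo.jpg", "rb") as f:
#       print(search_employee_face(client, "employee-faces", f.read()))
```

Note that neither call involves choosing an algorithm, training a model, or provisioning infrastructure, which is the point of the exam tips above.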

Image and Video Operations
