E-Book
29,99 €

AWS Certified Machine Learning - Specialty (MLS-C01) Certification Guide E-Book

Somanath Nanda

Publisher: Packt Publishing
Category: Technical literature
Language: English

Description

The AWS Certified Machine Learning Specialty (MLS-C01) exam evaluates your ability to execute machine learning tasks on AWS infrastructure. This comprehensive book aligns with the latest exam syllabus, offering practical examples to support your real-world machine learning projects on AWS. Additionally, you'll get lifetime access to supplementary online resources, including mock exams with exam-like timers, detailed solutions, interactive flashcards, and invaluable exam tips, all accessible across various devices—PCs, tablets, and smartphones.
Throughout the book, you’ll learn data preparation techniques for machine learning, covering diverse methods for data manipulation and transformation across different variable types. Addressing challenges such as missing data and outliers, the book guides you through an array of machine learning tasks including classification, regression, clustering, forecasting, anomaly detection, text mining, and image processing, accompanied by requisite machine learning algorithms essential for exam success. The book helps you master the deployment of models in production environments and their subsequent monitoring.
Equipped with insights from this book and the accompanying mock exams, you'll be fully prepared to achieve the AWS MLS-C01 certification.

Details

You can read the e-book in Legimi apps or in any app that supports the following format:

EPUB

Year of publication: 2024

Sample

AWS Certified Machine Learning - Specialty (MLS-C01) Certification Guide

Second Edition

The ultimate guide to passing the MLS-C01 exam on your first attempt

Somanath Nanda

Weslley Moura

AWS Certified Machine Learning - Specialty (MLS-C01) Certification Guide

Second Edition

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Authors: Somanath Nanda and Weslley Moura

Reviewer: Patrick Uzuwe

Publishing Product Manager: Sneha Shinde

Senior Development Editor: Ketan Giri

Development Editor: Kalyani S.

Presentation Designer: Shantanu Zagade

Editorial Board: Vijin Boricha, Megan Carlisle, Wilson D'souza, Ketan Giri, Saurabh Kadave, Alex Mazonowicz, Abhishek Rane, Gandhali Raut, and Ankita Thakur

First Published: March 2021

Second Edition: February 2024

Production Reference: 1280224

Published by Packt Publishing Ltd.

Grosvenor House

11 St Paul’s Square

Birmingham

B3 1RB

ISBN: 978-1-83508-220-1

www.packtpub.com

Contributors

About the Authors

Somanath Nanda has 14 years of experience designing and building Data and ML products. He emphasizes implementing fault-tolerant system design practices throughout the software development lifecycle. He currently holds a prominent leadership position in the finance domain, actively shaping strategic decisions and executions and expertly guiding engineering teams to achieve success.

Weslley Moura has been developing data products for the past decade. In his recent roles, he has influenced data strategy and led data teams in the urban logistics and blockchain industries.

About the Reviewer

Patrick Uzuwe serves as the Chief Technology Officer (CTO) at Sparkrena, a company based in Sheffield, England, United Kingdom. In this role, he specializes in assisting customers in the design and development of cloud-native machine learning products. Driven by a passion for solving challenging problems, he collaborates with partners and customers to modernize their machine learning stack, integrating seamlessly with Amazon SageMaker. Dr. Uzuwe actively works alongside both business and engineering teams to ensure the success of products.

His academic background includes a Ph.D. in Information Systems, which he earned from The University of Bolton in Manchester, United Kingdom.

Preface

1 Machine Learning Fundamentals

Making The Most Out of This Book – Your Certification and Beyond

Comparing AI, ML, and DL

Examining ML

Examining DL

Classifying supervised, unsupervised, and reinforcement learning

Introducing supervised learning

The CRISP-DM modeling life cycle

Data splitting

Overfitting and underfitting

Applying cross-validation and measuring overfitting

Bootstrapping methods

The variance versus bias trade-off

Shuffling your training set

Modeling expectations

Introducing ML frameworks

ML in the cloud

Summary

Exam Readiness Drill – Chapter Review Questions

2 AWS Services for Data Storage

Technical requirements

Storing Data on Amazon S3

Creating buckets to hold data

Distinguishing between object tags and object metadata

Controlling access to buckets and objects on Amazon S3

S3 bucket policy

Protecting data on Amazon S3

Applying bucket versioning

Applying encryption to buckets

Securing S3 objects at rest and in transit

Using other types of data stores

Relational Database Service (RDS)

Managing failover in Amazon RDS

Taking automatic backups, RDS snapshots, and restore and read replicas

Writing to Amazon Aurora with multi-master capabilities

Storing columnar data on Amazon Redshift

Amazon DynamoDB for NoSQL Database-as-a-Service

Summary

Exam Readiness Drill – Chapter Review Questions

3 AWS Services for Data Migration and Processing

Technical requirements

Creating ETL jobs on AWS Glue

Features of AWS Glue

Getting hands-on with AWS Glue Data Catalog components

Getting hands-on with AWS Glue ETL components

Querying S3 data using Athena

Processing real-time data using Kinesis Data Streams

Storing and transforming real-time data using Kinesis Data Firehose

Different ways of ingesting data from on-premises into AWS

AWS Storage Gateway

Snowball, Snowball Edge, and Snowmobile

AWS DataSync

AWS Database Migration Service

Processing stored data on AWS

AWS EMR

AWS Batch

Summary

Exam Readiness Drill – Chapter Review Questions

4 Data Preparation and Transformation

Identifying types of features

Dealing with categorical features

Transforming nominal features

Applying binary encoding

Transforming ordinal features

Avoiding confusion in our train and test datasets

Dealing with numerical features

Data normalization

Data standardization

Applying binning and discretization

Applying other types of numerical transformations

Understanding data distributions

Handling missing values

Dealing with outliers

Dealing with unbalanced datasets

Dealing with text data

Bag of words

TF-IDF

Word embedding

Summary

Exam Readiness Drill – Chapter Review Questions

5 Data Understanding and Visualization

Visualizing relationships in your data

Visualizing comparisons in your data

Visualizing distributions in your data

Visualizing compositions in your data

Building key performance indicators

Introducing QuickSight

Summary

Exam Readiness Drill – Chapter Review Questions

6 Applying Machine Learning Algorithms

Introducing this chapter

Storing the training data

A word about ensemble models

Supervised learning

Working with regression models

Working with classification models

Forecasting models

Object2Vec

Unsupervised learning

Clustering

Anomaly detection

Dimensionality reduction

IP Insights

Textual analysis

BlazingText algorithm

Sequence-to-sequence algorithm

Neural Topic Model algorithm

Image processing

Image classification algorithm

Semantic segmentation algorithm

Object detection algorithm

Summary

Exam Readiness Drill – Chapter Review Questions

7 Evaluating and Optimizing Models

Introducing model evaluation

Evaluating classification models

Extracting metrics from a confusion matrix

Summarizing precision and recall

Evaluating regression models

Exploring other regression metrics

Model optimization

Grid search

Summary

Exam Readiness Drill – Chapter Review Questions

8 AWS Application Services for AI/ML

Technical requirements

Analyzing images and videos with Amazon Rekognition

Exploring the benefits of Amazon Rekognition

Getting hands-on with Amazon Rekognition

Text to speech with Amazon Polly

Exploring the benefits of Amazon Polly

Getting hands-on with Amazon Polly

Speech to text with Amazon Transcribe

Exploring the benefits of Amazon Transcribe

Getting hands-on with Amazon Transcribe

Implementing natural language processing with Amazon Comprehend

Exploring the benefits of Amazon Comprehend

Getting hands-on with Amazon Comprehend

Translating documents with Amazon Translate

Exploring the benefits of Amazon Translate

Getting hands-on with Amazon Translate

Extracting text from documents with Amazon Textract

Exploring the benefits of Amazon Textract

Getting hands-on with Amazon Textract

Creating chatbots on Amazon Lex

Exploring the benefits of Amazon Lex

Getting hands-on with Amazon Lex

Amazon Forecast

Exploring the benefits of Amazon Forecast

Sales Forecasting Model with Amazon Forecast

Summary

Exam Readiness Drill – Chapter Review Questions

9 Amazon SageMaker Modeling

Technical requirements

Creating notebooks in Amazon SageMaker

What is Amazon SageMaker?

Training Data Location and Formats

Getting hands-on with Amazon SageMaker notebook instances

Getting hands-on with Amazon SageMaker’s training and inference instances

Model tuning

Tracking your training jobs and selecting the best model

Choosing instance types in Amazon SageMaker

Choosing the right instance type for a training job

Choosing the right instance type for an inference job

Taking care of Scalability Configurations

Scaling Policy Overview

Scale Based on a Schedule

Minimum and Maximum Scaling Limits

Cooldown Period

Securing SageMaker notebooks

SageMaker Debugger

SageMaker Autopilot

SageMaker Model Monitor

SageMaker Training Compiler

SageMaker Data Wrangler

SageMaker Feature Store

SageMaker Edge Manager

SageMaker Canvas

Summary

Exam Readiness Drill – Chapter Review Questions

10 Model Deployment

Factors influencing model deployment options

SageMaker deployment options

Real-time endpoint deployment

Batch transform job

Multi-model endpoint deployment

Endpoint autoscaling

Serverless APIs with AWS Lambda and SageMaker

Creating alternative pipelines with Lambda Functions

Creating and configuring a Lambda Function

Completing your configurations and deploying a Lambda function

Working with step functions

Scaling applications with SageMaker deployment and AWS Autoscaling

Scenario 1 – Fluctuating inference workloads

Scenario 2 – The batch processing of large datasets

Scenario 3 – A multi-model endpoint with dynamic traffic

Scenario 4 – Continuous Model Monitoring with drift detection

Securing SageMaker applications

Summary

Exam Readiness Drill – Chapter Review Questions

11 Accessing the Online Practice Resources

Other Books You May Enjoy

Share Your Thoughts

Once you’ve read AWS Certified Machine Learning - Specialty (MLS-C01) Certification Guide, Second Edition, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Download a Free PDF Copy of This Book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere?

Is your eBook purchase not compatible with the device of your choice?

Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.

The perks don’t stop there: you can get exclusive access to discounts, newsletters, and great free content in your inbox daily.

Follow these simple steps to get the benefits:

Scan the QR code or visit the link below:

https://packt.link/free-ebook/9781835082201

Submit your proof of purchase.

That’s it! We’ll send your free PDF and other benefits directly to your email.

1 Machine Learning Fundamentals

For many decades, researchers have been trying to simulate human brain activity through the field known as artificial intelligence, or AI for short. In 1956, a group of people met at the Dartmouth Summer Research Project on Artificial Intelligence, an event that is widely accepted as the first group discussion about AI as it’s known today. Researchers were trying to prove that many aspects of the learning process could be precisely described and, therefore, automated and replicated by a machine. Today, you know they were right!

Many other terms have appeared in this field, such as machine learning (ML) and deep learning (DL). These sub-areas of AI have also been evolving for many decades (granted, nothing here is new to the science). However, with the natural advance of the information society and, more recently, the advent of big data platforms, AI applications have been reborn with much more power and applicability: power because there are now more computational resources to simulate and implement them, and applicability because information is now everywhere.

Even more recently, cloud service providers have put AI in the cloud. This helps companies of all sizes reduce their operational costs and even lets them experiment with AI applications, since maintaining its own data center to scale an AI application could be too costly for a small company.

An incredible journey of building cutting-edge AI applications has emerged with the popularization of big data and cloud services. In November 2022, one specific technology gained significant attention and put AI on the list of the most discussed topics across the technology industry: ChatGPT.

ChatGPT is a popular AI application that uses large language models (more specifically, generative pre-trained transformers) trained on massive amounts of text data to understand and generate human-like language. These models are designed to process and comprehend the complexities of human language, including grammar, context, and semantics.

Large language models use DL techniques (for example, deep neural networks based on the transformer architecture) to learn patterns and relationships within textual data. They consist of billions of parameters, making them highly complex and capable of capturing very specific language structures.

This mix of terms and different classes of use cases can make it hard to understand the practical steps of implementing AI applications. That brings you to the goal of this chapter: being able to describe what the terms AI, ML, and DL mean, as well as understanding all the nuances of an ML pipeline. Avoiding confusion about these terms and knowing exactly what an ML pipeline is will allow you to properly select your services, develop your applications, and master the AWS Machine Learning Specialty exam.

Making The Most Out of This Book – Your Certification and Beyond

This book and its accompanying online resources are designed to be a complete preparation tool for your MLS-C01 Exam.

The book is written in a way that you can apply everything you’ve learned here even after your certification. The online practice resources that come with this book (Figure 1.1) are designed to improve your test-taking skills. They are loaded with timed mock exams, interactive flashcards, and exam tips to help you work on your exam readiness from now till your test day.

Before You Proceed

To learn how to access these resources, head over to Chapter 11, Accessing the Online Practice Resources, at the end of the book.

Figure 1.1 – Dashboard interface of the online practice resources

Here are some tips on how to make the most out of this book so that you can clear your certification and retain your knowledge beyond your exam:

Read each section thoroughly.

Make ample notes: You can use your favorite online note-taking tool or use a physical notebook. The free online resources also give you access to an online version of this book. Click the BACK TO THE BOOK link from the Dashboard to access the book in Packt Reader. You can highlight specific sections of the book there.

Chapter Review Questions: At the end of this chapter, you’ll find a link to review questions for this chapter. These are designed to test your knowledge of the chapter. Aim to score at least 75% before moving on to the next chapter. You’ll find detailed instructions on how to make the most of these questions at the end of this chapter in the Exam Readiness Drill - Chapter Review Questions section. That way, you’re improving your exam-taking skills after each chapter, rather than at the end.

Flashcards: After you’ve gone through the book and scored 75% or more in each of the chapter review questions, start reviewing the online flashcards. They will help you memorize key concepts.

Mock Exams: Solve the mock exams that come with the book till your exam day. If you get some answers wrong, go back to the book and revisit the concepts you’re weak in.

Exam Tips: Review these from time to time to improve your exam readiness even further.

The main topics of this chapter are as follows:

Comparing AI, ML, and DL

Classifying supervised, unsupervised, and reinforcement learning

The CRISP-DM modeling life cycle

Data splitting

Modeling expectations

Introducing ML frameworks

ML in the cloud

Comparing AI, ML, and DL

AI is a broad field that studies different ways to create systems and machines that solve problems by simulating human intelligence. These programs and machines vary in sophistication, from simple rule-based engines to complex self-learning systems. AI covers, but is not limited to, the following sub-areas:

Robotics

Natural language processing (NLP)

Rule-based systems

Machine learning (ML)

Computer vision

The area this certification exam focuses on is ML.

Examining ML

ML is a sub-area of AI that aims to create systems and machines that can learn from experience, without being explicitly programmed. As the name suggests, the system can observe its underlying environment, learn, and adapt itself without human intervention. Algorithms behind ML systems usually extract and improve knowledge from the data and conditions that are available to them.
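To make the contrast with explicit programming concrete, here is a minimal sketch in plain Python. It is only an illustration: the transaction amounts, the labels, the 1000.0 cut-off, and the function names are all invented for this example, and the "learning" step is a deliberately simple threshold search rather than any particular algorithm covered in this book.

```python
# Hypothetical toy data: (transaction amount, is_fraud) pairs invented for illustration.
examples = [(12.0, 0), (25.0, 0), (40.0, 0), (310.0, 1), (450.0, 1), (520.0, 1)]


def rule_based(amount):
    """Explicitly programmed: a person chose the 1000.0 cut-off by hand."""
    return 1 if amount > 1000.0 else 0


def learn_threshold(data):
    """Learn the cut-off from data: pick the one with the fewest training errors."""
    amounts = sorted(a for a, _ in data)
    # Candidate cut-offs are the midpoints between consecutive amounts.
    candidates = [(lo + hi) / 2 for lo, hi in zip(amounts, amounts[1:])]

    def errors(t):
        # Count examples whose predicted label differs from the true label.
        return sum((1 if a > t else 0) != y for a, y in data)

    return min(candidates, key=errors)


threshold = learn_threshold(examples)


def learned(amount):
    """Same decision shape as rule_based, but the cut-off came from the data."""
    return 1 if amount > threshold else 0
```

The two functions have the same form; the difference is where the cut-off comes from. If the data changes, rerunning learn_threshold adapts the behavior without anyone editing the rule by hand.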

Figure 1.2 – Hierarchy of AI, ML, and DL

You should keep in mind that there are different classes of ML algorithms, such as decision tree-based models, probabilistic models, and neural network models. Each of these classes might contain dozens of specific algorithms or architectures (some of them will be covered in later sections of this book).

As you might have noticed in Figure 1.2, you can be even more specific and break the ML field down into another very important topic for the Machine Learning Specialty exam: deep learning, or DL for short.

Examining DL

DL is a subset of ML that uses algorithms built from multiple connected layers to solve a particular problem. Information is passed through the network layer by layer, with each layer transforming the output of the previous one, until the final layer produces the result. The most common type of DL algorithm is the deep neural network.
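The layer-by-layer idea can be sketched in a few lines of plain Python. This is an illustrative forward pass only: the weights below are hand-picked, hypothetical values (a real network would learn them during training), and the layer sizes, the ReLU and sigmoid activations, and the function names are assumptions made for this example.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per neuron, then a ReLU activation."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical, hand-picked parameters; training would normally set these.
hidden1_w = [[0.5, -0.2], [0.1, 0.8]]
hidden1_b = [0.0, 0.1]
hidden2_w = [[1.0, -1.0], [0.3, 0.3]]
hidden2_b = [0.0, 0.0]
output_w = [0.7, -0.4]
output_b = 0.05


def forward(features):
    """Pass the input through two hidden layers, then a single output neuron."""
    h1 = dense(features, hidden1_w, hidden1_b)  # first hidden layer
    h2 = dense(h1, hidden2_w, hidden2_b)        # second layer consumes the first's output
    z = sum(w * h for w, h in zip(output_w, h2)) + output_b
    return sigmoid(z)                           # final score between 0 and 1


score = forward([1.0, 2.0])
```

Each layer only sees the output of the layer before it, which is what "deep" refers to: stacking more such layers lets the network build up increasingly abstract representations of the input.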

At the time of writing this book, DL is a very hot topic in the field of ML. Most of the current state-of-the-art algorithms for machine translation, image captioning, and computer vision were proposed in the past few years and are a part of the DL field (GPT-4, used by the ChatGPT application, is one of these algorithms).

Now that you have an overview of types of AI, take a look at some of the ways you can classify ML.

Classifying supervised, unsupervised, and reinforcement learning

ML is a very extensive field of study; that’s why it is very important to have a clear definition of its sub-divisions. From a very broad perspective, you can split ML algorithms into two main classes: supervised learning and unsupervised learning.

Introducing supervised learning

Supervised algorithms use a class or label (from the input data) as support to find and validate the optimal solution. Table 1.1 shows a dataset used to classify fraudulent transactions at a financial company.

Day of the week

Hour

Transaction amount

Merchant type

Is fraud?

Mon