E-Book
27,59 €

Machine Learning Fundamentals E-Book

Hyatt Saleh

Publisher: Packt Publishing
Category: Technical literature
Language: English

Description

As machine learning algorithms become popular, new tools that optimize these algorithms are also developed. Machine Learning Fundamentals explains how to use the syntax of scikit-learn. You'll study the difference between supervised and unsupervised models, as well as the importance of choosing the appropriate algorithm for each dataset. You'll apply unsupervised clustering algorithms to real-world datasets to discover patterns and profiles, and explore the process of solving an unsupervised machine learning problem.

The focus of the book then shifts to supervised learning algorithms. You'll learn to implement different supervised algorithms and develop neural network structures using the scikit-learn package. You'll also learn how to perform coherent result analysis to improve the performance of the algorithm by tuning hyperparameters.

By the end of this book, you will have gained all the skills required to start programming machine learning algorithms.

Details

You can read this e-book in Legimi apps or in any app that supports one of the following formats:

EPUB

MOBI

Page count: 254

Year of publication: 2018

Sample

Machine Learning Fundamentals

Use Python and scikit-learn to get up and running with the hottest developments in machine learning

Hyatt Saleh

Machine Learning Fundamentals

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Author: Hyatt Saleh

Managing Editor: Neha Nair

Acquisitions Editor: Aditya Date

Production Editor: Samita Warang

Editorial Board: David Barnes, Ewan Buckingham, Simon Cox, Manasa Kumar, Alex Mazonowicz, Douglas Paterson, Dominic Pereira, Shiny Poojary, Saman Siddiqui, Erol Staveley, Ankita Thakur, and Mohita Vyas

First Published: November 2018

Production Reference: 1291118

ISBN: 978-1-78980-355-6

Table of Contents

Preface

Introduction to Scikit-Learn

Introduction

Scikit-Learn

Advantages of Scikit-Learn

Disadvantages of Scikit-Learn

Data Representation

Tables of Data

Features and Target Matrices

Exercise 1: Loading a Sample Dataset and Creating the Features and Target Matrices

Activity 1: Selecting a Target Feature and Creating a Target Matrix

Data Preprocessing

Messy Data

Exercise 2: Dealing with Messy Data

Dealing with Categorical Features

Exercise 3: Applying Feature Engineering over Text Data

Rescaling Data

Exercise 4: Normalizing and Standardizing Data

Activity 2: Preprocessing an Entire Dataset

Scikit-Learn API

How Does It Work?

Supervised and Unsupervised Learning

Supervised Learning

Unsupervised Learning

Summary

Unsupervised Learning: Real-Life Applications

Introduction

Clustering

Clustering Types

Applications of Clustering

Exploring a Dataset: Wholesale Customers Dataset

Understanding the Dataset

Data Visualization

Loading the Dataset Using Pandas

Visualization Tools

Exercise 5: Plotting a Histogram of One Feature from the Noisy Circles Dataset

Activity 3: Using Data Visualization to Aid the Preprocessing Process

k-means Algorithm

Understanding the Algorithm

Exercise 6: Importing and Training the k-means Algorithm over a Dataset

Activity 4: Applying the k-means Algorithm to a Dataset

Mean-Shift Algorithm

Understanding the Algorithm

Exercise 7: Importing and Training the Mean-Shift Algorithm over a Dataset

Activity 5: Applying the Mean-Shift Algorithm to a Dataset

DBSCAN Algorithm

Understanding the Algorithm

Exercise 8: Importing and Training the DBSCAN Algorithm over a Dataset

Activity 6: Applying the DBSCAN Algorithm to the Dataset

Evaluating the Performance of Clusters

Available Metrics in Scikit-Learn

Exercise 9: Evaluating the Silhouette Coefficient Score and Calinski–Harabasz Index

Activity 7: Measuring and Comparing the Performance of the Algorithms

Summary

Supervised Learning: Key Steps

Introduction

Model Validation and Testing

Data Partition

Split Ratio

Exercise 10: Performing Data Partition over a Sample Dataset

Cross Validation

Exercise 11: Using Cross-Validation to Partition the Train Set into a Training and a Validation Set

Activity 8: Data Partition over a Handwritten Digit Dataset

Evaluation Metrics

Evaluation Metrics for Classification Tasks

Exercise 12: Calculating Different Evaluation Metrics over a Classification Task

Choosing an Evaluation Metric

Evaluation Metrics for Regression Tasks

Exercise 13: Calculating Evaluation Metrics over a Regression Task

Activity 9: Evaluating the Performance of the Model Trained over a Handwritten Dataset

Error Analysis

Bias, Variance, and Data Mismatch

Exercise 14: Calculating the Error Rate over Different Sets of Data

Activity 10: Performing Error Analysis over a Model Trained to Recognize Handwritten Digits

Summary

Supervised Learning Algorithms: Predict Annual Income

Introduction

Exploring the Dataset

Understanding the Dataset

Naïve Bayes Algorithm

How Does It Work?

Exercise 15: Applying the Naïve Bayes Algorithm

Activity 11: Training a Naïve Bayes Model for Our Census Income Dataset

Decision Tree Algorithm

How Does It Work?

Exercise 16: Applying the Decision Tree Algorithm

Activity 12: Training a Decision Tree Model for Our Census Income Dataset

Support Vector Machine Algorithm

How Does It Work?

Exercise 17: Applying the SVM Algorithm

Activity 13: Training an SVM Model for Our Census Income Dataset

Error Analysis

Accuracy, Precision, and Recall

Summary

Artificial Neural Networks: Predict Annual Income

Introduction

Artificial Neural Networks

How Do They Work?

Understanding the Hyperparameters

Applications

Limitations

Applying an Artificial Neural Network

Scikit-Learn's Multilayer Perceptron

Exercise 18: Applying the Multilayer Perceptron Classifier Class

Activity 14: Training a Multilayer Perceptron for Our Census Income Dataset

Performance Analysis

Error Analysis

Hyperparameter Fine-Tuning

Model Comparison

Activity 15: Comparing Different Models to Choose the Best Fit for the Census Income Data Problem

Summary

Building Your Own Program

Introduction

Program Definition

Building a Program: Key Stages

Understanding the Dataset

Activity 16: Performing the Preparation and Creation Stages for the Bank Marketing Dataset

Saving and Loading a Trained Model

Saving a Model

Exercise 19: Saving a Trained Model

Loading a Model

Exercise 20: Loading a Saved Model

Activity 17: Saving and Loading the Final Model for the Bank Marketing Dataset

Interacting with a Trained Model

Exercise 21: Creating a Class and a Channel to Interact with a Trained Model

Activity 18: Allowing Interaction with the Bank Marketing Dataset Model

Summary

Appendix

Preface

About

This section briefly introduces the author, the coverage of this book, the technical skills you'll need to get started, and the hardware and software required to complete all of the included activities and exercises.

Introduction to Scikit-Learn

Learning Objectives

By the end of this chapter, you will be able to:

Describe scikit-learn and its main advantages

Use the scikit-learn API

Perform data preprocessing

Explain the difference between supervised and unsupervised models, as well as the importance of choosing the right algorithm for each dataset

This chapter explains the scikit-learn syntax and features needed to process and visualize data.
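As a rough illustration of the data preprocessing this chapter covers, the sketch below shows the two rescaling operations named in the table of contents, normalizing and standardizing, in plain Python. It is not taken from the book: the helper names `normalize` and `standardize` and the `ages` column are made up for this example. scikit-learn offers the same operations through its `MinMaxScaler` and `StandardScaler` classes.

```python
from statistics import mean, pstdev


def normalize(values):
    """Min-max normalization: rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Standardization: shift to mean 0 and scale to unit population std."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A constant column has zero deviation; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column, invented for illustration only.
ages = [20, 30, 40, 50, 60]

normalized = normalize(ages)
standardized = standardize(ages)

print(normalized)
print(standardized)
```

Normalization keeps the original shape of the distribution but bounds it between 0 and 1, while standardization centers the values on 0 with unit spread. Which one suits a dataset depends on the algorithm that will consume it.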
