Automated Machine Learning with AutoKeras - Luis Sobrecueva - E-Book

Description

AutoKeras is an AutoML open-source software library that provides easy access to deep learning models. If you are looking to build deep learning model architectures and perform parameter tuning automatically using AutoKeras, then this book is for you.
This book teaches you how to develop and use state-of-the-art AI algorithms in your projects. It begins with a high-level introduction to automated machine learning, explaining all the concepts required to get started with this machine learning approach. You will then learn how to use AutoKeras for image and text classification and regression. As you make progress, you'll discover how to use AutoKeras to perform sentiment analysis on documents. This book will also show you how to implement a custom model for topic classification with AutoKeras. Toward the end, you will explore advanced concepts of AutoKeras such as working with multi-modal data and multi-task, customizing the model with AutoModel, and visualizing experiment results using AutoKeras Extensions.
By the end of this machine learning book, you will be able to confidently use AutoKeras to design your own custom machine learning models in your company.

Number of pages: 163

Year of publication: 2021




Automated Machine Learning with AutoKeras

Deep learning made accessible for everyone with just few lines of coding

Luis Sobrecueva

BIRMINGHAM—MUMBAI

Copyright © 2021 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Kunal Parikh

Publishing Product Manager: Reshma Raman

Senior Editor: Mohammed Yusuf Imaratwale

Content Development Editor: Sean Lobo

Technical Editor: Sonam Pandey

Copy Editor: Safis Editing

Project Coordinator: Aparna Ravikumar Nair

Proofreader: Safis Editing

Indexer: Rekha Nair

Production Designer: Prashant Ghare

First published: May 2021

Production reference: 1210421

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80056-764-1

www.packt.com

Contributors

About the author

Luis Sobrecueva is a senior software engineer and ML/DL practitioner currently working at Cabify. He has been a contributor to the OpenAI project as well as one of the contributors to the AutoKeras project.

About the reviewers

Satya Kesav is a computer science graduate, machine learning enthusiast, and software engineer interested in building end-to-end machine learning products at scale. He has 2+ years of experience in this field, having worked on interesting products including Google Search and YouTube, as well as for an NLP-based start-up. He was an early contributor to the AutoKeras deep learning library, which now collaborates with Google Brain. He has published four papers and two patents in his career, working in a multitude of fields in computer science.

Anton Hromadskyi has designed data schemas for multiple projects, configured migration/ETL, written many algorithms for data preparation and feature engineering, developed and integrated BI, and implemented prediction models, trading bots, and data processors for a marketing platform. He has applied decision trees, regression, neural networks, anomaly detection, PCA, and ICA, and has developed ensembles of stacked models for AI solutions, along with a state-action model for a chatbot. He once took over an undocumented legacy AI project two weeks before delivery, and it proved to be a success. Special thanks to Aparna for being patient.

Table of Contents

Preface

Section 1: AutoML Fundamentals

Chapter 1: Introduction to Automated Machine Learning

The anatomy of a standard ML workflow

Data ingestion

Data preprocessing

Model deployment

Model monitoring

What is AutoML?

Differences from the standard approach

Types of AutoML

Automated feature engineering

Automated model choosing and hyperparameter optimization

Automated neural network architecture selection

Summary

Further reading

Chapter 2: Getting Started with AutoKeras

Technical requirements

What is deep learning?

What is a neural network and how does it learn?

How do deep learning models learn?

Why AutoKeras?

How to run the AutoKeras experiments?

Installing AutoKeras

Installing AutoKeras in the cloud

Installing AutoKeras locally

Hello MNIST: Implementing our first AutoKeras experiment

Importing the needed packages

Getting the MNIST dataset

How are the digits distributed?

Creating an image classifier

Evaluating the model with the test set

Visualizing the model

Creating an image regressor

Evaluating the model with the test set

Visualizing the model

Summary

Chapter 3: Automating the Machine Learning Pipeline with AutoKeras

Understanding tensors

What is a tensor?

Types of tensors

Preparing the data to feed deep learning models

Data preprocessing operations for neural network models

Loading data into AutoKeras in multiple formats

Splitting your dataset for training and evaluation

Why you should split your dataset

How to split your dataset

Summary

Section 2: AutoKeras in Practice

Chapter 4: Image Classification and Regression Using AutoKeras

Technical requirements

Surpassing classical neural networks

Creating a CIFAR-10 image classifier

Creating and fine-tuning a powerful image classifier

Improving the model performance

Evaluating the model with the test set

Visualizing the model

Creating an image regressor to find out the age of people

Creating and fine-tuning a powerful image regressor

Improving the model performance

Evaluating the model with the test set

Visualizing the model

Summary

Chapter 5: Text Classification and Regression Using AutoKeras

Technical requirements

Working with text data

Tokenization

Vectorization

Understanding RNNs

One-dimensional CNNs (Conv1D)

Creating an email spam detector

Creating the spam predictor

Evaluating the model

Visualizing the model

Predicting news popularity in social media

Creating a text regressor

Evaluating the model

Visualizing the model

Improving the model performance

Evaluating the model with the test set

Summary

Chapter 6: Working with Structured Data Using AutoKeras

Technical requirements

Understanding structured data

Working with structured data

Creating a structured data classifier to predict Titanic survivors

Creating the classifier

Evaluating the model

Visualizing the model

Creating a structured data regressor to predict Boston house prices

Creating a structured data regressor

Evaluating the model

Visualizing the model

Summary

Chapter 7: Sentiment Analysis Using AutoKeras

Technical requirements

Creating a sentiment analyzer

Creating the sentiment predictor

Evaluating the model

Visualizing the model

Analyzing the sentiment in specific sentences

Summary

Chapter 8: Topic Classification Using AutoKeras

Technical requirements

Understanding topic classification

Creating a news topic classifier

Creating the classifier

Evaluating the model

Visualizing the model

Evaluating the model

Customizing the model search space

Summary

Section 3: Advanced AutoKeras

Chapter 9: Working with Multimodal and Multitasking Data

Technical requirements

Exploring models with multiple inputs or outputs

What is AutoModel?

What is multimodal?

What is multitask?

Creating a multitask/multimodal model

Creating the model

Visualizing the model

Customizing the search space

Summary

Chapter 10: Exporting and Visualizing the Models

Technical requirements

Exporting your models

How to save and load a model

Visualizing your models with TensorBoard

Using callbacks to log the model state

Setting up and loading TensorBoard

Sharing your ML experiment results with TensorBoard.dev

Visualizing and comparing your models with ClearML

Adding ClearML to code

Comparing experiments

Summary

A final few words

Why subscribe?

Other Books You May Enjoy

Preface

Can deep learning be accessible to everyone? Without a doubt, this is the objective that the cloud services offered by giants such as Google or Amazon are trying to achieve. Google AutoML and Amazon ML services are cloud-based services that make it easy for developers of all skill levels to use machine learning technology. AutoKeras is the free open source alternative and, as we'll see soon, a fantastic framework.

When faced with a deep learning problem, the choice of an architecture or the configuration of certain parameters when creating a model usually comes from the intuition of the data scientist, based on years of study and experience.

In my case, being a software engineer without a broad background in data science, I have always looked for methods to automate this part, using different search algorithms (grid, evolutionary, or Bayesian) to explore the different variables that make up a model.

Like many other Python developers, I started in the world of machine learning with scikit-learn and then jumped into deep learning projects with TensorFlow and Keras. I tested different frameworks, such as Hyperas and TPOT, to automate model generation, and even developed one of my own to explore architectures in my Keras models. But once AutoKeras was released, I found everything I needed, and since then I've been using it and contributing to the project.

AutoKeras has a large community that grows day by day and is supported by the widely known deep learning framework Keras. Apart from its documentation and the occasional blog article, however, almost no books have been written about it to date; this book tries to fill that gap.

Both the book and the framework are aimed at a broad spectrum of ML professionals, from beginners looking for an alternative to cloud services (using it as a black box simply by defining its inputs and outputs) to seasoned data scientists who want to automate exploration by defining search space parameters in detail and exporting generated models to Keras for manual fine-tuning. If you are among the former, some of these terms and concepts may sound strange to you, but do not worry: we will explain them in detail throughout the book.

Who this book is for

This book is for machine learning and deep learning enthusiasts who want to apply automated ML techniques to their projects. Prior basic knowledge of Python programming is required in order to get the most out of this book.

What this book covers

Chapter 1, Introduction to Automated Machine Learning, covers the main concepts of automated machine learning with an overview of the types of AutoML methods and its software systems.

Chapter 2, Getting Started with AutoKeras, covers everything you need in order to get started with AutoKeras and put it into practice with the help of a foundational, well-explained code example.

Chapter 3, Automating the Machine Learning Pipeline with AutoKeras, explains the standard machine learning pipeline, explains how to automate such a pipeline with AutoKeras, and describes the main data preparation best practices to apply before training a model.

Chapter 4, Image Classification and Regression Using AutoKeras, focuses on the use of AutoKeras applied to images by creating more complex and powerful image recognizers, examining how they work, and seeing how to fine-tune them to improve their performance.

Chapter 5, Text Classification and Regression Using AutoKeras, focuses on the use of AutoKeras to work with text (sequences of words). This chapter also explains what recurrent neural networks are and how they work.

Chapter 6, Working with Structured Data Using AutoKeras, enables you to explore a structured dataset, transform it, and use it as a data source for specific models, as well as create your own classification and regression models to solve tasks based on structured data.

Chapter 7, Sentiment Analysis Using AutoKeras, uses a text classifier to extract sentiments from text data and applies the concepts of text classification in a practical way by implementing the sentiment predictor.

Chapter 8, Topic Classification Using AutoKeras, focuses on the practical aspects of the text-based tasks learned in the previous chapters. It teaches you how to create a topic classifier with AutoKeras and then apply it to any topic or category-based dataset.

Chapter 9, Working with Multimodal and Multitasking Data, covers the use of the AutoModel API to show how to handle multimodal and multitasking data.

Chapter 10, Exporting and Visualizing the Models, teaches you to export and import AutoKeras models and visualize graphically, as well as in real time, what is happening during the training of our models.

To get the most out of this book

If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Automated-Machine-Learning-with-AutoKeras. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781800567641_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Mount the downloaded WebStorm-10*.dmg disk image file as another disk in your system."

A block of code is set as follows:

import autokeras as ak
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

[default]
exten => s,1,Dial(Zap/1|30)
exten => s,2,Voicemail(u100)
exten => s,102,Voicemail(b100)
exten => i,1,Voicemail(s0)

Any command-line input or output is written as follows:

$ mkdir css
$ cd css

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "a train dataset for training the model and a test dataset for testing the prediction modeling."

Note

A notebook is a file generated by Jupyter Notebook (https://jupyter.org), an open source framework for creating and sharing documents that incorporate live code, visualizations, and rich text. Both the editing and the execution are done in a web browser, by adding snippets (called cells) of code and rich text that show clearly and visually what is being programmed. Each of these code cells can be run independently, which makes development interactive and avoids having to rerun all your code if there is an error.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at customercare@packtpub.com.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.

Section 1: AutoML Fundamentals

This section is a high-level introduction to automated machine learning, explaining all the notions required to get started with this machine learning approach.

This section comprises the following chapters:

Chapter 1, Introduction to Automated Machine Learning

Chapter 2, Getting Started with AutoKeras

Chapter 3, Automating the Machine Learning Pipeline with AutoKeras

Chapter 1: Introduction to Automated Machine Learning

In this chapter, we cover the main concepts relating to Automated Machine Learning (AutoML) with an overview of the types of AutoML methods and its software systems.

If you are a developer working with AutoML, you will be able to put your knowledge to work with this practical guide to develop and use state-of-the-art AI algorithms in your projects. By the end of this chapter, you will have a clear understanding of the anatomy of the Machine Learning (ML) workflow, what AutoML is, and its different types.

Through clear explanations of essential concepts and practical examples, you will see the differences between the standard ML and the AutoML approaches and the pros and cons of each.

In this chapter, we're going to cover the following main topics:

The anatomy of a standard ML workflow

What is AutoML?

Types of AutoML

The anatomy of a standard ML workflow

In a traditional ML application, professionals have to train a model using a set of input data. If this data is not in the proper form, an expert may have to apply some data preprocessing techniques, such as feature extraction, feature engineering, or feature selection.

Once the data is ready and the model can be trained, the next step is to select the right algorithm and optimize the hyperparameters to maximize the accuracy of the model's predictions. Each step involves time-consuming challenges, and typically also requires a data scientist with the experience and knowledge to be successful. In the following figure, we can see the main steps represented in a typical ML pipeline:

Figure 1.1 – ML pipeline steps

Each of these pipeline processes involves a series of steps. In the following sections, we describe each process and related concepts in more detail.
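To make the algorithm-selection and hyperparameter-optimization step described above more concrete, here is a minimal sketch of a manual grid search. It is not taken from the book: the toy one-dimensional dataset, the tiny k-nearest-neighbours classifier, and the names knn_predict, accuracy, and grid_search are all assumptions made for illustration, using only the Python standard library.

```python
import random


def knn_predict(train, point, k):
    """Predict a label by majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda row: abs(row[0] - point))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def accuracy(train, valid, k):
    """Fraction of validation samples that are classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)


def grid_search(train, valid, grid):
    """Score every candidate value of k and return the best one with all scores."""
    scores = {k: accuracy(train, valid, k) for k in grid}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Hypothetical toy data: two overlapping 1-D clusters labelled 0 and 1.
rng = random.Random(0)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(50)]
data += [(rng.gauss(3.0, 1.0), 1) for _ in range(50)]
rng.shuffle(data)
train, valid = data[:70], data[70:]

# Odd values of k avoid tied votes with two classes.
best_k, scores = grid_search(train, valid, grid=[1, 3, 5, 7])
print("best k:", best_k)
print("validation accuracy per k:", scores)
```

Even in this tiny case, every candidate has to be trained and scored by hand, which is the repetitive part of the workflow that AutoML tools aim to automate.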

Data ingestion

Piping incoming data to a data store is the first step in any ML workflow. The target here is to store that raw data without doing any transformation, to allow us to have an immutable record of the original dataset. The data can be obtained from various data sources, such as databases, message buses, streams, and so on.
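As a rough illustration of keeping an immutable record of the raw data, the following sketch stores each incoming payload unchanged in a file named after its SHA-256 digest. The storage layout and the helper names ingest and verify are assumptions made for this example, not something the book prescribes; a real pipeline might use a database or an object store instead.

```python
import hashlib
import tempfile
from pathlib import Path


def ingest(raw_bytes: bytes, store_dir: Path) -> Path:
    """Write a raw payload unchanged, in a file named after its SHA-256 digest."""
    digest = hashlib.sha256(raw_bytes).hexdigest()
    target = store_dir / f"{digest}.raw"
    if not target.exists():  # never overwrite an existing record
        target.write_bytes(raw_bytes)
    return target


def verify(path: Path) -> bool:
    """Check that the stored bytes still match the digest in the file name."""
    return hashlib.sha256(path.read_bytes()).hexdigest() == path.stem


# Hypothetical payload; it is stored as-is, stray whitespace included.
store = Path(tempfile.mkdtemp())
payload = b'{"sensor": "a1", "value": " 42 "}\n'
stored = ingest(payload, store)

print(stored.name)
print("intact:", verify(stored))
```

Because the file name is derived from the content, ingesting the same payload twice yields the same record, and any later modification of the file can be detected with verify.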

Data preprocessing

The second phase, data preprocessing, is one of the most time-consuming tasks in the pipeline and involves many sub-tasks, such as data cleaning, feature extraction, feature selection, feature engineering, and data segregation. Let's take a closer look at each one:

Data cleaning is the process of detecting and fixing (or deleting) corrupt or wrong records in a dataset. Because raw data is unprocessed and often unstructured, it is rarely in the correct form to be processed; cleaning it involves filling in missing fields, removing duplicate rows, and normalizing or fixing other errors in the data.

Feature extraction is a procedure for reducing the resources required to process a large dataset by creating new features from combinations of existing ones (and eliminating the originals). The main problem when analyzing large datasets is the number of variables to take into account. Processing a large number of variables generally requires a lot of hardware resources, such as memory and computing power, and can also cause overfitting, which means that the algorithm works very well on training samples but generalizes poorly to new samples. Feature extraction constructs new variables by combining existing ones to solve these problems without losing precision in the data.

Feature selection is the process of selecting a subset of variables to use in building the model. Performing feature selection simplifies the model (making it more interpretable for humans), reduces training times, and improves generalization by reducing overfitting. The main reason to apply feature selection methods is that the data contains some features that are redundant or irrelevant, so removing them does not incur much loss of information.

Feature engineering is the process of using domain knowledge and data mining techniques to extract features from raw data. This typically requires a knowledgeable expert and is used to improve the performance of ML algorithms.
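Two of the preprocessing sub-tasks above, data cleaning and feature selection, can be sketched in a few lines of standard-library Python. The records, the column names, and the zero-variance selection rule are assumptions chosen for illustration; real projects usually rely on dedicated libraries and richer criteria.

```python
from statistics import mean, pvariance

# Hypothetical raw records: one missing value, one duplicate row,
# and one column (country_code) that never changes.
rows = [
    {"age": 34.0, "income": 52000.0, "country_code": 1.0},
    {"age": None, "income": 48000.0, "country_code": 1.0},
    {"age": 34.0, "income": 52000.0, "country_code": 1.0},
    {"age": 51.0, "income": 61000.0, "country_code": 1.0},
]


def clean(rows):
    """Drop exact duplicate rows, then fill missing values with the column mean."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row.items())
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))
    for name in unique[0]:
        present = [r[name] for r in unique if r[name] is not None]
        fill = mean(present)
        for r in unique:
            if r[name] is None:
                r[name] = fill
    return unique


def select_features(rows, min_variance=0.0):
    """Keep only the columns whose variance exceeds min_variance."""
    keep = [n for n in rows[0] if pvariance([r[n] for r in rows]) > min_variance]
    return [{n: r[n] for n in keep} for r in rows]


cleaned = clean(rows)
selected = select_features(cleaned)
print(cleaned)
print(selected)
```

Here the duplicate row is removed, the missing age is replaced by the mean of the remaining ages, and the constant country_code column is dropped because it carries no information for a model.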

Data segregation consists of dividing the dataset into two subsets: a train dataset for training the model and a test dataset for


