Explore a diverse set of meta-learning algorithms and techniques to enable human-like cognition for your machine learning models using various Python frameworks
Key Features
Book Description
Meta learning is an exciting research trend in machine learning, which enables a model to understand the learning process. Unlike other ML paradigms, with meta learning you can learn from small datasets faster.
Hands-On Meta Learning with Python starts by explaining the fundamentals of meta learning and helps you understand the concept of learning to learn. You will delve into various one-shot learning algorithms, like siamese, prototypical, relation, and memory-augmented networks, by implementing them in TensorFlow and Keras. As you make your way through the book, you will dive into state-of-the-art meta learning algorithms such as MAML, Reptile, and CAML. You will then explore how to learn quickly with Meta-SGD and discover how you can perform unsupervised learning using meta learning with CACTUs. In the concluding chapters, you will work through recent trends in meta learning such as adversarial meta learning, task agnostic meta learning, and meta imitation learning.
By the end of this book, you will be familiar with state-of-the-art meta learning algorithms and able to enable human-like cognition for your machine learning models.
What you will learn
Who this book is for
Hands-On Meta Learning with Python is for machine learning enthusiasts, AI researchers, and data scientists who want to explore meta learning as an advanced approach for training machine learning models. Working knowledge of machine learning concepts and Python programming is necessary.
Copyright © 2018 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Pavan Ramchandani
Acquisition Editor: Pavan Ramchandani
Content Development Editor: Chris D'cruz
Technical Editor: Dinesh Pawar
Copy Editor: Safis Editing
Project Coordinator: Namrata Swetta
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Graphics: Tom Scaria
Production Coordinator: Nilesh Mohite
First published: December 2018
Production reference: 1261218
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78953-420-7
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Sudharsan Ravichandiran is a data scientist, researcher, artificial intelligence enthusiast, and YouTuber (search for Sudharsan reinforcement learning). He completed his bachelor's in information technology at Anna University. His area of research focuses on practical implementations of deep learning and reinforcement learning, which includes natural language processing and computer vision. He is an open source contributor and loves answering questions on Stack Overflow. He also authored a best-seller, Hands-On Reinforcement Learning with Python, published by Packt Publishing.
Gautham Krishna Gudur is a machine learning engineer and researcher working on extracting actionable insights in healthcare (medical wearables) using artificial intelligence. He also does independent research at the intersection of applied machine learning/deep learning, physical activity sensing using sensors and wearable data, computer vision, and ubiquitous computing. Previously, he was a research assistant in the areas of gesture recognition, data science, and IoT in Chennai, India. He actively contributes to the research community by authoring and presenting research publications at renowned conferences around the world. During his undergraduate study, he was also an avid competitive programmer on online platforms such as HackerRank.
Armando Fandango creates AI-empowered products by leveraging his expertise in deep learning, machine learning, distributed computing, and computational methods, and has fulfilled thought-leadership roles as Chief Data Scientist and Director at start-ups and large enterprises. He has been advising high-tech AI-based start-ups. Armando has authored the books Python Data Analysis - Second Edition and Mastering TensorFlow. He has also published research in international journals and conferences.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Hands-On Meta Learning with Python
Dedication
About Packt
Why subscribe?
Packt.com
Contributors
About the author
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Conventions used
Get in touch
Reviews
Introduction to Meta Learning
Meta learning
Meta learning and few-shot learning
Types of meta learning
Learning the metric space
Learning the initializations
Learning the optimizer
Learning to learn gradient descent by gradient descent
Optimization as a model for few-shot learning
Summary
Questions
Further reading
Face and Audio Recognition Using Siamese Networks
What are siamese networks?
Architecture of siamese networks
Applications of siamese networks
Face recognition using siamese networks
Building an audio recognition model using siamese networks
Summary
Questions
Further reading
Prototypical Networks and Their Variants
Prototypical networks
Algorithm
Performing classification using prototypical networks
Gaussian prototypical network
Algorithm
Semi-prototypical networks
Summary
Questions
Further reading
Relation and Matching Networks Using TensorFlow
Relation networks
Relation networks in one-shot learning
Relation networks in few-shot learning
Relation networks in zero-shot learning
Loss function
Building relation networks using TensorFlow
Matching networks
Embedding functions
The support set embedding function (g)
The query set embedding function (f)
The architecture of matching networks
Matching networks in TensorFlow
Summary
Questions
Further reading
Memory-Augmented Neural Networks
NTM
Reading and writing in NTM
Read operation
Write operation
Erase operation
Add operation
Addressing mechanisms
Content-based addressing
Location-based addressing
Interpolation
Convolution shift
Sharpening
Copy tasks using NTM
Memory-augmented neural networks (MANN)
Read and write operations
Read operation
Write operation
Summary
Questions
Further reading
MAML and Its Variants
MAML
MAML algorithm
MAML in supervised learning
Building MAML from scratch
Generate data points
Single layer neural network
Training using MAML
MAML in reinforcement learning
Adversarial meta learning
FGSM
ADML
Building ADML from scratch
Generating data points
FGSM
Single layer neural network
Adversarial meta learning
CAML
CAML algorithm
Summary
Questions
Further reading
Meta-SGD and Reptile
Meta-SGD
Meta-SGD for supervised learning
Building Meta-SGD from scratch
Generating data points
Single layer neural network
Meta-SGD
Meta-SGD for reinforcement learning
Reptile
The Reptile algorithm
Sine wave regression using Reptile
Generating data points
Two-layered neural network
Reptile
Summary
Questions
Further reading
Gradient Agreement as an Optimization Objective
Gradient agreement as an optimization objective
Weight calculation
Algorithm
Building gradient agreement algorithm with MAML
Generating data points
Single layer neural network
Gradient agreement in MAML
Summary
Questions
Further reading
Recent Advancements and Next Steps
Task agnostic meta learning (TAML)
Entropy maximization/reduction
Algorithm
Inequality minimization
Inequality measures
Gini coefficient
Theil index
Variance of algorithms
Algorithm
Meta imitation learning
MIL algorithm
CACTUs
Task generation using CACTUs
Learning to learn in concept space
Key components
Concept generator
Concept discriminator
Meta learner
Loss function
Concept discrimination loss
Meta learning loss
Algorithm
Summary
Questions
Further reading
Assessments
Chapter 1: Introduction to Meta Learning
Chapter 2: Face and Audio Recognition Using Siamese Networks
Chapter 3: Prototypical Networks and Their Variants
Chapter 4: Relation and Matching Networks Using TensorFlow
Chapter 5: Memory-Augmented Neural Networks
Chapter 6: MAML and Its Variants
Chapter 7: Meta-SGD and Reptile Algorithms
Chapter 8: Gradient Agreement as an Optimization Objective
Chapter 9: Recent Advancements and Next Steps
Other Books You May Enjoy
Leave a review - let other readers know what you think
Hands-On Meta Learning with Python explains the fundamentals of meta learning and helps you to understand the concept of learning to learn. You will go through various one-shot learning algorithms, such as siamese, prototypical, relation, and memory-augmented networks, and implement them in TensorFlow and Keras. You will also learn about state-of-the-art meta learning algorithms, such as model-agnostic meta learning (MAML), Reptile, and fast context adaptation via meta learning (CAML). You will then explore how to learn quickly with Meta-SGD and discover how to perform unsupervised learning using meta learning.
This book will help machine learning enthusiasts, AI researchers, and data scientists who want to learn about meta learning as an advanced approach for training machine learning models. The book assumes a working knowledge of machine learning concepts and a sound knowledge of Python programming.
Chapter 1, Introduction to Meta Learning, helps us to understand what meta learning is and covers the different types of meta learning. We will also learn how meta learning uses few-shot learning by learning from a few data points. We will then explore learning to learn gradient descent by gradient descent. Later in the chapter, we will look at optimization as a model for the few-shot learning setting.

Chapter 2, Face and Audio Recognition Using Siamese Networks, starts by explaining what siamese networks are and how they are used in the one-shot learning setting. We will look at the architecture of a siamese network and some of its applications. Then, we will see how to use siamese networks to build face and audio recognition models.

Chapter 3, Prototypical Networks and Their Variants, explains what prototypical networks are and how they are used in the few-shot learning scenario. We will see how to build a prototypical network to perform classification on the Omniglot character set. Later in the chapter, we will look at different variants of prototypical networks, such as Gaussian prototypical networks and semi-prototypical networks.
Chapter 4, Relation and Matching Networks Using TensorFlow, helps us to understand the relation network architecture and how relation networks are used in one-shot, few-shot, and zero-shot learning settings. We will then see how to build a relation network using TensorFlow. Next, we will learn about the matching network and its architecture. We will also explore full contextual embeddings and how to build a matching network using TensorFlow.

Chapter 5, Memory-Augmented Neural Networks, covers what neural Turing machines (NTMs) are and how they make use of external memory for storing and retrieving information. We will look at the different addressing mechanisms used in NTMs, and then we will learn about memory-augmented neural networks and how they differ from the NTM architecture.

Chapter 6, MAML and Its Variants, deals with one of the most popular meta learning algorithms, called model-agnostic meta learning (MAML). We will explore what MAML is and how it is used in supervised and reinforcement learning settings. We will also see how to build MAML from scratch. Then, we will learn about adversarial meta learning and CAML, which is used for fast context adaptation in meta learning.

Chapter 7, Meta-SGD and Reptile, explains how Meta-SGD is used to learn all the ingredients of gradient descent algorithms, such as the initial weights, the learning rates, and the update direction. We will see how to build Meta-SGD from scratch. Later in the chapter, we will learn about the Reptile algorithm and see how it serves as an improvement over MAML. We will also see how to use the Reptile algorithm for sine wave regression.

Chapter 8, Gradient Agreement as an Optimization Objective, covers how we can use gradient agreement as an optimization objective in the meta learning setting. We will learn what gradient agreement is and how it can enhance meta learning algorithms. Later in the chapter, we will learn how to build a gradient agreement algorithm from scratch.

Chapter 9, Recent Advancements and Next Steps, starts by explaining task-agnostic meta learning, and then we will see how meta learning is used in an imitation learning setting. Then, we will learn how we can apply MAML in an unsupervised learning setting using the CACTUs algorithm. Finally, we will explore a deep meta learning algorithm called learning to learn in the concept space.
You need the following software for this book:
Python
Anaconda
TensorFlow
Keras
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
1. Log in or register at www.packt.com.
2. Select the SUPPORT tab.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Hands-On-Meta-Learning-with-Python. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "The read_image function takes an image as input and returns a NumPy array."
A block of code is set as follows:
import re
import numpy as np
from PIL import Image
Bold: Indicates a new term, an important word, or words that you see onscreen.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
Meta learning is one of the most promising and trending research areas in the field of artificial intelligence right now. It is believed to be a stepping stone for attaining Artificial General Intelligence (AGI). In this chapter, we will learn what meta learning is and why it is one of the most exhilarating research areas in artificial intelligence right now. We will understand what few-shot, one-shot, and zero-shot learning are and how they are used in meta learning. We will also learn about the different types of meta learning techniques. We will then explore the concept of learning to learn gradient descent by gradient descent, where we understand how to learn the gradient descent optimization process using a meta learner. Going ahead, we will also learn about optimization as a model for few-shot learning, where we will see how a meta learner can be used as an optimization algorithm in the few-shot learning setting.
In this chapter, you will learn about the following:
Meta learning
Meta learning and few-shot learning
Types of meta learning
Learning to learn gradient descent by gradient descent
Optimization as a model for few-shot learning
Meta learning is an exhilarating research domain in the field of AI right now. With plenty of research papers and advancements, meta learning is clearly making a major breakthrough in AI. Before getting into meta learning, let's look at how our current AI models work.
Deep learning has progressed rapidly in recent years, with great algorithms such as generative adversarial networks and capsule networks. But the problem with deep neural networks is that we need a large training set to train our model, and the model fails abruptly when we have very few data points. Let's say we trained a deep learning model to perform task A. Now, when we have a new task, B, that is closely related to A, we can't use the same model. We need to train the model from scratch for task B. So, for each task, we need to train the model from scratch, even though the tasks might be related.
Is deep learning really true AI? Well, it is not. How do we humans learn? We generalize our learning across multiple concepts and learn from there. But current learning algorithms master only one task. Here is where meta learning comes in. Meta learning produces a versatile AI model that can learn to perform various tasks without having to be trained from scratch. We train our meta learning model on various related tasks, each with few data points, so for a new related task, it can make use of the learning obtained from the previous tasks and doesn't have to be trained from scratch. Many researchers and scientists believe that meta learning can get us closer to achieving AGI. We will learn exactly how meta learning models learn the learning process in the upcoming sections.
Learning from a few data points is called few-shot learning or k-shot learning, where k denotes the number of data points in each of the classes in the dataset. Let's say we are performing image classification of dogs and cats. If we have exactly one dog image and one cat image, then it is called one-shot learning; that is, we are learning from just one data point per class. If we have, say, 10 images of a dog and 10 images of a cat, then that is called 10-shot learning. So, k in k-shot learning denotes the number of data points we have per class. There is also zero-shot learning, where we don't have any data points per class. Wait. What? How can we learn when there are no data points at all? In this case, we will not have data points, but we will have meta information about each of the classes, and we will learn from that meta information. Since we have two classes in our dataset, dog and cat, we can call it two-way k-shot learning; n-way denotes the number of classes we have in our dataset.
In order to make our model learn from a few data points, we train it in the same few-shot fashion. So, when we have a dataset, D, we sample a few data points from each of the classes present in our dataset and call it the support set. Similarly, we sample some different data points from each of the classes and call it the query set. We train our model with the support set and test it with the query set. We train our model in an episodic fashion; that is, in each episode, we sample a few data points from our dataset, D, prepare our support set and query set, train on the support set, and test on the query set, as the sketch below shows. So, over a series of episodes, our model will learn how to learn from a smaller dataset. We will explore this in more detail in the upcoming chapters.
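To make the episodic setup concrete, here is a minimal sketch of how one n-way k-shot episode could be sampled. It assumes a hypothetical dataset dictionary that maps each class label to a NumPy array of that class's examples; the function name and data layout are illustrative assumptions, not code from this book:

import numpy as np

def sample_episode(dataset, n_way=2, k_shot=5, query_size=5):
    # Pick n_way classes at random for this episode
    classes = np.random.choice(list(dataset.keys()), n_way, replace=False)
    support_set, query_set = [], []
    for label in classes:
        examples = dataset[label]
        # Draw k_shot + query_size distinct examples from this class
        idx = np.random.choice(len(examples), k_shot + query_size, replace=False)
        support_set.append((examples[idx[:k_shot]], label))  # train on these
        query_set.append((examples[idx[k_shot:]], label))    # test on these
    return support_set, query_set

In each episode, the model would be trained on support_set and evaluated on query_set before the next episode is sampled.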
Meta learning can be categorized in several ways, right from finding the optimal sets of weights to learning the optimizer. We will categorize meta learning into the following three categories:
Learning the metric space
Learning the initializations
Learning the optimizer
In the metric-based meta learning setting, we will learn the appropriate metric space. Let's say we want to learn the similarity between two images. In the metric-based setting, we use a simple neural network that extracts the features from the two images and finds the similarity by computing the distance between the features of the two images. This approach is widely used in the few-shot learning setting, where we don't have many data points. In the upcoming chapters, we will learn about metric-based learning algorithms such as siamese networks, prototypical networks, and relation networks.
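As a rough illustration of the metric-based idea, the following sketch compares two images in a learned feature space. The embed argument is a hypothetical stand-in for a trained embedding network (for example, a small convolutional network); it is an assumption for illustration, not an API from this book:

import numpy as np

def metric_distance(image_a, image_b, embed):
    feature_a = embed(image_a)  # extract features from the first image
    feature_b = embed(image_b)  # extract features from the second image
    # Euclidean distance in the learned metric space:
    # a small distance means the two images are similar
    return np.linalg.norm(feature_a - feature_b)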
In this method, we try to learn optimal initial parameter values. What do we mean by that? Let's say we are building a neural network to classify images. First, we initialize random weights, calculate the loss, and minimize the loss through gradient descent. So, we find the optimal weights through gradient descent and minimize the loss. Instead of initializing the weights randomly, if we can initialize them with optimal or close-to-optimal values, then we can attain convergence faster and learn very quickly. We will see exactly how we can find these optimal initial weights with algorithms such as MAML, Reptile, and Meta-SGD in the upcoming chapters.
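As a toy illustration of why a good initialization helps, the following sketch adapts a simple linear model with a squared loss for a few gradient steps, starting from theta_init (the kind of initialization an algorithm such as MAML would learn). All names here are illustrative assumptions:

import numpy as np

def adapt(theta_init, x, y, lr=0.01, steps=5):
    theta = theta_init.copy()
    for _ in range(steps):
        # Gradient of the mean squared error loss for predictions x.dot(theta)
        grad = 2 * x.T.dot(x.dot(theta) - y) / len(x)
        theta = theta - lr * grad  # one step of gradient descent
    # With a good theta_init, a handful of steps is enough to fit the new task
    return theta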
In this method, we try to learn the optimizer itself. How do we generally optimize our neural network? We optimize it by training on a large dataset and minimizing the loss using gradient descent. But in the few-shot learning setting, gradient descent fails, as we have a smaller dataset. So, in this case, we learn the optimizer itself. We will have two networks: a base network that actually tries to learn, and a meta network that optimizes the base network. We will explore how exactly this works in the upcoming sections.
Now, we will look at one of the interesting meta learning algorithms, called learning to learn gradient descent by gradient descent. Isn't the name kind of daunting? Well, in fact, it is one of the simplest meta learning algorithms. We know that, in meta learning, our goal is to learn the learning process. In general, how do we train our neural networks? We train our network by computing the loss and minimizing it through gradient descent. So, we optimize our model using gradient descent. Instead of using gradient descent, can we learn this optimization process automatically?
But how can we learn this? We replace our traditional optimizer (gradient descent) with a recurrent neural network (RNN).
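The following highly simplified sketch conveys the idea; the meta_network callable stands in for the trained RNN optimizer and is an illustrative assumption, not the book's implementation:

def learned_optimization(theta, compute_gradient, meta_network, state, steps=100):
    for _ in range(steps):
        grad = compute_gradient(theta)
        # The meta network (an RNN) proposes the update and carries its hidden
        # state across steps, instead of a fixed rule such as theta - lr * grad
        update, state = meta_network(grad, state)
        theta = theta + update
    return theta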