Gain expertise in advanced deep learning domains such as neural networks, meta-learning, graph neural networks, and memory augmented neural networks using the Python ecosystem
Book Description
In order to build robust deep learning systems, you'll need to understand everything from how neural networks work to training CNN models. In this book, you'll discover newly developed deep learning models, methodologies used in the domain, and their implementation based on areas of application.
You'll start by understanding the building blocks and the math behind neural networks, and then move on to CNNs and their advanced applications in computer vision. You'll also learn to apply the most popular CNN architectures in object detection and image segmentation. Further on, you'll focus on variational autoencoders and GANs. You'll then use neural networks to extract sophisticated vector representations of words, before going on to cover various types of recurrent networks, such as LSTM and GRU. You'll even explore the attention mechanism to process sequential data without the help of recurrent neural networks (RNNs). Later, you'll use graph neural networks for processing structured data, along with covering meta-learning, which allows you to train neural networks with fewer training samples. Finally, you'll understand how to apply deep learning to autonomous vehicles.
By the end of this book, you'll have mastered key deep learning concepts and the different applications of deep learning models in the real world.
Who this book is for
This book is for data scientists, deep learning engineers and researchers, and AI developers who want to further their knowledge of deep learning and build innovative and unique deep learning projects. Anyone looking to get to grips with advanced use cases and methodologies adopted in the deep learning domain using real-world examples will also find this book useful. Basic understanding of deep learning concepts and working knowledge of the Python programming language is assumed.
Copyright © 2019 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Pravin Dhandre
Acquisition Editor: Devika Battike
Content Development Editor: Nathanya Dias
Senior Editor: Ayaan Hoda
Technical Editor: Manikandan Kurup
Copy Editor: Safis Editing
Project Coordinator: Aishwarya Mohan
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Production Designer: Nilesh Mohite
First published: December 2019
Production reference: 1111219
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78995-617-7
www.packt.com
Packt.com
Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Fully searchable for easy access to vital information
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Ivan Vasilev started working on the first open source Java deep learning library with GPU support in 2013. The library was acquired by a German company, where he continued to develop it. He has also worked as a machine learning engineer and researcher in the area of medical image classification and segmentation with deep neural networks. Since 2017, he has been focusing on financial machine learning. He is working on a Python-based platform that provides the infrastructure to rapidly experiment with different machine learning algorithms for algorithmic trading. Ivan holds an MSc degree in artificial intelligence from the University of Sofia, St. Kliment Ohridski.
Saibal Dutta has been working as an analytical consultant in SAS Research and Development. He is also pursuing a PhD in data mining and machine learning at IIT Kharagpur. He holds an M.Tech in electronics and communication from the National Institute of Technology, Rourkela. He has worked at Tata Communications, Pune, and HCL Technologies Limited, Noida, as a consultant. In his 7 years of consulting experience, he has been associated with global players including IKEA (in Sweden) and Pearson (in the US). His passion for entrepreneurship led him to create his own start-up in the field of data analytics. His areas of expertise include data mining, artificial intelligence, machine learning, image processing, and business consultation.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Advanced Deep Learning with Python
About Packt
Why subscribe?
Contributors
About the author
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Section 1: Core Concepts
The Nuts and Bolts of Neural Networks
The mathematical apparatus of NNs
Linear algebra
Vector and matrix operations
Introduction to probability
Probability and sets
Conditional probability and the Bayes rule
Random variables and probability distributions
Probability distributions
Information theory
Differential calculus
A short introduction to NNs
Neurons
Layers as operations
NNs
Activation functions
The universal approximation theorem
Training NNs
Gradient descent
Cost functions
Backpropagation
Weight initialization
SGD improvements
Summary
Section 2: Computer Vision
Understanding Convolutional Networks
Understanding CNNs
Types of convolutions
Transposed convolutions
1×1 convolutions
Depth-wise separable convolutions
Dilated convolutions
Improving the efficiency of CNNs
Convolution as matrix multiplication
Winograd convolutions
Visualizing CNNs
Guided backpropagation
Gradient-weighted class activation mapping
CNN regularization
Introducing transfer learning
Implementing transfer learning with PyTorch
Transfer learning with TensorFlow 2.0
Summary
Advanced Convolutional Networks
Introducing AlexNet
An introduction to Visual Geometry Group 
VGG with PyTorch and TensorFlow
Understanding residual networks
Implementing residual blocks
Understanding Inception networks
Inception v1
Inception v2 and v3
Inception v4 and Inception-ResNet
Introducing Xception
Introducing MobileNet
An introduction to DenseNets
The workings of neural architecture search
Introducing capsule networks
The limitations of convolutional networks
Capsules
Dynamic routing
The structure of the capsule network
Summary
Object Detection and Image Segmentation
Introduction to object detection
Approaches to object detection
Object detection with YOLOv3
A code example of YOLOv3 with OpenCV
Object detection with Faster R-CNN
Region proposal network
Detection network
Implementing Faster R-CNN with PyTorch
Introducing image segmentation
Semantic segmentation with U-Net
Instance segmentation with Mask R-CNN
Implementing Mask R-CNN with PyTorch
Summary
Generative Models
Intuition and justification of generative models
Introduction to VAEs
Generating new MNIST digits with VAE
Introduction to GANs
Training GANs
Training the discriminator
Training the generator
Putting it all together
Problems with training GANs
Types of GAN
Deep Convolutional GAN
Implementing DCGAN
Conditional GAN
Implementing CGAN
Wasserstein GAN
Implementing WGAN
Image-to-image translation with CycleGAN
Implementing CycleGAN
Building the generator and discriminator
Putting it all together
Introducing artistic style transfer
Summary
Section 3: Natural Language and Sequence Processing
Language Modeling
Understanding n-grams
Introducing neural language models
Neural probabilistic language model
Word2Vec
CBOW
Skip-gram
fastText
Global Vectors for Word Representation model
Implementing language models
Training the embedding model
Visualizing embedding vectors
Summary
Understanding Recurrent Networks
Introduction to RNNs
RNN implementation and training
Backpropagation through time
Vanishing and exploding gradients
Introducing long short-term memory
Implementing LSTM
Introducing gated recurrent units
Implementing GRUs
Implementing text classification
Summary
Sequence-to-Sequence Models and Attention
Introducing seq2seq models
Seq2seq with attention
Bahdanau attention
Luong attention
General attention
Implementing seq2seq with attention
Implementing the encoder
Implementing the decoder
Implementing the decoder with attention
Training and evaluation
Understanding transformers
The transformer attention
The transformer model
Implementing transformers
Multihead attention
Encoder
Decoder
Putting it all together
Transformer language models
Bidirectional encoder representations from transformers
Input data representation
Pretraining
Fine-tuning
Transformer-XL
Segment-level recurrence with state reuse
Relative positional encodings
XLNet
Generating text with a transformer language model
Summary
Section 4: A Look to the Future
Emerging Neural Network Designs
Introducing Graph NNs
Recurrent GNNs
Convolutional Graph Networks
Spectral-based convolutions
Spatial-based convolutions with attention
Graph autoencoders
Neural graph learning
Implementing graph regularization
Introducing memory-augmented NNs
Neural Turing machines
MANN
Summary
Meta Learning
Introduction to meta learning
Zero-shot learning
One-shot learning
Meta-training and meta-testing
Metric-based meta learning
Matching networks for one-shot learning
Siamese networks
Implementing Siamese networks
Prototypical networks
Optimization-based learning
Summary
Deep Learning for Autonomous Vehicles
Introduction to AVs
Brief history of AV research
Levels of automation
Components of an AV system 
Environment perception
Sensing
Localization
Moving object detection and tracking
Path planning
Introduction to 3D data processing
Imitation driving policy
Behavioral cloning with PyTorch
Generating the training dataset
Implementing the agent neural network
Training
Letting the agent drive
Putting it all together
Driving policy with ChauffeurNet
Input and output representations
Model architecture
Training
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think
This book is a collection of newly evolved deep learning models, methodologies, and implementations, organized by their areas of application. In the first section of the book, you will learn about the building blocks of deep learning and the math behind neural networks (NNs). In the second section, you'll focus on convolutional neural networks (CNNs) and their advanced applications in computer vision (CV). You'll learn to apply the most popular CNN architectures in object detection and image segmentation. Finally, you'll discuss variational autoencoders and generative adversarial networks.
In the third section, you'll focus on natural language and sequence processing. You'll use NNs to extract sophisticated vector representations of words. You'll discuss various types of recurrent networks, such as long short-term memory (LSTM) and gated recurrent unit (GRU) networks. Finally, you'll cover the attention mechanism for processing sequential data without the help of recurrent networks.
In the final section, you'll learn how to use graph NNs to process structured data. You'll cover meta-learning, which allows you to train an NN with fewer training samples. And finally, you'll learn how to apply deep learning to autonomous vehicles.
By the end of this book, you'll have mastered the key deep learning concepts and the different real-world applications of deep learning models.
This book is for data scientists, deep learning engineers and researchers, and AI developers who want to master deep learning and build innovative and unique deep learning projects of their own. It will also appeal to those looking to get well versed in the advanced use cases and methodologies adopted in the deep learning domain, using real-world examples. A basic conceptual understanding of deep learning and a working knowledge of Python are assumed.
Chapter 1, The Nuts and Bolts of Neural Networks, will briefly introduce what deep learning is and then discuss the mathematical underpinnings of NNs. This chapter will discuss NNs as mathematical models. More specifically, we'll focus on vectors, matrices, and differential calculus. We'll also discuss some gradient descent variations, such as Momentum, Adam, and Adadelta, in depth. We will also discuss how to deal with imbalanced datasets.
Chapter 2, Understanding Convolutional Networks, will provide a short description of CNNs and discuss their applications in CV.
Chapter 3, Advanced Convolutional Networks, will discuss some advanced and widely used NN architectures, including VGG, ResNet, MobileNets, GoogLeNet, Inception, Xception, and DenseNets. We'll also implement ResNet and Xception/MobileNets using PyTorch.
Chapter 4, Object Detection and Image Segmentation, will discuss two important vision tasks: object detection and image segmentation. We'll provide implementations for both of them.
Chapter 5, Generative Models, will begin the discussion about generative models. In particular, we'll talk about generative adversarial networks and neural style transfer, and we'll implement a style transfer example later in the chapter.
Chapter 6, Language Modeling, will introduce word-level and character-level language models. We'll also talk about word vectors (word2vec, GloVe, and fastText), and we'll use Gensim to implement them. We'll also walk through the highly technical and complex process of preparing text data for machine learning applications such as topic modeling and sentiment modeling with the help of the Natural Language Toolkit's (NLTK) text processing techniques.
Chapter 7, Understanding Recurrent Networks, will discuss the basic recurrent networks, LSTM, and GRU cells. We'll provide a detailed explanation and pure Python implementations for all of the networks.
Chapter 8, Sequence-to-Sequence Models and Attention, will discuss sequence models and the attention mechanism, including bidirectional LSTMs, and a new architecture called transformer with encoders and decoders.
Chapter 9, Emerging Neural Network Designs, will discuss graph NNs and NNs with memory, such as Neural Turing Machines (NTM), differentiable neural computers, and MANN.
Chapter 10, Meta Learning, will discuss meta learning—the way to teach algorithms how to learn. We'll also try to improve upon deep learning algorithms by giving them the ability to learn more information using fewer training samples.
Chapter 11, Deep Learning for Autonomous Vehicles, will explore the applications of deep learning in autonomous vehicles. We'll discuss how to use deep networks to help the vehicle make sense of its surrounding environment.
To get the most out of this book, you should be familiar with Python and have some knowledge of machine learning. The book includes short introductions to the major types of NNs, but it will help if you are already familiar with the basics of NNs.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
Log in or register at www.packt.com.
Select the Support tab.
Click on Code Downloads.
Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Advanced-Deep-Learning-with-Python. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/files/downloads/9781789956177_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Build the full GAN model by including the generator, discriminator, and the combined network."
A block of code is set as follows:
import matplotlib.pyplot as plt
from matplotlib.markers import MarkerStyle
import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Lambda, Input, Dense
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "The collection of all possible outcomes (events) of an experiment is called the sample space."
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
This section will discuss some core Deep Learning (DL) concepts: what exactly DL is, the mathematical underpinnings of DL algorithms, and the libraries and tools that make it possible to develop DL algorithms rapidly.
This section contains the following chapter:
Chapter 1, The Nuts and Bolts of Neural Networks
In this chapter, we'll discuss some of the intricacies of neural networks (NNs)—the cornerstone of deep learning (DL). We'll talk about their mathematical apparatus, structure, and training. Our main goal is to provide you with a systematic understanding of NNs. Often, we approach them from a computer science perspective—as a machine learning (ML) algorithm (or even a special entity) composed of a number of different steps/components. We gain our intuition by thinking in terms of neurons, layers, and so on (at least I did this when I first learned about this field). This is a perfectly valid way to do things and we can still do impressive things at this level of understanding. Perhaps this is not the correct approach, though.
NNs have solid mathematical foundations and, if we approach them from this point of view, we'll be able to define and understand them in a more fundamental and elegant way. Therefore, in this chapter, we'll try to draw the parallels between the mathematical and the computer science views of NNs. If you are already familiar with these topics, you can skip this chapter. Still, I hope that you'll find some interesting bits you didn't know about already (we'll do our best to keep this chapter interesting!).
In this chapter, we will cover the following topics:
The mathematical apparatus of NNs
A short introduction to NNs
Training NNs
In the next few sections, we'll discuss the mathematical branches related to NNs. Once we've done this, we'll connect them to NNs themselves.
Linear algebra deals with linear equations, such as $a_1 x_1 + a_2 x_2 + \dots + a_n x_n + b = 0$, and linear transformations (or linear functions) and their representations, such as matrices and vectors.
Linear algebra identifies the following mathematical objects:
Scalars: A single number.
Vectors: A one-dimensional array of numbers (or components). Each component of the array has an index. In literature, we will see vectors denoted either with a superscript arrow ($\vec{x}$) or in bold (x). The following is an example of a vector:
$\mathbf{x} = [x_1, x_2, \dots, x_n]$
We can visually represent an n-dimensional vector as the coordinates of a point in an n-dimensional Euclidean space, $\mathbb{R}^n$ (equivalent to a coordinate system). In this case, the vector is referred to as Euclidean and each vector component represents the coordinate along the corresponding axis, as shown in the following diagram:
However, the Euclidean vector is more than just a point and we can also represent it with the following two properties:
Magnitude (or length) is a generalization of the Pythagorean theorem for an n-dimensional space:
$|\mathbf{x}| = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}$
Direction is the angle of the vector along each axis of the vector space.
Matrices: This is a two-dimensional array of numbers. Each element is identified by two indices (row and column). A matrix is usually denoted with a bold capital letter; for example, A. Each matrix element is denoted with the lowercase letter of the matrix and a subscript index; for example, $a_{ij}$. Let's look at an example of the matrix notation in the following formula:
$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$
We can represent a vector as a single-column n×1 matrix (referred to as a column matrix) or a single-row 1×n matrix (referred to as a row matrix).
Tensors: Before we explain them, we have to start with a disclaimer. Tensors originally come from mathematics and physics, where they have existed long before we started using them in ML. The tensor definition in these fields differs from the ML one. For the purposes of this book, we'll only consider tensors in the ML context. Here, a tensor is a multi-dimensional array with the following properties:
Rank: Indicates the number of array dimensions. For example, a tensor of rank 2 is a matrix, a tensor of rank 1 is a vector, and a tensor of rank 0 is a scalar. However, the tensor has no limit on the number of dimensions. Indeed, some types of NNs use tensors of rank 4.
Shape: The size of each dimension.
The data type of the tensor elements: These can vary between libraries, but typically include 16-, 32-, and 64-bit float and 8-, 16-, 32-, and 64-bit integers.
Contemporary DL libraries such as TensorFlow and PyTorch use tensors as their main data structure.
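As a quick illustration of these properties, here is a minimal sketch in PyTorch (the values are arbitrary and chosen only for demonstration) that creates tensors of different ranks and inspects their shape and data type:
import torch

# A rank 0 tensor (scalar), a rank 1 tensor (vector), and a rank 2 tensor (matrix)
scalar = torch.tensor(3.14)
vector = torch.tensor([1.0, 2.0, 3.0])
matrix = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.int32)

# The rank is the number of dimensions and the shape is the size of each dimension
print(scalar.dim(), scalar.shape, scalar.dtype)  # 0 torch.Size([]) torch.float32
print(vector.dim(), vector.shape, vector.dtype)  # 1 torch.Size([3]) torch.float32
print(matrix.dim(), matrix.shape, matrix.dtype)  # 2 torch.Size([2, 3]) torch.int32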
Now that we've introduced the types of objects in linear algebra, in the next section, we'll discuss some operations that can be applied to them.
We'll start with the binomial distribution for discrete variables in binomial experiments. A binomial experiment has only two possible outcomes: success or failure. It also satisfies the following requirements:
Each trial is independent of the others.
The probability of success is always the same.
An example of a binomial experiment is the coin toss experiment.
Now, let's assume that the experiment consists of n trials, x of which are successful, while the probability of success at each trial is p. The formula for the binomial PMF of the variable X (not to be confused with x) is as follows:
$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$
Here, $\binom{n}{x}$ is the binomial coefficient. This is the number of combinations of x successful trials, which we can select from the n total trials. If n=1, then we have a special case of the binomial distribution called the Bernoulli distribution.
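To make the formula concrete, the following is a small Python sketch (using only the standard library; the example values are made up) that computes the binomial PMF directly from the preceding definition:
import math

def binomial_pmf(x, n, p):
    # Probability of exactly x successes in n independent trials with success probability p
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# Example: the probability of getting exactly 5 heads in 10 fair coin tosses
print(binomial_pmf(5, 10, 0.5))  # approximately 0.2461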
Next, let's discuss the normal (or Gaussian) distribution for continuous variables, which closely approximates many natural processes. The normal distribution is defined with the following exponential PDF formula, known as the normal equation (one of the most popular notations):
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$
Here, x is the value of the random variable, μ is the mean, σ is the standard deviation, and σ² is the variance. The preceding equation produces a bell-shaped curve, which is shown in the following diagram:
Let's discuss some of the properties of the normal distribution, in no particular order:
The curve is symmetric along its center, which is also the maximum value.
The shape and location of the curve are fully described by the mean and standard deviation, where we have the following:
The center of the curve (and its maximum value) is equal to the mean. That is, the mean determines the location of the curve along the x axis.
The width of the curve is determined by the standard deviation.
In the following diagram, we can see examples of normal distributions with different μ and σ values:
The normal distribution approaches 0 toward +/- infinity, but it never becomes 0. Therefore, a random variable under normal distribution can have any value (albeit some values with a tiny probability).
The surface area under the curve is equal to 1, which is ensured by the constant $\frac{1}{\sigma\sqrt{2\pi}}$ before the exponent.
$\frac{x - \mu}{\sigma}$ (located in the exponent) is called the standard score (or z-score). A standardized normal variable has a mean of 0 and a standard deviation of 1. Once transformed, the random variable participates in the equation in its standardized form.
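The following sketch implements the normal PDF and the z-score transformation directly from the preceding formulas (NumPy only; the sample values are arbitrary):
import numpy as np

def normal_pdf(x, mu=0.0, sigma=1.0):
    # Normal (Gaussian) probability density function
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

def z_score(x, mu, sigma):
    # Standardize x: the result has a mean of 0 and a standard deviation of 1
    return (x - mu) / sigma

x = np.linspace(-4, 4, 9)
print(normal_pdf(x))                # bell-shaped values with a peak at x = 0
print(z_score(175.0, 170.0, 10.0))  # 0.5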
In the next section, we'll introduce the multidisciplinary field of information theory, which will help us use probability theory in the context of NNs.
A NN is a function (let's denote it with f) that tries to approximate another target function, g. We can describe this relationship with the following equation:
$f(x, \theta) \approx g(x)$
Here, x is the input data and θ are the NN parameters (weights). The goal is to find the parameters θ such that f best approximates g. This generic definition applies to both regression (approximating the exact value of g) and classification (assigning the input to one of multiple possible classes) tasks. Alternatively, the NN function can be denoted as $f_{\theta}(x)$.
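To make this definition concrete, here is a minimal sketch (a made-up example, not taken from the book's code) in which a model with a single parameter, f(x, θ) = θx, is fitted with gradient descent so that it approximates the target function g(x) = 3x:
import numpy as np

def g(x):
    # The target function we want to approximate (unknown to the model in practice)
    return 3.0 * x

theta = 0.0                    # the single trainable parameter of f(x, theta) = theta * x
learning_rate = 0.1
x_train = np.array([1.0, 2.0, 3.0, 4.0])

for _ in range(100):
    y_pred = theta * x_train
    error = y_pred - g(x_train)
    grad = 2 * np.mean(error * x_train)   # gradient of the mean squared error w.r.t. theta
    theta -= learning_rate * grad

print(theta)  # close to 3.0, so f now approximates g on the training data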
We'll start our discussion from the smallest building block of the NN—the neuron.
The next level in the NN organizational structure is the layers of units, where we combine the scalar outputs of multiple units in a single output vector. The units in a layer are not connected to each other. This organizational structure makes sense for the following reasons:
We can generalize multivariate regression to a layer, as opposed to only linear or logistic regression for a single unit. In other words, we can approximate multiple values with a layer as opposed to a single value with a unit. This happens in the case of classification output, where each output unit represents the probability that the input belongs to a certain class.
A unit can convey limited information because its output is a scalar. By combining the unit outputs, instead of a single activation, we can now consider the vector in its entirety. In this way, we can convey a lot more information, not only because the vector has multiple values, but also because the relative ratios between them carry additional meaning.
Because the units in a layer have no connections to each other, we can parallelize the computation of their outputs (thereby increasing the computational speed). This ability is one of the major reasons for the success of DL in recent years.
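As an illustration of the parallelization point, here is a minimal NumPy sketch (a made-up example) that computes the outputs of all the units in a fully connected layer with a single matrix-vector multiplication instead of a loop over the units:
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
x = rng.standard_normal(4)        # an input vector with 4 features
W = rng.standard_normal((3, 4))   # the weights of a layer with 3 units, each with 4 inputs
b = np.zeros(3)                   # one bias per unit

# All 3 unit activations are computed at once; on a GPU, this product is parallelized
y = sigmoid(W @ x + b)
print(y.shape)  # (3,)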
