Concepts, tools, and techniques to explore deep learning architectures and methodologies
Key Features
Book Description
Deep learning architectures are composed of multilevel nonlinear operations that represent high-level abstractions; this allows you to learn useful feature representations from the data. This book will help you learn and implement deep learning architectures to resolve various deep learning research problems.
Hands-On Deep Learning Architectures with Python explains the essential learning algorithms used for deep and shallow architectures. Packed with practical implementations and ideas to help you build efficient artificial intelligence (AI) systems, this book will help you learn how neural networks play a major role in building deep architectures. You will understand various deep learning architectures (such as AlexNet, VGG Net, and GoogleNet) with easy-to-follow code and diagrams. In addition to this, the book will also guide you in building and training various deep architectures, such as the Boltzmann machine, autoencoders, convolutional neural networks (CNNs), recurrent neural networks (RNNs), natural language processing (NLP) models, generative adversarial networks (GANs), and more, all with practical implementations.
By the end of this book, you will be able to construct deep models using popular frameworks and datasets with the required design patterns for each architecture. You will be ready to explore the potential of deep architectures in today's world.
What you will learn
Who this book is for
If you're a data scientist, machine learning developer/engineer, or deep learning practitioner, or are curious about AI and want to upgrade your knowledge of various deep learning architectures, this book will appeal to you. You are expected to have some knowledge of statistics and machine learning algorithms to get the best out of this book.
Copyright © 2019 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Sunith Shetty
Acquisition Editor: Porous Godhaa
Content Development Editor: Karan Thakkar
Technical Editor: Sushmeeta Jena
Copy Editor: Safis Editing
Project Coordinator: Hardik Bhinde
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Graphics: Jisha Chirayil
Production Coordinator: Arvindkumar Gupta
First published: April 2019
Production reference: 1300419
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78899-808-6
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Yuxi (Hayden) Liu is the author of a series of machine learning books and an education enthusiast. His first book, the first edition of Python Machine Learning By Example, was a #1 bestseller on Amazon India in 2017 and 2018. It, along with his other book, R Deep Learning Projects, was published by Packt Publishing.
He is an experienced data scientist who is focused on developing machine learning and deep learning models and systems. He has worked in a variety of data-driven domains and has applied his machine learning expertise to computational advertising, recommendations, and network anomaly detection. He published five first-authored IEEE transactions and conference papers during his master's research at the University of Toronto.
Saransh Mehta has cross-domain experience of working with texts, images, and audio using deep learning. He has been building artificial intelligence-based solutions, including a generative chatbot, an attendee-matching recommendation system, and audio keyword recognition systems for multiple start-ups. He is very familiar with the Python language, and has extensive knowledge of deep learning libraries such as TensorFlow and Keras. He has been in the top 10% of entrants to deep learning challenges hosted by Microsoft and Kaggle.
Antonio L. Amadeu is a data science consultant and is passionate about data, artificial intelligence, and neural networks, in particular, using machine learning and deep learning algorithms in daily challenges to solve all types of issues in any business field and industry. He has worked for large companies, including Unilever, Lloyds Bank, TE Connectivity, and Microsoft.
As an aspiring astrophysicist, he does research with the Virtual Observatory group at Sao Paulo University in Brazil, a member of the International Virtual Observatory Alliance (IVOA).
Junho Kim received a BS in mathematics and computer science engineering in 2015, and an MS in computer science engineering in 2017, from Chung-Ang University, Seoul, South Korea. After graduation, he worked as an artificial intelligence research intern at AIRI, Lunit, and Naver Webtoon. Currently, he is working for NCSOFT as an artificial intelligence research scientist.
His research interests include deep learning in computer vision, especially in relation to generative models with GANs, and image-to-image translation. He likes to read papers and implement deep learning in a simple way that others can understand easily. All his works are shared on GitHub (@taki0112). His dream is to make everyone's life more fun using AI.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Hands-On Deep Learning Architectures with Python
About Packt
Why subscribe?
Packt.com
Contributors
About the authors
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Section 1: The Elements of Deep Learning
Getting Started with Deep Learning
Artificial intelligence
Machine learning
Supervised learning
Regression
Classification
Unsupervised learning
Reinforcement learning
Deep learning
Applications of deep learning
Self-driving cars
Image translation
Machine translation
Encoder-decoder structure
Chatbots
Building the fundamentals
Biological inspiration 
ANNs
Activation functions
Linear activation 
Sigmoid activation
Tanh activation
ReLU activation
Softmax activation
TensorFlow and Keras
Setting up the environment
Introduction to TensorFlow
Installing TensorFlow CPU
Installing TensorFlow GPU
Testing your installation
Getting to know TensorFlow 
Building a graph
Creating a Session
Introduction to Keras
Sequential API
Functional API
Summary
Deep Feedforward Networks
Evolutionary path to DFNs
Architecture of DFN
Training
Loss function
Regression loss
Mean squared error (MSE)
Mean absolute error
Classification loss
Cross entropy
Gradient descent
Types of gradient descent
Batch gradient descent
Stochastic gradient descent
Mini-batch gradient descent
Backpropagation
Optimizers
Train, test, and validation
Training set
Validation set
Test set
Overfitting and regularization
L1 and L2 regularization
Dropout
Early stopping
Building our first DFN
MNIST fashion data
Getting the data 
Visualizing data
Normalizing and splitting data
Model parameters
One-hot encoding
Building a model graph
Adding placeholders
Adding layers
Adding loss function
Adding an optimizer
Calculating accuracy
Running a session to train
The easy way
Summary
Restricted Boltzmann Machines and Autoencoders
What are RBMs?
The evolution path of RBMs
RBM architectures and applications
RBMs and their implementation in TensorFlow
RBMs for movie recommendation
DBNs and their implementation in TensorFlow
DBNs for image classification
What are autoencoders?
The evolution path of autoencoders
Autoencoder architectures and applications
Vanilla autoencoders
Deep autoencoders
Sparse autoencoders
Denoising autoencoders
Contractive autoencoders
Summary
Exercise
Acknowledgements
Section 2: Convolutional Neural Networks
CNN Architecture
Problem with deep feedforward networks
Evolution path to CNNs
Architecture of CNNs
The input layer
The convolutional layer
The maxpooling layer
The fully connected layer
Image classification with CNNs
VGGNet
InceptionNet
ResNet
Building our first CNN 
CIFAR
Data loading and pre-processing
Object detection with CNN
R-CNN
Faster R-CNN
You Only Look Once (YOLO)
Single Shot Multibox Detector
TensorFlow object detection zoo
Summary
Mobile Neural Networks and CNNs
Evolution path to MobileNets
Architecture of MobileNets
Depth-wise separable convolution
The need for depth-wise separable convolution
Structure of MobileNet
MobileNet with Keras
MobileNetV2
Motivation behind MobileNetV2
Structure of MobileNetV2
Linear bottleneck layer
Expansion layer
Inverted residual block
Overall architecture
Implementing MobileNetV2
Comparing the two MobileNets
SSD MobileNetV2
Summary
Section 3: Sequence Modeling
Recurrent Neural Networks
What are RNNs?
The evolution path of RNNs
RNN architectures and applications
Architectures by input and output
Vanilla RNNs
Vanilla RNNs for text generation
LSTM RNNs
LSTM RNNs for text generation
GRU RNNs
GRU RNNs for stock price prediction
Bidirectional RNNs
Bidirectional RNNs for sentiment classification
Summary
Section 4: Generative Adversarial Networks (GANs)
Generative Adversarial Networks
What are GANs?
Generative models
Adversarial – training in an adversarial manner
The evolution path of GANs
GAN architectures and implementations
Vanilla GANs
Deep convolutional GANs
Conditional GANs
InfoGANs
Summary
Section 5: The Future of Deep Learning and Advanced Artificial Intelligence
New Trends in Deep Learning
New trends in deep learning
Bayesian neural networks
What our deep learning models don't know – uncertainty
How we can obtain uncertainty information – Bayesian neural networks
Capsule networks
What convolutional neural networks fail to do
Capsule networks – incorporating orientational and relative spatial relationships
Meta-learning
One big challenge in deep learning – training data
Meta-learning – learning to learn
Metric-based meta-learning
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think
Deep learning architectures are composed of multilevel nonlinear operations that represent high-level abstractions. This allows you to learn useful feature representations from data. Hands-On Deep Learning Architectures with Python explains the essential learning algorithms used for deep and shallow architectures. Packed with practical implementations and ideas to build efficient artificial intelligence (AI) systems, this book will help you learn how neural networks play a major role in building deep architectures.
You will gain an understanding of various deep learning architectures, such as AlexNet, VGG Net, GoogleNet, and many more, with easy-to-follow code and diagrams. In addition to this, the book will also guide you in building and training various deep architectures, such as the Boltzmann machine, autoencoders, convolutional neural networks (CNNs), recurrent neural networks (RNNs), natural language processing (NLP) models, generative adversarial networks (GANs), and others, with practical implementations.
By the end of this book, you will be able to construct deep models using popular frameworks and datasets with the required design patterns for each architecture. You will be ready to explore the possibilities of deep architectures in today's world.
If you're a data scientist, machine learning developer/engineer, deep learning practitioner, or are curious about the field of AI and want to upgrade your knowledge of various deep learning architectures, this book will appeal to you. You are expected to have some knowledge of statistics and machine learning algorithms to get the most out of this book.
Chapter 1, Getting Started with Deep Learning, covers the evolution of intelligence in machines, from artificial intelligence to machine learning and, eventually, deep learning. We'll then look at some applications of deep learning and set up our environment for coding our way through deep learning models.
Chapter 2, Deep Feedforward Networks, covers the evolution history of deep feedforward networks and their architecture. We will also demonstrate how to load and preprocess data for training a deep learning network.
Chapter 3, Restricted Boltzmann Machines and Autoencoders, explains restricted Boltzmann machines (RBMs), the algorithm behind the scenes, and their evolutionary path. We will then dig deeper into the logic behind them, implement RBMs in TensorFlow, and apply them to build a movie recommender. We'll then learn about autoencoders and briefly look at their evolutionary path. We will also illustrate a variety of autoencoders, categorized by their architectures or forms of regularization.
Chapter 4, CNN Architecture, covers an important class of deep learning networks for images, called convolutional neural networks (CNNs). We will also discuss the benefits of CNNs over deep feedforward networks. We will then learn more about some famous image classification CNNs and build our first CNN image classifier on the CIFAR-10 dataset. Then, we'll move on to object detection with CNNs and the TensorFlow object detection model zoo.
Chapter 5, Mobile Neural Networks and CNNs, discusses the need for mobile neural networks for doing CNN work in a real-time application. We will also talk about the two benchmark MobileNet architectures introduced by Google—MobileNet and MobileNetV2. Later, we'll discuss the successful combination of MobileNet with object detection networks such as SSD to achieve object detection on mobile devices.
Chapter 6, Recurrent Neural Networks, explains one of the most important deep learning models, recurrent neural networks (RNNs), their architecture, and their evolutionary path. Later, we'll discuss a variety of architectures categorized by the recurrent layer, including vanilla RNNs, LSTM, GRU, and bidirectional RNNs, and apply the vanilla architecture to write our own War and Peace (a bit nonsensical though). We'll also introduce the bidirectional architecture that allows the model to preserve information from both past and future contexts of the sequence.
Chapter 7, Generative Adversarial Networks, explains one of the most interesting deep learning models, generative adversarial networks (GANs), and their evolutionary path. We will illustrate a variety of GAN architectures with examples of image generation, exploring four architectures in particular: vanilla GANs, deep convolutional GANs, conditional GANs, and information-maximizing GANs.
Chapter 8, New Trends in Deep Learning, talks about a few deep learning ideas that we have found impactful and that we expect to become more prominent in the future. We'll also learn that Bayesian deep learning combines the merits of both Bayesian learning and deep learning.
Readers will require prior knowledge of Python, TensorFlow, and Keras.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
1. Log in or register at www.packt.com.
2. Select the SUPPORT tab.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Hands-On-Deep-Learning-Architectures-with-Python. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://www.packtpub.com/sites/default/files/downloads/9781788998086_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Mount the downloaded WebStorm-10*.dmg disk image file as another disk in your system."
A block of code is set as follows:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
Any command-line input or output is written as follows:
conda activate test_env
conda install tensorflow
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Select System info from the Administration panel."
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
In this section, you will get an overview of deep learning with Python, and will also learn about the architectures of the deep feedforward network, the Boltzmann machine, and autoencoders. We will also practice concrete examples based on DFNs and applications of the Boltzmann machine and autoencoders, built with Python-based deep learning frameworks and libraries, along with their benchmarks.
This section consists of the following chapters:
Chapter 1, Getting Started with Deep Learning
Chapter 2, Deep Feedforward Networks
Chapter 3, Restricted Boltzmann Machines and Autoencoders
Welcome to Hands-On Deep Learning Architectures with Python! If you are completely unfamiliar with deep learning, you can begin your journey right here with this book. And for readers who already have an idea about it, we have covered almost every aspect of deep learning, so you are definitely going to learn a lot more about it from this book.
The book is laid out in a cumulative manner; that is, it begins with the basics and builds on them repeatedly to reach advanced levels. In this chapter, we discuss how humans started creating intelligence in machines and how artificial intelligence gradually evolved into machine learning and, eventually, deep learning. We then see some nice applications of deep learning. Moving back to the fundamentals, we will learn how artificial neurons work and, in the end, set up our environment for coding our way through deep learning models. After completing this chapter, you will have learned about the following:
What artificial intelligence is, and how machine learning and deep learning relate to it
The types of machine learning tasks
Information about some interesting deep learning applications
What an artificial neural network is, and how it works
Setting up TensorFlow and Keras with Python
Let's begin with a short discussion on artificial intelligence and the relationships between artificial intelligence, machine learning, and deep learning.
Ever since the beginning of the computer era, humans have been trying to mimic the brain in machines. Researchers have been developing methods that would make machines not only compute but also decide like we humans do. This quest of ours gave birth to artificial intelligence around the 1960s. By definition, artificial intelligence means developing systems that are capable of accomplishing tasks without a human explicitly programming every decision. In 1956, the first program for playing checkers was written by Arthur Samuel. Since then, researchers tried to mimic human intelligence by defining sets of handwritten rules that didn't involve any learning. Artificial intelligence programs that played games such as chess were nothing but sets of manually defined moves and strategies. In 1959, Arthur Samuel coined the term machine learning. Machine learning started using various concepts of probability and Bayesian statistics to perform pattern recognition, feature extraction, classification, and so on. In the 1980s, inspired by the neural structure of the human brain, artificial neural networks (ANNs) were introduced. In the 2000s, ANNs evolved into today's so-called deep learning! The following is a timeline for the evolution of artificial intelligence through machine learning and deep learning:
Artificial intelligence prior to machine learning was just about writing rules that the machine used to process the provided data. Machine learning made a transition: now, just by providing the data and expected output to the machine learning algorithm, the computer returns an optimized set of rules for the task. Machine learning uses historic data to train a system and test it on unknown but similar data, beginning the journey of machines learning how to make decisions without being hard-coded. In the early 1990s, machine learning emerged as the new face of artificial intelligence. Larger datasets were developed and made public to allow more people to build and train machine learning models, and very soon a huge community of machine learning scientists and engineers developed. Although machine learning algorithms draw inference from statistics, what makes them powerful is the error minimization approach. A machine learning algorithm tries to minimize the error between the expected output provided by the dataset and the algorithm's predicted output, in order to discover the optimized rules. This is the learning part of machine learning. We won't be covering machine learning algorithms in this book, but they are essentially divided into three categories: supervised, unsupervised, and reinforcement. Since deep learning is also a subset of machine learning, these categories apply to deep learning as well.
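To make the error minimization idea concrete, here is a minimal sketch (our own illustration, not from the book's code bundle) that fits a one-parameter linear model by repeatedly nudging the parameter in the direction that reduces the mean squared error between the expected and predicted outputs:
import numpy as np

# toy dataset: the expected output is roughly 3 * input
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 5.9, 9.2, 11.8])

w = 0.0    # the "rule" being learned
lr = 0.01  # learning rate

for step in range(1000):
    y_pred = w * x                 # the algorithm's predicted output
    error = y_pred - y             # difference from the expected output
    grad = 2 * np.mean(error * x)  # gradient of the mean squared error w.r.t. w
    w -= lr * grad                 # update w to reduce the error

print(w)  # converges close to 3, the rule hidden in the data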
Regression deals with learning continuous mapping functions that can predict values given various input features. The function can be linear or non-linear. If the function is linear, it is referred to as linear regression, and if it is non-linear, it is commonly called polynomial regression. When predicting values from multiple input features (variables), we call it multivariate regression. A very typical example of regression is the house price prediction problem. Provided with the various parameters of a house, such as build area, locality, number of rooms, and so on, the selling price of the house can be accurately predicted using historic data.
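For instance, a multivariate house price regression could look like the following sketch with scikit-learn (the feature values here are made up purely for illustration):
import numpy as np
from sklearn.linear_model import LinearRegression

# hypothetical historic data: [build area in sq ft, number of rooms]
X = np.array([[850, 2], [1200, 3], [1500, 3], [2100, 4], [2600, 5]])
y = np.array([120000, 165000, 200000, 270000, 330000])  # selling prices

# learn a linear mapping function from features to price
model = LinearRegression().fit(X, y)

# predict the selling price of an unseen 1,800 sq ft, 4-room house
print(model.predict(np.array([[1800, 4]])))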
When the target output values are categories instead of raw values, as in regression, it is a classification task. For example, we could classify different species of flowers based on input features such as petal length, petal width, sepal length, and sepal width, with the output categories being versicolor, setosa, and virginica. Algorithms such as logistic regression, decision trees, and naive Bayes are classification algorithms. We will be covering the details of classification in Chapter 2, Deep Feedforward Networks.
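The flower example above is the classic iris dataset, which ships with scikit-learn; a minimal classification sketch (ours, not the book's) could look like this:
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# petal/sepal measurements as input features; species as target categories
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# fit a logistic regression classifier and check accuracy on unseen flowers
clf = LogisticRegression(max_iter=200).fit(X_train, y_train)
print(clf.score(X_test, y_test))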
Unsupervised learning is used when we don't have the corresponding target output values for the input. It is used to understand the data distribution and discover similarities of some kind between the data points. As there is no target output to learn from, unsupervised algorithms rely on initializers to generate initial decision boundaries and update them as they go through the data. After going through the data multiple times, the algorithms arrive at optimized decision boundaries that group data points based on similarity. This method is known as clustering, and algorithms such as k-means are used for it.
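As a quick illustration of clustering, the following sketch (with made-up points) lets k-means discover two groups without ever seeing target outputs:
import numpy as np
from sklearn.cluster import KMeans

# unlabeled points forming two loose groups; no target outputs provided
points = np.array([[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],
                   [5.0, 5.2], [5.1, 4.8], [4.9, 5.0]])

# k-means starts from initial centers and updates them over several passes
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_)           # group assignment for each point
print(kmeans.cluster_centers_)  # the optimized cluster centers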
Remember how you learned to ride a bicycle in your childhood? It was a trial and error process, right? You tried to balance yourself, and each time you did something wrong, you fell off the bicycle. But you learned from your mistakes, and eventually, you were able to ride without falling. Reinforcement learning works in the same way! An agent is exposed to an environment, where it takes an action from a list of possible actions, which leads to a change in the state of the agent. A state is the current situation of the environment the agent is in. For every action, the agent receives a reward. Whenever the received reward is positive, it signifies that the agent has taken the correct step, and when the reward is negative, it signifies a mistake. The agent follows a policy, the reinforcement learning algorithm through which the agent determines its next action, considering the current state. Reinforcement learning is the true form of artificial intelligence, inspired by the human way of learning through trial and error. Think of yourself as the agent and the bicycle as the environment! Discussing reinforcement learning algorithms here is beyond the scope of this book, so let's shift focus back to deep learning!
Though machine learning has provided computers with the capability to learn decision boundaries, it misses out on robustness. Machine learning models have to be very specifically designed for every particular application. People spent hours deciding what features to select for optimal learning. As datasets grew and the non-linearity in data increased, machine learning models struggled to produce accurate results. Scientists soon realized that a much more powerful tool was required to keep pace with this growth. In the 1980s, the concept of ANNs was reborn, and with faster computing capabilities, deeper versions of ANNs were developed, providing us with the powerful tool we were looking for: deep learning!
The leverage of a technology is decided by the robustness of its applications. Deep learning has created a great amount of commotion in the tech as well as the non-tech market, owing to its powerful applications. So, in this section, we discuss some of the amazing applications of deep learning, which will keep you motivated all through the book.
This is probably the coolest and most promising application of deep learning. An autonomous vehicle has a number of cameras attached to it. The output video streams are fed into deep learning networks that recognize, as well as segment, the different objects present around the car. NVIDIA has introduced End to End Learning for Self-Driving Cars, a convolutional neural network that takes in input images from the cameras and predicts what action should be taken, in the form of a steering angle or acceleration. To train the network, the steering angles, throttle values, and camera views are recorded while a human drives, documenting the actions the driver takes in response to the changes occurring around the car. The network's parameters are then updated by backpropagating (backpropagation is discussed in detail in Chapter 2, Deep Feedforward Networks) the error between the human input and the network's predictions.
Generative adversarial networks (GANs) are the most notorious deep learning architectures. This is due to their capability to generate outputs from random noise input vectors. A GAN has two networks: a generator and a discriminator. The job of the generator is to take a random vector as input and generate sample output data. The discriminator takes input from both the real data and the fake data created by the generator. The job of the discriminator is to determine whether the input is coming from the real data or is a fake from the generator. You can visualize the scenario by imagining that the discriminator is a bank trying to distinguish between real and counterfeit currency, while the generator is a fraudster trying to pass counterfeit currency to the bank. Both the generator and the discriminator learn through their mistakes, and the generator eventually produces results that imitate the real data very precisely.
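To make the two-network setup concrete, here is a minimal Keras sketch (the layer sizes and the flattened 28 x 28 image dimension are assumptions made for this illustration, not the book's implementation):
from tensorflow.keras import layers, models

latent_dim, data_dim = 100, 784  # assumed sizes, e.g. flattened 28 x 28 images

# generator: random noise vector in, fake data sample out
generator = models.Sequential([
    layers.Dense(256, activation='relu', input_shape=(latent_dim,)),
    layers.Dense(data_dim, activation='tanh'),
])

# discriminator: data sample in, probability that it is real out
discriminator = models.Sequential([
    layers.Dense(256, activation='relu', input_shape=(data_dim,)),
    layers.Dense(1, activation='sigmoid'),
])
discriminator.compile(optimizer='adam', loss='binary_crossentropy')

# stacked model: trains the generator to fool the (frozen) discriminator
discriminator.trainable = False
gan = models.Sequential([generator, discriminator])
gan.compile(optimizer='adam', loss='binary_crossentropy')
In training, the discriminator would alternately be fit on real and generated batches, while the stacked model pushes the generator toward outputs the discriminator accepts as real.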
One of the interesting applications of GANs is image-to-image translation. It is based on conditional GANs (we will discuss GANs in detail in Chapter 7, Generative Adversarial Networks). Given a pair of images holding some relation, say I1 and I2, a conditional GAN learns how to convert I1 into I2. A dedicated piece of software called pix2pix has been created to demonstrate the applications of this concept. It can be used to fill in colors in black-and-white images, create maps from satellite images, generate object images from mere sketches, and what not!
The following is the link to the actual paper published by Phillip Isola for image-to-image translation and a sample image from pix2pix depicting various applications of image-to-image translation (https://arxiv.org/abs/1611.07004):
There are more than 4,000 languages in this world and billions of people communicating through them. You can imagine the scale at which language translation is required. Most translation used to be done by human translators, because the classical rule-based translations made by machines were quite often meaningless. Deep learning came up with a solution to this: it can learn a language like we do and generate translations that are more natural. This is commonly referred to as neural machine translation (NMT).
Neural machine translation models are recurrent neural networks (RNNs), arranged in an encoder-decoder fashion. The encoder network takes a variable-length input sequence through an RNN and encodes the sequence into a fixed-size vector. The decoder begins with this encoded vector and generates the translation word by word, until it predicts the end of the sentence. The whole architecture is trained end to end with input sentences and their correct output translations. The major advantage of these systems (apart from their capability to handle variable input sizes) is that they learn the context of a sentence and predict accordingly, rather than making a word-to-word translation. Neural machine translation can best be seen in action on Google Translate, as in the following screenshot:
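As a rough illustration of the encoder-decoder arrangement, here is a minimal Keras sketch (the vocabulary sizes, embedding dimension, and hidden size are assumptions made for this illustration):
from tensorflow.keras.layers import Input, LSTM, Dense, Embedding
from tensorflow.keras.models import Model

src_vocab, tgt_vocab, embed_dim, hidden = 5000, 5000, 64, 128  # assumed sizes

# encoder: variable-length source sequence in, fixed-size state vector out
enc_inputs = Input(shape=(None,))
enc_embed = Embedding(src_vocab, embed_dim)(enc_inputs)
_, state_h, state_c = LSTM(hidden, return_state=True)(enc_embed)

# decoder: starts from the encoder state, predicts the translation word by word
dec_inputs = Input(shape=(None,))
dec_embed = Embedding(tgt_vocab, embed_dim)(dec_inputs)
dec_outputs = LSTM(hidden, return_sequences=True)(
    dec_embed, initial_state=[state_h, state_c])
outputs = Dense(tgt_vocab, activation='softmax')(dec_outputs)

# trained end to end on (source sentence, shifted target sentence) pairs
model = Model([enc_inputs, dec_inputs], outputs)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')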
You might find this the coolest application! Computers talking to us like humans do has been a long-standing fascination; it gives us a sense of computers being intelligent. However, most of the chatbot systems built earlier were based on a knowledge base and rules that define which response to pick from it. This made chatbots a very closed domain, and they sounded quite unnatural. But with a little tweaking of the encoder-decoder architecture used in machine translation, a chatbot can actually generate a response on its own. The encodings learn the context of the input sentences and, if the whole architecture is trained on sample queries and responses, whenever the system sees a new query, it can generate a response based on its learning. A lot of platforms, such as IBM Watson, Bottr, and rasa, are building deep learning powered tools for creating chatbots for business purposes.
This section is where you will begin the journey of being a deep learning architect. Deep learning stands on the pillar of ANNs. Our first step should be to understand how they work. In this section, we describe the biological inspiration behind the artificial neuron and the mathematical model to create an ANN. We have tried keeping the mathematics to a minimum and focused more on concepts. However, we assume you are familiar with basic algebra and calculus.
As we mentioned earlier, deep learning is inspired by the human brain. This seems a good idea indeed: to develop the intelligence of the brain inside a machine, you need the machine to mimic the brain! Now, if you are slightly aware of how a human brain learns and memorizes things so fast, you must know that this is possible due to millions of neurons developing an interconnected network, sending signals to each other, which makes up the memory. The neuron has two major components: dendrites and an axon. The dendrites act as receptors and combine all the signals that the neuron is receiving. The axon is connected to the dendrites of other neurons through synapses at its end. Once the incoming signals cross a threshold, they flow through the axon and synapses to pass the signal on to the connected neurons. The structure in which the neurons are connected to each other decides the network's capabilities. The following is a diagram of what a biological neuron might look like:
Hence, the artificial model of a neural network should be a parallel network of interconnected nodes, which take in inputs from various other nodes and pass on an output when activated. This activation phenomenon must be controlled by some sort of mathematical operation. Let's see the operations and equations next!
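As a preview, here is a minimal sketch (our own illustration) of the operation a single artificial neuron performs: combine the weighted incoming signals, add a bias, and pass the result through an activation function such as the sigmoid:
import numpy as np

def sigmoid(z):
    # squashes the combined signal into the (0, 1) range
    return 1.0 / (1.0 + np.exp(-z))

# incoming signals from three connected neurons, and the synapse weights
inputs = np.array([0.5, -1.2, 3.0])
weights = np.array([0.4, 0.7, -0.2])
bias = 0.1

# the "dendrite" combines the weighted signals; the activation then decides
# how strongly the neuron fires and passes its output onward
combined = np.dot(weights, inputs) + bias
output = sigmoid(combined)
print(output)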
