Deep Learning with TensorFlow - Giancarlo Zaccone - E-Book

Deep Learning with TensorFlow E-Book

Giancarlo Zaccone

Description

Deep learning is the step that comes after machine learning, and has more advanced implementations. Machine learning is not just for academics anymore, but is becoming a mainstream practice through wide adoption, and deep learning has taken the front seat. As a data scientist, if you want to explore data abstraction layers, this book will be your guide. This book shows how this can be exploited in the real world with complex raw data using TensorFlow 1.x.

Throughout the book, you’ll learn how to implement deep learning algorithms for machine learning systems and integrate them into your product offerings, including search, image recognition, and language processing. Additionally, you’ll learn how to analyze and improve the performance of deep learning models. This can be done by comparing algorithms against benchmarks, along with machine intelligence, to learn from the information and determine ideal behaviors within a specific context.

After finishing the book, you will be familiar with machine learning techniques, in particular the use of TensorFlow for deep learning, and will be ready to apply your knowledge to research or commercial projects.

Available formats: EPUB, MOBI

Page count: 297

Year of publication: 2017




Title Page

Deep Learning with TensorFlow
Take your machine learning knowledge to the next level with the power of TensorFlow 1.x
Giancarlo Zaccone, Md. Rezaul Karim, Ahmed Menshawy

BIRMINGHAM - MUMBAI

Copyright

Deep Learning with TensorFlow

Copyright © 2017 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: April 2017

Production reference: 1200417

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-78646-978-6

www.packtpub.com

Credits

Authors

Giancarlo Zaccone

Md. Rezaul Karim

Ahmed Menshawy

Copy Editor

Safis Editing

Reviewers

Swapnil Ashok Jadhav

Chetan Khatri

Project Coordinator

Shweta H Birwatkar

Commissioning Editor

Veena Pagare

Proofreader

Safis Editing

Acquisition Editor

Vinay Agrekar

Indexer

Aishwarya Gangawane

Content Development Editor

Amrita Norohna

Graphics

Tania Dutta

Technical Editor

Deepti Tuscano

Production Coordinator

Nilesh Mohite

About the Authors

Giancarlo Zaccone has more than ten years of experience in managing research projects in both scientific and industrial areas. He worked as a researcher at the C.N.R, the National Research Council, where he was involved in projects relating to parallel computing and scientific visualization.

Currently, he is a system and software engineer at a consulting company developing and maintaining software systems for space and defense applications.

He is the author of the following Packt volumes: Python Parallel Programming Cookbook and Getting Started with TensorFlow.

You can follow him at https://it.linkedin.com/in/giancarlozaccone.

Md. Rezaul Karim has more than 8 years of experience in research and development, with a solid knowledge of algorithms and data structures, focusing on C/C++, Java, Scala, R, and Python, and on big data technologies such as Spark, Kafka, DC/OS, Docker, Mesos, Hadoop, and MapReduce. His research interests include machine learning, deep learning, the Semantic Web, big data, and bioinformatics. He is the author of the book Large-Scale Machine Learning with Spark, Packt Publishing.

He is a Software Engineer and Researcher currently working at the Insight Centre for Data Analytics, Ireland. He is also a Ph.D. candidate at the National University of Ireland, Galway, and holds a BS and an MS degree in Computer Engineering. Before joining the Insight Centre for Data Analytics, he worked as a Lead Software Engineer with Samsung Electronics, where he worked with the distributed Samsung R&D centers across the world, including Korea, India, Vietnam, Turkey, and Bangladesh. Before that, he worked as a Research Assistant in the Database Lab at Kyung Hee University, Korea. He also worked as an R&D Engineer with BMTech21 Worldwide, Korea. Even before that, he worked as a Software Engineer with i2SoftTechnology, Dhaka, Bangladesh.

I would like to thank my parents (Mr. Razzaque and Mrs. Monoara) for their continuous encouragement and motivation throughout my life. I would also like to thank my wife (Saroar) and my kid (Shadman) for their never-ending support, which keeps me going. I would like to give special thanks to Ahmed Menshawy and Giancarlo Zaccone for authoring this book. Without their contributions, the writing would have been impossible. Overall, I would like to dedicate this book to my elder brother Md. Mamtaz Uddin (Manager, International Business, Biopharma Ltd., Bangladesh) for his endless contributions to my life.

Further, I would like to thank the acquisition, content development, and technical editors of Packt Publishing (and others who were involved in this book title) for their sincere cooperation and coordination. Additionally, without the work of numerous researchers and deep learning practitioners who shared their expertise in publications, lectures, and source code, this book might not exist at all! Finally, I appreciate the efforts of the TensorFlow community and all those who have contributed to APIs, whose work ultimately brought deep learning to the masses.

Ahmed Menshawy is a Research Engineer at Trinity College Dublin, Ireland. He has more than 5 years of working experience in the area of Machine Learning and Natural Language Processing (NLP). He holds an MSc in Advanced Computer Science. He started his career as a Teaching Assistant at the Department of Computer Science, Helwan University, Cairo, Egypt, where he taught several advanced ML and NLP courses such as Machine Learning, Image Processing, Linear Algebra, Probability and Statistics, Data Structures, and Essential Mathematics for Computer Science. Next, he joined the industrial research and development lab at IST Networks, based in Egypt, as a research scientist. He was involved in implementing a state-of-the-art system for Arabic Text to Speech and, as a result, became the main machine learning specialist in that company. Later on, he joined the Insight Centre for Data Analytics at the National University of Ireland, Galway, as a Research Assistant working on building a Predictive Analytics Platform. Finally, he joined the ADAPT Centre, Trinity College Dublin, as a Research Engineer. His main role in ADAPT is to build prototypes and applications using ML and NLP techniques based on the research that is done within ADAPT.

I would like to thank my parents, my wife Sara, and my daughter Asma for their support and patience during the writing of the book. I would also like to sincerely thank Md. Rezaul Karim and Giancarlo Zaccone for authoring this book.

Further, I would like to thank the acquisition, content development, and technical editors of Packt Publishing (and others who were involved in this book title) for their sincere cooperation and coordination. Additionally, without the work of numerous researchers and deep learning practitioners who shared their expertise in publications, lectures, and source code, this book might not exist at all! Finally, I appreciate the efforts of the TensorFlow community and all those who have contributed to APIs, whose work ultimately brought machine learning to the masses.

About the Reviewers

Swapnil Ashok Jadhav is a Machine Learning and NLP enthusiast. He enjoys learning new Machine Learning and Deep Learning technologies and solving interesting data science problems, and has around 3 years of working experience in these fields.

He is currently working at Haptik Infotech Pvt. Ltd. as a Machine Learning Scientist. Swapnil holds a Master's degree in Information Security from NIT Warangal and a Bachelor's degree from VJTI Mumbai.

You can follow him at https://www.linkedin.com/in/swapnil-jadhav-9448872a.

Chetan Khatri is a data science researcher with a total of five years of experience in research and development. He works as a technology lead at Accionlabs India. Prior to that, he worked with Nazara Games, where he led the data science practice as a principal big data engineer for the Gaming and Telecom Business. He has worked with leading data companies and a Big 4 company, where he managed the data science practice platform and one of the Big 4 company's resources teams.

He completed his master's degree in computer science, with a minor in data science, at KSKV Kachchh University, and was named “Gold Medalist” by the Governor of Gujarat for his “University 1st Rank” achievement.

He contributes to society in various ways, including giving talks to sophomore students at universities and speaking on data science, machine learning, AI, and IoT in academia and at various conferences. He has excellent correlative knowledge of both academic research and industry best practices, and he is always ready to help close the gap between industry and academia, an area in which he has a good number of achievements. He was a core co-author of various courses, such as data science, IoT, machine learning/AI, and distributed databases, in the PG/UG curricula at the University of Kachchh. As a result, the University of Kachchh became the first government university in Gujarat to introduce Python as a first programming language in its curricula, and India’s first government university to introduce data science, AI, and IoT courses in its curricula; Chetan presented the entire success story at the PyCon India 2016 conference. He is one of the founding members of PyKutch, a Python community.

Currently, he is working on intelligent IoT devices with deep learning, reinforcement learning, and distributed computing with various modern architectures. He is a committer on Apache HBase and the Spark HBase connector.

I would like to thank Prof. Devji Chhanga, Head of the Computer Science, University of Kachchh, for routing me to the correct path and for his valuable guidance in the field of data science research.

I would also like to thank Prof. Shweta Gorania for being the first to introduce me to genetic algorithms and neural networks.

Last but not least, I would like to thank my beloved family for their support.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Customer Feedback

Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1786469786.

If you'd like to join our team of regular reviewers, you can e-mail us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!

Table of Contents

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

Getting Started with Deep Learning

Introducing machine learning

Supervised learning

Unsupervised learning

Reinforcement learning

What is deep learning?

How the human brain works

Deep learning history

Problems addressed

Neural networks

The biological neuron

An artificial neuron

How does an artificial neural network learn?

The backpropagation algorithm

Weights optimization

Stochastic gradient descent

Neural network architectures

Multilayer perceptron

DNNs architectures

Convolutional Neural Networks

Restricted Boltzmann Machines

Autoencoders

Recurrent Neural Networks

Deep learning framework comparisons

Summary

First Look at TensorFlow

General overview

What's new with TensorFlow 1.x?

How does it change the way people use it?

Installing and getting started with TensorFlow

Installing TensorFlow on Linux

Which TensorFlow to install on your platform?

Requirements for running TensorFlow with GPU from NVIDIA

Step 1: Install NVIDIA CUDA

Step 2: Installing NVIDIA cuDNN v5.1+

Step 3: GPU card with CUDA compute capability 3.0+

Step 4: Installing the libcupti-dev library

Step 5: Installing Python (or Python3)

Step 6: Installing and upgrading PIP (or PIP3)

Step 7: Installing TensorFlow

How to install TensorFlow

Installing TensorFlow with native pip

Installing with virtualenv

Installing TensorFlow on Windows

Installation from source

Install on Windows

Test your TensorFlow installation

Computational graphs

Why a computational graph?

Neural networks as computational graphs

The programming model

Data model

Rank

Shape

Data types

Variables

Fetches

Feeds

TensorBoard

How does TensorBoard work?

Implementing a single input neuron

Source code for the single input neuron

Migrating to TensorFlow 1.x

How to upgrade using the script

Limitations

Upgrading code manually

Variables

Summary functions

Simplified mathematical variants

Miscellaneous changes

Summary

Using TensorFlow on a Feed-Forward Neural Network

Introducing feed-forward neural networks

Feed-forward and backpropagation

Weights and biases

Transfer functions

Classification of handwritten digits

Exploring the MNIST dataset

Softmax classifier

Visualization

How to save and restore a TensorFlow model

Saving a model

Restoring a model

Softmax source code

Softmax loader source code

Implementing a five-layer neural network

Visualization

Five-layer neural network source code

ReLU classifier

Visualization

Source code for the ReLU classifier

Dropout optimization

Visualization

Source code for dropout optimization

Summary

TensorFlow on a Convolutional Neural Network

Introducing CNNs

CNN architecture

A model for CNNs - LeNet

Building your first CNN

Source code for a handwritten classifier

Emotion recognition with CNNs

Source code for emotion classifier

Testing the model on your own image

Source code

Summary

Optimizing TensorFlow Autoencoders

Introducing autoencoders

Implementing an autoencoder

Source code for the autoencoder

Improving autoencoder robustness

Building a denoising autoencoder

Source code for the denoising autoencoder

Convolutional autoencoders

Encoder

Decoder

Source code for convolutional autoencoder

Summary

Recurrent Neural Networks

RNNs basic concepts

RNNs at work

Unfolding an RNN

The vanishing gradient problem

LSTM networks

An image classifier with RNNs

Source code for RNN image classifier

Bidirectional RNNs

Source code for the bidirectional RNN

Text prediction

Dataset

Perplexity

PTB model

Running the example

Summary

GPU Computing

GPGPU computing

GPGPU history

The CUDA architecture

GPU programming model

TensorFlow GPU set up

Update TensorFlow

TensorFlow GPU management

Programming example

Source code for GPU computation

GPU memory management

Assigning a single GPU on a multi-GPU system

Source code for GPU with soft placement

Using multiple GPUs

Source code for multiple GPUs management

Summary

Advanced TensorFlow Programming

Introducing Keras

Installation

Building deep learning models

Sentiment classification of movie reviews

Source code for the Keras movie classifier

Adding a convolutional layer

Source code for movie classifier with convolutional layer

Pretty Tensor

Chaining layers

Normal mode

Sequential mode

Branch and join

Digit classifier

Source code for digit classifier

TFLearn

TFLearn installation

Titanic survival predictor

Source code for titanic classifier

Summary

Advanced Multimedia Programming with TensorFlow

Introduction to multimedia analysis

Deep learning for Scalable Object Detection

Bottlenecks

Using the retrained model

Accelerated Linear Algebra

Key strengths of TensorFlow

Just-in-time compilation via XLA

JIT compilation

Existence and advantages of XLA

Under the hood working of XLA

Still experimental

Supported platforms

More experimental material

TensorFlow and Keras

What is Keras?

Effects of having Keras on board

Video question answering system

Not runnable code!

Deep learning on Android

TensorFlow demo examples

Getting started with Android

Architecture requirements

Prebuilt APK

Running the demo

Building with Android studio

Going deeper - Building with Bazel

Summary

Reinforcement Learning

Basic concepts of Reinforcement Learning

Q-learning algorithm

Introducing the OpenAI Gym framework

FrozenLake-v0 implementation problem

Source code for the FrozenLake-v0 problem

Q-learning with TensorFlow

Source code for the Q-learning neural network

Summary

Preface

Machine learning is concerned with algorithms that transform raw data into information and then into actionable intelligence. This makes machine learning well suited to the predictive analytics of big data; without it, keeping up with these massive streams of information would be nearly impossible. Deep learning, in turn, is a branch of machine learning based on learning multiple levels of representation. In just the last few years, powerful deep learning algorithms have been developed to recognize images, process natural language, and perform a myriad of other complex tasks. A deep learning algorithm is essentially the implementation of a complex neural network that learns through the analysis of large amounts of data. This book introduces the core concepts of deep learning using the latest version of TensorFlow, Google’s open-source framework for mathematical, machine learning, and deep learning computation, released in 2015. Since then, TensorFlow has achieved wide adoption in academia, research, and industry, and the stable version 1.0 has recently been released with a unified API. TensorFlow provides the flexibility needed to implement and research cutting-edge architectures while allowing users to focus on the structure of their models as opposed to mathematical details. Readers will learn deep learning programming techniques through hands-on model building, data collection and transformation, and more.

Enjoy reading!

What this book covers

Chapter 1, Getting Started with Deep Learning, covers some basic concepts that will be found in all the subsequent chapters. We’ll introduce machine learning and then deep learning architectures, the so-called Deep Neural Networks: these are distinguished from the more commonplace single-hidden-layer neural networks by their depth, that is, the number of node layers through which data passes in a multistep process of pattern recognition. We will provide a comparative analysis of deep learning architectures, with a chart summarizing the neural networks from which most deep learning algorithms evolved.

Chapter 2, First Look at TensorFlow, covers the main features and capabilities of TensorFlow 1.x: getting started with the computation graph, data model, programming model, and TensorBoard. In the last part of the chapter, we’ll see TensorFlow in action by implementing a single input neuron. Finally, the chapter shows how to upgrade from TensorFlow 0.x to TensorFlow 1.x.

Chapter 3, Using TensorFlow on a Feed-Forward Neural Network, provides a detailed introduction to feed-forward neural networks. The chapter is also very practical, implementing many application examples using this fundamental architecture.

Chapter 4, TensorFlow on a Convolutional Neural Network, introduces CNNs, which are the basic building blocks of a deep learning-based image classifier. We’ll develop two examples of CNNs: the first is the classic MNIST digit classification problem, while the purpose of the second is to train a network on a series of facial images to classify their emotional state.

Chapter 5, Optimizing TensorFlow Autoencoders, presents autoencoders, networks that are designed and trained to transform an input pattern so that, given a degraded or incomplete version of an input pattern, it is possible to recover the original pattern. In the chapter, we’ll see autoencoders in action with some application examples.

Chapter 6, Recurrent Neural Networks, explains this fundamental architecture, which is designed to handle data of varying lengths and is very popular for various natural language processing tasks. Text processing and image classification problems will be implemented in the course of this chapter.

Chapter 7, GPU Computing, shows the TensorFlow facilities for GPU computing. In this chapter, we’ll explore some techniques to handle GPU using TensorFlow.

Chapter 8, Advanced TensorFlow Programming, gives an overview of the following TensorFlow-based libraries: Keras, Pretty Tensor, and TFLearn. For each library, we’ll describe the main features with an application example.

Chapter 9, Advanced Multimedia Programming with TensorFlow, covers some advanced and emerging aspects of multimedia programming using TensorFlow. It discusses deep neural networks for scalable object detection and deep learning on Android using TensorFlow, with a code example. Accelerated Linear Algebra (XLA) and Keras are also discussed, with examples to make the discussion more concrete.

Chapter 10, Reinforcement Learning, covers the basic concepts of RL. We will explore the Q-learning algorithm, which is one of the most popular reinforcement learning algorithms. Furthermore, we’ll introduce the OpenAI Gym framework, a TensorFlow-compatible toolkit for developing and comparing reinforcement learning algorithms.

What you need for this book

All the examples have been implemented using Python version 2.7 (and 3.5) on 64-bit Ubuntu Linux with the TensorFlow library version 1.0.1. The source code shown in the book is Python 2.7 compatible; source code compatible with Python 3.5+ can be downloaded from the Packt repository.

You will also need the following Python modules (preferably the latest version):

Pip

Bazel

Matplotlib

NumPy

Pandas

mnist_data

For chapters 8, 9 and 10, you will need the following frameworks too:

Keras

XLA

Pretty Tensor

TFLearn

OpenAI gym

Most importantly, the GPU-enabled version of TensorFlow has several requirements, such as 64-bit Linux, Python 2.7 (or 3.3+ for Python 3), NVIDIA CUDA® 7.5 (CUDA 8.0 required for Pascal GPUs), and NVIDIA cuDNN v4.0 (minimum) or v5.1 (recommended). More specifically, the current implementation of TensorFlow supports GPU computing with NVIDIA toolkits, drivers, and software only.

Who this book is for

This book is intended for developers, data analysts, and deep learning enthusiasts who do not have much background in complex numerical computations but want to know what deep learning is. The book mainly appeals to beginners who are looking for a quick guide to gain some hands-on experience with deep learning. A rudimentary level of programming in one language is assumed, as is a basic familiarity with computer science techniques and technologies, including basic awareness of computer hardware and algorithms. Some competence in mathematics is needed, to the level of elementary linear algebra and calculus.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book: what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

Log in or register to our website using your e-mail address and password.

Hover the mouse pointer on the SUPPORT tab at the top.

Click on Code Downloads & Errata.

Enter the name of the book in the Search box.

Select the book for which you're looking to download the code files.

Choose from the drop-down menu where you purchased this book from.

Click on Code Download.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows

Zipeg / iZip / UnRarX for Mac

7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Deep-Learning-with-TensorFlow. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/DeepLearningwithTensorFlow_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at [email protected] with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.

Getting Started with Deep Learning

In this chapter, we will discuss some basic concepts of deep learning and the related architectures that will be found in all the subsequent chapters of this book. We'll start with a brief definition of machine learning, whose techniques allow the analysis of large amounts of data to automatically extract information and to make predictions about subsequent new data. Then we'll move on to deep learning, which is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data.

Finally, we'll introduce deep learning architectures, the so-called Deep Neural Networks (DNNs). These are distinguished from the more commonplace single-hidden-layer neural networks by their depth, that is, the number of node layers through which data passes in a multistep process of pattern recognition. We will provide a chart summarizing the neural networks from which most deep learning algorithms evolved.

In the final part of the chapter, we'll briefly examine and compare some deep learning frameworks across various features, such as the native language of the framework, multi-GPU support, and aspects of usability.

This chapter covers the following topics:

Introducing machine learning

What is deep learning?

Neural networks

How does an artificial neural network learn?

Neural network architectures

DNNs architectures

Deep learning framework comparison

Introducing machine learning

Machine learning is a computer science research area that deals with methods to identify and implement systems and algorithms by which a computer can learn, based on the examples given as input. The challenge of machine learning is to allow a computer to learn how to automatically recognize complex patterns and make decisions that are as smart as possible. The entire learning process requires a dataset, split as follows:

Training set: This is the knowledge base used to train the machine learning algorithm. During this phase, the parameters of the machine learning model are fitted, and its hyperparameters can be tuned according to the performance obtained.

Testing set: This is used only for evaluating the performance of the model on unseen data.
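The split between a training set and a testing set can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper name train_test_split, the 80/20 ratio, the fixed seed, and the toy dataset are all hypothetical and do not come from the book.

```python
import random

def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of `examples` and return (training set, testing set).

    Hypothetical helper for illustration; the 20% default is an assumption.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = int(len(shuffled) * test_fraction)
    # The first n_test shuffled examples are held out for testing only.
    return shuffled[n_test:], shuffled[:n_test]

# Toy dataset of (input, label) pairs, invented for this sketch.
dataset = [(x, x % 2) for x in range(10)]
train_set, test_set = train_test_split(dataset)
print(len(train_set), len(test_set))
```

The important property is that the two sets are disjoint: the testing examples never take part in training, so they can stand in for unseen data.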

Learning theory uses mathematical tools that are derived from probability theory and information theory. This allows you to assess the optimality of some methods over others.

There are basically three learning paradigms that will be briefly discussed:

Supervised learning

Unsupervised learning

Reinforcement learning

Let's take a look at them.

Supervised learning

Supervised learning is the simplest and best-known automatic learning task. It is based on a number of preclassified examples; that is, the category to which each example input belongs is known a priori. In this case, the crucial issue is the problem of generalization: after analyzing a sample of examples, which is often small, the system should produce a model that works well for all possible inputs.

The set consists of labeled data, that is, objects and their associated classes. This set of labeled examples, therefore, constitutes the training set.

Most supervised learning algorithms share one characteristic: training is performed by minimizing a particular loss or cost function, which represents the error between the output produced by the system and the desired output. This error can be computed because the training set tells us what the desired output must be.

The system then changes its internal adjustable parameters, the weights, to minimize this error function. The quality of the model is evaluated by providing a second set of labeled examples (the test set) and measuring the percentage of correctly classified examples and the percentage of misclassified examples.
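To make the idea of minimizing a loss function concrete, here is a minimal sketch of a one-input classifier trained by gradient descent. The choice of logistic regression with a cross-entropy loss, the learning rate, the number of epochs, and the tiny hand-made datasets are all assumptions made for this illustration; the text does not prescribe a specific model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_loss(examples, w, b):
    """Mean cross-entropy between the system output and the desired output."""
    total = 0.0
    for x, y in examples:
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(examples)

def train(examples, learning_rate=0.5, epochs=200):
    """Adjust the weights w and b step by step to reduce the loss."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in examples:
            error = sigmoid(w * x + b) - y  # output minus desired output
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b

def accuracy(examples, w, b):
    """Fraction of examples whose predicted class matches the label."""
    correct = sum(1 for x, y in examples
                  if (sigmoid(w * x + b) >= 0.5) == (y == 1))
    return correct / len(examples)

# Hypothetical labeled data: negative inputs belong to class 0, positive to class 1
train_set = [(-2.0, 0), (-1.5, 0), (-1.0, 0), (-0.5, 0),
             (0.5, 1), (1.0, 1), (1.5, 1), (2.0, 1)]
test_set = [(-1.2, 0), (-0.3, 0), (0.4, 1), (1.7, 1)]

initial_loss = mean_loss(train_set, 0.0, 0.0)
w, b = train(train_set)
final_loss = mean_loss(train_set, w, b)
test_accuracy = accuracy(test_set, w, b)
print(initial_loss, final_loss, test_accuracy)
```

The loss is computed on the training set, while the accuracy is measured on the separate test set, which mirrors the evaluation procedure described above.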

Supervised learning includes classification, but also the learning of functions that predict numeric values. This task is called regression. In a regression problem, the training set consists of pairs, each formed by an object and its associated numeric value. Several supervised learning algorithms have been developed for classification and regression. They can be grouped according to the formalism used to represent the classifier or the learned predictor; among them are decision trees, decision rules, neural networks, and Bayesian networks.
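As a small illustration of regression, the following sketch fits a straight line to pairs formed by an input and its numeric value, using the closed-form least-squares solution. The function name `fit_line` and the data points are hypothetical; the points were chosen to lie exactly on the line y = 2x + 1 so that the result is easy to check by hand.

```python
def fit_line(pairs):
    """Least-squares fit of y = slope * x + intercept to (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    variance = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training set: each pair is (object, numeric value)
training_pairs = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
slope, intercept = fit_line(training_pairs)

# Predict the numeric value for a new, unseen input
prediction = slope * 4.0 + intercept
print(slope, intercept, prediction)
```

Unlike a classifier, the learned predictor returns a number rather than a category.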

Unsupervised learning

In unsupervised learning, a set of inputs is supplied to the system during the training phase but, in contrast to supervised learning, the inputs are not labeled with the class to which they belong. This type of learning is important because in the human brain it is probably far more common than supervised learning.

The only objects in the domain of learning models, in this case, are the observed data inputs, which are often assumed to be independent samples of an unknown underlying probability distribution.

Unsupervised learning algorithms are used particularly in clustering problems, in which, given a collection of objects, we want to understand and show their relationships. A standard approach is to define a similarity measure between two objects, and then look for clusters of objects that are more similar to each other than to the objects in other clusters.
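One common way to realize this approach is the k-means algorithm, sketched below in plain Python. Note that k-means is only one of many clustering methods and is used here as an assumption for illustration. The squared Euclidean distance plays the role of the (dis)similarity measure, and the points and initial centroids are hypothetical.

```python
def squared_distance(a, b):
    """Dissimilarity measure: the smaller the value, the more similar the points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid  # keep the old centroid if a cluster is empty
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabeled data: two visibly separate groups of 2-D points
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5),
          (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
# Simple (assumed) initialization: one starting centroid taken from each region
centroids, clusters = k_means(points, [points[0], points[3]])
print(centroids)
print(clusters)
```

No labels are used at any point; the grouping emerges only from the distances between the objects.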

Reinforcement learning

Reinforcement learning is an artificial intelligence approach that emphasizes the learning of the system through its interactions with the environment. With reinforcement learning, the system adapts its parameters based on the feedback that the environment provides on the decisions made. For example, a system that models a chess player and uses the result of the preceding steps to improve its performance is a system that learns with reinforcement. Current research on reinforcement learning is highly interdisciplinary, and includes researchers specializing in genetic algorithms, neural networks, psychology, and control engineering.
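The feedback loop between a system and its environment can be sketched with tabular Q-learning on a toy problem. Everything here is an assumption made for illustration: the environment is a hypothetical five-cell corridor (far simpler than chess), the reward is 1 only when the goal cell is reached, and, to keep the example deterministic, the agent tries every action in every state in turn instead of exploring at random.

```python
N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
GOAL = N_STATES - 1
ACTIONS = (-1, +1)    # move left or move right

def step(state, action):
    """Environment: returns the next state and the reward (feedback)."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

def q_learning(sweeps=100, alpha=0.5, gamma=0.9):
    """Learn action values from the rewards returned by the environment."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(GOAL):              # the goal cell is terminal
            for a in ACTIONS:
                next_state, reward = step(s, a)
                if next_state == GOAL:
                    best_next = 0.0        # no future reward after the goal
                else:
                    best_next = max(q[(next_state, b)] for b in ACTIONS)
                # Move the estimate toward reward + discounted future value
                q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
    return q

q = q_learning()
# Greedy policy: in each non-terminal cell pick the action with the highest value
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
print(policy)
```

The system is never told which action is correct. It only receives a reward signal, and the learned values end up favoring the moves that lead toward the goal.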

The following figure summarizes the three types of learning, with the related problems to address:

Figure 1: Types of learning and related problems

What is deep learning?

Deep learning is a machine learning research area that is based on a particular type of learning mechanism. It is characterized by the effort to create a learning model at several levels, in which the deeper levels take as input the outputs of the previous levels, transforming them and abstracting them further. This view of learning in levels is inspired by the way the brain processes information and learns, responding to external stimuli.

Each learning level corresponds, hypothetically, to one of the different areas which make up the cerebral cortex.
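The idea of levels that take the outputs of previous levels as input can be sketched as a chain of simple layers. The layer sizes, the hand-picked weights, and the `tanh` nonlinearity below are assumptions chosen only to show the data flow; in a real network the weights would be learned from data.

```python
import math

def dense_layer(inputs, weights, biases):
    """One level: weighted sums of the inputs followed by a nonlinearity."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Pass the data through every level; each level consumes the
    output of the previous one."""
    activations = [inputs]
    for weights, biases in layers:
        activations.append(dense_layer(activations[-1], weights, biases))
    return activations

# Hypothetical network: 3 inputs -> 2 hidden units -> 1 output
layers = [
    ([[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.05]),
]
activations = forward([0.5, -1.0, 0.25], layers)
for level, values in enumerate(activations):
    print(level, values)
```

Each entry of `activations` is the representation of the same input at a different level, with later levels built entirely from earlier ones.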

How the human brain works

The visual cortex, which is responsible for solving image recognition problems, shows a sequence of sectors placed in a hierarchy. Each of these areas receives an input representation by means of signal flows that connect it to other sectors.

Each level of this hierarchy represents a different level of abstraction, with the most abstract features defined in terms of those of the lower levels. When the brain receives an input image, the processing goes through various phases, for example, the detection of edges or the perception of forms (from primitive ones to gradually more complex ones).

Just as the brain learns by trial and error and activates new neurons by learning from experience, in deep learning architectures the extraction stages, or layers, are changed based on the information received at the input.

The scheme in Figure 2 shows what has been said in the case of an image classification system. Each block gradually extracts the features of the input image, processing data already preprocessed by the previous blocks and extracting features of the image that are increasingly abstract, thus building the hierarchical representation of data that characterizes a deep-learning-based system.

More precisely, it builds the layers as follows:

Layer 1: The system starts identifying the dark and light pixels

Layer 2: The system identifies edges and shapes

Layer 3: The system learns more complex shapes and objects

Layer 4: The system learns which objects define a human face

Here is the visual representation of the process:

Figure 2: A deep learning system at work on a facial classification problem
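The first two layers of this hierarchy can be imitated with hand-written rules, as in the sketch below. This is only an analogy under stated assumptions: a deep learning system learns its feature detectors from data, whereas here the threshold of 128, the tiny grayscale image, and the function names `light_dark` and `vertical_edges` are all hypothetical and fixed by hand.

```python
def light_dark(image, threshold=128):
    """Level 1: mark each pixel as light (1) or dark (0)."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]

def vertical_edges(binary):
    """Level 2: an edge is present where horizontally adjacent pixels differ.
    This level works on the output of level 1, not on the raw pixels."""
    return [
        [abs(row[c + 1] - row[c]) for c in range(len(row) - 1)]
        for row in binary
    ]

# Hypothetical grayscale image: dark region on the left, light region on the right
image = [
    [10, 20, 30, 200, 220, 240],
    [15, 25, 35, 210, 230, 250],
    [12, 22, 32, 205, 225, 245],
]
binary = light_dark(image)
edges = vertical_edges(binary)
for row in edges:
    print(row)  # each row is [0, 0, 1, 0, 0]: one edge between dark and light
```

The second function never looks at the original pixel values, which reflects how each block in the figure builds on the representation produced by the block before it.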