Hands-On Convolutional Neural Networks with TensorFlow - Iffat Zafar - E-Book

Description

Convolutional Neural Networks (CNNs) are one of the most popular architectures used in computer vision apps. This book is an introduction to CNNs through solving real-world problems in deep learning while teaching you their implementation in the popular Python library TensorFlow. By the end of the book, you will be training CNNs in no time!

We start with an overview of popular machine learning and deep learning models, and then get you set up with a TensorFlow development environment. This environment is the basis for implementing and training deep learning models in later chapters. Then, you will use Convolutional Neural Networks to work on problems such as image classification, object detection, and semantic segmentation.

After that, you will use transfer learning to see how these models can solve other deep learning problems. You will also get a taste of implementing generative models such as autoencoders and generative adversarial networks.

Later on, you will see useful tips on machine learning best practices and troubleshooting. Finally, you will learn how to apply your models to large datasets of millions of images.

The e-book is available in the following formats:

EPUB
MOBI

Page count: 272

Publication year: 2018



Hands-On Convolutional Neural Networks with TensorFlow

 

 

Solve computer vision problems with modeling in TensorFlow and Python

Iffat Zafar 
Giounona Tzanidou
Richard Burton
Nimesh Patel
Leonardo Araujo 

BIRMINGHAM - MUMBAI

Hands-On Convolutional Neural Networks with TensorFlow

Copyright © 2018 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Amey Varangaonkar
Acquisition Editor: Siddharth Mandal
Content Development Editor: Aditi Gour
Technical Editor: Vaibhav Dwivedi
Copy Editor: Safis Editing
Project Coordinator: Hardik Bhinde
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Graphics: Jason Monteiro
Production Coordinator: Deepika Naik

First published: August 2018

Production reference: 1240818

Published by Packt Publishing Ltd., Livery Place, 35 Livery Street, Birmingham B3 2PB, UK.

ISBN 978-1-78913-033-1

www.packtpub.com

 

Contributors

About the authors

 

Iffat Zafar was born in Pakistan. She received her Ph.D. in Computer Vision and Machine Learning from Loughborough University in 2008. She then worked as a research associate at the Department of Computer Science, Loughborough University, for about 4 years. She currently works in the industry as an AI engineer, researching and developing algorithms using Machine Learning and Deep Learning for object detection and general Deep Learning tasks for edge and cloud-based applications.

Giounona Tzanidou holds a PhD in computer vision from Loughborough University, UK, where she developed algorithms for runtime surveillance video analytics. She then worked as a research fellow at Kingston University, London, on a project aimed at the prediction, detection, and understanding of terrorist interest through intelligent video surveillance. She was also engaged in teaching computer vision and embedded systems modules at Loughborough University. Now an engineer, she investigates the application of deep learning techniques for object detection and recognition in videos.

Richard Burton graduated from the University of Leicester with a master's degree in mathematics. After graduating, he worked as a research engineer at the University of Leicester for a number of years, where he developed deep learning object detection models for their industrial partners. Now, he is working as a software engineer in the industry, where he continues to research the applications of deep learning in computer vision.

Nimesh Patel graduated from the University of Leicester with an MSc in applied computation and numerical modeling. During this time, he undertook a project in collaboration with one of the University of Leicester's partners, dealing with Machine Learning for Hand Gesture recognition. Since then, he has worked in the industry, researching Machine Learning for Computer Vision related tasks, such as Depth Estimation.

Leonardo Araujo is just a regular, curious Brazilian engineer, who has worked in the industry for the past 19 years (yes, in Brazil, people work before graduation), doing HW/SW development and research on the topics of control engineering and computer vision. For the past 6 years, he has focused more on Machine Learning methods. His passions are too many to put in the book.

 


Table of Contents

Title Page

Copyright and Credits

Hands-On Convolutional Neural Networks with TensorFlow

Packt Upsell

Why subscribe?

PacktPub.com

Contributors

About the authors

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Conventions used

Get in touch

Reviews

Setup and Introduction to TensorFlow

The TensorFlow way of thinking

Setting up and installing TensorFlow

Conda environments

Checking whether your installation works

TensorFlow API levels

Eager execution

Building your first TensorFlow model

One-hot vectors

Splitting into training and test sets

Creating TensorFlow graphs

Variables

Operations

Feeding data with placeholders

Initializing variables

Training our model

Loss functions

Optimization

Evaluating a trained model

The session

Summary

Deep Learning and Convolutional Neural Networks

AI and ML

Types of ML

Old versus new ML

Artificial neural networks

Activation functions

The XOR problem

Training neural networks

Backpropagation and the chain rule

Batches

Loss functions

The optimizer and its hyperparameters

Underfitting versus overfitting

Feature scaling

Fully connected layers

A TensorFlow example for the XOR problem

Convolutional neural networks

Convolution

Input padding

Calculating the number of parameters (weights)

Calculating the number of operations

Converting convolution layers into fully connected layers

The pooling layer

1x1 Convolution

Calculating the receptive field

Building a CNN model in TensorFlow

TensorBoard

Other types of convolutions

Summary

Image Classification in TensorFlow

CNN model architecture

Cross-entropy loss (log loss)

Multi-class cross entropy loss

The train/test dataset split

Datasets

ImageNet

CIFAR

Loading CIFAR

Image classification with TensorFlow

Building the CNN graph

Learning rate scheduling

Introduction to the tf.data API

The main training loop

Model Initialization

Do not initialize all weights with zeros

Initializing with a mean zero distribution

Xavier-Bengio and the Initializer

Improving generalization by regularizing

L2 and L1 regularization

Dropout

The batch norm layer

Summary

Object Detection and Segmentation

Image classification with localization

Localization as regression

TensorFlow implementation

Other applications of localization

Object detection as classification – Sliding window

Using heuristics to guide us (R-CNN)

Problems

Fast R-CNN

Faster R-CNN

Region Proposal Network

RoI Pooling layer

Conversion from traditional CNN to Fully Convnets

Single Shot Detectors – You Only Look Once

Creating training set for Yolo object detection

Evaluating detection (Intersection Over Union)

Filtering output

Anchor Box

Testing/Predicting in Yolo

Detector Loss function (YOLO loss)

Loss Part 1

Loss Part 2

Loss Part 3

Semantic segmentation

Max Unpooling

Deconvolution layer (Transposed convolution)

The loss function

Labels

Improving results

Instance segmentation

Mask R-CNN

Summary

VGG, Inception Modules, Residuals, and MobileNets

Substituting big convolutions

Substituting the 3x3 convolution

VGGNet

Architecture

Parameters and memory calculation

Code

More about VGG

GoogLeNet

Inception module

More about GoogLeNet

Residual Networks

MobileNets

Depthwise separable convolution

Control parameters

More about MobileNets

Summary

Autoencoders, Variational Autoencoders, and Generative Adversarial Networks

Why generative models

Autoencoders

Convolutional autoencoder example

Uses and limitations of autoencoders

Variational autoencoders

Parameters to define a normal distribution

VAE loss function

Kullback-Leibler divergence

Training the VAE

The reparameterization trick

Convolutional Variational Autoencoder code

Generating new data

Generative adversarial networks

The discriminator

The generator

GAN loss function

Generator loss

Discriminator loss

Putting the losses together

Training the GAN

Deep convolutional GAN

WGAN

BEGAN

Conditional GANs

Problems with GANs

Loss interpretability

Mode collapse

Techniques to improve GANs' trainability

Minibatch discriminator

Summary

Transfer Learning

When?

How? An overview

How? Code example

TensorFlow useful elements

An autoencoder without the decoder

Selecting layers

Training only some layers

Complete source

Summary

Machine Learning Best Practices and Troubleshooting

Building Machine Learning Systems

Data Preparation

Split of Train/Development/Test set

Mismatch of the Dev and Test set

When to Change Dev/Test Set

Bias and Variance

Data Imbalance

Collecting more data

Look at your performance metric

Data synthesis/Augmentation

Resample Data

Loss function Weighting

Evaluation Metrics

Code Structure best Practice

Singleton Pattern

Recipe for CNN creation

Summary

Training at Scale

Storing data in TFRecords

Making a TFRecord

Storing encoded images

Sharding

Making efficient pipelines

Parallel calls for map transformations

Getting a batch

Prefetching

Tracing your graph

Distributed computing in TensorFlow

Model/data parallelism

Synchronous/asynchronous SGD

When data does not fit on one computer

The advantages of NoSQL systems

Installing Cassandra (Ubuntu 16.04)

The CQLSH tool

Creating databases, tables, and indexes

Doing queries in Python

Populating tables in Python

Doing backups

Scaling computation in the cloud

EC2

AMI

Storage (S3)

SageMaker

Summary

References

Chapter 1

Chapter 2

Chapter 3

Chapter 4

Chapter 5

Chapter 7

Chapter 9

Other Books You May Enjoy

Leave a review - let other readers know what you think

Preface

This book is all about giving a practical, hands-on introduction to machine learning with the aim of enabling anyone to start working in the field. We'll focus mainly on deep learning methods and how they can be used to solve important computer vision problems, but the knowledge acquired here can be transferred to many different domains. Along the way, the reader will also get to grips with the popular deep learning library, TensorFlow.

Who this book is for

Anyone interested in a practical guide to machine learning, specifically deep learning and computer vision, will particularly benefit from reading this book. In addition, the following people will also benefit:

Machine learning engineers

Data scientists

Developers interested in learning about the deep learning and computer vision fields

Students studying machine learning

What this book covers

Chapter 1, Setup and Introduction to TensorFlow, covers setting up and installing TensorFlow, along with writing a simple TensorFlow model for machine learning.

Chapter 2, Deep Learning and Convolutional Neural Networks, introduces you to machine learning and artificial intelligence, as well as artificial neural networks and how to train them. It also covers CNNs and how to use TensorFlow to train your own CNN.

Chapter 3, Image Classification in TensorFlow, talks about building CNN models and how to train them to classify the CIFAR10 dataset. It also looks at ways to help improve the quality of our trained model by talking about different methods of initialization and regularization.

Chapter 4, Object Detection and Segmentation, teaches the basics of object localization, detection, and segmentation and the most famous algorithms related to those topics.

Chapter 5, VGG, Inception Modules, Residuals, and MobileNets, introduces you to different convolutional neural network designs such as VGGNet, GoogLeNet, and MobileNet.

Chapter 6, Autoencoders, Variational Autoencoders, and Generative Adversarial Networks, introduces you to generative models, generative adversarial networks, and different types of autoencoders.

Chapter 7, Transfer Learning, covers the usage of transfer learning and implementing it in our own tasks.

Chapter 8, Machine Learning Best Practices and Troubleshooting, introduces us to preparing and splitting a dataset into subsets and performing meaningful tests. The chapter also talks about underfitting and overfitting along with the best practices for addressing them.

Chapter 9, Training at Scale, teaches you how to train TensorFlow models across multiple GPUs and machines. It also covers best practices for storing your data and feeding it to your model.

To get the most out of this book

To get the most out of this book, the reader should have some knowledge of the Python programming language and of how to install the required packages. Everything else is covered in the book in plain language. Installation instructions are given in the book and in the repository.

Download the example code files

You can download the example code files for this book from your account at www.packtpub.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

Log in or register at www.packtpub.com.

Select the SUPPORT tab.

Click on Code Downloads & Errata.

Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR/7-Zip for Windows

Zipeg/iZip/UnRarX for Mac

7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Hands-on-Convolutional-Neural-Networks-with-Tensorflow. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Get in touch

Feedback from our readers is always welcome.

General feedback: Email [email protected] and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packtpub.com.

Setup and Introduction to TensorFlow

TensorFlow is an open source software library created by Google that allows you to build and execute data flow graphs for numerical computation. In these graphs, every node represents some computation or function to be executed, and the graph edges connecting up nodes represent the data flowing between them. In TensorFlow, the data takes the form of multi-dimensional arrays called Tensors. Tensors flow around the graph, hence the name TensorFlow.

Machine learning (ML) models, such as convolutional neural networks, can be represented with these kinds of graphs, and this is exactly what TensorFlow was originally designed for.

In this chapter, we'll cover the following topics:

Understanding the TensorFlow way of thinking

Setting up and installing TensorFlow

Introduction to TensorFlow API levels

Building and training a linear classifier in TensorFlow

Evaluating a trained model

The TensorFlow way of thinking

Using TensorFlow requires a slightly different approach to programming than you might be used to, so let's explore what makes it different.

At their core, all TensorFlow programs have two main parts to them:

Construction of a computational graph called tf.Graph

Running the computational graph using tf.Session

In TensorFlow, a computational graph is a series of TensorFlow operations arranged into a graph structure. The TensorFlow graph contains two main types of components:

Operations: More commonly called ops, for short, these are the nodes in your graph. Ops carry out any computation that needs to be done in your graph. Generally, they consume and produce Tensors. Some ops are special and can have certain side effects when they run.

Tensors: These are the edges of your graph; they connect up the nodes and represent data that flows through it. Most TensorFlow ops will produce and consume these tf.Tensors.

In TensorFlow, the main object that you work with is called a Tensor. Tensors are the generalization of vectors and matrices: whereas vectors are one-dimensional and matrices are two-dimensional, a Tensor can be n-dimensional. TensorFlow represents Tensors as n-dimensional arrays of a user-specified data type, for example, float32.
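To make the idea of dimensionality concrete, here is a small sketch that uses NumPy arrays as a stand-in for Tensors. NumPy is an assumption on our part (it is not mentioned above), and the shapes, including the batch of ten 32x32 RGB images, are purely illustrative.

```python
import numpy as np

# A scalar has no dimensions, a vector has one, a matrix has two.
scalar = np.float32(3.0)
vector = np.array([1.0, 2.0, 3.0], dtype=np.float32)
matrix = np.ones((2, 3), dtype=np.float32)

# Hypothetical example: a batch of ten 32x32 images with 3 color channels
# is naturally stored as a four-dimensional array.
images = np.zeros((10, 32, 32, 3), dtype=np.float32)

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("images", images)]:
    arr = np.asarray(value)
    # ndim is the number of dimensions, shape is the size along each one.
    print(name, arr.ndim, arr.shape, arr.dtype)
```

Each array carries a number of dimensions, a shape, and a single data type, which mirrors how a Tensor is described in the paragraph above.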

TensorFlow programs work by first building a graph of computation. This graph will produce some tf.Tensor output. To evaluate this output, you must run it within a tf.Session by calling tf.Session.run on your output Tensor. When you do this, TensorFlow will execute all the parts of your graph that need to be executed in order to evaluate the tf.Tensor you asked it to run.
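The two-step pattern described above, building a graph and then running it in a session, might look roughly like the following. This is a minimal sketch rather than the book's own listing: it assumes a TensorFlow version that provides the tf.compat.v1 module (1.13 or later, including 2.x), and the op names a, b, total, and product are made up for illustration.

```python
import tensorflow as tf

# Assumption: tf.compat.v1 is available, so the same code runs on late 1.x and 2.x.
tf1 = tf.compat.v1

# Step 1: construct the computational graph. Nothing is computed yet.
graph = tf.Graph()
with graph.as_default():
    a = tf.constant(3.0, name="a")
    b = tf.constant(4.0, name="b")
    total = tf.add(a, b, name="total")
    product = tf.multiply(a, b, name="product")

# Step 2: run the graph in a session. We only ask for total, and
# product is not needed to compute it.
with tf1.Session(graph=graph) as sess:
    total_value = sess.run(total)

print(total_value)
```

Until sess.run is called, total is only a symbolic handle to a node's output; the session is what turns it into an actual number.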

Setting up and installing TensorFlow

TensorFlow is supported on the latest versions of Ubuntu and Windows. On Windows, TensorFlow supports only Python 3, whereas on Ubuntu it supports both Python 2 and 3. We recommend using Python 3, and that is what we will use in this book for code examples.

There are several ways you can install TensorFlow on your system, and here we will go through two of the main ways. The easiest is by simply using the pip package manager. Issuing the following command from a terminal will install the CPU-only version of TensorFlow to your system Python:

$ pip3 install --upgrade tensorflow

To install the version of TensorFlow that supports using your Nvidia GPU, simply type the following:

$ pip3 install --upgrade tensorflow-gpu

One of the advantages of TensorFlow is that it allows you to write code that can run directly on your GPU. With a few exceptions, almost all the major operations in TensorFlow can be run on a GPU to accelerate their execution speed. We will see that this is going to be essential in order to train the large convolutional neural networks described later in this book.

Conda environments

Using pip may be the quickest way to get started, but we find that the most convenient method is to use conda environments.

Conda environments allow you to create isolated Python environments, which are completely separate from your system Python or any other Python programs. This way, there is no chance of your TensorFlow installation messing with anything already installed, and vice versa.

To use conda, you must download Anaconda from here: https://www.anaconda.com/download/. This will include conda with it. Once you've installed Anaconda, you can install TensorFlow by entering a few commands at your Command Prompt. First, enter the following:

$ conda create -n tf_env pip python=3.5

This will create a conda environment named tf_env. The environment will use Python 3.5, and pip will also be installed for us to use.

Once this environment is created, you can start using it by entering the following on Windows:

$ activate tf_env

If you are using Ubuntu, enter the following command:

$ source activate tf_env

Your prompt should now display (tf_env) next to it. To install TensorFlow, we simply do a pip install as before, using the first command if you want CPU-only support or the second if you want GPU support:

(tf_env)$ pip install --upgrade tensorflow

(tf_env)$ pip install --upgrade tensorflow-gpu
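After installing, one quick way to see whether the active environment can find TensorFlow is a short Python check like the one below. This is only a sketch: the helper name tensorflow_available is hypothetical, and it uses the standard library's importlib to look the package up without assuming it is present.

```python
import importlib.util


def tensorflow_available():
    """Return True if the tensorflow package can be found in this environment."""
    return importlib.util.find_spec("tensorflow") is not None


if tensorflow_available():
    import tensorflow as tf
    # Printing the version confirms which installation is being picked up.
    print("TensorFlow version:", tf.__version__)
else:
    print("TensorFlow was not found; check that the right environment is active.")
```

Running this inside the activated tf_env environment should take the first branch; running it from the system Python may not, which is a handy reminder of how isolated conda environments are.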

Eager execution

At the time of this writing, Google had just introduced the eager execution API to TensorFlow. Eager execution is TensorFlow's answer to another deep learning library called PyTorch. It allows you to bypass the usual TensorFlow way of working, where you must first define a computational graph and then execute the graph to get a result. This is known as static graph computation. Instead, with eager execution, you can create so-called dynamic graphs that are defined on the fly as you run your program. This allows for a more traditional, imperative way of programming when using TensorFlow. Unfortunately, eager execution is still under development, with some features still missing, and will not be featured in this book. More information on eager execution can be found on the TensorFlow website.
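For contrast with the graph-then-session workflow, here is a small, hedged sketch of what imperative code looks like with eager execution switched on. It assumes a TensorFlow version that provides tf.compat.v1.enable_eager_execution (1.13 or later); on TensorFlow 2.x eager execution is already the default, so the call should have no visible effect there.

```python
import tensorflow as tf

# Must be called before any other TensorFlow operation is created.
tf.compat.v1.enable_eager_execution()

a = tf.constant([[1.0, 2.0],
                 [3.0, 4.0]])

# The matrix product is computed immediately; no session is involved.
b = tf.matmul(a, a)

# Eager tensors can be converted straight to NumPy arrays.
result = b.numpy()
print(result)
```

The value of b is available as soon as the line that creates it has run, which is what makes this style feel like ordinary Python.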

One-hot vectors

After shuffling, we do some preprocessing on the data labels. The labels loaded with the dataset are just a 150-length vector of integers representing which target class each datapoint belongs to, either 1, 2, or 3 in this case. When creating machine learning models, we like to transform our labels into a new form that is easier to work with by doing something called one-hot encoding.
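As a rough illustration of the transformation, the following sketch one-hot encodes a short vector of integer labels with NumPy. The function name one_hot and the four sample labels are hypothetical, and the sketch assumes zero-based labels (0, 1, 2); labels numbered from 1 would need 1 subtracted first.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn a vector of zero-based integer labels into one-hot rows."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # For each row i, set the column given by labels[i] to 1.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


labels = np.array([0, 2, 1, 2])  # made-up labels for four datapoints
encoded = one_hot(labels, num_classes=3)
print(encoded)
```

Each row contains a single 1 in the position of its class and 0 everywhere else, so a vector of 150 labels with three classes would become a 150x3 array.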

Rather than a single number being the label for each datapoint, we use