Description

Deep Learning with TensorFlow and Keras teaches you neural networks and deep learning techniques using TensorFlow (TF) and Keras. You'll learn how to write deep learning applications in the most powerful, popular, and scalable machine learning stack available.

TensorFlow 2.x focuses on simplicity and ease of use, with updates like eager execution, intuitive higher-level APIs based on Keras, and flexible model building on any platform. This book uses the latest TF 2.x features and libraries to present an overview of supervised and unsupervised machine learning models, and provides a comprehensive analysis of deep learning and reinforcement learning models using practical examples for the cloud, mobile, and large production environments.

This book also shows you how to create neural networks with TensorFlow, runs through popular algorithms (regression, convolutional neural networks (CNNs), transformers, generative adversarial networks (GANs), recurrent neural networks (RNNs), natural language processing (NLP), and graph neural networks (GNNs)), covers working example apps, and then dives into TF in production, TF mobile, and TensorFlow with AutoML.
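To give a feel for the simplicity the TF 2.x Keras API brings, here is a minimal, illustrative sketch (not taken from the book) that builds and trains a small MNIST classifier. It assumes only that TensorFlow 2.x is installed; the layer sizes and hyperparameters are arbitrary choices for illustration.

import tensorflow as tf

# Eager execution is the TF 2.x default: no sessions or static graphs needed.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

# A small fully connected network, defined with the high-level Keras Sequential API.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, epochs=5)  # train
model.evaluate(x_test, y_test)         # evaluate on held-out data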


Deep Learning with TensorFlow and Keras

Third Edition

Build and deploy supervised, unsupervised, deep, and reinforcement learning models

Amita Kapoor

Antonio Gulli

Sujit Pal

BIRMINGHAM—MUMBAI

Deep Learning with TensorFlow and Keras

Third Edition

Copyright © 2022 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Lead Senior Publishing Product Manager: Tushar Gupta

Acquisition Editor – Peer Reviews: Gaurav Gavas

Project Editor: Namrata Katare

Content Development Editor: Bhavesh Amin

Copy Editor: Safis Editing

Technical Editor: Aniket Shetty

Proofreader: Safis Editing

Indexer: Rekha Nair

Presentation Designer: Ganesh Bhadwalkar

First published: April 2017

Second edition: December 2019

Third edition: October 2022

Production reference: 1300922

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80323-291-1

www.packt.com

Foreword

Approachable, well-written, with a great balance between theory and practice. A very enjoyable introduction to machine learning for software developers.

François Chollet,

Creator of Keras

Contributors

About the authors

Amita Kapoor taught and supervised research in the field of neural networks and artificial intelligence for 20+ years as an Associate Professor at the University of Delhi. At present, she works as an independent AI consultant and provides her expertise to various organizations working in the field of AI and EdTech.

First and foremost, I am thankful to the readers of this book. It is your encouragement via messages and emails that motivates me to give my best. I am extremely thankful to my co-authors, Antonio Gulli and Sujit Pal, for sharing their vast experience with me while writing this book. I am thankful to the entire Packt team for the effort they have put in since the inception of this book, and to the reviewers who painstakingly went through the content and verified the code; their comments and suggestions helped improve the book.

Last but not least, I am thankful to my teachers for their faith in me, my colleagues at the University of Delhi for their love and support, my friends for continuously motivating me, and my family members for their patience and love.

A part of the royalties of this book is donated.

Antonio Gulli has a passion for establishing and managing global technological talent for innovation and execution. His core expertise is in cloud computing, deep learning, and search engines. Currently, Antonio works for Google in the Cloud Office of the CTO in Zurich, working on Search, Cloud Infrastructure, Sovereignty, and Conversational AI. Previously, he served as a founding member of the Office of the CTO in EMEA. Earlier, he served as the Google Warsaw Site Leader, growing the site to 450+ engineers fully focused on the cloud and managing teams working on GCE, Kubernetes, Serverless, Borg, and the Console.

So far, Antonio has been lucky enough to gain professional experience in five countries in Europe and to manage teams in six countries in EMEA and the U.S.:

In Amsterdam, as Vice President for Elsevier, a leading scientific publisher.

In London, as Principal Engineer for Bing Search, Microsoft.

In Italy and the U.K., as CTO, Europe for Ask.com.

In Poland, the U.K., and Switzerland with Google.

Antonio has co-invented a number of technologies for search, smart energy, and AI, with 11 patents issued (21 applied), and has published several books on coding and machine learning, some of which have been translated into Japanese and Chinese. He speaks Spanish, English, and Italian, and is currently learning Polish and French. Antonio is a proud father of two boys, Lorenzo, 21, and Leonardo, 16, and a little queen, Aurora, 11.

I want to thank my sons, Lorenzo and Leonardo, and my daughter, Aurora, for being the motivation behind my perseverance. Also, I want to thank my partner, Nina, for being the North Star of my life in recent years.

Sujit Pal is a Technology Research Director at Elsevier Labs, an advanced technology group within the Reed-Elsevier Group of companies. His interests include semantic search, natural language processing, machine learning, and deep learning. At Elsevier, he has worked on several initiatives involving search quality measurement and improvement, image classification and duplicate detection, and annotation and ontology development for medical and scientific corpora.

About the reviewer

Raghav Bali is a seasoned data science professional with over a decade's experience in the research and development of large-scale solutions in finance, digital experience, IT infrastructure, and healthcare for giants such as Intel, American Express, UnitedHealth Group, and Delivery Hero. He is an innovator with 7+ patents, a published author of multiple well-received books (including Hands-On Transfer Learning with Python), has peer-reviewed papers to his name, and is a regular speaker at leading conferences on topics in the areas of machine learning, deep learning, computer vision, NLP, generative models, and augmented reality.

I would like to take this opportunity to congratulate the authors on yet another amazing book. Thanks to Packt for bringing me on board as a reviewer for this book, particularly Namrata, Saby, and Tushar for all their support and assistance and for being so receptive throughout the review process. And finally, I’d like to thank my wife, family, and colleagues for all the support and patience.

Join our book’s Discord space

Join our Discord community to meet like-minded people and learn alongside more than 2000 members at: https://packt.link/keras

Contents

Preface

Who this book is for

What this book covers

Get in touch

References

Neural Network Foundations with TF

What is TensorFlow (TF)?

What is Keras?

Introduction to neural networks

Perceptron

Our first example of TensorFlow code

Multi-layer perceptron: our first example of a network

Problems in training the perceptron and solution

Activation function: sigmoid

Activation function: tanh

Activation function: ReLU

Two additional activation functions: ELU and Leaky ReLU

Activation functions

In short: what are neural networks after all?

A real example: recognizing handwritten digits

One-hot encoding (OHE)

Defining a simple neural net in TensorFlow

Running a simple TensorFlow net and establishing a baseline

Improving the simple net in TensorFlow with hidden layers

Further improving the simple net in TensorFlow with dropout

Testing different optimizers in TensorFlow

Increasing the number of epochs

Controlling the optimizer learning rate

Increasing the number of internal hidden neurons

Increasing the size of batch computation

Summarizing experiments run to recognize handwritten digits

Regularization

Adopting regularization to avoid overfitting

Understanding batch normalization

Playing with Google Colab: CPUs, GPUs, and TPUs

Sentiment analysis

Hyperparameter tuning and AutoML

Predicting output

A practical overview of backpropagation

What have we learned so far?

Toward a deep learning approach

Summary

References

Regression and Classification

What is regression?

Prediction using linear regression

Simple linear regression

Multiple linear regression

Multivariate linear regression

Neural networks for linear regression

Simple linear regression using TensorFlow Keras

Multiple and multivariate linear regression using the TensorFlow Keras API

Classification tasks and decision boundaries

Logistic regression

Logistic regression on the MNIST dataset

Summary

References

Convolutional Neural Networks

Deep convolutional neural networks

Local receptive fields

Shared weights and bias

A mathematical example

ConvNets in TensorFlow

Pooling layers

Max pooling

Average pooling

ConvNets summary

An example of DCNN: LeNet

LeNet code in TF

Understanding the power of deep learning

Recognizing CIFAR-10 images with deep learning

Improving the CIFAR-10 performance with a deeper network

Improving the CIFAR-10 performance with data augmentation

Predicting with CIFAR-10

Very deep convolutional networks for large-scale image recognition

Recognizing cats with a VGG16 network

Utilizing the tf.Keras built-in VGG16 net module

Recycling pre-built deep learning models for extracting features

Deep Inception V3 for transfer learning

Other CNN architectures

AlexNet

Residual networks

HighwayNets and DenseNets

Xception

Style transfer

Content distance

Style distance

Summary

References

Word Embeddings

Word embedding ‒ origins and fundamentals

Distributed representations

Static embeddings

Word2Vec

GloVe

Creating your own embeddings using Gensim

Exploring the embedding space with Gensim

Using word embeddings for spam detection

Getting the data

Making the data ready for use

Building the embedding matrix

Defining the spam classifier

Training and evaluating the model

Running the spam detector

Neural embeddings – not just for words

Item2Vec

node2vec

Character and subword embeddings

Dynamic embeddings

Sentence and paragraph embeddings

Language model-based embeddings

Using BERT as a feature extractor

Summary

References

Recurrent Neural Networks

The basic RNN cell

Backpropagation through time (BPTT)

Vanishing and exploding gradients

RNN cell variants

Long short-term memory (LSTM)

Gated recurrent unit (GRU)

Peephole LSTM

RNN variants

Bidirectional RNNs

Stateful RNNs

RNN topologies

Example ‒ One-to-many ‒ Learning to generate text

Example ‒ Many-to-one ‒ Sentiment analysis

Example ‒ Many-to-many ‒ POS tagging

Encoder-decoder architecture – seq2seq

Example ‒ seq2seq without attention for machine translation

Attention mechanism

Example ‒ seq2seq with attention for machine translation

Summary

References

Transformers

Architecture

Key intuitions

Positional encoding

Attention

Self-attention

Multi-head (self-)attention

How to compute attention

Encoder-decoder architecture

Residual and normalization layers

An overview of the transformer architecture

Training

Transformers’ architectures

Categories of transformers

Decoder or autoregressive

Encoder or autoencoding

Seq2seq

Multimodal

Retrieval

Attention

Full versus sparse

LSH attention

Local attention

Pretraining

Encoder pretraining

Decoder pretraining

Encoder-decoder pretraining

A taxonomy for pretraining tasks

An overview of popular and well-known models

BERT

GPT-2

GPT-3

Reformer

BigBird

Transformer-XL

XLNet

RoBERTa

ALBERT

StructBERT

T5 and MUM

ELECTRA

DeBERTa

The Evolved Transformer and MEENA

LaMDA

Switch Transformer

RETRO

Pathways and PaLM

Implementation

Transformer reference implementation: An example of translation

Hugging Face

Generating text

Autoselecting a model and autotokenization

Named entity recognition

Summarization

Fine-tuning

TFHub

Evaluation

Quality

GLUE

SuperGLUE

SQuAD

RACE

NLP-progress

Size

Larger doesn’t always mean better

Cost of serving

Optimization

Quantization

Weight pruning

Distillation

Common pitfalls: dos and don’ts

Dos

Don’ts

The future of transformers

Summary

Unsupervised Learning

Principal component analysis

PCA on the MNIST dataset

TensorFlow Embedding API

K-means clustering

K-means in TensorFlow

Variations in k-means

Self-organizing maps

Colour mapping using a SOM

Restricted Boltzmann machines

Reconstructing images using an RBM

Deep belief networks

Summary

References

Autoencoders

Introduction to autoencoders

Vanilla autoencoders

TensorFlow Keras layers ‒ defining custom layers

Reconstructing handwritten digits using an autoencoder

Sparse autoencoder

Denoising autoencoders

Clearing images using a denoising autoencoder

Stacked autoencoder

Convolutional autoencoder for removing noise from images

A TensorFlow Keras autoencoder example ‒ sentence vectors

Variational autoencoders

Summary

References

Generative Models

What is a GAN?

MNIST using GAN in TensorFlow

Deep convolutional GAN (DCGAN)

DCGAN for MNIST digits

Some interesting GAN architectures

SRGAN

CycleGAN

InfoGAN

Cool applications of GANs

CycleGAN in TensorFlow

Flow-based models for data generation

Diffusion models for data generation

Summary

References

Self-Supervised Learning

Previous work

Self-supervised learning

Self-prediction

Autoregressive generation

PixelRNN

Image GPT (iGPT)

GPT-3

XLNet

WaveNet

WaveRNN

Masked generation

BERT

Stacked denoising autoencoder

Context autoencoder

Colorization

Innate relationship prediction

Relative position

Solving jigsaw puzzles

Rotation

Hybrid self-prediction

VQ-VAE

Jukebox

DALL-E

VQ-GAN

Contrastive learning

Training objectives

Contrastive loss

Triplet loss

N-pair loss

Lifted structural loss

NCE loss

InfoNCE loss

Soft nearest neighbors loss

Instance transformation

SimCLR

Barlow Twins

BYOL

Feature clustering

DeepCluster

SwAV

InterCLR

Multiview coding

AMDIM

CMC

Multimodal models

CLIP

CodeSearchNet

Data2Vec

Pretext tasks

Summary

References

Reinforcement Learning

An introduction to RL

RL lingo

Deep reinforcement learning algorithms

How does the agent choose its actions, especially when untrained?

How does the agent maintain a balance between exploration and exploitation?

How to deal with the highly correlated input state space

How to deal with the problem of moving targets

Reinforcement learning success in recent years

Simulation environments for RL

An introduction to OpenAI Gym

Random agent playing Breakout

Wrappers in Gym

Deep Q-networks

DQN for CartPole

DQN to play a game of Atari

DQN variants

Double DQN

Dueling DQN

Rainbow

Deep deterministic policy gradient

Summary

References

Probabilistic TensorFlow

TensorFlow Probability

TensorFlow Probability distributions

Using TFP distributions

Coin flip example

Normal distribution

Bayesian networks

Handling uncertainty in predictions using TensorFlow Probability

Aleatory uncertainty

Epistemic uncertainty

Creating a synthetic dataset

Building a regression model using TensorFlow

Probabilistic neural networks for aleatory uncertainty

Accounting for the epistemic uncertainty

Summary

References

An Introduction to AutoML

What is AutoML?

Achieving AutoML

Automatic data preparation

Automatic feature engineering

Automatic model generation

AutoKeras

Google Cloud AutoML and Vertex AI

Using the Google Cloud AutoML Tables solution

Using the Google Cloud AutoML Text solution

Using the Google Cloud AutoML Video solution

Cost

Summary

References

The Math Behind Deep Learning

History

Some mathematical tools

Vectors

Derivatives and gradients everywhere

Gradient descent

Chain rule

A few differentiation rules

Matrix operations

Activation functions

Derivative of the sigmoid

Derivative of tanh

Derivative of ReLU

Backpropagation

Forward step

Backstep

Case 1: From hidden layer to output layer

Case 2: From hidden layer to hidden layer

Cross entropy and its derivative

Batch gradient descent, stochastic gradient descent, and mini-batch

Batch gradient descent

Stochastic gradient descent

Mini-batch gradient descent

Thinking about backpropagation and ConvNets

Thinking about backpropagation and RNNs

A note on TensorFlow and automatic differentiation

Summary

References

Tensor Processing Unit

C/G/T processing units

CPUs and GPUs

TPUs

Four generations of TPUs, plus Edge TPU

First generation TPU

Second generation TPU

Third generation TPU

Fourth generation TPU

Edge TPU

TPU performance

How to use TPUs with Colab

Checking whether TPUs are available

Keras MNIST TPU end-to-end training

Using pretrained TPU models

Summary

References

Other Useful Deep Learning Libraries

Hugging Face

OpenAI

OpenAI GPT-3 API

OpenAI DALL-E 2

OpenAI Codex

PyTorch

ONNX

H2O.ai

H2O AutoML

AutoML using H2O

H2O model explainability

Partial dependence plots

Variable importance heatmap

Model correlation

Summary

Graph Neural Networks

Graph basics

Graph machine learning

Graph convolutions – the intuition behind GNNs

Common graph layers

Graph convolution network

Graph attention network

GraphSAGE (sample and aggregate)

Graph isomorphism network

Common graph applications

Node classification

Graph classification

Link prediction

Graph customizations

Custom layers and message passing

Custom graph dataset

Single graphs in datasets

Set of multiple graphs in datasets

Future directions

Heterogeneous graphs

Temporal graphs

Summary

References

Machine Learning Best Practices

The need for best practices

Data best practices

Feature selection

Features and data

Augmenting textual data

Model best practices

Baseline models

Pretrained models, model APIs, and AutoML

Model evaluation and validation

Model improvements

Summary

References

TensorFlow 2 Ecosystem

TensorFlow Hub

Using pretrained models for inference

TensorFlow Datasets

Load a TFDS dataset

Building data pipelines using TFDS

TensorFlow Lite

Quantization

FlatBuffers

Mobile converter

Mobile optimized interpreter

Supported platforms

Architecture

Using TensorFlow Lite

A generic example of an application

Using GPUs and accelerators

An example of an application

Pretrained models in TensorFlow Lite

Image classification

Object detection

Pose estimation

Smart reply

Segmentation

Style transfer

Text classification

Large language models

A note about using mobile GPUs

An overview of federated learning at the edge

TensorFlow FL APIs

TensorFlow.js

Vanilla TensorFlow.js

Converting models

Pretrained models

Node.js

Summary

References

Advanced Convolutional Neural Networks

Composing CNNs for complex tasks

Classification and localization

Semantic segmentation

Object detection

Instance segmentation

Application zoos with tf.Keras and TensorFlow Hub

Keras Applications

TensorFlow Hub

Answering questions about images (visual Q&A)

Creating a DeepDream network

Inspecting what a network has learned

Video

Classifying videos with pretrained nets in six different ways

Text documents

Using a CNN for sentiment analysis

Audio and music

Dilated ConvNets, WaveNet, and NSynth

A summary of convolution operations

Basic CNNs

Dilated convolution

Transposed convolution

Separable convolution

Depthwise convolution

Depthwise separable convolution

Capsule networks

What is the problem with CNNs?

What is new with capsule networks?

Summary

References

Other Books You May Enjoy

Index


Share your thoughts

Once you’ve read Deep Learning with TensorFlow and Keras, Third Edition, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.