Deep Learning with TensorFlow and Keras teaches you neural networks and deep learning techniques using TensorFlow (TF) and Keras. You'll learn how to write deep learning applications in the most powerful, popular, and scalable machine learning stack available.
TensorFlow 2.x focuses on simplicity and ease of use, with updates like eager execution, intuitive higher-level APIs based on Keras, and flexible model building on any platform. This book uses the latest TF 2.x features and libraries to present an overview of supervised and unsupervised machine learning models, and it provides a comprehensive analysis of deep learning and reinforcement learning models using practical examples for the cloud, mobile, and large production environments.
This book also shows you how to create neural networks with TensorFlow, runs through popular algorithms (regression, convolutional neural networks (CNNs), transformers, generative adversarial networks (GANs), recurrent neural networks (RNNs), natural language processing (NLP), and graph neural networks (GNNs)), covers working example apps, and then dives into TF in production, TF mobile, and TensorFlow with AutoML.
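As a quick taste of the high-level Keras API and eager execution mentioned above, here is a minimal, illustrative sketch of training a small classifier with tf.keras; the MNIST dataset, layer sizes, and training settings are assumptions chosen for brevity, not an example taken from the book.

import tensorflow as tf

# Load MNIST and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Define a small fully connected classifier with the Keras Sequential API.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Compile, train briefly, and evaluate; eager execution is on by default in TF 2.x.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, validation_split=0.1)
model.evaluate(x_test, y_test)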
Deep Learning with TensorFlow and Keras
Third Edition
Build and deploy supervised, unsupervised, deep, and reinforcement learning models
Amita Kapoor
Antonio Gulli
Sujit Pal
BIRMINGHAM—MUMBAI
Deep Learning with TensorFlow and Keras
Third Edition
Copyright © 2022 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Lead Senior Publishing Product Manager: Tushar Gupta
Acquisition Editor – Peer Reviews: Gaurav Gavas
Project Editor: Namrata Katare
Content Development Editor: Bhavesh Amin
Copy Editor: Safis Editing
Technical Editor: Aniket Shetty
Proofreader: Safis Editing
Indexer: Rekha Nair
Presentation Designer: Ganesh Bhadwalkar
First published: April 2017
Second edition: December 2019
Third edition: October 2022
Production reference: 1300922
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80323-291-1
www.packt.com
Approachable, well-written, with a great balance between theory and practice. A very enjoyable introduction to machine learning for software developers.
François Chollet,
Creator of Keras
Amita Kapoor taught and supervised research in the field of neural networks and artificial intelligence for 20+ years as an Associate Professor at the University of Delhi. At present, she works as an independent AI consultant and provides her expertise to various organizations working in the field of AI and EdTech.
First and foremost, I am thankful to the readers of this book. It is your encouragement via messages and emails that motivates me to give my best. I am extremely thankful to my co-authors, Antonio Gulli and Sujit Pal, for sharing their vast experience with me in writing this book. I am thankful to the entire Packt team for the effort they have put in since the inception of this book, and to the reviewers who painstakingly went through the content and verified the code; their comments and suggestions helped improve the book.
Last but not least, I am thankful to my teachers for their faith in me, my colleagues at the University of Delhi for their love and support, my friends for continuously motivating me, and my family members for their patience and love.
A part of the royalties from this book is donated.
Antonio Gulli has a passion for establishing and managing global technological talent for innovation and execution. His core expertise is in cloud computing, deep learning, and search engines. Currently, Antonio works for Google in the Cloud Office of the CTO in Zurich, working on Search, Cloud Infra, Sovereignty, and Conversational AI. Previously, he served as a founding member of the Office of the CTO in EMEA. Earlier, he served as Google Warsaw Site Leader, growing the site to 450+ engineers fully focused on cloud and managing teams working on GCE, Kubernetes, Serverless, Borg, and Console.
So far, Antonio has been lucky enough to gain professional experience in five countries in Europe and to manage teams in six countries in EMEA and the U.S.:
In Amsterdam, as Vice President for Elsevier, a leading scientific publisher.
In London, as Principal Engineer for Bing Search, Microsoft.
In Italy and the U.K., as CTO, Europe for Ask.com.
In Poland, the U.K., and Switzerland with Google.
Antonio has co-invented a number of technologies for search, smart energy, and AI, with 11 patents issued (21 applied for), and has published several books on coding and machine learning that have also been translated into Japanese and Chinese. He speaks Spanish, English, and Italian and is currently learning Polish and French. Antonio is a proud father of two boys, Lorenzo, 21, and Leonardo, 16, and a little queen, Aurora, 11.
I want to thank my sons, Lorenzo and Leonardo, and my daughter, Aurora, for being the motivation behind my perseverance. Also, I want to thank my partner, Nina, for being the North Star of my life in recent years.
Sujit Pal is a Technology Research Director at Elsevier Labs, an advanced technology group within the Reed-Elsevier Group of companies. His interests include semantic search, natural language processing, machine learning, and deep learning. At Elsevier, he has worked on several initiatives involving search quality measurement and improvement, image classification and duplicate detection, and annotation and ontology development for medical and scientific corpora.
Raghav Bali is a seasoned data science professional with over a decade’s experience in the research and development of large-scale solutions in finance, digital experience, IT infrastructure, and healthcare for giants such as Intel, American Express, UnitedHealth Group, and Delivery Hero. He is an innovator with 7+ patents, a published author of multiple well-received books (including Hands-On Transfer Learning with Python) and peer-reviewed papers, and a regular speaker at leading conferences on topics in the areas of machine learning, deep learning, computer vision, NLP, generative models, and augmented reality.
I would like to take this opportunity to congratulate the authors on yet another amazing book. Thanks to Packt for bringing me on board as a reviewer for this book, particularly Namrata, Saby, and Tushar for all their support and assistance and for being so receptive throughout the review process. And finally, I’d like to thank my wife, family, and colleagues for all the support and patience.
Join our Discord community to meet like-minded people and learn alongside more than 2000 members at: https://packt.link/keras
Preface
Who this book is for
What this book covers
Get in touch
References
Neural Network Foundations with TF
What is TensorFlow (TF)?
What is Keras?
Introduction to neural networks
Perceptron
Our first example of TensorFlow code
Multi-layer perceptron: our first example of a network
Problems in training the perceptron and their solutions
Activation function: sigmoid
Activation function: tanh
Activation function: ReLU
Two additional activation functions: ELU and Leaky ReLU
Activation functions
In short: what are neural networks after all?
A real example: recognizing handwritten digits
One-hot encoding (OHE)
Defining a simple neural net in TensorFlow
Running a simple TensorFlow net and establishing a baseline
Improving the simple net in TensorFlow with hidden layers
Further improving the simple net in TensorFlow with dropout
Testing different optimizers in TensorFlow
Increasing the number of epochs
Controlling the optimizer learning rate
Increasing the number of internal hidden neurons
Increasing the size of batch computation
Summarizing experiments run to recognize handwritten digits
Regularization
Adopting regularization to avoid overfitting
Understanding batch normalization
Playing with Google Colab: CPUs, GPUs, and TPUs
Sentiment analysis
Hyperparameter tuning and AutoML
Predicting output
A practical overview of backpropagation
What have we learned so far?
Toward a deep learning approach
Summary
References
Regression and Classification
What is regression?
Prediction using linear regression
Simple linear regression
Multiple linear regression
Multivariate linear regression
Neural networks for linear regression
Simple linear regression using TensorFlow Keras
Multiple and multivariate linear regression using the TensorFlow Keras API
Classification tasks and decision boundaries
Logistic regression
Logistic regression on the MNIST dataset
Summary
References
Convolutional Neural Networks
Deep convolutional neural networks
Local receptive fields
Shared weights and bias
A mathematical example
ConvNets in TensorFlow
Pooling layers
Max pooling
Average pooling
ConvNets summary
An example of DCNN: LeNet
LeNet code in TF
Understanding the power of deep learning
Recognizing CIFAR-10 images with deep learning
Improving the CIFAR-10 performance with a deeper network
Improving the CIFAR-10 performance with data augmentation
Predicting with CIFAR-10
Very deep convolutional networks for large-scale image recognition
Recognizing cats with a VGG16 network
Utilizing the tf.Keras built-in VGG16 net module
Recycling pre-built deep learning models for extracting features
Deep Inception V3 for transfer learning
Other CNN architectures
AlexNet
Residual networks
HighwayNets and DenseNets
Xception
Style transfer
Content distance
Style distance
Summary
References
Word Embeddings
Word embedding ‒ origins and fundamentals
Distributed representations
Static embeddings
Word2Vec
GloVe
Creating your own embeddings using Gensim
Exploring the embedding space with Gensim
Using word embeddings for spam detection
Getting the data
Making the data ready for use
Building the embedding matrix
Defining the spam classifier
Training and evaluating the model
Running the spam detector
Neural embeddings – not just for words
Item2Vec
node2vec
Character and subword embeddings
Dynamic embeddings
Sentence and paragraph embeddings
Language model-based embeddings
Using BERT as a feature extractor
Summary
References
Recurrent Neural Networks
The basic RNN cell
Backpropagation through time (BPTT)
Vanishing and exploding gradients
RNN cell variants
Long short-term memory (LSTM)
Gated recurrent unit (GRU)
Peephole LSTM
RNN variants
Bidirectional RNNs
Stateful RNNs
RNN topologies
Example ‒ One-to-many – Learning to generate text
Example ‒ Many-to-one – Sentiment analysis
Example ‒ Many-to-many – POS tagging
Encoder-decoder architecture – seq2seq
Example ‒ seq2seq without attention for machine translation
Attention mechanism
Example ‒ seq2seq with attention for machine translation
Summary
References
Transformers
Architecture
Key intuitions
Positional encoding
Attention
Self-attention
Multi-head (self-)attention
How to compute attention
Encoder-decoder architecture
Residual and normalization layers
An overview of the transformer architecture
Training
Transformers’ architectures
Categories of transformers
Decoder or autoregressive
Encoder or autoencoding
Seq2seq
Multimodal
Retrieval
Attention
Full versus sparse
LSH attention
Local attention
Pretraining
Encoder pretraining
Decoder pretraining
Encoder-decoder pretraining
A taxonomy for pretraining tasks
An overview of popular and well-known models
BERT
GPT-2
GPT-3
Reformer
BigBird
Transformer-XL
XLNet
RoBERTa
ALBERT
StructBERT
T5 and MUM
ELECTRA
DeBERTa
The Evolved Transformer and MEENA
LaMDA
Switch Transformer
RETRO
Pathways and PaLM
Implementation
Transformer reference implementation: An example of translation
Hugging Face
Generating text
Autoselecting a model and autotokenization
Named entity recognition
Summarization
Fine-tuning
TFHub
Evaluation
Quality
GLUE
SuperGLUE
SQuAD
RACE
NLP-progress
Size
Larger doesn’t always mean better
Cost of serving
Optimization
Quantization
Weight pruning
Distillation
Common pitfalls: dos and don’ts
Dos
Don’ts
The future of transformers
Summary
Unsupervised Learning
Principal component analysis
PCA on the MNIST dataset
TensorFlow Embedding API
K-means clustering
K-means in TensorFlow
Variations in k-means
Self-organizing maps
Colour mapping using a SOM
Restricted Boltzmann machines
Reconstructing images using an RBM
Deep belief networks
Summary
References
Autoencoders
Introduction to autoencoders
Vanilla autoencoders
TensorFlow Keras layers ‒ defining custom layers
Reconstructing handwritten digits using an autoencoder
Sparse autoencoder
Denoising autoencoders
Clearing images using a denoising autoencoder
Stacked autoencoder
Convolutional autoencoder for removing noise from images
A TensorFlow Keras autoencoder example ‒ sentence vectors
Variational autoencoders
Summary
References
Generative Models
What is a GAN?
MNIST using GAN in TensorFlow
Deep convolutional GAN (DCGAN)
DCGAN for MNIST digits
Some interesting GAN architectures
SRGAN
CycleGAN
InfoGAN
Cool applications of GANs
CycleGAN in TensorFlow
Flow-based models for data generation
Diffusion models for data generation
Summary
References
Self-Supervised Learning
Previous work
Self-supervised learning
Self-prediction
Autoregressive generation
PixelRNN
Image GPT (iGPT)
GPT-3
XLNet
WaveNet
WaveRNN
Masked generation
BERT
Stacked denoising autoencoder
Context autoencoder
Colorization
Innate relationship prediction
Relative position
Solving jigsaw puzzles
Rotation
Hybrid self-prediction
VQ-VAE
Jukebox
DALL-E
VQ-GAN
Contrastive learning
Training objectives
Contrastive loss
Triplet loss
N-pair loss
Lifted structural loss
NCE loss
InfoNCE loss
Soft nearest neighbors loss
Instance transformation
SimCLR
Barlow Twins
BYOL
Feature clustering
DeepCluster
SwAV
InterCLR
Multiview coding
AMDIM
CMC
Multimodal models
CLIP
CodeSearchNet
Data2Vec
Pretext tasks
Summary
References
Reinforcement Learning
An introduction to RL
RL lingo
Deep reinforcement learning algorithms
How does the agent choose its actions, especially when untrained?
How does the agent maintain a balance between exploration and exploitation?
How to deal with the highly correlated input state space
How to deal with the problem of moving targets
Reinforcement success in recent years
Simulation environments for RL
An introduction to OpenAI Gym
Random agent playing Breakout
Wrappers in Gym
Deep Q-networks
DQN for CartPole
DQN to play a game of Atari
DQN variants
Double DQN
Dueling DQN
Rainbow
Deep deterministic policy gradient
Summary
References
Probabilistic TensorFlow
TensorFlow Probability
TensorFlow Probability distributions
Using TFP distributions
Coin flip example
Normal distribution
Bayesian networks
Handling uncertainty in predictions using TensorFlow Probability
Aleatory uncertainty
Epistemic uncertainty
Creating a synthetic dataset
Building a regression model using TensorFlow
Probabilistic neural networks for aleatory uncertainty
Accounting for the epistemic uncertainty
Summary
References
An Introduction to AutoML
What is AutoML?
Achieving AutoML
Automatic data preparation
Automatic feature engineering
Automatic model generation
AutoKeras
Google Cloud AutoML and Vertex AI
Using the Google Cloud AutoML Tables solution
Using the Google Cloud AutoML Text solution
Using the Google Cloud AutoML Video solution
Cost
Summary
References
The Math Behind Deep Learning
History
Some mathematical tools
Vectors
Derivatives and gradients everywhere
Gradient descent
Chain rule
A few differentiation rules
Matrix operations
Activation functions
Derivative of the sigmoid
Derivative of tanh
Derivative of ReLU
Backpropagation
Forward step
Backstep
Case 1: From hidden layer to output layer
Case 2: From hidden layer to hidden layer
Cross entropy and its derivative
Batch gradient descent, stochastic gradient descent, and mini-batch
Batch gradient descent
Stochastic gradient descent
Mini-batch gradient descent
Thinking about backpropagation and ConvNets
Thinking about backpropagation and RNNs
A note on TensorFlow and automatic differentiation
Summary
References
Tensor Processing Unit
C/G/T processing units
CPUs and GPUs
TPUs
Four generations of TPUs, plus Edge TPU
First generation TPU
Second generation TPU
Third generation TPU
Fourth generation TPUs
Edge TPU
TPU performance
How to use TPUs with Colab
Checking whether TPUs are available
Keras MNIST TPU end-to-end training
Using pretrained TPU models
Summary
References
Other Useful Deep Learning Libraries
Hugging Face
OpenAI
OpenAI GPT-3 API
OpenAI DALL-E 2
OpenAI Codex
PyTorch
ONNX
H2O.ai
H2O AutoML
AutoML using H2O
H2O model explainability
Partial dependence plots
Variable importance heatmap
Model correlation
Summary
Graph Neural Networks
Graph basics
Graph machine learning
Graph convolutions – the intuition behind GNNs
Common graph layers
Graph convolution network
Graph attention network
GraphSAGE (sample and aggregate)
Graph isomorphism network
Common graph applications
Node classification
Graph classification
Link prediction
Graph customizations
Custom layers and message passing
Custom graph dataset
Single graphs in datasets
Set of multiple graphs in datasets
Future directions
Heterogeneous graphs
Temporal graphs
Summary
References
Machine Learning Best Practices
The need for best practices
Data best practices
Feature selection
Features and data
Augmenting textual data
Model best practices
Baseline models
Pretrained models, model APIs, and AutoML
Model evaluation and validation
Model improvements
Summary
References
TensorFlow 2 Ecosystem
TensorFlow Hub
Using pretrained models for inference
TensorFlow Datasets
Load a TFDS dataset
Building data pipelines using TFDS
TensorFlow Lite
Quantization
FlatBuffers
Mobile converter
Mobile optimized interpreter
Supported platforms
Architecture
Using TensorFlow Lite
A generic example of an application
Using GPUs and accelerators
An example of an application
Pretrained models in TensorFlow Lite
Image classification
Object detection
Pose estimation
Smart reply
Segmentation
Style transfer
Text classification
Large language models
A note about using mobile GPUs
An overview of federated learning at the edge
TensorFlow FL APIs
TensorFlow.js
Vanilla TensorFlow.js
Converting models
Pretrained models
Node.js
Summary
References
Advanced Convolutional Neural Networks
Composing CNNs for complex tasks
Classification and localization
Semantic segmentation
Object detection
Instance segmentation
Application zoos with tf.Keras and TensorFlow Hub
Keras Applications
TensorFlow Hub
Answering questions about images (visual Q&A)
Creating a DeepDream network
Inspecting what a network has learned
Video
Classifying videos with pretrained nets in six different ways
Text documents
Using a CNN for sentiment analysis
Audio and music
Dilated ConvNets, WaveNet, and NSynth
A summary of convolution operations
Basic CNNs
Dilated convolution
Transposed convolution
Separable convolution
Depthwise convolution
Depthwise separable convolution
Capsule networks
What is the problem with CNNs?
What is new with capsule networks?
Summary
References
Other Books You May Enjoy
Index
Once you’ve read Deep Learning with TensorFlow and Keras, Third Edition, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.