Generative AI with Python and TensorFlow 2

Joseph Babcock

Description

Machines are excelling at creative human skills such as painting, writing, and composing music. Could you be more creative than generative AI?

In this book, you’ll explore the evolution of generative models, from restricted Boltzmann machines and deep belief networks to VAEs and GANs. You’ll learn how to implement models yourself in TensorFlow and get to grips with the latest research on deep neural networks.

There’s been an explosion in potential use cases for generative models. You’ll look at OpenAI’s news generator, deepfakes, and training deep learning agents to navigate a simulated environment.

Recreate the code that’s under the hood and uncover surprising links between text, image, and music generation.

The e-book can be read in Legimi apps or in any app that supports the following formats:

EPUB
MOBI

Page count: 583

Publication year: 2021




Generative AI with Python and TensorFlow 2

Create images, text, and music with VAEs, GANs, LSTMs, Transformer models

Joseph Babcock

Raghav Bali

BIRMINGHAM - MUMBAI

Generative AI with Python and TensorFlow 2

Copyright © 2021 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Producer: Tushar Gupta

Acquisition Editor – Peer Reviews: Suresh Jain, Saby D'silva

Content Development Editors: Lucy Wan, Joanne Lovell

Technical Editor: Gaurav Gavas

Project Editor: Janice Gonsalves

Copy Editor: Safis Editing

Proofreader: Safis Editing

Indexer: Pratik Shirodkar

Presentation Designer: Pranit Padwal

First published: April 2021

Production reference: 3070721

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-80020-088-3

www.packt.com

Contributors

About the authors

Joseph Babcock has over a decade of experience in machine learning and developing big data solutions. He applied predictive modeling to drug discovery and genomics during his doctoral studies in neurosciences, and has since worked in and led data science teams in the streaming media, e-commerce, and financial services industries. He previously authored Mastering Predictive Analytics with Python and Python: Advanced Predictive Analytics, both with Packt.

I would like to acknowledge my family for their support during the composition of this book.

Raghav Bali is a data scientist and a published author. He has led advanced analytics initiatives working with several Fortune 500 companies like Optum (UHG), Intel, and American Express. His work involves research and development of enterprise solutions leveraging machine learning and deep learning. He holds a Master of Technology degree (gold medalist) from IIIT Bangalore, with specializations in machine learning and software engineering. Raghav has authored several books on R, Python, machine learning, and deep learning, including Hands-On Transfer Learning with Python.

To my wife, parents, and brother, without whom this would not have been possible. To all the researchers whose work continues to inspire me to learn. And to my co-author, reviewers, and the Packt team (especially Tushar, Janice, and Lucy) for their hard work in transforming our work into this amazing book.

About the reviewers

Hao-Wen Dong is currently a PhD student in Computer Science and Engineering at the University of California, San Diego, working with Prof. Julian McAuley and Prof. Taylor Berg-Kirkpatrick. His research interests lie at the intersection of music and machine learning, with a recent focus on music generation. He is interested in building tools that could lower the barrier of entry for music composition and potentially lead to the democratization of music creation. Previously, he did a research internship in the R&D Division at Yamaha Corporation. Before that, he was a research assistant in the Music and AI Lab directed by Dr. Yi-Hsuan Yang at Academia Sinica. He received his bachelor's degree in Electrical Engineering from National Taiwan University.

Gokula Krishnan Santhanam is a Python developer who lives in Zurich, Switzerland. He has been working with deep learning techniques for more than 5 years. He has worked on problems in generative modeling, adversarial attacks, interpretability, and predictive maintenance while working at IBM Research and interning at Google. He finished his master's in Computer Science at ETH Zurich and his bachelor's at BITS Pilani. When he's not working, you can find him enjoying board games with his wife or hiking in the beautiful Alps.

I would like to thank my wife, Sadhana, for her continuous help and support and for always being there when I need her.

Contents

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

An Introduction to Generative AI: "Drawing" Data from Models

Applications of AI

Discriminative and generative models

Implementing generative models

The rules of probability

Discriminative and generative modeling and Bayes' theorem

Why use generative models?

The promise of deep learning

Building a better digit classifier

Generating images

Style transfer and image transformation

Fake news and chatbots

Sound composition

The rules of the game

Unique challenges of generative models

Summary

References

Setting Up a TensorFlow Lab

Deep neural network development and TensorFlow

TensorFlow 2.0

VSCode

Docker: A lightweight virtualization solution

Important Docker commands and syntax

Connecting Docker containers with docker-compose

Kubernetes: Robust management of multi-container applications

Important Kubernetes commands

Kustomize for configuration management

Kubeflow: an end-to-end machine learning lab

Running Kubeflow locally with MiniKF

Installing Kubeflow in AWS

Installing Kubeflow in GCP

Installing Kubeflow on Azure

Installing Kubeflow using Terraform

A brief tour of Kubeflow's components

Kubeflow notebook servers

Kubeflow pipelines

Using Kubeflow Katib to optimize model hyperparameters

Summary

References

Building Blocks of Deep Neural Networks

Perceptrons – a brain in a function

From tissues to TLUs

From TLUs to tuning perceptrons

Multi-layer perceptrons and backpropagation

Backpropagation in practice

The shortfalls of backpropagation

Varieties of networks: Convolution and recursive

Networks for seeing: Convolutional architectures

Early CNNs

AlexNet and other CNN innovations

AlexNet architecture

Networks for sequence data

RNNs and LSTMs

Building a better optimizer

Gradient descent to ADAM

Xavier initialization

Summary

References

Teaching Networks to Generate Digits

The MNIST database

Retrieving and loading the MNIST dataset in TensorFlow

Restricted Boltzmann Machines: generating pixels with statistical mechanics

Hopfield networks and energy equations for neural networks

Modeling data with uncertainty with Restricted Boltzmann Machines

Contrastive divergence: Approximating a gradient

Stacking Restricted Boltzmann Machines to generate images: the Deep Belief Network

Creating an RBM using the TensorFlow Keras layers API

Creating a DBN with the Keras Model API

Summary

References

Painting Pictures with Neural Networks Using VAEs

Creating separable encodings of images

The variational objective

The reparameterization trick

Inverse Autoregressive Flow

Importing CIFAR

Creating the network from TensorFlow 2

Summary

References

Image Generation with GANs

The taxonomy of generative models

Generative adversarial networks

The discriminator model

The generator model

Training GANs

Non-saturating generator cost

Maximum likelihood game

Vanilla GAN

Improved GANs

Deep Convolutional GAN

Vector arithmetic

Conditional GAN

Wasserstein GAN

Progressive GAN

The overall method

Progressive growth-smooth fade-in

Minibatch standard deviation

Equalized learning rate

Pixelwise normalization

TensorFlow Hub implementation

Challenges

Training instability

Mode collapse

Uninformative loss and evaluation metrics

Summary

References

Style Transfer with GANs

Paired style transfer using pix2pix GAN

The U-Net generator

The Patch-GAN discriminator

Loss

Training pix2pix

Use cases

Unpaired style transfer using CycleGAN

Overall setup for CycleGAN

Adversarial loss

Cycle loss

Identity loss

Overall loss

Hands-on: Unpaired style transfer with CycleGAN

Generator setup

Discriminator setup

GAN setup

The training loop

Related works

DiscoGAN

DualGAN

Summary

References

Deepfakes with GANs

Deepfakes overview

Modes of operation

Replacement

Re-enactment

Editing

Key feature set

Facial Action Coding System (FACS)

3D Morphable Model

Facial landmarks

Facial landmark detection using OpenCV

Facial landmark detection using dlib

Facial landmark detection using MTCNN

High-level workflow

Common architectures

Encoder-Decoder (ED)

Generative Adversarial Networks (GANs)

Replacement using autoencoders

Task definition

Dataset preparation

Autoencoder architecture

Training our own face swapper

Results and limitations

Re-enactment using pix2pix

Dataset preparation

Pix2pix GAN setup and training

Results and limitations

Challenges

Ethical issues

Technical challenges

Generalization

Occlusions

Temporal issues

Off-the-shelf implementations

Summary

References

The Rise of Methods for Text Generation

Representing text

Bag of Words

Distributed representation

Word2vec

GloVe

FastText

Text generation and the magic of LSTMs

Language modeling

Hands-on: Character-level language model

Decoding strategies

Greedy decoding

Beam search

Sampling

Hands-on: Decoding strategies

LSTM variants and convolutions for text

Stacked LSTMs

Bidirectional LSTMs

Convolutions and text

Summary

References

NLP 2.0: Using Transformers to Generate Text

Attention

Contextual embeddings

Self-attention

Transformers

Overall architecture

Multi-head self-attention

Positional encodings

BERT-ology

GPT 1, 2, 3…

Generative pre-training: GPT

GPT-2

Hands-on with GPT-2

Mammoth GPT-3

Summary

References

Composing Music with Generative Models

Getting started with music generation

Representing music

Music generation using LSTMs

Dataset preparation

LSTM model for music generation

Music generation using GANs

Generator network

Discriminator network

Training and results

MuseGAN – polyphonic music generation

Jamming model

Composer model

Hybrid model

Temporal model

MuseGAN

Generators

Critic

Training and results

Summary

References

Play Video Games with Generative AI: GAIL

Reinforcement learning: Actions, agents, spaces, policies, and rewards

Deep Q-learning

Inverse reinforcement learning: Learning from experts

Adversarial learning and imitation

Running GAIL on PyBullet Gym

The agent: Actor-Critic network

The discriminator

Training and results

Summary

References

Emerging Applications in Generative AI

Finding new drugs with generative models

Searching chemical space with generative molecular graph networks

Folding proteins with generative models

Solving partial differential equations with generative modeling

Few shot learning for creating videos from images

Generating recipes with deep learning

Summary

References

Why subscribe?

Other Books You May Enjoy

Index

2

Setting Up a TensorFlow Lab

Now that you have seen all the amazing applications of generative models in Chapter 1, An Introduction to Generative AI: "Drawing" Data from Models, you might be wondering how to get started implementing projects that use these kinds of algorithms. In this chapter, we will walk through a number of tools that we will use throughout the rest of the book to implement the deep neural networks used in various generative AI models. Our primary tool is the TensorFlow 2.0 framework, developed by Google1 2; however, we will also use a number of additional resources to make the implementation process easier (summarized in Table 2.1).

We can broadly categorize these tools:

Resources for replicable dependency management (Docker, Anaconda)

Exploratory tools for data munging and algorithm hacking (Jupyter)

Utilities to deploy these resources to the cloud and manage their lifecycle (Kubernetes, Kubeflow, Terraform)

Tool         Project site                      Use
Docker       https://www.docker.com/           Application runtime dependency encapsulation
Anaconda     https://www.anaconda.com/         Python language package management
Jupyter      https://jupyter.org/              Interactive Python runtime and plotting / data exploration tool
Kubernetes   https://kubernetes.io/            Docker container orchestration and resource management
Kubeflow     https://www.kubeflow.org/         Machine learning workflow engine developed on Kubernetes
Terraform    https://www.terraform.io/         Infrastructure scripting language for configurable and consistent deployments of Kubeflow and Kubernetes
VSCode       https://code.visualstudio.com/    Integrated development environment (IDE)

Table 2.1: Tech stack for generative adversarial model development

On our journey to bring our code from our laptops to the cloud in this chapter, we will first describe some background on how TensorFlow works when running locally. We will then describe a wide array of software tools that will make it easier to run an end-to-end TensorFlow lab locally or in the cloud, such as notebooks, containers, and cluster managers. Finally, we will walk through a simple practical example of setting up a reproducible research environment, running local and distributed training, and recording our results. We will also examine how we might parallelize TensorFlow across multiple CPU/GPU units within a machine (vertical scaling) and multiple machines in the cloud (horizontal scaling) to accelerate training. By the end of this chapter, we will be ready to extend this laboratory framework to implement projects using various generative AI models.
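As a small preview of what this scaling looks like in code, the following is a minimal sketch of single-machine, multi-device parallelism with tf.distribute; it assumes a TensorFlow 2 installation, and the toy model and its hyperparameters are illustrative only, not a model from this book:

import tensorflow as tf

# Vertical scaling: MirroredStrategy replicates the model across all
# GPUs visible on this machine (falling back to CPU if none are found)
# and averages gradients between the replicas on each training step.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Variables created inside the scope are mirrored on every device
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

# Horizontal scaling across machines follows the same pattern with
# tf.distribute.MultiWorkerMirroredStrategy, configured per worker
# through the TF_CONFIG environment variable.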

First, let's start by diving more into the details of TensorFlow, the library we will use to develop models throughout the rest of this book. What problem does TensorFlow solve for neural network model development? What approaches does it use? How has it evolved over the years? To answer these questions, let us review some of the history behind deep neural network libraries that led to the development of TensorFlow.

Deep neural network development and TensorFlow

As we will see in Chapter 3, Building Blocks of Deep Neural Networks, a deep neural network in essence consists of matrix operations (addition, subtraction, multiplication), nonlinear transformations, and gradient-based updates computed using the derivatives of these components.
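To make this concrete, here is a minimal sketch in TensorFlow 2 of exactly these three ingredients: a matrix operation, a nonlinear transformation, and a derivative-based update. The shapes and learning rate are arbitrary illustrations, not values from the book:

import tensorflow as tf

# A single dense layer expressed directly as matrix operations:
# y = relu(x @ W + b), followed by one gradient-based update
x = tf.random.normal((4, 3))        # batch of 4 inputs with 3 features
y_true = tf.random.normal((4, 2))   # arbitrary regression targets

W = tf.Variable(tf.random.normal((3, 2)))
b = tf.Variable(tf.zeros((2,)))

with tf.GradientTape() as tape:
    y_pred = tf.nn.relu(tf.matmul(x, W) + b)          # matrix multiply + nonlinearity
    loss = tf.reduce_mean(tf.square(y_pred - y_true)) # mean squared error

# Derivatives of the loss with respect to each component...
grad_W, grad_b = tape.gradient(loss, [W, b])

# ...drive the gradient-based update
learning_rate = 0.1
W.assign_sub(learning_rate * grad_W)
b.assign_sub(learning_rate * grad_b)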

In academia, researchers have historically used efficient prototyping tools such as MATLAB3 to run models and prepare analyses. While this approach allows for rapid experimentation, it lacks elements of industrial software development, such as object-oriented (OO) development, that enable reproducibility and the clean software abstractions needed for tools to be adopted by large organizations. These tools also had difficulty scaling to large datasets and could carry heavy licensing fees for such industrial use cases. Prior to 2006, this type of computational tooling was largely sufficient for most use cases. However, as the datasets tackled with deep neural network algorithms grew, groundbreaking results were achieved, such as:

Image classification on the ImageNet dataset4

Large-scale unsupervised discovery of image patterns in YouTube videos5

The creation of artificial agents capable of playing Atari video games and the Asian board game Go with human-like skill6 7

State-of-the-art natural language understanding via the BERT model developed by Google8

The models developed in these studies exploded in complexity along with the size of the datasets they were applied to (see Table 2.2 to get a sense of the immense scale of some of these models). As industrial use cases required robust and scalable frameworks to develop and deploy new neural networks, several academic groups and large technology companies invested in the development of generic toolkits for the implementation of deep learning models. These software libraries codified common patterns into reusable abstractions, allowing even complex models to be embodied in relatively simple experimental scripts.
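To illustrate how far these abstractions compress an experimental script, here is a minimal sketch of an MNIST-style digit classifier in the Keras API that ships with TensorFlow 2; the layer sizes and optimizer are arbitrary choices, not a model from this book:

import tensorflow as tf

# A complete multi-layer classifier, defined and ready to train in a
# handful of lines thanks to the library's reusable abstractions
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),    # 28x28 image -> 784-vector
    tf.keras.layers.Dense(128, activation="relu"),    # hidden nonlinearity
    tf.keras.layers.Dense(10, activation="softmax"),  # one output per digit class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()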

Model Name    Year    # Parameters
AlexNet       2012    61M
YouTube CNN   2012    1B
Inception     2014    5M
VGG-16        2014    138M
BERT          2018    340M
GPT-3         2020    175B