Google's TensorFlow is a game changer in the world of machine learning. It has made machine learning faster, simpler, and more accessible than ever before. This book will teach you how to easily get started with machine learning using the power of Python and TensorFlow 1.x.
Firstly, you’ll cover the basic installation procedure and explore the capabilities of TensorFlow 1.x. This is followed by training and running the first classifier, and coverage of the unique features of the library including data flow graphs, training, and the visualization of performance with TensorBoard, all within an example-rich context using problems from multiple industries. You’ll be able to further explore text and image analysis, and be introduced to CNN models and their setup in TensorFlow 1.x. Next, you’ll implement a complete real-life production system from training to serving a deep learning model. As you advance you’ll learn about Amazon Web Services (AWS) and create a deep neural network to solve a video action recognition problem. Lastly, you’ll convert the Caffe model to TensorFlow and be introduced to the high-level TensorFlow library, TensorFlow-Slim.
By the end of this book, you will be geared up to take on any challenges of implementing TensorFlow 1.x in your machine learning environment.
Page count: 249
Year of publication: 2017
BIRMINGHAM - MUMBAI
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: November 2017
Production reference: 1171117
ISBN 978-1-78646-296-1
www.packtpub.com
Authors: Quan Hua, Shams Ul Azeem, Saif Ahmed
Copy Editor: Zainab Bootwala
Reviewer: Nathan Lintz
Project Coordinator: Prajakta Naik
Commissioning Editor: Kunal Parikh
Proofreader: Safis Editing
Acquisition Editor: Tushar Gupta
Indexer: Rekha Nair
Content Development Editor: Siddhi Chavan
Graphics: Jason Monteiro
Technical Editor: Mehul Singh
Production Coordinator: Deepika Naik
Quan Hua is a Computer Vision and Machine Learning Engineer at BodiData, a data platform for body measurements, where he focuses on developing computer vision and machine learning applications for a handheld technology capable of acquiring a body avatar while a person is fully clothed. He earned a bachelor of science degree from the University of Science, Vietnam, specializing in Computer Vision. He has been working in the field of computer vision and machine learning for about 3 years at start-ups.
Quan has been writing for Packt since 2015 for a Computer Vision book, OpenCV 3 Blueprints.
Shams Ul Azeem is an undergraduate in electrical engineering from NUST Islamabad, Pakistan. He has a great interest in the computer science field, and he started his journey with Android development. Now, he’s pursuing his career in Machine Learning, particularly in deep learning, by doing medical-related freelancing projects with different companies. He was also a member of the RISE lab, NUST, and he has a publication credit at the IEEE International Conference, ROBIO as a co-author of Designing of motions for humanoid goalkeeper robots.
Saif Ahmed is an accomplished quantitative analyst and data scientist with 15 years of industry experience. His career started in management consulting at Accenture and led him to quantitative and senior management roles at Goldman Sachs and AIG Investments. Most recently, he co-founded and runs a start-up focused on applying Deep Learning to automating medical imaging. He obtained his bachelor's degree in computer science from Cornell University and is currently pursuing a graduate degree in data science at U.C. Berkeley.
Nathan Lintz is a Machine Learning researcher, focusing on text classification. When he began with Machine Learning, he primarily used Theano but quickly switched to TensorFlow when it was released. TensorFlow has greatly reduced the time it takes to build Machine Learning systems thanks to its intuitive and powerful neural network utilities.
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www.packtpub.com/mapt
Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1787123421. If you'd like to join our team of regular reviewers, you can email us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
Getting Started with TensorFlow
Current use
Installing TensorFlow
Ubuntu installation
macOS installation
Windows installation
Virtual machine setup
Testing the installation
Summary
Your First Classifier
The key parts
Obtaining training data
Downloading training data
Understanding classes
Automating the training data setup
Additional setup
Converting images to matrices
Logical stopping points
The machine learning briefcase
Training day
Saving the model for ongoing use
Why hide the test set?
Using the classifier
Deep diving into the network
Skills learned
Summary
The TensorFlow Toolbox
A quick preview
Installing TensorBoard
Incorporating hooks into our code
Handwritten digits
AlexNet
Automating runs
Summary
Cats and Dogs
Revisiting notMNIST
Program configurations
Understanding convolutional networks
Revisiting configurations
Constructing the convolutional network
Fulfilment
Training day
Actual cats and dogs
Saving the model for ongoing use
Using the classifier
Skills learned
Summary
Sequence to Sequence Models-Parlez-vous Français?
A quick preview
Drinking from the firehose
Training day
Summary
Finding Meaning
Additional setup
Skills learned
Summary
Making Money with Machine Learning
Inputs and approaches
Getting the data
Approaching the problem
Downloading and modifying data
Viewing the data
Extracting features
Preparing for training and testing
Building the network
Training
Testing
Taking it further
Practical considerations for the individual
Skills learned
Summary
The Doctor Will See You Now
The challenge
The data
The pipeline
Understanding the pipeline
Preparing the dataset
Explaining the data preparation
Training routine
Validation routine
Visualize outputs with TensorBoard
Inception network
Going further
Other medical data challenges
The ISBI grand challenge
Reading medical data
Skills Learned
Summary
Cruise Control - Automation
An overview of the system
Setting up the project
Loading a pre-trained model to speed up the training
Testing the pre-trained model
Training the model for our dataset
Introduction to the Oxford-IIIT Pet dataset
Dataset Statistics
Downloading the dataset
Preparing the data
Setting up input pipelines for training and testing
Defining the model
Defining training operations
Performing the training process
Exporting the model for production
Serving the model in production
Setting up TensorFlow Serving
Running and testing the model
Designing the web server
Testing the system
Automatic fine-tune in production
Loading the user-labeled data
Performing a fine-tune on the model
Setting up cronjob to run every day
Summary
Go Live and Go Big
Quick look at Amazon Web Services
P2 instances
G2 instances
F1 instances
Pricing
Overview of the application
Datasets
Preparing the dataset and input pipeline
Pre-processing the video for training
Input pipeline with RandomShuffleQueue
Neural network architecture
Training routine with single GPU
Training routine with multiple GPU
Overview of Mechanical Turk
Summary
Going Further - 21 Problems
Dataset and challenges
Problem 1 - ImageNet dataset
Problem 2 - COCO dataset
Problem 3 - Open Images dataset
Problem 4 - YouTube-8M dataset
Problem 5 - AudioSet dataset
Problem 6 - LSUN challenge
Problem 7 - MegaFace dataset
Problem 8 - Data Science Bowl 2017 challenge
Problem 9 - StarCraft Game dataset
TensorFlow-based Projects
Problem 10 - Human Pose Estimation
Problem 11 - Object Detection - YOLO
Problem 12 - Object Detection - Faster RCNN
Problem 13 - Person Detection - tensorbox
Problem 14 - Magenta
Problem 15 - Wavenet
Problem 16 - Deep Speech
Interesting Projects
Problem 17 - Interactive Deep Colorization - iDeepColor
Problem 18 - Tiny face detector
Problem 19 - People search
Problem 20 - Face Recognition - MobileID
Problem 21 - Question answering - DrQA
Caffe to TensorFlow
TensorFlow-Slim
Summary
Advanced Installation
Installation
Installing Nvidia driver
Installing the CUDA toolkit
Installing cuDNN
Installing TensorFlow
Verifying TensorFlow with GPU support
Using TensorFlow with Anaconda
Summary
Machine Learning has revolutionized the modern world. Many machine learning algorithms, especially deep learning, have been used worldwide, ranging from mobile devices to cloud-based services. TensorFlow is one of the leading open source software libraries and helps you build, train, and deploy your Machine Learning system for a variety of applications. This practical book is designed to bring you the best of TensorFlow and help you build real-world Machine Learning systems.
By the end of this book, you will have a deep understanding of TensorFlow and be able to apply Machine Learning techniques to your application.
Chapter 1, Getting Started with TensorFlow, shows how to install TensorFlow and get started on Ubuntu, macOS, and Windows.
Chapter 2, Your First Classifier, guides you through your first journey with a handwriting recognizer.
Chapter 3, The TensorFlow Toolbox, gives you an overview of the tools that TensorFlow provides to work more effectively and easily.
Chapter 4, Cats and Dogs, teaches you how to build an image classifier using Convolutional Neural Networks in TensorFlow.
Chapter 5, Sequence to Sequence Models—Parlez-vous Français?, discusses how to build an English to French translator using sequence-to-sequence models.
Chapter 6, Finding Meaning, explores ways to find meaning in text by using sentiment analysis, entity extraction, keyword extraction, and word-relation extraction.
Chapter 7, Making Money with Machine Learning, dives into an area with copious amounts of data: the financial world. You will learn how to work with time series data to solve financial problems.
Chapter 8, The Doctor Will See You Now, investigates ways to tackle an enterprise-grade problem—medical diagnosis—using deep neural networks.
Chapter 9, Cruise Control - Automation, teaches you how to create a production system, ranging from training to serving a model. The system can also receive user feedback and automatically train itself every day.
Chapter 10, Go Live and Go Big, guides you through the world of Amazon Web Services and shows you how to take advantage of a multi-GPU system on Amazon servers.
Chapter 11, Going Further - 21 Problems, introduces 21 real-life problems that you can use deep learning and TensorFlow to solve after reading this book.
Appendix, Advanced Installation, discusses GPUs and focuses on a step-by-step CUDA setup and a GPU-based TensorFlow installation.
For software, the whole book is based on TensorFlow. You can use either Linux, Windows, or macOS.
For hardware, you will need a computer or laptop that runs Ubuntu, macOS, or Windows. As authors, we encourage you to have an NVIDIA graphics card if you want to work with deep neural networks, especially when you want to work with large-scale datasets.
This book is ideal for you if you aspire to build Machine Learning systems that are smart and practical enough for real-world applications. You should be comfortable with Machine Learning concepts, Python programming, IDEs, and the command line. This book will be useful to people who program professionally as part of their job, or those who are working as scientists and engineers and need to learn about Machine Learning and TensorFlow in support of their work.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply email [email protected], and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
You can download the code files by following these steps:
Log in or register to our website using your e-mail address and password.
Hover the mouse pointer on the SUPPORT tab at the top.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box.
Select the book for which you're looking to download the code files.
Choose from the drop-down menu where you purchased this book from.
Click on Code Download.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Machine-Learning-with-TensorFlow-1.x. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/MachineLearningwithTensorFlow1.x_ColorImages.pdf.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at [email protected] with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.
The proliferation of large public datasets, inexpensive GPUs, and an open-minded developer culture has revolutionized machine learning efforts in recent years. Training data, the lifeblood of machine learning, has become widely available and easily consumable. Affordable computing power has put the required horsepower within reach of small businesses and even individuals. The current decade is incredibly exciting for data scientists.
Some of the top platforms used in the industry include Caffe, Theano, and Torch. While the underlying platforms are actively developed and openly shared, usage is limited largely to machine learning practitioners due to difficult installations, non-obvious configurations, and difficulty with productionizing solutions.
Late 2015 and 2016 brought additional platforms into the landscape—TensorFlow from Google, CNTK from Microsoft, and Veles from Samsung, among other options. Google's TensorFlow is the most exciting for several reasons.
TensorFlow has one of the easiest installations of any platform, bringing machine learning capabilities squarely into the realm of casual tinkerers and novice programmers. Meanwhile, high-performance features, such as multi-GPU support, make the platform exciting for experienced data scientists and industrial use as well. TensorFlow also provides a reimagined process and multiple user-friendly utilities, such as TensorBoard, to manage machine learning efforts. Finally, the platform has significant backing and community support from the world's largest machine learning powerhouse, Google. All this is before even considering the compelling underlying technical advantages, which we'll dive into later.
In this chapter, we will cover the following topics:
macOS X
Microsoft Windows and Linux, both the core software and all the dependencies
VM setup to enable Windows installation
Although TensorFlow has been public for just two years, numerous community efforts have already successfully ported over existing machine learning projects. Some examples include handwriting recognition, language translation, animal classification, medical image triage, and sentiment analysis. The wide applicability of machine learning to so many industries and problems always intrigues people. With TensorFlow, these problems are not only feasible but easily achievable. In fact, we will tackle and solve each of the preceding problems within the course of this book!
TensorFlow conveniently offers several types of installation and operates on multiple operating systems. The basic installation is CPU-only, while more advanced installations unleash serious horsepower by pushing calculations onto the graphics card, or even to multiple graphics cards. We recommend starting with a basic CPU installation at first. More complex GPU and CUDA installations will be discussed in Appendix, Advanced Installation.
Even with just a basic CPU installation, TensorFlow offers multiple options, which are as follows:
A basic Python pip installation
A segregated Python installation via Virtualenv
A fully segregated container-based installation via Docker
We recommend a Python installation via Virtualenv, but our examples will use a basic Python pip installation to help you focus on the crux of our task, that is, getting TensorFlow up and running. Again, more advanced installation types will be covered in Appendix, Advanced Installation.
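The segregated installation recommended above can be sketched in Python itself. This is only an illustration under stated assumptions: it uses the standard library venv module (available in Python 3) as a stand-in for the Virtualenv tool named in the text, and the helper name create_isolated_env and the environment directory are hypothetical.

```python
import os
import venv


def create_isolated_env(env_dir, with_pip=True):
    """Create a virtual environment and return the install command for it.

    Assumption: the standard library venv module stands in for Virtualenv.
    The returned command is not executed here; it is what you would run
    to install TensorFlow inside the new environment.
    """
    # Build the environment; with_pip=True also bootstraps pip inside it.
    venv.create(env_dir, with_pip=with_pip)

    # The interpreter lives in Scripts\ on Windows and bin/ elsewhere.
    if os.name == "nt":
        env_python = os.path.join(env_dir, "Scripts", "python.exe")
    else:
        env_python = os.path.join(env_dir, "bin", "python")

    # Running pip through the environment's own interpreter keeps the
    # package out of the system-wide site-packages directory.
    return [env_python, "-m", "pip", "install", "tensorflow"]
```

Calling create_isolated_env("tf-env") and then running the returned command (for example with subprocess.check_call) keeps TensorFlow and its dependencies separate from the system Python, which is the point of the Virtualenv option.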
TensorFlow can fully work on Linux and macOS with both Python 2.7 and 3.5. On Windows, we can only use TensorFlow with Python 3.5.x or 3.6.x. It can also be easily used on Windows by running a Linux virtual machine (VM). With an Ubuntu virtual machine, we can use TensorFlow with Python 2.7. However, we can't use TensorFlow with GPU support in a virtual machine. As of TensorFlow 1.2, TensorFlow doesn't provide GPU support on macOS. Therefore, if you want to use macOS with GPU-enabled TensorFlow, you will have to compile from sources, which is out of the scope of this chapter. Otherwise, you can still use TensorFlow 1.0 or 1.1, which provides GPU support out of the box on macOS. Linux and Windows users can use TensorFlow with both CPU and GPU support.
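The compatibility rules in the preceding paragraph can be captured as a small lookup table. The following is a minimal sketch rather than an official check: the names SUPPORTED, is_supported, and check_this_interpreter are hypothetical, and the table encodes only the combinations stated in the text for native CPU installations.

```python
import platform
import sys

# Python (major, minor) versions that the text lists for each operating
# system. Keys match the values returned by platform.system().
SUPPORTED = {
    "Linux": {(2, 7), (3, 5)},
    "Darwin": {(2, 7), (3, 5)},   # macOS reports itself as "Darwin"
    "Windows": {(3, 5), (3, 6)},
}


def is_supported(system, version):
    """Return True if the (major, minor) version is listed for the OS."""
    return tuple(version[:2]) in SUPPORTED.get(system, set())


def check_this_interpreter():
    """Apply the table to the interpreter that is running this code."""
    return is_supported(platform.system(), sys.version_info)
```

For example, is_supported("Windows", (2, 7)) is False, which matches the statement that Python 2.7 on Windows requires a Linux virtual machine.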
Ubuntu is one of the best Linux distributions for working with TensorFlow. We highly recommend that you use an Ubuntu machine, especially if you want to work with a GPU. We will do most of our work on the Ubuntu terminal. We will begin with installing python-pip and python-dev via the following command:
sudo apt-get install python-pip python-dev
A successful installation will appear as follows:
If you find missing packages, you can correct them via the following command:
sudo apt-get update --fix-missing
Then, you can continue with the Python and pip installation.
We are now ready to install TensorFlow. We will do a CPU-only installation, and if you wish to do an advanced GPU-enabled installation, we will cover that in Appendix, Advanced Installation.
The CPU installation is initiated via the following command:
sudo pip install tensorflow
A successful installation will appear as follows:
If you use Python, you will probably already have the Python package installer, pip. However, if not, you can easily install it using the easy_install pip command. You'll note that we actually executed sudo easy_install pip—the sudo prefix was required because the installation requires administrative rights.
We will make the fair assumption that you already have the basic package installer, easy_install, available; if not, you can install it from https://pypi.python.org/pypi/setuptools. A successful installation will appear as shown in the following screenshot:
Next, we will install the six package:
sudo easy_install --upgrade six
A successful installation will appear as shown in the following screenshot:
Surprisingly, those are the only two prerequisites for TensorFlow, and we can now install the core platform. We will use the pip package installer mentioned earlier to install TensorFlow. The most recent version at the time of writing this book is v1.3. The following command installs the latest available release; to get a specific version instead, append a version specifier to the package name, such as tensorflow==1.3.0:
sudo pip install tensorflow
The pip installer will automatically gather all the other required dependencies. You will see each individual download and installation until the software is fully installed.
A successful installation will appear as shown in the following screenshot:
That's it! If you were able to get to this point, you can start to train and run your first model. Skip to Chapter 2, Your First Classifier, to train your first model.
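Before moving on, it can be useful to confirm that the interpreter you plan to use can actually see the package. The following is a minimal sketch, assuming nothing beyond the standard library: the helper names module_available and describe_tensorflow are hypothetical, and the only TensorFlow attribute touched is its version string.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def describe_tensorflow():
    """Report whether TensorFlow is importable and, if so, which version."""
    if not module_available("tensorflow"):
        return "TensorFlow is not importable from this interpreter"
    # Import only after the lookup succeeded; this import can take a moment.
    import tensorflow as tf
    return "TensorFlow " + tf.__version__ + " is installed"
```

Running print(describe_tensorflow()) with the same interpreter that pip installed into should report the installed version. If it reports that the package is not importable, pip and the interpreter most likely belong to different Python installations.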
macOS X users wishing to completely segregate their installation can use a VM instead, as described in the Windows installation.
As we mentioned earlier, TensorFlow with Python 2.7 does not function natively on Windows. In this section, we will guide you through installing TensorFlow with Python 3.5 and set up a VM with Linux if you want to use TensorFlow with Python 2.7.
First, we need to install Python 3.5.x or 3.6.x 64-bit from the following links:
https://www.python.org/downloads/release/python-352/
https://www.python.org/downloads/release/python-362/
Make sure that you download the 64-bit version of Python where the name of the installation has amd64, such as python-3.6.2-amd64.exe. The Python 3.6.2 installation looks like this:
We will select Add Python 3.6 to PATH and click Install Now. The installation process will complete with the following screen:
We will click Disable path length limit and then click Close to finish the Python installation. Now, let's open the Windows PowerShell application under the Windows menu. We will install the CPU-only version of TensorFlow with the following command:
pip3 install tensorflow
The result of the installation will look like this:
Congratulations, you can now use TensorFlow on Windows with Python 3.5.x or 3.6.x. In the next section, we will show you how to set up a VM to use TensorFlow with Python 2.7. However, you can skip to the Testing the installation section of this chapter if you don't need Python 2.7.
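Because this section asks for the 64-bit (amd64) build of Python, it may help to confirm which build is actually on your PATH if the pip3 command does not behave as expected. This is a small sketch using only the standard library; the function names are hypothetical.

```python
import struct


def interpreter_bits():
    """Return the pointer width of the running interpreter: 32 or 64."""
    # struct.calcsize("P") is the size in bytes of a C pointer (void *).
    return struct.calcsize("P") * 8


def is_64bit_python():
    """Return True when the interpreter is a 64-bit build."""
    return interpreter_bits() == 64
```

If is_64bit_python() returns False on Windows, installing the amd64 build of Python 3.5.x or 3.6.x described above is the likely fix.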
Now, we will show you how to set up a VM with Linux to use TensorFlow with Python 2.7. We recommend the free VirtualBox system available at https://www.virtualbox.org/wiki/Downloads. The latest stable version at the time of writing is v5.1.28, available at the following URL:
http://download.virtualbox.org/virtualbox/5.1.28/VirtualBox-5.1.28-117968-Win.exe
A successful installation will allow you to run the Oracle VM VirtualBox Manager dashboard, which looks like this:
Linux comes in numerous flavors, but as the TensorFlow documentation mostly mentions Ubuntu, we'll be working with Ubuntu Linux. You are welcome to use any flavor of Linux, but you should be aware that there are subtle differences across flavors and versions of each flavor. Most differences are benign, but some may trip up the installation or even usage of TensorFlow.
Even after choosing Ubuntu, there are many versions and configurations; you can see some at http://cdimage.ubuntu.com/ubuntu-gnome/releases/14.04/release/.
We will install the most popular version, which is Ubuntu 14.04.4 LTS (make sure to download a version appropriate for your computer). Versions marked x86 are designed to run on 32-bit machines, while those marked with some variation of 64 are designed to run on 64-bit machines. Most modern machines are 64-bit, so if you are unsure, go with the latter.
Installations happen via an ISO file, which is, essentially, a file equivalent of an installation CD. The ISO for Ubuntu 14.04.4 LTS is ubuntu-gnome-14.04-desktop-amd64.iso.
Once you have downloaded the installation ISO, we will set up a VM and use the ISO file to install Ubuntu Linux on the VM.
Setting up the VM on Oracle VM VirtualBox Manager is relatively simple, but pay close attention as the default options are not sufficient for TensorFlow. You will go through the following seven screens, and at the end, it will prompt you for the installation file, which was just downloaded.
We will first set up the type of operating system and configure the random access memory (RAM) allocated to the VM:
Note that we selected a 64-bit installation as that is the image we're using; you can choose to use a 32-bit image if you need one:
How much RAM you allocate depends on how much your machine has. In the following screenshot, we will allocate half our RAM (8 GB) to our VM. Remember that this is consumed only while we are running the VM, so we can be liberal with our allocations. We recommend allocating at least 4 GB:
Our VM will need a hard disk. We'll create a Virtual Hard Disk (VHD), as shown in the following screenshot:
Then, we will choose the type of hard drive for the VM, that is, VDI (VirtualBox Disk Image), as shown in the following screenshot:
Next, we will choose how much space to allocate for the VHD. This is important to understand as we will soon work with extremely large datasets:
We will allocate 12 GB because TensorFlow and typical TensorFlow applications have an array of dependencies, such as NumPy, SciPy, and Pandas. Our exercises will also be downloading large datasets, which are to be used for training:
After setting up the VM, it will appear on the left side VM listing. Select it and click on Start. This is the equivalent of booting up the machine:
As the machine boots for the first time, provide it the installation CD (in our case, the Ubuntu ISO we downloaded earlier):
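The same VM setup can also be scripted with VBoxManage, the command-line tool that ships with VirtualBox. The following is a rough sketch, not the procedure used in this chapter: the function names, the VM name, the ISO path, and the 16 GB host default are all hypothetical, and the code only builds the command lists rather than executing them. The sizes follow the text: half of the host RAM with a 4 GB floor, and a 12 GB VDI disk.

```python
def vm_memory_mb(host_ram_gb, minimum_gb=4):
    """Half of the host RAM in MB, but never below the 4 GB floor.

    Note: on hosts with less than 8 GB the floor wins, which may be
    more than the host can comfortably spare.
    """
    return int(max(host_ram_gb / 2, minimum_gb) * 1024)


def virtualbox_commands(vm_name, iso_path, host_ram_gb=16, disk_gb=12):
    """Build (but do not run) VBoxManage commands mirroring the GUI steps."""
    disk_file = vm_name + ".vdi"
    controller = "SATA"
    return [
        # Operating system type: 64-bit Ubuntu.
        ["VBoxManage", "createvm", "--name", vm_name,
         "--ostype", "Ubuntu_64", "--register"],
        # RAM allocated to the VM, in MB.
        ["VBoxManage", "modifyvm", vm_name,
         "--memory", str(vm_memory_mb(host_ram_gb))],
        # Virtual hard disk in VDI format, size in MB.
        ["VBoxManage", "createmedium", "disk", "--filename", disk_file,
         "--size", str(disk_gb * 1024), "--format", "VDI"],
        # A storage controller to attach the disk and the ISO to.
        ["VBoxManage", "storagectl", vm_name,
         "--name", controller, "--add", "sata"],
        ["VBoxManage", "storageattach", vm_name, "--storagectl", controller,
         "--port", "0", "--device", "0", "--type", "hdd",
         "--medium", disk_file],
        # The Ubuntu ISO acts as the installation CD.
        ["VBoxManage", "storageattach", vm_name, "--storagectl", controller,
         "--port", "1", "--device", "0", "--type", "dvddrive",
         "--medium", iso_path],
        # Equivalent of clicking Start.
        ["VBoxManage", "startvm", vm_name],
    ]
```

Each inner list can be passed to subprocess.check_call once VirtualBox is installed and VBoxManage is on your PATH. Printing the commands first, and checking them against the VBoxManage reference for your VirtualBox version, is a sensible precaution.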
