Gain a working knowledge of advanced machine learning and explore Python's powerful tools for extracting data from images and videos
Book Description
Python is the ideal programming language for rapidly prototyping and developing production-grade code for image processing and Computer Vision with its robust syntax and wealth of powerful libraries. This book will help you design and develop production-grade Computer Vision projects tackling real-world problems.
With the help of this book, you will learn how to set up Anaconda and Python for the major OSes with cutting-edge third-party libraries for Computer Vision. You'll learn state-of-the-art techniques for classifying images, finding and identifying human postures, and detecting faces within videos. You will use powerful machine learning tools such as OpenCV, Dlib, and TensorFlow to build exciting projects such as classifying handwritten digits, detecting facial features, and much more. The book also covers some advanced projects, such as reading text from license plates in real-world images using Google's Tesseract software, and tracking human body poses using DeeperCut within TensorFlow.
By the end of this book, you will have the expertise required to build your own Computer Vision projects using Python and its associated libraries.
Who this book is for
Python programmers and machine learning developers who wish to build exciting Computer Vision projects using the power of machine learning and OpenCV will find this book useful. The only prerequisite for this book is that you should have a sound knowledge of Python programming.
Page count: 169
Year of publication: 2018
Copyright © 2018 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Amey Varangaonkar
Acquisition Editor: Dayne Castelino
Content Development Editor: Pratik Andrade
Technical Editors: Nilesh Sawakhande, Jovita Alva
Copy Editor: Safis Editing
Project Coordinator: Namrata Swetta
Proofreader: Safis Editing
Indexer: Priyanka Dhadke
Graphics: Jisha Chirayil
Production Coordinator: Jisha Chirayil
First published: December 2018
Production reference: 1241218
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78995-455-5
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Matthew Rever is an image processing and computer vision engineer at a major national laboratory. He has years of experience in automating the analysis of complex scientific data, as well as in controlling sophisticated instruments. He has applied computer vision technology to save a great many hours of valuable human labor. He is also enthusiastic about making the latest developments in computer vision accessible to developers of all backgrounds.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Computer Vision Projects with OpenCV and Python 3
About Packt
Why subscribe?
Packt.com
Contributors
About the author
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Setting Up an Anaconda Environment
Introducing and installing Python and Anaconda
Installing Anaconda
Installing additional libraries
Installing OpenCV
Installing dlib
Installing Tesseract
Installing TensorFlow
Exploring Jupyter Notebook
Summary
Image Captioning with TensorFlow
Technical requirements
Introduction to image captioning
Difference between image classification and image captioning
Recurrent neural networks with long short-term memory
Google Brain im2txt captioning model
Running the captioning code on Jupyter
Analyzing the result captions
Running the captioning code on Jupyter for multiple images
Retraining the captioning model
Summary
Reading License Plates with OpenCV
Identifying the license plate
Plate utility functions
The gray_thresh_img function and morphological functions
Kernels
The matching character function
The k-nearest neighbors digit classifier
Finding plate characters
Finding matches and groups of characters
Finding and reading license plates with OpenCV
Result analysis
Summary
Human Pose Estimation with TensorFlow
Pose estimation using DeeperCut and ArtTrack
Single-person pose detection
Multi-person pose detection
Retraining the human pose estimation model
Summary
Handwritten Digit Recognition with scikit-learn and TensorFlow
Acquiring and processing MNIST digit data
Creating and training a support vector machine
Applying the support vector machine to new data
Introducing TensorFlow with digit classification
Evaluating the results
Summary
Facial Feature Tracking and Classification with dlib
Introducing dlib
Facial landmarks
Finding 68 facial landmarks in images
Faces in videos
Facial recognition
Summary
Deep Learning Image Classification with TensorFlow
Technical requirements
An introduction to TensorFlow
Using Inception for image classification
Retraining with our own images
Speeding up computation with your GPU
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think
In this book, you will learn how to leverage the power of Python, OpenCV, and TensorFlow to solve problems in computer vision. Python is the ideal programming language for rapidly prototyping and developing production-grade code for image processing and computer vision, with its robust syntax and wealth of powerful libraries.
This book will be your practical guide to designing and developing production-grade computer vision projects that tackle real-world problems. You will learn how to set up Anaconda Python for the major OSes with cutting-edge third-party libraries for computer vision, and you will learn state-of-the-art techniques of classifying images and finding and identifying humans within videos. You will gain the expertise required to build your own computer vision projects using Python and its associated libraries by the end of this book.
Python programmers and machine learning developers who wish to build exciting computer vision projects using the power of machine learning and OpenCV will find this book to be useful. The only prerequisite for this book is that you should have a sound knowledge of Python programming.
Chapter 1, Setting Up an Anaconda Environment, helps you download and install Python 3 and Anaconda along with their additional libraries, and also discusses the basic concepts of Jupyter Notebook.
Chapter 2, Image Captioning with TensorFlow, introduces you to image captioning using the Google Brain im2txt captioning model, which is a pre-defined model. We will also learn the process of retraining the model for our own customized images.
Chapter 3, Reading License Plates with OpenCV, introduces you to reading license plates using the plate utility functions. We learn the process of finding the possible candidates for our license plate characters, which is key to reading license plates.
Chapter 4, Human Pose Estimation with TensorFlow, introduces you to pose estimation using the DeeperCut algorithm and the pre-defined ArtTrack model. You will learn about single-person and multi-person pose detection, and you'll learn how to retrain the model for images and videos.
Chapter 5, Handwritten Digit Recognition with scikit-learn and TensorFlow, helps you acquire and process MNIST digit data. You will learn how to create and train a support vector machine, and also learn about digit classification using TensorFlow.
Chapter 6, Facial Feature Tracking and Classification with dlib, helps you detect facial features from images and videos, which helps us carry out facial recognition.
Chapter 7, Deep Learning Image Classification with TensorFlow, helps you learn image classification using a pre-trained Inception model. The chapter also teaches you how to retrain the model for customized images.
Some programming experience in Python and its packages, such as TensorFlow, OpenCV, and dlib, will help you get the most out of this book.
A powerful GPU with CUDA support is required to retrain the models.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
Log in or register at www.packt.com.
Select the SUPPORT tab.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Computer-Vision-Projects-with-OpenCV-and-Python-3. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/files/downloads/9781789954555_ColorImages.pdf.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
Welcome to Computer Vision Projects with OpenCV and Python 3. This book is one you might want to check out if you're new to OpenCV, and to computer vision in general.
In this chapter, we will be installing all the required tools that we're going to use in the book. We will be dealing with Python 3, OpenCV, and TensorFlow.
You might be wondering: why should I use Python 3 and not Python 2? Python's own website answers this question.
We are looking to the future here, and if we want to future-proof our code, it's better to use Python 3. If you're using Python 2, some of the code examples here might not run, so we'll install Python 3 and use that for all the projects in the book.
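As a minimal sketch of how a script could guard against being run under Python 2, the following check fails early with a clear message. The function name require_python3 is a hypothetical helper introduced here for illustration and is not part of the book's code.

```python
import sys


def require_python3(version_info=sys.version_info):
    """Return True on Python 3 or later; raise a clear error otherwise."""
    # version_info behaves like a tuple: (major, minor, micro, ...)
    if version_info[0] < 3:
        raise RuntimeError(
            "Python 3 is required, found %d.%d" % (version_info[0], version_info[1])
        )
    return True


# Call this at the top of a script before importing Python 3 only code.
require_python3()
```

Placing the call at the top of each project script makes a wrong interpreter obvious immediately, rather than surfacing later as a confusing syntax error.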
In this chapter, we will cover the following topics:
Introducing and installing Python and Anaconda
Installing the additional libraries
Exploring Jupyter Notebook
The first thing we need is Python 3. The best way to install it is by downloading the Anaconda distribution from Continuum Analytics (now Anaconda, Inc.).
Anaconda is a fully-featured Python distribution that comes with a lot of packages for numerical analysis, data science, and computer vision. It's going to make our lives a whole lot easier, because it provides us with libraries that are not present in the base Python distribution.
The best part about Anaconda is that it gives us the conda package manager, along with pip, which makes it very easy to install external packages for our Python distribution.
Let's get started.
We will begin by setting up our Anaconda and Python distribution, using the following steps:
Go to the Anaconda website at www.anaconda.com/download.
Next, select your OS and download the latest version of the Anaconda distribution, which includes Python 3.7, by clicking the Download button.
Installing the setup file is pretty straightforward, so we won't go through each step here.
When you have everything properly installed and your path variables defined, go to the Command Prompt and make sure everything is good to go by typing the where python command. This shows us all the directories in which Python is installed.
The first instance of Python listed should be in our Anaconda distribution. This means that we can proceed with our Python programs.
Now, let's make sure we have our other tools. Our first tool will be IPython, which is essentially a command shell for interactive computing in multiple programming languages. We will check it using the where ipython command.
The next tool we will check is pip, the Python package installer. We do this with the where pip command.
The next tool to check is conda, which is Anaconda's built-in package manager. This is done using the where conda command.
We should be good to go with Python now.
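The same checks can be sketched from within Python, which also works on macOS and Linux, where the where command is not available. This is only an illustration: locate_tools is a hypothetical helper, and unlike where, the standard library's shutil.which reports only the first match on the path rather than every match.

```python
import shutil

# The tools checked in the steps above.
TOOLS = ("python", "ipython", "pip", "conda")


def locate_tools(names=TOOLS):
    """Map each tool name to the first matching executable on PATH, or None."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If the path printed for python does not point into the Anaconda installation directory, another Python installation is taking precedence on the path.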
In the next section, we're going to cover installing additional libraries such as OpenCV, TensorFlow, dlib, and Tesseract, which will be used for the projects in this book.
All the packages that we will be installing in this section are vital for our upcoming projects. So, let's get started.
To get OpenCV, go to the following link: anaconda.org/conda-forge/opencv. Technically, we don't need to access the website to install this package. The site just shows the various versions of OpenCV and all the different systems we can install it on.
Copy the installation command from the site, paste it into Command Prompt, and run it.
The preceding command is a simple, platform-independent way to get OpenCV. There are other methods for getting it; however, using this command ensures that we are installing the latest version.
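Once the installation finishes (the conda-forge page typically lists the command as conda install -c conda-forge opencv), a quick smoke test can confirm that OpenCV is importable and working. The following is a minimal sketch, assuming NumPy is present, as it is in a standard Anaconda installation; opencv_smoke_test is a hypothetical helper name.

```python
def opencv_smoke_test():
    """Return the shape of a grayscale test image, or None if OpenCV or NumPy is missing."""
    try:
        import cv2
        import numpy as np
    except ImportError:
        return None
    # Build a tiny black BGR image and convert it to grayscale.
    bgr = np.zeros((4, 4, 3), dtype=np.uint8)
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    return gray.shape


if __name__ == "__main__":
    shape = opencv_smoke_test()
    print("OpenCV is not available" if shape is None else f"OpenCV OK, output shape {shape}")
```

A successful run shows that the cv2 module imports and can execute a basic image operation, which is a slightly stronger check than importing the module alone.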
We need to install dlib from the Anaconda distribution, similar to OpenCV. Just as with OpenCV, installing dlib is a straightforward process.
Run the following command:
conda install -c menpo dlib
This will take around 10 to 20 seconds to run. If everything goes well, we should be good to go with dlib.
Tesseract is Google's optical character recognition library, and is not natively a Python package. Because of this, we use a Python binding that calls the Tesseract executable, which has to be installed manually.
Go to the GitHub repository for Tesseract, which is found at the following link: https://github.com/tesseract-ocr/tesseract.
Scroll down to the Installing Tesseract section in the GitHub readme. Here, we are presented with two options:
Installing it via a pre-built binary package
Building it from source
We want to install it via the pre-built binary package, so click on that link. We can also build it from source if we want to, but that doesn't really offer any advantages. The Tesseract Wiki explains the steps to install it on various different operating systems.
As we're using Windows, and we want to install a pre-built one, click on the Tesseract at UB Mannheim link, where you will find all the latest setup files. Download the latest setup from the site.
Once downloaded, run the installer or execute the command. However, this is not going to put Tesseract in your path. We need to make sure it is in your path; otherwise, when you call Tesseract from within Python, you're going to get an error message.
So, we need to figure out where Tesseract is and modify our path variable. To do this, type the where tesseract command in Command Prompt.
Once you have the binary package, use pip to install the Python bindings with the following commands:
$ pip install tesseract
$ pip install pytesseract
We should be good to go with Tesseract now.
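If modifying the path variable is not convenient, pytesseract can instead be told where the executable lives through its pytesseract.pytesseract.tesseract_cmd setting. The sketch below first looks on the path and then in any extra directories you pass in. The helper names find_tesseract and configure_pytesseract are hypothetical, and the directory in the usage note is an assumption about where the Windows installer may place Tesseract; adjust it to your own installation.

```python
import os
import shutil


def find_tesseract(extra_dirs=()):
    """Return the path to the tesseract executable, or None if it cannot be found."""
    found = shutil.which("tesseract")
    if found:
        return found
    exe = "tesseract.exe" if os.name == "nt" else "tesseract"
    for directory in extra_dirs:
        candidate = os.path.join(directory, exe)
        if os.path.isfile(candidate):
            return candidate
    return None


def configure_pytesseract(extra_dirs=()):
    """Point pytesseract at the executable; return the path used, or None."""
    path = find_tesseract(extra_dirs)
    if path is None:
        return None
    try:
        import pytesseract
    except ImportError:
        return None
    # pytesseract calls this executable whenever it runs OCR.
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

For example, configure_pytesseract([r"C:\Program Files\Tesseract-OCR"]) would use that folder as a fallback when Tesseract is not on the path. A return value of None means either the executable or the pytesseract package could not be found.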
Last but not least, we will install TensorFlow, which is a software library for data flow programming across a range of tasks. It is usually used for machine learning applications such as neural networks.
To install it, go to TensorFlow's website at the following link: tensorflow.org/install/. The website contains instructions for all the major operating systems.
As we're using Windows, the installation process is very simple. We just have to run the pip install tensorflow command in Command Prompt.
If TensorFlow is already installed on the system, pip reports that the requirement is already satisfied. We should be good to go with TensorFlow now.
Install tensorflow-hub using the following command:
pip install tensorflow-hub
Next, install tflearn using the following command:
pip install tflearn
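With everything installed, a short script can report which of the libraries from this chapter are importable and at what version. This is a minimal sketch rather than a definitive check: report_versions is a hypothetical helper, and the module names are the usual import names for these packages (cv2 for OpenCV, tensorflow_hub for tensorflow-hub).

```python
import importlib

# Display name -> import name for the libraries installed in this chapter.
MODULES = {
    "OpenCV": "cv2",
    "dlib": "dlib",
    "pytesseract": "pytesseract",
    "TensorFlow": "tensorflow",
    "TensorFlow Hub": "tensorflow_hub",
    "TFLearn": "tflearn",
}


def report_versions(modules=MODULES):
    """Map each display name to a version string, or None if the import fails."""
    report = {}
    for label, module_name in modules.items():
        try:
            module = importlib.import_module(module_name)
        except Exception:
            # Covers missing packages as well as packages that fail while loading.
            report[label] = None
        else:
            report[label] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for label, version in report_versions().items():
        print(f"{label}: {version or 'not available'}")
```

Any entry reported as not available points to the installation step that needs to be repeated before starting the projects.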
