Create advanced applications with Python and OpenCV, exploring the potential of facial recognition, machine learning, deep learning, web computing and augmented reality.
Book Description
OpenCV is considered to be one of the best open source computer vision and machine learning software libraries. It helps developers build complete projects in relation to image processing, motion detection, or image segmentation, among many others. OpenCV for Python enables you to run computer vision algorithms smoothly in real time, combining the best of the OpenCV C++ API and the Python language.
In this book, you'll get started by setting up OpenCV and delving into the key concepts of computer vision. You'll then proceed to study more advanced concepts and discover the full potential of OpenCV. The book will also introduce you to the creation of advanced applications using Python and OpenCV, enabling you to develop applications that include facial recognition, target tracking, or augmented reality. Next, you'll learn machine learning techniques and concepts, understand how to apply them in real-world examples, and also explore their benefits, including real-time data production and faster data processing. You'll also discover how to translate the functionality provided by OpenCV into optimized application code projects using Python bindings. Toward the concluding chapters, you'll explore the application of artificial intelligence and deep learning techniques using the popular Python libraries TensorFlow and Keras.
By the end of this book, you'll be able to develop advanced computer vision applications to meet your customers' demands.
Who this book is for
This book is designed for computer vision developers, engineers, and researchers who want to develop modern computer vision applications. Basic experience of OpenCV and Python programming is a must.
Page count: 533
Year of publication: 2019
Copyright © 2019 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Richa Tripathi
Acquisition Editor: Alok Dhuri
Content Development Editor: Manjusha Mantri
Technical Editor: Riddesh Dawne
Copy Editor: Safis Editing
Project Coordinator: Prajakta Naik
Proofreader: Safis Editing
Indexer: Rekha Nair
Graphics: Jisha Chirayil
Production Coordinator: Shraddha Falebhai
First published: March 2019
Production reference: 1280319
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78934-491-2
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Alberto Fernández Villán is a software engineer with more than 12 years of experience in developing innovative solutions. In the last couple of years, he has been working on various projects related to monitoring systems for industrial plants, applying both Internet of Things (IoT) and big data technologies. He has a Ph.D. in computer vision (2017), a deep learning certification (2018), and several publications in connection with computer vision and machine learning in journals such as Machine Vision and Applications, IEEE Transactions on Industrial Informatics, Sensors, IEEE Transactions on Industry Applications, IEEE Latin America Transactions, and more. Since 2013, he has been a registered and active user (albertofernandez) on the Q&A OpenCV forum.
Wilson Choo is a computer vision engineer working on validating computer vision and deep learning algorithms on many different hardware configurations. His strongest skills include algorithm benchmarking, integration, app development, and test automation.
He is also a machine learning and computer vision enthusiast. He often researches trending CVDL algorithms and applies them to solve modern-day problems. Besides that, Wilson likes to participate in hackathons, where he showcases his ideas and competes with other developers. His favorite programming languages are Python and C++.
Vincent Kok is a maker and a software platform application engineer in the transportation industry. He graduated from USM with an MSc in embedded system engineering. Vincent is actively involved in the developer community and attends Maker Faire events held around the world, such as in Shenzhen in 2014, and in Singapore and Tokyo in 2015. Designing electronics hardware kits and giving soldering/Arduino classes for beginners are some of his favorite ways to spend his free time. Currently, his focus is on computer vision technology, software test automation, deep learning, and keeping himself up to date with the latest technology.
Rubén Usamentiaga is a tenured associate professor in the department of computer science and engineering at the University of Oviedo. He received his M.S. and Ph.D. degrees in computer science from the University of Oviedo in 1999 and 2005, respectively. He has participated in 4 European projects, 3 projects of the National R&D Plan, 2 projects of the Regional Plan of the Principado of Asturias, and 14 contracts with companies. He is the author of more than 60 publications in JCR journals (25 of Q1) and more than 50 publications in international conferences. In addition, he has completed a 6-month research stay at the Aeronautical Technology Center and a 3-month research stay at the University of Laval in Quebec.
Arun Ponnusamy works as a computer vision research engineer at an AI start-up in India. He is a lifelong learner, passionate about image processing, computer vision, and machine learning. He is an engineering graduate from PSG College of Technology, Coimbatore. He started his career at MulticoreWare Inc., where he spent most of his time on image processing, OpenCV, software optimization, and GPU computing.
Arun loves to understand computer vision concepts clearly and explain them in an intuitive way in his blog and in meetups. He has created an open source Python library for computer vision, named cvlib, which is aimed at simplicity and user friendliness. He is currently working on object detection, action recognition, and generative networks.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Mastering OpenCV 4 with Python
About Packt
Why subscribe?
Packt.com
Contributors
About the author
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Section 1: Introduction to OpenCV 4 and Python
Setting Up OpenCV
Technical requirements
Code testing specifications
Hardware specifications
Understanding Python
Introducing OpenCV
Contextualizing the reader
A theoretical introduction to the OpenCV library
OpenCV modules
OpenCV users
OpenCV applications
Why cite OpenCV in your research work
Installing OpenCV, Python, and other packages
Installing Python, OpenCV, and other packages globally
Installing Python
Installing Python on Linux
Installing Python on Windows
Installing OpenCV
Installing OpenCV on Linux
Installing OpenCV on Windows
Testing the installation
Installing Python, OpenCV, and other packages with virtualenv
Python IDEs to create virtual environments with virtualenv
Anaconda/Miniconda distributions and the conda package and environment management system
Packages for scientific computing, data science, machine learning, deep learning, and computer vision
Jupyter Notebook
Trying Jupyter Notebook online
Installing the Jupyter Notebook
Installing Jupyter using Anaconda
Installing Jupyter with pip
The OpenCV and Python project structure
Our first Python and OpenCV project
Summary
Questions
Further reading
Image Basics in OpenCV
Technical requirements
A theoretical introduction to image basics
Main problems in image processing
Image-processing steps
Image formulation
Concepts of pixels, colors, channels, images, and color spaces
File extensions
The coordinate system in OpenCV
Accessing and manipulating pixels in OpenCV
Accessing and manipulating pixels in OpenCV with BGR images
Accessing and manipulating pixels in OpenCV with grayscale images
BGR order in OpenCV
Summary
Questions
Further reading
Handling Files and Images
Technical requirements
An introduction to handling files and images
sys.argv
Argparse – command-line option and argument parsing
Reading and writing images
Reading images in OpenCV
Reading and writing images in OpenCV
Reading camera frames and video files
Reading camera frames
Accessing some properties of the capture object
Saving camera frames
Reading a video file
Reading from an IP camera
Writing a video file
Calculating frames per second
Considerations for writing a video file
Playing with video capture properties
Getting all the properties from the video capture object
Using the properties – playing a video backwards
Summary
Questions
Further reading
Constructing Basic Shapes in OpenCV
Technical requirements
A theoretical introduction to drawing in OpenCV
Drawing shapes
Basic shapes – lines, rectangles, and circles
Drawing lines
Drawing rectangles
Drawing circles
Understanding advanced shapes
Drawing a clip line
Drawing arrows
Drawing ellipses
Drawing polygons
Shift parameter in drawing functions
lineType parameter in drawing functions
Writing text
Drawing text
Using all OpenCV text fonts
More functions related to text
Dynamic drawing with mouse events
Drawing dynamic shapes
Drawing both text and shapes
Event handling with Matplotlib
Advanced drawing
Summary
Questions
Further reading
Section 2: Image Processing in OpenCV
Image Processing Techniques
Technical requirements
Splitting and merging channels in OpenCV
Geometric transformations of images
Scaling an image
Translating an image
Rotating an image
Affine transformation of an image
Perspective transformation of an image
Cropping an image
Image filtering
Applying arbitrary kernels
Smoothing images
Averaging filter
Gaussian filtering
Median filtering
Bilateral filtering
Sharpening images
Common kernels in image processing
Creating cartoonized images
Arithmetic with images
Saturation arithmetic
Image addition and subtraction
Image blending
Bitwise operations
Morphological transformations
Dilation operation
Erosion operation
Opening operation
Closing operation
Morphological gradient operation
Top hat operation
Black hat operation
Structuring element
Applying morphological transformations to images
Color spaces
Showing color spaces
Skin segmentation in different color spaces
Color maps
Color maps in OpenCV
Custom color maps
Showing the legend for the custom color map
Summary
Questions
Further reading
Constructing and Building Histograms
Technical requirements
A theoretical introduction to histograms
Histogram terminology
Grayscale histograms
Grayscale histograms without a mask
Grayscale histograms with a mask
Color histograms
Custom visualizations of histograms
Comparing OpenCV, NumPy, and Matplotlib histograms
Histogram equalization
Grayscale histogram equalization
Color histogram equalization
Contrast Limited Adaptive Histogram Equalization 
Comparing CLAHE and histogram equalization
Histogram comparison
Summary
Questions
Further reading
Thresholding Techniques
Technical requirements
Installing scikit-image
Installing SciPy
Introducing thresholding techniques
Simple thresholding 
Thresholding types
Simple thresholding applied to a real image
Adaptive thresholding
Otsu's thresholding algorithm
The triangle binarization algorithm
Thresholding color images
Thresholding algorithms using scikit-image
Introducing thresholding with scikit-image
Trying out more thresholding techniques with scikit-image
Summary
Questions
Further reading
Contour Detection, Filtering, and Drawing
Technical requirements
An introduction to contours
Compressing contours
Image moments
Some object features based on moments
Hu moment invariants
Zernike moments
More functionality related to contours
Filtering contours
Recognizing contours
Matching contours
Summary
Questions
Further reading
Augmented Reality
Technical requirements
An introduction to augmented reality
Markerless-based augmented reality
Feature detection
Feature matching
Feature matching and homography computation to find objects
Marker-based augmented reality
Creating markers and dictionaries
Detecting markers
Camera calibration
Camera pose estimation
Camera pose estimation and basic augmentation
Camera pose estimation and more advanced augmentation
Snapchat-based augmented reality
Snapchat-based augmented reality OpenCV moustache overlay
Snapchat-based augmented reality OpenCV glasses overlay
QR code detection
Summary
Questions
Further reading
Section 3: Machine Learning and Deep Learning in OpenCV
Machine Learning with OpenCV
Technical requirements
An introduction to machine learning
Supervised machine learning
Unsupervised machine learning
Semi-supervised machine learning
k-means clustering
Understanding k-means clustering
Color quantization using k-means clustering
k-nearest neighbor
Understanding k-nearest neighbors
Recognizing handwritten digits using k-nearest neighbor 
Support vector machine
Understanding SVM
Handwritten digit recognition using SVM
Summary
Questions
Further reading
Face Detection, Tracking, and Recognition
Technical requirements
Installing dlib
Installing the face_recognition package
Installing the cvlib package
Face processing introduction
Face detection
Face detection with OpenCV
Face detection with dlib
Face detection with face_recognition
Face detection with cvlib
Detecting facial landmarks
Detecting facial landmarks with OpenCV
Detecting facial landmarks with dlib
Detecting facial landmarks with face_recognition
Face tracking
Face tracking with the dlib DCF-based tracker
Object tracking with the dlib DCF-based tracker
Face recognition
Face recognition with OpenCV
Face recognition with dlib
Face recognition with face_recognition
Summary
Questions
Further reading
Introduction to Deep Learning
Technical requirements
Installing TensorFlow
Installing Keras
Deep learning overview for computer vision tasks
Deep learning characteristics
Deep learning explosion
Deep learning for image classification
Deep learning for object detection 
Deep learning in OpenCV
Understanding cv2.dnn.blobFromImage()
Complete examples using the OpenCV DNN face detector
OpenCV deep learning classification
AlexNet for image classification
GoogLeNet for image classification
ResNet for image classification
SqueezeNet for image classification
OpenCV deep learning object detection
MobileNet-SSD for object detection
YOLO for object detection
The TensorFlow library
Introduction example to TensorFlow
Linear regression in TensorFlow
Handwritten digits recognition using TensorFlow
The Keras library
Linear regression in Keras
Handwritten digit recognition in Keras
Summary
Questions
Further reading
Section 4: Mobile and Web Computer Vision
Mobile and Web Computer Vision with Python and OpenCV
Technical requirements
Installing the packages
Introduction to Python web frameworks
Introduction to Flask
Web computer vision applications using OpenCV and Flask
A minimal example to introduce OpenCV and Flask 
Minimal face API using OpenCV
Deep learning cat detection API using OpenCV
Deep learning API using Keras and Flask
Keras applications
Deep learning REST API using Keras Applications
Deploying a Flask application to the cloud
Summary
Questions
Further reading
Assessments
Chapter 1
Chapter 2 
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9
Chapter 10
Chapter 11
Chapter 12
Chapter 13
Other Books You May Enjoy
Leave a review - let other readers know what you think
In a nutshell, this book is about computer vision using OpenCV, which is a computer vision (and also machine learning) library, and the Python programming language. You may be wondering: why OpenCV and Python? That is a good question, which we address in the first chapter of this book. To summarize, OpenCV is widely regarded as the best open source computer vision library (it is released under a BSD license, so it is free for both academic and commercial use), offering more than 2,500 optimized algorithms, including state-of-the-art computer vision algorithms, and it also has machine learning and deep learning support. OpenCV is written in optimized C/C++, but it provides Python wrappers, so the library can be used in your Python programs. Python, in turn, is considered an ideal language for scientific computing because it encourages rapid prototyping and has a lot of prebuilt libraries for every aspect of your computer vision projects.
As introduced in the previous paragraph, there are many prebuilt libraries you can use in your projects. Indeed, in this book, we use lots of them, showing you that it's really easy to install and use new libraries. Libraries such as Matplotlib, scikit-image, SciPy, dlib, face-recognition, Pillow, cvlib, Keras, TensorFlow, and Flask will be used in this book to show you the potential of the Python ecosystem. If this is the first time that you're reading about these libraries, don't worry, because we introduce hello world examples for almost all of these libraries.
This book is a complete resource for creating advanced applications with Python and OpenCV using various techniques, such as facial recognition, target tracking, augmented reality, object detection, and classification, among others. In addition, this book explores the potential of machine learning and deep learning techniques in computer vision applications using the Python ecosystem.
It's time to dive deeper into the content of this book. We are going to introduce you to what this book covers, including a short paragraph talking about each chapter of the book. So, let's get started!
This book is great for students, researchers, and developers with basic Python programming knowledge who are new to computer vision and who would like to dive deeper into this world. It's assumed that readers have some previous experience with Python. A basic understanding of image data (for example, pixels and color channels) would also be helpful, but is not necessary, because these concepts are covered in the book. Finally, standard mathematical skills are required.
Chapter 1, Setting Up OpenCV, shows how to install everything you need to start programming with Python and OpenCV. You'll also be introduced to general terminology and concepts to contextualize what you will learn, laying the groundwork for the main concepts of computer vision using OpenCV.
Chapter 2, Image Basics in OpenCV, demonstrates how to start writing your first scripts, in order to introduce you to the OpenCV library.
Chapter 3, Handling Files and Images, shows you how to cope with files and images, which are necessary for building your computer vision applications.
Chapter 4, Constructing Basic Shapes in OpenCV, covers how to draw shapes—from basic ones to some that are more advanced—using the OpenCV library.
Chapter 5, Image Processing Techniques, introduces most of the common image processing techniques you will need for your computer vision projects.
Chapter 6, Constructing and Building Histograms, shows how to both create and understand histograms, which are a powerful tool for understanding image content.
Chapter 7, Thresholding Techniques, introduces the main thresholding techniques you will need for your computer vision applications as a key process of image segmentation.
Chapter 8, Contour Detection, Filtering, and Drawing, shows how to deal with contours, which are used for shape analysis and for both object detection and recognition.
Chapter 9, Augmented Reality, teaches you how to build your first augmented reality application.
Chapter 10, Machine Learning with OpenCV, introduces you to the world of machine learning. You will see how machine learning can be used in your computer vision projects.
Chapter 11, Face Detection, Tracking, and Recognition, demonstrates how to create face processing projects using state-of-the-art algorithms, in connection with face detection, tracking, and recognition.
Chapter 12, Introduction to Deep Learning, introduces you to the world of deep learning with OpenCV and also some deep learning Python libraries (TensorFlow and Keras).
Chapter 13, Mobile and Web Computer Vision with Python and OpenCV, shows how to create computer vision and deep learning web applications using Flask.
To get the most out of this book, keep two simple but key considerations in mind:
Some basic knowledge of Python programming is assumed, as all the scripts and examples in this book are in Python.
The NumPy and OpenCV-Python packages are highly interconnected (you will learn why in this book). Although the NumPy examples are fully explained, the learning curve will be gentler if you acquire some NumPy knowledge before starting this book.
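The link between the two packages can be illustrated with a minimal sketch. It assumes only that NumPy is installed, and it relies on the fact that the OpenCV Python bindings represent an image as a NumPy array (for a color image, rows x columns x 3 values of type uint8, stored in BGR order). The helper name make_blank_image is hypothetical and is used here for illustration only.

```python
import numpy as np


def make_blank_image(height, width):
    """Return a black color image in the layout OpenCV uses: (rows, cols, 3), uint8."""
    return np.zeros((height, width, 3), dtype=np.uint8)


image = make_blank_image(4, 6)

# Pixels are addressed as image[row, column]; the three values are B, G, R.
image[1, 2] = (255, 0, 0)  # pure blue in BGR order

# Plain NumPy slicing selects a single channel or a rectangular region.
blue_channel = image[:, :, 0]
top_left = image[:2, :3]

print(image.shape, image.dtype)  # (4, 6, 3) uint8
print(int(blue_channel[1, 2]))   # 255
print(top_left.shape)            # (2, 3, 3)
```

Because the image is an ordinary array, any NumPy operation (slicing, arithmetic, masking) can be applied to it directly, which is why some NumPy knowledge pays off early.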
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
Log in or register at www.packt.com.
Select the SUPPORT tab.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Mastering-OpenCV-4-with-Python. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://www.packtpub.com/sites/default/files/downloads/9781789344912_ColorImages.pdf.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in, and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
In this first section of the book, you will be introduced to the OpenCV library. You will learn how to install everything you need to start programming with Python and OpenCV. Also, you will familiarize yourself with the general terminology and concepts to contextualize what you will learn, establishing the foundations you will need in order to grasp the main concepts of this book. Additionally, you will start writing your first scripts in order to get to grips with the OpenCV library, and you will also learn how to work with files and images, which are necessary for building your computer vision applications. Finally, you will see how to draw basic and advanced shapes using the OpenCV library.
The following chapters will be covered in this section:
Chapter 1, Setting Up OpenCV
Chapter 2, Image Basics in OpenCV
Chapter 3, Handling Files and Images
Chapter 4, Constructing Basic Shapes in OpenCV
Mastering OpenCV 4 with Python will give you the knowledge to build projects involving the Open Source Computer Vision Library (OpenCV) and Python. These two technologies (the first one is a computer vision and machine learning library, while the second one is a programming language) will be introduced. Also, you will learn why the combination of OpenCV and Python is well suited to building a wide range of computer vision applications. Finally, an introduction to the main concepts related to the content of this book will be provided.
In this chapter, you will be given step-by-step instructions to install everything you need to start programming with Python and OpenCV. This first chapter is quite long, but do not worry, because it is divided into easily assimilated sections, starting with general terminology and concepts, which assumes that the reader is new to this information. At the end of this chapter, you will be able to build your first project involving Python and OpenCV.
The following topics will be covered in this chapter:
A theoretical introduction to the OpenCV library
Installing Python, OpenCV, and other packages
Running samples, documentation, help, and updates
Python and OpenCV project structure
First Python and OpenCV project
This chapter and subsequent chapters are focused on Python (a programming language) and OpenCV (a computer vision library) concepts in connection with computer vision, machine learning, and deep learning techniques (among others). Therefore, Python (https://www.python.org/) and OpenCV (https://opencv.org/) should be installed on your computer. Moreover, some Python packages related to scientific computing and data science should also be installed (for example, NumPy (http://www.numpy.org/) or Matplotlib (https://matplotlib.org/)).
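As a quick check of these requirements, the following minimal sketch reports which of the modules mentioned above cannot be found by the current Python interpreter. It uses only the standard library; the function name find_missing_modules is hypothetical. Note that the import name can differ from the package name: the OpenCV bindings are imported as cv2.

```python
import importlib.util


def find_missing_modules(module_names):
    """Return the names in module_names that cannot be located for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Top-level import names for OpenCV, NumPy, and Matplotlib.
required = ["cv2", "numpy", "matplotlib"]
missing = find_missing_modules(required)

if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules can be imported.")
```

The check only locates the modules without importing them, so it is fast and has no side effects, but it does not confirm that a located module actually loads correctly.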
Additionally, it is recommended that you install an integrated development environment (IDE), because it makes software development easier. A Python-specific IDE is the best fit here. A widely used Python IDE is PyCharm, which can be downloaded from https://www.jetbrains.com/pycharm/.
Finally, in order to facilitate GitHub activities (for example, cloning a repository), you should install a Git client. In this sense, GitHub provides desktop clients that include the most common repository actions. For an introduction to Git commands, check out https://education.github.com/git-cheat-sheet-education.pdf, where commonly used Git command-line instructions are summarized. Additionally, instructions for installing a Git client on your operating system are included.
The GitHub repository for this book, which contains all the supporting project files necessary to work through the book from the first chapter to the last, can be accessed at https://github.com/PacktPublishing/Mastering-OpenCV-4-with-Python.
Finally, it should be noted that the README file of the GitHub repository for Mastering OpenCV 4 with Python includes the following sections, which are also reproduced here for the sake of completeness:
Code testing specifications
Hardware specifications
Related books and products
Mastering OpenCV 4 with Python requires some installed packages, which you can see here:
Chapter 1, Setting Up OpenCV: opencv-contrib-python
Chapter 2, Image Basics in OpenCV: opencv-contrib-python and matplotlib
Chapter 3, Handling Files and Images: opencv-contrib-python and matplotlib
Chapter 4, Constructing Basic Shapes in OpenCV: opencv-contrib-python and matplotlib
Chapter 5, Image Processing Techniques: opencv-contrib-python and matplotlib
Chapter 6, Constructing and Building Histograms: opencv-contrib-python and matplotlib
Chapter 7, Thresholding Techniques: opencv-contrib-python, matplotlib, scikit-image, and scipy
Chapter 8, Contour Detection, Filtering, and Drawing: opencv-contrib-python and matplotlib
Chapter 9, Augmented Reality: opencv-contrib-python and matplotlib
Chapter 10, Machine Learning with OpenCV: opencv-contrib-python and matplotlib
Chapter 11, Face Detection, Tracking, and Recognition: opencv-contrib-python, matplotlib, dlib, face-recognition, cvlib, requests, progressbar, keras, and tensorflow
Chapter 12, Introduction to Deep Learning: opencv-contrib-python, matplotlib, tensorflow, and keras
Chapter 13, Mobile and Web Computer Vision with Python and OpenCV: opencv-contrib-python, matplotlib, flask, tensorflow, keras, requests, and pillow
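If you only plan to work through part of the book, the list above can be turned into a small lookup that reports which packages a given range of chapters needs. The following is a minimal sketch using only the standard library. The package names are transcribed from the list, while the names CHAPTER_PACKAGES and packages_for are hypothetical.

```python
# Chapters 2 to 13 all need these two packages; chapter 1 needs only OpenCV.
BASE = {"opencv-contrib-python", "matplotlib"}

CHAPTER_PACKAGES = {chapter: set(BASE) for chapter in range(2, 14)}
CHAPTER_PACKAGES[1] = {"opencv-contrib-python"}
CHAPTER_PACKAGES[7] |= {"scikit-image", "scipy"}
CHAPTER_PACKAGES[11] |= {"dlib", "face-recognition", "cvlib", "requests",
                         "progressbar", "keras", "tensorflow"}
CHAPTER_PACKAGES[12] |= {"tensorflow", "keras"}
CHAPTER_PACKAGES[13] |= {"flask", "tensorflow", "keras", "requests", "pillow"}


def packages_for(chapters):
    """Return the sorted union of packages needed for the given chapters."""
    needed = set()
    for chapter in chapters:
        needed |= CHAPTER_PACKAGES[chapter]
    return sorted(needed)


# Packages needed for the first two sections of the book (chapters 1 to 8).
print(packages_for(range(1, 9)))
# ['matplotlib', 'opencv-contrib-python', 'scikit-image', 'scipy']
```

The heavier dependencies (dlib, keras, and tensorflow) are only needed from Chapter 11 onward, so they can be installed later.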
If you want to install the exact versions this book was tested on, pin the version when installing with pip, as shown in the following commands.
Run the following command to install both the main and contrib modules:
Install opencv-contrib-python:
pip install opencv-contrib-python==4.0.0.21
It should be noted that OpenCV requires numpy. numpy-1.16.1 was installed along with opencv-contrib-python==4.0.0.21.
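After installing, you may want to confirm that the environment matches the pinned versions. The following is a minimal sketch, assuming Python 3.8 or later (for the standard importlib.metadata module). The function name check_pins and its lookup parameter are hypothetical; the two pins are the versions stated above.

```python
from importlib import metadata


def check_pins(pins, lookup=metadata.version):
    """Map each distribution name to 'ok', 'missing', or the version found."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed == wanted else "found " + installed
    return report


# Distribution names as used with pip, not import names.
pins = {"opencv-contrib-python": "4.0.0.21", "numpy": "1.16.1"}

for name, status in check_pins(pins).items():
    print(name, "->", status)
```

The lookup uses the distribution name given to pip (opencv-contrib-python), not the import name (cv2). A version mismatch is reported rather than treated as an error, since newer releases may work for many examples.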
Run the following command to install the Matplotlib library:
Install matplotlib:
pip install matplotlib==3.0.2
It should be noted that matplotlib requires kiwisolver, pyparsing, six, cycler, and python-dateutil.
cycler-0.10.0, kiwisolver-1.0.1, pyparsing-2.3.1, python-dateutil-2.8.0, and six-1.12.0 have been installed when installing matplotlib==3.0.2.
Run the following command to install scikit-image, a library that contains a collection of algorithms for image processing:
Install scikit-image:
pip install scikit-image==0.14.2
It should be noted that scikit-image requires cloudpickle, decorator, networkx, numpy, toolz, dask, pillow, PyWavelets, and six.
PyWavelets-1.0.1, cloudpickle-0.8.0, dask-1.1.1, decorator-4.3.2, networkx-2.2, numpy-1.16.1, pillow-5.4.1, six-1.12.0, and toolz-0.9.0 have been installed when installing scikit-image==0.14.2.
If you need SciPy, you can install it with the following command:
Install scipy:
pip install scipy==1.2.1
Note that scipy requires numpy.
numpy-1.16.1 was installed along with scipy==1.2.1.
Run the following command to install the dlib library:
Install dlib:
pip install dlib==19.8.1
To install the face recognition library, run the following command:
Install face-recognition:
pip install face-recognition==1.2.3
Note that face-recognition requires dlib, Click, numpy, face-recognition-models, and pillow.
dlib-19.8.1, Click-7.0, face-recognition-models-0.3.0, and pillow-5.4.1 were installed along with face-recognition==1.2.3.
Run the following command to install cvlib, an open source computer vision library:
Install cvlib:
pip install cvlib==0.1.8
To install the requests library, run the following command:
Install requests:
pip install requests==2.21.0
Note that requests requires urllib3, chardet, certifi, and idna.
urllib3-1.24.1, chardet-3.0.4, certifi-2018.11.29, and idna-2.8 were installed along with requests==2.21.0.
Run the following command to install progressbar, a text progress bar library:
Install progressbar:
pip install progressbar==2.5
Run the following command to install the Keras library for deep learning:
Install keras:
pip install keras==2.2.4
Note that keras requires numpy, six, h5py, keras-applications, scipy, keras-preprocessing, and pyyaml.
h5py-2.9.0, keras-applications-1.0.7, keras-preprocessing-1.0.9, numpy-1.16.1, pyyaml-3.13, scipy-1.2.1, and six-1.12.0 were installed along with keras==2.2.4.
Run the following command to install the TensorFlow library:
Install tensorflow:
pip install tensorflow==1.12.0
Note that TensorFlow requires termcolor, numpy, wheel, gast, six, setuptools, protobuf, markdown, grpcio, werkzeug, tensorboard, absl-py, h5py, keras-applications, keras-preprocessing, and astor.
termcolor-1.1.0, numpy-1.16.1, wheel-0.33.1, gast-0.2.2, six-1.12.0, setuptools-40.8.0, protobuf-3.6.1, markdown-3.0.1, grpcio-1.18.0, werkzeug-0.14.1, tensorboard-1.12.2, absl-py-0.7.0, h5py-2.9.0, keras-applications-1.0.7, keras-preprocessing-1.0.9, and astor-0.7.1 were installed along with tensorflow==1.12.0.
Run the following command to install the Flask library:
Install flask:
pip install flask==1.0.2
Note that flask requires Werkzeug, click, itsdangerous, MarkupSafe, and Jinja2.
Jinja2-2.10, MarkupSafe-1.1.1, Werkzeug-0.14.1, click-7.0, and itsdangerous-1.1.0 were installed along with flask==1.0.2.
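The pinned versions listed above can also be collected in one place. The following is a minimal sketch, not part of the book's code, that renders the pins in the name==version form used by pip and reports which version (if any) is currently installed. The helper names are hypothetical, and the sketch assumes Python 3.8 or later, where importlib.metadata is part of the standard library:

```python
from importlib import metadata

# Versions copied from the pip commands listed in this section.
PINNED = {
    "opencv-contrib-python": "4.0.0.21",
    "matplotlib": "3.0.2",
    "scikit-image": "0.14.2",
    "scipy": "1.2.1",
    "dlib": "19.8.1",
    "face-recognition": "1.2.3",
    "cvlib": "0.1.8",
    "requests": "2.21.0",
    "progressbar": "2.5",
    "keras": "2.2.4",
    "tensorflow": "1.12.0",
    "flask": "1.0.2",
}


def requirements_lines(pinned):
    """Render the pins in the 'name==version' form that pip understands."""
    return ["{}=={}".format(name, version) for name, version in pinned.items()]


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def report(pinned):
    """Map each package to a (pinned version, installed version or None) pair."""
    return {name: (version, installed_version(name))
            for name, version in pinned.items()}


# Print the pins; the output can be saved as a requirements file.
for line in requirements_lines(PINNED):
    print(line)
```

If the printed lines are saved to a file such as requirements.txt (a hypothetical name), all packages can be installed in one step with pip install -r requirements.txt.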
The hardware specifications are as follows:
32-bit or 64-bit architecture
2+ GHz CPU
4 GB RAM
At least 10 GB of hard disk space available
Python is an interpreted high-level and general-purpose programming language with a dynamic type system and automatic memory management. The official home of the Python programming language is https://www.python.org/. The popularity of Python has risen steadily over the past decade. This is because Python is a very important programming language in some of today's most exciting and challenging technologies. Artificial intelligence (AI), machine learning, neural networks, deep learning, Internet of Things (IoT), and robotics (among others) rely on Python.
Here are some advantages of Python:
Python is considered a perfect language for scientific computing, mainly for four reasons:
It is very easy to understand.
It has support (via packages) for scientific computing.
It removes many of the complexities other programming languages have.
It has a simple and consistent syntax.
Python encourages rapid prototyping because code is easy to write and run. Indeed, Python can often implement the same logic with as little as one-fifth of the code required in some other programming languages.
Python has a lot of prebuilt libraries (NumPy, SciPy, scikit-learn) for every need of your AI project. Python benefits from a rich ecosystem of libraries for scientific computing.
It is platform independent, which saves developers time when testing on different platforms.
Python offers some tools, such as Jupyter Notebook, that can be used to share scripts in an easy and comfortable way. This is perfect in scientific computing because it stimulates collaboration in an interactive computational environment.
OpenCV is a C++ programming library with real-time capabilities. As it is written in optimized C/C++, the library can profit from multi-core processing. A theoretical introduction to the OpenCV library is given in the next section.
Here are some reasons for the popularity of the OpenCV library:
Open source computer vision library
OpenCV (BSD license, https://en.wikipedia.org/wiki/BSD_licenses) is free
Specific library for image processing
It has more than 2,500 optimized algorithms, including state-of-the-art computer vision algorithms
Machine learning and deep learning support
The library is optimized for performance
There is a big community of developers using and supporting OpenCV
It has C++, Python, Java, and MATLAB interfaces
The library supports Windows, Linux, Android, and macOS
Fast and regular updates (official releases now occur every six months)
To provide context for the reader, it is necessary to establish the main concepts related to the theme of this book. The last few years have seen considerable interest in AI and machine learning, specifically in the area of deep learning. These terms are often used interchangeably and are very often confused with each other. For the sake of completeness and clarification, these terms are briefly described next.
AI refers to a set of technologies that enable machines – computers or robotic systems – to process information in the same way humans would.
The term AI is commonly used as an umbrella term for machine technologies that provide intelligence, covering a wide range of methods and algorithms. Machine Learning is the process of programming computers to learn from historical data to make predictions on new data. Machine learning is a sub-discipline of AI and refers to statistical techniques that machines use on the basis of learned interrelationships. On the basis of the data gathered, computers learn these algorithms independently. These algorithms and methods include support vector machine, decision tree, random forest, logistic regression, Bayesian networks, and neural networks.
Neural Networks are computer models for machine learning that are based on the structure and functioning of the biological brain. An artificial neuron processes multiple input signals and, when the sum of the input signals exceeds a certain threshold value, sends signals on to adjacent neurons. Deep Learning is a subset of machine learning that operates on large volumes of unstructured data, such as human speech, text, and images. A deep learning model is an artificial neural network that comprises multiple layers of mathematical computation on data, where results from one layer are fed as input into the next layer in order to classify the input data and/or make a prediction.
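The threshold behavior of an artificial neuron, and the way one layer feeds the next, can be illustrated with a minimal sketch in plain Python. The per-input weights and the hand-picked values are assumptions made for illustration only; in a real network, these values are learned from data:

```python
def neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of the inputs exceeds the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0


def xor_network(x1, x2):
    """Two layers: the outputs of the first layer are the inputs of the second."""
    hidden = [
        neuron([x1, x2], [1, 1], 0.5),     # fires if at least one input is on (OR)
        neuron([x1, x2], [-1, -1], -1.5),  # fires unless both inputs are on (NAND)
    ]
    # The output neuron fires only if both hidden neurons fire (AND).
    return neuron(hidden, [1, 1], 1.5)


# Print the result for every combination of two binary inputs.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_network(a, b))
```

A single threshold neuron cannot compute exclusive-or, but two layers of the same simple unit can, which hints at why stacking layers increases what a network can represent.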
Therefore, these concepts are interdependent in a hierarchical way, AI being the broadest term and deep learning the most specific. This structure can be seen in the next diagram:
Computer vision is an interdisciplinary field of artificial intelligence that aims to give computers and other devices with computing capabilities a high-level understanding of both digital images and videos, including functionality for acquiring, processing, and analyzing digital images. This is why computer vision is, partly, another sub-area of artificial intelligence, heavily relying on machine learning and deep learning algorithms to build computer vision applications. Additionally, computer vision is composed of several technologies working together: computer graphics, image processing, signal processing, sensor technology, mathematics, and even physics.
Therefore, the previous diagram can be completed to introduce the computer vision discipline:
OpenCV is a programming library with real-time computer vision capabilities and it is free for both academic and commercial use (BSD license). In this section, an introduction to the OpenCV library is given, including its main modules and other useful information in connection with the library.
OpenCV (since version 2) is divided into several modules, where each module can be understood, in general, as being dedicated to one group of computer vision problems. This division can be seen in the next diagram, where the main modules are shown:
OpenCV modules are briefly described here:
core: Core functionality. A module defining basic data structures and also basic functions used by all other modules in the library.
imgproc: Image processing. An image-processing module that includes image filtering, geometrical image transformations, color space conversion, and histograms.
imgcodecs: Image codecs. Image file reading and writing.
videoio: Video I/O. An interface to video capturing and video codecs.
highgui: High-level GUI. An interface to UI capabilities. It provides an interface to easily do the following:
Create and manipulate windows that can display/show images
Add trackbars to the windows, and handle keyboard commands and mouse events
video: Video analysis. A video-analysis module including background subtraction, motion estimation, and object-tracking algorithms.
calib3d: Camera calibration and 3D reconstruction. This module covers basic multiple-view geometry algorithms, stereo correspondence algorithms, object pose estimation, both single and stereo camera calibration, and also 3D reconstruction.
features2d: 2D features framework. This module includes feature detectors, descriptors, and descriptor matchers.
objdetect: Object detection. Detection of objects and instances of predefined classes (for example, faces, eyes, people, and cars).
dnn: Deep neural network (DNN) module. This module contains the following:
API for new layers creation
Set of built-in useful layers
API to construct and modify neural networks from layers
Functionality for loading serialized network models from different deep learning frameworks
ml: Machine learning. The Machine Learning Library (MLL) is a set of classes and methods that can be used for classification, regression, and clustering purposes.
flann: Clustering and search in multi-dimensional spaces. Fast Library for Approximate Nearest Neighbors (FLANN) is a collection of algorithms that are highly suited for fast nearest-neighbor searches.
photo: Computational photography. This module provides some functions for computational photography.
stitching: Images stitching. This module implements a stitching pipeline that performs automatic panoramic image stitching.
shape: Shape distance and matching. A module that can be used for shape matching, retrieval, or comparison.
superres: Super-resolution. This module contains a set of classes and methods that can be used for resolution enhancement.
videostab: Video stabilization. This module contains a set of classes and methods for video stabilization.
viz: 3D visualizer. This module is used to display widgets that provide several methods to interact with scenes and widgets.
Regardless of whether you are a professional software developer or a novice programmer, the OpenCV library will be of interest to graduate students, researchers, and computer programmers in the image-processing and computer vision areas. The library has gained popularity among scientists and academics because it provides many state-of-the-art computer vision algorithms.
Additionally, it is often used as a teaching tool for both computer vision and machine learning. It should be taken into account that OpenCV is robust enough to support real-world applications. That is why OpenCV can be used for non-commercial and commercial products. For example, it is used by companies such as Google, Microsoft, Intel, IBM, Sony, and Honda. Research institutes in leading universities, such as MIT, CMU, or Stanford, provide support for the library. OpenCV has been adopted all around the world. It has more than 14 million downloads and more than 47,000 people in its community.
OpenCV is being used for a very wide range of applications:
2D and 3D feature toolkits
Street view image stitching
Egomotion estimation
Facial-recognition system
Gesture recognition
Human-computer interaction
Mobile robotics
Motion understanding
Object identification
Automated inspection and surveillance
Segmentation and recognition
Stereopsis stereo vision: depth perception from two cameras
Medical image analysis
Structure from motion
Motion tracking
Augmented reality
Video/image search and retrieval
Robot and driverless car navigation and control
Driver drowsiness and distraction detection
OpenCV, Python, and AI-related packages can be installed on most operating systems. We will see how to install these packages by means of different approaches.
Additionally, at the end of this chapter, an introduction to Jupyter Notebook is given due to the popularity of these documents, which can be run to perform data analysis.
In this section, you will see how to install Python, OpenCV, and any other package globally. Specific instructions are given for both Linux and Windows operating systems.
We are going to see how to install Python globally on both the Linux and Windows operating systems.
On Debian derivatives such as Ubuntu, use APT to install Python. Afterwards, it is recommended to upgrade the pip version. pip (https://pip.pypa.io/en/stable/) is the PyPA (https://packaging.python.org/guides/tool-recommendations/) recommended tool for installing Python packages:
$ sudo apt-get install python3.7
$ sudo pip install --upgrade pip
To verify that Python has been installed correctly, open a Command Prompt or shell and run the following command:
$ python3 --version
Python 3.7.0
Go to https://www.python.org/downloads/. The default Python Windows installer is 32-bit. Start the installer. Select Customize installation:
On the next screen, all the optional features should be checked:
Finally, on the next screen, make sure to check Add Python to environment variables and Precompile standard library. Optionally, you can customize the location of the installation, for example, C:\Python37:
Press the Install button and, in a few minutes, the installation should be ready. On the last page of the installer, you should also press Disable path length limit:
To check whether Python has been installed properly, press and hold the Shift key and right-click with your mouse somewhere on your desktop. Select Open command window here. Alternatively, on Windows 10, use the lower-left search box to search for cmd. Now, write python in the command window and press the Enter key. You should see something like this:
You should also upgrade pip:
$ python -m pip install --upgrade pip
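On either operating system, the same check can also be made from inside Python. The following is a minimal sketch; the minimum version of 3.7 is an assumption based on the version installed in this section, and is_supported is a hypothetical helper name:

```python
import sys

# Assumption: 3.7 is used as the minimum because it is the version installed here.
MINIMUM = (3, 7)


def is_supported(version_info, minimum=MINIMUM):
    """Compare the (major, minor) part of a version tuple against the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Report the version of the interpreter that is running this script.
print("Running Python", sys.version.split()[0])
print("Meets the assumed minimum:", is_supported(sys.version_info))
```

Running this with the interpreter you have just installed confirms that the python command resolves to the expected version.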
Now, we are going to install OpenCV on both the Linux and Windows operating systems. First, we are going to see how to install OpenCV on Linux, and then how to install OpenCV on Windows.
Ensure you have installed NumPy. To install NumPy, enter the following:
$ pip3 install numpy
Then install OpenCV:
$ pip3 install opencv-contrib-python
Additionally, we can install Matplotlib, which is a Python plotting library that produces quality figures:
$ pip3 install matplotlib
On Windows, ensure you have installed NumPy. To install NumPy, enter the following:
$ pip install numpy
Then install OpenCV:
$ pip install opencv-contrib-python
Additionally, we can install Matplotlib:
$ pip install matplotlib
One way to test the installation is to execute an OpenCV Python script. To do so, you should have two files, logo.png and test_opencv_installation.py, in a specific folder:
Open a Command Prompt and go to the path where these two files are located. Next, check the installation by typing the following:
python test_opencv_installation.py
You should see both the OpenCV RGB logo and the OpenCV grayscale logo:
In that case, the installation has been successful.
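The contents of test_opencv_installation.py are not reproduced here. As a rough idea of what such a check might look like, the following minimal sketch (with hypothetical function names, and not necessarily the same as the book's script) first verifies that cv2 can be imported and then defines a helper that shows an image in color and in grayscale:

```python
def opencv_version():
    """Return the OpenCV version string, or None if cv2 cannot be imported."""
    try:
        import cv2
    except ImportError:
        return None
    return cv2.__version__


def show_color_and_grayscale(path="logo.png"):
    """Display an image and its grayscale version until a key is pressed."""
    import cv2

    image = cv2.imread(path)  # cv2.imread returns None if the file cannot be read
    if image is None:
        raise FileNotFoundError(path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    cv2.imshow("Color", image)
    cv2.imshow("Grayscale", gray)
    cv2.waitKey(0)  # wait for any key press
    cv2.destroyAllWindows()


# Only the import check runs here; the display helper must be called explicitly.
version = opencv_version()
print("OpenCV version:", version if version else "not installed")
```

Calling show_color_and_grayscale("logo.png") from the folder that contains the image opens the two windows; the import check alone is enough to confirm that pip installed the package.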
virtualenv (https://pypi.org/project/virtualenv/) is a very popular tool that creates isolated Python environments for Python libraries. virtualenv allows multiple Python projects that have different (and sometimes conflicting) requirements to coexist on the same machine. Technically, virtualenv works by installing some files under a directory (for example, env/).
virtualenv also modifies the PATH environment variable to prefix it with a custom bin directory (for example, env/bin/), and an exact copy of the python or python3 binary is placed in this directory. Once this virtual environment is activated, you can install packages in the virtual environment using pip. virtualenv is also recommended by the PyPA (https://packaging.python.org/guides/tool-recommendations/). Therefore, we will see how to install OpenCV or any other packages using virtual environments.
Usually, pip and virtualenv are the only two packages you need to install globally. This is because, once you have installed both packages, you can do all your work inside a virtual environment. In fact, virtualenv is really all you need, because this package provides a copy of pip, which gets copied into every new environment you create.
Now, we will see how to install, activate, use, and deactivate virtual environments. Specific commands are given now for both Linux and Windows operating systems. We are not going to add a specific section for each of the operating systems, because the process is very similar in each one. Let's start installing virtualenv:
$ pip install virtualenv
The next step is to create a new virtual environment. First, change into the root of the project directory. Then use the virtualenv command-line tool to create the environment:
$ virtualenv env
Here, env is the name of the directory in which the virtual environment is created. It is a common convention to call this directory env and to put it inside your project directory. This way, if you keep your code at ~/code/myproject/, the environment will be at ~/code/myproject/env/. Inside this directory (env), some files and folders are created with all you need to run your Python applications. For example, on Windows, the new python executable will be located at env/Scripts/python.exe.
The next step is to activate the env environment that you have just created using the command-line tool to execute the activate script, which is in the following location:
~/code/myprojectname/env/bin/activate (Linux)
~/code/myprojectname/env/Scripts/activate (Windows)
For example, under Windows, you should type the following:
$ ~/code/myprojectname/env/Scripts/activate
(env) $
Now you can install the required packages only for this activated environment. For example, if you want to install Django, which is a free and open source web framework, written in Python, you should type this:
(env)$ pip install Django
You can also deactivate the environment by executing the following:
$ deactivate
$
You should see that you have returned to your normal prompt, indicating that you are no longer in any virtualenv. Finally, if you want to delete your environment, just remove its directory (the rmvirtualenv command is only available if you use the separate virtualenvwrapper tool):
$ rm -r env
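Environments can also be created from Python itself. The following minimal sketch uses the standard library venv module, which is a different tool from virtualenv but produces a similar directory layout, to create an environment and locate its activate script. The helper names are hypothetical, and pip is deliberately left out of the new environment to keep the example fast:

```python
import os
import tempfile
import venv


def activate_script(env_dir):
    """Return where the activate script lives: Scripts on Windows, bin elsewhere."""
    subdir = "Scripts" if os.name == "nt" else "bin"
    return os.path.join(env_dir, subdir, "activate")


def create_environment(project_dir, name="env"):
    """Create project_dir/name with the stdlib venv module and return its path."""
    env_dir = os.path.join(project_dir, name)
    venv.create(env_dir, with_pip=False)  # pass with_pip=True to bootstrap pip
    return env_dir


# Demonstrate in a throwaway directory so that nothing is left behind.
with tempfile.TemporaryDirectory() as project:
    env = create_environment(project)
    print("Created:", os.path.basename(env))
    print("Activate script exists:", os.path.exists(activate_script(env)))
```

Because the temporary directory is removed at the end of the with block, this also shows that deleting an environment amounts to deleting its directory.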
In the next section, we are going to create virtual environments with PyCharm, which is a Python IDE. But before doing that, we are going to discuss IDEs. An IDE (integrated development environment) is a software application that provides programmers with facilities for software development. IDEs present a single program where all the development is done. In connection with Python IDEs, two approaches can be found:
General editors and IDEs with Python support
Python-specific editors and IDEs
In the first category (general IDEs), some examples should be highlighted:
Eclipse + PyDev
Visual Studio + Python Tools for Visual Studio
Atom + Python extension
In the second category, here are some Python-specific IDEs:
PyCharm: One of the best full-featured, dedicated IDEs for Python. PyCharm installs quickly and easily on Windows, macOS, and Linux platforms. It is the de facto Python IDE environment.
Spyder: Spyder, which comes with the Anaconda package manager distribution, is an open source Python IDE that is highly suited for data science workflows.
Thonny: Thonny is intended to be an IDE for beginners. It is available for all major platforms (Windows, macOS, Linux), with installation instructions on the site.
In this case, we are going to install PyCharm (the de facto Python IDE environment) Community Edition. Afterwards, we are going to see how to create virtual environments using this IDE. PyCharm can be downloaded from https://www.jetbrains.com/pycharm/. PyCharm can be installed on Windows, macOS, and Linux:
After the installation of PyCharm, we are ready to use it. Using PyCharm, we can create virtual environments in a very simple and intuitive way.
After opening PyCharm, you can click Create New Project. If you want to create a new environment, you should click on Project Interpreter: New Virtualenv environment. Then click on New environment using Virtualenv. This can be seen in the next screenshot:
You should note that the virtual environment is named (by default in PyCharm) venv and located under the project folder. In this case, the project is named test-env-pycharm and the virtual environment, venv, is located at test-env-pycharm/venv. Additionally, you can see that the venv name can be changed according to your preferences.
When you click on the Create button, PyCharm loads the project and creates the virtual environment. You should see something like this:
After the project is created, you are ready to install a package with just a few clicks. Click on File, then click on Settings... (Ctrl + Alt + S). A new window will appear, showing something like this:
Now, click on Project: and select Project Interpreter. On the right-hand side of this screen, the installed packages are shown in connection with the selected Project Interpreter. You can change it on top of this screen. After selecting the appropriate interpreter (and, hence, the environment for your project), you can install a new package. To do so, you can search in the upper-left input box. In the next screenshot, you can see an example of searching for the numpy package:
You can install the package (latest version by default) by clicking on Install Package. You can also specify a concrete version, as can be seen in the previous screenshot:
After the installation of this package, we can see that we now have three installed packages on our virtual environment. Additionally, it is very easy to change between environments. You should go to Run/Debug Configurations and click on Python interpreter to change between environments. This feature can be seen in the next screenshot:
Finally, you may have noticed that, in the first step of creating a virtual environment with PyCharm, options other than virtualenv are possible. PyCharm gives you the ability to create virtual environments using Virtualenv, Pipenv, and Conda:
We previously introduced Virtualenv and how to work with this tool for creating isolated Python environments for Python libraries.
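Pyenv (https://github.com/pyenv/pyenv) is used to isolate Python versions. For example, you may want to test your code against Python 2.6, 2.7, 3.3, 3.4, and 3.5, so you will need a way to switch between them. Note that Pyenv is a different tool from Pipenv, the option offered by PyCharm, which combines pip and virtualenv to manage the dependencies of a project.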
Conda (https://conda.io/docs/) is an open source package management and environment management system (provides virtual environment capabilities) that runs on Windows, macOS, and Linux. Conda is included in all versions of Anaconda and Miniconda.
Conda installs, runs, and updates packages and their dependencies. Conda can create, save, load, and switch between environments.
Anaconda is a downloadable, free, open source, high-performance Python and R distribution. Anaconda comes with conda, conda build, Python, and more than 100 open source scientific packages and their dependencies. Using the conda install command, you can easily install popular open source packages for data science from the Anaconda repository. Miniconda is a small version of Anaconda, which includes only conda, Python, the packages they depend on, and a small number of other useful packages.
Installing Anaconda or Miniconda is easy. For the sake of simplicity, we are focusing on Anaconda. To install Anaconda, download the Anaconda installer for your operating system (https://www.anaconda.com/download/). Anaconda 5.2 can be installed in both Python 3.6 and Python 2.7 versions on Windows, macOS, and Linux:
After you have finished installing, in order to test the installation, in Terminal or Anaconda Prompt, run the following command:
$ conda list
For a successful installation, a list of installed packages appears. As mentioned, Anaconda (and Miniconda) comes with conda, which is a simple package manager similar to apt-get on Linux. In this way, we can install new packages in Terminal using the following command:
$ conda install packagename
Here, packagename is the actual name of the package we want to install. Existing packages can be updated using the following command:
$ conda update packagename
We can also search for packages using the following command:
$ anaconda search -t conda packagename
This will bring up a whole list of packages available through individual users. A package called packagename from a user called username can then be installed as follows:
$ conda install -c username packagename
Additionally, conda can be used to create and manage virtual environments. For example, creating a test environment and installing NumPy version 1.7 is as simple as typing the next command:
$ conda create --name test numpy=1.7
As with virtualenv, environments can be activated and deactivated. To do this on macOS and Linux, just run the following:
$ source activate test
$ python
...
$ source deactivate
On Windows, run the following:
$ activate test
$ python
...
$ deactivate
Finally, it should be pointed out that we can work with conda under the PyCharm IDE, in a similar way as virtualenv to create and manage virtual environments, because PyCharm can work with both tools.
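When several environments are in use, it helps to confirm which one a script is actually running in. The following is a minimal sketch, not taken from the book, that relies on two facts: inside a virtualenv or venv environment, sys.prefix differs from sys.base_prefix, and an activated conda environment sets the CONDA_DEFAULT_ENV variable. The function name is hypothetical, and very old virtualenv releases, which used sys.real_prefix instead, are not covered:

```python
import os
import sys


def describe_environment(prefix, base_prefix, environ):
    """Classify the running interpreter from its prefixes and environment variables."""
    if prefix != base_prefix:
        # virtualenv and venv point sys.prefix at the environment directory.
        return "virtualenv/venv at " + prefix
    conda_name = environ.get("CONDA_DEFAULT_ENV")
    if conda_name:
        # conda exports the name of the activated environment.
        return "conda environment " + conda_name
    return "global interpreter"


# Describe the interpreter that is running this script.
print(describe_environment(sys.prefix, sys.base_prefix, os.environ))
```

Passing the values in as arguments keeps the function easy to try with made-up inputs, for example describe_environment("/usr", "/usr", {"CONDA_DEFAULT_ENV": "test"}).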
So far, we have seen how to install Python, OpenCV, and a few other packages (numpy and matplotlib) from scratch, or by using the Anaconda distribution, which includes many popular data-science packages. Some knowledge of the main packages for scientific computing, data science, machine learning, and computer vision is important because they offer powerful computational tools. Throughout this book, many Python packages will be used. Not all of the packages cited in this section will be used, but a comprehensive list is provided for the sake of completeness, in order to show the potential of Python in topics related to the content of this book:
NumPy (http://www.numpy.org/) provides support for large, multi-dimensional arrays. NumPy is a key library in computer vision because images can be represented as multi-dimensional arrays. Representing images as NumPy arrays has many advantages.
OpenCV (https://opencv.org/) is an open source computer vision library.
Scikit-image (https://scikit-image.org/) is a collection of algorithms for image processing. Images manipulated by scikit-image are simply NumPy arrays.
The Python Imaging Library (PIL) (http://www.pythonware.com/products/pil/) is an image-processing library that provides powerful image-processing and graphics capabilities.
Pillow (https://pillow.readthedocs.io/) is the friendly PIL fork by Alex Clark and contributors. The PIL adds image-processing capabilities to your Python interpreter.
SimpleCV (http://simplecv.org/) is a framework for computer vision that provides key functionalities to deal with image processing.
Mahotas
(
https://mahotas.readthedocs.io/
