Machine Learning for OpenCV 4 - Aditya Sharma - E-Book

Description

A practical guide to understanding the core machine learning and deep learning algorithms, and implementing them to create intelligent image processing systems using OpenCV 4




Key Features



  • Gain insights into machine learning algorithms, and implement them using OpenCV 4 and scikit-learn


  • Get up to speed with Intel OpenVINO and its integration with OpenCV 4


  • Implement high-performance machine learning models with helpful tips and best practices



Book Description



OpenCV is an open source library for building computer vision apps. The latest release, OpenCV 4, offers a plethora of features and platform improvements that are covered comprehensively in this up-to-date second edition.






You'll start by understanding the new features and setting up OpenCV 4 to build your computer vision applications. You will explore the fundamentals of machine learning and even learn to design different algorithms that can be used for image processing. Gradually, the book will take you through supervised and unsupervised machine learning. You will gain hands-on experience using scikit-learn in Python for a variety of machine learning applications. Later chapters will focus on different machine learning algorithms, such as decision trees, support vector machines (SVMs), and Bayesian learning, and how they can be used for computer vision operations such as object detection. You will then delve into deep learning and ensemble learning, and discover their real-world applications, such as handwritten digit classification and gesture recognition. Finally, you'll get to grips with the latest Intel OpenVINO for building an image processing system.






By the end of this book, you will have developed the skills you need to use machine learning for building intelligent computer vision applications with OpenCV 4.





What you will learn



  • Understand the core machine learning concepts for image processing


  • Explore the theory behind machine learning and deep learning algorithm design


  • Discover effective techniques to train your deep learning models


  • Evaluate machine learning models to improve their performance


  • Integrate algorithms such as support vector machines and Bayes classifier in your computer vision applications


  • Use OpenVINO with OpenCV 4 to speed up model inference



Who this book is for



This book is for computer vision professionals, machine learning developers, or anyone who wants to learn machine learning algorithms and implement them using OpenCV 4. If you want to build real-world computer vision and image processing applications powered by machine learning, then this book is for you. Working knowledge of Python programming is required to get the most out of this book.

You can read this e-book in Legimi apps or any app that supports the following format:

EPUB

Page count: 498

Publication year: 2019




Machine Learning for OpenCV 4, Second Edition

 

 

 

 

 

 

Intelligent algorithms for building image processing apps using OpenCV 4, Python, and scikit-learn

 

 

 

 

 

 

 

Aditya Sharma
Vishwesh Ravi Shrimali
Michael Beyeler

 

 

 

 

 

BIRMINGHAM - MUMBAI

Machine Learning for OpenCV 4 Second Edition

Copyright © 2019 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

 

Commissioning Editor: Pawan Ramchandani
Acquisition Editor: Aniruddha Patil
Content Development Editor: Pratik Andrade
Senior Editor: Ayaan Hoda
Technical Editor: Dinesh Pawar
Copy Editor: Safis Editing
Project Coordinator: Vaidehi Sawant
Proofreader: Safis Editing
Indexer: Rekha Nair
Production Designer: Alishon Mendonsa

First published: July 2017
Second edition: September 2019

Production reference: 1060919

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.

ISBN 978-1-78953-630-0

www.packt.com

 

Packt.com

Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?

Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals

Improve your learning with Skill Plans built especially for you

Get a free eBook or video every month

Fully searchable for easy access to vital information

Copy and paste, print, and bookmark content

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com, and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks. 

Contributors

About the authors

Aditya Sharma is a senior engineer at Robert Bosch working on solving real-world autonomous computer vision problems. At Robert Bosch, he also secured first place in an AI hackathon in 2019. He has been associated with some of the premier institutes of India, including IIT Mandi and IIIT Hyderabad. At IIT, he published papers on medical imaging using deep learning at ICIP 2019 and MICCAI 2019. At IIIT, his work revolved around document image super-resolution.

He is a motivated writer and has written many articles on machine learning and deep learning for DataCamp and LearnOpenCV. Aditya runs his own YouTube channel and has contributed as a speaker at the NCVPRIPG conference (2017) and Aligarh Muslim University for a workshop on deep learning.

 

Vishwesh Ravi Shrimali graduated from BITS Pilani, where he studied mechanical engineering, in 2018. Since then, he has been working with BigVision LLC on deep learning and computer vision and is also involved in creating official OpenCV courses. He has a keen interest in programming and AI and has applied that interest in mechanical engineering projects. He has also written multiple blogs on OpenCV and deep learning on LearnOpenCV, a leading blog on computer vision. When he is not writing blogs or working on projects, he likes to go on long walks or play his acoustic guitar.

 

Michael Beyeler is a postdoctoral fellow in neuroengineering and data science at the University of Washington, where he is working on computational models of bionic vision in order to improve the perceptual experience of blind patients implanted with a retinal prosthesis (bionic eye). His work lies at the intersection of neuroscience, computer engineering, computer vision, and machine learning. He is also an active contributor to several open source software projects, and has professional programming experience in Python, C/C++, CUDA, MATLAB, and Android. Michael received a PhD in computer science from the University of California, Irvine, and an MSc in biomedical engineering and a BSc in electrical engineering from ETH Zurich, Switzerland.

About the reviewers

Wilson Choo is a deep learning engineer working on deep learning modeling research. He has a deep interest in creating applications that implement deep learning, computer vision, and machine learning.

His past work includes the validation and benchmarking of Intel OpenVINO Toolkit algorithms, as well as custom Android OS validation. He has experience in integrating deep learning applications in different hardware and OSes. His native programming languages are Java, Python, and C++.

 

Robert B. Fisher has a PhD from the University of Edinburgh, where he also served as a college dean of research. He is currently the industrial liaison committee chair for the International Association for Pattern Recognition. His research covers topics mainly in high-level computer vision and 3D video analysis, which has led to 5 books and 300 peer-reviewed scientific articles or book chapters (Google H-index: 46). Most recently, he has been the coordinator of an EC-funded project that's developing a gardening robot. He has developed several online computer vision resources with over 1 million hits. He is a fellow of the International Association for Pattern Recognition and the British Machine Vision Association.

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Table of Contents

Title Page

Copyright and Credits

Machine Learning for OpenCV 4 Second Edition

About Packt

Why subscribe?

Contributors

About the authors

About the reviewers

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Section 1: Fundamentals of Machine Learning and OpenCV

A Taste of Machine Learning

Technical requirements

Getting started with machine learning

Problems that machine learning can solve

Getting started with Python

Getting started with OpenCV

Installation

Getting the latest code for this book

Getting to grips with Python's Anaconda distribution

Installing OpenCV in a conda environment

Verifying the installation

Getting a glimpse of OpenCV's ml module

Applications of machine learning

What's new in OpenCV 4.0?

Summary

Working with Data in OpenCV

Technical requirements

Understanding the machine learning workflow

Dealing with data using OpenCV and Python

Starting a new IPython or Jupyter session

Dealing with data using Python's NumPy package

Importing NumPy

Understanding NumPy arrays

Accessing single array elements by indexing

Creating multidimensional arrays

Loading external datasets in Python

Visualizing the data using Matplotlib

Importing Matplotlib

Producing a simple plot

Visualizing data from an external dataset

Dealing with data using OpenCV's TrainData container in C++

Summary

First Steps in Supervised Learning

Technical requirements

Understanding supervised learning

Having a look at supervised learning in OpenCV

Measuring model performance with scoring functions

Scoring classifiers using accuracy, precision, and recall

Scoring regressors using mean squared error, explained variance, and R squared

Using classification models to predict class labels

Understanding the k-NN algorithm

Implementing k-NN in OpenCV

Generating the training data

Training the classifier

Predicting the label of a new data point

Using regression models to predict continuous outcomes

Understanding linear regression

Linear regression in OpenCV

Using linear regression to predict Boston housing prices

Loading the dataset

Training the model

Testing the model

Applying Lasso and ridge regression

Classifying iris species using logistic regression

Understanding logistic regression

Loading the training data

Making it a binary classification problem

Inspecting the data

Splitting data into training and test sets

Training the classifier

Testing the classifier

Summary

Representing Data and Engineering Features

Technical requirements

Understanding feature engineering

Preprocessing data

Standardizing features

Normalizing features

Scaling features to a range

Binarizing features

Handling the missing data

Understanding dimensionality reduction

Implementing Principal Component Analysis (PCA) in OpenCV

Implementing independent component analysis (ICA)

Implementing non-negative matrix factorization (NMF)

Visualizing the dimensionality reduction using t-Distributed Stochastic Neighbor Embedding (t-SNE)

Representing categorical variables

Representing text features

Representing images

Using color spaces

Encoding images in the RGB space

Encoding images in the HSV and HLS space

Detecting corners in images

Using the star detector and BRIEF descriptor

Using Oriented FAST and Rotated BRIEF (ORB)

Summary

Section 2: Operations with OpenCV

Using Decision Trees to Make a Medical Diagnosis

Technical requirements

Understanding decision trees

Building our first decision tree

Generating new data

Understanding the task by understanding the data

Preprocessing the data

Constructing the tree

Visualizing a trained decision tree

Investigating the inner workings of a decision tree

Rating the importance of features

Understanding the decision rules

Controlling the complexity of decision trees

Using decision trees to diagnose breast cancer

Loading the dataset

Building the decision tree

Using decision trees for regression

Summary

Detecting Pedestrians with Support Vector Machines

Technical requirements

Understanding linear SVMs

Learning optimal decision boundaries

Implementing our first SVM

Generating the dataset

Visualizing the dataset

Preprocessing the dataset

Building the support vector machine

Visualizing the decision boundary

Dealing with nonlinear decision boundaries

Understanding the kernel trick

Knowing our kernels

Implementing nonlinear SVMs

Detecting pedestrians in the wild

Obtaining the dataset

Taking a glimpse at the histogram of oriented gradients (HOG)

Generating negatives

Implementing the SVM

Bootstrapping the model

Detecting pedestrians in a larger image

Further improving the model

Multiclass classification using SVMs

About the data

Attribute information

Summary

Implementing a Spam Filter with Bayesian Learning

Technical requirements

Understanding Bayesian inference

Taking a short detour through probability theory

Understanding Bayes' theorem

Understanding the Naive Bayes classifier

Implementing your first Bayesian classifier

Creating a toy dataset

Classifying the data with a normal Bayes classifier

Classifying the data with a Naive Bayes classifier

Visualizing conditional probabilities

Classifying emails using the Naive Bayes classifier

Loading the dataset

Building a data matrix using pandas

Preprocessing the data

Training a normal Bayes classifier

Training on the full dataset

Using n-grams to improve the result

Using TF-IDF to improve the result

Summary

Discovering Hidden Structures with Unsupervised Learning

Technical requirements

Understanding unsupervised learning

Understanding k-means clustering

Implementing our first k-means example

Understanding expectation-maximization

Implementing our expectation-maximization solution

Knowing the limitations of expectation-maximization

The first caveat – no guarantee of finding the global optimum

The second caveat – we must select the number of clusters beforehand

The third caveat – cluster boundaries are linear

The fourth caveat – k-means is slow for a large number of samples

Compressing color spaces using k-means

Visualizing the true-color palette

Reducing the color palette using k-means

Classifying handwritten digits using k-means

Loading the dataset

Running k-means

Organizing clusters as a hierarchical tree

Understanding hierarchical clustering

Implementing agglomerative hierarchical clustering

Comparing clustering algorithms

Summary

Section 3: Advanced Machine Learning with OpenCV

Using Deep Learning to Classify Handwritten Digits

Technical requirements

Understanding the McCulloch-Pitts neuron

Understanding the perceptron

Implementing your first perceptron

Generating a toy dataset

Fitting the perceptron to data

Evaluating the perceptron classifier

Applying the perceptron to data that is not linearly separable

Understanding multilayer perceptrons

Understanding gradient descent

Training MLPs with backpropagation

Implementing an MLP in OpenCV

Preprocessing the data

Creating an MLP classifier in OpenCV

Customizing the MLP classifier

Training and testing the MLP classifier

Getting acquainted with deep learning

Getting acquainted with Keras

Classifying handwritten digits

Loading the MNIST dataset

Preprocessing the MNIST dataset

Training an MLP using OpenCV

Training a deep neural network using Keras

Preprocessing the MNIST dataset

Creating a convolutional neural network

Model summary

Fitting the model

Summary

Ensemble Methods for Classification

Technical requirements

Understanding ensemble methods

Understanding averaging ensembles

Implementing a bagging classifier

Implementing a bagging regressor

Understanding boosting ensembles

Weak learners

Implementing a boosting classifier

Implementing a boosting regressor

Understanding stacking ensembles

Combining decision trees into a random forest

Understanding the shortcomings of decision trees

Implementing our first random forest

Implementing a random forest with scikit-learn

Implementing extremely randomized trees

Using random forests for face recognition

Loading the dataset

Preprocessing the dataset

Training and testing the random forest

Implementing AdaBoost

Implementing AdaBoost in OpenCV

Implementing AdaBoost in scikit-learn

Combining different models into a voting classifier

Understanding different voting schemes

Implementing a voting classifier

Plurality

Summary

Selecting the Right Model with Hyperparameter Tuning

Technical requirements

Evaluating a model

Evaluating a model the wrong way

Evaluating a model in the right way

Selecting the best model

Understanding cross-validation

Manually implementing cross-validation in OpenCV

Using scikit-learn for k-fold cross-validation

Implementing leave-one-out cross-validation

Estimating robustness using bootstrapping

Manually implementing bootstrapping in OpenCV

Assessing the significance of our results

Implementing Student's t-test

Implementing McNemar's test

Tuning hyperparameters with grid search

Implementing a simple grid search

Understanding the value of a validation set

Combining grid search with cross-validation

Combining grid search with nested cross-validation

Scoring models using different evaluation metrics

Choosing the right classification metric

Choosing the right regression metric

Chaining algorithms together to form a pipeline

Implementing pipelines in scikit-learn

Using pipelines in grid searches

Summary

Using OpenVINO with OpenCV

Technical requirements

Introduction to OpenVINO

OpenVINO toolkit installation

OpenVINO components

Interactive face detection demo

Using OpenVINO Inference Engine with OpenCV

Using OpenVINO Model Zoo with OpenCV

Image classification using OpenCV with OpenVINO Inference Engine

Image classification using OpenVINO

Image classification using OpenCV with OpenVINO

Summary

Conclusion

Technical requirements

Approaching a machine learning problem

Building your own estimator

Writing your own OpenCV-based classifier in C++

Writing your own scikit-learn-based classifier in Python

Where to go from here

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think

Preface

As the world changes and humans build smarter and better machines, the demand for machine learning and computer vision experts increases. Machine learning, as the name suggests, is the process by which a machine learns to make predictions given a certain set of parameters as input. Computer vision, on the other hand, gives a machine vision; that is, it makes the machine aware of visual information. When you combine these technologies, you get a machine that can use visual data to make predictions, which brings machines one step closer to having human capabilities.

When you add deep learning to the mix, the machine can even surpass human capabilities in terms of making predictions. This might seem far-fetched, but with AI systems taking over decision-based systems, it has actually become a reality. There are AI cameras, AI monitors, AI sound systems, AI-powered processors, and more. We cannot promise that you will be able to build an AI camera after reading this book, but we do intend to provide you with the tools necessary to do so.

The most powerful tool that we are going to introduce is the OpenCV library, the world's largest computer vision library. Even though its use in machine learning is not very common, we have provided some examples and concepts showing how it can be used for machine learning. We have taken a hands-on approach in this book, and we recommend that you try out every single piece of code in this book to build an application that showcases your knowledge. The world is changing, and this book is our way of helping young minds change it for the better.

Who this book is for

We have tried to explain all the concepts from scratch to make the book suitable for beginners as well as advanced readers. We recommend that readers have some basic knowledge of Python programming, but it's not mandatory. Whenever you encounter some Python syntax that you are not able to understand, make sure you look it up on the internet. Help is always provided to those who look for it.

What this book covers

Chapter 1, A Taste of Machine Learning, starts us off with installing the required software and Python modules for this book.

Chapter 2, Working with Data in OpenCV, takes a look at some basic OpenCV functions.

Chapter 3, First Steps in Supervised Learning, will cover the basics of supervised learning methods in machine learning. We will have a look at some examples of supervised learning methods using OpenCV and the scikit-learn library in Python.

Chapter 4, Representing Data and Engineering Features, will cover concepts such as feature detection and feature recognition using ORB in OpenCV. We will also try to understand important concepts such as the curse of dimensionality.

Chapter 5, Using Decision Trees to Make a Medical Diagnosis, will introduce decision trees and important concepts related to them, including the depth of trees and techniques such as pruning. We will also cover a practical application of predicting breast cancer diagnoses using decision trees.

Chapter 6, Detecting Pedestrians with Support Vector Machines, will start off with an introduction to support vector machines and how they can be implemented in OpenCV. We will also cover an application of pedestrian detection using OpenCV.

Chapter 7, Implementing a Spam Filter with Bayesian Learning, will discuss techniques such as the Naive Bayes algorithm, multinomial Naive Bayes, and more, as well as how they can be implemented. Finally, we will build a machine learning application to classify data into spam and ham.

Chapter 8, Discovering Hidden Structures with Unsupervised Learning, will be our first introduction to the second class of machine learning algorithms: unsupervised learning. We will discuss clustering techniques such as k-means, expectation-maximization, and hierarchical clustering.

Chapter 9, Using Deep Learning to Classify Handwritten Digits, will introduce deep learning techniques and we will see how we can use deep neural networks to classify images from the MNIST dataset.

Chapter 10, Ensemble Methods for Classification, will cover topics such as random forest, bagging, and boosting for classification purposes.

Chapter 11, Selecting the Right Model with Hyperparameter Tuning, will go over the process of selecting the optimum set of parameters in various machine learning methods in order to improve the performance of a model.

Chapter 12, Using OpenVINO with OpenCV, will introduce the OpenVINO Toolkit, support for which was added in OpenCV 4.0. We will also go over how to use it with OpenCV, taking image classification as an example.

Chapter 13, Conclusion, will provide a summary of the major topics that we have covered in the book and talk about what you can do next.

To get the most out of this book

If you are a beginner in Python, we recommend that you go through any good Python programming book, online tutorials, or videos. You can also have a look at DataCamp (http://www.datacamp.com) to learn Python using interactive lessons.

We also recommend that you learn some basic concepts about the Matplotlib library in Python. You can try out this tutorial for that: https://www.datacamp.com/community/tutorials/matplotlib-tutorial-python.

You don't need to have anything installed on your system for this book before starting it. We will cover all the installation steps in the first chapter.

Download the example code files

You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

1. Log in or register at www.packt.com.
2. Select the Support tab.
3. Click on Code Downloads.
4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR/7-Zip for Windows

Zipeg/iZip/UnRarX for Mac

7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Machine-Learning-for-OpenCV-Second-Edition. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/files/downloads/9781789536300_ColorImages.pdf.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.

Section 1: Fundamentals of Machine Learning and OpenCV

In the very first section of this book, we will go over the basics of machine learning and OpenCV, starting with installing the required libraries, and then moving on to basic OpenCV functions, the basics of supervised learning and their applications, and finally, feature detection and recognition using OpenCV.

This section includes the following chapters:

Chapter 1, A Taste of Machine Learning

Chapter 2, Working with Data in OpenCV

Chapter 3, First Steps in Supervised Learning

Chapter 4, Representing Data and Engineering Features

A Taste of Machine Learning

So, you have decided to enter the field of machine learning. That's great!

Nowadays, machine learning is all around us—from protecting our email, to automatically tagging our friends in pictures, to predicting what movies we like. As a form of artificial intelligence, machine learning enables computers to learn through experience: to make predictions about the future using data collected from the past. On top of that, computer vision is one of today's most exciting application fields of machine learning, with deep learning and convolutional neural networks driving innovative systems such as self-driving cars and Google's DeepMind.

However, fret not; your application does not need to be as large-scale or world-changing as the previous examples in order to benefit from machine learning. In this chapter, we will talk about why machine learning has become so popular and discuss the kinds of problems that it can solve. We will then introduce the tools that we need in order to solve machine learning problems using OpenCV. Throughout the book, I will assume that you already have a basic knowledge of OpenCV and Python, but that there is always room to learn more. We will also go over how you can install OpenCV on your local system so that you can try out the code on your own.

Are you ready then? In this chapter, we will go over the following concepts:

What is machine learning and what are its categories?

Important Python concepts

Getting started with OpenCV

Installing Python and the required modules on the local system

Applications of machine learning

What's new in OpenCV 4.0?

Technical requirements

You can refer to the code for this chapter at the following link: https://github.com/PacktPublishing/Machine-Learning-for-OpenCV-Second-Edition/tree/master/Chapter01.

Here is a short summary of the software and hardware requirements:

OpenCV version 4.1.x (4.1.0 or 4.1.1 will both work just fine).

Python version 3.6 (any Python version 3.x will be fine).

Anaconda Python 3 for installing Python and the required modules.

You can use any OS—macOS, Windows, and Linux-based OS—with this book. We recommend you have at least 4 GB RAM in your system.

You don't need to have a GPU to run the code provided with this book.

Getting started with machine learning

Machine learning has been around for at least 60 years. Growing out of the quest for artificial intelligence, early machine learning systems relied on hand-coded rules in the form of if...else statements to process data and make decisions. Think of a spam filter whose job is to parse incoming emails and move unwanted messages to a spam folder, as shown in the following diagram:

We could come up with a blacklist of words that, whenever they show up in a message, would mark an email as spam. This is a simple example of a hand-coded expert system. (We will build a smarter one in Chapter 7, Implementing a Spam Filter with Bayesian Learning.)
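As a toy illustration of such a hand-coded rule (the blacklist contents and the function name below are invented for this sketch, not taken from the book), it might look like this in Python:

```python
# A hand-coded "expert system" spam filter: flag any message that
# contains a blacklisted word. The word list is purely illustrative.
BLACKLIST = {"lottery", "prize", "viagra", "inheritance"}

def is_spam(message: str) -> bool:
    """Return True if any blacklisted word appears in the message."""
    words = message.lower().split()
    return any(word in BLACKLIST for word in words)

print(is_spam("Claim your lottery prize now"))    # True
print(is_spam("Meeting moved to 3 pm tomorrow"))  # False
```

Every new kind of unwanted email would require editing the word list by hand, which already hints at the limitations of hand-coded rules.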

These expert decision rules can become arbitrarily complicated if we are allowed to combine and nest them in what is known as a decision tree (Chapter 5, Using Decision Trees to Make a Medical Diagnosis). Then, it becomes possible to make more informed decisions that involve a series of decision steps. Note that even though a decision tree may look like a set of if...else conditions, it is much more than that: it is actually a kind of machine learning algorithm, which we will explore in Chapter 5, Using Decision Trees to Make a Medical Diagnosis.

Hand-coding these decision rules is sometimes feasible but it has two major disadvantages:

The logic required to make a decision applies only to a specific task in a single domain. For example, there is no way that we could use this spam filter to tag our friends in a picture. Even if we wanted to change the spam filter to do something slightly different, such as filtering out phishing emails (intended to steal your personal data) in general, we would have to redesign all the decision rules.

Designing rules by hand requires a deep understanding of the problem. We would have to know exactly what type of emails constitute spam, including all possible exceptions. This is not as easy as it seems; otherwise, we wouldn't often be double-checking our spam folder for important messages that might have been accidentally filtered out. For other domain problems, it is simply not possible to design the rules by hand.

This is where machine learning comes in. Sometimes, tasks cannot be defined well—except maybe by example—and we would like machines to make sense of and solve the tasks by themselves. Other times, it is possible that important relationships and correlations are hidden among large piles of data that we as humans might have missed (see Chapter 8, Discovering Hidden Structures with Unsupervised Learning). When dealing with large amounts of data, machine learning can often be used to extract these hidden relationships (also known as data mining).

A good example of where man-made expert systems have failed is in detecting faces in images. Silly, isn't it? Today, every smartphone can detect a face in an image. However, 20 years ago, this problem was largely unsolved. The reason for this was that the way humans think about what constitutes a face was not very helpful to machines. As humans, we tend not to think in pixels. If we were asked to detect a face, we would probably just look for the defining features of a face, such as eyes, nose, mouth, and so on. But how would we tell a machine what to look for, when all the machine knows is that images have pixels and pixels have a certain shade of gray? For the longest time, this difference in image representation basically made it impossible for a human to come up with a good set of decision rules that would allow a machine to detect a face in an image. We will talk about different approaches to this problem in Chapter 4, Representing Data and Engineering Features.

However, with the advent of convolutional neural networks and deep learning (Chapter 9, Using Deep Learning to Classify Handwritten Digits), machines have become as successful as us when it comes to recognizing faces. All we had to do was simply present a large collection of images of faces to the machine. Most approaches also require some form of annotation about where the faces are in the training data. From there on, the machine was able to discover the set of characteristics that would allow it to identify a face, without having to approach the problem in the same way as we would. This is the true power of machine learning.

Problems that machine learning can solve

Most machine learning problems belong to one of the following three main categories:

In supervised learning, we have what is referred to as the label for a data point. This can be the class of an object captured in the image, a bounding box around a face, the digit present in the image, or anything else. Think of it as a teacher who teaches but also tells you the correct answer to each problem. The student can then try to devise a model, or an equation, that takes into account all the problems and their correct answers and works out the answer to a new problem whose correct answer is unknown. The data that goes into learning the model is called the training data, and the data on which the model is tested is called the test data. These predictions come in two flavors: identifying new photos with the correct animal (called a classification problem) or assigning accurate sale prices to other used cars (called a regression problem). Don't worry if this seems a little over your head for now; we will have the entirety of the book to nail down the details.

In unsupervised learning, data points have no labels associated with them (Chapter 8, Discovering Hidden Structures with Unsupervised Learning). Think of it like a class where the instructor hands you a jumbled puzzle and leaves it up to you to figure out what to do. Here, the most common result is clusters, which contain objects with similar characteristics. Unsupervised learning can also produce different ways of looking at high-dimensional (complex) data so that it appears simpler.

Reinforcement learning is about maximizing a reward. If a teacher gives you a candy for every correct answer and punishes you for every incorrect one, they are reinforcing the desired behavior: you learn to act so as to increase the number of candies you receive rather than the number of punishments.
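As a quick taste of the first two categories, the following sketch contrasts them on the same toy data using scikit-learn (which we install later in this chapter); the data points are made up for illustration:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

# Toy data: two loose groups of 2D points (made up for illustration)
X = np.array([[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],
              [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])  # labels available -> supervised

# Supervised: learn from (X, y), then predict the label of a new point
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(clf.predict([[1.0, 1.0]]))  # the new point falls in class 0

# Unsupervised: only X is given; k-means discovers two clusters on its own
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)  # cluster assignments, with arbitrary cluster numbering
```

The classifier needed the labels `y`; the clustering model recovered the two groups from `X` alone, which is precisely the difference between the two categories.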

These three main categories are illustrated in the following diagram:

Now that we have covered the main machine learning categories, let's go over some concepts in Python that will prove very useful along the journey of this book.

Getting started with Python

Python has become the common language for many data science and machine learning applications, thanks to its great number of open source libraries for processes such as data loading, data visualization, statistics, image processing, and natural language processing. One of the main advantages of using Python is the ability to interact directly with the code, using a Terminal or other tools such as the Jupyter Notebook, which we'll look at shortly.

If you have mostly been using OpenCV in combination with C++, I would strongly suggest that you switch to Python, at least for the purpose of studying this book. This decision has not been made out of spite! Quite the contrary: I have done my fair share of C/C++ programming—especially in combination with GPU computing via NVIDIA's Compute Unified Device Architecture (CUDA)—and I like it a lot. However, I consider Python to be a better choice if you want to pick up a new topical skill because you can do more by typing less. This will help reduce the cognitive load. Rather than getting annoyed by the syntactic subtleties of C++ or wasting hours trying to convert data from one format to another, Python will help you concentrate on the topic at hand: becoming an expert in machine learning.

Getting started with OpenCV

Being the avid user of OpenCV that I believe you are, I probably don't have to convince you about the power of OpenCV.

Built to provide a common infrastructure for computer vision applications, OpenCV has become a comprehensive set of both classic and state-of-the-art computer vision and machine learning algorithms. According to their own documentation, OpenCV has a user community of more than 47,000 people and has been downloaded over seven million times. That's pretty impressive! As an open source project, it is very easy for researchers, businesses, and government bodies to utilize and modify already available code.

That being said, a number of open source machine learning libraries have popped up as part of the recent machine learning boom that provide far more functionality than OpenCV. A prominent example is scikit-learn, which provides a number of state-of-the-art machine learning algorithms as well as a wealth of online tutorials and code snippets. As OpenCV was developed mainly to provide computer vision algorithms, its machine learning functionality is restricted to a single module, called ml. As we will see in this book, OpenCV still provides a number of state-of-the-art algorithms, but sometimes lacks a bit in functionality. In these rare cases, instead of reinventing the wheel, we will simply use scikit-learn for our purposes.

Last but not least, installing OpenCV using the Python Anaconda distribution is essentially a one-liner as we'll see in the following sections.

If you are a more advanced user who wants to build real-time applications, OpenCV's algorithms are well-optimized for this task, and Python provides several ways to speed up computations where it is necessary (using, for example, Cython or parallel processing libraries such as joblib or dask).
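As a tiny illustration of one of those options, joblib can parallelize an independent loop across CPU cores in a single line; the workload here is just a stand-in for an expensive per-image computation:

```python
from joblib import Parallel, delayed

def slow_square(x):
    # Stand-in for an expensive computation (e.g. processing one image)
    return x * x

# Run the loop body on two worker processes; results keep their order
results = Parallel(n_jobs=2)(delayed(slow_square)(i) for i in range(8))
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```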

Installation

Before we get started, let's make sure that we have all the tools and libraries installed that are necessary to create a fully functioning data science environment. After downloading the latest code for this book from GitHub, we are going to install the following software:

Python's Anaconda distribution, based on Python 3.6 or higher

OpenCV 4.1

Some supporting packages

Don't feel like installing stuff? You can also visit https://mybinder.org/v2/gh/PacktPublishing/Machine-Learning-for-OpenCV-Second-Edition/master, where you will find all the code for this book in an interactive, executable environment and 100% free and open source, thanks to the Binder project.

Getting the latest code for this book

You can get the latest code for this book from GitHub: https://github.com/PacktPublishing/Machine-Learning-for-OpenCV-Second-Edition. You can either download a .zip package (beginners) or clone the repository using Git (intermediate users).

Git is a version control system that allows you to track changes in files and collaborate with others on your code. In addition, the web platform GitHub makes it easy for people to share their code with you on a public server. As I make improvements to the code, you can easily update your local copy, file bug reports, or suggest code changes.

If you choose to go with git, the first step is to make sure it is installed (https://git-scm.com/downloads).

Then, open a Terminal (or Command Prompt, as it is called in Windows):

On Windows 10, right-click on the Start Menu button, and select Command Prompt.

On macOS X, press Cmd + Space to open Spotlight search, then type terminal, and hit Enter.

On Ubuntu, Linux/Unix, and friends, press Ctrl + Alt + T. On Red Hat, right-click on the desktop and choose Open Terminal from the menu.

Navigate to a directory where you want the code downloaded:

cd Desktop

Then you can grab a local copy of the latest code by typing the following:

git clone https://github.com/PacktPublishing/Machine-Learning-for-OpenCV-Second-Edition.git OpenCV-ML

This will download the latest code in a folder called OpenCV-ML.

After a while, the code might change online. In that case, you can update your local copy by running the following command from within the OpenCV-ML directory:

git pull origin master

Getting to grips with Python's Anaconda distribution

Anaconda is a free Python distribution developed by Continuum Analytics that is made for scientific computing. It works across Windows, Linux, and macOS X platforms and is free, even for commercial use. However, the best thing about it is that it comes with a number of preinstalled packages that are essential for data science, math, and engineering. These packages include the following:

NumPy: A fundamental package for scientific computing in Python that provides functionality for multidimensional arrays, high-level mathematical functions, and pseudo-random number generators

SciPy: A collection of functions for scientific computing in Python that provides advanced linear algebra routines, mathematical function optimization, signal processing, and so on

scikit-learn: An open source machine learning library in Python that provides useful helper functions and infrastructure that OpenCV lacks

Matplotlib: The primary scientific plotting library in Python, which provides functionality for producing line charts, histograms, scatter plots, and so on

Jupyter Notebook: An interactive environment for running code in a web browser that also supports Markdown, which helps in maintaining well-commented and detailed project notebooks
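As a quick taste of the first package in that list, NumPy's n-dimensional arrays, vectorized math, and random number generation look like this (a minimal sketch, not tied to any chapter):

```python
import numpy as np

# A 2 x 3 array built from a range, then summed column-wise
a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
print(a.sum(axis=0))             # column sums -> [3 5 7]

# A seeded pseudo-random number generator for reproducible experiments
rng = np.random.default_rng(seed=0)
print(rng.random(3).shape)       # three pseudo-random floats in [0, 1)
```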

An installer for our platform of choice (Windows, macOS X, or Linux) can be found on the Continuum website, https://www.anaconda.com/download. I recommend using the Python 3.6-based distribution, as Python 2 is no longer under active development.

To run the installer, do one of the following:

On Windows, double-click on the .exe file and follow the instructions on the screen

On macOS X, double-click on the .pkg file and follow the instructions on the screen

On Linux, open a Terminal and run the .sh script using bash, as shown here:

$ bash Anaconda3-2018.12-Linux-x86_64.sh # Python 3.6 based

In addition, Python Anaconda comes with conda—a simple package manager similar to apt-get on Linux. After successful installation, we can install new packages by typing the following command in the Terminal:

$ conda install package_name

Here, package_name is the actual name of the package that we want to install.

Existing packages can be updated using the following command:

$ conda update package_name

We can also search for packages using the following command:

$ anaconda search -t conda package_name

This will bring up a whole list of packages made available by developers. For example, searching for a package named opencv, we get the following hits:

This will bring up a long list of users who have published OpenCV packages, allowing us to locate one that offers the right version of the software for our platform. A package called package_name from a user called user_name can then be installed as follows:

$ conda install -c user_name package_name

Finally, conda provides something called an environment, which allows us to manage different versions of Python and/or packages installed in them. This means we could have a separate environment where we have all packages necessary to run OpenCV 4.1 with Python 3.6. In the following section, we will create an environment that contains all the packages needed to run the code in this book.

Installing OpenCV in a conda environment

We will carry out the following steps to install OpenCV:

In a Terminal, navigate to the directory where you downloaded the code:

$ cd Desktop/OpenCV-ML

Then, run the following command to create a conda environment based on Python 3.6, which will also install all the necessary packages listed in the environment.yml file (available in the GitHub repository) in one fell swoop:

$ conda env create -f environment.yml

You can also have a look at the environment.yml file:

name: OpenCV-ML
channels:
  - conda-forge
dependencies:
  - python==3.6
  - numpy==1.15.4
  - scipy==1.1.0
  - scikit-learn==0.20.1
  - matplotlib
  - jupyter==1.0
  - notebook==5.7.4
  - pandas==0.23.4
  - theano
  - keras==2.2.4
  - mkl-service==1.1.2
  - pip
  - pip:
    - opencv-contrib-python==4.1.0.25

Notice that the environment's name will be OpenCV-ML. This code will use the conda-forge channel to download all the conda-based dependencies and use pip to install OpenCV 4.1 (along with opencv_contrib).

To activate the environment, type one of the following, depending on your platform:

$ source activate OpenCV-ML # on Linux / Mac OS X

$ activate OpenCV-ML # on Windows

When we close the Terminal, the session will be deactivated—so we will have to run this last command again the next time we open a new Terminal. We can also deactivate the environment by hand:

$ source deactivate # on Linux / Mac OS X

$ deactivate # on Windows

And done! Let's verify that the installation was successful.

Verifying the installation

It's a good idea to double-check our installation. While our terminal is still open, we start IPython, which is an interactive shell to run Python commands:

$ ipython

Next, make sure that you are running (at least) Python 3.6 and not Python 2.7. You might see the version number displayed in IPython's welcome message. If not, you can run the following commands:

In [1]: import sys
   ...: print(sys.version)
3.6.0 | packaged by conda-forge | (default, Feb 9 2017, 14:36:55)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)]

Now try to import OpenCV as follows:

In [2]: import cv2

You should get no error messages. Then, try to find out the version number like so:

In [3]: cv2.__version__
Out[3]: '4.1.0'

Make sure that OpenCV's version number reads 4.1.0; otherwise, you will not be able to use some OpenCV functionality later on.

The OpenCV module in Python is called cv2, regardless of the library's version. I know it's confusing. Apparently, the reason for this is that the 2 does not stand for the version number. Instead, it is meant to highlight the difference between the underlying C API (which is denoted by the cv prefix) and the C++ API (which is denoted by the cv2 prefix).

You can then exit the IPython shell by typing exit, or hitting Ctrl + D and confirming that you want to quit.

Alternatively, you can run the code in a web browser thanks to the Jupyter Notebook. If you have never heard of Jupyter Notebooks or played with them before, trust me—you will love them! If you followed the directions as mentioned earlier and installed the Python Anaconda stack, Jupyter is already installed and ready to go. In a Terminal, type this:

$ jupyter notebook

This will automatically open a browser window, showing a list of files in the current directory. Click on the OpenCV-ML folder, then on the notebooks folder, and voila! Here you will find all the code for this book, ready to be explored:

The notebooks are arranged by chapter and section. For the most part, they contain only the relevant code, but no additional information or explanations. These are reserved for those who support our effort by buying this book—so thank you!

Simply click on a notebook of your choice, such as 01.00-A-Taste-of-Machine-Learning.ipynb, and you will be able to run the code yourself by selecting Kernel | Restart & Run All:

There are a few handy keyboard shortcuts for navigating Jupyter Notebooks. However, the only ones that you need to know about right now are the following:

Click in a cell (note the highlighted region in the preceding screenshot; that's referred to as a cell) in order to edit it

While the cell is selected, hit Ctrl + Enter to execute the code in it

Alternatively, hit Shift + Enter to execute a cell and select the cell below it

Hit Esc to exit write mode, then hit A to insert a cell above the currently selected one and B to insert a cell below

Check out all the keyboard shortcuts by clicking on Help | Keyboard Shortcut, or take a quick tour by clicking on Help | User Interface Tour.

However, I strongly encourage you to follow along with the book by actually typing out the commands yourself, preferably in an IPython shell or an empty Jupyter Notebook. There is no better way to learn how to code than by getting your hands dirty. Even better if you make mistakes—we have all been there. At the end of the day, it's all about learning by doing!

Getting a glimpse of OpenCV's ml module

Starting with OpenCV 3.1, all machine learning related functions in OpenCV have been grouped into the ml module. This has been the case for the C++ API for quite some time. You can get a glimpse of what's to come by displaying all functions in the ml module:

In [4]: dir(cv2.ml)
Out[4]: ['ANN_MLP_ANNEAL',
 'ANN_MLP_BACKPROP',
 'ANN_MLP_GAUSSIAN',
 'ANN_MLP_IDENTITY',
 'ANN_MLP_LEAKYRELU',
 'ANN_MLP_NO_INPUT_SCALE',
 'ANN_MLP_NO_OUTPUT_SCALE',
 ...
 '__spec__']

If you have installed an older version of OpenCV, the ml module might not be present. For example, the k-nearest neighbor algorithm (which we will talk about in Chapter 3, First Steps in Supervised Learning) used to be called cv2.KNearest() but is now called cv2.ml.KNearest_create(). In order to avoid confusion throughout the book, I recommend using OpenCV 4.1.

This is all good, but by now you will be wondering why you should even learn machine learning and what its applications are. Let's answer this question in the next section.

Applications of machine learning

Machine learning, artificial intelligence, deep learning, and data science are four terms that I believe are going to change the way we have always looked at things. Let's see if I can convince you why I believe so.

From making a computer learn how to play Go and defeat the world champion of the very same game to using the same branch to detect whether a person has a tumor or not just by seeing their brain's CT Scan, machine learning has left its mark in every single domain. One of the projects that I worked on was using machine learning to determine the residual life cycle of boiler water wall tubes in thermal power plants. The proposed solution was successful in saving a huge amount of money by using the tubes more efficiently. If you thought that machine learning applications are limited to engineering and medical science, then you are wrong. Researchers have applied machine learning concepts to process newspapers and predict the effect of news on the chances of a particular candidate winning the US presidential elections. 

Deep learning and computer vision concepts have been applied to colorize black and white movies (have a look at this blog post—https://www.learnopencv.com/convolutional-neural-network-based-image-colorization-using-opencv/), to create super-slow-motion movies, to restore torn out portions of famous artworks, and more.

I hope I have managed to convince you about the importance and the power of machine learning. You have made the right decision to explore this field. But, if you are not a computer science engineer and are worried that you might end up working in a domain that is not your favorite, do not worry. Machine learning is an extra skillset that you can always apply to a problem of your choice.

What's new in OpenCV 4.0?

So, we come to the last section of the very first chapter. I will keep it short and to the point since you as a reader can safely skip it. The topic of our discussion is OpenCV 4.0.

OpenCV 4.0 is the result of three and a half years of hard work and bug fixes by the OpenCV team, and it was finally released in November 2018. In this section, we will look at some of the major changes and new features in OpenCV 4.0:

With the OpenCV 4.0 release, OpenCV has officially become a C++11 library. This means that you have to make sure that a C++11 compliant compiler is present in your system when you are trying to compile OpenCV 4.0. 

In continuation of the previous point, a lot of C APIs have been removed. Some of the affected modules include the Video I/O module (videoio), the Object Detection module (objdetect), and others. The C API for XML, YAML, and JSON file I/O has also been removed.

OpenCV 4.0 also has a lot of improvements in the DNN module (the deep learning module). ONNX support has been added.

Intel OpenVINO also marks its presence in the new OpenCV version. We will be looking into this in more detail in later chapters.

OpenCL acceleration has been fixed on AMD and NVIDIA GPUs. 

OpenCV Graph API has also been added, which is a highly efficient engine for image processing and other operations.

As in every OpenCV release, there have been a lot of changes with the purpose of improving the performance. Some new features such as QR Code Detection and Decoding have also been added.

In short, there have been a lot of changes in OpenCV 4.0, and each has its own uses. For example, ONNX support helps in the portability of models across various languages and frameworks, OpenCL reduces the runtime of computer vision applications, the Graph API helps increase the efficiency of applications, and the OpenVINO toolkit uses Intel's processors and a model zoo to provide highly efficient deep learning models. We will focus primarily on the OpenVINO toolkit and DLDT, as well as on accelerating computer vision applications, in later chapters. I should also point out that both OpenCV 3.4.4 and OpenCV 4.0.0 are still receiving frequent bug fixes, so if you are going to use either of them in an application, be prepared to modify your code and installation to incorporate the changes. On a similar note, OpenCV 4.0.1 and OpenCV 3.4.5 were released within a few months of their predecessors.

Summary

In this chapter, we talked about machine learning at a high abstraction level: what it is, why it is important, and what kinds of problems it can solve. We learned that machine learning problems come in three flavors: supervised learning, unsupervised learning, and reinforcement learning. We talked about the prominence of supervised learning, and that this field can be further divided into two subfields: classification and regression. Classification models allow us to categorize objects into known classes (such as animals into cats and dogs), whereas regression analysis can be used to predict continuous outcomes of target variables (such as the sales price of used cars).

We also learned how to set up a data science environment using the Python Anaconda distribution, how to get the latest code of this book from GitHub, and how to run code in a Jupyter Notebook.

With these tools in hand, we are now ready to start talking about machine learning in more detail. In the next chapter, we will look at the inner workings of machine learning systems and learn how to work with data in OpenCV with the help of common Pythonic tools such as NumPy and Matplotlib.

Working with Data in OpenCV

Now that we have whetted our appetite for machine learning, it is time to delve a little deeper into the different parts that make up a typical machine learning system.

Far too often, you hear someone throw around the phrase, Just apply machine learning to your data!, as if that will instantly solve all of your problems. You can imagine that the reality is much more intricate, although I will admit that nowadays it is incredibly easy to build your own machine learning system simply by cutting and pasting a few lines of code from the internet. However, to build a system that is truly powerful and effective, it is essential to have a firm grasp of the underlying concepts and an intimate knowledge of the strengths and weaknesses of each method. So, don't worry if you don't consider yourself a machine learning expert just yet. Good things take time.

Earlier, I described machine learning as a subfield of artificial intelligence. This might be true—mainly for historical reasons—but most often, machine learning is simply about making sense of data. Therefore, it might be more suitable to think of machine learning as a subfield of data science, where we build mathematical models to help us to understand data.

Hence, this chapter is all about data. We want to learn how data fits in with machine learning and how to work with data using the tools of our choice: OpenCV and Python.

In this chapter, we will cover the following topics:

Understanding the machine learning workflow

Understanding training data and test data

Learning how to load, store, edit, and visualize data with OpenCV and Python

Technical requirements

You can refer to the code for this chapter from the following link: https://github.com/PacktPublishing/Machine-Learning-for-OpenCV-Second-Edition/tree/master/Chapter02.

Here is a summary of the software and hardware requirements:

You will need OpenCV version 4.1.x (4.1.0 or 4.1.1 will both work just fine).

You will need Python version 3.6 (any Python version 3.x will be fine).

You will need Anaconda Python 3 for installing Python and the required modules.

You can use any OS—macOS, Windows, and Linux-based OSes along with this book. We recommend you have at least 4 GB RAM in your system.

You don't need to have a GPU to run the code provided along with this book.

Understanding the machine learning workflow

As mentioned earlier, machine learning is all about building mathematical models to understand data. The learning aspect enters this process when we give a machine learning model the capability to adjust its internal parameters; we can tweak these parameters so that the model explains the data better. In a sense, this can be understood as the model learning from the data. Once the model has learned enough—whatever that means—we can ask it to explain newly observed data.

A typical classification process is illustrated in the following diagram:

Let's break it down step by step.

The first thing to notice is that machine learning problems are always split into (at least) two distinct phases:

A training phase, during which we aim to train a machine learning model on a set of data that we call the training dataset

A test phase, during which we evaluate the learned (or finalized) machine learning model on a new set of never-before-seen data that we call the test dataset

The importance of splitting our data into a training set and test set cannot be overstated. We always evaluate our models on an independent test set because we are interested in knowing how well our models generalize to new data. In the end, isn't this what learning is all about, be it machine learning or human learning? Think back to school, when you were a learner yourself: the problems you had to solve as part of your homework would never show up in exactly the same form in the final exam. The same scrutiny should be applied to a machine learning model; we are not so much interested in how well our models can memorize a set of data points (such as a homework problem), but we want to know how our models will use what they have learned to solve new problems (such as the ones that show up in a final exam) and explain new data points.
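This split is a one-liner with scikit-learn's train_test_split helper (the function name is scikit-learn's, not OpenCV's; the data here is made up):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 100 made-up samples with 2 features each, and alternating binary labels
X = np.arange(200, dtype=float).reshape(100, 2)
y = np.array([0, 1] * 50)

# Hold out 20% of the data as a never-before-seen test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

print(X_train.shape, X_test.shape)  # (80, 2) (20, 2)
```

A fixed random_state makes the split reproducible, which is useful when comparing models on exactly the same held-out data.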

The workflow of an advanced machine learning problem will typically include a third set of data termed a validation dataset