
Machine Learning for OpenCV
A practical introduction to the world of machine learning and image processing using OpenCV and Python
Michael Beyeler

BIRMINGHAM - MUMBAI

Machine Learning for OpenCV

Copyright © 2017 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: July 2017

Production reference: 1130717

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-78398-028-4

www.packtpub.com

Credits

Author

Michael Beyeler

Copy Editor

Manisha Sinha

Reviewers

Vipul Sharma

Rahul Kavi

Project Coordinator

Manthan Patel

Commissioning Editor

Veena Pagare

Proofreader

Safis Editing

Acquisition Editor

Varsha Shetty

Indexer

Tejal Daruwale Soni

Content Development Editor

Jagruti Babaria

Graphics

Tania Dutta

Technical Editor

Sagar Sawant

Production Coordinator

Deepika Naik

Foreword

Over the last few years, our machines have slowly but surely learned how to see for themselves. We now take it for granted that our cameras detect our faces in pictures that we take, and that social media apps can even recognize us and our friends in the photos that we upload from these cameras. Over the next few years, we will experience even more radical transformations. Before long, cars will be driving themselves, our cellphones will be able to read and translate a sign in any language for us, and our X-rays and other medical images will be read and analyzed by powerful algorithms that will be able to accurately suggest a medical diagnosis, and even recommend effective treatments.

These transformations are driven by an explosive combination of increased computing power, masses of image data, and a set of clever ideas taken from math, statistics, and computer science. This rapidly growing intersection that is machine learning has taken off, affecting many of our day-to-day interactions with the world, and with each other. One of the most remarkable features of the current machine learning paradigm-shift in computer vision is that it relies to a large extent on software tools that are freely available and developed by large groups of volunteers, hobbyists, scientists, and engineers in open source communities. This means that, in principle, the barriers to entry are also lower than ever: anyone who is interested in putting their mind to it can harness machine learning for image processing.

However, just like in a garden with many forking paths, the wealth of tools and ideas, and the rapid development of these ideas, underscores the need for a guide who can show you the way, and orient you in the right direction. I have some good news for you: having picked up this book, you are in the good hands of my colleague and collaborator Dr. Michael Beyeler as your guide. With his broad range of expertise, Michael is a hard-nosed engineer, computer scientist, and neuroscientist, as well as a prolific open source software developer. He has not only taught robots how to see and navigate through complex environments, and computers how to model brain activity, but he also regularly teaches humans how to use programming to solve a variety of different machine learning and image processing problems. This means that you will benefit not only from the sure-handed rigor of his expertise and experience, but also from his thoughtfulness in teaching the ideas in his book, as well as a good dose of his sense of humor.

The second piece of good news is that this is going to be an exhilarating trip. There's nothing that matches the thrill of understanding that comes from putting together the pieces of the puzzle that go into solving a problem in computer vision and machine learning with code and data. As Richard Feynman put it: "What I cannot create, I do not understand". So, get ready to get your hands dirty (so to speak) with the code and data in the (open source!) code examples that accompany this book, and to get creative. Understanding will surely follow.

Ariel Rokem
Data Scientist, The University of Washington eScience Institute

About the Author

Michael Beyeler is a Postdoctoral Fellow in Neuroengineering and Data Science at the University of Washington, where he is working on computational models of bionic vision in order to improve the perceptual experience of blind patients implanted with a retinal prosthesis (bionic eye). His work lies at the intersection of neuroscience, computer engineering, computer vision, and machine learning. Michael is the author of OpenCV with Python Blueprints (Packt Publishing, 2015), a practical guide for building advanced computer vision projects. He is also an active contributor to several open source software projects, and has professional programming experience in Python, C/C++, CUDA, MATLAB, and Android. Michael received a PhD in computer science from the University of California, Irvine, as well as an MSc in biomedical engineering and a BSc in electrical engineering from ETH Zurich, Switzerland. When he is not "nerding out" on brains, he can be found on top of a snowy mountain, in front of a live band, or behind the piano.

About the Reviewers

Vipul Sharma is a Software Engineer at a startup in Bangalore, India. He studied engineering in Information Technology at Jabalpur Engineering College (2016). He is an ardent Python fan and loves building projects on computer vision in his spare time. He is an open source enthusiast and hunts for interesting projects to contribute to. He is passionate about learning and strives to better himself as a developer. He blogs about his side projects at http://vipul.xyz. He also publishes his code at http://github.com/vipul-sharma20.

Rahul Kavi works as a research scientist in Silicon Valley. He holds a Master's and PhD degree in computer science from West Virginia University. Rahul has worked on researching and optimizing computer vision applications for a wide variety of platforms and applications. He has also contributed to the machine learning module in OpenCV. He has written computer vision and machine learning software for prize-winning robots for NASA's 2015 and 2016 Centennial Challenges: Sample Return Robot (1st prize). Rahul's research has been published in conference papers and journals.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Customer Feedback

Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1783980281.

If you'd like to join our team of regular reviewers, you can e-mail us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!

To my loving wife, who continues to support me in all my endeavors--no matter how grand, silly, or nerdy they may be.

Table of Contents

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

A Taste of Machine Learning

Getting started with machine learning

Problems that machine learning can solve

Getting started with Python

Getting started with OpenCV

Installation

Getting the latest code for this book

Getting to grips with Python's Anaconda distribution

Installing OpenCV in a conda environment

Verifying the installation

Getting a glimpse of OpenCV's ML module

Summary

Working with Data in OpenCV and Python

Understanding the machine learning workflow

Dealing with data using OpenCV and Python

Starting a new IPython or Jupyter session

Dealing with data using Python's NumPy package

Importing NumPy

Understanding NumPy arrays

Accessing single array elements by indexing

Creating multidimensional arrays

Loading external datasets in Python

Visualizing the data using Matplotlib

Importing Matplotlib

Producing a simple plot

Visualizing data from an external dataset

Dealing with data using OpenCV's TrainData container in C++

Summary

First Steps in Supervised Learning

Understanding supervised learning

Having a look at supervised learning in OpenCV

Measuring model performance with scoring functions

Scoring classifiers using accuracy, precision, and recall

Scoring regressors using mean squared error, explained variance, and R squared

Using classification models to predict class labels

Understanding the k-NN algorithm

Implementing k-NN in OpenCV

Generating the training data

Training the classifier

Predicting the label of a new data point

Using regression models to predict continuous outcomes

Understanding linear regression

Using linear regression to predict Boston housing prices

Loading the dataset

Training the model

Testing the model

Applying Lasso and ridge regression

Classifying iris species using logistic regression

Understanding logistic regression

Loading the training data

Making it a binary classification problem

Inspecting the data

Splitting the data into training and test sets

Training the classifier

Testing the classifier

Summary

Representing Data and Engineering Features

Understanding feature engineering

Preprocessing data

Standardizing features

Normalizing features

Scaling features to a range

Binarizing features

Handling the missing data

Understanding dimensionality reduction

Implementing Principal Component Analysis (PCA) in OpenCV

Implementing Independent Component Analysis (ICA)

Implementing Non-negative Matrix Factorization (NMF)

Representing categorical variables

Representing text features

Representing images

Using color spaces

Encoding images in RGB space

Encoding images in HSV and HLS space

Detecting corners in images

Using the Scale-Invariant Feature Transform (SIFT)

Using Speeded Up Robust Features (SURF)

Summary

Using Decision Trees to Make a Medical Diagnosis

Understanding decision trees

Building our first decision tree

Understanding the task by understanding the data

Preprocessing the data

Constructing the tree

Visualizing a trained decision tree

Investigating the inner workings of a decision tree

Rating the importance of features

Understanding the decision rules

Controlling the complexity of decision trees

Using decision trees to diagnose breast cancer

Loading the dataset

Building the decision tree

Using decision trees for regression

Summary

Detecting Pedestrians with Support Vector Machines

Understanding linear support vector machines

Learning optimal decision boundaries

Implementing our first support vector machine

Generating the dataset

Visualizing the dataset

Preprocessing the dataset

Building the support vector machine

Visualizing the decision boundary

Dealing with nonlinear decision boundaries

Understanding the kernel trick

Knowing our kernels

Implementing nonlinear support vector machines

Detecting pedestrians in the wild

Obtaining the dataset

Taking a glimpse at the histogram of oriented gradients (HOG)

Generating negatives

Implementing the support vector machine

Bootstrapping the model

Detecting pedestrians in a larger image

Further improving the model

Summary

Implementing a Spam Filter with Bayesian Learning

Understanding Bayesian inference

Taking a short detour on probability theory

Understanding Bayes' theorem

Understanding the naive Bayes classifier

Implementing your first Bayesian classifier

Creating a toy dataset

Classifying the data with a normal Bayes classifier

Classifying the data with a naive Bayes classifier

Visualizing conditional probabilities

Classifying emails using the naive Bayes classifier

Loading the dataset

Building a data matrix using Pandas

Preprocessing the data

Training a normal Bayes classifier

Training on the full dataset

Using n-grams to improve the result

Using tf-idf to improve the result

Summary

Discovering Hidden Structures with Unsupervised Learning

Understanding unsupervised learning

Understanding k-means clustering

Implementing our first k-means example

Understanding expectation-maximization

Implementing our own expectation-maximization solution

Knowing the limitations of expectation-maximization

First caveat: No guarantee of finding the global optimum

Second caveat: We must select the number of clusters beforehand

Third caveat: Cluster boundaries are linear

Fourth caveat: k-means is slow for a large number of samples

Compressing color spaces using k-means

Visualizing the true-color palette

Reducing the color palette using k-means

Classifying handwritten digits using k-means

Loading the dataset

Running k-means

Organizing clusters as a hierarchical tree

Understanding hierarchical clustering

Implementing agglomerative hierarchical clustering

Summary

Using Deep Learning to Classify Handwritten Digits

Understanding the McCulloch-Pitts neuron

Understanding the perceptron

Implementing your first perceptron

Generating a toy dataset

Fitting the perceptron to data

Evaluating the perceptron classifier

Applying the perceptron to data that is not linearly separable

Understanding multilayer perceptrons

Understanding gradient descent

Training multi-layer perceptrons with backpropagation

Implementing a multilayer perceptron in OpenCV

Preprocessing the data

Creating an MLP classifier in OpenCV

Customizing the MLP classifier

Training and testing the MLP classifier

Getting acquainted with deep learning

Getting acquainted with Keras

Classifying handwritten digits

Loading the MNIST dataset

Preprocessing the MNIST dataset

Training an MLP using OpenCV

Training a deep neural net using Keras

Preprocessing the MNIST dataset

Creating a convolutional neural network

Fitting the model

Summary

Combining Different Algorithms into an Ensemble

Understanding ensemble methods

Understanding averaging ensembles

Implementing a bagging classifier

Implementing a bagging regressor

Understanding boosting ensembles

Implementing a boosting classifier

Implementing a boosting regressor

Understanding stacking ensembles

Combining decision trees into a random forest

Understanding the shortcomings of decision trees

Implementing our first random forest

Implementing a random forest with scikit-learn

Implementing extremely randomized trees

Using random forests for face recognition

Loading the dataset

Preprocessing the dataset

Training and testing the random forest

Implementing AdaBoost

Implementing AdaBoost in OpenCV

Implementing AdaBoost in scikit-learn

Combining different models into a voting classifier

Understanding different voting schemes

Implementing a voting classifier

Summary

Selecting the Right Model with Hyperparameter Tuning

Evaluating a model

Evaluating a model the wrong way

Evaluating a model in the right way

Selecting the best model

Understanding cross-validation

Manually implementing cross-validation in OpenCV

Using scikit-learn for k-fold cross-validation

Implementing leave-one-out cross-validation

Estimating robustness using bootstrapping

Manually implementing bootstrapping in OpenCV

Assessing the significance of our results

Implementing Student's t-test

Implementing McNemar's test

Tuning hyperparameters with grid search

Implementing a simple grid search

Understanding the value of a validation set

Combining grid search with cross-validation

Combining grid search with nested cross-validation

Scoring models using different evaluation metrics

Choosing the right classification metric

Choosing the right regression metric

Chaining algorithms together to form a pipeline

Implementing pipelines in scikit-learn

Using pipelines in grid searches

Summary

Wrapping Up

Approaching a machine learning problem

Building your own estimator

Writing your own OpenCV-based classifier in C++

Writing your own scikit-learn-based classifier in Python

Where to go from here?

Summary

Preface

I'm glad you're here. It's about time we talked about machine learning.

Machine learning is no longer just a buzzword, it is all around us: from protecting your email, to automatically tagging friends in pictures, to predicting what movies you like. As a subfield of data science, machine learning enables computers to learn through experience: to make predictions about the future using collected data from the past.

And the amount of data to be analyzed is enormous! Current estimates put the daily amount of produced data at 2.5 exabytes (roughly 2.5 billion gigabytes). Can you believe it? This would be enough data to fill up tens of millions of Blu-ray discs, or amount to 90 years of HD video. In order to deal with this vast amount of data, companies such as Google, Amazon, Microsoft, and Facebook have been heavily investing in the development of data science platforms that allow us to benefit from machine learning wherever we go--scaling from your mobile phone application all the way to supercomputers connected through the cloud.

In other words: this is the time to invest in machine learning. And if it is your wish to become a machine learning practitioner, too--then this book is for you!

But fret not: your application does not need to be as large-scale or influential as the above examples in order to benefit from machine learning. Everyone starts small. Thus, the first step of this book is to introduce you to the essential concepts of statistical learning, such as classification and regression, with the help of simple and intuitive examples. If you have already studied machine learning theory in detail, this book will show you how to put your knowledge into practice. Oh, and don't worry if you are completely new to the field of machine learning--all you need is the willingness to learn.

Once we have covered all the basic concepts, we will start exploring various algorithms such as decision trees, support vector machines, and Bayesian networks, and learn how to combine them with other OpenCV functionality. Along the way, you will learn how to understand the task by understanding the data and how to build fully functioning machine learning pipelines.

As the book progresses, so will your machine learning skills, until you are ready to take on today's hottest topic in the field: deep learning. Combined with the trained skill of knowing how to select the right tool for the task, we will make sure you get comfortable with all relevant machine learning fundamentals.

At the end of the book, you will be ready to take on your own machine learning problems, either by building on the existing source code or developing your own algorithm from scratch!

What this book covers

Chapter 1, A Taste of Machine Learning, will gently introduce you to the different subfields of machine learning, and explain how to install OpenCV and other essential tools in the Python Anaconda environment.

Chapter 2, Working with Data in OpenCV and Python, will show you what a typical machine learning workflow looks like, and where data comes in to play. I will explain the difference between training and test data, and show you how to load, store, manipulate, and visualize data with OpenCV and Python.

Chapter 3, First Steps in Supervised Learning, will introduce you to the topic of supervised learning by reviewing some core concepts, such as classification and regression. You will learn how to implement a simple machine learning algorithm in OpenCV, how to make predictions about the data, and how to evaluate your model.

Chapter 4, Representing Data and Engineering Features, will teach you how to get a feel for some common and well-known machine learning datasets and how to extract the interesting stuff from your raw data.

Chapter 5, Using Decision Trees to Make a Medical Diagnosis, will show you how to build decision trees in OpenCV, and use them in a variety of classification and regression problems.

Chapter 6, Detecting Pedestrians with Support Vector Machines, will explain how to build support vector machines in OpenCV, and how to apply them to detect pedestrians in images.

Chapter 7, Implementing a Spam Filter with Bayesian Learning, will introduce you to probability theory, and show you how you can use Bayesian inference to classify emails as spam or not.

Chapter 8, Discovering Hidden Structures with Unsupervised Learning, will talk about unsupervised learning algorithms such as k-means clustering and Expectation-Maximization, and show you how they can be used to extract hidden structures in simple, unlabeled datasets.

Chapter 9, Using Deep Learning to Classify Handwritten Digits, will introduce you to the exciting field of deep learning. Starting with the perceptron and multi-layer perceptrons, you will learn how to build deep neural networks in order to classify handwritten digits from the extensive MNIST database.

Chapter 10, Combining Different Algorithms into an Ensemble, will show you how to effectively combine multiple algorithms into an ensemble in order to overcome the weaknesses of individual learners, resulting in more accurate and reliable predictions.

Chapter 11, Selecting the Right Model with Hyper-Parameter Tuning, will introduce you to the concept of model selection, which allows you to compare different machine learning algorithms in order to select the right tool for the task at hand.

Chapter 12, Wrapping Up, will conclude the book by giving you some useful tips on how to approach future machine learning problems on your own, and where to find information on more advanced topics.

What you need for this book

You will need a computer, Python Anaconda, and enthusiasm. Lots of enthusiasm. Why Python?, you may ask. The answer is simple: it has become the de facto language of data science, thanks to its great number of open source libraries and tools to process and interact with data.

One of these tools is the Python Anaconda distribution, which provides all the scientific computing libraries we could possibly ask for, such as NumPy, SciPy, Matplotlib, scikit-learn, and Pandas. In addition, installing OpenCV is essentially a one-liner. No more flipping switches in ccmake or compiling from scratch! We will talk about how to install Python Anaconda in Chapter 1, A Taste of Machine Learning.

If you have mostly been using OpenCV in combination with C++, that's fine. But, at least for the purpose of this book, I would strongly suggest that you switch to Python. C++ is fine when your task is to develop high-performance code or real-time applications. But when it comes to picking up a new skill, I believe Python to be a fundamentally better choice of language, because you can do more by typing less. Rather than getting annoyed by the syntactic subtleties of C++, or wasting hours trying to convert data from one format into another, Python will help you concentrate on the topic at hand: to become an expert in machine learning.

Who this book is for

Throughout the book, I will assume that you already have a basic knowledge of OpenCV and Python, but that there is always room to learn more.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning. Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "In Python, we can create a list of integers by using the list() command." A block of code is set using the IPython notation, marking user input with In [X], line continuations with ..., and corresponding output with Out[X]:

In [1]: import numpy
...     numpy.__version__
Out[1]: '1.11.3'

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

In [1]: import numpy

... numpy.__version__

Out[1]: '1.11.3'

Any command-line input or output is written as follows:

$ ipython

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Clicking the Next button moves you to the next screen."

Warnings or important notes appear in a box like this.

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book--what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply email [email protected], and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the latest version of the example code files for this book from GitHub: http://github.com/mbeyeler/opencv-machine-learning. All code is released under the MIT software license, so you are free to use, adapt, and share the code as you see fit. There you will also be able to explore the source code by browsing through the different Jupyter notebooks.

If you get stuck or have questions about the source code, you are welcome to post in our web forum: https://groups.google.com/d/forum/machine-learning-for-opencv. Chances are, someone else has already shared a solution to your specific problem.

Alternatively, you can download the original code files from the date of publication by visiting your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

1. Log in or register to our website using your e-mail address and password.

2. Hover the mouse pointer on the SUPPORT tab at the top.

3. Click on Code Downloads & Errata.

4. Enter the name of the book in the Search box.

5. Select the book for which you're looking to download the code files.

6. Choose from the drop-down menu where you purchased this book from.

7. Click on Code Download.

You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows

Zipeg / iZip / UnRarX for Mac

7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Machine-Learning-For-OpenCV. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books--maybe a mistake in the text or the code--we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at [email protected] with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.

A Taste of Machine Learning


So, you have decided to enter the field of machine learning. That's great!

Nowadays, machine learning is all around us--from protecting our email, to automatically tagging our friends in pictures, to predicting what movies we like. As a form of artificial intelligence, machine learning enables computers to learn through experience: to make predictions about the future using collected data from the past. On top of that, computer vision is one of today's most exciting application fields of machine learning, with deep learning and convolutional neural networks driving innovative systems such as self-driving cars and Google's DeepMind.

However, fret not; your application does not need to be as large-scale or world-changing as the previous examples in order to benefit from machine learning. In this chapter, we will talk about why machine learning has become so popular and discuss the kinds of problems that it can solve. We will then introduce the tools that we need in order to solve machine learning problems using OpenCV. Throughout the book, I will assume that you already have a basic knowledge of OpenCV and Python, but that there is always room to learn more.

Are you ready then? Let's go!

Getting started with machine learning

Machine learning has been around for at least 60 years. Growing out of the quest for artificial intelligence, early machine learning systems used hand-coded rules of if...else statements to process data and make decisions. Think of a spam filter whose job is to parse incoming emails and move unwanted messages to a spam folder:

Spam filter

We could come up with a blacklist of words that, whenever they show up in a message, would mark an email as spam. This is a simple example of a hand-coded expert system. (We will build a smarter one in Chapter 7, Implementing a Spam Filter with Bayesian Learning.)
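To make this concrete, here is a minimal sketch of such a blacklist filter in plain Python (the trigger words and the helper name is_spam are invented for illustration):

BLACKLIST = ['lottery', 'jackpot', 'prize']  # hypothetical trigger words

def is_spam(email_text):
    # Flag an email as spam if any blacklisted word appears in it
    text = email_text.lower()
    return any(word in text for word in BLACKLIST)

print(is_spam('You have won the lottery!'))  # True
print(is_spam('Lunch at noon?'))             # False

All the "knowledge" lives in the hand-picked word list--which, as we will see shortly, is exactly why this approach does not scale.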

These expert decision rules can become arbitrarily complicated if we are allowed to combine and nest them in what is known as a decision tree (Chapter 5, Using Decision Trees to Make a Medical Diagnosis). Then, it becomes possible to make more informed decisions that involve a series of decision steps, as shown in the following image:

Decision steps in a simple spam filter

Hand-coding these decision rules is sometimes feasible, but has two major disadvantages:

The logic required to make a decision applies only to a specific task in a single domain. For example, there is no way that we could use this spam filter to tag our friends in a picture. Even if we wanted to change the spam filter to do something slightly different, such as filtering out phishing emails in general, we would have to redesign all the decision rules.

Designing rules by hand requires a deep understanding of the problem. We would have to know exactly which type of emails constitute spam, including all possible exceptions. This is not as easy as it seems; otherwise, we wouldn't often be double-checking our spam folder for important messages that might have been accidentally filtered out. For other domain problems, it is simply not possible to design the rules by hand.

This is where machine learning comes in. Sometimes, tasks cannot be defined well--except maybe by example--and we would like machines to make sense of and solve the tasks by themselves. Other times, it is possible that, hidden among large piles of data, are important relationships and correlations that we as humans might have missed (see Chapter 8, Discovering Hidden Structures with Unsupervised Learning). In these cases, machine learning can often be used to extract these hidden relationships (also known as data mining).

A good example of where man-made expert systems have failed is in detecting faces in images. Silly, isn't it? Today, every smartphone can detect a face in an image. However, 20 years ago, this problem was largely unsolved. The reason for this was that the way humans think about what constitutes a face was not very helpful to machines. As humans, we tend not to think in pixels. If we were asked to detect a face, we would probably just look for the defining features of a face, such as eyes, nose, mouth, and so on. But how would we tell a machine what to look for, when all the machine knows is that images have pixels and pixels have a certain shade of gray? For the longest time, this difference in image representation basically made it impossible for a human to come up with a good set of decision rules that would allow a machine to detect a face in an image. We will talk about different approaches to this problem in Chapter 4, Representing Data and Engineering Features.

However, with the advent of convolutional neural networks and deep learning (Chapter 9, Using Deep Learning to Classify Handwritten Digits), machines have become as successful as us when it comes to recognizing faces. All we had to do was simply present a large collection of images of faces to the machine. From there on, the machine was able to discover the set of characteristics that would allow it to identify a face, without having to approach the problem in the same way as we would do. This is the true power of machine learning.

Problems that machine learning can solve

Most machine learning problems belong to one of the following three main categories:

In supervised learning, each data point is labeled or associated with a category or value of interest (Chapter 3, First Steps in Supervised Learning). An example of a categorical label is assigning an image as either a cat or a dog. An example of a value label is the sale price associated with a used car. The goal of supervised learning is to study many labeled examples like these (called training data) in order to make predictions about future data points (called test data). These predictions come in two flavors: identifying new photos with the correct animal (called a classification problem) or assigning accurate sale prices to other used cars (called a regression problem). Don't worry if this seems a little over your head for now--we will have the entirety of the book to nail down the details.

In unsupervised learning, data points have no labels associated with them (Chapter 8, Discovering Hidden Structures with Unsupervised Learning). Instead, the goal of an unsupervised learning algorithm is to organize the data in some way or to describe its structure. This can mean grouping data points into clusters or finding different ways of looking at complex data so that it appears simpler.

In reinforcement learning, the algorithm gets to choose an action in response to each data point. It is a common approach in robotics, where the set of sensor readings at one point in time is a data point and the algorithm must choose the robot's next action. It is also a natural fit for Internet of Things applications, where the learning algorithm receives a reward signal a short time into the future, indicating how good the decision was. Based on this, the algorithm modifies its strategy in order to achieve the highest reward.

These three main categories are illustrated in the following figure:

Main machine learning categories
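To make the supervised flavor concrete, here is a minimal sketch using OpenCV's k-nearest neighbor classifier, which we will cover properly in Chapter 3, First Steps in Supervised Learning (the toy points and labels are made up for illustration):

import numpy as np
import cv2

# Five labeled training points in 2D: class 0 near the origin, class 1 far away
train_data = np.array([[0, 0], [1, 1], [2, 2], [8, 8], [9, 9]], dtype=np.float32)
labels = np.array([[0], [0], [0], [1], [1]], dtype=np.int32)

# Train the classifier on the labeled examples (the training data)
knn = cv2.ml.KNearest_create()
knn.train(train_data, cv2.ml.ROW_SAMPLE, labels)

# Predict the label of an unseen point (the test data) from its 3 nearest neighbors
new_point = np.array([[7, 7]], dtype=np.float32)
_, results, _, _ = knn.findNearest(new_point, 3)
print(results)  # [[ 1.]] -> the new point is assigned to class 1

This train-then-predict pattern will reappear throughout the book, no matter which algorithm we plug in.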

Getting started with Python

Python has become the common language for many data science and machine learning applications, thanks to its great number of open-source libraries for processes such as data loading, data visualization, statistics, image processing, and natural language processing. One of the main advantages of using Python is the ability to interact directly with the code, using a terminal or other tools such as the Jupyter Notebook, which we'll look at shortly.

If you have mostly been using OpenCV in combination with C++, I would strongly suggest that you switch to Python, at least for the purpose of studying this book. This decision has not been made out of spite! Quite the contrary: I have done my fair share of C/C++ programming--especially in combination with GPU computing via NVIDIA's Compute Unified Device Architecture (CUDA)--and like it a lot. However, I consider Python to be a better choice fundamentally if you want to pick up a new topical skill, because you can do more by typing less. This will help reduce the cognitive load. Rather than getting annoyed by the syntactic subtleties of C++ or wasting hours trying to convert data from one format to another, Python will help you concentrate on the topic at hand: becoming an expert in machine learning.

Getting started with OpenCV

Being the avid user of OpenCV that I believe you are, I probably don't have to convince you about the power of OpenCV.

Built to provide a common infrastructure for computer vision applications, OpenCV has become a comprehensive set of both classic and state-of-the-art computer vision and machine learning algorithms. According to their own documentation, OpenCV has a user community of more than 47,000 people and has been downloaded over seven million times. That's pretty impressive! As an open-source project, it is very easy for researchers, businesses, and government bodies to utilize and modify already available code.

This being said, a number of open-source machine learning libraries have popped up since the recent machine learning boom that provide far more functionality than OpenCV. A prominent example is scikit-learn, which provides a number of state-of-the-art machine learning algorithms as well as a wealth of online tutorials and code snippets. As OpenCV was developed mainly to provide computer vision algorithms, its machine learning functionality is restricted to a single module, called ml. As we will see in this book, OpenCV still provides a number of state-of-the-art algorithms, but sometimes lacks a bit in functionality. In these rare cases, instead of reinventing the wheel, we will simply use scikit-learn for our purposes.

Last but not least, installing OpenCV using the Python Anaconda distribution is essentially a one-liner!

If you are a more advanced user who wants to build real-time applications, OpenCV's algorithms are well-optimized for this task, and Python provides several ways to speed up computations where it is necessary (using, for example, Cython or parallel processing libraries such as joblib or dask).

Installation

Before we get started, let's make sure that we have all the tools and libraries installed that are necessary to create a fully functioning data science environment. After downloading the latest code for this book from GitHub, we are going to install the following software:

Python's Anaconda distribution, based on Python 3.5 or higher

OpenCV 3.1 or higher

Some supporting packages

Don't feel like installing stuff? You can also visit http://beta.mybinder.org/v2/gh/mbeyeler/opencv-machine-learning/master, where you will find all the code for this book in an interactive, executable environment that is 100% free and open source, thanks to the Binder project.

Getting the latest code for this book

You can get the latest code for this book from GitHub, https://github.com/mbeyeler/opencv-machine-learning. You can either download a .zip package (beginners) or clone the repository using git (intermediate users).

Git is a version control system that allows you to track changes in files and collaborate with others on your code. In addition, the web platform GitHub.com makes it easy for me to share my code with you on a public server. As I make improvements to the code, you can easily update your local copy, file bug reports, or suggest code changes.

If you choose to go with git, the first step is to make sure it is installed (https://git-scm.com/downloads).

Then, open a terminal (or command prompt, as it is called in Windows):

On Windows 10, right-click on the Start Menu button, and select Command Prompt.

On Mac OS X, press Cmd + Space to open spotlight search, then type terminal, and hit Enter.

On Ubuntu and friends, press Ctrl + Alt + T. On Red Hat, right-click on the desktop and choose Open Terminal from the menu.

Navigate to a directory where you want the code downloaded, for example:

$ cd Desktop

Then you can grab a local copy of the latest code by typing the following:

$ git clone https://github.com/mbeyeler/opencv-machine-learning.git

This will download the latest code in a folder called opencv-machine-learning.

After a while, the code might change online. In that case, you can update your local copy by running the following command from within the opencv-machine-learning directory:

$ git pull origin master

Getting to grips with Python's Anaconda distribution

Anaconda is a free Python distribution developed by Continuum Analytics that is made for scientific computing. It works across Windows, Linux, and Mac OS X platforms and is free even for commercial use. However, the best thing about it is that it comes with a number of preinstalled packages that are essential for data science, math, and engineering. These packages include the following:

NumPy: A fundamental package for scientific computing in Python, which provides functionality for multidimensional arrays, high-level mathematical functions, and pseudo-random number generators

SciPy: A collection of functions for scientific computing in Python, which provides advanced linear algebra routines, mathematical function optimization, signal processing, and so on

scikit-learn: An open-source machine learning library in Python, which provides useful helper functions and infrastructure that OpenCV lacks

Matplotlib: The primary scientific plotting library in Python, which provides functionality for producing line charts, histograms, scatter plots, and so on

Jupyter Notebook: An interactive environment for running code in a web browser

An installer for our platform of choice (Windows, Mac OS X, or Linux) can be found on the Continuum website, https://www.continuum.io/Downloads. I recommend using the Python 3.6-based distribution, as Python 2 is no longer under active development.

To run the installer, do one of the following:

On Windows, double-click on the .exe file and follow the instructions on the screen.

On Mac OS X, double-click on the .pkg file and follow the instructions on the screen.

On Linux, open a terminal and run the .sh script using bash:

$ bash Anaconda3-4.3.0-Linux-x86_64.sh # Python 3.6 based

$ bash Anaconda2-4.3.0-Linux-x86_64.sh # Python 2.7 based

In addition, Python Anaconda comes with conda--a simple package manager similar to apt-get on Linux. After successful installation, we can install new packages in the terminal using the following command:

$ conda install package_name

Here, package_name is the actual name of the package that we want to install.

Existing packages can be updated using the following command:

$ conda update package_name

We can also search for packages using the following command:

$ anaconda search -t conda package_name

This will bring up a whole list of packages available through individual users. For example, searching for a package named opencv, we get the following hits:

Searching for OpenCV packages provided by different conda users.

This will bring up a long list of users who have OpenCV packages installed, where we can locate users that have our version of the software installed on our own platform. A package called package_name from a user called user_name can then be installed as follows:

$ conda install -c user_name package_name

Finally, conda provides something called an environment, which allows us to manage different versions of Python and/or packages installed in them. This means we could have one environment where we have all packages necessary to run OpenCV 2.4 with Python 2.7, and another where we run OpenCV 3.2 with Python 3.6. In the following section, we will create an environment that contains all the packages needed to run the code in this book.
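For example, the two side-by-side setups just described could be created with commands along these lines (the environment names are invented for illustration):

$ conda create -n py27-cv24 python=2.7

$ conda create -n py36-cv32 python=3.6

Each environment keeps its own Python interpreter and installed packages, so switching between them never breaks the other.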

Installing OpenCV in a conda environment

In a terminal, navigate to the directory where you downloaded the code:

$ cd Desktop/opencv-machine-learning

Before we create a new conda environment, we want to make sure we added the Conda-Forge channel to our list of trusted conda channels:

$ conda config --add channels conda-forge

The Conda-Forge channel is led by an open-source community that provides a wide variety of code recipes and software packages (for more info, see https://conda-forge.github.io). Specifically, it provides an OpenCV package for 64-bit Windows, which will simplify the remaining steps of the installation.

Then run the following command to create a conda environment based on Python 3.5, which will also install all the necessary packages listed in the file requirements.txt in one fell swoop:

$ conda create -n Python3 python=3.5 --file requirements.txt

To activate the environment, type one of the following, depending on your platform:

$ source activate Python3 # on Linux / Mac OS X

$ activate Python3 # on Windows

Once we close the terminal, the session will be deactivated--so we will have to run this last command again the next time we open a terminal. We can also deactivate the environment by hand:

$ source deactivate # on Linux / Mac OS X

$ deactivate # on Windows

And done!

Verifying the installation

It's a good idea to double-check our installation. While our terminal is still open, we fire up IPython, which is an interactive shell to run Python commands:

$ ipython

Now make sure that you are running (at least) Python 3.5 and not Python 2.7. You might see the version number displayed in IPython's welcome message. If not, you can run the following commands:

In [1]: import sys

... print(sys.version)

3.5.3 |Continuum Analytics, Inc.| (default, Feb 22 2017, 21:28:42) [MSC v.1900 64 bit (AMD64)]

Now try to import OpenCV:

In [2]: import cv2

You should get no error messages. Then, try to find out the version number:

In [3]: cv2.__version__

Out[3]: '3.1.0'

Make sure that the Python version reads 3.5 or 3.6, but not 2.7. Additionally, make sure that OpenCV's version number reads at least 3.1.0; otherwise, you will not be able to use some OpenCV functionality later on.
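If you would rather automate this sanity check, a short snippet along the following lines (not part of the book's code bundle) will raise an AssertionError if either requirement is not met:

import sys
import cv2

# Require at least Python 3.5 and OpenCV 3.1
assert sys.version_info >= (3, 5), 'Python 3.5 or higher required'
major, minor = (int(v) for v in cv2.__version__.split('.')[:2])
assert (major, minor) >= (3, 1), 'OpenCV 3.1 or higher required'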

OpenCV 3 is actually called cv2. I know it's confusing. Apparently, the reason for this is that the 2 does not stand for the version number. Instead, it is meant to highlight the difference between the underlying C API (which is denoted by the cv prefix) and the C++ API (which is denoted by the cv2 prefix).

You can then exit the IPython shell by typing exit, or by hitting Ctrl + D and confirming that you want to quit.

Alternatively, you can run the code in a web browser thanks to Jupyter Notebook. If you have never heard of Jupyter Notebooks or played with them before, trust me - you will love them! If you followed the directions as mentioned earlier and installed the Python Anaconda stack, Jupyter is already installed and ready to go. In a terminal, type as follows:

$ jupyter notebook

This will automatically open a browser window, showing a list of files in the current directory. Click on the opencv-machine-learning folder, then on the notebooks folder, and voila! Here you will find all the code for this book, ready for you to explore:

Beginning of the list of Jupyter Notebooks that come with this book

The notebooks are arranged by chapter and section. For the most part, they contain only the relevant code, but no additional information or explanations. These are reserved for those who support our effort by buying this book - so thank you!

Simply click on a notebook of your choice, such as 01.00-A-Taste-of-Machine-Learning.ipynb, and you will be able to run the code yourself by selecting Kernel > Restart & Run All:

Example excerpt of this chapter's Jupyter Notebook

There are a few handy keyboard shortcuts for navigating Jupyter Notebooks. However, the only ones that you need to know about right now are the following:

Click in a cell in order to edit it.

While the cell is selected, hit Ctrl + Enter to execute the code in it.

Alternatively, hit Shift + Enter to execute a cell and select the cell below it.

Hit Esc to exit write mode, then hit A to insert a cell above the currently selected one and B to insert a cell below.

Check out all the keyboard shortcuts by clicking on Help > Keyboard Shortcut, or take a quick tour by clicking on Help > User Interface Tour.

However, I strongly encourage you to follow along with the book by actually typing out the commands yourself, preferably in an IPython shell or an empty Jupyter Notebook. There is no better way to learn how to code than by getting your hands dirty. Even better if you make mistakes--we have all been there. At the end of the day, it's all about learning by doing!

Getting a glimpse of OpenCV's ML module

Starting with OpenCV 3.1, all machine learning-related functions in OpenCV have been grouped into the ml module. This has been the case for the C++ API for quite some time. You can get a glimpse of what's to come by displaying all functions in the ml module:

In [4]: dir(cv2.ml)

Out[4]: ['ANN_MLP_BACKPROP',

'ANN_MLP_GAUSSIAN',

'ANN_MLP_IDENTITY',

'ANN_MLP_NO_INPUT_SCALE',

'ANN_MLP_NO_OUTPUT_SCALE',

...

'__spec__']

If you have installed an older version of OpenCV, the ml module might not be present. For example, the k-nearest neighbor algorithm (which we will talk about in Chapter 3, First Steps in Supervised Learning) used to be called cv2.KNearest() but is now called cv2.ml.KNearest_create(). In order to avoid confusion throughout the book, I therefore recommend using at least OpenCV 3.1.
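As a quick check that the new-style names resolve in your installation, try creating a k-NN object directly:

In [5]: knn = cv2.ml.KNearest_create()

If this call completes without an AttributeError, your ml module is recent enough for the rest of the book.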

Summary

In this chapter, we talked about machine learning at a high abstraction level: what it is, why it is important, and what kinds of problems it can solve. We learned that machine learning problems come in three flavors: supervised learning, unsupervised learning, and reinforcement learning. We talked about the prominence of supervised learning, and that this field can be further divided into two subfields: classification and regression. Classification models allow us to categorize objects into known classes (such as animals into cats and dogs), whereas regression analysis can be used to predict continuous outcomes of target variables (such as the sales price of used cars).

We also learned how to set up a data science environment using the Python Anaconda distribution, how to get the latest code of this book from GitHub, and how to run code in a Jupyter Notebook.