Learning OpenCV 3 Application Development E-Book

Samyak Datta

Publisher: Packt Publishing
Category: Technical literature
Language: English

Description

Computer vision and machine learning concepts are frequently used in practical computer vision-based projects. If you’re a novice, this book provides the steps to build and deploy an end-to-end application in the domain of computer vision using OpenCV/C++.
At the outset, we explain how to install OpenCV and demonstrate how to run some simple programs. You will start with images (the building blocks of image processing applications), and see how they are stored and processed by OpenCV. You’ll get comfortable with OpenCV-specific jargon (Mat, Point, Scalar, and more), and get to know how to traverse images and perform basic pixel-wise operations.
Building upon this, we introduce slightly more advanced image processing concepts such as filtering, thresholding, and edge detection. In the latter parts, the book touches upon more complex and ubiquitous concepts such as face detection (using Haar cascade classifiers), interest point detection algorithms, and feature descriptors. You will now begin to appreciate the true power of the library in how it reduces mathematically non-trivial algorithms to a single line of code!
The concluding sections touch upon OpenCV’s Machine Learning module. You will witness not only how OpenCV helps you pre-process and extract features from images that are relevant to the problems you are trying to solve, but also how to use Machine Learning algorithms that work on these features to make intelligent predictions from visual data!

Details

You can read this e-book in the Legimi apps or in any app that supports the following formats:

EPUB

MOBI

Page count: 450

Year of publication: 2016

Learning OpenCV 3 Application Development

Credits

About the Author

About the Reviewer

www.PacktPub.com

Why subscribe?

Preface

What this book covers

Who this book is for

What you need for this book

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

1. Laying the Foundation

Digital image basics

Pixel intensities

Color depth and color spaces

Color channels

Introduction to the Mat class

Exploring the Mat class: loading images

Exploring the Mat class - declaring Mat objects

Spatial dimensions of an image

Color space or color depth

Color channels

Image size

Default initialization value

Digging inside Mat objects

Traversing Mat objects

Continuity of the Mat data matrix

Image traversals

Image enhancement

Lookup tables

Linear transformations

Identity transformation

Negative transformation

Logarithmic transformations

Log transformation

Exponential or inverse-log transformation

Summary

2. Image Filtering

Neighborhood of a pixel

Image averaging

Image filters

Image averaging in OpenCV

Blurring an image in OpenCV

Gaussian smoothing

Gaussian function and Gaussian filtering

Gaussian filtering in OpenCV

Using your own filters in OpenCV

Image noise

Vignetting

Implementing Vignetting in OpenCV

Summary

3. Image Thresholding

Binary images

Image thresholding basics

Image thresholding in OpenCV

Types of simple image thresholding

Binary threshold

Inverted binary threshold

Truncate

Threshold-to-zero

Inverted threshold-to-zero

Adaptive thresholding

Morphological operations

Erosion and dilation

Erosion and dilation in OpenCV

Summary

4. Image Histograms

The basics of histograms

Histograms in OpenCV

Plotting histograms in OpenCV

Color histograms in OpenCV

Multidimensional histograms in OpenCV

Summary

5. Image Derivatives and Edge Detection

Image derivatives

Image derivatives in two dimensions

Visualizing image derivatives with OpenCV

The Sobel derivative filter

From derivatives to edges

The Sobel detector - a basic framework for edge detection

The Canny edge detector

Image noise and edge detection

Laplacian - yet another edge detection technique

Blur detection using OpenCV

Summary

6. Face Detection Using OpenCV

Image classification systems

Face detection

Haar features

Integral image

Integral image in OpenCV

AdaBoost learning

Cascaded classifiers

Face detection in OpenCV

Controlling the quality of detected faces

Gender classification

Working with real datasets

Summary

7. Affine Transformations and Face Alignment

Exploring the dataset

Running face detection on the dataset

Face alignment - the first step in facial analysis

Rotating faces

Image cropping -- basics

Image cropping for face alignment

Face alignment - the complete pipeline

Summary

8. Feature Descriptors in OpenCV

Introduction to the local binary pattern

A basic implementation of LBP

Variants of LBP

What does LBP capture?

Applying LBP to aligned facial images

A complete implementation of LBP

Putting it all together - the main() function

Summary

9. Machine Learning with OpenCV

What is machine learning

Supervised and unsupervised learning

Revisiting the image classification framework

k-means clustering - the basics

k-means clustering - the algorithm

k-means clustering in OpenCV

k-nearest neighbors classifier - introduction

k-nearest neighbors classifier - algorithm

What k to use

k-nearest neighbors classifier in OpenCV

Some problems with kNN

Some enhancements to kNN

Support vector machines (SVMs) - introduction

Intuition into the workings of SVMs

Non-linear SVMs

SVM in OpenCV

Using an SVM as a gender classifier

Overfitting

Cross-validation

Common evaluation metrics

The P-R curve

Some qualitative results

Summary

A. Command-line Arguments in C++

Introduction to command-line arguments

Parsing command-line arguments

Summary

Learning OpenCV 3 Application Development

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: December 2016

Production reference: 1131216

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78439-145-4

www.packtpub.com

Credits

Author: Samyak Datta

Copy Editor: Safis Editing

Reviewer: Nikolaus Gradwohl

Project Coordinator: Sheejal Shah

Commissioning Editor: Kunal Parikh

Proofreader: Safis Editing

Acquisition Editor: Sonali Vernekar

Indexer: Rekha Nair

Content Development Editor: Nikhil Borkar

Graphics: Abhinash Sahu

Technical Editor: Hussain Kanchwala

Production Coordinator: Shraddha Falebhai

About the Author

Samyak Datta has a bachelor's and a master's degree in Computer Science from the Indian Institute of Technology, Roorkee. He is a computer vision and machine learning enthusiast. His first contact with OpenCV was in 2013 when he was working on his master's thesis, and since then, there has been no looking back. He has contributed to OpenCV's GitHub repository. Over the course of his undergraduate and master's degrees, Samyak has had the opportunity to engage with both the industry and research. He worked with Google India and Media.net (Directi) as a software engineering intern, where he was involved with projects ranging from machine learning and natural language processing to computer vision. As of 2016, he is working at the Center for Visual Information Technology (CVIT) at the Indian Institute of Information Technology, Hyderabad.

About the Reviewer

Nikolaus Gradwohl was born 1976 in Vienna, Austria and always wanted to become an inventor like Gyro Gearloose. When he got his first Atari, he figured out that being a computer programmer is the closest he could get to that dream. He has written programs for nearly anything that can be programmed, ranging from 8-bit microcontrollers to mainframes, for a living. In his free time, he collects programming languages and operating systems.

He is the author of the book Processing 2: Creative Coding Hotshot. You can see some of his work on his blog at http://www.local-guru.net/.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Preface

The mission of this book is to explain to a novice the steps involved in building and deploying an end-to-end application in the domain of computer vision using OpenCV/C++. The book starts with instructions for installing the library and ends with the reader having developed an application that does something tangible and useful in computer vision/machine learning. All concepts included in the text have been selected because of their frequent use and relevance in practical computer vision-based projects. To avoid being too theoretical, the description of concepts is accompanied by the development of some end-to-end applications. The projects are explained and developed step by step over the course of the text, as and when the relevant theoretical concepts are introduced. This will help readers grasp the practical applications of the concept under study without losing sight of the big picture.

What this book covers

Chapter 1, Laying the Foundation, will lay the foundation for the basic data structures and containers in OpenCV, for example, the Mat, Rect, Point, and Scalar objects. It also explains the need for each of them as separate data types by offering an insight into the possible use cases. A major portion of the chapter focuses on how OpenCV uses the Mat object to store images so that your code can access them, the basics of scanning images using the Mat object, and the concept of R, G, and B color channels. After the readers are comfortable with working with images, we introduce the concept of pixel-based image transformations and show examples of simple code that can achieve contrast/brightness modification using simple Mat object traversals.

Chapter 2, Image Filtering, progresses to slightly advanced pixel traversal algorithms. We introduce the concepts of masks and image filtering and talk about the standard filters such as box filter, median filter, and Gaussian blur in OpenCV. Also, we will mathematically explain the concepts of convolution and correlation and show how a generic filter can be written using OpenCV’s filter2D method. As a practical use case, we implement a popular and interesting image manipulation technique called the Vignette filter.

Chapter 3, Image Thresholding, talks about image thresholding, which is yet another process that frequently comes up in the solution to most computer vision problems. In this chapter, the readers are made to understand that the algorithms that fall into this domain basically involve operations that produce a binary image from a grayscale one. The different variants, such as binary and inverted binary, that are available as part of the OpenCV imgproc module are explained in detail. The chapter will also briefly touch upon the concepts of erosion and dilation (morphological transformations) as steps that take place after thresholding.

Chapter 4, Image Histograms, talks about aggregating pixel values by discussing image histograms and histogram-related operations on images such as calculating and displaying histograms, the concept of color and multi-dimensional histograms.

Chapter 5, Image Derivatives and Edge Detection, focuses on other types of information that can be extracted from the pixels in our image. The readers are introduced to the concept of image derivatives, and the discussion then moves on to the application of image derivatives in edge detection algorithms. A demonstration of the edge detection methods in OpenCV, for example, Sobel, Canny, and Laplacian, is presented. As a small practical use case of the Laplacian, we demonstrate how it can be used to detect the blurriness of images.

Chapter 6, Face Detection Using OpenCV, talks about one of the most popular and ubiquitous problems in the computer vision community--detecting objects, specifically faces in images. The main motivation of this chapter is to take a look at the Haar-cascade classifier algorithm, which is used to detect faces and then go on to show how the complex algorithm can be run using a single line of OpenCV code. At this point, we introduce our programming project of predicting gender of a person from a facial image. The reader is made to realize that any system involving analysis of facial images (be it face recognition or gender, age or ethnicity prediction) requires an accurate and robust face detection as its first step.

Chapter 7, Affine Transformations and Face Alignment, serves as a natural successor to the preceding chapter. After detecting faces in images, the readers will be taught about the post-processing steps that are undertaken: geometric (affine) transformations such as resizing, cropping, and rotating images. The need for such transformations is explained. All images in the gender classification project have to go through this step before feature extraction.

Chapter 8, Feature Descriptors in OpenCV, introduces the notion of feature descriptors. The readers realize that in order to infer meaningful information from images, one must construct appropriate features from the pixel values. In other words, images have to be converted into feature vectors before feeding them into a machine learning algorithm. To that end, the chapter also goes on to introduce the concept of local binary pattern as a feature and also talks about the details of implementation. We demonstrate the process of calculating the LBP features from the cropped and aligned facial images that we obtained in the previous chapters.

Chapter 9, Machine Learning with OpenCV, teaches the readers different classifiers and machine learning algorithms available as part of the OpenCV library and how they can be used to make meaningful predictions from data. The readers will witness the learning algorithms accept the feature vectors that they had computed in the previous chapters as input and make intelligent predictions as outputs. All the steps involved with using a learning algorithm--training, testing, cross validation, selection of hyper-parameters and evaluation of results--will be covered in detail.

Appendix, Command-line Arguments in C++, talks about command-line arguments in C++ and how to make the best possible use of them while writing OpenCV programs.

Who this book is for

Learning OpenCV 3.0 Application Development is the perfect book for anyone who wants to dive into the exciting world of image processing and computer vision. This book is aimed at programmers having a working knowledge of C++. Prior knowledge of OpenCV is not required. The book also doesn't assume any prior understanding of computer vision/machine learning algorithms, as all concepts are explained from basic principles, although familiarity with these concepts would help. By the end of this book, readers will have first-hand experience of building and deploying computer vision applications using OpenCV and C++ by following the detailed, step-by-step tutorials. They will begin to appreciate the power of OpenCV in simplifying seemingly challenging and intensive tasks such as contrast enhancement in images, edge detection, face detection, and classification.

What you need for this book

The book assumes a basic, working knowledge of C++. However, prior knowledge of computer vision, image processing, or machine learning is not assumed. You will need an OpenCV 3.1 installation on your system to run the sample programs spread across the various chapters in this book. The setup and installation details have already been shared.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "We can include other contexts through the use of the include directive."

A block of code is set as follows:

typedef Size_<int> Size2i;
typedef Size2i Size;

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Clicking the Next button moves you to the next screen."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

1. Log in or register to our website using your e-mail address and password.
2. Hover the mouse pointer on the SUPPORT tab at the top.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box.
5. Select the book for which you're looking to download the code files.
6. Choose from the drop-down menu where you purchased this book from.
7. Click on Code Download.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows

Zipeg / iZip / UnRarX for Mac

7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Learning-OpenCV-3-Application-Development. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/LearningOpenCV3ApplicationDevelopment_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at [email protected] with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.

Introduction to the Mat class

We have discussed the formation and representation of digital images and the concept of color spaces and color channels at length. Having laid a firm foundation on the basic principles of image processing, we now turn to the OpenCV library and take a look at what it has to offer us in terms of storing digital images! Just as pixels are the building blocks of digital images, the Mat class is the cornerstone of OpenCV programming. Any instantiation of the Mat class is called a Mat object.

Before we embark on a description of the Mat object, I would urge you to think, keeping in mind whatever we have discussed regarding the structure and representation of digital images, about how you would go about designing a data structure in C++ that could store images for efficient processing. One obvious solution that comes to mind is using a two-dimensional array or a vector of vectors. What about the data types? An unsigned char should be sufficient since we would rarely need to store values beyond the range of 0 to 255. How would you go about implementing channels? Perhaps we could have an array of two-dimensional grids to represent color images (getting a little complicated now, isn't it?).
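To make the thought experiment above concrete, here is a minimal sketch of that naive design in plain C++: one two-dimensional grid of unsigned 8-bit values per color channel. The names NaiveImage, Channel, and at are hypothetical and chosen only for illustration; this is not how OpenCV stores images.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names throughout: this is the naive design, not OpenCV's Mat.
// One channel is a rows x cols grid of values in the range 0 to 255.
using Channel = std::vector<std::vector<std::uint8_t>>;

struct NaiveImage {
    std::size_t rows;
    std::size_t cols;
    std::vector<Channel> channels;  // an array of 2D grids, one per color channel

    NaiveImage(std::size_t r, std::size_t c, std::size_t numChannels,
               std::uint8_t fill = 0)
        : rows(r), cols(c),
          channels(numChannels, Channel(r, std::vector<std::uint8_t>(c, fill))) {}

    // Read/write access to one channel value of the pixel at (row, col).
    std::uint8_t& at(std::size_t ch, std::size_t row, std::size_t col) {
        assert(ch < channels.size() && row < rows && col < cols);
        return channels[ch][row][col];
    }
};
```

For example, NaiveImage img(480, 640, 3) would hold a three-channel 640 x 480 image, and img.at(2, 10, 20) = 255 would set one channel of one pixel. Every row is a separate allocation and the channels live in separate grids, which is part of why this approach gets complicated quickly.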

The Mat object is capable of doing all of the preceding things that were described (and much more) in an efficient manner! It lets you handle multiple color channels and different color spaces without you (the programmer) having to worry about the internal implementation details. Mat also manages its own memory, allocating and releasing the pixel buffer automatically, which lifts the burden of memory management from the hands of the user. So, all you've got to worry about is building your cool application and you can trust Mat (and OpenCV) to take care of the rest!

According to OpenCV's official documentation:

"The class Mat represents an n-dimensional dense numerical single-channel or multi-channel array."

We have already witnessed that digital images are two-dimensional arrays of pixels, where each pixel is associated with a numerical value from a predefined color space. This makes the Mat object a very obvious choice for representing images inside the world of OpenCV. And indeed, it does enable you to load, process, and store images and image data in your program. Most of the computer vision applications that you will be developing (as part of this book or otherwise) would involve abundant usage of images. These images would typically enter your system from the outside (user input), your application would apply several image processing algorithms on them, and finally produce an output, which may be written to disk. All these operations involve storing images inside your program and passing them around different modules of your code. This is precisely where the Mat object lends its utility.

Mat objects have two parts: a header and the actual matrix of pixel values. The header contains information, such as the size of the matrix, the memory address where it is stored (a pointer), and other pieces of information pertaining to the internal workings of Mat and OpenCV. The other part of the Mat object is where the actual pixel values are stored. The header for every Mat object is constant in size, but the size of the matrix of pixel values depends on the size of your image.
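The header/data split can be pictured with a toy structure in plain C++. This is only a sketch under stated assumptions: ToyMat and its members are hypothetical names, the buffer layout (rows stored one after another, channel values interleaved per pixel) is a simplification, and none of this is OpenCV's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Toy illustration only: hypothetical names, not OpenCV's real Mat layout.
struct ToyMat {
    // "Header": a few fixed-size fields, the same size for every image.
    int rows = 0;
    int cols = 0;
    int channels = 0;
    std::shared_ptr<std::vector<std::uint8_t>> data;  // pointer to the pixel buffer

    ToyMat(int r, int c, int ch)
        : rows(r), cols(c), channels(ch),
          data(std::make_shared<std::vector<std::uint8_t>>(
              static_cast<std::size_t>(r) * c * ch, std::uint8_t{0})) {}

    // Number of bytes in the pixel buffer; this grows with the image size.
    std::size_t dataBytes() const { return data->size(); }

    // Rows are stored one after another; channel values are interleaved per pixel.
    std::uint8_t& at(int row, int col, int ch) {
        assert(row >= 0 && row < rows && col >= 0 && col < cols &&
               ch >= 0 && ch < channels);
        return (*data)[(static_cast<std::size_t>(row) * cols + col) * channels + ch];
    }
};
```

In this sketch, sizeof(ToyMat) is the same for a 2 x 2 image and a 100 x 50 one, while dataBytes() differs. Copying a ToyMat copies only the header fields and shares the buffer; OpenCV's Mat behaves similarly on assignment and copy construction, with clone() available when an independent copy of the pixels is needed.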

As this book progresses, you will realize that a Mat object is not always synonymous with an image. You will work with certain instantiations of the Mat class that do not represent a meaningful image as such. In such cases, it is more convenient to think of the Mat object as a data structure that helps us to operate on (possibly multidimensional) numerical arrays (as the official documentation suggests). But irrespective of whether we use Mat as an image store or a generic multidimensional array, you will soon realize the immense power that the creators of the library have placed in your hands through the Mat class. As mentioned earlier, its scope goes beyond merely storing images. It can act as a data structure and can provide users with tools for the most common linear algebra routines: matrix multiplication, inverse, eigenvalues, PCA, norms, SVD, and even DFT and DCT, to name a few.

Traversing Mat objects

So far, you have learnt in detail about the Mat class, what it represents, how to initialize instances of the Mat class, and the different ways to create Mat objects. Along the way, we have also looked at some other OpenCV classes, such as Size, Scalar, and Rect