39,59 €
OpenCV 3 is a state-of-the-art computer vision library that allows a great variety of image and video processing operations. Some of the more spectacular and futuristic features such as face recognition or object tracking are easily achievable with OpenCV 3. Learning the basic concepts behind computer vision algorithms, models, and OpenCV's API will enable the development of all sorts of real-world applications, including security and surveillance.
Starting with basic image processing operations, the book will take you through to advanced computer vision concepts. Computer vision is a rapidly evolving science whose applications in the real world are exploding, so this book will appeal to computer vision novices as well as experts of the subject wanting to learn the brand new OpenCV 3.0.0. You will build a theoretical foundation of image processing and video analysis, and progress to the concepts of classification through machine learning, acquiring the technical know-how that will allow you to create and use object detectors and classifiers, and even track objects in movies or video camera feeds. Finally, the journey will end in the world of artificial neural networks, along with the development of a hand-written digits recognition application.
Das E-Book können Sie in Legimi-Apps oder einer beliebigen App lesen, die das folgende Format unterstützen:
Seitenzahl: 286
Veröffentlichungsjahr: 2015
Copyright © 2015 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: September 2015
Production reference: 1240915
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78528-384-0
www.packtpub.com
Authors
Joe Minichino
Joseph Howse
Reviewers
Nandan Banerjee
Tian Cao
Brandon Castellano
Haojian Jin
Adrian Rosebrock
Commissioning Editor
Akram Hussain
Acquisition Editors
Vivek Anantharaman
Prachi Bisht
Content Development Editor
Ritika Singh
Technical Editors
Novina Kewalramani
Shivani Kiran Mistry
Copy Editor
Sonia Cheema
Project Coordinator
Milton Dsouza
Proofreader
Safis Editing
Indexer
Monica Ajmera Mehta
Graphics
Disha Haria
Production Coordinator
Arvindkumar Gupta
Cover Work
Arvindkumar Gupta
Joe Minichino is a computer vision engineer for Hoolux Medical by day and a developer of the NoSQL database LokiJS by night. On weekends, he is a heavy metal singer/songwriter. He is a passionate programmer who is immensely curious about programming languages and technologies and constantly experiments with them. At Hoolux, Joe leads the development of an Android computer vision-based advertising platform for the medical industry.
Born and raised in Varese, Lombardy, Italy, and coming from a humanistic background in philosophy (at Milan's Università Statale), Joe has spent his last 11 years living in Cork, Ireland, which is where he became a computer science graduate at the Cork Institute of Technology.
I am immensely grateful to my partner, Rowena, for always encouraging me, and also my two little daughters for inspiring me. A big thank you to the collaborators and editors of this book, especially Joe Howse, Adrian Roesbrock, Brandon Castellano, the OpenCV community, and the people at Packt Publishing.
Joseph Howse lives in Canada. During the winters, he grows his beard, while his four cats grow their thick coats of fur. He loves combing his cats every day and sometimes, his cats also pull his beard.
He has been writing for Packt Publishing since 2012. His books include OpenCV for Secret Agents, OpenCV Blueprints, Android Application Programming with OpenCV 3, OpenCV Computer Vision with Python, and Python Game Programming by Example.
When he is not writing books or grooming his cats, he provides consulting, training, and software development services through his company, Nummist Media (http://nummist.com).
Nandan Banerjee has a bachelor's degree in computer science and a master's in robotics engineering. He started working with Samsung Electronics right after graduation. He worked for a year at its R&D centre in Bangalore. He also worked in the WPI-CMU team on the Boston Dynamics' robot, Atlas, for the DARPA Robotics Challenge. He is currently working as a robotics software engineer in the technology organization at iRobot Corporation. He is an embedded systems and robotics enthusiast with an inclination toward computer vision and motion planning. He has experience in various languages, including C, C++, Python, Java, and Delphi. He also has a substantial experience in working with ROS, OpenRAVE, OpenCV, PCL, OpenGL, CUDA and the Android SDK.
I would like to thank the author and publisher for coming out with this wonderful book.
Tian Cao is pursuing his PhD in computer science at the University of North Carolina in Chapel Hill, USA, and working on projects related to image analysis, computer vision, and machine learning.
I dedicate this work to my parents and girlfriend.
Brandon Castellano is a student from Canada pursuing an MESc in electrical engineering at the University of Western Ontario, City of London, Canada. He received his BESc in the same subject in 2012. The focus of his research is in parallel processing and GPGPU/FPGA optimization for real-time implementations of image processing algorithms. Brandon also works for Eagle Vision Systems Inc., focusing on the use of real-time image processing for robotics applications.
While he has been using OpenCV and C++ for more than 5 years, he has also been advocating the use of Python frequently in his research, most notably, for its rapid speed of development, allowing low-level interfacing with complex systems. This is evident in his open source projects hosted on GitHub, for example, PySceneDetect, which is mostly written in Python. In addition to image/video processing, he has also worked on implementations of three-dimensional displays as well as the software tools to support the development of such displays.
In addition to posting technical articles and tutorials on his website (http://www.bcastell.com), he participates in a variety of both open and closed source projects and contributes to GitHub under the username Breakthrough (http://www.github.com/Breakthrough). He is an active member of the Super User and Stack Overflow communities (under the name Breakthrough), and can be contacted directly via his website.
I would like to thank all my friends and family for their patience during the past few years (especially my parents, Peter and Lori, and my brother, Mitchell). I could not have accomplished everything without their continued love and support. I can't ever thank everyone enough.
I would also like to extend a special thanks to all of the developers that contribute to open source software libraries, specifically OpenCV, which help bring the development of cutting-edge software technology closer to all the software developers around the world, free of cost. I would also like to thank those people who help write documentation, submit bug reports, and write tutorials/books (especially the author of this book!). Their contributions are vital to the success of any open source project, especially one that is as extensive and complex as OpenCV.
Haojian Jin is a software engineer/researcher at Yahoo! Labs, Sunnyvale, CA. He looks primarily at building new systems of what's possible on commodity mobile devices (or with minimum hardware changes). To create things that don't exist today, he spends large chunks of his time playing with signal processing, computer vision, machine learning, and natural language processing and using them in interesting ways. You can find more about him at http://shift-3.com/
Adrian Rosebrock is an author and blogger at http://www.pyimagesearch.com/. He holds a PhD in computer science from the University of Maryland, Baltimore County, USA, with a focus on computer vision and machine learning.
He has consulted for the National Cancer Institute to develop methods that automatically predict breast cancer risk factors using breast histology images. He has also authored a book, Practical Python and OpenCV (http://pyimg.co/x7ed5), on the utilization of Python and OpenCV to build real-world computer vision applications.
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.
OpenCV 3 is a state-of-the-art computer vision library that is used for a variety of image and video processing operations. Some of the more spectacular and futuristic features, such as face recognition or object tracking, are easily achievable with OpenCV 3. Learning the basic concepts behind computer vision algorithms, models, and OpenCV's API will enable the development of all sorts of real-world applications, including security and surveillance tools.
Starting with basic image processing operations, this book will take you through a journey that explores advanced computer vision concepts. Computer vision is a rapidly evolving science whose applications in the real world are exploding, so this book will appeal to computer vision novices as well as experts of the subject who want to learn about the brand new OpenCV 3.0.0.
Chapter 1, Setting Up OpenCV, explains how to set up OpenCV 3 with Python on different platforms. It will also troubleshoot common problems.
Chapter 2, Handling Files, Cameras, and GUIs, introduces OpenCV's I/O functionalities. It will also discuss the concept of a project and the beginnings of an object-oriented design for this project.
Chapter 3, Processing Images with OpenCV 3, presents some techniques required to alter images, such as detecting skin tone in an image, sharpening an image, marking contours of subjects, and detecting crosswalks using a line segment detector.
Chapter 4, Depth Estimation and Segmentation, shows you how to use data from a depth camera to identify foreground and background regions, such that we can limit an effect to only the foreground or background.
Chapter 5, Detecting and Recognizing Faces, introduces some of OpenCV's face detection functionalities, along with the data files that define particular types of trackable objects.
Chapter 6, Retrieving Images and Searching Using Image Descriptors, shows how to detect the features of an image with the help of OpenCV and make use of them to match and search for images.
Chapter 7, Detecting and Recognizing Objects, introduces the concept of detecting and recognizing objects, which is one of the most common challenges in computer vision.
Chapter 8, Tracking Objects, explores the vast topic of object tracking, which is the process of locating a moving object in a movie or video feed with the help of a camera.
Chapter 9, Neural Networks with OpenCV – an Introduction, introduces you to Artificial Neural Networks in OpenCV and illustrates their usage in a real-life application.
You simply need a relatively recent computer, as the first chapter will guide you through the installation of all the necessary software. A webcam is highly recommended, but not necessary.
This book is aimed at programmers with working knowledge of Python as well as people who want to explore the topic of computer vision using the OpenCV library. No previous experience of computer vision or OpenCV is required. Programming experience is recommended.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.
You picked up this book so you may already have an idea of what OpenCV is. Maybe, you heard of Sci-Fi-sounding features, such as face detection, and got intrigued. If this is the case, you've made the perfect choice. OpenCV stands for Open Source Computer Vision. It is a free computer vision library that allows you to manipulate images and videos to accomplish a variety of tasks from displaying the feed of a webcam to potentially teaching a robot to recognize real-life objects.
In this book, you will learn to leverage the immense potential of OpenCV with the Python programming language. Python is an elegant language with a relatively shallow learning curve and very powerful features. This chapter is a quick guide to setting up Python 2.7, OpenCV, and other related libraries. After setup, we also look at OpenCV's Python sample scripts and documentation.
If you wish to skip the installation process and jump right into action, you can download the virtual machine (VM) I've made available at http://techfort.github.io/pycv/.
This file is compatible with VirtualBox, a free-to-use virtualization application that lets you build and run VMs. The VM I've built is based on Ubuntu Linux 14.04 and has all the necessary software installed so that you can start coding right away.
This VM requires at least 2 GB of RAM to run smoothly, so make sure that you allocate at least 2 (but, ideally, more than 4) GB of RAM to the VM, which means that your host machine will need at least 6 GB of RAM to sustain it.
The following related libraries are covered in this chapter:
For this book's purposes, OpenNI and SensorKinect can be considered optional. They are used throughout Chapter 4, Depth Estimation and Segmentation, but are not used in the other chapters or appendices.
This book focuses on OpenCV 3, the new major release of the OpenCV library. All additional information about OpenCV is available at http://opencv.org, and its documentation is available at http://docs.opencv.org/master.
We are free to choose various setup tools, depending on our operating system and how much configuration we want to do. Let's take an overview of the tools for Windows, Mac, Ubuntu, and other Unix-like systems.
Windows does not come with Python preinstalled. However, installation wizards are available for precompiled Python, NumPy, SciPy, and OpenCV. Alternatively, we can build from a source. OpenCV's build system uses CMake for configuration and either Visual Studio or MinGW for compilation.
If we want support for depth cameras, including Kinect, we should first install OpenNI and SensorKinect, which are available as precompiled binaries with installation wizards. Then, we must build OpenCV from a source.
The precompiled version of OpenCV does not offer support for depth cameras.
On Windows, OpenCV 2 offers better support for 32-bit Python than 64-bit Python; however, with the majority of computers sold today being 64-bit systems, our instructions will refer to 64-bit. All installers have 32-bit versions available from the same site as the 64-bit.
Some of the following steps refer to editing the system's PATH variable. This task can be done in the Environment Variables window of Control Panel.
You can choose to install Python and its related libraries separately if you prefer; however, there are Python distributions that come with installers that will set up the entire SciPy stack (which includes Python and NumPy), which make it very trivial to set up the development environment.
One such distribution is Anaconda Python (downloadable at http://09c8d0b2229f813c1b93c95ac804525aac4b6dba79b00b39d1d3.r79.cf1.rackcdn.com/Anaconda-2.1.0Windows-x86_64.exe). Once the installer is downloaded, run it and remember to add the path to the Anaconda installation to your PATH variable following the preceding procedure.
Here are the steps to set up Python7, NumPy, SciPy, and OpenCV:
Windows does not come with any compilers or CMake. We need to install them. If we want support for depth cameras, including Kinect, we also need to install OpenNI and SensorKinect.
Let's assume that we have already installed 32-bit Python 2.7, NumPy, and SciPy either from binaries (as described previously) or from a source. Now, we can proceed with installing compilers and CMake, optionally installing OpenNI and SensorKinect, and then building OpenCV from the source:
Note that you will need to sign in with your Microsoft account and if you don't have one, you can create one on the spot. Install the software and reboot after installation is complete.
For MinGW, get the installer from http://sourceforge.net/projects/mingw/files/Installer/mingw-get-setup.exe/download and http://sourceforge.net/projects/mingw/files/OldFiles/mingw-get-inst/mingw-get-inst-20120426/mingw-get-inst-20120426.exe/download. When running the installer, make sure that the destination path does not contain spaces and that the optional C++ compiler is included. Edit the system's PATH variable and append ;C:\MinGW\bin (assuming that MinGW is installed to the default location). Reboot the system.
Optionally, download and install OpenNI 1.5.4.0 from the links provided in the GitHub homepage of OpenNI at https://github.com/OpenNI/OpenNI.You can download and install SensorKinect 0.93 from https://github.com/avin2/SensorKinect/blob/unstable/Bin/SensorKinect093-Bin-Win32-v5.1.2.1.msi?raw=true (32-bit). Alternatively, for 64-bit Python, download the setup from https://github.com/avin2/SensorKinect/blob/unstable/Bin/SensorKinect093-Bin-Win64-v5.1.2.1.msi?raw=true (64-bit). Note that this repository has been inactive for more than three years.Download the self-extracting ZIP of OpenCV 3.0.0 from https://github.com/Itseez/opencv. Run the self-extracting ZIP, and when prompted, enter any destination folder, which we will refer to as <unzip_destination>. A subfolder, <unzip_destination>\opencv, is then created.Open Command Prompt and make another folder where our build will go using this command:Change the directory of the build folder:
If the build fails to complete or you run into problems later, try installing missing dependencies (often available as prebuilt binaries), and then rebuild OpenCV from this step.
You have the option of selecting/deselecting build options (according to the libraries you have installed on your machine) and click on Configure again, until you get a clear background (white).
Alternatively, for MinGW, run the following command:
If OpenNI is not installed, omit -D:WITH_OPENNI=ON. (In this case, depth cameras will not be supported.) If OpenNI and SensorKinect are installed to nondefault locations, modify the command to include -D:OPENNI_LIB_DIR=<openni_install_destination>\Lib -D:OPENNI_INCLUDE_DIR=<openni_install_destination>\Include -D:OPENNI_PRIME_SENSOR_MODULE_BIN_DIR=<sensorkinect_install_destination>\Sensor\Bin.
Alternatively, for MinGW, run this command:
Some versions of Mac used to come with a version of Python 2.7 preinstalled that were customized by Apple for a system's internal needs. However, this has changed and the standard version of OS X ships with a standard installation of Python. On python.org, you can also find a universal binary that is compatible with both the new Intel systems and the legacy PowerPC.
You can obtain this installer at https://www.python.org/downloads/release/python-279/ (refer to the Mac OS X 32-bit PPC or the Mac OS X 64-bit Intel links). Installing Python from the downloaded .dmg file will simply overwrite your current system installation of Python.
For Mac, there are several possible approaches for obtaining standard Python 2.7, NumPy, SciPy, and OpenCV. All approaches ultimately require OpenCV to be compiled from a source using Xcode Developer Tools. However, depending on the approach, this task is automated for us in various ways by third-party tools. We will look at these kinds of approaches using MacPorts or Homebrew. These tools can potentially do everything that CMake can, plus they help us resolve dependencies and separate our development libraries from system libraries.
I recommend MacPorts, especially if you want to compile OpenCV with depth camera support via OpenNI and SensorKinect. Relevant patches and build scripts, including some that I maintain, are ready-made for MacPorts. By contrast, Homebrew does not currently provide a ready-made solution to compile OpenCV with depth camera support.
Before proceeding, let's make sure that the Xcode Developer Tools are properly set up:
Alternatively, you can install Xcode command-line tools by running the following command (in the terminal):
Now, we have the required compilers for any approach.
We can use the MacPorts package manager to help us set up Python 2.7, NumPy, and OpenCV. MacPorts provides terminal commands that automate the process of downloading, compiling, and installing various pieces of open source software (OSS). MacPorts also installs dependencies as needed. For each piece of software, the dependencies and build recipes are defined in a configuration file called a Portfile. A MacPorts repository is a collection of Portfiles.
Starting from a system where Xcode and its command-line tools are already set up, the following steps will give us an OpenCV installation via MacPorts:
Save the file. Now, MacPorts knows that it has to search for Portfiles in my online repository first, and then the default online repository.
Open the terminal and run the following command to update MacPorts:When prompted, enter your password.
Now (if we are using my repository), run the following command to install OpenCV with Python 2.7 bindings and support for depth cameras, including Kinect:Alternatively (with or without my repository), run the following command to install OpenCV with Python 2.7 bindings and support for depth cameras, excluding Kinect:
Dependencies, including Python 2.7, NumPy, OpenNI, and (in the first example) SensorKinect, are automatically installed as well.
By adding +python27