If you are a Java and Android developer looking to enhance your skills by learning the latest features of OpenCV Android application programming, then this book is for you.
Number of pages: 212
Year of publication: 2015
Copyright © 2015 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: July 2015
Production reference: 1230715
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78398-820-4
www.packtpub.com
Authors
Salil Kapur
Nisarg Thakkar
Reviewers
Radhakrishna Dasari
Noritsuna Imamura
Ashwin Kachhara
André Moreira de Souza
Commissioning Editor
Kartikey Pandey
Acquisition Editors
Harsha Bharwani
Aditya Nair
Content Development Editors
Ruchita Bhansali
Kirti Patil
Technical Editor
Ankur Ghiye
Copy Editor
Rashmi Sawant
Project Coordinator
Nidhi Joshi
Proofreader
Safis Editing
Indexer
Hemangini Bari
Graphics
Sheetal Aute
Production Coordinator
Nitesh Thakur
Cover Work
Nitesh Thakur
Salil Kapur is a software engineer at Microsoft. He earned his bachelor's degree in computer science from Birla Institute of Technology and Science, Pilani.
He has a passion for programming and is always excited to try out new technologies. His interests lie in computer vision, networks, and developing scalable systems. He is an open source enthusiast and has contributed to libraries such as SimpleCV, BinPy, and Krita.
When he is not working, he spends most of his time on Quora and Hacker News. He loves to play basketball and ultimate frisbee. He can be reached at <[email protected]>.
Nisarg Thakkar is a software developer and a tech enthusiast in general. He primarily programs in C++ and Java. He has extensive experience in Android app development and computer vision application development using OpenCV. He has also contributed to an OpenCV project and works on its development during his free time. His interests lie in stereo vision, virtual reality, and exploiting the Android platform for noncommercial projects that benefit the people who cannot afford the conventional solutions.
He was also the subcoordinator of the Mobile App Club at his university and the cofounder of two start-ups at his college, which he started with a group of friends. One of these start-ups has developed Android apps for hotels, while the other is currently working on building a better contact manager app for the Android platform.
Nisarg Thakkar is currently studying at BITS Pilani, K. K. Birla Goa campus, where he will be graduating with a degree in engineering (hons.) in computer science in May 2016. He can be reached at <[email protected]>.
Radhakrishna Dasari is a computer science PhD student at the State University of New York in Buffalo. He works at the Ubiquitous Multimedia Lab, whose director is Dr. Chang Wen Chen. His research spans computer vision and machine learning with an emphasis on multimedia applications. He intends to pursue a research career in computer vision and loves to teach.
Noritsuna Imamura is a specialist in embedded Linux/Android-based computer vision. He is the main person behind SIProp (http://siprop.org/).
His main works are as follows:
He can be reached at <[email protected]>.
Ashwin Kachhara graduated from IIT Bombay in June 2015 and is currently pursuing his master's at Georgia Tech, Atlanta. Over the past 5 years, he has been developing software for different platforms, including AVR, Android, Microsoft Kinect, and the Oculus Rift. His professional interests span Mixed Reality, Wearable Technologies, graphics, and computer vision. He has previously worked as an intern at the SONY Head Mounted Display (HMD) division in Tokyo and at the National University of Singapore's Interactive and Digital Media Institute (IDMI). He is a virtual reality enthusiast and enjoys rollerblading and karaoke when he is not writing awesome code.
André Moreira de Souza is a PhD candidate in computer science, with an emphasis on computer graphics from the Pontifical Catholic University of Rio de Janeiro (Brazil).
He graduated with a bachelor of computer science degree from Universidade Federal do Maranhão (UFMA) in Brazil. During his undergraduate degree, he was a member of Labmint's research team and worked with medical imaging, specifically, breast cancer detection and diagnosis using image processing.
Currently, he works as a researcher and system analyst at Instituto Tecgraf, one of the major research and development labs in computer graphics in Brazil. He has been working extensively with PHP, HTML, and CSS since 2007; nowadays, he develops projects in C++11/C++14, along with SQLite, Qt, Boost, and OpenGL. More information about him can be acquired by visiting his personal website at www.andredsm.com.
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.
This book will help you get started with OpenCV on the Android platform in no time. It explains the various computer vision algorithms conceptually, as well as their implementation on the Android platform. This book is an invaluable resource if you are looking to implement computer vision modules in new or existing Android apps.
Chapter 1, Applying Effects to Images, includes some of the basic preprocessing algorithms used in various computer vision applications. This chapter also explains how you can integrate OpenCV into your existing projects.
Chapter 2, Detecting Basic Features in Images, covers the detection of primary features such as edges, corners, lines, and circles in images.
Chapter 3, Detecting Objects, dives deep into feature detection, using more advanced algorithms to detect and describe features in order to uniquely match them to features in other objects.
Chapter 4, Drilling Deeper into Object Detection – Using Cascade Classifiers, explains the detection of general objects, such as faces/eyes in images and videos.
Chapter 5, Tracking Objects in Videos, covers the concepts of optical flow as a motion detector and implements the Lucas-Kanade-Tomasi tracker to track objects in a video.
Chapter 6, Working with Image Alignment and Stitching, covers the basic concepts of image alignment and image stitching to create a panoramic scene image.
Chapter 7, Bringing Your Apps to Life with OpenCV Machine Learning, explains how machine learning can be used in computer vision applications. In this chapter, we take a look at some common machine learning algorithms and their implementation in Android.
Chapter 8, Troubleshooting and Best Practices, covers some of the common errors and issues that developers face while building their applications. It also presents some good practices that can make the application more efficient.
Chapter 9, Developing a Document Scanning App, uses algorithms explained in the earlier chapters to build a complete system to scan documents, regardless of the angle from which the picture is taken.
For this book, you need a system with at least 1 GB RAM. Windows, OS X, and Linux are the currently supported operating systems for Android development.
If you are a Java and Android developer looking to enhance your skills by learning the latest features of OpenCV Android application programming, then this book is for you.
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Create a file named Application.mk and copy the following lines of code to it."
A block of code is set as follows:
New terms and important words are shown in bold.
Warnings or important notes appear in a box like this.
Tips and tricks appear like this.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from: https://www.packtpub.com/sites/default/files/downloads/8204OS_ImageBundle.pdf.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.
Generally, an image contains more information than required for any particular task. For this reason, we need to preprocess the images so that they contain only as much information as required for the application, thereby reducing the computing time needed.
In this chapter, we will learn about the different preprocessing operations, which are as follows:
At the end of this chapter, we will see how you can integrate OpenCV into your existing Android applications.
Before we take a look at the various feature detection algorithms and their implementations, let's first build a basic Android application to which we will keep adding feature detection algorithms, as we go through this chapter.
When we see an image, we perceive it as colors and objects. However, a computer vision system sees it as a matrix of numbers (see the following image). These numbers are interpreted differently, depending on the color model used. The computer cannot directly detect patterns or objects in the image. The aim of computer vision systems is to interpret this matrix of numbers as an object of a particular type.
Representation of a binary image
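To make the idea of "a matrix of numbers" concrete, here is a minimal plain-Java sketch. It does not use OpenCV; the class name `BinaryImageDemo`, the 4x4 sample pattern, and the convention that 1 marks a foreground pixel are all assumptions made for illustration.

```java
// Hypothetical illustration only: a tiny binary image held as a matrix of numbers.
class BinaryImageDemo {
    // Assumed convention: 1 is a foreground pixel, 0 is a background pixel.
    static final int[][] IMAGE = {
        {0, 1, 1, 0},
        {1, 0, 0, 1},
        {1, 1, 1, 1},
        {1, 0, 0, 1},
    };

    // Counts the foreground pixels by visiting every row and column.
    static int countForeground(int[][] image) {
        int count = 0;
        for (int[] row : image) {
            for (int value : row) {
                if (value != 0) {
                    count++;
                }
            }
        }
        return count;
    }

    // Renders the matrix as text so that the stored pattern becomes visible.
    static String render(int[][] image) {
        StringBuilder sb = new StringBuilder();
        for (int[] row : image) {
            for (int value : row) {
                sb.append(value != 0 ? '#' : '.');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(IMAGE));
        System.out.println("foreground pixels: " + countForeground(IMAGE));
    }
}
```

The program itself only sees the numbers. Any notion of a shape in the pattern has to be derived from them, which is the job of the algorithms covered in the later chapters.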
OpenCV is short for Open Source Computer Vision library. It is one of the most widely used computer vision libraries and provides a collection of commonly used functions for computer vision tasks. OpenCV is written natively in C/C++, but has wrappers for Python and Java, and can be used from any JVM language that compiles to Java bytecode, such as Scala and Clojure. Since most Android app development is done in C++/Java, OpenCV has also been ported as an SDK that developers can use in their apps to make them vision enabled.
We will now take a look at how to get started with setting up OpenCV for the Android platform, and start our journey. We will use Android Studio as our IDE of choice, but any other IDE should work just as well with slight modifications. Follow these steps in order to get started:
OpenCV stores images as a custom object called Mat. This object stores information such as the number of rows and columns, the pixel data, and so on, which can be used to uniquely identify and recreate the image when required. Different images contain different amounts of data. For example, a colored image contains more data than a grayscale version of the same image. This is because a colored image is a 3-channel image when using the RGB model, and a grayscale image is a 1-channel image. The following figures show how 1-channel and multichannel (here, RGB) images are stored (these images are taken from docs.opencv.org).
A 1-channel representation of an image is shown as follows:
A grayscale (1-channel) image representation:
A more elaborate form of an image is the RGB representation, which is shown as follows:
An RGB (3-channel) image representation
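The difference in the amount of data between a 1-channel and a 3-channel image can be sketched in plain Java. The `SimpleImage` class below is a hypothetical stand-in, not OpenCV's `Mat`. It assumes a continuous, row-major buffer with the channels of each pixel stored next to each other, as the figures suggest. In OpenCV's Java bindings, the corresponding objects would be `Mat` instances created with a type such as `CvType.CV_8UC1` or `CvType.CV_8UC3`.

```java
// Hypothetical stand-in for an image container; this is NOT OpenCV's Mat class.
class SimpleImage {
    final int rows;
    final int cols;
    final int channels;
    final byte[] data; // assumed layout: row-major, channels interleaved per pixel

    SimpleImage(int rows, int cols, int channels) {
        this.rows = rows;
        this.cols = cols;
        this.channels = channels;
        this.data = new byte[rows * cols * channels];
    }

    // Offset of one channel of one pixel inside the flat buffer.
    int index(int row, int col, int channel) {
        return (row * cols + col) * channels + channel;
    }

    void put(int row, int col, int channel, int value) {
        data[index(row, col, channel)] = (byte) value;
    }

    // Java bytes are signed, so mask to recover the 0-255 value.
    int get(int row, int col, int channel) {
        return data[index(row, col, channel)] & 0xFF;
    }

    public static void main(String[] args) {
        SimpleImage gray = new SimpleImage(4, 5, 1);
        SimpleImage color = new SimpleImage(4, 5, 3);
        color.put(1, 2, 0, 200);
        System.out.println("gray bytes:  " + gray.data.length);
        System.out.println("color bytes: " + color.data.length);
        System.out.println("value read back: " + color.get(1, 2, 0));
    }
}
```

For the same number of rows and columns, the 3-channel buffer is three times the size of the 1-channel one, which is why converting to grayscale is a common way to cut processing cost.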
In the grayscale image, the numbers represent the intensity of each pixel. They are represented on a scale of 0-255 when using integer representations, with 0 being pure black and 255 being pure white. If we use a floating point representation, the pixels are represented on a scale of 0-1, with 0 being pure black and 1 being pure white. In an RGB image in OpenCV, the first channel corresponds to blue, the second channel corresponds to green, and the third channel corresponds to red. Thus, each channel represents the intensity of one particular color. Because red, green, and blue are primary colors, they can be combined in different proportions to produce a wide range of colors visible to the human eye. The following figure shows the different colors and their respective RGB equivalents in an integer format:
Now that we have seen how an image is represented in computing terms, we will see how we can modify the pixel values so that less computation time is needed when they are used for the actual task at hand.
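A few such color values, the two intensity scales, and the blue-green-red channel order can also be written out in code. The following plain-Java sketch is illustrative only: the class `PixelValues` and its helper methods are hypothetical names, not part of OpenCV.

```java
// Hypothetical helpers illustrating pixel value scales and channel order.
class PixelValues {
    // Maps an integer intensity (0-255) to the floating point scale (0-1).
    static float toUnitScale(int intensity) {
        if (intensity < 0 || intensity > 255) {
            throw new IllegalArgumentException("intensity must be in 0-255");
        }
        return intensity / 255.0f;
    }

    // Builds one pixel in the channel order described above: blue, green, red.
    static int[] bgr(int red, int green, int blue) {
        return new int[] {blue, green, red};
    }

    public static void main(String[] args) {
        int[] pureRed = bgr(255, 0, 0);   // stored as {0, 0, 255}
        int[] yellow = bgr(255, 255, 0);  // red plus green, stored as {0, 255, 255}
        int[] white = bgr(255, 255, 255); // all channels at full intensity

        System.out.println("red    (B, G, R): " + java.util.Arrays.toString(pureRed));
        System.out.println("yellow (B, G, R): " + java.util.Arrays.toString(yellow));
        System.out.println("white  (B, G, R): " + java.util.Arrays.toString(white));
        System.out.println("mid gray on the 0-1 scale: " + toUnitScale(128));
    }
}
```

Note that a color given as (red, green, blue) ends up with its first and last components swapped in storage. Forgetting this when moving pixel data between OpenCV and other APIs is a common source of images with exchanged red and blue tones.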