The release of Microsoft Kinect, then PrimeSense Sensor, and Asus Xtion opened new doors for developers to interact with users, re-design their application’s UI, and make them environment (context) aware. For this purpose, developers need a good framework which provides a complete application programming interface (API), and OpenNI is the first choice in this field. This book introduces the new version of OpenNI.
"OpenNI Cookbook" will show you how to start developing a Natural Interaction UI for your applications or games with high level APIs and at the same time access RAW data from different sensors of different hardware supported by OpenNI using low level APIs. It also deals with expanding OpenNI by writing new modules and expanding applications using different OpenNI compatible middleware, including NITE.
"OpenNI Cookbook" favors practical examples over plain theory, giving you a more hands-on experience to help you learn. OpenNI Cookbook starts with information about installing devices and retrieving RAW data from them, and then shows how to use this data in applications. You will learn how to access a device, how to read data from it and display it using OpenGL, and how to use middleware (especially NITE) to track and recognize users and hands and to estimate the skeleton of a person in front of a device, all through examples. You will also learn about more advanced aspects such as how to write a simple module or middleware for OpenNI itself.
"OpenNI Cookbook" shows you how to start and experiment with both NIUI designs and OpenNI itself using examples.
Page count: 330
Year of publication: 2013
Copyright © 2013 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: July 2013
Production Reference: 1190713
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-84951-846-8
www.packtpub.com
Cover Image by Ramin Gharouni (<[email protected]>)
Author
Soroush Falahati
Reviewers
Vinícius Godoy
Li Yang Ku
Liza Roumani
Acquisition Editor
Usha Iyer
Lead Technical Editor
Amey Varangaonkar
Technical Editors
Aparna Chand
Athira Laji
Dominic Pereira
Copy Editors
Insiya Morbiwala
Aditya Nair
Alfida Paiva
Laxmi Subramanian
Project Coordinator
Leena Purkait
Proofreader
Stephen Copestake
Indexer
Monica Ajmera Mehta
Graphics
Ronak Dhruv
Abhinash Sahu
Production Coordinator
Shantanu Zagade
Cover Work
Shantanu Zagade
Soroush Falahati is a Microsoft MCPD certified C# developer of Web and Windows applications, now preparing for a new MCSD certification from Microsoft. He started programming at the age of 13 with VB5 and then continued to VB.Net, C#, C++, C for microcontrollers, as well as scripting languages such as PHP and JavaScript.
He is currently the owner of an e-commerce company that uses web applications and smart phone apps as primary advantages over other competitors.
As a hobby, Soroush supports robotic teams by voluntarily training them on how to program microcontrollers.
I would like to thank my family, who supported me at the time of writing of this book with their patience, just as they always have been patient through the rest of my life!
Also, I want to thank PrimeSense, which gave me access to confidential material and helped me through the writing of this book. I would like to especially thank Eddie Cohen, Software Team Leader, who answered my many questions and Jeremie Kletzkine, the Director of Business Development.
Vinícius Godoy is a computer graphics university professor at PUCPR. He is also an IT manager of an Electronic Content Management (ECM) company in Brazil, called Sinax. His former experience also includes building games and applications for Positivo Informática (including an augmented reality educational game exhibited at CeBIT) and network libraries for Siemens Enterprise Communications.
In his research, he used Kinect, OpenNI, and OpenCV to recognize Brazilian sign language gestures. He is also a game development fan, having a popular website entirely dedicated to the field, called Ponto V (http://www.pontov.com.br). He is mainly proficient with the C++ and Java languages and his field of interest includes graphics synthesis, image processing, image recognition, design patterns, Internet, and multithreading applications.
Li Yang Ku is a Computer Vision scientist and the main author of the Serious Computer Vision Blog (http://computervisionblog.wordpress.com), one of the foremost Computer Vision blogs. He is also the founder of EatPaper (http://www.eatpaper.org), a free web tool for organizing publications visually.
He worked as a researcher at HRL Laboratories, Malibu, California from 2011 to 2013. He did AI research on multiple humanoid robots and designed one of the vision systems for NASA's humanoid space robot, Robonaut 2, at NASA JSC, Houston. He also has broad experience with RGBD sensor applications, such as object recognition, object tracking, human activity classification, SLAM, and quadrotor navigation.
Li Yang Ku received his MS degree in CS from University of California, Los Angeles, and has a BS degree in EE from National Chiao Tung University, Taiwan. He is now pursuing a Ph.D. degree at the University of Massachusetts, Amherst.
Liza Roumani was born in Paris in 1989. After passing the French scientific Baccalaureate, she decided to move to Israel.
After one year at Jerusalem University, she joined the Technion Institute of Technology of Haifa, where she obtained a BSc degree in Electrical Engineering.
Liza Roumani is currently working at PrimeSense Company, the worldwide leader in 3D sensors technology.
You might want to visit www.PacktPub.com for support files and downloads related to your book.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
http://PacktLib.PacktPub.com
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can access, read and search across Packt's entire library of books.
If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view nine entirely free books. Simply use your login credentials for immediate access.
As a step towards interacting with users through the physical world, learn how to write NIUI-based applications or motion-controlled games.
OpenNI Cookbook is here to show you how to start developing Natural Interaction UI for your applications or games with high-level APIs while, at the same time, accessing raw data from different sensors of different devices that are supported by OpenNI using low-level APIs.
Chapter 1, Getting Started, teaches you how to install OpenNI along with NiTE and shows you how to prepare an environment for writing an OpenNI-based application.
Chapter 2, OpenNI and C++, explains how to start programming with OpenNI, from basic steps such as creating a project in Visual Studio to initializing and accessing different devices and sensors.
Chapter 3, Using Low-level Data, is an important chapter of this book, as we are going to cover reading and handling output of basic sensors from each device.
Chapter 4, More about Low-level Outputs, shows how you can customize the frame data right from the device itself, including mirroring and cropping.
Chapter 5, NiTE and User Tracking, will start using the Natural Interaction features of NiTE. As a first step, you will learn how to detect users on the scene and their properties.
Chapter 6, NiTE and Hands Tracking, will cover topics such as recognizing and tracking hand movements.
Chapter 7, NiTE and Skeleton Tracking, will be covering the most important features of NiTE: skeleton tracking and recognizing users' skeleton joints.
You need to have Visual Studio 2010 to perform the recipes given in this book. You will also need to download OpenNI 2 and NiTE from their official websites. If you are going to use Kinect, you may need to download the Kinect SDK from Microsoft's website as well.
OpenNI Cookbook is a book for both starters and professionals in NIUI, for people who want to write serious applications or games, and for people who want to experience and start working with NIUI. Even OpenNI 1 and OpenNI 1.x programmers who want to move to the new versions of OpenNI can use this book as a starting point.
This book uses C++ as its primary language; so for reading and understanding you only need to have a basic knowledge of C or C++.
In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Also, we checked if the initializing process ended without any error by creating a variable of type openni::Status."
A block of code is set as follows:
New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: "From the File menu, select New and then New Project."
Warnings or important notes appear in a box like this.
Tips and tricks appear like this.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.
To send us general feedback, simply send an e-mail to <[email protected]>, and mention the book title via the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.
Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors, and our ability to bring you valuable content.
You can contact us at <[email protected]> if you are having a problem with any aspect of the book, and we will do our best to address it.
The first step before writing an application or game using OpenNI is to install OpenNI itself, the drivers, and any other prerequisites. So in this chapter, we will cover this process and make everything ready for writing an app using OpenNI.
In this chapter, we will cover the following recipes:
As an introduction, it is important for you to have an idea about the technology behind the topics just mentioned and our reasons for writing this book, as well as to know about the different devices and middleware libraries that can be used with OpenNI.
Motion detectors are part of our everyday life, from a simple alarm system to complicated military radars or an earthquake warning system, all using different methods and different sensors but for the same purpose—detecting motion in the environment.
But they were rarely used to control computers or devices until recent years. This was usually because of the high price of capable devices and the lack of powerful software and hardware for consumers, and maybe because end users did not need this technology. Fortunately, this situation changed after some of the powerful players in computer technology tried to use this idea and supported other small innovation companies in this task.
We believe that the idea of controlling computers and other devices with environment-aware input devices is going to grow in the computer industry even more in the coming years. Computers can no longer rely on just a keyboard and a mouse to learn about real environments. As computers control more and more parts of our everyday life, they need to understand our living environment better. So if you are interested in being part of this change, work through this book.
In this book, we are going to show you how to start using current devices and software to write your own applications or games to interact with the real world.
In this chapter, we will introduce you to some usable technologies and devices, and then introduce some of the frameworks and middleware before speaking a little about how you can make applications or games with Natural Interactive User Interfaces (NIUI).
This way of interacting with a computer is known as 3DUI (3D User Interaction or 3D User Interfaces), RBI (Reality based interaction), or NI (Natural Interaction). To know more, visit http://en.wikipedia.org/wiki/3D_user_interaction and http://en.wikipedia.org/wiki/Natural_user_interface.
The keyboard and mouse are two of the most used input devices for computers; they are how a computer learns about the world outside the box. But the usage of these two devices is very limited and there is a real gap between the physical world and the computer's understanding of the surrounding environment.
To fill this gap, different projects were raised to reconstruct 3D environments for computers using different methods. Read more about these techniques at http://en.wikipedia.org/wiki/Range_imaging.
For example, vSlam is one such famous project designed for robotic researchers who try to do this using one or two RGB cameras. This project is an open source one and is available at http://www.ros.org/wiki/vslam.
However, since most of these solutions depend on the camera's movement or detection of similar patterns from two cameras, and then use Stereo triangulation algorithms for creating a 3D map of the environment, they perform a high number of calculations along with using complex algorithms. This makes them slow and their output unreliable and/or inaccurate.
There are more expensive methods to solve these problems when high accuracy is needed. Methods such as Laser Imaging Detection and Ranging (LIDAR) use one or more laser beams to scan the environment. These methods are expensive and not a good option for targeting end users. The devices are usually big in size and the mid-level models are slow at scanning a 3D environment completely. Yet, because they use ToF (Time of Flight) for calculating distances, they have very good accuracy and a very good range too. The devices that use laser beams are used mainly for scanning huge objects, buildings, surfaces, landforms (in Geology), and so on, from the ground, an airplane, or from a satellite. Read more on http://en.wikipedia.org/wiki/Lidar.
To know more about the other types of 3D scanners, visit http://en.wikipedia.org/wiki/3D_scanner.
In 2010, Microsoft released the Kinect device for Xbox 360 users to control their console and games without a controller. Kinect originally uses PrimeSense's technology and its SoC (System on Chip) to capture and analyze the depth of the environment. PrimeSense's method of scanning the environment is based on projecting a pattern of a hundred beams of infrared lasers to the environment and capturing these beams using a simple image CMOS sensor (a.k.a. Active Pixel Sensor or APS) with an infrared-passing filter in front of it. PrimeSense's SoC is then responsible for comparing the captured pattern with the projected one and creating a displacement map of the captured pattern relative to the projected pattern. This displacement map is actually the same depth map that the device provides to the developers later with some minor changes. This technology is called Structured-light 3D scanning. Its accuracy, size, and error rate (below 70 millimeters in the worst possible case) when compared to its cost make it a reasonable choice for a consumer-targeted device.
To know more about Kinect, visit http://en.wikipedia.org/wiki/Kinect.
PrimeSense decided to release similar devices after Kinect was released. Carmine 1.08, Carmine 1.09 (a short range version of Carmine 1.08), and Capri 1.25 (an embeddable version) are the three devices from PrimeSense. In this book, we will call them all PrimeSense sensors. A list of the available devices from PrimeSense can be viewed at http://www.primesense.com/solutions/sensor/.
Before the release of PrimeSense sensors, Asus released two sensors in 2011 named Asus Xtion (with only depth and IR output) and Asus Xtion Pro Live (with depth, color, IR, and audio output) with PrimeSense's technology and chipset, just as with Kinect, but without some features such as tilting, custom design, higher resolution, and frame rate compared to Kinect. From what PrimeSense told us, the Asus Xtion series and PrimeSense's sensors both share the same design and are almost identical.
Both of PrimeSense's sensors and the Asus Xtion series are almost twice as expensive compared to Microsoft Kinect, yet they have a more acceptable price than the other competitors (in the U.K., Microsoft Kinect is priced at $110).
Here is an illustration to help you understand how Kinect, Asus Xtion, and PrimeSense sensors work:
More information about this method is available on Wikipedia at http://en.wikipedia.org/wiki/Structured-light_3D_scanner.
After the release of Kinect, other devices aimed to give better and faster outputs to users and yet keep the price in an acceptable range. These devices usually use ToF to scan environments and should have better accuracy, at least in theory. SoftKinetic devices (the DepthSense series of devices) and pmd[vision]® CamBoard nano are two of the notable designs. Currently, there is no support for them in OpenNI and they are not very popular compared to Kinect, Asus Xtion, and PrimeSense's sensors. Their resolution is less than what PrimeSense-based devices can offer, but their frame rate is usually better because of the simple calculation they use to produce a depth frame. Current devices can offer from 60 to 120 frames per second at resolutions ranging from 160 x 120 to 320 x 240, whereas Kinect, Asus Xtion, and PrimeSense's sensors can give you resolutions of up to 640 x 480 at 30 to 60 frames per second. Also, these devices usually cost more than PrimeSense-based devices (from $250 to $690 at the time of writing this book).
Microsoft introduced Xbox One in 2013 with a new version of Kinect, known as Kinect for Xbox One (a.k.a. Kinect 2), which uses ToF technology and custom-made CMOS for capturing both RGB and depth data along with projecting beams of laser. From what Microsoft told the media, it is completely made by Microsoft and, unlike the first version of Kinect, this time there is no third-party company involved. It is unknown if this new version of Kinect is compatible with OpenNI, but Microsoft promised a Windows SDK, which means we can expect a custom module for OpenNI from the community at least.
You can read more about ToF-based cameras and their technologies on Wikipedia at http://en.wikipedia.org/wiki/Time-of-flight_camera.
Fotonic is another manufacturer of 3D imaging cameras. Fotonic E series products are OpenNI-compatible ToF devices. You can check their website (http://www.fotonic.com/) for more information.
In this book, we use Asus Xtion Pro Live and Kinect, but you can use any of PrimeSense's sensors and it will give you the same result as Asus Xtion Pro Live without any headache. We even expect the same result with any other OpenNI-compatible device (for example, Fotonic E70 or E40).
After having good hardware for capturing the 3D environment, it is very important to have a good interface to communicate and read data from a device. Apart from the fact that each device may have its own SDK, it is important for developers to use one interface for all of the different devices.
Unfortunately, there is no unique interface for such devices now. But OpenNI, as the default framework and SDK for PrimeSense-based devices (such as Kinect, PrimeSense sensors, and Asus Xtion), has the capacity to become one.
OpenNI is an organization that is responsible for its framework with the same name. Their framework (that we will call OpenNI in this book) is an open source project and is available for change by any developer. The funder of this project is PrimeSense itself. This project became very famous because of being the first framework with unofficial Kinect support when there wasn't any reliable framework. In the current version of OpenNI, Kinect is officially supported via the Microsoft SDK.
OpenNI, on one hand, gives device producers the ability to connect their devices to the framework, and on the other hand gives developers the ability to work with the same API for different devices. At the same time, other companies and individuals can develop their own middleware and expand the API of OpenNI. These features give this framework a value that its competitors don't have.
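The idea of one API in front of many devices can be sketched in plain C++ as follows. This is only an illustration of the design, not OpenNI's real class hierarchy: DepthSource, StructuredLightSensor, TimeOfFlightSensor, and nearestDepth are all hypothetical names, and the frames are hard-coded dummy values in millimeters.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical common interface that every device driver implements.
class DepthSource {
public:
    virtual ~DepthSource() = default;
    virtual std::string name() const = 0;
    // Returns one depth frame; each value is a distance in millimeters,
    // with 0 meaning "no reading" for that pixel.
    virtual std::vector<std::uint16_t> readFrame() = 0;
};

// Two made-up drivers returning dummy frames for illustration.
class StructuredLightSensor : public DepthSource {
public:
    std::string name() const override { return "structured-light sensor"; }
    std::vector<std::uint16_t> readFrame() override {
        return {0, 1200, 800, 1500};
    }
};

class TimeOfFlightSensor : public DepthSource {
public:
    std::string name() const override { return "time-of-flight sensor"; }
    std::vector<std::uint16_t> readFrame() override {
        return {2000, 0, 950};
    }
};

// Application code is written once against the interface and works with
// any driver. Returns the smallest valid depth, or 0 if none is valid.
std::uint16_t nearestDepth(DepthSource& source) {
    std::uint16_t nearest = 0;
    for (std::uint16_t d : source.readFrame()) {
        if (d != 0 && (nearest == 0 || d < nearest)) {
            nearest = d;
        }
    }
    return nearest;
}
```

Here nearestDepth never needs to know which kind of hardware produced the frame; supporting a new device only means adding another class that implements the interface. Middleware can be layered on top in the same way, consuming frames through the shared interface.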
As mentioned in the title of the book, we will use OpenNI as a way to know this field better and to develop our applications.
NiTE is a middleware based on the OpenNI framework and was developed by PrimeSense as an enterprise project.
NiTE gives us more information about a scene based on the information from the depth stream of a device.
We will use NiTE in this book for accessing a user's data and body tracking as well as hand tracking and gesture recognition.
NiTE is not the only middleware; there are other middleware that you can use along with OpenNI, such as the following:
And a whole lot more. A list of SDKs and middleware libraries is available at OpenNI.org (http://www.openni.org/software/?cat_slug=file-cat1).
With the seventh generation of video game consoles, interacting with users via motion detection became popular, with the focus on improving gaming experience, starting with the Nintendo Wii controller and followed by Microsoft Kinect and Sony PlayStation Move.
But gaming isn't the only area that can use these new ways. There are different cases where interacting with users via natural ways is a better option than traditional ways, or at least can be used as an improvement. Just think of how you can use it in advertising panels, or how you can give product information to users. Or you can design an intelligent house that is able to identify and understand a user's orders. Just look at what companies such as Samsung did with their Smart TV line of products.
As device accuracy and usable field of view improve, you can expect the creation of applications for personal computers to become reasonable too, for example, moving and rotating a 3D model in 3D modeling apps, or helping in drawing apps, as well as the possibility of interacting with the Windows 8 Modern interface or other similar interfaces.
As a developer, you can think of it as a 3D touch screen, and one can do lots of work with a 3D touch screen. What it needs is a little creativity and innovation to find and create ways and ideas to use these methods to interact with users.
Yet, developing games and applications is not the only area that you can use this technology for. There are projects already underway for creating more environment-aware indoor robots and different indoor security systems as well as constructing and scanning an environment completely (such as the KinectFusion project or other similar projects). It's hard to ignore and not mention the available Motion Capture applications (for example, iPi Motion Capture™).
As you can see, there are lots of possibilities in which you can use OpenNI, NiTE, and other middleware libraries.
But in this book, we are not going to show you how to do anything specific to one of the preceding categories. Instead, we are going to cover how to use OpenNI and NiTE, and it all depends on you and how you want to use the information provided in this book.
In this chapter, we are going to introduce OpenNI and cover the process of initializing OpenNI as well as the process of accessing different devices. The next step for you in this book is reading RAW data from devices and using OpenNI to customize this data from a device. NiTE can help you to convert this data to understandable information about the current scene. This information can be used to interact with users. We are going to cover NiTE and its features in this book too.
By using this information, you will be able to create your own body-controlled game, an application with an NI interface, or even custom systems and projects with better understanding of the world and with the possibility of interacting more easily and in natural ways with users.
The main programming language with OpenNI is C, but there is a C++ wrapper with each release. This book makes conservative use of C++ for simplicity. We used a little bit of OpenGL using the GLUT library to visually show some of the information. So you may need to know C++ and have a little understanding about what OpenGL and 2D drawing are.
Currently, there are two official wrappers for OpenNI and NiTE: C++ and Java wrappers. Yet there is no official wrapper for .NET, Unity, or other languages/software.
Community-maintained wrappers, at the time of writing this book, are NiWrapper.Net which is an open source project supporting OpenNI and NiTE functionalities for .NET developers and ZDK for Unity3D, which is a commercial project for adding OpenNI 2 and NiTE 2 support to Unity. Of course, there are other frameworks that use OpenNI as the backend, but none of these can be fitted in the subject of this book.
OpenNI is a multiplatform framework supporting Windows (32 bit and 64 bit; the ARM edition is not yet available at the time of writing this book), Mac OS X, and Linux (32 bit, 64 bit, and ARM editions). In this book, we are going to use Windows (mainly 64 bit) for projects. But porting code to other platforms is easily possible and is unlikely to create serious problems for you if you decide to do this.
The first step to use OpenNI to develop any application or game is to install the OpenNI framework on your development machine. In this recipe, we will show you how to install OpenNI; actually it is as easy as 1-2-3.
There is nothing special here; we downloaded and opened the archive file and then executed the installer package. Also, we accepted the installation of new drivers to the Windows catalog.
If you want to use high-level outputs and some advanced tracking and recognition features of NiTE, you need to install it as well. NiTE is a middleware based on the OpenNI framework and needs to be installed after it.
Before installing NiTE, you need to have OpenNI installed using the Downloading and installing OpenNI recipe in this chapter.
Actually we did nothing special here either; all we did was register, download, and install NiTE for our version of the OS and CPU architecture.
For using Kinect on Windows 7 and Windows 8, you need to install the Microsoft Kinect SDK. This SDK lets OpenNI access Kinect for Windows and Kinect for Xbox devices.
Please note that the current version of OpenNI (OpenNI 2.2) works only with Version 1.6 and higher of Microsoft Kinect SDK. The current stable version of Kinect SDK is 1.7.
Please note that Microsoft Kinect SDK can only be installed on Windows 7 and later.
Just as with the previous two recipes, we did nothing worth explaining except for downloading and installing the Kinect SDK, Kinect Drivers, and the Kinect Runtime.
After installing OpenNI, you need to connect your device to your PC. In this recipe, we will show you how to connect and expect Windows to recognize your device. Actually, both of these devices use only one USB port and drivers are also a part of OpenNI, so in this recipe we are not going to do anything other than connecting and waiting.
Before connecting your device, you need to have OpenNI installed using the Downloading and installing OpenNI recipe in this chapter.
Please note that your device may not be compatible with USB3. It is possible for PrimeSense and Asus Xtion users to update their device firmware to add support for Audio and USB3. Check out the PrimeSense website (http://www.primesense.com/updates/) for downloading the latest firmware.
