Augmented Reality for Developers - Jonathan Linowes - E-Book

Description

Augmented Reality brings with it a set of challenges that are unseen and unheard of for traditional web and mobile developers. This book is your gateway to Augmented Reality development—not a theoretical showpiece for your bookshelf, but a handbook you will keep by your desk while coding and architecting your first AR app and for years to come.

The book opens with an introduction to Augmented Reality, including markets, technologies, and development tools. You will begin by setting up your development machine for Android, iOS, and Windows development, learning the basics of using Unity and the Vuforia AR platform as well as the open source ARToolKit and Microsoft Mixed Reality Toolkit. You will also receive an introduction to Apple's ARKit and Google's ARCore! You will then focus on building AR applications, exploring a variety of recognition targeting methods. You will go through multiple complete projects illustrating key market sectors including business marketing, education, industrial training, and gaming.

By the end of the book, you will have gained the necessary knowledge to make quality content appropriate for a range of AR devices, platforms, and intended uses.

You can read this e-book in any app that supports the following format:

EPUB

Page count: 552

Year of publication: 2017




Augmented Reality for Developers


Build practical augmented reality applications with Unity, ARCore, ARKit, and Vuforia


Jonathan Linowes
Krystian Babilinski


BIRMINGHAM - MUMBAI

Augmented Reality for Developers

Copyright © 2017 Packt Publishing

 

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

 

First published: October 2017

 

Production reference: 1051017

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN: 978-1-78728-643-6

 

www.packtpub.com

Credits

Authors: Jonathan Linowes, Krystian Babilinski

Reviewers: Micheal Lanham

Commissioning Editor: Amarabha Banerjee

Acquisition Editor: Reshma Raman

Content Development Editor: Anurag Ghogre

Technical Editor: Jash Bavishi

Copy Editor: Safis Editing

Project Coordinator: Ulhas Kambali

Proofreader: Safis Editing

Indexer: Rekha Nair

Graphics: Abhinash Sahu

Production Coordinator: Melwyn Dsa

About the Authors

Jonathan Linowes is principal at Parkerhill Reality Labs, an immersive media indie studio. He is a veritable 3D graphics enthusiast, Unity developer, successful entrepreneur, and teacher. He has a fine arts degree from Syracuse University and a master's degree from the MIT Media Lab. He has founded several successful startups and held technical leadership positions at major corporations, including Autodesk Inc. He is the author of other books and videos by Packt, including Unity Virtual Reality Projects (2015) and Cardboard VR Projects for Android (2016).

Special thanks to Lisa and our kids for augmenting my life and keeping my real-reality real.


Krystian Babilinski is an experienced Unity developer with extensive knowledge in 3D design. He has been developing professional AR/VR applications since 2015. He led Babilin Applications, a Unity design group that promotes open source development and engages with the Unity community. Krystian now leads development at Parkerhill Reality Labs, which recently published Power Solitaire VR, a multiplatform VR game.

About the Reviewer

Micheal Lanham is a solutions architect with petroWEB and currently resides in Calgary, Alberta, in Canada. In his current role, he develops integrated GIS applications with advanced ML and spatial search capabilities. He has worked as a professional and amateur game developer; he has been building desktop and mobile games for over 15 years. In 2007, Micheal was introduced to Unity 3D and has been an avid developer, consultant, and manager of multiple Unity games and graphic projects ever since.

Micheal previously wrote Augmented Reality Game Development and Game Audio Development with Unity 5.x, both also published by Packt in 2017.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Customer Feedback

Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1787286436.

If you'd like to join our team of regular reviewers, you can email us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!

Table of Contents

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

Augment Your World

What is augmented reality?

Augmented reality versus virtual reality

How AR works

Handheld mobile AR

Optical eyewear AR

Target-based AR

3D spatial mapping

Developing AR with spatial mapping

Input for wearable AR

Other AR display techniques

Types of AR targets

Marker

Coded Markers

Images

Multi-targets

Text recognition

Simple shapes

Object recognition

Spatial maps

Geolocation

Technical issues in relation to augmented reality

Field of view

Visual perception

Focus

Resolution and refresh rate

Ergonomics

Applications of augmented reality

Business marketing

Education

Industrial training

Retail

Gaming

Others

The focus of this book

Summary

Setting Up Your System

Installing Unity

Requirements

Download and install

Introduction to Unity

Exploring the Unity Editor

Objects and hierarchy

Scene editing

Adding a cube

Adding a plane

Adding a material

Saving the scene

Changing the Scene view

Game development

Material textures, lighting, and shaders

Animation

Physics

Additional features

Using Cameras in AR

Getting and using Vuforia

Installing Vuforia

Downloading the Vuforia Unity package

Importing the Vuforia Assets package

VuforiaConfiguration setup

License key

Webcam

Building a quick demo with Vuforia

Adding AR Camera prefab to the Scene

Adding a target image

Adding a cube

Getting and using ARToolkit

Installing ARToolkit

Importing the ARToolkit Assets package

ARToolkit Scene setup

Adding the AR Controller

Adding the AR Root origin

Adding an AR camera

Saving the Scene

Building a quick demo with ARToolkit

Identifying the AR Marker

Adding an AR Tracked Object

Adding a cube

Summary

Building Your App

Identifying your platform and toolkits

Building and running from Unity

Targeting Android

Installing Java Development Kit (JDK)

About your JDK location

Installing an Android SDK

Installing via Android Studio

Installing via command-line tools

About your Android SDK root path location

Installing USB device, debugging and connection

Configuring Unity's external tools

Configuring a Unity platform and player for Android

Building and running

Troubleshooting

Android SDK path error

Plugins colliding error

Using Google ARCore for Unity

Targeting iOS

Having an Apple ID

Installing Xcode

Configuring the Unity player settings for iOS

ARToolkit player settings

Building and running

Troubleshooting

Plugins colliding error

Recommended project settings warning

Requires development team error

Linker failed error

No video feed on the iOS device

Using Apple ARKit for Unity

Targeting Microsoft HoloLens

Having a Microsoft developer account

Enabling Windows 10 Hyper-V

Installing Visual Studio

Installing the HoloLens emulator

Setting up and pairing the HoloLens device for development

Configuring Unity's external tools

Configuring the Unity platform and player for the UWP holographic

Build settings

Quality settings

Player settings - capabilities

Player settings - other settings

Vuforia settings for HoloLens

Enabling extended tracking

Adding HoloLensCamera to the Scene

Binding the HoloLens Camera

Building and running

Holographic emulation within Unity

MixedRealityToolkit for Unity

Summary

Augmented Business Cards

Planning your AR development

Project objective

AR targets

Graphic assets

Obtaining 3D models

Simplifying high poly models

Target device and development tools

Setting up the project (Vuforia)

Adding the image target

Adding ImageTarget prefab to the scene

Creating the target database

Importing database into Unity

Activating and running

Enable extended tracking or not?

What makes a good image target?

Adding objects

Building and running

Understanding scale

Real-life scale

Virtual scale and Unity

Target scale and object scale

Animating the drone

How do the blades spin?

Adding an Idle animation

Adding a fly animation

Connecting the clips in the Animator Controller

Playing, building and running

Building for iOS devices

Setting up the project

Adding the image target

Adding objects

Build settings

Building and running

Building and running using Apple ARKit

Building for HoloLens

Setting up the project

Adding the image target

Adding objects

Build settings

Building and running

Building with ARToolkit

Setting up the project

Preparing the image target

Adding the image target

Adding objects

Building and running

Summary

AR Solar System

The project plan

User experience

AR targets

Graphic assets

Target device and development tools

Setting up the project

Creating our initial project

Setting up the scene and folders

Using a marker target

Creating a SolarSystem container

Building the earth

Creating an earth

Rotating the earth

Adding audio

Lighting the scene

Adding sunlight

Night texture

Building an earth-moon system

Creating the container object

Creating the moon

Positioning the moon

A quick introduction to Unity C# programming

Animating the moon orbit

Adding the moon orbit

Adding a global timescale

Orbiting the sun

Making the sun the center, not the earth

Creating the sun

The earth orbiting around the sun

Tilt the earth's axis

Adding the other planets

Creating the planets with textures

Adding rings to Saturn

Switching views

Using VuMark targets (Vuforia)

Associating markers with planets

Adding a master speed rate UI

Creating a UI canvas and button

Gametime event handlers

Trigger input events

Building and running

Exporting the SolarSystem package

Building for Android devices – Vuforia

Building for iOS devices – Vuforia

Building for HoloLens – Vuforia

Building and running ARToolkit

ARToolkit markers

Building the project for AR Toolkit

Using 2D bar code targets (AR Toolkit)

Markerless building and running

Building and running iOS with ARKit

Setting up a generic ARKit scene

Adding SolarSystem

Placing SolarSystem in the real world

UI for animation speed

Building and running HoloLens with MixedRealityToolkit

Creating the scene

Adding user selection of scale and time

Summary

How to Change a Flat Tire

The project plan

Project objective

User experience

Basic mobile version

AR mobile version

Markerless version

AR targets

Graphic assets and data

Software design patterns

Setting up the project

Creating the UI (view)

Creating an Instruction Canvas

Creating a Nav Panel

Creating a Content panel

Adding a title text element

Adding a body text element

Creating an Instructions Controller

Wiring up the controller with the UI

Creating an instruction data model

InstructionStep class

InstructionModel class

Connecting the model with the controller and UI

Loading data from a CSV file

Abstracting UI elements

Adding InstructionEvent to the controller

Refactoring InstructionsController

Defining InstructionElement

Linking up the UI elements in Unity

Adding image content

Adding an image to the instruction Content panel

Adding image data to the InstructionStep model

Importing the image files into your project

Adding video content

Adding video to the instruction content panel

Adding video player and render texture

Adding video data to the InstructionStep model

Adding a scroll view

Summary

Augmenting the Instruction Manual

Setting up the project for AR with Vuforia

Switching between AR Mode

Using user-defined targets

Adding a user-defined target builder

Adding an image target

Adding a capture button

Wire capture button to UDT capture event

Adding visual helpers to the AR Prompt

Adding a cursor

Adding a registration target

Removing the AR prompt during tracking

Preventing poor tracking

Integrating augmented content

Reading the AR graphic instructions

Creating AR UI elements

Displaying the augmented graphic

Making the augmented graphics

Including the instructions panel in AR

Using ARKit for spatial anchoring

Setting up the project for ARKit

Preparing the scene

Modifying the InstructionsController

Adding the AR mode button

Adding the anchor mode button

Adding the AR prompt

Adding AR graphic content

A Holographic instruction manual

Setting up the project for HoloLens

World space content canvas

Enabling the next and previous buttons

Adding an AR prompt

Placement of the hologram

Adding AR graphics content

Summary

Room Decoration with AR

The project plan

User experience

Graphic assets

Photos

Frames

User interface elements

Icon buttons

Setting up the project and scene

Create a new Unity project

Developing for HoloLens

Creating default picture

About Mixed Reality Toolkit Input Manager

Gaze Manager

Input Manager

Mixed Reality Toolkit input events

Creating a toolbar framework

Create a toolbar

PictureController component

PictureAction component

Wire up the actions

Move tool, with spatial mapping

Add the Move button and script

Use Spatial Mapping for positioning

Understanding surface planes

Scale tool with Gesture Recognizer

Adding the scale button and script

Scaling the picture

Supporting Cancel

Abstract selection menu UI

Adding the frame menu

SetFrame in PictureController

The FrameMenu object and component

Frame options objects

Activating the frame menu

Support for Cancel in PictureController

Adding the Image menu

SetImage in PictureController

The ImageMenu object and component

Image options objects

Activating the Image menu

Adjusting for Image aspect ratio

Adding and deleting framed pictures

Add and Delete in the Toolbar

GameController

Add and Delete Commands in PictureController

Handling empty scenes

UI feedback

Click audio feedback

Click animation feedback

Building for iOS with ARKit

Set up project and scene for ARKit

Use touch events instead of hand gestures

PictureAction

ClickableObjects

ScaleTool

MoveTool

Building for mobile AR with Vuforia

Set up project and scene for Vuforia

Set the image target

Add DefaultPicture to the scene

GameController

Use touch events instead of hand gestures

Summary

Poke the Ball Game

The game plan

User experience

Game components

Setting up the project

Creating an initial project

Setup the scene and folders

Importing the BallGameArt package

Setting the image target

Boxball game graphics

Ball game court

Scale adjustments

Bouncy balls

Bounce sound effect

Throwing the ball

Ready ball

Holding the ball

Throwing the ball

Detecting goals

Goal collider

CollisionBehavior component

Goal! feedback

Cheers for goals

BallGame component

Keeping score

Current score UI

Game controller

Tracking high score

Augmenting real-world objects

About Vuforia Smart Terrain

User experience and app states

Screen space canvas

Using Smart Terrain

Handling tracking events

App state

App state manager

Wiring up the state manager

Playing alternative games

Setting up the scene with ball games

Activating and deactivating games

Controlling which game to play

Other toolkits

Summary

Preface

Augmented Reality has been said to be the next major computing platform. This book shows you how to build exciting AR applications with Unity 3D and the leading AR toolkits for a spectrum of mobile and wearable devices.

The book opens with an introduction to augmented reality, including the markets, technologies, and development tools. You will begin by setting up your development machine for Android, iOS, and/or Windows development, and learn the basics of using Unity and the Vuforia AR platform as well as the open source ARToolKit, the Microsoft Mixed Reality Toolkit, Google ARCore, and Apple's ARKit! You will then focus on building AR applications, exploring a variety of recognition targeting methods. You will go through full projects illustrating key business sectors, including marketing, education, industrial training, and gaming.

Throughout the book, we introduce major concepts in AR development, best practices in user experience, and important software design patterns that every professional and aspiring software developer should use.

It was quite a challenge to construct the book in a way that (hopefully) retains its usefulness and relevancy for years to come. There is an ever-increasing number of platforms, toolkits, and AR-capable devices emerging each year. There are solid general-purpose toolkits such as Vuforia and the open source ARToolkit, which support both Android and iOS devices. There is the beta Microsoft HoloLens and its Mixed Reality Toolkit for Unity. We had nearly completed writing this book when Apple announced its debut into the market with ARKit and Google announced ARCore, so we took the time to integrate ARKit and ARCore into our chapter projects too.

By the end of this book, you will gain the necessary knowledge to make quality content appropriate for a range of AR devices, platforms, and intended uses.

What this book covers

Chapter 1, Augment Your World, will introduce you to augmented reality and how it works, including a range of best practices, devices, and practical applications.

Chapter 2, Setting Up Your System, walks you through installing Unity, Vuforia, ARToolkit, and other software needed to develop AR projects on Windows or Mac development machines. It also includes a brief tutorial on how to use Unity.

Chapter 3, Building Your App, continues from Chapter 2, Setting Up Your System, to ensure that your system is set up to build and run AR on your preferred target devices, including Android, iOS, and Windows Mixed Reality (HoloLens).

Chapter 4, Augmented Business Cards, takes you through the building of an app that augments your business card. Using a drone photography company as the example, we make its business card come to life with a flying drone in AR.

Chapter 5, AR Solar System, demonstrates the application of AR for science and education. We build an animated model of the solar system using actual NASA scale, orbits, and texture data.

Chapter 6, How to Change a Flat Tire, dives into the Unity user interface (UI) development and also explores the software design pattern, while building a how-to instruction manual. The result is a regular mobile app using text, image, and video media. This is part 1 of the project.

Chapter 7, Augmenting the Instruction Manual, takes the mobile app developed in the previous chapter and augments it, adding 3D AR graphics as a new media type. This project demonstrates how AR need not be the central feature of an app but simply another kind of media.

Chapter 8, Room Decoration with AR, demonstrates the application of AR for design, architecture, and retail visualization. In this project, you can decorate your walls with framed photos, with a world-space toolbar to add, remove, resize, position, and change the pictures and frames.

Chapter 9, Poke the Ball Game, demonstrates the development of a fun ballgame that you can play on your real-world coffee table or desk using virtual balls and game court. You shoot the ball at the goal, aim to win, and keep score.

Each project can be built using a selection of AR toolkits and hardware devices, including Vuforia or the open source ARToolkit for Android or iOS. We also show how to build the same projects to target iOS with Apple ARKit, Google ARCore, and HoloLens with the Microsoft Mixed Reality Toolkit.

What you need for this book

Requirements will depend on what you are using for a development machine, your preferred AR toolkit, and your target device. We assume you are developing on a Windows 10 PC or on macOS. You will need a device to run your AR apps, whether that be an Android smartphone or tablet, an iOS iPhone or iPad, or a Microsoft HoloLens.

All the software required for this book is described and explained in Chapter 2, Setting Up Your System, and Chapter 3, Building Your App, which include web links to download what you may need. Please refer to Chapter 3, Building Your App, to understand the specific combinations of development OS, AR toolkit SDK, and target devices supported.

Who this book is for

The ideal target audience for this book is developers who have some experience in mobile development, either Android or iOS. Some broad web development experience would also be beneficial.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "We can include other contexts through the use of the include directive."

A block of code is set as follows:

[default]
exten => s,1,Dial(Zap/1|30)
exten => s,2,Voicemail(u100)
exten => s,102,Voicemail(b100)
exten => i,1,Voicemail(s0)

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

[default]
exten => s,1,Dial(Zap/1|30)
exten => s,2,Voicemail(u100)

exten => s,102,Voicemail(b100)

exten => i,1,Voicemail(s0)

Any command-line input or output is written as follows:

# cp /usr/src/asterisk-addons/configs/cdr_mysql.conf.sample /etc/asterisk/cdr_mysql.conf

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Clicking the Next button moves you to the next screen."

Warnings or important notes appear in a box like this.

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

The completed projects are available on GitHub in an account dedicated to this book: https://github.com/arunitybook. We encourage our readers to submit improvements, issues, and pull requests via GitHub. As AR toolkits and platforms change frequently, we aim to keep the repositories up to date with the help of the community.

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files emailed directly to you. You can download the code files by following these steps:

1. Log in or register to our website using your email address and password.

2. Hover the mouse pointer on the SUPPORT tab at the top.

3. Click on Code Downloads & Errata.

4. Enter the name of the book in the Search box.

5. Select the book for which you're looking to download the code files.

6. Choose from the drop-down menu where you purchased this book from.

7. Click on Code Download.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows

Zipeg / iZip / UnRarX for Mac

7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Augmented-Reality-for-Developers. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/AugmentedRealityforDevelopers_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title. To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the internet, please provide us with the location address or website name immediately so that we can pursue a remedy. Please contact us at [email protected] with a link to the suspected pirated material. We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.

Augment Your World

We're at the dawn of a whole new computing platform, preceded by personal computers, the internet, and mobile device revolutions. Augmented reality (AR) is the future, today!

Let's help invent this future where your daily world is augmented by digital information, assistants, communication, and entertainment. As it emerges, there is a booming need for developers and other skilled makers to design and build these applications.

This book aims to educate you about the underlying AR technologies, best practices, and steps for making AR apps, using some of the most powerful and popular 3D development tools available, including Unity with Vuforia, Apple ARKit, Google ARCore, Microsoft HoloLens, and the open source ARToolkit. We will guide you through the making of quality content appropriate to a variety of AR devices and platforms and their intended uses.

In this first chapter, we introduce you to AR and talk about how it works and how it can be used. We will explore some of the key concepts and technical achievements that define the state of the art today. We then show examples of effective AR applications, and introduce the devices, platforms, and development tools that will be covered throughout this book.

Welcome to the future!

We will cover the following topics in this chapter:

Augmented reality versus virtual reality

How AR works

Types of markers

Technical issues with augmented reality

Applications of augmented reality

The focus of this book

What is augmented reality?

Simply put, AR is the combination of digital data with real-world human sensory input, in real time, such that the digital content appears attached (registered) to the physical space.

AR is most often associated with visual augmentation, where computer graphics are combined with actual world imagery. Using a mobile device, such as a smartphone or tablet, AR combines graphics with video. We refer to this as handheld video see-through. The following is an image of the Pokémon Go game that brought AR to the general public in 2016:

AR is not really new; it has been explored in research labs, the military, and other industries since the 1990s. Software toolkits for desktop PCs have been available as both open source and proprietary platforms since the late 1990s. The proliferation of smartphones and tablets has accelerated industrial and consumer interest in AR. And certainly, opportunities for handheld AR have not yet reached their full potential, with Apple only recently entering the fray with its release of ARKit for iOS in June 2017 and Google's release of the ARCore SDK for Android in August 2017.

Much of today's interest and excitement for AR is moving toward wearable eyewear AR with optical see-through tracking. These sophisticated devices, such as Microsoft HoloLens and Metavision's Meta headsets, and yet-to-be-revealed (as of this writing) devices from Magic Leap and others use depth sensors to scan and model your environment and then register computer graphics to the real-world space. The following is a depiction of a HoloLens device used in a classroom:

However, AR doesn't necessarily need to be visual. Consider a blind person using computer-generated auditory feedback to help guide them around natural obstacles. Even for a sighted person, a system that augments the perception of your real-world surroundings with auditory assistance is very useful. Conversely, consider a deaf person using an AR device that listens to the sounds and words around them and displays them visually.

Also, consider tactile displays as augmented reality for touch. A simple example is the Apple Watch with a mapping app that taps you on the wrist with haptic vibrations to remind you it's time to turn at the next intersection. Bionics is another example of this. It's not hard to consider the current advances in prosthetics for amputees as AR for the body, augmenting kinesthetic perception of body position and movement.

Then, there's this idea of augmenting spatial cognition and wayfinding. In 2004, researcher Udo Wachter built and wore a belt on his waist, lined with haptic vibrators (buzzers) attached every few inches. The buzzer facing north at any given moment would vibrate, letting him constantly know what direction he was facing. Udo's sense of direction improved dramatically over a period of weeks (https://www.wired.com/2007/04/esp/):

Can AR apply to smell or taste? I don't really know, but researchers have been exploring these possibilities as well.

What is real? How do you define "real"? If you're talking about what you can feel, what you can smell, and what you can taste and see, then "real" is simply electrical signals interpreted by your brain. ~ "Morpheus in The Matrix (1999)"

OK, this may be getting weird and very science fictiony. (Have you read Ready Player One and Snow Crash?) But let's play along a little bit more before we get into the crux of this specific book.

According to the Merriam-Webster dictionary (https://www.merriam-webster.com), the word augment is defined as, to make greater, more numerous, larger, or more intense. And reality is defined as, the quality or state of being real. Take a moment to reflect on this. You will realize that augmented reality, at its core, is about taking what is real and making it greater, more intense, and more useful.

Apart from this literal definition, augmented reality is a technology and, more importantly, a new medium whose purpose is to improve human experiences, whether they be directed tasks, learning, communication, or entertainment. We use the word real a lot when talking about AR: real-world, real-time, realism, really cool!

As human flesh and blood, we experience the real world through our senses: eyes, ears, nose, tongue, and skin. Through the miracle of life and consciousness, our brains integrate these different types of input, giving us vivid living experiences. Using human ingenuity and invention, we have built increasingly powerful and intelligent machines (computers) that can also sense the real world, however humbly. These computers crunch data much faster and more reliably than us. AR is the technology where we allow machines to present to us a data-processed representation of the world to enhance our knowledge and understanding.

In this way, AR uses a lot of artificial intelligence (AI) technologies. One way AR crosses with AI is in the area of computer vision. Computer vision is seen as a part of AI because it utilizes techniques for pattern recognition and computer learning. AR uses computer vision to recognize targets in your field of view, whether using specific coded markers, natural feature tracking (NFT), or other techniques to recognize objects or text. Once your app recognizes a target and establishes its location and orientation in the real world, it can generate computer graphics that align with that real-world transform, overlaid on top of the real-world imagery.
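To make the registration idea concrete, here is a minimal Unity C# sketch (not tied to any particular SDK) of what happens each frame once a target has been recognized: the pose reported by the tracker is simply copied onto the virtual content. The trackedTargetPose reference is hypothetical; in practice, Vuforia, ARKit, or ARCore supply this pose through their own components.

using UnityEngine;

// A minimal sketch of "registration": each frame, copy the pose that the AR tracker
// reports for a recognized target onto a virtual object so the graphics stay aligned
// with the real-world target. The trackedTargetPose reference is hypothetical.
public class RegisterToTarget : MonoBehaviour
{
    public Transform trackedTargetPose;   // pose reported by the tracker (world space)

    void Update()
    {
        if (trackedTargetPose == null) return;

        // Align the virtual object with the tracked target's position and orientation.
        transform.SetPositionAndRotation(trackedTargetPose.position,
                                         trackedTargetPose.rotation);
    }
}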

However, augmented reality is not just the combining of computer data with human senses. There's more to it than that. In his acclaimed 1997 research report, A Survey of Augmented Reality (http://www.cs.unc.edu/~azuma/ARpresence.pdf), Ronald Azuma proposed that AR must meet the following characteristics:

Combines real and virtual

Interactive in real time

Registered in 3D

AR is experienced in real time, not pre-recorded. Cinematic special effects, for example, that combine real action with computer graphics do not count as AR.

Also, the computer-generated display must be registered to the real 3D world. 2D overlays do not count as AR. By this definition, various head-up displays, such as in Iron Man or even Google Glass, are not AR. In AR, the app is aware of its 3D surroundings and graphics are registered to that space. From the user's point of view, AR graphics could actually be real objects physically sharing the space around them.

Throughout this book, we will emphasize these three characteristics of AR. Later in this chapter, we will explore the technologies that enable this fantastic combination of real and virtual, real-time interactions, and registration in 3D.

As wonderful as this AR future may seem, before moving on, it would be remiss not to highlight the alternative possible dystopian future of augmented reality! If you haven't seen it yet, we strongly recommend watching the Hyper-Reality video produced by artist Keiichi Matsuda (https://vimeo.com/166807261). This depiction of an incredible, frightening, yet very possible potential future infected with AR, as the artist explains, presents a provocative and kaleidoscopic new vision of the future, where physical and virtual realities have merged, and the city is saturated in media. But let's not worry about that right now. A screenshot of the video is as follows:

Augmented reality versus virtual reality

Virtual reality (VR) is a sister technology of AR. As described, AR augments your current experience in the real world by adding digital data to it. In contrast, VR magically, yet convincingly, transports you to a different (computer-generated) world. VR is intended to be a totally immersive experience in which you are no longer in the current environment. The sense of presence and immersion are critical for VR's success.

AR does not carry that burden of creating an entire world. For AR, it is sufficient for computer-generated graphics to be added to your existing world space. Although, as we'll see, that is not an easy accomplishment either and in some ways is much more difficult than VR. They have much in common, but AR and VR have contrasting technical challenges, market opportunities, and useful applications.

Although financial market projections change from month to month, analysts consistently agree that the combined VR/AR market will be huge, as much as $120 billion by 2021 (http://www.digi-capital.com/news/2017/01/after-mixed-year-mobile-ar-to-drive-108-billion-vrar-market-by-2021) with AR representing over 75 percent of that. This is in no way a rebuff of VR; its market will continue to be very big and growing, but it is projected to be dwarfed by AR.

Since VR is so immersive, its applications are inherently limited. As a user, the decision to put on a VR headset and enter into a VR experience is, well, a commitment. Seriously! You are deciding to move yourself from where you are now to a different place.

AR, however, brings virtual stuff to you. You physically stay where you are and augment that reality. This is a safer, less engaging, and more subtle transaction. It carries a lower barrier for market adoption and user acceptance.

VR headsets visually block off the real world. This is very intentional. No external light should seep into the view. In VR, everything you see is designed and produced by the application developer to create the VR experience. The technology design and development implications of this requirement are immense. A fundamental problem with VR is motion-to-photon latency. When you move your head, the VR image must update quickly, within about 11 milliseconds (90 frames per second), or you risk experiencing motion sickness. There are multiple theories why this happens (see https://en.wikipedia.org/wiki/Virtual_reality_sickness).

In AR, latency is much less of a problem because most of the visual field is the real world, either a video or optical see-through. You're less likely to experience vertigo when most of what you see is real world. Generally, there's a lot less graphics to render and less physics to calculate in each AR frame.

VR also imposes huge demands on your device's CPU and GPU processors to generate the 3D view for both left and right eyes. VR generates graphics for the entire scene as well as physics, animations, audio, and other processing requirements. Not as much rendering power is required by AR.

On the other hand, AR has an extra burden not borne by VR. AR must register its graphics with the real world. This can be quite complicated, computationally. When based on video processing, AR must engage image processing and pattern recognition in real time to find and follow the target markers. More complex devices use depth sensors to build and track a scanned model of your physical space in real time (Simultaneous Localization and Mapping, or SLAM). As we'll see, there are a number of ways AR applications manage this complexity, using simple target shapes or clever image recognition and matching algorithms with predefined natural images. Custom depth-sensing hardware and semiconductors, along with geolocation sensors, can also be used to calculate a 3D mesh of the user's environment in real time. This, in turn, is used to register the position and orientation of the computer graphics superimposed on the real-world visuals.

VR headsets ordinarily include headphones that, like the visual display, preferably block outside sounds in the real world so you can be fully immersed in the virtual one using spatial audio. In contrast, AR headsets provide open-back headphones or small speakers (instead of headphones) that allow the mix of real-world sounds with the spatial audio coming from the virtual scene.

Because of these inherent differences between AR and VR, the applications of these technologies can be quite different. In our opinion, a lot of applications presently being explored for VR will eventually find their home in AR instead. Even in cases where it's ambiguous whether the application could either augment the real world versus transport the user to a virtual space, the advantage of AR not isolating you from the real world will be key to the acceptance of these applications. Gaming will be prevalent with both AR and VR, albeit the games will be different. Cinematic storytelling and experiences that require immersive presence will continue to thrive in VR. But all other applications of 3D computer simulations may find their home in the AR market.

For developers, a key difference between VR and AR, especially when considering head-mounted wearable devices, is that VR is presently available in the form of consumer devices, such as Oculus Rift, HTC Vive, PlayStation VR, and Google Daydream, with millions of devices already in consumers' hands. Wearable AR devices are still in Beta release and quite expensive. That makes VR business opportunities more realistic and measurable. As a result, AR is largely confined to handheld (phone or tablet-based) apps for consumers, or if you delve into wearables, it's an internal corporate project, experimental project, or speculative product R&D.

How AR works

We've discussed what augmented reality is, but how does it work? As we said earlier, AR requires that we combine the real environment with a computer-generated virtual environment. The graphics are registered to the real 3D world. And, this must be done in real time.

There are a number of ways to accomplish this. In this book, we will consider just two. The first is the most common and accessible method: using a handheld mobile device such as a smartphone or tablet. Its camera captures the environment, and the computer graphics are rendered on the device's screen.

A second technique, using wearable AR smartglasses, is just emerging in commercial devices, such as Microsoft HoloLens and Metavision's Meta 2. This is an optical see-through of the real world, with computer graphics shown on a wearable near-eye display.

Handheld mobile AR

Using a handheld mobile device, such as a smartphone or tablet, augmented reality uses the device's camera to capture the video of the real world and combine it with virtual objects.

As illustrated in the following image, running an AR app on a mobile device, you simply point its camera to a target in the real world and the app will recognize the target and render a 3D computer graphic registered to the target's position and orientation. This is handheld mobile video see-through augmented reality:

We use the words handheld and mobile because we're using a handheld mobile device. We use video see-through because we're using the device's camera to capture reality, which will be combined with computer graphics. The AR video image is displayed on the device's flat screen.

Mobile devices have features important for AR, including the following:

Untethered and battery-powered

Flat panel graphic display touchscreen input

Rear-facing camera

CPU (main processor), GPU (graphics processor), and memory

Motion sensors, namely accelerometer for detecting linear motion and gyroscope for rotational motion

GPS and/or other position sensors for geolocation

Wireless and/or Wi-Fi data connection to the internet

Let's chat about each of these. First of all, mobile devices are... mobile.... Yeah, I know you get that. No wires. But what this really means is that like you, mobile devices are free to roam the real world. They are not tethered to a PC or other console. This is natural for AR because AR experiences take place in the real world, while moving around in the real world.

Mobile devices sport a flat panel color graphic display with excellent resolution and pixel density sufficient for handheld viewing distances. And, of course, the killer feature that helped catapult the iPhone revolution is the multitouch input sensor on the display that is used for interacting with the displayed images with your fingers.

A rear-facing camera is used to capture video from the real world and display it in real time on the screen. This video data is digital, so your AR app can modify it and combine virtual graphics in real time as well. This is a monocular image, captured from a single camera and thus a single viewpoint. Correspondingly, the computer graphics use a single viewpoint to render the virtual objects that go with it.

Today's mobile devices are quite powerful computers, including a CPU (main processor) and GPU (graphics processor), both of which are critical for AR: to recognize targets in the video, process sensor and user input, and render the combined video on the screen. These requirements continue to grow, pushing hardware manufacturers to deliver ever higher performance.

Built-in sensors that measure motion, orientation, and other conditions are also key to the success of mobile AR. An accelerometer is used for detecting linear motion along three axes and a gyroscope for detecting rotational motion around the three axes. Using real-time data from the sensors, the software can estimate the device's position and orientation in real 3D space at any given time. This data is used to determine the specific view the device's camera is capturing, and this 3D transformation is used to register the computer-generated graphics in 3D space as well.
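As a point of reference, the following minimal Unity C# sketch reads the raw motion sensors described above. The AR toolkits fuse this data with camera tracking internally, so you rarely read it yourself; this only illustrates what the device reports.

using UnityEngine;

// A minimal sketch of reading the raw motion sensors Unity exposes on mobile devices.
public class MotionSensorReader : MonoBehaviour
{
    void Start()
    {
        // The gyroscope must be enabled explicitly on most devices.
        Input.gyro.enabled = true;
    }

    void Update()
    {
        Vector3 linearAcceleration = Input.acceleration;  // accelerometer, in g units
        Quaternion deviceAttitude = Input.gyro.attitude;  // device rotation from the gyroscope

        Debug.Log("accel: " + linearAcceleration +
                  "  attitude: " + deviceAttitude.eulerAngles);
    }
}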

In addition, a GPS sensor can be used by applications that need to know where they are on the globe, for example, using AR to annotate a street view or a mountain range, or to find a rogue Pokémon.

Last but not least, mobile devices are enabled with wireless communication and/or Wi-Fi connections to the internet. Many AR apps require an internet connection, especially when a database of recognition targets or metadata needs to be accessed online.

Optical eyewear AR

In contrast to handheld mobiles, AR devices worn like eyeglasses or futuristic visors, such as Microsoft HoloLens and Metavision Meta, may be referred to as optical see-through eyewear augmented reality devices, or simply, smartglasses. As illustrated in the following image, they do not use video to capture and render the real world. Instead, you look directly through the visor and the computer graphics are optically merged with the scene:

The display technologies used to implement optical see-through AR vary from vendor to vendor, but the principles are similar. The glass that you look through while wearing the device is not a basic lens material that might be prescribed by your optometrist. It uses a combiner lens much like a beam splitter, with an angled surface that redirects a projected image coming from the side toward your eye.

An optical see-through display will mix the light from the real world with the virtual objects. Thus, brighter graphics are more visible and effective; darker areas may get lost. Black pixels are transparent. For similar reasons, these devices do not work great in brightly lit environments. You don't need a very dark room but dim lighting is more effective.

We can refer to these displays as binocular. You look through the visor with both eyes. Like VR headsets, there will be two separate views generated, one for each eye to account for parallax and enhance the perception of 3D. In real life, each eye sees a slightly different view in front, offset by the inter-pupillary distance between your eyes. The augmented computer graphics must also be drawn separately for each eye with similar offset viewpoints.

One such device is Microsoft HoloLens, a standalone mobile unit; Metavision Meta 2 can be tethered to a PC using its processing resources. Wearable AR headsets are packed with hardware, yet they must be in a form factor that is lightweight and ergonomic so they can be comfortably worn as you move around. The headsets typically include the following:

Lens optics, with a specific field of view

Forward-facing camera

Depth sensors for positional tracking and hand recognition

Accelerometer and gyroscope for linear and rotational motion detection

Near-ear audio speakers

Microphone

Furthermore, as a standalone device, you could say that HoloLens is like wearing a laptop wrapped around your head--hopefully, not for the weight but the processing capacity! It runs Windows 10 and must handle all the spatial and graphics processing itself. To assist, Microsoft developed a custom chip called the holographic processing unit (HPU) to complement the CPU and GPU.

Instead of headphones, wearable AR headsets often include near-ear speakers that don't block out environmental sounds. While handheld AR could also emit audio, it would come from the phone's speaker or the headphones you may have inserted into your ears. In either case, the audio would not be registered with the graphics. With wearable near-eye visual augmentation, it's safe to assume that your ears are close to your eyes. This enables the use of spatial audio for more convincing and immersive AR experiences.
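For illustration, here is a minimal Unity C# sketch of spatial audio: marking an AudioSource as fully 3D so its sound appears to come from the virtual object's position in the room. The clip field is just a placeholder for any short sound you assign in the Inspector.

using UnityEngine;

// A minimal sketch of spatial (3D) audio in Unity: a fully spatialized AudioSource
// attached to a virtual object so its sound appears to come from the object's position.
public class SpatializedSound : MonoBehaviour
{
    public AudioClip clip;   // placeholder for any short sound

    void Start()
    {
        AudioSource source = gameObject.AddComponent<AudioSource>();
        source.clip = clip;
        source.loop = true;
        source.spatialBlend = 1.0f;                         // 1 = fully 3D, 0 = 2D stereo
        source.rolloffMode = AudioRolloffMode.Logarithmic;  // attenuate with distance
        source.Play();
    }
}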

Target-based AR

The following image illustrates a more traditional target-based AR. The device camera captures a frame of video. The software analyzes the frame looking for a familiar target, such as a pre-programmed marker, using a technique called photogrammetry. As part of target detection, its deformation (for example, size and skew) is analyzed to determine its distance, position, and orientation relative to the camera in a three-dimensional space.

From that, the camera pose (position and orientation) in 3D space is determined. These values are then used in the computer graphics calculations to render virtual objects. Finally, the rendered graphics are merged with the video frame and displayed to the user:

iOS and Android phones typically have a refresh rate of 60 Hz. This means the image on your screen is updated 60 times a second, or about 16.7 milliseconds per frame. A lot of work goes into this quick update. Also, much effort has been invested in optimizing the software to minimize wasted calculations, eliminate redundancy, and apply other tricks that improve performance without negatively impacting the user experience. For example, once a target has been recognized, the software will simply track and follow it as it appears to move from one frame to the next, rather than re-recognizing the target from scratch each time.

To interact with virtual objects on your mobile screen, the input processing required is a lot like any mobile app or game. As illustrated in the following image, the app detects a touch event on the screen. Then, it determines which object you intended to tap by mathematically casting a ray from the screen's XY position into 3D space, using the current camera pose. If the ray intersects a detectable object, the app may respond to the tap (for example, move or modify the geometry). The next time the frame is updated, these changes will be rendered on the screen:
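The tap-to-select step described above can be sketched in a few lines of Unity C#. This is a generic example, not tied to a specific AR SDK; arCamera is assumed to be the camera whose pose the toolkit is driving, and the virtual objects are assumed to have colliders.

using UnityEngine;

// A minimal sketch of tap-to-select: cast a ray from the touch position through the
// AR camera's view and see which collider it hits.
public class TapSelector : MonoBehaviour
{
    public Camera arCamera;   // the camera driven by the AR toolkit

    void Update()
    {
        if (Input.touchCount == 0) return;

        Touch touch = Input.GetTouch(0);
        if (touch.phase != TouchPhase.Began) return;

        Ray ray = arCamera.ScreenPointToRay(touch.position);
        RaycastHit hit;
        if (Physics.Raycast(ray, out hit, 100f))
        {
            // Respond to the tap, for example move or modify the tapped object.
            Debug.Log("Tapped: " + hit.collider.name);
        }
    }
}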

A distinguishing characteristic of handheld mobile AR is that you experience it from an arm's length viewpoint. Holding the device out in front of you, you look through its screen like a portal to the augmented real world. The field of view is defined by the size of the device screen and how close you're holding it to your face. And it's not entirely a hands-free experience because unless you're using a tripod or something to hold the device, you're using one or two hands to hold the device at all times.

Snapchat's popular augmented reality selfies go even further. Using the phone's front-facing camera, the app analyzes your face using complex AI pattern matching algorithms to identify significant points, or nodes, that correspond to the features of your face--eyes, nose, lips, chin, and so on. It then constructs a 3D mesh, like a mask of your face. Using that, it can apply alternative graphics that match up with your facial features and even morph and distort your actual face for play and entertainment. See this video from Vox for a detailed explanation: https://www.youtube.com/watch?v=Pc2aJxnmzh0. The ability to do all of this in real time is remarkably fun and serious business:

Perhaps, by the time you are reading this book, there will be mobile devices with built-in depth sensors, including Google Project Tango and Intel RealSense technologies, capable of scanning the environment and building a 3D spatial map mesh that could be used for more advanced tracking and interactions. We will explain these capabilities in the next topic and explore them in this book in the context of wearable AR headsets, but they may apply to new mobile devices too.

3D spatial mapping

Handheld mobile AR, described in the previous topic, is mostly about augmenting 2D video with regard to the phone camera's location in 3D space. Optical wearable AR devices are completely about 3D data. Yes, like mobile AR, wearable AR devices can do target-based tracking using their built-in cameras. But wait, there's more, much more!

These devices include depth sensors that scan your environment and construct a spatial map (3D mesh) of your environment. With this, you can register objects to specific surfaces without the need for special markers or a database of target images for tracking.

A depth sensor measures the distance of solid surfaces from you, using an infrared (IR) camera and projector. It projects IR dots into the environment (not visible to the naked eye) in a pattern that is then read by its IR camera and analyzed by the software (and/or hardware). On nearer objects, the spread of the dot pattern is different from that on farther ones; depth is calculated from this displacement. Analysis is not performed on just a single snapshot but across multiple frames over time to provide more accuracy, so the spatial model can be continuously refined and updated.
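As a rough illustration (simplified; real sensors and their firmware do considerably more), the displacement-to-depth relationship is essentially stereo triangulation between the IR projector and the IR camera:

$$ Z \approx \frac{f \, b}{d} $$

where $Z$ is the distance to the surface, $f$ is the camera focal length (in pixels), $b$ is the baseline between the projector and the camera, and $d$ is the observed disparity (shift) of the dot pattern.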

A visible light camera may also be used in conjunction with the depth sensor data to further improve the spatial map. Using photogrammetry techniques, visible features in the scene are identified as a set of points (nodes) and tracked across multiple video frames. The 3D position of each node is calculated using triangulation.

From this, we get a good 3D mesh representation of the space, including the ability to discern separate objects that may occlude (be in front of) other objects. Other sensors locate the user's actual head in the real world, providing the user's own position and view of the scene. This technique is called SLAM. It was originally developed for robotics applications; a seminal 2002 paper on the topic by Andrew Davison of the University of Oxford can be found at https://www.doc.ic.ac.uk/~ajd/Publications/davison_cml2002.pdf.

A cool thing about present-day implementations of SLAM is how the data is continuously updated in response to real-time sensor readings from your device.

"As the HoloLens gathers new data about the environment, and as changes to the environment occur, spatial surfaces will appear, disappear and change." (https://developer.microsoft.com/en-us/windows/holographic/spatial_mapping)

The following illustration shows what occurs during each update frame. The device uses current readings from its sensors to maintain the spatial map and calculate the virtual camera pose. This camera transformation is then used to render views of the virtual objects registered to the mesh. The scene is rendered twice, for the left and right eye views. The computer graphics are displayed on the head-mounted visor glass and will be visible to the user as if it were really there--virtual objects sharing space with real world physical objects:

That said, spatial mapping is not limited to devices with depth sensing cameras. Using clever photogrammetry techniques, much can be accomplished in software alone. The Apple iOS ARKit, for example, uses just the video camera of the mobile device, processing each frame together with its various positional and motion sensors to fuse the data into a 3D point cloud representation of the environment. Google ARCore works similarly. The Vuforia SDK has a similar tool, albeit more limited, called Smart Terrain.

Developing AR with spatial mapping

Spatial mapping is the representation of all of the information the app has from its sensors about the real world. It is used to render virtual AR world objects. Specifically, spatial mapping is used to do the following:

Help virtual objects or characters navigate around the room

Have virtual objects occlude a real object or be occluded by a real object

Have virtual objects interact with the real world, such as bouncing off the floor

Place a virtual object onto a real object

Show the user a visualization of the room they are in

In video game development, a level designer's job is to create the fantasy world stage, including terrains, buildings, passageways, obstacles, and so on. The Unity game development platform has great tools to constrain the navigation of objects and characters within the physical constraints of the level. Game developers, for example, add simplified geometry, or navmesh, derived from a detailed level design; it is used to constrain the movement of characters within a scene. In many ways, the AR spatial map acts like a navmesh for your virtual AR objects.

A spatial map, while just a mesh, is 3D and does represent the surfaces of solid objects, not just walls and floors but furniture. When your virtual object moves behind a real object, the map can be used to occlude virtual objects with real-world objects when it's rendered on the display. Normally, occlusion is not possible without a spatial map.

When a spatial map has collider properties, it can be used to interact with virtual objects, letting them bump into or bounce off real-world surfaces.
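Here is a minimal Unity C# sketch of that idea, assuming you already have a surface mesh produced by spatial mapping; the spatialMapMesh and ballPrefab references are hypothetical, and toolkits such as the Mixed Reality Toolkit can add these colliders for you. Adding a MeshCollider to the scanned mesh is enough for ordinary physics objects to bounce off real-world surfaces.

using UnityEngine;

// A minimal sketch of physics against the scanned environment: give the spatial-mapping
// mesh a MeshCollider so ordinary Rigidbody objects bounce off real-world surfaces.
public class SpatialMapPhysics : MonoBehaviour
{
    public MeshFilter spatialMapMesh;   // a surface mesh produced by spatial mapping (hypothetical)
    public GameObject ballPrefab;       // a sphere with a Rigidbody and a bouncy material

    void Start()
    {
        // Make the scanned surface solid for the physics engine.
        MeshCollider meshCollider = spatialMapMesh.gameObject.AddComponent<MeshCollider>();
        meshCollider.sharedMesh = spatialMapMesh.sharedMesh;

        // Drop a ball half a meter above the surface; it lands on and bounces off the mesh.
        Instantiate(ballPrefab,
                    spatialMapMesh.transform.position + Vector3.up * 0.5f,
                    Quaternion.identity);
    }
}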

Lastly, a spatial map could be used to transform physical objects directly. For example, since we know where the walls are, we can paint them a different color in AR.

This can get pretty complicated. A spatial map is just a triangular mesh. How can your application code determine physical objects from that? It's difficult but not an unsolvable problem. In fact, the HoloLens toolkit, for example, includes a spatialUnderstanding module that analyzes the spatial map and does higher level identification, such as identification of floor, ceiling, and walls, using techniques such as ray casting, topology queries, and shape queries.

Spatial mapping can encompass a whole lot of data that could overwhelm the processing resources of your device and deliver an underwhelming user experience. HoloLens, for example, mitigates this by letting you subdivide your physical space into what they call spatial surface observers, which in turn contain a set of spatial surfaces. An observer is a bounding volume that defines a region of space with mapping data as one or more surfaces. A surface is a triangle 3D mesh in real-world 3D space. Organizing and partitioning space reduces the dataset needed to be tracked, analyzed, and rendered for a given interaction.

For more information on spatial mapping with HoloLens and Unity, refer to https://developer.microsoft.com/en-us/windows/mixed-reality/spatial_mapping and https://developer.microsoft.com/en-us/windows/mixed-reality/spatial_mapping_in_unity.

Input for wearable AR

Ordinarily, AR eyewear devices use neither a game controller or clicker nor positionally tracked hand controllers. Instead, you use your hands. Hand gesture recognition is another challenging AI problem for computer vision and image processing.

In conjunction with tracking where the user is looking (gaze), gestures are used to trigger events such as select, grab, and move. Assuming the device does not support eye tracking (moving your eyes without moving your head), the gaze reticle is normally at the center of your view. You must move your head to point at the object of interest that you want to interact with:
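A minimal Unity C# sketch of such a head-gaze cursor follows. It assumes the main camera is being driven by the headset's tracking and that real or virtual surfaces have colliders (on a wearable device, the spatial mapping mesh typically provides them); the reticle object is hypothetical.

using UnityEngine;

// A minimal sketch of a head-gaze cursor: cast a ray straight out from the user's head
// (the main camera on a wearable device) and place a reticle where it hits a surface.
public class GazeReticle : MonoBehaviour
{
    public Transform reticle;          // a small quad or sphere used as the cursor
    public float maxDistance = 5.0f;   // default distance when nothing is hit

    void Update()
    {
        Transform head = Camera.main.transform;
        RaycastHit hit;

        if (Physics.Raycast(head.position, head.forward, out hit, maxDistance))
        {
            // Snap the reticle onto the surface the user is looking at.
            reticle.position = hit.point;
            reticle.rotation = Quaternion.LookRotation(hit.normal);
        }
        else
        {
            // Otherwise float the reticle at a fixed distance in front of the user.
            reticle.position = head.position + head.forward * maxDistance;
            reticle.rotation = head.rotation;
        }
    }
}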