Robot Learning by Visual Observation

Aleksandar Vakanski

Description

This book presents programming by demonstration for robot learning from observations, with a focus on the trajectory level of task abstraction. The book:
- Discusses methods for optimization of task reproduction, such as reformulation of task planning as a constrained optimization problem
- Focuses on regression approaches, such as Gaussian mixture regression, spline regression, and locally weighted regression
- Concentrates on the use of vision sensors for capturing motions and actions during task demonstration by a human task expert




Table of Contents

Cover

Title Page

Preface

List of Abbreviations

1 Introduction

1.1 Robot Programming Methods

1.2 Programming by Demonstration

1.3 Historical Overview of Robot PbD

1.4 PbD System Architecture

1.5 Applications

1.6 Research Challenges

1.7 Summary

References

2 Task Perception

2.1 Optical Tracking Systems

2.2 Vision Cameras

2.3 Summary

References

3 Task Representation

3.1 Level of Abstraction

3.2 Probabilistic Learning

3.3 Data Scaling and Aligning

3.4 Summary

References

4 Task Modeling

4.1 Gaussian Mixture Model (GMM)

4.2 Hidden Markov Model (HMM)

4.3 Conditional Random Fields (CRFs)

4.4 Dynamic Motion Primitives (DMPs)

4.5 Summary

References

5 Task Planning

5.1 Gaussian Mixture Regression

5.2 Spline Regression

5.3 Locally Weighted Regression

5.4 Gaussian Process Regression

5.5 Summary

References

6 Task Execution

6.1 Background and Related Work

6.2 Kinematic Robot Control

6.3 Vision‐Based Trajectory Tracking Control

6.4 Image‐Based Task Planning

6.5 Robust Image‐Based Tracking Control

6.6 Discussion

6.7 Summary

References

Index

End User License Agreement

List of Tables

Chapter 05

Table 5.1 LBG algorithm.

Table 5.2 Mean values and standard deviations of the computation times for learning the trajectories from Experiment 1 and Experiment 2.

Table 5.3 Mean values and standard deviations of the computation times for learning the trajectories from Experiment 1 and Experiment 2 by applying the GMM/GMR approach.

Table 5.4 Means and standard deviations of the classification rates obtained by CRF for the painting task in Experiment 1.

Table 5.5 Means and standard deviations of the classification rates obtained by CRF for the peening task in Experiment 2.

Table 5.6 Means and standard deviations of the classification rates obtained by HMM and CRF for the painting task from Experiment 1.

Table 5.7 Means and standard deviations of the classification rates obtained by HMM and CRF for the peening task from Experiment 2.

Chapter 06

Table 6.1 Evaluation of the trajectory tracking with IBVS under intrinsic camera parameter errors ranging from 0 to 80%.

Table 6.2 Evaluation of the trajectory tracking without vision‐based control under intrinsic camera parameter errors ranging from 0 to 80%.

Table 6.3 Coordinates of the object position at the end of the task, expressed in the robot base frame (in millimeters), under extrinsic camera parameter errors ranging from 0 to 80%.

List of Illustrations

Chapter 01

Figure 1.1 Classification of robot programming methods.

Figure 1.2 The user demonstrates the task in front of a robot learner, and is afterward actively involved in the learning process by moving the robot’s arms during the task reproduction attempts to refine the learned skills (Calinon and Billard, 2007a).

Figure 1.3 Block diagram of the information flow in a general robot PbD system.

Figure 1.4 Kinesthetic teaching of feasible postures in a confined workspace. During kinesthetic teaching the human operator physically grabs the robot and executes the task.

Figure 1.5 The PbD setup for teaching peg‐in‐hole assembly tasks includes a teleoperated robot gripper and the objects manipulated by a human expert. Tracking is done using magnetic sensors.

Figure 1.6 Teleoperation scheme for PbD—master arm (on the left) and slave arm (on the right) used for human demonstrations.

Figure 1.7 AR training of an assembly task using adaptive visual aids (AVAs).

Figure 1.8 Mobile AR component including a haptic bracelet.

Figure 1.9 Sensory systems used for PbD task observation.

Figure 1.10 Learning levels in PbD.

Figure 1.11 Learning at a symbolic level of abstraction by representing the decomposed task into a hierarchy of motion primitives.

Figure 1.12 A humanoid robot is learning and reproducing trajectories for a figure‐8 movement from human demonstrations.

Figure 1.13 Control of a 19 DoFs humanoid robot using PbD.

Figure 1.14 A kitchen helping robot learns the sequence of actions for cooking from observation of human demonstrations.

Figure 1.15 The experimental setup used for teaching (on the left) includes an ultrasound machine, an ultrasound phantom model and a handheld ultrasound transducer with force sensing and built‐in 3D position markers for optical tracking system. The robotically controlled ultrasound scanning is also shown (on the right).

Figure 1.16 Robot grasp planning application.

Chapter 02

Figure 2.1 (a) Position camera sensors of the optical tracking system Optotrak Certus; (b) optical markers attached on a tool are tracked during a demonstration of a “figure 8” motion.

Chapter 03

Figure 3.1 (a) Two sequences with different number of measurements: a reference sequence of 600 measurement data points and a test sequence of 800 measurement data points; (b) the test sequence is linearly scaled to the same number of measurements as the reference sequence; and (c) the test sequence is aligned with the reference sequence using DTW.

Chapter 04

Figure 4.1 Graphical representation of an HMM. The shaded nodes depict the sequence of observed elements, and the white nodes depict the sequence of hidden states.

Figure 4.2 Graphical representation of a CRF with linear chain structure.

Chapter 05

Figure 5.1 An example of initial selection of trajectory key points with the LBG algorithm. The key points are indicated using circles. The following input features are used for clustering: (a) normalized position coordinates; (b) normalized velocities; and (c) normalized positions and velocities.

Figure 5.2 Illustration of the weighted curve fitting. For clusters with small variance of the key points, high weights for spline fitting are assigned, whereas for clusters with high variance of the key points, low weights are assigned, which results in loose fitting.

Figure 5.3 Diagram representation of the presented approach for robot PbD. The solid lines depict automatic steps in the data processing. For a set of observed trajectories X1, …, XM, the algorithm automatically generates a generalized trajectory Xgen, which is transferred to a robot for task reproduction.

Figure 5.4 (a) Experimental setup for Experiment 1: panel, painting tool, and reference frame; (b) perception of the demonstrations with the optical tracking system; and (c) the set of demonstrated trajectories.

Figure 5.5 Distributions of (a) velocities, (b) accelerations, and (c) jerks of the trajectories demonstrated by the four subjects. The bottom and top lines of the boxes plot the 25th and 75th percentiles of the distributions, the bands in the middle represent the medians, and the whiskers display the minimum and maximum of the data.

Figure 5.6 Initial assignment of key points for the trajectory with minimum distortion.

Figure 5.7 Spatiotemporally aligned key points from all demonstrations. For the parts of the demonstrations which correspond to approaching and departing of the tool with respect to the panel, the clusters of key points are more scattered, when compared to the painting part of the demonstrations.

Figure 5.8 (a) RMS errors for the clusters of key points, (b) weighting coefficients with threshold values of 1/2 and 2 standard deviations, and (c) weighting coefficients with threshold values of 1/6 and 6 standard deviations.

Figure 5.9 Generalization of the tool orientation from the spatiotemporally aligned key points. Roll angles are represented by a dashed line, pitch angles are represented by a solid line, and yaw angles are represented by a dash–dotted line. The dots in the plot represent the orientation angles of the key points.

Figure 5.10 Generalized trajectory for the Cartesian x–y–z position coordinates of the object.

Figure 5.11 Distributions of (a) velocities, (b) accelerations, and (c) jerks for the demonstrated trajectories by the subjects and the generalized trajectory.

Figure 5.12 (a) The part used for Experiment 2, with the surfaces to be painted bordered with solid lines and (b) the set of demonstrated trajectories.

Figure 5.13 (a) Generalized trajectory for Experiment 2 and (b) execution of the trajectory by the robot learner.

Figure 5.14 Generated trajectory for reproduction of the task of panel painting for Experiment 1, based on Calinon and Billard (2004). The most consistent trajectory (dashed line) corresponds to the observation sequence with the highest likelihood of being generated by the learned HMM.

Figure 5.15 RMS differences for the reproduction trajectories generated by the presented approach (XG1), the approaches proposed in Calinon and Billard (2004) (XG2) and Asfour et al. (2006) (XG3), and the demonstrated trajectories (X1–X12). As the color bar on the right side indicates, lighter shades of the cells depict greater RMS differences.

Figure 5.16 Cumulative sums of the RMS differences for the reproduction trajectories generated by the presented approach (XG1), the approaches proposed in Calinon and Billard (2004) (XG2), and Asfour et al. (2006) (XG3). (a) Demonstrated trajectories (X1–X12) from Experiment 1 and (b) demonstrated trajectories (X1–X5) from Experiment 2.

Figure 5.17 Generalized trajectory obtained by the GMM/GMR method (Calinon, 2009) for (a) Experiment 1 and (b) Experiment 2, in Section 4.4.4.

Figure 5.18 (a) Experimental setup for Experiment 1 showing the optical tracker, the tool with attached markers and the object for painting. (b) One of the demonstrated trajectories with the initially selected key points. The arrow indicates the direction of the tool’s motion. (c) Demonstrated trajectories and the generalized trajectory.

Figure 5.19 (a) Plot of a sample demonstrated trajectory for the peening task from Experiment 2 and a set of initially selected key points, and (b) generalized trajectory for the peening experiment.

Chapter 06

Figure 6.1 Response of classical IBVS: feature trajectories in the image plane, camera trajectory in Cartesian space, camera velocities, and feature errors in the image plane.

Figure 6.2 Response of classical PBVS: feature trajectories in the image plane, camera trajectory in Cartesian space, camera velocities, and feature errors in the image plane.

Figure 6.3 The learning cell, consisting of a robot, a camera, and an object manipulated by the robot. The assigned coordinate frames are: camera frame c (Oc, xc, yc, zc), object frame o (Oo, xo, yo, zo), robot base frame b (Ob, xb, yb, zb), and robot’s end‐point frame e (Oe, xe, ye, ze). The transformation between a frame i and a frame j is given by a position vector and a rotation matrix.

Figure 6.4 (a) The eigenvectors of the covariance matrix, ê1 and ê2, for three demonstrations at times k = 10, 30, and 44; (b) observed parameters for feature 1. The vector is required to lie in the region bounded by ηmin and ηmax.

Figure 6.5 (a) Three demonstrated trajectories of the object in the Cartesian space. For one of the trajectories, the initial and ending coordinate frames of the object are shown, along with the six coplanar features; (b) projected demonstrated trajectories of the feature points onto the image plane of the camera; (c) reference image feature trajectories produced by Kalman smoothing and generalized trajectories produced by the optimization model; and (d) object velocities from the optimization model.

Figure 6.6 (a) Demonstrated Cartesian trajectories of the object, with the feature points and the initial and ending object frames; (b) demonstrated and reference linear velocities of the object for x‐ and y‐coordinates of the motions; (c) reference image feature trajectories from the Kalman smoothing and the corresponding generalized trajectories from the optimization; and (d) the demonstrated and retrieved generalized object trajectories in the Cartesian space. The initial state and the ending state are depicted with square and cross marks, respectively.

Figure 6.7 (a) Experimental setup showing the robot in the home configuration and the camera. The coordinate axes of the robot base frame and the camera frame are depicted; (b) the object with the coordinate frame axes and the features.

Figure 6.8 Sequence of images from the kinesthetic demonstrations.

Figure 6.9 (a) Feature trajectories in the image space for one sample demonstration; (b) demonstrated trajectories, Kalman‐smoothed (reference) trajectory, and corresponding planned trajectory for one feature point (feature no. 2); (c) demonstrated linear and angular velocities of the object and the reference velocities obtained by Kalman smoothing; (d) Kalman‐smoothed (reference) image feature trajectories and the generalized trajectories obtained from the optimization procedure; and (e) demonstrated and generated Cartesian trajectories of the object in the robot base frame.

Figure 6.10 (a) Desired and robot‐executed feature trajectories in the image space; (b) tracking errors for the pixel coordinates (u, v) of the five image features in the image space; (c) tracking errors for x‐, y‐, and z‐coordinates of the object in the Cartesian space; (d) translational velocities of the object from the IBVS tracker.

Figure 6.11 Task execution without optimization of the trajectories: (a) desired and robot‐executed feature trajectories in the image space; (b) tracking errors for the pixel coordinates (u, v) of the five image features in the image space; (c) tracking errors for x‐, y‐, and z‐coordinates of the object in the Cartesian space; (d) translational velocities of the object from the IBVS tracker.

Figure 6.12 (a) Demonstrated trajectories of the image feature points, superimposed with the desired and robot‐executed feature trajectories; (b) desired and executed trajectories for the slowed‐down motion.

Figure 6.13 (a) Image feature trajectories for one of the demonstrations; (b) demonstrated trajectories, reference trajectory from the Kalman smoothing, and the corresponding generalized trajectory for one of the feature points; (c) desired and robot‐executed image feature trajectories; (d) translational velocities of the object from the IBVS tracker; (e) tracking errors for pixel coordinates (u, v) of the five image features in the image space; and (f) tracking errors for x‐, y‐, and z‐coordinates of the object in the Cartesian space.

Figure 6.14 Projected trajectories of the feature points in the image space with (a) errors of 5, 10, and 20% introduced for all camera intrinsic parameters; (b) errors of 5% introduced for the focal length scaling factors of the camera.



Robot Learning by Visual Observation

Aleksandar Vakanski

Farrokh Janabi-Sharifi

 

 

 

 

 

 

 

 

 

This edition first published 2017. © 2017 John Wiley & Sons, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Aleksandar Vakanski and Farrokh Janabi‐Sharifi to be identified as the author(s) of this work has been asserted in accordance with law.

Registered Offices: John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

Editorial Office: 111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty

The publisher and the authors make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of fitness for a particular purpose. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for every situation. In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. The fact that an organization or website is referred to in this work as a citation and/or potential source of further information does not mean that the author or the publisher endorses the information the organization or website may provide or recommendations it may make. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. No warranty may be created or extended by any promotional statements for this work. Neither the publisher nor the author shall be liable for any damages arising herefrom.

Library of Congress Cataloguing‐in‐Publication Data

Names: Vakanski, Aleksandar, author. | Janabi-Sharifi, Farrokh, 1959– author.
Title: Robot learning by visual observation / Aleksandar Vakanski, Farrokh Janabi‐Sharifi.
Description: Hoboken, NJ, USA : John Wiley & Sons, Inc., [2017] | Includes bibliographical references and index.
Identifiers: LCCN 2016041712 | ISBN 9781119091806 (cloth) | ISBN 9781119091783 | ISBN 9781119091998 (epub) | ISBN 9781119091783 (epdf)
Subjects: LCSH: Robot vision. | Machine learning. | Robots--Control systems. | Automatic programming (Computer science)
Classification: LCC TJ211.3 .V35 2017 | DDC 629.8/92631-dc23
LC record available at https://lccn.loc.gov/2016041712

Cover image: fandijki/Gettyimages

Cover design by Wiley

 

 

 

 

 

To our families

Preface

The ability to transfer knowledge has played a quintessential role in the advancement of our species. Several evolutionary innovations have significantly leveraged knowledge transfer. One example is the rewiring of the neuronal networks in primates’ brains to form the so‐called mirror neuron systems, so that when we observe tasks performed by others, a section of the brain responsible for observation and a section responsible for motor control are concurrently active. Through this, when observing actions, the brain is at the same time attempting to learn how to reproduce those actions. The mirror neuron system represents an especially important learning mechanism among toddlers and young children, stimulating them to acquire skills by imitating the actions of adults around them. However, evolutionary processes and modifications are very slow and wasteful, and as we developed further, we came to rely on our creativity to invent novel means of transferring knowledge. By inventing writing and alphabets as language complements, we were able to record, share, and communicate knowledge at an accelerated rate. Other innovations that followed, such as the printing press, the typewriter, television, personal computers, and the World Wide Web, have each revolutionized our ability to share knowledge and redefined the foundations for our current level of technological advancement.

As our tools and machines have grown more advanced and sophisticated, society has recognized a need to transfer knowledge to the tools in order to improve efficiency and productivity, or to reduce effort and costs. For instance, in the manufacturing industry, robotic technology has emerged as a principal means of addressing the increased demand for accuracy, speed, and repeatability. However, despite the continuous growth in the number of robotic applications across various domains, the lack of interfaces for quick transfer of knowledge, combined with the lack of intelligence and reasoning abilities, has in practice limited the operation of robots to preprogrammed repetitive tasks performed in structured environments. Robot programming by demonstration (PbD) is a promising approach for transferring new skills to robots from observation of skill examples performed by a demonstrator. Borrowed from the observational imitation learning mechanisms among humans, PbD has the potential to reduce the costs of developing robotic applications in industry. The intuitive programming style of PbD can allow robot programming by end‐users who are experts in performing an industrial task but may not necessarily have programming or technical skills. From a broader perspective, another important motivation for the development of robot PbD systems is humankind’s old dream of robotic assistance in performing everyday domestic tasks. Future advancements in PbD would allow the general population to program domestic and service robots in a natural way by demonstrating the required task in front of a robot learner.

Arguably, robot PbD is currently facing various challenges, and its progress depends on advancements in several other research disciplines. On the other hand, the strong demand for new robotic applications across a wide range of domains, combined with the reduced cost of actuators, sensors, and processing memory, is driving unprecedented progress in the field of robotics. Consequently, a major motivation for writing this book is our hope that the next advancements in PbD can further increase the number of robotic applications in industry and can speed up the advent of robots into our homes and offices for assistance in performing daily tasks.

The book attempts to summarize the recent progress in the robot PbD field. The emphasis is on approaches for probabilistic learning of tasks at a trajectory level of abstraction. The probabilistic representation of human motions provides a basis for encapsulating relevant information from multiple demonstrated examples of a task. The book presents examples of learning the industrial tasks of painting and shot peening by employing hidden Markov models (HMMs) and conditional random fields (CRFs) to probabilistically encode the tasks. Another aspect of robot PbD covered in depth is the integration of vision‐based control in PbD systems. The presented methodology for visual learning performs all the steps of a PbD process in the image space of a vision camera. The advantage of such a learning approach is the enhanced robustness to modeling and measurement errors.

The book is written at a level that requires a background in robotics and artificial intelligence. The targeted audience consists of researchers and educators in the field, graduate students, undergraduate students with technical knowledge, companies that develop robotic applications, and enthusiasts interested in expanding their knowledge of robot learning. The reader can benefit from the book by grasping the fundamentals of vision‐based learning for robot programming and by using the ideas in research and development or in educational activities related to robotic technology.

We would like to acknowledge the help of several collaborators and researchers who made the publication of this book possible. We would like to thank Dr. Iraj Mantegh from the National Research Council (NRC) Aerospace Manufacturing Technology Centre (AMTC) in Montréal, Canada, for his valuable contributions toward the presented approaches for robotic learning of industrial tasks using HMMs and CRFs. We are also thankful to Andrew Irish for his collaboration on the aforementioned projects conducted at NRC Canada. We acknowledge the support of Ryerson University for access to pertinent resources and facilities, and of the Natural Sciences and Engineering Research Council of Canada (NSERC) for supporting the research presented in the book. We also thank the members of the Robotics, Mechatronics and Automation Laboratory at Ryerson University for their help and support. Particular thanks go to Dr. Abdul Afram and Dr. Shahir Hasanzadeh, who provided useful comments for improving the readability of the book. Lastly, we would like to express our gratitude to our families for their love, motivation, and encouragement during the preparation of the book.

List of Abbreviations

CRF    conditional random field
DMPs   dynamic motion primitives
DoFs   degrees of freedom
DTW    dynamic time warping
GMM    Gaussian mixture model
GMR    Gaussian mixture regression
GPR    Gaussian process regression
HMM    hidden Markov model
IBVS   image‐based visual servoing
LBG    Linde–Buzo–Gray (algorithm)
PbD    programming by demonstration
PBVS   position‐based visual servoing
RMS    root mean square
SE     special Euclidean group

1 Introduction

Robot programming is the specification of the desired motions of the robot such that it may perform sequences of prestored motions or motions computed as functions of sensory input (Lozano‐Pérez, 1983).

In today’s competitive global economy, shortened product life cycles and product diversification have pushed the manufacturing industry to adopt more flexible approaches. Meanwhile, advances in automated flexible manufacturing have made robotic technology an intriguing prospect for small‐ and medium‐sized enterprises (SMEs). However, the complexity of robot programming remains one of the major barriers to adopting robotic technology in SMEs. Moreover, owing to the strong competition in the global robot market, each of the main robot manufacturers has historically developed its own proprietary robot software, which further aggravates the matter. As a result, the cost of integrating robotic tasks can be many times the cost of the robot purchase. On the other hand, the applications of robots have expanded well beyond manufacturing to domains such as household services, where a robot programmer’s intervention would be scarce or even impossible. Interaction with robots is increasingly becoming a part of humans’ daily activities. Therefore, there is an urgent need for new programming paradigms that enable novice users to program and interact with robots. Among the variety of robot programming approaches, programming by demonstration (PbD) holds great potential to overcome the complexities of many programming methods.

This introductory chapter reviews programming approaches and illustrates the position of PbD in the spectrum of robot programming techniques. The PbD architecture is explained next. The chapter continues with applications of PbD and concludes with an outline of the open research problems in PbD.

1.1 Robot Programming Methods

A categorization of the robot programming modes based on the taxonomy reported by Biggs and MacDonald (2003) is illustrated in Figure 1.1. The conventional methods for robot programming are classified into manual and automatic, both of which rely heavily on expensive programming expertise for encoding desired robot motions into executable programs.

Figure 1.1 Classification of robot programming methods.

(Data from Biggs and MacDonald (2003).)

The manual programming systems involve text‐based programming and graphical interfaces. In text‐based programming, a user develops program code using either a controller‐specific programming language or extensions of a high‐level multipurpose language, for example, C++ or Java (Kanayama and Wu, 2000; Hopler and Otter, 2001; Thamma et al., 2004). In both cases, developing the program code is time‐consuming and tedious: it requires a robot programming expert and an equipped programming facility, and the outcome relies on the programmer’s ability to correctly encode the required robot performance (a hypothetical example of such an explicit motion script is sketched below). Moreover, since robot manufacturers have developed proprietary programming languages, programming is even more expensive in industrial environments that combine robots from different manufacturers. The graphical programming systems employ graphs, flowcharts, or diagrams as a medium for creating program code (Dai and Kampker, 2000; Bischoff et al., 2002). In these systems, low‐level robot actions are represented by blocks or icons in a graphical interface, and the user creates programs by composing sequences of elementary operations through combinations of the graphical units. A subclass of the graphical programming systems is the robotic simulators, which create a virtual model of the robot and the working environment, whereby the virtual robot is employed to emulate the motions of the actual robot (Rooks, 1997). Since the actual robot is not utilized during the program development phase, this programming method is referred to as off‐line programming (OLP).
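To make the preceding point concrete, the following is a purely hypothetical, minimal sketch of what a text‐based robot program can look like. The RobotController class and its methods are invented here for illustration only and do not correspond to any vendor’s actual programming language or API; the point is that every pose, speed, and gripper action must be specified explicitly by an expert.

```python
# Hypothetical text-based robot program (illustration only; the
# RobotController API below is invented, not a real vendor library).
class RobotController:
    def connect(self, address):
        print(f"connected to {address}")

    def move_joints(self, joint_angles_deg, speed=0.2):
        print(f"joint move to {joint_angles_deg} at speed {speed}")

    def move_linear(self, xyz_mm, rpy_deg, speed=0.1):
        print(f"linear move to {xyz_mm}, orientation {rpy_deg}, speed {speed}")

    def set_gripper(self, closed):
        print("gripper closed" if closed else "gripper open")

robot = RobotController()
robot.connect("192.168.0.10")
robot.move_joints([0, -90, 90, 0, 90, 0])                     # home configuration
robot.move_linear([450.0, 120.0, 300.0], [180, 0, 90])        # approach above the part
robot.move_linear([450.0, 120.0, 210.0], [180, 0, 90], 0.05)  # descend slowly
robot.set_gripper(closed=True)                                # grasp the part
robot.move_linear([450.0, 120.0, 300.0], [180, 0, 90])        # retract
```

Even for this toy pick sequence, the coordinates and speeds have to be determined and tuned by hand, which illustrates why text‐based programming scales poorly for enterprises without in‐house robotics expertise.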

The conventional automatic programming systems employ a teach‐pendant or a panel for guiding the robot links through a set of states to achieve desired goals. The robot’s joint positions recorded during the teaching phase are used to create program code for task execution. Although teach‐pendant or panel programming decreases the level of required expertise compared to text‐based programming systems, it still requires trained operators with high technical skills. Other important limitations of guided programming systems include difficulties in programming tasks with high accuracy requirements and the absence of means for task generalization or for transferring the generated programs to different robots.

The stated limitations of the conventional programming methods inspired the emergence of a separate class of automatic programming systems, referred to as learning systems. The underlying idea of robot learning systems originates from the way humans acquire new skills and knowledge. Biggs and MacDonald (2003) classified these systems based on the corresponding forms of learning and problem solving in cognitive psychology: exploration, instruction, and observation. In exploration‐based systems, a robot learns a task by gradually improving its performance through autonomous exploration. These systems are often based on reinforcement learning techniques, which optimize a function of the robot’s states and actions by assigning rewards to the undertaken actions (Rosenstein and Barto, 2004; Thomaz and Breazeal, 2006; Luger, 2008); a minimal sketch of such a reward‐driven update is given after this paragraph. Instructive systems utilize a sequence of high‐level instructions from a human operator to execute preprogrammed robot actions; gesture‐based (Voyles and Khosla, 1999), language‐based (Lauria et al., 2002), and multimodal communication (McGuire et al., 2002) approaches have been implemented for programming robots using libraries of primitive robot actions. Observation‐based systems learn from observing another agent while it executes the task. The PbD paradigm is associated with observation‐based learning systems (Billard et al., 2008).
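To illustrate the reward‐driven learning idea behind exploration‐based systems, here is a minimal, hedged sketch of tabular Q‐learning on a toy one‐dimensional task. The environment, reward function, and parameter values are assumptions made purely for illustration and are not taken from the systems cited above.

```python
# Minimal tabular Q-learning sketch (illustrative; not from the book).
# The agent improves a state-action value function from rewards it
# collects while exploring a toy discrete environment.
import random

n_states, n_actions = 5, 2          # assumed toy task: move along a 1D line
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration rate

def step(state, action):
    """Hypothetical environment: action 1 moves right, action 0 moves left.
    Reaching the last state yields a reward of 1."""
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == n_states - 1 else 0.0
    return next_state, reward

for episode in range(200):
    state = 0
    while state != n_states - 1:
        # epsilon-greedy exploration: occasionally try a random action
        if random.random() < epsilon:
            action = random.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: Q[state][a])
        next_state, reward = step(state, action)
        # Q-learning update: move Q(s, a) toward the reward plus the
        # discounted value of the best action in the next state
        best_next = max(Q[next_state])
        Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
        state = next_state

print(Q)  # after training, action 1 (move right) dominates in every state
```

After training, the greedy policy in every state is to move right, which is the behavior the reward function implicitly specifies; the robot discovers it by trial and error rather than from a demonstration, which is the key contrast with the observation‐based systems discussed next.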

1.2 Programming by Demonstration

Robot PbD is an important topic in robotics, with roots in the way human beings ultimately expect to interact with a robotic system. Robot PbD refers to automatic programming of robots by demonstrating sample tasks, and it can be viewed as an intuitive way of transferring skill and task knowledge to a robot. The term is often used interchangeably with learning by demonstration (LbD) and learning from demonstration (LfD) (Argall et al., 2009; Konidaris et al., 2012). PbD has evolved as an interdisciplinary field spanning robotics, human–robot interaction (HRI), sensor fusion, machine learning, machine vision, haptics, and motor control. A few surveys of robot PbD are available in the literature (e.g., Argall et al., 2009). PbD can be perceived as a class of supervised learning problems, because the robot learner is presented with a set of labeled training data and is required to infer an output function capable of generalizing to new contexts (a simple sketch of this view is given below). In the taxonomy of programming approaches shown in Figure 1.1, PbD is a superior learning‐based approach. Compared to exploration‐based learning systems (an unsupervised learning problem), PbD systems reduce the search space of solutions to a particular task by relying on the task demonstrations. The learning is also faster, because the trial and error associated with reinforcement learning methods is eliminated.
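As a hedged illustration of this supervised‐learning view, the sketch below treats time stamps as inputs and demonstrated positions as labels, and fits a single generalized trajectory to several synthetic demonstrations with ordinary polynomial regression. The data and the choice of regressor are assumptions for illustration only; they stand in for the Gaussian mixture regression, spline regression, and locally weighted regression methods treated in Chapter 5.

```python
# Sketch: generalizing a trajectory from several demonstrations
# (illustrative; real PbD systems use GMR, splines, LWR, etc.).
import numpy as np

T = 100                                   # samples per demonstration
t = np.linspace(0.0, 1.0, T)              # normalized time index

# Assumed synthetic "demonstrations": the same underlying motion
# corrupted by small random variations from one demonstration to another.
rng = np.random.default_rng(0)
demos = [np.sin(np.pi * t) + 0.05 * rng.standard_normal(T) for _ in range(5)]

# Supervised view: inputs are time stamps, labels are demonstrated positions.
X = np.tile(t, len(demos))                # stacked time inputs
y = np.concatenate(demos)                 # stacked observed positions

# Fit a low-order polynomial as the generalized trajectory x_gen(t).
coeffs = np.polyfit(X, y, deg=5)
x_gen = np.polyval(coeffs, t)

# The learned function can be queried at new time instants,
# i.e., it generalizes beyond the recorded samples.
print(np.polyval(coeffs, 0.37))
```

The fitted function can be evaluated at time instants that never appear in the recorded data, which is the generalization requirement stated above; the probabilistic methods covered later in the book additionally encode the variability across demonstrations rather than only their average behavior.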

In summary, the main purpose of PbD is to overcome the major obstacles to a natural and intuitive way of programming robots, namely the lack of programming skills and the scarcity of task knowledge. In industrial settings, this translates to reduced time and cost of programming robots by eliminating the involvement of a robot programmer. In interactive robotic platforms, PbD systems can help to better understand the mechanisms of HRI, which is central to the challenges of social robotics. Moreover, PbD creates a collaborative environment in which humans and robots participate in a teaching/learning process. Hence, PbD can help in developing robot control methods that integrate safe operation and awareness of the human presence in human–robot collaborative tasks.

1.3 Historical Overview of Robot PbD

Approaches for automatic programming of robots emerged in the 1980s. One of the earlier works was the research by Dufay and Latombe (1984), who implemented inductive learning for the robot assembly task of mating two parts. The assembly tasks in this work were described by the geometric models of the parts and by their initial and final relations. Program code in the robot language was synthesized from training and inductive (planning) phases applied to sets of demonstrated trajectories. In this pioneering work on learning from observation, the sequences of states and actions were represented by flowcharts, where the states described the relations between the mating parts and the sensory conditions.

Another early work on a similar topic is the assembly‐plan‐from‐observation (APO) method (Ikeuchi and Suehiro, 1993). The authors presented a method for learning assembly operations on polyhedral objects. The APO paradigm comprises six main steps: temporal segmentation of the observed process into meaningful subtasks, recognition of the scene objects, recognition of the performed assembly task, recognition of the grasps of the manipulated objects, recognition of the global path of the manipulated objects for collision avoidance, and task instantiation for reproducing the observed actions. The contact relations among the manipulated objects and environmental objects were used as a basis for constraining the relative object movements. Abstract task models were represented by sequences of elementary operations accompanied by sets of relevant parameters (i.e., initial configurations of objects, grasp points, and goal configurations).

Munch et al