Large companies such as Amazon, Microsoft, Google and Apple are becoming increasingly interested in virtual assistants. Growing interest in social robots, and their ongoing development, has put research into affective and social computing at the forefront. The aim of Opinion Analysis in Interactions is to present methods based on artificial intelligence that combine machine learning models with symbolic approaches. Natural language processing and affective computing are also discussed, via the analysis and generation of socio-emotional signals. The book explores the analysis of opinions in human-human interaction and tackles the less-explored (yet crucial) challenges of analyzing user opinions in human-agent interaction. It also illustrates strategies for selecting and generating agent utterances in response to user opinions, and opens up perspectives on the agent's multimodal generation of utterances that convey attitudes.
Page count: 223
Year of publication: 2019
Cover
Preface
Introduction: From Opinion Mining to Human–agent Interactions
I.1. Terminologies and theoretical models of opinions
I.2. Computational models of opinions
I.3. Human–agent interactions and socio-emotional behaviors
I.4. Outline of the book
1 Oral and Written Interaction Corpora
1.1. Oral H–H corpora: call centers and satisfaction surveys
1.2. Written H–H corpora: forums
1.3. Oral H–A corpora: virtual assistants and robots
1.4. Written H–A corpus: chatbot
1.5. Comparative study of different corpora
1.6. Conclusion
2 Analyzing User Opinions in Human–human Interactions
2.1. From linguistic modeling to machine learning
2.2. Learning to account for linguistic specificities
2.3. Conclusion
3 Analyzing User Opinions in Human–Agent Interactions
3.1. Choice of phenomena to study in relation to applications
3.2. Rule-based system to take into account interaction
3.3. Hybrid approach for taking account of interactions
3.4. Evaluation for human–agent interactions
3.5. Conclusion
4 Socio-emotional Interaction Strategies: the Case of Alignment
4.1. Theoretical models
4.2. Qualitative and quantitative corpus analysis
4.3. Computational model of verbal alignment
4.4. Method for evaluating an alignment module
4.5. Conclusion
5 Generating Socio-emotional Behaviors
5.1. Generating agent prosody
5.2. Intonation, facial expressions and sequence mining
5.3. Generation of coverbal gestures for agents
5.4. Conclusion
Conclusion: Summary and Directions for Future Research
References
Index
End User License Agreement
Chapter 1
Table 1.1. Vox14-fine, subset of the 1,600 h call center corpus
Table 1.2. Examples of responses to open questions in the NPS corpus
Table 1.3. Distribution of responses to open questions into three classes based ...
Table 1.4. Quantitative data for the WebGRC corpus
Table 1.5. User engagement annotation categories used in the UE-HRI corpus
Table 1.6. Overview of the corpora used in our studies
Table 1.7. Overview of the human–agent corpora used in our studies
Chapter 2
Table 2.1. Morpho-syntactic labeling of noisy textual data
Table 2.2. Reduction in number of words not recognized by XeLDA
Table 2.3. Examples of web expressions extracted from the WebGRC corpus
Chapter 3
Table 3.1. Distribution of attitudes in the development corpus taken from the Se...
Table 3.2. Fleiss’ kappa (presence of a like/dislike) and Cronbach alpha coeffic...
Table 3.3. Fleiss’ kappa and agreement rate
Table 3.4. Fleiss’ kappa between the system and the reference results
Table 3.5. F1 and accuracy scores using different descriptors and models
Chapter 4
Table 4.1. Example of dialog with shared expressions between two speakers
Table 4.2. Shared expression lexicon constructed from the dialog in Table 4.1
Chapter 5
Table 5.1. Distribution of dialogs in our subcorpus study, broken down using the...
Table 5.2. Summary of the corpus analysis presented in [BAW 15]
Table 5.3. Percentage of dialog acts for each contour type. Table taken from [BA...
Table 5.4. Examples of associations between image schemas and gestural invariant...
Introduction
Figure I.1. Analysis of opinions in interactions as presented in this book
Figure I.2. Human–agent interactions and socio-emotional behaviors: the three as...
Chapter 1
Figure 1.1. The speech signal is first segmented into speakers (agent/client), t...
Figure 1.2. UE-HRI corpus: collecting data from spontaneous human–robot interact...
Figure 1.3. UE-HRI corpus: collection of spontaneous human–robot interaction dat...
Figure 1.4. EDF’s virtual advisor, Laura, in 2014. Laura responded to client que...
Chapter 2
Figure 2.1. Opinion analysis and extraction of business concepts from client dat...
Chapter 3
Figure 3.1. Illustration taken from Caroline Langlet’s presentation at the ACII ...
Figure 3.2. Annotation model, taken from [LAN 15]
Figure 3.3. Extraction patterns and semantic rules at utterance level: lexical, ...
Figure 3.4. Detection of likes and dislikes in an adjacent pair in the system pr...
Figure 3.5. Extraction patterns and semantic rules within thematic sequences
Figure 3.6. Scores obtained using SentiWordNet (image taken from [BAC 10]). Thre...
Figure 3.7. Illustration from [BAR 18], showing a model of interactional dynamic...
Figure 3.8. Illustration from [BAR 18], showing a model of interactional dynamic...
Figure 3.9. Illustration from [BAR 18], showing a model of interactional dynamic...
Figure 3.10. Illustration from [BAR 18], showing a model of interactional dynami...
Figure 3.11. Question about the presence of a like/dislike
Figure 3.12. Selection of words linked to the target in the annotation platform
Figure 3.13. Annotation platform including conversation history. For a color ver...
Figure 3.14. Ordering targets according to user preference
Chapter 4
Figure 4.1. General architecture of the verbal alignment module. For a color ver...
Figure 4.2. Reproduction of a museum environment
Figure 4.3. User interacting with Leonard, taken from [CLA 16a]
Chapter 5
Figure 5.1. Operating diagram for SMART. The dotted line shows the importance of...
Figure 5.2. Figure taken from [RAV 18a] showing the generation of the phrase: “W...
Series Editor
Patrick Paroubek
Chloé Clavel
First published 2019 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:
ISTE Ltd 27-37 St George’s Road London SW19 4EU UK www.iste.co.uk
John Wiley & Sons, Inc. 111 River Street Hoboken, NJ 07030 USA www.wiley.com
© ISTE Ltd 2019
The rights of Chloé Clavel to be identified as the author of this work have been asserted by her in accordance with the Copyright, Designs and Patents Act 1988.
Library of Congress Control Number: 2019940674
British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN 978-1-78630-419-3
This book is dedicated to the analysis of opinions in human–human and human–agent interactions. We shall present methods based on artificial intelligence (through learning models of socio-emotional behaviors, combining symbolic and machine learning methods) and affective computing (analysis and synthesis of socio-emotional signals).
The work presented here is essentially that which was submitted for my Habilitation à diriger des recherches (HDR) on May 29, 2017. It was written from a personal perspective and was not intended to provide exhaustive coverage of the field in question. My vision of the subject inevitably bears the marks of my own research career in both academic and industrial settings.
Following my doctoral thesis on acoustic recognition of emotions [CLA 07], I took up a research post in the R&D center at Thales Research and Technology. Later, I extended my field of study from acoustic analysis to natural language processing in the context of opinion analysis studies carried out at EDF Lab.
Since 2013, I have held a teaching and research post at the LTCI (Laboratoire Traitement et Communication de l’Information, Information and Communications Processing Laboratory) at Telecom-ParisTech. My current research builds on the work I carried out at EDF Lab on the analysis of opinions and feelings in written and oral interactions, encompassing the treatment of emotions and social attitudes in human–agent interactions. More specifically, my aim is to examine the modes of expression (verbal and prosodic) in the context of interactions between a human individual and an animated conversation agent, and to support the development of socially and emotionally competent agents.
This book presents the various studies that I have carried out on these subjects, grouped around two main axes:
– opinion analysis in human–human and human–agent interactions;
– socio-emotional interaction strategies and the generation of socio-emotional behaviors in human–agent interactions.
The research presented here is rooted in the fields of natural language processing, applied machine learning, affective computing and social human–machine/human–robot interactions. It was developed within a multidisciplinary context, drawing on fields such as psychology, sociology and linguistics. This multidisciplinary facet is crucial given the nature of the subject, from the expression of opinions to social attitudes.
The majority of the work presented here was carried out by research interns, doctoral and postdoctoral students working under my supervision (listed chronologically):
– interns: Charlotte Danesi, Camille Dutrey, Rachel Bawden and Jessica Durand;
– doctoral students: Rémi Lavalley, Camille Dutrey, Caroline Langlet, Thomas Janssoone, Irina Maslowski and Valentin Barrière;
– postdoctoral students: Sabrina Campano, Guillaume Dubuisson-Duplessis, Brian Ravenet and Atef Ben-Youssef.
This work also draws on the results of collaborative efforts, notably in the context of co-supervision of the doctoral and postdoctoral students named above.
The studies of opinion analysis methods for interactions were carried out in collaboration with Anne-Laure Guénet, Delphine Lagarde, Anne Peradotto and Alina Stoica Beck at EDF, and with Patrice Bellot and Marc El-Béze of the Avignon Informatics Laboratory.
The work on oral data from call centers, notably the disfluency analysis presented in Chapters 1 and 2, is the fruit of a collaboration with the LIMSI-CNRS, the Laboratoire d’Informatique pour la Mécanique et les Sciences de l’Ingénieur (Informatics Laboratory for Mechanics and Engineering Sciences) at the University of Paris 11 (Sophie Rosset and Ioana Vasilescu) and with the LPP, the Laboratoire de Phonétique et Phonologie (Phonetics and Phonology laboratory) at Paris 3 (Martine Adda-Decker).
My research on human–agent interactions (presented in Chapters 3, 4 and 5) coincided with my arrival at the LTCI, and benefited hugely from a long and fruitful collaborative partnership with Catherine Pelachaud. The work on conditional random fields presented in Chapter 3 was made possible by expertise provided by the LTCI in the context of a collaboration with Slim Essid; the study of facial expression generation, described in Chapter 5, would not have been possible without the support of Kévin Bailly (ISIR). The prosodic behavior generation and alignment measurement techniques presented in Chapters 4 and 5 owe a good deal to scientific discussions with Frédéric Landragin from Lattice.
Finally, I wish to thank Björn Schuller, Dirk Heylen and Nicolas Sabouret, the reviewers of my HDR assessment, along with Frédéric Bechet, Mohamed Chetouani and Patrick Paroubek, the examiners, for their probing questions and stimulating discussion, which provided inspiration and motivation for my subsequent work.
Chloé CLAVEL
May 2019
The automatic analysis of opinions in interactions is a rapidly expanding domain, encouraged, on the one hand, by the challenges presented by practical applications and, on the other hand, by the growing presence of online platforms for public expression and media. This development offers exciting new possibilities in terms of critical expression and action over the Internet [CAR 13], and the quantity and variety of data available are increasing. In terms of natural language processing, the challenge is to analyze expressions of opinion automatically in order to analyze social trends. There are many possible applications for these techniques: analyzing citizens’ opinions of candidates at election time, analyzing Internet users’ opinions of a product (or product reputation), identifying target clients for recommendation systems, evaluating the success of an advertising campaign, etc.
Running parallel to this social web phenomenon, social robotics, and human–agent interactions as a whole, offer fertile ground for the analysis of opinions in interactions between humans and virtual agents. For example, companion robots are used to provide assistance to users (helping maintain independence) and for entertainment purposes. In this context, knowledge of the user and their profile is critical in order to build a social link between the person and the robot. Using this profile (particularly user preferences, in this case), a companion robot may, for example, choose subjects to discuss when interacting with the user, or recommend products, music or entertainment that they may enjoy.
The possible applications of the domain are manifold, as are the challenges it represents. In recent years, virtual agents have come into use for managing client relations on websites, and a number of companies have developed their own virtual assistants (such as Alexa (Amazon), Siri (Apple) and Cortana (Microsoft)). Whilst these virtual assistants are already widely used, further work on the social component of interactions is crucial in order to improve the fluidity and natural feel of interactions. A further area in which socio-emotional behaviors in human–agent interactions may be taken into account is that of Serious Games, in which users may be trained to handle different situations in conjunction with a virtual agent. For example, in [YOU 15], users can work on improving their social behaviors in the context of virtual job interviews.
The research on opinion analysis presented in this work covers two different interaction situations:
1) human–human interactions collected online and from company data;
2) human–agent interactions (embodied conversational agents, robots).
Opinion detection and analysis approaches have been welcomed with open arms by the machine learning community, although the areas of natural language processing and affective computing have sometimes been omitted. In this work, we shall make use of research carried out in all three fields (affective computing, machine learning and natural language processing). We shall present detection methods including logico-semantic rules and machine learning methods, choosing the most appropriate option for different scientific problems and in response to different levels of maturity. Rule- and knowledge-based methods constitute an essential first step in defining the outlines of new scientific problems. For instance, we defined logico-semantic rules to develop an initial user opinion detection system in a context of human–agent interactions, a subject not previously covered in the literature.
Our work focuses on a number of research questions woven through every chapter. In this introduction, we shall establish the scientific context and specific elements of each of these main questions.
– the first research question relates to theoretical opinion models (Q1, presented in section I.1);
– the second research question relates to computational opinion models (Q2, presented in section I.2);
– the third and fourth questions relate to the creation of socio-affective conversation agents (Q3 and Q4 in section I.3).
Figure I.1. Analysis of opinions in interactions as presented in this book
These research questions clearly highlight common themes and articulations between the chapters of this book. They form a bridge between the two areas of research that we have chosen to present:
1) the development of opinion detection systems for interaction analysis purposes (written and oral¹), in the context of both human–human (Chapter 2) and human–agent interactions (Chapter 3), as shown in Figure I.1;
2) the development of virtual agents with the ability to express opinions and, more broadly, to present socio-emotional behaviors (Chapters 4 and 5), as shown in Figure I.2.
Figure I.2. Human–agent interactions and socio-emotional behaviors: the three aspects presented in this book, in Chapters 4 and 5, and the associated research questions Q1, Q2, Q3 and Q4
The opinion detection problem is often reduced to a simple question of positive/negative classification. Nevertheless, the precise definition of the phenomenon and of what differentiates opinions from emotions or feelings is important, and the choice of a specific phenomenon differs in relation to the scientific problem in question. In this section, we shall present work carried out by different communities on the terminological and subjacent theoretical aspects of the opinion phenomenon.
The multidisciplinary nature of this domain of research has led, on the one hand, to the use of different terminologies to denote similar phenomena (emotion, opinion, feelings, moods, attitudes, interpersonal stance, personality, affect sensing, judgement, assessment, argument, etc.); on the other hand, the same terminology may be used to denote different phenomena. The opinion mining community tends to use terms such as opinion, feeling and affect to refer to different phenomena. However, existing work rarely gives a precise and in-depth definition of exactly what is meant by each of these terms.
The human–agent interaction community draws on psychological theories, used to model emotion-related phenomena, affect sensing and mood, and, more recently, on theories of social interaction. Clavel and Callejas [CLA 16b] review the state of the art for these terminologies and theoretical models, as described below.
Scherer [SCH 05] proposes a distinction between different phenomena. Notably, he establishes a difference between emotions and attitudes. Emotions are defined as phenomena of short duration, including a physiological reaction, following the evaluation of a major stimulus (as in the case of fear, sadness, joy or anger). Attitudes are defined as predispositions toward objects and/or persons (as in the case of preferences).
Scherer also defines the interpersonal stance, or social attitudes, as an affective disposition toward another person in the context of an interaction, for example politeness, warmth or distrust.
Specific studies of verbal content have been carried out in the field of linguistics. Like Scherer, Martin and White [MAR 05] prefer the term “attitudes” to those of “feelings” or “opinions”. They define an attitude as something concerned with feelings, including emotional reactions, judgments of behavior and evaluations of things [MAR 05, p. 35].
The authors distinguish three types of attitudes:
– affects (personal reactions relating to an emotional state);
– judgments (the fact of assigning qualities – such as tenacity – to individuals as a function of normative principles);
– appreciation (the evaluation of an object, product or process).
Munezero et al. [MUN 14], working in the context of opinion mining, also propose definitions of feeling and opinion phenomena, backed up by textual examples. The authors define affects as preceding emotions, prior to awareness of the associated feeling. Consequently, according to this view, affects are not expressed in linguistic form. Distinctions are also made between emotions and feelings and between opinions and feelings:
– emotions differ from feelings in terms of duration (emotions have a shorter duration) and by the presence of a target (emotions are not always connected to an object);
– opinions are personal interpretations of information and are not necessarily emotionally charged in the same way as feelings.
The studies presented in this book are based on the concept of attitude defined by Martin and White. This definition encompasses all of the phenomena relating to opinions, providing subcategories that circumscribe phenomena as a function of a scientific context.
The theoretical models underpinning opinion and emotion analysis systems also differ according to the community in question (opinion mining or human–agent interactions) and to the chosen application. In [CLA 16b], we identified three major families of theoretical models used in defining opinion-related phenomena:
– dimensional models;
– categorical models;
– models based on evaluation theory.
The most common tasks applied in the context of opinion mining relate to the detection of polarity (positive vs. negative) and intensity [WOL 13], [OSH 09]. Polarity and intensity are two dimensions used to describe opinions and may be linked to theories based on a dimensional model [RUS 80] of opinions/emotions. This descriptive mode represents socio-emotional phenomena along abstract axes, such as valence/activation.
Polarity detection in particular can be used to simplify the opinion analysis problem, segmenting the polarity axis into two or three classes (is the opinion expressed in a text broadly positive, negative or neutral²?) and is used, for example, to analyze opinions concerning a brand (e-reputation) or in analyzing movie reviews.
For example, the Deft’07 text mining challenge [GRO 07] concerned the attribution of opinions (positive, negative or neutral, where applicable) to a corpus of reviews of books, shows, video games, scientific articles and parliamentary debates.
Opinion polarity or emotion valence analysis is also used in the domain of human–agent interactions to manage negative emotions within the context of interactions [SMI 11].
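The segmentation of the polarity axis into classes described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the book: the toy lexicon, the function names `valence` and `polarity_class`, and the width of the neutral band are all hypothetical.

```python
from typing import Dict, List

# Hypothetical toy lexicon (word -> valence in [-1, 1]); illustration only.
TOY_LEXICON: Dict[str, float] = {
    "good": 0.7,
    "great": 0.9,
    "bad": -0.7,
    "terrible": -0.9,
}


def valence(tokens: List[str], lexicon: Dict[str, float] = TOY_LEXICON) -> float:
    """Mean valence of the tokens found in the lexicon (0.0 if none are found)."""
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def polarity_class(score: float, neutral_band: float = 0.2) -> str:
    """Segment the valence axis into three classes around zero.

    neutral_band is an assumed threshold: scores within
    [-neutral_band, +neutral_band] are labeled "neutral".
    """
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ["a great film", "a terrible service", "the film"]:
        print(text, "->", polarity_class(valence(text.split())))
```

Setting `neutral_band` to 0 reduces the scheme to a two-class (positive vs. negative) decision for any non-zero score; a real system would replace the toy lexicon with a learned model or a curated resource.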
Other studies (for example [PER 13]) have drawn on categorical models developed in the field of psychology [EKM 99, IZA 71, PIC 00, PLU 03, WHI 89] concerning the detection of categories of opinion or emotion in textual data. The categorical approach consists of assigning appropriate predefined lexical items, or labels, to socio-emotional phenomena.
This approach constitutes the most intuitive means of describing specific phenomena using categories drawn from everyday language [CLA 07]. Categories are defined by tracing hard lines within the perceptive space. Each category corresponds to a prototype [KLE 90] to which other similar manifestations may be linked.
The way in which category lines are drawn is heavily dependent on the data in question. In the case of fully simulated corpora, we look for illustrative examples of a predefined prototype. All emotional manifestations contained in the corpora must converge strongly toward the prototype. In the case of spontaneous corpora, expressions are grouped around an abstract prototype. The diversity of contexts in which emotions emerge within spontaneous speech heightens the complexity of the task.
The classes used in opinion analysis are thus highly dependent on the context of application and on the data in question. Examples include:
– detection of agreement or disagreement [GAL 04] in recordings of meetings;
– detection of insulting messages on the Internet [SPE 97];
– detection of subjectivity [TSY 12];
– detection of frustration in drivers [BOR 10], for educational support systems [LIT 06] or in computer games designed for children [YIL 11];
– representation of emotion detected in a text through avatars [NEV 10b, ZHA 08].
Models based on evaluation theory provide a richer basis for analysis and have been shown to be effective in the context of human–agent interactions, although they have yet to be widely adopted by the opinion mining community.
The most popular model of this type within both communities is the Ortony, Clore and Collins (OCC) model based on the cognitive structure of emotions. It has been used in the context of opinion mining for textual affect sensing [SHA 09], and is particularly popular in the agent community for generating emotional behaviors for agents, classifying events, objects and actions in order to define the emotion which the agent should express [VAL 09].
However, different communities also use their own evaluation theories. In the opinion mining community, work has been carried out on another evaluation theory, providing a definition of attitudes or evaluation through language [MAR 05]. This theory is used to represent an opinion (an attitude) as an evaluation of a target (e.g. a service or a product) by a source (e.g. the person communicating) [BLO 07].
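The representation of an opinion as an evaluation of a target by a source can be sketched as a small data structure. This is an illustrative assumption rather than the formalization used in [MAR 05] or [BLO 07]: the class names, the field names and the helper `liked_targets` are hypothetical, and the three attitude types are those listed earlier in this section.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class AttitudeType(Enum):
    """The three attitude types distinguished by Martin and White."""
    AFFECT = "affect"              # personal reaction relating to an emotional state
    JUDGMENT = "judgment"          # qualities assigned to individuals
    APPRECIATION = "appreciation"  # evaluation of an object, product or process


@dataclass(frozen=True)
class Opinion:
    """Hypothetical record: a source evaluates a target."""
    source: str            # e.g. the person communicating
    target: str            # e.g. a service or a product
    attitude: AttitudeType
    polarity: str          # "positive" or "negative" (assumed two-class labeling)


def liked_targets(opinions: Iterable[Opinion], source: str) -> List[str]:
    """Targets that the given source evaluates positively, sorted by name."""
    return sorted(
        {o.target for o in opinions if o.source == source and o.polarity == "positive"}
    )


if __name__ == "__main__":
    observed = [
        Opinion("user", "jazz", AttitudeType.APPRECIATION, "positive"),
        Opinion("user", "waiting time", AttitudeType.APPRECIATION, "negative"),
        Opinion("user", "advisor", AttitudeType.JUDGMENT, "positive"),
    ]
    print(liked_targets(observed, "user"))
```

Keeping source and target as explicit fields is what allows opinions to be aggregated per user, for instance when collecting likes and dislikes into a user profile.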
Alternatives to the OCC model have also been used among the agent community, including Scherer’s appraisal model, which breaks the evaluation process down into different steps (such as the evaluation of a new element), or the EMA dynamic appraisal model [MAR 09].
Another theoretical approach used by researchers working in a similar field to opinion mining involves argument models. A graphic formalization of argument models from the field of philosophy [TOU 03a] was proposed by [CAB 13] for the purposes of argument mining of debates on social networks. This formalization enables the identification of structures connecting opinions, for example by linking opinions corresponding to rebuttals and claims.
A first effort to identify different terminologies and theoretical models was made by the W3C with the development of EmotionML³ (Emotion Markup Language). The aim was to define a common language for annotating emotions. This project has now been extended to describing feelings in linked data sources, remaining within the W3C framework [SÁN 16].
In accordance with our decision to examine opinion phenomena in connection with the concept of attitude, the work presented in this book is based on Martin and White’s theory of evaluation in language, which provides a description of verbal realizations of attitudes; a symbolic formalization of expressions may thus be developed and integrated into a detection model.
In [CLA 07], we reflected on the best theoretical model to use in constructing a computational model in a different context, that of acoustic analysis and the detection of fear-type emotions in abnormal situations. The work presented in this book extends our investigation to socio-emotional behaviors, including linguistic phenomena associated with opinions and feelings.
We shall consider (Q1) the relevance of different theoretical models for constructing a computational model (linguistic and prosodic) depending on the application (social network analysis, customer relations management, recommendations, human–agent interactions and social robotics).
This research question will be addressed throughout the book:
– a first, outline response to this question is presented in Chapter 2, with the definition of the concept of satisfaction in marketing terms based on company data for a customer relations application;
– the theoretical modeling question is considered in greater detail in Chapter 3, using Martin and White’s appraisal theory [MAR 05] to construct:
  – “like” and “dislike” models for users in human–agent interactions, with the aim of establishing a user profile,
  – models of customer/user opinions in interaction with a chatbot;
– in Chapter 4
