Natural Language Processing (NLP) is a scientific discipline found at the intersection of fields such as Artificial Intelligence, Linguistics, and Cognitive Psychology. This book presents, in four chapters, the state of the art and fundamental concepts of key NLP areas. The first chapter presents the fundamental concepts of lexical semantics, lexical databases, knowledge representation paradigms, and ontologies. The second chapter is about combinatorial and formal semantics. Discourse and text representation, as well as automatic discourse segmentation and interpretation and anaphora resolution, are the subject of the third chapter. Finally, the fourth chapter covers aspects of large-scale NLP applications, such as software architecture and its relation to cognitive models of NLP, as well as the evaluation paradigms of NLP software. This chapter also presents the main NLP applications, such as Machine Translation (MT), Information Retrieval (IR), Big Data, and Information Extraction tasks such as event extraction, sentiment analysis and opinion mining.
Number of pages: 501
Year of publication: 2017
Cover
Title
Copyright
Introduction
1 The Sphere of Lexicons and Knowledge
1.1. Lexical semantics
1.2. Lexical databases
1.3. Knowledge representation and ontologies
2 The Sphere of Semantics
2.1. Combinatorial semantics
2.2. Formal semantics
3 The Sphere of Discourse and Text
3.1. Discourse analysis and pragmatics
3.2. Computational approaches to discourse
4 The Sphere of Applications
4.1. Software engineering for NLP software
4.2. Machine translation (MT)
4.3. Information retrieval (IR)
4.4. Big Data (BD) and information extraction
Conclusion
Bibliography
Index
End User License Agreement
1 The Sphere of Lexicons and Knowledge
Table 1.1. Examples of the connotations of the color red
Table 1.2. The metaphor of life as a voyage
Table 1.3. The metaphor of humans as machines and machines as humans
Table 1.4. Examples of polysemy
Table 1.5. Examples of partial synonyms
Table 1.6. A semic analysis of the field of the chair according to Pottier
Table 1.7. Comparison of the attributes of sparrows, ostriches and kiwis
Table 1.8. Simplified frame of a plane
2 The Sphere of Semantics
Table 2.1. Analyses of semias: train, metro, bus and coach
Table 2.2. Truth table for the negation operator
Table 2.3. Truth values for the conjunction operator
Table 2.4. Truth table for the or operator
Table 2.5. Truth table for the exclusive or operator
Table 2.6. Truth table for the implication operator
Table 2.7. Truth table for the biconditional
Table 2.8. Truth table for formula α
Table 2.9. Proof of P ⋀ P ≡ P
Table 2.10. Proof of P ⋀ Q ≡ Q ⋀ P
Table 2.11. Proof of (P → (Q ∨ R)) ≡ ((P → Q) ∨ (P → R))
3 The Sphere of Discourse and Text
Table 3.1. Some perspectives on the difference between text and discourse
Table 3.2. Examples of sentences with their presuppositions
Table 3.3. Inventory of mononuclear relations established in [MAN 88]
Table 3.4. Inventory of multinuclear relations established in [MAN 88]
Table 3.5. The constraints on the relation
Table 3.6. The four types of transition possible in a discourse segment
4 The Sphere of Applications
Table 4.1. Comparison of several programming environments in Prolog [FER 00]
Table 4.2. List of the most frequent words in the CSFN
Table 4.3. Comparison of tf, cf, tf/cf, idf and tf-idf scores
Table 4.4. Term-document matrix
Table 4.5. Document–document matrix for the collection in Figure 4.27
Table 4.6. Standardized document–document matrix (binary)
Table 4.7. Term–document matrix for our collection and the two queries
Table 4.8. The distances between the queries and the documents in this collection
Table 4.9. Hearst’s extraction patterns for hyponyms
Table 4.10. The basic emotions and corresponding neurotransmitter rates
Table 4.11. Examples of opinions with different types of qualifications
Table 4.12. Categories of terms in WordNetAffect [STR 04]
Table 4.13. Relation between syntactic structures and the elements of an opinion
1 The Sphere of Lexicons and Knowledge
Figure 1.1. General diagram of lexical hierarchies in a field [CRU 00]
Figure 1.2. Partial taxonomy of animals
Figure 1.3. Meronymic hierarchy of the human body
Figure 1.4. General structure of a semiotic square
Figure 1.5. Example of a semiotic square for feminine/masculine
Figure 1.6. Lexical structure of the lexical entry car
Figure 1.7. Extract of an SGML document that represents the days of the week
Figure 1.8. Diagram of a possible use of SGML in a real context
Figure 1.9. Structure of a lexical entry
Figure 1.10. Example of the definition of a lexical entry in the form of an SGML document
Figure 1.11. Possible display for the SGML document
Figure 1.12. Example of an XML document that represents the days of the week
Figure 1.13. Use of XML to format lexical entries
Figure 1.14. An example of RDF metadata
Figure 1.15. Example of the entry dresser [BUR 15]
Figure 1.16. Example of a MARTIF format document
Figure 1.17. The components of the LMF standard
Figure 1.18. Class diagram of the LMF’s core
Figure 1.19. Example of a lexicon coded with LMF [FRA 06]
Figure 1.20. Papillon macrostructure with interlanguage links [MAN 06]
Figure 1.21. Examples of interlanguage links [BOI 02]
Figure 1.22. Example of an interlanguage sense in XML [SÉR 01]
Figure 1.23. Microstructure of the lexia murder [MAN 06]
Figure 1.24. Data flow in the DEB system [SMR 03]
Figure 1.25. Example of a lexical entry in the wn_cz dictionary [SMR 03]
Figure 1.26. Example of a lexical entry in the gloss_en dictionary [SMR 03]
Figure 1.27. Result of a search for the word car in WordNet
Figure 1.28. The pivot, prolexeme and instances of the UNO [MAU 08]. For a color version of this figure, see www.iste.co.uk/kurdi/language2.zip
Figure 1.29. A program in Prolog with a micro knowledge base
Figure 1.30. Example of a simple semantic network
Figure 1.31. Example of a semantic network with two multiple inheritances
Figure 1.32. A conceptual graph that represents: the book is on the table
Figure 1.33. The conceptual graph for the sentence: all books are in paper
Figure 1.34. Conceptual graph for John goes to Prague by plane tomorrow
Figure 1.35. Dependencies between the components of the sentence Mary thinks that John wants to go to Prague by plane tomorrow
Figure 1.36. Conceptual graph of the sentence Mary thinks that John wants to go to Prague by plane tomorrow
Figure 1.37. Linear form of the graph in Figure 1.36
Figure 1.38. The class laborer and two instances
Figure 1.39. UNL graph of sentence 1.6
Figure 1.40. Global architecture of an information system with an ontology and an NLP module
Figure 1.41. Role of an ontology in an information exchange process [MAE 03]
Figure 1.42. The levels of knowledge in an ontology (adapted from [RIG 99])
Figure 1.43. Taxonomy of DOLCE [MAS 03]
Figure 1.44. Base structure of the SUMO ontology
Figure 1.45. Examples of representation with the language KIF-SUMO
Figure 1.46. Hierarchy of SNAP entities
Figure 1.47. Hierarchy of SPAN entities
2 The Sphere of Semantics
Figure 2.1. Description of the word pig according to the model in [FOD 64]
Figure 2.2. General diagram of the initial model of generative semantics
Figure 2.3. Semantic representation of the deep structure for: John killed Mary
Figure 2.4. Result of a predicate raising operation
Figure 2.5. Typical structure of a sentence according to Fillmore's model
Figure 2.6. Deep structure of the sentence John gave flowers to Mary
Figure 2.7. Representations of the sentence: John repaired the television with a screwdriver
Figure 2.8. Interactions of the independent components of macrosemantics
Figure 2.9. Architecture of the meaning-text model [MEL 97]
Figure 2.10. Semantic network of the predicate p(x, y) [MEL 97]
Figure 2.11. Lexical semantic rule R1 [MEL 97]
Figure 2.12. A few logical identities
Figure 2.13. A few rules of inference
Figure 2.14. Tree structure of a simple formula
Figure 2.15. Tree structure of a more complex formula
Figure 2.16. Representation of a nominal group with the determiner a
Figure 2.17. The syntactic dependencies of the sentence: John looks at a flower
Figure 2.18. Analysis of a simple sentence with a transitive verb
3 The Sphere of Discourse and Text
Figure 3.1. Diagram of constant topic progression
Figure 3.2. Diagram of linear topic progression
Figure 3.3. Diagram of derived topic progression
Figure 3.4. Diagram of inserted topic progression
Figure 3.5. Prince’s taxonomy [PRI 81]
Figure 3.6. Linear segmentation of a text
Figure 3.7. Example of a text analyzed according to RST [MAN 12]. For a color version of this figure, see www.iste.co.uk/kurdi/language2.zip
Figure 3.8. Example of rules used to construct the discursive structure starting from the syntactic structure [POL 04b]
Figure 3.9. DRT representation of [3.35]
Figure 3.10. DRT representation of [3.35]
Figure 3.11. First step of the algorithm
Figure 3.12. Second step of the algorithm
Figure 3.13. Third step of the algorithm
Figure 3.14. Fourth step of the algorithm
Figure 3.15. Fifth step of the algorithm
Figure 3.16. Sixth step of the algorithm
Figure 3.17. Seventh step of the algorithm
Figure 3.18. Algorithm for processing pronominal references [BRE 87]
Figure 3.19. Transition of centers in example [3.39]
Figure 3.20. Decision tree C4.5 for coreference resolution [MCC 95]
Figure 3.21. Basic algorithm for creating decision trees [QUI 79]
Figure 3.22. Skeleton of a decision tree
4 The Sphere of Applications
Figure 4.1. Examples of simple serial architectures. For a color version of this figure, see www.iste.co.uk/kurdi/language2.zip
Figure 4.2. Architecture of the Hearsay II system
Figure 4.3. Levels of knowledge in the Hearsay II blackboard
Figure 4.4. Architecture of the Vico system [BER 02]. For a color version of this figure, see www.iste.co.uk/kurdi/language2.zip
Figure 4.5. Architecture of the MICRO system [CAE 94]
Figure 4.6. Some software configurations for integrating syntax and semantics
Figure 4.7. Runtime of the worst cases observed by length
Figure 4.8. An example of an initial utterance and some derived utterances
Figure 4.9. Functional typology of MT systems according to Carbonell
Figure 4.10. General diagram of the assimilation process
Figure 4.11. General diagram of the dissemination process
Figure 4.12. The Vauquois triangle with some modifications [VAU 68]
Figure 4.13. Architecture of a system using the direct approach
Figure 4.14. Architecture of the transfer approach. For a color version of this figure, see www.iste.co.uk/kurdi/language2.zip
Figure 4.15. Architecture of a transfer-based system with three languages
Figure 4.16. An example of a syntactic transfer
Figure 4.17. Syntactic transfer with order inversion
Figure 4.18. Example of a synchronized syntactic tree and semantic tree
Figure 4.19. Diagram of a simple transfer [PRI 94]
Figure 4.20. Architecture of pivot-based systems
Figure 4.21. Example of a pivot in the domain of the translation of hotel reservations
Figure 4.22. Diagram of a mediated dialogue in Verbmobil [WAH 00a]
Figure 4.23. The interface of the Verbmobil system [WAH 00a]. For a color version of this figure, see www.iste.co.uk/kurdi/language2.zip
Figure 4.24. DFDs of information systems and information retrieval systems
Figure 4.25. Curve of the relation between frequency and rank for the first 300 words in the CSFN corpus. For a color version of this figure, see www.iste.co.uk/kurdi/language2.zip
Figure 4.26. Flat clustering of a set of documents. For a color version of this figure, see www.iste.co.uk/kurdi/language2.zip
Figure 4.27. Collection of documents
Figure 4.28. List of keywords retained in the documents in this collection
Figure 4.29. Graph of connections between the documents
Figure 4.30. Connected components algorithm
Figure 4.31. Diagrams of naive and complex Bayesian networks
Figure 4.32. General diagram of information retrieval with LSA. For a color version of this figure, see www.iste.co.uk/kurdi/language2.zip
Figure 4.33. Application of the SVD to a matrix X
Figure 4.34. Collection of documents
Figure 4.35. Approximation factor of 2 of the SVD of X
Figure 4.36. Vector calculation of the queries and the documents in this collection
Figure 4.37. The vectors corresponding to the texts in reduced space (k = 2). For a color version of this figure, see www.iste.co.uk/kurdi/language2.zip
Figure 4.38. ELT approach for data warehouses. For a color version of this figure, see www.iste.co.uk/kurdi/language2.zip
Figure 4.39. BD processing architecture. For a color version of this figure, see www.iste.co.uk/kurdi/language2.zip
Figure 4.40. DFD of an information extraction system and a question answering system
Figure 4.41. DFD of KnowItAll
Figure 4.42. Examples of predicates in the domains of geography and films
Figure 4.43. Induction of rules from examples
Figure 4.44. Dependency tree to extract relations
Figure 4.45. Algorithm to calculate tree similarity
Figure 4.46. Pairs of acquisition by multi-nationals
Figure 4.47. Examples of search results
Figure 4.48. DFD of the REES system
Figure 4.49. Example of an event template
Figure 4.50. The representation of emotions according to Russell’s circumplex model
Figure 4.51. Lövheim’s cube of emotions
Figure 4.52. DFD of a categorical system for extracting subjectivities
Series Editor
Patrick Paroubek
Mohamed Zakaria Kurdi
First published 2017 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:
ISTE Ltd
27-37 St George’s Road
London SW19 4EU
UK
www.iste.co.uk
John Wiley & Sons, Inc.
111 River Street
Hoboken, NJ 07030
USA
www.wiley.com
© ISTE Ltd 2017
The rights of Mohamed Zakaria Kurdi to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.
Library of Congress Control Number: 2017953290
British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN 978-1-84821-921-2
Language is a central tool in our social and professional lives. It is a means to convey ideas, information, opinions and emotions, as well as to persuade, request information, give orders, etc. Interest in language from a computer science point of view began with computer science itself, notably in the context of work on artificial intelligence. The Turing test, one of the first tests developed to determine whether a machine is intelligent, stipulates that, to be considered intelligent, a machine must have conversational capacities comparable to those of a human [TUR 50]. This means that an intelligent machine must have the capacity for comprehension and generation, in the broad sense of the terms, hence the interest in natural language processing (NLP) at the dawn of the computer age. Historically, computer processing of language was very quickly directed toward applied domains such as machine translation (MT) in the context of the Cold War. Thus, the first MT system was created as part of a joint project between Georgetown University and IBM in the United States [DOS 55, HUT 04]. These applied works were not as successful as intended, and researchers quickly became aware that a deep understanding of the linguistic system was a prerequisite for any successful application.
The internet wave between the mid-1990s and the start of the 2000s was a very significant driving force for NLP and related domains, notably information retrieval, which grew from a marginal field limited to searching within a large company's documents to retrieval at the scale of the Internet, whose content is constantly growing. This growth in the availability of data also favored a discipline that was then in its infancy: Data Science. Located at the intersection of statistics, computer science and mathematics, Data Science focuses on the analysis, visualization and processing of digital data in all its forms: images, text and speech. The role of NLP within Data Science is obvious, given that the majority of the information processed is contained in written documents or speech recordings. It is therefore possible to distinguish two different but complementary research approaches in the domain of NLP. On the one hand, there are works that aim to solve the fundamental problem of language processing and that are consequently concerned with its cognitive and linguistic aspects. On the other hand, several works are dedicated to optimizing and adapting existing NLP techniques for applied domains such as the medical or banking sectors.
The objective of this book is to provide a comprehensive review of classic and modern works in the domains of lexical databases and the representation of knowledge for NLP, semantics, discourse analysis, and NLP applications such as machine translation and information retrieval. This book also aims to be profoundly interdisciplinary, giving equal consideration to linguistic and cognitive models, algorithms and computer applications wherever possible. We start from the premise, confirmed time and again in NLP and elsewhere, that the best results are the product of a good theory paired with a well-designed empirical approach.
In addition to the Introduction, this book has four chapters. The first chapter concerns the lexicon and the representation of knowledge. After an introduction to the principles of lexical semantics and theories of lexical meaning, this chapter covers lexical databases, the main procedures for representing knowledge and ontologies. The second chapter is dedicated to semantics. First, the main approaches in combinatorial semantics such as interpretive semantics, generative semantics, case grammar, etc. will be presented. The next section is dedicated to the logical approaches to formal semantics used in the domain of NLP. The third chapter focuses on discourse. It covers the fundamental concepts in discourse analysis such as utterance production, thematic progression, structuring information in discourse, coherence and cohesion. This chapter also presents different approaches to discourse processing such as linear segmentation, discourse analysis and interpretation, and anaphora resolution. The fourth and final chapter is dedicated to NLP applications. First, the fundamental aspects of NLP systems such as software architecture and evaluation approaches are presented. Then, some particularly important applications in the domain of NLP, such as machine translation, information retrieval and information extraction, are reviewed.
Located at the intersection of semantics and lexicology, lexical semantics is a branch of semantics that focuses on the meaning of words and their variations. Many factors are taken into consideration in these studies:
– The variations and extensions of meaning depending on the usage context. The context can be linguistic (e.g. the surrounding words), which is why some experts prefer the term “cotext”. The context can also be related to the use or register of the language (formal, informal or vulgar, for example); in this case, it can indicate the socio-cultural category of the interlocutors.
– The semantic relationship that the word has with other words: synonyms, antonyms, similar meaning, etc. The grammatical and morphological nature of these words and their effects on these relationships are also of interest.
– The meaning of words can be considered a fairly complex structure of semantic features, each of which plays a different role.
This section will focus on the forms of extension of lexical meaning, the paradigmatic relations between words and the main theories concerning lexical meaning.
Language users are aware that the lexical units to fight, to rack one’s brain and crazy are not used literally in sentences [1.1]:
– The minister fought hard to pass the new law. [1.1]
– Mary racked her brain trying to find the street where John lived.
– John drives too fast, he’s crazy!
Far from these simple, everyday descriptive uses, the figurative use of lexical items occurs in different forms that will be discussed in the following sections.
From the perspective of the philosophy of language, denotation designates the set of objects to which a word refers. From a linguistic point of view, denotation is a stable and objective element because it is shared, in principle, by the entire linguistic community. This means that denotation is the guarantee of the conceptual content of the lexicon of a given language.
Connotation is defined as the set of secondary significations attached to a linguistic sign, related to the emotional content of the vocabulary. For example, the color red denotes light waves with certain physical properties; depending on the context, this color has several different connotations. Here are a few linguistic contexts in which the word red can be used, with their connotations (see Table 1.1).
Table 1.1. Examples of the connotations of the color red

Context | Connotation
Glowing red coals. | very hot
The streets ran red after the conflict. | blood-stained
John offered Mary a red rose. | love
Red light. | prohibition
Red card (soccer). | expulsion
The Red Scare (politics). | Communism
In some cases, the difference between connotation and denotation pertains to the register of language. For example, the groups (dog, mutt, pooch), (woman, chick), (police officer, cop, pig) each refer to the same object but the words of each group have different connotations that can provide information about the socio-cultural origins of the interlocutor and/or the situation of communication.
The distinction between denotation and connotation is considered problematic by some linguists. Linguistic evolution means that external features or properties become ingrained over time. For example, the word pest, which originally referred to an illness (pestilence), has evolved with time and now also refers to a disagreeable person, as in: Mary is a little pest.
Of Greek origin, the word metaphor literally means transfer. It consists of the semantic deviation of a lexical item’s meaning. Traditionally, it is a means to express a concept or abstract object using a concrete lexical item with which it has an objective or subjective relationship. The absence of an element of comparison such as like is what distinguishes metaphor from simile. The sentence she is as beautiful as a rose is an example of a simile.
There needs to be only some kind of resemblance for the metaphor process to enter into play. These resemblances can concern a property: to burn with love (intense and passionate love); the form: a rollercoaster life (a life with ups and downs like a rollercoaster ride), John reaches for the stars (to set one’s sights high or be very ambitious), genealogical tree (a set of relations whose diagram is similar to the shape of the branches of a tree); the degree: to die of laughter (death is an extreme state); the period: the springtime of life (youth), the Arab Spring (renewal); or personification: the whale said to Sinbad, “You must go in this direction” (the whale spoke like a person).
In some cases, there are objects that do not have a proper designation (non-lexicalized objects). They metaphorically borrow the names of other objects. This includes things like the wing of a plane, a windmill or a building, which all borrow the name of a bird’s limb because of a resemblance in form or function. This kind of metaphor is called a catachresis.
From a cognitive perspective, there are two opposing schools of thought when it comes to the study of metaphors: the constructivist movement and the non-constructivist movement. According to the constructivist movement, the objective world is not directly accessible. It is constructed on the basis of restricting influences on both language and human knowledge. In this case, metaphor can be seen as an instrument used to construct reality. According to the Conceptual Metaphor Theory of [LAK 80, LAK 87], the most extreme form of constructivism, metaphor is not a stylistic decorative effect at all. Rather, it is an essential component of our cognitive system that allows us to concretely conceptualize an abstract idea. The basic idea of this theory is that a metaphor is a relationship of correspondence between two conceptual domains: the source domain and the destination domain. According to this theory, metaphor is not limited to a particular linguistic expression because the same metaphor can be expressed in several different ways. To illustrate this idea, Lakoff gives the example of the metaphor of life as a voyage, where the voyage is the source domain and life is the destination domain (see Table 1.2).
Table 1.2. The metaphor of life as a voyage

Life | Voyage
Birth | Start of the voyage
Death | End of the voyage
Reaching an objective | Arriving at the destination
Point of an important choice | Intersection
Difficulties | Obstacles
Encountering difficulties | Climbing
Colleagues, friends, partners, etc. | Co-travelers
The correspondences presented in Table 1.2 are the source of expressions like “It’s the end of the road for John”, and “Mary is progressing quickly but she still has not arrived at the point where she wants to be”, etc. Note that in Lakoff’s approach, two types of correspondences are possible: ontological correspondences that involve entities from different domains and epistemological correspondences that involve knowledge about entities.
As shown in [LAK 89], the correspondences are unidirectional, even in the case of different metaphors that share the same domains. The authors give the example of humans as machines and machines as humans (see Table 1.3).
Table 1.3. The metaphor of humans as machines and machines as humans

Humans as machines:
– John is very good at math, he’s a human calculator.
– Marcel is a harvesting machine.

Machines as humans:
– I think my car doesn’t like you, she doesn’t want to start this morning.
– The machine did not like the new engine, it was too weak.
– My computer told me that the program wasn’t working.
Although these metaphors share the same domains, the features used in one direction are not the same as those used in the other. For example, in the metaphor of humans as machines, the functional features associated with machines (efficiency, rapidity and precision) are projected onto humans. In the other direction, different features, such as desire and the capacity for communication, are projected onto machines.
Metaphors are far from being considered a marginal phenomenon by linguists. In fact, some believe that studying metaphorical language is fundamental for understanding the mechanisms of language evolution because many metaphors pass into ordinary use. Other models have also been proposed, including the theory of lexical facets [KLE 96, CRU 00, CRU 04].
Metonymy consists of designating an object or a concept by the name of another object or concept. There are different types of metonymy depending on the nature of the connections that relate the objects or concepts:
– The cause and its effect: the harvest can designate the product of the harvest as well as the process of harvesting.
– The container for the contents: he drank the whole bottle, he ate the whole box/plate.
– The location for the institution located there: The Pentagon decided to send more soldiers into the field; Matignon decided to make the documents public (Matignon is the Paris residence and office of the French Prime Minister).
As with metaphor, context plays an important role in metonymy. In fact, a sentence like I have read Baudelaire (meaning that I have read poems written by Baudelaire) can only be interpreted as a metonymy because the verb to read requires a readable object (e.g. a book, newspaper, novel, poem). Since the object here is a poet, we infer something readable that is directly related to him: his poems.
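The constraint just described, where to read selects a readable object and a non-readable object such as a poet's name forces a metonymic reading, can be sketched as a toy selectional-restriction check. The lexicon and type labels below are invented for illustration, not taken from the book:

```python
# Toy selectional restrictions: the verb "read" expects an object of type
# "readable". When the object denotes an author instead, we coerce it
# metonymically to that author's works. All entries are illustrative.
semantic_type = {
    "book": "readable", "poem": "readable", "newspaper": "readable",
    "Baudelaire": "author", "Flaubert": "author",
}

def interpret(verb, obj):
    """Return a literal or metonymically coerced reading of verb + object."""
    if verb != "read":
        return f"{verb} {obj}"
    t = semantic_type.get(obj, "unknown")
    if t == "readable":
        return f"read {obj}"              # literal reading
    if t == "author":
        return f"read works by {obj}"     # metonymic coercion
    return f"read {obj} (no interpretation found)"

print(interpret("read", "book"))         # -> read book
print(interpret("read", "Baudelaire"))   # -> read works by Baudelaire
```

A real system would, of course, derive the type constraints from a lexical resource rather than a hand-written dictionary; the point here is only the coercion step triggered by the type mismatch.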
Synecdoche, a particular case of metonymy, consists of designating an object by the name of another object. The relationship between the two objects can be a varied form of inclusion. Here are a few examples:
– A part for the whole, as in: the sails are close to port (sails/ship), new hands join in the effort (hands/person), or the jaws of the sea (jaws/shark).
– The whole for a part: Italy won the European Cup (Italy/Italian team).
– From the specific to the general: Spring is the season of roses (roses/all kinds of flowers).
As these examples show, unlike in metonymy, the two objects involved in a synecdoche are always inseparable from one another.
Language is far from being a nomenclature of words. Words have varied relationships on different levels. In addition to syntagmatic relations of co-occurrence, which are fundamentally syntactical, words have essentially semantic paradigmatic relations. These relations can be linear, hierarchical, or within clusters.
Used to designate the structure of a linguistic domain, the term field, while fundamental in lexicology, can refer to various concepts depending on the school of thought or the linguist. Generally, following the German tradition of distinguishing between Sinnfeld (field of meaning) and Wortfeld (field of words), a distinction is made between the lexical field and the semantic field [BAY 00]. A lexical field is defined as a set of words that pertain to the same domain or the same sector of activity. For example, the words raid, anti-tank, armored vehicle, missile and machine gun belong to the lexical field of war. In cases of polysemy, the same word belongs to several fields. For example, the word operation belongs to three fields: mathematics, war and medicine [MIT 76]. The semantic field is defined as the area covered by the signification(s) of a word in a language at a given moment in its history [FUC 07]. In this regard, the semantic field is related to polysemy. Faced with this terminological confusion, two approaches from two linguistic currents have proposed representing polysemes in terms of their shared meaning. The first, presented by Bernard Pottier and François Rastier, is part of the structural semantics movement and analyzes meaning according to a hierarchy of semantic components: taxeme, domain, dimension (see section 2.11 on the interpretive semantics of Rastier). The second, presented by Jacqueline Picoche, falls within the context of Gustave Guillaume’s psychomechanics and proposes lexical-semantic fields [PIC 77, PIC 86].
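The many-to-many relation between words and lexical fields (a field groups many words, and a polysemous word such as operation belongs to several fields) can be sketched as a simple data structure. The field inventories below are illustrative, partly taken from the examples above and partly invented:

```python
# Illustrative sketch: lexical fields as word sets, so that a polysemous
# word such as "operation" can belong to several fields at once.
# The membership lists are examples, not an exhaustive lexicon.
lexical_fields = {
    "war": {"raid", "anti-tank", "armored vehicle", "missile",
            "machine gun", "operation"},
    "mathematics": {"addition", "matrix", "theorem", "operation"},
    "medicine": {"surgeon", "anesthesia", "diagnosis", "operation"},
}

def fields_of(word):
    """Return the lexical fields a word belongs to (several if polysemous)."""
    return sorted(f for f, words in lexical_fields.items() if word in words)

print(fields_of("operation"))  # -> ['mathematics', 'medicine', 'war']
print(fields_of("missile"))    # -> ['war']
```

The inverse query, listing the words of a field, is simply a lookup in the dictionary; polysemy shows up as a word appearing under more than one key.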
As underscored in [CRU 00], the relations between the terms in a field are hierarchical. They follow the diagram shown in Figure 1.1.
Figure 1.1.General diagram of lexical hierarchies in a field [CRU 00]
As can be seen in Figure 1.1, two types of relations emerge from lexical hierarchies: relations of dominance, like the relationships between A and (B, C) or B and (D, E), and relations of differentiation, such as the relationships between B and C or F and G. From a formal perspective, these trees are directed acyclic graphs (there is no path whose point of departure is also its point of arrival). In other words, if there is a link between two points x and y, then there is no link in the inverse direction2. Furthermore, each node has a single element that immediately dominates it, called the parent node, and may itself have one or more child nodes.
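In computational terms, such a hierarchy can be modeled as a parent map over which dominance and differentiation reduce to simple traversals. A minimal sketch, using the abstract node labels of Figure 1.1 (the mapping itself is illustrative):

```python
# Each node maps to its single parent, as in an acyclic tree.
PARENT = {"B": "A", "C": "A", "D": "B", "E": "B", "F": "C", "G": "C"}

def dominates(x, y):
    """True if x dominates y, i.e. x is an ancestor of y in the tree."""
    while y in PARENT:
        y = PARENT[y]
        if y == x:
            return True
    return False

def differentiated(x, y):
    """True if x and y are siblings: same parent, hence contrasted."""
    return x != y and PARENT.get(x) == PARENT.get(y)

print(dominates("A", "D"))       # dominance: A over B, hence over D -> True
print(differentiated("F", "G"))  # differentiation: same level under C -> True
```

Because the graph is acyclic and each node has a single parent, `dominates` terminates as soon as the walk to the root passes through (or misses) the candidate ancestor.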
In lexical hierarchies, the symbols A, B, …G correspond to lexical items. Cruse distinguishes between two types of hierarchies: taxonomic, or classificatory, hierarchies and meronymic hierarchies.
These hierarchies reflect the categorization of objects in the real world by members of a given linguistic community. First, consider the example of the classification of animals presented in Figure 1.2.
Figure 1.2.Partial taxonomy of animals
In a taxonomic hierarchy, the element at the higher classification level, or the parent, is called the hyperonym and the lower element, the child, is called the hyponym. Thus, animal is the hyperonym of fish and Felidae is the hyponym of carnivore. Hyperonyms and hyponyms mark levels of genericity or precision, as in the following exchange [1.2].
Did you buy apples at the market? [1.2]
Yes, I bought a kilogram of Golden Delicious.
In exchange [1.2], Golden Delicious, hyponym of apple, is used here to give more specific information in response to the question. The inverse process could be used to hide some of the information.
The root is the basic element of the tree. It is distinguished by greater levels of genericity and abstraction than all other elements of the tree. Often, it is not a concrete object, but rather a set of features shared by all of the words in the group. In the example in Figure 1.2, the root element animal cannot be associated with a visual image or a given behavior. It is also important to note that the number of levels can vary considerably from one domain to another. According to [CRU 00], taxonomies related to daily life such as dishes and appliances tend to be flatter than taxonomies that pertain to the scientific domains. Some research has indicated that the depth of daily life taxonomies does not extend past six levels. Obviously, the depth of the tree depends on the genericity and the detail of the description. For example, a level above animal can be added if an expansion of the description is desired. Similarly, we can refine the analysis by adding levels that correspond to types of cats: with or without fur, domestic or wild, with or without tails, etc. There is a certain amount of subjectivity involved in taxonomic descriptions due to the level of knowledge of the domain as well as the objectives of the description.
Finally, it is also useful to mention that certain approaches, especially those of the structural current, prefer to expand the tree with distinctive features that make it possible to differentiate elements on the same level. For instance, the features [+vertebral column] and [–vertebral column] could be attached to vertebrate and invertebrate, respectively. Similarly, the features [aquatic] and [cutaneous respiration] can be used to distinguish fish from amphibians.
Meronymic and holonymic relations are the lexical equivalents of the relationship between an object and its components: the components and the composite. In other words, they are based on relations like part of or composed of. In a meronymic tree, the parent of an element is its holonym and the child of an element is its meronym.
Some modeling languages, like the Unified Modeling Language (UML), distinguish between two types of composition: a strong composition and a weak composition. Strong composition concerns elements that are indispensable to an entity, while weak composition pertains to accessories. For example, a car is not a car without wheels and an engine (strong composition), but many cars exist that do not have air conditioning or a radio (weak composition). This leads to another distinction between strong meronymy and weak meronymy. In the case of strong meronymy, the parts form an indissociable entity. Weak meronymy connects objects that can be totally independent but form an assorted set. For example, a suit must be made up of trousers and a jacket (strong composition). Sometimes, there is also a vest (weak composition). For various reasons, the trousers can be worn independently of the jacket and vice versa. However, this kind of freedom is not observed concerning the wheel or the engine of a car, which cannot be used independently of the car, the entity they compose.
An interesting point to mention concerning the modeling of these relations is the number of entities involved in the composition relation, both on the side of the components and on the side of the composites, which are commonly called the multiplicity and the cardinality of the relation, respectively. Thus, it is worth mentioning that a human body is composed of a single heart and that any one particular heart only belongs to one body at a time, in a one-to-one relation. Similarly, the body has a one-to-two cardinal relationship with eyes, hands, feet, cheeks, etc. The cardinal relationship between a car and a wheel is one-to-many, because a car has several wheels (four or sometimes more).
Figure 1.3 presents a hierarchy of body parts with a breakdown of the components of the head.
Figure 1.3.Meronymic hierarchy of the human body
Even more so than in the case of taxonomic hierarchies, there is no absolute rule in this kind of hierarchy to decide if an element is a part of an entity or not. For example, the neck could just as well be part of the head as part of the torso. The same goes for shoulders, which could be considered part of the arms or part of the torso.
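A meronymic hierarchy, together with the cardinality of each part-of relation, can be sketched as a nested mapping. The part inventories and counts below are simplified from Figure 1.3 and the car example, and are illustrative only:

```python
# Each whole maps to its meronyms, with the cardinality of the relation.
MERONYMS = {
    "body": {"head": 1, "heart": 1, "eye": 2, "hand": 2, "foot": 2},
    "head": {"nose": 1, "ear": 2, "cheek": 2},
    "car":  {"wheel": 4, "engine": 1},
}

def holonyms(part):
    """Return the wholes that list `part` among their meronyms."""
    return [whole for whole, parts in MERONYMS.items() if part in parts]

def cardinality(whole, part):
    """How many instances of `part` the `whole` is composed of."""
    return MERONYMS.get(whole, {}).get(part, 0)

print(holonyms("ear"))             # ['head']
print(cardinality("body", "eye"))  # one-to-two relation: 2
print(cardinality("car", "wheel")) # one-to-many relation: 4
```

The one-to-one relation between body and heart, the one-to-two relation with eyes, and the one-to-many relation between car and wheel are all captured by the stored counts.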
Homonymy is the relation that is established between two or more lexical items that have the same signifier but fundamentally different signifieds. For example, the verb lie in the sense: utter a false statement, and lie in the sense: to assume a horizontal position are homonyms because they share the same pronunciation and spelling even though there is no semantic link between them. There are also syntactic differences between the two verbs, as they require different prepositions to introduce their objects (lie to someone and lie in). In addition to rare cases of total homonymy, two forms of partial homonymy can be distinguished: homophony and homography.
Homophony is the relation that is established between two words that have different meanings but an identical pronunciation. Homophones can be in the same grammatical category, such as the nouns air and heir that are pronounced [er], or from different categories, like the verb flew and the nouns flue or flu that are pronounced [flü].
Homography is the relationship between two semantically, and sometimes syntactically, different words that have an identical spelling. For example, bass [beɪs] as in: Bass is the lowest part of the musical range, and bass [bæs] as in: Bass is a bony fish are two homographs. Note that when homonymy extends beyond the word, as in: she cannot bear children, this is often referred to by the term ambiguity.
Polysemy designates the property of a lexical item having multiple meanings. For example, the word glass has, among others, the following meanings: vitreous material, liquid or drink contained in a glass, a vessel to drink from and lenses. Once in a specific context, polysemy tends to disappear or at least to be reduced, as in these sentences [1.3]:
John wants to drink a glass of water. [1.3]
John bought new glasses.
Mary offered a crystal glass to her best friend.
It should also be noted that polysemy sometimes entails a change in the syntactic behavior of a word. To illustrate this difference, consider the different uses of the word mouton (sheep in French) presented in Table 1.4.
In the sentences presented in Table 1.4, the syntactic behavior of the word mouton varies according to the semantic changes.
Polysemy enters into a double opposition: on one side with monosemic units, and on the other with homonymic units.
Table 1.4. Examples of polysemy

Jean a attrapé un petit mouton. / Jean caught a little sheep. (animal; DOC)
Jean cuisine/mange du mouton. / Jean cooks/eats sheep. (meat; IOC)
Jean possède une vieille veste de mouton. / Jean has an old jacket made of sheep leather/skin. (leather/skin; noun complement)
The first opposition is with monosemic lexical units that have a single meaning in all possible contexts. These are rare, and are often technical terms like: hepatology, arteriosclerosis and hypertension. Nouns used to designate species also have a tendency to be monosemic in their use outside of idiomatic expressions: rhinoceros, aralia, adalia, etc.
The second opposition, fundamental in lexicology and lexicography, is between homonymy and polysemy. The main question is: what criteria can be used to judge whether we are dealing with a polysemic lexical item or a pair of homonyms? The usual criterion is the semantic distance perceived by speakers of the language. If speakers still perceive a link between the different meanings, the item is treated as a single polysemic entry. If, on the other hand, this link is no longer discernible, the original polysemy is considered to have fractured, leaving in its place two different lexical entries that have a homonymic relationship. The issue with this criterion is that it leaves a great deal to subjectivity, which results in different treatments. In dictionaries, polysemy is presented in the form of different meanings for the same term, while distinct entries are reserved for homonyms. For example, the grapheme bear is presented under two different entries in the Merriam-Webster dictionary3: one for the noun (the animal) and one for the verb to move while holding up something. On the other hand, there is one entry for the word car with three different meanings (polysemy): a vehicle moving on wheels, the passenger compartment of an elevator, and the part of an airship or balloon that carries the passengers and cargo.
It should be noted that ambiguity can be seen as the other side of polysemy. In her book Les ambiguïtés du français, Catherine Fuchs considers that polysemy can also concern extra-lexical levels such as the sentence [FUC 96]. For example, in the sentence: I saw the black magic performer, the adjective black qualifies either the performer or magic.
Synonymy connects lexical items of the same grammatical category that have the same meaning. More formally, in cases of synonymy, two signifiers from the same grammatical category are associated with the same signified. Synonymy exists in all languages around the world and corresponds to semantic overlap between lexical items. It is indispensable, particularly for style and quality. One way to determine synonymy is to use the method of interchangeability or substitution.
If two words are interchangeable in all possible contexts, then they are said to be a case of total or absolute synonymy. Rather rare, it especially concerns pairs of words that can be considered morphological variants, such as ophtalmologue/ophtalmologiste (ophthalmologist in French) or she is sitting/she is seated.
Partial synonymy occurs in cases of interchangeability limited to certain contexts. For instance, consider these pairs: car/automobile, peril/danger, risk/danger, courage/bravery and distinguish/differentiate. These pairs are interchangeable in certain contexts (common contexts) and not in others (distinctive contexts) (see Table 1.5). Polysemy constitutes the primary source of this limit on interchangeability, because words often have several meanings, each of which is realized in a precise context where it is synonymous with one or several other words.
Table 1.5. Examples of partial synonyms

John drives (the car/gondola). (common context)
John drives his car to go to work. / John drives his gondola to go to work. ? (distinctive context)
He wants to keep the company safe from all (dangers/perils/risks). (common context)
He lives in fear of the Yellow Peril. / He lives in fear of the yellow danger/risk. ? (distinctive context)
It is not easy to (differentiate/distinguish) him from his brother. (common context)
The Goncourt Prize distinguished this extraordinary novel. / The Goncourt Prize differentiated this extraordinary novel. ? (distinctive context)
The use of a lexical unit by a particular socio-cultural category can add a socio-semantic dimension to this unit, according to the terms of [MIT 76], which is then differentiated by other synonyms. For example, the following pairs are synonyms, but are distinguished by a different social usage (familiar or vulgar vs. formal): guy/man, yucky/disgusting, boring/tiresome. Geo-linguistic factors also play a role. For example, in the east of France, the lexical unit pair of can be synonymous with the number two as in a pair of birds or a pair of shoes [BAY 00]. In everyday use, these words are not synonyms: a pair of glasses is not the same as two glasses.
Sometimes two lexical units can have the same denotation but two different connotations. For example, an obese woman and a fat woman both designate someone of the female sex who suffers from excessive weight but the phrases nevertheless have different connotations.
The use of a word in the context of a locution or a fixation is one of the reasons that limit its synonymic relations. For example, the word risk used in locutions such as at risk or at risk of makes it non-substitutable in these locutions with words that are otherwise its synonyms, like danger and peril. Similarly, the words baked and warmed that are synonyms in a context like the sun baked the land/warmed the land are no longer synonyms when baked is used in a fixed expression like we baked the cake.
Finally, synonymy does not necessarily imply a parallelism between two words. The French nouns éloge and louange (praise) are synonyms and so are the adjectives louangeur and élogieux (laudatory). As the morphological nature of these two words is different, the parallelism does not extend to the verbal form, given that the verb élogier does not exist in French to be the synonym of the verb louanger.
The semantic nature of some lexical units logically involves a certain form of opposition. This makes opposition a phenomenon that is universally shared by all languages in the world. However, the definition of this relation is not simple. In fact, several forms of oppositions exist and, to determine a type, logical and linguistic criteria are often used.
The simplest form of opposition is called binary, polar or privative opposition. This concerns cases where there is no gradation possible between the opposed words. For example, between dead and alive, there is no intermediary state (zombie being a purely fictional state).
Oppositions that are gradable or scalar are distinguished by the existence of at least one intermediary or middle state. The opposition between the pairs long/short, hot/cold and fast/slow is gradable and allows for a theoretically infinite number of intermediary states.
To distinguish these two forms of opposition from other forms, a logical test can be applied that consists of a double negation according to these two rules:

Rule 1: A → ¬B
Rule 2: ¬B → A

where A and B are two antonyms, → is a logical implication (A → B is read as if A then B) and ¬ is the negation symbol (¬B is read as not B). Applied to the pair open/closed, these rules provide the following inferences:
open → ¬ closed (if a door is open then it is not closed).
¬ closed → open (if a door is not closed then it is open).
Gradable oppositions do not validate the second rule. If a car is not slow, that does not necessarily mean that it is fast (it could be in any one of innumerable intermediary states).
Pairs that designate equivalent concepts such as days of the week, months and metrological units can only validate the first rule:
April → ¬ July (if it is April, then it is not July)
¬ July → April *
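Both rules can be checked mechanically by modeling an opposition as two sets of states drawn from a common state space. A minimal sketch (the state spaces are illustrative):

```python
# Rule 1 (A -> not B) and Rule 2 (not B -> A) over explicit state sets.
def rule1(A, B):
    """A -> ¬B: every A-state lies outside B."""
    return all(s not in B for s in A)

def rule2(states, A, B):
    """¬B -> A: every state outside B lies inside A."""
    return all(s in A for s in states if s not in B)

# Binary (privative) opposition: dead/alive exhausts the state space,
# so both rules hold.
life = {"dead", "alive"}
print(rule1({"dead"}, {"alive"}), rule2(life, {"dead"}, {"alive"}))  # True True

# Equivalence-class pairs (April/July) validate only the first rule:
# a month that is not July is not necessarily April.
MONTHS = {"January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"}
print(rule1({"April"}, {"July"}), rule2(MONTHS, {"April"}, {"July"}))  # True False
```

The same machinery distinguishes binary from non-binary oppositions: only when the two terms jointly exhaust the state space do both rules come out true.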
From a linguistic point of view, gradable adjectives are recognized by the possibility of placing an intensifier in front of them or of using them as comparatives or superlatives. Adjectives such as small, intelligent and fast can often be used with an intensifier, as in: very fast, fairly intelligent and too small. They can also be employed as comparatives and superlatives, as in: the most intelligent, as fast as. Some linguists introduce degrees of nuance to the two major forms of opposition that we have just discussed. For example, the oppositions male/female, man/woman and interior/exterior are traditionally considered to be relations of complementarity. Some prefer to call the two extremes of a gradable opposition antipodes (peak/foot).
To visually represent the relations of opposition, [GRE 68] proposed the semiotic square. This is a process that makes it possible to logically analyze oppositions by considering logically possible classes that result from a binary opposition (see Figure 1.4).
Figure 1.4.General structure of a semiotic square
Thus, the man/woman opposition can give rise to the classes: man, woman, man and woman (hermaphrodite or androgyne), neither man nor woman (person suffering from genital deformation). This can produce the semiotic square presented in Figure 1.5. [HEB 12].
Figure 1.5.Example of a semiotic square for feminine/masculine
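The four classes generated by a binary opposition correspond to the four truth-value combinations of the two terms, so the square's general structure can be enumerated directly. A small sketch (the labels are illustrative):

```python
from itertools import product

def semiotic_square(s1, s2):
    """Enumerate the four logically possible classes of the opposition
    s1/s2: s1 alone, s2 alone, both, and neither (see Figure 1.4)."""
    classes = {}
    for has1, has2 in product([True, False], repeat=2):
        if has1 and has2:
            label = f"both {s1} and {s2}"
        elif has1:
            label = s1
        elif has2:
            label = s2
        else:
            label = f"neither {s1} nor {s2}"
        classes[label] = (has1, has2)
    return classes

square = semiotic_square("masculine", "feminine")
for label in square:
    print(label)
```

For masculine/feminine, this yields the four classes of Figure 1.5: masculine, feminine, both (hermaphrodite or androgyne) and neither.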
Finally, as opposition is essentially a relation between signifieds, it is naturally affected by polysemy. The same lexical item can have several antonyms according to its different significations.
This is a relationship between two or more words that resemble each other phonetically and/or in terms of spelling without necessarily having a semantic relation. There are several cases of this type, including: affect (to act physically on something) and effect (a phenomenon that follows and is caused by a previous phenomenon); desert (arid land) and dessert (a dish); and just and justice. Note that some paronyms are common sources of errors in language use, as in the pair: diffuse (to spread) and defuse (to reduce danger or tension).
Initially proposed by [FEL 90a], this relation pertains to the classification of verbs according to the genericity of the events they express: communicate > speak > whisper > yell. These relations can be expressed as manner_of: poisoning is a manner of killing and running is a manner of moving.
Categorization is a fundamental cognitive process. It is a means of grouping different entities under the same label or category. Studying this process necessarily involves a good understanding of the nature of categories that humans use. Are they objective categories that depend on the nature of the objects that they represent? Or, on the contrary, are they subjective categories whose existence depends on a community of agents4? Two approaches attempt to shed light on these categories: the Aristotelian approach and the prototype approach.
The Aristotelian approach, sometimes called the classical approach or the necessary and sufficient conditions approach, stipulates that the properties shared by the entities in question are the basis of the grouping. So, to determine whether a given entity belongs to a given category, it must possess the properties of this category (necessary conditions) and it must possess enough of them to belong to it (sufficient conditions). At the end of this process, a binary decision is made: the element either belongs to the category in question or not. For example, in order for an entity X to be classified in the category man, the following conditions must be met:
– X is a human
– X is a male (sex/gender)
– X is an adult
If at least one of these conditions is not met, then X is not a man because the conditions are necessary individually. Inversely, we can also logically infer all of the conditions starting from the category: from the sentence John is a man, we can infer that John is a human, John is a male and John is an adult. If all of these conditions are satisfied, then the entity X is categorized as a man because all of these conditions are sufficient for the categorization. In other words, from these criteria, we can infer that he is a man and nothing else.
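This decision procedure translates directly into a set-inclusion test: the conditions are individually necessary and jointly sufficient, so membership is a binary check. A minimal sketch for the category man under the three conditions above:

```python
# Necessary and sufficient conditions for the category 'man'.
MAN_CONDITIONS = {"human", "male", "adult"}

def is_man(properties):
    """Necessity: every condition must hold.
    Sufficiency: holding them all yields a positive, binary decision."""
    return MAN_CONDITIONS.issubset(properties)

print(is_man({"human", "male", "adult", "tall"}))  # True: all conditions met
print(is_man({"human", "male"}))                   # False: 'adult' missing
```

Extra properties (tall) do not affect the decision; a single missing condition is enough to exclude the entity, reflecting the rigid category borders of the Aristotelian approach.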
The Aristotelian approach, where the borders between categories are rigid, has been challenged by several works in philosophy and psychology. In his philosophical investigations, the Austrian philosopher Ludwig Wittgenstein from the analytical school showed that it is impossible to define a concept as banal as a game in terms of necessary and sufficient conditions. There is at least one case where the following intuitive conditions are not valid:
– Involves physical activity: chess and video games are well-known examples of non-physical games.
– There is always a winner and a loser: many video games are designed in terms of steps and/or points. The notions of victory and loss are not relevant in these cases.
– For leisure: there are professional sports players.
Wittgenstein concluded that categories ought to be described in terms of family resemblances. In this view, the members of a family are all similar without necessarily sharing a single common feature. In other words, the connections between the members of a given category resemble a chain where shared features are observed at a local level. Another philosopher from the analytic tradition, the American Hilary Putnam, proposed a similar model known as the semantics of the stereotype.
To represent the lexical meaning, several linguists, starting with the Dane Louis Hjelmslev [HJE 43], have adopted in various forms a componential analysis of the meaning of words using features similar to those used in phonology. As emphasized in [KLE 90], the features have a role similar to that of the necessary and sufficient conditions of the Aristotelian approach.
Bernard Pottier, main defender of componential analysis in France, gave an example of this kind of analysis, which is presented in Table 1.6 [POT 64].
Table 1.6. A semic analysis of the field of the chair according to Pottier

Word       For sitting   Rigid material   For one person   Has feet   With backrest   With arms
Seat            +              -                -              -             -             -
Chair           +              +                +              +             +             -
Armchair        +              +                +              +             +             +
Stool           +              +                +              +             -             -
Sofa            +              +                -              +             +             -
Pouffe          +              -                +              -             -             -
In the example given in Table 1.6, each line represents a sememe, which consists of the set of semes of a word. The seme in the first column, for sitting, is shared by all of the words in the table. Pottier proposed the term classeme for the group of semes that, like for sitting, serve to characterize the whole class. The sememe of the word seat is the least restrictive: only the classeme is required, because the word is the hyperonym of all of the other words in the table.
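Componential analysis lends itself naturally to set operations: the classeme is the intersection of all sememes in the field, and distinctive semes fall out as symmetric differences. A minimal sketch using the values of Table 1.6:

```python
# Each word's sememe is the set of semes it carries (+ cells in Table 1.6).
SEMEMES = {
    "seat":     {"for sitting"},
    "chair":    {"for sitting", "rigid material", "for one person",
                 "has feet", "with backrest"},
    "armchair": {"for sitting", "rigid material", "for one person",
                 "has feet", "with backrest", "with arms"},
    "stool":    {"for sitting", "rigid material", "for one person",
                 "has feet"},
    "sofa":     {"for sitting", "rigid material", "has feet",
                 "with backrest"},
    "pouffe":   {"for sitting", "for one person"},
}

# The classeme is the set of semes shared by every word in the field.
classeme = set.intersection(*SEMEMES.values())
print(classeme)  # {'for sitting'}

def distinctive_semes(w1, w2):
    """Semes that differentiate w1 from w2 (symmetric difference)."""
    return SEMEMES[w1] ^ SEMEMES[w2]

print(distinctive_semes("chair", "armchair"))  # {'with arms'}
```

The hyperonym status of seat also falls out of this representation: its sememe (the classeme alone) is a subset of every other sememe in the field.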
Situated at a more general level, the representation of lexical information quickly becomes more complex. The class of seat itself belongs to the more general class of furniture. In turn, furniture belongs to higher classes such as manufactured objects, and objects in general. Similarly, the class of armchairs includes several subclasses such as the wing chair, Voltaire chair and club chair that each has a set of semes that distinguish them from the set of neighboring or encompassing classes (hyperonyms). This means that a large number of new semes must be added to the semes identified by Pottier himself in order to account for these relations. The word seat itself can be employed figuratively in ways that are different from the ordinary usage, such as the seat of UNESCO is in Paris. To account for these uses, Pottier admitted the existence of particular semes, called virtuemes, that are activated in particular cases.
As highlighted in [CRU 00], the principle of compositionality is far from universal. There are phenomena that are an exception to this principle. This includes fixed expressions like kicked the bucket, a piece of cake and porte-manteau as well as the metaphors the ball is in John’s court, to weave a tangled web and to perform without a safety net. There are no objective rules that make it possible to decide which features should be included in a linguistic description. The amount of detail in the descriptions often depends on the specific objectives of each project. This considerably limits the reusability of these works. This point is all the more problematic because the practical implementation of them requires a considerable amount of work.
The ideas of Wittgenstein and Putnam were taken up and developed by the American psychologist Eleanor Rosch and her collaborators who proposed the prototype-based approach commonly called prototype semantics [ROS 73, ROS 75, ROS 78, KLE 90, DUB 91]. According to Rosch, categorization is a structured process that is based on two principles: the principle of cognitive economy and the principle of the structure of the perceived world.
According to the principle of cognitive economy, humans attempt to gain the maximum possible information about their environment while keeping their cognitive efforts and resources to a minimum. Categories serve to group different entities or stimuli under a single label contributing to the economy of the cognitive representation.
The principle of the structure of the perceived world stipulates that the world has correlational structures. For example, carnivore is more often associated with teeth than with the possibility of living in a particular zone like the tropics or the North Pole. Structures of this type are used to form and organize categories.
These two principles are the basis for a cognitive system of categorization that has a double dimension: a vertical dimension and a horizontal dimension.
The vertical dimension emerged from Rosch’s work concerning the level of inclusion of objects in a hierarchy of categories connected by a relation of inclusion [ROS 76]. For example, the category mammal is more inclusive than the category cat because it includes, among others, entities such as dog, whale and monkey. Similarly, the category cat, which includes several breeds is more inclusive than Chartreux or Angora.
According to Rosch, the levels of inclusion or abstraction are not cognitively equivalent. The experiments that she conducted with her collaborators showed that there is a level of inclusion that best satisfies the principle of cognitive economy. This level of inclusion is called the base level. It is located at a middle level of detail between, on the one hand, a higher level like mammal and vehicle and, on the other hand, a subordinate level like Chartreux and sedan. Her work also showed that this base level has several properties that make it cognitively prominent. Among other things, they showed that the base level is the one at which subjects are most at ease providing attributes. They also showed that the words corresponding to the base level are the first to emerge in the vocabulary, thus proving their primacy in the process of acquisition. Rosch and her collaborators considered that the primacy of the base level affects the very structure of language, because the words corresponding to the base level are generally simple and monolexical, like chair, dog and car.
