Introduction to Experimental Linguistics - Christelle Gillioz - E-Book

Description

The use of experimental methodology in the field of linguistics has boomed in recent decades. However, implementation of such methods does require an understanding and mastery of specific theoretical and methodological principles. Introduction to Experimental Linguistics presents the key concepts of experimental linguistics in an accessible way, addressing, in turn: the application of experimentation in linguistics; the techniques most frequently used for the study of language; the methodological and practical aspects useful for the implementation of an experiment; and an introduction to the analysis of quantitative data derived from experiments. This didactic book combines the elements presented with examples drawn from the various fields of linguistics. It also includes a number of resources available for people who wish to implement an experimental study, more advanced reading suggestions, and revision questions along with their answer key.

Number of pages: 451

Year of publication: 2020

Table of Contents

Cover

Title page

Copyright

Preface

1 Experimental Linguistics: General Principles

1.1. The scientific process

1.2. Characteristics of experimental research

1.3. Types of experiment in experimental linguistics

1.4. Advantages and disadvantages of experimental linguistics

1.5. Where to access research on experimental linguistics

1.6. Conclusion

1.7. Revision questions and answer key

1.8. Further reading

2 Building a Valid and Reliable Experiment

2.1. Validity and reliability of an experiment

2.2. Independent and dependent variables

2.3. Different measurement scales for variables

2.4. Operationalizing variables

2.5. Choosing a measure for every variable

2.6. Notions of reliability and validity of measurements

2.7. Choosing the modalities of independent variables

2.8. Identifying and controlling external and confounding variables

2.9. Conclusion

2.10. Revision questions and answer key

3 Studying Linguistic Productions

3.1. Differences between language comprehension and language production

3.2. Corpora and experiments as tools for studying production

3.3. Free elicitation tasks

3.4. Constrained elicitation tasks

3.5. Repetition tasks

3.6. Conclusion

3.7. Revision questions and answer key

3.8. Further reading

4 Offline Methods for Studying Language Comprehension

4.1. Explicit tasks

4.2. Implicit tasks

4.3. Conclusion

4.4. Revision questions and answer key

4.5. Further reading

5 Online Methods for Studying Language Comprehension

5.1. Think-aloud protocols

5.2. Using time as an indicator of comprehension

5.3. Priming

5.4. Lexical decision tasks

5.5. Naming tasks

5.6. Stroop task

5.7. Verification task

5.8. The self-paced reading paradigm

5.9. Eye-tracking

5.10. The visual world paradigm

5.11. Conclusion

5.12. Revision questions and answer key

5.13. Further reading

6 Practical Aspects for Designing an Experiment

6.1. Searching scientific literature and getting access to bibliographic resources

6.2. Conceptualizing and formulating the research hypothesis

6.3. Choosing the experimental design

6.4. Building the experimental material

6.5. Building the experiment

6.6. Data collection

6.7. Ethical principles

6.8. Conclusion

6.9. Revision questions and answer key

6.10. Further reading

7 Introduction to Quantitative Data Processing and Analysis

7.1. Preliminary observations

7.2. Raw data organization

7.3. Raw data processing

7.4. The concept of distribution

7.5. Descriptive statistics

7.6. Linear models

7.7. Basic principles of inferential statistics

7.8. Types of statistical effects

7.9. Conventional procedures for testing the effects of independent variables

7.10. Mixed linear models

7.11. Best-practices for collecting and modeling data

7.12. Conclusion

7.13. Revision questions and answer key

7.14. Further reading

References

Index

End User License Agreement

List of Illustrations

Chapter 2

Figure 2.1.

Illustrations of nominal, ordinal, interval and ratio scales

Chapter 4

Figure 4.1.

Examples of situations presented in Coventry et al. (2001)

Chapter 5

Figure 5.1.

Example of fictitious steps involved in simple (a) or choice (b) rea...

Figure 5.2.

Illustrations of eye movements during reading. The circles correspon...

Chapter 6

Figure 6.1.

Diagram of the operational hypothesis and examples of external varia...

Figure 6.2.

Illustrations of experimental trials in different tasks in experimen...

Chapter 7

Figure 7.1.

Histogram representing the distribution of data acquired in an exper...

Figure 7.2.

Normal distribution, with a mean of 0 and a standard deviation of 1

Figure 7.3.

Graphic illustrations of decision times by participant #1 and the me...

Figure 7.4.

Histograms representing the decision times obtained in the different...

Figure 7.5.

Mean decision times of participants in high and low frequency condit...

Figure 7.6.

Illustrations of interaction effects for a design with two independe...

Figure 7.7.

Reaction times distribution for participants (panel (a)) and for ite...

Figure 7.8.

Decision times from participants 1, 2 and 5. Each point corresponds ...

Tables

Chapter 6

Table 6.1.

Combination possibilities for 4 conditions in a Latin square design

Table 6.2.

Combination of independent variable modalities for creating condition...

Table 6.3.

List possibilities for an experiment

Chapter 7

Table 7.1.

Example of a database describing the characteristics of participants

Table 7.2a.

Example of a long format database (where one row corresponds to one ...

Table 7.2b.

Example of a long format database (where one row corresponds to one ...

Table 7.3.

Examples of participants' mean decision times in the two frequency co...

Table 7.4.

Examples of decision time means for items, depending on their frequen...

Introduction to Experimental Linguistics

Christelle Gillioz

Sandrine Zufferey

First published 2020 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd

27-37 St George’s Road

London SW19 4EU

UK

www.iste.co.uk

John Wiley & Sons, Inc.

111 River Street

Hoboken, NJ 07030

USA

www.wiley.com

© ISTE Ltd 2020

The rights of Christelle Gillioz and Sandrine Zufferey to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Control Number: 2020943938

British Library Cataloguing-in-Publication Data

A CIP record for this book is available from the British Library

ISBN 978-1-78630-418-6

Preface

This book aims to present the theoretical and methodological principles of experimental linguistics in an accessible manner. It intends to offer an overall vision of the field, so as to help the non-initiated audience to become familiar with the necessary concepts for carrying out linguistic experiments. The elements discussed in this book can particularly serve as a basis for a critical understanding of the results published in the scientific literature and as a starting point for carrying out experiments.

Since the field of experimental linguistics is rich and varied, both in terms of the phenomena studied and of the methods employed, it is impossible to offer an exhaustive presentation. The choice of aspects introduced in this book aims to provide an overview of the different possibilities available to those wishing to carry out an experimental study about language. For every aspect developed in the chapters of the book, there exist specific works which, due to their complexity and the prerequisites they demand, are often reserved for an expert audience. This is why we have deliberately chosen to select the information we deem essential for building a knowledge base that will later enable readers to explore the scientific literature and other works on this topic. Therefore, the emphasis will be placed on understanding the scientific approach and the methodological principles underlying the construction of experiments, and on analyzing the data that result from these experiments. With regard to research methods, we have chosen to present the methods that are most accessible to linguists. In order to illustrate the many possibilities for applying such methods, we have provided examples drawn from different fields in linguistics. Finally, a list of more specific resources and available tools is provided at the end of each chapter, in order to encourage the interested reader to deepen and put into practice the knowledge acquired in this book.

This book begins with an introductory chapter, offering a general overview of the principles underlying experimental methodology, as well as the key concepts which will be developed in the rest of the chapters.

Chapter 2 goes through the various points the researcher should comply with in order to conduct a valid and reliable experiment, thus making it possible to infer solid conclusions. First, we will define the concepts of validity and reliability and then discuss the notion of variables, as well as present different options for measuring such variables. We will pay special attention to the stages involved in the transformation of the research question into an experimentally testable hypothesis.

Chapters 3–5 are dedicated to the different methods used for studying language production (Chapter 3) and language comprehension, focusing not only on the results of the comprehension process (Chapter 4), but also on the process itself (Chapter 5).

Chapter 6 presents the main practical aspects associated with the construction of an experiment, such as the various possibilities offered by different types of experimental designs, the criteria for choosing the experimental material, the stages involved in an experiment, the aspects related to data collection, as well as the ethical principles that should be observed while carrying out research with human participants.

Finally, Chapter 7 offers an introduction to the analysis of quantitative data, aiming to summarize the key elements for understanding descriptive and inferential statistics, as found in the scientific literature devoted to experimental linguistics. This chapter will also emphasize the peculiarities of the data acquired through linguistic experiments, namely the interdependence of observations. Then, we will introduce mixed linear models that can be used to analyze such types of data.

Christelle GILLIOZ

Sandrine ZUFFEREY

August 2020

1 Experimental Linguistics: General Principles

We start this chapter by outlining the foundations of the experimental methodology and its main features. Then, we discuss the advantages and disadvantages of this type of methodology, as well as the main arguments in favor of its use in the field of linguistics. Last, we present a series of resources offering access to research in experimental linguistics.

1.1. The scientific process

The experimental methodology in linguistics is part of a scientific approach for studying language. It aims to observe language facts from an objective and quantitative point of view. The general idea behind this approach is that it is impossible to rely on one’s own intuitions in order to understand the world. Quite the contrary, it is necessary to observe objective data reflecting reality. For example, by simply observing the world around us, and relying solely on our own intuition, we might believe that the Earth is flat. This is why the scientific approach, used in fields such as psychology or physics, is based on specific principles and stages, instead of relying on the intuition of scientists. Let us briefly go through these stages:

The first stage in the scientific process involves the observation of concrete phenomena and the subsequent generalization of observations, in order to build a scientific fact: a fact which does not depend on a specific place, time, object or person. At this first stage, it is also possible to trace certain regularities concerning the emergence of a phenomenon, and to try to define the conditions in which such a phenomenon generally appears. Let us illustrate this process by reviewing the stages involved in the discovery of gravitation. This finding is usually attributed to Isaac Newton, who is said to have had a revelation after seeing several apples fall from a tree. As he watched the apples fall, Newton wondered why they always fell straight down from the tree to the ground, never to the side or upwards.

During the second stage, all of the scientific facts concerning the same phenomenon may prompt the development of a law or theory aimed at explaining such facts. A theory synthesizes knowledge about a phenomenon at a given moment and is therefore provisional, insofar as it can evolve according to new knowledge. We should make it clear that the notion of theory in science is rather distant from the meaning of the word theory as we use it in everyday language. While this word can be used to refer to personal ideas or reasoning mechanisms, its use in the scientific field only applies to coherent and well-established principles or explanations. Going back to our example, in Newton’s time, two models coexisted for describing the movement of bodies: one followed Galileo’s law and was devoted to terrestrial bodies, whereas the other was oriented by Kepler’s law and made reference to celestial bodies. On the basis of this knowledge and his own observations, Newton suggested the existence of a force which made objects attract one another and which could explain the movements of both celestial and terrestrial bodies.

At the third stage, a theory is capable of predicting the emergence of observable facts or, to put it differently, of formulating precise hypotheses which can be put to the test. In order to test these hypotheses, it is necessary to collect a large amount of data and check whether they support the initial theory. In this way, it is possible to know to what extent we can rely on our theory. The more the predictions made on the basis of the theory are fulfilled, that is, the more the data collected correspond to what might be expected according to the theory, the higher our confidence in the theory will be. Conversely, if the predictions do not come true, the theory should be called into question and re-examined. Newton’s law of universal gravitation has made it possible to predict and explain the movement of the tides thanks to the moon’s gravitational pull on the Earth, the elliptical movement of celestial bodies or the equatorial bulge.

In summary, the scientific approach is a circular and dynamic process, originating in the reality of the facts, abstracting itself from them in an attempt to explain them, and then approaching them again to check the validity of the explanation.

1.1.1. Qualitative and quantitative approaches

It is possible to investigate a research question in different ways and from different perspectives. Let us imagine that you wish to study second language acquisition within the context of linguistic immersion. The first way of doing this could be to contact students attending your university for a language stay and to interview them. These interviews can later be viewed to analyze the opinions of students regarding their experience during their stay, their feelings on its advantages and disadvantages, or their opinion on the impact of such a stay on their linguistic competences. By doing this, you would be carrying out what is called qualitative research.

The qualitative approach helps us to explore and understand a phenomenon by studying it in detail and trying to grasp it in a holistic manner, based on the meanings that people assign to the phenomenon. Conducting interviews and interpreting the results takes a long time; hence, only a small number of individuals can be questioned. Due to this characteristic, the results of a qualitative study are strongly anchored to the context in which the study was carried out, and cannot be generalized to other people or to other contexts. This is not a problem, insofar as qualitative studies do not aim to make such a generalization. The subjectivity of the individuals involved in the study is acknowledged as an integral part of qualitative research. This methodology is built on the principles of a constructivist vision of knowledge, according to which there is not only one, but many realities construed by people’s interpretations and the meanings they attribute to events or things, on the basis of their own experience.

When reading this first proposal for investigating second language acquisition within a context of linguistic immersion, you might think that although it may be interesting to know learners’ opinions about their experience in a language stay, you would also like to know more about the benefits of such a stay for the evolution of their linguistic competences. The conclusions drawn from the opinions of a few interviewees may not reflect the reality of all learners. It is possible that the interviewees subjectively overestimate or underestimate the evolution of their skills, or that these particular cases do not mirror the typical experience learners have during a language stay. One way to obtain more objective data on the advantages of a language stay for improving linguistic competences could be to take into account the experience of more people and to measure their linguistic competences at the start and end of the stay, for example, with an assessment test. By comparing the results before and after the stay with the help of a statistical test, you could determine whether the students’ linguistic skills have evolved and in what respect. If you chose this second option, your research would follow a quantitative methodology, in the sense that your conclusions would be drawn from the analysis of numerical data pertaining to a large number of people, and objectively assessed through a test. Your results would depend little on the respondents, their subjective perceptions or your interpretation of their declarations. If learners have really benefited from their language stay, this should be reflected in their results on the test, which would probably be higher at the end than at the beginning of the stay, and this is what you would measure directly.
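To make the before/after comparison concrete, here is a minimal sketch of one statistical test that could be used for such a design: a paired t-test, computed by hand with the Python standard library. The choice of test, the scores and the names (`scores_before`, `scores_after`, `paired_t_statistic`) are assumptions made for illustration only; they are not taken from the book.

```python
import math
import statistics

# Hypothetical assessment-test scores (0-100) for eight learners,
# invented for illustration: one score at the start and one at the end of the stay.
scores_before = [52, 61, 48, 70, 55, 66, 59, 63]
scores_after = [58, 65, 55, 74, 54, 72, 66, 70]


def paired_t_statistic(before, after):
    """Return (mean difference, t statistic, degrees of freedom) for paired scores."""
    if len(before) != len(after):
        raise ValueError("each learner needs one score before and one after")
    # Work on each learner's individual gain, since the two measures are paired.
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    # Sample standard deviation of the gains (n - 1 in the denominator).
    sd_diff = statistics.stdev(differences)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, t, n - 1


mean_diff, t_value, df = paired_t_statistic(scores_before, scores_after)
print(f"mean gain = {mean_diff:.2f}, t({df}) = {t_value:.2f}")
```

The t value would then be compared with the critical value of the t distribution for the given degrees of freedom in order to decide whether the mean gain differs reliably from zero.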

This example illustrates to what extent quantitative research differs from qualitative research, in that it aims to observe quantifiable elements and to measure a phenomenon. The techniques used for measuring a phenomenon can be extremely varied, depending on how the phenomenon is defined. Going back to our previous example, it is possible to measure language proficiency using a general language test (such as the placement tests used in language schools). Another way of doing this would be to count the number of mistakes students make in a grammar test or to measure the size of their second language lexicon. Choosing the proper measures for undertaking research is a big question in itself. We will return to this in Chapter 2, where we will discuss the different stages of choosing the measures involved in an experiment.

Quantitative research also differs from qualitative research in terms of the type of reasoning on which it is based. We have seen that in qualitative research, we draw upon data in order to outline a structure. In this case, data works as a source of interpretations and explanations upon which hypotheses will be formulated. This type of reasoning, starting from data and leading towards a theory, is called inductive reasoning. On the contrary, quantitative research follows deductive reasoning: it draws on theory in order to formulate hypotheses which will later be verified by data acquired in the field. When choosing a deductive approach, it is necessary to build a preliminary hypothesis, on which the research will be based and that will guide the researchers’ methodological choices.

Going back to the example of learners within an immersion context, there are a large number of hypotheses that could be formulated by using the link between language stay and language proficiency. The first hypothesis could be that a language stay improves second language skills. A second hypothesis, similar to the first, but involving a different research methodology, could be that people who have spent time on a language stay have acquired better skills than those who have not. In order to verify the second hypothesis, we would have to test two groups of learners who may or may not have benefited from linguistic immersion, instead of one group of students before and after the stay. A third hypothesis could focus on one specific aspect of language proficiency, such as pronunciation in a foreign language (accent). We might imagine that the learners who have spent some time on a language stay may have a better pronunciation (an accent closer to that of the native speakers), than those who have not. In order to test this third hypothesis, two groups of students would be required, but this time they would be assessed on their pronunciation.

Even if they differ in their formulation and in the type of elements they put to the test, the hypotheses mentioned above share a common feature, which is that they all postulate a relationship between what we call variables. In all the hypotheses, the first variable corresponds to linguistic immersion. In the first and second hypotheses, the second variable is the proficiency level in the second language. In the third hypothesis, the second variable corresponds to a weaker non-native accent. We will discuss the notion of variables in further detail in Chapter 2. For the time being, it is important to understand that a variable is something that varies, and can take different values. For example, a variable can be the age of participants in a study, which can take a wide range of values. Other variables could be the fact of wearing glasses, or eye color, etc. These variables take fewer values: either yes or no for wearing glasses, and blue, brown, green or other for eye color.

Let us now take the example of a variable studied in language science: bilingualism. At first glance, this variable may seem to take only two values: either bilingual or monolingual. However, things get more complicated when we have to define what we mean by bilingual. For example, we may decide that anyone having knowledge of a second language is bilingual. In that case, there would be great heterogeneity within the bilingual group, which would contain people who can only speak or understand a second language superficially, as well as people who have perfectly mastered both languages. A corollary of such a definition would be that very few people would belong to the monolingual group, since many people are familiar with one or more languages apart from their mother tongue. At the other extreme, we could consider that only those with a perfect command of their second language belong to the bilingual group. In this case, the bilingual group would be more homogeneous, in the sense that all those belonging to it would have similar competences in their second language. But this definition raises additional questions: what do we mean by perfect command and how can command be measured? This example illustrates the need to clearly and precisely define the variables investigated in a research process. This definition procedure is called the operationalization of a research question. It represents a crucial phase in quantitative research, and we will discuss it in depth in Chapter 2.
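The effect of the two definitions on group composition can be sketched in a few lines of code. The participants, the 0-10 proficiency scale and the two thresholds below are hypothetical choices made for illustration; the text itself does not prescribe any particular measure or cut-off.

```python
# Hypothetical second-language (L2) proficiency scores on an invented 0-10 scale.
l2_scores = {"P1": 0, "P2": 2, "P3": 9, "P4": 5, "P5": 10, "P6": 1, "P7": 0, "P8": 7}


def split_groups(scores, threshold):
    """Count a participant as bilingual if the L2 score is at least `threshold`."""
    bilingual = {p: s for p, s in scores.items() if s >= threshold}
    monolingual = {p: s for p, s in scores.items() if s < threshold}
    return bilingual, monolingual


def describe(label, threshold):
    """Print and return group sizes and the score range inside the bilingual group."""
    bilingual, monolingual = split_groups(l2_scores, threshold)
    spread = max(bilingual.values()) - min(bilingual.values())
    print(f"{label}: {len(bilingual)} bilingual, {len(monolingual)} monolingual, "
          f"score range within bilingual group = {spread}")
    return len(bilingual), len(monolingual), spread


# Broad definition: any knowledge of a second language.
broad = describe("any knowledge of an L2 (score >= 1)", threshold=1)
# Strict definition: near-perfect command only.
strict = describe("near-perfect command (score >= 9)", threshold=9)
```

With these invented scores, the broad definition yields a large and heterogeneous bilingual group, whereas the strict definition yields a small and homogeneous one, which is the trade-off described above.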

To summarize, quantitative research aims to investigate the relationship between two or more variables. To do this, it starts from a hypothesis and defines the measures used for studying the chosen variables. Then, it relies on numerical data collected from a large number of people and analyzes such data using statistical tests, in order to generalize the results.

1.1.2. Observational research and experimental research

Quantitative approaches in linguistics draw an important distinction between observational research and experimental research. A first research tool, the questionnaire, is frequently used in linguistics to collect data in a quantitative manner. A questionnaire is a set of questions aimed at collecting different types of information about speakers, such as personal characteristics, their use of certain words or linguistic structures, or their point of view about certain linguistic phenomena. Let us now imagine that you wish to know whether there is a difference in the way that French speakers from France, Belgium and Switzerland refer to a yogurt. As Avanzi (2019) did, you could directly ask a large number of French, Belgian and Swiss people to tell you which of the two possible names, yaourt or yoghourt, they use on a daily basis. By counting the responses of more than 7,000 people, Avanzi showed that the form yaourt is mainly used in France, whereas it is never used in Switzerland, where yoghourt is the only form in use. In Belgium, the choice between yaourt and yoghourt varies from region to region.

In a slightly different way, instead of relying on the answers of people in a questionnaire, you could use linguistic data retrieved from natural productions and carry out a corpus study. In such studies, linguistic productions in the form of texts, audio or video recordings are used with the aim of counting the occurrences of a word, a grammatical form or any other linguistic characteristic. In order to research the uses of yaourt or yoghourt in France, Belgium and Switzerland, it would first be necessary to select corpora comprising linguistic productions collected from these different regions. This data could come from French, Belgian and Swiss newspapers, for example. The number of occurrences of each form could be counted in each corpus and then compared, in order to reveal differences in the use of these forms from country to country.
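The counting step of such a corpus study can be sketched as follows. The three "corpora" are invented toy sentences, not real newspaper data, and the names `toy_corpora`, `VARIANT_PATTERN` and `count_variants` are hypothetical; a real study would read large text files and would normally report frequencies relative to corpus size.

```python
import re
from collections import Counter

# Invented toy sentences standing in for regional newspaper corpora (not real data).
toy_corpora = {
    "France": "Le yaourt nature se vend bien. Un yaourt aux fruits est plus cher.",
    "Belgium": "Ce yaourt est local. Le yoghourt de la ferme est populaire.",
    "Switzerland": "Le yoghourt suisse est connu. Chaque yoghourt est frais.",
}

# Match either spelling as a whole word, with an optional plural -s.
VARIANT_PATTERN = re.compile(r"\b(yaourt|yoghourt)s?\b", re.IGNORECASE)


def count_variants(text):
    """Count the occurrences of each variant in one corpus."""
    return Counter(match.group(1).lower() for match in VARIANT_PATTERN.finditer(text))


counts_by_region = {region: count_variants(text) for region, text in toy_corpora.items()}

for region, counts in counts_by_region.items():
    total = sum(counts.values())
    # Share of each form among all occurrences of the two variants in this corpus.
    shares = {form: counts[form] / total for form in ("yaourt", "yoghourt")}
    print(region, dict(counts), shares)
```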

Another way of studying quantitative data is to examine the link between two variables. Let us imagine that you wish to study the relation between learners’ age and their ability to acquire a second language. Extensive research has already been devoted to this topic and suggests that the older people are when learning a second language, the more difficult it is for them to reach a high level of proficiency (see DeKeyser and Larson-Hall (2005) for a review). In order to confirm (or refute) this hypothesis, you could test a large number of people who start learning a language at different ages and measure their language proficiency after a certain period of time. In this example, the first variable, the age when learning begins, is a quantitative variable. Likewise, the second variable, language proficiency, can be measured quantitatively using a language test. Using an appropriate statistical test, it is possible to show the existence of a link between these two variables. This type of procedure is called correlational research and unveils the degree of dependence between two variables, which is called correlation. In the case of our example, if age plays a role in second language acquisition, the correlation obtained by our test would show that the older a person is when the process of learning a language begins, the lower their mastery of the language will be after a certain learning period.
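As an illustration of what "degree of dependence" means in practice, the sketch below computes Pearson's correlation coefficient by hand. Pearson's r is only one possible measure of correlation and is an assumption here, as are the invented ages and test scores; the text does not specify which statistical test to use.

```python
import math

# Hypothetical data for eight learners, invented for illustration:
# age at which learning began and score on a language test after a fixed period.
age_of_onset = [4, 7, 10, 14, 18, 25, 32, 40]
proficiency = [95, 92, 88, 80, 76, 70, 65, 63]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of numbers."""
    if len(x) != len(y):
        raise ValueError("both variables need one value per participant")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of the products of deviations from the means (numerator of r).
    covariation = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Sums of squared deviations for each variable (denominator of r).
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return covariation / math.sqrt(ss_x * ss_y)


r = pearson_r(age_of_onset, proficiency)
print(f"r = {r:.2f}")
```

A coefficient close to -1 indicates that higher ages of onset go together with lower scores, a coefficient close to +1 would indicate the opposite pattern, and a coefficient close to 0 would indicate no linear link between the two variables.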

The various studies described above correspond to research based on data observation. This type of research is generally used when, for practical or ethical reasons, it is necessary to observe variables from the outside. In this type of research, researchers do not interfere with the object of study, but observe the relationship between two variables at a given moment. As a consequence, the results of an observational study must be kept at a descriptive level, since it is not possible to infer a causal relation between two variables. In our example of a correlational study, the age when learning begins is related to language proficiency, but it is not possible to state that an increase in age is the cause of the decrease in language proficiency. It might be that other variables not considered in our research can also explain the relationship between the variables examined. We could imagine, for example, that the context in which second language acquisition takes place is not the same depending on the age when the learning process begins. It is likely that when young children learn a second language, this takes place within a family setting, where parents may speak different languages or a different language from that of the external environment. When older people start learning a language, it is probable that they grew up in a monolingual linguistic environment and later discovered a second language at school, or when moving to another country, for example. The type of linguistic exchanges may also differ depending on age, as well as the motivation to learn, cognitive skills or many other variables. These external variables that are left aside during research are called confounding variables and are related to the two variables examined, age and language proficiency. It could be that language learning conditions, rather than age itself, account for the differences in language levels. Since it is impossible to separate the variables examined from confounding variables, research based on the observation of data should not conclude that there is a causal relation between two variables.

In order to determine a causal relation between two variables, it is necessary to exclude any confounding variable. By using experimental methodology, the variables of interest can be manipulated to determine what effect a variable has on another variable, regardless of other possibly interfering variables. In other words, rather than observing natural data, the experimental methodology defines the conditions under which a phenomenon could be observed and then sets up an experiment in which these conditions can be manipulated, in order to measure their influence on the phenomenon under investigation. In the rest of this chapter, we will describe in more detail the various characteristics of experimental research.

1.2. Characteristics of experimental research

In this section, we will first stress the fact that experimental research must be based on a research question that makes it possible to formulate precise hypotheses. We will then see that in order to empirically assess a hypothesis, an experimental study must manipulate variables of interest while controlling other variables, which may influence the outcome of the experiment. Finally, we will discuss some methodological aspects of data collection, so that they can be analyzed through the use of statistics. These points will be elaborated in detail in the chapters dedicated to these different aspects.

1.2.1. Research questions and hypotheses

We have already emphasized that experimental research is part of a scientific process. It builds on existing knowledge in a research field and aims to increase such knowledge by studying a research question generated on the basis of an existing theory. A scientific research question identifies the potential cause of a phenomenon and postulates a cause-and-effect relation between the two. For example, the question “how do we understand a text?” is not a research question, as it is too vague. Such a question corresponds to a general research topic, from which many research questions can emanate. On the other hand, a question such as “what is the role of memory in readers’ comprehension of a text?” is a research question that can be investigated empirically. This question identifies a cause (memory) and a consequence (text comprehension), and establishes a relation between the two.

Once the research question has been defined, it is necessary to transform it into a research hypothesis, which corresponds to an empirically testable statement. In other words, the hypothesis must be confirmed or rejected on the basis of objective data. In order to do this, the research hypothesis must be operationalized, that is, it is necessary to specify which variables will be examined and how these variables will be measured, in order to collect relevant data for the experiment.

If we go back to our example above, memory is still a vague concept. As a matter of fact, a distinction is generally made between long-term memory, short-term memory and working memory. Working memory is a system that simultaneously stores and processes verbal elements (verbal working memory) or visual elements (visuospatial working memory). It is typically the verbal working memory that we use for reading, for deciphering and for putting together the words in a sentence. The operational hypothesis should therefore define what type of memory will be the object of study, verbal working memory, for example.

In the same way, the operational hypothesis should explain the way in which reading comprehension will be measured. Reading comprehension involves many steps, from deciphering words to relating these words in a sentence, and then to a text. Therefore, it is impossible to measure reading comprehension in only one way or with one type of experiment. We need to narrow down this notion to a more precise variable, corresponding to a process involved in reading comprehension that can be measured. For example, this could be the elements included in the readers’ representation of the text and stored in memory once the reading has finished. One way to assess comprehension would be to ask questions about the text at the end of reading and count the number of correct answers.

Let us look at a few more examples to understand what a research hypothesis is:

(1) Bilinguals have different cognitive abilities from monolinguals.

(2) Reading and understanding a text is difficult for children.

The above-mentioned hypotheses cannot be the basis of experimental research since they do not meet the criteria listed above. Their terms are too vague, they specify neither the cause nor the effect, and do not specify any measure to rely on so as to draw conclusions.

In order to be tested empirically, these hypotheses could be transformed into (3) and (4):

(3) Bilinguals perform better than monolinguals at a cognitive flexibility task.

(4) When reading a text, 10–12-year-old children draw fewer inferences than 14–16-year-old teenagers.

In these two examples, we see that the vague terms used in (1) and (2) have been transformed into precise terms in (3) and (4). Cognitive abilities became performance during a cognitive flexibility task, and understanding a text became drawing inferences. By doing this, measures for quantifying the variables were defined. In addition, (4) specifies which groups will be included and compared in the study. Finally, both (3) and (4) indicate a clear relationship between variables.

In summary, a research hypothesis is based on existing knowledge in order to establish a relationship between two or more variables. It must also be operationalized, that is, it must clearly define the measures that will be used for quantifying the variables being examined to verify the hypothesis.

The construction of a good research hypothesis is the result of different stages, among which the most important are conceptualizing the hypothesis, on the basis of knowledge acquired in the field, and then operationalizing the hypothesis. We will discuss the specific stages for conceptualizing a hypothesis in Chapter 6, which is devoted to the practical aspects of an experiment. We will discuss the stages involved in the operationalization of a hypothesis in Chapter 2.

1.2.2. Manipulation of variables

Let us now go back to the example of the influence of working memory on reading comprehension. In this example, the variable verbal working memory can be observed in two ways. The first possibility would be to measure the skills of the people taking part in the experiment by using a verbal working memory test. According to this evaluation and its results, participants could be sorted into groups. By doing so, every participant is included under a variable modality (e.g. high competence or low competence) depending on his/her own characteristics, as some people have better working memory capacities than others. In this case, the variable is simply observed during research.

A second possibility would be to manipulate the variable verbal working memory, by implementing conditions within the experiment where this variable has different modalities. In our example, the manipulation of the independent variable would aim at restricting the use of verbal working memory in some of the participants, in order to see the impact of such manipulation on reading comprehension, as compared to other participants whose working memory has not been restricted during the reading. A common task used for manipulating verbal working memory is to ask people to momentarily memorize different series of letters while reading the text, to report them and then to memorize others. Having to remember a series of letters while reading the text reduces the verbal working memory storage capacity used for reading and makes it possible to show a connection, if existent, between working memory and comprehension.

In general, in experimental research, the aim is to manipulate all the variables involved in the hypotheses. However, for practical or ethical reasons, this is not always possible. For example, age, socio-economic level, bilingualism, etc., cannot be manipulated because they are inherent in people. When variables can be manipulated, the decision to manipulate them, as well as the way in which to manipulate them, must follow ethical principles, ensuring that research will not harm the participants during the test. The cost/benefit relationship must be clearly considered when pondering the possibility of manipulating a variable or not. For example, imagine that you formulate a hypothesis stating that in stressful situations, people tend to speak faster than in non-stressful situations. In order to study the influence of stress on articulation rate, you could decide to manipulate the participants’ stress level. To set up a stressful condition, you could imagine putting some of the participants in a dark room in front of an audience booing at them. In experimental terms, such manipulation would be adequate, in the sense that a high level of stress would most likely result from your manipulation. On the other hand, it would be totally inappropriate from an ethical point of view. Indeed, this type of manipulation would affect the participants to a much larger extent than needed, and they would probably not leave the experiment unscathed. Although this is an extreme example, it illustrates the fact that an experiment should not leave a lasting trace on the participants once it is over. We will develop this point in Chapter 6, which is devoted to the practical aspects of an experiment.

1.2.3. Control of external variables

We have seen that when operationalizing research hypotheses, variables need to be defined with accuracy. The main purpose of such a definition is to isolate the variables studied within the experiment, in order to reach a reliable conclusion as to the relationship between them. In parallel, and for the same purpose, it is necessary to control the other variables, known as external variables, which could influence the variables and the results obtained in the experiment. External variables can be multiple and we will return to them in Chapters 2 and 6, where we will discuss hypotheses and the practical aspects of an experiment. However, it is generally acknowledged that the characteristics of the participants are variables which may interfere with the variables investigated in an experiment.

Going back to the example of the influence of memory on reading comprehension, we may assume that educational level, general cognitive abilities, age, reading habits, etc., can influence both memory and reading comprehension. Likewise, the characteristics of the material used in the experiment may have an influence on the results. If, in the above-mentioned example, we use very simple text and questions, it is possible that everyone answers the questions perfectly well, regardless of their memory skills. On the contrary, if the text and the questions are very complicated, it is possible that very few people will be capable of answering. In these cases, we risk not finding a connection between memory and reading comprehension, not because the link doesn’t exist, but because the material used for the experiment is not suitable for evidencing such a link.

1.2.4. The notions of participants and items

To attenuate these potential problems, and to reduce the importance of the characteristics of the participants or the material employed, experimental research is based on data collected from a large number of people, using a broad palette of materials. Referring back to our example, it would be necessary to test a large number of people by means of a comprehension test. This test should contain multiple texts and different questions for each of them. In general, the material used in an experiment is defined as a set of items (the texts or the questions in our example are items). The ideal number of participants, as well as the number of items necessary to undertake proper research, is a complex question, which we will address in Chapter 6.

Furthermore, experimental research is generally carried out by recruiting naive participants, who are unaware of the goals of the experiment and who have no expertise in the subject under study. This precaution aims to control certain cognitive biases that could influence research results. The first bias is related to the fact that participants who know the research hypothesis may try to base their answers on this hypothesis. Should this happen, the results obtained could suffer from what is called confirmation bias. Rather than answering naturally, participants could provide answers based on the hypothesis to confirm it, not because the assumption is correct, but rather because it seems adequate to them (even if this is not the case). The second bias is related to the fact that participants may want to help the researcher. If the participants know or suspect the goal of an experiment beforehand, the results obtained in this second scenario may not correspond to reality, but rather to the answers that the participants presume are expected.

Finally, in experimental research, participants are generally assigned to conditions in a random manner. This means that every person has the same chance of being included under one condition of the experiment or another. This random assignment offers additional protection against the effect of uncontrolled external variables. In addition to testing a large number of people, randomly distributing them to the different conditions reduces the probability that external variables could systematically influence the results. However, this random assignment is only feasible when all variables are manipulated. When one or more variables are simply observed, participants must be included in one condition or another on the basis of their own characteristics, such as gender or age, for instance. In this case, we speak of quasi-experimental research, since it is not possible to control all the variables. Leaving this question aside, experimental and quasi-experimental research are very similar, and the elements developed in the following chapters apply to both types of research.
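
As an illustration of random assignment, the following sketch shuffles a list of participants and deals them out to the conditions in turn, so that group sizes stay balanced. This is only one possible way of proceeding. The function name assign_conditions, the participant codes and the condition labels are hypothetical, chosen to echo the working memory example.

```python
import random

def assign_conditions(participants, conditions, seed=None):
    """Randomly assign participants to conditions with near-equal group sizes.

    A fixed seed makes the assignment reproducible; leave it as None
    to obtain a different random assignment on each run.
    """
    rng = random.Random(seed)
    shuffled = list(participants)  # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    # After shuffling, deal participants out to the conditions in turn.
    return {
        person: conditions[index % len(conditions)]
        for index, person in enumerate(shuffled)
    }

# Hypothetical participant codes P01 ... P20 and two hypothetical conditions.
participants = [f"P{n:02d}" for n in range(1, 21)]
assignment = assign_conditions(participants, ["memory_load", "no_load"], seed=42)

for person in participants:
    print(person, assignment[person])
```

Because the order is shuffled before the conditions are dealt out, no participant characteristic can systematically decide who ends up in which condition. Such a procedure cannot be applied to an observed variable such as age, where group membership is fixed in advance.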

1.2.5. Use of statistics and generalization of results

The last essential characteristic of experimental research concerns the way in which data is analyzed. Experimental research aims to collect quantitative data that can be statistically analyzed. As we will see in Chapter 7, quantitative data can be described using different indicators, such as the mean, for example. Based on these descriptive indicators, it is possible to obtain an overview of the data collected, to summarize and illustrate them, in order to communicate the results with simplicity.
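
By way of illustration, the sketch below computes a few descriptive indicators with the statistics module from Python's standard library. The scores are invented (number of correct answers out of 10 in two hypothetical conditions) and the function name describe is our own. The choice of indicators is an assumption made for this example, not a recommendation.

```python
from statistics import mean, median, stdev

# Invented comprehension scores (correct answers out of 10) per condition.
scores = {
    "memory_load": [5, 6, 4, 7, 5, 6, 5, 4],
    "no_load": [8, 7, 9, 6, 8, 7, 9, 8],
}

def describe(values):
    """Return a few descriptive indicators for one list of scores."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),  # sample standard deviation
    }

summary = {condition: describe(values) for condition, values in scores.items()}

for condition, stats in summary.items():
    print(
        f"{condition}: n={stats['n']}, mean={stats['mean']:.2f}, "
        f"median={stats['median']}, sd={stats['sd']:.2f}"
    )
```

Such a summary describes only the sample at hand. On its own, it does not tell us whether the difference between the two conditions would hold beyond these participants.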

At the second stage, data is used to draw conclusions about the research hypotheses. In experimental linguistics, the aim is to study and understand a linguistic phenomenon for a specific population. Since it is impossible to test an entire population, researchers collect data from a representative sample. Through the use of inferential statistics, it is possible to determine whether the results of a particular sample are applicable to the whole population. This process is called generalization.
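
To give a first idea of the logic of inferential statistics, here is a minimal sketch of a permutation test on invented scores from two hypothetical conditions. This test is chosen here only because its reasoning is easy to follow in a few lines. It is not necessarily one of the tests presented later in this book, and the function name permutation_test and all the values are assumptions made for the example.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference between two group means.

    Returns the observed difference (mean of b minus mean of a) and an
    approximate p-value: the proportion of random relabelings of the scores
    that produce a difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(group_b) - mean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend the condition labels were arbitrary
        diff = mean(pooled[n_a:]) - mean(pooled[:n_a])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Adding 1 to both counts avoids reporting a p-value of exactly zero.
    return observed, (extreme + 1) / (n_permutations + 1)

# Invented comprehension scores (correct answers out of 10).
memory_load = [5, 6, 4, 7, 5, 6, 5, 4]
no_load = [8, 7, 9, 6, 8, 7, 9, 8]

observed, p_value = permutation_test(memory_load, no_load)
print(f"observed difference = {observed}, approximate p = {p_value:.4f}")
```

A small p-value indicates that a difference of this size would rarely arise if the condition labels were unrelated to the scores. Whether such a result can be extended to a wider population still depends on how representative the sample is.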

1.3. Types of experiment in experimental linguistics

Experimental research can be applied to all areas of linguistics, even if historically some areas have used such a methodology more consistently than others. Research questions vary widely between linguistic fields, meaning that many different methods and measures can be used in experimental linguistics. In this book, we do not aim to offer a detailed presentation of every research field and the methods associated with each, but rather to provide an overview of the principles of experimental methodology and the available techniques for linguists. Here, we will introduce some major classes of experiments that can be carried out in linguistics, and we will then develop these in the chapters dedicated to them.

In general, the experimental studies carried out in linguistics can be classified depending on the aspect of the language under study. We will discuss, in turn, studies on linguistic production and those relating to language comprehension. We will see that the study of comprehension poses many challenges, since this process is not directly observable. For this reason, research on language comprehension is based on the observation of indirect measures, which can be explicit or implicit. We will also see that it is possible to study comprehension by observing different stages of this process, either while it is in progress or once it has been completed.

1.3.1. Studying linguistic productions

The first type of linguistic experiment aims to investigate language production, that is, all the manifestations of language that are produced by individuals in a certain language. Although these manifestations can be collected from diverse corpora and then studied through corpus analysis (see Zufferey (2020) for a detailed presentation of these methods), in some cases, the data contained in the corpus is not enough for studying a linguistic phenomenon. Some rare phenomena appear very seldom, if at all, in a corpus. What is more, the observation of naturally produced data is not suitable for showing the influence of a variable on the emergence of a specific linguistic phenomenon, as we have already seen. To counter this, different experiments can be implemented in order to study the production of linguistic phenomena. In these experiments, the goal is to purposefully elicit the emergence of certain linguistic structures, while controlling the context in which such structures appear. The experimental study of linguistic production will be described in further detail in Chapter 3.

1.3.2. Explicit and implicit measures of comprehension

The second type of experiment used in experimental linguistics includes studies conducted on the mechanisms involved in language processing and comprehension. Such processes are numerous and range from the organization of the lexicon to the comprehension of a text or a discourse. It is therefore the most broadly studied aspect in experimental linguistics. Unlike some aspects of the production component, the language comprehension component is unique, in that it cannot be directly assessed through mere observation. It is simply impossible to directly observe the processes involved in the comprehension of a text, for example. This is why it is necessary to find a way to measure these processes indirectly, based on indicators that can be associated with them.

The first way of collecting these indicators requires the use of explicit tasks in which participants have to reflect upon certain linguistic aspects. For example, this is the case for metalinguistic tasks such as grammaticality or acceptability judgments. This type of task could be used to test the participants’ grammatical knowledge, by showing them sentences that either comply with or violate grammatical standards, and asking them to identify errors and justify their choice. While these tasks have the advantage of providing direct access to speakers’ knowledge, they also have the defect of being based on their reflexive skills and their subjective appreciation of their own understanding. These tasks are also particularly complex for certain types of people, especially for children or people with language impairments, for whom it is often very difficult to explain the reasoning behind their decision. Other tasks make it possible to circumvent these problems, by setting up experiments in which the participants have to choose between several illustrations the one matching a linguistic stimulus. For example, Durrleman et al. (2015) tested the comprehension of relative clauses in people with autism spectrum disorder (ASD), asking them to point to the image corresponding to sentences such as “show me the little boy running after the cat”. Making use of such tasks offers the possibility of studying language comprehension in children and populations with language impairments.

Alternatively, methods for studying comprehension in an implicit manner (without asking the participants directly for a judgment or an explanation of their reasoning) have also been developed. This is the case in action tasks, in which some kinds of behavior adopted on the basis of a linguistic stimulus can be observed. For example, Pouscoulous et al. (2007) tested the understanding of scalar implicatures triggered by words such as quelques (roughly equivalent to some), by asking French-speaking children to arrange tokens in boxes so as to match statements like “quelques cases ont des jetons (some boxes have tokens)”. It is also possible to assess comprehension skills using recall or recognition tasks, in which questions are asked at the end of a reading exercise or after listening to a text or speech fragment. For example, Zufferey et al. (2015a) tested the comprehension of causal relations in children aged 5–8 years, by asking them to answer why questions after every page, when reading a story with them.

1.3.3. Offline and online measures of comprehension

The various tasks listed above, as well as the tasks proposed in the examples presented so far in this chapter, enable access to comprehension once the word, sentence or text has been processed and understood. These measures are described as offline, in that they bear on the final interpretations resulting from the comprehension process. On the other hand, online measures allow us to study the processes that come into play during comprehension itself. Such processes have the characteristic of being extremely fast, transient and occurring outside of people’s awareness, therefore remaining inaccessible to traditional offline measures.

Borrowing scientific methods and paradigms from other disciplines, such as psychology, has allowed the study of online processes involved in language comprehension. The majority of online measurement techniques have something in common: they observe the time required for a process, by measuring the reading time or reaction time. These techniques are based on the idea that the time required to complete a process reflects certain characteristics of this process, particularly in terms of complexity. Longer reaction times and reading times are generally associated with a more in-depth processing of the linguistic stimulus. Tasks using these time measures typically involve asking participants to name words, read or produce sentences, or decide whether or not a series of letters matches a word in their language. Studies that have employed such tasks have shown that, at the word level, response times and reading times are influenced by properties such as frequency, length and predictability. Similarly, at the sentence level, reading is influenced by properties such as syntax complexity or the need to produce inferences (Just and Carpenter 1980; Rayner 1998; Smith and Levy 2013).

Studies based on time measures have benefited from significant technological developments since the 1970s, so that today, anyone can easily conduct research from their computer. In addition, new techniques have been developed to enable the recording of eye movement whilst reading or when observing an image. It is thus possible to gain an insight, not only into the time required to read certain words or sentences, but also the exact movements made by the eyes during reading. This data provides additional information, such as the time allotted for different words, the order in which words are fixated or even the eye movements associated with reading certain passages. These eye movement measures can be applied to the study of reading as well as to the study of spoken speech production or comprehension.

Finally, the methods used in the field of neuroscience have also been transferred to experimental linguistics. These methods provide access to the brain activity involved in language-related processes. Using small electrodes placed on the scalp, the electroencephalogram (EEG) records the activity of neurons on the surface of the brain. This technique gives an accurate temporal overview of the activity of neurons associated with a specific linguistic process. Functional magnetic resonance imaging (fMRI) aims to measure the activity of neurons based on their oxygen consumption. It thus provides a precise spatial overview of the brain areas involved in a specific linguistic process.

As we can infer from these lines, offline methods are the most accessible to researchers, since they require few technical means. In most cases, offline measures can be collected using paper and pencil tasks. A simple spreadsheet available on every computer can be used for organizing and analyzing the data from such studies. For some statistical tests, a program must be added to the list of necessary tools. Online methods for observing reaction time or reading performance require special software for programming experiments. Things get more complicated when you want to record eye movements. These recordings require the use of expensive tools that also take time to master. Furthermore, the data from studies on eye movements is much more complex to process. Finally, EEG or fMRI studies are generally reserved for people benefiting from access to such techniques, which are extremely costly in terms of equipment and the skills necessary for processing recorded signals. For this reason, such techniques will not be discussed in this book.