Direction Dependence in Statistical Modeling: Methods of Analysis

Description

Covers the latest developments in direction dependence research

Direction Dependence in Statistical Modeling: Methods of Analysis incorporates the latest research for the statistical analysis of hypotheses that are compatible with the causal direction of dependence of variable relations. Having particular application in the fields of neuroscience, clinical psychology, developmental psychology, educational psychology, and epidemiology, direction dependence methods have attracted growing attention due to their potential to help decide which of two competing statistical models is more likely to reflect the correct causal flow.

The book covers several topics in-depth, including:

* A demonstration of the importance of methods for the analysis of direction dependence hypotheses
* A presentation of the development of methods for direction dependence analysis together with recent novel, unpublished software implementations
* A review of methods of direction dependence following the copula-based tradition of Sungur and Kim
* A presentation of extensions of direction dependence methods to the domain of categorical data
* An overview of algorithms for causal structure learning

The book's fourteen chapters include a discussion of the use of custom dialogs and macros in SPSS to make direction dependence analysis accessible to empirical researchers.




Table of Contents

Direction Dependence in Statistical Modeling

Copyright

Dedication

About the Editors

Notes on Contributors

Acknowledgments

Preface

References

Part I: Fundamental Concepts of Direction Dependence

1 From Correlation to Direction Dependence Analysis 1888–2018

1.1 Introduction

1.2 Correlation as a Symmetrical Concept of X and Y

1.3 Correlation as an Asymmetrical Concept of X and Y

1.4 Outlook and Conclusions

References

2 Direction Dependence Analysis

2.1 Some Origins of Direction Dependence Research

2.2 Causation and Asymmetry of Dependence

2.3 Foundations of Direction Dependence

2.4 Direction Dependence in Mediation

2.5 Direction Dependence in Moderation

2.6 Some Applications and Software Implementations

2.7 Conclusions and Future Directions

References

3 The Use of Copulas for Directional Dependence Modeling

3.1 Introduction and Definitions

3.2 Directional Dependence Between Two Numerical Variables

3.3 Directional Association Between Two Categorical Variables

3.4 Concluding Remarks and Future Directions

References

Part II: Direction Dependence in Continuous Variables

4 Asymmetry Properties of the Partial Correlation Coefficient

4.1 Asymmetry Properties of the Partial Correlation Coefficient

4.2 Direction Dependence Measures when Errors Are Non‐Normal

4.3 Statistical Inference on Direction Dependence

4.4 Monte‐Carlo Simulations

4.5 Data Example

4.6 Discussion

References

5 Recent Advances in Semi‐Parametric Methods for Causal Discovery

5.1 Introduction

5.2 Linear Non‐Gaussian Methods

5.3 Nonlinear Bivariate Methods

5.4 Conclusion

References

6 Assumption Checking for Directional Causality Analyses

6.1 Epistemic Causality

6.2 Assessment of Functional Form: Loess Regression

6.3 Influential and Outlying Observations

6.4 Directional Dependence Based on All Available Data

6.5 Directional Dependence Based on Latent Difference Scores

6.6 Direction Dependence Based on State‐Trait Models

6.7 Discussion

References

7 Complete Dependence

7.1 Basic Properties

7.2 Measure of Complete Dependence

7.3 Example Calculation

7.4 Future Works and Open Problems

References

Part III: Direction Dependence in Categorical Variables

8 Locating Direction Dependence Using Log‐Linear Modeling, Configural Frequency Analysis, and Prediction Analysis

8.1 Specifying Directional Hypotheses in Categorical Variables

8.2 Types of Directional Hypotheses

8.3 Analyzing Event‐Based Directional Hypotheses

8.4 Data Example

8.5 Reversing Direction of Effect

8.6 Discussion

References

9 Recent Developments on Asymmetric Association Measures for Contingency Tables

9.1 Introduction

9.2 Measures on Two‐Way Contingency Tables

9.3 Asymmetric Measures of Three‐Way Contingency Tables

9.4 Simulation of Three‐Way Contingency Tables

9.5 Real Data of Three‐Way Contingency Tables

References

10 Analysis of Asymmetric Dependence for Three‐Way Contingency Tables Using the Subcopula Approach

10.1 Introduction

10.2 Review on Subcopula Based Asymmetric Association Measure for Ordinal Two‐Way Contingency Table

10.3 Measure of Asymmetric Association for Ordinal Three‐Way Contingency Tables via Subcopula Regression

10.4 Numerical Examples

10.5 Conclusion

Appendix

References

Part IV: Applications and Software

11 Distribution‐Based Causal Inference

11.1 Introduction

11.2 Direction of Dependence in Linear Regression

11.3 Previous Epidemiologic Applications of Distribution‐Based Causal Inference

11.4 A Running Example: Re‐Visiting the Case of Sleep Problems and Depression

11.5 Evaluating the Assumptions in Practical Work

11.6 Distribution‐Based Causality Estimates for the Running Example

11.7 Conducting Sensitivity Analyses

11.8 Simulation‐Based Analysis of Statistical Power

11.9 Triangulating Causal Inferences

11.10 Conclusion

References

12 Determining Causality in Relation to Early Risk Factors for ADHD

12.1 Method

12.2 Results

12.3 Discussion

Acknowledgments

References

13 Direction of Effect Between Intimate Partner Violence and Mood Lability

13.1 Introduction

13.2 Methods

13.3 Results

13.4 Discussion

References

14 On the Causal Relation of Academic Achievement and Intrinsic Motivation

14.1 Direction of Dependence in Linear Regression

14.2 The Causal Relation of Intrinsic Motivation and Academic Achievement

14.3 Direction Dependence Analysis Using SPSS

14.4 Conclusions

References

Author Index

Subject Index

End User License Agreement

List of Tables

Chapter 2

Table 2.1

Summary of assumptions of direction dependence analysis (DDA) and con...

Chapter 3

Table 3.1

The values of directional dependence measures for the data sets A, D ...

Chapter 4

Table 4.1

Median bias and median percent bias of three direction dependence meas...

Table 4.2

95% CI coverage rates of the four direction dependence measures as a f...

Table 4.3

Bivariate Pearson correlations and univariate descriptive measures.

Table 4.4

Results of the linear regression model predicting perceived energy 12 ...

Table 4.5

Results of resampling‐based direction dependence tests.

Chapter 6

Table 6.1

Direction analysis of pre‐college composite scores (wave 0).

Table 6.2

Cross‐tabulations of influential observations for target and alternate...

Table 6.3

Frequencies of influential observations under target and alternate mod...

Table 6.4

Direction dependence analysis of factor scores.

Table 6.5

Directional analysis of latent difference.

Table 6.6

Frequencies of influential observations under target and alternate mod...

Table 6.7

Direction dependence of latent trait factor scores.

Table 6.8

Frequencies of influential observations under target and alternate mod...

Table 6.9

Frequencies of influential observations under target and alternate mod...

Table 6.10

Directional analysis of state residual scores for pre‐college (wave 0...

Table 6.11

Directional analysis of residual scores for wave 1.

Table 6.12

Directional analysis of residual scores for wave 3.

Table 6.13

Directional analysis of residual scores for wave 5.

Table 6.14

Directional analysis of residual scores for wave 7.

Chapter 8

Table 8.1

Truth table for the two statements p and q and five links.

Table 8.2

Direction of effect for x1 → y2.

Table 8.3

Hit cells for the implications a1b3 → c2d1 and a2b2 → c1d2.

Table 8.4

Hit cells for the implication X1, 1 ∧ X2, 1 ∨ X1, 2 ∧ X2, 1 → Y2 ...

Table 8.5

Hit cells for the implication X1 → Y1, 2 ∧ Y2, 1 ∨ Y1, 2 ∧ Y2, 2 ...

Table 8.6

Truth table for the implication X1, 1 ∧ X2, 1 ∨ X1, 2 ∧ X2, 1 → Y2 ...

Table 8.7

Correlation matrix of the design matrix for the hypothesis X1, 1 ∧ X2, ...

Table 8.8

[PCD, OMS] × [DEP] cross‐classification.

Table 8.9

Goodness of fit of two models for the analysis of the [PCD, OMS] × [DE...

Table 8.10

CFA of the [PCD, OMS] × [DEP] cross‐classification.

Table 8.11

Goodness of fit of base model and reverse‐direction model for the an...

Table 8.12

CFA of the [PCD, OMS] × [DEP] cross‐classification under two directio...

Chapter 9

Table 9.1

Three‐way contingency table of X, Y, and Z with P = {pijk} ...

Table 9.2

(a) Joint p.m.f. of Y and Z; (b) joint p.m.f. of X and Z; (c) joint p.m....

Table 9.3

Three‐way contingency tables of dichotomous variables.

Table 9.4

Supports of U, V, W and the joint p.m.f. of C.

Table 9.5

Black olive preference (P) by location (L) and urbanization (U).

Table 9.6

Worker satisfaction for organizational aspects (O) and satisfaction fo...

Chapter 10

Table 10.1

Job satisfaction data (Beh & Lombardo, 2014, p. 478).

Table 10.2

Analysis of asymmetric association in job satisfaction data.

Table 10.3

Hierarchical analysis for two types of association in job satisfactio...

Chapter 11

Table 11.1

Estimates of causal direction for depression and sleep variables of t...

Table 11.2

Role of the variables on generating the data of the four simulation s...

Table 11.3

Estimated biometric sources of variance for depression and sleep vari...

Chapter 12

Table 12.1

Sample description.

Table 12.2

Breastfeeding duration by ADHD group status.

Table 12.3

Covariate evaluation and selection.

Table 12.4

Regression models (n = 829).

Table 12.5

DDA results for covariate‐adjusted models of the form breastfeeding →...

Table 12.6

DDA results for covariate‐adjusted models of the form parent‐rated AD...

Table 12.7

DDA results for covariate‐adjusted models of the form breastfeeding →...

Table 12.8

DDA results for covariate‐adjusted models of the form parent‐rated AD...

Table 12.9

DDA results for covariate‐adjusted models of the form breastfeeding →...

Table 12.10

DDA results for covariate‐adjusted models of the form teacher‐rated ...

Table 12.11

DDA results for covariate‐adjusted models of the form breastfeeding ...

Table 12.12

DDA results for covariate‐adjusted models of the form teacher‐rated ...

Chapter 13

Table 13.1

Autoregression parameter estimates for IPV, LEAVE, and MOOD (standard...

Chapter 14

Table 14.1

Summary of DDA components and model‐specific DDA patterns.

Table 14.2

Bivariate Pearson correlation coefficients and descriptive measures o...

Table 14.3

Results of the competing regression models.

Table 14.4

Summary of DDA decisions.

List of Illustrations

Chapter 2

Figure 2.1 Patterns of direction of dependence of three explanatory causal m...

Figure 2.2 Six alternative mediation models together with the corresponding ...

Chapter 3

Figure 3.1 Construction of a copula and its use.

Figure 3.2 Plot of the copula regression functions for β1 = 10, β2 = 40 ...

Figure 3.3 Plot of the copula regression functions for β1 = 10, β2 = 40 ...

Figure 3.4 Plot of the copula regression functions for β1 = 10, β2 = 40 ...

Figure 3.5 Plot of the difference between the copula regression functions fo...

Figure 3.6 Plots of directional dependence measures and together with th...

Figure 3.7 Distributional cycle. The key random variables and their roles on...

Figure 3.8 Dependence cycle. The key random variables and their roles on und...

Figure 3.9 Distributional circle showing the pairwise correlations for three...

Figure 3.10 The plots of a data set where order statistics of X and Y are bo...

Figure 3.11 Surface and contour plots of odds ratio and conditional ratios f...

Figure 3.12 The behavior of DX → Y as a function of p+1 ...

Chapter 4

Figure 4.1 Empirical power of detecting the true model as a function of βyx ...

Figure 4.2 Empirical power of 95% BCa CIs of four direction dependence measu...

Figure 4.3 Univariate distributions (main diagonal) and bivariate scatterplo...

Chapter 5

Figure 5.1 An example causal graph.

Figure 5.2 Three candidate causal models.

Figure 5.3 Three candidate causal models in the presence of hidden common ca...

Figure 5.4 The skeleton of a causal graph with an undirected edge between x1

Chapter 6

Figure 6.1 Confirmatory factor model for pre‐college assessment of reasons f...

Figure 6.2 Latent difference score model (unstandardized coefficients).

Figure 6.3 Example state‐trait model (standardized coefficients).

Figure 6.4 Loess regression of social motives on enhancement motives, pre‐co...

Figure 6.5 Loess regression of enhancement motives on social motives – pre‐c...

Chapter 9

Figure 9.1 The complete dependence measure μ(Y, Z ∣ X) ...

Chapter 10

Figure 10.1 The estimated subcopula regression based association measure (...

Figure 10.2 The estimated subcopula regression based association measure (...

Figure 10.3 The estimated subcopula regression based association measure (...

Chapter 11

Figure 11.1 Illustration of skewness‐based causal signal. Histogram of a Log...

Figure 11.2 Illustrating the SATSA data. Whereas the histograms (a and b) sh...

Figure 11.3 Illustrating lurking nonlinearities. The first panel shows a sca...

Figure 11.4 Illustration of the assumed model in DirectLiNGAM (a; one variab...

Figure 11.5 Results for simulation‐based sensitivity analyses of DirectLiNGA...

Figure 11.6 Results for the power analysis of DirectLiNGAM in terms of skewn...

Figure 11.7 Path diagram for Direction of Causation (DoC) models in the runn...

Chapter 12

Figure 12.1 Distributions for Thurstone factor scores (mother's age at child...

Chapter 13

Figure 13.1 Granger causality model in which IPV causes MOOD lability (stand...

Figure 13.2 Granger causality model in which IPV causes MOOD lability, under...

Chapter 14

Figure 14.1 Simplified competing models with standardized focal variables x ...

Figure 14.2 Conceptual diagram of model (c) with an unmeasured confounder to...

Figure 14.3 SPSS main dialogue box to perform DDA.

Figure 14.4 DDA options dialogue box.

Figure 14.5 SPSS output of DDA variable distribution tests.

Figure 14.6 Univariate distributions (main diagonal) and scatterplot (upper ...

Figure 14.7 SPSS output of residual distribution tests.

Figure 14.8 DDA function dialogue box.

Figure 14.9 SPSS outputs of DDA independence tests.

Guide

Title Page

Copyright

Dedication

About the Editors

Notes on Contributors

Acknowledgments

Preface

Table of Contents

Begin Reading

Author Index

Subject Index

WILEY END USER LICENSE AGREEMENT


Direction Dependence in Statistical Modeling

Methods of Analysis

Edited by

Wolfgang Wiedermann

Daeyoung Kim

Engin A. Sungur

Alexander von Eye

 

 

 

 

 

Copyright

This edition first published 2021

© 2021 John Wiley & Sons, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Wolfgang Wiedermann, Daeyoung Kim, Engin A. Sungur, and Alexander von Eye to be identified as the authors of the editorial material in this work has been asserted in accordance with law.

Registered Office

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

Editorial Office

111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty

While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging‐in‐Publication Data

Names: Wiedermann, Wolfgang, 1981‐ editor. | Kim, Daeyoung, editor. | Sungur, Engin, editor. | Eye, Alexander von, editor.

Title: Direction dependence in statistical modeling : methods of analysis / edited by Wolfgang Wiedermann, Daeyoung Kim, Engin Sungur, Alexander von Eye.

Description: Hoboken, NJ : Wiley, 2021. | Includes bibliographical references and index.

Identifiers: LCCN 2020015364 (print) | LCCN 2020015365 (ebook) | ISBN 9781119523079 (cloth) | ISBN 9781119523130 (adobe pdf) | ISBN 9781119523147 (epub)

Subjects: LCSH: Dependence (Statistics)

Classification: LCC QA273.18 .D57 2020 (print) | LCC QA273.18 (ebook) | DDC 519.5–dc23

LC record available at https://lccn.loc.gov/2020015364

LC ebook record available at https://lccn.loc.gov/2020015365

Cover design by Wiley

Cover image: © zhengshun tang/Getty Images

To Anna and Linus

—W.W

To my wife Shu‐Min and my son Minjun

—D.K

To my wife Lamia Sungur

—E.S

To Donata, the origin of direction

—A.vE

About the Editors

Wolfgang Wiedermann

Wolfgang Wiedermann is Associate Professor at the University of Missouri, Columbia. He received his PhD in Quantitative Psychology from the University of Klagenfurt, Austria in 2012. His primary research interests include the development of methods for causal inference, methods to determine the causal direction of dependence in observational data, and methods for person‐oriented research settings. He has edited books on advances in statistical methods for causal inference (with von Eye, Wiley) and new developments in statistical methods for dependent data analysis in the social and behavioral sciences (with Stemmler and von Eye). His work appears in leading quantitative methods journals, including Psychological Methods, Multivariate Behavioral Research, Behavior Research Methods, and the British Journal of Mathematical and Statistical Psychology. He currently serves as an associate editor for Behaviormetrika and the Journal for Person‐Oriented Research.

Daeyoung Kim

Daeyoung Kim is Associate Professor of Mathematics and Statistics at the University of Massachusetts, Amherst. He received his PhD in Statistics from the Pennsylvania State University in 2008. His original research interests were in likelihood inference in finite mixture modeling (including empirical identifiability and multimodality), the development of geometric and computational methods to delineate multidimensional inference functions, and likelihood inference in incompletely observed categorical data; this work was followed by a focus on the analysis of asymmetric association in multivariate data using (sub)copula regression. He also has active collaborations with colleagues in food science at the University of Massachusetts, Amherst, focusing on the use of statistical models to analyze data on colon cancer, obesity, and diabetes.

Engin A. Sungur

Engin A. Sungur has a BA in City and Regional Planning (Middle East Technical University, METU, Turkey), an MS in Applied Statistics (METU), an MS in Statistics (Carnegie Mellon University, CMU), and a PhD in Statistics (CMU). He taught at Carnegie Mellon University, the University of Pittsburgh, Middle East Technical University, and the University of Iowa. Currently, he is a Morse‐Alumni Distinguished Professor of Statistics at the University of Minnesota, Morris. He has been teaching statistics for more than 38 years, 29 of them at the University of Minnesota, Morris. His research areas are dependence modeling with an emphasis on directional dependence, modern multivariate statistics, extreme value theory, and statistical education.

Alexander von Eye

Alexander von Eye, PhD, is Professor Emeritus of Psychology at Michigan State University. He received his PhD in Psychology, with minors in Education and Psychiatry, from the University of Trier, Germany, in 1976. He is known for his work on statistical modeling, categorical data analysis, methods of analysis of direction dependence hypotheses, person‐oriented research, and human development. He authored, among others, texts on Configural Frequency Analysis, the analysis of rater agreement (with Mun), and log‐linear modeling (with Mun; Wiley), and he edited, among others, two books on latent variables analysis (the first with Clogg, the second with Pugesek and Tomer) and one on Statistics and Causality (with Wiedermann; Wiley). His more than 400 articles have appeared in the premier journals of the field, including, for instance, Psychological Methods, Multivariate Behavioral Research, Child Development, the Journal of Person‐Oriented Research, the American Statistician, and the Journal of Applied Statistics.

Notes on Contributors

Patrick Blöbaum

Patrick Blöbaum studied Cognitive Computer Science and Intelligent Systems at Bielefeld University (Germany) from 2009 to 2014 and received his PhD in Engineering (Machine Learning) from Osaka University (Japan) in 2019 with a research focus on causality. In addition to his PhD studies, he worked as an assistant researcher and machine learning engineer in Japan. In 2019, Patrick joined the newly founded causality team at the Amazon research and development center in Tübingen, Germany, which focuses on the development and application of novel causality algorithms.

G. Anne Bogat

G. Anne Bogat, PhD is a professor of clinical psychology at Michigan State University. Her research centers on intimate partner violence (IPV), including a focus on daily experiences of IPV among college students; how IPV during pregnancy affects women, children, and the mother–child relationship; and how bonding between mothers and infants is affected by pregnancy and postpartum IPV. In addition, she has written about and employed person‐oriented methods in her research.

Yadolah Dodge

Yadolah Dodge was born in Abadan, Iran, and is a Swiss citizen. Alongside a full‐time position as Professor and Chair of Statistics at the University of Neuchâtel, Switzerland, he has maintained his dedication to photography, painting, and film‐making, which has resulted in three full‐length documentaries: Turicum: This is Zurich (2014), Dear Son (2018), and Moving Heart (2019). He is the author, co‐author, or editor of more than 20 books published by Oxford University Press, Springer, John Wiley, Dunod, and North‐Holland, as well as numerous papers.

Regina García‐Velázquez

Dr. Regina García‐Velázquez has a background in Psychology and received her master's degree in Methodology of Behavioural and Health Sciences from the Autonomous University of Madrid. She received her PhD from the University of Helsinki, where she is a post‐doctoral researcher and teaches courses on psychometrics. She is interested in measurement issues applied to psychopathology, particularly classification, validity, and statistical modelling. In her current research, she focuses on internalizing disorders.

Jade E. Kobayashi

Jade E. Kobayashi, MA is a clinical psychology graduate student in the doctoral program at Michigan State University. Her research interests include adult romantic attachment, interpersonal conflict and intimate partner violence (IPV), and intensive longitudinal and dyadic analytic methods.

Alytia A. Levendosky

Alytia A. Levendosky, PhD is a Professor in the Department of Psychology at Michigan State University. Her research over the past 25 years has focused on the effects of intimate partner violence (IPV) on mothers and children. Currently, her primary research interests are in the role of IPV as a stressor during the perinatal period. Her work has helped elucidate how IPV during pregnancy affects the very beginnings of motherhood through women's developing representations/schemas about their unborn child which later affect their parenting behaviors during early childhood.

Xintong Li

Xintong Li is Senior Research Analyst at the Assessment Resource Center at the University of Missouri and an experienced researcher specializing in quantitative methods and educational research. He received his PhD in statistics, measurement, and evaluation in education at the University of Missouri. His major research interests include causal inference with non‐experimental data, educator effectiveness, and motivation in education. He is skilled and experienced in advanced statistical modeling, programming, large‐scale simulations, and large database management. He has multiple publications on methodological foundations and applications of direction dependence principles, published in, e.g., Multivariate Behavioral Research, Behavior Research Methods, and Prevention Science.

Joel T. Nigg

Joel T. Nigg, PhD is a clinical psychologist and professor of Psychiatry and Behavioral Neuroscience and Director of the Center for ADHD Research at Oregon Health & Science University. His research on ADHD and related conditions has been funded by NIH continuously for over 20 years. His work focuses on refining the phenotype related to cognition and emotion and examining environmental and genetic etiology.

Tom Rosenström

Dr. Tom Rosenström obtained his education in psychology (MA, PhD) and applied mathematics (MSc) at the University of Helsinki, Finland. He conducts mental health research, applying and developing mathematical and statistical models within the field. In addition, he has worked on theoretical biology at the University of Bristol and on behavior genetics at the Norwegian Institute of Public Health, and he is currently employed by the Helsinki University Hospital, where he also conducts clinical patient work.

Valentin Rousson

Valentin Rousson was born in 1967 in Neuchâtel, Switzerland, where he received a PhD in Statistics in 1998. He then spent time at the Australian National University in Canberra as a postdoc and at the University of Zurich, dividing his time between statistical consulting and research. In 2007, he was appointed Associate Professor in Biostatistics at the University of Lausanne, where he is currently working and teaching.

Shohei Shimizu

Shohei Shimizu is a Professor at the Faculty of Data Science, Shiga University, Japan and leads the Causal Inference Team, RIKEN Center for Advanced Intelligence Project. He received a PhD in Engineering from Osaka University in 2006. His research interests include statistical methodologies for learning data generating processes such as structural equation modeling and independent component analysis and their application to causal inference. He received the Hayashi Chikio Award (Excellence Award) from the Behaviormetric Society in 2016. He is a coordinating editor of Behaviormetrika since 2016 and is an associate editor of Neurocomputing since 2019.

Diane D. Stadler

Diane Stadler has a PhD in Human Nutrition and is a registered dietitian with expertise in maternal and infant nutrition and providing care for children with metabolic disorders and developmental disabilities. She directs the Graduate Programs in Human Nutrition at Oregon Health & Science University in Portland, Oregon and is a leader in OHSU's nutrition education initiatives and research mentoring programs. She also oversees OHSU's clinical nutrition specialist training program and research initiatives in Lao People's Democratic Republic to support health care providers in addressing the country's high rates of childhood malnutrition.

Santi Tasena

Since 2011, Santi Tasena has been working at Chiang Mai University, Thailand. Being in love with mathematics, he enjoys discussing any topic related to mathematics. His research interests include mathematical analysis and related fields. His work includes heat kernel analysis on metric spaces, (sub)copulas and measures of dependence, and construction of aggregation and related functions. He received grants from the Commission on Higher Education, Thailand, the Centre of Excellence in Mathematics (CHE), Thailand, the Data Science Research Center, and the Center of Excellence in Mathematics and Applied Mathematics, Chiang Mai University, Thailand.

Tonghui Wang

Tonghui Wang is currently a full professor of statistics in the Department of Mathematical Sciences, New Mexico State University. He received his PhD degree from the University of Windsor, Canada, in May 1993. His research interests are multivariate linear models under skew‐normal settings; copulas and their associated measures with applications; and big data analysis and statistical learning with applications.

Zheng Wei

Zheng Wei is currently an assistant professor of statistics in the Department of Mathematics and Statistics at the University of Maine. He served as a visiting assistant professor in the Department of Mathematics and Statistics, University of Massachusetts Amherst, from May 2015 to August 2017. His research spans Bayesian statistical methods for data science, big data and analytics, and copula theory and its applications. He completed his PhD at New Mexico State University in May 2015.

Phillip K. Wood

Phil K. Wood is a professor of Quantitative Psychology at the University of Missouri. He specializes in structural equation modeling, growth curve modeling and factor analysis, with particular emphasis on techniques for the analysis of longitudinally intensive data such as dynamic factor models. His substantive areas of interest include the cognitive outcomes of higher education and longitudinal inter‐individual differences in behaviors during young adulthood such as problematic alcohol use, tobacco and other drug usage and risky sexual behaviors.

Xiaonan Zhu

Xiaonan Zhu is an assistant professor in the Department of Mathematics at the University of North Alabama. Before joining UNA in Fall 2019, he obtained his PhD and MS in Mathematical Statistics from the Department of Mathematical Sciences at New Mexico State University in 2019 and 2014, respectively. His research interests include sampling distributions of skew normal distributions, distributions of quadratic forms under closed skew normal settings, construction of copulas, (local) dependence of random vectors, and measures of dependence through (sub‐)copulas.

Acknowledgments

There are numerous people to thank with regard to the preparation of this volume. First and foremost, we offer our deepest thanks to our contributing authors, with whom we share a dedication to the development and application of statistical methods in the context of direction of dependence and causality. This volume would not have been possible without their excellent work.

We are also grateful to Wiley publishers for their interest in the topic and their support. This applies in particular to Sari Friedman, Kathleen Santoloci, Mindy Okura‐Marszycki, Elisha Benjamin, Sechin Nithya, Amudhapriya Sivamurthy, and Ezhilan Vikraman who have supported and guided us from the very first contact to the completion of this book. Thank you all!

Most important, we are grateful for the love and support of our respective families. The first editor wants to emphasize that he is grateful to be allowed to experience a causal mechanism that does not require any empirical evaluation – the dependence between W's happiness and the existence of Anna and Linus. No statistical modeling is needed to show that {Anna, Linus} → (W = happy) holds in an unconfounded manner. The second editor would like to express gratitude and sincere thanks to his wife, Shu‐Min, and his son, Minjun, for their tremendous support and love.

Preface

Questions concerning causation are omnipresent in the empirical sciences. In non‐experimental research, however, it is often hard to determine the status of variables as cause and effect. Temporal order alone is of limited use, unless one observes antecedents and the beginning of a chain of events. That is, even when a putative explanatory variable (x) is measured earlier in time than the (putative) outcome (y), one cannot rule out that the outcome, measured at an earlier point in time, may have caused x. Similarly, temporality alone does not prevent causal effect estimates from being biased unless one is able to adjust for all relevant (potentially time‐varying) confounders (Bellemare, Masaki, & Pepinsky, 2017). Cross‐sectional research has often been looked down upon because it is deemed of little use for the analysis of hypotheses that are compatible with (possibly competing) theories of causality. Based on cross‐sectional data alone, for example, one is not able to distinguish whether a relation between x and y is observed because of an underlying causal model of the form x → y (i.e. x causes y), the reverse‐causal model y → x (y causes x), or whether the observed relation is spurious due to (total or partial) confounding, x ← u → y.

Limitations of longitudinal and cross‐sectional observational research are (partly) rooted in the limitations of the statistical methods that are routinely applied to analyze dependence structures. In both research designs, covariance‐based methods (such as correlational, linear regression, and structural equation modeling techniques) are de rigueur. Although these methods can be useful in the estimation of the magnitude of causal effects (provided that certain unconfoundedness conditions are fulfilled, see, e.g. Pearl, 2009), they do not help to empirically distinguish between cause and effect. For example, in the standardized case, linear regression parameters for the model x → y are identical to the ones that are estimated for the reverse regression, y → x (von Eye & DeShon, 2012). These symmetry properties of the linear regression model have been known since its early origins (Galton, 1886). In fact, the observation that regression is inherently symmetric was one of the reasons why Francis Galton (the "founding father" of linear regression) changed his characterization of the phenomenon whereby previously suppressed hereditary traits can re‐appear from one of "reversion" to one of "regression" (Gorroochurn, 2016). In other words, symmetry properties influenced how linear regression was conceptualized as a statistical tool. Similarly, symmetry properties of conventional representations of the Pearson product‐moment correlation (for an overview of various facets of the Pearson correlation see, for example, Rodgers and Nicewander (1988), Rovine and von Eye (1997), Falk and Well (1997), and Nelsen (1998)) certainly contributed to the widespread and well‐known mantra that correlation does not imply causation and to the belief that the means of statistics cannot be used to establish the causal direction of dependence.
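To make this symmetry concrete, the following minimal simulation (a sketch in Python; the variable names, effect size, and seed are illustrative and not taken from any chapter of this book) generates data under the model x → y, standardizes both variables, and shows that the least-squares slope of y on x equals the slope of x on y:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)          # data generated under x -> y

def standardize(v):
    return (v - v.mean()) / v.std()

zx, zy = standardize(x), standardize(y)
slope_y_on_x = np.sum(zx * zy) / np.sum(zx ** 2)   # regression of zy on zx
slope_x_on_y = np.sum(zy * zx) / np.sum(zy ** 2)   # regression of zx on zy
print(slope_y_on_x, slope_x_on_y)         # identical; both equal the Pearson correlation
```

Nothing in the two fitted slopes reveals which variable was the cause.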

Fortunately, this state of affairs has changed recently. It took statisticians until the beginning of the new millennium to get a handle on the issue of direction dependence. But in 2000, Dodge and Rousson derived, within the framework of the linear regression model, the relation between cause and effect variables for the (not so) particular case in which the cause variable is asymmetrically distributed. Specifically, these authors showed that variable information beyond means, variances, and covariances (e.g. skewness and co‐skewness) can be used to empirically determine which of two variables is more likely to be the cause and which is more likely to be the effect. Focusing on asymmetry properties of the linear regression and the Pearson correlation, the work by Dodge and Rousson (2000) initiated a new topic and line of statistical research, that of the development and application of methods for the analysis of direction dependence and causal hypotheses. Dodge and Rousson (2000) focused on asymmetry that emerges from marginal variable distributions. Asymmetry properties based on error distributions were later proposed by Wiedermann, Hagmann, and von Eye (2015), Wiedermann and von Eye (2015b), and Wiedermann and Hagmann (2016). Extensions to measurement error models were recently discussed in Wiedermann, Merkle, and von Eye (2018). The second seminal paper in this new line of research was published in 2005 by Engin A. Sungur (see Sungur (2005a); a discussion of copulas in the regression context is given by Sungur (2005b)). While Dodge and Rousson's (2000) initial work focused on determining the direction of dependence through studying the marginal behavior of distributions, Sungur (2005a) proposed to study the behavior of joint variable distributions by making use of copulas. This copula‐based direction dependence approach constitutes a second line of research that allows researchers to analyze cause–effect properties of variables while accounting for potential differences in marginal distributions. Copula‐based directional dependence analysis has experienced rapid development. Various extensions have been proposed by, e.g. Kim and Kim (2014, 2016), Wei and Kim (2017, 2018), and Kim and Hwang (2019) – more recent applications of the approach are given by Lee and Kim (2019) and Kim, Lee and Xiao (2019). The third seminal paper in the development of methods to distinguish between cause and effect variables was published by Shimizu and colleagues in 2006, proposing the linear non‐Gaussian acyclic model (LiNGAM) – a causal machine learning algorithm for non‐normal variables that is closely related to independent component analysis (Hyvärinen, Karhunen, & Oja, 2001). LiNGAM has developed rapidly in the area of machine learning research and has been extended to nonlinear variable relations (Zhang & Hyvärinen, 2016), models with hidden common causes (Hoyer, Shimizu, Kerminen, & Palviainen, 2008; Shimizu & Bollen, 2014), and mixed (continuous and categorical) data (Yamayoshi, Tsuchida, & Yadohisa, 2020), to name a few. For an overview of recent advances in causal machine learning, see Guyon, Statnikov, and Batu (2019).
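The following sketch (Python; a crude stand-in for the independence and higher-moment tests that the methods cited above formalize, with illustrative variable names, effect sizes, and a simple ad hoc dependence proxy) hints at where the exploitable asymmetry comes from: when the true model is linear with a non-Gaussian cause, the residual of the correctly oriented regression is approximately independent of its predictor, whereas the residual of the reversed regression is not:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000
x = rng.exponential(size=n) - 1.0                 # non-Gaussian cause
y = 0.7 * x + (rng.exponential(size=n) - 1.0)     # true model: x -> y, non-Gaussian error

def residual(target, predictor):
    p = predictor - predictor.mean()
    b = np.sum(p * (target - target.mean())) / np.sum(p ** 2)
    return target - b * predictor

def dependence(u, v):
    # crude nonlinear-dependence proxy: correlation of squared, centered values
    return np.corrcoef((u - u.mean()) ** 2, (v - v.mean()) ** 2)[0, 1]

print(dependence(x, residual(y, x)))  # close to zero: residual looks independent of the cause
print(dependence(y, residual(x, y)))  # clearly nonzero: the reversed model leaves dependence behind
```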

The present book is concerned with novel statistical approaches to the analysis of the causal direction of dependence of variables in both exploratory (i.e. learning the causal structures from observational data without background knowledge) and confirmatory (i.e. testing a priori existing competing causal theories) research scenarios, and presents original work in four modules. In the first module, Fundamental Concepts of Direction Dependence, Dodge and Rousson (Chapter 1) introduce the well‐known Pearson correlation coefficient as an asymmetric concept of two variables, which (as discussed above) served as a starting point for several lines of direction dependence research. Further, the authors provide a reminder that working with non‐normality of variables (a key requirement for deriving asymmetry properties in the linear case) bears challenges in practice (e.g. distinguishing between non‐normality as a characteristic of the construct under study versus non‐normality due to outliers and suboptimal measurement). In Chapter 2, Wiedermann, Li, and von Eye then continue the discussion of asymmetry properties of the linear regression model and introduce three asymmetry concepts (summarized in a framework termed Direction Dependence Analysis (DDA), cf. Wiedermann and von Eye, 2015a; Wiedermann & Li, 2018) that can be used to detect potential confounding and to distinguish between the two causally competing models x → y and y → x. Applications of DDA in the context of mediation and moderation models are discussed. Chapter 3, by Engin A. Sungur, is devoted to the use of copulas in direction dependence modeling. This chapter introduces definitions and fundamental principles to model directional dependence of variables using asymmetric copulas and regression, and describes various copula‐based directional dependence measures to perform model selection in both continuous and categorical data settings.

The second module is devoted to Direction Dependence in Continuous Variables. Chapter 4, by Wolfgang Wiedermann, discusses asymmetry properties of the partial correlation coefficient in the research tradition of Dodge and Rousson (2000). Asymmetric facets of the partial correlation coefficient are presented which enable one to test causally competing models while adjusting for relevant background variables. Parameter recovery and accuracy of model selection are evaluated using Monte‐Carlo simulation experiments. Chapter 5, by Shimizu and Blöbaum, gives an overview of recent advances in the development of algorithms for unsupervised causal learning. The authors start by introducing the standard LiNGAM and present extensions to structural vector autoregressive models for the analysis of time series data, models with hidden common causes, and methods for causal learning under nonlinearity of variable relations. In Chapter 6, Phillip K. Wood takes a regression diagnostic perspective and discusses the importance of evaluating the assumptions of the statistical models that are used to learn the causal structure of observational data. The author uses data from a longitudinal study on motives for alcohol consumption (cf. Sher & Rutledge, 2007) and compares the use of manifest variable composites, factor scores within a state‐trait model, and latent difference factor scores in the evaluation of directional dependence hypotheses. The last chapter of this module (Chapter 7), by Santi Tasena, reviews definitions and basic properties of measures of complete dependence. The author gives examples of calculating complete dependence measures in the case of the multivariate Gaussian distribution and presents open problems and potential future directions.

In the third module, methods of direction dependence are extended to the categorical variable domain. Chapter 8, by von Eye and Wiedermann, introduces an event‐based perspective on the analysis of hypotheses compatible with direction dependence. The authors introduce two‐valued statement calculus to derive composite causality statements and use a design matrix approach to evaluate event‐based direction dependence hypotheses. Three methods are compared with respect to their capability to test direction of dependence in categorical data: log‐linear modeling, configural frequency analysis, and prediction analysis. Chapter 9, contributed by Zhu, Wei, and Wang, is devoted to a copula‐based approach to measuring associations in contingency tables. The authors start by reviewing some recently developed measures for the analysis of asymmetric associations in two‐way and three‐way contingency tables. Then, they propose two new measures of complete dependence for three‐way contingency tables and present corresponding nonparametric estimators. Chapter 10, by Kim and Wei, investigates a subcopula‐based asymmetric association measure for the analysis of dependence structures in three‐way ordinal contingency tables. Their asymmetric measure utilizes subcopula regressions obtained under the hypothesized dependence relations.

The fourth module is then devoted to Applications and Software. In Chapter 11, Rosenström and García‐Velázquez make use of LiNGAM in the context of psychiatric epidemiology. Specifically, the authors use distribution‐based indicators to test the causal direction of the association between sleeping problems and depressive symptoms using data from the Swedish Adoption/Twin Study on Aging (Pedersen, 2005). In addition, the authors provide application guidelines for epidemiologists, present a novel Monte‐Carlo‐based sensitivity analysis approach to evaluate the robustness of LiNGAM results, and integrate distribution‐based causality approaches into the process of causal triangulation in etiologic epidemiology. Chapter 12, by Nigg, Stadler, von Eye, and Wiedermann, provides an application of direction dependence analysis in the context of determining risk factors of attention‐deficit/hyperactivity disorder (ADHD). Specifically, direction dependence methods for linear models are used to evaluate the causal structure of the association between breastfeeding duration and ADHD. The authors use one of the largest well‐characterized samples currently available and demonstrate that DDA results can be affected by rater effects when measuring ADHD. Further, an approach is presented to account for potential ceiling/floor effects that can artificially inflate the non‐normality of variables. In Chapter 13, Bogat, Levendosky, Kobayashi, and von Eye then take a longitudinal data perspective in the discussion of causal effect directionality. The authors use daily diary data to assess longitudinal dynamics of the causal structure of intimate partner violence and mood lability in young adult couples. Granger causality models (a causal prediction approach in which one tests whether the inclusion of past information on one variable, e.g. x at time t − 1, is useful in predicting another variable y at time t above and beyond the information contained in y's own past; Granger, 1969) are applied to test whether intimate partner violence is more likely to cause mood lability or vice versa. In the final chapter (Chapter 14), by Li and Wiedermann, a software implementation of direction dependence methods is presented. The authors introduce SPSS custom dialogs to perform DDA and use data from the High School Longitudinal Study of 2009 (Ingels et al., 2011) for illustrative purposes. Specifically, the authors present a step‐by‐step tutorial to evaluate the causal direction of effect between academic achievement and intrinsic motivation in 9th grade Asian students.
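For readers unfamiliar with the Granger logic sketched above, the following minimal example (Python; a simulated pair of series with an arbitrary lag order of one and illustrative coefficients, not the models estimated in Chapter 13) compares the residual sum of squares of a restricted model that predicts y from its own past with that of a full model that also includes the past of x:

```python
import numpy as np

rng = np.random.default_rng(3)
T = 500
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):                              # simulate x Granger-causing y
    x[t] = 0.6 * x[t - 1] + rng.normal()
    y[t] = 0.4 * y[t - 1] + 0.5 * x[t - 1] + rng.normal()

def rss(design, target):
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return np.sum((target - design @ beta) ** 2)

ones = np.ones(T - 1)
restricted = np.column_stack([ones, y[:-1]])        # y_t regressed on y_{t-1}
full = np.column_stack([ones, y[:-1], x[:-1]])      # y_t regressed on y_{t-1} and x_{t-1}
print(rss(restricted, y[1:]), rss(full, y[1:]))     # substantial drop when x's past is added
```

In practice the comparison is formalized with a statistical test rather than a raw comparison of residual sums of squares.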

Within the last two decades, tremendous progress has been made in the area of direction dependence modeling. We believe that this volume makes a timely and important contribution to the ongoing development of methods of direction dependence and we hope that this contribution will advance the statistical tools empirical sciences can use to better explain causal phenomena.

Wolfgang Wiedermann, University of Missouri, Columbia

Daeyoung Kim, University of Massachusetts, Amherst

Engin A. Sungur, University of Minnesota, Morris

Alexander von Eye, Michigan State University, East Lansing

References

Bellemare, M. F., Masaki, T., & Pepinsky, T. B. (2017). Lagged explanatory variables and the estimation of causal effect. Journal of Politics, 79, 949–963. doi:10.2139/ssrn.2568724

Dodge, Y., & Rousson, V. (2000). Direction dependence in a regression line. Communications in Statistics – Theory and Methods, 29(9–10), 1957–1972. doi:10.1080/03610920008832589

Falk, R., & Well, A. D. (1997). Many faces of the correlation coefficient. Journal of Statistics Education, 5. Retrieved from http://www.amstat.org/publications/jse/v5n3/falk.html

Galton, F. (1886). Family likeness in stature. Proceedings of the Royal Society of London, 40(242–245), 42–73. doi:10.1098/rspl.1886.0009

Gorroochurn, P. (2016). On Galton's change from "reversion" to "regression". The American Statistician, 70(3), 227–231. doi:10.1080/00031305.2015.1087876

Granger, C. W. J. (1969). Investigating causal relations by econometric models and cross‐spectral methods. Econometrica, 37(3), 424–438. doi:10.2307/1912791

Guyon, I., Statnikov, A., & Batu, B. B. (Eds.) (2019). Cause effect pairs in machine learning. doi:10.1007/978‐3‐030‐21810‐2

Hoyer, P. O., Shimizu, S., Kerminen, A. J., & Palviainen, M. (2008). Estimation of causal effects using linear non‐Gaussian causal models with hidden variables. International Journal of Approximate Reasoning, 49(2), 362–378. doi:10.1016/j.ijar.2008.02.006

Hyvärinen, A., Karhunen, J., & Oja, E. (2001). Independent component analysis. New York, NY: Wiley & Sons.

Ingels, S. J., Pratt, D. J., Herget, D. R., Burns, L. J., Dever, J. A., Ottem, R., … LoGerfo, L. (2011). High School Longitudinal Study of 2009 (HSLS: 09): Base‐year data file documentation. Washington, DC: U.S. Dept. of Education, Institute of Education Sciences, National Center for Education Statistics.

Kim, D., & Kim, J.‐M. (2014). Analysis of directional dependence using asymmetric copula‐based regression models. Journal of Statistical Computation and Simulation, 84(9), 1990–2010. doi:10.1080/00949655.2013.779696

Kim, S., & Kim, D. (2016). Directional dependence analysis using skew‐normal copula‐based regression. In W. Wiedermann & A. von Eye (Eds.), Statistics and causality: Methods for applied empirical research (pp. 131–152). Hoboken, NJ: Wiley and Sons.

Kim, J.‐M., & Hwang, S. Y. (2019). The copula directional dependence by stochastic volatility models. Communications in Statistics – Simulation and Computation, 48(4), 1153–1175. doi:10.1080/03610918.2017.1406512

Lee, N., & Kim, J.‐M. (2019). Copula directional dependence for inference and statistical analysis of whole‐brain connectivity from fMRI data. Brain and Behavior, 9(1), e01191. doi:10.1002/brb3.1191

Nelsen, R. B. (1998). Correlation, regression lines, and moments of inertia. American Statistician, 52, 343–345.

Pearl, J. (2009). Causality: Models, reasoning, and inference (2nd ed.). New York, NY: Cambridge University Press.

Pedersen, N. L. (2005). Swedish Adoption/Twin Study on Aging (SATSA), 1984, 1987, 1990, 1993, 2004, 2007, and 2010 [Data set]. doi:10.3886/ICPSR03843.v2

Rodgers, J. L., & Nicewander, W. A. (1988). Thirteen ways to look at the correlation coefficient. American Statistician, 42, 59–66.

Rovine, M. J., & von Eye, A. (1997). A 14th way to look at a correlation coefficient: Correlation as the proportion of matches. American Statistician, 51, 42–46.

Sher, K. J., & Rutledge, P. C. (2007). Heavy drinking across the transition to college: Predicting first‐semester heavy drinking from precollege variables. Addictive Behaviors, 32, 819–835.

Shimizu, S., & Bollen, K. A. (2014). Bayesian estimation of causal direction in acyclic structural equation models with individual‐specific confounder variables and non‐Gaussian distributions. Journal of Machine Learning Research, 15, 2629–2652.

Shimizu, S., Hoyer, P. O., Hyvärinen, A., & Kerminen, A. (2006). A linear non‐Gaussian acyclic model for causal discovery. The Journal of Machine Learning Research, 7, 2003–2030.

Sungur, E. A. (2005a). A note on directional dependence in regression setting. Communications in Statistics – Theory and Methods, 34(9–10), 1957–1965. doi:10.1080/03610920500201228

Sungur, E. A. (2005b). Some observations on copula regression functions. Communications in Statistics – Theory and Methods, 34(9–10), 1967–1978. doi:10.1080/03610920500201244

von Eye, A., & DeShon, R. P. (2012). Directional dependence in developmental research. International Journal of Behavioral Development, 36(4), 303–312. doi:10.1177/0165025412439968

Wei, Z., & Kim, D. (2017). Subcopula‐based measure of asymmetric association for contingency tables. Statistics in Medicine, 36, 3875–3894. doi:10.1002/sim.7399

Wei, Z., & Kim, D. (2018). On multivariate asymmetric dependence using multivariate skew‐normal copula‐based regression. International Journal of Approximate Reasoning, 92, 376–391. doi:10.1016/j.ijar.2017.10.016

Wiedermann, W., & Hagmann, M. (2016). Asymmetric properties of the Pearson correlation coefficient: Correlation as the negative association between linear regression residuals. Communications in Statistics – Theory and Methods, 45(21), 6263–6283. doi:10.1080/03610926.2014.960582

Wiedermann, W., & Li, X. (2018). Direction dependence analysis: A framework to test the direction of effects in linear models with an implementation in SPSS. Behavior Research Methods, 50(4), 1581–1601. doi:10.3758/s13428‐018‐1031‐x

Wiedermann, W., & von Eye, A. (2015a). Direction‐dependence analysis: A confirmatory approach for testing directional theories. International Journal of Behavioral Development, 39(6), 570–580. doi:10.1177/0165025415582056

Wiedermann, W., & von Eye, A. (2015b). Direction of effects in multiple linear regression models. Multivariate Behavioral Research, 50, 23–40.

Wiedermann, W., Hagmann, M., & von Eye, A. (2015). Significance tests to determine the direction of effects in linear regression models. British Journal of Mathematical and Statistical Psychology, 68, 116–141.

Wiedermann, W., Merkle, E. C., & von Eye, A. (2018). Direction of dependence in measurement error models. British Journal of Mathematical and Statistical Psychology, 71, 117–145.

Yamayoshi, M., Tsuchida, J., & Yadohisa, H. (2020). An estimation of causal structure based on latent LiNGAM for mixed data. Behaviormetrika, 47(1), 105–121. doi:10.1007/s41237‐019‐00095‐3

Zhang, K., & Hyvärinen, A. (2016). Nonlinear functional causal models for distinguishing cause from effect. In W. Wiedermann & A. von Eye (Eds.), Wiley series in probability and statistics (pp. 185–201). doi:10.1002/9781118947074.ch8

Part I: Fundamental Concepts of Direction Dependence

1 From Correlation to Direction Dependence Analysis 1888–2018

Yadolah Dodge¹ and Valentin Rousson²

¹ Institute of Statistics, University of Neuchâtel, Neuchâtel, Switzerland

² Division of Biostatistics, Center for Primary Care and Public Health (Unisanté), University of Lausanne, Lausanne, Switzerland

1.1 Introduction

The Pearson product‐moment correlation coefficient is one of the most popular statistical measures for summarizing an association between two (continuous) variables X and Y. As suggested by Rodgers and Nicewander (1988), it should actually be renamed the "Galton–Pearson" correlation coefficient, since both men played a significant role in the development and promotion of this coefficient in statistics. The concept of correlation was introduced by Francis Galton in 1888 (Blyth, 1994; Galton, 1888), although it was already presented in 1885 in relation to regression (Galton, 1885; Rodgers & Nicewander, 1988), while Karl Pearson (1895) provided the mathematical formula. See e.g. Stigler (1989) for a detailed historical account. Although Pearson (1930), quoted in Aldrich (1995), wrote that "up to 1889 men of science had thought only in terms of causation," it was clear from the very beginning that "correlation does not imply causation." For example, Aldrich (1995) mentioned that Francis Galton (1888) was well aware that "the correlation between two variables measures the extent to which they are governed by common causes." Thus, establishing a correlation between X and Y does not imply that one variable is the cause and the other is the (direct or indirect) consequence, but only that the two variables are associated, perhaps due to the existence of a third variable Z which would be a common cause of both X and Y. In fact, even if one could completely rule out the possibility of the existence of such a variable Z, there would be no way to conclude from a correlation which of X and Y is the cause and which is the consequence, since the formula provided by Karl Pearson is perfectly (and beautifully) symmetric in X and Y.

1.2 Correlation as a Symmetrical Concept of X and Y

Given a sample of n observations $(X_i, Y_i)$, $i = 1, \ldots, n$, from a bivariate variable (X, Y), the (Pearson product‐moment) correlation (coefficient) can be calculated as:

(1.1)  $r_{XY} = \dfrac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^{2}}\,\sqrt{\sum_{i=1}^{n}(Y_i - \bar{Y})^{2}}}$

where $\bar{X}$ and $\bar{Y}$ denote the sample means of X and Y. This is also the covariance between X and Y divided by the product of their standard deviations. Obviously, one has $r_{XY} = r_{YX}$. As mentioned in Section 1.1, correlation is intimately related to regression. Let us consider the regression equation with Y as the response variable and X as the predictor:

(1.2)  $Y_i = \alpha_1 + \beta_1 X_i + \varepsilon_i$

as well as the regression equation with X as the response variable and Y as the predictor:

(1.3)  $X_i = \alpha_2 + \beta_2 Y_i + \varepsilon_i'$

If the goal is to get residuals $\varepsilon_i$ and $\varepsilon_i'$ with zero mean and with the smallest possible variances, the regression coefficients are obtained via the least squares criterion, which for the slopes are given by:

(1.4)  $\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(X_i - \bar{X})^{2}}$

and by:

(1.5)  $\hat{\beta}_2 = \dfrac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(Y_i - \bar{Y})^{2}}$

Thus, the correlation is also (up to its sign) the geometrical mean of the slopes in Eqs. (1.2) and (1.3):

(1.6)  $r_{XY} = \operatorname{sign}(\hat{\beta}_1)\,\sqrt{\hat{\beta}_1 \hat{\beta}_2}$

Again, this is a symmetrical formula in X and Y. Many other ways to calculate or to interpret a correlation have been provided in the statistical literature. In particular, Rodgers and Nicewander (1988) identified 13 ways to look at the correlation, whereas a 14th way has been added to the list by Rovine and von Eye (1997), and even more by Falk and Well (1997). However, all these formulas, when involving two continuous variables, are symmetrical in X and Y.
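As a quick numerical check of Eqs. (1.4)–(1.6), consider the following sketch in Python (the intercept, slope, sample size, and seed are arbitrary choices for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000
x = rng.normal(size=n)
y = 1.5 + 2.0 * x + rng.normal(size=n)

sxy = np.sum((x - x.mean()) * (y - y.mean()))
b1 = sxy / np.sum((x - x.mean()) ** 2)    # slope of y on x, Eq. (1.4)
b2 = sxy / np.sum((y - y.mean()) ** 2)    # slope of x on y, Eq. (1.5)
r = np.corrcoef(x, y)[0, 1]
print(r, np.sign(b1) * np.sqrt(b1 * b2))  # the two values agree, as in Eq. (1.6)
```

Swapping the roles of x and y leaves the product of the two slopes, and hence the correlation, unchanged.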

1.3 Correlation as an Asymmetrical Concept of X and Y

When one considers a linear regression model (1.2) or (1.3), one usually assumes residuals that are normally distributed and have the same variance (homoscedasticity), yielding independence between the predictor and the residual variable (the residual distribution is the same whatever the value of the predictor), which is also what is assumed in what follows. In that case, models (1.2) and (1.3) cannot hold simultaneously unless the distribution of (X, Y) is bivariate normal. In particular, if one considers that both X and Y are non‐normal, at most one of (1.2) and (1.3) may hold. It is in such a context of non‐normal X and Y that Dodge and Rousson (2000, 2001) introduced further formulas to interpret a correlation. Under model (1.2), and using basic properties of cumulants (see e.g. Kendall & Stuart, 1963), which differ from zero for non‐normal variables, they noted that:

(1.7)  $\rho_{XY}^{m} = \dfrac{\operatorname{cumulant}_m(Y)}{\operatorname{cumulant}_m(X)}$

where $\operatorname{cumulant}_m(V)$ denotes the mth (standardized) cumulant of a random variable V, with m ≥ 3, yielding the skewness coefficient for m = 3. One has thus for example:

(1.8)  $\rho_{XY}^{3} = \dfrac{\operatorname{skewness}(Y)}{\operatorname{skewness}(X)}$
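A small simulation illustrates Eq. (1.8) (a Python sketch; the exponential cause, the slope of 2, and the normal error are arbitrary choices that satisfy model (1.2) with a skewed predictor):

```python
import numpy as np

def skewness(v):
    z = (v - v.mean()) / v.std()
    return np.mean(z ** 3)

rng = np.random.default_rng(5)
n = 200_000
x = rng.exponential(size=n)            # skewed cause
y = 2.0 * x + rng.normal(size=n)       # normally distributed residual: model (1.2) holds

r = np.corrcoef(x, y)[0, 1]
print(r ** 3, skewness(y) / skewness(x))   # approximately equal, as in Eq. (1.8)
print(r ** 3, skewness(x) / skewness(y))   # the reversed ratio does not match
```

Reading the skewness ratio in the correct direction recovers the cube of the correlation, whereas the reversed ratio does not; this is the kind of asymmetry that direction dependence analysis exploits.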