Presents a novel approach to conducting meta-analysis using structural equation modeling.
Structural equation modeling (SEM) and meta-analysis are two powerful statistical methods in the educational, social, behavioral, and medical sciences. They are often treated as two unrelated topics in the literature. This book presents a unified framework for analyzing meta-analytic data within the SEM framework and illustrates how to conduct meta-analysis using the metaSEM package in the R statistical environment.
Meta-Analysis: A Structural Equation Modeling Approach begins by introducing the importance of SEM and meta-analysis in answering research questions. Key ideas in meta-analysis and SEM are briefly reviewed, and various meta-analytic models are then introduced and linked to the SEM framework. Fixed-, random-, and mixed-effects models in univariate and multivariate meta-analyses, three-level meta-analysis, and meta-analytic structural equation modeling are introduced. Advanced topics, such as using the restricted maximum likelihood estimation method and handling missing covariates, are also covered. Readers will learn a single framework to apply both meta-analysis and SEM. Examples in R and in Mplus are included.
This book will be a valuable resource for statistical and academic researchers and graduate students carrying out meta-analyses, and will also be useful to researchers and statisticians using SEM in biostatistics. Basic knowledge of either SEM or meta-analysis will be helpful in understanding the materials in this book.
Number of pages: 592
Year of publication: 2015
Cover
Title Page
Copyright
Dedication
Preface
Purpose of This Book
Level and Prerequisites
Acknowledgments
List of abbreviations
List of figures
List of tables
Chapter 1: Introduction
1.1 What is meta-analysis?
1.2 What is structural equation modeling?
1.3 Reasons for writing a book on meta-analysis and structural equation modeling
1.4 Outline of the following chapters
1.5 Concluding remarks and further readings
References
Chapter 2: Brief review of structural equation modeling
2.1 Introduction
2.2 Model specification
2.3 Common structural equation models
2.4 Estimation methods, test statistics, and goodness-of-fit indices
2.5 Extensions on structural equation modeling
2.6 Concluding remarks and further readings
References
Chapter 3: Computing effect sizes for meta-analysis
3.1 Introduction
3.2 Effect sizes for univariate meta-analysis
3.3 Effect sizes for multivariate meta-analysis
3.4 General approach to estimating the sampling variances and covariances
3.5 Illustrations using R
3.6 Concluding remarks and further readings
References
Chapter 4: Univariate meta-analysis
4.1 Introduction
4.2 Fixed-effects model
4.3 Random-effects model
4.4 Comparisons between the fixed- and the random-effects models
4.5 Mixed-effects model
4.6 Structural equation modeling approach
4.7 Illustrations using R
4.8 Concluding remarks and further readings
References
Chapter 5: Multivariate meta-analysis
5.1 Introduction
5.2 Fixed-effects model
5.3 Random-effects model
5.4 Mixed-effects model
5.5 Structural equation modeling approach
5.6 Extensions: mediation and moderation models on the effect sizes
5.7 Illustrations using R
5.8 Concluding remarks and further readings
References
Chapter 6: Three-level meta-analysis
6.1 Introduction
6.2 Three-level model
6.3 Structural equation modeling approach
6.4 Relationship between the multivariate and the three-level meta-analyses
6.5 Illustrations using R
6.6 Concluding remarks and further readings
References
Chapter 7: Meta-analytic structural equation modeling
7.1 Introduction
7.2 Conventional approaches
7.3 Two-stage structural equation modeling: fixed-effects models
7.4 Two-stage structural equation modeling: random-effects models
7.5 Related issues
7.6 Illustrations using R
7.7 Concluding remarks and further readings
References
Chapter 8: Advanced topics in SEM-based meta-analysis
8.1 Restricted (or residual) maximum likelihood estimation
8.2 Missing values in the moderators
8.3 Illustrations using R
8.4 Concluding remarks and further readings
References
Chapter 9: Conducting meta-analysis with Mplus
9.1 Introduction
9.2 Univariate meta-analysis
9.3 Multivariate meta-analysis
9.4 Three-level meta-analysis
9.5 Concluding remarks and further readings
References
Appendix A: A brief introduction to R, OpenMx, and metaSEM packages
A.1 R
A.2 OpenMx
A.3 metaSEM
References
Index
End User License Agreement
Mike W.-L. Cheung
National University of Singapore, Singapore
This edition first published 2015
© 2015 John Wiley & Sons, Ltd
Registered office
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.
The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloging-in-Publication Data applied for
A catalogue record for this book is available from the British Library.
ISBN: 9781119993438
For my family—my wife Maggie, my daughter little Ching Ching, and my parents
“If all you have is a hammer, everything looks like a nail.”
—Maslow's hammer
There were two purposes for writing this book. One was personal and the other was more “formal.” I will give the personal one first. The primary motivation for writing this book was to document my own journey in learning structural equation modeling (SEM) and meta-analysis. The journey began when I was an undergraduate student. I first learned SEM from Wai Chan, my former supervisor. After learning a bit from the giants in SEM, such as Karl Jöreskog, Peter Bentler, Bengt Muthén, Kenneth Bollen, Michael Browne, Michael Neale, and Roderick McDonald, among others, I found SEM fascinating. It seems that SEM is the statistical framework for all data analysis. Nearly all statistical techniques I learned can be formulated as structural equation models.
In my graduate study, I came across a different technique: meta-analysis. I learned meta-analysis by reading the classic book by Larry Hedges and Ingram Olkin. I was impressed that a simple yet elegant statistical model could be used to synthesize findings across studies. It seems that meta-analysis is the key to advancing knowledge by combining results from different studies. As I was trained in SEM, everything looks like a structural equation model to me. I asked the question, “Could a meta-analysis be a structural equation model?” This book summarizes my journey over the past decade and a half to answer this question.
Now, I will give a more formal purpose of this book. With the advances in statistics and computing, researchers have more statistical tools to answer their research questions. SEM and meta-analysis are two powerful statistical techniques in the social, educational, behavioral, and medical sciences. SEM is a popular tool to test hypothesized models by modeling the latent and observed variables in primary research, while meta-analysis is a de facto tool to synthesize research findings from a pool of empirical studies. These two techniques are usually treated as two unrelated topics in the literature. They have their own strengths, weaknesses, assumptions, models, terminologies, software packages, audiences, and even journals (Structural Equation Modeling: A Multidisciplinary Journal and Research Synthesis Methods). Researchers working in one area rarely refer to the work in the other area. Advances in one area have basically no impact on the other area.
There were two primary goals for this book. The first one was to present the recent methodological advances on integrating meta-analysis and SEM: the SEM-based meta-analysis (using SEM to conduct meta-analysis) and meta-analytic structural equation modeling (conducting meta-analysis on correlation matrices for the purpose of fitting structural equation models on the pooled correlation matrix). It is my hope that a unified framework will be made available to researchers conducting both primary data analysis and meta-analysis. A single framework makes it easy to translate advances from one field to the other, so researchers do not need to reinvent the wheel.
The second goal was to provide accessible computational tools for researchers conducting meta-analyses. The metaSEM package in the R statistical environment, which is available at http://courses.nus.edu.sg/course/psycwlm/Internet/metaSEM/, was developed to fill this gap. Using the OpenMx package as the workhorse, the metaSEM package implemented most of the methods discussed in this book. Complete examples in R code are provided to guide readers to fit various meta-analytic models. Besides the R code, Mplus was also used to illustrate some of the examples in this book. R (3.1.1), OpenMx (2.0.0-3654), metaSEM (0.9-0), metafor (1.9-3), lavaan (0.5-17.698), and Mplus (7.2) were used in writing this book. The output format may be slightly different from the versions that you are using.
Readers are expected to have some basic knowledge of SEM. This level is similar to the first year of research methods covered in most graduate programs. Knowledge of meta-analysis is preferable though not required. We will go through the meta-analytic models in this book. It will also be useful if readers have some knowledge of R because R is the main statistical environment used to implement the methods introduced in this book. Readers may refer to Appendix A at the end of this book for a quick introduction to R. Readers who are more familiar with Mplus may use it to implement some of the methods discussed in this book.
Mike W.-L. Cheung
Singapore
I thank Wai Chan, my former supervisor, for introducing me to the exciting field of structural equation modeling (SEM). He also suggested that I explore meta-analytic structural equation modeling in my graduate studies. I acknowledge the suggestions and comments made by many people: Shu-fai Cheung, Adam Hafdahl, Suzanne Jak, Yonghao Lim, Iris Sun, and Wolfgang Viechtbauer. All remaining errors are mine. I especially thank my wife for her support and patience. My daughter was born during the preparation of this book. I enjoyed my daughter's company when I was writing this book. Part of the book was completed during my sabbatical leave supported by the Faculty of Arts & Social Sciences, the National University of Singapore. I also appreciate the funding provided by the Faculty to facilitate the production of this book. I thank Heather Kay, Richard Davies, Jo Taylor, and Prachi Sinha Sahay from Wiley. They are very supportive and professional. It has been a pleasure working with them.
The metaSEM package could not have been written without R and OpenMx. Contributions by the R Development Core Team and the OpenMx Core Development Team are highly appreciated. Their excellent work makes it possible to implement the techniques discussed in this book. I especially thank the members of the OpenMx Core Development Team for their quick and helpful responses in addressing issues related to OpenMx. I also thank Yves Rosseel for answering questions related to the lavaan package. Finally, the preparation of this book was mainly based on open-source software. This includes LaTeX for typesetting this book, R for the analyses, Sweave for mixing R and LaTeX, Graphviz and dot2tex for preparing the figures, GNU make for automatically building files, Git for revision control, Emacs for editing files, and finally, Linux as the platform for writing.
Abbreviation  Full name
CFA           confirmatory factor analysis
CFI           comparative fit index
CI            confidence interval
FIML          full information maximum likelihood
GLS           generalized least squares
LBCI          likelihood-based confidence interval
LL            log likelihood
LR            likelihood ratio
MASEM         meta-analytic structural equation modeling
ML            maximum likelihood
NNFI          non-normed fit index
OR            odds ratio
OLS           ordinary least squares
RAM           reticular action model
REML          restricted (or residual) maximum likelihood estimation
RMD           raw mean difference
RMSEA         root mean square error of approximation
SE            standard error
SEM           structural equation modeling
SMD           standardized mean difference
SRMR          standardized root mean square residual
TLI           Tucker–Lewis index
TSSEM         two-stage structural equation modeling
UMM           unweighted method of moments
WLS           weighted least squares
WMM           weighted method of moments
Table 1.1 Datasets used in this book.
Table 5.1 Long format data for a multivariate meta-analysis.
Table 5.2 Wide format data for a multivariate meta-analysis.
Table 6.1 Long format data for a two-level meta-analysis.
Table 6.2 Wide format data for a two-level meta-analysis.
Table 6.3 Wide format data for a three-level meta-analysis.
Table 6.4 Two effect sizes nested within k clusters.
This chapter gives an overview of this book. It first briefly reviews the history and applications of meta-analysis and structural equation modeling (SEM). The importance of using meta-analysis and SEM in advancing scientific research is discussed. This chapter then addresses the needs and advantages of integrating meta-analysis and SEM. It further outlines the remaining chapters and the data sets used in the book. We close this chapter by addressing topics that will not be further discussed in this book.
Pearson (1904) is often credited as one of the earliest researchers to apply ideas of meta-analysis (e.g., Chalmers et al., 2002; Cooper and Hedges, 2009; National Research Council, 1992; O'Rourke, 2007). He tried to determine the relationship between mortality and inoculation with a vaccine for enteric fever by averaging correlation coefficients across 11 small-sample studies. The idea of combining and pooling studies has been widely used in the physical and social sciences. There are many success stories, as documented in, for example, National Research Council (1992) and Hunt (1997). The term meta-analysis was coined by Gene Glass in educational psychology to represent “the statistical analysis of a large collection of analysis results from individual studies for the purpose of integrating the findings” (Glass, 1976, p. 3).
Validity generalization, another technique with similar objectives, was independently developed by Schmidt and Hunter (1977) in industrial and organizational psychology in nearly the same period. Later, Hedges and Olkin (1985) wrote a classic text that provides the statistical foundation of meta-analysis. These techniques have been expanded, refined, and adopted in many disciplines. Meta-analysis is now a popular statistical technique for synthesizing research findings in many disciplines, including the educational, social, and medical sciences.
A meta-analysis begins by conceptualizing the research questions. The research questions must be empirically testable based on the published studies. The published studies should be able to provide enough information to calculate the effect sizes, the ingredients for a meta-analysis. Detailed inclusion and exclusion criteria are developed to guide which studies are eligible to be included in the meta-analysis. After extracting the effect sizes and the study characteristics, the data can be subjected to a statistical analysis. The next step is to interpret the results and prepare reports to disseminate the findings.
This book mainly focuses on the statistical issues in a meta-analysis. Generally speaking, the statistical models discussed in this book can be classified along three dimensions:
fixed-effects versus random-effects models;
independent versus nonindependent effect sizes; and
models with or without structural models on the averaged effect sizes.
The first dimension is fixed-effects versus random-effects models. Fixed-effects models provide conditional inferences on the studies included in the meta-analysis, while random-effects models attempt to generalize the inferences beyond the studies used in the meta-analysis. Statistically speaking, the fixed-effects models, also known as the common effects models, are special cases of the random-effects models.
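As a minimal sketch of this first dimension (not the SEM-based implementation described in this book, which relies on the metaSEM package in R), the following Python code pools effect sizes with inverse-variance weights. The heterogeneity variance is estimated here with the DerSimonian-Laird method-of-moments estimator, which is an assumption made for simplicity; the function names, effect sizes, and sampling variances are invented for illustration.

```python
def pooled_effect(y, v, tau2=0.0):
    """Inverse-variance weighted mean of effect sizes y with sampling variances v.

    tau2 is the between-study (heterogeneity) variance; tau2 = 0 gives the
    fixed-effects estimate.
    """
    w = [1.0 / (vi + tau2) for vi in v]
    est = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se = (1.0 / sum(w)) ** 0.5
    return est, se


def dersimonian_laird_tau2(y, v):
    """Method-of-moments estimate of tau2, truncated at zero."""
    w = [1.0 / vi for vi in v]
    fixed, _ = pooled_effect(y, v)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (q - (len(y) - 1)) / c)


# Hypothetical data: five effect sizes and their sampling variances.
y = [0.10, 0.30, 0.35, 0.65, 0.45]
v = [0.010, 0.020, 0.015, 0.030, 0.010]

fixed_est, fixed_se = pooled_effect(y, v)
tau2 = dersimonian_laird_tau2(y, v)
random_est, random_se = pooled_effect(y, v, tau2)

print(f"fixed-effects:  {fixed_est:.3f} (SE {fixed_se:.3f})")
print(f"random-effects: {random_est:.3f} (SE {random_se:.3f}), tau2 = {tau2:.4f}")
```

Calling `pooled_effect` with `tau2=0.0` returns the fixed-effects result, which illustrates why the fixed-effects model can be viewed as a special case of the random-effects model. A positive `tau2` makes the weights more equal across studies and enlarges the standard error.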
The second dimension focuses on whether the effect sizes are independent or nonindependent. Most meta-analytic models, such as the univariate meta-analysis introduced in this book, assume independence of the effect sizes. When there is more than one effect size reported per study, the effect sizes are likely nonindependent. Both the multivariate and three-level meta-analyses are introduced to handle the nonindependent effect sizes depending on the assumptions of the data.
The last dimension is whether the research questions are related to the averaged effect sizes themselves or to some form of structural model on the averaged effect sizes. If researchers are only interested in the effect sizes, conventional univariate, multivariate, and three-level meta-analyses are sufficient. Sometimes, researchers are interested in testing proposed structures on the effect sizes. This type of research question can be addressed by testing the mediation and moderation models on the effect sizes (Section 5.6) or by meta-analytic structural equation modeling (MASEM; Chapter 7).
SEM is a flexible modeling technique to test proposed models. The proposed models can be specified as path diagrams, equations, or matrices. SEM integrates several statistical techniques into a single framework: path analysis in biology and sociology, factor analysis in psychology, and simultaneous equation and errors-in-variables models in economics (e.g., Matsueda, 2012). Jöreskog (1969, 1970, 1978) is usually credited as the one who first integrated these techniques into a single framework. He further proposed computationally feasible approaches to conduct the analysis. These algorithms were implemented in LISREL (Jöreskog and Sörbom, 1996), the first SEM package on the market. At nearly the same time, Bentler contributed greatly to the methodological development of SEM (e.g., Bentler, 1986, 1990; Bentler and Weeks, 1980). He also wrote a user-friendly program called EQS (Bentler, 2006) to conduct SEM. The availability of LISREL and EQS popularized applications of SEM in various fields. Both Jöreskog and Bentler received the Award for Distinguished Scientific Applications of Psychology (American Psychological Association, 2007a, b) “[f]or [their] development of models, statistical procedures, and a computer algorithm for structural equation modeling (SEM) that changed the way in which inferences are made from observational data; namely, SEM permits hypotheses derived from theory to be tested.”
Many recent methodological advances have been developed and integrated into Mplus, a popular and powerful SEM package (Muthén and Muthén, 2012). SEM is now widely used as a statistical model to test research hypotheses. Readers may refer to, for example, MacCallum and Austin (2000) and Bollen (2002) for some applications in the social sciences.
There are already many good books on the topic of meta-analysis (e.g., Borenstein et al., 2010; Card, 2012; Hedges and Olkin, 1985; Lipsey and Wilson, 2000; Schmidt and Hunter, 2015; Whitehead, 2002). Moreover, meta-analysis has also been covered as a special case of mixed-effects or multilevel models (e.g., Demidenko, 2013; Goldstein, 2011; Hox, 2010; Raudenbush and Bryk, 2002). It may therefore seem that there is no need to write another book on meta-analysis. Nor does this book aim to be a comprehensive introduction to SEM. Before explaining why it was written, let us first review the current state of applications of meta-analysis and SEM in academic research.
Figure 1.1 shows two panels on the numbers of publications using meta-analysis and SEM in Web of Science. The numbers were averaged over 5 years. For example, the number for 2010 was calculated by averaging the numbers from 2008 to 2012. Figure 1.1a depicts the actual numbers of publications, while Figure 1.1b converts the numbers to percentages by dividing them by the total numbers of publications. The trends in both panels are nearly identical in terms of actual numbers and percentages. One speculation why the numbers on meta-analysis are higher than those on SEM is that meta-analysis is very popular in medical research, whereas SEM is rarely used in medical research (cf. Song and Lee, 2012). In any case, it is clear that both techniques are becoming increasingly popular over time.
Figure 1.1 Publications using meta-analysis and structural equation modeling. (a) Actual number of publications per year and (b) percentage of publications.
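The smoothing applied to Figure 1.1 is a centered 5-year moving average. The sketch below shows the arithmetic with invented yearly counts, not the actual Web of Science numbers; the function name is hypothetical.

```python
def centered_moving_average(counts, year, window=5):
    """Average counts over `window` consecutive years centered on `year`.

    counts maps a publication year to the number of publications.
    Raises KeyError if any year in the window is missing.
    """
    half = window // 2
    years = range(year - half, year + half + 1)
    return sum(counts[y] for y in years) / window


# Invented counts for illustration only (not Web of Science data).
counts = {2008: 100, 2009: 120, 2010: 150, 2011: 170, 2012: 210}

# A 5-year window centered on 2010 spans two years on either side.
print(centered_moving_average(counts, 2010))  # 150.0
```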
Although both SEM and meta-analysis are very popular in the educational, social, behavioral, and medical sciences, they are treated as two unrelated techniques in the literature. They have their own assumptions, models, terminologies, software packages, communities, and even journals (Structural Equation Modeling: A Multidisciplinary Journal and Research Synthesis Methods). These two techniques are also considered as separate topics in doctoral training in psychology (Aiken et al., 2008). Users of SEM are mainly interested in primary research, while users of meta-analysis only conduct research synthesis on the literature. Researchers working in one area rarely refer to the work in the other area. Users of SEM seldom have the motivation to learn meta-analysis and vice versa. Advances in one area have basically no impact on the other area.
There have been some attempts to bring these two techniques together. One such topic is known as MASEM (e.g., Cheung and Chan, 2005b; Viswesvaran and Ones, 1995). There are two stages involved in an MASEM. Meta-analysis is usually used to pool correlation matrices together in the stage 1 analysis. The pooled correlation matrix is used to fit structural equation models in the stage 2 analysis. Researchers usually apply ad hoc procedures to fit these structural equation models, and some of these procedures are not statistically defensible from an SEM perspective. Therefore, one of the goals of this book (Chapter 7) was to provide a statistically defensible approach to conduct MASEM.
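To make the two-stage logic concrete, here is a deliberately naive sketch in Python. It is an ad hoc procedure of the kind criticized above, not the approach developed in Chapter 7. Stage 1 takes a sample-size weighted average of the correlation matrices, and stage 2 computes standardized regression coefficients from the pooled matrix as a stand-in for fitting a simple path model. The function names, correlation matrices, and sample sizes are all invented for illustration.

```python
def pool_correlations(matrices, sample_sizes):
    """Stage 1 (naive): sample-size weighted average of correlation matrices."""
    total_n = sum(sample_sizes)
    p = len(matrices[0])
    return [
        [sum(n * m[i][j] for m, n in zip(matrices, sample_sizes)) / total_n
         for j in range(p)]
        for i in range(p)
    ]


def standardized_paths(r):
    """Stage 2 (naive): regress variable 2 on variables 0 and 1.

    Solves the 2 x 2 normal equations for the standardized coefficients.
    """
    r01, r02, r12 = r[0][1], r[0][2], r[1][2]
    det = 1.0 - r01 ** 2
    b0 = (r02 - r01 * r12) / det
    b1 = (r12 - r01 * r02) / det
    return b0, b1


# Invented correlation matrices from three hypothetical studies.
# Variable order: predictor x1, predictor x2, outcome y.
studies = [
    [[1.0, 0.30, 0.40], [0.30, 1.0, 0.50], [0.40, 0.50, 1.0]],
    [[1.0, 0.20, 0.35], [0.20, 1.0, 0.45], [0.35, 0.45, 1.0]],
    [[1.0, 0.25, 0.30], [0.25, 1.0, 0.55], [0.30, 0.55, 1.0]],
]
sample_sizes = [100, 200, 150]

pooled = pool_correlations(studies, sample_sizes)
b_x1, b_x2 = standardized_paths(pooled)
print("pooled r(x1, x2) =", round(pooled[0][1], 3))
print("path x1 -> y =", round(b_x1, 3), " path x2 -> y =", round(b_x2, 3))
```

This sketch treats the pooled matrix as if it had been observed in a single sample and carries no sampling uncertainty from stage 1 into stage 2, so it yields point estimates only.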
Another reason for writing this book was to integrate meta-analysis into the general SEM framework. This helps to advance the methodological development in both areas. There are many such examples in the literature. Consider the classic example of analysis of variance (ANOVA) and multiple regression. Before the seminal work of Cohen (1968) and Cohen and Cohen (1975), “[t]he textbooks in ‘psychological’ statistics treat [multiple regression, ANOVA, and ANCOVA] quite separately, with wholly different algorithms, nomenclature, output, and examples” (Cohen 1968, p. 426). Understanding the mathematical equivalence between an ANOVA (and analysis of covariance (ANCOVA)) and a multiple regression helps us to comprehend the details behind the general linear model that plays an important role in modern statistics.
SEM is another success story in the literature. The general linear model, path analysis, and confirmatory factor analysis (CFA) are some well-known special cases of SEM. It has been shown that many models used in the social and behavioral sciences are indeed special cases of SEM. For example, many item response theory (IRT) models can be analyzed as structural equation models with binary or categorical variables as indicators (e.g., Takane and Deleeuw, 1987). The main advantage of analyzing IRT models as structural equation models is that many of the SEM techniques can be directly applied to address research questions that are challenging in the traditional IRT framework. For example, researchers may test IRT models with multiple traits (multiple factor models in SEM), with covariates as predictors (multiple indicators multiple causes in SEM), with missing data (full information maximum likelihood (FIML) estimation in SEM), and with nested structures (multilevel SEM) (Muthén and Asparouhov, 2013).
Another recent example is the recognition of multilevel models as structural equation models (Bauer, 2003; Curran, 2003; Mehta and Neale, 2005; Mehta and West, 2000; Rovine and Molenaar, 2000). Understanding the similarities between multilevel models and structural equation models helps to develop the multilevel SEM (e.g., Mehta and Neale, 2005; Muthén, 1994; Preacher et al., 2010). At least two methodological advances have come from integrating multilevel models and SEM. First, graphical models, which are popular in SEM, have been developed to represent multilevel models (Curran and Bauer, 2007). Second, various goodness-of-fit indices in SEM have been exported to multilevel models (Wu et al., 2009). Readers may refer to, for example, Bollen et al. (2010), Matsueda (2012), and Kaplan (2009) for the recent methodological advances in SEM.
The current SEM framework goes far beyond the original SEM developed by Jöreskog and Bentler. The modern SEM framework integrates techniques and models from several disciplines. For example, Mplus (Muthén and Muthén, 2012) combines traditional SEM, multilevel models, complex survey analysis, mixture modeling, survival analysis, latent class models, some IRT models, and even Bayesian inference into a single statistical modeling framework. Another general framework is the generalized linear latent and mixed models (GLLAMM) (Skrondal and Rabe-Hesketh, 2004), which integrates SEM, generalized linear models, multilevel models, latent class models, and IRT models.
This book provides the foundation of integrating meta-analysis into the SEM framework. Latent variables in a structural equation model are used to represent the true effect sizes in a meta-analysis. Meta-analytic models can then be analyzed as structural equation models. This approach is termed SEM-based meta-analysis in this book. Many state-of-the-art techniques in SEM are available to researchers doing meta-analysis by using the SEM-based meta-analysis.
There are several advantages of integrating meta-analysis into the SEM framework. For the SEM users, the SEM-based meta-analysis extends their statistical tools to conduct research with meta-analysis. Suppose that their primary research interests are in studying the training effectiveness with SEM; the SEM-based meta-analysis allows them to conduct a meta-analysis on the same topic without leaving the SEM framework. Many of the terminologies in meta-analysis can be translated into the terminologies in SEM. Software developers may explore the possibilities of developing an integrated SEM framework for researchers doing primary analysis and meta-analysis. For example, Mplus can be used to implement many of the SEM-based meta-analyses introduced in this book (see Chapter 9).
For the meta-analysis users, the SEM-based meta-analysis provides some new research tools to address research questions in meta-analysis. For example, users may apply the SEM-based meta-analysis to conduct univariate, multivariate, and three-level meta-analyses that handle missing values in moderators in the same SEM framework. Future studies may explore how techniques, such as robust statistics, bootstrap, and mixture models available in SEM, can be applied to meta-analysis.
In terms of graduate training in statistics, a single coherent framework can be introduced to students. This framework includes the general linear model, SEM, and meta-analysis. It helps students to appreciate the similarities and differences among the techniques under the same SEM framework. Graduate students may be better prepared to conduct both primary research and meta-analysis after their graduation.
Chapter 2 gives a brief overview of the key topics in SEM. These topics were selected because they are relevant to the SEM-based meta-analysis. FIML estimation, definition variables, and phantom variables play a crucial role in the SEM-based meta-analysis. Chapter 3 provides a summary of how to calculate the effect sizes and their sampling variances and covariances for univariate and multivariate meta-analyses. We also introduce a general approach to derive the approximate sampling variances and covariances for any type of effect size using a delta method and SEM. Chapter 4 introduces univariate meta-analysis and how the meta-analytic models can be formulated as structural equation models. This chapter provides the foundation for understanding the SEM-based meta-analysis.
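The delta method mentioned for Chapter 3 can be illustrated with a small sketch. It covers only the first-order delta method for a single effect size, with a numerical derivative; it does not reproduce the SEM-based procedure of Chapter 3. The function name, the example values of r and n, and the large-sample variance formula used for r are assumptions of this sketch.

```python
import math


def delta_method_variance(g, x, var_x, h=1e-6):
    """First-order delta method: Var[g(X)] ~ g'(x)^2 * Var[X].

    The derivative g'(x) is approximated with a central difference.
    """
    derivative = (g(x + h) - g(x - h)) / (2.0 * h)
    return derivative ** 2 * var_x


# Hypothetical study: correlation r from a sample of size n.
r, n = 0.5, 100

# Large-sample approximation to the sampling variance of r
# (an assumption of this sketch).
var_r = (1.0 - r ** 2) ** 2 / n

# Sampling variance of Fisher's z = atanh(r) via the delta method.
var_z = delta_method_variance(math.atanh, r, var_r)
print(var_z)  # approximately 1 / n = 0.01
```

Because the derivative of atanh(r) is 1 / (1 - r^2), the factors involving r cancel and the approximate variance of z no longer depends on r. The familiar 1 / (n - 3) is a refinement of this first-order result.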
In Chapter 5, we extend the univariate meta-analysis to multivariate meta-analysis. We discuss the advantages of multivariate meta-analysis to the univariate meta-analysis. At the end of this chapter, we apply the multivariate meta-analysis to test mediation and moderation models on the effect sizes. Chapter 6 discusses issues of dependent effect sizes and several common strategies to handle the dependence when the degree of dependence is unknown. A three-level meta-analysis is proposed to handle the effect sizes nested within clusters. The relationship between a multivariate and a three-level meta-analyses is also discussed. Chapter 7 focuses on the MASEM. Several common methods for conducting MASEM are reviewed. The fixed- and random-effects two-stage structural equation modeling (TSSEM) approach are proposed and discussed in details. Issues related to the MASEM are discussed.
Chapter 8 addresses two advanced topics in the SEM-based meta-analysis. The first topic is the pros and cons of the restricted (or residual) maximum likelihood (REML) estimation and how it can be implemented in the SEM framework. The second topic is how to handle missing values in the moderators in a mixed-effects meta-analysis. Several common strategies for handling missing data are reviewed. Advantages and implementation of FIML to handle missing data are discussed. Chapter 9 gives an overview on how to implement the SEM-based meta-analysis in Mplus, a popular SEM software. Most of the SEM-based meta-analysis except the TSSEM approach can be conducted in Mplus by using a transformed variables approach. Appendix A gives a very brief introduction to the R statistical environment, the OpenMx, and the metaSEM packages.
Computer examples were provided to illustrate the techniques introduced in this book. The R statistical environment was mainly used as the platform of data analysis except Chapter 9 that used Mplus as the statistical program. Several real data sets were used in the illustrations. All data sets are available in the metaSEM package. Table 1.1 summarizes these data sets. More details of the data sets will be given in the later chapters.
Table 1.1 Datasets used in this book

| Dataset | Type of meta-analysis | Key references | Topic |
| --- | --- | --- | --- |
| Mak09 | Univariate | Cheung et al. (2012) and Mak et al. (2009) | (Log) odds ratio of atrial fibrillation between bisphosphonate and non-bisphosphonate users |
| Jaramillo05 | Univariate | Jaramillo et al. (2005) | Correlation between organizational commitment and salesperson job performance |
| BCG | Multivariate | Colditz et al. (1994) and van Houwelingen et al. (2002) | (Log) odds ratio of BCG vaccine for preventing tuberculosis |
| wvs94a | Multivariate, mediation, and moderation | Cheung (2013) and World Values Study Group (1994) | Standardized mean differences between males and females on life satisfaction and life control |
| Bornmann07 | Three-level | Bornmann et al. (2007), Cheung (2014b), and Marsh et al. (2009) | (Log) odds ratio of gender differences in peer reviews of grant proposals |
| Digman97 | MASEM | Digman (1997), Cheung (2014a), and Cheung and Chan (2005a) | A higher-order confirmatory factor analytic model for the Big Five model |
| Becker94 | MASEM | Becker and Schram (1994) and Cheung (2014a) | A regression model on SAT (Math) by using SAT (Verbal) and spatial ability as predictors |
| Hunter83 | MASEM | Hunter (1983) | A path model for cognitive ability to supervisor ratings |
This book mainly covers the statistical models in the meta-analysis from an SEM approach. The SEM-based meta-analysis provides an alternative framework to conduct meta-analysis. It is useful to mention topics that will not be covered in this book. Conceptual issues, such as conceptualization, literature review, and coding study characteristics for moderator analysis in a meta-analysis, will not be covered. Readers may refer to, for example, Card (2012) and Cooper (2010) for details. Moreover, topics such as publication bias (Rothstein et al., 2005), graphical methods to display data (Anzures-Cabrera and Higgins, 2010), individual participant data (Whitehead, 2002), network meta-analysis (see Salanti and Schmid, 2012, for a special issue), correction for statistical artifacts (Schmidt and Hunter, 2015), and Bayesian meta-analysis (Whitehead, 2002) will not be covered in this book. These techniques have not been well explored in the SEM-based meta-analysis yet. Future research may investigate how these topics can be integrated into the SEM framework. Some matrix calculations are used in this book. Readers who are less familiar with them may refer to Fox (2009) or the online appendix of his book (Fox, 2008).
Aiken LS, West SG and Millsap RE 2008. Doctoral training in statistics, measurement, and methodology in psychology: replication and extension of Aiken, West, Sechrest, and Reno's (1990) survey of PhD programs in North America. American Psychologist 63(1), 32–50.
Anzures-Cabrera J and Higgins JPT 2010. Graphical displays for meta-analysis: an overview with suggestions for practice. Research Synthesis Methods 1(1), 66–80.
Bauer DJ 2003. Estimating multilevel linear models as structural equation models. Journal of Educational and Behavioral Statistics 28(2), 135–167.
Becker BJ and Schram CM 1994. Examining explanatory models through research synthesis. In The handbook of research synthesis (ed. Cooper H and Hedges LV). Russell Sage Foundation, New York, pp. 357–381.
Bentler PM 1986. Structural modeling and Psychometrika: an historical perspective on growth and achievements. Psychometrika 51(1), 35–51.
Bentler PM 1990. Comparative fit indexes in structural models. Psychological Bulletin 107(2), 238–246.
Bentler PM 2006. EQS 6 structural equations program manual. Multivariate Software, Encino, CA.
Bentler PM and Weeks D 1980. Linear structural equations with latent variables. Psychometrika 45(3), 289–308.
Bollen K 2002. Latent variables in psychology and the social sciences. Annual Review of Psychology 53, 605–634.
Bollen KA, Bauer DJ, Christ SL and Edwards MC 2010. Overview of structural equation models and recent extensions. In Statistics in the social sciences: current methodological developments (ed. Kolenikov S, Steinley D and Thombs L). John Wiley & Sons, Inc., Hoboken, NJ, pp. 37–79.
Borenstein M, Hedges LV, Higgins JP and Rothstein HR 2010. A basic introduction to fixed-effect and random-effects models for meta-analysis. Research Synthesis Methods 1(2), 97–111.
Bornmann L, Mutz R and Daniel HD 2007. Gender differences in grant peer review: a meta-analysis. Journal of Informetrics 1(3), 226–238.
Card NA 2012. Applied meta-analysis for social science research. The Guilford Press, New York.
Chalmers I, Hedges LV and Cooper H 2002. A brief history of research synthesis. Evaluation & the Health Professions 25(1), 12–37.
Cheung MWL 2013. Multivariate meta-analysis as structural equation models. Structural Equation Modeling: A Multidisciplinary Journal 20(3), 429–454.
Cheung MWL 2014a. Fixed- and random-effects meta-analytic structural equation modeling: examples and analyses in R. Behavior Research Methods 46(1), 29–40.
Cheung MWL 2014b. Modeling dependent effect sizes with three-level meta-analyses: a structural equation modeling approach. Psychological Methods 19(2), 211–229.
Cheung MWL and Chan W 2005a. Classifying correlation matrices into relatively homogeneous subgroups: a cluster analytic approach. Educational and Psychological Measurement 65(6), 954–979.
Cheung MWL and Chan W 2005b. Meta-analytic structural equation modeling: a two-stage approach. Psychological Methods 10(1), 40–64.
Cheung MWL, Ho RCM, Lim Y and Mak A 2012. Conducting a meta-analysis: basics and good practices. International Journal of Rheumatic Diseases 15(2), 129–135.
Cohen J 1968. Multiple regression as a general data-analytic system. Psychological Bulletin 70(6), 426–443.
Cohen J and Cohen P 1975. Applied multiple regression/correlation analysis for the behavioral sciences. Erlbaum, Hillsdale, NJ.
Colditz GA, Brewer TF, Berkey CS, Wilson ME, Burdick E, Fineberg HV and Mosteller F 1994. Efficacy of BCG vaccine in the prevention of tuberculosis. Meta-analysis of the published literature. JAMA: The Journal of the American Medical Association 271(9), 698–702.
Cooper HM 2010. Research synthesis and meta-analysis: a step-by-step approach, 4th edn. Sage Publications, Inc., Los Angeles, CA.
Cooper H and Hedges LV 2009. Research synthesis as a scientific process. In The handbook of research synthesis and meta-analysis (ed. Cooper H, Hedges LV and Valentine JC), 2nd edn. Russell Sage Foundation, New York, pp. 3–16.
Curran PJ 2003. Have multilevel models been structural equation models all along? Multivariate Behavioral Research 38(4), 529–569.
Curran PJ and Bauer DJ 2007. Building path diagrams for multilevel models. Psychological Methods 12(3), 283–297.
Demidenko E 2013. Mixed models: theory and applications with R, 2nd edn. Wiley-Interscience, Hoboken, NJ.
Digman JM 1997. Higher-order factors of the Big Five. Journal of Personality and Social Psychology 73(6), 1246–1256.
Fox J 2008. Applied regression analysis and generalized linear models, 2nd edn. Sage Publications, Inc., Thousand Oaks, CA.
Fox J 2009. A mathematical primer for social statistics. Sage Publications, Inc., Los Angeles, CA.
Glass GV 1976. Primary, secondary, and meta-analysis of research. Educational Researcher 5(10), 3–8.
Goldstein H 2011. Multilevel statistical models, 4th edn. John Wiley & Sons, Inc., Hoboken, NJ.
Hedges LV and Olkin I 1985. Statistical methods for meta-analysis. Academic Press, Orlando, FL.
Hox JJ 2010. Multilevel analysis: techniques and applications, 2nd edn. Routledge, New York.
Hunt M 1997. How science takes stock: the story of meta-analysis. Russell Sage Foundation, New York.
Hunter JE 1983. A causal analysis of cognitive ability, job knowledge, job performance, and supervisor ratings. In Performance measurement and theory (ed. Landy F, Zedeck S and Cleveland J). Erlbaum, Hillsdale, NJ, pp. 257–266.
Jaramillo F, Mulki JP and Marshall GW 2005. A meta-analysis of the relationship between organizational commitment and salesperson job performance: 25 years of research. Journal of Business Research 58(6), 705–714.
Jöreskog KG 1969. A general approach to confirmatory maximum likelihood factor analysis. Psychometrika 34(2), 183–202.
Jöreskog KG 1970. A general method for analysis of covariance structures. Biometrika 57(2), 239–251.
Jöreskog KG 1978. Structural analysis of covariance and correlation matrices. Psychometrika 43(4), 443–477.
Jöreskog KG and Sörbom D 1996. LISREL 8: a user's reference guide. Scientific Software International, Inc., Chicago, IL.
Kaplan D 2009. Structural equation modeling: foundations and extensions, 2nd edn. SAGE Publications, Inc., Thousand Oaks, CA.
Lipsey MW and Wilson D 2000. Practical meta-analysis. Sage Publications, Inc., Thousand Oaks, CA.
MacCallum RC and Austin JT 2000. Applications of structural equation modeling in psychological research. Annual Review of Psychology 51(1), 201–226.
Mak A, Cheung MWL, Ho RCM, Cheak AAC and Lau CS 2009. Bisphosphonates and atrial fibrillation: Bayesian meta-analyses of randomized controlled trials and observational studies. BMC Musculoskeletal Disorders 10(1), 1–12.
Marsh HW, Bornmann L, Mutz R, Daniel HD and O'Mara A 2009. Gender effects in the peer reviews of grant proposals: a comprehensive meta-analysis comparing traditional and multilevel approaches. Review of Educational Research 79(3), 1290–1326.
Matsueda RL 2012. Key advances in the history of structural equation modeling. In Handbook of structural equation modeling (ed. Hoyle RH). Guilford Press, New York, pp. 3–42.
Mehta PD and Neale MC 2005. People are variables too: multilevel structural equations modeling. Psychological Methods 10(3), 259–284.
Mehta P and West S 2000. Putting the individual back into individual growth curves. Psychological Methods 5(1), 23–43.
Muthén BO 1994. Multilevel covariance structure analysis. Sociological Methods & Research 22(3), 376–398.
Muthén BO and Asparouhov T 2013. Item response modeling in Mplus: a multi-dimensional, multi-level, and multi-timepoint example. In Handbook of item response theory: models, statistical tools, and applications (ed. Van der Linden WJ and Hambleton RK). Chapman & Hall/CRC Press, Boca Raton, FL, forthcoming.
Muthén BO and Muthén LK 2012. Mplus user's guide, 7th edn. Muthén & Muthén, Los Angeles, CA.
National Research Council 1992. Combining information: statistical issues and opportunities for research. National Academies Press, Washington, DC.
No authorship indicated 2007a. Karl G. Jöreskog: award for distinguished scientific applications of psychology. American Psychologist 62(8), 768–769.
No authorship indicated 2007b. Peter M. Bentler: award for distinguished scientific applications of psychology. American Psychologist 62(8), 769–782.
O'Rourke K 2007. An historical perspective on meta-analysis: dealing quantitatively with varying study results. Journal of the Royal Society of Medicine 100(12), 579–582.
Pearson K 1904. Report on certain enteric fever inoculation statistics. British Medical Journal 2(2288), 1243–1246.
Preacher K, Zyphur M and Zhang Z 2010. A general multilevel SEM framework for assessing multilevel mediation. Psychological Methods 15(3), 209–233.
Raudenbush SW and Bryk AS 2002. Hierarchical linear models: applications and data analysis methods. Sage Publications, Inc., Thousand Oaks, CA.
Rothstein HR, Sutton AJ and Borenstein M 2005. Publication bias in meta-analysis: prevention, assessment and adjustments. John Wiley & Sons, Ltd, Chichester.
Rovine MJ and Molenaar PCM 2000. A structural modeling approach to a multilevel random coefficients model. Multivariate Behavioral Research 35(1), 51–88.
Salanti G and Schmid CH 2012. Research synthesis methods special issue on network meta-analysis: introduction from the editors. Research Synthesis Methods 3(2), 69–70.
Schmidt FL and Hunter JE 1977. Development of a general solution to the problem of validity generalization. Journal of Applied Psychology 62(5), 529–540.
Schmidt FL and Hunter JE 2015. Methods of meta-analysis: correcting error and bias in research findings, 3rd edn. Sage Publications, Inc., Thousand Oaks, CA.
Skrondal A and Rabe-Hesketh S 2004. Generalized latent variable modeling: multilevel, longitudinal, and structural equation models. Chapman & Hall/CRC, Boca Raton, FL.
Song XY and Lee SY 2012. Basic and advanced Bayesian structural equation modeling: with applications in the medical and behavioral sciences. John Wiley & Sons, Inc., Hoboken, NJ.
Takane Y and de Leeuw J 1987. On the relationship between item response theory and factor-analysis of discretized variables. Psychometrika 52(3), 393–408.
van Houwelingen HC, Arends LR and Stijnen T 2002. Advanced methods in meta-analysis: multivariate approach and meta-regression. Statistics in Medicine 21(4), 589–624.
Viswesvaran C and Ones DS 1995. Theory testing: combining psychometric meta-analysis and structural equations modeling. Personnel Psychology 48(4), 865–885.
Whitehead A 2002. Meta-analysis of controlled clinical trials. John Wiley & Sons, Ltd, Chichester.
World Values Study Group 1994. World Values Survey, 1981–1984 and 1990–1993 [Computer file]. Inter-University Consortium for Political and Social Research, Ann Arbor, MI.
Wu W, West SG and Taylor AB 2009. Evaluating model fit for growth curve models: integration of fit indices from SEM and MLM frameworks. Psychological Methods 14(3), 183–201.
This chapter reviews selected topics in structural equation modeling (SEM) that are relevant to the SEM-based meta-analysis. It provides a quick introduction to SEM for those who are less familiar with the techniques. This chapter begins by introducing three different model specifications—path diagrams, equations, and matrix specification. It then introduces popular structural equation models such as path analysis, confirmatory factor analytic (CFA) models, SEMs, latent growth models, and multiple-group analysis. How to obtain parameter estimates, standard errors (SEs), confidence intervals (CIs), test statistics, and various goodness-of-fit indices are introduced. Finally, we introduce phantom variables, definition variables, and full information maximum likelihood (FIML). These concepts are the keys to formulating meta-analytic models as structural equation models.
SEM, also known as covariance structure analysis and correlation structure analysis, is a generic term for many related statistical techniques. Many popular multivariate techniques, such as correlation analysis, regression analysis, analysis of variance (ANOVA), multivariate analysis of variance (MANOVA), factor analysis, and item response theory, can be considered as special cases of SEM. Generally speaking, SEM is a statistical technique to model the first and the second moments of the data when the data are multivariate normal. The first moment represents the mean structure, while the second moment represents the covariance matrix of the variables. If we are only interested in the covariance matrix among the variables, we may skip the mean structure.
SEM is widely used in psychology and the social sciences to test hypotheses involving observed and latent variables (e.g., Bentler, 1986; Bollen, 2002; MacCallum and Austin, 2000). Latent variables are hypothetical constructs that cannot be observed directly. They have to be represented by the observed variables known as indicators. By using the indicators to measure the latent variables, the amount of measurement errors can be quantified and taken into account when estimating the relationship among the latent variables.
There are several steps involved in fitting a structural equation model (see, e.g., Kline, 2011). A proposed model is specified based on the hypothesized relationship among the observed and latent variables. The proposed model is then fitted to the data. When the optimization converges, parameter estimates, their SEs, and test statistics are available for inspection. Users may then determine whether the proposed model fits the data well. If the proposed model does not fit the data, we may modify the model to see if the model fit can be improved. Finally, interpretations of the overall model and the individual parameter estimates can be made.
There are three equivalent approaches to specify a structural equation model. They are path diagrams, equations, and matrix specification (e.g., Mulaik, 2009). Let us illustrate these approaches by using a model on simple regression.
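The first approach is to specify the models by equations. The model for the simple regression is

$$ y = \beta_0 + \beta_1 x + e_y, $$

where $x$, $y$, $e_y$, $\beta_0$, and $\beta_1$ are the independent variable, the dependent variable, the residual, the intercept, and the regression coefficient, respectively. As equations only allow us to specify the effects from one variable to another, we need to specify further constraints on the models. For example, we may need to indicate that $x$ and $e_y$ are uncorrelated, that is, $\mathrm{Cov}(x, e_y) = 0$. On the basis of the above model, we derive the expected means and the expected covariance matrix for the variables (ordered as $y$ and $x$):

$$ \boldsymbol{\mu}(\boldsymbol{\theta}) = \begin{bmatrix} \beta_0 + \beta_1 \mu_x \\ \mu_x \end{bmatrix} \quad \text{and} \quad \boldsymbol{\Sigma}(\boldsymbol{\theta}) = \begin{bmatrix} \beta_1^2 \sigma_x^2 + \sigma_{e_y}^2 & \beta_1 \sigma_x^2 \\ \beta_1 \sigma_x^2 & \sigma_x^2 \end{bmatrix}, $$

where $\mu_x$, $\sigma_x^2$, and $\sigma_{e_y}^2$ are the population mean of $x$, the population variance of $x$, and the population variance of $e_y$, respectively. The expected means and the expected covariance matrix are compared against the observed means and covariance matrix in order to obtain parameter estimates.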
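As a small numerical illustration of the expected moments of the simple regression, the following sketch computes the model-implied means and covariance matrix from a set of parameter values and then recovers the parameters from those moments. It is written in plain Python rather than the R code used elsewhere in this book, and the function names and the numerical values are hypothetical choices made only for this illustration. Recovering the parameters by matching moments works here because the simple regression is just-identified (five parameters and five distinct moments); it is not meant as a general estimation routine.

```python
def implied_moments(beta0, beta1, mu_x, var_x, var_e):
    """Model-implied means and covariance matrix, ordered as (y, x),
    for y = beta0 + beta1 * x + e_y with Cov(x, e_y) = 0."""
    mean = [beta0 + beta1 * mu_x, mu_x]
    cov_yx = beta1 * var_x
    cov = [[beta1 ** 2 * var_x + var_e, cov_yx],
           [cov_yx, var_x]]
    return mean, cov


def parameters_from_moments(mean, cov):
    """Solve the moment equations for the five parameters.
    This is possible only because the model is just-identified."""
    mu_x = mean[1]
    var_x = cov[1][1]
    beta1 = cov[0][1] / var_x
    beta0 = mean[0] - beta1 * mu_x
    var_e = cov[0][0] - beta1 ** 2 * var_x
    return {"beta0": beta0, "beta1": beta1,
            "mu_x": mu_x, "var_x": var_x, "var_e": var_e}


# Hypothetical parameter values chosen for illustration only.
mean, cov = implied_moments(beta0=1.0, beta1=0.5, mu_x=2.0, var_x=4.0, var_e=1.0)
print(mean)  # [2.0, 2.0]
print(cov)   # [[2.0, 2.0], [2.0, 4.0]]

recovered = parameters_from_moments(mean, cov)
print(recovered["beta0"], recovered["beta1"], recovered["var_e"])  # 1.0 0.5 1.0
```

In practice, the observed means and covariance matrix take the place of `mean` and `cov`, and models with fewer parameters than moments are estimated by minimizing a discrepancy function rather than by solving the equations exactly.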
One of the reasons for the popularity of SEM is its ability to specify models graphically. Path diagrams can be used to represent the mathematical models. Besides the conventional models, such as path analysis, CFA, and SEM, path diagrams have been extended to represent multilevel models (Curran and Bauer, 2007; Muthén and Muthén, 2012; Skrondal and Rabe-Hesketh, 2004) and meta-analysis (Cheung, 2008, 2013, 2014). Graphical models are convenient devices to represent the mathematical models.
There are slight variations on how graphical models are presented in SEM (Arbuckle, 2006; Bentler, 2006; Jöreskog and Sörbom, 1996; Muthén and Muthén, 2012; Neale et al., 2006). For example, some authors prefer to explicitly draw the means and the latent variables of the measurement errors while others do not. We follow the conventions in the OpenMx package (see Boker and McArdle, 2005) in this book.
Rectangles (or squares) and ellipses (or circles) are used to represent the observed and latent variables, respectively. Triangles represent a vector of constant one that is used to represent the intercepts. Single and double arrows represent prediction and covariance among the variables. Strictly speaking, a double arrow (variance) on the triangle is required to fulfill the tracing rules to calculate the model-implied means and covariance matrix (Boker and McArdle, 2005). Conventionally, this double arrow is not shown to simplify the figures.
Figure 2.1 shows two graphical model representations of the simple regression. The model in Figure 2.1a explicitly includes the error $e_y$ and its variance $\sigma_{e_y}^2$. The main advantage of this representation is that it includes both latent and observed variables in the figure. Readers may easily map the figure to the equations and the matrix representation. The main disadvantage is that the latent variables of the residuals are required in the figure. Suppose that we are fitting a CFA model with 20 observed variables and 4 latent variables; we have to include 20 latent variables for the residuals. This may make the figure unnecessarily crowded.
Figure 2.1 Two graphical model representations of a simple regression.
Figure 2.1b shows an alternative representation for the same model. The main difference of this representation is that the latent variables for the residuals are not shown in the figure. The double arrows are drawn directly on the observed and the latent variables. When the double arrows are drawn on the independent variables, they represent the variances; when the double arrows are drawn on the dependent variables, they represent the error or residual variances. Even though the figure has been simplified, it essentially carries the same information as the figure with the explicit errors. The only drawback is that the residual $e_y$ is not explicitly shown in the figure.
Regardless of whether we specify the models in equations or path diagrams, most SEM packages convert these models to matrices for analysis. There are several matrix representations. The most traditional approach is the LISREL model (Jöreskog and Sörbom, 1996). The other popular model representations are the models used in EQS (Bentler, 2006), Mplus (Muthén and Muthén, 2012), and the reticular action model (RAM) (McArdle, 2005; McArdle and McDonald, 1984). Although the specifications look different, models specified in one formulation can be translated to the other formulations. In this book, we mainly use the RAM formulation. Some common structural equation models are introduced in this chapter.
Suppose that there are $p$ observed and $q$ latent variables in the model and $t = p + q$ is the total number of variables; the RAM formulation involves four matrices: $\mathbf{A}$, $\mathbf{S}$, $\mathbf{F}$, and $\mathbf{M}$. We may include the dimensions of the matrices for ease of reference. Let $\mathbf{v}$ be a $t \times 1$ vector that includes all variables in the model. The $\mathbf{A}$ matrix links the variables by

$$ \mathbf{v} = \mathbf{A}\mathbf{v} + \mathbf{u}, $$

where $\mathbf{A}$ ($t \times t$) denotes the asymmetric paths, such as the regression coefficients and the factor loadings, with $a_{ij}$ in $\mathbf{A}$ representing the regression coefficient from $v_j$ to $v_i$, and $\mathbf{u}$ ($t \times 1$) contains the parts of $\mathbf{v}$ that are not predicted by other variables. The main purpose of $\mathbf{A}$ is to specify the single arrows in path diagrams.
$\mathbf{S}$ ($t \times t$) is a symmetric matrix representing the variances and covariances of $\mathbf{u}$. It is used to specify the double arrows in path diagrams. The diagonal elements represent the variances of the variables. If the elements in $\mathbf{v}$ are independent variables, the corresponding diagonals in $\mathbf{S}$ denote the variances; otherwise, the corresponding diagonals in $\mathbf{S}$ represent the residual variances of the dependent variables. The off-diagonals in $\mathbf{S}$ represent the covariances of the variables. $\mathbf{M}$ ($t \times 1$) represents the means or intercepts of the variables. $\mathbf{F}$ ($p \times t$) is a selection matrix consisting of ones and zeros. It is used to select the observed variables.
Regarding the simple regression example, we stack the variables into a column vector $\mathbf{v} = [y \;\; x \;\; e_y]^{T}$, where $\mathbf{x}^{T}$ is the transpose of $\mathbf{x}$. Equation 2.4 shows the RAM formulation explicitly including $e_y$ in the model that is equivalent to the model in Figure 2.1a.
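To make the RAM matrices for the simple regression concrete, the sketch below fills in the four matrices with hypothetical parameter values and computes the model-implied moments. Several things here are assumptions for illustration: the variables are ordered as $y$, $x$, and $e_y$; the moments are obtained from the standard RAM expressions $\boldsymbol{\Sigma} = \mathbf{F}(\mathbf{I}-\mathbf{A})^{-1}\mathbf{S}[(\mathbf{I}-\mathbf{A})^{-1}]^{T}\mathbf{F}^{T}$ and $\boldsymbol{\mu} = \mathbf{F}(\mathbf{I}-\mathbf{A})^{-1}\mathbf{M}$, which have not yet been stated in the text; and the code is plain Python with hand-written helper functions rather than the R and OpenMx code used elsewhere in this book. The inverse is computed as a finite sum of powers of $\mathbf{A}$, which is valid only for recursive models (no feedback loops).

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def inverse_of_i_minus_a(a):
    """(I - A)^{-1} = I + A + A^2 + ... + A^(t-1).
    Valid only when A is nilpotent, i.e., the model is recursive."""
    n = len(a)
    total, power = identity(n), identity(n)
    for _ in range(n):
        power = matmul(power, a)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


# Hypothetical parameter values; variables ordered as v = [y, x, e_y].
beta0, beta1, mu_x, var_x, var_e = 1.0, 0.5, 2.0, 4.0, 1.0

A = [[0.0, beta1, 1.0],   # y is predicted by x (beta1) and by e_y (fixed at 1)
     [0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
S = [[0.0, 0.0, 0.0],     # y has no variance of its own; e_y carries the residual
     [0.0, var_x, 0.0],
     [0.0, 0.0, var_e]]
F = [[1.0, 0.0, 0.0],     # select the observed variables y and x
     [0.0, 1.0, 0.0]]
M = [[beta0], [mu_x], [0.0]]  # intercept of y, mean of x, mean of e_y

B = inverse_of_i_minus_a(A)
sigma = matmul(matmul(matmul(matmul(F, B), S), transpose(B)), transpose(F))
mu = matmul(matmul(F, B), M)

print(sigma)  # [[2.0, 2.0], [2.0, 4.0]]
print(mu)     # [[2.0], [2.0]]
```

With these values, the implied variance of $y$ equals $\beta_1^2\sigma_x^2 + \sigma_{e_y}^2 = 2$ and the implied mean of $y$ equals $\beta_0 + \beta_1\mu_x = 2$, which agrees with the expected moments derived from the regression equation.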
Equation 2.5 shows the RAM formulation with . The latent variable for the residual
