Survey Data Harmonization in the Social Sciences - E-Book


Description

An expansive and incisive overview of the practical uses of harmonization and its implications for data quality and costs

In Survey Data Harmonization in the Social Sciences, a team of distinguished social science researchers delivers a comprehensive collection of ex-ante and ex-post harmonization methodologies in the context of specific longitudinal and cross-national survey projects. The book examines how ex-ante and ex-post harmonization work individually and in relation to one another, offering practical guidance on harmonization decisions in the preparation of new data infrastructure for comparative research.

Contributions from experts in sociology, political science, demography, economics, health, and medicine are included, all of which give voice to discipline-specific and interdisciplinary views on methodological challenges inherent in harmonization. The authors offer perspectives from Europe and the United States, as well as Africa, the latter of which provides insights rarely featured in survey research methodology handbooks.

Readers will also find:

* A thorough introduction to approaches and concepts for survey data harmonization, as well as the effects of data harmonization on the overall survey research process
* Comprehensive explorations of ex-ante harmonization of survey instruments and non-survey data
* Practical discussions of ex-post harmonization of national social surveys, census and time use data, including explorations of survey data recycling
* A detailed overview of statistical issues linked to the use of harmonized survey data

Perfect for upper undergraduate and graduate researchers who specialize in survey methodology, Survey Data Harmonization in the Social Sciences will also earn a place in the libraries of survey practitioners who engage in international research.


Page count: 940

Year of publication: 2023



Table of Contents

Cover

Table of Contents

Title Page

Copyright

Preface and Acknowledgments

About the Editors

About the Contributors

1 Objectives and Challenges of Survey Data Harmonization

1.1 Introduction

1.2 What is the Harmonization of Survey Data?

1.3 Why Harmonize Social Survey Data?

1.4 Harmonizing Survey Data Across and Within Countries

1.5 Sources of Knowledge for Survey Data Harmonization

1.6 Challenges to Survey Harmonization

1.7 Survey Harmonization and Standardization Processes

1.8 Quality of the Input and the End‐product of Survey Harmonization

1.9 Relevance of Harmonization Methodology to the FAIR Data Principles

1.10 Ethical and Legal Issues

1.11 How to Read this Volume?

References

2 The Effects of Data Harmonization on the Survey Research Process

2.1 Introduction

2.2 Part 1: Harmonization: Origins and Relation to Standardization

2.3 Part 2: Stakeholders and Division of Labor

2.4 Part 3: New Data Types, New Challenges

2.5 Conclusion

References

Part I: Ex-ante harmonization of survey instruments and non-survey data

3 Harmonization in the World Values Survey

3.1 Introduction

3.3 Documentation and Quality Assurance

3.4 Challenges to Harmonization

3.5 Software Tools

3.6 Recommendations

References

4 Harmonization in the Afrobarometer

4.1 Introduction

4.2 Core Principles

4.3 Applied Harmonization Methods

4.4 Harmonization and Country Selection

4.5 Software Tools and Harmonization

4.6 Challenges to Harmonization

4.7 Recommendations

References

Notes

5 Harmonization in the National Longitudinal Surveys of Youth (NLSY)*

5.1 Introduction

5.2 Cross‐Cohort Design

5.3 Applied Harmonization

5.4 Challenges to Harmonization

5.5 Documentation and Quality Assurance

5.6 Software Tools

5.7 Recommendations and Some Concluding Thoughts

References

Notes

6 Harmonization in the Comparative Study of Electoral Systems (CSES) Projects

6.1 Introducing the CSES

6.2 Harmonization Principles and Technical Infrastructure

6.3 Ex‐ante Input Harmonization

6.4 Ex‐ante Output Harmonization

6.5 Exploring Interplay Between Ex‐ante and Ex‐post Harmonization

6.6 Taking Stock and New Frontiers in Harmonization

References

Notes

7 Harmonization in the East Asian Social Survey

7.1 Introduction

7.2 Characteristics of the EASS and its Harmonization Process

7.3 Documentation and Quality Assurance

7.4 Challenges to Harmonization

7.5 Software Tools

7.6 Recommendations

Acknowledgment

References

8 Ex‐ante Harmonization of Official Statistics in Africa (SHaSA)

8.1 Introduction

8.2 Applied Harmonization Methods

8.3 Quality Assurance Framework

8.4 Challenges to Statistical Harmonization in Africa

8.5 Common Software Tools Used

8.6 Conclusion and Recommendations

References

Notes

Part II: Ex-post harmonization of national social surveys

9 Harmonization for Cross‐National Secondary Analysis: Survey Data Recycling

9.1 Introduction

9.2 Harmonization Methods in the SDR Project

9.3 Documentation and Quality Assurance

9.4 Challenges to Harmonization

9.5 Software Tools of the SDR Project

9.6 Recommendations

Acknowledgments

References

9.A Data Quality Indicators in SDR2

Notes

10 Harmonization of Panel Surveys: The Cross‐National Equivalent File

10.1 Introduction

10.2 Applied Harmonization Methods

10.3 Current CNEF Partners

10.4 Planned CNEF Partners

10.5 Documentation and Quality Assurance

10.6 Challenges to Harmonization

10.7 Recommendations for Researchers Interested in Harmonizing Panel Survey Data

10.8 Conclusion

References

Notes

11 Harmonization of Survey Data from UK Longitudinal Studies: CLOSER

11.1 Introduction

11.2 Applied Harmonization Methods

11.3 Documentation and Quality Assurance

11.4 Challenges to Harmonization

11.5 Software Tools

11.6 Recommendations

Acknowledgments

References

Note

12 Harmonization of Census Data: IPUMS – International

12.1 Introduction

12.2 Project History

12.3 Applied Harmonization Methods

12.4 Documentation and Quality Assurance

12.5 Challenges to Harmonization

12.6 Software Tools

12.7 Team Organization and Project Management

12.8 Lessons and Recommendations

References

Notes

Part III: Domain-driven ex-post harmonization

13 Maelstrom Research Approaches to Retrospective Harmonization of Cohort Data for Epidemiological Research

13.1 Introduction

13.2 Applied Harmonization Methods

13.3 Documentation and Quality Assurance

13.4 Challenges to Harmonization

13.5 Software Tools

13.6 Recommendations

Acknowledgments

References

14 Harmonizing and Synthesizing Partnership Histories from Different German Survey Infrastructures

14.1 Introduction

14.2 Applied Harmonization Methods

14.3 Documentation and Quality Assurance

14.4 Challenges to Harmonization

14.5 Software Tools

14.6 Recommendations

Acknowledgments

References

Note

15 Harmonization and Quality Assurance of Income and Wealth Data: The Case of LIS

15.1 Introduction

15.2 Applied Harmonization Methods

15.3 Documentation and Quality Assurance

15.4 Challenges to Harmonization

15.5 Software Tools

15.6 Conclusion

References

16 Ex‐Post Harmonization of Time Use Data: Current Practices and Challenges in the Field

16.1 Introduction

16.2 Applied Harmonization Methods

16.3 Documentation and Quality Assurance

16.4 Challenges to Harmonization

16.5 Software Tools

16.6 Recommendations

References

Notes

Part IV: Further Issues: Dealing with Methodological Issues in Harmonized Survey Data

17 Assessing and Improving the Comparability of Latent Construct Measurements in Ex‐Post Harmonization

17.1 Introduction

17.2 Measurement and Reality

17.3 Construct Match

17.4 Reliability Differences

17.5 Units of Measurement

17.6 Cross‐Cultural Comparability

17.7 Discussion and Outlook

References

Note

18 Comparability and Measurement Invariance

18.1 Latent Variable Framework for Testing and Accounting for Measurement Non‐Invariance

18.2 Approaches to Empirical Assessment of Measurement Equivalence

18.3 Beyond Multiple Indicators

18.4 Conclusions

References

Notes

19 On the Creation, Documentation, and Sensible Use of Weights in the Context of Comparative Surveys*

19.1 Introduction

19.2 Design Weights

19.3 Post‐stratification Weights

19.4 Population Weights

19.5 Conclusion

References

Notes

20 On Using Harmonized Data in Statistical Analysis: Notes of Caution

20.1 Introduction

20.2 Challenges in the Combination of Data Sets

20.3 Challenges in the Analysis of Combined Data Sets

20.4 Recommendations

References

Note

21 On the Future of Survey Data Harmonization

21.1 What We Have Learned from Contributions on Survey Data Harmonization in this Volume

21.2 New Opportunities and Challenges

21.3 Developing a New Methodology of Harmonizing Non‐Survey Data

21.4 Globalization of Science and Harmonizing Scientific Practice

References

Index

End User License Agreement

List of Tables

Chapter 1

Table 1.1 Types of survey harmonization in projects included in this volume...

Table 1.2 Number of countries in the first and last waves of major internat...

Chapter 3

Table 3.1 Overview of WVS survey waves coverage.

Table 3.2 Target variable and source variables on education level in WVS‐7 ...

Table 3.3 Education variables in WVS (1990–2021).

Table 3.4 Fragment of the hierarchical coding scheme used for the target va...

Chapter 5

Table 5.1 Description of original cohorts.

Chapter 6

Table 6.1 Summary of the CSES Data Products as of April 2022.

Table 6.2 Harmonizing party codes across time in CSES IMD.

Chapter 7

Table 7.1 Summary of East Asian Social Survey and their Source Surveys.

Table 7.2 Original and translated answer responses for the subjective healt...

Chapter 8

Table 8.1 List and composition of the Specialized Technical Groups.

Table 8.2 ACS Principles grouped by data quality system components.

Chapter 9

Table 9.1 International survey projects selected for harmonization.

Table 9.2 SDR2 at a glance.

Table 9.3 Self‐reported voting behavior in the last elections harmonized to...

Table 9.4 Harmonization control for T_VOTED, SDR2.

Table 9.5 Effects of the indexes of survey quality deficiencies on mean val...

Table 9.6 Within‐project effects of length, direction, and polarity of the ...

Table 9.7 Rules of recoding a set of two questions about participating in d...

Table 9.8 Codes for missing values in SDR2.

Chapter 10

Table 10.1 Earnings measures in PSID and SOEP.

Table 10.2 Current and planned CNEF members, by Parent Data Source.

Chapter 12

Table 12.1 Data dictionary example (selected fields).

Table 12.2 Correspondence table: class of worker.

Table 12.3 Correspondence table: educational attainment.

Chapter 13

Table 13.1 Summary of the Maelstrom Research guideline steps for rigorous r...

Table 13.2 Overview of the CanPath and MINDMAP projects and an example of a...

Table 13.3 Definition of DataSchema variable “highest level of education” i...

Table 13.4 Example of types of processing algorithms applied to generate th...

Chapter 14

Table 14.1 Overview of the survey programs and sub‐studies used in the HaSp...

Table 14.2 Target code and coding instructions for the harmonization proces...

Chapter 18

Table 18.1 Four approaches to assessing measurement invariance (the relevan...

Chapter 20

Table 20.1 The impact of source, scale length, and context on average insti...

Table 20.2 Add in p time at the country grouping level.

Table 20.3 Socioeconomic indices before and after imputation.

Table 20.4 Impact of socioeconomic indicators on Average Trust in Instituti...

Table 20.5 Impact of population size and number of surveys conducted on ins...

List of Illustrations

Chapter 1

Figure 1.1 Simplified relationships between different types of data harmoniz...

Chapter 5

Figure 5.1 Screenshot of NLSY79 Employment Topical Guide section.

Figure 5.2 Screenshot of NLS Investigator search function.

Chapter 6

Figure 6.1 Example of Election Study Note concept in CSES Codebook.

Figure 6.2 Example of Study Overview of an Election Study detailed in a CSES...

Figure 6.3 CSES relational data structure for variables assigned alphabetica...

Figure 6.4 Example of how electoral alliances and their coding are detailed ...

Chapter 7

Figure 7.1 Response Distributions to the question “Do you agree or disagree ...

Figure 7.2 Comparison of the distributions of subjective health in EASS and ...

Chapter 8

Graph 8.1 Governance, Peace and Security Statistics in Cabo Verde (2013 and ...

Chapter 9

Figure 9.1 The SDR Harmonization Workflow.

Chapter 11

Figure 11.1 CLOSER's harmonization‐related work packages.

Figure 11.2 Illustration of cross‐generational differences in the prevalence...

Figure 11.3 Generalized process of retrospective harmonization among the CLO...

Chapter 13

Figure 13.1 Some of the information about harmonization processing documente...

Chapter 14

Figure 14.1 Cumulated marital dissolutions (in percent) by marital duration ...

Figure 14.2 MarkDoc document after parsing in Stata.

Figure 14.3 Screenshot from the HaSpaD‐Harmonization Wizard illustrating the...

Chapter 15

Figure 15.1 Ex post harmonization at LIS.

Figure 15.2 Conceptual framework for disposable household income.

Figure 15.3 Conceptual framework for net worth.

Figure 15.4 Aggregation of subcategories into overall categories.

Figure 15.5 Country‐specific variable and standardized variable for educatio...

Figure 15.6 The three main stages of quality assurance at LIS.

Figure 15.7 LIS to National Accounts ratios in percent – around 2016 or late...

Figure 15.8 (a) Example: availability of information person versus household...

Chapter 16

Figure 16.1 An example of a contemporary time use diary: the 2014–2015 Unite...

Figure 16.2 Main data formats in time use diaries.

Figure 16.3 Screenshot of IPUMS MTUS variable selection page.

Chapter 18

Figure 18.1 Graphical representation of MG‐CFA model.

Chapter 20

Figure 20.1 Trends in institutional trust from 1995 to 2017 by country group...

Figure 20.2 Trends in institutional trust from 1995 to 2017 by source of dat...


Survey Data Harmonization in the Social Sciences

 

Edited by

Irina Tomescu‐Dubrow, Institute of Philosophy and Sociology, Polish Academy of Sciences, Warsaw, Poland

Christof Wolf, GESIS Leibniz‐Institute for the Social Sciences, University Mannheim, Germany

Kazimierz M. Slomczynski, Institute of Philosophy and Sociology, Polish Academy of Sciences, Warsaw, Poland

J. Craig Jenkins, Department of Sociology, The Ohio State University, Ohio, USA

Copyright © 2024 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per‐copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750‐8400, fax (978) 750‐4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748‐6011, fax (201) 748‐6008, or online at http://www.wiley.com/go/permission.

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762‐2974, outside the United States at (317) 572‐3993 or fax (317) 572‐4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging‐in‐Publication Data Applied for:

Hardback ISBN: 9781119712176

Cover Design: Wiley
Cover Image: © Simon Zhu/Unsplash

Preface and Acknowledgments

This edited volume is about the broad spectrum of harmonization methods that scholars use to create survey data infrastructure for comparative social science research. Contributors from a variety of disciplines, including sociology, political science, demography, and economics, among others, discuss practical applications of harmonization in major social science projects, both at the data collection and processing stages and after data release. They also discuss methodological challenges inherent in harmonization as well as statistical issues linked to the use of harmonized survey data. We thank them all for their valuable input.

An important impetus for preparing this book was the US National Science Foundation grant for the project Survey Data Recycling: New Analytic Framework, Integrated Database, and Tools for Cross‐national Social, Behavioral and Economic Research (hereafter SDR Project; NSF 1738502). Work within the SDR Project highlighted the need for shared knowledge about harmonization methodology. This volume provides a platform where experts in scientific disciplines such as demography and public health directly communicate with sociology, political science, economics, organizational research, and survey methodology.

The support from the Institute of Philosophy and Sociology of the Polish Academy of Sciences, the Ohio State University’s Sociology Department, the Mershon Center for International Security Studies, and the GESIS – Leibniz Institute for the Social Sciences was key to completing this volume. We warmly thank Margit Bäck, GESIS, for her help in proofreading many chapters and communicating with authors.

About the Editors

Irina Tomescu‐Dubrow is Professor of Sociology at the Institute of Philosophy and Sociology at the Polish Academy of Sciences (PAN), and director of the Graduate School for Social Research at PAN.

Christof Wolf is President of GESIS Leibniz‐Institute for the Social Sciences and Professor of Sociology at the University of Mannheim. He has co‐authored several papers and co‐edited a number of handbooks in the fields of survey methodology and statistics. Aside from his longstanding interest in survey practice and survey research, he works on questions of social stratification and health.

Kazimierz M. Slomczynski is Professor of Sociology at the Institute of Philosophy and Sociology, the Polish Academy of Sciences (IFiS PAN) and Academy Professor of Sociology at the Ohio State University (OSU). He co‐directs CONSIRT – the Cross‐national Studies: Interdisciplinary Research and Training program at OSU and IFiS PAN.

J. Craig Jenkins is Academy Professor of Sociology and Senior Research Scientist at the Mershon Center for International Security at the Ohio State University.

About the Contributors

Chapter 1 – Objectives and Challenges of Survey Data Harmonization

Kazimierz M. Slomczynski is Professor of Sociology at the Institute of Philosophy and Sociology, the Polish Academy of Sciences (IFiS PAN) and Academy Professor of Sociology at the Ohio State University (OSU). He co‐directs CONSIRT – the Cross‐national Studies: Interdisciplinary Research and Training program at OSU and IFiS PAN.

Irina Tomescu‐Dubrow is Professor of Sociology at the Institute of Philosophy and Sociology at the Polish Academy of Sciences (PAN), and director of the Graduate School for Social Research at PAN.

J. Craig Jenkins is Academy Professor of Sociology and Senior Research Scientist at the Mershon Center for International Security at the Ohio State University.

Christof Wolf is President of GESIS Leibniz‐Institute for the Social Sciences and Professor of Sociology at the University of Mannheim. He has co‐authored several papers and co‐edited a number of handbooks in the fields of survey methodology and statistics. Aside from his longstanding interest in survey practice and survey research, he works on questions of social stratification and health.

Chapter 2 – The Effects of Data Harmonization on the Survey Research Process

Ranjit K. Singh is a post‐doctoral scholar at GESIS, the Leibniz Institute for the Social Sciences, where he practices, researches, and consults on the harmonization of substantive measurement instruments in surveys. He has a background in both social sciences and psychology. His research interests include the measurement quality of survey instruments as well as assessing and improving survey data comparability with harmonization.

Arnim Bleier is a postdoctoral researcher in the Department of Computational Social Science at GESIS – Leibniz‐Institute for the Social Sciences. His research interests are in the field of Computational Social Science, with an emphasis on Reproducible Research. In collaboration with social scientists, he develops models for the content, structure and dynamics of social phenomena.

Peter Granda is Archivist Emeritus at the Inter‐university Consortium for Political and Social Research (ICPSR) at the University of Michigan. He maintains a strong interest in international comparative research projects and how data generated from these efforts are archived and made available to the public. He studied the history and cultures of South Asia and spent several years doing archival research in India.

Chapter 3 – Harmonization in the World Values Survey

Kseniya Kizilova, PhD in Sociology, is a Senior Research Fellow at the Institute for Comparative Survey Research (Austria) and Head of Secretariat at the World Values Survey Association (Sweden). Her research focuses on social capital, political culture and political trust, democratization, and political participation. She is a member of the Council of the World Association for Public Opinion Research and an associated researcher at the University of Kharkiv (Ukraine).

Jaime Diez‐Medrano is Founding President of JD Systems and Director of the World Values Survey Association’s Data archive (Spain). He specializes in telecommunications engineering and has over 20 years of experience in database management software development. Diez‐Medrano is actively involved in data processing and harmonization for a number of large‐scale survey research projects, such as Afrobarometer, Arab Barometer, and Latinobarometro, among others.

Christian Welzel is Professor of Political Culture Research at the Center for the Study of Democracy, Leuphana University (Germany) and Vice‐President of the World Values Survey Association WVSA (Sweden). His research focuses on human empowerment, emancipative values, cultural change and democratization. Recipient of multiple large‐scale grants, Welzel is the author of more than a hundred‐and‐fifty scholarly publications and a member of the German Academy of Sciences.

Christian Haerpfer is Research Professor of Political Science at the University of Vienna and Director of the Institute for Comparative Survey Research (Austria). He is President of the World Values Survey Association WVSA (Sweden), Director of the Eurasia Barometer and a member of the European Academy of Sciences and Arts. His research focuses on democratization in Eastern Europe and Eurasia, political trust and regime support, electoral behavior, and political participation.

Chapter 4 – Harmonization in the Afrobarometer

Carolyn Logan is an Associate Professor in the Department of Political Science at Michigan State University, and currently serves as Director of Analysis with Afrobarometer. She has been with Afrobarometer since 2001, including serving as Deputy Director from 2008 to 2019 and during the network’s expansion from 20 to 36 countries in 2011–2013. Her research interests include the role of traditional authorities in democratic governance, and citizen‐versus‐subject attitudes among African publics.

Robert Mattes is Professor of Government and Public Policy at the University of Strathclyde, and Honorary Professor at the Institute for Democracy, Citizenship and Public Policy in Africa at the University of Cape Town. He is a co‐founder of, and Senior Adviser to, Afrobarometer, a ground‐breaking regular survey of public opinion in over 30 African countries (www.afrobarometer.org).

Francis Kibirige is co‐founder and Managing Director of Hatchile Consult Ltd., a research company based in Uganda. He joined Afrobarometer in 1999, and currently serves as Network Sampling Specialist and co‐leads the Afrobarometer team in Uganda. He studied agricultural engineering at Makerere University and has since received four Afrobarometer fellowships to study political research methodology and statistical modeling at University of Cape Town and the Inter‐University Consortium for Political and Social Research (ICPSR) at the University of Michigan.

Chapter 5 – Harmonization in the National Longitudinal Surveys of Youth (NLSY)

Elizabeth Cooksey (PhD) is Academy Professor Emeritus at The Ohio State University and a senior researcher at CHRR (Center for Human Resource Research) at The Ohio State University. She has worked with the National Longitudinal Survey of Youth data (NLSY) for over 30 years, and has been the Principal Investigator for the NLSY79 Child and Young Adult surveys for the past two decades.

Rosella Gardecki (PhD) is a Research Specialist at CHRR at The Ohio State University. In 1996, she joined CHRR as a data archivist for the National Longitudinal Survey of Youth 1997 (NLSY97). She has contributed to questionnaire design for more than 20 years. With her background in economics, she currently leads the team that creates variables for both the NLSY79 and NLSY97 cohorts.

Carole Lunney (MA) is a data analysis consultant living in Calgary, Alberta, Canada. From 2011 to 2020, Carole worked at CHRR at The Ohio State University on the data archivist team for the National Longitudinal Surveys. In that role, she was involved in survey instrument design and testing, statistical programming, writing study documentation, and data user outreach. She has co‐authored publications on posttraumatic stress disorder, music cognition, and communication.

Amanda Roose (MA) has been the NLS Project Manager/Documentation Lead at CHRR at The Ohio State University for two decades, with management responsibilities for all aspects of survey administration including design, fielding, data preparation, documentation, and dissemination. She has experience editing academic publications across a range of disciplines.

Chapter 6 – Harmonization in the Comparative Study of Electoral Systems (CSES) Projects

Stephen Quinlan (PhD, University College Dublin) is Senior Researcher at the GESIS – Leibniz Institute for the Social Sciences in Mannheim and Project Manager of the Comparative Study of Electoral Systems (CSES) project. His research interests are comparative electoral behavior and social media’s impact on politics. His work has appeared in journals such as Information, Communication, and Society, International Journal of Forecasting, Electoral Studies, Party Politics, and The European Political Science Review.

Christian Schimpf (PhD, University of Mannheim) is a former Data Processing Specialist at the CSES. Previously, he was a Senior Researcher at the University of Alberta and Université du Québec à Montréal. His research interests are comparative electoral behavior and environment/energy policy. His work has appeared in journals such as The American Political Science Review, Environmental Politics, and Political Studies.

Katharina Blinzler (MA, University of Mannheim) is a Data Processing Specialist and Archivist with the CSES Secretariat at the GESIS – Leibniz Institute for the Social Sciences, Köln. Her research interests are comparative electoral behavior and data harmonization.

Slaven Zivkovic (PhD, ABD) is a former Data Processing Specialist at the CSES and a PhD candidate at the University of Mainz, Germany. Most recently, he was a Fulbright Fellow at the Florida International University in Miami, USA. His research interests are comparative electoral behavior, especially in post‐communist states. His work has appeared in journals such as The Journal of Contemporary European Studies, Comparative European Politics, and East European Politics and Societies.

Chapter 7 – Harmonization in the East Asian Social Survey

Noriko Iwai is Director of the Japanese General Social Survey Research Center and Professor in the Faculty of Business Administration, Osaka University of Commerce. She is a PI of JGSS and EASS, a director of the Japanese Association for Social Research, and a member of the Science Council of Japan. Her current research project supports Japanese researchers in the humanities and social sciences in preparing their data for public use.

Tetsuo Mo is Research Fellow of the Japanese General Social Survey Research Center, Osaka University of Commerce. His areas of specialty are labor economy, inequality and social exclusion, social survey, and quantitative analysis of survey data. He is responsible for the creation of JGSS questionnaires, cleaning of JGSS data, and harmonization of East Asian Social Survey data.

Jibum Kim is a professor in the Department of Sociology and the director of the Survey Research Center at Sungkyunkwan University (SKKU) in Seoul, South Korea. He is also a PI of the Korean General Social Survey (KGSS). He is currently the president of the World Association for Public Opinion Research (WAPOR) Asia Pacific and serves on the editorial board of the International Journal of Public Opinion Research.

Chyi‐in Wu is a Research Fellow at the Institute of Sociology, Academia Sinica in Taipei, Taiwan. He is currently the PI of the Taiwan Social Change Survey (TSCS). He currently serves as the editor of The Journal of Information Society (Taiwan) and was the president of the Taiwan Association of Information Society (TAIS).

Weidong Wang is the Director of the Social Psychology Institute, the Executive Deputy Director of the National Survey Research Center (NSRC), and Associate Professor in the Sociology Department at Renmin University of China. He is the PI of the Chinese Religious Life and Survey (CRLS), PI of the China Education Panel Survey (CEPS), and the co‐founder and Executive Director of the Chinese National Survey Data Archive (CNSDA).

Chapter 8 – Ex‐ante Harmonization of Official Statistics in Africa (SHaSA)

Dr. Dossina Yeo is a passionate statistician who led the development and implementation of the recent major strategic frameworks for statistical development in Africa, including the African Charter on Statistics and the Strategy for the Harmonization of Statistics in Africa (SHaSA 1 and 2). He also led the establishment of the statistical function within the African Union Commission, including the creation of the Statistics Unit, its transformation into a Statistics Division, and the creation of the African Union Institute for Statistics (STATAFRIC) based in Tunis (Tunisia) and the Pan‐African Training Center (PAN‐STAT) based in Yamoussoukro (Cote d’Ivoire). Dr. Yeo is currently Acting Director of Economic Development, Integration and Trade at the African Union Commission. He has previously worked for the United Nations Statistics Division (UNSD) and the African Development Bank (AfDB).

Chapter 9 – Harmonization for Cross‐National Secondary Analysis: Survey Data Recycling

Irina Tomescu‐Dubrow is Professor of Sociology at the Institute of Philosophy and Sociology at the Polish Academy of Sciences (PAN), and director of the Graduate School for Social Research at PAN.

Kazimierz M. Slomczynski is Professor of Sociology at the Institute of Philosophy and Sociology, the Polish Academy of Sciences (IFiS PAN) and Academy Professor of Sociology at the Ohio State University (OSU). He co‐directs CONSIRT – the Cross‐national Studies: Interdisciplinary Research and Training program at OSU and IFiS PAN.

Ilona Wysmulek is an Assistant Professor at the Institute of Philosophy and Sociology, Polish Academy of Sciences in Warsaw. She works in the research group of Professor Kazimierz M. Slomczynski on Comparative Analysis of Social Inequality and is actively involved in two of the team’s projects: the SDR survey data harmonization project and the Polish Panel Survey POLPAN.

Przemek Powałko is an Assistant Professor at the Institute of Philosophy and Sociology, Polish Academy of Sciences in Warsaw.

Olga Li is a PhD student at the Graduate School for Social Research and a member of the research unit on Comparative Analyses of Social Inequality at the Institute of Philosophy and Sociology, Polish Academy of Sciences. She has previously worked on the Polish Panel Survey (POLPAN) and Survey Data Recycling (SDR) grant projects. For her PhD thesis, she is conducting quantitative research on political participation in authoritarian regimes.

Yamei Tu is a Ph.D. student at the Ohio State University, and her research interests include visualization and NLP.

Marcin Slarzynski is an Assistant Professor at the Institute of Philosophy and Sociology at the Polish Academy of Sciences. His research focuses on the national movement of local elites in Poland, 2005–2015.

Marcin W. Zielinski is a Research Assistant, Institute of Philosophy and Sociology, Polish Academy of Sciences, Warsaw, Poland.

Denys Lavryk is affiliated with the Graduate School for Social Research, Polish Academy of Sciences, Warsaw, Poland.

Chapter 10 – Harmonization of Panel Surveys: The Cross‐National Equivalent File

Dean Lillard is Professor of consumer sciences in the Department of Human Sciences at The Ohio State University. He received his PhD in economics from the University of Chicago in 1991. From 1991 to 2012, he held appointments at Cornell University. He moved to OSU in 2012. He is a Research Fellow of the German Institute for Economic Research and a Research Associate of the National Bureau of Economic Research.

Chapter 11 – Harmonization of Survey Data from UK Longitudinal Studies: CLOSER

Dara O'Neill was the theme lead in data harmonization at the CLOSER consortium of UK‐based longitudinal studies (2018–2022) at University College London (UCL), overseeing diverse cross‐study harmonization projects. Previously, Dara held research posts at UCL’s Department of Epidemiology and Public Health, at UCL’s Institute of Cardiovascular Science and at the University of Surrey’s School of Psychology. Dara now works as a statistician/psychometrician in clinical trial research and is an honorary Senior Research Fellow at UCL.

Social Research Institute, University College London, UK

Rebecca Hardy is Professor of Epidemiology and Medical Statistics in the School of Sport, Exercise and Health Sciences at Loughborough University. She previously worked at University College London, where she was CLOSER Director (2019–2022) and Programme Leader in the MRC Unit for Lifelong Health and Aging (2003–2019). Rebecca uses a life course approach to study health and aging, and her interests also include methodological considerations in life course and longitudinal data analysis, as well as cross‐study data harmonization.

Chapter 12 – Harmonization of Census Data: IPUMS – International

Lara Cleveland is a sociologist and principal research scientist at the University of Minnesota’s Institute for Social Research and Data Innovation (ISRDI) where she directs the IPUMS International census and survey data project. She leads technical workflow development, partner relations, and grant management for the project. Her research interests include data and methods; organizations, occupations, and work; and global standardization practices. She serves on international working groups concerning data dissemination.

Steven Ruggles is Regents Professor of History and Population Studies at the University of Minnesota. He started IPUMS in 1991, and today IPUMS provides billions of records from thousands of censuses and surveys describing individuals and households in over 100 countries from the 18th century to the present. Ruggles has published extensively on historical demography, focusing especially on long‐run changes in families and marriage, and on methods for population history.

Matthew Sobek is the IPUMS Director of Data Integration and has overseen numerous projects to harmonize census and survey data collections over the past 30 years. Sobek co‐authored the original U.S. IPUMS in the 1990s before shifting focus to international data harmonization. He played a foundational role in the development of IPUMS harmonization and dissemination methods and continues to contribute to their evolution.

Chapter 13 – Maelstrom Research Approaches to Retrospective Harmonization of Cohort Data for Epidemiological Research

Tina W. Wey, PhD, is a research data analyst with the Maelstrom Research team at the Research Institute of the McGill University Health Centre. She has a background in biological research, with experience in data management and statistical analysis in behavior, ecology, and epidemiology.

Isabel Fortier, PhD, is a researcher at the Research Institute of the McGill University Health Centre and Assistant Professor in the Department of Medicine at McGill University. She has extensive experience in collaborative epidemiological and methodological research and leads the Maelstrom Research program, which aims to provide the international research community with resources (expertise, methods, and software) to leverage and support data harmonization and integration across studies.

Chapter 14 – Harmonizing and Synthesizing Partnership Histories from Different German Survey Infrastructures

Dr. Bernd Weiß is head of the GESIS Panel, a probabilistic mixed‐mode panel. He also serves as Deputy Scientific Director of the Department Survey Design and Methodology at GESIS – Leibniz Institute for the Social Sciences in Mannheim. His research interests range from survey methodology, research synthesis, and open science to family sociology and juvenile delinquency.

Dr. Sonja Schulz is a senior researcher in the Department Survey Data Curation at GESIS – Leibniz Institute for the Social Sciences. Her current research and services focus on survey data harmonization, family research and social inequality, with a special focus on trends in family formation and marriage dissolution. Recent articles were published in Criminal Justice Review, European Journal of Criminology, and Journal of Quantitative Criminology.

Dr. Lisa Schmid is a social scientist with a research interest in intimate relationships and family demography. As a research associate at GESIS – Leibniz Institute for the Social Sciences in Mannheim, she is part of the Family Surveys team that conducts the Family Research and Demographic Analysis (FReDA) panel in Germany.

Sebastian Sterl is a scientific researcher in the Interdisciplinary Security Research Group at Freie Universität Berlin focused on creating psychosocial situation pictures in crises and disasters using quantitative approaches. Before, he worked at GESIS – Leibniz Institute for the Social Sciences responsible for harmonizing and synthesizing survey data. His interests include quantitative methods of empirical social and economic research, data management, risk perception, protective and coping behavior, and rational choice theory.

Dr. Anna‐Carolina Haensch is a postdoctoral researcher at the LMU Munich and an assistant professor at the University of Maryland. Her work focuses on data quality, especially regarding missing data. She has also been part of the COVID‐19 Trends and Impact Surveys (CTIS) team since early 2021 and enjoys teaching courses on quantitative methods at the LMU Munich and the Joint Program in Survey Methodology.

Chapter 15 – Harmonization and Quality Assurance of Income and Wealth Data: The Case of LIS

Jörg Neugschwender has a PhD in Sociology from the Graduate School of Economic and Social Sciences (GESS) at the University of Mannheim, Germany. He is Data Team Manager at LIS Cross‐National Data Center in Luxembourg, supervising the data team in harmonizing datasets for the LIS databases, developing and maintaining data production and quality assessment applications, and overseeing the consistency of produced datasets. He is the editor of the LIS newsletter Inequality Matters.

Teresa Munzi, an economist by training (University of Rome and London School of Economics), has been with the LIS Cross‐National Data Center in Luxembourg for over 20 years. Since 2019, she has been the Director of Operations of LIS, where she is responsible for managing and overseeing all operations. She also carries out research on the comparative study of welfare systems and their impact through redistribution on poverty, inequality, and family well‐being.

Piotr Paradowski works at LIS Cross‐National Data Center in Luxembourg as a Data Expert and Research Associate. He is also affiliated with the Department of Statistics and Econometrics at the Gdansk University of Technology. In addition, he conducts interdisciplinary research focusing on income and wealth distributions as they relate to economic inequality, poverty, and welfare state politics.

Chapter 16 – Ex‐Post Harmonization of Time Use Data: Current Practices and Challenges in the Field

Ewa Jarosz is an assistant professor at the Faculty of Economic Sciences, University of Warsaw. She specializes in cross‐national studies and uses comparative survey data, time use data, and panel data in her work. Her research interests include time use, demographic change, social inequality, health, and wellbeing.

Sarah Flood is the Director of U.S. Surveys at the IPUMS Center for Data Integration and the Associate Director of the Life Course Center, both at the University of Minnesota. Her data infrastructure work on IPUMS Time Use (https://timeuse.ipums.org/) lowers the barriers to accessing time diary data. Her substantive research is at the intersection of gender, work, family, and the life course.

Margarita Vega‐Rapun is a research officer at the Joint Research Centre of the European Commission. She is mainly involved in topics related to development goals and territories. She also holds an honorary position at the University College London, Centre for Time Use Research, where she worked on the Multinational Time Use project. Her research interests are time poverty, gender and inequalities, and the impact of COVID‐19 on time use patterns.

Chapter 17 – Assessing and Improving the Comparability of Latent Construct Measurements in Ex‐Post Harmonization

Ranjit K. Singh is a post‐doctoral scholar at GESIS, the Leibniz Institute for the Social Sciences, where he practices, researches, and consults on the harmonization of substantive measurement instruments in surveys. He has a background both in social sciences and psychology. Research interests include measurement quality of survey instruments as well as assessing and improving survey data comparability with harmonization.

Markus Quandt is a senior researcher and team leader at GESIS Leibniz Institute for the Social Sciences in Cologne, Germany. His research is based on quantitative surveys in cross‐country comparative settings. Substantive interests are in political and social participation as collective goods problems; methodological interests concern the comparability and validity of measures of attitudes and values, and the closely related problems of harmonizing data from different sources.

Chapter 18 – Comparability and Measurement Invariance

Artur Pokropek is a Professor at the Institute of Philosophy and Sociology of the Polish Academy of Sciences. His main areas of research interests are statistics, psychometrics and machine learning. He has developed several methodological and statistical approaches for analyzing survey data. He gained knowledge as a visiting scholar at the Educational Testing Service (Princeton, USA) and as an associate researcher at the EC Joint Research Centre (Ispra, Italy).

Chapter 19 – On the Creation, Documentation, and Sensible Use of Weights in the Context of Comparative Surveys

Dominique Joye is professor emeritus of sociology at the University of Lausanne and affiliated researcher at FORS. He is one of the co‐editors of the SAGE Handbook of Survey Methodology, and was for a long time the Swiss coordinator of ESS, EVS and ISSP as well as participant in the methodological boards of these international comparative projects.

Marlène Sapin is senior researcher at the Swiss Centre of Expertise in Social Sciences (FORS) and at the Swiss Centre of Expertise in life course research (LIVES), University of Lausanne. She is a specialist in population‐based surveys and has been involved in the leading developing team of several national and international surveys (e.g. Swiss Federal Survey of Youths, International Social Survey Programme). She has a strong interest in survey methodology, social networks, and health.

Christof Wolf is President of GESIS Leibniz‐Institute for the Social Sciences and professor of sociology at the University of Mannheim. He has co‐authored several papers and co‐edited a number of handbooks in the fields of survey methodology and statistics. Aside from his longstanding interest in survey practice and survey research, he works on questions of social stratification and health.

Chapter 20 – On Using Harmonized Data in Statistical Analysis: Notes of Caution

Claire Durand is Professor of Sociology at the Université de Montréal.

Chapter 21 – On the Future of Survey Data Harmonization

Kazimierz M. Slomczynski is Professor of Sociology at the Institute of Philosophy and Sociology, the Polish Academy of Sciences (IFiS PAN) and Academy Professor of Sociology at the Ohio State University (OSU). He co‐directs CONSIRT – the Cross‐national Studies: Interdisciplinary Research and Training program at OSU and IFiS PAN.

Christof Wolf is President of GESIS Leibniz‐Institute for the Social Sciences and professor of sociology at the University of Mannheim. He has co‐authored several papers and co‐edited a number of handbooks in the fields of survey methodology and statistics. Aside from his longstanding interest in survey practice and survey research, he works on questions of social stratification and health.

Irina Tomescu‐Dubrow is Professor of Sociology at the Institute of Philosophy and Sociology at the Polish Academy of Sciences (PAN), and director of the Graduate School for Social Research at PAN.

J. Craig Jenkins is Academy Professor of Sociology and Senior Research Scientist at the Mershon Center for International Security at the Ohio State University.

1 Objectives and Challenges of Survey Data Harmonization

Kazimierz M. Slomczynski, Irina Tomescu‐Dubrow, J. Craig Jenkins, and Christof Wolf

1.1 Introduction

This edited volume is an extensive presentation of survey data harmonization in the social sciences and the first to discuss ex‐ante (prospective) and ex‐post (retrospective) harmonization concepts and methodologies from a global perspective in the context of specific cross‐national and longitudinal survey projects. Survey data harmonization combines survey methods, statistical techniques, and substantive theories to create datasets that facilitate comparative research. Both data producers and secondary users engage in harmonization to achieve or strengthen the comparability of the answers provided by respondents surveyed in different populations or in the same population over time (Granda et al. 2010; Wolf et al. 2016). Most data producers employ harmonization ex‐ante, when designing and implementing comparative studies, for example, the World Values Survey (WVS), the Survey of Health, Ageing, and Retirement in Europe, the International Social Survey Programme (ISSP), or the European Social Survey (ESS), among many others. Secondary users, as well as some data producers, apply harmonization methods ex‐post, to already released files that are not comparable by design, in order to integrate them into datasets suitable for comparative analysis. The Luxembourg Income Study, the Multinational Time Use Study, IPUMS‐International, the Cross‐national Equivalent File, and, more recently, the Survey Data Recycling (SDR) Database are relevant illustrations of large‐scale ex‐post harmonization projects.

Harmonizing at the data collection and processing stages (i.e. ex‐ante) and harmonizing after data release (i.e. ex‐post) present obvious differences, including in scope (what can be harmonized), methods (how to harmonize), organization (who is involved), and expenditure (what is the calculated cost). Nonetheless, both approaches – individually and in relation to each other – play important roles in obtaining survey responses that can be compared. Well‐documented ex‐ante harmonization procedures inform subsequent ex‐post harmonization steps, while lessons learned during ex‐post harmonization efforts can aid harmonization decisions in the preparation of new surveys. This book covers both perspectives to offer readers a rounded view of survey data harmonization. Although our examples draw on comparative research, whether cross‐national or historical, other harmonization work uses many of the same methods that are presented here.

Over the decades, the number of harmonized social science datasets has grown rapidly, both in response to the push for greater comparability of concepts and constructs, representation, and measurement in longitudinal and cross‐national projects (Granda et al. 2010) and because of incentives to reuse the wealth of already collected data (Slomczynski and Tomescu‐Dubrow 2018). However, documentation of the complex process of harmonization decisions is weak (Granda and Blasczyk 2016), and the body of methodological literature is small (Dubrow and Tomescu‐Dubrow 2016). Explicit discussion of ex‐ante harmonization, if present, is usually subsumed under comparative survey methods, and hardly any survey methodology textbook features chapters on ex‐post harmonization. The consequence is a scattered field in which harmonized datasets are readily available, but harmonization assumptions, the challenges that researchers face, the solutions they chose, and best practice recommendations are not widely shared.

To foster the diffusion of harmonization knowledge across the social sciences, this book provides a platform where such scientific disciplines as demography and public health directly communicate with sociology, political science, economics, organizational research, and survey methodology. We structure the volume into four parts that, taken together, integrate the discussion of concepts and methods developed around harmonization with practical knowledge accumulated in the process of building longitudinal and cross‐national datasets for comparative research and relevant insights for analyzing harmonized survey data. These four parts are preceded not only by this Introduction (Chapter 1) but also by Chapter 2, which considers data harmonization in the overall survey research process.

Part I of this book focuses on six renowned projects from around the world as case studies of ex‐ante harmonization. Whether these studies are international surveys (Chapters 3, 4, 6, and 7), a single‐nation panel (Chapter 5), or official statistics (Chapter 8), they each share their experiences with harmonization, including its documentation and quality controls, challenges and how they were met, and recommendations. The same structure characterizes the eight chapters in Parts II and III, whose core common theme is ex‐post harmonization. Chapters 9–12 deal with the integration and harmonization of national surveys, while Chapters 13–16 are devoted to the harmonization of surveys on specific substantive issues, such as health, family, income and wealth, and time use. Contributions in Part IV adopt a user’s perspective and discuss methodological issues that statistical analysis of harmonized data will likely raise. We conclude the volume with lessons learned from the chapters and an agenda for moving the survey data harmonization field forward.

1.2 What is the Harmonization of Survey Data?

Harmonization of survey data is a process that aims to produce equivalent or comparable measures of a given characteristic across datasets coming from different populations or from the same population at different time points. If the aim is to harmonize an entire data collection effort, as cross‐national survey programs or multi‐culture surveys seek to do, then the scope of harmonization is broad and concerns sampling design, data collection instrument(s), survey mode(s), fieldwork, documentation, data cleaning, and the presentation of data and metadata as machine‐readable files. If data are harmonized after they are collected, then the focus lies on ways to code the data into equivalent or comparable categories using appropriate scales, and on developing variables that capture sources of potential bias among the various surveys incorporated.

1.2.1 Ex‐ante, Input and Output, Survey Harmonization

Input harmonization pursues equivalent or comparable representation of the populations studied (samples), instruments of the measurement (questionnaires), methods of data collection (modes), and data documentation (metadata) in a bid to reduce as much as possible the share of methodological artifacts in comparative analyses. This generally involves the agreement of Principal Investigators (PIs) and the national data collection teams to use uniform definitions of concepts and indicators, and the same procedures (e.g. survey mode), training (of, e.g. translators), and technical requirements (e.g. sampling method, minimum response rate) (Ehling 2003; Hoffmeyer‐Zlotnik 2016; Kallas and Linardis 2010).

Ex‐ante output harmonization assumes that for concepts that exist across populations, comparable estimates can be obtained even if survey conditions differ (Hoffmeyer‐Zlotnik 2016). For some characteristics, strict input harmonization is not an option, and this approach is the only way to arrive at comparable measures (Schneider et al. 2016). PIs and national teams agree on a target variable and a common measurement schema to construct it, but first, respondents’ answers are gathered using country‐specific survey questions (Granda et al. 2010; Kallas and Linardis 2010). Once the data are in, the country‐specific items are recoded into the common target variable following the agreed‐upon harmonized coding schema. For example, in the ESS, the International Standard Classification of Education, ISCED, the harmonized measure of education levels, is obtained via “mapping” of national classifications of education. International survey projects frequently use both input and output harmonization as they seek to balance the need for high‐quality cross‐national measures with the need for valid and reliable national indicators.
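The recoding step described above can be sketched in a few lines of Python. This is a minimal illustration, not the ESS procedure: the national category labels, the ISCED levels assigned to them, and the names `NATIONAL_TO_ISCED`, `to_target`, `edu_national`, and `edu_isced` are all assumptions made for the example.

```python
# Illustrative sketch of ex-ante output harmonization: country-specific
# answers are recoded into one agreed-upon target variable.
# The mapping below is a hypothetical example, NOT an official ISCED mapping.
NATIONAL_TO_ISCED = {
    "DE": {"Hauptschulabschluss": 2, "Abitur": 3, "Bachelor": 6},
    "PL": {"gimnazjum": 2, "liceum": 3, "licencjat": 6},
}

MISSING = None  # value used when a national code has no agreed mapping


def to_target(country, national_code):
    """Recode one country-specific answer into the common target variable."""
    return NATIONAL_TO_ISCED.get(country, {}).get(national_code, MISSING)


# Hypothetical respondent records collected with country-specific questions.
records = [
    {"country": "DE", "edu_national": "Abitur"},
    {"country": "PL", "edu_national": "licencjat"},
    {"country": "PL", "edu_national": "unknown"},
]

# Keep the original national variable and add the harmonized target variable.
harmonized = [
    dict(r, edu_isced=to_target(r["country"], r["edu_national"]))
    for r in records
]

for row in harmonized:
    print(row)
```

Keeping the national variable next to the target variable leaves the mapping open to inspection, and unmapped codes stay visibly missing rather than being forced into a category.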

Ex‐post survey harmonization aims at producing an integrated data file containing information on the units of observations and variables describing these units that stem from surveys that were not originally meant to be put together. For this type of harmonization, the output in the form of an integrated file is essential. The process of harmonizing ex‐post focuses on two aspects of comparability: the question content and the answering scale. The end product should contain metadata on all aspects of the survey life cycle, which can be used to assess comparability and as controls for potential bias associated with differences among included surveys.

In Figure 1.1, we present a simplified relationship between different types of harmonization, including stages of the survey life cycle. We take into account only two projects (A and B), containing three (A) or two (B) surveys, respectively, conducted in the same set of countries, or – in the case of longitudinal studies conducted in one country – different time periods. In the figure, within each project, ex‐ante input harmonization refers to all pertinent stages of the survey life cycle. Ex‐ante output harmonization should (only) be used for characteristics that cannot be measured with the same categories across contexts. This is different from surveys such as, for example, the EU‐Labor Force Survey for which Eurostat did not succeed in prescribing a unified questionnaire for all European countries.

Figure 1.1 Simplified relationships between different types of data harmonization, including stages of survey life cycles.

For ex‐post harmonization, we assume that substantively both projects deal with the same topics, although they could use slightly different question wordings or other elements of survey production. Of course, ex‐post harmonization should be driven by a theoretical and/or practical interest in certain topics. In Figure 1.1, ex‐post harmonization adjusts for surveys’ differences only at the level of projects because within‐project variation has already been eliminated by ex‐ante output harmonization. In practice, however, ex‐post harmonization often looks at survey differences independently of their project origin. An important part of ex‐post harmonization is examining survey differences at all stages of the survey life cycle. Ideally, all differences should be documented as metadata in the form of separate variables.
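As a rough sketch of these ideas, the following Python fragment integrates one item from two hypothetical projects that used different answer scales and stores the scale difference as a separate metadata variable. The item names, scale ranges, and output variable names (`target_trust`, `ctrl_scale_length`) are assumptions for illustration, and linear rescaling is only one simple option among several, not a method prescribed in this volume.

```python
# Illustrative sketch of ex-post harmonization of one item across two projects.
# All source variable names and scale ranges are hypothetical.
SOURCE_ITEMS = {
    "A": {"var": "trust_parl", "low": 0, "high": 10},  # 11-point scale
    "B": {"var": "q7", "low": 1, "high": 4},           # 4-point scale
}


def rescale(value, low, high):
    """Linearly map an answer from the range [low, high] onto [0, 1]."""
    return (value - low) / (high - low)


def harmonize(project, row):
    """Return one record of the integrated file for a source record."""
    spec = SOURCE_ITEMS[project]
    return {
        "project": project,  # provenance of the record
        "target_trust": rescale(row[spec["var"]], spec["low"], spec["high"]),
        # Metadata kept as a separate variable so that analysts can control
        # for differences in the original answering scale.
        "ctrl_scale_length": spec["high"] - spec["low"] + 1,
    }


integrated = [
    harmonize("A", {"trust_parl": 5}),
    harmonize("B", {"q7": 4}),
]

for record in integrated:
    print(record)
```

The control variable travels with every record, so an analyst can later test whether results depend on the length of the source scale.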

In Table 1.1, we describe the projects included in this volume in terms of the types of survey harmonization they utilize. Within ex‐ante output harmonization, we distinguish two situations. The first follows ex‐ante input harmonization and corrects some survey‐specific deviations from the previously agreed‐upon solutions. The second occurs when some harmonization is needed because collaborating partners bring their own data for which there was little or no prior harmonization. This situation is similar to ex‐post harmonization, when researchers combine surveys on the same topic and harmonize relevant variables. Ex‐post harmonization can, however, also be conducted within the same project, as in the case of harmonizing data from different waves, in either cross‐sectional or panel frameworks. The remaining type of ex‐post harmonization refers to the situation in which researchers harmonize a number of surveys from different projects, like ISSP and ESS, or Eurobarometer (EB) with its rendition in Asia or Latin America.

Table 1.1 Types of survey harmonization in projects included in this volume.

Chapter #

Project

Ex‐ante harmonization

Ex‐post harmonization

Input

Output

Within a given survey project

of multi‐survey projects

of independent surveys

After input harmonization

With no or little input harmonization

3

World Values Survey WVS

X

X

X

4

Afrobarometer AFB

X

X

X

5

National Longitudinal Survey of Youth NLSY

X

X

X

X

6

Comparative Study of Electoral Systems CSES

X

X

X

X

7

East Asian Social Survey EASS

X

X

X

8

Official Statistics in Africa SHaSA

X

X

X

9

Survey Data Recycling SDR

X

10

Cross‐National Equivalent File CNEF

X

11

UK Longitudinal Studies CLOSER

X

X

12

Harmonization of Census Data IPUMS

X

13

Health Survey Data MAELSTROM

X

X

X

X

X

14

Harmonizing Family Biographies HFB

X

15

Income and Wealth Data LIS/LWS

X

16

Harmonization of Time Use Data

X

X

Survey harmonization does not mean the simple pooling of data from different international survey projects for analyses performed separately for these projects. For example, in one study, religiosity was recoded to a unified schema for the World Values Survey and ISSP, but the impact of the resulting variable on happiness, civic engagement, and health was assessed independently for each project (PEW 2019). Various aspects of pooling survey data and how it differs from ex‐post harmonization have been examined in the literature (e.g. Kish 1999; Ayadi et al. 2003; Wendt 2007; Roberts and Binder 2009; Malnar and Ryan 2022). We should note, however, that in some projects, integrating data from different surveys is the first step toward harmonization (e.g. in CNEF, Chapter 10).

1.3 Why Harmonize Social Survey Data?

Generally, researchers harmonize survey data to gain new knowledge and solve scientific problems requiring sample sizes that could not be obtained with individual studies. Typically, this requires large geographic and/or temporal coverage so as to provide macro‐level context for explaining individuals’ opinions, attitudes, and behaviors. Ideally, in advanced comparative analysis, context should be expressed in terms of specific variables (Przeworski and Teune 1970) that capture conditions in various relevant dimensions: political (e.g. indexes of democracy), economic (e.g. GDP per capita, foreign debt), social (e.g. indexes of marriage homogamy or social mobility), and cultural (e.g. book readership, religious fractionalization). In addition, as has been pointed out in the literature, harmonized data “improves the generalizability of results, helps ensure the validity of comparative research, encourages more efficient secondary usage of existing data, and provides opportunities for collaborative and multi‐center research” (Doiron et al. 2012, p. 221).

1.3.1 Comparison and Equivalence

“How can valid comparisons be made in cross‐national research when so many terms and concepts differ in their meanings from country to country?” (Przeworski and Teune 1966, p. 551; see also van Deth 1998). A first naïve answer would be: by using identical measures. The identity of two measures would require that they have the same manifestation in every possible dimension or with respect to every possible aspect. Identity would, for example, require that the language in which the instrument is administered be the same in all instances. Obviously, this is not feasible in comparative research.