Statistical Analysis with Missing Data - Roderick J. A. Little - E-Book


Description

An up-to-date, comprehensive treatment of a classic text on missing data in statistics

The topic of missing data has gained considerable attention in recent decades. This new edition by two acknowledged experts on the subject offers an up-to-date account of practical methodology for handling missing data problems. Blending theory and application, authors Roderick Little and Donald Rubin review historical approaches to the subject and describe simple methods for multivariate analysis with missing values. They then provide a coherent theory for analysis of problems based on likelihoods derived from statistical models for the data and the missing data mechanism, and then they apply the theory to a wide range of important missing data problems.

Statistical Analysis with Missing Data, Third Edition starts by introducing readers to the subject and approaches toward solving it. It looks at the patterns and mechanisms that create the missing data, as well as a taxonomy of missing data. It then goes on to examine missing data in experiments, before discussing complete-case and available-case analysis, including weighting methods. The new edition expands its coverage to include recent work on topics such as nonresponse in sample surveys, causal inference, diagnostic methods, and sensitivity analysis, among a host of other topics.

  • An updated “classic” written by renowned authorities on the subject
  • Features over 150 exercises (including many new ones)
  • Covers recent work on important methods like multiple imputation, robust alternatives to weighting, and Bayesian methods
  • Revises previous topics based on past student feedback and class experience
  • Contains an updated and expanded bibliography

The authors were awarded The Karl Pearson Prize in 2017 by the International Statistical Institute, for a research contribution that has had profound influence on statistical theory, methodology or applications. Their work "has been no less than defining and transforming." (ISI)

Statistical Analysis with Missing Data, Third Edition is an ideal textbook for upper undergraduate and/or beginning graduate level students of the subject. It is also an excellent source of information for applied statisticians and practitioners in government and industry.


Page count: 810

Year of publication: 2019




WILEY SERIES IN PROBABILITY AND STATISTICS

Established by Walter A. Shewhart and Samuel S. Wilks

Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Geof H. Givens, Harvey Goldstein, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay

Editors Emeriti: J. Stuart Hunter, Iain M. Johnstone, Joseph B. Kadane, Jozef L. Teugels

The Wiley Series in Probability and Statistics is well established and authoritative. It covers many topics of current research interest in both pure and applied statistics and probability theory. Written by leading statisticians and institutions, the titles span both state-of-the-art developments in the field and classical methods.

Reflecting the wide range of current research in statistics, the series encompasses applied, methodological and theoretical statistics, ranging from applications and new techniques made possible by advances in computerized practice to rigorous treatment of theoretical approaches.

This series provides essential and invaluable reading for all statisticians, whether in academia, industry, government, or research.

A complete list of titles in this series can be found at http://www.wiley.com/go/wsps

Statistical Analysis with Missing Data

Roderick J. A. Little

Richard D. Remington Distinguished University Professor of Biostatistics, Professor of Statistics, and Research Professor, Institute for Social Research, at the University of Michigan

Donald B. Rubin

Professor at Yau Mathematical Sciences Center, Tsinghua University; Murray Shusterman Senior Research Fellow, Fox School of Business, at Temple University; and Professor Emeritus, at Harvard University

3rd Edition

This edition first published 2020

© 2020 John Wiley & Sons, Inc

Edition History

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Roderick J A Little and Donald B Rubin to be identified as the authors of the material in this work has been asserted in accordance with law.

Registered Office(s)

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

Editorial Office

111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty

In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging-in-Publication Data

Names: Little, Roderick J.A., author. | Rubin, Donald B., author.

Title: Statistical analysis with missing data / Roderick J.A. Little, Donald B. Rubin.

Description: Third edition | Hoboken, NJ : Wiley, 2020. | Series: Wiley series in probability and statistics | Includes index. |

Identifiers: LCCN 2018058860 (print) | LCCN 2018061330 (ebook) | ISBN 9781118596012 (Adobe PDF) | ISBN 9781118595695 (ePub) | ISBN 9780470526798 (hardcover)

Subjects: LCSH: Mathematical statistics. | Mathematical statistics--Problems, exercises, etc. | Missing observations (Statistics) | Missing observations (Statistics)--Problems, exercises, etc.

Classification: LCC QA276 (ebook) | LCC QA276 .L57 2019 (print) | DDC 519.5--dc23

LC record available at https://lccn.loc.gov/2018058860

Cover image: Wiley

Cover design by Wiley

CONTENTS

Cover

Preface to the Third Edition

Part I Overview and Basic Approaches

1 Introduction

1.1 The Problem of Missing Data

1.2 Missingness Patterns and Mechanisms

1.3 Mechanisms That Lead to Missing Data

1.4 A Taxonomy of Missing Data Methods

Problems

Note

2 Missing Data in Experiments

2.1 Introduction

2.2 The Exact Least Squares Solution with Complete Data

2.3 The Correct Least Squares Analysis with Missing Data

2.4 Filling in Least Squares Estimates

2.5 Bartlett's ANCOVA Method

2.6 Least Squares Estimates of Missing Values by ANCOVA Using Only Complete-Data Methods

2.7 Correct Least Squares Estimates of Standard Errors and One Degree of Freedom Sums of Squares

2.8 Correct Least-Squares Sums of Squares with More Than One Degree of Freedom

Problems

3 Complete-Case and Available-Case Analysis, Including Weighting Methods

3.1 Introduction

3.2 Complete-Case Analysis

3.3 Weighted Complete-Case Analysis

3.4 Available-Case Analysis

Problems

4 Single Imputation Methods

4.1 Introduction

4.2 Imputing Means from a Predictive Distribution

4.3 Imputing Draws from a Predictive Distribution

4.4 Conclusion

Problems

5 Accounting for Uncertainty from Missing Data

5.1 Introduction

5.2 Imputation Methods that Provide Valid Standard Errors from a Single Filled-in Data Set

5.3 Standard Errors for Imputed Data by Resampling

5.4 Introduction to Multiple Imputation

5.5 Comparison of Resampling Methods and Multiple Imputation

Problems

Part II Likelihood-Based Approaches to the Analysis of Data with Missing Values

6 Theory of Inference Based on the Likelihood Function

6.1 Review of Likelihood-Based Estimation for Complete Data

6.2 Likelihood-Based Inference with Incomplete Data

6.3 A Generally Flawed Alternative to Maximum Likelihood: Maximizing over the Parameters and the Missing Data

6.4 Likelihood Theory for Coarsened Data

Problems

Notes

7 Factored Likelihood Methods When the Missingness Mechanism Is Ignorable

7.1 Introduction

7.2 Bivariate Normal Data with One Variable Subject to Missingness: ML Estimation

7.3 Bivariate Normal Monotone Data: Small-Sample Inference

7.4 Monotone Missingness with More Than Two Variables

7.5 Factored Likelihoods for Special Nonmonotone Patterns

Problems

8 Maximum Likelihood for General Patterns of Missing Data: Introduction and Theory with Ignorable Nonresponse

8.1 Alternative Computational Strategies

8.2 Introduction to the EM Algorithm

8.3 The E Step and The M Step of EM

8.4 Theory of the EM Algorithm

8.5 Extensions of EM

8.6 Hybrid Maximization Methods

Problems

9 Large-Sample Inference Based on Maximum Likelihood Estimates

9.1 Standard Errors Based on The Information Matrix

9.2 Standard Errors via Other Methods

Problems

10 Bayes and Multiple Imputation

10.1 Bayesian Iterative Simulation Methods

10.2 Multiple Imputation

Problems

Notes

Part III Likelihood-Based Approaches to the Analysis of Incomplete Data: Some Examples

11 Multivariate Normal Examples, Ignoring the Missingness Mechanism

11.1 Introduction

11.2 Inference for a Mean Vector and Covariance Matrix with Missing Data Under Normality

11.3 The Normal Model with a Restricted Covariance Matrix

11.4 Multiple Linear Regression

11.5 A General Repeated-Measures Model with Missing Data

11.6 Time Series Models

11.7 Measurement Error Formulated as Missing Data

Problems

12 Models for Robust Estimation

12.1 Introduction

12.2 Reducing the Influence of Outliers by Replacing the Normal Distribution by a Longer-Tailed Distribution

12.3 Penalized Spline of Propensity Prediction

Problems

Notes

13 Models for Partially Classified Contingency Tables, Ignoring the Missingness Mechanism

13.1 Introduction

13.2 Factored Likelihoods for Monotone Multinomial Data

13.3 ML and Bayes Estimation for Multinomial Samples with General Patterns of Missingness

13.4 Loglinear Models for Partially Classified Contingency Tables

Problems

14 Mixed Normal and Nonnormal Data with Missing Values, Ignoring the Missingness Mechanism

14.1 Introduction

14.2 The General Location Model

14.3 The General Location Model with Parameter Constraints

14.4 Regression Problems Involving Mixtures of Continuous and Categorical Variables

14.5 Further Extensions of the General Location Model

Problems

15 Missing Not at Random Models

15.1 Introduction

15.2 Models with Known MNAR Missingness Mechanisms: Grouped and Rounded Data

15.3 Normal Models for MNAR Missing Data

15.4 Other Models and Methods for MNAR Missing Data

Problems

References

Author Index

Subject Index

End User License Agreement

List of Tables

Chapter 1

Table 1.1

Table 1.2

Table 1.3

Chapter 2

Table 2.1

Table 2.2

Chapter 3

Table 3.1

Chapter 4

Table 4.1

Chapter 7

Table 7.1

Table 7.2

Table 7.3

Table 7.4

Chapter 8

Table 8.1

Chapter 9

Table 9.1

Chapter 10

Table 10.1

Chapter 11

Table 11.1

Table 11.2

Table 11.3

Table 11.4

Table 11.5

Table 11.6

Chapter 12

Table 12.1

Table 12.2

Chapter 13

Table 13.1

Table 13.2

Table 13.3

Table 13.4

Table 13.5

Table 13.6

Table 13.7

Table 13.8

Table 13.9

Table 13.10

Table 13.11

Chapter 14

Table 14.1

Table 14.2

Table 14.3

Chapter 15

Table 15.1

Table 15.2

Table 15.3

Table 15.4

Table 15.5

Table 15.6

Table 15.7

Table 15.8

Table 15.9

Table 15.10

Table 15.11

Table 15.12

Table 15.13

Table 15.14


Preface to the Third Edition

There has been tremendous growth in the literature on statistical methods for handling missing data, and associated software, since the publication of the second edition of “Statistical Analysis with Missing Data” in 2002. Attempting to cover this literature comprehensively would add excessively to the length of the book and also change its character. Therefore, our additions have focused mainly on work with which we have been associated and about which we can write with some authority. The main changes from the second edition are as follows:

  • Concerning theory, we have changed the “obs” and “mis” notation for observed and missing data, which, though intuitive, caused some confusion because subscripting data by “obs” was not intended to imply conditioning on the pattern of observed values. We now use subscript (0) to denote observed values and subscript (1) to denote missing values, which is in fact similar to the notation employed in Rubin's original (1976a) paper. We have also been more specific about assumptions for ignoring the missing data mechanism for likelihood-based/Bayesian analyses and asymptotic frequentist analysis; the latter involves changing missing data patterns in repeated analysis. These changes reflect material in Mealli and Rubin (2015). A definition of “partially missing at random” and ignorability for parameter subsets has been added, based on Little et al. (2016a).
  • Data previously termed “not missing at random” are now called “missing not at random,” which we think is clearer.
  • Applications place greater emphasis on multiple imputation rather than direct computation of the posterior distribution of parameters. This new emphasis reflects the expansion of flexible software for multiple imputation, which makes the method attractive to applied statisticians.
  • We have added a number of missing data applications to measurement error, disclosure limitation, robust inference, and clinical trial data.
  • Chapter 15, on missing not at random data, has been completely revamped, including a number of new applications to subsample regression and sensitivity analysis.
  • A number of minor errors in the previous edition have been corrected, although (as in all books) some probably remain and new ones may have crept in, for which we apologize.

The ideal of using a consistent notation across all chapters, avoiding the use of the same symbol to mean different concepts, proved too hard given the range of topics covered. However, we have tried to maintain a consistent notation within chapters, and have defined new uses of common letters as they arise. We hope that the different uses of the same symbol across chapters are not too confusing, and we welcome suggestions for improvements.

Part I Overview and Basic Approaches