An up-to-date, comprehensive treatment of a classic text on missing data in statistics
The topic of missing data has gained considerable attention in recent decades. This new edition by two acknowledged experts on the subject offers an up-to-date account of practical methodology for handling missing data problems. Blending theory and application, authors Roderick Little and Donald Rubin review historical approaches to the subject and describe simple methods for multivariate analysis with missing values. They then provide a coherent theory for analysis of problems based on likelihoods derived from statistical models for the data and the missing data mechanism, and then they apply the theory to a wide range of important missing data problems.
Statistical Analysis with Missing Data, Third Edition starts by introducing readers to the subject and approaches toward solving it. It looks at the patterns and mechanisms that create the missing data, as well as a taxonomy of missing data. It then goes on to examine missing data in experiments, before discussing complete-case and available-case analysis, including weighting methods. The new edition expands its coverage to include recent work on topics such as nonresponse in sample surveys, causal inference, diagnostic methods, and sensitivity analysis, among a host of other topics.
Page count: 810
Year of publication: 2019
WILEY SERIES IN PROBABILITY AND STATISTICS
Established by Walter A. Shewhart and Samuel S. Wilks
Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Geof H. Givens, Harvey Goldstein, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay
Editors Emeriti: J. Stuart Hunter, Iain M. Johnstone, Joseph B. Kadane, Jozef L. Teugels
The Wiley Series in Probability and Statistics is well established and authoritative. It covers many topics of current research interest in both pure and applied statistics and probability theory. Written by leading statisticians and institutions, the titles span both state-of-the-art developments in the field and classical methods.
Reflecting the wide range of current research in statistics, the series encompasses applied, methodological and theoretical statistics, ranging from applications and new techniques made possible by advances in computerized practice to rigorous treatment of theoretical approaches.
This series provides essential and invaluable reading for all statisticians, whether in academia, industry, government, or research.
A complete list of titles in this series can be found at http://www.wiley.com/go/wsps
Roderick J. A. Little
Richard D. Remington Distinguished University Professor of Biostatistics, Professor of Statistics, and Research Professor, Institute for Social Research, at the University of Michigan
Donald B. Rubin
Professor at Yau Mathematical Sciences Center, Tsinghua University; Murray Shusterman Senior Research Fellow, Fox School of Business, at Temple University; and Professor Emeritus, at Harvard University
This edition first published 2020
© 2020 John Wiley & Sons, Inc
Edition History
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Roderick J A Little and Donald B Rubin to be identified as the authors of the material in this work has been asserted in accordance with law.
Registered Office(s)
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
Editorial Office
111 River Street, Hoboken, NJ 07030, USA
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of Warranty
In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging-in-Publication Data
Names: Little, Roderick J.A., author. | Rubin, Donald B., author.
Title: Statistical analysis with missing data / Roderick J.A. Little, Donald B. Rubin.
Description: Third edition | Hoboken, NJ : Wiley, 2020. | Series: Wiley series in probability and statistics | Includes index. |
Identifiers: LCCN 2018058860 (print) | LCCN 2018061330 (ebook) | ISBN 9781118596012 (Adobe PDF) | ISBN 9781118595695 (ePub) | ISBN 9780470526798 (hardcover)
Subjects: LCSH: Mathematical statistics. | Mathematical statistics--Problems, exercises, etc. | Missing observations (Statistics) | Missing observations (Statistics)--Problems, exercises, etc.
Classification: LCC QA276 (ebook) | LCC QA276 .L57 2019 (print) | DDC 519.5--dc23
LC record available at https://lccn.loc.gov/2018058860
Cover image: Wiley
Cover design by Wiley
Cover
Preface to the Third Edition
Part I Overview and Basic Approaches
1 Introduction
1.1 The Problem of Missing Data
1.2 Missingness Patterns and Mechanisms
1.3 Mechanisms That Lead to Missing Data
1.4 A Taxonomy of Missing Data Methods
Problems
Note
2 Missing Data in Experiments
2.1 Introduction
2.2 The Exact Least Squares Solution with Complete Data
2.3 The Correct Least Squares Analysis with Missing Data
2.4 Filling in Least Squares Estimates
2.5 Bartlett's ANCOVA Method
2.6 Least Squares Estimates of Missing Values by ANCOVA Using Only Complete-Data Methods
2.7 Correct Least Squares Estimates of Standard Errors and One Degree of Freedom Sums of Squares
2.8 Correct Least-Squares Sums of Squares with More Than One Degree of Freedom
Problems
3 Complete-Case and Available-Case Analysis, Including Weighting Methods
3.1 Introduction
3.2 Complete-Case Analysis
3.3 Weighted Complete-Case Analysis
3.4 Available-Case Analysis
Problems
4 Single Imputation Methods
4.1 Introduction
4.2 Imputing Means from a Predictive Distribution
4.3 Imputing Draws from a Predictive Distribution
4.4 Conclusion
Problems
5 Accounting for Uncertainty from Missing Data
5.1 Introduction
5.2 Imputation Methods that Provide Valid Standard Errors from a Single Filled-in Data Set
5.3 Standard Errors for Imputed Data by Resampling
5.4 Introduction to Multiple Imputation
5.5 Comparison of Resampling Methods and Multiple Imputation
Problems
Part II Likelihood-Based Approaches to the Analysis of Data with Missing Values
6 Theory of Inference Based on the Likelihood Function
6.1 Review of Likelihood-Based Estimation for Complete Data
6.2 Likelihood-Based Inference with Incomplete Data
6.3 A Generally Flawed Alternative to Maximum Likelihood: Maximizing over the Parameters and the Missing Data
6.4 Likelihood Theory for Coarsened Data
Problems
Notes
7 Factored Likelihood Methods When the Missingness Mechanism Is Ignorable
7.1 Introduction
7.2 Bivariate Normal Data with One Variable Subject to Missingness: ML Estimation
7.3 Bivariate Normal Monotone Data: Small-Sample Inference
7.4 Monotone Missingness with More Than Two Variables
7.5 Factored Likelihoods for Special Nonmonotone Patterns
Problems
8 Maximum Likelihood for General Patterns of Missing Data: Introduction and Theory with Ignorable Nonresponse
8.1 Alternative Computational Strategies
8.2 Introduction to the EM Algorithm
8.3 The E Step and The M Step of EM
8.4 Theory of the EM Algorithm
8.5 Extensions of EM
8.6 Hybrid Maximization Methods
Problems
9 Large-Sample Inference Based on Maximum Likelihood Estimates
9.1 Standard Errors Based on The Information Matrix
9.2 Standard Errors via Other Methods
Problems
10 Bayes and Multiple Imputation
10.1 Bayesian Iterative Simulation Methods
10.2 Multiple Imputation
Problems
Notes
Part III Likelihood-Based Approaches to the Analysis of Incomplete Data: Some Examples
11 Multivariate Normal Examples, Ignoring the Missingness Mechanism
11.1 Introduction
11.2 Inference for a Mean Vector and Covariance Matrix with Missing Data Under Normality
11.3 The Normal Model with a Restricted Covariance Matrix
11.4 Multiple Linear Regression
11.5 A General Repeated-Measures Model with Missing Data
11.6 Time Series Models
11.7 Measurement Error Formulated as Missing Data
Problems
12 Models for Robust Estimation
12.1 Introduction
12.2 Reducing the Influence of Outliers by Replacing the Normal Distribution by a Longer-Tailed Distribution
12.3 Penalized Spline of Propensity Prediction
Problems
Notes
13 Models for Partially Classified Contingency Tables, Ignoring the Missingness Mechanism
13.1 Introduction
13.2 Factored Likelihoods for Monotone Multinomial Data
13.3 ML and Bayes Estimation for Multinomial Samples with General Patterns of Missingness
13.4 Loglinear Models for Partially Classified Contingency Tables
Problems
14 Mixed Normal and Nonnormal Data with Missing Values, Ignoring the Missingness Mechanism
14.1 Introduction
14.2 The General Location Model
14.3 The General Location Model with Parameter Constraints
14.4 Regression Problems Involving Mixtures of Continuous and Categorical Variables
14.5 Further Extensions of the General Location Model
Problems
15 Missing Not at Random Models
15.1 Introduction
15.2 Models with Known MNAR Missingness Mechanisms: Grouped and Rounded Data
15.3 Normal Models for MNAR Missing Data
15.4 Other Models and Methods for MNAR Missing Data
Problems
References
Author Index
Subject Index
End User License Agreement
There has been tremendous growth in the literature on statistical methods for handling missing data, and associated software, since the publication of the second edition of “Statistical Analysis with Missing Data” in 2002. Attempting to cover this literature comprehensively would add excessively to the length of the book and also change its character. Therefore, our additions have focused mainly on work with which we have been associated and about which we can write with some authority. The main changes from the second edition are as follows:
Concerning theory, we have changed the “obs” and “mis” notation for observed and missing data, which, though intuitive, caused some confusion because subscripting data by “obs” was not intended to imply conditioning on the pattern of observed values. We now use subscript (0) to denote observed values and subscript (1) to denote missing values, which is in fact similar to the notation employed by Rubin's original (1976a) paper. We have also been more specific about assumptions for ignoring the missing data mechanism for likelihood-based/Bayesian analyses and asymptotic frequentist analysis; the latter involves changing missing data patterns in repeated analysis. These changes reflect material in Mealli and Rubin (2015). A definition of “partially missing at random” and ignorability for parameter subsets has been added, based on Little et al. (2016a).
Data previously termed “not missing at random” are now called “missing not at random,” which we think is clearer.
Applications place greater emphasis on multiple imputation rather than direct computation of the posterior distribution of parameters. This new emphasis reflects the expansion of flexible software for multiple imputation, which makes the method attractive to applied statisticians.
We have added a number of missing data applications to measurement error, disclosure limitation, robust inference, and clinical trial data.
Chapter 15, on missing not at random data, has been completely revamped, including a number of new applications to subsample regression and sensitivity analysis.
A number of minor errors in the previous edition have been corrected, although (as in all books) some probably remain and new ones may have crept in, for which we apologize.
The ideal of using a consistent notation across all chapters, avoiding the use of the same symbol to mean different concepts, proved too hard given the range of topics covered. However, we have tried to maintain a consistent notation within chapters, and defined new uses of common letters as they arise. We hope different uses of the same symbol across chapters are not too confusing, and welcome suggestions for improvements.
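As a rough illustration of the change from the “obs”/“mis” subscripts to the (0)/(1) subscripts described in the list of changes, the following sketch may help. The symbols Y (the data), M (the missingness indicators), and φ (the parameters of the missingness mechanism) are assumptions made for this illustration and are not defined in this preface. The second line shows one common way of writing a missing-at-random condition in this notation; it is a sketch, not necessarily the exact form used in the chapters.

```latex
% Old notation (second edition) versus new notation (third edition)
\[
  Y = (Y_{\mathrm{obs}},\, Y_{\mathrm{mis}})
  \quad\longrightarrow\quad
  Y = \bigl(Y_{(0)},\, Y_{(1)}\bigr)
\]
% Illustrative missing-at-random condition: the distribution of M does not
% depend on the missing values Y_{(1)} once the observed values are given
\[
  f\bigl(M \mid Y_{(0)}, Y_{(1)}, \phi\bigr) = f\bigl(M \mid Y_{(0)}, \phi\bigr)
  \quad \text{for all } Y_{(1)} .
\]
```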
