Throughout the social, medical and other sciences the importance of understanding complex hierarchical data structures is well understood. Multilevel modelling is now the accepted statistical technique for handling such data and is widely available in computer software packages. A thorough understanding of these techniques is therefore important for all those working in these areas. This new edition of Multilevel Statistical Models brings these techniques together, starting from basic ideas and illustrating how more complex models are derived. Bayesian methodology using MCMC has been extended along with new material on smoothing models, multivariate responses, missing data, latent normal transformations for discrete responses, structural equation modeling and survival models.
Key Features:
* Provides a clear introduction and a comprehensive account of multilevel models.
* New methodological developments and applications are explored.
* Written by a leading expert in the field of multilevel methodology.
* Illustrated throughout with real-life examples, explaining theoretical concepts.
This book is suitable as a comprehensive text for postgraduate courses, as well as a general reference guide. Applied statisticians in the social sciences, economics, biological and medical disciplines will find this book beneficial.
Page count: 593
Year of publication: 2011
Contents
Preface
Acknowledgements
Notation
Glossary
Chapter 1: An introduction to multilevel models
1.1 Hierarchically structured data
1.2 School effectiveness
1.3 Sample survey methods
1.4 Repeated measures data
1.5 Event history and survival models
1.6 Discrete response data
1.7 Multivariate models
1.8 Nonlinear models
1.9 Measurement errors
1.10 Cross classifications and multiple membership structures
1.11 Factor analysis and structural equation models
1.12 Levels of aggregation and ecological fallacies
1.13 Causality
1.14 The latent normal transformation and missing data
1.15 Other texts
1.16 A caveat
Chapter 2: The 2-level model
2.1 Introduction
2.2 The 2-level model
2.3 Parameter estimation
2.4 Maximum likelihood estimation using iterative generalised least squares (IGLS)
2.5 Marginal models and generalised estimating equations (GEE)
2.6 Residuals
2.7 The adequacy of ordinary least squares estimates
2.8 A 2-level example using longitudinal educational achievement data
2.9 General model diagnostics
2.10 Higher level explanatory variables and compositional effects
2.11 Transforming to normality
2.12 Hypothesis testing and confidence intervals
2.13 Bayesian estimation using Markov Chain Monte Carlo (MCMC)
2.14 Data augmentation
Appendix 2.1 The general structure and maximum likelihood estimation for a multilevel model
Appendix 2.2 Multilevel residuals estimation
Appendix 2.3 Estimation using profile and extended likelihood
Appendix 2.4 The EM algorithm
Appendix 2.5 MCMC sampling
Chapter 3: 3-level models and more complex hierarchical structures
3.1 Complex variance structures
3.2 A 3-level complex variation model example
3.3 Parameter constraints
3.4 Weighting units
3.5 Robust (sandwich) estimators and jackknifing
3.6 The bootstrap
3.7 Aggregate level analyses
3.8 Meta analysis
3.9 Design issues
Chapter 4: Multilevel models for discrete response data
4.1 Generalised linear models
4.2 Proportions as responses
4.3 Examples
4.4 Models for multiple response categories
4.5 Models for counts
4.6 Ordered responses
4.7 Mixed discrete-continuous response models
4.8 A latent normal model for binary responses
4.9 Partitioning variation in discrete response models
Appendix 4.1 Multilevel generalised linear model estimation
Appendix 4.2 Maximum likelihood estimation for multilevel generalised linear models
Appendix 4.3 MCMC estimation for generalised linear models
Appendix 4.4 Bootstrap estimation for multilevel generalised linear models
Chapter 5: Models for repeated measures data
5.1 Repeated measures data
5.2 A 2-level repeated measures model
5.3 A polynomial model example for adolescent growth and the prediction of adult height
5.4 Modelling an autocorrelation structure at level 1
5.5 A growth model with autocorrelated residuals
5.6 Multivariate repeated measures models
5.7 Scaling across time
5.8 Cross-over designs
5.9 Missing data
5.10 Longitudinal discrete response data
Chapter 6: Multivariate multilevel data
6.1 Introduction
6.2 The basic 2-level multivariate model
6.3 Rotation designs
6.4 A rotation design example using Science Survey test scores
6.5 Informative response selection: subject choice in examinations
6.6 Multivariate structures at higher levels and future predictions
6.7 Multivariate responses at several levels
6.8 Principal components analysis
6.9 Multiple discriminant analysis
Appendix 6.1 MCMC algorithm for a multivariate normal response model with constraints
Chapter 7: Latent normal models for multivariate data
7.1 The normal multilevel multivariate model
7.2 Sampling binary responses
7.3 Sampling ordered categorical responses
7.4 Sampling unordered categorical responses
7.5 Sampling count data
7.6 Sampling continuous non-normal data
7.7 Sampling the level 1 and level 2 covariance matrices
7.8 Model fit
7.9 Partially ordered data
7.10 Hybrid normal/ordered variables
7.11 Discussion
Chapter 8: Multilevel factor analysis, structural equation and mixture models
8.1 A 2-stage 2-level factor model
8.2 A general multilevel factor model
8.3 MCMC estimation for the factor model
8.4 Structural equation models
8.5 Discrete response multilevel structural equation models
8.6 More complex hierarchical latent variable models
8.7 Multilevel mixture models
Chapter 9: Nonlinear multilevel models
9.1 Introduction
9.2 Nonlinear functions of linear components
9.3 Estimating population means
9.4 Nonlinear functions for variances and covariances
9.5 Examples of nonlinear growth and nonlinear level 1 variance
Appendix 9.1 Nonlinear model estimation
Chapter 10: Multilevel modelling in sample surveys
10.1 Sample survey structures
10.2 Population structures
10.3 Small area estimation
Chapter 11: Multilevel event history and survival models
11.1 Introduction
11.2 Censoring
11.3 Hazard and survival functions
11.4 Parametric proportional hazard models
11.5 The semiparametric Cox model
11.6 Tied observations
11.7 Repeated events proportional hazard models
11.8 Example using birth interval data
11.9 Log duration models
11.10 Examples with birth interval data and children’s activity episodes
11.11 The grouped discrete time hazards model
11.12 Discrete time latent normal event history models
Chapter 12: Cross-classified data structures
12.1 Random cross classifications
12.2 A basic cross-classified model
12.3 Examination results for a cross classification of schools
12.4 Interactions in cross classifications
12.5 Cross classifications with one unit per cell
12.6 Multivariate cross-classified models
12.7 A general notation for cross classification
12.8 MCMC estimation in cross-classified models
Appendix 12.1 IGLS estimation for cross-classified data
Chapter 13: Multiple membership models
13.1 Multiple membership structures
13.2 Notation and classifications for multiple membership structures
13.3 An example of salmonella infection
13.4 A repeated measures multiple membership model
13.5 Individuals as higher level units
13.6 Spatial models
13.7 Missing identification models
Appendix 13.1 MCMC estimation for multiple membership models
Chapter 14: Measurement errors in multilevel models
14.1 A basic measurement error model
14.2 Moment-based estimators
14.3 A 2-level example with measurement error at both levels
14.4 Multivariate responses
14.5 Nonlinear models
14.6 Measurement errors for discrete explanatory variables
14.7 MCMC estimation for measurement error models
Appendix 14.1 Measurement error estimation
Chapter 15: Smoothing models for multilevel data
15.1 Introduction
15.2 Smoothing estimators
15.3 Smoothing splines
15.4 Semiparametric smoothing models
15.5 Multilevel smoothing models
15.6 General multilevel semiparametric smoothing models
15.7 Generalised linear models
15.8 An example
15.9 Conclusions
Chapter 16: Missing data, partially observed data and multiple imputation
16.1 Introduction
16.2 Creating a completed dataset
16.3 Joint modelling for missing data
16.4 A 2-level model with responses of different types at both levels
16.5 Multiple imputation
16.6 A simulation example of multiple imputation for missing data
16.7 Longitudinal data with attrition
16.8 Partially known data values
16.9 Conclusions
Chapter 17: Multilevel models with correlated random effects
17.1 Introduction
17.2 Non-independence of level 2 residuals
17.3 MCMC estimation for non-independent level 2 residuals
17.4 Adaptive proposal distributions in MCMC estimation
17.5 MCMC estimation for non-independent level 1 residuals
17.6 Modelling the level 1 variance as a function of explanatory variables with random effects
17.7 Discrete responses with correlated random effects
17.8 Calculating the DIC statistic
17.9 A growth dataset
17.10 Conclusions
Chapter 18: Software for multilevel modelling
18.1 Software packages
References
Author index
Subject index
WILEY SERIES IN PROBABILITY AND STATISTICS
Established by WALTER A. SHEWHART and SAMUEL S. WILKS
Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Harvey Goldstein, Iain M. Johnstone, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay, Sanford Weisberg
Editors Emeriti: Vic Barnett, Ralph A. Bradley, J. Stuart Hunter, J.B. Kadane, David G. Kendall, Jozef L. Teugels
This edition first published 2011
© 2011 John Wiley & Sons, Ltd
Registered office: John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.
The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloguing-in-Publication Data
Goldstein, Harvey.
Multilevel statistical models / Harvey Goldstein. – 4th ed.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-470-74865-7 (cloth)
1. Social sciences–Mathematical models. 2. Social sciences–Research–Methodology. 3. Educational tests and measurements–Mathematical models. I. Title.
H61.25.G65 2010
519.5–dc22
2010023377
A catalogue record for this book is available from the British Library.
Print ISBN: 978-0-470-74865-7
ePDF ISBN: 978-0-470-97340-0
oBook ISBN: 978-0-470-97339-4
This book is dedicated to Jon Rasbash who died in March 2010. Without his support, enthusiasm and insight, many of the things discussed in this book would not have happened.
Harvey Goldstein
June 2010
Preface
In the mid-1980s, a number of researchers began to see how to introduce systematic approaches to the statistical modelling and analysis of hierarchically structured data. The early work of Aitkin et al. (1981) on the teaching styles’ data and Aitkin’s subsequent work with Longford (1987) initiated a series of developments that by the early 1990s had resulted in a core set of established techniques, experience and software packages that could be applied routinely. These methods and further extensions of them are described in this book; they are now applied widely in areas such as education, epidemiology, geography, child growth and household surveys.
In addition to the first, second and third editions of the present text (Goldstein, 1987b, Goldstein, 1995, Goldstein, 2003), several expository volumes have now appeared (see Section 1.15). The present text aims to integrate existing methodological developments within a consistent terminology and notation, provide examples and explain a number of new developments, especially in the areas of latent normal models, missing data, multiple membership structures, errors of measurement and survival data. In almost all cases, these developments are the subject of continuing research.
The main text seeks to avoid undue statistical complexity, with derivations occurring in appendices. Examples and diagrams are used where possible to illustrate the application of the techniques and references are given to other works. The book is intended to be suitable for graduate level courses and as a general reference.
Harvey Goldstein
June 2010
Acknowledgements
This book would not have been possible without the support and dedication of all those who have worked on, or been closely associated with, the Centre for Multilevel Modelling now at the University of Bristol. Those who contributed substantially to the third edition include Jon Rasbash, Min Yang, William Browne, Fiona Steele, Ian Plewis, Michael Healy and Toby Lewis. For the fourth edition, I have continued to benefit considerably from the ideas and advice of Jon Rasbash, William Browne and Fiona Steele. In addition, I have had useful input from James Carpenter, Chris Charlton, Paul Clarke, Bianca DeStavola, Tony Fielding, Kelvyn Jones, Daphne Kounali, George Leckie, Rebecca Pillinger, Tony Robinson and Chris Skinner. All of these people have invested time and effort in correcting mistakes and making many useful suggestions. Many friends and other colleagues, too numerous to mention, have also pointed out errors, suggested improvements and generally provided encouragement. Sincere thanks are due to the Economic and Social Research Council for their almost continuous provision of project funding for methodological developments since 1986. Needless to say, any remaining obscurities or mistakes are entirely my responsibility.
Harvey Goldstein
June 2010
Glossary
Cluster: A grouping containing ‘lower level’ elements. For example in a sample survey the set of households in a neighbourhood.
Cross classification: A structure where lower level units are grouped within the cells of a multiway classification of higher level units.
Design matrix: In the fixed part of the model, the matrix of values of the explanatory variables X. In the random part the matrix of explanatory variables Z.
Explanatory variable: Also known as an ‘independent’ variable. In the fixed part of the model usually denoted by x and in the random part by z.
Fixed part: That part of a model represented by Xβ, that is the average relationship. The parameters, β, are referred to as ‘fixed parameters’.
Level: A component of a data hierarchy. Level 1 is the lowest level, for example students within schools or repeated measurement occasions within individual subjects.
Level n variation: The variation among level n unit measurements.
Multiple membership: A structure where a level unit may be nested within one or more higher level units.
Nesting: The clustering of units into a hierarchy.
Random part: That part of a model represented by Zu, that is the contribution of the random variables u, at each level. The parameters associated with the random variables, i.e. variances and covariances, are referred to as ‘random parameters’.
Response variable: Also known as a ‘dependent’ variable. Denoted by y.
Unit: An entity defined at a level of a data hierarchy. For example an individual student will be a level 1 unit within a level 2 unit such as a school.
Chapter 1: An introduction to multilevel models
1.1 Hierarchically structured data
Many kinds of data, including observational data collected in the human and biological sciences, have a hierarchical, nested, or clustered structure. For example, animal and human studies of inheritance deal with a natural hierarchy where offspring are grouped within families. Offspring from the same parents tend to be more alike in their physical and mental characteristics than individuals chosen at random from the population at large. For instance, children from the same family may all tend to be small, perhaps because their parents are small or because of a common impoverished environment. Many designed experiments, such as clinical trials carried out in several randomly chosen centres or groups of individuals, also create data hierarchies.
For now, we are concerned only with the fact of such hierarchies, not their provenance. The principal applications are those from the social and medical sciences, but the techniques are, of course, applicable more generally. In subsequent chapters, as we develop the theory and techniques with examples, we see how a proper recognition of these natural hierarchies allows us to obtain more satisfactory answers to important questions.
We refer to a hierarchy as consisting of units grouped at different levels. Thus offspring may be the level 1 units in a 2-level structure where the level 2 units are the families: students may be the level 1 units clustered or nested within schools that are the level 2 units.
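As a minimal sketch of this terminology, the following Python fragment stores students (level 1 units) together with the school (level 2 unit) they belong to, groups them by school, and checks that the structure is a strict hierarchy, that is, each student sits in exactly one school. The records, identifiers and function names are hypothetical and serve only as illustration; they are not taken from any dataset discussed in the text.

```python
from collections import defaultdict

# Hypothetical records: (student_id, school_id, score). All values are made up.
records = [
    ("s1", "A", 52.0),
    ("s2", "A", 61.0),
    ("s3", "B", 47.0),
    ("s4", "B", 55.0),
    ("s5", "B", 50.0),
]


def group_by_level2(rows):
    """Group level 1 units (students) under their level 2 unit (school)."""
    groups = defaultdict(list)
    for student_id, school_id, score in rows:
        groups[school_id].append((student_id, score))
    return dict(groups)


def is_strict_hierarchy(rows):
    """Return True if every level 1 unit belongs to exactly one level 2 unit."""
    membership = defaultdict(set)
    for student_id, school_id, _score in rows:
        membership[student_id].add(school_id)
    return all(len(schools) == 1 for schools in membership.values())


schools = group_by_level2(records)
```

Here `schools["B"]` holds the three students nested within school B, and `is_strict_hierarchy(records)` would return `False` if any student appeared under more than one school.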
The existence of such data hierarchies is neither accidental nor ignorable. Individual people differ, as do individual animals, and this differentiation is mirrored in all kinds of social activity where the latter is often a direct result of the former; for example, when students with similar motivations or aptitudes are grouped in highly selective schools or colleges. In other cases, the groupings may arise for reasons less strongly associated with the characteristics of individuals, such as the allocation of young children to elementary schools, or the allocation of patients to different clinics. Once groupings are established, even if their establishment is effectively random, often they will tend to become differentiated. This differentiation implies that the group and its members both influence and are influenced by the group membership. To ignore this risks overlooking the importance of group effects, and may also render invalid many of the traditional statistical analysis techniques used for studying data relationships.
We look at this issue of statistical validity in the next chapter. For now, one simple example will show its importance. A well-known and influential study of the teaching styles used with primary (elementary) school children, carried out in the 1970s (Bennett, 1976), claimed that children exposed to so-called ‘formal’ styles of teaching reading exhibited more progress than those who were not. The data were analysed using traditional multiple regression techniques which recognised only the individual children as the units of analysis and ignored their groupings within teachers and into classes. The results showed statistically significant differences. Subsequently, Aitkin et al. (1981) demonstrated that when the analysis accounted properly for the grouping of children into classes, the significant differences disappeared and the ‘formally’ taught children could not be shown to differ from the others.
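The mechanism behind this reversal can be illustrated with a small simulation. The sketch below is not a reanalysis of the teaching styles data and does not fit a multilevel model. It assumes, purely for illustration, 10 classes of 25 children in each of two teaching styles, no true difference between styles, and a class effect shared by all children in a class that accounts for 20% of the total variance. It then compares a t test that treats every child as independent with a t test on class means, the latter being a simple stand-in for an analysis that respects the grouping. All parameter values, function names and the hard-coded critical values are assumptions of this sketch.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def sample_var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * sample_var(a) + (nb - 1) * sample_var(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5


def simulate(n_reps=1000, seed=1):
    """Rejection rates at the 5% level when there is no true style effect."""
    classes_per_style, class_size = 10, 25
    sd_class, sd_child = 0.2 ** 0.5, 0.8 ** 0.5  # assumed variance split 0.2 / 0.8
    # Two-sided 5% critical values of Student's t for this design only:
    # df = 498 for the child-level test, df = 18 for the class-mean test.
    crit_child, crit_class = 1.965, 2.101
    rng = random.Random(seed)
    naive_hits = class_hits = 0
    for _ in range(n_reps):
        styles = []
        for _style in range(2):
            classes = []
            for _c in range(classes_per_style):
                u = rng.gauss(0.0, sd_class)  # effect shared by the whole class
                classes.append([u + rng.gauss(0.0, sd_child) for _ in range(class_size)])
            styles.append(classes)
        children = [[y for cls in style for y in cls] for style in styles]
        class_means = [[mean(cls) for cls in style] for style in styles]
        if abs(pooled_t(children[0], children[1])) > crit_child:
            naive_hits += 1
        if abs(pooled_t(class_means[0], class_means[1])) > crit_class:
            class_hits += 1
    return naive_hits / n_reps, class_hits / n_reps


naive_rate, class_rate = simulate()
```

Under these assumptions the child-level test should declare a 'significant' difference far more often than the nominal 5%, whereas the class-mean test should stay close to it, because children in the same class share a class effect and therefore carry less independent information than their number suggests.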
