Multilevel Statistical Models - H. Goldstein - E-Book

H. Goldstein

Description

Throughout the social, medical and other sciences the importance of understanding complex hierarchical data structures is well understood. Multilevel modelling is now the accepted statistical technique for handling such data and is widely available in computer software packages. A thorough understanding of these techniques is therefore important for all those working in these areas. This new edition of Multilevel Statistical Models brings these techniques together, starting from basic ideas and illustrating how more complex models are derived. Bayesian methodology using MCMC has been extended along with new material on smoothing models, multivariate responses, missing data, latent normal transformations for discrete responses, structural equation modeling and survival models.

Key Features:

* Provides a clear introduction and a comprehensive account of multilevel models.
* New methodological developments and applications are explored.
* Written by a leading expert in the field of multilevel methodology.
* Illustrated throughout with real-life examples, explaining theoretical concepts.

This book is suitable as a comprehensive text for postgraduate courses, as well as a general reference guide. Applied statisticians in the social sciences, economics, biological and medical disciplines will find this book beneficial.

Page count: 593

Year of publication: 2011


Contents

Preface

Acknowledgements

Notation

Glossary

Chapter 1: An introduction to multilevel models

1.1 Hierarchically structured data

1.2 School effectiveness

1.3 Sample survey methods

1.4 Repeated measures data

1.5 Event history and survival models

1.6 Discrete response data

1.7 Multivariate models

1.8 Nonlinear models

1.9 Measurement errors

1.10 Cross classifications and multiple membership structures

1.11 Factor analysis and structural equation models

1.12 Levels of aggregation and ecological fallacies

1.13 Causality

1.14 The latent normal transformation and missing data

1.15 Other texts

1.16 A caveat

Chapter 2: The 2-level model

2.1 Introduction

2.2 The 2-level model

2.3 Parameter estimation

2.4 Maximum likelihood estimation using iterative generalised least squares (IGLS)

2.5 Marginal models and generalised estimating equations (GEE)

2.6 Residuals

2.7 The adequacy of ordinary least squares estimates

2.8 A 2-level example using longitudinal educational achievement data

2.9 General model diagnostics

2.10 Higher level explanatory variables and compositional effects

2.11 Transforming to normality

2.12 Hypothesis testing and confidence intervals

2.13 Bayesian estimation using Markov Chain Monte Carlo (MCMC)

2.14 Data augmentation

Appendix 2.1 The general structure and maximum likelihood estimation for a multilevel model

Appendix 2.2 Multilevel residuals estimation

Appendix 2.3 Estimation using profile and extended likelihood

Appendix 2.4 The EM algorithm

Appendix 2.5 MCMC sampling

Chapter 3: 3-level models and more complex hierarchical structures

3.1 Complex variance structures

3.2 A 3-level complex variation model example

3.3 Parameter constraints

3.4 Weighting units

3.5 Robust (sandwich) estimators and jackknifing

3.6 The bootstrap

3.7 Aggregate level analyses

3.8 Meta analysis

3.9 Design issues

Chapter 4: Multilevel models for discrete response data

4.1 Generalised linear models

4.2 Proportions as responses

4.3 Examples

4.4 Models for multiple response categories

4.5 Models for counts

4.6 Ordered responses

4.7 Mixed discrete-continuous response models

4.8 A latent normal model for binary responses

4.9 Partitioning variation in discrete response models

Appendix 4.1 Multilevel generalised linear model estimation

Appendix 4.2 Maximum likelihood estimation for multilevel generalised linear models

Appendix 4.3 MCMC estimation for generalised linear models

Appendix 4.4 Bootstrap estimation for multilevel generalised linear models

Chapter 5: Models for repeated measures data

5.1 Repeated measures data

5.2 A 2-level repeated measures model

5.3 A polynomial model example for adolescent growth and the prediction of adult height

5.4 Modelling an autocorrelation structure at level 1

5.5 A growth model with autocorrelated residuals

5.6 Multivariate repeated measures models

5.7 Scaling across time

5.8 Cross-over designs

5.9 Missing data

5.10 Longitudinal discrete response data

Chapter 6: Multivariate multilevel data

6.1 Introduction

6.2 The basic 2-level multivariate model

6.3 Rotation designs

6.4 A rotation design example using Science Survey test scores

6.5 Informative response selection: subject choice in examinations

6.6 Multivariate structures at higher levels and future predictions

6.7 Multivariate responses at several levels

6.8 Principal components analysis

6.9 Multiple discriminant analysis

Appendix 6.1 MCMC algorithm for a multivariate normal response model with constraints

Chapter 7: Latent normal models for multivariate data

7.1 The normal multilevel multivariate model

7.2 Sampling binary responses

7.3 Sampling ordered categorical responses

7.4 Sampling unordered categorical responses

7.5 Sampling count data

7.6 Sampling continuous non-normal data

7.7 Sampling the level 1 and level 2 covariance matrices

7.8 Model fit

7.9 Partially ordered data

7.10 Hybrid normal/ordered variables

7.11 Discussion

Chapter 8: Multilevel factor analysis, structural equation and mixture models

8.1 A 2-stage 2-level factor model

8.2 A general multilevel factor model

8.3 MCMC estimation for the factor model

8.4 Structural equation models

8.5 Discrete response multilevel structural equation models

8.6 More complex hierarchical latent variable models

8.7 Multilevel mixture models

Chapter 9: Nonlinear multilevel models

9.1 Introduction

9.2 Nonlinear functions of linear components

9.3 Estimating population means

9.4 Nonlinear functions for variances and covariances

9.5 Examples of nonlinear growth and nonlinear level 1 variance

Appendix 9.1 Nonlinear model estimation

Chapter 10: Multilevel modelling in sample surveys

10.1 Sample survey structures

10.2 Population structures

10.3 Small area estimation

Chapter 11: Multilevel event history and survival models

11.1 Introduction

11.2 Censoring

11.3 Hazard and survival functions

11.4 Parametric proportional hazard models

11.5 The semiparametric Cox model

11.6 Tied observations

11.7 Repeated events proportional hazard models

11.8 Example using birth interval data

11.9 Log duration models

11.10 Examples with birth interval data and children’s activity episodes

11.11 The grouped discrete time hazards model

11.12 Discrete time latent normal event history models

Chapter 12: Cross-classified data structures

12.1 Random cross classifications

12.2 A basic cross-classified model

12.3 Examination results for a cross classification of schools

12.4 Interactions in cross classifications

12.5 Cross classifications with one unit per cell

12.6 Multivariate cross-classified models

12.7 A general notation for cross classification

12.8 MCMC estimation in cross-classified models

Appendix 12.1 IGLS estimation for cross-classified data

Chapter 13: Multiple membership models

13.1 Multiple membership structures

13.2 Notation and classifications for multiple membership structures

13.3 An example of salmonella infection

13.4 A repeated measures multiple membership model

13.5 Individuals as higher level units

13.6 Spatial models

13.7 Missing identification models

Appendix 13.1 MCMC estimation for multiple membership models

Chapter 14: Measurement errors in multilevel models

14.1 A basic measurement error model

14.2 Moment-based estimators

14.3 A 2-level example with measurement error at both levels

14.4 Multivariate responses

14.5 Nonlinear models

14.6 Measurement errors for discrete explanatory variables

14.7 MCMC estimation for measurement error models

Appendix 14.1 Measurement error estimation

Chapter 15: Smoothing models for multilevel data

15.1 Introduction

15.2 Smoothing estimators

15.3 Smoothing splines

15.4 Semiparametric smoothing models

15.5 Multilevel smoothing models

15.6 General multilevel semiparametric smoothing models

15.7 Generalised linear models

15.8 An example

15.9 Conclusions

Chapter 16: Missing data, partially observed data and multiple imputation

16.1 Introduction

16.2 Creating a completed dataset

16.3 Joint modelling for missing data

16.4 A 2-level model with responses of different types at both levels

16.5 Multiple imputation

16.6 A simulation example of multiple imputation for missing data

16.7 Longitudinal data with attrition

16.8 Partially known data values

16.9 Conclusions

Chapter 17: Multilevel models with correlated random effects

17.1 Introduction

17.2 Non-independence of level 2 residuals

17.3 MCMC estimation for non-independent level 2 residuals

17.4 Adaptive proposal distributions in MCMC estimation

17.5 MCMC estimation for non-independent level 1 residuals

17.6 Modelling the level 1 variance as a function of explanatory variables with random effects

17.7 Discrete responses with correlated random effects

17.8 Calculating the DIC statistic

17.9 A growth dataset

17.10 Conclusions

Chapter 18: Software for multilevel modelling

18.1 Software packages

References

Author index

Subject index

WILEY SERIES IN PROBABILITY AND STATISTICS

Established by WALTER A. SHEWHART and SAMUEL S. WILKS

Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Harvey Goldstein, Iain M. Johnstone, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay, Sanford Weisberg

Editors Emeriti: Vic Barnett, Ralph A. Bradley, J. Stuart Hunter, J.B. Kadane, David G. Kendall, Jozef L. Teugels

This edition first published 2011
© 2011 John Wiley & Sons, Ltd

Registered office
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.

The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Library of Congress Cataloguing-in-Publication Data

Goldstein, Harvey.
Multilevel statistical models / Harvey Goldstein. – 4th ed.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-470-74865-7 (cloth)
1. Social sciences–Mathematical models. 2. Social sciences–Research–Methodology. 3. Educational tests and measurements–Mathematical models. I. Title.
H61.25.G65 2010
519.5–dc22
2010023377

A catalogue record for this book is available from the British Library.

Print ISBN: 978-0-470-74865-7
ePDF ISBN: 978-0-470-97340-0
oBook ISBN: 978-0-470-97339-4

This book is dedicated to Jon Rasbash who died in March 2010. Without his support, enthusiasm and insight, many of the things discussed in this book would not have happened.

Harvey Goldstein
June 2010

Preface

In the mid-1980s, a number of researchers began to see how to introduce systematic approaches to the statistical modelling and analysis of hierarchically structured data. The early work of Aitkin et al. (1981) on the teaching styles’ data and Aitkin’s subsequent work with Longford (1987) initiated a series of developments that by the early 1990s had resulted in a core set of established techniques, experience and software packages that could be applied routinely. These methods and further extensions of them are described in this book; they are now applied widely in areas such as education, epidemiology, geography, child growth and household surveys.

In addition to the first, second and third editions of the present text (Goldstein, 1987b, Goldstein, 1995, Goldstein, 2003), several expository volumes have now appeared (see Section 1.15). The present text aims to integrate existing methodological developments within a consistent terminology and notation, provide examples and explain a number of new developments, especially in the areas of latent normal models, missing data, multiple membership structures, errors of measurement and survival data. In almost all cases, these developments are the subject of continuing research.

The main text seeks to avoid undue statistical complexity, with derivations occurring in appendices. Examples and diagrams are used where possible to illustrate the application of the techniques and references are given to other works. The book is intended to be suitable for graduate level courses and as a general reference.

Harvey Goldstein
June 2010

Acknowledgements

This book would not have been possible without the support and dedication of all those who have worked on, or been closely associated with, the Centre for Multilevel Modelling now at the University of Bristol. Those who contributed substantially to the third edition include Jon Rasbash, Min Yang, William Browne, Fiona Steele, Ian Plewis, Michael Healy and Toby Lewis. For the fourth edition, I have continued to benefit considerably from the ideas and advice of Jon Rasbash, William Browne and Fiona Steele. In addition, I have had useful input from James Carpenter, Chris Charlton, Paul Clarke, Bianca DeStavola, Tony Fielding, Kelvyn Jones, Daphne Kounali, George Leckie, Rebecca Pillinger, Tony Robinson and Chris Skinner. All of these people have invested time and effort in correcting mistakes and making many useful suggestions. Many friends and other colleagues, too numerous to mention, have also pointed out errors, suggested improvements and generally provided encouragement. Sincere thanks are due to the Economic and Social Research Council for their almost continuous provision of project funding for methodological developments since 1986. Needless to say, any remaining obscurities or mistakes are entirely my responsibility.

Harvey Goldstein
June 2010

Glossary

Cluster: A grouping containing ‘lower level’ elements. For example in a sample survey the set of households in a neighbourhood.

Cross classification: A structure where lower level units are grouped within the cells of a multiway classification of higher level units.

Design matrix: In the fixed part of the model, the matrix of values of the explanatory variables X. In the random part the matrix of explanatory variables Z.

Explanatory variable: Also known as an ‘independent’ variable. In the fixed part of the model usually denoted by x and in the random part by z.

Fixed part: That part of a model represented by Xβ, that is the average relationship. The parameters, β, are referred to as ‘fixed parameters’.

Level: A component of a data hierarchy. Level 1 is the lowest level, for example students within schools or repeated measurement occasions within individual subjects.

Level n variation: The variation among level n unit measurements.

Multiple membership: A structure where a level unit may be nested within one or more higher level units.

Nesting: The clustering of units into a hierarchy.

Random part: That part of a model represented by Zu, that is the contribution of the random variables u, at each level. The parameters associated with the random variables, i.e. variances and covariances, are referred to as ‘random parameters’.

Response variable: Also known as a ‘dependent’ variable. Denoted by y.

Unit: An entity defined at a level of a data hierarchy. For example an individual student will be a level 1 unit within a level 2 unit such as a school.
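The glossary's split of a model into a fixed part Xβ and a random part Zu can be written out as a worked equation. What follows is only a sketch: the scalar version assumes the simplest 2-level case (one explanatory variable and a single random intercept per level 2 unit), and the symbols β0, β1, u_j, e_ij and the two variance parameters are chosen here for illustration and may not match the notation the book adopts later.

```latex
\begin{align*}
y &= X\beta + Zu \\
y_{ij} &= \underbrace{\beta_0 + \beta_1 x_{ij}}_{\text{fixed part}}
        + \underbrace{u_j + e_{ij}}_{\text{random part}} \\
\operatorname{var}(u_j) &= \sigma_u^2, \qquad
\operatorname{var}(e_{ij}) = \sigma_e^2
\end{align*}
```

Here i indexes level 1 units (for example students) and j indexes level 2 units (for example schools). In the glossary's terms, β0 and β1 are the ‘fixed parameters’ and the two variances are the ‘random parameters’.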

1

An introduction to multilevel models

1.1 Hierarchically structured data

Many kinds of data, including observational data collected in the human and biological sciences, have a hierarchical, nested, or clustered structure. For example, animal and human studies of inheritance deal with a natural hierarchy where offspring are grouped within families. Offspring from the same parents tend to be more alike in their physical and mental characteristics than individuals chosen at random from the population at large. For instance, children from the same family may all tend to be small, perhaps because their parents are small or because of a common impoverished environment. Many designed experiments, such as clinical trials carried out in several randomly chosen centres or groups of individuals, also create data hierarchies.

For now, we are concerned only with the fact of such hierarchies, not their provenance. The principal applications are those from the social and medical sciences, but the techniques are, of course, applicable more generally. In subsequent chapters, as we develop the theory and techniques with examples, we see how a proper recognition of these natural hierarchies allows us to obtain more satisfactory answers to important questions.

We refer to a hierarchy as consisting of units grouped at different levels. Thus offspring may be the level 1 units in a 2-level structure where the level 2 units are the families; likewise, students may be the level 1 units clustered or nested within schools that are the level 2 units.
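One way to picture such a 2-level structure in code is as a flat table with one row per level 1 unit, each row carrying the identifier of its level 2 unit. The sketch below is purely illustrative: the records, the field names (`school`, `student`, `score`) and the helper `nest` are all hypothetical and do not come from the book.

```python
from collections import defaultdict

# Hypothetical long-format records: one row per level 1 unit (student),
# each carrying the identifier of its level 2 unit (school).
records = [
    {"school": "A", "student": 1, "score": 52.0},
    {"school": "A", "student": 2, "score": 61.0},
    {"school": "B", "student": 3, "score": 47.0},
    {"school": "B", "student": 4, "score": 55.0},
    {"school": "B", "student": 5, "score": 49.0},
]


def nest(rows, level2_key):
    """Group level 1 rows under the level 2 unit they belong to."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[level2_key]].append(row)
    return dict(groups)


schools = nest(records, "school")

# Number of level 1 units within each level 2 unit.
school_sizes = {name: len(members) for name, members in schools.items()}

# Mean response within each level 2 unit.
school_means = {
    name: sum(r["score"] for r in members) / len(members)
    for name, members in schools.items()
}
```

Keeping the level 2 identifier on every level 1 row means that both levels stay available to the analysis, rather than collapsing the data to one of them.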

The existence of such data hierarchies is neither accidental nor ignorable. Individual people differ, as do individual animals, and this differentiation is mirrored in all kinds of social activity where the latter is often a direct result of the former; for example, when students with similar motivations or aptitudes are grouped in highly selective schools or colleges. In other cases, the groupings may arise for reasons less strongly associated with the characteristics of individuals, such as the allocation of young children to elementary schools, or the allocation of patients to different clinics. Once groupings are established, even if their establishment is effectively random, often they will tend to become differentiated. This differentiation implies that the group and its members both influence and are influenced by the group membership. To ignore this risks overlooking the importance of group effects, and may also render invalid many of the traditional statistical analysis techniques used for studying data relationships.

We look at this issue of statistical validity in the next chapter. For now, one simple example will show its importance. A well-known and influential study of the teaching styles used with primary (elementary) school children carried out in the 1970s (Bennett, 1976) claimed that children exposed to so-called ‘formal’ styles of teaching reading exhibited more progress than those who were not. The data were analysed using traditional multiple regression techniques which recognised only the individual children as the units of analysis and ignored their groupings within teachers and into classes. The results showed statistically significant differences. Subsequently, Aitkin et al. (1981) demonstrated that when the analysis accounted properly for the grouping of children into classes, the significant differences disappeared and the ‘formally’ taught children could not be shown to differ from the others.
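The effect described above can be reproduced in a small simulation. The sketch below does not re-analyse the teaching styles data and is not the method of Aitkin et al. (1981). It generates artificial classes in which teaching style has no true effect, then compares a test that treats children as independent units with a test on class means, which is a simple way of respecting the grouping when classes are of equal size. All names and settings (`false_positive_rates`, 30 classes of 25 children, the two standard deviations) are assumptions made for illustration.

```python
import numpy as np


def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(sp2 * (1 / na + 1 / nb))


def false_positive_rates(n_classes=30, n_children=25, class_sd=0.5,
                         child_sd=1.0, n_sims=500, seed=0):
    """Share of simulated studies declaring a 'significant' style effect
    at the nominal 5% level when the true effect is zero."""
    rng = np.random.default_rng(seed)
    # Style is assigned to whole classes: first half 'formal', rest not.
    formal = np.arange(n_classes) < n_classes // 2
    naive_hits = 0
    class_hits = 0
    for _ in range(n_sims):
        u = rng.normal(0.0, class_sd, n_classes)                 # class effects
        e = rng.normal(0.0, child_sd, (n_classes, n_children))   # child effects
        y = u[:, None] + e                                       # no style effect
        # Naive analysis: every child treated as an independent unit.
        t_child = pooled_t(y[formal].ravel(), y[~formal].ravel())
        naive_hits += int(abs(t_child) > 1.96)    # large-sample 5% cut-off
        # Grouping respected: one mean per class, classes as the units.
        means = y.mean(axis=1)
        t_class = pooled_t(means[formal], means[~formal])
        class_hits += int(abs(t_class) > 2.048)   # t cut-off for 28 d.f.
    return naive_hits / n_sims, class_hits / n_sims


naive_rate, class_rate = false_positive_rates()
print(f"children as units: {naive_rate:.2f}, classes as units: {class_rate:.2f}")
```

Under these assumed settings the child-level test should reject far more often than the nominal 5%, while the class-level test should stay close to it. The cut-off 2.048 applies only to the default of 30 classes and would need changing for other values of `n_classes`.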

Continue reading in the full edition!
