A one-of-a-kind resource on identifying and dealing with bias in statistical research on causal effects.
Do cell phones cause cancer? Can a new curriculum increase student achievement? Determining what the real causes of such problems are, and how powerful their effects may be, are central issues in research across various fields of study. Some researchers are highly skeptical of drawing causal conclusions except in tightly controlled randomized experiments, while others discount the threats posed by different sources of bias, even in less rigorous observational studies.
Bias and Causation presents a complete treatment of the subject, organizing and clarifying the diverse types of biases into a conceptual framework. The book treats various sources of bias in comparative studies, both randomized and observational, and offers guidance on how they should be addressed by researchers. Utilizing a relatively simple mathematical approach, the author develops a theory of bias that outlines the essential nature of the problem and identifies the various sources of bias that are encountered in modern research.
The book begins with an introduction to the study of causal inference and the related concepts and terminology. Next, an overview is provided of the methodological issues at the core of the difficulties posed by bias. Subsequent chapters explain the concepts of selection bias, confounding, intermediate causal factors, and information bias along with the distortion of a causal effect that can result when the exposure and/or the outcome is measured with error. The book concludes with a new classification of twenty general sources of bias and practical advice on how mathematical modeling and expert judgment can be combined to achieve the most credible causal conclusions.
Throughout the book, examples from the fields of medicine, public policy, and education are incorporated into the presentation of various topics. In addition, six detailed case studies illustrate concrete examples of the significance of biases in everyday research.
Requiring only a basic understanding of statistics and probability theory, Bias and Causation is an excellent supplement for courses on research methods and applied statistics at the upper-undergraduate and graduate level. It is also a valuable reference for practicing researchers and methodologists in various fields of study who work with statistical data.
This book was selected as the 2011 Ziegel Prize Winner in Technometrics for the best book reviewed by the journal. It is also the winner of the 2010 PROSE Award for Mathematics from The American Publishers Awards for Professional and Scholarly Excellence.
Page count: 735
Publication year: 2011
Table of Contents
Cover
Half title page
Series page
Title page
Copyright page
Dedication
Preface
CHAPTER 1 What Is Bias?
1.1 APPLES AND ORANGES
1.2 STATISTICS VS. CAUSATION
1.3 BIAS IN THE REAL WORLD
GUIDEPOST 1
CHAPTER 2 Causality and Comparative Studies
2.1 BIAS AND CAUSATION
2.2 CAUSALITY AND COUNTERFACTUALS
2.3 WHY COUNTERFACTUALS?
2.4 CAUSAL EFFECTS
2.5 EMPIRICAL EFFECTS
GUIDEPOST 2
CHAPTER 3 Estimating Causal Effects
3.1 EXTERNAL VALIDITY
3.2 MEASURES OF EMPIRICAL EFFECTS
3.3 DIFFERENCE OF MEANS
3.4 RISK DIFFERENCE AND RISK RATIO
3.5 POTENTIAL OUTCOMES
3.6 TIME-DEPENDENT OUTCOMES
3.7 INTERMEDIATE VARIABLES
3.8 MEASUREMENT OF EXPOSURE
3.9 MEASUREMENT OF THE OUTCOME VALUE
3.10 CONFOUNDING BIAS
GUIDEPOST 3
CHAPTER 4 Varieties of Bias
4.1 RESEARCH DESIGNS AND BIAS
4.2 BIAS IN BIOMEDICAL RESEARCH
4.3 BIAS IN SOCIAL SCIENCE RESEARCH
4.4 SOURCES OF BIAS: A PROPOSED TAXONOMY
GUIDEPOST 4
CHAPTER 5 Selection Bias
5.1 SELECTION PROCESSES AND BIAS
5.2 TRADITIONAL SELECTION MODEL: DICHOTOMOUS OUTCOME
5.3 CAUSAL SELECTION MODEL: DICHOTOMOUS OUTCOME
5.4 RANDOMIZED EXPERIMENTS
5.5 OBSERVATIONAL COHORT STUDIES
5.6 TRADITIONAL SELECTION MODEL: NUMERICAL OUTCOME
5.7 CAUSAL SELECTION MODEL: NUMERICAL OUTCOME
GUIDEPOST 5
APPENDIX
CHAPTER 6 Confounding: An Enigma?
6.1 WHAT IS THE REAL PROBLEM?
6.2 CONFOUNDING AND EXTRANEOUS CAUSES
6.3 CONFOUNDING AND STATISTICAL CONTROL
6.4 CONFOUNDING AND COMPARABILITY
6.5 CONFOUNDING AND THE ASSIGNMENT MECHANISM
6.6 CONFOUNDING AND MODEL SPECIFICATION
GUIDEPOST 6
CHAPTER 7 Confounding: Essence, Correction, and Detection
7.1 ESSENCE: THE NATURE OF CONFOUNDING
7.2 CORRECTION: STATISTICAL CONTROL FOR CONFOUNDING
7.3 DETECTION: ADEQUACY OF STATISTICAL ADJUSTMENT
GUIDEPOST 7
APPENDIX
CHAPTER 8 Intermediate Causal Factors
8.1 DIRECT AND INDIRECT EFFECTS
8.2 PRINCIPAL STRATIFICATION
8.3 NONCOMPLIANCE
8.4 ATTRITION
GUIDEPOST 8
CHAPTER 9 Information Bias
9.1 BASIC CONCEPTS
9.2 CLASSICAL MEASUREMENT MODEL: DICHOTOMOUS OUTCOME
9.3 CAUSAL MEASUREMENT MODEL: DICHOTOMOUS OUTCOME
9.4 CLASSICAL MEASUREMENT MODEL: NUMERICAL OUTCOME
9.5 CAUSAL MEASUREMENT MODEL: NUMERICAL OUTCOME
9.6 COVARIATES MEASURED WITH ERROR
GUIDEPOST 9
CHAPTER 10 Sources of Bias
10.1 SAMPLING
10.2 ASSIGNMENT
10.3 ADHERENCE
10.4 EXPOSURE ASCERTAINMENT
10.5 OUTCOME MEASUREMENT
GUIDEPOST 10
CHAPTER 11 Contending with Bias
11.1 CONVENTIONAL SOLUTIONS
11.2 STANDARD STATISTICAL PARADIGM
11.3 TOWARD A BROADER PERSPECTIVE
11.4 REAL-WORLD BIAS REVISITED
11.5 STATISTICS AND CAUSATION
Glossary
Bibliography
Index
Bias and Causation
WILEY SERIES IN PROBABILITY AND STATISTICS
Established by WALTER A. SHEWHART and SAMUEL S. WILKS
Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Harvey Goldstein, Iain M. Johnstone, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay, Sanford Weisberg
Editors Emeriti: Vic Barnett, J. Stuart Hunter, Jozef L. Teugels
A complete list of the titles in this series appears at the end of this volume.
Copyright © 2010 by John Wiley & Sons, Inc. All rights reserved
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
Brief explanation of the cover photo for Bias and Causation:
In the quest for evidence of causation, the researcher must follow a confusing and circuitous path, avoiding false steps that can easily lead to bias.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
Weisberg, Herbert I., 1944–
Bias and causation : models and judgment for valid comparisons / Herbert I. Weisberg.
p. cm.—(Wiley series in probability and statistics)
Includes bibliographical references and index.
ISBN 978-0-470-28639-5 (cloth)
ISBN 978-1-118-05820-6 (ebk)
1. Discriminant analysis. 2. Paired comparisons (Statistics) I. Title.
QA278.65.W45 2010
519.5′35–dc22
2009054242
To Nina, Alexander, and Daniel
Preface
Throughout a long career as a statistician, I have frequently found myself wrestling, in one way or another, with issues of bias and causation. As a methodologist, researcher, consultant, or expert witness, I have had to propose, justify, or criticize many varieties of causal statements. My training in mathematics and statistics prepared me well to deal with many aspects of the diverse, and occasionally bizarre, problems I have chanced to encounter. However, the statistical theory I studied in graduate school did not deal explicitly with the subject of causal inference, except within the narrow confines of randomized experimentation.
When I entered the “real world” of statistical research and consulting, the problems I regularly faced were not amenable to strict experimental control. They typically involved causal effects on human health and behavior in the presence of observational data subject to many possible sources of bias. To attack these problems, I needed analytic weapons that were not in my statistical arsenal. Little by little, I found myself being transformed into a practitioner of some dark art that involved statistics, but that drew as well on intuition, logic, and common sense.
The nature of this evolution can be best illustrated by an anecdote. The first legal case in which I provided statistical expertise was an employment discrimination lawsuit against a Boston-based Fortune 500 company. The plaintiffs were convinced that black workers were being systematically prevented from rising to higher-level positions within the manufacturing division of the company. A young associate in a large Boston law firm representing the plaintiffs had somehow been referred to me. Knowing next to nothing about the industry or the relevant employment law, I peppered this attorney with questions of all sorts. Eventually, I even got around to requesting some data from the company’s human resources department. I dutifully subjected the data to various standard analyses, searching for an effect of race on promotion rates, but came up empty. Despite repeated failures, I harbored a nagging suspicion that something important had been overlooked.
I began to scrutinize listings of the data, trying to discern some hidden pattern behind the numbers. Preliminary ideas led to further questions and to discussions with some of the plaintiffs. This interactive process yielded a more refined understanding of personnel decision making at the company. Eventually, it became clear to me what was “really” going on. Up to a certain level in the hierarchy of positions, there was virtually no relationship between race and promotion. But for a particular level midway up the organizational ladder, very few workers were being promoted from within the company when openings arose. Rather, these particular jobs were being filled primarily through outside hires, and almost always by white applicants. Moreover, these external candidates were sometimes less qualified than the internally available workers. We came to call this peculiar dynamic “the bottleneck.”
This subtle pattern, once recognized, was supported anecdotally in several ways. The statistical data, coupled with qualitative supporting information, was eventually presented to the defendant company’s attorneys. The response to our demonstration of the bottleneck phenomenon was dramatic: a sudden interest in negotiation after many months of intransigence. Within weeks, a settlement of the case was forged.
The methods of data analysis I had employed in this instance did not conform to any textbook methods I had been taught. Indeed, I felt a bit guilty that the actual statistical techniques were quite simple and did not require any advanced mathematical knowledge. Moreover, the intensive “data dredging” in which I had engaged was highly unorthodox. But the results made perfect sense! Besides leading to a practical solution, the answer was intellectually satisfying, connecting all the dots of a previously inexplicable data pattern.
The kinds of data analyses that have proved most useful in my work have often displayed this same quality of making sense in a way that is intuitive and logically compelling. This plausibility derives not only from statistical criteria, such as levels of significance, but also from broader considerations that are harder to articulate. I have come to believe that statisticians tend to be uncomfortable with causal inference in part because the issues cannot be settled with technical skill alone. Substantive knowledge and expert judgment are also necessary, in ways that are often difficult to quantify. Thus, at least until very recently, statisticians have been content to cede most methodological questions related to bias and causation to other academic disciplines. This situation has certainly started to change for the better. However, it remains unclear how far the statistical profession is prepared to stretch to meet the real challenges of causal puzzles.
When I began writing this book, I envisioned something much less ambitious than this effort has turned out to be. Fortunately, I was foolish enough to rush in where others apparently feared to tread, not realizing that I would be drawn into the subject so deeply. Rather than a “handy reference” on the types of bias, with some causal modeling framework in the background, the tail (causation) has come to wag the dog (bias). It seems to me that to really understand bias, a clear counterfactual framework for formulating the issues is necessary. This framework provides the foundation upon which potential solutions, whether quantitative or qualitative, may rest.
This book is intended primarily for practicing researchers and methodologists, and for students with a reasonably solid grounding in basic statistics and research methods. The mathematics used involves nothing beyond elementary algebra and basic statistics and probability theory. The few more complicated derivations are relegated to appendices that will be of interest only to the more mathematically sophisticated. The main value of the book is conceptual, not technical. The purpose of the mathematical models is to provide insight, rather than methods, although some methods have been and can be built upon the conceptual foundations.
I have provided very little detail on the traditional statistical methods that address problems of random variability. When conducting actual research, these problems need to be addressed in tandem with those related to bias. In reading this book, it may be helpful to imagine that we are dealing with extremely large samples in which random variability can be ignored. Of course, in reality, finite-sampling issues are usually very important. Much of the recent research by statisticians pertaining to causal inference concentrates on estimating causal effects based on finite samples. The statistical principles and methods they apply are well established and have been expounded in numerous texts. My subject is the poor stepchild of statistics: systematic error that cannot be cured by obtaining a large-enough sample.
This book was not written specifically as a textbook. However, it may be found useful as a central or secondary resource in a graduate-level research methods course in epidemiology or the social sciences. I believe that much of the material is best learned in the context of real research. So, teachers may wish to supplement the limited set of examples in the book with articles and reports relevant to their particular interests and areas of application. Some teachers may find it useful to treat this book as a reference for selected topics, or to approach the topics covered in a different order than I have presented them. This is certainly their prerogative, but I would caution that material in later chapters depends strongly on concepts and terminology introduced earlier.
Chapter 1 provides an introduction to the problems considered, and brief summaries of six “case studies” that are referenced throughout the book. Chapter 2 discusses the counterfactual framework for causal inference, and some important concepts and terminology pertaining to bias and causation. Chapter 3 contains a brief exposition of several methodological issues that are central to the difficulties posed by bias. Chapter 4 summarizes the various types of bias as viewed in the biomedical and the social sciences. These four chapters form an extended introduction and review of the issues addressed in the remainder of the text.
Chapter 5 deals with the problem of selection bias—a term used in different ways, and a source of considerable confusion. Selection bias is approached from the perspective of traditional statistical modeling as well as from a causal modeling viewpoint. Chapters 6 and 7 both focus on the problem of confounding. Chapter 6 lays out the various ways in which this central, but enigmatic, concept has been defined. Chapter 7 offers an explanation of confounding that is based on the causal (counterfactual) framework. Chapter 8 discusses intermediate causal factors, which can engender bias even in a randomized controlled trial. Chapter 9 considers the topic of information bias, the distortion of a causal effect that can result when the exposure and/or the outcome is measured with error. Chapters 5 through 9 are the most challenging, both conceptually and mathematically.
Chapters 10 and 11 are more practical and less technical than the preceding five chapters. Chapter 10 offers a preliminary organization of bias sources. I define a source of bias as a real-world condition that affects a comparative study and can lead to bias. I define and describe 20 general sources of bias. This list includes, at a high level of generality, most of the common sources of bias that arise in practice. Finally, Chapter 11 considers the different ways in which we can attempt to cope with bias. I argue that the standard statistical paradigm is most appropriate for a narrow (albeit important) range of problems related to bias and causation. There is a vast uncharted territory of research problems for which this paradigm is impractical, and sometimes even inappropriate.
In this sense, I am a pessimist when it comes to the ability of standard approaches, and highly sophisticated mathematical extensions of them, to solve the problems of causal inference. I believe that a broader paradigm for data analysis is needed, one that focuses much more on individual variability and that meshes qualitative and quantitative sources of information more effectively. I am most definitely an optimist, however, about the enormous potential of research to improve human health and well-being.
I would like to acknowledge the contributions, direct and indirect, of many colleagues and friends with whom I have been privileged to work. Many of you have helped to shape my thinking during the course of our collaborations on various projects over the years. In this regard, I highlight especially the following (in alphabetical order): Tony Bryk, Xiu Chen, Richard Derrig, Mike Dolan, Eric Garnick, Mike Grossman, Vanessa Hayden, Peter Höfler, Jarvis Kellogg, Eric Kraus, Tom Marx, Mike Meyer, Bruce Parker, Victor Pontes, Sam Ratick, David Rogosa, Peter Rousmaniere, David Schwartz, and Terry Tivnan.
I offer my sincere thanks for important direct contributions to this book by several individuals. Richard Derrig and Jarvis Kellogg participated in several discussions that helped to sharpen my focus and improve the presentation of ideas. Vanessa Hayden and Victor Pontes provided insightful feedback as early versions of the causal model presented in this book were being hatched, and collaborated as coauthors on an article that introduced the basic conceptual framework. Jay Kadane and Dave Sackett reviewed early drafts of several chapters and made many valuable comments. Tom Marx and Terry Tivnan deserve special commendations for plowing through the entire manuscript and offering their helpful (and always tactful) editorial and substantive suggestions. My editor at Wiley, Steve Quigley, was enormously helpful at all stages of the process. His practical advice at critical junctures, delivered with wry, self-effacing humor, was just what I needed to see this project through to completion.
Finally, I wish to acknowledge the support and encouragement supplied by many family members, including my sons, Alex and Dan, my sister, Sally Goldberg, and my sisters-in-law, Barbara Irving and Abigail Natenshon. Alex also skillfully translated my rough drafts of the pie-chart graphs into electronic form. Above all, my deepest sense of gratitude goes to my wife, Nina, who has always given me the confidence to trust my instincts, and the courage to pursue my dreams.
Herbert I. Weisberg
Needham, Massachusetts
March 2010
