The second edition of this best-selling book has been thoroughly revised and expanded to reflect significant changes and advances in systematic reviewing. New features include discussion of the rationale for systematic reviews, meta-analysis of prognostic and diagnostic studies, software, and the use of systematic reviews in practice.
Page count: 796
Year of publication: 2013
Contents
Contributors
Foreword
Introduction
1 Rationale, potentials, and promise of systematic reviews
Summary points
Systematic review, overview or meta-analysis?
The scope of meta-analysis
Historical notes
Why do we need systematic reviews? A patient with myocardial infarction in 1981
Narrative reviews
Limitations of a single study
A more transparent appraisal
The epidemiology of results
What was the evidence in 1981? Cumulative meta-analysis
Conclusions
Acknowledgements
Part I: Systematic reviews of controlled trials
2 Principles of and procedures for systematic reviews
Summary points
Developing a review protocol
Objectives and eligibility criteria
Literature search
Selection of studies, assessment of methodological quality and data extraction
Presenting, combining and interpreting results
Standardised outcome measure
Graphical display
Heterogeneity between study results
Methods for estimating a combined effect estimate
Bayesian meta-analysis
Sensitivity analysis
Relative or absolute measures of effect?
Conclusions
Acknowledgements
3 Problems and limitations in conducting systematic reviews
Summary points
Garbage in – garbage out?
The dissemination of research findings
Publication bias
Who is responsible for publication bias: authors, reviewers or editors?
The influence of external funding and commercial interests
Other reporting biases
Duplicate (multiple) publication bias
Citation bias
Language bias
Outcome reporting bias
The future of unbiased, systematic reviewing
Acknowledgements
4 Identifying randomised trials
Summary points
Project to identify and re-tag reports of controlled trials in MEDLINE
Project to identify reports of controlled trials in EMBASE
Other databases already searched for reports of controlled trials
Identifying reports of controlled trials by handsearching
Recommended supplementary searches of databases already searched or in progress
Searches of additional databases not currently being searched for The Cochrane Controlled Trials Register
Supplementary searches of journals and conference proceedings
Identifying ongoing and/or unpublished studies
Conclusions
Acknowledgements
Disclaimer
5 Assessing the quality of randomised controlled trials
Summary points
A framework for methodological quality
Dimensions of internal validity
Empirical evidence of bias
Dimensions of external validity
Quality of reporting
Assessing trial quality: composite scales
A critique of two widely used quality scales
Potential of quality scales
Assessing trial quality: the component approach
Incorporating study quality into meta-analysis
Quality as a weight in statistical pooling
Sensitivity analysis
Conclusions
6 Obtaining individual patient data from randomised controlled trials
Summary points
What are individual patient data reviews?
Improved follow-up
Time-to-event analyses
Estimating time-to-event outcomes from published information
Effects in patient subgroups and on different outcomes
How to obtain data that are as complete as possible
What if IPD are not available?
Potential disadvantages of IPD reviews
Examples of how the collection of IPD can make a difference to the findings of a review
Conclusion
7 Assessing the quality of reports of systematic reviews: the QUOROM statement compared to other tools
Summary points
A systematic review of published checklists and scales
The QUOROM Statement
QUOROM as the “gold standard”
Results
Comparison of QUOROM to other checklists and scales
Scales
Assessment of quality across instruments
Discussion
Acknowledgements
Appendix 1
Appendix 2
Part II: Investigating variability within and between studies
8 Going beyond the grand mean: subgroup analysis in meta-analysis of randomised trials
Summary points
Subgroup analysis
Meta-regression: examining gradients in treatment effects
Risk stratification in meta-analysis
Problems in risk stratification
Use of risk indicators in meta-analysis
Confounding
Acknowledgements
9 Why and how sources of heterogeneity should be investigated
Summary points
Clinical and statistical heterogeneity
Serum cholesterol concentration and risk of ischaemic heart disease
Serum cholesterol reduction and risk of ischaemic heart disease
Statistical methods for investigating sources of heterogeneity
The relationship between underlying risk and treatment benefit
The virtues of individual patient data
Conclusions
Acknowledgements
10 Analysing the relationship between treatment benefit and underlying risk: precautions and recommendations
Summary points
Relating treatment effect to underlying risk: conventional approaches
Observed treatment effect versus observed average risk
Observed treatment group risk versus observed control group risk
Relating treatment effect to underlying risk: recently proposed approaches
Re-analysis of the sclerotherapy data
Conclusions
11 Investigating and dealing with publication and other biases
Summary points
Funnel plots
Other graphical methods
Statistical methods to detect and correct for bias
Conclusions
Acknowledgements
Part III: Systematic reviews of observational studies
12 Systematic reviews of observational studies
Summary points
Why do we need systematic reviews of observational studies?
Confounding and bias
Rare insight? The protective effect of beta-carotene that wasn’t
Exploring sources of heterogeneity
Conclusion
Acknowledgements
13 Systematic reviews of evaluations of prognostic variables
Summary points
Systematic review of prognostic studies
Meta-analysis of prognostic factor studies
Discussion
14 Systematic reviews of evaluations of diagnostic and screening tests
Summary points
Rationale for undertaking systematic reviews of studies of test accuracy
Features of studies of test accuracy
Summary measures of diagnostic accuracy
Predictive values
Systematic reviews of studies of diagnostic accuracy
Literature searching
Assessment of study quality
Ascertainment of reference diagnosis
Meta-analysis of studies of diagnostic accuracy
Pooling sensitivities and specificities
Pooling likelihood ratios
Combining symmetric ROC curves: pooling diagnostic odds ratios
Littenberg and Moses methods for estimation of summary ROC curves
Investigation of sources of heterogeneity
Pooling ROC curves
Discussion
Part IV: Statistical methods and computer software
15 Statistical methods for examining heterogeneity and combining results from several studies in meta-analysis
Summary points
Meta-analysis
Formulae for estimates of effect from individual studies
Formulae for deriving a summary (pooled) estimate of the treatment effect by combining trial results (meta-analysis)
Mantel–Haenszel methods
Peto’s odds ratio method
Extending the Peto method for pooling time-to-event data
DerSimonian and Laird random effects models
Confidence interval for overall effect
Test statistic for overall effect
Test statistics of homogeneity
Use of stratified analyses for investigating sources of heterogeneity
Meta-analysis with individual patient data
Additional analyses
Some practical issues
Other methods of meta-analysis
Case study 2: assertive community treatment for severe mental disorders
Case study 3: effect of reduced dietary sodium on blood pressure
Discussion
Acknowledgements
16 Effect measures for meta-analysis of trials with binary outcomes
Summary points
Criteria for selection of a summary statistic
Mathematical properties
Ease of interpretation
Odds and risks
Measure of absolute effect – the risk difference
Measures of relative effect – the risk ratio and odds ratio
What is the event?
The L’Abbé plot
Empirical evidence of consistency
Empirical evidence of ease of interpretation
Case studies
Discussion
Acknowledgements
17 Meta-analysis software
Summary points
Commercial meta-analysis software
Freely available meta-analysis software
General statistical software which includes facilities for meta-analysis, or for which meta-analysis routines have been written
Conclusions
18 Meta-analysis in Stata™
Summary points
Getting started
Commands to perform a standard meta-analysis
Dealing with zero cells
Cumulative meta-analysis
Examining the influence of individual studies
Funnel plots and tests for funnel plot asymmetry
Meta-regression
Example 3: trials of BCG vaccine against tuberculosis
Part V: Using systematic reviews in practice
19 Applying the results of systematic reviews at the bedside
Summary points
Determining the applicability of the evidence to an individual patient
Determining the feasibility of the intervention in a particular setting
Determining the benefit:risk ratio in an individual patient
Extrapolating to the individual patient
Incorporating patient values and preferences
Conclusion
Acknowledgements
20 Numbers needed to treat derived from meta-analyses: pitfalls and cautions
Summary points
Describing the effects of treatment
The effect of choice of outcome
Ineffective treatments and adverse effects
The effect of variation in baseline risk
The effects of geographical and secular trends
The effect of clinical setting
Trial participants and patients in routine clinical practice
Pooling assumptions and NNTs
Duration of treatment effect
Interpretation of NNTs
Deriving NNTs
Understanding of NNTs
Conclusion
Acknowledgements
21 Using systematic reviews in clinical guideline development
Summary points
Methods of developing guidelines
Scale of the review process
Using existing systematic reviews
Internal validity of reviews
Dimensions of evidence
External validity of reviews
Updating existing reviews
Conducting reviews within the process of guideline development
The summary metric used in reviews
Concluding remarks
22 Using systematic reviews for evidence based policy making
Summary points
Evidence based medicine and evidence based healthcare
Evidence based decision making for populations
Evidence as the dominant driver
Resource-driven decisions
Value-driven decisions
Who should make value-driven decisions?
23 Using systematic reviews for economic evaluation
Summary points
What is economic evaluation?
Using systematic reviews of the effects of health care in economic evaluation
Systematic review of economic evaluations
Conclusions: the role of systematic review in economic evaluation
Acknowledgements
24 Using systematic reviews and registers of ongoing trials for scientific and ethical trial design, monitoring, and reporting
Summary points
Questionable use of limited resources
Ethical concerns
Distorted research agendas
Registration of planned and ongoing trials to inform decisions on new trials
Registration of trials to reduce reporting biases
Ethical and scientific monitoring of ongoing trials
Interpreting the results from new trials
Acknowledgements
Part VI: The Cochrane Collaboration
25 The Cochrane Collaboration in the 20th century
Summary points
Background and history
Mission, principles and organisation
Collaborative Review Groups
Cochrane Centres
Method Groups
Communication
Output of the Cochrane Collaboration
Conclusion
Acknowledgements
26 The Cochrane Collaboration in the 21st century: ten challenges and one reason why they must be met
Summary points
Ethical challenges
Social challenges
Logistical challenges
Methodological challenges
Why these challenges must be met
Acknowledgements
Index
Stata™ datasets and other additional information can be found on the book’s web site: http://www.systematicreviews.com
© BMJ Publishing Group 2001
Chapter 4 © Crown copyright 2000
Chapter 24 © Crown copyright 1995, 2000
Chapters 25 and 26 © The Cochrane Collaboration 2000
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording and/or otherwise, without the prior written permission of the publishers.
First published in 1995 by the BMJ Publishing Group, BMA House, Tavistock Square, London WC1H 9JR
www.bmjbooks.com
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
ISBN: 978-0-7279-1488-0
Contributors
Douglas G Altman
Professor of Statistics in Medicine
ICRF Medical Statistics Group
Centre for Statistics in Medicine
Institute of Health Sciences
University of Oxford
Oxford, UK
Gerd Antes
Director
German Cochrane Centre
Institut für Medizinische Biometrie und Medizinische Informatik
University of Freiburg
Freiburg i.B., Germany
Michael J Bradburn
Medical Statistician
ICRF Medical Statistics Group
Centre for Statistics in Medicine
Institute of Health Sciences
University of Oxford, Oxford, UK
Iain Chalmers
Director
UK Cochrane Centre
NHS Research and Development Programme
Oxford, UK
Michael J Clarke
Associate Director (Research)
UK Cochrane Centre
NHS Research and Development Programme
and
Overviews’ Co-ordinator
Clinical Trials Service Unit
Oxford, UK
George Davey Smith
Professor of Clinical Epidemiology
Division of Epidemiology and MRC Health Services Research Collaboration
Department of Social Medicine
University of Bristol
Bristol, UK
Jonathan J Deeks
Senior Medical Statistician
Systematic Review Development Programme
Centre for Statistics in Medicine
Institute of Health Sciences
University of Oxford
Oxford, UK
Kay Dickersin
Associate Professor
Department of Community Health
Brown University
Rhode Island, USA
Catherine Dubé
Department of Medicine
Division of Gastro-enterology
University of Ottawa
Ottawa, Canada
Shah Ebrahim
Professor in Epidemiology of Ageing
Division of Epidemiology and MRC Health Services Research Collaboration
Department of Social Medicine
University of Bristol
Bristol, UK
Martin Eccles
Professor of Clinical Effectiveness
Centre for Health Services Research
University of Newcastle upon Tyne
Newcastle upon Tyne, UK
Matthias Egger
Senior Lecturer in Epidemiology and Public Health Medicine
Division of Health Services Research and MRC Health Services Research Collaboration
Department of Social Medicine
University of Bristol
Bristol, UK
Nick Freemantle
Reader in Epidemiology and Biostatistics
Medicines Evaluation Group
Centre for Health Economics
University of York
York, UK
J A Muir Gray
Director
Institute of Health Sciences
University of Oxford
Oxford, UK
Peter Jüni
Research Fellow
MRC Health Services Research Collaboration
Department of Social Medicine
University of Bristol
Bristol, UK
Carol Lefebvre
Information Specialist
UK Cochrane Centre
NHS Research and Development Programme
Oxford, UK
James Mason
Senior Research Fellow
Medicines Evaluation Group
Centre for Health Economics
University of York
York, UK
Finlay A McAlister
Assistant Professor
Division of General Internal Medicine
University of Alberta Hospital
Edmonton, Canada
David Moher
Director
Thomas C Chalmers Centre for Systematic Reviews
Children’s Hospital of Eastern Ontario Research Institute
University of Ottawa
Ottawa, Canada
Miranda Mugford
Professor of Health Economics
School of Health Policy and Practice
University of East Anglia
Norwich, UK
Keith O’Rourke
Statistician
Clinical Epidemiology Unit
Loeb Research Institute
Ottawa Hospital
Ottawa, Canada
Andrew D Oxman
Director
Health Services Research Unit
National Institute of Public Health
Oslo, Norway
Martin Schneider
Specialist Registrar in Internal Medicine
Department of Medicine
University Hospitals
Geneva, Switzerland
Stephen J Sharp
Medical Statistician
GlaxoWellcome Research and Development
London, UK
Beverley Shea
Loeb Health Research Institute
Clinical Epidemiology Unit
Ottawa Hospital
University of Ottawa
Ottawa, Canada
Jonathan A C Sterne
Senior Lecturer in Medical Statistics
Division of Epidemiology and MRC Health Services Research Collaboration
Department of Social Medicine
University of Bristol
Bristol, UK
Lesley A Stewart
Head
Meta-Analysis Group
MRC Clinical Trials Unit
London, UK
Alexander J Sutton
Lecturer in Medical Statistics
Department of Epidemiology and Public Health
University of Leicester
Leicester, UK
Simon G Thompson
Director
MRC Biostatistics Unit
Institute of Public Health
University of Cambridge
Cambridge, UK
Foreword
“If, as is sometimes supposed, science consisted in nothing but the laborious accumulation of facts, it would soon come to a standstill, crushed, as it were, under its own weight. The suggestion of a new idea, or the detection of a law, supersedes much that has previously been a burden on the memory, and by introducing order and coherence facilitates the retention of the remainder in an available form…. Two processes are thus at work side by side, the reception of new material and the digestion and assimilation of the old; and as both are essential we may spare ourselves the discussion of their relative importance. One remark, however, should be made. The work which deserves, but I am afraid does not always receive, the most credit is that in which discovery and explanation go hand in hand, in which not only are new facts presented, but their relation to old ones is pointed out.”1
The above quotation is from the presidential address given by Lord Rayleigh, Professor of Physics at Cambridge University, at the meeting of the British Association for the Advancement of Science held in Montreal in 1884. More than a century later, research funding agencies, research ethics committees, researchers and journal editors in the field of health research have only just begun to take Lord Rayleigh’s injunction seriously. Research synthesis has a long history and has been developed in many spheres of scientific activity.2 Social scientists in the United States, in particular, have been actively discussing, developing and applying methods for this kind of research for more than a quarter of a century,3–5 and, when the quality of the original research has been adequate, research syntheses have had an important impact on policy and practice.6,7
It was not until the late 1980s that Cynthia Mulrow8 and Andy Oxman9 began to spell out, for a medical readership, the scientific issues that need to be addressed in research synthesis. During the 1990s, there was an encouraging growth of respect for scientific principles among those preparing “stand alone” reviews, particularly reviews of research on the effects of health care interventions. Unfortunately, there is still little evidence that the same scientific principles are recognised as relevant in preparing the “discussion” sections of reports of new research. An analysis of papers in five influential general medical journals showed that the results of new studies are only very rarely presented in the context of systematic reviews of relevant earlier studies.10
In an important step in the right direction, the British Medical Journal, acknowledging the cumulative nature of scientific evidence, now publishes with each report of new research a summary of what is already known on the topic addressed, and what the new study has added.
As a result of the slow progress in adopting scientifically defensible methods of research synthesis in health care, the limited resources made available for research continue to be squandered on ill-conceived studies,11 and avoidable confusion continues to result from failure to review research systematically and set the results of new studies in the context of other relevant research. As a result, patients and others continue to suffer unnecessarily.12
Take, for example, the disastrous effects of giving class 1 anti-arrhythmic drugs to people having heart attacks, which has been estimated to have caused tens of thousands of premature deaths in the United States alone.13 The fact that the theoretical potential of these drugs was not being realised in practice could have been recognised many years earlier than it was. The warning signs were there in one of the first systematic reviews of controlled trials in health care,14 yet more than 50 trials of these drugs were conducted over nearly two decades15 before official warnings about their lethal impact were issued. Had the new data generated by each one of these trials been presented within the context of systematic reviews of the results of all previous trials, the lethal potential of this class of drugs would have become clear earlier, and an iatrogenic disaster would have been contained, if not avoided.
Failure to improve the quality of reviews by taking steps to reduce biases and the effects of the play of chance – whether in “stand alone” reviews or in reports of new evidence – will continue to have adverse consequences for people using health services. The first edition of this book helped to raise awareness of this,16 and its main messages were well received. However, the call to improve the scientific quality of reviews has not been accepted by everyone. In 1998, for example, editors at the New England Journal of Medicine rejected a commentary they had commissioned because they felt that their readers would not understand its main message – that meta-analysis (statistical synthesis of the results of separate but similar studies) could not be expected to reduce biases in reviews, but only to reduce imprecision.17 The journal’s rejection was particularly ironic in view of the fact that it had published one of the earliest and most important systematic reviews ever done.18
It was because of the widespread and incautious use of the term “meta-analysis” that the term “systematic reviews” was chosen as the title for the first edition of this book.16 Although meta-analysis may reduce statistical imprecision and may sometimes hint at biases in reviews (for example through tests of homogeneity, or funnel plots), it can never prevent biases. As in many forms of research, even elegant statistical manipulations, when performed on biased rubble, are incapable of generating unbiased precious stones. As Matthias Egger has put it – the diamond used to represent a summary statistic cannot be assumed to be the jewel in the crown!
The term “meta-analysis” has become so attractive to some people that they have dubbed themselves “meta-analysts”, and so repellent to others that they have lampooned it with dismissive “synonyms” such as “mega-silliness”19 and “shmeta-analysis”.20 Current discussions about ways of reducing biases and imprecision in reviews of research must not be allowed to be held hostage by ambiguous use of the term “meta-analysis”. Hopefully, both the title and the organisation of the contents of this second edition of the book will help to promote more informed and specific criticisms of reviews, and set meta-analysis in a proper context.
Interest in methods for research synthesis among health researchers and practitioners has burgeoned during the five years that have passed between the first and second editions of this book. Whereas the first edition16 had eight chapters and was just over 100 pages long, the current edition has 26 chapters and is nearly 500 pages long. The first edition of the book contained a methodological bibliography of less than 400 citations. Because that bibliography has now grown to over 2500 citations, it is now published and updated regularly in The Cochrane Methodology Register.21 These differences reflect the breathtaking pace of methodological developments in this sphere of research. Against this background it is easy to understand why I am so glad that Matthias Egger and George Davey Smith – who have contributed so importantly to these developments – agreed to co-edit the second edition of this book with Doug Altman.
After an introductory editorial chapter, the new edition begins with six chapters concerned principally with preventing and detecting biases in systematic reviews of controlled experiments. The important issue of investigating variability within and between studies is tackled in the four chapters that follow. The “methodological tiger country” of systematic reviews of observational studies is then explored in three chapters. Statistical methods and computer software are addressed in a section with four chapters. The book concludes with six chapters about using systematic reviews in practice, and two about the present and future of the Cochrane Collaboration.
Looking ahead, I hope that there will have been a number of further developments in this field before the third edition of the book is prepared. First and foremost, there needs to be wider acknowledgement of the essential truth of Lord Rayleigh’s injunction, particularly within the research community and among funders. Not only is research synthesis an essential process for taking stock of the dividends resulting from the investment of effort and other resources in research, it is also intellectually and methodologically challenging, and this should be reflected in the criteria used to judge the worth of academic work. Hopefully we will have seen the back of the naïve notion that when the results of systematic reviews differ from those of large trials, the latter should be assumed to be “the truth”.22
Second, I hope that people preparing systematic reviews, rather than having to detect and try to take account of biases retrospectively, will increasingly be able to draw on material that is less likely to be biased. Greater efforts are needed to reduce biases in the individual studies that will contribute to reviews.23 Reporting biases need to be reduced by registration of studies prior to their results being known, and by researchers recognising that they have an ethical and scientific responsibility to report findings of well-designed studies, regardless of the results.24 And I hope that there will be greater collaboration in designing and conducting systematic reviews prospectively, as a contribution to reducing biases in the review process, as pioneered in the International Multicentre Pooled Analysis of Colon Cancer Trials.25
Third, the quality of reviews of observational studies must be improved to address questions about aetiology, diagnostic accuracy, risk prediction and prognosis.26 These questions cannot usually be tackled using controlled experiments, and this makes systematic reviews of the relevant research more complex. Consumers of research results are frequently confused by conflicting claims about the accuracy of a diagnostic test, or the importance of a postulated aetiological or prognostic factor. They need systematic reviews that explore whether these differences of opinion simply reflect differences in the extent to which biases and the play of chance have been controlled in studies with apparently conflicting results. A rejection of meta-analysis in these circumstances20 should not be used as an excuse for jettisoning attempts to reduce biases in reviews of such observational data.
Fourth, by the time the next edition of this book is published it should be possible to build on assessments of individual empirical studies that have addressed methodological questions, such as those published in the Cochrane Collaboration Methods Groups Newsletter,27 and instead, take account of up-to-date, systematic reviews of such studies. Several such methodological reviews are currently being prepared, and they should begin to appear in The Cochrane Library in 2001.
Finally, I hope that social scientists, health researchers and lay people will be cooperating more frequently in efforts to improve both the science of research synthesis and the design of new studies. Lay people can help to ensure that researchers address important questions, and investigate outcomes that really matter.28,29 Social scientists have a rich experience of research synthesis, which remains largely untapped by health researchers, and they have an especially important role to play in designing reviews and new research to assess the effects of complex interventions and to detect psychologically mediated effects of interventions.30,31 Health researchers, for their part, should help lay people to understand the benefits and limitations of systematic reviews, and encourage social scientists to learn from the methodological developments that have arisen from the recent, intense activity in reviews of health care interventions. Indeed, five years from now there may be a case for reverting to the original title of the book – Systematic Reviews – to reflect the fact that improving the quality of research synthesis presents similar challenges across the whole spectrum of scientific activity.
Iain Chalmers
I am grateful to Mike Clarke, Paul Glasziou, Dave Sackett, and the editors for help in preparing this foreword.
1 Rayleigh, The Right Hon Lord. Presidential address at the 54th meeting of the British Association for the Advancement of Science, Montreal, August/September 1884. London: John Murray. 1889:3–23.
2 Chalmers I, Hedges LV, Cooper H. A brief history of research synthesis. Eval Health Prof (in press).
3 Glass GV. Primary, secondary and meta-analysis of research. Educat Res 1976;5:3–8.
4 Lipsey MW, Wilson DB. The efficacy of psychological, educational, and behavioral treatment. Am Psychol 1993;48:1181–209.
5 Cooper H, Hedges LV. The handbook of research synthesis. New York: Russell Sage Foundation, 1994.
6 Chelimsky E. Politics, Policy, and Research Synthesis. Keynote address before the National Conference on Research Synthesis, sponsored by the Russell Sage Foundation, Washington DC, 21 June 1994.
7 Hunt M. How science takes stock: the story of meta-analysis. New York: Russell Sage Foundation, 1997.
8 Mulrow CD. The medical review article: state of the science. Ann Intern Med 1987;106:485–8.
9 Oxman AD, Guyatt GH. Guidelines for reading literature reviews. Can Med Assoc J 1988;138:697–703.
10 Clarke M, Chalmers I. Discussion sections in reports of controlled trials published in general medical journals: islands in search of continents? JAMA 1998;280:280–2.
11 Soares K, McGrath J, Adams C. Evidence and tardive dyskinesia. Lancet 1996;347:1696–7.
12 Antman EM, Lau J, Kupelnick B, Mosteller F, Chalmers TC. A comparison of results of meta-analyses of randomized control trials and recommendations of clinical experts. JAMA 1992;268:240–8.
13 Moore T. Deadly Medicine. New York: Simon and Schuster, 1995.
14 Furberg CD. Effect of anti-arrhythmic drugs on mortality after myocardial infarction. Am J Cardiol 1983;52:32C–36C.
15 Teo KK, Yusuf S, Furberg CD. Effects of prophylactic anti-arrhythmic drug therapy in acute myocardial infarction. JAMA 1993;270:1589–95.
16 Chalmers I, Altman DG. Systematic reviews. London: BMJ, 1995.
17 Sackett DL, Glasziou P, Chalmers I. Meta-analysis may reduce imprecision, but it can’t reduce bias. Unpublished commentary commissioned by the New England Journal of Medicine, 1997.
18 Stampfer MJ, Goldhaber SZ, Yusuf S, Peto R, Hennekens CH. Effect of intravenous streptokinase on acute myocardial infarction: pooled results from randomized trials. N Engl J Med 1982;307:1180–2.
19 Eysenck HJ. An exercise in mega-silliness. Am Psychol 1978;33:517.
20 Shapiro S. Meta-analysis/shmeta-analysis. Am J Epidemiol 1994;140:771–8.
21 Cochrane Methodology Register. In: The Cochrane Library, Issue 1. Oxford: Update Software, 2001.
22 Ioannidis JP, Cappelleri JC, Lau J. Issues in comparisons between meta-analyses and large trials. JAMA 1998;279:1089–93.
23 Chalmers I. Unbiased, relevant, and reliable assessments in health care. BMJ 1998;317:1167–8.
24 Chalmers I, Altman DG. How can medical journals help prevent poor medical research? Some opportunities presented by electronic publishing. Lancet 1999;353:490–3.
25 International Multicentre Pooled Analysis of Colon Cancer Trials (IMPACT). Efficacy of adjuvant fluorouracil and folinic acid in colon cancer. Lancet 1995;345:939–44.
26 Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, Moher D, Becker BJ, Sipe TA, Thacker SB. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA 2000;283:2008–12.
27 Clarke M, Hopewell S (eds). The Cochrane Collaboration Methods Groups Newsletter. vol 4, 2000.
28 Chalmers I. What do I want from health research and researchers when I am a patient? BMJ 1995;310:1315–18.
29 Oliver S. Users of health services: following their agenda. In: Hood S, Mayall B, Oliver S (eds). Critical issues in social research. Buckingham: Open University Press, 1999:139–153.
30 Boruch RF. Randomized experiments for planning and evaluation. Thousand Oaks: Sage Publications, 1997.
31 Oakley A. Experiments in knowing. Oxford: Polity Press, 2000.
Introduction
MATTHIAS EGGER, GEORGE DAVEY SMITH, KEITH O’ROURKE
Reviews are essential tools for health care workers, researchers, consumers and policy makers who want to keep up with the evidence that is accumulating in their field.
Systematic reviews allow for a more objective appraisal of the evidence than traditional narrative reviews and may thus help to resolve uncertainty when original research, reviews, and editorials disagree.
Meta-analysis, if appropriate, will enhance the precision of estimates of treatment effects, leading to reduced probability of false negative results, and potentially to a more timely introduction of effective treatments.
Exploratory analyses, e.g. regarding subgroups of patients who are likely to respond particularly well to a treatment (or the reverse), may generate promising new research questions to be addressed in future studies.
Systematic reviews may demonstrate the lack of adequate evidence and thus identify areas where further studies are needed.
The volume of data that need to be considered by practitioners and researchers is constantly expanding. In many areas it has become simply impossible for the individual to read, critically evaluate and synthesise the state of current knowledge, let alone keep updating this on a regular basis. Reviews have become essential tools for anybody who wants to keep up with the new evidence that is accumulating in his or her field of interest. Reviews are also required to identify areas where the available evidence is insufficient and further studies are required. However, since Mulrow1 and Oxman and Guyatt2 drew attention to the poor quality of narrative reviews it has become clear that conventional reviews are an unreliable source of information. In response to this situation there has, in recent years, been increasing focus on formal methods of systematically reviewing studies, to produce explicitly formulated, reproducible, and up-to-date summaries of the effects of health care interventions. This is illustrated by the sharp increase in the number of reviews that used formal methods to synthesise evidence (Figure 1.1).
Figure 1.1 Number of publications concerning meta-analysis, 1986–1999. Results from MEDLINE search using text word and Medical Subject Heading (MeSH) “meta-analysis” and text word “systematic review”.
In this chapter we will attempt to clarify terminology and scope, provide some historical background, and examine the potentials and promise of systematic reviews and meta-analysis.
A number of terms are used concurrently to describe the process of systematically reviewing and integrating research evidence, including “systematic review”, “meta-analysis”, “research synthesis”, “overview” and “pooling”. In the foreword to the first edition of this book, Chalmers and Altman3 defined systematic review as a review that has been prepared using a systematic approach to minimising biases and random errors which is documented in a materials and methods section. A systematic review may, or may not, include a meta-analysis: a statistical analysis of the results from independent studies, which generally aims to produce a single estimate of a treatment effect.4 The distinction between systematic review and meta-analysis, which will be used throughout this book, is important because it is always appropriate and desirable to systematically review a body of data, but it may sometimes be inappropriate, or even misleading, to statistically pool results from separate studies.5 Indeed, it is our impression that reviewers often find it hard to resist the temptation of combining studies even when such meta-analysis is questionable or clearly inappropriate.
As discussed in detail in Chapter 12, a clear distinction should be made between meta-analysis of randomised controlled trials and meta-analysis of epidemiological studies. Consider a set of trials of high methodological quality that examined the same intervention in comparable patient populations: each trial will provide an unbiased estimate of the same underlying treatment effect. The variability that is observed between the trials can confidently be attributed to random variation, and meta-analysis should provide an equally unbiased estimate of the treatment effect, with an increase in the precision of this estimate. A fundamentally different situation arises in the case of epidemiological studies, for example case-control studies, cross-sectional studies or cohort studies. Due to the effects of confounding and bias, such observational studies may produce estimates of associations that deviate from the true underlying effect beyond what can be explained by chance alone. Combining a set of epidemiological studies will thus often provide spuriously precise, but biased, estimates of associations. The thorough consideration of heterogeneity between observational study results, in particular of possible sources of confounding and bias, will generally provide more insights than the mechanistic calculation of an overall measure of effect (see Chapters 9 and 12 for examples of observational meta-analyses).
The fundamental difference that exists between observational studies and randomised controlled trials does not mean that the latter are immune to bias. Publication bias and other reporting biases (see Chapter 3) may distort the evidence from both trials and observational studies. Bias may also be introduced if the methodological quality of controlled trials is inadequate6,7 (Chapter 5). It is crucial to understand the limitations of meta-analysis and the importance of exploring sources of heterogeneity and bias (Chapters 8–11), and much emphasis will be given to these issues in this book.
Efforts to compile summaries of research for medical practitioners who struggle with the amount of information that is relevant to medical practice are not new. Chalmers and Tröhler8 drew attention to two journals published in the 18th century in Leipzig and Edinburgh, Commentarii de rebus in scientia naturali et medicina gestis and Medical and Philosophical Commentaries, which published critical appraisals of important new books in medicine, including, for example, William Withering’s now classic Account of the foxglove (1785) on the use of digitalis for treating heart disease. These journals can be seen as the 18th century equivalents of modern day secondary publications such as the ACP Journal Club or Evidence-Based Medicine.
Astronomers long ago noticed that observations of the same objects differed even when made by the same observers under similar conditions. The calculation of the mean as a more precise value than a single measurement had appeared by the end of the 17th century.9 By the late 1700s probability models were being used to represent the uncertainty of observations that was caused by measurement error. Laplace decided to write these models not as the probability that an observation equalled the true value plus some error but as the truth plus the “probability of some error”. In doing this he recognised that, as probabilities of independent errors multiply, he could determine the most likely joint errors, the concept which is at the heart of maximum likelihood estimation.10 Laplace’s method of combining and quantifying uncertainty in the combination of observations required an explicit probability distribution for errors in the individual observations, and no acceptable one existed. Gauss drew on empirical experience and argued that a probability distribution corresponding to what is today referred to as the Normal or Gaussian distribution would be best. This remained speculative until Laplace’s formulation of the central limit theorem – that for large sample sizes the error distribution will always be close to Normal. Hence Gauss’s method was more than just a good guess: it was justified by the central limit theorem. Most statistical techniques used today in meta-analysis follow from Gauss’s and Laplace’s work. Airy disseminated their work in his 1861 “textbook” on “meta-analysis” for astronomers (Figure 1.2), which included the first formulation of a random effects model to allow for heterogeneity in the results.11 Airy offered practical advice and argued for the use of judgement to determine what type of statistical model should be used.
Figure 1.2 The title page of what may be seen as the first “textbook” of meta-analysis, published in 1861.
The statistical basis of meta-analysis reaches back to the 17th century, when in astronomy and geodesy intuition and experience suggested that combinations of data might be better than attempts to choose amongst them (see Box 1.1). In the 20th century the distinguished statistician Karl Pearson (Figure 1.3) was, in 1904, probably the first medical researcher to report the use of formal techniques to combine data from different studies. The rationale for pooling studies put forward by Pearson in his account of the preventive effect of serum inoculations against enteric fever12 is still one of the main reasons for undertaking meta-analysis today:
“Many of the groups … are far too small to allow of any definite opinion being formed at all, having regard to the size of the probable error involved”.12
Figure 1.3 Distinguished statistician Karl Pearson is seen as the first medical researcher to use formal techniques to combine data from different studies.
However, such techniques were not widely used in medicine for many years to come. In contrast to medicine, the social sciences, and in particular psychology and educational research, developed an early interest in the synthesis of research findings. In the 1930s, 80 experiments examining the “potency of moral instruction in modifying conduct” were systematically reviewed.13 In 1976 the psychologist Glass coined the term “meta-analysis” in a paper entitled “Primary, secondary and meta-analysis of research”.14 Three years later the British physician and epidemiologist Archie Cochrane drew attention to the fact that people who want to make informed decisions about health care do not have ready access to reliable reviews of the available evidence.15 In the 1980s meta-analysis became increasingly popular in medicine, particularly in the fields of cardiovascular disease,16,17 oncology,18 and perinatal care.19 Meta-analysis of epidemiological studies20,21 and “cross design synthesis”,22 the integration of observational data with the results from meta-analyses of randomised clinical trials, were also advocated. In the 1990s the foundation of the Cochrane Collaboration (see Chapters 25 and 26) facilitated numerous developments, many of which are documented in this book.
A likely scenario in the early 1980s, when discussing the discharge of a patient who had suffered an uncomplicated myocardial infarction, is as follows: a keen junior doctor asks whether the patient should receive a beta-blocker for secondary prevention of a future cardiac event. After a moment of silence the consultant states that this was a question which should be discussed in detail at the Journal Club on Thursday. The junior doctor (who now regrets that she asked the question) is told to assemble and present the relevant literature. It is late in the evening when she makes her way to the library. The MEDLINE search identifies four clinical trials.23–26 When reviewing the conclusions from these trials (Table 1.1) the doctor finds them to be rather confusing and contradictory. Her consultant points out that the sheer amount of research published makes it impossible to keep track of and critically appraise individual studies. He recommends a good review article. Back in the library the junior doctor finds an article which the BMJ published in 1981 in a “Regular Reviews” section.27 This narrative review concluded:
Thus, despite claims that they reduce arrhythmias, cardiac work, and infarct size, we still have no clear evidence that beta-blockers improve long-term survival after infarction despite almost 20 years of clinical trials.27
Table 1.1 Conclusions from four randomised controlled trials of beta-blockers in secondary prevention after myocardial infarction.
The junior doctor is relieved. She presents the findings of the review article, the Journal Club is a full success and the patient is discharged without a beta-blocker.
Traditional narrative reviews have a number of disadvantages that systematic reviews may overcome. First, the classical review is subjective and therefore prone to bias and error.28 Mulrow showed that among 50 reviews published in the mid-1980s in leading general medicine journals, 49 reviews did not specify the source of the information and failed to perform a standardised assessment of the methodological quality of studies.1 Our junior doctor could have consulted another review of the same topic, published in the European Heart Journal in the same year. This review concluded that “it seems perfectly reasonable to treat patients who have survived an infarction with timolol”.29 Without guidance by formal rules, reviewers will inevitably disagree about issues as basic as what types of studies it is appropriate to include and how to balance the quantitative evidence they provide. Selective inclusion of studies that support the author’s view is common. This is illustrated by the observation that the frequency of citation of clinical trials is related to their outcome, with studies in line with the prevailing opinion being quoted more frequently than unsupportive studies.30,31 Once a set of studies has been assembled a common way to review the results is to count the number of studies supporting various sides of an issue and to choose the view receiving the most votes. This procedure is clearly unsound, since it ignores sample size, effect size, and research design.
It is thus hardly surprising that reviewers using traditional methods often reach opposite conclusions1 and miss small, but potentially important, differences.32 In controversial areas the conclusions drawn from a given body of evidence may be associated more with the speciality of the reviewer than with the available data.33 By systematically identifying, scrutinising, tabulating, and perhaps integrating all relevant studies, systematic reviews allow a more objective appraisal, which can help to resolve uncertainties when the original research, classical reviews and editorial comments disagree.
A single study often fails to detect, or exclude with certainty, a modest, albeit relevant, difference in the effects of two therapies. A trial may thus show no statistically significant treatment effect when in reality such an effect exists – it may produce a false negative result. An examination of clinical trials which reported no statistically significant differences between experimental and control therapy has shown that false negative results in health care research are common: for a clinically relevant difference in outcome the probability of missing this effect given the trial size was greater than 20% in 115 (85%) of the 136 trials examined.34 Similarly, a recent examination of 1941 trials relevant to the treatment of schizophrenia showed that only 58 (3%) studies were large enough to detect an important improvement.35 The number of patients included in trials is thus often inadequate, a situation which has changed little over recent years.34 In some cases, however, the required sample size may be difficult to achieve. A drug which reduces the risk of death from myocardial infarction by 10% could, for example, delay many thousands of deaths each year in the UK alone. In order to detect such an effect with 90% certainty over ten thousand patients in each treatment group would be needed.36
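As a rough check on figures of this kind, the required sample size per arm can be sketched with the standard normal-approximation formula for comparing two proportions. The 10% baseline mortality assumed below is illustrative only, not a figure taken from the text:

```python
from statistics import NormalDist

def n_per_group(p_control, rel_reduction, alpha=0.05, power=0.90):
    """Approximate patients needed per arm to detect a relative risk
    reduction in a two-arm trial (normal approximation, equal arms)."""
    p1 = p_control
    p2 = p_control * (1 - rel_reduction)           # event rate under treatment
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2

# Assumed 10% control-group mortality, 10% relative risk reduction, 90% power:
print(round(n_per_group(0.10, 0.10)))  # well over ten thousand per arm
```

With these assumptions the formula gives roughly 18 000 patients per arm; the exact figure depends strongly on the assumed baseline risk, which is why "over ten thousand" is best read as an order of magnitude.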
The meta-analytic approach appears to be an attractive alternative to such a large, expensive and logistically problematic study. Data from a number of smaller, but comparable, trials evaluating the same or a similar drug are combined. Methods used for meta-analysis employ a weighted average of the results in which the larger trials have more influence than the smaller ones. Comparisons are made exclusively between patients enrolled in the same study. As discussed in detail in Chapter 15, there are a variety of statistical techniques available for this purpose.37,38 In this way the necessary number of patients may be reached, and relatively small effects can be detected or excluded with confidence. Systematic reviews can also contribute to considerations regarding the applicability of study results. The findings of a particular study might be felt to be valid only for a population of patients with the same characteristics as those investigated in the trial. If many trials exist in different groups of patients, with similar results being seen in the various trials, then it can be concluded that the effect of the intervention under study has some generality. By putting together all available data, meta-analyses are also better placed than individual trials to answer questions about whether or not an overall study result varies among subgroups, e.g. among men and women, older and younger patients, or participants with different degrees of severity of disease.
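The weighted-average calculation can be sketched as fixed-effect, inverse-variance pooling of log risk ratios. The three "trials" below are invented for illustration and are not the beta-blocker data discussed later in this chapter:

```python
import math

def pool_fixed(effects, ses):
    """Fixed-effect inverse-variance pooling: each study's log risk
    ratio is weighted by 1/SE^2, so more precise trials count more."""
    weights = [1 / se ** 2 for se in ses]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    return pooled, se_pooled, (pooled - 1.96 * se_pooled,
                               pooled + 1.96 * se_pooled)

# Illustrative log risk ratios and standard errors from three invented trials
log_rr = [math.log(0.75), math.log(0.90), math.log(0.82)]
ses = [0.20, 0.10, 0.15]
pooled, se_p, (lo, hi) = pool_fixed(log_rr, ses)
print(f"pooled RR {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f})")
```

Note that the pooled standard error is smaller than that of any single study, which is the gain in precision the text describes.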
An important advantage of systematic reviews is that they render the review process transparent. In traditional narrative reviews it is often not clear how the conclusions follow from the data examined. In an adequately presented systematic review it should be possible for readers to replicate the quantitative component of the argument. To facilitate this, it is valuable if the data included in meta-analyses are either presented in full or made available to interested readers by the authors. The increased openness required leads to the replacement of unhelpful descriptors such as “no clear evidence”, “some evidence of a trend”, “a weak relationship” and “a strong relationship”.39 Furthermore, performing a meta-analysis may lead to reviewers moving beyond the conclusions authors present in the abstract of papers, to a thorough examination of the actual data.
The tabulation, exploration and evaluation of results are important components of systematic reviews. This can be taken further to explore sources of heterogeneity and test new hypotheses that were not posed in individual studies, for example using “meta-regression” techniques (see also Chapters 8–11). This has been termed the “epidemiology of results”, where the findings of an original study replace the individual as the unit of analysis.40 However, it must be borne in mind that although the studies included may be controlled experiments, the meta-analysis itself is subject to many biases inherent in observational studies.41 Aggregation or ecological bias42 is also a problem unless individual patient data are available (see Chapter 6). Systematic reviews can, nevertheless, lead to the identification of the most promising or the most urgent research question, and may permit a more accurate calculation of the sample sizes needed in future studies (see Chapter 24). This is illustrated by an early meta-analysis of four trials that compared different methods of monitoring the fetus during labour.43 The meta-analysis led to the hypothesis that, compared with intermittent auscultation, continuous fetal heart monitoring reduced the risk of neonatal seizures. This hypothesis was subsequently confirmed in a single randomised trial of almost seven times the size of the four previous studies combined.44
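A minimal version of such a "meta-regression" is a weighted least-squares fit of study effect sizes on a study-level covariate, weighting each study by the inverse of its variance. All values below are hypothetical:

```python
def meta_regression(x, y, ses):
    """Weighted least-squares fit of study effect sizes y on a
    study-level covariate x, with inverse-variance weights 1/SE^2."""
    w = [1 / s ** 2 for s in ses]
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar)
               for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope  # intercept, slope

# Hypothetical studies: covariate = mean age of participants,
# outcome = log risk ratio observed in each study
ages = [50, 55, 60, 65]
log_rr = [-0.35, -0.28, -0.20, -0.15]
ses = [0.10, 0.20, 0.15, 0.12]
intercept, slope = meta_regression(ages, log_rr, ses)
print(f"change in log risk ratio per year of age: {slope:.3f}")
```

Because the unit of analysis is the study, not the patient, any association found this way is itself observational and subject to aggregation bias, as the text warns.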
What conclusions would our junior doctor have reached if she had had access to a meta-analysis? Numerous meta-analyses of trials examining the effect of beta-antagonists have been published since 1981.17,45–48 Figure 1.4 shows the results from the most recent analysis that included 33 randomised comparisons of beta-blockers versus placebo or alternative treatment in patients who had had a myocardial infarction.48 These trials were published between 1967 and 1997. The combined relative risk indicates that beta-blockade starting after the acute infarction reduces subsequent premature mortality by an estimated 20% (relative risk 0.80). A useful way to show the evidence that was available in 1981 and at other points in time is to perform a cumulative meta-analysis.49
Cumulative meta-analysis is defined as the repeated performance of meta-analysis whenever a new relevant trial becomes available for inclusion. This allows the retrospective identification of the point in time when a treatment effect first reached conventional levels of statistical significance. In the case of beta-blockade in secondary prevention of myocardial infarction, a statistically significant beneficial effect (P < 0·05) became evident by 1981 (Figure 1.5). Subsequent trials in a further 15 000 patients simply confirmed this result. This situation has been taken to suggest that further studies in large numbers of patients may be at best superfluous and costly, if not unethical,50 once a statistically significant treatment effect is evident from meta-analysis of the existing smaller trials.
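A cumulative meta-analysis can be sketched by re-pooling (here with fixed-effect, inverse-variance weights) each time a trial, taken in order of publication year, is added. The (year, log risk ratio, standard error) triples below are made up to mimic the beta-blocker example and are not the actual trial data:

```python
import math

def cumulative_meta(trials):
    """Fixed-effect meta-analysis repeated each time a new trial is
    added, in order of publication year (a cumulative meta-analysis)."""
    effects, weights, results = [], [], []
    for year, log_rr, se in sorted(trials):
        effects.append(log_rr)
        weights.append(1 / se ** 2)
        pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
        z = pooled / math.sqrt(1 / sum(weights))
        results.append((year, math.exp(pooled), abs(z) > 1.96))
    return results

# Invented (year, log risk ratio, SE) triples mimicking the example
trials = [(1972, math.log(0.90), 0.30), (1977, math.log(0.80), 0.25),
          (1980, math.log(0.85), 0.20), (1981, math.log(0.70), 0.12)]
for year, rr, significant in cumulative_meta(trials):
    print(year, f"cumulative RR {rr:.2f}", "P<0.05" if significant else "NS")
```

With these made-up numbers the pooled estimate first reaches conventional significance when the 1981 trial is added, mirroring the pattern described in the text.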
Figure 1.4 “Forest plot” showing mortality results from trials of beta-blockers in secondary prevention after myocardial infarction. Trials are ordered by year of publication. The black square and horizontal line correspond to each trial’s risk ratio and 95% confidence interval. The area of the black squares reflects the weight each trial contributes in the meta-analysis. The diamond represents the combined relative risk with its 95% confidence interval, indicating a 20% reduction in the risk of death. See Chapter 2 for a detailed description of forest plots. Adapted from Freemantle et al.48
Figure 1.5 Cumulative meta-analysis of controlled trials of beta-blockers after myocardial infarction. The data correspond to Figure 1.4. A statistically significant (P < 0·05) beneficial effect on mortality became evident in 1981.
Figure 1.6 Cumulative meta-analysis of randomised controlled trials of intravenous streptokinase in myocardial infarction. The number of patients randomised in a total of 33 trials, and national authorities licensing streptokinase for use in myocardial infarction are also shown. aIncludes GISSI-1; bISIS-2.
Systematic review including, if appropriate, a formal meta-analysis is clearly superior to the narrative approach to reviewing research. In addition to providing a precise estimate of the overall treatment effect in some instances, appropriate examination of heterogeneity across individual studies can produce useful information with which to guide rational and cost effective treatment decisions. Systematic reviews are also important to demonstrate areas where the available evidence is insufficient and where new, adequately sized trials are required.
We are grateful to Sir David Cox for providing key references to early statistical work and to Iain Chalmers for his comments on an earlier draft of this chapter. We thank Dr T Johansson and G Enocksson (Pharmacia AB, Stockholm) and Dr A Schirmer and Dr M Thimme (Behring AG, Marburg) for providing data on licensing of streptokinase in different countries. This chapter draws on material published earlier in the BMJ.54
1 Mulrow CD. The medical review article: state of the science. Ann Intern Med 1987;106:485–8.
2 Oxman AD, Guyatt GH. Guidelines for reading literature reviews. Can Med Assoc J 1988;138:697–703.
3 Chalmers I, Altman D (eds). Systematic reviews. London: BMJ Publishing Group, 1995.
4 Huque MF. Experiences with meta-analysis in NDA submissions. Proc Biopharmac Sec Am Stat Assoc 1988;2:28–33.
5 O’Rourke K, Detsky AS. Meta-analysis in medical research: strong encouragement for higher quality in individual research efforts. J Clin Epidemiol 1989;42:1021–4.
6 Schulz KF, Chalmers I, Hayes RJ, Altman D. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995;273:408–12.
7 Moher D, Pham B, Jones A, et al. Does quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses? Lancet 1998;352:609–13.
8 Chalmers I, Tröhler U. Helping physicians to keep abreast of the medical literature: Medical and philosophical commentaries, 1773–1795. Ann Intern Med 2000;133:238–43.
9 Plackett RL. Studies in the history of probability and statistics: VII. The principle of the arithmetic mean. Biometrika 1958;45:130–5.
10 Stigler SM. The history of statistics. The measurement of uncertainty before 1900. Cambridge, MA: The Belknap Press of Harvard University Press, 1990.
11 Airy GB. On the algebraical and numerical theory of errors of observations and the combinations of observations. London: Macmillan, 1861.
12 Pearson K. Report on certain enteric fever inoculation statistics. BMJ 1904;3:1243–6.
13 Peters CC. Summary of the Penn State experiments on the influence of instruction in character education. J Educat Sociol 1933;7:269–72.
14 Glass GV. Primary, secondary and meta-analysis of research. Educat Res 1976;5:3–8.
15 Cochrane AL. 1931–1971: a critical review, with particular reference to the medical profession. In: Medicines for the year 2000. London: Office of Health Economics, 1979:1–11.
16 Baber NS, Lewis JA. Confidence in results of beta-blocker postinfarction trials. BMJ 1982;284:1749–50.
17 Yusuf S, Peto R, Lewis J, Collins R, Sleight P. Beta blockade during and after myocardial infarction: an overview of the randomized trials. Prog Cardiovasc Dis 1985;27:335–71.
18 Early Breast Cancer Trialists’ Collaborative Group. Effects of adjuvant tamoxifen and of cytotoxic therapy on mortality in early breast cancer. An overview of 61 randomized trials among 28 896 women. N Engl J Med 1988;319:1681–92.
19 Chalmers I, Enkin M, Keirse M. Effective care during pregnancy and childbirth. Oxford: Oxford University Press, 1989.
20 Greenland S. Quantitative methods in the review of epidemiologic literature. Epidemiol Rev 1987;9:1–30.
21 Friedenreich CM. Methods for pooled analyses of epidemiologic studies. Epidemiology 1993;4:295–302.
22 General Accounting Office. Cross design synthesis: a new strategy for medical effectiveness research. Washington, DC: GAO, 1992.
23 Reynolds JL, Whitlock RML. Effects of a beta-adrenergic receptor blocker in myocardial infarction treated for one year from onset. Br Heart J 1972;34:252–9.
24 Multicentre International Study: supplementary report. Reduction in mortality after myocardial infarction with long-term beta-adrenoceptor blockade. BMJ 1977;2:419–21.
25 Baber NS, Wainwright Evans D, Howitt G, et al. Multicentre post-infarction trial of propranolol in 49 hospitals in the United Kingdom, Italy and Yugoslavia. Br Heart J 1980;44:96–100.
26 The Norwegian Multicenter Study Group. Timolol-induced reduction in mortality and reinfarction in patients surviving acute myocardial infarction. N Engl J Med 1981;304:801–7.
27 Mitchell JRA. Timolol after myocardial infarction: an answer or a new set of questions? BMJ 1981;282:1565–70.
28 Teagarden JR. Meta-analysis: whither narrative review? Pharmacotherapy 1989;9:274–84.
29 Hampton JR. The use of beta blockers for the reduction of mortality after myocardial infarction. Eur Heart J 1981;2:259–68.
30 Ravnskov U. Cholesterol lowering trials in coronary heart disease: frequency of citation and outcome. BMJ 1992;305:15–19.
31 Gøtzsche PC. Reference bias in reports of drug trials. BMJ 1987;295:654–6.
32 Cooper H, Rosenthal R. Statistical versus traditional procedures for summarising research findings. Psychol Bull 1980;87:442–9.
33 Chalmers TC, Frank CS, Reitman D. Minimizing the three stages of publication bias. JAMA 1990;263:1392–5.
34 Freiman JA, Chalmers TC, Smith H, Kuebler RR. The importance of beta, the type II error, and sample size in the design and interpretation of the randomized controlled trial. In: Bailar JC, Mosteller F, eds. Medical uses of statistics. Boston, MA: NEJM Books, 1992:357–73.
35 Thornley B, Adams C. Content and quality of 2000 controlled trials in schizophrenia over 50 years. BMJ 1998;317:1181–4.
36 Collins R, Keech A, Peto R, et al. Cholesterol and total mortality: need for larger trials. BMJ 1992;304:1689.
37 Berlin J, Laird NM, Sacks HS, Chalmers TC. A comparison of statistical methods for combining event rates from clinical trials. Stat Med 1989;8:141–51.
38 Fleiss JL. The statistical basis of meta-analysis. Stat Meth Med Res 1993;2:121–45.
39 Rosenthal R. An evaluation of procedures and results. In: Wachter KW, Straf ML, eds. The future of meta-analysis. New York: Russel Sage Foundation, 1990:123–33.
40 Jenicek M. Meta-analysis in medicine. Where we are and where we want to go. J Clin Epidemiol 1989;42:35–44.
41 Gelber RD, Goldhirsch A. Interpretation of results from subset analyses within overviews of randomized clinical trials. Stat Med 1987;6:371–8.
42 Piantadosi S, Byar DP, Green SB. The ecological fallacy. Am J Epidemiol 1988;127:893–904.
43 Chalmers I. Randomised controlled trials of fetal monitoring 1973–1977. In: Thalhammer O, Baumgarten K, Pollak A, eds. Perinatal medicine. Stuttgart: Thieme, 1979:260–5.
44 MacDonald D, Grant A, Sheridan-Pereira M, Boylan P, Chalmers I. The Dublin randomised controlled trial of intrapartum fetal heart rate monitoring. Am J Obstet Gynecol 1985;152:524–39.
45 Beta-Blocker Pooling Project Research Group. The beta-blocker pooling project (BBPP): subgroup findings from randomized trials in post-infarction patients. Eur Heart J 1988;9:8–16.
46 Goldstein S. Review of beta blocker myocardial infarction trials. Clin Cardiol 1989;12:54–7.
47 Soriano JB, Hoes AW, Meems L, Grobbee DE. Increased survival with betablockers: importance of ancillary properties. Prog Cardiovasc Dis 1997;39:445–56.
48 Freemantle N, Cleland J, Young P, Mason J, Harrison J. Beta blockade after myocardial infarction: systematic review and meta regression analysis. BMJ 1999;318:1730–7.
49 Lau J, Antman EM, Jimenez-Silva J, Kupelnick B, Mosteller F, Chalmers TC. Cumulative meta-analysis of therapeutic trials for myocardial infarction. N Engl J Med 1992;327:248–54.
50 Murphy DJ, Povar GJ, Pawlson LG. Setting limits in clinical medicine. Arch Intern Med 1994;154:505–12.
51 Gruppo Italiano per lo Studio della Streptochinasi nell’Infarto Miocardico (GISSI). Effectiveness of intravenous thrombolytic treatment in acute myocardial infarction. Lancet 1986;i:397–402.
52 ISIS-2 Collaborative Group. Randomised trial of intravenous streptokinase, oral aspirin, both, or neither among 17187 cases of suspected acute myocardial infarction: ISIS-2. Lancet 1988;ii:349–60.
53 Antman EM, Lau J, Kupelnick B, Mosteller F, Chalmers TC. A comparison of results of meta-analyses of randomized control trials and recommendations of clinical experts. JAMA 1992;268:240–8.
54 Egger M, Davey Smith G. Meta-analysis: potentials and promise. BMJ 1997;315:1371–4.
MATTHIAS EGGER, GEORGE DAVEY SMITH
Reviews and meta-analyses should be as carefully planned as any other research project, with a detailed written protocol prepared in advance.
The formulation of the review question, the a priori definition of eligibility criteria for trials to be included, a comprehensive search for such trials and an assessment of their methodological quality, are central to high quality reviews.
The graphical display of results from individual studies on a common scale (“Forest plot”) is an important step, which allows a visual examination of the degree of heterogeneity between studies.
There are different statistical methods for combining the data in meta-analysis but there is no single “correct” method. A thorough sensitivity analysis should always be performed to assess the robustness of combined estimates to different assumptions, methods and inclusion criteria and to investigate the possible influence of bias.
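As one concrete form of sensitivity analysis, the fixed-effect estimate can be compared with a DerSimonian–Laird random-effects estimate, which adds an estimated between-study variance to each study's weight. The input values below are illustrative only:

```python
def fixed_and_random(effects, ses):
    """Pool study effects under a fixed-effect model and under the
    DerSimonian-Laird random-effects model, returning both estimates
    and the estimated between-study variance tau^2."""
    w = [1 / s ** 2 for s in ses]
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # between-study variance
    w_re = [1 / (s ** 2 + tau2) for s in ses]
    random = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    return fixed, random, tau2

# Illustrative, deliberately heterogeneous log risk ratios
fixed, random, tau2 = fixed_and_random([-0.6, -0.1, 0.2, -0.4],
                                       [0.10, 0.15, 0.20, 0.12])
print(f"fixed {fixed:.3f}, random {random:.3f}, tau^2 {tau2:.3f}")
```

When the two models disagree appreciably, as with these heterogeneous inputs, the combined estimate is sensitive to the choice of model and should be interpreted with particular caution.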
