Systematic Reviews in Health Care - E-Book

Description

The second edition of this best-selling book has been thoroughly revised and expanded to reflect the significant changes and advances made in systematic reviewing. New features include discussion of the rationale for systematic reviews, meta-analyses of prognostic and diagnostic studies, software, and the use of systematic reviews in practice.

Page count: 796

Publication year: 2013




Contents

Contributors

Foreword

Introduction

1 Rationale, potentials, and promise of systematic reviews

Summary points

Systematic review, overview or meta-analysis?

The scope of meta-analysis

Historical notes

Why do we need systematic reviews? A patient with myocardial infarction in 1981

Narrative reviews

Limitations of a single study

A more transparent appraisal

The epidemiology of results

What was the evidence in 1981? Cumulative meta-analysis

Conclusions

Acknowledgements

Part I: Systematic reviews of controlled trials

2 Principles of and procedures for systematic reviews

Summary points

Developing a review protocol

Objectives and eligibility criteria

Literature search

Selection of studies, assessment of methodological quality and data extraction

Presenting, combining and interpreting results

Standardised outcome measure

Graphical display

Heterogeneity between study results

Methods for estimating a combined effect estimate

Bayesian meta-analysis

Sensitivity analysis

Relative or absolute measures of effect?

Conclusions

Acknowledgements

3 Problems and limitations in conducting systematic reviews

Summary points

Garbage in – garbage out?

The dissemination of research findings

Publication bias

Who is responsible for publication bias: authors, reviewers or editors?

The influence of external funding and commercial interests

Other reporting biases

Duplicate (multiple) publication bias

Citation bias

Language bias

Outcome reporting bias

The future of unbiased, systematic reviewing

Acknowledgements

4 Identifying randomised trials

Summary points

Project to identify and re-tag reports of controlled trials in MEDLINE

Project to identify reports of controlled trials in EMBASE

Other databases already searched for reports of controlled trials

Identifying reports of controlled trials by handsearching

Recommended supplementary searches of databases already searched or in progress

Searches of additional databases not currently being searched for The Cochrane Controlled Trials Register

Supplementary searches of journals and conference proceedings

Identifying ongoing and/or unpublished studies

Conclusions

Acknowledgements

Disclaimer

5 Assessing the quality of randomised controlled trials

Summary points

A framework for methodological quality

Dimensions of internal validity

Empirical evidence of bias

Dimensions of external validity

Quality of reporting

Assessing trial quality: composite scales

A critique of two widely used quality scales

Potential of quality scales

Assessing trial quality: the component approach

Incorporating study quality into meta-analysis

Quality as a weight in statistical pooling

Sensitivity analysis

Conclusions

6 Obtaining individual patient data from randomised controlled trials

Summary points

What are individual patient data reviews?

Improved follow-up

Time-to-event analyses

Estimating time-to-event outcomes from published information

Effects in patient subgroups and on different outcomes

How to obtain data that are as complete as possible

What if IPD are not available?

Potential disadvantages of IPD reviews

Examples of how the collection of IPD can make a difference to the findings of a review

Conclusion

7 Assessing the quality of reports of systematic reviews: the QUOROM statement compared to other tools

Summary points

A systematic review of published checklists and scales

The QUOROM Statement

QUOROM as the “gold standard”

Results

Comparison of QUOROM to other checklists and scales

Scales

Assessment of quality across instruments

Discussion

Acknowledgements

Appendix 1

Appendix 2

Part II: Investigating variability within and between studies

8 Going beyond the grand mean: subgroup analysis in meta-analysis of randomised trials

Summary points

Subgroup analysis

Meta-regression: examining gradients in treatment effects

Risk stratification in meta-analysis

Problems in risk stratification

Use of risk indicators in meta-analysis

Confounding

Acknowledgements

9 Why and how sources of heterogeneity should be investigated

Summary points

Clinical and statistical heterogeneity

Serum cholesterol concentration and risk of ischaemic heart disease

Serum cholesterol reduction and risk of ischaemic heart disease

Statistical methods for investigating sources of heterogeneity

The relationship between underlying risk and treatment benefit

The virtues of individual patient data

Conclusions

Acknowledgements

10 Analysing the relationship between treatment benefit and underlying risk: precautions and recommendations

Summary points

Relating treatment effect to underlying risk: conventional approaches

Observed treatment effect versus observed average risk

Observed treatment group risk versus observed control group risk

Relating treatment effect to underlying risk: recently proposed approaches

Re-analysis of the sclerotherapy data

Conclusions

11 Investigating and dealing with publication and other biases

Summary points

Funnel plots

Other graphical methods

Statistical methods to detect and correct for bias

Conclusions

Acknowledgements

Part III: Systematic reviews of observational studies

12 Systematic reviews of observational studies

Summary points

Why do we need systematic reviews of observational studies?

Confounding and bias

Rare insight? The protective effect of beta-carotene that wasn’t

Exploring sources of heterogeneity

Conclusion

Acknowledgements

13 Systematic reviews of evaluations of prognostic variables

Summary points

Systematic review of prognostic studies

Meta-analysis of prognostic factor studies

Discussion

14 Systematic reviews of evaluations of diagnostic and screening tests

Summary points

Rationale for undertaking systematic reviews of studies of test accuracy

Features of studies of test accuracy

Summary measures of diagnostic accuracy

Predictive values

Systematic reviews of studies of diagnostic accuracy

Literature searching

Assessment of study quality

Ascertainment of reference diagnosis

Meta-analysis of studies of diagnostic accuracy

Pooling sensitivities and specificities

Pooling likelihood ratios

Combining symmetric ROC curves: pooling diagnostic odds ratios

Littenberg and Moses methods for estimation of summary ROC curves

Investigation of sources of heterogeneity

Pooling ROC curves

Discussion

Part IV: Statistical methods and computer software

15 Statistical methods for examining heterogeneity and combining results from several studies in meta-analysis

Summary points

Meta-analysis

Formulae for estimates of effect from individual studies

Formulae for deriving a summary (pooled) estimate of the treatment effect by combining trial results (meta-analysis)

Mantel–Haenszel methods

Peto’s odds ratio method

Extending the Peto method for pooling time-to-event data

DerSimonian and Laird random effects models

Confidence interval for overall effect

Test statistic for overall effect

Test statistics of homogeneity

Use of stratified analyses for investigating sources of heterogeneity

Meta-analysis with individual patient data

Additional analyses

Some practical issues

Other methods of meta-analysis

Case study 2: Assertive community treatment for severe mental disorders

Case study 3: Effect of reduced dietary sodium on blood pressure

Discussion

Acknowledgements

16 Effect measures for meta-analysis of trials with binary outcomes

Summary points

Criteria for selection of a summary statistic

Mathematical properties

Ease of interpretation

Odds and risks

Measure of absolute effect – the risk difference

Measures of relative effect – the risk ratio and odds ratio

What is the event?

The L’Abbé plot

Empirical evidence of consistency

Empirical evidence of ease of interpretation

Case studies

Discussion

Acknowledgements

17 Meta-analysis software

Summary points

Commercial meta-analysis software

Freely available meta-analysis software

General statistical software which includes facilities for meta-analysis, or for which meta-analysis routines have been written

Conclusions

18 Meta-analysis in Stata™

Summary points

Getting started

Commands to perform a standard meta-analysis

Dealing with zero cells

Cumulative meta-analysis

Examining the influence of individual studies

Funnel plots and tests for funnel plot asymmetry

Meta-regression

Example 3: trials of BCG vaccine against tuberculosis

Part V: Using systematic reviews in practice

19 Applying the results of systematic reviews at the bedside

Summary points

Determining the applicability of the evidence to an individual patient

Determining the feasibility of the intervention in a particular setting

Determining the benefit:risk ratio in an individual patient

Extrapolating to the individual patient

Incorporating patient values and preferences

Conclusion

Acknowledgements

20 Numbers needed to treat derived from meta-analyses: pitfalls and cautions

Summary points

Describing the effects of treatment

The effect of choice of outcome

Ineffective treatments and adverse effects

The effect of variation in baseline risk

The effects of geographical and secular trends

The effect of clinical setting

Trial participants and patients in routine clinical practice

Pooling assumptions and NNTs

Duration of treatment effect

Interpretation of NNTs

Deriving NNTs

Understanding of NNTs

Conclusion

Acknowledgements

21 Using systematic reviews in clinical guideline development

Summary points

Methods of developing guidelines

Scale of the review process

Using existing systematic reviews

Internal validity of reviews

Dimensions of evidence

External validity of reviews

Updating existing reviews

Conducting reviews within the process of guideline development

The summary metric used in reviews

Concluding remarks

22 Using systematic reviews for evidence based policy making

Summary points

Evidence based medicine and evidence based healthcare

Evidence based decision making for populations

Evidence as the dominant driver

Resource-driven decisions

Value-driven decisions

Who should make value-driven decisions?

23 Using systematic reviews for economic evaluation

Summary points

What is economic evaluation?

Using systematic reviews of the effects of health care in economic evaluation

Systematic review of economic evaluations

Conclusions: the role of systematic review in economic evaluation

Acknowledgements

24 Using systematic reviews and registers of ongoing trials for scientific and ethical trial design, monitoring, and reporting

Summary points

Questionable use of limited resources

Ethical concerns

Distorted research agendas

Registration of planned and ongoing trials to inform decisions on new trials

Registration of trials to reduce reporting biases

Ethical and scientific monitoring of ongoing trials

Interpreting the results from new trials

Acknowledgements

Part VI: The Cochrane Collaboration

25 The Cochrane Collaboration in the 20th century

Summary points

Background and history

Mission, principles and organisation

Collaborative Review Groups

Cochrane Centres

Method Groups

Communication

Output of the Cochrane Collaboration

Conclusion

Acknowledgements

26 The Cochrane Collaboration in the 21st century: ten challenges and one reason why they must be met

Summary points

Ethical challenges

Social challenges

Logistical challenges

Methodological challenges

Why these challenges must be met

Acknowledgements

Index

Stata™ datasets and other additional information can be found on the book’s web site: http://www.systematicreviews.com

© BMJ Publishing Group 2001

Chapter 4 © Crown copyright 2000

Chapter 24 © Crown copyright 1995, 2000

Chapters 25 and 26 © The Cochrane Collaboration 2000

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording and/or otherwise, without the prior written permission of the publishers.

First published in 1995 by the BMJ Publishing Group, BMA House, Tavistock Square, London WC1H 9JR

www.bmjbooks.com

British Library Cataloguing in Publication Data

A catalogue record for this book is available from the British Library

ISBN: 978-0-7279-1488-0

Contributors

Douglas G Altman

Professor of Statistics in Medicine

ICRF Medical Statistics Group

Centre for Statistics in Medicine

Institute of Health Sciences

University of Oxford

Oxford, UK

Gerd Antes

Director

German Cochrane Centre

Institut für Medizinische Biometrie und Medizinische Informatik

University of Freiburg

Freiburg i.B., Germany

Michael J Bradburn

Medical Statistician

ICRF Medical Statistics Group

Centre for Statistics in Medicine

Institute of Health Sciences

University of Oxford, Oxford, UK

Iain Chalmers

Director

UK Cochrane Centre

NHS Research and Development Programme

Oxford, UK

Michael J Clarke

Associate Director (Research)

UK Cochrane Centre

NHS Research and Development Programme

and

Overviews’ Co-ordinator

Clinical Trials Service Unit

Oxford, UK

George Davey Smith

Professor of Clinical Epidemiology

Division of Epidemiology and MRC Health Services Research Collaboration

Department of Social Medicine

University of Bristol

Bristol, UK

Jonathan J Deeks

Senior Medical Statistician

Systematic Review Development Programme

Centre for Statistics in Medicine

Institute of Health Sciences

University of Oxford

Oxford, UK

Kay Dickersin

Associate Professor

Department of Community Health

Brown University

Rhode Island, USA

Catherine Dubé

Department of Medicine

Division of Gastro-enterology

University of Ottawa

Ottawa, Canada

Shah Ebrahim

Professor in Epidemiology of Ageing

Division of Epidemiology and MRC Health Services Research Collaboration

Department of Social Medicine

University of Bristol

Bristol, UK

Martin Eccles

Professor of Clinical Effectiveness

Centre for Health Services Research

University of Newcastle upon Tyne

Newcastle upon Tyne, UK

Matthias Egger

Senior Lecturer in Epidemiology and Public Health Medicine

Division of Health Services Research and MRC Health Services Research Collaboration

Department of Social Medicine

University of Bristol

Bristol, UK

Nick Freemantle

Reader in Epidemiology and Biostatistics

Medicines Evaluation Group

Centre for Health Economics

University of York

York, UK

J A Muir Gray

Director

Institute of Health Sciences

University of Oxford

Oxford, UK

Peter Jüni

Research Fellow

MRC Health Services Research Collaboration

Department of Social Medicine

University of Bristol

Bristol, UK

Carol Lefebvre

Information Specialist

UK Cochrane Centre

NHS Research and Development Programme

Oxford, UK

James Mason

Senior Research Fellow

Medicines Evaluation Group

Centre for Health Economics

University of York

York, UK

Finlay A McAlister

Assistant Professor

Division of General Internal Medicine

University of Alberta Hospital

Edmonton, Canada

David Moher

Director

Thomas C Chalmers Centre for Systematic Reviews

Children’s Hospital of Eastern Ontario Research Institute

University of Ottawa

Ottawa, Canada

Miranda Mugford

Professor of Health Economics

School of Health Policy and Practice

University of East Anglia

Norwich, UK

Keith O’Rourke

Statistician

Clinical Epidemiology Unit

Loeb Research Institute

Ottawa Hospital

Ottawa, Canada

Andrew D Oxman

Director

Health Services Research Unit

National Institute of Public Health

Oslo, Norway

Martin Schneider

Specialist Registrar in Internal Medicine

Department of Medicine

University Hospitals

Geneva, Switzerland

Stephen J Sharp

Medical Statistician

GlaxoWellcome Research and Development

London, UK

Beverley Shea

Loeb Health Research Institute

Clinical Epidemiology Unit

Ottawa Hospital

University of Ottawa

Ottawa, Canada

Jonathan A C Sterne

Senior Lecturer in Medical Statistics

Division of Epidemiology and MRC Health Services Research Collaboration

Department of Social Medicine

University of Bristol

Bristol, UK

Lesley A Stewart

Head

Meta-Analysis Group

MRC Clinical Trials Unit

London, UK

Alexander J Sutton

Lecturer in Medical Statistics

Department of Epidemiology and Public Health

University of Leicester

Leicester, UK

Simon G Thompson

Director

MRC Biostatistics Unit

Institute of Public Health

University of Cambridge

Cambridge, UK

Foreword

“If, as is sometimes supposed, science consisted in nothing but the laborious accumulation of facts, it would soon come to a standstill, crushed, as it were, under its own weight. The suggestion of a new idea, or the detection of a law, supersedes much that has previously been a burden on the memory, and by introducing order and coherence facilitates the retention of the remainder in an available form…. Two processes are thus at work side by side, the reception of new material and the digestion and assimilation of the old; and as both are essential we may spare ourselves the discussion of their relative importance. One remark, however, should be made. The work which deserves, but I am afraid does not always receive, the most credit is that in which discovery and explanation go hand in hand, in which not only are new facts presented, but their relation to old ones is pointed out.”1

The above quotation is from the presidential address given by Lord Rayleigh, Professor of Physics at Cambridge University, at the meeting of the British Association for the Advancement of Science held in Montreal in 1884. More than a century later, research funding agencies, research ethics committees, researchers and journal editors in the field of health research have only just begun to take Lord Rayleigh’s injunction seriously. Research synthesis has a long history and has been developed in many spheres of scientific activity.2 Social scientists in the United States, in particular, have been actively discussing, developing and applying methods for this kind of research for more than a quarter of a century,3–5 and, when the quality of the original research has been adequate, research syntheses have had an important impact on policy and practice.6,7

It was not until the late 1980s that Cynthia Mulrow8 and Andy Oxman9 began to spell out, for a medical readership, the scientific issues that need to be addressed in research synthesis. During the 1990s, there was an encouraging growth of respect for scientific principles among those preparing “stand alone” reviews, particularly reviews of research on the effects of health care interventions. Unfortunately, there is still little evidence that the same scientific principles are recognised as relevant in preparing the “discussion” sections of reports of new research. An analysis of papers in five influential general medical journals showed that the results of new studies are only very rarely presented in the context of systematic reviews of relevant earlier studies.10

In an important step in the right direction, the British Medical Journal, acknowledging the cumulative nature of scientific evidence, now publishes with each report of new research a summary of what is already known on the topic addressed, and what the new study has added.

As a result of the slow progress in adopting scientifically defensible methods of research synthesis in health care, the limited resources made available for research continue to be squandered on ill-conceived studies,11 and avoidable confusion continues to result from failure to review research systematically and set the results of new studies in the context of other relevant research. As a result, patients and others continue to suffer unnecessarily.12

Take, for example, the disastrous effects of giving class 1 anti-arrhythmic drugs to people having heart attacks, which has been estimated to have caused tens of thousands of premature deaths in the United States alone.13 The fact that the theoretical potential of these drugs was not being realised in practice could have been recognised many years earlier than it was. The warning signs were there in one of the first systematic reviews of controlled trials in health care,14 yet more than 50 trials of these drugs were conducted over nearly two decades15 before official warnings about their lethal impact were issued. Had the new data generated by each one of these trials been presented within the context of systematic reviews of the results of all previous trials, the lethal potential of this class of drugs would have become clear earlier, and an iatrogenic disaster would have been contained, if not avoided.

Failure to improve the quality of reviews by taking steps to reduce biases and the effects of the play of chance – whether in “stand alone” reviews or in reports of new evidence – will continue to have adverse consequences for people using health services. The first edition of this book helped to raise awareness of this,16 and its main messages were well received. However, the call to improve the scientific quality of reviews has not been accepted by everyone. In 1998, for example, editors at the New England Journal of Medicine rejected a commentary they had commissioned because they felt that their readers would not understand its main message – that meta-analysis (statistical synthesis of the results of separate but similar studies) could not be expected to reduce biases in reviews, but only to reduce imprecision.17 The journal’s rejection was particularly ironic in view of the fact that it had published one of the earliest and most important systematic reviews ever done.18

It was because of the widespread and incautious use of the term “meta-analysis” that the term “systematic reviews” was chosen as the title for the first edition of this book.16 Although meta-analysis may reduce statistical imprecision and may sometimes hint at biases in reviews (for example through tests of homogeneity, or funnel plots), it can never prevent biases. As in many forms of research, even elegant statistical manipulations, when performed on biased rubble, are incapable of generating unbiased precious stones. As Matthias Egger has put it – the diamond used to represent a summary statistic cannot be assumed to be the jewel in the crown!

The term “meta-analysis” has become so attractive to some people that they have dubbed themselves “meta-analysts”, and so repellent to others that they have lampooned it with dismissive “synonyms” such as “mega-silliness”19 and “shmeta-analysis”.20 Current discussions about ways of reducing biases and imprecision in reviews of research must not be allowed to be held hostage by ambiguous use of the term “meta-analysis”. Hopefully, both the title and the organisation of the contents of this second edition of the book will help to promote more informed and specific criticisms of reviews, and set meta-analysis in a proper context.

Interest in methods for research synthesis among health researchers and practitioners has burgeoned during the five years that have passed between the first and second editions of this book. Whereas the first edition16 had eight chapters and was just over 100 pages long, the current edition has 26 chapters and is nearly 500 pages long. The first edition of the book contained a methodological bibliography of less than 400 citations. Because that bibliography has now grown to over 2500 citations, it is now published and updated regularly in The Cochrane Methodology Register.21 These differences reflect the breathtaking pace of methodological developments in this sphere of research. Against this background it is easy to understand why I am so glad that Matthias Egger and George Davey Smith – who have contributed so importantly to these developments – agreed to co-edit the second edition of this book with Doug Altman.

After an introductory editorial chapter, the new edition begins with six chapters concerned principally with preventing and detecting biases in systematic reviews of controlled experiments. The important issue of investigating variability within and between studies is tackled in the four chapters that follow. The “methodological tiger country” of systematic reviews of observational studies is then explored in three chapters. Statistical methods and computer software are addressed in a section with four chapters. The book concludes with six chapters about using systematic reviews in practice, and two about the present and future of the Cochrane Collaboration.

Looking ahead, I hope that there will have been a number of further developments in this field before the third edition of the book is prepared. First and foremost, there needs to be wider acknowledgement of the essential truth of Lord Rayleigh’s injunction, particularly within the research community and among funders. Not only is research synthesis an essential process for taking stock of the dividends resulting from the investment of effort and other resources in research, it is also intellectually and methodologically challenging, and this should be reflected in the criteria used to judge the worth of academic work. Hopefully we will have seen the back of the naïve notion that when the results of systematic reviews differ from those of large trials, the latter should be assumed to be “the truth”.22

Second, I hope that people preparing systematic reviews, rather than having to detect and try to take account of biases retrospectively, will increasingly be able to draw on material that is less likely to be biased. Greater efforts are needed to reduce biases in the individual studies that will contribute to reviews.23 Reporting biases need to be reduced by registration of studies prior to their results being known, and by researchers recognising that they have an ethical and scientific responsibility to report findings of well-designed studies, regardless of the results.24 And I hope that there will be greater collaboration in designing and conducting systematic reviews prospectively, as a contribution to reducing biases in the review process, as pioneered in the International Multicentre Pooled Analysis of Colon Cancer Trials.25

Third, the quality of reviews of observational studies must be improved to address questions about aetiology, diagnostic accuracy, risk prediction and prognosis.26 These questions cannot usually be tackled using controlled experiments, which makes systematic reviews of the relevant research more complex. Consumers of research results are frequently confused by conflicting claims about the accuracy of a diagnostic test, or the importance of a postulated aetiological or prognostic factor. They need systematic reviews that explore whether these differences of opinion simply reflect differences in the extent to which biases and the play of chance have been controlled in studies with apparently conflicting results. A rejection of meta-analysis in these circumstances20 should not be used as an excuse for jettisoning attempts to reduce biases in reviews of such observational data.

Fourth, by the time the next edition of this book is published it should be possible to build on assessments of individual empirical studies that have addressed methodological questions, such as those published in the Cochrane Collaboration Methods Groups Newsletter,27 and instead, take account of up-to-date, systematic reviews of such studies. Several such methodological reviews are currently being prepared, and they should begin to appear in The Cochrane Library in 2001.

Finally, I hope that social scientists, health researchers and lay people will be cooperating more frequently in efforts to improve both the science of research synthesis and the design of new studies. Lay people can help to ensure that researchers address important questions, and investigate outcomes that really matter.28,29 Social scientists have a rich experience of research synthesis, which remains largely untapped by health researchers, and they have an especially important role to play in designing reviews and new research to assess the effects of complex interventions and to detect psychologically mediated effects of interventions.30,31 Health researchers, for their part, should help lay people to understand the benefits and limitations of systematic reviews, and encourage social scientists to learn from the methodological developments that have arisen from the recent, intense activity in reviews of health care interventions. Indeed, five years from now there may be a case for reverting to the original title of the book – Systematic Reviews – to reflect the fact that improving the quality of research synthesis presents similar challenges across the whole spectrum of scientific activity.

Iain Chalmers

Acknowledgements

I am grateful to Mike Clarke, Paul Glasziou, Dave Sackett, and the editors for help in preparing this foreword.

1 Rayleigh, The Right Hon Lord. Presidential address at the 54th meeting of the British Association for the Advancement of Science, Montreal, August/September 1884. London: John Murray, 1889:3–23.

2 Chalmers I, Hedges LV, Cooper H. A brief history of research synthesis. Eval Health Prof (in press).

3 Glass GV. Primary, secondary and meta-analysis of research. Educat Res 1976;5:3–8.

4 Lipsey MW, Wilson DB. The efficacy of psychological, educational, and behavioral treatment. Am Psychol 1993;48:1181–209.

5 Cooper H, Hedges LV. The handbook of research synthesis. New York: Russell Sage Foundation, 1994.

6 Chelimsky E. Politics, Policy, and Research Synthesis. Keynote address before the National Conference on Research Synthesis, sponsored by the Russell Sage Foundation, Washington DC, 21 June 1994.

7 Hunt M. How science takes stock: the story of meta-analysis. New York: Russell Sage Foundation, 1997.

8 Mulrow CD. The medical review article: state of the science. Ann Intern Med 1987;106:485–8.

9 Oxman AD, Guyatt GH. Guidelines for reading literature reviews. Can Med Assoc J 1988;138:697–703.

10 Clarke M, Chalmers I. Discussion sections in reports of controlled trials published in general medical journals: islands in search of continents? JAMA 1998;280:280–2.

11 Soares K, McGrath J, Adams C. Evidence and tardive dyskinesia. Lancet 1996;347:1696–7.

12 Antman EM, Lau J, Kupelnick B, Mosteller F, Chalmers TC. A comparison of results of meta-analyses of randomized control trials and recommendations of clinical experts. JAMA 1992;268:240–8.

13 Moore T. Deadly Medicine. New York: Simon and Schuster, 1995.

14 Furberg CD. Effect of anti-arrhythmic drugs on mortality after myocardial infarction. Am J Cardiol 1983;52:32C–36C.

15 Teo KK, Yusuf S, Furberg CD. Effects of prophylactic anti-arrhythmic drug therapy in acute myocardial infarction. JAMA 1993;270:1589–95.

16 Chalmers I, Altman DG. Systematic reviews. London: BMJ, 1995.

17 Sackett DL, Glasziou P, Chalmers I. Meta-analysis may reduce imprecision, but it can’t reduce bias. Unpublished commentary commissioned by the New England Journal of Medicine, 1997.

18 Stampfer MJ, Goldhaber SZ, Yusuf S, Peto R, Hennekens CH. Effect of intravenous streptokinase on acute myocardial infarction: pooled results from randomized trials. N Engl J Med 1982;307:1180–2.

19 Eysenck HJ. An exercise in mega-silliness. Am Psychol 1978;33:517.

20 Shapiro S. Meta-analysis/shmeta-analysis. Am J Epidemiol 1994;140:771–8.

21 Cochrane Methodology Register. In: The Cochrane Library, Issue 1. Oxford: Update Software, 2001.

22 Ioannidis JP, Cappelleri JC, Lau J. Issues in comparisons between meta-analyses and large trials. JAMA 1998;279:1089–93.

23 Chalmers I. Unbiased, relevant, and reliable assessments in health care. BMJ 1998;317:1167–8.

24 Chalmers I, Altman DG. How can medical journals help prevent poor medical research? Some opportunities presented by electronic publishing. Lancet 1999;353:490–3.

25 International Multicentre Pooled Analysis of Colon Cancer Trials (IMPACT). Efficacy of adjuvant fluorouracil and folinic acid in colon cancer. Lancet 1995;345:939–44.

26 Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, Moher D, Becker BJ, Sipe TA, Thacker SB. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA 2000;283:2008–12.

27 Clarke M, Hopewell S (eds). The Cochrane Collaboration Methods Groups Newsletter. vol 4, 2000.

28 Chalmers I. What do I want from health research and researchers when I am a patient? BMJ 1995;310:1315–18.

29 Oliver S. Users of health services: following their agenda. In: Hood S, Mayall B, Oliver S (eds). Critical issues in social research. Buckingham: Open University Press, 1999:139–153.

30 Boruch RF. Randomized experiments for planning and evaluation. Thousand Oaks: Sage Publications,1997.

31 Oakley A. Experiments in knowing. Oxford: Polity Press, 2000.


1

Rationale, potentials, and promise of systematic reviews

MATTHIAS EGGER, GEORGE DAVEY SMITH, KEITH O’ROURKE

Summary points

Reviews are essential tools for health care workers, researchers, consumers and policy makers who want to keep up with the evidence that is accumulating in their field.

Systematic reviews allow for a more objective appraisal of the evidence than traditional narrative reviews and may thus help to resolve uncertainty when original research, reviews, and editorials disagree.

Meta-analysis, if appropriate, will enhance the precision of estimates of treatment effects, leading to reduced probability of false negative results, and potentially to a more timely introduction of effective treatments.

Exploratory analyses, e.g. regarding subgroups of patients who are likely to respond particularly well to a treatment (or the reverse), may generate promising new research questions to be addressed in future studies.

Systematic reviews may demonstrate the lack of adequate evidence and thus identify areas where further studies are needed.

The volume of data that need to be considered by practitioners and researchers is constantly expanding. In many areas it has become simply impossible for the individual to read, critically evaluate and synthesise the state of current knowledge, let alone keep updating this on a regular basis. Reviews have become essential tools for anybody who wants to keep up with the new evidence that is accumulating in his or her field of interest. Reviews are also required to identify areas where the available evidence is insufficient and further studies are required. However, since Mulrow1 and Oxman and Guyatt2 drew attention to the poor quality of narrative reviews it has become clear that conventional reviews are an unreliable source of information. In response to this situation there has, in recent years, been increasing focus on formal methods of systematically reviewing studies, to produce explicitly formulated, reproducible, and up-to-date summaries of the effects of health care interventions. This is illustrated by the sharp increase in the number of reviews that used formal methods to synthesise evidence (Figure 1.1).

Figure 1.1 Number of publications concerning meta-analysis, 1986–1999. Results from MEDLINE search using text word and medical subject (MESH) heading “meta-analysis” and text word “systematic review”.

In this chapter we will attempt to clarify terminology and scope, provide some historical background, and examine the potentials and promise of systematic reviews and meta-analysis.

Systematic review, overview or meta-analysis?

A number of terms are used concurrently to describe the process of systematically reviewing and integrating research evidence, including “systematic review”, “meta-analysis”, “research synthesis”, “overview” and “pooling”. In the foreword to the first edition of this book, Chalmers and Altman3 defined systematic review as a review that has been prepared using a systematic approach to minimising biases and random errors which is documented in a materials and methods section. A systematic review may, or may not, include a meta-analysis: a statistical analysis of the results from independent studies, which generally aims to produce a single estimate of a treatment effect.4 The distinction between systematic review and meta-analysis, which will be used throughout this book, is important because it is always appropriate and desirable to systematically review a body of data, but it may sometimes be inappropriate, or even misleading, to statistically pool results from separate studies.5 Indeed, it is our impression that reviewers often find it hard to resist the temptation of combining studies even when such meta-analysis is questionable or clearly inappropriate.

The scope of meta-analysis

As discussed in detail in Chapter 12, a clear distinction should be made between meta-analysis of randomised controlled trials and meta-analysis of epidemiological studies. Consider a set of trials of high methodological quality that examined the same intervention in comparable patient populations: each trial will provide an unbiased estimate of the same underlying treatment effect. The variability that is observed between the trials can confidently be attributed to random variation, and meta-analysis should provide an equally unbiased estimate of the treatment effect, with an increase in the precision of this estimate. A fundamentally different situation arises in the case of epidemiological studies, for example case-control studies, cross-sectional studies or cohort studies. Due to the effects of confounding and bias, such observational studies may produce estimates of associations that deviate from the underlying effect beyond what can be explained by chance. Combining a set of epidemiological studies will thus often provide spuriously precise, but biased, estimates of associations. The thorough consideration of heterogeneity between observational study results, in particular of possible sources of confounding and bias, will generally provide more insights than the mechanistic calculation of an overall measure of effect (see Chapters 9 and 12 for examples of observational meta-analyses).

The fundamental difference that exists between observational studies and randomised controlled trials does not mean that the latter are immune to bias. Publication bias and other reporting biases (see Chapter 3) may distort the evidence from both trials and observational studies. Bias may also be introduced if the methodological quality of controlled trials is inadequate6,7 (Chapter 5). It is crucial to understand the limitations of meta-analysis and the importance of exploring sources of heterogeneity and bias (Chapters 8–11), and much emphasis will be given to these issues in this book.

Historical notes

Efforts to compile summaries of research for medical practitioners who struggle with the amount of information that is relevant to medical practice are not new. Chalmers and Tröhler8 drew attention to two journals published in the 18th century in Leipzig and Edinburgh, Commentarii de rebus in scientia naturali et medicina gestis and Medical and Philosophical Commentaries, which published critical appraisals of important new books in medicine, including, for example, William Withering’s now classic Account of the foxglove (1785) on the use of digitalis for treating heart disease. These journals can be seen as the 18th century equivalents of modern day secondary publications such as the ACP Journal Club or Evidence-Based Medicine.

Box 1.1 From Laplace and Gauss to the first textbook of meta-analysis

Astronomers long ago noticed that observations of the same objects differed even when made by the same observers under similar conditions. The calculation of the mean as a more precise value than a single measurement had appeared by the end of the 17th century.9 By the late 1700s probability models were being used to represent the uncertainty of observations that was caused by measurement error. Laplace decided to write these models not as the probability that an observation equalled the true value plus some error, but as the truth plus the “probability of some error”. In doing this he recognised that, since the probabilities of independent errors multiply, he could determine the most likely joint set of errors, the concept at the heart of maximum likelihood estimation.10 Laplace’s method of combining and quantifying uncertainty in the combination of observations required an explicit probability distribution for errors in the individual observations, and no acceptable one existed. Gauss drew on empirical experience and argued that a probability distribution corresponding to what is today referred to as the Normal or Gaussian distribution would be best. This remained speculative until Laplace’s formulation of the central limit theorem – that for large sample sizes the error distribution will always be close to Normal. Hence Gauss’s method was more than just a good guess: it was justified by the central limit theorem. Most statistical techniques used today in meta-analysis follow from Gauss’s and Laplace’s work. Airy disseminated their work in his 1861 “textbook” on “meta-analysis” for astronomers (Figure 1.2), which included the first formulation of a random effects model to allow for heterogeneity in the results.11 Airy offered practical advice and argued for the use of judgement to determine what type of statistical model should be used.

Figure 1.2 The title page of what may be seen as the first “textbook” of meta-analysis, published in 1861.
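The central limit theorem at the centre of Box 1.1 is easy to demonstrate numerically: means of many independent errors drawn from a decidedly non-Normal distribution are themselves approximately Normally distributed. A minimal sketch in Python (the Uniform error distribution, the group size of 30 and the 5000 repetitions are arbitrary illustrative choices, not anything from the historical work described):

```python
import random
import statistics

random.seed(42)

# Individual "observation errors" drawn from Uniform(-1, 1): clearly not Normal.
def observation_error():
    return random.uniform(-1.0, 1.0)

# Each combined observation is the mean of n independent errors.
n = 30
means = [statistics.mean(observation_error() for _ in range(n))
         for _ in range(5000)]

# By the central limit theorem the means are approximately Normal with
# mean 0 and standard deviation sqrt(variance_of_uniform / n) = sqrt((1/3) / n).
expected_sd = (1.0 / 3.0 / n) ** 0.5
print(round(statistics.mean(means), 3), round(statistics.stdev(means), 3))
```

A histogram of `means` would show the familiar bell shape, which is why Gauss’s error curve proved such a serviceable default long before the theorem formally justified it.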

The statistical basis of meta-analysis reaches back to the 17th century, when intuition and experience in astronomy and geodesy suggested that combinations of data might be better than attempts to choose amongst them (see Box 1.1). In the 20th century the distinguished statistician Karl Pearson (Figure 1.3) was, in 1904, probably the first medical researcher to report the use of formal techniques to combine data from different studies. The rationale for pooling studies put forward by Pearson in his account of the preventive effect of serum inoculations against enteric fever12 is still one of the main reasons for undertaking meta-analysis today:

“Many of the groups … are far too small to allow of any definite opinion being formed at all, having regard to the size of the probable error involved”.12

Figure 1.3 Distinguished statistician Karl Pearson is seen as the first medical researcher to use formal techniques to combine data from different studies.

However, such techniques were not widely used in medicine for many years to come. In contrast to medicine, the social sciences, in particular psychology and educational research, developed an early interest in the synthesis of research findings. In the 1930s, 80 experiments examining the “potency of moral instruction in modifying conduct” were systematically reviewed.13 In 1976 the psychologist Glass coined the term “meta-analysis” in a paper entitled “Primary, secondary and meta-analysis of research”.14 Three years later the British physician and epidemiologist Archie Cochrane drew attention to the fact that people who want to make informed decisions about health care do not have ready access to reliable reviews of the available evidence.15 In the 1980s meta-analysis became increasingly popular in medicine, particularly in the fields of cardiovascular disease,16,17 oncology,18 and perinatal care.19 Meta-analysis of epidemiological studies20,21 and “cross design synthesis”,22 the integration of observational data with the results from meta-analyses of randomised clinical trials, were also advocated. In the 1990s the foundation of the Cochrane Collaboration (see Chapters 25 and 26) facilitated numerous developments, many of which are documented in this book.

Why do we need systematic reviews? A patient with myocardial infarction in 1981

A likely scenario in the early 1980s, when discussing the discharge of a patient who had suffered an uncomplicated myocardial infarction, is as follows: a keen junior doctor asks whether the patient should receive a beta-blocker for secondary prevention of a future cardiac event. After a moment of silence the consultant states that this was a question which should be discussed in detail at the Journal Club on Thursday. The junior doctor (who now regrets that she asked the question) is told to assemble and present the relevant literature. It is late in the evening when she makes her way to the library. The MEDLINE search identifies four clinical trials.23–26 When reviewing the conclusions from these trials (Table 1.1) the doctor finds them to be rather confusing and contradictory. Her consultant points out that the sheer amount of research published makes it impossible to keep track of and critically appraise individual studies. He recommends a good review article. Back in the library the junior doctor finds an article which the BMJ published in 1981 in a “Regular Reviews” section.27 This narrative review concluded:

Thus, despite claims that they reduce arrhythmias, cardiac work, and infarct size, we still have no clear evidence that beta-blockers improve long-term survival after infarction despite almost 20 years of clinical trials.27

Table 1.1 Conclusions from four randomised controlled trials of beta-blockers in secondary prevention after myocardial infarction.

The junior doctor is relieved. She presents the findings of the review article, the Journal Club is a great success, and the patient is discharged without a beta-blocker.

Narrative reviews

Traditional narrative reviews have a number of disadvantages that systematic reviews may overcome. First, the classical review is subjective and therefore prone to bias and error.28 Mulrow showed that among 50 reviews published in the mid 1980s in leading general medicine journals, 49 reviews did not specify the source of the information and failed to perform a standardised assessment of the methodological quality of studies.1 Our junior doctor could have consulted another review of the same topic, published in the European Heart Journal in the same year. This review concluded that “it seems perfectly reasonable to treat patients who have survived an infarction with timolol”.29 Without guidance by formal rules, reviewers will inevitably disagree about issues as basic as what types of studies it is appropriate to include and how to balance the quantitative evidence they provide. Selective inclusion of studies that support the author’s view is common. This is illustrated by the observation that the frequency of citation of clinical trials is related to their outcome, with studies in line with the prevailing opinion being quoted more frequently than unsupportive studies.30,31 Once a set of studies has been assembled a common way to review the results is to count the number of studies supporting various sides of an issue and to choose the view receiving the most votes. This procedure is clearly unsound, since it ignores sample size, effect size, and research design.
It is thus hardly surprising that reviewers using traditional methods often reach opposite conclusions1 and miss small, but potentially important, differences.32 In controversial areas the conclusions drawn from a given body of evidence may be associated more with the speciality of the reviewer than with the available data.33 By systematically identifying, scrutinising, tabulating, and perhaps integrating all relevant studies, systematic reviews allow a more objective appraisal, which can help to resolve uncertainties when the original research, classical reviews and editorial comments disagree.

Limitations of a single study

A single study often fails to detect, or exclude with certainty, a modest, albeit relevant, difference in the effects of two therapies. A trial may thus show no statistically significant treatment effect when in reality such an effect exists – it may produce a false negative result. An examination of clinical trials which reported no statistically significant differences between experimental and control therapy has shown that false negative results in health care research are common: for a clinically relevant difference in outcome the probability of missing this effect given the trial size was greater than 20% in 115 (85%) of the 136 trials examined.34 Similarly, a recent examination of 1941 trials relevant to the treatment of schizophrenia showed that only 58 (3%) studies were large enough to detect an important improvement.35 The number of patients included in trials is thus often inadequate, a situation which has changed little over recent years.34 In some cases, however, the required sample size may be difficult to achieve. A drug which reduces the risk of death from myocardial infarction by 10% could, for example, delay many thousands of deaths each year in the UK alone. In order to detect such an effect with 90% certainty over ten thousand patients in each treatment group would be needed.36
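The order of magnitude quoted above can be reproduced with the standard Normal-approximation sample size formula for comparing two proportions. In this sketch the 15% control-group mortality is an assumed illustrative baseline, not a figure from the text; the published calculation36 may rest on different assumptions:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(p_control, relative_reduction, alpha=0.05, power=0.90):
    """Approximate patients needed per group to detect a given relative
    risk reduction with a two-sided test (Normal approximation)."""
    p_treat = p_control * (1 - relative_reduction)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_b = NormalDist().inv_cdf(power)          # about 1.28 for 90% power
    variance = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return ceil((z_a + z_b) ** 2 * variance / (p_control - p_treat) ** 2)

# A 10% relative reduction from an assumed 15% baseline mortality:
# well over ten thousand patients per group are required.
print(n_per_group(0.15, 0.10))
```

Lower baseline risks push the requirement higher still, which is one reason why pooling several moderately sized trials is often the only realistic alternative to a single enormous one.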

The meta-analytic approach appears to be an attractive alternative to such a large, expensive and logistically problematic study. Data from patients in trials evaluating the same or a similar drug in a number of smaller, but comparable, studies are considered. Methods used for meta-analysis employ a weighted average of the results in which the larger trials have more influence than the smaller ones. Comparisons are made exclusively between patients enrolled in the same study. As discussed in detail in Chapter 15, there are a variety of statistical techniques available for this purpose.37,38 In this way the necessary number of patients may be reached, and relatively small effects can be detected or excluded with confidence. Systematic reviews can also contribute to considerations regarding the applicability of study results. The findings of a particular study might be felt to be valid only for a population of patients with the same characteristics as those investigated in the trial. If many trials exist in different groups of patients, with similar results being seen in the various trials, then it can be concluded that the effect of the intervention under study has some generality. By putting together all available data meta-analyses are also better placed than individual trials to answer questions regarding whether or not an overall study result varies among subgroups, e.g. among men and women, older and younger patients, or participants with different degrees of severity of disease.
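The weighted-average principle can be made concrete with the common inverse-variance (fixed effect) method, one of the family of techniques referred to above. The log relative risks and standard errors below are invented illustrative numbers, not data from any of the trials discussed:

```python
from math import exp, sqrt

# Hypothetical trials: (log relative risk, standard error). Larger trials
# have smaller standard errors and therefore carry more weight.
trials = [(-0.30, 0.25), (-0.10, 0.15), (-0.25, 0.10)]

# Each trial is weighted by the inverse of the variance of its estimate.
weights = [1 / se ** 2 for _, se in trials]
pooled = sum(w * est for (est, _), w in zip(trials, weights)) / sum(weights)
pooled_se = 1 / sqrt(sum(weights))

rr = exp(pooled)
ci = (exp(pooled - 1.96 * pooled_se), exp(pooled + 1.96 * pooled_se))
print(round(rr, 2), tuple(round(x, 2) for x in ci))
```

Pooling on the log scale and exponentiating at the end keeps the confidence interval appropriately asymmetric around the relative risk, as in the forest plots shown later in the chapter.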

A more transparent appraisal

An important advantage of systematic reviews is that they render the review process transparent. In traditional narrative reviews it is often not clear how the conclusions follow from the data examined. In an adequately presented systematic review it should be possible for readers to replicate the quantitative component of the argument. To facilitate this, it is valuable if the data included in meta-analyses are either presented in full or made available to interested readers by the authors. The increased openness required leads to the replacement of unhelpful descriptors such as “no clear evidence”, “some evidence of a trend”, “a weak relationship” and “a strong relationship”.39 Furthermore, performing a meta-analysis may lead to reviewers moving beyond the conclusions authors present in the abstract of papers, to a thorough examination of the actual data.

The epidemiology of results

The tabulation, exploration and evaluation of results are important components of systematic reviews. This can be taken further to explore sources of heterogeneity and test new hypotheses that were not posed in individual studies, for example using “meta-regression” techniques (see also Chapters 8–11). This has been termed the “epidemiology of results”, where the findings of an original study replace the individual as the unit of analysis.40 However, it must be borne in mind that although the studies included may be controlled experiments, the meta-analysis itself is subject to many biases inherent in observational studies.41 Aggregation or ecological bias42 is also a problem unless individual patient data are available (see Chapter 6). Systematic reviews can, nevertheless, lead to the identification of the most promising or the most urgent research question, and may permit a more accurate calculation of the sample sizes needed in future studies (see Chapter 24). This is illustrated by an early meta-analysis of four trials that compared different methods of monitoring the fetus during labour.43 The meta-analysis led to the hypothesis that, compared with intermittent auscultation, continuous fetal heart monitoring reduced the risk of neonatal seizures. This hypothesis was subsequently confirmed in a single randomised trial of almost seven times the size of the four previous studies combined.44

What was the evidence in 1981? Cumulative meta-analysis

What conclusions would our junior doctor have reached if she had had access to a meta-analysis? Numerous meta-analyses of trials examining the effect of beta-antagonists have been published since 1981.17,45–48 Figure 1.4 shows the results from the most recent analysis that included 33 randomised comparisons of beta-blockers versus placebo or alternative treatment in patients who had had a myocardial infarction.48 These trials were published between 1967 and 1997. The combined relative risk indicates that beta-blockade starting after the acute infarction reduces subsequent premature mortality by an estimated 20% (relative risk 0.80). A useful way to show the evidence that was available in 1981 and at other points in time is to perform a cumulative meta-analysis.49

Cumulative meta-analysis is defined as the repeated performance of meta-analysis whenever a new relevant trial becomes available for inclusion. This allows the retrospective identification of the point in time when a treatment effect first reached conventional levels of statistical significance. In the case of beta-blockade in secondary prevention of myocardial infarction, a statistically significant beneficial effect (P < 0·05) became evident by 1981 (Figure 1.5). Subsequent trials in a further 15 000 patients simply confirmed this result. This situation has been taken to suggest that further studies in large numbers of patients may be at best superfluous and costly, if not unethical,50 once a statistically significant treatment effect is evident from meta-analysis of the existing smaller trials.
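The procedure can be sketched directly: re-pool after each new trial and note when the confidence interval first excludes no effect. The years, log relative risks and standard errors below are invented for illustration; with these numbers the pooled benefit first becomes statistically significant at the final step, mirroring the beta-blocker story:

```python
from math import exp, sqrt

# Hypothetical trials in order of publication: (year, log RR, SE).
trials = [(1972, -0.40, 0.40), (1977, -0.30, 0.25),
          (1980, -0.10, 0.20), (1981, -0.25, 0.12)]

sum_w = sum_we = 0.0
for year, est, se in trials:
    # Add the new trial and re-pool (inverse-variance fixed effect).
    w = 1 / se ** 2
    sum_w += w
    sum_we += w * est
    pooled_rr = exp(sum_we / sum_w)
    upper = exp(sum_we / sum_w + 1.96 / sqrt(sum_w))
    significant = upper < 1  # CI excludes RR = 1, i.e. P < 0.05 for benefit
    print(year, round(pooled_rr, 2), round(upper, 2), significant)
```

Each printed line corresponds to one row of a cumulative forest plot such as Figure 1.5: the point estimate settles down while the confidence interval narrows as evidence accumulates.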

Figure 1.4 “Forest plot” showing mortality results from trials of beta-blockers in secondary prevention after myocardial infarction. Trials are ordered by year of publication. The black square and horizontal line correspond to the trials’ risk ratio and 95% confidence intervals. The area of the black squares reflects the weight each trial contributes in the meta-analysis. The diamond represents the combined relative risk with its 95% confidence interval, indicating a 20% reduction in the odds of death. See Chapter 2 for a detailed description of forest plots. Adapted from Freemantle et al.48

Figure 1.5 Cumulative meta-analysis of controlled trials of beta-blockers after myocardial infarction. The data correspond to Figure 1.4. A statistically significant (P < 0·05) beneficial effect on mortality became evident in 1981.

Figure 1.6 Cumulative meta-analysis of randomised controlled trials of intravenous streptokinase in myocardial infarction. The number of patients randomised in a total of 33 trials, and national authorities licensing streptokinase for use in myocardial infarction are also shown. a: includes GISSI-1; b: ISIS-2.

Conclusions

Systematic review, including, if appropriate, a formal meta-analysis, is clearly superior to the narrative approach to reviewing research. In addition to providing a precise estimate of the overall treatment effect in some instances, appropriate examination of heterogeneity across individual studies can produce useful information with which to guide rational and cost effective treatment decisions. Systematic reviews are also important to demonstrate areas where the available evidence is insufficient and where new, adequately sized trials are required.

Acknowledgements

We are grateful to Sir David Cox for providing key references to early statistical work and to Iain Chalmers for his comments on an earlier draft of this chapter. We thank Dr T Johansson and G Enocksson (Pharmacia AB, Stockholm) and Dr A Schirmer and Dr M Thimme (Behring AG, Marburg) for providing data on licensing of streptokinase in different countries. This chapter draws on material published earlier in the BMJ.54

1 Mulrow CD. The medical review article: state of the science. Ann Intern Med 1987;106:485–8.

2 Oxman AD, Guyatt GH. Guidelines for reading literature reviews. Can Med Assoc J 1988;138:697–703.

3 Chalmers I, Altman D (eds). Systematic reviews. London: BMJ Publishing Group, 1995.

4 Huque MF. Experiences with meta-analysis in NDA submissions. Proc Biopharmac Sec Am Stat Assoc 1988;2:28–33.

5 O’Rourke K, Detsky AS. Meta-analysis in medical research: strong encouragement for higher quality in individual research efforts. J Clin Epidemiol 1989;42:1021–4.

6 Schulz KF, Chalmers I, Hayes RJ, Altman D. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995;273:408–12.

7 Moher D, Pham B, Jones A, et al. Does quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses? Lancet 1998;352:609–13.

8 Chalmers I, Tröhler U. Helping physicians to keep abreast of the medical literature: Medical and philosophical commentaries, 1773–1795. Ann Intern Med 2000;133:238–43.

9 Plackett RL. Studies in the history of probability and statistics: VII. The principle of the arithmetic mean. Biometrika 1958;45:130–5.

10 Stigler SM. The history of statistics. The measurement of uncertainty before 1900. Cambridge, MA: The Belknap Press of Harvard University Press, 1990.

11 Airy GB. On the algebraical and numerical theory of errors of observations and the combinations of observations. London: Macmillan, 1861.

12 Pearson K. Report on certain enteric fever inoculation statistics. BMJ 1904;3:1243–6.

13 Peters CC. Summary of the Penn State experiments on the influence of instruction in character education. J Educat Sociol 1933;7:269–72.

14 Glass GV. Primary, secondary and meta-analysis of research. Educat Res 1976;5:3–8.

15 Cochrane AL. 1931–1971: a critical review, with particular reference to the medical profession. In: Medicines for the year 2000. London: Office of Health Economics, 1979:1–11.

16 Baber NS, Lewis JA. Confidence in results of beta-blocker postinfarction trials. BMJ 1982; 284:1749–50.

17 Yusuf S, Peto R, Lewis J, Collins R, Sleight P. Beta blockade during and after myocardial infarction: an overview of the randomized trials. Prog Cardiovasc Dis 1985;27:335–71.

18 Early Breast Cancer Trialists’ Collaborative Group. Effects of adjuvant tamoxifen and of cytotoxic therapy on mortality in early breast cancer. An overview of 61 randomized trials among 28 896 women. N Engl J Med 1988;319:1681–92.

19 Chalmers I, Enkin M, Keirse M. Effective care during pregnancy and childbirth. Oxford: Oxford University Press, 1989.

20 Greenland S. Quantitative methods in the review of epidemiologic literature. Epidemiol Rev 1987;9:1–30.

21 Friedenreich CM. Methods for pooled analyses of epidemiologic studies. Epidemiology 1993;4:295–302.

22 General Accounting Office. Cross design synthesis: a new strategy for medical effectiveness research. Washington, DC: GAO, 1992.

23 Reynolds JL, Whitlock RML. Effects of a beta-adrenergic receptor blocker in myocardial infarction treated for one year from onset. Br Heart J 1972;34:252–9.

24 Multicentre International Study: supplementary report. Reduction in mortality after myocardial infarction with long-term beta-adrenoceptor blockade. BMJ 1977;2:419–21.

25 Baber NS, Wainwright Evans D, Howitt G, et al. Multicentre post-infarction trial of propranolol in 49 hospitals in the United Kingdom, Italy and Yugoslavia. Br Heart J 1980;44:96–100.

26 The Norwegian Multicenter Study Group. Timolol-induced reduction in mortality and reinfarction in patients surviving acute myocardial infarction. N Engl J Med 1981;304:801–7.

27 Mitchell JRA. Timolol after myocardial infarction: an answer or a new set of questions? BMJ 1981;282:1565–70.

28 Teagarden JR. Meta-analysis: whither narrative review? Pharmacotherapy 1989;9:274–84.

29 Hampton JR. The use of beta blockers for the reduction of mortality after myocardial infarction. Eur Heart J 1981;2:259–68.

30 Ravnskov U. Cholesterol lowering trials in coronary heart disease: frequency of citation and outcome. BMJ 1992;305:15–19.

31 Gøtzsche PC. Reference bias in reports of drug trials. BMJ 1987;295:654–6.

32 Cooper H, Rosenthal R. Statistical versus traditional procedures for summarising research findings. Psychol Bull 1980;87:442–9.

33 Chalmers TC, Frank CS, Reitman D. Minimizing the three stages of publication bias. JAMA 1990;263:1392–5.

34 Freiman JA, Chalmers TC, Smith H, Kuebler RR. The importance of beta, the type II error, and sample size in the design and interpretation of the randomized controlled trial. In: Bailar JC, Mosteller F, eds. Medical uses of statistics. Boston, MA: NEJM Books, 1992:357–73.

35 Thornley B, Adams C. Content and quality of 2000 controlled trials in schizophrenia over 50 years. BMJ 1998;317:1181–4.

36 Collins R, Keech A, Peto R, et al. Cholesterol and total mortality: need for larger trials. BMJ 1992;304:1689.

37 Berlin J, Laird NM, Sacks HS, Chalmers TC. A comparison of statistical methods for combining event rates from clinical trials. Stat Med 1989;8:141–51.

38 Fleiss JL. The statistical basis of meta-analysis. Stat Meth Med Res 1993;2:121–45.

39 Rosenthal R. An evaluation of procedures and results. In: Wachter KW, Straf ML, eds. The future of meta-analysis. New York: Russel Sage Foundation, 1990:123–33.

40 Jenicek M. Meta-analysis in medicine. Where we are and where we want to go. J Clin Epidemiol 1989;42:35–44.

41 Gelber RD, Goldhirsch A. Interpretation of results from subset analyses within overviews of randomized clinical trials. Stat Med 1987;6:371–8.

42 Piantadosi S, Byar DP, Green SB. The ecological fallacy. Am J Epidemiol 1988;127:893–904.

43 Chalmers I. Randomised controlled trials of fetal monitoring 1973–1977. In: Thalhammer O, Baumgarten K, Pollak A, eds. Perinatal medicine. Stuttgart: Thieme, 1979:260–5.

44 MacDonald D, Grant A, Sheridan-Pereira M, Boylan P, Chalmers I. The Dublin randomised controlled trial of intrapartum fetal heart rate monitoring. Am J Obstet Gynecol 1985;152:524–39.

45 Beta-Blocker Pooling Project Research Group. The beta-blocker pooling project (BBPP): subgroup findings from randomized trials in postinfarction patients. Eur Heart J 1988;9:8–16.

46 Goldstein S. Review of beta blocker myocardial infarction trials. Clin Cardiol 1989;12:54–7.

47 Soriano JB, Hoes AW, Meems L, Grobbee DE. Increased survival with beta-blockers: importance of ancillary properties. Prog Cardiovasc Dis 1997;39:445–56.

48 Freemantle N, Cleland J, Young P, Mason J, Harrison J. Beta blockade after myocardial infarction: systematic review and meta regression analysis. BMJ 1999;318:1730–7.

49 Lau J, Antman EM, Jimenez-Silva J, Kupelnick B, Mosteller F, Chalmers TC. Cumulative meta-analysis of therapeutic trials for myocardial infarction. N Engl J Med 1992;327:248–54.

50 Murphy DJ, Povar GJ, Pawlson LG. Setting limits in clinical medicine. Arch Intern Med 1994;154:505–12.

51 Gruppo Italiano per lo Studio della Streptochinasi nell’Infarto Miocardico (GISSI). Effectiveness of intravenous thrombolytic treatment in acute myocardial infarction. Lancet 1986;i:397–402.

52 ISIS-2 Collaborative Group. Randomised trial of intravenous streptokinase, oral aspirin, both, or neither among 17187 cases of suspected acute myocardial infarction: ISIS-2. Lancet 1988;ii:349–60.

53 Antman EM, Lau J, Kupelnick B, Mosteller F, Chalmers TC. A comparison of results of meta-analyses of randomized control trials and recommendations of clinical experts. JAMA 1992;268:240–8.

54 Egger M, Davey Smith G. Meta-analysis: potentials and promise. BMJ 1997;315:1371–4.

Part I

Systematic reviews of controlled trials

2

Principles of and procedures for systematic reviews

MATTHIAS EGGER, GEORGE DAVEY SMITH

Summary points

Reviews and meta-analyses should be as carefully planned as any other research project, with a detailed written protocol prepared in advance.

The formulation of the review question, the a priori definition of eligibility criteria for trials to be included, a comprehensive search for such trials and an assessment of their methodological quality are central to high-quality reviews.

The graphical display of results from individual studies on a common scale (“forest plot”) is an important step which allows a visual examination of the degree of heterogeneity between study results.

There are different statistical methods for combining the data in meta-analysis, but there is no single “correct” method. A thorough sensitivity analysis should always be performed to assess the robustness of combined estimates to different assumptions, methods and inclusion criteria, and to investigate the possible influence of bias.
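To make the idea of combining study results concrete, the sketch below is an illustrative example (not taken from the book) of the common inverse-variance fixed-effect method: each study's effect estimate is weighted by the inverse of its variance, and Cochran's Q statistic gives a simple numerical check on between-study heterogeneity. The study estimates and standard errors used here are invented for the example.

```python
import math

def inverse_variance_pool(estimates, std_errors):
    """Fixed-effect meta-analysis by inverse-variance weighting.

    Returns the pooled effect estimate, its standard error, and
    Cochran's Q statistic (a measure of between-study heterogeneity).
    """
    # Each study is weighted by 1 / variance of its estimate
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    return pooled, pooled_se, q

# Hypothetical log odds ratios from two trials, with their standard errors
pooled, se, q = inverse_variance_pool([-0.5, -0.2], [0.2, 0.1])
```

Rerunning such a pooling with each trial excluded in turn, or with different weighting schemes (for example a random-effects model), is one simple form of the sensitivity analysis recommended above.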