Clinical Trials with Missing Data

Michael O'Kelly

Description

This book provides practical guidance for statisticians, clinicians, and researchers involved in clinical trials in the biopharmaceutical industry, medical and public health organisations. Academics and students needing an introduction to handling missing data will also find this book invaluable.

The authors describe how missing data can affect the outcome and credibility of a clinical trial, show by examples how a clinical team can work to prevent missing data, and present the reader with approaches to address missing data effectively.

The book is illustrated throughout with realistic case studies and worked examples, and presents clear and concise guidelines to enable good planning for missing data. The authors show how to handle missing data in a way that is transparent and easy to understand for clinicians, regulators and patients. New developments are presented to improve the choice and implementation of primary and sensitivity analyses for missing data. Many SAS code examples are included – the reader is given a toolbox for implementing analyses under a variety of assumptions.

Page count: 822

Publication year: 2014




Contents

Cover

Series

Title Page

Copyright

Dedication

Preface

References

Acknowledgments

Notation

Table of SAS code fragments

Contributors

Chapter 1: What's the problem with missing data?

1.1 What do we mean by missing data?

1.2 An illustration

1.3 Why can't I use only the available primary endpoint data?

1.4 What's the problem with using last observation carried forward?

1.5 Can we just assume that data are missing at random?

1.6 What can be done if data may be missing not at random?

1.7 Stress-testing study results for robustness to missing data

1.8 How the pattern of dropouts can bias the outcome

1.9 How do we formulate a strategy for missing data?

1.10 Description of example datasets

Appendix 1.A: Formal definitions of MCAR, MAR and MNAR

References

Chapter 2: The prevention of missing data

2.1 Introduction

2.2 The impact of “too much” missing data

2.3 The role of the statistician in the prevention of missing data

2.4 Methods for increasing subject retention

2.5 Improving understanding of reasons for subject withdrawal

Acknowledgments

Appendix 2.A: Example protocol text for missing data prevention

References

Chapter 3: Regulatory guidance – a quick tour

3.1 International conference on harmonization guideline: Statistical principles for clinical trials: E9

3.2 The US and EU regulatory documents

3.3 Key points in the regulatory documents on missing data

3.4 Regulatory guidance on particular statistical approaches

3.5 Guidance about how to plan for missing data in a study

3.6 Differences in emphasis between the NRC report and EU guidance documents

3.7 Other technical points from the NRC report

3.8 Other US/EU/international guidance documents that refer to missing data

3.9 And in practice?

References

Chapter 4: A guide to planning for missing data

4.1 Introduction

4.2 Planning for missing data

4.3 Exploring and presenting missingness

4.4 Model checking

4.5 Interpreting model results when there is missing data

4.6 Sample size and missing data

Appendix 4.A: Sample protocol/SAP text for study in Parkinson's disease

Appendix 4.B: A formal definition of a sensitivity parameter

References

Chapter 5: Mixed models for repeated measures using categorical time effects (MMRM)

5.1 Introduction

5.2 Specifying the mixed model for repeated measures

5.3 Understanding the data

5.4 Applying the mixed model for repeated measures

5.5 Additional mixed model for repeated measures topics

5.6 Logistic regression mixed model for repeated measures using the generalized linear mixed model

References

Table of SAS Code Fragments

Chapter 6: Multiple imputation

6.1 Introduction

6.2 Imputation phase

6.3 Analysis phase: Analyzing multiple imputed datasets

6.4 Pooling phase: Combining results from multiple datasets

6.5 Required number of imputations

6.6 Some practical considerations

6.7 Pre-specifying details of analysis with multiple imputation

Appendix 6.A: Additional methods for multiple imputation

References

Table of SAS Code Fragments

Chapter 7: Analyses under missing-not-at-random assumptions

7.1 Introduction

7.2 Background to sensitivity analyses and pattern-mixture models

7.3 Two methods of implementing sensitivity analyses via pattern-mixture models

7.4 A “toolkit”: Implementing sensitivity analyses via SAS

7.5 Examples of realistic strategies and results for illustrative datasets of three indications

Appendix 7.A How one could implement the neighboring case missing value assumption using visit-by-visit multiple imputation

Appendix 7.B SAS code to model withdrawals from the experimental arm, using observed data from the control arm

Appendix 7.C SAS code to model early withdrawals from the experimental arm, using the last-observation-carried-forward-like values

Appendix 7.D SAS macro to impose delta adjustment on a responder variable in the mania dataset

Appendix 7.E SAS code to implement tipping point via exhaustive scenarios for withdrawals in the mania dataset

Appendix 7.F SAS code to perform sensitivity analyses for the Parkinson's disease dataset

Appendix 7.G SAS code to perform sensitivity analyses for the insomnia dataset

Appendix 7.H SAS code to perform sensitivity analyses for the mania dataset

Appendix 7.I Selection models

Appendix 7.J Shared parameter models

References

Table of SAS Code Fragments

Chapter 8: Doubly robust estimation

8.1 Introduction

8.2 Inverse probability weighted estimation

8.3 Doubly robust estimation

8.4 Vansteelandt et al. method for doubly robust estimation

8.5 Implementing the Vansteelandt et al. method via SAS

Appendix 8.A How to implement Vansteelandt et al. method for mania dataset (binary response)

Appendix 8.B SAS code to calculate estimates from the bootstrapped datasets

Appendix 8.C How to implement Vansteelandt et al. method for insomnia dataset

References

Table of SAS Code Fragments

Bibliography

Index

Statistics in Practice

STATISTICS IN PRACTICE

Series Advisors

Human and Biological Sciences

Stephen Senn

CRP-Santé, Luxembourg

Earth and Environmental Sciences

Marian Scott

University of Glasgow, UK

Industry, Commerce and Finance

Wolfgang Jank

University of Maryland, USA

Founding Editor

Vic Barnett

Nottingham Trent University, UK

Statistics in Practice is an important international series of texts which provide detailed coverage of statistical concepts, methods and worked case studies in specific fields of investigation and study.

With sound motivation and many worked practical examples, the books show in down-to-earth terms how to select and use an appropriate range of statistical techniques in a particular practical field within each title’s special topic area.

The books provide statistical support for professionals and research workers across a range of employment fields and research environments. Subject areas covered include medicine and pharmaceutics; industry, finance and commerce; public services; the earth and environmental sciences, and so on.

The books also provide support to students studying statistical courses applied to the above areas. The demand for graduates to be equipped for the work environment has led to such courses becoming increasingly prevalent at universities and colleges. It is our aim to present judiciously chosen and well-written workbooks to meet everyday practical needs. Feedback of views from readers will be most valuable to monitor the success of this aim.

A complete list of titles in this series appears at the end of the volume.

This edition first published 2014 © 2014 John Wiley & Sons, Ltd

Registered office: John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.

The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Library of Congress Cataloging-in-Publication Data

O’Kelly, Michael, author.
   Clinical trials with missing data : a guide for practitioners / Michael O’Kelly, Bohdana Ratitch.
      p. ; cm. – (Statistics in practice)
   Includes bibliographical references and index.
   ISBN 978-1-118-46070-2 (hardback)
   I. Ratitch, Bohdana, author.   II. Title.   III. Series: Statistics in practice.

   [DNLM: 1. Clinical Trials as Topic.   2. Bias (Epidemiology)   3. Models, Statistical.   4. Research Design. QV 771.4]
   R853.C55   610.72′4–dc23

2013041088

A catalogue record for this book is available from the British Library.

ISBN: 978-1-118-46070-2

To Raymond Kearns, teacher and Linda O’Nolan, partner.

—Michael O’Kelly

To my family, with love and gratitude for inspiration and support.

—Bohdana Ratitch

Preface

The aim of this book is to explain the difficulties that arise with the credibility and interpretability of clinical study results when there is missing data; and to provide practical strategies to deal with these difficulties. We try to do this in straightforward language, using realistic clinical trial examples.

This book is written to serve the needs of a broad audience of pharmaceutical industry professionals and regulators, including statisticians and non-statisticians, as well as academics with an interest in or need to understand the practical side of handling missing data. This book could also be used for a practical course in methods for handling missing data. For statisticians, this book provides mathematical background for a wide spectrum of statistical methodologies that are currently recommended to deal with missing data, avoiding unnecessary complexity. We also present a variety of examples and discussions on how these methods can be implemented using mainstream statistical software. The book includes a framework in which the entire clinical study team can contribute to a sound design of a strategy to deal with missing data, from prevention, to formulating clinically plausible assumptions about unobserved data, to statistical analysis and interpretation.

In the past, missing data was sometimes viewed as a problem that could be taken care of within statistical methodology without burdening others with its technicalities. While it is true that sophisticated statistical methods can and should be used to conduct sound analyses in the presence of missing data, all these methods make assumptions about missing data that clinical experts should help to formulate – assumptions that should be clinically interpretable and plausible. Moreover, it is important to understand that some assumptions about missing data are always being made, be it explicitly or implicitly. Even a strategy using only observed data for analysis carries within it certain implicit assumptions about subjects with missing data, and these assumptions implicitly become part of the study conclusions. Clinicians fully participate in the effort to select carefully the type of data (clinical endpoints) that could best serve as evidence for the efficacy and safety of a treatment. Their clinical expertise is invaluable for the choice of data that is collected in a clinical trial and subsequently used as observed data. Similarly, it is only natural to expect that the same level of clinical expertise would be provided to make choices for “hidden data” – the assumptions that would be used in place of missing data as an integral part of the overall body of evidence. Parts of this book (Chapters 1–4) contain non-technical material that can be easily understood by non-statisticians, and we hope that it will help clinicians and statisticians to build common ground and a common language in order to tackle the problem of missing data together. Chapter 2 is dedicated entirely to the prevention of missing data, which is the best way to deal with the problem, albeit not sufficient by itself in reality. Everyone involved in the planning and conduct of clinical trials would benefit from the ideas presented in this chapter.

Chapters 5 through 8 are aimed primarily at statisticians and cover well-understood methods that are presently regarded as statistically sound ways of conducting analyses in the presence of missing data and which can provide clinically meaningful estimands of treatment effect. In particular, this book covers direct likelihood methodology for longitudinal data with repeated correlated measurements; multiple imputation; pattern-mixture models; and inverse weighting and doubly robust methods. We discuss in detail how these methodologies can be applied under a variety of clinical assumptions about unobserved data, both in the context of primary and sensitivity analyses. Aspects that are covered more briefly include selection models and non-parametric approaches. Examples cover both continuous outcomes and binary responses (e.g., treatment success/failure). Missing data problems in other contexts, such as time-to-event analyses, are not covered in this book.

Along with algebraic basics and plain language explanations of statistical methodology, this book contains numerous examples of practical implementations using SAS®. Throughout the book, as well as in supplemental material, we provide fragments of SAS code that would be sufficient for readers to use as templates or at least good starting points to implement all analyses mentioned in this book. We also provide pointers and explanations for a number of SAS macros publicly available at www.missingdata.org.uk, developed by members of the Drug Information Association Scientific Working Group on Missing Data. Both authors of this book are members of this Working Group. We note that alternative software solutions exist in other programming environments, including free packages such as R. Other authors, for example, Carpenter and Kenward (2013) and van Buuren (2012), have provided tools that the readers would be able to use in order to implement general analysis principles discussed in this book.

Examples of realistic clinical trial data featured in this book provide illustrations of how reasonable missing data strategies can be designed in several different clinical indications, each with some specific challenges and characteristics. All examples have two treatment arms – experimental and control – but the methodology discussed in this book can be applied in more general settings with more than two arms in a straightforward manner.

We have also endeavored to make the book suitable for casual use, allowing the professional statistician with a particular need to use a particular section without having to be familiar with the whole book. Therefore, each chapter begins with a list of key points covered; abbreviations are expanded on first appearance in each chapter; references are listed at the end of each chapter; explanations of particular points may be repeated if it helps to make a passage readable (although there are many cross-references between chapters too); where a book is referenced, we try to give page numbers if we think this might be helpful; and for some references to journal papers we also give web links to enable fast reference to abstracts and to enable downloading for those who may have electronic subscriptions.

Finally, we would like to stress that the problem of missing data unfortunately does not have a one-size-fits-all solution. A clinical research team must evaluate their strategy for missing data in the context of the specific clinical indication, subject population, expected mechanism of action of the experimental treatment, control treatment used in the study, and standards of care that would be available to subjects once they leave the trial. This book aims to provide the reader with a good general understanding of the issues involved and a toolbox of methods from which to select the ones that are most appropriate for the study at hand.

References

Carpenter JR, Kenward MG (2013) Multiple Imputation and its Application. John Wiley & Sons Ltd, West Sussex.

Van Buuren S (2012) Flexible Imputation of Missing Data. Chapman & Hall/CRC Press, Boca Raton, FL.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the United States and other countries. ® indicates USA registration.

Acknowledgments

We thank the contributors to this book, Sonia Davis, Sara Hughes, Belinda Hernández and Ilya Lipkovich, for their clear contributions and constant helpfulness.

We have found the scholars and experts on missing data to be friendly, approachable and willing to share ideas and expertise. As many who learn from him will tell you, James Roger epitomizes this spirit of willingness to spark ideas off others and to share the fruits of applied mathematics and elegant programming. It is likely that much of our work on sensitivity analyses in this book would not have been done without Roger's inspiration and example. It seems typical of those who work on missing data that much material from one of our favorite books on the subject was made freely available on the internet by authors James Carpenter and Mike Kenward. We thank these two scholars. Gary Koch first pointed us to Roger's ideas on sensitivity analyses; Gary also suggested the usefulness of sequential multiple imputation; over many years he has answered our questions on the direction of our work. Craig Mallinckrodt has genially chaired the Scientific Working Group for Missing Data and we thank him for fostering co-operation between pharmaceutical companies and academia, all with the aim of improving our handling of missing data. Geert Molenberghs gave thought-provoking answers to our queries; he also reviewed this book and we thank him for his helpful comments. Thank you to John Preisser, Willem Koetse, Forrest DeMarcus, and David Couper for reviews and contributions to Chapter 5. We thank Quintiles' Judith Beach for her legal review and general advice; we thank Quintiles' Kevin Nash for his review also. The errors that remain are ours. From our employers, Quintiles, we thank especially Olga Marchenko, who championed the research; Tom Pike and King Jolly for their encouragement; and Andy Garrett and Yves Lalonde who supported Bohdana Ratitch in making time for the book. 
We also pay tribute to Quintiles for its remarkable support for the development of its employees – Michael O'Kelly owes his entire post-graduate education to the support of Quintiles, and in particular to the support of Imelda Parker when she was head of the Quintiles Dublin statistics department. Thanks to Ilaria Meliconi from Wiley who first encouraged us to think of writing the book; and to Wiley's Debbie Jupe who guided us as the book progressed, with help from Richard Davies and Heather Kay. Finally, we thank our spouses and family for their support.

Contributors

Chapter 5 by:

Sonia M. Davis, Collaborative Studies Coordinating Center, Professor of the Practice, Department of Biostatistics, University of North Carolina, USA.

Chapter 8 by:

Belinda Hernández, School of Mathematical Sciences (Discipline of Statistics) and the School of Medicine and Medical Science, University College Dublin, Ireland.

Chapter 2 by:

Sara Hughes, Head of Clinical Statistics, GlaxoSmithKline, UK.

Contributions to and review of Chapter 8 by:

Ilya Lipkovich, Center for Statistics in Drug Development, Quintiles Innovation, Morrisville, North Carolina, USA.

__________________________________________

Michael O'Kelly: authored chapters 1, 3, 4 and 7, contributed research for Chapter 8, and reviewed all chapters.

Bohdana Ratitch: authored Chapter 6, contributed to chapters 1, 4 and 7, contributed research for chapters 4 and 7, and reviewed all chapters.

1

What's the problem with missing data?

Michael O'Kelly and Bohdana Ratitch

Text not available in this digital edition.

Macavity the Mystery Cat, TS Eliot*

Key points
- Missing data for the purposes of this book are data that were planned to be recorded during a clinical trial but are not available. Non-monotone or intermediate missing data occur when a subject misses a visit but contributes data at later visits. Monotone missing data, where all data for a subject is missing after a certain time-point due to early withdrawal from the study, is the more serious problem in interpreting the results of a trial.
- The most important thing about missing data is that it is missing: we can never be sure whether the assumptions made about it are true.
- An example illustrates the potential bias of using only observed data in an analysis (a favorable subset of subjects); and of using a subject's last available observation or baseline observation in place of missing values (bias varies and may be difficult to predict).
- Assuming that data are missing at random (i.e., that given the data and the model, missingness is independent of the unobserved values) allows one to use study data to infer likely values for missing data, but is likely biased in that it assumes that subjects who withdrew from the study have results like similar subjects who remained in the study.
- Given that we can never be sure whether the assumptions made about missingness in the primary analysis are true, sensitivity analyses are needed to stress-test the trial results for robustness to assumptions about missing data: sensitivity analyses will help the reader of the clinical study report to assess the credibility of a trial with missing data.
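The biases listed above can be made concrete with a small simulation. The sketch below (Python with made-up data and parameters; the book's own worked examples use SAS) simulates a trial in which subjects who improve slowly withdraw early, then compares the true final-visit mean with a completers-only estimate and a last-observation-carried-forward (LOCF) estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n, visits = 2000, 6

# Hypothetical trial: each subject improves by their own slope per visit.
slopes = rng.normal(1.0, 0.5, n)
y = np.arange(visits) * slopes[:, None] + rng.normal(0.0, 1.0, (n, visits))

# Slow improvers withdraw after visit 2, so their later values go unobserved.
observed = y.copy()
for i in range(n):
    if slopes[i] < 0.5:
        observed[i, 2:] = np.nan

true_mean = y[:, -1].mean()                                    # no-dropout benchmark
completers = observed[~np.isnan(observed[:, -1]), -1].mean()   # observed data only
locf = np.mean([row[~np.isnan(row)][-1] for row in observed])  # carry last value forward

print(f"true={true_mean:.2f}  completers-only={completers:.2f}  LOCF={locf:.2f}")
```

In this scenario the completers-only estimate overstates the mean, because a favorable subset of subjects remains, while LOCF understates it by freezing the dropouts at their early, worse values; with a different trajectory the LOCF bias could go the other way, which is why it is hard to predict.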

1.1 What do we mean by missing data?

This book is about missing data in clinical trials. In a clinical trial, missing data are data that were planned to be recorded but are not present in the database. No matter how well designed and conducted a trial is, some missing data can almost always be expected. Missingness may be absolutely unrelated to the subject's medical condition and study treatment. For example, data could be missing due to a human error in recording data; due to a scheduling conflict that prevented the subject from attending the study visit; or due to a subject's moving to a region outside of the study's remit. On the other hand, data may be missing for reasons that are related to the subject's health and the experimental treatment he/she is undergoing. For example, subjects may decide to discontinue from the study prematurely if their condition worsens or fails to improve, or if they experience adverse reactions or adverse events (AEs). A contrary situation is also possible, although probably less common, where a subject is cured and observations are missing because the subject is not willing to bother with the rest of the study assessments. Apart from missingness due to missed visits, missing data can arise simply due to the nature of the measurement or the nature of the disease. An example of data that would be missing because not meaningful is a quality-of-life score for a subject who has died. Those cases where missingness is related to the subject's underlying condition and study treatment have the greatest potential to undermine the credibility of a trial. Sometimes, a subject's data collected prior to discontinuation reflects the reason for withdrawal (e.g., worsening, improvement or toxicity), but subjects can also discontinue without providing that crucial information that would have enabled us to assess the reason for missingness and thus incorporate it in our analysis.
Such cases potentially hide some important information about treatment efficacy and/or safety, without which study conclusions may be biased.

When a subject has provided data over the course of the study, but some assessments, either in the middle of the trial or at the primary time point, are missing for any reason, their data can be referred to as partial subject data. In this book, we explore the implications of this partial data and ways to minimize the potential bias.

In many clinical trials, collected data are longitudinal in nature, that is, data about the same clinical parameter are collected on multiple occasions (e.g., during study visits or through subject diaries). In such studies, a primary endpoint (the clinical parameter used to evaluate the primary objective of the trial at a specific time point) is typically measured at the end of the treatment period, or at the end of a period by which the clinical benefit is expected to be attained or maintained, with assessments performed at that point as well as on several prior occasions, thus capturing the subject's progress after the start of treatment. This is in contrast with another type of trial, where the primary endpoint is event-driven, for example, based on such events as death or disease progression. In this book, we focus primarily on the former type of trial, and we look at various ways in which partial subject data can be used for analysis.

Most of this book is about ways to handle missing data once it occurs, but it is also important to prevent missing data insofar as this is possible. Chapter 2 discusses this in detail, and describes some ways in which the statistician can contribute to prevention strategies. We now put some of the discussion above somewhat more formally.

1.1.1 Monotone and non-monotone missing data

A subject who completes a clinical trial may have data missing for a measurement because he/she failed to turn up for some visits in the middle of the trial. Such a measurement is said to have “non-monotone missing,” “intermediate missing” or “intermittent missing” data, because the status of the measurement for a subject can switch from missing to non-missing and back as the patient progresses through the trial. In many clinical trials, this kind of missingness is more likely to be unrelated to the study condition or treatment. However, in some trials, it may indicate a temporary but important worsening of the subject's health (e.g., pulmonary exacerbations in lung diseases).

In contrast, monotone missingness occurs when data for a measurement is not available for a subject after some given time point; in the case of monotone missingness, once a measurement starts being missing, it will be missing for the subsequent visits in the trial, even though it had been planned to be collected. Subjects who discontinue early from the study are the usual source of monotone missing data. In most trials, the amount of monotone missing data is much greater than the amount of non-monotone missing data. In trials where the primary endpoint is based on a measurement at a specific time point, prior intermittent missing data will have a smaller impact on the primary analysis, compared to monotone missing data. Nevertheless, even in these cases, non-monotone missing data can affect study conclusions. This can happen if the intermediate data are utilized in a statistical model for analysis – the absence of such intermediate data may bias the estimates of the statistical model parameters. In this book, however, we will focus mostly on the problem of monotone missing data, because monotone missing data tend to pose more serious problems than non-monotone when estimating and interpreting trial results. For a more detailed discussion of handling non-monotone missing data, see Section 6.2.1. In this chapter, to introduce some of the concepts and problems in handling missing data, we will look at some common methods of handling monotone missing data in clinical trials, and examine the implications of each method.
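The distinction can be made concrete with a small helper (a Python sketch; the function name and data layout are ours, not the book's) that classifies one subject's sequence of per-visit values:

```python
def missingness_pattern(values):
    """Classify one subject's per-visit data, where None marks a missing value."""
    missing = [v is None for v in values]
    if not any(missing):
        return "complete"
    first = missing.index(True)
    # Monotone: once a value is missing, it stays missing at every later visit.
    return "monotone" if all(missing[first:]) else "non-monotone"

print(missingness_pattern([5, 6, 7, 8]))        # complete
print(missingness_pattern([5, 6, None, None]))  # monotone: early withdrawal
print(missingness_pattern([5, None, 7, 8]))     # non-monotone: one missed visit
```

Tabulating these labels over all subjects is a useful first step in the exploration of missingness discussed in Chapter 4.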

In Section 4.2.1, we will also briefly discuss situations where a subject discontinues study treatment prematurely but may stay in the study and provide data at the originally planned time points, despite being off study treatment. These cases need special consideration when including data collected after treatment discontinuation in the analysis, so that the interpretation of results takes into account possible confounding factors incurred after discontinuation (e.g., alternative treatments).

1.1.2 Modeling missingness, modeling the missing value and ignorability

In missing data methodology, we often use two terms: missing value and missingness (or missingness mechanism). It will be helpful to clarify what these terms refer to, as they play important and distinct roles in the statistical analysis. Missing value refers to a datum that was planned to be collected but is not available. A datum may be missing because, for example, the measurement was not made or was not collected. Missing and non-missing data may also be referred to as unobserved and observed, respectively. Missingness refers to a binary outcome (Yes/No): that of the datum being missing or not missing at a given time point. Missingness mechanism refers to the underlying random process that determines when data may be missing. In other words, the missingness mechanism refers to the probability distribution of the binary missingness event(s). The missingness mechanism may depend on a number of variables, which themselves may be observed or not observed. In the analysis, we can use one model (often referred to as a substantive model) for the values of the clinical parameter of interest (some values of which in reality will be missing), and another model for the distribution of a binary missingness indicator variable (datum missing or not). The missingness model may not be of interest in itself, but in some situations it may influence estimation of the substantive model and would need to be taken into account in order to avoid bias. Some analyses make use of both of these models.

1.1.3 Types of missingness (MCAR, MAR and MNAR)

The classifications of missing data mechanisms introduced by Rubin (1976; 1987) and Little and Rubin (2002) provide a formal framework that describes how the missingness mechanism may affect inferences about the clinical outcome. A value of a clinical outcome variable is said to be missing completely at random (MCAR) when its missingness is independent of observed and unobserved data, that is, when observed outcomes are a simple random sample from the complete data; missing at random (MAR) when, given the observed outcomes and the statistical model, missingness is independent of the unobserved outcomes; and missing not at random (MNAR) when missingness is not independent of unobserved data, even after accounting for the observed data. When data are missing for administrative reasons, the missingness mechanism could be MCAR, because the reason for missingness has nothing to do with the outcome model and its covariates. Dropout due to previous lack of efficacy could be MAR, because it is in some sense predictable from the observed data in the model. It is important to note that MAR is not an intrinsic characteristic of the data or missingness mechanism itself, but is closely related to the analysis model: if we include all the factors on which missingness depends in our model, we will be operating under MAR; otherwise, our analysis will not conform to MAR assumptions. Dropout after a sudden unrecorded drop in efficacy could be MNAR, since missingness would be dependent on unobserved data and would not be predictable from the observed data alone. Of these assumptions, MCAR is the strongest and least realistic, while MNAR is the least restrictive. However, the very variety of assumptions possible under MNAR may be regarded as a problem: it has been argued that it would be difficult to pre-specify a single definitive MNAR analysis (Mallinckrodt et al., 2008).
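The three mechanisms can be contrasted in a simulation (a Python sketch with made-up parameters; the book's analyses use SAS). The same would-be outcomes are deleted under MCAR, MAR and MNAR rules, and the mean of the remaining observed values shows how each mechanism does or does not bias a naive observed-data estimate:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
y1 = rng.normal(0.0, 1.0, n)               # visit 1 outcome (always observed)
y2 = 0.6 * y1 + rng.normal(0.0, 0.8, n)    # visit 2 outcome; its true mean is 0

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

# Binary missingness indicators for y2 under the three mechanisms:
r_mcar = rng.random(n) < 0.3                         # depends on nothing
r_mar  = rng.random(n) < logistic(-1.0 - 1.5 * y1)   # depends on observed y1 only
r_mnar = rng.random(n) < logistic(-1.0 - 1.5 * y2)   # depends on the unobserved y2

for name, r in [("MCAR", r_mcar), ("MAR", r_mar), ("MNAR", r_mnar)]:
    print(f"{name}: mean of observed y2 = {y2[~r].mean():+.3f}")
```

Under MCAR the observed mean is close to the truth; under MAR and MNAR, subjects with low values drop out more often, so the observed mean is biased upward. The crucial difference is that under MAR the bias can be corrected using the observed y1, whereas under MNAR no observed variable fully explains the dropout.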

We can test for dependence of missingness on observed outcomes, and so test MCAR against MAR. However, we cannot test MAR against MNAR, because that would require testing for a relationship between missingness and unobserved data. Unobserved data, we think it is no harm to repeat, are not there, and so their relationship with missingness cannot be tested.
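The testable part of this distinction can be made concrete with a small simulation. The sketch below (in Python; the book's own code examples use SAS, and all names and numbers here are invented for illustration) generates dropout whose probability depends on an observed earlier score, an MAR mechanism, and then checks whether dropout rates differ between subjects with low and high observed scores. Under MCAR the two rates would agree up to sampling error; a clear difference is evidence against MCAR, though, as the text notes, no such check can ever rule out MNAR.

```python
import random

random.seed(42)

# Simulate an observed early-visit score; dropout depends on that
# observed score (an MAR mechanism).  Values are purely illustrative.
n = 10_000
visit1 = [random.gauss(30, 5) for _ in range(n)]
# Higher (worse) observed scores make dropout more likely.
dropout = [random.random() < (0.1 if y < 30 else 0.4) for y in visit1]

# Observable check: compare dropout rates between subjects with
# low vs. high observed scores.  Under MCAR these rates would be
# equal in expectation.
low  = [d for y, d in zip(visit1, dropout) if y < 30]
high = [d for y, d in zip(visit1, dropout) if y >= 30]
rate_low = sum(low) / len(low)
rate_high = sum(high) / len(high)
print(f"dropout rate, low (better) scores:  {rate_low:.2f}")
print(f"dropout rate, high (worse) scores: {rate_high:.2f}")
```

A formal version of this check would model the missingness indicator, for example by logistic regression on the observed outcomes; the grouped comparison above conveys the same idea with less machinery.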

See Appendix 1.A at the end of this chapter for formal definitions of MCAR, MAR and MNAR.

Under some assumptions, missingness can be shown to be ignorable. Missingness is classified as ignorable if a valid estimate of the outcome can be calculated without taking the missingness mechanism into account. In his first paper addressing the problem of missing data, Rubin (1976) showed that, when using Bayesian or direct likelihood methods to estimate any parameter θ related to the clinical outcome, missing data are ignorable when the missingness mechanism is MAR and θ “is ‘distinct’ from” the parameter of the missing data process (missingness mechanism). Rubin put the word “distinct” in quotation marks because the distinctness condition is a very particular one. The missingness parameter is distinct from θ “if there are no a priori ties, via parameter space restrictions or prior distributions, between (the missingness parameter) and θ.” Thus while we might often expect the same observed data to contribute to the modeling of both missingness and the outcome, θ and the missingness parameter will still probably be distinct in such cases, and the missingness ignorable.

1.1.4 Missing data and study objectives

Clinical trial researchers and regulatory authorities are concerned about the effect of missing data on two aspects of the clinical data analysis: the estimate of the difference between experimental and control treatments, and the variance of that estimate. With respect to the difference between treatments, missing data can affect (and can bias) the magnitude of the estimate, making the experimental treatment look more or less clinically efficacious than it is (in extreme cases even reversing a true comparison) or obscuring an important interplay between treatment efficacy and tolerability. With regard to the variance of the estimate, missing data can either compromise the power of the study or, on the contrary, lead to underestimation of the variance, depending on the method chosen for analysis. Regulatory authorities require reasonable assurance that the chosen method of analysis in the presence of missing data is not likely to introduce an important bias in favor of the experimental treatment and will not underestimate the variance.

1.2 An illustration

To start our exploration of missing data, consider the following illustrative dataset that is patterned after typical Parkinson's disease clinical data as available, for example, in Emre et al. (2004) and Parkinson Study Group (2004a, 2004b). We suppose our trial had two treatment arms, an experimental arm and a placebo control arm, and that the trial had nine visits, with baseline at Visit 0, and the primary efficacy endpoint at Week 28. Fifty-seven subjects were enrolled in each treatment group. The primary measure of efficacy was a sub-score of the Unified Parkinson's Disease Rating Scale (UPDRS). For a general description of this illustrative dataset, see Section 1.10.1. A “spaghetti plot” of the complete dataset (Figure 1.1), although showing no strong distinct patterns, allows us to see the mass of data that can be available in a typical longitudinal trial.

Figure 1.1 Available data in Parkinson's disease dataset.

A high score here indicates poor subject outcome. Parkinson's disease is progressive, and for most treatments of the disease one would expect to see a return to worsening after three to six months of treatment, as in Emre et al. (2004) and Parkinson Study Group (2004a, 2004b) just cited. In other words, some transient improvement may be achieved and progression may be delayed for some time by treatment, but progression is not expected to stop completely. The reader may be able to see from Figure 1.1 that indeed, while many subjects in the illustrative dataset improved slightly (lower scores), subjects tended to revert to disease progression towards the end of the trial (higher scores).

In our example dataset, nearly 38% of subjects discontinued early, 18 (32%) and 25 (44%) subjects in the control and experimental arms, respectively, giving rise to substantial amounts of monotone missing data. Figure 1.2 highlights those subjects.

Figure 1.2 Parkinson's disease dataset: early discontinuations highlighted.

The large proportion of missing data for the primary endpoint in this example (38% of subjects discontinued) is troubling with regard to its impact on the power of the study. Also, the difference between treatment arms in the proportion of withdrawals (12% more in the experimental arm compared to placebo) is large enough to suggest that the reason for discontinuation depends on treatment. Both of these observations should motivate a careful consideration of the impact missing data may have on study conclusions. The statistician will want to consider ways to make inferences from the available study data while minimizing the possibility of bias that would unfairly favor the experimental treatment.

What options are available to proceed with analysis in the presence of missing data? The most obvious and easiest choice is to use only subjects with data available for the primary endpoint – study completers for whom assessments were performed at the final study visit. A second approach to consider would be to use all available longitudinal data (from all visits), including partial data from study dropouts, with the hope that this partial data could contribute in a meaningful way to the overall statistical analysis. Finally, we can impute missing data of discontinued subjects in some principled way, taking into account the information we have about these subjects prior to their dropout. We will discuss these three basic options in more detail below.

1.3 Why can't I use only the available primary endpoint data?

Sometimes, only the subjects with available data for the primary endpoint (study completers) are used in the primary study analysis and test. Could we discard the data from subjects who discontinued early, use only available data at Week 28, and still have an unbiased estimate of treatment effect? If we are interested in estimating the treatment effect in the kind of subject who would complete the nine trial visits, then the data available at the ninth visit (Visit 8, Week 28) can be the basis of an unbiased estimate. What is to be estimated – the estimand – is important in assessing how to handle missing data. Estimands are discussed at length in the U.S. National Research Council report, The prevention and treatment of missing data in clinical trials, commissioned by the U.S. Food and Drug Administration and published in 2010. A variety of estimands are discussed in Section 4.1.1, and US and EU regulatory guidance are discussed in Chapter 3. Usually, however, it is desired to estimate not just the treatment effect among the “elite” selection of subjects that completed the trial, but something more widely applicable such as the treatment effect in all subjects of the type randomized to the clinical trial (including both completers and subjects who discontinued early). The reasons recorded for discontinuation often suggest that many subjects discontinue either because of side effects or because of lack of efficacy. Thus, there is often good reason to believe that the efficacy score would be better in completers than in the full set of randomized subjects. In summary, complete cases (data from study completers) may give an estimate of efficacy that is not representative of all subjects in the study, and likely will be too favorable to the study treatments. An approach that is applicable to all subjects randomized will generally be more useful and more acceptable to the regulator. 
This approach where results are applicable to all subjects randomized is known as the “intent-to-treat” (ITT) approach. According to the ITT principle, all subjects that were included (randomized) in the trial should be included in the analysis, regardless of their compliance with treatment.

The use of data from completers only has an additional drawback: partial data from subjects who discontinued early, but who still provided some information prior to withdrawal, are completely wasted.

In our dataset, Figure 1.3 illustrates the somewhat poorer efficacy scores that can pertain to subjects who discontinue early, taking as an example subjects whose last observation was at Week 6 or 8 (10 discontinuations each in the control and experimental treatment groups). Early withdrawals in the control group had higher (worse) mean efficacy scores from the start, compared to completers in their own treatment group. Early withdrawals in the experimental treatment group had a lower (better) mean score at baseline; by Visit 4 (Week 6) the gap between the completers and withdrawals had narrowed, and at Visit 5 (Week 8) the withdrawals now had a higher (worse) score than completers. Figure 1.9 in Section 1.10.1 shows that, taking all subjects in this illustrative dataset, withdrawals tended to have worse UPDRS scores than completers, but not at all visits. However, very often in our experience, completers tend to have more favorable trajectories than subjects who discontinue early, and thus tend not to be representative of efficacy for all subjects who were randomized to the trial.

Figure 1.3 Parkinson's disease dataset: mean efficacy score at each time point for completers and for subjects whose last observation was at Visit 4 or 5 (Week 6 or 8).

Figure 1.4 Parkinson's disease dataset: four selected trajectories.

Figure 1.5 Parkinson's disease dataset: LOCF imputation.

Figure 1.6 Parkinson's disease dataset: MAR imputation for selected trajectories.

Figure 1.7 Parkinson's disease dataset, Kaplan-Meier plot of time to discontinuation from the study in each treatment arm. The two rows of numbers within the plot at the bottom are counts of those “at risk” of discontinuation at each time point.

Figure 1.8 Parkinson's disease dataset, summary of mean change from baseline (CFB) in UPDRS sub-score by time point across dropout cohorts (grouped by time of discontinuation) and study completers.

Figure 1.9 Parkinson's disease dataset, summary of mean change from baseline (CFB) in UPDRS sub-score by time point for study dropouts and completers in each treatment arm.

Perhaps there may be cases where the scores of completers are better than the scores of dropouts to a similar extent in both arms, and the estimated difference between treatment groups may be unaffected? We can never be sure that this will be true, and in many trials the proportion of early discontinuations, and the reasons for discontinuation, vary substantially between treatment groups. If reasons for discontinuation do vary by treatment group, then efficacy in completers will likely vary between treatment groups also, biasing the “available cases” estimate of the difference between treatment groups. In our example study, a higher proportion of subjects discontinued early due to adverse events (AEs) in the experimental arm compared to the control arm (Table 1.2). Excluding dropouts from analysis would likely favor the experimental arm to a greater extent than the control arm, if AEs are associated with poorer efficacy scores. With this kind of difference in discontinuation between treatment groups, it would be very difficult to interpret an estimate based only on completers. Such an estimate could not with credibility be applied to the general Parkinson's population. As noted above, another disadvantage of analyzing completers only is that we are not making use of information from subjects who were partially observed. Thus, a statistical test based on completers would tend, all other things being equal, to have less power than a test which makes use in a principled way of data from withdrawals.
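The optimism of an available-cases estimate can be seen in a few lines of simulation. This Python sketch (the book's code examples use SAS; every number here is invented for illustration) makes subjects with poor mid-study scores more likely to discontinue, and then compares the completers-only mean final score with the mean over the full, partly unobserved data. Lower is better on this illustrative scale, so completers look healthier than the full randomized population.

```python
import random

random.seed(1)

# Illustrative dataset: a mid-study score and a correlated final score.
n = 20_000
mid = [random.gauss(32, 6) for _ in range(n)]
final = [0.8 * m + 8 + random.gauss(0, 3) for m in mid]

# MAR-style dropout: subjects with poor (high) mid-study scores have a
# 50% chance of discontinuing before the final visit.
drop = [random.random() < 0.5 and m > 35 for m in mid]

completers = [f for f, d in zip(final, drop) if not d]
mean_all = sum(final) / len(final)            # would require complete data
mean_completers = sum(completers) / len(completers)  # available-cases estimate
print(f"mean final score, all subjects: {mean_all:.1f}")
print(f"mean final score, completers:   {mean_completers:.1f}")
```

In a two-arm trial, this bias need not cancel between arms: if dropout reasons and rates differ by treatment, the completers-only treatment difference is biased as well.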

1.4 What's the problem with using last observation carried forward?

Until recently, the most common method of handling missing data was to estimate the treatment effect using the last available measurement for a subject. This method has been known as “last observation carried forward,” or LOCF. The argument can be made for LOCF that the observation just before a subject discontinues is likely to give evidence unfavorable to the study treatment, because the subject is likely to have left the study when his/her health was at a low point (unless he/she left for reasons unrelated to the condition under study). Last observation carried forward could thus be regarded as leading to a conservative estimate of efficacy for a treatment. To examine the LOCF method further, we plot some typical trajectories from our example study (Figure 1.4).

Figure 1.4 shows two typical trajectories of the efficacy score for completers and two typical trajectories for subjects who discontinued early. As is common in studies of Parkinson's disease, the completer on the control arm shows a small improvement and finishes the study close to his/her baseline value. The completer in the experimental arm also improves, and then reverts to a UPDRS value close to baseline. The two trajectories of early discontinuations are selected to show some implications of the method of handling missing data. Here, the control arm data are of a subject who discontinued very early, and experimental arm data are of a subject who discontinued after the midpoint of the study.

The efficacy score for the subject in the control group had changed little from baseline when he/she discontinued, and so LOCF imputes for Visit 8 (Week 28) a value almost unchanged since baseline (Figure 1.5). The subject in the experimental arm had a somewhat improved (lower) UPDRS score by Visit 6 (Week 12) when he/she discontinued. The values imputed by LOCF here do not seem very unreasonable, except that the tendency of subjects to worsen late in the study is not reflected in the imputation.

For LOCF to provide valid estimates of efficacy at the primary time point (e.g., the last scheduled visit), a very particular MNAR assumption would need to hold, namely that with a probability of one, no matter what the general trend of outcomes in the study, a future outcome is equal to a subject's last available outcome. Molenberghs and Kenward (2007, pp. 45–47) point out how strong and unrealistic the LOCF assumption is. Verbeke and Molenberghs (1997, Chapter 5) show how much at variance LOCF is with the linear mixed model, with breaches of the usual assumptions about group differences and evolution over time. In summary, the LOCF assumption is not often clinically plausible; LOCF is unlikely in general to give a sensible estimate of a subject's efficacy at the study endpoint.

It is striking that LOCF makes no use of the information about the likely trajectory of discontinuations that is available from other subjects in this study. For example, Figure 1.3 tells us that, among completers, there is a slight worsening (increase) in the mean efficacy score from Visit 5 (Week 8) onwards, in both treatment groups, perhaps reflecting the progressive nature of Parkinson's disease. No such worsening (increase) is included in LOCF imputations. Since LOCF fails to take account of the general worsening observed in subjects with Parkinson's disease, LOCF is likely to favor treatment arms that have more discontinuations, especially for subjects who discontinue mid-study, when the mean efficacy score is lowest (best). We see this to some extent in the case of the subject from the experimental arm who discontinued at Week 12. Should efficacy at Week 28 not be somewhat worse than efficacy at Week 12, given the progressive nature of Parkinson's disease as seen in the study trend? We note that our example here is not one that will often be found in a real clinical trial, as LOCF is generally considered not appropriate for progressive diseases, precisely because LOCF does not take progression into account. But even for other indications where the outcomes tend to improve with time, or for chronic diseases, there are other problems with this method that we will discuss later in the book.
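Mechanically, LOCF is trivially simple, which helps explain both its historical popularity and its weakness. A minimal sketch in Python (not the book's SAS code): a subject's visit sequence uses None for missing assessments, and the last observed value is carried forward into every later gap, regardless of the general trend in the study.

```python
def locf(values):
    """Carry the last observed value forward over any later gaps.

    `values` is one subject's outcome at each scheduled visit, with
    None marking a missing assessment.  A leading None (no prior
    observation to carry) is left as None.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

# A subject who improved to 24 by mid-study and then discontinued:
# LOCF imputes 24 at every remaining visit, ignoring any tendency
# of subjects to worsen towards the end of the study.
print(locf([30, 28, 26, 24, None, None, None]))
```

The imputed values are then treated as if they were real observations, which is exactly the single-imputation objection discussed next: the analysis reflects no uncertainty about the carried-forward values.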

Our uncertainty about missing values in the previous paragraph brings us to another widely expressed objection to LOCF and similar methods such as baseline observation carried forward (BOCF). LOCF and BOCF are known as single imputation methods. They posit a single imputed value for the missing value, and thereafter treat the imputed value as though it were “real” data. The objection to single imputation methods is that they fail to reflect uncertainty about missing data. Regulators have voiced particular concern with regard to this unrealistic lack of variability in single imputation methods, and this concern is described further in Chapter 3, which summarizes the regulatory documents; Chapter 6 discusses this issue further in the context of multiple imputation.

1.5 Can we just assume that data are missing at random?

Would an MAR assumption give more credible results than LOCF for an estimate of efficacy at Visit 8 (Week 28)? If we accept the MAR assumption, we take it that observed data can in some sense account for missing values. Thus, if we assume MAR, missingness of the outcomes Y is independent of unobserved data conditional on the observed outcomes Yobs and other covariables in the statistical model used. The MAR assumption states, as was defined earlier, that the probability of missingness does not depend on unobserved data, given observed outcomes. It is helpful to understand that this assumption has an implication for the distribution of unobserved “potential” outcomes, given observed outcomes (or, in the repeated measures context, for the distribution of future outcomes given earlier outcomes). Informally, MAR can be shown (Verbeke and Molenberghs, 2000, Section 20.2.1, Theorem 20.1, p. 334) to be equivalent to the assumption that the conditional distribution of potential (missing) outcomes for dropouts, given their observed outcomes, is the same as the conditional distribution of observed outcomes for patients who continued. As a result, the estimate of treatment effect that we get from the likelihood-based (ignorable) inference under MAR is essentially the estimate of what would have happened, had all the patients who discontinued remained on their respective treatments. See Section 4.2.2.3 for further discussion on what is estimated by MAR, and Section 4.1.1 on what it may be desired to estimate.
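This equivalence of conditional distributions is what makes observed data usable for inference about missing data. The Python sketch below (illustrative only; the book's examples are in SAS, and all numbers are invented) fits a regression of the final visit on an earlier visit using completers only and applies it to dropouts. Under MAR that relationship holds for dropouts too, so the resulting estimate recovers the full-data mean that the completers-only mean misses. Note that filling in each dropout's conditional mean is a single imputation, used here only to examine the point estimate; a genuine MI analysis would draw from the conditional distribution and pool with Rubin's rules to reflect uncertainty.

```python
import random

random.seed(7)

# Illustrative data: final score depends on an earlier observed score.
n = 20_000
early = [random.gauss(30, 5) for _ in range(n)]
final = [0.9 * e + 5 + random.gauss(0, 2) for e in early]
# MAR dropout: subjects with poor (high) early scores often discontinue.
drop = [random.random() < 0.6 and e > 33 for e in early]

# Fit a least-squares line to completers only.
cx = [e for e, d in zip(early, drop) if not d]
cy = [f for f, d in zip(final, drop) if not d]
mx, my = sum(cx) / len(cx), sum(cy) / len(cy)
slope = (sum((x - mx) * (y - my) for x, y in zip(cx, cy))
         / sum((x - mx) ** 2 for x in cx))
intercept = my - slope * mx

# Replace each dropout's unobserved final by its conditional prediction.
analysed = [f if not d else intercept + slope * e
            for e, f, d in zip(early, final, drop)]

true_mean = sum(final) / n       # knowable only with no missing data
completer_mean = my              # available-cases estimate (biased here)
mar_mean = sum(analysed) / n     # MAR-based estimate
```

Here the completers-only mean is noticeably too favorable (too low, since high is worse), while the MAR-based estimate tracks the true full-data mean, because the mechanism generating the dropout really was MAR given the early score.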

Thus under MAR we use all relevant study data, including partial data from discontinued subjects, to infer plausible values for missing data. Chapter 5 shows in detail how to do this in SAS® (SAS Institute Inc., 2011) using a direct likelihood method known as mixed models for repeated measures (MMRM). Direct likelihood approaches can include models of repeated measures for binary and other non-normal outcomes, as well as continuous, normally distributed outcomes. Chapter 6 gives details about how to implement the same MAR assumptions using multiple imputation (MI). When their statistical models are the same, the two methods – MI and MMRM – in theory should give similar results, and in our experience, they usually do. However, we note that with MI, the model used to impute missing values may be distinct from the primary analysis model used to estimate treatment effect. The model used to impute the missing values must include the explanatory variables of the primary analysis model, but can have extra variables in addition to those. MI can even include post-baseline variables to help model the outcome. In contrast, direct likelihood methods such as MMRM make inferences about missing values and in the same step estimate the treatment effect; inclusion of post-baseline covariables in this single step would almost certainly lead to confounding of the estimate of treatment effect, and so in practice direct likelihood approaches cannot make use of post-baseline variables other than the outcome being modeled. Thus, MI can use more information than direct likelihood approaches in handling missing data, which may in some circumstances give it an advantage over direct likelihood approaches.

Both methods – direct likelihood modeling and MI – use partially observed study data to make inferences with missing data under MAR and have an advantage over single-imputation LOCF in that they take account of the uncertainty pertaining to missing data. Generally speaking, the confidence intervals provided by MI and by direct likelihood methods such as MMRM depend on the amount of missing information with respect to the estimated parameters, whereas for the single-imputation LOCF estimates this is not the case. Thus, our uncertainty about the missing data is reflected in the summary statistics from MMRM and MI, but we cannot depend upon this being so when we use single imputation methods, such as LOCF or BOCF.
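How MI makes this uncertainty visible can be sketched in a few lines. Under Rubin's rules (Rubin, 1987), each of the m imputed datasets is analyzed in the usual complete-data way, and the m results are pooled: the total variance adds the between-imputation variance, inflated by a factor (1 + 1/m), to the average within-imputation variance, so the pooled standard error widens as the imputations disagree, that is, as missing information grows. The Python fragment below (the book's examples use SAS; the treatment-effect numbers are invented for illustration) implements this pooling.

```python
from statistics import mean, variance

def rubin_pool(estimates, within_vars):
    """Pool m completed-data results by Rubin's rules (Rubin, 1987).

    `estimates` are the m point estimates of the same parameter;
    `within_vars` are their complete-data variances.
    """
    m = len(estimates)
    q_bar = mean(estimates)          # pooled point estimate
    w = mean(within_vars)            # within-imputation variance
    b = variance(estimates)          # between-imputation variance
    total = w + (1 + 1 / m) * b      # total variance of q_bar
    return q_bar, total

# Hypothetical treatment-effect estimates and their variances from
# m = 5 imputed datasets (numbers invented for illustration):
q, t = rubin_pool([1.9, 2.1, 2.0, 2.3, 1.7],
                  [0.25, 0.24, 0.26, 0.25, 0.25])
print(f"pooled estimate {q:.2f}, total variance {t:.3f}")
```

If the five imputations had agreed exactly, the between-imputation term would vanish and the pooled variance would equal the average within-imputation variance; their disagreement is precisely what inflates it.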

Chapter 6 describes how, in place of the missing data, MI uses a number of draws from the posterior distribution of the missing observation, given the observed data. The variability between the MI draws, calculated and incorporated using Rubin's rules (Rubin, 1987), reflects the uncertainty about the missing data. Figure 1.6 shows values imputed under the MAR assumption for the trajectories of the two selected subjects that discontinued early, using MI. MI uses all the data in Figure 1.1 to estimate the posterior distribution of the missing data. Imputed values shown on Figure 1.6