A Guide to Sample Size for Animal-based Studies - Penny S. Reynolds - E-Book

Description

A Guide to Sample Size for Animal-based Studies

Understand a foundational area of experimental design with this innovative reference

Animal-based research is an essential part of basic and preclinical research, but poses a unique set of experimental design challenges. The most important of these are the 3Rs - Replacement, Reduction and Refinement - the principles comprising the ethical framework for humane animal-based studies. However, many researchers have difficulty navigating the design trade-offs necessary to simultaneously minimize animal use, and produce scientific information that is both rigorous and reliable. A Guide to Sample Size for Animal-based Studies meets this need with a thorough, accessible reference work to the subject.

This book provides a straightforward systematic approach to "rightsizing" animal-based experiments, with sample size estimates based on the fundamentals of statistical thinking: structured research questions, variation control and appropriate design of experiments. The result is a much-needed guide to planning animal-based experiments to ensure scientifically valid and reliable results.

This book offers:

* Step-by-step guidance in diverse methods for approximating and refining sample size
* Detailed treatment of research topics specific to animal-based research, including pilot, feasibility and proof-of-concept studies
* Sample size approximation methods for different types of data - binary, continuous, ordinal, time to event - and different study types - description, comparison, nested designs, reference interval construction and dose-response studies
* Numerous worked examples, using real data from published papers, together with SAS and R code

A Guide to Sample Size for Animal-based Studies is a must-have reference for preclinical and veterinary researchers, as well as ethical oversight committees and policymakers.


Page count: 677

Year of publication: 2023




A Guide to Sample Size for Animal‐based Studies

Penny S. Reynolds

Department of Anesthesiology, College of Medicine
Department of Small Animal Clinical Sciences, College of Veterinary Medicine
University of Florida, Gainesville, Florida, USA

This edition first published 2024
© 2024 John Wiley & Sons Ltd

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Penny S. Reynolds to be identified as the author of this work has been asserted in accordance with law.

Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty

While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging‐in‐Publication Data applied for

Paperback ISBN: 9781119799979

Cover Design: Wiley
Cover Images: © MOLEKUUL/SCIENCE PHOTO LIBRARY/Getty Images; MOLEKUUL/SCIENCE PHOTO LIBRARY/Getty Images; Verin/Shutterstock; n.tati.m/Shutterstock; Mariia Zotova/Getty Images; RF Pictures/Getty Images

To Nyx, Mel, Finnegan, and Fat Boy Higgins

Holly, Molly, and Abby

and all their nameless, uncounted kindred

who have done so much to advance science and medicine

Preface

‘How large a sample size do I need for my study?’ Although this is one of the most commonly asked questions in statistics, the importance of proper sample size estimation seems to be overlooked by many preclinical researchers. Over the past two decades, numerous reviews of the published literature have indicated that many studies are too small to answer the research question and that their results are too unreliable to be trusted. Few published studies present adequate justification of their chosen sample sizes or even report the total number of animals used. On the other hand, it is not unusual for protocols (usually those involving mouse models) to request preposterous numbers of animals, sometimes in the tens or even hundreds of thousands, ‘because this is an exploratory study, so it is unknown how many animals we will require’.

This widespread phenomenon of sample sizes based on nothing more than guesswork or intuition illustrates the pervasiveness of what Amos Tversky and Daniel Kahneman identified in 1971 as the ‘belief in the law of small numbers’. Researchers overwhelmingly rely on best judgement in planning experiments, but judgement is almost always misleading. Researchers choose sample sizes based on what ‘worked’ before or because a particular sample size is a favourite with the research community. Tversky and Kahneman showed that researchers who gamble their research results on small, intuitively based samples consistently have the odds stacked against their findings (even if results are true). They overestimate the stability and precision of their results, and fail to account for sampling variation as a possible reason for the observed patterns. The result is research waste on a colossal scale, especially of animals, that is increasingly difficult to justify.

This book was written to assist non‐statisticians who use animals in research to ‘right‐size’ experiments, so they are statistically, operationally, and ethically justifiable. A ‘right‐sized’ experiment has a clear plan for sample size justification and transparently reports the numbers of all animals used in the study. For basic and veterinary researchers, appropriate sample sizes are critical to the design and analysis of a study. The best sample sizes optimise study design to align with available resources and ensure the study is adequately powered to detect meaningful, reliable, and generalisable results. Other stakeholders not directly involved in animal experimentation can also benefit from understanding the basic principles involved. Oversight veterinarians and ethical oversight committees are responsible for appraising animal research protocols for compliance with best practice, ethical, and regulatory standards. An appreciation of sample size construction can help assess scientific and ethical justifications for animal use and whether the proposed sample size is fit for purpose. Funding agencies and policymakers use research results to inform decisions related to animal welfare, public health, and future scientific benefit. Understanding the logic behind sample size justification can assist in evaluation of study quality and reliability of research findings, and ultimately promote more informed evidence‐based decision‐making.

An extensive background in statistics is not required, but readers should have had some basic statistical training. The emphasis throughout is on the upstream components of the research process – statistical process, study planning, and sample size calculations rather than analysis. I have used real data in nearly all examples and provided formulae and code, so sample size approximations can be reproduced by hand or by computer. By training and inclination I prefer SAS, but whenever possible I have provided R code or links to R libraries.

Acknowledgements

Many thanks to Anton Bespalov (PAASP, Heidelberg, Germany); Cori Astrom, Christina Hendricks, and Bryan Penberthy (University of Florida); Cora Mezger, Maria Christodoulou, and Mariagrazia Zottoli (Department of Statistics, University of Oxford); and Megan Lafollette (North American 3Rs Collaborative), who kindly reviewed various chapters of this book whilst it was in preparation and provided much helpful feedback. Thanks to the University of Florida IACUC chairs Dan Brown and Rebecca Kimball, who encouraged researchers to consult the original 10‐page handout I had devised for sample size estimation. And last, but certainly not least, special thanks to Tim Morey, Chair of the Department of Anesthesiology, University of Florida, who encouraged me to put that handout into book form.

Thanks are also due to the University of Florida Faculty Endowment Fund for providing me with a Faculty Enhancement Opportunities grant to allow me to devote some concentrated time to writing. A generous honorarium from the Scientists Center for Animal Welfare (SCAW) and an award from the UK Animals in Science Education Trust enabled me to upgrade my home computer system, making working on this project immeasurably easier.

The book was nearing completion when I came across the Icelandic word sprakkar that means ‘extraordinary women’. I have been fortunate to encounter many sprakkar whilst writing this book. In addition to the women (and men!) already mentioned, special thanks to researchers Amara Estrada, Francesca Griffin, Autumn Harris, Maggie Hull, Wendy Mandese, and Elizabeth Nunamaker, who generously allowed me to use some of their data as examples. And special thanks to Jane Buck and Julie Laskaris for their wonderful friendship and hospitality over the years. Jane Buck, Professor Emerita of Psychology, Delaware State University, and past president of the American Association of University Professors, continues to amaze and show what is possible for a statistician ‘with attitude’. Julie advised me that the only approach to properly edit one’s own work on a book‐length project was to ‘slit its throat’, then told me to do as she said, not as she actually did. Cheers.

Part I: What is Sample Size?

Chapter 1: The Sample Size Problem in Animal‐Based Research.

Chapter 2: Sample Size Basics.

Chapter 3: Ten Strategies to Increase Information (and Reduce Sample Size).

1 The Sample Size Problem in Animal‐Based Research

Chapter Outline

1.1 Organisation of the Book

References

Good Numbers Matter. This is especially true when animals are research subjects. Researchers are responsible for minimising both direct harms to research animals and the indirect harms that result from wasting animals in poor‐quality studies (Reynolds 2021). The ethical use of animals in research is framed by the ‘Three Rs’ principles of Replacement, Reduction, and Refinement. Originating over 60 years ago (Russell and Burch 1959), the 3Rs strategy is framed by the premise that maximal information should be obtained for minimal harms. Harms are minimised by Replacement, methods or technologies that substitute for animals; Reduction, the methods using the fewest animals for the most robust and scientifically valid information; and Refinement, the methods that improve animal welfare through minimising pain, suffering, distress, and other harms (Graham and Prescott 2015).

The focus of this book is on Reduction and methods of ‘right‐sizing’ experiments. A right‐sized experiment is one of optimal size: it achieves the study objectives with the least amount of resources, including animals. However, simply minimising the total number of animals is not the same as right‐sizing. A right‐sized experiment has a sample size that is statistically, operationally, and ethically defensible (Box 1.1). This will mean compromising between the scientific objectives of the study, production of scientifically valid results, availability of resources, and the ethical requirement to minimise waste and suffering of research animals. Thus, sample size determination is not a single calculation but a set of calculations, involving iteration through formal estimates, followed by reality checks for feasibility and ethical constraints (Reynolds 2019).

BOX 1.1 Right‐Sizing Checklist

Statistically defensible: Are numbers verifiable? (Calculations)

* Outcome variable identified
* Difference to be detected
* Expected variation in response
* Number of groups
* Anticipated statistical test (if hypothesis tests used)
* All calculations shown

Operationally defensible: Are numbers feasible? (Resources)

* Qualified technical staff
* Time
* Space
* Resources
* Equipment
* Funding

Ethically defensible: Are numbers fit for purpose? (3Rs)

* Appropriate for study objectives?
* Reasonable number of groups?
* Are collateral losses accounted for and minimised?
* Are loss mitigation plans described?
* Are 3Rs strategies described?

Source: Adapted from Reynolds (2021).
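The statistical items in the checklist map directly onto the inputs of a basic power‐based estimate. The following is a minimal sketch, not a method taken from this book: it assumes a two‐group comparison of means for a continuous outcome, equal group sizes, a two‐sided test, and the normal approximation (which slightly underestimates the numbers a t‐test would require when groups are small). The function names and all example values are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate animals per group for a two-sided, two-group comparison of means.

    delta : difference to be detected (same units as the outcome variable)
    sd    : expected standard deviation of the response
    Uses the normal approximation n = 2 * ((z_alpha + z_beta) * sd / delta)^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile corresponding to the target power
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


def inflate_for_losses(n, expected_loss):
    """Inflate a group size so that n animals remain after an assumed loss fraction."""
    return math.ceil(n / (1 - expected_loss))


# Hypothetical inputs: detect a difference of one standard deviation.
n = n_per_group(delta=10.0, sd=10.0)
print(n, inflate_for_losses(n, expected_loss=0.10))

# Halving the difference to be detected roughly quadruples the group size.
print(n_per_group(delta=5.0, sd=10.0))
```

Because the required number scales with the square of sd/delta, the difference to be detected and the expected variation need to be justified before any number is quoted. The resulting figure is only the statistical part of the checklist and still has to pass the operational and ethical checks.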

Additional challenges to right‐sizing experiments include those imposed by experimental design and biological variability (Box 1.2). In The Principles of Humane Experimental Technique (1959), Russell and Burch were very clear that Reduction is achieved by systematic strategies of experimentation rather than trial and error. In particular, they emphasised the role of the statistically based family of experimental designs and design principles proposed by Ronald Fisher, still relatively new at the time. Formal experimental designs customised to address the particular research question increase the experimental signal through the reduction of variation. Design principles that reduce bias, such as randomisation and allocation concealment (blinding), increase validity. These methods increase the amount of usable information that can be obtained from each animal (Parker and Browne 2014).

Although it has now been almost a century since Fisher‐type designs were developed, many researchers in biomedical sciences still seem unaware of their existence. Many preclinical studies reported in the literature consist of numerous two‐group designs. However, this approach is inefficient, inflexible, and unsuited to exploratory studies with multiple explanatory variables (Reynolds 2022). Statistically based designs are rarely reported in the preclinical literature. In part, this is because the design of experiments is seldom taught in introductory statistics courses directed towards biomedical researchers.

BOX 1.2 Challenges for Right‐Sizing Animal‐Based Studies

Ethics and welfare considerations. The three Rs (Replacement, Reduction, and Refinement) should be the primary driver of animal numbers.

Experimental design. Animal‐based research has no design culture. Clinical trial models are inappropriate for exploratory research. Multifactorial agricultural/industrial designs may be more suitable in many cases, but they are unfamiliar to most researchers.

Biological variability. Animals can display significant differences in responses to interventions, making it challenging to estimate an appropriate sample size.

Cost and resource constraints. The financial cost of conducting animal‐based research, including the cost of housing, caring for, and monitoring the animals, must be considered in estimates of sample size.

Power calculations are the gold standard for sample size justification. However, they are commonly misapplied, with little or no consideration of study design, type of outcome variable, or the purpose of the study. The most common power calculation is for two‐group comparisons of independent samples. However, this is inappropriate when the study is intended to examine multiple independent factors and interactions. Power calculations for continuous variables are not appropriate for correlated observations or count data with high prevalence of zeros. Power calculations cannot be used at all when statistical inference is not the purpose of the study, for example, assessment of operational and ethical feasibility, descriptive or natural history studies, and species inventories.
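One way to see why calculations that assume independent samples mislead when observations are correlated is the standard design‐effect approximation for clustered data. This is an illustrative sketch only, not a method described in this chapter: it assumes equal‐sized clusters (for example, pups within litters) and a single intraclass correlation, and the function names and numbers are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations that carry the same information."""
    total = n_clusters * cluster_size
    return total / design_effect(cluster_size, icc)


# Hypothetical example: 6 litters of 8 pups each (48 animals in total).
for icc in (0.0, 0.2, 0.5, 1.0):
    print(icc, round(effective_sample_size(6, 8, icc), 1))
```

Under these assumptions, 48 pups behave like 48 independent animals only when the correlation within litters is zero; with perfectly correlated littermates they carry no more information than 6 animals. A two‐group power calculation that treats each pup as an independent unit would therefore overstate the information available.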

Evidence of right‐sizing is provided by a clear plan for sample size justification and transparent reporting of the number of all animals used in the study. This is why these items are part of best‐practice reporting standards for animal research publications (Kilkenny et al. 2010, Percie du Sert et al. 2020) and are essential for the assessment of research reproducibility (Vollert et al. 2020). Unfortunately, there is little evidence that either sample size justification or sample size reporting has improved over the past decade. Most published animal research studies are underpowered and biased (Button et al. 2013, Henderson et al. 2013, Macleod et al. 2015) with poor validity (Würbel 2017, Sena and Currie 2019), severely limiting reproducibility and translation potential (Sena et al. 2010, Silverman et al. 2017). A recent cross‐sectional survey of mouse cancer model papers published in high‐impact oncology journals found that fewer than 2% reported formal power calculations, and less than one‐third reported sample size per group. It was impossible to determine attrition losses, or how many experiments (and therefore animals) were discarded due to failure to achieve statistical significance (Nunamaker and Reynolds 2022). The most common sample size mistake is not performing any calculations at all (Fosgate 2009). Instead, researchers make vague and unsubstantiated statements such as ‘Sample size was chosen because it is what everyone else uses’ or ‘experience has shown this is the number needed for statistical significance’. Researchers often game, or otherwise adjust, calculations to obtain a preferred sample size (Schulz and Grimes 2005, Fitzpatrick et al. 2018). In effect, these studies were performed without justification of the number of animals used.

Statistical thinking is both a mindset and a set of skills for understanding and making decisions based on data (Tong 2019). Reproducible data can only be obtained by sustained application of statistical thinking to all experimental processes: good laboratory procedure, standardised and comprehensive operating protocols, appropriate design of experiments, and methods of collecting and analysing data. Appropriate strategies of sample size justification are an essential component.

1.1 Organisation of the Book

This book is a guide to methods of approximating sample sizes. There will never be one number or approach, and sample size will be determined for the most part by study objectives and choice of the most appropriate statistically based study design. Although advanced statistical or mathematical skills are not required, readers are expected to have taken at least a basic course on statistical analysis methods and to have some familiarity with the basics of power and hypothesis testing. SAS code is provided in appendices at the end of each chapter, and references to specific R packages are given in the text. It is strongly recommended that everyone involved in devising animal‐based experiments take at least one course in the design of experiments, a topic not often covered by statistical analysis courses.

Figure 1.1 Overview of book organisation. For animal numbers to be justifiable (Are they feasible? appropriate? ethical? verifiable?), sample size should be determined by formal quantitative calculations (arithmetic, probability‐based, precision‐based, power‐based) and consideration of operational constraints.

This book is organised into four sections (Figure 1.1).

Part I Sample size basics discusses definitions of sample size, elements of sample size determination, and strategies for maximising information power without increasing sample size.

Part II Feasibility. This section presents strategies for establishing study feasibility with pilot studies. Justification of animal numbers must first address questions of operational feasibility (‘Can it work?’ Is the study possible? suitable? convenient? sustainable?). Once operational logistics are standardised, pilot studies can be performed to establish empirical feasibility (‘Does it work?’ Is the output large enough to be measured? consistent enough to be reliable?) and translational feasibility (‘Will it work?’ proof of concept and proof of principle) before proceeding to the main experiments. Power calculations are not appropriate for most pilots. Instead, common‐sense feasibility checks include basic arithmetic (with structured back‐of‐the‐envelope calculations), simple probability‐based calculations, and graphics.

Part III Description. This section presents methods for summarising the main features of the sample data and results. Basic descriptive statistics provide a simple and concise summary of the data in terms of central tendency and dispersion or spread. Graphical representations are used to identify patterns and outliers and explore relationships between variables. Intervals computed from the sample data give the range of values estimated to contain the true value of a population parameter with a certain degree of confidence. Four types of intervals are discussed: confidence intervals, prediction intervals, tolerance intervals, and reference intervals. Intervals shift emphasis away from significance tests and P‐values to more meaningful interpretation of results.

Part IV Comparisons. Power‐based calculations for sample size are centred on understanding effect size in the context of specific experimental designs and the choice of outcome variables. Effect size provides information about the practical significance of the results beyond considerations of statistical significance. Specific designs considered are two‐group comparisons, ANOVA‐type designs, and hierarchical designs.
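The simpler calculation types named for Parts II and III (arithmetic, probability‐based, and precision‐based) can each be sketched in a few lines. This is an illustrative sketch using standard textbook approximations, not code from this book; the function names, the assumed loss rate, and all example values are hypothetical.

```python
import math
from statistics import NormalDist


def total_animals(groups, n_per_group, replicates=1, expected_loss=0.0):
    """Arithmetic: groups x animals per group x replicates, inflated for losses."""
    return math.ceil(groups * n_per_group * replicates / (1 - expected_loss))


def n_to_observe_event(event_prob, confidence=0.95):
    """Probability-based: smallest n with P(at least one event) >= confidence.

    Solves 1 - (1 - p)^n >= confidence for n, assuming independent animals.
    """
    return math.ceil(math.log(1 - confidence) / math.log(1 - event_prob))


def n_for_ci_half_width(sd, half_width, confidence=0.95):
    """Precision-based: n so that a normal-approximation confidence interval
    for a mean has (approximately) the requested half-width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sd / half_width) ** 2)


# Hypothetical examples.
print(total_animals(groups=4, n_per_group=6, expected_loss=0.25))
print(n_to_observe_event(event_prob=0.10))
print(n_for_ci_half_width(sd=10.0, half_width=5.0))
```

None of these requires a hypothesis test or a power calculation, which is why checks of this kind suit pilot and descriptive studies where statistical inference is not the purpose.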

References

Button, K.S., Ioannidis, J.P.A., Mokrysz, C. et al. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience 14: 365–376.

Fitzpatrick, B.G., Koustova, E., and Wang, Y. (2018). Getting personal with the “reproducibility crisis”: interviews in the animal research community. Lab Animal (NY) 47: 175–177.

Fosgate, G.T. (2009). Practical sample size calculations for surveillance and diagnostic investigations. Journal of Veterinary Diagnostic Investigation 21: 3–14. https://doi.org/10.1177/104063870902100102.

Graham, M.L. and Prescott, M.J. (2015). The multifactorial role of the 3Rs in shifting the harm‐benefit analysis in animal models of disease. European Journal of Pharmacology 759: 19–29. https://doi.org/10.1016/j.ejphar.2015.03.040.

Henderson, V.C., Kimmelman, J., Fergusson, D. et al. (2013). Threats to validity in the design and conduct of preclinical efficacy studies: a systematic review of guidelines for in vivo animal experiments. PLoS Medicine 10: e1001489.

Kilkenny, C., Browne, W.J., Cuthill, I.C. et al. (2010). Improving bioscience research reporting: the ARRIVE guidelines for reporting animal research. PLoS Biology 8 (6): e1000412. https://doi.org/10.1371/journal.pbio.1000412.

Macleod, M.R., Lawson McLean, A., Kyriakopoulou, A. et al. (2015). Risk of bias in reports of in vivo research: a focus for improvement. PLoS Biology 13: e1002301. https://doi.org/10.1371/journal.pbio.1002273.

Nunamaker, E.A. and Reynolds, P.S. (2022). “Invisible actors”—how poor methodology reporting compromises mouse models of oncology: a cross‐sectional survey. PLoS ONE 17 (10): e0274738. https://doi.org/10.1371/journal.pone.0274738.

Parker, R.M.A. and Browne, W.J. (2014). The place of experimental design and statistics in the 3Rs. ILAR Journal 55 (3): 477–485.

Percie du Sert, N., Hurst, V., Ahluwalia, A. et al. (2020). The ARRIVE guidelines 2.0: updated guidelines for reporting animal research. PLoS Biology 18 (7): e3000410. https://doi.org/10.1371/journal.pbio.3000410.

Reynolds, P.S. (2019). When power calculations won’t do: Fermi approximation of animal numbers. Lab Animal (NY) 48: 249–253.

Reynolds, P.S. (2021). Statistics, statistical thinking, and the IACUC. Lab Animal (NY) 50 (10): 266–268. https://doi.org/10.1038/s41684-021-00832-w.

Reynolds, P.S. (2022). Between two stools: preclinical research, reproducibility, and statistical design of experiments. BMC Research Notes 15: 73. https://doi.org/10.1186/s13104-022-05965-w.

Russell, W.M.S. and Burch, R.L. (1959). The Principles of Humane Experimental Technique. London: Methuen.

Schulz, K.F. and Grimes, D.A. (2005). Sample size calculations in randomised trials: mandatory and mystical. Lancet 365 (9467): 1348–1353. https://doi.org/10.1016/S0140-6736(05)61034-3.

Sena, E.S. and Currie, G.L. (2019). How our approaches to assessing benefits and harms can be improved. Animal Welfare 28: 107–115.

Sena, E.S., van der Worp, H.B., Bath, P.M., Howells, D.W., and Macleod, M.R. (2010). Publication bias in reports of animal stroke studies leads to major overstatement of efficacy. PLoS Biology 8 (3): e1000344. https://doi.org/10.1371/journal.pbio.1000344.

Silverman, J., Macy, J., and Preisig, P. (2017). The role of the IACUC in ensuring research reproducibility. Lab Animal (NY) 46: 129–135.

Tong, C. (2019). Statistical inference enables bad science; statistical thinking enables good science. American Statistician 73: 246–261.

Vollert, J., Schenker, E., Macleod, M. et al. (2020). Systematic review of guidelines for internal validity in the design, conduct and analysis of preclinical biomedical experiments involving laboratory animals. BMJ Open Science 4 (1): e100046. https://doi.org/10.1136/bmjos-2019-100046.

Würbel, H. (2017). More than 3Rs: the importance of scientific validity for harm‐benefit analysis of animal research. Lab Animal (NY) 46: 164–166.

3 Ten Strategies to Increase Information (and Reduce Sample Size)

Chapter Outline

3.1 Introduction

3.2 The ‘Well‐Built’ Research Question

3.3 Structured Inputs (Experimental Design)

3.4 Reduce Variation I: Process Control

3.5 Reduce Variation II: Research Animals

3.6 Reduce Variation III: Statistical Control

3.7 Appropriate Comparators and Controls