Understanding Statistical Error - Marek Gierlinski - E-Book

Understanding Statistical Error E-Book

Marek Gierlinski

Description

This accessible introductory textbook provides a straightforward, practical explanation of how statistical analysis and error measurements should be applied in biological research.

Understanding Statistical Error - A Primer for Biologists:

  • Introduces the essential topic of error analysis to biologists
  • Contains mathematics at a level that all biologists can grasp
  • Presents the formulas required to calculate each confidence interval for use in practice
  • Is based on a successful series of lectures from the author’s established course

Assuming no prior knowledge of statistics, this book covers the central topics needed for efficient data analysis, ranging from probability distributions, statistical estimators, confidence intervals, error propagation and uncertainties in linear regression, to advice on how to use error bars in graphs properly. Using simple mathematics, all these topics are carefully explained and illustrated with figures and worked examples. The emphasis throughout is on visual representation and on helping the reader to approach the analysis of experimental data with confidence.

This useful guide explains how to evaluate uncertainties of key parameters, such as the mean, median, proportion and correlation coefficient. Crucially, the reader will also learn why confidence intervals are important and how they compare against other measures of uncertainty.

Understanding Statistical Error - A Primer for Biologists can be used both by students and researchers to deepen their knowledge and find practical formulae to carry out error analysis calculations. It is a valuable guide for students, experimental biologists and professional researchers in biology, biostatistics, computational biology, cell and molecular biology, ecology, biological chemistry, drug discovery, biophysics, as well as wider subjects within life sciences and any field where error analysis is required.


Number of pages: 351

Year of publication: 2015




Understanding statistical error

A primer for biologists

Marek Gierliński

University of Dundee

This edition first published 2016 © 2016 by John Wiley & Sons Ltd

Registered office: John Wiley & Sons, Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK; 111 River Street, Hoboken, NJ 07030-5774, USA

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com/wiley-blackwell.

The right of the author to be identified as the author of this work has been asserted in accordance with the UK Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty: While the publisher and author(s) have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Library of Congress Cataloging-in-Publication Data

Gierliński, Marek, author.
Understanding statistical error: a primer for biologists / Marek Gierliński.
p. ; cm.
Includes bibliographical references and index.
ISBN 978-1-119-10691-3 (pbk.)
I. Title.
[DNLM: 1. Statistics as Topic. 2. Analysis of Variance. 3. Biostatistics. 4. Computational Biology–methods. 5. Probability. 6. Statistical Distributions. WA 950]
R853.S7
610.72′7–dc23

2015024748

A catalogue record for this book is available from the British Library.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Cover image: © Lonely_/iStockphoto

Errors, like straws, upon the surface flow;
He who would search for pearls must dive below

                    —John Dryden (1631–1700)

CONTENTS

Introduction

Why would you read an introduction?

What is this book about?

Who is this book for?

About maths

Acknowledgements

Chapter 1: Why do we need to evaluate errors?

Chapter 2: Probability distributions

2.1 Random variables

2.2 What is a probability distribution?

2.3 Mean, median, variance and standard deviation

2.4 Gaussian distribution

2.5 Central limit theorem

2.6 Log-normal distribution

2.7 Binomial distribution

2.8 Poisson distribution

2.9 Student's t-distribution

2.10 Exercises

Notes

Chapter 3: Measurement errors

3.1 Where do errors come from?

3.2 Simple model of random measurement errors

3.3 Intrinsic variability

3.4 Sampling error

3.5 Simple measurement errors

3.6 Exercises

Note

Chapter 4: Statistical estimators

4.1 Population and sample

4.2 What is a statistical estimator?

4.3 Estimator bias

4.4 Commonly used statistical estimators

4.5 Standard error

4.6 Standard error of the weighted mean

4.7 Error in the error

4.8 Degrees of freedom

4.9 Exercises

Notes

Chapter 5: Confidence intervals

5.1 Sampling distribution

5.2 Confidence interval: what does it really mean?

5.3 Why 95%?

5.4 Confidence interval of the mean

5.5 Standard error versus confidence interval

5.6 Confidence interval of the median

5.7 Confidence interval of the correlation coefficient

5.8 Confidence interval of a proportion

5.9 Confidence interval for count data

5.10 Bootstrapping

5.11 Replicates

5.12 Exercises

Notes

Chapter 6: Error bars

6.1 Designing a good plot

6.2 Error bars in plots

6.3 When can you get away without error bars?

6.4 Quoting numbers and errors

Summary

6.5 Exercises

Notes

Chapter 7: Propagation of errors

7.1 What is propagation of errors?

7.2 Single variable

7.3 Multiple variables

7.4 Correlated variables

7.5 To use error propagation or not?

7.6 Example: distance between two dots

7.7 Derivation of the error propagation formula for one variable

7.8 Derivation of the error propagation formula for multiple variables

7.9 Exercises

Note

Chapter 8: Errors in simple linear regression

8.1 Linear relation between two variables

8.2 Straight line fit

8.3 Confidence intervals of linear fit parameters

8.4 Linear fit prediction errors

8.5 Regression through the origin

8.6 General curve fitting

8.7 Derivation of errors on fit parameters

8.8 Exercises

Notes

Chapter 9: Worked example

9.1 The experiment

9.2 Results

9.3 Discussion

9.4 The final paragraph

Notes

Solutions to exercises

Appendix A

Bibliography

Index

EULA



Introduction

Why would you read an introduction?

Almost every nonfiction book is preceded by an ‘introduction’, a ‘preface’ or a ‘foreword’, or sometimes a combination of the above. If you are (un)lucky, you might find a note from the Editor, a foreword followed by the preface to the first edition, a preface to the second edition and a general introduction. There, first of all, you can read about how great the author is. Next, you will find that the book is unique and better than all other books on the topic written so far. Then, the author will delve into a painstakingly detailed description of each chapter, which, by the way, can be found in the table of contents. Finally, there is time for compulsory acknowledgements to all the family and friends whom the author forced into reading his or her magnum opus. There is no escaping; forewords, prefaces and introductions are everywhere. Stanisław Lem once wrote a book consisting entirely of forewords (Lem 1979).

People usually skip all of these intros as they are boring, pretentious, self-righteous and useless. All right, are you still with me? If you managed to get this far, you might be one of the few who actually read introductions. Very well, then. I'll try to be brief, to the point and not too conceited.

What is this book about?

As the title suggests, the book is about error analysis, with emphasis on applications in biology or, more generally, in life sciences. Since the time of the great Ronald Fisher, statistics has become an inherent part of biology. Very few numerical results from either biological or medical studies can make their way into publication without confirmation of their statistical significance. One way of doing this is by providing a p-value from a statistical test, that is, roughly speaking, the probability of obtaining a result at least as extreme as the one observed if there were no real effect. That is what this book is not about.

The other way of assessing the significance of a result is by finding its inherent error, or uncertainty. In my mind, a numerical result quoted without any kind of uncertainty is meaningless. Hence, it is good to know how to calculate errors. And that is what the book is about.

Here I discuss various aspects of error analysis: a bit of theoretical background and practical ways of calculating confidence intervals, but also graphical presentation of error bars and quoting numbers with errors. I put emphasis on intuition and understanding rather than practical computational recipes, although I give exact formulae for the various types of errors. Beware: this is not a comprehensive book on statistics; rather, it is focused on a practical understanding of uncertainty analysis. You can find more details in the table of contents, which precedes this introduction.

Who is this book for?

This book is written for an inquisitive biologist who wants to improve his or her understanding of data analysis. While a biologist is my target reader, the book may be useful for anyone who deals with numerical data and wants to learn more about how to evaluate and compare measurements. If you calculate various types of errors using a software package and you would like to find out where these errors come from, this book is for you. If you use standard deviations, standard errors and confidence intervals, but you are not sure what they really mean, this book is for you. If you struggle with finding errors of the median or correlation coefficient, this book is for you. Or, perhaps you are just curious and would like to learn a few basic things about uncertainty analysis – this book is also for you.

About maths

Despite the existence of a few attempts in the literature that use a purely intuitive approach (e.g. Motulsky 2010), I believe that it is very difficult to do statistics without maths. Plain English explanations cannot replace the strict precision of a mathematical equation. A simple derivation can explain where a given formula came from. Hence, there is maths in this book. Not very complex, not very extensive, but maths there is.

Needless to say, equations are required in practical applications, so if you need to find a particular uncertainty not provided by the statistical software you normally use, you can employ equations from this book. They can be easily encoded, either in any programming language or even in a computer spreadsheet. Mathematics in this book is quite basic; it doesn't really go beyond the level taught in a typical secondary school. Most equations contain simple algebra and sums. The most advanced operator I use is a derivative.

I don't want to scare potential readers away. This is not a mathematical textbook! I apply equations only when necessary and I always try to accompany them with an intuitive explanation. Often, I show the results of a computer simulation to illustrate the meaning of a concept or formula. I have also made a few simplifications and approximations here and there at the expense of mathematical correctness. I hope this makes the maths in this book much easier to understand.

I need to finish with a caveat. This is a book written primarily for biologists, not for mathematicians or physicists. Hence, there are no mathematical proofs, some derivations are not strict and there is a general lack of mathematical rigour. A mathematician might scowl at the content of this book, so if you are one, please shut your eyes now.

Acknowledgements

I would like to thank Professor Angus Lamond, who carefully read the manuscript from cover to cover and gave me many invaluable comments. Being a biologist, he helped me to better understand my target reader (you!). He also helped me with my English, which is not my first language.

Chapter 1: Why do we need to evaluate errors?

A measurement without error is meaningless.

—My physics teachers