Introduction to Statistical Analysis of Laboratory Data - Alfred Bartolucci - E-Book

Description

Introduction to Statistical Analysis of Laboratory Data presents a detailed discussion of important statistical concepts and methods of data presentation and analysis

  • Provides detailed discussions on statistical applications including a comprehensive package of statistical tools that are specific to the laboratory experiment process
  • Introduces terminology used in many applications such as the interpretation of assay design and validation as well as “fit for purpose” procedures including real world examples
  • Includes a rigorous review of statistical quality control procedures in laboratory methodologies and influences on capabilities
  • Presents methodologies used in areas such as method comparison procedures, limit and bias detection, outlier analysis, and detecting sources of variation
  • Analysis of robustness and ruggedness including multivariate influences on response are introduced to account for controllable/uncontrollable laboratory conditions

Page count: 420

Year of publication: 2015


Table of Contents

Cover

Title Page

Copyright

Dedication

Preface

Intended Audience

Prospectus

Acknowledgments

Chapter 1: Descriptive Statistics

1.1 Measures of Central Tendency

1.2 Measures of Variation

1.3 Laboratory Example

1.4 Putting it All Together

1.5 Summary

References

Chapter 2: Distributions and Hypothesis Testing in Formal Statistical Laboratory Procedures

2.1 Introduction

2.2 Confidence Intervals (CI)

2.3 Inferential Statistics – Hypothesis Testing

References

Chapter 3: Method Validation

3.1 Introduction

3.2 Accuracy

3.3 Brief Introduction to Bioassay

3.4 Sensitivity, Specificity (Selectivity)

3.5 Method Validation And Method Agreement – Bland-Altman

References

Chapter 4: Methodologies In Outlier Analysis

4.1 Introduction

4.2 Some Outlier Determination Techniques

4.3 Combined Method Comparison Outlier Analysis

4.4 Some Consequences of Outlier Removal

4.5 Considering Outlier Variance

References

Chapter 5: Statistical Process Control

5.1 Introduction

5.2 Control Charts

5.3 Capability Analysis

5.4 Capability Analysis – An Alternative Consideration

References

Chapter 6: Limits of Calibration

6.1 Calibration: Limit Strategies for Laboratory Assay Data

6.2 Limit Strategies

6.3 Method Detection Limits (EPA)

6.4 Data Near The Detection Limits

6.5 More On Statistical Management Of Nondetects

6.6 The Kaplan–Meier Method (Nonparametric Approach) for Analysis of Laboratory Data with Nondetects

References

Chapter 7: Calibration Bias

7.1 Error

7.2 Uncertainty

7.3 Sources of Uncertainty

7.4 Estimation Methods Of Uncertainty

7.5 Calibration Bias

7.6 Multiple Instruments

7.7 Crude Versus Precise Methodologies

References

Chapter 8: Robustness and Ruggedness

8.1 Introduction

8.2 Robustness

8.3 Ruggedness

8.4 An Alternative Procedure for Ruggedness Determination

8.5 Ruggedness and System Suitability Tests

References

Index

End User License Agreement

List of Illustrations

Chapter 1: Descriptive Statistics

Figure 1.1 Frequency Distribution of White Cell Counts

Figure 1.2 Frequency Distribution of Potassium Values

Figure 1.3 CV% for TSH.

Chapter 2: Distributions and Hypothesis Testing in Formal Statistical Laboratory Procedures

Figure 2.1 Distribution of 116 BUN Values – Northeast Lab

Figure 2.2 Shape of the Distribution: (a) Skewed to Left. (b) Symmetric. (c) Skewed to Right

Figure 2.3 Distribution of 76 Potassium Values – Northeast Lab

Figure 2.4 CD3 Laboratory Values from Cytometry Data

Figure 2.5 Distribution of (a) 10 Potassium Values Skewness = 1.774; (b) Log Transformed Potassium Values Skewness = 0.433

Figure 2.6 Types of Normal Distributions

Figure 2.7 Standardized BUN Values

Figure 2.8 Example of Normal Data (n = 24)

Figure 2.9 Steps in a Statistical Hypothesis Test

Figure 2.10 Difference: (Frozen Serum − Fresh Serum) – Paired t-Test

Figure 2.11 (a) Distribution of the Mean Difference; (b) The t-Test Results for Two Groups

Figure 2.12 Distribution of Data for Each Group

Figure 2.13 Skewed Potassium Values by Group

Chapter 3: Method Validation

Figure 3.1 New Instrument A Versus Standard Instrument (Reference) with Respect to Measuring a Particular Analyte

Figure 3.2 Test vs. Standard Results (Data from Table 3.1)

Figure 3.3 Increase in Y per One Unit Increase in X

Figure 3.4 What Are the Residuals?

Figure 3.5 Residuals Versus Predicted Values

Figure 3.6 Method Comparison PEC Versus PCC Escherichia coli

Figure 3.7 Method Comparison MPN Versus PEC E. coli

Figure 3.8 Standard Versus Spiked Sample

Figure 3.9 Accuracy Gene Expression

Figure 3.10 Indirect Assay Plot of Reticulocyte Count Versus Log10 (Dose)

Figure 3.11 Example of Interaction

Figure 3.12 Indirect Assay Parallel Model Plot of Residuals Versus Predicted Values

Figure 3.13 Indirect Assay Plot of Sigmoidal Response of Test (line filled with square) and Reference (line filled with diamond) Versus Log10 (Dose)

Figure 3.14 ROC Curve for Data in Table 3.12

Figure 3.15 ROC Curve for Data in Table 3.13

Figure 3.16 Bland–Altman Plot for Data in Table 3.1

Chapter 4: Methodologies In Outlier Analysis

Figure 4.1 Method A Mahalanobis Distances

Figure 4.2 Masking Example Mahalanobis Distance

Figure 4.3 Box Plot of Reticulocyte Counts

Figure 4.4 Method A versus Method B Total Cholesterol

Figure 4.5 Method A versus Method B with 95% Density Ellipse

Figure 4.6 Mahalanobis Distance for Methods A and B

Figure 4.7 Bland–Altman of Method A versus Method B

Figure 4.8 Method C Regressed on Method A

Figure 4.9 Mahalanobis Distance of Methods A and C

Figure 4.10 Bland–Altman Plot of Methods A and C

Figure 4.11 Standard Deviation of Replicate Lab Values

Chapter 5: Statistical Process Control

Figure 5.1 Percent Recovery Spike/Standard

Figure 5.2 Individual Measurement Chart

Figure 5.3 Control Chart of the Means

Figure 5.4 Control Chart of the Means for the Reduced Data Set of Table 5.2

Figure 5.5 Range Chart of Percent Recovery

Figure 5.6 Range Chart of Percent Recovery (Unequal Group Size)

Figure 5.7 S-Chart for Percent Recovery (Equal Group Sizes)

Figure 5.8 S-Chart for Percent Recovery (Unequal Group Sizes)

Figure 5.9 Median Control Chart for Percent Recovery (Equal Group Sizes)

Figure 5.10 Median Control Chart for Percent Recovery (Unequal Group Sizes)

Figure 5.11 (a) Capable C_p > 1. (b) Capable C_p ≤ 1

Figure 5.12 Capability Plot (CP) of Percent Recovery Data Based on the Control Limits (LCL, UCL)

Figure 5.13 Capability Plot (CP) of Percent Recovery Data Based on the Specified Limits (LSL, USL)

Figure 5.14 (a) Capable C_PL > 1. (b) Capable C_PU > 1

Figure 5.15 (a) Data Set of Percent Recovery – Skewed to the Right. (b) Data Set of Percent Recovery – Box Cox Transformed

Chapter 6: Limits of Calibration

Figure 6.1 Typical Detection Results

Figure 6.2 LoQ Results: Plot of Analyte Result by Concentration

Figure 6.3 Graphical Display of Statistical LoD and LoQ

Figure 6.4 Plot of LoD versus Concentration Showing Possible Range of Concentration

Figure 6.5 LoQ by Linear Determination

Figure 6.6 Plot of PPM versus Standard Normal Z-Values

Figure 6.7 Plot of PPM versus PERCENTILES (P_i) Data

Figure 6.8 Plot of PPM versus PERCENTILES, X = (P_i)^2.6 Data

Figure 6.9 Plot of PPM versus Standard Normal Z-Values 12 Observations

Figure 6.10 Kaplan–Meier method for Analysis of Data with Nondetects

Chapter 7: Calibration Bias

Figure 7.1 (a) Accurate and Precise: No Systematic, Little Random Error. (b) Inaccurate and Precise: Little Random Error but Significant Systematic Error. (c) Accurate and Imprecise: No Systematic, but Considerable Random Error. (d) Inaccurate and Imprecise: Both Types of Error

Figure 7.2 Rectangular Distribution

Figure 7.3 Triangular Distribution

Figure 7.4 Solar Radiation versus NO₂

Figure 7.5 Albumin BCG Relative Bias between Standard and Instrument A

Figure 7.7 Phosphorus Relative Bias between Standard and Instrument A

Figure 7.8 (a) GC–MS Calibration Bias – Average, One Instrument Y and One Compound. (b) GC–MS Calibration Bias – Residual, One Instrument Y and One Compound. (c) GC–MS Calibration Bias – All Data, One Instrument Y and One Compound. (d) GC–MS Calibration Bias – Residual, One Instrument Y and One Compound

Figure 7.9 (a) GC–MS Calibration Bias – Average, One Instrument Y and One Compound. (b) GC–MS Calibration Bias – Residual, One Instrument Y and One Compound. (c) GC–MS Calibration Bias – All Data, One Instrument Y and One Compound. (d) GC–MS Calibration Bias – Residual, One Instrument Y and One Compound

Figure 7.10 Means %RSDs for the Three Instruments X, Y, and Z

Figure 7.11 Fitted Regressions between Tech C and Tech P

Chapter 8: Robustness and Ruggedness

Figure 8.1 (a) Robustness Test – Weekly TNF-α mRNA Levels. (b) Robustness Test – Weekly IL-8 mRNA Levels

Figure 8.2 Normal Probability Plot for Single Experiment

List of Tables

Chapter 1: Descriptive Statistics

Table 1.1 Demonstration of Variance

Table 1.2 Potassium Values and Descriptive Statistics

Table 1.3 Descriptive Statistics of 10 Potassium (X) Values

Chapter 2: Distributions and Hypothesis Testing in Formal Statistical Laboratory Procedures

Table 2.1 Descriptive Statistics of 116 BUN Values – Northeast Lab

Table 2.2 Descriptive Statistics of 76 Potassium Values – Northeast Lab

Table 2.3 Descriptive Statistics of 500 Left Skewed Data

Table 2.4 Example of Skewed Data

Table 2.5 Descriptive Statistics of Standardized BUN Values

Table 2.6 Example of Normal Data (n = 24)

Table 2.7 The Decision Process in Hypothesis Testing

Table 2.8 Test Statistics for Mean Difference: (Frozen Serum − Fresh Serum) – Paired t-Test

Table 2.9 Test Statistics and p-Values for t-Test for Two Groups Assuming Equal Variances

Table 2.10 Summary Statistics for the Three Exercise Groups

Table 2.10 Test Statistics for the Three Exercise Groups

Table 2.11 Absolute Mean Differences and HSD Pairwise Comparisons

Table 2.12 Absolute Mean Differences and Pairwise Comparisons Using Bonferroni Comparisons

Table 2.14 Descriptive Statistics and Generic Output from a Wilcoxon (Mann–Whitney) Test for the Skewed Potassium Values

Chapter 3: Method Validation

Table 3.1 Data for Standard Versus Test Methodology

Table 3.2 Parameter Values for the Validation Example

Table 3.3 Parameter Values for the Validation Example of Escherichia coli (PCC vs. PEC)

Table 3.4 Parameter Values for the Validation Example of E. coli (MPN vs. PEC)

Table 3.5 Mandel Sensitivity Data

Table 3.6 Data for New Gene Expression

Table 3.7 Data for New Gene Expression

Table 3.8 Data for Direct Assay

Table 3.9 Summary Data for Indirect Assay

Table 3.10 Summary Data for ANCOVA Model

Table 3.11 Setup for Sensitivity Specificity Analysis

Table 3.12 Numerical Example of Sensitivity and Specificity

Table 3.13 Sensitivity of M TB DNA Detection

Chapter 4: Methodologies In Outlier Analysis

Table 4.1 Sample Data

Table 4.2 Example of Grubb Table of Critical Values

Table 4.3 Critical Values from Grubb A and Grubb B Procedures (α = 0.05)

Table 4.5 Critical Values from Prescott Procedure for Sequential Testing of at Most Two Outliers (α = 0.05)

Table 4.4 Prescott Sequential Test for Outliers (α = 0.05) – Method A Data

Table 4.6 The Dixon Q-Test

Table 4.7 Nonparametric Reticulocyte Outlier Data

Table 4.8 Comparative Reticulocyte Example

Table 4.9 Sample of Critical Values for Cochran C Outlier Variance Test (α = 0.05)

Table 4.10 Cochran G Test Results for the Six Laboratories

Chapter 5: Statistical Process Control

Table 5.1 Percent Recovery Data for MRNA data

Table 5.3 Sample of Control Chart Constants

Table 5.2 Reduced Percent Recovery Sample from Table 5.1

Table 5.4 Sigma Constants for Equation (5.5)

Table 5.5 Control Chart Constants for MAD Calculations

Table 5.6 Common Box Cox Transformations

Chapter 6: Limits of Calibration

Table 6.1 LoB and LoD Data

Table 6.2 Predicted Value of the Analyte at Each Concentration Level and CV%

Table 6.3 Differences between the Statistical and Empirical Results

Table 6.4 Comparison of the Empirical and Statistical LoQ Values

Table 6.5 Atrazine Results

Table 6.6 ROS Method for Nine Detects and Two NDs

Table 6.7 Alternative ROS Method for Nine Detects and Two NDs

Table 6.8 ROS Method for Multiple NDs in Various Positions

Table 6.9 Sample of Values of λ for Cohen's Adjustment

Table 6.10 Calculations for Using the Kaplan–Meier Methods for Analysis of Laboratory Data with Nondetect Data

Table 6.11 Calculated Values of C_j and A_j for Estimating the Mean, Standard Deviation, and SE for the Kaplan–Meier Analysis of Laboratory Data with Nondetects

Chapter 7: Calibration Bias

Table 7.1 Normally Distributed Air Pollution Data

Table 7.2 Nonnormally Distributed Air Pollution Data

Table 7.3 20 Bootstrap Samples

Table 7.4 Uncertainty Results from the 20 Bootstrap Samples

Table 7.5 GC–MS Calibration Bias – One Instrument Y, One Compound, Standard

Table 7.6 GC–MS Calibration Bias – One Instrument Z, One Compound, Standard

Table 7.7 Response Factor Standard Level Concentration

Table 7.8 ANOVA-Single Factor-Summary Results

Table 7.9 Absolute Mean Differences and HSD Pairwise Comparisons

Table 7.10 Simulated Data on Tech_P and Tech_C

Table 7.11 Regression Summary Statistics

Chapter 8: Robustness and Ruggedness

Table 8.1 Percent CV for Robustness Study

Table 8.2 A List of Factors That Could Be Considered During Ruggedness Testing in the HPLC Experiment

Table 8.3 Plackett–Burman Design Construction Pattern

Table 8.4 Plackett–Burman Design for Ruggedness Experiment

Table 8.5 Ruggedness Analysis: Average Effect for the Two Levels for Each Experiment

Table 8.6 Ruggedness Experiment: Table of Average Factor Effects and Their t-Statistic

Table 8.7 Plackett–Burman Design for a Single Experiment

Table 8.8 Factor Effects from Table 8.7

Table 8.9 Rank of Factor Effects from Table 8.8

Table 8.10 A List of Factors That Could Be Considered During Ruggedness Testing in the 12-Run+ HPLC Experiment

Table 8.11 Plackett–Burman Design for the 8-Factor 12-Run Single Experiment

Table 8.12 Twelve-Run, Eight-Factor Experiment

Table 8.13 Twelve-Run, Two-Factor Experiment

Introduction to Statistical Analysis of Laboratory Data

Alfred A. Bartolucci

University of Alabama at Birmingham, Birmingham, Alabama, USA

Karan P. Singh

University of Alabama at Birmingham, Birmingham, Alabama, USA

Sejong Bae

University of Alabama at Birmingham, Birmingham, Alabama, USA

Copyright © 2016 by John Wiley & Sons, Inc. All rights reserved

Published by John Wiley & Sons, Inc., Hoboken, New Jersey

Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Bartolucci, Alfred A., author.

Introduction to statistical analysis of laboratory data / Alfred A. Bartolucci, Karan P. Singh, Sejong Bae.

pages cm

Includes bibliographical references and index.

ISBN 978-1-118-73686-9 (cloth)

1. Diagnosis, Laboratory–Statistical methods. 2. Statistics. I. Singh, Karan P., author. II. Bae, Sejong, author. III. Title.

RB38.3.B37 2016

616.07′50151–dc23

2015025700

To Lieve and Frank

Preface

Intended Audience

The advantage of this book is that it provides a comprehensive knowledge of the analytical tools for problem solving related to laboratory data analysis and quality control. The content of the book is motivated by the topics that a laboratory statistics course audience and others have requested over the years since 2003. As a result, the book could also be used as a textbook in short courses on quantitative aspects of laboratory experimentation and a reference guide to statistical techniques in the laboratory and processing of pharmaceuticals. Output throughout the book is presented in familiar software format such as EXCEL and JMP (SAS Institute, Cary, NC).

The audience for this book could be laboratory scientists and directors, process chemists, medicinal chemists, analytical chemists, quality control scientists, quality assurance scientists, CMC regulatory affairs staff and managers, government regulators, microbiologists, drug safety scientists, pharmacists, pharmacokineticists, pharmacologists, research and development technicians, safety specialists, medical writers, clinical research directors and personnel, serologists, and stability coordinators. The book would also be suitable for graduate students in biology, chemistry, physical pharmacy, pharmaceutics, environmental health sciences and engineering, and biopharmaceutics. These individuals usually have an advanced degree in chemistry, pharmaceutics, and formulation science and hold job titles such as scientist, senior scientist, principal scientist, director, senior director, and vice president. The above partial list of titles is from the full list of attendees that have participated in the 2-day course titled “Introductory Statistics for Laboratory Data Analysis” given through the Center for Professional Innovation and Education.

Prospectus

There is an unmet need to have the necessary statistical tools in a comprehensive package with a focus on laboratory experimentation. The study of the statistical handling of laboratory data from the design, analysis, and graphical perspective is essential for understanding pharmaceutical research and development of results involving practical quantitative interpretation and communication of the experimental process. A basic understanding of statistical concepts is pertinent to those involved in the utilization of the results of quantitation from laboratory experimentation and how these relate to assuring the quality of drug products and decisions about bioavailability, processing, dosing and stability, and biomarker development. A fundamental knowledge of these concepts is critical as well for design, formulation, and manufacturing.

This book presents a detailed discussion of important basic statistical concepts and methods of data presentation and analysis in aspects of biological experimentation requiring a fundamental knowledge of probability and the foundations of statistical inference, including basic statistical terminology such as simple statistics (e.g., means, standard deviations, medians) and transformations needed to effectively communicate and understand one's data results. Statistical tests (one-sided, two-sided, nonparametric) are presented as required to initiate a research investigation (i.e., research questions in statistical terms). Topics include concepts of accuracy and precision in measurement analysis to ensure appropriate conclusions in experimental results including between- and within-laboratory variation. Further topics include statistical techniques to compare experimental approaches with respect to specificity, sensitivity, linearity, and validation and outlier analysis. Advanced topics of the book go beyond the basics and cover more complex issues in laboratory investigations with examples, including association studies such as correlation and regression analysis with laboratory applications, including dose response and nonlinear dose–response considerations. Model fit and parallelism are presented. To account for controllable/uncontrollable laboratory conditions, the analysis of robustness and ruggedness as well as suitability, including multivariate influences on response, are introduced. Method comparison using more accurate alternatives to correlation and regression analysis and pairwise comparisons including the Mandel sensitivity are pursued. Outliers, limit of detection and limit of quantitation and data handling of censored results (results below or above the limit of detection) with imputation methodology are discussed. Statistical quality control for process stability and capability is discussed and evaluated. 
Where relevant, the procedures provided follow the CLSI (Clinical and Laboratory Standards Institute) guidelines for data handling and presentation.

The significance of this book includes the following:

  • A comprehensive package of statistical tools (simple, cross-sectional, and longitudinal) required in laboratory experimentation
  • A solid introduction to the terminology used in many applications such as the interpretation of assay design and validation as well as “fit-for-purpose” procedures
  • A rigorous review of statistical quality control procedures in laboratory methodologies and influences on capabilities
  • A thorough presentation of methodologies used in areas such as method comparison procedures, limit and bias detection, outlier analysis, and detecting sources of variation

Acknowledgments

The authors would like to thank Ms. Laura Gallitz for her thorough review of the manuscript and excellent suggestions and edits that she provided throughout.

Chapter 1: Descriptive Statistics

1.1 Measures of Central Tendency

We wish to establish a basic understanding of statistical terms before dealing in detail with laboratory applications. We want to be sure we understand these concepts, since the data with which we deal are often described by summary statistics. We discuss what are commonly known as measures of central tendency, such as the mean, median, and mode, plus other descriptive measures of data. We also want to understand the difference between samples and populations.
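As a minimal sketch of the three measures of central tendency named above, the following uses Python's standard `statistics` module. The values are hypothetical and chosen only for illustration; they are not data from the book.

```python
import statistics

# Hypothetical laboratory readings (illustrative only, not data from the book).
values = [3.9, 4.1, 4.1, 4.3, 4.6, 5.0]

mean = statistics.mean(values)      # arithmetic average of the readings
median = statistics.median(values)  # middle value of the sorted readings
mode = statistics.mode(values)      # most frequently occurring reading

print(f"mean={mean:.2f}, median={median:.2f}, mode={mode}")
```

With an even number of readings, `statistics.median` averages the two middle values, which is the usual convention.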

Data come from the samples we take from a population. To be specific, a population is a collection of data whose properties are analyzed. The population is the complete collection to be studied; it contains all possible data points of interest. A sample is a part of the population of interest, a subcollection selected from a population. For example, if one wanted to determine the preference of voters in the United States for a political candidate, then all registered voters in the United States would be the population. One would sample a subset, say, 5000, from that population and then determine from the sample the preference for that candidate, perhaps noting the percent of the sample that prefers that candidate over another. It would be logistically impossible and prohibitively costly to canvass the entire population, so we take what we believe to be a representative sample from the population. If the sampling is done appropriately, then we can generalize our results to the whole population. Thus, in statistics, we work with the sample that we collect and base our decisions on it. Similarly, if we want to test a certain vegetable or fruit for food allergens or contaminants, we take a batch from the whole collection and send it to the laboratory, where it is subjected to chemical testing for the presence or degree of the allergen or contaminants. There are certain safeguards taken when one samples. For example, we want the sample to appropriately represent the whole population. Factors relevant in considering the representativeness of a sample include the homogeneity of the food and the relative sizes of the samples to be taken, among other considerations. Therefore, keep in mind that when we do statistics, we always deal with the sample in the expectation that what we conclude generalizes to the whole population.
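The voter example can be sketched in a few lines of Python. Everything numeric here apart from the sample size of 5000 is an assumption made for illustration: the population size, the 55% true preference rate, and the fixed random seed.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 1 = prefers the candidate, 0 = does not.
# The population size and the 55% preference rate are assumptions.
population = [1] * 550_000 + [0] * 450_000

# Simple random sample of 5000 voters, drawn without replacement.
sample = random.sample(population, 5000)

sample_pct = 100 * sum(sample) / len(sample)
population_pct = 100 * sum(population) / len(population)

print(f"sample: {sample_pct:.1f}%  population: {population_pct:.1f}%")
```

Because the sample is drawn at random, its percentage should land close to the population percentage, which is what allows a result from the sample to be generalized.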

Now let's talk about what we mean when we say we have a distribution of the data. The following is a sample of size 16 of white blood cell (WBC) counts ×1000 from a diseased sample of laboratory animals:

Note that these data are purposely presented in ascending order, which is not necessarily the order in which they were collected. Ordering the observations gives an idea of their range and presents them in a meaningful way. When we rank the data from the smallest to the largest, we call this a distribution.
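The ordering step described above, together with a tally of how often each value occurs, might look like the sketch below. The counts are hypothetical stand-ins, not the book's 16 WBC values.

```python
from collections import Counter

# Hypothetical counts (x1000) in the order collected; not the book's data.
wbc = [6.1, 5.4, 5.9, 5.4, 6.8, 5.9, 5.1, 5.9]

ordered = sorted(wbc)                   # rank from smallest to largest
frequency = Counter(ordered)            # how often each value occurs
data_range = ordered[-1] - ordered[0]   # largest minus smallest observation

# Text version of a frequency plot: one '#' per occurrence.
for value, count in sorted(frequency.items()):
    print(f"{value:>4}: {'#' * count} ({count})")
```

The printed bars play the same role as a frequency plot: bar length is the frequency of each value, with the count shown beside it.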

One can see the distribution of the WBC counts by examining Figure 1.1. We'll use this figure as well as the data points presented to demonstrate some of the statistics that will be commonplace throughout the text. The height of the bars represents the frequency of counts for each of the values 5.13–6.8, and the actual counts are placed on top of the bars. Let us note some properties of this distribution. The mean is easy. It is the average of the counts from 5.13 to 6.8. Algebraically, if we denote the elements of a sample of size $n$ as $X_1, X_2, \ldots, X_n$, then the sample mean in statistical notation is equal to

\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{X_1 + X_2 + \cdots + X_n}{n}. \]
