Mathematical Statistics with Resampling and R
This thoroughly updated third edition combines the latest software applications with the benefits of modern resampling techniques
Resampling helps students understand the meaning of sampling distributions, sampling variability, P-values, hypothesis tests, and confidence intervals. The third edition of Mathematical Statistics with Resampling and R combines modern resampling techniques and mathematical statistics. This book is classroom-tested to ensure an accessible presentation, and uses the powerful and flexible computer language R for data analysis.
This book introduces permutation tests and bootstrap methods both to motivate classical inference methods and as useful tools in their own right when classical methods are inaccurate or unavailable. The book strikes a balance between simulation, computing, theory, data, and applications.
Throughout the book, new and updated case studies representing a diverse range of subjects, such as flight delays, birth weights of babies, U.S. demographics, views on sociological issues, and problems at Google and Instacart, illustrate the relevance of mathematical statistics to real-world applications.
Changes and additions to the third edition include updated R code based on ggplot2 and the tidyverse, new material on cause and effect and on control variates, expanded coverage of stratified sampling, the delta method moved to Chapter 7, an updated General Social Survey data set, a shift from "statistically significant" to "statistically discernible," and additional examples, exercises, and case studies.
Mathematical Statistics with Resampling and R is an ideal textbook for undergraduate and graduate students in mathematical statistics courses, as well as practitioners and researchers looking to expand their toolkit of resampling and classical techniques.
Cover
Title Page
Copyright
Dedication
Preface
Note
1 Data and Case Studies
1.1 Case Study: Flight Delays
1.2 Case Study: Birth Weights of Babies
1.3 Case Study: Verizon Repair Times
1.4 Case Study: Iowa Recidivism
1.5 Sampling
1.6 Parameters and Statistics
1.7 Case Study: General Social Survey
1.8 Sample Surveys
1.9 Case Study: Beer and Hot Wings
1.10 Case Study: Black Spruce Seedlings
1.11 Studies
1.12 Google Interview Question: Mobile Ads Optimization
Exercises
Notes
2 Exploratory Data Analysis
2.1 Basic Plots
2.2 Numeric Summaries
2.3 Boxplots
2.4 Quantiles and Normal Quantile Plots
2.5 Empirical Cumulative Distribution Functions
2.6 Scatter Plots
2.7 Skewness and Kurtosis
Exercises
3 Introduction to Hypothesis Testing: Permutation Tests
3.1 Introduction to Hypothesis Testing
3.2 Hypotheses
3.3 Permutation Tests
3.4 Matched Pairs
3.5 Cause and Effect
Exercises
Notes
4 Sampling Distributions
4.1 Sampling Distributions
4.2 Calculating Sampling Distributions
4.3 The Central Limit Theorem
Exercises
5 Introduction to Confidence Intervals: The Bootstrap
5.1 Introduction to the Bootstrap
5.2 The Plug‐in Principle
5.3 Bootstrap Percentile Intervals
5.4 Two Sample Bootstrap
5.5 Other Statistics
5.6 Bias
5.7 Monte Carlo Sampling
5.8 Accuracy of Bootstrap Distributions
5.9 How Many Bootstrap Samples Are Needed?
Exercises
Notes
6 Estimation
6.1 Maximum Likelihood Estimation
6.2 Method of Moments
6.3 Properties of Estimators
6.4 Statistical Practice
Exercises
Notes
7 More Confidence Intervals
7.1 Confidence Intervals for Means
7.2 Confidence Intervals Using Pivots
7.3 One‐Sided Confidence Intervals
7.4 Confidence Intervals for Proportions
7.5 Bootstrap Confidence Intervals
7.6 Confidence Interval Properties
7.7 The Delta Method*
Exercises
Notes
8 More Hypothesis Testing
8.1 Hypothesis Tests for Means and Proportions: One Population
8.2 Bootstrap Tests
8.3 Hypothesis Tests for Means and Proportions: Two Populations
8.4 Type I and Type II Errors
8.5 Interpreting Test Results
8.6 Likelihood Ratio Tests
8.7 Statistical Practice
Exercises
Notes
9 Regression
9.1 Covariance
9.2 Correlation
9.3 Least Squares Regression
9.4 The Simple Linear Model
9.5 Resampling Correlation and Regression
9.6 Logistic Regression
Exercises
Notes
10 Categorical Data
10.1 Independence in Contingency Tables
10.2 Permutation Test of Independence
10.3 Chi‐Square Test of Independence
10.4 Chi‐Square Test of Homogeneity
10.5 Goodness‐of‐Fit Tests
10.6 Chi‐Square and the Likelihood Ratio*
Exercises
Notes
11 Bayesian Methods
11.1 Bayes Theorem
11.2 Binomial Data: Discrete Prior Distributions
11.3 Binomial Data: Continuous Prior Distributions
11.4 Continuous Data
11.5 Sequential Data
Exercises
Notes
12 One‐Way ANOVA
12.1 Comparing Three or More Populations
Exercises
Notes
13 Additional Topics
13.1 Smoothed Bootstrap
13.2 Parametric Bootstrap
13.3 Stratified Sampling
13.4 Control Variates and Causal Modeling
13.5 Computational Issues in Bayesian Analysis
13.6 Monte Carlo Integration
13.7 Importance Sampling
13.8 The EM Algorithm
Exercises
Notes
Appendix A: Review of Probability
A.1 Basic Probability
A.2 Mean and Variance
A.3 Marginal and Conditional Distributions
A.4 The Normal Distribution
A.5 The Mean of a Sample of Random Variables
A.6 Sums of Normal Random Variables
A.7 The Law of Averages
A.8 Higher Moments and the Moment Generating Function
Appendix B: Probability Distributions
B.1 The Bernoulli and Binomial Distributions
B.2 The Multinomial Distribution
B.3 The Geometric Distribution
B.4 The Negative Binomial Distribution
B.5 The Hypergeometric Distribution
B.6 The Poisson Distribution
B.7 The Uniform Distribution
B.8 The Exponential Distribution
B.9 The Gamma Distribution
B.10 The Chi‐Square Distribution
B.11 The Student's t Distribution
B.12 The Beta Distribution
B.13 The F Distribution
Exercises
Appendix C: Distributions Quick Reference
Problem Solutions
Bibliography
Index
End User License Agreement
Chapter 1
Table 1.1 Partial view of FlightDelays data.
Table 1.2 Variables in data set FlightDelays.
Table 1.3 Variables in data set NCBirths2004.
Table 1.4 Variables in data set Verizon.
Table 1.5 Variables in data set Iowa Recidivism.
Table 1.6 Variables in data set GSS2018.
Table 1.7 Variables in data set Beerwings.
Table 1.8 Variables in data set Spruce.
Table 1.9 Variables in data set MobileAds.
Table 1.10 Partial view of MobileAds data.
Chapter 2
Table 2.1 Counts for the Carrier variable.
Table 2.2 Counts of Delayed Flights grouped by carrier.
Table 2.3 Distribution of length of flight delays for United Airlines.
Table 2.4 A set of 21 data values.
Chapter 3
Table 3.1 All possible partitions of {30, 25, 20, 18, 21, 22} into two sets...
Table 3.2 Hot wings consumption.
Table 3.3 Partial view of Beerwings data set.
Table 3.4 Partial view of diving scores in file Diving2017.
Chapter 5
Table 5.1 Summary of center and spread for the normal distribution example....
Table 5.2 Summary of center and spread for the gamma distribution example....
Chapter 6
Table 6.1 Sample of wind speeds (m/s) from Carleton College turbine.
Chapter 8
Table 8.1 Decisions by a jury in a murder trial.
Table 8.2 Cumulative probabilities for ....
Chapter 9
Table 9.1 Partial view of Spruce data.
Table 9.2 Bushmeat: local supply of fish per capita and biomass of 41 speci...
Table 9.3 Part of the data on driver fatalities in Pennsylvania.
Chapter 10
Table 10.1 Counts of death penalty opinions grouped by degree.
Table 10.2 Expected counts of death penalty opinions grouped by degree.
Table 10.3 Contingency table after permuting DeathPenalty column.
Table 10.4 Observed counts.
Table 10.5 General ... contingency table.
Table 10.6 Counts of candy preferences.
Table 10.7 Counts of home runs (in 162 games).
Table 10.8 Probabilities for Poisson distribution with ..., and expected coun...
Table 10.9 Chi‐square test for Poisson goodness‐of‐fit to home run data.
Table 10.10 Distribution of wind speed data into intervals.
Chapter 11
Table 11.1 Calculations for Example 11.2.
Chapter 12
Table 12.1 Summary statistics for birth weights.
Table 12.2 Observations drawn from the ... populations.
Table 12.3 ANOVA table.
Appendix B
Table B.1 Table for hypergeometric distribution.
Chapter 1
Figure 1.1 Density plot for CPR error.
Chapter 2
Figure 2.1 Bar chart of Carrier variable.
Figure 2.2 Histogram of lengths of flight delays for United Airlines. The di...
Figure 2.3 Histogram of average January temperatures in Washington state (18...
Figure 2.4 Example of a dot plot.
Figure 2.5 Boxplot for Table 2.4.
Figure 2.6 Distribution of lengths of the flight delays for United Airlines ...
Figure 2.7 Distribution of birth weights for North Carolina babies.
Figure 2.8 Density for ... with ....
Figure 2.9 (a) Example of normal quantile plot for data in Example 2.11. (b)...
Figure 2.10 Normal quantile plot for a random sample of the flight delay tim...
Figure 2.11 Normal quantile plot for average January temperatures in Washing...
Figure 2.12 (a) Example of sample with symmetric, bell‐shaped distribution w...
Figure 2.13 (a) Empirical cumulative distribution function for the data 3, 6...
Figure 2.14 Ecdf's for male and female beer consumption. The vertical line i...
Figure 2.15 A scatter plot of Beer against Hotwings.
Figure 2.16 Examples of scatter plots.
Figure 2.17 Examples of skewness and kurtosis for four distributions, includ...
Figure 2.18 Empirical cdf for a data set, ....
Chapter 3
Figure 3.1 Empirical cumulative distribution function of the null distributi...
Figure 3.2 Number of hot wings consumed, by gender.
Figure 3.3 Permutation distribution of the difference in means, male − femal...
Figure 3.4 Normal quantile plots of the ILEC and CLEC data.
Figure 3.5 Permutation distribution of difference of means (ILEC − CLEC) for...
Figure 3.6 Repair times for Verizon data. (a) Permutation distribution for d...
Figure 3.7 Repair times for Verizon data. (a) Difference in proportion of re...
Figure 3.8 A diagram to indicate that X causes Y.
Figure 3.9 Confounding variable C affecting both treatment X and outcome Y....
Figure 3.10 Possible confounders in a study of time on Instagram and mental ...
Figure 3.11 Both treatment X and outcome Y cause the collider B.
Figure 3.12 Outcome on green die is not independent of red die if conditioni...
Chapter 4
Figure 4.1 Distribution of ... after ... sets of ... tosses.
Figure 4.2 Distribution of ... after ... sets of ... tosses.
Figure 4.3 Dot plot for sampling distribution of sample means.
Figure 4.4 Density for the exponential distribution with ....
Figure 4.5 (a) Histogram and (b) quantile‐normal plots for simulated samplin...
Figure 4.6 Simulated sampling distribution of the maximum of a sample.
Figure 4.7 Simulated sampling distribution of the sum of two Poissons.
Figure 4.8 Density for the gamma distribution ... and ....
Figure 4.9 (a) Histogram and (b) quantile‐normal plots for the sampling dist...
Figure 4.10 Binomial distribution with ... and ..., with CLT approximation, and
Chapter 5
Figure 5.1 Bootstrap distribution of means for the North Carolina birth weig...
Figure 5.2 (a) The population distribution, .... (b) The theoretical sampling ...
Figure 5.3 Sampling and bootstrap distribution from a gamma distribution. (a...
Figure 5.4 Diagram of the process of creating a sampling distribution. Many ...
Figure 5.5 Diagram of the process of creating a bootstrap distribution. This...
Figure 5.6 (a) Histogram and (b) normal quantile plot of arsenic ...
Figure 5.7 (a) Histogram and (b) normal quantile plot of the bootstrap distr...
Figure 5.8 (a) Histogram and (b) normal quantile plot of the bootstrap distr...
Figure 5.9 (a) Histogram and (b) normal quantile plot of the permutation dis...
Figure 5.10 (a) Histogram and (b) quantile normal plots of the bootstrap dis...
Figure 5.11 Bootstrap distribution for the sample mean of the Verizon CLEC d...
Figure 5.12 Bootstrap distribution for the difference in means, ILEC–CLEC. T...
Figure 5.13 Bootstrap distribution for the difference in 25% trimmed means f...
Figure 5.14 Bootstrap distribution for the ratio of means.
Figure 5.15 Bootstrap distribution of relative risk.
Figure 5.16 Bootstrapped proportions of the high blood pressure group agains...
Figure 5.17 Bootstrap distribution for the mean, .... The left column shows th...
Figure 5.18 Bootstrap distributions for the mean, .... The left column shows t...
Figure 5.19 Bootstrap distributions for the median, .... The left column shows...
Figure 5.20 Bootstrap distributions for the mean, ..., exponential population....
Chapter 6
Figure 6.1 Likelihood for ..., after five heads and three tails.
Figure 6.2 Estimate of area under curve by rectangle.
Figure 6.3 Likelihood for ....
Figure 6.4 (a) Histogram of wind speeds (m/s) with the pdf for the Weibull d...
Figure 6.5 Sampling distribution of estimators for .... (a) Distribution of ....
Figure 6.6 Mean square error against ..., ....
Figure 6.7 (a) Running mean for Cauchy. The ... value is the running mean from...
Chapter 7
Figure 7.1 Random confidence intervals, ..., .... Notice that several miss the m...
Figure 7.2 Standard normal density with shaded area ....
Figure 7.3 Normal quantile plot for sampling distribution of ....
Figure 7.4 Density for standard normal and Student's t distributions with 4, ...
Figure 7.5 Normal quantile plot of weights of baby girls.
Figure 7.6 Quantile–quantile plot of ... statistics versus a ... distribution wi...
Figure 7.7 Comparison of the empirical cumulative distribution functions (ec...
Figure 7.8 Quantile–quantile plot of ... statistics versus a ... distribution wh...
Figure 7.9 Weights of boy and girl babies born in Texas in 2004.
Figure 7.10 Density for .... Shaded region represents an area of ....
Figure 7.11 Sampling distributions for a location parameter. (a) Sampling di...
Figure 7.12 Sampling distributions for a scale parameter. (a) Sampling distr...
Figure 7.13 (a) Plot of ... against .... (b) A normal quantile plot of the boots...
Figure 7.14 How often confidence intervals miss on each side. Confidence int...
Figure 7.15 Delta method. ...: sample means ... from 200 samples of size ... from ...
Chapter 8
Figure 8.1 Distribution of weights of babies born to nonsmoking and smoking ...
Figure 8.2 Critical region, critical value ..., and a test statistic ... for whi...
Figure 8.3 Distributions of the test statistic under the null and alternativ...
Figure 8.4 Distributions under the null and alternative hypotheses. Shaded r...
Figure 8.5 Plot of power against different alternatives ... for ... and ... when h...
Figure 8.6 Correspondence between hypothesis tests and confidence intervals....
Figure 8.7 Likelihood ratio for Cauchy, ..., ..., for a single observation. The ...
Figure 8.8 Google Mobile Ads Optimization, distribution of return on investm...
Figure 8.9 Google Mobile Ads Optimization, (a) Positive values on the ...‐axis...
Figure 8.10 Return on investment (ROI) is more variable when the size of the...
Chapter 9
Figure 9.1 Change in diameter against change in height of seedlings over a 5...
Figure 9.2 Two scatter plots of two numeric variables.
Figure 9.3 Change in diameter against change in height of seedlings over a 5...
Figure 9.4 Examples of correlation. (a) Correlation −0.9, (b) Correlation −0...
Figure 9.5 “Best fit” line.
Figure 9.6 (a) The relationship between the regression slope, ... and ... with p...
Figure 9.7 Heights of parents and children. The data are jittered – a small ...
Figure 9.8 The partitioning of the variability. The horizontal line is the m...
Figure 9.9 Residuals are the (signed) lengths of the line segments drawn fro...
Figure 9.10 Examples of residual plots. (a) A good straight line fit. (b) Th...
Figure 9.11 Residual plot for the Spruce data.
Figure 9.12 Residual plot for the Spruce data, with a scatter plot smooth in...
Figure 9.13 (a) Scatter plot of fat content against sugar content in ice cre...
Figure 9.14 Scores of the 24 finalists in the 2010 Olympics men's figure ska...
Figure 9.15 Linear regression conditions: each ... is normal with mean ... and c...
Figure 9.16 Plot of pointwise confidence and prediction intervals for the Ol...
Figure 9.17 Bootstrap distributions of (a) correlations and (b) slopes for s...
Figure 9.18 Distribution of bootstrapped free program scores when the short ...
Figure 9.19 Bushmeat data; (a) fish per capita and (b) biomass of 41 species...
Figure 9.20 Scatter plot of percent change in biomass against fish supply wi...
Figure 9.21 (a) Regression lines from 40 bootstrap samples of the bushmeat d...
Figure 9.22 Plot of a typical logistic curve, Equation (9.20).
Figure 9.23 (a) Plot of estimated probability of alcohol being involved agai...
Figure 9.24 (a) Histogram and (b) normal quantile plot for ..., the logistic r...
Figure 9.25 (a) Histogram and (b) normal quantile plot for probability of in...
Figure 9.26 Residuals plot for regression models.
Chapter 10
Figure 10.1 Null distribution for chi‐square statistic for death penalty opi...
Figure 10.2 Densities for the chi‐square distribution.
Chapter 11
Figure 11.1 Prior distribution and posterior distributions after three games...
Figure 11.2 Venn diagrams with box areas showing joint probabilities for ... a...
Figure 11.3 ... with different ...'s and ...'s.
Figure 11.4 Pdf for ..., resulting from a uniform prior and 110 successes in 2...
Figure 11.5 Pdfs for the prior ... and posterior ... for Bayesian B.
Figure 11.6 Comparison of pdfs for the priors of Bayesians A and B, as well ...
Figure 11.7 Prior ... and posterior ... distributions of .... The observed mean ...
Figure 11.8 Posterior distribution for artificial Google Analytics Content E...
Chapter 12
Figure 12.1 (a) Distribution of the birth weights of boys born in Illinois i...
Figure 12.2 Distribution of the resistivity measurements across the five ins...
Chapter 13
Figure 13.1 Ordinary bootstrap distributions for the median and mean, .... (a)...
Figure 13.2 Kernel density estimates. (a) A kernel density estimate using th...
Figure 13.3 Smoothed bootstrap distributions for the median and the mean, ....
Figure 13.4 Empirical cumulative distribution function of wind speeds (m/s),...
Figure 13.5 Bootstrap distributions for 10% quantile of wind speed. (a) The ...
Figure 13.6 Bootstrap distributions for fitted Weibull distribution estimate...
Figure 13.7 A/B Trial, linear relationships between outcome metric and covar...
Figure 13.8 Logistic relationships between outcome metric and covariate. The...
Figure 13.9 Prior and posterior densities for Pew survey example.
Figure 13.10 Importance sampling for the European option example, where ... is...
Figure 13.11 Importance sampling for the Twitter account example, ... and ..., w...
Figure 13.12 Weighted ecdf from importance sampling for the Twitter account ...
Figure 13.13 Weighted ecdf from importance sampling for the Twitter account ...
Appendix B
Figure B.1 Densities of gamma distribution.
Figure B.2 Densities for the chi‐square distribution.
Figure B.3 Densities for the ... distribution.
Figure B.4 Densities for the Beta distribution (..., ...).
Figure B.5 ... distribution with (..., ...) degrees of freedom.
Third Edition
Laura M. Chihara, Carleton College, Northfield, MN, US
Tim C. Hesterberg, Instacart, Seattle, WA, US
This third edition first published 2022
© 2022 John Wiley & Sons, Inc.
Edition History: 2e: 2018, Wiley; 1e: 2011, Wiley
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Laura M. Chihara and Tim C. Hesterberg to be identified as the authors of this work has been asserted in accordance with law.
Registered Office: John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
Editorial Office: John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of Warranty
In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging‐in‐Publication Data applied for
Hardback ISBN: 9781119874034
Cover image: Courtesy of Carleton College
Cover design by Wiley
The world seldom notices who teachers are; but civilization depends on what they do. – Lindley Stiles
To: Theodore S. Chihara
To: Bev Hesterberg
Mathematical Statistics with Resampling and R is a one‐term undergraduate statistics textbook aimed at sophomores or juniors who have taken a course in probability (at the level of, for instance, Ross (2009), Ghahramani (2004), or Scheaffer and Young (2010)) but may not have had any previous exposure to statistics.
What sets this book apart from other mathematical statistics texts is the use of modern resampling techniques – permutation tests and bootstrapping. We begin with permutation tests and bootstrap methods before introducing classical inference methods. Resampling helps students understand the meaning of sampling distributions, sampling variability, P‐values, hypothesis tests, and confidence intervals. We are inspired by the textbooks of Wardrop (1995) and Chance and Rossman (2005), two innovative introductory statistics books which also take a non‐traditional approach in the sequencing of topics.
We believe the time is ripe for this book. Many faculty have learned resampling and simulation‐based methods in graduate school and/or use them in their own work, and are eager to incorporate these ideas into a mathematical statistics course. Students and faculty today have access to computers that are powerful enough to perform resampling quickly.
A major topic of debate about the Mathematical Statistics course is how much theory to introduce. We want mathematically talented students to get excited about statistics, so we try to strike a balance between theory, computing, and applications. We feel that it is important to demonstrate some rigor in developing some of the statistical ideas presented here, but that mathematical theory should not dominate the text. To keep the size of the text reasonable, we omit some topics such as sufficiency and Fisher information (though we plan to make some omitted topics available as supplements on the text web page https://github.com/lchihara/MathStatsResamplingR).
We have compiled the definitions and theorems of the important probability distributions into an appendix (see Appendix B). Instructors who want to prove results on distributional theory can refer to that chapter. Instructors who wish to skip the theory can continue without interrupting the flow of the statistical discussion.
Incorporating resampling and bootstrapping methods requires that students use statistical software. We use R or RStudio because they are freely available (www.r-project.org or http://rstudio.com), powerful, flexible, and a valuable tool in future careers. One of us worked at Google where there was an explosion in the use of R, with more and more non‐statisticians learning R (the statisticians already know it). We realize that the learning curve for R is high, but believe that the time invested in mastering R is worth the effort. We have written some basic materials on R that are available on the website for this text. We recommend that instructors work through the introductory worksheet with the students on the first or second day of the term, in a computer lab if possible.
For the third edition, we decided to incorporate the packages in Hadley Wickham's tidyverse, including ggplot2. And though some R packages exist that implement some of the bootstrap and permutation algorithms that we teach, we felt that students understand and internalize the concepts better if they are required to write the code themselves. We do provide R scripts or R Markdown files with code on our website, and we may include alternate coding using some of the many R packages available.
Statistical computing is necessary in statistical practice and for people working with data in a wide variety of fields. There is an explosion of data – more and more data – and new computational methods are continuously being developed to handle this explosion. Statistics is an exciting field, dare we even say sexy?1
Third Edition: The issue of P‐values has generated much discussion in the statistics community (Wasserstein and Lazar, 2016; Wasserstein et al., 2019), and has even made it into the popular press (Resnick, 2019). In light of this controversy, one of the major changes we have made in this edition is to move away from the term “statistically significant” in favor of “statistically discernible.” We discuss this in Section 8.5 (Interpreting Test Results).
As noted above, we also updated the R code, using the ggplot2 package for plots instead of base R. We've added sections on cause and effect and on control variates, expanded the discussion of stratification (Chapter 13), moved the section on the delta method from Chapter 13 to Chapter 7, updated the General Social Survey data set, included more examples, and developed more exercises. Throughout the text, we have updated, clarified, or made small changes to the exposition.
Pathways: This textbook contains more than enough material for a one‐term undergraduate course. We have written the textbook in such a way that instructors can choose to skip or gloss over some sections if they wish to emphasize others. In some instances, we have labeled a section or subsection with an asterisk (*) to denote it as optional. For classes composed primarily of students who have no statistics background, a possible sequence includes Chapters 1–10. For courses focused more on applications, instructors could omit, for example, Sections 7.2 (Confidence Intervals Using Pivots) and 8.6 (Likelihood Ratio Tests). For classes in which students come in with an introductory statistics course background, instructors could have students read the first two chapters on their own, beginning the course at Chapter 3. In this case, instructors may wish to spend more time on theory, Bayesian methods, the delta method in Chapter 7, or the topics in Chapter 13, including the parametric bootstrap, stratified sampling, and importance sampling. These topics could also be assigned as final projects for an undergraduate course or a senior capstone thesis.
Acknowledgments: This textbook could not have been completed without the assistance of many colleagues and students. In particular, for the first edition, we would like to thank Professor Katherine St. Clair of Carleton College who bravely class‐tested an early (very!) rough draft in her Introduction to Statistical Inference class during Winter 2010. In addition, Professor Julie Legler of St. Olaf College adopted the manuscript in her Statistical Theory class for Fall 2010. Both instructors and students provided valuable feedback that improved the exposition and content of this textbook. For the second edition, we would like to thank Professors Elaine Newman (Sonoma State University), Eric Nordmoe (Kalamazoo College), Nick Horton (Amherst College), and Carleton College faculty Katie St. Clair, Andy Poppick and Adam Loy for their helpful comments. We thank Ed Lee of Google for the Mobile Ads data and explanation. For the third edition, we would like to thank Jeff Witmer (Oberlin) and Miles Ott (Johnson & Johnson) for insightful comments; Ruhan Zhang of Instacart for the ad evaluation example in Exercise 13.14; and Rachel Zhang, Nick Cooley, and Jeff Moulton of Instacart for the human evaluation framework Exercise 13.15. Finally, we thank Professor Albert Y. Kim (Smith College) who compiled the data sets into an R package (resampledata).
We would also like to thank Siyuan (Ernest) Liu and Chen (Daisy) Sun, two Carleton College students, for solving many of the exercises in the first edition and writing up the solutions.
Finally, we are also grateful for the valuable assistance provided by the staff at Wiley, including Skyler Van Valkenburgh, Judit Anbu Hena, Isabella Proietti, and Kalli Schutea. For the first and second editions, we were helped by Jon Gurstelle, Amudhapriya Sivamurthy, Kshitija Iyer, Vishnu Narayanan, Kathleen Pagliaro, Steve Quigley, Sanchari Sill, Dean Gonzalez, and Jackie Palmieri.
Additional Resources: We will place additional materials, including R scripts, data sets, tutorials, and errata, at our GitHub site https://github.com/lchihara/MathStatsResamplingR.
Northfield MN
Laura M. Chihara
Seattle WA
Tim C. Hesterberg
April 2022
1. Try googling “statistics sexy profession.”
Statistics is the art and science of collecting and analyzing data and understanding the nature of variability. Mathematics, especially probability, governs the underlying theory, but statistics is driven by applications to real problems.
In this chapter, we introduce several data sets that we will encounter throughout the text in the examples and exercises. These data sets are available in the R package resampledata3 or at the textbook website https://github.com/lchihara/MathStatsResamplingR.
If you have ever traveled by air, you probably have experienced the frustration of flight delays. The Bureau of Transportation Statistics maintains data on all aspects of air travel, including flight delays at departure and arrival.1
LaGuardia Airport (LGA) is one of three major airports that serves the New York City metropolitan area. In 2008, over 23 million passengers and over 375 000 planes flew in or out of LGA. United Airlines and American Airlines are two major airlines that schedule services at LGA. The data set FlightDelays contains information on all 4029 departures of these two airlines from LGA during May and June 2009 (Tables 1.1 and 1.2).
Table 1.1 Partial view of FlightDelays data.
Flight  Carrier  FlightNo  Destination  DepartTime  Day
1       UA       403       DEN          4–8 a.m.    Friday
2       UA       405       DEN          8–noon      Friday
3       UA       409       DEN          4–8 p.m.    Friday
4       UA       511       ORD          8–noon      Friday
Table 1.2 Variables in data set FlightDelays.
Variable      Description
Carrier       UA = United Airlines, AA = American Airlines
FlightNo      Flight number
Destination   Airport code
DepartTime    Scheduled departure time in 4 h intervals
Day           Day of week
Month         May or June
Delay         Minutes flight delayed (negative indicates early departure)
Delayed30     Departure delayed more than 30 min?
FlightLength  Length of time of flight (minutes)
Each row of the data set is an observation. Each column represents a variable – some characteristic that is obtained for each observation. For instance, on the first observation listed, the flight was a United Airlines plane, flight number 403, destined for Denver, and departing on Friday between 4 and 8 a.m. This data set consists of 4029 observations and 9 variables.
Questions we might ask include the following: Are flight delay times different between the two airlines? Are flight delay times different depending on the day of the week? Are flights scheduled in the morning less likely to be delayed by more than 15 min?
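A quick way to get acquainted with these data in R is sketched below, assuming the resampledata3 package named above is installed; the variable names are those in Table 1.2.

# Minimal sketch: first look at the FlightDelays data
library(resampledata3)                      # companion data package named in the text

str(FlightDelays)                           # 4029 observations, 9 variables
table(FlightDelays$Carrier)                 # number of flights per carrier
tapply(FlightDelays$Delay, FlightDelays$Carrier, mean)   # average delay (min) by carrier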
The birth weight of a baby is of interest to health officials since many studies have shown possible links between this weight and conditions in later life, such as obesity or diabetes. Researchers look for possible relationships between the birth weight of a baby and the age of the mother or whether or not she smoked cigarettes or drank alcohol during her pregnancy. The Centers for Disease Control and Prevention (CDC) maintains a database on all babies born in a given year,2 incorporating data provided by the US Department of Health and Human Services, the National Center for Health Statistics, and the Division of Vital Statistics. We will investigate different samples taken from the CDC's database of births.
One data set that we will investigate consists of a random sample of 1009 babies born in North Carolina during 2004 (Table 1.3). The babies in the sample had a gestation period of at least 37 weeks and were single births (i.e. not a twin or triplet).
Table 1.3 Variables in data set NCBirths2004.
Variable    Description
MothersAge  Mother's age
Smoker      Mother smoker or non‐smoker
Gender      Gender of baby
Weight      Weight at birth (grams)
Gestation   Gestation time (weeks)
In addition, we will also investigate a data set, Girls2004, consisting of a random sample of 40 baby girls born in Alaska and 40 baby girls born in Wyoming. These babies also had a gestation period of at least 37 weeks and were single births.
The data set TXBirths2004 contains a random sample of 1587 babies born in Texas in 2004. In this case, the sample was not restricted to single births, nor to a gestation period of at least 37 weeks. The numeric variable Number indicates whether the baby was a single birth, or one of a twin, triplet, and so on. The variable Multiple is a factor variable indicating whether or not the baby was a multiple birth.
Verizon is the primary local telephone company (incumbent local exchange carrier (ILEC)) for a large area of the Eastern United States. As such, it is responsible for providing repair service for the customers of other telephone companies known as competing local exchange carriers (CLECs) in this region. Verizon is subject to fines if the repair times (the time it takes to fix a problem) for CLEC customers are substantially worse than those for Verizon customers.
The data set Verizon contains a sample of repair times for 1664 ILEC and 23 CLEC customers (Table 1.4). The mean repair times are 8.4 h for ILEC customers and 16.5 h for CLEC customers. Could a difference this large be easily explained by chance?
Table 1.4 Variables in data set Verizon.
Variable  Description
Time      Repair times (in hours)
Group     ILEC or CLEC
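The group means quoted above can be reproduced with a couple of lines of R, assuming resampledata3 is loaded as in the earlier sketch.

# Mean repair time (hours) for ILEC and CLEC customers; the text reports
# roughly 8.4 and 16.5 hours, respectively
tapply(Verizon$Time, Verizon$Group, mean)
table(Verizon$Group)                        # 1664 ILEC and 23 CLEC customers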
When a person is released from prison, will he or she relapse into criminal behavior and be sent back? The state of Iowa tracks offenders over a 3‐year period, and records the number of days until recidivism for those who are readmitted to prison. The Department of Corrections uses this recidivism data to determine whether or not their strategies for preventing offenders from relapsing into criminal behavior are effective.
The data set Recidivism contains all offenders convicted of either a misdemeanor or felony who were released from an Iowa prison during the 2010 fiscal year (ending in June) (Table 1.5). There were 17 022 people released in that period, of whom 5386 were sent back to prison in the following 3 years (through the end of the 2013 fiscal year).3
The recidivism rate for those under the age of 25 years was 36.5% compared to 30.6% for those 25 years or older. Does this indicate a real difference in the behavior of those in these age groups, or could this be explained by chance variability?
Table 1.5 Variables in data set Iowa Recidivism.
Variable  Description
Gender    F, M
Age       Age at release: Under 25, 25–34, 35–44, 45–54, 55 and Older
Age25     Under 25, Over 25 (binary)
Offense   Original conviction: Felony or Misdemeanor
Recid     Recidivate? No, Yes
Type      New (crime), No Recidivism, Tech (technical violation, such as a parole violation)
Days      Number of days to recidivism; NA if no recidivism
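The two recidivism rates quoted above can be computed with a cross‐tabulation, assuming resampledata3 is loaded; the level names follow Table 1.5.

# Proportion recidivating within each age group; the text reports 36.5%
# (under 25) versus 30.6% (25 and older)
prop.table(table(Recidivism$Age25, Recidivism$Recid), margin = 1)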
In analyzing data, we need to determine whether the data represent a population or a sample. A population represents all the individual cases, whether they are babies, fish, cars, or coin flips. The data from the flight delays case study in Section 1.1 are all the flight departures of United Airlines and American Airlines out of LGA in May and June 2009; thus, this data set represents the population of all such flights. On the other hand, the North Carolina data set contains only a subset of 1009 births from over 100 000 births in North Carolina in 2004. In this case, we will want to know how representative statistics computed from this sample are for the entire population of North Carolina babies born in 2004.
Populations may be finite, such as births in 2004, or infinite, such as coin flips or births next year.
Throughout this book, we will talk about drawing random samples from a population. We will use capital letters (e.g. X, Y, Z, and so on) to denote random variables and lower‐case letters (e.g. x, y, z, and so on) to denote actual values or data.
There are many kinds of random samples. Strictly speaking, a “random sample” is any sample obtained using a random procedure. However, we use random sample to mean a sample of independent and identically distributed (i.i.d.) observations from the population, if the population is infinite.
For instance, suppose you toss a fair coin 20 times and consider each head a “success.” Then your sample consists of the random variables X1, X2, …, X20, each a Bernoulli random variable with success probability 1/2. In symbols, Xi ∼ Bernoulli(1/2) for i = 1, 2, …, 20.
If the population of interest is finite, with N elements, we can choose a random sample as follows: label N balls with the numbers 1, 2, …, N and place them in an urn. Draw a ball at random, record its value, and then replace the ball. Draw another ball at random, record its value, and replace. Continue until you have a sample of size n. This is sampling with replacement. For instance, with a sample of size n = 2, there are N² different samples of size 2 (where order matters). (Note: By “order matters” we do not imply that order matters in practice; rather, we mean that we keep track of the order of the elements when enumerating samples. For instance, drawing ball a and then ball b is counted as a different sample from drawing ball b and then ball a.)
However, in most real situations, for example, in conducting surveys, we do not want to have the same person polled twice. So we would sample without replacement, in which case we will not have independence. For instance, if you wish to draw a sample of size n from a population of N people, then the probability of any one person being selected first is 1/N. However, after having chosen that first person, the probability of any one of the remaining N − 1 people being chosen is now 1/(N − 1).
In cases where populations are very large compared to the sample size, calculations under sampling without replacement are reasonably approximated by calculations under sampling with replacement.
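Both sampling schemes are available in R through the sample() function; here is a minimal sketch with a toy population of 10 labels (the numbers are illustrative, not from the text).

population <- 1:10                             # toy finite population of N = 10 labels
sample(population, size = 5, replace = TRUE)   # with replacement: labels may repeat
sample(population, size = 5, replace = FALSE)  # without replacement: labels are distinct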
Example 1.1 Consider a population of 1000 people, 350 of whom are smokers, and the rest are nonsmokers. If you select 10 people at random but with replacement, then the probability that exactly 4 are smokers is (10 choose 4)(0.35)^4(0.65)^6 ≈ 0.238. If you select without replacement, then the probability is (350 choose 4)(650 choose 6)/(1000 choose 10), which is nearly the same.
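As a check on Example 1.1, the two probabilities correspond to the binomial and hypergeometric distributions and can be computed with base R; the numbers 1000, 350, and 10 come from the example.

dbinom(4, size = 10, prob = 0.35)           # with replacement: about 0.238
dhyper(4, m = 350, n = 650, k = 10)         # without replacement: nearly the same value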
When discussing numeric information, we will want to distinguish between populations and samples.
Definition 1.1 A parameter is a (numerical) characteristic of a population or of a probability distribution.
A statistic is a (numerical) characteristic of data.
Any function of a parameter is also a parameter; any function of a statistic is also a statistic. When the statistic is computed from a random sample, it is itself random, and hence is a random variable.
Example 1.2 μ and σ are parameters of the normal distribution with pdf f(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²)), −∞ < x < ∞.
The variance σ² and signal‐to‐noise ratio μ/σ are also parameters.
Example 1.3 If X1, X2, …, Xn are a random sample, then the sample mean (X1 + X2 + … + Xn)/n is a statistic.
Example 1.4 Consider the population of all babies born in the United States in 2017. Let μ denote the average weight of all these babies. Then μ is a parameter. The average weight of a sample of 2500 babies born in that year is a statistic.
Example 1.5 If we consider the population of all adults in the United States today, the proportion who approve of the president's job performance is a parameter. The fraction who approve in any given sample is a statistic.
Example 1.6 The average weight of 1009 babies in the North Carolina case study in Section 1.2 is 3448.26 g. This average is a statistic.
Example 1.7 If we survey 1000 adults and find that a proportion p̂ of them intend to vote in the next presidential election, then p̂ is a statistic – it estimates the parameter p, the proportion of all adults who intend to vote in the next election.
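A short sketch of computing statistics like those in Examples 1.6 and 1.7, assuming resampledata3 is loaded as before:

mean(NCBirths2004$Weight)                         # sample mean birth weight; the text reports 3448.26 g
table(NCBirths2004$Smoker) / nrow(NCBirths2004)   # sample proportions are statistics too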
The General Social Survey (GSS) is a major survey that has tracked American demographics, characteristics, and views on social and cultural issues since the 1970s. It is conducted by the National Opinion Research Center (NORC) at the University of Chicago. Trained interviewers meet face to face with the adults chosen for the survey and question them for about 90 min in their homes.
The GSS case study includes the responses of 2348 participants selected in 2018 to a subset of the questions asked, listed in Table 1.6 (Smith et al., 2014). For example, one of the questions (Courts) asked whether the respondent thinks that the courts in their area deal too harshly or not harshly enough with criminals.
Table 1.6 Variables in data set GSS2018.
Variable      Description
Region        Interview location
GenderNow     Current gender
Age           Age of respondent (Note: 89 = 89 or older)
Marital       Marital status
Degree        Highest level of education
Employed      Respondent employed? (Yes = full/part‐time; No = unemployed, retired, in school, etc.)
Income        Respondent's income
Polviews      Respondent's political views
Pres16        Whom did you vote for in the 2016 presidential election?
DeathPenalty  Death penalty for murder?
Courts        Courts deal harshly with criminals?
Attend        Attendance at religious services
Postlife      Believe in life after death?
Happy         General happiness
Satfin        Satisfied with financial situation?
Energy        Amount of spending on alternative energy
One of the core variables that has been included in all GSS surveys since its inception is Sex with the values (Female/Male) coded by the interviewer. Starting in 2018, GSS added a new question, “What is your current gender?”, to reflect recent research on the nature of gender identity which indicates that gender is not binary.4 Definitions of terms and concepts are always changing over time so it is important to consider the context when analyzing data.
We will analyze the GSS data to investigate questions such as the following: Is there a relationship between a person's marital status and whom they voted for in the 2016 presidential election? Are people who live in certain regions happier? Are there educational differences in support for the death penalty? These data can be obtained with the GSS Data Explorer.5
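Cross‐tabulations of the relevant variables give a first look at these questions, assuming resampledata3 is loaded; the variable names follow Table 1.6.

table(GSS2018$Degree, GSS2018$DeathPenalty)   # education versus death penalty opinion
table(GSS2018$Marital, GSS2018$Pres16)        # marital status versus 2016 vote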
“Who do you plan to vote for in the next presidential election?” “Would you purchase our product again in the future?” “Do you smoke cigarettes? If yes, how old were you when you first started?” Questions such as these are typical of sample surveys. Researchers want to know something about a population of individuals, whether they are registered voters, online shoppers, or American teenagers, but to poll every individual in the population – that is, to take a census – is impractical and costly. Thus, researchers will settle for a sample from the target population. But if a certain percentage of those in your sample of 1000 adults intend to vote for candidate Wong in the next election, how close is this sample percentage to the actual percentage who will vote for Wong? How can we be sure that this sample is truly representative of the population of all voters? We will learn techniques for statistical inference, drawing a conclusion about a population based on information about a sample.
When conducting a survey, researchers will start with a sampling frame – a list from which the researchers will choose their sample. For example, to survey all students at a college, the campus directory listing could be a sampling frame. For pre‐election surveys, many polling organizations use a sampling frame of registered voters. Note that the choice of sampling frame could introduce the problem of undercoverage: omitting people from the target population in the survey. For instance, young people were missed in many pre‐election surveys during the 2008 Obama–McCain presidential race because they had not yet registered to vote.
Once the researchers have a sampling frame, they will then draw a random sample from this frame. Researchers will use some type of probability (scientific) sampling scheme, that is, a scheme that gives everybody in the population a positive chance of being selected. For example, to obtain a sample of size 10 from a population of 100 individuals, write each person's name on a slip of paper, put the slips of paper into a basket, and then draw out 10 slips of paper. Nowadays, statistical software is used to draw random samples from a sampling frame.
Another basic survey design uses stratified sampling: The population is divided into nonoverlapping strata, and then random samples are drawn from each stratum. The idea is to group individuals who are similar in some characteristic into homogeneous groups, thus reducing variability. For instance, in a survey of university students, a researcher might divide the students by class: first year, sophomores, juniors, seniors, and graduate students. A market analyst for an electronics store might choose to stratify customers based on income levels.
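One way to draw a stratified sample in R uses the tidyverse functions adopted elsewhere in the text; the sampling frame and the stratum size below are hypothetical, chosen only to illustrate the idea.

library(dplyr)

# Hypothetical sampling frame: 2000 students with a class-year stratum
students <- tibble(
  id = 1:2000,
  class_year = sample(c("first year", "sophomore", "junior", "senior"),
                      2000, replace = TRUE)
)

# Stratified random sample: 25 students drawn at random from each class year
stratified_sample <- students %>%
  group_by(class_year) %>%
  slice_sample(n = 25) %>%
  ungroup()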
In cluster sampling, the population is divided into nonoverlapping clusters, and then a random sample of clusters is drawn. Every person in a chosen cluster is then interviewed for the survey. An airport wanting to conduct a customer satisfaction survey might use a sampling frame of all flights scheduled to depart from the airport on a certain day. A random sample of flights (clusters) is chosen, and then all passengers on these flights are surveyed. A modification of this design might involve sampling in stages: For instance, the analysts might first choose a random sample of flights, and then from each flight choose a random sample of passengers.
The GSS uses a more complex sampling scheme in which the sampling frame is a list of counties and county equivalents (standard metropolitan statistical areas) in the United States. These counties are stratified by region, age, and race. Once a sample of counties is obtained, a sample of block groups and enumeration districts is selected, stratifying these by race and income. The next stage is to randomly select blocks and then interview a specific number of men and women who live within these blocks.
Indeed, all major polling organizations, such as Gallup or Roper, as well as the GSS, use a multistage sampling design. In this book, we use the GSS data or polling results for examples as if the survey design used simple random sampling. Calculations for more complex sampling schemes are beyond the scope of this book, and we refer the interested reader to Lohr (1991) for details.
Carleton student Nicki Catchpole conducted a study of hot wings and beer consumption at the Williams Bar in the Uptown area of Minneapolis (N. Catchpole, private communication). She asked patrons at the bar to record their consumption of hot wings and beer over the course of several hours. She wanted to know if people who ate more hot wings would then drink more beer. In addition, she asked each person their gender to investigate whether or not gender had an impact on hot wings or beer consumption.
Table 1.7 Variables in data set Beerwings.
Variable  Description
Gender    Male or female
Beer      Ounces of beer consumed
Hotwings  Number of hot wings eaten
The data for this study are in Beerwings (Table 1.7). There are 30 observations and 3 variables.
Black spruce (Picea mariana) is a species of a slow‐growing coniferous tree found across the northern part of North America. It is commonly found on wet organic soils. In a study conducted in the 1990s, a biologist interested in factors affecting the growth of the black spruce planted its seedlings on sites located in boreal peatlands in northern Manitoba, Canada (Camill et al., 2010).
The data set Spruce contains a part of the data from the study (Table 1.8). Seventy‐two black spruce seedlings were planted in four plots under varying conditions (fertilizer–no fertilizer, competition–no competition), and their heights and diameters were measured over the course of 5 years.
Table 1.8 Variables in data set Spruce.
Variable     Description
Tree         Tree number
Competition  C (competition), CR (competition removed)
Fertilizer   F (fertilized), NF (not fertilized)
Height0      Height (cm) of seedling at planting
Height5      Height (cm) of seedling at year 5
Diameter0    Diameter (cm) of seedling at planting
Diameter5    Diameter (cm) of seedling at year 5
Ht.change    Change (cm) in height
Di.change    Change (cm) in diameter
The researcher wanted to see whether the addition of fertilizer or the removal of competition from other plants (by weeding) affected the growth of these seedlings.
Researchers carry out studies to understand the conditions and causes of certain outcomes: Does smoking cause lung cancer? Do teenagers who smoke marijuana tend to move on to harder drugs? Do males eat more hot wings than females? Do black spruce seedlings grow taller in fertilized plots?
The beer and hot wings case study in Section 1.9 is an example of an observational study, a study in which researchers observe participants but do not influence the outcome. In this case, the student just recorded the gender and the number of hot wings eaten and beer consumed as reported by these patrons in the Williams bar.
Example 1.8 The first Nurses' Health Study is a major observational study funded by the National Institutes of Health. Over 120 000 registered female nurses who were married, between the ages of 33 and 55 years, and lived in the 11 most populous states (all in 1976) have been responding every 2 years to written questions about their health and lifestyle, including smoking habits, hormone use, and menopause status. Many results on women's health have come out of this study, such as finding an association between taking estrogen after menopause and lowering the risk of heart disease, and determining that for nonsmokers there is no link between taking birth control pills and developing heart disease.
Because this is an observational study, no cause and effect conclusions can be drawn. For instance, we cannot state that taking estrogen after menopause will cause a lowering of the risk for heart disease. In an observational study, there may be many unrecorded or hidden factors that impact the outcomes. Also, because the participants in this study are registered nurses, we need to be careful about making inferences about the general female population. Nurses are more educated and more aware of health issues than the average person.
On the other hand, the black spruce case study in Section 1.10 was an experiment. In an experiment, researchers will manipulate the environment in some way to observe the response of the objects of interest (people, mice, ball bearings, etc.). When the objects of interest in an experiment are people, we refer to them as subjects; otherwise, we call them experimental units. In this case, the biologist randomly assigned the experimental units – the seedlings – to plots subject to four treatments: fertilization with competition, fertilization without competition, no fertilization with competition, and no fertilization with no competition. He then recorded their height over a period of several years.
A key feature in this experiment was the random assignment of the seedlings to the treatments. The idea is to spread out the effects of unknown or uncontrollable factors that might introduce unwanted variability into the results. For instance, if the biologist had planted all the seedlings obtained from one particular nursery in the fertilized, no competition plot and subsequently recorded that these seedlings grew the least, then he would not be able to discern whether this was due to this particular treatment or due to some possible problem with seedlings from this nursery. With random assignment of treatments, the seedlings from this particular nursery would usually be spread out over the four treatments. Thus, the differences between the treatment groups should be due to the treatments (or chance).
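Random assignment of this kind is easy to carry out with sample(); the sketch below assumes equal group sizes of 18 seedlings per treatment, which the text does not state explicitly.

treatments <- c("F, C", "F, CR", "NF, C", "NF, CR")   # labels paraphrase the four treatments
assignment <- sample(rep(treatments, each = 18))      # random permutation of 72 assignments
table(assignment)                                     # 18 seedlings in each treatment group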
Example 1.9 Knee osteoarthritis (OA
