Statistics with JMP: Hypothesis Tests, ANOVA and Regression
Peter Goos, University of Leuven and University of Antwerp, Belgium
David Meintrup, University of Applied Sciences Ingolstadt, Germany
A first course on basic statistical methodology using JMP
This book provides a first course on parameter estimation (point estimates and confidence interval estimates), hypothesis testing, ANOVA and simple linear regression. The authors' approach combines mathematical depth with numerous examples and demonstrations using the JMP software.
Key features:
* Provides a comprehensive and rigorous presentation of introductory statistics that has been extensively classroom tested.
* Pays attention to the usual parametric hypothesis tests as well as to non-parametric tests (including the calculation of exact p-values).
* Discusses the power of various statistical tests, along with examples in JMP to enable insight into this difficult topic.
* Promotes the use of graphs and confidence intervals in addition to p-values.
* Course materials and tutorials for teaching are available on the book's companion website.
Master's and advanced students in applied statistics, industrial engineering, business engineering, civil engineering and bio-science engineering will find this book beneficial. It also provides a useful resource for teachers of statistics, particularly in the area of engineering.
Page count: 809
Year of publication: 2016
Peter Goos
University of Leuven and University of Antwerp, Belgium
David Meintrup
University of Applied Sciences Ingolstadt, Germany
This edition first published 2016 © 2016 John Wiley & Sons, Ltd
Registered office: John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.
The right of the authors to be identified as the authors of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and authors have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the authors shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloging-in-Publication Data
Names: Goos, Peter. | Meintrup, David.
Title: Statistics with JMP : hypothesis tests, ANOVA, and regression / Peter Goos, David Meintrup.
Description: Chichester, West Sussex : John Wiley & Sons, Inc., 2016. | Includes index.
Identifiers: LCCN 2015039990 (print) | LCCN 2015047679 (ebook) | ISBN 9781119097150 (cloth) | ISBN 9781119097044 (Adobe PDF) | ISBN 9781119097167 (ePub)
Subjects: LCSH: Probabilities--Data processing. | Mathematical statistics--Data processing. | Regression analysis. | JMP (Computer file)
Classification: LCC QA273.19.E4 G68 2016 (print) | LCC QA273.19.E4 (ebook) | DDC 519.50285/53--dc23
LC record available at http://lccn.loc.gov/2015039990
A catalogue record for this book is available from the British Library.
To Marijke, Bas, Loes, and Mien
To Béatrice and Werner
Preface
Acknowledgments
Part One Estimators and Tests
1 Estimating Population Parameters
1.1 Introduction: Estimators Versus Estimates
1.2 Estimating a Mean Value
1.3 Criteria for Estimators
1.4 Methods for the Calculation of Estimators
1.5 The Sample Mean
1.6 The Sample Proportion
1.7 The Sample Variance
1.8 The Sample Standard Deviation
1.9 Applications
Notes
2 Interval Estimators
2.1 Point and Interval Estimators
2.2 Confidence Intervals for a Population Mean with Known Variance
2.3 Confidence Intervals for a Population Mean with Unknown Variance
2.4 Confidence Intervals for a Population Proportion
2.5 Confidence Intervals for a Population Variance
2.6 More Confidence Intervals in JMP
2.7 Determining the Sample Size
Notes
3 Hypothesis Tests
3.1 Key Concepts
3.2 Testing Hypotheses About a Population Mean
3.3 The Probability of a Type II Error and the Power
3.4 Determination of the Sample Size
3.5 JMP
3.6 Some Important Notes Concerning Hypothesis Testing
Notes
Part Two One Population
4 Hypothesis Tests for a Population Mean, Proportion, or Variance
4.1 Hypothesis Tests for One Population Mean
4.2 Hypothesis Tests for a Population Proportion
4.3 Hypothesis Tests for a Population Variance
4.4 The Probability of a Type II Error and the Power
Notes
5 Two Hypothesis Tests for the Median of a Population
5.1 The Sign Test
5.2 The Wilcoxon Signed-Rank Test
Notes
6 Hypothesis Tests for the Distribution of a Population
6.1 Testing Probability Distributions
6.2 Testing Probability Densities
6.3 Discussion
Notes
Part Three Two Populations
7 Independent Versus Paired Samples
8 Hypothesis Tests for the Means, Proportions, or Variances of Two Independent Samples
8.1 Tests for Two Population Means for Independent Samples
8.2 A Hypothesis Test for Two Population Proportions
8.3 A Hypothesis Test for Two Population Variances
8.4 Hypothesis Tests for Two Independent Samples in JMP
Notes
9 A Nonparametric Hypothesis Test for the Medians of Two Independent Samples
9.1 The Hypotheses Tested
9.2 Exact p-Values in the Absence of Ties
9.3 Exact p-Values in the Presence of Ties
9.4 Approximate p-Values
Notes
10 Hypothesis Tests for the Means of Two Paired Samples
10.1 The Hypotheses Tested
10.2 The Procedure
10.3 Examples
10.4 The Technical Background
10.5 Generalized Hypothesis Tests
10.6 A Confidence Interval for a Difference of Two Population Means
Notes
11 Two Nonparametric Hypothesis Tests for Paired Samples
11.1 The Sign Test
11.2 The Wilcoxon Signed-Rank Test
11.3 Contradictory Results
Notes
Part Four More Than Two Populations
12 Hypothesis Tests for More Than Two Population Means: One-Way Analysis of Variance
12.1 One-Way Analysis of Variance
12.2 The Test
12.3 One-Way Analysis of Variance in JMP
12.4 Pairwise Comparisons
12.5 The Relation Between a One-Way Analysis of Variance and a t-Test for Two Population Means
12.6 Power
12.7 Analysis of Variance for Nonnormal Distributions and Unequal Variances
Notes
13 Nonparametric Alternatives to an Analysis of Variance
13.1 The Kruskal–Wallis Test
13.2 The van der Waerden Test
13.3 The Median Test
13.4 JMP
Notes
14 Hypothesis Tests for More Than Two Population Variances
14.1 Bartlett’s Test
14.2 Levene’s Test
14.3 The Brown–Forsythe Test
14.4 O’Brien’s Test
14.5 JMP
14.6 The Welch Test
Notes
Part Five Additional Useful Tests and Procedures
15 The Design of Experiments and Data Collection
15.1 Equal Costs for All Observations
15.2 Unequal Costs for the Observations
16 Testing Equivalence
16.1 Shortcomings of Classical Hypothesis Tests
16.2 The Principles of Equivalence Tests
16.3 An Equivalence Test for Two Population Means
17 The Estimation and Testing of Correlation and Association
17.1 The Pearson Correlation Coefficient
17.2 Spearman’s Rank Correlation Coefficient
17.3 A Test for the Independence of Two Qualitative Variables
18 An Introduction to Regression Modeling
18.1 From a Theory to a Model
18.2 A Statistical Model
18.3 Causality
18.4 Linear and Nonlinear Regression Models
19 Simple Linear Regression
19.1 The Simple Linear Regression Model
19.2 Estimation of the Model
19.3 The Properties of Least Squares Estimators
19.4 The Estimation of σ²
19.5 Statistical Inference for β₀ and β₁
19.6 The Quality of the Simple Linear Regression Model
19.7 Predictions
19.8 Regression Diagnostics
Notes
Appendix A The Binomial Distribution
Appendix B The Standard Normal Distribution
Appendix C The χ²-Distribution
Appendix D Student’s t-Distribution
Appendix E The Wilcoxon Signed-Rank Test
Appendix F The Shapiro–Wilk Test
Appendix G Fisher’s F-Distribution
Appendix H The Wilcoxon Rank-Sum Test
Appendix I The Studentized Range or Q-Distribution
Appendix J The Two-Tailed Dunnett Test
Appendix K The One-Tailed Dunnett Test
Appendix L The Kruskal–Wallis Test
Appendix M The Rank Correlation Test
Index
EULA
Preface
This book is the result of a thorough revision of the lecture notes for the course “Statistics for Business and Economics 2” that were developed by Peter Goos at the Faculty of Applied Economics of the University of Antwerp in Belgium. Encouraged by the success of the Dutch version of this book (entitled Verklarende Statistiek: Schatten en Toetsen, published in 2014 by Acco Leuven/Den Haag), we joined forces to create an English version. The new book builds on our first joint work, Statistics with JMP: Graphs, Descriptive Statistics and Probability (Wiley, 2015), which adopts the same philosophy and uses the same software package, JMP. Hence, it can be regarded as a sequel, but it can also be read as a stand-alone book.
In this book, we give a detailed introduction to point estimators, interval estimators, hypothesis tests, analysis of variance, and simple linear regression. Compared with other introductory textbooks on inferential statistics, we cover several additional topics. For example, considerable attention is paid to nonparametric tests, such as the sign test, the signed-rank test, and the Kruskal–Wallis test. In addition, we discuss tests and confidence intervals for the Pearson and Spearman correlation coefficients, introduce the concept of equivalence tests, and include a chapter on the principles of optimal design of experiments. For nonparametric tests, exact p-values are discussed in detail, alongside the better-known approximate p-values. Throughout the book, we discuss different versions of tests and confidence intervals, including alternative constructions of confidence intervals for a proportion and approximate p-values for certain tests. The book also incorporates recent insights from the literature, such as a more detailed table with critical values for the Shapiro–Wilk test and a recent table with critical values for the Kruskal–Wallis test.
As in our first book, we pay equal attention to mathematical aspects, the interpretation of all the statistical concepts that are introduced, and their practical application. In order to facilitate the understanding of the methods and to appreciate their usefulness, the book contains many examples involving real-life data. To demonstrate the broad applicability of statistics and probability, these examples have been taken from various fields of application, including business, economics, sports, engineering, and the natural sciences.
Our motivation in writing this book was twofold. First, we wanted to provide students and teachers with a resource that goes beyond other textbooks of similar scope in its technical and mathematical content. It has become increasingly fashionable for authors and statistics teachers to sweep technicalities and mathematical derivations under the carpet. We decided against this, because we feel that students should be encouraged to apply their mathematical knowledge, and that doing so deepens their understanding of statistical methods. Reading this book obviously requires some knowledge of mathematics. In most countries, students are taught mathematics in secondary or high school and the required mathematical concepts are revisited in introductory mathematics courses at university. Therefore, we are convinced that many university students have a sufficiently strong mathematical background to appreciate and benefit from the more thorough nature of this book. In the various derivations, we have tried to include all the intermediate steps, in order to keep the book readable.
Our second motivation was to ensure that the concepts introduced in the book can be successfully put into practice. To this end, we show how to generate estimates, carry out hypothesis tests, and perform regression analyses using the statistical software package JMP (pronounced “jump”). We chose JMP as supporting software because it is powerful yet easy to use, and suitable for a wide range of statistically oriented courses (including descriptive statistics, hypothesis testing, regression, analysis of variance, design of experiments, reliability, multivariate methods, and statistical and predictive modeling). We believe that introductory courses in statistics should use such software wherever possible. Indeed, we find that, because of the way in which students can easily interact with JMP, it can actually spark enthusiasm for statistics in class. The probability that a student will use statistics in his or her future professional career is far greater if the statistics classes were more pleasurable than painful.
In summary, our approach to teaching statistics combines theoretical and mathematical depth, detailed and clear explanations, numerous practical examples, and the use of a user-friendly and yet very powerful statistical package.
As mentioned, we use JMP as enabling software. With the purchase of a hard copy of this book, you receive a one-year license for JMP’s Student Edition. The license period starts when you activate your copy of the software using the code included with this book (to receive an access code, please visit www.wiley.com/go/statsjmp). To download JMP’s Student Edition, visit http://www.jmp.com/wiley. For students accessing a digital version of the book, your lecturer may contact Wiley in order to procure unique codes with which to download the free software. For more information about JMP, go to http://www.jmp.com. JMP is available for the Windows and Mac operating systems. This book is based on JMP version 12 for Windows.
In our examples, we do not assume any familiarity with JMP: the step-by-step instructions are detailed and accompanied by screenshots. For more explanations and descriptions, www.jmp.com offers a substantial amount of free material, including many video demonstrations. In addition, there is a JMP Academic User Community where you can access content, discuss questions, and collaborate with other JMP users worldwide: instructors can share teaching resources and best practices, students can ask questions, and everyone can access the latest resources provided by the JMP Academic Team. To join the community, go to http://community.jmp.com/academic.
Throughout the book, various data sets are used. We strongly encourage everyone who wants to learn statistics to actively try things out using data. JMP files containing the data sets as well as JMP scripts to reproduce figures, tables, and analyses can be downloaded from the publisher’s companion web site to this book:
www.wiley.com/go/goosandmeintrup/JMP
There, we also provide some additional supporting files.
Peter Goos David Meintrup
Acknowledgments
We consulted plenty of sources during the preparation of this book, and we would like to acknowledge at least the most important ones. A source on nonparametric techniques that we found extremely valuable, given its depth and comprehensive explanations, is the book Nonparametric Statistical Methods by M. Hollander, D.A. Wolfe, and E. Chicken. A more general book that we found very helpful is the Handbook of Parametric and Nonparametric Statistical Procedures by David J. Sheskin. The same applies to Biostatistical Analysis by J.H. Zar.
We would like to thank numerous people who have made the publication of this book possible. The first author, Peter Goos, is very grateful to Professor Willy Gochet of the University of Leuven, who introduced him to the topics of statistics and probability. Professor Gochet allowed Peter to use his lecture notes as a backbone for his own course material, which later developed into this book.
The authors are very grateful for the support and advice offered by several people from the JMP Division of SAS: Brady Brady, Ian Cox, Bradley Jones, Volker Kraft, John Sall, and Mia Stephens. It is Volker who brought the two authors together and encouraged them to work on a series of English books on statistics with JMP (the first book is entitled Statistics with JMP: Graphs, Descriptive Statistics and Probability). A very special word of thanks goes to Ian, whose suggestions substantially improved this book, and to José Ramirez for generously sharing an example and a data set. The authors would also like to thank Eva Angels, Kris Annaert, Stefan Becuwe, Hilde Bemelmans, Marco Castro, Filip De Baerdemaeker, Hajar Hamidouche, Jérémie Haumont, Roselinde Kessels, Ida Ruts, Bagus Sartono, Daniel Palhazi Cuervo, Evelien Stoffels, Anja Struyf, Utami Syafitri, Yahri Tillmans, Ellen Vandervieren, Katrien Van Driessen, Kristel Van Rompay, Alan Vazquez Alcocer, Diane Verbiest, Tom Vermeire, Nha Vo-Thanh, Sara Weyns, Peter Willemé, and Simone Willis for their detailed comments and constructive suggestions, and for their technical assistance in creating figures and tables.
Finally, we thank Liz Wingett, Baljinder Kaur, Heather Kay, Audrey Koh, and Geoffrey D. Palmer at John Wiley & Sons.
I don’t know how long I stand there. I don’t believe I’ve ever stood there mourning faithfully in a downpour, but statistically speaking it must have been spitting now and then, there must have been a bit of a drizzle once or twice.
(from The Misfortunates, Dimitri Verhulst, pp. 125–126)