A concise, thoroughly class-tested primer that features basic statistical concepts in the context of analytics, resampling, and the bootstrap.
A uniquely developed presentation of key statistical topics, Introductory Statistics and Analytics: A Resampling Perspective provides an accessible approach to statistical analytics, resampling, and the bootstrap for readers with various levels of exposure to basic probability and statistics. Originally class-tested at one of the first online learning companies in the discipline, www.statistics.com, the book primarily focuses on applications of statistical concepts developed via resampling, with a background discussion of mathematical theory. This feature stresses statistical literacy and understanding, which demonstrates the fundamental basis for statistical inference and demystifies traditional formulas. The book begins with illustrations that have the essential statistical topics interwoven throughout before moving on to demonstrate the proper design of studies.
Meeting all of the Guidelines for Assessment and Instruction in Statistics Education (GAISE) requirements for an introductory statistics course, Introductory Statistics and Analytics: A Resampling Perspective also includes:
* Over 300 "Try It Yourself" exercises and intermittent practice questions, which challenge readers at multiple levels to investigate and explore key statistical concepts
* Numerous interactive links designed to provide solutions to exercises and further information on crucial concepts
* Linkages that connect statistics to the rapidly growing field of data science
* Multiple discussions of various software systems, such as Microsoft Office Excel®, StatCrunch, and R, to develop and analyze data
* Areas of concern and/or contrasting points-of-view indicated through the use of "Caution" icons
Introductory Statistics and Analytics: A Resampling Perspective is an excellent primary textbook for courses in preliminary statistics as well as a supplement for courses in upper-level statistics and related fields, such as biostatistics and econometrics. The book is also a general reference for readers interested in revisiting the value of statistics.
Number of pages: 487
Year of publication: 2015
Cover
Title Page
Copyright
Preface
Book Website
Acknowledgments
Stan Blank
Michelle Everson
Robert Hayden
Introduction
If You Can't Measure it, You Can't Manage It
Phantom Protection from Vitamin E
Statistician, Heal Thyself
Identifying Terrorists in Airports
Looking Ahead in the Book
Resampling
Big Data and Statisticians
Chapter 1: Designing and Carrying Out a Statistical Study
1.1 A Small Example
1.2 Is Chance Responsible? The Foundation of Hypothesis Testing
1.3 A Major Example
1.4 Designing an Experiment
1.5 What to Measure—Central Location
1.6 What to Measure—Variability
1.7 What to Measure—Distance (Nearness)
1.8 Test Statistic
1.9 The Data
1.10 Variables and Their Flavors
1.11 Examining and Displaying the Data
1.12 Are we Sure we Made a Difference?
Appendix: Historical Note
1.13 Exercises
Chapter 2: Statistical Inference
2.1 Repeating the Experiment
2.2 How Many Reshuffles?
2.3 How Odd is Odd?
2.4 Statistical and Practical Significance
2.5 When to Use Hypothesis Tests
2.6 Exercises
Chapter 3: Displaying and Exploring Data
3.1 Bar Charts
3.2 Pie Charts
3.3 Misuse of Graphs
3.4 Indexing
3.5 Exercises
Chapter 4: Probability
4.1 Mendel's Peas
4.2 Simple Probability
4.3 Random Variables and their Probability Distributions
4.4 The Normal Distribution
4.5 Exercises
Chapter 5: Relationship Between Two Categorical Variables
5.1 Two-Way Tables
5.2 Comparing Proportions
5.3 More Probability
5.4 From Conditional Probabilities to Bayesian Estimates
5.5 Independence
5.6 Exploratory Data Analysis (EDA)
5.7 Exercises
Chapter 6: Surveys and Sampling
6.1 Simple Random Samples
6.2 Margin of Error: Sampling Distribution for a Proportion
6.3 Sampling Distribution for a Mean
6.4 A Shortcut—The Bootstrap
6.5 Beyond Simple Random Sampling
6.6 Absolute Versus Relative Sample Size
6.7 Exercises
Chapter 7: Confidence Intervals
7.1 Point Estimates
7.2 Interval Estimates (Confidence Intervals)
7.3 Confidence Interval for a Mean
7.4 Formula-Based Counterparts to the Bootstrap
7.5 Standard Error
7.6 Confidence Intervals for a Single Proportion
7.7 Confidence Interval for a Difference in Means
7.8 Confidence Interval for a Difference in Proportions
7.9 Recapping
Appendix A: More on the Bootstrap
Resampling Procedure—Parametric Bootstrap
Formulas and the Parametric Bootstrap
Appendix B: Alternative Populations
Appendix C: Binomial Formula Procedure
7.10 Exercises
Chapter 8: Hypothesis Tests
8.1 Review of Terminology
8.2 A–B Tests: The Two Sample Comparison
8.3 Comparing Two Means
8.4 Comparing Two Proportions
8.5 Formula-Based Alternative—t-Test for Means
8.6 The Null and Alternative Hypotheses
8.7 Paired Comparisons
Appendix A: Confidence Intervals Versus Hypothesis Tests
Confidence Interval
Relationship Between the Hypothesis Test and the Confidence Interval
Comment
Appendix B: Formula-Based Variations of Two-Sample Tests
Z-Test With Known Population Variance
Pooled Versus Separate Variances
Formula-Based Alternative: Z-Test for Proportions
8.8 Exercises
Chapter 9: Hypothesis Testing—2
9.1 A Single Proportion
9.2 A Single Mean
9.3 More Than Two Categories or Samples
9.4 Continuous Data
9.5 Goodness-of-Fit
Appendix: Normal Approximation; Hypothesis Test of a Single Proportion
Confidence Interval for a Mean
9.6 Exercises
Chapter 10: Correlation
10.1 Example: Delta Wire
10.2 Example: Cotton Dust and Lung Disease
10.3 The Vector Product and Sum Test
10.4 Correlation Coefficient
10.5 Other Forms of Association
10.6 Correlation is not Causation
10.7 Exercises
Chapter 11: Regression
11.1 Finding the Regression Line by Eye
11.2 Finding the Regression Line by Minimizing Residuals
11.3 Linear Relationships
11.4 Inference for Regression
11.5 Exercises
Chapter 12: Analysis of Variance—ANOVA
12.1 Comparing More Than Two Groups: ANOVA
12.2 The Problem of Multiple Inference
12.3 A Single Test
12.4 Components of Variance
12.5 Two-Way ANOVA
12.6 Factorial Design
12.7 Exercises
Chapter 13: Multiple Regression
13.1 Regression as Explanation
13.2 Simple Linear Regression—Explore the Data First
13.3 More Independent Variables
13.4 Model Assessment and Inference
13.5 Assumptions
13.6 Interaction, Again
13.7 Regression for Prediction
13.8 Exercises
Index
End User License Agreement
Peter C. Bruce
Institute for Statistics Education
Statistics.com
Arlington, VA
Copyright © 2015 by John Wiley & Sons, Inc. All rights reserved
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data is available.
ISBN: 978-1-118-88135-4
This book was developed by Statistics.com to meet the needs of its introductory students, based on experience in teaching introductory statistics online since 2003. The field of statistics education has been in ferment for several decades. With this book, which continues to evolve, we attempt to capture three important strands of recent thinking:
Connection with the field of data science—an amalgam of traditional statistics, newer machine learning techniques, database methodology, and computer programming to serve the needs of large organizations seeking to extract value from “big data.”
Guidelines for the introductory statistics course, developed in 2005 by a group of noted statistics educators with funding from the American Statistical Association. These Guidelines for Assessment and Instruction in Statistics Education (GAISE) call for the use of real data with active learning, stress statistical literacy and understanding over memorization of formulas, and require the use of software to develop concepts and analyze data.
The use of resampling/simulation methods to develop the underpinnings of statistical inference (the most difficult topic in an introductory course) in a transparent and understandable manner.
We start off with some examples of statistics in action (including two of statistics gone wrong) and then dive right in to look at the proper design of studies and account for the possible role of chance. All the standard topics of introductory statistics are here (probability, descriptive statistics, inference, sampling, correlation, etc.), but sometimes, they are introduced not as separate standalone topics but rather in the context of the situation in which they are needed.
Throughout the book, you will see “Try It Yourself” exercises. The answers to these exercises are found at the end of each chapter after the homework exercises.
Data sets, Excel worksheets, software information, and instructor resources are available at the book website: www.introductorystatistics.com
Peter C. Bruce
The programmer for Resampling Stats, Stan Blank has participated actively in many sessions of Statistics.com courses based on this work and has contributed well both to the presentation of regression and to the clarification and improvement of sections that deal with computational matters.
Michelle Everson, editor (2013) of the Journal of Statistics Education, has taught many sessions of the introductory sequence at Statistics.com and is responsible for the material on decomposition in the ANOVA chapter. Her active participation in the statistics education community has been an asset as we have strived to improve and perfect this text.
Robert Hayden has taught early sessions of this course and has written course materials that served as the seed from which this text grew. He was instrumental in getting this project launched.
In the beginning, Julian Simon, an early resampling pioneer, first kindled my interest in statistics with his permutation and bootstrap approach to statistics, his Resampling Stats software (first released in the late 1970s), and his statistics text on the same subject. Simon, described as an “iconoclastic polymath” by Peter Hall in his “Prehistory of the Bootstrap,” (Statistical Science, 2003, vol. 18, #2), is the intellectual forefather of this work.
Our Advisory Board (Chris Malone, William Peterson, and Jeff Witmer, all active in GAISE and the statistics education community in general) reviewed the overall concept and outline of this text and offered valuable advice.
Thanks go also to George Cobb, who encouraged me to proceed with this project and reinforced my inclination to embed resampling and simulation more thoroughly than what is found in typical college textbooks.
Meena Badade also teaches using this text. She has been very helpful in bringing errors and points requiring clarification to my attention and has helped to add the sections dealing with standard statistical formulas.
Kuber Deokar, Instructional Operations Supervisor at Statistics.com, and Valerie Troiano, the Registrar at Statistics.com, diligently and carefully shepherded the use of earlier versions of this text in courses at Statistics.com.
The National Science Foundation provided support for the Urn Sampler project, which evolved into the Box Sampler software used both in this course and for its early web versions. Nitin Patel, at Cytel Software Corporation, provided invaluable support and design assistance for this work. Marvin Zelen, an early advocate of urn-sampling models for instruction, shared illustrations that sharpened and clarified my thinking.
Many students at The Institute for Statistics Education at Statistics.com have helped me clarify confusing points and refine this book over the years.
Finally, many thanks to Stephen Quigley and the team at Wiley, who encouraged me and moved quickly on this project to bring it to fruition.
As of the writing of this book, the fields of statistics and data science are evolving rapidly to meet the changing needs of business, government, and research organizations. It is an oversimplification, but still useful, to think of two distinct communities as you proceed through the book:
The traditional academic and medical research communities that typically conduct extended research projects adhering to rigorous regulatory or publication standards, and
Business and large organizations that use statistical methods to extract value from their data, often on the fly. Reliability and value are more important than academic rigor to this data science community.
You may be familiar with the phrase "if you can't measure it, you can't manage it" or its cousin: if you can't measure it, you can't fix it. The two come up frequently in the context of Total Quality Management or Continuous Improvement programs in organizations. The flip side of these expressions is the fact that if you do measure something and make the measurements available to decision-makers, the something that you measure is likely to change.