Understanding and Applying Basic Statistical Methods Using R

Rand R. Wilcox

Description

Features a straightforward and concise resource for introductory statistical concepts, methods, and techniques using R

Understanding and Applying Basic Statistical Methods Using R uniquely bridges the gap between advances in the statistical literature and methods routinely used by non-statisticians. Providing a conceptual basis for understanding the relative merits and applications of these methods, the book features modern insights and advances relevant to basic techniques in terms of dealing with non-normality, outliers, heteroscedasticity (unequal variances), and curvature.

Featuring a guide to R, the book uses R programming to explore introductory statistical concepts and standard methods for dealing with known problems associated with classic techniques. Thoroughly classroom tested, the book includes sections that focus on either R programming or computational details to help the reader become acquainted with basic concepts and principles essential for understanding and applying the many methods currently available. Covering relevant material from a wide range of disciplines, Understanding and Applying Basic Statistical Methods Using R also includes:

  • Numerous illustrations and exercises that use data to demonstrate the practical importance of multiple perspectives
  • Discussions on common mistakes such as eliminating outliers and applying standard methods based on means using the remaining data
  • Detailed coverage of R programming, with descriptions of how to apply both classic and more modern methods using R (see the short example following this list)
  • A companion website with the data and solutions to all of the exercises
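
To give a flavor of the classic-versus-robust comparisons the book emphasizes, here is a minimal sketch using only base R functions; the data are made up purely for illustration:

x <- c(2, 3, 4, 5, 6, 7, 8, 9, 10, 150)  # hypothetical data with one extreme outlier
mean(x)               # 20.4; the sample mean is pulled far upward by the outlier
mean(x, trim = 0.2)   # 6.5; the 20% trimmed mean discards the two smallest and two largest values
median(x)             # 6.5; the median is insensitive to the outlier
boxplot.stats(x)$out  # 150; the value flagged as an outlier by the boxplot rule

Simply discarding the flagged value and applying methods for means to the remaining data is precisely the common mistake noted above; the book describes theoretically sound alternatives, such as inferences based on trimmed means.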

Understanding and Applying Basic Statistical Methods Using R is an ideal textbook for undergraduate- and graduate-level statistics courses in science and social science departments. The book can also serve as a reference for professional statisticians and other practitioners looking to better understand modern statistical methods as well as R programming.

Rand R. Wilcox, PhD, is Professor in the Department of Psychology at the University of Southern California, a Fellow of the Association for Psychological Science, and an associate editor for four statistics journals. He is also a member of the International Statistical Institute. The author of more than 320 articles published in a variety of statistical journals, he has also written eleven other books on statistics. Dr. Wilcox is the creator of WRS (Wilcox’ Robust Statistics), an R package for performing robust statistical methods. His main research interests include statistical methods, particularly robust methods for comparing groups and studying associations.
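
A note on obtaining WRS: the package is not distributed via CRAN. The sketch below shows one possible route, assuming the community-maintained GitHub mirror nicebread/WRS is available; Wilcox also distributes the same functions as a single source file on his web page, and the related WRS2 package on CRAN covers a subset of the methods:

# install.packages("remotes")  # one-time setup, if not already installed
# remotes::install_github("nicebread/WRS", subdir = "pkg")  # assumes this mirror is available
library(WRS)
x <- c(2, 3, 4, 5, 6, 7, 8, 9, 10, 150)  # the made-up data from the earlier example
trimci(x, tr = 0.2)  # confidence interval based on the 20% trimmed mean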

Page count: 882

Year of publication: 2016



Table of Contents

Title Page

Copyright

List of Symbols

Preface

About The Companion Website

Chapter 1: Introduction

1.4 R Packages

1.5 Access to Data Used in This Book

1.6 Accessing More Detailed Answers to The Exercises

1.7 Exercises

Chapter 2: Numerical Summaries of Data

2.1 Summation Notation

2.2 Measures of Location

2.3 Quartiles

2.4 Measures of Variation

2.5 Detecting Outliers

2.6 Skipped Measures of Location

2.7 Summary

2.8 Exercises

Chapter 3: Plots Plus More Basics on Summarizing Data

3.1 Plotting Relative Frequencies

3.2 Histograms and Kernel Density Estimators

3.3 Boxplots and Stem-and-Leaf Displays

3.4 Summary

3.5 Exercises

Chapter 4: Probability and Related Concepts

4.1 The Meaning of Probability

4.2 Probability Functions

4.3 Expected Values, Population Mean and Variance

4.4 Conditional Probability and Independence

4.5 The Binomial Probability Function

4.6 The Normal Distribution

4.7 Nonnormality and The Population Variance

4.8 Summary

4.9 Exercises

Chapter 5: Sampling Distributions

5.1 Sampling Distribution of p̂, The Proportion of Successes

5.2 Sampling Distribution of the Mean Under Normality

5.3 Nonnormality and the Sampling Distribution of the Sample Mean

5.4 Sampling Distribution of the Median and 20% Trimmed Mean

5.5 The Mean Versus the Median and 20% Trimmed Mean

5.6 Summary

5.7 Exercises

Chapter 6: Confidence Intervals

6.1 Confidence Interval for The Mean

6.2 Confidence Intervals for The Mean Using s (σ Not Known)

6.3 A Confidence Interval for The Population Trimmed Mean

6.4 Confidence Intervals for The Population Median

6.5 The Impact of Nonnormality on Confidence Intervals

6.6 Some Basic Bootstrap Methods

6.7 Confidence Interval for The Probability of Success

6.8 Summary

6.9 Exercises

Chapter 7: Hypothesis Testing

7.1 Testing Hypotheses about the Mean, σ Known

7.2 Power and Type II Errors

7.3 Testing Hypotheses about the Mean, σ Not Known

7.4 Student's T and Nonnormality

7.5 Testing Hypotheses about Medians

7.6 Testing Hypotheses Based on a Trimmed Mean

7.7 Skipped Estimators

7.8 Summary

7.9 Exercises

Chapter 8: Correlation and Regression

8.1 Regression Basics

8.2 Least Squares Regression

8.3 Dealing with Outliers

8.4 Hypothesis Testing

8.5 Correlation

8.6 Detecting Outliers When Dealing with Two or More Variables

8.7 Measures of Association: Dealing with Outliers

8.8 Multiple Regression

8.9 Dealing with Curvature

8.10 Summary

8.11 Exercises

Chapter 9: Comparing Two Independent Groups

9.1 Comparing Means

9.2 Comparing Medians

9.3 Comparing Trimmed Means

9.4 Tukey's Three-Decision Rule

9.5 Comparing Variances

9.6 Rank-Based (Nonparametric) Methods

9.7 Measuring Effect Size

9.8 Plotting Data

9.9 Comparing Quantiles

9.10 Comparing Two Binomial Distributions

9.11 A Method for Discrete or Categorical Data

9.12 Comparing Regression Lines

9.13 Summary

9.14 Exercises

Chapter 10: Comparing More than Two Independent Groups

10.1 The ANOVA F Test

10.2 Dealing With Unequal Variances: Welch's Test

10.3 Comparing Groups Based on Medians

10.4 Comparing Trimmed Means

10.5 Two-Way ANOVA

10.6 Rank-Based Methods

10.7 R Functions

10.8 Summary

10.9 Exercises

Chapter 11: Comparing Dependent Groups

11.1 The Paired T Test

11.2 Comparing Trimmed Means and Medians

11.3 The Sign Test

11.4 Wilcoxon Signed Rank Test

11.5 Comparing Variances

11.6 Dealing With More Than Two Dependent Groups

11.7 Between-by-Within Designs

11.8 Summary

11.9 Exercises

Chapter 12: Multiple Comparisons

12.1 Classic Methods For Independent Groups

12.2 The Tukey–Kramer Method

12.3 Scheffé's Method

12.4 Methods That Allow Unequal Population Variances

12.5 ANOVA Versus Multiple Comparison Procedures

12.6 Comparing Medians

12.7 Two-Way ANOVA Designs

12.8 Methods For Dependent Groups

12.9 Summary

12.10 Exercises

Chapter 13: Categorical Data

13.1 One-Way Contingency Tables

13.2 Two-Way Contingency Tables

13.3 Logistic Regression

13.4 Summary

13.5 Exercises

Appendix A: Solutions to Selected Exercises

Appendix B: Tables

References

Index

End User License Agreement



List of Illustrations

Chapter 3: Plots Plus More Basics on Summarizing Data

Figure 3.1 Relative frequencies for the data in Table 3.1

Figure 3.2 Relative frequencies that are symmetric about a central value. For this special case, the mean and median have identical values, the middle value, which here is 3. The relative frequencies in (a) are higher for the more extreme values, compared to (b), indicating that the variance associated with (a) is higher

Figure 3.3 A histogram of the heart transplant data in Table 3.2

Figure 3.4 (a) An example of a histogram that is said to be skewed to the right. (b) A histogram that is skewed to the left

Figure 3.5 A histogram based on a measure of hangover symptoms

Figure 3.6 A histogram of an entire population that is symmetric about 0 with relatively light tails, meaning outliers tend to be rare

Figure 3.7 A histogram based on a sample of 100 observations generated from the histogram in Figure 3.6

Figure 3.8 A histogram of an entire population that is symmetric about 0 with relatively heavy tails, meaning outliers tend to be common

Figure 3.10 An example of a kernel density plot based on the same 100 observations generated from Figure 3.8 and used in Figure 3.9. Note how the kernel density plot does a better job of capturing the shape of the population histogram in Figure 3.8

Figure 3.9 A histogram based on a sample of 100 observations generated from the histogram in Figure 3.8

Figure 3.11 An example of a boxplot with no outliers

Figure 3.12 An example of a boxplot with outliers

Chapter 4: Probability and Related Concepts

Figure 4.1 A normal distribution having mean μ. The area under the curve and to the left of μ − σ is 0.158655, which is the probability that an observation is less than or equal to μ − σ

Figure 4.2 For all normal distributions, the probability that an observation is within one standard deviation of the mean is 0.68. The probability of being within two standard deviations is 0.954

Figure 4.3 The left two distributions have the same mean but different standard deviations, namely, 1 and 1.5. The right distribution has a mean of 2 and standard deviation 1

Figure 4.4 The left tail indicates that for a standard normal distribution, the probability of a value less than or equal to −1.55 is 0.0606, and the probability of a value greater than or equal to 1.55 is 0.0606 as well

Figure 4.5 Shown is the standard normal and the mixed normal described in the text

Figure 4.6 Two distributions with equal means and variances

Figure 4.7 For skewed distributions, the population mean and median can differ tremendously

Figure 4.8 Taking logarithms sometimes results in a plot of the data being more symmetric, as illustrated here, but outliers can remain

Figure 4.9 Taking logarithms sometimes reduces skewness but does not eliminate it, as illustrated here

Chapter 5: Sampling Distributions

Figure 5.1 An illustration of how the sampling distribution of the sample mean, X̄, changes with the sample size when sampling from a normal distribution

Figure 5.2 The sampling distribution of p̂, the proportion of successes, when randomly sampling from a binomial distribution having probability of success p

Figure 5.3 As the sample size gets large, the sampling distribution of the sample mean will approach a normal distribution under random sampling. Here, observations were sampled from a symmetric distribution where outliers tend to occur

Figure 5.4 Examples of skewed distributions having light and heavy tails

Figure 5.5 The distributions of 5,000 sample means when sampling from the skewed distributions in Figure 5.4. The symmetric distributions are the distributions of the sample mean based on the central limit theorem. With skewed, light-tailed distributions, smaller sample sizes are needed to assume that the sample mean has a normal distribution compared to situations where sampling is from a heavy-tailed distribution

Figure 5.6 Although it is often the case that with a sample size of 40, the sampling distribution of the sample mean will be approximately normal, exceptions arise as illustrated here using the sexual attitude data

Figure 5.7 Plots of 5,000 trimmed means (solid line) and medians (dashed line). (a) Plots based on the smaller sample size. (b) The sample size is 80. As the sample size increases, the sample medians converge to a normal distribution centered around the population median, which here is 1. The 20% trimmed means converge to a normal distribution centered around the population 20% trimmed mean, which here is 1.65

Figure 5.8 Plots of 20% trimmed means when data are generated from a discrete distribution where tied values will occur, repeated 5,000 times. (a) Plot based on the smaller sample size, showing the relative frequencies among the observed 20% trimmed means. (b) The sample size is 80, illustrating that the distribution of the 20% sample trimmed mean approaches a normal distribution as the sample size increases

Figure 5.9 Plots of 5,000 medians when data are generated from the same discrete distribution used in Figure 5.8. (a) Plot based on the smaller sample size, showing the relative frequencies among the observed medians. Notice that only seven values for the median were observed. (b) The sample size is 80. Now there are only five observed values for the median. In this particular case, the median does not converge to a normal distribution

Figure 5.10 Boxplots of 10,000 means, 20% trimmed means, and medians using data sampled from a normal distribution

Figure 5.11 Boxplots of 10,000 means, 20% trimmed means, and medians using data sampled from a mixed normal distribution

Figure 5.12 Boxplots of 10,000 means, 20% trimmed means, and medians using data sampled from the skewed distribution in Figure 5.4a for which the proportion of values declared an outlier is typically small

Chapter 6: Confidence Intervals

Figure 6.1 Shown is a standard normal curve and a Student's T distribution with four degrees of freedom. Student's T distributions are symmetric about zero, but they have thicker tails than a normal distribution

Figure 6.2 A skewed, light-tailed distribution used to illustrate the effects of nonnormality when using Student's T

Figure 6.3 Shown is the distribution of T (solid line) when sampling from the skewed, light-tailed distribution in Figure 6.2, and the distribution of T when sampling from a normal distribution (dotted line)

Figure 6.4 When sampling data from the distribution in (a), the sampling distribution of the sample mean is approximately normal, as shown in (b). But compare this to the sampling distribution of T shown in Figure 6.5

Figure 6.5 When sampling data from the distribution in Figure 6.4a, the sampling distribution of T, indicated by the solid line, differs substantially from a Student's T distribution, indicated by the dotted line, even though the sampling distribution of the sample mean is approximately normal

Figure 6.6 When dealing with data from actual studies, situations are encountered where the actual distribution of T differs even more from the distribution of T under normality than is indicated in Figures 6.3 and 6.5. Shown is the distribution of T when sampling from the data in Table 2.3, with the extreme outlier removed. With the extreme outlier included, the distribution of T becomes even more skewed to the left

Figure 6.7 The distribution of T based on 10% and 20% trimming when sampling observations from the distribution in Figure 6.2. The distribution of T when sampling from a normal distribution is indicated by the dashed line. Compare this to Figure 6.3

Figure 6.8 The distribution of T when sampling observations from the distribution in Figure 6.2. The distribution of T when sampling from a normal distribution is indicated by the dashed line. Compare this to Figures 6.3 and 6.7

Chapter 7: Hypothesis Testing

Figure 7.1 A graphical depiction of the rejection rule; the shaded portions indicate the rejection regions

Figure 7.2 The distribution of T* based on the bootstrap-t method and the data stemming from a study on hangover symptoms

Chapter 8: Correlation and Regression

Figure 8.1 A few outliers among the independent variable can drastically alter the least squares regression line. Here, the outliers in the upper-left corner result in a regression line having a slightly negative slope. The MAD–median rule indicates that the five smallest surface temperature values are outliers. These are the four points in the upper-left corner, as well as the surface temperature value 3.84. If these outliers are removed, the slope is positive and better reflects the association among the bulk of the points

Figure 8.2 An illustration of homoscedasticity: The variation associated with the dependent variable (C-peptide) does not depend on the value of the independent variable (the age of the participant)

Figure 8.3 An illustration of heteroscedasticity: The variation associated with the dependent variable (C-peptide) depends on the value of the independent variable (the age of the participant)

Figure 8.4 A scatterplot of the marital aggression data. Also shown is the least squares regression line. Notice the apparent outliers in the upper-left corner; when these five points are removed, the regression line is nearly horizontal

Figure 8.5 An illustration that the magnitude of Pearson's correlation is influenced by the magnitude of the residuals

Figure 8.6 When checking for outliers among bivariate data, simply applying the boxplot rule for each of the variables can miss outliers, as illustrated here. The first boxplot in (b) is based on the data stored in the variable x1, ignoring the data stored in x2. The other boxplot is based on the data stored in x2, ignoring x1. No outliers are detected by the boxplots, but the point indicated by the arrow in (a) is an outlier. What is needed is an outlier detection method that takes into account the overall structure of the data

Figure 8.7 This scatterplot illustrates that outliers, properly placed, can have a large influence on both Kendall's tau and Spearman's rho. The two points in the lower-right corner are outliers based on how the data were generated. Kendall's tau, using all of the data, is 0.31, but ignoring the two outliers it is 0.49. Spearman's rho is 0.29 using all of the data and 0.48 when the two outliers are ignored

Figure 8.8 Shown is the smooth created by the R function rplot. The solid line reflects an estimate of the typical CESD measure (depressive symptoms) given a value for the CAR (the cortisol awakening response). Notice that there appears to be a distinct bend close to where the CAR is zero

Figure 8.9 Shown is the smooth created by the R function lplot using the diabetes data. The smooth suggests that there is a positive association up to about the age of 7, but for older children, there seems to be little or no association

Figure 8.10 Shown is the smooth created by the R function lplot using the CAR and CESD to predict the typical MAPA score. Notice the distinct bend close to where CESD is equal to 16. Focusing on only those participants who have a CESD score less than 16, an association is found between CAR and MAPA, in contrast to an analysis where any possibility of curvature is ignored

Figure 8.11 Shown is the smooth created by the R function rplot where the goal is to predict the typical Totagg score, given values for engage and GPA. This illustrates again that curvature can mask an association and that assuming that a regression surface is a plane can yield misleading results

Chapter 9: Comparing Two Independent Groups

Figure 9.1 A skewed distribution with a mean of zero

Figure 9.2 The distribution of T (the solid line) when sampling 40 values from a standard normal distribution and 60 values from the distribution in Figure 9.1. Also shown is the distribution of T when both groups have normal distributions. This illustrates that differences in skewness can have an important impact on T

Figure 9.3 (a) Power is 0.96 based on Student's T. (b) The distributions are not normal, but rather mixed normals, and now power is only 0.28

Figure 9.5 (a) Estimates of the two distributions in the self-awareness study. (b) The boxplots are based on the same data

Figure 9.4 Examples of error bars using the self-awareness data in Section 9.1

Figure 9.6 The distribution of depressive symptoms for a control group (solid line) and an experimental group (dotted line). The plot suggests that relatively high measures of depressive symptoms are more likely for the control group compared to the group that received intervention

Figure 9.7 Plots of the relative frequencies for the graphics group (solid line) and the text-only group (dashed line)

Figure 9.8 The solid line is the regression line for the first group (the control group) for predicting MAPA scores based on CAR. The dashed line is the regression line for the second group (the group that received intervention)

Chapter 10: Comparing More than Two Independent Groups

Figure 10.1 The F distribution when comparing four groups with 10 observations in each group and the hypothesis of equal means is true. That is, J = 4 and n = 10, so ν₁ = J − 1 = 3 and ν₂ = N − J = 36. If the Type I error is to be α = 0.05, reject if F ≥ 2.87, the 0.95 quantile of the F distribution with 3 and 36 degrees of freedom

Chapter 11: Comparing Dependent Groups

Figure 11.1 Boxplots of the difference scores based on the weight of cork borings. The left boxplot is based on the difference between north and east sides of the trees. The right boxplot is based on the east and south sides

Chapter 13: Categorical Data

Figure 13.1 Shown are two chi-squared distributions. One has two degrees of freedom and the other has four degrees of freedom

Figure 13.2 Examples of logistic regression lines based on Equation (13.16). (a) Because β₁ > 0, the predicted probability is monotonically increasing. (b) Now β₁ < 0, and the predicted probability is monotonically decreasing

Figure 13.3 The estimated regression line suggests that the probability of kyphosis increases with age, up to a point, and then decreases, which is contrary to the assumption of the basic logistic regression model in Section 13.3

List of Tables

Chapter 1: Introduction

Table 1.1 Weight Gain of Rats in Ozone Experiment

Table 1.2 A Summary of Basic Commands for Accessing Data When Working with a Vector x

Chapter 2: Numerical Summaries of Data

Table 2.1 Changes in Cholesterol Level after 1 Month on an Experimental Drug

Table 2.2 Changes in Cholesterol Level after 1 Month of Taking a Placebo

Table 2.3 Responses by Males in the Sexual Attitude Study

Chapter 3: Plots Plus More Basics on Summarizing Data

Table 3.1 One Hundred Ratings of a Film

Table 3.3 Frequencies and Relative Frequencies for Grouped T5 Scores

Table 3.2 T5 Mismatch Scores from a Heart Transplant Study

Table 3.4 Word Identification Scores

Table 3.5 Examination Scores

Chapter 4: Probability and Related Concepts

Table 4.1 Hypothetical Probabilities for Getting a Flu Shot and Getting the Flu

Table 4.2 Hypothetical Probabilities for Rating a Book

Chapter 6: Confidence Intervals

Table 6.1 Additional Hours Sleep Gained by Using an Experimental Drug

Table 6.2 Common Choices for 1 − α and c

Table 6.3 Self-Awareness Data

Table 6.4 Actual Probability Coverage for Three Methods Designed to Compute a 0.95 Confidence Interval for the Population Mean

Table 6.5 Actual Probability Coverage When Computing a 0.95 Confidence Interval for the Population 20% Trimmed Mean

Chapter 7: Hypothesis Testing

Table 7.1 Four Possible Outcomes When Testing Hypotheses

Table 7.2 Actual Type I Error Probabilities When Testing Some Hypothesis about the Mean at the α = 0.05 Level

Table 7.3 Actual Probability of a Type I Error When Using a 20% Trimmed Mean

Chapter 8: Correlation and Regression

Table 8.1 Boscovich's Data on Meridian Arcs

Table 8.2 Computing the Least Squares Slope Using Boscovich's Data

Table 8.3 Sale Price of Homes (Divided by 1,000) and Size in Square Feet

Table 8.4 Measures of Marital Aggression and Recall Test Scores

Table 8.5 Reading Data

Chapter 9: Comparing Two Independent Groups

Table 9.1 Weight Gain, in Grams, for Large Babies

Table 9.2 Self-Awareness Data

Chapter 10: Comparing More than Two Independent Groups

Table 10.1 Weight Gains (in Grams) for Rats on One of Four Diets

Table 10.2 Depiction of the Population Means for Four Diets

Table 10.3 Commonly Used Notation for the Means in a J-by-K ANOVA

Table 10.4 Hypothetical Data Illustrating the Kruskal–Wallis Test

Chapter 11: Comparing Dependent Groups

Table 11.1 Bacteria Counts Before and After Treatment

Table 11.2 Cork Boring Weights for the North, East, South, and West Sides of Trees

Chapter 12: Multiple Comparisons

Table 12.1 Hypothetical Data for Three Groups

Table 12.2 Ratings of Methods for Treating Migraine Headaches

Table 12.3 Critical Values, d_k, for Rom's Method

Table 12.4 An Illustration of Rom's Method

Chapter 13: Categorical Data

Table 13.1 Hypothetical Data on Homework Survey

Table 13.2 Approval Rating of a Political Leader

Table 13.3 Probabilities Associated with a Two-Way Contingency Table

Table 13.4 Notation for Observed Frequencies

Table 13.5 Hypothetical Results on Personality and Blood Pressure

Table 13.6 Ratings of 100 Figure Skaters

Table 13.7 Estimated Probabilities for Personality versus Blood Pressure

Table 13.8 Mortality Rates per 100,000 Person-Years from Lung Cancer and Coronary Artery Disease for Smokers and Nonsmokers of Cigarettes

Understanding and Applying Basic Statistical Methods Using R

Rand R. Wilcox

 

Copyright © 2017 by John Wiley & Sons, Inc. All rights reserved

Published by John Wiley & Sons, Inc., Hoboken, New Jersey

Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Names: Wilcox, Rand R., author.

Title: Understanding and applying basic statistical methods using R / Rand R. Wilcox.

Description: Hoboken, New Jersey : John Wiley & Sons, 2016. | Includes bibliographical references and index.

Identifiers: LCCN 2015050582 | ISBN 9781119061397 | ISBN 9781119061410 (epub) | ISBN 9781119061403 (Adobe PDF)

Classification: LCC QA276.45.R3 W55 2016 | DDC 519.50285/5133–dc23

LC record available at http://lccn.loc.gov/2015050582

LIST OF SYMBOLS

α Type I error probability (alpha)

β Type II error probability (beta)

β₀ The intercept of a regression line

β₁ The slope of a regression line

δ A measure of effect size (delta)

θ Population median or the odds ratio (theta)

μ Population mean (mu)

μₜ Population trimmed mean

ν Degrees of freedom (nu)

ρ Pearson's correlation (rho)

ρₛ Spearman's correlation

σ Population standard deviation (sigma)

σ² Population variance

Σ Summation

τ Kendall's tau

φ Phi coefficient

ω Odds (omega)

M Sample median

p Probability of success. Also used to indicate a p-value as well as a measure of effect size associated with the Wilcoxon–Mann–Whitney method

p̂ Estimate of p

r Estimate of ρ, Pearson's correlation

r_w Winsorized correlation

PREFACE

The goal of this book is to introduce basic statistical principles, techniques, and concepts in a relatively simple and concise manner. The book is designed for a one-semester course aimed at nonstatisticians. Numerous illustrations are provided using data from a wide range of disciplines. Answers to selected exercises are in Appendix A. More detailed answers to all of the exercises, as well as the data sets used in the exercises, can be downloaded from the companion website.

Continue reading in the full edition!