A highly accessible alternative approach to basic statistics
Praise for the First Edition: "Certainly one of the most impressive little paperback 200-page introductory statistics books that I will ever see . . . it would make a good nightstand book for every statistician."--Technometrics
Written in a highly accessible style, Introduction to Statistics through Resampling Methods and R, Second Edition guides students in the understanding of descriptive statistics, estimation, hypothesis testing, and model building. The book emphasizes the discovery method, enabling readers to ascertain solutions on their own rather than simply copy answers or apply a formula by rote. The Second Edition utilizes the R programming language to simplify tedious computations, illustrate new concepts, and assist readers in completing exercises.
The text facilitates quick learning through the use of:
More than 250 exercises--with selected "hints"--scattered throughout to stimulate readers' thinking and to actively engage them in applying their newfound skills
An increased focus on why a method is introduced
Multiple explanations of basic concepts
Real-life applications in a variety of disciplines
Dozens of thought-provoking, problem-solving questions in the final chapter to assist readers in applying statistics to real-life applications
Introduction to Statistics through Resampling Methods and R, Second Edition is an excellent resource for students and practitioners in the fields of agriculture, astrophysics, bacteriology, biology, botany, business, climatology, clinical trials, economics, education, epidemiology, genetics, geology, growth processes, hospital administration, law, manufacturing, marketing, medicine, mycology, physics, political science, psychology, social welfare, sports, and toxicology who want to master and learn to apply statistical methods.
Page count: 360
Year of publication: 2012
Table of Contents
Cover
Title page
Copyright page
Preface
Chapter 1 Variation
1.1 VARIATION
1.2 COLLECTING DATA
1.3 SUMMARIZING YOUR DATA
1.4 REPORTING YOUR RESULTS
1.5 TYPES OF DATA
1.6 DISPLAYING MULTIPLE VARIABLES
1.7 MEASURES OF LOCATION
1.8 SAMPLES AND POPULATIONS
1.9 SUMMARY AND REVIEW
Chapter 2 Probability
2.1 PROBABILITY
2.2 BINOMIAL TRIALS
*2.3 CONDITIONAL PROBABILITY
2.4 INDEPENDENCE
2.5 APPLICATIONS TO GENETICS
2.6 SUMMARY AND REVIEW
Chapter 3 Two Naturally Occurring Probability Distributions
3.1 DISTRIBUTION OF VALUES
3.2 DISCRETE DISTRIBUTIONS
3.3 THE BINOMIAL DISTRIBUTION
3.4 MEASURING POPULATION DISPERSION AND SAMPLE PRECISION
3.5 POISSON: EVENTS RARE IN TIME AND SPACE
3.6 CONTINUOUS DISTRIBUTIONS
3.7 SUMMARY AND REVIEW
Chapter 4 Estimation and the Normal Distribution
4.1 POINT ESTIMATES
4.2 PROPERTIES OF THE NORMAL DISTRIBUTION
4.3 USING CONFIDENCE INTERVALS TO TEST HYPOTHESES
4.4 PROPERTIES OF INDEPENDENT OBSERVATIONS
4.5 SUMMARY AND REVIEW
Chapter 5 Testing Hypotheses
5.1 TESTING A HYPOTHESIS
5.2 ESTIMATING EFFECT SIZE
5.3 APPLYING THE T-TEST TO MEASUREMENTS
5.4 COMPARING TWO SAMPLES
5.5 WHICH TEST SHOULD WE USE?
5.6 SUMMARY AND REVIEW
Chapter 6 Designing an Experiment or Survey
6.1 THE HAWTHORNE EFFECT
6.2 DESIGNING AN EXPERIMENT OR SURVEY
6.3 HOW LARGE A SAMPLE?
6.4 META-ANALYSIS
6.5 SUMMARY AND REVIEW
Chapter 7 Guide to Entering, Editing, Saving, and Retrieving Large Quantities of Data Using R
7.1 CREATING AND EDITING A DATA FILE
7.2 STORING AND RETRIEVING FILES FROM WITHIN R
7.3 RETRIEVING DATA CREATED BY OTHER PROGRAMS
7.4 USING R TO DRAW A RANDOM SAMPLE
Chapter 8 Analyzing Complex Experiments
8.1 CHANGES MEASURED IN PERCENTAGES
8.2 COMPARING MORE THAN TWO SAMPLES
8.3 EQUALIZING VARIABILITY
8.4 CATEGORICAL DATA
8.5 MULTIVARIATE ANALYSIS
8.6 R PROGRAMMING GUIDELINES
8.7 SUMMARY AND REVIEW
Chapter 9 Developing Models
9.1 MODELS
9.2 CLASSIFICATION AND REGRESSION TREES
9.3 REGRESSION
9.4 FITTING A REGRESSION EQUATION
9.5 PROBLEMS WITH REGRESSION
9.6 QUANTILE REGRESSION
9.7 VALIDATION
9.8 SUMMARY AND REVIEW
Chapter 10 Reporting Your Findings
10.1 WHAT TO REPORT
10.2 TEXT, TABLE, OR GRAPH?
10.3 SUMMARIZING YOUR RESULTS
10.4 REPORTING ANALYSIS RESULTS
10.5 EXCEPTIONS ARE THE REAL STORY
10.6 SUMMARY AND REVIEW
Chapter 11 Problem Solving
11.1 THE PROBLEMS
11.2 SOLVING PRACTICAL PROBLEMS
Answers to Selected Exercises
Index
Copyright © 2013 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
Good, Phillip I.
Introduction to statistics through resampling methods and R / Phillip I. Good. – Second edition.
pages cm
Includes indexes.
ISBN 978-1-118-42821-4 (pbk.)
1. Resampling (Statistics) I. Title.
QA278.8.G63 2013
519.5'4–dc23
2012031774
Preface
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Benjamin Franklin
Intended for class use or self-study, the second edition of this text aspires, as did the first, to introduce statistical methodology to a wide audience, simply and intuitively, through resampling from the data at hand.
The methodology proceeds from chapter to chapter from the simple to the complex. The stress is always on concepts rather than computations. Similarly, the R code introduced in the opening chapters is simple and straightforward; R’s complexities, necessary only if one is programming one’s own R functions, are deferred to Chapter 7 and Chapter 8.
The resampling methods—the bootstrap, decision trees, and permutation tests—are easy to learn and easy to apply. They do not require mathematics beyond introductory high school algebra, yet are applicable to an exceptionally broad range of subject areas.
Although resampling methods were introduced in the 1930s, the numerous, albeit straightforward, calculations they require were beyond the capabilities of the primitive calculators then in use. The methods were soon displaced by less powerful, less accurate approximations that made use of tables. Today, with a powerful computer on every desktop, resampling methods have resumed their dominant role and table lookup is an anachronism.
Physicians and physicians in training, nurses and nursing students, business persons, business majors, research workers, and students in the biological and social sciences will find a practical and easily grasped guide to descriptive statistics, estimation, testing hypotheses, and model building.
For advanced students in astronomy, biology, dentistry, medicine, psychology, sociology, and public health, this text can provide a first course in statistics and quantitative reasoning.
For mathematics majors, this text will form the first course in statistics to be followed by a second course devoted to distribution theory and asymptotic results.
Hopefully, all readers will find my objectives are the same as theirs: To use quantitative methods to characterize, review, report on, test, estimate, and classify findings.
Warning to the autodidact: You can master the material in this text without the aid of an instructor. But you may not be able to grasp even the more elementary concepts without completing the exercises. Whenever and wherever you encounter an exercise in the text, stop your reading and complete the exercise before going further. To simplify the task, R code and data sets may be downloaded by entering ISBN 9781118428214 at booksupport.wiley.com and then cut and pasted into your programs.
I have similar advice for instructors. You can work out the exercises in class and show every student how smart you are, but it is doubtful they will learn anything from your efforts, much less retain the material past exam time. Success in your teaching can be achieved only via the discovery method, that is, by having the students work out the exercises on their own. I let my students know that the final exam will consist solely of exercises from the book. “I may change the numbers or combine several exercises in a single question, but if you can answer all the exercises you will get an A.” I do not require students to submit their homework but merely indicate that if they wish to do so, I will read and comment on what they have submitted. When a student indicates he or she has had difficulty with an exercise, emulating Professor Neyman I invite him or her up to the white board and give hints until the problem has been worked out by the student.
Thirty or more exercises included in each chapter plus dozens of thought-provoking questions in Chapter 11 will serve the needs of both classroom and self-study. The discovery method is utilized as often as possible, and the student and conscientious reader are forced to think their way to a solution rather than being able to copy the answer or apply a formula straight out of the text.
Certain questions lend themselves to in-class discussions in which all students are encouraged to participate. Examples include Exercises 1.11, 2.7, 2.24, 2.32, 3.18, 4.1, 4.11, 6.1, 6.9, 9.7, 9.17, 9.30, and all the problems in Chapter 11.
R may be downloaded without charge for use under Windows, UNIX, or the Macintosh from http://cran.r-project.org. For a one-quarter short course, I take students through Chapter 1, Chapter 2, and Chapter 3. Sections preceded by an asterisk (*) concern specialized topics and may be skipped without loss in comprehension. We complete Chapter 4, Chapter 5, and Chapter 6 in the winter quarter, finishing the year with Chapter 7, Chapter 8, and Chapter 9. Chapter 10 and Chapter 11 on “Reports” and “Problem Solving” convert the text into an invaluable professional resource.
If you find this text an easy read, then your gratitude should go to the late Cliff Lunneborg for his many corrections and clarifications. I am deeply indebted to Rob J. Goedman for his help with the R language, and to Brigid McDermott, Michael L. Richardson, David Warton, Mike Moreau, Lynn Marek, Mikko Mönkkönen, Kim Colyvas, my students at UCLA, and the students in the Introductory Statistics and Resampling Methods courses that I offer online each quarter through the auspices of statcourse.com for their comments and corrections.
PHILLIP I. GOOD
Huntington Beach
Chapter 1
Variation
If there were no variation, if every observation were predictable, a mere repetition of what had gone before, there would be no need for statistics.
In this chapter, you’ll learn what statistics is all about, variation and its potential sources, and how to use R to display the data you’ve collected. You’ll start to acquire additional vocabulary, including such terms as accuracy and precision, mean and median, and sample and population.
We find physics extremely satisfying. In high school, we learned the formula S = VT, which relates the distance S traveled by an object to its velocity V multiplied by the time T spent in traveling. If the speedometer says 60 mph, then in half an hour, you are certain to travel exactly 30 mi. Except that during our morning commute, the speed we travel is seldom constant, and the formula is not really applicable. Yahoo Maps told us it would take 45 minutes to get to our teaching assignment at UCLA. Alas, it rained and it took us two and a half hours.
Politicians always tell us the best that can happen. If a politician had spelled out the worst-case scenario, would the United States have gone to war in Iraq without first gathering a great deal more information?
In college, we had Boyle’s law, V = KT/P, with its tidy relationship between the volume V, temperature T, and pressure P of a perfect gas. This is just one example of the perfection encountered there. The problem was we could never quite duplicate this (or any other) law in the freshman physics laboratory. Maybe it was the measuring instruments, our lack of familiarity with the equipment, or simple measurement error, but we kept getting different values for the constant K.
By now, we know that variation is the norm. Instead of getting a fixed, reproducible volume V to correspond to a specific temperature T and pressure P, one ends up with a distribution of values of V as a result of errors in measurement. But we also know that with a large enough representative sample (defined later in this chapter), the center and shape of this distribution are reproducible.
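To make this concrete, here is a minimal R sketch of the laboratory experience just described. It is only an illustration under stated assumptions, not code from the text: the "true" constant K_true, the fixed temperature and pressure, the size of the measurement error, and the helper estimate_K are all hypothetical choices, and the errors are assumed to follow a normal distribution.

```r
# Hypothetical illustration: repeated laboratory estimates of the constant K
# in V = K * T / P when each volume reading carries random measurement error.
set.seed(1)                       # make the simulation repeatable

K_true <- 8.314                   # assumed "true" constant (arbitrary units)
temp   <- 300                     # temperature T, held fixed (assumed value)
press  <- 100                     # pressure P, held fixed (assumed value)
V_true <- K_true * temp / press   # volume the law predicts with no error

# One lab session: n noisy volume readings, each converted to an estimate of K.
# The normal error model and sd_error = 0.5 are assumptions for illustration.
estimate_K <- function(n, sd_error = 0.5) {
  V_obs <- V_true + rnorm(n, mean = 0, sd = sd_error)
  V_obs * press / temp
}

small_a <- estimate_K(5)          # two small samples
small_b <- estimate_K(5)
large_a <- estimate_K(5000)       # two large samples
large_b <- estimate_K(5000)

# Individual estimates differ from one another and from K_true
print(round(small_a, 3))

# Centers of the two small samples, then of the two large samples
print(c(mean(small_a), mean(small_b)))
print(c(mean(large_a), mean(large_b)))

# Shape of the distribution in each large sample, summarized by its quartiles
print(quantile(large_a, c(0.25, 0.5, 0.75)))
print(quantile(large_b, c(0.25, 0.5, 0.75)))
```

Running the sketch, the individual estimates of K never quite agree, yet the means and quartiles of the two large samples should come out close to one another, which is the sense in which the center and shape of the distribution are reproducible.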
Here’s more good and bad news: Make astronomical, physical, or chemical measurements and the only variation appears to be due to observational error. Purchase a more expensive measuring device and get more precise measurements and the situation will improve.
But try working with people. Anyone who spends any time in a schoolroom, whether as a parent or as a child, soon becomes aware of the vast differences among individuals. Our most distinct memories are of how large the girls were in the third grade (ever been beat up by a girl?) and the trepidation we felt on the playground whenever teams were chosen (not right field again!). Much later, in our college days, we were to discover there were many individuals capable of devouring larger quantities of alcohol than we could without noticeable effect. And a few, mostly of other nationalities, whom we could drink under the table.