Health and Numbers - Chap T. Le - E-Book

Health and Numbers E-Book

Chap T. Le

Description

Like its two successful previous editions, Health & Numbers: A Problems-Based Introduction to Biostatistics, Third Edition, is the only fully problems-based introduction to biostatistics and offers a concise introduction to basic statistical concepts and reasoning at a level suitable for a broad spectrum of students and professionals in medicine and the allied health fields. This book has always been meant for use by advanced students who have not previously had an introductory biostatistics course - material often presented in a one-semester course - or by busy professionals who need to learn the basics of biostatistics. This user-friendly resource features over 200 real-life examples and real data to discuss and teach fundamental statistical methods. The new edition offers even more exercises than the second edition, and features enhanced Microsoft Excel and SAS samples and examples. Health & Numbers, Third Edition, truly strikes a balance between principles and methods of calculation that is particularly useful for students in medicine and health-related fields who need to know biostatistics.


Page count: 417

Year of publication: 2011




Contents

Preface

Introduction from the First Edition

What is Statistics?

What Can Statistics Do?

Why is Statistics so Hard to Learn?

Reasons for the Difficulty

The Three Levels of Statistical Knowledge

Dealing with Formulas

1 Proportions, Rates, and Ratios

1.1 Proportions

1.2 Rates

1.3 Ratios

1.4 Computational and Visual Aids

Exercises

2 Organization, Summarization, and Presentation of Data

2.1 Tabular and Graphical Methods

2.2 Numerical Methods

2.3 Coefficient of Correlation

2.4 Visual and Computational Aids

Exercises

3 Probability and Probability Models

3.1 Probability

3.2 The Normal Distribution

3.3 Probability Models

3.4 Computational Aids

Exercises

4 Confidence Estimation

4.1 Basic Concepts

4.2 Estimation of a Population Mean

4.3 Estimation of a Population Proportion

4.4 Estimation of a Population Odds Ratio

4.5 Estimation of a Population Correlation Coefficient

4.6 A Note on Computation

Exercises

5 Introduction to Hypothesis Testing

5.1 Basic Concepts

5.2 Analogies

5.3 Summaries and Conclusions

Exercises

6 Comparison of Population Proportions

6.1 One-Sample Problem with Binary Data

6.2 Analysis of Pair-Matched Binary Data

6.3 Comparison of Two Proportions

6.4 The Mantel–Haenszel Method

6.5 Computational Aids

Exercises

7 Comparison of Population Means

7.1 One-Sample Problem with Continuous Data

7.2 Analysis of Pair-Matched Data

7.3 Comparison of Two Means

7.4 One-Way Analysis of Variance (ANOVA)

7.5 Computational Aids

Exercises

8 Regression Analysis

8.1 Simple Regression Analysis

8.2 Multiple Regression Analysis

8.3 Graphical and Computational Aids

Exercises

Bibliography

Appendices

Answers to Selected Exercises

Index

Copyright © 2009 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey

Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

ISBN 978-0470-18589-6

To my wife, Minh-Ha, & my daughters, Mina and Jenna, with deepest love and appreciation

Preface

A course in introductory biostatistics is often required for professional students in public health, dentistry, nursing, and medicine, and for graduate students in nursing and other biomedical sciences. It is only a course or two, but the requirement is often regarded as a roadblock that causes anxiety in many quarters. The feelings are expressed in many ways and in many different settings, but they all lead to the same conclusion: simply surviving the ordeal is the only practical goal. Students need help, in the form of a user-friendly text, to do just that. In the early 1990s, we decided it was time to write our own text, and Health and Numbers: Basic Biostatistical Methods was published in 1995, reflecting our experience of teaching introductory biostatistics courses for many years to students from various human health disciplines. The second edition, Health and Numbers: A Problems-Based Introduction to Biostatistics, was aimed at the same audience for whom the first edition was written: professional and beginning graduate students in human health disciplines who need help to pass, and to benefit from, the basic biostatistics course requirement. Our main objective was to avoid the perception that statistics is just a number of formulas that students need to get through, and to present it instead as a way of thinking: thinking about ways to gather and analyze data, so that students benefit from taking the required course. There is no better way to do that than to make the book problems-based: many problems with real data from various fields are provided at the end of all eight chapters as aids to learning how to use statistical procedures, still the nuts and bolts of elementary applied statistics. The most appealing feature of Health & Numbers, through the first two editions, has been that it is based on the most popular, most available, and cheapest computer package, Microsoft Excel.

The second edition was successful; however, (1) it did not give enough detail on the use of Excel, and (2) it covered many topics, some of them difficult for beginners. Therefore, in this third edition, Health and Numbers: A Problems-Based Introduction to Biostatistics, I have tried to overcome those weaknesses by cutting a few advanced topics and adding many more hints and examples on the use of Excel in every chapter.

The “way of thinking” called statistics has become important to all professionals who are not only scientific or business-like but also caring people who want to help and make the world a better place. But what is it? What is biostatistics and what can it do? There are popular definitions and perceptions of statistics. For this book and its readers, we don’t emphasize the definition of “statistics as things”, but instead offer an active concept of “doing statistics”. The doing of statistics is a way of thinking about numbers, with emphasis on relating their interpretation and meaning to the manner in which they are collected. Our working definition of statistics, as an activity, is that it is a way of thinking about data: their collection, their analysis, and their presentation. Formulas are needed, but they are not the only things you need to know.

To illustrate statistics as a way of thinking, let’s start with the familiar scenario of our criminal court procedures. A crime has been discovered and a suspect has been identified. After a police investigation to collect evidence against the suspect, a prosecutor presents summarized evidence to a jury. The jurors learn about the rules and debate the rule about convicting beyond a reasonable doubt and the rule about unanimous decision. After the debate, the jurors vote and a verdict is reached: guilty or not guilty. Why do we need this time-consuming, costly process called trial by jury? Well, the truth is often unknown, or at least uncertain. It is uncertain because of variability (every case is different) and because of incomplete information (missing key evidence?). Trial by jury is the way our society deals with uncertainties; its goal is to minimize mistakes.

How does society deal with uncertainties? We go through a process called trial by jury, consisting of these steps: (1) we form an assumption or hypothesis (that every person is innocent until proven guilty), (2) we gather data (evidence supporting the charge), and (3) we decide whether the hypothesis should be rejected (guilty) or should not be rejected (not guilty). Basically, a successful trial should consist of these elements: (1) a probable cause (with a crime and a suspect), (2) a thorough investigation by police, (3) an efficient presentation by the prosecutor, and (4) a fair and impartial jury.

In the above-described context of a trial by jury, let us consider a few specific examples: (1) the crime is lung cancer and the suspect is cigarette smoking, or (2) the crime is leukemia and the suspect is pesticides, or (3) the crime is breast cancer and the suspect is a defective gene. The process is now called research, and the tool to carry out that research is biostatistics. In a simple way, biostatistics serves as the biomedical version of the trial by jury process. It is the science of dealing with uncertainties using incomplete information. Yes, even science is uncertain; scientists arrive at different conclusions in many different areas at different times; many studies are inconclusive. The reasons for uncertainties remain the same. Nature is complex and full of unexplained biological variability. But, most important of all, we always have to deal with incomplete information. It is often not practical to study the entire population; we have to rely on information gained from samples.

How does science deal with uncertainties? We learn from how society deals with uncertainties; we go through a process called biostatistics, consisting of these steps: (1) we form an assumption or hypothesis (from the research question), (2) we gather data (from clinical trials, surveys, or medical record abstractions), and (3) we make a decision (by doing statistical analysis and inference; a guilty verdict is referred to as statistical significance). Basically, a successful research project should consist of these elements: (1) a good research question (with well-defined objectives and endpoints), (2) a thorough investigation (by experiments or surveys), (3) an efficient presentation of data (organizing, summarizing, and presenting data; an area called descriptive statistics), and (4) proper statistical inference. This book is a problems-based introduction to the last three elements; together they form a field called biostatistics. The coverage is rather brief on data collection, but very extensive on descriptive statistics (Chapters 1 and 2) and on methods of statistical inference (Chapters 4, 5, 6, 7, and 8). Chapter 3, on probability and probability models, serves as the link between the descriptive and inferential parts.

The author would like to express his sincere appreciation to colleagues and students for their feedback, and to the teaching assistants who helped with examples and exercises. Finally, my family bore patiently the pressures caused by my long-term commitment to writing books; to my wife and daughters, I am always most grateful.

CHAP T. LE
Edina, Minnesota

Introduction from the First Edition

We have taught introductory biostatistics courses for many years to students from various human health disciplines, and decided it’s time to write our own book. We as teachers are very familiar with students who enter such courses with an enormous sense of dread, based on a combination of low self-confidence in their mathematical ability and a perception that they would never need to use statistics in their future work, even if they did learn the subject. Students who enter health fields that emphasize caring and human contact, in particular, view statistics as the antithesis of interpersonal warmth. We are sympathetic to the plight of students taking statistics courses simply because it is a requirement, and consequently we have tried to write a friendly text. By friendly we mean a fairly thin book that emphasizes a few basic principles rather than a jumble of isolated facts. The perception that statistics is just a bunch of formulas and long columns of numbers is the main misunderstanding of the field. Statistics is a way of thinking, thinking about ways to gather and analyze data. The formulas are tools of statistics, just like stethoscopes are tools for doctors and wrenches are tools for auto mechanics. The first thing a good statistician does when faced with data is to learn how the data were collected; the last thing is to apply formulas and do calculations. It’s amazing how many times statistical formulas are misused by a well-intentioned researcher simply because the data weren’t collected correctly. Start to pay attention to the number of times the media will announce that studies have been done that lead to such-and-such a conclusion, but never tell you how the study was done or the data collected. Most of the time the reporters don’t even know that the methods of the study or the method of data collection is crucial to the study’s validity; their job is to seek out newsworthy findings and dramatize the implications.

Almost all of the formal statistical procedures performed fall into two categories: testing and estimation. This book covers the most commonly used, elementary procedures in both categories: the z-test, the t-test, the chi-square test, point estimation, confidence interval estimation, and correlation and regression. These are the nuts and bolts of elementary applied statistics. We also throw in some basic ideas of standardization of rates and graphical techniques that are easy to learn and incredibly useful. Exercises are added as an aid to learning how to use statistical procedures. Learning to use statistics takes practice. It’s not an easy subject, but it’s worth learning. Let us know how we can improve the book.

Recently, we met Carmen, a middle-aged woman. During social chit-chat, she told us that she is a nurse in a cancer ward. When we announced that we are statistics professors, she said she was planning on getting her master’s degree in nursing and that we might be interested in some advice she heard from other nurses who get master’s degrees. “If you get only one chance for a pass-fail option, use it on the stat course,” they counseled. They made it clear that the statistics course required of graduate students in nursing is really hard and that simply surviving it is the only practical goal.

We had no problem believing Carmen, nor did we question the sincerity of the students’ recommendation. Many times this sentiment is expressed: Statistics courses are experiences to be endured and even honest people are tempted to cheat to pass. Survival skills are in order: “Do anything, including neglecting the other courses for one quarter, to get it behind you. Use flash cards, buy other books, and make friends with nerds if you must, but get through the requirement.”

These feelings are expressed in many ways and in many different settings by those who have taken required statistics courses. In the social setting of a party, the revelation that we teach statistics usually provokes such comments from a course survivor or one who is dreading the experience. Indeed, the required statistics course may well be described as a hostage situation, with the students as prisoners and the teacher as torturer.

The drama of the required statistics course is re-enacted year after year throughout the country. Students from a wide variety of helping professions take statistics courses as they prepare for a career. This book is written especially for those who feel inadequate with statistics in any form but are forced by journal editors, government funding agencies, and degree requirements to deal with it. If you are one of those people, this book is for you.

Statistics is a hard subject; many people, however, realize that it is essential not only in science and government but also in the human services fields. As the federal budget problems continue, both the Pentagon and the Department of Health and Human Services cite statistics to help their cases. Applicants for increased government funding for worthy causes can no longer claim only their good intentions as the sole reason for a bigger chunk of the pie. They have to present numbers to show that funding their cause is a better bargain “for the people” than funding a competing worthy cause. This creates a demand for an objective way of thinking and an art of communication, called statistics.

The “way of thinking” called statistics has become important to all professionals who are not only scientific or business-like but also caring people who want to help and make the world a better place. Such people have big hearts, high ideals, and a concern for humanity. They are good people who work hard and expect only a modest income. They become teachers or nurses or clergy or social workers, the kind of people inclined to join the Peace Corps. They even do volunteer work in a wide variety of human services. Such people are the special audience of this book. Due to funding pressures and self-imposed requirements that their field be more research-based, those who seek higher degrees are forced to take statistics courses, read journal articles, and write reports using numbers and statistical formulas. They and their administrators must defend their work by counting and measuring, and defend against those who attack their counting and measuring methodology.

This book is written for helpers who are forced to deal with the world of statistics but feel inadequate because of low mathematical ability, lack of self-confidence, and the perception that they will never need to use statistical concepts and techniques in their future work.

WHAT IS STATISTICS?

There are several popular public definitions and perceptions of statistics. We see “vital statistics” in the newspaper, announcements of life events such as births, marriages, and deaths. Motorists are warned to drive carefully, to avoid “becoming a statistic.” The public use of the word is widely varied, most often indicating lists of numbers or data. We have also heard people use the word data to describe a verbal report, a believable anecdote. Statisticians define statistics to be summarizing numbers, like averages; for example, the average age of all mothers on AFDC is a statistic because it is a (partial) summary of a list of numbers too long and too varied to describe individually. The average is a useful partial summary and thus a statistic.

For this book and its readers, we don’t emphasize the definition of “statistics as things,” but offer instead an active concept of “doing statistics.” The doing of statistics is a way of thinking about numbers, with emphasis on relating their interpretation and meaning to the manner in which they are collected. The relation of method of collection to technique of analysis, by the way, is absolutely central to the understanding of statistical thought. Anyone who claims facility with statistical formulas but is oblivious to the method of collection of the data to be analyzed must not portray him/herself as a statistician. Failing to see the important relationship of collection with analysis is the root cause of mindless throwing of statistical formulas at data.

Our working definition of statistics, as an activity, is that it is a way of thinking about data, about their collection, about their analysis, and about their presentation to an audience. Formulas are only a part of that thinking, simply tools of the trade.

To illustrate statistics as a way of thinking, we notice that our local radio station has a practice of conducting radio polls and announcing their results during the morning rush hour. One question was whether an old baseball player (who was personally popular among the locals and had a high batting average) should be recruited to play for the Minnesota Twins again. Of the people who called in their answer, 74% said “yes” and 26% said “no.” As people well groomed in statistical thinking, we wondered whether the 74% meant that 74% of the people in the city want him back or 74% of baseball fans want him back or what? In trying to figure out whether the 74% meant anything, we listened to the method by which the data were collected. The announcer said that only people who had touch-tone phones could vote because only such phones would work for their voting system. So now the poll was limited to people with touch-tone phones, who aren’t too busy at 8:30 a.m. Of those who had the phones and the time, what kind of people cared enough to vote? Thinking through this maze of selections is doing statistics. All this kind of thinking must be done before applying any statistical formula makes sense. There are many people who don’t know statistical formulas but naturally think statistically. Someone who throws statistical formulas at data but doesn’t appreciate the subtleties of data collection is dangerous.

WHAT CAN STATISTICS DO?

There is a broad spectrum of beliefs among nonstatisticians about what the field of statistics can actually do. At one extreme are the cynical disbelievers who think the only contribution of statisticians was the discovery that, at cocktail parties, 2% of the people eat 90% of the nuts. At the other extreme are the blind believers who envision statistics as a crime lab going over the murder room with a fine-tooth comb. They also think of statisticians as archaeologists digging gingerly into mounds of data, extracting from them every bit of truth.

WHY IS STATISTICS SO HARD TO LEARN?

Statistical formulas commonly result in numbers that have no intuitive meaning. They do not relate to any ordinary experience. At the end of the calculations for a t-test, the number one is considered very small but the number three is very large. This defies intuition. The person performing a t-test looks in t-tables at the back of a statistics book to find that the probability associated with 1 is .32. Next, the reader of the table is told that .32 is too large to be called significant. It is not significant. You might well think, in some ordinary sense, that .32 is not a significant number, but because it is too small. The t-table tells you it’s too large to be significant. If, after calculating the t-test, you get the “large” number 3 and then look in the t-table to find the probability associated with it is .005, you are told that .005 is small enough to be highly significant. What is this, anyway? How does a 1 get you a .32 while a 3 leads to a .005? What’s going on when little numbers like .005 are highly significant? Any logical person thinks big numbers are significant, not small ones. This turning upside down of the meanings assigned to small numbers and large ones is a major adjustment for the student of statistics. Anyone who gets confused converting inches to centimeters or Celsius to Fahrenheit is in for hard work learning to use statistical formulas.
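
For readers who would rather compute than look up a table, the following is a minimal sketch of where numbers like .32 and .005 come from. It assumes a two-sided test and 30 degrees of freedom; the text does not state the degrees of freedom, so that value is an illustrative choice, and the function names are hypothetical. The probability is the area in the two tails of the t distribution beyond the calculated statistic, which is why a larger statistic gives a smaller probability.

```python
import math


def t_density(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-sided tail probability P(|T| >= |t|), using Simpson's rule on [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_density(i * h, df)
    central_half = total * h / 3  # area under the curve between 0 and |t|
    return 1 - 2 * central_half  # what is left over in the two tails


DF = 30  # assumed degrees of freedom; not given in the text
for t in (1.0, 3.0):
    print(f"t = {t:.0f}: two-sided p = {two_sided_p(t, DF):.3f}")
```

Under these assumptions the two probabilities come out close to the .32 and .005 quoted above. The same lookup can be done in a spreadsheet or statistical package; the hand-rolled integration is used here only to keep the sketch free of extra libraries.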

One class of volunteers who want to learn statistics are the graduate students seeking a master’s or doctoral degree in biostatistics, a specialization of statistics. These students study full-time for at least a year before starting to get an integrated notion of statistics, and many of them come to our program already having earned a bachelor’s degree in mathematics.

The nonvolunteers have an even worse time. The ones we see are the graduate students in fields such as nursing or environmental sanitation who are required to take one or two statistics courses for their master’s or doctoral degrees. When we teach them, we do what we can to minimize their frustration with the material. Some of them seem to learn the formulas fairly easily, but few of them can get a good handle on the concepts. A good number of those students spend an inordinate amount of time on the course, neglecting their other courses, to pass statistics and get beyond the hurdle. Several students try hard to find a statistics course taught by an easy grader. They dread the statistics requirement.

REASONS FOR THE DIFFICULTY

There are two kinds of reasons for statistics being a difficult subject to learn: intellectual and emotional.

Intellectual Reasons

There are concepts in statistics that are difficult to understand. For one thing, although statistics is definitely not the same as mathematics, it uses mathematics. Anyone who has trouble with arithmetic and algebra has trouble with statistics, and there are lots of people with excellent communication skills (reading, writing, and speaking) who have trouble with arithmetic and algebra. The Graduate Record Exam, as well as other standard examinations such as the Scholastic Aptitude Test taken by high school students, has the separate categories of verbal and quantitative; they are different. Knowledge of statistics at the level of a master’s degree requires a good knowledge of calculus and some facility with a topic called “matrices” or linear algebra. A doctoral degree requires, in addition to the knowledge of statistics, advanced calculus and something called “measure theory” as a basis for understanding probability. In other words, getting deeply into statistical theory requires a knowledge of mathematics that is attained by very few people. The mathematics of advanced statistics is approximately at the level of the mathematics of theoretical physics.

In addition to the mathematics of statistics, there are other difficult concepts. The main difficulty is in visualizing the distribution of numbers you cannot see. Many students have a hard time getting straight the sampling distribution of the sample means when the end point of the study is only one sample mean. That is, they have to be able to visualize infinitely many numbers, of which they see only one, and think about what the other numbers might have been.
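
One way to make those unseen numbers visible is to simulate them. The sketch below is an illustration only: the population mean, standard deviation, sample size, and number of repetitions are hypothetical values, and the population is assumed to be normal. It runs one imaginary study to get the single sample mean a researcher would actually see, then repeats the same study many times to show what the other sample means might have been.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 120.0, 15.0  # hypothetical population values
SAMPLE_SIZE = 25                # hypothetical number of subjects per study
N_STUDIES = 5000                # imagined repetitions of the same study


def one_sample_mean() -> float:
    """Run one imaginary study and return its sample mean."""
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    return statistics.mean(sample)


observed = one_sample_mean()  # the single number a real study would report
other_means = [one_sample_mean() for _ in range(N_STUDIES)]

spread = statistics.stdev(other_means)  # spread of the sampling distribution
theory = POP_SD / SAMPLE_SIZE ** 0.5    # sigma / sqrt(n), the standard error

print(f"the one mean we observed: {observed:.1f}")
print(f"spread of {N_STUDIES} repeated means: {spread:.2f}")
print(f"sigma / sqrt(n): {theory:.2f}")
```

The repeated means cluster around the population mean with a spread close to the population standard deviation divided by the square root of the sample size, far narrower than the spread of individual observations. In a real study only the first number is ever observed; the rest have to be imagined.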

Emotional Reasons

There are aspects other than statistics being intellectually difficult that act as barriers to learning. For one thing, statistics does not benefit from a glamorous image that motivates students to persist through tedious and frustrating lessons. A premedical or prelaw student is commonly sustained through long discouraging times in school by dreams of wealth and high social status, heroism in the not-too-distant future. However, there are no TV dramas with a good-looking statistician playing the lead, and few mothers’ chests swell with pride as they introduce their son or daughter as “the statistician.”

The public images of statisticians leave much to be desired as sources of recruitment. How to Lie with Statistics (by Darrell Huff) is the statistics book whose title is most often quoted by nonstatisticians. One image of statisticians is that of sports nuts who are fascinated by numerical trivia of games. Another image is of the librarian, the solemn keeper of dry details, of “more than you’d ever want to know about . . .” Yet another is the role of the manipulator of numbers, the crook who can make numbers say anything he or she wants them to. Presidential campaigns in which both incumbent and challenger cite “statistics” showing why they should be elected don’t give statistics a good name.

Have you ever heard of a child who answered the question, “What are you going to be when you grow up?” with “I’m going to be a statistician!”?

Another emotional barrier to the learning of statistics is the one related to the difficulty of learning mathematics. The intellectual difficulty of learning mathematics not infrequently creates a phobia currently labeled “math anxiety,” a feeling of inadequacy in doing anything mathematical. People who dread having to do mathematical work are particularly uncomfortable in the “required” statistics course. We have had many students in our elementary courses who contacted us early in the quarter to tell us how nervous they were about passing the course. Many of them tell their history of math anxiety. Some describe their childhood math teachers as particularly stern and unforgiving.

One reason for math anxiety, leading to stat anxiety, is the fact that mathematics is an especially unforgiving field. The typical student of arithmetic or elementary algebra doesn’t see the beauty and artistry of higher mathematics. Such students see that there is only one correct answer and that no credit is given for approximately correct, or not exactly right but useful. Two plus two equals four, not 4.001. It’s either all right or all wrong. In most other subjects, there is a little give: answers are not so “right” or “wrong.”

People who are very facile with numbers can easily intimidate the innumerate. The expert who spews forth a barrage of numerical facts is one up; numerical facts seem more precise and more correct, and the speaker of the facts thus seems more in charge. Robert McNamara, at one time the president of Ford Motor and later the Secretary of Defense under Lyndon Johnson, used to explain the Vietnam War on live television. His command of facts and the confidence with which he spewed forth a barrage of war data gave the impression that he was on top of every detail. He was very impressive, due partly to his command of numbers.

THE THREE LEVELS OF STATISTICAL KNOWLEDGE

Level One

The first level of knowledge is that of familiarity with some of the statistical formulas. These formulas, like the formulas for the t-test or the chi-square test, are gadgets. They are analogous to the stethoscopes used by physicians, wrenches used by auto mechanics, or desk calculators used by accountants. The formulas of statistics are as important to the field of statistics as the stethoscope, wrench, and desk calculator are for the aforementioned occupations. It’s appropriate and natural when learning a new field to play with and get used to its gadgets. It’s also a good way to see if you have a chance of liking the field and could be happy working in it. If you can’t use a stethoscope, don’t consider becoming a physician. If you can’t use a wrench, cancel any plans to be an auto mechanic.

The formulas of statistics are qualitatively different from stethoscopes and wrenches. To use a stethoscope requires the ability to find the key spots on the human body, a good ear for subtle sounds, and willingness to tolerate the discomfort of the little black knobs that stick in your ears. Using a wrench requires a good sense of selecting the appropriate size and type of wrench, sufficient arm and shoulder strength to loosen tight nuts, and a light enough touch to avoid ruining a nut. Using statistical formulas requires good facility with algebra at the “college algebra” level, real skill and comfort with mathematical calculations, and a low rate of mistakes in arithmetic.

The skill in using the gadgets of a profession is obviously necessary for its practice. A candidate for the profession is well advised to test the waters by trying to attain at least minimal skill with its gadgets. After attaining minimal skill, the candidate can go on to learn to use them in context.

Level Two

The second level of knowledge of statistics is that of knowing how and when to use what gadgets for standard problems. Just as every physician must know what to do for a patient who is in every way healthy but has cracked a rib, or every auto mechanic knows how to install a new gas tank, the Level Two statistician knows how to analyze the data from a well-designed household survey when the households are selected using a random number table and every household has an adult respondent at home when the interviewer arrives. The Level Two statistician also knows how to work with a cooperative laboratory researcher who wants to design an experiment to allocate different chemotherapy doses to mice with cancer.

Level Three

The Level Three statistician is one who is perfectly familiar with the formulas and can handle difficult messy problems. A Level Three statistician can assess the possibility of making sense out of large data sets with many missing observations and chaotic methods of collection.

Setting Your Own Goal Level

The skill essential to learning statistical formulas is that of working with algebra. In other words, you need to be fairly good with abstract symbols and arithmetic. Being solid in the four basic operations of arithmetic (addition, subtraction, multiplication, and division) and taking square roots is absolutely essential for mastery of statistical formulas. Also, being good at looking up numbers in tables is required. There are inexpensive pocket calculators that will do the calculations of statistical formulas, but anyone shaky at arithmetic is in deep statistical water even with calculators. Without the ability to do the calculations by hand, there is great danger of pushing the wrong buttons or pushing the right buttons in the wrong order and not realizing that something is wrong. Another skill essential to the use of statistical formulas is a sense of magnitude, to know when the calculations result in a number that is way too large or too small, or negative when it should be positive. People who confuse debits with credits in their checkbooks have a very hard time with statistical formulas. If you are shaky at arithmetic, don’t expect success with even Level One knowledge of statistics.

If you are good at arithmetic and algebra, you should be ready to learn statistical formulas. You would then be ready to follow the instructions of formulas (which you may think of as recipes) and thus correctly perform the calculations.

Learning Level Two requires an additional talent, that of understanding analogies. The ability to see that choosing households for a survey is mathematically equivalent to pulling names out of a hat is essential. Level Two also requires a good memory to keep in mind the many situations and where to find the formulas to fit them.

Level Three knowledge of statistics requires, in addition to mastery of the first two levels, the ability and nerve to deal with problems for which there is no standard solution. Working at Level Three requires either finding new solutions or using an old technique that works imperfectly but will do the job. It requires cleverness and adaptability.

Time Required for Learning Statistics

For someone good at arithmetic and the symbol manipulations used in algebra, a few of the most popular formulas can be learned in a one-quarter or one-semester elementary course. In a course lasting one academic year, you can cover quite a few of the well-used formulas; you can also start to learn some of the concepts of statistics, although that is a much harder task than learning the formulas.

Learning Level Two takes two years of full-time study, assuming that you have had a year of calculus (with at least a B grade) and are an advanced undergraduate or graduate student. Level Two is essentially a master’s degree in statistics, and attaining it is a major effort. Level Three is acquired only after a good four years of full-time effort in the field of statistics. It can be obtained by a master’s degree plus two years of full-time experience. Some statisticians attain Level Three by the time they graduate with a doctorate in statistics. Attaining a very advanced Level Three requires natural statistical talent and many years of experience.

DEALING WITH FORMULAS

People who tend to grimace and flinch at the thought of statistics tend to be most repelled by statistical formulas. We have seen a number of them gingerly open a new statistics book, whose cover beckons the reader with come-ons such as “statistics made easy” or “statistics for the layman.” The Introduction assures the reader that he or she need have no background in anything whatsoever and should simply relax and read on. The scarred veteran of previous statistics books doesn’t believe it, however, and flips through the book to check for the presence of “formulas.” Upon finding them, like bones in a fish, the reader snaps the book shut, defeated once again.

Statistical formulas are frightening to anyone suffering from math anxiety. They remind one of the worst days of the old algebra classes. Some even have Greek letters in them. This book is intended to be light on formulas, but we want to walk the reader through one of the most common ones, just to show how statisticians think about formulas.

There are a couple of well-worn formulas that are used frequently even by nonstatisticians. They are the formulas for the two-sample t-test and the Chi-square test, the latter represented symbolically by the Greek letter “chi” with a 2 in the upper right corner: χ². By purely arbitrary choice, we’ll introduce the former, the two-sample t-test.

A statistics text will typically present the two-sample t-test as follows: To test

versus

At the α-level, the process is to form

where

Then to “reject” H0 if and only if t exceeds a certain cut-point obtained from some table (at the back of the book) or with the help of some computer software.
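As a sketch of what such a presentation typically contains, assuming the common pooled-variance form of the test with a two-sided alternative (the exact notation here is an assumption, not necessarily that of any particular text), the hypotheses, the statistic, and its “where” clause would read:

```latex
\[
H_0\colon \mu_1 = \mu_2
\quad\text{versus}\quad
H_A\colon \mu_1 \neq \mu_2
\]
\[
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
\qquad\text{where}\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}
\]
```

In this sketch, the x-bars are the two sample means, s1 and s2 the sample standard deviations, n1 and n2 the sample sizes, and s_p a pooled estimate of a standard deviation σ assumed common to both populations. With a two-sided alternative, the comparison with the cut-point is usually made using the absolute value of t.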

How are you doing? Had enough? That formula, one of the two most used, is full of frightening symbols and jargon. It contains the Greek letters α, μ, and σ. It has subscripts and indices, and may include an absolute value symbol.

A statistician can dive right in, plugging in numbers for the xs, and get the job done. To the uninitiated, however, the formula is overwhelming. Students in statistics courses need days, sometimes weeks, to get used to it and to working with it. Many keep forgetting what α, μ, and σ are. It seems very artificial, without intuitive appeal. It’s intuitively natural to subtract one sample mean from another, but the whole denominator seems weird; it doesn’t make any intuitive sense. Students have a hard time keeping standard deviation separate from standard error; they forget to take square roots and put the ns in the wrong place. When they should be getting a +3, they’re getting a –3, and have no instinct that the –3 is wrong. We don’t blame them; it is difficult and confusing.
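To make those stumbling blocks concrete, here is a minimal sketch in Python of the calculation a statistician carries out, assuming the pooled-variance (equal-variance) version of the two-sample t-test. The function name `two_sample_t` and the two small data sets are made up for illustration; they do not come from any study.

```python
import math
from statistics import mean, stdev


def two_sample_t(x1, x2):
    """Pooled-variance two-sample t statistic (equal variances assumed)."""
    n1, n2 = len(x1), len(x2)

    # Numerator: the intuitive part, one sample mean minus the other.
    diff = mean(x1) - mean(x2)

    # Standard deviations: the spread of individual values in each sample.
    s1, s2 = stdev(x1), stdev(x2)

    # Pooled variance: a weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

    # Standard error of the difference in means: this is where the
    # square root is taken and where the ns belong.
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

    return diff / std_error


# Hypothetical measurements, for illustration only.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 4.8, 4.4, 5.0, 4.6]

t_ab = two_sample_t(group_a, group_b)
t_ba = two_sample_t(group_b, group_a)
print(round(t_ab, 2), round(t_ba, 2))
```

Swapping the order of the two groups changes only the sign of the result, which is one way a +3 turns into a –3. The statistic would still have to be compared with a cut-point from a table or from software, a step this sketch leaves out.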

Using statistical formulas is like cooking from recipes or putting together lawn furniture from a set of cut-away drawings. Just as the recipe is a kind of shorthand that could be written in prose form, and the drawings for the lawn furniture could be eliminated by substituting a few paragraphs of English, a statistical formula could be expressed in words. A statistician reading the formula does just that, translating the formula into English, seeing it as a set of directions as to what to do with two sets of numbers. It is no surprise, of course, that statisticians write the statistics books in their own language, using formulas instead of English prose, because they themselves are so comfortable reading formulas. Formulas, although similar in principle to recipes, are more abstract. A recipe that says “Break two eggs into a cup of milk” is referring to objects and actions more common than μ, the population mean. Eggs, cups, and milk can be felt and tasted; α, μ, and σ are abstract concepts. Most people are so used to the small counting numbers, that is, 1, 2, 3, ..., that they forget how abstract numbers themselves are. Even the number “1” is a figment of the imagination; it cannot be touched or tasted. It has no volume or weight and is not on display in some museum. It is a property that an apple, a truck, and an ocean have in common, namely their oneness.

In summary, formulas are fine for those very comfortable with algebra and good at following a long list of complicated instructions. If you’re weak in either of those areas, don’t expect statistical formulas to be easy. But the computer can sure help. And that is why we introduce Excel.

CHAPTER 1

Proportions, Rates, and Ratios

Most introductory textbooks in statistics and biostatistics start with methods for organizing, summarizing, and presenting continuous data: numbers measured on a continuous scale, such as measurements of height, weight, blood pressure, or cholesterol level. We have decided, however, to adopt a different starting point because our focus is research in the biomedical sciences, where health decisions are more frequently based on proportions, ratios, or rates. In addition, it is easier to learn methods for categorical data and, therefore, to build knowledge and confidence. In this first chapter we will see how the concepts of proportion, rate, and ratio appeal to common sense, and learn their meaning and uses. You will be able to apply what you learn quickly to tackle real-life data, concepts, and applications (such as case–control studies and measures of morbidity and mortality) before moving on to more complicated methods for continuous data in Chapter 2.
