105,99 €
Utilizing the latest software, this book presents the essential statistical procedures for drawing valuable results from data in the social sciences. Mobilizing interesting real-world examples from the field of education, Understanding Educational Statistics Using Microsoft Excel and SPSS supplies a seamless presentation that identifies valuable connections between statistical applications and research design. Class-tested to ensure an accessible presentation, the book combines clear, step-by-step explanations and the use of software packages that are accessible to both the novice and professional alike to present the fundamental statistical practices for organizing, understanding, and drawing conclusions from educational research data. The book begines with an introduction to descriptive and inferential statistics and then proceeds to acquaint readers with the various functions for working with quantitative data in the Microsoft Excel environment, such as spreadsheet navigation; sorting and filtering; and creating pivot tables. Subsequent chapters treat the procedures that are commonly-employed when working with data across various fields of social science research, including: * Single-sample tests * Repeated measure tests * Independent t-tests * One way ANOVA and factorial ANOVA * Correlation * Bivariate regression * Chi square * Multiple regression Individual chapters are devoted to specific procedures, each ending with a lab exercise that highlights the importance of that procedure by posing a research question, examining the question through its application in Excel and SPSS, and concluding with a brief research report that outlines key findings drawn from the results. Real-world examples and data from modern educational research are used throughout the book, and a related Web site features additional data sets, examples, and labs, allowing readers to reinforce their comprehension of the material. Bridging traditional statistical topics with the latest software and applications in the field of education, Understanding Educational Statistics Using Microsoft Excel and SPSS is an excellent book for courses on educational research methods and introductory statistics in the social sciences at the upper-undergraduate and graduate levels. It also serves as a valuable resource for researchers and practitioners in the fields of education, psychology, and the social sciences who require a statistical background to work with data in their everyday work.
Sie lesen das E-Book in den Legimi-Apps auf:
Seitenzahl: 737
Veröffentlichungsjahr: 2014
CONTENTS
Preface
Acknowledgments
1 Introduction
Approach of the Book
Project Labs
Real-World Data
Research Design
“Practical Significance”—Implications of Findings
Coverage of Statistical Procedures
2 Getting Acquainted with Microsoft Excel®
Data Management
The Excel® Menus
3 Using Statistics in Excel®
Using Statistical Functions
Entering Formulas Directly
Data Analysis Procedures
Missing Values and “0” Values in Excel® Analyses
Using Excel® with Real Data
4 SPSS® Basics
Using SPSS®
General Features
Management Functions
Additional Management Functions
Analysis Functions
5 Descriptive Statistics—Central Tendency
Research Applications—Spuriousness
Descriptive and Inferential Statistics
The Nature of Data—Scales of Measurement
Descriptive Statistics—Central Tendency
Using Excel® and SPSS® to Understand Central Tendency
Distributions
Describing the Normal Distribution
Descriptive Statistics—Using Graphical Methods
Terms and Concepts
Real-World Lab I: Central Tendency
Real-World Lab I: Solutions
6 Descriptive Statistics—Variablity
Range
Percentile
Scores Based on Percentiles
Using Excel® and SPSS® to Identify Percentiles
Note
Standard Deviation and Variance
Calculating the Variance and Standard Deviation
The Computation Method
Terms and Concepts
Real-World Lab II: Variability
Real-World Lab II: Solutions
7 The Normal Distribution
The Nature of the Normal Curve
The Standard Normal Score: z Score
The z-Score Table of Values
Navigating the z-Score Distribution
Calculating z Scores
Working with Raw Score Distributions
Using Excel® to Create z Scores and Cumulative Proportions
Using SPSS® to Create z Scores
Terms and Concepts
Real-World Lab III: The Normal Curve and z Scores
Real-World Lab III: Solutions
8 The Z Distribution and Probability
Transforming a z Score to a Raw Score
Transforming Cumulative Proportions to z Scores
Deriving Sample Scores from Cumulative Percentages
Additional Transformations Using the Standard Normal Distribution
Using Excel® and SPSS® to Transform Scores
Probability
“Exact” Probability
From Sample Values to Sample Distributions
Terms and Concepts
Real-World Lab IV
Real-World Lab IV Solutions
9 The Nature of Research Design and Inferential Statistics
Research Design
Types of Research Designs
The Nature of Research Design
Inferential Statistics
Z Test
Terms and Concepts
Real-World Lab V
Real-World Lab V: Solutions
10 The T Test for Single Samples
Z Versus T: Making Accommodations
Research Design
Parameter Estimation
T Test for a Single Mean
Type I and Type II Errors
Effect Size
Power, Effect Size, and Beta
One- and Two-Tailed Tests
A Note About Power
Point and Interval Estimates
The Value of Confidence Intervals
Using Excel® and SPSS® with the Single-Sample T Test
Terms and Concepts
Real-World Lab VI: Single-Sample T Test
Real-World Lab VI: Solutions
11 Independent-Samples T Test
A Lot of “T’s”
Research Design
Independent T Test: The Procedure
Independent T-Test Example
Before–After Convention with the Independent T Test
Confidence Intervals for the Independent T Test
Effect Size
Equal and Unequal Sample Sizes
The Assumptions for the Independent-Samples T Test
Using Excel® and SPSS® with the Independent-Samples T Test
Parting Comments
Nonparametric Statistics
Terms and Concepts
Real-World Lab VII: Independent T Test
Real-World Lab VII: Solutions
12 Analysis of Variance
A Hypothetical Example of ANOVA
The Nature of ANOVA
The Components of Variance
The Process of ANOVA
Calculating ANOVA
Effect Size
Post Hoc Analyses
Assumptions of ANOVA
Additional Considerations with ANOVA
A Real-World Example of ANOVA
Are the Assumptions Met?
Hand Calculations
Using Excel® and SPSS® with One-Way ANOVA
The Need for Diagnostics
Nonparametric ANOVA Tests
Terms and Concepts
Real-World Lab VIII: ANOVA
Real-World Lab VIII: Solutions
13 Factorial Anova
Extensions of ANOVA
Multivariate ANOVA Procedures
Factorial ANOVA
The Example DataSet
Calculating Factorial ANOVA
Using SPSS® to Analyze 2 × ANOVA
Summary Chart for 2 × ANOVA Procedures
Terms and Concepts
Real-World Lab IX: 2 × ANOVA
Real-World Lab IX: 2 × ANOVA Solutions
14 Correlation
The Nature of Correlation
The Correlation Design
Pearson’s Correlation Coefficient
Plotting the Correlation: The Scattergram
Creating the Scattergram
Calculating Pearson’s r
The Z-Score Method
The Computation Method
Evaluating Pearson’s r
Correlation Problems
The Example Database
Correlation Using Excel® and SPSS®
Nonparametric Statistics: Spearman’s Rank-Order Correlation (rs)
Terms and Concepts
Real-World Lab X: Correlation
Real-World Lab X: Solutions
15 Bivariate Regression
The Nature of Regression
The Regression Line
Calculating Regression
Effect Size of Regression
The Z-Score Formula for Regression
Testing the Regression Hypotheses
The Standard Error of Estimate
Confidence Interval
Explaining Variance through Regression
Using Scattergrams to Understand the Partitioning of Variance
A Numerical Example of Partitioning the Variation
Using Excel® and SPSS® with Bivariate Regression
Assumptions of Bivariate Linear Regression
Curvilinear Relationships
Detecting Problems in Bivariate Linear Regression
A Real-World Example of Bivariate Linear Regression
Advanced Regression Procedures
Terms and Concepts
Real-World Lab XI: Bivariate Linear Regression
Real-World Lab XI: Solutions
16 Introduction to Multiple Linear Regression
The Elements of MLR
Some Differences Between Bivariate Regression and MLR
Stuff Not Covered
MLR Extended Example
Are the Assumptions Met?
The Findings
The SPSS® Findings
Terms and Concepts
Real-World Lab XII: Multiple Linear Regression
Real-World Lab XII: MLR Solutions
17 Chi Square and Contingency Table Analysis
Contingency Tables
The Chi Square Procedure and Research Design
Chi Square Designs
Goodness of Fit
The Chi Square Test of Independence
A Fictitious Example—Goodness of Fit
Effect Size—Goodness of Fit
Chi Square Test of Independence
A Fictitious Example—Test of Independence
Special 2 × 2 Chi Square
Cramer’s V: Effect Size for the Chi Square Test of Independence
Repeated Measures Chi Square
Using Excel® and SPSS® with Chi Square
Using Excel® for Chi Square Analyses
Using SPSS® for the Chi Square Test of Independence
Terms and Concepts
Real-World Lab XIII: Chi Square
Real-World Lab XIII: Solutions
18 Repeated Measures Procedures: Tdep and ANOVAws
Independent and Dependent Samples in Research Designs
Using Different T Tests
The Dependent T-Test Calculation: The Long Formula
The Dependent T-Test Calculation: The Difference Formula
Tdep and Power
Using Excel® and SPSS® to Conduct the Tdep Analysis
Within-Subjects ANOVA (ANOVAws)
Within-Subjects Example
Using SPSS® for Within-Subjects Data
The SPSS® Procedure
The SPSS® Output
Nonparametric Statistics
Terms and Concepts
References
Appendix: Statistical Tables
Index
Copyright © 2011 by John Wiley & Sons, Inc. All rights reserved
Published by John Wiley & Sons, Inc., Hoboken, New JerseyPublished simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
ISBN: 978-0-470-88945-9
10 987654321
To those who seek a deeper understanding of the world as it appears and of whatlies beyond.
PREFACE
I have written this book many times in my head over the years! As I conducted research and taught statistics (graduate and undergraduate) in many fields, I developed an approach to helping students understand the difficult concepts in a new way. I find that the great majority of students are visual learners, so I developed diagrams and figures over the years that help create a conceptual picture of the statistical procedures that are often problematic to students (like sampling distributions!).
The other reason I wanted to write this book was to give students a way to understand statistical computing without having to rely on comprehensive and expensive statistical software programs. Because most students have access to Microsoft Excel®,1 I developed a step-by-step approach to using the powerful statistical procedures in Excel® to analyze data and conduct research in each of the statistical topics I cover in the book.
I also wanted to make those comprehensive statistical programs more approachable to statistics students, so I have also included a hands-on guide to SPSS® in parallel with the Excel® examples. In some cases, SPSS® has the only means to perform some statistical procedures; but in most cases, both Excel® and SPSS® can be used.
Last, like my other work dealing with applied statistical topics (Abbott, 2010), I included real-world data in this book as examples for the procedures I discuss. I introduce extended examples in each chapter that use these real-world datasets, and I conclude the chapters with a Real-World Lab in which I present data for students to use with Excel® and SPSS®. Each Lab is followed by the Real World Lab: Solutions section so that students can examine their work in greater depth.
One limitation to teaching statistics through Excel® is that the data analysis features are different, depending on whether the user is a Mac user or a PC user. I am using the PC version, which features a Data Analysis suite of statistical tools. This feature may no longer be included in the Mac version of Excel® you are using.
I am posting the datasets for the real-world labs at the Wiley Publisher ftp site. You can access these datasets there to complete the labs instead of entering the data from the tables in the chapters. You may note some slight discrepancies in the results if you enter the data by hand rather than downloading the data due to rounding of values. The data in the chapters are typically reported to two decimal places, whereas the analyses reported in the Labs are based on the actual data that both Excel® and SPSS® carry to many decimal places even though you may only see a value with two decimal places. Despite any slight differences resulting from rounding, the primary findings should not change. You may encounter these types of discrepancies in your research with real data as you move data from program to program to page.
The John Wiley & Sons Publisher ftp address is as follows:
ftp://ftp.wiley.com/public/sci_tech_med/educational_statistics. You may also want to visit my personal website at the following address:
http://myhome.spu.edu/mabbott/.
MARTIN LEE ABBOTT
Seattle, Washington
1 Excel® references and screen shots in this book are used with permission from Microsoft.
ACKNOWLEDGMENTS
I would like to thank everyone who reviewed this manuscript. In particular, Nyaradzo Mvududu’s thorough critique was invaluable throughout the process. Adrianna Bagnail reviewed the manuscript and provided help in a great many other ways, especially with the tables. Dominic Williamson’s outstanding work on the figures and graphic design was a critical feature of my approach to conceptual understanding of complex processes. I am especially grateful for his design of the image on the book cover. Kristin Hovaguimian again provided outstanding support for the Index—not an easy task with a book of this nature. My graduate students in Industrial/Organizational Psychology were kind to review the Factorial ANOVA chapter (Chapter 13).
I also want to thank Duane Baker (The BERC Group, Inc.) and Liz Cunningham (T.E.S.T., Inc.) for approval to use their data in this book as they did for my former work (Abbott, 2010). Using real-world data of this nature will be very helpful to readers in their efforts to understand statistical processes.
I especially want to recognize Jacqueline Palmieri and Stephen Quigley at John Wiley & Sons, Inc. for their continuing encouragement. They have been steadfast in their support of this approach to statistical analysis from the beginning of our work together.
MARTIN LEE ABBOTT
Many students and researchers are intimidated by statistical procedures. This may in part be due to a fear of math, problematic math teachers in earlier education, or the lack of exposure to a “discovery” method for understanding difficult procedures. Readers of this book should realize that they have the ability to succeed in understanding statistical processes.
This is an introduction to statistics using EXCEL® and SPSS® to make it more understandable. Ordinarily, the first course leads the student through the worlds of descriptive and inferential statistics by highlighting the formulas and sequential procedures that lead to statistical decision making. We will do all this in this book, but I place a good deal more attention on conceptual understanding. Thus, rather than memorizing a specific formula and using it in a specific way to solve a problem, I want to make sure the student first understands the nature of the problem, why a specific formula is needed, and how it will result in the appropriate information for decision making.
By using statistical software, we can place more attention on understanding how to interpret findings. Statistics courses taught in mathematics departments, and in some social science departments, often place primary emphases on the formulas/ processes themselves. In the extreme, this can limit the usefulness of the analyses to the practitioner. My approach encourages students to focus more on how to understand and make applications of the results of statistical analyses. EXCEL® and other statistical programs are much more efficient at performing the analyses; the key issue in my approach is how to interpret the results in the context of the research question.
Beginning with my first undergraduate course through teaching statistics with conventional textbooks, I have spent countless hours demonstrating how to conduct statistical tests by hand and teaching students to do likewise. This is not always a bad strategy; performing the analysis by hand can lead the student to understand how formulas treat data and yield valuable information. However, it is often the case that the student gravitates to memorizing the formula or the steps in an analysis. Again, there is nothing wrong with this approach as long as the student does not stop there. The outcome of the analysis is more important than memorizing the steps to the outcome. Examining the appropriate output derived from statistical software shifts the attention from the nuances of a formula to the wealth of information obtained by using it.
It is important to understand that I do indeed teach the student the nuances of formulas, understanding why, when, how, and under what conditions they are used. But in my experience, forcing the student to scrutinize statistical output files accomplishes this and teaches them the appropriate use and limitations of the information derived.
Students in my classes are always surprised (ecstatic) to realize they can use their textbooks, notes, and so on, on my exams. But they quickly find that, unless they really understand the principles and how they are applied and interpreted, an open book is not going to help them. Over time, they come to realize that the analyses and the outcomes of statistical procedures are simply the ingredients for what comes next: building solutions to research problems. Therefore, their role is more detective and constructor than number juggler.
This approach mirrors the recent national and international debate about math pedagogy. In my recent book, Winning the Math Wars (Abbott et al., 2010), my colleagues and I addressed these issues in great detail, suggesting that, while traditional ways of teaching math are useful and important, the emphases of reform approaches are not to be dismissed. Understanding and memorizing detail are crucial, but problem solving requires a different approach to learning.
Labs are a very important part of this course since they allow students to take charge of their learning. This is the “discovery learning” element I mentioned above. Understanding a statistical procedure in the confines of a classroom is necessary and helpful. However, learning that lasts is best accomplished by students directly engaging the processes with actual data and observing what patterns emerge in the findings that can be applied to real research problems.
In this course, we will have several occasions to complete Project Labs that pose research problems on actual data. Students take what they learn from the book material and conduct a statistical investigation using EXCEL® and SPSS®. Then, they have the opportunity to examine the results, write research summaries, and compare findings with the solutions presented at the end of the book.
These are labs not using data created for classroom use but instead using real-world data from actual research databases. Not only does this engage students in the learning process with specific statistical processes, but it presents real-world information in all its “grittiness.” Researchers know that they will discover knotty problems and unusual, sometimes idiosyncratic, information in their data. If students are not exposed to this real-world aspect of research, it will be confusing when they engage in actual research beyond the confines of the classroom.
The project labs also introduce students to two software approaches for solving statistical problems. These are quite different in many regards, as we will see in the following chapters. EXCEL® is widely accessible and provides a wealth of information to researchers about many statistical processes they encounter in actual research. SPSS® provides additional, advanced procedures that educational researchers utilize for more complex and extensive research questions. The project labs provide solutions in both formats so the student can learn the capabilities and approaches of each.
As I mentioned, I focus on using real-world data for many reasons. One reason is that students need to be grounded in approaches they can use with “gritty” data. I want to make sure that students leave the classroom prepared for encountering the little nuances that characterize every research project.
Another reason I use real-world data is to familiarize students with contemporary research questions in education. Classroom data often are contrived to make a certain point or show a specific procedure, which are both helpful. But I believe that it is important to draw the focus away from the procedure per se and understand how the procedure will help the researcher resolve a research question. The research questions are important. Policy reflects the available information on a research topic, to some extent, so it is important for students to be able to generate that information as well as to understand it. This is an “active” rather than “passive” learning approach to understanding statistics.
People who write statistics books have a dilemma with respect to research design. Typically, statistics and research design are taught separately in order for students to understand each in greater depth. The difficulty with this approach is that the student is left on their own to synthesize the information; this is often not done successfully.
Colleges and universities attempt to manage this problem differently. Some require statistics as a prerequisite for a research design course, or vice versa. Others attempt to synthesize the information into one course, which is difficult to do given the eventual complexity of both sets of information. Adding somewhat to the problem is the approach of multiple courses in both domains.
I do not offer a perfect solution to this dilemma. My approach focuses on an in-depth understanding of statistical procedures for actual research problems. What this means is that I cannot devote a great deal of attention in this book to research design apart from the statistical procedures that are an integral part of it. However, I try to address the problem in two ways.
First, wherever possible, I connect statistics with specific research designs. This provides an additional context in which students can focus on using statistics to answer research questions. The research question drives the decision about which statistical procedures to use; it also calls for discussion of appropriate design in which to use the statistical procedures. We will cover essential information about research design in order to show how these might be used.
Second, I am making available an online course in research design as part of this book. In addition to databases and other research resources, you can follow the web address in the Preface to gain access to the online course that you can take in tandem with reading this book or separately.
I emphasize “practical significance” (effect size) in this book as well as statistical significance. In many ways, this is a more comprehensive approach to uncertainty, since effect size is a measure of “impact” in the research evaluation. It is important to measure the likelihood of chance findings (statistical significance), but the extent of influence represented in the analyses affords the researcher another vantage point to determine the relationship among the research variables.
I call attention to problem solving as the important part of statistical analysis. It is tempting for students to focus so much on using statistical procedures to create meaningful results (a critical matter!) that they do not take the next steps in research. They stop after they use a formula and decide whether or not a finding is statistically significant. I strongly encourage students to think about the findings in the context and words of the research question. This is not an easy thing to do because the meaning of the results is not always cut and dried. It requires students to think beyond the formula.
Statisticians and practitioners have devised rules to help researchers with this dilemma by creating criteria for decision making. For example, squaring a correlation yields the “coefficient of determination,” which represents the amount of variance in one variable that is accounted for by the other variable. But the next question is, How much of the “accounted for variance” is meaningful?
Statisticians have suggested different ways of helping with this question. One such set of criteria determines that 0.01 (or 1% of the variance accounted for) is considered “small” while 0.05 (5% of variance) is “medium,” and so forth. (And, much to the dismay of many students, there are more than one set of these criteria.) But the material point is that these criteria do not apply equally to every research question.
If a research question is, “Does class size affect math achievement,” for example, and the results suggest that class size accounts for 1% of the variance in math achievement, many researchers might agree it is a small and perhaps even inconsequential impact. However, if a research question is, “Does drug X account for 1% of the variance in AIDS survival rates,” researchers might consider this to be much more consequential than “small”!
This is not to say that math achievement is any less important than AIDS survival rates (although that is another of those debatable questions researchers face), but the researcher must consider a range of factors in determining meaningfulness: the intractability of the research problem, the discovery of new dimensions of the research focus, whether or not the findings represent life and death, and so on.
I have found that students have the most difficult time with these matters. Using a formula to create numerical results is often much preferable to understanding what the results mean in the context of the research question. Students have been conditioned to stop after they get the right numerical answer. They typically do not get to the difficult work of what the right answer means because it isn’t always apparent.
The statistical applications we will discuss in this book are “workhorses.” This is an introductory treatment, so we need to spend time discussing the nature of statistics and basic procedures that allow you to use more sophisticated procedures. We will not be able to examine advanced procedures in much detail. I will provide some references for students who wish to continue their learning in these areas. It is hoped that, as you learn the capability of EXCEL® and SPSS®, you can explore more advanced procedures on your own, beyond the end of our discussions.
Some readers may have taken statistics coursework previously. If so, my hope is that they are able to enrich what they previously learned and develop a more nuanced understanding of how to address problems in educational research through the use of EXCEL® and SPSS®. But whether readers are new to the study or experienced practitioners, my hope is that statistics becomes meaningful as a way of examining problems and debunking prevailing assumptions in the field of education.
Often, well-intentioned people can, through ignorance of appropriate processes promote ideas in education that may not be true. Furthermore, policies might be offered that would have a negative impact even though the policy was not based on sound statistical analyses. Statistics are tools that can be misused and influenced by the value perspective of the wielder. However, policies are often generated in the absence of compelling research. Students need to become “research literate” in order to recognize when statistical processes should be used and when they are being used incorrectly.
Microsoft Excel® is a powerful application for education researchers and students studying educational statistics. Excel® worksheets can hold data for a variety of uses and therefore serve as a database. We will focus primarily on its use as a spreadsheet, however. This book discusses how students of statistics can use Excel® menus to create specific data management and statistical analysis functions.
I will use Microsoft® Office Excel® 2007 for all examples and illustrations in this book.1 Like other software, Excel® changes occasionally to improve performance and adapt to new standards. As I write, other versions are projected, however, most all of my examples use the common features of the application that are not likely to undergo radical changes in the near future.
I cannot hope to acquaint the reader with all the features of Excel® in this book. Our focus is therefore confined to the statistical analysis and related functions called into play when using the data analysis features. I will introduce some of the general features in this chapter and cover the statistical applications in more depth in the following chapters.
The opening spreadsheet presents the reader with a range of menu choices for entering and managing data. Like other spreadsheets, Excel® consists of rows and columns for entering and storing data of various kinds. Figure 2.1 shows the spreadsheet with its menus and navigation bars. I will cover much of the available spreadsheet capacity over the course of discussing our statistical topics in later chapters. Here are some basic features:
FIGURE 2.1 The initial Excel® spreadsheet.
Typically, rows represent cases in statistical analyses, and columns represent variables. According to the Microsoft Office® website, the spreadsheet can contain over one million rows and over 16,000 columns. We will not approach either of these limits; however, you should be aware of the capacity in the event you are downloading a large database from which you wish to select a portion of data. One practical feature to remember is that researchers typically use the first row of data to record variable names in each of the columns of data. Therefore, the total dataset contains (rows –1) cases, which takes this into account.
Figure 2.1 shows several “Sheet” tabs on the bottom of the spreadsheet. These are separate worksheets contained in the overall workbook spreadsheet. They can be used independently to store data, but typically the statistical user puts a dataset on one Sheet and then uses additional Sheets for related analyses. For example, as we will discuss in later chapters, each statistical procedure will generate a separate “output” Sheet. Thus, the original Sheet of data will not be modified or changed. The user can locate the separate statistical findings in separate Sheets. Each Sheet tab can be named by “right-clicking” on the Sheet. Additional Sheets can be created by clicking on the small icon to the right of “Sheet3” shown in Figure 2.1.
The main Excel® menus are located in a ribbon at the top of the spreadsheet beginning with “Home” and extending several choices to the right. I will comment on each of these briefly before we look more comprehensively at the statistical features.
The “Home” menu includes many options for formatting and structuring the entered data, including a font group, alignment group, cells group (for such features as insert/delete options), and other such features.
One set of sub-menus is particularly useful for the statistical user. These are listed in the “Number” category located in the ribbon at the bottom of the main set of menus. The default format of Number is typically “General” shown in the highlighted box (see Figure 2.1). If you select this drop-down menu, you will be presented with a series of possible formats for your data among which is one entitled “Number”—the second choice in the sub-menu. If you click this option, Excel® returns the data in the cell as a number with two decimal points.
When you double-click on the “Number” option, however, you can select from a larger sub-menu that allows you many choices for your data, as shown in Figure 2.2. (The additional choices for data formats are located in the “Category:” box located on the left side of this sub-menu.) We will primarily use this “Number” format since we are analyzing numerical data, but we may have occasion to use additional formats. You can use this sub-menu to create any number of decimal places by using the “Decimal places:” box. You can also specify different ways of handling negative numbers by selecting among the choices in the “Negative numbers:” box.
I will return to this menu many times over the course of our discussion. Primarily, we will use this menu to create the visual descriptions of our analyses (graphs and charts).
This menu is helpful for formatting functions and creating the desired “look and feel” of the spreadsheet.
FIGURE 2.2 The variety of cell formats available in the Number sub-menu.
The Formulas menu is a very important part of the statistical arsenal of Excel®. We will discuss specific functions as we get to them in the course of our study; for now, I will point out that the first section of this menu is the “Function Library” that contains a great many categories of functions (i.e., “Financial,” “Logical,” “Text,” etc.). Selecting any of these results in a sub-menu of choices for formulas specific to that category of use. There are at least two ways to create statistical formulas, which we will focus on in this book.
1. The “More Functions” Tab. This tab presents the user with additional categories of formulas, one of which is “Statistical.” As you can see when you select it, there are a great many choices for handling data. Essentially, these are embedded formulas for creating specific statistical output. For example, “AVERAGE” is one of the first formulas listed when you choose “More Functions” and then select “Statistical.” This formula returns the mean value of a set of selected data from the spreadsheet.
2. “Insert Functions” Tab. A second way to access statistical (and other) functions from the Function Library is using the “Insert Function” sub-menu that, when selected, presents the user with the screen shown in Figure 2.3.
Choosing this feature is the way to “import” the function to the spreadsheet. The screen in Figure 2.3 shows the “Insert Function” box I obtained from my computer. As you can see, there are a variety of ways to choose a desired function. The “Search for a function:” box allows the user to describe what they want to do with their data. When selected, the program will present several choices in the “Select a function:” box immediately below it, depending on which function you queried.
FIGURE 2.3 The “Insert Function” sub-menu of the “Function Library.”
The “Or select a category:” box lists the range of function categories available. The statistical category of functions will be shown if double-clicked (as shown in Figure 2.3). Accessing the list of statistical functions through this button will result in the same list of functions obtainable through the “More Functions” tab.
When you use the categories repeatedly, as we will use the “Statistical” category repeatedly, Excel® will show the functions last used in the “Select a function” box as shown in Figure 2.3.
This is the main menu for our discussion in this book. Through the sub-menu choices, the statistical student can access the data analysis procedures, sort and filter data in the spreadsheet, and provide a number of data management functions important for statistical analysis. Figure 2.4 shows the sub-menus of the Data menu.
The following are some of the more important sub-menus that I will explain in detail in subsequent chapters.
Sort and Filter. The Sort sub-menu allows the user to rearrange the data in the spreadsheet according to a specific interest or statistical procedure. For example, if you had a spreadsheet with three variables—Gender, Reading achievement, and Math achievement—you could use the “sort” key to arrange the values of the variables according to gender. Doing this would result in Excel® arranging the gender categories, “M” and “F,” in ascending or descending order (alphabetically, depending on whether you proceed from “A to Z” or from “Z to A”) with the values of the other variables linked to this new arrangement. Thus, a visual scan of the data would allow you to see how the achievement variables change as you proceed from male to female students. The following two figures show the results of this example. Figure 2.5 shows the unsorted variables.
FIGURE 2.4 The sub-menus of the Data menu.
As you can see from Figure 2.5, you cannot easily discern a pattern to the data, depending on whether males or females have better math and reading scores in this sample.2 Sorting the data according to the Gender variable may help to indicate relationships or patterns in the data that are not immediately apparent. Figure 2.6 shows the same three variables sorted according to gender (sorted “A to Z” resulting in the Female scores listed first).
Figure 2.6 shows the data arranged according to the categories of the Gender variable. Viewed in this way, you can detect some general patterns. It appears, generally, that female students performed much better on math and just a bit higher on reading than the male students. Of course, this small sample is not a good indicator of the overall relationship between gender and achievement. For example, the math scores for the last male in the dataset (“10”) and for the third female student (“24”) exert a great deal of influence in this small dataset; a much larger sample would not register as great an influence.
FIGURE 2.5 Unsorted data for the three-variable database.
An important operational note for sorting is to first “select” the entire database before you sort any of the data fields. If you do not sort the entire database, you can inadvertently only sort one variable, which may result in the values of this variable disengaging from its associated values on adjacent variables. In these cases, the values for each case may become mixed. Selecting the entire database before any sort ensures that the values of a given variable remain fixed to the values of all the variables for each of the cases. The “Filter” sub-menu is useful in this regard. Excel® adds drop-down menus next to each variable when the user selects this sub-menu. When you use the menus, you can specify a series of ways to sort the variables in the database without “disengaging” the values on the variables.
FIGURE 2.6 Using the “Sort” function to arrange values of the variables.
FIGURE 2.7 The Excel® sub-menu showing a sort by multiple variables.
You can also perform a “multiple” sort in Excel® using the Sort menu. Figure 2.7 shows the sub-menu presented when you choose Sort. As you can see from the screen, choosing the “Add Level” button in the upper left corner of the screen results in a second sort line (“Then by”) allowing you to specify a second sort variable. This would result in a sort of the data first by Gender, and then the values of Reading would be presented low to high within both categories of gender.
Excel® also records the nature of the variables. Under the “Order” column on the far right of Figure 2.7, the variables chosen for sorting are listed as either “A to Z,” indicating that they are “alphanumeric” or “text” variables, or “Smallest to Largest,” indicating they are numerical variables. Text variables are composed of values (either letters or numbers) that are treated as letters and not used in calculations. In Figure 2.6, gender values are either “F” or “M,” so there is little doubt that they represent letters. If I had coded these as “1” for “F” and “2” for “M” without changing the format of the cells, Excel® might treat the values differently in calculations (since letters cannot be added, subtracted, etc.). In this case I would want to ensure that the “1” and the “2” would be treated not as a number but as letters. Be sure to format the cells properly (from the “Number” group in the Home menu) so that you can be sure the values are treated as you intend them to be treated in your analyses.
Figure 2.8 shows the resulting sort. Here you can see that the data were first sorted by Gender (with “F” presented before “M”) and then the values of “Reading” were presented low to high in value within both gender categories.
Data Analysis. This sub-menu choice (located in the “Data” tab in the “Analysis” group) is the primary statistical analysis device we will use in this book. Figure 2.4 shows the “Data Analysis” sub-menu in the upper right corner of the menu bar. Choosing this option results in the box shown in Figure 2.9.
FIGURE 2.8 The Excel® screen showing the results of a multiple sort.
Figure 2.9 shows the statistical procedures available in Excel®. The scroll bar to the right of the screen allows the user to access several additional procedures. We will explore many of these procedures in later chapters.
You may not see the Data Analysis sub-menu displayed when you choose the Data menu on the main Excel® screen. That is because it is often an “add-in” program. Not everyone uses these features so Excel® makes them available as an “adjunct”.3
FIGURE 2.9 The “Data Analysis” sub-menu containing statistical analysis procedures.
FIGURE 2.10 The Add-In options for Excel®.
If your Excel® screen does not show the Data Analysis sub-menu in the right edge of the menu bar when you select the Data menu, you can add it to the menu. Select the “Office Button” in the upper left corner of the screen and then you will see an “Excel® Options” button in the lower center of the screen. Choose this and you will be presented with several options in a column on the left edge of the screen. “Add-Ins” is one of the available choices, which, if you select it, presents you with the screen shown in Figure 2.10. I selected “Add-Ins” and the screen in Figure 2.10 appeared with “Analysis ToolPak” highlighted in the upper group of choices. When you select this option (you might need to restart Excel® to give it a chance to add), you should be able to find the Data Analysis sub-menu on the right side of the Data Menu. This will allow you to use the statistical functions we discuss in the book.
These two tabs available from the main screen have useful menus and functions for data management and appearance. I will make reference to them as we encounter them in later chapters.
1 Used with permission from Microsoft, as per ‘‘Use of Microsoft Copyrighted Content’’ approvals.
2 The example data in these procedures are taken from the school database we will use throughout the book. The small number of cases is used to explain the procedures, not to make research conclusions.
3 Mac users may not have access to the Data Analysis features since they were removed in previous versions.
The heart of the statistical uses of Excel® is in the Data Analysis sub-menu that I described in Chapter 2. I will introduce many of these statistical tools in later chapters as we encounter different statistical topics. However, before we delve into those specific topics, I want to point out other ways that we can build statistical formulas directly into the spreadsheet.
In Chapter 2 I described several ways in which users can enter statistical formulas directly from the available sub-menus (see especially the “Formulas” section). As I mentioned, there are several statistical formulas available that we will use extensively in this book. Most are single-procedure formulas like calculating “AVERAGE” or “STDEV” (Standard Deviation), for example. Other procedures are more complex like the “FTEST” that calculates the equivalence in variance in two sets of data.
Another very important use of Excel® is to “embed” formulas directly into the worksheet so that you can devise whatever calculation you need. The functions we discussed above are simply common calculations that have been arranged so that if you have repeated need for a certain calculation, you can use them more quickly than entering the formulas by hand.
Choosing the “=“ key notifies Excel that what follows is a user-created formula. Thereafter, you can enter the calculation you want as a string of characters. For example, using the sample of Gender, Reading, and Math scores shown in , the following commands (user-created formulas) would yield the average value for “Math” scores: =Sum(C2:C11)/10
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!
Lesen Sie weiter in der vollständigen Ausgabe!