Comprehensively teaches the basics of testing statistical assumptions in research and the importance of doing so.

This book helps researchers check the assumptions of the statistical tests used in their research by focusing on the importance of checking assumptions when using statistical methods, showing them how to check assumptions, and explaining what to do if assumptions are not met. Testing Statistical Assumptions in Research discusses the concepts of hypothesis testing and statistical errors in detail, as well as the concepts of power, sample size, and effect size. It introduces SPSS functionality and shows how to segregate data, draw random samples, split files, and create variables automatically. It then covers the different assumptions required in survey studies and the importance of survey design in reporting efficient findings. The book presents various parametric tests and their related assumptions and shows the procedures for testing these assumptions using SPSS software. To motivate readers to check assumptions, it includes many situations where violation of assumptions affects the findings. Assumptions required for different nonparametric tests such as the Chi-square, Mann-Whitney, Kruskal-Wallis, and Wilcoxon signed-rank tests are also discussed. Finally, it looks at assumptions in nonparametric correlations, such as the biserial correlation, tetrachoric correlation, and phi coefficient.

* An excellent reference for graduate students and research scholars of any discipline who need to test the assumptions of statistical tests before using them in their research
* Shows readers the adverse effects of violating assumptions on findings by means of various illustrations
* Describes the different assumptions associated with the statistical tests commonly used by research scholars
* Contains examples using SPSS, which help readers understand the procedures involved in testing assumptions
* Covers the assumptions of commonly used statistical tests, such as the z, t, and F tests, ANOVA, correlation, and regression analysis

Testing Statistical Assumptions in Research is a valuable resource for graduate students of any discipline who write a thesis or dissertation for an empirical study, as well as for data analysts.
Page count: 276
Publication year: 2019
Cover
Preface
Acknowledgments
About the Companion Website
1 Importance of Assumptions in Using Statistical Techniques
1.1 Introduction
1.2 Data Types
1.3 Assumptions About Type of Data
1.4 Statistical Decisions in Hypothesis Testing Experiments
1.5 Sample Size in Research Studies
1.6 Effect of Violating Assumptions
Exercises
Answers
2 Introduction of SPSS and Segregation of Data
2.1 Introduction
2.2 Introduction to SPSS
2.3 Data Cleaning
2.4 Data Management
Exercises
Answers
3 Assumptions in Survey Studies
3.1 Introduction
3.2 Assumptions in Survey Research
3.3 Questionnaire's Reliability
Exercise
Answers
4 Assumptions in Parametric Tests
4.1 Introduction
4.2 Common Assumptions in Parametric Tests
4.3 Assumptions in Hypothesis Testing Experiments
4.4 F‐test for Comparing Variability
4.5 Correlation Analysis
4.6 Regression Analysis
Exercises
Answers
5 Assumptions in Nonparametric Tests
5.1 Introduction
5.2 Common Assumptions in Nonparametric Tests
5.3 Chi‐square Tests
5.4 Mann‐Whitney U Test
5.5 Kruskal‐Wallis Test
5.6 Wilcoxon Signed‐Rank Test
Exercises
Answers
6 Assumptions in Nonparametric Correlations
6.1 Introduction
6.2 Spearman Rank‐Order Correlation
6.3 Biserial Correlation
6.4 Tetrachoric Correlation
6.5 Phi Coefficient (Φ)
6.6 Assumptions About Data
6.7 What if the Assumptions Are Violated?
Exercises
Answers
Appendix Statistical Tables
Bibliography
Index
End User License Agreement
Chapter 1
Table 1.1 Assumptions about data in computing measures of central tendency.
Table 1.2 Marks for the students in an examination.
Table 1.3 Statistical errors in hypothesis testing experiment.
Table 1.4 Implication of errors in hypothesis testing experiment.
Table 1.5 Sample size for different precision.
Chapter 2
Table 2.1 Response of the subjects and their demographic data on the issue “Shou...
Table 2.2 Descriptive statistics.
Table 2.3 Frequencies of “Response” variable.
Table 2.4 Commands for selecting cases with different conditions in SPSS.
Chapter 3
Table 3.1 A typical scoring of the subject's response in a lifestyle assessment ...
Table 3.2 Scores of the subjects in each group of statements.
Table 3.3 English test with possible answers.
Table 3.4 Computation in Kuder–Richardson test for testing reliability.
Table 3.5 Sample questions on a lifestyle assessment test.
Table 3.6 Response of the subjects on different items of the questionnaire.
Table 3.7 Reliability statistics.
Table 3.8 Output of the reliability analysis.
Table 3.9 Guidelines for internal consistency benchmark.
Chapter 4
Table 4.1 Performance of the students.
Table 4.2 Tests of normality for the data on students' performance.
Table 4.3 Marks of the students.
Table 4.4 Tests of normality for the data on students' marks.
Table 4.5 Transformed data.
Table 4.6 Tests of normality for the transformed data on students' marks.
Table 4.7 Runs test for the data on weight.
Table 4.8 Data showing the effect of outlier.
Table 4.9 Data file for Levene's test.
Table 4.10 Results of Levene's test for equality of variances.
Table 4.11 Anxiety scores.
Table 4.13 Tests of normality for the data on students' marks.
Table 4.12 Students marks in mathematics.
Table 4.14 Descriptive statistics.
Table 4.15 T‐table for the data on marks in mathematics.
Table 4.17 Tests of normality for the data on hemoglobin.
Table 4.18 Paired t‐test for the data on hemoglobin.
Table 4.16 Hemoglobin contents of the participants before and after the exercise...
Table 4.19 Data on hemoglobin and computation in sign test.
Table 4.20 Memory scores.
Table 4.21 Tests of normality for the data on memory.
Table 4.22 Data file in SPSS for independent‐sample t‐test.
Table 4.23 Independent‐samples t‐test for the data on memory.
Table 4.24 Test statistics for Mann–Whitney test.
Table 4.25 IQ scores of students.
Table 4.26 Descriptive statistics for IQ scores.
Table 4.27 Tests for normality.
Table 4.28 Levene's test of equality of error variances.
Table 4.29 The analysis of variance (ANOVA) table.
Table 4.30 Multiple comparisons post‐hoc test.
Table 4.31 Learning scores of the subjects in different treatment groups.
Table 4.32 Data format in SPSS for one‐way ANOVA.
Table 4.33 Test of homogeneity of variances.
Table 4.34 ANOVA table for the scores in Maths.
Table 4.35 Tests of normality.
Table 4.36 Mean ranks of different groups.
Table 4.37 Output of Kruskal‐Wallis test.
Table 4.38 Child data.
Table 4.39 Pearson's correlation coefficient.
Table 4.40 Spearman's correlation coefficient.
Table 4.41 Testing for multicollinearity.
Table 4.42 Testing for autocorrelation.
Table 4.43 The ANOVA table: testing for overall model fitness.
Table 4.44 Model summary.
Table 4.45 Regression coefficients.
Chapter 5
Table 5.1 Runs test for the Gender and Age of respondents.
Table 5.2 Frequencies for the chi‐square goodness‐of‐fit test.
Table 5.3 Chi‐square goodness‐of‐fit test results.
Table 5.4 Chi‐square test for independence: crosstabs results.
Table 5.5 Chi‐square test of independence results.
Table 5.6 Chi‐square test for homogeneity: crosstabs results.
Table 5.7 Chi‐square test for homogeneity: test statistic and p‐value.
Table 5.8 Ranks for the variable Gender.
Table 5.9 Test statistics: Mann‐Whitney U test.
Table 5.10 Hypothesis test summary for Mann‐Whitney U test.
Table 5.11 Ranks for the variable Happiness of marriage.
Table 5.12 Kruskal‐Wallis test results.
Table 5.13 Hypothesis test summary for Kruskal‐Wallis H test.
Table 5.14 Independent samples test view.
Table 5.15 Sample average rank of Race of respondents.
Table 5.16 Migraine score before and after treatment.
Table 5.17 Ranks for the before and after treatment scores.
Table 5.18 Results for the Wilcoxon Signed‐Rank test.
Chapter 6
Table 6.1 Heart rate and number of visits.
Table 6.2 Correlations.
Table 6.3 Test score and handedness data.
Table 6.4 Correlations.
Table 6.5 Responses of the raters to classify subjects having neurosis.
Table 6.6 Clinical findings of psychotherapists on depression.
Table 6.7 Results: crosstabulation (Rater A × Rater B) and Phi coefficient.
Chapter 1
Figure 1.1 Showing the distribution of data.
Figure 1.2 Distribution of mean under null and alternative hypotheses.
Figure 1.3 Showing comparison of power in (b) one‐ and (a) two‐tailed tests at ...
Chapter 2
Figure 2.1 Screen for creating/opening data file.
Figure 2.2 Screen for defining variables and their characteristics.
Figure 2.3 Defining code of nominal variable.
Figure 2.4 Screen after defining all the variables and their properties.
Figure 2.5 Screen after entering the data in SPSS file.
Figure 2.6 Data importing.
Figure 2.7 Data import options.
Figure 2.8 Screen showing command sequence for descriptive statistics.
Figure 2.9 Screen showing selection of variables in the analysis.
Figure 2.10 Screen for selecting different statistics for computation.
Figure 2.11 Screen for selecting chart type.
Figure 2.12 Graph for checking normality of “Response” variable.
Figure 2.13 Screen showing commands for sorting cases.
Figure 2.14 Screen for selecting variable(s) for sorting cases.
Figure 2.15 Data file with sorted cases on response and district variables.
Figure 2.16 Screen showing commands for sorting variables.
Figure 2.17 Screen showing option for sorting criterion.
Figure 2.18 Screen showing sorted data column‐wise based on variable's name.
Figure 2.19 Command for switching data from coding to categories and vice versa...
Figure 2.20 Data file showing nominal variables as per their categories.
Figure 2.21 Screen showing commands for selecting cases.
Figure 2.22 Screen for selecting cases.
Figure 2.23 Screen showing condition for selecting cases with Male responding A...
Figure 2.24 Screen showing selected cases with Male having response Agree.
Figure 2.25 Screen showing option for selecting specified number of cases rando...
Figure 2.26 Screen showing randomly selected cases.
Figure 2.27 Screen showing commands for splitting the file.
Figure 2.28 Screen showing option for defining the split variable.
Figure 2.29 SPSS data file.
Figure 2.30 Screen for defining formula for computing variable.
Figure 2.31 Screen showing data file with newly created BMI variable.
Chapter 3
Figure 3.1 Screen showing commands for reliability analysis.
Figure 3.2 Screen showing selection of items for reliability analysis.
Figure 3.3 Screen showing options for reliability outputs.
Chapter 4
Figure 4.1 Commands for testing normality.
Figure 4.2 Screen showing options for computing Shapiro–Wilk test.
Figure 4.3 Normal Q–Q plot for the data on student's performance.
Figure 4.4 Distribution of data on weight by means of histogram.
Figure 4.5 Histogram showing distribution of X (Maths scores).
Figure 4.6 Histogram showing distribution of X (Science scores).
Figure 4.7 Screen showing options for Runs Test.
Figure 4.8 Screen showing option for selecting variables and identifying outlie...
Figure 4.9 Box plot for all three groups of scores.
Figure 4.10 Screen showing commands for Levene's test in SPSS.
Figure 4.11 Screen showing commands for inputs in Levene's test.
Figure 4.12 Screen showing steps for inputs in one‐sample t test.
Figure 4.13 Screen showing option for paired t test.
Figure 4.14 Screen showing options for independent‐samples t‐test.
Figure 4.15 Screen showing options for Mann–Whitney test.
Figure 4.16 Three different groups/treatments with their means.
Figure 4.17 Three different groups/treatments with their means and overall mean...
Figure 4.18 Data file in SPSS for testing ANOVA assumptions.
Figure 4.19 Path for testing Skewness and Kurtosis.
Figure 4.20 Procedure for testing Skewness and Kurtosis.
Figure 4.21 Path for testing normality.
Figure 4.22 Procedure for testing normality.
Figure 4.23 Normal Q–Q plots for IQ scores.
Figure 4.24 Procedure for Levene's test.
Figure 4.25 Path for One‐Way ANOVA.
Figure 4.26 Procedure for One‐Way ANOVA.
Figure 4.27 Means plot.
Figure 4.28 Procedure for Tukey's Post‐Hoc test.
Figure 4.29 Data file in SPSS for one‐way ANOVA.
Figure 4.30 Screen showing selection of variables in one‐way ANOVA.
Figure 4.31 Screen showing option for Post Hoc test in one‐way ANOVA.
Figure 4.32 Screen showing option for homogeneity test and means plot.
Figure 4.33 Screen showing sequence of commands in Kruskal‐Wallis test.
Figure 4.34 Screen showing selection of variables defining coding.
Figure 4.35 Data file in SPSS for correlation and regression analysis.
Figure 4.36 Procedure for creating a scatter plot.
Figure 4.37 Scatter plots for linearity.
Figure 4.38 Path for bivariate correlations.
Figure 4.39 Procedure for bivariate correlations.
Figure 4.40 Path for linear regression analysis.
Figure 4.41 Procedure for Linear Regression analysis.
Figure 4.42 Procedure for testing autocorrelation and multicollinearity.
Figure 4.43 Procedure for testing the normality of errors assumption.
Figure 4.44 The histogram (a) and P–P plot (b).
Figure 4.45 Scatter plot for the standardized residual.
Chapter 5
Figure 5.1 Path for the Runs test.
Figure 5.2 Choosing options for Runs test.
Figure 5.3 Path for chi‐square goodness‐of‐fit test.
Figure 5.4 Choosing options for chi‐square goodness‐of‐fit test.
Figure 5.5 Path for the chi‐square test of independence.
Figure 5.6 Choosing options for the chi‐square test of independence – 1.
Figure 5.7 Choosing options for the chi‐square test of independence – 2 and 3.
Figure 5.8 Clustered bar chart: Happiness of marriage vs. General happiness.
Figure 5.9 Weight cases procedure for testing homogeneity.
Figure 5.10 Clustered bar chart: testing for homogeneity based on gender.
Figure 5.11 Path for transforming variables.
Figure 5.12 Path 1 for Mann‐Whitney U test.
Figure 5.13 Choosing options for Mann‐Whitney U test (path 1).
Figure 5.14 Path 2 for Mann‐Whitney U test.
Figure 5.15 Mann‐Whitney U test: option 1, path 2.
Figure 5.16 Mann‐Whitney U test: option 2, path 2.
Figure 5.17 Mann‐Whitney U test: option 3, path 2.
Figure 5.18 Mean ranks distribution graph for Mann‐Whitney U test.
Figure 5.19 Path for Kruskal‐Wallis H test (path 1).
Figure 5.20 Options for Kruskal‐Wallis H test (path 1).
Figure 5.21 Path for Kruskal‐Wallis H test (path 2).
Figure 5.22 Options for Kruskal‐Wallis H test (path 2).
Figure 5.23 Kruskal‐Wallis test: option 1, path 2.
Figure 5.24 Kruskal‐Wallis test: option 2, path 2.
Figure 5.25 Pairwise comparisons of Race of respondents.
Figure 5.26 Data file in SPSS for Wilcoxon Signed‐Rank test.
Figure 5.27 Path for Wilcoxon Signed‐Rank test.
Figure 5.28 Options for Wilcoxon Signed‐Rank test.
Chapter 6
Figure 6.1 Data file in SPSS for computing Spearman rank‐order correlation.
Figure 6.2 Path for bivariate correlations.
Figure 6.3 Window for defining option for Spearman correlation.
Figure 6.4 Data file in SPSS for biserial correlation.
Figure 6.5 Options for bivariate correlations.
Figure 6.6 Path for crosstabs: Phi coefficient.
Figure 6.7 Screen showing option for selecting rows and columns in crosstabs.
Figure 6.8 Options for the Phi coefficient.
Dedicated to
My wife Haripriya and children Prachi-Ashish and Priyam. - J. P. Verma
My wife, sweet children, parents, and all my family and colleagues. - Abdel-Salam G. Abdel-Salam
J. P. Verma
Lakshmibai National Institute of Physical Education Gwalior, India
Abdel-Salam G. Abdel-Salam
Qatar University Doha, Qatar
This edition first published 2019
© 2019 John Wiley & Sons, Inc.
IBM, the IBM logo, ibm.com, and SPSS are trademarks or registered trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “IBM Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of J. P. Verma and Abdel-Salam G. Abdel-Salam to be identified as the authors of this work has been asserted in accordance with law.
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
Editorial Office
111 River Street, Hoboken, NJ 07030, USA
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of Warranty
The publisher and the authors make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of fitness for a particular purpose. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for every situation. In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. The fact that an organization or website is referred to in this work as a citation and/or potential source of further information does not mean that the author or the publisher endorses the information the organization or website may provide or recommendations it may make. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. No warranty may be created or extended by any promotional statements for this work. Neither the publisher nor the author shall be liable for any damages arising herefrom.
Library of Congress Cataloging-in-Publication data applied for
Hardback ISBN: 9781119528418
Cover design: Wiley
Cover image: © JuSun/iStock.com
The book titled Testing Statistical Assumptions in Research is a collaborative work of J. P. Verma and Abdel‐Salam G. Abdel‐Salam. While conducting research workshops and working with research scholars, we found that most scholars do not pay much attention to the assumptions involved in using different statistical techniques in their research. As a result, the findings of a study become less reliable, and in certain cases, if the assumptions are severely violated, the results can be completely reversed. In other words, in a hypothesis testing experiment, an effect that actually exists may go undetected because of extreme violation of assumptions. Since considerable resources and time go into conducting any survey or empirical research, one must test the assumptions of the statistical tests used in the analysis to make the findings more reliable.
This text has two specific purposes: first, we wish to educate researchers about the assumptions that must be fulfilled by different commonly used statistical tests and to show the procedure for testing them using IBM SPSS®1 Statistics software (SPSS) via illustrations. Second, we endeavor to motivate them to check assumptions by showing the adverse effects of severely violating assumptions using specific examples. We have also suggested remedial measures for different statistical tests when assumptions are violated.
This book is meant for research scholars of all disciplines. Since most graduate students also write dissertations, this text is equally useful for them.
The book contains six chapters. Chapter 1 discusses the importance of assumptions in analyzing research data. The concepts of hypothesis testing and statistical errors are discussed in detail. We have also discussed the concepts of power, sample size, and effect size, and the relationships among them, in order to give readers a foundation in hypothesis testing.
In Chapter 2, we introduce SPSS functionality to the readers. We also show how to segregate data, draw random samples, split files, and create variables automatically. These operations are extremely useful to researchers analyzing survey data. Further, the acquaintance with SPSS gained in this chapter will help readers follow the procedures for testing assumptions in the later chapters.
In Chapter 3, we discuss the different assumptions required in survey studies. We have also deliberated upon the importance of survey design in reporting efficient findings.
Chapter 4 presents various parametric tests and their related assumptions. We show the procedures for testing these assumptions using the SPSS software. To motivate readers to check assumptions, we present many situations where violation of assumptions affects the findings. In this chapter, we discuss the assumptions of all those statistical tests that are commonly used by researchers, such as the z, t, and F tests, ANOVA, correlation, and regression analysis.
In Chapter 5, we discuss the assumptions required for different nonparametric tests such as the Chi‐square, Mann‐Whitney, Kruskal‐Wallis, and Wilcoxon Signed‐Rank tests. We show the procedures for testing these assumptions as well.
Finally, in Chapter 6, assumptions in nonparametric correlations, such as the biserial correlation, tetrachoric correlation, and phi coefficient, are discussed. These types of analyses are often used by researchers in survey studies.
We hope this book will serve the purpose for which it has been written. Readers are requested to send their feedback about the book, or about any problems they encounter, to the authors for further improvement of the text. For any help in relation to the text, you may contact the authors at their emails: Prof. J. P. Verma (email: [email protected], web: www.jpverma.org) and Dr. Abdel‐Salam G. Abdel‐Salam (email: [email protected] or [email protected], web: https://www.abdo.website).
J. P. Verma
Abdel‐Salam G. Abdel‐Salam
1 SPSS Inc. was acquired by IBM in October 2009.
We would like to thank our workshop participants, research scholars, and graduate students, who have constantly posed innumerable questions during academic discussions, which encouraged us to prepare this text. We extend our thanks to all those who directly or indirectly helped us in completing it.
J. P. Verma
Abdel‐Salam G. Abdel‐Salam
This book is accompanied by a companion website:
www.wiley.com/go/Verma/Testing_Statistical_Assumptions_Research
The website includes the following:
Chapter presentation in PPT format
SPSS data file for each illustration where the data have been used.
All research is conducted under certain assumptions. The validity and accuracy of findings depend upon whether we have fulfilled all the assumptions about the data and the statistical techniques used in the analysis. For instance, in drawing a sample, simple random sampling requires the population to be homogeneous, while stratified sampling assumes it to be heterogeneous. In any research, certain research questions are framed that we try to answer by conducting the study. In addressing these questions, we frame hypotheses that are tested with the help of the data generated in the study. These hypotheses are tested using statistical tests, but the choice of test depends upon whether the data are nonmetric or metric. Different statistical tests are used for nonmetric and metric data to answer the same research questions. More specifically, we use nonparametric tests for nonmetric data and parametric tests for metric data. Thus, it is essential for researchers to understand the type of data generated in their studies.

Parametric tests no doubt provide more accurate findings than nonparametric tests, but they are based upon one common assumption of normality, besides the specific assumptions associated with each test. If the normality assumption is severely violated, parametric tests may distort the findings. Thus, in research studies, assumptions center on two spheres, data and statistical tests, besides methodological issues. Nowadays, many statistical packages such as IBM SPSS® Statistics software (“SPSS”),1 Minitab, Statistica, and Statistical Analysis System (SAS) are available for analyzing both nonmetric and metric data, but they do not check the assumptions automatically. However, these packages do provide outputs for testing the assumptions associated with the statistical tests. We shall now discuss the different types of data that can be generated in research studies. By knowing this, one can decide the relevant strategy for answering one's research questions.
Data are classified into two categories: nonmetric and metric. Nonmetric data are also termed qualitative, and metric data quantitative. Nonmetric data are further classified as nominal and ordinal. Nonmetric data are categorical measurements expressed by means of a natural language description, and are often known as “categorical” data. Data such as Student's Specialization = “Economics”, Response = “Agree”, and Gender = “Male” are examples of nonmetric data. Such data can be measured on two different scales, i.e. nominal and ordinal.
Nominal data are obtained by categorizing an individual or object into two or more categories that are not graded. For example, an individual can be classified into the male or female category, but we cannot say that one category is higher than the other based on the frequencies in the data set. Another example of nominal data is eye color: a person can be classified into the blue, black, or brown eye category. With this type of data, one can only compute percentages and proportions to describe the characteristics of the data. Furthermore, the mode is the appropriate measure of central tendency for such data.
On the other hand, in ordinal data the categories are graded. The order of items is often defined by assigning numbers to them to show their relative position. Here also, we classify a person, response, or object into one of many categories, but we can rank the categories in some order. For example, variables that assess performance (excellent, very good, good, etc.) are ordinal variables. Similarly, attitude (agree, can't say, disagree) and nature (very good, good, bad, etc.) are also ordinal variables. From the order of an ordinal variable alone, one cannot say how much better or worse one value is than another on the measured phenomenon, because the distance between ordered categories is not measurable. No mathematical operation can be performed on ordinal data. The median and quartile deviation are the appropriate measures of central tendency and variability, respectively, for such data.
Metric data are always associated with a scale measure and are therefore also known as scale data. Such data are obtained by measuring some phenomenon. Metric data can be measured on two different types of scale, i.e. interval and ratio; data measured on interval and ratio scales are also termed interval data and ratio data, respectively. Interval data are obtained by measuring a phenomenon along a scale on which each position is equidistant from the next, so that the distance between any two adjacent pairs is equivalent. The only problem with this scale is that the doubling principle breaks down, as there is no true zero on the scale. For instance, eight marks given to an individual for creativity on a 10‐point scale do not mean that his or her creativity is twice as good as that of a person with four marks. Thus, variables measured on an interval scale have values in which differences are uniform and meaningful but ratios are not. Interval data may be obtained when a parameter such as motivation or level of adjustment is rated on a scale of 1–10.
Data measured on a ratio scale have a meaningful zero and an equidistant measure (i.e. the difference between 30 and 40 is the same as the difference between 60 and 70). Because zero exists in ratio data, 80 marks obtained by person A on a skill test may be considered twice the 40 marks obtained by another person B on the same test; in other words, the doubling principle holds for ratio data. All types of mathematical operations can be performed on such data. Examples of ratio data are weight, height, distance, and salary.
We know that parametric statistics are calculated for metric data, while nonparametric statistics are used for nonmetric data. If we violate these assumptions, the findings may be misleading. We shall show this by means of an example, but before that, let us elaborate on the data assumptions a little more. If the data are nominal, the mode is the suitable measure of central tendency, and if the data are ordinal, we compute the median. Since both nominal and ordinal data are nonmetric, we use nonparametric statistics (mode and median). On the other hand, if the data are metric (interval/ratio), we should use parametric statistics such as the mean and standard deviation. But we can calculate parametric statistics for metric data only when the assumption of normality holds. If normality is violated, we should use nonparametric statistics such as the median and quartile deviation. The assumptions about data in using measures of central tendency are summarized in Table 1.1; a short code sketch after the table illustrates this decision rule.
Table 1.1 Assumptions about data in computing measures of central tendency.
Data type    Nature of variable                                        Appropriate measure of central tendency
Nonmetric    Nominal data                                              Mode
Nonmetric    Ordinal data                                              Median
Metric       Interval/ratio (if symmetrical or nearly symmetrical)     Mean
Metric       Interval/ratio (if skewed)                                Median
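The following is a minimal sketch of this decision rule, written by us in Python (the book itself works in SPSS, not code). The |skewness| < 0.5 cutoff for treating metric data as "nearly symmetrical" is an assumed rule of thumb, not a value prescribed by the authors.

import numpy as np
from scipy import stats

def central_tendency(values, data_type):
    # data_type: "nominal", "ordinal", or "metric" (interval/ratio).
    if data_type == "nominal":
        # Mode: the most frequent category (no ordering or arithmetic allowed).
        vals, counts = np.unique(values, return_counts=True)
        return "mode", vals[np.argmax(counts)]
    if data_type == "ordinal":
        # Median: ranking is allowed, but distances between categories are not
        # (assumes values are numeric rank codes).
        return "median", float(np.median(values))
    # Metric data: use the mean only if the distribution is nearly symmetrical.
    if abs(stats.skew(values)) < 0.5:  # assumed symmetry threshold
        return "mean", float(np.mean(values))
    return "median", float(np.median(values))

print(central_tendency(["blue", "brown", "brown", "black"], "nominal"))
print(central_tendency([35, 40, 30, 32, 35, 39, 33, 32, 91, 93], "metric"))

Run on the skewed marks data used in the next example, the sketch falls back to the median, exactly as Table 1.1 recommends.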
Let us see what happens if we violate the assumption for metric data. Consider the marks obtained by ten students in an examination, as shown in Table 1.2. These are metric data; hence, without bothering about the normality assumption, let us compute the parametric statistic, the mean. Here, the mean of the data set is 46. Can we say that the class average is 46 and report this finding in our research report? Certainly not, as most of the data are less than 46.
Table 1.2 Marks for the students in an examination.
Student  1   2   3   4   5   6   7   8   9   10
Marks    35  40  30  32  35  39  33  32  91  93
Let us see why this situation has arisen. If we look at the distribution of the data, it is skewed toward the positive side, as shown in Figure 1.1. Since the distribution of the data is positively skewed, we can conclude that the normality assumption has been severely violated.
Figure 1.1 Showing the distribution of data.
In a situation where the normality assumption is violated, we can very well use a nonparametric statistic such as the median, as shown in Table 1.1. The median of this data set is 35, which can rightly be claimed as the average, since most of the scores lie around 35 rather than 46. Thus, if the data are skewed, one should report the median and quartile deviation as the measures of central tendency and variability, respectively, instead of the mean and standard deviation, in the project report. The short computation below confirms these values.
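As a quick check, this sketch (ours, not the book's; it uses NumPy and SciPy rather than SPSS) reproduces the statistics quoted above for the marks in Table 1.2:

import numpy as np
from scipy import stats

marks = np.array([35, 40, 30, 32, 35, 39, 33, 32, 91, 93])

print(np.mean(marks))     # 46.0 -- pulled up by the two extreme scores
print(np.median(marks))   # 35.0 -- a fairer "typical" mark
print(stats.skew(marks))  # about 1.4, i.e. strongly positively skewed

The two outlying scores (91 and 93) inflate the mean but barely move the median, which is why the median is the safer summary for skewed data.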
In hypothesis testing experiments, since a population parameter is tested for some characteristic on the basis of a sample obtained from the population of interest, some errors are bound to happen. These errors are known as statistical errors. We shall investigate these errors and their repercussions in detail in the following sections.
In hypothesis