Cross Section and Experimental Data Analysis Using EViews (E-Book)

Description

A practical guide to selecting and applying the most appropriate model for analysis of cross section data using EViews.

"This book is a reflection of the vast experience and knowledge of the author. It is a useful reference for students and practitioners dealing with cross sectional data analysis ... The strength of the book lies in its wealth of material and well structured guidelines ..." Prof. Yohanes Eko Riyanto, Nanyang Technological University, Singapore

"This is superb and brilliant. Prof. Agung has skilfully transformed his best experiences into new knowledge ... creating a new way of understanding data analysis." Dr. I Putu Gede Ary Suta, The Ary Suta Center, Jakarta

Basic theoretical concepts of statistics as well as sampling methods are often misinterpreted by students and less experienced researchers. This book addresses this issue by providing a hands-on practical guide to conducting data analysis using EViews combined with a variety of illustrative models (and their extensions). It concentrates on models having numerical dependent variables based on a cross-section data set (such as univariate, multivariate and nonlinear models as well as non-parametric regressions). It is shown that a wide variety of hypotheses can easily be tested using EViews.

Cross Section and Experimental Data Analysis Using EViews:

* Provides step-by-step directions on how to apply EViews to cross section data analysis, from multivariate analysis and nonlinear models to non-parametric regression
* Presents a method to test for all possible hypotheses based on each model
* Proposes a new method for data analysis based on a multifactorial design model
* Demonstrates that statistical summaries in the form of tabulations are invaluable inputs for strategic decision making
* Contains 200 examples with special notes and comments based on the author's own empirical findings as well as over 400 illustrative outputs of regressions from EViews
* Illustrates techniques through practical examples from real situations
* Comes with supplementary material, including work-files containing selected equation and system specifications that have been applied in the book

This user-friendly introduction to EViews is ideal for advanced undergraduate and graduate students taking finance, econometrics, population, or public policy courses, as well as applied policy researchers.

Number of pages: 846

Year of publication: 2011




Contents

Cover

Title Page

Copyright

Dedication

Preface

Chapter 1: Misinterpretation of Selected Theoretical Concepts of Statistics

1.1 Introduction

1.2 What is a Population?

1.3 A Sample and Sample Space

1.4 Distribution of a Random Sample Space

1.5 What is a Random Variable?

1.6 Theoretical Concept of a Random Sample

1.7 Does a Representative Sample Really Exist?

1.8 Remarks on Statistical Powers and Sample Sizes

1.9 Hypothesis and Hypothesis Testing

1.10 Groups of Research Variables

1.11 Causal Relationship between Variables

1.12 Misinterpretation of Selected Statistics

Chapter 2: Simple Statistical Analysis but Good for Strategic Decision Making

2.1 Introduction

2.2 A Single Input for Decision Making

2.3 Data Transformation

2.4 Biserial Correlation Analysis

2.5 One-Way Tabulation of a Variable

2.6 Two-Way Tabulations

2.7 Three-Way Tabulation

2.8 Special Notes and Comments

2.9 Special Cases of the N-Way Incomplete Tables

2.10 Partial Associations

2.11 Multiple Causal Associations Based on Categorical Variables

2.12 Seemingly Causal Model Based on Categorical Variables

2.13 Alternative Descriptive Statistical Summaries

2.14 How to Present Descriptive Statistical Summary?

2.15 General Seemingly Causal Model

2.16 Empirical Studies Presenting Descriptive Statistical Summaries

Chapter 3: One-Way Proportion Models

3.1 Introduction

3.2 One-Way Proportion Models Based on a 2 × 2 Table

3.3 Binary Choice Models Based on a K × 2 Table

3.4 Binary Logit Models Based on N-Way Tabulation

3.5 General Binary Choice Models

3.6 Special Notes and Comments

3.7 Association between Categorical Variables

3.8 One-Way Binary Choice Models Based on N-Way Tabulation

3.9 Special Notes and Comments on Binary Choice Models

Chapter 4: N-Way Cell-Proportion Models

4.1 Introduction

4.2 The N-Way Tabulation of Proportions

4.3 The 2 × 2 Factorial Model of Proportions

4.4 I × J Factorial Models of Proportions

4.5 Multifactorial Cell-Proportion Model

4.6 Presenting the Statistical Summary

Chapter 5: N-Way Cell-Mean Models

5.1 Introduction

5.2 One-Way Multivariate Cell-Mean Models

5.3 N-Way Multivariate Cell-Mean Models

5.4 Equality Test by Classification

5.5 Testing Weighted Means Differences

5.6 Descriptive Statistical Summary

Chapter 6: Multinomial Choice Models with Categorical Exogenous Variables

6.1 Introduction

6.2 Multinomial Choice Models

6.3 Ordered Choice Models

6.4 Concordance–Discordance Measure of Association

6.5 Multifactorial Ordered Choice Models

6.6 Multilevel Choice Models

6.7 Special Notes on the Multinomial Logit Model

6.8 Selected Population Studies Using Multinomial Choice Models

Chapter 7: General Choice Models

7.1 Introduction

7.2 Binary Choice Models with a Numerical Variable

7.3 Heterogeneous Binary Choice Models

7.4 Homogeneous Binary Choice Models

7.5 General Binary Choice Models

7.6 Advanced Binary Choice Models

7.7 Multidimensional Binary Choice Translog Linear Model

7.8 Piecewise Binary Choice Models

7.9 Ordered Choice Models with Numerical Independent Variables

7.10 Studies Using General Choice Models

7.11 Two-Stage Binary Choice Model

Chapter 8: Experimental Data Analysis

8.1 Introduction

8.2 Analysis Based on Cell-Mean Models

8.3 Bivariate Correlation Analysis

8.4 Effects of the Experimental Factors

8.5 Effects of the Experimental Factors and Covariates

8.6 Application of the Ordered Choice Models

8.7 Application of Seemingly Causal Models

8.8 Multivariate Analysis of Covariance

8.9 Tests for Equality of Medians

8.10 The Simplest Experimental Design

Chapter 9: Seemingly Causal Models Based on Numerical Variables

9.1 Introduction

9.2 The Simplest Seemingly Causal Model

9.3 General Linear Models Based on Bivariate (X, Y)

9.4 Models Based on Numerical Trivariate

9.5 Regression Analysis Using the Principal Components

9.6 Seemingly Causal Models Based on (X1, X2, Y1, Y2)

9.7 Seemingly Causal Models Based on (X1, X2, X3, Y1, Y2)

9.8 New Types of Interaction Model

9.9 Special Cases

9.10 Special Notes and Comments

Chapter 10: Factor Analysis and Latent Variables Models

10.1 Introduction

10.2 The Basic Concept of Factor Analysis

10.3 The First-Level Latent Variables

10.4 Illustrations Based on Hamsal's (2006) Data Set

10.5 Selected Cases Based on Ary Suta's (2005) Data Set

10.6 Evaluation Analysis Based on Latent Variables

Chapter 11: Application of the Stepwise Selection Methods

11.1 Introduction

11.2 The Options for the Stepwise Selection Methods

11.3 Selection Method for the Numerical Variable Regression Models

11.4 Multifactorial Stepwise Regression Models

11.5 Illustrative Stepwise Regressions Based on Mlogit.wf1

11.6 Special Notes and Comments

Chapter 12: Censored Multiple Regression Models

12.1 Introduction

12.2 Tobit Models

12.3 General Tobit Model

12.4 Zero–One Indicator of Censoring

12.5 Illustrative Cases of Censored Observations

12.6 Series for a Censoring Variable

12.7 Switching Censored Regressions

12.8 Special Notes and Comments

References

Index

This edition first published 2011

© 2011 John Wiley & Sons (Asia) Pte Ltd

Registered office

John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop, # 02-01, Singapore 129809

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.

All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as expressly permitted by law, without either the prior written permission of the Publisher, or authorization through payment of the appropriate photocopy fee to the Copyright Clearance Center. Requests for permission should be addressed to the Publisher, John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop, #02–01, Singapore 129809, tel: 65-64632400, fax: 65-64646912, email: [email protected].

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The Publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the Publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Screenshots from EViews reproduced with kind permission from Quantitative Micro Software, 4521 Campus Drive, #336, Irvine, CA 92612-2621, USA.

Library of Congress Cataloging-in-Publication Data

Agung, I Gusti Ngurah.

Cross section and experimental data analysis using EViews / I Gusti Ngurah Agung.

p. cm.

ISBN 978-0-470-82842-7 (cloth)

1. Statistics. 2. EViews (Computer file) I. Title.

HA29.A376 2011

005.5'5–dc22

2010041053

Print ISBN: 978-0-470-82842-7

ePDF ISBN: 978-0-470-82843-4

oBook ISBN: 978-0-470-82844-1

ePub ISBN: 978-0-470-82845-8

Dedicated to my wife

Anak Agung Alit Mas,

my children

Martiningsih A. Chandra, Ratnaningsih A. Lefort, and Darma Putra,

my sons in law

Aditiawan Chandra, and Eric Lefort,

my daughter in law

Refiana Andries, and

all my grandchildren

Indra, Rama, Luana, Leonard,

Agung Mas Mirah, and Agung Surya Buana

Preface

It is well known that EViews is excellent software for conducting time-series and panel data analyses. However, it has never been considered for doing cross-section data analysis. Based on my own experiences in writing several Indonesian books and papers on data analysis using SPSS, and doing a lot of experiments using EViews, I have found that EViews provides better programs or options for several statistical analysis methods than SPSS does.

Among the descriptive statistical methods, the option Equality Tests by Classification deserves particular mention, since it can easily be used to construct various descriptive statistical summaries by inserting any set of categorical and numerical variables. For inferential statistical methods, EViews provides the Wald test (which can easily be used to test various hypotheses using the model parameters), the System object (which can be used to represent a general linear model (GLM), either univariate or multivariate, a structural equation model (SEM), and a seemingly causal model (SCM); refer to Agung (2009a)), several estimation settings to conduct analysis based on instrumental variables, and STEPLS (stepwise least squares), which has a unique method, namely the combinatorial selection method. Furthermore, EViews also provides many functions, so that anyone can easily generate new series or variables, such as the simple function @Meansby(arg1, arg2 [,s]) for generating the mean of ARG1 by the categorical variable ARG2, and many advanced functions which are beyond the scope of this book.
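For readers who have not seen a by-group function before, the idea behind @Meansby can be sketched outside EViews. The following is a plain Python analogue, not EViews code: it assumes the generated series carries the group mean on every observation, and the names `means_by`, `income`, and `region` are hypothetical. The optional sample argument is not modeled.

```python
from collections import defaultdict


def means_by(values, groups):
    """Return, for each observation, the mean of `values` within its group.

    A rough analogue of a by-group mean; `values` and `groups` are
    equal-length sequences (hypothetical names, not EViews objects).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        totals[group] += value
        counts[group] += 1
    # Broadcast each group mean back to every observation of that group.
    return [totals[group] / counts[group] for group in groups]


income = [10.0, 20.0, 30.0, 50.0]           # numerical variable (ARG1)
region = ["east", "east", "west", "west"]   # categorical variable (ARG2)
print(means_by(income, region))  # [15.0, 15.0, 40.0, 40.0]
```

The result has the same length as the input, so it can be used like any other series, for example as a regressor or for computing deviations from the group mean.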

Furthermore, I have found that all types of model and method for cross-section and experimental data presented in this book can easily be applied to panel data having a large number of observed individuals or objects, by taking into account an additional time t-variable. Take note that one of the categorical variables of any of the models presented in this book could be replaced by the categorical time t variable. If the panel data have a sufficient number of time-point observations, then the time t can be used as a numerical independent variable. Finally, with regard to the piecewise regression models, the models for panel data could have the numerical time t variable together with its defined dummy variables. Thus, this book would be an excellent complement or guide for doing analysis based on panel data having a large number of observed individuals or objects.

Each chapter in this book demonstrates the simplest possible data analysis of those presented in the whole chapter. I am very confident that the simplest data analysis (such as the one- and two-way tabulations of the proportions, means, and median), the simplest linear regression of a bivariate numerical variable, and the simplest binary choice model having a single categorical independent variable in particular, as well as simple graphs, can easily be understood by all readers, especially the statistical users. Furthermore, special notes and comments are also presented for any unexpected or uncommon results. Hence, I would say that this book should be a good guide for undergraduate and graduate students in doing data analysis. And regarding this point, it is important to mention that I have advised many undergraduate students, including my children and grandsons, since 1960.

On the other hand, based on my observations, even graduate students and less-experienced researchers do not think that descriptive statistical analysis is the most useful analysis in an evaluation study. Refer to Chapter 2, specifically the illustrative empirical studies presented in Section 2.16.

It is recognized that all statistical models having numerical dependent variables and their estimation methods based on cross-section data can easily be derived from my first book (Agung, 2009a); thus, the application of those models will not be presented in detail in this book.

Furthermore, note that all models having numerical dependent variables based on any cross-section data, such as the additive and two- and three-way interaction SCMs, could easily be derived from all time-series models presented in Agung (2009a) by using two alternative methods or modifications. The same applies to the instrumental variable models (Chapter 7), nonlinear models (Chapter 10), and the nonparametric estimation methods (Chapter 11).

The first method is to delete the time t variable as well as the lags of endogenous and exogenous variables from the time-series models, and then simpler models would be obtained containing fewer variables. The second method is to replace the time t variable as well as the lags of endogenous and exogenous variables by a relevant set of numerical or dummy variables. Then, under the assumption that the corresponding path diagram is acceptable, in a theoretical sense, various cross-section models, either additive or two- and three-way interaction models, could easily be defined. However, in some cases the path diagrams should be modified to anticipate the structural relationship of the new variables. Many alternative path diagrams are presented in Agung (2009a), and additional illustrative path diagrams will be presented in this book. Finally, all methods for testing hypotheses, specifically the additional or advanced testing hypotheses presented in Agung (2009a: Chapter 9), can be applied directly.

For these reasons, the first book should be considered as a very important complement of this book. Hence, this book will mainly present illustrative data analyses based on models having categorical dependent variables, such as the binary and multinomial choice models, including the ordinal choice models, having categorical or numerical independent variables, as well as both types of independent variable. In addition, selected models having numerical dependent variables, such as the latent variable models and censored dependent variable models, are also presented.

This book contains 12 chapters.

Chapter 1 presents special notes and comments on selected theoretical concepts of statistics, which are misinterpreted by all students and less experienced researchers based on my own observations. Some of the theoretical concepts are the sample space, representative sample, statistic as a function or parameter estimator, hypothesis testing, and random variable and its theoretical normal distribution. In addition, notes on specific groups of variables and causal relationships between variables, in a theoretical sense, are presented, which should be considered even before doing data analysis. Finally, special notes and comments on the reliability and validity of instruments or tests, as well as on forecasting and the misinterpretation of some sampled statistics, are presented.

Chapter 2 presents simple descriptive statistical summaries which are considered as very important quantitative inputs for decision makers. The statistical summaries presented are frequency tables of the problem (endogenous) indicators by their relevant cause (exogenous) factors, since it is recognized that the causal relationship between variables can be studied using their tabulation based on the transformed categorical variables. In addition, unconditional and conditional measures of association based on N-way tabulation are presented. Since, with regard to N-way tabulation, EViews also provides the test statistics directly (namely chi-square and likelihood ratio statistics), then this chapter also presents the testing of hypotheses on the multiple associations, as well as on the conditional associations, between categorical variables. Furthermore, this chapter also presents special notes on the statistics-based frequency table with empty cells and incomplete tables. In addition, by defining a specific pattern of causal association between a set of categorical variables, in a theoretical sense, this chapter proposes that the N-way tabulation procedure can also be used to test their causal associations. Finally, this chapter demonstrates how to present a descriptive statistical summary based on any set of variables, with examples of empirical studies in various selected fields.

Chapter 3 presents various linear models which have a zero–one indicator as an endogenous variable and a set of dummy variables as exogenous variables generated by either a single or multiple categorical factors, called one-way proportion models. The alternative one-way proportion models presented are the regressions and the binary choice models, starting with the simplest models based on a 2 × 2 table. For the model having multiple categorical exogenous variables, it is proposed to generate a cell factor, namely CF, so that any multifactorial models can easily be presented as one-way proportion models. In addition, this chapter demonstrates that a lot of regressions and logistic functions can easily be written based on a cell-proportion tabulation, which are called subjective-regression and subjective-logistic functions. Special notes and comments on the three alternative binary choice models, namely the binary logit, probit, and extreme-value models, are presented. Finally, this chapter presents a one-way proportion model based on the N-way tabulation with empty cells and incomplete frequency tables.
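The cell factor idea can be illustrated with a small sketch. How the book actually generates CF is not shown in this excerpt, so the coding below is only one possible convention: each observed combination of factor levels receives one integer code, after which the proportion of ones of a zero-one indicator is tabulated by cell. The function and variable names are hypothetical.

```python
def cell_factor(*factors):
    """Combine several categorical factors into a single cell factor.

    Each distinct observed combination of levels gets one integer code
    (1, 2, ...), assigned in sorted order. Combinations that never occur
    in the data (empty cells) receive no code.
    """
    cells = list(zip(*factors))
    codes = {cell: i + 1 for i, cell in enumerate(sorted(set(cells)))}
    return [codes[cell] for cell in cells]


def proportions_by_cell(y, cf):
    """Proportion of ones of the zero-one indicator `y` within each cell."""
    result = {}
    for level in sorted(set(cf)):
        ys = [yi for yi, c in zip(y, cf) if c == level]
        result[level] = sum(ys) / len(ys)
    return result


a = [1, 1, 2, 2, 1, 2]   # first categorical factor
b = [1, 2, 1, 2, 1, 2]   # second categorical factor
y = [1, 0, 1, 1, 0, 0]   # zero-one endogenous indicator

cf = cell_factor(a, b)
print(cf)                          # [1, 2, 3, 4, 1, 4]
print(proportions_by_cell(y, cf))  # {1: 0.5, 2: 0.0, 3: 1.0, 4: 0.5}
```

With a single factor such as `cf`, a multifactorial tabulation reduces to a one-way tabulation, which is the simplification the paragraph above describes.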

Chapter 4 presents various two-way cell-proportion models or bi-factorial design models of a zero–one endogenous variable, in the form of multiple regressions of binary choice models, with categorical exogenous variables. The main objectives of these models are to test hypotheses on the main and interaction effects of categorical exogenous variables on zero–one endogenous variables. For illustration purposes, three types of nonhierarchical model, a full-factorial or hierarchical model, and an additive model are presented. To generalize, multifactorial design models are presented; however, they are treated as if they are bi-factorial design models. For example, based on factors A and B, three alternative nonhierarchical models with designs [A ∗ B], [A + A ∗ B], and [B + A ∗ B] and one hierarchical model with a design [A+B+A ∗ B] can be presented. For the multifactorial designs, the factors A or B can represent two cell factors, namely CF1 or CF2, which are generated based on the subsets of the multifactors. Finally, this chapter presents special multifactorial binary choice models with unexpected statistical results, which are related to the special incomplete frequency tables.

Chapter 5 mainly presents N-way or multifactorial multivariate cell-mean models; however, for data analysis, the models would be considered or presented as one-, two-, or three-way multivariate cell-mean models. In addition, the test for equality of means, variances, and medians of a single numerical variable by various types of categorical factor are presented.

Chapter 6 presents the application of two types of multinomial choice model. The first type of model is a set of one-way and multifactorial binary choice models, and the second type are the one-way and multifactorial ordered choice models. In addition, simple two- and three-level choice models are presented. Finally, special notes on the true multinomial logistic model are presented.

Chapter 7 presents the application of various binary and ordered choice models having numerical exogenous variables, or both numerical and categorical variables, called general choice models (GCMs), starting with various simple GCMs with a single numerical exogenous variable. The models can then easily be extended to a lot of choice or discrete models, both linear and nonlinear GCMs. Special notes and comments on the acceptability of estimates, in a statistical sense, are presented supported by residual analysis.

Chapter 8 demonstrates the applications of various statistical models based on an experimental data set by using the original measured variables and their transformed variables, such as the natural logarithm of the variables, and the zero–one indicators as well as ordinal variables (even though the data only have numerical variables), in addition to treatment factors. In fact, the models, such as the binary and ordered choice models, have been presented in previous chapters. Furthermore, the application of various causal models is also presented, such as MANOVA, homogeneous regression (MANCOVA), heterogeneous regressions, and the system equations, based on numerical variables.

Chapter 9 presents SCMs based mainly on numerical variables, which most likely do not have pure causal relationships, since the data used is cross-section data. The SCMs presented start with the simplest linear regression in a two-dimensional space, namely SLR_2, based on a pair of numerical variables, and the patterns of their possible relationships are graphically presented in a two-dimensional coordinate system. This is then extended to the simplest linear regression in three-dimensional space, namely SLR_3, and SLR in the k-dimensional space, namely SLR_k, for k > 3, called a hyperplane in an abstract space. Finally, various SCMs are presented that correspond to specific path diagrams which represent the causal relationships defined theoretically based on a set of numerical variables.

Chapter 10 presents illustrative examples on how to develop a latent variable based on a set of defined measurable or observable variables or attributes, by using the object Factor or Factor Analysis, which is available only in EViews 6, and the principal component method for previous versions of EViews. However, this chapter will not present the application of latent variable models in detail, since they can be considered exactly the same as all models based on numerical variables, which are presented in previous chapters and in Agung (2009a). Hence, this chapter only presents single-stage and multistage factor analysis in generating latent variables, selected latent variable models with special notes and comments, and the evaluation or policy analysis based on latent variables.

Chapter 11 demonstrates various acceptable and unexpected regressions obtained based on the same set of three search regressors, in particular based on a larger number of search regressors, by using a multistage stepwise selection method. Since it is well known that the effect of an exogenous (cause, source, upstream, or independent) variable on an endogenous (downstream, impact, or dependent) variable, in general, is theoretically dependent on other exogenous variables, then a two-way interaction model should be applied, and selected or limited three-way interactions may be used as additional independent variables if and only if the corresponding three main factors are completely correlated or associated, in a theoretical sense. As an extension of all the regressions presented, various similar regressions can easily be developed using transformed variables, such as the semilog and translog linear or nonlinear models, as well as the bounded regression models having log[(Y − L)/(U − Y)] as a dependent variable, where L and U are the lower and upper bounds of any numerical endogenous variable Y.
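The bounded transformation log[(Y − L)/(U − Y)] mentioned above can be sketched in a few lines. This is a minimal Python illustration under the assumption that Y lies strictly between the bounds L and U; the function names are hypothetical and nothing here is EViews syntax. The inverse maps any real value back into the interval (L, U), which is why predictions from such a model can never leave the bounds.

```python
import math


def bounded_logit(y, lower, upper):
    """Transform y in (lower, upper) to log[(y - lower) / (upper - y)]."""
    if not lower < y < upper:
        raise ValueError("y must lie strictly between the bounds")
    return math.log((y - lower) / (upper - y))


def inverse_bounded_logit(z, lower, upper):
    """Map a real value z back into the open interval (lower, upper)."""
    # Algebraically equal to (lower + upper * exp(z)) / (1 + exp(z)).
    return lower + (upper - lower) / (1.0 + math.exp(-z))


z = bounded_logit(7.5, 0.0, 10.0)           # log(3)
print(z)
print(inverse_bounded_logit(z, 0.0, 10.0))  # approximately 7.5
```

Observations that sit exactly on a bound cannot be transformed, so they would need separate treatment before such a regression is estimated.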

Finally, Chapter 12 presents unexpected empirical findings based on various Tobit and censored regression models, starting from the simplest models, such as the censored mean and cell-mean models. Censored regression models with one and two numerical variables are then introduced by using the fitted or index value variables, either as dependent or independent variables. Finally, it is found that ARCH and GARCH/TARCH models should be applied based on a switching censored regression.

In addition, I present special notes and comments, most of which are not presented in books on statistics or research methods. Thus, it is expected that readers will have a better picture of the limitations of the various models, as well as the problems found in doing data analysis, specifically in obtaining alternative acceptable or good-fit models based on sampled data that happen to be selected or are available for the researcher. Refer to the notes and comments about a sample presented in Chapter 1. Furthermore, readers are advised to read the special notes and comments presented in Agung (2009a).

I wish to express my gratitude to the Graduate School of Management, Faculty of Economics, University of Indonesia, and the Ary Suta Center, Jakarta, for providing a rich intellectual environment and facilities that were indispensable for the writing of this text. I would also like to thank the staff of the Graduate School of Management, specifically Tridianto Subagio and Asep Saepul Hayat, who gave great help whenever I had problems with the software while writing the draft of this book on the PC.

In the process of writing this applied statistical book in English, I am indebted to my daughter, Ningsih Agung Chandra, and my son, Darma Putera, for their time in correcting my English. My daughter has a Bachelor of Science from the Department of Biostatistics, School of Public Health (BSPH), the University of North Carolina at Chapel Hill, USA, and a Master's Degree in Communication Studies (MSi) from the London School of Public Relations – Jakarta (LSPR). Now, she is a senior lecturer and thesis coordinator of the graduate program, as well as advisor of both undergraduate and graduate programs at LSPR. In addition, she is also the PR & Communication Manager of the Macau Government Tourist Office (MGTO) Representative in Indonesia, and her profile can also be found through Google by typing her complete name – Martiningsih Agung Chandra. My son has an MBA from De La Salle University, Philippines, and a BSc in Management from Adamson University, Philippines, and he had been in the USA for more than 5 years while I was studying in the USA. Now, he is the director of Pure Technology Indonesia, which has a company (Pure Technology Philippines) in Makati, as a subsidiary of Pure Technology Indonesia.

Finally, I would like to thank Dr Esther Levy, the reviewers, editors, and all the staff at John Wiley & Sons (Asia) Pte Ltd for their hard work in getting this book, and my first book, Time Series Data Analysis Using EViews, to publication.

Chapter 1

Misinterpretation of Selected Theoretical Concepts of Statistics

1.1 Introduction

It is recognized that most undergraduate and graduate students do not have sufficient knowledge of the basic theoretical concepts of statistics or mathematical statistics in general, such as the concepts of sample space, representative sample, a statistic as a parameter estimator, hypothesis testing, and a random variable and its theoretical (normal) distribution. Thus, they think that they have to prove the theoretical concepts of statistics by using only a sample data set, for instance, that the error terms of statistical models have independent and identical normal distributions. In fact, some of them even think that the cross-section data of a numerical variable should be tested for normality before doing further data analysis.

On the other hand, they think that they have to select a representative sample for their theses or dissertations. This does not have a clear meaning, as in fact there is no sampling method or guide on how to select a representative sample. Agung (1992a) states that it is better not to use the term “representative sample” anymore, since it can be misleading. It is well known that researchers will most likely select a nonprobability sample, specifically a convenience sample, where each respondent has a convenient time for giving “good” responses. In other words, most sample surveys do not use pure random samples, since researchers never take into account a complete list of the population. By contrast, a random sample is basically defined as a sample where each individual in the population has an equal probability of being selected. In fact, even some of my graduate students have had to use a special sampling method for their theses and dissertations, called the friendship sampling method (Agung, 2008a), either because they had to interview managers or high-ranking persons, who have very limited time and most likely do not want to participate in the study, or because they were using their friends as the research objects.
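The definition of a random sample given above presupposes a complete list of the population. A minimal Python sketch, assuming such a list is available (the names `frame` and `simple_random_sample` are hypothetical), shows how little is needed once the list exists: drawing n of N individuals without replacement gives every individual the same inclusion probability n/N.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct individuals from a complete list of the population.

    Every subset of size n is equally likely, so each individual is
    selected with probability n / len(frame).
    """
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)


# Hypothetical complete list of a population of 100 individuals.
frame = [f"person_{i}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=2011)
print(len(sample), len(set(sample)))  # 10 10
```

The practical difficulty described in the text is therefore not the drawing itself but obtaining the complete list, which is exactly what convenience and friendship samples lack.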

Furthermore, it is recognized that most books in applied statistics do not present or discuss the sample space with simple and detailed illustrations. Hence, students or readers never clearly know the limitation of a sample data set for estimating the true value of population parameters. On the other hand, several sampled statistics and related ideas can be misinterpreted, such as the belief that a causal relationship between a pair of variables should be proven using a simple regression, the standard error of a variable, the belief that a sample size has to be estimated using a statistical formula, the reliability coefficient Cronbach's α, which in fact is a consistency coefficient, and the validity of a data collection instrument, all of which are in fact computed based on a sample of individuals that happen to be selected by the researchers.

For this reason, the following sections present some notes and comments on selected theoretical concepts of statistics, as well as sampled statistical values, which are considered as very important supporting knowledge in giving values to the statistical results based on a sample data set.

1.2 What is a Population?

It has been recognized that a population can be thought of as a complete set of individuals, a complete set of characteristics or variables, or as a complete set of scores, values, or measurements of variables. For these reasons, the following alternative definitions of a population are proposed. On the other hand, a hypothetical population will be introduced later, corresponding to any nonrandom samples which have been used in most or almost all sample survey researches.

Definition 1.1

A population is defined as a complete set of all individuals having specific characteristics defined by a researcher, such that each individual can be perfectly classified into whether or not the individual is a member of the population.

Definition 1.2

A population is defined as a complete set of all possible characteristics or variables of the observed individuals.

Definition 1.3

A variable is a characteristic of a set of individuals, which can have different scores/values/measurements for different individuals in the set.

Definition 1.4

A population is a complete set of multidimensional quantitative and qualitative scores, values, or measurements of all possible variables, which could give complete data or information to a researcher. In other words, a population is a complete set of quantitative and qualitative scores/values/measurements of all possible defined variables.
