This book summarizes the results of various models under normal theory with a brief review of the literature. Statistical Inference for Models with Multivariate t-Distributed Errors:
* Includes a wide array of applications for the analysis of multivariate observations
* Emphasizes the development of linear statistical models with applications to engineering, the physical sciences, and mathematics
* Contains an up-to-date bibliography featuring the latest trends and advances in the field to provide a collective source for research on the topic
* Addresses linear regression models with non-normal errors, with practical real-world examples
* Uniquely addresses regression models with Student's t-distributed errors and t-models
* Is supplemented with an Instructor's Solutions Manual, available by written request from the Publisher
Contents
Cover
Half Title page
Title page
Copyright page
Dedication
List of Figures
List of Tables
Preface
Glossary
List of Symbols
Chapter 1: Introduction
1.1 Objective of the Book
1.2 Models under Consideration
1.3 Organization of the Book
1.4 Problems
Chapter 2: Preliminaries
2.1 Normal Distribution
2.2 Chi-Square Distribution
2.3 Student’s t-Distribution
2.4 F-Distribution
2.5 Multivariate Normal Distribution
2.6 Multivariate t-Distribution
2.7 Problems
Chapter 3: Location Model
3.1 Model Specification
3.2 Unbiased Estimates of θ and σ² and Test of Hypothesis
3.3 Estimators
3.4 Bias and MSE Expressions of the Location Estimators
3.5 Various Estimates of Variance
3.6 Problems
Chapter 4: Simple Regression Model
4.1 Introduction
4.2 Estimation and Testing of η
4.3 Properties of Intercept Parameter
4.4 Comparison
4.5 Numerical Illustration
4.6 Problems
Chapter 5: ANOVA
5.1 Model Specification
5.2 Proposed Estimators and Testing
5.3 Bias, MSE, and Risk Expressions
5.4 Risk Analysis
5.5 Problems
Chapter 6: Parallelism Model
6.1 Model Specification
6.2 Estimation of the Parameters and Test of Parallelism
6.3 Bias, MSE, and Risk Expressions
6.4 Risk Analysis
6.5 Problems
Chapter 7: Multiple Regression Model
7.1 Model Specification
7.2 Shrinkage Estimators and Testing
7.3 Bias and Risk Expressions
7.4 Comparison
7.5 Problems
Chapter 8: Ridge Regression
8.1 Model Specification
8.2 Proposed Estimators
8.3 Bias, MSE, and Risk Expressions
8.4 Performance of the Estimators
8.5 Choice of Ridge Parameter
8.6 Problems
Chapter 9: Multivariate Models
9.1 Location Model
9.2 Testing of Hypothesis and Several Estimators of the Location Parameter
9.3 Bias, Quadratic Bias, MSE, and Risk Expressions
9.4 Risk Analysis of the Estimators
9.5 Simple Multivariate Linear Model
9.6 Problems
Chapter 10: Bayesian Analysis
10.1 Introduction (Zellner’s Model)
10.2 Conditional Bayesian Inference
10.3 Matrix Variate t-Distribution
10.4 Bayesian Analysis in Multivariate Regression Model
10.5 Problems
Chapter 11: Linear Prediction Models
11.1 Model and Preliminaries
11.2 Distribution of SRV and RSS
11.3 Regression Model for Future Responses
11.4 Predictive Distributions of FRV and FRSS
11.5 An Illustration
11.6 Problems
Chapter 12: Stein Estimation
12.1 Class of Estimators
12.2 Preliminaries and Some Theorems
12.3 Superiority Conditions
12.4 Problems
References
Author Index
Subject Index
Statistical Inference for Models with Multivariate t-Distributed Errors
Copyright © 2014 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic format. For information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
Saleh, A. K. Md. Ehsanes, author.
Statistical inference for models with multivariate t-distributed errors / A.K. Md. Ehsanes Saleh, Department of Mathematics and Statistics, Carleton University, Ottawa, Canada; M. Arashi, Department of Mathematical Sciences, Shahrood University of Technology, Shahrood, Iran; S.M.M. Tabatabaey, Department of Statistics, Ferdowsi University of Mashhad, Mashhad, Iran.
pages cm
Includes bibliographical references and index.
ISBN 978-1-118-85405-1 (hardback)
1. Regression analysis. 2. Multivariate analysis. I. Arashi, M. (Mohammad), 1981– author. II. Tabatabaey, S. M. M., author. III. Title.
QA278.2.S254 2014
519.5'36—dc23
2014007304
To our wives
SHAHIDARA SALEH
REIHANEH ARASHI
IN LOVING MEMORY OF PARI TABATABAEY
LIST OF FIGURES
2.1 Bivariate t-distribution
3.1 Plot of n(h) vs n(d)
3.2 Quadratic relation in (3.5.20)
3.3 Plots of MSE functions (d.f. increasing from top to bottom)
4.1 Graph of bias function for PTE
4.2 Graph of bias function for PTE
4.3 Graph of bias function for SE
4.4 Graph of risk function for UE and RE
4.5 Graph of risk function for PTE
4.6 Graph of risk function for PTE
4.7 Graph of risk function for SE
4.8 Graph of MRE (RE vs UE)
4.9 Graph of MRE (PTE vs UE)
4.10 Graph of MRE (PTE vs UE)
4.11 Graph of MRE (SE vs UE)
7.1 Risk performance
8.1 Risk of PRSRRE, SRRE and URRE
8.2 Risk of PTRRE and URRE
8.3 Risk of estimators
8.4 Contact time versus reactor temperature
8.5 Risk performance based on both Δ² and k
8.6 Ridge risk performance based on Δ²
8.7 Ridge risk performance based on Δ²
8.8 Scatter plot of variables versus each other
8.9 Risk performance based on both Δ² and k
8.10 Ridge trace for the data using four regressors
11.1 Prediction distribution for different future sample sizes
LIST OF TABLES
1.1 Comparison of quantiles of N(0,1) and Student’s t-distribution
3.1 Upper bound (U.B.) of Δ2 for which outperforms
4.1 Maximum and minimum guaranteed efficiencies for n=6
8.1 Acetylene data
8.2 Estimated values of ridge estimators
8.3 Estimated values
8.4 Correlation matrix
8.5 Estimated values
8.6 Coefficients at various values of k
PREFACE
In problems of statistical inference, it is customary to use the normal distribution as the basis of statistical analysis. Many results related to univariate analysis can be extended to multivariate analysis using the multidimensional normal distribution. Fisher (1956) pointed out, from his experience with the analysis of Darwin's data, that a slight change in the specification of the distribution may play havoc with the resulting inferences. To overcome this problem, statisticians have tried to broaden the scope of the distributions used and still reach reasonable inferential conclusions. Zellner (1976) introduced the idea of using Student's t-distribution, which can accommodate heavier-tailed distributions in a reasonable way and produce robust inference procedures for applications. Most of the research with Student's t-distribution, so far, has focused on the agreement of the results with those of normal theory. For example, the maximum likelihood estimator of the location parameter agrees with the mean vector, as under a normal distribution. Similarly, the likelihood ratio test under Student's t-distribution has the same null distribution as under the normal distribution. This book is an attempt to fill the gap in statistical inference on linear models based on multivariate t-errors.
This book consists of 12 chapters. Chapter 1 summarizes the results of various models under normal theory with a brief review of the literature. Chapter 2 contains the basic properties of various known distributions and opens the discussion of the multivariate t-distribution with its basic properties. We begin Chapter 3 with a discussion of the statistical analysis of a location model, starting from estimation of the intercept. We include the preliminary test and shrinkage-type estimators of the location parameter. We also include a discussion of various estimators of variance and their statistical properties. Chapter 4 covers estimation and testing of the slope and intercept parameters of a simple linear regression model. Chapter 5 is devoted to ANOVA models. Chapter 6 deals with the parallelism model in the same spirit. The multiple regression model is the subject of Chapter 7, and ridge regression is dealt with in Chapter 8. Statistical inference for multivariate models and simple multivariate linear models is discussed in Chapter 9. The Bayesian viewpoint on multivariate t-models is discussed in Chapter 10. The statistical analysis of linear prediction models is included in Chapter 11. The book concludes with Chapter 12, devoted to Stein estimation.
The aim of the book is to provide a clear and balanced introduction to inference techniques using Student's t-distribution for students and teachers alike in mathematics, statistics, engineering, and biostatistics programs, among other disciplines. The prerequisite for this book is a modest background in statistics, preferably having used textbooks such as Introduction to Probability and Statistics by Rohatgi and Saleh (2001) and Introduction to Mathematical Statistics by Hogg, McKean, and Craig (2012), together with some exposure to Theory of Preliminary Test and Stein-Type Estimation with Applications by Saleh (2006).
After the preliminary chapters, we begin with the location model in Chapter 3, detailing the mathematical development of the theory using the multivariate t-distribution as the error distribution, and we proceed from chapter to chapter, raising the level of discussion through various topics of applied statistics.
The content of the book may be covered in two semesters. Various problems are given at the end of every chapter to enhance the knowledge and application of the theory with multivariate t-distribution.
We wish to thank Ms. Mina Noruzirad (Shahrood University) for diligently typesetting the manuscript, reading several chapters, and producing most of the graphs and tables that appear in the book. Without her help, the book could not have been completed in time.
A.K.MD.EHSANES SALEH
Carleton University, Canada
M. ARASHI
Shahrood University, Iran
S. M. M. TABATABAEY
Ferdowsi University of Mashhad, Iran
June 2014
GLOSSARY
ANOVA: analysis of variance
BLF: balanced loss function
BTE: Baranchik-type estimator
C.L.T.: central limit theorem
cdf: cumulative distribution function
d.f.: degrees of freedom
dim: dimensional
FRV: future regression vector
HPD: highest posterior density
iff: if and only if
JSE: James-Stein estimator
LR: likelihood ratio
LRC: likelihood ratio criterion
LSE: least squares estimator
MLE: maximum likelihood estimator
MRE(.;.): mean square error based relative efficiency
MSE: mean square error
NCP: natural conjugate prior
p.d.: positive definite
OLS: ordinary least squares
pdf: probability density function
RE: restricted estimator
RLSE: restricted least-square estimator
RMLE: restricted maximum likelihood estimator
RRE: restricted rank estimator
RRE: risk-based relative efficiency
RUE: restricted unbiased estimator
R(.;.): relative efficiency
r.v.: random variable
PRSE: positive rule Stein-type estimator
PRSRRE: positive-rule Stein-type ridge regression estimator
PTE: preliminary test estimator
PTMLE: preliminary test maximum likelihood estimator
PTLSE: preliminary test least-square estimator
PTRRE: preliminary test ridge regression estimator
RRRE: restricted ridge regression estimator
RSS: residual sum of squares
SE: Stein-type estimator
SMLE: Stein-type maximum likelihood estimator
SLSE: Stein-type least-square estimator
SRRE: Stein-type ridge regression estimator
SRV: sample regression vector
SSE: Stein-type shrinkage estimator
UE: unrestricted estimator
ULSE: unrestricted least-square estimator
UMLE: unrestricted maximum likelihood estimator
URE: unrestricted rank estimator
URRE: unrestricted ridge regression estimator
UUE: unrestricted unbiased estimator
LIST OF SYMBOLS
UE
RE
PTE
SE
PRSE
b(.): bias
R(.): risk
Δ²/Δ²*: noncentrality parameter
test statistic
γo: d.f.
Vn/Vp: known scale matrix
1n: n-tuple vector of ones
Borel measurable function
ϕ(.): pdf of the standard normal distribution
Φ(.): cdf of the standard normal distribution
|a|: absolute value of the scalar a
||a||: norm of a
|A|: determinant of the matrix A
Γ(.): gamma function
Γp(.): multivariate gamma function
B(.,.): beta function
J(t → z): Jacobian of the transformation t to z
dominates
⊗: Kronecker product
vec: vectorial operator
Re(a)/Re(A): real part of a/A
Ω1: unrestricted parameter space
Ω0: restricted parameter space
Diag: diagonal matrix
Chmax(A)/λ1(A): largest eigenvalue of the matrix A
Chmin(A)/λp(A): smallest eigenvalue of the matrix A
N(.,.): univariate normal distribution
Np(.,.): p-variate normal distribution
χ²p: chi-square distribution with p d.f.
χ²p(Δ²): noncentral chi-square distribution with p d.f.
M(1)t(.,.,.): Student's t-distribution
M(p)t(.,.,.): p-variate t-distribution
IG(.,.): inverse gamma distribution
Fγ1,γ2: F-distribution
Fγ1,γ2(Δ²): noncentral F-distribution with noncentrality parameter Δ²
Chapter 1: Introduction
1.1 Objective of the Book
1.2 Models under Consideration
1.3 Organization of the Book
1.4 Problems
1.1 Objective of the Book
The classical theory of statistical analysis is primarily based on the assumption that the errors of various models are normally distributed. The normal distribution is also the basis of the (i) chi-square, (ii) Student's t-, and (iii) F-distributions. Fisher (1956) pointed out that slight differences in the specification of the distribution of the model errors may play havoc with the resulting inferences. To examine the effects on inference, Fisher (1960) analyzed Darwin's data under normal theory and later under a symmetric non-normal distribution. Many researchers have since investigated the influence on inference of distributional assumptions differing from normality. Further, it has been observed that most economic and business data, e.g., stock return data, exhibit long-tailed distributions. Accordingly, Fraser and Fick (1975) analyzed Darwin's data and Blattberg and Gonedes (1974) analyzed stock returns using the family of Student's t-distributions to record the effect of the distributional assumption compared with the normal theory analysis. Soon after, Zellner (1976) considered analyzing stock return data with a simple regression model, assuming the errors to follow a multivariate t-distribution. He showed that dependent but uncorrelated responses can be analyzed using the multivariate t-distribution, and he discussed the differences as well as the similarities of the results in both classical and Bayesian contexts for multivariate normal and multivariate t-based models.
Fraser (1979, p. 37) emphasized that the normal distribution is extremely short-tailed and thus unrealistic as a sole distribution for describing variability. He demonstrated the robustness of the Student's t-family, as opposed to the normal distribution, on the basis of numerical studies. In justifying the appropriateness and the essence of the use of Student's t-distribution, Prucha and Kelejian (1984) pointed out that normal-model-based analysis (i) is generally very sensitive to deviations from its assumptions, (ii) places too much weight on outliers, (iii) fails to utilize sample information beyond the first two moments, and (iv) must appeal to the central limit theorem for asymptotic approximations.
Table 1.1 shows the differences in quantiles of the standard normal and Student’s t-distributions with selected degrees of freedom.
Table 1.1 Comparison of quantiles of N(0,1) and Student’s t-distribution
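For readers who wish to reproduce this kind of comparison numerically, the following sketch (our own illustration using Python with scipy; the degrees of freedom shown are examples and not necessarily those of the printed table) tabulates upper quantiles of N(0,1) against Student's t, making the heavier tails of the t-distribution visible:

```python
# Illustrative sketch: upper quantiles of N(0,1) vs. Student's t for a few
# degrees of freedom. The chosen d.f. are examples, not the printed Table 1.1.
from scipy.stats import norm, t

probs = [0.90, 0.95, 0.975, 0.99]
dfs = [5, 10, 30]

print("p        N(0,1)   " + "   ".join(f"t({nu})" for nu in dfs))
for p in probs:
    cells = [f"{norm.ppf(p):7.3f}"] + [f"{t.ppf(p, nu):7.3f}" for nu in dfs]
    print(f"{p:<8} " + "   ".join(cells))
```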
The objective of this book is to present systematic analytical results for various linear models with error distributions belonging to the class of multivariate t. The book will cover results involving (i) estimation, (ii) test of hypothesis, and (iii) improved estimation of location/regression parameters as well as scale parameters.
We shall consider models whose error distribution is multivariate t. The explicit form of the multivariate t-distribution is given by
(1.1.1)   f(ε) = [Γ((n + γo)/2) / (Γ(γo/2)(πγo)^{n/2}|σ²Vn|^{1/2})] [1 + (1/γo) ε′(σ²Vn)⁻¹ε]^{-(n+γo)/2},
where Vn is a known positive definite matrix of rank n and γo > 2. The mean vector and the dispersion matrix of (1.1.1) are
(1.1.2)   E(ε) = 0   and   Cov(ε) = (γo/(γo − 2)) σ²Vn.
This distribution will be denoted by M_t^(n)(0, σ²Vn, γo) and will be called the n-dim t-distribution.
In a nutshell, this distribution may be obtained as a scale mixture of the normal distribution Nn(0, tσ²Vn) over an inverse gamma distribution for the mixing variable t, given by
(1.1.3)   f(ε) = ∫₀^∞ (2πtσ²)^{-n/2}|Vn|^{-1/2} exp(−ε′Vn⁻¹ε/(2tσ²)) [(γo/2)^{γo/2}/Γ(γo/2)] t^{-(γo/2+1)} exp(−γo/(2t)) dt,
where Γ(·) is the gamma function.
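As a quick numerical check of this mixture representation (a sketch under our own illustrative choices of n, γo, σ², and Vn), one can draw ε by first sampling the inverse-gamma scale t and then the conditional normal, and compare the sample dispersion with (γo/(γo − 2))σ²Vn:

```python
# Sketch: simulate the n-dim t-distribution as a scale mixture of normals and
# check the dispersion matrix (gamma_o/(gamma_o - 2)) * sigma^2 * V.
# All numerical values below are illustrative choices, not taken from the book.
import numpy as np

rng = np.random.default_rng(0)
n, gamma_o, sigma2 = 3, 6.0, 2.0
V = np.array([[1.0, 0.3, 0.0],
              [0.3, 1.0, 0.2],
              [0.0, 0.2, 1.0]])                       # known p.d. scale matrix

N = 200_000
# t ~ inverse gamma with shape gamma_o/2 and scale gamma_o/2,
# i.e. 1/t ~ Gamma(shape = gamma_o/2, rate = gamma_o/2)
t = 1.0 / rng.gamma(shape=gamma_o / 2, scale=2.0 / gamma_o, size=N)
z = rng.multivariate_normal(np.zeros(n), V, size=N)   # N_n(0, V) draws
eps = np.sqrt(t * sigma2)[:, None] * z                # eps | t ~ N_n(0, t*sigma2*V)

print(np.cov(eps, rowvar=False))                      # empirical dispersion
print(gamma_o / (gamma_o - 2) * sigma2 * V)           # theoretical dispersion (1.1.2)
```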
1.2 Models under Consideration
In this section, we consider some basic statistical models that are frequently used in applied statistical and econometric analysis, together with preliminary results on estimation and testing of hypotheses under normal theory. These results will be used to discuss the estimation and test of hypothesis problems under the multivariate t-distribution in later chapters.
Consider the location model
(1.2.1)
An unrestricted estimator (UE) of θ is given by the least-square/MLE method as
(1.2.2)
while the unrestricted unbiased estimator (UUE) of σ² is given by
(1.2.3)
Under normal theory, the exact distribution of the UE is normal, and the exact distribution of mS²u/σ² is a chi-square distribution with m degrees of freedom (d.f.).
(1.2.4)
The exact distribution of the test statistic under Ho is the central F-distribution with (1, m) d.f.
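To make the location-model quantities concrete, here is a small sketch for the simplest i.i.d. special case Y_i = θ + e_i (our own illustration; the book's formulation may involve a general scale matrix Vn), computing the UE of θ, the UUE of σ², and the F-type statistic for Ho: θ = θo:

```python
# Sketch: UE of theta, UUE of sigma^2, and the F(1, m) test of H_o: theta = theta_o
# in the i.i.d. location model Y_i = theta + e_i (illustrative special case only).
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(1)
n, theta_o = 20, 0.0
y = theta_o + rng.standard_normal(n)            # simulated sample

theta_tilde = y.mean()                          # unrestricted estimator (UE)
m = n - 1
s2_u = np.sum((y - theta_tilde) ** 2) / m       # unrestricted unbiased estimator (UUE)

L_n = n * (theta_tilde - theta_o) ** 2 / s2_u   # follows F(1, m) under H_o
print(theta_tilde, s2_u, L_n, f.sf(L_n, 1, m))  # estimate, variance, statistic, p-value
```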
Consider a simple linear model
(1.2.5)
(1.2.6)
where , and
(1.2.7)
with the covariance matrix σ²K⁻¹, where
(1.2.8)
The exact distribution of the LSE follows a bivariate normal law. Also, the unbiased estimator of σ² is S²u, given by
(1.2.9)
The quantity mS²u/σ² follows a chi-square distribution with m d.f.
(1.2.10)
(1.2.11)
The exact distribution of the first test statistic under Ho follows the same central F-distribution with (1, m) d.f.
(1.2.12)
The exact distribution of the second test statistic under Ho follows the central F-distribution with (2, m) d.f.
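For concreteness, the following sketch computes the LSE of the intercept and slope, the unbiased estimator of σ², and F-type statistics for the simple i.i.d. regression model y_i = θ + βx_i + e_i (our own illustration; the specific hypotheses tested here, β = 0 and (θ, β)′ = 0, are chosen for demonstration since the displayed equations are not reproduced above):

```python
# Sketch: LSE of (theta, beta), unbiased estimator of sigma^2, and F-type tests
# in the simple i.i.d. regression model y_i = theta + beta*x_i + e_i (illustrative).
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(4)
n = 15
x = np.linspace(0, 1, n)
y = 1.0 + 2.0 * x + rng.standard_normal(n)

X = np.column_stack([np.ones(n), x])           # design matrix for (theta, beta)
K = X.T @ X
eta_tilde = np.linalg.solve(K, X.T @ y)        # LSE of eta = (theta, beta)'
m = n - 2
s2_u = np.sum((y - X @ eta_tilde) ** 2) / m    # unbiased estimator of sigma^2

K_inv = np.linalg.inv(K)
F_slope = eta_tilde[1] ** 2 / (s2_u * K_inv[1, 1])   # F(1, m) under H_o: beta = 0
F_both = (eta_tilde @ K @ eta_tilde) / (2 * s2_u)    # F(2, m) under H_o: eta = 0
print(eta_tilde, s2_u, F_slope, f.sf(F_slope, 1, m), F_both, f.sf(F_both, 2, m))
```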
Suppose that the response vector Y is modeled as
(1.2.13)
where
(1.2.14)
(1.2.15)
(1.2.16)
(1.2.17)
We assume that
(1.2.18)
According to the LSE/MLE principle, the unrestricted estimators of θ and σ² are given by
(1.2.19)
and
(1.2.20)
respectively. Moreover, under normal theory, the LSE of θ is distributed as Np(θ, σ²K⁻¹), and mS²u/σ² follows the chi-square distribution with m d.f., independently of the LSE.
(1.2.21)
where
(1.2.22)
Consider the n-dim (dimensional) response vector Y modeled as
(1.2.23)
where
(1.2.24)
(1.2.25)
(1.2.26)
(1.2.27)
and
(1.2.28)
The LSE of θ and β are given by
(1.2.29)
where
(1.2.30)
The covariance matrix of the LSE is given by σ²K⁻¹, where
(1.2.31)
Normal theory results lead to the distribution of the LSE as N2p(η, σ²K⁻¹).
The unbiased estimator of σ² is defined by
(1.2.32)
and mS²u/σ² follows the chi-square distribution with m d.f., independently of the LSE, under normal theory.
(1.2.33)
Under Ho, the test statistic follows the central F-distribution with (p, m) d.f. On the other hand, if β is unknown but equal to βo1p, where βo is an unknown scalar, then the common estimator of βo is given by
(1.2.34)
and the test-statistic is defined by
(1.2.35)
The exact distribution of the test statistic under Ho follows the central F-distribution with (q, m) d.f.
Consider the multiple regression model
(1.2.36)
The LSE/MLE of β is then given by
(1.2.37)
where , and the unbiased estimator of σ² is given by
(1.2.38)
where mS²u/σ² follows the chi-square distribution with m d.f., independently of the LSE, under normal theory.
(1.2.39)
(1.2.40)
If, in the model (1.2.36), X′Vn⁻¹X is a near-singular or ill-conditioned matrix that prevents its reliable inversion, then multicollinearity may be present among the regressors: X′Vn⁻¹X has rank less than p, or one or more of its characteristic roots are small, leading to a large variance of the LSE. Thus, following Hoerl and Kennard (1970), one may define the "Ridge estimator" of β as
(1.2.41)   β̂n(k) = (X′Vn⁻¹X + kIp)⁻¹ X′Vn⁻¹Y,   k > 0.
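As a computational sketch of the ridge idea (our own illustration with simulated, nearly collinear regressors, Vn = In, and an arbitrary ridge constant k), the ridge estimator simply adds kIp to the cross-product matrix before solving the normal equations:

```python
# Sketch: unrestricted LSE vs. a Hoerl-Kennard-type ridge estimator with V_n = I_n.
# The data, true coefficients, and ridge constant k are illustrative only.
import numpy as np

rng = np.random.default_rng(2)
n, p, k = 50, 4, 1.0
X = rng.standard_normal((n, p))
X[:, 3] = X[:, 2] + 0.01 * rng.standard_normal(n)          # induce near collinearity
beta = np.array([1.0, -2.0, 0.5, 0.0])
y = X @ beta + rng.standard_normal(n)

XtX = X.T @ X
beta_lse = np.linalg.solve(XtX, X.T @ y)                    # unrestricted LSE
beta_ridge = np.linalg.solve(XtX + k * np.eye(p), X.T @ y)  # ridge estimator
print(beta_lse)
print(beta_ridge)
```

In the ill-conditioned case the ridge solution is stabilized at the cost of some bias, which is the trade-off studied in Chapter 8.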
Let Y1, Y2, …, YN be N observation vectors of dimension p satisfying the model
(1.2.42)
(1.2.43)
(1.2.44)
(1.2.45)
An improved estimator of θ is known to be
(1.2.46)
See, for example, James and Stein (1961) and Saleh (2006).
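For concreteness, here is a minimal sketch of the classical James-Stein shrinkage for the textbook special case Y ~ Np(θ, Ip), p ≥ 3 (our own illustration; the estimator in (1.2.46) is stated in a more general setting):

```python
# Sketch: classical James-Stein estimator for theta when Y ~ N_p(theta, I_p), p >= 3.
# This illustrates the shrinkage idea only; it is not the general form of (1.2.46).
import numpy as np

rng = np.random.default_rng(3)
p = 10
theta = np.full(p, 0.5)                       # true mean vector (illustrative)
y = theta + rng.standard_normal(p)            # single p-variate observation

shrinkage = 1.0 - (p - 2) / np.dot(y, y)      # shrink y toward the origin
theta_js = shrinkage * y
print(np.sum((y - theta) ** 2))               # squared error of the MLE
print(np.sum((theta_js - theta) ** 2))        # squared error of the JS estimator
```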
Consider the simple linear model
(1.2.47)
The point estimators of θ and β are given by
(1.2.48)
with .
The point estimator of Σ is given by
(1.2.49)
(1.2.50)
which follows the central F-distribution with (p, m) d.f.
1.3 Organization of the Book
This book consists of twelve chapters. Chapter 1 summarizes the results of various models under normal theory with a brief review of the literature. Chapter 2 contains the basic properties of various known distributions and opens the discussion of the multivariate t-distribution with its basic properties. We open Chapter 3 with the statistical analysis of a location model, from estimation of the intercept and slope to tests of hypotheses on the parameters. We also add the preliminary test and shrinkage-type estimators of the three parameters, including estimation of the scale parameter of the model, while Chapter 4 contains similar details for a simple regression model. Chapter 5 is devoted to ANOVA models, discussing preliminary test and shrinkage-type estimators under the multivariate t-distribution, and Chapter 6 deals with the parallelism model in the same spirit. The multiple regression model is the content of Chapter 7, and ridge regression is dealt with in Chapter 8. Statistical inference for multivariate models and simple multivariate linear models is discussed in Chapter 9. The Bayesian viewpoint on multivariate t-models is discussed in Chapter 10. Finally, we discuss linear prediction models in Chapter 11. We conclude the book with Chapter 12, which contains Stein estimation as complementary results.
