Statistical Inference for Models with Multivariate t-Distributed Errors

A. K. Md. Ehsanes Saleh

Description

This book summarizes the results of various models under normal theory with a brief review of the literature. Statistical Inference for Models with Multivariate t-Distributed Errors:

* Includes a wide array of applications for the analysis of multivariate observations
* Emphasizes the development of linear statistical models with applications to engineering, the physical sciences, and mathematics
* Contains an up-to-date bibliography featuring the latest trends and advances in the field to provide a collective source for research on the topic
* Addresses linear regression models with non-normal errors, with practical real-world examples
* Uniquely addresses regression models with Student's t-distributed errors and t-models
* Is supplemented with an Instructor's Solutions Manual, available upon written request to the Publisher

Number of pages: 215

Year of publication: 2014




Contents

Cover

Half Title page

Title page

Copyright page

Dedication

List of Figures

List of Tables

Preface

Glossary

List of Symbols

Chapter 1: Introduction

1.1 Objective of the Book

1.2 Models under Consideration

1.3 Organization of the Book

1.4 Problems

Chapter 2: Preliminaries

2.1 Normal Distribution

2.2 Chi-Square Distribution

2.3 Student’s t-Distribution

2.4 F-Distribution

2.5 Multivariate Normal Distribution

2.6 Multivariate t-Distribution

2.7 Problems

Chapter 3: Location Model

3.1 Model Specification

3.2 Unbiased Estimates of θ and σ2 and Test of Hypothesis

3.3 Estimators

3.4 Bias and MSE Expressions of the Location Estimators

3.5 Various Estimates of Variance

3.6 Problems

Chapter 4: Simple Regression Model

4.1 Introduction

4.2 Estimation and Testing of η

4.3 Properties of Intercept Parameter

4.4 Comparison

4.5 Numerical Illustration

4.6 Problems

Chapter 5: Anova

5.1 Model Specification

5.2 Proposed Estimators and Testing

5.3 Bias, MSE, and Risk Expressions

5.4 Risk Analysis

5.5 Problems

Chapter 6: Parallelism Model

6.1 Model Specification

6.2 Estimation of the Parameters and Test of Parallelism

6.3 Bias, MSE, and Risk Expressions

6.4 Risk Analysis

6.5 Problems

Chapter 7: Multiple Regression Model

7.1 Model Specification

7.2 Shrinkage Estimators and Testing

7.3 Bias and Risk Expressions

7.4 Comparison

7.5 Problems

Chapter 8: Ridge Regression

8.1 Model Specification

8.2 Proposed Estimators

8.3 Bias, MSE, and Risk Expressions

8.4 Performance of the Estimators

8.5 Choice of Ridge Parameter

8.6 Problems

Chapter 9: Multivariate Models

9.1 Location Model

9.2 Testing of Hypothesis and Several Estimators of Local Parameter

9.3 Bias, Quadratic Bias, MSE, and Risk Expressions

9.4 Risk Analysis of the Estimators

9.5 Simple Multivariate Linear Model

9.6 Problems

Chapter 10: Bayesian Analysis

10.1 Introduction (Zellner’s Model)

10.2 Conditional Bayesian Inference

10.3 Matrix Variate t-Distribution

10.4 Bayesian Analysis in Multivariate Regression Model

10.5 Problems

Chapter 11: Linear Prediction Models

11.1 Model and Preliminaries

11.2 Distribution of SRV and RSS

11.3 Regression Model for Future Responses

11.4 Predictive Distributions of FRV and FRSS

11.5 An Illustration

11.6 Problems

Chapter 12: Stein Estimation

12.1 Class of Estimators

12.2 Preliminaries and Some Theorems

12.3 Superiority Conditions

12.4 Problems

References

Author Index

Subject Index

Statistical Inference for Models with Multivariate t-Distributed Errors

Copyright © 2014 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic format. For information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Saleh, A. K. Md. Ehsanes, author.
Statistical inference for models with multivariate t-distributed errors / A. K. Md. Ehsanes Saleh, Department of Mathematics and Statistics, Carleton University, Ottawa, Canada; M. Arashi, Department of Mathematical Sciences, Shahrood University of Technology, Shahrood, Iran; S. M. M. Tabatabaey, Department of Statistics, Ferdowsi University of Mashhad, Mashhad, Iran.
pages cm
Includes bibliographical references and index.
ISBN 978-1-118-85405-1 (hardback)
1. Regression analysis. 2. Multivariate analysis. I. Arashi, M. (Mohammad), 1981– author. II. Tabatabaey, S. M. M., author. III. Title.
QA278.2.S254 2014
519.5'36—dc23
2014007304

To our wives

SHAHIDARA SALEH

REIHANEH ARASHI

IN LOVING MEMORY OF PARI TABATABAEY

LIST OF FIGURES

2.1 Bivariate t-distribution

3.1 Plot of n(h) vs n(d)

3.2 Quadratic relation in (3.5.20)

3.3 Plots of MSE functions (d.f. increasing from top to bottom)

4.1 Graph of bias function for PTE

4.2 Graph of bias function for PTE

4.3 Graph of bias function for SE

4.4 Graph of risk function for UE and RE

4.5 Graph of risk function for PTE

4.6 Graph of risk function for PTE

4.7 Graph of risk function for SE

4.8 Graph of MRE (RE vs UE)

4.9 Graph of MRE (PTE vs UE)

4.10 Graph of MRE (PTE vs UE)

4.11 Graph of MRE (SE vs UE)

7.1 Risk performance

8.1 Risk of PRSRRE, SRRE and URRE

8.2 Risk of PTRRE and URRE

8.3 Risk of estimators

8.4 Contact time versus reactor temperature

8.5 Risk performance based on both Δ2 and k

8.6 Ridge risk performance based on Δ2

8.7 Ridge risks performance based on Δ2

8.8 Scatter plot of variables versus each other

8.9 Risk performance based on both Δ2 and k

8.10 Ridge trace for the data using four regressors

11.1 Prediction distribution for different future sample sizes

LIST OF TABLES

1.1 Comparison of quantiles of N(0,1) and Student’s t-distribution

3.1 Upper bound (U.B.) of Δ2 for which outperforms

4.1 Maximum and minimum guaranteed efficiencies for n=6

8.1 Acetylene data

8.2 Estimated values of ridge estimators

8.3 Estimated values

8.4 Correlation matrix

8.5 Estimated values

8.6 Coefficients at various values of k

PREFACE

In problems of statistical inference, it is customary to use the normal distribution as the basis of statistical analysis. Many results related to univariate analysis can be extended to multivariate analysis using the multidimensional normal distribution. Fisher (1956) pointed out, from his experience with the analysis of Darwin's data, that a slight change in the specification of the distribution may play havoc with the resulting inferences. To overcome this problem, statisticians tried to broaden the scope of the distributions considered while still reaching reasonable inferential conclusions. Zellner (1976) introduced the idea of using Student's t-distribution, which accommodates heavier-tailed distributions in a reasonable way and produces robust inference procedures for applications. Most of the research with Student's t-distribution so far has focused on the agreement of the results with those of normal theory. For example, the maximum likelihood estimator of the location parameter agrees with the mean vector obtained under a normal distribution. Similarly, the likelihood ratio test under Student's t-distribution has the same null distribution as under the normal distribution. This book is an attempt to fill the gap in statistical inference on linear models based on multivariate t-errors.

This book consists of 12 chapters. Chapter 1 summarizes the results of various models under normal theory with a brief review of the literature. Chapter 2 contains the basic properties of various known distributions and opens the discussion of the multivariate t-distribution with its basic properties. We begin Chapter 3 with the statistical analysis of a location model, starting from estimation of the intercept. We include the preliminary test and shrinkage-type estimators of the location parameter, as well as a discussion of various estimators of variance and their statistical properties. Chapter 4 covers estimation and testing of the slope and intercept parameters of a simple linear regression model. Chapter 5 is devoted to ANOVA models. Chapter 6 deals with the parallelism model in the same spirit. The multiple regression model is the subject of Chapter 7, and ridge regression is dealt with in Chapter 8. Statistical inference for multivariate models and simple multivariate linear models is discussed in Chapter 9. The Bayesian viewpoint on multivariate t-models is discussed in Chapter 10. The statistical analysis of linear prediction models is included in Chapter 11. The book concludes with Chapter 12, devoted to Stein estimation.

The aim of the book is to provide a clear and balanced introduction to inference techniques based on the Student's t-distribution for students and teachers alike in mathematics, statistics, engineering, and biostatistics programs, among other disciplines. The prerequisite for this book is a modest background in statistics, preferably having used textbooks such as An Introduction to Probability and Statistics by Rohatgi and Saleh (2001) and Introduction to Mathematical Statistics by Hogg, McKean, and Craig (2012), together with some exposure to Theory of Preliminary Test and Stein-Type Estimation with Applications by Saleh (2006).

After the preliminary chapters, we begin with the location model in Chapter 3, detailing the mathematical development of the theory using the multivariate t-distribution as the error distribution, and we proceed from chapter to chapter, raising the level of the discussion through various topics of applied statistics.

The content of the book may be covered in two semesters. Problems are given at the end of every chapter to reinforce understanding and application of the theory under the multivariate t-distribution.

We wish to thank Ms. Mina Noruzirad (Shahrood University) for diligently typesetting the manuscript, for reading several chapters, and for producing most of the graphs and tables that appear in the book. Without her help, the book could not have been completed in time.

A.K.MD.EHSANES SALEH

Carleton University, Canada

M. ARASHI

Shahrood University, Iran

S. M. M. TABATABAEY

Ferdowsi University of Mashhad, Iran

June 2014

GLOSSARY

ANOVA

analysis of variance

BLF

balanced loss function

BTE

Baranchik-type estimator

C.L.T

central limit theorem

cdf

cumulative distribution function

d.f.

degrees of freedom

dim

dimensional

FRV

future regression vector

HPD

highest posterior density

iff

if and only if

JSE

James-Stein estimator

LR

likelihood ratio

LRC

likelihood ratio criterion

LSE

least squares estimator

MLE

maximum likelihood estimator

MRE(.;.)

mean square error based relative efficiency

MSE

mean square error

NCP

natural conjugate prior

p.d.

positive definite

OLS

ordinary least squares

pdf

probability density function

RE

restricted estimator

RLSE

restricted least-square estimator

RMLE

restricted maximum likelihood estimator

RRE

restricted rank estimator

RRE

risk-based relative efficiency

RUE

restricted unbiased estimator

R(.;.)

relative efficiency

r.v.

random variable

PRSE

positive rule Stein-type estimator

PRSRRE

positive-rule Stein-type ridge regression estimator

PTE

preliminary test estimator

PTMLE

preliminary test maximum likelihood estimator

PTLSE

preliminary test least-square estimator

PTRRE

preliminary test ridge regression estimator

RRRE

restricted ridge regression estimator

RSS

residual sum of squares

SE

Stein-type estimator

SMLE

Stein-type maximum likelihood estimator

SLSE

Stein-type least-square estimator

SRRE

Stein-type ridge regression estimator

SRV

sum of regression vector

SSE

Stein-type shrinkage estimator

UE

unrestricted estimator

ULSE

unrestricted least-square estimator

UMLE

unrestricted maximum likelihood estimator

URE

unrestricted rank estimator

URRE

unrestricted ridge regression estimator

UUE

unrestricted unbiased estimator

LIST OF SYMBOLS

UE, RE, PTE, SE, PRSE: estimator notation (symbols introduced in the text; see the Glossary for the abbreviations)

b(.): bias

R(.): risk

Δ2 / Δ*2: noncentrality parameter

test statistics (notation introduced in the text)

γo: d.f.

Vn / Vp: known scale matrix

1n: n-tuple vector of ones

Borel measurable function (notation introduced in the text)

ϕ(.): pdf of the standard normal distribution

Φ(.): cdf of the standard normal distribution

|a|: absolute value of the scalar a

||a||: norm of a

|A|: determinant of the matrix A

Γ(.): gamma function

Γp(.): multivariate gamma function

B(.,.): beta function

J(t → z): Jacobian of the transformation from t to z

dominates (notation introduced in the text)

⊗: Kronecker product

vec: vec (vectorization) operator

Re(a) / Re(A): real part of a / A

Ω1: unrestricted parameter space

Ω0: restricted parameter space

Diag: diagonal matrix

Chmax(A) / λ1(A): largest eigenvalue of the matrix A

Chmin(A) / λp(A): smallest eigenvalue of the matrix A

N(.,.): univariate normal distribution

Np(.,.): p-variate normal distribution

χ2p: chi-square distribution with p d.f.

χ2p(Δ2): noncentral chi-square distribution with p d.f.

M(1)t(.,.,.): Student's t-distribution

M(p)t(.,.,.): p-variate t-distribution

IG(.,.): inverse gamma distribution

Fγ1,γ2: F-distribution

Fγ1,γ2(Δ2): noncentral F-distribution with noncentrality parameter Δ2

CHAPTER 1

INTRODUCTION

Outline

1.1 Objective of the Book

1.2 Models under Consideration

1.3 Organization of the Book

1.4 Problems

1.1 Objective of the Book

The classical theory of statistical analysis is primarily based on the assumption that the errors of various models are normally distributed. The normal distribution is also the basis of the (i) chi-square, (ii) Student's t-, and (iii) F-distributions. Fisher (1956) pointed out that slight differences in the specification of the distribution of the model errors may play havoc with the resulting inferences. To examine the effects on inference, Fisher (1960) analyzed Darwin's data under normal theory and later under a symmetric non-normal distribution. Many researchers have since investigated how distributional assumptions that depart from normality influence inference. Further, it has been observed that most economic and business data, e.g., stock return data, exhibit long-tailed distributions. Accordingly, Fraser and Fick (1975) analyzed Darwin's data and Blattberg and Gonedes (1974) analyzed stock returns using the family of Student's t-distributions to record the effect of the distributional assumption relative to the normal-theory analysis. Soon after, Zellner (1976) analyzed stock return data with a simple regression model, assuming the errors follow a multivariate t-distribution. He showed that dependent but uncorrelated responses can be analyzed with the multivariate t-distribution, and he discussed differences as well as similarities of the results in both classical and Bayesian contexts for multivariate normal and multivariate t-based models.

Fraser (1979, p. 37) emphasized that the normal distribution is extremely short-tailed and thus unrealistic as a sole distribution for variability, and he demonstrated, on the basis of numerical studies, the robustness of the Student's t-family as opposed to the normal distribution. In justifying the appropriateness and usefulness of the Student's t-distribution, Prucha and Kelejian (1984) pointed out that normal-model-based analysis (i) is generally very sensitive to deviations from its assumptions, (ii) places too much weight on outliers, (iii) fails to utilize sample information beyond the first two moments, and (iv) must appeal to the central limit theorem for asymptotic approximations.

Table 1.1 shows the differences in quantiles of the standard normal and Student’s t-distributions with selected degrees of freedom.

Table 1.1 Comparison of quantiles of N(0,1) and Student’s t-distribution

The objective of this book is to present systematic analytical results for various linear models with error distributions belonging to the class of multivariate t. The book will cover results involving (i) estimation, (ii) test of hypothesis, and (iii) improved estimation of location/regression parameters as well as scale parameters.

We shall consider models whose error distribution is multivariate t. The explicit form of the multivariate t-distribution is given by

(1.1.1)

where Vn is a known positive definite matrix of rank n and γo > 2. The mean vector and the dispersion matrix of (1.1.1) are

(1.1.2)

This distribution will be denoted by M(n)t (0, σ2Vn, γo) and will be called the n-dim t-distribution.
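For reference, one standard parameterization of (1.1.1) and (1.1.2), consistent with the notation M(n)t(0, σ2Vn, γo) used here (the book's own displays may arrange the constants differently), is

f(\varepsilon) \;=\; \frac{\Gamma\!\left(\tfrac{\gamma_o+n}{2}\right)}
{\Gamma\!\left(\tfrac{\gamma_o}{2}\right)\,(\gamma_o\pi)^{n/2}\,(\sigma^2)^{n/2}\,|V_n|^{1/2}}
\left(1 + \frac{\varepsilon' V_n^{-1}\varepsilon}{\gamma_o\sigma^2}\right)^{-\frac{\gamma_o+n}{2}},
\qquad
E(\varepsilon) = 0, \quad
\mathrm{Cov}(\varepsilon) = \frac{\gamma_o}{\gamma_o - 2}\,\sigma^2 V_n \;\;(\gamma_o > 2).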

In a nutshell, this distribution may be obtained as a scale mixture of the normal distribution and the inverse gamma distribution, IG(t−1, γo), given by

(1.1.3)

where Γ(.) denotes the gamma function.
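To illustrate the mixture representation, here is a minimal simulation sketch (not taken from the book; it assumes NumPy, and the function name rmvt is ours) that draws errors from M(n)t(0, σ2Vn, γo) via the equivalent chi-square scale mixture of normals and compares the empirical dispersion matrix with (γo/(γo − 2))σ2Vn:

import numpy as np

def rmvt(size, sigma2, V, df, seed=None):
    # Draw `size` error vectors from M(n)t(0, sigma2*V, df) using the
    # scale-mixture construction: eps = z / sqrt(w), where
    # z ~ N_n(0, sigma2*V) and w ~ chi-square(df)/df.
    rng = np.random.default_rng(seed)
    n = V.shape[0]
    L = np.linalg.cholesky(V)
    z = rng.standard_normal((size, n)) @ L.T * np.sqrt(sigma2)
    w = rng.chisquare(df, size=size) / df
    return z / np.sqrt(w)[:, None]

if __name__ == "__main__":
    df, sigma2 = 5.0, 2.0
    V = np.array([[1.0, 0.3],
                  [0.3, 1.0]])
    eps = rmvt(200_000, sigma2, V, df, seed=1)
    print(np.cov(eps, rowvar=False))        # empirical dispersion matrix
    print(df / (df - 2.0) * sigma2 * V)     # theoretical (df/(df-2))*sigma2*V

The two printed matrices should agree closely for a sample of this size.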

1.2 Models under Consideration

In this section, we consider some basic statistical models that are frequently used in applied statistical and econometric analyses, together with preliminary results on estimation and hypothesis testing under normal theory. These will be used to discuss estimation and tests of hypotheses under the multivariate t-distribution in later chapters.

1.2.1 Location Model

Consider the location model

(1.2.1)

An unrestricted estimator (UE) of θ is given by the least-square/MLE method as

(1.2.2)

while the unrestricted unbiased estimator (UUE) of σ2 is given by

(1.2.3)

Under normal theory, the exact distribution of the UE is normal, and the exact distribution of mS2u/σ2 is a chi-square distribution with m degrees of freedom (d.f.).

(1.2.4)

The exact distribution of the test statistic in (1.2.4) under Ho is the central F-distribution with (1, m) d.f.
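For orientation, in the simplest case Vn = In (an assumption on our part; the estimator and test-statistic symbols below are also ours), the familiar normal-theory quantities behind (1.2.2)-(1.2.4) take the form

\tilde{\theta}_n = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i,
\qquad
S_u^2 = \frac{1}{m}\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2, \quad m = n-1,
\qquad
\mathcal{L}_n = \frac{n\,(\bar{y} - \theta_o)^2}{S_u^2} \sim F_{1,m}
\;\;\text{under } H_o\!: \theta = \theta_o .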

1.2.2 Simple Linear Model

Consider a simple linear model

(1.2.5)

(1.2.6)

where , and

(1.2.7)

with the covariance matrix σ2K−1, where

(1.2.8)

Under normal theory, the exact distribution of the LSE is bivariate normal with covariance matrix σ2K−1. Also, the unbiased estimator of σ2 is S2u, given by

(1.2.9)

The distribution of mS2u/σ2 follows a chi-square distribution with m d.f.

(1.2.10)

(1.2.11)

Under Ho, the exact distribution of the test statistic in (1.2.11) is again the central F-distribution with (1, m) d.f.

(1.2.12)

Under Ho, the exact distribution of the test statistic in (1.2.12) is the central F-distribution with (2, m) d.f.
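As a sketch of the familiar normal-theory forms behind (1.2.6)-(1.2.12), again assuming Vn = In and writing m = n − 2 (the symbols below are ours):

\hat{\beta}_n = \frac{\sum_{i}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i}(x_i-\bar{x})^2},
\qquad
\hat{\theta}_n = \bar{y} - \hat{\beta}_n\bar{x},
\qquad
K = \begin{pmatrix} n & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{pmatrix},
\qquad
S_u^2 = \frac{1}{m}\sum_{i}\bigl(y_i - \hat{\theta}_n - \hat{\beta}_n x_i\bigr)^2 .

Then (θ̂n, β̂n)′ is N2((θ, β)′, σ2K−1) under normal theory; the test of Ho: β = 0 uses an F statistic with (1, m) d.f., and the joint test of (θ, β) uses one with (2, m) d.f., matching the statements above.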

1.2.3 ANOVA Model

Suppose that the response vector Y is modeled as

(1.2.13)

where

(1.2.14)

(1.2.15)

(1.2.16)

(1.2.17)

We assume that

(1.2.18)

According to the LSE/MLE principle, the unrestricted estimators of θ and σ2 are given by

(1.2.19)

and

(1.2.20)

respectively. Moreover, under normal theory, the exact distribution of the centered UE of θ is Np(0, σ2K−1), and mS2u/σ2 follows the chi-square distribution with m d.f., independently of the UE.

(1.2.21)

where

(1.2.22)
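For orientation, a minimal sketch of the one-way normal-theory ANOVA quantities behind (1.2.19)-(1.2.22), assuming Vn = In and group sizes n1, …, np with n = n1 + … + np (the symbols below are ours):

\hat{\theta}_n = (\bar{y}_{1\cdot}, \ldots, \bar{y}_{p\cdot})',
\qquad
K = \mathrm{Diag}(n_1, \ldots, n_p),
\qquad
S_u^2 = \frac{1}{m}\sum_{i=1}^{p}\sum_{j=1}^{n_i}\left(y_{ij} - \bar{y}_{i\cdot}\right)^2,
\quad m = n - p,

with θ̂n − θ distributed as Np(0, σ2K−1) and mS2u/σ2 distributed as chi-square with m d.f., independently of θ̂n.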

1.2.4 Parallelism Model

Consider the n-dim (dimensional) response vector Y modeled as

(1.2.23)

where

(1.2.24)

(1.2.25)

(1.2.26)

(1.2.27)

and

(1.2.28)

The LSEs of θ and β are given by

(1.2.29)

where

(1.2.30)

The covariance matrix of the LSE is given by σ2K−1, where

(1.2.31)

Under normal theory, the LSE of η is distributed as N2p(η, σ2K−1).

The unbiased estimator of σ2 is defined by

(1.2.32)

and mS2u/σ2 follows the chi-square distribution with m d.f., independently of the LSE, under normal theory.

(1.2.33)

Under Ho, the test statistic in (1.2.33) follows the central F-distribution with (p, m) d.f. On the other hand, if β is unknown but equal to βo1p, with βo an unknown scalar, then the common estimator of βo is given by

(1.2.34)

and the test-statistic is defined by

(1.2.35)

Under Ho, the exact distribution of the test statistic in (1.2.35) is the central F-distribution with (q, m) d.f.

1.2.5 Multiple Regression Model

Consider the multiple regression model

(1.2.36)

The LSE/MLE of β is then given by

(1.2.37)

where , and the unbiased estimator of σ2 is given by

(1.2.38)

where mS2u/σ2 follows the chi-square distribution with m d.f., independently of the LSE, under normal theory.

(1.2.39)

(1.2.40)
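A compact sketch of the generalized least-squares forms consistent with the notation of this subsection and of Section 1.2.6 (the exact displays (1.2.37)-(1.2.40) may differ in detail; the symbols below are ours):

\tilde{\beta}_n = (X' V_n^{-1} X)^{-1} X' V_n^{-1} Y,
\qquad
\mathrm{Cov}(\tilde{\beta}_n) = \sigma^2 (X' V_n^{-1} X)^{-1},
\qquad
S_u^2 = \frac{1}{m}\,(Y - X\tilde{\beta}_n)' V_n^{-1} (Y - X\tilde{\beta}_n),
\quad m = n - p .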

1.2.6 Ridge Regression

If, in model (1.2.36), X′Vn−1X is near-singular or ill-conditioned, so that it cannot be inverted reliably, then multicollinearity may be present among the regressors: X′Vn−1X may have rank less than p, or one or more of its characteristic roots may be small, leading to a large variance of the LSE. Thus, following Hoerl and Kennard (1970), one may define the "ridge estimator" of β as

(1.2.41)
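Following Hoerl and Kennard (1970), the ridge estimator referred to in (1.2.41) has, in this generalized setting, the familiar form (a sketch in our notation, with k ≥ 0 the ridge parameter):

\hat{\beta}_n(k) = \left(X' V_n^{-1} X + k I_p\right)^{-1} X' V_n^{-1} Y,
\qquad k \ge 0,

which reduces to the LSE/MLE of (1.2.37) when k = 0; a small positive k trades a little bias for a substantial reduction in variance when X′Vn−1X is ill-conditioned.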

1.2.7 Multivariate Model

Let Y1, Y2,…, YN be N p-dimensional observation vectors satisfying the model

(1.2.42)

(1.2.43)

(1.2.44)

(1.2.45)

An improved estimator of θ is known to be

(1.2.46)

See, for example, James and Stein (1961) and Saleh (2006).
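For orientation, in the simplest known-variance case Yα ~ Np(θ, σ2Ip) with p ≥ 3 (an illustrative special case, not the book's (1.2.46)), the classical James-Stein estimator shrinks the sample mean vector toward the origin:

\hat{\theta}^{S} = \left(1 - \frac{(p-2)\,\sigma^2}{N\,\|\bar{Y}\|^{2}}\right)\bar{Y},
\qquad
\bar{Y} = \frac{1}{N}\sum_{\alpha=1}^{N} Y_\alpha ,

and it dominates the sample mean under quadratic risk for every θ when p ≥ 3 (James and Stein, 1961).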

1.2.8 Simple Multivariate Linear Model

Consider the simple linear model

(1.2.47)

The point estimators of θ and β are given by

(1.2.48)

with .

The point estimator of Σ is given by

(1.2.49)

(1.2.50)

which follows the central F-distribution with (p, m) d.f.

1.3 Organization of the Book

This book consists of twelve chapters. Chapter 1 summarizes the results of various models under normal theory with a brief review of the literature. Chapter 2 contains the basic properties of various known distributions and opens the discussion of the multivariate t-distribution with its basic properties. Chapter 3 discusses the statistical analysis of a location model, from estimation of the intercept and slope to tests of hypotheses on the parameters; we also present the preliminary test and shrinkage-type estimators of the three parameters, including estimation of the scale parameter of the model, while Chapter 4 contains similar details for a simple regression model. Chapter 5 is devoted to ANOVA models and to preliminary test and shrinkage-type estimators under the multivariate t-distribution, and Chapter 6 deals with the parallelism model in the same spirit. The multiple regression model is the content of Chapter 7, and ridge regression is dealt with in Chapter 8. Statistical inference for multivariate models and simple multivariate linear models is discussed in Chapter 9. The Bayesian viewpoint on multivariate t-models is discussed in Chapter 10. Finally, we discuss linear prediction models in Chapter 11 and conclude the book with Chapter 12, which presents Stein estimation as complementary results.

1.4 Problems