
Multivariate Analysis

Comprehensive Reference Work on Multivariate Analysis and its Applications

The first edition of this book, by Mardia, Kent and Bibby, has been used globally for over 40 years. This second edition brings many topics up to date, with a special emphasis on recent developments.

A wide range of material in multivariate analysis is covered, including the classical themes of multivariate normal theory, multivariate regression, inference, multidimensional scaling, factor analysis, cluster analysis and principal component analysis. The book also now covers modern developments such as graphical models, robust estimation, statistical learning, and high-dimensional methods. The book expertly blends theory and application, providing numerous worked examples and exercises at the end of each chapter. The reader is assumed to have a basic knowledge of mathematical statistics at an undergraduate level together with an elementary understanding of linear algebra. There are appendices which provide a background in matrix algebra, a summary of univariate statistics, a collection of statistical tables and a discussion of computational aspects. The work includes coverage of:

  • Basic properties of random vectors, copulas, normal distribution theory, and estimation
  • Hypothesis testing, multivariate regression, and analysis of variance
  • Principal component analysis, factor analysis, and canonical correlation analysis
  • Discriminant analysis, cluster analysis, and multidimensional scaling
  • New advances and techniques, including supervised and unsupervised statistical learning, graphical models and regularization methods for high-dimensional data

Although primarily designed as a textbook for final year undergraduates and postgraduate students in mathematics and statistics, the book will also be of interest to research workers and applied scientists.




Table of Contents

Cover

Table of Contents

Title Page

Copyright

Dedication

Epigraph

Preface to the Second Edition

Preface to the First Edition

Acknowledgments from First Edition

Notation, Abbreviations, and Key Ideas

Matrices and Vectors

Random Variables and Data

Parameters and Statistics

Distributions

Matrix Decompositions

Geometry

Main Abbreviations and Commonly Used Notation

1 Introduction

1.1 Objects and Variables

1.2 Some Multivariate Problems and Techniques

1.3 The Data Matrix

1.4 Summary Statistics

1.5 Linear Combinations

1.6 Geometrical Ideas

1.7 Graphical Representation

1.8 Measures of Multivariate Skewness and Kurtosis

Exercises and Complements

2 Basic Properties of Random Vectors

Introduction

2.1 Cumulative Distribution Functions and Probability Density Functions

2.2 Population Moments

2.3 Characteristic Functions

2.4 Transformations

2.5 The Multivariate Normal Distribution

2.6 Random Samples

2.7 Limit Theorems

Exercises and Complements

3 Nonnormal Distributions

3.1 Introduction

3.2 Some Multivariate Generalizations of Univariate Distributions

3.3 Families of Distributions

3.4 Insights into Skewness and Kurtosis

3.5 Copulas

Exercises and Complements

4 Normal Distribution Theory

4.1 Introduction and Characterization

4.2 Linear Forms

4.3 Transformations of Normal Data Matrices

4.4 The Wishart Distribution

4.5 The Hotelling T² Distribution

4.6 Mahalanobis Distance

4.7 Statistics Based on the Wishart Distribution

4.8 Other Distributions Related to the Multivariate Normal

Exercises and Complements

5 Estimation

Introduction

5.1 Likelihood and Sufficiency

5.2 Maximum‐likelihood Estimation

5.3 Robust Estimation of Location and Dispersion for Multivariate Distributions

5.4 Bayesian Inference

Exercises and Complements

6 Hypothesis Testing

6.1 Introduction

6.2 The Techniques Introduced

6.3 The Techniques Further Illustrated

6.4 Simultaneous Confidence Intervals

6.5 The Behrens–Fisher Problem

6.6 Multivariate Hypothesis Testing: Some General Points

6.7 Nonnormal Data

6.8 Mardia's Nonparametric Test for the Bivariate Two‐sample Problem

Exercises and Complements

7 Multivariate Regression Analysis

7.1 Introduction

7.2 Maximum‐likelihood Estimation

7.3 The General Linear Hypothesis

7.4 Design Matrices of Degenerate Rank

7.5 Multiple Correlation

7.6 Least‐squares Estimation

7.7 Discarding of Variables

Exercises and Complements

8 Graphical Models

8.1 Introduction

8.2 Graphs and Conditional Independence

8.3 Gaussian Graphical Models

8.4 Log‐linear Graphical Models

8.5 Directed and Mixed Graphs

Exercises and Complements

9 Principal Component Analysis

9.1 Introduction

9.2 Definition and Properties of Principal Components

9.3 Sampling Properties of Principal Components

9.4 Testing Hypotheses About Principal Components

9.5 Correspondence Analysis

9.6 Allometry – Measurement of Size and Shape

9.7 Discarding of Variables

9.8 Principal Component Regression

9.9 Projection Pursuit and Independent Component Analysis

9.10 PCA in High Dimensions

Exercises and Complements

10 Factor Analysis

10.1 Introduction

10.2 The Factor Model

10.3 Principal Factor Analysis

10.4 Maximum‐likelihood Factor Analysis

10.5 Goodness‐of‐fit Test

10.6 Rotation of Factors

10.7 Factor Scores

10.8 Relationships Between Factor Analysis and Principal Component Analysis

10.9 Analysis of Covariance Structures

Exercises and Complements

11 Canonical Correlation Analysis

11.1 Introduction

11.2 Mathematical Development

11.3 Qualitative Data and Dummy Variables

11.4 Qualitative and Quantitative Data

Exercises and Complements

12 Discriminant Analysis and Statistical Learning

12.1 Introduction

12.2 Bayes' Discriminant Rule

12.3 The Error Rate

12.4 Discrimination Using the Normal Distribution

12.5 Discarding of Variables

12.6 Fisher's Linear Discriminant Function

12.7 Nonparametric Distance‐based Methods

12.8 Classification Trees

12.9 Logistic Discrimination

12.10 Neural Networks

Exercises and Complements

Notes

13 Multivariate Analysis of Variance

13.1 Introduction

13.2 Formulation of Multivariate One‐way Classification

13.3 The Likelihood Ratio Principle

13.4 Testing Fixed Contrasts

13.5 Canonical Variables and A Test of Dimensionality

13.6 The Union Intersection Approach

13.7 Two‐way Classification

Exercises and Complements

Note

14 Cluster Analysis and Unsupervised Learning

14.1 Introduction

14.2 Probabilistic Membership Models

14.3 Parametric Mixture Models

14.4 Partitioning Methods

14.5 Hierarchical Methods

14.6 Distances and Similarities

14.7 Grouped Data

14.8 Mode Seeking

14.9 Measures of Agreement

Exercises and Complements

Note

15 Multidimensional Scaling

15.1 Introduction

15.2 Classical Solution

15.3 Duality Between Principal Coordinate Analysis and Principal Component Analysis

15.4 Optimal Properties of the Classical Solution and Goodness of Fit

15.5 Seriation

15.6 Nonmetric Methods

15.7 Goodness of Fit Measure: Procrustes Rotation

15.8 Multisample Problem and Canonical Variates

Exercises and Complements

16 High‐dimensional Data

16.1 Introduction

16.2 Shrinkage Methods in Regression

16.3 Principal Component Regression

16.4 Partial Least Squares Regression

16.5 Functional Data

Exercises and Complements

A Matrix Algebra

A.1 Introduction

A.2 Matrix Operations

A.3 Further Particular Matrices and Types of Matrices

A.4 Vector Spaces, Rank, and Linear Equations

A.5 Linear Transformations

A.6 Eigenvalues and Eigenvectors

A.7 Quadratic Forms and Definiteness

A.8 Generalized Inverse

A.9 Matrix Differentiation and Maximization Problems

A.10 Geometrical Ideas

B Univariate Statistics

B.1 Introduction

B.2 Normal Distribution

B.3 Chi‐squared Distribution

B.4 F and Beta Variables

B.5 t Distribution

B.6 Poisson Distribution

C R Commands and Data

C.1 Basic R Commands Related to Matrices

C.2 R Libraries and Commands Used in Exercises and Figures

C.3 Data Availability

D Tables

References and Author Index

Index

End User License Agreement

List of Tables

Chapter 1

Table 1.1 Data matrix with five students as objects, where is age in year...

Table 1.2 Marks in open‐ and closed‐book examination out of 100.

Table 1.3 Measurements (in cm) on three types of irises.

Table 1.4 Weights of cork deposits (in centigrams) for 28 trees in the fo...

Table 1.5 Sample means and standard deviations () for the two groups of ...

Table 1.6 Measures of multivariate skewness and kurtosis for each of th...

Chapter 2

Table 2.1 Jacobians of some transformations.

Chapter 3

Table 3.1 Selected examples (shown in Figure 3.1) for a variety of and ...

Table 3.2 Some examples of Archimedean copulas.

Chapter 5

Table 5.1 Extreme values to be excluded from robust estimates in the iris d...

Table 5.2 Estimates of location and scatter for each of four methods: maxim...

Chapter 6

Table 6.1 The measurements on the first and second adult sons in a sample o...

Table 6.2 Values of the statistics and for each of the iris species and...

Table 6.3 Data for the geological problem (in micrometers).

Chapter 7

Table 7.1 Correlation matrix for the physical properties of props.

Table 7.2 Correlation between the independent variables and for pitprop d...

Table 7.3 Variables selected in multiple regression for pitprop data.

Table 7.4 Variables selected in interdependence analysis for pitprop data....

Chapter 9

Table 9.1 Values of , defined by (9.14), – and their square root – for th...

Table 9.2 Eigenvalues and eigenvectors for the pitprop data.

Table 9.3 Open–closed‐book data in Example 9.3.2: standard errors for the l...

Table 9.4 Open–closed‐book data in Example 9.3.2: standard errors for the l...

Table 9.5 Cross‐tabulation of frequencies taken from the General Social Sur...

Table 9.6 Counts obtained from a random sample of size 1000 taken from tabl...

Chapter 10

Table 10.1 Principal factor solutions for the open/closed‐book data with ...

Table 10.2 Maximum‐likelihood factor solutions for the open/closed‐book dat...

Table 10.3 Correlation matrix for the applicant data

Table 10.4 Maximum‐likelihood factor solution of applicant data with fact...

Table 10.5 Maximum‐likelihood factor solution of applicant data with fact...

Chapter 11

Table 11.1 Social mobility contingency table (Glass (1954); see also Goodma...

Table 11.2 Data from 382 Hull University students ( denotes frequencies)....

Chapter 12

Table 12.1 Summary of the examples in this chapter using the iris data, inc...

Table 12.2 Percentage distribution of sentence endings in seven works of Pl...

Table 12.3 Mean scores and their standard errors from seven works of Plato....

Table 12.4 Illustrative data for travel claims.

Table 12.5 Summary of the Titanic data for the number of survivors/total nu...

Table 12.6 Summary of the estimated parameters and standard errors for a lo...

Table 12.7 Confusion matrix for logistic regression model of Titanic surviv...

Table 12.8 Flea‐beetle measurements, taken from Lubischew (1962).

Chapter 13

Table 13.1 Multivariate one‐way classification.

Table 13.2 Logarithms of multiple measurements on anteater skulls at three ...

Table 13.3 Matrices in MANOVA table for Reeve's data.

Table 13.4 Weight losses (in grams) for the first and second weeks for rats...

Table 13.5 MANOVA table for the data in Table 13.4.

Chapter 14

Table 14.1 A subsample of six observations (and two variables) from the iri...

Table 14.2 Clusters for the data in Table 14.1.

Table 14.3 Mahalanobis distances between 10 island races of white‐toothed s...

Table 14.4 Single linkage procedure for shrew data.

Table 14.5 Relative frequencies of blood groups A1, A2, B, and O for four...

Table 14.6 Distances between two points and .

Table 14.7 Comparison of cluster allocations based on hill‐climbing and mo...

Chapter 15

Table 15.1 Road distances in miles between 12 British towns.

Table 15.2 Percentage of times that the pairs of Morse code signals for two...

Table 15.3 Observed and fitted journey time distance matrices for Example 1...

Chapter 16

Table 16.1 Estimated regression coefficients using 16.33, with regressors...

Appendix A

Table A.1 Particular matrices and types of matrices (List 1). For List 2, se...

Table A.2 Basic matrix operations.

Table A.3 Particular types of matrices (List 2).

Table A.4 Rank of some matrices.

Table A.5 Basic concepts in n‐dimensional geometry.

Appendix C

Table C.1 R commands for some basic matrix operations.

Table C.2 Using the diag command in R.

Table C.3 R data sets used and associated library (see index for usage).

Table C.4 The following datasets are available on the website github.com/cha...

Appendix D

Table D.1 Upper critical values of the distribution.

Table D.2 Upper critical values of the distribution.

Table D.3 Upper critical values of the distribution ().

Table D.4 Upper critical values of the distribution ().

Table D.5 Upper critical values of the distribution ().

Table D.6 Upper critical values of the distribution ().

Table D.7 Using the R function doubleWishart (Turgeon, 2018) with , these a...

List of Illustrations

Chapter 1

Figure 1.1 Univariate representation of the cork data of Table 1.4.

Figure 1.2 Consecutive univariate representation. Source: Adapted from Pears...

Figure 1.3 Matrix of scatterplots for the cork data. The upper diagonal show...

Figure 1.4 A glyph representation of the cork data of Table 1.4.

Figure 1.5 Harmonic curves for simulated data with observations and vari...

Figure 1.6 Parallel coordinates plot for the cork data.

Chapter 2

Figure 2.1 Ellipses of equal probability density for the bivariate normal di...

Figure 2.2 Ellipses of equal probability density for the bivariate normal di...

Chapter 3

Figure 3.1 Contour plots of the quantiles for various choices of and in ...

Figure 3.2 Copulas combined with margins. (a) Gaussian (); (b) Clayton–Ma...

Chapter 5

Figure 5.1 Convex hull for the iris data (I. versicolor variety) with me...

Chapter 7

Figure 7.1 Geometry of multiple regression model, .

Chapter 8

Figure 8.1 Graphical model for open/closed‐book data given in Table 1.2. The...

Figure 8.2 (a) Example of a graph with eight vertices; (b) decomposition int...

Figure 8.3 (a) Example of a graph with eight vertices; (b) decomposition int...

Figure 8.4 Proposed graphical model for turtle data.

Figure 8.5 This graph shows a graphical interaction model since the three tw...

Figure 8.6 Graphical model for the Copenhagen housing data.

Figure 8.7 (a) Decomposable graphical model for the vehicle accident data. (...

Figure 8.8 Example of directed acyclic graph (DAG).

Figure 8.9 Venn diagram depicting two events and (a) with and other pr...

Chapter 9

Figure 9.1 Scree graph for the Corsican pitprop data of Table 9.2.

Figure 9.2 The 88 individuals of the open/closed‐book data plotted on the fi...

Figure 9.3 Correlations between the variables and principal components for t...

Figure 9.4 Biplot (using ) for the open/closed‐book data.

Figure 9.5 Biplot (using ) using simulated data of Example 9.2.9.

Figure 9.6 Sampling distribution of the first two log eigenvalues for the op...

Figure 9.7 The row‐principal biplot and the column‐principal biplot for the ...

Figure 9.8 (a) Scatterplot of two of the iris data variables, with normalize...

Figure 9.9 (a) Scatterplot of two of the iris data variables, with direction...

Chapter 10

Figure 10.1 Plot of open/closed‐book factors before and after varimax rotati...

Chapter 12

Figure 12.1 Normal likelihoods with unequal means and unequal variances (fro...

Figure 12.2 Discrimination between two species of iris: a plot of the data p...

Figure 12.3 Discrimination between three species of iris using two variables...

Figure 12.4 (Mental health data). Discrimination between normal individuals ...

Figure 12.5 Classification regions (polygons obtained using the R library sp ...

Figure 12.6 Classification regions for the first two variables of the iris d...

Figure 12.7 A possible classification tree for data in Table 12.4. The branc...

Figure 12.8 Splitting indices (some linearly transformed) of first two varia...

Figure 12.9 Scatter plot of the first two variables for iris data (I. seto...

Figure 12.10 Decision tree (created using rpart (Therneau and Atkinson, 2019...

Figure 12.11 (a) Classification tree and the corresponding partition (b) for...

Figure 12.12 Fitted probabilities for survival in the sinking of the Titanic...

Figure 12.13 Multilayer network with an input layer, one hidden layer, and t...

Figure 12.14 Multilayer perceptron boundaries (continuous line) and linear d...

Chapter 13

Figure 13.1 Canonical analysis of iris data, with approximate 99% confidence...

Figure 13.2 Plot of the 10 groups of shrews with respect to the first two ca...

Figure 13.3 Canonical analysis of image segmentation data used in Example 13...

Chapter 14

Figure 14.1 Scatter plot for the small subsample of six observations of iris...

Figure 14.2 Application of the k‐means algorithm, and mclust to the iris are...

Figure 14.3 Application of the fuzzy k‐means algorithm to the iris area data...

Figure 14.4 Dendrogram for shrew group means (single linkage) in Example 14....

Figure 14.5 Minimum spanning tree for the shrew data whose distances are giv...

Figure 14.6 Dendrogram for the shrew data (complete linkage).

Figure 14.7 Hierarchical clusters for grouped glass data. Groups are sequent...

Figure 14.8 Contour plots of the kernel density estimate for the transformed...

Figure 14.9 Clusters of countries based on five indicators, after selecting ...

Figure 14.10 World map for the 11 clusters of the dendrogram of Figure 14.9. ...

Figure 14.11 Rand index showing the agreement between the complete linkage c...

Chapter 15

Figure 15.1 Classical solution for Morse code data in Table 15.2.

Figure 15.2 MDS solutions for the road data in Table 15.1, , original point...

Figure 15.3 Two‐dimensional MDS solution for Example 15.4.2. Train journey t...

Figure 15.4 Classical MDS solution in two dimensions using similarity matrix...

Figure 15.5 Two‐dimensional representation of Kendall's similarity matrix fo...

Figure 15.6 Seven regions in a country for Exercise 15.2.7.

Chapter 16

Figure 16.1 (a) Log of average nearest neighbor distances for various dimens...

Figure 16.2 Bias‐squared and variance for ridge regression estimates of , w...

Figure 16.3 Results from using LARS with simulated data with uncorrelated ex...

Figure 16.4 (a) Log of scree plot for yarn data; (b) Best subset selection f...

Figure 16.5 (a) Boxplots of scores for six sensory variables (); (b) Pairwi...

Figure 16.6 Height measurements for 39 boys (a) and 54 girls (b). Points sho...

Figure 16.7 Growth Data. (a) Estimated (Eq. 16.25; (b) Cumulative proporti...

Figure 16.8 Growth Data. The estimated regression function obtained using ...

Appendix A

Figure A.1 is the projection of onto the plane .

Figure A.2 Ellipsoid in the plane. Vectors given by and .



Multivariate Analysis

 

Second Edition

 

Kanti V. Mardia, John T. Kent, and Charles C. Taylor

 

 

 

 

 

This second edition first published 2024
© 2024 John Wiley & Sons Ltd

Edition History
1st Edition © 1979 Academic Press

All rights reserved, including rights for text and data mining and training of artificial technologies or similar technologies. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Kanti V. Mardia, John T. Kent, and Charles C. Taylor to be identified as the authors of this work has been asserted in accordance with law.

Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty

In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging‐in‐Publication Data

Names: Mardia, K. V., author. | Kent, J. T. (John T.), joint author. | Taylor, C. C. (Charles C.), joint author.
Title: Multivariate analysis / Kanti V. Mardia, John T. Kent, and Charles C. Taylor.
Description: Second edition. | Hoboken, NJ : Wiley, 2024. | Series: Wiley series in probability and statistics | Includes bibliographical references and index.
Identifiers: LCCN 2023002460 (print) | LCCN 2023002461 (ebook) | ISBN 9781118738023 (cloth) | ISBN 9781118892527 (adobe pdf) | ISBN 9781118892510 (epub)
Subjects: LCSH: Multivariate analysis.
Classification: LCC QA278 .M36 2023 (print) | LCC QA278 (ebook) | DDC 519.5/35–dc23/eng/20230531
LC record available at https://lccn.loc.gov/2023002460
LC ebook record available at https://lccn.loc.gov/2023002461

Cover Design: Wiley
Cover Images: © Liyao Xie/Getty Images, Courtesy of Kanti Mardia

 

 

 

 

To my daughters Bela and Neeta

— with Jainness (Kanti V. Mardia)

To my son Edward and daughter Natalie 

(John T. Kent)

To my wife Fiona and my children Mike, Anna, Ruth, and Kathryn 

(Charles C. Taylor)

Epigraph

 

Everything is related with every other thing, and this relation involves the emergence of a relational quality. The qualities cannot be known a priori, though a good number of them can be deduced from certain fundamental characteristics.

— Jaina philosophy

The Jaina Philosophy of Non‐Absolutism by S. Mookerjee, q.v.

Mahalanobis (1957).

Preface to the Second Edition

For over 40 years the first edition of this book (which was also translated into Persian) has been used by students to acquire a basic knowledge of the theory and methods of multivariate statistical analysis. The book has also served the wider statistical community to further their understanding of this field. Plans for the second edition started almost 20 years ago, and we have struggled with questions about which topics to add – something of a moving target in a field that has continued to evolve in this new era of artificial intelligence (AI) and “big data”. Since the first edition was published, multivariate analysis has been developed and extended in many directions. This new edition aims to bring the first edition up to date by substantial revision, rewriting, and additions, while seeking to maintain the overall length of the book. The basic approach has been maintained, namely a mathematical treatment of statistical methods for observations consisting of several measurements or characteristics of each subject and a study of their properties. The core topics, and the structure of many of the chapters, have been retained.

Briefly, for those familiar with the first edition, the main changes (in addition to updating material in several places) are:

a new section giving Notation, Abbreviations, and Key Ideas used throughout the book;

a new chapter introducing some nonnormal distributions. This includes new sections on elliptical distributions and copulas;

a new chapter covering an introduction to graphical models;

a completely rewritten chapter that begins from discriminant analysis and extends to nonparametric methods, classification and regression trees, logistic discrimination, and multilayer perceptrons. These topics are commonly grouped under the heading of supervised learning;

the above chapter focuses on data in which group memberships are known, whereas “unsupervised learning” has more traditionally been known as cluster analysis, for which the current Chapter 14 has also been substantially updated to reflect recent developments;

a new (final) chapter introduces some approaches to high‐dimensional data in which the number of variables may exceed the number of observations. This includes shrinkage methods in regression, principal components regression, partial least squares regression, and functional data analysis;

further development of discrete aspects, including log‐linear models, the EM algorithm for mixture models, and correspondence analysis for contingency tables.

As a consequence of the above new and extended/revised chapters and in order to save space, we have omitted some material from this edition:

the chapter on econometrics, since there are now dedicated books with an emphasis on statistical aspects (Maddala and Lahiri, 2009; Wooldridge, 2019);

the chapter on directional statistics, since there are now related dedicated books by one of the authors (Dryden and Mardia, 2016; Mardia and Jupp, 2000).

Further changes to this Edition, bringing many subjects up to date, include new graphical representations (Chapter 1), an introduction to the matrix normal distribution (Chapters 2 and 5), elliptical distributions and copulas (Chapter 3), robust estimators for location and dispersion (Chapter 5), a revision of correspondence analysis and biplots (Chapter 9), and projection pursuit and independent component analysis (Chapter 9).

The figures in the first edition have been redrawn in their original style, using the statistical package R. A new Appendix C contains some specific R (R Core Team, 2020) commands applicable to most of the matrix algebra used in the book. In addition, an online addendum to Appendix C contains the data files used in this book as well as the R commands used to obtain the calculations for the examples and figures. This public repository is at github.com/charlesctaylor/MVAdata-rcode. In many cases, we have chosen to use base R functions to mimic the equations used in the text in preference to more “black‐box” R functions. Note that intermediate steps in the calculation are generally rounded only for display purposes.
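As a flavor of that style, here is a small sketch of our own (not code from the book or its repository) using base R to mimic one such equation in the text, the centering matrix H = I − n⁻¹11′ applied to a data matrix:

# Base R sketch: center the columns of a small data matrix via H = I - (1/n) 1 1'
X <- matrix(c(18.45, 70, 18.41, 65, 18.39, 71), ncol = 2, byrow = TRUE)
n <- nrow(X)
H <- diag(n) - matrix(1, n, n) / n  # centering matrix
Xc <- H %*% X                       # centered data; compare with scale(X, scale = FALSE)
colMeans(Xc)                        # zero (up to rounding) in each column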

Multivariate analysis continues to be a research area of active development. We note that the Journal of Multivariate Analysis, in its 50th Anniversary Jubilee Edition (von Rosen and Kollo, 2022), has published a volume that describes the current state of the art and contains review papers. Beyond mainstream multivariate statistics, there have been developments in the applied sciences; one example in morphometrics is Bookstein (2018).

The first edition was published by Academic Press, and we are grateful to John Bibby for his contributions to that edition. For this edition, we thank the many readers who have offered their advice and suggestions. In particular, we would like to acknowledge the help of Susan Holmes for extensive discussions about a new structure as well as a draft of correspondence analysis material for Chapter 9.

We are extremely grateful to Wiley for their patience and help during the writing of the book, especially Helen Ramsey, Sharon Clutton, Richard Davies, Kathryn Sharples, Liz Wingett, Kelvin Matthews, Alison Oliver, Viktoria Hartl‐Vida, Ashley Alliano, Kimberly Monroe‐Hill, and Paul Sayer. Secretarial help at Leeds during the initial development was given by Christine Rutherford and Catherine Dobson.

Kanti would like to thank the Leverhulme Trust for an Emeritus Fellowship and Anna Grundy of the Trust for simplifying the administration process. Finally, he would like to express his sincere gratitude to his family for their continuous love, support, and tolerance.

We would be pleased to hear about any typographical or other errors in the text.

May 2022

         

         

         

Kanti V. Mardia

University of Leeds, Leeds, UK andUniversity of Oxford, Oxford, UK

John T. Kent

University of Leeds, Leeds, UK

Charles C. Taylor

University of Leeds, Leeds, UK

Preface to the First Edition

Multivariate Analysis deals with observations on more than one variable where there is some inherent interdependence between the variables. With several texts already available in this area, one may very well enquire of the authors as to the need for yet another book. Most of the available books fall into two categories, either theoretical or data analytic. The present book not only combines the two approaches but also emphasizes modern developments. The choice of material for the book has been guided by the need to give suitable matter for the beginner as well as illustrating some deeper aspects of the subject for the research worker. Practical examples are kept to the forefront, and, wherever feasible, each technique is motivated by such an example.

The book is aimed at final year undergraduates and postgraduate students in Mathematics/Statistics with sections suitable for practitioners and research workers. The book assumes a basic knowledge of Mathematical Statistics at undergraduate level. An elementary course on Linear Algebra is also assumed. In particular, we assume an exposure to Matrix Algebra to the level required to read Appendix A.

Broadly speaking, Chapters 1–6 and 12 can be described as containing direct extensions of univariate ideas and techniques. The remaining chapters concentrate on specifically multivariate problems that have no meaningful analogs in the univariate case. Chapter 1 is primarily concerned with giving exploratory analyses for multivariate data and briefly introduces some of the important techniques, tools, and diagrammatic representations. Chapter 2 introduces various distributions together with some fundamental results, whereas Chapter 3 concentrates exclusively on normal distribution theory. Chapters 4–6 deal with problems in inference. Chapter 7 [no longer included] gives an overview of Econometrics, while Principal Component Analysis, Factor Analysis, Canonical Correlation Analysis, and Discriminant Analysis are discussed from both theoretical and practical points of view in Chapters 8–11. Chapter 12 is on Multivariate Analysis of Variance, which can be better understood in terms of the techniques of previous chapters. The later chapters look into the presently developing techniques of Cluster Analysis, Multidimensional Scaling, and Directional Data [no longer included].

Each chapter concludes with a set of exercises. Solving these will not only enable the reader to understand the material better but will also serve to complement the chapter itself. In general, the questions have in‐built answers, but, where desirable, hints for the solution of theoretical problems are provided. Some of the numerical exercises are designed to be run on a computer, but as the main aim is on interpretation, the answers are provided. We found NAG routines and GLIM most useful, but nowadays any computer center will have some suitable statistics and matrix algebra routines.

There are three Appendices A, B, and C, which, respectively, provide a sufficient background of matrix algebra, a summary of univariate statistics, and some tables of critical values. The aim of Appendix A on Matrix Algebra is not only to provide a summary of results but also to give sufficient guidance to master these for students having little previous knowledge. Equations from Appendix A are referred to as (A.x.x) to distinguish them from (l.x.x), etc. Appendix A also includes a summary of results in n‐dimensional geometry that are used liberally in the book. Appendix B gives a summary of important univariate distributions.

The reference list is by no means exhaustive. Only directly relevant articles are quoted, and for a fuller bibliography, we refer the reader to Anderson et al. (1972) and Subrahmaniam and Subrahmaniam (1973). The reference list also serves as an author index. A subject index is provided.

The material in the book can be used in several different ways. For example, a one‐semester elementary course of 40 lectures could cover the following topics. Appendix A; Chapter 1 (Sections 1.1–1.7); Chapter 2 (Sections 2.1–2.5); Chapter 3 (Sections 3.4.1, 3.5, and 3.6.1, assuming results from previous sections, Definitions 3.7.1 and 3.7.2); Chapter 4 (Section 4.2.2); Chapter 5 (Sections 5.1, 5.2.1a, 5.2.1b, 5.2.2a, 5.2.2b, 5.3.2b, and 5.5); Chapter 8 (Sections 8.1, 8.2.1, 8.2.2, 8.2.5, 8.2.6, 8.4.3, and 8.7); Chapter 9 (Sections 9.1–9.3, 9.4 (without details), 9.5, 9.6, and 9.8); Chapter 10 (Sections 10.1 and 10.2); Chapter 11 (Sections 11.1–11.2.3, 11.3.1, and 11.6.1). Further material that can be introduced is Chapter 12 (Sections 12.1–12.3 and 12.6); Chapter 13 (Sections 13.1 and 13.3.1); Chapter 14 (Sections 14.1 and 14.2). This material has been covered in 40 lectures spread over two terms in different British universities. Alternatively, a one‐semester course with more emphasis on foundation rather than applications could be based on Appendix A and Chapters 1–5. Two‐semester courses could include all the chapters, excluding Chapters 7 and 15 on Econometrics and Directional Data, as well as the sections with asterisks. Mathematically orientated students may like to proceed to Chapter 2, omitting the data analytic ideas of Chapter 1.

Various new methods of presentation are utilized in the book. For instance, the data matrix is emphasized throughout, a density‐free approach is given for normal theory, the union intersection principle is used in testing as well as the likelihood ratio principle, and graphical methods are used in explanation. In view of the computer packages generally available, most of the numerical work is taken for granted, and therefore, except for a few particular cases, emphasis is not placed on numerical calculations. The style of presentation is generally kept descriptive except where rigor is found to be necessary for theoretical results, which are then put in the form of theorems. If any details of the proof of a theorem are felt tedious but simple, they are then relegated to the exercises.

Several important topics not usually found in multivariate texts are discussed in detail. Examples of such material include the complete chapters on Econometrics, Cluster Analysis, Multidimensional Scaling, and Directional Data. Further material is also included in parts of other chapters: methods of graphical presentation, measures of multivariate skewness and kurtosis, the singular multinormal distribution, various nonnormal distributions and families of distributions, a density‐free approach to normal distribution theory, Bayesian and robust estimators, a recent solution to the Fisher–Behrens problem, a test of multinormality, a nonparametric test, discarding of variables in regression, principal component analysis and discrimination analysis, correspondence analysis, allometry, the jack‐knifing method in discrimination, canonical analysis of qualitative and quantitative variables, and a test of dimensionality in MANOVA. It is hoped that coverage of these developments will be helpful for students as well as research workers.

There are various other topics that have not been touched upon partly because of lack of space as well as our own preferences, such as Control Theory, Multivariate Time Series, Latent Variable Models, Path Analysis, Growth Curves, Portfolio Analysis, and various Multivariate Designs.

In addition to various research papers, we have been influenced by particular texts in this area, especially Anderson (1958), Kendall (1975), Kshirsagar (1972), Morrison (1976), Press (1972), and Rao (1973). All these are recommended to the reader.

The authors would be most grateful to readers who draw their attention to any errors or obscurities in the book, or suggest other improvements.

January 1979

         

         

         

Kanti V. Mardia

John T. Kent

John M. Bibby

Acknowledgments from First Edition

First of all, we wish to express our gratitude to the pioneers in this field. In particular, we should mention M. S. Bartlett, R. A. Fisher, H. Hotelling, D. G. Kendall, M. G. Kendall, P. C. Mahalanobis, C. R. Rao, S. N. Roy, W. S. Torgeson, and S. S. Wilks.

We are grateful to the authors and editors who have generously granted us permission to reproduce figures and tables.

We are also grateful to many of our colleagues for their valuable help and comments, in particular Martin Beale, Christopher Bingham, Lesley Butler, Richard Cormack, David Cox, Ian Curry, Peter Fisk, Allan Gordon, John Gower, Peter Harris, Chunni Khatri, Conrad Leser, Eric Okell, Ross Renner, David Salmond, Cyril Smith, and Peter Zemroch. We are also indebted to Joyce Snell for making various comments on an earlier draft of the book, which have led to considerable improvements. We should also express our gratitude to Rob Edwards for his help in various facets of the book, for calculations, proofreading, diagrams, etc.

Some of the questions are taken from examination papers in British universities, and we are grateful to various unnamed colleagues. Since the original sources of questions are difficult to trace, we apologize to any colleague who recognizes a question of their own.

The authors would like to thank their wives, Pavan Mardia, Susan Kent, and Zorina Bibby.

Finally, our thanks go to Barbara Forsyth and Margaret Richardson for typing a difficult manuscript with great skill.

         

         

         

KVM

JTK

JMB

Notation, Abbreviations, and Key Ideas

Matrices and Vectors

Vectors are viewed as column vectors and are represented using bold lower case letters. Round brackets are generally used when a vector is expressed in terms of its elements. For example, x = (x1, …, xp), in which the ith element or component is denoted xi. The transpose of x is denoted x′, so x′ is a row vector.

Matrices are written using bold upper case letters, e.g. X and A. The matrix A may be written as A = (aij), in which aij is the element of the matrix in row i and column j. If A has n rows and p columns, then the ith row of A, written as a column vector, is (ai1, …, aip)′, and the jth column is written as (a1j, …, anj)′. Hence, A can be expressed in various forms: in terms of its elements, its rows, or its columns. We generally use square brackets when a matrix is expanded in terms of its elements.

Operations on a matrix A include

transpose: A′
determinant: |A|
inverse: A⁻¹
generalized inverse: A⁻

where for the final three operations, A is assumed to be square, and for the inverse operation, A is additionally assumed to be nonsingular. Different types of matrices are given in Tables A.1 and A.3. Table A.2 lists some further matrix operations.
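These operations map directly onto R functions (cf. Appendix C); a minimal sketch of our own, where the generalized inverse is taken from the MASS package as one common choice:

A <- matrix(c(2, 1, 1, 3), 2, 2)  # a 2 x 2 nonsingular matrix
t(A)                              # transpose A'
det(A)                            # determinant |A|
solve(A)                          # inverse of A
MASS::ginv(A)                     # a generalized inverse (Moore-Penrose, from MASS)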

Random Variables and Data

In general, a random vector and a nonrandom vector are both indicated using a bold lower case letter, e.g. x. Thus, the distinction between the two must be determined from the context. This convention is in contrast to the standard convention in statistics where upper case letters are used to denote random quantities, and lower case letters their observed values.

The reason for our convention is that bold upper case letters are generally used for a data matrix X, both random and fixed.

In spite of the above convention, we very occasionally (e.g. parts of Chapters 2 and 10) use a bold upper case letter for a random vector X when it is important to distinguish between the random vector X and a possible value x.

The phrase “high‐dimensional data” often implies p > n, whereas the phrase “big data” often just indicates that n or p is large.

Parameters and Statistics

Elements of an n × p data matrix X are generally written xri, where indices r, s are used to label the observations, and indices i, j are used to label the variables.

If the rows of a data matrix X are normally distributed with mean μ and covariance matrix Σ, the following notation is used to distinguish various population and sample quantities:

                              Parameter   Sample
Mean vector                   μ           x̄
Covariance matrix             Σ           S
Unbiased covariance matrix                Su
Concentration matrix          K = Σ⁻¹
Correlation matrix            P           R
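These sample quantities are easily computed in base R; a minimal sketch of our own (note that R's cov and cor use the unbiased divisor n − 1):

X <- matrix(rnorm(50 * 3), ncol = 3)  # data matrix with n = 50, p = 3
n <- nrow(X)
xbar <- colMeans(X)                   # sample mean vector
Su <- cov(X)                          # unbiased covariance matrix (divisor n - 1)
S <- (n - 1) / n * Su                 # sample covariance matrix (divisor n)
R <- cor(X)                           # sample correlation matrix
K <- solve(Su)                        # estimated concentration matrix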

Distributions

The following notation is used for univariate and multivariate distributions. Appendix B summarizes the univariate distributions used in the book.

F(x)  cumulative distribution function/distribution function (d.f.)
f(x)  probability density function (p.d.f.)
E(x)  expectation
c.f.  characteristic function
d.f.  distribution function
T²  Hotelling T²
Np(μ, Σ)  multivariate normal distribution in p dimensions with mean μ (column vector of length p) and covariance matrix Σ
V(x)  variance–covariance matrix
P  correlation matrix
Wp(Σ, ν)  Wishart distribution

The terms variance matrix, covariance matrix, and variance–covariance matrix are synonymous.

Matrix Decompositions

Any symmetric matrix A can (by the spectral decomposition theorem) be written as A = ΓΛΓ′, where Λ is a diagonal matrix of eigenvalues of A (which are real‐valued), i.e. Λ = diag(λ1, …, λp), and Γ is an orthogonal matrix whose columns are standardized eigenvectors, i.e. Γ′Γ = I and ΓΓ′ = I. See Theorem A.6.8.

Using the above, we define the symmetric square root of a positive definite matrix A by A^(1/2) = ΓΛ^(1/2)Γ′.

If A is an n × p matrix of rank r, then by the singular value decomposition, it can be written as A = UDV′, where U and V are column orthonormal matrices, and D is a diagonal matrix with positive elements. See Theorem A.6.8.
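Both decompositions are available in base R; a minimal sketch (our illustration) of the spectral decomposition, the symmetric square root, and the singular value decomposition:

A <- crossprod(matrix(rnorm(9), 3, 3))  # a symmetric positive definite matrix
e <- eigen(A)                           # spectral decomposition: eigenvalues and eigenvectors
Ahalf <- e$vectors %*% diag(sqrt(e$values)) %*% t(e$vectors)  # symmetric square root
max(abs(Ahalf %*% Ahalf - A))           # essentially zero
s <- svd(matrix(rnorm(12), 4, 3))       # singular value decomposition: s$u, s$d, s$v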

Geometry

Table A.5 sets out the basic concepts in n‐dimensional geometry. In particular,

Length of a vector: ‖x‖ = (x′x)^(1/2)
Euclidean distance between x and y: ‖x − y‖
Squared Mahalanobis distance – one of the most important distances in multivariate analysis, since it takes account of a covariance matrix: D²(x, y) = (x − y)′Σ⁻¹(x − y)

Table 14.6 gives a list of various distances.
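In R, squared Mahalanobis distances from the sample mean are given by the built‐in mahalanobis function; a short sketch using the iris measurements, together with the direct computation from the definition:

X <- as.matrix(iris[1:50, 1:4])               # I. setosa measurements
xbar <- colMeans(X)
S <- cov(X)
d2 <- mahalanobis(X, center = xbar, cov = S)  # squared Mahalanobis distances
drop(t(X[1, ] - xbar) %*% solve(S) %*% (X[1, ] - xbar))  # same as d2[1]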

Main Abbreviations and Commonly Used Notation

≈  approximately equal to
⫫  (conditionally) independent of
∼  is distributed as
A\B  the set of elements that are members of A but not B
d(x, y)  Euclidean distance between x and y
A′  transpose of matrix A
|A|  determinant of matrix A
A⁻¹  inverse of matrix A
A⁻  generalized inverse (g‐inverse)
1  column vector of 1s
0  column vector or matrix of 0s
B  between‐groups sum of squares and products (SSP) matrix
B(a, b)  beta variable
B(a, b)  normalizing constant for beta distribution (note nonitalic font to distinguish from the above)
BLUE  best linear unbiased estimate
C(x, y)  covariance between x and y
χ²ν  chi‐squared distribution with ν degrees of freedom
χ²ν;α  upper α critical value of chi‐squared distribution with ν degrees of freedom
c.f.  characteristic function
∂/∂x  partial derivative – multivariate examples in Appendix A.9
D  distance matrix
D²  squared Mahalanobis distance
d.f.  distribution function
δij  Kronecker delta
diag(·)  diagonal elements of a square matrix (as column vector) or diagonal matrix created from a vector
E  expectation
Fν1,ν2  F distribution with degrees of freedom ν1 and ν2
Fν1,ν2;α  upper α critical value of F distribution with degrees of freedom ν1 and ν2
F(x)  cumulative distribution function
f(x)  probability density function
Γ(·)  gamma function
GLS  generalized least squares
H  centering matrix
I  identity matrix
ICA  independent component analysis
i.i.d.  independent and identically distributed
J  Jacobian of transformation (see Table 2.1)
K  concentration matrix (K = Σ⁻¹)
L  likelihood
l  log likelihood
LDA  linear discriminant analysis
log  logarithm to the base e (natural logarithm)
LRT  likelihood ratio test
MANOVA  multivariate analysis of variance
MDS  multidimensional scaling
ML  maximum likelihood
m.l.e.  maximum likelihood estimate
μ  mean (population) vector
Np(μ, Σ)  multivariate normal distribution for p dimensions (subscript usually omitted when p = 1)
OLS  ordinary least squares
P  (population) correlation matrix (sometimes a matrix of counts, e.g. Section 9.5)
P(·)  probability
PCA  principal component analysis
p.d.  positive definite
p.d.f.  probability density function
p.s.d.  positive semi‐definite
ℝ  real numbers
r  correlation coefficient
R  sample correlation matrix
S  sample covariance matrix
Su  unbiased sample covariance matrix
Σ  (population) covariance matrix
SLC  standardized linear combination
SSP  sums of squares and products
tν  t distribution with ν degrees of freedom
T  total SSP matrix
T²  Hotelling T² statistic
tr  trace
UIT  union intersection test
var(x)  variance
V(x)  variance–covariance matrix of x
W  within‐groups SSP matrix
Wp(Σ, ν)  Wishart distribution
X  data matrix
x̄  sample mean vector
Λ  Wilks' Λ statistic
θ  greatest root statistic

1 Introduction

1.1 Objects and Variables

Multivariate analysis deals with data containing observations on two or more variables, each measured on a set of objects. For example, we may have the set of examination marks achieved by certain students, or the cork deposit in various directions of a set of trees, or flower measurements for different species of iris (see Tables 1.2, 1.4, and 1.3, respectively). Each of these data sets has a set of “variables” (the examination marks, trunk thickness, and flower measurements) and a set of “objects” (the students, trees, and flowers). In general, if there are n objects and p variables, the data contain np pieces of information. These may be conveniently arranged using an n × p “data matrix”, in which each row corresponds to an object, and each column corresponds to a variable. For instance, three variables on five “objects” (students) are shown as a data matrix in Table 1.1.

Note that all the variables need not be of the same type: in Table 1.1, x1 is a “continuous” variable, x2 is a discrete variable, and x3 is a binary variable. Note also that attribute, characteristic, description, measurement, and response are synonyms for “variable”, whereas individual, observation, plot, reading, item, and unit can be used in place of “object”.

1.2 Some Multivariate Problems and Techniques

We may now illustrate various categories of multivariate technique.

1.2.1 Generalizations of Univariate Techniques

Most univariate questions are capable of at least one multivariate generalization. For instance, using Table 1.2, we may ask: “What is the appropriate underlying parent distribution of examination marks on various papers of a set of students?” “What are the summary statistics?” “Are the differences between average marks on different papers significant?”, etc. These problems are direct generalizations of univariate problems, and their motivation is easy to grasp. See, for example, Chapters 2–7 and 13.

Table 1.1 Data matrix with five students as objects, where x1 is age in years at entry to university, x2 is marks out of 100 in an examination at the end of the first year, and x3 is sex.

          Variables
Objects   x1       x2    x3
1         18.45    70    1
2         18.41    65    0
3         18.39    71    0
4         18.70    72    0
5         18.34    94    1

1 indicates male; 0 indicates female.
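A data matrix such as Table 1.1 can be entered directly in R; a minimal sketch of our own (the names x1, x2, x3 follow the table):

X <- matrix(c(18.45, 70, 1,
              18.41, 65, 0,
              18.39, 71, 0,
              18.70, 72, 0,
              18.34, 94, 1),
            ncol = 3, byrow = TRUE,
            dimnames = list(1:5, c("x1", "x2", "x3")))
dim(X)  # 5 objects (rows) by 3 variables (columns)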

1.2.2 Dependence and Regression

The data in Table 1.2, which were collected at the University of Hull in the early 1970s, formed part of an investigation into the merits of open‐book vs. closed‐book examinations. Marks (out of 100) were given for 88 students on each of five subjects; these observations were sorted (almost) according to the average. Initially, we may enquire as to the degree of dependence between performance on different papers taken by the same students. It may be useful, for counseling or other purposes, to have some idea of how final degree marks (“dependent” variables) are affected by previous examination results or by other variables such as age and sex (“explanatory” variables). This presents the so‐called regression problem, which is examined in Chapter 7.

1.2.3 Linear Combinations

Given examination marks on different topics (as in Table 1.2), the question arises of how to combine or average these marks in a suitable way. A straightforward method would use the simple arithmetic mean, but this procedure may not always be suitable. For instance, if the marks on some papers vary more than others, we may wish to weight them differently. This leads us to search for a linear combination (weighted sum) which is “optimal” in some sense. If all the examination papers fall in one group, then principal component analysis and factor analysis are two techniques that can help to answer such questions (see Chapters 9 and 10). In some situations, the papers may fall into more than one group – for instance, in Table 1.2, some examinations were “open book”, while others were “closed book”. In such situations, we may wish to investigate the use of linear combinations within each group separately. This leads to the method known as canonical correlation analysis, which is discussed in Chapter 11.

The idea of taking linear combinations is an important one in multivariate analysis, and we will return to it in Section 1.5.
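As a simple numerical illustration (ours, with arbitrary weights), the first row of Table 1.2 can be averaged either equally or by a weighted linear combination:

marks <- c(77, 82, 67, 67, 81)   # first student's marks from Table 1.2
mean(marks)                      # simple arithmetic mean: 74.8
w <- c(0.1, 0.2, 0.3, 0.2, 0.2)  # illustrative weights summing to 1
sum(w * marks)                   # weighted linear combination: 73.8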

Table 1.2 Marks in open‐ and closed‐book examination out of 100.

Mechanics (C)  Vectors (C)  Algebra (O)  Analysis (O)  Statistics (O)
77  82  67  67  81
63  78  80  70  81
75  73  71  66  81
55  72  63  70  68
63  63  65  70  63
53  61  72  64  73
51  67  65  65  68
59  70  68  62  56
62  60  58  62  70
64  72  60  62  45
52  64  60  63  54
55  67  59  62  44
50  50  64  55  63
65  63  58  56  37
31  55  60  57  73
60  64  56  54  40
44  69  53  53  53
42  69  61  55  45
62  46  61  57  45
31  49  62  63  62
44  61  52  62  46
49  41  61  49  64
12  58  61  63  67
49  53  49  62  47
54  49  56  47  53
54  53  46  59  44
44  56  55  61  36
18  44  50  57  81
46  52  65  50  35
32  45  49  57  64
30  69  50  52  45
46  49  53  59  37
40  27  54  61  61
31  42  48  54  68
36  59  51  45  51
56  40  56  54  35
46  56  57  49  32
45  42  55  56  40
42  60  54  49  33
40  63  53  54  25
23  55  59  53  44
48  48  49  51  37
41  63  49  46  34
46  52  53  41  40
46  61  46  38  41
40  57  51  52  31
49  49  45  48  39
22  58  53  56  41
35  60  47  54  33
48  56  49  42  32
31  57  50  54  34
17  53  57  43  51
49  57  47  39  26
59  50  47  15  46
37  56  49  28  45
40  43  48  21  61
35  35  41  51  50
38  44  54  47  24
43  43  38  34  49
39  46  46  32  43
62  44  36  22  42
48  38  41  44  33
34  42  50  47  29
18  51  40  56  30
35  36  46  48  29
59  53  37  22  19
41  41  43  30  33
31  52  37  27  40
17  51  52  35  31
34  30  50  47  36
46  40  47  29  17
10  46  36  47  39
46  37  45  15  30
30  34  43  46  18
13  51  50  25  31
49  50  38  23  09
18  32  31  45  40
08  42  48  26  40
23  38  36  48  15
30  24  43  33  25
03  09  51  47  40
07  51  43  17  22
15  40  43  23  18
15  38  39  28  17
05  30  44  36  18
12  30  32  35  21
05  26  15  20  20
00  40  21  09  14

O indicates open book, and C indicates closed book.

1.2.4 Assignment and Dissection

Table 1.3 gives three data matrices (or one data matrix if the species is coded as a variable). In each matrix, the “objects” are 50 irises of species Iris setosa, Iris versicolor, and Iris virginica, respectively. The “variables” are

x1 = sepal length, x2 = sepal width, x3 = petal length, and x4 = petal width.

The flowers of the first two iris species (I. setosa and I. versicolor) were taken from the same natural colony, but the sample of the third iris species (I. virginica) is from a different colony; for more general details on the data, see Mardia (2023). If a new iris of unknown species has measurements x1, x2, x3, and x4, we may ask to which species it belongs. This presents the problem of discriminant analysis, which is discussed in Chapter 12. However, if we were presented with the 150 observations of Table 1.3 in an unclassified manner (say, before the three species were established), then the aim could have been to dissect the population into homogeneous groups. This problem is handled by cluster analysis (see Chapter 14).
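Both problems can be sketched in R with the built‐in iris data (a rough illustration of ours, not the analyses developed in Chapters 12 and 14): lda from the MASS package fits a linear discriminant rule when the species labels are known, while kmeans dissects the unlabeled measurements into groups:

library(MASS)
fit <- lda(Species ~ ., data = iris)                 # discriminant analysis (Chapter 12)
predict(fit, iris[1, 1:4])$class                     # classify one flower as if its species were unknown
km <- kmeans(iris[, 1:4], centers = 3, nstart = 20)  # cluster analysis (Chapter 14)
table(km$cluster, iris$Species)                      # compare found clusters with the true species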

Table 1.3 Measurements (in cm) on three types of irises.

Source: Fisher (1936) / John Wiley & Sons.

Iris setosa                      Iris versicolor                  Iris virginica
Sepal   Sepal   Petal   Petal    Sepal   Sepal   Petal   Petal    Sepal   Sepal   Petal   Petal
length  width   length  width    length  width   length  width    length  width   length  width
5.1     3.5     1.4     0.2      7.0     3.2     4.7     1.4      6.3     3.3     6.0     2.5
4.9     3.0     1.4     0.2      6.4     3.2     4.5     1.5      5.8     2.7     5.1     1.9
4.7     3.2     1.3     0.2      6.9     3.1     4.9     1.5      7.1     3.0     5.9     2.1
4.6     3.1     1.5     0.2      5.5     2.3     4.0     1.3      6.3     2.9     5.6     1.8
5.0     3.6     1.4     0.2