Matrix Analysis for Statistics

James R. Schott

Description

An up-to-date version of the complete, self-contained introduction to matrix analysis theory and practice

Providing accessible and in-depth coverage of the most common matrix methods now used in statistical applications, Matrix Analysis for Statistics, Third Edition features an easy-to-follow theorem/proof format. With smooth transitions between topics, the author carefully justifies, step by step, the matrix methods most widely used in statistics, including eigenvalues and eigenvectors; the Moore-Penrose inverse; matrix differentiation; and the distribution of quadratic forms.

An ideal introduction to matrix analysis theory and practice, Matrix Analysis for Statistics, Third Edition features:

• New chapter or section coverage on inequalities, oblique projections, and antieigenvalues and antieigenvectors

• Additional problems and practice exercises at the end of each chapter

• Extensive examples that are familiar and easy to understand

• Self-contained chapters for flexibility in topic choice

• Applications of matrix methods in least squares regression and the analyses of mean vectors and covariance matrices

Matrix Analysis for Statistics, Third Edition is an ideal textbook for upper-undergraduate and graduate-level courses on matrix methods, multivariate analysis, and linear models. The book is also an excellent reference for research professionals in applied statistics.

James R. Schott, PhD, is Professor in the Department of Statistics at the University of Central Florida. He has published numerous journal articles in the area of multivariate analysis. Dr. Schott’s research interests include multivariate analysis, analysis of covariance and correlation matrices, and dimensionality reduction techniques.




Table of Contents

COVER

TITLE PAGE

COPYRIGHT

DEDICATION

PREFACE

Preface to the Second Edition

Preface to the Third Edition

ABOUT THE COMPANION WEBSITE

CHAPTER 1: A REVIEW OF ELEMENTARY MATRIX ALGEBRA

1.1 Introduction

1.2 Definitions and Notation

1.3 Matrix Addition and Multiplication

1.4 The Transpose

1.5 The Trace

1.6 The Determinant

1.7 The Inverse

1.8 Partitioned Matrices

1.9 The Rank of a Matrix

1.10 Orthogonal Matrices

1.11 Quadratic Forms

1.12 Complex Matrices

1.13 Random Vectors and Some Related Statistical Concepts

Problems

CHAPTER 2: VECTOR SPACES

2.1 Introduction

2.2 Definitions

2.3 Linear Independence and Dependence

2.4 Matrix Rank and Linear Independence

2.5 Bases and Dimension

2.6 Orthonormal Bases and Projections

2.7 Projection Matrices

2.8 Linear Transformations and Systems of Linear Equations

2.9 The Intersection and Sum of Vector Spaces

2.10 Oblique Projections

2.11 Convex Sets

Problems

CHAPTER 3: EIGENVALUES AND EIGENVECTORS

3.1 Introduction

3.2 Eigenvalues, Eigenvectors, and Eigenspaces

3.3 Some Basic Properties of Eigenvalues and Eigenvectors

3.4 Symmetric Matrices

3.5 Continuity of Eigenvalues and Eigenprojections

3.6 Extremal Properties of Eigenvalues

3.7 Additional Results Concerning Eigenvalues of Symmetric Matrices

3.8 Nonnegative Definite Matrices

3.9 Antieigenvalues and Antieigenvectors

Problems

CHAPTER 4: MATRIX FACTORIZATIONS AND MATRIX NORMS

4.1 Introduction

4.2 The Singular Value Decomposition

4.3 The Spectral Decomposition of a Symmetric Matrix

4.4 The Diagonalization of a Square Matrix

4.5 The Jordan Decomposition

4.6 The Schur Decomposition

4.7 The Simultaneous Diagonalization of Two Symmetric Matrices

4.8 Matrix Norms

Problems

CHAPTER 5: GENERALIZED INVERSES

5.1 Introduction

5.2 The Moore–Penrose Generalized Inverse

5.3 Some Basic Properties of the Moore–Penrose Inverse

5.4 The Moore–Penrose Inverse of a Matrix Product

5.5 The Moore–Penrose Inverse of Partitioned Matrices

5.6 The Moore–Penrose Inverse of a Sum

5.7 The Continuity of the Moore–Penrose Inverse

5.8 Some Other Generalized Inverses

5.9 Computing Generalized Inverses

Problems

CHAPTER 6: SYSTEMS OF LINEAR EQUATIONS

6.1 Introduction

6.2 Consistency of a System of Equations

6.3 Solutions to a Consistent System of Equations

6.4 Homogeneous Systems of Equations

6.5 Least Squares Solutions to a System of Linear Equations

6.6 Least Squares Estimation for Less Than Full Rank Models

6.7 Systems of Linear Equations and the Singular Value Decomposition

6.8 Sparse Linear Systems of Equations

Problems

CHAPTER 7: PARTITIONED MATRICES

7.1 Introduction

7.2 The Inverse

7.3 The Determinant

7.4 Rank

7.5 Generalized Inverses

7.6 Eigenvalues

Problems

CHAPTER 8: SPECIAL MATRICES AND MATRIX OPERATIONS

8.1 Introduction

8.2 The Kronecker Product

8.3 The Direct Sum

8.4 The Vec Operator

8.5 The Hadamard Product

8.6 The Commutation Matrix

8.7 Some Other Matrices Associated with the Vec Operator

8.8 Nonnegative Matrices

8.9 Circulant and Toeplitz Matrices

8.10 Hadamard and Vandermonde Matrices

Problems

CHAPTER 9: MATRIX DERIVATIVES AND RELATED TOPICS

9.1 Introduction

9.2 Multivariable Differential Calculus

9.3 Vector and Matrix Functions

9.4 Some Useful Matrix Derivatives

9.5 Derivatives of Functions of Patterned Matrices

9.6 The Perturbation Method

9.7 Maxima and Minima

9.8 Convex and Concave Functions

9.9 The Method of Lagrange Multipliers

Problems

CHAPTER 10: INEQUALITIES

10.1 Introduction

10.2 Majorization

10.3 Cauchy-Schwarz Inequalities

10.4 Hölder's Inequality

10.5 Minkowski's Inequality

10.6 The Arithmetic-Geometric Mean Inequality

Problems

CHAPTER 11: SOME SPECIAL TOPICS RELATED TO QUADRATIC FORMS

11.1 Introduction

11.2 Some Results on Idempotent Matrices

11.3 Cochran's Theorem

11.4 Distribution of Quadratic Forms in Normal Variates

11.5 Independence of Quadratic Forms

11.6 Expected Values of Quadratic Forms

11.7 The Wishart Distribution

Problems

REFERENCES

INDEX

WILEY SERIES IN PROBABILITY AND STATISTICS

End User License Agreement



List of Illustrations

CHAPTER 2: VECTOR SPACES

Figure 2.1 The angle between x and y

Figure 2.2 Projection of x onto a one-dimensional subspace

Figure 2.3 Projection of x onto a two-dimensional subspace

Figure 2.4 Projection of x onto a along b

CHAPTER 9: MATRIX DERIVATIVES AND RELATED TOPICS

Figure 9.1 A convex function of a scalar variable x

WILEY SERIES IN PROBABILITY AND STATISTICS

 

Established by WALTER A. SHEWHART and SAMUEL S. WILKS

 

Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Geof H. Givens, Harvey Goldstein, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay, Sanford Weisberg

Editors Emeriti: J. Stuart Hunter, Iain M. Johnstone, Joseph B. Kadane, Jozef L. Teugels

 

A complete list of the titles in this series appears at the end of this volume.

MATRIX ANALYSIS FOR STATISTICS

 

Third Edition

 

JAMES R. SCHOTT

 

 

 

Copyright © 2017 by John Wiley & Sons, Inc. All rights reserved

Published by John Wiley & Sons, Inc., Hoboken, New Jersey

Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Names: Schott, James R., 1955- author.

Title: Matrix analysis for statistics / James R. Schott.

Description: Third edition. | Hoboken, New Jersey : John Wiley & Sons, 2016. | Includes bibliographical references and index.

Identifiers: LCCN 2016000005| ISBN 9781119092483 (cloth) | ISBN 9781119092469 (epub)

Subjects: LCSH: Matrices. | Mathematical statistics.

Classification: LCC QA188 .S24 2016 | DDC 512.9/434-dc23 LC record available at http://lccn.loc.gov/2016000005

Cover image courtesy of GettyImages/Alexmumu.

To Susan, Adam, and Sarah

PREFACE

As the field of statistics has developed over the years, the role of matrix methods has evolved from a tool through which statistical problems could be more conveniently expressed to an absolutely essential part in the development, understanding, and use of the more complicated statistical analyses that have appeared in recent years. As such, a background in matrix analysis has become a vital part of a graduate education in statistics. Too often, the statistics graduate student gets his or her matrix background in bits and pieces through various courses on topics such as regression analysis, multivariate analysis, linear models, stochastic processes, and so on. An alternative to this fragmented approach is an entire course devoted to matrix methods useful in statistics. This text has been written with such a course in mind. It also could be used as a text for an advanced undergraduate course with an unusually bright group of students and should prove to be useful as a reference for both applied and research statisticians.

Students beginning in a graduate program in statistics often have their previous degrees in other fields, such as mathematics, and so initially their statistical backgrounds may not be all that extensive. With this in mind, I have tried to make the statistical topics presented as examples in this text as self-contained as possible. This has been accomplished by including a section in the first chapter which covers some basic statistical concepts and by having most of the statistical examples deal with applications which are fairly simple to understand; for instance, many of these examples involve least squares regression or applications that utilize the simple concepts of mean vectors and covariance matrices. Thus, an introductory statistics course should provide the reader of this text with a sufficient background in statistics. An additional prerequisite is an undergraduate course in matrices or linear algebra, while a calculus background is necessary for some portions of the book, most notably, Chapter 8.

By selectively omitting some sections, all nine chapters of this book can be covered in a one-semester course. For instance, in a course targeted at students who end their educational careers with the master's degree, I typically omit Sections 2.10, 3.5, 3.7, 4.8, 5.4-5.7, and 8.6, along with a few other sections.

Anyone writing a book on a subject for which other texts have already been written stands to benefit from these earlier works, and that certainly has been the case here. The texts by Basilevsky (1983), Graybill (1983), Healy (1986), and Searle (1982), all books on matrices for statistics, have helped me, in varying degrees, to formulate my ideas on matrices. Graybill's book has been particularly influential, since this is the book that I referred to extensively, first as a graduate student, and then in the early stages of my research career. Other texts which have proven to be quite helpful are Horn and Johnson (1985, 1991), Magnus and Neudecker (1988), particularly in the writing of Chapter 8, and Magnus (1988).

I wish to thank several anonymous reviewers who offered many very helpful suggestions, and Mark Johnson for his support and encouragement throughout this project. I am also grateful to the numerous students who have alerted me to various mistakes and typos in earlier versions of this book. In spite of their help and my diligent efforts at proofreading, undoubtedly some mistakes remain, and I would appreciate being informed of any that are spotted.

Jim Schott

Orlando, Florida

Preface to the Second Edition

The most notable change in the second edition is the addition of a chapter on results regarding matrices partitioned into a 2×2 form. This new chapter, which is Chapter 7, has the material on the determinant and inverse that was previously given as a section in Chapter 7 of the first edition. Along with the results on the determinant and inverse of a partitioned matrix, I have added new material in this chapter on the rank, generalized inverses, and eigenvalues of partitioned matrices.

The coverage of eigenvalues in Chapter 3 has also been expanded. Some additional results such as Weyl's Theorem have been included, and in so doing, the last section of Chapter 3 of the first edition has now been replaced by two sections.

Other smaller additions, including both theorems and examples, have been made elsewhere throughout the book. Over 100 new exercises have been added to the problems sets.

The writing of a second edition of this book has also given me the opportunity to correct mistakes in the first edition. I would like to thank those readers who have pointed out some of these errors as well as those that have offered suggestions for improvement to the text.

Jim Schott

Orlando, Florida
September 2004

Preface to the Third Edition

The third edition of this text maintains the same organization that was present in the previous editions. The major changes involve the addition of new material, which includes the following.

1. A new chapter, now Chapter 10, on inequalities has been added. Numerous inequalities, such as Cauchy-Schwarz, Hadamard, and Jensen's, already appear in the earlier editions, but there are many important ones that are missing, and some of these are given in the new chapter. Highlighting this chapter is a fairly substantial section on majorization and some of the inequalities that can be developed from this concept.

2. A new section on oblique projections has been added to Chapter 2. The previous editions only covered orthogonal projections.

3. A new section on antieigenvalues and antieigenvectors has been added to Chapter 3.

Numerous other smaller additions have been made throughout the text. These include some additional theorems, the proofs of some results that previously had been given without proof, and some more examples involving statistical applications. Finally, more than 70 new problems have been added to the end-of-chapter problem sets.

Jim Schott

Orlando, Florida
December 2015

ABOUT THE COMPANION WEBSITE

This book is accompanied by a companion website:

www.wiley.com/go/Schott/MatrixAnalysis3e

The instructor's website includes:

A solutions manual with solutions to selected problems

The student's website includes:

A solutions manual with solutions to selected odd-numbered problems

CHAPTER 1: A REVIEW OF ELEMENTARY MATRIX ALGEBRA

1.1 Introduction

In this chapter, we review some of the basic operations and fundamental properties involved in matrix algebra. In most cases, properties will be stated without proof, but in some cases, when instructive, proofs will be presented. We end the chapter with a brief discussion of random variables and random vectors, expected values of random variables, and some important distributions encountered elsewhere in the book.

1.2 Definitions and Notation

Except when stated otherwise, a scalar such as α will represent a real number. A matrix A of size m × n is the m × n rectangular array of scalars given by

A = ⎡ a_11  a_12  ⋯  a_1n ⎤
    ⎢ a_21  a_22  ⋯  a_2n ⎥
    ⎢  ⋮     ⋮          ⋮ ⎥
    ⎣ a_m1  a_m2  ⋯  a_mn ⎦

and sometimes it is simply identified as A = (a_ij). Sometimes it also will be convenient to refer to the (i, j)th element of A as (A)_ij; that is, (A)_ij = a_ij. If m = n, then A is called a square matrix of order m, whereas A is referred to as a rectangular matrix when m ≠ n. An m × 1 matrix

a = ⎡ a_1 ⎤
    ⎢ a_2 ⎥
    ⎢  ⋮  ⎥
    ⎣ a_m ⎦

is called a column vector or simply a vector. The element a_i is referred to as the ith component of a. A 1 × m matrix is called a row vector. The ith row and jth column of the matrix A will be denoted by and , respectively. We will usually use capital letters to represent matrices and lowercase bold letters for vectors.

The diagonal elements of the m × m matrix A are a_11, a_22, …, a_mm. If all other elements of A are equal to 0, A is called a diagonal matrix and can be identified as A = diag(a_11, …, a_mm). If, in addition, a_ii = 1 for i = 1, …, m, so that A = diag(1, …, 1), then the matrix A is called the identity matrix of order m and will be written as I_m, or simply I if the order is obvious. If A = diag(a_11, …, a_mm) and b is a scalar, then we will use A^b to denote the diagonal matrix diag(a_11^b, …, a_mm^b). For any m × m matrix A, D_A will denote the diagonal matrix with diagonal elements equal to those of A, and for any m × 1 vector a, D_a denotes the diagonal matrix with diagonal elements equal to the components of a; that is, D_A = diag(a_11, …, a_mm) and D_a = diag(a_1, …, a_m).

A triangular matrix is a square matrix that is either an upper triangular matrix or a lower triangular matrix. An upper triangular matrix is one that has all of its elements below the diagonal equal to 0, whereas a lower triangular matrix has all of its elements above the diagonal equal to 0. A strictly upper triangular matrix is an upper triangular matrix that has each of its diagonal elements equal to 0. A strictly lower triangular matrix is defined similarly.

The ith column of the m × m identity matrix will be denoted by e_i; that is, e_i is the m × 1 vector that has its ith component equal to 1 and all of its other components equal to 0. When the value of m is not obvious, we will make it more explicit by writing e_i with a subscript indicating its length. The m × m matrix whose only nonzero element is a 1 in the (i, j)th position will be identified as E_ij.

The scalar zero is written 0, whereas a vector of zeros, called a null vector, will be denoted by 0, and a matrix of zeros, called a null matrix, will be denoted by (0). The m × 1 vector having each component equal to 1 will be denoted by 1_m, or simply 1 when the size of the vector is obvious.

1.3 Matrix Addition and Multiplication

The sum of two matrices A and B is defined if they have the same number of rows and the same number of columns; in this case,

A + B = (a_ij + b_ij).

The product of a scalar α and a matrix A is

αA = (αa_ij).

The premultiplication of the matrix B by the matrix A is defined only if the number of columns of A equals the number of rows of B. Thus, if A is m × p and B is p × n, then AB will be the m × n matrix which has its (i, j)th element, (AB)_ij, given by

(AB)_ij = Σ_{k=1}^p a_ik b_kj.

A similar definition exists for BA, the postmultiplication of B by A, if the number of columns of B equals the number of rows of A. When both products are defined, we will not have, in general, AB = BA. If the matrix A is square, then the product AA, or simply A², is defined. In this case, if we have A² = A, then A is said to be an idempotent matrix.

The following basic properties of matrix addition and multiplication in Theorem 1.1 are easy to verify.

Theorem 1.1

Let α and β be scalars and A, B, and C be matrices. Then, when the operations involved are defined, the following properties hold:

a. A + B = B + A.

b. (A + B) + C = A + (B + C).

c. α(A + B) = αA + αB.

d. (α + β)A = αA + βA.

e. A(B + C) = AB + AC.

f. (A + B)C = AC + BC.

g. (AB)C = A(BC).

h. α(AB) = (αA)B = A(αB).
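These operations map directly onto numerical software. The following NumPy sketch (an illustration with arbitrary matrices, not an example from the text) checks that AB and BA generally differ and that the least squares hat matrix X(X′X)⁻¹X′ is idempotent:

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[0., 1.], [1., 1.]])

# Premultiplication of B by A versus postmultiplication: AB and BA differ in general
print(A @ B)
print(B @ A)

# An idempotent matrix: H = X(X'X)^{-1}X' satisfies HH = H
X = np.array([[1., 0.], [1., 1.], [1., 2.]])
H = X @ np.linalg.inv(X.T @ X) @ X.T
print(np.allclose(H @ H, H))   # True
```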

1.4 The Transpose

The transpose of an m × n matrix A is the n × m matrix A′ obtained by interchanging the rows and columns of A. Thus, the (i, j)th element of A′ is a_ji. If A is m × p and B is p × n, then the (i, j)th element of (AB)′ can be expressed as

((AB)′)_ij = (AB)_ji = Σ_{k=1}^p a_jk b_ki = Σ_{k=1}^p (B′)_ik (A′)_kj = (B′A′)_ij.

Thus, evidently (AB)′ = B′A′. This property along with some other results involving the transpose are summarized in Theorem 1.2.

Theorem 1.2

Let α and β be scalars and A and B be matrices. Then, when defined, the following properties hold:

a. (A′)′ = A.

b. (αA + βB)′ = αA′ + βB′.

c. (AB)′ = B′A′.

d. (ABC)′ = C′B′A′.
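A quick numerical check of the reversal rule (AB)′ = B′A′, using arbitrary random matrices (an illustrative sketch only):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

# The transpose of a product is the product of the transposes in reverse order
print(np.allclose((A @ B).T, B.T @ A.T))   # True
```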

If A is m × m, that is, A is a square matrix, then A′ is also m × m. In this case, if A′ = A, then A is called a symmetric matrix, whereas A is called a skew-symmetric matrix if A′ = −A.

The transpose of a column vector is a row vector, and in some situations, we may write a matrix as a column vector times a row vector. For instance, the matrix E_ij defined in Section 1.2 can be expressed as E_ij = e_i e_j′. More generally, e_i e_j′, with e_i being m × 1 and e_j being n × 1, yields an m × n matrix having 1, as its only nonzero element, in the (i, j)th position, and if A is an m × n matrix, then

A = Σ_{i=1}^m Σ_{j=1}^n a_ij e_i e_j′.

1.5 The Trace

The trace is a function that is defined only on square matrices. If A is an m × m matrix, then the trace of A, denoted by tr(A), is defined to be the sum of the diagonal elements of A; that is,

tr(A) = Σ_{i=1}^m a_ii.

Now if A is m × n and B is n × m, then AB is m × m and

tr(AB) = Σ_{i=1}^m Σ_{j=1}^n a_ij b_ji = tr(BA).

This property of the trace, along with some others, is summarized in Theorem 1.3.

Theorem 1.3

Let α be a scalar and A and B be matrices. Then, when the appropriate operations are defined, we have the following properties:

a. tr(αA) = α tr(A).

b. tr(A + B) = tr(A) + tr(B).

c. tr(A′) = tr(A).

d. tr(AB) = tr(BA).

e. tr(A′A) = 0 if and only if A = (0).
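The identity tr(AB) = tr(BA) holds even when AB and BA have different sizes, as the following small NumPy sketch (arbitrary random matrices, for illustration only) confirms:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))
B = rng.standard_normal((6, 4))

# tr(AB) = tr(BA) even though AB is 4 x 4 and BA is 6 x 6
print(np.trace(A @ B), np.trace(B @ A))   # equal up to rounding
```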

1.6 The Determinant

The determinant is another function defined on square matrices. If A is an m × m matrix, then its determinant, denoted by |A|, is given by

|A| = Σ (−1)^{f(j_1, …, j_m)} a_{1j_1} a_{2j_2} ⋯ a_{mj_m},

where the summation is taken over all permutations (j_1, …, j_m) of the set of integers (1, …, m), and the function f(j_1, …, j_m) equals the number of transpositions necessary to change (j_1, …, j_m) to an increasing sequence of components, that is, to (1, …, m). A transposition is the interchange of two of the integers. Although f is not unique, it is uniquely even or odd, so that (−1)^{f(j_1, …, j_m)} is uniquely defined. Note that the determinant produces all products of m terms of the elements of the matrix A such that exactly one element is selected from each row and each column of A.

Using the formula for the determinant, we find that |A| = a_11 when m = 1. If A is 2×2, we have

|A| = a_11 a_22 − a_12 a_21,

and when A is 3×3, we get

|A| = a_11 a_22 a_33 + a_12 a_23 a_31 + a_13 a_21 a_32 − a_11 a_23 a_32 − a_12 a_21 a_33 − a_13 a_22 a_31.

The following properties of the determinant in Theorem 1.4 are fairly straightforward to verify using the definition of a determinant.

Theorem 1.4

If α is a scalar and A is an m × m matrix, then the following properties hold:

a. |A′| = |A|.

b. |αA| = α^m |A|.

c. If A is a diagonal matrix, then |A| = a_11 a_22 ⋯ a_mm.

d. If all elements of a row (or column) of A are zero, then |A| = 0.

e. The interchange of two rows (or columns) of A changes the sign of |A|.

f. If all elements of a row (or column) of A are multiplied by α, then the determinant is multiplied by α.

g. The determinant of A is unchanged when a multiple of one row (or column) is added to another row (or column).

h. If two rows (or columns) of A are proportional to one another, then |A| = 0.

An alternative expression for |A| can be given in terms of the cofactors of A. The minor of the element a_ij, denoted by M_ij, is the determinant of the (m − 1) × (m − 1) matrix obtained after removing the ith row and jth column from A. The corresponding cofactor of a_ij, denoted by A_ij, is then given as A_ij = (−1)^{i+j} M_ij.

Theorem 1.5

For any i = 1, …, m, the determinant of the m × m matrix A can be obtained by expanding along the ith row,

|A| = Σ_{j=1}^m a_ij A_ij,   (1.1)

or expanding along the ith column,

|A| = Σ_{j=1}^m a_ji A_ji.   (1.2)

Proof

We will just prove (1.1), as (1.2) can easily be obtained by applying (1.1) to . We first consider the result when . Clearly

where

and the summation is over all permutations for which . Since , this implies that

where the summation is over all permutations of . If C is the matrix obtained from A by deleting its 1st row and jth column, then can be written

where the summation is over all permutations of and is the minor of . Thus,

as is required. To prove (1.1) when , let D be the m × m matrix for which , , for , and for . Then , and . Thus, since we have already established (1.1) when , we have

and so the proof is complete.

Our next result indicates that if the cofactors of a row or column are matched with the elements from a different row or column, the expansion reduces to 0.

Theorem 1.6

If A is an m × m matrix and h ≠ i, then

Σ_{j=1}^m a_hj A_ij = 0.   (1.3)

Example 1.1

We will find the determinant of the 5 × 5 matrix given by

Using the cofactor expansion formula on the first column of A, we obtain

and then using the same expansion formula on the first column of this 4 × 4 matrix, we get

Because the determinant of the 3×3 matrix above is 6, we have
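Cofactor expansion translates directly into a recursive algorithm. The sketch below (an arbitrary illustration, not the matrix of Example 1.1) expands along the first column and compares the result with numpy.linalg.det; for large matrices this recursion is far too slow, and practical software relies on factorization methods instead:

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion along the first column."""
    A = np.asarray(A, dtype=float)
    m = A.shape[0]
    if m == 1:
        return A[0, 0]
    total = 0.0
    for i in range(m):
        # Minor: delete row i and column 0; the cofactor carries the sign (-1)^i
        minor = np.delete(np.delete(A, i, axis=0), 0, axis=1)
        total += (-1) ** i * A[i, 0] * det_cofactor(minor)
    return total

A = np.array([[2., 1., 0.], [1., 3., 1.], [0., 1., 2.]])
print(det_cofactor(A), np.linalg.det(A))   # both equal 8
```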

Consider the m × m matrix C whose columns are given by the vectors ; that is, we can write . Suppose that, for some m × 1 vector and m × m matrix , we have

Then, if we find the determinant of C by expanding along the first column of C, we get

so that the determinant of C is a linear combination of m determinants. If B is an m × m matrix and we now define , then by applying the previous derivation on each column of C, we find that

where this final sum is only over all permutations of , because Theorem 1.4(h) implies that

if for any . Finally, reordering the columns in and using Theorem 1.4(e), we have

This very useful result is summarized in Theorem 1.7.

Theorem 1.7

If both A and B are square matrices of the same order, then |AB| = |A||B|.

1.7 The Inverse

An m × m matrix A is said to be a nonsingular matrix if |A| ≠ 0 and a singular matrix if |A| = 0. If A is nonsingular, a nonsingular matrix denoted by A⁻¹ and called the inverse of A exists, such that

A⁻¹A = AA⁻¹ = I_m.   (1.4)

This inverse is unique because, if B is another m × m matrix satisfying the inverse formula (1.4) for A, then BA = AB = I_m, and so

B = BI_m = B(AA⁻¹) = (BA)A⁻¹ = I_m A⁻¹ = A⁻¹.

The following basic properties of the matrix inverse in Theorem 1.8 can be easily verified by using (1.4).

Theorem 1.8

If α is a nonzero scalar, and A and B are nonsingular m × m matrices, then the following properties hold:

a. (αA)⁻¹ = α⁻¹A⁻¹.

b. (A′)⁻¹ = (A⁻¹)′.

c. (A⁻¹)⁻¹ = A.

d. |A⁻¹| = |A|⁻¹.

e. If A = A′, then A⁻¹ = (A⁻¹)′.

f. If A = diag(a_11, …, a_mm), then A⁻¹ = diag(a_11⁻¹, …, a_mm⁻¹).

g. (AB)⁻¹ = B⁻¹A⁻¹.

As with the determinant of A, the inverse of A can be expressed in terms of the cofactors of A. Let adj(A), called the adjoint of A, be the transpose of the matrix of cofactors of A; that is, the (i, j)th element of adj(A) is A_ji, the cofactor of a_ji. Then

A adj(A) = adj(A) A = |A| I_m,

because Σ_{j=1}^m a_ij A_ij = |A| follows directly from (1.1) and (1.2), and Σ_{j=1}^m a_hj A_ij = 0, for h ≠ i, follows from (1.3). The equation above then yields the relationship

A⁻¹ = |A|⁻¹ adj(A)

when |A| ≠ 0. Thus, for instance, if A is a 2×2 nonsingular matrix, then

A⁻¹ = (a_11 a_22 − a_12 a_21)⁻¹ ⎡ a_22   −a_12 ⎤
                               ⎣ −a_21   a_11 ⎦.

Similarly, when m = 3, we get A⁻¹ = |A|⁻¹ adj(A), where

adj(A) = ⎡ a_22 a_33 − a_23 a_32   −(a_12 a_33 − a_13 a_32)   a_12 a_23 − a_13 a_22 ⎤
         ⎢ −(a_21 a_33 − a_23 a_31)   a_11 a_33 − a_13 a_31   −(a_11 a_23 − a_13 a_21) ⎥
         ⎣ a_21 a_32 − a_22 a_31   −(a_11 a_32 − a_12 a_31)   a_11 a_22 − a_12 a_21 ⎦.

The relationship between the inverse of a matrix product and the product of the inverses, given in Theorem 1.8(g), is a very useful property. Unfortunately, such a nice relationship does not exist between the inverse of a sum and the sum of the inverses. We do, however, have Theorem 1.9 which is sometimes useful.

Theorem 1.9

Suppose A and B are nonsingular matrices, with A being m × m and B being n × n. For any m × n matrix C and any n × m matrix D, it follows that if A + CBD is nonsingular, then

(A + CBD)⁻¹ = A⁻¹ − A⁻¹C(B⁻¹ + DA⁻¹C)⁻¹DA⁻¹.

Proof

The proof simply involves verifying that (A + CBD)(A + CBD)⁻¹ = I_m for the expression of (A + CBD)⁻¹ given above. We have

and so the result follows.

The expression given for (A + CBD)⁻¹ in Theorem 1.9 involves the inverse of the matrix B⁻¹ + DA⁻¹C. It can be shown (see Problem 7.12) that the conditions of the theorem guarantee that this inverse exists. If m = n and C and D are identity matrices, then we obtain Corollary 1.9.1 of Theorem 1.9.

Corollary 1.9.1

Suppose that A, B, and A + B are all m × m nonsingular matrices. Then

(A + B)⁻¹ = A⁻¹ − A⁻¹(A⁻¹ + B⁻¹)⁻¹A⁻¹.

We obtain Corollary 1.9.2 of Theorem 1.9 when n = 1 and B = 1.

Corollary 1.9.2

Let A be an m × m nonsingular matrix. If c and d are both m × 1 vectors and A + cd′ is nonsingular, then

(A + cd′)⁻¹ = A⁻¹ − (1 + d′A⁻¹c)⁻¹A⁻¹cd′A⁻¹.

Example 1.2

Theorem 1.9 can be particularly useful when m is larger than n and the inverse of A is fairly easy to compute. For instance, suppose we have ,

from which we obtain

It is somewhat tedious to compute the inverse of this 5 × 5 matrix directly. However, the calculations in Theorem 1.9 are fairly straightforward. Clearly, and

so that

and

Thus, we find that
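The identity in Theorem 1.9 is also easy to verify numerically. The following NumPy sketch uses an easily inverted diagonal A and arbitrary C and D (an illustration only, not the matrices of Example 1.2):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 2
A = np.diag(rng.uniform(1.0, 2.0, size=m))   # easy-to-invert m x m matrix
B = np.eye(n)                                # n x n nonsingular matrix
C = rng.standard_normal((m, n))
D = rng.standard_normal((n, m))

Ainv = np.diag(1.0 / np.diag(A))
# Right-hand side of the identity in Theorem 1.9
rhs = Ainv - Ainv @ C @ np.linalg.inv(np.linalg.inv(B) + D @ Ainv @ C) @ D @ Ainv
lhs = np.linalg.inv(A + C @ B @ D)
print(np.allclose(lhs, rhs))   # True
```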

1.8 Partitioned Matrices

Occasionally we will find it useful to partition a given matrix into submatrices. For instance, suppose A is m × n and the positive integers m1, m2, n1, n2 are such that m1 + m2 = m and n1 + n2 = n. Then one way of writing A as a partitioned matrix is

A = ⎡ A11  A12 ⎤
    ⎣ A21  A22 ⎦,

where A11 is m1 × n1, A12 is m1 × n2, A21 is m2 × n1, and A22 is m2 × n2. That is, A11 is the matrix consisting of the first m1 rows and first n1 columns of A, A12 is the matrix consisting of the first m1 rows and last n2 columns of A, and so on. Matrix operations can be expressed in terms of the submatrices of the partitioned matrix. For example, suppose B is an n × p matrix partitioned as

B = ⎡ B11  B12 ⎤
    ⎣ B21  B22 ⎦,

where B11 is n1 × p1, B12 is n1 × p2, B21 is n2 × p1, B22 is n2 × p2, and p1 + p2 = p. Then the premultiplication of B by A can be expressed in partitioned form as

AB = ⎡ A11 B11 + A12 B21   A11 B12 + A12 B22 ⎤
     ⎣ A21 B11 + A22 B21   A21 B12 + A22 B22 ⎦.

Matrices can be partitioned into submatrices in other ways besides this 2×2 partitioned form. For instance, we could partition only the columns of A, yielding the expression

A = [ A1  A2 ],

where A1 is m × n1 and A2 is m × n2. A more general situation is one in which the rows of A are partitioned into r groups and the columns of A are partitioned into c groups so that A can be written as

A = ⎡ A11  ⋯  A1c ⎤
    ⎢  ⋮        ⋮  ⎥
    ⎣ Ar1  ⋯  Arc ⎦,

where the submatrix Aij is mi × nj and the integers m1, …, mr and n1, …, nc are such that

m1 + ⋯ + mr = m  and  n1 + ⋯ + nc = n.

This matrix A is said to be in block diagonal form if r = c, Aii is a square matrix for each i, and Aij is a null matrix for all i and j for which i ≠ j. In this case, we will write A = diag(A11, …, Arr); that is,

A = ⎡ A11  (0)  ⋯  (0) ⎤
    ⎢ (0)  A22  ⋯  (0) ⎥
    ⎢  ⋮    ⋮        ⋮  ⎥
    ⎣ (0)  (0)  ⋯  Arr ⎦.

Example 1.3

Suppose we wish to compute the transpose product , where the 5 × 5 matrix A is given by

The computation can be simplified by observing that A may be written as

As a result, we have
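Blockwise multiplication can be checked numerically as well. The following sketch (arbitrary random blocks, not the matrix of Example 1.3) builds A and B from conformable blocks and verifies that the partitioned product formula agrees with the ordinary product:

```python
import numpy as np

rng = np.random.default_rng(2)
A11, A12 = rng.standard_normal((2, 3)), rng.standard_normal((2, 2))
A21, A22 = rng.standard_normal((3, 3)), rng.standard_normal((3, 2))
B11, B12 = rng.standard_normal((3, 4)), rng.standard_normal((3, 1))
B21, B22 = rng.standard_normal((2, 4)), rng.standard_normal((2, 1))

A = np.block([[A11, A12], [A21, A22]])
B = np.block([[B11, B12], [B21, B22]])

# The blockwise product agrees with the ordinary product AB
AB_blocks = np.block([
    [A11 @ B11 + A12 @ B21, A11 @ B12 + A12 @ B22],
    [A21 @ B11 + A22 @ B21, A21 @ B12 + A22 @ B22],
])
print(np.allclose(A @ B, AB_blocks))   # True
```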

1.9 The Rank of a Matrix

Our initial definition of the rank of an m × n matrix A is given in terms of submatrices. We will see an alternative equivalent definition in terms of the concept of linearly independent vectors in Chapter 2. Most of the material we include in this section can be found in more detail in texts on elementary linear algebra such as Andrilli and Hecker (2010) and Poole (2015).

In general, any matrix formed by deleting rows or columns of A is called a submatrix of A. The determinant of an r × r submatrix of A is called a minor of order r. For instance, for an m × m matrix A, we have previously defined what we called the minor of a_ij; this is an example of a minor of order m − 1. Now the rank of a nonnull m × n matrix A is r, written rank(A) = r, if at least one of its minors of order r is nonzero while all minors of order r + 1 (if there are any) are zero. If A is a null matrix, then rank(A) = 0. If rank(A) = min(m, n), then A is said to have full rank. In particular, if rank(A) = m, A has full row rank, and if rank(A) = n, A has full column rank.

The rank of a matrix A is unchanged by any of the following operations, called elementary transformations:

a. The interchange of two rows (or columns) of A.

b. The multiplication of a row (or column) of A by a nonzero scalar.

c. The addition of a scalar multiple of a row (or column) of A to another row (or column) of A.

Thus, the definition of the rank of A is sometimes given as the number of nonzero rows in the reduced row echelon form of A.

Any elementary transformation of A can be expressed as the multiplication of A by a matrix referred to as an elementary transformation matrix. An elementary transformation of the rows of A will be given by the premultiplication of A by an elementary transformation matrix, whereas an elementary transformation of the columns corresponds to a postmultiplication. Elementary transformation matrices are nonsingular, and any nonsingular matrix can be expressed as the product of elementary transformation matrices. Consequently, we have Theorem 1.10.

Theorem 1.10

Let A be an m × n matrix, B be an m × m matrix, and C be an n × n matrix. Then if B and C are nonsingular matrices, it follows that

rank(BA) = rank(AC) = rank(BAC) = rank(A).

By using elementary transformation matrices, any matrix A can be transformed into another matrix of simpler form having the same rank as A.

Theorem 1.11

If A is an m × n matrix of rank r > 0, then nonsingular m × m and n × n matrices B and C exist, such that BAC = H and A = B⁻¹HC⁻¹, where H is given by

H = ⎡ I_r  (0) ⎤
    ⎣ (0)  (0) ⎦.

Corollary 1.11.1 is an immediate consequence of Theorem 1.11.

Corollary 1.11.1

Let A be an m × n matrix with rank(A) = r > 0. Then an m × r matrix F and an r × n matrix G exist, such that rank(F) = rank(G) = r and A = FG.
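Numerically, the rank is usually computed from a factorization rather than from minors. The sketch below (an illustration with a hypothetical matrix) computes the rank with numpy.linalg.matrix_rank and then constructs one full rank factorization A = FG of the kind guaranteed by Corollary 1.11.1, assuming here that the first r columns of A happen to be linearly independent:

```python
import numpy as np

# A 4 x 3 matrix of rank 2: its third column is the sum of the first two
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 2.],
              [2., 1., 3.]])
r = np.linalg.matrix_rank(A)
print(r)   # 2

# One full rank factorization A = FG with F (4 x 2) and G (2 x 3),
# built from the first r columns of A (assumed independent here)
F = A[:, :r]
G = np.linalg.lstsq(F, A, rcond=None)[0]
print(np.allclose(A, F @ G))   # True
```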

1.10 Orthogonal Matrices

An m × 1 vector p is said to be a normalized vector or a unit vector if p′p = 1. The m × 1 vectors, p_1, …, p_n, where n ≤ m, are said to be orthogonal if p_i′p_j = 0 for all i ≠ j. If, in addition, each p_i is a normalized vector, then the vectors are said to be orthonormal. An m × m matrix P whose columns form an orthonormal set of vectors is called an orthogonal matrix. It immediately follows that

P′P = I_m.

Taking the determinant of both sides, we see that

|P′P| = |P′||P| = |P|² = |I_m| = 1.

Thus, |P| = +1 or −1, so that P is nonsingular, P⁻¹ = P′, and PP′ = I_m in addition to P′P = I_m; that is, the rows of P also form an orthonormal set of m × 1 vectors. Some basic properties of orthogonal matrices are summarized in Theorem 1.12.

Theorem 1.12

Let P and Q be m × m orthogonal matrices and A be any m × m matrix. Then

a. tr(P′AP) = tr(A),

b. |P′AP| = |A|,

c. PQ is an orthogonal matrix.

One example of an m × m orthogonal matrix, known as the Helmert matrix, has the form

For instance, if , the Helmert matrix is
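One common way to construct a Helmert matrix in code is sketched below; the scaling convention (first row constant, each later row contrasting the leading components against the next one) is an assumption and may differ in detail from the form displayed in the text, but the resulting matrix is orthogonal:

```python
import numpy as np

def helmert(m):
    """Return an m x m Helmert-type matrix (one common convention):
    the first row is constant, and row k places equal positive weights on the
    first k components and a balancing negative weight on component k + 1."""
    H = np.zeros((m, m))
    H[0, :] = 1.0 / np.sqrt(m)
    for k in range(1, m):
        H[k, :k] = 1.0 / np.sqrt(k * (k + 1))
        H[k, k] = -k / np.sqrt(k * (k + 1))
    return H

P = helmert(4)
print(np.allclose(P.T @ P, np.eye(4)))   # True: P is orthogonal
```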

Note that if m ≠ n, it is possible for an m × n matrix P to satisfy one of the identities, P′P = I_n or PP′ = I_m, but not both. Such a matrix is sometimes referred to as a semiorthogonal matrix.

An m × m matrix P is called a permutation matrix if each row and each column of P has a single element 1, while all remaining elements are zeros. As a result, the columns of P will be e_1, …, e_m, the columns of I_m, in some order. Note then that each diagonal element of P′P equals e_i′e_i = 1 for some i, and each off-diagonal element equals e_i′e_j = 0 for some i ≠ j; that is, a permutation matrix is a special orthogonal matrix. Since there are m! ways of permuting the columns of I_m, there are m! different permutation matrices of order m. If A is also m × m, then PA creates an m × m matrix by permuting the rows of A, and AP produces a matrix by permuting the columns of A.

1.11 Quadratic Forms

Let x be an m × 1 vector, y an n × 1 vector, and A an m × n matrix. Then the function of x and y given by

f(x, y) = x′Ay = Σ_{i=1}^m Σ_{j=1}^n a_ij x_i y_j

is sometimes called a bilinear form in x and y. We will be most interested in the special case in which n = m, so that A is m × m, and y = x. In this case, the function above reduces to the function of x,

f(x) = x′Ax = Σ_{i=1}^m Σ_{j=1}^m a_ij x_i x_j,

which is called a quadratic form in x; A is referred to as the matrix of the quadratic form. We will always assume that A is a symmetric matrix because, if it is not, A may be replaced by (A + A′)/2, which is symmetric, without altering f(x); that is,

x′{(A + A′)/2}x = (x′Ax + x′A′x)/2 = x′Ax,

because x′A′x = (x′Ax)′ = x′Ax. As an example, consider the function

where x is 3 × 1. The symmetric matrix A satisfying is given by
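Replacing A by (A + A′)/2 leaves the quadratic form unchanged, which is easy to confirm numerically (the matrix below is an arbitrary illustration, not the one from the example above):

```python
import numpy as np

A = np.array([[1., 4.], [0., 3.]])   # not symmetric
S = (A + A.T) / 2.0                  # symmetric version of A

x = np.array([2., -1.])
print(x @ A @ x, x @ S @ x)          # identical values
```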

Every symmetric matrix A and its associated quadratic form is classified into one of the following five categories:

a. If x′Ax > 0 for all x ≠ 0, then A is positive definite.

b. If x′Ax ≥ 0 for all x and x′Ax = 0 for some x ≠ 0, then A is positive semidefinite.

c. If x′Ax < 0 for all x ≠ 0, then A is negative definite.

d. If x′Ax ≤ 0 for all x and x′Ax = 0 for some x ≠ 0, then A is negative semidefinite.

e. If x′Ax > 0 for some x and x′Ax < 0 for some x, then A is indefinite.

Note that the null matrix is actually both positive semidefinite and negative semidefinite.

Positive definite and negative definite matrices are nonsingular, whereas positive semidefinite and negative semidefinite matrices are singular. Sometimes the term nonnegative definite will be used to refer to a symmetric matrix that is either positive definite or positive semidefinite. An m × m matrix B is called a square root of the nonnegative definite m × m matrix A if A = BB′. Sometimes we will denote such a matrix B as A^{1/2}. If B is also symmetric, so that A = BB′ = B², then B is called the symmetric square root of A.
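For a nonnegative definite A, a symmetric square root can be computed from the eigendecomposition of A (treated in detail in later chapters); the following sketch is an illustration with a simple 2 × 2 matrix:

```python
import numpy as np

# Symmetric square root of a nonnegative definite matrix via its eigendecomposition
A = np.array([[2., 1.], [1., 2.]])
w, V = np.linalg.eigh(A)                  # A = V diag(w) V'
B = V @ np.diag(np.sqrt(w)) @ V.T         # symmetric, and B @ B = A
print(np.allclose(B @ B, A), np.allclose(B, B.T))   # True True
```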

Quadratic forms play a prominent role in inferential statistics. In Chapter 11, we will develop some of the most important results involving quadratic forms that are of particular interest in statistics.

1.12 Complex Matrices

Throughout most of this text, we will be dealing with the analysis of vectors and matrices composed of real numbers or variables. However, there are occasions in which an analysis of a real matrix, such as the decomposition of a matrix in the form of a product of other matrices, leads to matrices that contain complex numbers. For this reason, we will briefly summarize in this section some of the basic notation and terminology regarding complex numbers.

Any complex number c can be written in the form

c = a + ib,

where a and b are real numbers and i represents the imaginary number √−1. The real number a is called the real part of c, whereas b is referred to as the imaginary part of c. Thus, the number c is a real number only if b is 0. If we have two complex numbers, c_1 = a_1 + ib_1 and c_2 = a_2 + ib_2, then their sum is given by

c_1 + c_2 = (a_1 + a_2) + i(b_1 + b_2),

whereas their product is given by

c_1 c_2 = (a_1 a_2 − b_1 b_2) + i(a_1 b_2 + a_2 b_1).

Corresponding to each complex number c is another complex number denoted by c̄ and called the complex conjugate of c. The complex conjugate of c is given by c̄ = a − ib and satisfies cc̄ = a² + b², so that the product of a complex number and its conjugate results in a real number.
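Python's built-in complex type follows these rules directly; for example (an illustrative snippet):

```python
# The product of a complex number and its conjugate is real: c * conj(c) = a^2 + b^2
c = 3 + 4j
print(c * c.conjugate())   # (25+0j)
print(abs(c) ** 2)         # 25.0
```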

A complex number can be represented geometrically by a point in the complex plane, where one of the axes is the real axis and the other axis is the complex or imaginary axis. Thus, the complex number c = a + ib would be represented by the point (a, b) in this complex plane. Alternatively, we can use the polar coordinates (r, θ), where r is the length of the line from the origin to the point and