Matrix Differential Calculus with Applications in Statistics and Econometrics

Jan R. Magnus

Description

A brand new, fully updated edition of a popular classic on matrix differential calculus with applications in statistics and econometrics. This exhaustive, self-contained book on matrix theory and matrix differential calculus provides a treatment of matrix calculus based on differentials and shows how easy it is to use this theory once you have mastered the technique. Jan Magnus, who, along with the late Heinz Neudecker, pioneered the theory, develops it further in this new edition and provides many examples along the way to support it.

Matrix calculus has become an essential tool for quantitative methods in a large number of applications, ranging from social and behavioral sciences to econometrics. It is still relevant and used today in a wide range of subjects such as the biosciences and psychology. Matrix Differential Calculus with Applications in Statistics and Econometrics, Third Edition contains all of the essentials of multivariable calculus with an emphasis on the use of differentials. It starts by presenting a concise, yet thorough overview of matrix algebra, then goes on to develop the theory of differentials. The rest of the text combines the theory and application of matrix differential calculus, providing the practitioner and researcher with both a quick review and a detailed reference. The book:

* Fulfills the need for an updated and unified treatment of matrix differential calculus
* Contains many new examples and exercises based on questions asked of the author over the years
* Covers new developments in the field and features new applications
* Is written by a leading expert and pioneer of the theory
* Is part of the Wiley Series in Probability and Statistics

Matrix Differential Calculus with Applications in Statistics and Econometrics, Third Edition is an ideal text for graduate students and academics studying the subject, as well as for postgraduates and specialists working in biosciences and psychology.




Table of Contents

Cover

Preface

Part One: Matrices

Chapter 1: Basic properties of vectors and matrices

1 INTRODUCTION

2 SETS

3 MATRICES: ADDITION AND MULTIPLICATION

4 THE TRANSPOSE OF A MATRIX

5 SQUARE MATRICES

6 LINEAR FORMS AND QUADRATIC FORMS

7 THE RANK OF A MATRIX

8 THE INVERSE

9 THE DETERMINANT

10 THE TRACE

11 PARTITIONED MATRICES

12 COMPLEX MATRICES

13 EIGENVALUES AND EIGENVECTORS

14 SCHUR'S DECOMPOSITION THEOREM

15 THE JORDAN DECOMPOSITION

16 THE SINGULAR‐VALUE DECOMPOSITION

17 FURTHER RESULTS CONCERNING EIGENVALUES

18 POSITIVE (SEMI)DEFINITE MATRICES

19 THREE FURTHER RESULTS FOR POSITIVE DEFINITE MATRICES

20 A USEFUL RESULT

21 SYMMETRIC MATRIX FUNCTIONS

Chapter 2: Kronecker products, vec operator, and Moore‐Penrose inverse

1 INTRODUCTION

2 THE KRONECKER PRODUCT

3 EIGENVALUES OF A KRONECKER PRODUCT

4 THE VEC OPERATOR

5 THE MOORE‐PENROSE (MP) INVERSE

6 EXISTENCE AND UNIQUENESS OF THE MP INVERSE

7 SOME PROPERTIES OF THE MP INVERSE

8 FURTHER PROPERTIES

9 THE SOLUTION OF LINEAR EQUATION SYSTEMS

BIBLIOGRAPHICAL NOTES

Chapter 3: Miscellaneous matrix results

1 INTRODUCTION

2 THE ADJOINT MATRIX

3 PROOF OF THEOREM 3.1

4 BORDERED DETERMINANTS

5 THE MATRIX EQUATION AX = 0

6 THE HADAMARD PRODUCT

7 THE COMMUTATION MATRIX Kmn

8 THE DUPLICATION MATRIX Dn

9 RELATIONSHIP BETWEEN Dn+1 AND Dn, I

10 RELATIONSHIP BETWEEN Dn+1 AND Dn, II

11 CONDITIONS FOR A QUADRATIC FORM TO BE POSITIVE (NEGATIVE) SUBJECT TO LINEAR CONSTRAINTS

12 NECESSARY AND SUFFICIENT CONDITIONS FOR r(A : B) = r(A) + r(B)

13 THE BORDERED GRAMIAN MATRIX

14 THE EQUATIONS X1A + X2B′ = G1, X1B = G2

BIBLIOGRAPHICAL NOTES

Part Two: Differentials: the theory

Chapter 4: Mathematical preliminaries

1 INTRODUCTION

2 INTERIOR POINTS AND ACCUMULATION POINTS

3 OPEN AND CLOSED SETS

4 THE BOLZANO‐WEIERSTRASS THEOREM

5 FUNCTIONS

6 THE LIMIT OF A FUNCTION

7 CONTINUOUS FUNCTIONS AND COMPACTNESS

8 CONVEX SETS

9 CONVEX AND CONCAVE FUNCTIONS

Chapter 5: Differentials and differentiability

1 INTRODUCTION

2 CONTINUITY

3 DIFFERENTIABILITY AND LINEAR APPROXIMATION

4 THE DIFFERENTIAL OF A VECTOR FUNCTION

5 UNIQUENESS OF THE DIFFERENTIAL

6 CONTINUITY OF DIFFERENTIABLE FUNCTIONS

7 PARTIAL DERIVATIVES

8 THE FIRST IDENTIFICATION THEOREM

9 EXISTENCE OF THE DIFFERENTIAL, I

10 EXISTENCE OF THE DIFFERENTIAL, II

11 CONTINUOUS DIFFERENTIABILITY

12 THE CHAIN RULE

13 CAUCHY INVARIANCE

14 THE MEAN‐VALUE THEOREM FOR REAL‐VALUED FUNCTIONS

15 DIFFERENTIABLE MATRIX FUNCTIONS

16 SOME REMARKS ON NOTATION

17 COMPLEX DIFFERENTIATION

BIBLIOGRAPHICAL NOTES

Chapter 6: The second differential

1 INTRODUCTION

2 SECOND‐ORDER PARTIAL DERIVATIVES

3 THE HESSIAN MATRIX

4 TWICE DIFFERENTIABILITY AND SECOND‐ORDER APPROXIMATION, I

5 DEFINITION OF TWICE DIFFERENTIABILITY

6 THE SECOND DIFFERENTIAL

7 SYMMETRY OF THE HESSIAN MATRIX

8 THE SECOND IDENTIFICATION THEOREM

9 TWICE DIFFERENTIABILITY AND SECOND‐ORDER APPROXIMATION, II

10 CHAIN RULE FOR HESSIAN MATRICES

11 THE ANALOG FOR SECOND DIFFERENTIALS

12 TAYLOR'S THEOREM FOR REAL‐VALUED FUNCTIONS

13 HIGHER‐ORDER DIFFERENTIALS

14 REAL ANALYTIC FUNCTIONS

15 TWICE DIFFERENTIABLE MATRIX FUNCTIONS

BIBLIOGRAPHICAL NOTES

Chapter 7: Static optimization

1 INTRODUCTION

2 UNCONSTRAINED OPTIMIZATION

3 THE EXISTENCE OF ABSOLUTE EXTREMA

4 NECESSARY CONDITIONS FOR A LOCAL MINIMUM

5 SUFFICIENT CONDITIONS FOR A LOCAL MINIMUM: FIRST‐DERIVATIVE TEST

6 SUFFICIENT CONDITIONS FOR A LOCAL MINIMUM: SECOND‐DERIVATIVE TEST

7 CHARACTERIZATION OF DIFFERENTIABLE CONVEX FUNCTIONS

8 CHARACTERIZATION OF TWICE DIFFERENTIABLE CONVEX FUNCTIONS

9 SUFFICIENT CONDITIONS FOR AN ABSOLUTE MINIMUM

10 MONOTONIC TRANSFORMATIONS

11 OPTIMIZATION SUBJECT TO CONSTRAINTS

12 NECESSARY CONDITIONS FOR A LOCAL MINIMUM UNDER CONSTRAINTS

13 SUFFICIENT CONDITIONS FOR A LOCAL MINIMUM UNDER CONSTRAINTS

14 SUFFICIENT CONDITIONS FOR AN ABSOLUTE MINIMUM UNDER CONSTRAINTS

15 A NOTE ON CONSTRAINTS IN MATRIX FORM

16 ECONOMIC INTERPRETATION OF LAGRANGE MULTIPLIERS

APPENDIX: THE IMPLICIT FUNCTION THEOREM

Part Three: Differentials: the practice

Chapter 8: Some important differentials

1 INTRODUCTION

2 FUNDAMENTAL RULES OF DIFFERENTIAL CALCULUS

3 THE DIFFERENTIAL OF A DETERMINANT

4 THE DIFFERENTIAL OF AN INVERSE

5 DIFFERENTIAL OF THE MOORE‐PENROSE INVERSE

6 THE DIFFERENTIAL OF THE ADJOINT MATRIX

7 ON DIFFERENTIATING EIGENVALUES AND EIGENVECTORS

8 THE CONTINUITY OF EIGENPROJECTIONS

9 THE DIFFERENTIAL OF EIGENVALUES AND EIGENVECTORS: SYMMETRIC CASE

10 TWO ALTERNATIVE EXPRESSIONS FOR dλ

11 SECOND DIFFERENTIAL OF THE EIGENVALUE FUNCTION

Chapter 9: First‐order differentials and Jacobian matrices

1 INTRODUCTION

2 CLASSIFICATION

3 DERISATIVES

4 DERIVATIVES

5 IDENTIFICATION OF JACOBIAN MATRICES

6 THE FIRST IDENTIFICATION TABLE

7 PARTITIONING OF THE DERIVATIVE

8 SCALAR FUNCTIONS OF A SCALAR

9 SCALAR FUNCTIONS OF A VECTOR

10 SCALAR FUNCTIONS OF A MATRIX, I: TRACE

11 SCALAR FUNCTIONS OF A MATRIX, II: DETERMINANT

12 SCALAR FUNCTIONS OF A MATRIX, III: EIGENVALUE

13 TWO EXAMPLES OF VECTOR FUNCTIONS

14 MATRIX FUNCTIONS

15 KRONECKER PRODUCTS

16 SOME OTHER PROBLEMS

17 JACOBIANS OF TRANSFORMATIONS

Chapter 10: Second‐order differentials and Hessian matrices

1 INTRODUCTION

2 THE SECOND IDENTIFICATION TABLE

3 LINEAR AND QUADRATIC FORMS

4 A USEFUL THEOREM

5 THE DETERMINANT FUNCTION

6 THE EIGENVALUE FUNCTION

7 OTHER EXAMPLES

8 COMPOSITE FUNCTIONS

9 THE EIGENVECTOR FUNCTION

10 HESSIAN OF MATRIX FUNCTIONS, I

11 HESSIAN OF MATRIX FUNCTIONS, II

Part Four: Inequalities

Chapter 11: Inequalities

1 INTRODUCTION

2 THE CAUCHY‐SCHWARZ INEQUALITY

3 MATRIX ANALOGS OF THE CAUCHY‐SCHWARZ INEQUALITY

4 THE THEOREM OF THE ARITHMETIC AND GEOMETRIC MEANS

5 THE RAYLEIGH QUOTIENT

6 CONCAVITY OF λ1 AND CONVEXITY OF λn

7 VARIATIONAL DESCRIPTION OF EIGENVALUES

8 FISCHER'S MIN‐MAX THEOREM

9 MONOTONICITY OF THE EIGENVALUES

10 THE POINCARÉ SEPARATION THEOREM

11 TWO COROLLARIES OF POINCARÉ'S THEOREM

12 FURTHER CONSEQUENCES OF THE POINCARÉ THEOREM

13 MULTIPLICATIVE VERSION

14 THE MAXIMUM OF A BILINEAR FORM

15 HADAMARD'S INEQUALITY

16 AN INTERLUDE: KARAMATA'S INEQUALITY

17 KARAMATA'S INEQUALITY AND EIGENVALUES

18 AN INEQUALITY CONCERNING POSITIVE SEMIDEFINITE MATRICES

19 A REPRESENTATION THEOREM FOR (∑xᵢᵖ)^(1/p)

20 A REPRESENTATION THEOREM FOR (tr Aᵖ)^(1/p)

21 HÖLDER'S INEQUALITY

22 CONCAVITY OF log|A|

23 MINKOWSKI'S INEQUALITY

24 QUASILINEAR REPRESENTATION OF |A|

25 MINKOWSKI'S DETERMINANT THEOREM

26 WEIGHTED MEANS OF ORDER p

27 SCHLÖMILCH'S INEQUALITY

28 CURVATURE PROPERTIES OF Mp(x, a)

29 LEAST SQUARES

30 GENERALIZED LEAST SQUARES

31 RESTRICTED LEAST SQUARES

32 RESTRICTED LEAST SQUARES: MATRIX VERSION

Part Five: The linear model

Chapter 12: Statistical preliminaries

1 INTRODUCTION

2 THE CUMULATIVE DISTRIBUTION FUNCTION

3 THE JOINT DENSITY FUNCTION

4 EXPECTATIONS

5 VARIANCE AND COVARIANCE

6 INDEPENDENCE OF TWO RANDOM VARIABLES

7 INDEPENDENCE OF n RANDOM VARIABLES

8 SAMPLING

9 THE ONE‐DIMENSIONAL NORMAL DISTRIBUTION

10 THE MULTIVARIATE NORMAL DISTRIBUTION

11 ESTIMATION

Chapter 13: The linear regression model

1 INTRODUCTION

2 AFFINE MINIMUM‐TRACE UNBIASED ESTIMATION

3 THE GAUSS‐MARKOV THEOREM

4 THE METHOD OF LEAST SQUARES

5 AITKEN'S THEOREM

6 MULTICOLLINEARITY

7 ESTIMABLE FUNCTIONS

8 LINEAR CONSTRAINTS: THE CASE ℳ(R) ⊂ ℳ(X)

9 LINEAR CONSTRAINTS: THE GENERAL CASE

10 LINEAR CONSTRAINTS: THE CASE ℳ(R) ∩ ℳ(X) = {0}

11 A SINGULAR VARIANCE MATRIX: THE CASE ℳ(X) ⊂ ℳ(V)

12 A SINGULAR VARIANCE MATRIX: THE CASE r(X′V⁺X) = r(X)

13 A SINGULAR VARIANCE MATRIX: THE GENERAL CASE, I

14 EXPLICIT AND IMPLICIT LINEAR CONSTRAINTS

15 THE GENERAL LINEAR MODEL, I

16 A SINGULAR VARIANCE MATRIX: THE GENERAL CASE, II

17 THE GENERAL LINEAR MODEL, II

18 GENERALIZED LEAST SQUARES

19 RESTRICTED LEAST SQUARES

Chapter 14: Further topics in the linear model

1 INTRODUCTION

2 BEST QUADRATIC UNBIASED ESTIMATION OF σ²

3 THE BEST QUADRATIC AND POSITIVE UNBIASED ESTIMATOR OF σ²

4 THE BEST QUADRATIC UNBIASED ESTIMATOR OF σ²

5 BEST QUADRATIC INVARIANT ESTIMATION OF σ²

6 THE BEST QUADRATIC AND POSITIVE INVARIANT ESTIMATOR OF σ²

7 THE BEST QUADRATIC INVARIANT ESTIMATOR OF σ²

8 BEST QUADRATIC UNBIASED ESTIMATION: MULTIVARIATE NORMAL CASE

9 BOUNDS FOR THE BIAS OF THE LEAST‐SQUARES ESTIMATOR OF σ², I

10 BOUNDS FOR THE BIAS OF THE LEAST‐SQUARES ESTIMATOR OF σ², II

11 THE PREDICTION OF DISTURBANCES

12 BEST LINEAR UNBIASED PREDICTORS WITH SCALAR VARIANCE MATRIX

13 BEST LINEAR UNBIASED PREDICTORS WITH FIXED VARIANCE MATRIX, I

14 BEST LINEAR UNBIASED PREDICTORS WITH FIXED VARIANCE MATRIX, II

15 LOCAL SENSITIVITY OF THE POSTERIOR MEAN

16 LOCAL SENSITIVITY OF THE POSTERIOR PRECISION

Part Six: Applications to maximum likelihood estimation

Chapter 15: Maximum likelihood estimation

1 INTRODUCTION

2 THE METHOD OF MAXIMUM LIKELIHOOD (ML)

3 ML ESTIMATION OF THE MULTIVARIATE NORMAL DISTRIBUTION

4 SYMMETRY: IMPLICIT VERSUS EXPLICIT TREATMENT

5 THE TREATMENT OF POSITIVE DEFINITENESS

6 THE INFORMATION MATRIX

7 ML ESTIMATION OF THE MULTIVARIATE NORMAL DISTRIBUTION: DISTINCT MEANS

8 THE MULTIVARIATE LINEAR REGRESSION MODEL

9 THE ERRORS‐IN‐VARIABLES MODEL

10 THE NONLINEAR REGRESSION MODEL WITH NORMAL ERRORS

11 SPECIAL CASE: FUNCTIONAL INDEPENDENCE OF MEAN AND VARIANCE PARAMETERS

12 GENERALIZATION OF THEOREM 15.6

Chapter 16: Simultaneous equations

1 INTRODUCTION

2 THE SIMULTANEOUS EQUATIONS MODEL

3 THE IDENTIFICATION PROBLEM

4 IDENTIFICATION WITH LINEAR CONSTRAINTS ON B AND Γ ONLY

5 IDENTIFICATION WITH LINEAR CONSTRAINTS ON B, Γ, AND Σ

6 NONLINEAR CONSTRAINTS

7 FIML: THE INFORMATION MATRIX (GENERAL CASE)

8 FIML: ASYMPTOTIC VARIANCE MATRIX (SPECIAL CASE)

9 LIML: FIRST‐ORDER CONDITIONS

10 LIML: INFORMATION MATRIX

11 LIML: ASYMPTOTIC VARIANCE MATRIX

BIBLIOGRAPHICAL NOTES

Chapter 17: Topics in psychometrics

1 INTRODUCTION

2 POPULATION PRINCIPAL COMPONENTS

3 OPTIMALITY OF PRINCIPAL COMPONENTS

4 A RELATED RESULT

5 SAMPLE PRINCIPAL COMPONENTS

6 OPTIMALITY OF SAMPLE PRINCIPAL COMPONENTS

7 ONE‐MODE COMPONENT ANALYSIS

8 ONE‐MODE COMPONENT ANALYSIS AND SAMPLE PRINCIPAL COMPONENTS

9 TWO‐MODE COMPONENT ANALYSIS

10 MULTIMODE COMPONENT ANALYSIS

11 FACTOR ANALYSIS

12 A ZIGZAG ROUTINE

13 NEWTON‐RAPHSON ROUTINE

14 KAISER'S VARIMAX METHOD

15 CANONICAL CORRELATIONS AND VARIATES IN THE POPULATION

16 CORRESPONDENCE ANALYSIS

17 LINEAR DISCRIMINANT ANALYSIS

BIBLIOGRAPHICAL NOTES

Part Seven: Summary

Chapter 18: Matrix calculus: the essentials

1 INTRODUCTION

2 DIFFERENTIALS

3 VECTOR CALCULUS

4 OPTIMIZATION

5 LEAST SQUARES

6 MATRIX CALCULUS

7 INTERLUDE ON LINEAR AND QUADRATIC FORMS

8 THE SECOND DIFFERENTIAL

9 CHAIN RULE FOR SECOND DIFFERENTIALS

10 FOUR EXAMPLES

11 THE KRONECKER PRODUCT AND VEC OPERATOR

12 IDENTIFICATION

13 THE COMMUTATION MATRIX

14 FROM SECOND DIFFERENTIAL TO HESSIAN

15 SYMMETRY AND THE DUPLICATION MATRIX

16 MAXIMUM LIKELIHOOD

FURTHER READING

Bibliography

Index of symbols

Subject index

End User License Agreement

List of Tables

Chapter 9

Table 9.1 Classification of functions and variables

Table 9.2 The first identification table

Table 9.3 Differentials of linear and quadratic forms

Table 9.4 Differentials involving the trace

Table 9.5 Differentials involving the determinant

Table 9.6 Simple matrix differentials

Table 9.7 Matrix differentials involving powers

Chapter 10

Table 10.1 The second identification table

List of Illustrations

Chapter 4

Figure 4.1 Convex and nonconvex sets in ℝ²

Figure 4.2 A convex function

Chapter 5

Figure 5.1 Geometrical interpretation of the differential

Chapter 7

Figure 7.1 Unconstrained optimization in one variable

Chapter 8

Figure 8.1 The eigenvalue functions

Chapter 11

Figure 11.1 Diagram showing that A(ε) < 0


WILEY SERIES IN PROBABILITY AND STATISTICS

Established by Walter E. Shewhart and Samuel S. Wilks

Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Geof H. Givens, Harvey Goldstein, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, and Ruey S. Tsay

Editors Emeriti: J. Stuart Hunter, Iain M. Johnstone, Joseph B. Kadane, and Jozef L. Teugels

The Wiley Series in Probability and Statistics is well established and authoritative. It covers many topics of current research interest in both pure and applied statistics and probability theory. Written by leading statisticians and institutions, the titles span both state‐of‐the‐art developments in the field and classical methods.

Reflecting the wide range of current research in statistics, the series encompasses applied, methodological, and theoretical statistics, ranging from applications and new techniques made possible by advances in computerized practice to rigorous treatment of theoretical approaches. This series provides essential and invaluable reading for all statisticians, whether in academia, industry, government, or research.

A complete list of the titles in this series can be found at http://www.wiley.com/go/wsps

Matrix Differential Calculus with Applications in Statistics and Econometrics

Third Edition

Jan R. Magnus

Department of Econometrics and Operations Research Vrije Universiteit Amsterdam, The Netherlands

 

and

 

Heinz Neudecker†

Amsterdam School of Economics University of Amsterdam, The Netherlands

Copyright

This edition first published 2019

© 2019 John Wiley & Sons Ltd

Edition History

John Wiley & Sons (1e, 1988) and John Wiley & Sons (2e, 1999)

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Jan R. Magnus and Heinz Neudecker to be identified as the authors of this work has been asserted in accordance with law.

Registered Offices

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

Editorial Office

9600 Garsington Road, Oxford, OX4 2DQ, UK

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty

While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging‐in‐Publication Data applied for

ISBN: 9781119541202

Cover design by Wiley

Cover image: © phochi/Shutterstock

Preface

Preface to the first edition

There has been a long‐felt need for a book that gives a self‐contained and unified treatment of matrix differential calculus, specifically written for econometricians and statisticians. The present book is meant to satisfy this need. It can serve as a textbook for advanced undergraduates and postgraduates in econometrics and as a reference book for practicing econometricians. Mathematical statisticians and psychometricians may also find something to their liking in the book.

When used as a textbook, it can provide a full‐semester course. Reasonable proficiency in basic matrix theory is assumed, especially with the use of partitioned matrices. The basics of matrix algebra, as deemed necessary for a proper understanding of the main subject of the book, are summarized in Part One, the first of the book’s six parts. The book also contains the essentials of multivariable calculus but geared to and often phrased in terms of differentials.

The sequence in which the chapters are read is not of great consequence. It is fully conceivable that practitioners start with Part Three (Differentials: the practice) and, depending on their predilections, carry on to Parts Five or Six, which deal with applications. Those who want a full understanding of the underlying theory should read the whole book, although even then they could go through the necessary matrix algebra only when the specific need arises.

Matrix differential calculus as presented in this book is based on differentials, and this sets the book apart from other books in this area. The approach via differentials is, in our opinion, superior to any other existing approach. Our principal idea is that differentials are more congenial to multivariable functions as they crop up in econometrics, mathematical statistics, or psychometrics than derivatives, although from a theoretical point of view the two concepts are equivalent.

The book falls into six parts. Part One deals with matrix algebra. It lists, and also often proves, items like the Schur, Jordan, and singular‐value decompositions; concepts like the Hadamard and Kronecker products; the vec operator; the commutation and duplication matrices; and the Moore‐Penrose inverse. Results on bordered matrices (and their determinants) and (linearly restricted) quadratic forms are also presented here.

Part Two, which forms the theoretical heart of the book, is entirely devoted to a thorough treatment of the theory of differentials, and presents the essentials of calculus but geared to and phrased in terms of differentials. First and second differentials are defined, ‘identification’ rules for Jacobian and Hessian matrices are given, and chain rules derived. A separate chapter on the theory of (constrained) optimization in terms of differentials concludes this part.

Part Three is the practical core of the book. It contains the rules for working with differentials, lists the differentials of important scalar, vector, and matrix functions (inter alia eigenvalues, eigenvectors, and the Moore‐Penrose inverse) and supplies ‘identification’ tables for Jacobian and Hessian matrices.

Part Four, treating inequalities, owes its existence to our feeling that econometricians should be conversant with inequalities, such as the Cauchy‐Schwarz and Minkowski inequalities (and extensions thereof), and that they should also master a powerful result like Poincaré’s separation theorem. This part is to some extent also the case history of a disappointment. When we started writing this book we had the ambition to derive all inequalities by means of matrix differential calculus. After all, every inequality can be rephrased as the solution of an optimization problem. This proved to be an illusion, due to the fact that the Hessian matrix in most cases is singular at the optimum point.

Part Five is entirely devoted to applications of matrix differential calculus to the linear regression model. There is an exhaustive treatment of estimation problems related to the fixed part of the model under various assumptions concerning ranks and (other) constraints. Moreover, it contains topics relating to the stochastic part of the model, viz. estimation of the error variance and prediction of the error term. There is also a small section on sensitivity analysis. An introductory chapter deals with the necessary statistical preliminaries.

Part Six deals with maximum likelihood estimation, which is of course an ideal source for demonstrating the power of the propagated techniques. In the first of three chapters, several models are analysed, inter alia the multivariate normal distribution, the errors‐in‐variables model, and the nonlinear regression model. There is a discussion on how to deal with symmetry and positive definiteness, and special attention is given to the information matrix. The second chapter in this part deals with simultaneous equations under normality conditions. It investigates both identification and estimation problems, subject to various (non)linear constraints on the parameters. This part also discusses full‐information maximum likelihood (FIML) and limited‐information maximum likelihood (LIML), with special attention to the derivation of asymptotic variance matrices. The final chapter addresses itself to various psychometric problems, inter alia principal components, multimode component analysis, factor analysis, and canonical correlation.

All chapters contain many exercises. These are frequently meant to be complementary to the main text.

A large number of books and papers have been published on the theory and applications of matrix differential calculus. Without attempting to describe their relative virtues and particularities, the interested reader may wish to consult Dwyer and Macphail (1948), Bodewig (1959), Wilkinson (1965), Dwyer (1967), Neudecker (1967, 1969), Tracy and Dwyer (1969), Tracy and Singh (1972), McDonald and Swaminathan (1973), MacRae (1974), Balestra (1976), Bentler and Lee (1978), Henderson and Searle (1979), Wong and Wong (1979, 1980), Nel (1980), Rogers (1980), Wong (1980, 1985), Graham (1981), McCulloch (1982), Schönemann (1985), Magnus and Neudecker (1985), Pollock (1985), Don (1986), and Kollo (1991). The papers by Henderson and Searle (1979) and Nel (1980), and Rogers’ (1980) book contain extensive bibliographies.

The two authors share the responsibility for Parts One, Three, Five, and Six, although any new results in Part One are due to Magnus. Parts Two and Four are due to Magnus, although Neudecker contributed some results to Part Four. Magnus is also responsible for the writing and organization of the final text.

We wish to thank our colleagues F. J. H. Don, R. D. H. Heijmans, D. S. G. Pollock, and R. Ramer for their critical remarks and contributions. The greatest obligation is owed to Sue Kirkbride at the London School of Economics who patiently and cheerfully typed and retyped the various versions of the book. Partial financial support was provided by the Netherlands Organization for the Advancement of Pure Research (Z. W. O.) and the Suntory Toyota International Centre for Economics and Related Disciplines at the London School of Economics.

London/Amsterdam Jan R. Magnus

April 1987 Heinz Neudecker

Preface to the first revised printing

Since this book first appeared — now almost four years ago — many of our colleagues, students, and other readers have pointed out typographical errors and have made suggestions for improving the text. We are particularly grateful to R. D. H. Heijmans, J. F. Kiviet, I. J. Steyn, and G. Trenkler. We owe the greatest debt to F. Gerrish, formerly of the School of Mathematics in the Polytechnic, Kingston‐upon‐Thames, who read Chapters 1–11 with awesome precision and care and made numerous insightful suggestions and constructive remarks. We hope that this printing will continue to trigger comments from our readers.

London/Tilburg/Amsterdam Jan R. Magnus

February 1991 Heinz Neudecker

Preface to the second edition

A further seven years have passed since our first revision in 1991. We are happy to see that our book is still being used by colleagues and students. In this revision we attempted to reach three goals. First, we made a serious attempt to keep the book up‐to‐date by adding many recent references and new exercises. Second, we made numerous small changes throughout the text, improving the clarity of exposition. Finally, we corrected a number of typographical and other errors.

The structure of the book and its philosophy are unchanged. Apart from a large number of small changes, there are two major changes. First, we interchanged Sections 12 and 13 of Chapter 1, since complex numbers need to be discussed before eigenvalues and eigenvectors, and we corrected an error in Theorem 1.7. Second, in Chapter 17 on psychometrics, we rewrote Sections 8–10 relating to the Eckart‐Young theorem.

We are grateful to Karim Abadir, Paul Bekker, Hamparsum Bozdogan, Michael Browne, Frank Gerrish, Kaddour Hadri, Tõnu Kollo, Shuangzhe Liu, Daan Nel, Albert Satorra, Kazuo Shigemasu, Jos ten Berge, Peter ter Berg, Götz Trenkler, Haruo Yanai, and many others for their thoughtful and constructive comments. Of course, we welcome further comments from our readers.

Tilburg/Amsterdam Jan R. Magnus

March 1998 Heinz Neudecker

Preface to the third edition

Twenty years have passed since the appearance of the second edition and thirty years since the book first appeared. This is a long time, but the book still lives. Unfortunately, my coauthor Heinz Neudecker does not; he died in December 2017. Heinz was my teacher at the University of Amsterdam and I was fortunate to learn the subject of matrix calculus through differentials (then in its infancy) from his lectures and personal guidance. This technique is still a remarkably powerful tool, and Heinz Neudecker must be regarded as its founding father.

The original text of the book was written on a typewriter and then handed over to the publisher for typesetting and printing. When it came to the second edition, the typeset material could no longer be found, which is why the second edition had to be produced in an ad hoc manner that was not satisfactory. Many people complained about this, to me and to the publisher, and the publisher offered to produce a new, freshly typeset edition, which would look good. In the meantime, my Russian colleagues had proposed to translate the book into Russian, and I realized that this would only be feasible if they had a good English LaTeX text. So, my secretary Josette Janssen at Tilburg University and I produced a LaTeX text with expert advice from Jozef Pijnenburg. In the process of retyping the manuscript, many small changes were made to improve the readability and consistency of the text, but the structure of the book was not changed. The English LaTeX version was then used as the basis for the Russian edition,

Matrichnoe Differenzial’noe Ischislenie s Prilozhenijami k Statistike i Ekonometrike,

translated by my friends Anatoly Peresetsky and Pavel Katyshev, and published by Fizmatlit Publishing House, Moscow, 2002. The current third edition is based on this English LaTeX version, although I have taken the opportunity to make many improvements to the presentation of the material.

Of course, this was not the only reason for producing a third edition. It was time to take a fresh look at the material and to update the references. I felt it was appropriate to stay close to the original text, because this is the book that Heinz and I conceived and the current text is a new edition, not a new book. The main changes relative to the second edition are as follows:

Some subjects were treated insufficiently (some of my friends would say ‘incorrectly’) and I have attempted to repair these omissions. This applies in particular to the discussion on matrix functions (Section 1.21), complex differentiation (Section 5.17), and Jacobians of transformations (Section 9.17).

The text on differentiating eigenvalues and eigenvectors and associated continuity issues has been rewritten, see Sections 8.7–8.11.

Chapter 10 has been completely rewritten, because I am now convinced that it is not useful to define Hessian matrices for vector or matrix functions. So I now define Hessian matrices only for scalar functions and for individual components of vector functions and individual elements of matrix functions. This makes life much easier.

I have added two additional sections at the end of Chapter 17 on psychometrics, relating to correspondence analysis and linear discriminant analysis.

Chapter 18 is new. It can be read without consulting the other chapters and provides a summary of the whole book. It can therefore be used as an introduction to matrix calculus for advanced undergraduates or Master’s and PhD students in economics, statistics, mathematics, and engineering who want to know how to apply matrix calculus without going into all the theoretical details.

In addition, many small changes have been made, references have been updated, and exercises have been added. Over the past 30 years, I received many queries, problems, and requests from readers, about once every 2 weeks, which amounts to about 750 queries in 30 years. I responded to all of them and a number of these problems appear in the current text as exercises.

I am grateful to Don Andrews, Manuel Arellano, Richard Baillie, Luc Bauwens, Andrew Chesher, Gerda Claeskens, Russell Davidson, Jean‐Marie Dufour, Ronald Gallant, Eric Ghysels, Bruce Hansen, Grant Hillier, Cheng Hsiao, Guido Imbens, Guido Kuersteiner, Offer Lieberman, Esfandiar Maasoumi, Whitney Newey, Kazuhiro Ohtani, Enrique Sentana, Cezary Sielużycki, Richard Smith, Götz Trenkler, and Farshid Vahid for general encouragement and specific suggestions; to Henk Pijls for answering my questions on complex differentiation and Michel van de Velden for help on psychometric issues; to Jan Brinkhuis, Chris Muris, Franco Peracchi, Andrey Vasnev, Wendun Wang, and Yuan Yue on commenting on the new Chapter 18; to Ang Li for exceptional research assistance in updating the literature; and to Ilka van de Werve for expertly redrawing the figures. No blame attaches to any of these people in case there are remaining errors, ambiguities, or omissions; these are entirely my own responsibility, especially since I have not always followed their advice.

Cross‐References. The numbering of theorems, propositions, corollaries, figures, tables, assumptions, examples, and definitions is with two digits, so that Theorem 3.5 refers to Theorem 5 in Chapter 3. Sections are numbered 1, 2,… within each chapter but always referenced with two digits so that Section 5 in Chapter 3 is referred to as Section 3.5. Equations are numbered (1), (2),… within each chapter, and referred to with one digit if it refers to the same chapter; if it refers to another chapter we write, for example, see Equation (16) in Chapter 5. Exercises are numbered 1, 2,… after a section.

Notation. Special symbols are used to denote the derivative (matrix) D and the Hessian (matrix) H. The differential operator is denoted by d. The third edition follows the notation of earlier editions with the following exceptions. First, the symbol for the vector (1, 1, …, 1)′ has been altered from a calligraphic s to ι (dotless i); second, the symbol for the imaginary root has been changed from italic i to the more common upright i; third, v(A), the vector indicating the essentially distinct components of a symmetric matrix A, has been replaced by vech(A); fourth, the symbols for expectation, variance, and covariance (previously the script letters ℰ, 𝒱, and 𝒞) have been replaced by E, var, and cov, respectively; and fifth, we now denote the normal distribution by N (previously 𝒩). A list of all symbols is presented in the Index of Symbols at the end of the book.

Brackets are used sparingly. We write tr A instead of tr(A), while tr AB denotes tr(AB), not (tr A)B. Similarly, vec AB means vec(AB) and dXY means d(XY). In general, we only place brackets when there is a possibility of ambiguity.

I worked on the third edition between April and November 2018. I hope the book will continue to be useful for a few more years, and of course I welcome comments from my readers.

Amsterdam/Wapserveen Jan R. Magnus

November 2018

Part One: Matrices

Chapter 1: Basic properties of vectors and matrices

1 INTRODUCTION

In this chapter, we summarize some of the well‐known definitions and theorems of matrix algebra. Most of the theorems will be proved.

2 SETS

A set is a collection of objects, called the elements (or members) of the set. We write x ∈ S to mean ‘x is an element of S’ or ‘x belongs to S’. If x does not belong to S, we write x ∉ S. The set that contains no elements is called the empty set, denoted by ∅.

Sometimes a set can be defined by displaying the elements in braces. For example, A = {0, 1} or

ℕ = {1, 2, 3, …}.

Notice that A is a finite set (contains a finite number of elements), whereas ℕ is an infinite set. If P is a property that any element of S has or does not have, then

{x : x ∈ S, x has property P}

denotes the set of all the elements of S that have property P.

A set A is called a subset of B, written A ⊂ B, whenever every element of A also belongs to B. The notation A ⊂ B does not rule out the possibility that A = B. If A ⊂ B and A ≠ B, then we say that A is a proper subset of B.

If A and B are two subsets of S, we define

A ∪ B,

the union of A and B, as the set of elements of S that belong to A or to B or to both, and

A ∩ B,

the intersection of A and B, as the set of elements of S that belong to both A and B. We say that A and B are (mutually) disjoint if they have no common elements, that is, if

A ∩ B = ∅.

The complement of A relative to B, denoted by B − A, is the set {x : x ∈ B, but x ∉ A}. The complement of A (relative to S) is sometimes denoted by Aᶜ.

The Cartesian product of two sets A and B, written A × B, is the set of all ordered pairs (a, b) such that a ∈ A and b ∈ B. More generally, the Cartesian product of n sets A1, A2, …, An, written

A1 × A2 × ⋯ × An,

is the set of all ordered n‐tuples (a1, a2, …, an) such that ai ∈ Ai (i = 1, …, n).

The set of (finite) real numbers (the one‐dimensional Euclidean space) is denoted by ℝ. The n‐dimensional Euclidean space ℝⁿ is the Cartesian product of n sets equal to ℝ:

ℝⁿ = ℝ × ℝ × ⋯ × ℝ (n times).

The elements of ℝⁿ are thus the ordered n‐tuples (x1, x2, …, xn) of real numbers x1, x2, …, xn.

A set S of real numbers is said to be bounded if there exists a number M such that |x| ≤ M for all x ∈ S.
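For readers who like to experiment, the following minimal Python sketch mirrors the set operations just defined (the particular sets are arbitrary illustrations, not taken from the text):

    # Illustrative sketch: union, intersection, relative complement,
    # and Cartesian product, using Python's built-in sets.
    from itertools import product

    S = {0, 1, 2, 3, 4}
    A = {0, 1}
    B = {1, 2, 3}

    print(A | B)               # union A ∪ B -> {0, 1, 2, 3}
    print(A & B)               # intersection A ∩ B -> {1}
    print(A & {4} == set())    # A and {4} are disjoint
    print(B - A)               # complement of A relative to B -> {2, 3}
    print(S - A)               # complement of A relative to S (Aᶜ)
    print(set(product(A, B)))  # Cartesian product A × B as ordered pairs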

3 MATRICES: ADDITION AND MULTIPLICATION

A real m × n matrix A is a rectangular array of real numbers

    a11  a12  …  a1n
    a21  a22  …  a2n
    ⋮    ⋮       ⋮
    am1  am2  …  amn

We sometimes write A = (aij). If one or more of the elements of A is complex, we say that A is a complex matrix. Almost all matrices in this book are real, and the word 'matrix' should be understood to mean a real matrix, unless explicitly stated otherwise.

An m × n matrix can be regarded as a point in ℝᵐˣⁿ. The real numbers aij are called the elements of A. An m × 1 matrix is a point in ℝᵐˣ¹ (that is, in ℝᵐ) and is called a (column) vector of order m × 1. A 1 × n matrix is called a row vector (of order 1 × n). The elements of a vector are usually called its components. Matrices are always denoted by capital letters and vectors by lower‐case letters.

The sum of two matrices A and B of the same order is defined as

A + B = (aij) + (bij) = (aij + bij).

The product of a matrix by a scalar λ is

λA = Aλ = (λaij).

The following properties are now easily proved for matrices A, B, and C of the same order and scalars λ and μ:

A + B = B + A,
(A + B) + C = A + (B + C),
(λ + μ)A = λA + μA,
λ(A + B) = λA + λB,
λ(μA) = (λμ)A.

A matrix whose elements are all zero is called a null matrix and denoted by 0. We have, of course,

A + 0 = A.

If A is an m × n matrix and B an n × p matrix (so that A has the same number of columns as B has rows), then we define the product of A and B as

AB = (∑ⱼ aijbjk).

Thus, AB is an m × p matrix and its ikth element is ∑ⱼ aijbjk. The following properties of the matrix product can be established:

(AB)C = A(BC),
A(B + C) = AB + AC,
(A + B)C = AC + BC.

These relations hold provided the matrix products exist.

We note that the existence of AB does not imply the existence of BA, and even when both products exist, they are not generally equal. (Two matrices A and B for which

AB = BA

are said to commute.) We therefore distinguish between premultiplication and postmultiplication: a given m × n matrix A can be premultiplied by a p × m matrix B to form the product BA; it can also be postmultiplied by an n × q matrix C to form AC.
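As a quick numerical illustration of these rules, here is a minimal NumPy sketch (the matrices are arbitrary examples):

    # Illustrative sketch: matrix addition, scalar multiplication,
    # and the fact that AB and BA are generally different.
    import numpy as np

    A = np.array([[1.0, 2.0], [3.0, 4.0]])
    B = np.array([[0.0, 1.0], [1.0, 0.0]])

    print(A + B)                      # elementwise sum (same order required)
    print(2.5 * A)                    # product of a matrix by a scalar
    print(A @ B)                      # the product AB
    print(B @ A)                      # the product BA
    print(np.allclose(A @ B, B @ A))  # False: A and B do not commute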

4 THE TRANSPOSE OF A MATRIX

The transpose of an m × n matrix A = (aij) is the n × m matrix, denoted by A′, whose ijth element is aji.

We have

(A′)′ = A,        (1)
(A + B)′ = A′ + B′,        (2)
(AB)′ = B′A′.        (3)

If x is an n × 1 vector, then x′ is a 1 × n row vector and

x′x = ∑ᵢ xᵢ².

The (Euclidean) norm of x is defined as

‖x‖ = (x′x)^(1/2).        (4)
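A short NumPy check of rules (1)-(3) and of the norm (4), with arbitrary example matrices:

    # Illustrative sketch: transpose rules and the Euclidean norm.
    import numpy as np

    A = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]])    # 2 x 3
    B = np.array([[1.0, 1.0], [0.0, 2.0], [1.0, 0.0]])  # 3 x 2
    x = np.array([3.0, 4.0])

    print(np.allclose((A + A).T, A.T + A.T))  # (A + B)' = A' + B', with B = A
    print(np.allclose((A @ B).T, B.T @ A.T))  # (AB)' = B'A'
    print(np.sqrt(x @ x))                     # ||x|| = (x'x)^(1/2) -> 5.0
    print(np.linalg.norm(x))                  # the same value via NumPy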

5 SQUARE MATRICES

A matrix is said to be square if it has as many rows as it has columns. A square matrix A = (aij), real or complex, is said to be

lower triangular if aij = 0 (i < j),
strictly lower triangular if aij = 0 (i ≤ j),
unit lower triangular if aij = 0 (i < j) and aii = 1 (all i),
upper triangular if aij = 0 (i > j),
strictly upper triangular if aij = 0 (i ≥ j),
unit upper triangular if aij = 0 (i > j) and aii = 1 (all i),
idempotent if A² = A.

A square matrix A is triangular if it is either lower triangular or upper triangular (or both).

A real square matrix A = (aij) is said to be

symmetric if A′ = A,
skew‐symmetric if A′ = −A.

For any square n × n matrix A = (aij), we define dg A or dg(A) as the diagonal matrix

    a11  0    …  0
    0    a22  …  0
    ⋮    ⋮        ⋮
    0    0    …  ann

or, alternatively,

dg A = diag(a11, a22, …, ann).

If A = dg A, we say that A is diagonal. A particular diagonal matrix is the identity matrix (of order n × n),

In = (δij),

where δij = 1 if i = j and δij = 0 if i ≠ j (δij is called the Kronecker delta). We sometimes write I instead of In when the order is obvious or irrelevant. We have

IA = AI = A,

if A and I have the same order.

A real square matrix A is said to be orthogonal if

AA′ = A′A = I,

and its columns are said to be orthonormal. A rectangular (not square) matrix can still have the property that AA′ = I or A′A = I, but not both. Such a matrix is called semi‐orthogonal.

Note carefully that the concepts of symmetry, skew‐symmetry, and orthogonality are defined only for real square matrices. Hence, a complex matrix Z satisfying Z′ = Z is not called symmetric (in spite of what some textbooks do). This is important because complex matrices can be Hermitian, skew‐Hermitian, or unitary, and there are many important results about these classes of matrices. These results should specialize to matrices that are symmetric, skew‐symmetric, or orthogonal in the special case that the matrices are real. Thus, a symmetric matrix is just a real Hermitian matrix, a skew‐symmetric matrix is a real skew‐Hermitian matrix, and an orthogonal matrix is a real unitary matrix; see also Section 1.12.
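These definitions translate into simple numerical tests; the following minimal NumPy sketch uses a plane rotation as an arbitrary example of an orthogonal matrix:

    # Illustrative sketch: testing symmetry and orthogonality numerically.
    import numpy as np

    def is_symmetric(A):
        return np.allclose(A, A.T)            # A' = A

    def is_orthogonal(A):
        I = np.eye(A.shape[0])
        return np.allclose(A @ A.T, I) and np.allclose(A.T @ A, I)

    t = np.pi / 6
    Q = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])   # rotation: orthogonal, not symmetric
    print(is_orthogonal(Q))                             # True
    print(is_symmetric(np.array([[1.0, 2.0],
                                 [2.0, 3.0]])))         # True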

6 LINEAR FORMS AND QUADRATIC FORMS

Let a be an n × 1 vector, A an n × n matrix, and B an n × m matrix. The expression a′x is called a linear form in x, the expression x′Ax is a quadratic form in x, and the expression x′By a bilinear form in x and y. In quadratic forms we may, without loss of generality, assume that A is symmetric, because if not then we can replace A by (A + A′)/2, since

x′Ax = ½x′(A + A′)x.

Thus, let A be a symmetric matrix. We say that A is

positive definite if x′Ax > 0 for all x ≠ 0,
positive semidefinite if x′Ax ≥ 0 for all x,
negative definite if x′Ax < 0 for all x ≠ 0,
negative semidefinite if x′Ax ≤ 0 for all x,
indefinite if x′Ax > 0 for some x and x′Ax < 0 for some x.

It is clear that the matrices BB′ and B′B are positive semidefinite, and that A is negative (semi)definite if and only if −A is positive (semi)definite. A square null matrix is both positive and negative semidefinite.

If A is positive semidefinite, then there are many matrices B satisfying

B² = A.

But there is only one positive semidefinite matrix B satisfying B² = A. This matrix is called the square root of A, denoted by A^(1/2).
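One standard way to compute this square root, sketched below in NumPy, uses the eigendecomposition A = SΛS′ with S orthogonal (anticipating Section 13) and sets A^(1/2) = SΛ^(1/2)S′; the particular matrix A is an arbitrary positive definite example:

    # Illustrative sketch: the unique positive semidefinite square root,
    # constructed from the eigendecomposition (an assumption of this
    # illustration, not a method stated in this section).
    import numpy as np

    A = np.array([[2.0, 1.0], [1.0, 2.0]])   # symmetric positive definite
    lam, S = np.linalg.eigh(A)               # A = S diag(lam) S'
    A_half = S @ np.diag(np.sqrt(lam)) @ S.T

    print(np.allclose(A_half @ A_half, A))   # B^2 = A
    print(np.allclose(A_half, A_half.T))     # B is symmetric positive semidefinite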

The following two theorems are often useful.

Theorem 1.1

Let A be an m × n matrix, B and C n × p matrices, and let x be an n × 1 vector. Then,

(a) Ax = 0 ⇔ A′Ax = 0,

(b) AB = 0 ⇔ A′AB = 0,

(c) A′AB = A′AC ⇔ AB = AC.

Proof

(a) Clearly Ax = 0 implies A′Ax = 0. Conversely, if A′Ax = 0, then (Ax)′(Ax) = x′A′Ax = 0 and hence Ax = 0. (b) follows from (a), and (c) follows from (b) by substituting B − C for B in (b).

Theorem 1.2

Let A be an m × n matrix, B and C n × n matrices, B symmetric. Then,

(a) Ax = 0 for all n × 1 vectors x if and only if A = 0,

(b) x′Bx = 0 for all n × 1 vectors x if and only if B = 0,

(c) x′Cx = 0 for all n × 1 vectors x if and only if C′ = −C.

Proof

The proof is easy and is left to the reader.

7 THE RANK OF A MATRIX

A set of vectors x1, …, xn is said to be linearly independent if ∑iαixi = 0 implies that all αi = 0. If x1, …, xn are not linearly independent, they are said to be linearly dependent.

Let A be an m × n matrix. The column rank of A is the maximum number of linearly independent columns it contains. The row rank of A is the maximum number of linearly independent rows it contains. It may be shown that the column rank of A is equal to its row rank. Hence, the concept of rank is unambiguous. We denote the rank of A by

r(A).

It is clear that

0 ≤ r(A) ≤ min(m, n).        (5)

If r(A) = m, we say that A has full row rank. If r(A) = n, we say that A has full column rank. If r(A) = 0, then A is the null matrix, and conversely, if A is the null matrix, then r(A) = 0.

We have the following important results concerning ranks:

r(A) = r(A′) = r(A′A) = r(AA′),        (6)
r(AB) ≤ min(r(A), r(B)),        (7)
r(AB) = r(A) if B is square and nonsingular,        (8)
r(A + B) ≤ r(A) + r(B),        (9)

and finally, if A is an m × n matrix and Ax = 0 for some x ≠ 0, then

r(A) ≤ n − 1.

The column space of A (m × n), denoted by ℳ(A), is the set of vectors

ℳ(A) = {y : y = Ax for some x ∈ ℝⁿ}.

Thus, ℳ(A) is the vector space generated by the columns of A. The dimension of this vector space is r(A). We have

ℳ(A) = ℳ(AA′)        (10)

for any matrix A.
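A minimal NumPy sketch checking some of these rank facts on an arbitrary example (NumPy's matrix_rank computes the rank numerically):

    # Illustrative sketch: rank, row rank = column rank, and r(A) = r(A'A).
    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0],   # = 2 x first row, so the rows are dependent
                  [1.0, 0.0, 1.0]])

    r = np.linalg.matrix_rank
    print(r(A))                # 2
    print(r(A) == r(A.T))      # row rank equals column rank
    print(r(A) == r(A.T @ A))  # r(A) = r(A'A), as in (6)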

Exercises

1. If A has full column rank and C has full row rank, then r(ABC) = r(B).

2. Let A be partitioned as A = (A1 : A2). Then r(A) = r(A1) if and only if ℳ(A2) ⊂ ℳ(A1).

8 THE INVERSE

Let A be a square matrix of order n × n. We say that A is nonsingular if r(A) = n, and that A is singular if r(A) < n.

If A is nonsingular, then there exists a nonsingular matrix B such that

AB = BA = In.

The matrix B, denoted by A⁻¹, is unique and is called the inverse of A. We have

(A⁻¹)′ = (A′)⁻¹,        (11)
(AB)⁻¹ = B⁻¹A⁻¹,        (12)

if the inverses exist.

A square matrix P is said to be a permutation matrix if each row and each column of P contain a single element equal to one, while the remaining elements are zero. An n × n permutation matrix thus contains n ones and n(n − 1) zeros. It can be proved that any permutation matrix is nonsingular. In fact, it is even true that P is orthogonal, that is,

P′P = PP′ = In,        (13)

for any permutation matrix P.
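A short NumPy check of the inverse and of (13), with an arbitrary nonsingular matrix and an arbitrary 3 × 3 permutation matrix:

    # Illustrative sketch: the inverse, and orthogonality of a permutation matrix.
    import numpy as np

    A = np.array([[2.0, 1.0], [1.0, 1.0]])    # nonsingular: r(A) = 2
    A_inv = np.linalg.inv(A)
    print(np.allclose(A @ A_inv, np.eye(2)))  # AB = BA = I with B = A^(-1)

    P = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [1.0, 0.0, 0.0]])           # one 1 in each row and column
    print(np.allclose(P.T @ P, np.eye(3)))    # P'P = PP' = I, so P' = P^(-1)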

9 THE DETERMINANT

Associated with any n × n matrix A is the determinant |A| defined by

|A| = ∑ (−1)^φ(j1, …, jn) a1j1 a2j2 ⋯ anjn,

where the summation is taken over all permutations (j1, …, jn) of the set of integers (1, …, n), and φ(j1, …, jn) is the number of transpositions required to change (1, …, n) into (j1, …, jn). (A transposition consists of interchanging two numbers. It can be shown that the number of transpositions required to transform (1, …, n) into (j1, …, jn) is always even or always odd, so that (−1)^φ(j1, …, jn) is unambiguously defined.)

We have

|A′| = |A|,        (14)
|AB| = |A| |B|,        (15)
|λA| = λⁿ|A| for a scalar λ,        (16)
|A⁻¹| = |A|⁻¹ if A is nonsingular,        (17)
|In| = 1.        (18)

A submatrix of A is the rectangular array obtained from A by deleting some of its rows and/or some of its columns. A minor is the determinant of a square submatrix of A. The minor of an element aij is the determinant of the submatrix of A obtained by deleting the ith row and jth column. The cofactor of aij, say cij, is (−1)i+j times the minor of aij. The matrix C = (cij) is called the cofactor matrix of A. The transpose of C is called the adjoint of A and will be denoted by A#.

We have

∑ⱼ aij cij = |A| (i = 1, …, n),        (19)
∑ⱼ aij ckj = 0 (k ≠ i),        (20)
AA# = A#A = |A|In.        (21)

For any square matrix A, a principal submatrix of A is obtained by deleting corresponding rows and columns. The determinant of a principal submatrix is called a principal minor.
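The cofactor construction can be carried out directly; the following minimal NumPy sketch (a brute-force illustration, not an efficient algorithm) builds the adjoint from the definitions above and checks (21) on an arbitrary example:

    # Illustrative sketch: cofactors, the adjoint A#, and AA# = |A| I.
    import numpy as np

    def adjoint(A):
        n = A.shape[0]
        C = np.empty_like(A)                  # cofactor matrix
        for i in range(n):
            for j in range(n):
                minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
                C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
        return C.T                            # adjoint = transpose of cofactors

    A = np.array([[1.0, 2.0, 0.0],
                  [0.0, 1.0, 1.0],
                  [1.0, 0.0, 1.0]])
    print(np.allclose(A @ adjoint(A), np.linalg.det(A) * np.eye(3)))  # True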

Exercises

1. If A is nonsingular, show that A# = |A|A⁻¹.

2. Prove that the determinant of a triangular matrix is the product of its diagonal elements.

10 THE TRACE

The trace of a square n × n matrix A, denoted by tr A or tr(A), is the sum of its diagonal elements:

tr A = ∑ᵢ aii.

We have

tr A′ = tr A,        (22)
tr(A + B) = tr A + tr B,        (23)
tr(λA) = λ tr A for a scalar λ,        (24)
tr AB = tr BA.        (25)

We note in (25) that AB and BA, though both square, need not be of the same order.

Corresponding to the vector (Euclidean) norm

‖x‖ = (x′x)^(1/2)

given in (4), we now define the matrix (Euclidean) norm as

‖A‖ = (tr A′A)^(1/2).        (26)

We have

tr A′A ≥ 0,        (27)

with equality if and only if A = 0.
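A quick NumPy check of (25) and (26) on arbitrary rectangular matrices (note that AB is 2 × 2 while BA is 3 × 3):

    # Illustrative sketch: tr AB = tr BA, and the matrix (Euclidean) norm.
    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(2, 3))
    B = rng.normal(size=(3, 2))

    print(np.isclose(np.trace(A @ B), np.trace(B @ A)))  # True, orders differ
    print(np.sqrt(np.trace(A.T @ A)))                    # ||A|| = (tr A'A)^(1/2)
    print(np.linalg.norm(A))                             # the same value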

11 PARTITIONED MATRICES

Let A be an m × n matrix. We can partition A as

A = (A11, A12; A21, A22),        (28)

where the blocks are listed row by row and rows are separated by semicolons,

where A11 is m1 × n1, A12 is m1 × n2, A21 is m2 × n1, A22 is m2 × n2, and m1 + m2 = m and n1 + n2 = n.

Let B (m × n) be similarly partitioned into submatrices Bij (i, j = 1, 2).

Then,

A + B = (A11 + B11, A12 + B12; A21 + B21, A22 + B22).

Now let C (n × p) be partitioned into submatrices Cij (i, j = 1, 2) such that C11 has n1 rows (and hence C12 also has n1 rows and C21 and C22 have n2 rows). Then we may postmultiply A by C yielding

AC = (A11C11 + A12C21, A11C12 + A12C22; A21C11 + A22C21, A21C12 + A22C22).

The transpose of the matrix A given in (28) is

A′ = (A11′, A21′; A12′, A22′).

If the off‐diagonal blocks A12 and A21 are both zero, and A11 and A22 are square and nonsingular, then A is also nonsingular and its inverse is

A⁻¹ = (A11⁻¹, 0; 0, A22⁻¹).

More generally, if A as given in (28) is nonsingular and D = A22 − A21A11⁻¹A12 is also nonsingular, then

A⁻¹ = (A11⁻¹ + A11⁻¹A12D⁻¹A21A11⁻¹, −A11⁻¹A12D⁻¹; −D⁻¹A21A11⁻¹, D⁻¹).        (29)

Alternatively, if A is nonsingular and E = A11 − A12A22⁻¹A21 is also nonsingular, then

A⁻¹ = (E⁻¹, −E⁻¹A12A22⁻¹; −A22⁻¹A21E⁻¹, A22⁻¹ + A22⁻¹A21E⁻¹A12A22⁻¹).        (30)

Of course, if both D and E are nonsingular, blocks in (29) and (30) can be interchanged. The results (29) and (30) can be easily extended to a 3 × 3 matrix partition. We only consider the following symmetric case where two of the off‐diagonal blocks are null matrices.
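A minimal NumPy sketch verifying (29) on an arbitrary well-conditioned example (the random blocks are illustrative; near-identity diagonal blocks keep A11 and D nonsingular):

    # Illustrative sketch: the partitioned inverse (29) via the
    # Schur complement D = A22 - A21 A11^(-1) A12.
    import numpy as np

    rng = np.random.default_rng(42)
    A11 = np.eye(2) + 0.1 * rng.normal(size=(2, 2))
    A12 = 0.1 * rng.normal(size=(2, 3))
    A21 = 0.1 * rng.normal(size=(3, 2))
    A22 = np.eye(3) + 0.1 * rng.normal(size=(3, 3))
    A = np.block([[A11, A12], [A21, A22]])

    A11_inv = np.linalg.inv(A11)
    D_inv = np.linalg.inv(A22 - A21 @ A11_inv @ A12)
    A_inv = np.block(
        [[A11_inv + A11_inv @ A12 @ D_inv @ A21 @ A11_inv, -A11_inv @ A12 @ D_inv],
         [-D_inv @ A21 @ A11_inv,                           D_inv]])

    print(np.allclose(A_inv, np.linalg.inv(A)))  # True: (29) reproduces A^(-1)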

Theorem 1.3

If the matrix