A brand new, fully updated edition of a popular classic on matrix differential calculus with applications in statistics and econometrics.

This exhaustive, self-contained book on matrix theory and matrix differential calculus provides a treatment of matrix calculus based on differentials and shows how easy it is to use this theory once you have mastered the technique. Jan Magnus, who, along with the late Heinz Neudecker, pioneered the theory, develops it further in this new edition and provides many examples along the way to support it.

Matrix calculus has become an essential tool for quantitative methods in a large number of applications, ranging from social and behavioral sciences to econometrics. It is still relevant and used today in a wide range of subjects such as the biosciences and psychology. Matrix Differential Calculus with Applications in Statistics and Econometrics, Third Edition contains all of the essentials of multivariable calculus with an emphasis on the use of differentials. It starts by presenting a concise, yet thorough overview of matrix algebra, then goes on to develop the theory of differentials. The rest of the text combines the theory and application of matrix differential calculus, providing the practitioner and researcher with both a quick review and a detailed reference.

* Fulfills the need for an updated and unified treatment of matrix differential calculus
* Contains many new examples and exercises based on questions asked of the author over the years
* Covers new developments in the field and features new applications
* Written by a leading expert and pioneer of the theory
* Part of the Wiley Series in Probability and Statistics

Matrix Differential Calculus with Applications in Statistics and Econometrics, Third Edition is an ideal text for graduate students and academics studying the subject, as well as for postgraduates and specialists working in the biosciences and psychology.
Cover
Preface
Part One: Matrices
Chapter 1: Basic properties of vectors and matrices
1 INTRODUCTION
2 SETS
3 MATRICES: ADDITION AND MULTIPLICATION
4 THE TRANSPOSE OF A MATRIX
5 SQUARE MATRICES
6 LINEAR FORMS AND QUADRATIC FORMS
7 THE RANK OF A MATRIX
8 THE INVERSE
9 THE DETERMINANT
10 THE TRACE
11 PARTITIONED MATRICES
12 COMPLEX MATRICES
13 EIGENVALUES AND EIGENVECTORS
14 SCHUR'S DECOMPOSITION THEOREM
15 THE JORDAN DECOMPOSITION
16 THE SINGULAR‐VALUE DECOMPOSITION
17 FURTHER RESULTS CONCERNING EIGENVALUES
18 POSITIVE (SEMI)DEFINITE MATRICES
19 THREE FURTHER RESULTS FOR POSITIVE DEFINITE MATRICES
20 A USEFUL RESULT
21 SYMMETRIC MATRIX FUNCTIONS
Chapter 2: Kronecker products, vec operator, and Moore‐Penrose inverse
1 INTRODUCTION
2 THE KRONECKER PRODUCT
3 EIGENVALUES OF A KRONECKER PRODUCT
4 THE VEC OPERATOR
5 THE MOORE‐PENROSE (MP) INVERSE
6 EXISTENCE AND UNIQUENESS OF THE MP INVERSE
7 SOME PROPERTIES OF THE MP INVERSE
8 FURTHER PROPERTIES
9 THE SOLUTION OF LINEAR EQUATION SYSTEMS
BIBLIOGRAPHICAL NOTES
Chapter 3: Miscellaneous matrix results
1 INTRODUCTION
2 THE ADJOINT MATRIX
3 PROOF OF THEOREM 3.1
4 BORDERED DETERMINANTS
5 THE MATRIX EQUATION AX = 0
6 THE HADAMARD PRODUCT
7 THE COMMUTATION MATRIX Kmn
8 THE DUPLICATION MATRIX Dn
9 RELATIONSHIP BETWEEN Dn+1 AND Dn, I
10 RELATIONSHIP BETWEEN Dn+1 AND Dn, II
11 CONDITIONS FOR A QUADRATIC FORM TO BE POSITIVE (NEGATIVE) SUBJECT TO LINEAR CONSTRAINTS
12 NECESSARY AND SUFFICIENT CONDITIONS FOR r(A : B) = r(A) + r(B)
13 THE BORDERED GRAMIAN MATRIX
14 THE EQUATIONS X1A + X2B′ = G1, X1B = G2
BIBLIOGRAPHICAL NOTES
Part Two: Differentials: the theory
Chapter 4: Mathematical preliminaries
1 INTRODUCTION
2 INTERIOR POINTS AND ACCUMULATION POINTS
3 OPEN AND CLOSED SETS
4 THE BOLZANO‐WEIERSTRASS THEOREM
5 FUNCTIONS
6 THE LIMIT OF A FUNCTION
7 CONTINUOUS FUNCTIONS AND COMPACTNESS
8 CONVEX SETS
9 CONVEX AND CONCAVE FUNCTIONS
Chapter 5: Differentials and differentiability
1 INTRODUCTION
2 CONTINUITY
3 DIFFERENTIABILITY AND LINEAR APPROXIMATION
4 THE DIFFERENTIAL OF A VECTOR FUNCTION
5 UNIQUENESS OF THE DIFFERENTIAL
6 CONTINUITY OF DIFFERENTIABLE FUNCTIONS
7 PARTIAL DERIVATIVES
8 THE FIRST IDENTIFICATION THEOREM
9 EXISTENCE OF THE DIFFERENTIAL, I
10 EXISTENCE OF THE DIFFERENTIAL, II
11 CONTINUOUS DIFFERENTIABILITY
12 THE CHAIN RULE
13 CAUCHY INVARIANCE
14 THE MEAN‐VALUE THEOREM FOR REAL‐VALUED FUNCTIONS
15 DIFFERENTIABLE MATRIX FUNCTIONS
16 SOME REMARKS ON NOTATION
17 COMPLEX DIFFERENTIATION
BIBLIOGRAPHICAL NOTES
Chapter 6: The second differential
1 INTRODUCTION
2 SECOND‐ORDER PARTIAL DERIVATIVES
3 THE HESSIAN MATRIX
4 TWICE DIFFERENTIABILITY AND SECOND‐ORDER APPROXIMATION, I
5 DEFINITION OF TWICE DIFFERENTIABILITY
6 THE SECOND DIFFERENTIAL
7 SYMMETRY OF THE HESSIAN MATRIX
8 THE SECOND IDENTIFICATION THEOREM
9 TWICE DIFFERENTIABILITY AND SECOND‐ORDER APPROXIMATION, II
10 CHAIN RULE FOR HESSIAN MATRICES
11 THE ANALOG FOR SECOND DIFFERENTIALS
12 TAYLOR'S THEOREM FOR REAL‐VALUED FUNCTIONS
13 HIGHER‐ORDER DIFFERENTIALS
14 REAL ANALYTIC FUNCTIONS
15 TWICE DIFFERENTIABLE MATRIX FUNCTIONS
BIBLIOGRAPHICAL NOTES
Chapter 7: Static optimization
1 INTRODUCTION
2 UNCONSTRAINED OPTIMIZATION
3 THE EXISTENCE OF ABSOLUTE EXTREMA
4 NECESSARY CONDITIONS FOR A LOCAL MINIMUM
5 SUFFICIENT CONDITIONS FOR A LOCAL MINIMUM: FIRST‐DERIVATIVE TEST
6 SUFFICIENT CONDITIONS FOR A LOCAL MINIMUM: SECOND‐DERIVATIVE TEST
7 CHARACTERIZATION OF DIFFERENTIABLE CONVEX FUNCTIONS
8 CHARACTERIZATION OF TWICE DIFFERENTIABLE CONVEX FUNCTIONS
9 SUFFICIENT CONDITIONS FOR AN ABSOLUTE MINIMUM
10 MONOTONIC TRANSFORMATIONS
11 OPTIMIZATION SUBJECT TO CONSTRAINTS
12 NECESSARY CONDITIONS FOR A LOCAL MINIMUM UNDER CONSTRAINTS
13 SUFFICIENT CONDITIONS FOR A LOCAL MINIMUM UNDER CONSTRAINTS
14 SUFFICIENT CONDITIONS FOR AN ABSOLUTE MINIMUM UNDER CONSTRAINTS
15 A NOTE ON CONSTRAINTS IN MATRIX FORM
16 ECONOMIC INTERPRETATION OF LAGRANGE MULTIPLIERS
APPENDIX: THE IMPLICIT FUNCTION THEOREM
Part Three: Differentials: the practice
Chapter 8: Some important differentials
1 INTRODUCTION
2 FUNDAMENTAL RULES OF DIFFERENTIAL CALCULUS
3 THE DIFFERENTIAL OF A DETERMINANT
4 THE DIFFERENTIAL OF AN INVERSE
5 DIFFERENTIAL OF THE MOORE‐PENROSE INVERSE
6 THE DIFFERENTIAL OF THE ADJOINT MATRIX
7 ON DIFFERENTIATING EIGENVALUES AND EIGENVECTORS
8 THE CONTINUITY OF EIGENPROJECTIONS
9 THE DIFFERENTIAL OF EIGENVALUES AND EIGENVECTORS: SYMMETRIC CASE
10 TWO ALTERNATIVE EXPRESSIONS FOR dλ
11 SECOND DIFFERENTIAL OF THE EIGENVALUE FUNCTION
Chapter 9: First‐order differentials and Jacobian matrices
1 INTRODUCTION
2 CLASSIFICATION
3 DERISATIVES
4 DERIVATIVES
5 IDENTIFICATION OF JACOBIAN MATRICES
6 THE FIRST IDENTIFICATION TABLE
7 PARTITIONING OF THE DERIVATIVE
8 SCALAR FUNCTIONS OF A SCALAR
9 SCALAR FUNCTIONS OF A VECTOR
10 SCALAR FUNCTIONS OF A MATRIX, I: TRACE
11 SCALAR FUNCTIONS OF A MATRIX, II: DETERMINANT
12 SCALAR FUNCTIONS OF A MATRIX, III: EIGENVALUE
13 TWO EXAMPLES OF VECTOR FUNCTIONS
14 MATRIX FUNCTIONS
15 KRONECKER PRODUCTS
16 SOME OTHER PROBLEMS
17 JACOBIANS OF TRANSFORMATIONS
Chapter 10: Second‐order differentials and Hessian matrices
1 INTRODUCTION
2 THE SECOND IDENTIFICATION TABLE
3 LINEAR AND QUADRATIC FORMS
4 A USEFUL THEOREM
5 THE DETERMINANT FUNCTION
6 THE EIGENVALUE FUNCTION
7 OTHER EXAMPLES
8 COMPOSITE FUNCTIONS
9 THE EIGENVECTOR FUNCTION
10 HESSIAN OF MATRIX FUNCTIONS, I
11 HESSIAN OF MATRIX FUNCTIONS, II
Part Four: Inequalities
Chapter 11: Inequalities
1 INTRODUCTION
2 THE CAUCHY‐SCHWARZ INEQUALITY
3 MATRIX ANALOGS OF THE CAUCHY‐SCHWARZ INEQUALITY
4 THE THEOREM OF THE ARITHMETIC AND GEOMETRIC MEANS
5 THE RAYLEIGH QUOTIENT
6 CONCAVITY OF λ1 AND CONVEXITY OF λn
7 VARIATIONAL DESCRIPTION OF EIGENVALUES
8 FISCHER'S MIN‐MAX THEOREM
9 MONOTONICITY OF THE EIGENVALUES
10 THE POINCARÉ SEPARATION THEOREM
11 TWO COROLLARIES OF POINCARÉ'S THEOREM
12 FURTHER CONSEQUENCES OF THE POINCARÉ THEOREM
13 MULTIPLICATIVE VERSION
14 THE MAXIMUM OF A BILINEAR FORM
15 HADAMARD'S INEQUALITY
16 AN INTERLUDE: KARAMATA'S INEQUALITY
17 KARAMATA'S INEQUALITY AND EIGENVALUES
18 AN INEQUALITY CONCERNING POSITIVE SEMIDEFINITE MATRICES
19 A REPRESENTATION THEOREM FOR (Σ xi^p)^{1/p}
20 A REPRESENTATION THEOREM FOR (tr A^p)^{1/p}
21 HÖLDER'S INEQUALITY
22 CONCAVITY OF log|A|
23 MINKOWSKI'S INEQUALITY
24 QUASILINEAR REPRESENTATION OF |A|^{1/n}
25 MINKOWSKI'S DETERMINANT THEOREM
26 WEIGHTED MEANS OF ORDER p
27 SCHLÖMILCH'S INEQUALITY
28 CURVATURE PROPERTIES OF Mp(x, a)
29 LEAST SQUARES
30 GENERALIZED LEAST SQUARES
31 RESTRICTED LEAST SQUARES
32 RESTRICTED LEAST SQUARES: MATRIX VERSION
Part Five: The linear model
Chapter 12: Statistical preliminaries
1 INTRODUCTION
2 THE CUMULATIVE DISTRIBUTION FUNCTION
3 THE JOINT DENSITY FUNCTION
4 EXPECTATIONS
5 VARIANCE AND COVARIANCE
6 INDEPENDENCE OF TWO RANDOM VARIABLES
7 INDEPENDENCE OF n RANDOM VARIABLES
8 SAMPLING
9 THE ONE‐DIMENSIONAL NORMAL DISTRIBUTION
10 THE MULTIVARIATE NORMAL DISTRIBUTION
11 ESTIMATION
Chapter 13: The linear regression model
1 INTRODUCTION
2 AFFINE MINIMUM‐TRACE UNBIASED ESTIMATION
3 THE GAUSS‐MARKOV THEOREM
4 THE METHOD OF LEAST SQUARES
5 AITKEN'S THEOREM
6 MULTICOLLINEARITY
7 ESTIMABLE FUNCTIONS
8 LINEAR CONSTRAINTS: THE CASE ℳ(R) ⊂ ℳ(X)
9 LINEAR CONSTRAINTS: THE GENERAL CASE
10 LINEAR CONSTRAINTS: THE CASE ℳ(R) ∩ ℳ(X) = {0}
11 A SINGULAR VARIANCE MATRIX: THE CASE ℳ(X) ⊂ ℳ(V)
12 A SINGULAR VARIANCE MATRIX: THE CASE r(X′V⁺X) = r(X)
13 A SINGULAR VARIANCE MATRIX: THE GENERAL CASE, I
14 EXPLICIT AND IMPLICIT LINEAR CONSTRAINTS
15 THE GENERAL LINEAR MODEL, I
16 A SINGULAR VARIANCE MATRIX: THE GENERAL CASE, II
17 THE GENERAL LINEAR MODEL, II
18 GENERALIZED LEAST SQUARES
19 RESTRICTED LEAST SQUARES
Chapter 14: Further topics in the linear model
1 INTRODUCTION
2 BEST QUADRATIC UNBIASED ESTIMATION OF σ²
3 THE BEST QUADRATIC AND POSITIVE UNBIASED ESTIMATOR OF σ²
4 THE BEST QUADRATIC UNBIASED ESTIMATOR OF σ²
5 BEST QUADRATIC INVARIANT ESTIMATION OF σ²
6 THE BEST QUADRATIC AND POSITIVE INVARIANT ESTIMATOR OF σ²
7 THE BEST QUADRATIC INVARIANT ESTIMATOR OF σ²
8 BEST QUADRATIC UNBIASED ESTIMATION: MULTIVARIATE NORMAL CASE
9 BOUNDS FOR THE BIAS OF THE LEAST‐SQUARES ESTIMATOR OF σ², I
10 BOUNDS FOR THE BIAS OF THE LEAST‐SQUARES ESTIMATOR OF σ², II
11 THE PREDICTION OF DISTURBANCES
12 BEST LINEAR UNBIASED PREDICTORS WITH SCALAR VARIANCE MATRIX
13 BEST LINEAR UNBIASED PREDICTORS WITH FIXED VARIANCE MATRIX, I
14 BEST LINEAR UNBIASED PREDICTORS WITH FIXED VARIANCE MATRIX, II
15 LOCAL SENSITIVITY OF THE POSTERIOR MEAN
16 LOCAL SENSITIVITY OF THE POSTERIOR PRECISION
Part Six: Applications to maximum likelihood estimation
Chapter 15: Maximum likelihood estimation
1 INTRODUCTION
2 THE METHOD OF MAXIMUM LIKELIHOOD (ML)
3 ML ESTIMATION OF THE MULTIVARIATE NORMAL DISTRIBUTION
4 SYMMETRY: IMPLICIT VERSUS EXPLICIT TREATMENT
5 THE TREATMENT OF POSITIVE DEFINITENESS
6 THE INFORMATION MATRIX
7 ML ESTIMATION OF THE MULTIVARIATE NORMAL DISTRIBUTION: DISTINCT MEANS
8 THE MULTIVARIATE LINEAR REGRESSION MODEL
9 THE ERRORS‐IN‐VARIABLES MODEL
10 THE NONLINEAR REGRESSION MODEL WITH NORMAL ERRORS
11 SPECIAL CASE: FUNCTIONAL INDEPENDENCE OF MEAN AND VARIANCE PARAMETERS
12 GENERALIZATION OF THEOREM 15.6
Chapter 16: Simultaneous equations
1 INTRODUCTION
2 THE SIMULTANEOUS EQUATIONS MODEL
3 THE IDENTIFICATION PROBLEM
4 IDENTIFICATION WITH LINEAR CONSTRAINTS ON B AND Γ ONLY
5 IDENTIFICATION WITH LINEAR CONSTRAINTS ON B, Γ, AND Σ
6 NONLINEAR CONSTRAINTS
7 FIML: THE INFORMATION MATRIX (GENERAL CASE)
8 FIML: ASYMPTOTIC VARIANCE MATRIX (SPECIAL CASE)
9 LIML: FIRST‐ORDER CONDITIONS
10 LIML: INFORMATION MATRIX
11 LIML: ASYMPTOTIC VARIANCE MATRIX
BIBLIOGRAPHICAL NOTES
Chapter 17: Topics in psychometrics
1 INTRODUCTION
2 POPULATION PRINCIPAL COMPONENTS
3 OPTIMALITY OF PRINCIPAL COMPONENTS
4 A RELATED RESULT
5 SAMPLE PRINCIPAL COMPONENTS
6 OPTIMALITY OF SAMPLE PRINCIPAL COMPONENTS
7 ONE‐MODE COMPONENT ANALYSIS
8 ONE‐MODE COMPONENT ANALYSIS AND SAMPLE PRINCIPAL COMPONENTS
9 TWO‐MODE COMPONENT ANALYSIS
10 MULTIMODE COMPONENT ANALYSIS
11 FACTOR ANALYSIS
12 A ZIGZAG ROUTINE
13 NEWTON‐RAPHSON ROUTINE
14 KAISER'S VARIMAX METHOD
15 CANONICAL CORRELATIONS AND VARIATES IN THE POPULATION
16 CORRESPONDENCE ANALYSIS
17 LINEAR DISCRIMINANT ANALYSIS
BIBLIOGRAPHICAL NOTES
Part Seven: Summary
Chapter 18: Matrix calculus: the essentials
1 INTRODUCTION
2 DIFFERENTIALS
3 VECTOR CALCULUS
4 OPTIMIZATION
5 LEAST SQUARES
6 MATRIX CALCULUS
7 INTERLUDE ON LINEAR AND QUADRATIC FORMS
8 THE SECOND DIFFERENTIAL
9 CHAIN RULE FOR SECOND DIFFERENTIALS
10 FOUR EXAMPLES
11 THE KRONECKER PRODUCT AND VEC OPERATOR
12 IDENTIFICATION
13 THE COMMUTATION MATRIX
14 FROM SECOND DIFFERENTIAL TO HESSIAN
15 SYMMETRY AND THE DUPLICATION MATRIX
16 MAXIMUM LIKELIHOOD
FURTHER READING
Bibliography
Index of symbols
Subject index
End User License Agreement
Chapter 9
Table 9.1 Classification of functions and variables
Table 9.2 The first identification table
Table 9.3 Differentials of linear and quadratic forms
Table 9.4 Differentials involving the trace
Table 9.5 Differentials involving the determinant
Table 9.6 Simple matrix differentials
Table 9.7 Matrix differentials involving powers
Chapter 10
Table 10.1 The second identification table
Chapter 4
Figure 4.1 Convex and nonconvex sets in ℝ²
Figure 4.2 A convex function
Chapter 5
Figure 5.1 Geometrical interpretation of the differential
Chapter 7
Figure 7.1 Unconstrained optimization in one variable
Chapter 8
Figure 8.1 The eigenvalue functions
Chapter 11
Figure 11.1 Diagram showing that A(ɛ) < 0
WILEY SERIES IN PROBABILITY AND STATISTICS
Established by Walter E. Shewhart and Samuel S. Wilks
Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Geof H. Givens, Harvey Goldstein, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, and Ruey S. Tsay
Editors Emeriti: J. Stuart Hunter, Iain M. Johnstone, Joseph B. Kadane, and Jozef L. Teugels
The Wiley Series in Probability and Statistics is well established and authoritative. It covers many topics of current research interest in both pure and applied statistics and probability theory. Written by leading statisticians and institutions, the titles span both state‐of‐the‐art developments in the field and classical methods.
Reflecting the wide range of current research in statistics, the series encompasses applied, methodological, and theoretical statistics, ranging from applications and new techniques made possible by advances in computerized practice to rigorous treatment of theoretical approaches. This series provides essential and invaluable reading for all statisticians, whether in academia, industry, government, or research.
A complete list of the titles in this series can be found at http://www.wiley.com/go/wsps
Matrix Differential Calculus with Applications in Statistics and Econometrics
Third Edition
Jan R. Magnus
Department of Econometrics and Operations Research, Vrije Universiteit Amsterdam, The Netherlands
and
Heinz Neudecker†
Amsterdam School of Economics, University of Amsterdam, The Netherlands
This edition first published 2019
© 2019 John Wiley & Sons Ltd
Edition History
John Wiley & Sons (1e, 1988) and John Wiley & Sons (2e, 1999)
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Jan R. Magnus and Heinz Neudecker to be identified as the authors of this work has been asserted in accordance with law.
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial Office
9600 Garsington Road, Oxford, OX4 2DQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging‐in‐Publication Data applied for
ISBN: 9781119541202
Cover design by Wiley
Cover image: © phochi/Shutterstock
Preface to the first edition
There has been a long‐felt need for a book that gives a self‐contained and unified treatment of matrix differential calculus, specifically written for econometricians and statisticians. The present book is meant to satisfy this need. It can serve as a textbook for advanced undergraduates and postgraduates in econometrics and as a reference book for practicing econometricians. Mathematical statisticians and psychometricians may also find something to their liking in the book.
When used as a textbook, it can provide a full‐semester course. Reasonable proficiency in basic matrix theory is assumed, especially with the use of partitioned matrices. The basics of matrix algebra, as deemed necessary for a proper understanding of the main subject of the book, are summarized in Part One, the first of the book’s six parts. The book also contains the essentials of multivariable calculus but geared to and often phrased in terms of differentials.
The order in which the chapters are read is not of great consequence. It is fully conceivable that practitioners start with Part Three (Differentials: the practice) and, depending on their predilections, carry on to Part Five or Six, which deal with applications. Those who want a full understanding of the underlying theory should read the whole book, although even then they could go through the necessary matrix algebra only when the specific need arises.
Matrix differential calculus as presented in this book is based on differentials, and this sets the book apart from other books in this area. The approach via differentials is, in our opinion, superior to any other existing approach. Our principal idea is that differentials are more congenial to multivariable functions as they crop up in econometrics, mathematical statistics, or psychometrics than derivatives, although from a theoretical point of view the two concepts are equivalent.
The book falls into six parts. Part One deals with matrix algebra. It lists, and also often proves, items like the Schur, Jordan, and singular‐value decompositions; concepts like the Hadamard and Kronecker products; the vec operator; the commutation and duplication matrices; and the Moore‐Penrose inverse. Results on bordered matrices (and their determinants) and (linearly restricted) quadratic forms are also presented here.
Part Two, which forms the theoretical heart of the book, is entirely devoted to a thorough treatment of the theory of differentials, and presents the essentials of calculus but geared to and phrased in terms of differentials. First and second differentials are defined, ‘identification’ rules for Jacobian and Hessian matrices are given, and chain rules derived. A separate chapter on the theory of (constrained) optimization in terms of differentials concludes this part.
Part Three is the practical core of the book. It contains the rules for working with differentials, lists the differentials of important scalar, vector, and matrix functions (inter alia eigenvalues, eigenvectors, and the Moore‐Penrose inverse) and supplies ‘identification’ tables for Jacobian and Hessian matrices.
Part Four, treating inequalities, owes its existence to our feeling that econometricians should be conversant with inequalities, such as the Cauchy‐Schwarz and Minkowski inequalities (and extensions thereof), and that they should also master a powerful result like Poincaré’s separation theorem. This part is to some extent also the case history of a disappointment. When we started writing this book we had the ambition to derive all inequalities by means of matrix differential calculus. After all, every inequality can be rephrased as the solution of an optimization problem. This proved to be an illusion, due to the fact that the Hessian matrix in most cases is singular at the optimum point.
Part Five is entirely devoted to applications of matrix differential calculus to the linear regression model. There is an exhaustive treatment of estimation problems related to the fixed part of the model under various assumptions concerning ranks and (other) constraints. Moreover, it contains topics relating to the stochastic part of the model, viz. estimation of the error variance and prediction of the error term. There is also a small section on sensitivity analysis. An introductory chapter deals with the necessary statistical preliminaries.
Part Six deals with maximum likelihood estimation, which is of course an ideal source for demonstrating the power of the propagated techniques. In the first of three chapters, several models are analysed, inter alia the multivariate normal distribution, the errors‐in‐variables model, and the nonlinear regression model. There is a discussion on how to deal with symmetry and positive definiteness, and special attention is given to the information matrix. The second chapter in this part deals with simultaneous equations under normality conditions. It investigates both identification and estimation problems, subject to various (non)linear constraints on the parameters. This part also discusses full‐information maximum likelihood (FIML) and limited‐information maximum likelihood (LIML), with special attention to the derivation of asymptotic variance matrices. The final chapter addresses itself to various psychometric problems, inter alia principal components, multimode component analysis, factor analysis, and canonical correlation.
All chapters contain many exercises. These are frequently meant to be complementary to the main text.
A large number of books and papers have been published on the theory and applications of matrix differential calculus. Without attempting to describe their relative virtues and particularities, we refer the interested reader to Dwyer and Macphail (1948), Bodewig (1959), Wilkinson (1965), Dwyer (1967), Neudecker (1967, 1969), Tracy and Dwyer (1969), Tracy and Singh (1972), McDonald and Swaminathan (1973), MacRae (1974), Balestra (1976), Bentler and Lee (1978), Henderson and Searle (1979), Wong and Wong (1979, 1980), Nel (1980), Rogers (1980), Wong (1980, 1985), Graham (1981), McCulloch (1982), Schönemann (1985), Magnus and Neudecker (1985), Pollock (1985), Don (1986), and Kollo (1991). The papers by Henderson and Searle (1979) and Nel (1980), and Rogers' (1980) book contain extensive bibliographies.
The two authors share the responsibility for Parts One, Three, Five, and Six, although any new results in Part One are due to Magnus. Parts Two and Four are due to Magnus, although Neudecker contributed some results to Part Four. Magnus is also responsible for the writing and organization of the final text.
We wish to thank our colleagues F. J. H. Don, R. D. H. Heijmans, D. S. G. Pollock, and R. Ramer for their critical remarks and contributions. The greatest obligation is owed to Sue Kirkbride at the London School of Economics who patiently and cheerfully typed and retyped the various versions of the book. Partial financial support was provided by the Netherlands Organization for the Advancement of Pure Research (Z. W. O.) and the Suntory Toyota International Centre for Economics and Related Disciplines at the London School of Economics.
London/Amsterdam Jan R. Magnus
April 1987 Heinz Neudecker
Preface to the first revised printing
Since this book first appeared — now almost four years ago — many of our colleagues, students, and other readers have pointed out typographical errors and have made suggestions for improving the text. We are particularly grateful to R. D. H. Heijmans, J. F. Kiviet, I. J. Steyn, and G. Trenkler. We owe the greatest debt to F. Gerrish, formerly of the School of Mathematics in the Polytechnic, Kingston‐upon‐Thames, who read Chapters 1–11 with awesome precision and care and made numerous insightful suggestions and constructive remarks. We hope that this printing will continue to trigger comments from our readers.
London/Tilburg/Amsterdam Jan R. Magnus
February 1991 Heinz Neudecker
Preface to the second edition
A further seven years have passed since our first revision in 1991. We are happy to see that our book is still being used by colleagues and students. In this revision we attempted to reach three goals. First, we made a serious attempt to keep the book up‐to‐date by adding many recent references and new exercises. Second, we made numerous small changes throughout the text, improving the clarity of exposition. Finally, we corrected a number of typographical and other errors.
The structure of the book and its philosophy are unchanged. Apart from a large number of small changes, there are two major changes. First, we interchanged Sections 12 and 13 of Chapter 1, since complex numbers need to be discussed before eigenvalues and eigenvectors, and we corrected an error in Theorem 1.7. Second, in Chapter 17 on psychometrics, we rewrote Sections 8–10 relating to the Eckart‐Young theorem.
We are grateful to Karim Abadir, Paul Bekker, Hamparsum Bozdogan, Michael Browne, Frank Gerrish, Kaddour Hadri, Tõnu Kollo, Shuangzhe Liu, Daan Nel, Albert Satorra, Kazuo Shigemasu, Jos ten Berge, Peter ter Berg, Götz Trenkler, Haruo Yanai, and many others for their thoughtful and constructive comments. Of course, we welcome further comments from our readers.
Tilburg/Amsterdam Jan R. Magnus
March 1998 Heinz Neudecker
Preface to the third edition
Twenty years have passed since the appearance of the second edition and thirty years since the book first appeared. This is a long time, but the book still lives. Unfortunately, my coauthor Heinz Neudecker does not; he died in December 2017. Heinz was my teacher at the University of Amsterdam and I was fortunate to learn the subject of matrix calculus through differentials (then in its infancy) from his lectures and personal guidance. This technique is still a remarkably powerful tool, and Heinz Neudecker must be regarded as its founding father.
The original text of the book was written on a typewriter and then handed over to the publisher for typesetting and printing. When it came to the second edition, the typeset material could no longer be found, which is why the second edition had to be produced in an ad hoc manner, which was not satisfactory. Many people complained about this, to me and to the publisher, and the publisher offered us the opportunity to produce a new edition, freshly typeset, which would look good. In the meantime, my Russian colleagues had proposed to translate the book into Russian, and I realized that this would only be feasible if they had a good English LaTeX text. So, my secretary Josette Janssen at Tilburg University and I produced a LaTeX text with expert advice from Jozef Pijnenburg. In the process of retyping the manuscript, many small changes were made to improve the readability and consistency of the text, but the structure of the book was not changed. The English LaTeX version was then used as the basis for the Russian edition,
Matrichnoe Differenzial’noe Ischislenie s Prilozhenijami k Statistike i Ekonometrike,
translated by my friends Anatoly Peresetsky and Pavel Katyshev, and published by Fizmatlit Publishing House, Moscow, 2002. The current third edition is based on this English LaTeX version, although I have taken the opportunity to make many improvements to the presentation of the material.
Of course, this was not the only reason for producing a third edition. It was time to take a fresh look at the material and to update the references. I felt it was appropriate to stay close to the original text, because this is the book that Heinz and I conceived and the current text is a new edition, not a new book. The main changes relative to the second edition are as follows:
Some subjects were treated insufficiently (some of my friends would say ‘incorrectly’) and I have attempted to repair these omissions. This applies in particular to the discussion on matrix functions (Section 1.21), complex differentiation (Section 5.17), and Jacobians of transformations (Section 9.17).
The text on differentiating eigenvalues and eigenvectors and associated continuity issues has been rewritten, see Sections 8.7–8.11.
Chapter 10 has been completely rewritten, because I am now convinced that it is not useful to define Hessian matrices for vector or matrix functions. So I now define Hessian matrices only for scalar functions and for individual components of vector functions and individual elements of matrix functions. This makes life much easier.
I have added two additional sections at the end of Chapter 17 on psychometrics, relating to correspondence analysis and linear discriminant analysis.
Chapter 18 is new. It can be read without consulting the other chapters and provides a summary of the whole book. It can therefore be used as an introduction to matrix calculus for advanced undergraduates or Master’s and PhD students in economics, statistics, mathematics, and engineering who want to know how to apply matrix calculus without going into all the theoretical details.
In addition, many small changes have been made, references have been updated, and exercises have been added. Over the past 30 years, I received many queries, problems, and requests from readers, about once every 2 weeks, which amounts to about 750 queries in 30 years. I responded to all of them and a number of these problems appear in the current text as exercises.
I am grateful to Don Andrews, Manuel Arellano, Richard Baillie, Luc Bauwens, Andrew Chesher, Gerda Claeskens, Russell Davidson, Jean‐Marie Dufour, Ronald Gallant, Eric Ghysels, Bruce Hansen, Grant Hillier, Cheng Hsiao, Guido Imbens, Guido Kuersteiner, Offer Lieberman, Esfandiar Maasoumi, Whitney Newey, Kazuhiro Ohtani, Enrique Sentana, Cezary Sielużycki, Richard Smith, Götz Trenkler, and Farshid Vahid for general encouragement and specific suggestions; to Henk Pijls for answering my questions on complex differentiation and Michel van de Velden for help on psychometric issues; to Jan Brinkhuis, Chris Muris, Franco Peracchi, Andrey Vasnev, Wendun Wang, and Yuan Yue on commenting on the new Chapter 18; to Ang Li for exceptional research assistance in updating the literature; and to Ilka van de Werve for expertly redrawing the figures. No blame attaches to any of these people in case there are remaining errors, ambiguities, or omissions; these are entirely my own responsibility, especially since I have not always followed their advice.
Cross‐References. The numbering of theorems, propositions, corollaries, figures, tables, assumptions, examples, and definitions is with two digits, so that Theorem 3.5 refers to Theorem 5 in Chapter 3. Sections are numbered 1, 2, … within each chapter but always referenced with two digits, so that Section 5 in Chapter 3 is referred to as Section 3.5. Equations are numbered (1), (2), … within each chapter, and are referred to by a single number when the reference is within the same chapter; when referring to another chapter we write, for example, 'see Equation (16) in Chapter 5'. Exercises are numbered 1, 2, … after a section.
Notation. Special symbols are used to denote the derivative (matrix) D and the Hessian (matrix) H. The differential operator is denoted by d. The third edition follows the notation of earlier editions with the following exceptions. First, the symbol for the vector (1, 1, …, 1)′ has been altered from a calligraphic s to ı (dotless i); second, the symbol for the imaginary unit has been replaced by the more common i; third, v(A), the vector indicating the essentially distinct components of a symmetric matrix A, has been replaced by vech(A); fourth, the symbols for expectation, variance, and covariance (previously ℰ, 𝒱, and 𝒞) have been replaced by E, var, and cov, respectively; and fifth, we now denote the normal distribution by N (previously 𝒩). A list of all symbols is presented in the Index of Symbols at the end of the book.
Brackets are used sparingly. We write tr A instead of tr(A), while tr AB denotes tr(AB), not (tr A)B. Similarly, vec AB means vec(AB) and dXY means d(XY). In general, we only place brackets when there is a possibility of ambiguity.
I worked on the third edition between April and November 2018. I hope the book will continue to be useful for a few more years, and of course I welcome comments from my readers.
Amsterdam/Wapserveen Jan R. Magnus
November 2018
In this chapter, we summarize some of the well‐known definitions and theorems of matrix algebra. Most of the theorems will be proved.
A set is a collection of objects, called the elements (or members) of the set. We write x ∈ S to mean ‘x is an element of S’ or ‘x belongs to S’. If x does not belong to S, we write x ∉ S. The set that contains no elements is called the empty set, denoted by ∅.
Sometimes a set can be defined by displaying the elements in braces. For example, A = {0, 1} or

ℕ = {1, 2, 3, …}.

Notice that A is a finite set (contains a finite number of elements), whereas ℕ is an infinite set. If P is a property that any element of S has or does not have, then

{x : x ∈ S, x has property P}

denotes the set of all the elements of S that have property P.
A set A is called a subset of B, written A ⊂ B, whenever every element of A also belongs to B. The notation A ⊂ B does not rule out the possibility that A = B. If A ⊂ B and A ≠ B, then we say that A is a proper subset of B.
If A and B are two subsets of S, we define

A ∪ B = {x : x ∈ A or x ∈ B},

the union of A and B, as the set of elements of S that belong to A or to B or to both, and

A ∩ B = {x : x ∈ A and x ∈ B},

the intersection of A and B, as the set of elements of S that belong to both A and B. We say that A and B are (mutually) disjoint if they have no common elements, that is, if

A ∩ B = ∅.
The complement of A relative to B, denoted by B − A, is the set {x : x ∈ B, but x ∉ A}. The complement of A (relative to S) is sometimes denoted by Aᶜ.
The Cartesian product of two sets A and B, written A × B, is the set of all ordered pairs (a, b) such that a ∈ A and b ∈ B. More generally, the Cartesian product of n sets A1, A2, …, An, written

A1 × A2 × ⋯ × An,

is the set of all ordered n‐tuples (a1, a2, …, an) such that ai ∈ Ai (i = 1, …, n).
The set of (finite) real numbers (the one‐dimensional Euclidean space) is denoted by ℝ. The n‐dimensional Euclidean space ℝⁿ is the Cartesian product of n sets equal to ℝ:

ℝⁿ = ℝ × ℝ × ⋯ × ℝ (n times).

The elements of ℝⁿ are thus the ordered n‐tuples (x1, x2, …, xn) of real numbers x1, x2, …, xn.
A set S of real numbers is said to be bounded if there exists a number M such that |x| ≤ M for all x ∈ S.
A real m × n matrix A is a rectangular array of real numbers

A = [ a11  a12  …  a1n ]
    [ a21  a22  …  a2n ]
    [  ⋮    ⋮        ⋮  ]
    [ am1  am2  …  amn ].
We sometimes write A = (aij). If one or more of the elements of A is complex, we say that A is a complex matrix. Almost all matrices in this book are real, and the word 'matrix' will mean a real matrix, unless explicitly stated otherwise.
An m × n matrix can be regarded as a point in ℝ^{m×n}. The real numbers aij are called the elements of A. An m × 1 matrix is a point in ℝ^{m×1} (that is, in ℝᵐ) and is called a (column) vector of order m × 1. A 1 × n matrix is called a row vector (of order 1 × n). The elements of a vector are usually called its components. Matrices are always denoted by capital letters and vectors by lower‐case letters.
The sum of two matrices A and B of the same order is defined as

A + B = (aij) + (bij) = (aij + bij).

The product of a matrix by a scalar λ is

λA = Aλ = (λaij).

The following properties are now easily proved for matrices A, B, and C of the same order and scalars λ and μ:

A + B = B + A,
(A + B) + C = A + (B + C),
(λ + μ)A = λA + μA,
λ(A + B) = λA + λB,
λ(μA) = (λμ)A.

A matrix whose elements are all zero is called a null matrix and denoted by 0. We have, of course,

A + 0 = A.
If A is an m × n matrix and B an n × p matrix (so that A has the same number of columns as B has rows), then we define the product of A and B as

AB = (∑j aij bjk).

Thus, AB is an m × p matrix and its ikth element is ∑j aij bjk, summing over j = 1, …, n. The following properties of the matrix product can be established:

(AB)C = A(BC),
A(B + C) = AB + AC,
(A + B)C = AC + BC.

These relations hold provided the matrix products exist.
We note that the existence of AB does not imply the existence of BA, and even when both products exist, they are not generally equal. (Two matrices A and B for which AB = BA are said to commute.) We therefore distinguish between premultiplication and postmultiplication: a given m × n matrix A can be premultiplied by a p × m matrix B to form the product BA; it can also be postmultiplied by an n × q matrix C to form AC.
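As a concrete numerical check of this warning, here is a small sketch in Python with NumPy (the matrices are arbitrary illustrative values, not from the text): both AB and BA exist, yet they differ.

```python
import numpy as np

# Two arbitrary 2 x 2 matrices; both AB and BA exist since A and B are square
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])

print(A @ B)                       # B premultiplied by A
print(B @ A)                       # B postmultiplied by A
print(np.allclose(A @ B, B @ A))   # False: A and B do not commute
```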
The transpose of an m × n matrix A = (aij) is the n × m matrix, denoted by A′, whose ijth element is aji.
We have

(A′)′ = A, (A + B)′ = A′ + B′, (AB)′ = B′A′.

If x is an n × 1 vector, then x′ is a 1 × n row vector and

x′x = ∑i xi².

The (Euclidean) norm of x is defined as

‖x‖ = (x′x)^{1/2}.        (4)
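A quick numerical sanity check of these transpose and norm rules, sketched in Python with NumPy (random arrays stand in for arbitrary A, B, and x):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
x = rng.standard_normal(4)

# (AB)' = B'A'
print(np.allclose((A @ B).T, B.T @ A.T))              # True

# x'x is the sum of squares, and the Euclidean norm is its square root
print(np.isclose(x @ x, np.sum(x**2)))                # True
print(np.isclose(np.linalg.norm(x), np.sqrt(x @ x)))  # True
```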
A matrix is said to be square if it has as many rows as it has columns. A square matrix A = (aij), real or complex, is said to be
lower triangular if aij = 0 (i < j),
strictly lower triangular if aij = 0 (i ≤ j),
unit lower triangular if aij = 0 (i < j) and aii = 1 (all i),
upper triangular if aij = 0 (i > j),
strictly upper triangular if aij = 0 (i ≥ j),
unit upper triangular if aij = 0 (i > j) and aii = 1 (all i),
idempotent if A² = A.
A square matrix A is triangular if it is either lower triangular or upper triangular (or both).
A real square matrix A = (aij) is said to be
symmetric if A′ = A,
skew‐symmetric if A′ = −A.
For any square n × n matrix A = (aij), we define dg A or dg(A) as the diagonal matrix obtained from A by setting all off‐diagonal elements equal to zero, or, alternatively,

dg A = diag(a11, a22, …, ann).

If A = dg A, we say that A is diagonal. A particular diagonal matrix is the identity matrix (of order n × n),

In = (δij),

where δij = 1 if i = j and δij = 0 if i ≠ j (δij is called the Kronecker delta). We sometimes write I instead of In when the order is obvious or irrelevant. We have

IA = AI = A,

if A and I have the same order.
A real square matrix A is said to be orthogonal if

AA′ = A′A = I,

and its columns are said to be orthonormal. A rectangular (not square) matrix can still have the property that AA′ = I or A′A = I, but not both. Such a matrix is called semi‐orthogonal.
Note carefully that the concepts of symmetry, skew‐symmetry, and orthogonality are defined only for real square matrices. Hence, a complex matrix Z satisfying Z′ = Z is not called symmetric (in spite of what some textbooks do). This is important because complex matrices can be Hermitian, skew‐Hermitian, or unitary, and there are many important results about these classes of matrices. These results should specialize to matrices that are symmetric, skew‐symmetric, or orthogonal in the special case that the matrices are real. Thus, a symmetric matrix is just a real Hermitian matrix, a skew‐symmetric matrix is a real skew‐Hermitian matrix, and an orthogonal matrix is a real unitary matrix; see also Section 1.12.
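The distinction between orthogonal and semi‐orthogonal matrices is easy to verify numerically. A minimal Python/NumPy sketch, using a rotation matrix as the orthogonal example and a 3 × 2 matrix with orthonormal columns as the semi‐orthogonal one (both examples are arbitrary choices for illustration):

```python
import numpy as np

# A 2 x 2 rotation matrix is orthogonal: A'A = AA' = I
t = 0.7
A = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
print(np.allclose(A.T @ A, np.eye(2)))   # True
print(np.allclose(A @ A.T, np.eye(2)))   # True

# A semi-orthogonal matrix: two orthonormal columns in R^3
S = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
print(np.allclose(S.T @ S, np.eye(2)))   # True:  S'S = I holds
print(np.allclose(S @ S.T, np.eye(3)))   # False: SS' = I fails
```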
Let a be an n × 1 vector, A an n × n matrix, and B an n × m matrix. The expression a′x is called a linear form in x, the expression x′Ax a quadratic form in x, and the expression x′By a bilinear form in x and y. In quadratic forms we may, without loss of generality, assume that A is symmetric, because if not then we can replace A by (A + A′)/2, since

x′Ax = x′((A + A′)/2)x.
Thus, let A be a symmetric matrix. We say that A is

positive definite if x′Ax > 0 for all x ≠ 0,
positive semidefinite if x′Ax ≥ 0 for all x,
negative definite if x′Ax < 0 for all x ≠ 0,
negative semidefinite if x′Ax ≤ 0 for all x,
indefinite if x′Ax > 0 for some x and x′Ax < 0 for some x.
It is clear that the matrices BB′ and B′B are positive semidefinite, and that A is negative (semi)definite if and only if −A is positive (semi)definite. A square null matrix is both positive and negative semidefinite.
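In practice one often classifies a symmetric matrix numerically through the signs of its eigenvalues (a concept taken up in Section 13 of this chapter). A minimal sketch in Python with NumPy; the tolerance and the test matrices are arbitrary choices for illustration:

```python
import numpy as np

def definiteness(A, tol=1e-12):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    lam = np.linalg.eigvalsh(A)          # real eigenvalues of a symmetric matrix
    if np.all(lam > tol):
        return "positive definite"
    if np.all(lam >= -tol):
        return "positive semidefinite"
    if np.all(lam < -tol):
        return "negative definite"
    if np.all(lam <= tol):
        return "negative semidefinite"
    return "indefinite"

B = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [1.0, 0.0]])
print(definiteness(B.T @ B))               # B'B is positive (semi)definite; here in fact definite
print(definiteness(np.diag([1.0, -1.0])))  # indefinite
```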
If A is positive semidefinite, then there are many matrices B satisfying

B² = A.

But there is only one positive semidefinite matrix B satisfying B² = A. This matrix is called the square root of A, denoted by A^{1/2}.
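The unique positive semidefinite square root can be computed from the spectral decomposition A = QΛQ′ by taking square roots of the eigenvalues (anticipating Section 13). A small Python/NumPy sketch; the test matrix is an arbitrary positive definite example:

```python
import numpy as np

def psd_sqrt(A):
    """The unique positive semidefinite square root of a PSD matrix A."""
    lam, Q = np.linalg.eigh(A)        # spectral decomposition A = Q diag(lam) Q'
    lam = np.clip(lam, 0.0, None)     # guard against tiny negative rounding errors
    return Q @ np.diag(np.sqrt(lam)) @ Q.T

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
B = psd_sqrt(A)
print(np.allclose(B @ B, A))                 # True: B^2 = A
print(np.all(np.linalg.eigvalsh(B) >= 0))    # True: B is positive semidefinite
```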
The following two theorems are often useful.
Theorem 1.1: Let A be an m × n matrix, B and C n × p matrices, and let x be an n × 1 vector. Then,

(a) Ax = 0 ⇔ A′Ax = 0,
(b) AB = 0 ⇔ A′AB = 0,
(c) A′AB = A′AC ⇔ AB = AC.
Proof. (a) Clearly Ax = 0 implies A′Ax = 0. Conversely, if A′Ax = 0, then (Ax)′(Ax) = x′A′Ax = 0 and hence Ax = 0. (b) follows from (a), and (c) follows from (b) by substituting B − C for B in (b).
Theorem 1.2: Let A be an m × n matrix, B and C n × n matrices, B symmetric. Then,

(a) Ax = 0 for all n × 1 vectors x if and only if A = 0,
(b) x′Bx = 0 for all n × 1 vectors x if and only if B = 0,
(c) x′Cx = 0 for all n × 1 vectors x if and only if C′ = −C.
The proof is easy and is left to the reader.
A set of vectors x1, …, xn is said to be linearly independent if ∑i αixi = 0 implies that all αi = 0. If x1, …, xn are not linearly independent, they are said to be linearly dependent.
Let A be an m × n matrix. The column rank of A is the maximum number of linearly independent columns it contains. The row rank of A is the maximum number of linearly independent rows it contains. It may be shown that the column rank of A is equal to its row rank. Hence, the concept of rank is unambiguous. We denote the rank of A by

r(A).

It is clear that

r(A) ≤ min(m, n).
If r(A) = m, we say that A has full row rank. If r(A) = n, we say that A has full column rank. If r(A) = 0, then A is the null matrix, and conversely, if A is the null matrix, then r(A) = 0.
We have the following important results concerning ranks:

r(A) = r(A′) = r(A′A) = r(AA′),
r(AB) ≤ min(r(A), r(B)),
r(A + B) ≤ r(A) + r(B),

and finally, if A is an m × n matrix and Ax = 0 for some x ≠ 0, then

r(A) ≤ n − 1.
The column space of A (m × n), denoted by ℳ(A), is the set of vectors

ℳ(A) = {y : y = Ax for some x ∈ ℝⁿ}.

Thus, ℳ(A) is the vector space generated by the columns of A. The dimension of this vector space is r(A). We have

ℳ(A) = ℳ(AA′)
for any matrix A.
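These rank identities are easy to confirm numerically. A brief Python/NumPy sketch, where a random product of known rank serves as the example:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))  # rank 3 (almost surely)

r = np.linalg.matrix_rank
print(r(A), r(A.T), r(A.T @ A), r(A @ A.T))   # 3 3 3 3: all four ranks coincide

B = rng.standard_normal((4, 2))
print(r(A @ B) <= min(r(A), r(B)))            # True: r(AB) <= min(r(A), r(B))
```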
1. If A has full column rank and C has full row rank, then r(ABC) = r(B).
2. Let A be partitioned as A = (A1 : A2). Then, r(A) = r(A1) if and only if ℳ(A2) ⊂ ℳ(A1).
Let A be a square matrix of order n × n. We say that A is nonsingular if r(A) = n, and that A is singular if r(A) < n.
If A is nonsingular, then there exists a nonsingular matrix B such that

AB = BA = In.

The matrix B, denoted by A⁻¹, is unique and is called the inverse of A. We have

(A′)⁻¹ = (A⁻¹)′, (AB)⁻¹ = B⁻¹A⁻¹,
if the inverses exist.
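A short numerical confirmation of these inverse rules in Python with NumPy (random square matrices are nonsingular with probability one, which the sketch relies on):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))   # nonsingular with probability one
B = rng.standard_normal((3, 3))
inv = np.linalg.inv

print(np.allclose(inv(A @ B), inv(B) @ inv(A)))   # (AB)^{-1} = B^{-1} A^{-1}
print(np.allclose(inv(A.T), inv(A).T))            # (A')^{-1} = (A^{-1})'
```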
A square matrix P is said to be a permutation matrix if each row and each column of P contain a single element one, and the remaining elements are zero. An n × n permutation matrix thus contains n ones and n(n − 1) zeros. It can be proved that any permutation matrix is nonsingular. In fact, it is even true that P is orthogonal, that is,

P′P = PP′ = I

for any permutation matrix P.
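Permutation matrices and their orthogonality can be checked directly; a minimal Python/NumPy sketch builds P by reordering the rows of the identity matrix (the particular permutation is an arbitrary example):

```python
import numpy as np

perm = [2, 0, 3, 1]          # an arbitrary permutation of (0, 1, 2, 3)
P = np.eye(4)[perm]          # reorder the rows of I_4 to get a permutation matrix

print(np.allclose(P.T @ P, np.eye(4)))       # P'P = I
print(np.allclose(P @ P.T, np.eye(4)))       # PP' = I
print(np.allclose(np.linalg.inv(P), P.T))    # hence P^{-1} = P'
```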
Associated with any n × n matrix A is the determinant |A| defined by

|A| = ∑ (−1)^{φ(j1,…,jn)} a1j1 a2j2 ⋯ anjn,

where the summation is taken over all permutations (j1, …, jn) of the set of integers (1, …, n), and φ(j1, …, jn) is the number of transpositions required to change (1, …, n) into (j1, …, jn). (A transposition consists of interchanging two numbers. It can be shown that the number of transpositions required to transform (1, …, n) into (j1, …, jn) is always even or always odd, so that (−1)^{φ(j1,…,jn)} is unambiguously defined.)
We have

|A′| = |A|, |AB| = |A||B|, |λA| = λⁿ|A|.
A submatrix of A is the rectangular array obtained from A by deleting some of its rows and/or some of its columns. A minor is the determinant of a square submatrix of A. The minor of an element aij is the determinant of the submatrix of A obtained by deleting the ith row and jth column. The cofactor of aij, say cij, is (−1)i+j times the minor of aij. The matrix C = (cij) is called the cofactor matrix of A. The transpose of C is called the adjoint of A and will be denoted by A#.
We have

AA# = A#A = |A| In.
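The cofactor construction of the adjoint and the identity AA# = A#A = |A|I can be verified directly. A small Python/NumPy sketch; the helper `adjoint` and the 3 × 3 test matrix are illustrative, not from the text:

```python
import numpy as np

def adjoint(A):
    """Adjoint A#: the transpose of the cofactor matrix of A."""
    n = A.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# A A# = A# A = |A| I
print(np.allclose(A @ adjoint(A), np.linalg.det(A) * np.eye(3)))  # True
print(np.allclose(adjoint(A) @ A, np.linalg.det(A) * np.eye(3)))  # True
```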
For any square matrix A, a principal submatrix of A is obtained by deleting corresponding rows and columns. The determinant of a principal submatrix is called a principal minor.
1. If A is nonsingular, show that A# = |A| A⁻¹.
2. Prove that the determinant of a triangular matrix is the product of its diagonal elements.
The trace of a square n × n matrix A, denoted by tr A or tr(A), is the sum of its diagonal elements:

tr A = ∑i aii.

We have

tr(A + B) = tr A + tr B, tr(λA) = λ tr A, tr A′ = tr A,

and

tr AB = tr BA.        (25)
We note in (25) that AB and BA, though both square, need not be of the same order.
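The point is easily illustrated numerically: with A of order 3 × 5 and B of order 5 × 3, AB is 3 × 3 and BA is 5 × 5, yet their traces agree. A Python/NumPy sketch with arbitrary random matrices:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 3))

print(np.isclose(np.trace(A @ B), np.trace(B @ A)))   # True: tr AB = tr BA
print(np.isclose(np.trace(A @ B), np.sum(A * B.T)))   # True: tr AB = sum_ij a_ij b_ji
```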
Corresponding to the vector (Euclidean) norm

‖x‖ = (x′x)^{1/2}

given in (4), we now define the matrix (Euclidean) norm as

‖A‖ = (tr A′A)^{1/2}.

We have

tr A′A ≥ 0,
with equality if and only if A = 0.
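This matrix norm coincides with what numerical libraries call the Frobenius norm, which is NumPy's default matrix norm; a quick sketch (the random A is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 3))

norm_via_trace = np.sqrt(np.trace(A.T @ A))   # ||A|| = (tr A'A)^{1/2}
print(np.isclose(norm_via_trace, np.linalg.norm(A)))      # True: NumPy's default matrix norm
print(np.isclose(norm_via_trace, np.sqrt(np.sum(A**2))))  # True: root sum of squared elements
```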
Let A be an m × n matrix. We can partition A as

A = [ A11  A12 ]
    [ A21  A22 ],        (28)
where A11 is m1 × n1, A12 is m1 × n2, A21 is m2 × n1, A22 is m2 × n2, and m1 + m2 = m and n1 + n2 = n.
Let B (m × n) be similarly partitioned into submatrices Bij (i, j = 1, 2).
Then,

A + B = [ A11 + B11   A12 + B12 ]
        [ A21 + B21   A22 + B22 ].

Now let C (n × p) be partitioned into submatrices Cij (i, j = 1, 2) such that C11 has n1 rows (and hence C12 also has n1 rows and C21 and C22 have n2 rows). Then we may postmultiply A by C yielding

AC = [ A11C11 + A12C21   A11C12 + A12C22 ]
     [ A21C11 + A22C21   A21C12 + A22C22 ].
The transpose of the matrix A given in (28) is

A′ = [ A11′  A21′ ]
     [ A12′  A22′ ].
If the off‐diagonal blocks A12 and A21 are both zero, and A11 and A22 are square and nonsingular, then A is also nonsingular and its inverse is

A⁻¹ = [ A11⁻¹    0    ]
      [   0    A22⁻¹ ].
More generally, if A as given in (28) is nonsingular and D = A22 − A21A11⁻¹A12 is also nonsingular, then

A⁻¹ = [ A11⁻¹ + A11⁻¹A12D⁻¹A21A11⁻¹   −A11⁻¹A12D⁻¹ ]
      [ −D⁻¹A21A11⁻¹                   D⁻¹          ].        (29)

Alternatively, if A is nonsingular and E = A11 − A12A22⁻¹A21 is also nonsingular, then

A⁻¹ = [ E⁻¹              −E⁻¹A12A22⁻¹                  ]
      [ −A22⁻¹A21E⁻¹     A22⁻¹ + A22⁻¹A21E⁻¹A12A22⁻¹ ].        (30)
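A numerical check of the block formula (29), sketched in Python with NumPy; the diagonal shifts are an arbitrary device to keep A11, A22, and the Schur complement D comfortably nonsingular:

```python
import numpy as np

rng = np.random.default_rng(5)
A11 = rng.standard_normal((2, 2)) + 3 * np.eye(2)
A12 = rng.standard_normal((2, 3))
A21 = rng.standard_normal((3, 2))
A22 = rng.standard_normal((3, 3)) + 3 * np.eye(3)
A = np.block([[A11, A12], [A21, A22]])

inv = np.linalg.inv
D = A22 - A21 @ inv(A11) @ A12   # the "Schur complement" of A11 in A
A_inv = np.block([
    [inv(A11) + inv(A11) @ A12 @ inv(D) @ A21 @ inv(A11), -inv(A11) @ A12 @ inv(D)],
    [-inv(D) @ A21 @ inv(A11),                             inv(D)],
])
print(np.allclose(A_inv, inv(A)))   # True: the block formula matches the direct inverse
```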
Of course, if both D and E are nonsingular, the blocks in (29) and (30) can be interchanged. The results (29) and (30) can easily be extended to a 3 × 3 matrix partition. We only consider the following symmetric case where two of the off‐diagonal blocks are null matrices.
If the matrix