An up-to-date version of the complete, self-contained introduction to matrix analysis theory and practice
Providing accessible and in-depth coverage of the most common matrix methods now used in statistical applications, Matrix Analysis for Statistics, Third Edition features an easy-to-follow theorem/proof format. With smooth transitions between topics, the author carefully justifies each step of these methods, including eigenvalues and eigenvectors, the Moore-Penrose inverse, matrix differentiation, and the distribution of quadratic forms.
An ideal introduction to matrix analysis theory and practice, Matrix Analysis for Statistics, Third Edition features:
• New chapter or section coverage on inequalities, oblique projections, and antieigenvalues and antieigenvectors
• Additional problems and practice exercises at the end of each chapter
• Extensive examples that are familiar and easy to understand
• Self-contained chapters for flexibility in topic choice
• Applications of matrix methods in least squares regression and the analyses of mean vectors and covariance matrices
Matrix Analysis for Statistics, Third Edition is an ideal textbook for upper-undergraduate and graduate-level courses on matrix methods, multivariate analysis, and linear models. The book is also an excellent reference for research professionals in applied statistics.
James R. Schott, PhD, is Professor in the Department of Statistics at the University of Central Florida. He has published numerous journal articles in the area of multivariate analysis. Dr. Schott’s research interests include multivariate analysis, analysis of covariance and correlation matrices, and dimensionality reduction techniques.
Page count: 712
Year of publication: 2016
COVER
TITLE PAGE
COPYRIGHT
DEDICATION
PREFACE
Preface to the Second Edition
Preface to the Third Edition
ABOUT THE COMPANION WEBSITE
CHAPTER 1: A REVIEW OF ELEMENTARY MATRIX ALGEBRA
1.1 Introduction
1.2 Definitions and Notation
1.3 Matrix Addition and Multiplication
1.4 The Transpose
1.5 The Trace
1.6 The Determinant
1.7 The Inverse
1.8 Partitioned Matrices
1.9 The Rank of a Matrix
1.10 Orthogonal Matrices
1.11 Quadratic Forms
1.12 Complex Matrices
1.13 Random Vectors and Some Related Statistical Concepts
Problems
CHAPTER 2: VECTOR SPACES
2.1 Introduction
2.2 Definitions
2.3 Linear Independence and Dependence
2.4 Matrix Rank and Linear Independence
2.5 Bases and Dimension
2.6 Orthonormal Bases and Projections
2.7 Projection Matrices
2.8 Linear Transformations and Systems of Linear Equations
2.9 The Intersection and Sum of Vector Spaces
2.10 Oblique Projections
2.11 Convex Sets
Problems
CHAPTER 3: EIGENVALUES AND EIGENVECTORS
3.1 Introduction
3.2 Eigenvalues, Eigenvectors, and Eigenspaces
3.3 Some Basic Properties of Eigenvalues and Eigenvectors
3.4 Symmetric Matrices
3.5 Continuity of Eigenvalues and Eigenprojections
3.6 Extremal Properties of Eigenvalues
3.7 Additional Results Concerning Eigenvalues of Symmetric Matrices
3.8 Nonnegative Definite Matrices
3.9 Antieigenvalues and Antieigenvectors
Problems
CHAPTER 4: MATRIX FACTORIZATIONS AND MATRIX NORMS
4.1 Introduction
4.2 The Singular Value Decomposition
4.3 The Spectral Decomposition of a Symmetric Matrix
4.4 The Diagonalization of a Square Matrix
4.5 The Jordan Decomposition
4.6 The Schur Decomposition
4.7 The Simultaneous Diagonalization of Two Symmetric Matrices
4.8 Matrix Norms
Problems
CHAPTER 5: GENERALIZED INVERSES
5.1 Introduction
5.2 The Moore–Penrose Generalized Inverse
5.3 Some Basic Properties of the Moore–Penrose Inverse
5.4 The Moore–Penrose Inverse of a Matrix Product
5.5 The Moore–Penrose Inverse of Partitioned Matrices
5.6 The Moore–Penrose Inverse of a Sum
5.7 The Continuity of the Moore–Penrose Inverse
5.8 Some Other Generalized Inverses
5.9 Computing Generalized Inverses
Problems
CHAPTER 6: SYSTEMS OF LINEAR EQUATIONS
6.1 Introduction
6.2 Consistency of a System of Equations
6.3 Solutions to a Consistent System of Equations
6.4 Homogeneous Systems of Equations
6.5 Least Squares Solutions to a System of Linear Equations
6.6 Least Squares Estimation for Less Than Full Rank Models
6.7 Systems of Linear Equations and the Singular Value Decomposition
6.8 Sparse Linear Systems of Equations
Problems
CHAPTER 7: PARTITIONED MATRICES
7.1 Introduction
7.2 The Inverse
7.3 The Determinant
7.4 Rank
7.5 Generalized Inverses
7.6 Eigenvalues
Problems
CHAPTER 8: SPECIAL MATRICES AND MATRIX OPERATIONS
8.1 Introduction
8.2 The Kronecker Product
8.3 The Direct Sum
8.4 The Vec Operator
8.5 The Hadamard Product
8.6 The Commutation Matrix
8.7 Some Other Matrices Associated With the Vec Operator
8.8 Nonnegative Matrices
8.9 Circulant and Toeplitz Matrices
8.10 Hadamard and Vandermonde Matrices
Problems
CHAPTER 9: MATRIX DERIVATIVES AND RELATED TOPICS
9.1 Introduction
9.2 Multivariable Differential Calculus
9.3 Vector and Matrix Functions
9.4 Some Useful Matrix Derivatives
9.5 Derivatives of Functions of Patterned Matrices
9.6 The Perturbation Method
9.7 Maxima and Minima
9.8 Convex and Concave Functions
9.9 The Method of Lagrange Multipliers
Problems
CHAPTER 10: INEQUALITIES
10.1 Introduction
10.2 Majorization
10.3 Cauchy-Schwarz Inequalities
10.4 Hölder's Inequality
10.5 Minkowski's Inequality
10.6 The Arithmetic-Geometric Mean Inequality
Problems
CHAPTER 11: SOME SPECIAL TOPICS RELATED TO QUADRATIC FORMS
11.1 Introduction
11.2 Some Results on Idempotent Matrices
11.3 Cochran's Theorem
11.4 Distribution of Quadratic Forms in Normal Variates
11.5 Independence of Quadratic Forms
11.6 Expected Values of Quadratic Forms
11.7 The Wishart Distribution
Problems
REFERENCES
INDEX
WILEY SERIES IN PROBABILITY AND STATISTICS
End User License Agreement
CHAPTER 2: VECTOR SPACES
Figure 2.1 The angle between x and y
Figure 2.2 Projection of x onto a one-dimensional subspace
Figure 2.3 Projection of x onto a two-dimensional subspace
Figure 2.4 Projection of x onto a along b
CHAPTER 9: MATRIX DERIVATIVES AND RELATED TOPICS
Figure 9.1 A convex function of a scalar variable x
WILEY SERIES IN PROBABILITY AND STATISTICS
Established by WALTER A. SHEWHART and SAMUEL S. WILKS
Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Geof H. Givens, Harvey Goldstein, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay, Sanford Weisberg
Editors Emeriti: J. Stuart Hunter, Iain M. Johnstone, Joseph B. Kadane, Jozef L. Teugels
A complete list of the titles in this series appears at the end of this volume.
Third Edition
JAMES R. SCHOTT
Copyright © 2017 by John Wiley & Sons, Inc. All rights reserved
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
Names: Schott, James R., 1955- author.
Title: Matrix analysis for statistics / James R. Schott.
Description: Third edition. | Hoboken, New Jersey : John Wiley & Sons, 2016. | Includes bibliographical references and index.
Identifiers: LCCN 2016000005| ISBN 9781119092483 (cloth) | ISBN 9781119092469 (epub)
Subjects: LCSH: Matrices. | Mathematical statistics.
Classification: LCC QA188 .S24 2016 | DDC 512.9/434-dc23 LC record available at http://lccn.loc.gov/2016000005
Cover image courtesy of GettyImages/Alexmumu.
To Susan, Adam, and Sarah
As the field of statistics has developed over the years, the role of matrix methods has evolved from a tool through which statistical problems could be more conveniently expressed to an absolutely essential part in the development, understanding, and use of the more complicated statistical analyses that have appeared in recent years. As such, a background in matrix analysis has become a vital part of a graduate education in statistics. Too often, the statistics graduate student gets his or her matrix background in bits and pieces through various courses on topics such as regression analysis, multivariate analysis, linear models, stochastic processes, and so on. An alternative to this fragmented approach is an entire course devoted to matrix methods useful in statistics. This text has been written with such a course in mind. It also could be used as a text for an advanced undergraduate course with an unusually bright group of students and should prove to be useful as a reference for both applied and research statisticians.
Students beginning in a graduate program in statistics often have their previous degrees in other fields, such as mathematics, and so initially their statistical backgrounds may not be all that extensive. With this in mind, I have tried to make the statistical topics presented as examples in this text as self-contained as possible. This has been accomplished by including a section in the first chapter which covers some basic statistical concepts and by having most of the statistical examples deal with applications which are fairly simple to understand; for instance, many of these examples involve least squares regression or applications that utilize the simple concepts of mean vectors and covariance matrices. Thus, an introductory statistics course should provide the reader of this text with a sufficient background in statistics. An additional prerequisite is an undergraduate course in matrices or linear algebra, while a calculus background is necessary for some portions of the book, most notably, Chapter 8.
By selectively omitting some sections, all nine chapters of this book can be covered in a one-semester course. For instance, in a course targeted at students who end their educational careers with the masters degree, I typically omit Sections 2.10, 3.5, 3.7, 4.8, 5.4-5.7, and 8.6, along with a few other sections.
Anyone writing a book on a subject for which other texts have already been written stands to benefit from these earlier works, and that certainly has been the case here. The texts by Basilevsky (1983), Graybill (1983), Healy (1986), and Searle (1982), all books on matrices for statistics, have helped me, in varying degrees, to formulate my ideas on matrices. Graybill's book has been particularly influential, since this is the book that I referred to extensively, first as a graduate student and then in the early stages of my research career. Other texts which have proven to be quite helpful are Horn and Johnson (1985, 1991), Magnus and Neudecker (1988), particularly in the writing of Chapter 8, and Magnus (1988).
I wish to thank several anonymous reviewers who offered many very helpful suggestions, and Mark Johnson for his support and encouragement throughout this project. I am also grateful to the numerous students who have alerted me to various mistakes and typos in earlier versions of this book. In spite of their help and my diligent efforts at proofreading, undoubtedly some mistakes remain, and I would appreciate being informed of any that are spotted.
Jim Schott
Orlando, Florida
The most notable change in the second edition is the addition of a chapter on results regarding matrices partitioned into a 2×2 form. This new chapter, which is Chapter 7, has the material on the determinant and inverse that was previously given as a section in Chapter 7 of the first edition. Along with the results on the determinant and inverse of a partitioned matrix, I have added new material in this chapter on the rank, generalized inverses, and eigenvalues of partitioned matrices.
The coverage of eigenvalues in Chapter 3 has also been expanded. Some additional results such as Weyl's Theorem have been included, and in so doing, the last section of Chapter 3 of the first edition has now been replaced by two sections.
Other smaller additions, including both theorems and examples, have been made elsewhere throughout the book. Over 100 new exercises have been added to the problems sets.
The writing of a second edition of this book has also given me the opportunity to correct mistakes in the first edition. I would like to thank those readers who have pointed out some of these errors, as well as those who have offered suggestions for improving the text.
Jim Schott
Orlando, Florida
September 2004
The third edition of this text maintains the same organization that was present in the previous editions. The major changes involve the addition of new material, including the following.
1. A new chapter, now Chapter 10, on inequalities has been added. Numerous inequalities, such as Cauchy-Schwarz, Hadamard, and Jensen's, already appear in the earlier editions, but there are many important ones that are missing, and some of these are given in the new chapter. Highlighting this chapter is a fairly substantial section on majorization and some of the inequalities that can be developed from this concept.
2. A new section on oblique projections has been added to Chapter 2. The previous editions only covered orthogonal projections.
3. A new section on antieigenvalues and antieigenvectors has been added to Chapter 3.
Numerous other smaller additions have been made throughout the text. These include some additional theorems, the proofs of some results that previously had been given without proof, and some more examples involving statistical applications. Finally, more than 70 new problems have been added to the end-of-chapter problem sets.
Jim Schott
Orlando, Florida
December 2015
This book is accompanied by a companion website:
www.wiley.com/go/Schott/MatrixAnalysis3e
The instructor's website includes:
A solutions manual with solutions to selected problems
The student's website includes:
A solutions manual with odd-numbered solutions to selected problems
In this chapter, we review some of the basic operations and fundamental properties involved in matrix algebra. In most cases, properties will be stated without proof, but in some cases, when instructive, proofs will be presented. We end the chapter with a brief discussion of random variables and random vectors, expected values of random variables, and some important distributions encountered elsewhere in the book.
Except when stated otherwise, a scalar such as $\alpha$ will represent a real number. A matrix $A$ of size $m \times n$ is the $m \times n$ rectangular array of scalars given by
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix},$$
and sometimes it is simply identified as $A = (a_{ij})$. Sometimes it also will be convenient to refer to the $(i, j)$th element of $A$ as $(A)_{ij}$; that is, $(A)_{ij} = a_{ij}$. If $m = n$, then $A$ is called a square matrix of order $m$, whereas $A$ is referred to as a rectangular matrix when $m \ne n$. An $m \times 1$ matrix
$$a = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix}$$
is called a column vector or simply a vector. The element $a_i$ is referred to as the $i$th component of $a$. A $1 \times n$ matrix is called a row vector. The $i$th row and $j$th column of the matrix $A$ will be denoted by and , respectively. We will usually use capital letters to represent matrices and lowercase bold letters for vectors.
The diagonal elements of the m × m matrix A are . If all other elements of A are equal to 0, A is called a diagonal matrix and can be identified as . If, in addition, for so that , then the matrix A is called the identity matrix of order m and will be written as or simply if the order is obvious. If and b is a scalar, then we will use to denote the diagonal matrix . For any m × m matrix A, will denote the diagonal matrix with diagonal elements equal to those of A, and for any m × 1 vector a, denotes the diagonal matrix with diagonal elements equal to the components of a; that is, and .
A triangular matrix is a square matrix that is either an upper triangular matrix or a lower triangular matrix. An upper triangular matrix is one that has all of its elements below the diagonal equal to 0, whereas a lower triangular matrix has all of its elements above the diagonal equal to 0. A strictly upper triangular matrix is an upper triangular matrix that has each of its diagonal elements equal to 0. A strictly lower triangular matrix is defined similarly.
The ith column of the m × m identity matrix will be denoted by ei; that is, ei is the m × 1 vector that has its ith component equal to 1 and all of its other components equal to 0. When the value of m is not obvious, we will make it more explicit by writing ei as . The m × m matrix whose only nonzero element is a 1 in the th position will be identified as .
The scalar zero is written 0, whereas a vector of zeros, called a null vector, will be denoted by 0 , and a matrix of zeros, called a null matrix, will be denoted by . The m × 1 vector having each component equal to 1 will be denoted by or simply 1 when the size of the vector is obvious.
The sum of two matrices $A$ and $B$ is defined if they have the same number of rows and the same number of columns; in this case,
$$A + B = (a_{ij} + b_{ij}).$$
The product of a scalar $\alpha$ and a matrix $A$ is
$$\alpha A = (\alpha a_{ij}).$$
The premultiplication of the matrix $B$ by the matrix $A$ is defined only if the number of columns of $A$ equals the number of rows of $B$. Thus, if $A$ is $m \times p$ and $B$ is $p \times n$, then $C = AB$ will be the $m \times n$ matrix which has its $(i, j)$th element, $c_{ij}$, given by
$$c_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj}.$$
A similar definition exists for $BA$, the postmultiplication of $B$ by $A$, if the number of columns of $B$ equals the number of rows of $A$. When both products are defined, we will not have, in general, $AB = BA$. If the matrix $A$ is square, then the product $AA$, or simply $A^2$, is defined. In this case, if we have $A^2 = A$, then $A$ is said to be an idempotent matrix.
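As a quick numerical illustration (not part of the original text), the following NumPy sketch checks the element formula for a matrix product, shows that AB and BA need not agree, and verifies that a familiar projection-type matrix is idempotent; the matrices are invented for the example.

import numpy as np

# Element formula for a product: (AB)_ij = sum_k a_ik * b_kj
A = np.array([[1.0, 2.0], [3.0, -1.0]])
B = np.array([[2.0, 0.0], [1.0, 5.0]])
AB = A @ B
print(np.isclose(AB[0, 1], sum(A[0, k] * B[k, 1] for k in range(2))))  # True

# Matrix multiplication is generally not commutative
print(np.allclose(A @ B, B @ A))  # False for these matrices

# An idempotent matrix: the least squares projection matrix P = X (X'X)^{-1} X'
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])   # 3 x 2, full column rank
P = X @ np.linalg.inv(X.T @ X) @ X.T
print(np.allclose(P @ P, P))  # True, so P is idempotent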
The following basic properties of matrix addition and multiplication in Theorem 1.1 are easy to verify.
Let α and β be scalars and A, B, and C be matrices. Then, when the operations involved are defined, the following properties hold:
a.
.
b.
.
c.
.
d.
.
e.
.
f.
.
g.
.
h.
.
The transpose of an $m \times n$ matrix $A$ is the $n \times m$ matrix $A'$ obtained by interchanging the rows and columns of $A$. Thus, the $(i, j)$th element of $A'$ is $a_{ji}$. If $A$ is $m \times p$ and $B$ is $p \times n$, then the $(i, j)$th element of $(AB)'$ can be expressed as
$$\{(AB)'\}_{ij} = (AB)_{ji} = \sum_{k=1}^{p} a_{jk} b_{ki} = \sum_{k=1}^{p} (B')_{ik} (A')_{kj}.$$
Thus, evidently $(AB)' = B'A'$. This property along with some other results involving the transpose are summarized in Theorem 1.2.
Let α and β be scalars and A and B be matrices. Then, when defined, the following properties hold:
a.
.
b.
.
c.
.
d.
.
If $A$ is $m \times m$, that is, $A$ is a square matrix, then $A'$ is also $m \times m$. In this case, if $A' = A$, then $A$ is called a symmetric matrix, whereas $A$ is called skew-symmetric if $A' = -A$.
The transpose of a column vector is a row vector, and in some situations, we may write a matrix as a column vector times a row vector. For instance, the $m \times m$ matrix defined in Section 1.2 whose only nonzero element is a 1 in the $(i, j)$th position can be expressed as $e_ie_j'$. More generally, $e_ie_j'$, where $e_i$ is the $i$th column of $I_m$ and $e_j$ is the $j$th column of $I_n$, yields an $m \times n$ matrix having 1, as its only nonzero element, in the $(i, j)$th position, and if $A$ is an $m \times n$ matrix, then
$$A = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}\, e_i e_j'.$$
The trace is a function that is defined only on square matrices. If $A$ is an $m \times m$ matrix, then the trace of $A$, denoted by $\mathrm{tr}(A)$, is defined to be the sum of the diagonal elements of $A$; that is,
$$\mathrm{tr}(A) = \sum_{i=1}^{m} a_{ii}.$$
Now if $A$ is $m \times n$ and $B$ is $n \times m$, then $AB$ is $m \times m$ and
$$\mathrm{tr}(AB) = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} b_{ji} = \sum_{j=1}^{n} \sum_{i=1}^{m} b_{ji} a_{ij} = \mathrm{tr}(BA).$$
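A minimal NumPy check (not from the book) of the identity just derived, tr(AB) = tr(BA), with arbitrary rectangular matrices:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))   # m x n
B = rng.standard_normal((6, 4))   # n x m

# AB is 4 x 4 and BA is 6 x 6, yet the two traces agree
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))  # True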
This property of the trace, along with some others, is summarized in Theorem 1.3.
Let α be a scalar and A and B be matrices. Then, when the appropriate operations are defined, we have the following properties:
a.
.
b.
.
c.
.
d.
.
e.
if and only if
.
The determinant is another function defined on square matrices. If $A$ is an $m \times m$ matrix, then its determinant, denoted by $|A|$, is given by
$$|A| = \sum (-1)^{f(j_1, \ldots, j_m)} a_{1j_1} a_{2j_2} \cdots a_{mj_m},$$
where the summation is taken over all permutations $(j_1, \ldots, j_m)$ of the set of integers $(1, \ldots, m)$, and the function $f(j_1, \ldots, j_m)$ equals the number of transpositions necessary to change $(j_1, \ldots, j_m)$ to an increasing sequence of components, that is, to $(1, \ldots, m)$. A transposition is the interchange of two of the integers. Although $f$ is not unique, it is uniquely even or odd, so that $(-1)^{f(j_1, \ldots, j_m)}$ is uniquely defined. Note that the determinant produces all products of $m$ terms of the elements of the matrix $A$ such that exactly one element is selected from each row and each column of $A$.
Using the formula for the determinant, we find that $|A| = a_{11}$ when $m = 1$. If $A$ is $2 \times 2$, we have
$$|A| = a_{11}a_{22} - a_{12}a_{21},$$
and when $A$ is $3 \times 3$, we get
$$|A| = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31}.$$
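The permutation-sum definition can be coded directly and compared with a library routine; the sketch below is an illustration only (the helper name is invented here), and the approach is practical only for small m since the sum has m! terms.

import itertools
import numpy as np

def det_by_permutations(A):
    # Determinant from the permutation-sum definition; the sign is the parity of the permutation
    m = A.shape[0]
    total = 0.0
    for perm in itertools.permutations(range(m)):
        inversions = sum(1 for i in range(m) for j in range(i + 1, m) if perm[i] > perm[j])
        term = (-1.0) ** inversions
        for row, col in enumerate(perm):
            term *= A[row, col]       # one element from each row and each column
        total += term
    return total

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 4.0],
              [0.0, 2.0, 5.0]])
print(det_by_permutations(A), np.linalg.det(A))  # both approximately 9.0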
The following properties of the determinant in Theorem 1.4 are fairly straightforward to verify using the definition of a determinant.
If α is a scalar and A is an m × m matrix, then the following properties hold:
a. $|A'| = |A|$.
b. $|\alpha A| = \alpha^m |A|$.
c. If $A$ is a diagonal matrix, then $|A| = a_{11}a_{22} \cdots a_{mm}$.
d. If all elements of a row (or column) of $A$ are zero, $|A| = 0$.
e. The interchange of two rows (or columns) of $A$ changes the sign of $|A|$.
f. If all elements of a row (or column) of $A$ are multiplied by $\alpha$, then the determinant is multiplied by $\alpha$.
g. The determinant of $A$ is unchanged when a multiple of one row (or column) is added to another row (or column).
h. If two rows (or columns) of $A$ are proportional to one another, $|A| = 0$.
An alternative expression for $|A|$ can be given in terms of the cofactors of $A$. The minor of the element $a_{ij}$, denoted by $M_{ij}$, is the determinant of the $(m - 1) \times (m - 1)$ matrix obtained after removing the $i$th row and $j$th column from $A$. The corresponding cofactor of $a_{ij}$, denoted by $A_{ij}$, is then given as $A_{ij} = (-1)^{i+j} M_{ij}$.
For any $i$, the determinant of the $m \times m$ matrix $A$ can be obtained by expanding along the $i$th row,
$$|A| = \sum_{j=1}^{m} a_{ij} A_{ij}, \qquad (1.1)$$
or expanding along the $i$th column,
$$|A| = \sum_{j=1}^{m} a_{ji} A_{ji}. \qquad (1.2)$$
We will just prove (1.1), as (1.2) can easily be obtained by applying (1.1) to . We first consider the result when . Clearly
where
and the summation is over all permutations for which . Since , this implies that
where the summation is over all permutations of . If C is the matrix obtained from A by deleting its 1st row and jth column, then can be written
where the summation is over all permutations of and is the minor of . Thus,
as is required. To prove (1.1) when , let D be the m × m matrix for which , , for , and for . Then , and . Thus, since we have already established (1.1) when , we have
and so the proof is complete.
Our next result indicates that if the cofactors of a row or column are matched with the elements from a different row or column, the expansion reduces to 0.
If $A$ is an $m \times m$ matrix and $i \ne k$, then
$$\sum_{j=1}^{m} a_{kj} A_{ij} = 0 \quad \text{and} \quad \sum_{j=1}^{m} a_{jk} A_{ji} = 0. \qquad (1.3)$$
We will find the determinant of the 5 × 5 matrix given by
Using the cofactor expansion formula on the first column of A, we obtain
and then using the same expansion formula on the first column of this 4 × 4 matrix, we get
Because the determinant of the 3×3 matrix above is 6, we have
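The particular 5 × 5 matrix of this example did not survive the conversion of the text, but the cofactor expansion it illustrates is easy to code. The sketch below (an illustration, with an invented helper name) expands recursively along the first column and checks the result against numpy.linalg.det on a random integer matrix.

import numpy as np

def det_cofactor(A):
    # Expansion along the first column: |A| = sum_i a_i1 * (-1)^(i+1) * M_i1
    m = A.shape[0]
    if m == 1:
        return A[0, 0]
    total = 0.0
    for i in range(m):
        minor = np.delete(np.delete(A, i, axis=0), 0, axis=1)   # drop row i and column 1
        total += ((-1.0) ** i) * A[i, 0] * det_cofactor(minor)  # (-1)**i is (-1)^(i+j) for 1-indexed i, j = 1
    return total

rng = np.random.default_rng(1)
A = rng.integers(-3, 4, size=(5, 5)).astype(float)
print(np.isclose(det_cofactor(A), np.linalg.det(A)))  # True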
Consider the m × m matrix C whose columns are given by the vectors ; that is, we can write . Suppose that, for some m × 1 vector and m × m matrix , we have
Then, if we find the determinant of C by expanding along the first column of C, we get
so that the determinant of C is a linear combination of m determinants. If B is an m × m matrix and we now define , then by applying the previous derivation on each column of C, we find that
where this final sum is only over all permutations of , because Theorem 1.4(h) implies that
if for any . Finally, reordering the columns in and using Theorem 1.4(e), we have
This very useful result is summarized in Theorem 1.7.
If both $A$ and $B$ are square matrices of the same order, then $|AB| = |A||B|$.
An $m \times m$ matrix $A$ is said to be a nonsingular matrix if $|A| \ne 0$ and a singular matrix if $|A| = 0$. If $A$ is nonsingular, a nonsingular matrix denoted by $A^{-1}$ and called the inverse of $A$ exists, such that
$$AA^{-1} = A^{-1}A = I_m. \qquad (1.4)$$
This inverse is unique because, if $B$ is another $m \times m$ matrix satisfying the inverse formula (1.4) for $A$, then $BA = I_m$, and so
$$B = BI_m = B(AA^{-1}) = (BA)A^{-1} = I_m A^{-1} = A^{-1}.$$
The following basic properties of the matrix inverse in Theorem 1.8 can be easily verified by using (1.4).
If α is a nonzero scalar, and A and B are nonsingular m × m matrices, then the following properties hold:
a.
.
b.
.
c.
.
d.
.
e.
If
, then
.
f.
If
, then
.
g.
.
As with the determinant of A, the inverse of A can be expressed in terms of the cofactors of A. Let , called the adjoint of A, be the transpose of the matrix of cofactors of A; that is, the th element of is , the cofactor of . Then
because follows directly from (1.1) and (1.2), and , for follows from (1.3). The equation above then yields the relationship
when . Thus, for instance, if A is a 2×2 nonsingular matrix, then
Similarly when , we get , where
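As an aside (not part of the original text), the adjoint can be formed numerically as the transpose of the matrix of cofactors, and its relationship to the determinant and the inverse checked directly; the test matrix and helper name are made up for this sketch.

import numpy as np

def adjoint(A):
    # Transpose of the matrix of cofactors of A
    m = A.shape[0]
    C = np.empty((m, m))
    for i in range(m):
        for j in range(m):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = ((-1.0) ** (i + j)) * np.linalg.det(minor)
    return C.T

A = np.array([[4.0, 7.0, 2.0],
              [3.0, 1.0, 5.0],
              [8.0, 0.0, 6.0]])
adjA = adjoint(A)
detA = np.linalg.det(A)
print(np.allclose(A @ adjA, detA * np.eye(3)))      # A times its adjoint equals |A| I
print(np.allclose(np.linalg.inv(A), adjA / detA))   # the inverse is the adjoint divided by |A|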
The relationship between the inverse of a matrix product and the product of the inverses, given in Theorem 1.8(g), is a very useful property. Unfortunately, such a nice relationship does not exist between the inverse of a sum and the sum of the inverses. We do, however, have Theorem 1.9 which is sometimes useful.
Suppose $A$ and $B$ are nonsingular matrices, with $A$ being $m \times m$ and $B$ being $n \times n$. For any $m \times n$ matrix $C$ and any $n \times m$ matrix $D$, it follows that if $A + CBD$ is nonsingular, then
$$(A + CBD)^{-1} = A^{-1} - A^{-1}C(B^{-1} + DA^{-1}C)^{-1}DA^{-1}.$$
The proof simply involves verifying that for given above. We have
and so the result follows.
The expression given for $(A + CBD)^{-1}$ in Theorem 1.9 involves the inverse of the matrix $B^{-1} + DA^{-1}C$. It can be shown (see Problem 7.12) that the conditions of the theorem guarantee that this inverse exists. If $n = m$ and $C$ and $D$ are identity matrices, then we obtain Corollary 1.9.1 of Theorem 1.9.
Suppose that $A$, $B$, and $A + B$ are all $m \times m$ nonsingular matrices. Then
$$(A + B)^{-1} = A^{-1} - A^{-1}(A^{-1} + B^{-1})^{-1}A^{-1}.$$
We obtain Corollary 1.9.2 of Theorem 1.9 when $n = 1$.
Let $A$ be an $m \times m$ nonsingular matrix. If $c$ and $d$ are both $m \times 1$ vectors and $A + cd'$ is nonsingular, then
$$(A + cd')^{-1} = A^{-1} - \frac{A^{-1}cd'A^{-1}}{1 + d'A^{-1}c}.$$
Theorem 1.9 can be particularly useful when m is larger than n and the inverse of A is fairly easy to compute. For instance, suppose we have ,
from which we obtain
It is somewhat tedious to compute the inverse of this 5 × 5 matrix directly. However, the calculations in Theorem 1.9 are fairly straightforward. Clearly, and
so that
and
Thus, we find that
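The displayed formula of Theorem 1.9 is missing from this extraction; the numerical check below uses the standard Woodbury form, which is the identity the surrounding discussion appears to describe, with randomly generated matrices standing in for the example above.

import numpy as np

rng = np.random.default_rng(2)
m, n = 5, 2
A = 3.0 * np.eye(m)                                # an easily inverted m x m matrix
B = rng.standard_normal((n, n)) + 3.0 * np.eye(n)  # an n x n matrix, comfortably nonsingular
C = rng.standard_normal((m, n))
D = rng.standard_normal((n, m))

lhs = np.linalg.inv(A + C @ B @ D)
Ainv = np.linalg.inv(A)
rhs = Ainv - Ainv @ C @ np.linalg.inv(np.linalg.inv(B) + D @ Ainv @ C) @ D @ Ainv
print(np.allclose(lhs, rhs))  # True

When m is much larger than n and the inverse of A is cheap, the right-hand side requires inverting only small n x n matrices, which is the point of the example above.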
Occasionally we will find it useful to partition a given matrix into submatrices. For instance, suppose A is m × n and the positive integers m1, m2, , are such that and . Then one way of writing A as a partitioned matrix is
where A11 is , is , A21 is , and A22 is . That is, A11 is the matrix consisting of the first m1 rows and columns of A, is the matrix consisting of the first m1 rows and last columns of A, and so on. Matrix operations can be expressed in terms of the submatrices of the partitioned matrix. For example, suppose B is an matrix partitioned as
where is , is , is , is , and . Then the premultiplication of B by A can be expressed in partitioned form as
Matrices can be partitioned into submatrices in other ways besides this 2×2 partitioned form. For instance, we could partition only the columns of A, yielding the expression
where A1 is and A2 is . A more general situation is one in which the rows of A are partitioned into r groups and the columns of A are partitioned into c groups so that A can be written as
where the submatrix is and the integers and are such that
This matrix A is said to be in block diagonal form if , is a square matrix for each i, and is a null matrix for all i and j for which . In this case, we will write ; that is,
Suppose we wish to compute the transpose product , where the 5 × 5 matrix A is given by
The computation can be simplified by observing that A may be written as
As a result, we have
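A short NumPy sketch (not from the book) confirming that blockwise multiplication agrees with ordinary multiplication for a conformably partitioned pair of matrices; the partition sizes here are chosen arbitrarily.

import numpy as np

rng = np.random.default_rng(3)
# A is 4 x 5 with row blocks of sizes 2, 2 and column blocks of sizes 3, 2;
# B is 5 x 3 partitioned conformably (row blocks 3, 2; column blocks 2, 1)
A11, A12 = rng.standard_normal((2, 3)), rng.standard_normal((2, 2))
A21, A22 = rng.standard_normal((2, 3)), rng.standard_normal((2, 2))
B11, B12 = rng.standard_normal((3, 2)), rng.standard_normal((3, 1))
B21, B22 = rng.standard_normal((2, 2)), rng.standard_normal((2, 1))

A = np.block([[A11, A12], [A21, A22]])
B = np.block([[B11, B12], [B21, B22]])

# Blockwise product: the (1,1) block of AB is A11 B11 + A12 B21, and so on
AB_blocks = np.block([[A11 @ B11 + A12 @ B21, A11 @ B12 + A12 @ B22],
                      [A21 @ B11 + A22 @ B21, A21 @ B12 + A22 @ B22]])
print(np.allclose(A @ B, AB_blocks))  # True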
Our initial definition of the rank of an m × n matrix A is given in terms of submatrices. We will see an alternative equivalent definition in terms of the concept of linearly independent vectors in Chapter 2. Most of the material we include in this section can be found in more detail in texts on elementary linear algebra such as Andrilli and Hecker (2010) and Poole (2015).
In general, any matrix formed by deleting rows or columns of $A$ is called a submatrix of $A$. The determinant of an $r \times r$ submatrix of $A$ is called a minor of order $r$. For instance, for an $m \times m$ matrix $A$, we have previously defined what we called the minor of $a_{ij}$; this is an example of a minor of order $m - 1$. Now the rank of a nonnull $m \times n$ matrix $A$ is $r$, written $\mathrm{rank}(A) = r$, if at least one of its minors of order $r$ is nonzero while all minors of order $r + 1$ (if there are any) are zero. If $A$ is a null matrix, then $\mathrm{rank}(A) = 0$. If $\mathrm{rank}(A) = \min(m, n)$, then $A$ is said to have full rank. In particular, if $\mathrm{rank}(A) = m$, $A$ has full row rank, and if $\mathrm{rank}(A) = n$, $A$ has full column rank.
The rank of a matrix A is unchanged by any of the following operations, called elementary transformations:
a. The interchange of two rows (or columns) of A.
b. The multiplication of a row (or column) of A by a nonzero scalar.
c. The addition of a scalar multiple of a row (or column) of A to another row (or column) of A.
Thus, the definition of the rank of A is sometimes given as the number of nonzero rows in the reduced row echelon form of A.
Any elementary transformation of A can be expressed as the multiplication of A by a matrix referred to as an elementary transformation matrix. An elementary transformation of the rows of A will be given by the premultiplication of A by an elementary transformation matrix, whereas an elementary transformation of the columns corresponds to a postmultiplication. Elementary transformation matrices are nonsingular, and any nonsingular matrix can be expressed as the product of elementary transformation matrices. Consequently, we have Theorem 1.10.
Let $A$ be an $m \times n$ matrix, $B$ be an $m \times m$ matrix, and $C$ be an $n \times n$ matrix. Then if $B$ and $C$ are nonsingular matrices, it follows that $\mathrm{rank}(BAC) = \mathrm{rank}(A)$.
By using elementary transformation matrices, any matrix A can be transformed into another matrix of simpler form having the same rank as A.
If $A$ is an $m \times n$ matrix of rank $r > 0$, then nonsingular $m \times m$ and $n \times n$ matrices $B$ and $C$ exist, such that $BAC = H$ and $A = B^{-1}HC^{-1}$, where $H$ is given by
$$H = \begin{bmatrix} I_r & (0) \\ (0) & (0) \end{bmatrix}.$$
Corollary 1.11.1 is an immediate consequence of Theorem 1.11.
Let $A$ be an $m \times n$ matrix with $\mathrm{rank}(A) = r > 0$. Then an $m \times r$ matrix $F$ and an $r \times n$ matrix $G$ exist, such that $\mathrm{rank}(F) = \mathrm{rank}(G) = r$ and $A = FG$.
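One way to produce the factorization of Corollary 1.11.1 numerically is through the singular value decomposition, which is introduced in Chapter 4; the sketch below is an illustration under that choice, using an invented rank-2 matrix.

import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))   # a 6 x 4 matrix of rank 2
r = np.linalg.matrix_rank(A)

U, s, Vt = np.linalg.svd(A)
F = U[:, :r] * s[:r]          # m x r with full column rank
G = Vt[:r, :]                 # r x n with full row rank
print(r, np.allclose(A, F @ G))                               # 2 True
print(np.linalg.matrix_rank(F), np.linalg.matrix_rank(G))     # 2 2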
An $m \times 1$ vector $p$ is said to be a normalized vector or a unit vector if $p'p = 1$. The $m \times 1$ vectors $p_1, \ldots, p_n$, where $n \ge 2$, are said to be orthogonal if $p_i'p_j = 0$ for all $i \ne j$. If, in addition, each $p_i$ is a normalized vector, then the vectors are said to be orthonormal. An $m \times m$ matrix $P$ whose columns form an orthonormal set of vectors is called an orthogonal matrix. It immediately follows that
$$P'P = I_m.$$
Taking the determinant of both sides, we see that
$$|P'P| = |P'||P| = |P|^2 = |I_m| = 1.$$
Thus, $|P| = 1$ or $-1$, so that $P$ is nonsingular, $P^{-1} = P'$, and $PP' = I_m$ in addition to $P'P = I_m$; that is, the rows of $P$ also form an orthonormal set of $m \times 1$ vectors. Some basic properties of orthogonal matrices are summarized in Theorem 1.12.
Let P and Q be m × m orthogonal matrices and A be any m × m matrix. Then
a.
,
b.
,
c.
is an orthogonal matrix.
One example of an m × m orthogonal matrix, known as the Helmert matrix, has the form
For instance, if , the Helmert matrix is
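The displayed Helmert matrix did not survive extraction. The sketch below builds the m × m Helmert matrix under one common convention (a constant first row; for i ≥ 2, row i has i − 1 equal positive entries, one negative entry, and zeros) and verifies that it is orthogonal; the helper name is invented for the example.

import numpy as np

def helmert(m):
    # Row 1 is (1, ..., 1)/sqrt(m); for i >= 2, row i is
    # (c, ..., c, -(i-1)c, 0, ..., 0) with i-1 leading c's, where c = 1/sqrt(i(i-1))
    H = np.zeros((m, m))
    H[0, :] = 1.0 / np.sqrt(m)
    for i in range(2, m + 1):
        c = 1.0 / np.sqrt(i * (i - 1))
        H[i - 1, :i - 1] = c
        H[i - 1, i - 1] = -(i - 1) * c
    return H

H = helmert(4)
print(np.round(H, 3))
print(np.allclose(H @ H.T, np.eye(4)), np.allclose(H.T @ H, np.eye(4)))  # True True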
Note that if $m \ne n$, it is possible for an $m \times n$ matrix $P$ to satisfy one of the identities, $P'P = I_n$ or $PP' = I_m$, but not both. Such a matrix is sometimes referred to as a semiorthogonal matrix.
An $m \times m$ matrix $P$ is called a permutation matrix if each row and each column of $P$ has a single element 1, while all remaining elements are zeros. As a result, the columns of $P$ will be $e_1, \ldots, e_m$, the columns of $I_m$, in some order. Note then that the $(j, j)$th element of $P'P$ will be $e_i'e_i = 1$ for some $i$, and the $(j, k)$th element of $P'P$ will be $e_i'e_h = 0$ for some $i \ne h$ if $j \ne k$; that is, a permutation matrix is a special orthogonal matrix. Since there are $m!$ ways of permuting the columns of $I_m$, there are $m!$ different permutation matrices of order $m$. If $A$ is also $m \times m$, then $PA$ creates an $m \times m$ matrix by permuting the rows of $A$, and $AP$ produces a matrix by permuting the columns of $A$.
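A brief illustration (not from the book) of these facts about permutation matrices:

import numpy as np

# A permutation matrix: the columns of I_3 in the order e2, e3, e1
P = np.eye(3)[:, [1, 2, 0]]
A = np.arange(9.0).reshape(3, 3)

print(np.allclose(P.T @ P, np.eye(3)))   # True: a permutation matrix is orthogonal
print(P @ A)                             # the rows of A, permuted
print(A @ P)                             # the columns of A, permuted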
Let $x$ be an $m \times 1$ vector, $y$ an $n \times 1$ vector, and $A$ an $m \times n$ matrix. Then the function of $x$ and $y$ given by
$$x'Ay = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} x_i y_j$$
is sometimes called a bilinear form in $x$ and $y$. We will be most interested in the special case in which $n = m$, so that $A$ is $m \times m$, and $y = x$. In this case, the function above reduces to the function of $x$,
$$x'Ax = \sum_{i=1}^{m} \sum_{j=1}^{m} a_{ij} x_i x_j,$$
which is called a quadratic form in $x$; $A$ is referred to as the matrix of the quadratic form. We will always assume that $A$ is a symmetric matrix because, if it is not, $A$ may be replaced by $\tfrac{1}{2}(A + A')$, which is symmetric, without altering $x'Ax$; that is,
$$x'\{\tfrac{1}{2}(A + A')\}x = \tfrac{1}{2}(x'Ax + x'A'x) = x'Ax,$$
because $x'A'x = (x'Ax)' = x'Ax$. As an example, consider the function
because . As an example, consider the function
where x is 3 × 1. The symmetric matrix A satisfying is given by
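Since the specific function of this example is not recoverable here, the following sketch simply verifies, for an arbitrary nonsymmetric A, that replacing A by (A + A')/2 leaves the quadratic form unchanged.

import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 3))          # not symmetric in general
A_sym = (A + A.T) / 2.0                  # the symmetric matrix of the same quadratic form
x = rng.standard_normal(3)

print(np.isclose(x @ A @ x, x @ A_sym @ x))  # True: x'Ax = x'{(A + A')/2}x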
Every symmetric matrix $A$ and its associated quadratic form $x'Ax$ is classified into one of the following five categories:
a. If $x'Ax > 0$ for all $x \ne 0$, then $A$ is positive definite.
b. If $x'Ax \ge 0$ for all $x$ and $x'Ax = 0$ for some $x \ne 0$, then $A$ is positive semidefinite.
c. If $x'Ax < 0$ for all $x \ne 0$, then $A$ is negative definite.
d. If $x'Ax \le 0$ for all $x$ and $x'Ax = 0$ for some $x \ne 0$, then $A$ is negative semidefinite.
e. If $x'Ax > 0$ for some $x$ and $x'Ax < 0$ for some $x$, then $A$ is indefinite.
Note that the null matrix is actually both positive semidefinite and negative semidefinite.
Positive definite and negative definite matrices are nonsingular, whereas positive semidefinite and negative semidefinite matrices are singular. Sometimes the term nonnegative definite will be used to refer to a symmetric matrix that is either positive definite or positive semidefinite. An $m \times m$ matrix $B$ is called a square root of the nonnegative definite $m \times m$ matrix $A$ if $A = BB'$. Sometimes we will denote such a matrix $B$ as $A^{1/2}$. If $B$ is also symmetric, so that $A = BB = B^2$, then $B$ is called the symmetric square root of $A$.
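Eigenvalue characterizations of definiteness are developed later in the book, but they already provide a convenient numerical check, and the spectral decomposition gives one way to compute a symmetric square root. The sketch below is an illustration along those lines, using an invented matrix of the form X'X.

import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal((10, 3))
A = X.T @ X                          # X'X is nonnegative definite (here positive definite)

# All eigenvalues of A are positive, consistent with positive definiteness
lam, Q = np.linalg.eigh(A)
print(np.all(lam > 0))               # True

# Symmetric square root B with B B = A, built from A = Q diag(lam) Q'
B = Q @ np.diag(np.sqrt(lam)) @ Q.T
print(np.allclose(B @ B, A), np.allclose(B, B.T))   # True True

# x'Ax = (Xx)'(Xx) >= 0 for every x
x = rng.standard_normal(3)
print(x @ A @ x >= 0)                # True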
Quadratic forms play a prominent role in inferential statistics. In Chapter 11, we will develop some of the most important results involving quadratic forms that are of particular interest in statistics.
Throughout most of this text, we will be dealing with the analysis of vectors and matrices composed of real numbers or variables. However, there are occasions in which an analysis of a real matrix, such as the decomposition of a matrix in the form of a product of other matrices, leads to matrices that contain complex numbers. For this reason, we will briefly summarize in this section some of the basic notation and terminology regarding complex numbers.
Any complex number $c$ can be written in the form
$$c = a + ib,$$
where $a$ and $b$ are real numbers and $i$ represents the imaginary number $\sqrt{-1}$. The real number $a$ is called the real part of $c$, whereas $b$ is referred to as the imaginary part of $c$. Thus, the number $c$ is a real number only if $b$ is 0. If we have two complex numbers, $c_1 = a_1 + ib_1$ and $c_2 = a_2 + ib_2$, then their sum is given by
$$c_1 + c_2 = (a_1 + a_2) + i(b_1 + b_2),$$
whereas their product is given by
$$c_1c_2 = (a_1a_2 - b_1b_2) + i(a_1b_2 + a_2b_1).$$
Corresponding to each complex number $c = a + ib$ is another complex number denoted by $\bar{c}$ and called the complex conjugate of $c$. The complex conjugate of $c$ is given by $\bar{c} = a - ib$ and satisfies $c\bar{c} = a^2 + b^2$, so that the product of a complex number and its conjugate results in a real number.
A complex number can be represented geometrically by a point in the complex plane, where one of the axes is the real axis and the other axis is the complex or imaginary axis. Thus, the complex number $c = a + ib$ would be represented by the point $(a, b)$ in this complex plane. Alternatively, we can use the polar coordinates $(r, \theta)$, where $r$ is the length of the line from the origin to the point $(a, b)$ and $\theta$ is the angle between this line and the positive half of the real axis.
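Python's built-in complex type and the cmath module illustrate the conjugate and the polar representation just described (an aside, not part of the original text):

import cmath

c = 3 + 4j                       # a = 3, b = 4
print(c.conjugate())             # (3-4j)
print(c * c.conjugate())         # (25+0j): a real number, namely a^2 + b^2
r, theta = cmath.polar(c)        # polar coordinates (r, theta)
print(r, theta)                  # r = 5.0, theta = the angle of the point (3, 4)
print(cmath.rect(r, theta))      # back to (approximately) (3+4j)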
