Matrix Algebra for Linear Models

Marvin H. J. Gruber

DESCRIPTION

A self-contained introduction to matrix analysis theory and applications in the field of statistics

Comprehensive in scope, Matrix Algebra for Linear Models offers a succinct summary of matrix theory and its related applications to statistics, especially linear models. The book provides a unified presentation of the mathematical properties and statistical applications of matrices in order to define and manipulate data.

Written for theoretical and applied statisticians, the book utilizes multiple numerical examples to illustrate key ideas, methods, and techniques crucial to understanding matrix algebra’s application in linear models. Matrix Algebra for Linear Models expertly balances concepts and methods allowing for a side-by-side presentation of matrix theory and its linear model applications. Including concise summaries on each topic, the book also features:

  • Methods of deriving results from the properties of eigenvalues and the singular value decomposition
  • Solutions to matrix optimization problems for obtaining more efficient biased estimators for parameters in linear regression models
  • A section on the generalized singular value decomposition
  • Multiple chapter exercises with selected answers to enhance understanding of the presented material

Matrix Algebra for Linear Models is an ideal textbook for advanced undergraduate and graduate-level courses on statistics, matrices, and linear algebra. The book is also an excellent reference for statisticians, engineers, economists, and readers interested in the linear statistical model.




CONTENTS

PREFACE

ACKNOWLEDGMENTS

PART I: BASIC IDEAS ABOUT MATRICES AND SYSTEMS OF LINEAR EQUATIONS

SECTION 1: WHAT MATRICES ARE AND SOME BASIC OPERATIONS WITH THEM

1.1 INTRODUCTION

1.2 WHAT ARE MATRICES AND WHY ARE THEY INTERESTING TO A STATISTICIAN?

1.3 MATRIX NOTATION, ADDITION, AND MULTIPLICATION

1.4 SUMMARY

EXERCISES

SECTION 2: DETERMINANTS AND SOLVING A SYSTEM OF EQUATIONS

2.1 INTRODUCTION

2.2 DEFINITION OF AND FORMULAE FOR EXPANDING DETERMINANTS

2.3 SOME COMPUTATIONAL TRICKS FOR THE EVALUATION OF DETERMINANTS

2.4 SOLUTION TO LINEAR EQUATIONS USING DETERMINANTS

2.5 GAUSS ELIMINATION

2.6 SUMMARY

EXERCISES

SECTION 3: THE INVERSE OF A MATRIX

3.1 INTRODUCTION

3.2 THE ADJOINT METHOD OF FINDING THE INVERSE OF A MATRIX

3.3 USING ELEMENTARY ROW OPERATIONS

3.4 USING THE MATRIX INVERSE TO SOLVE A SYSTEM OF EQUATIONS

3.5 PARTITIONED MATRICES AND THEIR INVERSES

3.6 FINDING THE LEAST SQUARE ESTIMATOR

3.7 SUMMARY

EXERCISES

SECTION 4: SPECIAL MATRICES AND FACTS ABOUT MATRICES THAT WILL BE USED IN THE SEQUEL

4.1 INTRODUCTION

4.2 MATRICES OF THE FORM aIn + bJn

4.3 ORTHOGONAL MATRICES

4.4 DIRECT PRODUCT OF MATRICES

4.5 AN IMPORTANT PROPERTY OF DETERMINANTS

4.6 THE TRACE OF A MATRIX

4.7 MATRIX DIFFERENTIATION

4.8 THE LEAST SQUARE ESTIMATOR AGAIN

4.9 SUMMARY

EXERCISES

SECTION 5: VECTOR SPACES

5.1 INTRODUCTION

5.2 WHAT IS A VECTOR SPACE?

5.3 THE DIMENSION OF A VECTOR SPACE

5.4 INNER PRODUCT SPACES

5.5 LINEAR TRANSFORMATIONS

5.6 SUMMARY

EXERCISES

SECTION 6: THE RANK OF A MATRIX AND SOLUTIONS TO SYSTEMS OF EQUATIONS

6.1 INTRODUCTION

6.2 THE RANK OF A MATRIX

6.3 SOLVING SYSTEMS OF EQUATIONS WITH COEFFICIENT MATRIX OF LESS THAN FULL RANK

6.4 SUMMARY

EXERCISES

PART II: EIGENVALUES, THE SINGULAR VALUE DECOMPOSITION, AND PRINCIPAL COMPONENTS

SECTION 7: FINDING THE EIGENVALUES OF A MATRIX

7.1 INTRODUCTION

7.2 EIGENVALUES AND EIGENVECTORS OF A MATRIX

7.3 NONNEGATIVE DEFINITE MATRICES

7.4 SUMMARY

EXERCISES

SECTION 8: THE EIGENVALUES AND EIGENVECTORS OF SPECIAL MATRICES

8.1 INTRODUCTION

8.2 ORTHOGONAL, NONSINGULAR, AND IDEMPOTENT MATRICES

8.3 THE CAYLEY–HAMILTON THEOREM

8.4 THE RELATIONSHIP BETWEEN THE TRACE, THE DETERMINANT, AND THE EIGENVALUES OF A MATRIX

8.5 THE EIGENVALUES AND EIGENVECTORS OF THE KRONECKER PRODUCT OF TWO MATRICES

8.6 THE EIGENVALUES AND THE EIGENVECTORS OF A MATRIX OF THE FORM aI + bJ

8.7 THE LOEWNER ORDERING

8.8 SUMMARY

EXERCISES

SECTION 9: THE SINGULAR VALUE DECOMPOSITION (SVD)

9.1 INTRODUCTION

9.2 THE EXISTENCE OF THE SVD

9.3 USES AND EXAMPLES OF THE SVD

9.4 SUMMARY

EXERCISES

SECTION 10: APPLICATIONS OF THE SINGULAR VALUE DECOMPOSITION

10.1 INTRODUCTION

10.2 REPARAMETERIZATION OF A NON-FULL-RANK MODEL TO A FULL-RANK MODEL

10.3 PRINCIPAL COMPONENTS

10.4 THE MULTICOLLINEARITY PROBLEM

10.5 SUMMARY

EXERCISES

SECTION 11: RELATIVE EIGENVALUES AND GENERALIZATIONS OF THE SINGULAR VALUE DECOMPOSITION

11.1 INTRODUCTION

11.2 RELATIVE EIGENVALUES AND EIGENVECTORS

11.3 GENERALIZATIONS OF THE SINGULAR VALUE DECOMPOSITION: OVERVIEW

11.4 THE FIRST GENERALIZATION

11.5 THE SECOND GENERALIZATION

11.6 SUMMARY

EXERCISES

PART III: GENERALIZED INVERSES

SECTION 12: BASIC IDEAS ABOUT GENERALIZED INVERSES

12.1 INTRODUCTION

12.2 WHAT IS A GENERALIZED INVERSE AND HOW IS ONE OBTAINED?

12.3 THE MOORE–PENROSE INVERSE

12.4 SUMMARY

EXERCISES

SECTION 13: CHARACTERIZATIONS OF GENERALIZED INVERSES USING THE SINGULAR VALUE DECOMPOSITION

13.1 INTRODUCTION

13.2 CHARACTERIZATION OF THE MOORE–PENROSE INVERSE

13.3 GENERALIZED INVERSES IN TERMS OF THE MOORE–PENROSE INVERSE

13.4 SUMMARY

EXERCISES

SECTION 14: LEAST SQUARE AND MINIMUM NORM GENERALIZED INVERSES

14.1 INTRODUCTION

14.2 MINIMUM NORM GENERALIZED INVERSES

14.3 LEAST SQUARE GENERALIZED INVERSES

14.4 AN EXTENSION OF THEOREM 7.3 TO POSITIVE SEMIDEFINITE MATRICES

14.5 SUMMARY

EXERCISES

SECTION 15: MORE REPRESENTATIONS OF GENERALIZED INVERSES

15.1 INTRODUCTION

15.2 ANOTHER CHARACTERIZATION OF THE MOORE–PENROSE INVERSE

15.3 STILL ANOTHER REPRESENTATION OF THE GENERALIZED INVERSE

15.4 THE GENERALIZED INVERSE OF A PARTITIONED MATRIX

15.5 SUMMARY

EXERCISES

SECTION 16: LEAST SQUARE ESTIMATORS FOR LESS THAN FULL-RANK MODELS

16.1 INTRODUCTION

16.2 SOME PRELIMINARIES

16.3 OBTAINING THE LS ESTIMATOR

16.4 SUMMARY

EXERCISES

PART IV: QUADRATIC FORMS AND THE ANALYSIS OF VARIANCE

SECTION 17: QUADRATIC FORMS AND THEIR PROBABILITY DISTRIBUTIONS

17.1 INTRODUCTION

17.2 EXAMPLES OF QUADRATIC FORMS

17.3 THE CHI-SQUARE DISTRIBUTION

17.4 WHEN DOES THE QUADRATIC FORM OF A RANDOM VARIABLE HAVE A CHI-SQUARE DISTRIBUTION?

17.5 WHEN ARE TWO QUADRATIC FORMS WITH THE CHI-SQUARE DISTRIBUTION INDEPENDENT?

17.6 SUMMARY

EXERCISES

SECTION 18: ANALYSIS OF VARIANCE: REGRESSION MODELS AND THE ONE- AND TWO-WAY CLASSIFICATION

18.1 INTRODUCTION

18.2 THE FULL-RANK GENERAL LINEAR REGRESSION MODEL

18.3 ANALYSIS OF VARIANCE: ONE-WAY CLASSIFICATION

18.4 ANALYSIS OF VARIANCE: TWO-WAY CLASSIFICATION

18.5 SUMMARY

EXERCISES

SECTION 19: MORE ANOVA

19.1 INTRODUCTION

19.2 THE TWO-WAY CLASSIFICATION WITH INTERACTION

19.3 THE TWO-WAY CLASSIFICATION WITH ONE FACTOR NESTED

19.4 SUMMARY

EXERCISES

SECTION 20: THE GENERAL LINEAR HYPOTHESIS

20.1 INTRODUCTION

20.2 THE FULL-RANK CASE

20.3 THE NON-FULL-RANK CASE

20.4 CONTRASTS

20.5 SUMMARY

EXERCISES

PART V: MATRIX OPTIMIZATION PROBLEMS

SECTION 21: UNCONSTRAINED OPTIMIZATION PROBLEMS

21.1 INTRODUCTION

21.2 UNCONSTRAINED OPTIMIZATION PROBLEMS

21.3 THE LEAST SQUARE ESTIMATOR AGAIN

21.4 SUMMARY

EXERCISES

SECTION 22: CONSTRAINED MINIMIZATION PROBLEMS WITH LINEAR CONSTRAINTS

22.1 INTRODUCTION

22.2 AN OVERVIEW OF LAGRANGE MULTIPLIERS

22.3 MINIMIZING A SECOND-DEGREE FORM WITH RESPECT TO A LINEAR CONSTRAINT

22.4 THE CONSTRAINED LEAST SQUARE ESTIMATOR

22.5 CANONICAL CORRELATION

22.6 SUMMARY

EXERCISES

SECTION 23: THE GAUSS–MARKOV THEOREM

23.1 INTRODUCTION

23.2 THE GAUSS–MARKOV THEOREM AND THE LEAST SQUARE ESTIMATOR

23.3 THE MODIFIED GAUSS–MARKOV THEOREM AND THE LINEAR BAYES ESTIMATOR

23.4 SUMMARY

EXERCISES

SECTION 24: RIDGE REGRESSION-TYPE ESTIMATORS

24.1 INTRODUCTION

24.2 MINIMIZING A SECOND-DEGREE FORM WITH RESPECT TO A QUADRATIC CONSTRAINT

24.3 THE GENERALIZED RIDGE REGRESSION ESTIMATORS

24.4 THE MEAN SQUARE ERROR OF THE GENERALIZED RIDGE ESTIMATOR WITHOUT AVERAGING OVER THE PRIOR DISTRIBUTION

24.5 THE MEAN SQUARE ERROR AVERAGING OVER THE PRIOR DISTRIBUTION

24.6 SUMMARY

EXERCISES

ANSWERS TO SELECTED EXERCISES

PART I

PART II

PART III

PART IV

PART V

REFERENCES

INDEX

Copyright © 2014 by John Wiley & Sons, Inc. All rights reserved

Published by John Wiley & Sons, Inc., Hoboken, New Jersey

Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Gruber, Marvin H. J., 1941–
    Matrix algebra for linear models / Marvin H. J. Gruber, Department of Mathematical Sciences, Rochester Institute of Technology, Rochester, NY.
        pages cm
    Includes bibliographical references and index.

    ISBN 978-1-118-59255-7 (cloth)
    1. Linear models (Statistics) 2. Matrices. I. Title.
    QA279.G78 2013
    519.5′36–dc23

2013026537

ISBN: 9781118592557

To the memory of my parents, Adelaide Lee Gruber and Joseph George Gruber, who were always there for me while I was growing up and as a young adult.

PREFACE

This is a book about matrix algebra with examples of its application to statistics, mostly the linear statistical model. There are 5 parts and 24 sections.

Part I (Sections 1–6) reviews topics in undergraduate linear algebra such as matrix operations, determinants, vector spaces, and solutions to systems of linear equations. In addition, it includes some topics frequently not covered in a first course that are of interest to statisticians. These include the Kronecker product of two matrices and inverses of partitioned matrices.

Part II (Sections 7–11) tells how to find the eigenvalues of a matrix and takes up the singular value decomposition and its generalizations. The applications studied include principal components and the multicollinearity problem.

Part III (Sections 12–16) deals with generalized inverses. This includes what they are and examples of how they are useful. It also considers different kinds of generalized inverses such as the Moore–Penrose inverse, minimum norm generalized inverses, and least square generalized inverses. There are a number of results about how to represent generalized inverses using nonsingular matrices and using the singular value decomposition. Results about least square estimators for the less than full rank case are given, which employ the properties of generalized inverses. Some of the results are applied in Parts IV and V.

The use of quadratic forms in the analysis of variance is the subject of Part IV (Sections 17–20). The distributional properties of quadratic forms of normal random variables are studied. The results are applied to the analysis of variance for a full rank regression model, the one- and two-way classification, the two-way classification with interaction, and a nested model. Testing the general linear hypothesis is also taken up.

Part V (Sections 21–24) is about the minimization of a second-degree form. Cases taken up are unconstrained minimization and minimization with respect to linear and quadratic constraints. The applications taken up include the least square estimator, canonical correlation, and ridge-type estimators.

Each part has an introduction that provides a more detailed overview of its contents, and each section begins with a brief overview and ends with a summary.

The book has numerous worked examples, most of which illustrate the important results with numerical computations. The examples are titled to inform the reader what they are about.

At the end of each of the 24 sections, there are exercises. Some of these are proof type; many of them are numerical. Answers are given at the end of the book for almost all of the numerical exercises, and solutions or partial solutions are given for about half of the proof-type problems. Some of the numerical exercises are a bit cumbersome, and readers are invited to use a computer algebra system, such as Mathematica, Maple, or Matlab, to help with the computations. Many of the exercises have more than one right answer, so readers may, in some instances, solve a problem correctly and get an answer different from the one in the back of the book.

The author has prepared a solutions manual with solutions to all of the exercises, which is available from Wiley to instructors who adopt this book as a textbook for a course.

The end of an example is denoted by the symbol , the end of a proof by , and the end of a formal definition by .

The book is, for the most part, self-contained. However, it would be helpful if readers had a first course in matrix or linear algebra and some background in statistics.

There are a number of other excellent books on this subject; they are given in the references. This book takes a slightly different approach by making extensive use of the singular value decomposition. In addition, it shows some of the statistical applications of the matrix theory, which for the most part the other books do not do, and it has more numerical examples than they do. Hopefully, it will add to what is out there on the subject and not necessarily compete with the other books.

MARVIN H. J. GRUBER

ACKNOWLEDGMENTS

There are a number of people who should be thanked for their help and support. I would like to thank three of my teachers at the University of Rochester, my thesis advisor, Poduri S.R.S. Rao, Govind Mudholkar, and Reuben Gabriel (may he rest in peace) for introducing me to many of the topics taken up in this book. I am very grateful to Steve Quigley for his guidance in how the book should be organized, his constructive criticism, and other kinds of help and support. I am also grateful to the other staff of John Wiley & Sons, which include the editorial assistant, Sari Friedman, the copy editor, Yassar Arafat, and the production editor, Stephanie Loh.

On a personal note, I am grateful for the friendship of Frances Johnson and her help and support.

PART I

BASIC IDEAS ABOUT MATRICES AND SYSTEMS OF LINEAR EQUATIONS

This part of the book reviews the topics ordinarily covered in a first course in linear algebra. It also introduces some other topics usually not covered in the first course that are important to statistics, in particular to the linear statistical model.

The first of the six sections in this part gives illustrations of how matrices are useful to the statistician for summarizing data. The basic operations of matrix addition, multiplication of a matrix by a scalar, and matrix multiplication are taken up. Matrices have some properties that are similar to real numbers and some properties that they do not share with real numbers. These are pointed out.

Section 2 is an informal review of the evaluation of determinants. It shows how determinants can be used to solve systems of equations. Cramer’s rule and Gauss elimination are presented.

Section 3 is about finding the inverse of a matrix. The adjoint method and the use of elementary row and column operations are considered. In addition, the inverse of a partitioned matrix is discussed.

Special matrices important to statistical applications are the subject of Section 4. These include combinations of the identity matrix and matrices consisting of ones, orthogonal matrices in general, and some orthogonal matrices useful in the analysis of variance, for example, the Helmert matrix. The Kronecker product, also called the direct product of matrices, is presented; it is useful in the representation of sums of squares in the analysis of variance. This section also includes a discussion of the differentiation of matrices, which proves useful in solving constrained optimization problems in Part V.

Vector spaces are taken up in Section 5 because they are important to understanding eigenvalues, eigenvectors, and the singular value decomposition that are studied in Part II. They are also important for understanding what the rank of a matrix is and the concept of degrees of freedom of sums of squares in the analysis of variance. Inner product spaces are also taken up and the Cauchy–Schwarz inequality is established.

The Cauchy–Schwarz inequality is important for the comparison of the efficiency of estimators.

The material on vector spaces in Section 5 is used in Section 6 to explain what is meant by the rank of a matrix and to show when a system of linear equations has a unique solution, infinitely many solutions, or no solution.

SECTION 1

WHAT MATRICES ARE AND SOME BASIC OPERATIONS WITH THEM

1.1 INTRODUCTION

This section will introduce matrices and show how they are useful to represent data. It will review some basic matrix operations including matrix addition and multiplication. Some examples to illustrate why they are interesting and important for statistical applications will be given. The representation of a linear model using matrices will be shown.

1.2 WHAT ARE MATRICES AND WHY ARE THEY INTERESTING TO A STATISTICIAN?

Matrices are rectangular arrays of numbers. Some examples of such arrays are
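The display of example arrays is not reproduced above; for concreteness, here are two illustrative arrays of the kind meant (chosen here, not the book's own examples): a 2 × 3 rectangular matrix and a 3 × 3 square matrix.

```latex
\[
A = \begin{pmatrix} 1 & 4 & 0 \\ 2 & -3 & 5 \end{pmatrix},
\qquad
B = \begin{pmatrix} 2 & 0 & 1 \\ 0 & 1 & 3 \\ 1 & 3 & 2 \end{pmatrix}.
\]
```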

Often data may be represented conveniently by a matrix. We give an example to illustrate how.

 

Example 1.1 Representing Data by Matrices

An example that lends itself to statistical analysis is taken from the Economic Report of the President of the United States in 1988. The data represent the relationship between a dependent variable Y (personal consumption expenditures) and three independent variables X1, X2, and X3. The variable X1 represents the gross national product, X2 represents personal income (in billions of dollars), and X3 represents the total number of employed people in the civilian labor force (in thousands). Consider these data for the years 1970–1974, given in Table 1.1.

TABLE 1.1 Consumption expenditures in terms of gross national product, personal income, and total number of employed people

The dependent variable may be represented by a matrix with five rows and one column. The independent variables could be represented by a matrix with five rows and three columns. Thus,

A matrix with m rows and n columns is an m × n matrix. Thus, the matrix Y in Example 1.1 is 5 × 1 and the matrix X is 5 × 3. A square matrix is one that has the same number of rows and columns. The individual numbers in a matrix are called the elements of the matrix.  

We now give an example of an application from probability theory that uses matrices.

 

Example 1.2 A “Musical Room” Problem

Another, somewhat different, example is the following. Consider a triangular-shaped building with four rooms: one at the center, room 0, and three rooms around it, numbered 1, 2, and 3 clockwise (Fig. 1.1).

There is a door from room 0 to rooms 1, 2, and 3 and doors connecting rooms 1 and 2, 2 and 3, and 3 and 1. There is a person in the building. The room that he/she is in is the state of the system. At fixed intervals of time, he/she rolls a die. If he/she is in room 0 and the outcome is 1 or 2, he/she goes to room 1. If the outcome is 3 or 4, he/she goes to room 2. If the outcome is 5 or 6, he/she goes to room 3. If the person is in room 1, 2, or 3 and the outcome is 1 or 2, he/she advances one room in the clockwise direction. If the outcome is 3 or 4, he/she advances one room in the counterclockwise direction. An outcome of 5 or 6 will cause the person to return to room 0. Assume the die is fair.

FIGURE 1.1 Building with four rooms.

Let pij be the probability that the person goes from room i to room j in a single step. The rules above determine the table of one-step transitions and, from it, the transition matrix whose entry in row i and column j is pij.
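The transition table and matrix displays are not reproduced above. Because the die is fair, each of the three possible destinations from a given room is reached with probability 1/3, so the matrix implied by the stated rules is the one sketched below (rows and columns ordered 0, 1, 2, 3; NumPy is used here only for illustration).

```python
import numpy as np

# One-step transition matrix implied by the rules of Example 1.2.
# Row i holds the probabilities of moving from room i to rooms 0, 1, 2, 3.
P = np.array([
    [0,   1/3, 1/3, 1/3],   # from room 0: to rooms 1, 2, 3, each with probability 1/3
    [1/3, 0,   1/3, 1/3],   # from room 1: back to 0, clockwise to 2, counterclockwise to 3
    [1/3, 1/3, 0,   1/3],   # from room 2: back to 0, counterclockwise to 1, clockwise to 3
    [1/3, 1/3, 1/3, 0  ],   # from room 3: back to 0, clockwise to 1, counterclockwise to 2
])

# Every row sums to one, as the rows of a transition matrix must.
assert np.allclose(P.sum(axis=1), 1.0)
```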

  

Matrices turn out to be handy for representing data. Equations involving matrices are often used to study the relationship between variables.

More explanation of how this is done will be offered in the sections of the book that follow.

The matrices to be studied in this book will have elements that are real numbers. This will suffice for the study of linear models and many other topics in statistics. We will not consider matrices whose elements are complex numbers or elements of an arbitrary ring or field.

We now consider some basic operations using matrices.

1.3 MATRIX NOTATION, ADDITION, AND MULTIPLICATION

We will show how to represent a matrix and how to add and multiply two matrices.

The elements of a matrix A are denoted by aij meaning the element in the ith row and the jth column. For example, for the matrix

 

Example 1.3 Illustration of Matrix Operations

Let .

Then

and

  

 

Example 1.4 Continuation of Example 1.2

If the probabilities that the person is initially in room 0, room 1, room 2, and room 3 are 1/2, 1/6, 1/12, and 1/4, respectively, then

Thus, given the initial probability vector above, the probabilities that the person ends up in room 0, room 1, room 2, or room 3 after one transition are 1/6, 5/18, 11/36, and 1/4, respectively. This example illustrates a discrete Markov chain: the possible transitions are represented as elements of a matrix.
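As a check on the figures just quoted, the computation can be carried out with the transition matrix reconstructed above from the stated rules (a sketch, not the book's own display):

```python
import numpy as np

# Reconstructed one-step transition matrix (rooms ordered 0, 1, 2, 3):
# 1/3 off the diagonal, 0 on the diagonal.
P = np.full((4, 4), 1/3) - np.eye(4) / 3

pi0 = np.array([1/2, 1/6, 1/12, 1/4])   # initial probabilities for rooms 0, 1, 2, 3
pi1 = pi0 @ P                            # distribution after one transition

print(pi1)                                         # approximately [0.1667 0.2778 0.3056 0.25]
print(np.allclose(pi1, [1/6, 5/18, 11/36, 1/4]))   # True
```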

Suppose we want to know the probabilities that the person goes from room i to room j after two transitions. Assuming that what happens at each transition is independent of the earlier transitions, we can multiply the transition matrix by itself; the entry in row i and column j of the product is the desired two-step probability.
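A sketch of that product, again using the reconstructed matrix rather than the book's display:

```python
import numpy as np

# Same reconstructed one-step matrix as in the sketches above.
P = np.full((4, 4), 1/3) - np.eye(4) / 3

P2 = P @ P   # entry (i, j): probability of going from room i to room j in two transitions
print(P2)    # for this P, every diagonal entry is 1/3 and every off-diagonal entry is 2/9
```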

Most, but not all, of the rules for addition and multiplication of real numbers hold true for matrices. The associative and commutative laws hold true for addition. The zero matrix is the matrix with all of the elements zero. An additive inverse of a matrix A would be −A, the matrix whose elements are (−1)aij. The distributive laws hold true.
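As a quick sketch of these rules (and of one that fails), using illustrative matrices chosen here rather than any from the text:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [5, -2]])
C = np.array([[2, 0], [1, 1]])

# Addition is commutative and associative; the distributive laws hold.
print(np.array_equal(A + B, B + A))                  # True
print(np.array_equal((A + B) + C, A + (B + C)))      # True
print(np.array_equal(A @ (B + C), A @ B + A @ C))    # True

# The additive inverse of A is -A, and A + (-A) is the zero matrix.
print(np.array_equal(A + (-A), np.zeros((2, 2))))    # True

# Matrix multiplication, however, is not commutative in general.
print(np.array_equal(A @ B, B @ A))                  # False
```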

The transpose of a matrix A is the matrix A′ where the rows and the columns of A are exchanged. For example, for the matrix A in Example 1.3,

 

Example 1.5 Two Nonzero Matrices Whose Product Is Zero

Consider the matrix

Notice that the product is the zero matrix even though neither factor is the zero matrix.
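The matrix of this example is not reproduced above; an illustrative choice (made here, not necessarily the book's) that exhibits the same phenomenon is a nonzero matrix whose product with itself is the zero matrix:

```python
import numpy as np

# An illustrative nonzero matrix (chosen here for the sketch).
N = np.array([[0, 1],
              [0, 0]])

print(N @ N)   # [[0, 0], [0, 0]]: a product of two nonzero matrices can be the zero matrix
```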

  

 

Example 1.6 The Cancellation Law for Real Numbers Does Not Hold for Matrices

Consider matrices A, B, C where

Now

but B ≠ C.  
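As with the previous example, the book's matrices are not shown above; here is an illustrative choice of A, B, and C with AB = AC but B ≠ C:

```python
import numpy as np

A = np.array([[1, 0],
              [0, 0]])
B = np.array([[1, 0],
              [0, 1]])
C = np.array([[1, 0],
              [0, 2]])

print(np.array_equal(A @ B, A @ C))   # True: AB = AC = [[1, 0], [0, 0]]
print(np.array_equal(B, C))           # False: B and C differ, so A cannot be "canceled"
```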

Matrix theory is basic to the study of linear models. Example 1.7 indicates how the basic matrix operations studied so far are used in this context.

 

Example 1.7 The Linear Model

Let Y be an n-dimensional vector of observations, an n × 1 matrix. Let X be an n × m matrix where each column has the values of a prediction variable. It is assumed here that there are m predictors. Let β be an m × 1 matrix of parameters to be estimated. The prediction of the observations will not be exact. Thus, we also need an n-dimensional column vector of errors ε. The general linear model will take the form

(1.1)  

(1.2)  

Equation (1.2) may be represented by the matrix equation

(1.3)  

In experimental design models, the matrix X frequently consists of zeros and ones indicating the levels of a factor. An example of such a model would be

(1.4)  

This is an unbalanced one-way analysis of variance (ANOVA) model where there are three treatments with four observations of treatment 1, three observations of treatment 2, and two observations of treatment 3. Different kinds of ANOVA models will be studied in Part IV.  
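Since the displays for Equations (1.1)–(1.4) are not reproduced above, here is a sketch of the standard forms the example describes. The matrix form corresponds to Equation (1.3); the particular design matrix and parameterization shown for the one-way ANOVA model of Equation (1.4) are illustrative choices made here, not necessarily the book's.

```latex
% Matrix form of the general linear model (cf. Equation (1.3))
\[
\mathbf{Y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\qquad
\mathbf{Y}: n \times 1, \quad X: n \times m, \quad
\boldsymbol{\beta}: m \times 1, \quad \boldsymbol{\varepsilon}: n \times 1 .
\]

% One possible design matrix for the unbalanced one-way ANOVA model of Equation (1.4),
% y_{ij} = \mu + \tau_i + \varepsilon_{ij}, with 4, 3, and 2 observations
% of treatments 1, 2, and 3, respectively.
\[
X =
\begin{pmatrix}
1 & 1 & 0 & 0\\
1 & 1 & 0 & 0\\
1 & 1 & 0 & 0\\
1 & 1 & 0 & 0\\
1 & 0 & 1 & 0\\
1 & 0 & 1 & 0\\
1 & 0 & 1 & 0\\
1 & 0 & 0 & 1\\
1 & 0 & 0 & 1
\end{pmatrix},
\qquad
\boldsymbol{\beta} = (\mu,\ \tau_1,\ \tau_2,\ \tau_3)'.
\]
```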

1.4 SUMMARY

We have accomplished the following. First, we have explained what matrices are and illustrated how they can be used to summarize data. Second, we have defined three basic matrix operations: addition, scalar multiplication, and matrix multiplication. Third, we have shown that matrices share some properties with real numbers but not others. Fourth, we have given some applications to probability and to linear models.

EXERCISES

SECTION 2

DETERMINANTS AND SOLVING A SYSTEM OF EQUATIONS

2.1 INTRODUCTION

This section will review informally how to find determinants of matrices and their use in solving systems of equations. We give the definition of determinants and show how to evaluate them by expanding along rows and columns. Some tricks for evaluating determinants are given that are based on elementary row and column operations on a matrix. We show how to solve systems of linear equations by Cramer’s rule and Gauss elimination.

2.2 DEFINITION OF AND FORMULAE FOR EXPANDING DETERMINANTS

Let A be an n × n matrix. Let Aij be the (n − 1) × (n − 1) submatrix formed by deleting the ith row and the jth column. Then formulae for expanding determinants are

(2.1a)  

and

(2.1b)  

These are the formulae used to compute determinants. The actual definition of a determinant is

(2.2)  

There are four other equalities that may be established. This is left to the reader. The above formulae indicate that a determinant may be expanded along any of the rows or columns.
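Formulae (2.1a), (2.1b), and (2.2) are not displayed above. In standard notation, with A_{ij} the submatrix defined in the text, the cofactor expansions and the permutation definition read as follows (a reconstruction of the usual formulas, not necessarily the book's exact typography):

```latex
% Expansion along row i (2.1a) and along column j (2.1b)
\[
\det A = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij},
\qquad
\det A = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij}.
\]

% Definition of the determinant as a signed sum over permutations (2.2),
% where the sum runs over all n! permutations sigma of {1, ..., n}.
\[
\det A = \sum_{\sigma} \operatorname{sgn}(\sigma)\,
a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}.
\]
```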

 

Example 2.1 Calculation of a Determinant

Expanding along the first row, we have

Also, for example, expanding along the second column,

  

The reader may, if he or she wishes, calculate the determinant expanding along the remaining rows and columns and verify that the answer is indeed the same.
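The matrix of Example 2.1 is not reproduced above; the sketch below uses an illustrative 3 × 3 matrix chosen here and checks numerically that expansion along any row gives the same value as a library determinant routine:

```python
import numpy as np

def cofactor_expansion(A, row):
    """Expand det(A) along the given row, in the spirit of formula (2.1a)."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(A, row, axis=0), j, axis=1)
        total += (-1) ** (row + j) * A[row, j] * cofactor_expansion(minor, 0)
    return total

# Illustrative matrix (not the one in Example 2.1).
A = np.array([[2.0, 1.0, 3.0],
              [0.0, 4.0, 1.0],
              [5.0, 2.0, 2.0]])

# The expansion gives the same value no matter which row is used,
# and it agrees (up to rounding) with numpy's determinant routine.
print([cofactor_expansion(A, r) for r in range(3)])   # [-43.0, -43.0, -43.0]
print(np.linalg.det(A))                               # approximately -43.0
```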

2.3 SOME COMPUTATIONAL TRICKS FOR THE EVALUATION OF DETERMINANTS

There are a few properties of determinants, easily verified and applied, that make their expansion easier. Instead of formal proofs, we give some examples that illustrate why these rules hold:

1. The determinant of a square matrix with two or more identical rows or columns is zero. Notice that, for example,
2. If a row or column is multiplied by a number, the value of the determinant is multiplied by that number. For example, expanding along the first column,
3. If a row or column is exchanged with another row or column, the absolute value of the determinant is the same, but the sign changes. For example, in a three-by-three determinant, exchange rows one and three. Now expand along row two and obtain

Observe that, for example,

Similarly, each of the other two-by-two determinants is the negative of the determinant of the matrix with the two rows exchanged so that

4. If a constant multiplied by a row or column of a determinant is added to another row or column, the value of the determinant is unchanged. For example,

after expanding the determinant along the first row, applying the distributive law, and rewriting the three-by-three determinants and observing that the second determinant of the right-hand side is zero by Rule 1.

The fourth property stated above is particularly useful for expanding a determinant. The objective is to add multiples of rows or columns in such a way as to get as many zeros as possible in a particular row or column. The determinant is then easily expanded along that row or column.
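A quick numerical check of Rules 1–4 on an illustrative matrix (chosen here, not one from the text) may help fix the ideas:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 4.0],
              [5.0, 6.0, 0.0]])
d = np.linalg.det(A)

# Rule 1: two identical rows give a zero determinant.
B = A.copy(); B[1] = B[0]
print(np.isclose(np.linalg.det(B), 0.0))       # True

# Rule 2: multiplying one row by c multiplies the determinant by c.
C = A.copy(); C[0] *= 7
print(np.isclose(np.linalg.det(C), 7 * d))     # True

# Rule 3: exchanging two rows changes the sign of the determinant.
D = A[[2, 1, 0], :]
print(np.isclose(np.linalg.det(D), -d))        # True

# Rule 4: adding a multiple of one row to another leaves the determinant unchanged.
E = A.copy(); E[2] += 4 * E[0]
print(np.isclose(np.linalg.det(E), d))         # True
```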

 

Example 2.2 Continuation of Example 2.1

Consider the determinant D in Example 2.1. When applying the rules above, a goal is to get some zeros in a row or column and then expand the determinant along that row or column. For the determinant D below, we subtract the first row from the second and third rows, obtaining two zeros in the first column. We then expand along that column to obtain

  

 

Example 2.3 Determinant of a Triangular Matrix

One possibility for the expansion of a determinant is to use the rules to put it in upper or lower triangular form. A matrix is in upper triangular form if all of the elements below the main diagonal are zero. Likewise a matrix is in lower triangular form if all of the elements above the main diagonal are zero. The resulting determinant is then the product of the elements in the main diagonal.
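A minimal numerical illustration of this fact, with an upper triangular matrix chosen here rather than the determinant considered next:

```python
import numpy as np

# An illustrative upper triangular matrix (not the matrix of Example 2.3).
U = np.array([[2.0, 1.0, 4.0],
              [0.0, 3.0, 5.0],
              [0.0, 0.0, 7.0]])

print(np.prod(np.diag(U)))    # 42.0, the product of the diagonal elements
print(np.linalg.det(U))       # 42.0 (up to rounding), agreeing with the stated rule
```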

Consider the determinant

The following steps are one way to reduce the matrix to upper triangular form:

1. Factor 3 from the third row and 2 from the fourth row.
2. Subtract 2/3 times the first row from the second row and 1/3 times the first row from the third row and the first row from the fourth row obtaining three zeros in the first column.
3. Add the second row to the third row and add 12 times the third row to the fourth row.
4. Multiply the second row by three and the fourth row by 2 and expand the upper triangular matrix.

Thus,

  

An important property of determinants is that the determinant of the product of two square matrices is the product of their determinants, that is, det(AB) = det(A)det(B).

A sketch of a proof of this important fact will be given in Subsection 4.5.

2.4 SOLUTION TO LINEAR EQUATIONS USING DETERMINANTS

Let A be a square n × n matrix with a nonzero determinant. A system of n linear equations in n unknowns may then be written as Ax = b, where x is the n × 1 column of unknowns and b is the n × 1 column of right-hand-side constants.

The matrix A is called the coefficient matrix. The matrix [A, b] is called the augmented matrix.

For example, the system

can be written

The system

may be written

There are several methods of solving these equations. These include Cramer's rule, Gauss elimination, and the use of the inverse of the matrix A. We will take up Cramer's rule first in this subsection. The next subsection will take up Gauss elimination, and the use of the inverse of a matrix will be discussed in Section 3.
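As a sketch of these approaches on a small illustrative system (Cramer's rule written out directly; the inverse and an elimination-based library solver for comparison):

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b by Cramer's rule (A square with nonzero determinant)."""
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b                  # replace the ith column of A by b
        x[i] = np.linalg.det(Ai) / d
    return x

# Illustrative system (not one of the book's examples): 2x1 + x2 = 5, x1 + 3x2 = 10.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

print(cramer(A, b))               # [1., 3.]
print(np.linalg.inv(A) @ b)       # same answer via the matrix inverse
print(np.linalg.solve(A, b))      # same answer via an elimination-based solver
```

For large systems, Cramer's rule is mainly of theoretical interest, since it requires many determinant evaluations; elimination-based methods are far more efficient in practice.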

Consider two equations in two unknowns, say

Following a procedure that is generally taught in high school, we eliminate the x2 variable by multiplying the first equation by a22 and the second equation by − a12.

We then add the results to obtain

so that

Similarly, the variable x1 could be eliminated, and we would obtain

This is Cramer’s rule for two equations in two unknowns. It is an easy formula to remember because the matrix in the denominator is the coefficients of x1 and x2 in the two equations and the matrix in the numerator is the same with the numbers on the right-hand side of the equation replacing the first column for x1 and the second column for x2. This idea generalizes to n equations in n unknowns. We now give a more formal derivation of Cramer’s rule.