This 1971 classic on linear models is once again available as a Wiley Classics Library Edition. It features material accessible to any statistician who understands matrix algebra and basic statistical methods.
Contents
Cover
Half Title page
Title page
Copyright page
Preface
Chapter 1: Generalized Inverse Matrices
1. Introduction
2. Solving Linear Equations
3. The Penrose Inverse
4. Other Definitions
5. Symmetric Matrices
6. Arbitrariness in a Generalized Inverse
7. Other Results
8. Exercises
Chapter 2: Distributions and Quadratic Forms
1. Introduction
2. Symmetric Matrices
3. Positive Definiteness
4. Distributions
5. Distribution of Quadratic Forms
6. Bilinear Forms
7. The Singular Normal Distribution
8. Exercises
Chapter 3: Regression, or the Full Rank Model
1. Introduction
2. Deviations from Means
3. Four Methods of Estimation
4. Consequences of Estimation
5. Distributional Properties
6. The General Linear Hypothesis
7. Related Topics
8. Summary of Regression Calculations
9. Exercises
Chapter 4: Introducing Linear Models: Regression on Dummy Variables
1. Regression on Allocated Codes
2. Regression on Dummy (0, 1) Variables
3. Describing Linear Models
4. The Normal Equations
5. Exercises
Chapter 5: Models Not of Full Rank
1. The Normal Equations
2. Consequences of a Solution
3. Distributional Properties
4. Estimable Functions
5. The General Linear Hypothesis
6. Restricted Models
7. The “Usual Constraints”
8. Generalizations
9. Summary
10. Exercises
Chapter 6: Two Elementary Models
1. Summary of General Results
2. The 1-Way Classification
3. Reductions in Sums of Squares
4. The 2-Way Nested Classification
5. Normal Equations for Design Models
6. Exercises
Chapter 7: The 2-Way Crossed Classification
1. The 2-Way Classification without Interaction
2. The 2-Way Classification with Interaction
3. Interpretation of Hypotheses
4. Connectedness
5. μij-Models
6. Exercises
Chapter 8: Some Other Analyses
1. Large-Scale Survey-Type Data
2. Covariance
3. Data Having All Cells Filled
4. Exercises
Chapter 9: Introduction to Variance Components
1. Fixed and Random Models
2. Mixed Models
3. Fixed or Random?
4. Finite Populations
5. Introduction to Estimation
6. Rules for Balanced Data
7. The 2-Way Classification
8. Estimating Variance Components from Balanced Data
9. Normality Assumptions
10. Exercises
Chapter 10: Methods of Estimating Variance Components from Unbalanced Data
1. Expectations of Quadratic Forms
2. Analysis of Variance Method (Henderson’s Method 1)
3. Adjusting for Bias in Mixed Models
4. Fitting Constants Method (Henderson’s Method 3)
5. Analysis of Means Methods
6. Symmetric Sums Methods
7. Infinitely Many Quadratics
8. Maximum Likelihood for Mixed Models
9. Mixed Models Having One Random Factor
10. Best Quadratic Unbiased Estimation
11. Exercises
Chapter 11: Variance Component Estimation from Unbalanced Data: Formulae
1. The 1-Way Classification
2. The 2-Way Nested Classification
3. The 3-Way Nested Classification
4. The 2-Way Classification with Interaction, Random Model
5. The 2-Way Classification with Interaction, Mixed Model
6. The 2-Way Classification without Interaction, Random Model
7. Mixed Models with One Random Factor
8. The 2-Way Classification without Interaction, Mixed Model
9. The 3-Way Classification, Random Model
Literature Cited
Statistical Tables
Index
Linear Models
Copyright © 1971 by John Wiley & Sons, Inc.
Wiley Classics Library Edition Published 1997
All rights reserved. Published simultaneously in Canada.
Reproduction or translation of any part of this work beyond that permitted by Sections 107 or 108 of the 1976 United States Copyright Act without the permission of the copyright owner is unlawful. Requests for permission or further information should be addressed to the Permissions Department, John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158-0012.
Library of Congress Catalog Card Number: 70-138919
ISBN 0-471-18499-3
Preface
This book describes general procedures of estimation and hypothesis testing for linear statistical models and shows their application for unbalanced data (i.e., unequal-subclass-numbers data) to certain specific models that often arise in research and survey work. In addition, three chapters are devoted to methods and results for estimating variance components, particularly from unbalanced data. Balanced data of the kind usually arising from designed experiments are treated very briefly, as just special cases of unbalanced data. Emphasis on unbalanced data is the backbone of the book, designed to assist those whose data cannot satisfy the strictures of carefully managed and well-designed experiments.
The title may suggest that this is an all-embracing treatment of linear models. This is not the case, for there is no detailed discussion of designed experiments. Moreover, the title is not An Introduction to …, because the book provides more than an introduction; nor is it … with Applications, because, although concerned with applications of general linear model theory to specific models, few applications in the form of real-life data are used. Similarly, … for Unbalanced Data has also been excluded from the title because the book is not devoted exclusively to such data. Consequently the title Linear Models remains, and I believe it has brevity to recommend it.
My main objective is to describe linear model techniques for analyzing unbalanced data. In this sense the book is self-contained, based on prerequisites of a semester of matrix algebra and a year of statistical methods. The matrix algebra required is supplemented in Chapter 1, which deals with generalized inverse matrices and allied topics. The reader who wishes to pursue the mathematics in detail throughout the book should also have some knowledge of statistical theory. The requirements in this regard are supplemented by a summary review of distributions in Chapter 2, extending to sections on the distribution of quadratic and bilinear forms and the singular multinormal distribution. There is no attempt to make this introductory material complete. It serves to provide the reader with foundations for developing results for the general linear model, and much of the detail of this and other chapters can be omitted by the reader whose training in mathematical statistics is sparse. However, he must know Theorems 1 through 3 of Chapter 2, for they are used extensively in succeeding chapters.
Chapter 3 deals with full-rank models. It begins with a simple explanation of regression (based on an example) and proceeds to multiple regression, giving a unified treatment for testing a general linear hypothesis. After dealing with various aspects of this hypothesis and special cases of it, the chapter ends with sections on reduced models and other related topics. Chapter 4 introduces models not of full rank by discussing regression on dummy (0, 1) variables and showing its equivalence to linear models. The results are well known to most statisticians, but not to many users of regression, especially those who are familiar with regression more in the form of computer output than as a statistical procedure. The chapter ends with a numerical example illustrating both the possibility of having many solutions to normal equations and the idea of estimable and non-estimable functions.
Chapter 5 deals with the non-full-rank model, utilizing generalized inverse matrices and giving a unified procedure for testing any testable linear hypothesis. Chapters 6 through 8 deal with specific cases of this model, giving many details for the analysis of unbalanced data. Within these chapters there is detailed discussion of certain topics that other books tend to ignore: restrictions on models and constraints on solutions (Sections 5.6 and 5.7); singular covariance matrices of the error terms (Section 5.8); orthogonal contrasts with unbalanced data (Section 5.5g); the hypotheses tested by F-statistics in the analysis of variance of unbalanced data (Sections 6.4f, 7.1g, and 7.2f); analysis of covariance for unbalanced data (Section 8.2); and approximate analyses for data that are only slightly unbalanced (Section 8.3). On these and other topics, I have tried to coordinate some ideas and make them readily accessible to students, rather than continuing to leave the literature relatively devoid of these topics or, at best, containing only scattered references to them. Statisticians concerned with analyzing unbalanced data on the basis of linear models have talked about the difficulties involved for many years but, probably because the problems are not easily resolved, little has been put in print about them. The time has arrived, I feel, for trying to fill this void. Readers may not always agree with what is said; indeed, I may want to alter some things myself in due time. Meanwhile, if this book sets readers to thinking and writing further about these matters, I will feel justified. For example, there may be criticism of the discussion of F-statistics in parts of Chapters 6 through 8, where these statistics are used, not so much to test hypotheses of interest (as described in Chapter 5), but to specify what hypotheses are being tested by those F-statistics available in analysis of variance tables for unbalanced data. I believe it is important to understand what these hypotheses are, because they are not obvious analogs of the corresponding balanced data hypotheses and, in many cases, are relatively useless.
The many numerical illustrations and exercises in Chapters 3 through 8 use hypothetical data, designed with easy arithmetic in mind. This is because I agree with C. C. Li (1964) who points out that we do not learn to solve quadratic equations by working with something like
Chapters 9 through 11 deal with variance components. The first part of Chapter 9 describes random models, distinguishing them from fixed models by a series of examples and using the concepts, rather than the details, of the examples to make the distinction. The second part of the chapter is the only occasion where balanced data are discussed in depth: not for specific models (designs) but in terms of procedures applicable to balanced data generally. Chapter 10 presents methods currently available for estimating variance components from unbalanced data, their properties, procedures, and difficulties. Parts of these two chapters draw heavily on Searle (1971). Finally, Chapter 11 catalogs results derived by applying to specific models some of the methods described in Chapter 10, gathering together the cumbersome algebraic expressions for variance component estimators and their variances in the 1-way, 2-way nested, and 2-way crossed classifications (random and mixed models), and others. Currently these results are scattered throughout the literature. The algebraic expressions are themselves so lengthy that there would be little advantage in giving numerical illustrations. Instead, extra space has been taken to typeset the algebraic expressions in as readable a manner as possible.
All chapters except the last have exercises, most of which are designed to encourage the student to reread the text and to practice and become thoroughly familiar with the techniques described. Statisticians, in their consulting capacity, are much like lawyers. They do not need to remember every technique exactly, but must know where to locate it when needed and be able to understand it once found. This is particularly so with the techniques of unbalanced data analysis, and so the exercises are directed towards impressing on the reader the methods and logic of establishing the techniques rather than the details of the results themselves. These can always be found when needed.
No computer programs are given. This would be an enormous task, with no certainty that such programs would be optimal when written and even less chance by the time they were published. While the need for good programs is obvious, I think that a statistics book is not the place yet for such programs. Computer programs printed in books take on the aura of quality and authority, which, even if valid initially, soon becomes outmoded in today’s fast-moving computer world.
The chapters are long, but self-contained and liberally sign-posted with sections, subsections, and sub-subsections—all with titles (see Contents).
My sincere thanks go to many people for helping with the book: the Institute of Statistics at Texas A. and M. University which provided me with facilities during a sabbatical leave (1968–1969) to do most of the initial writing; R. G. Cornell, N. R. Draper, and J. S. Hunter, the reviewers of the first draft who made many helpful suggestions; and my colleagues at Cornell who encouraged me to keep going. I also thank D. F. Cox, C. H. Goldsmith, A. Hedayat, R. R. Hocking, J. W. Rudan, D. L. Solomon, N. S. Urquhart, and D. L. Weeks for reading parts of the manuscript and suggesting valuable improvements. To John W. Rudan goes particular gratitude for generous help with proof reading. Grateful thanks also go to secretarial help at both Texas A. and M. and Cornell Universities, who eased the burden enormously.
S. R. SEARLE
Ithaca, New York
October, 1970
The application of generalized inverse matrices to linear statistical models is of relatively recent occurrence. As a mathematical tool such matrices aid in understanding certain aspects of the analysis procedures associated with linear models, especially the analysis of unbalanced data, a topic to which considerable attention is given in this book. An appropriate starting point is therefore a summary of the features of generalized inverse matrices that are important to linear models. Other ancillary results in matrix algebra are also discussed.
a. Definition and existence
A generalized inverse of a matrix A is defined, in this book, as any matrix G that satisfies the equation

AGA = A.     (1)
The name “generalized inverse” for matrices G defined by (1) is unfortunately not universally accepted, although it is used quite widely. Names such as “conditional inverse”, “pseudo inverse” and “g-inverse” are also to be found in the literature, sometimes for matrices defined as is G of (1) and sometimes for matrices defined as variants of G. However, throughout this book the name “generalized inverse” of A is used exclusively for any matrix G satisfying (1).
Notice that (1) does not define G as “the” generalized inverse of A but as “a” generalized inverse. This is because G, for a given matrix A, is not unique. As shown below, there is an infinite number of matrices G that satisfy (1) and so we refer to the whole class of them as generalized inverses of A.
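As a small numerical illustration (not from the text), the following Python sketch exhibits two different matrices G satisfying (1) for the same singular A; one is the Moore-Penrose inverse computed by NumPy, the other comes from inverting a non-singular 1 × 1 minor. The matrix A and both G's are made-up examples.

```python
import numpy as np

# A singular matrix (rank 1), so it has no regular inverse.
A = np.array([[1., 2.],
              [2., 4.]])

# Two different generalized inverses of A.
G1 = np.linalg.pinv(A)            # the Moore-Penrose inverse is one of them
G2 = np.array([[1., 0.],          # another: invert the non-singular 1x1 minor (1)
               [0., 0.]])         # and set every other element to zero

for G in (G1, G2):
    print(np.allclose(A @ G @ A, A))   # True for both: AGA = A
```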
One way of illustrating the existence of G and its non-uniqueness starts with the equivalent diagonal form of A. If A has order p × q the reduction to this diagonal form can be written as

PAQ = Δ = [Dr, 0; 0, 0],

or, more simply, as

PAQ = Δ.

As usual, P and Q are products of elementary operators [see, for example, Searle (1966), Sec. 5.7], r is the rank of A and Dr is a diagonal matrix of order r. In general, if d1, d2, …, dr are the diagonal elements of any diagonal matrix D we will use the notation D{di}; i.e.,

D{di} = diag(d1, d2, …, dr).     (2)
Furthermore, as in Δ, null matrices will be represented by the symbol 0, with order being determined by context on each occasion.
Derivation of G comes easily from Δ. Analogous to Δ we define Δ− (to be read as “Δ minus”) as

Δ− = [Dr−1, 0; 0, 0].

Then, as shown below, a generalized inverse of A is

G = QΔ−P.     (3)
Before showing that G does satisfy (1), note from the definitions of Δ and Δ− given above that
ΔΔ−Δ = Δ.     (4)
Hence, by the definition implied in (1), we can say that Δ− is a generalized inverse of Δ, an unimportant result in itself but one which leads to G satisfying (1). To show this we use Δ to write
A = P−1ΔQ−1,     (5)
the inverses P−1 and Q−1 existing because P and Q are products of elementary operators and hence non-singular. Then (3), (4) and (5) give

AGA = (P−1ΔQ−1)(QΔ−P)(P−1ΔQ−1) = P−1ΔΔ−ΔQ−1 = P−1ΔQ−1 = A;

i.e., (1) is satisfied. Hence G is a generalized inverse of A.
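As an illustration of the construction G = QΔ−P (and not a method used in the text), the following sketch obtains a diagonal reduction of A from its singular value decomposition rather than from products of elementary operators, forms Δ−, and checks that (1) holds; the matrix A is an arbitrary rank-2 example.

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.]])               # rank 2

# A diagonal reduction PAQ = Δ obtained from the SVD A = U diag(s) V':
# taking P = U' and Q = V gives PAQ = diag(s).
U, s, Vt = np.linalg.svd(A)
P, Q = U.T, Vt.T

r = int(np.sum(s > 1e-10))                 # rank of A
Delta_minus = np.zeros((A.shape[1], A.shape[0]))
Delta_minus[:r, :r] = np.diag(1.0 / s[:r]) # Δ− = [Dr^{-1}, 0; 0, 0], of order q x p

G = Q @ Delta_minus @ P                    # G = QΔ−P, as in (3)
print(np.allclose(A @ G @ A, A))           # True: G is a generalized inverse of A
```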
Example. For
a diagonal form is obtained using
so that
Hence
It is to be emphasized that generalized inverses exist for rectangular matrices as well as for square ones. This is evident from the formulation of Δ, which has order p × q. However, for A of order p × q, we define Δ− as having order q × p, the null matrices therein being of appropriate order to make this so. As a result G has order q × p.
Example. Consider
the same as A in the previous example except for an additional column. With P as given earlier and Q now taken as
Δ− is then taken as
b. An algorithm
Another way of computing G is based on knowing the rank of A. Suppose it is r and that A can be partitioned in such a way that its leading r × r minor is non-singular, i.e.,

A = [A11, A12; A21, A22],

where A11 is r × r of rank r. Then a generalized inverse of A is

G = [A11−1, 0; 0, 0],

where the null matrices are of appropriate order to make G be q × p. To see that G is a generalized inverse of A, note that

AGA = [A11, A12; A21, A21A11−1A12] = A,

the last equality holding because A has rank r, so that A22 = A21A11−1A12.
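A minimal numerical sketch of this construction (with made-up numbers, not the book's example): A is built so that its leading 2 × 2 minor is non-singular and its rank is 2, and G = [A11−1, 0; 0, 0] is checked against (1).

```python
import numpy as np

# Construct a 3x4 matrix of rank 2 whose leading 2x2 minor A11 is non-singular.
A11 = np.array([[1., 2.],
                [3., 5.]])
A12 = np.array([[1., 0.],
                [2., 1.]])
A21 = np.array([[4., 7.]])
A22 = A21 @ np.linalg.inv(A11) @ A12       # forces A22 = A21 A11^{-1} A12, so rank(A) = 2
A = np.block([[A11, A12],
              [A21, A22]])

G = np.zeros((A.shape[1], A.shape[0]))     # G of order q x p = 4 x 3
G[:2, :2] = np.linalg.inv(A11)             # G = [A11^{-1}, 0; 0, 0]

print(np.linalg.matrix_rank(A))            # 2
print(np.allclose(A @ G @ A, A))           # True
```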
Example. A generalized inverse of
There is no need for the non-singular minor of order r to be in the leading position. Suppose it is not. Let R and S represent the elementary row and column operations respectively to bring it to the leading position. Then R and S are products of elementary operators with

RAS = B = [B11, B12; B21, B22],

where B11 is non-singular of order r. Then

A = R−1BS−1

and

G = S[B11−1, 0; 0, 0]R.     (6)
Clearly, so far as B11 is concerned, this product represents the operations of returning the elements of B11 to their original positions in A. Now consider G: we have

AGA = R−1BS−1S[B11−1, 0; 0, 0]RR−1BS−1 = R−1B[B11−1, 0; 0, 0]BS−1 = R−1BS−1 = A,

so that G is a generalized inverse of A. In effect, the procedure amounts to the following algorithm for finding a generalized inverse of any matrix A of rank r.

(i) In A find any non-singular minor of order r; call it M.
(ii) Invert M and transpose the inverse: (M−1)′.
(iii) In A replace each element of M by the corresponding element of (M−1)′.
(iv) Replace all other elements of A by zero.
(v) Transpose the resulting matrix.
(vi) The result, G, is a generalized inverse of A.
Note that this procedure is not equivalent, in (iii), to replacing elements of M in A by the elements of M−1 (and others by zero) and then in (v) transposing. It is if M is symmetric. Nor is it equivalent to replacing, in (iii), elements of M in A by elements of M−1 (and others by zero) and then in (v) not transposing (see Exercise 5). In general, the algorithm must be carried out exactly as described.
One case where it can be simplified is when A is symmetric. Then any principal minor of A is symmetric and the transposing in both (iii) and (v) can be ignored. The algorithm can then become as follows.

(i) In the symmetric A find any non-singular principal minor of order r; call it M.
(ii) Invert M.
(iii) In A replace each element of M by the corresponding element of M−1.
(iv) Replace all other elements of A by zero.
(v) The result is a generalized inverse of A.
However, when A is symmetric and a non-symmetric non-principal minor is used for M, then the general algorithm must be used.
Example. The matrix
has the following matrices, among others, as generalized inverses:
derived from inverting the 2 × 2 minors
Similarly,
as a generalized inverse.
Since the use of generalized inverse matrices in solving linear equations is the application of prime interest so far as linear models are concerned, the procedures involved are now outlined. Following this, some general properties of generalized inverses are discussed.
a. Consistent equations
A convenient starting point from which to develop the solution of linear equations using a generalized inverse is the definition of consistent equations: the equations Ax = y are said to be consistent if any linear relationships existing among the rows of A also exist among the corresponding elements of y.
As a simple example, the equations
are consistent: in the matrix on the left the second row is thrice the first, and this is also true of the elements on the right. But the equations
are not consistent. Further evidence of this is seen by writing them in full:
The importance of the concept of consistency lies in the following theorem: linear equations can be solved if, and only if, they are consistent. Proof can be established from the above definition of consistent equations [see, for example, Searle (1966), Sec. 6.2, or Searle and Hausman (1970), Sec. 7.2]. Since it is only consistent equations that can be solved, discussion of a procedure for solving linear equations is hereafter confined to equations that are consistent. The procedure is described in a series of theorems.
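Equivalently, Ax = y is consistent exactly when appending y to A as an extra column does not increase the rank. The following sketch checks this numerically; the matrices are made up in the spirit of the example above and are not the book's own.

```python
import numpy as np

def is_consistent(A, y):
    """Ax = y is solvable iff appending y to A does not increase the rank."""
    return np.linalg.matrix_rank(np.column_stack([A, y])) == np.linalg.matrix_rank(A)

A = np.array([[1., 2.],
              [3., 6.]])                       # second row is thrice the first
print(is_consistent(A, np.array([7., 21.])))   # True: right-hand sides keep that relationship
print(is_consistent(A, np.array([7., 20.])))   # False: the equations are inconsistent
```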
b. Obtaining solutions
For consistent equations Ax = y, with G a generalized inverse of A, solutions are given by

x̃ = Gy + (GA − I)z,     (7)

where z is any arbitrary vector of order q.
(8)
so defining A, x and y. It will be found that
(9)
(10)
(11)
It will be found that both x̃1 and x̃2 satisfy (8). That (9) does satisfy (8) for all values of z3 and z4 can be seen by substitution. For example, the left-hand side of the first equation is then
as it should be.
The G used earlier is not the only generalized inverse of the matrix A in (8). Another is
for which (7) becomes
(12)
for arbitrary values z1 and z4. This too, it will be found, satisfies (8).
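A numerical sketch of this solution-generating process (the numbers are made up, and the general-solution form x̃ = Gy + (GA − I)z of (7) above is assumed): every choice of z yields a vector satisfying the equations.

```python
import numpy as np

# Consistent equations Ax = y (made-up numbers, not the book's example).
A = np.array([[1., 1., 0.],
              [1., 0., 1.],
              [2., 1., 1.]])              # rank 2: third row = first row + second row
y = np.array([3., 4., 7.])                # consistent with that relationship

G = np.linalg.pinv(A)                     # one particular generalized inverse of A
I = np.eye(3)

rng = np.random.default_rng(0)
for _ in range(3):
    z = rng.standard_normal(3)
    x = G @ y + (G @ A - I) @ z           # x_tilde = Gy + (GA - I)z
    print(np.allclose(A @ x, y))          # True for every z
```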
c. Properties of solutions
The relationship between solutions using G and those using is that, on putting
reduces to .
A stronger result, which concerns generation of all solutions from , is contained in the following theorem.
Having established a method for solving linear equations and shown that they can have an infinite number of solutions, we ask two questions: What relationships exist among the solutions and to what extent are the solutions linearly independent (LIN)? Since each solution is a vector of order q there can, of course, be no more than q LIN solutions. In fact there are fewer, as Theorem 4 shows. But first, a lemma.
(13)
Then
(14)
Proof. Because
(15)
Notice that Theorem 5 is in terms of any s solutions. Hence for any number of solutions, whether LIN or not, any linear combination of them is itself a solution provided the coefficients in that combination sum to unity.
and it can be seen that
the coefficients on the right-hand side, 2, 1 and −2, summing to unity in accord with Theorem 5.
A final theorem relates to an invariance property of the elements of a solution. It is important in the study of linear models because of its relationship with what is known as estimability, discussed in Chapter 5. Without worrying about details of estimability here, we give the theorem and refer to it later as needed. The theorem is due to Rao (1962) and it concerns linear combinations of the elements of a solution vector: certain combinations are invariant to whatever solution is used.
Proof. For a solution given by Theorem 2
Example. In deriving (9),
and for
(16)
and
and in general, from (9),
So too does k′ have the same value for, from (12),
is different from (16); and
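The invariance property can also be seen numerically. In the sketch below (made-up numbers, continuing the assumptions of the previous sketch), a vector k′ lying in the row space of A gives the same value of k′x̃ for every solution, while a k′ outside that row space does not.

```python
import numpy as np

A = np.array([[1., 1., 0.],
              [1., 0., 1.],
              [2., 1., 1.]])              # rank 2
y = np.array([3., 4., 7.])                # consistent
G = np.linalg.pinv(A)
I = np.eye(3)

k_inv = A[0] + 2. * A[1]                  # k' in the row space of A: k'x is invariant
k_var = np.array([1., 0., 0.])            # k' not in the row space: k'x varies

for z in (np.zeros(3), np.ones(3), np.arange(3.)):
    x = G @ y + (G @ A - I) @ z           # a solution for each z
    print(k_inv @ x, k_var @ x)           # first number is the same every time; second is not
```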
The concept of a generalized inverse has now been defined and its use in solving linear equations explained. We next briefly discuss the generalized inverse itself, its various definitions and some of its properties. Extensive review of generalized inverses and their applications is to be found in Boullion and Odell (1968) and the approximately 350 references listed there.
Penrose (1955), in extending the work of Moore (1920), shows that for any matrix A there is a unique matrix K which satisfies the following four conditions:
(i) AKA = A,   (ii) KAK = K,   (iii) (AK)′ = AK,   (iv) (KA)′ = KA.     (17)
We refer to these as Penrose’s conditions and to K as the (unique) Penrose inverse; more correctly it is the Moore-Penrose inverse. Penrose’s proof of the existence of K satisfying these conditions is lengthy but instructive. It rests upon two lemmas relating to matrices having real (but not complex) numbers as elements, lemmas that are used repeatedly in what follows.
(18)
Similarly, (ii) and (iv) are true if and only if
(19)
Hence any K satisfying (18) and (19) also satisfies the Penrose conditions.
Before showing how K can be derived we show that it is unique. For if it is not, assume that some other matrix M satisfies the Penrose conditions. Then from conditions (i) and (iv) in terms of M we would have
(20)
and (ii) and (iii) would lead to
(21)
Therefore, on substituting (20) into (19) and using (19) again we have
and on substituting (21) into this and using (18) and (21) we get
Therefore K satisfying (18) and (19) is unique and satisfies the Penrose conditions ; we derive its form by assuming that
(22)
for some matrix T. Then (18) is satisfied if
(23)
Therefore
which is
There remains the derivation of a suitable T. This is done as follows. Consider A′A: it is square and so are its powers. And for some integer t there will be, as a consequence of the Cayley-Hamilton theorem [see, e.g., Searle (1966), Sec. 7.5e], a series of scalars λ1, λ2, …, λt, not all zero, such that
If λr is the first λ in this identity that is non-zero, then T is defined as
(24)
To show that this satisfies (23) note that, by direct multiplication,
Since, by definition, λr is the first non-zero λ in the series λ1, λ2,…, the above reduces to
(25)
Example. For
Then, by the Cayley-Hamilton theorem,
and so T is taken as
and
is the Penrose inverse of A satisfying (17).
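As a quick numerical confirmation (not the Cayley-Hamilton construction used in the text), NumPy's pinv returns the Moore-Penrose inverse of a matrix, and the four conditions of (17) can be checked directly; the matrix A below is a made-up rank-2 example.

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.]])              # a rank-2 example
K = np.linalg.pinv(A)                     # Moore-Penrose inverse

print(np.allclose(A @ K @ A, A))          # (i)   AKA = A
print(np.allclose(K @ A @ K, K))          # (ii)  KAK = K
print(np.allclose((A @ K).T, A @ K))      # (iii) AK is symmetric
print(np.allclose((K @ A).T, K @ A))      # (iv)  KA is symmetric
```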
An alternative procedure for deriving K has been suggested by Graybill et al. (1966). Their method is to find X and Y such that
(26)
and then
(27)
TABLE 1.1. SUGGESTED NAMES FOR MATRICES SATISFYING SOME OR ALL OF THE PENROSE CONDITIONS

Conditions Satisfied (Eq. 17)    Name of Matrix                    Symbol
i                                Generalized inverse               A^(g)
i and ii                         Reflexive generalized inverse     A^(r)
i, ii and iii                    Normalized generalized inverse    A^(n)
i, ii, iii and iv                Penrose inverse                   A^(p)
Using the symbols of Table 1.1 it can be seen that
(28)
That these expressions satisfy the appropriate conditions can be proved by repeated use of Lemma 3 of Sec. 3 or by Theorem 7 of Sec. 5, which follows.
The study of linear models frequently leads to equations of the form X′Xb = X′y that have to be solved for b. Special attention is therefore given to properties of a generalized inverse of the symmetric matrix X′X.
a. Properties of a generalized inverse
Four useful properties of a generalized inverse of X′X are contained in the following theorem.
Theorem 7. When G is a generalized inverse of X′X, then:
(i) G′ is also a generalized inverse of X′X;
(ii) XGX′X = X, i.e., GX′ is a generalized inverse of X;
(iii) XGX′ is invariant to G;
(iv) XGX′ is symmetric, whether G is or not.
Proof. By definition, G satisfies
X′XGX′X = X′X.     (29)
Corollary. Applying part (i) of the theorem to its other parts shows that
It is to be emphasized that not all generalized inverses of a symmetric matrix are symmetric. This is evident from the general algorithm given at the end of Sec. 1. For example, applying that algorithm to
as a non-symmetric generalized inverse of the symmetric matrix A2. However, Theorem 7 and its corollary very largely enable us to avoid difficulties that this lack of symmetry of generalized inverses of X′X might otherwise appear to involve. For example, if G is a generalized inverse of X′X and P is some other matrix,
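The invariance and symmetry of XGX′ asserted in Theorem 7 are easy to check numerically. In the sketch below (a made-up design matrix, not an example from the text), two different generalized inverses of X′X give exactly the same, symmetric, product XGX′.

```python
import numpy as np

# X: design matrix of a tiny 1-way classification (intercept + two group dummies),
# so X'X is singular (made-up numbers, not the book's example).
X = np.array([[1., 1., 0.],
              [1., 1., 0.],
              [1., 1., 0.],
              [1., 0., 1.],
              [1., 0., 1.]])
XtX = X.T @ X                              # order 3, rank 2

# Two different generalized inverses of X'X.
G1 = np.linalg.pinv(XtX)
G2 = np.zeros((3, 3))
G2[:2, :2] = np.linalg.inv(XtX[:2, :2])    # invert the non-singular leading 2x2 minor

for G in (G1, G2):
    print(np.allclose(XtX @ G @ XtX, XtX)) # both satisfy the defining equation (29)

P1, P2 = X @ G1 @ X.T, X @ G2 @ X.T
print(np.allclose(P1, P2))                 # XGX' is invariant to the choice of G
print(np.allclose(P1, P1.T))               # and it is symmetric
```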
b. Two methods of derivation
In addition to the methods given in Sec. 1, two methods discussed by John (1964) are sometimes pertinent to linear models. They depend on the regular inverse of a non-singular matrix:
(30)
H used here, in keeping with John’s notation, is not the matrix GA used earlier. Where X′X has order p and rank p − m(m > 0), the matrix H is any matrix of order m × p that is of full row rank with its rows also LIN of those of X′X. [The existence of such a matrix is assured by considering m vectors of order p that are LIN of any set of p − m LIN rows of X′X. Furthermore, if these rows constitute H in such a way that the m LIN rows of H correspond in S to the m rows of X′X that are linear combinations of that set of p − m rows, then S−1 of (30) exists.] With (30) existing the two matrices
(31)
Three useful lemmas help in proving these results.
Lemma 6. For X and D of Lemma 5 and H of order m × p with full row rank, HD has full rank if and only if the rows of H are LIN of those of X.
(ii) Given that the rows of H are LIN of those of X, the matrix obtained by writing X above H, of order (N + m) × p, has full column rank. Therefore it has a left inverse, [U V] say [Searle (1966), Sec. 5.13], and so
(32)
(33)
Pre-multiplying (32) by D′ and using Lemmas 5 and 6 leads to
(34)
Then, from (32) and (34)
(35)
and post-multiplying this by X′X shows, from Lemma 5, that B11 is a generalized inverse of X′X. Furthermore, making use of (33), (35) and Lemmas 5 and 6 gives
Pre- and post-multiplying (X′X + H′H)−1 obtained from this by X′X then shows that (X′X + H′H)−1 is a generalized inverse of X′X.
It can also be shown that B11 satisfies the second of Penrose’s conditions, (ii) in (17), but (X′X + H′H)−1 does not; and neither generalized inverse in (31) satisfies condition (iii) or (iv).
(36)
a simplified form of Rayner and Pringle’s equation (7). The relationship between the two generalized inverses of X′X shown in (31) is therefore that indicated in (36). Note also that Lemma 6 is equivalent to Theorem 3 of Scheffe (1959, p. 17).
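A numerical sketch of the second of these derivations (made-up numbers; the X and H below are not from the text): with X′X of order p = 3 and rank 2, a 1 × 3 matrix H of full row rank whose row is LIN of the rows of X′X makes X′X + H′H non-singular, and the regular inverse of that sum is a generalized inverse of X′X.

```python
import numpy as np

X = np.array([[1., 1., 0.],
              [1., 1., 0.],
              [1., 1., 0.],
              [1., 0., 1.],
              [1., 0., 1.]])
XtX = X.T @ X                              # order p = 3, rank p - m = 2, so m = 1

H = np.array([[0., 1., 1.]])               # m x p, full row rank, row LIN of the rows of X'X

G = np.linalg.inv(XtX + H.T @ H)           # X'X + H'H is non-singular
print(np.allclose(XtX @ G @ XtX, XtX))     # True: G is a generalized inverse of X'X
```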
Lemma 7. A matrix of full row rank r can be written as a product of matrices, one being of the form [I S] for some matrix S, of r rows.
Lemma 8. I + KK′ has full rank for any non-null matrix K.
Proof. Assume that I + KK′ does not have full rank. Then its columns are not LIN and there exists a non-null vector u such that (I + KK′)u = 0. But then u′(I + KK′)u = u′u + (K′u)′(K′u) = 0, a sum of two non-negative quantities, so that both are zero and u is null. This contradicts u being non-null, and hence I + KK′ has full rank.
Lemma 9. When B has full row rank BB′ is non-singular.
Corollary. When B has full column rank B′B is non-singular.
(37)
Then, with A11−1 existing, A can be written as
(38)
(39)
Since, from Lemma 4, L has full column rank and M has full row rank, Lemma 9 shows that
(40)
Pre-multiplication by A11−1(L′L)−1L′ and post-multiplication by M′(MM′)−1A11−1 then gives
(41)
Whatever the generalized inverse is, suppose it is partitioned as
(42)
of order q × p, conformable for multiplication with A. Then substituting (42) and (39) into (41) gives
(43)
This is true whatever the generalized inverse may be. Therefore, for any matrices G11, G12, G21 and G22 that satisfy (43), G as given in (42) will be a generalized inverse of A. Therefore, on substituting from (43) for G11, we have
(44)
as a generalized inverse of A for any matrices G12, G21 and G22 of appropriate order. Thus is the arbitrariness of a generalized inverse characterized.
Certain consequences of (44) can be noted. One is that by making G12, G21 and G22 null, (44) reduces to the generalized inverse [A11−1, 0; 0, 0] obtained earlier.
The arbitrariness evident in (44) prompts investigating the relationship of one generalized inverse to another. It is simple: if G1 is a generalized inverse of A then so is
(45)
for any X and Y. Pre- and post-multiplication of (45) by A shows that this is so.
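The exact expression of (45) is not reproduced in this extract, but one standard parameterization with the stated property (offered here as an assumption, not necessarily the book's own form) is G1 + (I − G1A)X + Y(I − AG1) for arbitrary X and Y; pre- and post-multiplying by A makes the extra terms vanish. A numerical check:

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.]])               # rank 2
G1 = np.linalg.pinv(A)                     # one generalized inverse of A
I = np.eye(3)

rng = np.random.default_rng(1)
for _ in range(3):
    X = rng.standard_normal((3, 3))
    Y = rng.standard_normal((3, 3))
    G = G1 + (I - G1 @ A) @ X + Y @ (I - A @ G1)
    print(np.allclose(A @ G @ A, A))       # True: G is also a generalized inverse of A
```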
(46)
This characterizes the arbitrariness even more specifically than does (44). Thus for the four sub-matrices of G shown in (42) we have:
Sub-matrix    Source of Arbitrariness
G11           X21 and Y12
G12           X22 and Y12
G21           X21 and Y22
G22           X22 and Y22
This means that, in the partitioning of X and Y implicit in (45), the first set of rows in the partitioning of X does not enter into G, and neither does the first set of columns of Y.
Now we have the theorem on generating solutions.
where G is exactly the form given in (45) using G1 for Y.
Procedures for inverting partitioned matrices are well known [e.g., Searle (1966), Sec. 8.7]. In particular, the inverse of the partitioned full rank symmetric matrix
(47)
can, for
be written as
(48)
(49)
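The partitioned-inverse expressions of (47)-(49) are not visible in this extract; the sketch below verifies the standard result, using symbols that may differ from the book's: for a full-rank symmetric M = [A, B; B′, D] with A non-singular and W = D − B′A−1B, the inverse is M−1 = [A−1 + A−1BW−1B′A−1, −A−1BW−1; −W−1B′A−1, W−1].

```python
import numpy as np

A = np.array([[4., 1.],
              [1., 3.]])                              # non-singular leading block
B = np.array([[1., 0.],
              [2., 1.]])
D = B.T @ np.linalg.inv(A) @ B + 2. * np.eye(2)       # chosen so that W = 2I is non-singular
M = np.block([[A, B],
              [B.T, D]])                              # full-rank symmetric matrix

Ai = np.linalg.inv(A)
W = D - B.T @ Ai @ B                                  # the Schur complement of A in M
Wi = np.linalg.inv(W)
Minv = np.block([[Ai + Ai @ B @ Wi @ B.T @ Ai, -Ai @ B @ Wi],
                 [-Wi @ B.T @ Ai,               Wi]])

print(np.allclose(M @ Minv, np.eye(4)))               # True
```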