A GUIDE TO ECONOMICS, STATISTICS AND FINANCE THAT EXPLORES THE MATHEMATICAL FOUNDATIONS UNDERLYING ECONOMETRIC METHODS
An Introduction to Econometric Theory offers a text to help in the mastery of the mathematics that underlie econometric methods and includes a detailed study of matrix algebra and distribution theory. Designed to be an accessible resource, the text explains in clear language why things are being done, and how previous material informs a current argument. The style is deliberately informal with numbered theorems and lemmas avoided. However, very few technical results are quoted without some form of explanation, demonstration or proof.
The author—a noted expert in the field—covers a wealth of topics including: simple regression, basic matrix algebra, the general linear model, distribution theory, the normal distribution, properties of least squares, unbiasedness and efficiency, eigenvalues, statistical inference in regression, t and F tests, the partitioned regression, specification analysis, random regressor theory, introduction to asymptotics, and maximum likelihood. Each of the chapters is supplied with a collection of exercises, some of which are straightforward and others more challenging.
Written for undergraduates and graduate students of economics, statistics or finance, An Introduction to Econometric Theory is an essential beginner's guide to the underpinnings of econometrics.
Page count: 376
Year of publication: 2018
Cover
List of Figures
Preface
About the Companion Website
Part I: Fitting
Chapter 1: Elementary Data Analysis
1.1 Variables and Observations
1.2 Summary Statistics
1.3 Correlation
1.4 Regression
1.5 Computing the Regression Line
1.6 Multiple Regression
1.7 Exercises
Chapter 2: Matrix Representation
2.1 Systems of Equations
2.2 Matrix Algebra Basics
2.3 Rules of Matrix Algebra
2.4 Partitioned Matrices
2.5 Exercises
Chapter 3: Solving the Matrix Equation
3.1 Matrix Inversion
3.2 Determinant and Adjoint
3.3 Transposes and Products
3.4 Cramer's Rule
3.5 Partitioning and Inversion
3.6 A Note on Computation
3.7 Exercises
Chapter 4: The Least Squares Solution
4.1 Linear Dependence and Rank
4.2 The General Linear Regression
4.3 Definite Matrices
4.4 Matrix Calculus
4.5 Goodness of Fit
4.6 Exercises
Part II: Modelling
Chapter 5: Probability Distributions
5.1 A Random Experiment
5.2 Properties of the Normal Distribution
5.3 Expected Values
5.4 Discrete Random Variables
5.5 Exercises
Chapter 6: More on Distributions
6.1 Random Vectors
6.2 The Multivariate Normal Distribution
6.3 Other Continuous Distributions
6.4 Moments
6.5 Conditional Distributions
6.6 Exercises
Chapter 7: The Classical Regression Model
7.1 The Classical Assumptions
7.2 The Model
7.3 Properties of Least Squares
7.4 The Projection Matrices
7.5 The Trace
7.6 Exercises
Chapter 8: The Gauss‐Markov Theorem
8.1 A Simple Example
8.2 Efficiency in the General Model
8.3 Failure of the Assumptions
8.4 Generalized Least Squares
8.5 Weighted Least Squares
8.6 Exercises
Part III: Testing
Chapter 9: Eigenvalues and Eigenvectors
9.1 The Characteristic Equation
9.2 Complex Roots
9.3 Eigenvectors
9.4 Diagonalization
9.5 Other Properties
9.6 An Interesting Result
9.7 Exercises
Chapter 10: The Gaussian Regression Model
10.1 Testing Hypotheses
10.2 Idempotent Quadratic Forms
10.3 Confidence Regions
10.4 t and F Statistics
10.5 Tests of Linear Restrictions
10.6 Constrained Least Squares
10.7 Exercises
Chapter 11: Partitioning and Specification
11.1 The Partitioned Regression
11.2 Frisch‐Waugh‐Lovell Theorem
11.3 Misspecification Analysis
11.4 Specification Testing
11.5 Stability Analysis
11.6 Prediction Tests
11.7 Exercises
Part IV: Extensions
Chapter 12: Random Regressors
12.1 Conditional Probability
12.2 Conditional Expectations
12.3 Statistical Models Contrasted
12.4 The Statistical Assumptions
12.5 Properties of OLS
12.6 The Gaussian Model
12.7 Exercises
Chapter 13: Introduction to Asymptotics
13.1 The Law of Large Numbers
13.2 Consistent Estimation
13.3 The Central Limit Theorem
13.4 Asymptotic Normality
13.5 Multiple Regression
13.6 Exercises
Chapter 14: Asymptotic Estimation Theory
14.1 Large Sample Efficiency
14.2 Instrumental Variables
14.3 Maximum Likelihood
14.4 Gaussian ML
14.5 Properties of ML Estimators
14.6 Likelihood Inference
14.7 Exercises
Part V: Appendices
Appendix A: The Binomial Coefficients
Appendix B: The Exponential Function
Appendix C: Essential Calculus
Appendix D: The Generalized Inverse
Recommended Reading
Preliminary Reading
Additional Reading
For Reference
Further Reading
Index
End User License Agreement
Chapter 1
Figure 1.1 Long and short UK interest rates.
Figure 1.2 Scatter plot of the interest rate series.
Figure 1.3 The regression line
Figure 1.4 The regression line and the data
Figure 1.5 The regression residual
Figure 1.6 Plot of
Chapter 5
Figure 5.1 Archery target scatter.
Figure 5.2 Archery target, frequency contours.
Figure 5.3 Bivariate normal probability density function.
Figure 5.4 Normal p.d.f., with shaded area.
Figure 5.5 Binomial probabilities and the normal p.d.f.
Figure 5.6 Prussian cavalry data and predictions.
Chapter 6
Figure 6.1 The standard Cauchy p.d.f.
Chapter 10
Figure 10.1 Regression confidence regions. Source: Figure 2.1 of Econometric Theory by James Davidson, Blackwell Publishers 2000. Reproduced by permission of Wiley‐Blackwell.
Chapter 13
Figure 13.1 P.d.f. of the sum of three uniform r.v.s, with normal p.d.f. for comparison.
James Davidson
University of Exeter, UK
This edition first published 2018
© 2018 John Wiley & Sons Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of James Davidson to be identified as the author of this work has been asserted in accordance with law.
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial Office
9600 Garsington Road, Oxford, OX4 2DQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging-in-Publication Data
Names: Davidson, James, 1944- author.
Title: An introduction to econometric theory / by Prof. James Davidson, University of Exeter.
Description: Hoboken, NJ : John Wiley & Sons, Inc., [2018] | Includes bibliographical references and index.
Identifiers: LCCN 2018009800 (print) | LCCN 2018011202 (ebook) | ISBN 9781119484936 (pdf) | ISBN 9781119484929 (epub) | ISBN 9781119484882 (cloth)
Subjects: LCSH: Econometrics.
Classification: LCC HB139 (ebook) | LCC HB139 .D3664 2018 (print) | DDC 330.01/5195-dc23
LC record available at https://lccn.loc.gov/2018009800
Cover Design: Wiley
Cover Image: © maciek905/iStockphoto
This book has its origin in a course of lectures offered to second year economics undergraduates who are simultaneously taking a core module in applied econometrics. Courses of the latter type, typically based on excellent texts such as Wooldridge's Introductory Econometrics or Stock and Watson's Introduction to Econometrics, teach modern techniques of model building and inference, but necessarily a good deal of technical material has to be taken on trust. This is like following a cake recipe that dictates ingredients in given proportions and then the baking time and oven temperature but does not tell you why these instructions give a good result. One can drive a car without knowing anything about spark plugs and transmissions, but one cannot so easily fix it. For students with the requisite motivation, these lectures have aimed to provide a look under the bonnet (being British; their American counterparts would of course be wanting to look under the hood).
A problem has been that no very suitable textbook has existed to accompany the lectures. The reading list has had to cite chapters from various large and indigestible texts, often with special reference to the technical appendices. To master the mathematics underlying econometric methods requires a detailed study of matrix algebra and a sound grasp of distribution theory, and to find readings with the right focus and at the right level is not easy. Sometimes, books written a generation ago and now out of print appear to do a better job than modern texts. Hence, this book.
Jargon, obscure conventions, and austere expository style all conspire to make this kind of material hard for beginners to access. This book may or may not succeed in its aim, but its aim is clear, which is to be successfully read by students who do not have too many techniques at their fingertips. As little as possible is done without a full explanation and careful cross‐referencing to relevant results. This may make the discussion long‐winded and repetitive at times, but hopefully it is helpful if at every stage the reader is told why things are being done, and what previous material is informing the argument. The style is deliberately informal, with numbered theorems and lemmas avoided. However, there is no dumbing down! Very few technical results are quoted without some form of explanation, demonstration, or proof.
It is expected that readers will have taken the standard mathematics and statistics courses for economics undergraduates, but the prior knowledge required is actually quite small. The treatment is as far as possible self‐contained, with almost all the mathematical concepts needed being explained either in situ or in the appendices.
The chapters are grouped into four parts distinguishing the type of analysis undertaken in each.
Part I, “Fitting”, is about summarizing and fitting data sets. Matrices are the main tools, and regression is the unifying principle for explaining and predicting data. The main topics are the solution of systems of equations and fitting by least squares. These are essential tools in econometrics proper, but at this stage there is no statistical modelling. The role of the calculations is purely descriptive.
Part II, “Modelling”, invokes the methods of Part I to study the connections between a sample of data and the environment in which those data were generated, via econometric models. The probability distribution is the essential theoretical tool. The key idea of the sampling distribution of an estimator is introduced and allows attributes of estimators such as unbiasedness and efficiency to be defined and studied.
Part III “Testing”, shows how to use the modelling framework of Part II to pose and answer questions. The central, beautiful idea of statistical inference is that noise and randomness can be domesticated and analyzed scientifically, so that when decisions to accept or reject a hypothesis have to be made, the chance of making the wrong decision can be quantified. These methods are used both to test simplifying restrictions on econometric models and to check out the adequacy and stability of the models themselves.
Up to this point in the story, the focus is wholly on the classical regression model. Elegant and powerful, while relying only on elementary statistical concepts, the classical model nonetheless suffers major limitations in the analysis of economic data. Part IV, “Extensions”, is about breaking out of these limitations. These chapters provide a gentle introduction to some of the more advanced techniques of analysis that are the staple of current econometric research.
Finally, Part V contains four brief appendices reviewing some important bits of mathematics, including essential calculus.
Each of the chapters is supplied with a collection of exercises, some of which are straightforward and others more challenging. The first exercise in each case is a collection of statements with the question "true or false?" Some of these are true, some are subtly misstated, and some are nonsensical. After completing each chapter, readers are strongly encouraged to check their understanding before going further, by working through the statements and making their choices. The correct answers are always to be found by a careful reading of the chapter material.
It may be helpful to mention some stylistic conventions adopted here to give hints to the reader. Formal definitions are generally avoided, but a technical term being used for the first time and receiving a definition is generally put in italics. Single quotation marks around words or phrases are used quite extensively, to provide emphasis and alert readers to the fact that common words are being used in a specialized or unusual way. Double quotes also get used to enclose words that we might imagine being articulated, although not actual quotations.
As the title emphasizes, this is no more than a primer, and there is no attempt to cover the whole field of econometric theory. The treatment of large sample theory is brief, and nothing is said about the analysis of time series and panel data, to name but two important topics. Some of the many excellent texts available that deal with these and other questions are listed in the “Recommended Reading” section at the end. Among others, my own earlier textbook Econometric Theory (Oxford: Blackwell Publishers, 2000) may be found useful. There is some overlap of material, and I have even taken the liberty of recycling one of the fancier illustrations from that volume. However, a different audience is addressed here. The earlier book was intended, as is the way with graduate texts, to report on the cutting edge of econometric research, with much emphasis on time series problems and the most advanced asymptotic theory. By contrast the present book might appear somewhat old fashioned, but with a purpose. The hope is that it can provide beginners with some of the basic intellectual equipment, as well as the confidence, to go further with the fascinating discipline of econometrics.
Exeter, December 2017
James Davidson
The companion website for this book is at
www.wiley.com/go/davidson/introecmettheory
This contains the full set of solutions to the exercises. There is also a set of lecture slides designed for instructors' use in conjunction with the text.
Scan this QR code to visit the companion website.
Where to begin? Data analysis is the business of summarizing a large volume of information into a smaller compass, in a form that a human investigator can appreciate, assess, and draw conclusions from. The idea is to smooth out incidental variations so as to bring the ‘big picture’ into focus, and the fundamental concept is averaging, extracting a representative value or central tendency from a collection of cases. The correct interpretation of these averages, and functions of them, on the basis of a model of the environment in which the observed data are generated, is the main concern of statistical theory. However, before tackling these often difficult questions, gaining familiarity with the methods of summarizing sample information and doing the associated calculations is an essential preliminary.
Information must be recorded in some numerical form. Data may consist of measured magnitudes, which in econometrics are typically monetary values, prices, indices, or rates of exchange. However, another important data type is the binary indicator of membership of some class or category, expressed numerically by ones and zeros. A thing or entity of which different instances are observed at different times or places is commonly called a variable. The instances themselves, of which collections are to be made and then analyzed, are the observations. The basic activity to be studied in this first part of the book is the application of mathematical formulae to the observations on one or more variables.
These formulae are, to a large extent, human‐friendly versions of coded computer routines. In practice, econometric calculations are always done on computers, sometimes with spreadsheet programs such as Microsoft Excel but more often using specialized econometric software packages. Simple cases are traditionally given to students to carry out by hand, not because they ever need to be done this way but hopefully to cultivate insight into what it is that computers do. Making the connection between formulae on the page and the results of running estimation programs on a laptop is a fundamental step on the path to econometric expertise.
The most basic manipulation is to add up a column of numbers, where the word “column” is chosen deliberately to evoke the layout of a spreadsheet but could equally refer to the page of an accounting ledger in the ink‐and‐paper technology of a now‐vanished age. Nearly all of the important concepts can be explained in the context of a pair of variables. To give them names, call them $x$ and $y$. Going from two variables up to three and more introduces no fundamental new ideas. In linear regression analysis, variables are always treated in pairs, no matter how many are involved in the calculation as a whole.
Thus, let $(x, y)$ denote the pair of variables chosen for analysis. The enclosure of the symbols in parentheses, separated by a comma, is a simple way of indicating that these items are to be taken together, but note that $(x, y)$ is not to be regarded as just another way of writing $(y, x)$. The order in which the variables appear is often significant.
Let $T$, a positive whole number, denote the number of observations, or in other words the number of rows in the spreadsheet. Such a collection of observations, whose order may or may not be significant, is often called a series. The convention for denoting which row an observation belongs to is to append a subscript. Sometimes the letters $i$, $j$, or $k$ are used as row labels, but there are typically other uses for these, and in this book we generally adopt the symbol $t$ for this purpose. Thus, the contents of a pair of spreadsheet columns may be denoted symbolically as
$$(x_t, y_t), \qquad t = 1, 2, \ldots, T.$$
We variously refer to the $x_t$ and $y_t$ as the elements or the coordinates of their respective series.
This brings us inevitably to the question of the context in which observations are made. Very frequently, macroeconomic or financial variables (prices, interest rates, demand flows, asset stocks) are recorded at successive dates, at intervals of days, months, quarters, or years, and then $t$ is simply a date, standardized with respect to the time interval and the first observation. Such data sets are called time series. Economic data may also be observations of individual economic units. These can be workers or consumers, households, firms, industries, and sometimes regions, states, and countries. The observations can represent quantities such as incomes, rates of expenditure on consumption or investment, and also individual characteristics, such as family size, numbers of employees, population, and so forth. If these observations relate to a common date, the data set is called a cross‐section. The ordering of the rows typically has no special significance in this case.
Increasingly commonly studied in economics are data sets with both a time and a cross‐sectional dimension, known as panel data, representing a succession of observations on the same cross section of entities. In this case two subscripts are called for, say $i$ and $t$. However, the analysis of panel data is an advanced topic not covered in this book, and for observations we can stick to single subscripts henceforth.
As remarked at the beginning, the basic statistical operation of averaging is a way of measuring the central tendency of a set of data. Take a column of numbers, add them up, and divide by $T$. This operation defines the sample mean of the series, usually written as the symbol for the designated variable with a bar over the top. Thus,
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_T}{T} = \frac{1}{T}\sum_{t=1}^{T} x_t \qquad (1.1)$$
where the second equality defines the ‘sigma’ representation of the sum. The Greek letter $\Sigma$, decorated with upper and lower limits, is a neat way to express the adding‐up operation, noting the vital role of the subscript $t$ in showing which items are to be added together. The formula for $\bar{y}$ is constructed in just the same way.
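To see how the sigma formula maps onto an actual calculation, here is a minimal sketch in Python (the language, the variable names, and the data are our own choices for illustration, not part of the text) that computes a sample mean both by an explicit loop and with the built-in sum, mirroring the two sides of (1.1).

```python
# A minimal sketch (illustrative only): the sample mean computed two ways,
# mirroring the explicit sum and the sigma notation in (1.1).
x = [9.5, 10.1, 8.7, 11.2, 9.9]   # hypothetical series of T = 5 observations

# "Add them up and divide by T", term by term
total = 0.0
for x_t in x:
    total += x_t
mean_loop = total / len(x)

# The same operation using the built-in sum
mean_sum = sum(x) / len(x)

print(mean_loop, mean_sum)  # both print the same value, 9.88, for this series
```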
The idea of the series mean extends from raw observations to various constructed series. The mean deviations are the series
$$x_t - \bar{x}, \qquad t = 1, \ldots, T. \qquad (1.2)$$
Naturally enough this ‘centred’ series has zero mean, identically:
$$\frac{1}{T}\sum_{t=1}^{T} (x_t - \bar{x}) = \bar{x} - \bar{x} = 0.$$
Not such an interesting fact, perhaps, but the statistic obtained as the mean of the squared mean deviations is very interesting indeed. This is the sample variance,
$$s_x^2 = \frac{1}{T-1}\sum_{t=1}^{T} (x_t - \bar{x})^2 \qquad (1.3)$$
which contains information about how the series varies about its central tendency. The same information, but with units of measurement matching the original data, is conveyed by the square root $s_x$, called the standard deviation of the series. If $\bar{x}$ is a measure of location, then $s_x$ is a measure of dispersion.
One of the mysteries of the variance formula is the division by $T-1$, not $T$ as for the mean itself. There are important technical reasons for this, but to convey the intuition involved here, it may be helpful to think about the case where $T = 1$, a single observation. Clearly, the mean formula still makes sense, because it gives $\bar{x} = x_1$. This is the best that can be done to measure location. There is clearly no possibility of computing a measure of dispersion, and the fact that the formula would involve dividing by zero gives warning that it is not meaningful to try. In other words, to measure the dispersion as zero, which is what (1.3) would produce with division by $T$ instead of $T-1$, would be misleading. Rather, it is correct to say that no measure of dispersion exists.
Another property of the variance formula worth remarking is found by multiplying out the squared terms and summing them separately, thus:
$$\sum_{t=1}^{T} (x_t - \bar{x})^2 = \sum_{t=1}^{T} x_t^2 - 2\bar{x}\sum_{t=1}^{T} x_t + T\bar{x}^2 = \sum_{t=1}^{T} x_t^2 - T\bar{x}^2. \qquad (1.4)$$
In the first equality, note that “adding up” $T$ instances of $\bar{x}^2$ (which does not depend on $t$) is the same thing as just multiplying by $T$. The second equality then follows by cancellation, given the definition (1.1). This result shows that to compute the variance, there is no need to perform subtractions. Simply add up the squares of the coordinates, and subtract $T$ times the squared mean. Clearly, this second formula is more convenient for hand calculations than the first one.
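To make the shortcut concrete, the following sketch (ours, with made-up numbers) computes the sum of squared mean deviations both directly and via the "sum of squares minus $T$ times the squared mean" form of (1.4), and then divides by $T-1$ as in (1.3).

```python
# A sketch (illustrative data): checking the two forms of (1.4),
# and the sample variance (1.3) with division by T - 1.
x = [9.5, 10.1, 8.7, 11.2, 9.9]
T = len(x)
xbar = sum(x) / T

direct = sum((xt - xbar) ** 2 for xt in x)           # sum of squared mean deviations
shortcut = sum(xt ** 2 for xt in x) - T * xbar ** 2  # sum of squares minus T * mean^2

variance = direct / (T - 1)
print(direct, shortcut, variance)  # direct and shortcut agree (up to rounding error)
```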
The information contained in the standard deviation is nicely captured by a famous result in statistics called Chebyshev's rule, after the noted Russian mathematician who discovered it. Consider, for some chosen positive number $\varepsilon$, whether a series coordinate $x_t$ falls ‘far from’ the central tendency of the data set, in the sense that either $x_t > \bar{x} + \varepsilon$ or $x_t < \bar{x} - \varepsilon$. In other words, does $x_t$ lie beyond a distance $\varepsilon$ from the mean, either above or below? This condition can be expressed as
$$(x_t - \bar{x})^2 > \varepsilon^2. \qquad (1.5)$$
Letting $T_\varepsilon$ denote the number of cases that satisfy inequality (1.5), the inequality
$$T_\varepsilon \varepsilon^2 \le \sum_{(x_t - \bar{x})^2 > \varepsilon^2} (x_t - \bar{x})^2 \qquad (1.6)$$
is true by definition, where the ‘sigma’ notation variant expresses compactly the sum of the terms satisfying the stated condition. However, it is also the case that
$$\sum_{(x_t - \bar{x})^2 > \varepsilon^2} (x_t - \bar{x})^2 \le (T-1)s_x^2 \qquad (1.7)$$
since, remembering the definition of $s_x^2$ from (1.3), the sum cannot exceed $(T-1)s_x^2$, even with $T_\varepsilon = T$. Putting together the inequalities in (1.6) and (1.7) and also dividing through by $T$ and by $\varepsilon^2$ yields the result
$$\frac{T_\varepsilon}{T} \le \frac{T-1}{T}\,\frac{s_x^2}{\varepsilon^2} \le \frac{s_x^2}{\varepsilon^2}. \qquad (1.8)$$
In words, the proportion of series coordinates falling beyond a distance $\varepsilon$ from the mean is at most $s_x^2/\varepsilon^2$.
To put this another way, let $k = \varepsilon/s_x$ denote the distance expressed in units of standard deviations. The upper bound on the proportion of coordinates lying more than $k$ standard deviations from the mean is $1/k^2$. This rule gives a clear idea of what $s_x$ conveys about the scatter of the data points – whether they are spread out or concentrated closely about the mean. However, the constraint applies in one direction only. It gives no information about the actual number of data points lying outside the specified interval, which could be none. There is simply a bound on the maximum number. Any number of data points can lie beyond one standard deviation of the mean, so the rule says nothing about this case. However, at most a quarter of the coordinates can lie more than two standard deviations from the mean, and at most a ninth beyond three standard deviations from the mean.
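The rule is easy to check numerically. The sketch below (our own illustration, with arbitrary data, not from the text) counts the proportion of coordinates lying more than $k$ standard deviations from the mean and compares it with the bound $1/k^2$.

```python
# A sketch (illustrative only): checking Chebyshev's rule on an arbitrary series.
import statistics

x = [3.1, 9.4, 10.2, 9.8, 10.5, 9.1, 10.0, 11.3, 9.7, 17.2]
xbar = statistics.mean(x)
s = statistics.stdev(x)   # uses the T - 1 divisor, as in (1.3)

for k in (1, 2, 3):
    far = sum(1 for xt in x if abs(xt - xbar) > k * s)
    print(k, far / len(x), "<=", 1 / k**2)  # the proportion never exceeds 1/k^2
```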
Measuring the characteristics of a single series, location and dispersion, is as a rule just a preliminary to considering series in pairs. Relationships are what really matter. Given $T$ pairs of observations on variables $x$ and $y$, with means $\bar{x}$ and $\bar{y}$ and standard deviations $s_x$ and $s_y$, define their covariance as the average of the products of the mean deviations:
$$s_{xy} = \frac{1}{T-1}\sum_{t=1}^{T} (x_t - \bar{x})(y_t - \bar{y}). \qquad (1.9)$$
Note right away the alternative version of this formula, by analogy with (1.4),
$$s_{xy} = \frac{1}{T-1}\left(\sum_{t=1}^{T} x_t y_t - T\bar{x}\bar{y}\right). \qquad (1.10)$$
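By the same logic as the variance shortcut, (1.10) lets the covariance be computed from raw sums of products. A minimal sketch (our own made-up numbers) confirming that (1.9) and (1.10) agree:

```python
# A sketch (illustrative data): covariance via mean deviations (1.9)
# and via the sum-of-products shortcut (1.10).
x = [9.5, 10.1, 8.7, 11.2, 9.9]
y = [10.3, 10.8, 9.6, 12.0, 10.4]
T = len(x)
xbar, ybar = sum(x) / T, sum(y) / T

cov_direct = sum((xt - xbar) * (yt - ybar) for xt, yt in zip(x, y)) / (T - 1)
cov_shortcut = (sum(xt * yt for xt, yt in zip(x, y)) - T * xbar * ybar) / (T - 1)

print(cov_direct, cov_shortcut)  # the two forms agree (up to rounding error)
```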
The interpretation of (1.9) is best conveyed by an illustration. Figure 1.1 shows quarterly series for short and long interest rates for the UK, covering the period 1963 quarter 1 to 1984 quarter 2 (86 observations). The horizontal axis in this figure shows the date, so the observations appear in time order, and for visual convenience the points are joined by line segments to make a continuous path. RS is the 3‐months local authorities' lending rate, and RL is the rate on Consols, undated British Government securities similar to 10‐year bonds. These rates moved in a similar manner through time, both responding to the rate of price inflation in the UK, which over the period in question was high and volatile. The mean and standard deviation of RS are respectively 9.48 and 3.48, while the mean and standard deviation of RL are 10.47 and 3.16. Their covariance, calculated by (1.9), is 9.13.
Figure 1.1 Long and short UK interest rates.
The same data are represented in a different way in Figure 1.2, as a scatter plot, where the axes of the diagram are respectively the values of RL (vertical) and RS (horizontal). Here, each plotted point represents a pair of values. To convey exactly the same information as Figure 1.1 would require labelling each plotted point with the date in question, but for clarity this is not done here. The crosshairs in the scatter plot denote the point of sample means, so that the diagram shows the disposition of the terms contributing to the sum in formula (1.9). The positions of the means divide the plot into four quadrants. In the top‐right and bottom‐left quadrants, the mean deviations have the same signs, positive in the first case, negative in the second. In either case $(x_t - \bar{x})(y_t - \bar{y}) > 0$, so these data points make positive contributions to the sum in (1.9). On the other hand, the top‐left and bottom‐right quadrants contain points where the mean deviations have different signs. In these cases, $(x_t - \bar{x})(y_t - \bar{y}) < 0$, so these points make a negative contribution to the sum. The overall positive association that is evident from the scatter plot is captured by the covariance having a positive value, since the positive contributions to the sum of products outweigh the negative contributions, as is clearly the case from the plot. As well as counting positive and negative contributions, the contributions are larger absolutely (of whichever sign) as the pairs of data points in question are further away from the point of the means. If either coordinate happened to be equal to its mean, the contribution to the sum would of course be zero.
Figure 1.2 Scatter plot of the interest rate series.
The problem with the covariance is that, like the means and standard deviations, its value depends on the units of measurement of the data. While the sign indicates a direction of association, it is difficult to answer the question “How big is big?” The solution is to normalize by dividing by the product of the standard deviations, to yield the correlation coefficient,
$$r_{xy} = \frac{s_{xy}}{s_x s_y}. \qquad (1.11)$$
The divisor $T-1$ cancels in this ratio, so an equivalent formula is
$$r_{xy} = \frac{\sum_{t=1}^{T}(x_t - \bar{x})(y_t - \bar{y})}{\sqrt{\sum_{t=1}^{T}(x_t - \bar{x})^2 \sum_{t=1}^{T}(y_t - \bar{y})^2}}. \qquad (1.12)$$
The correlation coefficient of the interest rates is 0.837.
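As a quick arithmetic check of (1.11) against the summary figures quoted above (which are rounded to two or three significant digits), dividing the reported covariance by the product of the reported standard deviations gives approximately the reported correlation:
$$r \approx \frac{9.13}{3.48 \times 3.16} = \frac{9.13}{11.00} \approx 0.83,$$
with the small discrepancy from 0.837 attributable to rounding in the quoted summary statistics.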
The remarkable fact about the correlation coefficient is that it cannot exceed 1 in absolute value. This makes sense, if a series cannot be more closely correlated with any other series than it is with itself. Putting $y_t = x_t$ for every $t$ in (1.11) gives $r_{xy} = 1$. Putting $y_t = -x_t$ for every $t$ gives $r_{xy} = -1$. However, this is not quite the same thing as saying that $r_{xy}$ can never fall outside the interval $[-1, 1]$, whatever pair of series is chosen. This is in fact a rather deep result, an instance of what is called by mathematicians the Cauchy‐Schwarz inequality.
The demonstration is a bit magical and not too technical to follow, so it is worth setting it out in simple steps. Here's how. Using an abbreviated notation to keep the equations as simple as possible, write
$$\sum x^2 = \sum_{t=1}^{T}(x_t - \bar{x})^2, \qquad \sum y^2 = \sum_{t=1}^{T}(y_t - \bar{y})^2, \qquad \sum xy = \sum_{t=1}^{T}(x_t - \bar{x})(y_t - \bar{y}),$$
so that the sum limits and the mean deviations are taken as implicit. In this setup, note first that for any pair of numbers $a$ and $b$, and any pair of series $x$ and $y$,
$$\sum (ax - by)^2 \ge 0. \qquad (1.13)$$
This is a sum of squares, and the smallest it can get is zero, if $ax_t = by_t$ for every $t$. Except in this case, it must be positive.
Multiplying out the square and adding up the components individually, an equivalent way to write the relation in (1.13) (check this!) is
$$a^2 \sum x^2 - 2ab \sum xy + b^2 \sum y^2 \ge 0. \qquad (1.14)$$
Now (the clever bit!) choose $a = \sum y^2$ and $b = \sum xy$. Substituting in (1.14) gives
$$\left(\sum y^2\right)^2 \sum x^2 - 2\sum y^2 \left(\sum xy\right)^2 + \left(\sum xy\right)^2 \sum y^2 \ge 0. \qquad (1.15)$$
Noting that $\sum y^2$ appears in each term and is necessarily positive, it can be cancelled, so inequality (1.15) implies
$$\sum y^2 \sum x^2 - 2\left(\sum xy\right)^2 + \left(\sum xy\right)^2 \ge 0.$$
Finally, collecting the last two terms and taking $\left(\sum xy\right)^2$ to the other side gives
$$\sum x^2 \sum y^2 \ge \left(\sum xy\right)^2.$$
In words, “the product of the sums of squares is never smaller than the square of the sum of products.” Comparing with (1.12), the argument shows that $r_{xy}^2 \le 1$, and this must hold for any pair of series whatever. The claim that
$$-1 \le r_{xy} \le 1$$
is established directly.
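The inequality is easy to probe numerically. The following sketch (our own, with arbitrary data) computes $r_{xy}$ from the mean-deviation sums exactly as in (1.12) and confirms that the square of the sum of products does not exceed the product of the sums of squares.

```python
# A sketch (arbitrary illustrative data): the Cauchy-Schwarz inequality in action.
import math

x = [1.0, 2.5, 3.1, 4.7, 5.2]
y = [2.1, 1.9, 3.8, 4.4, 6.0]

xbar, ybar = sum(x) / len(x), sum(y) / len(y)
dx = [xt - xbar for xt in x]              # mean deviations of x
dy = [yt - ybar for yt in y]              # mean deviations of y

sxx = sum(d * d for d in dx)              # sum of squares, x
syy = sum(d * d for d in dy)              # sum of squares, y
sxy = sum(a * b for a, b in zip(dx, dy))  # sum of products

r = sxy / math.sqrt(sxx * syy)            # correlation coefficient as in (1.12)
print(r, sxy ** 2 <= sxx * syy)           # |r| <= 1, so the check prints True
```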
Given this result, a correlation coefficient of 0.837 represents quite a strong positive relationship, nearer to 1 than to 0 at least. However, there is another aspect to correlation not to be overlooked. It is clear that scatter patterns yielding a large positive value must have the characteristic “big with big, small with small”. In other words, the pairs of points must tend to be on the same side of their means, above or below. In the case of negative correlation we should find “big with small, small with big”, with most points in the top left or bottom right quadrants. In other words, the scatter of points has to be clustered around a straight line, either with positive slope, or negative slope.
However, it is not difficult to imagine a case where “big goes with either big or small, but small goes with intermediate”. For example, the points might be clustered around a quadratic curve. A close relationship is perfectly compatible with similar numbers of points in all four quadrants and a correspondingly small value for $r_{xy}$. The implication is that correlation measures specifically the strength of linear relationships.
Correlation is a way of describing relationships between pairs of variables, but in a neutral and context‐free way, simply posing the question “Is there a connection?” But most commonly in economics, such questions do have a context. One of these might be “Is there a causal connection? Does $y$ change because $x$ changes?” Another might be “In a situation where I observe $x$ but I don't observe $y$, could I exploit their correlation to predict $y$?” Note that in the latter case there need be no implication about a direction of causation; it's just a question of what is and is not observed. However, in either situation there is an asymmetry in the relationship. Consider $y$ to be the object of interest and $x$ as the means to explain it. Regression is the tool for using correlation to try to answer such questions.
Since correlation is a feature of linear relationships, it is a natural step to write down the equation for a straight line to embody the relationship of interest. Such an equation has the form
$$y = \alpha + \beta x$$
and is illustrated in Figure 1.3. The coefficients $\alpha$ and $\beta$ are known respectively as the intercept and the slope, and the illustration shows the roles that they play.
Figure 1.3 The regression line
Given the series of pairs $(x_t, y_t)$ for $t = 1, \ldots, T$, consider how to express the idea of an approximate linear relationship between observed variables. Figure 1.4 shows what the teaming up of a scatter of observed pairs with a regression line might look like. To express the same idea algebraically requires an equation of the form
$$y_t = \alpha + \beta x_t + e_t \qquad (1.16)$$
in which a new series, $e_t$ for $t = 1, \ldots, T$, has to be introduced to close the equation. This is not a part of the data set, although given fixed values for $\alpha$ and $\beta$, equation (1.16) is nothing but an identity defining $e_t$ as a function of the observed series, and if those values were known, it could be calculated from the formula. Think of
