Statistical Methods for Spatial and Spatio-Temporal Data Analysis provides a complete range of spatio-temporal covariance functions and discusses ways of constructing them. The book presents a unified approach to modeling spatial and spatio-temporal data, together with significant developments in statistical methodology and applications in R.
This book includes:
Number of pages: 505
Year of publication: 2015
Cover
Wiley Series in Probability and Statistics
Title Page
Copyright
Dedication
Foreword by Abdel H. El-Shaarawi
Foreword by Hao Zhang
List of figures
List of tables
About the companion website
Chapter 1: From classical statistics to geostatistics
1.1 Not all spatial data are geostatistical data
1.2 The limits of classical statistics
1.3 A real geostatistical dataset: data on carbon monoxide in Madrid, Spain
Chapter 2: Geostatistics: preliminaries
2.1 Regionalized variables
2.2 Random functions
2.3 Stationary and intrinsic hypotheses
2.4 Support
Chapter 3: Structural analysis
3.1 Introduction
3.2 Covariance function
3.3 Empirical covariogram
3.4 Semivariogram
3.5 Theoretical semivariogram models
3.6 Empirical semivariogram
3.7 Anisotropy
3.8 Fitting a semivariogram model
Chapter 4: Spatial prediction and kriging
4.1 Introduction
4.2 Neighborhood
4.3 Ordinary kriging
4.4 Simple kriging: the special case of known mean
4.5 Simple kriging with an estimated mean
4.6 Universal kriging
4.7 Residual kriging
4.8 Median-Polish kriging
4.9 Cross-validation
4.10 Non-linear kriging
Chapter 5: Geostatistics and spatio-temporal random functions
5.1 Spatio-temporal geostatistics
5.2 Spatio-temporal continuity
5.3 Relevant spatio-temporal concepts
5.4 Properties of the spatio-temporal covariance and semivariogram
Chapter 6: Spatio-temporal structural analysis (I): empirical semivariogram and covariogram estimation and model fitting
6.1 Introduction
6.2 The empirical spatio-temporal semivariogram and covariogram
6.3 Fitting spatio-temporal semivariogram and covariogram models
6.4 Validation and comparison of spatio-temporal semivariogram and covariogram models
Chapter 7: Spatio-temporal structural analysis (II): theoretical covariance models
7.1 Introduction
7.2 Combined distance or metric model
7.3 Sum model
7.4 Combined metric-sum model
7.5 Product model
7.6 Product-sum model
7.7 Porcu and Mateu mixture-based models
7.8 General product-sum model
7.9 Integrated product and product-sum models
7.10 Models proposed by Cressie and Huang
7.11 Models proposed by Gneiting
7.12 Mixture models proposed by Ma
7.13 Models generated by linear combinations proposed by Ma
7.14 Models proposed by Stein
7.15 Construction of covariance functions using copulas and completely monotonic functions
7.16 Generalized product-sum model
7.17 Models that are not fully symmetric
7.18 Mixture-based Bernstein zonally anisotropic covariance functions
7.19 Non-stationary models
7.20 Anisotropic covariance functions by Porcu and Mateu
7.21 Spatio-temporal constructions based on quasi-arithmetic means of covariance functions
Chapter 8: Spatio-temporal prediction and kriging
8.1 Spatio-temporal kriging
8.2 Spatio-temporal kriging equations
Chapter 9: An introduction to functional geostatistics
9.1 Functional data analysis
9.2 Functional geostatistics: The parametric vs. the non-parametric approach
9.3 Functional ordinary kriging
Appendix A: Spectral representations
Spectral representation of the covariogram
Spectral representation of the semivariogram
Appendix B: Probabilistic aspects of $U_{ij} = Z(s_i) - Z(s_j)$
Appendix C: Basic theory on restricted maximum likelihood
Restricted Maximum Likelihood equation
Appendix D: Most relevant proofs (Chapter 7)
D.1 Product model: Peculiarity (ii) (Rodríguez-Iturbe and Mejia 1974; De Cesare et al. 1997)
D.2 Product model: Peculiarity (iv) (Rodríguez-Iturbe and Mejia 1974; De Cesare et al. 1997)
D.3 Product-sum model: Semivariogram expression (7.29) (De Iaco et al. 2001)
D.4 General product-sum model: Obtaining the constant k (De Iaco et al. 2001)
D.5 General product-sum model: Theorem 7.8.1 (De Iaco et al. 2001)
D.6 General product-sum model: Theorem 7.8.2 (De Iaco et al. 2001)
D.7 Generalized product-sum model. Proposition 1 (Gregori et al. 2008)
D.8 Generalized product-sum model. Proposition 1 for (Gregori et al. 2008)
D.9 Generalized product-sum model. Corollary 1 of Proposition 2 (Gregori et al. 2008)
D.10 Generalized product-sum model. Range of . Case 1: The Gaussian case (Gregori et al. 2008)
D.11 Generalized product-sum model. Range of . Case 2: The Matérn case (Gregori et al. 2008)
D.12 Generalized product-sum model. Range of . Case 3: The Gaussian-Matérn case (Gregori et al. 2008)
D.13 Mixture-based Bernstein zonally anisotropic covariance functions. Theorem 7.18.1 (Ma 2003b)
D.14 Construction of non-stationary spatio-temporal covariance functions using spatio-temporal stationary covariances and intrinsically stationary semivariograms. Equation (7.159) (Ma 2003c)
D.15 Construction of non-stationary spatio-temporal covariance functions using spatio-temporal stationary covariances and intrinsically stationary semivariograms. Equation (7.161) is a valid covariance function (Ma 2003c)
D.16 Construction of non-stationary spatio-temporal covariance functions using spatio-temporal stationary covariances and intrinsically stationary semivariograms. Equation (7.163) (Ma 2003c)
D.17 Permissibility criteria for quasi-arithmetic means of covariance functions. Proposition 1 (Porcu et al. 2009b)
Bibliography and further reading
Wiley Series in Probability and Statistics: Established by Walter A. Shewhart and Samuel S. Wilks
Index
End User License Agreement
Chapter 1: From classical statistics to geostatistics
Figure 1.1 Location of the pollution monitoring stations in Madrid and map of predicted NOx levels (10 pm; average of the week days; 50th week of 2008) using geostatistical techniques.
Figure 1.2 Percentage of households with problems of pollution and odors in Madrid, Spain, 2001 (census tracts).
Figure 1.3 Fires in Castilla-La Mancha, Spain, 1998.
Figure 1.4 Location of the monitoring stations in the city of Madrid.
Chapter 2: Geostatistics: preliminaries
Figure 2.1 Simulation of a regionalized variable.
Figure 2.2 Four pairs of points separated by a distance h in a 2D domain.
Figure 2.3 Stationary and intrinsic hypotheses.
Figure 2.4 Top panel: Realization of a Wiener-Levy process. Bottom panel: First-order increments of the realization of the above Wiener-Levy process.
Chapter 3: Structural analysis
Figure 3.1 Spherical, exponential, and Gaussian covariance models with and different values of the scale parameter.
Figure 3.2 Spherical, exponential, and Gaussian models with and .
Figure 3.3 Bounded semivariogram and its covariogram counterpart.
Figure 3.4 Simulations of a rf having semivariograms that only differ in the range: (a) ; (b) ; (c) .
Figure 3.5 2D representation of simulations of two rf's with a semivariogram that only differs in the behavior near the origin: (a) linear, (b) parabolic.
Figure 3.6 Simulated fields of values using a semivariogram with scale parameter : (a) nugget effect = 0; (b) nugget effect = 0.25.
Figure 3.7 Nested semivariogram.
Figure 3.8 Upper panel: Spherical model. Left: . Right: . Middle panel: Simulation of a rf having a spherical semivariogram (2D representation). Left: . Right: . Bottom panel: Simulation of a rf having a spherical semivariogram (3D representation). Left: . Right: .
Figure 3.9 Left: Pure nugget semivariogram. Right: Simulation of a non-spatially correlated rf (2D representation).
Figure 3.10 Upper panel: Exponential model. Left: . Right: . Middle panel: Simulation of a rf having an exponential semivariogram (2D representation). Left: . Right: . Bottom panel: Simulation of a rf having an exponential semivariogram (3D representation). Left: . Right: .
Figure 3.11 Upper panel: Gaussian model. Left: . Right: . Middle panel: Simulation of a rf having a Gaussian semivariogram (2D representation). Left: . Right: . Bottom panel: Simulation of a rf having a Gaussian semivariogram (3D representation). Left: . Right: .
Figure 3.12 Cubic model with different ranges and the same sill ().
Figure 3.13 Simulation of a rf with cubic semivariogram model (3D representation). Left: . Right: .
Figure 3.14 Left: 3D representation of a simulation of a rf having a Gaussian semivariogram with . Right: 3D representation of a simulation of a rf having a cubic semivariogram with .
Figure 3.15 Upper panel, left: Stable model with the same sill () and scale parameter () but different shape parameter . Upper panel, right: 3D representation of a simulation of a rf having a stable semivariogram (). Bottom panel, left: 3D representation of a simulation of a rf having a stable semivariogram (). Bottom panel, right: 3D representation of a simulation of a rf having a stable semivariogram ().
Figure 3.16 Cauchy models with the same sill () and scale parameter () but different shape parameter .
Figure 3.17 K-Bessel model with the same scale parameter () and the same sill () but different shape parameter .
Figure 3.18 Cardinal sine models with the same sill () and different values of the scale parameter: (a) plot of the models; (b), (c) and (d) simulation of a rf having a cardinal sine model with , respectively (2D representation).
Figure 3.19 Power models.
Figure 3.20 Linear model.
Figure 3.21 Logarithmic model ().
Figure 3.22 Nested model composed of: (a) Pure nugget semivariogram (); (b) Spherical semivariogram (); (c) Spherical semivariogram (); (d) Nested semivariogram (3.40).
Figure 3.23 Tolerance region on .
Figure 3.24 Effect of the tolerance angle on a North-South empirical semivariogram. Tolerance angle: (a) , (b) , (c) , (d) . The observed regionalization was simulated with a spherical model ().
Figure 3.25 Twenty-five observed values in a grid.
Figure 3.26 Left: Empirical semivariogram (classic estimator). Right: Semivariogram cloud.
Figure 3.27 Observed values of logCO* at the 23 monitoring stations operating in Madrid, week 50, 10 pm. Left panel: 2D representation (black: higher values; white: lower values). Right panel: 3D representation.
Figure 3.28 Data on carbon monoxide in Madrid, week 50, 10 pm: (a) Classical empirical semivariogram; (b) Semivariogram cloud.
Figure 3.29 Observed points and data values.
Figure 3.30 Observed points and data values (without the outlier).
Figure 3.31 Left panel: Geometric anisotropy. Right panel: Zonal anisotropy.
Figure 3.32 Simulation of two rf's. Left panel: The geometric anisotropy case; Right panel: The zonal anisotropy case.
Figure 3.33 Simulation of an isotropic rf.
Figure 3.34 Semivariogram maps: (a) The isotropic case (circular contours); (b) The anisotropic case (elliptic contours, ). The axes depict lag distances in the corresponding coordinate system.
Figure 3.35 3D representation of zonal anisotropy. Left panel: Pure zonal anisotropy in vertical direction. Right panel: Directional semivariograms in horizontal directions (), vertical () and in an intermediate direction (). Source: Emery (2000, p. 111). Reproduced with permission of Xavier Emery.
Figure 3.36 Spherical models resulting from the automatic fitting (data on carbon monoxide in Madrid, week 50, 10 pm).
Chapter 4: Spatial prediction and kriging
Figure 4.1 Location of eight observation points used for prediction at the non-observed point .
Figure 4.2 New location of the prediction point .
Figure 4.3 Prediction and prediction standard deviation (SD) maps of logCO*: January 2008, 2nd week, 10 am.
Figure 4.4 Prediction and prediction standard deviation (SD) maps of logCO*: January 2008, 2nd week, 3 pm.
Figure 4.5 Prediction and prediction standard deviation (SD) maps of logCO*: January 2008, 2nd week, 9 pm.
Figure 4.6 Six points, , discretizing the block V, and an observed point, .
Figure 4.7 Location of seven observation points used for prediction over the block .
Figure 4.8 Six points, discretizing , and six points, discretizing .
Figure 4.9 Coal-ash data. Location (reoriented) and observed values.
Figure 4.10 Coal-ash data. 3D scatterplot.
Figure 4.11 Contour plot of coal-ash percentages.
Figure 4.12 Coal-ash percentages surface interpolation (via triangulation).
Figure 4.13 Coal-ash percentages: Column and row summaries.
Figure 4.14 Prediction point.
Figure 4.15 Coal-ash percentages: Median-polish residuals.
Figure 4.16 Coal-ash data: Original data, median-polish drift and median-polish residuals (the lighter the color, the higher the value).
Figure 4.17 Coal-ash residuals: Classical semivariogram.
Figure 4.18 Exponential, spherical, cubic, and Gaussian models resulting from the WLS fitting (data on carbon monoxide in Madrid, week 50, 10 pm).
Chapter 5: Geostatistics and spatio-temporal random functions
Figure 5.1 A spatio-temporal dataset on : 7 spatial locations observed at 3 moments in time (adapted from Luo 1998).
Figure 5.2 (a) Three pairs of spatio-temporal locations with the same . (b) Three pairs of spatio-temporal locations with the same . (c) Two sets of three pairs of variables, with both the same and also for the three pairs of each set.
Figure 5.3 (a) Two pairs of spatio-temporal locations with the same covariance under the assumption of full symmetry. (b) Five spatio-temporal locations such that, under the hypotheses of stationarity and full symmetry, the covariance between the peripheral locations and the central one is the same.
Figure 5.4 Regularly spaced grid.
Figure 5.5 Full symmetry.
Figure 5.6 Relationships between the different types of spatio-temporal covariance functions (adapted from Gneiting et al. 2007).
Chapter 6: Spatio-temporal structural analysis (I): empirical semivariogram and covariogram estimation and model fitting
Figure 6.1 Regular 500 meters spaced grid.
Figure 6.2 Pairs of points separated by a spatio-temporal distance of (1500, 2).
Figure 6.3 1200 data simulated with the Gneiting non-separable covariance function (6.3) at 200 points irregularly spaced on a grid at 6 instants of time.
Figure 6.4 The empirical spatio-temporal semivariogram. Upper panel: The point plot version. Bottom panel: The smoothed version.
Figure 6.5 Empirical purely spatial (left panel) and temporal (right panel) semivariograms.
Figure 6.6 300 spatio-temporal data simulated with a spatio-temporal Gaussian rf with zero mean and doubly exponential covariance function (6.9) in 50 irregularly spaced sites on a grid at 6 instants of time.
Figure 6.7 Empirical spatio-temporal semivariogram corresponding to the data in Figure 6.6 (upper panel) and the fitted semivariogram corresponding to the doubly exponential covariance function (bottom panel).
Chapter 7: Spatio-temporal structural analysis (II): theoretical covariance models
Figure 7.1 Representation of a spatio-temporal distance in the metric model.
Figure 7.2 2D and 3D different representations of the exponential metric model (7.6).
Figure 7.3 2D and 3D different representations of the exponential sum model (7.13).
Figure 7.4 2D and 3D representations of the exponential metric-sum model (7.17). Upper panel: Covariance function; Bottom panel: Semivariogram.
Figure 7.5 3D representation of the product model (7.27) showing the spatial (left panel) and temporal (right panel) margins as well as the covariance functions for different spatial distances and time lags.
Figure 7.6 2D and 3D different representations of the product model (7.27).
Figure 7.7 2D and 3D different representations of the product-sum model (7.35).
Figure 7.8 2D and 3D different representations of the model based on mixtures (7.37).
Figure 7.9 2D and 3D different representations of the family of covariance functions (7.76).
Figure 7.10 2D and 3D different representations of the family of covariance functions (7.81).
Figure 7.11 Representation of the generalized product-sum model (7.138) with negative covariances.
Figure 7.12 2D and 3D different representations of the generalized product-sum model (7.138).
Chapter 8: Spatio-temporal prediction and kriging
Figure 8.1 Location of the 20 sites observed.
Figure 8.2 Spatio-temporal SK prediction maps (left panel) and prediction variance maps (right panel): 1-day, 2-day and 3-day time horizons.
Chapter 9: An introduction to functional geostatistics
Figure 9.1 Raw daily temperature data (left panel) and their corresponding functional data (right panel) for 35 Canadian monitoring stations (data are averaged over 1960 and 1994).
Figure 9.2 Mean and standard deviation function for functional data obtained in Example 9.1.
Figure 9.3 Covariance function across locations for functional data obtained in Example 9.1.
Figure 9.4 Correlation function across locations for functional data obtained in Example 9.1.
Figure 9.5 Functional data in a set of locations.
Figure 9.6 Location (central panel) and pictures of the prediction sites: (A) Plaza Cibeles, (B) Plaza de Callao, (C) Plaza Carlos V (Atocha), and (D) Puerta del Sol.
Figure 9.7 Contour plot for $\mathrm{MSE}_{\mathrm{FCV}}$.
Figure 9.8 Original (top panel) and functional (bottom panel) data of for the 23 monitoring stations operating in the city of Madrid in 2008.
Figure 9.9 Functional residuals obtained through functional cross-validation. Mean function (continuous black line) and standard deviation function of residuals (dotted black line).
Figure 9.10 Predicted curves of at (A) Plaza Cibeles, (B) Plaza de Callao, (C) Plaza Carlos V (Atocha), and (D) Puerta del Sol (bold lines) together with the observed functional data at the 23 monitoring stations.
Chapter 3: Structural analysis
Table 3.1
Table 3.2
Table 3.3
Table 3.4
Table 3.5
Table 3.6
Table 3.7
Table 3.8
Chapter 4: Spatial prediction and kriging
Table 4.1
Table 4.2
Table 4.3
Table 4.4
Table 4.5
Table 4.6
Table 4.7
Table 4.8
Table 4.9
Table 4.10
Table 4.11
Table 4.12
Table 4.13
Table 4.14
Table 4.15
Table 4.16
Table 4.17
Table 4.18
Table 4.19
Table 4.20
Table 4.21
Table 4.22
Chapter 5: Geostatistics and spatio-temporal random functions
Table 5.1
Chapter 6: Spatio-temporal structural analysis (I): empirical semivariogram and covariogram estimation and model fitting
Table 6.1
Table 6.2
Chapter 7: Spatio-temporal structural analysis (II): theoretical covariance models
Table 7.1 Some completely monotone functions .
Table 7.2 Some Bernstein functions: positive-definite functions , with completely monotonic derivatives (the functions have been standardized so that ).
Table 7.3 Values and for in the case in which Gaussian and Matérn functions have been combined in accordance with their parameters.
Table 7.4 Examples of quasi-arithmetic means.
Chapter 8: Spatio-temporal prediction and kriging
Table 8.1
Table 8.2
Chapter 9: An introduction to functional geostatistics
Table 9.1
Established by WALTER A. SHEWHART and SAMUEL S. WILKS
Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Geof H. Givens, Harvey Goldstein, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay, Sanford Weisberg
Editors Emeriti: J. Stuart Hunter, Iain M. Johnstone, Joseph B. Kadane, Jozef L. Teugels
A complete list of the titles in this series appears at the end of this volume.
José-María Montero and Gema Fernández-Avilés
University of Castilla-La Mancha, Spain
Jorge Mateu
University Jaume I of Castellón, Spain
This edition first published 2015
© 2015 John Wiley & Sons, Ltd
Registered office
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.
The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloging-in-Publication Data
Montero, José María.
Spatial and spatio-temporal geostatistical modeling and kriging / José María Montero, Department of Statistics, University of Castilla-La Mancha, Spain, Gema Fernández-Avilés, Department of Statistics, University of Castilla-La Mancha, Spain, Jorge Mateu, Department of Mathematics, University Jaume I of Castellon, Spain.
pages cm
Includes bibliographical references and index.
ISBN 978-1-118-41318-0 (pbk.)
1. Geology–Statistical methods. 2. Kriging. I. Fernández-Avilés, Gema. II. Mateu, Jorge. III. Title.
QE33.2.S82M66 2015
551.01'5195–dc23
2015015489
To my parents Pepe and Maruja, my wife Gema, my children Beatriz and José, my sister Nines and my niece Eva…and also to Fito and Atlético de Madrid.
They never let me walk alone.
José María Montero
To my parents, Jose and Juli, and to my husband, Chema. Thanks for loving me.
Gema Fernández-Avilés
To my sons and daughters, you are my life motif
Jorge Mateu
It is a great pleasure to write the Foreword for this important book on space-time modeling. I have known the authors for many years and am very familiar with their significant contributions to the subject. I consider the book to be an important addition to the current knowledge about space-time models.
The authors have written an excellent book on a topic of immense interest to a broad class of researchers, students, scientists and decision-makers. The book presents the current state of the art on modeling spatial/temporal complexity in a clear and accurate manner with illustrations using both artificial and real data examples that will help the readers to understand the steps required to build and assess spatial/temporal models. Although many of the examples discussed are related to pollution and the environment, the techniques presented are sufficiently general and can be easily used in other areas of investigation.
Clearly most phenomena encountered in real life include some aspects of space and time that need to be taken into account in the modeling process. The period since 1970 has been a time of spectacular growth in our understanding and ability to develop “valid” sophisticated spatial/temporal models to understand the changes in large systems; examples are climate change and long-range transport of air pollution and acid rain in the eastern part of North America and the Scandinavian countries. This growth can in large part be attributed to one factor: the exponential growth in our ability to both compute and measure data sets. We now have more powerful computers that allow us to store and analyze large data sets with relative ease and advanced instruments for measuring. The book shows us how to use these facilities effectively to develop, fit and assess models for spatio-temporal processes. The primary reason for working in this area is to be able to predict unknown values at unmeasured locations and at future times. Such models can thus be used to produce maps and to identify regions (problem areas) in the domain of interest where, for example, the level of pollution exceeds the permissible level and thus could be of importance to human or ecosystem health. The book describes how to include in the modeling processes the two components: a systematic component with available explanatory variables and the spatial and temporal correlation component, and how the two components interact to produce reliable forecasts.
The book consists of nine chapters which can be grouped into four parts. The first part consists of the first two chapters and is devoted to providing an excellent introduction to the nature and type of spatial data, along with the associated probabilistic and statistical terminology about random functions, stationarity, etc. The second part, consisting of Chapters 3 and 4, involves the development of valid covariance and semivariogram models in two-dimensional and three-dimensional space and explains the ways to fit these models to the corresponding empirical counterparts. This part covers the use of kriging equations to predict the values of the random function at non-observed points and blocks. A survey of different types of kriging is presented, including simple kriging, ordinary kriging, universal kriging, direct residual kriging, iterative residual kriging, modified iterative residual kriging, median-polish kriging, and non-linear kriging (disjunctive kriging and indicator kriging). In addition, plenty of numerical examples are used to illustrate the step-by-step process of implementing the techniques covered.
The third part, Chapters 5–8, provides a natural extension of the methods in the second part to the case when the process evolves over time. Again, the approach starts with the empirical estimation of the space-time variogram and covariogram; appropriate valid covariance models are then fitted to these empirical estimates. The final outcome is an adequate model that will be used for predictions. One important aspect of this part is the presentation of an extensive survey of available valid covariance functions. This will be of interest to researchers in the area of spatio-temporal models because of the particularly detailed proofs of the theoretical work given in Appendix D.
The final part, Chapter 9, provides a brief discussion of functional geostatistics. This is an area of immense interest and is currently under extensive development. This chapter provides a good starting point for research in this area.
The material in the book can be used for teaching a course on spatial statistics to graduate students or senior undergraduate students. It is also an excellent reference book for researchers.
Abdel H. El-Shaarawi
National Water Research Institute
Burlington, Ontario, Canada
The American University in Cairo
Cairo, Egypt
One of the areas that constantly presents stimulating research problems is spatial and spatio-temporal statistics. Most of the studies in environmental sciences, agriculture, natural resources and ecology involve data collected at various spatial locations and over a certain period of time. Technological advances make it possible to collect large-scale and massive amounts of data. Societal concerns about climate change and environmental issues also contribute to the increased interest in spatio-temporal statistics.
One distinctive feature of this kind of data is that they are correlated or dependent. This dependence should be included in the statistical modeling, and it is modeled through the covariance function or covariogram of a Gaussian stochastic process.
When I explain what spatial or spatio-temporal correlation means in my spatial statistics course, students often ask why there is spatial or temporal dependence. I often give a simplistic answer that the exact reasons are not known but it could be due to the unknown or unobserved variables. For example, despite our best effort in modeling the mean by incorporating all available observed explanatory variables, the residuals may still show spatial correlation. In this case, modeling the covariance appropriately may improve the efficiency of the estimation of the mean, and offset the effects of the unobserved variables that may affect the mean. In addition, the covariance function is essential for the prediction of a value at an unobserved location or time. Indeed, when making a prediction, there is an advantage in keeping the mean dependent on as few variables as possible.
This book provides comprehensive coverage of covariance functions, some of which are relatively new. It includes details and some new material that are hard to find in any other existing book. It is a very welcome addition to the current literature in spatio-temporal statistics. I trust readers will find it a very valuable resource.
Hao Zhang
Purdue University
West Lafayette, IN 47906
USA
1.1
Location of the pollution monitoring stations in Madrid and map of predicted NOx levels (10 pm; average of the week days;
50
th week of
2008
) using geostatistical techniques.
1.2
Percentage of households with problems of pollution and odors in Madrid, Spain, 2001 (census tracts).
1.3
Fires in Castilla-La Mancha, Spain, 1998.
1.4
Location of the monitoring stations in the city of Madrid.
2.1
Simulation of a regionalized variable.
2.2
Four pairs of points separated by a distance
h
in a 2D domain.
2.3
Stationary and intrinsic hypotheses.
2.4
Top panel
: Realization of a Wiener-Levy process.
Bottom panel
: First-order increments of the realization of the above Wiener-Levy process.
3.1
Spherical, exponential, and Gaussian covariance models with
and different values of the scale parameter.
3.2
Spherical, exponential, and Gaussian models with
and
.
3.3
Bounded semivariogram and its covariogram counterpart.
3.4
Simulations of a rf having semivariograms that only differ in the range: (a)
; (b)
; (c)
.
3.5
2D representation of simulations of two rf's with a semivariogram that only differs in the behavior near the origin: (a) linear, (b) parabolic.
3.6
Simulated fields of values using a semivariogram with scale parameter
: (a) nugget effect =
0
; (b) nugget effect =
0
.
25
.
3.7
Nested semivariogram.
3.8
Upper panel: Spherical model (left and right). Middle panel: Simulation of a rf having a spherical semivariogram (2D representation; left and right). Bottom panel: Simulation of a rf having a spherical semivariogram (3D representation; left and right).
3.9
Left: Pure nugget semivariogram. Right: Simulation of a non-spatially correlated rf (2D representation).
3.10
Upper panel: Exponential model (left and right). Middle panel: Simulation of a rf having an exponential semivariogram (2D representation; left and right). Bottom panel: Simulation of a rf having an exponential semivariogram (3D representation; left and right).
3.11
Upper panel: Gaussian model (left and right). Middle panel: Simulation of a rf having a Gaussian semivariogram (2D representation; left and right). Bottom panel: Simulation of a rf having a Gaussian semivariogram (3D representation; left and right).
3.12
Cubic model with different ranges and the same sill.
3.13
Simulation of a rf with cubic semivariogram model (3D representation; left and right).
3.14
Left: 3D representation of a simulation of a rf having a Gaussian semivariogram. Right: 3D representation of a simulation of a rf having a cubic semivariogram.
3.15
Upper panel, left: Stable model with the same sill and scale parameter but different shape parameter. Upper panel, right, and bottom panel, left and right: 3D representations of simulations of a rf having a stable semivariogram.
3.16
Cauchy models with the same sill and scale parameter but different shape parameter.
3.17
K-Bessel model with the same scale parameter and the same sill but different shape parameter.
3.18
Cardinal sine models with the same sill and different values of the scale parameter: (a) plot of the models; (b), (c) and (d) simulation of a rf having a cardinal sine model with each of these values, respectively (2D representation).
3.19
Power models.
3.20
Linear model.
3.21
Logarithmic model.
3.22
Nested model composed of: (a) Pure nugget semivariogram; (b) Spherical semivariogram; (c) Spherical semivariogram; (d) Nested semivariogram (3.40).
3.23
Tolerance region.
3.24
Effect of the tolerance angle on a North-South empirical semivariogram for four tolerance angles, panels (a) to (d). The observed regionalization was simulated with a spherical model.
3.25
Twenty-five observed values in a grid.
3.26
Left: Empirical semivariogram (classic estimator). Right: Semivariogram cloud.
3.27
Observed values of logCO* at the 23 monitoring stations operating in Madrid, week 50, 10 pm. Left panel: 2D representation (black: higher values; white: lower values). Right panel: 3D representation.
3.28
Data on carbon monoxide in Madrid, week 50, 10 pm: (a) Classical empirical semivariogram; (b) Semivariogram cloud.
3.29
Observed points and data values.
3.30
Observed points and data values (without the outlier).
3.31
Left panel: Geometric anisotropy. Right panel: Zonal anisotropy.
3.32
Simulation of two rf's. Left panel: The geometric anisotropy case; Right panel: The zonal anisotropy case.
3.33
Simulation of an isotropic rf.
3.34
Semivariogram maps: (a) The isotropic case (circular contours); (b) The anisotropic case (elliptic contours). The axes depict lag distances in the corresponding coordinate system.
3.35
3D representation of zonal anisotropy. Left panel: Pure zonal anisotropy in vertical direction. Right panel: Directional semivariograms in the horizontal directions, in the vertical direction, and in an intermediate direction. Source: Emery (2000, p. 111). Reproduced with permission of Xavier Emery.
3.36
Spherical models resulting from the automatic fitting (data on carbon monoxide in Madrid, week 50, 10 pm).
4.1
Location of eight observation points used for prediction at the non-observed point.
4.2
New location of the prediction point.
4.3
Prediction and prediction standard deviation (SD) maps of logCO*: January 2008, 2nd week, 10 am.
4.4
Prediction and prediction standard deviation (SD) maps of logCO*: January 2008, 2nd week, 3 pm.
4.5
Prediction and prediction standard deviation (SD) maps of logCO*: January 2008, 2nd week, 9 pm.
4.6
Six points discretizing the block V, and an observed point.
4.7
Location of seven observation points used for prediction over the block.
4.8
Six points discretizing one block, and six points discretizing a second block.
4.9
Coal-ash data. Location (reoriented) and observed values.
4.10
Coal-ash data. 3D scatterplot.
4.11
Contour plot of coal-ash percentages.
4.12
Coal-ash percentages surface interpolation (via triangulation).
4.13
Coal-ash percentages: Column and row summaries.
4.14
Prediction point.
4.15
Coal-ash percentages: Median-polish residuals.
4.16
Coal-ash data: Original data, median-polish drift and median-polish residuals (the lighter the color, the higher the value).
4.17
Coal-ash residuals: Classical semivariogram.
4.18
Exponential, spherical, cubic, and Gaussian models resulting from the WLS fitting. (Data on carbon monoxide in Madrid, week 50, 10 pm).
5.1
A spatio-temporal dataset: 7 spatial locations observed at 3 moments in time (adapted from Luo 1998).
5.2
(a) Three pairs of spatio-temporal locations with the same . (b) Three pairs of spatio-temporal locations with the same . (c) Two sets of three pairs of variables, with both the same and also for the three pairs of each set.
5.3
(a) Two pairs of spatio-temporal locations with the same covariance under the assumption of full symmetry. (b) Five spatio-temporal locations such that, under the hypotheses of stationarity and full symmetry, the covariance between the peripheral locations and the central one is the same.
5.4
Regularly spaced grid.
5.5
Full symmetry.
5.6
Relationships between the different types of spatio-temporal covariance functions (adapted from Gneiting et al. 2007).
6.1
Regular grid with 500-meter spacing.
6.2
Pairs of points separated by a spatio-temporal distance of (1500, 2).
6.3
1200 data simulated with the Gneiting non-separable covariance function (6.3) at 200 points irregularly spaced on a grid at 6 instants of time.
6.4
The empirical spatio-temporal semivariogram. Upper panel: The point plot version. Bottom panel: The smoothed version.
6.5
Empirical purely spatial (left panel) and temporal (right panel) semivariograms.
6.6
300 spatio-temporal data simulated with a spatio-temporal Gaussian rf with zero mean and doubly exponential covariance function (6.9) in 50 irregularly spaced sites on a grid at 6 instants of time.
6.7
Empirical spatio-temporal semivariogram corresponding to the data in Figure 6.6 (upper panel) and the fitted semivariogram corresponding to the doubly exponential covariance function (bottom panel).
7.1
Representation of a spatio-temporal distance in the metric model.
7.2
2D and 3D different representations of the exponential metric model (7.6).
7.3
2D and 3D different representations of the exponential sum model (7.13).
7.4
2D and 3D representations of the exponential metric-sum model (7.17). Upper panel: Covariance function; Bottom panel: Semivariogram.
7.5
3D representation of the product model (7.27) showing the spatial (left panel) and temporal (right panel) margins as well as the covariance functions for different spatial distances and time lags.
7.6
2D and 3D different representations of the product model (7.27).
7.7
2D and 3D different representations of the product-sum model (7.35).
7.8
2D and 3D different representations of the model based on mixtures (7.37).
7.9
2D and 3D different representations of the family of covariance functions (7.76).
7.10
2D and 3D different representations of the family of covariance functions (7.81).
7.11
Representation of the generalized product-sum model (7.138) with negative covariances.
7.12
2D and 3D different representations of the generalized product-sum model (7.138).
8.1
Location of the 20 sites observed.
8.2
Spatio-temporal SK prediction maps (left panel) and prediction variance maps (right panel): 1-day, 2-day and 3-day time horizons.
9.1
Raw daily temperature data (left panel) and their corresponding functional data (right panel) for 35 Canadian monitoring stations (data are averaged over 1960 to 1994).
9.2
Mean and standard deviation function for functional data obtained in Example 9.1.
9.3
Covariance function across locations for functional data obtained in Example 9.1.
9.4
Correlation function across locations for functional data obtained in Example 9.1.
9.5
Functional data in a set of locations.
9.6
Location (central panel) and pictures of the prediction sites: (A) Plaza Cibeles, (B) Plaza de Callao, (C) Plaza Carlos V (Atocha), and (D) Puerta del Sol.
9.7
Contour plot for MSE_FCV.
9.8
Original (top panel) and functional (bottom panel) data for the 23 monitoring stations operating in the city of Madrid in 2008.
9.9
Functional residuals obtained through functional cross-validation. Mean function (continuous black line) and standard deviation function of residuals (dotted black line).
9.10
Predicted curves at (A) Plaza Cibeles, (B) Plaza de Callao, (C) Plaza Carlos V (Atocha), and (D) Puerta del Sol (bold lines) together with the observed functional data at the 23 monitoring stations.
3.1
Coordinates and value of the observed data.
3.2
Empirical semivariogram values together with the variance of half the squared semidifferences used to compute them.
3.3
Data on carbon monoxide in Madrid, week 50, 10 pm: Empirical semivariogram values for each lag distance together with the variance of half the squared semidifferences used to compute them.
3.4
Empirical semivariogram values (MoM estimator).
3.5
Empirical semivariogram values (Cressie-Hawkins estimator).
3.6
Empirical semivariogram values after removing the outlier (MoM estimator).
3.7
Empirical semivariogram values after removing the outlier (Cressie-Hawkins estimator).
3.8
Estimates of the parameters of a spherical model (data on carbon monoxide in Madrid, week 50, 10 pm).
4.1
Coordinates and data values of observed points. Coordinates of prediction point.
4.2
Inter-point distances.
4.3
Semivariogram values.
4.4
OK weights, prediction and OK variance.
4.5
OK weights, prediction and OK variance.
4.6
OK weights, prediction and OK variance.
4.7
OK weights, prediction and OK variance.
4.8
Semivariogram models representing the spatial dependencies of logCO*.
4.9
Point-to-block distances and semivariogram values.
4.10
Distances between the points discretizing the block V.
4.11
Semivariogram values for the distances between the points discretizing the block V.
4.12
Coordinates of the observed points (s_i, i = 1, ..., 7) and the points discretizing V(s′) (s′_i, i = 1, ..., 6).
4.13
Inter-point distances. Observation points.
4.14
Distances between the observation points and the points discretizing V(s′).
4.15
Semivariogram values for distances between the observation points and the points discretizing V(s′).
4.16
Distances between s′_i and s′_j.
4.17
Semivariogram values for distances between s′_i and s′_j.
4.18
Distances between the points discretizing v_i(s) and v_j(s′).
4.19
Semivariogram values for distances between the points discretizing the two blocks (columns 1–6). Further values are shown in the last column and in the last row.
4.20
Row effects.
4.21
Column effects.
4.22
ME, MSE and MSDE for five semivariogram models. Ordinary kriging of logCO*.
5.1
Full symmetry.
6.1
Semivariogram values for the combinations of the 13 spatial and 5 temporal bins.
6.2
Number of pairs used for computing the semivariogram values listed in Table 6.1.
7.1
Some completely monotone functions.
7.2
Some Bernstein functions: positive-definite functions with completely monotonic derivatives (the functions have been standardized).
7.3
Values obtained in the case in which Gaussian and Matérn functions have been combined in accordance with their parameters.
7.4
Examples of quasi-arithmetic means.
8.1
Simulated spatio-temporal database.
8.2
ME, MSE and MSDE.
9.1
Raw dataset: Canadian average annual weather cycle.
This book's companion website, www.wiley.com/go/montero/spatial, provides you with:
The open-source code used throughout the book, written in the widely used free software R, together with practical guidance for the practitioner.
Use of the widely used R packages geoR, RandomFields, fields, gstat, CompRandFld, scatterplot3d, fda, and animation.
Worked-out examples with the code used for each chapter of the book.
Descriptive coded examples, including a number of attractive graphical outputs.
Spatio-temporal modelling code for variogram and covariance fitting in space and in space-time, simulations and predictions.
As pointed out in Schabenberger and Gotway (2005, p. 6), because spatial data arise in a myriad of fields and applications, there is also a myriad of spatial data types, structures and scenarios. Thus, an exhaustive classification of spatial data would be a very difficult challenge and this is why we have opted for embracing the general, simple and useful classification of spatial data provided by Cressie (1993, pp. 8–13). Cressie's classification of spatial data is based on the nature of the spatial domain under study. Depending on this, we can have: geostatistical data, lattice data and point patterns.
Following Cressie (1993), let s be a generic location in a d-dimensional Euclidean space and Z(s) be a spatial random function, Z denoting the attribute we are interested in.
Geostatistical data arise when the domain under study is a fixed set D that is continuous. That is: (i) Z(s) can be observed at any point of the domain (continuous); and (ii) the points in D are non-stochastic (fixed, D is the same for all the realizations of the spatial random function). From (i) it can be easily seen that geostatistical data are identified with spatial data with a continuous variation (the spatial process is indexed over a continuous space).
Some examples of geostatistical data are the level of a pollutant in a city, the precipitation or air temperature values in a country, and the concentrations of heavy metals in the topsoil of a region. At least theoretically, the level of a specific pollutant could be measured at any location of the city; the same can be said for measurements of precipitation or air temperature across a country or concentrations of a heavy metal across a region. In practice, however, an exhaustive observation of the spatial process is not possible. Usually, the spatial process is observed at a set of locations (for example, the level of a specific pollutant in a city is observed at the points where the monitoring stations are located) and, based on such observed values, geostatistical analysis reproduces the behavior of the spatial process across the entire domain of interest. Sometimes the goal is less ambitious: prediction at one or a few non-observed points, or estimation of an average value over small areas or over the whole area under study. In geostatistical analysis the most important thing is to quantify the spatial correlation between observations (through the basic tool in geostatistics, the semivariogram) and use this information to achieve the above goals.
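As a minimal illustration of how spatial correlation between observations can be quantified, the sketch below computes the classical (method-of-moments) empirical semivariogram, that is, half the average squared difference between pairs of observations grouped by separation distance. The coordinates, values and bin width are hypothetical, the function name is ours, and Python is used only for compactness; this is not the R code of the book.

```python
import math
from collections import defaultdict

def empirical_semivariogram(coords, values, bin_width):
    """Classical (method-of-moments) estimator, binned by distance.

    Returns {bin_index: (mean_distance, gamma_hat, n_pairs)}.
    """
    # Per bin: [sum of distances, sum of squared differences, pair count]
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])
            acc = sums[int(h // bin_width)]
            acc[0] += h
            acc[1] += (values[i] - values[j]) ** 2
            acc[2] += 1
    return {k: (d / m, sq / (2 * m), m) for k, (d, sq, m) in sorted(sums.items())}

# Hypothetical data: five stations on a transect, values drifting smoothly.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
values = [1.0, 1.2, 1.5, 1.9, 2.4]

gamma = empirical_semivariogram(coords, values, bin_width=1.0)
for k, (h, g, m) in gamma.items():
    print(f"lag ~{h:.1f}: gamma = {g:.4f} ({m} pairs)")
```

For these made-up values the estimated semivariogram grows with the lag, which is the signature of nearby observations being more alike than distant ones.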
Figure 1.1 depicts the locations where the main pollutants are measured in Madrid, Spain (the location of the monitoring stations), along with the mapping of the level of nitrogen oxide (NOx) for the whole city (average of the NOx levels at 10 pm in the week days of the 50th week of 2008).
Figure 1.1 Location of the pollution monitoring stations in Madrid and map of predicted NOx levels (10 pm; average of the week days; 50th week of 2008) using geostatistical techniques.
The fact that the attribute of interest is continuous or discrete has nothing to do with the data being geostatistical or not. Also, how observation points are selected (according to our convenience, using a monitoring network, using a probabilistic sampling scheme ...) has nothing to do with the data being geostatistical or not.
Lattice data arise when: (i) the domain under study D is discrete, that is, Z(s) can be observed at a number of fixed locations that can be enumerated; and (ii) the locations in D are non-stochastic. These locations can be points or regions, but they are usually ZIP codes, census tracts, neighborhoods, provinces, etc., and the data in most cases are spatially aggregated over these areal regions. Although these regions can be regularly shaped, usually their shape is irregular, and this, together with the spatially aggregated character of the data, is why lattice data are also called regional data. A core concept in lattice data analysis is the neighborhood. Some examples of lattice data include the unemployment rate by states, crime data by counties, agricultural yields in plots, average housing prices by provinces, etc. Unlike geostatistical data, lattice data can be exhaustively observed and in this case prediction makes no sense. However, smoothing and clustering acquire special importance when dealing with this type of spatial data. As with geostatistical data, the response measured can be discrete or continuous, and this has nothing to do with the data being lattice data or not.
Figure 1.2 depicts the percentage of households with problems of pollution and odors in each census tract of Madrid, Spain, in 2001. As can be observed, the attribute under study is aggregated over each census tract, the domain is the set of the 128 census tracts (discrete) and sites in the domain (the census tracts) are fixed (non-stochastic).
Figure 1.2 Percentage of households with problems of pollution and odors in Madrid, Spain, 2001 (census tracts).
While in both geostatistical and lattice data the domain D is fixed, in point pattern data it is discrete or continuous, but random. Point patterns arise when the attribute under study is the location of events (observations). That is, the interest lies in where events of interest occur. Some examples of point patterns are the location of fires in Castilla-La Mancha, a Spanish region (see Figure 1.3), the location of trees in a forest or the location of nests in a breeding colony of birds, among many others. In these cases, it is obvious that D is random and the observation points do not depend on the researcher. The realizations of spatial point processes are arrangements or patterns of points and we can observe all the points of such patterns or a sample. The main goal of point pattern analysis is to determine if the location of events tends to exhibit a systematic pattern over the area under study or, on the contrary, they are randomly distributed. More specifically, we are interested in analyzing if the location of events is completely random spatially (the location where events occur is not affected by the location of other events), uniform or regular (every point is as far from all of its neighbors as possible) or clustered or aggregated (the location of events is concentrated in clusters). Some other interesting questions in point pattern analysis include: How does the intensity of a point pattern vary over an area? Over what spatial scales do patterns exist? If along with the location of events a stochastic attribute is observed, the pattern is called a marked pattern; otherwise it will be named an unmarked pattern. Obviously, marked patterns extend the possibilities of spatial analysis.
Figure 1.3 Fires in Castilla-La Mancha, Spain, 1998.
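The distinction between random, regular and clustered patterns can be made concrete with a simple summary. The sketch below uses the Clark-Evans nearest-neighbor ratio, a technique we bring in for illustration and not one taken from the text: it divides the observed mean nearest-neighbor distance by its expected value under complete spatial randomness, 1/(2·sqrt(intensity)). The two point sets are hypothetical, the function name is ours, and edge effects are ignored.

```python
import math

def clark_evans_ratio(points, area):
    """Mean nearest-neighbor distance divided by its expected value
    under complete spatial randomness, 1 / (2 * sqrt(intensity)).
    Edge effects are ignored in this sketch."""
    n = len(points)
    nn_dists = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    observed = sum(nn_dists) / n
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected

# Hypothetical patterns in the unit square (area = 1).
regular = [((i + 0.5) / 5, (j + 0.5) / 5) for i in range(5) for j in range(5)]
clustered = [(0.5 + 0.01 * i, 0.5 + 0.01 * j) for i in range(5) for j in range(5)]

r_regular = clark_evans_ratio(regular, area=1.0)
r_clustered = clark_evans_ratio(clustered, area=1.0)
print(f"regular: {r_regular:.2f}, clustered: {r_clustered:.2f}")
```

Ratios close to 1 are compatible with complete spatial randomness, ratios above 1 point towards regularity and ratios below 1 towards clustering.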
The above refers to purely spatial data, but in recent years spatio-temporal data analysis has become a core research area in a large variety of scientific disciplines. In the spatio-temporal context, the observed data are viewed as partial realizations of a spatio-temporal random function which spreads out in space and evolves in time. Thus, spatio-temporal data simultaneously capture spatial and temporal aspects of data.
Recalling some of the examples used to illustrate the types of spatial data, if we observe every hour the level of a specific pollutant in a city at the points where the monitoring stations are located, we have a spatio-temporal geostatistical dataset. Now, based on the spatio-temporal observations, we aim to reproduce the behavior of the spatio-temporal pollution process, or simply predict its value at a space-time point. Geostatistics takes advantage of the spatio-temporal correlations existing in the spatio-temporal data (the interaction of space-time is crucial) to make predictions at unobserved space-time locations. If we annually record the percentage of households with problems of pollution and odors in the census tracts of Madrid, we have a collection of spatio-temporal lattice data. Now we can study how the spatial percentage pattern evolves in time. If we observe the location of fires in Castilla-La Mancha for the last ten years, we have a spatio-temporal point pattern database. Now we can express the relationship of points not only by distance but by time lag, and can study whether there is complete spatio-temporal randomness in the disposition of the space-time events, or they exhibit a spatio-temporal aggregation, or their spatio-temporal disposition is regular or uniform.
As is well known, classical statistics is based on the independence of the observed values. These observed values are considered as independent realizations of the same random variable. However, when the observed values are anchored in space, the hypothesis of independence is no longer acceptable. As stated in the First Law of Geography: “Everything is related to everything else, but near things are more related than distant things” (Tobler 1970).
In this section we illustrate some of the consequences of ignoring the spatial dependencies in the data and using classical statistical methods with spatial data. For the sake of simplicity, we will focus on the spatial case but the consequences of using classical statistics with spatio-temporal data are the same.
Suppose that $Z(s_1), \ldots, Z(s_n)$ are n identically distributed observations recorded at spatial points $s_1, \ldots, s_n$. More specifically, suppose that they follow a Gaussian distribution with unknown mean, $\mu$, and known variance, $\sigma^2$, and that covariances between the observed points are positive and diminish with the distance between them, h, according to a covariance function $C(h)$.
If our goal is the estimation of the unknown mean and we ignore the existing spatial correlations, the sample mean, $\bar{Z} = \frac{1}{n}\sum_{i=1}^{n} Z(s_i)$, would undoubtedly be the estimator proposed for $\mu$. Ignoring the spatial correlations, it is well known that $E(\bar{Z}) = \mu$ and $V(\bar{Z}) = \sigma^2/n$. But, if we consider such correlations, the sample mean continues to be an unbiased estimator of $\mu$ but now its variance is:
$$V(\bar{Z}) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} \mathrm{Cov}\big(Z(s_i), Z(s_j)\big) = \frac{\sigma^2}{n} + \frac{1}{n^2}\sum_{i \neq j} C\big(\lVert s_i - s_j \rVert\big) > \frac{\sigma^2}{n},$$
that is, the variance of the sample mean is larger than in the random sample case.
If we ignore the existing correlations in the data and use $\sigma^2/n$ as an estimator of $V(\bar{Z})$, the under-estimation of $V(\bar{Z})$ brings unfortunate consequences for inference about $\mu$. The classical confidence intervals for a specific confidence level will be narrower than they should be or, in other words, the nominal confidence level of the classical intervals is larger than their actual coverage. When testing, if spatial correlations are not taken into account, the p-values will be smaller than they really are and this will lead to undesirable rejections of the null hypothesis. In addition, the power of the tests will be overstated.
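If the number of observations is 16 and they are recorded over a regularly spaced grid we have, ignoring the spatial correlations between observations, $V(\bar{Z}) = \sigma^2/16$, so that the $100(1-\alpha)\%$ confidence interval for $\mu$ is $\bar{Z} \pm z_{\alpha/2}\,\sigma/4$ and the test-statistic for $H_0: \mu = \mu_0$ is $(\bar{Z} - \mu_0)/(\sigma/4)$.
When we take into account the spatial correlations between the observations, we obtain:
$V(\bar{Z}) \approx 2.52^2\,\sigma^2/16 \approx 6.35\,\sigma^2/16$.
$100(1-\alpha)\%$ confidence interval: $\bar{Z} \pm 2.52\,z_{\alpha/2}\,\sigma/4$.
Test-statistic for testing $H_0: \mu = \mu_0$: $(\bar{Z} - \mu_0)/(2.52\,\sigma/4)$.
As can be seen, when we take into account the spatial correlations, the variance of $\bar{Z}$ increases by a factor of more than six, the width of the confidence interval increases by a factor of 2.52 and the value of the test-statistic decreases by a factor of 2.52.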
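The effect described in this example can be checked numerically. The following is a minimal sketch, assuming 16 points on a 4 × 4 grid with unit spacing and an exponential covariance with a hypothetical range parameter. Both the covariance model and its parameter are our assumptions, so the resulting ratios need not coincide with the factor of 2.52 quoted above. The variance of the sample mean is obtained as the average of all pairwise covariances.

```python
import math

def variance_of_sample_mean(coords, sigma2, cov):
    """V(Z-bar) = (1/n^2) * sum_i sum_j Cov(Z(s_i), Z(s_j))."""
    n = len(coords)
    total = 0.0
    for si in coords:
        for sj in coords:
            h = math.dist(si, sj)
            # Diagonal terms contribute the variance itself.
            total += sigma2 if h == 0.0 else cov(h)
    return total / n**2

sigma2 = 1.0      # assumed known variance
range_par = 2.0   # hypothetical range parameter, in grid units

def exp_cov(h):
    # Assumed exponential covariance, positive and decreasing in h.
    return sigma2 * math.exp(-h / range_par)

# 16 observation points on a regularly spaced 4 x 4 grid.
grid = [(float(i), float(j)) for i in range(4) for j in range(4)]

v_indep = sigma2 / len(grid)                             # ignoring correlation
v_corr = variance_of_sample_mean(grid, sigma2, exp_cov)  # accounting for it
inflation = v_corr / v_indep

print(f"V ignoring correlation: {v_indep:.4f}")
print(f"V with correlation:     {v_corr:.4f}")
print(f"variance ratio: {inflation:.2f}, interval-width ratio: {math.sqrt(inflation):.2f}")
```

Whatever positive covariance function is plugged in, the variance ratio exceeds 1 and the interval-width ratio is its square root, which is the relationship between the two factors reported in the text.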
