Geostatistical Functional Data Analysis - Jorge Mateu Mahiques - E-Book

Description

Geostatistical Functional Data Analysis

Explore the intersection between geostatistics and functional data analysis with this insightful new reference

Geostatistical Functional Data Analysis presents a unified approach to modelling functional data when spatial and spatio-temporal correlations are present. The editors link the wide research areas of geostatistics and functional data analysis to provide the reader with a new area, geostatistical functional data analysis, that will bring new insights and new open questions to researchers coming from both scientific fields. This book provides a complete and up-to-date account of how to deal with spatially correlated functional data, and also includes the most innovative developments in different open avenues in this field.

Containing contributions from leading experts in the field, this practical guide provides readers with the necessary tools to employ and adapt classic statistical techniques to handle spatial regression. The book also includes:

  • A thorough introduction to the spatial kriging methodology when working with functions
  • A detailed exposition of more classical statistical techniques adapted to the functional case and extended to handle spatial correlations
  • Practical discussions of ANOVA, regression, and clustering methods to explore spatial correlation in a collection of curves sampled in a region
  • In-depth explorations of the similarities and differences between spatio-temporal data analysis and functional data analysis

Aimed at mathematicians, statisticians, postgraduate students, and researchers involved in the analysis of functional and spatial data, Geostatistical Functional Data Analysis will also prove to be a powerful addition to the libraries of geoscientists, environmental scientists, and economists seeking insightful new knowledge and questions at the interface of geostatistics and functional data analysis.

Page count: 658

Year of publication: 2021

Table of Contents

Cover

Title Page

Copyright

List of Contributors

Foreword

1 Introduction to Geostatistical Functional Data Analysis

1.1 Spatial Statistics

1.2 Spatial Geostatistics

1.3 Spatiotemporal Geostatistics

1.4 Functional Data Analysis in Brief

References

Part I: Mathematical and Statistical Foundations

2 Mathematical Foundations of Functional Kriging in Hilbert Spaces and Riemannian Manifolds

2.1 Introduction

2.2 Definitions and Assumptions

2.3 Kriging Prediction in Hilbert Space: A Trace Approach

2.4 An Operatorial Viewpoint to Kriging

2.5 Kriging for Manifold-Valued Random Fields

2.6 Conclusion and Further Research

References

3 Universal, Residual, and External Drift Functional Kriging

3.1 Introduction

3.2 Universal Kriging for Functional Data (UKFD)

3.3 Residual Kriging for Functional Data (ResKFD)

3.4 Functional Kriging with External Drift (FKED)

3.5 Accounting for Spatial Dependence in Drift Estimation

3.6 Uncertainty Evaluation

3.7 Implementation Details in R

3.8 Conclusions

References

4 Extending Functional Kriging When Data Are Multivariate Curves: Some Technical Considerations and Operational Solutions

4.1 Introduction

4.2 Principal Component Analysis for Curves

4.3 Functional Kriging in a Nutshell

4.4 An Example with the Precipitation Observations

4.5 Functional Principal Component Kriging

4.6 Multivariate Kriging with Functional Data

4.7 Discussion

4.A Appendices

References

5 Geostatistical Analysis in Bayes Spaces: Probability Densities and Compositional Data

5.1 Introduction and Motivations

5.2 Bayes Hilbert Spaces: Natural Spaces for Functional Compositions

5.3 A Motivating Case Study: Particle-Size Data in Heterogeneous Aquifers – Data Description

5.4 Kriging Stationary Functional Compositions

5.5 Analyzing Nonstationary Fields of FCs

5.6 Conclusions and Perspectives

References

6 Spatial Functional Data Analysis for Probability Density Functions: Compositional Functional Data vs. Distributional Data Approach

6.1 FDA and SDA When Data Are Densities

6.2 Measures of Spatial Association for Georeferenced Density Functions

6.3 Real Data Analysis

6.4 Conclusion

Acknowledgments

References

Notes

Part II: Statistical Techniques for Spatially Correlated Functional Data

7 Clustering Spatial Functional Data

7.1 Introduction

7.2 Model-Based Clustering for Spatial Functional Data

7.3 Descendant Hierarchical Classification (HC) Based on Centrality Methods

7.4 Application

7.5 Conclusion

References

8 Nonparametric Statistical Analysis of Spatially Distributed Functional Data

8.1 Introduction

8.2 Large Sample Properties

8.3 Prediction

8.4 Numerical Results

8.5 Conclusion

8 Appendix

References

9 A Nonparametric Algorithm for Spatially Dependent Functional Data: Bagging Voronoi for Clustering, Dimensional Reduction, and Regression

9.1 Introduction

9.2 The Motivating Application

9.3 The Bagging Voronoi Strategy

9.4 Bagging Voronoi Clustering (BVClu)

9.5 Bagging Voronoi Dimensional Reduction (BVDim)

9.6 Bagging Voronoi Regression (BVReg)

9.7 Conclusions and Discussion

References

Note

10 Nonparametric Inference for Spatiotemporal Data Based on Local Null Hypothesis Testing for Functional Data

10.1 Introduction

10.2 Methodology

10.3 Data Analysis

10.4 Conclusion and Future Works

References

11 Modeling Spatially Dependent Functional Data by Spatial Regression with Differential Regularization

11.1 Introduction

11.2 Spatial Regression with Differential Regularization for Geostatistical Functional Data

11.3 Simulation Studies

11.4 An Illustrative Example: Study of the Waste Production in Venice Province

11.5 Model Extensions

References

Notes

12 Quasi-maximum Likelihood Estimators for Functional Linear Spatial Autoregressive Models

12.1 Introduction

12.2 Model

12.3 Results and Assumptions

12.4 Numerical Experiments

12.5 Conclusion

12.A Appendix

References

13 Spatial Prediction and Optimal Sampling for Multivariate Functional Random Fields

13.1 Background

13.2 Functional Kriging

13.3 Functional Cokriging

13.4 Optimal Sampling Designs for Spatial Prediction of Functional Data

13.5 Real Data Analysis

13.6 Discussion and Conclusions

References

Part III: Spatio–Temporal Functional Data

14 Spatio–temporal Functional Data Analysis

14.1 Introduction

14.2 Randomness Test

14.3 Change-Point Test

14.4 Separability Tests

14.5 Trend Tests

14.6 Spatio–Temporal Extremes

References

15 A Comparison of Spatiotemporal and Functional Kriging Approaches

15.1 Introduction

15.2 Preliminaries

15.3 Kriging

15.4 A Simulation Study

15.5 Application: Spatial Prediction of Temperature Curves in the Maritime Provinces of Canada

15.6 Concluding Remarks

References

16 From Spatiotemporal Smoothing to Functional Spatial Regression: a Penalized Approach

16.1 Introduction

16.2 Smoothing Spatial Data via Penalized Regression

16.3 Penalized Smooth Mixed Models

16.4 P-spline Smooth ANOVA Models for Spatial and Spatiotemporal data

16.5 P-spline Functional Spatial Regression

16.6 Application to Air Pollution Data

Acknowledgments

References

Index

End User License Agreement

List of Tables

Chapter 3

Table 3.1 Performance indexes over the 10 validation sites.

Chapter 8

Table 8.1 Simulation results for […] according to the models […] and […], the cases...

Table 8.2 Simulation results for […] according to the models […] and […] with cases...

Chapter 9

Table 9.1 DUSAF data: soil use categories and corresponding explanations

Chapter 12

Table 12.1 Estimation of parameters with […], […].

Table 12.2 Estimation of parameters with […], […].

Table 12.3 Estimation of parameters associated with scenario 2 with […].

Table 12.4 Estimation of parameters associated with scenario 2 with […].

Table 12.5 Estimation of parameters associated with scenario 2 with […].

Table 12.6 Estimation of parameters associated with scenario 2 with […].

Table 12.7 Estimated parameters for FLM and functional spatial autoregressiv...

Chapter 13

Table 13.1 Nested variogram components of the linear model of coregionalizat...

Chapter 14

Table 14.1 Randomness test results applied to the Russian weather data.

Table 14.2 Randomness test results applied to a subset of 14 Russian weather...

Table 14.3 p-Values for each change-point test applied to the Russian weather dat...

Table 14.4 p-Values for each change-point test applied to a subset of 14 Russian ...

Table 14.5 p-Values for norm-based separability test ([…]) applied to the Russian w...

Table 14.6 p-values for norm-based separability test ([…]) applied to a subset of 1...

Chapter 15

Table 15.1 The 24 different types (cases) of simulated Gaussian processes an...

Table 15.2 Prediction performance in terms of MSPEs for the simulated cases ...

Table 15.3 Prediction performance in terms of MSPEs for the simulated cases ...

Table 15.4 Prediction performance of different Sp.T. kriging models for the ...

List of Illustrations

Chapter 2

Figure 2.1 Spatially dependent curves simulated from the fields […] (a) and […],...

Figure 2.2 Empirical trace-variograms in […] (a) and […] (b).

Figure 2.3 Canada's Maritime Provinces Temperatures dataset, year 1980. (a) ...

Figure 2.4 Estimated trace-semivariogram from the residuals (a) and estimate...

Figure 2.5 Universal kriging maps for the Summer Solstice ([…] June; a) and th...

Figure 2.6 Visual representation of the tangent space in […] on a sphere and o...

Figure 2.7 (a) Empirical semivariogram (symbols) and fitted exponential mode...

Figure 2.8 Kriging of the (temperature, precipitation) covariance matrix fie...

Figure 2.9 (a) Empirical prediction error as a function of the sample margin...

Chapter 3

Figure 3.1 Locations of the 24 […] monitoring sites (light gray triangles) and...

Figure 3.2 […] raw data (in log scale) observed at the 24 monitoring sites....

Figure 3.3 Trace-variogram cloud and estimated trace-variogram.

Figure 3.4 Estimated functional coefficients assuming independent observatio...

Figure 3.5 Raw data (dots), smoothed data (dashed line), predicted drift (li...

Figure 3.6 Original […] data (black dots), FKED predicted curve (dark gray lin...

Chapter 4

Figure 4.1 On a sampled domain […], two functional variables are observed some...

Figure 4.2 Map of France and climate dataset. On each point, annual curves (...

Figure 4.3 An example of computation of the spatial covariance. Experimental...

Figure 4.4 Predictions of precipitation curves (dashed line: mean precipitat...

Figure 4.5 LMC fitting of the variogram model on the four PCs of precipitati...

Figure 4.6 Boxplot of the errors when decreasing the number of principal com...

Figure 4.7 Empirical estimate of a […] matrix of correlation operators (normal...

Figure 4.8 First factors of the MFPCA of temperature and precipitation profi...

Figure 4.9 Example of PCA 2D-mapping of observations […] (a) accounting for […]...

Figure 4.10 Empirical variograms and fitting of a coregionalization model on...

Figure 4.11 Predictions of temperature and precipitation curves (dashed line...

Figure 4.12 Correlation (absolute values) between PCs of a MFPCA realized fr...

Chapter 5

Figure 5.1 Example of perturbation and powering in […], compared to the typica...

Figure 5.2 Raw particle-size data at the Lauswiesen site. (a) Collection of ...

Figure 5.3 (a) Vertical distribution of smoothed densities; (b) raw particle...

Figure 5.4 Vertical distribution of ordinary kriging predictions results: (a...

Figure 5.5 Kriged field and conditional realizations. (a) Kriging estimation...

Figure 5.6 Field data: (a) smoothed PSDs; (b) soil types at the field site (...

Figure 5.7 (a) Estimated trace-semivariogram of the residuals; (b) estimated...

Figure 5.8 Class-kriging of PSDs: (a) results at boreholes B5, F4, and F6 an...

Chapter 6

Figure 6.1 ACS-5y 2015, Texas data: first five counties (of 254) of the inpu...

Figure 6.2 ACS-5y 2015, Texas data: counties with a significant local Moran'...

Figure 6.3 ACS-5y 2015, Texas data: Moran's plot for residual functions.

Figure 6.4 ACS-5y 2015, Texas data: cluster of counties with a significant l...

Figure 6.5 ACS-5y 2015, Texas data: AGE variable, first two harmonics after ...

Figure 6.6 ACS-5y 2015, Texas data: AGE variable. On top, the two maps with ...

Figure 6.7 ACS-5y 2015, Texas data: INCOME variable, first two harmonics aft...

Figure 6.8 ACS-5y 2015, Texas data: INCOME variable. On the top, the two map...

Figure 6.9 ACS-5y 2015, Texas data: INCOME variable transformed using a Box–...

Figure 6.10 ACS-5y 2015, Texas data: INCOME variable transformed using Box–C...

Chapter 7

Figure 7.1 Algorithm of the descendant HC.

Figure 7.2 Location of 106 monitoring stations (in the same number of cities...

Figure 7.3 Ozone concentration curves (obtained after smoothing the data by ...

Figure 7.4 Value of BIC criterion according to the number of clusters.

Figure 7.5 Locations of the stations are colored according to the cluster (a...

Figure 7.6 Average curves by cluster, respectively, for two clusters (a) and...

Figure 7.7 The classification results of the descendant HC.

Figure 7.8 Locations of the stations colored according to the cluster (a), m...

Figure 7.9 The curves of the different groups by the descendant HC.

Chapter 8

Figure 8.1 Some simulated curves of Case […] (a) and Case […] (b). In Case 1, […]...

Figure 8.2 A simulated field considering Model […], Case […] and […] with (a) an i...

Figure 8.3 A simulated field considering Model […], Case […], and […] with (a) an ...

Figure 8.4 A simulated field considering Model […], Case […], and […] with (a) an ...

Figure 8.5 Boxplots of […], […] and […], respectively, over the […] replications of ...

Chapter 9

Figure 9.1 Map of the region around Milan (metropolitan area) covered by the...

Figure 9.2 The total Erlang data as a function of time. Continuous vertical ...

Figure 9.3 Average power spectrum […] obtained via sitewise smoothing of the E...

Figure 9.4 Results of BVClu on the Telecom data. Average normalized entropy

Figure 9.5 Results of BVClu on the Telecom data, with […] and […]. (a) Map of th...

Figure 9.6 Results of BVDim on the Telecom data. Euclidean distance from the...

Figure 9.7 Results of BVDim on the Telecom data: the first six elements of t...

Figure 9.8 Results of BVDim on the Telecom data: maps of the estimated surfa...

Figure 9.9 Four of the […] selected DUSAF covariates superimposed to the metro...

Figure 9.10 Results of BVReg on the Telecom data. Mean cross-validation erro...

Figure 9.11 Results of the BVReg lasso regression on the first four estimate...

Chapter 10

Figure 10.1 (a) Map of Canada with locations of the 35 weather stations. Smo...

Figure 10.2 FANOVA test on curves (a, b) and first derivatives (c, d) of Can...

Figure 10.3 Pairwise comparisons between curves. Diagonal panels: all temper...

Figure 10.4 Pairwise comparisons between first derivatives. Diagonal panels:...

Chapter 11

Figure 11.1 Spatial domain of the Venice waste data, with a line highlightin...

Figure 11.2 Temporal evolution of the yearly per capita production (kilogram...

Figure 11.3 Per capita production (kilogram per resident) of municipal waste...

Figure 11.4 Simplified boundary of the Venice province (a) and detail of the...

Figure 11.5 Triangulation of the Venice province.

Figure 11.6 Example of linear finite element basis function.

Figure 11.7 Simulation without covariates: test function (first row), sample...

Figure 11.8 Simulation with covariates: test function (first row), added con...

Figure 11.9 (a) Simulation without covariates: boxplots of the RMSE, over 50...

Figure 11.10 Estimated spatiotemporal field for the Venice waste data (yearl...

Figure 11.11 Temporal evolution of the estimated spatiotemporal field for th...

Chapter 12

Figure 12.1 Estimated parameter function […] with the different criteria and […]

Figure 12.2 Estimated parameter function […] with the different criteria and […]

Figure 12.3 Estimated parameter function […] with the different criteria in Sc...

Figure 12.4 Locations and areas of the 106 stations (a) and corresponding oz...

Figure 12.5 The three first eigenfunctions (a) and the proportion of explain...

Figure 12.6 Estimated parameter functions.

Figure 12.7 Ozone concentration (solid curves) at 4 stations selected random...

Chapter 13

Figure 13.1 (a) México city. Air quality network RAMA (stations shown in lig...

Figure 13.2 Empirical and theoretical variograms fitted according to the lin...

Figure 13.3 (a) Optimal location for one additional station. (b) Cross-valid...

Chapter 14

Figure 14.1 Locations of the 220 Russian weather stations, with 14 stations ...

Figure 14.2 Daily temperature maxima for five weather stations during 2000....

Figure 14.3 Five consecutive years (2006–2010) of typhoon data. The dots rep...

Figure 14.4 Typhoons (a) and hurricanes (b) data in 2005 with expectile curv...

Figure 14.5 Gray lines represent ionosonde measurements obtained at observat...

Figure 14.6 Number of available stations in the mid-latitude northern hemisp...

Figure 14.7 A map of the neighborhood structures for different locations usi...

Figure 14.8 Probability of a heat wave with amplitude more than two standard...

Chapter 15

Figure 15.1 Examples of simulated data for: (a) case 3 ([…], […]), (b) case 7 ([…]

Figure 15.2 Prediction performance (minimum MSPE over the three trace-semiva...

Figure 15.3 Box plots for cases 1–9 of the differences in (minimum) MSPE bet...

Figure 15.4 Prediction performance (minimum MSPE over the three trace-semiva...

Figure 15.5 The locations of the 36 weather stations in the Canadian Maritim...

Figure 15.6 (a) The empirical trace-semivariogram and the best fitted stable...

Figure 15.7 The empirical Sp.T. semivariogram (a) and the best-fitted Sp.T. ...

Figure 15.8 (A) Functional cross-validation residuals (gray lines) resulting...

Figure 15.9 Predicted temperatures at locations Bertrand (a) and Moncton (b)...

Chapter 16

Figure 16.1 Portion of the […]-spline basis (tensor product of nine cubic spli...

Figure 16.2 Simulated functions: (a) and (b) are the nonlinear main effects ...

Figure 16.3 […] of fitted smooth model for […]: scenario 1 (a–c), scenario 2 (d–...

Figure 16.4 Medians of daily ozone curves (from 2002 to 2015) observed at 55...

Figure 16.5 Smoothed spatial and temporal main effects for the ANOVA model. ...

Figure 16.6 Smoothed spatiotemporal interaction for ANOVA model at four sele...

Figure 16.7 Smoothed spatiotemporal fit for ANOVA model at four selected loc...

Figure 16.8 Regression splines fitted from the ozone raw data by using a cub...

Figure 16.9 Predicted curve from the regression splines of the ozone raw dat...

Figure 16.10 Predicted curves (gray) from the regression splines of the ozon...

Figure 16.11 Smoothed spatiotemporal fit for ANOVA model at four selected lo...

WILEY SERIES IN PROBABILITY AND STATISTICS

Established by Walter A. Shewhart and Samuel S. Wilks

The Wiley Series in Probability and Statistics is well established and authoritative. It covers many topics of current research interest in both pure and applied statistics and probability theory. Written by leading statisticians and institutions, the titles span both state-of-the-art developments in the field and classical methods.

Reflecting the wide range of current research in statistics, the series encompasses applied, methodological and theoretical statistics, ranging from applications and new techniques made possible by advances in computerized practice to rigorous treatment of theoretical approaches. This series provides essential and invaluable reading for all statisticians, whether in academia, industry, government, or research.

A complete list of titles in this series can be found at http://www.wiley.com/go/wsps

Geostatistical Functional Data Analysis

Edited by

Jorge Mateu
University Jaume I of Castellon
Castellon, Spain

Ramón Giraldo
National University of Colombia
Bogota, Colombia

This edition first published 2022
© 2022 John Wiley & Sons Ltd

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Jorge Mateu and Ramón Giraldo to be identified as the authors of the editorial material in this work has been asserted in accordance with law.

Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

Editorial Office
9600 Garsington Road, Oxford, OX4 2DQ, UK

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty
The contents of this work are intended to further general scientific research, understanding, and discussion only and are not intended and should not be relied upon as recommending or promoting scientific method, diagnosis, or treatment by physicians for any particular patient. In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of medicines, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each medicine, equipment, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging-in-Publication Data

Names: Mateu, Jorge, editor. | Giraldo, Ramón, editor.

Title: Geostatistical functional data analysis / edited by Jorge Mateu, Ramón Giraldo.

Description: Hoboken, NJ : Wiley, 2022. | Series: Wiley series in probability and statistics | Includes bibliographical references and index.

Identifiers: LCCN 2021015788 (print) | LCCN 2021015789 (ebook) | ISBN 9781119387848 (hardback) | ISBN 9781119387909 (adobe pdf) | ISBN 9781119387886 (epub)

Subjects: LCSH: Geology–Statistical methods. | Kriging. | Spatial analysis (Statistics) | Functional analysis.

Classification: LCC QE33.2.S82 G434 2022 (print) | LCC QE33.2.S82 (ebook) | DDC 551.072/7–dc23

LC record available at https://lccn.loc.gov/2021015788

LC ebook record available at https://lccn.loc.gov/2021015789

Cover Design: Wiley

Cover Image: © Googee/Shutterstock

List of Contributors

Ana M. Aguilera
University of Granada
Department of Statistics and Operational Research
Spain

Mohamed-Salem Ahmed
University of Lille
France

Mara S. Bernardi
Politecnico di Milano
MOX - Department of Mathematics
Italy

Gregory Bopp
Pennsylvania State University
Department of Statistics
USA

Martha Bohorquez
National University of Colombia
Department of Statistics
Colombia

Laurence Broze
University of Lille
France

María del Carmen Aguilera Morillo
Universitat Politècnica de València
Department of Statistics and Operational Research and Quality
Spain

Sophie Dabo-Niang
University of Lille
France

Maria Durban
Universidad Carlos III
Department of Statistics
Spain

John Ensley
Pennsylvania State University
Department of Statistics
USA

Maria Franco-Villoria
Università di Modena e Reggio Emilia
Department of Economics “Marco Biagi”
Italy

Zied Gharbi
University of Lille
France

Ramón Giraldo
National University of Colombia
Department of Statistics
Colombia

Alberto Guadagnini
Politecnico di Milano
Department of Civil and Environmental Engineering
Italy
and
The University of Arizona
Department of Hydrology and Atmospheric Sciences
USA

Rosaria Ignaccolo
Università degli Studi di Torino
Dipartimento di Economia e Statistica “Cognetti de Martiis”
Italy

Antonio Irpino
University of Campania “Luigi Vanvitelli”
Department of Mathematics and Physics
Italy

Piotr Kokoszka
Colorado State University
Department of Statistics
USA

Sara Sjöstedt de Luna
Umeå University
Department of Mathematics and Mathematical Statistics
Sweden

Dae-Jin Lee
BCAM–Basque Center for Applied Mathematics
Spain

Claude Manté
Université du Sud Toulon-Var
CNRS/INSU, IRD, MIO, Aix-Marseille Université
France

Jorge Mateu
University Jaume I of Castellon
Department of Mathematics
Spain

Alessandra Menafoglio
Politecnico di Milano
MOX - Department of Mathematics
Italy

Pascal Monestiez
INRAE - Unité BioSP
France

David Nerini
Université du Sud Toulon-Var
CNRS/INSU, IRD, MIO, Aix-Marseille Université
France

Federica Passamonti
Politecnico di Milano
MOX - Department of Mathematics
Italy

Davide Pigoli
King's College London
UK

Alessia Pini
Università Cattolica del Sacro Cuore
Department of Statistical Sciences
Italy

Cristian Preda
Institute of Statistics and Applied Mathematics of the Romanian Academy
Romania

Matthew Reimherr
Pennsylvania State University
Department of Statistics
USA

Elvira Romano
University of Campania “Luigi Vanvitelli”
Department of Mathematics and Physics
Italy

Laura M. Sangalli
Politecnico di Milano
MOX - Department of Mathematics
Italy

Piercesare Secchi
Politecnico di Milano
MOX - Department of Mathematics
Italy
and
CADS - Center for Analysis Decisions and Society
Human Technopole
Italy

Johan Strandberg
Umeå University
Department of Statistics
Sweden

Camille Ternynck
University of Lille
France

Baba Thiam
University of Lille
France

Vincent Vandewalle
University of Lille
France

Simone Vantini
Politecnico di Milano
MOX - Department of Mathematics
Italy

Valeria Vitelli
University of Oslo
Oslo Center for Biostatistics and Epidemiology
Department of Biostatistics
Norway

Anne-Françoise Yao
Université Clermont-Auvergne
France

Foreword

Functional data analysis (FDA) is a branch of statistics that analyses data providing information about curves, surfaces, or anything else varying over a continuum. In its most general form, under an FDA framework each sample element is a function. The continuum over which these functions are defined is often time, but may also be spatial location, wavelength, probability, etc. In the 20 years since the first books and papers on this topic, this field of statistics has received the attention and encouragement of researchers in statistics and many applied disciplines and has become an important and dynamic area of modern statistics. Topics that have been covered include descriptive techniques, statistical inference, multivariate and nonparametric methods, regression, generalized linear models, time series, and spatial statistics.

Modern technology has made it possible to obtain large spatial and spatiotemporal data sets, and poses the challenge of statistical modeling of such data. The combination of spatial statistics with FDA has emerged as a key approach. This book presents new theories and methods to define, describe, characterize, and model functional data indexed in spatial or spatiotemporal domains. The main focus is on functional data obtained under a geostatistical framework, where the domain is fixed and continuous. Specific topics considered include kriging, clustering, regression, and optimal sampling, moving on in the last part of the book to spatiotemporal data. Some chapters also consider the treatment of functional data on lattices.

When we wrote our original book on the subject in the 1990s, James Ramsay and I hoped that we would encourage FDA as a way of thinking, not simply a collection of techniques. It has therefore been very pleasing to see the development of the field since then, and the abundance of research activity in the area has confirmed our hopes. I would urge readers and researchers to raise their sights above any specific methods, important though they obviously are, to ask how considering data as functions changes and broadens our statistical horizons. Particularly in the new era of data science, this concerns both what data can be collected and how they can be analyzed. I am sure this book will make a valuable contribution in helping them to do so.

November 2020

         

Sir Bernard Silverman

University of Oxford

University of Nottingham

Oxford and Nottingham

1 Introduction to Geostatistical Functional Data Analysis

Jorge Mateu¹ and Ramón Giraldo²

¹Department of Mathematics, University Jaume I of Castellon, Spain

²Department of Statistics, National University of Colombia, Bogota, Colombia

1.1 Spatial Statistics

Spatial statistics has developed rapidly during the last 30 years, with notable progress in both theoretical developments and practical studies. Some early applications were in mining, forestry, and hydrology. It seems fair to remark that the increasing availability of computing power and capable software has stimulated the ability to solve increasingly complex problems. These problems had a common element: they were all of a spatial nature. Some theory was available, for example the random function theory developed by Yaglom and others in the 1960s. But that was largely insufficient to find generic solutions for the whole class of problems, and hence the applications required new theory. Far-reaching theories were subsequently developed: image reconstruction, Markov random fields, point process statistics, geostatistics, and random sets, to mention just a few. As a next stage, these theories were applied successfully to new disciplinary problems, leading to modifications and extensions of mathematical and statistical procedures. We therefore notice a general scientific process that has occurred in the field of spatial statistics: well-defined problems with a common character were suddenly on the agenda, and data availability and intensive discussion with practical and disciplinary researchers resulted in new theoretical developments. Often it is difficult to say which came first and what followed, but we see different theoretical models developed for different applications.

Spatial statistics has hence emerged as an important new field of science. One of its peculiarities is its power for visualization. The fear, common among statisticians and mathematicians, of analyzing images, communicating results by maps, and having to trust information in pictures was overcome. This has led to interesting theories and to better and more objective procedures for dealing with spatial variation. Following Wittgenstein, we could state that we needed some geniuses to tackle the obvious. Now, many results of a spatial statistical analysis can be communicated smoothly to a nonstatistical audience, such as a disciplinary scientist, a policy-maker, or an interested student. They, in turn, are able to judge whether a problem has been solved and whether a policy measure is relevant, or are simply inspired by the beautiful pictures expressing deep thoughts on relevant issues.

The role in policy-making deserves to be stressed once more. It is known that many policy-makers are inclined to make a decision on the basis of a well-developed, well-organized, and easily understood figure. They find it (rightly so!) rather boring to use long lists of statistical data. But as political decisions affect us all, this puts another responsibility on the shoulders of statisticians: to make statistically sound maps. It is often hard to say what that should be, but at the very least, we should be able to generate pictures, maps, and graphs that rely on good data and that show the aspects that matter for decision-making.

In this way, spatial statistics has become a refreshing wind in statistics. We no longer need to dwell on difficult equations, long lists of data, and tables with simulated controlled scenarios. But, to be clear, behind all these nice pictures a sound science, with sometimes difficult and tedious derivations and deep thoughts, is still required to make serious progress.

Spatial statistics recognizes and exploits the spatial locations of data when designing for, collecting, managing, analyzing, and displaying such data. Spatial data are typically dependent, for which there are classes of spatial models available that allow process prediction and parameter estimation. Spatially arranged measurements and spatial patterns occur in a surprisingly wide variety of scientific disciplines. The origins of human life link studies of the evolution of galaxies, the structure of biological cells, and settlement patterns in archaeology. Ecologists study the interactions among plants and animals. Foresters and agriculturalists need to investigate plant competition and account for soil variations in their experiments. The estimation of rainfall and of ore and petroleum reserves is of prime economic importance. Rocks, metals, and tissue and blood cells are all studied at a microscopic level. Geology, soil science, image processing, epidemiology, crop science, ecology, forestry, astronomy, atmospheric science, or simply any discipline that works with data collected from different spatial locations, need to develop models that indicate when there is dependence between measurements at different locations. Spatiotemporal variability is a relatively new area within Spatial Statistics, which explains the scarcity of space-time statistical tools 20 years ago. There has been a growing realization in the last decade that knowing where data were observed could help enormously in answering the substantive questions that precipitated their collection. One of the most powerful tools for spatial data analysis is the map. For example, in military applications, the battlespace is mapped for command and control. The sensors are both in situ and remote, and they generate spatially distributed data of many different kinds. Producing a statistically optimal map, together with measures of map uncertainty, which is always up to date, is a complicated task. 
Once these types of statistical problems are solved, a geographic information system, or GIS, is well suited to forming the decision-making maps.

Spatial statistics can be considered a natural generalization of signal processing to higher dimensions. In traditional signal processing, one has a signal dependent on a scalar variable $t$, which may belong to a discrete set or which may be continuous. Spatial statistics is concerned with cases in which $t$ is a multidimensional index of dimension $d$. In most practical examples $d = 2$, though much of the basic theory and methodology is the same whatever the dimension. Although the models and methods of spatial statistics have not developed as rapidly as those for one-dimensional signal processing, there have nevertheless been substantial new developments in recent years. Standard and modern references on spatial statistics include the books of [1–4] among others.

Following Cressie [5], spatial data can be thought of as resulting from observations on the stochastic process $\{Z(\mathbf{s}) : \mathbf{s} \in D\}$, where $D$ is possibly a random set in $\mathbb{R}^d$. If we believe that the roots of statistical science are in data, we can classify spatial areas according to the type of observations encountered. Thus, (i) if $D$ is a fixed subset of $\mathbb{R}^d$ and $Z(\mathbf{s})$ is a random vector at location $\mathbf{s} \in D$, we are dealing with geostatistical data; (ii) if $D$ is a fixed (regular or irregular) collection of countably many points of $\mathbb{R}^d$ and $Z(\mathbf{s})$ is a random vector at location $\mathbf{s} \in D$, we are dealing with lattice data; (iii) if $D$ is a point process in $\mathbb{R}^d$ and $Z(\mathbf{s})$ is a random vector at location $\mathbf{s} \in D$, we are dealing with point patterns; (iv) if $D$ is a point process in $\mathbb{R}^d$ and $Z(\mathbf{s})$ is itself a random set, we are dealing with spatial objects. Geostatistical-type problems are distinguished most clearly from lattice- and point-pattern-type problems by the ability of the spatial index $\mathbf{s}$ to vary continuously over a subset of $\mathbb{R}^d$. A space-time process can be denoted by $\{Z(\mathbf{s}, t) : \mathbf{s} \in D(t), t \in T\}$, where each of $Z$, $D$, and $T$ is possibly random.

Spatial statistics is one of the major methodologies of environmental statistics. Its applications include producing spatially smoothed or interpolated representations of air pollution fields, calculating regional average means or regional average trends based on data at a finite number of monitoring stations, and performing regression analyses with spatially correlated errors to assess the agreement between observed data and the predictions of some numerical model. The notion of proximity in space is implicitly or explicitly present in the environmental sciences. Proximity is a relative notion, relative to the spatial scale of the scientific investigation. When a spatial dimension is present in an environmental study, the statistician's job is to create a statistical framework within which one carries out defensible inferences on processes and parameters of interest. These modeling and inference strategies are not always easy to carry out, but they are never impossible. If statistics is to continue to be the broker of variability, it must address difficult questions such as those found in the environmental sciences; otherwise, it will become marginalized as a discipline. Problems in the environmental sciences are inherently spatial (and temporal), observational in nature, and have experimental units that are highly variable.

In the last decade, spatial statistics has undergone enormous development in the area of statistical modeling. It started slowly, building from models that were purely descriptive of spatial dependence. Then, it became apparent that the process of interest was usually hidden by measurement error and that the principal goal should be inference on the hidden process from the noisy data. It has only been in the last few years that the full potential for hierarchical spatial statistical modeling has been glimpsed. There is an enormous amount of flexibility in hierarchical statistical models, such as the opportunity to account for nonlinearities. Their attractive feature is that at each level of the hierarchy, the model specification is simple, yet globally, the model can be quite complex. This approach could be summarized as "model locally, analyze globally."

Applications of spatial statistics cover many areas. Much of the original impetus for the subject was driven by geostatistics. It was in this context that the technique of kriging, optimal least squares interpolation over a random spatial field, was originally developed. In recent years, the applications of spatial statistics have increased enormously, with particularly fruitful applications in the environmental and ecological sciences. A typical problem is the sampling of a pollution field, such as ozone in the atmosphere or toxic chemicals in rivers and lakes. Another example is the use of meteorological measurements in studies of global climate change. In these fields, as in geostatistics, the objective may be to interpolate spatially between measurements, but there are also other objectives which may be quite different. Spatial statistics has also found applications in such diverse fields as sociology (for example, social network theory) and financial economics.

The usual approach in geostatistics is based on an assumption that the spatial random field is stationary and isotropic. In the original geophysical applications which motivated the development of the field, this assumption was often justified by the fact that with sparse data, there was no reasonable alternative. A further point is that many geostatistical applications involved only one measurement at each site (or equivalently, only one replication of the random field) so there was no way of determining the complete spatial covariance function without some kind of stationarity assumption. In modern environmental applications, however, there are very often enough monitoring stations to go beyond such assumptions, and with multiple observations per site, it is also possible to estimate the covariance between any pair of sites without assuming stationarity across the field. Another consideration is that very often, simple topography makes a stationarity assumption implausible. Therefore, there are by now many reasons to go beyond a stationary model. In spite of this obvious need for nonstationary models, however, there is not, as yet, a wide variety of approaches to the problem.

Environmental issues have brought atmospheric science to the center of science and technology, where it now plays a key role in shaping national and international policy. Weather prediction plays a significant role in the planning of human affairs. Further, a broader appreciation of the impacts of weather and climate on the environment of the planet has now led to nearly universal concern regarding potential climate change, its causes, impacts, and possible remedies. A large variety of statistical methods are used routinely in the atmospheric sciences. For example, techniques of multivariate time series are especially common. These include multivariate autoregressive moving average models and Kalman filtering. Statistical methods for spatial data are also standard. A major tool in the analysis of space-time data is empirical orthogonal functions (EOF). Virtually all atmospheric and oceanographic processes (e.g. wind, temperature, sea surface temperature, moisture) involve variability over space and time. One need only examine the governing partial differential equations for wind processes, or their selected spatial-temporal averages, to see that mathematical and statistical descriptions of these dynamical processes depend on complicated temporal and spatial relationships. Furthermore, observations of geophysical processes typically include measurement errors and are often temporally and spatially incomplete, which may obscure the signal of interest.

In studies involving spatial data, it is seldom the case that data for only a single process are collected. Typically, there is a great expense associated with establishing spatial monitoring networks or other mechanisms of spatial data collection (e.g. satellites) and so measurements are usually made on two or more variables. Thus, statistical techniques for multivariate spatial data are critical for effective modeling of spatial processes.

Lately, there has been a rich and growing literature on space–time modeling. Fundamentally, it is clear that in the absence of a temporal component, second-order geostatistical models can be used to represent spatial variability. These are descriptive in the sense that, although they model spatial correlation, there is no causative interpretation associated with them. Thus, for space-time modeling, the geostatistical paradigm assumes a descriptive structure for both space and time (i.e. covariance structures are directly specified). For example, one can extend the geostatistical kriging methodology for spatial processes by assuming that time is just another spatial dimension. Alternatively, one can treat time slices of a spatial field as variables and apply a multivariate or cokriging approach. Although these approaches have been successful in many applications, there are fundamental differences between space and time, and it is not likely that realistic covariance structures can be specified that accurately capture the complicated dynamical processes as found in geophysical applications.

In the absence of a spatial component, there is a large class of time series models that could be used to represent the temporal variability. These are dynamic in the sense that they exploit the fact that time flows in only one direction, and so the state of the process at the current time is related to what happened at previous times. Thus, one might consider the space–time process as a collection of spatially correlated time series in continuous space, or on a spatial lattice. Although these approaches include dynamical structures, without a descriptive spatial component one lacks the ability to perform spatial prediction at locations without observations. If both temporal and spatial components are present, it is natural to combine the temporally dynamic state-space approach and the spatially descriptive approach. These models are referred to as space–time dynamic models.

Spatial interpolation is an essential feature of many GIS. It is a procedure for estimating values of a variable at unsampled locations. A map with isolines is usually the visual output of such a process and plays a crucial role in decision-making. Interpolation rests on Tobler's law of geography, which stipulates that observations close together in space are more likely to be similar than those farther apart, but developing models that represent the way close observations are related can sometimes be very problematic. The approaches can be divergent and may therefore lead to very different results. As a consequence, an understanding of the initial assumptions and methods used is the key to the spatial interpolation process.

Surprisingly, when spatial interpolation tools are integrated within GIS, they are often implemented in such a way that users have no real choice in selecting the best possible methods, and if they do have a choice, required input parameters are sometimes fixed, with no way of modifying them. One reason for the frequent blind use of spatial interpolation methods, and spatial statistics in general, probably has its origins in teaching. Despite the large variety of its applications, the discipline has been confined to those fields where it has seen its major developments. The progress made in spatial statistics is therefore usually presented only in journals dedicated to statistics, mining, and petroleum engineering. As a consequence, GIS users who have a different technical background often do not have an in-depth knowledge of such spatial interpolation techniques. Furthermore, since the conventional tests used in basic statistics usually generate some kind of categorical answer, the prerequisite experience and statistical knowledge necessary for the proper use of spatial interpolation techniques are often discouraging to this type of user. Nevertheless, during the last few years, the diversity of the applications of these methods has encouraged the publication of new books and new case studies and has stimulated a number of conferences on the subject.

1.2 Spatial Geostatistics

This section is partly taken and summarized from [6] and is intended to provide a brief overview of spatial geostatistics. The reader is referred to [6] for further and more complete details.

1.2.1 Regionalized Variables

Geostatistics can be defined as the study of regionalized phenomena, that is, phenomena that stretch across space and which have a certain spatial organization or structure. However, geostatistics is not applied to the regionalized phenomenon as such, which is a physical reality, but to a mathematical description of that reality, that is, a numerical function called regionalized variable or regionalization, defined in a geographical space, which is supposed to correctly represent and measure that phenomenon.

In order to delve deeper into the concept of regionalized variable, let us imagine we are interested in a feature of a given phenomenon that spans across space and that several measurements are taken in a domain at a given moment in time. If the measurements are taken on objects or similar, the objects sampled can be considered a subset of a larger collection of objects, as many more measurements could have been taken, but were not for many possible reasons. If the observations were made at certain points in the domain, infinite measurements could be taken.

When $\mathbf{s}$ spans across the domain under study, $D$, the set $\{z(\mathbf{s}), \mathbf{s} \in D\}$ is called a regionalized variable or regionalization, the set being a collection of values of the regionalized variable, and each value of that collection being a regionalized value.

It is true that a deterministic approach can be employed to describe or model a regionalized phenomenon and obtain an accurate assessment of the values of the regionalization on the basis of a limited number of observations. However, this requires in-depth knowledge of the origin of the phenomenon and the physical or mathematical laws that govern the evolution of the regionalized variable. Furthermore, many of the regionalized phenomena that are usually studied are so complex that a deterministic approach can only partially portray them. That is why the deterministic approach is discarded and the probabilistic approach, which permits modeling both the knowledge of and also the uncertainty surrounding the regionalized random phenomenon, is adopted.

1.2.2 Random Functions

From a probabilistic perspective, the regionalized value can be seen as the result of a random mechanism, that is, as the outcome of a random variable (r.v.). If the regionalized values at all the points in the domain are considered, they can be seen as a realization of an infinitely large set of r.v.s, one at each point in the domain, which is known as a spatial random function (synonyms: stochastic process, random field).

When $\mathbf{s}$ spans across the domain under study, $D$, we have a family of r.v.s, $\{Z(\mathbf{s}), \mathbf{s} \in D\}$, which constitutes a spatial random field (r.f.).

This methodological decision is one of the cornerstones of geostatistics: the regionalized variable is interpreted as a realization of a spatial r.f. At this point, we must state that the regionalized variable is often highly locally irregular (which makes it impossible to represent using a deterministic mathematical function) and has a certain spatial organization or structure. The probabilistic approach, or probabilistic geostatistics, which interprets the regionalized variable as a realization of a r.f., can take into account all the aspects of regionalization mentioned above, because, as stated on page 55 of [7]:

  • At each location $\mathbf{s}$, $Z(\mathbf{s})$ is a r.v. (hence, the erratic aspect).
  • For any given set of points $\mathbf{s}_1, \ldots, \mathbf{s}_k$, the r.v.s $Z(\mathbf{s}_1), \ldots, Z(\mathbf{s}_k)$ are linked by a network of spatial correlations responsible for the similarity of the values they take (hence the structured aspect).

Let $Z(\mathbf{s})$ be a r.f. and let us consider the set of points $\mathbf{s}_1, \ldots, \mathbf{s}_k$. Then, the r.f. is characterized by its $k$-dimensional distribution function. The set of $k$-dimensional distribution functions for all values of $k$ and all possible choices of $\mathbf{s}_1, \ldots, \mathbf{s}_k$ in the domain is called the spatial law of probability.

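For a given r.f., $Z(\mathbf{s})$, the $k$-dimensional distribution function is defined as

$$F_{Z(\mathbf{s}_1), \ldots, Z(\mathbf{s}_k)}(z_1, \ldots, z_k) = P\left(Z(\mathbf{s}_1) \le z_1, \ldots, Z(\mathbf{s}_k) \le z_k\right) \tag{1.1}$$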

In linear geostatistics, it is enough to know the first two moments of the distribution of $Z(\mathbf{s})$. What is more, in most practical applications, the available information does not allow higher-order moments to be inferred.

The expectation, expected value, or first-order moment of a r.f. $Z(\mathbf{s})$ is defined as a nonrandom function of $\mathbf{s}$ that coincides at each point $\mathbf{s}$ with the expectation of the r.v. at that point, $\mu(\mathbf{s}) = E[Z(\mathbf{s})]$, where $\mathbf{s} \in D$, $D \subseteq \mathbb{R}^d$. It is also called the drift of the r.f., especially when it varies with location.

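The variance of a r.f. $Z(\mathbf{s})$ is defined as a nonrandom function of $\mathbf{s}$ that coincides at each point $\mathbf{s}$ with the variance of the r.v. at that point, i.e. $\sigma^2(\mathbf{s}) = \operatorname{Var}(Z(\mathbf{s})) = E\left[(Z(\mathbf{s}) - \mu(\mathbf{s}))^2\right]$, where $\mathbf{s} \in D$, $D \subseteq \mathbb{R}^d$.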

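The covariance function of a r.f. $Z(\mathbf{s})$ is defined as a nonrandom function of $\mathbf{s}_i$ and $\mathbf{s}_j$, such that for any pair of values $\mathbf{s}_i, \mathbf{s}_j$, $C(\mathbf{s}_i, \mathbf{s}_j)$ coincides with the covariance between the r.v.s at those two points

$$C(\mathbf{s}_i, \mathbf{s}_j) = E\left[\left(Z(\mathbf{s}_i) - \mu(\mathbf{s}_i)\right)\left(Z(\mathbf{s}_j) - \mu(\mathbf{s}_j)\right)\right] \tag{1.2}$$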

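The variogram of the r.f. $Z(\mathbf{s})$ is defined as the variance of the first differences of the r.f.

$$2\gamma(\mathbf{s}_i, \mathbf{s}_j) = \operatorname{Var}\left(Z(\mathbf{s}_i) - Z(\mathbf{s}_j)\right) \tag{1.3}$$

The function $\gamma(\mathbf{s}_i, \mathbf{s}_j)$ is called the semivariogram.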
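The definition in Eq. (1.3) can be turned into a simple estimator once pairs of observations are grouped by their separation. The following Python sketch is an illustration only and is not taken from the text: it assumes that the semivariogram depends only on the distance between two points, and the function name `empirical_semivariogram`, the binning parameters `lag_width` and `n_lags`, and the toy data are all hypothetical choices.

```python
import math

def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Method-of-moments estimate of the semivariogram.

    coords: list of (x, y) locations; values: the observations at them.
    Pairs are binned by separation distance, which assumes the semivariogram
    depends only on distance (an assumption made for this sketch).
    Returns (mean distance, estimate, pair count) for each non-empty bin.
    """
    sums = [0.0] * n_lags
    dists = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            b = int(d // lag_width)  # index of the distance bin
            if b < n_lags:
                sums[b] += (values[i] - values[j]) ** 2
                dists[b] += d
                counts[b] += 1
    # Half the average squared difference in each bin estimates gamma.
    return [
        (dists[b] / counts[b], sums[b] / (2 * counts[b]), counts[b])
        for b in range(n_lags)
        if counts[b] > 0
    ]

# Toy transect: five equally spaced points with made-up values.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
values = [1.0, 3.0, 2.0, 4.0, 3.0]
result = empirical_semivariogram(coords, values, lag_width=1.0, n_lags=4)
for h, gamma, n_pairs in result:
    print(f"h = {h:.1f}  gamma = {gamma:.2f}  pairs = {n_pairs}")
```

With so few pairs per bin the estimates are very noisy; the example is only meant to show how the squared first differences enter the calculation.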

$Z(\mathbf{s})$ is a Gaussian r.f. if for all $k$ and any given set of points $\mathbf{s}_1, \ldots, \mathbf{s}_k$, the joint distribution of $(Z(\mathbf{s}_1), \ldots, Z(\mathbf{s}_k))$ is a multivariate Gaussian distribution. A multivariate Gaussian distribution is characterized by a mean vector and a variance-covariance matrix, such that the first two moments of a Gaussian r.f. completely determine its probability structure. The Gaussianity of the r.f. is a common assumption in geostatistics.
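Because a Gaussian r.f. is determined by its mean and covariance, one realization at a finite set of points can be drawn from a mean vector and a covariance matrix alone. The sketch below, in plain Python, does this through a Cholesky factor. It is a minimal illustration under stated assumptions: the constant mean, the exponential covariance model with its `sill` and `range_par` parameters, and all function names are choices made here and do not come from the text.

```python
import math
import random

def exponential_cov(h, sill=1.0, range_par=2.0):
    """Assumed covariance model: C(h) = sill * exp(-h / range_par)."""
    return sill * math.exp(-h / range_par)

def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def simulate_gaussian_rf(points, mean, rng):
    """One draw of the field at the given points, as mean + L e with e ~ N(0, I)."""
    k = len(points)
    # Covariance matrix built from pairwise distances between the points.
    cov = [[exponential_cov(math.dist(p, q)) for q in points] for p in points]
    low = cholesky(cov)
    e = [rng.gauss(0.0, 1.0) for _ in range(k)]
    return [mean + sum(low[i][j] * e[j] for j in range(i + 1)) for i in range(k)]

rng = random.Random(42)
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
draw = simulate_gaussian_rf(points, mean=10.0, rng=rng)
print([round(z, 3) for z in draw])
```

Values at nearby points tend to be closer to each other than to the value at the distant point, since the assumed covariance decays with distance.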

1.2.3 Stationarity and Intrinsic Hypothesis

Interpreting the regionalized variable in probabilistic terms as a particular realization of a given r.f. makes operational sense when it is possible to infer part or all of the law of probability which defines that r.f. In this sense, stationarity, which indicates a certain degree of homogeneity in the regionalization across space, is a desirable quality.

Indeed, it would be impossible to infer the probability law of a r.f. if there was only one realization of the r.f. In order to make inferences consistently, many realizations are necessary. However, in reality there is only one. The solution to this problem is to adopt the hypothesis of stationarity or spatial homogeneity. The idea behind the hypothesis of stationarity is to substitute repetitions of the (inaccessible) realizations of the r.f. with repetitions in space, that is, the values observed at different locations in the domain under study have the same characteristics and can be considered as realizations of the same r.f. in mathematical terms. However, these realizations are not independent, and an additional hypothesis, ergodicity, is normally assumed; see pages 19–22 of [8] for details. The hypothesis of stationarity means that the spatial law of probability of the r.f. or part of it, is translation invariant. That is, the probabilistic properties of a set of observations do not depend on the specific locations where they have been measured, but only on their separations.
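The idea of replacing repetitions of the r.f. with repetitions in space can be illustrated numerically. The following Python sketch simulates a single realization along a one-dimensional transect and compares averages taken over space with the moments of the generating model. The first-order autoregressive model, its parameters `mu`, `sigma`, and `phi`, and the function name `simulate_transect` are assumptions introduced for this illustration only.

```python
import math
import random

def simulate_transect(n, mu, sigma, phi, rng):
    """Stationary Gaussian AR(1) sequence along a 1-D transect.

    The mean mu and variance sigma**2 are constant, and the covariance
    between sites i and j is sigma**2 * phi**abs(i - j), so it depends
    only on the separation between the sites.
    """
    z = [mu + sigma * rng.gauss(0.0, 1.0)]
    innov_sd = sigma * math.sqrt(1.0 - phi ** 2)
    for _ in range(n - 1):
        z.append(mu + phi * (z[-1] - mu) + innov_sd * rng.gauss(0.0, 1.0))
    return z

rng = random.Random(0)
mu, sigma, phi = 5.0, 1.0, 0.9
z = simulate_transect(20000, mu, sigma, phi, rng)

# Averages over space, computed from the single realization,
# stand in for the moments of the r.f.
spatial_mean = sum(z) / len(z)
spatial_var = sum((v - spatial_mean) ** 2 for v in z) / len(z)
print(f"spatial mean = {spatial_mean:.3f} (model mean {mu})")
print(f"spatial variance = {spatial_var:.3f} (model variance {sigma ** 2})")
```

The agreement relies on the dependence dying out with distance, which is the practical content of the ergodicity hypothesis; with a shorter transect or stronger dependence the spatial averages would be less reliable.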

Therefore, in mathematical and probabilistic terms, the hypothesis of stationarity refers to the regular behavior in space of the moments of the r.f., or the function itself and, as we will see later, there are different degrees of stationarity. This hypothesis will allow us to act as if all the variables that make up the r.f. had the same probability distribution (or the same moments; we can even relax this assumption) and, as a consequence, to be able to make inferences.

Using the assumed level of spatial homogeneity of the r.f. that (supposedly) generates the observed realization as a basis, we have the following cases: stationary random function in the strict sense, second-order stationary random function, and intrinsically stationary random function or random function of stationary increments. Let us briefly introduce these concepts.

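The r.f. $Z(\mathbf{s})$ is said to be stationary in the strict sense, or strictly stationary, if the families of r.v.s $\{Z(\mathbf{s}_1), \ldots, Z(\mathbf{s}_k)\}$ and $\{Z(\mathbf{s}_1 + \mathbf{h}), \ldots, Z(\mathbf{s}_k + \mathbf{h})\}$ have the same joint distribution function for all $k$, and for any given spatial points $\mathbf{s}_1, \ldots, \mathbf{s}_k$ and any translation vector $\mathbf{h}$.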

In other words, the joint distribution function of $(Z(\mathbf{s}_1), \ldots, Z(\mathbf{s}_k))$ is unaffected by a translation by an arbitrary vector $\mathbf{h}$. As a result, density functions with dimension lower than $k$ do not depend on location either. Generally speaking, this is a very strict condition, which is why this hypothesis is normally relaxed to the so-called “assumption of second-order stationarity,” which limits the stationarity hypothesis to the first two moments of the r.f. (recall that in linear geostatistics, we are only interested in the first two moments of the r.f.).

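The r.f. $Z(\mathbf{s})$ is said to be second-order stationary, weakly stationary, or stationary in the broad sense, if it has finite second-order moments (that is, the covariance exists) and verifies that

  • The expectation exists and is constant, and therefore does not depend on the location $\mathbf{s}$:

$$E[Z(\mathbf{s})] = \mu(\mathbf{s}) = \mu, \quad \forall\, \mathbf{s} \in D \tag{1.4}$$

  • The covariance exists for every pair of r.v.s, $Z(\mathbf{s})$ and $Z(\mathbf{s} + \mathbf{h})$, and only depends on the vector $\mathbf{h}$ that joins the locations $\mathbf{s}$ and $\mathbf{s} + \mathbf{h}$:

$$C(\mathbf{s}, \mathbf{s} + \mathbf{h}) = E\left[\left(Z(\mathbf{s}) - \mu\right)\left(Z(\mathbf{s} + \mathbf{h}) - \mu\right)\right] = C(\mathbf{h}), \quad \forall\, \mathbf{s}, \mathbf{s} + \mathbf{h} \in D \tag{1.5}$$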

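As the covariance function of a second-order stationary r.f. is only a function of $\mathbf{h}$, the variance of the r.f. exists and is finite and constant:

$$\operatorname{Var}(Z(\mathbf{s})) = C(\mathbf{0}) = \sigma^2, \quad \forall\, \mathbf{s} \in D \tag{1.6}$$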

In light of Eqs. (1.4) and (1.6), the second-order stationarity hypotheses can be interpreted as if the regionalized variable takes values that fluctuate around a constant value (the mean), and the variation of these fluctuations is the same everywhere in the domain.
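Under second-order stationarity, the semivariogram of Eq. (1.3) can be written in terms of the covariance function of Eq. (1.5). The short derivation below is added as a worked step and uses only the definitions above; writing the semivariogram as a function $\gamma(\mathbf{h})$ of the separation vector alone is a notational assumption made here, justified by the last line.

```latex
\[
\begin{aligned}
\gamma(\mathbf{h})
  &= \tfrac{1}{2}\operatorname{Var}\bigl(Z(\mathbf{s}+\mathbf{h}) - Z(\mathbf{s})\bigr) \\
  &= \tfrac{1}{2}\bigl[\operatorname{Var}(Z(\mathbf{s}+\mathbf{h}))
     + \operatorname{Var}(Z(\mathbf{s}))
     - 2\,C(\mathbf{s}, \mathbf{s}+\mathbf{h})\bigr] \\
  &= \tfrac{1}{2}\bigl[C(\mathbf{0}) + C(\mathbf{0}) - 2\,C(\mathbf{h})\bigr] \\
  &= C(\mathbf{0}) - C(\mathbf{h}).
\end{aligned}
\]
```

The second line is the variance of a difference of two r.v.s, and the third applies Eqs. (1.5) and (1.6). For a second-order stationary r.f., the semivariogram therefore depends on the locations only through $\mathbf{h}$ and carries the same information as the covariance function.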

In some cases, in order to model the spatial dependence of second-order stationary r.f.s, the correlogram, or correlation function, is used instead of the covariogram, and is defined as

$$\rho(\mathbf{h}) = \frac{C(\mathbf{h})}{C(\mathbf{0})} \tag{1.7}$$
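As a small numerical illustration of the correlogram, the Python sketch below divides a covariance function by its value at lag zero. The exponential covariance model, its `sill` and `range_par` parameters, the scalar lags, and the function names are assumptions chosen for this example and are not taken from the text.

```python
import math

def exponential_cov(h, sill, range_par):
    """Assumed covariance model: C(h) = sill * exp(-|h| / range_par)."""
    return sill * math.exp(-abs(h) / range_par)

def correlogram(cov, h):
    """Correlation at lag h: the covariance at h divided by the covariance at 0."""
    return cov(h) / cov(0.0)

def cov_a(h):
    # Variance 1, range parameter 2.
    return exponential_cov(h, sill=1.0, range_par=2.0)

def cov_b(h):
    # Same range parameter, but a variance nine times larger.
    return exponential_cov(h, sill=9.0, range_par=2.0)

lags = [0.0, 1.0, 2.0, 4.0]
rho_a = [correlogram(cov_a, h) for h in lags]
rho_b = [correlogram(cov_b, h) for h in lags]
for h, ra, rb in zip(lags, rho_a, rho_b):
    print(f"h = {h:.1f}  rho_a = {ra:.4f}  rho_b = {rb:.4f}")
```

The two models differ only in their variance, so their correlograms coincide: dividing by the covariance at lag zero removes the scale of the process and leaves only the shape of the spatial dependence.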