The Wiley Classics Library consists of selected books that have been made more accessible to consumers in an effort to increase global appeal and general circulation. With these new unabridged softcover volumes, Wiley hopes to extend the lives of these works by making them available to future generations of statisticians, mathematicians, and scientists.
Spatial statistics -- analyzing spatial data through statistical models -- has proven exceptionally versatile, encompassing problems ranging from the microscopic to the astronomic. However, for the scientist and engineer faced only with scattered and uneven treatments of the subject in the scientific literature, learning how to make practical use of spatial statistics in day-to-day analytical work is very difficult. Designed exclusively for scientists eager to tap into the enormous potential of this analytical tool and upgrade their range of technical skills, Statistics for Spatial Data is a comprehensive, single-source guide to both the theoretical and applied aspects of spatial statistical methods. The hardcover edition was hailed by Mathematical Reviews as an "excellent book which will become a basic reference." This paperback version of the 1993 edition is designed to meet the many technological challenges facing the scientist and engineer. Concentrating on the three areas of geostatistical data, lattice data, and point patterns, the book sheds light on the link between data and model, revealing how design, inference, and diagnostics are an outgrowth of that link. It then explores new methods to reveal just how spatial statistical models can be used to solve important problems in a host of areas in science and engineering.
Discussion includes:
* Exploratory spatial data analysis
* Spectral theory for stationary processes
* Spatial scale
* Simulation methods for spatial processes
* Spatial bootstrapping
* Statistical image analysis and remote sensing
* Computational aspects of model fitting
* Application of models to disease mapping
Designed to accommodate the practical needs of the professional, it features a unified and common notation for its subject as well as many detailed examples woven into the text, numerous illustrations (including graphs that illuminate the theory discussed), and over 1,000 references. Fully balancing theory with applications, Statistics for Spatial Data, Revised Edition is an exceptionally clear guide on making optimal use of one of the ascendant analytical tools of the decade, one that has begun to capture the imagination of professionals in biology, earth science, civil, electrical, and agricultural engineering, geography, epidemiology, and ecology.
Number of pages: 1586
Year of publication: 2015
Contents
Cover
Half Title page
Title page
Copyright page
Dedication
Preface
Acknowledgments
Chapter 1: Statistics for Spatial Data
1.1 Spatial Data and Spatial Models
1.2 Introductory Examples
1.3 Statistics for Spatial Data: Why?
Part I: Geostatistical Data
Chapter 2: Geostatistics
2.1 Continuous Spatial Index
2.2 Spatial Data Analysis of Coal Ash in Pennsylvania
2.3 Stationary Processes
2.4 Estimation of the Variogram
2.5 Spectral Representations
2.6 Variogram Model Fitting
Chapter 3: Spatial Prediction and Kriging
3.1 Scale of Variation
3.2 Ordinary Kriging
3.3 Robust Kriging
3.4 Universal Kriging
3.5 Median-Polish Kriging
3.6 Geostatistical Data, Simulated and Real
Chapter 4: Applications of Geostatistics
4.1 Wolfcamp-Aquifer Data
4.2 Soil–Water Tension Data
4.3 Soil–Water-Infiltration Data
4.4 Sudden-Infant-Death-Syndrome Data
4.5 Wheat-Yield Data
4.6 Acid-Deposition Data
4.7 Space-Time Geostatistical Data
Chapter 5: Special Topics in Statistics for Spatial Data
5.1 Nonlinear Geostatistics
5.2 Change of Support
5.3 Stability of the Geostatistical Method
5.4 Intrinsic Random Functions of Order k
5.5 Applications of the Theory of Random Processes
5.6 Spatial Design
5.7 Field Trials
5.8 Infill Asymptotics
5.9 The Many Faces of Spatial Prediction
Part II: Lattice Data
Chapter 6: Spatial Models on Lattices
6.1 Lattices
6.2 Spatial Data Analysis of Sudden Infant Deaths in North Carolina
6.3 Conditionally and Simultaneously Specified Spatial Gaussian Models
6.4 Markov Random Fields
6.5 Conditionally Specified Spatial Models for Discrete Data
6.6 Conditionally Specified Spatial Models for Continuous Data
6.7 Simultaneously Specified and Other Spatial Models
6.8 Space-Time Models
Chapter 7: Inference for Lattice Models
7.1 Inference for the Mercer and Hall Wheat-Yield Data
7.2 Parameter Estimation for Lattice Models
7.3 Properties of Estimators
7.4 Statistical Image Analysis and Remote Sensing
7.5 Regional Mapping: Scotland Lip-Cancer Data
7.6 Sudden-Infant-Death-Syndrome Data
7.7 Lattice Data, Simulated and Real
Part III: Spatial Patterns
Chapter 8: Spatial Point Patterns
8.1 Random Spatial Index
8.2 Spatial Data Analysis of Longleaf Pines (Pinus palustris)
8.3 Point Process Theory
8.4 Complete Spatial Randomness, Distance Functions, and Second Moment Measures
8.5 Models and Model Fitting
8.6 Multivariate Spatial Point Processes
8.7 Marked Spatial Point Processes
8.8 Space–Time Point Patterns
8.9 Spatial Point Patterns, Simulated and Real
Chapter 9: Modeling Objects
9.1 Set Models
9.2 Random Parallelograms in ℝ²
9.3 Random Closed Sets and Mathematical Morphology
9.4 The Boolean Model
9.5 Methods of Boolean-Model Parameter Estimation
9.6 Inference for the Boolean Model
9.7 Modeling Growth with Random Sets
References
Author Index
Subject Index
WILEY SERIES IN PROBABILITY AND MATHEMATICAL STATISTICS
ESTABLISHED BY WALTER A. SHEWHART AND SAMUEL S. WILKS
Editors
Vic Barnett, Ralph A. Bradley, Nicholas I. Fisher, J. Stuart Hunter, Joseph B. Kadane, David G. Kendall, Adrian F. M. Smith, Stephen M. Stigler, Jozef L. Teugels, Geoffrey S. Watson
Probability and Mathematical Statistics
ADLER • The Geometry of Random Fields
ANDERSON • The Statistical Analysis of Time Series
ANDERSON • An Introduction to Multivariate Statistical Analysis, Second Edition
ARNOLD • The Theory of Linear Models and Multivariate Analysis
BARNETT • Comparative Statistical Inference, Second Edition
BERNARDO and SMITH • Bayesian Statistical Concepts and Theory
BHATTACHARYYA and JOHNSON • Statistical Concepts and Methods
BILLINGSLEY • Probability and Measure, Second Edition
BOROVKOV • Asymptotic Methods in Queuing Theory
BOSE and MANVEL • Introduction to Combinatorial Theory
CAINES • Linear Stochastic Systems
CHEN • Recursive Estimation and Control for Stochastic Systems
COCHRAN • Contributions to Statistics
COCHRAN • Planning and Analysis of Observational Studies
CONSTANTINE • Combinatorial Theory and Statistical Design
COVER and THOMAS • Elements of Information Theory
*DOOB • Stochastic Processes
DUDEWICZ and MISHRA • Modern Mathematical Statistics
EATON • Multivariate Statistics: A Vector Space Approach
ETHIER and KURTZ • Markov Processes: Characterization and Convergence
FELLER • An Introduction to Probability Theory and Its Applications, Volume I, Third Edition, Revised; Volume II, Second Edition
FULLER • Introduction to Statistical Time Series
FULLER • Measurement Error Models
GIFI • Nonlinear Multivariate Analysis
GRENANDER • Abstract Inference
GUTTORP • Statistical Inference for Branching Processes
HALD • A History of Probability and Statistics and Their Applications before 1750
HALL • Introduction to the Theory of Coverage Processes
HANNAN and DEISTLER • The Statistical Theory of Linear Systems
HEDAYAT and SINHA • Design and Inference in Finite Population Sampling
HOEL • Introduction to Mathematical Statistics, Fifth Edition
HUBER • Robust Statistics
IMAN and CONOVER • A Modern Approach to Statistics
IOSIFESCU • Finite Markov Processes and Applications
JOHNSON and BHATTACHARYYA • Statistics: Principles and Methods, Revised Printing
LAHA and ROHATGI • Probability Theory
LARSON • Introduction to Probability Theory and Statistical Inference, Third Edition
MATTHES, KERSTAN, and MECKE • Infinitely Divisible Point Processes
MORGENTHALER and TUKEY • Configural Polysampling: A Route to Practical Robustness
MUIRHEAD • Aspects of Multivariate Statistical Theory
OLIVER and SMITH • Influence Diagrams, Belief Nets and Decision Analysis
PILZ • Bayesian Estimation and Experimental Design in Linear Regression Models
PRESS • Bayesian Statistics: Principles, Models, and Applications
PURI and SEN • Nonparametric Methods in General Linear Models
PURI and SEN • Nonparametric Methods in Multivariate Analysis
PURI, VILAPLANA, and WERTZ • New Perspectives in Theoretical and Applied Statistics
RANDLES and WOLFE • Introduction to the Theory of Nonparametric Statistics
RAO • Asymptotic Theory of Statistical Inference
RAO • Linear Statistical Inference and Its Applications, Second Edition
ROBERTSON, WRIGHT, and DYKSTRA • Order Restricted Statistical Inference
ROGERS and WILLIAMS • Diffusions, Markov Processes, and Martingales, Volume II: Ito Calculus
ROHATGI • Statistical Inference
ROSS • Stochastic Processes
RUBINSTEIN • Simulation and the Monte Carlo Method
RUZSA and SZEKELY • Algebraic Probability Theory
SCHEFFE • The Analysis of Variance
SEBER • Linear Regression Analysis
SEBER • Multivariate Observations
SEBER and WILD • Nonlinear Regression
SEN • Sequential Nonparametrics: Invariance Principles and Statistical Inference
SERFLING • Approximation Theorems of Mathematical Statistics
SHORACK and WELLNER • Empirical Processes with Applications to Statistics
STAUDTE and SHEATHER • Robust Estimation and Testing
STOYANOV • Counterexamples in Probability
STYAN • The Collected Papers of T. W. Anderson: 1943–1985
WHITTAKER • Graphical Models in Applied Multivariate Statistics
YANG • The Construction Theory of Denumerable Markov Processes
Applied Probability and Statistics
ABRAHAM and LEDOLTER • Statistical Methods for Forecasting
AGRESTI • Analysis of Ordinal Categorical Data
AGRESTI • Categorical Data Analysis
AICKIN • Linear Statistical Analysis of Discrete Data
ANDERSON and LOYNES • The Teaching of Practical Statistics
ANDERSON, AUQUIER, HAUCK, OAKES, VANDAELE, and WEISBERG • Statistical Methods for Comparative Studies
ARTHANARI and DODGE • Mathematical Programming in Statistics
ASMUSSEN • Applied Probability and Queues
*BAILEY • The Elements of Stochastic Processes with Applications to the Natural Sciences
BARNETT • Interpreting Multivariate Data
BARNETT and LEWIS • Outliers in Statistical Data, Second Edition
BARTHOLOMEW • Stochastic Models for Social Processes, Third Edition
BARTHOLOMEW and FORBES • Statistical Techniques for Manpower Planning
BATES and WATTS • Nonlinear Regression Analysis and Its Applications
BECK and ARNOLD • Parameter Estimation in Engineering and Science
BELSLEY • Conditioning Diagnostics: Collinearity and Weak Data in Regression
BELSLEY, KUH, and WELSCH • Regression Diagnostics: Identifying Influential Data and Sources of Collinearity
BHAT • Elements of Applied Stochastic Processes, Second Edition
BHATTACHARYA and WAYMIRE • Stochastic Processes with Applications
BIEMER, GROVES, LYBERG, MATHIOWETZ, and SUDMAN • Measurement Errors in Surveys
BLOOMFIELD • Fourier Analysis of Time Series: An Introduction
BOLLEN • Structural Equations with Latent Variables
BOX • R. A. Fisher, the Life of a Scientist
BOX and DRAPER • Empirical Model-Building and Response Surfaces
BOX and DRAPER • Evolutionary Operation: A Statistical Method for Process Improvement
BOX, HUNTER, and HUNTER • Statistics for Experimenters: An Introduction to Design, Data Analysis, and Model Building
BROWN and HOLLANDER • Statistics: A Biomedical Introduction
BUCKLEW • Large Deviation Techniques in Decision, Simulation, and Estimation
BUNKE and BUNKE • Nonlinear Regression, Functional Relations and Robust Methods: Statistical Methods of Model Building
BUNKE and BUNKE • Statistical Inference in Linear Models, Volume I
CHAMBERS • Computational Methods for Data Analysis
CHATTERJEE and HADI • Sensitivity Analysis in Linear Regression
CHATTERJEE and PRICE • Regression Analysis by Example, Second Edition
CHOW • Econometric Analysis by Control Methods
CLARKE and DISNEY • Probability and Random Processes: A First Course with Applications, Second Edition
COCHRAN • Sampling Techniques, Third Edition
*Now available in a lower priced paperback edition in the Wiley Classics Library.
COCHRAN and COX • Experimental Designs, Second Edition
CONOVER • Practical Nonparametric Statistics, Second Edition
CONOVER and IMAN • Introduction to Modern Business Statistics
CORNELL • Experiments with Mixtures, Designs, Models, and the Analysis of Mixture Data, Second Edition
COX • A Handbook of Introductory Statistical Methods
COX • Planning of Experiments
CRESSIE • Statistics for Spatial Data
DANIEL • Applications of Statistics to Industrial Experimentation
DANIEL • Biostatistics: A Foundation for Analysis in the Health Sciences, Fourth Edition
DANIEL and WOOD • Fitting Equations to Data: Computer Analysis of Multifactor Data, Second Edition
DAVID • Order Statistics, Second Edition
DAVISON • Multidimensional Scaling
DEGROOT, FIENBERG, and KADANE • Statistics and the Law
*DEMING • Sample Design in Business Research
DILLON and GOLDSTEIN • Multivariate Analysis: Methods and Applications
DODGE • Analysis of Experiments with Missing Data
DODGE and ROMIG • Sampling Inspection Tables, Second Edition
DOWDY and WEARDEN • Statistics for Research, Second Edition
DRAPER and SMITH • Applied Regression Analysis, Second Edition
DUNN • Basic Statistics: A Primer for the Biomedical Sciences, Second Edition
DUNN and CLARK • Applied Statistics: Analysis of Variance and Regression, Second Edition
ELANDT-JOHNSON and JOHNSON • Survival Models and Data Analysis
FLEISS • The Design and Analysis of Clinical Experiments
FLEISS • Statistical Methods for Rates and Proportions, Second Edition
FLEMING and HARRINGTON • Counting Processes and Survival Analysis
FLURY • Common Principal Components and Related Multivariate Models
FRANKEN, KÖNIG, ARNDT, and SCHMIDT • Queues and Point Processes
GALLANT • Nonlinear Statistical Models
GIBBONS, OLKIN, and SOBEL • Selecting and Ordering Populations: A New Statistical Methodology
GNANADESIKAN • Methods for Statistical Data Analysis of Multivariate Observations
GREENBERG and WEBSTER • Advanced Econometrics: A Bridge to the Literature
GROSS and HARRIS • Fundamentals of Queueing Theory, Second Edition
GROVES • Survey Errors and Survey Costs
GROVES, BIEMER, LYBERG, MASSEY, NICHOLLS, and WAKSBERG • Telephone Survey Methodology
GUPTA and PANCHAPAKESAN • Multiple Decision Procedures: Theory and Methodology of Selecting and Ranking Populations
GUTTMAN, WILKS, and HUNTER • Introductory Engineering Statistics, Third Edition
HAHN and MEEKER • Statistical Intervals: A Guide for Practitioners
HAHN and SHAPIRO • Statistical Models in Engineering
HALD • Statistical Tables and Formulas
HALD • Statistical Theory with Engineering Applications
HAND • Discrimination and Classification
HEIBERGER • Computation for the Analysis of Designed Experiments
HELLER • MACSYMA for Statisticians
HOAGLIN, MOSTELLER, and TUKEY • Exploratory Approach to Analysis of Variance
HOAGLIN, MOSTELLER, and TUKEY • Exploring Data Tables, Trends and Shapes
HOAGLIN, MOSTELLER, and TUKEY • Understanding Robust and Exploratory Data Analysis
HOCHBERG and TAMHANE • Multiple Comparison Procedures
HOEL • Elementary Statistics, Fourth Edition
HOEL and JESSEN • Basic Statistics for Business and Economics, Third Edition
HOGG and KLUGMAN • Loss Distributions
HOLLANDER and WOLFE • Nonparametric Statistical Methods
HOSMER and LEMESHOW • Applied Logistic Regression
IMAN and CONOVER • Modern Business Statistics
JACKSON • A User’s Guide to Principal Components
JESSEN • Statistical Survey Techniques
JOHN • Statistical Methods in Engineering and Quality Assurance
JOHNSON • Multivariate Statistical Simulation
JOHNSON and KOTZ • Distributions in Statistics: Discrete Distributions; Continuous Univariate Distributions-1; Continuous Univariate Distributions-2; Continuous Multivariate Distributions
JUDGE, GRIFFITHS, HILL, LÜTKEPOHL, and LEE • The Theory and Practice of Econometrics, Second Edition
JUDGE, HILL, GRIFFITHS, LÜTKEPOHL, and LEE • Introduction to the Theory and Practice of Econometrics, Second Edition
KALBFLEISCH and PRENTICE • The Statistical Analysis of Failure Time Data
KASPRZYK, DUNCAN, KALTON, and SINGH • Panel Surveys
KAUFMAN and ROUSSEEUW • Finding Groups in Data: An Introduction to Cluster Analysis
KEENEY and RAIFFA • Decisions with Multiple Objectives
KISH • Statistical Design for Research
KISH • Survey Sampling
KUH, NEESE, and HOLLINGER • Structural Sensitivity in Econometric Models
LAWLESS • Statistical Models and Methods for Lifetime Data
LEAMER • Specification Searches: Ad Hoc Inference with Nonexperimental Data
LEBART, MORINEAU, and WARWICK • Multivariate Descriptive Statistical Analysis: Correspondence Analysis and Related Techniques for Large Matrices
LEVY and LEMESHOW • Sampling of Populations: Methods and Applications
LINHART and ZUCCHINI • Model Selection
LITTLE and RUBIN • Statistical Analysis with Missing Data
McNEIL • Interactive Data Analysis
MAGNUS and NEUDECKER • Matrix Differential Calculus with Applications in Statistics and Econometrics
MAINDONALD • Statistical Computation
MALLOWS • Design, Data, and Analysis by Some Friends of Cuthbert Daniel
MANN, SCHAFER, and SINGPURWALLA • Methods for Statistical Analysis of Reliability and Life Data
MASON, GUNST, and HESS • Statistical Design and Analysis of Experiments with Applications to Engineering and Science
MILLER • Survival Analysis
MILLER, EFRON, BROWN, and MOSES • Biostatistics Casebook
MONTGOMERY and PECK • Introduction to Linear Regression Analysis, Second Edition
NELSON • Accelerated Testing, Statistical Models, Test Plans, and Data Analyses
NELSON • Applied Life Data Analysis
OCHI • Applied Probability and Stochastic Processes in Engineering and Physical Sciences
OSBORNE • Finite Algorithms in Optimization and Data Analysis
OTNES and ENOCHSON • Digital Time Series Analysis
PANKRATZ • Forecasting with Dynamic Regression Models
PANKRATZ • Forecasting with Univariate Box-Jenkins Models: Concepts and Cases
POLLOCK • The Algebra of Econometrics
RAO and MITRA • Generalized Inverse of Matrices and Its Applications
RÉNYI • A Diary on Information Theory
RIPLEY • Spatial Statistics
RIPLEY • Stochastic Simulation
ROSS • Introduction to Probability and Statistics for Engineers and Scientists
ROUSSEEUW and LEROY • Robust Regression and Outlier Detection
RUBIN • Multiple Imputation for Nonresponse in Surveys
RUBINSTEIN • Monte Carlo Optimization, Simulation, and Sensitivity of Queueing Networks
RYAN • Statistical Methods for Quality Improvement
SCHUSS • Theory and Applications of Stochastic Differential Equations
SEARLE • Linear Models
SEARLE • Linear Models for Unbalanced Data
SEARLE • Matrix Algebra Useful for Statistics
SEARLE, CASELLA, and McCULLOCH • Variance Components
SKINNER, HOLT, and SMITH • Analysis of Complex Surveys
SPRINGER • The Algebra of Random Variables
STEUER • Multiple Criteria Optimization
STOYAN • Comparison Methods for Queues and Other Stochastic Models
STOYAN, KENDALL, and MECKE • Stochastic Geometry and Its Applications
THOMPSON • Empirical Model Building
TIERNEY • LISP-STAT: An Object-Oriented Environment for Statistical Computing and Dynamic Graphics
TIJMS • Stochastic Modeling and Analysis: A Computational Approach
TITTERINGTON, SMITH, and MAKOV • Statistical Analysis of Finite Mixture Distributions
UPTON and FINGLETON • Spatial Data Analysis by Example, Volume I: Point Pattern and Quantitative Data
UPTON and FINGLETON • Spatial Data Analysis by Example, Volume II: Categorical and Directional Data
VAN RIJCKEVORSEL and DE LEEUW • Component and Correspondence Analysis
WEISBERG • Applied Linear Regression, Second Edition
WHITTLE • Optimization Over Time: Dynamic Programming and Stochastic Control, Volume I and Volume II
WHITTLE • Systems in Stochastic Equilibrium
WONNACOTT and WONNACOTT • Econometrics, Second Edition
WONNACOTT and WONNACOTT • Introductory Statistics, Fourth Edition
WONNACOTT and WONNACOTT • Introductory Statistics for Business and Economics, Third Edition
WOOLSON • Statistical Methods for the Analysis of Biomedical Data
Tracts on Probability and Statistics
AMBARTZUMIAN • Combinatorial Integral Geometry
BIBBY and TOUTENBURG • Prediction and Improved Estimation in Linear Models
BILLINGSLEY • Convergence of Probability Measures
KELLY • Reversibility and Stochastic Networks
TOUTENBURG • Prior Information in Linear Models
Statistics for Spatial Data
Copyright © 1993 by John Wiley & Sons, Inc.
All rights reserved. Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008.
Library of Congress Cataloging in Publication Data:
Cressie, Noel A. C. Statistics for spatial data revised edition / Noel A. C. Cressie. p. cm. -- (Wiley series in probability and mathematical statistics. Applied probability and statistics section)
“A Wiley-Interscience publication.” Includes bibliographical references and index. 1. Spatial analysis (Statistics) I. Title. II. Series.
QA278.2.C75 1993    519.5--dc20    93-775
CIP
ISBN 0-471-00255-0
To Yoko, Amie, and Sean
Preface
The purpose of this book is to present Statistics for spatial data to scientists and engineers. (Notice that Statistics is capitalized to distinguish it from its other meaning: a collection of numbers that summarize a complex phenomenon—such as baseball or cricket.) In the last 10 years, much interest has been generated in the area, but its exposure in the literature has been uneven. This book attempts to take that literature and extend it, correct it, and unify it. What appears to be a gathering of unconnected subject areas can be annealed into a cohesive approach to the analysis of spatial data. Chapter 1 provides an overview of the approach and of the enormous diversity of problems involving spatial data, from the microscopic to the astronomic.
The book attempts to give a somewhat complete coverage of each of three parts, dealing with geostatistical data, lattice data, and point patterns. Thus, the subject areas are classified according to the type of observations encountered, reflecting my belief that the roots of statistical science are in data. Statistical models, then, try to make sense out of the data, albeit imperfectly. Design, inference, and diagnostics are natural consequences of the data–model symbiosis, and all play an important role in Statistics for spatial data.
This book grew from lecture notes for a one-semester, 3-credit Statistics graduate course that I conduct at Iowa State University. In 45 lectures, each of 50 minutes' duration, I cover the following topics:
Part I (Geostatistical Data): All of Chapter 2 except Section 2.5. All of Chapter 3 except Sections 3.3 and 3.6.
Part II (Lattice Data): Chapter 6. Chapter 7, Sections 7.2, 7.3, and 7.6.
Part III (Point Patterns): Chapter 8, Sections 8.1, 8.2, 8.4, 8.5.1, 8.5.2, 8.5.3. Chapter 9, Section 9.1.
Prerequisites for the Statistics graduate course are one semester of Masters-level statistical inference and one semester of Masters-level linear models. While giving the course and preparing the book, I have benefitted from useful reference books in the area. Reading lists for the course have included Matern (1960), Bartlett (1975), Journel and Huijbregts (1978), Cliff and Ord (1981), Ripley (1981), and Upton and Fingleton (1985).
It is my hope that this book can be read in varying depths by people with varying mathematical and statistical backgrounds. However, there are sections in the book, on random processes, point processes, and random sets, that are beyond a Masters-level student and closer to the frontiers of theoretical research (these sections are denoted by an asterisk). Equally, there are sections that concentrate purely on an application and do not add to the theoretical development of the subject (these sections are denoted by a dagger). These applications-oriented sections of the book should appeal to a large number of scientists and engineers, with at least the background of a service course in Statistics (or equivalent). For this reason, most chapters begin with an application, which is meant to be an invitation to read on. The emphasis on applications has led to considerable use of graphs and illustrations.
There are some features of the book that I believe will enhance its value as a textbook and as a reference book. An attempt has been made at completeness, in terms of the topics covered, and uniformity, in terms of the depth to which they are covered, except for Chapters 5 and 9. These two chapters contain material that is either of personal interest or is speculative in nature. The reader will notice frequent referencing to a diverse literature; one of the interesting features of Statistics for spatial data is that a large proportion of it is appearing outside Statistics journals. The referencing allows me to pick up apparently different streams of thought and tie them together. Equally importantly, I have tried to give credit where it is due. The linear reader will also notice a certain amount of repetition between chapters (and, to a lesser extent, between sections). This is deliberate and is meant to help the sporadic reader who wants to understand the essence of a topic but who has not read all the previous pages.
We should not forget our roots. All data sets for the various spatial statistical analyses are given in the book, as well as some background to the problems being studied. Further, each of the three parts has a section devoted to sources of spatial data, both real and simulated.
No exercises are given at the end of sections; the depth of coverage within a section should allow practice exercises to suggest themselves to an instructor teaching from the book. Software is not given. (A geostatistics package, Toolkit, by Geostokos, London, was used for the kriging presented in Chapter 3.) Currently, those of us who work in the area tend to custom-build our own software, which is usually not very portable. Statistics for spatial data will truly realize its enormous potential when a comprehensive software package is developed.
This is a big book. I had thought of splitting it into smaller volumes; however, the present format emphasizes the subject’s unity. This may be the last time spatial Statistics will be squeezed between two covers. A healthy exponential growth of the literature is apparent from the bibliography.
The future of the subject is in solving problems for spatiotemporal data; some sections are devoted to it, but a proper treatment needs another book. Statistics for spatial and temporal data would provide dynamic models for phenomena distributed through space and evolving in time. Onward into the next decade!
NOEL CRESSIE
Ames, Iowa
December 1, 1990
Acknowledgments
As a boy, knowing “why,” “how,” and “when” was not enough for me. With encouragement from my parents, a keen interest in the “where” question eventually led me to summer jobs with mining companies in Western Australia (while an undergraduate) and three years of doctoral work at Princeton University. There, I had the good fortune to be taught by Geof Watson, John Tukey, and, in my last year, Julian Besag. Geof gave me an appreciation for all things spatial and geometric, John showed me how to make my data speak (sometimes, even sing) to me, and Julian introduced me to the mysteries of Markov random fields (and English hockey). More recently, in his role as a series editor for Wiley and as an aficionado of Statistics for spatial data, Geof Watson has been of immense help with his comments and his support.
The influence of Georges Matheron of the École Nationale Supérieure des Mines de Paris in Part I and Chapter 9 is obvious; his work has been truly pioneering. I was fortunate to spend a post-doctoral period of five months at his center in Fontainebleau, France, in 1975.
My coauthors on spatial articles provided valuable impetus to my research in the subject and my colleagues throughout the world have (through their letters, their telephone calls, their questions at seminars, their anonymous referees’ reports, and their comments in hallways) helped shape my current opinions on Statistics in general and on Statistics for spatial data in particular. Along with those mentioned in the paragraphs above, I would like to thank Peter Diggle for early suggestions on topics that a course on spatial statistics might cover, Dale Zimmerman for comments on Chapter 5, Subhash Lele for comments on parts of Chapters 6 and 7, and Daryl Daley for comments on Chapter 8.
The task of taking a diverse and uneven literature on spatial Statistics and extending it, correcting it, and unifying it has not been an easy one. The editors at Wiley have been very understanding of my desire for a complete and uniformly comprehensive product. Bea Shube provided ideas and encouragement in the early stages of this project. In the last three years, helpful advice has come from Kate Roach, and, for a brief period before her, from Maggie Irwin. My department head at Iowa State University, Dean Isaacson, has been equally understanding and in various ways helped to make an impossible task possible.
The writing of this book started in the second half of 1985 while I was an ASA/NSF/Census Fellow at the U.S. Bureau of the Census. Based on an experimental course called Spatial Statistics that I conducted at Iowa State University in 1984, I gave a series of seven seminars at the Census Bureau and distributed material that later became Chapter 6. The rest of the book (over 90%) was written at Iowa State, partially supported by the Department of Statistics and by the National Science Foundation. A special mention should be made of the superb journal and book collection of the Parks Library at Iowa State; it is one of the great resources of the University, and has helped me achieve a coverage that would otherwise have been impossible.
We are very fortunate, in our Department of Statistics, to be surrounded by intelligent and motivated graduate students. I have now offered my course on Spatial Statistics four times; it has helped me attract good students with a keen interest in the subject. As research assistants and as candidates for graduate degrees, they have been involved in various aspects of this book. The contributions of three of them deserve special mention and thanks. Stephen Rathbun wrote a preliminary draft of Sections 8.3, 8.5 through 8.9, and 7.4, and commented on subsequent drafts of Chapter 8. Carol Gotway wrote a preliminary draft of Sections 3.6 and 5.1, Martín Grondona wrote a preliminary draft of Sections 5.6 and 5.7, and Gotway and Grondona gave comments on Part I. All three were involved in the production of the figures. My thanks also go to Renkuan Guo, Jeff Helterbrand, Fred Hulting, and Jay Ver Hoef, who contributed in various ways to the preliminary through penultimate and final drafts.
This manuscript was turned into immaculate type by Sharon Shepard; by comparing what I gave her and what she gave me, it is clear that Sharon was able to work near miracles. Jeanette LaGrange, Rose Ann Anderson, and Jan Franklin also provided valuable secretarial assistance.
A trace of my space–time line would show a trip to Tokyo, Japan, in 1987. There I met Yoko. She could rightly claim that Western marriage vows say nothing about writing a book. Yet somehow she understood. We are fortunate to have been able to create two new space–time lines in our household, Amie and Sean. It was difficult for them to understand, and even harder for me to persevere. My deepest gratitude goes to all of them for their love and patience.
I am glad I started exploring Statistics for spatial data, because I have learned a great deal. Some of the territory is now well charted. Other parts are only passable by those who are sure-footed, and by not looking down. Much more of it I can only glimpse and tell you what I see. It is an exciting area that deserves a place in every statistical scientist’s repertoire; I hope you will agree.
N. A. C. C.
Perhaps, it may turn out a Sang; Perhaps, turn out a Sermon.
ROBERT BURNS
Statistics, the science of uncertainty, attempts to model order in disorder. It is not surprising that students (and their teachers) find the subject enigmatic. However, as life experiences and scientific experiences accumulate, Statistics is usually recognized as an extremely powerful research tool. Even when the disorder is discovered to have a perfectly rational explanation at one scale, there is very often a smaller scale where the data do not fit the theory exactly, and the need arises to investigate the new, residual uncertainty.
A physical interpretation of this inequality might be that the universe tends to seek levels of entropy that are higher in relation to previous levels. Technology attempts to slow the increase in entropy by constraining evolving systems, but there are convincing arguments given that it will never decrease total entropy (e.g., Brooks and Wiley, 1988, Chapter 2). In short, Statistics is important because disorder is here to stay.
Beginning classes in Statistics (and many of the more advanced ones) always assume that observations on a phenomenon are taken under identical conditions and that each observation is taken independently of any other. The data then form a random sample [i.e., are independent and identically distributed (i.i.d.)]; standard statistical techniques can be applied to build a statistical model and to estimate the model’s parameters (see, e.g., Hogg and Craig, 1978). For example, Heyl and Cook (1936) describe experiments performed between May 1934 and July 1935 to determine the acceleration of gravity in a laboratory of the National Bureau of Standards, Washington, D.C. The method used was that of the reversible pendulum; various configurations of pendulum tube diameter and type of knife edge were used. One particular configuration, taken late May/early June 1934, yielded (after appropriate adjustments for flexure and clock rate were made)
expressed as deviations from $980{,}000 \times 10^{-3}$ cm/sec$^2$, in units of $10^{-3}$ cm/sec$^2$ (Heyl and Cook, 1936, Table 12). That these data can be modeled as arising from a random sample is probably a fair assumption, but a later experiment performed under a different configuration yielded deviations (again after appropriate adjustments)
in units of $10^{-3}$ cm/sec$^2$ (Heyl and Cook, 1936, Table 8). Again these could be modeled as a random sample, but because each experiment attempted to measure the same physical constant, the data should obviously be combined in some way. Is it fair to model the preceding 16 numbers as observations from a random sample?
Lack of homogeneity in data is usually accounted for in statistical models by a nonconstant-mean assumption; often the mean is assumed to be a linear combination of several explanatory variables. However, even after these large-scale variations are accounted for, there are often reasons to suspect inhomogeneous small-scale variations.
Cressie (1982) assumes the data from the experiment (to determine the acceleration of gravity) just described to be independent realizations from statistical distributions whose means are constant but whose variances differ (sometimes called heteroskedasticity) markedly, depending on the configurations of pendulum tube diameter and type of knife edge. Standard one-sample theory no longer applies, but it is still possible to construct a confidence interval for the common mean, based on a weighted t-like statistic.
The preceding example shows a relaxation of the identical-distribution assumption. Relaxation of the independence assumption is a further obvious way to generalize statistical models, but are these more general models of any scientific value? In the pages to follow, I hope to convince the reader that the answer to this question is very definitely “Yes.”
Independence is a very convenient assumption that makes much of mathematical–statistical theory tractable. However, models that involve statistical dependence are often more realistic; two classes of models that have commonly been used involve intraclass-correlation structures and serial-correlation structures. These offer little scope for spatial data, where dependence is present in all directions and becomes weaker as data locations become more dispersed.
We have not yet been able to escape the three-dimensional world in which we live, nor the unidirectional flow of time through which we live. The notion that data close together, in time or space, are likely to be correlated (i.e., cannot be modeled as statistically independent) is a natural one and has been used successfully by statisticians to model physical and social phenomena. Purely temporal models, or time series models as they have come to be known (e.g., Box and Jenkins, 1970), are usually based on identically distributed observations that are dependent and occur at equally spaced time points. The unidirectional flow of time underlies the construction of these models.
Spatial models are a more recent addition to the Statistics literature. Geology, soil science, image processing, epidemiology, crop science, ecology, forestry, astronomy, atmospheric science, or simply any discipline that works with data collected from different spatial locations, need to develop (not necessarily statistical) models that indicate when there is dependence between measurements at different locations. However, the models need to be more flexible than their temporal counterparts, because past, present, and future have no analogy in space, and furthermore it is simply not reasonable to assume that spatial locations of data occur regularly, as most time series models assume of their time points.
When dealing with data where (space–time) dependencies are likely, two approaches can be contrasted. Departures from the independence paradigm could be modeled, or statistical procedures could be constructed that would be robust to these departures. Throughout this book I shall consider mostly the modeling approach.
It is useful to consider an example that has both the temporal and spatial component in it. The weather is a universal topic of conversation (some would say it is a resort of the desperate), and among its various facets that of rainfall is paramount. To the more than one million inhabitants of South Australia, the driest state in the driest continent, drought can be devastating. In a series of papers, Cornish, with others (1936, 1954, 1958, 1961, 1976), looked at temporal and spatial aspects of 26 rainfall recording stations in South Australia, shown in Figure 1.1. Data analyzed were monthly rainfall amounts at 26 locations that had been recording for a period of 30 years or more. Thus, at any one location, the data form a typically lengthy time series (e.g., for Adelaide, the capital city of South Australia, there were 1248 monthly observations available), and at any one time point, the data are spatial and not more than 26 in number; over time and space they form a collection of approximately 20,000 observations. More recently, meteorological space–time data sets have been collected for the purposes of studying the effects of atmospheric pollution, in particular acid rain (see, e.g., Peters and Bonelli, 1982). Daily data collection over a number of years at various locations throughout, say, the northeastern United States yields a massive data set. But most of it is temporal so that, spatially speaking, the data are still rather sparse. Nevertheless, spatial prediction is just as important as temporal prediction, because people living in those cities and rural districts without monitoring stations have the same right to know how little or how much their water or their air is polluted. (Section 4.6 discusses some of the issues surrounding acid rain and analyzes a spatial data set.)
Figure 1.1 Map of South Australia showing 26 rainfall recording locations: Adelaide (1), Beltana (2), Blinman (3), Booleroo Centre (4), Bordertown (5), Broken Hill (6), Clare (7), Cleve (8), Cordillo Downs (9), Fowler’s Bay (10), Gawler (11), Hawker (12), Kapunda (13), Maitland (14), Marree (15), Mount Gambier (16), Murray Bridge (17), Naracoorte (18), Peterborough (19), Port Augusta (20), Port Lincoln (21), Port Pirie (22), Stirling West (23), Strathalbyn (24), Yardea (25), Yorketown (26).
The topics covered in this book are almost exclusively related to data analysis and statistical modeling of spatial data; the basic components are spatial locations $\{s_1, \ldots, s_n\}$ and data $\{Z(s_1), \ldots, Z(s_n)\}$ observed at those locations. Usually the data are assumed random and sometimes the locations are assumed random. Moreover, once the locations are given, the possibility of mistaken or imprecise positioning is generally not modeled. Some modifications to this paradigm are occasionally considered; for example, see Sections 4.4, 4.5, and 6.1.
Issues regarding the measurement, storage, and retrieval of spatial information are extremely important, although that is not the emphasis in this book. A geographic information system (GIS) is a collection of computer software tools that facilitate, through georeferencing, the integration of spatial, nonspatial, qualitative, and quantitative data into a data base that can be managed under one system environment (e.g., Burrough, 1986). Much of the research in GIS has been concerned with computational geometry, spatial discretization, spatial languages and user interfaces, systems designs and architectures for data integration, spatial data handling approaches for alternative system architectures (such as parallel processing and neural networks), and so forth. Satellites are amassing huge amounts of data, a small fraction of which is being analyzed. This embarrassment of riches must be controlled by being selective; selectivity can be guided by the types of problems solved and the models that will be needed to solve them. Geographic information systems have only recently begun to incorporate model-based spatial analyses into their information-processing subsystems; it is hoped that this book will provide ideas for further initiatives.
My principal purpose for writing this book is to make statistical theory and methods for spatial data available to scientists and engineers. I believe that the full potential of Statistics is seen in its application to substantive problems, and I have tried to reinforce this in the applications I have given throughout the book. I hope that readers will find ideas and information here that will be useful in their own areas of scientific endeavor, although I do recognize that there are large parts that may be inaccessible to all but the most theoretically inclined. Such sections have been marked with an asterisk to warn the nonstatistician to obtain help in reading them. By way of balance, those sections that are truly oriented toward applications have been marked with a dagger.
The book is divided into three parts, dealing with spatial processes indexed over continuous space (a topic sometimes referred to as geostatistics), spatial processes indexed over lattices in space (the spatial analogue of time series), and spatial point processes (including marked point processes and random set processes). These are convenient subareas that do not exhaust all of the statistical models one might use to analyze spatial data. The next section gives some background on spatial problems and defines the general model considered in this book.
The first manifestations of statistics for spatial data appear to have arisen in the form of data maps. For example, Halley (1686) superimposed, onto a map of land forms, directions of trade winds and monsoons between and near the tropics, and attempted to assign them a physical cause.
Spatial models appeared much later. For example, Student (1907) was concerned with the distribution of particles throughout a liquid. Instead of analyzing their spatial positions he aggregated the data into counts of particles per unit area: A hemocytometer of area 1 mm$^2$, divided into 400 squares, was used to count yeast cells. Student found that the distribution of the number of cells per square followed a Poisson distribution. (Daley and Vere-Jones, 1988, p. 8, can be consulted for a brief history of counting problems.)
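The aggregation step just described can be sketched in a few lines. The following is only an illustration under stated assumptions: the "cells" are simulated uniformly on a unit square (they are not Student's data), and the function names `quadrat_counts` and `dispersion_index` are hypothetical. For counts that behave like Poisson variables, the ratio of sample variance to sample mean should be close to 1.

```python
import numpy as np

def quadrat_counts(points, n_side):
    """Count points of a pattern on the unit square in an n_side x n_side grid."""
    counts, _, _ = np.histogram2d(
        points[:, 0], points[:, 1],
        bins=n_side, range=[[0.0, 1.0], [0.0, 1.0]],
    )
    return counts.astype(int)

def dispersion_index(counts):
    """Sample variance of the counts divided by their mean (near 1 for Poisson-like counts)."""
    flat = counts.ravel()
    return flat.var(ddof=1) / flat.mean()

rng = np.random.default_rng(0)
# Hypothetical data: 2000 "cells" placed independently and uniformly on the slide.
cells = rng.uniform(0.0, 1.0, size=(2000, 2))
counts = quadrat_counts(cells, n_side=20)   # 20 x 20 = 400 squares, as in the hemocytometer
index = dispersion_index(counts)
print(counts.sum(), round(index, 2))
```

An index well above 1 would suggest clumping of cells and an index well below 1 would suggest regular spacing; the counts alone, however, discard the positional information within each square.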
R. A. Fisher was clearly aware of spatial dependence in agricultural field experiments, because he went to such great lengths to remove it (see Fisher, 1935, Chapter IV). In the 1920s and 1930s, at Rothamsted Experimental Station in England, he established the principles of randomization, blocking, and replication. As well as controlling for unwanted bias, randomization also neutralizes (but does not remove) the effect of spatial correlation (Yates, 1938; Section 5.6.2). However, it should be realized that randomization does not neutralize the spatial correlation at spatial scales larger or smaller than the plot dimensions.
Fairfield Smith (1938) was concerned with choosing plot dimensions so that any increase in plot size would yield little decrease in error variance. Although his analysis was empirical, the very formulation of the problem recognizes the presence of spatial correlation in field experiments. Models for such phenomena did not begin to appear until much later (Whittle, 1954).
Nearest-neighbor methods for analyzing agricultural field trials attempt to take spatial dependence into account, indirectly, by using residuals from neighboring plots as covariates, or by differencing (Papadakis, 1937; Bartlett, 1938, 1978; Wilkinson et al., 1983; Besag and Kempton, 1986). A review of these methods is given in Section 5.7.1.
Presently, Statistics can be found in any quantitative discipline, giving rise to rich variations on the theme that was played at Rothamsted 60 years ago. In areas such as geology, ecology, and environmental science, it is not often possible (nor always appropriate) to randomize, block, and replicate the data. There is a need for new statistical models and approaches that address new questions arising from old and new technologies. Many of the resulting problems, such as resource assessment, environmental monitoring (e.g., for global warming), and medical imaging, are spatial in nature.
Statistics, in all its guises from exploratory data analysis to asymptotic distribution theory of parameter estimators, relies on a more or less vague stochastic model. I shall present such a model for spatial data, which has a very simple structure that is flexible enough to handle an extremely large class of problems (including the ubiquitous i.i.d. case). The data may be continuous or discrete, they may be spatial aggregations or observations at points in space, their spatial locations may be regular or irregular, and those locations may be from a spatial continuum or a discrete set. At the very least, the stochastic model is used to summarize extant data or to predict unobserved data. It may or may not explain why a particular phenomenon occurs, and so should be distinguished from the more commonly accepted use of the word “model.” To many scientists, a model must have causative dynamic components, but that is not necessarily the case in this book.
Let $s \in \mathbb{R}^d$ be a generic data location in $d$-dimensional Euclidean space and suppose that the potential datum $Z(s)$ at spatial location $s$ is a random quantity. Now let $s$ vary over index set $D \subset \mathbb{R}^d$ so as to generate the multivariate random field (or random process)
$$\{Z(s) : s \in D\}; \tag{1.1.1}$$
a realization of (1.1.1) is denoted $\{z(s) : s \in D\}$. This “superpopulation” model for spatial data is used exclusively throughout the book, although brief discussion of a design-based approach (whose randomness is derived from sampling the deterministic process $\{z(s) : s \in D\}$) is given in Section 2.3.
Usually, $D$ is assumed to be a fixed (i.e., nonrandom) subset of $\mathbb{R}^d$ (see, e.g., Vanmarcke, 1983), but I shall assume more generally that $D$ is a random set. Formally speaking, a random set is a measurable mapping from a probability space onto a measure space of (usually closed) subsets of $\mathbb{R}^d$ (see Section 9.3 for more details). Less formally speaking, I shall assume that $D$ as well as $Z$ may vary from realization to realization, giving another source of randomness to the problem. This simple structure allows me to talk about problems with continuous spatial index, problems with lattice index, spatial point patterns, and more, all under the same umbrella. Thus, the three seemingly distinct parts of this book alluded to earlier treat special cases of (1.1.1), where the index set $D$ could be a random set. Specifically:
The flexibility of (1.1.1) is now apparent, and although it is clear that $D$ and $Z(s)$ could be even more general random quantities, the fact that $D$ is a subset of $\mathbb{R}^d$ allows (1.1.1) to be called a spatial process. It is even possible to think of the ubiquitous i.i.d. (possibly multivariate) model as a 0-dimensional spatial process, because any spatial index is unimportant. I have not mentioned how the $D$ process and the $Z$ process might covary. For most of this book, one or the other is fixed so the question does not arise. When both are random, they are usually assumed to be independent, or one process is analyzed conditionally upon the observed values of the other. Spatial modeling occurs then within the $Z$ process (Parts I and II), within the $D$ process, or within both processes (Part III), and typically involves modeling the large- and the small-scale variations in terms of a finite number of parameters.
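To make the distinction between a fixed and a random index set concrete, here is a rough sketch of how data from the three parts of the book might be laid out. Everything in it is an assumption made for illustration: the helper `simulate_z`, its linear trend, and its independent Gaussian noise are hypothetical stand-ins for the Z process, and no spatial dependence is modeled.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_z(locations, rng):
    """Hypothetical Z process: linear trend in the first coordinate plus independent noise."""
    return 2.0 * locations[:, 0] + rng.normal(0.0, 0.5, size=len(locations))

# Continuous spatial index: D is the fixed unit square, observed at irregular sites.
geo_sites = rng.uniform(0.0, 1.0, size=(30, 2))
geo_data = simulate_z(geo_sites, rng)

# Lattice index: D is a fixed, countable collection of sites (here a regular 5 x 5 grid).
gx, gy = np.meshgrid(np.arange(5), np.arange(5), indexing="ij")
lattice_sites = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)
lattice_data = simulate_z(lattice_sites, rng)

# Point pattern: D itself is random, so the number and positions of events
# change from one realization to the next.
n_events = rng.poisson(40)
pattern_sites = rng.uniform(0.0, 1.0, size=(n_events, 2))
```

In the first two cases only the data vector changes between realizations; in the third, the array of locations is itself the object of study.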
I prefer to keep (multivariate) time-series processes separate from spatial processes by using the index $t$ and denoting them as
$$\{Z(t) : t \in T\}. \tag{1.1.2}$$
The unidirectional flow of time sometimes forces one to distinguish between (1.1.2) and (1.1.1) in $\mathbb{R}^1$ (see Section 6.3).
A space–time process will be denoted as
$$\{Z(s; t) : s \in D(t),\ t \in T\}, \tag{1.1.3}$$
where each of Z, D, and T is possibly random. This is as general a process as will be considered in this book.
As disciplines in Statistics mature, their statistical analyses advance through three stages of development that I describe as description, indication, and estimation. For the first, the goal is to summarize the data (perhaps to suggest models). For the second, estimates of model parameters are obtained from the data, but no measures of precision are available. For the third, enough distribution theory is at hand to allow (approximate) inference on the model parameters; for example, the bias and variance of an estimator can be calculated and estimated. Statistics for spatial data is still a young discipline, showing increasing signs of maturity. I hope this book will contribute to, and be an influence on, its formative years.
In this section, I shall present examples where data are spatial in nature. Some data sets are new and some have already appeared in the literature, although they may not have been analyzed spatially. It is not my intention to present spatial statistical analyses in this section, but rather to illustrate the range of problems that can be addressed. For some of these data, more complete analyses are carried out later in the book.
As described in Section 1.1, spatial data can be thought of as resulting from observations on the stochastic process
$$\{Z(s) : s \in D\}, \tag{1.2.1}$$
where $D$ is possibly a random set in $\mathbb{R}^d$. I shall present some special cases that are typical of spatial statistical problems one might encounter.
Geostatistics (Matheron, 1962, 1963a, 1963b) emerged in the early 1980s as a hybrid discipline of mining engineering, geology, mathematics, and statistics. Its strength over more classical approaches to ore-reserve estimation is that it recognizes spatial variability at both the large scale and the small scale, or in statistical parlance it models both spatial trend and spatial correlation. Trend-surface methods (e.g., Whitten, 1970) include only large-scale variation, assuming independent errors. Watson (1972) compares the two approaches and points out that most geological problems have a small-scale variation, typically exhibiting strong positive correlation between data at nearby spatial locations. One of the most important problems in geostatistics is to predict the ore grade in a mining block from observed samples. Matheron (1963b) has called this process of prediction kriging (see Chapter 3).
Exploration data on coal ash (Gomez and Hazen, 1970, Tables 19 and 20) for the Robena Mine Property in Greene County, Pennsylvania, are an example of regularly spaced data in $\mathbb{R}^2$, although some locations have not been sampled or their observations are missing. Section 2.2 performs a spatial exploratory data analysis on these mining data, and kriging is carried out in Section 3.4.
The geostatistical method has found favor among soil scientists who seek to map soil properties of a field from a small number of soil samples at known locations throughout the field. For example, soil pH in water, soil electrical conductivity, exchangeable potassium in the soil, soil–water tension, and soil–water infiltration are some of the variables that could be sampled. Water erosion is of great concern to agriculturalists, because rich topsoil can be carried away in runoff water. Some forms of tillage result in greater soil–water infiltration than others, and the study described in Section 4.3 investigates this question. The greater the infiltration, the less the runoff, resulting in less soil erosion and less stream pollution by pesticides and fertilizers. Also, greater infiltration implies better soil structure that is more conducive to crop growth. In Cressie and Horton (1987) and Section 4.3, it is explained how double-ring infiltrometers were placed at regular locations in a field that had received four tillage treatments. From these data the spatial relationships can be characterized, treatments compared, and kriging maps drawn.
There are many published sources for case studies using geostatistics, although the original data are usually not given for confidentiality reasons (original data for both of the examples just introduced are available in Sections 2.2 and 4.3, respectively). The books by David (1977, 1988), Journel and Huijbregts (1978), and Clark (1979) are all devoted to applications in the mining industry. Applications of geostatistics in other areas abound; for example, in rainfall precipitation (e.g., Ord and Rees, 1979), atmospheric science (e.g., Thiebaux and Pedder, 1987), soil mapping (e.g., Burgess and Webster, 1980), and predicting groundwater contaminant concentrations (e.g., Istok and Cooper, 1988). For further applications, see the various sections of Chapter 4 that contain analyses of data sets from hydrology, soil science, public health, uniformity trials, and acid rain.
Geostatistical-type problems are distinguished most clearly from lattice- and point-pattern-type problems by the ability of the spatial index $s$ to vary continuously over a subset of $\mathbb{R}^d$. That is not to say that one class of problems cannot borrow methods usually associated with another class; see, for example, Sections 4.4 and 7.6, where the same public-health data set is analyzed using two approaches.
A lattice of locations evokes an idea of regularly spaced points in $\mathbb{R}^d$, linked to nearest neighbors, second-nearest neighbors, and so on. In this book these will be referred to as regular lattices, allowing for the possibility of irregular lattices, whose relative displacements do not follow a predictable pattern and whose linkages are not always obvious from their geometry. Of all the possible spatial structures that (1.2.1) can generate, a data set whose spatial locations are a regular lattice in $\mathbb{R}^d$ is the closest analogue to a time series observed at equally spaced time points.
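On a regular lattice in the plane the neighbor linkages follow directly from the geometry. The sketch below is one possible convention, not a definition taken from the text: it treats sites sharing an edge as nearest neighbors and diagonal sites as second-nearest neighbors, and the function name `lattice_neighbors` is hypothetical.

```python
def lattice_neighbors(i, j, n_rows, n_cols, order=1):
    """Neighbours of site (i, j) on a regular n_rows x n_cols lattice.

    order=1 gives nearest neighbours (shared edge);
    order=2 gives second-nearest neighbours (diagonal).
    Sites on the boundary simply have fewer neighbours.
    """
    if order == 1:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif order == 2:
        offsets = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    else:
        raise ValueError("order must be 1 or 2")
    return [
        (i + di, j + dj)
        for di, dj in offsets
        if 0 <= i + di < n_rows and 0 <= j + dj < n_cols
    ]

print(lattice_neighbors(0, 0, 4, 4))   # corner site: [(1, 0), (0, 1)]
```

For an irregular lattice no such rule is available, and the neighbor lists would have to be supplied explicitly, for example as a dictionary keyed by site.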
Remote sensing from satellites offers an efficient means of data gathering. For example, it allows information to be gathered rapidly on weather patterns, mineral distribution, and crop acreage without having to conduct lengthy and labor-intensive traditional surveys. Satellites orbit the earth and receive data in the form of electromagnetic reflectance waves at a number of frequencies, including those in the visible part of the spectrum. By various sampling and integration methods (see Section 7.4 for more details), the earth’s surface is divided into small rectangles (e.g., 56 m × 56 m) called pixels (short for picture elements). An agricultural scene of interest (around, say, 34,000 km$^2$) has certain proportions devoted to wheat, corn, soybeans, and so forth that need to be estimated. These various crops have their own reflectance properties that, together with noise, are remotely sensed. Thus the data are received as a regular lattice in $\mathbb{R}^2$ (neglecting the earth’s curvature) and are identified with the centers of their respective pixels. (This is analogous to temporal problems where economic time-series data, such as yearly export earnings, are aggregated over the whole year but are identified with single, equally spaced points on the time axis.) There is a large overlap between remote-sensing techniques and (low-level) medical-imaging techniques; although the spatial scales are vastly different, the form of the data and the questions being asked are often similar. Statistical models for such data need to express the fact that observations nearby (in time or space) tend to be alike; see Section 7.4.
Sometimes it might be expected that nearby observations tend to be dissimilar. Competition between plants for light and soil nutrients could lead to large healthy plants being surrounded by less sturdy ones. Mead (1967) analyzes data on cabbages and investigates this competition effect.
In contrast to geostatistical problems, data from lattice problems may be exhaustive of the phenomenon. Of course, sampling may also occur; for example, suppose only a small window of the full data set is available.
There have been various published studies based on lattice data (although such studies have not always exploited the spatial component) that include Mercer and Hall (1911) (see also Sections 4.5 and 7.1), Batchelor and Reed (1918), Cochran (1936), Bartlett (1974), Cliff and Ord (1981), Ripley (1981), Symons et al. (1983) (see also Sections 4.4, 6.2, and 7.6), and Cressie and Chan (1989).
Point patterns arise when the important variable to be analyzed is the location of “events.” Most often the first question to be answered is whether the pattern is exhibiting complete spatial randomness, clustering, or regularity. For example, consider the locations of longleaf pines in a natural forest in southern Georgia (given in Section 8.2). What is the biological significance of clustering of these trees? The variable, diameter at breast height, is also recorded along with the tree’s location. Do large (small) trees cluster and how do large and small trees interact? The size variable is usually called a mark variable, and the whole process is then called a marked spatial point process.
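A quick first look at the randomness/clustering/regularity question can be had from nearest-neighbor distances. The sketch below uses the Clark–Evans ratio, which is a technique brought in here for illustration and not one the text has introduced: the mean nearest-neighbor distance is divided by $1/(2\sqrt{\lambda})$, its expected value under complete spatial randomness with intensity $\lambda$. The three patterns are simulated, not the longleaf-pine data, and no edge correction is attempted.

```python
import numpy as np

def clark_evans_ratio(points, area=1.0):
    """Mean nearest-neighbour distance divided by its expectation under CSR.

    No edge correction is applied, so values are biased slightly upward.
    """
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)            # ignore each point's distance to itself
    mean_nn = dist.min(axis=1).mean()
    expected = 0.5 / np.sqrt(n / area)        # 1 / (2 * sqrt(intensity)) under CSR
    return mean_nn / expected

rng = np.random.default_rng(2)

# Completely random: 400 independent uniform points on the unit square.
csr = rng.uniform(0.0, 1.0, size=(400, 2))

# Clustered: 400 offspring scattered tightly around 20 parent locations.
parents = rng.uniform(0.1, 0.9, size=(20, 2))
clustered = parents[rng.integers(0, 20, size=400)] + rng.normal(0.0, 0.01, size=(400, 2))

# Regular: a 20 x 20 grid of points.
g = (np.arange(20) + 0.5) / 20
regular = np.column_stack([a.ravel() for a in np.meshgrid(g, g)])

r_csr = clark_evans_ratio(csr)
r_clustered = clark_evans_ratio(clustered)
r_regular = clark_evans_ratio(regular)
```

Ratios near 1 are compatible with complete spatial randomness, ratios well below 1 point to clustering, and ratios well above 1 point to regularity. A single summary of this kind ignores the marks entirely, so it says nothing about how large and small trees interact.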
But the mark variable does not have to be a real variable; it could, for example, be a set. The process then consists of $\{Z(s_i) : s_i \in D\}$, $D$ a spatial point process and $Z(s_i)$ a random set (Section 9.3) located at $s_i \in D$. Data from this process are often observed as a realization of
$$\bigcup \{Z(s_i) : s_i \in D\}, \tag{1.2.2}$$
which is referred to as a (generalized) Boolean model (Serra, 1980) or a mosaic process (Diggle, 1981b). The data are often images, and their assimilation to a model like (1.2.2) requires higher levels of image analysis than the methods considered in Section 7.4 of Part II. The goal is to estimate parameters of the random set and the point process; see Cressie and Laslett (1987) and Section 9.5 where an artificially generated Boolean model of random parallelograms is analyzed. Diggle (1981b) analyzes the incidence of heather according to a Boolean model; see also Section 9.5. A model for tumor growth that iterates the Boolean model through successive stages is developed in Section 9.7. Pictures of cell islands growing in vitro are presented and analyzed in a manner that takes shape as well as size into account.
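A simple instance of a Boolean model can be simulated directly. The sketch below makes choices that are assumptions for illustration only: the points form a homogeneous Poisson process, every random set is a disc of the same fixed radius (the text's own example uses random parallelograms), and the function name `boolean_model_coverage` is hypothetical. For this special case the probability that a given location is covered is known to be $1 - \exp(-\lambda \pi r^2)$, which gives a check on the simulation.

```python
import numpy as np

def boolean_model_coverage(intensity, radius, rng, grid_size=50):
    """Fraction of the unit square covered by one realization of a Boolean model of discs."""
    # Points: homogeneous Poisson process on an enlarged window, so that discs
    # centred just outside the unit square can still cover part of it.
    lo, hi = -radius, 1.0 + radius
    n_points = rng.poisson(intensity * (hi - lo) ** 2)
    if n_points == 0:
        return 0.0
    centres = rng.uniform(lo, hi, size=(n_points, 2))
    # Test locations: centres of a grid_size x grid_size grid of cells.
    g = (np.arange(grid_size) + 0.5) / grid_size
    gx, gy = np.meshgrid(g, g)
    test_locations = np.column_stack([gx.ravel(), gy.ravel()])
    d2 = ((test_locations[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    covered = (d2 <= radius ** 2).any(axis=1)   # a location is covered by the union of discs
    return covered.mean()

rng = np.random.default_rng(3)
intensity, radius = 50.0, 0.08                  # illustrative parameter values
estimates = [boolean_model_coverage(intensity, radius, rng) for _ in range(50)]
theory = 1.0 - np.exp(-intensity * np.pi * radius ** 2)
print(round(float(np.mean(estimates)), 3), round(float(theory), 3))
```

Only the union is observed in practice, so overlapping discs hide one another; that is what makes estimating the parameters of the point process and of the random set harder than this forward simulation suggests.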
Published studies involving analyses of spatial point patterns can be found inter alia in Pielou (1959, 1977), Matern (1960), Getis and Boots (1978), Marquiss et al. (1978), Ripley (1981), Diggle (1983), and Upton and Fingleton (1985). Of course, there is a large overlap of methods for patterns occurring in space and for those occurring in time; the reader interested in temporal methods is referred to Cox and Lewis (1966), the articles in Lewis (1972) (in particular the review by Daley and Vere-Jones, 1972), Cox and Isham (1980), and Daley and Vere-Jones (1988). A second-moment measure of spatial point patterns is adapted for the temporal animal-behavior data in Section 8.4.
Some simple spatial models will be given to show the effect of correlation on estimation, prediction, and design. The models allow closed-form expressions to be calculated, from which discussion of more general issues can be initiated.
Consider the following simple statistical model, taught in all beginning Statistics service courses. Suppose $Z(1), \ldots, Z(n)$ are independent and identically distributed (i.i.d.) from a Gaussian (i.e., normal) distribution with unknown mean $\mu$ and known variance $\sigma_0^2$. The minimum-variance unbiased estimator of $\mu$ is
$$\bar{Z} \equiv \sum_{i=1}^{n} Z(i)/n, \tag{1.3.1}$$
and inference on $\mu$ is straightforward: The estimator is Gaussian with mean $\mu$ and variance $\sigma_0^2/n$. Thus, a two-sided 95% confidence interval for $\mu$ is
$$\left(\bar{Z} - 1.96\,\sigma_0/\sqrt{n},\ \bar{Z} + 1.96\,\sigma_0/\sqrt{n}\right). \tag{1.3.2}$$
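As a small worked illustration of the i.i.d. case, the interval can be computed as the sample mean plus or minus $1.96\,\sigma_0/\sqrt{n}$, which is the usual form when the variance is known. The observations and the value of $\sigma_0$ below are hypothetical, as is the function name `mean_confidence_interval`.

```python
import math

def mean_confidence_interval(z, sigma0, z_crit=1.96):
    """Two-sided interval for the mean of i.i.d. Gaussian data with known sd sigma0."""
    n = len(z)
    z_bar = sum(z) / n
    half_width = z_crit * sigma0 / math.sqrt(n)
    return z_bar - half_width, z_bar + half_width

# Hypothetical observations, not taken from the text.
z = [9.8, 10.4, 10.1, 9.6, 10.3, 9.9, 10.2, 10.0, 9.7, 10.0]
lower, upper = mean_confidence_interval(z, sigma0=0.3)
print(round(lower, 3), round(upper, 3))
```

The half-width depends on the data only through $n$, which is exactly why the interval becomes unreliable once the independence assumption fails.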
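Instead of independent data, now suppose the data are positively correlated with a correlation that decreases as the separation between data increases:
$$\mathrm{cov}(Z(i), Z(j)) = \sigma_0^2 \, \rho^{|i-j|}, \quad i, j = 1, \ldots, n, \quad 0 < \rho < 1. \tag{1.3.3}$$
Now,
$$\mathrm{var}(\bar{Z}) = n^{-2} \sum_{i=1}^{n} \sum_{j=1}^{n} \mathrm{cov}(Z(i), Z(j)) = \{\sigma_0^2/n\}\left[1 + 2\{\rho/(1-\rho)\}\{1 - (1/n)\} - 2\{\rho/(1-\rho)\}^2 (1 - \rho^{n-1})/n\right]. \tag{1.3.4}$$
Some intuitive understanding of the effect of spatial correlation can be obtained from (1.3.4). Write it as
$$\mathrm{var}(\bar{Z}) = \sigma_0^2/n', \tag{1.3.5}$$
where
$$n' \equiv n \Big/ \left[1 + 2\{\rho/(1-\rho)\}\{1 - (1/n)\} - 2\{\rho/(1-\rho)\}^2 (1 - \rho^{n-1})/n\right]. \tag{1.3.6}$$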
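The effect of positive correlation on the variance of the sample mean can be checked numerically. The sketch below assumes, for illustration, the covariance $\sigma_0^2 \rho^{|i-j|}$ between the $i$th and $j$th observations, with $0 < \rho < 1$; it computes the variance of the mean by brute force from the covariance matrix and compares it with a closed-form inflation factor. The function names and the values $n = 10$, $\rho = 0.26$ are choices made here, not quantities quoted from the surrounding text.

```python
import numpy as np

def var_of_mean(n, rho, sigma0_sq=1.0):
    """Variance of the sample mean: sum of all covariance entries divided by n^2."""
    idx = np.arange(n)
    cov = sigma0_sq * rho ** np.abs(idx[:, None] - idx[None, :])
    return cov.sum() / n ** 2

def inflation_factor(n, rho):
    """Closed form of n * var(mean) / sigma0^2 under the assumed covariance."""
    r = rho / (1.0 - rho)
    return 1.0 + 2.0 * r * (1.0 - 1.0 / n) - 2.0 * r ** 2 * (1.0 - rho ** (n - 1)) / n

n, rho = 10, 0.26                 # illustrative values
factor = inflation_factor(n, rho)
n_equiv = n / factor              # equivalent number of independent observations
print(round(factor, 3), round(n_equiv, 2))
```

With these values the variance of the mean is roughly 1.6 times what independence would give, so ten correlated observations carry about as much information on the mean as six independent ones, and an interval built under independence is too narrow by a factor of about $\sqrt{1.6} \approx 1.27$.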
Spatial models, more complicated than (1.3.3), show the same general behavior. Haining (1988) considers constant-mean Gaussian models in $\mathbb{R}^2$ that are conditionally specified autoregressions (Section 6.6), and simultaneously specified autoregressions and moving averages (Section 6.7.1), each with an unknown variance parameter $\sigma^2$. He compares the variance of $\bar{Z}$ assuming independence with the variance of $\bar{Z}$ assuming positive dependence; he also compares the latter with the variance of the maximum likelihood estimator of the constant mean $\mu$. In general, classical inference based on and is misleading; for positive spatial dependence, and are typically much larger than .
Grenander (1954) demonstrates that, for a class of time-series models that includes (1.3.3), and the maximum likelihood estimator
