Statistics for Earth and Environmental Scientists - John H. Schuenemeyer - E-Book

John H. Schuenemeyer

Description

A comprehensive treatment of statistical applications for solving real-world environmental problems

A host of complex problems face today's earth science community, such as evaluating the supply of remaining non-renewable energy resources, assessing the impact of people on the environment, understanding climate change, and managing the use of water. Proper collection and analysis of data using statistical techniques contributes significantly toward the solution of these problems. Statistics for Earth and Environmental Scientists presents important statistical concepts through data analytic tools and shows readers how to apply them to real-world problems.

The authors present several different statistical approaches to the environmental sciences, including Bayesian and nonparametric methodologies. The book begins with an introduction to types of data, evaluation of data, modeling and estimation, random variation, and sampling—all of which are explored through case studies that use real data from earth science applications. Subsequent chapters focus on principles of modeling and the key methods and techniques for analyzing scientific data, including:

  • Interval estimation and hypothesis testing of means

  • Methods for analyzing time series data

  • Spatial statistics

  • Multivariate analysis

  • Discrete distributions

  • Experimental design

Most statistical models are introduced by concept and application, given as equations, and then accompanied by heuristic justification rather than a formal proof. Data analysis, model building, and statistical inference are stressed throughout, and readers are encouraged to collect their own data to incorporate into the exercises at the end of each chapter. Most data sets, graphs, and analyses are computed using R, but can be worked with using any statistical computing software. A related website features additional data sets, answers to selected exercises, and R code for the book's examples.

Statistics for Earth and Environmental Scientists is an excellent book for courses on quantitative methods in geology, geography, natural resources, and environmental sciences at the upper-undergraduate and graduate levels. It is also a valuable reference for earth scientists, geologists, hydrologists, and environmental statisticians who collect and analyze data in their everyday work.


Pages: 620

Publication year: 2011




Contents

Cover

Title Page

Copyright

Preface

Acknowledgments

Chapter 1: Role of Statistics and Data Analysis

1.1 Introduction

1.2 Case Studies

1.3 Data

1.4 Samples Versus the Population: Some Notation

1.5 Vector and Matrix Notation

1.6 Frequency Distributions and Histograms

1.7 Distribution as a Model

1.8 Sample Moments

1.9 Normal (Gaussian) Distribution

1.10 Exploratory Data Analysis

1.11 Estimation

1.12 Bias

1.13 Causes of Variance

1.14 About Data

1.15 Reasons to Conduct Statistically Based Studies

1.16 Data Mining

1.17 Modeling

1.18 Transformations

1.19 Statistical Concepts

1.20 Statistics Paradigms

1.21 Summary

Chapter 2: Modeling Concepts

2.1 Introduction

2.2 Why Construct a Model?

2.3 What Does a Statistical Model Do?

2.4 Steps in Modeling

2.5 Is a Model a Unique Solution to a Problem?

2.6 Model Assumptions

2.7 Designed Experiments

2.8 Replication

2.9 Summary

Chapter 3: Estimation and Hypothesis Testing on Means and Other Statistics

3.1 Introduction

3.2 Independence of Observations

3.3 Central Limit Theorem

3.4 Sampling Distributions

3.5 Confidence Interval Estimate on a Mean

3.6 Confidence Interval on the Difference Between Means

3.7 Hypothesis Testing on Means

3.8 Bayesian Hypothesis Testing

3.9 Nonparametric Hypothesis Testing

3.10 Bootstrap Hypothesis Testing on Means

3.11 Testing Multiple Means via Analysis of Variance

3.12 Multiple Comparisons of Means

3.13 Nonparametric ANOVA

3.14 Paired Data

3.15 Kolmogorov–Smirnov Goodness-of-Fit Test

3.16 Comments on Hypothesis Testing

3.17 Summary

Chapter 4: Regression

4.1 Introduction

4.2 Pittsburgh Coal Quality Case Study

4.3 Correlation and Covariance

4.4 Simple Linear Regression

4.5 Multiple Regression

4.6 Other Regression Procedures

4.7 Nonlinear Models

4.8 Summary

Chapter 5: Time Series

5.1 Introduction

5.2 Time Domain

5.3 Frequency Domain

5.4 Wavelets

5.5 Summary

Chapter 6: Spatial Statistics

6.1 Introduction

6.2 Data

6.3 Three-Dimensional Data Visualization

6.4 Spatial Association

6.5 Effect of Trend

6.6 Semivariogram Models

6.7 Kriging

6.8 Space–Time Models

6.9 Summary

Chapter 7: Multivariate Analysis

7.1 Introduction

7.2 Multivariate Graphics

7.3 Principal Components Analysis

7.4 Factor Analysis

7.5 Cluster Analysis

7.6 Multidimensional Scaling

7.7 Discriminant Analysis

7.8 Tree-Based Modeling

7.9 Summary

Chapter 8: Discrete Data Analysis and Point Processes

8.1 Introduction

8.2 Discrete Process and Distributions

8.3 Point Processes

8.4 Lattice Data and Models

8.5 Proportions

8.6 Contingency Tables

8.7 Generalized Linear Models

8.8 Summary

Chapter 9: Design of Experiments

9.1 Introduction

9.2 Sampling Designs

9.3 Design of Experiments

9.4 Comments on Field Studies and Design

9.5 Missing Data

9.6 Summary

Chapter 10: Directional Data

10.1 Introduction

10.2 Circular Data

10.3 Spherical Data

10.4 Summary

References

Index

Copyright © 2011 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.

Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Schuenemeyer, J. H.

Statistics for earth and environmental scientists / John H. Schuenemeyer, Lawrence J. Drew.

p. cm.

Includes index.

ISBN 978-0-470-58469-9 (cloth)

1. Geology–Statistical methods. 2. Environmental sciences–Statistical methods. I. Drew, Lawrence J. II. Title.

QE33.2.S82S38 2010

550.72'7–dc22 2010006819

Preface

This book is intended for students and practitioners of the earth and environmental sciences who want to use statistical tools to solve real problems. It provides a range of tools that are used across earth science disciplines. Statistical methods need to be understood because today's interesting problems are complex and involve uncertainty. These complex problems include energy resources, climate change, and geologic hazards. Through the use of statistical tools, an understanding of process can be obtained and proper inferences made. In addition, through design of field trials and experiments, these inferences can be made efficiently.

We stress data analysis, modeling, model evaluation, and an understanding of concepts through the use of real data from many earth science disciplines. We also encourage the reader to supplement exercises with data from his or her discipline. The reader, especially the student, is encouraged to collect his or her own data. This may be as simple as the recording of temperature and precipitation or the travel time to work or school. The downside to using real data is that the resulting analysis may not always be as clean as when artificial data are used. In the real world, however, important structure often is not readily apparent. The goal of this book is to engage you, the reader, in the application of statistics to assist in the solution of important problems. We use statistics to explore, model, and forecast.

Statistics is a blend of science and art. Statistics cannot be learned or practiced by rote application of a method. Every problem is different and requires careful examination. The reader needs to gain an understanding of when and why methods work. Sometimes, different methods perform equally well, and at times none of the standard methods are suitable and a new method must be developed. Most often, model assumptions do not hold exactly. A challenge is to determine when they are “close enough.” Simulation is a useful tool to evaluate assumptions.

Most of the statistical models in this book are introduced by concept and application, given as equations, and then supported by heuristic justification rather than a formal proof. Some of the mathematics, especially in the chapters on spatial statistics (Chapter 6) and multivariate analysis (Chapter 7), may be challenging and can be omitted without loss of basic understanding. Readers with the necessary background will benefit from having this material available.

The use of graphs to illustrate concepts, to identify unusual observations, and to assist in model evaluation is strongly encouraged. Graphs combined with statistics lead to more informative results than either taken separately.

There are a variety of paradigms in statistics. We introduce models using the frequentist approach; however, we also discuss Bayesian, nonparametric, and computer-intensive methods. There is no single approach that works best in all circumstances, and we tend to be pragmatic and use whatever method seems appropriate to solve a given problem.

It is assumed that the reader has had at least a one-semester undergraduate course in statistics or equivalent experience and is familiar with basic probability and statistical distributions, including the normal, binomial, and uniform. However, these concepts, with the exception of basic probability, are covered in the first four chapters. Further, we have assumed a general ability to recognize basic matrix computations. The book may be used for a one-semester course for students who have a minimal background in statistics. A more advanced reader or student may begin with concepts from multiple regression, time series, spatial statistics, multivariate analysis, discrete data analysis, and design. During many years of university teaching, presenting workshops, and working with practitioners, we have discovered that the mathematical and statistical background of earth scientists is diverse. At the expense of an occasional uneven level of technical presentation, we have attempted to provide information that will be useful to students and practitioners of varied backgrounds.

The Web site for this book is www.EarthStatBook.com. Appendixes I through V can be downloaded from this Web site. This site also contains other selected data sets, answers to some exercises, R-code for selected exercises and examples, a blog, and an errata page.

Some of the exercises we present are conceptual. Many require the use of a computer. Our expectation is that students will develop insight in solving problems using statistics rather than apply methods and computer programs by rote. We expect that the reader has access to and is familiar with a standard statistical computing package. Most standard statistical packages will do all of the computations required of students to complete the assignments. A major exception may be spatial statistics. Spatial statistical modeling and analysis and most other computations have been done in R, an open-source statistics and graphics language.

Acknowledgments

We appreciate discussions with many earth scientists. Some have shared their data, and credit is given where used. We especially acknowledge the help of Anne Schuenemeyer, BSN, RN. Without her invaluable assistance, this book would not have come to fruition.

John H. Schuenemeyer

Lawrence J. Drew

Chapter 1

Role of Statistics and Data Analysis

1.1 Introduction

The purpose of this chapter is to provide an overview of important concepts in data analysis and statistics. Types of data, data evaluation, and an introduction to modeling and estimation are presented. Random variation, sampling, and different statistical paradigms are also introduced. These concepts are investigated in detail in subsequent chapters. An important distinguishing feature in many earth and environmental science analyses is the need for spatial sampling. Problems are described in the context of case studies, which use real data from earth science applications.

1.2 Case Studies

Wherever possible, case studies are used to illustrate methods. Two studies that are used extensively in this and subsequent chapters are water-well yield data and observations from an ice core.

1.2.1 Water-Well Yield Case Study

A concern in many parts of the world is the availability of an adequate supply of fresh water. Planners and managers want to know how much water is available. Scientists want to gain a greater understanding of transport systems and the relationship of water to other geologic phenomena. Homeowners who do not have access to municipal water want to know where to drill for water on their property. A subset of 754 water-well yield observations (water-well yield case study, Appendix I; see the book's Web site) from the Blue Ridge Geological Province, Loudoun County, Virginia (Sutphin et al., 2001) is used to illustrate graphical procedures. The variables are water-well yield in gallons per minute (gpm) for rock type Yg (Yg is a Middle Proterozoic Leucocratic Metagranite) and corresponding coordinates called easting (x-axis) and northing (y-axis). In Chapter 6 spatial applications are discussed.

1.2.2 Ice Core Case Study

Ice core data help scientists understand how Earth's climate works. The U.S. Geological Survey National Ice Core Laboratory (2004) states that “Over the past decade, research on the climate record frozen in ice cores from the Polar Regions has changed our basic understanding of how the climate system works. Changes in temperature and precipitation, which previously we believed, would require many thousands of years to happen were revealed, through the study of ice cores, to have happened in fewer than twenty years. These discoveries have challenged our beliefs about how the climate system works.”

A record that can extend back many thousands of years may include temperature, precipitation, and chemical composition. An example of ice core data (ice core case study, Appendix II; see the book's Web site) submitted to the National Geophysical Data Center (2004) by Arkhipov et al. (1987) has been chosen. The data submitted by Arkhipov et al. are from 1987 at the Austfonna Ice Cap of the Svalbard Archipelago and extend to a depth of 566 m. Melting of ice masses is thought to be contributing to sea-level rise. Only data in the first 50 m are presented. In addition to depth, the variables are pH, HCO3 (hydrogen carbonate), and Cl (chlorine), all in milligrams per liter of water.

1.3 Data

Sir Arthur Conan Doyle, physician and writer (1859–1930), noted: “It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.” Data are fundamental to statistics. Most data are obtained from measurements. Increasingly, these measurements are obtained from automated processes such as ground weather stations and satellites. However, field studies are still an important way to collect data. Another important source of data is expert judgment. In areas where few hard data (measurements) are available, such as in the Arctic, experts are called upon to express their opinions.

Data may be rock type, wind speed, orientation of a fault, temperature, and a host of other variables. There are several ways to classify data. Two of the most useful classifications are continuous versus discrete and ratio–interval–ordinal–nominal (Table 1.1). A continuous process generates continuous data. Discrete data typically result from counting. Continuous data can be ratio or interval. Discrete data are nominal data. Data classification systems help to select appropriate data analytic techniques and models.

Table 1.1 Data Classification Systems

Continuous vs. Discrete Data

  • Continuous: measurements can be made as fine as needed. Examples: temperature, depth, sulfur content, well water yield.

  • Discrete: data that can be categorized into a classification where only a finite number of values are possible, typically count data. Examples: number of days above freezing, number of water wells producing among a sample of 50 holes.

Ratio, Interval, Ordinal, and Nominal Data

  • Ratio: continuous data where an interval and ratio are meaningful. Examples: depth, sulfur content.

  • Interval: continuous data with no natural zero. Example: temperature measured in degrees Celsius.

  • Ordinal: data that are rank ordered. Examples: survey responses such as good, fair, poor; water yields as high, medium, low.

  • Nominal: data that fit into categories; cannot be rank ordered. Examples: location name, rock type.
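The practical difference between ordinal and nominal data in Table 1.1 can be illustrated with a short sketch. It is written in Python rather than the R used in the book, and the observations, the rock-type names, and the numeric ranks assigned to the yield classes are all made up for illustration. The point is only that rank-ordered categories support sorting and a median, whereas unordered categories support counts and a mode.

```python
from collections import Counter

# Ordinal data: categories carry a rank order (water yield classes, Table 1.1).
# The numeric ranks are an illustrative encoding, not part of the source data.
YIELD_RANK = {"low": 0, "medium": 1, "high": 2}

# Hypothetical observations, invented for this sketch.
yields = ["medium", "low", "high", "low", "medium", "medium", "high"]

# Sorting by rank is meaningful for ordinal data, and so is the median class.
sorted_yields = sorted(yields, key=YIELD_RANK.get)
median_yield = sorted_yields[len(sorted_yields) // 2]  # middle item, odd sample size

# Nominal data: categories have no order, so only counts and the mode apply.
# The rock-type names are hypothetical.
rock_types = ["granite", "schist", "granite", "gneiss", "granite"]
rock_counts = Counter(rock_types)
modal_rock = rock_counts.most_common(1)[0][0]

print("Yields in rank order:", sorted_yields)
print("Median yield class:", median_yield)
print("Most frequent rock type:", modal_rock)
```

Sorting the rock types alphabetically would run without error, but the resulting order would carry no geological meaning, which is why only the count and the mode are computed for them here.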

To distinguish between ratio and interval data, consider the following example. With a ratio scale, zero means an absence of something, such as rainfall. With an interval scale, zero is arbitrary, such as zero degrees Celsius, which is not an absence of temperature and has a different meaning than zero degrees Fahrenheit. The terms quantitative and qualitative are also used. Sometimes qualitative data are considered synonymous with nominal data, and sometimes the term just refers to something subjective or not precisely defined. Categorical data are data classified into categories. The terms categorical and nominal are sometimes used interchangeably.
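A small numerical check can make the ratio versus interval distinction concrete. The sketch below is in Python rather than the book's R, and the depth and temperature values are invented for illustration. It shows that the ratio of two depths is unchanged by a change of units, the ratio of two Celsius temperatures is not, and the ratio of two temperature differences is.

```python
def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9.0 / 5.0 + 32.0


def meters_to_feet(m):
    """Convert meters to feet (pure rescaling, zero stays zero)."""
    return m * 3.28084


# Ratio scale: depth has a natural zero, so ratios survive a change of units.
depth_a_m, depth_b_m = 10.0, 20.0  # illustrative values
depth_ratio_m = depth_b_m / depth_a_m
depth_ratio_ft = meters_to_feet(depth_b_m) / meters_to_feet(depth_a_m)

# Interval scale: the Celsius zero is arbitrary, so ratios depend on the unit.
temp_a_c, temp_b_c, temp_c_c = 10.0, 20.0, 30.0  # illustrative values
temp_ratio_c = temp_b_c / temp_a_c
temp_ratio_f = celsius_to_fahrenheit(temp_b_c) / celsius_to_fahrenheit(temp_a_c)

# Differences (intervals) are still comparable on an interval scale.
diff_ratio_c = (temp_c_c - temp_b_c) / (temp_b_c - temp_a_c)
diff_ratio_f = (
    celsius_to_fahrenheit(temp_c_c) - celsius_to_fahrenheit(temp_b_c)
) / (celsius_to_fahrenheit(temp_b_c) - celsius_to_fahrenheit(temp_a_c))

print("Depth ratio (m, ft):", depth_ratio_m, depth_ratio_ft)
print("Temperature ratio (C, F):", temp_ratio_c, temp_ratio_f)
print("Ratio of temperature differences (C, F):", diff_ratio_c, diff_ratio_f)
```

In this sketch 20 °C is "twice" 10 °C only in Celsius; in Fahrenheit the same pair gives 68/50, so a statement such as "twice as warm" is not meaningful on an interval scale.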

Another way to view data is as primary or secondary. Primary data are collected to answer questions related to a particular study, such as sampling a site to ascertain the level of coal bed methane seepage. Secondary data are collected for some other purpose and may be used as supportive data. Typically, secondary data are [...]. Numerous government agencies routinely collect and publish both types of data on the earth sciences.
