Data Analysis for the Geosciences

Michael W. Liemohn
Description

An initial course in scientific data analysis and hypothesis testing designed for students in all science, technology, engineering, and mathematics disciplines.

Data Analysis for the Geosciences: Essentials of Uncertainty, Comparison, and Visualization is a textbook for upper-level undergraduate STEM students, designed to be their statistics course in a degree program. This volume provides a comprehensive introduction to data analysis, visualization, and data-model comparisons and metrics, within the framework of the uncertainty around the values. It offers a learning experience based on real data from the Earth, ocean, atmospheric, space, and planetary sciences.

About this volume:

* Serves as an initial course in scientific data analysis and hypothesis testing
* Focuses on the methods of data processing
* Introduces a wide range of analysis techniques
* Describes the many ways to compare data with models
* Centers on applications rather than derivations
* Explains how to select appropriate statistics for meaningful decisions
* Explores the importance of the concept of uncertainty
* Uses examples from real geoscience observations
* Includes homework problems at the end of chapters

The American Geophysical Union promotes discovery in Earth and space science for the benefit of humanity. Its publications disseminate scientific knowledge and provide resources for researchers, students, and professionals.

Page count: 1108

Publication year: 2023



Advanced Textbook Series

Unconventional Hydrocarbon Resources: Techniques for Reservoir Engineering Analysis

Reza Barati and Mustafa M. Alhubail

Geomorphology and Natural Hazards: Understanding Landscape Change for Disaster Mitigation

Tim R. Davies, Oliver Korup, and John J. Clague

Remote Sensing Physics: An Introduction to Observing Earth from Space

Rick Chapman and Richard Gasparovic

Geology and Mineralogy of Gemstones

David Turner and Lee A. Groat

Data Analysis for the Geosciences: Essentials of Uncertainty, Comparison, and Visualization

Michael W. Liemohn

Advanced Textbook 5

Data Analysis for the Geosciences

Essentials of Uncertainty, Comparison, and Visualization

Michael W. Liemohn

University of Michigan, USA

This work is a co‐publication of the American Geophysical Union and John Wiley and Sons, Inc.

This edition first published 2024
© 2024 American Geophysical Union

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

Published under the aegis of the AGU Publications Committee

Matthew Giampoala, Vice President, Publications
Carol Frost, Chair, Publications Committee
For details about the American Geophysical Union, visit us at www.agu.org.

The right of Michael W. Liemohn to be identified as the author of this work has been asserted in accordance with law.

Wiley Global Headquarters
111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products, visit us at www.wiley.com.

Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials, or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging-in-Publication Data applied for:
9781119747871 (Paperback); 9781119747888 (Adobe PDF); 9781119747895 (ePub)

Cover Design: Wiley
Cover Image: Aurora borealis, Iceland © Cavan Images/Getty Images

To Ginger, for inspiring me to take on a project like writing a book

Preface

A critical element of a robust undergraduate education in science, technology, engineering, and mathematics (STEM) disciplines is an understanding of sets of numbers and how to process, plot, and compare them. This concept is usually first taught in introductory laboratory courses and reinforced in the advanced lab classes. These labs, however, are usually focused on the scientific concept being explored by the experiment as well as the methodologies of setting up the equipment and making the measurements. These classes often give only a brief glimpse into the many techniques for analyzing the resulting number sets, and devote even less attention to how one should visualize and compare those number sets.

STEM students need to gain an appreciation of the uncertainties surrounding observations and model results. This uncertainty strongly governs the interpretation of the values and especially the comparison of several values. A related concept is uncertainty propagation, keeping track of the uncertainties as the values are processed (e.g., used as a value in an equation to yield a new number). Without a grasp on the uncertainty of a given number, its comparison with other numbers is meaningless.
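As a small preview of this idea (the book develops it fully in Chapter 3), the standard first-order approach combines each input's contribution, its partial derivative times its uncertainty, in quadrature. The sketch below is illustrative only: the `propagate` helper and the density example are my own, not taken from the book, and the quadrature sum assumes the input uncertainties are independent.

```python
import math

def propagate(partials_and_uncertainties):
    """First-order uncertainty propagation: combine each input's
    contribution (partial derivative times its uncertainty) in quadrature."""
    return math.sqrt(sum((dfdx * u) ** 2 for dfdx, u in partials_and_uncertainties))

# Example: density rho = m / V with m = 25.0 +/- 0.5 g and V = 10.0 +/- 0.2 cm^3
m, u_m = 25.0, 0.5
V, u_V = 10.0, 0.2
rho = m / V
u_rho = propagate([(1 / V, u_m),       # d(rho)/dm = 1/V
                   (-m / V**2, u_V)])  # d(rho)/dV = -m/V^2
print(f"rho = {rho:.2f} +/- {u_rho:.2f} g/cm^3")  # rho = 2.50 +/- 0.07 g/cm^3
```

The propagated uncertainty here is small because both relative uncertainties are small; the same machinery tracks how uncertainty grows through longer chains of processing.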

Some STEM departments require their students to take a statistics course as part of the undergraduate degree program. While this is a highly worthwhile and useful topic for such students to understand, it typically does not cover the full range of issues concerning data analysis, visualization, and comparative metrics techniques that a practicing scientist or engineer should know. Applied statistics courses for STEM majors often cover basic statistical processing for a single data set (calculating a mean and standard deviation, for instance) and for two data sets (calculating a linear regression fit and correlation coefficient, for instance). Such courses usually cover the basics of comparing those two data sets, including t tests, chi-squared tests, and F-statistic tests. They usually do not cover much in the way of data visualization, and almost never cover comparison metrics beyond the correlation coefficient.

Using numerous examples from the Earth, ocean, atmospheric, space, and planetary sciences, this volume presents a comprehensive introduction to data analysis, visualization, and data-model comparisons and metrics, within the framework of the uncertainty around the values. Currently, I teach an upper-level undergraduate course, “Data Analysis and Visualization for Geoscientists,” at the University of Michigan, for which this textbook is written. The volume can be used as the text for a data analysis course, as a supplement for an advanced laboratory series, or as a reference resource for everyone from upper-level undergraduate students to experienced researchers in STEM fields.

While data‐model comparisons have always been an essential component of scientific research, it is often a topic not rigorously introduced at the undergraduate level. This is no longer acceptable, especially with the advent of machine learning as a fast‐growing field of analysis. A fundamental trait of machine learning is the optimization of the computer‐developed model, fitting its result to the training data set. Undergraduate science students gain experience as data analysts, but for some reason, data‐model comparisons are barely mentioned in most undergraduate curricula. Many of these students are going straight into the industrial and commercial sector at ever‐increasing rates, often as data analytics experts. To be an effective data scientist and user of advanced statistical applications including machine learning, these students should have an understanding and appreciation of data‐model comparison techniques.

How to Use This Book

This is a data analysis textbook for upper-level undergraduate STEM students. It is designed to be their statistics course in the degree program, offering them a learning experience based on real geophysical observations. Data from geoscience examples are used throughout the book to actively engage the reader in the concept of uncertainty as a leading factor in the interpretation and usage of a set of numbers. Note that this book intentionally avoids many of the derivations of the formulas presented. Some are given, but only for context to understand the assumptions built into the formula so that students learn the limitations of that particular formula. No derivations are assigned in the homework problems at the end of the chapters. Instructors who wish to add derivations to the assignments should feel free to do so, but I omit them intentionally because I want the focus to be on the application of the formulas for scientific investigations.

This book should be useful to students across all science disciplines, meant to serve as an initial course in scientific data analysis and hypothesis testing. This book provides the precursor knowledge to understanding machine learning techniques. While it does not explicitly cover machine learning, it provides a critical toolkit for students to fully understand, appreciate, and optimally use the latest machine learning advancements in data science. A key topic in geosciences that it does not cover is periodicity analysis. In my department at the University of Michigan, we have a separate course that deals explicitly with this subject, exploring Fourier transforms and other periodicity methodologies, and then teaching students how to interpret the resulting power spectral density graphs in a scientific context. This topic could be included in a future edition, with one merged book serving both of these undergraduate geophysical data analysis courses, but that endeavor is reserved for later.

While this book could be assigned as a reference text in conjunction with an advanced laboratory course, it is perhaps most effectively used as a separate course taken before, after, or in parallel to the lab class. The lab course is focused on the methods of data collection while this text is focused on the methods of data processing. These two topics are intimately related but the methods are completely different and each deserves its own focused learning experience.

The topic of data analysis and model metrics requires a computational approach. In the course on which this book is based, half of the class sessions are held in a computer lab with experiential learning examples, walking the students through the use of the concepts and equations presented in class. Specifically, these interactive analysis sessions are taught using the Python programming language via Jupyter notebooks, which include interleaved blocks of code and explanatory text. At least one version of these code files is already available online, as a supplement to the book content. Instructors should feel free to use this coding material in designing their own version of this type of course. The exercises at the end of the chapters assume some programming proficiency, and most of them require some coding for completion.

I have not included any programming-language-specific content in this book; any necessary coding instruction should be provided in addition to the content of this book. For my class, I give the students plenty of code; this is not a class about the special tricks of opening a data set in a particular format, but rather about using the analysis techniques to robustly assess one or more data sets. Note that some of the “Exercises in the Geosciences” problem sets at the ends of the chapters are quite long; instructors may want to assign only a subset. For some of the chapters, I assign only half, switching back and forth each year.

Prerequisites for Using This Book

This is meant as an introductory statistics textbook for upper-level undergraduates. It does not require any prior statistics coursework. If students have taken a statistics course already, then some of the first half of the book (especially Chapters 4, 6, and 7) will be somewhat of a review. The content in these chapters, however, is different from a typical statistics approach to the material, so students should find it to be a new perspective on these concepts. There is a small bit of probability in the course, but again no prior knowledge of this topic is needed.

The math in Chapter 3 on uncertainty propagation includes differential calculus. It is assumed that students have this “Calculus 1” knowledge base, so proficiency at this level of math is essential for that part of the course. Chapter 3 includes partial derivatives, so higher-level calculus is preferred, but this concept could be briefly introduced as part of this course (as I do, when I teach it).

Some programming experience is required to successfully navigate the homework problems in this book. I teach it in Python, via Jupyter Notebooks, going through coding techniques and example code in class. I do not, however, teach the fundamentals of scientific programming, but rather go through a style guide of code format and documentation expectations. I allow students to work together on coding assignments, as I do not want the programming aspect to be a hurdle to understanding the statistical and data-model comparison concepts.

The book contains numerous examples in Earth, atmospheric, space, and planetary sciences. No prior knowledge of these topics is necessary to fully appreciate these examples; background context is provided and they are separated from the statistical concepts. They offer a real-world use of the math that hopefully exposes readers to these geoscience topics and piques interest in them. From earthquakes to tornadoes to the auroral lights to the bright “wanderers” of the dark night sky, geoscience is a fascinating discipline that we experience every day. I hope that you enjoy these examples from this field.

Features of This Book

The content is organized into 12 chapters. A quick summary of the content is as follows:

Chapter 1: Uncertainty around data, comparing a single number to a group, and the Gaussian “normal” distribution

Chapter 2: Visualizing a data set, elements to consider when making a plot, and best practices for conveying a message with a figure

Chapter 3: Calculating the uncertainty of a processed data set

Chapter 4: Quantifying the centroid and spread of a number set, Poisson counting statistics in relation to centroids and spreads

Chapter 5: Methods for determining if a number set follows a Gaussian distribution, including the chi-squared test and Kolmogorov-Smirnov test, rejecting single data points from a set, and what to do if it isn’t Gaussian

Chapter 6: Comparing two data sets, t tests, covariance, correlation coefficients, and calculating an uncertainty of a metric with the jackknife and bootstrap methods

Chapter 7: Fitting a line between two paired data sets, including polynomial fitting and nonlinear curve fitting techniques

Chapter 8: Visualization techniques for comparing a data set with a model trying to reproduce it, approaches to data-model comparison metrics, and categories of metrics

Chapter 9: Fit performance metrics (those that use the continuous nature of the two number sets)

Chapter 10: Event detection metrics (those that categorize the data and model values into event/non-event status)

Chapter 11: Techniques for sliding the threshold of event identification

Chapter 12: Summary of the best options for data-model comparison metrics for certain applications, maximizing interpretive value with combinations of metrics, binomial distribution and decisions with metrics, and introductions to additional statistical topics

Each chapter includes similar content elements meant to make the material readily accessible and discoverable. At the end of each section or subsection is a light blue “Quick and Easy” box that recaps the primary point of that section. If you are skimming the book, reading only these should provide a concise, high-level summary of all of the main content. Most of the sections are focused on statistics, but interspersed within every chapter are sections providing example usage of those concepts in the Earth, atmospheric, space, and planetary sciences. These geoscience sections contain no new statistical concepts and could be skipped or replaced with different examples. This clear separation of the statistics content from the example geoscience content is slightly violated in Chapters 9, 10, and 11, which contain running examples. In these chapters, an early section introduces the running example and an ending section recaps it. In the intervening sections that introduce the metrics for each category, one paragraph, usually near the end, applies that metric to the running example.

Another accessibility feature of the book is the highlighting of key term definitions. At the first usage of a key statistics term, it is listed in bold font within the text and then a brief summary definition is given in red text at the bottom of that page. Similarly, key geoscience terms are also listed in bold font with a brief summary definition given in blue text at the bottom of the column. Again, for those skimming the book, looking for a particular concept, scanning these red‐text or blue‐text definitions should allow you to quickly find it.

Each chapter ends with the same two sections. The first of these is “Further Reading,” which is an annotated reference list for the chapter. The ordering of entries in this section follows the sections of the book, providing key references, links to data sets or important websites, and additional commentary. The final section of every chapter is “Exercises in the Geosciences,” which are a mixture of by‐hand calculations, coding tasks, and short‐answer interpretations about the results from other parts of the problem sets. All of the data sets are available at the companion website for this book, as well as Python code in Jupyter Notebook format for opening these files. As preferred programming languages shift, additional code files in these other languages could be made available at this website.

Final Thoughts

The long-range goal of writing this book is to improve the quality of methodology in scientific research. Many scientific studies conduct only cursory analysis of their selected data; approaches abound for conducting more rigorous assessments, and this book serves as an introduction to many of those analysis techniques. Furthermore, many studies that use numerical models to assess and interpret observations implement rather simplistic approaches to the data-model comparisons, often choosing only one or two metrics for the analysis. As presented in this book, there is a “zoo of metrics” available, each with its particular strengths for examining a particular aspect of the data-model relationship. Each metric, though, has its limitations, because it was designed for a particular purpose. With additional emphasis on the numerous options for data processing and data-model comparisons, including the role of uncertainty in these evaluations, the scientific discourse will be elevated.

Acknowledgments

I am a huge fan of John R. Taylor. He wrote An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements in 1982, which was assigned as the required text for my Advanced Physics Lab course series at Rose‐Hulman Institute of Technology. This was my first full immersion into data analysis and processing, and the Taylor book was an outstanding resource for understanding these concepts. I thank John Taylor for providing a brilliant example of how to easily convey difficult concepts. If I could measure it, I am sure that the influence John Taylor has had on me through this book passes the threshold of statistical significance.

This book would not have happened without my appointment as the editor in chief of the Journal of Geophysical Research Space Physics. It was through my experiences reading and editing a few hundred manuscripts a year that I realized the need in the scientific community for a better awareness of data processing techniques and a stronger appreciation for conducting robust data‐model comparisons using a wide array of metrics. Suggesting resources to people for additional information on these concepts led me to the realization that the community could really use a single resource for this collection of topics. Thanks to the editor selection committee that recommended me for the position and the many people at the American Geophysical Union (AGU) and across the space physics research community that helped make that experience a positive one for me.

AGU contracts with Wiley for the publication of their various journals and topic-specific monographs. Intending to reach the broad spectrum of disciplines included within the umbrella of AGU, I chose Wiley as the publisher for this book. I am glad that they accepted my book proposal and that I took this path of writing this book. A special thanks goes out to Ritu Bose, Mandy Collison, and Layla Harden from Wiley and Jenny Lunn from AGU for encouraging and supporting me during this process. I would also like to thank the technical editor, Dr. Bea Gallardo-Lacort, for her time, effort, and encouragement through the long process of peer review.

I’d like to thank my colleagues in the Department of Climate and Space Sciences and Engineering at the University of Michigan. Our curriculum committee recognized a need for an upper-level undergraduate data processing and visualization course, and they looked around for someone to develop and teach it. This coincided with my editorship and my realization that the natural sciences could use a course like this, so I volunteered to do this task for the department. Thanks to the entire faculty who have contributed to my understanding of data analysis, data-model comparisons, and data visualization, and, of course, who provided the opportunity for me to develop this course, from which I have now written this book.

I will be forever in awe of the intellect, audacity, and tenacity of Dr. Abigail Azari. As my PhD student at the time when I was assigned the new course, she heard me pondering which programming language to use for the labs. After that group meeting, she came to my office and told me that the labs would be in Python, specifically Jupyter Notebooks, and that she would develop them. I wholeheartedly agreed, and while we developed that lab structure and outline together, she developed the content and actual code. In fact, she taught the labs for the first two years. Thank you, Abby; you will make an outstanding professor.

Many other coworkers had a direct impact on the content of this book. I am very grateful for my many conversations about data analysis, data visualization, and data‐model comparisons with others in the space physics field, both those in my immediate group at the University of Michigan and those across the world. There are too many to name, but I would like to especially mention Dr. Natalia Ganushkina, Dr. Dan Welling, Dr. Janet Kozyra, Dr. Lutz Rastätter, Dr. Katherine Garcia‐Sage, Dr. Steven Morley, Dr. Alexa Halford, Dr. Meghan Burleigh, Dr. Michael Balikhin, Dr. Aaron Ridley, Dr. Derek Posselt, Dr. Suzy Bingham, and Dr. Darren De Zeeuw. In addition, I would like to thank Prof. Ted Bergin from the Department of Astronomy at the University of Michigan who graciously provided an office during my sabbatical, a time when a part of this book was written. I would also like to thank my former and current PhD students that have greatly contributed to my thoughts on these topics (in addition to Abby); you have all profoundly and positively influenced me. In particular with respect to the subjects covered in this book, I extend an extra thanks to Roxanne Katus, Shaosui Xu, Alicia Petersen, Alexander Shane, Brian Swiger, and Agnit Mukhopadhyay.

I am indebted to the many students who took the class on which this book is based. In particular, I would like to thank the students of the class from winter term 2021, who were given draft—and often incomplete—versions of the chapters throughout the term. I would especially like to thank Aiden Kingwell, John Delpizzo, and Neha Satish for their robust comments and suggestions. I would also like to acknowledge Samuel Ephraim, whose class project provided the data set for the event detection metrics running example. I also thank the students who took the class from winter term 2022, the first to see the complete book, for the comments they provided throughout the term. I especially appreciate the many comments from Jena Alidinaj, Kira Biener, Alanah Cardenas-OToole, Jessica Fisher, Akshay Gupta, Lunia Oriol, and Ollie Paulus.

I owe a big thanks to Asher/Anya Hurst, the artist that I contracted to create the Chicken Little illustrations used throughout the book. They are a very talented graphic artist with skills well beyond the simple figures I requested for this project. I hope that you enjoy this whimsical accent to the otherwise dry topic of statistics for the Earth and space sciences, and please consider commissioning Asher/Anya for your illustrative needs.

Finally, I would like to thank my family. Here is a message to my immediate family—my wife Ginger and children Annie and Derek—I love you very much and thank you for giving me the encouragement and feedback that I needed to keep going with this project. Additionally, I thank my parents and siblings for being there with me throughout, well, all the years of my life. I especially thank my dad, Dr. Harold Liemohn, who gave me the encouragement to pursue a technical career and advice along the way on how to navigate the scientific research landscape, and for graciously reading and editing several chapters of this manuscript.

About the Companion Website

This book is accompanied by a companion website.

www.wiley.com/go/liemohn/uncertaintyingeosciences

The website includes:

All of the data files needed for the homework problems in the book

Example code (in Python, in Jupyter Notebook format) for opening these data files

The instructor pages on this site include:

Solution sets to the homework problems

PDF and PowerPoint files of all figures from the book for downloading

LaTeX and PowerPoint files containing all equations used in the text

1 Assessment and Uncertainty: Examples and Introductory Concepts

Analyzing data is a fundamental skill for someone pursuing a career in science, technology, engineering, and mathematics (STEM). That first word in the list, science, is defined broadly here and includes not only geosciences, the focus of the examples throughout the book, but also all natural sciences, physical sciences, social sciences, and the medical professions. Nearly every discipline uses “sets of numbers” in its standard analysis methods. Therefore, a firm grounding in how to approach a data set and understand its properties is a skill that allows you to confidently address a very wide variety of real-world problems. Similarly, being able to thoroughly assess and interpret how two data sets relate to one another (the core of the field of statistics) is fully transferable from one discipline to another. Knowing how to examine numbers is foundational for making good decisions; those in managerial positions should also know these skills to properly distill large quantities of information into the essential elements needed for progress on the projects they lead.

A key element of data analysis is the use of models to decipher features of the measured values relative to physical processes. This topic of data-model comparisons is fundamental to the investigative methods across STEM disciplines. Yet, applying metrics to find the “goodness of fit” of a model output compared to an observational data set requires careful thought about how to conduct the analysis and how to interpret the values resulting from this “processing” of the number sets. The first half of that methodology, how to conduct the comparison analysis, could take many paths depending on the eventual use of the quantities to be calculated. Some equations are much better suited than others for some applications, or for certain types of data. Choosing which types of analysis to conduct is an important aspect of the process because there are no one-size-fits-all statistics. The second half of the methodology, interpreting the resulting metric values, is subjective and requires context about not only the origins of the numbers but also the final application.

All of these steps depend on the uncertainty connected with the values. Uncertainty here means “the range of other possible values for this particular number.” That is, it is not about poor memory and being unsure of whether a procedure was conducted properly, but rather the spread of options that a value might reasonably take instead of the actual value reported. This is especially true for visually determined values, such as precipitation accumulated in a rain gauge. Uncertainty allows you to ascribe meaning to a measured value and properly compare it with another number. Uncertainty allows you to place two numbers in context with each other. Uncertainty is the key to applied statistics.

Uncertainty:  The range of other possible values for a particular number.

Unfortunately, uncertainty is often forgotten, unreported, or incorrectly included in the analysis and interpretation. To make uncertainty a little more tangible, let us consider a fresh take on a well-known story.

1.1 Chicken Little, Amateur Meteorologist

Chicken Little, being a methodical amateur meteorologist, went out every day at noon to measure atmospheric pressure. One day, they recorded a value of 100 kilopascals (abbreviated kPa) on the barometer. The next day, the reading was only 90 kPa. Figure 1.1 shows Chicken Little hard at work at their computer, analyzing the data. Did they take this drop in atmospheric pressure as a signal that the sky is falling and run off to tell the local authorities?

Before we answer, perhaps you want a bit of context. A pressure of 100 kPa is close to a normal atmospheric pressure here on Earth, while 90 kPa is a typical pressure in the eye of a very strong hurricane. At first glance, this looks like a big atmospheric pressure difference that should be reported.

Figure 1.1 Should Chicken Little be scared by a 10 kPa pressure drop? Should they declare that the sky is falling? It depends. Artwork by Asher/Anya Hurst.

The correct answer, to no one’s satisfaction except a scientist, is that it depends. On what, you ask? It depends on the uncertainty in their two values. If Chicken Little is using a modern digital barometer with a pressure‐sensing electronic transducer, then the uncertainty on the measurements is only a few kilopascals (approximately 2%). In this case, the drop is large compared to the uncertainty and they should run to report it. If, however, they were judging the atmospheric pressure based on the rise time of a balloon, then the uncertainty estimate on these numbers could be something like 30%. The 10 kPa discrepancy, that is, the difference between the two numbers, is quite a bit less than the 30 kPa uncertainty on either number, so they should not report it, as the difference is not meaningful. What if they made the first measurement of 100 kPa with a modern barometer and the second measurement of 90 kPa with the rising balloon? The combined uncertainty (simply adding the two uncertainties) is still greater than the discrepancy, so they should not report it. However, because atmospheric pressure only varies by about 10 kPa, if the measurement techniques were reversed and they measured 100 kPa with the rising balloon and then 90 kPa with the digital barometer, meaning that the range is between 88 and 92 kPa with the typical uncertainty of such devices, then Chicken Little should report the very low atmospheric pressure regardless of what it was the day before. The uncertainty on the numbers makes all the difference in whether their measurements warrant action.
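The reasoning in the story condenses to a few lines of Python. The `is_meaningful` helper below is hypothetical, not from the book, and it combines uncertainties by simple addition, as the story does (later chapters treat combination more carefully):

```python
def is_meaningful(value1, unc1, value2, unc2):
    """Is the discrepancy between two measurements larger than the
    combined (here, simply summed) uncertainty on the pair?"""
    discrepancy = abs(value1 - value2)
    return discrepancy > unc1 + unc2

# Digital barometer on both days: ~2% uncertainty on each reading
print(is_meaningful(100, 2.0, 90, 1.8))    # True  -> report the drop
# Rising-balloon estimates: ~30% uncertainty on each reading
print(is_meaningful(100, 30.0, 90, 27.0))  # False -> not meaningful
```

The same 10 kPa discrepancy is decisive or meaningless depending entirely on the uncertainties fed in.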

Discrepancy:  The difference between two numbers.

Quick and Easy for Section 1.1

The interpretation of numbers depends on their uncertainty.

1.2 Uncertainty Ascribes Meaning to Values

The story in the preceding text illustrates a main premise of science—the meaning of a particular number cannot be properly understood without a value for the uncertainty surrounding that number. This is especially important when, as in the example above, two numbers are being compared. You can calculate the difference between the two numbers, but this is not enough; it is only with the uncertainty intervals around the two values that it becomes possible to judge whether the two numbers are basically the same or fundamentally different.

Uncertainty is the amount that a measured or calculated number could be off from the true value. This can sometimes be inferred from the significant digits reported on the number, with the last nonzero digit indicating the degree of accuracy of the number. This would work except for two things: the first is that calculators, computers, and digital sensors make it very easy to report many digits beyond the true accuracy of the number; and the second is that many people have forgotten their middle school math unit on significant figures. Numbers are often reported with more nonzero digits than they should have, including in scientific experiments. If you trust that the source of the number took the sources of uncertainty into account when reporting the value, then you can proceed with the assumption that the reported digits reflect the accuracy of the value; otherwise, caution is highly advised when using this technique to determine uncertainty.
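The digit-counting heuristic can be sketched in code. The `implied_uncertainty` helper below is hypothetical and encodes only one common convention consistent with the description above: take one unit in the place of the last digit written as significant, where trailing zeros after a decimal point count as significant but bare trailing zeros in an integer are treated as placeholders:

```python
from decimal import Decimal

def implied_uncertainty(reported: str) -> Decimal:
    """Infer an uncertainty from how a value is written, as one unit
    in the place of its last significant digit."""
    d = Decimal(reported)
    if "." in reported:
        # The exponent of a Decimal is the place of its last digit
        return Decimal(1).scaleb(d.as_tuple().exponent)
    # Integer with no decimal point: treat trailing zeros as placeholders
    stripped = reported.rstrip("0")
    return Decimal(10) ** (len(reported) - len(stripped))

print(implied_uncertainty("101.3"))   # 0.1
print(implied_uncertainty("1200"))    # 100
print(implied_uncertainty("90.00"))   # 0.01
```

As the text cautions, this only works if the source actually respected significant figures when reporting the value; a digital display's extra digits carry no such guarantee.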