Earth Observation Using Python - Rebekah B. Esmaili - E-Book

Description

Learn basic Python programming to create functional and effective visualizations from earth observation satellite data sets

Thousands of satellite datasets are freely available online, but scientists need the right tools to efficiently analyze data and share results. Python has easy-to-learn syntax and thousands of libraries to perform common Earth science programming tasks. Earth Observation Using Python: A Practical Programming Guide presents an example-driven collection of basic methods, applications, and visualizations to process satellite data sets for Earth science research.

* Gain Python fluency using real data and case studies
* Read and write common scientific data formats, like netCDF, HDF, and GRIB2
* Create 3-dimensional maps of dust, fire, vegetation indices and more
* Learn to adjust satellite imagery resolution, apply quality control, and handle big files
* Develop useful workflows and learn to share code using version control
* Acquire skills using online interactive code available for all examples in the book

The American Geophysical Union promotes discovery in Earth and space science for the benefit of humanity. Its publications disseminate scientific knowledge and provide resources for researchers, students, and professionals.

Page count: 396

Publication year: 2021

Table of Contents

Cover

Title Page

Copyright Page

Foreword

Acknowledgments

Introduction

Part I: Overview of Satellite Datasets

1 A TOUR OF CURRENT SATELLITE MISSIONS AND PRODUCTS

1.1 History of Computational Scientific Visualization

1.2 Brief Catalog of Current Satellite Products

1.3 The Flow of Data from Satellites to Computer

1.4 Learning Using Real Data and Case Studies

1.5 Summary

References

2 OVERVIEW OF PYTHON

2.1 Why Python?

2.2 Useful Packages for Remote Sensing Visualization

2.3 Maturing Packages

2.4 Summary

References

3 A DEEP DIVE INTO SCIENTIFIC DATA SETS

3.1 Storage

3.2 Data Formats

3.3 Data Usage

3.4 Summary

References

Part II: Practical Python Tutorials for Remote Sensing

4 PRACTICAL PYTHON SYNTAX

4.1 “Hello Earth” in Python

4.2 Variable Assignment and Arithmetic

4.3 Lists

4.4 Importing Packages

4.5 Array and Matrix Operations

4.6 Time Series Data

4.7 Loops

4.8 List Comprehensions

4.9 Functions

4.10 Dictionaries

4.11 Summary

References

5 IMPORTING STANDARD EARTH SCIENCE DATASETS

5.1 Text

5.2 NetCDF

5.3 HDF

5.4 GRIB2

5.5 Importing Data Using Xarray

5.6 Summary

References

6 PLOTTING AND GRAPHS FOR ALL

6.1 Univariate Plots

6.2 Two Variable Plots

6.3 Three Variable Plots

6.4 Summary

References

7 CREATING EFFECTIVE AND FUNCTIONAL MAPS

7.1 Cartographic Projections

7.2 Cylindrical Maps

7.3 Polar Stereographic Maps

7.4 Geostationary Maps

7.5 Creating Maps from Datasets Using OpenDAP

7.6 Summary

References

8 GRIDDING OPERATIONS

8.1 Regular One‐Dimensional Grids

8.2 Regular Two‐Dimensional Grids

8.3 Irregular Two‐Dimensional Grids

8.4 Summary

References

9 MEANINGFUL VISUALS THROUGH DATA COMBINATION

9.1 Spectral and Spatial Characteristics of Different Sensors

9.2 Normalized Difference Vegetation Index (NDVI)

9.3 Window Channels

9.4 RGB

9.5 Matching with Surface Observations

9.6 Summary

References

10 EXPORTING WITH EASE

10.1 Figures

10.2 Text Files

10.3 Pickling

10.4 NumPy Binary Files

10.5 NetCDF

10.6 Summary

Part III: Effective Coding Practices

11 DEVELOPING A WORKFLOW

11.1 Scripting with Python

11.2 Version Control

11.3 Virtual Environments

11.4 Methods for Code Development

11.5 Summary

References

12 REPRODUCIBLE AND SHAREABLE SCIENCE

12.1 Clean Coding Techniques

12.2 Documentation

12.3 Licensing

12.4 Effective Visuals

12.5 Summary

References

Conclusion

Appendix A: INSTALLING PYTHON

A.1 Download Tutorials for This Book

A.2 Download and Install Anaconda

A.3 Package Management in Anaconda

Appendix B: JUPYTER NOTEBOOK

Appendix C: ADDITIONAL LEARNING RESOURCES

Appendix D: TOOLS

Appendix E: FINDING, ACCESSING, AND DOWNLOADING SATELLITE DATASETS

Appendix F: ACRONYMS

Index

End User License Agreement

List of Tables

Chapter 3

Table 3.1 Data Types, Typical Ranges, and Decimal Precision and Size in Compu...

Table 3.2 Examples of How Data Can Be Rescaled to Fit in Integer Ranges

Table 3.3 Comparing Increasing Numbers in Base‐10 to Base‐2 (Binary)

Table 3.4 Levels and Examples of Transformations Performed on the Data

Chapter 4

Table 4.1 Useful Python Functions and Methods Used in This Text

Table 4.2 Built‐in Comparison, Identity, Logical, and Membership Operators in...

Chapter 5

Table 5.1 Common Attributes Found in a netCDF file Containing Remote Sensing Dat...

Chapter 6

Table 6.1 Calls and Options for Initializing a Figure Using matplotlib

Table 6.2 Helpful Options for matplotlib Colorbars

Chapter 7

Table 7.1 Map Projection Shapes, Examples, and Regions of Minimal Distortion

Chapter 10

Table 10.1 Differences between Image Types

Chapter 12

Table 12.1 Examples of Code Problems and Solutions

List of Illustrations

Chapter 1

Figure 1.1 (a) An example of a Fortran punch card. Each vertical column repr...

Figure 1.2 Illustration of current Earth, space weather, and environmental m...

Figure 1.3 Equatorial crossing times for various LEO satellites displayed us...

Figure 1.4 NOAA‐20 satellite downlink.

Chapter 3

Figure 3.1 (a) Canisters of 35mm film that contain imagery recovered from Ni...

Figure 3.2 Spacing and distance between (x, y) points for an example regular...

Figure 3.3 An illustration of how a granule of satellite data taken from a p...

Figure 3.4 Read order for row and column major.

Figure 3.5 Example of how netCDF data are organized. Each variable has metad...

Figure 3.6 Direct broadcast (DB) antenna sites, which can provide data in re...

Chapter 5

Figure 5.1 Illustration showing which values are used in a computation of a ...

Figure 5.2 NUCAPS data are organized into 120 fields of regard (FORs) or foo...

Figure 5.3 Screenshot of the mean SST dataset in the NOAA/ESRL data catalog....

Chapter 6

Figure 6.1 VIIRS imagery of wildfires on 8 November 2018. Source: Worldview ...

Figure 6.2 Different matplotlib axes layout syntax using the plt.subplot...

Chapter 7

Figure 7.1 Most varieties of map projections are cylindrical, conic, or plan...

Figure 7.2 Each projection type has different regions of distortion. Lighter...

Figure 7.3 Examples of the types of maps within each projection category.

Figure 7.4 Imager projections from Geostationary Satellites.

Chapter 8

Figure 8.1 An example of differences in the nearest neighbor, linear, and cu...

Chapter 9

Figure 9.1 Example of various instrument channels and ranges between 400–100...

Chapter 11

Figure 11.1 Converting Jupyter Notebook to a script.

Figure 11.2 A flow chart of the steps necessary to run scripts in the comman...

Figure 11.3 Installing nb_conda adds a new tab to Jupyter Notebook, which al...

Figure 11.4 Changing kernel inside Jupyter Notebooks.

Figure 11.5 Comparison of waterfall and agile framework.

Chapter 12

Figure 12.1 Example of uniform color scheme (top) and how a different color ...

Figure 12.2 Perceptually uniform sequential colors available in matplotlib....

Special Publications 75

EARTH OBSERVATION USING PYTHON

A Practical Programming Guide

Rebekah B. Esmaili

This Work is a co‐publication of the American Geophysical Union and John Wiley and Sons, Inc.

This edition first published 2021

© 2021 American Geophysical Union

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

Published under the aegis of the AGU Publications Committee

Brooks Hanson, Executive Vice President, Science

Carol Frost, Chair, Publications Committee

For details about the American Geophysical Union visit us at www.agu.org.

The right of Rebekah B. Esmaili to be identified as the author of this work has been asserted in accordance with law.

Registered Office: John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

Editorial Office: 111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty

While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging‐in‐Publication Data

Name: Esmaili, Rebekah Bradley, author.

Title: Earth observation using Python : a practical programming guide / Rebekah B. Esmaili.

Description: Hoboken, NJ : Wiley, [2021] | Includes bibliographical references and index.

Identifiers: LCCN 2021001631 (print) | LCCN 2021001632 (ebook) | ISBN 9781119606888 (hardback) | ISBN 9781119606895 (adobe pdf) | ISBN 9781119606918 (epub)

Subjects: LCSH: Earth sciences--Data processing. | Remote sensing--Data processing. | Python (Computer program language) | Information visualization. | Artificial satellites in earth sciences. | Earth sciences--Methodology.

Classification: LCC QE48.8 .E85 2021 (print) | LCC QE48.8 (ebook) | DDC 550.285/5133--dc23

LC record available at https://lccn.loc.gov/2021001631

LC ebook record available at https://lccn.loc.gov/2021001632

Cover Design: Wiley

Cover Image: © NASA

FOREWORD

When I first met the author a few years ago, she was eager to become more involved in the Joint Polar Satellite System’s Proving Ground. The Proving Ground by definition assesses the impact of a product in the user’s environment; this intrigued Rebekah because, as a product developer, she wanted to understand the user’s perspective. Rebekah worked with the National Weather Service (NWS) to demonstrate how satellite‐derived atmospheric temperature and water vapor soundings can be used to describe the atmosphere’s instability to support severe weather warnings. Rebekah spent considerable time with users at the Storm Prediction Center in Norman, Oklahoma, to understand their needs, and she discovered their thirst for data and their need for data to be easily visualized and understandable. This is where Rebekah leveraged her expert skills in Python to provide NWS with the information they found to be most useful. Little did I know at the time she was writing a book.

As noted in this book, a myriad of Earth‐observing satellites collect critical information about the Earth’s complex and ever‐changing environment and landscape. Unfortunately, not all of that information is being used effectively today, for various reasons: issues with data access, different data formats, and the need for better tools for data fusion and visualization. If we were able to solve these problems, then suddenly there would be vast improvements in providing societies with the information needed to support decisions related to weather and climate and their impacts, including high‐impact weather events, droughts, flooding, wildfires, ocean/coastal ecosystems, air quality, and more. Python is becoming the universal language to bridge these various data sources and translate them into useful information. Open and free attributes, and the data and code sharing mindset of the Python communities, make Python very appealing.

Being involved in a number of international collaborations to improve the integration of Earth observations, I can certainly emphasize the importance of working together, data sharing, and demonstrating the value of data fusion. I am very honored to write this Foreword, since this book focuses on these issues and provides an excellent guide with relevant examples for the reader to follow and relate to.

Dr. Mitch Goldberg

Chief Program Scientist

NOAA-National Environmental Satellite, Data, and Information Service

June 22, 2020

ACKNOWLEDGMENTS

This book evolved from a series of Python workshops that I developed with the help of Eviatar Bach and Kriti Bhargava from the Department of Atmospheric and Oceanic Science at the University of Maryland. I am very grateful for their assistance providing feedback for the examples in this book and for leading several of these workshops with me.

This book would not exist without their support and contributions from others, including:

The many reviewers who took the time to read versions of this book, several of whom I have never met in person. Thanks to modern communication systems, I was able to draw from their expertise. Their constructive feedback and insights not only helped to improve the quality and breadth of the book but also helped me hone my technical writing skills.

Rituparna Bose, Jenny Lunn, Layla Harden, and the rest of the team at AGU and Wiley for keeping me informed, organized, and on track throughout this process. They were truly a pleasure to work with.

Nadia Smith and Chris Barnet, and my other colleagues at Science and Technology Corp., who provided both feedback and conversations that helped shape some of the ideas and content in this book.

Catherine Thomas, Clare Flynn, Erin Lynch, and Amy Ho for their endless encouragement and support.

Tracie and Farid Esmaili, my parents, who encouraged me to aim high even if they were initially confused when their atmospheric scientist daughter became interested in “snakes.”

INTRODUCTION

Python is a programming language that is rapidly growing in popularity. The number of users is large, although difficult to quantify; in fact, Python is currently the most tagged language on stackoverflow.com, a coding Q&A website with approximately 3 million questions a year. Some view this interest as hype, but there are many reasons to join the movement. Scientists are embracing Python because it is free, open source, easy to learn, and has thousands of add‐on packages. Many routine tasks in the Earth sciences have already been coded and stored in off‐the‐shelf Python libraries. Users can download these libraries and apply them to their research rather than simply using older, more primitive functions. The widespread adoption of Python means scientists are moving toward a common programming language and set of tools that will improve code shareability and research reproducibility.

Among the wealth of remote sensing data available, satellite datasets are particularly voluminous and tend to be stored in a variety of binary formats. Some datasets conform to a “standard” structure, such as netCDF4. However, because of uncoordinated efforts across different agencies and countries, such standard formats bear their own inconsistencies in how data are handled and intended to be displayed. To address this, many agencies and companies have developed numerous “quick look” methods. For instance, data can be searched for and viewed online as JPEG images, or individual files can be displayed with free, open‐source software tools like Panoply (www.giss.nasa.gov/tools/panoply/) and HDFView (www.hdfgroup.org/downloads/hdfview/).

Still, scientists who wish to execute more sophisticated visualization techniques will have to learn to code. Coding knowledge is not the only limitation for users. Not all data are “analysis ready,” i.e., in the proper input format for visualization tools. As such, many pre‐processing steps are required to make the data usable for scientific analysis. This is particularly evident for data fusion, where two datasets with different resolutions must first be mapped to the same grid before they are compared. Because many data users are not satellite scientists or professional programmers but rather members of other research and professional communities, these barriers can be too great to overcome. Even to a technical user, the nuances can be frustrating. At worst, obstacles in coding and data visualization can potentially lead to data misuse, which can tarnish the work of an entire community.
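
The regridding step that data fusion requires can be illustrated with a minimal sketch. It assumes two synthetic one‐dimensional datasets (the names `coarse_lon`, `fine_lon`, `coarse_temp`, and `fine_temp`, and all of the values, are hypothetical and not taken from the book) and uses NumPy's `np.interp` for linear interpolation. Real satellite data usually call for two‐dimensional and often irregular regridding, which Chapter 8 covers.

```python
import numpy as np

# Synthetic longitudes (degrees) for two hypothetical datasets on different grids.
coarse_lon = np.linspace(0.0, 10.0, 6)    # 2.0-degree spacing
fine_lon = np.linspace(0.0, 10.0, 21)     # 0.5-degree spacing

# Made-up temperature values (K) on each grid, for illustration only.
coarse_temp = 280.0 + 1.5 * coarse_lon
fine_temp = 280.0 + 1.5 * fine_lon + 0.2 * np.sin(fine_lon)

# The arrays have different lengths, so they cannot be subtracted directly.
# Linearly interpolate the coarse dataset onto the fine grid first.
coarse_on_fine = np.interp(fine_lon, coarse_lon, coarse_temp)

# Both datasets now share one grid and can be compared point by point.
difference = fine_temp - coarse_on_fine
print(difference.shape)
print(float(np.abs(difference).max()))
```

Interpolating the coarse field onto the fine grid does not add real detail to it. Averaging the fine data down to the coarse grid is a common alternative, and which direction is appropriate depends on the analysis.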

The purpose of this text is to provide an overview of the common preparatory work and visualization techniques that are applied to environmental satellite data using the Python language. This book is highly example‐driven, and all the examples are available online. The exercises are primarily based on hands‐on tutorial workshops that I have developed. The motivation for producing this book is to make the contents of the workshops accessible to more Earth scientists, as very few Python books currently available target the Earth science community.

This book is written to be a practical workbook and not a theoretical textbook. For example, readers will be able to run prewritten code interactively alongside the text to guide them through the code examples. Exercises in each section build on one another, with incremental steps folded in. Readers with minimal coding experience can follow each “baby step” to get “spun up” quickly, while more experienced coders have the option of working with the code directly and spending more time on building a workflow as described in Part III.

The exercises and solutions provided in this book use Jupyter Notebook, a highly interactive, web‐based development environment. Using Jupyter Notebook, code can be run in a single line or short blocks, and the results are generated within an interactive documented format. This allows the student to view both the Python commands and comments alongside the expected results. Jupyter Notebook can also be easily converted to programs or scripts that can be executed on Linux machines for high‐performance computing. This provides a friendly work environment to new Python users. Students are also welcome to develop code in any environment they wish, such as the Spyder IDE or using IPython.

While the material builds on concepts learned in other chapters, the book references the location of earlier discussions of the material. Within each chapter, the examples are progressive. This design allows students to build on their understanding (and learn where to find answers when they need guidance) rather than memorizing syntax or a “recipe.” Professionally, I have worked with many datasets and I have found that the skills and strategies that I apply to satellite data are fairly universal. The examples in this book are intended to help readers become familiar with some of the characteristic quirks that they may encounter when analyzing various satellite datasets in their careers. In this regard, students are also strongly encouraged to submit requests for improvements in future editions.

As with many technological texts, there is a risk that the solutions presented will become outdated as new tools and techniques are developed. The sizable user community already contributing to Python implies it is actively advancing; it is a living language in contrast to compiled, more slowly evolving legacy languages like Fortran and C/C++. A drawback of printed media is that it tends to be static and Python is evolving more rapidly than the typical production schedule of a book. To mitigate this, this book intends to teach fluency in a few, well‐established packages by detailing the steps and thought processes that a user needs to carry out more advanced studies. The text focuses on discipline‐agnostic packages that are widely used, such as NumPy, Pandas, and xarray, as well as plotting packages such as Matplotlib and Cartopy.

I have chosen to highlight Python primarily because it is a general‐purpose language, rather than being discipline‐ or task‐specific. Python programmers can script, process, analyze, and visualize data. Python’s popularity does not diminish the usefulness and value of other languages and techniques. As with all interpreted programming languages, Python may run more slowly compared to compiled languages like Fortran and C++, the traditional tools of the trade. For instance, some steps in data analysis could be done more succinctly and with greater computational efficiency in other languages. Also, underlying packages in Python often rely on compiled languages, so an advanced Python programmer can develop very computationally efficient programs with popular packages that are built with speed‐optimized algorithms. While not explicitly covered in this book, emerging packages such as Dask can be helpful to process data in parallel, so more advanced scientific programmers can learn to optimize the speed performance of their code. Python interfaces with a variety of languages, so advanced scientific programmers can compile computationally expensive processing components and run them using Python. Then, simpler parts of the code can be written in Python, which is easier to use and debug.
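
As a rough illustration of how Python packages lean on compiled code, the sketch below applies the same scale‐and‐offset conversion twice: once in an explicit Python loop and once as a NumPy array expression, whose arithmetic is carried out by NumPy's compiled routines. The function names and the scale and offset values are hypothetical, and no timings are claimed here; the standard‐library `timeit` module can be used to measure the difference on a given machine.

```python
import numpy as np


def scale_with_loop(counts, scale, offset):
    """Convert raw counts one element at a time in the Python interpreter."""
    converted = []
    for count in counts:
        converted.append(count * scale + offset)
    return converted


def scale_vectorized(counts, scale, offset):
    """Convert all counts at once; the arithmetic runs inside NumPy."""
    return np.asarray(counts) * scale + offset


# Hypothetical raw integer counts and made-up calibration constants.
counts = np.arange(100_000)
loop_result = scale_with_loop(counts, 0.01, 273.15)
vector_result = scale_vectorized(counts, 0.01, 273.15)

# Both approaches should produce the same numbers.
print(np.allclose(loop_result, vector_result))
```

The vectorized form is also shorter and closer to the underlying formula, which is one reason array expressions are usually preferred over element‐wise loops in NumPy code.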

This book encourages readers to share their final code online with the broader community, a practice more common among software developers than scientists. However, it is also good practice to write code and software in a thoughtful and carefully documented manner so that it is usable for others. For instance, well‐written code is general purpose, lacks redundancy, and is intuitively organized so that it may be revised or updated if necessary. Many scientific programmers are self‐learners with a background in procedural programming, and thus their Python code will tend to resemble the flow of a Fortran or IDL program. This text uses Jupyter Notebook, which is designed to promote good programming habits in establishing a “digestible code” mindset; this approach organizes code into short chunks. This book focuses on clear documentation in science algorithms and code. This is addressed through discussions of version control, the use of virtual environments, how to structure a usable README file, and what to include in inline comments.

For most environmental science endeavors, data and code sharing are part of the research‐to‐operations feedback loop. “Operations” refers to continuous data collection for scientific research and hazard monitoring. By sharing these tools with other researchers, datasets are more fully and effectively utilized. Satellite data providers can upgrade existing datasets if there is a demand. Globally, satellite data are provided through data portals by NASA, NOAA, EUMETSAT, ESA, JAXA, and other international agencies. However, the value of these datasets is often only visible through scientific journal articles, which only represent a small subset of potential users. For instance, if the applications of satellite observations used for routine disaster mitigation and planning in a disadvantaged nation are not published in a scientific journal, improvements for disaster‐mitigation specific needs may never be met.

Further, there may be unexpected or novel uses of datasets that can drive scientific inquiry, but if the code that brings those uses to life is hastily written and not easily understood, it is effectively a waste of time for colleagues to attempt to employ such applications. By sharing clearly written code and corresponding documentation for satellite data applications, users can alert colleagues in their community of the existence of scientific breakthrough efforts and expand the potential value of satellite datasets within and beyond their community. Moreover, public knowledge of those efforts can help justify the versatility and value of satellite missions and provide a return on investment for organizations that fund them. In the end, the dissemination of code and data analysis tools will only benefit the scientific community as a whole.

Part I: Overview of Satellite Datasets

1 A TOUR OF CURRENT SATELLITE MISSIONS AND PRODUCTS

There are thousands of datasets containing observations of the Earth. This chapter describes some satellite types, orbits, and missions, which benefit a variety of fields within Earth sciences, including atmospheric science, oceanography, and hydrology. Data are received on the ground through receiver stations and processed using retrieval algorithms. However, the raw data require further manipulation to be useful, and Python is a good choice for analysis and visualization of these datasets.

At present, there are over 13,000 satellite‐based Earth observations freely and openly listed on www.data.gov. Not only is the quantity of available data notable, its quality is equally impressive; for example, infrared sounders can estimate brightness temperatures within 0.1 K from surface observations (Tobin et al., 2013), imagers can detect ocean currents with an accuracy of 1.0 km/hr (NOAA, 2020), and satellite‐based lidar can measure the ice‐sheet elevation change with a 10 cm sensitivity (Garner, 2015). Previously remote parts of our planet are now observable, including the open oceans and sparsely populated areas. Furthermore, many datasets are available in near real time with image latencies ranging from less than an hour down to minutes – the latter being critically important for natural disaster prediction. Having data rapidly available enables science applications and weather prediction as well as emergency management and disaster relief. Research‐grade data take longer to process (hours to months) but have higher accuracy and precision, making them suitable for long‐term consistency. Thus, we live in the “golden age” of satellite Earth observation. While the data are accessible, the tools and skills necessary to display and analyze this information require practice and training.

Python is a modern programming language that has exploded in popularity, both within and beyond the Earth science community. Part of its appeal is its easy‐to‐learn syntax and the thousands of available libraries that can be synthesized with the core Python package to do nearly any computing task imaginable. Python is useful for reading Earth‐observing satellite datasets, which can be difficult to use due to the volume of information that results from the multitude of sensors, platforms, and spatio‐temporal spacing. Python facilitates reading a variety of self‐describing binary datasets in which these observations are often encoded. Using the same software, one can complete the entirety of a research project and produce plots. Within a notebook environment, a scientist can document and distribute the code to other users, which can improve efficiency and transparency within the Earth sciences community.

Satellite data often require some pre‐processing to make them usable, but which steps to take and why are not always clear. Data users often misinterpret concepts such as data quality, how to perform an atmospheric correction, or how to implement the complex gridding schemes necessary to compare data at different resolutions. Even to a technical user, the nuances can be frustrating and difficult to overcome. This book walks you through some of the considerations a user should make when working with satellite data.

The primary goal of this text is to get the reader up to speed on the Python coding techniques needed to perform research and analysis using satellite datasets. This is done by adopting an example‐driven approach. It is light on theory but will briefly cover relevant background in a nontechnical manner. Rather than getting lost in the weeds, this book purposefully uses realistic examples to explain concepts. I encourage you to run the interactive code alongside reading the text. In this chapter, I will discuss a few of the satellites, sensors, and datasets covered in this book and explain why Python is a great tool for visualizing the data.

1.1 History of Computational Scientific Visualization

Scientific data visualization used to be a very tedious process. Prior to the 1970s, data points were plotted by hand using devices such as slide rules, French curves, and graph paper. During the 1970s, IBM mainframes became increasingly available at universities and facilitated data analysis on the computer. For analysis, IBM mainframes required that a researcher write Fortran‐IV code, which was then printed to cards using a keypunch machine (Figure 1.1). The punch cards then were manually fed into a shared university computer to perform calculations. Each card holds roughly one line of code. To make plots, the researcher could create a Fortran program to make an ASCII plot, which creates a plot by combining lines, text, and symbols. The plot could then be printed to a line‐printer or a teleprinter. Some institutions had computerized graphic devices, such as Calcomp plotters. Rather than create ASCII plots, the researcher could use a Calcomp plotting command library to control how data were visualized and store the code on computer tape. The scientist would then take the tape to a plotter, which was not necessarily (or usually) in the same area as the computer or keypunch machine. Any errors – such as bugs in the code, damaged punch cards, or damaged tape – meant the whole process would have to be repeated from scratch.

Figure 1.1 (a) An example of a Fortran punch card. Each vertical column represents a character and one card holds roughly one line of Fortran code. (b) 1979 photo of an IMSAI 8080 computer that could store up to 32 kB of data, which could then be transferred to a keypunch machine to create punch cards. (c) An image created from the Hubble Space Telescope using a Calcomp printer, which was made from running punch cards and plotting commands through a card reader.

In the mid‐1980s, universities provided remote terminals that would eventually replace the keypunch and card reader machine system. This substantially improved data visualization processes, as scientists no longer had to share limited resources such as keypunch machines, card readers, or terminals. By the late 1980s, personal computers became more affordable for scientists. A typical PC, such as the IBM XT 286, had 640 KB of random access memory, a 32 MB hard drive, and 5.25‐inch floppy disks with 1.2 MB of disk storage (IBM, 1989). At this time, pen plotters became increasingly common for scientific visualization, followed later by the prevalence of ink‐jet printers in the 1990s. These technologies allowed researchers to process and visualize data conveniently from their offices. With the proliferation of user‐friendly personal computers, printers eventually made their way into all homes and offices.