Learn basic Python programming to create functional and effective visualizations from earth observation satellite data sets.
Thousands of satellite datasets are freely available online, but scientists need the right tools to efficiently analyze data and share results. Python has easy-to-learn syntax and thousands of libraries to perform common Earth science programming tasks. Earth Observation Using Python: A Practical Programming Guide presents an example-driven collection of basic methods, applications, and visualizations to process satellite data sets for Earth science research.
* Gain Python fluency using real data and case studies
* Read and write common scientific data formats, like netCDF, HDF, and GRIB2
* Create 3-dimensional maps of dust, fire, vegetation indices and more
* Learn to adjust satellite imagery resolution, apply quality control, and handle big files
* Develop useful workflows and learn to share code using version control
* Acquire skills using online interactive code available for all examples in the book
The American Geophysical Union promotes discovery in Earth and space science for the benefit of humanity. Its publications disseminate scientific knowledge and provide resources for researchers, students, and professionals.
Page count: 396
Year of publication: 2021
Cover
Title Page
Copyright Page
Foreword
Acknowledgments
Introduction
Part I: Overview of Satellite Datasets
1 A TOUR OF CURRENT SATELLITE MISSIONS AND PRODUCTS
1.1 History of Computational Scientific Visualization
1.2 Brief Catalog of Current Satellite Products
1.3 The Flow of Data from Satellites to Computer
1.4 Learning Using Real Data and Case Studies
1.5 Summary
References
2 OVERVIEW OF PYTHON
2.1 Why Python?
2.2 Useful Packages for Remote Sensing Visualization
2.3 Maturing Packages
2.4 Summary
References
3 A DEEP DIVE INTO SCIENTIFIC DATA SETS
3.1 Storage
3.2 Data Formats
3.3 Data Usage
3.4 Summary
References
Part II: Practical Python Tutorials for Remote Sensing
4 PRACTICAL PYTHON SYNTAX
4.1 “Hello Earth” in Python
4.2 Variable Assignment and Arithmetic
4.3 Lists
4.4 Importing Packages
4.5 Array and Matrix Operations
4.6 Time Series Data
4.7 Loops
4.8 List Comprehensions
4.9 Functions
4.10 Dictionaries
4.11 Summary
References
5 IMPORTING STANDARD EARTH SCIENCE DATASETS
5.1 Text
5.2 NetCDF
5.3 HDF
5.4 GRIB2
5.5 Importing Data Using Xarray
5.6 Summary
References
6 PLOTTING AND GRAPHS FOR ALL
6.1 Univariate Plots
6.2 Two Variable Plots
6.3 Three Variable Plots
6.4 Summary
References
7 CREATING EFFECTIVE AND FUNCTIONAL MAPS
7.1 Cartographic Projections
7.2 Cylindrical Maps
7.3 Polar Stereographic Maps
7.4 Geostationary Maps
7.5 Creating Maps from Datasets Using OpenDAP
7.6 Summary
References
8 GRIDDING OPERATIONS
8.1 Regular One‐Dimensional Grids
8.2 Regular Two‐Dimensional Grids
8.3 Irregular Two‐Dimensional Grids
8.4 Summary
References
9 MEANINGFUL VISUALS THROUGH DATA COMBINATION
9.1 Spectral and Spatial Characteristics of Different Sensors
9.2 Normalized Difference Vegetation Index (NDVI)
9.3 Window Channels
9.4 RGB
9.5 Matching with Surface Observations
9.6 Summary
References
10 EXPORTING WITH EASE
10.1 Figures
10.2 Text Files
10.3 Pickling
10.4 NumPy Binary Files
10.5 NetCDF
10.6 Summary
Part III: Effective Coding Practices
11 DEVELOPING A WORKFLOW
11.1 Scripting with Python
11.2 Version Control
11.3 Virtual Environments
11.4 Methods for Code Development
11.5 Summary
References
12 REPRODUCIBLE AND SHAREABLE SCIENCE
12.1 Clean Coding Techniques
12.2 Documentation
12.3 Licensing
12.4 Effective Visuals
12.5 Summary
References
Conclusion
Appendix A: INSTALLING PYTHON
A.1 Download Tutorials for This Book
A.2 Download and Install Anaconda
A.3 Package Management in Anaconda
Appendix B: JUPYTER NOTEBOOK
Appendix C: ADDITIONAL LEARNING RESOURCES
Appendix D: TOOLS
Appendix E: FINDING, ACCESSING, AND DOWNLOADING SATELLITE DATASETS
Appendix F: ACRONYMS
Index
End User License Agreement
Chapter 3
Table 3.1 Data Types, Typical Ranges, and Decimal Precision and Size in Compu...
Table 3.2 Examples of How Data Can Be Rescaled to Fit in Integer Ranges
Table 3.3 Comparing Increasing Numbers in Base‐10 to Base‐2 (Binary)
Table 3.4 Levels and Examples of Transformations Performed on the Data
Chapter 4
Table 4.1 Useful Python Functions and Methods Used in This Text
Table 4.2 Built‐in Comparison, Identity, Logical, and Membership Operators in...
Chapter 5
Table 5.1 Common Attributes Found in a netCDF File Containing Remote Sensing Dat...
Chapter 6
Table 6.1 Calls and Options for Initializing a Figure Using matplotlib
Table 6.2 Helpful Options for matplotlib Colorbars
Chapter 7
Table 7.1 Map Projection Shapes, Examples, and Regions of Minimal Distortion
Chapter 10
Table 10.1 Differences between Image Types
Chapter 12
Table 12.1 Examples of Code Problems and Solutions
Chapter 1
Figure 1.1 (a) An example of a Fortran punch card. Each vertical column repr...
Figure 1.2 Illustration of current Earth, space weather, and environmental m...
Figure 1.3 Equatorial crossing times for various LEO satellites displayed us...
Figure 1.4 NOAA‐20 satellite downlink.
Chapter 3
Figure 3.1 (a) Canisters of 35mm film that contain imagery recovered from Ni...
Figure 3.2 Spacing and distance between (x, y) points for an example regular...
Figure 3.3 An illustration of how a granule of satellite data taken from a p...
Figure 3.4 Read order for row and column major.
Figure 3.5 Example of how netCDF data are organized. Each variable has metad...
Figure 3.6 Direct broadcast (DB) antenna sites, which can provide data in re...
Chapter 5
Figure 5.1 Illustration showing which values are used in a computation of a ...
Figure 5.2 NUCAPS data are organized into 120 fields of regard (FORs) or foo...
Figure 5.3 Screenshot of the mean SST dataset in the NOAA/ESRL data catalog....
Chapter 6
Figure 6.1 VIIRS imagery of wildfires on 8 November 2018. Source: Worldview ...
Figure 6.2 Different matplotlib axes layout syntax using the plt.subplot ...
Chapter 7
Figure 7.1 Most varieties of map projections are cylindrical, conic, or plan...
Figure 7.2 Each projection type has different regions of distortion. Lighter...
Figure 7.3 Examples of the types of maps within each projection category.
Figure 7.4 Imager projections from Geostationary Satellites.
Chapter 8
Figure 8.1 An example of differences in the nearest neighbor, linear, and cu...
Chapter 9
Figure 9.1 Example of various instrument channels and ranges between 400–100...
Chapter 11
Figure 11.1 Converting Jupyter Notebook to a script.
Figure 11.2 A flow chart of the steps necessary to run scripts in the comman...
Figure 11.3 Installing nb_conda adds a new tab to Jupyter Notebook, which al...
Figure 11.4 Changing kernel inside Jupyter Notebooks.
Figure 11.5 Comparison of waterfall and agile framework.
Chapter 12
Figure 12.1 Example of uniform color scheme (top) and how a different color ...
Figure 12.2 Perceptually uniform sequential colors available in matplotlib....
Special Publications 75
Rebekah B. Esmaili
This Work is a co‐publication of the American Geophysical Union and John Wiley and Sons, Inc.
This edition first published 2021
© 2021 American Geophysical Union
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
Published under the aegis of the AGU Publications Committee
Brooks Hanson, Executive Vice President, Science
Carol Frost, Chair, Publications Committee
For details about the American Geophysical Union visit us at www.agu.org.
The right of Rebekah B. Esmaili to be identified as the author of this work has been asserted in accordance with law.
Registered Office: John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
Editorial Office: 111 River Street, Hoboken, NJ 07030, USA
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging‐in‐Publication Data
Name: Esmaili, Rebekah Bradley, author.
Title: Earth observation using Python : a practical programming guide / Rebekah B. Esmaili.
Description: Hoboken, NJ : Wiley, [2021] | Includes bibliographical references and index.
Identifiers: LCCN 2021001631 (print) | LCCN 2021001632 (ebook) | ISBN 9781119606888 (hardback) | ISBN 9781119606895 (adobe pdf) | ISBN 9781119606918 (epub)
Subjects: LCSH: Earth sciences--Data processing. | Remote sensing--Data processing. | Python (Computer program language) | Information visualization. | Artificial satellites in earth sciences. | Earth sciences--Methodology.
Classification: LCC QE48.8 .E85 2021 (print) | LCC QE48.8 (ebook) | DDC 550.285/5133--dc23
LC record available at https://lccn.loc.gov/2021001631
LC ebook record available at https://lccn.loc.gov/2021001632
Cover Design: Wiley
Cover Image: © NASA
When I first met the author a few years ago, she was eager to become more involved in the Joint Polar Satellite System’s Proving Ground. The Proving Ground by definition assesses the impact of a product in the user’s environment; this intrigued Rebekah because as a product developer, she wanted to understand the user’s perspective. Rebekah worked with the National Weather Service to demonstrate how satellite‐derived atmospheric temperature and water vapor soundings can be used to describe the atmosphere’s instability to support severe weather warnings. Rebekah spent considerable time with users at the Storm Prediction Center in Norman, Oklahoma, to understand their needs, and she found their thirst for data and the need for data to be easily visualized and understandable. This is where Rebekah leveraged her expert skills in Python to provide NWS with the information they found to be most useful. Little did I know at the time she was writing a book.
As noted in this book, a myriad of Earth‐observing satellites collect critical information of the Earth’s complex and ever‐changing environment and landscape. However, today, unfortunately, all that information is not effectively being used for various reasons: issues with data access, different data formats, and the need for better tools for data fusion and visualization. If we were able to solve these problems, then suddenly there would be vast improvements in providing societies with the information needed to support decisions related to weather and climate and their impacts, including high‐impact weather events, droughts, flooding, wildfires, ocean/coastal ecosystems, air quality, and more. Python is becoming the universal language to bridge these various data sources and translate them into useful information. Open and free attributes, and the data and code sharing mindset of the Python communities, make Python very appealing.
Being involved in a number of international collaborations to improve the integration of Earth observations, I can certainly emphasize the importance of working together, data sharing, and demonstrating the value of data fusion. I am very honored to write this Foreword, since this book focuses on these issues and provides an excellent guide with relevant examples for the reader to follow and relate to.
Dr. Mitch Goldberg
Chief Program Scientist
NOAA-National Environmental Satellite, Data, and Information Service
June 22, 2020
This book evolved from a series of Python workshops that I developed with the help of Eviatar Bach and Kriti Bhargava from the Department of Atmospheric and Oceanic Science at the University of Maryland. I am very grateful for their assistance providing feedback for the examples in this book and for leading several of these workshops with me.
This book would not exist without their support and contributions from others, including:
The many reviewers who took the time to read versions of this book, several of whom I have never met in person. Thanks to modern communication systems, I was able to draw from their expertise. Their constructive feedback and insights not only helped to improve the quality and breadth of the book but also helped me hone my technical writing skills.
Rituparna Bose, Jenny Lunn, Layla Harden, and the rest of the team at AGU and Wiley for keeping me informed, organized, and on track throughout this process. They were truly a pleasure to work with.
Nadia Smith and Chris Barnet, and my other colleagues at Science and Technology Corp., who provided both feedback and conversations that helped shape some of the ideas and content in this book.
Catherine Thomas, Clare Flynn, Erin Lynch, and Amy Ho for their endless encouragement and support.
Tracie and Farid Esmaili, my parents, who encouraged me to aim high even if they were initially confused when their atmospheric scientist daughter became interested in “snakes.”
Python is a programming language that is rapidly growing in popularity. The number of users is large, although difficult to quantify; in fact, Python is currently the most tagged language on stackoverflow.com, a coding Q&A website with approximately 3 million questions a year. Some view this interest as hype, but there are many reasons to join the movement. Scientists are embracing Python because it is free, open source, easy to learn, and has thousands of add‐on packages. Many routine tasks in the Earth sciences have already been coded and stored in off‐the‐shelf Python libraries. Users can download these libraries and apply them to their research rather than simply using older, more primitive functions. The widespread adoption of Python means scientists are moving toward a common programming language and set of tools that will improve code shareability and research reproducibility.
Among the wealth of remote sensing data available, satellite datasets are particularly voluminous and tend to be stored in a variety of binary formats. Some datasets conform to a “standard” structure, such as netCDF4. However, because of uncoordinated efforts across different agencies and countries, such standard formats bear their own inconsistencies in how data are handled and intended to be displayed. To address this, many agencies and companies have developed numerous “quick look” methods. For instance, data can be searched for and viewed online as JPEG images, or individual files can be displayed with free, open‐source software tools like Panoply (www.giss.nasa.gov/tools/panoply/) and HDFView (www.hdfgroup.org/downloads/hdfview/).
Still, scientists who wish to execute more sophisticated visualization techniques will have to learn to code. Coding knowledge is not the only limitation for users. Not all data are “analysis ready,” i.e., in the proper input format for visualization tools. As such, many pre‐processing steps are required to make the data usable for scientific analysis. This is particularly evident for data fusion, where two datasets with different resolutions must first be mapped to the same grid before they are compared. Because many data users are not satellite scientists or professional programmers but rather members of other research and professional communities, these barriers can be too great to overcome. Even to a technical user, the nuances can be frustrating. At worst, obstacles in coding and data visualization can lead to data misuse, which can tarnish the work of an entire community.
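To make the data fusion step concrete, the following is a minimal sketch of mapping a coarse dataset onto a finer grid before differencing the two. It is not taken from the book's tutorials: the function name `regrid_nearest`, the grids, and the values are all hypothetical, and nearest‐neighbor matching is only one of several possible gridding choices.

```python
import numpy as np

def regrid_nearest(src_lat, src_lon, src_values, dst_lat, dst_lon):
    """Map a regular source grid onto a regular destination grid
    by picking the nearest source point along each axis."""
    # For every destination coordinate, find the index of the closest source coordinate
    lat_idx = np.abs(src_lat[:, None] - dst_lat[None, :]).argmin(axis=0)
    lon_idx = np.abs(src_lon[:, None] - dst_lon[None, :]).argmin(axis=0)
    # np.ix_ builds the 2-D selection from the two 1-D index arrays
    return src_values[np.ix_(lat_idx, lon_idx)]

# Hypothetical coarse dataset with 2-degree spacing
coarse_lat = np.array([0.0, 2.0, 4.0])
coarse_lon = np.array([10.0, 12.0, 14.0])
coarse = np.arange(9, dtype=float).reshape(3, 3)

# Hypothetical fine dataset with 1-degree spacing over the same region
fine_lat = np.arange(0.0, 5.0, 1.0)
fine_lon = np.arange(10.0, 15.0, 1.0)
fine = np.ones((5, 5))

# Only after both datasets share a grid can they be compared point by point
coarse_on_fine = regrid_nearest(coarse_lat, coarse_lon, coarse, fine_lat, fine_lon)
difference = coarse_on_fine - fine
```

Nearest‐neighbor matching keeps the original values intact, which is often preferable for categorical or quality‐flagged fields; smoother fields may be better served by linear or cubic interpolation.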
The purpose of this text is to provide an overview of the common preparatory work and visualization techniques that are applied to environmental satellite data using the Python language. This book is highly example‐driven, and all the examples are available online. The exercises are primarily based on hands‐on tutorial workshops that I have developed. The motivation for producing this book is to make the contents of the workshops accessible to more Earth scientists, as very few Python books currently available target the Earth science community.
This book is written to be a practical workbook and not a theoretical textbook. For example, readers will be able to run prewritten code interactively alongside the text to guide them through the code examples. Exercises in each section build on one another, with incremental steps folded in. Readers with minimal coding experience can follow each “baby step” to become “spun up” quickly, while more experienced coders have the option of working with the code directly and spending more time on building a workflow as described in Part III.
The exercises and solutions provided in this book use Jupyter Notebook, a highly interactive, web‐based development environment. Using Jupyter Notebook, code can be run in a single line or short blocks, and the results are generated within an interactive documented format. This allows the student to view both the Python commands and comments alongside the expected results. Jupyter Notebook can also be easily converted to programs or scripts that can be executed on Linux machines for high‐performance computing. This provides a friendly work environment to new Python users. Students are also welcome to develop code in any environment they wish, such as the Spyder IDE or IPython.
While the material builds on concepts learned in other chapters, the book references the location of earlier discussions of the material. Within each chapter, the examples are progressive. This design allows students to build on their understanding (and learn where to find answers when they need guidance) rather than memorizing syntax or a “recipe.” Professionally, I have worked with many datasets and I have found that the skills and strategies that I apply to satellite data are fairly universal. The examples in this book are intended to help readers become familiar with some of the characteristic quirks that they may encounter when analyzing various satellite datasets in their careers. In this regard, students are also strongly encouraged to submit requests for improvements in future editions.
As with many technological texts, there is a risk that the solutions presented will become outdated as new tools and techniques are developed. The sizable user community already contributing to Python implies it is actively advancing; it is a living language in contrast to compiled, more slowly evolving legacy languages like Fortran and C/C++. A drawback of printed media is that it tends to be static, and Python is evolving more rapidly than the typical production schedule of a book. To mitigate this, this book intends to teach fluency in a few well‐established packages by detailing the steps and thought processes that a user needs to carry out more advanced studies. The text focuses on discipline‐agnostic packages that are widely used, such as NumPy, Pandas, and xarray, as well as plotting packages such as Matplotlib and Cartopy.
I have chosen to highlight Python primarily because it is a general‐purpose language, rather than being discipline‐ or task‐specific. Python programmers can script, process, analyze, and visualize data. Python’s popularity does not diminish the usefulness and value of other languages and techniques. As with all interpreted programming languages, Python may run more slowly compared to compiled languages like Fortran and C++, the traditional tools of the trade. For instance, some steps in data analysis could be done more succinctly and with greater computational efficiency in other languages. However, underlying packages in Python often rely on compiled languages, so an advanced Python programmer can develop very computationally efficient programs with popular packages that are built with speed‐optimized algorithms. While not explicitly covered in this book, emerging packages such as Dask can be helpful to process data in parallel, so more advanced scientific programmers can learn to optimize the speed performance of their code. Python interfaces with a variety of languages, so advanced scientific programmers can compile computationally expensive processing components and run them using Python. Then, simpler parts of the code can be written in Python, which is easier to use and debug.
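As a small illustration of how a package built on compiled code can stand in for an interpreted loop, the sketch below converts a few temperatures from kelvin to degrees Celsius in two ways. The values are hypothetical and the example is only meant to show the pattern, not to benchmark it.

```python
import numpy as np

# Hypothetical brightness temperatures in kelvin
temps_k = np.array([250.0, 273.15, 300.0])

# Pure Python: the interpreter executes the subtraction once per element
temps_c_loop = [t - 273.15 for t in temps_k]

# NumPy: a single expression whose element-wise loop runs in compiled code
temps_c_vec = temps_k - 273.15

# Both approaches give the same answer; only the execution path differs
results_match = np.allclose(temps_c_loop, temps_c_vec)
```

On arrays with millions of elements, which are common for satellite granules, the vectorized form is typically much faster, although the exact gain depends on the machine and the operation.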
This book encourages readers to share their final code online with the broader community, a practice more common among software developers than scientists. However, it is also good practice to write code and software in a thoughtful and carefully documented manner so that it is usable for others. For instance, well‐written code is general purpose, lacks redundancy, and is intuitively organized so that it may be revised or updated if necessary. Many scientific programmers are self‐learners with a background in procedural programming, and thus their Python code will tend to resemble the flow of a Fortran or IDL program. This text uses Jupyter Notebook, which is designed to promote good programming habits in establishing a “digestible code” mindset; this approach organizes code into short chunks. This book focuses on clear documentation in science algorithms and code. This is addressed through discussions of version control, virtual environments, how to structure a usable README file, and what to include in inline comments.
For most environmental science endeavors, data and code sharing are part of the research‐to‐operations feedback loop. “Operations” refers to continuous data collection for scientific research and hazard monitoring. By sharing these tools with other researchers, datasets are more fully and effectively utilized. Satellite data providers can upgrade existing datasets if there is a demand. Globally, satellite data are provided through data portals by NASA, NOAA, EUMETSAT, ESA, JAXA, and other international agencies. However, the value of these datasets is often only visible through scientific journal articles, which only represent a small subset of potential users. For instance, if the applications of satellite observations used for routine disaster mitigation and planning in a disadvantaged nation are not published in a scientific journal, improvements for disaster‐mitigation specific needs may never be met.
Further, there may be unexpected or novel uses of datasets that can drive scientific inquiry, but if the code that brings those uses to life is hastily written and not easily understood, it is effectively a waste of time for colleagues to attempt to employ such applications. By sharing clearly written code and corresponding documentation for satellite data applications, users can alert colleagues in their community of the existence of scientific breakthrough efforts and expand the potential value of satellite datasets within and beyond their community. Moreover, public knowledge of those efforts can help justify the versatility and value of satellite missions and provide a return on investment for organizations that fund them. In the end, the dissemination of code and data analysis tools will only benefit the scientific community as a whole.
There are thousands of datasets containing observations of the Earth. This chapter describes some satellite types, orbits, and missions, which benefit a variety of fields within Earth sciences, including atmospheric science, oceanography, and hydrology. Data are received on the ground through receiver stations and processed for use using retrieval algorithms. But the raw data require further manipulation to be useful, and Python is a good choice for analysis and visualization of these datasets.
At present, there are over 13,000 satellite‐based Earth observations freely and openly listed on www.data.gov. Not only is the quantity of available data notable, its quality is equally impressive; for example, infrared sounders can estimate brightness temperatures within 0.1 K from surface observations (Tobin et al., 2013), imagers can detect ocean currents with an accuracy of 1.0 km/hr (NOAA, 2020), and satellite‐based lidar can measure the ice‐sheet elevation change with a 10 cm sensitivity (Garner, 2015). Previously remote parts of our planet are now observable, including the open oceans and sparsely populated areas. Furthermore, many datasets are available in near real time with image latencies ranging from less than an hour down to minutes – the latter being critically important for natural disaster prediction. Having data rapidly available enables science applications and weather prediction as well as emergency management and disaster relief. Research‐grade data take longer to process (hours to months) but have a higher accuracy and precision, making them suitable for long‐term consistency. Thus, we live in the “golden age” of satellite Earth observation. While the data are accessible, the tools and skills necessary to display and analyze this information require practice and training.
Python is a modern programming language that has exploded in popularity, both within and beyond the Earth science community. Part of its appeal is its easy‐to‐learn syntax and the thousands of available libraries that can be synthesized with the core Python package to do nearly any computing task imaginable. Python is useful for reading Earth‐observing satellite datasets, which can be difficult to use due to the volume of information that results from the multitude of sensors, platforms, and spatio‐temporal spacing. Python facilitates reading a variety of self‐describing binary datasets in which these observations are often encoded. Using the same software, one can complete the entirety of a research project and produce plots. Within a notebook environment, a scientist can document and distribute the code to other users, which can improve efficiency and transparency within the Earth sciences community.
Satellite data often require some pre‐processing to make them usable, but which steps to take and why are not always clear. Data users often misinterpret concepts such as data quality, how to perform an atmospheric correction, or how to implement the complex gridding schemes necessary to compare data at different resolutions. Even to a technical user, the nuances can be frustrating and difficult to overcome. This book walks you through some of the considerations a user should make when working with satellite data.
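One way to picture the data quality consideration is a quality flag stored alongside the measurements. The sketch below is illustrative only: the variable names, the fill value of -999.0, and the convention that a flag of 0 means "best quality" are assumptions, since each real product documents its own flags.

```python
import numpy as np

# Hypothetical retrievals in kelvin; -999.0 is an assumed fill value
sst = np.array([290.1, 285.4, -999.0, 288.0])

# Hypothetical quality flag: 0 is assumed to mean "best quality"
quality_flag = np.array([0, 0, 1, 0])

# Mask every point that is not flagged as best quality
sst_good = np.ma.masked_where(quality_flag != 0, sst)

# Without quality control the fill value contaminates the statistic
mean_all = sst.mean()

# The masked mean uses only the three trusted points
mean_good = sst_good.mean()
```

Skipping this step is an easy way to misuse a dataset, because the unfiltered mean here is dominated by a value that was never a measurement.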
The primary goal of this text is to get the reader up to speed on the Python coding techniques needed to perform research and analysis using satellite datasets. This is done by adopting an example‐driven approach. It is light on theory but will briefly cover relevant background in a nontechnical manner. Rather than getting lost in the weeds, this book purposefully uses realistic examples to explain concepts. I encourage you to run the interactive code alongside reading the text. In this chapter, I will discuss a few of the satellites, sensors, and datasets covered in this book and explain why Python is a great tool for visualizing the data.
Scientific data visualization used to be a very tedious process. Prior to the 1970s, data points were plotted by hand using devices such as slide rules, French curves, and graph paper. During the 1970s, IBM mainframes became increasingly available at universities and facilitated data analysis on the computer. For analysis, IBM mainframes required that a researcher write Fortran‐IV code, which was then printed to cards using a keypunch machine (Figure 1.1). The punch cards then were manually fed into a shared university computer to perform calculations. Each card held roughly one line of code. To make plots, the researcher could create a Fortran program to make an ASCII plot, which creates a plot by combining lines, text, and symbols. The plot could then be printed to a line‐printer or a teleprinter. Some institutions had computerized graphic devices, such as Calcomp plotters. Rather than create ASCII plots, the researcher could use a Calcomp plotting command library to control how data were visualized and store the code on computer tape. The scientist would then take the tape to a plotter, which was not necessarily (or usually) in the same area as the computer or keypunch machine. Any errors – such as bugs in the code, damaged punch cards, or damaged tape – meant the whole process would have to be repeated from scratch.
Figure 1.1 (a) An example of a Fortran punch card. Each vertical column represents a character and one card is roughly one line of Fortran code. (b) 1979 photo of an IMSAI 8080 computer that could store up to 32 kB of data, which could then be transferred to a keypunch machine to create punch cards. (c) An image created from the Hubble Space Telescope using a Calcomp printer, which was made from running punch cards and plotting commands through a card reader.
In the mid‐1980s, universities provided remote terminals that would eventually replace the keypunch and card reader machine system. This substantially improved data visualization processes, as scientists no longer had to share limited resources such as keypunch machines, card readers, or terminals. By the late 1980s, personal computers became more affordable for scientists. A typical PC, such as the IBM XT 286, had 640 KB of random access memory, a 32 MB hard drive, and 5.25 inch floppy disks with 1.2 MB of disk storage (IBM, 1989). At this time, pen plotters became increasingly common for scientific visualization, followed later by the prevalence of ink‐jet printers in the 1990s. These technologies allowed researchers to process and visualize data conveniently from their offices. With the proliferation of user‐friendly personal computers, printers eventually made their way into all homes and offices.
