Learning Geospatial Analysis with Python - Joel Lawhead - E-Book

Description

Geospatial analysis is used in almost every field you can think of, from medicine to defense to farming. It applies statistical analysis and other informational engineering techniques to data that has a geographical or geospatial aspect. This typically involves applications capable of geospatial display and processing to produce compiled, useful data.

"Learning Geospatial Analysis with Python" uses the expressive and powerful Python programming language to guide you through geographic information systems, remote sensing, topography, and more. It explains how to use a framework in order to approach Geospatial analysis effectively, but on your own terms.

"Learning Geospatial Analysis with Python" starts with a background of the field, a survey of the techniques and technology used, and then splits the field into its component speciality areas: GIS, remote sensing, elevation data, advanced modelling, and real-time data.

This book will teach you everything there is to know, from using a particular software package or API to using generic algorithms that can be applied to Geospatial analysis. This book focuses on pure Python whenever possible to minimize compiling platform-dependent binaries, so that you don't become bogged down in just getting ready to do analysis.

"Learning Geospatial Analysis with Python" will round out your technical library with handy recipes and a good understanding of a field that supports many modern-day human endeavors.

You can read this e-book in Legimi apps or in any app that supports the following formats:

EPUB
MOBI

Page count: 388

Year of publication: 2013




Table of Contents

Learning Geospatial Analysis with Python
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Learning Geospatial Analysis with Python
Geospatial analysis and our world
Beyond politics
History of geospatial analysis
Geographic Information Systems
Remote sensing
Elevation data
Computer-aided drafting
Geospatial analysis and computer programming
Object-oriented programming for geospatial analysis
Importance of geospatial analysis
Geographic Information System concepts
Thematic maps
Spatial databases
Spatial indexing
Metadata
Map projections
Rendering
Raster data concepts
Images as data
Remote sensing and color
Common vector GIS concepts
Data structures
Buffer
Dissolve
Generalize
Intersection
Merge
Point in polygon
Union
Join
Geospatial rules about polygons
Common raster data concepts
Band math
Change detection
Histogram
Feature extraction
Supervised classification
Unsupervised classification
Creating the simplest possible Python GIS
Getting started with Python
Building SimpleGIS
Summary
2. Geospatial Data
Data structures
Common traits
Geo-location
Subject information
Spatial indexing
Indexing algorithms
Quad-Tree index
R-Tree index
Grids
Overviews
Metadata
File structure
Vector data
Shapefiles
CAD files
Tag and markup-based formats
GeoJSON
Raster data
TIFF files
JPEG, GIF, BMP, and PNG
Compressed formats
ASCII GRIDS
World files
Point cloud data
Summary
3. The Geospatial Technology Landscape
Data access
GDAL
OGR
Computational geometry
PROJ.4
CGAL
JTS
GEOS
PostGIS
Other spatially-enabled databases
Oracle spatial and graph
ArcSDE
Microsoft SQL Server
MySQL
SpatiaLite
Routing
Esri Network Analyst and Spatial Analyst
pgRouting
Desktop tools
Quantum GIS
OpenEV
GRASS GIS
uDig
gvSIG
OpenJUMP
Google Earth
NASA World Wind
ArcGIS
Metadata management
GeoNetwork
CatMDEdit
Summary
4. Geospatial Python Toolbox
Installing third-party Python modules
Installing GDAL
Windows
Linux
Mac OS X
Python networking libraries for acquiring data
Python urllib module
FTP
ZIP and TAR files
Python markup and tag-based parsers
The minidom module
ElementTree
Building XML
WKT
Python JSON libraries
json module
geojson module
OGR
PyShp
dbfpy
Shapely
GDAL
NumPy
PIL
PNGCanvas
PyFPDF
Spectral Python
Summary
5. Python and Geographic Information Systems
Measuring distance
Pythagorean theorem
Haversine formula
Vincenty formula
Coordinate conversion
Reprojection
Editing shapefiles
Accessing the shapefile
Reading shapefile attributes
Reading shapefile geometry
Changing a shapefile
Adding fields
Merging shapefiles
Splitting shapefiles
Subsetting spatially
Performing selections
Point in polygon formula
Attribute selections
Creating images for visualization
Dot density calculations
Choropleth maps
Using spreadsheets
Using GPS data
Summary
6. Python and Remote Sensing
Swapping image bands
Creating histograms
Performing a histogram stretch
Clipping images
Classifying images
Extracting features from images
Change detection
Summary
7. Python and Elevation Data
ASCII Grid files
Reading grids
Writing grids
Creating a shaded relief
Creating elevation contours
Working with LIDAR
Creating a grid from LIDAR
Using PIL to visualize LIDAR
Creating a Triangulated Irregular Network (TIN)
Summary
8. Advanced Geospatial Python Modelling
Creating an NDVI
Setting up the framework
Loading the data
Rasterizing the shapefile
Clipping the bands
Using the NDVI formula
Classifying the NDVI
Additional functions
Loading the NDVI
Creating classes
Creating a flood inundation model
The flood fill function
Making a flood
Least cost path analysis
Setting up the test grid
The simple A* algorithm
Generating the test path
Viewing the test output
The real-world example
Loading the grid
Defining the helper functions
The real-world A* algorithm
Generating a real-world path
Summary
9. Real-Time Data
Tracking vehicles
Nextbus agency list
Nextbus route list
Nextbus vehicle locations
Mapping Nextbus locations
Storm chasing
Summary
10. Putting It All Together
A typical GPS report
Working with GPX-Reporter.py
Stepping through the program
Initial setup
Working with utility functions
Parsing the GPX
Getting the bounding box
Downloading OpenStreetMap images
Creating the hillshade
Creating maps
Measuring elevation
Measuring distance
Retrieving weather data
Summary
Index

Learning Geospatial Analysis with Python

Copyright © 2013 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: October 2013

Production Reference: 1181013

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78328-113-8

www.packtpub.com

Cover Image by Jarek Blaminsky (<[email protected]>)

Credits

Author

Joel Lawhead

Reviewers

Jorge Samuel Mendes de Jesus

Athanasios Tom Kralidis

Alessandro Pasotti

Acquisition Editor

Joanne Fitzpatrick

Lead Technical Editor

Balaji Naidu

Technical Editors

Pooja Arondekar

Anita Nayak

Anusri Ramchandran

Project Coordinator

Angel Jathanna

Proofreader

Bernadette Watkins

Indexer

Hemangini Bari

Graphics

Abhinash Sahu

Production Coordinator

Shantanu Zagade

Cover Work

Shantanu Zagade

About the Author

Joel Lawhead is a PMI-certified Project Management Professional (PMP) and the Chief Information Officer (CIO) for NVisionSolutions.com, an award-winning firm specializing in geospatial technology integration and sensor engineering.

He began using Python in 1997 and began combining it with geospatial software development in 2000. He has been published in two editions of the Python Cookbook by O'Reilly. He is also the developer of the widely used open source Python Shapefile Library (PyShp) and maintains the geospatial technical blog GeospatialPython.com and Twitter feed @SpatialPython discussing the use of the Python programming language within the geospatial industry.

In 2011, he reverse engineered and published the undocumented shapefile spatial indexing format and assisted fellow geospatial Python developer, Marc Pfister, in reversing the algorithm used, allowing developers around the world to create better-integrated and more robust geospatial applications involving shapefiles.

He has served as the lead architect, project manager, and co-developer for geospatial applications used by US government agencies including NASA, FEMA, NOAA, and the US Navy, as well as many commercial and non-profit organizations. In 2002, he received the international "Esri Special Achievement in GIS" award for work on the Real-time Emergency Action Coordination Tool (REACT) for emergency management using geospatial analysis.

I would like to acknowledge my loving family including my wife Julie and four children Lauren, Will, Lillie, and Lainie who allowed me to write this book after hours. Thank you to my parents who inspired me through their actions to pursue computers, teaching, and writing; all the ingredients needed for a technical book. I would also like to acknowledge the work of the geospatial Python pioneers whose relentless and selfless contributions over the years in developing and publishing code to the geospatial Python body of knowledge made the content of this book possible, including Sean Gillies, Howard Butler, Matthew Perry, Frank Warmerdam, and Marc Pfister.

About the Reviewers

Jorge Samuel Mendes de Jesus has 15 years of programming experience in the field of Geoinformatics, with focus on Python programming, web services, and spatial databases.

He has a PhD in Geography and Sustainable Development from Ben-Gurion University and has been employed by the Joint Research Center, ISPRA, Plymouth Marine Laboratory and currently works at ISRIC, World Soil Information.

He currently lives in Wageningen, the Netherlands and spends his time learning combat sports and Dutch.

Athanasios Tom Kralidis is a Senior Systems Scientist for the Meteorological Service of Canada, where he provides geospatial technical and architectural leadership in support of MSC's data. His professional background includes key involvement in the development and integration of geospatial web standards, systems and services for the Canadian Geospatial Data Infrastructure (CGDI) with Natural Resources Canada (NRCan), as well as using these principles in architecting RésEau, Canada's water information portal.

He is active in the Open Geospatial Consortium (OGC) community, was the lead contributor to the OGC Web Map Context Documents Specification, a member of the CGDI Architecture Advisory Board, and part of the Canadian Advisory Committee to ISO Technical Committee 211 Geographic Information / Geomatics.

He is a developer on the MapServer, GeoNode and OWSLib open source software projects, and part of the MapServer Project Steering Committee. He is the founder and lead developer of pycsw, an OGC-compliant CSW reference implementation. He is also a charter member of the OGC.

Tom holds a Bachelor's degree in Geography from York University, GIS certification from Algonquin College, and a Master's degree in Geography and Environmental Studies (research and dissertation in Geospatial Web Services / Infrastructure) from Carleton University. He is a Certified Geomatics Specialist (GIS/LIS) with the Canadian Institute of Geomatics.

Alessandro Pasotti is the founder of ItOpen, an Italian web development consultancy focused on web GIS development and accessible websites. He has been programming for over two decades and he is now mainly a web application developer, handling both frontend and backend development.

He fell in love with Linux and free software in 1994 and never turned back. He spends most of his time developing web GIS applications in Python using GeoDjango and JavaScript mapping libraries such as OpenLayers.

www.PacktPub.com

Support files, eBooks, discount offers and more

You might want to visit www.PacktPub.com for support files and downloads related to your book.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

http://PacktLib.PacktPub.com

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can access, read and search across Packt's entire library of books. 

Why Subscribe?

Fully searchable across every book published by Packt
Copy and paste, print and bookmark content
On demand and accessible via web browser

Free Access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view nine entirely free books. Simply use your login credentials for immediate access.

Preface

The best books change the way you look at the world. They take your mind to a different place than where you started. The transformation we experience from a good book is the reason books have survived for centuries as a way to share the breadth of human experience.

This book is about geospatial analysis. Geospatial analysis is the combination of statistical analysis, computational geometry, and image processing applied to data which is tied to the Earth (or even other planets). But that technical definition falls short of what geospatial analysis truly is. Similar to a good book, geospatial analysis tells a story about our world. This story is told through thematic maps, processed satellite images, and tables of information.

These stories quite literally change your worldview by revealing patterns about human behavior and natural processes that are otherwise difficult to discern or are even invisible to us. The increased awareness of our world and our place in it allows us to make better decisions about everything from agriculture to politics to disaster management.

This book will teach you geospatial analysis using the Python programming language. Python is a very popular and easy to learn language used in nearly every field. Python was invented in the late 1980s by Guido van Rossum and is based on the language "ABC" designed to teach programming to kids. The clean and intuitive syntax allows you to think about the problem you are trying to solve and not the language you are using. It also interfaces well with nearly every geospatial library available.

Learning Geospatial Analysis with Python supplements the library of Packt Publishing with a third book on geospatial technology and Python. The series offered by Packt Publishing covers the most complete range of published knowledge in this domain. In order to understand the scope of this book and its benefits, it helps to be familiar with the other offerings by Packt Publishing.

Python Geospatial Development by Erik Westra covers building desktop and web applications using Python and leading open source geospatial libraries. The focus of the book is capturing well-defined geospatial processes as requirements and then developing applications allowing users to interactively execute that process again and again.

Programming ArcGIS 10.1 with Python Cookbook by Eric Pimpler teaches readers how to automate ArcGIS 10.1, the leading Geographic Information System (GIS) software package by Esri. ArcGIS contains a Python environment called ArcPy that provides an interface to nearly the entire package. The book shows how to use Python to script ArcGIS for a variety of geoprocessing tasks.

Geospatial analysis will allow you to look at the world in a whole new way and with new understanding. And Python will facilitate the journey and even make it fun! This book will serve as both a guide and future reference as you move deeper into this exciting field.

What this book covers

Chapter 1, Learning Geospatial Analysis with Python, introduces geospatial analysis as a way of answering questions about our world. The differences between GIS and remote sensing are explained. Common geospatial analysis processes are illustrated, and the code for a simple geographic information system in Python is introduced.

Chapter 2, Geospatial Data, discusses geospatial data, and explains the forms geospatial data comes in. The most challenging part of geospatial analysis is acquiring the data you need and preparing it for analysis. This chapter explains the two major categories of data as well as several newer formats that are becoming more and more common. Familiarity with these data types is essential to understand geospatial analysis.

Chapter 3, The Geospatial Technology Landscape, covers the geospatial technology ecosystem that consists of thousands of software libraries and packages. This vast array of choices is overwhelming for newcomers to geospatial analysis. The secret to learning geospatial analysis quickly is to understand the handful of libraries and packages that really matter. Most other software is derived from these critical packages. Understanding the hierarchy of geospatial software and how it's used allows you to quickly comprehend and evaluate any geospatial tool.

Chapter 4, Geospatial Python Toolbox, introduces the software and libraries that form the basis of the book and are used throughout. This chapter elaborates on Python's roles within the geospatial industry: GIS scripting language, mash-up glue language, and full-blown programming language. Code examples are used to teach data editing concepts, and many of the basic geospatial concepts in Chapter 1, Learning Geospatial Analysis with Python, are also demonstrated in Python.

Chapter 5, Python and Geographic Information Systems, teaches you how to create simple yet practical GIS products in Python, using processes which can be applied to a variety of problems.

Chapter 6, Python and Remote Sensing, shows readers how to work with remote sensing geospatial data. Remote sensing includes some of the most complex and least documented geospatial operations. This chapter will build a solid core for the reader and demystify remote sensing using Python.

Chapter 7, Python and Elevation Data, demonstrates the most common uses of elevation data, which can be contained in almost any geospatial format but is used quite differently from other types of geospatial data, and will show you how to work with its unique properties.

Chapter 8, Advanced Geospatial Python Modeling, discusses how geospatial data editing and processing help us understand the world as it is. But the true power of geospatial analysis is modeling. Geospatial models help us predict the future, narrow vast fields of choices down to the best options, and visualize concepts which cannot be directly observed in the natural world. This chapter uses Python to teach the reader the true power of geospatial technology.

Chapter 9, Real-Time Data, introduces real-time data and examines a modern phenomenon. A wise geospatial analyst once said, "As soon as a map is created it is obsolete." Until recently, by the time you collected data about the earth, processed it, and created a geospatial product, the world it represented had already changed. But modern geospatial data shatters this notion. Data sets are available over the Internet which are up to the minute or even the second. These data sets fundamentally change the way we perform geospatial analysis.

Chapter 10, Putting It All Together, combines the skills from previous chapters step-by-step to build a simple, automated geospatial analysis system which produces a report.

What you need for this book

To follow through the various examples, you will need to download and install the following software:

Python Version 2.x (minimum Version 2.5)
GDAL/OGR Version 1.7.1 or later
GEOS Version 3.2.2 or later
PyShp 1.1.6 or later
Shapely Version 1.2 or later
Proj Version 4.7 or later
PyProj Version 1.8.6 or later
NumPy
PNGCanvas
Python Imaging Library (PIL)

This book assumes at least a basic working knowledge of Python and a familiarity with geospatial analysis. Procedures for downloading and installing these tools are covered in the relevant chapters of this book as needed.

Who this book is for

This book is for anyone who wants to understand digital mapping and analysis and who uses Python or another scripting language for automation, or who crunches data manually. This book primarily targets Python developers, researchers, and analysts who want to perform geospatial modeling and GIS analysis with Python.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

To send us general feedback, simply send an e-mail to <[email protected]>, and mention the book title via the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.

Piracy

Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors, and our ability to bring you valuable content.

Questions

You can contact us at <[email protected]> if you are having a problem with any aspect of the book, and we will do our best to address it.

Chapter 1. Learning Geospatial Analysis with Python

This chapter is an overview of geospatial analysis and will cover the following topics:

How geospatial analysis is impacting our world
A history of geospatial analysis including Geographic Information Systems (GIS) and remote sensing
Reasons for using a programming language for geospatial analysis
Importance of more people learning geospatial analysis
GIS concepts
Remote sensing concepts
Creating the simplest possible GIS using Python

This book assumes some basic knowledge of Python, some IT literacy, and at least an awareness of geospatial analysis. This chapter provides a foundation in geospatial analysis, needed to attack any subject in the areas of remote sensing and GIS including the material in all the other chapters of the book.

Geospatial analysis and our world

The morning of November 7, 2012, saw political experts in the United States scrambling to explain how incumbent Democratic President, Barack Obama, had pulled off such a decisive election victory. They scrambled because none of them had seen the win coming—at least not the 332 electoral college votes for Obama, to Republican candidate Mitt Romney's anemic 206. The major political polling organizations had also unanimously declared the race would be a photo finish in the weeks leading up to the election.

Political experts offered broad explanations including "a better ground campaign" by Obama, "demographic shifts" that favored the Democrats, and even accusations of a weakened Republican Party brand. But these generalized theories fell far short of explaining the results in any satisfying detail. The following map shows the electoral votes received by each candidate:

The explanation for the political upset came instead from a 34-year-old blogger from Michigan named Nate Silver. Armed with only a laptop, he had predicted the exact outcome long before election day, and he had done so with startling precision.

Both election campaigns calculated multiple winning scenarios which followed a path of winning certain key battleground states. The battleground states are also known as swing states, because neither candidate had overwhelming support from that state going into the election. These states included Colorado, Florida, Iowa, Nevada, New Hampshire, North Carolina, Ohio, Virginia, and Wisconsin. But Silver had called these states accurately as if they had been known all along.

Silver's method for predicting the future can be summed up as geostatistical profiling. He used geographic analysis to fill in gaps in polling data that caused other analysts to have inaccurate predictions. Large polling organizations poll states on a rolling but irregular basis leading up to elections. Furthermore, different organizations use different polling approaches. Silver first weighted these pollsters based on their historical accuracy and calculated an error rate.

He could then average polls together and account for potential error. His second innovation was to profile states based on historical voting trends and demographics. He could then classify similar states and even voting districts. Anywhere he was missing polling data from a particular state, he could find surrogate data from a similar state and extrapolate to complete his data set. The combination of careful weighting and extrapolation allowed Silver to run a more robust national voting model which paid off. Interestingly, Silver's political models use many of the same elements of probability theory used in his PECOTA software he had developed earlier for baseball but with a geospatial twist. The following plot shows an accuracy comparison of researchers and political experts. The analysts using geospatial techniques led the pack by a wide margin.
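The weighting and gap-filling steps described above can be sketched in a few lines of Python. This is only a minimal illustration and not Silver's actual model: the pollster names, reliability weights, poll margins, and the "similar state" mapping are all invented for the example, and the helper functions `weighted_margin` and `estimate` are hypothetical.

```python
# Illustrative sketch only: every pollster, weight, and margin below is invented.

# Hypothetical reliability weight per pollster (higher = historically more accurate)
pollster_weights = {"PollsterA": 0.9, "PollsterB": 0.6, "PollsterC": 0.3}

# Hypothetical polls: state -> list of (pollster, margin for one candidate in points)
polls = {
    "Ohio": [("PollsterA", 3.0), ("PollsterB", 1.0), ("PollsterC", -2.0)],
    "Iowa": [("PollsterA", 2.0), ("PollsterB", 4.0)],
    "Nevada": [],  # no polling data available
}

# Hypothetical profile match: a state with no polls borrows from a similar state
similar_state = {"Nevada": "Iowa"}


def weighted_margin(state_polls, weights):
    """Average poll margins, weighting each poll by its pollster's reliability."""
    total_weight = sum(weights[name] for name, _ in state_polls)
    if total_weight == 0:
        return None  # no usable polls for this state
    weighted_sum = sum(weights[name] * margin for name, margin in state_polls)
    return weighted_sum / total_weight


def estimate(state):
    """Return the weighted margin for a state, falling back to a surrogate state."""
    margin = weighted_margin(polls.get(state, []), pollster_weights)
    if margin is None and state in similar_state:
        margin = weighted_margin(polls[similar_state[state]], pollster_weights)
    return margin


for state in sorted(polls):
    print("%s %.2f" % (state, estimate(state)))
```

The point of the sketch is the structure rather than the numbers: unreliable pollsters pull the average less, and a state with no data inherits an estimate from the state it most resembles.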

It would be one thing if Nate Silver had been the only one to come up with such an accurate prediction. But he was just the most visible due to his high-profile blog at the New York Times, and his articulate and detailed posts about his methods. He recognized many other analysts including Sam Wang of the Princeton Election Consortium and Drew Linzer of Emory University, who used similar geostatistical methods and achieved highly accurate results. Silver was on the crest of a wave of geospatial analysts who were bringing the field to the forefront of national attention through detailed, objective, and corrective spatial and statistical modeling.

Tip

An economist and statistician named Skipper Seabold attempted to reverse engineer the FiveThirtyEight model using Python. His efforts can be found at the following URL:

https://github.com/jseabold/538model

Beyond politics

The application of geospatial modeling to politics is one of the most recent and visible case studies. However, the use of geospatial analysis has been increasing steadily over the last 15 years. In 2004, the US Department of Labor declared the geospatial industry one of 13 high-growth industries in the United States expected to create millions of jobs in the coming decades.

Geospatial analysis can be found in almost every industry including real estate, oil and gas, agriculture, defense, disaster management, health, transportation, and oceanography to name a few. For a good overview of how geospatial analysis is used in dozens of different industries visit: http://www.esri.com/what-is-gis/who-uses-gis.

History of geospatial analysis

Geospatial analysis can be traced as far back as 15,000 years ago, to the Lascaux Cave in southwestern France. In that cave, paleolithic artists painted commonly hunted animals and what many experts believe are astronomical star maps for either religious ceremonies or potentially even migration patterns of prey. Though crude, these paintings demonstrate an ancient example of humans creating abstract models of the world around them and correlating spatial-temporal features to find relationships. The following image shows one of the paintings with an overlay illustrating the star maps:

Over the centuries the art of cartography and the science of land surveying developed, but it wasn't until the 1800s that significant advances in geographic analysis emerged. Deadly cholera outbreaks in Europe between 1830 and 1860 led geographers in Paris and London to use geographic analysis for epidemiological studies.

In 1832, Charles Picquet used different half-toned shades of gray to represent deaths per thousand citizens in the 48 districts of Paris, as part of a report on the cholera outbreak. In 1854, John Snow expanded on this method by tracking a cholera outbreak in London as it occurred. By placing a point on a map of the city each time a case was diagnosed, he was able to analyze the clustering of cholera cases. Snow traced the disease to a single water pump and prevented further cases. The map has three layers: streets, an X for each pump, and a dot for each cholera case.
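The kind of clustering Snow observed by eye can be approximated in code by assigning each case to its nearest pump and counting the results. The following is a minimal sketch under stated assumptions: the pump names, coordinates, and case locations are invented and sit on an arbitrary flat grid, and straight-line distance stands in for the walking distance a real study would need. Snow himself worked by hand, so this is not a reconstruction of his method.

```python
import math
from collections import Counter

# Hypothetical pump and case locations on an arbitrary local grid
pumps = {"Pump A": (0.0, 0.0), "Pump B": (10.0, 0.0), "Pump C": (5.0, 8.0)}
cases = [(1.0, 1.0), (0.5, -1.0), (2.0, 0.5), (9.0, 1.0), (1.5, 2.0), (4.5, 7.0)]


def nearest_pump(case, pump_locations):
    """Return the name of the pump closest to a case (straight-line distance)."""
    def distance(name):
        px, py = pump_locations[name]
        return math.hypot(case[0] - px, case[1] - py)
    return min(pump_locations, key=distance)


# Count how many cases fall closest to each pump
counts = Counter(nearest_pump(case, pumps) for case in cases)
for name, count in counts.most_common():
    print("%s %d" % (name, count))
```

A pump with a disproportionate share of nearby cases is a candidate source, which is the same reasoning the dots on the map made visible.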

A retired French engineer named Charles Minard, working between 1850 and 1870, produced some of the most sophisticated infographics ever drawn. The term infographics is too generic to describe these drawings because they have strong geographic components. The quality and detail of these maps make them fantastic examples of geographic information analysis even by today's standards. Minard released his masterpiece Carte figurative des pertes successives en hommes de l'Armée Française dans la campagne de Russie 1812-1813, in 1869, depicting the decimation of Napoleon's army in the Russian campaign of 1812. The map shows the size and location of the army over time, along with prevailing weather conditions. The following graphic contains four different series of information on a single theme. It is a fantastic example of geographic analysis using pen and paper. The size of the army is represented by the widths of the brown and black swaths at a ratio of one millimeter for every 10,000 men. The numbers are also written along the swaths. The brown-colored path shows soldiers who entered Russia, while the black represents the ones who made it out. The map scale is shown on the center right as one "French league" (2.75 miles or 4.4 kilometers). The chart on the bottom runs from right to left and depicts the brutal freezing temperatures experienced by the soldiers on the return march home from Russia.

Minard also released a compelling map on a far more mundane subject than a war campaign: the number of cattle sent to Paris from around France. He used pie charts of varying sizes in the regions of France to show the variety and volume of cattle shipped from each area.

In the early 1900s, mass printing drove the development of the concept of map layers—a key feature of geospatial analysis. Cartographers drew different map elements (vegetation, roads, elevation contours) on plates of glass which could then be stacked and photographed for printing as a single image. If the cartographer made a mistake, only one plate of glass had to be changed instead of the entire map. Later the development of plastic sheets made it even easier to create, edit, and store maps in this manner. However, the layering concept for maps as a benefit to analysis would not come into play until the modern computer age.

Geographic Information Systems

Computer mapping evolved with the computer itself in the 1960s. But the term Geographic Information System (GIS) originated with the Canadian Department of Forestry and Rural Development. Dr. Roger Tomlinson headed a team of 40 developers, working under an agreement with IBM, to build the Canadian Geographic Information System (CGIS). The CGIS tracked the natural resources of Canada and allowed profiling of these features for further analysis. The CGIS stored each type of land cover as a different layer. It also stored data in a Canada-specific coordinate system, suitable for the entire country, that was devised for optimal area calculations. While the technology used is primitive by today's standards, the system had phenomenal capability for its time. The CGIS included software features which seem quite modern: map projection switching, rubber sheeting of scanned images, map scale change, line smoothing and generalization to reduce the number of points in a feature, automatic gap closing for polygons, area measurement, dissolving and merging of polygons, geometric buffering, creation of new polygons, scanning, and digitizing of new features from reference data.
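
Area measurement, one of the features listed above, is simple to illustrate in pure Python. The sketch below uses the well-known shoelace formula and assumes the polygon is simple (not self-intersecting) and expressed in planar, projected coordinates such as meters. The `polygon_area` function and the parcel coordinates are hypothetical, and nothing here is meant to describe how the CGIS itself computed area.

```python
def polygon_area(ring):
    """Planar area of a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical rectangular land parcel, 400 m by 300 m.
parcel = [(0, 0), (400, 0), (400, 300), (0, 300)]
print(polygon_area(parcel))  # 120000.0 square meters
```

The result is only meaningful when the coordinates come from a projection that preserves area reasonably well, which is one reason a coordinate system designed for area calculations matters.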

Tip

The National Film Board of Canada produced a 1967 documentary on the CGIS which can be seen at the following URL:

http://video.esri.com/watch/128/data-for-decision_comma_-1967-short-version

Tomlinson is often called "The Father of GIS". After launching the CGIS, he earned his doctorate from the University of London with his 1974 dissertation, entitled The application of electronic computing methods and techniques to the storage, compilation, and assessment of mapped data, which describes GIS and geospatial analysis. Tomlinson now runs his own global consulting firm, Tomlinson Associates Ltd., and remains an active participant in the industry. He is often found delivering the keynote address at geospatial conferences.

CGIS is the starting point of geospatial analysis as defined by this book. But this book would not have been written if not for the work of Howard Fisher and the Harvard Laboratory for Computer Graphics and Spatial Analysis, at the Harvard Graduate School of Design. His work on the SYMAP GIS software, which outputs maps to a line printer, started an era of development at the lab, which produced two other important packages and as a whole permanently defined the geospatial industry. GRID was a raster-based GIS system which used cells to represent geographic features instead of geometry. GRID was written by Carl Steinitz and David Sinton. The system later became IMGRID. Next came ODYSSEY. ODYSSEY was a team effort led by Nick Chrisman and David White. It was a system of programs which included many advanced geospatial data management features typical of modern geodatabase systems. Harvard attempted to commercialize these packages with limited success. However, their impact is still seen today. Virtually every existing commercial and open source package owes something to these code bases.

Tip

Howard Fisher produced a 1967 film using output from SYMAP to show the urban expansion of Lansing, Michigan from 1850 to 1965 by hand-coding decades of property information into the system. The analysis took months but would take only a few minutes to create now using modern tools and data. You can see the film at the following URL:

http://youtu.be/xj8DQ7IQ8_o

Dozens of graphical user interface geospatial desktop applications are available today from companies including Esri, ERDAS, Intergraph, and ENVI, to name a few. Esri, which started in the late 1960s, is the oldest continuously operating GIS software company. In the open source realm, packages including Quantum GIS (QGIS) and GRASS are widely used. Beyond comprehensive desktop software packages, software libraries for building new software exist in the thousands.

Remote sensing

Remote sensing is the collection of information about an object without making physical contact with that object. In the context of geospatial analysis, the object is usually the Earth. Remote sensing also includes the processing of the collected information. The potential of geographic information systems is limited only by the available geographic data. Populating a GIS through land surveying, even using a modern GPS, has always been resource intensive. The advent of remote sensing not only dramatically reduced the cost of geospatial analysis, but also took the field in entirely new directions. In addition to providing powerful reference data for GIS, remote sensing has made possible the automated and semi-automated generation of GIS data by extracting features from images and geographic data.

The eccentric French photographer Gaspard-Félix Tournachon, also known as Nadar, took the first aerial photograph in 1858 from a hot air balloon over Paris. The value of a true bird's eye view of the world was immediately apparent. As early as 1920, books on aerial photo interpretation began to appear.

When America entered the Cold War with the Soviet Union after World War II, aerial photography for monitoring military capability proliferated with the invention of the American U-2 spy plane. The U-2 could fly at 75,000 feet, putting it out of range of existing anti-aircraft weapons designed to reach only 50,000 feet. The American U-2 flights over Russia ended when the Soviets finally shot down a U-2 and captured the pilot.

But aerial photography had little impact on modern geospatial analysis. Planes could only capture small footprints of an area. Photographs were tacked to walls or examined on light tables but not in the context of other information. Though extremely useful, aerial photo interpretation was simply another visual perspective.

The game changer came on October 4, 1957, when the Soviet Union launched the Sputnik 1 satellite. The Soviets had scrapped a much more complex and sophisticated satellite prototype because of manufacturing difficulties. Once corrected, this prototype would later become Sputnik 3. They opted instead for a simple metal sphere with 4 antennae and a simple radio transmitter. Other countries including the United States were also working on satellites. The satellite initiatives were not entirely a secret. They were driven by scientific motives as part of the International Geophysical Year. Advancement in rocket technology made artificial satellites a natural evolution for earth science. However, in nearly every case each country's defense agency was also heavily involved. Like the Soviets, other countries were struggling with complex satellite designs packed with scientific instruments. The Soviets' decision to switch to the simplest possible device for the sole reason of launching a satellite before the Americans was effective. Sputnik was visible in the sky as it passed over and its radio pulse could be heard by amateur radio operators. Despite Sputnik's simplicity, it provided valuable scientific information which could be derived from its orbital mechanics and radio frequency physics.

The Sputnik program's biggest impact was on the American space program. America's chief adversary had gained a tremendous advantage in the race to space. The United States ultimately responded with the Apollo moon landings. But, before that, the US launched a program that would remain a national secret until 1995. The classified CORONA program resulted in the first pictures from space. The US and Soviet Union had signed an agreement to end spy plane flights but satellites were conspicuously absent from the negotiations. The following map shows the CORONA process. Dashed lines are satellite flight paths, longer white tubes are the satellite, the smaller white cones are the film canisters, and the black blobs are the control stations that triggered the ejection of the film so a plane could catch it in the sky.

The first CORONA satellite was a four-year effort with many setbacks. But the program ultimately succeeded. The difficulty of satellite imaging, even today, is retrieving the images from space. The CORONA satellites used canisters of black and white film which were ejected from the vehicle once exposed. As the film canister parachuted to earth, a US military plane would catch the package in midair. If the plane missed the canister, it would float for a brief duration in the water before sinking into the ocean to protect the sensitive information. The US continued to develop the CORONA satellites until they matched the resolution and photographic quality of the U-2 spy plane photos. The primary disadvantages of the CORONA instruments were reusability and timeliness. Once out of film, a satellite could no longer be of service. Also, the film recovery was on a set schedule, making the system unsuitable for monitoring real-time situations. The overall success of the CORONA program, however, paved the way for the next wave of satellites, which ushered in the modern era of remote sensing.

Because of the CORONA program's secret status, its impact on remote sensing was indirect. Photographs of the earth taken on manned US space missions inspired the idea of a civilian-operated remote sensing satellite. The benefits of such a satellite were clear but the idea was still controversial. Government officials questioned whether a satellite was as cost efficient as aerial photography. The military were worried the public satellite could endanger the secrecy of the CORONA program. And yet other officials worried about the political consequences of imaging other countries without permission. But the Department of the Interior finally won permission for NASA to create a satellite to monitor earth's surface resources.

On July 23, 1972, NASA launched the Earth Resources Technology Satellite (ERTS). The ERTS was quickly renamed to Landsat-1. The platform contained two sensors. The first was the Return Beam Vidicon (RBV) sensor, which was essentially a video camera. It was even built by the radio and television giant RCA. The RBV immediately had problems, including disabling the satellite's altitude guidance system. The second sensor was the highly experimental Multi-Spectral Scanner (MSS). The MSS performed flawlessly and produced superior results to the RBV. The MSS captured four separate images at four different wavelengths of the light reflected from the earth's surface.

This sensor had several revolutionary capabilities. The first and most important was global imaging of the planet: it scanned every spot on the earth every 18 days. The following image from the US National Aeronautics and Space Administration (NASA) illustrates this flight and collection pattern:

It also recorded light beyond the visible spectrum. While it did capture green and red light visible to the human eye, it also scanned near-infrared light at two different wavelengths not visible to the human eye. The images were stored and transmitted digitally to three different ground stations in Maryland, California, and Alaska. The multispectral capability and digital format meant the aerial view provided by Landsat wasn't just another photograph from the sky. It was beaming down data. This data could be processed by computers to output derivative information about the earth in the same way a GIS provided derivative information about the earth by analyzing one geographic feature in the context of another. NASA promoted the use of Landsat worldwide and made the data available at very affordable prices to anyone who asked.
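
To make the idea of derivative information concrete, here is a minimal pure-Python sketch that combines a red band and a near-infrared band into a single vegetation measure. It uses the Normalized Difference Vegetation Index (NDVI), a widely used index that is brought in here purely as an illustration. The `ndvi` function and the tiny 2 x 2 band values are hypothetical, not real Landsat data.

```python
def ndvi(red, nir):
    """Compute NDVI, (NIR - red) / (NIR + red), for two equally sized bands.

    Each band is a list of rows of pixel values. Pixels where both bands
    are zero are assigned 0.0 to avoid division by zero.
    """
    result = []
    for red_row, nir_row in zip(red, nir):
        row = []
        for r, n in zip(red_row, nir_row):
            total = n + r
            row.append((n - r) / total if total else 0.0)
        result.append(row)
    return result


# Hypothetical 2 x 2 pixel values for each band (illustrative only).
red_band = [[50, 60], [200, 30]]
nir_band = [[150, 180], [200, 90]]

for row in ndvi(red_band, nir_band):
    print(row)
```

Healthy vegetation reflects strongly in the near-infrared and absorbs red light, so higher values suggest denser vegetation, while values near zero suggest bare or built surfaces.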

This global imaging capability led to many scientific breakthroughs including the discovery of previously unknown geography as late as 1976. Using Landsat imagery the government of Canada located a tiny uncharted island inhabited by polar bears. They named the new landmass Landsat Island.

Landsat-1 was followed by six other missions, and the program was turned over to the National Oceanic and Atmospheric Administration (NOAA) as the responsible agency. Landsat-6 failed to achieve orbit due to a ruptured manifold, which disabled its maneuvering engines. During some of those missions the satellites were managed by the company EOSAT, now called Space Imaging, but the program had returned to government management by the Landsat-7 mission. The following image from NASA is a sample of a Landsat 7 product:

The Landsat Data Continuity Mission (LDCM) launched on February 11, 2013 and began collecting images on April 27, 2013 as part of its calibration cycle to become Landsat 8. The LDCM is a joint mission between NASA and the United States Geological Survey (USGS).

Elevation data

A Digital Elevation Model (DEM) is a three-dimensional representation of a planet's terrain. Within the context of this book that planet is Earth. The history of digital elevation models is far less complicated than remotely-sensed imagery but no less significant. Before computers, representations of elevation data were limited to topographic maps created through traditional land surveys. Technology existed to create 3D models from stereoscopic images or physical models from materials such as clay or wood, but these approaches were not widely used for geography.
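
In practice, a DEM is commonly stored as a regular grid of elevation values, which makes it easy to work with in pure Python. The following sketch assumes such a grid and estimates the slope at one cell using central differences. The elevation values, the 30-meter cell size, and the `slope_degrees` function are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical 3 x 3 elevation grid in meters; rows run north to south.
dem = [
    [100.0, 110.0, 120.0],
    [100.0, 110.0, 120.0],
    [100.0, 110.0, 120.0],
]
CELL_SIZE = 30.0  # assumed ground distance between cell centers, in meters


def slope_degrees(dem, row, col, cell_size):
    """Estimate slope at an interior cell using central differences."""
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


print(round(slope_degrees(dem, 1, 1, CELL_SIZE), 2))
```

Here the terrain rises 20 meters over 60 meters of horizontal distance from west to east, so the center cell has a slope of roughly 18 degrees.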

The concept of digital elevation models began in 1986 when the French space agency, CNES, launched its SPOT-1 satellite, which included a stereoscopic radar. This system created the first usable DEM. Several other US and European satellites followed this model with similar missions. In February 2000, the Space Shuttle Endeavour conducted the Shuttle Radar Topography Mission (SRTM), which collected elevation data over 80 percent of the earth's surface using a special radar antenna configuration that allowed a single pass. This model was surpassed in 2009 by the joint US and Japanese mission using the ASTER sensor aboard NASA's Terra satellite. This system captured 99 percent of the earth's surface but has proven to have minor data issues. SRTM remains the gold standard. The following image from the US Geological Survey (USGS) shows a colorized DEM known as a hillshade. Greener areas are lower elevations while yellow and brown areas are mid-range to high elevations:

Recently more ambitious attempts at a worldwide elevation data set are underway in the form of TerraSAR-X and TanDEM-X satellites launched by Germany in 2007 and 2010, respectively. These two radar elevation satellites are working together to produce a global DEM, called WorldDEM, planned for release in 2014. This data set will have a relative accuracy of 2 meters and an absolute accuracy of 10 meters.

Computer-aided drafting

Computer-aided drafting (CAD) is worth mentioning, though it does not directly relate to geospatial analysis. The history of CAD system development parallels and intertwines with the history of geospatial analysis. CAD is an engineering tool used to model two- and three-dimensional objects, usually for engineering and manufacturing. The primary difference between a geospatial model and a CAD model is that a geospatial model is referenced to the earth, whereas a CAD model can exist in abstract space. For example, a 3D blueprint of a building in a CAD system would not have a latitude or longitude. But in a GIS, the same building model would have a location on the earth. However, over the years CAD systems have taken on many features of GIS and are commonly used for smaller GIS projects. Likewise, many GIS programs can import CAD data which have been georeferenced. Traditionally, CAD tools were designed primarily for engineering data that were not geospatial.
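
The difference between abstract CAD space and earth-referenced coordinates can be illustrated with a short sketch. It assumes a building footprint drawn in local CAD units (meters from an arbitrary drawing origin) and places it in a projected map coordinate system by applying a rotation, a scale, and an offset. The `georeference` function, the footprint, and the site origin are hypothetical, and real georeferencing workflows typically involve control points and a known coordinate reference system.

```python
import math


def georeference(point, origin, rotation_degrees=0.0, scale=1.0):
    """Map a local CAD (x, y) point to map coordinates (easting, northing).

    origin is the map location of the drawing's (0, 0) point, and rotation
    is measured counterclockwise in degrees.
    """
    x, y = point
    theta = math.radians(rotation_degrees)
    easting = origin[0] + scale * (x * math.cos(theta) - y * math.sin(theta))
    northing = origin[1] + scale * (x * math.sin(theta) + y * math.cos(theta))
    return easting, northing


# Hypothetical 20 m x 10 m building footprint in local CAD coordinates.
building = [(0, 0), (20, 0), (20, 10), (0, 10)]

# Hypothetical map position (easting, northing in meters) of the CAD origin.
site_origin = (500000.0, 4200000.0)

placed = [georeference(corner, site_origin) for corner in building]
for corner in placed:
    print(corner)
```

Before the transformation the footprint could sit anywhere. Afterward each corner has a position on the earth that a GIS can relate to other layers.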

However, engineers who became involved with geospatial engineering projects, such as designing a city utility electric system, would use the CAD tools they were familiar with to create maps. Over time, GIS software evolved to import the geospatial-oriented CAD data produced by engineers, and CAD tools evolved to better support geospatial data creation and compatibility with GIS software. AutoCAD by Autodesk and ArcGIS by Esri were the leading commercial packages to develop this capability, and the GDAL/OGR library developers added CAD support as well.

Geospatial analysis and computer programming

Modern geospatial analysis can be conducted with the click of a button in any of the easy-to-use commercial or open source geospatial packages. So then why would you want to use a programming language to learn this field? The most important reasons are:

You want complete control of the underlying algorithms, data, and execution
You want to automate a specific, repetitive analysis task with minimal overhead
You want to create a program that's easy to share
You want to learn geospatial analysis beyond pushing buttons in software

The geospatial