The most commonly used GIS tools automate tasks that were historically done manually: compiling new maps by overlaying one on top of the other, physically cutting maps into pieces representing specific study areas, changing their projection, and getting meaningful results from the various layers by applying mathematical functions and operations. This book is an easy-to-follow guide to using the most mature open source GIS tools for these tasks.
We’ll start by setting up the environment for the tools we use in the book. Then you will learn how to work with QGIS in order to generate useful spatial data. You will get to know the basics of queries, data management, and geoprocessing.
After that, you will start to practice your knowledge on real-world examples. We will solve various types of geospatial analyses with various methods. We will start with basic GIS problems by imitating the work of an enthusiastic real estate agent, and continue with more advanced, but typical tasks by solving a decision problem.
Finally, you will find out how to publish your data (and results) on the web. We will publish our data with QGIS Server and GeoServer, and create a basic web map with the API of the lightweight Leaflet web mapping library.
Number of pages: 503
Year of publication: 2017
BIRMINGHAM - MUMBAI
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: June 2017
Production reference: 1080617
ISBN 978-1-78712-332-8
www.packtpub.com
Author: Gábor Farkas
Reviewers: Mark Lewin, David Bianco
Commissioning Editor: Aaron Lazar
Acquisition Editor: Angad Singh
Content Development Editor: Lawrence Veigas
Technical Editor: Abhishek Sharma
Copy Editor: Sonia Mathur
Project Coordinator: Prajakta Naik
Proofreader: Safis Editing
Indexer: Mariammal Chettiyar
Graphics: Abhinash Sahu
Production Coordinator: Shantanu Zagade
Gábor Farkas is a PhD student in the University of Pécs's Institute of Geography. He holds a master's degree in geography, although he moved from traditional geography to pure geoinformatics in his early studies. He often studies geoinformatical solutions in his free time, keeps up with the latest trends, and is an open source enthusiast. He loves to work with GRASS GIS, PostGIS, and QGIS, but his all time favorite is Web GIS, which mostly covers his main research interest.
Mark Lewin has been developing, teaching, and writing about software for over 16 years. His main interest is GIS and web mapping. Working for ESRI, the world's largest GIS company, he acted as a consultant, trainer, course author, and a frequent speaker at industry events. He has subsequently expanded his knowledge to include a wide variety of open source mapping technologies and a handful of relevant JavaScript frameworks including Node.js, Dojo, and JQuery.
Mark now works for Oracle's MySQL curriculum team, focusing on creating great learning experiences for DBAs and developers, but remains crazy about web mapping.
He is the author of books such as Leaflet.js Succinctly, Go Succinctly, and Go Web Development Succinctly for Syncfusion. He is also the co-author of the forthcoming second edition of Building Web and Mobile ArcGIS Server Applications with JavaScript, which is to be published by Packt.
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www.packtpub.com/mapt
Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1787123324. If you'd like to join our team of regular reviewers, you can e-mail us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
Setting Up Your Environment
Understanding GIS
Setting up the tools
Installing on Linux
Installing on Windows
Installing on macOS
Getting familiar with the software
About the software licenses
Collecting some data
Getting basic data
Licenses
Accessing satellite data
Active remote sensing
Passive remote sensing
Licenses
Using OpenStreetMap
OpenStreetMap license
Summary
Accessing GIS Data With QGIS
Accessing raster data
Raster data model
Rasters are boring
Accessing vector data
Vector data model
Vector topology - the right way
Opening tabular layers
Understanding map scales
Summary
Using Vector Data Effectively
Using the attribute table
SQL in GIS
Selecting features in QGIS
Preparing our data
Writing basic queries
Filtering layers
Spatial querying
Writing advanced queries
Modifying the attribute table
Removing columns
Joining tables
Spatial joins
Adding attribute data
Understanding data providers
Summary
Creating Digital Maps
Styling our data
Styling raster data
Styling vector data
Mapping with categories
Graduated mapping
Understanding projections
Plate Carrée - a simple example
Going local with NAD83 / Conus Albers
Choosing the right projection
Preparing a map
Rule-based styling
Adding labels
Creating additional thematics
Creating a map
Adding cartographic elements
Summary
Exporting Your Data
Creating a printable map
Clipping features
Creating a background
Removing dangling segments
Exporting the map
A good way for post-processing - SVG
Sharing raw data
Vector data exchange formats
Shapefile
WKT and WKB
Markup languages
GeoJSON
Raster data exchange formats
GeoTIFF
Clipping rasters
Other raster formats
Summary
Feeding a PostGIS Database
A brief overview of databases
Relational databases
NoSQL databases
Spatial databases
Importing layers into PostGIS
Importing vector data
Spatial indexing
Importing raster data
Visualizing PostGIS layers in QGIS
Basic PostGIS queries
Summary
A PostGIS Overview
Customizing the database
Securing our database
Constraining tables
Saving queries
Optimizing queries
Backing up our data
Creating static backups
Continuous archiving
Summary
Spatial Analysis in QGIS
Preparing the workspace
Laying down the rules
Vector analysis
Proximity analysis
Understanding the overlay tools
Towards some neighborhood analysis
Building your models
Using digital elevation models
Filtering based on aspect
Calculating walking times
Summary
Spatial Analysis on Steroids - Using PostGIS
Delimiting quiet houses
Proximity analysis in PostGIS
Precision problems of buffering
Querying distances effectively
Saving the results
Matching the rest of the criteria
Counting nearby points
Querying rasters
Summary
A Typical GIS Problem
Outlining the problem
Raster analysis
Multi-criteria evaluation
Creating the constraint mask
Using fuzzy techniques in GIS
Proximity analysis with rasters
Fuzzifying crisp data
Aggregating the results
Calculating statistics
Vectorizing suitable areas
Using zonal statistics
Accessing vector statistics
Creating an atlas
Summary
Showcasing Your Data
Spatial data on the web
Understanding the basics of the web
Spatial servers
Using QGIS for publishing
Using GeoServer
General configuration
GeoServer architecture
Adding spatial data
Tiling your maps
Summary
Styling Your Data in GeoServer
Managing styles
Writing SLD styles
Styling vector layers
Styling waters
Styling polygons
Creating labels
Styling raster layers
Using CSS in GeoServer
Styling layers with CSS
Creating complex styles
Styling raster layers
Summary
Creating a Web Map
Understanding the client side of the Web
Creating a web page
Writing HTML code
Styling the elements
Scripting your web page
Creating web maps with Leaflet
Creating a simple map
Compositing layers
Working with Leaflet plugins
Loading raw vector data
Styling vectors in Leaflet
Annotating attributes with popups
Using other projections
Summary
Appendix
In the past, professional spatial analysis in the business sector was equivalent to buying an ArcGIS license, storing the data in some kind of Esri database, and publishing results with ArcGIS Server. These trends seem to be changing in favor of open source software. As FOSS (free and open source software) products are gaining more and more power due to the hard work of the enthusiastic open source GIS community, they pique the curiosity of the business sector at a growing rate. With the increasing number of FOSS GIS experts and consulting companies, both training and documentation--the two determining factors that open source GIS products traditionally lacked--are becoming more available.
Chapter 1, Setting Up Your Environment, guides you through the basic steps of creating an open source software infrastructure you can carry out your analyses with. It also introduces you to popular open data sources you can freely use in your workflow.
Chapter 2, Accessing GIS Data with QGIS, teaches you about the basic data models used in GIS. It discusses the peculiarities of these data models in detail, and also makes you familiar with the GUI of QGIS by browsing through some data.
Chapter 3, Using Vector Data Effectively, shows you how you can interact with vector data in the GIS software. It discusses GUI-based queries, SQL-based queries, and basic attribute data management. You will get accustomed to the vector data model and learn to use the attributes associated with the vector features in various ways.
Chapter 4, Creating Digital Maps, discusses the basics of digital map making by going through an exhaustive yet simple example in QGIS. It introduces you to the concept of projections and spatial reference systems, and the various steps of creating a digital map.
Chapter 5, Exporting Your Data, guides you through the most widely used vector and raster data formats in GIS. It discusses the strengths and weaknesses of the various formats, and also gives you some insight into the circumstances under which you should choose a particular spatial data format.
Chapter 6, Feeding a PostGIS Database, guides you through the process of making a spatial database with PostGIS. It discusses how to create a new database, and how to fill it with various kinds of spatial data using QGIS. You will also learn how to manage existing PostGIS tables from QGIS.
Chapter 7, A PostGIS Overview, shows what other options you have with your PostGIS database. It leaves QGIS and talks about important PostgreSQL and PostGIS concepts by managing the database created in the previous chapter through PostgreSQL's administration software, pgAdmin.
Chapter 8, Spatial Analysis in QGIS, goes back to QGIS in order to discuss vector data analysis and spatial modeling. It shows you how different geometry types can be used to get some meaningful results based on the features' spatial relationship. It goes through the practical textbook example of delimiting houses based on some customer preferences.
Chapter 9, Spatial Analysis on Steroids - Using PostGIS, reiterates the example of the previous chapter, but entirely in PostGIS. It shows how a good software choice for the given task can enhance productivity by minimizing manual labor and automating the entire workflow. It also introduces you to the world of PostGIS spatial functions by going through the analysis again.
Chapter 10, A Typical GIS Problem, shows raster analysis, where spatial databases do not excel. It discusses typical raster operations by going through a decision making process. It sheds light on typical considerations related to the raster data model during an analysis, while also introducing some powerful tools and valuable methodology required to make a good decision based on spatial factors and constraints.
Chapter 11, Showcasing Your Data, goes on to the Web stack, and discusses the basics of the Web, the client-server architecture, and spatial servers. It goes into details on how to use the QGIS Server to create quick visualizations, and how to use GeoServer to build a powerful spatial server with great capabilities.
Chapter 12, Styling Your Data in GeoServer, discusses the basic vector and raster symbology usable in GeoServer. It goes through the styling process by using traditional SLD documents. When the concepts are clear, it introduces the powerful and convenient GeoServer CSS, which is also based on SLD.
Chapter 13, Creating a Web Map, jumps to the client side of the Web and shows you how to create simple web maps using the server architecture created before, and the lightweight web mapping library--Leaflet. It guides you through the process of creating a basic web map, ranging from creating an HTML document to scripting it with JavaScript.
Appendix shows additional information and interesting use cases of the learned material through images and short descriptions.
For this book, you will need a computer with mid-range computing capabilities. As open source GIS software is not that demanding, you don't have to worry about your hardware specification when running the software, although some of the raster processing tools will run for quite a long time (about 5-10 minutes) on slower machines.
What you need to take care of is that you have administrator privileges on the machine you are using, or that the software is set up correctly by an administrator. If you don't have administrator privileges, you need write privileges at least to the folder used by the web server to serve content.
The aim of this book is to carry on this trend and demonstrate how even advanced spatial analysis is convenient with an open source product, and how this software is a capable competitor of proprietary solutions. The examples, which teach you how to harness the power of the capable GIS software QGIS, the powerful spatial ORDBMS (object-relational database management system) PostGIS, and the user-friendly geospatial server GeoServer, are aimed at IT professionals, with or without basic GIS training, who are looking for cheap alternatives to costly proprietary GIS solutions.
On the other hand, anyone can learn the basics of these great open source products from this practical guide. It doesn't matter whether you are a decision maker looking for easily producible results, a CTO looking for the right software, or a student craving an easy-to-follow guide. This book presents you with the bare minimum of the GIS knowledge required for effective work with spatial data, and thorough but easy-to-follow examples for utilizing open source software for this work.
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, and user input are shown as follows: "It uses the * wildcard for selecting everything from the table named table, where the content of the column named column matches value."
A block of code is set as follows:
SELECT ST_Buffer(geom, 200) AS geom
FROM spatial.roads r
WHERE r.fclass LIKE 'motorway%' OR r.fclass LIKE 'primary%';
Any command-line input or output is written as follows:
update-alternatives --config java
New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "If we open the Properties window of a vector layer and navigate to the Style tab, we can see the Single symbol method applied to the layer."
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
You can download the code files by following these steps:
Log in or register to our website using your e-mail address and password.
Hover the mouse pointer on the SUPPORT tab at the top.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box.
Select the book for which you're looking to download the code files.
Choose from the drop-down menu where you purchased this book from.
Click on Code Download.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Practical-GIS. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/PracticalGIS_ColorImages.pdf.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at [email protected] with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.
In recent years, the development of open source GIS technologies has reached a state where they can seamlessly replace proprietary software. They are convenient, capable tools for analyzing geospatial data. They offer solutions from basic analysis to more advanced, even scientific, workflows. Moreover, there is plenty of open geographical data out there, and some of it can even be used for commercial purposes. In this chapter, we will acquaint ourselves with the open source software used in this book, install and configure it with an emphasis on typical pitfalls, and learn about some of the most popular sources of open data.
In this chapter, we will cover the following topics:
Installing the required software
Configuring the software
Free geographical data sources
Software and data licenses
Before jumping into the installation process, let's discuss geographic information systems (GIS) a little bit. A GIS is a system for collecting, manipulating, managing, visualizing, analyzing, and publishing spatial data. Although these functionalities can be bundled in a single piece of software, by definition, GIS is not a piece of software but rather a set of functionalities. It can help you to make better decisions, and to get more in-depth results from data based on their spatial relationships.
The most important part of the former definition is spatial data. GIS handles data based on their locations in a coordinate reference system. This means that, despite GIS mainly being used for handling and processing geographical data (data that can be mapped to the surface of Earth), it can be used for anything with dimensions, for example, a fictional land like Middle-Earth, the Milky Way, the surface of Mars, the human body, or a single atom. The possibilities are endless; however, for most of them, there are specialized tools that are more feasible to use.
The functionalities of a GIS outline the required capabilities of a GIS expert. Experts need to be able to collect data by surveying, accessing others' measurements, or digitizing paper maps, just to mention a few methods. Collecting data is only the first step. Experts need to know how to manage this data. This functionality assumes knowledge not only of spatial data formats but also of database management. Some of the data just cannot fit into a single file. There can be various reasons behind this; for example, the data size or the need for more sophisticated reading and writing operations. Experts also need to visualize, manipulate, and analyze this data. This is the part where GIS clients come in, as they have the capabilities to render, edit, and process datasets. Finally, experts need to be able to create visualizations from the results in order to show them, verify decisions, or just help people interpret spatial patterns. This phase was traditionally done via paper maps and digital maps, but nowadays, web mapping is also a very popular means of publishing data.
From these capabilities, we will learn how to access data from freely available data sources, store and manage them in a database, visualize and analyze them with a GIS client, and publish them on the Web.
Most of the software used in this book is platform-dependent; therefore, the installation process differs between operating systems. I assume you have enough experience with your current OS to install software, and thus, we will focus on the possible product-related pitfalls in a given OS. We will cover the three most popular operating systems: Linux, Windows, and macOS. If you don't need the database or the web stack, you can skip the installation of the related software and skip the examples using them.
The list of the software stack used in this book can be found in the following thematically grouped table:
Installing the packages on Linux distributions is pretty straightforward. The dependencies, when there are any, are installed with the packages. We only have to watch out for three things prior to installing the packages. First of all, the package name of the Apache web server can vary between different distributions. On many distros using RPM packages (for example, Fedora and CentOS), it is called httpd, while on the ones using DEB packages (for example, Debian and Ubuntu), it is called apache2. On Arch Linux, it is simply called apache.
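The naming differences above can be captured in a small lookup. The following is only a sketch: apache_package_name is a hypothetical helper, and the list of package managers it recognizes is an assumption made for illustration, not an exhaustive mapping.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool used in the book):
# prints the Apache package name for a given package manager.
apache_package_name() {
    case "$1" in
        dnf|yum)     echo "httpd" ;;    # RPM-based distros such as Fedora and CentOS
        apt|apt-get) echo "apache2" ;;  # DEB-based distros such as Debian and Ubuntu
        pacman)      echo "apache" ;;   # Arch Linux
        *)
            # Unknown package manager: report it and signal failure.
            echo "unknown package manager: $1" >&2
            return 1
            ;;
    esac
}
```

For example, calling apache_package_name apt prints the name to pass to the package manager on a Debian-like system. It is still worth searching the distribution's repository before installing, as other distributions may use different names.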
The second consideration is related to distributions which do not update their packages frequently, like Debian. GeoServer has a hard dependency on a specific JRE (Java Runtime Environment). We must make sure we have it installed and configured as the default. We will walk through the Debian JRE installation process as it is the most popular Linux distribution with late official package updates. Debian Jessie, the latest stable release of the OS when writing these lines, is packed with OpenJDK 7, while GeoServer 2.11 requires JRE 8:
To install OpenJDK 8, we have to enable the Backports repository according to the official Debian guide at https://wiki.debian.org/Backports.
If the repository is added, we can reload the packages and install the package openjdk-8-jre.
The next step is to make this JRE the default one. We can do this by opening a terminal and typing the following command:
update-alternatives --config java
The next step is self-explanatory; we have to choose the new default environment by typing its ID and pressing Enter.
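After switching the default, it can be reassuring to verify that Java 8 is really the one being picked up. The following is a minimal sketch, not an official procedure: java_major_version and check_default_jre are hypothetical helpers, and the parsing assumes the legacy 1.x version strings that Java 8 and earlier print (for example, openjdk version "1.8.0_131").

```shell
#!/bin/sh
# Hypothetical helper: reads `java -version` output on stdin and prints
# the major version (7 for "1.7.0_x", 8 for "1.8.0_x").
# Assumes the legacy 1.x version scheme; prints nothing otherwise.
java_major_version() {
    sed -n '1s/.*version "1\.\([0-9][0-9]*\)\..*/\1/p'
}

# Hypothetical helper: $1 is the text printed by `java -version`.
# Succeeds only if the reported major version is 8.
check_default_jre() {
    major=$(printf '%s\n' "$1" | java_major_version)
    if [ "$major" = "8" ]; then
        echo "OK: the default JRE is Java 8"
    else
        echo "The default JRE is not Java 8; run update-alternatives --config java"
        return 1
    fi
}

# Usage (java -version writes to stderr, hence the redirection):
#   check_default_jre "$(java -version 2>&1)"
```

If the check fails, rerunning the update-alternatives command and choosing the OpenJDK 8 entry should fix the default.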
The last consideration before installing the packages is related to the actual version of QGIS. Most of the distributions offer the latest version in a decent time after release; however, some of them like Debian do not. For those distros, we can use QGIS's repository following the official guide at http://www.qgis.org/en/site/forusers/alldownloads.html.
After everything is set, we can proceed and install the required packages. The order should not matter. Once done, let's take a look at GeoServer, which doesn't offer Linux packages to install. It offers two methods for Linux: a WAR for already installed Java servlet containers (such as Apache Tomcat), and a self-contained, platform-independent binary. We will use the latter as it's easier to set up:
Download GeoServer's platform-independent binary from http://geoserver.org/release/stable/.
Extract the downloaded archive. It can be anywhere as long as we have write permission to the destination.
Start GeoServer with its startup script. To do this, we navigate into the extracted archive from a terminal and run startup.sh in its bin folder with the following commands:
cd <geoserver's folder>/bin
./startup.sh
Optionally, we can detach GeoServer from the shell used by the terminal with the startup command nohup ./startup.sh > /dev/null &. This way, we can close the terminal. If we would like to shut down GeoServer manually, we can do so by running its shutdown.sh script.
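Once the startup script is running, one way to confirm that GeoServer is reachable is to poll its web interface. The sketch below rests on assumptions the text above does not state: wait_for_geoserver is a hypothetical helper, the default URL assumes the bundled server listens on port 8080 and serves the admin interface under /geoserver/web/, and curl must be installed.

```shell
#!/bin/sh
# Hypothetical helper: polls a URL until it answers or the attempts run out.
# $1: URL to poll (assumed default: GeoServer's web interface on port 8080)
# $2: maximum number of attempts (default: 30, two seconds apart)
wait_for_geoserver() {
    url="${1:-http://localhost:8080/geoserver/web/}"
    tries="${2:-30}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -s: silent, -f: treat HTTP errors as failures, -o: discard the body
        if curl -s -f -o /dev/null "$url"; then
            echo "GeoServer is up at $url"
            return 0
        fi
        i=$((i + 1))
        sleep 2
    done
    echo "GeoServer did not respond at $url" >&2
    return 1
}
```

Calling wait_for_geoserver without arguments right after the nohup command gives GeoServer up to about a minute to start. If it never answers, running startup.sh in the foreground shows what went wrong.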
Installing the required software on Windows only requires a few installers as most of the packages are bundled into the OSGeo4W installer.
First of all, we have to download the 32-bit installer from https://trac.osgeo.org/osgeo4w/ as this is the only architecture where an OSGeo version of Apache is bundled.
Opening the installer, we can choose between different setups. For our purposes, we should choose Advanced Install. When we reach the Select Packages section, we must choose the following packages as a minimum:
Desktop: grass, qgis
Web: apache, qgis-server
The next page tells us we don't have to bother with dependencies as the installer selected them for us automatically.
The last step can be quite troublesome as there isn't a general solution; we have to configure Apache and QGIS Server if they don't want to collaborate (opening http://localhost/qgis/qgis_mapserv.fcgi.exe returns an Internal Server Error or it simply cannot be reached). For a good start, take a look at the official tutorial at http://hub.qgis.org/projects/quantum-gis/wiki/QGIS_Server_Tutorial.
Don't worry if you end up with no solution; we will concentrate on GeoServer, which runs perfectly on Windows. Just make sure Apache is installed and working (that is, http://localhost returns a blank page or the OSGeo4W default page), as we will need it later.
The next thing to consider is the PostgreSQL stack. We can download the installer from https://www.postgresql.org/download/windows/, where the EnterpriseDB edition comes with a very handy Stack Builder. After the installation of PostgreSQL, we can use it to install PostGIS. We can find PostGIS in the Spatial Extensions menu. The default installer comes with pgAdmin 4, while we will use pgAdmin 3 in this book. The two look and feel similar enough; however, if you would like to install the latter, you can download it from https://www.pgadmin.org/download/pgadmin-3-windows/.
Installing the software on macOS could be the most complicated of all (because of GRASS). However, thanks to William Kyngesburye, the compiled version of QGIS already contains a copy of GRASS along with other GIS software used by QGIS. In order to install QGIS, we have to download the disk image from http://www.kyngchaos.com/software/qgis.
PostgreSQL and PostGIS are also available from the same site; you will see the link on the left sidebar. pgAdmin, on the other hand, is available from another source: https://www.pgadmin.org/download/pgadmin-4-macos/. Finally, the GeoServer macOS image can be downloaded from http://geoserver.org/release/stable/, while its dependency, Java 8, can be downloaded from https://www.java.com/en/download/.
The only thing left is configuring the QGIS Server. As the OS X and macOS operating systems are shipped with an Apache web server, we don't have to install it. However, we have to make some configurations manually due to the lack of the FastCGI Apache module, on which QGIS Server relies. This configuration can be made based on the official guide at http://hub.qgis.org/projects/quantum-gis/wiki/QGIS_Server_Tutorial.
Open source GIS software offers a very high degree of freedom. The license types can differ; however, they all grant the same basic freedoms. That means we can use, distribute, modify, and distribute the modified versions of the software. We can also use them in commercial settings and even sell the software if we can find someone willing to buy it (as long as we sell the software with the source code under the same license). The only restriction is for companies who would like to sell their software under a proprietary license using open source components. They simply cannot do that with most of the software, although some of the licenses permit this kind of use, too.
There is one very important thing to watch out for when we use open source software and data. If somebody contributes work, often years of it, to the community, at least proper attribution can be expected. Most of the open source licenses enforce this right of the copyright holder; however, we must distinguish software from data. Most of the licenses of open source software require the adapted product to reproduce the same license agreement. That is, we don't have to attribute the used software in a work, but we must include the original license with the copyright holders' names when we create an application with them. Data, on the other hand, is required to be attributed when we use it in our work.
There are a few licenses which do not obligate us to give proper attribution. These licenses state that the creator of the content waives every copyright and gives the product to the public to use without any restrictions. Two of the most common licenses of this kind are the Unlicense, which is a software license, and the Creative Commons Public Domain, which, in the GIS world, is mostly used as a data license.
Now that we have our software installed and configured, we can focus on collecting some open source data. Data collection (or data capture) is one of the key areas of expertise of a GIS professional, and it often takes up a major part of a project budget. Surveying is expensive (for example, equipment, amortization, staff, and so on); however, buying data can also be quite costly. On the other hand, there is open and free data out there, which can drastically reduce the cost of basic analysis. It has some drawbacks, though. For example, the licenses are much harder to reconcile with commercial activity, because some of them are more restrictive.
There are two types of data collection. The first one is primary data collection, where we measure spatial phenomena directly. We can measure the locations of different objects with GPS, the elevation with radar or lidar, and the land cover with remote sensing. There are truly a lot of ways to acquire data with different equipment. The second type is secondary data collection, where we convert already existing data for our use case. A typical secondary data collection method is digitizing objects from paper maps. In this section, we will acquire some open source primary data.
The only thing to consider is our study area. We should choose a relatively small administrative division, like a single county. For example, I'm choosing the county I live in as I'm quite familiar with it and it's small enough to make further analysis and visualization tasks fast and simple:
The first dataset we will download is the administrative boundaries of our country of choice. Open data for administrative divisions is easy to find for the first two levels, but it becomes more and more scarce for higher levels. The first level is always the countries' boundaries, while higher levels depend on the given country. There is a great source for acquiring the first three levels for every country in a fine resolution: GADM or Global Administrative Areas. We will talk about administration levels in more detail in a later chapter. Let's download some data from http://www.gadm.org/country by selecting our study area, and the file format as Shapefile:
In the zipped archive, we will need the administrative boundaries, which contain our division of choice. If you aren't sure about the correct dataset, just extract everything and we will choose the correct one later.
The second vector dataset we download is the GeoNames archive for the country encasing our study area. GeoNames is a great place for finding data points. Every record in the database is a single point with a pair of coordinates and a lot of attribute data. Its most instinctive use case is for geocoding (linking names to locations). However, it can be a real treasure box for those who can link the rich attribute data to more meaningful objects. The country-level data dumps can be reached at http://download.geonames.org/export/dump/ through the countries' two-letter ISO codes.
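To get a feeling for what a GeoNames record looks like, we can read a dump with a few lines of Python. The following is only a minimal sketch: it assumes the tab-separated, 19-column layout described in the readme of the dump directory, and it assumes that the country archives are named after the uppercase ISO code (for example, LU.zip). The helper names dump_url and read_geonames, as well as the sample record, are made up for illustration.

```python
import csv
import io

# Column names assumed from the readme of the GeoNames dump directory
# (19 tab-separated fields per record).
GEONAMES_COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames",
    "latitude", "longitude", "feature_class", "feature_code",
    "country_code", "cc2", "admin1_code", "admin2_code",
    "admin3_code", "admin4_code", "population", "elevation",
    "dem", "timezone", "modification_date",
]


def dump_url(iso_code):
    """Build the URL of a country-level dump (assumed naming pattern)."""
    return "http://download.geonames.org/export/dump/%s.zip" % iso_code.upper()


def read_geonames(lines):
    """Yield one dictionary per record, with the coordinates as floats."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        record = dict(zip(GEONAMES_COLUMNS, row))
        record["latitude"] = float(record["latitude"])
        record["longitude"] = float(record["longitude"])
        yield record


# A made-up record for illustration only; it is not taken from the real dump.
sample = "\t".join([
    "1", "Example Town", "Example Town", "", "49.6", "6.1", "P", "PPL",
    "LU", "", "", "", "", "", "1000", "", "300", "Europe/Luxembourg",
    "2017-01-01",
])

records = list(read_geonames(io.StringIO(sample + "\n")))
print(dump_url("lu"))
print(records[0]["name"], records[0]["latitude"], records[0]["longitude"])
```

With a real dump, we would pass the opened text file extracted from the archive to read_geonames instead of the in-memory sample.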
GADM's license is very restrictive. We are free to use the downloaded data for personal and research purposes but we cannot redistribute it or use it in commercial settings. Technically, it isn't open source data as it does not give the four freedoms of using, modifying, redistributing the original version, and redistributing the modified version without restrictions. That's why the example dataset doesn't contain GADM's version of Luxembourg.
GeoNames has two datasets--a commercially licensed premium dataset and an open source one. The open source data can be used for commercial purposes without restrictions.
Data acquisition with instruments mounted on airborne vehicles is commonly called remote sensing. Mounting sensors on satellites is a common practice by space agencies (for example, NASA and ESA), and other resourceful companies. These are also the main source of open source data as both NASA and ESA grant free access to preprocessed data coming from these sensors. In this part of the book, we will download remote sensing data (often called imagery) from USGS's portal: Earth Explorer. It can be found at https://earthexplorer.usgs.gov/. As the first step, we have to register an account in order to download data.
When we have an account, we should proceed to the Earth Explorer application and select our study area. We can select an area on the map by holding down the Shift button and drawing a rectangle with the mouse, as shown in the following screenshot:
As the next step, we should select some data from the Data Sets tab. There are two distinct types of remote sensing based on the type of sensor: active and passive. In active remote sensing, we emit some kind of signal from the instrument and measure its reflection from the target surface. We derive our measurement from the attributes of the reflected signal. Three very typical active remote sensing instruments are radar (radio detection and ranging) using radio waves, lidar (light detection and ranging) using laser, and sonar (sound navigation and ranging) using sound waves. The first dataset we download is SRTM (Shuttle Radar Topography Mission), which is a DEM (digital elevation model) produced with a radar mounted on a space shuttle. For this, we select the Digital Elevation item and then SRTM. Under the SRTM menu, there are some different datasets from which we need the 1 Arc-Second Global. Finally, we push the Results button, which navigates us to the results of our query. In the results window, there are quite a few options for every item, as shown in the following screenshot:
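The ranging part of radar, lidar, and sonar can be illustrated with a tiny calculation. One attribute of the reflected signal is its delay: the signal travels to the target and back, so the distance is half of the travelled path. The following sketch is a simplification that ignores the atmosphere and the instrument itself; the function name range_from_echo and the example delays are hypothetical, and the speed of sound in water is only a rough, assumed value.

```python
SPEED_OF_LIGHT = 299792458.0  # meters per second, in vacuum
SPEED_OF_SOUND_IN_WATER = 1500.0  # meters per second, rough assumed value


def range_from_echo(round_trip_seconds, speed=SPEED_OF_LIGHT):
    """Return the distance to the target from the round-trip delay.

    The signal covers the distance twice (there and back),
    so we halve the travelled path.
    """
    return speed * round_trip_seconds / 2.0


# Hypothetical lidar echo arriving 2 microseconds after emission.
lidar_range = range_from_echo(2e-6)

# Hypothetical sonar echo arriving 0.1 seconds after emission.
sonar_range = range_from_echo(0.1, SPEED_OF_SOUND_IN_WATER)

print(lidar_range, sonar_range)
```

Real instruments apply many corrections on top of this; the sketch only shows why an active sensor can measure distances at all.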
The first two options (Show Footprint and Show Browse Overlay) are very handy tools to show the selected imagery on the map. The footprint only shows the enveloping rectangle of the data, therefore, it is fast. Additionally, it colors every footprint differently, so we can identify them easily. The overlay tool is handy for getting a glance at the data without downloading it.
Finally, we download the tiles covering our study area. We can download them individually with the item's fifth option called Download Options. This offers some options from which we should select the BIL format as it has the best compression rate, thus, our download will be fast.
Let's get back to the Data Sets tab and select the next type of data we need to download--the Landsat data. These are measured with instruments of the other type--passive remote sensing. In passive remote sensing, we don't emit any signal, just record the electromagnetic radiation of our environment. This method is similar to the one used by our digital cameras except those record only the visible spectrum (about 380-750 nanometers) and compose an RGB picture from the three visible bands instantly. The Landsat satellites use radiometers to acquire multispectral images (bands). That is, they record images from spectral intervals, which can penetrate the atmosphere, and store each of them in different files. There is a great chart created by NASA (http://landsat.gsfc.nasa.gov/sentinel-2a-launches-our-compliments-our-complements/) which illustrates the bands of Landsat 7, Landsat 8, and Sentinel-2 along with the atmospheric opacity of the electromagnetic spectrum:
From the Landsat Archive, we need the Pre-Collection menu. From there, we select L8 OLI/TIRS and proceed to the results. Using the footprints of the items, let's select an image which covers our study area. As Landsat images have a significant amount of overlap, there should be one image which, at least, mostly encases our study area. There are two additional pieces of information listed in every item--the row number and the path number. As these kinds of satellites are constantly orbiting Earth, we should be able to use their data for detecting changes. To support this kind of use case (their main use case), their orbits are calculated so that the satellites return to the same spot periodically (in the case of Landsat 8, every 16 days). This is why we can classify every image by its path and row information:
Let's note down the path and row information of the selected imagery and go to the Additional Criteria tab. We feed the path and row information to the WRS Path and WRS Row fields and go back to the results. Now the results are filtered down, which is quite convenient as the images are strongly affected by weather and seasonal effects. Let's choose a nice image with minimal cloud coverage and download its Level 1 GeoTIFF Data Product. From the archive, we will need the TIFF files of bands 1-6.
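As the archive contains more files than we need, picking the right ones can be scripted. The following is a minimal sketch, assuming that the band files end in _B followed by the band number and the .TIF extension. The scene names in the list are hypothetical placeholders, and select_bands is a helper made up for this example.

```python
import re

# Assumption: band files end in "_B<number>.TIF" (case-insensitive).
BAND_PATTERN = re.compile(r"_B(\d+)\.TIF$", re.IGNORECASE)


def band_number(filename):
    """Return the band number of a file, or None if it is not a band file."""
    match = BAND_PATTERN.search(filename)
    return int(match.group(1)) if match else None


def select_bands(filenames, wanted=range(1, 7)):
    """Keep only the files of the wanted bands, ordered by band number."""
    wanted = set(wanted)
    selected = [name for name in filenames if band_number(name) in wanted]
    return sorted(selected, key=band_number)


# Hypothetical content of an extracted archive.
files = [
    "SCENE_B10.TIF", "SCENE_B2.TIF", "SCENE_B1.TIF", "SCENE_B3.TIF",
    "SCENE_B4.TIF", "SCENE_B5.TIF", "SCENE_B6.TIF", "SCENE_B7.TIF",
    "SCENE_B11.TIF", "SCENE_BQA.TIF", "SCENE_MTL.txt",
]

for name in select_bands(files):
    print(name)
```

Files that do not match the pattern, such as the quality band or the metadata text file in the hypothetical list, are simply skipped.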
SRTM is in the public domain; therefore, it can be used without restrictions, and giving attribution is also optional. Landsat data is also open source; however, based on USGS's statement (https://landsat.usgs.gov/are-there-any-restrictions-use-or-redistribution-landsat-data), proper attribution is recommended.
The last dataset we put our hands on is the Swiss Army knife of open source GIS data. OpenStreetMap provides vector data with a great global coverage coming from measurements of individual contributors. OpenStreetMap has a topological structure; therefore, it's great for creating beautiful visualizations and routing services. On the other hand, its collaborative nature makes accuracy assessments hard. There are some studies regarding the accuracy of the whole data, or some of its subsets, but we cannot generalize those results as accuracy can greatly vary even in small areas.
One of the main strengths of OpenStreetMap data is its large collection and variety of data themes. There are administrative borders, natural reserves, military areas, buildings, roads, bus stops, even benches in the database. Although its data isn't surveyed with geodetic precision, its accuracy is good for a lot of cases: from everyday use to small-scale analysis where accuracy in the order of meters is good enough (usually, a handheld GPS has an accuracy of under 5 meters). Its collaborative nature can also be evaluated as a strength as mistakes are corrected rapidly and the content follows real-world changes (especially large ones) at a quick pace.
Accessing OpenStreetMap data can be tricky. There are some APIs and other means to query OSM, although either we need to know how to code or we get everything in one big file. There is one particular company which creates thematic data extracts from the actual content--Geofabrik. We can reach Geofabrik's download portal at http://download.geofabrik.de/. It allows us to download data in OSM's native PBF format (Protocolbuffer Binary Format), which is great for filling a PostGIS database with OSM data from the command line on a Linux system but cannot be opened with a desktop GIS client. It also serves XML data, which is more widely supported, but the most useful extracts for us are the shapefiles.
Due to various reasons, open source shapefiles are only exported by Geofabrik for small areas. We have to narrow down our search by clicking on links until the shapefile format (.shp.zip) is available. This means country-level extracts for smaller countries and regional extracts for larger or denser ones. The term dense refers to the amount of data stored in the OSM database for a given country. Let's download the shapefile for the smallest region enveloping our study area:
OpenStreetMap data is licensed under ODbL, an open source license, and therefore gives the four basic freedoms. However, it has two important conditions. The first one is obligatory attribution, while the second one is a share-alike condition. If we use OpenStreetMap data in our work, we must share the OSM part under an ODbL-compatible open source license.
ODbL differentiates three kinds of products: collective database, derived database, and produced work. If we create a collective database (a database which has an OSM part), the share-alike policy only applies to the OSM part. If we create a derived database (make modifications to the OSM database), we must make the whole thing open source. If we create a map, a game, or any other work based on the OSM database, we can use any license we would like to. However, if we modify the OSM database during the process, we must make the modifications open source.
In this chapter, we installed the required open source GIS software, configured some of them, and downloaded a lot of open source data. We became familiar with open source products, licenses, and data sources. Now we can create an open source GIS working environment from zero and acquire some data to work with. We also gained some knowledge about data collection methods and their nature.
In the next chapter, we will visualize the downloaded data in QGIS. We will learn to use some of the most essential functionalities of a desktop GIS client while browsing our data. We will also learn some of the most basic attributes and specialities of different data types in GIS.
Although some advanced GIS software suggests that we only need to know which buttons to press in order to get instant results, GIS is much more than that. We need to understand the basic concepts and the inner workings of a GIS in order to know the kind of analyses we can perform on our data. We must be able to come up with specific workflows and models which get us the most meaningful results. We also need to understand the reference frame of GIS, how our data behaves in such an environment, and how to interpret those results. In this chapter, we will learn about GIS data models by browsing our data in QGIS, and getting acquainted with its GUI.
In this chapter, we will cover the following topics:
Graphical User Interface of QGIS
Opening spatial data in QGIS
GIS data models
The first data type that we will use is raster data. It might be the most familiar to you, as it resembles traditional images. First of all, let's open QGIS. In the browser panel, we can immediately see our downloaded data if we navigate to our working directory. We can easily distinguish vector data from raster data by their icons. Raster layers have a dedicated icon of a 3x3 pixel image, while vector layers have an icon of a concave polygon:
We can drag and drop most of the data from the browser panel or, alternatively, use the Add Raster Layer button from the Add layer toolbar and browse the layer. The browser panel is more convenient for easily recognizable layers as it only lists the files we can open and hides auxiliary files with every kind of metadata. Let's drag one of the SRTM rasters to the canvas (or open one with Add Raster Layer). This is a traditional, single-band raster. It is displayed as a greyscale image with the minimum and maximum values displayed in the Layers Panel:
As you can see in the preceding screenshot, there is a regular grid with cells painted differently, just like an image. However, as the maximum value of the data suggests, its colors aren't hard coded into the file, like in an image. Furthermore, it has only a single band, not three or four bands for RGB(A). Let's examine the raster more carefully by zooming in until we can see individual cells. We can also query them for their values with the Identify Features tool by clicking on a cell (raster):
As you can see, we get a number for every cell, which can be quite out of the range of 0-255 representing color codes. These numbers seem arbitrary, and indeed, they are arbitrary. They usually represent some kind of real-world phenomenon, like in our case, the elevation from the mean sea level in meters.
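As the cell values are measurements and not color codes, a renderer has to translate them to grey levels before drawing. A common way to do this is a linear stretch between the minimum and maximum values. The following sketch shows the idea on a few made-up elevation values; it is a generic min-max stretch and not necessarily the exact method QGIS applies, and stretch_to_grey is a hypothetical helper.

```python
def stretch_to_grey(values, minimum, maximum):
    """Linearly map raw cell values to grey levels between 0 and 255."""
    span = maximum - minimum
    if span == 0:
        # A flat raster has no contrast; paint every cell black.
        return [0 for _ in values]
    return [int(round(255 * (value - minimum) / span)) for value in values]


# Made-up elevation values in meters, well outside the 0-255 range.
elevations = [212, 250, 305, 398, 560]

grey = stretch_to_grey(elevations, min(elevations), max(elevations))
print(grey)
```

The lowest cell becomes black (0), the highest one becomes white (255), and every other cell gets a grey level proportional to its value.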
These are the basic properties of the raster data model. Raster data are regular grids (matrices) made up of individual cells with some arbitrary values describing something. The values are only limited by the type of the storage. They can be bytes (8-bit integers), 16-bit integers, floating point numbers, and so on. Rasters are always rectangular (like an image); however, they can give a feeling of having some other shape with a special kind of value: NULL or No-Data.
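The role of the No-Data value can be sketched with a small grid in plain Python. In the following example, the grid is a complete rectangle, but the cells holding the No-Data marker are left out from the statistics, so the valid cells outline an irregular shape. The marker value of -32768 and the cell values are assumptions made for this illustration, as are the helper names.

```python
NODATA = -32768  # assumed No-Data marker for this example

# A made-up 3x4 raster; every row has the same number of cells.
grid = [
    [NODATA, 310, 325, NODATA],
    [298, 305, 330, 341],
    [NODATA, NODATA, 318, 322],
]


def valid_cells(raster, nodata=NODATA):
    """Return every cell value which is not the No-Data marker."""
    return [value for row in raster for value in row if value != nodata]


def shape_mask(raster, nodata=NODATA):
    """Draw valid cells as '#' and No-Data cells as '.'."""
    return ["".join("#" if value != nodata else "." for value in row)
            for row in raster]


cells = valid_cells(grid)
print(min(cells), max(cells))
for line in shape_mask(grid):
    print(line)
```

If we forgot to skip the marker, the minimum would become -32768, which is the typical symptom of a raster opened with a wrong No-Data setting.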
