This chapter examines a number of libraries and other tools that can be used for geo-spatial development in Python.
More specifically, we will cover:
Note that there are two types of geo-spatial tools that are not discussed in this chapter: geo-spatial databases and geo-spatial web toolkits. Both of these will be examined in detail later in this book.
While you could in theory write your own parser to read a particular geo-spatial data format, it is much easier to use an existing Python library to do this. We will look at two popular libraries for reading and writing geo-spatial data: GDAL and OGR.
Unfortunately, the naming of these two libraries is rather confusing. GDAL, which stands for Geospatial Data Abstraction Library, was originally just a library for working with raster geo-spatial data, while the separate OGR library was intended to work with vector data. However, the two libraries are now partially merged, and are generally downloaded and installed together under the combined name of GDAL. To avoid confusion, we will call this combined library GDAL/OGR and use GDAL to refer to just the raster translation library.
A default installation of GDAL supports reading 81 different raster file formats and writing to 41 different formats. OGR by default supports reading 27 different vector file formats and writing to 15 formats. This makes GDAL/OGR one of the most powerful geo-spatial data translators available, and certainly the most useful freely available library for reading and writing geo-spatial data.
GDAL uses the following data model for describing raster geo-spatial data:
Let's take a look at the various parts of this model:
An affine transformation is a mathematical formula allowing the following operations to be applied to the raster data:
More than one of these operations can be applied at once; this allows you to perform sophisticated transforms such as rotations.
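In GDAL's Python bindings, the affine transformation is exposed as a tuple of six coefficients returned by the dataset's GetGeoTransform() method. The following is a minimal sketch of how those coefficients map a pixel position to a georeferenced coordinate. The function name pixel_to_geo and the sample coefficient values are made up for illustration; only the formula follows GDAL's documented convention.

```python
def pixel_to_geo(gt, col, row):
    """Map a pixel (col, row) position to georeferenced (x, y) coordinates.

    gt is a six-element geotransform in GDAL's order:
    (origin_x, pixel_width, row_rotation, origin_y, col_rotation, pixel_height)
    """
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return (x, y)


# Made-up coefficients: a north-up raster whose top-left corner sits at
# (100.0, 50.0), with pixels 0.5 units wide and high. The pixel height is
# negative because row numbers increase downward while y increases upward.
north_up = (100.0, 0.5, 0.0, 50.0, 0.0, -0.5)

print(pixel_to_geo(north_up, 0, 0))    # top-left corner of the raster
print(pixel_to_geo(north_up, 10, 4))   # 10 columns across, 4 rows down
```

When the two rotation coefficients are zero, as here, the transformation only shifts and scales the raster. Non-zero values in those positions are what produce rotated or sheared rasters.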
Ground Control Points (GCPs) relate one or more positions within the raster to their equivalent georeferenced coordinates, as shown in the following figure:
Each raster band contains (among other things):
GDAL provides a number of drivers that allow you to read (and sometimes write) various types of raster geo-spatial data. When reading a file, GDAL selects a suitable driver automatically based on the type of data; when writing, you first select the driver and then tell the driver to create the new dataset you want to write to.
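The writing half of this pattern can be sketched as follows. This example assumes GDAL's in-memory "MEM" driver so that nothing is written to disk; the function name create_and_read_back and the fill value are hypothetical, and the import sits inside the function so that the module can be loaded on a machine without GDAL.

```python
import struct


def create_and_read_back(width=4, height=3):
    """Create a small raster through an explicit driver, then read it back."""
    from osgeo import gdal  # imported here so the module loads without GDAL

    # Writing: select a driver by name, then ask it to create the dataset.
    driver = gdal.GetDriverByName("MEM")
    dataset = driver.Create("", width, height, 1, gdal.GDT_Int16)
    band = dataset.GetRasterBand(1)

    # "=" asks struct for native byte order, matching GDAL's raw buffers.
    fmt = "=%dh" % width
    scanline = struct.pack(fmt, *([7] * width))

    # Fill every pixel with the value 7, one scanline at a time.
    for y in range(height):
        band.WriteRaster(0, y, width, 1, scanline, width, 1, gdal.GDT_Int16)

    # Read the first scanline back and unpack it into integers.
    data = band.ReadRaster(0, 0, width, 1, width, 1, gdal.GDT_Int16)
    return list(struct.unpack(fmt, data))
```

With GDAL installed, calling create_and_read_back(4, 3) should return a list of four sevens. To write a real file, you would pass a different driver name (for example "GTiff") and a file path in place of the empty string.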
A Digital Elevation Model (DEM) file contains height values. In the following example program, we use GDAL to calculate the average of the height values contained in a sample DEM file:
    from osgeo import gdal, gdalconst
    import struct

    dataset = gdal.Open("DEM.dat")
    band = dataset.GetRasterBand(1)

    # One little-endian signed 16-bit integer per pixel in a scanline.
    fmt = "<" + ("h" * band.XSize)
    totHeight = 0

    for y in range(band.YSize):
        # Read one row of pixels as raw bytes, then unpack it into integers.
        scanline = band.ReadRaster(0, y, band.XSize, 1,
                                   band.XSize, 1, band.DataType)
        values = struct.unpack(fmt, scanline)
        for value in values:
            totHeight = totHeight + value

    # float() avoids truncating integer division under Python 2.
    average = float(totHeight) / (band.XSize * band.YSize)
    print("Average height = %s" % average)
As you can see, this program obtains the single raster band from the DEM file, and then reads through it one scanline at a time. We then use the struct standard Python library module to read the individual values out of the scanline. Each value corresponds to the height of that point, in meters.
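To see the unpacking step in isolation, here is a small sketch that needs no DEM file or GDAL installation. It builds two synthetic scanlines with struct.pack and averages them the same way. The function name average_height and the height values are invented for illustration.

```python
import struct


def average_height(scanlines, width):
    """Average the 16-bit values held in a list of packed scanlines."""
    fmt = "<" + "h" * width  # little-endian, one signed 16-bit int per pixel
    total = 0
    count = 0
    for scanline in scanlines:
        values = struct.unpack(fmt, scanline)
        total += sum(values)
        count += len(values)
    return float(total) / count


# A synthetic raster: two scanlines of three invented height values each.
rows = [(100, 110, 120), (130, 140, 150)]
scanlines = [struct.pack("<3h", *row) for row in rows]

print(average_height(scanlines, 3))
```

Each packed scanline is six bytes long (three values of two bytes each), which is why the format string must contain exactly one "h" per pixel in the row.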
OGR uses the following model for working with vector-based geo-spatial data:
Let's take a look at this design in more detail:
The Geometry design within OGR is based on the Open Geospatial Consortium's Simple Features model for representing geo-spatial geometries. For more information, see http://www.opengeospatial.org/standards/sfa.
Like GDAL, OGR also provides a number of drivers that allow you to read (and sometimes write) various types of vector-based geo-spatial data. When reading a file, OGR selects a suitable driver automatically; when writing, you first select the driver and then tell the driver to create the new datasource to write to.
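Both halves of this pattern can be sketched in one round trip: create a Shapefile through an explicitly chosen driver, then reopen it and let OGR pick the driver. The function name write_and_read_names, the file name places.shp, and the point coordinates are hypothetical; the import sits inside the function so that the module can be loaded without GDAL/OGR installed.

```python
import os
import tempfile


def write_and_read_names(names):
    """Write point features to a new Shapefile, then read their names back."""
    from osgeo import ogr  # imported here so the module loads without OGR

    path = os.path.join(tempfile.mkdtemp(), "places.shp")

    # Writing: pick the driver explicitly, then create the datasource.
    driver = ogr.GetDriverByName("ESRI Shapefile")
    datasource = driver.CreateDataSource(path)
    layer = datasource.CreateLayer("places", None, ogr.wkbPoint)
    layer.CreateField(ogr.FieldDefn("NAME", ogr.OFTString))

    for i, name in enumerate(names):
        feature = ogr.Feature(layer.GetLayerDefn())
        feature.SetField("NAME", name)
        feature.SetGeometry(ogr.CreateGeometryFromWkt("POINT (%d 0)" % i))
        layer.CreateFeature(feature)

    # Dropping the references closes the file and flushes it to disk.
    feature = layer = datasource = None

    # Reading: ogr.Open() works out the driver from the file itself.
    shapefile = ogr.Open(path)
    return [f.GetField("NAME") for f in shapefile.GetLayer(0)]
```

With GDAL/OGR installed, write_and_read_names(["Alpha", "Beta"]) should return the same two names in the same order.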
The following example program uses OGR to read through the contents of a Shapefile, printing out the value of the NAME attribute for each feature, along with the geometry type:
    from osgeo import ogr

    shapefile = ogr.Open("TM_WORLD_BORDERS-0.3.shp")
    layer = shapefile.GetLayer(0)

    for i in range(layer.GetFeatureCount()):
        feature = layer.GetFeature(i)
        name = feature.GetField("NAME")
        geometry = feature.GetGeometryRef()
        print("%s %s %s" % (i, name, geometry.GetGeometryName()))
GDAL and OGR are well-documented, but with a catch for Python programmers. The GDAL/OGR library and associated command-line tools are all written in C and C++. Bindings are available that allow access from a variety of other languages, including Python, but the documentation is all written for the C++ version of the libraries. This can make reading the documentation rather challenging—not only are all the method signatures written in C++, but the Python bindings have changed many of the method and class names to make them more "pythonic".
Fortunately, the Python libraries are largely self-documenting, thanks to all the docstrings embedded in the Python bindings themselves. This means you can explore the documentation using tools such as Python's built-in pydoc utility, which can be run from the command line like this:
pydoc -g osgeo
This will open up a GUI window allowing you to read the documentation using a web browser. Alternatively, if you want to find out about a single method or class, you can use Python's built-in help() function from the Python interactive prompt, like this:
    >>> import osgeo.ogr
    >>> help(osgeo.ogr.DataSource.CopyLayer)
Not all the methods are documented, and some of the docstrings that are present are copied directly from the C++ documentation, so you may need to refer to the C++ docs on the GDAL website for more information. In general, though, the documentation for GDAL/OGR is excellent, and should allow you to quickly come up to speed using this library.
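One way to find out which methods lack a docstring before turning to the C++ docs is to check programmatically with the standard inspect module. The sketch below is an illustration only: the helper name undocumented_members and the Sample class are invented, and the demonstration runs on that stand-in class so that it works without GDAL installed.

```python
import inspect


def undocumented_members(obj):
    """Return the names of obj's public callables that have no docstring."""
    missing = []
    for name, member in inspect.getmembers(obj, callable):
        if name.startswith("_"):
            continue  # skip private and special methods
        if not inspect.getdoc(member):
            missing.append(name)
    return missing


class Sample(object):
    """Stand-in class with one documented and one undocumented method."""

    def documented(self):
        """This method has a docstring."""

    def bare(self):
        pass


print(undocumented_members(Sample))
```

With GDAL installed, the same helper could be pointed at a class from the bindings, for example undocumented_members(osgeo.ogr.DataSource) after importing osgeo.ogr.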
GDAL/OGR runs on modern Unix machines, including Linux and Mac OS X, as well as on most versions of Microsoft Windows. The main website for GDAL can be found at:
And the main website for OGR is http://gdal.org/ogr
To download GDAL/OGR, follow the Downloads link on the main GDAL website. Windows users may find the "FWTools" package useful as it provides a wide range of geo-spatial software for win32 machines, including GDAL/OGR and its Python bindings. FWTools can be found at:
For those running Mac OS X, pre-built binaries for GDAL/OGR can be obtained from:
http://www.kyngchaos.com/software/frameworks
Make sure that you install GDAL version 1.7 or later, as you will need this version to work through the examples in this book.
Being an open source package, the complete source code for GDAL/OGR is available from the website, so you can compile it yourself. Most people, however, will simply want to use a pre-built binary version.