Now that we know the basics of working with polygons, and how to represent and store them, we will go back to our app and add the ability to import geospatial files containing polygons. As we did with the points, we will abstract the features into Python objects, and we will again use class inheritance.
First, let's look at the code we already wrote. In the models.py file, we have the PointCollection class:

class PointCollection(object):
    def __init__(self, file_path=None):
        """This class represents a group of vector data."""
        self.data = []
        self.epsg = None
        if file_path:
            self.import_data(file_path)

    def __add__(self, other):
        self.data += other.data
        return self

    def import_data(self, file_path):
        """Opens a vector file compatible with OGR and parses
        the data.

        :param str file_path: The full path to the file.
        """
        features, metadata = open_vector_file(file_path)
        self._parse_data(features)
        self.epsg = metadata['epsg']
        print("File imported: {}".format(file_path))

    def _parse_data(self, features):
        """Transforms the data into Geocache objects.

        :param features: A list of features.
        """
        for feature in features:
            geom = feature['geometry']['coordinates']
            attributes = feature['properties']
            cache_point = Geocache(geom[0], geom[1],
                                   attributes=attributes)
            self.data.append(cache_point)

    def describe(self):
        print("SRS EPSG code: {}".format(self.epsg))
        print("Number of features: {}".format(len(self.data)))

This class represents a collection of geocaching points and is responsible for importing, converting, and storing them. This is exactly the functionality that we want to implement to import polygons.
In the previous chapter, you saw how inheritance lets a class take its functionality from other classes. We will use the same technique to reuse what we already have when importing the polygons.
Since the processing of geocaching points and polygons may have its particularities, some things need to be specific to each one. A concrete example is the _parse_data method that, for now, converts features into geocaching points.
So, it's not a good idea to make the class that represents polygons inherit directly from the PointCollection class. Instead, the idea is to have two base classes: one that represents a single object and another that represents a collection of those objects. These base classes will contain the methods that are common to points and polygons, and the child classes will contain the methods specific to each case.
The polygons that we will import could be country boundaries, states or provinces of a country, cities, districts, regions, and so on. Since it's not clear yet, let's call them boundaries. The steps are as follows:
1. We will start by creating the BaseGeoObject object, adapting it from the Geocache class. Open the models.py file in the Chapter4 folder.
2. Make a copy of the Geocache class with all its methods (copy and paste).
3. Rename the first copy to BaseGeoObject and change the docstring to something like "Base class for single geo objects.". You should have this:

class BaseGeoObject(object):
    """Base class for a single geo object."""
    def __init__(self, lat, lon, attributes=None):
        self.lat = lat
        self.lon = lon
        self.attributes = attributes

    @property
    def coordinates(self):
        return self.lat, self.lon


class Geocache(object):
    """This class represents a single geocaching point."""
    def __init__(self, lat, lon, attributes=None):
        self.lat = lat
        self.lon = lon
        self.attributes = attributes

    @property
    def coordinates(self):
        return self.lat, self.lon

Now look at both classes and try to decide what is specific to the Geocache, what does or doesn't belong to a generic GeoObject, and what properties and methods every type of geospatial object could have.
This separation could lead to some debate, and sometimes, depending on the complexity of the project and the nature of what you are dealing with, it may be hard to reach a final state in the first iteration through the code. In your projects, you may need to come back and change how the classes are organized more than once.
For now, I'm going to propose the following logic:
- Besides the __init__ and __add__ magic methods that we had in the previous chapter, we will use __repr__, which is called when you use the print() function on an object that doesn't define __str__. We will add it and leave it not implemented on the base class, because every type of object should have its own representation.
- Every object will have a geom property that will contain the object geometry.

Let's make the first changes to these classes. Edit your code to be as follows:

class BaseGeoObject(object):
    """Base class for a single geo object."""
    def __init__(self, geometry, attributes=None):
        self.geom = geometry
        self.attributes = attributes

    @property
    def coordinates(self):
        raise NotImplementedError

    def __repr__(self):
        raise NotImplementedError


class Geocache(BaseGeoObject):
    """This class represents a single geocaching point."""
    def __init__(self, geometry, attributes=None):
        super(Geocache, self).__init__(geometry, attributes)

    def __repr__(self):
        name = self.attributes.get('name', 'Unnamed')
        return "{} {} - {}".format(self.geom.x, self.geom.y, name)

A geom property was added to the class, and the geometry is now a required argument when instantiating it. In this property, we will store the Shapely object. The lat and lon properties were removed; they can be accessed directly from the Shapely object (geom), and we will adapt PointCollection to do this.
The __repr__ method of the Geocache class returns a string containing the coordinates of the point and the name attribute when it's available, or Unnamed otherwise.
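One detail worth noticing is that attributes defaults to None, in which case calling .get() inside __repr__ would fail. The following is only a sketch of one possible way to guard against that, not the class used in the rest of the chapter: SimplePoint is a hypothetical stand-in for a Shapely point (it only has x and y), and GeocacheSketch is an illustrative name.

```python
class SimplePoint(object):
    """Hypothetical stand-in for a Shapely Point (only x and y)."""
    def __init__(self, x, y):
        self.x = x
        self.y = y


class GeocacheSketch(object):
    """Illustrative variant of Geocache; not the chapter's class."""
    def __init__(self, geometry, attributes=None):
        self.geom = geometry
        # Fall back to an empty dict so __repr__ can always call .get().
        self.attributes = attributes or {}

    def __repr__(self):
        name = self.attributes.get('name', 'Unnamed')
        return "{} {} - {}".format(self.geom.x, self.geom.y, name)


named = GeocacheSketch(SimplePoint(1.5, 2.5), {'name': 'Bridge cache'})
unnamed = GeocacheSketch(SimplePoint(3.0, 4.0))

# print() falls back to __repr__ because __str__ is not defined.
print(named)    # 1.5 2.5 - Bridge cache
print(unnamed)  # 3.0 4.0 - Unnamed
```

Whether an empty dictionary is the right default depends on how the attributes are used elsewhere, so treat this as one option among several.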
Now add the Boundary class:

class Boundary(BaseGeoObject):
    """Represents a single political Boundary."""
    def __repr__(self):
        # The name is stored in the attributes dictionary.
        return self.attributes.get('name', 'Unnamed')

For now, the Boundary class is almost the same as the BaseGeoObject class. We only change the __repr__ method so that it returns just the name of the boundary.
The next step is to edit the collection classes. Our PointCollection class is almost compatible with the new organization. We only need to make a few changes to the _parse_data method, transform this class into a base class, and create the classes that will inherit from it:
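1. Create the base class from the PointCollection class: make a copy of it, name the copy BaseGeoCollection, and change the docstring:

class BaseGeoCollection(object):
    """This class represents a collection of spatial data."""
    ...

2. In BaseGeoCollection, go to the _parse_data method and alter it to be as follows:

#...
    def _parse_data(self, features):
        raise NotImplementedError
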
Here we explicitly state that this method is not implemented in the base class. This is good practice for a few reasons. First, it tells the programmer that this method needs to be implemented when the class is inherited, and it documents the method's signature (the arguments that it should receive). Second, if a child class doesn't implement it, calling the method raises NotImplementedError instead of AttributeError, which leads to a better debugging experience.
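To make the effect concrete, here is a minimal, self-contained sketch of the same pattern. The class names (BaseCollectionSketch, NamesCollection, IncompleteCollection) and the sample features are hypothetical and exist only for this illustration; the error message is also an addition, since the chapter's code raises the bare exception.

```python
class BaseCollectionSketch(object):
    """Illustrative base class; stands in for BaseGeoCollection."""
    def __init__(self, features=None):
        self.data = []
        if features is not None:
            self._parse_data(features)

    def _parse_data(self, features):
        # Documents the expected signature and fails loudly if not overridden.
        raise NotImplementedError("Child classes must implement _parse_data")


class NamesCollection(BaseCollectionSketch):
    """A child class that implements the required method."""
    def _parse_data(self, features):
        for feature in features:
            self.data.append(feature['properties'].get('name', 'Unnamed'))


class IncompleteCollection(BaseCollectionSketch):
    """A child class that forgets to override _parse_data."""


# Hypothetical sample features, shaped like the ones used in this chapter.
features = [{'properties': {'name': 'A'}}, {'properties': {}}]
print(NamesCollection(features).data)  # ['A', 'Unnamed']

try:
    IncompleteCollection(features)
except NotImplementedError as error:
    message = str(error)
    print("NotImplementedError: {}".format(message))
```

The incomplete child class fails at the moment the method is called, with an exception that names the missing piece.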
3. Make sure the imports at the beginning of the file are as follows:

# coding=utf-8

from __future__ import print_function
import gdal
from shapely.geometry import Point
from shapely import wkb, wkt
from utils.geo_functions import open_vector_file

4. Now edit the PointCollection class. First, remove all the methods from this class, leaving only the docstring and the _parse_data method.
5. Make the class inherit from BaseGeoCollection.
6. Edit the _parse_data method to be compliant with the geometry represented by Shapely objects. Your code should be as follows:

class PointCollection(BaseGeoCollection):
    """This class represents a collection of
    geocaching points.
    """
    def _parse_data(self, features):
        """Transforms the data into Geocache objects.

        :param features: A list of features.
        """
        for feature in features:
            coords = feature['geometry']['coordinates']
            point = Point(float(coords[1]), float(coords[0]))
            attributes = feature['properties']
            cache_point = Geocache(point, attributes=attributes)
            self.data.append(cache_point)

Note that the difference is that when instantiating the Geocache, instead of passing the coordinates, we now pass a Point object, which is an instance of the Point class provided by Shapely.
7. Next, create the BoundaryCollection class. Insert this code anywhere after the base classes:

class BoundaryCollection(BaseGeoCollection):
    """This class represents a collection of
    geographic boundaries.
    """
    def _parse_data(self, features):
        for feature in features:
            geom = feature['geometry']['coordinates']
            attributes = feature['properties']
            polygon = wkt.loads(geom)
            boundary = Boundary(geometry=polygon,
                                attributes=attributes)
            self.data.append(boundary)

The difference from PointCollection is that we are now creating polygons and instances of the Boundary class. Note how the polygon is created with the statement wkt.loads(geom).
8. At this point, the complete models.py file should contain the following code:

# coding=utf-8

from __future__ import print_function
import gdal
from shapely.geometry import Point
from shapely import wkb, wkt
from utils.geo_functions import open_vector_file


class BaseGeoObject(object):
    """Base class for a single geo object."""
    def __init__(self, geometry, attributes=None):
        self.geom = geometry
        self.attributes = attributes

    @property
    def coordinates(self):
        raise NotImplementedError

    def __repr__(self):
        raise NotImplementedError


class Geocache(BaseGeoObject):
    """This class represents a single geocaching point."""
    def __init__(self, geometry, attributes=None):
        super(Geocache, self).__init__(geometry, attributes)

    def __repr__(self):
        name = self.attributes.get('name', 'Unnamed')
        return "{} {} - {}".format(self.geom.x, self.geom.y, name)


class Boundary(BaseGeoObject):
    """Represents a single geographic boundary."""
    def __repr__(self):
        return self.attributes.get('name', 'Unnamed')


class BaseGeoCollection(object):
    """This class represents a collection of spatial data."""
    def __init__(self, file_path=None):
        self.data = []
        self.epsg = None
        if file_path:
            self.import_data(file_path)

    def __add__(self, other):
        self.data += other.data
        return self

    def import_data(self, file_path):
        """Opens a vector file compatible with OGR and parses
        the data.

        :param str file_path: The full path to the file.
        """
        features, metadata = open_vector_file(file_path)
        self._parse_data(features)
        self.epsg = metadata['epsg']
        print("File imported: {}".format(file_path))

    def _parse_data(self, features):
        raise NotImplementedError

    def describe(self):
        print("SRS EPSG code: {}".format(self.epsg))
        print("Number of features: {}".format(len(self.data)))


class PointCollection(BaseGeoCollection):
    """This class represents a collection of
    geocaching points.
    """
    def _parse_data(self, features):
        """Transforms the data into Geocache objects.

        :param features: A list of features.
        """
        for feature in features:
            coords = feature['geometry']['coordinates']
            point = Point(coords)
            attributes = feature['properties']
            cache_point = Geocache(point, attributes=attributes)
            self.data.append(cache_point)


class BoundaryCollection(BaseGeoCollection):
    """This class represents a collection of
    geographic boundaries.
    """
    def _parse_data(self, features):
        for feature in features:
            geom = feature['geometry']['coordinates']
            attributes = feature['properties']
            polygon = wkt.loads(geom)
            boundary = Boundary(geometry=polygon,
                                attributes=attributes)
            self.data.append(boundary)

9. To test it, go to the end of the file and edit the if __name__ == '__main__': block:

if __name__ == '__main__':
    world = BoundaryCollection("../data/world_borders_simple.shp")
    for item in world.data:
        print(item)

10. Now run the models module. If everything is OK, you should see a long list of unnamed countries:

File imported: ../data/world_borders_simple.shp
Unnamed
Unnamed
Unnamed
Unnamed
...
Process finished with exit code 0

This is disappointing. We expected to see the names of the countries but, for some reason, the program failed to get them from the attributes. We will solve this problem in the next topic.
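Without anticipating the actual cause, it helps to remember how the fallback in __repr__ behaves: dict.get() with a default never fails, so any missing key quietly turns into Unnamed. The sketch below only illustrates that behavior and a quick way to inspect what a feature really carries. The dictionary and its keys are hypothetical and are not taken from the shapefile, and describe_name is an illustrative helper.

```python
# Hypothetical attributes; the keys and values are invented for this example.
attributes = {'NAME': 'Example country', 'CODE': 'XX'}


def describe_name(attributes, key='name'):
    """Returns the value stored under key, or 'Unnamed' when it is absent."""
    return attributes.get(key, 'Unnamed')


# The lookup is case-sensitive, so 'name' is not found and the default is used.
print(describe_name(attributes))          # Unnamed
print(describe_name(attributes, 'NAME'))  # Example country

# Listing the available keys is a quick way to see what can be looked up.
print(sorted(attributes.keys()))          # ['CODE', 'NAME']
```

Printing the attribute keys of one imported item is a cheap first check whenever a default value shows up where real data was expected.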