Importing polygons

Now that we know the basics of working with polygons and how to represent and store them, we will go back to our app and add the ability to import geospatial files containing polygons. As we did with the points, we will abstract the features into Python objects, and we will again use class inheritance.

First, let's look at the code we already wrote. In the models.py file, we have the PointCollection class:

class PointCollection(object):
    def __init__(self, file_path=None):
        """This class represents a group of vector data."""
        self.data = []
        self.epsg = None

        if file_path:
            self.import_data(file_path)

    def __add__(self, other):
        self.data += other.data
        return self

    def import_data(self, file_path):
        """Opens a vector file compatible with OGR and parses
         the data.

        :param str file_path: The full path to the file.
        """
        features, metadata = open_vector_file(file_path)
        self._parse_data(features)
        self.epsg = metadata['epsg']
        print("File imported: {}".format(file_path))

    def _parse_data(self, features):
        """Transforms the data into Geocache objects.

        :param features: A list of features.
        """
        for feature in features:
            geom = feature['geometry']['coordinates']
            attributes = feature['properties']
            cache_point = Geocache(geom[0], geom[1],
                                   attributes = attributes)
            self.data.append(cache_point)

    def describe(self):
        print("SRS EPSG code: {}".format(self.epsg))
        print("Number of features: {}".format(len(self.data)))

This class represents a collection of geocaching points and is responsible for importing, converting, and storing them. This is exactly the functionality that we want to implement for importing polygons.
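
One detail of this class worth keeping in mind is how __add__ behaves: it extends the data of the left operand in place and returns that same object. The following is only a minimal sketch; TinyCollection is a hypothetical stand-in that keeps just the data list and the __add__ method, so it can run without any geospatial file:

```python
class TinyCollection(object):
    """Hypothetical stand-in with the same __add__ as PointCollection."""
    def __init__(self, data=None):
        self.data = list(data or [])

    def __add__(self, other):
        # Same logic as PointCollection.__add__: extend in place, return self.
        self.data += other.data
        return self


first = TinyCollection(["cache A", "cache B"])
second = TinyCollection(["cache C"])

merged = first + second
print(merged.data)      # the three items, in order
print(merged is first)  # the result is the left operand itself
```

Because the left operand is modified, first and merged refer to the same collection after the addition, while second is left untouched.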

In the previous chapter, you saw how it's possible, through inheritance, to make a class inherit functionalities from other classes. We will use this same technique to use what we already have to import the polygons.

Since geocaching points and polygons each have their own particularities, some of the processing needs to be specific to each one. A concrete example is the _parse_data method, which, for now, converts features into geocaching points.

So, it's not a good idea to make the class that represents polygons inherit directly from the PointCollection class. Instead, the idea is to have two base classes: one that represents a single object and another that represents a collection of those objects. These base classes will contain the methods that are common to points and polygons, and the child classes will contain the methods specific to each case.

The polygons that we will import could be countries, states or provinces of a country, cities, districts, regions, and so on. Since it's not yet clear which of these we will handle, let's call them boundaries. The steps are as follows:

  1. We will start by creating the BaseGeoObject class, adapting it from the Geocache class. Open the models.py file in the Chapter4 folder.
  2. Make a copy of the Geocache class with all its methods (copy and paste).
  3. Rename the first copy to BaseGeoObject and change its docstring to something like "Base class for a single geo object.". You should have this:
    class BaseGeoObject(object):
        """Base class for a single geo object."""
        def __init__(self, lat, lon, attributes=None):
            self.lat = lat
            self.lon = lon
            self.attributes = attributes
    
        @property
        def coordinates(self):
            return self.lat, self.lon
    
    class Geocache(object):
        """This class represents a single geocaching point."""
        def __init__(self, lat, lon, attributes=None):
            self.lat = lat
            self.lon = lon
            self.attributes = attributes
    
        @property
        def coordinates(self):
            return self.lat, self.lon

Now, looking at both classes, try to work out what is specific to the Geocache, what does or doesn't belong in a generic geo object, and what properties and methods every type of geospatial object could have.

This separation could lead to some debate, and sometimes, depending on the complexity of the project and the nature of what you are dealing with, it may be hard to reach a final state in the first iteration through the code. In your projects, you may need to come back and change how the classes are organized more than once.

For now, I'm going to propose the following logic:

  • Lat, lon: These properties are for the Geocache only. As we saw, we may have other types of geometries and we want to generalize how the geometries are stored.
  • Attributes: All the objects should have this property.
  • A __repr__ method: This is another magic method, like __init__ and __add__ from the previous chapter. Python calls __repr__ when you use the print() function on an object that does not define __str__. We will add it to the base class and make it raise NotImplementedError there, because every type of object should have its own representation.
  • Coordinates property: All geo objects should have coordinates, but how it is implemented here is specific to the Geocache. We will change that to a generic form: a geom property that will contain the object geometry.
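
As a small illustration of the __repr__ point above: print() looks for __str__ first and falls back to __repr__ when a class defines only the latter. The Cache class below is a hypothetical stand-in used only for this sketch; it is not part of the app.

```python
class Cache(object):
    """Hypothetical stand-in used only for this illustration."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # No __str__ is defined, so print() and str() fall back to this.
        return "Cache({})".format(self.name)


cache = Cache("Old mill")
print(cache)             # goes through the str() fallback to __repr__

text = str(cache)        # same fallback, captured as a string
listing = repr([cache])  # containers use __repr__ for their items
print(listing)
```

This is also why defining only __repr__ is enough for the loop that prints each item of a collection later in this section.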

Let's make the first changes to these classes. Edit your code to be as follows:

class BaseGeoObject(object):
    """Base class for a single geo object."""
    def __init__(self, geometry, attributes=None):
        self.geom = geometry
        self.attributes = attributes

    @property
    def coordinates(self):
        raise NotImplementedError

    def __repr__(self):
        raise NotImplementedError


class Geocache(BaseGeoObject):
    """This class represents a single geocaching point."""
    def __init__(self, geometry, attributes=None):
        super(Geocache, self).__init__(geometry, attributes)

    def __repr__(self):
        name = self.attributes.get('name', 'Unnamed')
        return "{} {}  -  {}".format(self.geom.x,
                                     self.geom.y, name)

A geom property was added to the class, and the geometry is now a required argument when instantiating it. In this property, we will store the Shapely object. The lat and lon properties were removed; they can be read directly from the Shapely object (geom), and we will adapt PointCollection accordingly.

The __repr__ method of the Geocache class returns a string containing the coordinates of the point and the name attribute when it's available, or Unnamed otherwise.
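
To see this representation without installing anything, here is a minimal sketch of the two classes. FakePoint is a hypothetical stand-in for the Shapely Point (only its x and y attributes are needed here), and the coordinates and names are made up for the example:

```python
from collections import namedtuple

# Hypothetical stand-in for shapely.geometry.Point: only x and y are used.
FakePoint = namedtuple("FakePoint", ["x", "y"])


class BaseGeoObject(object):
    """Base class for a single geo object."""
    def __init__(self, geometry, attributes=None):
        self.geom = geometry
        self.attributes = attributes

    def __repr__(self):
        raise NotImplementedError


class Geocache(BaseGeoObject):
    """This class represents a single geocaching point."""
    def __repr__(self):
        name = self.attributes.get('name', 'Unnamed')
        return "{} {}  -  {}".format(self.geom.x, self.geom.y, name)


named = Geocache(FakePoint(-46.6, -23.5), {'name': 'City park'})
unnamed = Geocache(FakePoint(2.35, 48.85), {})

print(named)    # -46.6 -23.5  -  City park
print(unnamed)  # 2.35 48.85  -  Unnamed
```

Note that the attributes are passed explicitly as a dictionary in both cases: with the default value of None, the call to .get() inside __repr__ would raise an AttributeError.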

Now add the Boundary class:

class Boundary(BaseGeoObject):
    """Represents a single geographic boundary."""
    def __repr__(self):
        return self.attributes.get('name', 'Unnamed')

For now, the Boundary class is almost the same as the BaseGeoObject class; we only override the __repr__ method so that it returns just the name of the boundary.

The next step is to edit the collection classes. Our PointCollection class is almost compatible with the new organization. We only need to make a few changes to the _parse_data method, transform this class into a base class, and create the classes that will inherit from it:

  1. First, like we did earlier, make a copy of the PointCollection class.
  2. Now, rename the first occurrence of this class and change its docstring:
    class BaseGeoCollection(object):
        """This class represents a collection of spatial data."""
    ...
  3. Go to the _parse_data method and alter it to be as follows:
    #...
        def _parse_data(self, features):
            raise NotImplementedError    

Here we explicitly state that this method is not implemented in the base class. This is good practice for two reasons. First, it is a hint to the programmer that the method needs to be implemented in any class that inherits from this one, and it also documents the method's signature (the arguments that it should receive). Second, if a child class does not implement it, Python will raise NotImplementedError instead of AttributeError, which makes debugging easier.
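
The effect can be sketched with a trimmed-down version of the pattern. All class names below are hypothetical and the "features" are plain strings; the point is only to show what happens when a child class does or does not override _parse_data:

```python
class BaseCollection(object):
    """Hypothetical, trimmed-down base class."""
    def __init__(self, features=None):
        self.data = []
        if features:
            self._parse_data(features)

    def _parse_data(self, features):
        # States the signature; child classes must override this method.
        raise NotImplementedError


class UpperCaseCollection(BaseCollection):
    """Child class that implements _parse_data."""
    def _parse_data(self, features):
        for feature in features:
            self.data.append(feature.upper())


class ForgetfulCollection(BaseCollection):
    """Child class that forgot to implement _parse_data."""


ok = UpperCaseCollection(["a", "b"])
print(ok.data)

try:
    ForgetfulCollection(["a", "b"])
    error_name = None
except NotImplementedError as error:
    # The base class method ran and reported the missing implementation.
    error_name = type(error).__name__
print(error_name)
```

The base class drives the workflow and each child class supplies only the part that differs, which is the same division of work used by the collection classes in this chapter.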

  1. Before we continue, edit the imported modules at the beginning of the file to match the following code:
    # coding=utf-8
    
    from __future__ import print_function
    import gdal
    from shapely.geometry import Point
    from shapely import wkb, wkt
    from utils.geo_functions import open_vector_file
  2. The base class is ready, and now we are going to edit the PointCollection class. First, remove every method from this class except _parse_data, and keep the docstring.
  3. Edit the class declaration and make it inherit from BaseGeoCollection.
  4. Finally, edit the _parse_data method to be compliant with the geometry represented by Shapely objects. Your code should be as follows:
    class PointCollection(BaseGeoCollection):
        """This class represents a collection of
        geocaching points.
        """
        def _parse_data(self, features):
            """Transforms the data into Geocache objects.
    
            :param features: A list of features.
            """
            for feature in features:
                coords = feature['geometry']['coordinates']
                point = Point(float(coords[1]), float(coords[0]))
                attributes = feature['properties']
                cache_point = Geocache(point, attributes = attributes)
                self.data.append(cache_point)

    Note that the difference is that, when instantiating the Geocache, instead of passing the coordinates we now pass a Point object, which is an instance of the Point class provided by Shapely.

  5. Next we are going to create the BoundaryCollection class. Insert this code anywhere after the base classes:
    class BoundaryCollection(BaseGeoCollection):
        """This class represents a collection of
        geographic boundaries.
        """
        def _parse_data(self, features):
            for feature in features:
                geom = feature['geometry']['coordinates']
                attributes = feature['properties']
                polygon = wkt.loads(geom)
                boundary = Boundary(geometry=polygon,
                                    attributes=attributes)
                self.data.append(boundary)

    The difference from PointCollection is that we are now creating polygons and instances of the Boundary class. Note how the polygon is created with the statement wkt.loads(geom).

  6. We are almost done. Check whether everything is correct. The complete models.py file should contain the following code:
    # coding=utf-8
    
    from __future__ import print_function
    import gdal
    from shapely.geometry import Point
    from shapely import wkb, wkt
    from utils.geo_functions import open_vector_file
    
    class BaseGeoObject(object):
        """Base class for a single geo object."""
        def __init__(self, geometry, attributes=None):
            self.geom = geometry
            self.attributes = attributes
    
        @property
        def coordinates(self):
            raise NotImplementedError
    
        def __repr__(self):
            raise NotImplementedError
    
    
    class Geocache(BaseGeoObject):
        """This class represents a single geocaching point."""
        def __init__(self, geometry, attributes=None):
            super(Geocache, self).__init__(geometry, attributes)
    
        def __repr__(self):
            name = self.attributes.get('name', 'Unnamed')
            return "{} {}  -  {}".format(self.geom.x,
                                         self.geom.y, name)
    
    class Boundary(BaseGeoObject):
        """Represents a single geographic boundary."""
        def __repr__(self):
            return self.attributes.get('name', 'Unnamed')
    
    class BaseGeoCollection(object):
        """This class represents a collection of spatial data."""
        def __init__(self, file_path=None):
            self.data = []
            self.epsg = None
    
            if file_path:
                self.import_data(file_path)
    
        def __add__(self, other):
            self.data += other.data
            return self
    
        def import_data(self, file_path):
            """Opens a vector file compatible with OGR and parses
             the data.
    
            :param str file_path: The full path to the file.
            """
            features, metadata = open_vector_file(file_path)
            self._parse_data(features)
            self.epsg = metadata['epsg']
            print("File imported: {}".format(file_path))
    
        def _parse_data(self, features):
            raise NotImplementedError
    
        def describe(self):
            print("SRS EPSG code: {}".format(self.epsg))
            print("Number of features: {}".format(len(self.data)))
    
    
    class PointCollection(BaseGeoCollection):
        """This class represents a collection of
        geocaching points.
        """
        def _parse_data(self, features):
            """Transforms the data into Geocache objects.
    
            :param features: A list of features.
            """
            for feature in features:
                coords = feature['geometry']['coordinates']
                point = Point(coords)
                attributes = feature['properties']
                cache_point = Geocache(point, attributes=attributes)
                self.data.append(cache_point)
    
    
    class BoundaryCollection(BaseGeoCollection):
        """This class represents a collection of
        geographic boundaries.
        """
        def _parse_data(self, features):
            for feature in features:
                geom = feature['geometry']['coordinates']
                attributes = feature['properties']
                polygon = wkt.loads(geom)
                boundary = Boundary(geometry=polygon,
                                    attributes=attributes)
                self.data.append(boundary)
  7. Now, in order to test it, go to the end of the file and edit the if __name__ == '__main__': block:
    if __name__ == '__main__':
        world = BoundaryCollection("../data/world_borders_simple.shp")
        for item in world.data:
            print(item)
  8. Now run it: press Alt + Shift + F10 and select models. If everything is OK, you should see a long list of unnamed countries:
    File imported: ../data/world_borders_simple.shp
    Unnamed
    Unnamed
    Unnamed
    Unnamed
    ...
    
    Process finished with exit code 0

This is disappointing. We expected to see the names of the countries, but for some reason, the program failed to get them from the attributes. We will solve this problem in the next topic.
