Geo-aggregations

A search can return far more results than you need when all you want to know is how many documents fall within a given distance range of a location. For example, you might count the crime-related news events in an area and plot them on a map, or generate a heatmap cluster of those events, as shown in the following image:

[Image: Geo-aggregations]

Elasticsearch offers both metric and bucket aggregations for geo_point fields.

Geo distance aggregation

Geo distance aggregation is an extension of range aggregation. Instead of bucketing documents by the value of a numeric field, it buckets them by their distance from an origin point, using the distance ranges you specify. Let's see how this can be done using an example.

Python example

query = {
  "aggs": {
    "news_hotspots": {
      "geo_distance": {
        "field": "location",
        "origin": "28.61, 77.23",
        "unit": "km",
        "distance_type": "plane",
        "ranges": [
          {
            "to": 50
          },
          {
            "from": 50, "to": 200
          },
          {
            "from": 200
          }
        ]
      }
    }
  }
}

Execute the query as follows:

response = es.search(index=index_name, doc_type=doc_type, body=query, search_type='count')

The preceding query creates three buckets of documents, each covering a distance range measured from the specified origin point:

  • The count of the news events that happened within 50 km of the origin
  • The count of the news events that happened between 50 and 200 km from the origin
  • The count of the news events that happened 200 km or more from the origin
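To make the bucket structure concrete, here is a minimal sketch of reading the counts back out of a response. The sample_response dictionary is hypothetical: it only mimics the shape of the aggregations part of a search response, the doc_count values are made up, and the exact bucket keys can vary between Elasticsearch versions. The helper name summarize_buckets is likewise an assumption of this sketch.

```python
# Hypothetical response fragment shaped like the "aggregations" section of a
# search response. The counts are invented for illustration only.
sample_response = {
    "aggregations": {
        "news_hotspots": {
            "buckets": [
                {"key": "*-50.0", "from": 0.0, "to": 50.0, "doc_count": 12},
                {"key": "50.0-200.0", "from": 50.0, "to": 200.0, "doc_count": 30},
                {"key": "200.0-*", "from": 200.0, "doc_count": 7},
            ]
        }
    }
}


def summarize_buckets(response, agg_name):
    """Return (key, from, to, doc_count) tuples for a range-style aggregation.

    Open-ended buckets have no "from" or "to" entry, so .get() yields None.
    """
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(b["key"], b.get("from"), b.get("to"), b["doc_count"]) for b in buckets]


for key, start, end, count in summarize_buckets(sample_response, "news_hotspots"):
    print("%s: %d documents" % (key, count))
```

With a live cluster, the same helper could be called on the response object returned by es.search.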

The query parameters are as follows:

  • origin: This accepts lat-lon in all three formats: object, string, or array. Note that the array format is ordered as [lon, lat].
  • unit: This defaults to m (meters), but accepts other distance units as well, such as km.
  • distance_type: This is used to specify how the distance needs to be calculated. It is an optional parameter, which defaults to sloppy_arc (faster but less accurate), but can also be set to arc (slower but most accurate) or plane (fastest but least accurate). Because of high error margins, plane should be used only for small geographic areas.
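The trade-off behind distance_type can be illustrated outside Elasticsearch. The following sketch is not Elasticsearch's implementation; it compares a great-circle (haversine) distance with a simple flat-plane approximation to show how the two drift apart as the distance grows. The function names, the Earth radius constant, and the approximate coordinates for Gurgaon and Mumbai are assumptions made for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def arc_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def plane_distance_km(lat1, lon1, lat2, lon2):
    """Flat-plane approximation: scale longitude by cos(mean latitude)."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    x = math.radians(lon2 - lon1) * math.cos(mean_lat)
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_KM * math.sqrt(x * x + y * y)


delhi = (28.61, 77.23)
# Approximate coordinates, chosen only to get one short and one long distance.
targets = {"Gurgaon (near)": (28.46, 77.03), "Mumbai (far)": (19.08, 72.88)}

for name, (lat, lon) in targets.items():
    arc = arc_distance_km(delhi[0], delhi[1], lat, lon)
    plane = plane_distance_km(delhi[0], delhi[1], lat, lon)
    print("%s: arc=%.3f km, plane=%.3f km, difference=%.3f km"
          % (name, arc, plane, abs(arc - plane)))
```

The difference is negligible for the nearby point and larger for the distant one, which is the reason plane is best reserved for small geographic areas.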

Java example

We covered aggregation in detail in the previous chapter, where you saw range aggregation. Geo distance aggregation is similar to it and takes only the following extra parameters:

The point, distance unit, and distance type, which we have already covered in the previous section.

The distance type requires the following import: org.elasticsearch.common.geo.GeoDistance.

AggregationBuilder aggregation =
  AggregationBuilders.geoDistance("news_hotspots").field(fieldName)
        .point(new GeoPoint(28.61, 77.23))
        .unit(DistanceUnit.KILOMETERS)
        .distanceType(GeoDistance.PLANE)
        // Same three ranges as the Python example: 0-50, 50-200, and 200+ km
        .addUnboundedTo(50)
        .addRange(50, 200)
        .addUnboundedFrom(200);
SearchResponse response = client.prepareSearch(indexName).setTypes(docType)
        .setQuery(QueryBuilders.matchAllQuery())
        .addAggregation(aggregation)
        .setSize(0).execute().actionGet();
Range agg = response.getAggregations().get("news_hotspots");

// Print the boundaries and document count of each distance bucket
for (Range.Bucket entry : agg.getBuckets()) {
      String key = entry.getKeyAsString();
      Number from = (Number) entry.getFrom();
      Number to = (Number) entry.getTo();
      long docCount = entry.getDocCount();
      System.out.println("key: " + key + " from: " + from + " to: " + to + " doc count: " + docCount);
}

Using bounding boxes with geo distance aggregation

The following is an example of using a bounding box query to limit the scope of our searches and then performing aggregation.

Python example

query = {
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "geo_bounding_box": {
          "location": {
            "top_left": {"lat": 35.6, "lon": 68.91},
            "bottom_right": {"lat": 7.8, "lon": 97.29}
          }
        }
      }
    }
  },
  "aggs": {
    "news_hotspots": {
      "geo_distance": {
        "field": "location",
        "origin": "28.61, 77.23",
        "unit": "km",
        "distance_type": "plane",
        "ranges": [
          {"from": 0, "to": 50 },
          {"from": 50, "to": 200 }
        ]
      }
    }
  }
}
response = es.search(index=index_name, doc_type=doc_type, body=query)
print('total documents found: %s' % response['hits']['total'])
for hit in response['hits']['hits']:
    print(hit.get('_source'))

The preceding query finds all the news documents within India (specified using the bounding box query) and then buckets them by distance from the aggregation origin, which lies in the national capital region of Delhi: one bucket for 0 to 50 km and one for 50 to 200 km.
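The filtering step can be pictured with a small client-side sketch. This is only an illustration of what a bounding box test means, not how Elasticsearch evaluates the filter internally; the helper name in_bounding_box is hypothetical, the corner values are assumed approximate corners for India, the boundaries are treated as inclusive for simplicity, and boxes that cross the antimeridian are not handled.

```python
def in_bounding_box(lat, lon, top_left, bottom_right):
    """Return True if (lat, lon) lies inside the box defined by its corners.

    top_left holds the maximum latitude and minimum longitude;
    bottom_right holds the minimum latitude and maximum longitude.
    """
    return (bottom_right["lat"] <= lat <= top_left["lat"]
            and top_left["lon"] <= lon <= bottom_right["lon"])


# Assumed approximate corners for India, used only for this illustration.
india_top_left = {"lat": 35.6, "lon": 68.91}
india_bottom_right = {"lat": 7.8, "lon": 97.29}

# The aggregation origin near Delhi falls inside the box.
print(in_bounding_box(28.61, 77.23, india_top_left, india_bottom_right))
# A point near London falls outside it.
print(in_bounding_box(51.5, -0.12, india_top_left, india_bottom_right))
```

Only documents that pass this kind of test reach the geo distance aggregation, so the bucket counts never include events outside the box.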

To build this query in Java, you can combine the geo bounding box query with the geo distance aggregation example shown earlier.
