Executing geo distance aggregations

In addition to the standard types covered in the previous aggregation recipes, Elasticsearch can execute aggregations against a GeoPoint field: the geo distance aggregation. It is an evolution of the previously discussed range aggregation, built to work on geo locations.

Getting ready

You need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command line, you need to install curl for your operating system.

To correctly execute the following command, you need an index populated with the script (chapter_08/populate_aggregations.sh) available in the online code.

How to do it...

To execute a geo distance aggregation, we will perform the following steps:

Using the position field available in the documents, we want to aggregate the documents into five ranges, based on their distance from a given point:
- Less than 10 kilometers
- From 10 to 20 kilometers
- From 20 to 50 kilometers
- From 50 to 100 kilometers
- Above 100 kilometers

To achieve these goals, we create a geo distance aggregation with code similar to the following:

        curl -XGET 'http://127.0.0.1:9200/test-index/test-type/_search?pretty&size=0' -d '{
            "query" : {
                "match_all" : {}
            },
            "aggs" : {
                "position" : {
                    "geo_distance" : {
                        "field" : "position",
                        "origin" : {
                            "lat" : 83.76,
                            "lon" : -81.20
                        },
                        "unit" : "km",
                        "ranges" : [
                            { "to" : 10 },
                            { "from" : 10, "to" : 20 },
                            { "from" : 20, "to" : 50 },
                            { "from" : 50, "to" : 100 },
                            { "from" : 100 }
                        ]
                    }
                }
            }
        }'
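
The same request body can also be assembled programmatically, which avoids quoting mistakes when the ranges change. The following is a minimal Python sketch: the helper name geo_distance_agg and the bounds list are hypothetical names introduced only for illustration, and the unit is passed explicitly so that the boundaries are read as kilometers.

```python
import json


def geo_distance_agg(field, origin, ranges, unit=None):
    """Build a search body holding one geo_distance aggregation.

    Hypothetical helper for illustration; the aggregation is named
    after the field, as in the curl example.
    """
    agg = {"field": field, "origin": origin, "ranges": ranges}
    if unit is not None:
        agg["unit"] = unit
    return {
        "query": {"match_all": {}},
        "aggs": {field: {"geo_distance": agg}},
    }


# Boundaries between the five ranges of the recipe.
bounds = [10, 20, 50, 100]

# First range is open on the left, last range is open on the right.
ranges = [{"to": bounds[0]}]
ranges += [{"from": lo, "to": hi} for lo, hi in zip(bounds, bounds[1:])]
ranges.append({"from": bounds[-1]})

body = geo_distance_agg(
    "position", {"lat": 83.76, "lon": -81.20}, ranges, unit="km"
)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with the -d option, or sent with any HTTP client.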

The result returned by Elasticsearch, if everything is okay, should be as follows:

        { 
          "took" : 177, 
          "timed_out" : false, 
          "_shards" : {...truncated...}, 
          "hits" : {...truncated...}, 
          "aggregations" : { 
            "position" : { 
              "buckets" : [ { 
                "key" : "*-10.0", 
                "from" : 0.0, 
                "to" : 10.0, 
                "doc_count" : 0 
              }, { 
                "key" : "10.0-20.0", 
                "from" : 10.0, 
                "to" : 20.0, 
                "doc_count" : 0 
              }, { 
                "key" : "20.0-50.0", 
                "from" : 20.0, 
                "to" : 50.0, 
                "doc_count" : 0 
              }, { 
                "key" : "50.0-100.0", 
                "from" : 50.0, 
                "to" : 100.0, 
                "doc_count" : 0 
              }, { 
                "key" : "100.0-*", 
                "from" : 100.0, 
                "doc_count" : 1000 
              } ] 
            } 
          } 
        }

How it works...

The geo distance aggregation is an extension of the range aggregation that works on geo locations. It works only if the field is mapped as a geo_point.

The field can contain a single geo point or multiple geo points.

The aggregation requires at least the following three parameters:

- field: the geo point field to work on
- origin: the geo point used as the reference for computing the distances
- ranges: a list of ranges used to collect documents based on their distance from the origin

The GeoPoint can be defined in one of the following accepted formats:

- latitude and longitude as properties, for example: {"lat": 83.76, "lon": -81.20}
- longitude and latitude as an array, for example: [-81.20, 83.76]
- latitude and longitude as a string, for example: "83.76, -81.20"
- geohash, for example: fnyk80
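
The main trap in these notations is the coordinate order: the array form puts longitude first, while the object and string forms put latitude first. The following Python sketch normalizes the first three notations to a (lat, lon) tuple. The function name parse_origin is hypothetical, and geohash decoding is deliberately left out of the sketch.

```python
def parse_origin(origin):
    """Return (lat, lon) for the object, array, and string notations.

    Illustrative only: geohash strings are not decoded here.
    """
    if isinstance(origin, dict):
        return float(origin["lat"]), float(origin["lon"])
    if isinstance(origin, (list, tuple)):
        lon, lat = origin  # array notation is [lon, lat]
        return float(lat), float(lon)
    if isinstance(origin, str) and "," in origin:
        lat, lon = origin.split(",")  # string notation is "lat, lon"
        return float(lat), float(lon)
    raise ValueError(f"unsupported origin notation: {origin!r}")


as_object = parse_origin({"lat": 83.76, "lon": -81.20})
as_array = parse_origin([-81.20, 83.76])
as_string = parse_origin("83.76, -81.20")

# All three notations describe the same point.
print(as_object, as_array, as_string)
```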

Each range is defined as a pair of from/to values. If one of them is missing, the range is unbounded on that side.

The range values are expressed in meters by default, so kilometer ranges such as the ones in this recipe require setting unit to km. The unit property accepts the following values:

- mi or miles
- in or inch
- yd or yards
- km or kilometers
- m or meters
- cm or centimeters
- mm or millimeters
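
To compare ranges written in different units, it can help to convert them to a common one. The sketch below converts range boundaries to meters using standard conversion factors. The names METERS_PER_UNIT and ranges_in_meters are hypothetical, and only the short unit codes are covered.

```python
# Standard conversion factors from each unit code to meters.
METERS_PER_UNIT = {
    "mi": 1609.344,
    "in": 0.0254,
    "yd": 0.9144,
    "km": 1000.0,
    "m": 1.0,
    "cm": 0.01,
    "mm": 0.001,
}


def to_meters(value, unit):
    """Convert a distance expressed in the given unit code to meters."""
    return value * METERS_PER_UNIT[unit]


def ranges_in_meters(ranges, unit):
    """Return a copy of the ranges with from/to converted to meters."""
    converted = []
    for r in ranges:
        converted.append({k: to_meters(v, unit) for k, v in r.items()})
    return converted


recipe_ranges = [{"to": 10}, {"from": 10, "to": 20}, {"from": 100}]
print(ranges_in_meters(recipe_ranges, "km"))
```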

It's also possible to set how the distance is computed with the distance_type parameter. Valid values for this parameter are as follows:

- arc, which uses the arc length formula. It is the most precise. (See http://en.wikipedia.org/wiki/Arc_length for more details on the arc length algorithm.)
- sloppy_arc (default), which is a faster but less precise implementation of the arc length formula.
- plane, which uses the plane distance formula. It is the fastest and least CPU intensive, but it's also the least precise.
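
The trade-off between an arc-based and a plane-based distance can be illustrated with two textbook formulas. The Python sketch below uses the haversine formula for the arc distance and an equirectangular projection for the plane distance. These are stand-ins chosen for illustration, not Elasticsearch's internal implementation, and the Earth radius constant and the second point are assumptions of the sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for this sketch


def arc_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance computed with the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def plane_distance_km(lat1, lon1, lat2, lon2):
    """Flat-plane approximation: cheaper, less accurate over long distances."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    x = math.radians(lon2 - lon1) * math.cos(mean_lat)
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_KM * math.hypot(x, y)


# Origin of the recipe and a hypothetical nearby point.
origin = (83.76, -81.20)
nearby = (84.0, -80.0)

arc = arc_distance_km(*origin, *nearby)
plane = plane_distance_km(*origin, *nearby)
print(f"arc: {arc:.2f} km, plane: {plane:.2f} km")
```

For nearby points the two results are very close; the gap widens as the points move further apart.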

As with the range aggregation, the ranges are treated independently, so overlapping ranges are allowed.
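
The effect of independent ranges can be sketched with a few lines of Python that count precomputed distances per range. The function name bucket_distances and the sample distances are hypothetical; the sketch assumes that from is inclusive and to is exclusive, which is the convention of the range aggregation.

```python
def bucket_distances(distances, ranges):
    """Count the distances falling in each range; ranges are independent."""
    counts = []
    for r in ranges:
        lo = r.get("from", 0.0)            # missing "from": unbounded below
        hi = r.get("to", float("inf"))     # missing "to": unbounded above
        # Assumption: "from" is inclusive and "to" is exclusive.
        counts.append(sum(1 for d in distances if lo <= d < hi))
    return counts


# Hypothetical distances (in km) of five documents from the origin.
distances = [5.0, 15.0, 15.0, 42.0, 250.0]
overlapping = [{"to": 20}, {"from": 10, "to": 50}, {"from": 10}]

counts = bucket_distances(distances, overlapping)
print(counts)
```

Because the ranges overlap, a document can be counted in more than one bucket, so the counts may add up to more than the number of documents.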

When the results are returned, each bucket of the aggregation contains the following fields:

- from/to: the bounds of the analyzed range
- key: the string representation of the range
- doc_count: the number of documents in the bucket that match the range
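
As a sketch of how a client might read these fields, the following Python code parses a response shaped like the one shown earlier and lists the buckets. The function name summarize_buckets is hypothetical, and the embedded response is trimmed to the aggregations part.

```python
import json

# Aggregations part of the response shown in this recipe.
response_text = """
{
  "aggregations": {
    "position": {
      "buckets": [
        {"key": "*-10.0", "from": 0.0, "to": 10.0, "doc_count": 0},
        {"key": "10.0-20.0", "from": 10.0, "to": 20.0, "doc_count": 0},
        {"key": "20.0-50.0", "from": 20.0, "to": 50.0, "doc_count": 0},
        {"key": "50.0-100.0", "from": 50.0, "to": 100.0, "doc_count": 0},
        {"key": "100.0-*", "from": 100.0, "doc_count": 1000}
      ]
    }
  }
}
"""


def summarize_buckets(response, agg_name):
    """Return (key, from, to, doc_count) tuples; a missing bound is None."""
    rows = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        rows.append((bucket["key"], bucket.get("from"),
                     bucket.get("to"), bucket["doc_count"]))
    return rows


rows = summarize_buckets(json.loads(response_text), "position")
for key, lower, upper, count in rows:
    print(f"{key:>12}  from={lower}  to={upper}  docs={count}")
```

The open-ended bucket has no to field, which is why the sketch reads the bounds with get instead of indexing.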
