In addition to the standard types seen in the previous aggregations, Elasticsearch can execute aggregations against a GeoPoint field: the geo distance aggregation. It is an evolution of the previously discussed range aggregation, built to work on geo locations.
You need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.
To execute curl via the command line, you need to install curl for your operating system.
To correctly execute the following command, you need an index populated with the script (chapter_08/populate_aggregations.sh) available in the online code.
To execute geo distance aggregations, we will perform the following steps:
Using the position field available in the documents, we want to aggregate the other documents in five ranges:
curl -XGET 'http://127.0.0.1:9200/test-index/test-type/_search?pretty&size=0' -d '{
  "query": { "match_all": {} },
  "aggs": {
    "position": {
      "geo_distance": {
        "field": "position",
        "origin": { "lat": 83.76, "lon": -81.20 },
        "ranges": [
          { "to": 10 },
          { "from": 10, "to": 20 },
          { "from": 20, "to": 50 },
          { "from": 50, "to": 100 },
          { "from": 100 }
        ]
      }
    }
  }
}'
The result returned by Elasticsearch should be similar to the following:
{
  "took" : 177,
  "timed_out" : false,
  "_shards" : {...truncated...},
  "hits" : {...truncated...},
  "aggregations" : {
    "position" : {
      "buckets" : [
        { "key" : "*-10.0", "from" : 0.0, "to" : 10.0, "doc_count" : 0 },
        { "key" : "10.0-20.0", "from" : 10.0, "to" : 20.0, "doc_count" : 0 },
        { "key" : "20.0-50.0", "from" : 20.0, "to" : 50.0, "doc_count" : 0 },
        { "key" : "50.0-100.0", "from" : 50.0, "to" : 100.0, "doc_count" : 0 },
        { "key" : "100.0-*", "from" : 100.0, "doc_count" : 1000 }
      ]
    }
  }
}
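The request body used in the step above can also be assembled programmatically. The following is a minimal Python sketch that only builds and prints the JSON body; the helper name build_geo_distance_request and its agg_name parameter are hypothetical and are not part of any Elasticsearch client.

```python
import json


def build_geo_distance_request(field, origin, ranges, agg_name="position"):
    """Return the search body for a geo_distance aggregation as a plain dict.

    Hypothetical helper for illustration; it mirrors the curl body in this recipe.
    """
    return {
        "query": {"match_all": {}},
        "aggs": {
            agg_name: {
                "geo_distance": {
                    "field": field,      # must be mapped as geo_point
                    "origin": origin,    # point the distances are computed from
                    "ranges": ranges,    # list of from/to dictionaries
                }
            }
        },
    }


body = build_geo_distance_request(
    field="position",
    origin={"lat": 83.76, "lon": -81.20},
    ranges=[
        {"to": 10},
        {"from": 10, "to": 20},
        {"from": 20, "to": 50},
        {"from": 50, "to": 100},
        {"from": 100},
    ],
)

# Serialize the body so that it can be passed to curl -d or to an HTTP client.
print(json.dumps(body, indent=2))
```

The printed JSON can be sent with curl -d exactly as in the command of this recipe.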
The geo distance aggregation is an extension of the range aggregation that works on geo locations. It works only if the field is mapped as a geo_point.
The field can contain a single geo point or multiple geo points.
The aggregation requires at least the following three parameters:
field: the field of the geo point to work on
origin: the geo point to be used for computing the distances
ranges: a list of ranges to collect documents based on their distance from the target point
The GeoPoint can be defined in one of the following accepted formats:
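Latitude and longitude as properties: {"lat": 83.76, "lon": -81.20}
Longitude and latitude as an array: [-81.20, 83.76]
Latitude and longitude as a string: 83.76, -81.20
Geohash: fnyk80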
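The formats above differ in the order of their coordinates: the array form puts longitude first, while the string form puts latitude first. The following Python sketch normalizes the first three formats into a (lat, lon) tuple to make the ordering explicit. The function name to_lat_lon is hypothetical, and geohash decoding is deliberately left out.

```python
def to_lat_lon(point):
    """Normalize a geo point into a (lat, lon) tuple of floats.

    Hypothetical helper for illustration; geohash strings are not decoded.
    """
    if isinstance(point, dict):
        # Object form: {"lat": ..., "lon": ...}
        return float(point["lat"]), float(point["lon"])
    if isinstance(point, (list, tuple)):
        # Array form: [lon, lat] (longitude first)
        lon, lat = point
        return float(lat), float(lon)
    if isinstance(point, str) and "," in point:
        # String form: "lat, lon" (latitude first)
        lat, lon = (part.strip() for part in point.split(","))
        return float(lat), float(lon)
    raise ValueError("unsupported geo point format: %r" % (point,))


print(to_lat_lon({"lat": 83.76, "lon": -81.20}))
print(to_lat_lon([-81.20, 83.76]))
print(to_lat_lon("83.76, -81.20"))
```

All three calls describe the same point, so they print the same tuple.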
Each range is defined as a pair of from/to values. If one of them is missing, the range is unbounded on that side.
The values used for the ranges are by default set to kilometers, but using the unit property it's possible to set them as follows:
mi or miles
in or inch
yd or yard
km or kilometers
m or meters
cm or centimeter
mm or millimeters
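To see how the same range boundaries look in different units, the following Python sketch converts from/to values between the short unit names listed above. The table METERS_PER_UNIT and the helper names are assumptions made for this illustration, not Elasticsearch code; the conversion factors are the standard metric and imperial definitions.

```python
# Length of one unit expressed in meters (standard definitions).
METERS_PER_UNIT = {
    "mi": 1609.344,
    "in": 0.0254,
    "yd": 0.9144,
    "km": 1000.0,
    "m": 1.0,
    "cm": 0.01,
    "mm": 0.001,
}


def convert_distance(value, from_unit, to_unit):
    """Convert a distance between two of the short unit names above."""
    return value * METERS_PER_UNIT[from_unit] / METERS_PER_UNIT[to_unit]


def convert_ranges(ranges, from_unit, to_unit):
    """Convert every from/to bound of a list of range dictionaries."""
    return [
        {bound: convert_distance(value, from_unit, to_unit) for bound, value in r.items()}
        for r in ranges
    ]


# The first two ranges of the recipe, read as kilometers and rewritten in meters.
print(convert_ranges([{"to": 10}, {"from": 10, "to": 20}], "km", "m"))
```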
It's also possible to set how the distance is computed with the distance_type parameter. Valid values for this parameter are as follows:
arc, which uses the arc length formula. It is the most precise. (See http://en.wikipedia.org/wiki/Arc_length for more details on the arc length algorithm.)
sloppy_arc (default), which is a faster but less precise implementation of the arc length formula.
plane, which uses the plane distance formula. It is the fastest and least CPU intensive, but it's also the least precise.
As with the range filter, the range values are treated independently, so overlapping ranges are allowed.
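The precision trade-off between an arc-based and a plane-based distance can be illustrated with a small Python sketch. It assumes the haversine formula for the arc distance and an equirectangular projection for the plane distance. These are common textbook choices and are not necessarily the exact formulas that Elasticsearch implements. The Earth radius constant and the function names are also assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius


def arc_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    # Clamp to 1.0 to guard against rounding errors near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def plane_distance_km(lat1, lon1, lat2, lon2):
    """Flat-plane approximation: cheaper to compute, less precise over long distances."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    x = math.radians(lon2 - lon1) * math.cos((phi1 + phi2) / 2)
    y = phi2 - phi1
    return EARTH_RADIUS_KM * math.hypot(x, y)


# Short hop: the two results are nearly identical.
print(arc_distance_km(45.0, 0.0, 45.01, 0.01), plane_distance_km(45.0, 0.0, 45.01, 0.01))
# Long hop: the plane approximation drifts noticeably.
print(arc_distance_km(0.0, 0.0, 60.0, 120.0), plane_distance_km(0.0, 0.0, 60.0, 120.0))
```

For short distances the cheaper plane computation is usually adequate; the gap grows with the distance between the points.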
When the results are returned, each bucket of this aggregation provides the following fields:
from/to: define the analyzed range
key: defines the string representation of the range
doc_count: defines the number of documents in the bucket that match the range
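How these fields relate to the requested ranges can be sketched in Python. The function below is a hypothetical client-side imitation, not Elasticsearch code. It assumes that from is inclusive and to is exclusive, as in the range aggregation, and it reports a missing from as 0.0 to mirror the sample response in this recipe. Each range is evaluated independently, so a distance can be counted in several overlapping buckets.

```python
def geo_distance_buckets(distances, ranges):
    """Build buckets with key, from, to and doc_count from precomputed distances.

    Hypothetical helper: 'distances' are the distances from the origin,
    'ranges' is a list of dictionaries with optional "from" and "to" keys.
    """
    buckets = []
    for r in ranges:
        lo = r.get("from")
        hi = r.get("to")
        # Each range is checked on its own: "from" inclusive, "to" exclusive.
        count = sum(
            1 for d in distances
            if (lo is None or d >= lo) and (hi is None or d < hi)
        )
        key = "%s-%s" % ("*" if lo is None else float(lo), "*" if hi is None else float(hi))
        bucket = {"key": key, "from": 0.0 if lo is None else float(lo)}
        if hi is not None:
            bucket["to"] = float(hi)
        bucket["doc_count"] = count
        buckets.append(bucket)
    return buckets


sample = geo_distance_buckets(
    distances=[5, 10, 15, 150],
    ranges=[{"to": 10}, {"from": 10, "to": 20}, {"from": 100}, {"from": 0, "to": 200}],
)
for bucket in sample:
    print(bucket)
```

The last range overlaps all the others, so every sample distance is counted again in its bucket.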