Elasticsearch standard pagination using from and size performs very poorly on large datasets because, for every query, Elasticsearch must compute and discard all the results that precede the from value. Scrolling doesn't have this problem, but it consumes a lot of memory because of its search contexts, so it cannot be used for frequent user queries.
To bypass these problems, Elasticsearch 5.x provides the search_after functionality, which allows fast skipping when paginating through results.
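The difference in cost can be illustrated with a small in-memory sketch. It is not how Elasticsearch is implemented; it only assumes a list of hits already sorted in ascending order by a hypothetical (price, uid) key, and the function names are made up for this illustration.

```python
import bisect
from typing import List, Tuple

Hit = Tuple[float, str]  # (price, uid): hypothetical sort key of a hit


def page_with_from_size(sorted_hits: List[Hit], from_: int, size: int):
    """Offset paging: every hit before `from_` is visited and thrown away."""
    visited = 0
    page: List[Hit] = []
    for hit in sorted_hits:
        if len(page) == size:
            break
        visited += 1
        if visited > from_:
            page.append(hit)
    return page, visited


def page_with_search_after(sorted_hits: List[Hit], after: Hit, size: int):
    """Cursor paging: seek just past the last sort key already seen."""
    # The seek is a binary search, so no earlier hit is materialized.
    start = bisect.bisect_right(sorted_hits, after)
    page = sorted_hits[start:start + size]
    return page, len(page)
```

With 1,000 sorted hits, asking for five hits at offset 990 visits 995 entries in the first function, while the second one returns the same page after touching only the five hits it keeps.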
You will need an up-and-running Elasticsearch installation as used in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.
To execute curl via a command line, you need to install curl for your operating system.
To correctly execute the following commands, you will need an index populated with the chapter_05/populate_query.sh script available in the online code.
In order to paginate results with search_after, we will perform the following steps:

1. Execute your query with a sort on your values, using the _uid of the document as the last sort parameter, as follows:

    curl -XGET 'http://127.0.0.1:9200/test-index/test-type/_search?pretty' -d '
    {
      "size": 1,
      "query": { "match_all" : {} },
      "sort": [
        {"price": "asc"},
        {"_uid": "desc"}
      ]
    }'

2. If everything works correctly, the result will be similar to the following:

    {
      "took" : 52,
      "timed_out" : false,
      "_shards" : {...},
      "hits" : {
        "total" : 3,
        "max_score" : null,
        "hits" : [
          {
            "_index" : "test-index",
            "_type" : "test-type",
            "_id" : "1",
            "_score" : null,
            "_source" : {...},
            "sort" : [ 4.0, "test-type#1" ]
          }
        ]
      }
    }

3. To use the search_after functionality, you need to keep track of your last sort result, which in this case is as follows: [4.0, "test-type#1"].

4. To fetch the next result, provide the search_after functionality with the last sort value of your last record, as follows:

    curl -XGET 'http://127.0.0.1:9200/test-index/test-type/_search?pretty' -d '
    {
      "size": 1,
      "query": { "match_all" : {} },
      "search_after": [4.0, "test-type#1"],
      "sort": [
        {"price": "asc"},
        {"_uid": "desc"}
      ]
    }'
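The steps above can be wrapped in a loop that feeds the sort values of the last hit back into the next request. The following is a minimal sketch, not an official client: build_body and iterate_hits are hypothetical helper names, and search is assumed to be any callable that sends the body to the _search endpoint and returns the parsed JSON response. The fake_search function is only an in-memory stand-in with made-up sort values for the second and third documents, so the loop can be tried without a cluster.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def build_body(size: int, search_after: Optional[List[Any]] = None) -> Dict[str, Any]:
    """Build the request body used in the recipe, optionally with a cursor."""
    body: Dict[str, Any] = {
        "size": size,
        "query": {"match_all": {}},
        "sort": [{"price": "asc"}, {"_uid": "desc"}],
    }
    if search_after is not None:
        body["search_after"] = search_after
    return body


def iterate_hits(search: SearchFn, size: int = 1) -> Iterator[Dict[str, Any]]:
    """Yield every hit, page by page, until an empty page is returned."""
    last_sort: Optional[List[Any]] = None
    while True:
        response = search(build_body(size, last_sort))
        hits = response["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # Keep track of the sort values of the last hit of this page.
        last_sort = hits[-1]["sort"]


# Hypothetical stand-in for a real cluster (illustrative data only).
_FAKE_HITS = [
    {"_id": "1", "sort": [4.0, "test-type#1"]},
    {"_id": "2", "sort": [5.0, "test-type#2"]},
    {"_id": "3", "sort": [6.0, "test-type#3"]},
]


def fake_search(body: Dict[str, Any]) -> Dict[str, Any]:
    """Mimic search_after on price only; ties on _uid are ignored for brevity."""
    after = body.get("search_after")
    hits = [h for h in _FAKE_HITS if after is None or h["sort"][0] > after[0]]
    return {"hits": {"hits": hits[: body["size"]]}}


ids = [hit["_id"] for hit in iterate_hits(fake_search, size=1)]
```

In a real application, search would be replaced by a function that issues the HTTP request shown in the curl commands and decodes the response.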
Elasticsearch uses Lucene for indexing data. In Lucene indices, all the terms are sorted and stored in an ordered way, so it's natural for Lucene to be extremely fast in skipping to a term value. This operation is managed in the Lucene core with the skipTo method. It requires no additional memory; in the case of search_after, a query is built from the search_after values to skip quickly within the Lucene search and to speed up result pagination.
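As an analogy only (Lucene's internal structures are more sophisticated than a plain list), the following sketch shows why skipping is cheap when values are kept in sorted order: a binary search reaches the target position in a number of comparisons that grows with the logarithm of the list size. The function name seek_first_at_or_after is hypothetical.

```python
from typing import List, Tuple


def seek_first_at_or_after(sorted_terms: List[str], target: str) -> Tuple[int, int]:
    """Binary search for the first term >= target.

    Returns (index, comparisons) so the cost of the seek is visible.
    """
    low, high = 0, len(sorted_terms)
    comparisons = 0
    while low < high:
        mid = (low + high) // 2
        comparisons += 1
        if sorted_terms[mid] < target:
            low = mid + 1
        else:
            high = mid
    return low, comparisons
```

For a list of 100,000 sorted terms, the seek needs at most 17 comparisons, whereas walking the list from the start could need up to 100,000.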
The search_after functionality was introduced in Elasticsearch 5.x, and it should be kept in mind as an important tool for improving the user experience when scrolling or paginating through search results.