Once a document has been indexed, your application will most likely need to retrieve it at some point.
The GET REST call allows us to get a document in real time without the need for a refresh.
You need an up-and-running Elasticsearch installation, as used in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.
To execute curl via the command line, you need to install curl for your operating system.
To correctly execute the following commands, use the document indexed in the Indexing a document recipe.
The GET method allows us to return a document given its index, type, and ID.
The REST API URL is:
http://<server>/<index_name>/<type_name>/<id>
To get a document, we will perform the following steps:
1. Call the GET API for the document indexed in the Indexing a document recipe:
curl -XGET 'http://localhost:9200/myindex/order/2qLrAfPVQvCRMe7Ku8r0Tw?pretty=true'
2. The result returned by Elasticsearch should be the indexed document:
{
  "_index" : "myindex",
  "_type" : "order",
  "_id" : "2qLrAfPVQvCRMe7Ku8r0Tw",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "id" : "1234",
    "date" : "2013-06-07T12:14:54",
    "customer_id" : "customer1",
    "sent" : true,
    "items" : [
      {"name" : "item1", "quantity" : 3, "vat" : 20.0},
      {"name" : "item2", "quantity" : 2, "vat" : 20.0},
      {"name" : "item3", "quantity" : 1, "vat" : 10.0}
    ]
  }
}
3. The indexed data is contained in the _source parameter, but other information is returned:
_index: The index that stores the document
_type: The type of the document
_id: The ID of the document
_version: The version of the document
found: Whether the document has been found
4. If the document is missing, 404 is returned as the status code and the returned JSON will be:
{
  "_id" : "2qLrAfPVQvCRMe7Ku8r0Tw",
  "_index" : "myindex",
  "_type" : "order",
  "found" : false
}
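The found flag and the 404 status code described above can be handled from application code as well as from curl. The following is a minimal sketch in Python using only the standard library; the helper names document_url, extract_source, and get_document are hypothetical and chosen for illustration, and the server address is assumed to be in host:port form.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import quote


def document_url(server, index, doc_type, doc_id):
    """Build http://<server>/<index_name>/<type_name>/<id>."""
    return "http://{}/{}/{}/{}".format(
        server,
        quote(index, safe=""),
        quote(doc_type, safe=""),
        quote(doc_id, safe=""),
    )


def extract_source(status, body):
    """Return the _source of a GET reply, or None if the document is missing."""
    if status == 404:
        return None
    reply = json.loads(body)
    if not reply.get("found", False):
        return None
    return reply["_source"]


def get_document(server, index, doc_type, doc_id):
    """Fetch a document; urllib reports the 404 status as an HTTPError."""
    url = document_url(server, index, doc_type, doc_id)
    try:
        with urllib.request.urlopen(url) as response:
            body = response.read().decode("utf-8")
            return extract_source(response.status, body)
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return None
        raise
```

Calling get_document("localhost:9200", "myindex", "order", "2qLrAfPVQvCRMe7Ku8r0Tw") against a running node would return the order as a dictionary, or None if the document is missing.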
The Elasticsearch GET API on the document doesn't require a refresh: all the GET calls are in real time.
This call is very fast because Elasticsearch routes the request only to the shard that contains the document, without any other overhead, and the document IDs are often cached in memory for fast lookup.
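To give an intuition of why a single shard can be targeted, the shard is derived from a hash of the routing value (the document ID by default) modulo the number of primary shards. The following Python sketch only illustrates that idea: the pick_shard helper is hypothetical, and zlib.crc32 is a stand-in for the hash function that Elasticsearch uses internally, so the shard numbers it produces will not match a real cluster.

```python
import zlib


def pick_shard(routing_value, number_of_primary_shards):
    """Map a routing value (the document ID by default) to a shard number."""
    # zlib.crc32 is only a stand-in: Elasticsearch uses its own hash function.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % number_of_primary_shards
```

The same ID always maps to the same shard, so a GET call does not need to ask the other shards.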
The source of the document is only available if the _source field is stored (the default setting in Elasticsearch).
There are several additional parameters that can be used to control the get call:
fields allows us to retrieve only a subset of fields. This is very useful to reduce bandwidth or to retrieve calculated fields, such as the ones produced by the attachment mapping:
curl 'http://localhost:9200/myindex/order/2qLrAfPVQvCRMe7Ku8r0Tw?fields=date,sent'
routing allows us to specify the shard to be used for the get operation. To retrieve a document, the routing value used at retrieval time must be the same as the one used at indexing time:
curl 'http://localhost:9200/myindex/order/2qLrAfPVQvCRMe7Ku8r0Tw?routing=customer_id'
refresh allows us to refresh the current shard before doing the get operation (it must be used with care because it slows down indexing and introduces some overhead):
curl 'http://localhost:9200/myindex/order/2qLrAfPVQvCRMe7Ku8r0Tw?refresh=true'
preference allows us to control which shard replica is chosen to execute the GET method. Generally, Elasticsearch chooses a random shard for the GET call. The possible values are as follows:
_primary, for the primary shard.
_local, first trying the local shard and then falling back to a random choice. Using the local shard reduces the bandwidth usage and should generally be used with autoreplicating shards (replica set to 0-all).
custom value, for selecting a shard through a related value such as customer_id or username.
The GET API is very fast, so a good practice when developing applications is to use it as much as possible. Choosing the correct ID form during application development can bring a big boost in performance.
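The additional parameters listed above (fields, routing, refresh, and preference) are all passed in the query string, so they can be combined in a single call. The following Python sketch shows one way to assemble such a URL; the get_url helper and its keyword arguments are hypothetical names used for illustration, and only the parameter names themselves come from the text.

```python
from urllib.parse import quote, urlencode


def get_url(server, index, doc_type, doc_id, fields=None, routing=None,
            refresh=False, preference=None):
    """Build a GET URL with the optional parameters of the get call."""
    params = []
    if fields:
        # Several fields are sent as one comma-separated value.
        params.append(("fields", ",".join(fields)))
    if routing is not None:
        params.append(("routing", routing))
    if refresh:
        params.append(("refresh", "true"))
    if preference is not None:
        params.append(("preference", preference))
    url = "http://{}/{}/{}/{}".format(
        server, index, doc_type, quote(doc_id, safe=""))
    if params:
        # Keep the comma readable instead of encoding it as %2C.
        url += "?" + urlencode(params, safe=",")
    return url
```

For example, get_url("localhost:9200", "myindex", "order", "2qLrAfPVQvCRMe7Ku8r0Tw", fields=["date", "sent"]) produces the same URL as the fields example above, and the result can be passed to curl or to any HTTP client.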
If the shard that contains the document is not bound to the ID, fetching the document requires a query with an ID filter (we will see this in the Using a IDS query recipe in Chapter 6, Text and Numeric Queries).
If you don't need to fetch the record, but only to check that it exists, you can replace GET with HEAD; the response will be status code 200 if the document exists, or 404 if it is missing.
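The existence check can be sketched in Python with the standard library as follows. The function names exists_from_status and document_exists are hypothetical; the only behavior taken from the text is that a HEAD request answers 200 for an existing document and 404 for a missing one. Any other status is treated as an error here, which is a design choice of this sketch.

```python
import urllib.error
import urllib.request


def exists_from_status(status):
    """Translate the HEAD status code into a boolean."""
    if status == 200:
        return True
    if status == 404:
        return False
    raise ValueError("unexpected status code: {}".format(status))


def document_exists(url):
    """Send a HEAD request to the document URL and report whether it exists."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request) as response:
            return exists_from_status(response.status)
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for 4xx and 5xx answers, including 404.
        return exists_from_status(error.code)
```

With a running node, document_exists("http://localhost:9200/myindex/order/2qLrAfPVQvCRMe7Ku8r0Tw") would return True or False without transferring the document body.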
The GET call also has a special endpoint, _source, that allows fetching only the source of the document.
The GET source REST API URL is:
http://<server>/<index_name>/<type_name>/<id>/_source
To fetch the source of the previous order, we will call:
curl -XGET http://localhost:9200/myindex/order/2qLrAfPVQvCRMe7Ku8r0Tw/_source
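The same call can be made from Python. The following is a minimal sketch using only the standard library; source_url and get_source are hypothetical helper names, and the sketch assumes the node answers with the document body as JSON, without the _index, _type, _id, and _version metadata.

```python
import json
import urllib.request


def source_url(server, index, doc_type, doc_id):
    """Build http://<server>/<index_name>/<type_name>/<id>/_source."""
    return "http://{}/{}/{}/{}/_source".format(server, index, doc_type, doc_id)


def get_source(server, index, doc_type, doc_id):
    """Fetch only the source of a document and parse it as JSON."""
    url = source_url(server, index, doc_type, doc_id)
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))
```

A missing document makes urlopen raise urllib.error.HTTPError, which the caller can catch in the same way as for the plain GET call.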