Speeding up GET operations (multi GET)

The standard GET operation is very fast, but fetching many documents by ID one at a time costs one round trip per document. For this case, Elasticsearch provides the multi GET operation.

Getting ready

You need an up-and-running Elasticsearch installation, as used in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command line, you need to install curl for your operating system.

To correctly execute the following commands, use the document indexed in the Indexing a document recipe.

How to do it...

The multi GET REST URLs are:

http://<server>/_mget

http://<server>/<index_name>/_mget

http://<server>/<index_name>/<type_name>/_mget

To execute a multi GET action, we will perform the following steps:

  1. The method is POST, with a body that contains a list of document IDs, plus the index and type when they are not given in the URL. As an example, using the first URL, we need to provide the index, type, and ID:
            curl -XPOST 'localhost:9200/_mget' -fd '{
                "docs" : [
                    {
                        "_index" : "myindex",
                        "_type" : "order",
                        "_id" : "2qLrAfPVQvCRMe7Ku8r0Tw"
                    },
                    {
                        "_index" : "myindex",
                        "_type" : "order",
                        "_id" : "2"
                    }
                ]
            }'
    

    This kind of call allows us to fetch documents in several different indices and types.

  2. If the index and the type are fixed, the call can also take the following form:
            curl 'localhost:9200/test/type/_mget' -d '{
                "ids" : ["1", "2"]
            }'
    

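The multi GET result contains a docs array with one entry per requested document, in the same order as the request. Each entry carries a found flag that tells whether the document exists.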
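The two request forms shown in the steps can also be built from an application. The following is a minimal Python sketch using only the standard library. The helper names (build_docs_body, build_ids_body, split_found, mget) are hypothetical, the server address is the one used in the curl examples, and the Content-Type header is an assumption that newer Elasticsearch releases expect it. Nothing is sent over the network unless mget is called.

```python
import json
import urllib.request


def build_docs_body(docs):
    """Body for http://<server>/_mget: every entry names its index, type, and ID."""
    return {
        "docs": [
            {"_index": index, "_type": doc_type, "_id": doc_id}
            for index, doc_type, doc_id in docs
        ]
    }


def build_ids_body(ids):
    """Body for http://<server>/<index_name>/<type_name>/_mget: IDs only."""
    return {"ids": [str(doc_id) for doc_id in ids]}


def split_found(response):
    """Separate the documents that were found from the IDs that were not."""
    found, missing = [], []
    for doc in response.get("docs", []):
        if doc.get("found"):
            found.append(doc)
        else:
            missing.append(doc.get("_id"))
    return found, missing


def mget(url, body):
    """POST the body to a multi GET URL and return the parsed JSON response.

    Assumes an Elasticsearch node is reachable at the given URL.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumption, see lead-in
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    body = build_docs_body([
        ("myindex", "order", "2qLrAfPVQvCRMe7Ku8r0Tw"),
        ("myindex", "order", "2"),
    ])
    print(json.dumps(body, indent=2))
    # With a node running, the call would look like this:
    # found, missing = split_found(mget("http://localhost:9200/_mget", body))
```

Keeping the body builders separate from the network call makes it easy to inspect the exact JSON before sending it.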

How it works...

The multi GET call is a shortcut for executing many GET commands in one shot.

Internally, Elasticsearch dispatches the individual gets in parallel to the shards that hold the documents, then collects the results and returns them to the user in a single response.
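The scatter and gather idea can be pictured with a small conceptual model. This Python sketch is not Elasticsearch's implementation: the hash is a CRC32 stand-in rather than the hash Elasticsearch really uses, the groups are processed sequentially instead of in parallel, and all function names are hypothetical. It only shows that requested documents are grouped per shard, fetched per group, and then returned in the original request order.

```python
import zlib
from collections import defaultdict


def shard_for(routing_value, num_shards):
    """Pick a shard from the routing value (stand-in hash, not the real one)."""
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


def group_by_shard(docs, num_shards):
    """Group the requested docs by target shard, remembering request positions."""
    groups = defaultdict(list)
    for position, doc in enumerate(docs):
        # The routing value defaults to the document ID when none is given.
        key = doc.get("routing", doc["_id"])
        groups[shard_for(key, num_shards)].append((position, doc))
    return dict(groups)


def collect(groups, fetch):
    """Fetch every group and reassemble the results in request order."""
    results = {}
    for shard, items in groups.items():
        for position, doc in items:
            results[position] = fetch(shard, doc)
    return [results[i] for i in range(len(results))]


if __name__ == "__main__":
    docs = [{"_id": "1"}, {"_id": "2"}, {"_id": "3", "routing": "1"}]
    groups = group_by_shard(docs, num_shards=5)
    # The fetch callback here just echoes the ID; a real shard would load the document.
    print(collect(groups, lambda shard, doc: doc["_id"]))
```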

Each get object in the request body can contain the following parameters:

  • _index: The index that contains the document. It can be omitted if passed in the URL.
  • _type: The type of the document. It can be omitted if passed in the URL.
  • _id: The document ID.
  • stored_fields: (optional) a list of fields to retrieve.
  • _source: (optional) source filter object.
  • routing: (optional) the shard routing parameter.
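As an illustration of the parameters listed above, this Python sketch assembles a single get object and leaves out every optional key that was not supplied. The function name mget_item and the field names customer and total are hypothetical. The _source value is shown here as a plain list of field names or as False, which is one of the forms a source filter can take.

```python
def mget_item(doc_id, index=None, doc_type=None,
              stored_fields=None, source=None, routing=None):
    """Build one entry of the docs array, skipping parameters left as None."""
    item = {"_id": doc_id}
    optional = {
        "_index": index,            # may be omitted if passed in the URL
        "_type": doc_type,          # may be omitted if passed in the URL
        "stored_fields": stored_fields,
        "_source": source,
        "routing": routing,
    }
    # Compare against None so that an explicit _source=False is kept.
    item.update({key: value for key, value in optional.items() if value is not None})
    return item


if __name__ == "__main__":
    body = {
        "docs": [
            mget_item("2qLrAfPVQvCRMe7Ku8r0Tw", index="myindex", doc_type="order"),
            mget_item("2", index="myindex", doc_type="order",
                      source=["customer", "total"]),  # hypothetical field names
        ]
    }
    print(body)
```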

The advantages of a multi GET are as follows:

  • Reduced network traffic, both inside the Elasticsearch cluster and between the client and Elasticsearch
  • Faster applications: the time needed to process a multi GET is quite similar to that of a single standard GET, so fetching many documents costs roughly one call instead of many
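The difference in network traffic can be made concrete by counting requests. In this small Python sketch, the function names are hypothetical and the single GET URL form http://<server>/<index_name>/<type_name>/<id> is assumed from the standard GET call. It only builds the requests and sends nothing.

```python
def single_get_urls(server, index, doc_type, ids):
    """One URL, and therefore one HTTP round trip, per document."""
    return [f"http://{server}/{index}/{doc_type}/{doc_id}" for doc_id in ids]


def multi_get_request(server, index, doc_type, ids):
    """A single URL and body that cover every document."""
    url = f"http://{server}/{index}/{doc_type}/_mget"
    return url, {"ids": [str(doc_id) for doc_id in ids]}


if __name__ == "__main__":
    ids = ["1", "2", "3"]
    urls = single_get_urls("localhost:9200", "test", "type", ids)
    url, body = multi_get_request("localhost:9200", "test", "type", ids)
    print(f"standard GET: {len(urls)} requests, multi GET: 1 request to {url}")
```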

See also...

  • Refer to the Getting a document recipe in this chapter to learn how to execute a simple GET and which general parameters a GET call accepts.