Managing index settings

Index settings are important because they let you control several key Elasticsearch features, such as sharding and replication, caching, term management, routing, and analysis.

Getting ready

You need an up-and-running Elasticsearch installation, as described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command line, you need to install curl for your operating system.

To correctly execute the following commands, use the index created in the Creating an index recipe.

How to do it...

To manage the index settings, we will perform the following steps:

  1. To retrieve the settings of your current index, use the following URL format: http://<server>/<index_name>/_settings
  2. We are reading information via the REST API, so the method is GET. An example call, using the index created in the Creating an index recipe, is as follows:
            curl -XGET 'http://localhost:9200/myindex/_settings?pretty=true'
  3. The response will be similar to the following:
            { 
              "myindex" : { 
                "settings" : { 
                  "index" : { 
                    "uuid" : "pT65_cn_RHKmg1wPX7BGjw",
                    "number_of_replicas" : "1", 
                    "number_of_shards" : "2", 
                    "version" : { 
                      "created" : "1020099" 
                    } 
                  } 
                } 
              } 
            } 
    
  4. The attributes in the response depend on which settings are set on the index. In this case, the response contains the number of replicas (1), the number of shards (2), and the version of Elasticsearch that created the index (1020099). The uuid value is the unique ID of the index.
  5. To modify the index settings, we need to use the PUT method. A typical settings change is to increase the number of replicas:
            curl -XPUT 'http://localhost:9200/myindex/_settings' -d '
            {"index":{ "number_of_replicas": "2"}}'
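
To check a single value after a change like the one in step 5, you can ask for the settings in flat form and filter the response. The following is only a sketch: it assumes the flat_settings=true query parameter is available on your version, that values come back as quoted strings as in the sample response above, and the helper name setting_from_flat is made up for this example. A real JSON parser is more robust than sed if you have one installed.

```shell
# setting_from_flat KEY
# Reads a flat-settings JSON response on stdin and prints the string value
# stored under KEY (for example, index.number_of_replicas).
# Hypothetical helper: a crude sed filter, not a JSON parser.
setting_from_flat() {
  # Escape the dots in KEY so that sed treats them literally.
  key_re=$(printf '%s' "$1" | sed 's/\./\\./g')
  sed -n "s/.*\"$key_re\" *: *\"\([^\"]*\)\".*/\1/p"
}

# Usage against a running node (host and index name as in the recipe):
#   curl -s 'http://localhost:9200/myindex/_settings?flat_settings=true' \
#     | setting_from_flat index.number_of_replicas
```

The helper only reads from standard input, so you can also try it on a saved response before pointing it at a live cluster.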
    

How it works...

Elasticsearch provides many options to tune index behavior, such as:

  • Replica management:
    • index.number_of_replicas: The number of replicas each shard has
    • index.auto_expand_replicas: This allows you to define a dynamic number of replicas that follows the number of data nodes in the cluster

    Tip

    Setting index.auto_expand_replicas to 0-all creates an index that is replicated on every node (very useful for settings or cluster-wide data such as language options or stopwords).

  • Refresh interval (default 1s): In the Refreshing an index recipe, we saw how to manually refresh an index. The index.refresh_interval setting controls the rate of automatic refresh.
  • Write management: Elasticsearch provides several settings to block read and write operations on the index and to block changes to its metadata. They live in the index.blocks.* namespace.
  • Shard allocation management: These settings control how the shards are allocated. They live in the index.routing.allocation.* namespace.
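
As an illustration of the groups of settings listed above, the following sketch prepares one request body for each of them. The host, the index name, the update_settings helper, and the rack attribute with its rack1 value are assumptions made for this example. The Content-Type header is included because recent Elasticsearch versions require it; older versions accept the request without it.

```shell
# Assumed endpoint and index name; change them to match your cluster.
ES_URL='http://localhost:9200'
INDEX='myindex'

# Replica management: let the replica count follow the number of data nodes.
AUTO_EXPAND='{"index":{"auto_expand_replicas":"0-all"}}'

# Write management: reject writes while keeping the index readable.
BLOCK_WRITES='{"index":{"blocks":{"write":true}}}'

# Shard allocation management: keep shards on nodes whose custom "rack"
# attribute equals "rack1" (attribute name and value are hypothetical).
REQUIRE_RACK='{"index":{"routing":{"allocation":{"require":{"rack":"rack1"}}}}}'

# update_settings BODY
# Sends BODY to the settings endpoint of the index with the PUT method.
update_settings() {
  curl -s -XPUT "$ES_URL/$INDEX/_settings" \
       -H 'Content-Type: application/json' -d "$1"
}

# Usage against a running node:
#   update_settings "$AUTO_EXPAND"
#   update_settings "$BLOCK_WRITES"
#   update_settings "$REQUIRE_RACK"
```

Keeping the bodies in variables makes it easy to review them before anything is sent to the cluster.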

There are other index settings that can be configured for very specific needs. In every new version of Elasticsearch, the community extends these settings to cover new scenarios and requirements.

There's more...

The refresh_interval parameter allows several tricks to optimize the indexing speed. It controls the rate of automatic refresh, and each refresh reduces indexing performance because files must be opened and closed. A good practice is to disable the refresh interval (set it to -1) during a big bulk indexing operation and restore the default behavior afterward. This can be done with these steps:

  1. Disable the refresh:
            curl -XPUT 'http://localhost:9200/myindex/_settings' -d '
            {"index":{"refresh_interval": "-1"}}'
    
  2. Bulk index millions of documents.
  3. Restore the refresh:
            curl -XPUT 'http://localhost:9200/myindex/_settings' -d '
            {"index":{"refresh_interval": "1s"}}'
    
  4. Optionally, you can optimize the index for search performance (from Elasticsearch 2.1 onward, this endpoint is named _forcemerge instead of _optimize):
            curl -XPOST 'http://localhost:9200/myindex/_optimize'
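
The disable, load, and restore sequence can be wrapped in a small script so that the refresh interval is restored even when the bulk request fails. This is a minimal sketch under stated assumptions: the helper names, the host, and the index name are hypothetical, the bulk file must already be in the format expected by the bulk API of your version, and the Content-Type headers are only needed on recent versions.

```shell
# Assumed endpoint and index name; change them to match your cluster.
ES_URL='http://localhost:9200'
INDEX='myindex'

# refresh_body VALUE
# Prints the settings body that sets index.refresh_interval to VALUE.
refresh_body() {
  printf '{"index":{"refresh_interval":"%s"}}' "$1"
}

# set_refresh VALUE
# Applies the refresh interval to the index with a PUT on _settings.
set_refresh() {
  curl -s -XPUT "$ES_URL/$INDEX/_settings" \
       -H 'Content-Type: application/json' -d "$(refresh_body "$1")"
}

# bulk_load FILE
# Disables refresh, sends FILE to the bulk endpoint, then restores 1s.
bulk_load() {
  set_refresh -1 || return 1
  curl -s -XPOST "$ES_URL/$INDEX/_bulk" \
       -H 'Content-Type: application/x-ndjson' --data-binary "@$1"
  status=$?
  set_refresh 1s    # restore the default even if the bulk request failed
  return "$status"
}

# Usage against a running node (the file name is hypothetical):
#   bulk_load docs.ndjson
```

Note that curl reports only transport errors through its exit status here; inspect the bulk response itself to detect documents that were rejected.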
    

See also

  • In this chapter, refer to the Refreshing an index recipe to make recently indexed data searchable and the ForceMerge an index recipe to optimize an index for searching.