Node upgrades without downtime

To meet SLAs, you need highly available systems. At the same time, you may need to upgrade or downgrade your machines, or upgrade Elasticsearch to a newer release. Both cases call for some best practices, because one wrong step can cause data loss or delay the required changes. In both cases, one thing is certain: nodes must be stopped one at a time. While it's easy to stop a client or master node and perform maintenance tasks, data nodes require special consideration because they might need a longer shard recovery time.

Every time a data node is restarted, Elasticsearch rebalances shards, which can take a long time because of unnecessary data movement and synchronization inside the cluster. To avoid this and recover data nodes faster, follow these steps.

Before stopping a data node, set the shard routing allocation to none with the following command:

PUT _cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.enable" : "none"
  }
}

After the data node has started, set the shard routing allocation back to all with this command:

PUT _cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.enable" : "all"
  }
}
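
The two commands above bracket the restart of a single data node. As a rough illustration, the sequence can be sketched in Python as follows. The function names allocation_settings and restart_plan and the node name data-node-1 are hypothetical, chosen only for this example. The sketch only builds and prints the request bodies; it does not contact a cluster, and stopping and starting the node remains a manual step.

```python
import json

# Setting name and values taken from the two commands shown above.
ALLOCATION_KEY = "cluster.routing.allocation.enable"


def allocation_settings(mode):
    """Build the body for PUT _cluster/settings that sets shard allocation.

    Only the two values used in this section are accepted here.
    """
    if mode not in ("all", "none"):
        raise ValueError("mode must be 'all' or 'none', got %r" % (mode,))
    return {"transient": {ALLOCATION_KEY: mode}}


def restart_plan(node_name):
    """Return the ordered steps for restarting one data node.

    node_name is a hypothetical label used only in the printed plan.
    Each step is a (kind, target, body) tuple; MANUAL steps have no body.
    """
    return [
        ("PUT", "_cluster/settings", allocation_settings("none")),
        ("MANUAL", "stop, upgrade, and start " + node_name, None),
        ("PUT", "_cluster/settings", allocation_settings("all")),
    ]


if __name__ == "__main__":
    # Print the plan so each request can be reviewed before it is run.
    for kind, target, body in restart_plan("data-node-1"):
        print(kind, target)
        if body is not None:
            print(json.dumps(body, indent=2))
```

Keeping the plan as plain data makes it easy to review, or to repeat for each data node in turn, since nodes must be stopped one at a time.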

Elasticsearch can't be downgraded to an earlier release, so take data backups before a version upgrade. Backups and restores are covered in the next chapter.
