Elasticsearch is built on Lucene, which stores data in segments on disk. Over the life of an index, many segments are created and changed. As the number of segments grows, search slows down because every segment has to be read. The ForceMerge operation consolidates the index into fewer segments, which improves search performance.
You need an up-and-running Elasticsearch installation, as used in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.
To execute curl via the command line, you need to install curl for your operating system.
To correctly execute the following commands, use the index created in the Creating an index recipe.
The HTTP method used is POST. The URL format for force merging one or more indices is:
http://<server>/<index_name(s)>/_forcemerge
The URL format for force merging all the indices in a cluster is:
http://<server>/_forcemerge
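As a minimal sketch of how these URL formats are assembled, the following Python helper builds the `_forcemerge` endpoint used by this recipe for one index, several indices, or the whole cluster. The function name `forcemerge_url` and the default server address are assumptions made for illustration; they are not part of Elasticsearch.

```python
def forcemerge_url(indices=None, server="http://localhost:9200"):
    """Build a force merge URL (hypothetical helper, not an Elasticsearch API).

    indices: None for every index in the cluster, a single index name,
             or a list of index names (joined with commas).
    """
    if indices is None:
        # No index in the path: the call applies to all indices.
        return f"{server}/_forcemerge"
    if isinstance(indices, str):
        indices = [indices]
    return f"{server}/{','.join(indices)}/_forcemerge"


print(forcemerge_url("myindex"))
print(forcemerge_url(["index1", "index2"]))
print(forcemerge_url())
```

The resulting string can be passed to curl with `-XPOST`, or to any HTTP client that can issue a POST request.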
To force merge an index, we will perform the following steps:
curl -XPOST 'http://localhost:9200/myindex/_forcemerge'
{ "_shards" : { "total" : 10, "successful" : 5, "failed" : 0 } }
The result contains the shard operation status.
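To make the shard status easier to act on, the response can be parsed in a few lines. The sketch below assumes the response body has exactly the shape shown above; the helper name `shard_summary` and the `ok` field are introduced here for illustration only.

```python
import json


def shard_summary(response_text):
    """Extract the shard counters from a force merge response body."""
    shards = json.loads(response_text)["_shards"]
    return {
        "total": shards["total"],
        "successful": shards["successful"],
        "failed": shards["failed"],
        # Treat the call as successful when no shard reported a failure.
        "ok": shards["failed"] == 0,
    }


sample = '{ "_shards" : { "total" : 10, "successful" : 5, "failed" : 0 } }'
print(shard_summary(sample))
```

Here `successful` is lower than `total` while `failed` is 0. On a single-node installation this typically means that the replica shards are unassigned, not that the merge failed.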
Lucene stores your data in several segments on disk. These segments are created when you index a new document/record or when you delete a document.
In Elasticsearch, a deleted document is not removed from disk but is marked as deleted (a tombstone). To free up space, you need to run ForceMerge to purge the deleted documents.
Due to all these factors, the number of segments can become large. (For this reason, in the setup we increased the file descriptor limit for Elasticsearch processes.)
Internally, Elasticsearch has a merger that tries to reduce the number of segments, but it is designed to improve indexing performance rather than search performance. The ForceMerge operation in Lucene reduces the segments in an I/O-heavy way: it removes unused segments, purges deleted documents, and rebuilds the index with the minimum number of segments.
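The effect of tombstones and merging can be illustrated with a toy model. The code below is not Lucene's merge algorithm: it is a simplified sketch in which a segment is a list of `(doc_id, deleted)` pairs, and the hypothetical function `force_merge` drops the deleted entries and repacks the live documents into at most `max_num_segments` segments.

```python
def force_merge(segments, max_num_segments=1):
    """Toy model of a force merge (illustration only, not Lucene's logic).

    segments: list of segments, each a list of (doc_id, deleted) tuples.
    Returns new segments that contain only live documents.
    """
    # Purge tombstones: keep only documents that are not marked as deleted.
    live = [(doc_id, False)
            for segment in segments
            for doc_id, deleted in segment
            if not deleted]
    if not live:
        return []
    # Ceiling division: documents per merged segment.
    size = -(-len(live) // max_num_segments)
    return [live[i:i + size] for i in range(0, len(live), size)]


segments = [
    [(1, False), (2, True)],   # document 2 is a tombstone
    [(3, False)],
    [(4, True), (5, False)],   # document 4 is a tombstone
]
merged = force_merge(segments)
print(len(segments), "segments before,", len(merged), "after")
print(merged)
```

In this model the three original segments, two of which carry tombstones, collapse into a single segment that holds only documents 1, 3, and 5.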
The main advantages are:
You can pass several additional parameters to the ForceMerge call, such as:
max_num_segments: The default value is autodetect. For full optimization, set this value to 1.
only_expunge_deletes: The default value is false. Lucene does not delete documents from segments, but marks them as deleted. This flag only merges segments that contain deleted documents.
flush: The default value is true. Elasticsearch performs a flush after the force merge.
wait_for_merge: The default value is true. It controls whether the request waits until the merge ends.
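These parameters are passed as query string arguments. The following sketch shows one way to append them to the force merge URL; the helper name `forcemerge_request_url` and the default server address are assumptions for illustration, and whether a given parameter is accepted depends on the Elasticsearch version in use.

```python
from urllib.parse import urlencode


def forcemerge_request_url(index, server="http://localhost:9200", **params):
    """Build a force merge URL with optional query parameters (hypothetical helper)."""
    query = {}
    for name, value in params.items():
        if value is None:
            continue  # skip parameters that were not set
        # Elasticsearch expects lowercase true/false for boolean flags.
        query[name] = str(value).lower() if isinstance(value, bool) else value
    url = f"{server}/{index}/_forcemerge"
    return f"{url}?{urlencode(query)}" if query else url


# Merge down to a single segment.
print(forcemerge_request_url("myindex", max_num_segments=1))
# Only purge deleted documents.
print(forcemerge_request_url("myindex", only_expunge_deletes=True))
```

Building the query string with `urlencode` avoids manual escaping mistakes when the values come from configuration or user input.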