Many common scenarios involve changing your mapping. Because of a limitation of Elasticsearch mappings, namely that a defined mapping cannot be deleted, you often need to reindex the index data. The most common scenarios are:
You need an up-and-running Elasticsearch installation, as used in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.
To execute curl via the command line, you need to install curl for your operating system.
To correctly execute the following commands, the index created in the Creating an index recipe is required.
The HTTP method to reindex an index is POST. The URL format to reindex is http://<server>/_reindex.
To reindex an index, we will perform the steps given as follows:
If we want to reindex data from the myindex index to the myindex2 index, the call will be:
curl -XPOST 'http://localhost:9200/_reindex?pretty=true' -d '{ "source": { "index": "myindex" }, "dest": { "index": "myindex2" } }'
The result returned by Elasticsearch should be:
{ "took" : 66, "timed_out" : false, "total" : 2, "updated" : 0, "created" : 2, "deleted" : 0, "batches" : 1, "version_conflicts" : 0, "noops" : 0, "retries" : { "bulk" : 0, "search" : 0 }, "throttled_millis" : 0, "requests_per_second" : "unlimited", "throttled_until_millis" : 0, "failures" : [ ] }
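The response can also be checked from a script before the new index is used. The following is a minimal sketch, not part of the recipe: reindex_ok is a hypothetical helper name, and the sample response is a shortened copy of the one shown in this recipe. It performs a crude string check on the timed_out and failures fields; a real JSON parser would be more robust.

```shell
#!/bin/sh
# Hypothetical helper (not part of Elasticsearch or curl): succeeds only when
# a _reindex response reports no timeout and an empty "failures" array.
reindex_ok() {
    # Strip all whitespace so pretty-printed and compact JSON compare equally.
    compact=$(printf '%s' "$1" | tr -d ' \t\r\n')
    case "$compact" in
        *'"timed_out":false'*) ;;
        *) return 1 ;;
    esac
    case "$compact" in
        *'"failures":[]'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Shortened copy of the response shown in this recipe.
response='{ "took" : 66, "timed_out" : false, "total" : 2, "created" : 2, "failures" : [ ] }'

if reindex_ok "$response"; then
    echo "reindex completed without failures"
else
    echo "reindex reported problems" >&2
fi
```

In practice, the response variable would be filled with the output of the curl call, for example by wrapping that call in $(...).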
The reindex functionality introduced in Elasticsearch 5.x provides an efficient way to reindex documents.
In previous Elasticsearch versions, this functionality had to be implemented at the client level. The advantages of the new Elasticsearch implementation are as follows:
At server level, this action is composed of the following steps:
The main parameters that can be provided to this action are:
source: the section that manages how to select source documents. The most important subsections are as follows:
    index, which is the source index to be used. It can also be a list of indices.
    type (optional), which is the source type to be reindexed. It can also be a list of types.
    query (optional), which is an Elasticsearch query used to select a subset of the documents.
    sort (optional), which can be used to provide a way of sorting the documents.
dest: the section that manages how to control the target written documents. The most important parameters in this section are:
    index, which is the target index to be used. If it is not available, it's created.
    version_type (optional); if it is set to external, the external version is preserved.
    routing (optional), which controls the routing in the destination index. It can be:
        keep (the default), which preserves the original routing
        discard, which discards the original routing
        =<text>, which uses the text value for the routing
    pipeline (optional), which allows you to define a custom pipeline for ingestion. We will see more about the ingestion pipeline in Chapter 13, Ingest.
size (optional), the number of documents to be reindexed.
script (optional), which allows you to define a script for document manipulation. This case will be discussed in the Reindex with a custom script recipe in Chapter 9, Scripting.
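Several of the parameters described above can be combined in a single request. The following is a minimal sketch under stated assumptions: the user field and the kimchy value in the query are placeholders that do not come from this recipe, and ES_URL, reindex_body, and run_reindex are illustrative names. It copies at most 100 matching documents from myindex to myindex2, preserves external versions, and discards the original routing.

```shell
#!/bin/sh
# Assumed server address; override by exporting ES_URL before running.
ES_URL="${ES_URL:-http://localhost:9200}"

# Request body combining size, source.query, dest.version_type and dest.routing.
# The "user" field and "kimchy" value are placeholders for illustration only.
reindex_body='{
  "size": 100,
  "source": {
    "index": "myindex",
    "query": { "term": { "user": "kimchy" } }
  },
  "dest": {
    "index": "myindex2",
    "version_type": "external",
    "routing": "discard"
  }
}'

# Defined but not called here, so nothing is sent until you invoke it.
# The Content-Type header is required from Elasticsearch 6.0 onward and is
# harmless on 5.x.
run_reindex() {
    curl -XPOST "$ES_URL/_reindex?pretty=true" \
         -H 'Content-Type: application/json' \
         -d "$reindex_body"
}
```

Calling run_reindex against a running node sends the request. Keeping the body in a variable makes it easy to inspect or adjust before anything is written to the destination index.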