ElasticSearch allows updating a document in-place.
Updating a document via scripting reduces network traffic (otherwise, you need to fetch the document, change the field, and send it back) and improves performance when you need to process a large number of documents.
You need a working ElasticSearch cluster and an index populated with the script used for facet processing, available in the online code.
To update a document using scripting, we will perform the following steps:
Write an update action that adds a tag to a document:
curl -XPOST 'http://127.0.0.1:9200/test-index/test-type/9/_update?&pretty=true' -d '{ "script" : "ctx._source.tag += tag", "params" : { "tag" : "cool" } }'
If the update succeeds, ElasticSearch returns a result such as the following:
{ "ok" : true, "_index" : "test-index", "_type" : "test-type", "_id" : "9", "_version" : 2 }
The REST HTTP method used to update a document is POST.
The URL contains only the index name, the type, and the document ID, as follows:
http://<server>/<index_name>/<type>/<document_id>/_update
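As a minimal sketch of the URL layout above, the following Python helper assembles the update URL and the JSON body without sending anything. The function name build_update_request and its parameters are hypothetical, chosen only for illustration; the index, type, ID, and script values are the ones used in this recipe.

```python
import json


def build_update_request(server, index_name, doc_type, doc_id,
                         script, params=None, lang=None):
    """Return (url, body) for a scripted update; nothing is sent."""
    # URL layout: http://<server>/<index_name>/<type>/<document_id>/_update
    url = "http://{0}/{1}/{2}/{3}/_update".format(
        server, index_name, doc_type, doc_id)
    payload = {"script": script}
    if lang is not None:
        payload["lang"] = lang
    if params:
        payload["params"] = params
    # json.dumps escapes any double quotes that appear inside the script
    return url, json.dumps(payload)


url, body = build_update_request(
    "127.0.0.1:9200", "test-index", "test-type", 9,
    "ctx._source.tag += tag", {"tag": "cool"})
print(url)
print(body)
```

Building the body with a JSON library rather than by hand avoids quoting mistakes when the script itself contains string literals.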
The update action is composed of three different steps:
The script execution follows the workflow in the following manner:
The document data is available in the ctx variable in the script.
The update script can set several parameters in the ctx variable. The most important parameters are:
ctx._source: This contains the source of the document
ctx._timestamp: If it's defined, this value is set to the document timestamp
ctx.op: This defines the main operation type to be executed. There are several available values, such as:
    index: The default value if nothing is defined; the record is re-indexed with the updated values
    delete: The document is deleted after the update
    none: The document is skipped without being re-indexed
The previous example can be rewritten using the JavaScript language, as shown in the following code:
curl -XPOST 'http://127.0.0.1:9200/test-index/test-type/9/_update?&pretty=true' -d '{ "script" : "ctx._source.tag += tag", "lang":"js", "params" : { "tag" : "cool" } }'
The previous example can be written using the Python language, as follows:
curl -XPOST 'http://127.0.0.1:9200/test-index/test-type/9/_update?&pretty=true' -d '{ "script" : "ctx[\"_source\"][\"tag\"] = list(ctx[\"_source\"][\"tag\"]) + [tag]", "lang":"python", "params" : { "tag" : "cool" } }'
In the Python example, the Java list must be converted into a Python list to allow elements to be added; the conversion back is done automatically. Note that the double quotes inside the script must be escaped so that the request body remains valid JSON.
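To make the ctx.op values described earlier more concrete, here is a toy in-memory model written in plain Python. It is only a sketch and not ElasticSearch code: the store dictionary, the apply_update function, and the three sample scripts are hypothetical, and the version increment on re-indexing is an assumption based on the _version value in the response shown earlier.

```python
import copy


def apply_update(store, doc_id, script, params):
    """Toy model of a scripted update: fetch, run the script, act on ctx['op']."""
    record = store[doc_id]
    # The script works on a copy of the source; "index" is the default operation
    ctx = {"_source": copy.deepcopy(record["_source"]), "op": "index"}
    script(ctx, params)
    if ctx["op"] == "index":
        # Re-index the modified source (version bump is an assumption)
        store[doc_id] = {"_source": ctx["_source"],
                         "_version": record["_version"] + 1}
    elif ctx["op"] == "delete":
        del store[doc_id]
    elif ctx["op"] != "none":
        raise ValueError("unknown operation: %r" % ctx["op"])
    # With "none", the stored record is left untouched
    return ctx["op"]


def add_tag(ctx, params):
    # Same idea as the Python update script in this recipe
    ctx["_source"]["tag"] = list(ctx["_source"]["tag"]) + [params["tag"]]


def skip(ctx, params):
    ctx["op"] = "none"


def remove(ctx, params):
    ctx["op"] = "delete"


store = {"9": {"_source": {"tag": ["nice"]}, "_version": 1}}
print(apply_update(store, "9", add_tag, {"tag": "cool"}), store)
print(apply_update(store, "9", skip, {}), store)
print(apply_update(store, "9", remove, {}), store)
```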
In the following example, we will execute an update that adds new tags and labels to an object, but we will mark the document for indexing only if the tags or labels values are changed.
curl -XPOST 'http://127.0.0.1:9200/test-index/test-type/9/_update?&pretty=true' -d '{ "script" : "ctx.op = \"none\"; if(ctx._source.containsKey(\"tags\")){ foreach(item:new_tags){ if(!ctx._source.tags.contains(item)){ ctx._source.tags += item; ctx.op = \"index\"; } } }else{ ctx._source.tags=new_tags; ctx.op = \"index\"; }; if(ctx._source.containsKey(\"labels\")){ foreach(item:new_labels){ if(!ctx._source.labels.contains(item)){ ctx._source.labels += item; ctx.op = \"index\"; } } }else{ ctx._source.labels=new_labels; ctx.op = \"index\"; };", "params" : { "new_tags" : ["cool", "nice"], "new_labels" : ["red", "blue", "green"] } }'
The preceding script uses the following steps:
It sets the operation to none to prevent indexing if, in the following steps, the original source is not changed.
It checks whether the tags field is available in the source object.
If the tags field is available in the source object, it iterates over all the values of the new_tags list. If a value is not present in the current tags list, it adds it and sets the operation to index.
If the tags field doesn't exist in the source object, it simply adds it to the source and sets the operation to index.
The same checks are repeated for the labels value. The repetition is present in this example to show the ElasticSearch user how it is possible to update multiple values in a single update operation.
This script is quite complex, but it shows the powerful capabilities of scripting in ElasticSearch.
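The logic of the tags and labels script can also be tried locally. The following plain Python sketch mirrors its steps on an ordinary dictionary; the names merge_values and update_script are hypothetical helpers introduced for illustration, and the code runs outside ElasticSearch.

```python
def merge_values(source, field, new_values):
    """Add missing values to source[field]; return True if source changed."""
    if field not in source:
        # Field is absent: add it as-is and report a change
        source[field] = list(new_values)
        return True
    changed = False
    for item in new_values:
        if item not in source[field]:
            source[field].append(item)
            changed = True
    return changed


def update_script(ctx, params):
    # Start from "none" so an unchanged document is not re-indexed
    ctx["op"] = "none"
    # Evaluate both merges first; chaining the calls with `or` would
    # short-circuit and skip the labels update when tags changed
    tags_changed = merge_values(ctx["_source"], "tags", params["new_tags"])
    labels_changed = merge_values(ctx["_source"], "labels", params["new_labels"])
    if tags_changed or labels_changed:
        ctx["op"] = "index"


params = {"new_tags": ["cool", "nice"],
          "new_labels": ["red", "blue", "green"]}
ctx = {"_source": {"tags": ["cool"]}}

update_script(ctx, params)
print(ctx["op"], ctx["_source"])   # first run changes the source

update_script(ctx, params)
print(ctx["op"])                   # second run finds nothing to add
```

Running the script twice with the same parameters shows the point of the none operation: the second pass leaves the source untouched, so no re-indexing is requested.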