The power of the pipeline definition is that it can be created and updated without a node restart (in contrast to Logstash). The definition is stored in the cluster state via the put pipeline API.
After defining a pipeline, we need to store it in the Elasticsearch cluster.
You need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.
To execute curl via the command line, you need to install curl for your operating system.
To store or update an ingestion pipeline in Elasticsearch, we will perform the following steps:
1. We store the ingest pipeline via a PUT call:

curl -XPUT 'http://127.0.0.1:9200/_ingest/pipeline/add-user-john' -d '{
  "description" : "Add user john field",
  "processors" : [
    {
      "set" : {
        "field": "user",
        "value": "john"
      }
    }
  ],
  "version": 1
}'

2. The result returned by Elasticsearch should be:

{"acknowledged":true}
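When a pipeline body grows beyond a couple of processors, hand-writing the JSON string passed to curl's -d flag becomes error-prone. A minimal sketch in Python of building the same definition programmatically and serializing it to the body expected by the put pipeline API (any HTTP client can then send it; the variable names here are illustrative):

```python
import json

# Build the same pipeline definition as a Python structure instead of
# hand-writing the JSON string passed to curl's -d flag.
pipeline = {
    "description": "Add user john field",
    "processors": [
        {"set": {"field": "user", "value": "john"}}
    ],
    "version": 1,
}

# Serialize to the JSON body for the PUT /_ingest/pipeline/add-user-john call.
body = json.dumps(pipeline)
print(body)
```

Building the body this way also makes it easy to validate or version the definition alongside your application code.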
The PUT pipeline method works both for creating a new pipeline and for updating an existing one.
The pipelines are stored in the cluster state and are immediately propagated to all the ingest nodes. When an ingest node receives a new pipeline, it updates its in-memory pipeline representation, so the pipeline changes take effect immediately.
When you store a pipeline in the cluster, be sure to give it a meaningful name (add-user-john in the example) so that its purpose is easy to understand.
The name used in the put call becomes the pipeline's ID, which you will use to reference it in other pipeline flows.
After storing your pipeline in Elasticsearch, you can index a document, providing the pipeline name as a query argument.
For example:
curl -XPUT http://localhost:9200/my_index/my_type/my_id?pipeline=add-user-john -d '{}'
The document will be enriched by the pipeline before being indexed.
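To make the enrichment concrete, here is a sketch in plain Python of the transformation the set processor applies before the document is indexed (apply_set is a hypothetical helper written for illustration, not part of any Elasticsearch client):

```python
def apply_set(document, field, value):
    """Mimic the ingest 'set' processor: add (or overwrite) a field."""
    enriched = dict(document)  # work on a copy, as the processor never mutates the request body
    enriched[field] = value
    return enriched

# The empty document sent in the curl example above...
doc = {}
# ...comes out of the add-user-john pipeline with the user field added.
enriched = apply_set(doc, field="user", value="john")
print(enriched)  # {'user': 'john'}
```

The stored document therefore contains the user field even though the indexing request sent an empty body.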