Elasticsearch ships with a large set of built-in ingest processors. Their number and features can change between minor versions as new scenarios are covered.
In this recipe, we will see the most commonly used ones.
You need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.
To execute curl via the command line, you need to install curl for your operating system.
To use several processors in an ingestion pipeline in Elasticsearch, we will perform the following steps:
Execute a simulate pipeline API call that applies several processors to a sample document:
curl -XPOST 'http://127.0.0.1:9200/_ingest/pipeline/_simulate?pretty' -H 'Content-Type: application/json' -d '{
  "pipeline": {
    "description": "Testing some built-in processors",
    "processors": [
      { "dot_expander": { "field": "extfield.innerfield" } },
      { "remove": { "field": "unwanted" } },
      { "trim": { "field": "message" } },
      { "set": { "field": "tokens", "value": "{{message}}" } },
      { "split": { "field": "tokens", "separator": "\\s+" } },
      { "sort": { "field": "tokens", "order": "desc" } },
      { "convert": { "field": "mynumbertext", "target_field": "mynumber", "type": "integer" } }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_type": "type",
      "_id": "1",
      "_source": {
        "extfield.innerfield": "booo",
        "unwanted": 32243,
        "message": " 155.2.124.3 GET /index.html 15442 0.038 ",
        "mynumbertext": "3123"
      }
    }
  ]
}'
The result will be as follows:
{
  "docs" : [
    {
      "doc" : {
        "_index" : "index",
        "_type" : "type",
        "_id" : "1",
        "_source" : {
          "mynumbertext" : "3123",
          "extfield" : { "innerfield" : "booo" },
          "tokens" : [ "GET", "155.2.124.3", "15442", "0.038", "/index.html" ],
          "message" : "155.2.124.3 GET /index.html 15442 0.038",
          "mynumber" : 3123
        },
        "_ingest" : { "timestamp" : "2016-12-10T16:49:40.875+0000" }
      }
    }
  ]
}
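To see why the response looks the way it does, it can help to replay the same chain of steps outside Elasticsearch. The following is a minimal Python sketch, not Elasticsearch code: the `simulate` function is a hypothetical local stand-in that imitates each processor of the pipeline in order, assuming plain string ordering for the sort step.

```python
import re


def simulate(source):
    """Imitate the processors of the sample pipeline, in order (local stand-in)."""
    doc = dict(source)

    # dot_expander: turn the dotted key into a nested object
    outer, inner = "extfield.innerfield".split(".")
    doc[outer] = {inner: doc.pop("extfield.innerfield")}

    # remove: drop the unwanted field
    doc.pop("unwanted")

    # trim: strip leading and trailing whitespace from message
    doc["message"] = doc["message"].strip()

    # set: copy the (already trimmed) message into tokens
    doc["tokens"] = doc["message"]

    # split: break tokens on runs of whitespace
    doc["tokens"] = re.split(r"\s+", doc["tokens"])

    # sort: order the tokens in descending string order
    doc["tokens"] = sorted(doc["tokens"], reverse=True)

    # convert: parse the text number into an integer in a new field
    doc["mynumber"] = int(doc["mynumbertext"])
    return doc


source = {
    "extfield.innerfield": "booo",
    "unwanted": 32243,
    "message": " 155.2.124.3 GET /index.html 15442 0.038 ",
    "mynumbertext": "3123",
}
result = simulate(source)
print(result["tokens"])
```

The order of the steps matters: because trim runs before set, the tokens field is built from the cleaned message, so the split step produces no empty tokens.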
The preceding example shows how to build a complex pipeline to pre-process a document. There are many built-in processors that cover the most common scenarios in log and text processing. More complex manipulations can be done via scripting.
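As an illustration of the scripting route, the sketch below builds a simulate request body that uses the script processor to count tokens after a split. The field name `token_count`, the sample document, and the Painless one-liner are assumptions made for this example, and the body has not been run against a live cluster. Recent releases expect the script text under `source`; the 5.x releases used `inline` instead.

```python
import json

# Hypothetical pipeline: split a field, then count the pieces with a script.
pipeline = {
    "description": "Count tokens with a script processor (illustrative)",
    "processors": [
        {"split": {"field": "tokens", "separator": "\\s+"}},
        {
            "script": {
                "lang": "painless",
                # Assumed Painless snippet; ctx exposes the document fields.
                "source": "ctx.token_count = ctx.tokens.size()",
            }
        },
    ],
}

# Sample document invented for this example.
body = {
    "pipeline": pipeline,
    "docs": [{"_source": {"tokens": "GET /index.html 15442"}}],
}

# Serialize the body so it can be passed to curl with -d.
payload = json.dumps(body, indent=2)
print(payload)
```

The printed JSON can be sent to the same `_ingest/pipeline/_simulate` endpoint used earlier in this recipe.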
At the time of writing, Elasticsearch provides the following built-in processors:
| Name | Description |
| --- | --- |
| Append | Appends values to a field. If required, it converts them into an array. |
| Convert | Converts a field value to a different type. |
| Date | Parses a date and uses it as a timestamp for the document. |
| Date Index Name | Allows us to set the index name based on a date field. |
| Fail | Raises a failure. |
| Foreach | Processes each element of an array with the provided processor. |
| Grok | Applies grok pattern extraction. |
| Gsub | Executes a regular expression replacement on a field. |
| Join | Joins an array of values using a separator. |
| JSON | Converts a JSON string to a JSON object. |
| Lowercase | Lowercases a field. |
| Remove | Removes a field. |
| Rename | Renames a field. |
| Script | Allows us to execute a script. |
| Set | Sets the value of a field. |
| Split | Splits a field into an array using a regular expression. |
| Sort | Sorts the values of an array field. |
| Trim | Trims whitespace from a field. |
| Uppercase | Uppercases a field. |
| Dot expander | Expands a field with dots in its name into an object field. |
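Several processors from the table that the recipe did not exercise (rename, lowercase, gsub, append, and join) can be tried the same way. The following Python sketch prepares a simulate request with the standard library; the helper `build_simulate_request`, the sample document, and the field names `msg` and `tags` are assumptions for illustration only. The request is built but not sent, so the snippet runs without a cluster.

```python
import json
import urllib.request


def build_simulate_request(pipeline, docs, base_url="http://127.0.0.1:9200"):
    """Prepare (but do not send) a POST to the simulate pipeline endpoint."""
    body = json.dumps({"pipeline": pipeline, "docs": docs}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/_ingest/pipeline/_simulate?pretty",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical pipeline using processors listed in the table.
pipeline = {
    "description": "Trying more built-in processors (illustrative)",
    "processors": [
        {"rename": {"field": "msg", "target_field": "message"}},
        {"lowercase": {"field": "message"}},
        {"gsub": {"field": "message", "pattern": "\\d+", "replacement": "#"}},
        {"append": {"field": "tags", "value": ["ingested"]}},
        {"join": {"field": "tags", "separator": ","}},
    ],
}

# Sample document invented for this example.
docs = [{"_source": {"msg": "Error 404 On /Index.html", "tags": ["web"]}}]

request = build_simulate_request(pipeline, docs)
print(request.get_method(), request.full_url)

# With a node running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect the JSON body before it reaches Elasticsearch.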