The pattern_replace filter

You can specify a regular expression (for more information, refer to https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html) to match a sequence of characters. When the input string contains such a sequence, the matched text is substituted by the replacement string. To define a pattern_replace character filter, you specify three parameters:

  • The Java regular expression (pattern)
  • The replacement string (replacement)
  • The Java regular expression flag (flags)
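
To make the role of each parameter concrete, here is a minimal sketch in plain Java of how a pattern, a replacement, and a flag combine in java.util.regex. The class name, the helper method, and the sample input are hypothetical and are not part of Elasticsearch. Note also that the filter takes flag names as text (for example, CASE_INSENSITIVE), whereas the Java API below takes the corresponding integer constants.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; it mimics the three
// parameters of the filter with the standard java.util.regex API.
final class PatternReplaceParams {

    static String apply(String input, String pattern, String replacement, int flags) {
        // pattern + flags produce the compiled regular expression;
        // replacement may reference capture groups such as $1.
        return Pattern.compile(pattern, flags)
                      .matcher(input)
                      .replaceAll(replacement);
    }

    public static void main(String[] args) {
        // Made-up input: the flag lets a lowercase pattern match uppercase text.
        String out = apply("I use ELASTICSEARCH", "elasticsearch", "ES",
                           Pattern.CASE_INSENSITIVE);
        System.out.println(out);
    }
}
```

Passing 0 as the flags argument corresponds to setting no flags at all.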

Let's take a look at an example of how to replace the input text, 7.0, with v7. We will apply all three character filters: html_strip, mapping, and pattern_replace. We write a regular expression pattern, (\d+).(\d+), to match the string 7.0. Then we replace the match by using the replacement parameter, which we set to v$1. The symbol $1 denotes the first capture group of the pattern, which is (\d+). In our example, the value captured by that group is 7. Suppose that we use the same HTML input text string as in the previous examples; after applying all three character filters, the output of the filters will be as follows:

"You will love Elasticsearch v7"

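The substitution itself can be checked outside Elasticsearch with the same Java regular expression classes. The following is a minimal sketch, assuming that the text reaching pattern_replace (after html_strip and mapping) is "You will love Elasticsearch 7.0"; that intermediate string and the class and method names are assumptions made for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical class for illustration; it reproduces only the
// pattern_replace step, not html_strip or mapping.
final class VersionReplaceSketch {

    // The pattern (\d+).(\d+) from the text; backslashes are doubled
    // because this is a Java string literal.
    private static final Pattern VERSION = Pattern.compile("(\\d+).(\\d+)");

    static String replaceVersion(String text) {
        // $1 refers to the first capture group, the digits before the dot.
        return VERSION.matcher(text).replaceAll("v$1");
    }

    public static void main(String[] args) {
        // Assumed intermediate text after the first two character filters.
        System.out.println(replaceVersion("You will love Elasticsearch 7.0"));
        // prints: You will love Elasticsearch v7
    }
}
```

Because the dot in the pattern is not escaped, it matches any single character, so a string such as 7x0 would also be rewritten to v7. Writing the pattern as (\d+)\.(\d+) restricts the match to a literal dot.
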
Let's apply all three character filters, the standard tokenizer and the lowercase token filter to the input text. In the following screenshot, you can see that the token has been changed from 7.0 to v7:
