During analysis, the analyzer passes the input string to its tokenizer. To preprocess the input string before it reaches the tokenizer, use a CharFilter. Like token filters, CharFilters can be chained; they are placed in front of the tokenizer and can add, change, or remove characters from the input string. Solr provides the following char filters:
- solr.MappingCharFilterFactory
- solr.HTMLStripCharFilterFactory
- solr.ICUNormalizer2CharFilterFactory
- solr.PatternReplaceCharFilterFactory
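
As a rough illustration of chaining, the sketch below shows how two of these char filters might be declared ahead of a tokenizer in a schema field type. The field type name `text_html_clean`, the digit-stripping pattern, and the choice of tokenizer and token filter are assumptions made for the example, not a recommended configuration.

```xml
<!-- Hypothetical field type; the name and the regex are illustrative only -->
<fieldType name="text_html_clean" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Char filters run in the order declared, before the tokenizer sees the text -->
    <!-- 1. Strip HTML markup from the raw input -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <!-- 2. Remove runs of digits from what is left (assumed pattern) -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="[0-9]+" replacement=""/>
    <!-- The tokenizer receives the output of the last char filter -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Token filters then operate on the tokens, not on the raw string -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because each char filter consumes the output of the one before it, the order matters: in this sketch the pattern replacement sees text from which the HTML has already been stripped.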