Letter tokenizer

The letter tokenizer treats every non-letter character in the input string as a delimiter and discards it, generating one token for each string of contiguous letters.

Factory class: solr.LetterTokenizerFactory

Arguments: None

Example:

<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.LetterTokenizerFactory"/>
  </analyzer>
</fieldType>

Input: I haven't received mail by Nov12Sunday

Output: "I", "haven", "t", "received", "mail", "by", "Nov", "Sunday"

All non-letter characters (the spaces, the apostrophe, and 12) are discarded, and each remaining string of contiguous letters becomes a token. As a result, haven't is split into haven and t, and Nov12Sunday into Nov and Sunday.
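
The splitting rule can be approximated outside Solr with a few lines of Python. This is only an illustrative sketch, not Lucene's implementation: the function name letter_tokenize is hypothetical, and it assumes that Python's str.isalpha is a close enough stand-in for the letter test the tokenizer applies.

```python
from itertools import groupby


def letter_tokenize(text):
    """Approximate the letter tokenizer (hypothetical helper, not a Solr API).

    Characters are grouped into runs by whether they are letters;
    letter runs become tokens and everything else is dropped.
    """
    return [
        "".join(run)
        for is_letter, run in groupby(text, key=str.isalpha)
        if is_letter
    ]


tokens = letter_tokenize("I haven't received mail by Nov12Sunday")
print(tokens)
```

Running the sketch on the example input should produce the same eight tokens listed in the output above, which makes it a quick way to predict how a given string will be split before indexing it.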
