This filter prevents stemmers from modifying the words listed in a protected words file such as protwords.txt. Each word in the list is marked as a keyword, and any stemmer that comes later in the analysis chain leaves it unchanged.
Arguments:
- protected: The path of the file that contains the protected words, one word per line
The following is a sample protwords.txt file:
removing
transforming
Example:
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
</fieldType>
Input: remove removing removed transforming
Tokenizer to filter: remove, removing, removed, transforming
Output: remove, removing, remove, transforming
Here, the words removing and transforming are not stemmed by KStemFilterFactory because they are listed in protwords.txt and are therefore protected by KeywordMarkerFilterFactory. The unprotected word removed is still stemmed to remove.
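The protected list is compared against tokens exactly as the tokenizer emits them, so with WhitespaceTokenizerFactory a capitalized token such as Removing would not match the lowercase entry removing. The following is a minimal sketch of one way to handle this, assuming the factory's ignoreCase attribute is available in your Solr version. The field type name text_en_protected is hypothetical and used only for illustration:

```xml
<!-- Hypothetical field type name, for illustration only -->
<fieldType name="text_en_protected" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Assumption: ignoreCase="true" makes matching against protwords.txt case-insensitive -->
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" ignoreCase="true"/>
    <!-- The stemmer must come after the keyword marker so it can see which tokens are protected -->
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
</fieldType>
```

An alternative is to place solr.LowerCaseFilterFactory before the keyword marker filter. That approach also lowercases the indexed tokens, whereas ignoreCase only affects how tokens are matched against the protected list.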