Spellchecking

We have seen that Solr offers rich support for searching: a robust index-building mechanism, flexible search configuration, and the ability to shape results into the expected format by applying various transformation steps to the query output. Spellchecking is a useful Solr feature for users who mistype a query or enter an incorrect or inappropriate term. Most of us have seen this when searching on Google: enter sokcer, and Google offers a hint, Did you mean: soccer? Sometimes, typing socer shows results for soccer directly, without displaying any hint.

Likewise, there are a few scenarios in which we need to handle the input terms carefully:

  • If a user enters a misspelled search term and no matching document is available, we can use the Solr spellcheck feature to search for the corrected term and display a message saying that results are shown for soccer instead of socer. This gives the user a hassle-free search experience without having to worry much about spelling.
  • If a user's search terms match only a few documents and a suggested term matches more documents, we can prompt the user with a message such as Did you mean xxxxx. If the suggested term matches the same number of documents as the query terms, or fewer, no message should be shown.
  • When the index contains nothing that matches or resembles the entered search terms, no suggestions should be given to the user.
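The three scenarios above can be sketched as a small decision function. This is only an illustration of the rules, not part of Solr: the function name `did_you_mean` and its parameters are hypothetical, and it assumes the application already knows the hit counts for the original query and for the suggested term.

```python
from typing import Optional


def did_you_mean(original_hits: int,
                 suggestion: Optional[str],
                 suggestion_hits: int) -> Optional[str]:
    """Return a hint message, or None when no hint should be shown.

    Hypothetical helper: the hit counts are assumed to come from the
    search response and from the spellchecker's suggestion.
    """
    # Third scenario: nothing in the index resembles the query, so stay silent.
    if suggestion is None:
        return None
    # First and second scenarios: hint only when the suggested term
    # matches strictly more documents than the original query did.
    if suggestion_hits > original_hits:
        return f"Did you mean: {suggestion}?"
    # Same or fewer matching documents: no message.
    return None


print(did_you_mean(0, "soccer", 12))   # misspelled term, no hits
print(did_you_mean(5, "soccer", 5))    # suggestion is no better
print(did_you_mean(0, None, 0))        # no suggestion available
```

The first call returns a hint and the other two return None, matching the three bullets in order.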

To take advantage of the Solr spellchecking feature, we need to tell the request handler to check spelling during processing. Here is the configuration of the Solr default request handler to enable spellchecking while processing a request:

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <!-- Dictionaries to consult; the parameter may be repeated -->
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.dictionary">wordbreak</str>
    <!-- Enable spellchecking for every request to this handler -->
    <str name="spellcheck">on</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">10</str>
    <str name="spellcheck.alternativeTermCount">5</str>
    <str name="spellcheck.maxResultsForSuggest">5</str>
    <!-- Build corrected queries (collations) from the suggestions -->
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck.maxCollationTries">10</str>
    <str name="spellcheck.maxCollations">5</str>
  </lst>
  <!-- Run the spellcheck component after the default search components -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

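The preceding configuration is sufficient to enable spellchecking for every query processed through this request handler, provided that the spellcheck search component and the default and wordbreak dictionaries it refers to are defined in solrconfig.xml. For example, consider searching for cemera instead of camera.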

URL: http://localhost:8983/solr/techproducts/select?q=cemera:

{
  "responseHeader":{
    "status":0,
    "QTime":7,
    "params":{
      "q":"cemera"}},
  "response":{"numFound":0,"start":0,"docs":[]
  },
  "spellcheck":{
    "suggestions":[
      "cemera",{
        "numFound":1,
        "startOffset":0,
        "endOffset":6,
        "origFreq":0,
        "suggestion":[{
            "word":"camera",
            "freq":1}]}],
    "correctlySpelled":false,
    "collations":[
      "collation",{
        "collationQuery":"camera",
        "hits":1,
        "misspellingsAndCorrections":[
          "cemera","camera"]}]}
}

From the preceding response, we can see that Solr returns a spellcheck section containing the suggested words and the collated (corrected) query. We searched for a misspelled word (cemera), and the response returns the correct spelling (camera) as the spellcheck suggestion.
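To show how a client might read this response, here is a minimal sketch in Python. It assumes the flat-list JSON layout shown above, where `suggestions` and `collations` alternate a label and a detail object; other Solr versions or `json.nl` settings may lay these out differently. The helper names `pairs`, `suggested_words`, and `best_collation` are hypothetical.

```python
import json

# Trimmed copy of the response shown above (responseHeader omitted).
SAMPLE = json.loads("""
{
  "response": {"numFound": 0, "start": 0, "docs": []},
  "spellcheck": {
    "suggestions": [
      "cemera", {"numFound": 1, "startOffset": 0, "endOffset": 6,
                 "origFreq": 0,
                 "suggestion": [{"word": "camera", "freq": 1}]}],
    "correctlySpelled": false,
    "collations": [
      "collation", {"collationQuery": "camera", "hits": 1,
                    "misspellingsAndCorrections": ["cemera", "camera"]}]}
}
""")


def pairs(flat):
    """Turn a flat [key, value, key, value, ...] list into (key, value) pairs."""
    return list(zip(flat[0::2], flat[1::2]))


def suggested_words(response):
    """Map each misspelled term to its list of suggested words."""
    raw = response.get("spellcheck", {}).get("suggestions", [])
    return {term: [s["word"] for s in info["suggestion"]]
            for term, info in pairs(raw)}


def best_collation(response):
    """Return (collationQuery, hits) for the first collation, or None."""
    raw = response.get("spellcheck", {}).get("collations", [])
    for label, detail in pairs(raw):
        if label == "collation":
            return detail["collationQuery"], detail["hits"]
    return None


print(suggested_words(SAMPLE))  # {'cemera': ['camera']}
print(best_collation(SAMPLE))   # ('camera', 1)
```

The `hits` value of the collation is what an application would compare against `numFound` of the original query before deciding to show a hint.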

Because these parameters are set as defaults on the request handler, you don't need to add any spellchecking-related parameters to the query string. If you want to disable spellchecking for a specific query, pass spellcheck=false with that request. For example:

URL: http://localhost:8983/solr/techproducts/select?q=cemera&spellcheck=false:

{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "q":"cemera",
      "spellcheck":"false"}},
  "response":{"numFound":0,"start":0,"docs":[]
  }}

Spellchecking is not executed for this query, even though we searched for a misspelled word.
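A client can apply this per-request override when it builds the query URL. The following is a minimal sketch using only the Python standard library; the base URL is taken from the examples in this section, and the helper name `select_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL, copied from the examples in this section.
SELECT_URL = "http://localhost:8983/solr/techproducts/select"


def select_url(query, spellcheck=None):
    """Build a /select URL.

    Leave spellcheck as None to keep the handler default, or pass
    False (or True) to override it for this request only.
    """
    params = {"q": query}
    if spellcheck is not None:
        # The flag travels as a string in the query string.
        params["spellcheck"] = "true" if spellcheck else "false"
    return f"{SELECT_URL}?{urlencode(params)}"


print(select_url("cemera"))
print(select_url("cemera", spellcheck=False))
```

The first call produces the URL used in the earlier example, and the second appends spellcheck=false as in the request above.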

It is advisable to run spellchecking last, because we still want the default search components, such as query, facet, and debug, to execute first during query processing. This is done by listing spellcheck under last-components in the request handler, as in the preceding configuration. Setting spellcheck.collate=true does not change this order by itself; it tells Solr to combine the best suggestions into a complete corrected query (the collation), which relies on the original query having already been executed.
