Time for action – searching for incomplete terms with the wildcard query

Now that we have a basic idea of the Lucene query syntax, we can extend it with some more useful tricks. One of the most common use cases is searching for partial or incomplete terms. For example, we could take a partial term typed by the user and return the documents containing any term that includes it as a substring.

Let's try to match terms such as real, realism, or surrealism using only the fragment real, in the following combinations:

>> curl -X GET 'http://localhost:8983/solr/paintings/select?q=abstract:real*&wt=xml'
>> curl -X GET 'http://localhost:8983/solr/paintings/select?q=abstract:realis&wt=xml'
>> curl -X GET 'http://localhost:8983/solr/paintings/select?q=abstract:*real&wt=xml'
>> curl -X GET 'http://localhost:8983/solr/paintings/select?q=abstract:*real*&wt=xml'

What just happened?

In the first example, we ask for documents containing terms that start with the substring real, as denoted by the trailing * wildcard. We can therefore expect to find documents with terms such as real, realism, or reality. Notice that the * wildcard also matches zero characters, so the bare term real is matched too.

The second query should give us no results. Without a wildcard, an exact match for the term realis is searched, and since this term is not present in our data, nothing is found. Notice that if we had used the term real in this example, all the documents containing the word real itself would have been returned, so be careful when evaluating the results.

The last two queries are included for symmetry. We want to check whether it's possible to match the final part of a word (*real), or even a fragment that occurs anywhere inside a word (*real*).

Notice that this test is not so trivial, because in older versions of Solr the only way to achieve this was to index a reversed version of the same field values.
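To compare the four patterns side by side, we can simulate them locally. The following is only a sketch: it does not reproduce how Solr evaluates wildcards internally, the list of terms is a hypothetical stand-in for what the abstract field might contain, and the helper names wildcardToRegExp and matchingTerms are invented for this illustration.

```javascript
// Hypothetical indexed terms; a real index depends on the field's analysis chain.
const terms = ["real", "realism", "reality", "surreal", "surrealism", "abstract"];

// Translate a wildcard pattern into an anchored regular expression:
// * stands for zero or more characters, ? for exactly one character.
function wildcardToRegExp(pattern) {
  // Escape regex metacharacters other than * and ?.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const source = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp("^" + source + "$");
}

// Return the candidate terms that the whole pattern matches.
function matchingTerms(pattern, candidates) {
  const re = wildcardToRegExp(pattern);
  return candidates.filter((term) => re.test(term));
}

// The same four patterns used in the curl examples.
for (const pattern of ["real*", "realis", "*real", "*real*"]) {
  console.log(pattern, "->", matchingTerms(pattern, terms).join(", "));
}
```

With this sample list, real* matches real, realism, and reality; realis matches nothing; *real matches real and surreal; and *real* matches every term that contains real.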

If you build your queries programmatically in JavaScript, you can use the wildcard search to create a very simple auto-completion service. In this case, remember to add a short timeout so that you avoid sending a request for every character typed.
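A minimal sketch of this idea follows. It assumes the same core and field as the curl examples (paintings and abstract) and requests wt=json instead of wt=xml for easier handling in JavaScript. The function names buildSuggestUrl and debounce, the 300 ms delay, and the search element ID are hypothetical choices made for the illustration. The input is reduced to letters and digits as a deliberate simplification, so that characters with a special meaning in the query syntax never reach Solr.

```javascript
// Build the select URL for a prefix query on the abstract field.
function buildSuggestUrl(prefix) {
  // Simplification: keep only letters and digits from the user's input.
  const safe = prefix.replace(/[^\p{L}\p{N}]/gu, "");
  const q = "abstract:" + safe + "*";
  return "http://localhost:8983/solr/paintings/select?q=" +
    encodeURIComponent(q) + "&wt=json";
}

// Delay calls to fn until delayMs have passed without a new call.
// The timers argument exists only so the scheduling can be replaced in tests.
function debounce(fn, delayMs, timers = {
  set: (callback, ms) => setTimeout(callback, ms),
  clear: (handle) => clearTimeout(handle),
}) {
  let handle = null;
  return (...args) => {
    if (handle !== null) {
      timers.clear(handle); // a newer keystroke cancels the pending request
    }
    handle = timers.set(() => {
      handle = null;
      fn(...args);
    }, delayMs);
  };
}

// Browser usage, assuming an <input id="search"> element and the fetch API:
// const suggest = debounce((text) => {
//   fetch(buildSuggestUrl(text))
//     .then((response) => response.json())
//     .then((data) => console.log(data.response.docs));
// }, 300);
// document.getElementById("search")
//   .addEventListener("input", (event) => suggest(event.target.value));
```

With this in place, typing r, re, and rea in quick succession sends a single request for rea* once the user pauses, instead of three separate requests.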
