It is worth looking closely at the results: the first documents returned for the last two examples contain only the words real or realistic, and only once we introduce the parameter start=10 do we start seeing surrealist in the documents. The reason for this is the difference in the ranking of the documents returned. This will be explained later, but we can also give more importance to some terms over others using the boost options:
>> curl -X GET 'http://localhost:8983/solr/paintings/select?q=abstract:(*real* AND surrealist^2)&wt=json'
Again, I have omitted the encoding for spaces, so please rewrite the appropriate part of the query as *real*%20AND%20surrealist^2.
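The encoding can also be left to a small script instead of being done by hand. The following is a minimal sketch in Python, assuming the same local Solr instance and paintings core as in the curl example. The helper name build_select_url is invented here for illustration. Note that it percent-encodes every reserved character (for example the parentheses and the ^), not only the spaces.

```python
from urllib.parse import quote, urlencode

# Assumed values, taken from the curl example above.
SOLR_BASE = "http://localhost:8983/solr"
CORE = "paintings"


def build_select_url(query, **params):
    """Return a select URL with every parameter percent-encoded."""
    all_params = {"q": query, "wt": "json", **params}
    # quote_via=quote encodes spaces as %20 instead of the default '+'.
    return f"{SOLR_BASE}/{CORE}/select?" + urlencode(all_params, quote_via=quote)


url = build_select_url("abstract:(*real* AND surrealist^2)")
print(url)
```

The printed URL can be passed to curl as is, since it no longer contains raw spaces.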
Imagine we want to give more importance to the documents containing the term surrealist in the results. In our case, this can be achieved by simply adding the boosted search condition with the AND operator. The boost condition is expressed by the surrealist^2 syntax, which tells the query parser to consider an occurrence of the term surrealist to be twice as important as the others. Notice that Solr uses an implicit, hidden score parameter behind the scenes, and we can project it explicitly if we add score to the fields list with fl, as seen before.
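To see the effect of the boost, it helps to ask for the score together with the documents. The sketch below assumes the same local instance and core as before. It builds the request parameters with fl=*,score and reads the scores back from a JSON response. The helper scores_by_rank and the sample response are made up for illustration; only the response/docs nesting mirrors what Solr returns with wt=json.

```python
from urllib.parse import quote, urlencode

params = {
    "q": "abstract:(*real* AND surrealist^2)",
    "fl": "*,score",  # all stored fields plus the computed score
    "wt": "json",
}
# Assumed host and core, as in the curl example.
url = "http://localhost:8983/solr/paintings/select?" + urlencode(params, quote_via=quote)


def scores_by_rank(response):
    """Return the score of each returned document, in ranking order."""
    return [doc["score"] for doc in response["response"]["docs"]]


# Made-up response fragment, not real Solr output; it only shows the shape.
sample = {"response": {"numFound": 2, "docs": [{"score": 1.5}, {"score": 0.75}]}}
print(scores_by_rank(sample))
```

Running the same helper on the real responses with and without ^2 is a simple way to compare how the ranking changes.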
A Lucene score is calculated using factors like term frequency, inverse document frequency, and normalization of terms over the documents. It's not important to go into the details now, but you should consider some basic rules:
Given a certain score, the boost operation acts as a multiplier over the existing score for a term. The same mechanism could be used in the indexing phase too when needed, but we will not go into these details now.
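The multiplier idea can be illustrated with a deliberately simplified model. This is not Lucene's actual scoring formula: the per-term scores below are made-up numbers, and the function combined_score is a hypothetical helper that just sums them after applying each boost.

```python
def combined_score(term_scores, boosts=None):
    """Sum per-term scores after multiplying each by its boost (default 1)."""
    boosts = boosts or {}
    return sum(score * boosts.get(term, 1.0) for term, score in term_scores.items())


# Made-up per-term scores for one document; real values come from Lucene.
term_scores = {"real": 0.5, "surrealist": 0.25}

plain = combined_score(term_scores)
boosted = combined_score(term_scores, {"surrealist": 2.0})
print(plain, boosted)  # 0.75 1.0
```

In this toy model, surrealist^2 doubles only the contribution of surrealist, so a document where that term matches moves up relative to documents where it contributes little or nothing.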