In the case of MoreLikeThis, we can imagine performing an internal query for every term that seems to be relevant by computing the similarity based on its vectors. The document extracted from the collection by the lookup will be included in the MoreLikeThis results list.
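To make this idea more concrete, here is a minimal Python sketch of the same reasoning. It is not how Solr implements MoreLikeThis internally; the function names (interesting_terms, more_like_this), the whitespace tokenizer, and the simplified tf-idf weight are all assumptions made only for illustration.

```python
import math
from collections import Counter


def interesting_terms(source, corpus, min_tf=1, min_df=1, min_wl=3, max_terms=5):
    """Rank the terms of `source` by a simplified tf-idf weight.

    Hypothetical helper: the thresholds only mimic the spirit of the
    mlt.mintf, mlt.mindf and mlt.minwl parameters.
    """
    docs = [doc.lower().split() for doc in corpus]
    tf = Counter(source.lower().split())
    scored = []
    for term, freq in tf.items():
        # Document frequency: in how many documents the term appears.
        df = sum(1 for doc in docs if term in doc)
        if df == 0 or freq < min_tf or df < min_df or len(term) < min_wl:
            continue
        idf = math.log(len(docs) / df) + 1.0
        scored.append((freq * idf, term))
    # Highest weight first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:max_terms]]


def more_like_this(source, corpus, **limits):
    """Return the other documents that share at least one interesting term."""
    terms = set(interesting_terms(source, corpus, **limits))
    ranked = []
    for doc in corpus:
        if doc == source:
            continue
        overlap = len(terms & set(doc.lower().split()))
        if overlap:
            ranked.append((overlap, doc))
    # Most shared terms first; ties are broken alphabetically.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc for _, doc in ranked]
```

Calling more_like_this("boy with a bee", titles) on a small list of made-up titles returns the other titles that share a selected term with the source, which is roughly what the internal lookups described above do on a much larger scale.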
We can easily play with a small, simple example. Search for the term boy in the title field, and ask for the results to be sorted by descending score (sort=score+desc), from the most relevant to the least, as shown in the following query:
>> curl -X GET 'http://localhost:8983/solr/paintings/select?q=title:boy&start=0&rows=1&mlt=on&mlt.fl=artist_entity,title&mlt.mindf=1&mlt.mintf=1&mlt.minwl=3&fl=uri,title,score&sort=score+desc&omitHeader=true&wt=json'
Note that we will not need to add new handlers to solrconfig.xml, as More Like This will work on term vectors and can be used by default. It is preferable to use stored term vectors for fields that will be used for calculating similarity. If term vectors are not stored, MoreLikeThis will generate terms from the stored fields.
Once we have enabled the More Like This component (mlt=on), we are able to obtain results for our example query, as shown in the following screenshot:
In our example, we asked for recommendations with a minimum document frequency of 1 (mlt.mindf=1) and a minimum term frequency also equal to 1 (mlt.mintf=1), and we ignored words shorter than three characters, as we feel that they are not very significant (mlt.minwl=3).
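If we prefer to build the same request from a script instead of typing the whole URL by hand, a minimal Python sketch could look like the following. The helper name build_mlt_url and its keyword arguments are hypothetical; only the Solr parameter names (mlt, mlt.fl, mlt.mindf, mlt.mintf, mlt.minwl, and so on) come from the example query.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, query, mlt_fields, min_df=1, min_tf=1, min_wl=3, rows=1):
    """Assemble a select URL with the More Like This parameters of our example."""
    params = [
        ("q", query),
        ("start", 0),
        ("rows", rows),
        ("mlt", "on"),                      # enable the More Like This component
        ("mlt.fl", ",".join(mlt_fields)),   # fields used to compute similarity
        ("mlt.mindf", min_df),              # minimum document frequency
        ("mlt.mintf", min_tf),              # minimum term frequency
        ("mlt.minwl", min_wl),              # minimum word length
        ("fl", "uri,title,score"),
        ("sort", "score desc"),
        ("omitHeader", "true"),
        ("wt", "json"),
    ]
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return base_url + "?" + urlencode(params)


url = build_mlt_url(
    "http://localhost:8983/solr/paintings/select",
    "title:boy",
    ["artist_entity", "title"],
)
```

The resulting URL is equivalent to the one used with curl, except that reserved characters such as the colon and the comma are percent-encoded, which Solr decodes transparently.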
We can refer to the wiki page to look for a list of parameters and use cases:
http://wiki.apache.org/solr/MoreLikeThis
If you look at the image, you will find that there are many proposed recommendations. The first one seems to be very pertinent, but there is still an odd result in the third position. This is probably due to the textual distance between the term boy and the term bee, and we could have obtained better results in our example by enabling term vectors on the fields for which we calculated the similarities (mlt.fl=artist_entity,title).
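To inspect the recommendations programmatically rather than by eye, we can parse the JSON output. The following Python sketch assumes that the response contains a moreLikeThis object keyed by the identifier of each matching document, each entry holding its own docs list; the sample payload and its values are invented placeholders, not real output from the paintings core, and the layout should be checked against the Solr version in use.

```python
import json

# Hypothetical payload that only mimics the assumed shape of a response.
SAMPLE_RESPONSE = """
{
  "response": {"numFound": 1, "start": 0,
               "docs": [{"uri": "example-1", "title": "Example title A", "score": 1.0}]},
  "moreLikeThis": {
    "example-1": {"numFound": 2, "start": 0,
                  "docs": [{"uri": "example-2", "title": "Example title B"},
                           {"uri": "example-3", "title": "Example title C"}]}
  }
}
"""


def similar_titles(payload):
    """Map each matching document to the titles of its recommendations."""
    data = json.loads(payload)
    result = {}
    # A missing moreLikeThis section simply yields an empty mapping.
    for doc_id, block in data.get("moreLikeThis", {}).items():
        result[doc_id] = [doc.get("title") for doc in block.get("docs", [])]
    return result


recommendations = similar_titles(SAMPLE_RESPONSE)
```

Listing the recommendations in this way makes it easier to spot an unexpected entry, such as the odd result discussed above, and to compare the output before and after changing the parameters.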
Even if we are not required to define a new handler for the MoreLikeThis component, this is still possible by adding a simple configuration to the solrconfig.xml file:
<requestHandler name="/mlt" class="solr.MoreLikeThisHandler"> </requestHandler>
This could be helpful in some cases and is easy to configure. For more details, please visit:
Q1. Which components can be used for an auto-suggester?
There seem to exist different versions of the 'adoration of the magi' theme:
>> curl -X GET 'http://localhost:8983/solr/paintings/select?q=subject_entity:adoration+of+the+magi&fl=title,artist,comment&wt=json'
Q2. How can we discover this only using facet suggestions?
Q3. How can we obtain a list of all the possible cities and museums in the index?
Q4. For what is the More Like This component designed?
Q5. What is the meaning of tf-idf?