Time for action – finding interesting subjects using a facet query

Let's now look at other possible applications of faceting. For example, we can use faceting to find the most frequently recurring terms in a field. If we do this on the subject field, the result can be used, for example, as a list of suggestions for interesting topics.

  1. First, we want a simple facet result for the subjects, so we run the following query:
    >> curl -X GET 'http://localhost:8983/solr/paintings/select?q=*:*&rows=0&facet=true&facet.field=subject_entity&facet.limit=-1&facet.mincount=2&facet.sort=count&json.nl=map&wt=json'
    

    We will find that the most common subject in our data is related to the religious theme of Annunciation. This result is not particularly surprising, because this is one of the most widely represented themes in classic European art.

  2. If we start from the opposite direction and ask ourselves whether the annunciating action is present in the collection, we can easily write the following facet query:
    >> curl -X GET 'http://localhost:8983/solr/paintings/select?q=*:*&rows=0&facet=on&facet.query=subject_entity:annunciating~5&facet.mincount=1&wt=json'
    
  3. We will obtain the same results: five matches in the facet query counts, out of 5069 documents. Note that we can ask for the same information while also requesting other facets, for example:
    >> curl -X GET 'http://localhost:8983/solr/paintings/select?q=*:*&rows=0&facet=on&facet.field=city_entity&facet.field=artist_entity&facet.query=subject_entity:annunciating~5&facet.limit=10&f.artist_entity.facet.limit=2&wt=json&json.nl=map&fq=abstract:angel'
    
  4. Here we restrict the documents to those that contain a reference to an angel figure (fq=abstract:angel). We ask for facets on the cities and artists related to them (facet.field=city_entity and facet.field=artist_entity), and also for the number of documents that could be related to our search on the subject (facet.query=subject_entity:annunciating~5).

In this case, we will obtain two facet query counts.
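
The query from step 3 repeats the facet.field parameter, which is easy to get wrong when the URL is assembled by hand. As a minimal sketch, assuming the same local Solr core used in the steps, the URL could be built in Python from a list of pairs so that repeated keys are kept. The function name build_facet_url is only illustrative, and no request is actually sent:

```python
from urllib.parse import urlencode

# Same endpoint as in the curl examples (assumed to be a local Solr instance).
SOLR_SELECT = "http://localhost:8983/solr/paintings/select"


def build_facet_url(base=SOLR_SELECT):
    """Build the multi-facet query from step 3 as a URL string."""
    # A list of pairs (not a dict) so that facet.field can appear twice.
    params = [
        ("q", "*:*"),
        ("rows", "0"),
        ("facet", "on"),
        ("facet.field", "city_entity"),
        ("facet.field", "artist_entity"),
        ("facet.query", "subject_entity:annunciating~5"),
        ("facet.limit", "10"),
        ("f.artist_entity.facet.limit", "2"),
        ("fq", "abstract:angel"),
        ("wt", "json"),
        ("json.nl", "map"),
    ]
    # urlencode percent-encodes characters such as ':' and '*'.
    return base + "?" + urlencode(params)


url = build_facet_url()
print(url)
```

The resulting string can be passed to curl or to any HTTP client; the encoding of special characters differs from the hand-written URL, but the decoded parameters are the same.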

What just happened?

We started from the list of terms in the facet for the subject_entity field, and found that the term annunciation is the most used in our dataset. Note that the subject field plays a role similar to that of tags from a controlled vocabulary, so you can try the same approach on your own fixed, controlled tag vocabulary if you have one. Once we had found an interesting term, we worked in reverse just to understand how the facet query works. What we see is that if we use a similar term on the same field (subject_entity:annunciating~5), we obtain the same expected results. Building on that, the next step is to use the facet query without restricting it to a single field, using the following query:

>> curl -X GET 'http://localhost:8983/solr/paintings/select?q=*:*&rows=0&facet=on&facet.query=annunciating~5&facet.mincount=1&wt=json'

In this case, we will obtain 50 matches over all the fields.

If we introduce more than one field for faceting and also perform a facet query, as in the last example, it is easy to see that every result is independent of the others. A query written in the facet.query parameter is not used outside its own context to filter the other results. By contrast, filter queries and the main query do change the facet and facet query results, because they restrict the collection on which the facet counts are computed.
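
To make this independence concrete, the sketch below reads the two kinds of counts from a JSON response. The facet_counts, facet_queries, and facet_fields keys follow Solr's JSON response layout with json.nl=map, but the sample payload is invented for illustration: the term names and all counts in it are placeholders, not values from the paintings dataset.

```python
import json

# Hypothetical response body: keys follow Solr's layout, values are made up.
SAMPLE_RESPONSE = """
{
  "response": {"numFound": 7, "start": 0, "docs": []},
  "facet_counts": {
    "facet_queries": {"subject_entity:annunciating~5": 3},
    "facet_fields": {
      "city_entity": {"placeholder city": 4},
      "artist_entity": {"placeholder artist": 1}
    }
  }
}
"""


def split_facets(payload):
    """Return (facet query counts, facet field counts) from a parsed response."""
    counts = payload.get("facet_counts", {})
    return counts.get("facet_queries", {}), counts.get("facet_fields", {})


facet_queries, facet_fields = split_facets(json.loads(SAMPLE_RESPONSE))

# Each facet.query has its own count, keyed by the query string itself.
for query, count in facet_queries.items():
    print(f"facet.query {query!r}: {count}")

# Each facet.field has its own term-to-count map, untouched by facet.query.
for field, terms in facet_fields.items():
    print(f"facet.field {field!r}: {terms}")
```

The facet query count and the per-field term counts live in separate sections of the response, and both are computed over the documents selected by q and fq.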

Using a filter query is best when we have to fix some criteria to restrict the collection size. We can then use facets as a way to provide suggestions for navigation paths. A typical user interface will add a filter to our query when we select a specific facet suggestion, thus narrowing the search. In contrast, when a filter is removed, the collection on which we search will be broader, and the faceting results will change accordingly.
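
The navigation pattern described above can be sketched as two small helpers that add or remove an fq parameter when a facet value is selected or deselected. This is only an illustration under stated assumptions: the helper names add_filter and remove_filter are hypothetical, the value "Paris" is a made-up facet value, and the escaping covers only quotes and backslashes inside a quoted phrase.

```python
def _fq_clause(field, value):
    """Build a field:"value" filter clause, escaping quotes and backslashes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return ("fq", f'{field}:"{escaped}"')


def add_filter(params, field, value):
    """Return a new parameter list narrowed by the selected facet value."""
    return params + [_fq_clause(field, value)]


def remove_filter(params, field, value):
    """Return a new parameter list with that facet selection dropped."""
    target = _fq_clause(field, value)
    return [pair for pair in params if pair != target]


base = [("q", "*:*"), ("facet", "on"), ("facet.field", "city_entity")]

# Selecting a (hypothetical) city in the UI narrows the collection.
narrowed = add_filter(base, "city_entity", "Paris")
print(narrowed)

# Deselecting it broadens the collection again.
print(remove_filter(narrowed, "city_entity", "Paris"))
```

Both helpers return new lists instead of modifying their input, so the broader query is still available when the user backs out of a selection.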

As a last note, it's possible to specify parameters on a per-field basis when needed. For example, with f.artist_entity.facet.limit=2 we ask for no more than two facet results for the artist_entity field. Note that facet.mincount does not carry any semantics: it is only a minimum count threshold, below which facet values are omitted from the results, although it can still be used as a rough, simple indication of relevance.
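
The per-field naming scheme (f.<field>.<parameter>) is mechanical enough to generate. As a small sketch, with a helper name (per_field) that is only illustrative, a global facet.limit can be combined with a per-field override like this:

```python
from urllib.parse import urlencode


def per_field(field, param, value):
    """Return a (name, value) pair that applies param to a single field."""
    return (f"f.{field}.{param}", str(value))


params = [
    ("facet", "on"),
    ("facet.field", "city_entity"),
    ("facet.field", "artist_entity"),
    ("facet.limit", "10"),                         # default for every facet field
    per_field("artist_entity", "facet.limit", 2),  # override for artist_entity only
]

query_string = urlencode(params)
print(query_string)
```

Here city_entity keeps the global limit of 10 values, while artist_entity is limited to 2.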
