We can easily verify whether the update script is working well by asking for a list of recurring subjects:
>> curl -X GET 'http://localhost:8983/solr/paintings/select?q=*:*&rows=0&facet=true&facet.field=subject_entity&facet.limit=-1&facet.mincount=2&facet.sort=count&json.nl=map&wt=json'
In this case, we are faceting on a more readable version of the data: the subject_entity field, which is produced during the update phase by stripping the namespace part off the original subject field data. If you want, you can run the same query with the faceted field changed to facet.field=subject, and you will get back the original complete values.
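If you prefer to build the request from a script rather than typing the long URL by hand, the following is a minimal sketch in Python. It only assembles the same parameters used in the curl command above; the helper name facet_url is our own invention for illustration, and the sketch assumes the paintings core is reachable at the same local address.

```python
from urllib.parse import urlencode

# Same endpoint as in the curl example above.
SOLR_SELECT = "http://localhost:8983/solr/paintings/select"


def facet_url(field="subject_entity", base=SOLR_SELECT):
    """Build the facet-only query URL for the given field.

    facet_url is a hypothetical helper, not part of Solr or of any client library.
    """
    params = [
        ("q", "*:*"),            # match every document
        ("rows", 0),             # do not return the documents themselves
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", -1),     # no limit on the number of facet values
        ("facet.mincount", 2),   # only values occurring at least twice
        ("facet.sort", "count"),
        ("json.nl", "map"),
        ("wt", "json"),
    ]
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(facet_url())           # facet over the readable field
    print(facet_url("subject"))  # facet over the original field
```

Calling facet_url("subject") changes only the facet.field parameter, which makes it easy to compare the readable values with the original ones.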
The faceting functionality can be combined with the default search capability, but this involves using several different parameters at once. The list of parameters can be found on the official wiki page:
http://wiki.apache.org/solr/SimpleFacetParameters
I strongly suggest that you experiment with different parameter combinations to understand how they behave on our data. Sometimes turning off the actual results (rows=0) to focus only on the facets can be very useful. While we showcase only a few parameters in our examples, you can find the whole list on the reference page:
https://cwiki.apache.org/confluence/display/solr/Faceting
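As a rough illustration of combining a search with faceting, the sketch below builds the parameters for a query with rows=0 and then reads the facet counts out of a response. With json.nl=map, the counts for a field arrive as a JSON object mapping each term to its count. The function names, the search term portrait, and the sample response values are all made up for the example; they are not taken from the actual paintings index.

```python
from urllib.parse import urlencode


def search_with_facets(query, facet_field, mincount=1):
    """Return the query string for a search that reports only facets.

    Hypothetical helper: it just combines q with the facet parameters.
    """
    return urlencode([
        ("q", query),
        ("rows", 0),                  # skip the documents, keep the facets
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.mincount", mincount),
        ("json.nl", "map"),           # facet counts as a term -> count object
        ("wt", "json"),
    ])


def top_facets(response, facet_field, n=5):
    """Return the n most frequent (term, count) pairs for a faceted field."""
    counts = response["facet_counts"]["facet_fields"][facet_field]
    # Sort by descending count, then alphabetically to break ties.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Invented sample shaped like a Solr JSON response; the values are not real data.
sample_response = {
    "response": {"numFound": 3, "start": 0, "docs": []},
    "facet_counts": {
        "facet_queries": {},
        "facet_fields": {"subject_entity": {"Portrait": 2, "Landscape": 3}},
    },
}

if __name__ == "__main__":
    print(search_with_facets("portrait", "subject_entity"))
    print(top_facets(sample_response, "subject_entity"))
```

Keeping the parsing separate from the request makes it simple to try different parameter combinations and inspect only the facet part of each response.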
The parameters described on the reference page can be very useful when evaluating the best choice for performance. For example, it's possible to adopt different strategies for faceting over a field. The strategy/algorithm can be chosen with the facet.method parameter on a per-field basis; it lets us choose among the enum, field cache (fc), or field cache per segment (fcs) strategies (the last is the same as field cache, but works at the finer granularity of individual index segments).
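To try a strategy on a single field, Solr accepts per-field overrides of the form f.<fieldname>.facet.method. The following sketch builds such a query string for the three values discussed here (enum, fc, and fcs); the helper name is hypothetical, and which method performs best on our data is something to measure rather than assume.

```python
from urllib.parse import urlencode

# The three strategies discussed in the text.
FACET_METHODS = ("enum", "fc", "fcs")


def facet_method_params(field, method):
    """Build a facet-only query string that sets facet.method for one field.

    Hypothetical helper for illustration; it only formats parameters.
    """
    if method not in FACET_METHODS:
        raise ValueError("unknown facet method: %r" % (method,))
    return urlencode([
        ("q", "*:*"),
        ("rows", 0),
        ("facet", "true"),
        ("facet.field", field),
        # Per-field override: applies the method only to this field.
        ("f.%s.facet.method" % field, method),
        ("wt", "json"),
    ])


if __name__ == "__main__":
    for method in FACET_METHODS:
        print(facet_method_params("subject_entity", method))
```

Appending each resulting query string to the select URL and comparing response times is one simple way to evaluate the strategies.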