If you have ever used the field-collapsing functionality of Solr, you might be wondering whether it can be combined with faceting. It can, but by default the facet counts are still calculated on the basis of individual documents and not on document groups. In this recipe, we will learn how to query Solr so that it returns facets calculated for the most relevant document in each group.
Before reading this recipe, take a look at the Grouping documents by the field value, Grouping documents by the query value, and Grouping documents by the function value recipes in Chapter 8, Using Additional Functionalities. Also, if you are not familiar with the faceting functionality, read the first three recipes of this chapter.
Start with the following index structure (add it to the field definition section of the schema.xml file):
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="name" type="text_general" indexed="true" stored="true" />
<field name="category" type="string" indexed="true" stored="true" />
<field name="stock" type="boolean" indexed="true" stored="true" />
The example data looks as follows:
<add>
  <doc>
    <field name="id">1</field>
    <field name="name">Book 1</field>
    <field name="category">books</field>
    <field name="stock">true</field>
  </doc>
  <doc>
    <field name="id">2</field>
    <field name="name">Book 2</field>
    <field name="category">books</field>
    <field name="stock">true</field>
  </doc>
  <doc>
    <field name="id">3</field>
    <field name="name">Workbook 1</field>
    <field name="category">Workbooks</field>
    <field name="stock">false</field>
  </doc>
  <doc>
    <field name="id">4</field>
    <field name="name">Workbook 2</field>
    <field name="category">Workbooks</field>
    <field name="stock">true</field>
  </doc>
</add>
Assume that we want the results to be grouped on the category field and we want the faceting to be calculated on the stock field. Also remember that we are only interested in the most relevant document from each result group when it comes to faceting. So, the query that tells Solr to do what we want looks as follows:
http://localhost:8983/solr/cookbook/select?q=*:*&facet=true&facet.field=stock&group=true&group.field=category&group.truncate=true
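If you prefer to assemble the query in code rather than by hand, the following is a minimal Python sketch of one way to do it. The function name build_truncated_facet_query is hypothetical, and the base URL simply assumes the local cookbook core used in this recipe; only the query parameters themselves come from the recipe.

```python
from urllib.parse import urlencode

# Assumption: Solr runs locally and the core is called "cookbook", as in the recipe.
BASE_URL = "http://localhost:8983/solr/cookbook/select"


def build_truncated_facet_query(group_field, facet_field, base_url=BASE_URL):
    """Build a select URL that groups on one field and facets on another,
    with group.truncate=true. (Hypothetical helper, for illustration only.)"""
    params = [
        ("q", "*:*"),                 # match all documents
        ("facet", "true"),            # turn faceting on
        ("facet.field", facet_field),
        ("group", "true"),            # turn result grouping on
        ("group.field", group_field),
        ("group.truncate", "true"),   # facet on the top document of each group
    ]
    # safe="*:" leaves the match-all query unescaped so the URL stays readable.
    return base_url + "?" + urlencode(params, safe="*:")


url = build_truncated_facet_query("category", "stock")
print(url)
```

Called with category and stock, the helper produces the same URL as the one shown above.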
The results for the preceding query would look as follows:
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
    <lst name="params">
      <str name="q">*:*</str>
      <str name="facet.field">stock</str>
      <str name="group.truncate">true</str>
      <str name="facet">true</str>
      <str name="group.field">category</str>
      <str name="group">true</str>
    </lst>
  </lst>
  <lst name="grouped">
    <lst name="category">
      <int name="matches">4</int>
      <arr name="groups">
        <lst>
          <str name="groupValue">books</str>
          <result name="doclist" numFound="2" start="0">
            <doc>
              <str name="id">1</str>
              <str name="name">Book 1</str>
              <str name="category">books</str>
              <bool name="stock">true</bool>
              <long name="_version_">1487145087213240320</long></doc>
          </result>
        </lst>
        <lst>
          <str name="groupValue">Workbooks</str>
          <result name="doclist" numFound="2" start="0">
            <doc>
              <str name="id">3</str>
              <str name="name">Workbook 1</str>
              <str name="category">Workbooks</str>
              <bool name="stock">false</bool>
              <long name="_version_">1487145087281397760</long></doc>
          </result>
        </lst>
      </arr>
    </lst>
  </lst>
  <lst name="facet_counts">
    <lst name="facet_queries"/>
    <lst name="facet_fields">
      <lst name="stock">
        <int name="false">1</int>
        <int name="true">1</int>
      </lst>
    </lst>
    <lst name="facet_dates"/>
    <lst name="facet_ranges"/>
    <lst name="facet_intervals"/>
  </lst>
</response>
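To read the facet counts out of such a response programmatically, something along the following lines could be used. This is only a sketch: it parses a trimmed copy of the response (just the facet_counts section) with Python's standard xml.etree.ElementTree module, and the helper name facet_counts is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response shown above: only the facet_counts section is kept.
RESPONSE_XML = """<response>
  <lst name="facet_counts">
    <lst name="facet_fields">
      <lst name="stock">
        <int name="false">1</int>
        <int name="true">1</int>
      </lst>
    </lst>
  </lst>
</response>"""


def facet_counts(xml_text, field):
    """Return {facet value: count} for one facet field of a Solr XML response.
    (Hypothetical helper, for illustration only.)"""
    root = ET.fromstring(xml_text)
    # Each facet value is an <int name="value">count</int> element
    # nested under facet_fields/<field>.
    path = f".//lst[@name='facet_fields']/lst[@name='{field}']/int"
    return {node.get("name"): int(node.text) for node in root.findall(path)}


counts = facet_counts(RESPONSE_XML, "stock")
print(counts)
```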
As you can see, everything has worked as it should have. Now let's see how it works in detail.
Our data is very simple. As you can see in the field definition section of the schema.xml file and the example data, every document is described by four fields:
id
name
category
stock
Their names speak for themselves, so we won't discuss them further. Just note that stock is a boolean field, so its facet values are true and false.
When it comes to the query, we fetch all the documents from the index (the q=*:* parameter). Next, we say that we want to use faceting (facet=true) and want it to be calculated on the stock field (facet.field=stock). We want the grouping mechanism to be active (group=true) and also want to group documents on the basis of the category field (group.field=category). All the query parameters responsible for defining faceting and grouping behavior are described in the appropriate recipes in this book, so take a look at those if you are not familiar with them.
Finally, something new: the group.truncate parameter is set to true. When it is set to true, as in our case, facet counts are calculated using only the most relevant document in each of the calculated groups. So, for the group with the category field equal to books, we have the true value in the stock field, and for the second group, Workbooks, we have false in the stock field. Of course, we are looking at the most relevant document in each group, which in our case is the first one returned. As you can see, we've got two facet values for the stock field, both with a count of 1, which is what we would expect. Without group.truncate, the facet would be calculated over all four matching documents, giving a count of 3 for true and 1 for false.
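To make the effect of group.truncate concrete, here is a small Python sketch that mimics the counting on the four example documents in memory. It is not how Solr implements the feature. It assumes that, because q=*:* gives every document the same score, index order can stand in for relevance, and the helper names facet and group_heads are hypothetical.

```python
from collections import Counter

# The four example documents from the recipe, in index order.
# Assumption: with q=*:* all scores are equal, so index order stands in
# for relevance when picking the top document of each group.
DOCS = [
    {"id": "1", "name": "Book 1", "category": "books", "stock": True},
    {"id": "2", "name": "Book 2", "category": "books", "stock": True},
    {"id": "3", "name": "Workbook 1", "category": "Workbooks", "stock": False},
    {"id": "4", "name": "Workbook 2", "category": "Workbooks", "stock": True},
]


def facet(docs, field):
    """Count how many documents carry each value of the given field."""
    return Counter(str(doc[field]).lower() for doc in docs)


def group_heads(docs, group_field):
    """Keep only the first (most relevant) document of each group."""
    heads = {}
    for doc in docs:
        heads.setdefault(doc[group_field], doc)
    return list(heads.values())


plain = facet(DOCS, "stock")                               # default behavior
truncated = facet(group_heads(DOCS, "category"), "stock")  # group.truncate=true

print(dict(plain))      # {'true': 3, 'false': 1}
print(dict(truncated))  # {'true': 1, 'false': 1}
```

The second result matches the facet_counts section of the response shown earlier, while the first is what plain document-based faceting would give for the same data.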