Thinking about faceted search and findability

From the previous two examples, we have seen that facets make it simple to obtain a list of recurring terms for a specific field, together with the number of documents in which each term occurs. The tagcloud example, in particular, is similar to what we usually see in action with folksonomies, except that all the terms projected here come from a single field. If we want to perform a search once a term has been selected, we make a new query on that term. This is interesting because it can suggest specific directions for the search, improving findability.

Note

Findability is a term often used in Information Architecture (IA) and User Experience (UX) contexts. It depends not only on performing a "good" search but mostly on whether a common user can, ideally, reach every relevant resource on a site by following navigational patterns.

There is a good collection of articles that could be used as a source and inspiration for the topic at http://findability.org/.

Starting from that idea, I'd like to think about facets in Solr as ways to obtain a wider view of the same data, which we are able to manage using complex search criteria. This may sound confusing, because once a facet criterion is added to our query, it actually narrows the search. But it is important to focus on the idea that a facet is not a mechanism to perform searches by itself (if you look at the response, you will find that the facet results are in a separate section). It is instead intended to provide a shallow, fluid classification system, in contrast to traditional taxonomy-based approaches, which are predefined, fixed, and hierarchical.
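To make the point about the separate response section concrete, here is a minimal sketch that reads the facet counts out of a JSON response. The sample document is hand-written and its counts are invented; it only mimics the shape Solr returns with wt=json and json.nl=map. The helper name facet_terms is hypothetical.

```python
import json

# Hand-written sample shaped like a Solr JSON response (wt=json, json.nl=map).
# The terms and counts are invented for illustration only.
SAMPLE_RESPONSE = """
{
  "response": {"numFound": 3, "start": 0, "docs": []},
  "facet_counts": {
    "facet_queries": {},
    "facet_fields": {
      "city_entity": {"paris": 2, "london": 1},
      "museum_entity": {"musée du louvre": 2, "national gallery": 1}
    }
  }
}
"""


def facet_terms(raw_json, field):
    """Return (term, count) pairs for one facet field, highest count first."""
    data = json.loads(raw_json)
    # Facet results live under facet_counts, apart from the matching documents.
    counts = data.get("facet_counts", {}).get("facet_fields", {}).get(field, {})
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


for term, count in facet_terms(SAMPLE_RESPONSE, "city_entity"):
    print(f"{term}: {count}")
```

The documents themselves stay under response, so a prototype can render the result list and the facet lists independently.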

A facet comprises "clearly defined, mutually exclusive, and collectively exhaustive aspects, properties, or characteristics of a class or specific subject".

http://en.wikipedia.org/wiki/Faceted_classification

In this perspective, a facet is a simple aggregation of metadata about a group of resources that can be seen as related to each other because they reference and use the same terms. This moves our interest from searching to matching, as we will see later.

Faceting for narrowing searches and exploring data

A good idea is to use prototyping again to construct a simple navigation over facets, which can help us focus on one of the simplest ways to use them in many contexts. I have prepared a simple example using the very good ajax-solr library, which we will see in action again in further chapters. Feel free to use it as a base for creating prototypes on your data too:

[Figure: Faceting for narrowing searches and exploring data]

You will find this example by navigating to /SolrStarterBook/test/chp06/paintings/html/index.html. You can play with it directly in your web browser, without any web server, after you have started your Solr paintings core in the usual way.

The simple interface, on the left-hand side of the page, gives us the chance to collect one or more terms from different facets in order to produce a collection of resources. These are shown on the right-hand side of the page as a visual preview that changes to reflect the choices we have made. This is very similar to filtering a list of resources, as we are accustomed to doing on e-commerce sites, but it can also be used backwards to explore different search paths from the original one, simply by removing a selected criterion from the list.

In other words, this approach can expand the traditional "bag of words" full-text search into a wider search capability that mixes advanced full-text search functionality with a kind of detour-based exploration of the same relevant data. A user can, at some point, find that the Louvre contains many paintings, and decide to explore the list simply by clicking in the interface. In general, this leads to performing queries that we, as users, had not originally thought of. These queries are of interest to us because we are informed in advance of how many relevant documents we will find using the selected criteria, and every time we add or remove a selection, a new series of criteria (and their related results) is triggered.
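The add-and-remove behavior described above can be sketched as a small piece of state that turns the current selections into filter queries. This is only an illustration under my own assumptions: the FacetSelection class and its methods are hypothetical names, not part of ajax-solr or Solr.

```python
class FacetSelection:
    """Hypothetical holder for the facet terms a user has clicked."""

    def __init__(self):
        self._selected = []  # ordered list of (field, term) pairs

    def toggle(self, field, term):
        """Add the pair if it is missing, remove it if already selected."""
        pair = (field, term)
        if pair in self._selected:
            self._selected.remove(pair)
        else:
            self._selected.append(pair)

    def filter_queries(self):
        """Return one fq value per selection, with the term quoted as a phrase."""
        queries = []
        for field, term in self._selected:
            # Escape backslashes and double quotes inside the quoted phrase.
            escaped = term.replace("\\", "\\\\").replace('"', '\\"')
            queries.append(f'{field}:"{escaped}"')
        return queries


selection = FacetSelection()
selection.toggle("city_entity", "paris")
selection.toggle("museum_entity", "musée du louvre")
print(selection.filter_queries())

# Removing a criterion widens the search again, opening a different path.
selection.toggle("city_entity", "paris")
print(selection.filter_queries())
```

Each call to toggle would be followed by a new request to Solr, so the facet counts shown to the user always reflect the current set of selections.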

In our example, the query uses the parameters tabulated as follows:

q:*:*
We are not searching for specific terms

facet:true
The faceting support has to be enabled

facet.field:museum_entity
facet.field:artist_entity
facet.field:city_entity
The HTML interface asks for the facets' results on the artist_entity, museum_entity, and city_entity fields.

fq:city_entity:paris
fq:museum_entity:"musée du louvre"
We are already using a filter query based on the selection made. This is an improvement of the same idea behind the tagcloud example.
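As a rough sketch, the parameters listed above can be assembled into a request URL with Python's standard library. The helper name build_facet_url is hypothetical, and the base URL assumes the local paintings core used in this chapter.

```python
from urllib.parse import urlencode

# Assumption: Solr runs locally and exposes the paintings core.
BASE_URL = "http://localhost:8983/solr/paintings/select"


def build_facet_url(facet_fields, filter_queries, base_url=BASE_URL):
    """Build a select URL that facets on some fields and applies filter queries."""
    params = [("q", "*:*"), ("facet", "true"), ("wt", "json"), ("json.nl", "map")]
    # Repeated parameters are expressed as repeated (name, value) pairs.
    params += [("facet.field", field) for field in facet_fields]
    params += [("fq", fq) for fq in filter_queries]
    # urlencode percent-encodes every value, including quotes and accents.
    return base_url + "?" + urlencode(params)


url = build_facet_url(
    ["museum_entity", "artist_entity", "city_entity"],
    ["city_entity:paris", 'museum_entity:"musée du louvre"'],
)
print(url)
```

Using a list of pairs instead of a dictionary matters here, because facet.field and fq both appear more than once in the same request.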

If you want to directly play with the query using facets, you can use the following command:

>>  curl -X GET 'http://localhost:8983/solr/paintings/select?facet=true&q=*:*&facet.field=museum_entity&facet.field=artist_entity&facet.field=city_entity&facet.limit=20&facet.mincount=1&f.city_entity.facet.limit=10&json.nl=map&fq=city_entity:paris&fq=museum_entity:%22mus%C3%A9e+du+louvre%22&wt=json'

Here, note the use of lowercase terms, URL percent-encoding (for example, %C3%A9 for the accented character and %22 for the double quotes), and + for spaces.
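If you prefer not to encode values by hand, the following sketch shows how a filter query value could be percent-encoded with Python's standard urllib.parse module. The variable names are only illustrative.

```python
from urllib.parse import quote_plus, unquote_plus

raw_filter = 'museum_entity:"musée du louvre"'

# quote_plus encodes the value as UTF-8, percent-escapes reserved
# characters, and writes spaces as +.
encoded_filter = quote_plus(raw_filter)
print(encoded_filter)

# Decoding gives back the original value.
assert unquote_plus(encoded_filter) == raw_filter

# Pitfall: encoding an already encoded value escapes the % signs themselves,
# so the server would receive the escape sequences as literal text.
encoded_once = quote_plus("musée du louvre")
encoded_twice = quote_plus(encoded_once)
print(encoded_once)
print(encoded_twice)
```

Encoding should therefore happen exactly once, at the point where the URL is built.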

Note that we can decide how many terms are returned for every facet (facet.limit) or even customize that number for a specific field (f.field_name.facet.limit, in our case f.city_entity.facet.limit).
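The naming pattern for per-field overrides can be sketched as follows. The function facet_limit_params is a hypothetical helper of mine; only the parameter names facet.limit and f.field_name.facet.limit come from Solr.

```python
from urllib.parse import urlencode


def facet_limit_params(default_limit, per_field_limits=None):
    """Return (name, value) pairs for a global limit plus per-field overrides."""
    params = [("facet.limit", str(default_limit))]
    for field, limit in sorted((per_field_limits or {}).items()):
        # A per-field override follows the pattern f.<field>.facet.limit.
        params.append((f"f.{field}.facet.limit", str(limit)))
    return params


limit_params = facet_limit_params(20, {"city_entity": 10})
print(urlencode(limit_params))
```

With these values, every facet returns up to 20 terms, while city_entity is capped at 10, as in the cURL example.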

If you analyze the parameters used, and test the query directly in your browser, you will see that it is not too complicated. Unfortunately, the output is not always easy to read with cURL, so I suggest creating small HTML prototypes, for example, using the ajax-solr library, as in the previous example. (You will find references to ajax-solr in the Appendix, Solr Clients and Integrations.) We will look at filter queries later in this chapter. A filter query is very often used to perform a search using the criteria selected from a previous faceted search. This widely adopted navigational pattern is useful because it provides a simple interaction for an easy-to-understand user interface (the user can enable or disable filters), and it generally performs better as well, since Solr can cache the results of each filter query independently of the main query.

There is a good overview on the wiki page, http://wiki.apache.org/solr/SolrFacetingOverview, which can be used as an introduction, and as you can imagine, there are a lot of resources on this topic, not only on the Solr site, but also on the Web.

If you want to read something to gain a deeper theoretical knowledge of faceted navigation, I strongly suggest reading the article, written in 2009, by William Denton at http://www.miskatonic.org/library/facet-web-howto.html. This article gives clear definitions and practical patterns to understand deeply where and how to adopt a faceted navigation. You will find it very useful for a better understanding of the concepts behind the Solr faceting capabilities and where to use them in your projects.

The following are some other sites where you can find articles that could offer some good perspective:

For the moment, we will practice using simple examples on our data.
