Avoiding caching of rare filters to improve performance

Some of the filters you use in your queries may be poor candidates for caching. Consider, for example, filters on date and time values with second precision, or spatial filters scattered all over the world. Such filters are nearly unique, so once they are put into the cache they are very rarely reused, which makes the cached entries more or less useless. Caching such filters is a waste of memory and CPU cycles. Is there something you can do to avoid caching these filter queries? Yes, there is, and this recipe will show you how to do it.

How to do it...

Let's assume we have the following query being used to get the information we need:

q=solr+cookbook&fq=category:books&fq=date:[2014-06-12T13:22:12Z+TO+2014-07-11T11:24:54Z]

The filter query we don't want to cache is the one filtering our documents on the basis of the date field. Of course, even though we don't want that filter to be cached, we still want the filtering to be done. In order to turn off caching, we need to add the {!cache=false} local parameter at the beginning of the filter that works on the date field. After the change, our query should look as follows:

q=solr+cookbook&fq=category:books&fq={!cache=false}date:[2014-06-12T13:22:12Z+TO+2014-07-11T11:24:54Z]
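If the request is assembled in application code rather than typed by hand, the same query can be built programmatically. The following is a minimal sketch using only the Python standard library; the function name build_query_string and its arguments are hypothetical and are not part of Solr or of any Solr client. Note that urlencode percent-encodes characters such as {, ! and :, so the resulting string looks different from the hand-written query above but decodes to the same parameters.

```python
from urllib.parse import parse_qsl, urlencode


def build_query_string(q, cached_filters, uncached_filters):
    """Build a Solr-style query string (hypothetical helper, illustration only).

    Filters in `uncached_filters` get the {!cache=false} local parameter
    prepended, so Solr applies them without storing them in the filter cache.
    """
    params = [("q", q)]
    params += [("fq", fq) for fq in cached_filters]
    params += [("fq", "{!cache=false}" + fq) for fq in uncached_filters]
    # urlencode turns spaces into '+' and percent-encodes reserved characters.
    return urlencode(params)


query_string = build_query_string(
    "solr cookbook",
    cached_filters=["category:books"],
    uncached_filters=["date:[2014-06-12T13:22:12Z TO 2014-07-11T11:24:54Z]"],
)
print(query_string)

# Decode it again to see the parameters as Solr would receive them.
for name, value in parse_qsl(query_string):
    print(name, "=", value)
```

The resulting string can be appended to the select handler URL of whichever collection you query.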

So now let's take a look at how that works.

How it works...

The first query is very simple; we just search for documents that contain the words solr cookbook, and we want the result set to be narrowed to the books category. We also want to narrow the results further to only those documents whose date field falls into the range of 2014-06-12T13:22:12Z to 2014-07-11T11:24:54Z.

As you can imagine, if we have many filters with dates like the one in the query, the filter cache can fill up very fast. In addition to that, if you don't reuse the same value in that field, the entry in the filter cache is pretty useless. That's why, by adding the {!cache=false} part to the filter query, we tell Solr that we don't want the results of that filter query to be put into the filter cache. With such an approach, we won't pollute the filter cache and thus save some CPU cycles and memory.
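The pollution effect can be illustrated with a toy model. The sketch below is not Solr's filter cache implementation; it is a tiny least-recently-used cache written for this illustration, with an assumed size of two entries and made-up filter names (date-range-N and spatial-N are opaque placeholders, not Solr syntax). Each simulated request uses one reusable filter and two unique ones.

```python
from collections import OrderedDict


class ToyFilterCache:
    """Tiny LRU cache standing in for a filter cache (illustration only)."""

    def __init__(self, size):
        self.size = size
        self.entries = OrderedDict()
        self.hits = 0

    def apply_filter(self, fq, cache=True):
        if not cache:
            # Mimics {!cache=false}: the filter is applied, the cache is bypassed.
            return
        if fq in self.entries:
            self.entries.move_to_end(fq)  # mark as most recently used
            self.hits += 1
            return
        self.entries[fq] = True
        if len(self.entries) > self.size:
            self.entries.popitem(last=False)  # evict least recently used entry


def simulate(cache_unique_filters, requests=5):
    """Return how often the reusable filter was served from the cache."""
    cache = ToyFilterCache(size=2)
    for i in range(requests):
        cache.apply_filter("category:books")
        cache.apply_filter(f"date-range-{i}", cache=cache_unique_filters)
        cache.apply_filter(f"spatial-{i}", cache=cache_unique_filters)
    return cache.hits


print("hits when unique filters are cached:", simulate(True))
print("hits when unique filters bypass the cache:", simulate(False))
```

In this model, caching the unique filters keeps evicting the reusable category:books entry before the next request can benefit from it, while bypassing the cache for them leaves that entry in place. Real caches are far larger, but the same mechanism applies when unique filters dominate.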

There is one more thing to note when it comes to querying. Filters that are not cached are executed in parallel with the main query, so this can additionally improve your query execution time.
