Improving Solr query performance after the start and commit operations

Almost everyone with some Solr experience has noticed that right after startup, or after a searcher is reopened (for example, by a soft autocommit), queries run slower than they do once Solr has been running for a while. This happens because the Solr caches are still empty, the operating system has not yet cached the index files, and so on. Can we do something about it? Of course we can, and this recipe will show you how.

How to do it...

  1. First of all, we need to identify the most common and the heaviest queries that we send to Solr. I have two ways of doing this. First, I analyze the logs that Solr produces and see how the queries behave; I tend to choose the queries that run often and the ones that run slowly. The second way is to analyze the applications that use Solr and see what queries they produce, which of them are the most crucial, and so on. In my experience, the log-based approach is usually much faster and can be done with self-written scripts.

    Let's assume that we have identified the following queries as good candidates:

    q=cats&fq=category:1&sort=title+desc,value+desc,score+desc
    q=cars&fq=category:2&sort=title+desc
    q=harry&fq=category:4&sort=score+desc
  2. Next, we add the so-called warming queries to the solrconfig.xml file. The listener XML tag definition in solrconfig.xml should look similar to this:
    <listener event="firstSearcher" class="solr.QuerySenderListener">
     <arr name="queries">
      <lst><str name="q">cats</str><str name="fq">category:*</str><str name="sort">title desc,value desc,score desc</str><str name="start">0</str><str name="rows">20</str></lst>
      <lst><str name="q">cars</str><str name="fq">category:*</str><str name="sort">title desc</str><str name="start">0</str><str name="rows">20</str></lst>
     </arr>
    </listener>

Basically, we added the so-called warming queries to the Solr startup. Now let's see how it works.

How it works...

By adding the preceding fragment of configuration to the solrconfig.xml file, we told Solr that we want it to run those queries whenever a firstSearcher event occurs. The firstSearcher event is fired whenever a new searcher object is prepared and no searcher object is available in memory. Basically, the firstSearcher event occurs right after Solr startup.

So what happens after Solr starts up? With the preceding fragment in place, Solr runs each of the defined queries. By doing that, it populates the caches with the entries that matter for the queries we identified. This means that if we did the job right, Solr will be ready to handle the most common and the heaviest queries right after it starts.

Now a few words about what the configuration options mean. The warm-up queries are always defined under the listener XML tag. The event attribute tells Solr which event should trigger the queries; in our case, it is the firstSearcher event. The class attribute is the Java class that implements the listener mechanism. Next, we have an array of queries, grouped by the arr tag with the name="queries" attribute. Each of the warming queries is defined as a list of parameters grouped by the lst tag.

The thing to remember is to choose the warming queries wisely. You don't need to cover every value of the q parameter; instead, warm your common filter queries, your sort parameter values, and so on. Also remember that warming is not only about Solr itself. During warm-up, the Lucene segment files are read and cached by the operating system. As a result, the commonly used parts of the index end up in memory and can be accessed very fast.
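The log-based way of picking warming candidates can be sketched with a short script. The following Python example is only an illustration, not part of the Solr distribution: it assumes that each request is logged on one line containing params={...} and QTime=..., and the tally and slowest function names as well as the sample lines are made up for this sketch. It counts how often each fq and sort value appears and finds the parameter strings with the highest average QTime.

```python
import re
from collections import Counter
from urllib.parse import parse_qs

# Assumed request-log shape: "... params={q=cats&fq=category:1} hits=3 status=0 QTime=12"
REQUEST_RE = re.compile(r"params=\{(?P<params>[^}]*)\}.*?QTime=(?P<qtime>\d+)")


def tally(lines):
    """Count fq and sort values and collect the QTime values per parameter string."""
    fq_counts, sort_counts = Counter(), Counter()
    qtimes = {}
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # not a request line, skip it
        raw = match.group("params")
        params = parse_qs(raw)  # decodes '+' into a space
        fq_counts.update(params.get("fq", []))
        sort_counts.update(params.get("sort", []))
        qtimes.setdefault(raw, []).append(int(match.group("qtime")))
    return fq_counts, sort_counts, qtimes


def slowest(qtimes, n=3):
    """Return the n parameter strings with the highest average QTime."""
    averages = {raw: sum(times) / len(times) for raw, times in qtimes.items()}
    return sorted(averages, key=averages.get, reverse=True)[:n]


# Hypothetical log lines, used only to exercise the functions above.
SAMPLE_LINES = [
    "INFO webapp=/solr path=/select params={q=cats&fq=category:1&sort=title+desc,value+desc,score+desc} hits=10 status=0 QTime=120",
    "INFO webapp=/solr path=/select params={q=cars&fq=category:2&sort=title+desc} hits=4 status=0 QTime=40",
    "INFO webapp=/solr path=/select params={q=cats&fq=category:1&sort=title+desc,value+desc,score+desc} hits=10 status=0 QTime=80",
    "INFO webapp=/solr path=/select params={q=harry&fq=category:4&sort=score+desc} hits=7 status=0 QTime=5",
    "INFO Opening new searcher",
]

fq_counts, sort_counts, qtimes = tally(SAMPLE_LINES)
print("Most common fq values:", fq_counts.most_common(3))
print("Most common sort values:", sort_counts.most_common(3))
print("Slowest on average:", slowest(qtimes, n=1))
```

The filter and sort values that come out on top are the ones worth putting into the warming queries; if your log format differs, only the regular expression should need to change.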

There's more...

There is one more thing that I would like to cover.

Improving Solr performance after committing operations

If you are interested in improving the performance of your Solr instance, you should also look at the newSearcher event. This event is fired whenever a new searcher is opened while another one is already serving queries, which typically happens after a commit operation (for example, after replication). Assuming that we identified the same queries as before as good candidates to warm the caches, we should add the following entries to the solrconfig.xml file:

<listener event="newSearcher" class="solr.QuerySenderListener">
 <arr name="queries">
  <lst><str name="q">cats</str><str name="fq">category:*</str><str name="sort">title desc,value desc,score desc</str><str name="start">0</str><str name="rows">20</str></lst>
  <lst><str name="q">cars</str><str name="fq">category:*</str><str name="sort">title desc</str><str name="start">0</str><str name="rows">20</str></lst>
 </arr>
</listener>
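Because the firstSearcher and newSearcher listeners carry the same queries here, one option is to generate both fragments from a single list so that they do not drift apart. The following Python sketch is only an illustration of that idea: the build_listener function and the WARMING_QUERIES list are names made up for this example, and the output is meant to be pasted into solrconfig.xml by hand.

```python
import xml.etree.ElementTree as ET


def build_listener(event, queries):
    """Build a <listener> element with one <lst> per warming query."""
    listener = ET.Element(
        "listener", {"event": event, "class": "solr.QuerySenderListener"}
    )
    arr = ET.SubElement(listener, "arr", {"name": "queries"})
    for params in queries:
        lst = ET.SubElement(arr, "lst")
        for name, value in params.items():
            # Every parameter becomes <str name="...">value</str>
            ET.SubElement(lst, "str", {"name": name}).text = str(value)
    return listener


# The same warming queries as in the fragments shown in this recipe.
WARMING_QUERIES = [
    {"q": "cats", "fq": "category:*",
     "sort": "title desc,value desc,score desc", "start": 0, "rows": 20},
    {"q": "cars", "fq": "category:*",
     "sort": "title desc", "start": 0, "rows": 20},
]

for event in ("firstSearcher", "newSearcher"):
    element = build_listener(event, WARMING_QUERIES)
    ET.indent(element, space=" ")  # pretty-printing; needs Python 3.9 or newer
    print(ET.tostring(element, encoding="unicode"))
```

Keeping the queries in one place means that a change to the warming set is applied to both events at once.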

Remember that the warming queries are especially important for the caches that can't be automatically warmed.
