SolrCloud read-side fault tolerance

When a single shard of your collection fails to respond to a query, Solr fails the whole request. This is the right behavior for most use cases, but not for all. Sometimes you would rather show partial results so that your users can see at least some portion of the data. By default this is not possible, but Solr lets you change this behavior on a per-request basis. This recipe will show you how to force Solr to return even partial results.

Getting ready

We assume that we already have a running SolrCloud cluster with two nodes and a collection created with two leader shards and no replicas. If you don't know how to do this, refer to the Creating a new SolrCloud cluster recipe in Chapter 7, In the Cloud. That recipe shows you how to create a new SolrCloud cluster and create a collection.

How to do it...

We indexed four sample documents to our Solr cluster. Two documents are placed in one shard and the other two documents are placed in the second shard. Let's now assume that one of the shards has failed and the node hosting it is not responding to any calls:

  1. First, we try to run a simple query to Solr in such a situation:
    http://localhost:8983/solr/testcollection/select?q=*:*&rows=0

    The results would be as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <response>
     <lst name="responseHeader">
      <int name="status">503</int>
      <int name="QTime">2</int>
      <lst name="params">
       <str name="q">*:*</str>
       <str name="rows">0</str>
      </lst>
     </lst>
     <lst name="error">
      <str name="msg">no servers hosting shard: </str>
      <int name="code">503</int>
     </lst>
    </response>
  2. Now, if we would like to force Solr to return partial results, we have to provide the shards.tolerant parameter and set it to true so that our query looks as follows:
    http://localhost:8983/solr/testcollection/select?q=*:*&shards.tolerant=true&rows=0

    The response returned by Solr will be as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <response>
     <lst name="responseHeader">
      <bool name="partialResults">true</bool>
      <int name="status">0</int>
      <int name="QTime">5</int>
      <lst name="params">
       <str name="q">*:*</str>
       <str name="shards.tolerant">true</str>
       <str name="rows">0</str>
      </lst>
     </lst>
     <result name="response" numFound="2" start="0" maxScore="1.0">
     </result>
    </response>

As we can see, even though one of the shards failed, Solr returned partial results. Let's now see how it works.
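The two request URLs used in the steps above can also be assembled programmatically. The following is a minimal Python sketch, not part of the original recipe; the helper name build_select_url is hypothetical, and the host, port, and collection name are simply the ones used in this recipe. Note that urlencode percent-encodes the *:* query, which Solr decodes back on the server side.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_select_url(base_url, collection, query="*:*", rows=0, tolerant=False):
    """Build a /select URL; add shards.tolerant=true only when requested.

    build_select_url is a hypothetical helper for illustration, not a Solr API.
    """
    params = {"q": query, "rows": rows}
    if tolerant:
        # Ask Solr to return partial results instead of failing the request.
        params["shards.tolerant"] = "true"
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Host and collection are the ones assumed throughout this recipe.
strict_url = build_select_url("http://localhost:8983/solr", "testcollection")
tolerant_url = build_select_url(
    "http://localhost:8983/solr", "testcollection", tolerant=True
)

print(strict_url)
print(tolerant_url)
```

Keeping the flag as an explicit argument makes it easy to decide per request whether partial results are acceptable.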

How it works...

As we can see, the first request failed. This is because, by default, Solr fails a search request if it can't execute the search on the full result set. Solr tries to execute the search request on all the shards that make up the collection, and when one of them is unavailable, the whole search request fails. As we said, we are not interested in such behavior, because partial results are good enough for us.

To achieve this, we added the shards.tolerant parameter to our query and set it to true (by default, it is false). By doing this, we tell Solr that we accept partial results, which means that even if only a single shard of our collection is working, Solr will still return data to us. To notify us that the results are partial rather than complete, Solr included the partialResults property in the response header and set it to true. If all the shards had returned data, the results would be complete and the property would not be set to true.
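A client that uses shards.tolerant should check the partialResults flag before trusting the hit count. The following Python sketch is one possible way to do this with the standard library; the function name summarize_response is hypothetical, and the sample document is the tolerant response shown earlier in this recipe. A missing partialResults entry is treated as "not partial", which is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# The tolerant response shown earlier in this recipe.
SAMPLE_RESPONSE = """<?xml version="1.0" encoding="UTF-8"?>
<response>
 <lst name="responseHeader">
  <bool name="partialResults">true</bool>
  <int name="status">0</int>
  <int name="QTime">5</int>
 </lst>
 <result name="response" numFound="2" start="0" maxScore="1.0">
 </result>
</response>"""


def summarize_response(xml_text):
    """Return whether the results are partial and how many documents matched."""
    root = ET.fromstring(xml_text.strip().encode("utf-8"))
    header = root.find("lst[@name='responseHeader']")
    flag = None
    if header is not None:
        flag = header.find("bool[@name='partialResults']")
    result = root.find("result[@name='response']")
    return {
        # Assumption: an absent flag means the results are complete.
        "partial": flag is not None and (flag.text or "").strip() == "true",
        "num_found": int(result.get("numFound")) if result is not None else None,
    }


summary = summarize_response(SAMPLE_RESPONSE)
if summary["partial"]:
    print("Warning: showing partial results,", summary["num_found"], "documents found")
```

An application could use the flag to show a notice to users that some results may be missing.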

There's more...

Of course, search time fault tolerance (being able to operate even in case of partial system failure) is not everything. Solr also supports indexing time fault tolerance.

Defining the achieved replication factor

Similar to what we just discussed, Solr allows you to specify the min_rf parameter on update requests. Its value should be the desired replication factor. For example, if we only want Solr to ensure that the update was processed by the leader shard, we should set it to 1; if we want one leader and one replica, we should set it to 2, and so on. For instance, if we set min_rf to 1 and only the leader successfully indexes the document while all the replicas fail, Solr will return the rf property in the response and set it to 1. This means that only the leader successfully indexed the document and the replicas will have to sync with the leader once they recover. Such an update is still considered successful, but by setting min_rf we can make Solr report the achieved replication factor in the update response.
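To illustrate how a client might react to the rf value, here is a minimal Python sketch. The helper names and the sample update response are hypothetical: the recipe only states that Solr returns an rf property, so the exact XML layout below is an assumption, and the code looks for an int element named rf anywhere in the response rather than at a fixed location.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_update_url(base_url, collection, min_rf):
    """Build an /update URL carrying the min_rf parameter (hypothetical helper)."""
    return f"{base_url}/{collection}/update?{urlencode({'min_rf': min_rf})}"


def achieved_rf(xml_text):
    """Return the rf value found in an update response, or None if absent."""
    root = ET.fromstring(xml_text.strip().encode("utf-8"))
    for node in root.iter("int"):
        if node.get("name") == "rf":
            return int(node.text)
    return None


# Hypothetical update response: the layout is assumed for illustration only.
SAMPLE_UPDATE_RESPONSE = """<response>
 <lst name="responseHeader">
  <int name="rf">1</int>
  <int name="status">0</int>
 </lst>
</response>"""

desired_rf = 2  # one leader plus one replica
update_url = build_update_url("http://localhost:8983/solr", "testcollection", desired_rf)
rf = achieved_rf(SAMPLE_UPDATE_RESPONSE)

# The update itself succeeded; we only learn that fewer copies acknowledged it.
if rf is not None and rf < desired_rf:
    print("Update accepted, but achieved replication factor", rf, "is below", desired_rf)
```

Because the update is still considered successful, it is up to the client to decide what to do when rf is lower than desired, for example logging the event or resending the document later.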
