Time for action – finding a document from any shard

Now that we have the principle, let's try to simulate a sharded query in a slightly more realistic way, as given in the following steps:

  1. We can, for example, start two Solr instances and use cores from them as our shards. In order to simplify that, I have prepared a simple script (/SolrStarterBook/test/chp07/arts/start_shards.sh), which you can use after the usual start chp07 script to have a new running Solr instance.
  2. The new instance will need a different port, so I chose port 9999, which is simple to remember. I also decided to reuse the multicore definitions from Chapter 5, Extending Search; please remember to add the data again before continuing.
  3. Once we have our instances running (port 8983 for the current examples, port 9999 for the examples in Chapter 5, Extending Search), we need to add a new test document on a single shard to be able to prove that we will find it even by querying on a different one.
  4. Let's add our test document (/SolrStarterBook/test/chp07/arts/test_postSingleShard.sh):
    >> curl 'http://localhost:8983/solr/arts_paintings/update?commit=true&wt=json' -H 'Content-type:application/json' -d '
    [{
        "uri" : "TEST_SHARDS_01",
        "title" : "This is a DUMMY title",
        "artist" : "Some Artist Name",
        "museum" : "Some Museum"
    }]'
  5. All we have to do then is to search for this particular document, starting our search on the paintings core from Chapter 5, Extending Search:
    >> curl -X GET 'http://localhost:9999/solr/paintings/select?q=title:dummy&wt=json&indent=true&fl=*,[shard]&shards=localhost:8983/solr/arts_paintings,localhost:9999/solr/paintings'
    

Note

Note that we need to remove the http:// part from the host/core definition in the shards list.
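The note above can be sketched as a small shell helper. This is only an illustration: the function name build_shards_param is hypothetical and not part of the book's scripts. It assumes we start from full core URLs and want the comma-separated value that the shards parameter expects.

```shell
#!/bin/sh
# Hypothetical helper (not part of the book's scripts): turn full core URLs
# into the comma-separated host/core list used by the shards parameter.
# Note: plain POSIX sh has no local variables, so result/shard/url are global.
build_shards_param() {
    result=""
    for url in "$@"; do
        # Drop a leading http:// (or https://) scheme, if present.
        shard="${url#http://}"
        shard="${shard#https://}"
        # Drop a single trailing slash, if present.
        shard="${shard%/}"
        if [ -z "$result" ]; then
            result="$shard"
        else
            result="$result,$shard"
        fi
    done
    printf '%s\n' "$result"
}

SHARDS=$(build_shards_param \
    'http://localhost:8983/solr/arts_paintings' \
    'http://localhost:9999/solr/paintings')
echo "$SHARDS"
# prints: localhost:8983/solr/arts_paintings,localhost:9999/solr/paintings
```

With both instances running, the value could then be appended to the query from step 5, for example as "...&shards=$SHARDS".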

What just happened?

Looking at the results obtained by our test query, we will find the documents we are searching for, as expected:

In this screenshot we can clearly recognize the shard source by the use of the [shard] field. An interesting point is that the localhost:9999/paintings core we are using is defined by a schema that is almost identical to the ones used in this chapter. If we test the same query (or any other query) on the localhost:9999/cities core, we will get an error similar to the following output:

java.lang.NullPointerException at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:879)...

The error reminds us that the results obtained from different shards are actually merged, and the most important field for Solr to be able to merge those results is the one used as the unique key. Indeed, in the cities example we had no uri field, and even if our schema is very flexible thanks to dynamic fields, the results cannot be merged this way. Note that if we repeat our query, this time adding the localhost:9999/cities core to the shards list, it makes no difference, as there simply is no match.
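Since merging depends on the unique key, one possible sanity check before using two cores together is to compare the uniqueKey element declared in their schema.xml files. The following is a minimal sketch under stated assumptions: the function names and the script name check_unique_key.sh are hypothetical, the schema paths are placeholders you would supply yourself, and the parsing is a simple line-based match rather than a real XML parser.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the book's scripts).

# Print the field name declared in the <uniqueKey> element of a schema.xml.
# Assumes the element sits on a single line, as it usually does.
unique_key_of() {
    sed -n 's|.*<uniqueKey>[[:space:]]*\([^<[:space:]]*\)[[:space:]]*</uniqueKey>.*|\1|p' "$1"
}

# Succeed only if both schema files declare the same, non-empty uniqueKey.
same_unique_key() {
    key_a=$(unique_key_of "$1")
    key_b=$(unique_key_of "$2")
    [ -n "$key_a" ] && [ "$key_a" = "$key_b" ]
}

# Usage (paths are placeholders):
#   sh check_unique_key.sh path/to/first/schema.xml path/to/second/schema.xml
if [ "$#" -eq 2 ]; then
    if same_unique_key "$1" "$2"; then
        echo "uniqueKey matches: $(unique_key_of "$1")"
    else
        echo "uniqueKey differs or is missing; merging shard results may fail" >&2
        exit 1
    fi
fi
```

A matching unique key is only a precondition for merging; it does not guarantee that the rest of the two schemas is compatible.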

This could give us some insight into how to use shards in a creative way, by defining a single nonrestrictive core to be used as a central query point, distributing requests all over different core definitions, used as shards. Even if we plan to force the use of shards this way, please note that we still have to manage the data in them manually, and we cannot automatically split an existing index in a simple way. These topics are well covered by adopting the features of SolrCloud, as we will see later in this chapter.

In these examples, we are using very simple queries to focus on the main aspects of using shards and to avoid an overly complex list of parameters; but once we are querying a Solr instance, we can use the common query parameters and components for faceting, highlighting, stats, and terms handling. However, there are some intuitive limitations: for example, you should have defined a field to be used as a unique key across all the shards, and joins as well as scoring- and similarity-oriented features (for example, idf, MoreLikeThis, and so on) will not work at the moment. You will find more precise details on sharding on the official wiki page:

http://wiki.apache.org/solr/DistributedSearch
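As an illustration of combining the shards parameter with one of the common components mentioned above, here is a minimal sketch that assembles a distributed query with faceting enabled. The helper name facet_query_url is hypothetical; the facet=true and facet.field parameters are standard Solr faceting parameters, and artist is the field from our test document. No URL encoding is applied, so this only suits simple values like the ones shown.

```shell
#!/bin/sh
# Hypothetical helper (not part of the book's scripts): build a distributed
# select URL that also asks for facet counts on one field.
#   $1 = host/core receiving the request (without http://)
#   $2 = comma-separated shards list (without http://)
#   $3 = query string
#   $4 = field to facet on
facet_query_url() {
    printf 'http://%s/select?q=%s&wt=json&indent=true&facet=true&facet.field=%s&shards=%s\n' \
        "$1" "$3" "$4" "$2"
}

URL=$(facet_query_url \
    'localhost:9999/solr/paintings' \
    'localhost:8983/solr/arts_paintings,localhost:9999/solr/paintings' \
    'title:dummy' \
    'artist')
echo "$URL"

# With both instances running, the request could then be sent with:
#   curl -X GET "$URL"
```

The facet counts returned this way are computed per shard and merged by the core that receives the request, just like the documents themselves.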
