Chapter 4. Client API

A search application needs to interact with Solr by issuing index and search requests. Although Solr exposes these services through HTTP, working directly at that low level is cumbersome for a developer. Client APIs are façade libraries that hide the low-level details of client-server communication. They allow us to interact with Solr using client-native constructs and structures, such as the so-called Plain Old Java Object (POJO) in the Java programming language.

In this chapter we will describe Solrj, the official Solr client Java library. We will also describe the structure and the main classes involved in index and search operations. The chapter will cover the following topics:

  • Solrj: the official Java client library
  • Other available bindings

Solrj

Solrj is the name of the official Solr Java client. It completely abstracts the underlying (HTTP) transport layer and offers a simple interface to client applications to interact with Solr.

SolrServer – the Solr façade

A client library needs a façade or proxy, that is, an object that represents the remote resource and hides the low-level details of client-server interaction. In Solrj, this role is played by classes that extend the org.apache.solr.client.solrj.SolrServer abstract class. At the time of writing this book, these are the available SolrServer implementations:

  • EmbeddedSolrServer: This connects to a local SolrCore without requiring an HTTP connection. This is not recommended in production but is definitely useful for unit tests and development.
  • HttpSolrServer: This is a proxy that connects to a remote Solr using an HTTP connection.
  • LBHttpSolrServer: This is a proxy that wraps multiple HttpSolrServer instances and implements client-side, round-robin load balancing between them. It also periodically checks whether each server is running, removing members from the round-robin list or adding them back as needed.
  • ConcurrentUpdateSolrServer: This is a proxy that uses an asynchronous queue to buffer input data (that is, documents). Once a given buffer threshold is reached, data is sent to Solr using a configurable number of dequeuer threads.
  • CloudSolrServer: A proxy used to communicate with SolrCloud.
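
To make the round-robin idea behind LBHttpSolrServer more concrete, here is a minimal, library-free sketch. It is not the Solrj implementation: the RoundRobinSketch class, its methods, and the example URLs are hypothetical, and the periodic health check is reduced to explicit markDown and markUp calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model of client-side round-robin balancing (NOT the Solrj code).
class RoundRobinSketch {
    private final List<String> servers;                      // all configured servers
    private final Set<String> down = new LinkedHashSet<>();  // currently excluded servers
    private int cursor = 0;                                  // next position to try

    RoundRobinSketch(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("no servers");
        }
        this.servers = new ArrayList<>(servers);
    }

    // A real client would call these from a periodic health check.
    synchronized void markDown(String server) { down.add(server); }
    synchronized void markUp(String server) { down.remove(server); }

    // Returns the next live server, skipping the ones marked as down.
    synchronized String next() {
        for (int i = 0; i < servers.size(); i++) {
            String candidate = servers.get(cursor);
            cursor = (cursor + 1) % servers.size();
            if (!down.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no live servers");
    }

    public static void main(String[] args) {
        // Hypothetical server URLs, used only for illustration.
        RoundRobinSketch lb = new RoundRobinSketch(Arrays.asList(
            "http://solr1:8983/solr", "http://solr2:8983/solr"));
        System.out.println(lb.next());
        lb.markDown("http://solr2:8983/solr");
        System.out.println(lb.next());
    }
}
```

The design choice worth noting is that a failed server is only excluded, never forgotten, so it can rejoin the rotation as soon as a later check finds it running again.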

Although all the SolrServer implementations listed above offer the same functionality, HttpSolrServer and LBHttpSolrServer are better suited to issuing queries, while ConcurrentUpdateSolrServer is recommended for update requests.
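
The reason a queue-based proxy suits updates can be illustrated with a small, library-free sketch. This is only a toy model of the buffering idea, not the ConcurrentUpdateSolrServer code: the BufferedUpdateSketch class and its members are hypothetical, documents are plain strings, "sending" is just a counter increment, and the workers drain the queue continuously instead of waiting for a threshold.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of queue-based buffered updates (NOT the Solrj code).
class BufferedUpdateSketch {
    // Hypothetical end-of-stream marker; a fresh instance so identity is unique.
    private static final String STOP = new String("stop");

    private final BlockingQueue<String> queue;
    private final ExecutorService workers;
    private final int threadCount;
    final AtomicInteger sent = new AtomicInteger(); // documents "sent" so far

    BufferedUpdateSketch(int queueSize, int threadCount) {
        this.queue = new LinkedBlockingQueue<>(queueSize);
        this.threadCount = threadCount;
        this.workers = Executors.newFixedThreadPool(threadCount);
        for (int i = 0; i < threadCount; i++) {
            workers.submit(this::drain);
        }
    }

    // Blocks the caller only while the buffer is full.
    void add(String document) {
        try {
            queue.put(document);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Worker loop: takes documents off the queue and "sends" them.
    private void drain() {
        try {
            while (true) {
                String document = queue.take();
                if (document == STOP) {
                    return;                 // identity check on the marker
                }
                sent.incrementAndGet();     // stand-in for an HTTP update request
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Tells every worker to stop and waits until the buffer has been drained.
    boolean close() {
        try {
            for (int i = 0; i < threadCount; i++) {
                queue.put(STOP);
            }
            workers.shutdown();
            return workers.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The caller returns from add as soon as the document is queued, so indexing throughput is bounded by the worker threads rather than by the latency of each individual request.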

Tip

The test case, org.gazzax.labs.solr.ase.ch3.index.SolrServersITCase, contains several methods that demonstrate how to index data using different types of servers.

Input and output data transfer objects

As described in the previous chapters, a Document is a central concept in Solr. It represents an atomic unit of information exchanged between the client and the server. The Solr API separates input documents from output documents using the SolrInputDocument and SolrDocument classes, respectively.

Although they share basic data transfer object behavior, each has features specific to the direction of the client-server interaction it serves.

SolrInputDocument is a write object. You can add, change, and remove fields in it. You can also set a name, value, and optional boost for each of them:

public void addField(String name, Object value) 
public void addField(String name, Object value, float boost)
public void setField(String name, Object value) 
public void setField(String name, Object value, float boost)

SolrDocument is the output data transfer object, and it is primarily intended as a query result holder. Here, you can get field values, field names, and so on:

public Object getFieldValue(String name)
public Collection<Object> getFieldValues(String name)
public Object getFirstValue(String name)

Within an UpdateRequestProcessor instance, or while adding data to Solr, we will use SolrInputDocument instances. In QueryResponse (that is, the result of a query execution), we will find SolrDocument instances.
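
The difference between adding and setting a field is easy to miss, so here is a small, library-free sketch of it. DocumentSketch is a hypothetical stand-in, not one of the Solrj classes; it only mimics the behavior where addField appends a value to a field while setField replaces whatever the field held.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for a multi-valued document (NOT the Solrj classes).
class DocumentSketch {
    private final Map<String, List<Object>> fields = new LinkedHashMap<>();

    // Appends a value, keeping any values already stored under the name.
    void addField(String name, Object value) {
        fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // Replaces whatever was stored under the name with a single value.
    void setField(String name, Object value) {
        List<Object> values = new ArrayList<>();
        values.add(value);
        fields.put(name, values);
    }

    // All values of a field, or null when the field is absent.
    Collection<Object> getFieldValues(String name) {
        return fields.get(name);
    }

    // First value of a field, or null when the field is absent.
    Object getFirstValue(String name) {
        List<Object> values = fields.get(name);
        return values == null ? null : values.get(0);
    }

    public static void main(String[] args) {
        DocumentSketch doc = new DocumentSketch();
        doc.addField("genre", "Rock");
        doc.addField("genre", "Progressive Rock");
        System.out.println(doc.getFieldValues("genre")); // two values
        doc.setField("genre", "Jazz");
        System.out.println(doc.getFieldValues("genre")); // one value
    }
}
```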

Tip

All the examples in the sample project associated with this chapter make extensive use of these data transfer objects.

Adds and deletes

Once we hold a valid SolrServer reference, adding data to Solr is very easy. The SolrServer class defines several methods to do this:

UpdateResponse add(SolrInputDocument document)
UpdateResponse add(Collection<SolrInputDocument> documents)

So we first create one or more SolrInputDocument instances filled with the appropriate data:

final SolrInputDocument doc1 = new SolrInputDocument();
doc1.setField("id", 1234);
doc1.setField("title", "Delicate Sound of Thunder");
doc1.addField("genre", "Rock");
doc1.addField("genre", "Progressive Rock");

Then, using the proxy instance, we can add that data:

solrServer.add(doc1);

Finally, we can commit:

solrServer.commit();

We can also accumulate all the documents within a list and use that as the argument of the add method.
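
When the number of documents is large, one common approach is to send them in fixed-size batches rather than in a single huge list. The following is a minimal sketch of that idea only: BatchSketch and its partition helper are hypothetical names, the batch size is an arbitrary assumption, and plain strings stand in for SolrInputDocument instances.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for splitting a list of documents into batches.
class BatchSketch {
    // Splits items into consecutive sublists of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> documents = Arrays.asList("d1", "d2", "d3", "d4", "d5");
        for (List<String> batch : partition(documents, 2)) {
            // With Solrj, a batch of SolrInputDocument instances would be
            // passed to solrServer.add(batch) at this point.
            System.out.println(batch);
        }
        // A single solrServer.commit() would follow once all batches are sent.
    }
}
```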

Following the same logic as described in the second chapter for REST services, SolrServer provides the following methods to delete documents:

UpdateResponse deleteById(String id)
UpdateResponse deleteById(String id, int commitWithinMs)
UpdateResponse deleteById(List<String> ids)
UpdateResponse deleteById(List<String> ids, int commitWithinMs)
UpdateResponse deleteByQuery(String query)
UpdateResponse deleteByQuery(String query, int commitWithinMs)

Tip

The org.gazzax.labs.solr.ase.ch3.index.SolrServersITCase test case contains several methods that illustrate how to index and delete data.

Search

Searching with Solrj mainly requires knowledge of two classes: org.apache.solr.client.solrj.SolrQuery and org.apache.solr.client.solrj.response.QueryResponse. The first is an object representation of a query that can be sent to Solr. It lets us set all the parameters described in the previous chapter. One way to do this is through dedicated methods, such as these:

SolrQuery setQuery(String query)
SolrQuery setRequestHandler(String qt)
SolrQuery addSort(String field, ORDER order)
SolrQuery setStart(Integer start)
SolrQuery setFacet(boolean b)
SolrQuery addFacetField(String ... fields)
SolrQuery setHighlight(boolean b)
SolrQuery setHighlightSnippets(int num)
…

Alternatively, we can use the generic setter methods:

SolrQuery setParam(String name, String... values)
SolrQuery setParam(String name, boolean value)

Note that all the preceding methods return the same SolrQuery object, thus allowing a caller to chain method calls, like this:

SolrQuery query = new SolrQuery()
  .setQuery("Charles Mingus")
  .setFacet(true)
  .addFacetField("genre")
  .addSort("title", SolrQuery.ORDER.asc)
  .addSort("released", SolrQuery.ORDER.desc)
  .setHighlight(true);
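
To show why returning the same object enables chaining, and how such a query object relates to the HTTP parameters described in the previous chapter, here is a library-free sketch. QuerySketch is a hypothetical toy class, not SolrQuery; it supports only a handful of parameters (q, facet, facet.field, hl, sort) and is meant purely as an illustration of the fluent style.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

// Toy fluent builder (NOT SolrQuery): each setter returns this.
class QuerySketch {
    private final List<String> pairs = new ArrayList<>(); // encoded name=value pairs
    private final List<String> sorts = new ArrayList<>(); // "field direction" clauses

    QuerySketch setQuery(String q) { return add("q", q); }
    QuerySketch setFacet(boolean on) { return add("facet", String.valueOf(on)); }
    QuerySketch addFacetField(String field) { return add("facet.field", field); }
    QuerySketch setHighlight(boolean on) { return add("hl", String.valueOf(on)); }

    // Sort clauses are collected and joined into one sort parameter later.
    QuerySketch addSort(String field, String direction) {
        sorts.add(field + " " + direction);
        return this;
    }

    private QuerySketch add(String name, String value) {
        pairs.add(encode(name) + "=" + encode(value));
        return this; // returning the same instance is what allows chaining
    }

    // Renders the collected parameters; the sort parameter is appended last.
    String toQueryString() {
        List<String> all = new ArrayList<>(pairs);
        if (!sorts.isEmpty()) {
            all.add("sort=" + encode(String.join(",", sorts)));
        }
        return String.join("&", all);
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(new QuerySketch()
            .setQuery("Charles Mingus")
            .setFacet(true)
            .addFacetField("genre")
            .addSort("title", "asc")
            .toQueryString());
    }
}
```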

Once a SolrQuery has been built, we can use the appropriate method in the SolrServer proxy to send the query request:

QueryResponse query(SolrParams params)

The method returns a QueryResponse, which is an object representation of the response that Solr sent back as a result of the query execution. With that object, we can get the list of SolrDocuments of the currently returned page. We can also get facets and their values, and in general, we can inspect and access any part of the response.
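
Since a response carries only one page of results, a client usually needs a little arithmetic to move between pages. The sketch below shows that arithmetic under stated assumptions: PagingSketch and its methods are hypothetical helpers, pages are numbered from zero, and the page size is whatever number of rows the application requests.

```java
// Hypothetical paging helpers; only plain arithmetic, no Solrj calls.
class PagingSketch {
    // Number of pages needed to show numFound hits with the given page size.
    static long pageCount(long numFound, int rows) {
        if (rows <= 0) {
            throw new IllegalArgumentException("rows must be positive");
        }
        return (numFound + rows - 1) / rows; // ceiling division
    }

    // Zero-based offset of the first hit on the given zero-based page.
    static long startOf(long page, int rows) {
        return page * rows;
    }

    public static void main(String[] args) {
        long numFound = 19; // e.g. the total hits reported by a response
        int rows = 10;      // assumed page size
        for (long page = 0; page < pageCount(numFound, rows); page++) {
            // With Solrj, this offset would be passed to SolrQuery.setStart
            // before the query is sent again.
            System.out.println("page " + page + " starts at " + startOf(page, rows));
        }
    }
}
```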

Tip

The org.gazzax.labs.solr.ase.ch3.search.SearchITCase test case contains several examples that demonstrate how to query with Solrj.

The following is an example of the use of QueryResponse:

// Imports needed by this fragment
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

// Executes a query and gets the corresponding response
QueryResponse res = solrServer.query(aQuery);

// Gets the request execution elapsed time
long elapsedTime = res.getElapsedTime();

// Gets the results (i.e. a page of results)
SolrDocumentList results = res.getResults();

// How many total hits for this response
long totalHits = results.getNumFound();

// Iterates over the current page
for (SolrDocument document : results) {
  // Do something with the current document
  // (the cast assumes "title" is a single-valued string field)
  String title = (String) document.getFieldValue("title");
  // …
}

// Gets the facet field "genre"
FacetField genre = res.getFacetField("genre");

// Iterates over the facet values
for (FacetField.Count value : genre.getValues()) {
  String name = value.getName(); // e.g. Jazz
  long count = value.getCount(); // e.g. 19
}

// The highlighting section is a bit complicated, as the value object
// is a composite map where keys are the document identifiers, while
// values are maps with highlighted fields as keys and snippets
// (a list of snippets) as values.
Map<String, Map<String, List<String>>> hl = res.getHighlighting();

// Iterates over the highlighting section
for (Entry<String, Map<String, List<String>>> docEntry : hl.entrySet()) {
  String docId = docEntry.getKey();

  // Iterates over highlighted fields
  for (Entry<String, List<String>> fieldEntry : docEntry.getValue().entrySet()) {
    String fieldName = fieldEntry.getKey();

    // Iterates over snippets
    for (String snippet : fieldEntry.getValue()) {
      // Do something with the snippet
    }
  }
}