Time for action – adding a custom DocTransformer to hide empty fields in the results

Since DocTransformers can manipulate the data returned for a given document, they are very useful for adding custom normalizations or simple data manipulations. A brief look at how to design a new transformer will make this clearer.

Imagine we had a transformer designed to avoid returning empty values in our results (in our case, several fields will be empty):

>> curl -X GET 'http://localhost:8983/solr/paintings_transformers/select?q=*:*&fl=*,[noempty]&wt=json&indent=true'

The previous request works if we add two lines like these to our solrconfig.xml:

<lib dir="${solr.core.instanceDir}/lib/" regex="solr-plugins-java.jar" />
<transformer name="noempty" class="it.seralf.solrbook.doctransformers.RemoveEmptyFieldsTransformerFactory" />

Here, the factory class is used to create an instance of RemoveEmptyFieldsTransformer, which has an outline as shown in the following code:

import java.util.Iterator;
import java.util.List;
import java.util.Map.Entry;

import org.apache.solr.common.SolrDocument;
import org.apache.solr.response.transform.DocTransformer;

class RemoveEmptyFieldsTransformer extends DocTransformer {
  final String name = "noempty";

  // The name used to invoke the transformer from a request: fl=*,[noempty]
  @Override
  public String getName() {
    return this.name;
  }

  private void removeEmpty(final List<?> list) {
    // TODO: remove empty values from the list
  }

  @Override
  public void transform(final SolrDocument doc, final int docid) {
    final Iterator<Entry<String, Object>> it = doc.entrySet().iterator();
    while (it.hasNext()) {
      final Entry<String, Object> entry = it.next();
      if (entry.getValue() == null) {
        it.remove(); // drop fields that have no value
      } else if (entry.getValue() instanceof List<?>) {
        final List<?> list = (List<?>) entry.getValue();
        removeEmpty(list);
        if (list.isEmpty()) it.remove(); // drop multivalued fields left with no values
      }
    }
  }
}

The compiled jar containing the class goes in the /lib directory under the core folder. I've omitted the details here; you can find the complete runnable example under the path /SolrStarterBook/solr-app/chp04/paintings_transformers/, and the source in the project /SolrStarterBook/projects/solr-maven/solr-plugins-java.
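The removeEmpty() step is left as a TODO in the outline. The following is only a minimal sketch of how the removal logic could look, not the code from the book's repository. To keep it runnable without the Solr jars, a plain java.util.Map stands in for SolrDocument; the class name EmptyFieldFilter, the sample field names, and the rule that "empty" means null or a blank string are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-alone sketch: a plain Map plays the role of SolrDocument here.
final class EmptyFieldFilter {

    // Assumption: a value counts as empty when it is null or a blank string.
    static boolean isEmpty(final Object value) {
        return value == null || value.toString().trim().isEmpty();
    }

    // Removes empty elements from a multivalued field, in place.
    static void removeEmpty(final List<?> list) {
        final Iterator<?> it = list.iterator();
        while (it.hasNext()) {
            if (isEmpty(it.next())) {
                it.remove();
            }
        }
    }

    // Mirrors the transform() outline: drops null values and lists left empty.
    static void removeEmptyFields(final Map<String, Object> doc) {
        final Iterator<Map.Entry<String, Object>> it = doc.entrySet().iterator();
        while (it.hasNext()) {
            final Object value = it.next().getValue();
            if (value == null) {
                it.remove();
            } else if (value instanceof List<?>) {
                final List<?> list = (List<?>) value;
                removeEmpty(list);
                if (list.isEmpty()) {
                    it.remove();
                }
            }
        }
    }

    public static void main(final String[] args) {
        // Hypothetical document with some empty fields.
        final Map<String, Object> doc = new LinkedHashMap<String, Object>();
        doc.put("title", "Mona Lisa");
        doc.put("subject", null);
        doc.put("city", new ArrayList<Object>(Arrays.asList("Paris", "", null)));
        doc.put("comment", new ArrayList<Object>(Arrays.asList("", " ")));
        removeEmptyFields(doc);
        System.out.println(doc); // prints {title=Mona Lisa, city=[Paris]}
    }
}
```

Like the outline, this sketch leaves a single-valued blank string untouched and removes only null values and lists that end up empty. Extending the first branch to call isEmpty() would cover that case too.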

What just happened?

The Java class structure is really simple. Every new transformer must extend the general abstract DocTransformer class. A single transform() method contains all the logic; it receives a SolrDocument object whose values it can transform. We will see this behavior again when we introduce customizations inside the update chain.

The getName() method is used to identify this object, which lets us invoke it from requests by the name noempty. The same name is used in the configuration file to bind the name to the Java class. Also, notice that a very similar approach can be used to introduce custom functions.

Looking at core parameters for queries

At this point, you are probably curious about all the parameters that we can use within a Solr request. A complete list would be tedious and difficult to read, because there are many specialized parameters and sometimes field-specific parameters. I suggest you start by reading carefully how to use the main, basic ones listed in the following table:

Parameter	Meaning	Default value
`q`	This is the query.	`*:*`
`defType`	This is the query parser that will be used.	Lucene query parser
`q.op`	The Boolean query operator used: `AND`/`OR`.	`OR`
`df`	This is the default field to search when the query does not name one. It replaces the deprecated `defaultSearchField` setting.	defined in `schema.xml`
`start`	The ordinal number of the first document to be returned in the results.	0
`rows`	The number of documents to be returned in the results.	10
`fl`	The list of the fields to be exposed in the results.	all
`fq`	This is a filter query for filtering the results.	N/A
`pageDoc` and `pageScore`	These parameters are useful to request documents with a score greater than a certain value. Notice that they need the implicit `score` field.	N/A
`omitHeader`	This produces a response without the header part.	false
`qt`	This is the type of the query handler.	Lucene standard
`wt`	This is the writer for the response formatting.	XML
`debug`	This adds debugging info to the output.	false
`timeAllowed`	This is the maximum time allowed (in milliseconds) for executing a query.	N/A

As usual, consider this list a starting point. If you want more details, you should check the wiki page for the list:

http://wiki.apache.org/solr/CommonQueryParameters

You can also look at the following wiki pages:

http://wiki.apache.org/solr/SearchHandler

http://wiki.apache.org/solr/SearchComponent

In these pages, you will find that there are several different search components that can extend this list. We will cover the most interesting ones for our purposes in Chapter 5, Extending Search, and Chapter 6, Using Faceted Search – from Searching to Finding. Although they can be used in combination with the ones shown in the table, at the moment we will remain focused on the most essential parts of a Solr query.
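To see how the essential parameters fit together, here is a minimal sketch that assembles a select URL from q, start, rows, fl, and wt using only the Java standard library. The class name SelectUrlBuilder and its methods are hypothetical helpers, not part of Solr or SolrJ; the base URL reuses the paintings_transformers core from the earlier example, and the zero-based page numbering is an assumption of this sketch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds a Solr select URL from a map of parameters.
final class SelectUrlBuilder {

    // URL-encodes a single name or value as UTF-8.
    static String enc(final String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Joins the parameters into an encoded query string appended to baseUrl.
    static String build(final String baseUrl, final Map<String, String> params) {
        final StringBuilder url = new StringBuilder(baseUrl).append('?');
        boolean first = true;
        for (final Map.Entry<String, String> p : params.entrySet()) {
            if (!first) {
                url.append('&');
            }
            url.append(enc(p.getKey())).append('=').append(enc(p.getValue()));
            first = false;
        }
        return url.toString();
    }

    // Parameters for one page of results; page numbers start at 0 (assumption).
    static Map<String, String> page(final String query, final int page, final int rows) {
        final Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", query);
        params.put("start", String.valueOf(page * rows)); // ordinal of the first document
        params.put("rows", String.valueOf(rows));          // documents per page
        params.put("fl", "*,score");                        // all fields plus the implicit score
        params.put("wt", "json");
        return params;
    }

    public static void main(final String[] args) {
        final String base = "http://localhost:8983/solr/paintings_transformers/select";
        // Third page of ten results for museum:louvre.
        System.out.println(build(base, page("museum:louvre", 2, 10)));
    }
}
```

The start value is computed as page * rows, so the third page of ten results begins at document 20. Encoding every value also keeps characters such as the colon in museum:louvre safe inside the URL.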

Using the Lucene query parser with defType

We can explicitly select the Lucene query parser, which is the default, in several equivalent ways, using either local parameters inside the query or the defType parameter:

q={!query defType=lucene}museum:louvre
q={!lucene}museum:louvre
q=museum:louvre&defType=lucene
q=museum:louvre

For the moment, we will use the last form, which has the simplest syntax. The others are useful in more advanced searches, when you need to choose a specific alternative parser for queries or for handling sub-queries.
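When these forms are sent over HTTP, the braces, exclamation mark, and colon in the q value need URL-encoding. The following sketch shows what the equivalent forms look like as encoded query strings; the class DefTypeForms and its methods are hypothetical names used only for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: encodes the equivalent ways of selecting a query parser.
final class DefTypeForms {

    // URL-encodes a value as UTF-8.
    static String encode(final String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // The parser is chosen by local parameters embedded in q, or left to the default.
    static String withLocalParams(final String q) {
        return "q=" + encode(q);
    }

    // The parser is chosen by a separate defType request parameter.
    static String withDefType(final String q, final String defType) {
        return "q=" + encode(q) + "&defType=" + encode(defType);
    }

    public static void main(final String[] args) {
        System.out.println(withLocalParams("{!query defType=lucene}museum:louvre"));
        System.out.println(withLocalParams("{!lucene}museum:louvre"));
        System.out.println(withDefType("museum:louvre", "lucene"));
        System.out.println(withLocalParams("museum:louvre"));
    }
}
```

All four query strings ask for the same search. The last one is the shortest because it relies on the Lucene parser being the default.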

Notice that the df parameter specifies the default field in which to search when the query does not name one. It is the defaultSearchField element in schema.xml that is deprecated, and passing df with the request (or in the handler defaults) is the recommended replacement.
