Chapter 8. Search Components

Many of Solr's major capabilities are internally organized into search components. You've actually been using several of them already: QueryComponent performs the actual searching, DebugComponent gathers invaluable query debugging information when setting debugQuery, and FacetComponent performs the faceting we used in Chapter 7, Faceting. In addition, there are many more that do all sorts of useful things that can really enhance your search experience:

  • Highlighting: This returns highlighted text snippets of matching text in the original data
  • Spell checking: This suggests alternative queries, often called Did you mean?
  • Suggester: This suggests complete queries based on partially typed input; often called query autocomplete
  • Query elevation: This manually modifies search results for certain queries
  • More-like-this: This helps to find documents similar to another document or provided text
  • Stats: This is used for mathematical statistics of indexed numbers
  • Clustering: This organizes search results into statistically similar clusters
  • Collapse and Expand / Grouping: These group search results by a field and limit the number of results per group
  • Terms and TermVector: These retrieve raw indexed data

And there are a few others we won't go into. For example, ResponseLogComponent adds document IDs and scores to Solr's log output—a feature useful for debugging and for relevancy analysis.

Tip

In a hurry?

Search features such as search result highlighting, query spell-checking, and query auto-completion are of high value for most search applications; don't miss them. Take a peek at the others to see if they are applicable to you.

About components

At this point, you should be familiar with the <requestHandler/> definitions in solrconfig.xml, which were explained in Chapter 5, Searching. Any request handler with class="solr.SearchHandler" is intuitively related to searching. Yet the Java code implementing SearchHandler doesn't actually do any searching! Instead, it maintains a list of SearchComponents that are invoked in sequence for each request. Which search components are used, and in what order, is configurable.

What follows is our request handler for MusicBrainz releases but modified to explicitly configure the components for the purpose of illustration:

<requestHandler name="mb_releases" class="solr.SearchHandler">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">r_name r_a_name^0.4</str>
    <str name="pf">r_name^0.5 r_a_name^0.2</str>
    <str name="qs">1</str>
    <str name="ps">0</str>
    <str name="tie">0.1</str>
    <str name="q.alt">*:*</str>
  </lst>
  <!-- note: these components are the default ones -->
  <arr name="components">
    <str>query</str>
    <str>facet</str>
    <str>mlt</str>
    <str>highlight</str>
    <str>stats</str>
    <str>debug</str>
  </arr>
  <!-- INSTEAD, "first-components" and/or "last-components" may be specified. -->
</requestHandler>

The search components named here are the defaults; they are used automatically if you do not specify a components section. To use additional components, you can either re-specify components with your changes, or add them to the first-components or last-components lists, which are prepended and appended, respectively, to the standard component list.

Tip

Many components depend on other components being executed first, especially the query component, so you will usually add components to last-components.

Search components must be registered in solrconfig.xml so that they can then be referred to in a components list. The components in the default set are pre-registered, although some, like highlighting, are registered explicitly anyway because they have configuration settings that aren't request parameters. Here's an example of how the search component named elevator is registered in solrconfig.xml:

<searchComponent name="elevator" class="solr.QueryElevationComponent">
  <str name="queryFieldType">string</str>
  <str name="config-file">elevate.xml</str>
</searchComponent>
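
Once registered, a component can be referred to by name from a request handler. What follows is a minimal sketch, not the book's actual configuration: it assumes the mb_releases handler shown earlier (with its defaults abbreviated to two parameters for brevity) and appends elevator through last-components rather than re-specifying the whole components list:

<requestHandler name="mb_releases" class="solr.SearchHandler">
  <!-- defaults abbreviated for illustration -->
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">r_name r_a_name^0.4</str>
  </lst>
  <!-- entries here are appended after the default component list -->
  <arr name="last-components">
    <str>elevator</str>
  </arr>
</requestHandler>

Using last-components keeps the default components in place, so the handler does not need updating if the default list changes in a later Solr release.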

The functionality in QueryComponent, FacetComponent, and DebugComponent has been described in previous chapters. The rest of this chapter describes other search components that come with Solr.

Tip

Doing a distributed-search?

In a distributed search, Solr searches across multiple Solr cores/servers (shards in distributed-search lingo) as if they were one logical index. It will be discussed in Chapter 11, Deployment. Before Solr 5.1, an internal sharded request goes to the default request handler unless told otherwise, even if your client issued the request to another handler. To ensure that the relevant search components are still activated on a sharded request, you can use the shards.qt parameter, just as you would qt. Solr 5.1 changed the default behavior for the better: a sharded request now goes to the same handler that received the original request.
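
If sharded requests need to be pinned to a particular handler, one option is to set shards.qt as a handler default instead of passing it on every request. This is only a sketch under stated assumptions: it reuses the mb_releases handler name from earlier in this chapter and omits that handler's other defaults:

<requestHandler name="mb_releases" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- assumption: route internal shard requests back to this same handler -->
    <str name="shards.qt">mb_releases</str>
  </lst>
</requestHandler>

A client can still override this default by sending shards.qt as an ordinary request parameter.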
