Chapter 9. Integrating Solr

As the saying goes, if a tree falls in the woods and no one hears it, did it make a sound? Similarly, if you have a wonderful search engine, but your users can't access it, do you really have a wonderful search engine? Fortunately, Solr is very easy to integrate into a wide variety of client environments via its modern, easy-to-use, REST-like interface and multiple data formats. In this chapter, we will:

  • Quickly prototype a search UI using Solritas (the /browse UI)
  • Look at accessing Solr results through various language-based clients, including Java, Ruby, and PHP
  • Learn how to build a dynamic JavaScript-based interface for Solr using AJAX calls
  • Briefly cover building our own Google-like search engine by crawling the MusicBrainz.org site with the Nutch web crawler
  • Leverage Hadoop to build Solr indexes using multiple machines
  • Translate search results into the OpenSearch XML standard via XSLT
  • Review ManifoldCF, a framework for syncing content from external repositories that respects the access rules of external documents

There are many more topics we could have covered in this chapter, but space is limited. We have put together a page on the Solr community wiki that gathers the options for working with Solr, from language-specific client libraries, such as .NET and Python, to document processing pipelines, Solr-compatible crawlers, monitoring tools, and more. Visit http://wiki.apache.org/solr/SolrEcosystem for the latest listings.

Tip

In a hurry?

This chapter covers a wide variety of integrations with Solr. If you are in a hurry, jump to the next section, Inventory of examples, to find the source code that you can immediately start using. Then read the sections that apply to the environment you are working in.

We will be using our MusicBrainz dataset to power these examples. You can download the full sample code for these integrations from our website, http://www.SolrEnterpriseSearchServer.com. The download includes a prebuilt Solr and scripts to load two collections: mbtracks, with seven million records, and mbartists, with 400,000 records. After downloading the ZIP file, follow the setup instructions in the README.txt file.
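To make the REST-like interface mentioned at the start of the chapter concrete, here is a minimal sketch of how a client might assemble a search request against the mbartists collection. The host, port, and /select path are assumptions based on a typical local Solr setup, and build_select_url is a hypothetical helper, not part of any Solr client library; check README.txt for the actual location of the prebuilt Solr.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, rows=10, wt="json"):
    """Build a Solr search URL (hypothetical helper for illustration).

    base_url and the /select handler path are assumptions about a
    typical local install; q, rows, and wt are standard Solr request
    parameters for the query, result count, and response format.
    """
    params = urlencode({"q": query, "rows": rows, "wt": wt})
    return f"{base_url}/{collection}/select?{params}"


# Assumed local Solr address; adjust to match your own setup.
url = build_select_url("http://localhost:8983/solr", "mbartists", "Smashing Pumpkins")
print(url)
```

Any language that can issue an HTTP GET and parse the chosen response format can work with a URL like this, which is the basis for the client integrations covered in the rest of the chapter.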

Working with the included examples

We have included a wide variety of sample integrations that you can run as you work through this chapter. The examples stored in ./examples/9/ of the downloadable ZIP file are as self-contained as we could make them. They are detailed in this chapter, and you shouldn't run into any problems making them work. Check the support section of the book website for any errata.

Inventory of examples

The following is a quick summary of the various examples of using Solr, available, unless otherwise noted, in ./examples/9/:

  • ajaxsolr: This is an example of building a fully featured Solr Search UI using just JavaScript.
  • php: This is a bare-bones example of the PHP integration with Solr.
  • solr-php-client: This is a richer example of integrating Solr results into a PHP-based application.
  • Solritas: This is a web search UI using the template files in /cores/mbtypes/conf/velocity.
  • jquery_autocomplete: This is an example of using the jQuery Autocomplete library to provide search suggestions based on Solr searches.
  • myfaves: This is a Ruby on Rails application using the Ruby Solr client library Sunspot to search for music artists.
  • nutch: This is a simple example of the Nutch web crawler integrated with Solr.
  • manifoldcf: This is a document crawling and ingestion framework with connectors to many systems such as SharePoint.
  • solrj: This is an example of a SolrJ-based Java client.
  • solr-map-reduce-example: This shows using Hadoop and the MapReduce paradigm to build Solr indexes using multiple machines.
  • heritrix-2.0.2: This is an example of web crawling with Heritrix. The output files in heritrix-2.0.2/jobs/ are used in the SolrJ example.