Case study – Apache Solr and Drupal

Apache Solr can be integrated with any subsystem, provided the integration is technically feasible. Some subsystems, however, offer additional plugins or modules that take the integration deeper. In this section, we will look at two different subsystems as case studies.

For any CMS, search is an integral part of the system. A CMS holds much of the information that users update every day, and it typically offers features such as multiple versions of the same document. However, many content management systems do not provide competitive search, mainly because their development has focused on delivering a feature-rich, fail-safe CMS for practical use. CMSes are widespread well beyond web portals, and many organizations use them heavily. Integrating Apache Solr enterprise search with a CMS brings the best of both worlds together. We will primarily look at the most popular open source CMSes in the world.

Drupal is one of the most popular CMSes in use today. Integrating Drupal with Apache Solr gives users access to a rich search interface and lets them search CMS content dynamically and at high speed. Solr features such as facets and relevance ranking help users reach the information they are looking for sooner, saving time in their day-to-day work. Since Apache Solr never queries Drupal's database, both Drupal and Solr have room to scale. Apache Solr is integrated with Drupal as a Drupal module. The integration goes deeper than the usual on-demand exchange of information: Drupal performs batch indexing, keeps track of Solr server connectivity, and so on.
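To make the idea of facets more concrete, here is a minimal sketch of the kind of faceted request a client could send to Solr. The `q`, `rows`, `wt`, `facet`, `facet.field`, and `facet.mincount` parameters are standard Solr query parameters; the server URL, the core name `drupal`, and the field names `bundle` and `im_field_tags` are assumptions made for illustration and depend on your own schema. This is not necessarily how the Drupal module builds its queries.

```python
from urllib.parse import urlencode


def build_facet_query(solr_url, keywords, facet_fields, rows=10):
    """Build a Solr select URL that returns results plus facet counts."""
    params = [
        ("q", keywords),        # the user's search keywords
        ("rows", rows),         # number of results to return
        ("wt", "json"),         # ask Solr for a JSON response
        ("facet", "true"),      # switch faceting on
        ("facet.mincount", 1),  # hide facet values with no matches
    ]
    # One facet.field parameter per field we want counts for.
    params += [("facet.field", field) for field in facet_fields]
    return solr_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical core URL and field names, used only for illustration.
url = build_facet_query(
    "http://localhost:8983/solr/drupal",
    "release notes",
    ["bundle", "im_field_tags"],
)
print(url)
```

Each `facet.field` in the request asks Solr to return value counts for that field alongside the search results, which is what a faceted search interface uses to render its filter links.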

How to do it?

Let's look at how Drupal and Apache Solr can be brought together to make effective use of the capabilities of both applications:

  1. First download Apache Solr and install it.
  2. The next step is to download Drupal and install it. You will find installation instructions at https://drupal.org/documentation/install.
  3. Once the two subsystems are available, the next step is to make the Apache Solr Search module available in Drupal. To do that, first download the module from Drupal's site (https://drupal.org/project/apachesolr). Unzip it and put it in your site's module directory, sites/all/modules, or in Drupal's own module directory. You may also install the Facet API (https://drupal.org/project/facetapi) and the Apache Solr framework modules along with this. If you use the Apache Solr framework, the Facet API is a part of it.
  4. Now enable the Apache Solr modules from Drupal administration. The modules are shown in the following screenshot:
  5. Since Drupal requires its own Apache Solr schema and configuration, it is recommended to create a separate core (call it Drupal) and copy the schema and configuration files from $MODULE_HOME/solr-conf/<your-solr>/* to your new core.
  6. Restart Apache Solr and ensure everything works fine.
  7. After enabling the modules in Drupal, go to the configuration of that module and click on the SETTINGS tab. Ensure the Solr URL is correct. You can also test the Solr connection by clicking on the localhost server as shown in the following screenshot:
  8. Now create some data (pages) and add some content to them on Drupal.
  9. Here, you have two options: either set up a cron job that calls http://<drupal-site>/cron.php, or perform indexing manually. Drupal entities are indexed during a Drupal cron run. For now, visit the DEFAULT INDEX tab in the configuration of the Apache Solr search module, and try to run the index. Refer to the following screenshot:

    The Apache Solr search module holds a pipeline of entities, which are processed into one or more documents. Each document object is then transformed into XML and sent to Solr for processing.

  10. Check on your Solr server that data indexing has taken place. You can also confirm this by simply running a search on Solr. Once this is verified, enable search blocks on the Drupal site. You can do this by navigating to Module configuration | Page/Block.
  11. Verify the search results using the blocks.
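
The indexing step above mentions that each document object is transformed into XML and sent to Solr. The following is a minimal sketch of that transformation in Python, not the module's actual code (the module itself is written in PHP). The `<add><doc><field name="...">` structure is Solr's standard XML update format; the function name and the sample field names and values are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET


def document_to_add_xml(document):
    """Serialize one document (field name -> value or list of values)
    as a Solr XML <add> message."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in document.items():
        # A multivalued field is written as repeated <field> elements.
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            field = ET.SubElement(doc, "field", {"name": name})
            field.text = str(item)
    # ElementTree escapes special characters such as & and < for us.
    return ET.tostring(add, encoding="unicode")


# Hypothetical document built from a Drupal page; field names are illustrative.
page = {
    "id": "node/1",
    "entity_type": "node",
    "bundle": "page",
    "label": "Search & discovery",
    "content": "Apache Solr brings faceted search to Drupal.",
    "tags": ["solr", "drupal"],
}
xml_message = document_to_add_xml(page)
print(xml_message)
```

A message like this would be posted to the update handler of the core, followed by a commit, before the document shows up in search results.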

Along similar lines, you can extend the Apache Solr integration with Drupal in more advanced ways. Drupal provides a number of useful modules, as shown in the following table:

| Module | Description |
| --- | --- |
| Apache Solr attachment | This sends all the attachments to Tika and makes them searchable through Drupal. |
| Apache Solr multisite search | If you have multiple Drupal sites, they can be searched across a single Solr core. |
| Apache Solr sort | This adds support for the Solr grouping feature and adds a UI to enable/disable sort fields. |
| Facet API | This provides faceted search on top of the Drupal Solr basic search. |

You can see a complete listing of these modules at https://drupal.org/project/apachesolr. We have seen the Drupal module for Solr; additionally, Acquia provides cloud-based Apache Solr search for Drupal customers. Like Drupal, many other systems such as WordPress and Typo3 provide some way of integrating with Solr:

| Subsystem | Description |
| --- | --- |
| WordPress CMS | This is used through the WordPress plugin and is PHP-based (http://wordpress.org/plugins/solr-for-wordpress/). |
| OpenCMS | Previously, OpenCMS used to work with the opencms-solr project. From OpenCMS 8.0, it is integrated. |
| MongoDB | Solr cannot run on top of MongoDB; the two can be run in parallel with data sync. Sync can take place through the replication feature of MongoDB, third-party MongoDB connectors, or the JDBC-Solr DataImportHandler (https://github.com/erh/mongo-jdbc). |
| FOSWIKI (Free and Open Source Wiki) | It is used through the FOSWIKI Solr plugin (http://foswiki.org/Extensions/SolrPlugin). |
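
For the MongoDB entry above, the core of any sync approach is mapping MongoDB documents onto Solr documents. The sketch below illustrates that mapping with plain Python dictionaries; it is an assumption-laden illustration, not the behavior of any particular connector. The function names, the one-level flattening rule, and the sample fields are all hypothetical, and the resulting field names would need to exist in your Solr schema. The output is a JSON array of documents, the form accepted by Solr's JSON update handler.

```python
import json


def mongo_to_solr_doc(mongo_doc):
    """Map one MongoDB-style document (a dict) onto a flat Solr document."""
    solr_doc = {}
    for key, value in mongo_doc.items():
        if key == "_id":
            # Solr commonly uses "id" as the unique key; with a real driver
            # the _id would be an ObjectId, so convert it to a string.
            solr_doc["id"] = str(value)
        elif isinstance(value, dict):
            # Flatten one level of nesting: {"author": {"name": ...}}
            # becomes "author_name" (an illustrative naming convention).
            for sub_key, sub_value in value.items():
                solr_doc[f"{key}_{sub_key}"] = sub_value
        else:
            # Scalars and lists (multivalued fields) are passed through.
            solr_doc[key] = value
    return solr_doc


def build_update_payload(mongo_docs):
    """Build the JSON body for a Solr update request from MongoDB documents."""
    return json.dumps([mongo_to_solr_doc(doc) for doc in mongo_docs])


# Hypothetical sample data standing in for documents read from MongoDB.
sample = [
    {
        "_id": "507f1f77bcf86cd799439011",
        "title": "Solr and MongoDB",
        "author": {"name": "Jane", "team": "search"},
        "tags": ["solr", "mongodb"],
    }
]
payload = build_update_payload(sample)
print(payload)
```

A real sync would read changed documents from MongoDB, for example through its replication feature, and post payloads like this to Solr on a schedule or as changes arrive.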
