Using JavaScript/AJAX with Solr

During the Web 1.0 epoch, JavaScript was primarily used to provide basic client-side interactivity, such as roll-over effects for buttons, on what were essentially static pages generated by the server. However, in today's Web 2.0 environment, AJAX has led to JavaScript being used to build much richer web applications that blur the line between client-side and server-side functionality. Solr's support for the JavaScript Object Notation (JSON) format for transferring search results between the server and the web browser client makes it simple for modern Web 2.0 applications to consume Solr information. JSON is a human-readable format for representing JavaScript objects, which is rapidly becoming a de facto standard for transmitting language-independent data, with parsers available for many languages. The JSON.parse() function will safely parse a JSON string and return a valid JavaScript object that you can then manipulate:

// JSON.parse() expects a string of JSON text, not an already-built array
var json_text = '["Smashing Pumpkins","Dave Matthews Band","The Cure"]';
var bands = JSON.parse(json_text);
alert("Band Count: " + bands.length); // alerts "Band Count: 3"

While JSON is very simple to use in concept, it does come with its own set of quirks related to security and browser compatibility. To learn more about the JSON format, the various client libraries that are available, and how it is and is not like XML, visit the homepage at http://www.json.org.
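
One of those quirks is that JSON.parse() throws a SyntaxError when the text is not valid JSON. The following is a minimal sketch of guarding against that; parseJsonOrNull is a hypothetical helper name used only for illustration, not part of any library.

```javascript
// Hypothetical helper: returns the parsed value, or null if the text is not valid JSON.
function parseJsonOrNull(text) {
  try {
    return JSON.parse(text);
  } catch (e) {
    // JSON.parse() throws on malformed input, such as a trailing comma.
    return null;
  }
}

var good = parseJsonOrNull('["Smashing Pumpkins","Dave Matthews Band","The Cure"]');
var bad = parseJsonOrNull('["Smashing Pumpkins","The Cure",]'); // trailing comma

console.log(good.length); // 3
console.log(bad);         // null
```

Returning null keeps the sketch short; a real application would probably want to report the error instead of hiding it.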

As you may recall from the discussion on query parameters in Chapter 4, Indexing Data, you change the format of the response from Solr from the default XML to JSON by specifying the JSON writer type as a parameter in the URL via wt=json. Here is the result with indent=on:

{
  "responseHeader":{
  "status":0,
  "QTime":1,
  "params":{
    "q":"hills rolling",
    "wt":"json",
    "indent":"on"}},
  "response":{"numFound":44,"start":0,"docs":[
    {
    "a_name":"Hills Rolling",
    "a_release_date_latest":"2006-11-30T05:00:00Z",
    "a_type":"2",
    "id":"Artist:510031",
    "type":"Artist"}
…
    ]
}}
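
As a rough sketch of how a client might walk this structure once it is parsed, the following uses a trimmed copy of the response above. The sample text is built inline so that the snippet runs on its own; in a real page it would arrive from Solr.

```javascript
// A trimmed copy of the response shown above, as the text Solr would send.
var responseText = JSON.stringify({
  responseHeader: { status: 0, QTime: 1 },
  response: {
    numFound: 44,
    start: 0,
    docs: [
      { a_name: "Hills Rolling", id: "Artist:510031", type: "Artist" }
    ]
  }
});

var data = JSON.parse(responseText);
var names = [];

// A status of 0 in the responseHeader indicates that the request succeeded.
if (data.responseHeader.status === 0) {
  console.log("Found " + data.response.numFound + " documents");
  data.response.docs.forEach(function (doc) {
    names.push(doc.a_name);
    console.log(doc.id + ": " + doc.a_name);
  });
}
```

Note that numFound is the total number of matches, while docs holds only the current page of results.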

Solr can be configured to change the way it structures certain parts of the response, most notably for field value faceting, through the json.nl parameter. Despite its name, it is not just for JSON: it affects the JSON, Ruby, and Python response formats. Technically, it affects the output of Solr's so-called NamedList internal data, but only in rare circumstances. The default choice, flat, is inconvenient to work with despite its succinctness, so other options are available. Note that the map choice does not retain the ordering once it is materialized in memory. Here is a table showing the effects of each choice on faceting on the MusicBrainz artist type:

Choice    Effect
flat      "a_type":["person",126,"group",71,"0",0]
map       "a_type":{"person":126,"group":71,"0":0}
arrarr    "a_type":[["person",126],["group",71],["0",0]]
arrmap    "a_type":[{"person":126},{"group":71},{"0":0}]

You may find that you run into difficulties while parsing JSON in various client libraries, as some are stricter about the format than others. Solr does output very clean JSON, quoting all keys and using double quotes, and offers some formatting options for customizing the handling of lists of data. If you run into difficulties, a very useful website for validating your JSON formatting is http://www.jsonlint.com/. This can be invaluable for finding issues such as an errant trailing comma.
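
To make the json.nl list-handling choices more concrete, here is a small sketch of consuming the flat form in JavaScript, and of why the map form can lose the ordering. The helper name flatToPairs is hypothetical.

```javascript
// json.nl=flat: names and counts alternate in a single array.
var flat = ["person", 126, "group", 71, "0", 0];

// Hypothetical helper: turn the flat form into [name, count] pairs (the arrarr shape).
function flatToPairs(list) {
  var pairs = [];
  for (var i = 0; i + 1 < list.length; i += 2) {
    pairs.push([list[i], list[i + 1]]);
  }
  return pairs;
}

var pairs = flatToPairs(flat);
var pairNames = pairs.map(function (p) { return p[0]; });
console.log(pairNames); // order preserved: person, group, 0

// json.nl=map: once parsed into a JavaScript object, integer-like keys
// such as "0" are enumerated before the other keys.
var asMap = JSON.parse('{"person":126,"group":71,"0":0}');
var mapNames = Object.keys(asMap);
console.log(mapNames); // "0" now comes first
```

For facet counts, where the order usually matters, the array-based choices are therefore the safer ones to consume from JavaScript.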

Wait, what about security?

If requests to Solr come from a web browser, then you must consider security. You will learn in Chapter 11, Deployment, that one of the best ways to secure Solr is to limit what IP addresses can access your Solr install through firewall rules. Obviously, if users on the Internet are accessing Solr through JavaScript, then you can't do this. However, this chapter describes how to expose a read-only request handler that can be safely exposed to the Internet without exposing the complete admin interface. Also, make sure that any filters that must be applied to your data, such as a filter query enforcing only active products, are specified as appends parameters in your request handler. Additionally, you might proxy Solr requests to ensure that the parameters, including their values, meet a whitelist. This can be where you apply various business rules, such as preventing a malicious user from passing parameters such as rows=1000000!
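
As an illustration of the whitelist idea, here is a minimal sketch of the check such a proxy might run before forwarding a request. It is written in JavaScript only to match the rest of this section; the names ALLOWED_PARAMS, MAX_ROWS, and sanitizeParams, as well as the limits chosen, are assumptions and not part of Solr.

```javascript
// Hypothetical whitelist: the only parameters the proxy forwards to Solr.
var ALLOWED_PARAMS = ["q", "rows", "start"];
var MAX_ROWS = 50;      // assumed business rule
var DEFAULT_ROWS = 10;  // assumed default when rows is missing or invalid

function sanitizeParams(incoming) {
  var clean = {};
  ALLOWED_PARAMS.forEach(function (name) {
    if (Object.prototype.hasOwnProperty.call(incoming, name)) {
      clean[name] = String(incoming[name]);
    }
  });
  // Validate the value too, not only the name: clamp rows to a sane range.
  var rows = parseInt(clean.rows, 10);
  if (isNaN(rows) || rows < 0) {
    rows = DEFAULT_ROWS;
  }
  clean.rows = String(Math.min(rows, MAX_ROWS));
  return clean;
}

var cleaned = sanitizeParams({
  q: "hills rolling",
  rows: "1000000",
  qt: "/admin/ping" // not on the whitelist, so it is dropped
});
console.log(cleaned); // { q: 'hills rolling', rows: '50' }
```

Anything not named in the whitelist is silently dropped here; a stricter proxy might reject the whole request instead.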

Building a Solr-powered artists autocomplete widget with jQuery and JSONP

Now, it's well established in the search industry that some form of query autocompletion remarkably improves the effectiveness of a search application. There are several fundamentally different types of autocompletion—be sure to read about them in Chapter 8, Search Components. Here is a screenshot of Google showing completions based on search queries it has seen before:

Building an autocomplete textbox powered by Solr is very simple by leveraging the JSON output format and the very popular jQuery JavaScript library's Autocomplete widget.

Tip

jQuery is a fast and concise JavaScript library that simplifies HTML document traversing, event handling, animating, and AJAX interactions for rapid web development. It went through explosive usage growth in 2008 and is one of the most popular AJAX frameworks. jQueryUI is a subproject that provides widgets such as Autocomplete. You can learn more about jQuery at http://www.jquery.com and http://www.jqueryui.com.

A working example using search-result-based completions (versus query log completion or term completion) is available at /examples/9/jquery_autocomplete/index.html; it demonstrates suggesting an artist as you type in the artist's name. You can read the documentation and see a live demo of various autocompletions online at http://jqueryui.com/demos/autocomplete/.

There are three major sections to the HTML page:

  • The JavaScript script import statements at the top
  • The jQuery JavaScript that actually handles the events around the text being input
  • A very basic HTML page for the form at the bottom

We start with a very simple HTML form that has a single text input box with the id="artist" attribute:

<div class="ui-widget">
  <label for="artist">Artist: </label>
  <input id="artist" />
</div>

We then add a function that runs, after the page has loaded, to turn our basic input field into a text field with suggestions:

$( "#artist" ).autocomplete({
  source: function( request, response ) {
    $.ajax({
      url: "http://localhost:8983/solr/mbartists/artistAutoComplete?wt=json&json.wrf=?",
      dataType: "jsonp",
      data: {
        q: request.term,
        rows: 10, 
        fq: "type:Artist"
      },
      success: function( data ) {
        response( $.map(data.response.docs,function(doc) {
          return {
            label: doc.a_name,
            value: doc.a_name
          };
        }));
      }
    });
  },
  minLength: 2,
  select: function( event, ui ) {
    log( ui.item ?
      "Selected: " + ui.item.label :
      "Nothing selected, input was " + this.value);
  },
  open: function() {
    $( this ).removeClass( "ui-corner-all" ).addClass( "ui-corner-top" );
  },
  close: function() {
    $( this ).removeClass( "ui-corner-top" ).addClass( "ui-corner-all" );
  }
});

The $("#artist").autocomplete() function takes an object of options and custom functions and ties it to the input form element. The source: function( request, response ) option supplies the list of suggestions to display; here it fetches them from Solr with a $.ajax call and passes them to response() in the success callback. The dataType: "jsonp" option informs jQuery that we want to retrieve our data using JSONP. JSONP stands for JSON with Padding, an admittedly not very intuitive name! It means that when you call the server for JSON data, the server wraps its typical JSON response in a call to a function provided by jQuery. This allows you to work around the web browser's cross-domain scripting restrictions when Solr runs on a different host and/or port from the originating web page. jQuery takes care of all of the low-level plumbing to create the callback function, whose name is supplied to Solr through the json.wrf=? URL parameter. If you look at the Solr logs, you will see the name of a function passed in: json.wrf=jQuery15104412757297977805_1309313922023.
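
The wrapping itself is simple enough to simulate without a browser. The following sketch is only an illustration of the mechanism, not jQuery's or Solr's code: wrapJsonp stands in for the server, the callback name is made up, and new Function stands in for the browser executing the returned script.

```javascript
// Server side (simulated): wrap the JSON text in a call to the requested function.
function wrapJsonp(callbackName, payload) {
  return callbackName + "(" + JSON.stringify(payload) + ")";
}

// A made-up callback name; jQuery generates a unique one for each request.
var callbackName = "jQuery_callback_123";
var body = wrapJsonp(callbackName, {
  response: { numFound: 1, docs: [{ a_name: "The Cure" }] }
});
console.log(body); // jQuery_callback_123({"response":{...}})

// Browser side (simulated): the body is executed as a script, which calls
// the callback and hands it the data as a plain JavaScript object.
var received = null;
var runScript = new Function(callbackName, body);
runScript(function (data) {
  received = data;
});

console.log(received.response.docs[0].a_name); // The Cure
```

Because the response is executed as script rather than parsed as data, JSONP should only be used with servers you trust.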

Notice the data structure:

data: {
  q: request.term,
  rows: 10, 
  fq: "type:Artist"
},

These items are tacked onto the URL as query parameters to Solr.
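
As a rough approximation of that step, the following sketch builds the query string by hand. The helper name toQueryString is hypothetical; jQuery performs the equivalent work internally, and its exact encoding of some characters (spaces, for example) may differ.

```javascript
// Hypothetical helper approximating how the data object becomes a query string.
function toQueryString(params) {
  return Object.keys(params).map(function (name) {
    return encodeURIComponent(name) + "=" + encodeURIComponent(params[name]);
  }).join("&");
}

var qs = toQueryString({ q: "smash", rows: 10, fq: "type:Artist" });
console.log(qs); // q=smash&rows=10&fq=type%3AArtist
```

Note that the colon in the filter query is percent-encoded, which is why the logged URL looks different from what was typed in the source.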

Following best practices, we have created a Solr request handler called /artistAutoComplete, which is configured with the DisMax query parser to search over all of the fields in which an artist's name might show up: a_name, a_alias, and a_member_name. Arguably, this makes it more of an instant search than word autocompletion. It's nice to use different request handlers for different search types rather than using /select for absolutely everything.

The success callback converts the JSON result data from Solr into the format Autocomplete requires and passes it to the response() function. It uses $.map() to call an anonymous function for each document in the returned JSON data structure. The anonymous function picks out the value to use as the label and value, in our case just the artist name.

Once the user has selected a suggestion, the select() function is called, and the name of the selected artist is appended to the <div id="log"> div.

One thing that we haven't covered is the pretty common use case in which an Autocomplete widget populates a field with an identifier for the chosen suggestion, for example to take the user to a detail page for it, as is typical of instant-search type completion. For example, in order to store a list of artists, I would want the Autocomplete widget to simplify the process of looking up the artists, but would need to store the list of selected artists in a database. You can still leverage Solr's superior search ability, but tie the resulting list of artists to the original database record through a primary key ID, which is indexed as part of the Solr document.

If you try to look up the primary key of an artist using the name of the artist, then you might run into problems, such as having multiple artists with the same name or unusual characters that don't translate cleanly from Solr to the web interface to your database record.

Instead, a hidden field stores the primary key of the artist and is used in your server-side processing in place of the text typed into the search box:

<input type="hidden" id="artist_id"/>
<input id="artist" />

We use the change() function to ensure freeform text that doesn't result in a match is ignored by clearing out the artist_id form field and returning false from the function:

change: function( event, ui ) {
  if (!ui.item){
    log("term " + $( this ).val() + " was not found, clearing");
    $( this ).val( "" );
    $( "#artist_id" ).val( "" ); // also clear any previously selected ID
    return false;
  } else {
    log("hidden field artist_id:" + ui.item.id);
    $( "#artist_id" ).val( ui.item.id );
    return true;
  }
}
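
For ui.item.id to be available in this handler, the suggestions handed to Autocomplete need to carry the ID along with the label and value. The following is a sketch of such a mapping step, written as a plain function so that it runs on its own; toSuggestions is a hypothetical name, and it assumes the Solr document's id field holds the identifier you want (in the sample response earlier it is prefixed, as in Artist:510031, so you may need to strip the prefix).

```javascript
// Hypothetical mapping step: keep the Solr id on each suggestion so that
// it is later available as ui.item.id.
function toSuggestions(data) {
  return data.response.docs.map(function (doc) {
    return {
      label: doc.a_name,
      value: doc.a_name,
      id: doc.id
    };
  });
}

var suggestions = toSuggestions({
  response: {
    docs: [{ a_name: "Hills Rolling", id: "Artist:510031", type: "Artist" }]
  }
});
console.log(suggestions[0].label); // Hills Rolling
console.log(suggestions[0].id);    // Artist:510031
```

Autocomplete only requires label and value; extra properties such as id are passed through untouched on the selected item.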

Look at /examples/9/jquery_autocomplete/index_with_id.html for a complete example. Change the field artist_id from input type="hidden" to type="text" so that you can see the ID changing more easily as you select different artists. Make sure you click away from the suggestion box to see the change occur!

Tip

Where should I get my results to display as suggestions?

There are many approaches for supplying the list of suggestions for autocomplete, and even the terms autosuggest, autocomplete, and suggest-as-you-type have only loosely defined meanings. This important subject is covered in the Query complete/suggest section in Chapter 8, Search Components.

AJAX Solr

AJAX Solr is an excellent Solr search UI framework for building AJAX-based search interfaces. It is an off-shoot of an older project called SolrJS, which is now defunct. AJAX Solr adds some interesting visualizations of result data, including widgets for displaying the tag clouds of facets, plotting country code-based data on a map of the world using the Google Chart API, and filtering results by date fields. When it comes to integrating Solr into your web application, if you are comfortable with JavaScript, then this can be a very effective way to add a really nice AJAX view of your search results without changing the underlying web application. If you're working with an older web framework that is brittle and hard to change, such as IBM's Lotus Notes or ColdFusion frameworks, then this keeps the integration from touching the actual business objects, and keeps the modifications in the client layer via HTML and JavaScript.

The AJAX Solr project's homepage is at https://github.com/evolvingweb/ajax-solr and provides a great demo of searching Reuters business news wire results:

A slightly tweaked copy of the demo is at /examples/9/ajaxsolr/reuters.html. Note that if you try to access the demo and see no content, then most likely the Internet-accessible demo Solr instance is offline.

AJAX Solr provides rich UI functionality through widgets, which are small blocks of JavaScript that render a specific UI component. It comes with widgets for autocompletion of field values, a tag cloud, a facet view, country codes, and calendar-based date ranges, as well as for displaying the results with paging. They all inherit from an AbstractWidget and follow pretty much the same pattern. They are configured in reuters/js/reuters.js by passing in a set of options. Here is an example of configuring the autocomplete widget to populate the search box with autocomplete suggestions drawn from the topics, organisations, and exchanges fields:

Manager.addWidget(new AjaxSolr.AutocompleteWidget({
  id: 'text',
  target: '#search',
  field: 'allText',
  fields: [ 'topics', 'organisations', 'exchanges' ]
}));

A central AjaxSolr.Manager object coordinates the event handling between the various widgets, makes the queries to Solr, and messages the widgets. The preceding code shows the call to add the widget to the AjaxSolr.Manager object. Working with AJAX Solr and creating new widgets for your specific display purposes comes easily to anyone who comes from an object-oriented background.
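
To show the shape of this manager-and-widgets pattern without depending on the library, here is a deliberately simplified toy version. It is not AJAX Solr's implementation: ToyManager, handleResponse, and countWidget are hypothetical names, and the real library does considerably more (building queries, talking to Solr, and so on).

```javascript
// Toy manager: only illustrates the pattern, not AJAX Solr's actual code.
function ToyManager() {
  this.widgets = {};
  this.response = null;
}

// Register a widget and give it a reference back to the manager.
ToyManager.prototype.addWidget = function (widget) {
  widget.manager = this;
  this.widgets[widget.id] = widget;
};

// Store the latest response and notify every registered widget.
ToyManager.prototype.handleResponse = function (data) {
  this.response = data;
  for (var id in this.widgets) {
    this.widgets[id].afterRequest();
  }
};

// Toy widget: records the hit count each time a response arrives.
var lines = [];
var countWidget = {
  id: "count",
  afterRequest: function () {
    lines.push("Found " + this.manager.response.response.numFound + " documents");
  }
};

var manager = new ToyManager();
manager.addWidget(countWidget);
manager.handleResponse({ response: { numFound: 44, docs: [] } });
console.log(lines[0]); // Found 44 documents
```

The point of the pattern is that each widget only knows how to render itself from the shared response, while the manager owns the conversation with Solr.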

Note

The various widgets that come with AJAX Solr serve more as a foundation and source of ideas rather than as a finished set of widgets. You'll find yourself customizing them extensively to meet your specific display needs.

We've developed a MusicBrainz-based example at ./examples/9/ajaxsolr/mbtracks.html for browsing track data. It is based on the Reuters example with a custom widget for term autocompletion using the facet.prefix technique. We did not configure Solr to load these facets via Solr's firstSearcher event in solrconfig.xml because this is the only demo that uses it, and it takes up to 30 seconds to load given the large index. Therefore, be patient while waiting for the first completion results.
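
As a sketch of what the facet.prefix technique involves on the client side, the following builds the request parameters and reads the completions back out of a response. The helper names, the field name t_name, and the sample counts are placeholders for illustration, and lowercasing the prefix assumes the field's terms are lowercased at index time; the parameter names themselves are standard Solr faceting parameters.

```javascript
// Build Solr parameters for term completion via faceting.
function termCompletionParams(field, prefix, limit) {
  return {
    q: "*:*",
    rows: 0,                              // no documents needed, only facet counts
    facet: "true",
    "facet.field": field,
    "facet.prefix": prefix.toLowerCase(), // assumes lowercased indexed terms
    "facet.limit": limit,
    "facet.mincount": 1,
    wt: "json",
    "json.nl": "arrarr"                   // [["term", count], ...] keeps the order
  };
}

// Pull the completed terms out of the facet section of the response.
function completionsFrom(data, field) {
  return data.facet_counts.facet_fields[field].map(function (pair) {
    return pair[0];
  });
}

var params = termCompletionParams("t_name", "Lov", 5);

// Made-up sample of what the facet part of a response could look like.
var sample = {
  facet_counts: {
    facet_fields: { t_name: [["love", 120], ["lover", 14]] }
  }
};
var completions = completionsFrom(sample, "t_name");
console.log(params["facet.prefix"]); // lov
console.log(completions);            // [ 'love', 'lover' ]
```

Setting rows to 0 keeps the response small, since only the facet counts are of interest for completion.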
