Chapter 20. Using the Integrated Verity Search Engine

So you have planned out your next site, given a lot of time to the database backend, identified thousands of products for your visitors to buy, and you’re ready to roll...right? Well, maybe not quite yet. Sure, you might have tons of stuff to buy, but how are your visitors going to find the item they’re looking for in your mega-catalog? The answer is usually a search application for your site.

Companies such as Google, AltaVista, and Yahoo! have made a pretty penny developing search engine technology, but don’t be fooled into thinking that search engines are only for the big boys. Any site that starts with a large volume of data, or that starts small and grows over time, can benefit from the addition of search capabilities.

Understanding the Basics of Searching

Searching is searching is searching. Whether you’re finding your keys in the morning or finding your favorite library book in an online catalog, the process usually works the same way (assuming you have had your coffee in the morning).

Every logical search that we perform relies on a set of reference points. These reference points enable you to filter out unnecessary data and focus on the data that is pertinent to your search. For instance, when you begin your search for your car keys in the morning, is India in the search criteria? Probably not, unless you live in India. I know the example might seem a little silly, but it demonstrates that there are lots of possible options (no matter how improbable) that we automatically filter out when performing simple searches.

So suppose I were searching for my keys. I have pretty much filtered out India and all countries outside of the U.S. In addition, I filter out all states except Kansas, and all cities except Olathe. I’m pretty sure I had them when I came home (because I drove), so I filter out all other houses in Olathe.

Next, I can start filtering out rooms within the house. I could search every nook and cranny within each room, but by understanding some criteria about the keys and my habits, I can further limit the number of places I have to look. For instance, if I think back to yesterday, I know I had them when I walked in the door, and I tend to put them on the kitchen table, on the dresser, or leave them in my work pants. So after a check of these usual places, the majority of the time I end up with my keys.

Although I’m sure it’s absolutely fascinating for you to review the whereabouts of my car keys, the exercise has a point. It demonstrates how important filters are in finding data. Each of the filters I applied to the search for my keys significantly narrowed the number of places I had left to look.

The same concept applies to search applications within websites. Think of the search criteria that your visitors type into form fields as filters on all the data stored in your web pages. For instance, if a visitor to the Retro’s Cycles site typed “Honda” into the search field, it’s likely that he or she is not interested in seeing any information other than that relating to Honda motorcycles, and the results that you provide should be tailored to those needs.

Major search engines take this concept one step further by expanding their searchable data through the proactive searching of other websites, a process commonly referred to as spidering. For instance, Google (see Figure 20.1) has a search program called Googlebot that visits thousands of websites each day and reads their content.

Figure 20.1. The simple, yet effective Google search interface.

If Googlebot determines the information to be new and useful, it copies portions of the site content into its database (or index) and adds additional information (called metadata) such as the time the site was last indexed, keywords that might be useful in retrieving the data, and the title of the page where the data is located. Then, when you visit Google and type search criteria into the form, Google searches the index and returns only those results that apply to your search term.

Adding search capabilities to your site, however, doesn’t necessarily require complex tools such as a spider or a metadata index. You can also accomplish a very similar result by allowing your visitors to run SQL queries against your database without them ever having to understand what a SQL query is.

Using SQL Queries to Return Search Results

For sites that use data stored in a database, one of the most common methods of implementing search capabilities is to simply allow your visitors to query certain fields within your database. For instance, suppose you have a database table named tbInventory that contains the following fields:

  • InventoryID

  • Year

  • Make

  • Model

  • Color

  • Size

  • Condition

  • Description

With this table structure, you could allow your visitors to easily narrow their searches by providing them with a search form on which they could choose from the available years, makes, and models, like the one shown in Figure 20.2.

Figure 20.2. A search form like this enables visitors to easily limit their results.

For instance, suppose a visitor selected “2000” from the Year field, selected “Honda” from the Make field, and selected “VT1100” from the Model field. When the Submit button was clicked, the visitor would effectively be submitting a query that says “Show me all the records in the tbInventory table where the Year field is equal to 2000, the Make field is equal to Honda, and the Model field is equal to VT1100.” Luckily, we can translate this request into a SQL query that ColdFusion can pass to the database, and it would look something like this:

SELECT *
FROM tbInventory
WHERE Year = '2000' AND Make = 'Honda' AND Model = 'VT1100'

If there were any records that met all these criteria, those would be displayed for the visitor to review (see Figure 20.3).

Figure 20.3. Results from a simple SQL query search.

If no matching data was found, the visitor would be informed and might have the opportunity to refine the search (see Figure 20.4).

Figure 20.4. No data is returned by the query and the user has the opportunity to modify the search.
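As a minimal sketch of how a results page might run this kind of query and handle both outcomes, consider the following. The retroCycles data source, the selYear, selMake, and selModel form fields, and the search.cfm page are hypothetical names chosen for illustration. The <cfqueryparam> tag binds each visitor-supplied value as a parameter instead of pasting it directly into the SQL text.

```cfml
<!--- Hypothetical names: the retroCycles data source, the selYear/selMake/selModel
      form fields, and search.cfm are assumptions for this sketch --->
<cfquery name="rsInventory" datasource="retroCycles">
    SELECT InventoryID, Year, Make, Model, Color, Description
    FROM tbInventory
    WHERE Year = <cfqueryparam value="#FORM.selYear#" cfsqltype="cf_sql_varchar">
      AND Make = <cfqueryparam value="#FORM.selMake#" cfsqltype="cf_sql_varchar">
      AND Model = <cfqueryparam value="#FORM.selModel#" cfsqltype="cf_sql_varchar">
</cfquery>

<cfif rsInventory.recordCount GT 0>
    <!--- One paragraph per matching record --->
    <cfoutput query="rsInventory">
        <p>#rsInventory.Year# #rsInventory.Make# #rsInventory.Model# (#rsInventory.Color#): #rsInventory.Description#</p>
    </cfoutput>
<cfelse>
    <!--- No rows came back, so offer the visitor a way to refine the search --->
    <p>No motorcycles matched your search. <a href="search.cfm">Try another search</a>.</p>
</cfif>
```

The cf_sql_varchar type assumes the three columns hold text; if Year is stored as a number in your table, cf_sql_integer would be the better fit.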

Creating this type of search application is quick and easy and requires no further configuration of the ColdFusion engine. Instead, it requires an understanding of developing web forms, database connections, recordsets, and results pages that we explore further in the Dreamweaver section of this book. For now, just be aware that through the use of web forms and database connectivity, ColdFusion is capable of allowing your visitors to easily search your database content.

A second type of search runs against a search engine index, or collection, rather than against the database. This type of search is extremely useful when your site contains a large number of pages with static content or a large number of documents.

ColdFusion’s Verity Search Engine Architecture

ColdFusion MX 7 comes with its own search server, called the Verity Search Server. The Verity application runs separately from ColdFusion and stores the page metadata in indexes called collections. When a search request is presented to ColdFusion with the <cfsearch> tag along with a collection attribute, it hands off the search criteria to the Verity Search Server. Verity then searches the collections for the site and returns any corresponding metadata back to ColdFusion. ColdFusion then builds the results page based on that metadata for the visitor to review (see Figure 20.5).

Figure 20.5. The search request and results delivery process using Verity.

For the collections to be searchable, however, there must first be data stored in them. Populating this data is done in one of two ways. The easiest method is via the ColdFusion Administrator, which handles the indexing of any collections created with the Administrator.

The second option is to use the command-line interface to create the collection and to index the site. Verity comes with its own search spider, called vspider, that can scan all the pages and files within your website, including Microsoft Office, WordPerfect, text, and PDF documents, and include their contents in the collections. The drawback is that, by default, vspider cannot index collections created within the ColdFusion Administrator. You can, however, modify ColdFusion and Verity to allow vspider to index these collections.

Tip

For instructions on configuring vspider to search collections created within the ColdFusion Administrator, check out the ColdFusion TechNote located at http://www.macromedia.com/cfusion/knowledgebase/index.cfm?id=50f419a.

Configuring Verity and Creating a Collection

Before you can begin using Verity as your search engine, you need to create a collection for your site. To create a new collection, open the ColdFusion MX 7 Administrator by typing http://localhost/CFIDE/administrator in your browser’s address bar. Log in to the administrator and choose Verity Collections from the left navigation menu.

As shown in Figure 20.6, the form for creating a new Verity collection is very simple.

Figure 20.6. You can create a new Verity collection via the ColdFusion Administrator.

Simply type the name for your new collection and designate a path where that collection will be stored. Next, choose a language for your collection and choose whether or not you want to enable category support. Category support enables you to easily group your pages into logical categories that can be used to further assist your visitors in their searches. For instance, in the results page, you could add a link that says “View similar pages,” which links to the other pages in the same category.

Caution

You can choose whether or not to enable category support only at the time of collection creation. You cannot go back and turn the feature on after the collection has been created, so consider whether you think you want to use this feature before you click the Create Collection button.

After you have completed the form, click the Create Collection button and the collection is created and added to the list of collections (see Figure 20.7).

Figure 20.7. Existing collections are listed in the ColdFusion Administrator.
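The same settings can also be supplied in code with the <cfcollection> tag, which is handy if you want collection creation to be part of a setup script. The following is only a sketch: the collection name and the storage path are placeholder values, so substitute whatever suits your server.

```cfml
<!--- Placeholder values: the collection name and the path are assumptions for this sketch --->
<cfcollection action="create"
    collection="myCollection"
    path="C:\CFusionMX7\verity\collections"
    language="English"
    categories="yes">

<!--- List the registered collections to confirm that the new one exists --->
<cfcollection action="list" name="qCollections">
<cfoutput query="qCollections">
    #qCollections.name# (#qCollections.path#)<br>
</cfoutput>
```

As with the Administrator form, the categories setting is chosen when the collection is created, so decide on it before running the tag.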

The Verity Search Server application does not necessarily have to be installed on the same server as ColdFusion. If you choose to install Verity on a different server, you need to indicate where the application is located for ColdFusion to be able to access the collections. To modify this setting, click the Verity K2 Server link in the ColdFusion Administrator, type the name of the server in the Verity Host Name field, and click Submit Changes (see Figure 20.8).

Figure 20.8. The ColdFusion Administrator enables you to indicate where Verity is installed if it is not on the same server as ColdFusion.

After you have created your collection and specified the location of the Verity Search Server, you can use the Action buttons next to the collection to index, optimize, purge, and delete the collections. When you choose to index a collection (see Figure 20.9), you can indicate which file types and directories should be indexed.

Figure 20.9. Indexing a collection is easy to do from within the ColdFusion Administrator.
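If you prefer to index from a ColdFusion page rather than from the Administrator, the <cfindex> tag accepts the same choices of directory and file types. In this sketch, the directory in the key attribute and the address in the urlpath attribute are placeholders for your own site.

```cfml
<!--- Placeholder values: the directory (key) and the site address (urlpath) are assumptions --->
<cfindex action="update"
    collection="myCollection"
    type="path"
    key="C:\Inetpub\wwwroot\retroscycles"
    urlpath="http://localhost/retroscycles/"
    extensions=".htm, .html, .cfm, .pdf, .doc"
    recurse="yes">
```

The recurse attribute tells ColdFusion to include subdirectories, and urlpath is the prefix used to build the URL returned with each search result.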

Submitting Queries to the Verity Search Server

After you have your collection in place and your site indexed, you should create a search interface that enables your visitors to submit their queries to the Verity server for comparison against the collection. A simple form such as the one in Figure 20.10 enables the visitor to enter a search term and submit it to a page that displays the results.

Figure 20.10. A simple search form that submits a query to Verity.
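A form along these lines could be as small as the following sketch. The results.cfm file name is a hypothetical choice for the results page, and the text field is named tfSearchTerm so that it matches the field name the results page reads.

```cfml
<!--- results.cfm is a hypothetical name for the page that runs the search --->
<form action="results.cfm" method="post">
    <label for="tfSearchTerm">Search this site:</label>
    <input type="text" name="tfSearchTerm" id="tfSearchTerm" size="30">
    <input type="submit" value="Search">
</form>
```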

The results page is where the real search magic takes place. For instance, suppose that the text field in the search form was named tfSearchTerm. The results page needs a snippet of code like the following to begin the search:

<CFSEARCH NAME="searchMyCollection" COLLECTION="myCollection" CRITERIA="#FORM.tfSearchTerm#">

This tag compares the content of the tfSearchTerm text field with the data stored in the myCollection Verity collection.

Note

The value of the COLLECTION attribute must match the name of the collection that you created in the ColdFusion Administrator.

To output the search results on the page, add the following code:

<CFOUTPUT QUERY="searchMyCollection">
    <a href="#URL#">#Title#</a><br>
</CFOUTPUT>

This code outputs the results of the searchMyCollection search and displays the Title as a clickable hyperlink. The #URL# and #Title# variables are built-in variables that are returned by the <cfsearch> tag. Some additional variables that you can use include the following:

  • score—Displays the relevancy score of the result based on the frequency with which the search term occurs in the page.

  • summary—Displays a brief summary of the document.

  • recordCount—Displays the number of results returned by the query.

  • recordsSearched—Displays the number of records searched within the collection.

  • author—New in ColdFusion MX 7, this variable displays the name of the document author.

  • category—New in ColdFusion MX 7, this variable displays the category of the document if the collection allows for categorization.

  • size—New in ColdFusion MX 7, this variable displays the size of the document in bytes.

  • type—New in ColdFusion MX 7, this variable displays the MIME type of the document.
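To see several of these variables working together, here is a sketch of a fuller results page. It assumes the collection is named myCollection and that the form field is named tfSearchTerm. Each column is prefixed with the query name, which avoids any confusion between the URL column and ColdFusion's URL scope.

```cfml
<!--- Give the form field a default so the page does not fail when loaded directly --->
<cfparam name="FORM.tfSearchTerm" default="">

<cfif Len(Trim(FORM.tfSearchTerm))>
    <cfsearch name="searchMyCollection"
        collection="myCollection"
        criteria="#FORM.tfSearchTerm#">

    <cfif searchMyCollection.recordCount GT 0>
        <!--- Summary line: number of matches and number of documents searched --->
        <cfoutput>
            <p>#searchMyCollection.recordCount# of #searchMyCollection.recordsSearched# indexed documents matched.</p>
        </cfoutput>
        <!--- One entry per result: linked title, relevancy score, and summary --->
        <cfoutput query="searchMyCollection">
            <p>
                <a href="#searchMyCollection.URL#">#searchMyCollection.Title#</a>
                (score: #searchMyCollection.score#)<br>
                #searchMyCollection.summary#
            </p>
        </cfoutput>
    <cfelse>
        <p>No pages matched your search.</p>
    </cfif>
<cfelse>
    <p>Please enter a search term.</p>
</cfif>
```

Checking for an empty search term first keeps a blank criteria value from being sent to Verity.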

Using these basic concepts, you should understand how the Verity Search Server can assist you in providing search capabilities to your site visitors. When coupled with SQL queries against the database, this powerful search engine application can assist you in ensuring that your visitors are able to find the information they are looking for.

Troubleshooting

When I view the results of my search, I see no results. Where should I look to figure out what’s happening?

The first place I would look is the ColdFusion Administrator. Be sure that you have indexed your collection or it will contain no metadata to search. Next, take a look at your <cfsearch> tag and ensure that the collection attribute is spelled exactly as it is in the ColdFusion Administrator.

I have made significant changes to the content on my site. Do I have to delete and re-create my collection to ensure that the information stored in the collection is correct?

No. Just purge the data stored in the collection via the ColdFusion Administrator and then rebuild the index. All your new data will be included in the new index.
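If you would rather script the purge and rebuild, the <cfindex> tag can do both. This is a sketch only: the directory and URL values are placeholders, and the file extensions should match whatever you indexed originally.

```cfml
<!--- Step 1: remove everything currently stored in the collection --->
<cfindex action="purge" collection="myCollection">

<!--- Step 2: rebuild the index from the current site content
      (placeholder directory and URL values) --->
<cfindex action="update"
    collection="myCollection"
    type="path"
    key="C:\Inetpub\wwwroot\retroscycles"
    urlpath="http://localhost/retroscycles/"
    extensions=".htm, .html, .cfm, .pdf"
    recurse="yes">
```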

Best Practices—Designing a User Interface for Your Search Application

Your search interface is almost as important as the database backend or the search engine application you use, because it can determine whether your users succeed in retrieving the results they are seeking. Sometimes, the most effective search forms are as simple as possible, allowing the hard work of parsing out the search term to be done on the back end without the user ever knowing that the query is being sliced and diced. For instance, when designing a search interface, you might consider some of the following:

  • Do your search form and results page accommodate multiple search terms? For instance, does the search term “sticky wicket” return only those pages with the full phrase “sticky wicket,” or does it also return pages that contain “sticky” or “wicket”?

  • Is your search form accessible to those with disabilities? For tips on making your forms accessible, check out http://www.netmechanic.com/news/vol5/accessibility_no19.htm.

  • Is your search form easy to find? Is it visible on every page in your site so that visitors can use it if they get stuck in the navigation process?

  • Can your search form handle and parse out Boolean operators such as “and” and “or”?

  • Does your simple search form include a link to a more advanced search form that provides more functionality?

Addressing some of these concerns may help you in your search form development and may produce a stronger search form that helps your users in their search activities.
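The questions about multiple search terms and Boolean operators can be explored directly with <cfsearch>. The sketch below runs three variations of the same search against a collection assumed to be named myCollection. The comments describe how I understand the Verity query language to treat each one, so test them against your own collection before relying on the behavior.

```cfml
<!--- Words separated by a space: expected to be treated as a phrase --->
<cfsearch name="qPhrase" collection="myCollection" type="simple"
    criteria="sticky wicket">

<!--- OR operator: expected to match pages containing either word --->
<cfsearch name="qEither" collection="myCollection" type="simple"
    criteria="sticky OR wicket">

<!--- AND operator: expected to match pages containing both words --->
<cfsearch name="qBoth" collection="myCollection" type="simple"
    criteria="sticky AND wicket">

<!--- Compare how many results each variation returns --->
<cfoutput>
    Phrase: #qPhrase.recordCount#<br>
    Either word: #qEither.recordCount#<br>
    Both words: #qBoth.recordCount#<br>
</cfoutput>
```

Comparing the three counts is a quick way to decide whether your form should pass the visitor's text through unchanged or rewrite it before handing it to Verity.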
