CHAPTER 12

Search Engine Optimization
and Joomla!

No matter how much your site excels in design, implementation, and content, if web users can't locate it, then your efforts are largely wasted. Therefore, ensuring that your site is found by the relevant keywords on Google, Yahoo, MSN, Ask.com, and other popular search engines is worth some effort. A well-placed link on a popular search engine can mean the difference between popularity and anonymity.

There are a number of strategies that will help maximize your search placement. The process of adapting your web site for the best search results is known as search engine optimization (SEO). There are a number of expert companies—such as the Search Agency (www.thesearchagency.com)—that provide skilled consulting services to maximize your web site placement. However, you can do a great deal of work on your own to promote a Joomla site on the search engines.

Search engines use programs called spiders that process or "crawl" through each page of a web site and index the content found there for inclusion in the search engine database. A site's ranking in the search database depends a great deal on how effectively and accurately the spider can process the content of your web site.

There are a number of specific configuration settings in Joomla that will optimize it for spider crawling. Further, there are a number of techniques you can adopt to make certain that the content of your site has the best chance of being highly rated. Spending even a couple of hours fine-tuning your Joomla site for SEO can make a world of difference.

SEO on a Joomla! Site

The developers of Joomla recognize the importance of search engine placement for traffic. To ensure that Joomla sites have a good chance of being well crawled, they have included a number of features that help increase site visibility and web presence. Various parameters are used to maximize opportunities for web recognition for everything on the site, from individual items of content to sitewide configuration.

Since Joomla dynamically creates the web pages sent to a requestor, it has the advantage that changes made to the configuration are immediately effective on a sitewide basis. However, the dynamic nature of a Joomla site also creates a set of disadvantages, since webmasters don't have control over the organization and configuration the way that they do with static web sites. To remedy this problem, Joomla contains parameter settings for all of the major features that affect web spidering. Among the most important of these features is the Search Engine Friendly (SEF) URLs setting.

Configuring Joomla! to Be Search Engine–Friendly

By default, the page access URLs used by Joomla are not very friendly to a search engine spider. If you've ever looked closely at a URL on a Joomla site with default installation, you may notice that it reads something like this:

http://www.example.com/index.php?option=com_content&
    view=category&id=33&Itemid=53

That URL may not seem very descriptive to you—and it doesn't seem very descriptive to a spider either. The web address contains parameters that tell the Joomla engine the exact content to retrieve and render. At the time a page is requested, Joomla uses the current template and the requested database content to generate a formatted web page to return to the requestor. While the URL is perfectly understandable to Joomla, a web spider has a hard time with it.
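To make the role of those parameters concrete, here is a minimal Python sketch that splits the default URL into its name/value pairs. The function name joomla_query_params is made up for this illustration; it only shows what the query string carries and is not how Joomla itself parses requests.

```python
from urllib.parse import urlsplit, parse_qs

def joomla_query_params(url):
    """Return the query-string parameters of a non-SEF URL as a dict."""
    query = urlsplit(url).query
    # parse_qs returns a list per key (a key may repeat); keep the first value.
    return {key: values[0] for key, values in parse_qs(query).items()}

url = ("http://www.example.com/index.php?option=com_content&"
       "view=category&id=33&Itemid=53")
params = joomla_query_params(url)
for name, value in params.items():
    print(name, "=", value)
```

Each pair (the component, the view, the content ID, and the menu item ID) means something to Joomla, but none of them describes the subject of the page to a reader or a spider.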

A more straightforward address such as the following is much clearer about the type of content it points toward:

http://www.cnn.com/2009/SHOWBIZ/

This URL for the CNN web site is formatted like that of a static web site. In contrast to a dynamic site (like Joomla), which renders content on the fly, a static web site stores web page files in various directories (which can be named descriptively) and retrieves them when the proper URL directory path is used.

While search engines can catalog content with a path like the default Joomla URLs, pages with static folder addresses and descriptive links will nearly always outrank the dynamically generated ones. How can this problem be resolved?

Fortunately, the Joomla developers have included three options that allow Joomla to simulate the more descriptive URLs. The options render the URL addresses of the site using a search engine–friendly (SEF) folder-like structure. The native Joomla URLs still won't be as descriptive as ones created by hand (such as the CNN directory path just shown), but they will be good enough that search engines will have no problem finding and cataloging pages properly.

The complication with using the SEF URLs and the reason that this option is turned off by default is that for the feature to work, Joomla needs to be able to dynamically modify the URL on the web server. Some web hosting services will not allow a program to make the URL modifications because theoretically a hacker could exploit such capabilities.

Activating the SEF Options

The SEF options are found in the Joomla Administrator on the Global Configuration screen. Notice that the SEO Settings frame (see Figure 12-1) contains three options: Search Engine Friendly URLs, Use Apache mod_rewrite, and Add suffix to URLs. You will certainly want to set the Search Engine Friendly URLs option to Yes to make links generated by the system appear as the folder-format URLs.

image

Figure 12-1. Set the Search Engine Friendly URLs option to Yes.

When this option is active, the URLs generated by the site will take on the following format:

http://www.example.com/index.php/joomla-overview

This option uses a routing trick that causes the web server to read the index.php reference in the URL and load and execute that page. When the index file executes, it processes the folder path that follows it in the URL and supplies the referenced Joomla content. The good news about this technique is that it works without the mod_rewrite extension, so no special configuration of the web server is required. The bad news is that some web hosts won't work properly using this technique.
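The routing trick can be pictured with a short Python sketch. It assumes a hypothetical helper, split_front_controller, that separates the script name from the path segments that follow it; the real routing is done by Joomla's own PHP code, so treat this only as an illustration of the idea.

```python
from urllib.parse import urlsplit

def split_front_controller(url, script="index.php"):
    """Split a URL path into the script name and the route that follows it."""
    segments = [part for part in urlsplit(url).path.split("/") if part]
    if script not in segments:
        return None, segments  # no script in the path; everything is route
    position = segments.index(script)
    return script, segments[position + 1:]

script, route = split_front_controller(
    "http://www.example.com/index.php/joomla-overview")
print(script, route)
```

The web server only has to recognize index.php; everything after it is handed to the script, which looks up the matching content.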

If the basic SEF URLs option doesn't work with your host, the server will return an "HTTP 404 - File not found" error when any links are clicked from the Front Page. In this case, you will want to activate the Use Apache mod_rewrite option. When that is active, the URLs are formatted slightly differently, such that the page referenced earlier will appear as follows:

http://www.example.com/home/joomla-overview

To make the URLs even more SEF to some search engines, you can activate the "Add suffix to URLs" option. This option appends an .html to page references, making the reference appear to a spider like an actual static page. When that is active, the page referenced earlier will appear as follows:

http://www.example.com/home/joomla-overview.html

Configuring mod_rewrite on Apache

You will need to check with your web provider to see if mod_rewrite functionality is available. The Apache server needs to have the mod_rewrite module enabled. You can determine if the module is enabled by executing the phpinfo() function (see Chapter 3 for more information). The apache2handler section of the phpinfo() output screen should display mod_rewrite in the module list, as shown in Figure 12-2.

image

Figure 12-2. The Loaded Modules text area of the phpinfo() output screen should include the mod_rewrite listing.

To activate the module on your Apache server, open the httpd.conf file on the web server. If the module is not being loaded, you should find the following line:

#LoadModule rewrite_module modules/mod_rewrite.so

Simply uncomment the line by removing the pound sign (#). Then you can add the directive that enables the mod_rewrite module:

RewriteEngine On
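If you would rather check the configuration file with a script than by eye, the following Python sketch classifies the LoadModule line. The function name and the three status strings are assumptions made for this example, and the sketch looks only at the text you pass in, not at the running server.

```python
import re

# Matches the rewrite_module line, capturing a leading "#" if present.
LOAD_LINE = re.compile(r"^[ \t]*(#?)[ \t]*LoadModule[ \t]+rewrite_module\b",
                       re.MULTILINE)

def rewrite_module_status(conf_text):
    """Report whether httpd.conf text loads mod_rewrite."""
    markers = [match.group(1) for match in LOAD_LINE.finditer(conf_text)]
    if not markers:
        return "missing"
    # An uncommented line anywhere in the file wins over commented copies.
    return "enabled" if "" in markers else "commented out"

sample = "#LoadModule rewrite_module modules/mod_rewrite.so\n"
print(rewrite_module_status(sample))
```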

To test the mod_rewrite module, you can add a rewrite command. For example, you can add a path to reroute any access to the /myadmin directory to the Joomla /administrator directory. In the httpd.conf file, after the line that enables the RewriteEngine, add the following code:

RewriteRule myadmin/(.*) /administrator/$1 [PT]

Restart the Apache server and try to access the /myadmin directory in your browser with a URL like this:

http://localhost/myadmin/

If the localhost root directory points to your Joomla installation, the /myadmin reference will display the Joomla Administrator login. If you would like to monitor the URL mapping that occurs, you can have Apache write the maps into a log file. On Apache 2.2 and earlier, simply add the following two directives to the httpd.conf file (Apache 2.4 removed both directives in favor of the LogLevel directive):

RewriteLog "C:/rewrite.log"
RewriteLogLevel 9
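The effect of the test rewrite rule can be approximated in Python with an ordinary regular expression. This is a sketch of the matching step only: the function name apply_rewrite is hypothetical, Python writes the captured group as \1 where Apache writes $1, and flags such as [PT] are not modeled.

```python
import re

def apply_rewrite(path, pattern=r"myadmin/(.*)", target=r"/administrator/\1"):
    """Mimic one rewrite rule: if the pattern matches, return the target path."""
    match = re.search(pattern, path)
    if match is None:
        return path  # the rule does not apply; the path is left untouched
    return match.expand(target)

print(apply_rewrite("/myadmin/"))
print(apply_rewrite("/myadmin/index.php"))
print(apply_rewrite("/index.php"))
```

Whatever follows myadmin/ is captured by the parentheses and appended to the administrator path, which is why the login page and every page below it are reachable through the alias.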

With the mod_rewrite module enabled, you're ready to activate the necessary Joomla routing.

Activating the .htaccess File

To allow Apache to properly handle the SEF URLs, you need to set up a custom Joomla .htaccess file in the root directory. For the Apache server, the Joomla installation includes a sample .htaccess file that has the proper configuration settings for the main directory to allow Joomla to handle the URL conversion.

The sample configuration file, named htaccess.txt, is located in the root directory of your Joomla site in a default installation. To let Joomla make use of mod_rewrite, rename the file to .htaccess (or ht.acl on Windows; see the following note for more information). Restart the Apache server so that the file will be correctly addressed.


Note On the Windows platform, Windows Explorer won't rename a file to an extension without a main filename (which is how the OS will consider the filename .htaccess). You can get around this prohibition by using the command prompt to rename the file, but there is another solution. Load the httpd.conf file for your Apache server into a text editor and add a line that changes the name of the default .htaccess file (such as AccessFileName ht.acl .htaccess). After you've added the line, restart the server. The added directive will allow the .htaccess file to have either the traditional filename or the name ht.acl.
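A small script is another way around the Windows Explorer limitation. The Python sketch below is only one possible approach: the function name activate_htaccess is made up, and it copies the sample file rather than renaming it (a deliberate departure from the rename described above) so that the original htaccess.txt stays available as a reference.

```python
import os
import shutil

def activate_htaccess(site_root):
    """Copy htaccess.txt to .htaccess in site_root unless one already exists."""
    source = os.path.join(site_root, "htaccess.txt")
    target = os.path.join(site_root, ".htaccess")
    if os.path.exists(target):
        return False  # never overwrite an existing configuration
    shutil.copyfile(source, target)
    return True
```

Call the function with the path to your Joomla root directory; it returns True when the file was created and False when an .htaccess file was already present.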


You can examine the .htaccess file to see if any of the special cases listed in the comments section of the file may cause problems on your server. Open the file and you'll see the following setting in the text:

Options +FollowSymLinks

This setting may already be set in another part of the Apache configuration (especially on a remote server). If this setting generates an error when you restart Apache, you may need to add a pound sign (#) to the front of the line to make the directive a comment so that it won't execute.

Using Third-Party SEF Plug-Ins

There are a number of SEF plug-ins for Joomla (available on the Joomla extensions site, at http://extensions.joomla.org), the most popular being sh404SEF (http://extensions.joomla.org/extensions/2380/details), SEF Advance (http://extensions.joomla.org/extensions/362/details), and ARTIO JoomSEF (http://extensions.joomla.org/extensions/site-management/sef/1063/details). While the built-in Joomla SEF option is convenient, URLs still have names that may not be as descriptive as you want. The third-party extensions allow you to specify exactly what URL will appear for a given page.

The custom URL mapping supported by the third-party plug-ins is especially useful if you are converting an existing static web site to Joomla. You may already have web pages and directories with good search engine page ranks. By setting up a custom map, you can have Joomla mimic the existing URLs and therefore retain the ranking each page has already achieved.

You can download ARTIO JoomSEF for free from the Joomla! Related section of the ARTIO download page (www.artio.net/en/downloads). After you install it, you will need to use the Administrator interface to configure it to set the SEF output. In the Components menu, you will see the ARTIO JoomSEF menu option, which can be used to display the control panel that allows you to access all the component functions (see Figure 12-3).

image

Figure 12-3. The main ARTIO JoomSEF control panel provides panels for all of the component functions.

You can craft a friendly URL for any URL that Joomla uses (see Figure 12-4). To set up a custom mapping, browse to the content that needs a friendly address, record its original Joomla URL, and then add an entry that pairs that URL with the SEF address you want.

image

Figure 12-4. You can set friendly URLs for any Joomla URLs.


Caution Some of the third-party SEF extensions use a MySQL table to convert between the actual Joomla URL and the SEF version. On a page with a great number of URLs (such as a calendar control), the performance of the server can suffer. Therefore, be sure to configure the extension to ignore pages that contain a large number of links.
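The cost described in the caution can be sketched in Python, with a dictionary standing in for the MySQL table. Every name and entry here (SEF_TABLE, IGNORED_PAGES, links_for_page, and the sample URLs) is hypothetical and does not come from any particular extension.

```python
# A dict stands in for the extension's lookup table (hypothetical entries).
SEF_TABLE = {
    "index.php?option=com_content&view=article&id=12": "/products/widgets",
    "index.php?option=com_content&view=category&id=33": "/news",
}
# Pages whose links should be left alone because they contain too many URLs.
IGNORED_PAGES = {"/calendar"}

def links_for_page(page, links, sef_table=SEF_TABLE, ignored=IGNORED_PAGES):
    """Translate a page's links to SEF form unless the page is ignored."""
    if page in ignored:
        return list(links)  # no table lookups at all for this page
    # One lookup per link; unknown links are passed through unchanged.
    return [sef_table.get(link, link) for link in links]
```

Every link on a translated page costs one lookup, so a page with hundreds of links costs hundreds of lookups per request unless it is on the ignore list.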


Using Titles, Meta Descriptions, and Keywords

Joomla has features that aid in proper search engine recognition. Two of the most important are found under the Advanced Parameters tab in the article editor. The meta description of an article, which generates the Description tag in the HTML output (see Figure 12-5), is used by most search engines to present a summary of the web page. The description is also examined in conjunction with the title of the page and the headings to ascertain the most relevant information about the page. From this information, the search engine will attempt to file the page under the most relevant keywords.

image

Figure 12-5. The advanced parameters of an article hold the metadata used by the search engines.

The keywords for an article were very important for page classification in the past. Because of the abuse of this information by spammers (who include popular keywords in pages that have no relevance to them), search engines are known to discount or outright ignore these meta keywords. Nonetheless, they can provide just a little extra information and may aid the site's own (local) search in finding articles pertinent to a user query. Therefore, it is prudent to spend a small amount of time entering keywords that are appropriate to each article.

The title of the web page is one of the most overlooked aspects of SEO by new webmasters. There are many web sites in which pages have no titles, duplicate titles, or nondescriptive titles. In fact, most search engines put a premium on a web page title for a description of the page—especially if the title matches one of the major page headings. Therefore, try to make your titles as relevant and descriptive as possible.


Tip I've written an open source module called the Missing Metadata module (available for free download at www.joomlajumpstart.com), which you might find useful. This module displays a table of the articles that have no information in the metadata fields. Clicking an article entry takes you directly to the editor so the empty fields can be populated with the appropriate text.
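You can run a similar check against the HTML that your site actually serves. The Python sketch below is not the Missing Metadata module; it is a separate, minimal illustration (the class and function names are made up) that reports which of the title, description, and keywords are empty in a page's source.

```python
from html.parser import HTMLParser

class MetadataAudit(HTMLParser):
    """Collect the <title> text and the named <meta> tags of a page."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def missing_metadata(html):
    """Return the names of the metadata items that are absent or empty."""
    audit = MetadataAudit()
    audit.feed(html)
    audit.close()
    missing = [] if audit.title.strip() else ["title"]
    missing += [name for name in ("description", "keywords")
                if not audit.meta.get(name)]
    return missing
```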


Sitemaps

Including a sitemap on your site is an excellent way to ensure that the search spider will find and crawl all of the individual pages of the site. Since search engine programs understand sitemaps, their spidering can be guided by the directory provided by the list.

Sitemaps should be limited in length, however. Long sitemaps (those with more than 100 links on a single page) are delayed in mapping. Generally the first 100 links will be spidered promptly with any additional links placed in a queue for spidering later—perhaps even months later. Some of the most popular sitemap generators include Xmap, Joomap, the Google Sitemap Generator, and SEF Service Map 2.
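If your site has more links than fit comfortably on one sitemap page, the list can be split into several pages. The following Python sketch shows the idea using the 100-link figure mentioned above; the function name paginate_links is made up, and the sitemap generators listed here handle paging in their own ways.

```python
def paginate_links(links, per_page=100):
    """Split a list of links into consecutive pages of at most per_page items."""
    return [links[start:start + per_page]
            for start in range(0, len(links), per_page)]

# 250 hypothetical article links yield three sitemap pages.
links = ["http://www.example.com/article-%d" % n for n in range(250)]
pages = paginate_links(links)
print([len(page) for page in pages])
```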

Xmap

Xmap is the top sitemap generator for Joomla; it can be found on the Joomla Extension Directory at http://extensions.joomla.org/extensions/site-management/site-map/3066/details. Xmap can generate multiple sitemaps with different preferences, includes a caching system for high-traffic sites, renders an XML sitemap version, and has many other features. Additionally, Xmap generates statistical reports for each sitemap, including last-visit data, number of links generated on the last visit, and total number of visits.

Joomap

Joomap is one of the top sitemap generators for Joomla; it can be downloaded from the Joomap home page, at http://koder.de/projekte/joomap/. Joomap not only provides complete mapping for categories and sections, but it can also map items included in the VirtueMart categories (introduced in Chapter 11) if you are using the VirtueMart extension for e-commerce. Entries processed by Joomap can be easily rendered as a Google Sitemap XML list.

Google Sitemap Generator

If you want to cater to the Google search engine and use technology that is most tuned to Google's specifications, you can use the Google Sitemap Generator. This sitemap generator is written in the Python language and can be downloaded from Google at www.google.com/webmasters/tools/docs/en/sitemap-generator.html. It creates a sitemap using the Sitemap protocol (see www.sitemaps.org/protocol.php for complete details).

There are many sites that offer to execute the Google Sitemap Generator scripts for you through a web page. XML-Sitemaps (www.xml-sitemaps.com), for example, will take you step by step through the rendering of a sitemap for your Joomla site. It will render an XML file that Google can use for the most accurate cataloging of your web site's content. It will also generate a sitemap rendered in the text format used by Yahoo.
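To give a sense of what these tools produce, here is a minimal Python sketch that writes the simplest form of a Sitemap protocol file: a urlset element that holds one url/loc entry per page. The function names are made up for this example, the optional protocol elements are left out, and the one-URL-per-line layout of the text version is an assumption about the plain-text format.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a minimal Sitemap protocol document as a string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

def build_url_list(urls):
    """Return a plain-text list with one URL per line (assumed layout)."""
    return "\n".join(urls) + "\n"

urls = ["http://www.example.com/", "http://www.example.com/joomla-overview"]
print(build_sitemap(urls))
```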

SEF Service Map 2

The SEF Service Map 2 component (www.sefservicemap.com) creates a sitemap dynamically. It also includes a Google Sitemap Generator as well as a Yahoo text format generator for submission to that search engine. This component is compatible with all of the default installation components, as well as Fireboard, Joomlaboard, SMF, DOCMan, Remository, JCal Pro, Ext-Calendar 2, Gallery2, Zoom Gallery, SOBI2, and VirtueMart.

SEF Service Map 2 will even map RSS headlines, links, and contacts. It can also cache its output, so a sitemap doesn't need to be rendered each time it's accessed—saving valuable server resources. The component provides full multilingual support.

One of the most useful features included in SEF Service Map 2 is the ability to exclude menu items or entire menus from being cataloged. This option allows you to prevent private or inconsequential pages from being included in the sitemap.

Breadcrumbs

In web terminology, breadcrumbs are the set of links that show the path of the current page as it relates to the greater context of the entire site. For example, if you are on the page for Article A, which is located in Category B within Section C, the breadcrumbs will show a link to the category and section in which the article is held. This user interface convention allows a web visitor to move up the hierarchy, often to look at content of the same general type. The breadcrumbs links on a page will appear something like this:

  • Home >> Section C >> Category B >> Article A

More importantly for SEO, breadcrumbs provide the search engine spider a clearer understanding of the structure of your web site. They also provide internal links that can have a slight but important effect on how the individual pages of your site are rated in the spider's index.
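The internal links that breadcrumbs contribute are easy to see in a sketch. The Python function below is a hypothetical illustration, not the code of the breadcrumbs module: it links every level except the current page, and the URLs and the separator are made up.

```python
from html import escape

def breadcrumb_html(levels, separator=" &raquo; "):
    """Render (title, url) pairs as a trail; the last level is plain text."""
    parts = ['<a href="%s">%s</a>' % (escape(url, quote=True), escape(title))
             for title, url in levels[:-1]]
    parts.append(escape(levels[-1][0]))  # the current page is not linked
    return separator.join(parts)

trail = breadcrumb_html([
    ("Home", "/"),
    ("Section C", "/section-c"),
    ("Category B", "/section-c/category-b"),
    ("Article A", "/section-c/category-b/article-a"),
])
print(trail)
```

Each ancestor in the trail becomes an ordinary HTML link, so every article page carries links back to its category, its section, and the home page.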

Whatever template you use, try to make sure that breadcrumbs exist on the page. In Joomla, breadcrumbs are displayed as a module (mod_breadcrumbs) and appear at the top of each page in most templates, as shown in Figure 12-6.

Figure 12-6. Breadcrumbs appear as a set of links displayed by the mod_breadcrumbs module.

By default, the module is configured to appear in the breadcrumbs position of the template. If a template doesn't include such a position, then the breadcrumbs won't be displayed. If your template omits that position, you will need to add it yourself.

To do so, open the index.php file of your template. You can edit the template in your favorite text editor, or you can enter the Template Manager, click the template to which you want to add the breadcrumbs, and click the Edit HTML button, as shown in Figure 12-7. The screen will display the PHP/HTML code of the template main page.

image

Figure 12-7. Click the Edit HTML button to edit the template index code.

It is a good idea to place the breadcrumbs somewhere near the top of the page, although the location will vary from template to template. In the default Joomla template, the module appears to the left of the search module (which occupies the user4 position in the code). In the following listing, you can see the reference that displays the Breadcrumbs module:

<div id="search">
    <jdoc:include type="modules" name="user4" />
</div>

<div id="pathway">
    <jdoc:include type="module" name="breadcrumbs" />
</div>

<div class="clr"></div>

If you duplicate the module reference to the appropriate place in your template code, the breadcrumbs link list will appear. Look at the page and make sure that the list is in the proper position. For correct placement, you can examine the rendering of your Front Page between edits.

As an alternative, you can use the Preview button (on the same screen where the Edit HTML button is located). It will display the current template with dummy content and show where each module position is located. The preview can help you determine whether the template has a location to display the Breadcrumbs module, and also to properly align the module if you are adding it to an existing template.

Creating an SEF Joomla! Template

In Chapter 6, you created a Joomla template that displayed two columns using CSS. By using similar CSS code, you can make the display much more SEF by rearranging the column display. The new template will increase the visibility of the central content of each page of your site.

When a search engine spider indexes a web site, the text nearest the beginning of the file is indexed first and weighted most heavily in the valuation of content. In a two- or three-column layout, this means that the left navigation panel appears first in the HTML source code, followed by the center column, which holds the meat of most web pages. That's not an ideal situation, since the navigation is not the most important item on the page—the center content that holds the article is far more significant.

The original two-column template has the following code to define the columns:

#col1 {
    float:left;width:20%;
    background:#244223;
    padding: 10px;
}
#col2 {
    float:left;width:75%;
    border:3px solid #244223;
    background:#58a155;
    padding: 10px;
}

These style sheets are logical and display properly. However, column 1 must appear first in the source code for this to function properly. If a style sheet could be created in which column 2 appears first in the code, but still displays correctly, everything would work perfectly. Such a CSS design is possible if you use a container element.

If you create a container, the style sheet for column 2 can appear first and simply be assigned to the right side of the container. When column 1 appears in the source code, it specifies a location on the left, and everything is displayed exactly as needed. Change the style sheets for the columns in the CSS file to match the following definitions, and add the container and myclear styles:

div#logo {
    width: 110%; height: 100px;
    margin-left: -10px;
    margin-bottom: 10px;
    background: url(../images/LSlogo.jpg) left no-repeat;
    border: 1px solid #244223 ;
    padding: 0px;
}

#col1 {
    float:left;width:20%;
    display:inline;
    background:#244223;
    padding: 10px;
}
#col2 {
    float:right;width:75%;
    display:inline;
    border:3px solid #244223;
    background:#58a155;
    padding: 10px;
}

#container {
    float:left;width:85%;
    display:inline;
}

.myclear {
    clear:both;
}

With that change, you only need to change the index.php file to position column 2 first. Change the code to match the following, in which both columns sit inside the new container element:

<?php echo '<?xml version="1.0" encoding="utf-8"?' .'>'; ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
lang="<?php echo _LANGUAGE; ?>" xml:lang=
   "<?php echo _LANGUAGE; ?>">
<head>
<jdoc:include type="head" />
<link rel="stylesheet" href="templates/_system/css/general.css"
    type="text/css" />
<link rel="stylesheet" href="templates/<?php echo $this->template ?>/css/template.css"
   type="text/css" />
</head>

<body id="page_bg">

<jdoc:include type="message" />
<div id="logo">&nbsp;</div>
<div id="container">
    <div id="col2">
        <jdoc:include type="component" />
    </div>

    <div id="col1">
        <jdoc:include type="modules" name="left" style="xhtml" />
    </div>
    <div class="myclear">&nbsp;</div>
</div>
<jdoc:include type="modules" name="debug" />

</body>
</html>

The code shows that both columns are encapsulated within the container structure. Column 2 appears first, which will make the content output by the component appear first in the source code file and therefore be indexed first by the spider.
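You can confirm the source order with a few lines of Python. The sketch below uses a simplified copy of the template body (the jdoc tags are replaced with placeholder text) and a made-up helper that lists the div IDs in the order a spider would meet them.

```python
from html.parser import HTMLParser

class DivOrder(HTMLParser):
    """Record the id of every <div> in the order it appears in the source."""
    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        div_id = dict(attrs).get("id")
        if tag == "div" and div_id:
            self.ids.append(div_id)

def div_ids_in_source_order(html):
    parser = DivOrder()
    parser.feed(html)
    parser.close()
    return parser.ids

BODY = """
<div id="logo">&nbsp;</div>
<div id="container">
    <div id="col2">article content</div>
    <div id="col1">navigation</div>
    <div class="myclear">&nbsp;</div>
</div>
"""
order = div_ids_in_source_order(BODY)
print(order)
```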

General Techniques

Joomla includes a number of features that make SEO possible. However, there are other techniques you might consider to make sure your web site is optimized that lie outside of the Joomla configuration. These methods will work on any type of web site—dynamic or static.

Problems of JavaScript, Flash, and Ajax

An increasing number of web sites are adding dynamic interaction either directly through JavaScript (for functions such as drop-down menus) or by using a collection of technologies such as Ajax (for dynamic information retrieval). While these new tools provide functionality that can make a web site very flashy and user-friendly, they create special problems for search engine spidering.

For example, a typical Joomla menu is simply an HTML list of links, which makes it easy for the search engine spider to recognize the links and visit the corresponding pages. A JavaScript-enabled menu system, however, is more likely to base link selection upon the current mouse position. Since the search engine spider will not execute the JavaScript code, how can it know which links are available for selection?
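The difference shows up clearly in a toy spider. The Python sketch below collects only the href values of anchor tags, which is a simplifying assumption about how a spider that does not run JavaScript discovers links; the class name and the sample markup are made up.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, as a script-blind spider might."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def find_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links

html_menu = '<ul><li><a href="/news">News</a></li><li><a href="/faq">FAQ</a></li></ul>'
script_menu = ('<ul><li onclick="window.location=\'/news\'">News</li>'
               '<li onclick="window.location=\'/faq\'">FAQ</li></ul>')
print(find_links(html_menu))
print(find_links(script_menu))
```

The list-based menu yields both links, while the script-driven menu yields none, because its destinations exist only inside JavaScript that the collector never runs.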


Tip One alternative to full JavaScript menus is a CSS-based menu system such as FreeStyle Menus (www.twinhelix.com/dhtml/fsmenu). These menus combine CSS with JavaScript to format and handle the display of the menu content. This means that the actual menu text and links exist as text formatted with CSS within the HTML document, just like any other page text. Such menus are completely understandable to a search engine spider.


Likewise, a Flash-based site may have a great deal of content hidden within an SWF file, which the search spider has no way to effectively address. Search engines cannot read into Flash files or execute Flash code, so all of the content within Flash animations remains invisible to the spider.

Therefore, it is always a good idea to have a non-Flash version of your site for SEO. Each page may have a link to the flashier animated content if desired. Without a parallel HTML version of the Flash data, most search engines will not be able to catalog either the content itself or the links that lead to the content.

HTML-to-Text Ratio

One of the methods search engines use to evaluate and rate content within a page is calculation of the HTML-to-text ratio. This ratio indicates whether most of the page's content is HTML code (such as vast tables or substantial JavaScript code) or actual text content. The lower the ratio, the more important the text will seem to the engine.

This is one reason to locate your CSS and JavaScript code in external files. Search spiders do not evaluate these external files as part of the ratio, meaning that the clean content that remains in the main file will be given more priority than if it were lost in a sea of extraneous code.
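Search engines do not publish how they compute this ratio, so the following Python sketch rests on an assumed definition: the number of characters that are not visible text divided by the number of characters that are. The class and function names are made up, and the figures it produces are useful only for comparing one version of a page with another.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of script and style."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)

def html_to_text_ratio(html):
    """Return non-text characters divided by visible-text characters."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join("".join(parser.chunks).split())
    if not text:
        return float("inf")  # a page with no text at all
    return (len(html) - len(text)) / len(text)

inline = ("<html><head><style>p { color: #244223; margin: 10px; }</style>"
          "</head><body><p>Joomla overview</p></body></html>")
external = ('<html><head><link rel="stylesheet" href="template.css" />'
            "</head><body><p>Joomla overview</p></body></html>")
print(html_to_text_ratio(inline) > html_to_text_ratio(external))
```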

Spidering Your Own Site

While the exact functions of the search company spiders are closely guarded industrial secrets, you can get an idea about how a spider will view your site by spidering it yourself. There are several free web spiders that you can use to scan and analyze your web site. One popular spider is the open source Pavuk Web Spider and Performance Measure, which is available on SourceForge at http://sourceforge.net/projects/pavuk.

If you're operating on the Windows platform, you might try Xenu's Link Sleuth, available at http://home.snafu.de/tilman/xenulink.html.

Xenu will quickly and completely spider your web site and provide you a variety of information about the site (see Figure 12-8). This utility is very useful because it will show you any problems with your site, including broken links and missing graphics files. The program will generate a report of all the broken links on the pages of the site.

image

Figure 12-8. Xenu will spider your web site and identify any broken links or missing files.

One of the most useful columns in the Xenu report is the Duration column, which reveals how long it took to retrieve the linked file. By looking at the retrieval duration times, you can see which pages (and perhaps which Joomla extensions on specific pages) are slowing down access to site information.

The program will also generate an excellent report of the general content of the web site. At the bottom of the report, a summary will be made that appears something like this:


All pages, by result type:
ok              165 URLs    83.76%
not found        10 URLs     5.08%
server error     20 URLs    10.15%
skip type         2 URLs     1.02%
Total           197 URLs   100.00%

If the spider report shows a thorough cataloging of your site, search engine spiders will likely have no problem crawling your site and finding all the content.
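The percentages in a summary like this are simply each count divided by the total. If you export the results of a crawl, a few lines of Python can reproduce the tally; the function name summarize is made up, and the input list below is rebuilt from the counts in the sample report rather than taken from a real crawl.

```python
from collections import Counter

def summarize(results):
    """Map each result type to (count, percentage of all URLs)."""
    counts = Counter(results)
    total = sum(counts.values())
    return {kind: (count, round(100.0 * count / total, 2))
            for kind, count in counts.items()}

results = (["ok"] * 165 + ["not found"] * 10 +
           ["server error"] * 20 + ["skip type"] * 2)
summary = summarize(results)
for kind, (count, percent) in summary.items():
    print("%-14s %3d URLs %6.2f%%" % (kind, count, percent))
```

In this sample, roughly 15 percent of the URLs return an error of some kind, which is the figure to drive down before expecting a search engine to catalog the site fully.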

Checking Page Rank

Google originated a value of relative search engine importance, called page rank. Each individual web page (pages within a site can vary) is assigned a number from 0 to 10. The 0 value simply means that Google has not yet indexed the page. New pages often have a rank of 2 or 3, while larger, well-established sites are generally in the 6 to 9 range.

To get a very rough approximation of a web site's general search ranking, you can install Google Toolbar (see Figure 12-9). While page ranking is not useful for any precision evaluation of a web site value, it will allow you to get a feel for how important the web site is in the Internet sphere.

image

Figure 12-9. Google Toolbar shows a web site's general search ranking.

When Google first released the page ranking system, optimizers recognized the assigned value as very important. Nowadays, with all the other valuation methods used by search engines, it has become less important. However, it still provides an excellent general assessment of a page's popularity on Google. It can therefore be used in a rough manner to evaluate the popularity of your own site, as well as other associated or competitive sites.

Keyword-Rich Content

Keyword lists should contain all of the important variations of a topic. Whatever the web page is about, the keyword list should contain all the various synonyms of the central terms related to the content in order to encompass each term a person might search for information about. A page on investment, for example, might have a keyword list like this: EPS, valuation, earnings, share, DOW, index, and prospectus.

Because of the abuse of the technology by spammers, metadata keywords are scarcely given attention by search engine spiders. For search optimization, it is not important to spend much time creating the list of keywords to include in the metadata—except for the advantage of generating the list itself.

The keyword list can be used to ensure that the keywords are located in the content of the article. If all of the important keywords are included in the headlines and body of the article, the search engine indexing system will rate the relevance of the page very highly in terms of investment information because of the association of the common terms.
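Checking an article against its keyword list is easy to automate. The Python sketch below uses the investment keyword list from earlier in this section and a made-up sentence of article text; the function name is hypothetical, and the comparison handles single-word keywords only.

```python
import re

def keyword_coverage(keywords, text):
    """Return (present, missing) keywords, compared case-insensitively."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    present = [word for word in keywords if word.lower() in words]
    missing = [word for word in keywords if word.lower() not in words]
    return present, missing

keywords = ["EPS", "valuation", "earnings", "share", "DOW", "index", "prospectus"]
article = "The prospectus lists earnings and EPS for each share."
present, missing = keyword_coverage(keywords, article)
print("present:", present)
print("missing:", missing)
```

The missing list tells you which terms to work into the headlines or body text, provided they genuinely belong there.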

Preventing Content Listing

Most web sites need to be found by the general public. However, there are some web sites, or even specific pages on a web site, that have reasons to remain invisible. These pages are generally either completely private or needed only by authorized personnel who will be given the URLs individually and won't need to locate the references in a search engine.

A common example of this type of site is a real-time quotes site in the finance industry. Since these sites themselves require paid subscribers to have usernames and passwords (and therefore the site address), there is no reason to advertise the subscriber home page URL to the world. Keeping the site off the search engines prevents unwanted random traffic, confusion by consumers, and targeting by hackers.

You can be more specific than keeping your entire site off the search engines by explicitly listing individual pages or directories that the spider should ignore. By creating a list of excluded pages, you can hide content that should only be viewed by targeted visitors of the site. For example, you may want to provide an equity growth calculator to potential clients who are geographically local. Having the calculator listed on a search engine will bring worldwide visitors who have no potential to become customers yet still use up your bandwidth.

The excluded files or directories must be listed in a text file that sits at the root directory of the site. The file, named robots.txt, contains a list of fields whose names are case-insensitive (the paths they refer to are matched exactly as written). The pound sign (#) can be used to include comments in the file, which will be ignored by the spider. The User-agent field can be used to explicitly specify which spider (such as the Yahoo spider) should use the file. More commonly this parameter is set to the * setting (which means "all") to indicate that all spiders should restrict their spidering based on the file contents.

For example, the robots.txt file for restricting the contents of the forum directory and the clientlist.htm file would appear like this:

# Spidering exclusion file for http://www.example.com/

User-agent: *
Disallow: /forum    # Don't spider anything in the forum directory
Disallow: /clientlist.htm  # Don't spider the client list file.
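Before publishing an exclusion file, it can be useful to check which URLs it actually blocks. The following minimal sketch uses the robots.txt parser in Python's standard library to test the file shown above. The spider name ExampleSpider and the sample paths are assumptions made for illustration, and real spiders may interpret some rules differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# The exclusion file from the example above.
ROBOTS_TXT = """\
# Spidering exclusion file for http://www.example.com/

User-agent: *
Disallow: /forum    # Don't spider anything in the forum directory
Disallow: /clientlist.htm  # Don't spider the client list file.
"""


def build_parser(robots_text):
    """Load robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical paths on the example site.
for path in ("/index.php", "/forum/topic1.html", "/clientlist.htm"):
    url = "http://www.example.com" + path
    allowed = parser.can_fetch("ExampleSpider", url)
    print(path, "allowed" if allowed else "blocked")
```

Because the User-agent field is set to *, the same answers apply to any spider name passed to can_fetch.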

The Disallow field matches any URL path that begins with the listed value, so the entry for /forum excludes all references to items in the directory. You may want to disallow only the index page of a folder, for example, to eliminate spidering of the central listing of all the articles, but allow spidering of articles that are located in the folder but linked from other articles. Adding a trailing forward slash (/) is not enough, because /forum/ still matches every path beneath the directory. Spiders that support the end-of-path marker ($), which includes the major search engines, let you exclude only the bare directory URL (whichever index file the web server is configured to return for it, such as default.htm, index.html, or index.php):

Disallow: /forum/$       # Don't spider the index page of the forum directory

Unfortunately, exclusion rules match the beginning of a URL, so they cannot reliably target individual query string parameters. Because Joomla addresses content through query string parameters by default, the robots.txt file requires the Joomla SEF option to be turned on to work properly. Otherwise, the exclusion file can only be used practically with Joomla to exclude the entire web site from the search engine spider.

Linking Strategy

It is useful to have a linking strategy in place when you are attempting to increase your placement. Tabulating the number of other important web sites on a particular topic that link to your site is one of the primary methods search engines use to determine if a site has important information on that topic.

For example, ESPN is a very important web site for sports fans. If you run a web site that focuses on football memorabilia, a link from the ESPN site would dramatically elevate your ranking in any searches related to sports. Notice that the link will help you most if it is in your same topic area. A link from a very popular car parts manufacturer would not help the sports memorabilia site nearly as much—even if the parts site had more popularity than ESPN.

Likewise, a prominent link on a small, rarely visited site is not worth nearly as much as one on a popular site. With this basic understanding of how links from other sites can affect your search engine ranking, you can begin to develop a linking strategy that will help you decide where to focus your efforts in obtaining links from other web sites.
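To make the idea of link weight concrete, here is a toy sketch based on the simplified page rank calculation that Google originally published. It is not how any search engine scores pages today, and it ignores topic relevance entirely. The site names, the damping value of 0.85, and the function name rank_pages are assumptions chosen for illustration.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Estimate link-based scores; links maps a page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for target in pages:
                    new_scores[target] += damping * scores[page] / n
        scores = new_scores
    return scores


# Hypothetical web: a hub with three incoming links, and a blog with none.
links = {
    "fan_site_a": ["popular_hub"],
    "fan_site_b": ["popular_hub"],
    "fan_site_c": ["popular_hub"],
    "popular_hub": ["memorabilia_shop"],
    "tiny_blog": ["other_shop"],
}

scores = rank_pages(links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(f"{page:18s} {scores[page]:.3f}")
```

Both shops receive exactly one link, but the shop linked from the well-connected hub ends up with the higher score, which is the effect a linking strategy tries to exploit.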

Some ways to obtain links are as follows:

  • Offer reciprocal link placement: If you can find the administrator e-mail for a popular site, you can offer to exchange reciprocal links. Your web site must have a fair amount of content or a substantial page ranking to make this worth the while of the other site's webmaster.
  • Write articles for web publication: There are a number of sites that will publish articles that they will syndicate for republication across the Web (e.g., www.ezinearticles.com and www.onlypunjab.com). An article can contain a link to your web site. Writing a general description article (or more than one) on a topic relevant to your site can be an excellent way to promote yourself as a field expert.
  • Post to relevant message boards with a signature link: There are forums and message boards on the Web dedicated to almost any topic under the sun. Often these sites have new users posting basic questions that you, as an expert in your field, can answer. It is typically acceptable behavior on these sites to have a small advertisement link for your web site in the signature text that follows your posting. Be sure not to simply spam a forum advertising your wares. Not only will the advertisement likely be removed, but you will also have generated some ill will toward your site. If you can provide value through useful and informative posts, your small link should not raise the ire of any forum members, and could help generate new traffic.

Avoid Keyword Spamming

Most of the advice for adapting your site to make it the most friendly to search engines is also useful advice for simply making your site well designed for your visitors. Likewise, the presentation aspects that can hurt your site rating also generally fall under the category of bad web design.

You should avoid keyword spamming on your page. This form of spamming entails placing a text field at the bottom of your web page that includes hundreds if not thousands of keywords in small or invisible text. Previously, search engines would be fooled by these masses of keywords and increase the site's ranking.

No more. If a search engine recognizes that your site is attempting this sort of rank manipulation, the page may very well be penalized in the search index. In the past, it was generally considered poor form to attempt this strategy—now it can have the opposite of the intended effect.

Conclusion

It requires some effort to ensure that your web site has the highest possible rankings on the search terms relevant to the site. Joomla makes it fairly easy to implement SEO functionality on your web site, and you should take advantage of its features.

Despite having to deal with a little complexity in configuration, one of the first steps you should take in optimizing your site is setting Joomla to use the Search Engine Friendly URLs option for content addressing. The sooner you activate this option, the sooner the search engines will have a proper list of article URLs. This setting alone can significantly increase your web presence. It is worth the trouble of configuring your web server to enable this option.

So far you have used extensions written by other developers for everything from e-commerce to SEF functionality. In the next chapter, you will learn how to create your own modules and components to add any capabilities to a Joomla site that you might need.
