Chapter 28. Helping People Find Your Web Pages


WHAT YOU’LL LEARN IN THIS CHAPTER:

• How to publicize your website

• How to list your pages with the major search sites

• How to optimize your site for search engines


Your web pages are ultimately only as useful as they are accessible—if no one can find your pages, your hard work in creating a useful architecture, providing interesting content, and coding them correctly will be for naught. The additional HTML tags you’ll discover in this chapter won’t make any visible difference in your web pages, but they are extremely important in that they will help your audience more easily find your web pages.

For most website creators, this might be the easiest—but most important—chapter in the book. You’ll learn how to add elements to your pages and how to construct your site architecture in such a way as to increase the possibility that search engines will return links to your site when someone searches for words related to your topic or company; this is called search engine optimization (SEO).

Contrary to what you might hear from companies who try to sell SEO services to you, there are no magic secrets that guarantee you’ll be at the top of every list of search results. However, there is a set of free best practices that you can do on your own to make sure your site is as easy to find as possible.

Publicizing Your Website

Presumably, you want your website to attract someone’s attention or you wouldn’t bother to create it in the first place. However, if you are placing your pages only on a local network or corporate intranet, or you are distributing your site exclusively on removable storage media, helping users find your pages might not be much of a problem. But if you are adding the content of your website to the billions of other pages of content indexed by search engines, bringing your intended audience to your site is a very big challenge indeed.

To tackle this problem, you need a basic understanding of how most people decide which pages they will look at. There are basically three ways people become aware of your website:

• Somebody tells them about it and gives them the address; they enter that address directly into their web browser.

• They follow a link to your site from someone else’s site, an aggregator and recommendation service such as Digg or Reddit, or from a link or mention on a social networking site such as Facebook, Twitter, or Google+.

• They find your site indexed in the databases that power the Google, Bing, or Yahoo! search engines (among others).

You can increase your website traffic with a little time and effort. To increase the number of people who hear about you through word-of-mouth, well, use your mouth—and every other channel of communication available to you. If you have an existing contact database or mailing list, announce your website to those people. Add the site address to your business cards or company literature. If you have the money, go buy TV and radio ads broadcasting your Internet address. In short, do the marketing thing. Good old-fashioned word-of-mouth marketing is still the best thing going, even on the Internet—we just have more and more tools available to us online.

Increasing the number of incoming links to your site from other sites is also pretty straightforward—although that doesn’t mean it isn’t a lot of work. If there are specialized directories on your topic, either online or in print, be sure you are listed. Participate in social networking, including the implementation of Facebook fan pages (if applicable) for your service or business. Create a Twitter account to broadcast news and connect with customers—again, if that is applicable to your online presence. Go into the spaces where your customers might be, such as blogs that comment on your particular topic of interest, and participate in those communities. That is not to say that you should find a forum on your topic or service and spam its users with links to your site. Act as an expert in your given field, offering advice and recommendations along with your own site URL. There’s not much I can say in this chapter to help you with that, except to go out and do it.


Note

Mashable (http://www.mashable.com/) is a very popular, high-traffic site that is well respected for the accuracy and added value of its tips on interacting in social networking spaces, especially for business users.


The main thing I can help you with is making sure your content has been gathered and indexed correctly by search engines. It’s a fair assumption that if your content isn’t in Google’s databases, you’re in trouble.

Search engines are basically huge databases that index as much content on the Internet as possible—including videos and other rich media. They use automated processing to search sites, using programs called robots or spiders to search pages for content and build the databases. After the content is indexed, the search applications themselves use highly sophisticated techniques of ranking pages to determine which content to display first, second, third, and so on when a user enters a search term.

When the search engine processes a user query, it looks for content that contains the key words and phrases that the user is looking for. But it is not a simple match, as in “if this page contains this phrase, return it as a result,” because content is ranked according to frequency and context of the keywords and phrases, as well as the number of links from other sites that lend credibility to it. This chapter will teach you a few ways to ensure that your content appears appropriately in the search engine, based on the content and context you provide.


But wait! Before you rush off this minute to submit your listing requests, read the rest of this chapter. Otherwise, you’ll have a very serious problem, and you will have already lost your best opportunity to solve it.

To see what I mean, imagine this scenario: You publish a page selling automatic cockroach flatteners. I am an Internet user who has a roach problem, and I’m allergic to bug spray. I open my laptop, brush the roaches off the keyboard, log on to my favorite search site, and enter cockroach as a search term. The search engine promptly presents me with a list of the first 10 out of 10,400,000 web pages containing the word cockroach. You have submitted your listing request, so you know that your page is somewhere on that list.

Did I mention that I’m rich? And did I mention that two roaches are mating on my foot? You even offer same-day delivery in my area. Do you want your page to be number 3 on the list or number 8,542? Okay, now you understand the problem. Just getting listed in a search engine isn’t enough—you need to work your way up the rankings.


Listing Your Pages with the Major Search Sites

If you want users to find your pages, you absolutely must submit a request to each of the major search sites to index your pages. Even though search engines index web content automatically, this is the best way to ensure your site has a presence on their sites. Each of these sites has a form for you to fill out with the URL address, a brief description of the site, and, in some cases, a category or list of keywords with which your listing should be associated. These forms are easy to fill out; you can easily complete all of them in an hour with time left over to list yourself at one or two specialized directories you might have found as well. (How do you find the specialized directories? Through the major search sites, of course!)

Even though listing with the major search engines is easy and quick, it can be a bit confusing: Each search engine uses different terminology to identify where you should click to register your pages. The following list might save you some frustration; it includes the addresses of some popular search engines which will include your site for free, along with the exact wording of the link you should click to register:

• Google: Visit http://www.google.com/addurl/, enter the address of your site and a brief description, and then enter the squiggly verification text shown on the page, called a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart). Then click the Add URL button to add your site to Google.

• Yahoo! Search: Visit http://siteexplorer.search.yahoo.com/submit, click Submit a Website or Webpage, enter the address of your site, and then click the Submit URL button.

• Bing: Visit http://www.bing.com/docs/submit.aspx, enter the verification text, enter the address of your site, and then click the Submit URL button.

• AllTheWeb: AllTheWeb search results are provided by Yahoo! Search, so just be sure to submit your site to Yahoo! Search, as explained previously.

• AltaVista: AltaVista search results are also provided by Yahoo! Search, so just be sure to submit your site to Yahoo! Search, as explained previously.


Tip

Some sites provide a single form that automatically submits your site to all the major search engines, plus several minor ones. Popular examples include http://www.scrubtheweb.com/, http://www.submitexpress.com/, and http://www.hypersubmit.com/; these sites also attempt to sell you a premium service that lists you in many other directories and indexes. Depending on your target audience, these services might or might not be of value, but I strongly recommend that you go directly to the major search sites listed in this section and use their forms to submit your requests to be listed. That way you can be sure to answer the questions (which are slightly different at every site) accurately, and you will know exactly how your site listing will appear at each search engine.


Providing Hints for Search Engines

Fact: There is absolutely nothing you can do to guarantee that your site will appear in the top 10 search results for a particular word or phrase in any major search engine (short of buying ad space from the search site, that is). After all, if there were such guarantees, why couldn’t everyone else who wants to be number one on the list do it, too? What you can do is avoid being last on the list and give yourself as good a chance as anyone else of being first; this is called SEO, or optimizing the content and structure of your pages so that search engines will favor your pages over others.

Each search engine uses a different method for determining which pages are likely to be most relevant and should therefore be sorted to the top of a search result list. You don’t need to get too hung up on the differences, though, because they all use some combination of the same basic criteria. The following list includes almost everything any search engine considers when trying to evaluate which pages best match one or more keywords:

• Do keywords appear in the <title> tag of the page?

• Do keywords appear in the first few lines of the page?

• Do keywords appear in a <meta /> tag in the page?

• Do keywords appear in <h1> headings in the page?

• Do keywords appear in the names of image files and alt text for images in the page?

• How many other pages within the website link to the page?

• How many other pages in other websites link to the page? How many other pages link to those pages?

• How many times have users chosen this page from a previous search list result?


Note

Some over-eager web page authors put dozens or even hundreds of repetitions of the same word on their pages, sometimes in small print or a hard-to-see color, just to get the search engines to position that page at the top of the list whenever users search for that word. This practice is called search engine spamming.

Don’t be tempted to try this sort of thing—all the major search engines are aware of this practice and immediately delete any page from their database that sets off a spam detector by repeating the same word or group of words in a suspicious pattern. It’s still fine (and quite beneficial) to have several occurrences of important search words on a page, in the natural course of your content. Make sure, however, that you use those words in normal sentences or phrases—then the spam police will leave you alone.


Clearly, the most important thing you can do to improve your position is to consider the keywords your intended audience is most likely to enter. I’d recommend that you not concern yourself with common, single-word searches like food; the lists they generate are usually so long that trying to make it to the top is like playing the lottery. Focus instead on uncommon words and two- or three-word combinations that are most likely to indicate relevance to your topic (for instance, Southern home-style cooking instead of simply food). Make sure that those terms and phrases occur several times on your page, and be certain to put the most important ones in the <title> tag and the first heading or introductory paragraph.
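As a rough sketch of that advice, here is how a page aimed at the phrase Southern home-style cooking might place it in the <title> tag, the first heading, and the introductory paragraph. The page title, heading text, and paragraph wording are invented for illustration; substitute phrases that fit your own content.

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <!-- Hypothetical title: the key phrase appears once, up front -->
    <title>Southern Home-Style Cooking: Recipes and Tips</title>
  </head>

  <body>
    <!-- The first heading repeats the phrase users are likely to enter -->
    <h1>Southern Home-Style Cooking</h1>

    <!-- The introductory paragraph uses the phrase in a normal sentence -->
    <p>Learn Southern home-style cooking with step-by-step
    recipes for biscuits, gravy, and other regional
    favorites.</p>
  </body>
</html>
```

Notice that the phrase reads naturally everywhere it appears. Nothing here repeats a keyword merely for the sake of repeating it.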

Of all the search-engine evaluation criteria just listed, the use of <meta /> tags is probably the least understood. Some people rave about <meta /> tags as if using them could instantly move you to the top of every search list. Other people dismiss <meta /> tags as ineffective and useless. Neither of these extremes is true.

A <meta /> tag is a general-purpose tag you can put in the <head> portion of any document to specify some information about the page that doesn’t belong in the <body> text. Most major search engines look at <meta /> tags to provide them with a short description of your page and some keywords to identify what your page is about. For example, your automatic cockroach flattener order form might include the following two tags:

<meta  name="description"
      content="Order the SuperSquish cockroach flattener." />
<meta  name="keywords"
      content="cockroach,roaches,kill,squish,supersquish" />


Caution

Always place <meta /> tags after the <head>, <title>, and </title> tags but before the closing </head> tag.

By convention, <title> is usually the first tag in the <head> section, although the XHTML DTDs also allow <meta /> tags to appear before it.


The first tag in this example ensures that the search engine has an accurate description of the page to present on its search results list. In search engines that read it, the second <meta /> tag can slightly increase your page’s ranking on the list whenever any of your specified keywords are included in a search query.

You should always include <meta /> tags with name="description" and name="keywords" attributes in any page that you want to be indexed by a search engine. Doing so might not have a dramatic effect on your position in search lists, and not all search engines look for <meta /> tags, but it can only help.


Tip

The previous cockroach example aside, search engine experts suggest that the ideal length of a page description in a <meta /> tag is in the 100- to 200-character range. For keywords, the recommended length is in the 200- to 400-character range. Experts also suggest not wasting spaces in between keywords, which is evident in the cockroach example. And, finally, don’t go crazy repeating the same keywords in multiple phrases in the keywords—some search engines will penalize you for attempting to overdo it.


To give you a concrete example of how to improve search engine results, consider the page shown in Listing 28.1.

This page should be easy to find because it deals with a specific topic and includes several occurrences of some uncommon technical terms for which users interested in this subject would be likely to search. However, there are several things you could do to improve the chances of this page appearing high on a search engine results list.


Tip

In the unlikely event that you don’t want a page to be included in search engine databases at all, you can put the following <meta /> tag in the <head> portion of that page:

<meta name="robots" content="noindex,nofollow" />

This causes some search robots to ignore the page. For more robust protection from prying robot eyes, ask the person who manages your web server to include your page address in the server’s robots.txt file. (She will know what that means and how to do it; if not, you can refer to the handy information at http://www.robotstxt.org/.) All major search spiders will then be sure to ignore your pages. This might apply to internal company pages that you’d rather not be readily available via public searches.


Listing 28.1 A Page That Will Have Little Visibility During an Internet Site Search


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>Fractal Central</title>
  </head>

  <body>
    <div style="text-align:center">
      <img src="fractalaccent.gif" alt="a fractal" />
    </div>
    <div style="width:133px; float:left; padding:6px;
                text-align:center; border-width:4px;
                border-style:ridge">
      Discover the latest software, books and more at our
      online store.<br />
      <a href="orderform.html"><img src="orderform.gif"
         alt="Order Form" style="border-style:none" /></a>
    </div>
   <div  style="float:left; padding:6px">
     <h2>A Comprehensive Guide to the<br />
     Art and Science of Chaos and Complexity</h2>
     <p>What's that? You say you're hearing about
     "fractals" and "chaos" all over the place, but still
     aren't too sure what they are? How about a quick
     summary of some key concepts:</p>
     <ol>
       <li>Even the simplest systems become deeply
       complex and richly beautiful when a process is
       "iterated" over and over, using the results
        of each step as the starting point of the next.
        This is how Nature creates a magnificently
        detailed 300-foot redwood tree from a seed
        the size of your fingernail.</li>
        <li>Most "iterated systems" are easily simulated
        on computers, but only a few are predictable and
        controllable. Why? Because a tiny influence, like
        a "butterfly flapping its wings," can be strangely
        amplified to have major consequences such as
        completely changing tomorrow's weather in a distant
        part of the world.</li>
        <li>Fractals can be magnified forever without loss
        of detail, so mathematics that relies on straight
        lines is useless with them. However, they give us
        a new concept called "fractal dimension" which
        can measure the texture and complexity of anything
        from coastlines to storm clouds.</li>
        <li>While fractals win prizes at graphics shows,
        their chaotic patterns pop up in every branch of
        science. Physicists find beautiful artwork coming
        out of their plotters. "Strange attractors" with
        fractal turbulence appear in celestial mechanics.
        Biologists diagnose "dynamical diseases" when
        fractal rhythms fall out of sync. Even pure
        mathematicians go on tour with dazzling videos of their
        research.</li>
      </ol>
      <p>Think all these folks may be on to something?</p>
    </div>
    <div style="text-align:center">
      <a href="http://netletter.com/nonsense/">
          <img src="findout.gif" alt="Find Out More"
               style="border-style:none" /></a>
    </div>
  </body>
</html>


Now compare the page in Listing 28.1 with the changes made to the page in Listing 28.2. The two pages look almost the same, but to search robots and search engines, these two pages appear quite different. The following list summarizes what was changed in the page and how those changes affected indexing:

• Important search terms were added to the <title> tag and the first heading on the page. The original page didn’t even include the word fractal in either of these two key positions.

• <meta /> tags were added to assist search engines with a description and keywords.

• A very descriptive alt attribute was added to the first <img /> tag. Not all search engines read and index alt text, but some do.

• The quotation marks around technical terms (such as "fractal" and "iterated") were removed because some search engines consider “fractal” to be a different word than fractal. The quotation marks were replaced with the character entity &quot;, which search robots simply disregard. This is also a good idea because XHTML urges web developers to use the &quot; entity instead of quotation marks anyway.

• The keyword fractal was added twice to the text in the order-form box.

It is impossible to quantify how much more frequently users searching for information on fractals and chaos were able to find the page shown in Listing 28.2 versus the page shown in Listing 28.1, but it’s a sure bet that the changes could only improve the page’s visibility to search engines. As is often the case, the improvements made for the benefit of the search spiders probably made the page’s subject easier for humans to recognize and understand as well. This makes optimizing a page for search engines a win-win effort!

Listing 28.2 An Improvement on the Page in Listing 28.1


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>Fractal Central: A Guide to Fractals, Chaos,
             and Complexity</title>
    <meta name="description" content="A comprehensive guide
          to fractal geometry, chaos science and complexity
          theory." />
    <meta name="keywords"  content="fractal,fractals,chaos
          science,chaos theory,fractal geometry,complexity,
          complexity theory" />
  </head>

  <body>
    <div style="text-align:center">
      <img src="fractalaccent.gif" alt="Fractal Central: A
           Guide to Fractals, Chaos, and Complexity" />
    </div>
    <div style="width:133px; float:left; padding:6px;
                text-align:center; border-width:4px;
                border-style:ridge">
      Discover the latest fractal software, books and
      more at the <span style="font-weight:bold">Fractal
      Central</span> online store.<br />
      <a href="orderform.html"><img src="orderform.gif"
         alt="Order Form" style="border-style:none" /></a>
    </div>
    <div  style="float:left; padding:6px">
      <h2>A Comprehensive Guide to the<br />
      Art and Science of Chaos and Complexity</h2>
      <p>What's that? You say you're hearing about
      &quot;fractals&quot; and &quot;chaos&quot; all
      over the place, but still aren't too sure what
      they are? How about a quick summary of some key
      concepts:</p>
      <ol>
        <li>Even the simplest systems become deeply
        complex and richly beautiful when a process is
        &quot;iterated&quot; over and over, using the
        results of each step as the starting point of
        the next. This is how Nature creates a magnificently
        detailed 300-foot redwood tree from a seed
        the size of your fingernail.</li>
        <li>Most &quot;iterated systems&quot; are easily
        simulated on computers, but only a few are predictable
        and controllable. Why? Because a tiny influence, like
        a &quot;butterfly flapping its wings,&quot; can be
        strangely amplified to have major consequences such as
        completely changing tomorrow's weather in a distant
        part of the world.</li>
        <li>Fractals can be magnified forever without loss
        of detail, so mathematics that relies on straight
        lines is useless with them. However, they give us
        a new concept called &quot;fractal dimension&quot;
        which can measure the texture and complexity of
        anything from coastlines to storm clouds.</li>
        <li>While fractals win prizes at graphics shows,
        their chaotic patterns pop up in every branch of
        science. Physicists find beautiful artwork coming
        out of their plotters. &quot;Strange attractors&quot;
        with fractal turbulence appear in celestial mechanics.
        Biologists diagnose &quot;dynamical diseases&quot;
        when fractal rhythms fall out of sync. Even pure
        mathematicians go on tour with dazzling videos of their
        research.</li>
      </ol>
      <p>Think all these folks may be on to something?</p>
    </div>
    <div style="text-align:center">
      <a href="http://netletter.com/nonsense/">
          <img src="findout.gif" alt="Find Out More"
               style="border-style:none" /></a>
    </div>
  </body>
</html>


These changes will go a long way toward making the content of this site more likely to be appropriately indexed. Beyond getting your content indexed, remember that the quality of that content, as well as the number of other sites linking to yours, also matters.

Additional Tips for Search Engine Optimization

The most important tip I can give you regarding SEO is to not pay an SEO company to perform your SEO tasks if that company promises specific results for you. If a company promises that your site will be the number one result in a Google search, run for the hills and take your checkbook with you. No one can promise that, because the search algorithms have so many variables that the number one result might change several times over the course of a given week. That is not to say that all SEO companies are scam artists. Some legitimate site content and architecture consultants who perform SEO tasks get lumped in with the spammers who send unsolicited email, such as this:

"Dear google.com, I visited your website and noticed that you are not
listed in most of the major search engines and directories..."

This sample email is used as an example in Google’s own guidelines for webmasters, along with the note to “reserve the same skepticism for unsolicited email about search engines as you do for burn-fat-at-night diet pills or requests to help transfer funds from deposed dictators.” Yes, someone actually sent Google a spam email about how to increase their search ranking...in Google. For more good advice from Google, visit http://www.google.com/webmasters/.

Here are some additional actions you can take, for free, to optimize your content for search engines:

• Use accurate page titles. Your titles should be brief, but descriptive and unique. Do not try to stuff your titles with keywords.

• Create human-friendly URLs, such as those with words in them that users can easily remember. A URL such as http://www.mycompany.com/products/super_widget.html is much easier to remember, and easier for search engines to index in a relevant way, than something like http://www.mycompany.com?c=p&id=4&id=49f8sd7345fea.

• Create URLs that reflect your directory structure. This assumes you have a directory structure in the first place, which you should.

• When possible, use text—not graphical elements—for navigation.

• If you have content several levels deep, use a breadcrumb trail so that users can find their way back home. A breadcrumb trail also provides search engines with more words to index. For example, if you are looking at a recipe for biscuits in the Southern Cooking category of a food-related website, the breadcrumb trail for this particular page might look like this:

Home > Southern Cooking > Recipes > Biscuits

• Within the content of your page, use headings (<h1>, <h2>, <h3>) appropriately.

In addition to providing rich and useful content for your users, you should follow these tips to increase your site’s prominence in page rankings.
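To make a few of these tips concrete, here is a minimal sketch of the biscuit recipe page described above. It combines a brief, descriptive title, a breadcrumb trail built from plain text links, and headings used in order. The directory names in the links, the page title, and the body text are all assumptions made up for this example.

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <!-- Brief, descriptive, and unique to this page -->
    <title>Biscuits: Southern Cooking Recipes</title>
  </head>

  <body>
    <!-- Breadcrumb trail: text links that mirror a
         hypothetical directory structure -->
    <p>
      <a href="/">Home</a> &gt;
      <a href="/southern_cooking/">Southern Cooking</a> &gt;
      <a href="/southern_cooking/recipes/">Recipes</a> &gt;
      Biscuits
    </p>

    <!-- One main heading, then subheadings in order -->
    <h1>Buttermilk Biscuits</h1>

    <h2>Ingredients</h2>
    <p>Flour, butter, buttermilk, baking powder, and salt.</p>

    <h2>Directions</h2>
    <p>Mix the dry ingredients, cut in the butter, stir in
    the buttermilk, and bake until golden.</p>
  </body>
</html>
```

Because every link in the trail is ordinary text, both users and search spiders can read the words Southern Cooking and Recipes on the way to the page itself.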

Summary

This chapter covered some extremely important territory by exploring how to provide hints to search engines (such as Google, Bing, and Yahoo!) so that users can find your pages more easily. You also saw an example of the HTML behind a perfectly reasonable web page redone to make it more search engine friendly. Finally, you learned a few more tips to optimize the indexing of your site overall.

Table 28.1 lists the tags and attributes covered in this chapter.

Table 28.1 HTML Tags and Attributes Covered in Chapter 28

Q&A

Q. I have lots of pages in my site. Do I need to fill out a separate form for each page at each search site?

A. No. If you submit just your home page (which is presumably linked to all the other pages), the search spiders will crawl through all the links on the page (and all the links on the linked pages, and so on) until they have indexed all the pages on your site.

Q. I submitted a request to be listed with a search engine, but when I search for my page, my page never comes up—not even when I enter my company’s unique name. What can I do?

A. Most of the big search engines offer a form you can fill out to instantly check whether a specific address is included in their database. If you find that it isn’t included, you can submit another request form. Sometimes it takes days or even weeks for the spiders to get around to indexing your pages after you submit a request.

Q. When I put keywords in a <meta /> tag, do I need to include every possible variation of spelling and capitalization?

A. Don’t worry about capitalization; almost all searches are entered in all lowercase letters. Do include any obvious variations or common spelling errors as separate keywords. Although the <meta /> tag is simple in concept, there are more advanced strategies for using it than I’ve been able to cover in this chapter. Visit http://en.wikipedia.org/wiki/Meta_element for good information on the various attributes of this tag and how to use it.

Q. I’ve heard that I can use the <meta /> tag to make a page automatically reload itself every few seconds or minutes. Is this true?

A. Yes, but there’s no point in doing that unless you have some sort of program or script set up on your web server to provide new information on the page. And if that is the case, the chances are good that you can go about that refresh in a different way using AJAX (see Chapter 24, “AJAX: Remote Scripting,” for basic information on AJAX). For usability reasons, the use of <meta /> to refresh content is frowned upon by the W3C and users in general.
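For reference, a minimal sketch of such a refresh tag looks like the following. The http-equiv="refresh" attribute with a number of seconds in content is the standard form; the page title, the body text, and the 30-second interval are simply assumptions for this example.

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>Order Status</title>
    <!-- Ask the browser to reload this page every 30 seconds;
         the content value is a number of seconds -->
    <meta http-equiv="refresh" content="30" />
  </head>

  <body>
    <p>This page reloads itself to show the latest
    order status.</p>
  </body>
</html>
```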

Workshop

The workshop contains quiz questions and activities to help you solidify your understanding of the material covered. Try to answer all questions before looking at the “Answers” section that follows.

Quiz

1. If you publish a page about puppy adoption, how could you help make sure that the page can be found by users who enter puppy, dog, and/or adoption at all the major Internet search sites?

2. Suppose you decide to paste your keywords hundreds of times in your HTML code, using a white font on a white background, so that your readers cannot see them. How would search engine spiders deal with this?

3. Is it better to throw all your content in one directory or to organize it into several directories?

Answers

1. Make sure that puppy, dog, and adoption all occur frequently on your main page (as they probably already do) and title your page something along the lines of Puppy Dog Adoption. While you’re at it, put the following <meta /> tags in the <head> portion of the page:

<meta name="description"
      content="dog adoption information and services" />
<meta name="keywords" content="puppy,dog,adoption" />

Publish your page online, and then visit the site submittal page for each major search engine (listed earlier in the chapter) to fill out the site submission forms.

2. Search engine spiders would ignore the duplications and possibly blacklist you from their index and label you as a spammer.

3. Definitely organize your content into several directories. This will not only make your content easier to maintain but also give you the opportunity to create human-readable URLs with directory structures that make sense. It also creates a navigational breadcrumb trail.

Exercises

• You’ve reached the end of the book. If you have a site that is ready for the world to see, review the content and structure for the best possible optimizations, and then submit the address to all the major search engines.
