3. Page Monitors

In the last chapter, we looked at RSS feeds, which are easy-to-read and easy-to-follow sources of site changes and content updates. I wish I could tell you that RSS feeds are the only source you’ll ever need to use when setting up information traps, but that wouldn’t be true.

Why not? Because many sites still don’t provide RSS feeds. Although RSS has been around for quite some time, it hasn’t gained the prominence you might think it would. Some site operators don’t have the time or interest to integrate another technology into their site. Others provide alternative ways for site visitors to keep up with their site’s content (such as e-mail alerts) and don’t feel the need to offer RSS feeds. Still others want you to actually visit their site and see their ads, rather than receive content via RSS feeds. The list goes on. So in your search for information, you more than likely will come across pages you want to continually monitor that don’t have RSS feeds. What’s an information trapper to do?

Use a page monitor! This chapter discusses the various kinds of page monitors that are available, walks you through how to set one up, and shows you how to limit the number of insignificant page updates you receive.

Nuts and Bolts

A page monitor simply watches HTML pages for changes and then reports them to you. Generally, the monitoring program grabs the page, then returns to the same page later, grabs a new copy of the page, and then compares the two. Any new information that’s added to the page is reported to you.
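To make the grab-and-compare cycle concrete, here is a minimal Python sketch of the idea. It is not how any particular product works. The function names (`added_lines`, `check_page`, `fetch_with_urllib`) and the in-memory `snapshots` dictionary are assumptions made for illustration. The sketch keeps the last copy of each page and reports the lines that appear in the new copy but not in the old one.

```python
import difflib
from urllib.request import urlopen


def fetch_with_urllib(url):
    """Download a page and return its contents as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def added_lines(old_text, new_text):
    """Return the lines that are in the new copy but not in the old one."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line[2:] for line in diff if line.startswith("+ ")]


def check_page(url, snapshots, fetch=fetch_with_urllib):
    """Grab the page, compare it with the stored copy, and store the new copy.

    `snapshots` is a plain dict mapping URL -> last page text (hypothetical
    storage; a real monitor would keep its copies on disk).
    """
    new_text = fetch(url)
    old_text = snapshots.get(url)
    snapshots[url] = new_text
    if old_text is None:
        return []  # first visit: nothing to compare against yet
    return added_lines(old_text, new_text)
```

Calling `check_page` twice for the same URL, some time apart, returns an empty list the first time and the newly added lines the second time. Passing your own `fetch` function makes it easy to try the comparison without touching the network.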

At first, this sounds great. But unfortunately there’s a downside: too many “false positives.” For example, if a page displays the current date, the daily date change looks like new content to the page monitor, which then alerts you even though nothing of substance has changed. If a page has a visitor counter, every update to the counter could also trigger an alert. Even tiny edits, such as a corrected spelling error, can trip the page monitor. Not good!

However, if you’re careful about the pages you pick and use the page monitors to best advantage, you can minimize the number of times information traps trigger without providing any useful information.
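One common way to cut down on false positives is to blank out the parts of a page that change on their own before comparing copies. The sketch below assumes two hypothetical patterns, one for dates written like "March 3, 2006" and one for a "Visitors: 10,412" style counter. Real pages vary widely, so treat the patterns as placeholders to adapt rather than a complete filter.

```python
import re

# Hypothetical patterns; adjust them to the pages you actually monitor.
DATE_PATTERN = re.compile(
    r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.? "
    r"\d{1,2}, \d{4}\b"
)
COUNTER_PATTERN = re.compile(r"\bvisitors?:? *[\d,]+", re.IGNORECASE)


def normalize(text):
    """Replace dates and visitor counters with fixed placeholders."""
    text = DATE_PATTERN.sub("<date>", text)
    text = COUNTER_PATTERN.sub("<counter>", text)
    return text


def meaningfully_changed(old_text, new_text):
    """Report a change only if the pages differ after normalization."""
    return normalize(old_text) != normalize(new_text)
```

This kind of filter handles predictable noise such as dates and counters. It cannot tell a corrected spelling error from real news, which is where keyword filters (covered later in this chapter) help.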

Why use a page monitor?

There are two instances in which you’d want to use a page monitor:

You need the content, but you can’t get it via RSS. I already touched on this in the introduction to this chapter: you may want to keep up with a news source or with a page of information, such as press releases, that isn’t offered via RSS. In this case, you can use a page monitor on the appropriate pages.

Say you’re monitoring the Events page of your favorite band. You want to be updated when and where they’re touring so that you’re sure not to miss them when they’re in town. Or maybe you want to monitor a page that contains information on a company’s board of directors. You want to see when information about that group changes, something that might be more difficult to do with an RSS feed.

When you need just a tiny bit of information. There may be some data point that you’re interested in that is too small to be the subject of an RSS feed. You may want to know when a number changes, for example, or when a date changes. In that case, monitoring the page for changes makes more sense than trying to set up an RSS feed.

Just make sure you’re not reinventing the wheel. If you’re looking for a common small bit of data, like a stock quote or a temperature, there are services that can provide you with that information via e-mail alerts. (I cover e-mail alerts in the next chapter.)

Types of page monitors

There are two kinds of page monitors:

Web-based. Some page monitors are Web services. You go to the site, enter the pages you want to monitor, and receive page updates via e-mail (Figure 3.1).

Figure 3.1. Trackle, a Web-based monitoring service, sends out plain-text update notices of the pages it monitors.

Image

These services can be either free or fee-based. The advantage of Web-based page monitors is that you can access them from anywhere and receive the alerts on any device that can receive e-mail (including cell phones, PDAs, and so forth). The downside is that they’re often limited in their configurability and sometimes generate far more false-positive alerts than you would like. They’ll also start costing you some serious money if you want to monitor more than a few dozen pages at a time.

Client-side. Client-side page monitoring software is installed on your computer and keeps copies of the pages it grabs on your computer for comparison purposes. The downside with this type of page monitor is that if you’re away from your computer and have no other way to access it, you won’t be able to see your page changes. Another downside is that unless you’re connected to the Internet 24/7 with broadband (and you really need a broadband connection if you’re going to monitor any number of pages), you might not get around to doing regular scanning. On the upside, client-side software lets you easily manage large numbers of monitored pages (several thousand in my case), whereas it can be cost-prohibitive to do that with an online service. Finally, client-side page monitors are usually much more configurable than Web-based ones, and can often avoid false-positive page change alerts.

Within these two categories, there are dozens of page monitors available, with varying functionality. The rest of this chapter provides an overview of some monitors I like and recommend.

Web-Based Page Monitors

If you’re monitoring a small number of pages (fewer than two dozen) and you want to watch those pages from several different computers, the flexibility of a Web-based page monitor will be ideal for you.

WatchThatPage

WatchThatPage (watchthatpage.com) is a free service that lets you specify a list of pages and then monitors them for changes.

To use WatchThatPage, you must register with your name, e-mail address, and a password. When you register, you have the option of specifying your time zone, how often you want to receive updates, and so on. WatchThatPage won’t provide updates more than once a day.

Once you’ve registered, you can add pages and even folders (Figure 3.2). If you’re planning to cover several different topics, it makes sense to set up folders for each of your interests—and setting them up just after you register makes it that much easier to administer the sites later. If you have some pages you want to monitor for changes every day, and others you want to monitor every week, you can set up different channels.

Figure 3.2. With WatchThatPage, you can group your pages by folder and select pages as you’re browsing using a special bookmark called a bookmarklet.

Image

WatchThatPage provides page alerts via e-mail. However, you can also view recent page changes from the site, which is handy when you accidentally delete some updates but still want to see them.

TrackEngine

Unlike WatchThatPage, TrackEngine (trackengine.com) is a paid service, with some free functionality available. Registration with TrackEngine is free, and requires a name, company name, e-mail address, user name, password, country/time zone, and a terms-of-service agreement. You can monitor up to five sources before you need to pay a fee.

TrackEngine offers a nifty bookmarklet button you can add to your browser toolbar. When you’re surfing the Internet and you find a page you want to monitor, you simply click the bookmarklet button on your toolbar, and that page is automatically added to TrackEngine.

When you add a page, you are asked to provide the following: the page’s URL, a title for the page (something that’s easy for you to remember), and how often you want to track this page, whether it’s daily, every two days, every three days, or weekly (Figure 3.4). You also have the option of being notified of updates containing only the keywords you specify.

Figure 3.4. TrackEngine’s bookmarklet pops open a window allowing you to add a page you’re viewing to your list of monitored URLs.

Image


Tip

The notification option can save you buckets of time! If there’s an easy keyword that encompasses the kind of information you’re looking for, such as a company name, person’s name, or technology name, by all means use it! It helps eliminate a lot of the false positives you can get when trivial information on the page changes.
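The keyword idea is simple enough to sketch in a few lines of Python. This is only an illustration of the principle, not TrackEngine’s actual logic. The function name `lines_matching_keywords` and the sample data are assumptions.

```python
def lines_matching_keywords(added, keywords):
    """Keep only the changed lines that mention at least one keyword.

    Matching is case-insensitive, so "google" also matches "Google".
    """
    lowered = [keyword.lower() for keyword in keywords]
    return [
        line for line in added
        if any(keyword in line.lower() for keyword in lowered)
    ]
```

Given the changed lines "Copyright 2006", "Google launches new search tool", and "Site map updated", filtering on the keyword "Google" leaves only the second line, so the two trivial changes never reach your inbox.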


You can get reports about page updates from the front page once you’re logged in. Unfortunately, a free TrackEngine account doesn’t allow you to monitor more than five pages at a time, which is not much unless you have a very small monitoring job. If you want to monitor more, you have to upgrade. Monitoring 10 bookmarks costs $19.95 a year, or you can pay $4.95 a month to monitor up to 50 bookmarks.

InfoMinder

InfoMinder (infominder.com/webminder) combines two types of tools—it tracks both pages and RSS feeds. It’s a fee-based service, but a 30-day free trial is available. (The trial version is limited to 10 pages/feeds.)

Once you’ve registered for the trial (and confirmed your registration) you get—well, nothing but a blank page! But that’s okay. Look for the Add Page option, and then find and use the Advanced Form, because for an information trapper, the basic form doesn’t amount to much (Figure 3.5).

Figure 3.5. The advanced form for adding a URL to track gives you lots of options.

Image

The Advanced Form lets you first specify the URL of the page you want to track, of course. But it also lets you specify whether you want to track insignificant changes (like dates, number changes, etc.), and the threshold for sending notices about changes (either after a certain number of changes or when certain keywords are detected, if what you’re tracking makes the keyword option useful). You can also specify how often notices about changes are sent to you (every x number of days). There are also some other advanced options at the bottom involving cookies and form posting and content posting, but I wouldn’t change those; it’s too easy to mess up your notifications if you get a setting wrong.

At the top of the Advanced Form, you’re asked to provide the URL of the page you wish to track, a description, and the categories into which the page fits. Once you’ve provided those categories, you get a pop-up window displaying the available categories (which are very broad) and your categories (Figure 3.6). You can place your categories within the existing ones and use them later.

Figure 3.6. InfoMinder’s categories are very broad.

Image

What are good category considerations? If you’re monitoring search engine news, you might want to enter “Search Engines”, or if you only want information on one search engine, you might want to enter Google, Teoma, Yahoo, or whatever. Make sure what you enter is easily recognized, and specific enough that you can effectively narrow in on a particular topic.

Once you’ve set up some pages to monitor, your login page displays a list of pages that you’re monitoring (Figure 3.7). Click on a page URL and a framed version of the page displays with changes highlighted. The date that the changes were detected is also displayed. Having a copy of the page with the changes highlighted makes it easier to see the changes than when they’re mailed to you, but the mailed changes, encapsulated in a single text e-mail, are also very useful. I recommend using both modification alert types.

Figure 3.7. New icons make it easy to see which pages have changed on your InfoMinder control panel.

Image

There’s one more thing you have to do to make sure you’re ready to use InfoMinder as your page monitor: Use Preferences to choose how you want the changed information sent to you (via text or e-mail) and how you want the changed information on the pages highlighted. Once you do that, you’re set.

The 30-day free trial lets you track up to 10 URLs. The paid service varies from $9 a year, to track up to 20 URLs, to $179 a year, to track up to 1,000 URLs. There’s also a Premium edition that monitors your pages for updates multiple times a day; it costs anywhere from $299 to $499 a year, depending on how many pages you track. Now do you see how after a certain number of URLs, using Web-based change detection services gets expensive?

For large companies or enterprises, there’s also a server edition available; you install the software on your own server and get more control over the results and the ability to track huge numbers of pages. Contact InfoMinder for pricing information.

ChangeDetect

As of this writing, ChangeDetect (changedetect.com) offers free registration, but it requires you to provide an address and phone number. If that’s not a problem for you, then you’ll probably discover that ChangeDetect offers some very precise features in a presentation that isn’t all that different from InfoMinder’s.

The free trial allows you to monitor only five pages. However, ChangeDetect includes a nifty option called Bulk Monitors that lets you enter several pages to monitor at a time.

Once you’ve set up pages to be monitored, you’ll see a page that resembles InfoMinder’s interface, except that it provides more controls (Figure 3.8). In fact, it looks a little cryptic until you get into it further. The options on the front page let you go straight to the page being monitored, view the monitor, modify the monitor, delete the monitor, test the monitor, and see when the pages were last changed.

Figure 3.8. ChangeDetect’s control panel looks a little cryptic until you get used to it.

Image

The options for modification are extensive and worth a look. You can provide thorough descriptions of pages you’re monitoring (useful if you’re sharing the monitoring chores), specify how you want change information delivered to you (you can get an e-mail alert about the page changing or you can get an e-mail with the changes in it), and choose how often you want the page to be checked (every 12 hours, every day, every week, or every month).

If you’re using the fee-based version of the service, you also get the option to use a user name and password (for monitored pages that require validation). And don’t forget to check out the advanced options at the bottom of the page. You can set the monitor to trigger when the page changes by a certain number of bytes, when a word or phrase does (or doesn’t!) appear, or when one of a series of keywords appears. (There’s a regular expression content filter, but it’s available only for advanced users.)
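The three kinds of triggers described above (a size change measured in bytes, a phrase that does or doesn’t appear, and a regular expression) can each be sketched as a small Python function. These are illustrations of the concepts only. The names and parameters are assumptions and say nothing about how ChangeDetect implements its options.

```python
import re


def size_trigger(old_text, new_text, min_bytes):
    """Fire when the page size changes by at least `min_bytes` bytes."""
    old_size = len(old_text.encode("utf-8"))
    new_size = len(new_text.encode("utf-8"))
    return abs(new_size - old_size) >= min_bytes


def phrase_trigger(new_text, phrase, should_appear=True):
    """Fire when a phrase appears, or (should_appear=False) when it is absent."""
    found = phrase.lower() in new_text.lower()
    return found == should_appear


def regex_trigger(new_text, pattern):
    """Fire when the regular expression matches anywhere in the page."""
    return re.search(pattern, new_text) is not None
```

The "doesn’t appear" case is handy for things like waiting for a "sold out" notice to disappear from a ticket page.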

ChangeDetect is available in three flavors: Personal (up to 10 pages monitored, $1.95 a month), Plus (up to 100 pages monitored, $14.95 a month), or Professional (up to 500 pages monitored, $39.95 a month). An enterprise version is also available.

Trackle

I use Trackle (trackle.com) and like it a lot for pages that I know will change on a regular basis, and for pages for which I don’t need anything but plain text updates.

Trackle offers a free 14-day trial. Once you sign up (this requires only a user name, password, and e-mail address), a one-page form displays that allows you to enter a list of URLs (up to 25) and which hours of the day you want to monitor them (up to 24, though once or twice is usually enough). Then just click the Update/Activate button and you’re set (Figure 3.9).

Figure 3.9. Trackle’s simple interface lets you specify page URLs to monitor and when you want to monitor them.

Image

While Trackle has a handy place on its Web site that shows you the results of your last set of updates, it’s mostly a mail delivery service. It delivers the updates of specified URLs as plain text, but keeps the URLs of an update clickable. This works well if you’re trying to monitor something that will provide coherent updates—say, a blog that doesn’t have an RSS feed (there are still plenty of these around!) or a page of press release notes. On the other hand, if you’re trying to monitor something that doesn’t have coherent updates, such as a list of numbers or book names or something that won’t make much sense without the context of the rest of the page around it, Trackle isn’t a good solution.

Trackle costs $1.95 a month or $19.95 a year to monitor up to 25 pages. It’s a lot cheaper than the other services I’ve mentioned so far in this chapter, but on the other hand, it doesn’t offer some of the advanced features of the other services, including the ability to look for specific keywords, ignore changes below a certain page size, and so on.

Even though Trackle is the cheapest of the services I’ve covered here, for the number of pages it covers, it’s still expensive. If you get to the point that you want to cover more than 100 pages or so, you’re into “$75 or more” territory. At that point, it’s best to use a client-side monitor to keep track of page changes.
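Here is the arithmetic behind that figure as a short Python sketch. It assumes, and this is only an assumption, that covering more pages means buying additional 25-page subscriptions at the $19.95 yearly rate quoted above. Prices are kept in cents to avoid rounding surprises.

```python
import math

PAGES_PER_SUBSCRIPTION = 25   # Trackle's limit, as quoted above
YEARLY_PRICE_CENTS = 1995     # $19.95 a year, as quoted above


def yearly_cost_cents(pages):
    """Yearly cost, assuming one extra subscription per 25 pages."""
    subscriptions = math.ceil(pages / PAGES_PER_SUBSCRIPTION)
    return subscriptions * YEARLY_PRICE_CENTS
```

Under that assumption, 100 pages takes four subscriptions, or $79.80 a year, and the 101st page pushes the total to $99.75.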

Client-Side Page Monitors

There are a few major disadvantages to using client-side monitors for the Web pages you want to track. The first is that a client-side monitor is only available from your computer, not from all over the Internet. The second disadvantage is that unless you leave your computer on all the time, you won’t be able to automate how often your bookmarks are checked. And if your computer crashes or has a problem, you’ll need to make sure you have backups.

But there are advantages too! Because you’re using software on your computer, you can have more detailed control over the kinds of monitoring you do. You can monitor pages as often as you like, even every 5 minutes if you want to. And you can monitor huge numbers of pages—200, 500, even over 1,000. You can do that with online services, of course, but at a terrific expense.

Can’t possibly imagine wanting to monitor that many pages? Believe me, once you’ve been doing information trapping for a while, you’ll see how much time it can save you, compared to hunting down information manually all the time. And then you’ll begin to discover how many of your everyday searching chores can be turned into traps, and gradually you’ll find yourself wanting to monitor more and more pages!
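Checking different pages at different intervals comes down to one question per page: has enough time passed since the last check? A minimal Python sketch of that decision follows. The bookmark layout (a dict with `url`, `interval`, and `last_checked` keys) is a hypothetical structure chosen for the example, not the format of any real program.

```python
from datetime import datetime, timedelta


def due_bookmarks(bookmarks, now):
    """Return the URLs whose check interval has elapsed.

    Each bookmark is a dict with a "url", an "interval" (timedelta), and a
    "last_checked" datetime (None if the page was never checked).
    """
    return [
        bookmark["url"]
        for bookmark in bookmarks
        if bookmark["last_checked"] is None
        or now - bookmark["last_checked"] >= bookmark["interval"]
    ]
```

A client-side monitor can run a check like this every minute or so while your computer is on. When the computer is off nothing runs, which is exactly the limitation described above.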

Page monitors for Windows

For Windows page monitoring, there’s one program I adore and can’t recommend enough. I’ve been using it for years and it’s amazing: WebSite-Watcher.

By all rights, WebSite-Watcher (aignes.com) should have its own book. It’s powerful, inexpensive, amazingly feature-rich, and an indispensable tool for Windows-based information trappers. There’s no way I can do justice to all its features in part of a chapter. So I’m going to hit the highlights and encourage you to download it (a free trial is available), play with it, and if you like it, add it to your toolbox!

When you first launch WebSite-Watcher, a screen displays that looks like Figure 3.10.

Figure 3.10. WebSite-Watcher’s basic page.

Image

There are two ways you can organize your pages to be monitored. You can monitor everything in one big file, or you can set up different files to monitor different topics in which you’re interested. For example, I have a file for monitoring pages relevant to ResearchBuzz, and I have another file set up to monitor pages relevant to some topics in another job I have. Since I want to review these two types of information with different frequency, I keep them in different files. That way I can open one of them, monitor pages, and save the other pages to monitor for later.

Whether you use one file or multiple files, you’ll need to add pages to be monitored. You can do that by clicking Bookmarks/New Bookmarks. A multi-tabbed input box with options displays, as shown in Figure 3.11.

Figure 3.11. Using WebSite-Watcher’s many options will help eliminate false-positive page changes.

Image

Don’t worry if all of these options seem overwhelming. The important parts are the title and the URL, as well as the content filtering checkboxes at the bottom of the screen (leave them on default). If you want to be able to filter by the appearance of certain keywords, click the Keywords tab. If you frequently want to check a particular URL, click the AutoWatch tab. Remember that WebSite-Watcher can only check a URL when your computer is on and connected to the Internet.

These are the basic options. Once you’ve built a set of bookmarks, you check them by choosing Bookmarks/Check all Bookmarks (or pressing the F9 key). WebSite-Watcher works through the bookmarks and checks them all. How long this takes depends on the speed of your computer and the speed of the Web sites you’re checking. When the pages are all checked, you’ll see a framed page that shows the list of URLs you’re monitoring alongside the monitored pages, with the changes highlighted as shown in Figure 3.12.

Figure 3.12. WebSite-Watcher highlights page changes so they pop right out at you.

Image

As you can see, it’s really simple to view the pages that have changed using WebSite-Watcher. You can also go to the current version of the page, compare the current version with previously changed pages, and use other options to make sure you’re getting every nuance of changed information on a page.

I’ve just skimmed the surface of WebSite-Watcher, but the important things you need to know are adding, checking, and viewing URLs. If you want to monitor more than 20 or so pages and you use Windows, I strongly recommend WebSite-Watcher.

Page monitors for Mac

There aren’t as many page monitors available for the Mac as there are for the PC, but you still have at least one good option in Web Watcher. And if you want to monitor only a small number of pages, you can always use a Web-based monitoring service. Then it won’t matter what kind of computer you’re using!

Web Watcher (chaoticsoftware.com/ProductPages/WebWatcher.html) is a shareware program for Mac that costs $20, with a $200 site license and $500 multi-site license available. You may try the software for free for 15 days, however.

Web Watcher is very easy to use. When you first start the program, a list of Watchpoints displays, which will be an empty list until you populate it. To add a Watchpoint, click the Add button at the bottom of the window. A screen displays that looks like Figure 3.13.

Figure 3.13. Web Watcher’s options are not as extensive as WebSite-Watcher’s, but far easier to understand.

Image

Fill in the name of the page you want to monitor, as well as the URL, user name, and password (if required), and specify how often you want to check the page (anywhere from every x seconds, which I don’t recommend, to every x days). You can check for changes to a page’s size, date, and whether or not its URL is accessible.
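Checks on size, date, and accessibility don’t require downloading the whole page; a Web server usually reports them in its response headers. The Python sketch below illustrates the idea using the standard `Content-Length` and `Last-Modified` headers. It is an assumption-laden illustration, not a description of Web Watcher’s internals, and the `Snapshot` class and function names are invented for the example. Many servers omit one or both headers, in which case the corresponding field is simply `None`.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.request import Request, urlopen


@dataclass
class Snapshot:
    accessible: bool
    size: Optional[int]           # from Content-Length, if the server sends it
    last_modified: Optional[str]  # from Last-Modified, if the server sends it


def take_snapshot(url):
    """Ask the server for headers only (a HEAD request) and record them."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=30) as response:
            length = response.headers.get("Content-Length")
            size = int(length) if length and length.isdigit() else None
            return Snapshot(True, size, response.headers.get("Last-Modified"))
    except OSError:  # covers URLError, HTTPError, and timeouts
        return Snapshot(False, None, None)


def detect_changes(old, new):
    """List which of the three properties differ between two snapshots."""
    changes = []
    if old.accessible != new.accessible:
        changes.append("accessibility")
    if old.accessible and new.accessible:
        if old.size != new.size:
            changes.append("size")
        if old.last_modified != new.last_modified:
            changes.append("date")
    return changes
```

Comparing two snapshots taken a day apart returns a list such as `["size", "date"]`, or an empty list when nothing has changed.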

You also have several different notification options. You can request that Web Watcher play a beep, display an alert, show the URL in your Web browser, or send a notification by e-mail. You can e-mail up to three addresses and have custom text for each alert. You may also specify if you want to copy the watched URL or its contents—or both—into the notification e-mail. Be sure to add your mail server and a reply-to address using Web Watcher’s Preferences window if you want to send e-mail alerts!

If you want to monitor for page changes often (several times a day or even several times an hour), I recommend setting them up to display in the browser. However, if you want to check pages less frequently—say, once a week—I recommend having the page changes sent in an e-mail, from which you can visit the monitored page itself and see if you find anything intriguing.
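For readers curious what an e-mail alert with custom text, the watched URL, and the changed content might look like under the hood, here is a minimal Python sketch that composes such a message with the standard library. It only builds the message and does not send it. The function name, parameters, and subject line are assumptions for illustration, not Web Watcher’s format.

```python
from email.message import EmailMessage


def build_alert(sender, recipients, url, custom_text,
                changed_lines=None, include_url=True):
    """Compose a plain-text alert; the URL and page changes are optional."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    message["Subject"] = f"Page changed: {url}"

    parts = [custom_text]
    if include_url:
        parts.append(url)
    if changed_lines:
        parts.append("\n".join(changed_lines))
    message.set_content("\n\n".join(parts))
    return message
```

To actually deliver the message you would hand it to your mail server, for example with `smtplib.SMTP(...).send_message(message)`, using the same server details you would enter in a monitor’s preferences.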
