Chapter 23. SCHEDULING WEBBOTS AND SPIDERS

Up to this point, all of our webbots have run only when executed directly from a command line or through a browser. In real-world situations, however, you may want to schedule your webbots and spiders to run automatically. This chapter describes methods for scheduling webbots to run unattended in a Windows environment. Most readers should have access to the scheduling tool I'll be using here.

If you are using an operating system other than Windows, don't despair. Most operating systems support scheduling software of some type. In Unix, Linux, and Mac OS X environments, you can always use cron, a text-based scheduling tool that you configure with the crontab command. Regardless of the operating system you use, there should also be a graphical interface for a scheduling tool, similar to the one Windows uses.
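For readers on cron-based systems, the following is a minimal sketch of how a crontab line is laid out: five time fields (minute, hour, day of month, month, day of week) followed by the command to run. The helper name cron_entry and both paths in the example are hypothetical and serve only as illustration; substitute the locations of PHP and your webbot.

```python
def cron_entry(minute: int, hour: int, command: str, days_of_week: str = "*") -> str:
    """Build one crontab line: minute hour day-of-month month day-of-week command."""
    if not 0 <= minute <= 59:
        raise ValueError("minute must be between 0 and 59")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    # Day-of-month and month are left as "*" so the job runs in every month.
    return f"{minute} {hour} * * {days_of_week} {command}"


# Hypothetical paths, assumed for illustration only.
COMMAND = "/usr/bin/php /home/me/webbots/my_webbot.php"

if __name__ == "__main__":
    # A line that runs the webbot every day at 2:30 AM.
    print(cron_entry(30, 2, COMMAND))
```

The printed line can be pasted into the table that crontab -e opens. Passing days_of_week="1-5" restricts the job to Monday through Friday.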

The Windows Task Scheduler

The Windows Task Scheduler is an easy-to-use graphical user interface (GUI) designed for the somewhat complex duty of scheduling tasks. You can access the Task Scheduler through the Control Panel or in the Accessories directory, under System Tools.

To see the tasks currently scheduled on your computer, simply click Scheduled Tasks. In addition to showing the schedule and status of these tasks, this window is also the tool you'll use to create new scheduled tasks. It will look like the one in Figure 23-1.

Figure 23-1. The Windows Task Scheduler

Preparing Your Webbots to Run as Scheduled Tasks

Before you schedule your webbot to run automatically, you should create a batch file that executes the webbot. It is easier to schedule a batch file than to specify the PHP file directly, because the batch file adds flexibility in defining path names and allows multiple webbots, or events, to run from the same scheduled task. Listing 23-1 shows the format for executing a PHP webbot from a batch file.

drive:/php_path/php drive:/webbot_path/my_webbot.php

Listing 23-1: Executing a local webbot from a batch file

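In the batch file shown in Listing 23-1, the operating system executes the PHP interpreter, which subsequently executes my_webbot.php. If the webbot resides on a remote server, the batch file can instead use the command-line cURL client to request the webbot's URL, as shown in Listing 23-2.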

drive:/curl_path/curl http://www.somedomain.com/remote_webbot.php

Listing 23-2: Executing a remote webbot from a batch file
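Because a single batch file can launch several webbots, you may find it convenient to generate that file from a list of scripts. The following is a minimal sketch, assuming a hypothetical helper named build_batch_file and hypothetical Windows paths; it emits one line per webbot in the format of Listing 23-1 and quotes each path so that directory names containing spaces still work.

```python
from typing import Sequence


def build_batch_file(php_exe: str, webbot_paths: Sequence[str]) -> str:
    """Return batch-file text with one 'php webbot.php' line per webbot."""
    # Quote both paths so that spaces in directory names do not split them.
    lines = [f'"{php_exe}" "{path}"' for path in webbot_paths]
    # Use CRLF line endings, the convention for Windows batch files.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Hypothetical locations, assumed for illustration only.
    php = r"C:\php\php.exe"
    webbots = [r"C:\webbots\price_monitor.php", r"C:\webbots\link_checker.php"]
    print(build_batch_file(php, webbots), end="")
```

When saving the result to a .bat file from Python, opening the file with newline="" keeps the CRLF endings from being translated a second time.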

Scheduling a Webbot to Run Daily

To schedule a daily execution of your batch file, click Add Scheduled Task in the Task Scheduler window. This will initiate a wizard, which walks you through the process of creating a schedule of execution times for your application. The first step is to identify the application you want to schedule. To schedule your webbot, click the Browse button to locate the batch file that executes it, as shown in Figure 23-2.

Figure 23-2. Selecting an application to schedule

Once you select the batch file you want to schedule (in this example, test_webbot.bat), the wizard asks for the periodicity, or the frequency of execution. Windows allows you to schedule a task to run daily, weekly, monthly, just once, when the computer starts, or when you log on, as shown in Figure 23-3.

Figure 23-3. Configuring the periodicity of your webbot

After selecting a period, you will specify the time of day you want your webbot to execute. You can also specify whether the webbot will run every day or only on weekdays, as shown in Figure 23-4. You can even schedule a webbot to run every second or third day, skipping one or more days between runs.

Additionally, you can set the entire schedule to begin sometime in the future. For example, the configuration shown in Figure 23-4 will cause the webbot to run Monday through Friday at 6:20 PM, commencing on January 16, 2008.

Figure 23-4. Configuring the time and days your webbot will run
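To check that a weekday schedule behaves as you expect, it can help to list the first few run times by hand. The following is a small sketch of that arithmetic for the configuration described above (Monday through Friday at 6:20 PM, starting January 16, 2008). The function name weekday_runs is hypothetical, and the code only illustrates the calendar logic; it does not read from or describe the internals of the Task Scheduler.

```python
from datetime import date, datetime, time, timedelta
from typing import List


def weekday_runs(start: date, run_at: time, count: int) -> List[datetime]:
    """Return the first `count` Monday-to-Friday run times on or after `start`."""
    runs: List[datetime] = []
    day = start
    while len(runs) < count:
        # date.weekday() returns 0 for Monday, so 5 and 6 are the weekend.
        if day.weekday() < 5:
            runs.append(datetime.combine(day, run_at))
        day += timedelta(days=1)
    return runs


if __name__ == "__main__":
    # The schedule from the text: weekdays at 6:20 PM from January 16, 2008.
    for run in weekday_runs(date(2008, 1, 16), time(18, 20), 5):
        print(run.isoformat(sep=" "))
```

January 16, 2008 fell on a Wednesday, so the first five runs land on January 16, 17, 18, 21, and 22, with the weekend of January 19 and 20 skipped.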

The final step of the scheduling wizard is to enter your Windows username and password, as shown in Figure 23-5. This will allow your webbot to run without Windows prompting you for authentication.

Figure 23-5. Entering a username and password to authenticate your webbot

On completing the wizard, the scheduler displays your new scheduled task, as shown in Figure 23-6.

Figure 23-6. The Task Scheduler showing the status of test_webbot's schedule
