CHAPTER 12
cron

The system scheduler on UNIX and Linux systems is called cron. Its purpose is to run commands, series of commands, or scripts on a predetermined schedule. Normally these tasks are performed on systems that run 24 hours a day, 7 days a week. Writing cron scripts to perform system maintenance, backups, monitoring, or any other job that you would want to run on a schedule is a very common task. There are a few subtleties with cron, however, that many users and administrators may be unaware of.

crontab Entries

On a UNIX or Linux system, the cron daemon process runs all the time. A daemon is generally a program that runs as a background task and provides some type of service, in this case a scheduler. A run control (rc) script starts the daemon when the system boots. The cron daemon searches for entries in the systemwide or individual user's crontab (short for "cron table") files and loads them into memory. Once each minute, the daemon checks each entry's schedule to determine whether that entry should be run. A scheduled job can run as often as every minute or as infrequently as once a year.

A crontab entry is a specially formatted line in a crontab file that specifies on which minute, hour, day of the month, month, and day of the week a particular task should run. To add a task to the cron table, you run the crontab -e command, which allows an individual user to maintain the entries in his personal crontab file; this launches a session with the editor that is defined by the EDITOR shell environment variable. Each user on a machine may have a crontab file for his own purposes. However, a system administrator can curtail individual users' capabilities according to security policy. The following line is a simple cron entry scheduled to run at 4:15 pm on Tuesdays:

15 16 * * 2 /some/path/myscript.sh

There are six fields in each entry. The first five fields define the schedule, and the remainder of the line is the job that you want to run on that schedule. The last field is what I want to focus on here. Many users are aware that they can run a job within cron, although they may not be aware that the task can be quite complex and may therefore contain multiple elements. You can source environment files, set variables, debug the code, put logic into your entry, and call scripts and other commands from the crontab file.
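The field layout can be made concrete with a small sketch. The describe_entry function below is a hypothetical helper, not part of cron; it only splits an entry on whitespace and labels the pieces, and it assumes a plain user crontab line without the extra user field that some systemwide crontab files carry.

```shell
#!/bin/sh
# Hypothetical helper, not part of cron: split one crontab line into
# its five schedule fields and the command that follows them.
describe_entry() {
    set -f            # disable globbing so a literal * stays a *
    set -- $1         # word-split the entry into positional parameters
    set +f
    minute=$1 hour=$2 dom=$3 month=$4 dow=$5
    shift 5           # whatever remains is the command to run
    echo "minute=$minute hour=$hour day-of-month=$dom month=$month day-of-week=$dow command=$*"
}

describe_entry '15 16 * * 2 /some/path/myscript.sh'
```

For the simple entry shown earlier, this prints minute=15 hour=16 day-of-month=* month=* day-of-week=2 command=/some/path/myscript.sh, which matches the 4:15 pm Tuesday schedule.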

This cron entry is a more complex example:

* * * * * set -x ; cron_count=`ps -ef | grep [c]ron | wc -l` ; [ $cron_count -ne 5 ] && echo "Cron Count $cron_count" | mail -s "Cron Count $cron_count" rbpeters

The entry first sets the set -x expansion flag, assigns a variable, performs a test, and then, based on the results, sends an e-mail message. crontab entries can be powerful tools in your scripting arsenal.

Environment Problems

I've worked with many users who have had problems with a script they have coded and debugged for a significant amount of time, only to arrive at the conclusion that "It works from the command line, but not from cron?!" I have run into this problem from time to time myself, but knowing the issues tends to help you find the solution much more quickly.

A script running from cron is not run in the same shell environment as a command typed at the prompt. When you log in to a machine and arrive at the shell prompt, many variables have been set to support your interactive shell session. cron is not run from an interactive session, however. A cron job gets only a very rudimentary environment, containing just a few of the variables that are set in an interactive shell session. Most problems with users' cron scripts stem from the assumption that the code runs in an environment with the characteristics of an interactive session, rather than the cron environment.

Here's an example to illustrate the difference. This is one of my user environments, which is displayed when I use the env command:

SSH_CLIENT=172.16.5.199 3433 22
USER=rbpeters
MAIL=/var/mail/rbpeters
HOME=/home/rbpeters
SSH_TTY=/dev/ttyp1
PAGER=more
ENV=/home/rbpeters/.shrc
LOGNAME=rbpeters
BLOCKSIZE=K
TERM=xterm
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/usr/X11R6/bin:/home/rbpeters/bin
SHELL=/bin/sh
SSH_CONNECTION=172.16.5.199 3433 172.16.5.2 22
FTP_PASSIVE_MODE=YES
EDITOR=vi

I then set up the following cron job to run temporarily for illustrative purposes:

* * * * * env > /usr/home/rbpeters/env.out

After the job ran and created the env.out file, I found the following lines in it:

USER=rbpeters
HOME=/home/rbpeters
LOGNAME=rbpeters
PATH=/usr/bin:/bin
SHELL=/bin/sh

Notice that there is a fairly significant difference between the two environments. For instance, the PATH variable in the cron job environment doesn't have nearly as many directories to search, which can easily break a script because of the assumption that the paths available in the interactive shell environment are available for the cron job. The system cron daemon automatically sets the environment variables that make up the minimal environment. It sets SHELL to /bin/sh, and PATH to /usr/bin:/bin. The USER, LOGNAME, and HOME variables are set based on your entry in the passwd file. That's all you get in the default cron environment.
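One way to reproduce this minimal environment from the command line, without waiting for cron, is to start a shell with env -i. This is only a sketch: run_like_cron is a hypothetical helper name, the variable values mirror the output shown above, and a real cron daemon may set slightly different values on your system.

```shell
#!/bin/sh
# run_like_cron: hypothetical helper that runs a command string with only
# the handful of variables shown in the cron environment above.
run_like_cron() {
    env -i \
        HOME="${HOME:-/}" \
        LOGNAME="${LOGNAME:-${USER:-unknown}}" \
        USER="${USER:-${LOGNAME:-unknown}}" \
        PATH=/usr/bin:/bin \
        SHELL=/bin/sh \
        /bin/sh -c "$1"
}

# Show the PATH a job would see under these assumptions.
run_like_cron 'echo "PATH=$PATH"'
```

Running a script through a wrapper like this, for example run_like_cron '/some/path/myscript.sh', tends to expose missing PATH entries and unset variables before the job is ever scheduled.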

In the following slightly modified version of the example cron job, note the addition of the command to source the .profile file, which sets up some environment parameters prior to running the command. The additional command adds a more useful environment to the cron job:

* * * * * . /home/rbpeters/.profile >/dev/null ; env > /usr/home/rbpeters/env.out

Here is the new output in the env.out file:

USER=rbpeters
HOME=/home/rbpeters
PAGER=more
ENV=/home/rbpeters/.shrc
LOGNAME=rbpeters
BLOCKSIZE=K
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/usr/X11R6/bin:/home/rbpeters/bin
SHELL=/bin/sh
EDITOR=vi

Now you can see several additional items, as well as a more complete PATH variable. Another way of getting similar results is to fully qualify all paths, commands, and other files mentioned in the script called by the cron job. Supplying the full path to any element that may otherwise rely on the variables set in the interactive shell environment is a good idea in general when writing scripts, since it keeps you aware of the external files you are depending on.
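A related approach is to have the script establish its own search path and check its dependencies up front. The following is a minimal sketch; the directory list and the require_cmd helper are assumptions for illustration, not taken from any particular system.

```shell
#!/bin/sh
# Set the search path explicitly instead of inheriting it from cron.
# The directory list is an assumption; adjust it for the local system.
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin
export PATH

# require_cmd: hypothetical helper that reports a missing command on
# stderr, so that cron mails the owner instead of the job failing silently.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "required command not found: $1" >&2
        return 1
    fi
}

# Stop early if a command the script depends on cannot be found.
require_cmd ls || exit 1
```

With the path fixed at the top of the script, the job behaves the same whether it is started from an interactive shell or from cron.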

Finally, you can replace the env command with the set command and see similar results to those in the preceding example. The output is much more extensive, but the same principle applies.

Output Redirection

When a scheduled cron task is run, it may or may not create output. Since there is no interactive session attached to the task, the output, if any, is sent to the owner of the crontab file in an e-mail message whose subject is the cron entry and whose body is the output of the job. This is the e-mail message I received from the example of a complex cron entry shown earlier:

++ ps -ef
++ grep '[c]ron'
++ wc -l
+ cron_count=4
+ '[' 4 -ne 5 ']'
+ echo 'Cron Count 4'
+ mail -s 'Cron Count 4' rbpeters

Since the cron entry started with set -x, the subsequent commands were expanded and printed as the job executed them. This is a valuable feature when you have to debug a job. Any output from the job will be mailed in the same way. Even though this is useful output, many cron users redirect all output to /dev/null because once the job is in place they don't want to become desensitized by too many routine e-mail messages that don't indicate a problem. A typical job might look like this:

30 * * * * /usr/local/bin/some_script > /dev/null 2>&1

This entry, which uses a very common pattern, not only redirects normal output (stdout) to /dev/null, but also redirects all errors (stderr) to the same target. This can become a problem. I have seen jobs scheduled like this that have run for years without ever doing anything. They may have environment issues, as described in the previous section. They may have worked at the time of implementation, but at some point a change somewhere else in the system caused the cron script to break. In either case, the output that would have warned the user about emerging problems was dropped in the bit bucket. At best, an issue like this is an annoyance in that the job simply doesn't run. A worse scenario is that a routine system-maintenance job doesn't run properly and eventually allows bigger problems to crop up. I have seen these types of problems cause downtime on production systems, and for these reasons I would not recommend using this type of output redirection in a cron entry.

Whenever I write a script to be run from cron, my goal is to have the script emit output only if a debug flag has been set. Normal usage would not display any output. This way the crontab entry sends mail only when error messages are generated. Additionally, I would redirect the script's output (stdout) to /dev/null. If any error messages were to be created, they would still be sent to the user for diagnostic purposes.

The following modified form of the unsafe cron job discussed previously would yield the desired result:

30 * * * * /usr/local/bin/some_script > /dev/null
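A script written for this pattern might be organized along the following lines. This is a sketch under stated assumptions: the DEBUG variable name, the debug and error helpers, and the /tmp check are placeholders for illustration rather than part of any real job.

```shell
#!/bin/sh
# DEBUG is a hypothetical flag name; set DEBUG=1 in the environment to see
# progress messages. It defaults to off so that routine runs stay silent.
DEBUG=${DEBUG:-0}

# debug: print a progress message on stdout only when DEBUG is enabled.
debug() {
    if [ "$DEBUG" = 1 ]; then
        echo "debug: $*"
    fi
}

# error: always report problems on stderr, which cron still mails to the
# owner even when stdout is redirected to /dev/null.
error() {
    echo "error: $*" >&2
}

debug "starting run"
# Placeholder check; /tmp stands in for whatever the real job depends on.
if [ ! -d /tmp ]; then
    error "/tmp is missing"
    exit 1
fi
debug "finished run"
```

Run from the crontab entry above, a script like this produces mail only when error is called. Setting DEBUG=1 on the command line shows the progress messages while you are testing.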