There are times when there’s a need for running a group of tasks automatically at certain times in the future. These tasks are usually administrative, but could be anything - from making database backups to downloading emails when everyone is asleep.
Cron is a time-based job scheduler in Unix-like operating systems, which triggers certain tasks at a point in the future. The name originates from the Greek word χρόνος (chronos), which means time.
The most commonly used version of Cron is known as Vixie Cron, originally developed by Paul Vixie in 1987.
This piece is an in-depth walkthrough of this program, and a reboot of this ancient, but still surprisingly relevant post.
Job: a unit of work, a series of steps to do something, for example sending an email to a group of users. In this tutorial, we’ll use task, job, cron job and event interchangeably.
Daemon: (/ˈdiːmən/ or /ˈdeɪmən/) is a computer program which runs in the background, serving different purposes. Daemons are often started at boot time. A web server is a daemon serving HTTP requests. Cron is a daemon for running scheduled tasks.
Cron Job: a cron job is a scheduled job, being run by Cron when it’s due.
Webcron: a time-based job scheduler which runs within the web server environment. It’s used as an alternative to the standard Cron, often on shared web hosts that do not provide shell access.
This tutorial assumes you’re running a Unix-based operating system like Ubuntu. If you aren’t, we recommend setting up Homestead Improved - it’s a 5 minute process which will save you years down the line.
If we take a look inside the /etc directory, we can see directories like cron.hourly, cron.daily, cron.weekly and cron.monthly, each corresponding to a certain frequency of execution. One way to schedule our tasks is to place our scripts in the proper directory. For example, to run db_backup.php on a daily basis, we put it inside cron.daily. If the folder for a given frequency is missing, we would need to create it first.
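This approach uses the run-parts script, a command which runs every executable it finds within the specified directory. Note that on Debian-based systems such as Ubuntu, run-parts skips files whose names contain anything other than letters, digits, underscores and hyphens, so a script may need to be installed without its extension (db_backup rather than db_backup.php).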
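To make the idea concrete, here is a minimal sketch of what a run-parts style runner does. It is not the real tool: run_dir is a hypothetical helper, and the actual run-parts applies extra filename rules and options that vary between distributions.

```shell
# run_dir DIR
# Simplified stand-in for run-parts (hypothetical helper, not the real tool).
# Runs every regular, executable file in DIR, in the shell's sorted glob
# order, and skips everything else.
run_dir() {
    dir=$1
    for script in "$dir"/*; do
        # Skip subdirectories and files without the execute bit
        if [ -f "$script" ] && [ -x "$script" ]; then
            "$script"
        fi
    done
}
```

Calling run_dir /etc/cron.daily would run the daily scripts one after another, which is roughly what happens when the daily schedule fires.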
This is the simplest way to schedule a task. However, if we need more flexibility, we should use Crontab.
Cron uses special configuration files called crontab files, which contain a list of jobs to be done. Crontab stands for Cron Table. Each line in the crontab file is called a cron job, and consists of a set of fields separated by whitespace. Each line specifies when and how often a certain command or script should be executed.
In a crontab file, blank lines are ignored, as are leading spaces and tabs. Lines whose first non-whitespace character is # are considered comments.
Active lines in a crontab are either the declaration of an environment variable or a cron job, and comments are not allowed on the active lines.
Below is an example of a crontab file with just one entry:
0 0 * * * /var/www/sites/db_backup.sh
The first part, 0 0 * * *, is the cron expression, which specifies the frequency of execution. The above cron job will run once a day, at midnight.
Users can have their own crontab files named after their username as registered in the /etc/passwd file. All user-level crontab files reside in Cron’s spool area. These files should not be edited directly. Instead, we should edit them using the crontab command-line utility.
The spool directory varies across different distributions of Linux. On Ubuntu it’s /var/spool/cron/crontabs while on CentOS it’s /var/spool/cron.
To edit our own crontab file:
crontab -e
The above command will automatically open up the crontab file which belongs to our user. If a default system editor for the crontab hasn’t been selected before, a choice will be presented listing the installed ones. We can also explicitly choose or change our desired editor for editing the crontab file:
export VISUAL=nano; crontab -e
After we save the file and exit the editor, the crontab will be checked for accuracy. If everything is set properly, the file will be saved to the spool directory.
Each command in the crontab file is executed as the user who owns the crontab. If a command needs to run as root, it cannot be scheduled from our own user account’s crontab; it has to go into the root user’s crontab instead.
To list the installed cron jobs belonging to our own user:
crontab -l
We can also write our cron jobs in a file and send its contents to the crontab file like so:
crontab /path/to/the/file/containing/cronjobs.txt
The preceding command will overwrite the existing crontab file with the contents of /path/to/the/file/containing/cronjobs.txt.
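Because installing a file replaces the whole crontab, it can be handy to append a single job to what is already there. The following is a minimal sketch under that assumption; add_job is a hypothetical helper name, not part of the crontab utility.

```shell
# add_job NEW_JOB
# Reads an existing crontab on standard input and prints it with NEW_JOB
# appended, unless an identical line is already present.
# (add_job is a hypothetical helper, not part of the crontab tool.)
add_job() {
    new_job=$1
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -F: fixed string, -x: whole-line match, -q: quiet
    if ! printf '%s\n' "$existing" | grep -Fxq -- "$new_job"; then
        printf '%s\n' "$new_job"
    fi
}
```

A possible way to use it is crontab -l 2>/dev/null | add_job '0 0 * * * /var/www/sites/db_backup.sh' | crontab - where the trailing dash tells crontab to read the new table from standard input.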
To remove the crontab, we use the -r option:
crontab -r
The anatomy of a user-level crontab entry looks like the following:
# ┌───────────── min (0 - 59)
# │ ┌────────────── hour (0 - 23)
# │ │ ┌─────────────── day of month (1 - 31)
# │ │ │ ┌──────────────── month (1 - 12)
# │ │ │ │ ┌───────────────── day of week (0 - 6)
# │ │ │ │ │
# │ │ │ │ │
# * * * * * command to execute
The first two fields specify the time (minute and hour) at which the task will run. The next two fields specify the day of the month and the month. The fifth field specifies the day of the week.
The command will be executed when the minute, hour, month and either day of month or day of week match the current time.
If both day of week and day of month have certain values, the event will be run when either field matches the current time. Consider the following expression:
0 0 5-20/5 Feb 2 /path/to/command
The preceding cron job will run at midnight on the 5th, 10th, 15th and 20th of February, and also at midnight on every Tuesday in February.
When both day of month and day of week have certain values (not an asterisk), they form an OR condition, meaning the job runs on days that match either field.
The syntax in system crontabs (/etc/crontab) is slightly different from user-level crontabs. The difference is that the sixth field is not the command but the user we want to run the job as; the command comes after it.
* * * * * testuser /path/to/command
It’s not recommended to put system-wide cron jobs in /etc/crontab, as this file might be modified in future system updates. Instead, we put these cron jobs in the /etc/cron.d directory.
We might need to edit other users’ crontab files. To do this, we use the -u option as below:
crontab -u username -e
We can only edit other users’ crontab files as the root user, or as a user with administrative privileges.
Some tasks require super admin privileges, thus, they should be added to the root user’s crontab file:
sudo crontab -e
Please note that using sudo with crontab -e will edit the root user’s crontab file. If we need to edit another user’s crontab while using sudo, we should use the -u option to specify the crontab owner.
To learn more about the crontab command:
man crontab
Crontab fields accept numbers as values. However, we can put other data structures in these fields, as well.
We can pass in ranges of numbers:
0 6-18 1-15 * * /path/to/command
The above cron job will be executed every hour, on the hour, from 6 am to 6 pm, from the 1st to the 15th of each month. Note that the specified range is inclusive, so 1-5 means 1,2,3,4,5.
A list is a group of values separated by commas. We can have lists as field values:
0 1,4,5,7 * * * /path/to/command
The above syntax will run the cron job at 1 am, 4 am, 5 am and 7 am every day.
Steps can be used with ranges or the asterisk character (*). When they are used with ranges, they specify the number of values to skip through the end of the range. They are defined with a / character after the range, followed by a number. Consider the following syntax:
0 6-18/2 * * * /path/to/command
The above cron job will be executed every two hours from 6 am to 6 pm.
When steps are used with an asterisk, they simply specify the frequency of that particular field. As an example, if we set the minute to */5, it simply means every five minutes.
We can combine lists, ranges, and steps together to have more flexible event scheduling:
0 0-10/5,14,15,18-23/3 1 1 * /path/to/command
The above event will run on January 1st at midnight, 5 am and 10 am (every five hours from midnight to 10 am), at 2 pm and 3 pm, and at 6 pm and 9 pm (every three hours from 6 pm to 11 pm).
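To check how a combined field expands, a small helper can list the values it matches. This is only a sketch of the list, range and step rules described above, not Cron’s own parser; expand_field is a hypothetical function and it does not handle names such as Feb or sun.

```shell
# expand_field FIELD MIN MAX
# Prints the values matched by one crontab field (hypothetical helper).
# Handles numbers, "*", ranges "a-b", steps "/n" and comma-separated lists.
expand_field() {
    field=$1
    min=$2
    max=$3
    result=""
    old_ifs=$IFS
    IFS=,
    set -f                      # keep "*" from being expanded as a filename glob
    for part in $field; do
        step=1
        case $part in
            */*) step=${part#*/}; part=${part%%/*} ;;   # split off "/n"
        esac
        case $part in
            '*') lo=$min; hi=$max ;;                    # whole range of the field
            *-*) lo=${part%-*}; hi=${part#*-} ;;        # explicit range
            *)   lo=$part; hi=$part ;;                  # single value
        esac
        v=$lo
        while [ "$v" -le "$hi" ]; do
            result="$result $v"
            v=$((v + step))
        done
    done
    set +f
    IFS=$old_ifs
    printf '%s\n' "${result# }"
}

expand_field '0-10/5,14,15,18-23/3' 0 23   # prints: 0 5 10 14 15 18 21
```

The second and third arguments are the lowest and highest values the field allows, for example 0 and 23 for hours or 0 and 59 for minutes.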
For the month and day of week fields, we can use the first three letters of a particular day or month, like Sat, sun, Feb, Sep, etc.
* * * Feb,mar sat,sun /path/to/command
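The preceding cron job will run every minute, but only on Saturdays and Sundays in February and March.
Please note that the names are not case-sensitive. Support for ranges and lists of names differs between cron implementations: the original Vixie Cron documentation does not allow them, while newer implementations such as cronie do.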
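Some cron implementations support special strings. These strings are used instead of the first five fields, each specifying a certain frequency:
@yearly: once a year, at midnight on January 1st (0 0 1 1 *)
@monthly: once a month, at midnight on the first day of the month (0 0 1 * *)
@weekly: once a week, at midnight on Sunday (0 0 * * 0)
@daily: once a day, at midnight (0 0 * * *)
@hourly: once an hour, at the beginning of the hour (0 * * * *)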
We can run several commands in the same cron job by separating them with a semicolon (;).
* * * * * /path/to/command-1; /path/to/command-2
If the running commands depend on each other, we can use a double ampersand (&&) between them. As a result, the second command will not be executed if the first one fails.
* * * * * /path/to/command-1 && /path/to/command-2
Environment variables in crontab files are in the form of VARIABLE_NAME = VALUE (the white spaces around the equal sign are optional). Cron does not source any startup files from the user’s home directory (when it’s running user-level crons). This means we should manually set any user-specific settings required by our tasks.
The Cron daemon automatically sets some environment variables when it starts. HOME and LOGNAME are set from the crontab owner’s information in /etc/passwd. However, we can override these values in our crontab file if there’s a need for this.
There are also a few more variables like SHELL, specifying the shell which runs the commands. It is /bin/sh by default. We can also set the PATH in which to look for programs. Note that the directories in PATH are separated by colons:
PATH = /usr/bin:/usr/local/bin
We should wrap the value in quotation marks when there’s a space in the value. Please note that values are considered as ordinary strings and are not interpreted or parsed in any way.
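Since Cron does not load our usual shell startup files, a command that works in a terminal can fail under Cron. One way to catch this early is to run the command with a stripped-down environment. The sketch below is an approximation: run_like_cron is a hypothetical helper, and the PATH value is an assumption, as the real default depends on the cron implementation and its configuration.

```shell
# run_like_cron COMMAND
# Runs COMMAND through /bin/sh with a minimal environment that roughly
# resembles what a cron job sees (hypothetical helper). The PATH below is
# an assumed default, not a value read from the running cron daemon.
run_like_cron() {
    env -i \
        HOME="${HOME:-/}" \
        LOGNAME="${LOGNAME:-$(id -un)}" \
        SHELL=/bin/sh \
        PATH=/usr/bin:/bin \
        /bin/sh -c "$1"
}
```

For example, run_like_cron 'php /absolute/path/to/the/command' will fail if php lives outside the assumed PATH, which is the same kind of failure the job would hit when started by Cron.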
Cron uses the system’s time zone setting when evaluating crontab entries. This might cause problems on multiuser systems with users based in different time zones. To work around this problem, some implementations (cronie, for example) let us add an environment variable named CRON_TZ to our crontab file. As a result, all crontab entries will be evaluated in the specified time zone.
After Cron starts, it searches its spool area to find and load crontab files into memory. It additionally checks the /etc/crontab file and the /etc/cron.d directory for system crontabs.
After loading the crontabs into memory, Cron checks the loaded crontabs on a minute-by-minute basis, running the events which are due.
In addition to this, Cron regularly (every minute) checks if the spool directory’s modtime (modification time) has changed. If so, it checks the modtime of all the loaded crontabs and reloads those which have changed. That’s why we don’t have to restart the daemon when installing a new cron job.
We can specify which users should be able to use Cron and which should not. There are two files which play an important role when it comes to cron permissions: /etc/cron.allow and /etc/cron.deny.
If /etc/cron.allow exists, then our username must be listed in this file in order to use crontab. If /etc/cron.deny exists, it shouldn’t contain our username. If neither of these files exists, then based on the site-dependent configuration parameters, either only the super user or all users will be able to use the crontab command. For example, in Ubuntu, if neither file exists, all users can use crontab by default.
We can put ALL in the /etc/cron.deny file to prevent all users from using cron:
echo ALL > /etc/cron.deny
If we create an /etc/cron.allow file, there’s no need to create an /etc/cron.deny file, as it has the same effect as creating an /etc/cron.deny file with ALL in it.
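The allow and deny rules above can be summarized as a small decision function. This is a sketch of the logic only, not how the crontab command is implemented: can_use_crontab is a hypothetical helper, it assumes one username per line, and it does not give ALL any special meaning.

```shell
# can_use_crontab USER ALLOW_FILE DENY_FILE [DEFAULT]
# Returns 0 (true) if USER may use crontab according to the rules above.
# DEFAULT ("yes" or "no") stands for the site-dependent behaviour when
# neither file exists; "yes" mirrors the Ubuntu behaviour mentioned above.
can_use_crontab() {
    user=$1
    allow=$2
    deny=$3
    default=${4:-yes}
    if [ -f "$allow" ]; then
        grep -Fxq -- "$user" "$allow"        # allow file wins: must be listed
    elif [ -f "$deny" ]; then
        ! grep -Fxq -- "$user" "$deny"       # otherwise: must not be listed
    else
        [ "$default" = yes ]
    fi
}
```

A call such as can_use_crontab "$(id -un)" /etc/cron.allow /etc/cron.deny && echo allowed gives a quick answer for the current user.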
We can redirect the output of our cron job to a file, if the command (or script) has any output:
* * * * * /path/to/php /path/to/the/command >> /var/log/cron.log
We can redirect the standard output to /dev/null to get no email for normal output (more on emails below), while still allowing the standard error to be sent as email:
* * * * * /path/to/php /path/to/the/command > /dev/null
To prevent Cron from sending any emails to us, we change the respective crontab entry as below:
* * * * * /path/to/php /path/to/the/command > /dev/null 2>&1
This means “send both the standard output, and the error output into oblivion”.
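The difference between the two redirections is easy to try out in a shell. In this small demonstration, noisy is a hypothetical stand-in for a cron command that writes to both streams, and the captured text represents what Cron would still have left to mail.

```shell
# noisy: hypothetical stand-in for a cron command that writes one line to
# standard output and one line to standard error.
noisy() {
    echo "normal output"
    echo "error output" >&2
}

# With "> /dev/null" only standard output is discarded;
# standard error survives and would still be mailed.
only_errors=$( { noisy > /dev/null; } 2>&1 )

# With "> /dev/null 2>&1" both streams are discarded; nothing is left.
nothing=$( { noisy > /dev/null 2>&1; } 2>&1 )

echo "stdout discarded: [$only_errors]"   # prints: stdout discarded: [error output]
echo "both discarded: [$nothing]"         # prints: both discarded: []
```

The order matters in the second form: 2>&1 has to come after > /dev/null so that standard error follows standard output to the same place.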
The output is mailed to the owner of the crontab or to the email(s) specified in the MAILTO environment variable (if the standard output or standard error are not redirected as above).
If MAILTO is set to empty, no email will be sent out as the result of the cron job.
We can set several emails by separating them with commas:
MAILTO=[email protected],[email protected]
* * * * * /path/to/command
We usually run our PHP command line scripts using the PHP executable.
php script.php
Alternatively, we can use shebang at the beginning of the script, and point to the PHP executable:
#! /usr/bin/php
<?php
// PHP code here
As a result, we can execute the file by calling it by name. However, we need to make sure we have the permission to execute it.
To have more robust PHP command line scripts, we can use third-party components for creating console applications like Symfony Console Component or Laravel Artisan. This article is a good start for using Symfony’s Console Component.
Creating console commands using Laravel Artisan has been also covered here. If you’d rather use another command line tool for PHP, we have a comparison here.
There are times when scheduled tasks take much longer than expected. This will cause overlaps, meaning some tasks might be running at the same time. This might not cause a problem in some cases, but when they are modifying the same data in a database, we’ll have a problem. We can reduce the risk by running the tasks less frequently, but that still doesn’t guarantee that overlaps won’t happen again.
We have several options to prevent cron jobs from overlapping.
Flock is a nice tool to manage lock files from within shell scripts or the command line. These lock files are useful for knowing whether or not a script is running.
When used in conjunction with Cron, the respective cron jobs do not start while another instance still holds the lock. The flock command is part of the util-linux package, which is installed by default on most Linux distributions. If it is missing, we can install it using apt-get or yum, depending on the Linux distribution.
apt-get install util-linux
Or
yum install util-linux
Consider the following crontab entry:
* * * * * /usr/bin/flock --timeout=1 /path/to/cron.lock /usr/bin/php /path/to/scripts.php
In the preceding example, flock tries to acquire a lock on /path/to/cron.lock. If the lock is acquired within one second, it will run the script; otherwise, it will fail with an exit code of 1.
If the cron job executes a script, we can implement a locking mechanism in the script. Consider the following PHP script:
<?php
$lockfile = sys_get_temp_dir() . '/' . md5(__FILE__) . '.lock';
// 0 means "no usable PID": the lock file is missing or empty
$pid = file_exists($lockfile) ? (int) trim(file_get_contents($lockfile)) : 0;
if ($pid === 0 || posix_getsid($pid) === false) {
    // Record our PID first, so instances started while we work can see it
    file_put_contents($lockfile, getmypid());
    // Do something here
} else {
    exit('Another instance of the script is already running.');
}
In the preceding code, we keep the pid of the current PHP process in a file, which is located in the system’s temp directory. Each PHP script has its own lock file, named after the MD5 hash of the script’s path.
First, we check if the lock file exists, and then we get its content, which is the process ID of the last running instance of the script. Then we pass the pid to the posix_getsid PHP function, which returns the session ID of the process. If posix_getsid returns false, it means the process is not running anymore and we can safely start a new instance.
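The same PID-file idea can be sketched for jobs written as shell scripts. This is an illustration rather than a hardened lock: acquire_pid_lock is a hypothetical helper, it swaps posix_getsid for kill -0 as the liveness check, and there is a small window between the check and the write in which two instances could both proceed, which is something flock avoids.

```shell
# acquire_pid_lock LOCKFILE
# Returns 0 and records our PID if no live process holds the lock,
# returns 1 otherwise (hypothetical helper). "kill -0" sends no signal;
# it only reports whether the process exists and may be signalled by us,
# so a process owned by another user is treated as not running here.
acquire_pid_lock() {
    lockfile=$1
    if [ -s "$lockfile" ]; then
        old_pid=$(cat "$lockfile")
        if kill -0 "$old_pid" 2>/dev/null; then
            return 1            # previous instance is still running
        fi
    fi
    echo $$ > "$lockfile"       # stale or missing lock: take it over
}
```

A job script could start with acquire_pid_lock /path/to/job.lock || exit 1 and then carry on with its real work.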
One of the problems with Cron is that it assumes the system is running continuously (24 hours a day). This causes problems for machines which are not running all day long (like personal computers). If the system goes off during the time a task is scheduled to run, Cron will not run that task retroactively.
Anacron is not a replacement for Cron, but it has been developed to solve this problem. It runs the commands once a day, week or month but not on a minute-by-minute or hourly basis as Cron does. It is, however, guaranteed that the task will run even if the system goes off for an unanticipated period of time.
Only root or a user with administrative privileges can manage Anacron tasks. Anacron does not run in the background like a daemon, but only once, executing the tasks which are due.
Anacron uses a configuration file (just like crontab) named anacrontab. This file is located in the /etc directory.
The content of this file looks like this:
# /etc/anacrontab: configuration file for anacron
# See anacron(8) and anacrontab(5) for details.
SHELL=/bin/sh
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# the maximal random delay added to the base delay of the jobs
RANDOM_DELAY=45
# the jobs will be started during the following hours only
START_HOURS_RANGE=3-22
#period in days delay in minutes job-identifier command
1 5 cron.daily nice run-parts /etc/cron.daily
7 25 cron.weekly nice run-parts /etc/cron.weekly
@monthly 45 cron.monthly nice run-parts /etc/cron.monthly
In an anacrontab file, we can only set the frequencies with a period of n days (or a named period such as @monthly), followed by the delay time in minutes. This delay time is just to make sure the tasks do not run at the same time.
The third column is a unique name, which is used to identify the task in the Anacron log files.
The fourth column is the actual command to be run.
Consider the following entry:
1 5 cron.daily nice run-parts /etc/cron.daily
The above task runs daily, 5 minutes after Anacron is run. It uses run-parts to execute all the scripts within /etc/cron.daily.
The second entry in the list above runs every 7 days (weekly), with a 25 minutes delay.
As you have probably noticed, Cron is also set to execute the scripts inside the /etc/cron.* directories. This sort of possible collision with Anacron is handled differently in different flavors of Linux. In Ubuntu, Cron checks if Anacron is present in the system, and if so, it won’t execute the scripts within the /etc/cron.* directories.
In other flavors of Linux, Cron updates the Anacron timestamps when it runs the tasks, so Anacron won’t execute them if they have already been run by Cron.
It’s a good habit to use the absolute paths to all the executables we use in a crontab file.
* * * * * /usr/local/bin/php /absolute/path/to/the/command
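If we are not sure where a program lives, the shell can look it up for us. The sketch below builds a crontab line with the resolved path; make_job is a hypothetical helper, and the path it prints depends on the PATH of the shell in which it is run.

```shell
# make_job SCHEDULE PROGRAM ARGS
# Prints a ready-to-paste crontab line that uses the absolute path of
# PROGRAM as found on the current PATH (hypothetical helper).
# Returns 1 if PROGRAM cannot be found.
make_job() {
    schedule=$1
    program=$2
    args=$3
    abs=$(command -v "$program") || return 1
    printf '%s %s %s\n' "$schedule" "$abs" "$args"
}
```

For example, make_job '* * * * *' php /absolute/path/to/the/command prints the entry with whatever absolute path php resolves to on the current system.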
If our tasks are not running at all, first we need to make sure the Cron daemon is running. The daemon is named crond on CentOS and cron on Ubuntu, so we search for cron to match both:
ps aux | grep cron
The output should be similar to this:
root 7481 0.0 0.0 116860 1180 ? Ss 2015 0:49 crond
/etc/cron.allow and /etc/cron.deny Files
If the cron jobs are not running, then we need to check if /etc/cron.allow exists. If it does, we need to make sure our username is listed in this file. If /etc/cron.deny exists, we need to make sure our username is not listed in this file.
If a user’s crontab file was edited while the user was not yet listed in /etc/cron.allow, adding the user to /etc/cron.allow afterwards won’t make the cron jobs run until we re-edit the crontab file.
We need to make sure that the owner of the crontab has execute permissions for all the commands and scripts in the crontab file. Otherwise, the cron jobs will not work. Execute permissions can be added to any folder or file with chmod +x /some/file.php.
Every entry in the crontab should end with a newline character. This means the last crontab entry must also be followed by a line break, or the last cron job will never run.
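When jobs are kept in a file and installed with the crontab command, the trailing line break can be checked before installing. The following is a minimal sketch; ends_with_newline is a hypothetical helper.

```shell
# ends_with_newline FILE
# Returns 0 if FILE is empty or its last byte is a newline character.
# Command substitution strips a trailing newline, so the captured last
# byte is an empty string exactly when that byte is a newline.
ends_with_newline() {
    [ -z "$(tail -c 1 "$1")" ]
}
```

For example, ends_with_newline /path/to/the/file/containing/cronjobs.txt || echo "missing final newline" warns us before the file is handed to crontab.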
Cron is a daemon, running a list of events scheduled to take place in the future. These jobs are listed in special configuration files called crontab files. Users can have their own crontab file, if they are allowed to use Cron, based on the /etc/cron.allow or /etc/cron.deny files. In addition to user-level cron jobs, Cron also loads the system-wide cron jobs, which are slightly different in syntax.
Our tasks are commonly PHP scripts or command-line utilities. In systems which are not running all the time, we can use Anacron to run the events which happen in the period of n days.
When working with Cron, we should also be aware of the tasks overlapping each other, to prevent data loss. After a cron job is finished, the output will be sent to the owner of the crontab or the email(s) specified in the MAILTO environment variable.