This chapter marks a significant turning point for us in this book. Up until now, we have only discussed development issues (more or less). For the next four chapters, we will only discuss deployment/maintenance issues (more or less).
Do you remember the site administration diagram, located at the beginning of Chapter 1? A fair number of development tasks have now been covered, so we can switch to deployment/maintenance tasks with confidence, knowing that we understand what the various development tasks look like. Also, remember that in the real world, you are likely to have deployed to staging numerous times before deploying to production.
In a way, you have already been prepared for what is coming next. By demonstrating Buildout's ability to extend configuration files, we have (hopefully) shown that a production buildout is just a buildout configuration file(s) that configures the site for production use.
In some cases, a production buildout may get so elaborate that it works only in production. In other cases, it may work just fine in both development and production environments. For example, a production buildout may configure packages to depend on libraries whose locations vary, depending on the environment. Hence, a production buildout may not work in development because the libraries are in a different place.
At this point, it may become more desirable to include such dependencies in the buildout, but that is for you to decide. There is always a trade-off when deciding which dependencies to include in the buildout, versus which dependencies (operating system vendor packages) to rely on in production. A good rule of thumb is to install as many vendor packages as possible, and then use Buildout to install the rest.
In our first example, we will demonstrate what a typical production configuration file looks like. Then, we will demonstrate the techniques to automate various maintenance tasks. Lastly, we will discuss the remaining miscellaneous tasks, if any, which relate to deployment and maintenance.
In this chapter, you will learn the following:
Creating a production buildout
Backing up your database
Automating database backups
Restoring your database from a backup
Packing your database
Automating database packing
Rotating logs
We would be remiss at this point if we did not discuss adding Zope Enterprise Objects (ZEO) to our buildout. ZEO provides a way to allow numerous Zope 2 instances to access the same database (Data.fs file).
During normal operation (that is, without ZEO), Zope 2 locks the Data.fs file when it is in use. When ZEO is in use, ZEO locks the file too, but allows connections over TCP/IP.
So from now on, we will be using ZEO. This will enable us to do a number of things:
Learn how to use ZEO
Establish additional connections to the database to facilitate
Backups
Debugging
Database packing
Load balancing
However, if we were to add ZEO to our current buildout, we would experience this problem:
$ bin/buildout -c 05-deployment-maintenance-production.cfg
Updating zope2.
Updating fake eggs
Updating instance.
Installing plonesite.
2010-05-19 22:27:32 WARNING ZEO.zrpc (10025) CW: error connecting to ('127.0.0.1', 8100): ECONNREFUSED
This is because in 05-deployment-maintenance-production.cfg, we reconfigure the instance section to use ZEO, but we do not have ZEO installed yet (because the "add ZEO" buildout run has not yet been completed).
At the same time, the plonesite part is trying to start the instance so that it can create a Plone site. However, the Zope 2 instance is unable to connect to a ZEO instance, since ZEO is not running and will not exist until the "add ZEO" buildout run completes.
To work around this, we can create a configuration file to subtract the plonesite part from the list of parts:
[buildout]
extends = 04-administration-ldap.cfg
parts -= plonesite
Note that we are using the -= notation for the first time.
Until now, we have only been adding parts to the parts parameter. In this case, we are subtracting the plonesite part to make the "add ZEO" buildout run work properly.
If we need the plonesite part later, we can easily add it back.
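The effect of the += and -= notations on the parts parameter can be modeled in a few lines of Python. This is only a sketch of the idea, not Buildout's actual implementation; the apply_parts function and the inherited list of parts are hypothetical names chosen for illustration.

```python
def apply_parts(base_parts, operator, names):
    """Model a buildout-style += or -= applied to an inherited parts list.

    This is an illustrative model only, not Buildout's own code.
    """
    if operator == "+=":
        # += appends the new names to the inherited value.
        return list(base_parts) + list(names)
    if operator == "-=":
        # -= removes the named entries from the inherited value.
        return [part for part in base_parts if part not in names]
    raise ValueError("unsupported operator: %r" % (operator,))


# Hypothetical parts list inherited from the extended configuration file.
inherited = ["zope2", "instance", "plonesite"]

without_site = apply_parts(inherited, "-=", ["plonesite"])
with_zeo = apply_parts(without_site, "+=", ["zeo"])
print(with_zeo)
```

Because each file in the extends chain only adds or subtracts parts, the plonesite part can be restored later with a single parts += plonesite line.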
Next, we configure ZEO in 05-deployment-maintenance-production.cfg by adding a zeo section, and by configuring the instance section to be a ZEO client:
[buildout]
extends = 05-deployment-maintenance-plonesite.cfg
parts += zeo

[instance]
zeo-client = True

[zeo]
recipe = plone.recipe.zope2zeoserver
zope2-location = ${zope2:location}
Now run Buildout:
$ bin/buildout -c 05-deployment-maintenance-production.cfg
You should see:
$ bin/buildout -c 05-deployment-maintenance-production.cfg
Getting distribution for 'plone.recipe.zope2zeoserver==1.4'.
Got plone.recipe.zope2zeoserver 1.4.
…
Installing zeo.
Created directory /Users/aclark/Developer/plone-site-admin/buildout/parts/zeo
Created directory /Users/aclark/Developer/plone-site-admin/buildout/parts/zeo/etc
Created directory /Users/aclark/Developer/plone-site-admin/buildout/parts/zeo/var
Created directory /Users/aclark/Developer/plone-site-admin/buildout/parts/zeo/log
Created directory /Users/aclark/Developer/plone-site-admin/buildout/parts/zeo/bin
Wrote file /Users/aclark/Developer/plone-site-admin/buildout/parts/zeo/etc/zeo.conf
Wrote file /Users/aclark/Developer/plone-site-admin/buildout/parts/zeo/bin/zeoctl
Changed mode for /Users/aclark/Developer/plone-site-admin/buildout/parts/zeo/bin/zeoctl to 755
Wrote file /Users/aclark/Developer/plone-site-admin/buildout/parts/zeo/bin/runzeo
Changed mode for /Users/aclark/Developer/plone-site-admin/buildout/parts/zeo/bin/runzeo to 755
Generated script '/Users/aclark/Developer/plone-site-admin/buildout/bin/zeo'.
Generated script '/Users/aclark/Developer/plone-site-admin/buildout/bin/zeopack'.
Now start ZEO:
$ bin/zeo start
.
daemon process started, pid=10578
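If you want to confirm that ZEO is accepting connections before starting a client, a small TCP check is enough. The sketch below assumes ZEO listens on 127.0.0.1 port 8100, the address shown in the ECONNREFUSED warning earlier; the is_listening helper is a hypothetical name and not part of ZEO or Buildout.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers a refused connection (nothing listening) and timeouts.
        return False


if __name__ == "__main__":
    # Address taken from the warning message; adjust it if your ZEO differs.
    print(is_listening("127.0.0.1", 8100))
```

A False result here corresponds to the ECONNREFUSED warning: the instance has been configured as a ZEO client, but no ZEO server is running at that address.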
Now start Plone:
$ bin/instance fg
Database backups are an important part of any web application deployment, and Plone is no exception.
However, unlike a lot of web applications, there is no SQL data to back up. This is because Plone uses Zope 2, which uses the ZODB (http://pypi.python.org/pypi/ZODB3) for persistent storage. The ZODB is an object database (not a relational database) and its contents are typically stored in a single flat file called Data.fs.
By using the undo feature of Zope 2, you can often avert disasters by undoing a transaction or more from the Undo form, but you must do so sequentially starting from the most recent transaction.
If you want to examine the Undo form, browse to http://localhost:8080/Plone and click on Site Setup | Zope Management Interface.
You should see the Undo form in the list of tabs at the top, toward the right side, as shown in the following screenshot:
In other words, you cannot pick an arbitrary transaction from the middle of the transaction log and expect to be able to roll back to that point, without selecting all of the transactions in between.
Even if you do, it may not work. The undo feature is unfortunately not impervious to transaction conflict errors, and sometimes, we have to restore from a backup to put things right.
This is where automated "hot" backups come in handy.
Copying the Data.fs file from one host to another off-site may suffice as a "hot" backup, but in theory you should never copy data from a running instance because you could miss data being written to the file at the time of the copy.
Luckily for us, Zope 2 ships with the repozo utility as a part of the ZODB3 package (http://pypi.python.org/pypi/ZODB3) to facilitate "hot" backups, and you can use it to back up your site as shown:
% bin/repozo -v -B -F -f var/filestorage/Data.fs -r var/filestorage
The output for this should be something like this:
looking for files between last full backup and 2009-07-04-06-08-12...
no files found
doing a full backup
writing full backup: 4100111 bytes to var/filestorage/2009-07-04-06-08-12.fs
In the example above, we forced a full backup with the -F flag. If you just want to run incremental backups, you can do this:
% bin/repozo -v -B -f var/filestorage/Data.fs -r var/filestorage
This command tells repozo to run in verbose mode with -v, create a backup with -B, back up the specified file with -f, and put the results in the directory specified by -r.
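If you prefer to drive repozo from a script, the same flags can be assembled programmatically. The following is a minimal sketch that uses only the flags described above (-v, -B, -F, -f, and -r); the function names are hypothetical, and the default bin/repozo path assumes you run the script from the buildout directory.

```python
import subprocess


def repozo_backup_args(data_fs, backup_dir, full=False, verbose=True,
                       repozo="bin/repozo"):
    """Build the argument list for a repozo backup run."""
    args = [repozo]
    if verbose:
        args.append("-v")      # verbose output
    args.append("-B")          # backup mode
    if full:
        args.append("-F")      # force a full backup
    args += ["-f", data_fs,    # the file storage to back up
             "-r", backup_dir]  # the directory that receives the backup
    return args


def run_backup(data_fs, backup_dir, full=False):
    """Run repozo and return its exit status (0 means success)."""
    return subprocess.call(repozo_backup_args(data_fs, backup_dir, full=full))


if __name__ == "__main__":
    print(" ".join(repozo_backup_args("var/filestorage/Data.fs",
                                      "var/filestorage", full=True)))
```

Keeping the argument list in a separate function makes it easy to inspect the command before anything is executed.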
It is likely you will not want to memorize all of these commands; so, let us make the task easier by adding collective.recipe.backup (http://pypi.python.org/pypi/collective.recipe.backup) to our buildout.
By doing this, we are configuring additional scripts to be added to the bin directory of the buildout. These are for us to execute (or automate) at our earliest convenience.
When executed, these scripts will run repozo with a sensible set of defaults.
In 05-deployment-maintenance-backup.cfg, we have:
[buildout]
extends = 05-deployment-maintenance-production.cfg
parts += backup

[backup]
recipe = collective.recipe.backup
Now run Buildout:
$ bin/buildout -c 05-deployment-maintenance-backup.cfg
You should see:
$ bin/buildout -c 05-deployment-maintenance-backup.cfg
Getting distribution for 'collective.recipe.backup'.
Got collective.recipe.backup 1.3.
…
Installing backup.
backup: Created /Users/aclark/Developer/plone-site-admin/buildout/var/backups
backup: Created /Users/aclark/Developer/plone-site-admin/buildout/var/snapshotbackups
Generated script '/Users/aclark/Developer/plone-site-admin/buildout/bin/backup'.
Generated script '/Users/aclark/Developer/plone-site-admin/buildout/bin/snapshotbackup'.
Generated script '/Users/aclark/Developer/plone-site-admin/buildout/bin/restore'.
Generated script '/Users/aclark/Developer/plone-site-admin/buildout/bin/snapshotrestore'.
We now have two new directories and four new scripts in our buildout. The directories we created are:
var/backups
var/snapshotbackups
The scripts we created are:
bin/backup
bin/snapshotbackup
bin/restore
bin/snapshotrestore
Thanks to collective.recipe.backup, we can now run repozo easily, and we have a special place to put the results.
If you want to do a regular backup of your site, just run:
$ bin/backup
The backup will be stored in var/backups.
Each subsequent run of bin/backup, assuming site content has changed, will automatically create an incremental backup ending in .deltafs.
If you want to set aside a full backup for some other purpose, like copying to a local development environment, you can run:
$ bin/snapshotbackup
A full backup will be created in var/snapshotbackups. At this point, you may be noticing that the business of backups can be somewhat complicated. In order to be successful over time (or in other words, to have the best chance of recovering important lost data), you will need a sane backup strategy.
A very simple backup strategy is listed here:
Daily incremental backups
Monthly full backups
Yearly retention
This means we must configure our incremental backups to run daily and full backups to run once a month. We will keep a directory full of backups for a period of one year and then, we will archive that directory and start over.
In the section that follows, we will cover how to implement the first portion of this strategy: daily incremental backups.
Of course, no one wants to perform manual backups on a regular basis.
On Mac OS X and Ubuntu Linux, we can use Buildout to configure a cron entry for us. On Windows, since there is no cron, we can use the Task Scheduler (outside our buildout) instead.
To configure cron entries with Buildout, you can use z3c.recipe.usercrontab (http://pypi.python.org/pypi/z3c.recipe.usercrontab).
In 05-deployment-maintenance-cron.cfg, we have:
[buildout]
extends = 05-deployment-maintenance-backup.cfg
parts += cron

[cron]
recipe = z3c.recipe.usercrontab
command = ${buildout:directory}/bin/backup
times = 0 0 * * *
We have added a cron section, and defined the recipe, the command we want to run, and the times at which we want to run it. If you are not familiar with the cron syntax, refer to the following:
Field | Allowed values |
---|---|
Minute | 0-59 |
Hour | 0-23 |
Day of the month | 1-31 |
Month | 1-12 (or names, see below) |
Day of the week | 0-7 (0 or 7 is Sun, or use names) |

A field may be an asterisk (*), which always stands for "first-last".… Names can also be used for the Month and Day of the week fields. Use the first three letters of the particular day or month (case does not matter). Ranges or lists of names are not allowed.
This information is from the output of the command (run on Mac OS X and Linux):
$ man 5 crontab
So, the fields are listed in order, in the times parameter. We have created a cron entry that will run our command at the zeroth minute of the zeroth hour, every day of the month, every month, and every day of the week.
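To see how the five fields are read, here is a deliberately simplified matcher in Python. It is a sketch for illustration only: it supports just an asterisk or a single integer in each field (no names, ranges, lists, or steps), and the function names are hypothetical. Real cron implementations handle many more cases.

```python
from datetime import datetime


def field_matches(field, value):
    """Match one field that is either '*' or a single integer."""
    return field == "*" or int(field) == value


def cron_matches(times, moment):
    """Return True if a simplified five-field cron spec matches a datetime."""
    minute, hour, day, month, weekday = times.split()
    # cron counts Sunday as 0 (or 7); isoweekday() counts Sunday as 7.
    cron_weekday = moment.isoweekday() % 7
    weekday_ok = weekday == "*" or int(weekday) % 7 == cron_weekday
    return (field_matches(minute, moment.minute)
            and field_matches(hour, moment.hour)
            and field_matches(day, moment.day)
            and field_matches(month, moment.month)
            and weekday_ok)


if __name__ == "__main__":
    # Midnight matches the daily schedule; one minute later does not.
    print(cron_matches("0 0 * * *", datetime(2010, 5, 19, 0, 0)))
    print(cron_matches("0 0 * * *", datetime(2010, 5, 19, 0, 1)))
```

With times = 0 0 * * *, only the minute and hour fields constrain the schedule, which is why the command runs once a day at midnight.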
Now run Buildout:
$ bin/buildout -c 05-deployment-maintenance-cron.cfg
You should see:
$ bin/buildout -c 05-deployment-maintenance-cron.cfg
…
Installing cron.
You can check that your crontab entry has been created by running the following command:
$ crontab -l
You should see:
$ crontab -l
# Generated by /Users/aclark/Developer/plone-site-admin/buildout [cron]
0 0 * * * /Users/aclark/Developer/plone-site-admin/buildout/bin/backup
# END /Users/aclark/Developer/plone-site-admin/buildout [cron]
To schedule tasks on Windows, select Start | All Programs | Accessories | System Tools | Task Scheduler.
When the task scheduler appears, select Create Basic Task:
Next, fill in the Name and Description:
This is followed by when you want the task to start:
This is followed by when you want the task to recur:
Next is what you want the task to do:
Select the program you want to start (for example, C:\Users\Administrator\Developer\plone-site-admin\buildout\bin\backup.exe).
You should see:
Click on Finish.
You can now test your installation by scrolling down to Active Tasks in the center pane, and scrolling down to your task.
If you want to restore your database from the latest backup, you can stop your site (including ZEO) and run:
$ bin/restore
In the event that you would like to know more about the process, or you would like to restore from a date prior to the last backup, you can always use the repozo command.
Assuming you have the backups, you can restore the data from any date by using the -D option:
% bin/repozo -R -D 2009-10-26-00-06-33 -r var/backups -o var/filestorage/Data.fs
You must give -D a date string that matches one of your backups. It can be an incremental backup (that is, a file ending in .deltafs); repozo will figure out what to do with it.
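To find a suitable date string, you can read the timestamps straight from the backup file names. The sketch below assumes the naming shown earlier (a YYYY-MM-DD-HH-MM-SS timestamp followed by .fs or .deltafs) and ignores any other files in the directory; the helper names are hypothetical and are not part of repozo.

```python
from datetime import datetime

STAMP = "%Y-%m-%d-%H-%M-%S"


def backup_times(filenames):
    """Return the sorted timestamps of full and incremental backup files."""
    times = []
    for name in filenames:
        stem, _dot, ext = name.rpartition(".")
        if ext not in ("fs", "deltafs"):
            continue  # not a full or incremental backup file
        try:
            times.append(datetime.strptime(stem, STAMP))
        except ValueError:
            continue  # the name does not follow the timestamp pattern
    return sorted(times)


def latest_at_or_before(filenames, when):
    """Return the date string of the newest backup made at or before when."""
    candidates = [t for t in backup_times(filenames) if t <= when]
    return candidates[-1].strftime(STAMP) if candidates else None


if __name__ == "__main__":
    import os
    # Assumes the var/backups directory used earlier in this chapter.
    print(latest_at_or_before(os.listdir("var/backups"), datetime.now()))
```

The returned string can then be passed to the -D option in the restore command.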
Next, we will apply the same basic techniques to another important task: packing your site's database.
In Zope 2, every database transaction is saved. So if you never pack, your database will keep growing and fill up your disk. This is why we must pack; of course, how often you pack will depend on how often your site's content changes.
The recommended strategy is to pack to within seven days of the current date. In other words, keep a week's worth of transactions in the database so that you can restore to as far back as one week, if needed (assuming you are able to use undo successfully).
After you run the 05-deployment-maintenance-production.cfg buildout, you should have a bin/zeopack script.
By default, the script packs to one day. If you want to change that (or set the less commonly used ZEO user/password), you can use the following parameters in the zeo section (which uses the plone.recipe.zope2zeoserver recipe):
pack-days: Specifies the number of days of history for the zeopack script to retain. The default value is one day.
pack-user: If the ZEO server uses authentication, this is the username used by the zeopack script to connect to the ZEO server.
pack-password: If the ZEO server uses authentication, this is the password used by the zeopack script to connect to the ZEO server.
In the section that follows, we will cover how to implement the next portion of our simple backup policy: monthly full backups.
Packing the database every month will cause the next daily backup to be a full backup instead of an incremental backup. That is because repozo detects the database change, and performs a full backup as a result.
This is good news for us, because it means that all we have to do to ensure monthly full backups is to configure an automated task to pack the database. In other words, we just require a cron entry in Mac OS X and Ubuntu Linux, and a scheduled task in Windows Task Scheduler.
We will leave the task scheduling to the Windows folks, who can easily create another scheduled task by referring to the steps we performed earlier.
For the Mac OS X and Ubuntu Linux folks, in 05-deployment-maintenance-cron2.cfg we have the following:
[buildout]
extends = 05-deployment-maintenance-cron.cfg
parts += cron2

[cron2]
recipe = z3c.recipe.usercrontab
command = ${buildout:directory}/bin/zeopack
times = 0 0 1 * *
Notice that we chose a value of 0 0 1 * * for the times parameter, to indicate that we want to perform the task on the first day of the month, at the zeroth hour, and zeroth minute.
As you are probably aware, Zope 2 generates log files. Hence, we need to think about what to do when these log files grow.
One of the tasks that could be difficult to standardize across operating systems is rotating Zope 2 log files. Luckily for us, there is iw.rotatezlogs (http://pypi.python.org/pypi/iw.rotatezlogs).
Although technically not a recipe, iw.rotatezlogs makes it simple to set up log rotation.
The process involves three steps:
Add the iw.rotatezlogs egg to the eggs parameter in the instance section.
Add an event-log-custom parameter to your instance section with iw.rotatezlogs settings.
Add an access-log-custom parameter to your instance section with iw.rotatezlogs settings.
In 05-deployment-maintenance-rotate.cfg, we have the following:
[buildout]
extends = 05-deployment-maintenance-cron2.cfg

[instance]
eggs += iw.rotatezlogs
event-log-custom =
    %import iw.rotatezlogs
    <rotatelogfile>
        path ${buildout:directory}/var/log/instance.log
        max-bytes 1MB
        backup-count 5
    </rotatelogfile>
access-log-custom =
    %import iw.rotatezlogs
    <rotatelogfile>
        path ${buildout:directory}/var/log/instance-Z2.log
        max-bytes 1MB
        backup-count 5
    </rotatelogfile>
Now run Buildout:
$ bin/buildout -c 05-deployment-maintenance-rotate.cfg
The following code should be included in parts/instance/etc/zope.conf:
…
%import iw.rotatezlogs
<rotatelogfile>
  path /Users/aclark/Developer/plone-site-admin/var/log/instance.log
  max-bytes 1MB
  backup-count 5
</rotatelogfile>
</eventlog>
<logger access>
  level WARN
  %import iw.rotatezlogs
  <rotatelogfile>
    path /Users/aclark/Developer/plone-site-admin/var/log/instance-Z2.log
    max-bytes 1MB
    backup-count 5
  </rotatelogfile>
</logger>
…
You will notice that we have set max-bytes to 1MB and backup-count to 5, which means that whenever either log file reaches 1 MB, it will be rotated, and once five rotated log files exist, the oldest one will be deleted each time a rotation occurs.
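The same size-and-count behavior can be observed with RotatingFileHandler from Python's standard library, whose maxBytes and backupCount arguments play the same roles as max-bytes and backup-count. Whether iw.rotatezlogs uses this handler internally is an assumption here; the sketch below only demonstrates the rotation semantics, with a hypothetical demo_rotation function and deliberately tiny limits.

```python
import logging
import logging.handlers
import os
import tempfile


def demo_rotation(log_dir, max_bytes, backup_count, n_messages):
    """Write n_messages to a rotating log and return the resulting file names."""
    path = os.path.join(log_dir, "instance.log")
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count)
    logger = logging.getLogger("rotation-demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    logger.addHandler(handler)
    try:
        for i in range(n_messages):
            logger.info("message number %04d with some padding", i)
    finally:
        logger.removeHandler(handler)
        handler.close()
    return sorted(os.listdir(log_dir))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # A tiny size limit, so rotation happens after only a few messages.
        print(demo_rotation(tmp, max_bytes=200, backup_count=5, n_messages=100))
```

However many messages are written, the directory never holds more than the current log plus backup_count rotated files.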
To verify this, try filling up the log files to just under 1 MB, then restart Plone. You should see the rotation occur.
Similarly, you could create five log files first, then restart Plone and you should see the rotation occur.
We can use this technique for any log file, for example, the ZEO log.
In 05-deployment-maintenance-rotate2.cfg we have:
[buildout]
extends = 05-deployment-maintenance-rotate.cfg

[zeo]
eggs += iw.rotatezlogs
zeo-log-custom =
    %import iw.rotatezlogs
    <rotatelogfile>
        path ${buildout:directory}/var/log/zeo.log
        max-bytes 1MB
        backup-count 5
    </rotatelogfile>
Since the ZEO instance runs under its own process, you will notice that we have added the iw.rotatezlogs egg to the zeo section's eggs parameter.
At this point, we have done just about all of the setup and preparation we can do to make our site run smoothly.
In this chapter, you have learned the following:
Adding ZEO to create a production buildout
Using collective.recipe.backup to make using repozo easier
Automating database backups with z3c.recipe.usercrontab and Task Scheduler
Restoring your database from a backup with the help of collective.recipe.backup
Packing your database with zeopack
Automating database packing with help from z3c.recipe.usercrontab and Task Scheduler
Rotating logs with iw.rotatezlogs
Next, in Chapter 6, we will focus on what we can do while our site is running to monitor and optimize its performance.
Later in Chapter 7, we will cover troubleshooting and upgrading along with various security concerns.