CHAPTER 11

Infrastructure Enhancement

At this point, we have a fully functional infrastructure. We have automated all of the changes to the hosts at our site from the point at which the initial imaging hosts and cfengine server were set up.

We're running a rather large risk, however, because if we make errors in our cfengine configuration files, we won't have an easy way to revert the changes. We run an even greater risk if our cfengine server were to suffer hardware failure: we would have no way of restoring the cfengine masterfiles tree. The other hosts on our network will continue running cfengine, and they will apply the last copied policies and configuration files, but no updates will be possible until we restore our central host.

Subversion can help us out with both issues. Using version control, we can easily track the changes to all the files hosted in our cfengine masterfiles tree, and by making backups of the Subversion repository, we can restore our cfengine server in the event of system failure or even total site failure.

Cfengine Version Control with Subversion

With only a small network in place, we already have over 2,800 lines of configuration code in over 55 files under the PROD/inputs directory. We need to start tracking the different versions of those files as time goes on, as well as tracking any additional files that are added. The workplace of one of this book's authors has over 30,000 lines of cfengine configuration in 971 files. Without version control, it is difficult to maintain any semblance of control over your cfengine configuration files, as well as the files being copied by cfengine.

We covered basic Subversion usage in Chapter 8 and included instructions on how to set up a Subversion server with an Apache front end. We'll utilize that infrastructure to host version control for our cfengine master repository.

Importing the masterfiles Directory Tree

In order to import our cfengine masterfiles directory into Subversion, we need to create the repository on etchlamp, our Subversion host. Conveniently, we already created the repository with cfengine back in Chapter 8 and granted read/write access to the nate and kirk users.

Now, we want to set up a read-only user to be used to check out changes to production hosts. Once we check out a copy of the production cfengine masterfiles tree, we don't want to allow changes to be checked in directly from that tree. We want our administrators to edit a working copy of the configuration, check in their changes, and then have the production working copy updated directly from Subversion.

To set up the read-only user, create it manually on the system etchlamp (as the root user), and copy the access file to the cfengine master:

# htpasswd /etc/apache2/dav_svn.passwd readonly
New password:
Re-type new password:
Adding password for user readonly
# scp /etc/apache2/dav_svn.passwd \
    goldmaster:/var/lib/cfengine2/masterfiles/PROD/repl/root/etc/apache2/dav_svn.passwd

Now, we want to grant read-only access to the cfengine Subversion repository to this new user. Change this section in PROD/repl/root/etc/apache2/svn_accessfile

[cfengine:/]
@admins = rw

to this

[cfengine:/]
@admins = rw
readonly = r

Before we import into the Subversion repository, we'll want to make sure that all the .svn directories that get added into the masterfiles tree don't get copied out to clients later on. These are unnecessary and are a bit of a security risk. We'll accomplish this with a global ignore action. Create the directory PROD/inputs/ignore on the cfengine master, and place these contents in a new file at PROD/inputs/ignore/cf.ignore:

ignore:
        any::
                .svn

Import this file into cfagent.conf. Since the file is made up entirely of imports, you can place this entry anywhere after the import: line:

# globally ignore certain files and directories
ignore/cf.ignore
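
One way to confirm that the ignore rule is doing its job is to look for stray .svn directories beneath a directory that cfengine populates on a client. The following is a minimal sketch and not part of our cfengine configuration; the function name find_svn_dirs is our own invention for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not part of cfengine): list any .svn directories
# found beneath the directory given as the first argument.
find_svn_dirs() {
    # -prune stops find from descending into each .svn directory it reports
    find "$1" -type d -name .svn -prune -print
}

# Example (the path is a placeholder; use a directory copied by cfengine):
# find_svn_dirs /path/copied/by/cfengine
```

If the function prints nothing for a directory that cfengine copied out, no Subversion metadata reached that location.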

At this point we're ready to import the masterfiles/PROD directory. On the cfengine master goldmaster, run these commands:

# svn --username=nate import /var/lib/cfengine2/masterfiles/PROD \
    https://svn.campin.net/svn/cfengine/masterfiles/PROD \
    -m"initial cfengine import of the PROD tree only"
Authentication realm: <https://svn.campin.net:443> Campin.net Subversion Repository
Password for 'nate':
Adding     /var/lib/cfengine2/masterfiles/PROD/repl
Adding     /var/lib/cfengine2/masterfiles/PROD/repl/root
Adding     /var/lib/cfengine2/masterfiles/PROD/repl/root/var
Adding     /var/lib/cfengine2/masterfiles/PROD/repl/root/var/www
Adding     /var/lib/cfengine2/masterfiles/PROD/repl/root/var/www/html

The output went on for some time; it's quite surprising just how many files we have in there at this point. The large number of files highlights the importance of keeping our files in Subversion, if only as a backup measure. The utility of version control for our repository goes far beyond simple backups, as you will see in the next section.

Now, when you visit the URL https://svn.campin.net/svn/cfengine/masterfiles/PROD/ in a web browser, you'll see your repl and inputs directories in Subversion, with revision 1. To use our current masterfiles/PROD tree from Subversion on our cfengine master, we'll need to check out a working copy in place of the current PROD directory. Here are the commands we ran (as the root user) on goldmaster:

# cd /var/lib/cfengine2/masterfiles
# mv PROD PROD.bak
# svn --username=readonly co https://svn.campin.net/svn/cfengine/masterfiles/PROD
A    PROD/inputs
A    PROD/inputs/cfservd.conf
A    PROD/inputs/control
A    PROD/inputs/control/cf.friendstatus
A    PROD/inputs/control/cf.control_cfagent_conf
A    PROD/inputs/control/cf.control_cfexecd
A    PROD/inputs/cf.preconf

The output went on for quite some time as all the files were checked out. In order to edit those files, we can (and will) check out the tree somewhere else, such as in our home directory. This way, we will be working on changes in an environment where they won't immediately be copied by the systems on our network.


Note We should never again work directly on the files in the PROD tree on the cfengine master and check in our changes from there. It is bad practice to directly edit the live files used for configuration at our site.


If you attempt to check in files from the /var/lib/cfengine2/masterfiles/PROD tree on the cfengine master, you'll get errors like this:

# cd PROD
# touch foo
# svn add foo
# svn commit -m"this shouldn't work"
svn: Commit failed (details follow):
svn: MKACTIVITY of '/svn/cfengine/!svn/act/7900b02f-eb97-4954-8ea6-8645e662404e':
403 Forbidden (https://svn.campin.net)

We got this error because we checked out the tree as the readonly user, and that user lacks the privileges to check back in. We could of course specify the --username=nate argument to the Subversion client, which would allow the check-in, but that's bad practice. We want to carefully test our changes and have clients see our modifications only once we've committed them. We also ensure that all files are properly checked into version control this way. We don't want administrators to copy files manually into the PROD tree, because if the cfengine server fails and we restore from Subversion, we would be missing some of our configuration files! We need to avoid this at all costs. Always update the /var/lib/cfengine2/masterfiles/PROD tree only with Subversion updates.

Another risk from working directly on the live PROD tree is that we might accidentally save an unfinished version of a file such as cfagent.conf without meaning to, and clients would copy it on their next run. If we're working on an offline working copy, we don't have to worry about such accidents. Start developing good habits now.
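
Because the live tree should change only through svn update, it can be useful to check it for files that were edited or dropped in by hand. The sketch below is one possible approach, not something we set up in this chapter. It reads svn status output and prints any entry whose first column marks an unversioned (?), modified (M), added (A), or deleted (D) item. The function name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: filter `svn status` output (read from stdin) down to
# entries that are unversioned (?), modified (M), added (A), or deleted (D).
# Any such line in the live tree means it was changed outside Subversion.
report_local_drift() {
    grep -E '^[?MAD]' || true   # `|| true`: finding no matches is not an error
}

# Intended use on the cfengine master (commented out; needs the svn client):
# svn status /var/lib/cfengine2/masterfiles/PROD | report_local_drift
```

An empty result suggests the live tree matches what was last checked out from the repository.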

We would like to know when changes are checked into the repository so that we see when other administrators make changes that might affect work we're doing or catch errors in their changes. Subversion has a feature called hooks that allows scripts to run when different repository actions happen. You can see the actions for which hooks are supported by inspecting the template hook scripts that the svnadmin create command placed in the hooks subdirectory in the cfengine repository:

# ls /var/svn/repository/cfengine/hooks/
post-commit.tmpl  post-revprop-change.tmpl  pre-commit.tmpl  pre-revprop-change.tmpl
start-commit.tmpl  post-lock.tmpl  post-unlock.tmpl  pre-lock.tmpl pre-unlock.tmpl

Inspect the hook template files themselves to see what actions are supported.
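
Each template corresponds to one hook. Subversion runs a hook only when an executable file with the template's name, minus the .tmpl suffix, exists in the hooks directory. As a small sketch (the function name is ours, not a Subversion command), the hook names can be listed by stripping that suffix:

```shell
#!/bin/sh
# Hypothetical helper: print the hook name for every *.tmpl file in the
# hooks directory given as the first argument.
list_hook_names() {
    for t in "$1"/*.tmpl; do
        [ -e "$t" ] || continue     # skip the literal pattern if nothing matched
        basename "$t" .tmpl
    done
}

# Intended use on the Subversion server (commented out):
# list_hook_names /var/svn/repository/cfengine/hooks
```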

To get e-mail notifications when a change is committed to our cfengine Subversion repository, we'll place a shell script at the location PROD/repl/root/var/svn/repository/cfengine/hooks/post-commit.

But wait! We can't do this directly on the cfengine master any longer. We'll need to check out our own personal working copy of the cfengine repository. We logged into the system goldmaster as our own user account and checked out a working copy with these commands:

$ cd ~/
$ svn co https://svn.campin.net/svn/cfengine/masterfiles

These commands will give us a working copy at ~/masterfiles. Since our home directory is shared via NFS, we can work on our working copy of the cfengine masterfiles tree from any host on the network that we choose. All of our systems have a Subversion client, so it's a matter of personal preference.

It turns out to have been highly useful that so far in this book we have referenced the path to files in the cfengine masterfiles tree as relative to the PROD directory, because from now on, the files we're working with will be a working copy, and the base directory will be different for every user. We'll continue referring to files as relative to the PROD directory. When we start working with the DEV and STAGE directory trees, we'll also refer to files and directories stored within as relative to those directories.

Now, we need to create the file PROD/repl/root/var/svn/repository/cfengine/hooks/post-commit in our working copy of the cfengine tree. Create the directories with these commands:

$ cd ~/masterfiles/PROD
$ mkdir -p repl/root/var/svn/repository/cfengine/hooks

Then, create the file PROD/repl/root/var/svn/repository/cfengine/hooks/post-commit with these contents:

#!/bin/sh
# Subversion passes the repository path and the new revision number
REPOS="$1"
REV="$2"
# Look up the log message and author of the commit that triggered the hook
LOG=$(/usr/bin/svnlook log -r "$REV" "$REPOS")
AUTHOR=$(/usr/bin/svnlook author -r "$REV" "$REPOS")

/usr/bin/svnnotify --repos-path "$REPOS" --revision "$REV" --with-diff \
    --to [email protected] --from "$AUTHOR" --reply-to [email protected] --subject-prefix \
    "[CFENGINE SVN]" --subject-cx --no-first-line

To have the svnnotify program on our Debian-based Subversion server, we'll need to install the package libsvn-notify-perl. We added it to the file /srv/fai/config/package_config/WEB on goldmaster (our FAI installation host) back in Chapter 8, so it is already installed.

Next, we needed to check our new hook script into Subversion. The way we handled this was to add the highest-level directory in the tree that wasn't yet checked in:

$ svn add repl/root/var/svn
A         repl/root/var/svn
A         repl/root/var/svn/repository
A         repl/root/var/svn/repository/cfengine
A         repl/root/var/svn/repository/cfengine/hooks
A         repl/root/var/svn/repository/cfengine/hooks/post-commit
$ svn commit

When you type svn commit without the -m argument (which automatically submits the log entry), you're dropped into an editor that allows you to enter a comment for the commit, along with the files (and directories, if applicable) that are being modified in the current commit. The editor screen for the preceding commit looked like this:

--This line, and those below, will be ignored--

A    PROD/repl/root/var/svn
A    PROD/repl/root/var/svn/repository
A    PROD/repl/root/var/svn/repository/cfengine
A    PROD/repl/root/var/svn/repository/cfengine/hooks
A    PROD/repl/root/var/svn/repository/cfengine/hooks/post-commit

The cursor was at the top of the screen, where the comments belong. We could see that all of our new directories were being committed, along with our new file. We entered a comment about how this is to enable notifications for cfengine repository commits, saved the file, and saw this Subversion client output:

"svn-commit.tmp" 8L, 346C written
Adding         PROD/repl/root/var/svn
Adding         PROD/repl/root/var/svn/repository
Adding         PROD/repl/root/var/svn/repository/cfengine
Adding         PROD/repl/root/var/svn/repository/cfengine/hooks
Adding         PROD/repl/root/var/svn/repository/cfengine/hooks/post-commit
Transmitting file data .
Committed revision 4.

Now, we need to set up a task to copy out our new hook script. We'll set it up as a recursive copy of a hooks directory, even though we currently have only one hook script. This will allow us to develop other hook scripts later and simply place them into a directory in the masterfiles tree to have the new hook script automatically copied to the Subversion server by cfengine.

We created a task at PROD/inputs/tasks/app/svn/cf.copy_hooks with these contents:

copy:
        svn_server.debian::
                $(master)/repl/root/var/svn/repository/cfengine/hooks
                                dest=/var/svn/repository/cfengine/hooks
                                mode=555
                                r=inf
                                purge=false
                                owner=www-data
                                group=www-data
                                type=checksum
                                server=$(fileserver)
                                encrypt=true

We then added this new task to the repository as follows:

$ svn add inputs/tasks/app/svn/cf.copy_hooks
A         inputs/tasks/app/svn/cf.copy_hooks
$ svn commit -m"added task to copy out the cfengine svn repo hooks directory"
Adding         PROD/inputs/tasks/app/svn/cf.copy_hooks
Transmitting file data .
Committed revision 5.

We still need to activate this task by importing it, so we added this line to PROD/inputs/hostgroups/cf.svn_server:

tasks/app/svn/cf.copy_hooks

We issued the svn commit command again, and now, our Subversion repository should have all the changes required to copy out our new hook to our Subversion repository. We still need to update our live masterfiles/PROD working copy on the cfengine master. As root on goldmaster, we issued these commands:

# cd /var/lib/cfengine2/masterfiles/PROD
# svn update
A    inputs/hostgroups/cf.svn_server
A    inputs/tasks/app/svn/cf.copy_hooks
A    repl/root/var/svn
A    repl/root/var/svn/repository
A    repl/root/var/svn/repository/cfengine
A    repl/root/var/svn/repository/cfengine/hooks
A    repl/root/var/svn/repository/cfengine/hooks/post-commit
Updated to revision 6.

Now, we just need to wait for cfengine to run again on etchlamp (the Subversion server) so that it gets the new hook script. After the next cfagent run (it runs every 20 minutes at our site), we committed a new version of PROD/inputs/hostgroups/cf.svn_server with a blank line added to the end, just to test the notifications. We got this e-mail shortly thereafter:

From:  [email protected]
To:      [email protected]
Date:   Mon, Sep 1, 2008 at 2:21 AM
Subject: [CFENGINE SVN] [7] masterfiles/PROD/inputs/hostgroups/cf.svn_server

Revision: 7
Author:   nate
Date:     2008-09-01 02:21:26 -0700 (Mon, 01 Sep 2008)

Log Message:
-----------
just a blank line to test email notifications

Modified Paths:
--------------
   masterfiles/PROD/inputs/hostgroups/cf.svn_server
Modified: masterfiles/PROD/inputs/hostgroups/cf.svn_server
===================================================================
--- masterfiles/PROD/inputs/hostgroups/cf.svn_server 2008-09-01 09:16:24 UTC (rev 6)
+++ masterfiles/PROD/inputs/hostgroups/cf.svn_server 2008-09-01 09:21:26 UTC (rev 7)
@@ -8,3 +8,4 @@
               tasks/app/svn/cf.setup_svn_plus_apache
               tasks/app/svn/cf.copy_hooks
+

The output displays our new blank line with a simple plus sign, followed by nothing (nothing but a newline character, of course).

You can see how useful these e-mail notifications will be when multiple people are committing to the repository. They can also be used for peer review of changes. Standard practice at your site could be to have a meeting where all commits are peer reviewed before the working production copy is updated with the changes committed to the repository.

The major problem with such a system is that there is no mechanism set up to test the changes before they are pushed to the live environment. A typographical error can easily be missed during peer review, causing cfengine to fail to execute properly on all hosts at our site. Clearly a better mechanism is needed. In the next section, we'll explore a way to try out our changes in a nonproduction environment.

Using Subversion to Implement a Testing Environment

We initially set up our cfengine clients to use files under the PROD directory. In this section, we'll start to make use of the DEV directory, which is at the same level as PROD in the masterfiles tree.

To create a new branch in the repository, simply use the svn copy command with two URL paths in the repository. First, we made sure the repository has the required base paths; then, we created the branch:

$ cd ~/masterfiles/
$ mkdir -p DEV/branches
$ svn add DEV
$ svn commit -m"creating DEV/branches directory structure"
Adding         DEV
Adding         DEV/branches
Committed revision 8.
$ svn copy https://svn.campin.net/svn/cfengine/masterfiles/PROD \
    https://svn.campin.net/svn/cfengine/masterfiles/DEV/branches/1 \
    -m"creating the first cfengine development branch"

Committed revision 9.

Now, we have a branch for development at DEV/branches/1 inside the repository. In order to work with it, we'll need to check it out:

$ cd ~/masterfiles/DEV/branches/
$ svn co https://svn.campin.net/svn/cfengine/masterfiles/DEV/branches/1
A    1/inputs
A    1/inputs/cfservd.conf
A    1/inputs/control
A    1/inputs/control/cf.friendstatus
A    1/inputs/control/cf.control_cfagent_conf
A    1/inputs/control/cf.control_cfexecd
A    1/inputs/cf.preconf
A    1/inputs/ignore
A    1/inputs/ignore/cf.ignore
...output truncated...

Note that inside the repository the branches don't take up much extra space. Subversion has a cheap copy mechanism where branches are really more like hard links to the original copy. The branch really only starts taking up space as it is modified and added to. Be aware that our checkout of the branch does take up the full amount of space in our local filesystem.

Creating arbitrarily named branches in the version repository under DEV is fine. We'll be able to check out multiple trees under DEV on the cfengine master and point clients at any tree of our choosing. Let's set up that branch now. On the cfengine master host (as the root user), check out the new development branch to the live tree where cfengine clients pull files:

# pwd
/var/lib/cfengine2/masterfiles/PROD
# cd ../DEV/
# mkdir branches
# cd branches/
# svn --username=readonly co \
    https://svn.campin.net/svn/cfengine/masterfiles/DEV/branches/1

Now that we have a development tree available on the cfengine master, we need a nonproduction host to use it on. We don't have any hosts that aren't important to our network, or more importantly to our business, so we'll image a new one. We'll call it ops1, meaning that it belongs to the operations team, and use it for testing. We'll create a Debian i686 host, since that's what we use for most of our system roles at this point.

Here are the summarized steps to create the new Debian host:

  1. Add entries for the new host to the DNS. We created a forward entry in the file db.campin.net and a reverse entry in the file db.192.168. As is now the norm, we had to commit the changes to Subversion and update the Subversion working copy on the cfengine master.
  2. Set up FAI on goldmaster to image the host, which means adding an entry to boot the new host in /etc/dhcp3/dhcpd.conf and running the command fai-chboot -IFv ops1.
  3. Image the new host. We don't need to do anything custom to it at this point, so we didn't add it to any special classes in FAI. We want it to be a very basic system.

Now, we needed to change some core files in cfengine in order to have ops1 utilize the DEV tree. In PROD/inputs/update.conf, we added these lines to the top:

classes:
        dev_servers             = (     ops1
                                        )

Then, we added these lines to the control section in PROD/inputs/update.conf:

any::
        AllowRedefinitionOf  = ( branch )
        branch          = ( PROD )

dev_servers::
        branch          = ( "DEV/branches/1" )

any::
        master_cfinput = ( /var/lib/cfengine2/masterfiles/$(branch)/inputs )

and we removed this line from PROD/inputs/update.conf:

master_cfinput = ( /var/lib/cfengine2/masterfiles/PROD/inputs )

In PROD/inputs/control/cf.control_cfagent_conf, the section that looked like this

        branch          = ( PROD )
        master_cfinput = ( /var/lib/cfengine2/masterfiles/$(branch)/inputs )

became this:

        AllowRedefinitionOf  = ( branch )
        branch          = ( PROD )

dev_servers::
        branch          = ( "DEV/branches/1" )

any::
        master_cfinput = ( /var/lib/cfengine2/masterfiles/$(branch)/inputs )
        master

And in PROD/inputs/classes/cf.main_classes, we added these lines:

dev_servers             = (     ops1
                                )
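
The effect of these definitions is easiest to see by tracing which path a given host ends up with. The following shell sketch only mimics the logic for illustration; it is not cfengine syntax, and the function names are ours. It assumes that the dev_servers class contains only ops1, as defined above:

```shell
#!/bin/sh
# Illustration only: mimic how the branch variable is chosen per host.
# Hosts in the dev_servers class (just ops1 here) get the DEV branch;
# every other host keeps the PROD default.
branch_for_host() {
    case "$1" in
        ops1) echo "DEV/branches/1" ;;
        *)    echo "PROD" ;;
    esac
}

# Build the master_cfinput path the same way the control section does.
master_cfinput_for_host() {
    echo "/var/lib/cfengine2/masterfiles/$(branch_for_host "$1")/inputs"
}

master_cfinput_for_host ops1
master_cfinput_for_host etchlamp
```

The first call prints the DEV/branches/1 inputs path and the second prints the PROD inputs path, which is the split we want between the test host and everything else.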

After all those updates were completed, we checked in the changes:

$ svn commit -m"support the DEV branch for the system ops1"
Sending        inputs/classes/cf.main_classes
Sending        inputs/control/cf.control_cfagent_conf
Sending        inputs/update.conf
Transmitting file data ...
Committed revision 11.

We're all set. Update the cfengine master with svn update in the PROD tree, and now ops1 should be using the DEV/branches/1 tree. There is one problem: the DEV tree hasn't been updated to point ops1 at itself! This is the perfect opportunity to perform our first merge in Subversion.

First, in our working copy, we ran svn log from the DEV/branches/1 directory to note the revision at which we created the branch (revision 9):

$ cd ~/masterfiles/DEV/branches/1
$ svn log
------------------------------------------------------------------------
r9 | nate | 2008-09-01 03:27:26 -0700 (Mon, 01 Sep 2008) | 1 line

creating the first cfengine development branch
------------------------------------------------------------------------

The svn log output went on, but the first entry was the important one, because it was the last time that the branch was updated. The history beyond that point is actually the history of the PROD branch, because that's where the DEV branch was copied from. Up until that point there was only the PROD branch. We'll want everything done to the production branch from that point forward to be applied to the DEV branch—synchronizing the two branches completely.

We then changed directory to the PROD directory to gather the latest revision of the PROD branch, since we'll want to apply everything done to the PROD branch since revision 9 back to the DEV branch. Then, we ran a merge as a dry run to see the files that have changed and would be merged. The commands to do this follow:

$ cd ~/masterfiles/PROD/
$ svn status -u
Status against revision:     11
$ cd ../DEV/branches/1
$ svn merge --dry-run -r 9:11 https://svn.campin.net/svn/cfengine/masterfiles/PROD
U    inputs/control/cf.control_cfagent_conf
U    inputs/update.conf
U    inputs/classes/cf.main_classes
U    repl/root/etc/bind/debian-ext/db.campin.net

This looks good, since those are the files with changes that need to be migrated over to the DEV/branches/1 branch. We now need to go ahead and perform the merge against our working copy and inspect the changes:

$ svn merge -r 9:11 https://svn.campin.net/svn/cfengine/masterfiles/PROD

We inspected the changed files, and the expected changes are there. The merge itself only modifies the working copy, so the log message belongs on the commit. We'll commit our development branch with svn commit -m"merging revisions 9-11 from PROD into DEV" and update the DEV/branches/1 tree on the cfengine master.

When merging, be sure to specify the revisions you're merging in the commit message, so that later, when you merge again, you can find the revision at which to start your new merge. You don't ever want to attempt to merge the same changes twice. The lack of detection and prevention of duplicated merges is an acknowledged weak spot in Subversion, and you don't want to get caught by it if you can avoid it.
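
If commit messages follow a fixed pattern such as the "merging revisions 9-11 from PROD into DEV" message used above, the end of the last merged range can be pulled out of svn log output mechanically. The sketch below assumes that exact wording and that svn log prints the newest entry first; the function name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read `svn log` output on stdin and print the last
# revision of the most recent "merging revisions N-M" message.
# Assumes newest-first log order and the message wording used in this chapter.
last_merged_revision() {
    sed -n 's/^merging revisions [0-9][0-9]*-\([0-9][0-9]*\) .*/\1/p' | head -n 1
}

# Intended use in the branch working copy (commented out; needs svn):
# svn log | last_merged_revision
```

The number printed is the revision to use as the start of the next -r START:END range, since the previous merge already covered everything up to it.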

Our host ops1 is now utilizing a completely separate tree on the cfengine master, using a Subversion tree that we can leverage to share code between development and production. Setting up more hosts to use the DEV tree is as simple as adding hosts to the dev_servers class in update.conf and inputs/classes/cf.main_classes in both the PROD and DEV lines of development.

To make full use of the DEV tree, you'll want to specify a testing host for all of the production roles that you're using in the PROD tree, some of which follow:

  • Debian Subversion, Nagios, and Ganglia web host
  • Solaris NFS home directory server
  • Red Hat public web server
  • Debian DNS server
  • Debian mail relay

Since we don't ever specify hostnames in the cfengine tasks, it's simply a matter of redefining some group memberships in the DEV/branches/1/inputs/classes/cf.main_classes file for testing purposes. Notice how abstracting the hostnames away from role names helps in yet another way. We're now free to test out entirely new DNS mechanisms or change anything else in our development environment, with no effect on production. Additionally, setting up virtual hosts under a system such as VMware can help ensure that not a lot of extra hardware is needed for testing purposes.

Note that we didn't cover usage of the STAGE directory tree. Our network is still small enough that we're not making use of that tree yet. The idea is that once our network is large enough, we'll have separate hosts for testing configurations once they come out of the initial development phase. Some changes might need days or weeks before they are approved for promotion to the main production branch. You can always use the DEV tree this way as well, but it's useful to give it a descriptive name such as STAGE if you intend to use it as a longer-term testing ground.

The usage of the STAGE tree will technically be identical to usage of the DEV tree. It is the policies around usage that will differ, and those need to be defined on a per-site basis.

Backups

A substantial amount of work has now been put into our cfengine master, as well as our three imaging systems. Since we set up Kickstart, Jumpstart, and FAI before we had cfengine managing our systems, we have no backups of those systems. In addition, we need to back up our cfengine Subversion repository. If we had automated the setup of the configuration of all three imaging system hosts with cfengine, we would need to back up only the Subversion repository.

We would like to have to back up only the Subversion repository. This would mean that all of the configuration at our site is performed via cfengine, which is how we want things. To use cfengine to perform all configuration at our site, we should go back and automate the setup of our imaging systems as much as possible and then only back up Subversion.

The automation of our imaging systems would include neither the Kickstart and Jumpstart process of copying the installation image(s) to disk (setup_install_server for Jumpstart and the DVD copy to /kickstart/rhel_5_2 on the Kickstart host) nor the installation client setup for those systems. We're looking to automate the synchronization of files that we had to manually create or manually edit.

Backing up only the Subversion repository obviously won't work for application data backups, but at this point, we don't have any application data to be concerned about. When we need to worry about application logs or other variable data, we'll want to investigate an open source backup solution such as Amanda or a commercial backup product such as Veritas NetBackup.
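
As a starting point for the repository backup itself, the svnadmin dump subcommand writes a repository's full history to standard output, which can be compressed and copied off the host. The sketch below is one minimal approach rather than a procedure we put in place in this chapter. The backup directory /var/backups/svn and the function names are assumptions, while the repository path is the one used on etchlamp:

```shell
#!/bin/sh
# Assumed location for dump files; choose a path that is copied off-host.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/svn}"

# Build a dated dump file name for a repository path, for example
# /var/svn/repository/cfengine -> $BACKUP_DIR/cfengine-2008-09-01.dump.gz
dump_file_for() {
    echo "$BACKUP_DIR/$(basename "$1")-$2.dump.gz"
}

# Dump the whole repository history and compress it.
# `svnadmin dump` writes the dump stream to stdout; -q silences progress.
backup_repo() {
    repo="$1"
    out=$(dump_file_for "$repo" "$(date +%Y-%m-%d)")
    mkdir -p "$BACKUP_DIR" && svnadmin dump -q "$repo" | gzip > "$out"
}

# Intended use as root on the Subversion server (commented out):
# backup_repo /var/svn/repository/cfengine
```

A dump produced this way can be read back into a freshly created, empty repository with svnadmin load.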

First, let's grab the important configuration files from our imaging systems, check them into Subversion, and distribute the files using cfengine.

Jumpstart

Jumpstart is great in that the setup is done entirely via scripts contained on the installation media. We don't need to worry about backing up most of the files in the /jumpstart directory tree. All we'll need to copy using cfengine is the /jumpstart/profiles/ directory. Everything else that we need to re-create a functional Jumpstart server is contained in Chapter 6. Those steps don't lend themselves well to automation, since the steps to re-create the Jumpstart environment depend on having some form of installation media available, which could be a series of CDs, a DVD, or an ISO file.

We copied the /jumpstart/profiles directory from our Jumpstart server hemingway into our working copy:

$ cd ~/masterfiles/PROD/repl/root/
$ mkdir jumpstart
$ scp -r root@hemingway:/jumpstart/profiles jumpstart/

Then, we added the jumpstart directory to the cfengine Subversion repository:

$ svn add jumpstart/
A         jumpstart
A         jumpstart/profiles
A         jumpstart/profiles/aurora
A         jumpstart/profiles/aurora/sysidcfg
A         jumpstart/profiles/aurora/finish_install.sh
A         jumpstart/profiles/aurora/rules
A         jumpstart/profiles/aurora/rules.ok
A         jumpstart/profiles/aurora/basic_prof
A         jumpstart/profiles/jumpstart_sample
A         jumpstart/profiles/jumpstart_sample/any_machine
A         jumpstart/profiles/jumpstart_sample/check
A         jumpstart/profiles/jumpstart_sample/host_class
A         jumpstart/profiles/jumpstart_sample/net924_sun4c
A         jumpstart/profiles/jumpstart_sample/rules
A         jumpstart/profiles/jumpstart_sample/set_nfs4_domain
A         jumpstart/profiles/jumpstart_sample/set_root_pw
A         jumpstart/profiles/jumpstart_sample/upgrade
A         jumpstart/profiles/jumpstart_sample/x86-begin
A         jumpstart/profiles/jumpstart_sample/x86-begin.conf
A         jumpstart/profiles/jumpstart_sample/x86-begin.conf/OWconfig
A         jumpstart/profiles/jumpstart_sample/x86-begin.conf/msm.conf
A         jumpstart/profiles/jumpstart_sample/x86-class

After that, we needed to distribute the profiles directory to the Jumpstart host. We created a class in cfengine for the role jumpstart_server, and added hemingway to that class. We used the class in a task located at PROD/inputs/tasks/app/jumpstart/cf.copy_jump_profiles with these contents:

copy:
        jumpstart_server::
                $(master)/repl/root/jumpstart/profiles
                                dest=/jumpstart/profiles
                                mode=755
                                r=inf
                                owner=root
                                group=root
                                type=checksum
                                server=$(fileserver)
                                encrypt=true

directories:
        jumpstart_server::
                /jumpstart/profiles mode=755 owner=root group=root inform=false

We copy all the files with mode 755, since some of them need to be executable. It won't hurt anything if they're all executable; just be aware that the executable bit being set in this directory doesn't mean that the file is necessarily a script.

We then added the PROD/inputs/tasks/app/jumpstart directory to the Subversion repository with this command:

$ pwd
/home/nate/masterfiles/PROD/inputs/tasks/app
$ svn add jumpstart/
A         jumpstart
A         jumpstart/cf.copy_jump_profiles

Next, we added this line to PROD/inputs/classes/cf.main_classes to create the new class:

jumpstart_server        = ( hemingway )

We then created a hostgroup file for the jumpstart_server class, with a new file at the location PROD/inputs/hostgroups/cf.jumpstart_server with these contents:

import:
        any::
                tasks/app/jumpstart/cf.copy_jump_profiles

Be sure to svn add the cf.jumpstart_server file into the repository.

As usual, the last step is to set up the cfengine import of this hostgroup file in the hostgroup mapping file at PROD/inputs/hostgroups/cf.hostgroup_mappings. We added this line:

jumpstart_server::              hostgroups/cf.jumpstart_server

Since this was all done in our working copy, we needed to check in all these changes.

$ cd ~/masterfiles/PROD/
$ svn commit -m "set up the copy of the jumpstart profiles directory to the jumpstart host"

We then checked them out on the cfengine master with the svn update command in the PROD directory.

We should now be able to restore what we need, if and when the hemingway host dies and is subsequently reinstalled. All the rest of the configuration on the host is easily reproducible, simply by referring to the Jumpstart section in Chapter 6.

Kickstart

To distribute the files that would need restoration if the Kickstart host is rebuilt, we first copied them into our cfengine repository working copy:

$ cd ~/masterfiles/PROD/repl/root/
$ mkdir -p kickstart/rhel5_2
$ scp -r root@rhmaster:/kickstart/cfengine-2.2.7 kickstart/
$ scp -r root@rhmaster:/kickstart/scripts kickstart/
$ scp  root@rhmaster:/kickstart/rhel5_2/ks.cfg kickstart/rhel5_2/
$ svn add kickstart

After that, we needed to copy out these files to the /kickstart directory on the host rhmaster using cfengine. Once again in our working copy, we created the directory PROD/inputs/tasks/app/kickstart, and created a task in the directory called cf.copy_kickstart_dir with these contents:

copy:
        kickstart_server::
                $(master)/repl/root/kickstart
                                dest=/kickstart
                                mode=755
                                r=inf
                                owner=root
                                group=root
                                type=checksum
                                server=$(fileserver)
                                encrypt=true

directories:
        kickstart_server::
                /kickstart mode=755 owner=root group=root inform=false

We added the PROD/inputs/tasks/app/kickstart directory to Subversion with svn add once we had the task file inside it. Next, we needed to do the usual steps in order to make this task get used by our Kickstart server. Here's a summary of the steps:

  1. Create the kickstart_server class in PROD/inputs/classes/cf.main_classes.
  2. Create the hostgroup file at PROD/inputs/hostgroups/cf.kickstart_server that imports the cf.copy_kickstart_dir task. Add the file to the Subversion repository.
  3. Set up the hostgroup import in the hostgroup mapping file PROD/inputs/hostgroups/cf.hostgroup_mappings.
  4. Commit the changes to your working copy, and update the production working copy on the cfengine master.
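
These four steps mirror the Jumpstart walkthrough earlier in this section. As a rough sketch, the commands below write the three cfengine fragments into a scratch directory so that the contents can be inspected. The scratch path and the use of here-documents are illustrative assumptions; in practice you would edit the files in your working copy under PROD/inputs, then run svn add and svn commit.

```shell
#!/bin/sh
# Sketch only: builds the three fragments in a scratch tree, not in a
# real working copy. SCRATCH is a hypothetical location.
SCRATCH=$(mktemp -d)
mkdir -p "$SCRATCH/classes" "$SCRATCH/hostgroups"

# Step 1: define the class (rhmaster is the Kickstart host).
echo 'kickstart_server        = ( rhmaster )' \
        >> "$SCRATCH/classes/cf.main_classes"

# Step 2: hostgroup file that imports the copy task.
cat > "$SCRATCH/hostgroups/cf.kickstart_server" <<'EOF'
import:
        any::
                tasks/app/kickstart/cf.copy_kickstart_dir
EOF

# Step 3: map the class to the hostgroup file.
echo 'kickstart_server::              hostgroups/cf.kickstart_server' \
        >> "$SCRATCH/hostgroups/cf.hostgroup_mappings"

# Step 4 (not run here): svn add the new file, svn commit, then
# svn update in the production tree on the cfengine master.
echo "fragments written under $SCRATCH"
```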

Now our important Kickstart files are contained in Subversion and will be restored by cfengine via a copy if we ever have to rebuild our Kickstart server.

FAI

When we set up FAI, we were careful to modify the default FAI configuration files as little as possible. We wanted to be able to push new files as much as possible, since we knew that we would want to distribute those files using cfengine later on.

We collected all the files under the /srv/fai/config directory that we modified or added back in Chapter 6 in our working copy of the repository:

$ pwd
/home/nate/masterfiles/PROD/repl/root/srv/fai/config
$ ls -R
.:
./  ../  class/  disk_config/  files/  hooks/  package_config/  scripts/

./class:
./  ../  60-more-host-classes*  FAIBASE.var

./disk_config:
./  ../  LOGHOST  WEB

./files:
./  ../  etc/

./files/etc:
./  ../  cfengine/

./files/etc/cfengine:
./  ../  cfagent.conf/  update.conf/

./files/etc/cfengine/cfagent.conf:
./  ../  FAIBASE*

./files/etc/cfengine/update.conf:
./  ../  FAIBASE*

./hooks:
./  ../  savelog.LAST.source*

./package_config:
./  ../  FAIBASE  LOGHOST  WEB

./scripts:
./  ../  FAIBASE/

./scripts/FAIBASE:
./  ../  50-cfengine*  60-create-cf-config*

We'll distribute all these as another recursive copy, this time into the /srv/fai/config directory on the FAI server (goldmaster). We have some additional files that we modified during the setup of our FAI server:

  • /etc/fai/make-fai-nfsroot.conf
  • /etc/dhcp3/dhcpd.conf
  • /etc/inetd.conf

There is a problem with /etc/inetd.conf: in the task PROD/inputs/tasks/app/rsync/cf.enable_rsync_daemon, we add a line to /etc/inetd.conf using the editfiles action. This editfiles action must be changed or removed, since it makes no sense to have an editfiles action acting on a file that cfengine is also copying out. Two scenarios could result, depending on the contents of the inetd.conf file that cfengine copies into place:

  • The copied /etc/inetd.conf file won't have the entry that the task cf.enable_rsync_daemon is looking for, and it will be added by the editfiles action. This means that the next time cfengine runs, /etc/inetd.conf won't match the checksum of the file in the masterfiles tree, and inetd.conf will be copied again. After that, the editfiles action will once again notice that the required entry isn't there, and it will add it yet again. This loop will continue on every time cfengine runs.
  • The copied /etc/inetd.conf file will already have the required entry, making the editfiles action unnecessary.

You can see that, either way, we don't need the editfiles action. It either produces what we can only consider an error by constantly changing the file, or it is totally unneeded. We'll simply place the required entry in the inetd.conf file that we copy out and remove the editfiles section from the cf.enable_rsync_daemon task. We will add a comment to the task, however, stating that enabling the daemon is handled via a static file copy in another task, and we'll provide that task's file name in the comment.
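
The first scenario is easy to reproduce outside cfengine. The sketch below stands in for the two actions with plain shell: a copy that fires whenever the local file differs from the master, and an edit that appends a line when it is missing. The file names and the rsync entry are placeholders, not the contents of our real inetd.conf.

```shell
#!/bin/sh
# Simulation only: cmp/cp stands in for the checksum copy, grep/echo
# for the editfiles action. All paths and ENTRY are hypothetical.
WORK=$(mktemp -d)
MASTER="$WORK/master_inetd.conf"
LOCAL="$WORK/inetd.conf"
ENTRY='rsync stream tcp nowait root /usr/bin/rsync rsyncd --daemon'

run_once() {
        # "copy": replace the local file if it differs from the master
        if ! cmp -s "$MASTER" "$LOCAL" 2>/dev/null; then
                cp "$MASTER" "$LOCAL"
                COPIES=$((COPIES + 1))
        fi
        # "editfiles": append the entry if it is missing
        if ! grep -qF -- "$ENTRY" "$LOCAL"; then
                echo "$ENTRY" >> "$LOCAL"
                EDITS=$((EDITS + 1))
        fi
}

simulate() {
        # Start from a fresh host and count the work done in three runs.
        rm -f "$LOCAL"
        COPIES=0
        EDITS=0
        for run in 1 2 3; do
                run_once
        done
        echo "copies=$COPIES edits=$EDITS"
}

echo '# placeholder inetd.conf' > "$MASTER"
echo "entry missing from master: $(simulate)"

echo "$ENTRY" >> "$MASTER"
echo "entry present in master:   $(simulate)"
```

With the entry missing from the master, every run both copies and edits. With the entry present, the file is copied once and then left alone, which is the behavior we want.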

After editing the PROD/inputs/tasks/app/rsync/cf.enable_rsync_daemon task to comment out the editfiles section and add the new comment, we placed these files into our working copy of the cfengine tree:

$ pwd
/home/nate/masterfiles/PROD/repl
$ cp /etc/inetd.conf root/etc/
$ svn add root/etc/inetd.conf
A         root/etc/inetd.conf
$ cp /etc/fai/make-fai-nfsroot.conf root/etc/fai/
$ svn add root/etc/fai/make-fai-nfsroot.conf
A         root/etc/fai/make-fai-nfsroot.conf
$ mkdir root/etc/dhcp3
$ cp /etc/dhcp3/dhcpd.conf root/etc/dhcp3/
$ svn add root/etc/dhcp3
A         root/etc/dhcp3
A         root/etc/dhcp3/dhcpd.conf

Note that the copies were local since we were working in our home directory from the goldmaster system itself.

We created a task at PROD/inputs/tasks/app/fai/cf.copy_fai_files with these contents:

control:
        fai_server::
                AddInstallable          = ( restart_inetd restart_dhcpd )

copy:
        fai_server::
                $(master)/repl/root/srv
                                dest=/srv
                                mode=755
                                r=inf
                                owner=root
                                group=root
                                type=checksum
                                server=$(fileserver)
                                encrypt=true

                $(master_etc)/inetd.conf
                                dest=/etc/inetd.conf
                                mode=755
                                owner=root
                                group=root
                                type=checksum
                                server=$(fileserver)
                                encrypt=true
                                define=restart_inetd

                $(master_etc)/fai/make-fai-nfsroot.conf
                                dest=/etc/fai/make-fai-nfsroot.conf
                                mode=755
                                owner=root
                                group=root
                                type=checksum
                                server=$(fileserver)
                                encrypt=true

                $(master_etc)/dhcp3/dhcpd.conf
                                dest=/etc/dhcp3/dhcpd.conf
                                mode=755
                                owner=root
                                group=root
                                type=checksum
                                server=$(fileserver)
                                encrypt=true
                                define=restart_dhcpd

directories:
        fai_server::
                /srv mode=755 owner=root group=root inform=false

shellcommands:
        debian.restart_inetd::
                "/etc/init.d/openbsd-inetd restart" timeout=30 inform=true

        debian.restart_dhcpd::
                "/etc/init.d/dhcp3-server restart" timeout=30 inform=true

We made sure to add the new tasks/app/fai directory to the repository. We need to create the fai_server class, create a hostgroup file for it, and import it in the cf.hostgroup_mappings file. Here's a summary of the steps:

  1. Create the fai_server class in PROD/inputs/classes/cf.main_classes.
  2. Create the hostgroup file at PROD/inputs/hostgroups/cf.fai_server that imports the cf.copy_fai_files task. Add the file to the Subversion repository.
  3. Set up the hostgroup import in the hostgroup mapping file PROD/inputs/hostgroups/cf.hostgroup_mappings.
  4. Commit the changes to your working copy, and update the production working copy on the cfengine master.

Subversion Backups

The procedure to back up a Subversion repository is quite simple. We can use the svnadmin command with the hotcopy argument to properly lock the repository and perform a file-based backup. Backing up this way is much better than performing a cp or rsync copy of the repository files, which might result in a corrupted backup.

Use the command like this:

# svnadmin hotcopy /path/to/repository /path/to/backup-repository

The repository made by svnadmin hotcopy is fully functional; we are able to drop it in place of our current repository should something go wrong. We can create periodic backups of our repository this way and copy the backups to another host on our network or even to an external site.
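
Dropping a hot copy into place could look roughly like the following. This is a sketch under assumptions: the function name restore_repo is ours, Apache (or anything else serving the repository) is stopped first, and the damaged repository is moved aside rather than deleted.

```shell
#!/bin/sh
# Sketch: swap a hot-copy backup into the place of a damaged repository.
# restore_repo is a hypothetical helper, not part of Subversion.
restore_repo() {
        backup=$1       # e.g. /var/backups/cfengine/backup.1
        repo=$2         # e.g. /var/svn/repository/cfengine

        [ -d "$backup" ] || { echo "no backup at $backup" >&2; return 1; }

        # Keep the old repository under a dated name, just in case.
        if [ -e "$repo" ]; then
                mv "$repo" "$repo.broken.$(date +%Y%m%d%H%M%S)" || return 1
        fi

        # -R recurses; -p preserves modes, ownership, and timestamps.
        cp -Rp "$backup" "$repo"
}
```

For example, restore_repo /var/backups/cfengine/backup.1 /var/svn/repository/cfengine, followed by svnadmin verify /var/svn/repository/cfengine to check the result before restarting Apache.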

Be aware that each time a hot copy is made, it will use up the same amount of disk space as the original repository. Backup scripts that make multiple copies using svnadmin hotcopy will need to be careful not to fill up the local disk with backups.
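
One way to guard against this is to compare the repository's size with the free space on the backup filesystem before each hot copy. The helper names below are ours, and the check is only a sketch: it ignores any space that rotating out an old backup is about to free.

```shell
#!/bin/sh
# Sketch: refuse to hot-copy when the backup filesystem lacks room.
# dir_size_kb, free_kb, and has_room are hypothetical helper names.

dir_size_kb() {         # size of a directory tree in kilobytes
        du -sk "$1" | awk '{ print $1 }'
}

free_kb() {             # free kilobytes on the filesystem holding $1
        df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

has_room() {            # usage: has_room NEEDED_KB FREE_KB
        [ "$2" -gt "$1" ]
}

REPO=/var/svn/repository/cfengine
DEST=/var/backups
if [ -d "$REPO" ] && [ -d "$DEST" ]; then
        if has_room "$(dir_size_kb "$REPO")" "$(free_kb "$DEST")"; then
                echo "enough room for a hot copy of $REPO"
        else
                echo "not enough room under $DEST, skipping hot copy" >&2
        fi
fi
```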

We'll create a script at PROD/repl/admin-scripts/svn-backup with these contents (explained section by section):

#!/bin/sh
# This script is tested on Debian Linux only.
PATH=/sbin:/usr/sbin:/bin:/usr/bin:/opt/admin-scripts

SVN_REPOS="/var/svn/repository/binary-server /var/svn/repository/cfengine"

case `hostname` in
etchlamp*)
        echo "This is the host on which to backup the Subversion repo, continuing."
        ;;
*)
        echo "This is NOT the host on which to backup the SVN repo, exiting..."
        exit 1
        ;;
esac

Because the script is copied to all hosts on our network, the case statement makes sure that it runs only on the proper host. Next, we set the backup location and the lock file:

BACKUP_BASE_DIR=/var/backups
LOCKFILE=/root/svn_backup_lock

rm_lock_file() {
        rm -f $LOCKFILE
}

We'll be using file locking to prevent two invocations of this script from running at once.

rotate_backups() {
        BACKUP_DIR_NAME=$1

        if cd $BACKUP_DIR_NAME
        then
                for num in 6 5 4 3 2 1
                do
                        one_more=`expr $num + 1`
                        if [ -d backup.${num} ]
                        then
                                if [ -d backup.${one_more} ]
                                then
                                        rm -rf backup.${one_more} &&
                                        mv backup.${num} backup.${one_more}
                                else
                                        mv backup.${num} backup.${one_more}
                                fi
                        fi
                done
        else
                echo "Can't cd to $BACKUP_DIR_NAME - exiting now"
                rm_lock_file
                exit 1
        fi
}

We wrote a subroutine to manage our stored backup directories. It takes an argument of a repository directory that needs to be backed up, and it moves any numbered backup directories to a new backup directory with the number incremented by one. A backup directory with the number 7 is removed, since we only save seven of them.

For example, the directory /var/backups/binary-server/backup.7/ is removed, and the directory /var/backups/binary-server/backup.6/ is moved to the name /var/backups/binary-server/backup.7/. The subroutine then progresses backward numerically from 5 to 1, moving each directory to another directory with the same name except the number incremented by 1. When it is done, there is no directory named /var/backups/binary-server/backup.1/, which is the directory name we'll use for a new Subversion backup:

# don't ever run two of these at once
lockfile $LOCKFILE || exit 1

for REPO in $SVN_REPOS
do
        SHORTNAME=`basename $REPO`
        BACKUP_DIR="$BACKUP_BASE_DIR/$SHORTNAME"
        [ -d "$BACKUP_DIR" ] || mkdir -p $BACKUP_DIR

        cd $BACKUP_DIR && rotate_backups $BACKUP_DIR

        /usr/bin/svnadmin hotcopy $REPO $BACKUP_DIR/backup.1
done

In this section, we perform these steps:

  1. Retrieve just the short portion of the directory name using the basename command so that the variable SHORTNAME contains the value binary-server or cfengine—the two repository directory names.
  2. We then make sure that the directory used for the backups exists and create it if necessary.
  3. Now that the directory is known to exist, we change directory to the proper backup directory and use our subroutine that rotates the previous backup directories.
  4. Then we use the svnadmin hotcopy command to create a new backup of the repository. This is done for each directory listed in the variable SVN_REPOS.

# if we get here without errors, clean up
rm_lock_file

Finally, we remove the lock file that prevents two copies of the script from running at once. We ran the script eight times in a row to demonstrate the result. Here are the backup directories afterward:

# hostname
etchlamp
# ls -ltr /var/backups/binary-server/
total 28
drwxr-xr-x 7 root root 4096 2008-09-01 23:31 backup.7
drwxr-xr-x 7 root root 4096 2008-09-01 23:31 backup.6
drwxr-xr-x 7 root root 4096 2008-09-01 23:31 backup.5
drwxr-xr-x 7 root root 4096 2008-09-01 23:31 backup.4
drwxr-xr-x 7 root root 4096 2008-09-01 23:31 backup.3
drwxr-xr-x 7 root root 4096 2008-09-01 23:31 backup.2
drwxr-xr-x 7 root root 4096 2008-09-01 23:31 backup.1
# ls -ltr /var/backups/cfengine/
total 28
drwxr-xr-x 7 root root 4096 2008-09-01 23:31 backup.7
drwxr-xr-x 7 root root 4096 2008-09-01 23:31 backup.6
drwxr-xr-x 7 root root 4096 2008-09-01 23:31 backup.5
drwxr-xr-x 7 root root 4096 2008-09-01 23:31 backup.4
drwxr-xr-x 7 root root 4096 2008-09-01 23:31 backup.3
drwxr-xr-x 7 root root 4096 2008-09-01 23:31 backup.2
drwxr-xr-x 7 root root 4096 2008-09-01 23:31 backup.1
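
To see the rotation without touching a real repository, a scratch-directory version can stand in for the hot copy. The sketch below is a simplified restatement of rotate_backups: mkdir replaces svnadmin hotcopy, and the function is named rotate to mark it as illustrative. Eight passes leave seven directories behind.

```shell
#!/bin/sh
# Sandbox demonstration of the rotation; mkdir stands in for
# "svnadmin hotcopy". DEMO_DIR is a throwaway directory.
DEMO_DIR=$(mktemp -d)

rotate() {
        for num in 6 5 4 3 2 1; do
                next=$((num + 1))
                if [ -d "$1/backup.$num" ]; then
                        # Drop whatever occupies the next slot, then shift up.
                        rm -rf "$1/backup.$next"
                        mv "$1/backup.$num" "$1/backup.$next"
                fi
        done
}

pass=1
while [ "$pass" -le 8 ]; do
        rotate "$DEMO_DIR"
        mkdir "$DEMO_DIR/backup.1"
        # Record which pass created this directory.
        echo "$pass" > "$DEMO_DIR/backup.1/pass"
        pass=$((pass + 1))
done

ls "$DEMO_DIR"
echo "oldest kept backup came from pass $(cat "$DEMO_DIR/backup.7/pass")"
```

The directory created on the first pass is the one that gets removed, so the oldest survivor is the one made on the second pass.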

In order to use the lockfile command (contained in the script), the package procmail needs to be installed. Add the string procmail on a line by itself to your working copy of PROD/repl/root/srv/fai/config/package_config/FAIBASE, and check in the modification so that all future hosts get the package installed. For now, just install the procmail package using apt-get on the Subversion server (the system etchlamp).
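
If installing procmail is not an option on some host, a directory can serve as the lock, since mkdir either creates it or fails in a single step. This is a different technique from the lockfile command that the script uses (in particular, it does not wait and retry), and the function names and lock path below are ours.

```shell
#!/bin/sh
# Alternative lock based on mkdir; not what svn-backup itself uses.
# LOCKDIR and the function names are hypothetical.
LOCKDIR=${LOCKDIR:-${TMPDIR:-/tmp}/svn_backup.lock.d}

acquire_lock() {
        # mkdir fails if the directory already exists, so only one
        # caller can succeed at a time.
        mkdir "$LOCKDIR" 2>/dev/null
}

release_lock() {
        rmdir "$LOCKDIR"
}

if acquire_lock; then
        echo "lock taken, the backup would run here"
        release_lock
else
        echo "another backup appears to be running, exiting" >&2
fi
```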

We'll create a task to run the backup script once per day, in a file at the location PROD/inputs/tasks/app/svn/cf.svn_backups with these contents (be sure to add it into the Subversion repository):

shellcommands:
        svn_server.debian.Hr00.Min00_05::
                "/opt/admin-scripts/svn-backup"
                        timeout=600

We're using cfengine to run the backups every day between midnight and five minutes after midnight. Remember that we set a five-minute SplayTime, so cfagent will run at some time in the five minutes after midnight. We need to specify the range so that our shellcommands action will run. The absolute time class of Min00 probably wouldn't match, but the range Min00_05 definitely will.
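
As an illustration of why the range is the safer choice, the function below computes the five-minute range class for a given minute. It assumes, as we understand cfengine's time classes, that the ranges start on multiples of five and are named for their start and end minutes. The function name is ours and is not a cfengine command.

```shell
#!/bin/sh
# Sketch: name of the five-minute range class covering a given minute.
# Assumes ranges begin on multiples of five; min_range_class is a
# hypothetical helper.
min_range_class() {
        minute=${1#0}           # drop one leading zero so 08 is not octal
        minute=${minute:-0}     # a bare "0" becomes empty above
        start=$(( (minute / 5) * 5 ))
        end=$(( (start + 5) % 60 ))
        printf 'Min%02d_%02d\n' "$start" "$end"
}

# A run splayed to 00:03 falls in the Min00_05 range, but its
# exact-minute class is Min03, so a rule restricted to Min00 would
# not fire.
min_range_class 03
min_range_class 00
```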

Now, we need to add this line to PROD/inputs/hostgroups/cf.svn_server:

tasks/app/svn/cf.svn_backups

Commit your changes to the repository, and update the production working copy. Now, every night at midnight, a new backup will be created, and we'll always have seven days' worth of backups on hand.

Copying the Subversion Backups to Another Host

We will copy the Subversion backup directories to another host on our local network using cfengine, so we'll be able to quickly restore our two Subversion repositories if the Subversion server fails.

We'll modify our site's shared cfservd.conf configuration file to grant access to the backup directories on etchlamp from a designated backup host. We will use the cfengine master as the backup host and always keep a complete backup of those directories.

We added these lines to PROD/inputs/cfservd.conf in the admit: section:

etchlamp::
        # Grant access to the Subversion backups to the goldmaster host
         /var/backups/binary-server     192.168.1.249
         /var/backups/cfengine             192.168.1.249

Then, we created a task to copy the directories, the file PROD/inputs/tasks/app/svn/cf.copy_svn_backups with these contents (and we added the file to the repository, of course):

copy:
        fileserver.Hr00.Min20_25::
                /var/backups/cfengine
                                dest=/var/backups/svnbackups/cfengine
                                mode=555
                                r=inf
                                purge=false
                                owner=root
                                group=root
                                type=checksum
                                server=$(svn_server)
                                encrypt=true
                                trustkey=true

                /var/backups/binary-server
                                dest=/var/backups/svnbackups/binary-server
                                mode=555
                                r=inf
                                purge=false
                                owner=root
                                group=root
                                type=checksum
                                server=$(svn_server)
                                encrypt=true
                                trustkey=true

directories:
        policyhost::
                /var/backups/svnbackups/cfengine mode=750
                                owner=daemon group=root inform=false

                /var/backups/svnbackups/binary-server mode=750
                                owner=daemon group=root inform=false

We then added this line to PROD/inputs/control/cf.control_cfagent_conf so that we could abstract the hostname of the Subversion server with a variable:

svn_server      = ( etchlamp.campin.net )

Next, we added a comment to PROD/inputs/classes/cf.main_classes so that this line:

svn_server              = ( etchlamp )

became this:

# we also define svn_server as a variable in the file
# inputs/control/cf.control_cfagent_conf - update that file
# as well if you change the svn_server class below.
svn_server              = ( etchlamp )

We then needed a hostgroup file for the policyhost machine, so we created PROD/inputs/hostgroups/cf.policyhost with these contents:

import:
        any::
                tasks/app/svn/cf.copy_svn_backups

And we added this line to PROD/inputs/hostgroups/cf.hostgroup_mappings:

policyhost::                    hostgroups/cf.policyhost

Commit your changes, and update the production PROD tree on the cfengine master. The next day (after 12:25 a.m.), you should have fully functional Subversion backups stored in the /var/backups/svnbackups/ directory on your cfengine master.

We'll leave the task of copying the backup directories to an offsite host as an exercise for you.

Enhancement Is an Understatement

This chapter took our site from being at high risk from system failure to being a fully version-controlled and backed-up environment.

Many sites that utilize cfengine or other automated management software don't have the ability to easily manage a testing environment such as the one demonstrated here. We have a real advantage in the existence of our DEV cfengine branch, and we should use it as much as possible to try out new configurations and applications.

Our backup measures are certainly minimal, but they're effective. If we suffer total system failure on any of our hosts, including the critical cfengine master, we can restore the system to full functionality.
