Every UNIX-based site requires a similar list of infrastructure services in order to function. All sites need to keep the correct time, route e-mail from system processes (such as cron jobs, and in our case cfexecd
) to the correct place, convert hostnames into IP addresses, and control user accounts.
We think it's only fair to warn you that this chapter won't go into great detail on the protocols and server software that we'll configure. If we had to explain DNS, NTP, SMTP, NFS, the automounter, and UNIX authentication files in great detail, the chapter would never end. Additionally, it would draw focus away from our goal of automating a new infrastructure using cfengine. We'll recommend other sources of information for the protocols and server software as we progress through the chapter.
When we refer to files in the cfengine masterfiles
repository on our central host (goldmaster), we'll use paths relative to /var/lib/cfengine2/masterfiles
. This means that the full path to PROD/inputs/tasks/os/cf.ntp
is /var/lib/cfengine2/masterfiles/PROD/inputs/tasks/os/cf.ntp
.
Many programs and network protocols fail to function properly when the clocks on two systems differ by more than a small amount. The lack of time synchronization can cause extensive problems at a site.
We'll tackle Network Time Protocol (NTP) configuration before any other infrastructure setup tasks. We won't go into the level of detail that you'll want if you're deploying NTP across hundreds or thousands of systems. If that's the case, accept our apologies and proceed over to http://www.ntp.org to browse the online documentation, or head to your nearest bookseller and pick up a copy of Expert Network Time Protocol by Peter Rybaczyk (Apress, 2005).
The fact that we already have six hosts at our example site without synchronized clocks is a potential problem. The cfservd
daemon will refuse to serve files to clients if the clocks on the systems differ by more than one hour. You can turn off this behavior with this setting in cfservd.conf
:
DenyBadClocks = ( false )
It might make sense to turn it off during the initial bootstrapping phase at your site, before you deploy NTP.
NTP is the Internet standard for time synchronization. Interestingly, it's one of the oldest Internet standards still in widespread use. NTP is a mechanism for transmitting Coordinated Universal Time (UTC) between systems on a network. It is up to the local system to determine the local time zone and Daylight Saving Time settings, if applicable. NTP has built-in algorithms for dealing with variable network latency, and can achieve rather impressive accuracy even over the public Internet.
The ntp.org web site has a list of public NTP servers here: http://support.ntp.org/bin/view/Servers/WebHome. These are groups of public NTP servers that use round-robin DNS to enable clients to make a random selection from the group. Both Red Hat and Debian have NTP pools set up this way, and the NTP packages from those distributions utilize these pools by default.
Our intention is to have two of our internal servers synchronize to an external source, and have the rest of our systems synchronize from those two. This is the polite way to utilize a public NTP source: placing as little load as possible on it. We don't want a single system to perform off-site synchronization for our entire network because it becomes a single point of failure. We generally want to set up DNS aliases for system roles such as NTP service, but NTP configuration files use IP addresses. This actually works out well because we have yet to set up internal DNS.
We'll use our cfengine master host (goldmaster.campin.net) and our Red Hat Kickstart system (rhmaster.campin.net) as the two systems that sync to an external NTP source.
Note There is no reason to choose Linux over Solaris systems to handle this role. You should find it quite easy to modify this procedure to use one or more Solaris systems to synchronize off site instead, and have all other systems synchronize to the internal Solaris NTP servers.
The Red Hat system already had ntpd
installed (the ntp
RPM package). If you wish to graphically configure NTP on Red Hat, you'll need to have the system-config-date
RPM installed. Basic NTP configuration is straightforward, so we'll stick with text-based methods of configuration.
The Debian system didn't have the required packages installed, so we used apt-get
to install the ntp
package. We went back to our FAI configuration and added the line ntp
to the file /srv/fai/config/package_config/FAIBASE
so that all future Debian installs have the package by default. Our Kickstart installation process already installs the ntp
RPM, so we don't have to make any Kickstart modifications.
Here is the /etc/ntp.conf
file that we'll use on our systems that synchronize to off-site NTP sources:
driftfile /var/lib/ntp/ntp.drift
statsdir /var/log/ntpstats/
statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable
# pool.ntp.org maps to more than 300 low-stratum NTP servers.
# Your server will pick a different set every time it starts up.
server 0.debian.pool.ntp.org iburst
server 1.debian.pool.ntp.org iburst
server 2.debian.pool.ntp.org iburst
server 3.debian.pool.ntp.org iburst
# By default, exchange time with everybody, but don't allow configuration.
# See /usr/share/doc/ntp-doc/html/accopt.html for details.
restrict -4 default kod notrap nomodify nopeer noquery
restrict -6 default kod notrap nomodify nopeer noquery
# Local users may interrogate the ntp server more closely.
restrict 127.0.0.1
restrict ::1
# allow the local subnet to query us
restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap
Both Red Hat and Debian have a dedicated user to run the NTP daemon process. The user account, named "ntp," will need write access to the /var/lib/ntp
directory.
When you name a subnet using the restrict
keyword and omit the noquery
keyword, the server allows NTP client connections from that subnet.
Now that we have working NTP servers on our network, we need configuration files for the Linux (both Red Hat and Debian) and Solaris systems on our network. We refer to the systems running NTP to synchronize only with internal hosts as NTP "clients."
Solaris 10 NTP Client
You'll find it easy to configure a single Solaris 10 system to synchronize its time using NTP. We will automate the configuration across all our Solaris systems later, but will first test our configuration on a single host to validate it. Simply copy /etc/inet/ntp.servers
to /etc/inet/ntp.conf
, and comment out these lines:
server 127.127.XType.0
fudge 127.127.XType.0 stratum 0
keys /etc/inet/ntp.keys
trustedkey 0
requestkey 0
controlkey 0
Add lines for our internal NTP servers:
server 192.168.1.249
server 192.168.1.251
Create the file /var/ntp/ntp.drift
as root
using the touch
command, and enable the ntp
service:
# touch /var/ntp/ntp.drift
# /usr/sbin/svcadm enable svc:/network/ntp
It's really that easy. Check the /var/log/messages
log file for lines like this, indicating success:
Jul 27 18:05:30 aurora ntpdate[995]: [ID 558275 daemon.notice] adjust time server
192.168.1.249 offset 0.008578 sec
Red Hat and Debian NTP Client
We use the same NTP configuration-file contents for all the remaining Debian and Red Hat hosts at our site, shown here:
driftfile /var/lib/ntp/ntp.drift
statsdir /var/log/ntpstats/
statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable
# By default, exchange time with everybody, but don't allow configuration.
# See /usr/share/doc/ntp-doc/html/accopt.html for details.
restrict -4 default kod notrap nomodify nopeer noquery
restrict -6 default kod notrap nomodify nopeer noquery
# Local users may interrogate the ntp server more closely.
restrict 127.0.0.1
restrict ::1
server 192.168.1.249 iburst # goldmaster.campin.net
server 192.168.1.251 iburst # rhmaster.campin.net
restrict 192.168.1.249 nomodify # goldmaster.campin.net
restrict 192.168.1.251 nomodify # rhmaster.campin.net
You'll notice that these file contents resemble the contents of the configuration file used on the hosts that sync off site. The difference here is that the server
lines point at our two internal NTP servers instead of the public pool, and the new restrict
lines loosen the default restrictions for those servers.
Now we will distribute the NTP configuration file using cfengine, including automatic ntp
daemon restarts when the configuration file is updated. First, put the files into a suitable place in the cfengine master repository (on the host goldmaster):
# cd /var/lib/cfengine2/masterfiles/PROD/repl/root/etc/ntp/
# ls -1
ntp.conf
ntp.conf-master
ntp.server
You might remember that we created the ntp
directory back when we first set up the masterfiles
repository. The ntp.conf-master
file is meant for rhmaster and goldmaster, the hosts that synchronize NTP using off-site sources. The ntp.conf
file is for all remaining Linux hosts, and ntp.server
is our Solaris 10 NTP configuration file.
We'll create a task file at the location PROD/inputs/tasks/os/cf.ntp
on the cfengine master (goldmaster). Once the task is written, we'll import it into the PROD/inputs/hostgroups/cf.any
file for inclusion across our entire site. Here is the task file:
classes: # synonym groups:
ntp_servers = ( rhmaster
goldmaster
)
The classes section defines a simple group of two hosts, the machines that sync off site. Next comes the control section:
control:
any::
AddInstallable = ( restartntpd )
AllowRedefinitionOf = ( ntp_conf_source )
#
# The default ntp.conf doesn't sync off-site
#
ntp_conf_source = ( "ntp.conf" )
linux::
ntp_user = ( "ntp" )
ntp_conf_dest = ( "/etc/ntp.conf" )
drift_file = ( "/var/lib/ntp/ntp.drift" )
solaris|solarisx86::
ntp_user = ( "root" )
ntp_conf_dest = ( "/etc/inet/ntp.conf" )
drift_file = ( "/var/ntp/ntp.drift" )
ntp_conf_source = ( "ntp.server" )
ntp_servers::
# the ntp.conf for these hosts causes ntpd to sync
# off-site, and share the information with the local net
ntp_conf_source = ( "ntp.conf-master" )
In the control
section, you define class-specific variables for use in the files
and copy
actions:
files:
# ensure that the drift file exists and is
# owned and writable by the correct user
any::
$(drift_file) mode=0644 action=touch
owner=$(ntp_user) group=$(ntp_user)
If we didn't use variables for the location of the NTP drift file and the owner of the ntpd
process, we would have to write multiple files
stanzas. When the entry is duplicated with a small change made for the second class of systems, you face a greater risk of making errors when both entries have to be updated later. We avoid such duplication.
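To illustrate, without the variables the same files action would need one stanza per OS class, something like this sketch (paths and users taken from the control section above):

```
files:
    linux::
        /var/lib/ntp/ntp.drift mode=0644 action=touch
            owner=ntp group=ntp
    solaris|solarisx86::
        /var/ntp/ntp.drift mode=0644 action=touch
            owner=root group=root
```

Any later change, such as a new drift-file path, would then have to be made in two places.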
We also manage to write only a single copy
stanza, again through the use of variables:
copy:
any::
$(master_etc)/ntp/$(ntp_conf_source)
dest=$(ntp_conf_dest)
mode=644
type=checksum
server=$(fileserver)
encrypt=true
owner=root
group=root
define=restartntpd
Here we copy out the applicable NTP configuration file to the correct location for each operating system. When the file is successfully copied, the restartntpd
class is defined. This triggers actions in the following shellcommands
section:
shellcommands:
# restart ntpd when the restartntpd class is defined
debian.restartntpd::
"/etc/init.d/ntp restart" timeout=30 inform=true
# restart ntpd when the restartntpd class is defined
redhat.restartntpd::
"/etc/init.d/ntpd restart" timeout=30 inform=true
# restart ntpd when the restartntpd class is defined
(solarisx86|solaris).restartntpd::
"/usr/sbin/svcadm restart svc:/network/ntp" timeout=30 inform=true
When the ntp.conf
file is updated, the class restartntpd
is defined, and it causes the ntp
daemon process to restart. Based on the classes a system matches, the restartntpd
class causes cfengine to take the appropriate restart action.
Note that we have two almost identical restart commands for the debian
and redhat
classes. We could have reduced that to a single stanza, as we did for the files
and copy
actions. Combining those into one shellcommands
action is left as an exercise for the reader.
Now let's look at the processes
section:
processes:
# start ntpd when it's not running
debian::
"ntpd" restart "/etc/init.d/ntp start"
# start ntpd when it's not running
redhat::
"ntpd" restart "/etc/init.d/ntpd start"
# this is for when it's not even enabled
solarisx86|solaris::
"ntpd" restart "/usr/sbin/svcadm enable svc:/network/ntp"
In this section, we could have used the restartntpd
classes to trigger the delivery of a HUP signal to the running ntpd
process. We don't do that because a HUP signal causes the ntpd
process to die. For this reason, we use the init scripts on Linux and the SMF on Solaris.
This task represents how we'll write many of our future cfengine tasks. We'll define variables to handle different configuration files for different system types, then use actions that utilize those variables.
The required entry in PROD/inputs/hostgroups/cf.any
to get all our hosts to import the task is the file path relative to the inputs
directory:
import:
any::
tasks/os/cf.motd
tasks/os/cf.cfengine_cron_entries
tasks/os/cf.ntp
If you decide that more hosts should synchronize off site, you'd simply configure an additional Linux host to copy the ntp.conf-master
file instead of the ntp.conf
file. You'd need to write a slightly modified Solaris ntp.server
config file if you choose to have a Solaris host function in this role. We haven't done so in this book—not because Solaris isn't suited for the task, but because we needed only two hosts in this role. You'd then add a new restrict
line to the NTP client configuration file on Linux, or a new server
line for Solaris NTP clients. That's three easy steps to make our site utilize an additional local NTP server.
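For example, if the additional local NTP server were at 192.168.1.240 (a hypothetical address, used only for illustration), the client-side additions would look like this:

```
# Linux clients (/etc/ntp.conf): use and permit the new server
server 192.168.1.240 iburst
restrict 192.168.1.240 nomodify # hypothetical third NTP server

# Solaris clients (/etc/inet/ntp.conf): add a server line
server 192.168.1.240
```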
We can perform time synchronization at our site using a much simpler procedure than running the NTP infrastructure previously described. We can simply utilize the ntpdate
utility to perform one-time clock synchronization against a remote NTP source. To manually use ntpdate
once, run this at the command line as root:
# /usr/sbin/ntpdate 0.debian.pool.ntp.org
20 Sep 17:09:15 ntpdate[181]: adjust time server 208.113.193.10 offset -0.00311 sec
Note that ntpdate
will fail if a local ntpd
process is running, due to contention for the local NTP port (UDP 123). Temporarily stop any running ntpd
processes if you want to test out ntpdate
.
We consider this method of time synchronization to be useful only on a temporary basis. The reason for this is that ntpdate
will immediately force the local time to be identical to the remote NTP source's time. This can (and often does) result in a major change to the local system's time, basically a jump forward or backward in the system's clock.
By contrast, when ntpd
sees a gap between the local system's time and the remote time source(s), it will gradually decrease the difference between the two times until they match. We prefer the approach that ntpd
uses because any logs, e-mail, or other information sources where the time is important won't contain misleading times around and during the clock jump.
Because we discourage the use of ntpdate
, we won't demonstrate how to automate its usage. That said, if you decide to use ntpdate
at your site, you could easily run it from cron or a cfengine shellcommands
section on a regular basis.
The Domain Name System (DNS) is a globally distributed database containing domain names and associated information. Calling it a "name-to-IP-address mapping service" is overly simplistic, although it's often described that way. It also contains the list of mail servers for a domain as well as their relative priority, among other things. We don't go into great detail on how the DNS works or the finer details of DNS server administration, but you can get more information from DNS and BIND, Fifth Edition by Cricket Liu and Paul Albitz (O'Reilly Media Inc., 2006), and the Wikipedia entry at http://en.wikipedia.org/wiki/Domain_Name_System.
Standard practice with DNS is to make only certain hostnames visible to the general public. This means that we wouldn't make records such as those for goldmaster.campin.net available to systems that aren't on our private network. When we need mail to route to us from other sites properly or get our web site up and running, we'll publish MX records (used to map a name to a list of mail exchangers, along with relative preference) and an A record (used to map a name to an IPv4 address) for our web site in the public DNS.
This sort of setup is usually called a "split horizon," or simply "split" DNS. We have the internal hostnames for the hosts we've already set up (goldmaster, etchlamp, rhmaster, rhlamp, hemingway, and aurora) loaded into our campin.net domain with a DNS-hosting company. We'll want to remove those records at some point because they reference private IP addresses. They're of no use to anyone outside our local network and therefore should be visible only on our internal network. We'll enable this record removal by setting up a new private DNS configuration and moving the private records into it.
Right about now you're thinking "Wait! You've been telling your installation clients to use 192.168.1.1
for both DNS and as a default gateway. What gives? Where did that host or device come from?" Good, that was observant of you. When we mentioned that this book doesn't cover the network-device administration in our example environment, we meant our single existing piece of network infrastructure: a Cisco router at 192.168.1.1
that handles routing, Network Address Translation (NAT), and DNS-caching services. After we get DNS up and running on one or more of our UNIX systems, we'll have cfengine configure the rest of our systems to start using our new DNS server(s) instead.
We'll configure an internal DNS service that is utilized only from internal hosts. This will be an entirely stand-alone DNS infrastructure not linked in any way to the public DNS for campin.net.
This architecture choice means we need to synchronize any public records (currently hosted with a DNS-hosting company) to the private DNS infrastructure. We currently have only mail (MX) records and the hostnames for our web site (http://www.campin.net and campin.net) hosted in the public DNS. Keeping this short list of records synchronized isn't going to be difficult or time-consuming.
We'll use Berkeley Internet Name Domain (BIND) to handle our internal DNS needs.
Note Be sure that the BIND software you install is resistant to the DNS protocol flaw made public in July 2008. Also, if your DNS servers are behind NAT, make sure your NAT device doesn't defeat the port randomization that works around the flaw. For more information, see the CERT advisory here: http://www.kb.cert.org/vuls/id/800113.
We'll use the etchlamp system that was installed via FAI as our internal DNS server. Once it's working there, we can easily deploy a second system just like it using FAI and cfengine.
First, we need to install the bind9
package, as well as add it to the set of packages that FAI installs on the WEB
class.
In order to install the bind9
package without having to reinstall using FAI, run this command as the root
user on the system etchlamp:
# apt-get update && apt-get install bind9
The bind9
package depends on other packages such as bind-doc
(and several more), but apt-get
will resolve the dependencies and install everything required. Because FAI uses apt-get, it will work the same way, so we can just add the line "bind9" to the file /srv/fai/config/package_config/WEB
on our FAI host goldmaster. This will ensure that the preceding manual step never needs to be performed when the host is reimaged.
We'll continue setting up etchlamp manually to ensure that we know the exact steps to configure an internal DNS server. Once we're done, we'll automate the process using cfengine. Note that the bind9
package creates a user account named "bind." Add the lines from your passwd
, shadow
, and group
files to your standardized Debian account files in cfengine. We'll also have to set up file-permission enforcement using cfengine. The BIND installation process might pick different user ID (UID) or group ID (GID) settings from the ones we'll copy out using cfengine.
The Debian bind9
package stores its configuration in the /etc/bind
directory. The package maintainer set things up in a flexible manner, where the installation already has the standard and required entries in /etc/bind/named.conf
, and the configuration files use an include
directive to read two additional files meant for site-specific settings:
/etc/bind/named.conf.options
: You use this file to configure the options section of named.conf
. The options section is used to configure settings such as the name server's working directory, recursion settings, authentication-key options, and more. See the relevant section of the BIND 9 Administrator's Reference Manual for more information: http://www.isc.org/sw/bind/arm95/Bv9ARM.ch06.html#options.
/etc/bind/named.conf.local
: This file is meant to list the local zones that this BIND instance will load and serve to clients. These can be zone files on local disk, zones slaved from another DNS server, forward zones, or stub zones. We're simply going to load local zones, making this server the "master" for the zones in question.
The existence of these files means that we don't need to develop the configuration files for the standard zones needed on a BIND server; we need only to synchronize site-specific zones. Here is the named.conf.options
file as distributed by Debian:
options {
directory "/var/cache/bind";
// If there is a firewall between you and nameservers you want
// to talk to, you might need to uncomment the query-source
// directive below. Previous versions of BIND always asked
// questions using port 53, but BIND 8.1 and later use an unprivileged
// port by default.
// query-source address * port 53;
// If your ISP provided one or more IP addresses for stable
// nameservers, you probably want to use them as forwarders.
// Uncomment the following block, and insert the addresses replacing
// the all-0's placeholder.
// forwarders {
// 0.0.0.0;
// };
auth-nxdomain no; # conform to RFC1035
listen-on-v6 { any; };
};
The only modification we'll make to this file is to change the listen-on-v6
line to this:
listen-on-v6 { none; };
Because we don't intend to utilize IPv6, we won't have BIND utilize it either.
The default Debian /etc/bind/named.conf.local
file has these contents:
//
// Do any local configuration here
//
// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";
Note the zones.rfc1918
file. It is a list of "private" IP address ranges specified in RFC1918. The file has these contents:
zone "10.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "16.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "17.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "18.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "19.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "20.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "21.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "22.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "23.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "24.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "25.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "26.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "27.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "28.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "29.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "30.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "31.172.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
zone "168.192.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
It is a good idea to include this configuration file, with an important caveat we'll cover later. When you use this file, the db.empty
zone file is loaded for all the RFC1918 address ranges. And because those are valid zone files with no entries for individual reverse DNS records (i.e., PTR records), the DNS traffic for those lookups won't go out to the public DNS. A "host not found" response will be returned to applications looking up the PTR records for IPs in those ranges. Those IP ranges are intended only for private use, so the DNS traffic for these networks should stay on private networks. Most sites utilize those ranges, so the public DNS doesn't have a set of delegated servers that serves meaningful information for these zones.
The caveat mentioned earlier is that we will not want to serve the db.empty
file for the 192.168.x.x
range that we use at our site. This means we'll delete this line from zones.rfc1918
:
zone "168.192.in-addr.arpa" { type master; file "/etc/bind/db.empty"; };
Then we'll uncomment this line in /etc/bind/named.conf.local
by deleting the two slashes at the start of the line:
//include "/etc/bind/zones.rfc1918";
Next, you'll need to create the campin.net and 168.192.in-addr.arpa
zone files. The file /etc/bind/db.campin.net
has these contents:
$TTL 600
@ IN SOA etchlamp.campin.net. hostmaster.campin.net. (
2008072900 ; serial
1800 ; refresh (30 minutes)
600 ; retry (10 minutes)
2419200 ; expire (4 weeks)
600 ; minimum (10 minutes)
)
IN NS etchlamp.campin.net.
; the A record for campin.net
600 IN A 66.219.68.159
etchlamp IN A 192.168.1.239
aurora IN A 192.168.1.248
goldmaster IN A 192.168.1.249
rhmaster IN A 192.168.1.251
rhlamp IN A 192.168.1.236
hemingway IN A 192.168.1.237
; www.campin.net is a CNAME back to the A record for campin.net
www 600 IN CNAME @
skitzo 86400 IN A 64.81.57.165
scampi 86400 IN A 66.219.68.159
; give the default gateway an easy to remember name
gw IN A 192.168.1.1
We created entries for our six hosts, our local gateway address, and some records from our public zone.
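The serial (2008072900) follows the common YYYYMMDDnn convention: the date of the change plus a two-digit revision counter. It must increase with every edit to the zone, or other name servers and caches won't notice the change. A quick way to generate today's base serial:

```shell
# print a zone serial for today in YYYYMMDDnn form (revision 00)
date +%Y%m%d00
```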
Next, you need to create the "reverse" zone, in the file /etc/bind/db.192.168
:
$TTL 600
@ IN SOA etchlamp.campin.net. hostmaster.campin.net. (
2008072900 ; serial
1800 ; refresh (30 minutes)
600 ; retry (10 minutes)
2419200 ; expire (4 weeks)
600 ; minimum (10 minutes)
)
@ IN NS etchlamp.campin.net.
$ORIGIN 1.168.192.in-addr.arpa.
1 IN PTR gw.campin.net.
236 IN PTR rhlamp.campin.net.
237 IN PTR hemingway.campin.net.
239 IN PTR etchlamp.campin.net.
248 IN PTR aurora.campin.net.
249 IN PTR goldmaster.campin.net.
251 IN PTR rhmaster.campin.net.
The $ORIGIN
keyword sets the origin for all the following records to the 192.168.1.0/24
subnet's in-addr.arpa
reverse DNS range, which makes the records simpler to type in. Be sure to terminate the names on the right-hand side of all your records with a dot (period character) when you specify the fully qualified domain name.
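The trailing dot is easy to get wrong: any name that doesn't end in a dot has the current origin appended to it. A sketch of the pitfall, using the gw record from above:

```
$ORIGIN 1.168.192.in-addr.arpa.
1 IN PTR gw.campin.net.  ; correct: gw.campin.net
1 IN PTR gw.campin.net   ; wrong: expands to gw.campin.net.1.168.192.in-addr.arpa.
```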
Next, populate the file /etc/bind/named.conf.local
with these contents, to utilize our new zone files:
include "/etc/bind/zones.rfc1918";
zone "campin.net" {
type master;
file "/etc/bind/db.campin.net";
};
zone "168.192.in-addr.arpa" {
type master;
file "/etc/bind/db.192.168";
};
Restart BIND using the included init script:
# /etc/init.d/bind9 restart
Look for errors from the init script, as well as in the /var/log/daemon.log
log file. If the init script successfully loaded the zones, you'll see lines like this in the log file:
Jul 29 17:43:30 etchlamp named[2580]: zone 168.192.in-addr.arpa/IN: loaded serial
2008072900
Jul 29 17:43:30 etchlamp named[2580]: zone campin.net/IN: loaded serial 2008072900
Jul 29 17:43:30 etchlamp named[2580]: running
Test resolution from another host on the local subnet using the dig
command:
$ dig @etchlamp gw.campin.net.
; <<>> DiG 9.3.4-P1.1 <<>> @etchlamp gw.campin.net.
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45274
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; QUESTION SECTION:
;gw.campin.net. IN A
;; ANSWER SECTION:
gw.campin.net. 600 IN A 192.168.1.1
;; AUTHORITY SECTION:
campin.net. 600 IN NS etchlamp.campin.net.
;; ADDITIONAL SECTION:
etchlamp.campin.net. 600 IN A 192.168.1.239
;; Query time: 19 msec
;; SERVER: 192.168.1.239#53(192.168.1.239)
;; WHEN: Tue Jul 29 17:45:49 2008
;; MSG SIZE rcvd: 86
This query returns the correct results. In addition, the flags section of the response has the aa
bit set, meaning that the remote server considers itself authoritative for the records it returns. Do the same thing again, but this time query for a reverse record:
$ dig @etchlamp -x 192.168.1.1 ptr
; <<>> DiG 9.3.4-P1.1 <<>> @etchlamp -x 192.168.1.1 ptr
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47489
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; QUESTION SECTION:
;1.1.168.192.in-addr.arpa. IN PTR
;; ANSWER SECTION:
1.1.168.192.in-addr.arpa. 600 IN PTR gw.campin.net.
;; AUTHORITY SECTION:
168.192.in-addr.arpa. 600 IN NS etchlamp.campin.net.
;; ADDITIONAL SECTION:
etchlamp.campin.net. 600 IN A 192.168.1.239
;; Query time: 2 msec
;; SERVER: 192.168.1.239#53(192.168.1.239)
;; WHEN: Tue Jul 29 17:46:11 2008
;; MSG SIZE rcvd: 108
Again, we have successful results. We had to modify only three included files (zones.rfc1918
, named.conf.local
, and named.conf.options
), and create two new ones (db.campin.net
and db.192.168
). Now we know the file locations and file contents that we need in order to host our private DNS on a Debian system running BIND.
Automating the BIND Configuration
We'll create a cfengine task to distribute our BIND configuration, and as usual it will restart the BIND daemon when the configuration files are updated.
To automate this process, we'll copy the files to the cfengine master, write a task to distribute them, and update cfagent.conf
so that the hostgroup and task are used. The first step is to get our files from etchlamp onto the cfengine master, in the correct location. Create the directory on goldmaster:
# mkdir -p /var/lib/cfengine2/masterfiles/PROD/repl/root/etc/bind/debian-ext
Now copy those five files from etchlamp to the new directory on goldmaster:
# pwd
/etc/bind
# scp zones.rfc1918 named.conf.local db.campin.net db.192.168 named.conf.options \
goldmaster:/var/lib/cfengine2/masterfiles/PROD/repl/root/etc/bind/debian-ext/
Name the task PROD/inputs/tasks/apps/bind/cf.debian_external_cache
and start the task with these contents:
groups:
have_etc_rndc_key = ( FileExists(/etc/bind/rndc.key) )
Later in this task we'll perform permission fixes on the rndc.key
file, but we like to make sure it's actually there before we do it.
We'll continue explaining the cf.debian_external_cache
task. In the control
section we tell cfengine about some classes that we dynamically define, and put in an entry for DefaultPkgMgr
:
control:
any::
addinstallable = ( bind_installed bind_not_installed
reload_bind
)
debian::
DefaultPkgMgr = ( dpkg )
which is required when we use the packages
action:
packages:
debian::
bind9
version=9.3.4
cmp=ge
define=bind_installed
elsedefine=bind_not_installed
We use the packages
action simply to detect whether the bind9
package is installed, and we go with the version installed by Debian 4.0 ("Etch") as the minimum installed version. Assumptions will only lead to errors, so we double-check even basic assumptions such as whether BIND has been installed on the system at all.
Here we use the processes
action to start up BIND when it is missing from the process list, but only if it's one of our external caches, and only if the bind9
package is installed:
processes:
debian.bind_installed::
"named" restart "/etc/init.d/bind9 start" inform=false umask=022
There's no point in even trying to start BIND if it isn't installed.
Here we copy the five files we placed into the debian-ext
directory to the host's /etc/bind
directory:
copy:
debian.bind_installed::
$(master_etc)/bind/debian-ext/
dest=/etc/bind/
r=inf
mode=644
type=checksum
purge=false
server=$(fileserver)
encrypt=true
owner=root
group=root
define=reload_bind
We carefully named the source directory debian-ext
because we might end up deploying BIND to our Debian hosts later in some other configuration. Having a complete source directory to copy makes the copy
stanza simpler. We know that only the files we want to overwrite are in the source directory on the cfengine master—so be careful not to add files into the source that you don't want automatically copied out. You also have to be careful not to purge during your copy, or you'll lose all the default Debian bind9
configuration files you depend on.
This shellcommands
section uses the reload_bind
class to trigger a restart of the BIND daemon:
shellcommands:
debian.reload_bind::
# when the config is updated, reload bind
"/etc/init.d/bind9 reload" timeout=30
The reload_bind
class is defined when files are copied from the master, via the define=
line.
These file and directory settings fix the important BIND files and directory permissions in the unlikely event that the bind user's UID and GID change:
files:
debian.bind_installed.have_etc_rndc_key::
/etc/bind/rndc.key owner=bind group=bind m=640 action=fixall
inform=true syslog=on
directories:
debian.bind_installed::
/var/cache/bind mode=775 owner=root group=bind inform=true syslog=on
/etc/bind mode=2755 owner=root group=bind inform=true syslog=on
Such an event happens if and when we later synchronize all the user accounts across our site; the preceding file and directory settings are the recovery steps for a bind-user UID/GID change. Next, set up an alerts
section to issue a warning when you designate a host as an external_debian_bind_cache
but don't actually have the bind9
package installed:
alerts:
debian.!bind_installed::
"Error: I am an external cache but I don't have bind9 installed."
We use the packages
action in this task, so we need to add packages to the actionsequence
in the control/cf.control_cfagent_conf
file for cfengine to run it:
actionsequence = (
directories
disable
packages
copy
editfiles
links
files
processes
shellcommands
)
Now we need to add the task to a hostgroup file, but it certainly isn't a good fit for the cf.any
hostgroup. Create a new hostgroup file for the task and place it at PROD/inputs/hostgroups/cf.external_dns_cache
. That name was chosen carefully; we won't assume that all our caching DNS servers will be running Debian, or even BIND for that matter. The role is to serve DNS to our network, and the hostgroup name is clear about that. The contents of this new hostgroup file are:
import:
any::
tasks/app/bind/cf.debian_external_cache
Now we need to define an alias for the hosts that serve this role. We'll edit PROD/inputs/classes/cf.main_classes
and add this line:
caching_dns_servers = ( etchlamp )
Then we'll edit cfagent.conf
and add an import for the new hostgroup file for the caching_dns_servers
class:
caching_dns_servers:: hostgroups/cf.external_dns_cache
Wait! If you were to run cfagent -qv
on etchlamp at this point, the file PROD/inputs/hostgroups/cf.external_dns_cache
would not be imported, even though cfagent
's "Defined Classes" output shows that the caching_dns_servers
class is set. The reason is that cfagent resolves its imports while parsing, before classes defined in imported files (such as cf.main_classes) take effect, so a class-qualified import in cfagent.conf can't depend on those classes. Most people learn this important lesson the hard way, and we wanted you to learn it the hard way as well, so it will be more likely to stick.
To reorganize in a way that will work with cfengine's issues around imports but preserve our hostgroup system, delete these two lines from cfagent.conf
:
any:: hostgroups/cf.any
caching_dns_servers:: hostgroups/cf.external_dns_cache
Place those lines in a new file, hostgroups/cf.hostgroup_mappings
, with these contents:
import:
any:: hostgroups/cf.any
caching_dns_servers:: hostgroups/cf.external_dns_cache
Remember that any lines added below the cf.external_dns_cache
import will apply only to the caching_dns_servers
class, unless a new class is specified. That is a common error made by inexperienced cfengine-configuration authors, and often even experienced ones.
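To illustrate the scoping pitfall (the cf.web_servers hostgroup and web_servers class here are hypothetical):

```
import:
    any::                   hostgroups/cf.any
    caching_dns_servers::   hostgroups/cf.external_dns_cache
    # PITFALL: with no new class given, the next line would still be
    # scoped to caching_dns_servers:
    #                       hostgroups/cf.web_servers
    # Correct: restate the intended class first:
    web_servers::           hostgroups/cf.web_servers
```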
We need to add the cf.hostgroup_mappings
file to cfagent.conf
, by adding this line at the end:
hostgroups/cf.hostgroup_mappings
We don't need to specify the any::
class because it's already inherent in all of this task's imports. In fact, unless otherwise specified, it's inherent in every cfengine action.
Now we should validate that our hostgroup is being imported properly—by running cfagent -qv
on etchlamp. Look for this line in the output:
Looking for an input file tasks/app/bind/cf.debian_external_cache
Success! All future hostgroup imports will happen from the cf.hostgroup_mappings
file. We'll mention one last thing while on the subject of imports. Note that we don't do any imports in any of our task files. Any file containing actions other than import
should not use the import
action at all. You can get away with this if you do it carefully, but we'll avoid it like the plague.
Remember that every host that ever matches the caching_dns_servers
class will import the cf.external_dns_cache hostgroup
file, and therefore will also import the cf.debian_external_cache
task. If a Solaris host is specified as a member of the caching_dns_servers
class, it will not do anything unintended when it reads the cf.debian_external_cache
task. This is because we specify the debian
class for safety in the class settings for all our actions. You could further protect non-Debian hosts by importing the task only for Debian hosts from the hostgroups/cf.external_dns_cache
file:
import:
debian::
tasks/app/bind/cf.debian_external_cache
Importing the task this way is safer, but even if you do, you should make sure that your cfengine configuration files perform actions only on the hosts you intend. Always be defensive with your configurations, and you'll avoid unintended changes. Up until this point, we have purposely made our task files safe to run on any operating system and hardware architecture by limiting the cases when an action will actually trigger, and we will continue to do so.
Now it's time to reimage etchlamp via FAI, and make sure that the DNS service is fully configured and working when we set up etchlamp from scratch. Always ensure that your automation system works from start to finish. The etchlamp host's minimal install and configuration work will take under an hour, so the effort and time are well worth it.
While etchlamp is reimaging, remove the old installation's cfengine public key on the cfengine master because the reimaging process will generate a new key. The host etchlamp has the IP 192.168.1.239
, so run this command on goldmaster as the root
user:
# rm /var/lib/cfengine2/ppkeys/root-192.168.1.239.pub
When etchlamp reboots after installation, the cfengine daemons don't start up because we have only the bootstrap update.conf
and cfagent.conf
files in /var/lib/cfengine2/inputs
. We need to make sure that cfagent
runs once upon every reboot. Modify /srv/fai/config/scripts/FAIBASE/50-cfengine
on the FAI server to add a line that will run cfagent
upon every boot, mainly to help on the first boot after installation:
#! /usr/sbin/cfagent -f
control:
any::
actionsequence = ( editfiles )
EditFileSize = ( 30000 )
editfiles:
any::
{ ${target}/etc/aliases
AutoCreate
AppendIfNoSuchLine "root: [email protected]"
}
{ ${target}/etc/default/cfengine2
ReplaceAll "=0$" With "=1"
}
{ ${target}/etc/init.d/bootmisc.sh
AppendIfNoSuchLine "/usr/sbin/cfagent -qv"
}
This configures the cfagent
program to run from the /etc/init.d/bootmisc.sh
file at boot time. So, to recap: We started another reimage of etchlamp and removed /var/lib/cfengine2/ppkeys/root-192.168.1.239.pub
again on the cfengine master while the host was reimaging.
The host etchlamp returned from reimaging fully configured, with cfengine running. Now every time a Debian host boots at our site after FAI installs it, it will run cfagent
during boot. Without logging into the host (i.e., without manual intervention), you can run a DNS query against etchlamp successfully:
$ dig @etchlamp gw.campin.net
; <<>> DiG 9.3.4-P1.1 <<>> @etchlamp gw.campin.net
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59779
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; QUESTION SECTION:
;gw.campin.net. IN A
;; ANSWER SECTION:
gw.campin.net. 600 IN A 192.168.1.1
;; AUTHORITY SECTION:
campin.net. 600 IN NS etchlamp.campin.net.
;; ADDITIONAL SECTION:
etchlamp.campin.net. 600 IN A 192.168.1.239
;; Query time: 1 msec
;; SERVER: 192.168.1.239#53(192.168.1.239)
;; WHEN: Wed Jul 30 00:39:52 2008
;; MSG SIZE rcvd: 86
What we have accomplished here is worth celebrating. If you suffer total system failure on the host etchlamp, you can simply reimage a new host with the same hostname and bring it back onto the network as a DNS server. This is exactly what we want of all hosts at our site. As you deploy web servers, NFS servers, and other system roles, you should test that the host can be reimaged and properly configured to serve its designated function again without any human intervention. The extent of human involvement should be to identify hardware and do any Kickstart/FAI/JumpStart configuration needed to support imaging that piece of hardware.
We have a private DNS server now, and although it's the only one, we'll configure the /etc/resolv.conf
files across all our hosts to utilize the new DNS server before any other DNS servers. We'll still list our existing DNS server, 192.168.1.1
, as the second nameserver in /etc/resolv.conf
in case etchlamp becomes unreachable.
Cfengine has a resolve
action that you can use to configure the /etc/resolv.conf
file. We'll create a task called tasks/os/cf.resolv_conf
and test whether we have resolv.conf
in a directory where postfix is chroot
ed by default on Debian:
classes:
have_postfix_resolv = ( FileExists(/var/spool/postfix/etc/resolv.conf) )
Here's something we've never done before—change the actionsequence
in a task file:
control:
any::
addinstallable = ( reloadpostfix )
actionsequence = ( resolve )
EmptyResolvConf = ( true )
The preceding code adds resolve
to the actionsequence
. We can add it to the global actionsequence
defined in the control/cf.control_cfagent_conf
file that's imported directly from cfagent.conf
, but there's really no need. We'll generally add actionsequence
items there, but we wanted to demonstrate that we still have some flexibility in our cfengine configurations.
The order of the IP addresses and comment is preserved in the /etc/resolv.conf
file:
resolve:
any::
#
# If EmptyResolvConf is set to true, we'll completely wipe out
# resolv.conf EVEN if we have no matches in the below classes!
#
# When EmptyResolvConf is set, always be sure that you have an
# any class to catch all hosts with some basic nameserver entries.
#
192.168.1.239
192.168.1.1
"# resolv.conf edited by cfengine, don't muck with this"
We added the comment so that if any SAs want to change /etc/resolv.conf
directly with a text editor, they'll realize that the file is under cfengine control.
We use the local copy to keep postfix name resolution working properly after cfengine updates the /etc/resolv.conf
file and to restart postfix when we do the copy:
copy:
# this is a local copy to keep the chroot'd postfix resolv.conf up to date
have_postfix_resolv::
/etc/resolv.conf
dest=/var/spool/postfix/etc/resolv.conf
mode=644
owner=root
group=root
type=checksum
define=reloadpostfix
shellcommands:
# reload postfix when we update the chroot resolv.conf
debian.reloadpostfix::
"/etc/init.d/postfix restart" timeout=30 inform=true
Next, add the task to PROD/inputs/hostgroups/cf.any
. Once the task is enabled, we connect to the host aurora and inspect the new /etc/resolv.conf
:
# cat /etc/resolv.conf
domain campin.net
nameserver 192.168.1.239
nameserver 192.168.1.1
# resolv.conf edited by cfengine, don't muck with this
Then test name resolution:
# nslookup gw
Server: 192.168.1.239
Address: 192.168.1.239#53
Name: gw.campin.net
Address: 192.168.1.1
We're done with the DNS for now. When we get more hardware to deploy another Debian-based DNS server system, we'll add it to the caching_dns_servers
class, let cfengine set up BIND, then update cf.resolv_conf
to add another nameserver
entry to all our site's /etc/resolv.conf
files.
We need to take control of the user accounts at our site. Every site eventually needs a centralized mechanism the SA staff can use to create and delete accounts, lock them out after a designated number of failed logins, and log user access. This will usually be a system such as NIS/NIS+, LDAP, or perhaps LDAP combined with Kerberos.
At this point, we're not talking about setting up a network-based authentication system—we're not ready for that yet. First, we need to take control of our local account files: /etc/passwd
, /etc/shadow
, and /etc/group
. Even if we already had LDAP deployed at our site and all our users had accounts only in the LDAP directory, we would need to be able to change the local root
account password across all our systems on a regular basis. In addition, we normally change the default shell on many system accounts that come with the system, for added security. Allowing local account files to go unmanaged is a security risk.
We have three different sets of local account files at our site: those for Red Hat, Solaris, and Debian. We're going to standardize the files for each system type, and synchronize those files to each system from our central cfengine server on a regular basis. Over time, we'll need to add accounts to the standard files to support new software (e.g., a "mysql" user to run the MySQL database software). We will never add them directly onto the client systems; instead, we will add them to the centralized files.
We have only two installed instances of each OS type, so it's easy to copy all the files to a safe location and consolidate them. Because we're copying the shadow
files, the location should be a directory with restrictive permissions:
# mkdir -m 700 /root/authfiles
# cd /root/authfiles
# for host in goldmaster rhmaster rhlamp etchlamp hemingway aurora ;
do for file in passwd shadow group ; do [ -d $file ] || mkdir -m 700 $file ;
scp root@${host}:/etc/$file ${file}/${file}.$host ; done ; done
These commands will iterate over all our hosts and copy the three files we need to a per-file subdirectory, with a file name that includes the hostname of the system that the file is from. We will illustrate standardization of account files for our two Solaris hosts only, to keep this section brief. Assume that we will perform the same process for Debian and Red Hat.
Now you can go into each directory and compare the files from the two Solaris hosts:
# cd /root/authfiles/passwd
# diff passwd.aurora passwd.hemingway
12a13,14
> postgres:x:90:90:PostgreSQL Reserved UID:/:/usr/bin/pfksh
> svctag:x:95:12:Service Tag UID:/:
The hemingway host has two accounts that weren't created on aurora. We won't need the postgres
account, used to run the freeware Postgres database package. We will keep the svctag
account because the Solaris serial port–monitoring facilities use it.
# mv passwd.hemingway passwd.solaris
# rm passwd.aurora
Edit passwd.solaris
and remove the line starting with postgres
. Now the passwd.solaris
file contains the accounts we need on both systems. We will use this as our master Solaris password file.
Go through the same procedure for the Solaris shadow
files:
# cd ../shadow
# diff shadow.hemingway shadow.aurora
13,14d12
< postgres:NP:::::::
< svctag:*LK*:6445::::::
# mv shadow.hemingway shadow.solaris
# rm shadow.aurora
Use a text editor to remove the postgres
line from shadow.solaris
as well.
Here's the procedure for the group
file:
# diff group.hemingway group.aurora
17d16
< postgres::90:
20a20
> sasl::100:
We have a postgres
group on hemingway that we'll remove, and a sasl
group on aurora that we'll keep. SASL is the Simple Authentication and Security Layer, which you use to insert authentication into network protocols. We might end up needing this if we set up authenticated Simple Mail Transfer Protocol (SMTP) or another authenticated network protocol later on.
# mv group.aurora group.solaris
# rm group.hemingway
Now we'll move our new files into the directories we created for these files (back when we originally created our masterfiles
directory in Chapter 5).
# scp group/group.solaris
goldmaster:/var/lib/cfengine2/masterfiles/PROD/repl/root/etc/group/
# scp passwd/passwd.solaris
goldmaster:/var/lib/cfengine2/masterfiles/PROD/repl/root/etc/passwd/
# scp shadow/shadow.solaris
goldmaster:/var/lib/cfengine2/masterfiles/PROD/repl/root/etc/shadow/
Now perform the same decision-making process for the Red Hat and Debian account files. When you're done, move them into the proper place in the masterfiles
directories as you did for the Solaris account files. You need to be careful during this stage that you don't change the UID or GID of system processes without setting up some remediation steps in cfengine.
Our two Debian systems ended up with different UID and GID numbers for the postfix user and group, as well as for the postdrop
group (also used by postfix). We chose to stick with the UID and GID from the goldmaster host, and to add some permission fixes in a cfengine task that will fix the ownership of the installed postfix files and directories.
Once we've standardized all our files, we have these files on the cfengine master system:
# pwd
/var/lib/cfengine2/masterfiles/PROD/repl/root/etc
# ls -aF passwd/ shadow/ group
group:
./ ../ group.debian group.redhat group.solaris
passwd/:
./ ../ passwd.debian passwd.redhat passwd.solaris
shadow/:
./ ../ shadow.debian shadow.redhat shadow.solaris
We'll develop a cfengine task to distribute our new master account files. We will add some safety checks into this task because we need to treat these files with the utmost caution.
We'll place the file in a task called cf.account_sync
, with these contents:
classes: # synonym groups:
safe_to_sync = ( debian_4_0
redhat_s_5_2
sunos_5_10
)
We create a group to control which classes of systems get the account-file synchronization. These three classes encompass all the systems we're currently running at our site. We do this because we know our account files will work on the UNIX/Linux versions that we're currently running, but we don't know if they will work on older or newer versions. In fact, if you don't know for sure that something will work, you should assume that it won't.
So if you deploy a new type of system at your site, you run the risk that the new system type won't have local account files synchronized by cfengine. Take measures to detect this situation in the task, and alert the site administrators:
control:
debian::
passwd_file = ( "passwd.debian" )
shadow_file = ( "shadow.debian" )
group_file = ( "group.debian" )
redhat::
passwd_file = ( "passwd.redhat" )
shadow_file = ( "shadow.redhat" )
group_file = ( "group.redhat" )
solaris|solarisx86::
passwd_file = ( "passwd.solaris" )
shadow_file = ( "shadow.solaris" )
group_file = ( "group.solaris" )
Here you'll recognize the standardized files we created earlier.
copy:
safe_to_sync::
$(master_etc)/passwd/$(passwd_file)
dest=/etc/passwd
mode=644
server=$(fileserver)
trustkey=true
type=checksum
owner=root
group=root
encrypt=true
verify=true
size=>512
$(master_etc)/shadow/$(shadow_file)
dest=/etc/shadow
mode=400
owner=root
group=root
server=$(fileserver)
trustkey=true
type=checksum
encrypt=true
size=>200
$(master_etc)/group/$(group_file)
dest=/etc/group
mode=644
owner=root
group=root
server=$(fileserver)
trustkey=true
type=checksum
encrypt=true
size=>200
The size
keyword in these copy
stanzas adds file-size minimums for the passwd
, shadow
, and group
file copies. We use this keyword so we don't copy out empty or erroneously stripped down files. The minimums should be around half the size of the smallest version that we have of that particular file. You might need to adjust the minimums if the files happen to shrink later on. Usually these files grow in size.
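The effect of the size floor can be pictured in shell: a copy is refused when the source falls below the minimum byte count (the helper name and paths are illustrative; the 512-byte floor matches the passwd copy above):

```shell
# Hedged sketch: refuse to install a replacement file smaller than
# a minimum size, mimicking cfengine's size=>N guard against
# truncated or empty account files.
install_if_big_enough() {
  src=$1 dst=$2 min=$3
  size=$(wc -c < "$src")
  if [ "$size" -ge "$min" ]; then
    cp "$src" "$dst" && echo "installed $dst"
  else
    echo "refused: $src is only $size bytes (minimum $min)" >&2
    return 1
  fi
}
```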
Here we define an alert for hosts that don't have local account files to synchronize:
alerts:
!safe_to_sync::
"I am not set up to sync my account files, please check on it."
The alerts
action simply prints text used to alert the system administrator. The cfexecd
daemon will e-mail this output.
Next, put the task into the cf.any
hostgroup:
import:
any::
tasks/os/cf.motd
tasks/os/cf.cfengine_cron_entries
tasks/os/cf.ntp
tasks/os/cf.account_sync
When cfagent
performs a copy, and the repository
variable is defined, the version of the file before the copy is backed up to the repository
directory. Define repository
like this in PROD/inputs/control/cf.control_cfagent_conf
:
repository = ( $(workdir)/backups )
This means you can see the old local account files in the backup directory on each client after the copy. On Debian the directory is /var/lib/cfengine2/backups
, and on the rest of our hosts it's /var/cfengine/backups
.
If you encounter any problems, compare the previous and new versions of the files, and see if you left out any needed accounts. Be aware that each performed copy overwrites previous backup files in the repository
directory. This means you'll want to validate soon after the initial sync. We also saved the original files in the home directory for the root
user. It's a good idea to store them for at least a few days in case you need to inspect them again.
Our etchlamp system had the postfix account's UID and GID change with this local account sync. The GID of the postdrop
group also changed. We can fix that with cfengine, in a task we call cf.postfix_permissions
:
classes: # synonym groups:
have_var_spool_postfix = ( IsDir("/var/spool/postfix") )
have_var_spool_postfix_public = ( IsDir("/var/spool/postfix/public") )
have_var_spool_postfix_maildrop = ( IsDir("/var/spool/postfix/maildrop") )
have_usr_sbin_postdrop = ( FileExists("/usr/sbin/postdrop") )
have_usr_sbin_postqueue = ( FileExists("/usr/sbin/postqueue") )
Here we have some classes based on whether files or directories are present on the system. We don't want to assume that postfix is installed on the system. We previously added postfix into the list of FAI base packages, but we can't guarantee with absolute certainty that every Debian system we ever manage will be running postfix.
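In shell terms, this guard is nothing more than a directory test in front of the permission fixes (the function and its messages are ours):

```shell
# Hedged sketch: only touch postfix's spool when the directory is
# actually there, rather than assuming postfix is installed.
fix_if_present() {
  dir=$1
  if [ -d "$dir" ]; then
    echo "would fix permissions under $dir"
  else
    echo "skipping: $dir not present"
  fi
}

# fix_if_present /var/spool/postfix
```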
We could use a more sophisticated test, such as verifying that the postfix Debian package is installed, but a simple directory test suffices and happens quickly:
directories:
debian.have_var_spool_postfix_public::
/var/spool/postfix/public mode=2710
owner=postfix group=postdrop inform=true
debian.have_var_spool_postfix_maildrop::
/var/spool/postfix/maildrop mode=1730
owner=postfix group=postdrop inform=true
debian.have_var_spool_postfix::
/var/spool/postfix/active mode=700 owner=postfix
group=root inform=true
/var/spool/postfix/bounce mode=700 owner=postfix
group=root inform=true
/var/spool/postfix/corrupt mode=700 owner=postfix
group=root inform=true
/var/spool/postfix/defer mode=700 owner=postfix
group=root inform=true
/var/spool/postfix/deferred mode=700 owner=postfix
group=root inform=true
/var/spool/postfix/flush mode=700 owner=postfix
group=root inform=true
/var/spool/postfix/hold mode=700 owner=postfix
group=root inform=true
/var/spool/postfix/incoming mode=700 owner=postfix
group=root inform=true
/var/spool/postfix/private mode=700 owner=postfix
group=root inform=true
/var/spool/postfix/trace mode=700 owner=postfix
group=root inform=true
Here we make sure that all the postfix spool directories have the correct ownership and permissions. If you blindly create the directories without verifying that /var/spool/postfix
is already there, it'll appear as if postfix is installed when it isn't. This might seem like a minor detail, but the life of an SA comprises a large collection of minor details such as this. Creating confusing situations such as unused postfix spool directories is just plain sloppy, and you should avoid doing so.
Here we ensure that two important postfix binaries have the SetGID bit set, as well as proper ownership:
files:
debian.have_usr_sbin_postqueue::
/usr/sbin/postqueue m=2555 owner=root group=postdrop
action=fixall inform=true
debian.have_usr_sbin_postdrop::
/usr/sbin/postdrop m=2555 owner=root group=postdrop
action=fixall inform=true
At any time you can validate that postfix has the proper permissions by executing this line:
# postfix check
You'll also want to restart any daemons that had their process-owner UID change after you fixed file and directory permissions.
Now we'll put the task into the cf.any
hostgroup:
import:
any::
tasks/os/cf.motd
tasks/os/cf.cfengine_cron_entries
tasks/os/cf.ntp
tasks/os/cf.account_sync
tasks/os/cf.postfix_permissions
You're probably wondering why we put the cf.postfix_permissions
task into the cf.any
hostgroup, when it performs actions only on Debian hosts. We did this because we might end up having to set postfix permissions on other platforms later. The task does nothing on host types for which it's not intended, so you face little risk of damage.
From this point on, when you install new packages at your site that require additional local system accounts, manually install on one host (of each platform) as a test. When you (or the package) find the next available UID and GID for the account, you can add the account settings into your master passwd
, shadow
, and group
files for synchronization to the rest of your hosts. That way, when you deploy the package to all hosts via cfengine, the needed account will be in place with the proper UID and GID settings. This is another example of how the first step in automating a procedure is to make manual changes on test systems.
Now you can add user accounts at your site. We didn't want to add a single user account before we had a mechanism to standardize UIDs across the site. The last thing we need is to deploy LDAP or a similar service later on, and have a different UID for each user account—on many systems. We have avoided that mess entirely.
At this point, you can simply add users into the centralized account files stored on the cfengine master. New users won't automatically have a home directory created, but later in the chapter we'll address that issue using a custom adduser
script, an NFS-mounted home directory, and the automounter.
Using Scripts to Create User Accounts
You shouldn't ever create user accounts by hand-editing the centralized passwd
, shadow
, and group
files at your site. We'll create a simple shell script that chooses the next available UID and GID, prompts for a password, and properly appends the account information into the account files.
We'll make the script simple because we don't intend to use it for long. Before we even write it, we need to consider where we'll put it. We know that it is the first of what will surely be many administrative scripts at our site. When we first created the masterfiles
directory structure, we created the directory PROD/repl/admin-script
s/, which we'll put into use now.
We'll copy the contents of this directory to all hosts at our site, at a standard location. We've created a cfengine task to do this, called cf.sync_admin_scripts
:
copy:
any::
$(master)/repl/admin-scripts
dest=/opt/admin-scripts
mode=550
owner=root
group=root
type=checksum
server=$(fileserver)
encrypt=true
r=inf
purge=true
directories:
any::
/opt/admin-scripts mode=750 owner=root group=root inform=false
We're copying every file in that directory, making sure each is protected from non-root
users and executable only for members of the root
group. Because we haven't set up special group memberships yet, SA staff will need to become root
to execute these scripts—for now, anyway. Remember that our actionsequence
specifies that directories
runs before copy
, so the directory will be properly created before the copy is attempted.
Add this entry to the end of the cf.any
hostgroup:
tasks/misc/cf.sync_admin_scripts
You place the task in the misc
directory because it's not application-specific and it doesn't affect part of the core operating system. Now you can utilize a collection of administrative scripts that is accessible across the site. You can create the new user script and place it in there. The script itself will have checks to make sure it is running on the appropriate master host.
We call the script add_local_user
, and we don't append a file suffix such as .sh
. This way, we can rewrite it later in Perl or Python and not worry about a misleading file suffix. UNIX doesn't care about file extensions, and neither should you.
#!/bin/sh
############################################################################
# This script was written to work on Debian Linux, specifically the Debian host
# serving as the cfengine master at our site. Analysis should be done before
# attempting to run elsewhere.
############################################################################
PATH=/sbin:/usr/sbin:/bin:/usr/bin:/opt/admin-scripts
# this is the deepest shared directory for all the
# passwd/shadow/group files
BASE_PATH=/var/lib/cfengine2/masterfiles/PROD/repl/root/etc
USERNAME_FILE=/var/lib/cfengine2/masterfiles/PROD/repl/root/etc/USERFILE
case `hostname` in
goldmaster*)
echo "This is the proper host on which to add users, continuing..."
;;
*)
echo "This is NOT the proper host on which to add users, exiting now..."
exit 1
;;
esac
We have only one cfengine master host that has the centralized files, so make sure we're running on the correct host before moving on. We also define a file, which we'll use later, to store usernames for accounts that we create:
cd $BASE_PATH
LOCKFILE=/root/add_user_lock
rm_lock_file() {
rm -f $LOCKFILE
}
# don't ever run two of these at once
lockfile $LOCKFILE || exit 1
We define a file to use for locking to ensure that we run only one instance of this script at a time. We use methods that should prevent files from getting corrupted, but if two script instances copy an account file at the same time, update it, then copy it back into place, one of those instances will have its update overwritten.
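The lockfile utility used above ships with procmail and isn't universally available. Where it's missing, a common portable stand-in relies on mkdir being atomic on the local filesystem (the directory name and messages are ours):

```shell
# Hedged sketch: mkdir-based locking. mkdir either creates the
# directory and succeeds (we hold the lock) or fails because it
# already exists (another instance holds it).
LOCKDIR=${TMPDIR:-/tmp}/add_user_lock.d

acquire_lock() {
  if mkdir "$LOCKDIR" 2>/dev/null; then
    echo "lock acquired"
  else
    echo "another instance is running" >&2
    return 1
  fi
}

release_lock() {
  # ignore errors if the lock is already gone
  rmdir "$LOCKDIR" 2>/dev/null || :
}
```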
Now collect some important information about the user account:
# We REALLY need to sanity-check what we accept here, before blindly
# trusting the values; that's an exercise for the reader.
echo "Please specify a username for your new account, 8 chars or less: "
read USERNAME
echo "Please give the person's full name for your new account: "
read GECOS
stty -echo
echo "Please specify a password for your new account: "
read PASSWORD
stty echo
Later we should add some logic to test that the password meets certain criteria. The eight-character UNIX username limit hasn't applied for years on any systems that we run, but we observe the old limits just to be safe.
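The sanity checking that the script's comment leaves as an exercise might begin with a username validator. This hypothetical helper enforces the conservative historical rules the text mentions: a leading lowercase letter, then lowercase letters, digits, underscore, or hyphen, at most eight characters:

```shell
# Hedged sketch: validate a proposed username before accepting it.
# The exact policy is an assumption; adjust to site standards.
valid_username() {
  case $1 in
    [a-z]|[a-z][a-z0-9_-]*) [ "${#1}" -le 8 ] ;;
    *) return 1 ;;
  esac
}

# valid_username "$USERNAME" || { echo "bad username" ; exit 1 ; }
```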
Here we generate an encrypted password hash for our shadow
files:
ENC_PASS=`echo $PASSWORD | mkpasswd -s`
You can add -H md5
to generate an MD5 hash, which is more secure. We've chosen to use the lowest common denominator here, in case we inherit some old system. Which type of hash you choose is up to you.
Now create the file containing the next available UID, if it doesn't already exist:
[ -f "$BASE_PATH/NEXTUID" ] || echo 1001 > $BASE_PATH/NEXTUID
Collect the UID and GID to use for the account. Always use the same number for both:
NEXTUID=`cat $BASE_PATH/NEXTUID`
Test that the value inside the NEXTUID
file is numerically valid. We would hate to create an account with an invalid UID:
if [ -n "$NEXTUID" -a $NEXTUID -gt 1000 ]
then
echo "Our next UID appears valid, continuing..."
else
echo "The $BASE_PATH/NEXTUID file appears to be corrupt, please
investigate."
echo "Exiting now..."
exit 1
fi
Here we set up the formatting of our account-file entries, to be used in the next section:
SEC_SINCE_EPOCH=`date +%s`
GROUP_FORMAT="$USERNAME:x:$NEXTUID:"
PASSWD_FORMAT="$USERNAME:x:$NEXTUID:$NEXTUID:$GECOS:/home/$USERNAME:/bin/bash"
SHADOW_FORMAT="$USERNAME:$ENC_PASS:$SEC_SINCE_EPOCH:7:180:14:7::"
If you use this script, you need to set values for the shadow
fields that make sense at your site. The meanings are:
1. Login name
2. Encrypted password
3. Days since Jan 1, 1970 that the password was last changed
4. Days before the password may be changed
5. Days after which the password must be changed
6. Days before password expiration that the user is warned
7. Days after password expiration that the account is disabled
8. Days since Jan 1, 1970 that the account has been disabled
9. A reserved field (unused)
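Putting those fields together, a generated entry looks like this (the username and hash below are placeholders, not real values):

```shell
#!/bin/sh
# Build and print a sample shadow(5) entry using the same field
# layout as SHADOW_FORMAT above. The hash is a placeholder.
USERNAME=nate
ENC_PASS='$1$exampleXX$notarealhash'
SEC_SINCE_EPOCH=`date +%s`
echo "$USERNAME:$ENC_PASS:$SEC_SINCE_EPOCH:7:180:14:7::"
```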
for groupfile in group/group*
do
cp $groupfile ${groupfile}.tmp &&
echo $GROUP_FORMAT >> ${groupfile}.tmp &&
mv ${groupfile}.tmp $groupfile ||
( echo "Failed to update $groupfile - exiting now." ; rm_lock_file ; exit 1 )
done
for shadowfile in shadow/shadow*
do
cp $shadowfile ${shadowfile}.tmp &&
echo $SHADOW_FORMAT >> ${shadowfile}.tmp &&
mv ${shadowfile}.tmp $shadowfile ||
( echo "Failed to update $shadowfile - exiting now." ; rm_lock_file ;
exit 1 )
done
for passwdfile in passwd/passwd*
do
cp $passwdfile ${passwdfile}.tmp &&
echo $PASSWD_FORMAT >> ${passwdfile}.tmp &&
mv ${passwdfile}.tmp $passwdfile ||
( echo "Failed to update $passwdfile - exiting now." ; rm_lock_file ; exit 1 )
done
Update each of the files in the group
, shadow
, and passwd
directories. Make a copy of the file (i.e., cp $passwdfile ${passwdfile}.tmp
), update it (i.e., echo $PASSWD_FORMAT >> ${passwdfile}.tmp
), then use the mv
command to put it back into place (i.e., mv ${passwdfile}.tmp $passwdfile
).
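The copy-append-rename pattern can be demonstrated in isolation; the `mktemp` path below is an illustrative stand-in for the real account files:

```shell
#!/bin/sh
# Demonstrate the copy-append-rename pattern on a scratch file.
passwdfile=`mktemp`
echo "existing:x:1000:1000::/home/existing:/bin/bash" > $passwdfile

# Copy, append the new entry, then rename the copy into place.
cp $passwdfile ${passwdfile}.tmp &&
echo "newuser:x:1001:1001::/home/newuser:/bin/bash" >> ${passwdfile}.tmp &&
mv ${passwdfile}.tmp $passwdfile ||
echo "Failed to update $passwdfile" >&2

cat $passwdfile
rm -f $passwdfile
```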
The mv
command makes an atomic update when moving files within the same filesystem. This means there is no risk of file corruption if the system loses power or the process is killed: the command either moves the new file into place, or it leaves the old one untouched. SAs must make file updates this way. The script will exit with an error if any part of the file-update process fails:
Update the file used to track the next available UID:
# update the UID file
NEWUID=`expr $NEXTUID + 1`
echo $NEWUID > $BASE_PATH/NEXTUID ||
( echo "Update of $BASE_PATH/NEXTUID failed, exiting now" ; rm_lock_file ; exit 1 )
# update a file used to create home dirs on the NFS server
if [ ! -f $USERNAME_FILE ]
then
touch $USERNAME_FILE
fi
cp $USERNAME_FILE ${USERNAME_FILE}.tmp &&
echo $USERNAME >> ${USERNAME_FILE}.tmp &&
mv ${USERNAME_FILE}.tmp $USERNAME_FILE ||
( echo "failed to update $USERNAME_FILE with this user's account name."
rm_lock_file ; exit 1 )
We store all new user accounts in a text file on the cfengine master system. We'll write another script (PROD/repl/admin-scripts/setup_home_dirs
from the next section) that uses this file to create central home directories. The script ends with a cleanup step:
# if we get here without errors, clean up
rm_lock_file
Put this script in the previously mentioned admin-scripts
directory, and run it from there on the goldmaster host when a new account is needed.
We've left one exercise for the reader: the task of removing accounts from the centralized account files. You'll probably want to use the procedure in which you edit a temporary file and mv
it into place for that task. If the process or system crashes during an update of the account files, corrupted files could copy out during the next scheduled cfengine run. Our size minimums might catch this, but in such a scenario the corrupted files might end up being large, resulting in a successful copy and major problems.
NFS-Automounted Home Directories
We installed the host aurora to function as the NFS server for our future web application. We should also configure the host to export user home directories over NFS.
Configuring NFS-Mounted Home Directories
We'll configure the NFS-share export and the individual user's home directory creation with a combination of cfengine configuration and a script that's used by cfengine.
Put this line into PROD/inputs/classes/cf.main_classes
:
homedir_server = ( aurora )
Create the file PROD/inputs/hostgroups/cf.homedir_server
with these contents:
import:
any::
tasks/app/nfs/cf.central_home_dirs
Create the file PROD/inputs/tasks/app/nfs/cf.central_home_dirs
with these contents:
control:
any::
addinstallable = ( create_homedirs enable_nfs )
copy:
homedir_server.(solaris|solarisx86)::
$(master_etc)/USERFILE
dest=/export/home/USERFILE
mode=444
owner=root
group=root
type=checksum
server=$(fileserver)
encrypt=true
define=create_homedirs
$(master_etc)/skel
dest=/export/home/skel
mode=555
owner=root
group=root
type=checksum
server=$(fileserver)
encrypt=true
r=inf
directories:
homedir_server.(solaris|solarisx86)::
/export/home mode=755 owner=root group=root inform=false
shellcommands:
homedir_server.create_homedirs.(solaris|solarisx86)::
"/opt/admin-scripts/setup_home_dirs"
timeout=300 inform=true
homedir_server.enable_nfs.(solaris|solarisx86)::
"/usr/sbin/svcadm enable network/nfs/server"
timeout=60 inform=true
editfiles:
homedir_server.(solaris|solarisx86)::
{ /etc/dfs/dfstab
AppendIfNoSuchLine "share -F nfs -o rw,anon=0 /export/home"
DefineClasses "enable_nfs"
}
This should all be pretty familiar by now. The interesting part is that we sync the USERFILE
file, and when it is updated we call a script that creates the needed accounts. This is the first NFS share for the host aurora, so we enable the NFS service when the share is added to /etc/dfs/dfstab
.
Create a file at PROD/repl/admin-scripts/setup_home_dirs
to create the home directories:
#!/bin/sh
# distributed by cfengine, don't edit locally
PATH=/usr/sbin:/usr/bin:/opt/csw/bin
USERFILE=/export/home/USERFILE
for user in `cat $USERFILE`
do
USERDIR=/export/home/$user
if [ ! -d $USERDIR ]
then
cp -r /export/home/skel $USERDIR
chmod 750 $USERDIR
chown -R ${user}:${user} $USERDIR
fi
done
Now that the task is done, enable it in the file PROD/inputs/hostgroups/cf.hostgroup_mappings
with this entry:
homedir_server:: hostgroups/cf.homedir_server
Our home-directory server is ready for use by the rest of the hosts on the network.
Configuring the Automounter
Sites often utilize the automounter to mount user home directories. Instead of mounting the home NFS share from all client systems, the automounter mounts individual users' home directories on demand. After a period of no access (normally after the user is logged out for a while), the share is unmounted. Automatic share unmounting results in less maintenance, and it doesn't tax the NFS server as much. Note that most automounter packages can mount remote filesystem types other than NFS.
We're missing the autofs
package in our base Debian installation. At this point, we add the autofs
package to the /srv/fai/config/package_config/FAIBASE
list of packages, so that future Debian installations have the required software. The package already exists on our Red Hat and Solaris installations.
The file names for the automounter configuration files vary slightly between Linux and Solaris. We'll create the needed configuration files and put them into our masterfiles
repository. We created an autofs
directory at PROD/repl/root/etc/autofs
when we first set up our file repository in Chapter 5.
The files we'll utilize and configure on Linux are /etc/auto.master
and /etc/auto.home
. On Solaris, the files are /etc/auto_master
and /etc/auto_home
. The auto.master
and auto_master
files map filesystem paths to files that contain the commands to mount a remote share at that path. The auto.home
and auto_home
files have the actual mount commands.
Our auto.master
and auto_master
files each contain only a single line (shown here in its Linux form; the Solaris auto_master references /etc/auto_home instead):
/home /etc/auto.home
Our auto.home
and auto_home
files are identical, and contain only a single line:
* -nolock,rsize=32767,wsize=32767,proto=tcp,hard,intr,timeo=8,nosuid,retrans=5
aurora:/export/home/&
Note The single line in the auto_home
and auto.home
files is shown as two lines due to publishing line-length limitations. It is important that you create the entry as a single line in your environment. You can download all the code for this book from the Downloads section of the Apress web site at http://www.apress.com.
We have a number of mount options listed, but the important thing to note is that we use a wildcard pattern on the left to match all paths requested under /home
. The wildcard makes the file match /home/nate
as well as /home/kirk
, and look for the same path (either nate
or kirk
) in the share on aurora, using the ampersand at the end of the line.
Next, we create a task to distribute the files at PROD/inputs/tasks/os/cf.sync_autofs_maps
. This task follows what is becoming a common procedure for us, in which we define some variables to hold different file names appropriate for different hosts or operating systems, then synchronize the files, then restart the daemon(s) as appropriate:
control:
any::
addinstallable = ( restartautofs )
AllowRedefinitionOf = (
auto_master
auto_home
)
linux::
auto_master = ( "auto.master" )
auto_home = ( "auto.home" )
auto_net = ( "auto.net" )
etc_auto_home = ( "/etc/auto.home" )
etc_auto_master = ( "/etc/auto.master" )
(solaris|solarisx86)::
auto_master = ( "auto_master" )
auto_home = ( "auto_home" )
auto_net = ( "auto_net" )
etc_auto_home = ( "/etc/auto_home" )
etc_auto_master = ( "/etc/auto_master" )
copy:
any::
$(master_etc)/autofs/$(auto_master)
dest=$(etc_auto_master)
mode=444
owner=root
group=root
server=$(fileserver)
trustkey=true
type=checksum
encrypt=true
define=restartautofs
$(master_etc)/autofs/$(auto_home)
dest=$(etc_auto_home)
mode=444
owner=root
group=root
server=$(fileserver)
trustkey=true
type=checksum
encrypt=true
define=restartautofs
shellcommands:
(debian|redhat).restartautofs::
# when config is updated, restart autofs
"/etc/init.d/autofs reload"
timeout=60 inform=true
(solaris|solarisx86).restartautofs::
# when config is updated, restart autofs
"/usr/sbin/svcadm restart autofs"
timeout=180 inform=false
processes:
debian|redhat::
"automount" restart "/etc/init.d/autofs start" inform=true
solaris|solarisx86::
"/usr/sbin/svcadm enable autofs ; /usr/sbin/svcadm restart autofs" inform=true
We start the automounter when the process isn't found in the process list. On Solaris we attempt to enable the autofs service when it's not running, then we restart it. We don't know what the problem is when the automounter isn't running on Solaris, so the enable
step seems like a logical solution to one possible cause.
Import this task into PROD/inputs/hostgroups/cf.any
to give all your hosts a working automounter configuration.
We now have a system to add users, and we also have a shared home-directory server. This should suffice until you can implement a network-enabled authentication scheme later.
Mail is the primary message-passing mechanism at UNIX-based sites. Cron jobs deliver their output to users via e-mail, cfexecd
sends cfagent
output via e-mail, and many application developers and SAs use e-mail to send information directly from applications and scripts.
Mail relays on internal networks route e-mail and queue it up for the rest of the hosts on the network when remote destinations become unreachable. You should centralize the disk space and CPU resources needed for mail queuing and processing. In addition, it's simpler to configure a centralized set of mail relays to handle special mail-routing tables and aliases than it is to configure all the mail-transfer agents on all machines at a site.
We'll use our etchlamp Debian host as our site's mail relay. We've built this host entirely using automation, so it's the sensible place to continue to focus infrastructure services.
We add a CNAME for relayhost.campin.net to PROD/repl/root/etc/bind/debian-ext/db.campin.net
, and it'll simply go out to etchlamp on the next cfexecd
run:
relayhost IN CNAME etchlamp
Be sure to increment the serial number in the zone file.
We run postfix on all our Debian hosts, and we'll stick with postfix as our mail-relay Mail Transfer Agent (MTA). The default postfix configuration on etchlamp needs some modifications from the original file placed in /etc/postfix/main.cf
. Modify the file like this:
smtpd_banner = $myhostname ESMTP $mail_name (Debian/GNU)
biff = no
# appending .domain is the MUA's job.
append_dot_mydomain = no
# TLS parameters
smtpd_tls_cert_file=/etc/ssl/certs/ssl-cert-snakeoil.pem
smtpd_tls_key_file=/etc/ssl/private/ssl-cert-snakeoil.key
smtpd_use_tls=yes
smtpd_tls_session_cache_database = btree:${queue_directory}/smtpd_scache
smtp_tls_session_cache_database = btree:${queue_directory}/smtp_scache
myhostname = campin.net
alias_maps = hash:/etc/aliases
alias_database = hash:/etc/aliases
mydestination = campin.net
myorigin = campin.net
mynetworks = 127.0.0.0/8, 192.168.1.0/24
mailbox_command = procmail -a "$EXTENSION"
mailbox_size_limit = 0
recipient_delimiter = +
inet_interfaces = all
virtual_maps = hash:/etc/postfix/virtual
Next, create a file that we'll copy to /etc/postfix/virtual
on the mail relay:
campin.net OK
@campin.net [email protected]
We use the virtual-domain functionality of postfix to alias the entire campin.net domain to one e-mail address: [email protected]. This ensures that any mail sent will arrive in the SA team's mailbox (hosted with an e-mail hosting provider). Later, we can use the same virtual table to forward specific e-mail addresses to other destinations, instead of the single catch-all address we're using now.
When the source file /etc/postfix/virtual
is updated, we need to run this command as root:
# /usr/sbin/postmap /etc/postfix/virtual
This builds a new /etc/postfix/virtual.db
file, which is what postfix actually uses. We'll configure cfengine to perform that step for us automatically.
Place the two files in a replication directory on the cfengine master (goldmaster), and also create a new directory under the tasks hierarchy intended for postfix:
# mkdir /var/lib/cfengine2/masterfiles/PROD/repl/root/etc/postfix/
# cp main.cf virtual /var/lib/cfengine2/masterfiles/PROD/repl/root/etc/postfix/
# mkdir /var/lib/cfengine2/masterfiles/PROD/inputs/tasks/app/postfix
First, create a class called relayhost
, and place the host etchlamp in it. Place this line in PROD/inputs/classes/cf.main_classes
:
relayhost = ( etchlamp )
Now create the task PROD/inputs/tasks/app/postfix/cf.sync_postfix_config
with these contents:
control:
debian_4_0.relayhost::
main_cf = ( "main.cf_debian-relayhost" )
virtual = ( "virtual-relayhost" )
copy:
debian_4_0.relayhost::
$(master_etc)/postfix/$(main_cf)
dest=/etc/postfix/main.cf
mode=444
owner=root
group=root
type=checksum
server=$(fileserver)
encrypt=true
# we already have reloadpostfix from
# tasks/os/cf.resolve_conf, we are reusing it
define=reloadpostfix
$(master_etc)/postfix/$(virtual)
dest=/etc/postfix/virtual
mode=444
owner=root
group=root
type=checksum
server=$(fileserver)
encrypt=true
define=rebuild_virtual_map
We define variables for the virtual
and main.cf
files, and copy them individually. They're set up individually because different actions are required when the files are updated. We are careful to copy the configuration files that we've prepared only to Debian 4.0, using the debian_4_0 class
. When Debian 5.0 ("Lenny") is released, we'll have to test our config files against the postfix version that it uses. We might have to develop a new "relayhost" postfix configuration file specifically for Lenny when we upgrade or reimage the "relayhost" system to use the newer Debian version. Once again, we assume that something won't work until we can prove that it will.
Here the rebuild_virtual_map
class defined by the copy
action triggers a shellcommands
action that rebuilds the virtual map when the file is updated:
shellcommands:
rebuild_virtual_map::
"/usr/sbin/postmap /etc/postfix/virtual ; /usr/sbin/postfix reload "
timeout=60 inform=true
Now we need another hostgroup file for the "relayhost" role. We create PROD/inputs/hostgroups/cf.relayhost
with these contents:
import:
any::
tasks/app/postfix/cf.sync_postfix_config
Then to finish the job, map the new class to the hostgroup file by adding this line to PROD/inputs/hostgroups/cf.hostgroup_mappings
:
relayhost:: hostgroups/cf.relayhost
Now etchlamp is properly set up as our mail-relay host. When our network is larger, we can simply add another Debian 4.0 host to the relayhost
class in PROD/inputs/classes/cf.main_classes
, thus properly configuring it as another mail relay. Then we just update the DNS to have two A records for relayhost.campin.net, so that the load is shared between the two. An additional benefit of having two hosts serving in the "relayhost" system role is that if one host fails, mail will still make it off our end systems.
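In that two-relay setup, the single CNAME for relayhost would be replaced with round-robin A records along these lines (the addresses below are hypothetical):

```
relayhost    IN    A    192.168.1.241
relayhost    IN    A    192.168.1.242
```

Most resolvers rotate through the returned addresses, spreading relay traffic across both hosts; remember to increment the zone serial number when making this change.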
You have several options to accomplish the task of configuring systems across the site to utilize the mail relay. For example, you can configure Sendmail, qmail, and postfix in a "nullclient" configuration where they blindly forward all mail off the local system. Or you could use the local aliases file to forward mail as well. The method, and automation of that method, is left up to the reader. You should now have a solid understanding of how to use cfengine to automate these configuration changes once you've worked out the procedure on one or more test systems.
In a rather short amount of time, we've gone from having no systems at all to having a basic UNIX/Linux infrastructure up and running. This by itself might not be very interesting, but what is noteworthy is that everything we've done to set up our infrastructure was accomplished using automation.
If our DNS server (and mail-relay) host suffers a hard-drive crash, we will simply replace the drive and reimage the host using FAI and the original hostname. Cfengine will configure a fully functional replacement system automatically, with no intervention required by the SA staff. The benefits of this are obvious:
We now have sufficient core services in place at our site to support customer-facing applications. In the next chapter, we'll take advantage of that fact, and deploy a web site.