Chapter 7. SunOS/Solaris

As mentioned previously in this book, disks will fail. Occasionally, even the disk that contains the operating system will fail. How do you protect against such a disaster? Depending on your budget and the level of availability that you need, you may explore one or more of the following options if you’re running Solaris:

Solstice Disk Suite mirrored root disk

If you are running Solaris, you can use Solstice Disk Suite (SDS) to mirror your root disk. (Other platforms have similar products.) If one disk of the mirror fails, the other automatically takes over. SDS is bundled free with Solaris, and mirroring the root drive is relatively easy.[44] It’s also easy to undo if things get a little mixed up: boot off a CD-ROM, mount the root filesystem from the good drive, and change /etc/vfstab to use the actual disk slice names instead of the SDS metadevice names. There are two downsides to this method. The first is that many people use Veritas Volume Manager to manage the rest of their disks, and using SDS for only the root disk may be confusing. The second is that mirroring does not protect against root filesystem corruption. If someone accidentally overwrites /etc, the mirroring software will faithfully copy that mistake to both disks.
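The vfstab change described above can be sketched in a few lines of Python. This is only an illustration, not a supported procedure: the metadevice-to-slice mapping (d0, d1, c0t0d0s0, and so on) is hypothetical and must be replaced with the names from your own configuration, and the function works on the text of the file rather than editing /etc/vfstab in place.

```python
import re

# Hypothetical mapping from SDS metadevice names to the physical slices
# on the surviving disk; the real names depend on your configuration.
METADEVICE_TO_SLICE = {
    "d0": "c0t0d0s0",  # root
    "d1": "c0t0d0s1",  # swap
}

# Matches both block (/dev/md/dsk/dN) and raw (/dev/md/rdsk/dN) paths.
MD_PATH = re.compile(r"/dev/md/(r?dsk)/(d\d+)\b")


def unmirror_vfstab(text, mapping):
    """Return vfstab text with metadevice paths replaced by slice paths."""
    def swap(match):
        kind, md = match.group(1), match.group(2)
        if md not in mapping:
            return match.group(0)  # leave unknown metadevices untouched
        return "/dev/%s/%s" % (kind, mapping[md])
    return MD_PATH.sub(swap, text)


if __name__ == "__main__":
    sample = (
        "/dev/md/dsk/d1\t-\t-\tswap\t-\tno\t-\n"
        "/dev/md/dsk/d0\t/dev/md/rdsk/d0\t/\tufs\t1\tno\t-\n"
    )
    print(unmirror_vfstab(sample, METADEVICE_TO_SLICE), end="")
```

Working on a copy of the file and reviewing the result before putting it in place on the mounted root filesystem keeps a bad mapping from making the system harder to boot.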

Tip

Although this chapter talks mainly about SunOS/Solaris, the principles covered here are used in the other bare-metal recovery chapters.

Veritas Volume Manager encapsulated root disk

If you have purchased the Veritas Volume Manager for Solaris, you also have the option of mirroring your root drive with it. (Putting the root drive under Veritas control is called encapsulation.) All I can say about this method is that more than one Veritas consultant or instructor has told me not to encapsulate the root disk, because it creates too many potential catch-22 situations. Of course, this method also does not protect against corruption of the root filesystem.

Standby root disk

A standby root disk is created by copying the currently running root disk to the alternate boot disk. If the system crashes, you can simply tell it to boot off the other disk. Some people even automate this to the point that the system always boots off the other disk when it is rebooted, unless it is told not to do so. (There is a tool on http://www.backupcentral.com that does just that.)

The advantage of the two mirroring methods described earlier (SDS and Veritas) is that if one of the mirrored root disks fails, the operating system continues to function until it is rebooted. The disadvantage is that they do not protect you against a bad patch or administrator error. Anything that causes “logical” problems with the operating system is simply duplicated on the other disk. A standby root disk is just the opposite. It protects you against administrator and other logical errors made since the last copy, but you must reboot to boot off the standby disk.
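As a rough sketch of what creating a standby root disk involves, the following Python function builds (but does not run) a list of commands for copying one root slice to an alternate disk. Everything here is an assumption rather than a procedure from this chapter: the slice names are hypothetical, the copy uses ufsdump and ufsrestore, the installboot line applies to SPARC systems with a UFS root, and a real procedure would also have to edit the copied /etc/vfstab so that it refers to the standby disk.

```python
def standby_copy_plan(src_slice, dst_slice, mount_point="/mnt"):
    """Return shell commands (as strings) for copying one root slice.

    Nothing is executed; the caller decides whether to print the plan
    or run it. Slice names such as c0t0d0s0 are supplied by the caller.
    """
    return [
        # Build a fresh filesystem on the standby slice.
        "newfs /dev/rdsk/%s" % dst_slice,
        "mount /dev/dsk/%s %s" % (dst_slice, mount_point),
        # Copy the running root filesystem onto the standby slice.
        "ufsdump 0f - /dev/rdsk/%s | (cd %s && ufsrestore rf -)"
        % (src_slice, mount_point),
        # Make the standby disk bootable (SPARC, UFS root assumed).
        "installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/%s"
        % dst_slice,
        "umount %s" % mount_point,
    ]


if __name__ == "__main__":
    # Hypothetical source and standby slices.
    for command in standby_copy_plan("c0t0d0s0", "c0t1d0s0"):
        print(command)
```

Generating the plan as text first makes it easy to review the device names before anything destructive, such as newfs, is run against the wrong disk.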

Mirrored/HA system

A high-availability (HA) system, as discussed in Chapter 6, is the most expensive of these four options, but it offers the greatest level of protection against such a failure. Instead of having a mirrored or standby root disk, you have an entire system standing by ready to take over in case of failure.

What About Fire?

All of the preceding methods will allow you to recover from an operating system disk failure. None of them, however, will protect you when that server burns to the ground. Fires and other disasters tend to take out both sides of a mirrored disk pair or HA system. If that happens, or if you didn’t use any of the preceding methods prior to an operating system disk failure, you will need to recover the root disk from some type of backup.

Recovering the root disk is called a bare-metal recovery, and there are many platform-specific, bare-metal recovery utilities. The earliest example of such a utility on a Unix platform is AIX’s mksysb command. mksysb is still in use today and makes a special backup tape that stores all of the root volume-group information. The administrator can replace the root disk, boot off the latest mksysb tape, and the utility automatically restores the operating system to that disk. Today, there are AIX’s mksysb, Compaq’s btcreate, HP’s make_recovery, and SGI’s Backup. Each of these utilities is covered in its own later chapter.

Without a planned bootstrap recovery system, the usual solution to the bare-metal recovery problem is as follows:

  1. Replace the defective disk.

  2. Reinstall the OS and its patches.

  3. Reinstall the backup software.

  4. Recover the current, backed-up OS on top of the old OS.

This solution is insufficient, misleading, and not very efficient. The first problem is that you actually end up laying the OS down twice. The second problem is that it doesn’t work very well. Try overwriting some of the system files when a system is running from the disk to which you are trying to restore.



[44] It is beyond the scope of this book, though. There are too many products like this to cover, so I won’t be covering any of them in detail.
