Unrelated Miscellanea

We were going to call this section “Oh, and by the way,” but that seemed like a really weird heading.

Protect Your Career

One of the reasons that backups are unpopular is that people are worried that they might get fired if they do them wrong. People do get in trouble when restores don’t go right, but following the suggestions in this section will help you to protect yourself from “recovery failure fallout.”

Self-preservation: document, document, document

Have you ever tried to go on vacation? If you’re the only one who understands the restore process or the organization of your media, you can bet that you will be called if a big restore is required. Backups are one area of system administration in which inadequate documentation can really get you in trouble. It’s hard to go on vacation, get promoted, or do anything that would pull you away from the magical area that only you know. Your backups and restores should be documented to the point that any system administrator can follow them step-by-step in your absence. That is actually a good way to test your documentation—have someone else try to use it.

The opposite of good documentation is, of course, bad, or nonexistent, documentation. Bad documentation is the surest way to help you find a new job. If you do ever manage to take a real vacation in which you don’t carry a beeper, check your voice mail, or check your email, watch out. Murphy’s law governs vacations as well. You can guarantee yourself that you, or more accurately, your coworkers, will have a major outage that week. If they crash and burn because you left them no clue as to how to perform a restore, they will be looking for you when you return. You will not be a popular person, and you just might find yourself combing through the want ads.

Documentation is also an important method of letting your internal customers know what you are doing. For example, if you skip certain types of files or filesystems, let people know that. I can remember at least one very long conversation with a user who really didn’t want to hear that I didn’t back up /tmp! “I never knew that TMP was short for temporary!”

Strategy: make backups an integral part of the installation process

When a new system comes in the door, someone makes sure that it has power. Someone is responsible for the network connection, assigning an IP address, adding it to the NIS configuration, and installing the appropriate patches. All of those things happen because things don’t work if they don’t happen. Unfortunately, no one will notice if you don’t add the machine to your backup list. That is, of course, until it crashes and they need something restored. You have the difficult task of making something as “unimportant” as backups become just as natural as adding the network connection.

Tip

A new system coming in the door is usually the best test machine for a complete server recovery/duplication test. Not many people miss a machine that they don’t have yet.

The only way that this is going to happen is if you become very involved in the whole process. Perhaps you are a junior person, and you never sit in on the planning meetings because you don’t understand what’s going on. Perhaps you do understand what’s going on but just hate to go to meetings. So do I. If you don’t want to attend every meeting yourself, just make sure that someone is looking out for your interests in those meetings. Maybe there’s an ex-backup operator who goes to the meetings and will be sympathetic to your cause. Have briefings with him, and remind him to make sure that backup needs are being addressed, or to let you know about any new systems that are coming down the pike. Occasionally go to the meetings yourself, and make sure that people know that you and your backups exist; hopefully they will remember that the next time they think about installing a new system without telling you. Never count on this happening, though. You’ve got to be ever-diligent in looking for new systems in need of backup.
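One way to stay diligent is to compare whatever inventory of hosts you have against your backup list on a regular basis. The following is only a minimal sketch of that idea: the function name and the sample hostnames are hypothetical, and in practice the two lists would come from wherever your site keeps them (an NIS hosts map, a text file, the backup product’s client list, and so on).

```python
def find_unprotected(inventory, backup_list):
    """Return hosts that are in the inventory but not in the backup list.

    Names are compared case-insensitively with surrounding whitespace
    removed; blank entries are ignored. The result is sorted.
    """
    def normalize(name):
        return name.strip().lower()

    known = {normalize(h) for h in inventory if h.strip()}
    protected = {normalize(h) for h in backup_list if h.strip()}
    return sorted(known - protected)


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    inventory = ["apollo", "zeus", "hermes", "Athena"]
    backup_list = ["apollo", "athena"]
    for host in find_unprotected(inventory, backup_list):
        print(f"WARNING: {host} is not in the backup list")
```

Run from cron and mailed to you, a report like this will catch the systems that nobody told you about.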

New installations are not the only thing that can affect your backups. New versions of the operating system, new patches, and new database versions all have the potential to break your backups. Most system administrators bring in a new version of their operating system or database and run it on a new box or development box before they commit it to production. Make sure that your backup programs run on the new platform as well. I can think of a number of times that new versions broke my backups. Here are a few examples:

  • HP-UX 10.20 supports file sizes greater than 2 GB, but the dump manpage says that dump will not back up a filesystem with large files.

  • Informix 7.1.x switched the order in which ontape asked its questions. My script used a here document,[7] and it was expecting the questions in the old order, of course, and just didn’t work all of a sudden. (This one surprised me, because it wasn’t a major upgrade. We went from 7.1.3.ud1 to 7.1.3.ud3, or something like that.)

  • Oracle changed significantly from Version 6 to 7. As I recall, they maintained backward compatibility, but there was a slew of new ways to do things that had to do with backups.

  • An operating system upgrade introduced a SCSI_TIMEOUT variable into the kernel that even support didn’t know about. The result was that, when I issued an mt . . . fsr command that took longer than 60 minutes, it just timed out and went back to the prompt. (I had really old 8200 8-mm drives, which did not support fast file search; an fsr that took more than an hour was actually very common.) This one surprised me as well, because my backups worked just fine. It was only when I went to restore that I had the problem. I ended up tweaking an include file somewhere and rebuilding the kernel.

  • Another operating system upgrade introduced a new “feature” that limited to 64K the number of records you could skip with the fsr command. My commercial backup utility just stopped dead in its tracks. This one actually took some coding on the OS vendor’s end, resulting in a kernel patch. It was a bit difficult to accomplish, because they considered it an enhancement and not a bug fix.

The longer I think about it, the more stories I come up with. If you’ve been doing this for a while, I’m sure you have a few of your own. Suffice it to say that OS and application upgrades can and do cause problems for the backup person. Test . . . test . . . test . . .
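Several of the stories above share one lesson: the backup ran fine, and the problem appeared only at restore time. A test after an upgrade should therefore include a restore. Here is a minimal sketch, assuming a plain tar archive stands in for your real backup tool; the function names are hypothetical. It backs up a directory, restores it to a scratch area, and compares checksums.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def checksums(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def backup_and_verify(source_dir):
    """Back up source_dir with tar, restore it elsewhere, and compare.

    Returns True only if every file comes back with identical contents.
    """
    source_dir = Path(source_dir)
    with tempfile.TemporaryDirectory() as work:
        archive = Path(work) / "test_backup.tar"
        restore_dir = Path(work) / "restore"
        restore_dir.mkdir()

        # The "backup" step: write the directory tree to an archive.
        with tarfile.open(archive, "w") as tar:
            tar.add(source_dir, arcname=".")

        # The "restore" step: read it back into a scratch directory.
        with tarfile.open(archive, "r") as tar:
            tar.extractall(restore_dir)

        return checksums(source_dir) == checksums(restore_dir)
```

Substitute the backup and restore commands you actually use; the point is that the test is not finished until the restored data has been compared with the original.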

Get the Money Your Backups Need

This final section has absolutely nothing to do with backups. It has to do with politics, budgeting, money, and cost justifications. I know that sometimes it sounds as if I think that backups get no respect. Maybe you work at Utopia Inc., where the first thing they think about is backups. The rest of us, on the other hand, have to fight for every volume, drive, and piece of software that we buy in order to accomplish this increasingly difficult task of getting it all on a backup volume.

Getting the money you need to accomplish your task can sometimes be very difficult. Once a million-dollar computer is rolled in the door and uncrated, how do you tell the appropriate department that the 2-GB backup drive that came with it just isn’t going to cut it? Do you know how many hoops they went through to spend a million dollars on one machine? You want them to spend how much more?

Be ready . . .

The first thing is to be prepared. Be ready to justify what you need. Be ready with information such as:

  • Statistics on recoveries that you have performed

  • Any numbers that you have on what downtime and lost data would cost the company

  • Numbers that demonstrate how a purchase would help reduce staff costs

  • Numbers that demonstrate how the current backup system is being negatively impacted by growth or new applications

  • Cost comparisons between the one-time cost of an expensive jukebox and the continuing cost of the manual labor required to swap volumes every night (also be prepared to explain how jukeboxes reduce the chance of human error and how that helps the company as well)

  • A documented policy that every new gigabyte results in a surcharge of a certain amount of money

  • A well-designed presentation of what service your backups will provide and the speed at which you can recover data[8]

  • A letter all ready to go, with which your boss is comfortable, that explains very matter-of-factly what they can expect if they don’t provide the funding you need
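The jukebox-versus-labor comparison in the list above comes down to simple arithmetic. The sketch below shows one way to work out how many months it takes for a one-time purchase to pay for itself. The function name is hypothetical, and every figure in the example is a made-up placeholder; use your own real numbers, and be ready to back them up.

```python
def break_even_months(jukebox_cost, minutes_per_night, hourly_rate,
                      nights_per_month=30):
    """Months until a one-time jukebox cost equals the labor it replaces.

    minutes_per_night is the operator time spent swapping volumes by hand,
    and hourly_rate is the loaded cost of that operator's time.
    """
    monthly_labor = minutes_per_night / 60 * hourly_rate * nights_per_month
    if monthly_labor <= 0:
        raise ValueError("manual labor cost must be greater than zero")
    return jukebox_cost / monthly_labor


if __name__ == "__main__":
    # Placeholder figures, for illustration only.
    months = break_even_months(jukebox_cost=18000, minutes_per_night=30,
                               hourly_rate=40)
    print(f"The jukebox pays for itself in about {months:.0f} months")
```

This counts only swap labor. The reduced chance of human error is a separate argument, and usually a harder one to put a number on.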

Make a formal presentation

The more expensive your solution is, the more important it is that you make a formal presentation, especially if you are in a corporate environment. A formal technical presentation has three parts: an executive summary, an overview section that goes into more detail, and then a technical specifications section for those who are really interested.

Executive summary

This should be one page and should explain on a very high level what is proposed in the rest of the presentation. Global figures and broad descriptions are good; do not go into too much detail. This is made for the VP who needs to do the final sign-off but has 20 other presentations just like it to review at the same time. Basically, state the current problem and describe your solution.

General section

Go into detail in this section. Use plenty of section headings, allowing your readers to read ahead and skim over it if they like. Headings also allow the people who read only the executive summary to look up any specific area that they are not clear on. The outline of the general section should match that of the executive summary. You can include things like references to other publications, such as magazine reviews of a particular product, but do not quote them in detail. If they are relevant, you can attach copies in the technical section. Make sure you demonstrate that you have thought this through and that it is not just a stopgap measure. Put a high-level comparison between the option you chose and the other options available, and explain why you chose the one you did. Describe how it allows for future growth and how much growth it will allow before you must reconsider. Also explain what plans you have for the old methodology and what conversion method you are going to use, such as running both in parallel for a while. Tables are also good. If you can use real numbers, it is much more effective—just make sure you can back them up. If anyone believes that the numbers are made up, it will totally invalidate the report. Try to compare the up-front cost of your solution with the surprise cost of lost data.

Technical specifications

Go wild. If anyone has made it this far, they’re either really interested or a true computer techie just like you! If this report is the cost justification for a new backup drive, find a table that compares the relative cost per MB of all the various options. Include hard numbers and any white papers that are included with the proposed product. If you think it is relevant, but possibly too long and boring, this is the place to put it.
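If you cannot find a ready-made cost-per-MB table, you can build one. The following sketch assumes that the total cost of an option is the drive plus the volumes needed, which is a simplification; the function name, option names, and prices are all hypothetical placeholders, not real product figures.

```python
def cost_per_mb(drive_cost, volume_cost, volume_capacity_mb, volumes):
    """Total cost of drive plus volumes, divided by total capacity in MB."""
    total_cost = drive_cost + volume_cost * volumes
    total_mb = volume_capacity_mb * volumes
    return total_cost / total_mb


if __name__ == "__main__":
    # (name, drive cost, cost per volume, capacity per volume in MB, volumes)
    # Placeholder values, for illustration only.
    options = [
        ("Option A", 1000, 10, 2000, 50),
        ("Option B", 4000, 30, 20000, 50),
    ]
    print(f"{'Option':<10} {'Cost per MB':>12}")
    for name, drive, volume, capacity, count in options:
        print(f"{name:<10} {cost_per_mb(drive, volume, capacity, count):>12.4f}")
```

Note that the answer depends on how many volumes you buy, because the cost of the drive is spread across all of them.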



[7] A here document is a way to call a command within a script and answer the questions that it will ask you.

[8] Don’t commit yourself to unrealistic restore times, but if the new system can significantly improve restore times, show that!
