In Chapter 1, we looked at disaster recovery as a whole. The nuts and bolts of backup and recovery are but a small part of the overall disaster recovery picture. Before we begin looking at the details of how to perform certain types of backups, let’s look at backups in general.
The casual reader might assume that this chapter is an introduction to basic backup concepts. While that is, in fact, the purpose of this chapter, it is also true that many seasoned administrators are unfamiliar with the ideas presented here. One reason for this is that administrators find themselves constantly being pulled away from “mundane” activities like backups for things that are thought to be more “important,” like installing new servers and figuring out why the systems are running slowly. Also, many administrators may go several years without ever needing a restore. (The need to use your backups on a regular basis would undoubtedly change your ideas about their importance.)
I wrote this book because backups (and recoveries) have been my primary area of emphasis for several years, and I would like to share the lessons I’ve learned from this focused activity. This chapter provides an overview of how your backups should work. It also explains many basic, yet extremely important, concepts upon which any good backup plan should be based and upon which any implementation discussed in this book will be based.
There are many stories in this book, like the one in the following sidebar. Each is a true story that happened to me or to someone I know. These are not urban legends or horror stories passed on from admin to admin. These are firsthand encounters with disaster. Why is that important? Each story makes a point, and none was made up just to make that point. The things that I warn about in this book really happen. This can be a very tough job if you are not prepared, so read closely.
The One That Got Away
“You mean to tell me that we have absolutely no backups of paris whatsoever?” I will never forget those words. I had been in charge of backups for only about two months, and I just knew my career was over. We had moved an Oracle application from one server to another about six weeks earlier, and there was one crucial part of the move that I missed. I knew very little about database backups in those days, and I didn’t realize that I needed to shut down an Oracle database before backing it up. This was accomplished on the old server by a cron job that I never knew existed. I discovered all of this after a disk on the new server went south.
“Just give us the last full backup,” they said. I started looking through my logs. That’s when I started seeing the errors. “No problem,” I thought, “I’ll just use an older backup.” The older logs didn’t look any better. Frantically, I looked at log after log until I came to one that looked as if it were OK. It was just over six weeks old. When I went to grab that volume, I realized that we had a six-week rotation cycle, and we had overwritten that volume two days ago.
That was it! At that moment, I knew that I’d be looking for another job. This was our purchasing database, and this data loss would amount to approximately two months of lost purchase orders for a multibillion-dollar company.
So I told my boss the news. That’s when I heard, “You mean to tell me that we have absolutely no backups of paris whatsoever?” (Isn’t it amazing how I haven’t forgotten its name? I don’t remember any other system names from that place, but I remember this one.) I felt so small that I could have fit inside a 4-mm tape box. Fortunately, a system administrator worked what, at the time, I could only describe as magic. The dead disk was resurrected, and the data was recovered straight from the disk itself. We lost only a few days’ worth of data. Our department had to send a memo to the entire company saying that any purchase orders entered in the last two days had to be reentered. I should have framed a copy of that memo to remind me what can happen if you don’t take this job seriously enough. I didn’t need to, though; its image is permanently etched in my brain.
Some of this book’s reviewers said things like, “That’s pretty bold! You’re writing a book on backups, and you start it out with a story about how you messed up. Some authority you are!” Why did I include it? Through all the years, and all the outages, this one sticks in my mind. Perhaps that’s because it’s the only one that almost “got me.” Had it not been for the miraculous efforts of a wonderful administrator named Joe Fitzpatrick, my career might have been over before it started. I include this anecdote because:
It’s the one that changed the direction of my career.
There are several valuable lessons that I learned from it, which I discuss in this book.
It could have been avoided if I had had a book like this one.
You must admit that it’s pretty darn scary.
Most backup utilities were written originally to back up to tape, and most people do back up to tape. Therefore, most books and manpages talk about backing up to tape. However, many people are backing up to CDs or magneto-optical disks. These media types have many advantages, since they act more like disk drives than tape drives. Random access of backup data is easier, and you can read them using any block size you wish, since they do not record interrecord gaps as tape drives do.[2]
Since many people are no longer using tape, this book will use the more generic word “volume” whenever appropriate. You’ll also find the term “backup drive” instead of “tape drive.” Again, that is because the backup drive could be a CD burner, especially if you’re a Linux user. The book uses the words “tape” and “tape drive” only when they are necessary and appropriate.