Backing Up an RDBMS

Protecting an RDBMS is very complex. There are several storage elements, including datafiles, rollback logs, transaction logs, and the master database. How you get all of the data to a secondary storage medium if it’s changing all the time?

Physical and Logical Backups

There are two primary methods of backing up an RDBMS: physical backups and logical backups. A physical backup physically backs up the data files. This also is referred to as a database backup. There are also two types of physical backups, a cold backup and a hot backup. A cold backupis done by quiescing, or shutting down, the database prior to doing the backup. This is often the simplest way to do a database backup, especially if your database’s data files reside in the filesystem. All you have to do is shut down the database and run your normal filesystem backup utility. Unfortunately, this method may require your database to be shut down for a long time. That is why more and more environments are performing hot backups, which are done while the database is online. This, of course, requires a lot more work behind the scenes, since you are trying to copy the data files while the database is writing to them. This requires a backup utility that understands the internal structure of the database. The purpose of this utility is to log the changes to a particular datafile while it is being copied to the backup media. This allows for a consistent backup image.

A logical backup copies, or exports, data objects (usually tables) but does not record the data’s location. A logical backup can be used to restore a deleted table without having to restore all of the datafiles in which it resides. It also can be used to move a table from one database to another. These options are made possible by the definition of a logical backup: it backs up the data and not the data’s location. Therefore, it can be restored into any location. Logical backups, however, do not have the ability to do a point-in-time recovery. They also can introduce referential integrity problems, since you could load a table that requires information from another table that is not present. The biggest problem with exports, though, is that they almost always need to be done with the database offline.

There are several methods of doing physical backups:

  • If you are using Oracle or Sybase, and your datafiles are cooked files, you can simply shut down the database and do a full system backup. Since everything exists as a regular filesystem file, you will get everything backed up. If you need to restore, you can then restore the entire database from this backup. You then would need to replay the transaction logs against the old database files. This is why you cannot do this with Informix; Informix has no way of playing the transaction logs without restoring from a backup that was made with its backup utility.

  • If you are using Oracle or Sybase, and your datafiles are raw partitions, you still can shut down the database and back them up if you have a utility or script to do so. For example, in Unix you would use dd. This is quite a bit more complex than the first method, though, because you need to know which devices to run dd against.

  • The next method of backing up a database is to back it up live to disk or tape, using a utility provided for that purpose. Informix provides the ontape utility and Sybase provides the dump utility. Oracle does not have such a utility,[60] but it does provide the ability for you to write your own. (Oracle’s alter database begin/end backup commands allow you to back up an Oracle database in a number of ways.) This provides a lot of flexibility but also a lot of hassle if you’re not really good at scripting. However, Chapter 15 describes oraback.sh , a public domain utility that you should be able to use with no problem.

  • The newest method of backing up databases is to use a utility that sends one or more streams of data to a commercial storage manager (i.e., backup software). This is the cleanest method if you can afford it (it costs several thousand dollars per system). Each of the three major database vendors provides a utility to send data to third-party storage managers: Informix provides Online Backup and Recovery (onbar), Sybase provides dump, Oracle7 provides the Enterprise Backup Utility (EBU ), and Oracle8 provides Recovery Manager (rman).

  • A few commercial utilities provide functionality that is different from that of the previously listed options. The most popular of these is SQL Backtrack, which is now sold by BMC software. These utilities wrap around some of the native utilities and do not always use the vendor-supplied API. They also provide the ability to send a data stream to commercial backup products, and some products have interfaces to the utilities. Still other options include interfaces to products that do not use the newer interfaces. For example, many vendors prefer not to use Oracle’s EBU and have written commercial interfaces that use the same native commands that oraback.sh uses. The vendors claim more reliability and/or faster performance. The validity of backup and recovery programs that do not use the vendor-supplied API is left as a decision for the reader.

Logical backups are actually much simpler than physical backups. Each of the databases has an export utility that creates a logical backup of one or more database objects to a file. Some of the commercial utilities also allow you to integrate logical and physical backups into your backup system.

Get Every Instance

Chapter 2 includes a section called Section 2.5.3 It talks about how your backup programs should be written in such a way that everything in your system is automatically discovered and backed up. If you add a new filesystem, you should not then have to edit your backup scripts to get it backed up. This goes double for databases, because they often are added and deleted much more frequently than filesystems.

You need some way to ensure that every database instance on every server is being backed up. In Chapter 4, there is a program called hostdump.sh. hostdump.sh makes sure that it backs up all the filesystems on the box by starting at the fstab file, which lists all filesystems. Wouldn’t it be nice if you had such a file for databases? Oracle and Sybase already do. Sybase has the interfaces file that lists every server on each system. If an instance is not listed in this file, users cannot connect to it. Oracle has the oratab file that accomplishes the same task, but its use is not mandatory, as Sybase’s interfaces file is. For example, some sites don’t use the oratab file because they have only one instance. The best way to enforce the use of the oratab file is to write startup scripts that start up only databases that are in that file.

Informix has no file that stores all of the Informix instances on a server. This is disappointing since there are many companies that run more than one instance of Informix on a server. The good news is that you can make your own. You can create an inftab file that looks a lot like the oratab file and accomplishes the same thing. Again, the way to enforce its use is to write startup programs that start up only instances that are in that file.

Since the files described here are not always used and yet should be, I’d like to emphasize what I just said: you need a centralized file that lists all of the instances on the server. You then should start with that file to determine what instances are on a given server—instances that need to be backed up. Sybase already has the interfaces file and enforces its use. Oracle has the oratab file, but its use is optional. You can create an inftab file for Informix, but its use also would be optional. Enforce the use of the oratab and inftab files by writing startup scripts[61] that start up only instances that are listed in those files. A wayward (or busy) DBA still can create an instance without putting it in this file and can even get it running. But if you reboot the box enough times, the DBA will be sure to put it in the startup file so she doesn’t have to manually start it every time!

Transaction Log Dumps Are Not Incremental Backups

This important topic often is misunderstood. The confusion comes from Sybase documentation that often refers to a transaction log dump as an incremental backup. They are not the same thing!

What is the difference between the two? An incremental backup is a special backup that contains only the changed pages (blocks) since the last full (or higher level incremental) backup. A transaction log dump is a backup of all the transactions that have occurred since the last transaction log dump. They may sound similar, but they’re not. The latter is much more difficult to manage and much slower to read.

Perhaps the best way to illustrate this would be to discuss Informix’s ontape program, which has both incremental and transaction log backups. Suppose that you created a level-0 (full) backup on Friday. During the week, you did not perform any full backups and ran only continuous (transaction log) backups. Now suppose that it is Thursday, and you need to restore your database. You would have your full backup from Friday and your continuous (transaction log) backups from each day. To restore, you would need to read your full backup and then read each of the continuous backup volumes in order. Assuming you made one backup volume per day, you would need seven volumes.

Now assume the same scenario as the preceding, except that you also ran an incremental (level-1) backup every night. If you needed to restore the database on Thursday, you would need the full backup from Friday, the latest incremental backup volume (presumably Wednesday’s), and the continuous backup volume for Thursday. That is three volumes instead of seven. That is because the level-1 incremental backup contains all changes since the level-0 backup.

Besides the difference in complexity, reading an incremental backup also is much quicker than reading a transaction log backup. Ask anyone who has rolled through several days’ worth of transaction logs. In one benchmark that I performed, reading two weeks of transaction logs took 36 hours. Reading an incremental backup covering the same time period took only one hour. The reason for this is simple. A given page may be changed several times. Replaying the transaction log also would change it several times. Loading a true incremental backup would change it only once, to its last value.

Sybase’s Backup Server and Oracle7’s EBU have no concept of this type of incremental backup. Informix’s ontape and onbar do. (Oracle8’s rman does have an incremental backup capability.)

Do-It-Yourself: Creating Your Own Backup Utility

You don’t have to use a high-priced commercial utility to back up your databases. Doing so certainly can make your backups more automated or centrally controlled, but since most of them run $3,000-$8,000 per system, many people are using homegrown systems.

Intermediary disk

This is one of the most popular ways to do homegrown database backups. It’s fast, clean, and easy. The basic idea uses a script that backs up the database to disk. That backup then is treated as a regular file and backed up by the nightly filesystem backup. You can even save the amount of disk space needed by compressing the file after it’s backed up. If you’re really pressed for space, and you’re running Unix, you can use named pipes to compress the backup as it’s being written. In that case, you would need a backup disk that is only one-third to one-half the size of your original database disk (depending on the compression rate you get). Unless you have a very large database, this probably is cheaper than buying a commercial utility to perform this task. Each of the vendor-specific database backup chapters in this book contains a script that is able to do this.

Dedicated backup drive

You can use homegrown backup scripts to back up to a dedicated backup drive. This is a little more complicated and requires somewhat more work on your part. Depending on the size of your database, this may be more or less expensive than backing up to disk, but it definitely will be slower. It is also more complex, since you must keep track of each volume and label it in such a way that you know which database was backed up to it. (If you back up to disk, this can be done by naming the backup file the same name as the database.)

Shell scripts

I assume that you are doing the preceding backups by using some sort of shell script. Shell scripts are much better than having a simple cron or at entry that says "back up databaseA to deviceB.” Shell scripts can do lots of error checking and can be told to do things such as notify the DBA if something is wrong.

Calling a Professional

This is one of the biggest growth markets within the backup product industry. Most commercial filesystem backup products now have interfaces to automatically back up your database to volumes that are managed by their product. It’s really a beautiful thing, but it does come at a price! Some of these products also have an interface to a third-party program (e.g., SQL Backtrack) that accomplishes the same task.

The Big Three

Each of the three biggest Unix database vendors—-Informix, Oracle, and Sybase—has a backup utility that can interface with commercial backup products, also referred to as storage managers. On a high level, these backup utilities all work in essentially the same way. The database vendor’s utility generates one or more backup streams via an API that storage managers can talk to. The companies that produce the storage managers then can write a utility that talks to their storage manager on one side, and the database backup utility’s API on the other side. Although the database backup utilities come bundled with the database products, the commercial backup products’ utilities cost several thousand dollars.

Of the three main database vendors, only Sybase’s backup utility can perform backups without interfacing with a commercial storage manager. Oracle’s rman and Informix’s onbar both have advanced capabilities, but without a storage manager to talk to, the products are essentially useless. Since Oracle and Informix didn’t want to force their customers to buy a storage manager or the interface to their backup utility, they came up with a compromise. Both of these vendors now bundle a free, stripped-down version of Legato NetWorker and its Business Suite Module with their product. This “OEM” version of NetWorker has significantly less functionality than the full-featured version, but it gives you enough functionality to be able to use rman and onbar to do backups. Figure 13-5 uses Oracle and this bundled version of Legato NetWorker to illustrate the different pieces of the backup puzzle. Oracle uses rman to interface with the database on one side and the storage manager on the other side. NetWorker communicates with the backup media on one side and its Business Suite Module on the other side. NetWorker then uses the Business Suite module to interface between NetWorker and rman. The backup data flows from the Oracle database, through rman, through the Business Suite Module, through NetWorker, and to the backup media. Restores obviously flow in the opposite direction.

A typical commercial database backup utility setup

Figure 13-5. A typical commercial database backup utility setup

Each of the three big databases’s utilities has its own backup and recovery history.

Informix

Informix has always been the easiest database to back up and recover. They have always had tbtape (now called ontape), which is a standalone backup command designed to back up to tape. ontape is simple, has incremental capabilities, also can back up to disk, and backs up the database live. Some of these features, which always were assumed by Informix users to be present in other database systems, are just now appearing in other products. (Informix also has a product called onarchive , which is not recommended; onarchive and other Informix utilities are covered in Chapter 14.) Informix now also has onbar , which is designed specifically to send a stream of backup data to a commercial product. Some backup vendors have ported to the earlier ontape command, while others waited for onbar. Whichever command you use, you can recover individual dbspaces.

Oracle

Historically, Oracle did not have a true backup utility, but it did have commands that allowed you to write your own—even allowing you to do live backups. (The oraback.sh script in Chapter 15 uses these commands.) Now Oracle7 has the Enterprise Backup Utility, or EBU, and Oracle8 has Recovery Manager, or rman. Both are designed to give streams of backup data to a commercial backup utility. Both of these utilities require a storage manager to be able to do backups, but Oracle8 now comes bundled with a stripped-down storage manager that you can use. Some storage manager vendors have stayed away from EBU and rman interfaces, citing reasons such as performance or flexibility. BMC’s SQL Backtrack is probably the best known among these products. SQL Backtrack can do a true incremental backup, as discussed earlier. (The ability to do incremental backups is now provided by Oracle8’s rman, but as of this writing there are still hundreds of thousands of Oracle7 databases out there. The only way to do a true incremental backup of an Oracle7 database is to use a product like SQL Backtrack.) Whichever utility you use, you can recover individual files or tablespaces.

Sybase

Sybase has come a long way in the backup arena, but they still have a long way to go. The dump command used to be very slow, and its speed severely impacted database performance, but that problem was fixed in System 10. The fix was done by separating the dump processes from the database engine and creating the Backup Server. The Backup Server also now has a nice ability to stream data to multiple backup devices simultaneously; its main problem now is that it is an all-or-nothing utility. You cannot recover an individual dataspace or device; you must restore the entire database or nothing at all. This means that if you have a 500-GB Sybase database and you lose one 4-GB disk, you have to restore the entire 500 GB. (This lack of functionality is what originally created the market for BMC’s SQL Backtrack.)



[60] Oracle7 has EBU, and Oracle8 has rman. Both are utilities that can back up the database to a commercial backup product. Oracle8 also comes bundled with a stripped-down version of Legato NetWorker, which can be used in conjuction with rman to back up to disk. However, this is not the same as Informix’s ontape or Sybase’s Backup Server, both of which can back up directly to a tape or disk file without the intervention of a third-party product.

[61] This often can be done by making slight modifications to the default startup scripts that come with the database.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.60.149