10.1. Backup types

Unless you're a DBA, you'd probably define a database backup as a complete copy of a database at a given point in time. While that's one type of database backup, there are many others. Consider a multi-terabyte database that's used 24/7:

  • How long does the backup take, and what impact does it have on users?

  • Where are the backups stored, and what is the media cost?

  • How much of the database changes each day?

  • If the database failed partway through the day, how much data would be lost if the only recovery point was the previous night's backup?

In considering these questions, particularly for large databases with high transaction rates, we soon realize that simplistic backup strategies limited to full nightly backups are insufficient for a number of reasons, not the least of which is the potential for data loss. Let's consider the different types of backups in SQL Server.

Backup methods

There are many tools and techniques for performing database backups, including various third-party products and database maintenance plans (covered in chapter 14). For the purposes of the examples throughout this chapter, we'll use a T-SQL script approach.


10.1.1. Full backup

Full backups are the simplest, most well understood type of database backup. Like standard file backups (documents, spreadsheets, and so forth), a full backup is a complete copy of the database at a given time. But unlike with a normal file backup, you can't back up a database by simply backing up the underlying .mdf and .ldf files.

One of the classic mistakes made by organizations without appropriate DBA knowledge is using a backup program to back up all files on a database server based on the assumption that the inclusion of the underlying database files (.mdf and .ldf) in the backup will be sufficient for a restore scenario. Not only will this backup strategy be unsuccessful, but those who use such an approach usually fail to realize that fact until they try to perform a restore.

For a database backup to be valid, you must use the BACKUP DATABASE command or one of its GUI equivalents. Let's look at a simple example in which we'll back up the AdventureWorks database. Check Books Online (BOL) for the full description of the backup command with all of its various options.

-- Full Backup to Disk
BACKUP DATABASE [AdventureWorks2008]
TO DISK = N'G:SQL BackupAdventureWorks.bak'
WITH INIT

You can perform backups in SQL Server while the database is in use and is being modified by users. Such backups are known as online backups. In order for the resultant backup to be restored as a transactionally consistent database, SQL Server includes part of the transaction log in the full database backup. Before we cover the transaction log in more detail, let's consider an example of a full backup that's executed against a database that's being actively modified.

Figure 10.1 shows a hypothetical example of a transaction that starts and completes during a full backup, and modifies a page after the backup process has read it from disk. In order for the backup to be transactionally consistent, how will the backup process ensure this modified page is included in the backup file? In answering this question, let's walk through the backup step by step. The step numbers in the following list correspond to the steps in figure 10.1.

  1. When the backup commences, a checkpoint is issued that flushes dirty buffer cache pages to disk.

  2. After the checkpoint completes, the backup process begins reading pages from the database for inclusion in the backup file(s), including page X.

  3. Transaction A begins.

  4. Transaction A modifies page X. The backup has already included page X in the backup file, so this page is now out of date in the backup file.

  5. Transaction B begins, but won't complete until after the backup finishes. At the point of backup completion, this transaction is the oldest active (uncommitted/incomplete) transaction.

    Figure 10.1. Timeline of an online full backup. Based on an example used with permission from Paul S. Randal, managing director of SQLskills.com.
     
  6. Transaction A completes successfully.

  7. The backup completes reading pages from the database.

  8. As described shortly, the backup process includes part of the transaction log in the backup.

If the full backup process didn't include any of the transaction log, the restore would produce a backup that wasn't transactionally consistent. Transaction A's committed changes to page X wouldn't be in the restored database, and because transaction B hasn't completed, its changes would have to be rolled back. By including parts of the transaction log, the restore process is able to roll forward committed changes and roll back uncommitted changes as appropriate.

In our example, once SQL Server completes reading database pages at step 7, it will include all entries in the transaction log since the oldest log sequence number (LSN) of one of the following:

  • The checkpoint (step 1 in our example)

  • The oldest active transaction (step 5)

  • The LSN of the last replicated transaction (not applicable in our example)

In our example, transaction log entries since step 1 will be included because that's the oldest of these items. However, consider a case where a transaction starts before the backup begins and is still active at the end of the backup. In such a case, the LSN of that transaction will be used as the start point.

This example was based on a blog post from Paul Randal of SQLskills.com. The full post, titled "More on How Much Transaction Log a Full Backup Includes" is available at http://www.sqlskills.com/BLOGS/PAUL/post/More-on-how-much-transaction-log-a-full-backup-includes.aspx.

It's important to point out here that even though parts of the transaction log are included in a full backup, this doesn't constitute a transaction log backup. Another classic mistake made by inexperienced SQL Server DBAs is never performing transaction log backups because they think a full backup will take care of it. A database in full recovery mode (discussed shortly) will maintain entries in the transaction log until it's backed up. If explicit transaction log backups are never performed, the transaction log will continue growing forever (until it fills the disk). It's not unusual to see a 2GB database with a 200GB transaction log!

Finally, when a full backup is restored as shown in our next example, changes since the full backup are lost. In later examples, we'll look at combining a full backup with differential and transaction log backups to restore changes made after the full backup was taken.

-- Restore from Disk
RESTORE DATABASE [AdventureWorks2008]
FROM DISK = N'G:SQL BackupAdventureWorks.bak'
WITH REPLACE

To reduce the user impact and storage costs of nightly full backups, we can use differential backups.

Multi-file backups

Backing up a database to multiple files can lead to a significant reduction in backup time, particularly for large databases. When you use the T-SQL BACKUP DATABASE command, the DISK = clause can be repeated multiple times (separated by commas), once for each backup file, as per this example:


BACKUP DATABASE [ADVENTUREWORKS2008]
  TO

     DISK = 'G:SQL BACKUPADVENTUREWORKS_1.BAK'
     , DISK = 'G:SQL BACKUPADVENTUREWORKS_2.BAK'
     , DISK = 'G:SQL BACKUPADVENTUREWORKS_3.BAK'

10.1.2. Differential backup

While a full backup represents the most complete version of the database, performing full backups on a nightly basis may not be possible (or desirable) for a variety of reasons. Earlier in this chapter we used an example of a multi-terabyte database. If only a small percentage of this database changes on a daily basis, the merits of performing a full nightly backup are questionable, particularly considering the storage costs and the impact on users during the backup.

A differential backup, an example of which is shown here, is one that includes all database changes since the last full backup:

-- Differential Backup to Disk
BACKUP DATABASE [AdventureWorks2008]
TO DISK = N'G:SQL BackupAdventureWorks-Diff.bak'
WITH DIFFERENTIAL, INIT

A classic backup design is one in which a full backup is performed weekly, with nightly differential backups. Figure 10.2 illustrates a weekly full/nightly differential backup design.

Figure 10.2. Differential backups grow in size and duration the further they are from their corresponding full backup (base).
 

Compared to nightly full backups, a nightly differential with a weekly full backup offers a number of advantages, primarily the speed and reduced size (and therefore storage cost) of each nightly differential backup. However, there comes a point at which differential backups become counterproductive; the further from the full backup, the larger the differential, and depending on the rate of change, it may be quicker to perform a full backup. It follows that in a differential backup design, the frequency of the full backup needs to be assessed on the basis of the rate of database change.

When restoring a differential backup, the corresponding full backup, known as the base backup, needs to be restored with it. In the previous example, if we needed to restore the database on Friday morning, the full backup from Sunday, along with the differential backup from Thursday night, would be restored, as in this example:

-- Restore from Disk. Leave in NORECOVERY state for subsequent restores
RESTORE DATABASE [AdventureWorks2008]
FROM DISK = N'G:SQL BackupAdventureWorks.bak'
WITH NORECOVERY, REPLACE
GO

-- Complete the restore process with a Differential Restore
RESTORE DATABASE [AdventureWorks2008]
FROM DISK = N'G:SQL BackupAdventureWorks-Diff.bak'
GO

Here, we can see the full backup is restored using the WITH NORECOVERY option. This leaves the database in a recovering state, and thus able to restore additional backups. We follow the restore of the full backup with the differential restore.

As you'll recall from the restore of the full backup shown earlier, without transaction log backups, changes made to the database since the differential backup will be lost.

10.1.3. Transaction log backup

A fundamental component of database management systems like SQL Server is the transaction log. Each database has its own transaction log, which SQL Server uses for several purposes, including the following:

  • The log records each database transaction, as well as the individual database modifications made within each transaction.

  • If a transaction is canceled before it completes, either at the request of an application or due to a system error, the transaction log is used to undo, or roll back, the transaction's modifications.

  • A transaction log is used during a database restore to roll forward completed transactions and roll back incomplete ones. This process also takes place for each database when SQL Server starts up.

  • The transaction log plays a key role in log shipping and database mirroring, both of which will be covered in the next chapter.

Regular transaction log backups, as shown here, are crucial in retaining the ability to recover a database to a point in time:

-- Transaction Log Backup to Disk
BACKUP LOG [AdventureWorks2008]
TO DISK = N'G:SQL BackupAdventureWorks-Trn.bak'
WITH INIT

As you can see in figure 10.3, each transaction log backup forms part of what's called a log chain. The head of a log chain is a full database backup, performed after the database is first created, or when the database's recovery model, discussed shortly, is changed. After this, each transaction log backup forms a part of the chain. To restore a database to a point in time, an unbroken chain of transaction logs is required, from a full backup to the required point of recovery.

Figure 10.3. An unbroken chain of backups is required to recover to the point of failure.
 

Consider figure 10.3. Starting at point 1, we perform a full database backup, after which differential and transaction log backups occur. Each of the backups serves as part of the chain. When restoring to a point in time, an unbroken sequence of log backups is required. For example, if we lost backup 4, we wouldn't be able to restore past the end of backup 3 at 6 a.m. Tuesday. Attempting to restore the transaction log from log backup 5 would result in an error message similar to that shown in figure 10.4.

In addition to protecting against potential data loss, regular log backups limit the growth of the log file. With each transaction log backup, certain log records, discussed in more detail shortly, are removed, freeing up space for new log entries. As covered earlier, the transaction log in a database in full recovery mode will continuing growing indefinitely until a transaction log backup occurs.

The frequency of transaction log backups is an important consideration. The two main determining factors are the rate of database change and the sensitivity to data loss.

Transaction log backup frequency

Frequent transaction log backups reduce the exposure to data loss. If the transaction log disk is completely destroyed, then all changes since the last log backup will be lost. Assuming a transaction log backup was performed 15 minutes before the disk destruction, the maximum data loss would be 15 minutes (assuming the log backup file isn't contained on the backup disk!). In contrast, if transaction log backups are only performed once a day (or longer), the potential for data loss is large, particularly for databases with a high rate of change.

Figure 10.4. Attempting to restore an out-of-sequence transaction log
 

The more frequent the log backups, the more restores will be required in a recovery situation. In order to recover up to a given point, we need to restore each transaction log backup between the last full (or differential) backup and the required recovery point. If transaction log backups were taken every minute, and the last full or differential backup was 24 hours ago, there would be 1,440 transaction log backups to restore! Clearly, we need to get the balance right between potential data loss and the complexity of the restore. Again, the determining factors are the rate of database change and the maximum allowed data loss, usually defined in a service level agreement.

In a moment we'll run through a point-in-time restore, which will illustrate the three backup types working together. Before we do that, we need to cover tail log backups.

Tail log backups

When restoring a database that's currently attached to a server instance, SQL Server will generate an error[] unless the tail of the transaction log is first backed up. The tail refers to the section of log that hasn't been backed up yet—that is, new transactions since the last log backup.

[] Unless the WITH REPLACE option is used.

A tail log backup is performed using the WITH NORECOVERY option, which immediately places the database in the restoring mode, guaranteeing that the database won't change after the tail log backup and thus ensuring that all changes are captured in the backup.

WITH NO_TRUNCATE

Backing up the tail of a transaction log using the WITH NO_TRUNCATE option should be limited to situations in which the database is damaged and inaccessible. The COPY_ONLY option, covered shortly, should be used in its place.


When restoring up to the point of failure, the tail log backup represents the very last transaction log backup, with all restores preceding it performed using the WITH NORECOVERY option. The tail log is then restored using the WITH RECOVERY option to recover the database up to the point of failure, or a time before failure using the STOPAT command.

So let's put all this together with an example. In listing 10.1, we first back up the tail of the log before restoring the database to a point in time. We begin with restoring the full and differential backups using the WITH NORECOVERY option, and then roll forward the transaction logs to a required point in time.

Example 10.1. Recovering a database to a point in time
-- Backup the tail of the transaction log
BACKUP LOG [AdventureWorks2008]
TO DISK = N'G:SQL BackupAdventureWorks-Tail.bak'
WITH INIT, NORECOVERY

-- Restore the full backup
RESTORE DATABASE [AdventureWorks2008]
FROM DISK = N'G:SQL BackupAdventureWorks.bak'
WITH NORECOVERY
GO

-- Restore the differential backup
RESTORE DATABASE [AdventureWorks2008]
FROM DISK = N'G:SQL BackupAdventureWorks-Diff.bak'
WITH NORECOVERY
GO

-- Restore the transaction logs
RESTORE LOG [AdventureWorks2008]
FROM DISK = N'G:SQL BackupAdventureWorks-Trn.bak'
WITH NORECOVERY
GO

-- Restore the final tail backup, stopping at 11.05AM
RESTORE LOG [AdventureWorks2008]
FROM DISK = N'G:SQL BackupAdventureWorks-Tail.bak'
WITH RECOVERY, STOPAT = 'June 24, 2008 11:05 AM'
GO

As we covered earlier, the NO_TRUNCATE option of a transaction log backup, used to perform a backup without removing log entries, should be limited to situations in which the database is damaged and inaccessible. Otherwise, use the COPY_ONLY option.

10.1.4. COPY_ONLY backups

Earlier in this chapter we defined a log chain as the sequence of transaction log backups from a given base. The base for a transaction log chain, as with differential backups, is a full backup. In other words, before restoring a transaction log or differential backup, we first restore a full backup that preceded the log or differential backup.

Take the example presented earlier in figure 10.3, where we perform a full backup on Sunday night, nightly differential backups, and six hourly transaction log backups. In a similar manner to the code in listing 10.1, to recover to 6 p.m. on Tuesday, we'd recover Sunday's full backup, followed by Tuesday's differential and the three transaction log backups leading up to 6 p.m.

Now let's assume that a developer, on Monday morning, made an additional full backup, and moved the backup file to their workstation. The differential restore from Tuesday would now fail. Why? A differential backup uses a Differential Changed Map (DCM) to track which extents have changed since the last full backup. The DCM in the differential backup from Tuesday now relates to the full backup made by the developer on Monday morning. In our restore code, we're not using the full backup from Monday—hence the failure.

Now, there are a few ways around this problem. First, we have an unbroken transaction log backup sequence, so we can always restore the full backup, followed by all of the log backups since Sunday. Second, we can track down the developer and ask him for the full backup and hope that he hasn't deleted it!

To address the broken chain problem as outlined here, COPY_ONLY backups were introduced in SQL Server 2005 and fully supported in 2008.[] A COPY_ONLY backup, supported for both full and transaction log backups, is used in situations in which the backup sequence shouldn't be affected. In our example, if the developer performed the Monday morning full backup as a COPY_ONLY backup, the DCM for the Tuesday differential would still be based on our Sunday full backup. In a similar vein, a COPY_ONLY transaction log backup, as in this example, will back up the log without truncation, meaning that the log backup chain will remain intact without needing the additional log backup file:

[] Management Studio in SQL Server 2008 includes enhanced support for COPY_ONLY backups with GUI options available for this backup type. Such options were absent in SQL Server 2005, which required a T-SQL script approach.

-- Perform a COPY ONLY Transaction Log Backup
BACKUP LOG [AdventureWorks2008]
TO DISK = N'G:SQL BackupAdventureWorks-Trn_copy.bak'
WITH COPY_ONLY

When discussing the different backup types earlier in the chapter, we made several references to the database recovery models. The recovery model of a database is an important setting that determines the usage of the transaction log and the exposure to data loss during a database restore.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.16.79.65