Chapter 8. Backup and Disaster Recovery

Nowadays, most business processes, if not all, are automated. Activities of a company, from sales to management decisions, generally require an application to access a database server and to read or modify records. If data loss occurs, some of the company's normal activities become impossible to continue until the data is restored. If some data is lost forever, the company probably loses some of its opportunities and some of its vital information. In a sense, we can safely state that each relevant data loss diminishes the value of a company. For this reason, such an event is commonly called a disaster. In technical jargon, the task of restoring as much data as possible is called disaster recovering. Since it is not always possible to repair data files, it is necessary to make regular backups of data so that they can be restored after a disaster. This chapter discusses backup and disaster recovering, some of the most vital tasks of a DBA.

The main topics that will be discussed in the chapter are:

  • Types of backups
  • Logical backups via mysqldump
  • Physical backups using the filesystem
  • Complete and incremental backups via Percona XtraBackup
  • How to restore a backup
  • How to repair a corrupt table

Types of backups

Several events can corrupt or delete important data. Some technical problems that may cause data loss are:

  • A power failure during a disk write
  • Hardware failures (such as disks and motherboards)
  • Crashing of the operating system
  • MariaDB or storage engines bugs (yes, like all programs, MariaDB has bugs)

But even a human being can cause damage. A cracker can use a software vulnerability to destroy some data. Or, we can accidentally issue a DROP DATABASE command on a database that we did not want to erase.

Since there is no way to be sure that these things will not happen, we need to be prepared to restore all critical data by performing regular and automated backups.

A backup can be performed in several ways. No backup method is better than others in all situations. The choice depends on many factors. Before deciding on a backup plan, we should ask ourselves questions such as:

  • How critical is our data?
  • How often is our data updated?
  • Can the server slow down or be temporarily stopped during the backup process?

After defining our needs, we can wisely choose a backup strategy that best fits our workload.

Logical and physical backups

A logical backup creates a representation of the relevant data. Take for instance a CSV file containing all the values, or a text file containing the SQL statements that need to be executed to exactly recreate the original data; this is called a dump file.

A physical backup is a copy of the files that physically contain the data. It is important to know that MariaDB does not write anything on such files until the copying finishes. This is easier with simple storage engines like MyISAM but harder with complex storage engines such as InnoDB. On MariaDB versions older than 10.0, one has to stop the server before taking a physical backup of InnoDB tables. A physical backup copies the whole data directory. By default, this includes the log and configuration files.

Some storage engines, like MyISAM, store each table in a separated data file; others do not. As explained in Chapter 7, InnoDB Compressed Tables, InnoDB is able to store some tables in the system tablespace and others in separate files. If tables are stored separately, it is possible to take a backup of the most relevant table instead of copying the data of all tables. This is very important in situations where some tables rarely (or never) change or where the contents of some tables can easily be recreated starting from other tables (such as the summary tables explained in Chapter 6, Caches).

In the case of partitioned tables, each partition is stored in a separate file. Sometimes, only the most recent partition contains recent data, while other partitions contain historical data. For example, a partition might contain the sales that took place in the last month, and other partitions might contain older sales. In such cases, usually we have a backup of historical data, so we can copy only one partition. Partitions are discussed in Chapter 10, Table Partitioning.

The pros of the logical backups are:

  • The servers need not be completely stopped. However, keep in mind that each of the heavy transaction will affect transactional tables, and the non-transactional tables need to be locked. Also, a backup is normally a read-intensive operation that slows down the server.
  • A logical backup is very flexible, because modifying it is relatively easy. For example, we can change a database name in a dump file before applying the backup. Or, we can drop some rows.
  • A logical backup can be more selective than a physical backup. If it is based on SQL queries, we can exclude some rows or columns from the copy.
  • A logical backup can be restored on a newer or an older version of MariaDB. For this reason, a logical backup should be created before upgrading MariaDB in case something goes wrong.

The pros of the physical backups are:

  • Physical backups are much faster, because they are done directly using the filesystem. Also, their size is usually much smaller, because they only include data and indexes in a compact format; SQL statements and data are not in a text format.
  • A physical backup usually includes log and configuration files. This is not strictly necessary to recover data, but the loss of the server's configuration or an important log should be considered to be a major disaster.

Hot and cold backups

Hot backups are taken while the server is running. Cold backups are taken while the server is stopped.

A logical backup is always hot. There is no way to get a representation of MariaDB data without querying the server.

With MariaDB 10, it is always possible to lock the physical files during a backup process, so there is no reason to stop the server. However, with older versions, cold backups are necessary for InnoDB files. During hot backups, the server accepts commands from the clients. However, a hot backup allows us to perform logical backups.

But maybe we know that the backup will take too much time, and queuing the client's requests for the duration of the backup makes no sense. Or, maybe the server does not work at certain times, for example, when an office is closed. In such cases, if we want to make a physical backup, we may prefer to stop the server. We need not stop lock tables, and the process will be more straightforward.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.30.62