3. Backup and Recovery

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

3 Backup and Recovery

Backups and the recovery process are a very important topic for a database management system. Backups act as the last line of defense if everything else goes wrong. There are two possible causes of these catastrophic events. The first is a user-induced failure, in which a user accidentally or maliciously affects your data in a negative way. An example would be if someone issued DROP TABLE or DELETE to remove data. The only way to recover in the event of such an action would be to use a backup. The second possible cause of a catastrophic data loss is if you have simultaneous hardware failures on all the members of a node group. For example, if all the node groups suffer a hard drive corruption/failure, there won’t be data available to recover from. Generally, to mitigate the danger from these events occurring, you should make backups quite frequently (normally at least once per day, if not more frequently).

You may also need to do a backup and recover from it if you are upgrading or changing a system setting. Many upgrades and configuration changes can be online, but some cannot be. The configuration parameters that require a backup/recovery process are ones that affect the partitioning of the data, including NoOfReplicas, as well as those that change the number of data nodes in the cluster. As far as software upgrades are concerned, generally minor updates (for example, from version 5.0.12 to version 5.0.13) can be done in an online manner through a rolling upgrade process (as explained in Chapter 1, “Installation”). Major version upgrades (for example, from version 4.1 to version 5.0) also require a backup/restore in order to proceed.

MySQL Cluster’s Native Backup Tool

MySQL Cluster includes a native backup tool that allows you to make online backups. This is the most common tool to use for backups because it allows normal continuous running of the cluster while you’re making the backup, as it doesn’t set any locks to make the backup. The native backup tool only backs up those database objects that are clustered (for example, table structures and data). This tool does not back up objects such as databases, stored procedures, and triggers, so you need to back them up by using some other method.

To control MySQL Cluster’s native backup tool, you use the management client, ndb_mgm. There are two commands related to dealing with the back process: START BACKUP and ABORT BACKUP. To initiate the backup process, you use the START BACKUP command. You can use a few options with this command to control when the command will return. The default is WAIT COMPLETED, which causes the command to wait until all the nodes have successfully reported that the backup was complete. If you have a lot of data in the cluster, this command may take a long time to complete, so it might be better to use one of the other options. WAIT STARTED causes the command to wait until all the nodes have successfully received the command to start making a backup and actually start the process. The final option is the NOWAIT option, which causes the command to return immediately, without waiting for the data nodes to confirm that they have started or finished. The following are some examples of these commands:

shell> ndb_mgm –e "START BACKUP NOWAIT"
ndb_mgm> START BACKUP WAIT COMPLETED
Waiting for completed, this may take several minutes
Node 2: Backup 2 started from node 1
Node 2: Backup 2 started from node 1 completed
StartGCP: 114086 StopGCP: 114089
#Records: 5321 #LogRecords: 0
Data: 191220 bytes Log: 0 bytes

Regardless of the option you use to initiate a backup, the cluster log will contain information about the backup process as well. You can monitor this log if you have used the NOWAIT option to see when it completes. The cluster log file has output similar to the following:

2005-11-25 17:02:42 [MgmSrvr] INFO   -- Node 2: Backup 1 started from node 1
2005-11-25 17:02:43 [MgmSrvr] INFO   -- Node 2: Backup 1 started from node 1
completed
StartGCP: 114078 StopGCP: 114081
#Records: 5321 #LogRecords: 0
Data: 191220 bytes Log: 0 bytes
2005-11-25 17:02:53 [MgmSrvr] INFO   -- Node 2: Backup 2 started from node 1
2005-11-25 17:02:53 [MgmSrvr] INFO   -- Node 2: Backup 2 started from node 1
completed
StartGCP: 114086 StopGCP: 114089
#Records: 5321 #LogRecords: 0
Data: 191220 bytes Log: 0 bytes

Notice that each backup you make is assigned a number, called the backup ID. It is important to know this ID when you want to use a backup later on for recovery. If you don’t have the number, it isn’t too important because you can get it from the name of the backup. The backup ID numbers increment, so the largest number is the most recent backup. The incrementing number persists across cluster restarts, but an all --initial restart resets the number. You need to be careful if you reset it in this manner because it will overwrite the previous backups if they still exist in the same location when you try to make a new backup.

The backup itself is stored locally on the hard drive of each data node. For example, if you have four data nodes, four locations contain the data required to restore your backup. You normally have to copy all these files into a single location in order to store your backup for the long term. MySQL Cluster doesn’t have a built-in way to do this, but it should be fairly trivial to do it by using something such as scp or rsync. Chapter 4, “Security and Management,” provides some examples of scripts to do similar things.

The backup itself is stored in a directory called BACKUP. Normally, this is inside the DataDir defined in the cluster configuration file. However, it is possible to move it with an additional cluster configuration option called BackupDataDir. Each backup you make creates an additional directory within this BACKUP directory called BACKUP-#, where # is the backup ID. Inside the BACKUP-# directory are three files—BACKUP-#.N.ctl, BACKUP-#.N.log, and BACKUP-#-M.N.Data—where # is the backup ID, N is the node ID of the originating node, and M is the file number. The .ctl file is the metadata from the cluster. This includes information about what tables, indexes, triggers, and so on exist. The second set of files is the .Data files, which contain the data from your entire system. In theory, there can be multiple .Data files, depending on how much data is in your system. The final file is the .log file. This is the UNDO log used to make the backup consistent, and it allows the backup to not set any locks while occurring. It is automatically reapplied during system recovery. Your BACKUP directory should look similar to the following:

The other management client command involved in backups is the ABORT BACKUP # command. This command allows you to stop a backup that is in progress. It causes the data nodes to delete all the data that has to that point been written out as well. This command does not return an error if the backup isn’t in progress, even though it doesn’t do anything because there is nothing to abort. If the abort occurs, you see a message in the cluster log confirming that it was aborted:

ndb_mgm> ABORT BACKUP 2
Abort of backup 2 ordered

In the cluster log you see this:

2005-11-25 18:13:47 [MgmSrvr] INFO -- Node 3: Backup 5 started from node 1
2005-11-25 18:13:47 [MgmSrvr] ALERT -- Node 3: Backup 5 started from 1 has
been aborted. Error: 1321

Using `mysqldump` for Backups

Another way you can make backups is by using a tool called mysqldump. If you have been using MySQL with other storage engines, this tool should be familiar to you. Using mysqldump to back up non-cluster objects, such as triggers and stored procedures, works exactly the same as with non-clustered databases. Quite commonly when doing this, you should use the --no-data flag with mysqldump to ensure that you aren’t backing up clustered data. If you want to use it to back up clustered data, you need to follow a few special restrictions related to using it with MySQL Cluster. The special restrictions exist because MySQL Cluster doesn’t have a repeatable read isolation level and doesn’t support distributed table locking (the two methods used by mysqldump for making consistent backups). Generally, you use mysqldump only if you needed to restore the data into a different system (that is, not into MySQL Cluster).

If you want to back up data with mysqldump in order for mysqldump to get a consistent view of the data, you need to enter single-user mode before making the backup (or otherwise ensure that no one else is going to be changing data while you’re making the backup). This allows mysqldump to get a consistent view of the data to backup. However, this will not be an online backup because users will be denied access to the cluster while this is continuing.

After you enter single-user mode, you can make a backup as you normally would. There are many different options you can use with mysqldump to change the format and other parameters. We recommend that you take a look at the output of mysqldump --help and at the webpage for mysqldump (http://dev.mysql.com/doc/refman/5.0/en/mysqldump.html), which includes many examples. You should make sure to use one of the backup methods that lock the tables first, such as the --lock-tables option or the --master-data option. With mysqldump, you can choose what pieces to back up as well, whereas the native hot backup tool always does all the data for all tables.

The following is an example of a mysqldump command to back up the world database data while in single-user mode:

shell> mysqldump -u root -p --lock-tables world > backup-world.sql

Single-User Mode

Single-user mode allows you to lock down a cluster so that it is accessible only to a single MySQL server (or similar connection to the cluster). All other servers give an error when someone tries to access one of the tables stored in the cluster.

To enter single-user mode, you issue the command ENTER SINGLE USER MODE #, where # is the node ID of the node you want to allow to communicate with the cluster. Entering this mode can take a brief time to complete; you can see when you have entered single-user mode by using the ALL STATUS command or the SHOW command. When you are done with the mode, you can use the command EXIT SINGLE USER MODE to resume normal operation.

The following is an example of a session:

ndb_mgm> ENTER SINGLE USER MODE 4
Entering single user mode
Access will be granted for API node 4 only.
Use ALL STATUS to see when single user mode has been entered.
ndb_mgm> ALL STATUS
Node 2: single user mode (Version 5.0.13)
Node 3: single user mode (Version 5.0.13)

ndb_mgm> SHOW
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @127.0.0.1   (Version: 5.0.13, single user mode, Nodegroup: 0)
id=3    @127.0.0.1   (Version: 5.0.13, single user mode, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @127.0.0.1   (Version: 5.0.13)

[mysqld(API)]   2 node(s)
id=4    @127.0.0.1   (Version: 5.0.13)
id=5 (not connected, accepting connect from any host)

ndb_mgm> EXIT SINGLE USER MODE
Exiting single user mode in progress.
Use ALL STATUS to see when single user mode has been exited.
ndb_mgm> ALL STATUS
Node 2: started   (Version 5.0.13)
Node 3: started   (Version 5.0.13)

Single-user mode is useful for a few different things. First, it is somewhat like mysqldump, as mentioned previously. Second, it is useful when you’re restoring a backup into MySQL Cluster. When you restore a backup (either from mysqldump or by using the hot backup method), you should normally enter single-user mode. In this way, restoring a backup is not a hot operation. Finally, there are some maintenance tasks for which you generally want to enter single-user mode as well. This includes tasks such as ALTER TABLE, which need to be synchronized across all the MySQL servers.

The following is a typical example of the usage of this mode in order to ensure you make a consistent backup of the world database:

shell> ndb_mgm -e "ENTER SINGLE USER MODE 4"; sleep 10;
mysqldump -u root -p --lock-tables world > backup-world.sql;
ndb_mgm -e "EXIT SINGLE USER MODE";

Restoring a Backup

The previous sections cover how to perform a backup operation. But a backup is useless if you cannot properly recover from it. The recovery process is relatively simple for both of the backup methods discussed. However, you should generally practice the process a few times before you attempt to do it on a production system.

Restoring a Backup by Using Native Hot Backup

To restore a backup made by using the native hot backup tool, you use the ndb_restore program that comes with the MySQL Cluster installation. This program reads the backup files and loads the data and structure into a running cluster. The cluster that the data is being loaded into can have a different topography than the original cluster. For example, if you want to increase from two data nodes to four data nodes, you can do it without any problems when you use this backup method.

ndb_restore connects to the cluster in the same manner as a MySQL server does, and then it effectively re-issues the CREATE and INSERT statements needed to re-create your tables. Because it connects directly to the cluster, you need to make sure you have an extra [mysqld] section in your cluster configuration file for it. You still need to manually re-create your databases and other non-clustered objects after you restore this backup.

These are the steps for restoring a backup taken by the native hot backup:

1. Enter single-user mode for the node ID you plan to use for ndb_restore (or otherwise guarantee that no one else is using the cluster).

2. From within the BACKUP-# directory, use the ndb_restore tool to restore the metadata (that is, table structure) of the system by using -m or --restore_meta.

3. Use ndb_restore to restore the data from the fragment to the tables, with the -r or --restore_data option.

4. Repeat step 3 for each of the other data nodes. (For example, if you had four data nodes when you made the backup, you need to perform step 3 four times.)

5. Leave single-user mode. Your cluster is now ready to be used.

When using ndb_restore, you are required to specify a few options.

-b or –backupid—You use this option to specify the backup ID that needs to be restored. The backup ID is the number immediately after the BACKUP word. For example, with BACKUP-7.2.ctl, the backup ID is 7.

The original node ID for the backup—Again, you can get this ID from the filename; it is the number listed immediately before the file extension. For example, with BACKUP-7.2.ctl, the original node ID is 2.

-m, -r, or both—You use these options to specify whether to restore the schema, the data, or both at the same time.

The following is an example of a session involving restoring from a native backup of two nodes

shell:~/ndb/BACKUP/BACKUP-1$ ls -l
total 432
-rw-r-----   1 user group 99568   Nov 25 17:19 BACKUP-1-0.2.Data
-rw-r-----   1 user group 91956   Nov 25 17:19 BACKUP-1-0.3.Data
-rw-r-----   1 user group  6384    Nov 25 17:19 BACKUP-1.2.ctl
-rw-r-----   1 user group      44    Nov 25 17:19 BACKUP-1.2.log
-rw-r-----   1 user group  6384    Nov 25 17:19 BACKUP-1.3.ctl
-rw-r-----   1 user group      44    Nov 25 17:19 BACKUP-1.3.log
shell:~/ndb/BACKUP/BACKUP-1$ ndb_restore -m -b 1 -n 2
Ndb version in backup files: Version 5.0.13
Connected to ndb!!
Successfully restored table world/def/countrylanguage
Successfully restored table world/def/country
Successfully created index PRIMARY on countrylanguage
Successfully created index PRIMARY on country

NDBT_ProgramExit: 0 - OK

shell:~/ndb/BACKUP/BACKUP-1$ ndb_restore -r -b 1 -n 2
Ndb version in backup files: Version 5.0.13
Connected to ndb!!
_____________________________________________________
Processing data in table: world/def/countrylanguage(3) fragment 0
_____________________________________________________
Processing data in table: world/def/country(2) fragment 0
_____________________________________________________
Processing data in table: sys/def/NDB$EVENTS_0(1) fragment 0
_____________________________________________________
Processing data in table: sys/def/SYSTAB_0(0) fragment 0
Restored 627 tuples and 0 log entries

NDBT_ProgramExit: 0 - OK

shell:~/ndb/BACKUP/BACKUP-1$ ndb_restore -r -b 1 -n 3
Ndb version in backup files: Version 5.0.13
Connected to ndb!!
_____________________________________________________
Processing data in table: world/def/countrylanguage(3) fragment 1
_____________________________________________________
Processing data in table: world/def/country(2) fragment 1
_____________________________________________________
Processing data in table: sys/def/NDB$EVENTS_0(1) fragment 1

______________________________________________________
Processing data in table: sys/def/SYSTAB_0(0) fragment 1
Restored 596 tuples and 0 log entries

NDBT_ProgramExit: 0 - OK

Restoring a Backup Made Using `mysqldump`

To restore a backup made by using mysqldump, you simply redirect the file you saved the output to into the mysql command-line client. This causes the client to reissue all the SQL commands needed to restore your database objects and data (if you made a backup of the data). Generally, you need to make sure the cluster isn’t in use when you are doing this.

You need to log in to the MySQL server by using normal user account privileges (normally as the root user with a password).

The following is an example of this command:

shell> mysql -u root -p < file.sql

If the command returns, you know the backup has completed successfully. If it gives an error, something has gone wrong (possibly version incompatibilities or something similar).

Keep in mind that if this restoration is creating non-clustered objects, such as stored procedures and triggers, you need to follow this procedure on each MySQL server to restore them all fully. Chapter 4 provides some examples of scripts you can use as a basis to do this easily.

When doing this type of restoration, you might want to consider setting the MySQL server variable ndb_use_transactions to false or off. This allows you to not have any problems with REDO log segments.

Speeding Up Restoration

When restoring data, the biggest thing you can do to attempt to speed things up is to do the restoration in parallel. Because a cluster has many data nodes, you generally want to do many loads at the same time to take advantage of them and to avoid bottlenecks such as network bottlenecks.

With native backups, this is quite easy because the data for your tables is split into many files (one for each data node). You can simply restore the metadata from one and then restore the data from all the nodes at the same time. The exact number of restorations to do at once depends on the number of data nodes you have, network settings, and so on. We recommend that you do some simple testing to determine the ideal number of restorations to do at once. Keep in mind that each ndb_restore running at once requires its own [mysqld] slot in the cluster configuration.

Getting the load to occur in parallel with mysqldump takes a bit of effort. We recommend possibly splitting each table into a different dump file and then loading it in that way. If you are just dumping a single database, you can use the --tab option for mysqldump to produce a separate file for each table, which can then possibly make it easier to load in parallel.

Troubleshooting Backups

There are a few reasons the MySQL Cluster native backup tool may fail to make a backup. The following sections discuss a few common problems and how to solve them.

If a problem occurs while making a backup, it should always be reported in the cluster log. It is a good practice to monitor the cluster log for any such issues. The cluster log should give a general idea as to why the backup failed. It might be because someone manually aborted the backup by using the ABORT BACKUP command or be due to an uncontrollable cause, such as the three mentioned in the following sections.

In the unlikely event that the backup causes a node to fail, you can find more information in the node log and also in the trace file for the node that failed.

Disk Is Full

If you get a “disk is full” error, you need to ensure that the backup has enough space to complete. To solve this problem, you can either free up space or move BackupDataDir to some place that has more space.

Disk Is Too Slow

If the disk to which you are making a backup isn’t fast enough, the backup might fail. A system of two different memory buffers is used when you make a backup. The first one is BackupDataBufferSize, which is used for buffering data before it is written to disk. The second one, which is important for this error, is BackupLogBufferSize. This is the buffer that is used to record data changes before writing to disk. If this buffer ever fills up due to the disk not being able to keep up with the number of changes occurring during the backup, the backup will fail. If that occurs, you can either attempt to increase this buffer or move the backup to be taken to a faster disk subsystem.

Not Enough Memory

As mentioned previously, some additional memory sections are allocated when performing a backup. If your system is too low on resources, this memory allocation can fail, which causes the cluster in MySQL Cluster to abort the backup process.

In order to fix this, you should either decrease the sizes of BackupDataBufferSize and BackupLogBufferSize or decrease another memory parameter in order to allow sufficient space for these to be allocated.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 3. Backup and Recovery

Create new playlist

Sign In

Sign Up