Configuring replication

In this section we will see which configuration parameters are required on the master and slave servers.

All the following explanations also apply to multisource replication environments. However, there are some differences, which will be explained after the common tasks.

Setting up a replication environment requires at least the following steps:

  1. Configuring a new replication master
  2. Configuring one or more replication slaves
  3. Loading data from the master
  4. Starting the slave
  5. Checking if the slaves are running

These tasks, along with other useful topics, will be explained in this section.

Configuring a new replication master

When setting up a replication environment, the first thing to do is of course to set up at least one master. This is an easy task, as the master is just a normal MariaDB server that has a unique ID and maintains a binary log.

First, a server ID needs to be set on all masters and slaves, and it must be unique within the replication topology. It must be a 4-byte integer with a minimum value of 1. If the server ID is not set, or it is set to 0, replication is disabled.

Also, as explained previously, the binary log must be enabled on the master server. This is needed to record the events that will be sent to the slaves.

A master's configuration file needs to contain lines similar to the following:

server-id=1
log_bin="binlog"
binlog_format=STATEMENT

The @@server_id and @@binlog_format variables are dynamic; thus they can simply be changed at runtime without stopping the server:

MariaDB [(none)]> SET @@server_id = 1;
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> SET @@binlog_format = 'MIXED';
Query OK, 0 rows affected (0.03 sec)

The @@log_bin variable is not dynamic, so enabling the binary log requires a restart.
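
To verify the current values at runtime, for example after a restart, we can simply query the variables. The following query shows whether the binary log is enabled, along with the other settings discussed here:

SELECT @@log_bin, @@server_id, @@binlog_format;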

Note

Note that it is always necessary to modify the configuration file as well; in this way, these settings are not lost if the server is restarted. A restart can happen automatically, because mysqld_safe restarts the server if it crashes.

Then, we need to create at least one account for the replication slaves. Strictly speaking, a slave only needs the REPLICATION SLAVE privilege to work properly. This is a global privilege, so it cannot be restricted to individual databases; to prevent some databases from being replicated, the replication filters discussed later in this chapter must be used. It is possible for all the slaves to use the same user; they could even use an account that is shared with other clients. However, since the password is stored in clear text in a file called master.info, it is generally better for each slave to use a different password. Also, to improve the security of data, each slave user should only be able to connect from a specific hostname. It can also be a good idea to force the slaves to connect using SSL.

Here is an example that shows how to create a secure replication account:

MariaDB [test]> CREATE USER 'mslave1'@'host10' IDENTIFIED BY 'somepwd';
Query OK, 0 rows affected (0.00 sec)
MariaDB [test]> GRANT REPLICATION SLAVE ON *.* TO 'mslave1'@'host10' REQUIRE SSL;
Query OK, 0 rows affected (0.00 sec)

Note that the REPLICATION SLAVE privilege must not be confused with REPLICATION CLIENT. The latter should be granted to the user who performs replication configuration and diagnostics, because it allows the execution of the SHOW MASTER STATUS and SHOW SLAVE STATUS statements. On production servers, root should not be used unless we really need all the permissions for the current session.
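
For example, a monitoring account with this privilege could be created as follows; the account name, host, and password are just placeholders:

CREATE USER 'monitor1'@'admin_host' IDENTIFIED BY 'monitorpwd';
GRANT REPLICATION CLIENT ON *.* TO 'monitor1'@'admin_host';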

For details about security and user management, see Chapter 5, Users and Connections.

Configuring a new replication slave

After setting up one or more masters, we need to properly configure the slaves. This step is required when setting up a replication environment, and every time we want to add a new slave.

Similarly to the masters, each slave needs to have a unique server ID.

It is not necessary to enable binary logging on the slaves, unless a slave should also act as the master of another slave. Even if this is not the case, the binary log can still be useful for taking backups, but it affects performance.

The following example shows the most minimalist configuration settings required for a slave to work, in the configuration file:

server-id=2

A slave might also act as a master for one or more other slaves. In this case, it needs to write the events that it receives from its masters into its own binary log, so that its slaves can retrieve them. However, the replicated events are not logged by default. To log them, we must enable the binary log and set the @@log_slave_updates server variable to ON. This variable is not dynamic, so changing its value requires a server restart.

The following example shows a minimalist configuration for a slave that also acts as a master:

server-id=2
log_bin="binlog"
binlog_format=STATEMENT
log_slave_updates=ON

Starting a slave

If a master has no data yet, setting up replication between the master and a slave is very straightforward. After starting the master and the slave as explained previously, we will follow this procedure:

  1. Lock the master with a global read lock. We want to be sure that no new events are written to the binary log while we obtain the coordinates.
  2. Obtain the binary log coordinates. The coordinates are the name of the binary logfile currently in use, and the position of the last written event.
  3. Unlock the master.
  4. Provide the slave with the information that is necessary to access the master. We will do this using the CHANGE MASTER TO statement.
  5. Repeat these operations for each slave.

The following example shows how to obtain the master's binary log coordinates:

MariaDB [(none)]> FLUSH TABLES WITH READ LOCK;
Query OK, 0 rows affected (0.04 sec)
MariaDB [(none)]> SHOW MASTER STATUS;
+---------------+----------+--------------+------------------+
| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+---------------+----------+--------------+------------------+
| binlog.000031 |      323 |              |                  |
+---------------+----------+--------------+------------------+
1 row in set (0.02 sec)
MariaDB [(none)]> UNLOCK TABLES;
Query OK, 0 rows affected (0.00 sec)

In this case, the coordinates are the binlog.000031 logfile and the position 323. The slave will start to replicate the data from these coordinates.

The CHANGE MASTER TO statement can be used to tell the slave which master it has to replicate. The following basic example shows how to do this:

MariaDB [(none)]> CHANGE MASTER TO
    -> MASTER_HOST = '162.100.100.100',
    -> MASTER_USER = 'slave01',
    -> MASTER_PASSWORD='somepwd',
    -> MASTER_PORT = 5000,
    -> MASTER_LOG_FILE = 'binlog.000031',
    -> MASTER_LOG_POS = 323;
Query OK, 0 rows affected (0.23 sec)

At this point, the slave knows everything it needs to know about its master, the necessary permissions on the master are set, and both servers are running. To start replication, we will just use the START SLAVE statement:

MariaDB [(none)]> START SLAVE;
Query OK, 0 rows affected (0.02 sec)

The parameters of the connection between the slave and the master can be changed later. However, to do this, the slave must be temporarily stopped. For example:

MariaDB [(none)]> STOP SLAVE;
Query OK, 0 rows affected, 1 warning (0.00 sec)
MariaDB [(none)]> CHANGE MASTER TO MASTER_PASSWORD = 'my_new_pwd';
Query OK, 0 rows affected (0.22 sec)
MariaDB [(none)]> START SLAVE;
Query OK, 0 rows affected (0.00 sec)

If the slave server is restarted, the replication threads will be restarted too.

Checking whether a slave is running

We can check the running slave threads with a SHOW SLAVE STATUS query. For example:

MariaDB [(none)]> SHOW SLAVE STATUS \G
*************************** 1. row ***************************
               Slave_IO_State: Connecting to master
             Slave_IO_Running: Connecting
            Slave_SQL_Running: Yes
                   Last_Error:
1 row in set (0.02 sec)

The output of this statement has many columns, so we have truncated it, keeping only the columns that are important for verifying that a slave is running:

  • Slave_IO_State: The current state of the I/O thread, as shown in SHOW PROCESSLIST.
  • Slave_IO_Running: Shows whether the I/O thread is connected and working.
  • Slave_SQL_Running: Shows whether the SQL thread is running.
  • Last_Error: The last error encountered by the slave, if any.

The Slave_IO_Running and Slave_SQL_Running columns should be set to Yes. If either of them is not set to Yes, Slave_IO_State and Last_Error help us identify the problem. For example, if a connection error occurs, we will see an error similar to the following:

error connecting to master 'user@127.0.0.1:3310' - retry-time: 10  retries: 86400  message: Can't connect to MySQL server on '127.0.0.1' (111 "Connection refused")

Reconfiguring an existing slave

Sometimes we want to reconfigure an existing slave. Perhaps we restored an old database into the master, or the filtering rules were not set correctly. Whatever the reason, we want the slave to forget the data it has replicated so far and start again.

After being reconfigured, the slave will need to replicate the master's binary log from scratch, so it must forget the current coordinates. The RESET SLAVE statement does the trick: it deletes the slave's replication state files (master.info and relay-log.info) and the relay logs. This cannot be done while the slave is running, so it needs to be temporarily stopped.

Consider the following example:

MariaDB [(none)]> STOP SLAVE;
Query OK, 0 rows affected (0.17 sec)
MariaDB [(none)]> RESET SLAVE;
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> START SLAVE;
Query OK, 0 rows affected (0.18 sec)

Importing the data into a master

Sometimes, we want to replace an old master with a new one. Perhaps the old master was slow, or its hardware was damaged. Whatever the reason, we will need to load the old master's data into the new master.

We can choose to do this when the slaves are already running and connected to the master. In this case, we will use a dump of the data. In Chapter 8, Backup and Disaster Recovery, we discussed how to take this kind of backup using mysqldump and how to load the dump into a server. The dump will be automatically replicated by the slaves.
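
For example, the transfer could be performed with mysqldump and the mysql client, along the following lines; old_master and new_master are placeholder hostnames:

mysqldump --host=old_master --user=root -p --all-databases --single-transaction > old_master.sql
mysql --host=new_master --user=root -p < old_master.sql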

An alternative to using a logical backup is copying the files. In this case, the backup will not be automatically replicated if the slaves are already running: we will need to copy the physical backup into each slave's data directory too. In this way, the slaves will be provided with the old master's data before they start to replicate the new master.

Importing the data into a slave from a master

A replication environment is not meant to be static. It is possible to add slaves at any time, to obtain more redundancy or to balance the workload across a higher number of MariaDB servers. In this case, we will need to load the current master data into the new slaves. Then, the slave can start to replicate the master.

It is possible to create a physical backup and copy it into the slave's data directories as explained in Chapter 8, Backup and Disaster Recovery. There is no difference between normal physical backups and backups used for replication.

It is also possible to use mysqldump. This tool has already been discussed, but it has some parameters that make the dump easier to restore into a slave. The dump can be taken from the master or from a slave. The latter option is useful to avoid overloading the master.

Dumping data from a master

When taking a backup from a master that must be restored into a slave, the --master-data option is particularly useful. It adds a CHANGE MASTER TO statement to the dump, so that slaves are automatically set up to replicate the master from the proper coordinates.

The --delete-master-logs option executes the PURGE BINARY LOGS statement on the master. This statement is used to delete the old binary log files and will be discussed later in this chapter.
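
A possible invocation could look like the following; the hostname and credentials are placeholders, and --master-data=2 writes the CHANGE MASTER TO statement as a comment (use 1 to have it executed automatically when the dump is restored):

mysqldump --host=master1 --user=root -p --all-databases \
    --single-transaction --master-data=2 > master_dump.sql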

Dumping data from a slave

If the master already has at least one slave, the dump can be taken from the slave to avoid the execution of the long locking queries on the master.

The main parameter that we will want to use is --dump-slave. It is very similar to --master-data, but they produce different CHANGE MASTER TO statements. If we use --master-data, mysqldump assumes that the statement will be used to replicate data from the server it is connecting to. With --dump-slave, instead, mysqldump assumes that this server is a slave, and that the CHANGE MASTER TO statement will be used to replicate the same master that this slave replicates. This difference is very important.

Note that, while --dump-slave includes the replication coordinates in the dump, it does not include the master's hostname and port. Thus we may want to specify the --include-master-host-port option, which ensures the MASTER_HOST and MASTER_PORT clauses are included in the CHANGE MASTER TO statement. This option makes the configuration of a slave even easier, unless the dump is created on another slave, or the master's hostname or port is changed for some reason.

With --apply-slave-statements, the dump will contain a STOP SLAVE statement before CHANGE MASTER TO and a START SLAVE statement after it. This is useful if the target slave is already running. If it is not running yet, the START SLAVE statement will make replication start immediately, which we may not want if we prefer to verify the consistency of the data before the slave starts its work.
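
A possible invocation, run against an existing slave called slave1 (the hostname and credentials are placeholders), could be the following:

mysqldump --host=slave1 --user=root -p --all-databases \
    --single-transaction --dump-slave --include-master-host-port \
    --apply-slave-statements > slave_dump.sql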

Filtering binary log events

It is possible to avoid replicating some statements. This can be done on a master or on individual slaves.

The SET SQL_LOG_BIN statement

The SET SQL_LOG_BIN statement can be used to enable or disable the logging of subsequent statements into the binary log. It only affects the current session. Statements that are not written in the master's binary log cannot be replicated by any slave.

Consider the following example:

MariaDB [test]> SET SQL_LOG_BIN = 0;
Query OK, 0 rows affected (0.00 sec)
MariaDB [test]> /* this statement will not be logged or replicated */
    -> DROP TABLE orders;
Query OK, 0 rows affected (0.38 sec)
MariaDB [test]> SET SQL_LOG_BIN = 1;
Query OK, 0 rows affected (0.00 sec)

The @@skip_replication variable

The @@skip_replication session variable is similar to the SET SQL_LOG_BIN statement. However, it does not inhibit the logging of statements; it just causes those statements to be flagged as skip_replication in the binary log. The slaves will receive such events. Their behavior depends on the value of the @@replicate_events_marked_for_skip variable, as described in the following list (an example follows it):

  • REPLICATE: This causes the events to be replicated: the flag is simply ignored. This is the default value.
  • FILTER_ON_SLAVE: This causes the slaves to ignore such events. They will still receive them and log them, if the binary log is enabled. The flag is preserved on the slave's binary log.
  • FILTER_ON_MASTER: With this value, the slaves will not receive the flagged events at all.
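
The following sketch shows how the two variables could work together: the session on the master flags its writes, while a slave is configured to filter the flagged events (test.old_rows is just a placeholder table):

-- On the master, in the current session:
SET @@skip_replication = 1;
DELETE FROM test.old_rows;   -- this event is written to the binary log, but flagged
SET @@skip_replication = 0;

-- On a slave that must ignore the flagged events:
STOP SLAVE;
SET GLOBAL replicate_events_marked_for_skip = 'FILTER_ON_MASTER';
START SLAVE;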

Filtering the replication of events on the slaves

The slaves have three dynamic variables that can be used to prevent some tables, or some complete databases, from being replicated. A comma-separated list of arguments can be provided to these variables. Before changing their values, it is necessary to stop the slave. These variables are as follows (a short example follows the list):

  • @@replicate_ignore_db: This prevents the replication of the specified databases. This option does not affect multidatabase statements: to prevent them, @@replicate_ignore_table must be used instead.
  • @@replicate_ignore_table: This prevents the replication of the specified tables. The names must be specified in the db_name.table_name form.
  • @@replicate_wild_ignore_table: This is similar to @@replicate_ignore_table, but the use of the % and _ wildcard characters is allowed. These characters have the same meaning they have for the LIKE operator. For example, the value test%.% prevents the replication of any table in any database whose name starts with "test".
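
For example, the following statements exclude from replication every table in the databases whose names start with "test":

STOP SLAVE;
SET GLOBAL replicate_wild_ignore_table = 'test%.%';
START SLAVE;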

As an alternative, it is possible to disallow replication for all tables and databases, except for the specified subsets. This can be done using the following variables, which are the complement of the ones discussed earlier:

  • @@replicate_do_db
  • @@replicate_do_table
  • @@replicate_wild_do_table

Checksums of the binary log events

Since MariaDB 5.3, it has been possible to write each event's checksum into the binary log. This feature is not enabled by default, because it modifies the binary log format, adding an incompatibility. To write checksums, we can set the @@binlog_checksum variable to 1 (not ON). Enabling this option makes the replication more reliable, but we should avoid it when performance of the slaves is a problem.

It is possible to verify the checksums in several situations:

  • The master verifies the checksums when it reads events from the binary log, for example to send them to the slaves, if @@master_verify_checksum is set to 1. By default, it is 0.
  • The slave SQL thread verifies the checksums if @@slave_sql_verify_checksum is set to 1, which is the default value.
  • The mysqlbinlog utility (discussed in Chapter 8, Backup and Disaster Recovery) verifies the checksums if it is invoked with the --verify-binlog-checksum option.

All the variables that affect the binary log events checksums are dynamic.
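
For example, checksums could be enabled and verified with statements similar to the following:

-- On the master:
SET GLOBAL binlog_checksum = 1;
SET GLOBAL master_verify_checksum = 1;

-- On the slaves:
SET GLOBAL slave_sql_verify_checksum = 1;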

Configuring parallel replication

Usually, a database server processes requests from several clients at the same time. As long as these requests do not need to lock the same rows or tables to guarantee data integrity, they are processed simultaneously. In Chapter 5, Users and Connections, we discussed how simultaneous connections are handled. However, before MariaDB 10.0, slaves used only one thread to apply all the events they received from the master. Because of this limitation, write operations were much slower on the slaves, especially in environments where the master could execute many non-blocking writes. Parallel replication solves this issue by using multiple threads to apply replication events.

As mentioned earlier, parallel replication is not used by default. To enable it, we must set the @@slave_parallel_threads server variable to a value higher than 0 on the slave. This value is the number of parallel worker threads in a pool that is shared by all of the slave's master connections.

The following variables are only meaningful when used with parallel replication.

The @@slave_parallel_max_queued variable determines the amount of memory that a slave can use to cache not-yet-executed relay log events. When at least one worker thread is free, the slave examines this cache looking for events that can be executed in parallel.

The @@slave_domain_parallel_threads variable is useful when using parallel replication on a slave that replicates multiple masters. Imagine that a slave replicates three masters, and that one of them (call it master1) executes a statement that takes several hours; this can happen with big tables. The slave will need to replicate this statement. Initially, all the connections will be able to allocate worker threads, but the worker threads associated with master1 will have to wait until the very long statement has been executed. When no free worker threads are left, the other connections will no longer be able to benefit from parallel replication.

The purpose of @@slave_domain_parallel_threads is to prevent a single master connection from monopolizing the pool of threads. Its value determines the maximum number of threads that can be allocated to the same master connection. If the value is not lower than @@slave_parallel_threads, it has no effect. But if it is lower than @@slave_parallel_threads divided by the number of master connections, some worker threads will never be used. The value should be kept as high as possible, so that a master connection is not prevented from allocating a thread when there is no need to.

In real cases, finding the optimal value for @@slave_domain_parallel_threads can be complex. It is highly dependent on the characteristics of the masters' workloads. As a general rule, we can start with a value that is slightly lower than @@slave_parallel_threads, and then lower it if a problem occurs because of long-running statements.

All these variables are dynamic, so changing the configuration of parallel replication does not require a server restart. However, the slaves should be temporarily stopped.
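
For example, the following sketch enables four worker threads on a slave and limits each master connection to at most two of them; the values are purely illustrative:

STOP SLAVE;
SET GLOBAL slave_parallel_threads = 4;
SET GLOBAL slave_domain_parallel_threads = 2;
START SLAVE;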

Delaying a slave

There are several reasons why we might want a delayed slave. For example, a slave with a 30-minute delay can be used to recover from human errors: if we accidentally drop a table, we have 30 minutes to recover it from the slave before the deletion is replicated. Or, we may have a slave with a delay of a day, a week, or a month. This would allow us to compare the current databases with an older version, to find recent structure or data modifications.

MariaDB does not natively support delayed replication. However, Percona Toolkit contains a tool that implements this feature on the client side: it is called pt-slave-delay. It is necessary to invoke the tool separately for each slave that needs to be delayed. This tool works by connecting to the slave and periodically checking how much the slave is lagging behind the master. When the delay is too low, pt-slave-delay stops the slave for a while. It is possible to only cause the delay for a given period of time, or permanently.

The following options are also very important:

  • --delay: This specifies the desired amount of delay. The --interval option determines the time interval between the checks.
  • --run-time: This determines for how long the tool will run; if it is not specified, the tool never terminates spontaneously. By default, when the program terminates, the slave will be restarted if it is not running. This applies even if the program is terminated by pressing Ctrl + C. To change this behavior, and stop the slave on exit if it is running, we can use the --continue option.

The time options can be specified with a number followed by a letter that represents the time unit. For example, 30m means 30 minutes.

An invocation example is as follows:

pt-slave-delay --delay 3h --interval 10m -uroot -proot