The FEDERATEDX and CONNECT storage engines

The FEDERATEDX and CONNECT storage engines allow us to use a remote table as if it were located in the local server. Local FEDERATEDX and CONNECT tables act as a proxy between the client and the remote server. When the client sends a SQL statement to the tables, the tables send it to the remote server; when the remote server returns some results, the tables send the result to the client.

This is not the most advanced solution to share data between multiple servers. The SPIDER storage engine has more features and optimizations, as explained in the next section. However, FEDERATEDX and CONNECT may have some advantages.

MariaDB Knowledge Base explains that the initial version of FEDERATEDX has been developed for Cisco. Its devices did not have much storage space, so they needed a MySQL storage engine to access remote data. This engine at the time was called FEDERATED, and was included in MySQL 5.0. More features were added in MySQL 5.1 to make it more usable for the majority of users. Since MariaDB developers thought that Oracle did not invest enough resources to maintain FEDERATED, they created a fork called FEDERATEDX, which is included in MariaDB. The author of this fork is the original developer of FEDERATED. The fork is fully compatible, but it contains more bug fixes and features. The most notable features are transactions (when the underlying table supports them) and the support for the ODBC protocol, to connect to database systems other than MariaDB or MySQL.

The CONNECT storage engine has already been mentioned in several chapters of this book. It basically allows the user to access a wide variety of external data sources as if they were MariaDB tables. This includes non-relational data sources, such as the text files written in several formats; CSV, XML, HTML and INI are just some examples, and the files can also be compressed with gzip. Even more exotic data sources are also supported; for example, on Windows, even directories and MAC network addresses can be read as tables. One of the supported data sources is a connection to a remote database server. Specific MariaDB/MySQL protocols are supported, as well as the generic ODBC standard. The CONNECT storage engine was then included in MariaDB 10.0.

While both FEDERATEDX and CONNECT are able to transparently use a remote MariaDB or MySQL table or view, their different histories determine different features and optimizations.

Some common features of FEDERATEDX and CONNECT are as follows:

  • When creating the local table, the columns' definition can be omitted. In this case, the remote table's definition will be used.
  • It is possible to exclude some remote columns from the local table.
  • The local table cannot have additional columns. An exception is that CONNECT supports virtual columns.
  • No indexes are created locally; this is impossible by design, since modifying the remote table does not require accessing the local table.

Creating a FEDERATEDX table

For backward compatibility, the FEDERATEDX storage engine must be referred to as FEDERATED in MariaDB. It is a built-in plugin, so it need not be installed and cannot be uninstalled.

Let's see how to create the FEDERATEDX table with some examples. First, we need to create the underlying table on a server, which we will call remote. We can use any storage engine. Consider the following example:

remote> CREATE TABLE db1.user (
    -> id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
    -> username VARCHAR(50),
    -> password VARCHAR(50),
    -> PRIMARY KEY (id)
    -> )
    -> ENGINE = InnoDB;
Query OK, 0 rows affected (0.47 sec)

Then, we will create FEDERATEDX on another server. We will do this on a server called local, shown as follows:

local> CREATE TABLE test.user_fed
    -> ENGINE = FEDERATED
    -> CONNECTION = 'mysql://user1:pwd@remote_server/db1/user';
Query OK, 0 rows affected (0.20 sec)

We did not provide the structure of the table, so it will be identical to the structure of the remote table. The CONNECTION table option specifies the information necessary to connect to the remote table. The string passed to this option is called a connection string and it contains the following information:

  • The protocol to use: mysql://
  • A username: user1
  • A password: pwd
  • The IP address or hostname of the remote server: remote_server
  • The database that contains the table: db1
  • The name of the remote table: user

If not specified, the database name and table name are the same as the ones used for the FEDERATEDX table.

Tip

With very old versions of the FEDERATED storage engine, the COMMENT table option was used instead of CONNECTION to store the connection data. It was sort of a trick implemented to work around issues with the storage engine's architecture. As a side effect, the user could not associate a comment to the FEDERATED tables. In modern versions of both FEDERATEDX and FEDERATED, the CONNECTION option must be used.

Defining a link to a remote server

The syntax shown previously to create a FEDERATEDX table is convenient when we want to access only one table on a remote server. However, if a remote server contains several tables that we want to access, we do not want to repeat the same long connection string for each table.

A good solution is to define a link to a remote server or, more specifically, a remote database. This can be done using the CREATE SERVER statement. A link created in this way can be used with all the storage engines that allow access to tables from remote servers:

  • The FEDERATED and FEDERATEDX storage engines
  • The CONNECT storage engine
  • The SPIDER storage engine

Here's an example of CREATE SERVER usage:

CREATE SERVER srv1
  FOREIGN DATA WRAPPER `mysql`
  OPTIONS (
    USER 'root',
    PASSWORD 'root',
    HOST '127.0.0.1',
    DATABASE 'remote_server'
  );

Note that there is no way to specify a table name.

We can now create a new table using this link:

CREATE TABLE test.t
  ENGINE = FEDERATED
  CONNECTION 'srv1/user';

The connection string, in this case, is the link name followed by the table name. If the table name is omitted, the remote table name is assumed to be equal to the new table name.

Links are stored in the tables present on the servers, in the mysql system database. Like all tables in the mysql database, servers should not be directly modified, but queried to examine the existing links, shown as follows:

MariaDB [test]> SELECT * FROM mysql.servers WHERE Server_name = 'srv1' G
*************************** 1. row ***************************
Server_name: srv1
       Host: remote_server
         Db: db1
   Username: root
   Password: root
       Port: 0
     Socket:
    Wrapper: mysql
      Owner:
1 row in set (0.00 sec)

Note that the only supported Wrapper is mysql, and the Owner property is currently not supported.

The following statement can be used to drop a link:

DROP SERVER IF EXISTS srv1;

The existing links cannot be edited, but they can be dropped and recreated. Note that dropping or recreating them has no effect on the existing tables. To update a table definition, it is necessary to drop and recreate it.

Creating a MYSQL CONNECT table

The CONNECT storage engine supports several table types. Each of them allows us to use a different type of data source. The data sources that allow us to communicate the remote DBMSs are MYSQL and ODBC. The MYSQL type works for both MariaDB and MySQL servers, using the native protocol. The ODBC type works for any DBMS that supports the ODBC standard. In this chapter, we will only discuss the MYSQL type.

The syntax used to create a FEDERATEDX table can also be used to create a MYSQL CONNECT table, except that no indexes can be defined:

CREATE TABLE test.user (
  id INTEGER UNSIGNED NOT NULL,
  username VARCHAR(50),
  password VARCHAR(50)
)
  ENGINE = CONNECT
  TABLE_TYPE = MYSQL
  CONNECTION = 'mysql://user1:pwd@remote_server/db1/user';

Similar to FEDERATEDX, CONNECT requires the connection string to be specified in the CONNECTION option, not in COMMENT.

A server name can be specified instead of a complete connection string:

CREATE TABLE test.user (
  id INTEGER UNSIGNED NOT NULL,
  username VARCHAR(50),
  password VARCHAR(50)
)
  ENGINE = CONNECT
  TABLE_TYPE = MYSQL
  CONNECTION = 'srv1';

The CONNECT storage engine also supports the DBNAME and TABNAME table options, which specify a remote database name and a remote table name. They can be combined with a connection string, shown as follows:

CREATE TABLE test.user (
  id INTEGER UNSIGNED NOT NULL,
  username VARCHAR(50),
  password VARCHAR(50)
)
  ENGINE = CONNECT
  TABLE_TYPE = MYSQL
  DBNAME = 'test'
  TABNAME = 'user'
  CONNECTION = 'mysql://user1:pwd@remote_server/';

If a database name and table name are provided in the connection string and via the DBNAME and TABNAME options, the values specified in the connection string will be used.

A view name can be specified instead of a table name, using any of the preceding syntaxes. It is also possible to specify a query, which will be used as an unnamed view:

CREATE TABLE test.user (
  id INTEGER UNSIGNED NOT NULL,
  username VARCHAR(50),
  password VARCHAR(50)
)
  ENGINE = CONNECT
  TABLE_TYPE = MYSQL
  SRCDEF = 'SELECT * FROM db1.user WHERE id > 1000000'
  CONNECTION = 'mysql://user1:pwd@remote_server/';

Sending SQL statements to a remote server

The CONNECT storage engine allows us to send arbitrary SQL statements to a remote server. This is mainly useful for sending administrative commands or creating new tables remotely without directly accessing a remote MariaDB server.

To be able to directly send statements to a remote server, a special CONNECT table must be created. Here is an example:

CREATE TABLE srv1_sql (
  statement VARCHAR(128) NOT NULL FLAG = 0,
  number INTEGER NOT NULL FLAG = 1,
  message VARCHAR(255) FLAG = 2,
  warnings INTEGER NOT NULL FLAG = 3
)
  ENGINE = CONNECT,
  TABLE_TYPE = MYSQL,
  CONNECTION 'srv1',
  OPTION_LIST = 'Execsrc=1,Maxerr=10';

The table and column names are not relevant and can be defined arbitrarily. We used the name of a defined server followed by the _sql suffix. This seems quite logical, if we decide to define special CONNECT tables to send statements to all the remote servers. This can be useful even if we decide to use FEDERATEDX or SPIDER to access the remote tables. However, we must be cautious while setting the permissions for these special tables. Usually, they should only be accessible by root, or other users who have the SUPER privilege.

The OPTION_LIST table option is what makes CONNECT aware of the purpose of this special table. The key option is Execsrs. The Maxerr option can be used to set the maximum number of errors and warnings that can be received from the remote server, in response to each statement we send.

Each column in the table has a special meaning. While their meanings are made clearer for us if we use descriptive names, the purpose of the columns is defined using FLAG, a column option specific to CONNECT. For example, renaming the message column will not alter its purpose, as long as the FLAG option remains unchanged. The following table shows the meanings of the various FLAG values:

FLAG value

Column meaning

0

Statement to be executed

1

Number of rows affected by a write statement, or returned by a query

2

An informative message

3

Number of errors and warnings

The default value for the FLAG option is 0, so FLAG=0 can always be omitted in the CREATE TABLE statement.

The following example shows how to execute a statement remotely using the previously created table:

MariaDB [test]> SELECT * FROM srv1_sql WHERE statement = '
    '> CREATE OR REPLACE TABLE db1.new_table (id INT) ENGINE = InnoDB
    '> ';
+------------------------------------------------------------------+--------+---------------+----------+
| statement                                                        | number | message       | warnings |
+------------------------------------------------------------------+--------+---------------+----------+
|
CREATE OR REPLACE TABLE db1.new_table (id INT) ENGINE = InnoDB
 |      0 | Affected rows |        0 |
+------------------------------------------------------------------+--------+---------------+----------+
1 row in set (0.85 sec)

The executed statement has been sent via the WHERE condition and then returned in the statement column. We know that the statement succeeded because the warnings value is 0. The message and number columns tell us that the number of rows affected by CREATE TABLE is 0; this is normal, because CREATE TABLE always returns this value.

Just to verify that the CREATE TABLE statement succeeded on the remote server, we will now create a CONNECT table that is linked to it:

MariaDB [test]> CREATE TABLE new_table (
    -> id INT
    -> )
    -> ENGINE = CONNECT,
    -> TABLE_TYPE = MYSQL,
    -> CONNECTION 'srv1/new_table';
Query OK, 0 rows affected (0.13 sec)
MariaDB [test]> SELECT COUNT(*) FROM new_table;
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

It is also possible to send more than one statement to the remote server using a single SELECT statement against the CONNECT table. This can be done using the IN operator. In this case, a row will be returned for each executed statement:

MariaDB [test]> SELECT * FROM srv1_sql WHERE statement IN (
    -> 'SET @@global.innodb_file_per_table = 1',
    -> 'SET @@global.innodb_strict_mode = 1'
    -> );
+----------------------------------------+--------+---------------+----------+
| statement                              | number | message       | warnings |
+----------------------------------------+--------+---------------+----------+
| SET @@global.innodb_file_per_table = 1 |      0 | Affected rows |        0 |
| SET @@global.innodb_strict_mode = 1    |      0 | Affected rows |        0 |
+----------------------------------------+--------+---------------+----------+
2 rows in set (0.00 sec)

By default, only fatal errors are returned, and they are contained in the message (FLAG=2) column:

MariaDB [test]> SELECT * FROM srv1_sql WHERE statement = 'SET @@global.performance_schema = 1' G
*************************** 1. row ***************************
statement:
SET @@global.performance_schema = 1

   number: 1238
  message: Remote: Variable 'performance_schema' is a read only variable
 warnings: 0
1 row in set (0.23 sec)

In most situations, this is enough, but there are cases in which we prefer to assess each warning in order to debug messages. To do this, there are three special commands that we can send to the special table as if they were normal SQL statements. The CONNECT storage engine will parse those commands and it will not send them to the remote server. Examples of such commands are:

  • Note
  • Warning
  • Error

They tell CONNECT to show notes, warnings, and errors in the result set of the current SELECT statement. For example:

MariaDB [test]> SELECT * FROM srv1_sql WHERE statement IN (
    -> 'Note','Warning', 'Error',
    -> 'DROP TABLE IF EXISTS t1',
    -> 'Note','Warning', 'Error',
    -> 'CREATE TABLE t1 (c INT) ENGINE = MySAM'
    -> );
+----------------------------------------+--------+--------------------------------------------+----------+
| statement                              | number | message                                    | warnings |
+----------------------------------------+--------+--------------------------------------------+----------+
| DROP TABLE IF EXISTS t1                |      0 | Affected rows                              |        1 |
| Note                                   |   1051 | Unknown table 'db1.t1'                     |        0 |
| CREATE TABLE t1 (c INT) ENGINE = MySAM |      0 | Affected rows                              |        2 |
| Warning                                |   1286 | Unknown storage engine 'MySAM'             |        0 |
| Warning                                |   1266 | Using storage engine InnoDB for table 't1' |        0 |
+----------------------------------------+--------+--------------------------------------------+----------+
5 rows in set (0.42 sec)

In the preceding example, the first Note informs us that the table we tried to DROP does not exist. More importantly, the last two warnings tell us that we mistyped MyISAM, and thus, the InnoDB storage engine has been used instead.

This technique does not provide a direct way to retrieve a result set from a remote server. However, this can be done with an easy procedure:

  1. Send a CREATE TABLE … SELECT statement to the remote server
  2. Create a CONNECT table that points to the new remote table
  3. Query the local table

Note that the local table must be recreated every time the remote table changes, unless its columns remain the same.

The following example shows how to retrieve the remote server version:

MariaDB [test]> SELECT * FROM srv1_sql WHERE statement = '
    '> CREATE OR REPLACE TABLE a.output AS SELECT version(), @@global.version_comment
    '> ';
+----------------------------------------------------------------------------------+--------+---------------+----------+
| statement                                                                        | number | message       | warnings |
+----------------------------------------------------------------------------------+--------+---------------+----------+
|
CREATE OR REPLACE TABLE a.output AS SELECT version(), @@global.version_comment
 |      1 | Affected rows |        0 |
+----------------------------------------------------------------------------------+--------+---------------+----------+
1 row in set (0.75 sec)

MariaDB [test]> CREATE OR REPLACE TABLE output
    ->      ENGINE = CONNECT,
    ->      TABLE_TYPE = MYSQL,
    ->      CONNECTION 'srv1/output';
Query OK, 0 rows affected (0.09 sec)

MariaDB [test]> SELECT * FROM output;
+---------------------+--------------------------+
| version()           | @@global.version_comment |
+---------------------+--------------------------+
| 10.0.11-MariaDB-log | MariaDB Server           |
+---------------------+--------------------------+
1 row in set (0.25 sec)

Merging multiple CONNECT MYSQL tables

The TBL table is another useful table type. It represents a collection of identical (or very similar) CONNECT tables. Such a collection can be queried as if it were a single table. Used with the MYSQL table type, TBL allows us to implement a simple form of data sharding over several servers. It is possible to define a local set of CONNECT MYSQL tables, which are linked to several remote tables spread on multiple servers. A table of type TBL can then be built on such MYSQL tables. Queries against the TBL table will be sent to the MYSQL tables, and data will be retrieved from all the relevant remote servers.

Note that the TBL table type has an important limitation: it is read-only. Data in the underlying tables cannot be modified through TBL, which makes it unusable in many situations. However, it can still be an option when we want to shard read-only data through several MariaDB servers.

The syntax to create a TBL table is as follows:

CREATE TABLE <table_name>
  ENGINE = CONNECT,
  TABLE_TYPE = TBL,
  TABLE_LIST = '<table_list>';

The <table_list> variable is a comma-separated list of CONNECT table names. Each name can optionally be specified as database_name.table_name. If the database name is omitted, the table is assumed to be located in the same database as the TBL table.

While a TBL table, technically, can only be built on CONNECT tables, it can still be indirectly linked to tables that use different storage engines, such as InnoDB or MyISAM. This is done by creating MYSQL tables that point to such tables and build a TBL table on them. The result will be slower than a direct access because a connection to the local server will be used. However, it should still be faster than connecting a remote server. This technique is useful if we want to distribute data over multiple servers, including the local server.

The CONNECT engine can solve an important performance issue that cannot be solved easily using other storage engines, such as FEDERATEDX or SPIDER. Imagine that we work for a company that has several physical stores. Each store has its own database, which contains, among other data, the stocked products, product categories, and recent transactions. Assume that we are asked to write a query, which returns the average quantity of phones sold last week in all the stores. To do this, we need to perform JOIN between three tables: product_category, product, and transaction.

However, this JOIN must be executed separately for each store; comparing the transactions of one store with the products of another store may or may not lead to wrong results, depending on how the system is designed, but it would require a huge amount of network traffic and would be way too slow. To execute the JOIN query on a store basis, we can create a MYSQL CONNECT table for each store; this table will be based on that query, using the SRCDEF option as shown in the following code. Such tables, when queried, will return the associations between categories, products, and transactions for each store. Then, we can build a TBL table on top of these MYSQL tables, and perform a query that returns the desired average. Only the relevant rows will be sent from the databases in the stores to the local MariaDB server.

A special syntax can be used to achieve the same result without defining a new CONNECT table for each remote server. The SRCDEF option can be specified directly in the TBL table definition, and a list of remote servers can be provided in the TABLE_LIST option:

CREATE TABLE <table_name>
  ENGINE = CONNECT,
  TABLE_TYPE = TBL,
  SRCDEF = '<query>',
  TABLE_LIST = '<server_list>';
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.221.123.73