Using londiste to replicate data

pgq is the backbone of a replication tool called londiste. The idea of londiste is to have a mechanism that is more simplistic and easier to use than, say, Slony. If you use Slony in a large installation, it is very easy for a problem on one side of the cluster to cause some issues at some other point—this was especially true many years ago when Slony was still fairly new.

The main advantage of londiste over Slony is that in the case of londiste replication there will be one process per "route". So, if you replicate from A to B this channel will be managed by one londiste process. If you replicate from B to A or from A to C those will be separate processes, which are totally independent from each other. All channels from A to somewhere might share a queue on the consumer but the transport processes themselves will not interact. There is some beauty in this approach because if one component fails, it is unlikely to cause additional problems—this is not the case if all the processes interact as they do in the case of Slony. To me this is one of the key advantages of londiste over Slony.

Replicating our first table

After this theoretical introduction we can move ahead and replicate our first table. To do so we create two databases inside the same instance (it makes no difference whether those databases are in the same instance or far apart):

hs@hs-VirtualBox:~$ createdb node1
hs@hs-VirtualBox:~$ createdb node2

Just as before we will create a table in both databases:

node1=# CREATE TABLE t_test (id int4, name text, 
t timestamp DEFAULT now(), 
PRIMARY KEY (id));
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t_test_pkey" for table "t_test"
CREATE TABLE

In the chapter about Slony we have already seen that DDLs are not replicated. The same rules apply to londiste because both systems are facing the same limitations on the PostgreSQL side.

Before we dig into details—let us briefly sum up the next steps to replicate our tables:

  • Write an init file and initialize the master.
  • Start londiste on the master.
  • Write a slave configuration and initialize the slave.
  • Start londiste on the slave.
  • Write a ticker config and start the ticker process.
  • Add desired tables to replication.

Let us get started with the first part of the process. We have to create an init file which is supposed to control the master:

[londiste3]
job_name = first_table
db = dbname=node1
queue_name = replication_queue
logfile = /home/hs/log/londiste.log
pidfile = /home/hs/pid/londiste.pid

The important part here is that every job must have a name. This makes sense so that we can distinguish those processes easily. Then we have to define a connect string to the master database as well as the name of the replication queue involved. Finally we can configure a PID and a logfile.

Tip

Every job must have a name. This is required as Slony has a single process above a cluster (and thus the cluster name is enough) but londiste has one process per route.

To install important things and to initialize the master node, we can call londiste:

hs@hs-VirtualBox:~/skytools$ londiste3 londiste3.ini create-root node1 dbname=node1
2013-05-15 13:37:24,902 3999 WARNING No host= in public connect string, bad idea
2013-05-15 13:37:25,118 3999 INFO plpgsql is installed
2013-05-15 13:37:25,119 3999 INFO Installing pgq
2013-05-15 13:37:25,119 3999 INFO   Reading from /usr/local/share/skytools3/pgq.sql
2013-05-15 13:37:25,327 3999 INFO pgq.get_batch_cursor is installed
2013-05-15 13:37:25,328 3999 INFO Installing pgq_ext
2013-05-15 13:37:25,328 3999 INFO   Reading from /usr/local/share/skytools3/pgq_ext.sql
2013-05-15 13:37:25,400 3999 INFO Installing pgq_node
2013-05-15 13:37:25,400 3999 INFO   Reading from /usr/local/share/skytools3/pgq_node.sql
2013-05-15 13:37:25,471 3999 INFO Installing londiste
2013-05-15 13:37:25,471 3999 INFO   Reading from /usr/local/share/skytools3/londiste.sql
2013-05-15 13:37:25,579 3999 INFO londiste.global_add_table is installed
2013-05-15 13:37:25,670 3999 INFO Initializing node
2013-05-15 13:37:25,674 3999 INFO Location registered
2013-05-15 13:37:25,755 3999 INFO Node "node1" initialized for queue "replication_queue" with type "root"
2013-05-15 13:37:25,761 3999 INFO Done

In Skytools there is a very simple rule: The first parameter passed to the script is always the INI file containing the desired configuration. Then comes an instruction as well as some parameters. The call will install all necessary infrastructure and return Done.

Once this has been completed, we can fire up a worker process:

hs@hs-VirtualBox:~/skytools$ londiste3 londiste3.ini worker
2013-05-15 13:41:31,761 4069 INFO {standby: 1}
2013-05-15 13:41:42,801 4069 INFO {standby: 1}

After firing up the worker on the master we can take a look at the slave configuration:

[londiste3]
job_name = first_table_slave
db = dbname=node2
queue_name = replication_queue
logfile = /home/hs/log/londiste_slave.log
pidfile = /home/hs/pid/londiste_slave.pid

The main difference here is that we use a different connect string and some different name for the job. If master and slave are two separate machines, the rest can stay the same.

Once we have compiled the configuration we can create the leaf node:

hs@hs-VirtualBox:~/skytools$ londiste3 slave.ini create-leaf node2 dbname=node2 --provider=dbname=node1
2013-05-15 13:51:27,090 4246 WARNING No host= in public connect string, bad idea
2013-05-15 13:51:27,117 4246 INFO plpgsql is installed
2013-05-15 13:51:27,118 4246 INFO pgq is installed
2013-05-15 13:51:27,122 4246 INFO pgq.get_batch_cursor is installed
2013-05-15 13:51:27,122 4246 INFO pgq_ext is installed
2013-05-15 13:51:27,123 4246 INFO pgq_node is installed
2013-05-15 13:51:27,124 4246 INFO londiste is installed
2013-05-15 13:51:27,126 4246 INFO londiste.global_add_table is installed
2013-05-15 13:51:27,205 4246 INFO Initializing node
2013-05-15 13:51:27,291 4246 INFO Location registered
2013-05-15 13:51:27,308 4246 INFO Location registered
2013-05-15 13:51:27,317 4246 INFO Subscriber registered: node2
2013-05-15 13:51:27,321 4246 INFO Location registered
2013-05-15 13:51:27,324 4246 INFO Location registered
2013-05-15 13:51:27,334 4246 INFO Node "node2" initialized for queue "replication_queue" with type "leaf"
2013-05-15 13:51:27,345 4246 INFO Done

The key here is to tell the slave where to find the master (provider). Once the system knows where to find all the data we can fire up the worker here as well.

hs@hs-VirtualBox:~/skytools$ londiste3 slave.ini worker
2013-05-15 13:55:10,764 4301 INFO Consumer uptodate = 1

This should not cause any issues and should work nicely if the previous command has succeeded as well. Now that we have everything in place we can attack the final component of the setup—the ticker process:

[pgqd]

logfile = /home/hs/log/pgqd.log
pidfile = /home/hs/pid/pgqd.pid

The ticker config is pretty trivial—all it takes is three simple lines. Those lines are enough to fire up the ticker process:

hs@hs-VirtualBox:~/skytools$ pgqd pgqd.ini
2013-05-15 14:01:12.181 4683 LOG Starting pgqd 3.1.4
2013-05-15 14:01:12.188 4683 LOG auto-detecting dbs ...
2013-05-15 14:01:12.310 4683 LOG test: pgq version ok: 3.1.3
2013-05-15 14:01:12.531 4683 LOG node1: pgq version ok: 3.1.3
2013-05-15 14:01:12.596 4683 LOG node2: pgq version ok: 3.1.3
2013-05-15 14:01:42.189 4683 LOG {ticks: 90, maint: 3, retry: 0}
2013-05-15 14:02:12.190 4683 LOG {ticks: 90, maint: 0, retry: 0}

If the ticker has started successfully, we have all of the infrastructure in place. So far we have configured all processes needed for replication—but, we have not yet told the system what to replicate.

The londiste command will offer us a set of commands to define exactly that. In our example we simply want to add all tables and replicate them:

hs@hs-VirtualBox:~/skytools$ londiste3 londiste3.ini add-table --all
2013-05-15 14:02:39,367 4760 INFO Table added: public.t_test

Just like Slony, londiste will install a trigger, which keeps track of all changes. Those changes will be written into a pgq queue and dispatched by the processes we have just set up:

node1=# d t_test
                Table "public.t_test"
 Column |            Type             |   Modifiers   
--------+-----------------------------+---------------
id      | integer                     | not null
name    | text                        | 
t       | timestamp without time zone | default now()
Indexes:
    "t_test_pkey" PRIMARY KEY, btree (id)
Triggers:
    _londiste_replication_queue AFTER INSERT OR DELETE OR UPDATE ON t_test FOR EACH ROW EXECUTE PROCEDURE pgq.logutriga('replication_queue')
    _londiste_replication_queue_truncate AFTER TRUNCATE ON t_test FOR EACH STATEMENT EXECUTE PROCEDURE pgq.sqltriga('replication_queue')

Skytools and londiste in particular offer a rich set of additional features to make your life easy. However, documenting all those features would unfortunately exceed the scope possible in this book. If you want to learn more we suggest taking a deep look at the doc directory inside the Skytools source code. You will find a couple of interesting documents explaining step by step what can be done.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.21.43.26