Replicating your first database

After this little introduction, we can move forward and replicate our first database. To do so, we create two databases inside a single database instance and simply replicate from one to the other.

Tip

It makes no difference if you replicate within an instance or between two instances—it works exactly the same way.

Creating those two databases should be an easy task once your instance is up and running:

hs@hs-VirtualBox:~$ createdb db1
hs@hs-VirtualBox:~$ createdb db2

Now we can create a table, which should be replicated from database db1 to database db2:

db1=# CREATE TABLE t_test (id serial, name text, PRIMARY KEY (id));
NOTICE:  CREATE TABLE will create implicit sequence "t_test_id_seq" for serial column "t_test.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t_test_pkey" for table "t_test"
CREATE TABLE

Create this table in both databases in an identical way because the table structure won't be replicated automatically.
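One way to create the identical table in db2 straight from the shell, assuming the same user and a local instance, is:

hs@hs-VirtualBox:~$ psql db2 -c "CREATE TABLE t_test (id serial, name text, PRIMARY KEY (id));"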

Once this has been done, we can write a slonik script to tell the cluster about our two nodes. slonik is a command-line interface that we can use to talk to Slony directly. You could also use it interactively, but that is far from comfortable.

A script to register these nodes would look as follows:

#!/bin/sh

MASTERDB=db1
SLAVEDB=db2
HOST1=localhost
HOST2=localhost
DBUSER=hs

slonik<<_EOF_
cluster name = first_cluster;

    # define nodes
node 1 admin conninfo = 'dbname=$MASTERDB host=$HOST1 user=$DBUSER';
node 2 admin conninfo = 'dbname=$SLAVEDB host=$HOST2 user=$DBUSER';

    # init cluster
init cluster ( id=1, comment = 'Master Node');

    # group tables into sets
create set (id=1, origin=1, comment='Our tables');
set add table (set id=1, origin=1, id=1,
    fully qualified name = 'public.t_test',
    comment='sample table');

store node (id=2, comment = 'Slave node', event node=1);
store path (server = 1, client = 2, conninfo='dbname=$MASTERDB host=$HOST1 user=$DBUSER');
store path (server = 2, client = 1, conninfo='dbname=$SLAVEDB host=$HOST2 user=$DBUSER');
_EOF_

First of all, we define a handful of shell variables. This is not strictly necessary, but it is quite handy: if a connection parameter changes later on, nothing is forgotten. Then our slonik script starts.

The first thing we have to do is define a cluster name. This is important: in Slony, a cluster is more of a virtual thing and is not necessarily related to physical hardware. We will find out what this means later on when talking about failovers.

In the next step, we have to define the nodes of this cluster. The idea is that each node has a number associated with a connection string. Once this has been done, we can call init cluster. During this step, Slony deploys all of the infrastructure needed for replication; we don't have to install anything manually here.
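As an optional sanity check, you can look for the schema that Slony creates for its own bookkeeping. It is named after the cluster with a leading underscore (output shortened here):

db1=# \dn
      Name      | Owner 
----------------+-------
 _first_cluster | hs
 ...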

Now that the cluster has been initialized, we can organize our tables into replication sets, which are really just sets of tables. In Slony, we always work with replication sets: tables are grouped into sets and replicated together. This layer of abstraction allows us to quickly move groups of tables around, which is often a lot easier than moving individual tables one by one.
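Keep in mind that only objects listed in a set are replicated. If the sequence behind the serial column should be tracked as well, slonik offers a set add sequence command for that; a fragment along these lines could be added to the script (the sequence id of 1 is just an example and has to be unique within the cluster):

set add sequence (set id=1, origin=1, id=1,
    fully qualified name = 'public.t_test_id_seq',
    comment='sequence of the sample table');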

Finally, we have to define paths. What is a path? A path is basically the connection string used to move from A to B. The main question here is why paths are needed at all: we have already defined nodes earlier, so why define paths? The point is that the route from A to B is not necessarily the same as the route from B to A. This is especially important if one of these servers is inside a DMZ while the other one is not. In other words, by defining paths you can easily replicate between different private networks and cross firewalls, doing some NAT if necessary.
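To illustrate the idea, here is a purely hypothetical pair of store path commands for a setup in which node 2 sits behind NAT. Each direction uses the address that is actually reachable from the client's point of view (the host names and addresses are made up):

store path (server = 1, client = 2, conninfo='dbname=db1 host=master.internal.example user=hs');
store path (server = 2, client = 1, conninfo='dbname=db2 host=203.0.113.10 port=6432 user=hs');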

As the script is a simple shell script we can easily execute it:

hs@hs-VirtualBox:~/slony$ sh slony_first.sh

Slony has done some work in the background. When looking at our test table we can see what has happened:

db1=# \d t_test
                         Table "public.t_test"
 Column |  Type   |                      Modifiers                      
--------+---------+-----------------------------------------------------
id      | integer | not null default nextval('t_test_id_seq'::regclass)
name    | text    | 
Indexes:
    "t_test_pkey" PRIMARY KEY, btree (id)
Triggers:
    _first_cluster_logtrigger AFTER INSERT OR DELETE OR UPDATE ON t_test FOR EACH ROW EXECUTE PROCEDURE _first_cluster.logtrigger('_first_cluster', '1', 'k')
    _first_cluster_truncatetrigger BEFORE TRUNCATE ON t_test FOR EACH STATEMENT EXECUTE PROCEDURE _first_cluster.log_truncate('1')
Disabled triggers:
    _first_cluster_denyaccess BEFORE INSERT OR DELETE OR UPDATE ON t_test FOR EACH ROW EXECUTE PROCEDURE _first_cluster.denyaccess('_first_cluster')
    _first_cluster_truncatedeny BEFORE TRUNCATE ON t_test FOR EACH STATEMENT EXECUTE PROCEDURE _first_cluster.deny_truncate()

A handful of triggers have been deployed automatically to keep track of changes. Each relevant event (INSERT, UPDATE, DELETE, and TRUNCATE) is covered by a trigger.

Now that this table is under Slony's control we can start to replicate it. To do so we have to come up with a slonik script again:

#!/bin/sh

MASTERDB=db1
SLAVEDB=db2
HOST1=localhost
HOST2=localhost
DBUSER=hs

slonik<<_EOF_
cluster name = first_cluster;

node 1 admin conninfo = 'dbname=$MASTERDB host=$HOST1 user=$DBUSER';
node 2 admin conninfo = 'dbname=$SLAVEDB host=$HOST2 user=$DBUSER';

subscribe set ( id = 1, provider = 1, receiver = 2, forward = no);
_EOF_

After stating the cluster name and listing the nodes, we can call subscribe set. The point here is that, in our example, set number 1 is replicated from node 1 (provider) to node 2 (receiver). The forward keyword is worth mentioning: it indicates whether the new subscriber should store log information during replication so that it can act as a provider for future nodes. Any node that is intended to be a candidate for FAILOVER must have forward = yes. In addition, this keyword is essential for cascaded replication (meaning A replicates to B, and B replicates to C).

If you execute this script, Slony will truncate the table on the slave and reload all the data to make sure that both sides are in sync. In many cases, you already know that you are in sync and want to avoid copying gigabytes of data over and over again. To achieve that, we can add OMIT COPY = yes, which tells Slony that we are sufficiently confident that the data is already in sync.
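As a sketch, a subscription that skips the initial copy and also allows the slave to feed further nodes later on (assuming the data really is identical on both sides) could look like this:

subscribe set ( id = 1, provider = 1, receiver = 2, forward = yes, omit copy = yes);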

After defining what we want to replicate, we can fire up those two slon daemons in our favorite UNIX shell:

$ slon first_cluster 'host=localhost dbname=db1'
$ slon first_cluster 'host=localhost dbname=db2'

This can also be done before we define the replication route, so the order is not a primary concern here.
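In practice, you will usually not want to keep two terminals open for the daemons. One simple way, just one option among many, is to push them into the background and redirect their output to log files (the file names are arbitrary):

$ nohup slon first_cluster 'host=localhost dbname=db1' > slon_db1.log 2>&1 &
$ nohup slon first_cluster 'host=localhost dbname=db2' > slon_db2.log 2>&1 &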

Now we can move forward and check if replication is working nicely:

db1=# INSERT INTO t_test (name) VALUES ('anna');
INSERT 0 1
db1=# SELECT * FROM t_test;
 id | name 
----+------
  1 | anna
(1 row)

db1=# \q
hs@hs-VirtualBox:~/slony$ psql db2
psql (9.2.4)
Type "help" for help.

db2=# SELECT * FROM t_test;
 id | name 
----+------
(0 rows)

db2=# SELECT * FROM t_test;
 id | name 
----+------
  1 | anna
(1 row)

We add a row to the master, quickly disconnect, connect to the slave, and check whether the data is already there. If you happen to be quick enough, you will see that the data arrives with a small delay. In our example, we managed to catch an empty table first, which is exactly what asynchronous replication means.
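If you do not want to rely on being quick with psql, Slony's bookkeeping schema also provides a view for inspecting replication lag. A query along these lines, run on the origin (column names may vary slightly between Slony versions), shows how far the subscribers are behind:

db1=# SELECT st_origin, st_received, st_lag_num_events, st_lag_time FROM _first_cluster.sl_status;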

Tip

Let us assume you are running a book shop. Your application connects to server A to create a new user. Then the user is redirected to a new page, which queries some information about the newly created user. Be prepared for the possibility that this data has not yet arrived on server B. This is a common mistake in many web applications dealing with load balancing. The same kind of delay occurs with asynchronous streaming replication.
