Adding tables to replication and managing problems

Once we have added this table to the system, we can add it to the replication setup. Doing so is a little complex. First of all we have to create ourselves a new table set and merge this one with the one we already have. So, for a brief moment we will have two table sets involved. The script goes like this:

#!/bin/sh

MASTERDB=db1
SLAVEDB=db2
HOST1=localhost
HOST2=localhost
DBUSER=hs

slonik<<_EOF_
cluster name = first_cluster;

node 1 admin conninfo = 'dbname=$MASTERDB host=$HOST1 user=$DBUSER';
node 2 admin conninfo = 'dbname=$SLAVEDB host=$HOST2 user=$DBUSER';

create set (id=2, origin=1, 
comment='a second replication set'),
set add table (set id=2, origin=1, id=5, 
fully qualified name = 'public.t_second', 
  comment='second table'),
subscribe set(id=1, provider=1,receiver=2);
merge set(id=1, add id=2,origin=1);
_EOF_

The key to success is the merge call at the end of the script. It will make sure that those new tables will be integrated into the existing table set.

When the script is executed we will face an expected problem, as follows:

hs@hs-VirtualBox:~/slony$ sh slony_add_to_set.sh
<stdin>:7: PGRES_FATAL_ERROR select "_first_cluster".determineIdxnameUnique('public.t_second', NULL);  - ERROR:  Slony-I: table "public"."t_second" has no primary key

We have created the table without a primary key. This is highly important—there is no way for Slony to replicate a table without a primary key. So, we have to add this primary key. Basically we have two choices to do that. The desired way here is definitely to use execute script just as we have shown before. If your system is idling, you can also do it the quick and dirty way:

db1=# ALTER TABLE t_second ADD PRIMARY KEY (id);
NOTICE:  ALTER TABLE / ADD PRIMARY KEY will create implicit index "t_second_pkey" for table "t_second"
ALTER TABLE
db1=# q
hs@hs-VirtualBox:~/slony$ psql db2
psql (9.2.4)
Type "help" for help.

db2=# ALTER TABLE t_second ADD PRIMARY KEY (id);
NOTICE:  ALTER TABLE / ADD PRIMARY KEY will create implicit index "t_second_pkey" for table "t_second"
ALTER TABLE

However, this is not recommended—it is definitely more desirable to use the Slony interface to make changes like that.

Once we have fixed the data structure we can execute the slonik script again and see what happens:

hs@hs-VirtualBox:~/slony$ sh slony_add_to_set.sh
<stdin>:6: PGRES_FATAL_ERROR lock table "_first_cluster".sl_event_lock, 
"_first_cluster".sl_config_lock;select "_first_cluster".storeSet(2, 'a second replication set'),  - ERROR:  duplicate key value violates unique constraint "sl_set-pkey"
DETAIL:  Key (set_id)=(2) already exists.
CONTEXT:  SQL statement "insert into "_first_cluster".sl_set
      (set_id, set_origin, set_comment) values
      (p_set_id, v_local_node_id, p_set_comment)"
PL/pgSQL function _first_cluster.storeset(integer,text) line 7 at SQL statement

What you see is a typical problem that you will face with Slony. If something goes wrong, it can be really, really hard to get things back in order. This is a scenario you should definitely be prepared for.

Tip

If you are working with Slony on a production system always create yourself a perfectly working library with scripts to perform different tasks. It will greatly reduce your risk if you don't have to come up with fixes on the fly and during normal operations. Always make sure that you have got enough scripts around to handle most common issues such as the one we have just outlined.

So, to fix the problem we can simply drop the table set again and start from scratch:

slonik<<_EOF_
cluster name = first_cluster;

node 1 admin conninfo = 'dbname=$MASTERDB host=$HOST1 user=$DBUSER';
node 2 admin conninfo = 'dbname=$SLAVEDB host=$HOST2 user=$DBUSER';

drop set (id=2, origin=1);
_EOF_

To kill a table set we can run drop set. It will help you to get back to where you started. The script will execute cleanly:

hs@hs-VirtualBox:~/slony$ sh slony_drop_set.sh

Now we can restart again and add the table. Note that we are subscribing both sets to the slave to make sure this executes cleanly:

slonik<<_EOF_
cluster name = first_cluster;

node 1 admin conninfo = 'dbname=$MASTERDB host=$HOST1 user=$DBUSER';
node 2 admin conninfo = 'dbname=$SLAVEDB host=$HOST2 user=$DBUSER';
create set (id=2, origin=1, comment='a second replication set'),
set add table (set id=2, origin=1, id=5, fully qualified name =   'public.t_second', comment='second table'),
subscribe set(id=1, provider=1,receiver=2);
subscribe set(id=2, provider=1,receiver=2);
merge set(id=1, add id=2,origin=1);
_EOF_

We can now cleanly execute the script and everything will be replicated as expected:

hs@hs-VirtualBox:~/slony$ sh slony_add_to_set_v2.sh
<stdin>:11 subscription in progress before mergeSet. waiting
<stdin>:11 subscription in progress before mergeSet. waiting

As we have stated already, in this chapter we have intentionally made a small mistake and you have seen, how tricky and work intense it can be to get things straight even if it is just a small mistake. One of the reasons for that is that a script is basically not a transaction on the server side. So, if a script fails somewhere in the middle, it will just stop working—it will not undo changes made so far. This can cause some issues; these are outlined in this section.

So, once you have made a change you should always take a look and see if everything works nicely. One simple way to do that is as follows:

db2=# BEGIN;
BEGIN
db2=# DELETE FROM t_second;
ERROR:  Slony-I: Table t_second is replicated and cannot be modified on a subscriber node - role=0
db2=# ROLLBACK;
ROLLBACK

You can start a transaction and try to delete a row. It is supposed to fail. If it does not, you can safely rollback and try to fix your problem. As you are using a transaction that never commits, nothing can go wrong.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.140.195.225