Configuring a simple cluster

In this chapter, we want to set up a cluster consisting of three Datanodes. A Coordinator and a Global Transaction Manager (GTM) will be in charge of the cluster. For each component, we have to create a directory:

hs@vm:~/data$ ls -l
total 24
drwx------  2 hs hs 4096 Jun 13 15:56 gtm
drwx------ 13 hs hs 4096 Jun 13 15:54 node1
drwx------ 13 hs hs 4096 Jun 13 15:55 node2
drwx------ 13 hs hs 4096 Jun 13 15:55 node3
drwx------ 13 hs hs 4096 Jun 13 15:55 node4

Keep in mind that, to make life simple, we will set up the entire cluster on a single server. In production, you would naturally use different servers for those components; otherwise, there is no point in using Postgres-XC.
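The layout shown above can be created in one step. In the following sketch, BASE is an assumption; adjust it to your environment:

```shell
# Base directory for the whole test cluster (an assumption; adjust as needed)
BASE=${BASE:-$HOME/data}

# One directory per component: the GTM plus four database nodes
mkdir -p "$BASE"/gtm "$BASE"/node1 "$BASE"/node2 "$BASE"/node3 "$BASE"/node4

# PostgreSQL refuses to start on data directories with loose permissions
chmod 700 "$BASE"/gtm "$BASE"/node1 "$BASE"/node2 "$BASE"/node3 "$BASE"/node4
```
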

Creating the GTM

In the first step, we have to initialize the directory handling the GTM. To do so, we can simply call initgtm:

hs@vm:~/data/gtm$ initgtm -Z gtm -D /home/hs/data/gtm/
The files belonging to this GTM system will be owned by user "hs".
This user must also own the server process.

fixing permissions on existing directory /home/hs/data/gtm ... ok
creating configuration files ... ok

Success. You can now start the GTM server using:

gtm -D /home/hs/data/gtm
or
gtm_ctl -Z gtm -D /home/hs/data/gtm -l logfile start

Don't expect anything big or magical from initgtm. It merely creates the basic configuration needed to run the GTM; it does not create a large database infrastructure.

However, its output already tells us how to start the GTM, which we will do later in the process.

Then we have to initialize the four database nodes we want to run. To do so, we run initdb, just as for any vanilla PostgreSQL instance. However, in the case of Postgres-XC, we have to tell initdb the name the node will have. In our case, we create the first node, called node1, in the node1 directory. Each node needs a dedicated name. This is shown as follows:

initdb -D /home/hs/data/node1/ --nodename=node1

We can call initdb for all four instances we will run. To make sure that those instances can coexist on the very same test box, we have to change the port for each of them. In our example, we will simply use the following ports: 5432, 5433, 5434, and 5435.
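The four initdb calls follow the same pattern, so they can be generated in a small loop. The loop below only prints the invocations so that they can be reviewed first; pipe the output to sh (or drop the echo) to actually run them. The path is the one used throughout this chapter:

```shell
# Print one initdb invocation per instance; each node needs a unique
# --nodename. Pipe the output to sh to execute the commands.
for n in node1 node2 node3 node4
do
    echo "initdb -D $HOME/data/$n/ --nodename=$n"
done
```
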

Tip

To change the port, just edit the port setting in the postgresql.conf file of each instance. Also, make sure that each instance has a different unix_socket_directory setting; otherwise, you cannot start more than one instance.
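These per-instance settings can be applied in one pass. The snippet below appends the port and a private socket directory to each postgresql.conf; since PostgreSQL honors the last occurrence of a setting in the file, appending is enough. The /tmp/<node> socket paths are an assumption; any distinct, writable directories will do:

```shell
BASE=${BASE:-$HOME/data}            # cluster directory used in this chapter
port=5432
for n in node1 node2 node3 node4
do
    mkdir -p "$BASE/$n" "/tmp/$n"   # socket directory must exist at startup
    {
        echo "port = $port"
        echo "unix_socket_directory = '/tmp/$n'"
    } >> "$BASE/$n/postgresql.conf"
    port=$((port + 1))
done
```
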

Now that all the instances have been initialized, we can start the Global Transaction Manager. This works as follows:

hs@vm:~/data$ gtm_ctl -D ./gtm/ -Z gtm start
server starting

To see if it works, we can check for the process as follows:

hs@vm:~/data$ ps ax | grep gtm
16976 pts/5    S      0:00 /usr/local/postgres-xc/bin/gtm -D ./gtm

Then we can start all those nodes one after the other.

In our case, we will use one of those four nodes as the Coordinator. The Coordinator will be using port 5432. To start it, we can call pg_ctl and tell the system to use this node as a Coordinator:

pg_ctl -D ./node1/ -Z coordinator start

The remaining nodes will simply act as Datanodes. We can easily define the role of a node on startup:

pg_ctl -D ./node2/ -Z datanode start
pg_ctl -D ./node3/ -Z datanode start
pg_ctl -D ./node4/ -Z datanode start

Once this has been done, we can already check and see if those nodes are up and running.

We simply connect to a Datanode to list those databases in the system:

hs@vm:~/data$ psql -h localhost -l -p 5434
                             List of databases
   Name    | Owner | Encoding  | Collate | Ctype 
-----------+-------+-----------+---------+-------
 postgres  | hs    | SQL_ASCII | C       | C
 template0 | hs    | SQL_ASCII | C       | C
 template1 | hs    | SQL_ASCII | C       | C
(3 rows)

We are almost done now. Before we can get started, we have to make those nodes known to each other; otherwise, we cannot run queries or commands inside the cluster. If the nodes don't know each other, an error will show up:

hs@vm:~/data$ createdb test -h localhost -p 5432
ERROR:  No Datanode defined in cluster
HINT:  You need to define at least 1 Datanode with CREATE NODE.
STATEMENT:  CREATE DATABASE test;

To tell those systems about the location of nodes, we connect to the Coordinator and run the following instructions:

postgres=# CREATE NODE node2 WITH (TYPE = datanode, HOST = localhost, PORT = 5433);
CREATE NODE
postgres=# CREATE NODE node3 WITH (TYPE = datanode, HOST = localhost, PORT = 5434);
CREATE NODE
postgres=# CREATE NODE node4 WITH (TYPE = datanode, HOST = localhost, PORT = 5435);
CREATE NODE
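The three CREATE NODE statements follow a simple pattern, so they can be generated in a loop and piped straight into psql on the Coordinator. The final statement calls pgxc_pool_reload(), Postgres-XC's function to make the Coordinator's connection pooler pick up the new node definitions:

```shell
# Generate the node registration for node2..node4 (ports 5433..5435);
# pipe the output into: psql -h localhost -p 5432 postgres
port=5433
for n in node2 node3 node4
do
    echo "CREATE NODE $n WITH (TYPE = datanode, HOST = localhost, PORT = $port);"
    port=$((port + 1))
done
echo "SELECT pgxc_pool_reload();"   # refresh the pooler's node list
```
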

Once those nodes are familiar with each other, we can connect to the Coordinator and execute whatever we want. In our example, we will simply create a database:

hs@vm:~/data$ psql postgres -p 5432 -h localhost
psql (PGXC 1.0.3, based on PG 9.1.9)
Type "help" for help.

postgres=# CREATE DATABASE test;
CREATE DATABASE

To see if things have been replicated successfully, we can connect to a Datanode and check whether the database is actually present. In our case, we are lucky. The output is shown as follows:

hs@vm:~/data$ psql -l -p 5433 -h localhost
                             List of databases
   Name    | Owner | Encoding  | Collate | Ctype 
-----------+-------+-----------+---------+-------
 postgres  | hs    | SQL_ASCII | C       | C
 template0 | hs    | SQL_ASCII | C       | C
 template1 | hs    | SQL_ASCII | C       | C
 test      | hs    | SQL_ASCII | C       | C
(4 rows)

Keep in mind that you always have to connect to a Coordinator to make sure that things are replicated nicely. Connecting to a Datanode should only be done to check whether everything is up and running as it should. Never execute SQL on a Datanode; always use the Coordinator for that.

Tip

You can run SQL on a Datanode directly, but it will not be replicated.
