In this chapter, we want to set up a cluster consisting of three Datanodes. A Coordinator and a Global Transaction Manager (GTM) will be in charge of the cluster. For each component, we have to create a directory:
hs@vm:~/data$ ls -l
total 24
drwx------  2 hs hs 4096 Jun 13 15:56 gtm
drwx------ 13 hs hs 4096 Jun 13 15:54 node1
drwx------ 13 hs hs 4096 Jun 13 15:55 node2
drwx------ 13 hs hs 4096 Jun 13 15:55 node3
drwx------ 13 hs hs 4096 Jun 13 15:55 node4
Keep in mind that, to keep things simple, we will set up the entire cluster on a single server. In production, you would use separate servers for those components; otherwise, there is no point in using Postgres-XC.
In the first step, we have to initialize the directory handling the GTM. To do so, we can simply call initgtm:
hs@vm:~/data/gtm$ initgtm -Z gtm -D /home/hs/data/gtm/
The files belonging to this GTM system will be owned by user "hs".
This user must also own the server process.
fixing permissions on existing directory /home/hs/data/gtm ... ok
creating configuration files ... ok
Success. You can now start the GTM server using:
    gtm -D /home/hs/data/gtm
or
    gtm_ctl -Z gtm -D /home/hs/data/gtm -l logfile start
Don't expect anything large and magical from initgtm. It merely creates the basic configuration needed for handling the GTM; it does not create a large database infrastructure.
However, its output already gives us a clue about how to start the GTM, which will be done later in the process.
Then we have to initialize the four database nodes we want to run. To do so, we have to run initdb, just like for any vanilla PostgreSQL database instance. However, in the case of Postgres-XC, we have to tell initdb what name the node will have. In our case, we will create the first node, called node1, in the node1 directory. Each node needs a dedicated name. This is shown as follows:
initdb -D /home/hs/data/node1/ --nodename=node1
We can call initdb for all four instances we will run. To make sure that those instances can coexist on the very same test box, we have to change the port for each of those instances. In our example, we will simply use the following ports: 5432, 5433, 5434, and 5435.
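The initialization and port assignment for all four nodes can be sketched as a small loop. This is only a sketch under stated assumptions: the helper names node_port and init_node_commands and the DATA_ROOT variable are hypothetical, and appending a port line to each node's postgresql.conf is one common way to change the port that the text above does not prescribe. The script prints the commands instead of running them, so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: map node number N to the port used in this chapter
# (node1 -> 5432, node2 -> 5433, and so on).
node_port() {
    echo $((5431 + $1))
}

# Print (do not run) the commands that initialize node N and set its port.
# DATA_ROOT is an assumed variable; the chapter uses /home/hs/data.
init_node_commands() {
    root=${DATA_ROOT:-/home/hs/data}
    echo "initdb -D $root/node$1/ --nodename=node$1"
    # Assumption: the port is changed by appending to postgresql.conf.
    echo "echo \"port = $(node_port "$1")\" >> $root/node$1/postgresql.conf"
}

for n in 1 2 3 4; do
    init_node_commands "$n"
done
```

Once the printed commands look right, the output can be piped into sh to execute them.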
Now that all the instances have been initialized, we can start the Global Transaction Manager. This works as follows:
hs@vm:~/data$ gtm_ctl -D ./gtm/ -Z gtm start
server starting
To see if it works, we can check for the process as follows:
hs@vm:~/data$ ps ax | grep gtm
16976 pts/5 S 0:00 /usr/local/postgres-xc/bin/gtm -D ./gtm
Then we can start all those nodes one after the other.
In our case, we will use one of those four nodes as the Coordinator. The Coordinator will be using port 5432. To start it, we can call pg_ctl and tell the system to use this node as a Coordinator:
pg_ctl -D ./node1/ -Z coordinator start
The remaining nodes will simply act as Datanodes. We can easily define the role of a node on startup:
pg_ctl -D ./node2/ -Z datanode start
pg_ctl -D ./node3/ -Z datanode start
pg_ctl -D ./node4/ -Z datanode start
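The role assignment used here (node1 as Coordinator, all other nodes as Datanodes) can be captured in a short script. The function names node_role and start_commands are hypothetical and only serve as illustration. The script prints the pg_ctl calls shown above rather than executing them.

```shell
#!/bin/sh
# Role of node N in this chapter's layout: node1 is the Coordinator,
# every other node is a Datanode.
node_role() {
    if [ "$1" -eq 1 ]; then
        echo coordinator
    else
        echo datanode
    fi
}

# Print one pg_ctl start command per node number given as an argument.
start_commands() {
    for n in "$@"; do
        echo "pg_ctl -D ./node$n/ -Z $(node_role "$n") start"
    done
}

start_commands 1 2 3 4
```

Keeping the role in one function avoids starting a node with the wrong -Z value when the list of nodes changes.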
Once this has been done, we can already check and see if those nodes are up and running.
We simply connect to a Datanode to list those databases in the system:
hs@vm:~/data$ psql -h localhost -l -p 5434
List of databases
   Name   | Owner | Encoding  | Collate
----------+-------+-----------+------------
 postgres | hs    | SQL_ASCII | C | C
 template0| hs    | SQL_ASCII | C | C
 template1| hs    | SQL_ASCII | C | C
(3 rows)
We are almost done now. Before we can get started, we have to familiarize those nodes with each other. Otherwise we cannot run queries or commands inside the cluster. If those nodes don't know each other, an error will show up:
hs@vm:~/data$ createdb test -h localhost -p 5432
ERROR: No Datanode defined in cluster
HINT: You need to define at least 1 Datanode with CREATE NODE.
STATEMENT: CREATE DATABASE test;
To tell those systems about the location of nodes, we connect to the Coordinator and run the following instructions:
postgres=# CREATE NODE node2 WITH (TYPE = datanode, HOST = localhost, PORT = 5433);
CREATE NODE
postgres=# CREATE NODE node3 WITH (TYPE = datanode, HOST = localhost, PORT = 5434);
CREATE NODE
postgres=# CREATE NODE node4 WITH (TYPE = datanode, HOST = localhost, PORT = 5435);
CREATE NODE
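Since the three statements differ only in node name and port, they can also be generated. The following sketch assumes the numbering used in this chapter (node2 on port 5433, and so on); create_node_sql is a hypothetical helper name. It only prints the SQL.

```shell
#!/bin/sh
# Print the CREATE NODE statement for Datanode number N.
# Assumption: node N listens on port 5431 + N, as in this chapter.
create_node_sql() {
    echo "CREATE NODE node$1 WITH (TYPE = datanode, HOST = localhost, PORT = $((5431 + $1)));"
}

# node1 is the Coordinator, so only nodes 2 to 4 are registered as Datanodes.
for n in 2 3 4; do
    create_node_sql "$n"
done
```

The output can be piped into psql postgres -p 5432 -h localhost to run the statements on the Coordinator.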
Once those nodes are familiar with each other, we can connect to the Coordinator and execute whatever we want. In our example, we will simply create a database:
hs@vm:~/data$ psql postgres -p 5432 -h localhost
psql (PGXC 1.0.3, based on PG 9.1.9)
Type "help" for help.
postgres=# CREATE DATABASE test;
CREATE DATABASE
To see if things have been replicated successfully, we can connect to a Datanode and check whether the database is actually present. In our case, it is, as the following listing shows:
hs@vm:~/data$ psql -l -p 5433 -h localhost
List of databases
   Name    | Owner | Encoding  | Collate
-----------+-------+-----------+---------------
 postgres  | hs    | SQL_ASCII | C | C
 template0 | hs    | SQL_ASCII | C | C
 template1 | hs    | SQL_ASCII | C | C
 test      | hs    | SQL_ASCII | C | C
(4 rows)
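The same check can be scripted for every Datanode. The following is a minimal sketch, not part of Postgres-XC: has_database is a hypothetical helper that reads the output of psql -l on standard input and succeeds if the first column of any row equals the given database name.

```shell
#!/bin/sh
# Succeed (exit status 0) if the psql -l listing on stdin contains a row
# whose first column is exactly the database name passed as $1.
has_database() {
    awk -F'|' -v db="$1" '
        {
            # Strip surrounding whitespace from the first column.
            gsub(/^[ \t]+|[ \t]+$/, "", $1)
            if ($1 == db) found = 1
        }
        END { exit !found }
    '
}

# Usage sketch (requires the running cluster from this chapter):
#   for port in 5433 5434 5435; do
#       psql -l -p "$port" -h localhost | has_database test \
#           || echo "test is missing on port $port"
#   done
```

The loop only reads from the Datanodes, which matches the rule that they should be inspected but never written to directly.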
Keep in mind that you always have to connect to a Coordinator to make sure that things are replicated properly. Connecting to a Datanode should only be done to see if everything is up and running as it should. Never execute SQL on a Datanode; always use the Coordinator to do that.