Creating and handling a RethinkDB cluster

We have covered enough theory; let's put clustering into practice with RethinkDB. So far we have seen what clustering is in computing terms and what it provides. In this section, we are going to learn how to perform clustering in RethinkDB, which is by nature a distributed database.

We will also learn how to add new machines into our existing cluster, manage them from the RethinkDB administrative screen, and monitor them for any errors.

We have two ways to perform RethinkDB clustering:

  • On the same machine, with multiple RethinkDB instances
  • On different machines, each running its own RethinkDB instance

Creating a RethinkDB cluster in the same machine

We can create a RethinkDB cluster on the same machine with a single command in under a minute. Yes, you heard it right: under a minute (assuming you have RethinkDB installed). Let's do this.

Lift up the default RethinkDB server using the following command in the terminal:

rethinkdb

It should lift the RethinkDB server on the default port and you should be able to see the console as shown here:

[Figure: Creating a RethinkDB cluster in the same machine]

Now open a new terminal, and run the following command:

rethinkdb --port-offset 1 --directory rethinkdb_data2 --join localhost:29015

You should be able to see a new RethinkDB instance lifting up, as shown here:

[Figure: Creating a RethinkDB cluster in the same machine]

That is it. We have our first RethinkDB cluster running. To verify this, visit the administrative console; you should see the two servers connected in the Servers section, as shown in the following figure:

[Figure: Creating a RethinkDB cluster in the same machine]

Yes, it works! Try executing a query from the Data Explorer; you should receive the same result whichever instance you connect to, because RethinkDB automatically routes the query to the appropriate node.

Let us look at the command that we executed previously:

  • --port-offset: This shifts every port the instance uses by the given number (1 in our case), so that no two instances on the same machine use the same port
  • --directory: This tells RethinkDB to use a separate data directory, so that the instances do not read from and write to the same files
  • --join: This tells RethinkDB to connect to an existing instance (the seed node) in order to form a cluster
  • localhost:29015: This is the address of the instance to join; 29015 is RethinkDB's default port for intracluster connections
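To make the effect of --port-offset concrete, here is a small shell sketch that adds an offset to RethinkDB's three default ports (28015 for client drivers, 29015 for intracluster traffic, and 8080 for the administrative console). The function name ports_for_offset is a hypothetical helper made up for this illustration; it is not part of RethinkDB.

```shell
#!/bin/sh
# Hypothetical helper (not part of RethinkDB): print the ports an
# instance would listen on for a given --port-offset value.
# Assumes the default ports: driver 28015, cluster 29015, HTTP 8080.
ports_for_offset() {
    offset="$1"
    echo "driver=$((28015 + offset)) cluster=$((29015 + offset)) http=$((8080 + offset))"
}

# First instance (no offset) and second instance (--port-offset 1).
ports_for_offset 0
ports_for_offset 1
```

Under these assumptions, the second instance started with --port-offset 1 would serve its own administrative console on port 8081, while the first one stays on 8080.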

Note

I would like to point out something important here regarding failover. If you create a cluster on a single machine, you won't get full failover: the cluster can cope with one RethinkDB instance going down, but if the machine itself goes down, the whole cluster goes down with it. So this setup is fine for learning purposes, but not for production.

You can add new RethinkDB instances to the existing cluster using the same command, but make sure you use a different port offset and directory for each one. Next, let's create a RethinkDB cluster using different machines.

Creating a RethinkDB cluster using different machines

We have seen how easy it is to create a RethinkDB cluster on the same machine. Let's see how we can create one using different machines. Actually, this is even easier than creating a cluster on the same machine, because you don't need to worry about port and directory clashes.

Let's say you have two servers, one with the IP address 104.121.23.24 and the other with 104.121.23.25. We first need to install RethinkDB on each machine. You can find a detailed description of how to install RethinkDB on Mac, Linux, or Windows machines on the official RethinkDB website (https://www.rethinkdb.com/docs/install/).

Assuming you have RethinkDB installed on both machines, log in to the machine with the 104.121.23.24 IP address and lift the RethinkDB Server using the following command:

rethinkdb --bind all

Note

Please note that the --bind all parameter allows RethinkDB to accept connections from any machine. If you don't provide this, it will restrict access to the localhost only; hence, a machine with a different IP will not be able to communicate.

RethinkDB will initiate itself, and you should be able to see the following on the terminal:

[Figure: Creating a RethinkDB cluster using different machines]

Now log in to the machine with the 104.121.23.25 IP address and lift the RethinkDB server using the following command:

rethinkdb --join 104.121.23.24:29015 --bind all

Upon running this command, you should see the two servers on the administrative screen of RethinkDB. There is our cluster.

As you may notice, this is really easy to do, but is this sustainable? What I mean by sustainable here is: will it run on production? Let's find out.

Since we created the cluster from the command line, what happens if one of the servers requires a reboot? Will it rejoin the same cluster automatically? Well, no. And since our machines are reachable over the Internet, is this setup secure enough for production, with so many hackers trying to intercept our data? Well, no!

So why did we do this in the first place? Simply to learn the concept. I always believe that everyone (including me) is looking for shortcuts to get the end result. Since we have covered the shortcut and seen why it is not good for production, let's learn to improve on it and run our cluster in production mode. That's what this book, Mastering RethinkDB, is all about.

Creating a RethinkDB cluster in production

In production, we are going to use a separate machine for each instance in the cluster. Suppose you have two servers with the IP addresses 104.121.23.24 and 104.121.23.25. We need to decide which server will act as the starting point (the SEED server) through which other machines enter the cluster; say the 104.121.23.24 machine is the SEED server.

Log in to the machine with the 104.121.23.24 IP address and open up the RethinkDB configuration file. Assuming it's Ubuntu, the config file can be located at /etc/rethinkdb/instances.d/default.conf. If it is not present, you need to manually create it.

If it's already present, then you can uncomment some of the code lines shown here or simply write them.

First, set the server name. For this instance, the name is rethink_main:

server-name = rethink_main 

The next setting we need to alter or create is the one that allows RethinkDB instances running on other machines to connect to this server in order to create a cluster.

If the config file is already there, you may find this code:

# bind=127.0.0.1 

Change it to:

bind = all 

Then save and close the file. You will need to restart RethinkDB for the change to take effect. You can do this in any of the following ways:

  • By executing the init.d script as follows:
sudo /etc/init.d/rethinkdb restart
  • By using the service command as follows:
sudo service rethinkdb restart
  • By using the systemctl command as follows:
sudo systemctl restart rethinkdb
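Which of these three commands is available depends on the init system of your server. As a rough sketch, the following shell function checks what is installed and prints the matching restart command. The name restart_command is a hypothetical helper invented for this example, and it assumes the service is registered under the name rethinkdb, as in the commands above. It only prints the command; it does not run it.

```shell
#!/bin/sh
# Hypothetical helper: print the restart command that fits this system.
# Assumes the service is registered as "rethinkdb".
# Nothing is restarted here; run the printed line yourself.
restart_command() {
    if command -v systemctl >/dev/null 2>&1; then
        echo "sudo systemctl restart rethinkdb"
    elif command -v service >/dev/null 2>&1; then
        echo "sudo service rethinkdb restart"
    else
        # Fall back to calling the init.d script directly.
        echo "sudo /etc/init.d/rethinkdb restart"
    fi
}

restart_command
```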

Once the restart is done, we can move ahead and configure our next machine. Before that, we need to take care of one thing: the startup script. If the server needs a reboot (which happens during regular maintenance by the service provider), our RethinkDB server should start again on boot.

By default, RethinkDB automatically creates the init.d script, which a Unix-based system reads on boot to start the services. As we have our configuration file in place, it will automatically start the service on boot.

This is our administrative screen after restarting the RethinkDB server:

[Figure: Creating a RethinkDB cluster in production]

Note

If you have a Mac- or Windows-based server, you need to install the respective packages available for them to perform the startup execution. We are covering Linux-based systems as the majority of servers are based on them.

Now let's configure our second machine and form our cluster. Log in to the machine with the 104.121.23.25 IP address and open the configuration file in your favorite editor, for example with the following command:

sudo vi /etc/rethinkdb/instances.d/default.conf

Change the name of the server by setting the following key:

server-name=rethink_child

Again, change the bind key and allow it to connect to other RethinkDB instances:

bind=all

Now, to add this machine to the cluster, we need to add the following join setting. It makes this machine join the first machine on startup, and we will have our cluster ready:

join=104.121.23.24:29015

Save and close the editor. To make the setting effective, we need to restart the server:

sudo /etc/init.d/rethinkdb restart

Once the restart is successful, we can check whether our cluster has formed. Visit the administrative screen and you should see a screen similar to the one shown here:

[Figure: Creating a RethinkDB cluster in production]

As we can see, we have two servers connected and running as one cluster. Congratulations! We have a production-ready cluster with us.
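The two configuration files described in this section can be summarized in a short shell sketch. The write_conf function and the scratch directory are hypothetical and exist only for illustration; the three keys (server-name, bind, and join) and their values are the ones used above. The files are written to a temporary directory so that nothing under /etc/rethinkdb is touched.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal RethinkDB instance config.
# Usage: write_conf FILE SERVER_NAME [JOIN_ADDRESS]
write_conf() {
    file="$1"
    name="$2"
    join="$3"
    {
        echo "server-name=$name"
        echo "bind=all"
        # Only non-seed servers need a join line.
        if [ -n "$join" ]; then
            echo "join=$join"
        fi
    } > "$file"
}

# Write both files to a scratch directory for inspection.
conf_dir="$(mktemp -d)"
write_conf "$conf_dir/seed.conf" rethink_main
write_conf "$conf_dir/child.conf" rethink_child 104.121.23.24:29015

# Show the configuration of the second (joining) machine.
cat "$conf_dir/child.conf"
```

The seed server's file contains no join line because it is the starting point of the cluster; every additional machine would get its own server name and the same join address.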

Note

You can learn more about the configuration files of RethinkDB on the official docs website (https://rethinkdb.com/docs/config-file/).

However, there is still a little work left to do on the security of the cluster. As you may have noticed, we set bind=all in our configuration, which means any machine can attempt to connect to our cluster. We need to add some security layers to guard against this.

In the next section, we are going to learn how we can secure our cluster.
