Chapter 9.  Polyglot Persistence Using RethinkDB

In the last chapter, we went through all the stages of development required to build an application using RethinkDB: the data modeling, the code structure, and finally the frontend design. Applications that need real-time data are among the best-suited use cases for RethinkDB; however, there is more we can do with RethinkDB than just real-time data. That something is Polyglot Persistence.

In this chapter, we will cover the following topics:

  • Introducing Polyglot Persistence
  • Using the RethinkDB changefeed as a Polyglot agent
  • Developing a proof-of-concept application with MongoDB and MySQL
  • Developing the Polyglot agent
  • Developing event consumers
  • Running the app
  • Further improvements

So let's begin.

Introducing Polyglot Persistence

In 2006, Neal Ford coined the term Polyglot programming, which, in short, advises using the programming language best suited to a specific problem instead of trying to solve every problem with a single language. He suggested that different programming languages are best suited to particular tasks, such as Java for data processing, Erlang for functional programming, and so on.

As the field of databases grows, we have many different databases for solving different kinds of problems.

For example, SQL databases are still well suited to user-facing applications such as Quora, NoSQL systems suit large batch processing and analytics workloads such as Hadoop, and real-time databases such as RethinkDB are suited to building interactive real-time applications.

Having these options at hand, we should not stick to one database for all purposes; instead, we should use a variety of databases matched to our needs. However, running several databases requires extra caution in one area: synchronization. Using different databases while maintaining the synchronization between them is called Polyglot Persistence.

To most of us, the term Polyglot Persistence sounds simple, but it really isn't. When you dive into the architecture details, you need to work out a number of things, some of them being:

  • Choosing the databases: This can be complex due to the differing performance characteristics of the various databases
  • Choosing the entry database: The database where data manipulation happens first
  • Deciding on transactions: Does your application require transactions or not?

There is one point I would like to mention: even if you make a mistake when choosing the entry database or the transaction requirements, you can always redo it without affecting the existing system. The database engines are separate processes, probably running on different machines, and the consumer has no idea which database the data for his/her request comes from.

Please note that I have personally developed and deployed such a system, and I did eventually make a mistake in choosing the entry database; however, I changed it later without any downtime. I would love to share the project details, but they are confidential and I signed the non-disclosure agreement, so instead I will share my experience and try to help you out as much as possible.

I am sure that by now you have grasped the concept in one sentence: using lots of databases and maintaining the data synchronization between them.

As mentioned, the entry database, that is, the database where creation, updates, and deletion happen first, is then synchronized to the other concerned databases. So we need some kind of agent that tells the other databases about changes in the entry database. I am sure you can see what I mean here: the RethinkDB changefeed can act as that agent.
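To make the agent pattern concrete before we touch any drivers, here is a minimal sketch in plain Node.js. The change objects are shaped loosely like RethinkDB changefeed documents (`old_val`/`new_val`); the consumers are hypothetical stand-ins, using in-memory maps where a real system would write to MongoDB and MySQL. The names `dispatch` and `consumers` are my own, not part of any API:

```javascript
// Change events, shaped like RethinkDB changefeed documents:
//   { old_val: null, new_val: {...} }  -> insert
//   { old_val: {...}, new_val: {...} } -> update
//   { old_val: {...}, new_val: null }  -> delete

// Hypothetical consumers: in a real system, apply() would issue the
// matching insert/update/delete against MongoDB or MySQL.
const consumers = [
  {
    name: 'mongo',
    store: new Map(),
    apply(change) {
      if (change.new_val === null) this.store.delete(change.old_val.id);
      else this.store.set(change.new_val.id, change.new_val);
    }
  },
  {
    name: 'mysql',
    store: new Map(),
    apply(change) {
      if (change.new_val === null) this.store.delete(change.old_val.id);
      else this.store.set(change.new_val.id, change.new_val);
    }
  }
];

// The agent: receives each change from the entry database's feed
// and fans it out to every consumer.
function dispatch(change) {
  for (const consumer of consumers) consumer.apply(change);
}

// Simulated feed: two inserts, an update, and a delete.
dispatch({ old_val: null, new_val: { id: 1, name: 'Alice' } });
dispatch({ old_val: null, new_val: { id: 2, name: 'Bob' } });
dispatch({ old_val: { id: 1, name: 'Alice' },
           new_val: { id: 1, name: 'Alicia' } });
dispatch({ old_val: { id: 2, name: 'Bob' }, new_val: null });
// Both stores now hold only { id: 1, name: 'Alicia' }.
```

In the actual application, `dispatch` would be called once per document emitted by the changefeed cursor, which is exactly the wiring we build in the later sections of this chapter.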

In the next section, we will look at how RethinkDB can help us implement this agent for the other databases, though, as I said, you have probably already got the idea.
