Enter NoSQL

NoSQL is a blanket term for the databases that solve the scalability issues that are common among relational databases. This term, in its modern meaning, was first coined by Eric Evans (NoSQL naming available at http://blog.sym-link.com/2009/10/30/nosql_whats_in_a_name.html). It should not be confused with the database named NoSQL (NoSQL: the database available at http://www.strozzi.it/cgi-bin/CSA/tw7/I/en_US/nosql/Home%20Page).

The NoSQL solutions provide scalability and high availability, but may not guarantee ACID: atomicity, consistency, isolation, and durability in transactions. Many of the NoSQL solutions including Cassandra sit on the other extreme of ACID, named BASE (Basically Available, Soft-state, Eventual consistency).

The CAP theorem

In 2000, Eric Brewer (Wikipedia page available at http://en.wikipedia.org/wiki/Eric_Brewer_%28scientist%29), in his keynote speech at the ACM Symposium, said that one cannot guarantee consistency in a distributed system. This was his conjecture based on his experience with the distributed systems. This conjecture was later formally proved by Nancy Lynch and Seth Gilbert in 2002 (Brewer's Conjecture and the Feasibility of Consistent, Available at Partition-tolerant Web Services, and ACMSIGACT News, Volume 33, Issue 2 (2002), page 51 to 59 available at http://lpd.epfl.ch/sgilbert/pubs/BrewersConjecture-SigAct.pdf) It says, if we have a distributed system where data is replicated at two distinct locations and two conflicting requests arrive—one at each location—when the communication link between the two servers is broken. If the system (the cluster) has obligations to be highly available (a response, even when some components of the system are failing), one of the two responses will be inconsistent with what a system with no replication (no partitioning, single copy) would have returned. To understand it better, let us take an example to learn the terminologies. These terms will be used frequently throughout this book.

Let's say you are planning to read George Orwell's book titled Nineteen Eighty-Four (1984, The Novel available at http://en.wikipedia.org/wiki/Nineteen_Eighty-Four) over the Christmas vacation. A day before the holidays start, you logged in to your favorite online book store to find out that there is only one copy left. You add it to your cart. But then you realize that you need to buy something else to be eligible for free shipping. You start to browse the website for any other item that you might buy. To make the situation interesting, let's say there is another customer who is trying to buy Nineteen Eighty-Four at the same time.

Consistency

In a distributed system, consistency will be defined as one that responds with the same output for the same request at the same time across all the replicas. Loosely, one can say a consistent system is one where each server returns the right response to each request.

In our book-buying example, we have only one copy of Nineteen Eighty-Four. So, only one of the two customers is going to get the book delivered from this store. In a consistent system, only one can check out the book from the payment page. As soon as one customer makes the payment, the number of Nineteen Eighty-Four books in stock will get decremented by one and one quantity of Nineteen Eighty-Four will be added to the order of that customer. When the second customer tries to check out his shopping cart, the system tells that the book is not available any more.

Relational databases are good for this task because they comply with the ACID properties. If both the customers make the request at the same time, one customer will have to wait till the other customer is done with the processing, and the database is made consistent. This may add a few milliseconds of wait to the customer who came later.

An eventual consistent (where consistency of data across the distributed servers may not be guaranteed immediately) database system may have showed both the customers availability of the book at the time of check-out. This will lead to a back order, and one of the customers will be paid back. This may or may not be a good policy. A large number of back orders may affect the shop's reputation and there may also be financial repercussions.

Availability

Availability in simplest term is responsiveness. A system that's always available to serve. The funny thing about availability is that sometimes a system becomes unavailable exactly when you need it the most.

In our example, one day before Christmas, everyone is buying gifts. Millions of people are searching, adding items to their carts, buying, and applying for discount coupons. If one server goes down due to overload, the rest of the servers get even more loaded now, because the request from the dead server is getting redirected to the rest of the machines. Dominoes start to fall. Now the site is down. When it comes online again, it faces a storm of requests from all the people who are hurrying to place their order because the offer end time is even closer, or probably acting quickly before the site goes down again.

Availability is the key component for extremely loaded services. Bad availability leads to bad user experience, dissatisfied customers, and financial losses.

Partition-tolerance

Partition-tolerance is a system that can operate during the network partition. The network will be allowed to lose arbitrarily many messages sent from one node to another. This means a cable between the two nodes is chopped, but the system still works.

Partition-tolerance

Figure 2.1. Database classification based on CAP Theorem

An example of a partition-tolerant system is a system with real-time data replication. A system where data is replicated across two datacenters, the availability will not be affected, even if a datacenter goes down.

Significance of the CAP theorem

Once you decide to scale up, the first thing that comes to mind is vertical scaling, which means putting beefier servers with a bigger RAM, a more powerful processor, and bigger disks. For further scaling, you need to go horizontal, which means adding more servers. Once your system becomes distributed, the CAP theorem starts to play. That means, in a distributed system, you can choose only two out of consistency, availability, and partition-tolerance. So, let us see how choosing two out of the three options affect the system behavior as follows:

CA system: In this system, you drop partition-tolerance for consistency and availability. This happens when you put everything related to a transaction on one machine or a system that fails like an atomic unit, for example, a rack. This system will have serious problems in scaling.

CP system: The opposite of a CA system is a CP system. In a CP system, availability is sacrificed for consistency and partition-tolerance. What does this mean? If the system is available to serve the requests, data will be consistent. In the event of a node failure, some data will not be available. A sharded database is an example of such a system.

AP system: An available and partition-tolerance system is like an always-on system on risk of producing conflicting results in the event of network partition. This is good for user experience, your application stays available, and inconsistency in rare events may be alright for some use cases. In the book example, it may not be such a bad idea to back order a few unfortunate customers due to inconsistency of the system than having a lot of users to return without making any purchase because of the system's poor availability.

Eventual consistent aka BASE system: The AP system makes more sense when viewed from an uptime perspective—it's simple and provides good user experience. But, an inconsistent system is not good for anything, certainly not good for business. It may be acceptable that one customer for the book Nineteen Eighty-Four gets a back order. But if it happens more often, the users would be reluctant to use the service. It will be great if the system can fix itself (read repair) as soon as the first inconsistency is observed. Or, maybe there are processes dedicated to fix the inconsistency of a system when a partition failure is fixed or a dead node comes back to life. Such systems are called Eventual Consistent Systems.

Significance of the CAP theorem

Figure 2.2: Life of an eventual consistent system

Quoting Wikipedia, "[In a distributed system] given a sufficiently long time over which no changes [in system state] are sent, all updates can be expected to propagate eventually through the system and the replicas will be consistent." (Eventual Consistency available at http://en.wikipedia.org/wiki/Eventual_consistency)

Eventual Consistent Systems are also called BASE, a made-up term to represent that these systems are on one end of the spectrum, which has traditional databases with the ACID properties on the opposite end.

Cassandra is one such system that provides high availability, and partition-tolerance at the cost of consistency, which is tunable. The preceding figure shows a partition-tolerant Eventual Consistent System.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.70.170