Chapter 1. Basic Concepts and Architecture

Apache Cassandra is a linearly scalable, highly available distributed data store that runs on commodity hardware without compromising on performance. Its support for replication across multiple datacenters (or other discrete environments) is among the strongest in the industry. Cassandra provides high throughput with low latency and has no single point of failure.

Cassandra was inspired by two papers, Google's Bigtable (2006) and Amazon's Dynamo (2007), and has added more features since. Cassandra differs from most NoSQL solutions in many ways. The core assumption of most distributed NoSQL solutions concerns the Mean Time Between Failures (MTBF) of the whole system: when the failures of individual nodes are independent, the likelihood of the whole system failing at once becomes negligible, which results in a highly reliable system.
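One way to read the independence argument above is as simple arithmetic. The sketch below is illustrative only: it assumes each node is unavailable with the same probability and that failures are fully independent, which real clusters only approximate. The function name and the sample probability are hypothetical and do not come from Cassandra itself.

```python
def all_replicas_down(p_node_down: float, replicas: int) -> float:
    """Probability that every replica is down at the same time.

    Assumes (hypothetically) that each node fails independently with the
    same probability, so the joint probability is a simple product.
    """
    if not 0.0 <= p_node_down <= 1.0:
        raise ValueError("p_node_down must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return p_node_down ** replicas


if __name__ == "__main__":
    # Illustrative value: each node is unavailable 1% of the time.
    for n in (1, 2, 3, 5):
        print(f"{n} replica(s): {all_replicas_down(0.01, n):.2e}")
```

Under these assumptions, each additional replica multiplies the chance of total unavailability by the per-node failure probability, which is why independent failures make a whole-system outage unlikely.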

CAP theorem

To understand Cassandra, you first need to understand the CAP theorem. The CAP theorem (proposed by Eric Brewer of the University of California, Berkeley) states that a distributed system cannot provide all three of the following guarantees at the same time:

  • Consistency: Updates to the state of the system are seen by all the clients simultaneously
  • Availability: The system is guaranteed to respond to every valid request
  • Partition tolerance: The system continues to operate despite arbitrary message loss or network partition

Cassandra favors availability and partition tolerance and makes the consistency tradeoff tunable: when writing to or reading from Cassandra, the client can pass a consistency level that determines the consistency requirements for that operation.
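The tunable tradeoff described above can be sketched in a few lines. This is a simplified model, not the driver API: the level names mirror consistency levels that Cassandra offers, but the helper functions, the replication factor values, and the omission of datacenter-aware levels are assumptions made for illustration. The overlap rule used here (read replicas plus write replicas must exceed the replication factor) is the usual rule of thumb for when a read is guaranteed to reach at least one replica that holds the latest write.

```python
from enum import Enum


class ConsistencyLevel(Enum):
    # A subset of level names; datacenter-aware levels are left out.
    ONE = "ONE"
    TWO = "TWO"
    THREE = "THREE"
    QUORUM = "QUORUM"
    ALL = "ALL"


def required_acks(level: ConsistencyLevel, replication_factor: int) -> int:
    """Number of replicas that must respond for the operation to succeed.

    Hypothetical helper for illustration; not part of any Cassandra driver.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    quorum = replication_factor // 2 + 1  # strict majority of replicas
    needed = {
        ConsistencyLevel.ONE: 1,
        ConsistencyLevel.TWO: 2,
        ConsistencyLevel.THREE: 3,
        ConsistencyLevel.QUORUM: quorum,
        ConsistencyLevel.ALL: replication_factor,
    }[level]
    if needed > replication_factor:
        raise ValueError(f"{level.name} needs more replicas than exist")
    return needed


def read_overlaps_write(write_level: ConsistencyLevel,
                        read_level: ConsistencyLevel,
                        replication_factor: int) -> bool:
    """True when any read set must share a replica with any write set."""
    w = required_acks(write_level, replication_factor)
    r = required_acks(read_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    rf = 3  # illustrative replication factor
    for w, r in [(ConsistencyLevel.ONE, ConsistencyLevel.ONE),
                 (ConsistencyLevel.QUORUM, ConsistencyLevel.QUORUM),
                 (ConsistencyLevel.ALL, ConsistencyLevel.ONE)]:
        print(w.name, r.name, read_overlaps_write(w, r, rf))
```

Lower levels need fewer replicas to respond, so they stay available when more nodes are unreachable. Higher levels trade some of that availability for stronger consistency.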
