Chapter 2. Installing Cassandra

I would like to start this chapter with some numbers published by Jeff Dean, a Google Fellow (http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf):

Operation                              Time taken (ns)     Time taken (ms)

Send 1K bytes over 1 Gbps network      10,000 ns           0.01 ms
Read 4K randomly from SSD*             150,000 ns          0.15 ms
Read 1 MB sequentially from memory     250,000 ns          0.25 ms
Round trip within the same datacenter  500,000 ns          0.5 ms
Read 1 MB sequentially from SSD*       1,000,000 ns        1 ms
Disk seek                              10,000,000 ns       10 ms
Send packet CA->Netherlands->CA        150,000,000 ns      150 ms

The preceding table gives the approximate time each of these common operations takes to complete. Typically, a single read/write request in Cassandra involves several of these operations, so their costs add up.
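To make the point about combined costs concrete, here is a minimal sketch that adds up the figures from the table for one request. The dictionary keys, the `estimate_ms` helper, and the particular mix of operations are hypothetical names chosen for illustration; they do not describe Cassandra's actual read path.

```python
# Approximate latencies from the table above, in nanoseconds.
LATENCY_NS = {
    "send_1k_over_1gbps": 10_000,
    "read_4k_random_ssd": 150_000,
    "read_1mb_seq_memory": 250_000,
    "round_trip_same_dc": 500_000,
    "read_1mb_seq_ssd": 1_000_000,
    "disk_seek": 10_000_000,
    "round_trip_ca_nl_ca": 150_000_000,
}


def estimate_ms(operations):
    """Sum the latencies of the named operations and return milliseconds."""
    total_ns = sum(LATENCY_NS[op] for op in operations)
    return total_ns / 1_000_000


# Hypothetical request (an assumption, not a measured Cassandra read):
# one round trip inside the datacenter, one disk seek, one 1 MB SSD read.
hypothetical_read = ["round_trip_same_dc", "disk_seek", "read_1mb_seq_ssd"]

print(estimate_ms(hypothetical_read))  # 11.5
```

In this made-up mix, the single disk seek accounts for 10 of the 11.5 ms, which is why the storage hardware matters so much when sizing a node.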

Memory, CPU, and network requirements

To understand Cassandra's memory requirements, it's important to know that Cassandra is a Java-based service that uses the JVM heap both for temporary objects and for its in-memory data structures. Cassandra relies on the OS kernel to manage the page cache of frequently used file blocks. Most OS kernels use several heuristics to predict which file blocks the application will access next and which ones can be evicted from the cache.

Any Cassandra node has two main functions: one is to coordinate client requests and the other is to serve data. The coordinator is a simple proxy that sends data requests or updates to the nodes that hold the data and waits for their responses. To achieve quorum, it waits for responses from N/2 + 1 nodes (using integer division, where N is the replication factor); for other consistency levels, it waits for however many nodes that level requires. Every node in the cluster handles both of these functions, and every node can act as coordinator because it keeps up-to-date information about the cluster via gossip.
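The number of responses a coordinator waits for can be sketched as follows. This is a simplified illustration, not Cassandra's implementation: it assumes N is the replication factor, covers only the `ONE`, `QUORUM`, and `ALL` consistency levels, and the function names are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Return N/2 + 1 using integer division, where N is the replication factor."""
    return replication_factor // 2 + 1


def required_responses(consistency_level: str, replication_factor: int) -> int:
    """Return how many replica responses the coordinator waits for.

    Only a subset of consistency levels is handled in this sketch.
    """
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        return quorum(replication_factor)
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


for n in (3, 4, 5):
    print(n, required_responses("QUORUM", n))
```

With a replication factor of 3, quorum is 2, so a request can still succeed with one replica down. Raising the replication factor to 4 raises quorum to 3 and still tolerates only one unavailable replica.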
