So, you have played a bit with Cassandra on your local machine and read something about how great it scales. Now it's time to evaluate all the tall claims that Cassandra makes.
This chapter deals with cluster deployment and the decisions you need to make about the number of nodes, the types of machines, and tweaks in the Cassandra configuration file. We start with hardware evaluation and then dive into operating-system-level tweaks, followed by the prerequisite software and how to install it. Once the base machine is ready, we will discuss the Cassandra installation, which is fairly easy. The rest of the chapter talks about the various settings available, what fits which situation, the pros and cons, and so on. Equipped with all this information, you will be ready to launch your first cluster. The chapter provides working code that deploys Cassandra on N nodes, sets the entire configuration, and starts Cassandra, effectively launching each node in about 40 seconds and thus enabling you to get going with an eight-node cluster in about five minutes. Toward the end, the chapter briefly discusses client-connection security as of Version 1.1.11.
Code pattern
: All the shell commands mentioned in this chapter follow a pattern. Each line starting with a # sign is just a comment to clarify the context for readers. Each line starting with a $ sign is a command (excluding the beginning $ sign). Some longer commands may be broken into multiple lines for reading clarity. If a command is broken, the end of the line contains a line-continuation character, a backslash (\). Each line that starts with neither of these symbols is the output of a command. Please assume this pattern unless specified otherwise.
It is generally a good idea to examine what kind of load Cassandra is going to face when deployed on a production server. The estimate does not have to be accurate, but some sense of the traffic can clarify what you expect from Cassandra (criteria for load tests), whether you really need Cassandra (the halo effect), and whether you can bear all the expenses that a running Cassandra cluster incurs on a daily basis (the value proposition). Let's see how to choose various hardware specifications for a specific need.
A rough disk space calculation for the user data that will be stored in Cassandra involves adding up the data stored in the on-disk components: commit logs, SSTables, index files, and bloom filters. Compared with the incoming raw data, the data on disk carries database overheads associated with each of these components, and can be about two times as large as the raw data. Disk usage can be calculated using the following formulas:
# Size of one normal column
column_size (in bytes) = column_name_size + column_val_size + 15

# Size of an expiring or counter column
col_size (in bytes) = column_name_size + column_val_size + 23

# Size of a row
row_size (in bytes) = size_of_all_columns + row_key_size + 23

# Primary index file size
index_size (in bytes) = number_of_rows * (32 + mean_key_size)

# Additional space consumption due to replication
replication_overhead = total_data_size * (replication_factor - 1)
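Plugging some hypothetical numbers into these formulas shows how quickly the overheads add up. The workload figures below (1 million rows, 100 normal columns per row, 10-byte column names, 100-byte values, 20-byte row keys, replication factor 3) are assumptions chosen purely for illustration:

```shell
# Assumed workload: 1M rows, 100 normal columns per row,
# 10-byte column names, 100-byte values, 20-byte row keys, RF = 3.
column_size=$(( 10 + 100 + 15 ))              # 125 bytes per column
row_size=$(( 100 * column_size + 20 + 23 ))   # 12543 bytes per row
index_size=$(( 1000000 * (32 + 20) ))         # primary index: 52 MB
total_data_size=$(( 1000000 * row_size ))     # row data: ~12.5 GB
replication_overhead=$(( total_data_size * (3 - 1) ))
echo "row=${row_size}B index=${index_size}B replication_overhead=${replication_overhead}B"
```

Under these assumed numbers, about 12.5 GB of row data costs roughly 25 GB of extra replica storage across the cluster, before counting commit logs, bloom filters, and compaction headroom.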
Apart from this, the disk also sees heavy reads and writes during compaction. Compaction is the process that merges SSTables to improve search efficiency. The important thing about compaction is that, in the worst case, it may use as much space as the user data occupies. So, it is a good idea to leave a large amount of space free.
We'll discuss this again later, but the headroom you need depends on the choice of compaction_strategy that is applied. For LeveledCompactionStrategy, having 10 percent of the disk free is enough; SizeTieredCompactionStrategy requires 50 percent free disk space in the worst case. Here are some rules of thumb with regard to disk choice and disk operations:
- The ext4, ext3, and ext2 filesystems (in that order of preference) can be considered for Cassandra.
- Instead of using EBS, use the ephemeral devices attached to the instance (also known as an instance store). Instance stores are fast and do not suffer from the problems that EBS does. Instance stores can be configured as RAID 0 to utilize them even further.
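As a concrete sketch of the instance-store rule, assuming two ephemeral devices exposed as /dev/xvdb and /dev/xvdc (the device names are assumptions and vary by instance type; check yours first), a RAID 0 array can be assembled, formatted, and mounted as the Cassandra data directory:

```shell
# Assemble the two ephemeral disks into a single RAID 0 device.
# /dev/xvdb and /dev/xvdc are assumed device names.
$ sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 \
/dev/xvdb /dev/xvdc

# Format the array and mount it where Cassandra keeps its data.
$ sudo mkfs.ext4 /dev/md0
$ sudo mkdir -p /var/lib/cassandra
$ sudo mount /dev/md0 /var/lib/cassandra
```

Remember that instance-store data does not survive a stop/terminate of the instance, which is exactly why Cassandra's replication matters in this setup.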
Larger memory boosts Cassandra's performance in multiple ways. More memory can hold larger MemTables, which means fresh data stays in memory for a longer duration, leading to fewer disk accesses for recent data. It also means fewer flushes (less frequent disk I/O) of MemTables to SSTables, and the SSTables will be larger and fewer in number. This improves read performance, as fewer SSTables need to be scanned during a lookup. Larger RAM can also accommodate a larger row cache, further decreasing disk access.
For any sort of production setup, a RAM capacity of less than 8 GB is not recommended; 16 GB or more is preferred.
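As an illustration only (the right numbers depend on the machine and the workload; the values below are assumptions, not recommendations), the JVM heap for a 16 GB node can be pinned explicitly in conf/cassandra-env.sh instead of relying on the auto-computed defaults:

```shell
# conf/cassandra-env.sh -- explicit JVM heap sizing (illustrative).
# MAX_HEAP_SIZE is the total heap given to Cassandra;
# HEAP_NEWSIZE is the young-generation size.
MAX_HEAP_SIZE="8G"
HEAP_NEWSIZE="800M"
```

Leaving the remaining RAM to the operating system is deliberate: the page cache it builds over SSTables is a large part of Cassandra's read performance.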
Cassandra is highly concurrent: compaction, writes, and merging results from multiple SSTables into a single view for clients are all CPU intensive. An 8-core CPU is suggested, but anything with more cores will only be better.
For a cloud-based setup, there are a couple of things to keep in mind:
Each node in the ring is responsible for a set of row keys. Nodes have a token assigned to them on startup, either via bootstrapping or from the configuration file. Each node stores the keys from the previous node's token (excluded) up to its own token (included). So, the greater the number of nodes, the fewer the keys per node; and the fewer the requests each node has to serve, the better the performance.
In general, a large number of nodes is good for Cassandra. It is a good idea to start with 300 GB to 500 GB of disk space per node, and to back-calculate the number of nodes you may need for your data. One can always add more nodes and change the tokens on each node.
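To make the back-calculation concrete, here is a small sketch under assumed numbers: 2000 GB of raw data, replication factor 3, 400 GB of disk per node, and 50 percent of each disk kept free for SizeTieredCompactionStrategy's worst case.

```shell
# Assumed: 2000 GB raw data, RF = 3, 400 GB disk per node,
# 50% of each disk reserved as compaction headroom.
raw_gb=2000
total_gb=$(( raw_gb * 3 ))            # data including replicas
budget_per_node=$(( 400 / 2 ))        # usable GB per node
nodes=$(( (total_gb + budget_per_node - 1) / budget_per_node ))  # ceiling
echo "need at least ${nodes} nodes"
```

Under LeveledCompactionStrategy the headroom drops to about 10 percent, so the same data would fit on far fewer nodes; the compaction strategy is therefore part of the capacity plan, not just a tuning knob.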
As with any other distributed system, Cassandra depends heavily on the network. Although Cassandra is tolerant of network partitions, a reliable network with fewer outages is preferable for the system: fewer repairs and fewer inconsistencies.
A congestion-free, high-speed (Gigabit or higher), reliable network is very important, as reads, writes, replication, and moving or draining nodes all put heavy load on the network.