Troubleshooting

We have learned cluster configuration, repairing and scaling, and, finally, monitoring. The purpose of all this learning is to keep production environments up and running smoothly. You may choose the right ingredients to set up a cluster that fits your needs, but there may still be node failures, high CPU usage, high memory usage, disk space issues, network failures, and, over time, performance problems. You will get most of this information from the monitoring tool that you have configured, and you will need to take the necessary action depending on the problem you are facing.

Usually, one goes about finding these issues with the various tools we have discussed in earlier chapters. You may want to extend the list of investigation tools to include standard Linux tooling: netstat and tcpdump for network debugging; vmstat, free, top, and dstat for memory statistics; perf, top, dstat, and uptime for CPU statistics; and iostat, iotop, and df for disk usage.
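
For example, a quick first look at the network, memory, CPU, and disk on a node might use the following commands (the interface name and the Cassandra ports shown are typical defaults and may differ in your setup):

  # Network: listening sockets and live traffic on the native protocol port
  netstat -tnlp | grep -E '7000|9042|9160'
  tcpdump -i eth0 port 9042

  # Memory and CPU
  free -m
  vmstat 5
  top

  # Disk: utilization and free space
  iostat -x 5
  df -h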

How do you actually know there is a problem? With a decent monitoring setup and a vigilant system administrator, problems usually come to light via alerts sent by the monitoring system. It may be a mail from OpsCenter, a critical message from Nagios, or a message from your home-grown JMX-based monitoring system. Another way issues show up is as performance degradation at a certain load: you find that your application is acting weird or abnormally slow, dig into the errors, and discover that the Cassandra calls are taking much longer than expected. The other, and scarier, way problems come to one's notice is in production. Things have been working decently in the test environment, and suddenly you start seeing frequent garbage collection calls, or the production servers start to scream, "Too many open files."

In many of these error scenarios, the solution is a simple one. When AWS notifies you of an instance shutdown due to underlying hardware degradation, the fix is to replace the node with a new one. For a full disk, you may either add a new node or just add more hard disks and list the new location under the data directory setting in cassandra.yaml. The following are a few troubleshooting tips; you may already know most of them from previous chapters.
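
Taking the full-disk case as an example, adding a newly mounted disk as an extra data location means listing it under data_file_directories in cassandra.yaml and restarting the node; the mount point below is only an example:

  # cassandra.yaml
  data_file_directories:
      - /var/lib/cassandra/data
      - /mnt/new_disk/cassandra/data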

High CPU usage

High CPU usage is often associated with frequent garbage collection (GC). If you see a lot of GC information in the Cassandra logs, and the collections take longer than a second to finish, it means the JVM is spending a significant share of its time on garbage collection.

The easiest fix is to add more nodes. Another option is to increase the JVM heap size (adding more RAM, if required) and tweak the garbage collector settings for Cassandra.

Compaction is a CPU-intensive process, so you can expect a spike during compaction. Plan to run nodetool compact during relatively quiet hours. The same goes for repair: execute nodetool repair during low load.
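
For example, you can check the logs for long GC pauses and run the heavy operations off-peak; the log path and keyspace name below are assumptions:

  # Look for garbage collection pauses reported in the Cassandra log
  grep -i "GCInspector" /var/log/cassandra/system.log | tail -20

  # Run a major compaction and a repair during low-traffic hours
  nodetool compact my_keyspace
  nodetool repair my_keyspace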

High memory usage

Before we dive into memory usage, it is worth pointing out that giving the Java heap a lot of RAM may not always help. We learned in a previous chapter that Cassandra automatically sizes the heap, which is good in most cases. If you are planning to override it, note that garbage collection does not do well beyond a 16 GB heap.

There are a couple of things to check when debugging high memory usage. The bloom filter's false positive ratio can lead to large memory usage: the smaller the error rate, the more memory the bloom filter needs. If you find the bloom filter to be the culprit and decide to increase the false positive chance, remember that the recommended maximum is 0.1; performance starts to degrade beyond that. This may not be applicable to Cassandra 1.2 onward, where the bloom filter is managed off-heap.
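
If you do decide to trade some accuracy for memory, the false positive chance can be raised per table from cqlsh; the keyspace and table names below are placeholders:

  -- Inside cqlsh: allow a 10% false positive rate to shrink the bloom filter
  ALTER TABLE my_keyspace.my_table
    WITH bloom_filter_fp_chance = 0.1;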

Continuing on the subject of off-heap memory, another thing to look into is the row cache. Row caches are stored off-heap if you have JNA installed. Without JNA, the row cache falls back to on-heap memory, adding to the used heap and possibly leading to frequent GC calls.

High memory usage can also be the result of pulling lots of rows in one go. Look into such queries; Cassandra 1.2 onward has a tracing feature that can help you find them.
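
A quick way to spot such queries is to turn tracing on in cqlsh and run the suspect query, or to sample a small fraction of all requests on a node; the names and the probability value are only examples:

  -- Inside cqlsh: trace a single suspect query
  TRACING ON
  SELECT * FROM my_keyspace.my_table LIMIT 100;

  # From the shell: trace a 0.1% sample of all requests on this node
  nodetool settraceprobability 0.001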

Hotspots

A hotspot in a cluster is a node, or a small set of nodes, that shows abnormally high resource usage. In the context of Cassandra, these are the nodes that receive disproportionately many requests or show high resource usage compared to the other nodes.

A poorly balanced cluster can cause some nodes to own a disproportionately large number of keys. If each key is requested with equal probability, the nodes with higher ownership have to serve more requests. Rebalancing the cluster may fix this issue.
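
For example, the following commands show how much of the ring each node owns and let you rebalance by moving a node to a new token; the token value is only an illustration:

  # Show how much of the ring each node owns
  nodetool status
  nodetool ring

  # Move a node to a new token to rebalance the ring
  nodetool move 85070591730234615865843651857942052864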

Ordered partitioners, such as ByteOrderedPartitioner, usually have a hard time ensuring that each key range holds an equal amount of data, unless the incoming data is uniformly distributed across the key ranges. Unless you have a very strong reason to depend on byte-ordered partitioning, it is suggested that you rework the application to avoid the dependency on key ordering and use Murmur3Partitioner or RandomPartitioner. Refer to the Partitioners section in Chapter 4, Deploying a Cluster.

High-throughput wide rows may cause a hotspot. We know that a row resides on one server (actually, on all of its replicas). If a row gets written to and/or read from at a really high rate, the node holding it gets loaded disproportionately while the other nodes probably sit idle. A good idea is to bucket the row key. For example, assume you run a popular website and decide to cover a live presidential debate by recording everything said by the candidates, the host, and the audience, streaming this data live, and letting users scroll back and forth through past records. If you store all of this in a single row, you are creating a hotspot. The better approach is to break the row key into buckets such as <rowKey>:<bucket_id> and write to the buckets in a round-robin fashion, so that the keys are distributed across the nodes and the load is spread over multiple machines. To fetch the data, you issue a multiget slice across the buckets and merge the results in the application; the merging should be fast because the rows are already sorted. Refer to the High throughput rows and hotspots section in Chapter 3, Effective CQL.
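
A minimal sketch of such a bucketed table in CQL might look like the following; the table name, columns, and the choice of ten buckets are all illustrative:

  -- The partition key combines the event with a bucket number (0-9),
  -- so writes for one event are spread over ten partitions.
  CREATE TABLE debate_feed (
      event_id   text,
      bucket     int,
      event_time timeuuid,
      speaker    text,
      message    text,
      PRIMARY KEY ((event_id, bucket), event_time)
  );

  -- The application writes with bucket chosen round-robin (or hash % 10)
  -- and reads all ten buckets, merging the already-sorted results.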

Another cause of hotspots can be wrong token assignment in a multi-data-center setup (refer to Chapter 4, Deploying a Cluster). Say you have two nodes, A and B, in data center 1 and two nodes, C and D, in data center 2, and you calculate equidistant tokens and assign them to A, B, C, and D in increasing order. It seems OK, but it actually makes nodes A and C hotspots.

Ideally, one should alternate tokens between the data centers. Thus, A should get the first token, C the second, B the third, and D the fourth. If there are three data centers, pick one node from each and assign increasing tokens, then go around again for the second node of each, and so on.
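
As an illustration, assuming RandomPartitioner (token range 0 to 2^127) and the four nodes above, the initial_token values in each node's cassandra.yaml would alternate between the data centers as follows:

  # cassandra.yaml, one value per node (four equidistant tokens)
  # Node A, DC1
  initial_token: 0
  # Node C, DC2
  initial_token: 42535295865117307932921825928971026432
  # Node B, DC1
  initial_token: 85070591730234615865843651857942052864
  # Node D, DC2
  initial_token: 127605887595351923798765477786913079296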

Open JDK's erratic behavior

Many Linux distributions ship with OpenJDK as the default Java. Cassandra does not officially support any JVM variant other than Oracle/Sun Java 1.7. OpenJDK may cause some weird issues, such as the GC pausing for a very long time and general performance degradation. The safest thing to do is to remove OpenJDK and install the suggested version.
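
You can confirm which JVM a node is actually running and, on Debian-based systems, switch between installed JVMs; the exact packages and steps depend on your distribution:

  # Check which Java the node is using
  java -version

  # On Debian/Ubuntu, list and switch between installed JVMs
  update-alternatives --list java
  update-alternatives --config java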

Disk performance

AWS users often find Elastic Block Store (EBS) volumes attractive from the reliability and convenience points of view. Unfortunately, it is a bad idea to use them for Cassandra data.

EBS slows down disk I/O and can cause slow reads and writes. If you are using EBS, benchmark it against the instance store (ephemeral storage) in a RAID 0 setup.
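
A rough sketch of striping two instance-store disks into a RAID 0 array with mdadm follows; the device names and mount point are specific to your instance and are only examples:

  # Stop Cassandra on the node before moving its data directory
  # Stripe two instance-store disks into a single RAID 0 device
  mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/xvdb /dev/xvdc

  # Create a filesystem and mount it where Cassandra keeps its data
  mkfs.ext4 /dev/md0
  mount /dev/md0 /var/lib/cassandra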

Too many open files

If you see Too many open files or any other resource-related error, the first thing to check is ulimit -a, which shows the resource limits currently in effect. You can raise the limits by editing /etc/security/limits.conf with the following recommended settings:

  * soft nofile 32768
  * hard nofile 32768
  root soft nofile 32768
  root hard nofile 32768
  * soft memlock unlimited
  * hard memlock unlimited
  root soft memlock unlimited
  root hard memlock unlimited
  * soft as unlimited
  * hard as unlimited
  root soft as unlimited
  root hard as unlimited
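
After changing the limits (and restarting Cassandra), it is worth verifying that the process actually picked them up; the grep pattern below assumes the standard CassandraDaemon process name:

  # Limits currently in effect for the running Cassandra process
  cat /proc/$(pgrep -f CassandraDaemon)/limits

  # Number of file descriptors currently in use
  ls /proc/$(pgrep -f CassandraDaemon)/fd | wc -l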

Slow snapshots

A snapshot for backup purposes is created by making hard links to the SSTables. In the absence of JNA, this is done by forking and executing /bin/ln for each hard link, which is observably slow when there are thousands of SSTables. So, if you are seeing abnormally long snapshot times, check whether you have JNA configured.
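
You can check the startup log for JNA and time a snapshot to confirm; the log path, keyspace name, and tag below are assumptions:

  # Confirm whether JNA was picked up at startup
  grep -i jna /var/log/cassandra/system.log

  # Time a snapshot of one keyspace
  time nodetool snapshot -t pre_upgrade my_keyspace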

Getting help from the mailing list

Cassandra is robust and fault-tolerant software; it is possible for your production cluster to keep running as expected while something is broken inside it. Replication and eventual consistency help you build a robust application on top of Cassandra, but it is important to keep an eye on the monitoring statistics.

The tools discussed in this and the previous chapters should give you enough information about what went wrong, and you should be able to fix common problems using them. Sometimes, though, it is a good idea to ask about a stubborn issue on the friendly Cassandra user mailing list.

When asking a question on the mailing list, provide as many statistics as you can gather around the problem. Nodetool's cfstats, tpstats, and ring commands are common ways to get Cassandra-specific statistics. You may also want to check the Cassandra logs, enable the GC-related options in JVM_OPTS, and profile using jhat or JConsole. Apart from this, server specifications such as memory, CPU, network, and disk statistics provide crucial insights. Among other things, the replication factor, compaction strategy, consistency level, and table definitions are also worth mentioning where relevant.
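
For example, a quick way to collect most of this before posting might be (paths and file names are only suggestions):

  # Cassandra-side statistics
  nodetool ring     > diag_ring.txt
  nodetool tpstats  > diag_tpstats.txt
  nodetool cfstats  > diag_cfstats.txt
  nodetool info     > diag_info.txt

  # System-side statistics and recent logs
  iostat -x 5 3     > diag_iostat.txt
  vmstat 5 3        > diag_vmstat.txt
  tail -n 500 /var/log/cassandra/system.log > diag_log.txt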
