Chapter 7. Monitoring

Monitoring is key to providing a reliable service. For distributed software, monitoring becomes both more important and more complex. Fortunately, Cassandra ships with an excellent built-in tool for it, called nodetool. Apart from this, there are third-party tools to monitor Cassandra.

The purpose of monitoring is to catch a problem before it happens, or as soon as it happens, and resolve it. This chapter is therefore a mix of monitoring, management (which comes with the monitoring tools), and very quick troubleshooting tips. It familiarizes you with the JMX interface that Cassandra provides and then moves on to accessing it via JConsole. Cassandra's nodetool, the application used to monitor and administer Cassandra, is discussed in detail. DataStax OpsCenter (community version), an excellent web-based tool that stores performance history, is discussed as well. Nagios is another tool that can be used to monitor not only Cassandra but also the complete infrastructure with its heterogeneous components. Nagios is a veteran monitoring tool: simple, intuitive, extensible, and robust. It provides monitoring along with e-mail notification.

Cassandra JMX interface

Cassandra has a powerful JMX interface to monitor almost all of its aspects. Java Management Extensions (JMX) has been a standard part of Java SE (Standard Edition) since version 5.0. It provides a standard interface to manage and monitor resources such as applications, devices, JVM settings, and services. JMX represents each managed resource as a Managed Bean (MBean), an object that exposes attributes to read and operations to invoke. JMX also defines standard connectors that enable us to access JMX agents remotely. With this introductory JMX knowledge, let's see what Cassandra offers us to control or monitor almost all of its aspects using JMX.
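To make the idea of an MBean concrete, here is a minimal sketch that reads one attribute from an MBean by name. It deliberately uses the JVM's own java.lang:type=Memory MBean and the local platform MBean server as a stand-in, so that it runs without Cassandra; the class and method names (JmxBasics, heapUsedBytes) are made up for this illustration.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

// Illustrative only: class and method names are hypothetical.
final class JmxBasics {

    /** Reads the "used" field of the HeapMemoryUsage attribute of the JVM's Memory MBean. */
    static long heapUsedBytes() {
        try {
            // The platform MBean server hosts the MBeans of the current JVM.
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            // An MBean is addressed by an object name of the form domain:key=value.
            ObjectName memory = new ObjectName("java.lang:type=Memory");
            // Attributes are read by name; this one is a composite value.
            CompositeData usage = (CompositeData) server.getAttribute(memory, "HeapMemoryUsage");
            return (Long) usage.get("used");
        } catch (JMException e) {
            throw new IllegalStateException("Could not read the Memory MBean", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Heap used (bytes): " + heapUsedBytes());
    }
}
```

The same pattern of object name plus attribute name applies to the MBeans that Cassandra registers; only the names differ, and those are best looked up in a JMX browser rather than guessed.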

Note

This discussion is sufficient to get you to work with JMX in the context of Cassandra. Learn more about it at http://docs.oracle.com/javase/tutorial/jmx/TOC.html.

Cassandra exposes JMX MBeans in different packages. These are:

  • The org.apache.cassandra.internal package: This package includes MBeans that inform us about internal operations. So, you can view the status of AntiEntropy, FlushWriter, gossip, hinted handoff, response stage, migration and stream stages, pending range calculation, and commit log archival. Other than getting internal status statistics, there is not much that can be done with these MBeans.
  • The org.apache.cassandra.db package: This is probably the most interesting MBean package. It includes vital metrics and actionable operational items. Its MBeans give statistics and commands for the following database components: cache management, column family, commit log, compaction control, hinted handoff management, storage service (general ring statistics and operations), and storage proxy (client read/write statistics).
  • The org.apache.cassandra.net package: This package contains statistics on network communication within the cluster. It has some interesting MBeans such as FailureDetector, gossip, inter-node messaging, and data stream status.
  • The org.apache.cassandra.request package: One can view pending and completed tasks at different stages. The stages listed under this package are: mutation stage, read repair stage, read stage, replicate on write stage, and read response stage.
  • The org.apache.cassandra.metrics package: This package includes statistics about client reads and writes, specifically the number of requests that timed out and the number that threw UnavailableException, that is, requests for which not enough replicas were available to satisfy the operation at the given consistency level (for example, because many replicas were down).
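The packages listed above are JMX domains, so the MBeans in each can be enumerated with a name pattern. The following is a minimal sketch under that assumption: it lists every object name in a given domain through an MBeanServerConnection. It runs against the local JVM's java.lang domain so that it works without Cassandra; against a connection to a Cassandra node, you would pass a domain such as org.apache.cassandra.db instead. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.TreeSet;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Illustrative only: class and method names are hypothetical.
final class MBeanDomainLister {

    /** Returns the sorted canonical names of all MBeans registered in one domain. */
    static Set<String> namesInDomain(MBeanServerConnection conn, String domain) {
        try {
            // "domain:*" is an object name pattern matching every MBean in that domain.
            ObjectName pattern = new ObjectName(domain + ":*");
            Set<String> names = new TreeSet<>();
            for (ObjectName name : conn.queryNames(pattern, null)) {
                names.add(name.getCanonicalName());
            }
            return names;
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Not a valid JMX domain: " + domain, e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Local stand-in; with a remote connection to Cassandra, try "org.apache.cassandra.db".
        MBeanServerConnection local = ManagementFactory.getPlatformMBeanServer();
        namesInDomain(local, "java.lang").forEach(System.out::println);
    }
}
```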

Cassandra is designed around Staged Event Driven Architecture (SEDA). At a very high level, it splits a task into multiple stages, each with its own thread pool and event queue. To read more about SEDA, visit http://www.eecs.harvard.edu/~mdw/proj/seda.
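The staged idea can be illustrated with a toy pipeline. This is not Cassandra's implementation, only a sketch of the pattern: each stage owns a thread pool fed by its own queue, and an event is handed from one stage to the next. The stage names (ParseStage, ResponseStage) and handlers are invented for the example. The pending and completed counters it exposes are the kind of per-stage figures that the request package reports.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Toy illustration of staged processing; not Cassandra code.
final class SedaSketch {

    /** One stage: a handler executed by a dedicated thread pool with its own event queue. */
    static final class Stage<I, O> {
        private final String name;
        private final Function<I, O> handler;
        private final ThreadPoolExecutor pool;

        Stage(String name, int threads, Function<I, O> handler) {
            this.name = name;
            this.handler = handler;
            // Fixed-size pool; events wait in the unbounded queue until a thread is free.
            this.pool = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                    new LinkedBlockingQueue<>());
        }

        /** Enqueues an event on this stage and returns a future for its result. */
        CompletableFuture<O> submit(I event) {
            return CompletableFuture.supplyAsync(() -> handler.apply(event), pool);
        }

        /** Queue depth and an approximate count of finished tasks for this stage. */
        String stats() {
            return name + " pending=" + pool.getQueue().size()
                    + " completed=" + pool.getCompletedTaskCount();
        }

        void shutdown() {
            pool.shutdown();
        }
    }

    /** Runs one request through two hypothetical stages and returns the final result. */
    static String process(String request) {
        Stage<String, String> parse = new Stage<>("ParseStage", 2, String::trim);
        Stage<String, String> respond = new Stage<>("ResponseStage", 2, s -> "ok:" + s);
        try {
            // The output of the first stage becomes an event on the second stage's queue.
            String result = parse.submit(request).thenCompose(respond::submit).join();
            System.out.println(parse.stats());
            System.out.println(respond.stats());
            return result;
        } finally {
            parse.shutdown();
            respond.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(process("  read key42  "));
    }
}
```

Because every stage has its own queue, a backlog in one stage shows up as a growing pending count for that stage alone, which is what makes per-stage statistics useful for troubleshooting.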

Accessing MBeans using JConsole

JConsole is a built-in utility in JDK 5 and later. You can access it from $JAVA_HOME/bin/jconsole. It is a JVM monitoring tool that lets you access the MBeans of the Java application to which it is connected. It also allows you to monitor the CPU, memory, thread pools, heap information, and other important JVM-related things.

To peek into the insides of Cassandra, launch JConsole. The GUI shows two options to connect to: Local Process and Remote Process. If you are running JConsole on the same machine as Cassandra, you will see the option to connect to Cassandra in the drop-down under the Local Process radio button. However, running JConsole on the same machine as Cassandra is not recommended, because JConsole consumes a large amount of system resources and can hamper Cassandra's performance. Unless you just want to test Cassandra on your local machine, run JConsole on a separate machine.

The JConsole summary tab is shown in the following screenshot:

Figure 7.1: JConsole summary tab

To connect to a remote machine, select the Remote Process radio button and fill in the URL of the Cassandra node. The format is:

# CASSANDRAHOST is the address of the remote Cassandra node
service:jmx:rmi:///jndi/rmi://CASSANDRAHOST:7199/jmxrmi
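The same URL can be used from code instead of JConsole. The sketch below builds the service URL in the format shown above and, given a reachable node, opens a connection and asks how many MBeans are registered there. The class and method names are hypothetical, port 7199 is taken from the URL format above, and the connection assumes JMX authentication and SSL are disabled on the node.

```java
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Illustrative only: class and method names are hypothetical.
final class CassandraJmxUrl {

    /** Builds the JMX service URL string for a node, in the format JConsole expects. */
    static String urlFor(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** Connects to the node and returns the number of MBeans registered on it. */
    static int countRemoteMBeans(String host, int port) throws Exception {
        JMXServiceURL url = new JMXServiceURL(urlFor(host, port));
        // try-with-resources closes the connector even if the call fails.
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            return conn.getMBeanCount();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No host given: just show the URL format.
            System.out.println(urlFor("CASSANDRAHOST", 7199));
            return;
        }
        System.out.println("MBeans on " + args[0] + ": " + countRemoteMBeans(args[0], 7199));
    }
}
```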

If a firewall or port blocking sits between you and the Cassandra node, you may have trouble connecting.

AWS Users: It requires some work to get JConsole connected to your Cassandra instance running within EC2 from outside the security group without compromising its security. The suggested way is to connect via an SSH tunnel. Setting up an SSH tunnel is outside the scope of this book. You may refer to articles online. One of the online articles for using an SSH tunnel to connect to JConsole is: http://simplygenius.com/2010/08/jconsole-via-socks-ssh-tunnel.html.

Note

You could add your local machine's external IP to Cassandra's security group and open all the TCP ports (0 to 65535) to it, but doing so compromises the security of the server and is not a recommended way to get around this problem. If you do it anyway, remember to remove this entry once you are done with the JConsole task.

If your server is set up with different internal and external IPs, you may need to configure an RMI hostname. Open conf/cassandra-env.sh, and add the hostname parameter for JMX as part of JVM_OPTS in the following manner:

JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=174.129.145.160"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT" 
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.ssl=false" 
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=false" 

Amazon Web Services (AWS) users may need to use the public DNS name provided for the Cassandra node as the hostname. Once you are connected to the node, you can see an overview of the JVM on that node; see Figure 7.1. Look through the tabs and analyze them more closely. The figure shows a spike in memory usage, CPU usage, and thread count; this corresponds to the period during which a sample stress test was running.

To execute the various JMX operations available through JConsole, switch to the MBeans tab. Expand the menu items in the bar on the left-hand side of the screen. The interesting ones are under org.apache.cassandra.*. For example, you can clear the hinted handoff data held for a dead node before the configured window (max_hint_window_in_ms) for that node elapses.
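What the MBeans tab does when you press an operation button can also be done programmatically. The following sketch invokes a no-argument operation on an MBean by name. To stay runnable without Cassandra, it calls the standard gc operation on the JVM's own Memory MBean; the names of Cassandra's operations are not assumed here and should be read off the MBeans tab before being passed to the helper. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Illustrative only: class and method names are hypothetical.
final class MBeanOperationInvoker {

    /** Invokes a no-argument MBean operation; returns its result, or null for void operations. */
    static Object invokeNoArg(MBeanServerConnection conn, String objectName, String operation) {
        try {
            // Empty parameter and signature arrays select the overload that takes no arguments.
            return conn.invoke(new ObjectName(objectName), operation, new Object[0], new String[0]);
        } catch (JMException e) {
            throw new IllegalStateException("JMX call failed: " + objectName + " " + operation, e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in target: the local JVM's Memory MBean and its standard "gc" operation.
        MBeanServerConnection local = ManagementFactory.getPlatformMBeanServer();
        Object result = invokeNoArg(local, "java.lang:type=Memory", "gc");
        System.out.println("gc returned: " + result);
    }
}
```

Operations invoked this way take effect immediately on the node, just as they do from JConsole, so treat them with the same care as any other administrative command.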

As a matter of taste, some people prefer VisualVM over JConsole. VisualVM does not provide an MBeans browser out of the box. However, it is fairly easy to add the VisualVM-MBeans plugin to enable such functionality. VisualVM combines JConsole, jstat, jinfo, jstack, and jmap. It is bundled with the JDK starting from JDK 6 and can be accessed at $JDK_HOME/bin/jvisualvm.

Details on VisualVM are outside the scope of this book, but you can learn more about it at https://visualvm.java.net/jmx_connections.html.
