Chapter 7. Monitoring

Monitoring is key to providing a reliable service. For distributed software, monitoring becomes both more important and more complex. Fortunately, Cassandra has an excellent built-in monitoring tool called nodetool. There are also third-party tools to monitor Cassandra.

The purpose of monitoring is to catch a problem before, or as soon as, it happens and to resolve it. Therefore, this chapter is a mix of monitoring, management (which comes with the monitoring tools), and quick troubleshooting tips. It introduces the Java Management Extensions (JMX) interface that Cassandra provides and then moves on to accessing it via JConsole. Cassandra's nodetool, the application used to monitor and administer Cassandra, is discussed in detail. DataStax OpsCenter (the community version), an excellent web-based tool that stores performance history, is covered next. Finally, Nagios, a veteran monitoring tool, can monitor not only Cassandra but the complete infrastructure with its heterogeneous components. It is simple, intuitive, extensible, and robust, and it provides e-mail notifications along with monitoring.

Cassandra's JMX interface

Cassandra has a powerful JMX interface to monitor almost all of its aspects. JMX has been a standard part of Java Standard Edition (SE) since version 5.0. It provides a standard interface to manage and monitor resources such as applications, devices, JVM settings, and services. JMX manages and monitors a resource through Java objects called Managed Beans (MBeans), which expose the resource's attributes and operations. JMX also defines standard connectors that let us access JMX agents remotely. With this introductory JMX knowledge, let's see what Cassandra offers for controlling and monitoring it through JMX.

Note

This discussion is sufficient to get you to work with JMX in the context of Cassandra. Learn more about it at http://docs.oracle.com/javase/tutorial/jmx/TOC.html.
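
To make the idea of an MBean concrete, here is a minimal sketch of a standard MBean registered with the JVM's platform MBean server and read back through JMX. It is not taken from Cassandra: the class names, the com.example.sketch domain, and the counter itself are hypothetical and exist only for illustration.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Sketch only: every name here is hypothetical and unrelated to Cassandra's own MBeans. */
class MBeanSketch {

    /** Standard MBean interface: its name is the implementing class name plus "MBean". */
    public interface RequestCounterMBean {
        long getCompleted(); // exposed as the read-only attribute "Completed"
        void reset();        // exposed as the operation "reset"
    }

    /** The managed resource itself. */
    public static class RequestCounter implements RequestCounterMBean {
        private long completed;

        public synchronized void record() { completed++; }

        @Override public synchronized long getCompleted() { return completed; }

        @Override public synchronized void reset() { completed = 0; }
    }

    /** Registers a counter, records some requests, reads the attribute via JMX, and unregisters. */
    static long registerAndRead(int requests) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("com.example.sketch:type=RequestCounter");
            RequestCounter counter = new RequestCounter();
            server.registerMBean(counter, name);
            try {
                for (int i = 0; i < requests; i++) {
                    counter.record();
                }
                // Attribute names are derived from getters: getCompleted() -> "Completed".
                return (Long) server.getAttribute(name, "Completed");
            } finally {
                server.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException("JMX registration or lookup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Completed = " + registerAndRead(3));
    }
}
```

A tool such as JConsole sees the same attribute and operation under the object name used at registration, which is how Cassandra's own MBeans show up in the packages listed next.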

Cassandra exposes JMX MBeans in different packages. These are as follows:

  • org.apache.cassandra.internal: This package includes MBeans that report on internal operations. Here you can view the status of AntiEntropy, FlushWriter, gossip, hinted handoff, response stage, migration and stream stages, pending range calculation, and commit log archival. Apart from reading these internal statistics, there is not much that can be done with these MBeans.
  • org.apache.cassandra.db: This is probably the most interesting MBean package. It includes vital metrics and actionable operational items. MBeans give statistics and commands for database components such as cache management, table, commit log, compaction control, hinted handoff management, storage service (general ring statistics and operations), and storage proxy (client read/write statistics).
  • org.apache.cassandra.net: This package contains statistics on network communication within the cluster. It has some interesting MBeans, such as FailureDetector, gossip, internode messaging, and data stream status.
  • org.apache.cassandra.request: One can view pending and completed tasks at different stages. The stages listed under this package are mutation, read repair, read, replicate on write, and read response.

    Cassandra is designed around Staged Event-Driven Architecture (SEDA). At a very high level, SEDA breaks a task into multiple stages, each with its own thread pool and event queue. To read more about SEDA, visit http://www.eecs.harvard.edu/~mdw/proj/seda.

  • org.apache.cassandra.metrics: This package includes statistics about client reads and writes; specifically, the number of requests that timed out and the number that threw UnavailableException. The latter means that not enough replicas were available to satisfy the operation at the given consistency level, typically because too many replicas are down.
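
One way to check which of these packages a given node actually exposes is to query the MBean server with a domain pattern. The following is a minimal sketch under stated assumptions: the class and method names are hypothetical, and the demo queries the local JVM's platform MBean server as a stand-in for a Cassandra node, so it prints java.* domains. Against a connection to a real node, the pattern org.apache.cassandra.*:* would be the natural choice.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

/** Sketch only: groups the MBean names matching a pattern by their domain (package). */
class MBeanDomainLister {

    /**
     * @param conn    any MBean server connection, local or remote
     * @param pattern an ObjectName pattern, for example "org.apache.cassandra.*:*"
     * @return canonical MBean names, sorted and grouped by domain
     */
    static Map<String, Set<String>> namesByDomain(MBeanServerConnection conn, String pattern) {
        ObjectName query;
        try {
            query = new ObjectName(pattern);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Bad ObjectName pattern: " + pattern, e);
        }
        Map<String, Set<String>> grouped = new TreeMap<>();
        try {
            // A null query expression means "no extra filter beyond the name pattern".
            for (ObjectName name : conn.queryNames(query, null)) {
                grouped.computeIfAbsent(name.getDomain(), d -> new TreeSet<>())
                       .add(name.getCanonicalName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException("Lost connection to the MBean server", e);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // The local JVM stands in for a Cassandra node in this demo.
        MBeanServerConnection conn = ManagementFactory.getPlatformMBeanServer();
        namesByDomain(conn, "java.*:*").forEach((domain, names) ->
                System.out.println(domain + " -> " + names.size() + " MBeans"));
    }
}
```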

Accessing MBeans using JConsole

JConsole is a utility bundled with JDK 5 and later. You can launch it from $JAVA_HOME/bin/jconsole. It is a JVM monitoring tool that lets you access the MBeans of the Java application to which it is connected, and monitor CPU usage, memory, thread pools, heap information, and other important JVM-level details.

To peek inside Cassandra, launch JConsole. The GUI offers two connection options: Local Process and Remote Process. If you are running JConsole on the same machine as Cassandra, the Cassandra process appears in the drop-down under the Local Process radio button. However, running JConsole on the same machine as Cassandra is not recommended, because JConsole consumes a large amount of system resources and can hamper Cassandra's performance. Use a local connection only when you are testing Cassandra on your own machine.

The Overview tab of JConsole is shown in the following screenshot:

[Screenshot: the Overview tab of JConsole]

To connect to a remote machine, select the Remote Process radio button and fill in the URL of the Cassandra node. The format is as follows:

# CASSANDRAHOST is the address of the remote Cassandra node
  service:jmx:rmi:///jndi/rmi://CASSANDRAHOST:7199/jmxrmi
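
The same URL works outside JConsole. As a rough sketch, the standard javax.management.remote API can build it and open a connection programmatically. The class and helper names below are hypothetical, the default host 127.0.0.1 is only a placeholder, and main assumes a node that is reachable on port 7199 without JMX authentication or SSL.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

/** Sketch only: connects to a node's JMX agent using the URL format shown above. */
class RemoteJmxSketch {

    /** Builds service:jmx:rmi:///jndi/rmi://HOST:PORT/jmxrmi for the given node. */
    static JMXServiceURL cassandraJmxUrl(String host, int port) {
        try {
            return new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Cannot build a JMX URL for host " + host, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder default; pass the node address as the first argument.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        // try-with-resources closes the connector even if a call fails.
        try (JMXConnector connector = JMXConnectorFactory.connect(cassandraJmxUrl(host, 7199))) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            System.out.println("MBeans registered on " + host + ": " + conn.getMBeanCount());
        }
    }
}
```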

If a firewall or port restriction sits in front of the Cassandra node, the connection may fail.

It requires some work to get JConsole connected to a Cassandra instance running within EC2 from outside the security group without compromising its security. The suggested way is to connect via a Secure Shell (SSH) tunnel. Setting up an SSH tunnel is beyond the scope of this book; refer to the online article at http://simplygenius.com/2010/08/jconsole-via-socks-ssh-tunnel.html for information on using an SSH tunnel to connect JConsole.

Note

A quick workaround is to add your local machine's external IP to Cassandra's security group and open all the TCP ports (0 to 65535) to it. However, doing this compromises the security of the server, so it is not a recommended way to get around the problem. If you use it anyway, remember to remove the entry once you are done with the JConsole task.

If your server is set up with different internal and external IPs, you may need to configure an RMI hostname. Open conf/cassandra-env.sh and add the hostname parameter for JMX as part of JVM_OPTS in the following manner:

JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=174.129.145.160"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.ssl=false"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=false"

AWS users may need to use the public DNS name of the Cassandra node as the hostname value. Once you are connected to the node, you can see an overview of the JVM on that node (refer to the previous screenshot). Look through the tabs and analyze them more closely. In the screenshot, the spike in memory usage, CPU usage, and thread count corresponds to the period during which a sample stress test was running.

To execute the various JMX operations available through JConsole, switch to the MBeans tab and expand the items in the tree on the left-hand side of the screen. The interesting ones are under org.apache.cassandra.*. For example, you can delete the hints stored for a dead node before the configured hint window (max_hint_window_in_ms) for that node expires.
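
Operations shown in the MBeans tab can also be invoked from code through MBeanServerConnection.invoke. The sketch below shows the general shape of such a call for an operation that takes one String argument. Because the object name and operation name for hinted handoff vary by Cassandra version, none are hard-coded: the HintStore MBean, its deleteHintsFor operation, and the com.example.sketch domain are hypothetical stand-ins registered in the local JVM so that the example runs without a cluster.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

/** Sketch only: invokes a single-String-argument MBean operation over JMX. */
class JmxInvokeSketch {

    /** Calls the named operation on the named MBean and returns its result. */
    static Object invokeWithString(MBeanServerConnection conn, String mbeanName,
                                   String operation, String argument) {
        try {
            return conn.invoke(new ObjectName(mbeanName), operation,
                    new Object[] { argument },
                    new String[] { String.class.getName() }); // operation signature
        } catch (JMException | IOException e) {
            throw new IllegalStateException("JMX call failed: " + operation, e);
        }
    }

    /** Hypothetical stand-in MBean; it does not mirror Cassandra's real interface. */
    public interface HintStoreMBean {
        boolean deleteHintsFor(String endpoint);
    }

    public static class HintStore implements HintStoreMBean {
        private final Set<String> endpointsWithHints;

        public HintStore(String... endpoints) {
            this.endpointsWithHints = new HashSet<>(Arrays.asList(endpoints));
        }

        /** Returns true if hints were pending for the endpoint and have been dropped. */
        @Override public synchronized boolean deleteHintsFor(String endpoint) {
            return endpointsWithHints.remove(endpoint);
        }
    }

    /** Registers the stand-in locally, invokes its operation through JMX, and cleans up. */
    static Object demo(String endpoint) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        String name = "com.example.sketch:type=HintStore";
        try {
            ObjectName objectName = new ObjectName(name);
            server.registerMBean(new HintStore(endpoint), objectName);
            try {
                return invokeWithString(server, name, "deleteHintsFor", endpoint);
            } finally {
                server.unregisterMBean(objectName);
            }
        } catch (JMException e) {
            throw new IllegalStateException("Could not set up the stand-in MBean", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Hints deleted: " + demo("10.0.0.9"));
    }
}
```

Against a real node, the connection would come from a remote JMX connector, and the object name and operation would be copied from what the MBeans tab shows for that Cassandra version.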

As a matter of taste, some people prefer VisualVM to JConsole. VisualVM does not include an MBeans browser out of the box, but it is fairly easy to add the VisualVM-MBeans plugin to enable that functionality. VisualVM combines the capabilities of JConsole, jstat, jinfo, jstack, and jmap. It ships with the JDK starting from version 6 and can be launched from $JDK_HOME/bin/jvisualvm.

Details on VisualVM are beyond the scope of this book, but you can learn about them at https://visualvm.java.net/jmx_connections.html.
