Chapter 10. Monitoring

In this chapter, you’ll learn how to use a variety of tools to monitor and understand important events in the life cycle of your Cassandra cluster. We’ll look at some simple ways to see what’s going on, such as changing the logging levels and understanding the output.

Cassandra also features built-in support for Java Management Extensions (JMX), which offers a rich way to monitor your Cassandra nodes and their underlying Java environment. Through JMX, we can see the health of the database and ongoing events, and even interact with it remotely to tune certain values. JMX is an important part of Cassandra, and we’ll spend some time to make sure we know how it works and what exactly Cassandra makes available for monitoring and management with JMX. Let’s get started!

Logging

The simplest way to get a picture of what’s happening in your database is to just change the logging level to make the output more verbose. This is great for development and for learning what Cassandra is doing under the hood.

Cassandra uses the Simple Logging Facade for Java (SLF4J) API for logging, with Logback as the implementation.  SLF4J provides a facade over various logging frameworks such as Logback, Log4J, and Java’s built-in logger (java.util.logging). You can learn more about Logback at http://logback.qos.ch/.

By default, the Cassandra server log level is set at INFO, which doesn’t give you much detail about what work Cassandra is doing at any given time. It just outputs basic status updates, such as the following:

INFO  [main] 2015-09-19 09:40:20,215 CassandraDaemon.java:149 - 
  Hostname: Carp-iMac27.local
INFO  [main] 2015-09-19 09:40:20,233 YamlConfigurationLoader.java:92 -
  Loading settings from file:/Users/jeff/Cassandra/
  apache-cassandra-2.1.8/conf/cassandra.yaml
INFO  [main] 2015-09-19 09:40:20,333 YamlConfigurationLoader.java:135 - 
  Node configuration
...   

When you start Cassandra in a terminal with the -f flag, this output remains visible in the foreground of the terminal window. But Cassandra also writes these logs to physical files for you to examine later.

By changing the logging level to DEBUG, we can see much more clearly what activity the server is working on, instead of seeing only these status updates.

To change the logging level, open the file <cassandra-home>/conf/logback.xml and find the section that looks like this:

  <root level="INFO">
    <appender-ref ref="FILE" />
    <appender-ref ref="STDOUT" />
  </root>

Change the first line so it looks like this:

  <root level="DEBUG">

Once we have made this change and saved the file, Cassandra will shortly begin printing DEBUG-level logging statements. This is because the logging framework is configured to rescan the configuration file once a minute, as set by this line:

<configuration scan="true">
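If the one-minute default feels slow while you're experimenting, Logback also accepts a scanPeriod attribute on the same element. This is a sketch of how the relevant portion of logback.xml might look; the 15-second period here is an illustrative choice, not a Cassandra default:

```xml
<!-- Re-read this file for changes every 15 seconds rather than the
     default of once a minute -->
<configuration scan="true" scanPeriod="15 seconds">
  <root level="DEBUG">
    <appender-ref ref="FILE" />
    <appender-ref ref="STDOUT" />
  </root>
</configuration>
```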

Now we can see a lot more activity as Cassandra does its work. This allows you to see exactly what Cassandra is doing and when, which is very helpful in troubleshooting. But it’s also helpful in simply understanding what Cassandra does to maintain itself.

Tuning Logging in Production

Of course, in production you’ll want to tune the logging level back up to WARN or ERROR, as the verbose output will slow things down considerably. 

By default, Cassandra’s log files are stored in the logs directory underneath the Cassandra installation directory.

If you want to change the location of the logs directory, just find the following entry in the logback.xml file and choose a different filename:

    <file>${cassandra.logdir}/system.log</file>

Missing Log Files

If you don’t see any logfiles in the location specified, make sure that you are the owner of the directories, or at least that proper read and write permissions are set. Cassandra won’t tell you if it can’t write the log; it just won’t write. The same goes for the data files.

Other settings in the logback.xml file support rolling log files. By default, the system.log file is rolled to an archive once it reaches a size of 20 MB. Each log file archive is compressed in zip format and named according to the pattern system.log.1.zip, system.log.2.zip, and so on.
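The rolling configuration lives in the same file. It looks roughly like the following sketch; the exact appender class names and values vary between Cassandra versions, so treat this as illustrative and check your own copy:

```xml
<appender name="FILE"
          class="ch.qos.logback.core.rolling.RollingFileAppender">
  <file>${cassandra.logdir}/system.log</file>
  <!-- Keep a fixed window of compressed archives:
       system.log.1.zip, system.log.2.zip, ... -->
  <rollingPolicy
      class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
    <fileNamePattern>${cassandra.logdir}/system.log.%i.zip</fileNamePattern>
    <minIndex>1</minIndex>
    <maxIndex>20</maxIndex>
  </rollingPolicy>
  <!-- Roll once the active log reaches 20 MB -->
  <triggeringPolicy
      class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
    <maxFileSize>20MB</maxFileSize>
  </triggeringPolicy>
</appender>
```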

Tailing

You don’t need to start Cassandra using the foreground switch in order to see the rolling log. You can also simply start it without the -f option and then tail the logs. Tailing is not specific to Cassandra; tail is a small program available on Linux systems that prints new lines to the console as they are appended to a file.

To tail the logs, start Cassandra like this:

$ bin/cassandra

Then open a second console, enter the tail command, and pass it the location of the particular file you want to tail, like this:

$ tail -f $CASSANDRA_HOME/logs/system.log

The -f option means “follow,” and as Cassandra outputs information to the physical logfile, tail will output it to the screen. To stop tailing, just press Ctrl-C.

You can do the same thing on Windows, but Windows doesn’t include a tail program natively. To achieve this, you can download and install Cygwin, a free and open source environment that provides a Linux-style shell and a variety of Linux tools on Windows.

Then you can start Cassandra regularly and tail the logfile using this command:

$ tail -f %CASSANDRA_HOME%\logs\system.log

This will show the output in the console in the same way as if it were foregrounded.

Examining Log Files

Once you’re running the server with debug logging enabled, you can see a lot more happening that can help during debugging. For example, here we can see the output when writing a simple value to the database using cqlsh:

cqlsh> INSERT INTO hotel.hotels (id, name, phone, address) 
   ... VALUES ( 'AZ123', 'Comfort Suites Old Town Scottsdale', 
   ... '(480) 946-1111', { street : '3275 N. Drinkwater Blvd.', 
   ... city : 'Scottsdale', state : 'AZ', zip_code : 85251 });

DEBUG [SharedPool-Worker-1] 2015-09-30 06:21:41,410 Message.java:506 - 
  Received: OPTIONS, v=4
DEBUG [SharedPool-Worker-1] 2015-09-30 06:21:41,410 Message.java:525 - 
  Responding: SUPPORTED {COMPRESSION=[snappy, lz4], 
  CQL_VERSION=[3.3.1]}, v=4
DEBUG [SharedPool-Worker-1] 2015-09-30 06:21:42,082 Message.java:506 -
  Received: QUERY INSERT INTO hotel.hotels (id, name, phone, address) 
  VALUES ( 'AZ123', 'Comfort Suites Old Town Scottsdale', 
  '(480) 946-1111', { street : '3275 N. Drinkwater Blvd.', 
  city : 'Scottsdale', state : 'AZ', zip_code : 85251 });
  [pageSize = 100], v=4
DEBUG [SharedPool-Worker-1] 2015-09-30 06:21:42,086 
  AbstractReplicationStrategy.java:87 - clearing cached endpoints
DEBUG [SharedPool-Worker-1] 2015-09-30 06:21:42,087 Tracing.java:155 - 
request complete
DEBUG [SharedPool-Worker-1] 2015-09-30 06:21:42,087 Message.java:525 -
   Responding: EMPTY RESULT, v=4

This particular output is less expressive than it could otherwise be, given that it was run on a single-node cluster.

If we then load the row via a simple query:

cqlsh> SELECT * from hotel.hotels;

The server log records this query as follows:

DEBUG [SharedPool-Worker-1] 2015-09-30 06:27:27,392 Message.java:506 - 
  Received: QUERY SELECT * from hotel.hotels;[pageSize = 100], v=4
DEBUG [SharedPool-Worker-1] 2015-09-30 06:27:27,395 
  StorageProxy.java:2021 - Estimated result rows per range: 0.0; 
  requested rows: 100, ranges.size(): 257; concurrent range requests: 1
DEBUG [SharedPool-Worker-1] 2015-09-30 06:27:27,401 
  ReadCallback.java:141 - Read: 0 ms.
DEBUG [SharedPool-Worker-1] 2015-09-30 06:27:27,401 Tracing.java:155 - 
  request complete
DEBUG [SharedPool-Worker-1] 2015-09-30 06:27:27,401 Message.java:525 - 
  Responding: ROWS [id(hotel, hotels), 
  org.apache.cassandra.db.marshal.UUIDType][address(hotel, hotels), 
  org.apache.cassandra.db.marshal.UserType(hotel,61646472657373,
  737472656574:org.apache.cassandra.db.marshal.UTF8Type,
  63697479:org.apache.cassandra.db.marshal.UTF8Type,7374617465:
  org.apache.cassandra.db.marshal.UTF8Type,7a69705f636f6465:
  org.apache.cassandra.db.marshal.Int32Type)][name(hotel, hotels), 
  org.apache.cassandra.db.marshal.UTF8Type][phone(hotel, hotels), 
  org.apache.cassandra.db.marshal.UTF8Type][pois(hotel, hotels), 
  org.apache.cassandra.db.marshal.SetType(org.apache.cassandra.db.
  marshal.UUIDType)]
 | 452d27e1-804e-479b-aeaf-61d1fa31090f | 3275 N. Drinkwater Blvd.:
  Scottsdale:AZ:85251 | Comfort Suites Old Town Scottsdale | 
  (480) 946-1111 | null

As you can see, the server loads each of the columns we requested via a class responsible for marshalling data from the on-disk format.

The DEBUG log level should give you enough information to follow along with what the server’s doing as you work.

Monitoring Cassandra with JMX

In this section, we explore how Cassandra makes use of Java Management Extensions (JMX) to enable remote management of your servers. JMX was introduced as Java Specification Request (JSR) 3, with remote access standardized in JSR 160, and has been a core part of Java since version 5.0.

More on JMX

You can read more about the JMX implementation in Java by examining the java.lang.management package.

JMX is a Java API that provides management of applications in two key ways. First, JMX allows you to understand your application’s health and overall performance in terms of memory, threads, and CPU usage—things that are generally applicable to any Java application. Second, JMX allows you to work with specific aspects of your application that you have instrumented.

Instrumentation refers to putting a wrapper around application code that provides hooks from the application to the JVM in order to allow the JVM to gather data that external tools can use. Such tools include monitoring agents, data analysis tools, profilers, and more. JMX allows you not only to view such data but also, if the application enables it, to manage your application at runtime by updating values.

JMX is commonly used for a variety of application control operations, including:

  • Low available memory detection, including the size of each generation space on the heap

  • Thread information such as deadlock detection, peak number of threads, and current live threads

  • Verbose classloader tracing

  • Log level control

  • General information such as application uptime and the active classpath

Many popular Java applications are instrumented using JMX, including the JVM itself, the Tomcat application server, and Cassandra. A depiction of the JMX architecture is shown in Figure 10-1.

Figure 10-1. The JMX architecture

The JMX architecture is simple. The JVM collects information from the underlying operating system. The JVM itself is instrumented, so many of its features are exposed for management as described earlier. An instrumented Java application (such as Cassandra) runs on top of this, also exposing some of its features as manageable objects. The JDK includes an MBean server that makes the instrumented features available over a remote protocol to a JMX Management Application. The JVM also offers management capabilities via the Simple Network Management Protocol (SNMP), which may be useful if you are using SNMP monitoring tools such as Nagios or Zenoss.

But within a given application, you can manage only what the application developers have made available for you to manage. Luckily, the Cassandra developers have instrumented large parts of the database engine, making management via JMX fairly straightforward.

This instrumentation of a Java application is performed by wrapping the application code that you want JMX to hook into with managed beans.
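To make the pattern concrete, here is a minimal, hypothetical example (none of these names come from Cassandra): a management interface, an implementation that does the application's real work, and registration with the platform MBean server. By convention, the interface is named after the implementation class with an MBean suffix.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class RequestCounterDemo {

    // The management interface: what JMX clients are allowed to see.
    public interface RequestCounterMBean {
        long getRequestCount();   // exposed as a read-only attribute
        void reset();             // exposed as an operation
    }

    // The implementation does the real work and keeps the metadata.
    public static class RequestCounter implements RequestCounterMBean {
        private long count;
        public void handleRequest() { count++; }  // normal application code
        public long getRequestCount() { return count; }
        public void reset() { count = 0; }
    }

    public static void main(String[] args) throws Exception {
        RequestCounter counter = new RequestCounter();
        counter.handleRequest();
        counter.handleRequest();

        // Register with the platform MBean server under an object name.
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("com.example:type=RequestCounter");
        mbs.registerMBean(counter, name);

        // Any JMX client (such as JConsole) could now read the attribute;
        // here we read it through the MBean server directly.
        System.out.println(mbs.getAttribute(name, "RequestCount")); // prints 2
    }
}
```

Cassandra follows this same shape for its own components, as we'll see with the CompactionManager later in the chapter.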

Connecting to Cassandra via JConsole

The jconsole tool ships with the standard Java Development Kit.  It provides a graphical user interface client for working with MBeans and can be used for local or remote management. Let’s connect to Cassandra on its JMX port using JConsole. To do so, open a new terminal and type the following:

>jconsole

When you run jconsole, you’ll see a login screen similar to that in Figure 10-2.

Figure 10-2. The jconsole login

From here, you can simply double-click on the value org.apache.cassandra.service.CassandraDaemon under the Local Process section if you’re monitoring a node on the same machine. If you want to monitor a node on a different machine, check the Remote Process radio button, then enter the host and port you want to connect to. Cassandra JMX by default broadcasts on port 7199, so you can enter a value like the one shown here and then hit Connect:

>lucky:7199

Connecting Remotely via JMX

By default, Cassandra runs with JMX enabled for local access only. To enable remote access, edit the file <cassandra-home>/conf/cassandra-env.sh (or cassandra-env.ps1 on Windows). Search for “JMX” to find the section of the file with options to control the JMX port and other local/remote connection settings.
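The JMX-related portion of that file looks roughly like this sketch (the cassandra.jmx.* system properties and exact structure vary by Cassandra version, so check your own copy):

```shell
# Sketch of the JMX settings in cassandra-env.sh
JMX_PORT="7199"

if [ "$LOCAL_JMX" = "yes" ]; then
  # Default: JMX reachable from the local machine only
  JVM_OPTS="$JVM_OPTS -Dcassandra.jmx.local.port=$JMX_PORT"
else
  # Remote access: expose the port and require authentication
  JVM_OPTS="$JVM_OPTS -Dcassandra.jmx.remote.port=$JMX_PORT"
  JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=true"
  JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password"
fi
```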

Once you’ve connected to a server, the default view includes four major categories about your server’s state, which are updated constantly:

Heap memory usage

This shows the total memory available to the Cassandra program, as well as how much it’s using right now.

Threads

This is the number of live threads Cassandra is using.

Classes

The number of classes that Cassandra has loaded. This number is relatively small for such a powerful program; Cassandra typically requires under 3,000 classes out of the box. Compare this to a program such as Oracle WebLogic, which typically loads around 24,000 classes.

CPU usage

This shows the percentage of the processor that the Cassandra program is currently using.

You can use the selector to adjust the time range shown in the charts.

If you want to see a more detailed view of how Cassandra is using the Java heap and non-heap memory, click the Memory tab. By changing the chart value in the drop-down, you can see in detail the gradations in which Cassandra is using its memory. You can also (try to) force a garbage collection if you think it’s necessary.

You can connect to more than one JMX agent at once. Just choose File → New Connection... and repeat the steps to connect to another running Cassandra node to view multiple servers at once. 

Overview of MBeans

A managed bean, or MBean, is a special type of Java bean that represents a single manageable resource inside the JVM. MBeans interact with an MBean server to make their functions remotely available.

A view of JConsole is provided in Figure 10-3.

Figure 10-3. JConsole showing the peak thread count for a Cassandra daemon

In this figure, you can see tabbed windows that offer general views about threads, memory, and CPU that every application will have, and a more detailed MBeans tab that exposes the ability to interact in more detail with MBeans exposed by the application. For example, in the figure, we’ve selected to view the peak thread count value. You can see that many other instrumented aspects of the application are also available.

There are many aspects of an application or the JVM that can be instrumented but that may be disabled. Thread Contention is one example of a potentially useful MBean that is turned off by default in the JVM. These aspects can be very useful for debugging, so if you see an MBean that you think might help you hunt down a problem, go ahead and enable it. But keep in mind that nothing comes for free, and it’s a good idea to read the JavaDoc on the MBean you want to enable in order to understand the potential impact on performance. For example, measuring CPU time per thread is an example of a useful, but expensive, MBean operation.

MBean Object Name Conventions

When an MBean is registered with the MBean server, it specifies an object name that is used to identify the MBean to JMX clients. An object name consists of a domain followed by a list of key-value pairs, which typically include a type. The usual convention is to choose a domain name similar to the Java package name of the MBean, and to name the type after the MBean interface name (minus the “MBean” suffix), but this is not strictly required.

For example, the threading attributes we looked at earlier appear under the java.lang > Threading heading in JConsole, and are exposed by a class implementing the java.lang.management.ThreadMXBean interface, which is registered with the object name java.lang:type=Threading.

As we discuss various MBeans in this chapter, we’ll identify both the MBean object name and the interface to help you navigate between JMX clients and the Cassandra source code.

Some simple values in the application are exposed as attributes.  An example of this is Threading > PeakThreadCount, which just reports the value that the MBean has stored for the greatest number of threads the application used at a single point in time. You can refresh to see the most recent value, but that’s pretty much all you can do with it. Because such a value is maintained internally in the JVM, it doesn’t make sense to set it externally (it’s derived from actual events, and not configurable).
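You can read such an attribute programmatically as well as through JConsole. This small sketch, using only standard JDK classes, reads the peak thread count of its own JVM both through the generic MBean server API and through the typed convenience API:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class PeakThreadsDemo {
    public static void main(String[] args) throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();

        // The threading MXBean is registered under a standard object name.
        ObjectName threading = new ObjectName("java.lang:type=Threading");

        // Read-only attribute: the most threads ever live at once.
        int peak = (Integer) mbs.getAttribute(threading, "PeakThreadCount");
        System.out.println("Peak thread count: " + peak);

        // The same value via the typed API.
        System.out.println("Via ThreadMXBean: "
            + ManagementFactory.getThreadMXBean().getPeakThreadCount());
    }
}
```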

But other MBeans are configurable. They make operations available to the JMX agent that let you get and set values. You can tell whether an MBean attribute is settable by looking at its writable flag: if it’s false, the attribute is displayed as a read-only label; if it’s true, you’ll see one or more fields for entering a new value and a button to apply it. An example of a configurable MBean is the ch.qos.logback.classic.jmx.JMXConfigurator bean, as shown in Figure 10-4.

Figure 10-4. The JMXConfigurator MBean allows you to set a logger’s log level

Note that the parameter names are not available to the JMX agent; they’re just labeled as p0, p1, and so on. That’s because the Java compiler discards parameter names during compilation (unless the code is compiled with the -parameters flag). So in order to know what parameters to set on an operation, you’ll need to look at the JavaDoc for the particular MBean you’re working with.

In the case of JMXConfigurator, this class implements an interface called JMXConfiguratorMBean, which wraps it for instrumentation. To find out what the right parameters are for the setLoggerLevel operation, we examine the JavaDoc for this interface, available at http://logback.qos.ch/apidocs/ch/qos/logback/classic/jmx/JMXConfiguratorMBean.html. Looking at the documentation, you’ll see that p0 represents the name of the logger you want to change, and p1 describes the logging level you want to set that logger to.

Some MBeans return an attribute value of javax.management.openmbean.CompositeDataSupport. That means that these are not simple values that can be displayed in a single field, such as LoadedClassCount, but are instead multivalued. One example is Memory > HeapMemoryUsage, which offers several data points and therefore has its own view.
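When you read such a multivalued attribute through the generic JMX API, you get back a CompositeData object whose parts are accessed by key. A short sketch against the JVM's own Memory MBean:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

public class HeapUsageDemo {
    public static void main(String[] args) throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        ObjectName memory = new ObjectName("java.lang:type=Memory");

        // HeapMemoryUsage is multivalued, so it arrives as CompositeData
        // rather than a single number.
        CompositeData heap =
            (CompositeData) mbs.getAttribute(memory, "HeapMemoryUsage");

        // Each data point is retrieved by key.
        long used = (Long) heap.get("used");
        long max = (Long) heap.get("max");
        System.out.println("Heap used " + used + " bytes of max " + max);
    }
}
```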

Another type of MBean operation is one that doesn’t simply show a value or allow you to set a value, but instead lets you execute some useful action. dumpAllThreads and resetPeakThreadCount are two such operations.

Now we’ll quickly get set up to start monitoring and managing Cassandra specifically.

Cassandra’s MBeans

Once you’ve connected with a JMX agent such as JConsole, you can manage Cassandra using the MBeans it exposes. To do so, click the MBeans tab. Other than the standard Java items available to every agent, there are several Cassandra packages that contain manageable beans, organized by their package names, which start with org.apache.cassandra. We won’t go into detail on all of them here, but there are several of interest that we’ll take a look at.

Many classes in Cassandra are exposed as MBeans, which means in practical terms that they implement a custom interface that describes the operations that need to be implemented and for which the JMX agent will provide hooks. The steps are basically the same for getting any MBean to work. If you’d like to JMX-enable something that isn’t already enabled, modify the source code following this general outline and you’ll be in business.

As an example, let’s look at Cassandra’s CompactionManager from the org.apache.cassandra.db.compaction package and how it uses MBeans. Here’s the definition of the CompactionManagerMBean interface, with comments omitted for brevity:

public interface CompactionManagerMBean
{
    public List<Map<String, String>> getCompactions();
    public List<String> getCompactionSummary();
    public TabularData getCompactionHistory();

    public void forceUserDefinedCompaction(String dataFiles);
    public void stopCompaction(String type);
    public void stopCompactionById(String compactionId);

    public int getCoreCompactorThreads();
    public void setCoreCompactorThreads(int number);

    public int getMaximumCompactorThreads();
    public void setMaximumCompactorThreads(int number);

    public int getCoreValidationThreads();
    public void setCoreValidationThreads(int number);

    public int getMaximumValidatorThreads();
    public void setMaximumValidatorThreads(int number);
}

As you can see by this MBean interface definition, there’s no magic going on. This is just a regular interface defining the set of operations that will be exposed to JMX that the CompactionManager implementation must support. This typically means maintaining additional metadata as the regular operations do their work.

The CompactionManager class implements this interface and must do the work of directly supporting JMX. The CompactionManager class itself registers and unregisters with the MBean server for the JMX properties that it maintains locally:

public static final String MBEAN_OBJECT_NAME = 
    "org.apache.cassandra.db:type=CompactionManager";
// ...
static
{
    instance = new CompactionManager();
    MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
    try
    {
        mbs.registerMBean(instance, 
           new ObjectName(MBEAN_OBJECT_NAME));
    }
    catch (Exception e)
    {
        throw new RuntimeException(e);
    }
}

Note that the MBean is registered in the domain org.apache.cassandra.db with a type of CompactionManager. The attributes and operations exposed by this MBean appear under org.apache.cassandra.db > CompactionManager in JMX clients. The implementation does all of the work that it is intended to do, and then has implementations of the methods that are only necessary for talking to the MBean server. For example, here is the CompactionManager implementation of the stopCompaction() operation:

public void stopCompaction(String type)
{
    OperationType operation = OperationType.valueOf(type);
    for (Holder holder : CompactionMetrics.getCompactions())
    {
        if (holder.getCompactionInfo().getTaskType() == operation)
            holder.stop();
    }
}

The CompactionManager iterates through the compactions in progress, stopping each one that is of the specified type. The Javadoc lets us know that the valid types are COMPACTION, VALIDATION, CLEANUP, SCRUB, and INDEX_BUILD.

When we view the CompactionManagerMBean in JConsole we can select the operations and view the stopCompaction() operation, as shown in Figure 10-5. We can enter one of the preceding types and request that the compactions be stopped.

Figure 10-5. The CompactionManager stopCompaction() operation

In the following sections, we see what features are available for monitoring and management via JMX.

Database MBeans

These are the Cassandra classes related to the core database itself that are exposed to clients in the org.apache.cassandra.db domain. There are many MBeans in this domain, but we’ll focus on a few key ones related to storage, caching, the commit log, and table stores.

Storage Service MBean

Because Cassandra is a database, it’s essentially a very sophisticated storage program; therefore, one of the first places you’ll want to look when you encounter problems is the org.apache.cassandra.service.StorageServiceMBean. This allows you to inspect your OperationMode, which reports normal if everything is going smoothly (other possible states are leaving, joining, decommissioned, and client).

You can also view the current set of live nodes, as well as the set of unreachable nodes in the cluster. If any nodes are unreachable, Cassandra will tell you their IP addresses in the UnreachableNodes attribute.

If you want to change Cassandra’s log level at runtime without interrupting service (as we saw earlier in the general example), you can invoke the setLoggingLevel(String classQualifier, String level) method. For example, say that you have set Cassandra’s log level to DEBUG because you’re troubleshooting an issue. You can use some of the methods described here to try to help fix your problem, then you might want to change the log level to something less taxing on the system. To do this, navigate to the StorageService MBean in a JMX client such as JConsole. We’ll change the value for a particularly chatty class: the Gossiper. The first parameter to this operation is the name of the class you want to set the log level for, and the second parameter is the level you want to set it to. Enter org.apache.cassandra.gms.Gossiper and INFO, and click the button labeled setLoggingLevel. You should see the following output in your logs (assuming your level was already debug):

INFO  03:08:20 set log level to INFO for classes under 
  'org.apache.cassandra.gms.Gossiper' (if the level doesn't look 
   like 'INFO' then the logger couldn't parse 'INFO')

After invoking the setLoggingLevel operation, we get the INFO output and no more DEBUG-level statements.

To get an understanding of how much data is stored on each node, you can use the getLoadMap() method, which will return to you a Java Map with keys of IP addresses with values of their corresponding storage loads. You can also use the effectiveOwnership(String keyspace) operation to access the percentage of the data in a keyspace owned by each node.

If you’re looking for a certain key, you can use the getNaturalEndpoints(String table, byte[] key) operation. Pass it the table name and the key for which you want to find the endpoint value, and it will return a list of IP addresses that are responsible for storing this key.

You can also use the getRangeToEndpointMap operation to get a map of range to end points describing your ring topology.

If you’re feeling brave, you can invoke the truncate() operation for a given table in a given keyspace. If all of the nodes are available, this operation will delete all data from the table but leave its definition intact.

There are many standard maintenance operations that the StorageServiceMBean affords you, including resumeBootstrap(), joinRing(), repairAsync(), drain(), removeNode(), decommission(), and operations to start and stop gossip, the native transport, and Thrift (until Thrift is finally removed). Understanding the available maintenance operations is important to keeping your cluster in good health, and we’ll dig more into these in Chapter 11.
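These attributes can also be read programmatically rather than through JConsole. The following client sketch assumes a node listening on the default JMX port 7199 on localhost with no authentication; the attribute names correspond to the getters on StorageServiceMBean:

```java
import java.util.Map;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class StorageServiceClient {
    public static void main(String[] args) throws Exception {
        // Cassandra exposes JMX over the standard RMI connector.
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");

        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            ObjectName storageService =
                new ObjectName("org.apache.cassandra.db:type=StorageService");

            // "OperationMode" maps to getOperationMode(), and so on.
            System.out.println("Operation mode: "
                + conn.getAttribute(storageService, "OperationMode"));

            @SuppressWarnings("unchecked")
            Map<String, String> load = (Map<String, String>)
                conn.getAttribute(storageService, "LoadMap");
            System.out.println("Load map: " + load);
        } catch (java.io.IOException e) {
            // No node running locally -- nothing to report.
            System.out.println("No node reachable on port 7199: "
                + e.getMessage());
        }
    }
}
```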

Storage Proxy MBean

As we learned in Chapter 6, the org.apache.cassandra.service.StorageProxy provides a layer on top of the StorageService to handle client requests and inter-node communications. The StorageProxyMBean provides the ability to check and set timeout values for various operations including read and write.

This MBean also provides access to hinted handoff settings such as the maximum time window for storing hints. Hinted handoff statistics include getTotalHints() and getHintsInProgress(). You can disable hints for a particular data center with the disableHintsForDC() operation.

You can also turn this node’s participation in hinted handoff on or off via setHintedHandoffEnabled(), or check the current status via getHintedHandoffEnabled(). These are used by nodetool’s enablehandoff, disablehandoff, and statushandoff commands, respectively.

The reloadTriggerClasses() operation allows you to install a new trigger without having to restart a node.

ColumnFamilyStoreMBean

Cassandra registers an instance of the org.apache.cassandra.db.ColumnFamilyStoreMBean for each table stored in the node under org.apache.cassandra.db > Tables (previously ColumnFamilies).

The ColumnFamilyStoreMBean provides access to the compaction and compression settings for each table. This allows you to temporarily override these settings on a specific node. The values will be reset to those configured on the table schema when the node is restarted.

The MBean also exposes a lot of information about the node’s storage of data for this table on disk. The getSSTableCountPerLevel() operation provides a list of how many SStables are in each tier. The estimateKeys() operation provides an estimate of the number of partitions stored on this node. Taken together, this information can give you some insight as to whether invoking the forceMajorCompaction() operation for this table might help free space on this node and increase read performance.

There is also a trueSnapshotsSize() operation that allows you to determine the size of SSTable snapshots which are no longer active. A large value here indicates that you should consider deleting these snapshots, possibly after making an archive copy.

Because Cassandra stores indexes as tables, there is also a ColumnFamilyStoreMBean instance for each indexed column, available under org.apache.cassandra.db > IndexTables (previously IndexColumnFamilies), with the same attributes and operations.

CacheServiceMBean

The org.apache.cassandra.service.CacheServiceMBean provides access to Cassandra’s key cache, row cache, and counter cache under the domain org.apache.cassandra.db > Caches. The information available for each cache includes the maximum size and time duration to cache items, and the ability to invalidate each cache.

CommitLogMBean

The org.apache.cassandra.db.commitlog.CommitLogMBean exposes attributes and operations that allow you to learn about the current state of commit logs. The Commit​LogMBean also exposes the recover() operation which can be used to restore database state from archived commit log files.

The default settings that control commit log recovery are specified in the conf/commitlog_archiving.properties file, but can be overridden via the MBean. We’ll learn more about data recovery in Chapter 11.

Compaction Manager MBean

We’ve already taken a peek inside the source of the org.apache.cassandra.db.compaction.CompactionManagerMBean to see how it interacts with JMX, but we didn’t really talk about its purpose. This MBean allows us to get statistics about compactions performed in the past, and the ability to force compaction of specific SSTable files we identify by calling the forceUserDefinedCompaction method of the Compaction​Manager class. This MBean is leveraged by nodetool commands including compact, compactionhistory, and compactionstats.

Snitch MBeans

Cassandra provides two MBeans to monitor and configure behavior of the snitch. The org.apache.cassandra.locator.EndpointSnitchInfoMBean provides the name of the rack and data center for a given host, as well as the name of the snitch in use.

If you’re using the DynamicEndpointSnitch, the org.apache.cassandra.locator.DynamicEndpointSnitchMBean is registered. This MBean exposes the ability to reset the badness threshold used by the snitch for marking nodes as offline, as well as allowing you to see the scores for various nodes.

HintedHandoffManagerMBean

In addition to the hinted handoff operations on the StorageServiceMBean mentioned earlier, Cassandra provides more fine-grained control of hinted handoff via the org.apache.cassandra.db.HintedHandOffManagerMBean. This MBean exposes the ability to list nodes for which hints are stored by calling listEndpointsPendingHints(). You can then force delivery of hints to a node via scheduleHintDelivery(), or delete hints that are stored up for a specific node with deleteHintsForEndpoint().

Additionally, you can pause and resume hint delivery to all nodes with pauseHintDelivery() or delete stored hints for all nodes with the truncateAllHints() operation. These are used by nodetool’s pausehandoff, resumehandoff, and truncatehints commands, respectively.

Duplicative Hinted Handoff Management

The org.apache.cassandra.hints.HintsService exposes the HintsServiceMBean under the org.apache.cassandra.hints domain. This MBean provides operations to pause and resume hinted handoff, and to delete hints stored for all nodes or for a specific node identified by IP address.

Because there is a lot of overlap between the StorageServiceMBean, HintedHandOffManagerMBean, and HintsServiceMBean, there is likely to be some consolidation of these operations in future releases.

Networking MBeans

The org.apache.cassandra.net domain contains MBeans to help manage Cassandra’s network-related activities, including Phi failure detection and gossip, the Messaging Service, and Stream Manager.

FailureDetectorMBean

The org.apache.cassandra.gms.FailureDetectorMBean provides attributes describing the states and Phi scores of other nodes, as well as the Phi conviction threshold.

GossiperMBean

The org.apache.cassandra.gms.GossiperMBean provides access to the work of the Gossiper.

We’ve already discussed how the StorageServiceMBean reports which nodes are unreachable. Based on that list, you can call the getEndpointDowntime() operation on the GossiperMBean to determine how long a given node has been down. The downtime is measured from the perspective of the node whose MBean we’re inspecting, and the value resets when the node comes back online. Cassandra uses this operation internally to know how long it can wait to discard hints.

The getCurrentGenerationNumber() operation returns the generation number associated with a specific node. The generation number is included in gossip messages exchanged between nodes and is used to distinguish the current state of a node from the state prior to a restart. The generation number remains the same while the node is alive and is incremented each time the node restarts. It’s maintained by the Gossiper using a timestamp.
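For example, a simple JMX client could query both of these operations for a given endpoint (a sketch, assuming the default JMX port 7199; the host and endpoint address are passed as arguments, and no units are implied for the downtime value beyond what the node reports):

```java
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class GossipDowntime {
    // Registered name of the Gossiper MBean.
    static ObjectName gossiperName() {
        try {
            return new ObjectName("org.apache.cassandra.net:type=Gossiper");
        } catch (MalformedObjectNameException e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: GossipDowntime <host> <endpoint-address>");
            return;
        }
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + args[0] + ":7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            // How long has this endpoint been down, from this node's view?
            Object downtime = conn.invoke(gossiperName(), "getEndpointDowntime",
                    new Object[] { args[1] },
                    new String[] { String.class.getName() });
            // Generation number: bumped each time the endpoint restarts.
            Object generation = conn.invoke(gossiperName(),
                    "getCurrentGenerationNumber",
                    new Object[] { args[1] },
                    new String[] { String.class.getName() });
            System.out.println(args[1] + " downtime=" + downtime
                    + " generation=" + generation);
        }
    }
}
```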

The assassinateEndpoint() operation attempts to remove a node from the ring by telling the other nodes that the node has been permanently removed, similar to the concept of “character assassination” in human gossip. Assassinating a node is a maintenance step of last resort when a node cannot be removed from the cluster normally. This operation is used by the nodetool assassinate command.

StreamManagerMBean

The org.apache.cassandra.streaming.StreamManagerMBean allows us to see the streaming activities that occur between a node and its peers. There are two basic ideas here: a stream source and a stream destination. Each node can stream its data to another node in order to perform load balancing, and the StreamManager class supports these operations. The StreamManagerMBean gives a necessary view into the data that is moving between nodes in the cluster.

The StreamManagerMBean supports two modes of operation. The getCurrentStreams() operation provides a snapshot of the current incoming and outgoing streams, and the MBean also publishes notifications associated with stream state changes, such as initialization, completion or failure. You can subscribe to these notifications in your JMX client in order to watch the streaming operations as they occur.
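Subscribing to these notifications programmatically looks something like the following sketch (assuming the default JMX port 7199; the exact notification types you observe depend on what stream activity occurs while the listener is registered):

```java
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class StreamWatcher {
    // Registered name of the StreamManager MBean.
    static ObjectName streamManagerName() {
        try {
            return new ObjectName("org.apache.cassandra.net:type=StreamManager");
        } catch (MalformedObjectNameException e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: StreamWatcher <host>");
            return;
        }
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + args[0] + ":7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            // Print each stream state-change notification as it arrives.
            NotificationListener listener = (Notification n, Object handback) ->
                    System.out.println(n.getType() + ": " + n.getMessage());
            conn.addNotificationListener(streamManagerName(), listener, null, null);
            Thread.sleep(60_000); // watch stream events for one minute
        }
    }
}
```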

So if you’re concerned that a node is not receiving data as it should, or that a node is unbalanced or even down, the StreamManagerMBean and the StorageServiceMBean working together can give you very rich insight into exactly what’s happening in your cluster.

Metrics MBeans

The ability to access metrics related to application performance, health, and key activities has become an essential tool for maintaining web-scale applications. Fortunately, Cassandra collects a wide range of metrics on its own activities to help us understand its behavior.

The metrics reported by Cassandra include the following:

  • Buffer pool metrics describing Cassandra’s use of memory.
  • CQL metrics including the number of prepared and regular statement executions.
  • Cache metrics for key, row, and counter caches such as the number of entries versus capacity, as well as hit and miss rates.
  • Client metrics including the number of connected clients, and information about client requests such as latency, failures, and timeouts.
  • Commit log metrics including the commit log size and statistics on pending and completed tasks.
  • Compaction metrics including the total bytes compacted and statistics on pending and completed compactions.
  • Connection metrics to each node in the cluster including gossip.
  • Dropped message metrics, which are used as part of nodetool tpstats.
  • Read repair metrics describing the number of background versus blocking read repairs performed over time.
  • Storage metrics, including counts of hints in progress and total hints.
  • Thread pool metrics, including active, completed, and blocked tasks for each thread pool.
  • Table metrics, including caches, memtables, SSTables, and Bloom filter usage and the latency of various read and write operations, reported at one, five, and fifteen minute intervals.
  • Keyspace metrics that aggregate the metrics for the tables in each keyspace.

To make these metrics accessible via JMX, Cassandra uses the Dropwizard Metrics open source Java library. Cassandra registers its metrics with the Metrics library, which in turn exposes them as MBeans in the org.apache.cassandra.metrics domain.

Many of these metrics are used by nodetool commands such as tpstats, tablehistograms, and proxyhistograms. For example, tpstats is simply a presentation of the thread pool and dropped message metrics.
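To read one of these metrics yourself, a JMX client can query the corresponding MBean by name. Here’s a minimal sketch that fetches the client read request latency timer (assuming the default JMX port 7199; the object name pattern and attribute names such as Count and OneMinuteRate follow the Metrics library’s JMX reporter conventions):

```java
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ReadLatencyMetric {
    // The read latency timer in the org.apache.cassandra.metrics domain.
    static ObjectName readLatencyName() {
        try {
            return new ObjectName(
                "org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency");
        } catch (MalformedObjectNameException e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: ReadLatencyMetric <host>");
            return;
        }
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + args[0] + ":7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            // Total reads since startup, and the one-minute request rate.
            System.out.println("reads="
                    + conn.getAttribute(readLatencyName(), "Count")
                    + " oneMinuteRate="
                    + conn.getAttribute(readLatencyName(), "OneMinuteRate"));
        }
    }
}
```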

Threading MBeans

The org.apache.cassandra.internal domain houses MBeans that allow you to configure the thread pools associated with each stage in Cassandra’s Staged Event-Driven Architecture (SEDA). The stages include AntiEntropyStage, GossipStage, InternalResponseStage, MigrationStage, and others. 

Read Repair MBean

For historical reasons, the ReadRepairStage MBean is located under the org.apache.cassandra.request domain instead of org.apache.cassandra.internal.

The thread pools are implemented via the JMXEnabledThreadPoolExecutor and JMXEnabledScheduledThreadPoolExecutor classes in the org.apache.cassandra.concurrent package. The MBeans for each stage implement the JMXEnabledScheduledThreadPoolExecutorMBean interface, which allows you to view and configure the number of core threads in each thread pool as well as the maximum number of threads.
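For instance, a client could inspect and resize a stage’s pool along these lines (a sketch, assuming the default JMX port 7199; the CoreThreads attribute name is an assumption based on the MBean interface described above, and the stage name pattern mirrors the org.apache.cassandra.internal domain):

```java
import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class StageTuner {
    // Stages are registered as org.apache.cassandra.internal:type=<StageName>.
    static ObjectName stageName(String stage) {
        try {
            return new ObjectName("org.apache.cassandra.internal:type=" + stage);
        } catch (MalformedObjectNameException e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: StageTuner <host> <core-threads>");
            return;
        }
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + args[0] + ":7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            ObjectName antiEntropy = stageName("AntiEntropyStage");
            System.out.println("CoreThreads before: "
                    + conn.getAttribute(antiEntropy, "CoreThreads"));
            // Resize the pool; the change takes effect without a restart.
            conn.setAttribute(antiEntropy,
                    new Attribute("CoreThreads", Integer.parseInt(args[1])));
        }
    }
}
```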

Service MBeans

The GCInspectorMXBean exposes a single operation, getAndResetStats(), which retrieves and then resets garbage collection metrics that Cassandra collects on its JVM. This MBean appears in the org.apache.cassandra.service domain and is accessed by the nodetool gcstats command.

Security MBeans

The org.apache.cassandra.auth domain contains security-related MBeans that are grouped according to the same Java package name. As of the 3.0 release, this consists of a single MBean, the PermissionsCacheMBean, exposed to clients as org.apache.cassandra.auth.PermissionsCache. We’ll discuss this MBean in Chapter 13.

Monitoring with nodetool

We’ve already explored a few of the commands offered by nodetool in previous chapters, but let’s take this opportunity to get properly introduced.

nodetool ships with Cassandra and can be found in <cassandra-home>/bin. This is a command-line program that offers a rich array of ways to look at your cluster, understand its activity, and modify it. nodetool lets you get limited statistics about the cluster, see the ranges each node maintains, move data from one node to another, decommission a node, and even repair a node that’s having trouble.

Overlap of nodetool and JMX

Many of the tasks in nodetool overlap with functions available in the JMX interface. This is because, behind the scenes, nodetool is invoking JMX using a helper class called org.apache.cassandra.tools.NodeProbe. So JMX is doing the real work: the NodeProbe class connects to the JMX agent and retrieves the data, and the NodeCmd class presents it in an interactive command-line interface.

nodetool uses the same environment settings as the Cassandra daemon: bin/cassandra.in.sh and conf/cassandra-env.sh on Unix (or bin/cassandra.in.bat and conf/cassandra-env.ps1 on Windows). The logging settings are found in the conf/logback-tools.xml file; these work the same way as the Cassandra daemon logging settings found in conf/logback.xml.

Starting nodetool is a breeze. Just open a terminal, navigate to <cassandra-home>, and enter the following command:

$ bin/nodetool help

This causes the program to print a list of available commands, several of which we will cover momentarily. Running nodetool with no arguments is equivalent to the help command. You can also execute help with the name of a specific command to get additional details.

Connecting to a Specific Node

With the exception of the help command, nodetool must connect to a Cassandra node in order to access information about that node or the cluster as a whole.

You can use the -h option to identify the IP address of the node to connect to with nodetool. If no IP address is specified, the tool attempts to connect via the default JMX port (7199) on the local machine, which is the approach we’ll take for examples in this chapter.

Getting Cluster Information

There is a variety of information you can get about the cluster and its nodes, which we look at in this section. You can get basic information on an individual node or on all the nodes participating in a ring.

describecluster

The describecluster command prints out basic information about the cluster, including the name, snitch, and partitioner:

$ bin/nodetool describecluster
Cluster Information:
  Name: Test Cluster
  Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
  Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
  Schema versions:
    2d4043cb-2124-3589-b2d0-375759b9dd0a: [127.0.0.1, 127.0.0.2, 127.0.0.3]

The last part of the output is especially important for identifying any disagreements in table definitions, or “schema,” between nodes. As Cassandra propagates schema changes through a cluster, any differences are typically resolved quickly, so any lingering schema differences usually indicate a node that is down or unreachable and needs to be restarted.

status

A more direct way to identify the nodes in your cluster and what state they’re in is to use the status command:

$ bin/nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns  Host ID
UN  127.0.0.1  103.82 KB  256     ?     31d9042b-6603-4040-8aac-fef0a235570b
UN  127.0.0.2  110.9 KB   256     ?     caad1573-4157-43d2-a9fa-88f79344683d
UN  127.0.0.3  109.6 KB   256     ?     e78529c8-ee9f-46a4-8bc1-3479f99a1860 

The status is organized by data center and rack. Each node’s status is identified by a two-character code, in which the first character indicates whether the node is up (currently available and ready for queries) or down, and the second character indicates the state or operational mode of the node. The load column represents the byte count of the data each node is holding.

The owns column indicates the effective percentage of the token range owned by the node, taking replication into account. Because we did not specify a keyspace, and the various keyspaces in this cluster have differing replication strategies, nodetool is not able to calculate a meaningful ownership percentage.

info

The info command tells nodetool to connect with a single node and get basic data about its current state. Just pass it the address of the node you want info for:

$ bin/nodetool -h 192.168.2.7 info
ID                     : 197efa22-ecaa-40dc-a010-6c105819bf5e
Gossip active          : true
Thrift active          : false
Native Transport active: true
Load                   : 301.17 MB
Generation No          : 1447444152
Uptime (seconds)       : 1901668
Heap Memory (MB)       : 395.03 / 989.88
Off Heap Memory (MB)   : 2.94
Data Center            : datacenter1
Rack                   : rack1
Exceptions             : 0
Key Cache              : entries 85, size 8.38 KB, capacity 49 MB,
  47958 hits, 48038 requests, 0.998 recent hit rate, 14400 save 
  period in seconds
Row Cache              : entries 0, size 0 bytes, capacity 0 bytes,
  0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
Counter Cache          : entries 0, size 0 bytes, capacity 24 MB, 
  0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
Token                  : (invoke with -T/--tokens to see all 256 tokens)

The information reported includes the memory and disk usage (“Load”) of the node and the status of various services offered by Cassandra. You can also check the status of individual services with the nodetool commands statusgossip, statusthrift, statusbinary, and statushandoff (note that handoff status is not part of info).

ring

To determine what nodes are in your ring and what state they’re in, use the ring command on nodetool, like this:

$ bin/nodetool ring
Datacenter: datacenter1
==========
Address      Rack   Status State   Load      Owns   Token                                       
                                                    9208237582789476801 
192.168.2.5  rack1  Up     Normal  243.6 KB  ?      -9203905334627395805
192.168.2.6  rack1  Up     Normal  243.6 KB  ?      -9145503818225306830
192.168.2.7  rack1  Up     Normal  243.6 KB  ?      -9091015424710319286
... 

This output is organized in terms of vnodes. Here we see the IP addresses of all the nodes in the ring. In this case, we have three nodes, all of which are up (currently available and ready for queries). The load column represents the byte count of the data each node is holding. The output of the describering command is similar but is organized around token ranges.

Other useful status commands provided by nodetool include:

  • The getlogginglevels and setlogginglevel commands allow dynamic configuration of logging levels, using the Logback JMXConfiguratorMBean we discussed previously.
  • The gossipinfo command prints the parameters this node communicates about itself to other nodes via gossip.
  • The version command prints the version of Cassandra this node is running. 

Getting Statistics

nodetool also lets you gather statistics about the state of your server, both at the aggregate level and down to the level of specific keyspaces and tables. Two of the most frequently used commands are tpstats and tablestats, both of which we examine now.

Using tpstats

The tpstats tool gives us information on the thread pools that Cassandra maintains. Cassandra is highly concurrent, and optimized for multiprocessor/multicore machines. Moreover, Cassandra employs a Staged Event-Driven Architecture (SEDA) internally, so understanding the behavior and health of the thread pools is important to good Cassandra maintenance.

To find statistics on the thread pools, execute nodetool with the tpstats command:

$ bin/nodetool tpstats
Pool Name            Active  Pending   Completed   Blocked  All time 
                                                             blocked
ReadStage                 0        0         216         0         0
MutationStage             1        0        3637         0         0
CounterMutationStage      0        0           0         0         0
ViewMutationStage         0        0           0         0         0
GossipStage               0        0           0         0         0
RequestResponseStage      0        0           0         0         0
AntiEntropyStage          0        0           0         0         0
MigrationStage            0        0           2         0         0
MiscStage                 0        0           0         0         0
InternalResponseStage     0        0           2         0         0
ReadRepairStage           0        0           0         0         0

Message type           Dropped
READ                         0
RANGE_SLICE                  0
_TRACE                       0
HINT                         0
MUTATION                     0
COUNTER_MUTATION             0
BATCH_STORE                  0
BATCH_REMOVE                 0
REQUEST_RESPONSE             0
PAGED_RANGE                  0
READ_REPAIR                  0

The top portion of the output presents data on tasks in each of Cassandra’s thread pools. You can see directly how many operations are in what stage, and whether they are active, pending, or completed. This output was captured during a write operation, and therefore shows that there is an active task in the MutationStage.

The bottom portion of the output indicates the number of dropped messages for the node. Dropped messages are an indicator of Cassandra’s load shedding implementation, which each node uses to defend itself when it receives more requests than it can handle. For example, internode messages that are received by a node but not processed within the rpc_timeout are dropped, rather than processed, as the coordinator node will no longer be waiting for a response.

Seeing lots of zeros in the output for blocked tasks and dropped messages means that you either have very little activity on the server or that Cassandra is doing an exceptional job of keeping up with the load. Lots of non-zero values is indicative of situations where Cassandra is having a hard time keeping up, and may indicate a need for some of the techniques described in Chapter 12.

Using tablestats

To see overview statistics for keyspaces and tables, you can use the tablestats command. You may also recognize this command by its previous name, cfstats. Here is sample output for the hotel keyspace:

$ bin/nodetool tablestats hotel
Keyspace: hotel
    Read Count: 8
    Read Latency: 0.617 ms.
    Write Count: 13
    Write Latency: 0.13330769230769232 ms.
    Pending Flushes: 0
        Table: hotels
        SSTable count: 3
        Space used (live): 16601
        Space used (total): 16601
        Space used by snapshots (total): 0
        Off heap memory used (total): 140
        SSTable Compression Ratio: 0.6277372262773723
        Number of keys (estimate): 19
        Memtable cell count: 8
        Memtable data size: 792
        Memtable off heap memory used: 0
        Memtable switch count: 1
        Local read count: 8
        Local read latency: 0.680 ms
        Local write count: 13
        Local write latency: 0.148 ms
        Pending flushes: 0
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 56
        Bloom filter off heap memory used: 32
        Index summary off heap memory used: 84
        Compression metadata off heap memory used: 24
        Compacted partition minimum bytes: 30
        Compacted partition maximum bytes: 149
        Compacted partition mean bytes: 87
        Average live cells per slice (last five minutes): 1.0
        Maximum live cells per slice (last five minutes): 1
        Average tombstones per slice (last five minutes): 1.0
        Maximum tombstones per slice (last five minutes): 1

Here we have omitted output for other tables in the keyspace so we can focus on the hotels table; the same statistics are generated for each table. We can see the read and write latency and total number of reads and writes at the keyspace and table level. We can also see detailed information about Cassandra’s internal structures for each table, including memtables, Bloom filters, and SSTables.

Summary

In this chapter, we looked at ways you can monitor and manage your Cassandra cluster. In particular, we went into some detail on JMX and learned the rich variety of operations Cassandra makes available to the MBean server. We saw how to use JConsole and nodetool to view what’s happening in your Cassandra cluster. You are now ready to learn how to perform routine maintenance tasks to help keep your Cassandra cluster healthy.
