Monitoring a ZooKeeper instance

The ZooKeeper service can be monitored in the following two ways:

  • Using a set of four-letter words to check health and status
  • Using the Java Management Extensions (JMX) capabilities built into ZooKeeper

Four-letter words

ZooKeeper responds to a small set of commands, each composed of four letters. These commands can be sent to the server's client port using telnet or nc. Their main purpose is to provide a simple mechanism to check the health of the server or to diagnose problems.
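As a rough illustration of what telnet or nc do here, the following Python sketch opens a TCP connection to the client port, sends the four letters, and reads until the server closes the connection. The function name four_letter_word is hypothetical, and the default host and port (localhost, 2181) are assumptions about the local setup rather than values taken from this chapter.

```python
import socket


def four_letter_word(cmd, host="localhost", port=2181, timeout=5.0):
    """Send a four-letter word to a ZooKeeper server and return the reply as text.

    Hypothetical helper: host and port defaults are assumptions about the setup.
    """
    if len(cmd) != 4:
        raise ValueError("expected a four-letter command, got %r" % (cmd,))
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The command is sent as plain ASCII, with no trailing newline needed.
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Against a healthy server, calling four_letter_word("ruok") should return imok. On newer ZooKeeper releases, a command may first need to be enabled through the 4lw.commands.whitelist setting before the server will answer it.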

The following are the four-letter words supported by ZooKeeper services at the time of writing this book:

  • conf: This prints details about server configuration parameters such as clientPort, dataDir, tickTime, and so on.
  • cons: This lists the full connection/session details for all clients connected to this server.
  • crst: This resets connection/session statistics for all connections.
  • dump: This lists the outstanding sessions and ephemeral nodes. This only works on the leader.
  • envi: This lists the environment parameters.
  • ruok: This checks whether the server is running without any error. The server will respond with imok if it is running. If the server is in some error state, it will not respond to this command.
  • srst: This resets the server statistics.
  • stat: This provides information on the current status of the server and the list of connected clients.
  • srvr: This provides the same information as the stat command, but without the list of connected clients.
  • wchs: This provides brief information on watches for the server.
  • wchc: This provides detailed information on watches for the server, sorted by sessions (connections), showing a list of sessions with associated watches (paths).
  • wchp: This provides detailed information on watches for the server, sorted by paths (znodes). This shows a list of paths with associated sessions.
  • mntr: This outputs a list of variables that can be used to monitor the health of the cluster.
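The output of mntr lends itself to automated checks, since each line carries a variable name and a value separated by a tab. The sketch below parses such a reply into a dictionary. The function name parse_mntr is hypothetical, and the sample reply is made up for illustration; the exact set of variables and their values depends on the ZooKeeper version and deployment.

```python
def parse_mntr(text):
    """Parse a mntr reply into a dict.

    Each non-empty line is expected to look like "<name><TAB><value>".
    Values that are plain integers become int; everything else stays a string.
    """
    metrics = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, value = line.partition("\t")
        name, value = name.strip(), value.strip()
        try:
            metrics[name] = int(value)
        except ValueError:
            metrics[name] = value
    return metrics


# Illustrative reply only; these values are invented for the example.
sample = (
    "zk_version\t3.4.6\n"
    "zk_avg_latency\t0\n"
    "zk_outstanding_requests\t0\n"
    "zk_server_state\tstandalone\n"
    "zk_znode_count\t4\n"
)
metrics = parse_mntr(sample)
print(metrics["zk_server_state"], metrics["zk_znode_count"])
```

A monitoring script could feed the real reply of the mntr command into such a parser and raise an alert when, for example, zk_outstanding_requests exceeds a threshold chosen for the cluster.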

Some examples of running these four-letter commands to monitor the current status of the ZooKeeper server are shown in this screenshot:

[Screenshot: four-letter words]

Java Management Extensions

ZooKeeper provides extensive monitoring and management capabilities through Java Management Extensions (JMX). In this section, we will look at using jconsole, a simple JMX management console that ships with the JDK, to explore ZooKeeper management.

Setting up JMX for monitoring and management is beyond the scope of this book. For more details, visit https://docs.oracle.com/javase/7/docs/technotes/guides/management/agent.html.

Now, let's start jconsole from the command line on the same system where ZooKeeper is also running. In our case, the ZooKeeper service is running on localhost. Running jconsole starts a window similar to the one shown in the following screenshot:

[Screenshot: Java Management Extensions, jconsole startup window]

We can see that jconsole has discovered the ZooKeeper process with PID 4824. Let's connect to this process by double-clicking on it. Once jconsole attaches to the ZooKeeper process, we will see a window similar to the following one, showing system statistics such as memory usage, threads, and JVM-specific information. These statistics and system counters are important for monitoring the state of the ZooKeeper server and for debugging performance issues in a production cluster.

[Screenshot: Java Management Extensions, jconsole attached to the ZooKeeper process]

The MBeans tab shows detailed information on ZooKeeper's internal state, such as the connected clients and the attributes and details of operations performed in the ZooKeeper namespace. Managed Beans (MBeans) are an elegant and flexible way to expose internal information about the ZooKeeper server through JMX.

More details on the various MBeans available for ZooKeeper management and monitoring can be found at https://zookeeper.apache.org/doc/trunk/zookeeperJMX.html.

[Screenshot: Java Management Extensions, MBeans tab]