Chapter 7. Integrating Storm with JMX, Ganglia, HBase, and Redis

In the previous chapter, we gave an overview of Apache Hadoop and its components, introduced Storm-YARN, and deployed Storm-YARN on Apache Hadoop.

In this chapter, we will explain how you can monitor a Storm cluster using well-known monitoring tools such as Java Management Extensions (JMX) and Ganglia.

We will also work through examples that demonstrate how you can store processed data in a database and in a distributed cache.

In this chapter, we will cover the following topics:

  • Monitoring Storm using JMX
  • Monitoring Storm using Ganglia
  • Integrating Storm with HBase
  • Integrating Storm with Redis

Monitoring the Storm cluster using JMX

In Chapter 3, Monitoring the Storm Cluster, we learned how to monitor a Storm cluster using the Storm UI or the Nimbus Thrift API. This section explains how you can monitor the Storm cluster using JMX. JMX is a set of specifications used to manage and monitor applications running in the JVM. On the JMX console, we can view JVM metrics of the Storm daemons, such as heap and non-heap memory usage, the number of threads, the number of loaded classes, and the virtual machine arguments, and we can also manage objects. The following are the steps we need to perform to monitor the Storm cluster using JMX:

  1. Add the following line to the storm.yaml file of each supervisor node to enable JMX on it:
    supervisor.childopts: -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=12346

    Here, 12346 is the port number used to collect the supervisor Java Virtual Machine (JVM) metrics through JMX.

  2. Add the following line in the storm.yaml file of the Nimbus machine to enable JMX on the Nimbus node:
    nimbus.childopts: -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=12345

    Here, 12345 is the port number used to collect the Nimbus JVM metrics through JMX.

  3. Also, you can collect the JVM metrics of worker processes by adding the following line in the storm.yaml file of each supervisor node:
    worker.childopts: -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=2%ID%

    Here, %ID% is a placeholder for the port number of each worker process. If the port of a worker process is 6700, then its JVM metrics are published on port number 26700 (2%ID%).

  4. Now, run the following commands on any machine where Java is installed to start JConsole:
    cd $JAVA_HOME
    ./bin/jconsole

    The following screenshot shows how we can connect to the supervisor JMX port using JConsole:

    The JMX connection page

    If you open the JMX console on a machine other than the supervisor machine, then you need to use the IP address of the supervisor machine in the preceding screenshot instead of 127.0.0.1.

    Now, click on the Connect button to view the metrics of the supervisor node. The following screenshot shows what the metrics of the Storm supervisor node look like on the JMX console:

    The JMX console

    Similarly, you can collect the JVM metrics of the Nimbus node by specifying the IP address and the JMX port of the Nimbus machine on the JMX console.
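The same JMX ports can also be read from code instead of JConsole. The following is a minimal sketch using only the standard javax.management API, assuming JMX is enabled without SSL or authentication as configured in the preceding steps. The class name StormJmxProbe and its helper methods are hypothetical names chosen for this illustration; they are not part of Storm.

```java
import java.io.IOException;
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper class for this illustration; not part of Storm.
class StormJmxProbe {

    // Builds the standard RMI-based JMX service URL for a host and JMX port.
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    // Mirrors the 2%ID% pattern from worker.childopts:
    // a worker listening on port 6700 publishes JMX on port 26700.
    static int workerJmxPort(int workerPort) {
        return Integer.parseInt("2" + workerPort);
    }

    // Reads a few JVM metrics through an MBean server connection.
    // The connection may be remote (a Storm daemon) or the local platform server.
    static String describe(MBeanServerConnection conn) throws IOException {
        MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                conn, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
        ThreadMXBean threads = ManagementFactory.newPlatformMXBeanProxy(
                conn, ManagementFactory.THREAD_MXBEAN_NAME, ThreadMXBean.class);
        ClassLoadingMXBean classes = ManagementFactory.newPlatformMXBeanProxy(
                conn, ManagementFactory.CLASS_LOADING_MXBEAN_NAME, ClassLoadingMXBean.class);
        return "heapUsed=" + memory.getHeapMemoryUsage().getUsed()
                + " nonHeapUsed=" + memory.getNonHeapMemoryUsage().getUsed()
                + " threads=" + threads.getThreadCount()
                + " loadedClasses=" + classes.getLoadedClassCount();
    }

    public static void main(String[] args) throws Exception {
        // Defaults match the supervisor example above: 127.0.0.1 and port 12346.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 12346;
        JMXServiceURL url = new JMXServiceURL(serviceUrl(host, port));
        // JMXConnector is Closeable, so the connection is released automatically.
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            System.out.println(describe(connector.getMBeanServerConnection()));
        }
    }
}
```

To query the Nimbus node, pass the IP address of the Nimbus machine and port 12345 as arguments; for a worker on port 6700, pass the value returned by workerJmxPort(6700).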

The following section will explain how you can display the Storm cluster metrics on Ganglia.
