Monitoring the Storm cluster using Ganglia

Ganglia is a monitoring tool that collects metrics from the different types of processes that run on a cluster. In many deployments, Ganglia serves as the centralized monitoring tool that displays the metrics of all the processes running on a cluster. Hence, it is essential that you enable the monitoring of the Storm cluster through Ganglia.

Ganglia has three important components:

  • Gmond: This is a monitoring daemon of Ganglia that collects the metrics of nodes and sends this information to the Gmetad server. To collect the metrics of each Storm node, you will need to install the Gmond daemon on each of them.
  • Gmetad: This gathers the metrics from all the Gmond nodes and stores them in a round-robin database (RRD).
  • The Ganglia web interface: This displays the metrics information in a graphical form.

Storm doesn't have built-in support for monitoring the Storm cluster with Ganglia. However, with jmxtrans, you can enable Storm monitoring using Ganglia. The jmxtrans tool allows you to connect to any JVM and fetch its JVM metrics without writing a single line of code. The JVM metrics exposed via JMX can then be displayed on Ganglia using jmxtrans. Hence, jmxtrans acts as a bridge between Storm and Ganglia.

The following diagram shows how jmxtrans is used between the Storm node and Ganglia:

[Diagram: jmxtrans between the Storm node and Ganglia]

Integrating Ganglia with Storm

Perform the following steps to set up jmxtrans and Ganglia:

  1. Run the following commands to download and install the jmxtrans tool on each Storm node:
    wget https://jmxtrans.googlecode.com/files/jmxtrans-239-0.noarch.rpm
    sudo rpm -i jmxtrans-239-0.noarch.rpm
    
  2. Run the following commands to install the Ganglia Gmond and Gmetad packages on any machine in a network. You can deploy the Gmetad and Gmond processes on a machine that will not be a part of the Storm cluster.
    sudo yum -q -y install rrdtool
    sudo yum -q -y install ganglia-gmond
    sudo yum -q -y install ganglia-gmetad
    sudo yum -q -y install ganglia-web
    
  3. Edit the following line in the gmetad.conf configuration file, which is located in /etc/ganglia on the Gmetad machine. This line specifies the name of the data source and the IP address of the Ganglia Gmetad machine.
    data_source "stormcluster" 127.0.0.1

    Note

    You can replace 127.0.0.1 with the IP address of the Ganglia Gmetad machine.

  4. Edit the following lines in the gmond.conf configuration file, which is located in /etc/ganglia on the Gmond machine:
    cluster {
      name = "stormcluster"
      owner = "clusterOwner"
      latlong = "unspecified"
      url = "unspecified"
    }
    host {
      location = "unspecified"
    }
    udp_send_channel {
      host = 127.0.0.1
      port = 8649
      ttl = 1
    }
    udp_recv_channel {
      port = 8649
    }

    Here, 127.0.0.1 is the IP address of the Storm node. You need to replace 127.0.0.1 with the actual IP address of the machine. We have mainly edited the following entries in the Gmond configuration file:

    • The cluster name
    • The host address of the head Gmond node in the udp_send_channel block
    • The port in the udp_recv_channel block
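
    To check that a Gmond daemon is up and reporting under the expected cluster name, you can read the XML dump it serves over TCP. The following Python sketch assumes that gmond.conf still contains the stock tcp_accept_channel block on port 8649, which is not part of the listing above; read_gmond_xml and hosts_by_cluster are illustrative helper names, not part of Ganglia.

```python
import socket
import xml.etree.ElementTree as ET


def read_gmond_xml(host="127.0.0.1", port=8649, timeout=5.0):
    """Read the XML dump that gmond sends to a TCP client before closing."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    # Return raw bytes so the parser can honor the encoding in the XML header.
    return b"".join(chunks)


def hosts_by_cluster(xml_bytes):
    """Map each CLUSTER name in a gmond dump to its sorted HOST names."""
    root = ET.fromstring(xml_bytes)
    return {
        cluster.get("NAME"): sorted(h.get("NAME") for h in cluster.iter("HOST"))
        for cluster in root.iter("CLUSTER")
    }


# Example (assumes a Gmond daemon is reachable at this address):
#   print(hosts_by_cluster(read_gmond_xml("127.0.0.1")))
```

    If the configuration took effect, the result should contain a stormcluster key that lists the nodes reporting to this Gmond daemon.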
  5. Edit the following lines in the ganglia.conf file, which is located at /etc/httpd/conf.d. We are editing the ganglia.conf file to enable access to the Ganglia UI from all machines.
    Alias /ganglia /usr/share/ganglia
    <Location /ganglia>
      Allow from all
    </Location>

    Note

    The ganglia.conf file can be found on the node where the Ganglia web frontend application is installed. In our case, the Ganglia web interface and the Gmetad server are installed on the same machine.

  6. Run the following commands to start the Ganglia Gmond, Gmetad, and Web UI processes:
    sudo service gmond start
    
    sudo setsebool -P httpd_can_network_connect 1
    sudo service gmetad start
    
    sudo service httpd stop
    sudo service httpd start
    
  7. Now, go to http://127.0.0.1/ganglia to verify the installation of Ganglia.

    Note

    Replace 127.0.0.1 with the IP address of the Ganglia web interface machine.

  8. Now, write a supervisor.json file on each supervisor node. jmxtrans uses this file to collect the JVM metrics of the Storm supervisor and publish them to Ganglia through the com.googlecode.jmxtrans.model.output.GangliaWriter output writer class, which converts the collected JVM metrics into the format used by Ganglia. The following is the content of the supervisor.json file:
    {
      "servers" : [ {
        "port" : "12346",
        "host" : "IP_OF_SUPERVISOR_MACHINE",
        "queries" : [ {
          "outputWriters": [{
            "@class":"com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {
              "groupName": "supervisor",
              "host": "IP_OF_GANGLIA_GMOND_SERVER",
              "port": "8649" }
          }],
          "obj": "java.lang:type=Memory",
          "resultAlias": "supervisor",
          "attr": ["ObjectPendingFinalizationCount"]
        },
        {
          "outputWriters": [{
            "@class":"com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {
              "groupName": "supervisor",
              "host": "IP_OF_GANGLIA_GMOND_SERVER",
              "port": "8649"
            }
          }],
          "obj": "java.lang:name=Copy,type=GarbageCollector",
          "resultAlias": "supervisor",
          "attr": [
            "CollectionCount",
            "CollectionTime"
          ]
        },
        {
          "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {
              "groupName": "supervisor",
              "host": "IP_OF_GANGLIA_GMOND_SERVER",
              "port": "8649"
            }
          }],
          "obj": "java.lang:name=Code Cache,type=MemoryPool",
          "resultAlias": "supervisor",
          "attr": [
            "CollectionUsageThreshold",
            "CollectionUsageThresholdCount",
            "UsageThreshold",
            "UsageThresholdCount"
          ]
        },
        {
          "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {
              "groupName": "supervisor",
              "host": "IP_OF_GANGLIA_GMOND_SERVER",
              "port": "8649"
            }
          }],
          "obj": "java.lang:type=Runtime",
          "resultAlias": "supervisor",
          "attr": [
            "StartTime",
            "Uptime"
          ]
        }],
        "numQueryThreads" : 2
      }]
    }

    Here, 12346 is the JMX port of the supervisor specified in the storm.yaml file.

    You need to replace the IP_OF_SUPERVISOR_MACHINE value with the IP address of the supervisor machine. If you have two supervisors in a cluster, then the supervisor.json file of node 1 contains the IP address of node 1, and the supervisor.json file of node 2 contains the IP address of node 2.

    You need to replace the IP_OF_GANGLIA_GMOND_SERVER value with the IP address of the Ganglia Gmond server.
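
    Because the four queries differ only in the MBean name and the attribute list, the file can also be generated rather than edited by hand on every node. The following Python sketch is one way to do that; build_jmxtrans_config, render, and their parameters are hypothetical names introduced here, and the IP addresses in the usage comment are placeholders.

```python
import json

GANGLIA_WRITER = "com.googlecode.jmxtrans.model.output.GangliaWriter"

# (JMX object name, attributes) pairs copied from the supervisor.json listing.
JVM_QUERIES = [
    ("java.lang:type=Memory", ["ObjectPendingFinalizationCount"]),
    ("java.lang:name=Copy,type=GarbageCollector",
     ["CollectionCount", "CollectionTime"]),
    ("java.lang:name=Code Cache,type=MemoryPool",
     ["CollectionUsageThreshold", "CollectionUsageThresholdCount",
      "UsageThreshold", "UsageThresholdCount"]),
    ("java.lang:type=Runtime", ["StartTime", "Uptime"]),
]


def build_jmxtrans_config(alias, jmx_host, jmx_port, gmond_host, gmond_port=8649):
    """Build the jmxtrans configuration for one JVM as a Python dict."""
    queries = []
    for obj, attrs in JVM_QUERIES:
        queries.append({
            "outputWriters": [{
                "@class": GANGLIA_WRITER,
                "settings": {
                    "groupName": alias,
                    "host": gmond_host,
                    "port": str(gmond_port),
                },
            }],
            "obj": obj,
            "resultAlias": alias,
            "attr": list(attrs),
        })
    return {"servers": [{
        "port": str(jmx_port),
        "host": jmx_host,
        "queries": queries,
        "numQueryThreads": 2,
    }]}


def render(config):
    """Serialize the configuration as indented JSON text."""
    return json.dumps(config, indent=2)


# Example with placeholder addresses:
#   text = render(build_jmxtrans_config("supervisor", "10.0.0.11", 12346, "10.0.0.5"))
#   open("supervisor.json", "w").write(text)
```

    Generating the file this way keeps the groupName and resultAlias values identical across all four queries, so the metrics land in a single group on the Ganglia UI.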

  9. Create a nimbus.json file on the Nimbus node. jmxtrans uses this file to collect the JVM metrics of the Storm Nimbus process and publish them to Ganglia using the com.googlecode.jmxtrans.model.output.GangliaWriter output writer class. The following are the contents of the nimbus.json file:
    {
      "servers" : [{
        "port" : "12345",
        "host" : "IP_OF_NIMBUS_MACHINE",
        "queries" : [
          { "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {
              "groupName": "nimbus",
              "host": "IP_OF_GANGLIA_GMOND_SERVER",
              "port": "8649"
            }
          }],
          "obj": "java.lang:type=Memory",
          "resultAlias": "nimbus",
          "attr": ["ObjectPendingFinalizationCount"]
        },
        {
          "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {
              "groupName": "nimbus",
              "host": "IP_OF_GANGLIA_GMOND_SERVER",
              "port": "8649"
            }
          }],
          "obj": "java.lang:name=Copy,type=GarbageCollector",
          "resultAlias": "nimbus",
          "attr": [
            "CollectionCount",
            "CollectionTime"
          ]
        },
        {
          "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {
              "groupName": "nimbus",
              "host": "IP_OF_GANGLIA_GMOND_SERVER",
              "port": "8649"
            }
          }],
          "obj": "java.lang:name=Code Cache,type=MemoryPool",
          "resultAlias": "nimbus",
          "attr": [
            "CollectionUsageThreshold",
            "CollectionUsageThresholdCount",
            "UsageThreshold",
            "UsageThresholdCount"
          ]
        },
        {
          "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {
              "groupName": "nimbus",
              "host": "IP_OF_GANGLIA_GMOND_SERVER",
              "port": "8649"
            }
          }],
          "obj": "java.lang:type=Runtime",
          "resultAlias": "nimbus",
          "attr": [
            "StartTime",
            "Uptime"
          ]
        }],
        "numQueryThreads" : 2
      } ]
    }

    Here, 12345 is the JMX port of the Nimbus machine specified in the storm.yaml file.

    You need to replace the IP_OF_NIMBUS_MACHINE value with the IP address of the Nimbus machine.

    You need to replace the IP_OF_GANGLIA_GMOND_SERVER value with the IP address of the Ganglia Gmond server.
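
    The JMX ports 12345 and 12346 are only reachable if the Nimbus and supervisor JVMs are started with remote JMX enabled. The storm.yaml entries are not shown in this section, so the following Python sketch is an assumption of what they might look like: it builds the standard com.sun.management.jmxremote JVM flags and renders them as nimbus.childopts and supervisor.childopts lines. Authentication and SSL are turned off here for brevity, which is only appropriate on a trusted network, and any flags already present in your childopts settings would need to be kept alongside these.

```python
def jmx_childopts(port):
    """Return JVM flags that open an unauthenticated, non-SSL JMX port."""
    flags = [
        "-Dcom.sun.management.jmxremote",
        f"-Dcom.sun.management.jmxremote.port={port}",
        "-Dcom.sun.management.jmxremote.ssl=false",
        "-Dcom.sun.management.jmxremote.authenticate=false",
    ]
    return " ".join(flags)


def storm_yaml_lines(nimbus_port=12345, supervisor_port=12346):
    """Render the two storm.yaml lines (assumed layout, not from the source)."""
    return [
        f'nimbus.childopts: "{jmx_childopts(nimbus_port)}"',
        f'supervisor.childopts: "{jmx_childopts(supervisor_port)}"',
    ]


# Example: print the lines so they can be merged into storm.yaml by hand.
#   print("\n".join(storm_yaml_lines()))
```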

  10. Run the following commands on each Storm node to start the jmxtrans process:
    cd /usr/share/jmxtrans/
    sudo ./jmxtrans.sh start PATH_OF_JSON_FILES
    

    Here, PATH_OF_JSON_FILES is the location of the supervisor.json and nimbus.json files.
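
    jmxtrans can only load files that are well-formed JSON, so it may be worth checking each file before starting the process. The following Python sketch uses the standard json module; check_json_text and check_json_file are illustrative names.

```python
import json


def check_json_text(text):
    """Return None if text is valid JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


def check_json_file(path):
    """Check one file on disk, for example supervisor.json or nimbus.json."""
    with open(path, encoding="utf-8") as handle:
        return check_json_text(handle.read())


# Example:
#   for name in ("supervisor.json", "nimbus.json"):
#       print(name, check_json_file(name) or "OK")
```

    A missing comma between two keys is reported with the line on which the parser expected the delimiter.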

  11. Now, go to the Ganglia page at http://127.0.0.1/ganglia to view the Storm metrics. The following screenshot shows what the Storm metrics look like:

    The Ganglia home page

  12. Perform the following steps to view the metrics of the Storm Nimbus and supervisor processes on the Ganglia UI:
    1. Open the Ganglia page.
    2. Now, click on the stormcluster link to view the metrics of the Storm cluster.
    3. The following screenshot shows the metrics of the Storm supervisor node:

      Supervisor metrics

    4. The following screenshot shows the metrics of the Storm Nimbus node:

    Nimbus metrics

In the following section, we will explain how you can store the data processed by Storm in the HBase database.
