Ganglia is a monitoring tool that collects metrics from the different types of processes that run on a cluster. In most deployments, Ganglia serves as the centralized monitoring tool that displays the metrics of all the processes running on a cluster. It is therefore worth enabling monitoring of the Storm cluster through Ganglia.
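Ganglia has three important components:
Gmond: the monitoring daemon that runs on each monitored node and collects its metrics.
Gmetad: the daemon that polls Gmond for metrics and stores them in RRD (round-robin database) files.
The web interface: the front end that displays the stored metrics as graphs.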
Storm doesn't have built-in support for monitoring a Storm cluster with Ganglia. However, you can enable it with jmxtrans. The jmxtrans tool connects to any JVM and fetches its metrics without requiring you to write a single line of code. The JVM metrics exposed via JMX can then be published to Ganglia by jmxtrans, so jmxtrans acts as a bridge between Storm and Ganglia.
The following diagram shows how jmxtrans is used between the Storm node and Ganglia:
Perform the following steps to set up jmxtrans and Ganglia:
Download and install the jmxtrans tool:
    wget https://jmxtrans.googlecode.com/files/jmxtrans-239-0.noarch.rpm
    sudo rpm -i jmxtrans-239-0.noarch.rpm
Install RRDtool and the Ganglia Gmond, Gmetad, and web packages:
    sudo yum -q -y install rrdtool
    sudo yum -q -y install ganglia-gmond
    sudo yum -q -y install ganglia-gmetad
    sudo yum -q -y install ganglia-web
Edit the following line in the gmetad.conf configuration file, which is located at /etc/ganglia, for the Gmetad process. We are editing this file to specify the name of the data source and the IP address of the Gmond node that Gmetad polls for that data source:
    data_source "stormcluster" 127.0.0.1
Edit the following lines in the gmond.conf configuration file, which is located at /etc/ganglia, for the Gmond process:
    cluster {
      name = "stormcluster"
      owner = "clusterOwner"
      latlong = "unspecified"
      url = "unspecified"
    }
    host {
      location = "unspecified"
    }
    udp_send_channel {
      host = 127.0.0.1
      port = 8649
      ttl = 1
    }
    udp_recv_channel {
      port = 8649
    }
Here, 127.0.0.1 is the IP address of the Storm node. You need to replace 127.0.0.1 with the actual IP address of the machine. We have mainly edited the following entries in the Gmond configuration file:
The udp_send channel
The udp_recv channel
Edit the following lines in the ganglia.conf file, which is located at /etc/httpd/conf.d. We are editing the ganglia.conf file to allow access to the Ganglia UI from all machines:
    Alias /ganglia /usr/share/ganglia
    <Location /ganglia>
      Allow from all
    </Location>
Start the Gmond, Gmetad, and web server processes:
    sudo service gmond start
    sudo setsebool -P httpd_can_network_connect 1
    sudo service gmetad start
    sudo service httpd stop
    sudo service httpd start
Go to http://127.0.0.1/ganglia to verify the installation of Ganglia.
Create a supervisor.json file on each supervisor node. With it, jmxtrans collects the JVM metrics of the Storm supervisor node and publishes them to Ganglia using the com.googlecode.jmxtrans.model.output.GangliaWriter output writer class, which processes the input JVM metrics and converts them into the format used by Ganglia. The following is the content of the supervisor.json file:
    {
      "servers": [{
        "port": "12346",
        "host": "IP_OF_SUPERVISOR_MACHINE",
        "queries": [{
          "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {"groupName": "supervisor", "host": "IP_OF_GANGLIA_GMOND_SERVER", "port": "8649"}
          }],
          "obj": "java.lang:type=Memory",
          "resultAlias": "supervisor",
          "attr": ["ObjectPendingFinalizationCount"]
        }, {
          "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {"groupName": "supervisor", "host": "IP_OF_GANGLIA_GMOND_SERVER", "port": "8649"}
          }],
          "obj": "java.lang:name=Copy,type=GarbageCollector",
          "resultAlias": "supervisor",
          "attr": ["CollectionCount", "CollectionTime"]
        }, {
          "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {"groupName": "supervisor", "host": "IP_OF_GANGLIA_GMOND_SERVER", "port": "8649"}
          }],
          "obj": "java.lang:name=Code Cache,type=MemoryPool",
          "resultAlias": "supervisor",
          "attr": ["CollectionUsageThreshold", "CollectionUsageThresholdCount", "UsageThreshold", "UsageThresholdCount"]
        }, {
          "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {"groupName": "supervisor", "host": "IP_OF_GANGLIA_GMOND_SERVER", "port": "8649"}
          }],
          "obj": "java.lang:type=Runtime",
          "resultAlias": "supervisor",
          "attr": ["StartTime", "Uptime"]
        }],
        "numQueryThreads": 2
      }]
    }
Here, 12346 is the JMX port of the supervisor specified in the storm.yaml file.
You need to replace the IP_OF_SUPERVISOR_MACHINE value with the IP address of the supervisor machine. If you have two supervisors in a cluster, then the supervisor.json file of node 1 contains the IP address of node 1, and the supervisor.json file of node 2 contains the IP address of node 2.
You need to replace the IP_OF_GANGLIA_GMOND_SERVER value with the IP address of the Ganglia Gmond server.
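Because every node needs its own copy of the file with different addresses, the substitution can be scripted. The following is only a minimal sketch and not part of jmxtrans: the function name render_jmxtrans_config and the example addresses are hypothetical, and it assumes the template still contains the placeholder names used in this section (it also handles the IP_OF_NIMBUS_MACHINE placeholder used in nimbus.json).

```shell
#!/bin/sh
# render_jmxtrans_config TEMPLATE JVM_HOST GMOND_HOST   (hypothetical helper)
# Prints TEMPLATE to stdout with the placeholder values replaced.
#   JVM_HOST   - address of the supervisor (or Nimbus) machine
#   GMOND_HOST - address of the Ganglia Gmond server
render_jmxtrans_config() {
    template=$1
    jvm_host=$2
    gmond_host=$3
    sed -e "s/IP_OF_SUPERVISOR_MACHINE/${jvm_host}/g" \
        -e "s/IP_OF_NIMBUS_MACHINE/${jvm_host}/g" \
        -e "s/IP_OF_GANGLIA_GMOND_SERVER/${gmond_host}/g" \
        "$template"
}
```

For example, render_jmxtrans_config supervisor.json 10.0.0.11 10.0.0.5 > supervisor-node1.json would write a copy for a supervisor at the made-up address 10.0.0.11 that reports to a Gmond server at 10.0.0.5.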
Create the nimbus.json file on the Nimbus node. With it, jmxtrans collects the JVM metrics of the Storm Nimbus process and publishes them to Ganglia using the com.googlecode.jmxtrans.model.output.GangliaWriter output writer class. The following is the content of the nimbus.json file:
    {
      "servers": [{
        "port": "12345",
        "host": "IP_OF_NIMBUS_MACHINE",
        "queries": [{
          "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {"groupName": "nimbus", "host": "IP_OF_GANGLIA_GMOND_SERVER", "port": "8649"}
          }],
          "obj": "java.lang:type=Memory",
          "resultAlias": "nimbus",
          "attr": ["ObjectPendingFinalizationCount"]
        }, {
          "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {"groupName": "nimbus", "host": "IP_OF_GANGLIA_GMOND_SERVER", "port": "8649"}
          }],
          "obj": "java.lang:name=Copy,type=GarbageCollector",
          "resultAlias": "nimbus",
          "attr": ["CollectionCount", "CollectionTime"]
        }, {
          "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {"groupName": "nimbus", "host": "IP_OF_GANGLIA_GMOND_SERVER", "port": "8649"}
          }],
          "obj": "java.lang:name=Code Cache,type=MemoryPool",
          "resultAlias": "nimbus",
          "attr": ["CollectionUsageThreshold", "CollectionUsageThresholdCount", "UsageThreshold", "UsageThresholdCount"]
        }, {
          "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.GangliaWriter",
            "settings": {"groupName": "nimbus", "host": "IP_OF_GANGLIA_GMOND_SERVER", "port": "8649"}
          }],
          "obj": "java.lang:type=Runtime",
          "resultAlias": "nimbus",
          "attr": ["StartTime", "Uptime"]
        }],
        "numQueryThreads": 2
      }]
    }
Here, 12345 is the JMX port of the Nimbus machine specified in the storm.yaml file.
You need to replace the IP_OF_NIMBUS_MACHINE value with the IP address of the Nimbus machine.
You need to replace the IP_OF_GANGLIA_GMOND_SERVER value with the IP address of the Ganglia Gmond server.
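The two port values used in the JSON files only work if the Nimbus and supervisor JVMs actually expose JMX on them. As a sketch of what the corresponding storm.yaml entries might look like, assuming that unauthenticated, non-SSL JMX is acceptable on your network (an assumption made here for brevity, not something this section prescribes):

```yaml
# storm.yaml (fragment): expose JMX so that jmxtrans can connect.
# Assumption: JMX authentication and SSL are disabled for simplicity.
nimbus.childopts: "-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=12345 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
supervisor.childopts: "-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=12346 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
```

Setting these keys replaces any value they already have, so keep whatever heap or GC options you already pass to the daemons in the same string.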
Run the following commands to start the jmxtrans process:
    cd /usr/share/jmxtrans/
    sudo ./jmxtrans.sh start PATH_OF_JSON_FILES
Here, PATH_OF_JSON_FILES is the location of the supervisor.json and nimbus.json files.
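Hand-edited JSON files are easy to break, for example by leaving a placeholder in place or dropping a comma, so it can help to check them before starting jmxtrans. The following is a hedged sketch rather than a jmxtrans feature: check_jmxtrans_json is a hypothetical helper name, and the syntax check runs only if python3 happens to be installed (it relies on Python's standard json.tool module).

```shell
#!/bin/sh
# check_jmxtrans_json FILE   (hypothetical helper)
# Returns 0 if FILE contains no unreplaced IP_OF_* placeholders and, when
# python3 is available, parses as valid JSON; returns 1 otherwise.
check_jmxtrans_json() {
    file=$1
    # Any remaining IP_OF_* token means a placeholder was not replaced.
    if grep -n 'IP_OF_[A-Z_]*' "$file" >&2; then
        echo "unreplaced placeholder(s) in $file" >&2
        return 1
    fi
    # Optional syntax check; skipped when python3 is not installed.
    if command -v python3 >/dev/null 2>&1; then
        if ! python3 -m json.tool "$file" >/dev/null 2>&1; then
            echo "invalid JSON in $file" >&2
            return 1
        fi
    fi
    return 0
}
```

A typical use would be to run check_jmxtrans_json on each file under PATH_OF_JSON_FILES and start jmxtrans only if every call succeeds.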
Go to http://127.0.0.1/ganglia to view the Storm metrics. The following screenshot shows what the Storm metrics look like:
In the following section, we will explain how you can store the data processed by Storm in the HBase database.