Cloudera Manager uses roles to define the configuration of the different hosts in a cluster. Each role carries a set of properties and configurations that can be applied to a node, and the roles applied to a node determine which Hadoop services run on that node.
The following are a few of the roles that Cloudera Manager can apply to a host:
Balancer: This role is responsible for balancing the blocks across the different nodes in the cluster
DataNode: This role defines all the configurations required to start a datanode on the host
NameNode: This role defines all the configurations required to start a namenode on the host
SecondaryNameNode: This role defines all the configurations required to start a secondary namenode on the host
JobTracker: This role defines all the configurations required to start a jobtracker on the host
TaskTracker: This role defines all the configurations required to start a tasktracker on the host
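The role list above can be summarized as a small lookup table. The following Python sketch is illustrative only: the ROLE_TO_SERVICE table and the roles_by_service helper are hypothetical names and are not part of Cloudera Manager. It groups the listed roles by the Hadoop service they belong to, which mirrors how the roles appear under each service's Instances page later in this section.

```python
from collections import defaultdict

# Role -> owning service, for the roles listed above.
# This is an illustrative table, not an API provided by Cloudera Manager.
ROLE_TO_SERVICE = {
    "Balancer": "HDFS",
    "DataNode": "HDFS",
    "NameNode": "HDFS",
    "SecondaryNameNode": "HDFS",
    "JobTracker": "MapReduce",
    "TaskTracker": "MapReduce",
}


def roles_by_service(role_to_service):
    """Group role names by the service that owns them."""
    grouped = defaultdict(list)
    for role, service in role_to_service.items():
        grouped[service].append(role)
    # Sort the role names so the result is stable and easy to read.
    return {service: sorted(roles) for service, roles in grouped.items()}


if __name__ == "__main__":
    for service, roles in sorted(roles_by_service(ROLE_TO_SERVICE).items()):
        print(service, "->", ", ".join(roles))
```

Grouping by service is useful because a role is always added from the service that owns it: HDFS for the DataNode role and MapReduce for the TaskTracker role.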
Adding a role instance to a host
To add a role instance to a host, navigate to Hosts from the Cloudera Manager toolbar. You should see a screen as shown in the following screenshot:
As you can see, the node4.hcluster host does not have any roles assigned to it. Let's say that we want the node4.hcluster host to run datanode and tasktracker daemons. To do this, we need to add the DataNode and TaskTracker roles to this host.
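Working out which roles a host still needs is a simple comparison between what is assigned and what is wanted. The following Python sketch shows the idea. The ASSIGNED inventory is made up for illustration (only node4.hcluster and its empty role list come from the text; node1.hcluster and its roles are hypothetical), and roles_to_add is a hypothetical helper, not a Cloudera Manager function.

```python
# Hypothetical inventory of host -> roles already assigned.
# Only node4.hcluster (with no roles) comes from the text above.
ASSIGNED = {
    "node1.hcluster": ["NameNode", "JobTracker"],  # assumed for illustration
    "node4.hcluster": [],
}


def roles_to_add(assigned, host, wanted):
    """Return the wanted roles that the host does not have yet, in order."""
    current = set(assigned.get(host, ()))
    return [role for role in wanted if role not in current]


if __name__ == "__main__":
    missing = roles_to_add(ASSIGNED, "node4.hcluster", ["DataNode", "TaskTracker"])
    print("Roles to add to node4.hcluster:", ", ".join(missing))
```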
Adding a DataNode role to a host
The following are the steps to add the DataNode role to a host:
Navigate to Cloudera Manager's Clusters menu and select HDFS as shown in the following screenshot:
To add the DataNode role to the node4.hcluster host, select Instances for HDFS as shown in the following screenshot:
As you can see, all the roles related to the HDFS service are listed, along with the nodes to which they have been applied.
Click on Add to add a new role instance. You should see the screen to add a role instance as shown in the following screenshot:
Click on Select hosts under the DataNode section and select Custom... as shown in the following screenshot:
Next, you should see the host selection screen as shown in the following screenshot:
Select node4 and click on OK.
Click on Continue on the next screen to bring up the Review Changes screen as shown in the following screenshot:
Click on Finish to complete the steps of adding the DataNode role. You should see the Role Instances screen with the newly added DataNode role as shown in the following screenshot:
To start the DataNode role, select the checkbox for the datanode (node4) item, click on the Actions for Selected menu button, and select Start as shown in the following screenshot:
You should see a dialog box as shown in the following screenshot. Click on Start to start the datanode.
The datanode should start successfully as shown in the following screenshot:
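The same two actions, adding a role instance and then starting it, can also be scripted against Cloudera Manager's REST API rather than performed in the web console. The following Python sketch only builds the two requests and does not send them. Treat everything in it as an assumption to verify against the API documentation for your Cloudera Manager version: the base URL, the API version (v1), the endpoint paths, the JSON body shapes, and the cluster, service, and role names used in the example are all hypothetical.

```python
import json
from urllib.parse import quote

# Hypothetical Cloudera Manager address and API version; adjust for your setup.
API_BASE = "http://cm-host.example:7180/api/v1"


def _service_url(cluster, service):
    """Build the URL prefix for a service, escaping names that contain spaces."""
    return "{}/clusters/{}/services/{}".format(
        API_BASE, quote(cluster, safe=""), quote(service, safe="")
    )


def create_role_request(cluster, service, role_name, role_type, host_id):
    """Describe the request that adds one role instance to a host.

    The path and body layout are assumptions about the REST API.
    """
    body = {
        "items": [
            {
                "name": role_name,
                "type": role_type,
                "hostRef": {"hostId": host_id},
            }
        ]
    }
    return "POST", _service_url(cluster, service) + "/roles", json.dumps(body)


def start_roles_request(cluster, service, role_names):
    """Describe the request that starts the named role instances."""
    body = {"items": list(role_names)}
    url = _service_url(cluster, service) + "/roleCommands/start"
    return "POST", url, json.dumps(body)


if __name__ == "__main__":
    # All names below are made up for illustration.
    for request in (
        create_role_request("Cluster 1", "hdfs1", "hdfs1-DATANODE-node4",
                            "DATANODE", "node4.hcluster"),
        start_roles_request("Cluster 1", "hdfs1", ["hdfs1-DATANODE-node4"]),
    ):
        print(*request)
```

Keeping request construction separate from sending makes the payloads easy to inspect before anything is changed on a live cluster. An HTTP client with the appropriate administrator credentials would be needed to actually submit them.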
Adding a TaskTracker role to a host
The following are the steps to add the TaskTracker role to a host:
The TaskTracker role is part of the MapReduce service. Navigate to Cloudera Manager's Clusters menu, select the MapReduce service, and then select the Instances tab as shown in the following screenshot:
Click on Add to add a new role instance. You should see the screen to add a role instance as shown in the following screenshot:
Click on Select hosts under the TaskTracker section and select Custom... as shown in the following screenshot:
On the next screen, select node4 as shown in the following screenshot and click on OK:
On the next screen, click on Continue to bring up the Review Changes screen as shown in the following screenshot. Click on the Finish button.
To start the TaskTracker role on this node, select the checkbox for tasktracker (node4), click on the Actions for Selected menu button, and click on Start as shown in the following screenshot:
Using the preceding steps, we have successfully added the DataNode and TaskTracker roles to node4.hcluster. Similarly, you can add any role you want to the nodes of a cluster managed by Cloudera Manager.