Node types

A Hadoop cluster has one master node and one or more slave nodes. The master node runs the NameNode, which tracks which of the other nodes are healthy and holds key metadata about the cluster, such as where each file's data is stored. The cluster also has one ResourceManager, which may or may not run on the same server as the NameNode. You will learn more about the NameNode in the HDFS section and more about the ResourceManager in the YARN section.
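To make the NameNode's bookkeeping concrete, here is a minimal sketch in Python. It is a toy model, not Hadoop's actual API: the class `ToyNameNode`, its methods, the host names, and the 30-second timeout are all hypothetical and chosen only for illustration. It assumes workers report in periodically and that a node counts as healthy while its last report is recent enough.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Toy model of NameNode bookkeeping; not Hadoop's real API."""

    heartbeat_timeout: float = 30.0  # seconds; illustrative value only
    last_heartbeat: dict = field(default_factory=dict)   # node -> last report time
    block_locations: dict = field(default_factory=dict)  # block id -> set of nodes

    def heartbeat(self, node, now):
        # Record that a worker reported in at time `now`.
        self.last_heartbeat[node] = now

    def report_block(self, node, block_id):
        # Record that a worker holds a copy of the given block.
        self.block_locations.setdefault(block_id, set()).add(node)

    def healthy_nodes(self, now):
        # A node is healthy if its last heartbeat is within the timeout.
        return {
            node
            for node, seen in self.last_heartbeat.items()
            if now - seen <= self.heartbeat_timeout
        }

    def locate(self, block_id, now):
        # Return only the healthy nodes that hold the block.
        holders = self.block_locations.get(block_id, set())
        return sorted(holders & self.healthy_nodes(now))


nn = ToyNameNode()
nn.heartbeat("worker-1", now=0.0)
nn.heartbeat("worker-2", now=0.0)
nn.report_block("worker-1", "blk_1")
nn.report_block("worker-2", "blk_1")

# Only worker-1 keeps reporting; worker-2 goes silent.
nn.heartbeat("worker-1", now=40.0)
live = nn.locate("blk_1", now=45.0)
print(live)
```

In this sketch, a client asking where `blk_1` lives at time 45 is pointed only at `worker-1`, because `worker-2` has not reported within the timeout.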

The rest of the machines in the cluster act as both DataNode and NodeManager. These are the workers: they store the distributed data and run the distributed computation. Slave nodes can serve different roles in the cluster. Most of them are referred to as data nodes. Some slave nodes are instead meant primarily to interface with the outside network; these are called edge nodes.

Some other services on the cluster, such as the Web App Proxy Server and the MapReduce Job History Server, usually run either on dedicated nodes or on shared nodes, typically the edge nodes. Where to place them depends on the load requirements of the services that share a node's resources.
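The layout described in this section can be sketched as a simple mapping from hosts to the roles they run. This is an illustrative model only: the host names, the role labels, and the helper functions `hosts_running` and `check_layout` are hypothetical, and the checks mirror just the simple layout described here (one NameNode, one ResourceManager, workers running both DataNode and NodeManager).

```python
# Hypothetical host names; role names follow the text above.
layout = {
    "master-1": {"NameNode", "ResourceManager"},
    "worker-1": {"DataNode", "NodeManager"},
    "worker-2": {"DataNode", "NodeManager"},
    "edge-1": {"WebAppProxyServer", "JobHistoryServer"},
}


def hosts_running(layout, role):
    """Return the sorted list of hosts that run the given role."""
    return sorted(host for host, roles in layout.items() if role in roles)


def check_layout(layout):
    """Return a list of problems with the layout; empty if none are found."""
    problems = []
    # The text describes one NameNode and one ResourceManager per cluster.
    for role in ("NameNode", "ResourceManager"):
        count = len(hosts_running(layout, role))
        if count != 1:
            problems.append(f"expected exactly one {role}, found {count}")
    # Workers are described as acting as both DataNode and NodeManager.
    for host, roles in layout.items():
        if ("DataNode" in roles) != ("NodeManager" in roles):
            problems.append(f"{host} runs only one of DataNode/NodeManager")
    return problems


problems = check_layout(layout)
workers = hosts_running(layout, "DataNode")
print(problems, workers)
```

Moving the ResourceManager to its own host, or the Job History Server to a dedicated node, only changes the mapping; the checks stay the same.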
