Time for action – killing a DataNode process

First, we'll kill a DataNode. Recall that the DataNode process runs on each host in the HDFS cluster and is responsible for managing blocks within the HDFS filesystem. Because Hadoop, by default, uses a replication factor of 3 for blocks, we should expect a single DataNode failure to have no direct impact on availability; rather, it will result in some blocks temporarily falling below the replication threshold. Execute the following steps to kill a DataNode process:

  1. First, check the original status of the cluster and confirm that everything is healthy. We'll use the dfsadmin command for this:
    $ hadoop dfsadmin -report
    Configured Capacity: 81376493568 (75.79 GB)
    Present Capacity: 61117323920 (56.92 GB)
    DFS Remaining: 59576766464 (55.49 GB)
    DFS Used: 1540557456 (1.43 GB)
    DFS Used%: 2.52%
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    -------------------------------------------------
    Datanodes available: 4 (4 total, 0 dead)
    Name: 10.0.0.102:50010
    Decommission Status : Normal
    Configured Capacity: 20344123392 (18.95 GB)
    DFS Used: 403606906 (384.91 MB)
    Non DFS Used: 5063119494 (4.72 GB)
    DFS Remaining: 14877396992(13.86 GB)
    DFS Used%: 1.98%
    DFS Remaining%: 73.13%
    Last contact: Sun Dec 04 15:16:27 PST 2011
    
    

    Now log onto one of the nodes and use the jps command to determine the process ID of the DataNode process:

    $ jps
    2085 TaskTracker
    2109 Jps
    1928 DataNode
    
  2. Use the process ID (PID) of the DataNode process and kill it:
    $ kill -9 1928
    
  3. Check that the DataNode process is no longer running on the host:
    $ jps
    2085 TaskTracker
    
  4. Check the status of the cluster again by using the dfsadmin command:
    $ hadoop dfsadmin -report
    Configured Capacity: 81376493568 (75.79 GB)
    Present Capacity: 61117323920 (56.92 GB)
    DFS Remaining: 59576766464 (55.49 GB)
    DFS Used: 1540557456 (1.43 GB)
    DFS Used%: 2.52%
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    -------------------------------------------------
    Datanodes available: 4 (4 total, 0 dead)
    
    
  5. The key lines to watch are those reporting on blocks, live nodes, and the last contact time for each node. Once the last contact time for the dead node approaches 10 minutes, run the command more frequently (or use the small watch loop sketched after these steps) until the block and live node values change:
    $ hadoop dfsadmin -report
    Configured Capacity: 61032370176 (56.84 GB)
    Present Capacity: 46030327050 (42.87 GB)
    DFS Remaining: 44520288256 (41.46 GB)
    DFS Used: 1510038794 (1.41 GB)
    DFS Used%: 3.28%
    Under replicated blocks: 12
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    -------------------------------------------------
    Datanodes available: 3 (4 total, 1 dead)
    
    
  6. Repeat the process until the count of under-replicated blocks is once again 0:
    $ hadoop dfsadmin -report
    
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    -------------------------------------------------
    Datanodes available: 3 (4 total, 1 dead)
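
    If you would rather not re-run the report by hand, a small watch loop such as the following can poll the report every 30 seconds and print only the interesting lines. This is just a convenience sketch; the interval and the grep pattern are arbitrary choices rather than anything required by Hadoop:

    #!/bin/bash
    # Poll the HDFS status report every 30 seconds and print only the
    # block health, node count, and last contact lines.
    while true
    do
        date
        hadoop dfsadmin -report | grep -E "Under replicated|corrupt|Missing|Datanodes available|Last contact"
        echo "-----"
        sleep 30
    done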
    
    

What just happened?

The high-level story is pretty straightforward; Hadoop recognized the loss of a node and worked around the problem. However, quite a lot is going on to make that happen.

When we killed the DataNode process, the process on that host was no longer available to serve or receive data blocks as part of the read/write operations. However, we were not actually accessing the filesystem at the time, so how did the NameNode process know this particular DataNode was dead?

NameNode and DataNode communication

The answer lies in the constant communication between the NameNode and DataNode processes that we have alluded to once or twice but never really explained. This occurs through a constant series of heartbeat messages from the DataNode reporting on its current state and the blocks it holds. In return, the NameNode gives instructions to the DataNode, such as notification of the creation of a new file or an instruction to retrieve a block from another node.

It all begins when the NameNode process starts up and begins receiving status messages from the DataNodes. Recall that each DataNode knows the location of its NameNode and will continuously send status reports. These messages list the blocks held by each DataNode and from this information, the NameNode is able to construct a complete mapping that relates files and directories to the blocks that make them up and the nodes on which those blocks are stored.
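
You can see part of this mapping for yourself with the fsck tool, which asks the NameNode to report the blocks behind each file under a path and the DataNodes holding each replica. The path below is only an example; point it at a file or directory that exists on your cluster:

    # Report the file-to-block-to-DataNode mapping under an example path.
    $ hadoop fsck /user/hadoop -files -blocks -locations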

The NameNode process monitors the last time it received a heartbeat from each DataNode and after a threshold is reached, it assumes the DataNode is no longer functional and marks it as dead.

Note

The exact threshold after which a DataNode is assumed to be dead is not configurable as a single HDFS property. Instead, it is calculated from several other properties, such as those defining the heartbeat interval. As we'll see later, things are a little easier in the MapReduce world, as the timeout for TaskTrackers is controlled by a single configuration property.
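
For reference, a commonly quoted Hadoop 1.x calculation combines heartbeat.recheck.interval (specified in milliseconds) with dfs.heartbeat.interval (specified in seconds). Treat the property names and default values in this sketch as assumptions to verify against your own version rather than guarantees:

    # Assumed Hadoop 1.x defaults; verify against your configuration.
    RECHECK_INTERVAL_SECS=300    # heartbeat.recheck.interval (300000 ms by default)
    HEARTBEAT_INTERVAL_SECS=3    # dfs.heartbeat.interval (3 seconds by default)
    TIMEOUT_SECS=$((2 * RECHECK_INTERVAL_SECS + 10 * HEARTBEAT_INTERVAL_SECS))
    echo "DataNode marked dead after ${TIMEOUT_SECS} seconds"    # 630 seconds, roughly 10.5 minutes

This lines up with the delay of around 10 minutes observed in step 5 before the node was marked as dead.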

Once a DataNode is marked as dead, the NameNode process determines which blocks were held on that node and have therefore fallen below their replication target. In the default case, each block held on the killed node was one of three replicas, so each of those blocks now has only two replicas across the cluster.

In the preceding example, we captured the state when 12 blocks were still under-replicated, that is, they did not have enough replicas across the cluster to meet the replication target. When the NameNode process determines the under-replicated blocks, it assigns other DataNodes to copy them from the hosts where the existing replicas reside. In this case, we only had to re-replicate a very small number of blocks; in a live cluster, the failure of a node can result in a period of high network traffic as the affected blocks are brought back up to their replication factor.

Note that if a failed node returns to the cluster, some blocks will temporarily have more than the required number of replicas; in such a case, the NameNode process will send instructions to remove the surplus replicas. The specific replica to be deleted is chosen randomly, so the returned node will end up retaining some of its blocks and deleting the others.
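
If you want to bring the killed DataNode back to see this behaviour, one option is to start the daemon directly on that host. The script path assumes a standard Hadoop 1.x installation with HADOOP_HOME set; adjust it to match your environment:

    # Run on the host where the DataNode was killed.
    $ $HADOOP_HOME/bin/hadoop-daemon.sh start datanode
    $ jps    # the DataNode process should reappear in the listing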

Have a go hero – NameNode log delving

We configured the NameNode process to log all its activities. Have a look through these very verbose logs and attempt to identify the replication requests being sent.
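
A grep over the NameNode log is a reasonable starting point. The log location below assumes the default layout under $HADOOP_HOME/logs, and the search term is only a guess at the wording of the replication messages, so broaden it if nothing turns up:

    # Search the NameNode log for block replication activity.
    $ grep -i "replicat" $HADOOP_HOME/logs/hadoop-*-namenode-*.log | less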

The final output shows the status after the under-replicated blocks have been copied to the live nodes. The cluster is down to only three live nodes but there are no under-replicated blocks.

Tip

A quick way to restart the dead nodes across all hosts is to use the start-all.sh script. It will attempt to start everything but is smart enough to detect the running services, which means you get the dead nodes restarted without the risk of duplicates.
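
For example, run it from the host where you normally start the cluster; this assumes the Hadoop bin directory is on your PATH (otherwise use the full path under HADOOP_HOME):

    # Daemons that are already running are left alone; only missing ones are started.
    $ start-all.sh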
