Time for action – swapping to a new NameNode host

In the following steps, we assume the new configuration files are kept on an NFS share mounted at /share/backup; change the paths to match wherever you have placed your files. You may also need a different string to grep for; we use a portion of the new NameNode's IP address that we know is not shared with any other host in the cluster (for example, grepping for 110 if the new host is the only one whose address contains that octet).

  1. Log on to the current NameNode host and shut down the cluster.
    $ stop-all.sh
    
  2. Halt the host that runs the NameNode.
    $ sudo poweroff
    
  3. Log on to the new NameNode host and confirm the new configuration files have the correct NameNode location.
    $ grep 110 /share/backup/*.xml
    
  4. On the new host, first copy across the slaves file.
    $ cp /share/backup/slaves Hadoop/conf
    
  5. Now copy across the updated configuration files.
    $ cp /share/backup/*site.xml Hadoop/conf
    
  6. Remove any old NameNode data from the local filesystem.
    $ rm -f /var/Hadoop/dfs/name/*
    
  7. Copy the updated configuration files to every node in the cluster.
    $ slaves.sh cp /share/backup/*site.xml Hadoop/conf
    
  8. Ensure each node now has the configuration files pointing to the new NameNode.
    $ slaves.sh grep 110 Hadoop/conf/*site.xml
    
  9. Start the cluster.
    $ start-all.sh
    
  10. Check that HDFS is healthy from the command line (a fuller health check is sketched just after these steps).
    $ hadoop fs -ls /
    
  11. Verify that HDFS is accessible from the web UI.
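
As a slightly fuller health check than the simple directory listing in step 10, you can also ask the NameNode for a cluster report. This is a minimal sketch using standard Hadoop 1.x commands; the exact output depends on your cluster.

    $ hadoop fs -ls /
    $ hadoop dfsadmin -report

The dfsadmin report lists configured and remaining capacity along with each live DataNode; if fewer nodes appear than you expect, check that the updated configuration files really did reach every host.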

What just happened?

First, we shut down the cluster. This is a little unrepresentative, as most failures see the NameNode die in a much less friendly way, but we do not want to discuss filesystem corruption until later in the chapter.

We then shut down the old NameNode host. Though not strictly necessary, this is a good way of ensuring that nothing accesses the old host and gives you a misleading view of how well the migration has gone.

Before copying across the files, we take a quick look at core-site.xml and hdfs-site.xml to ensure they hold the correct values, in particular the fs.default.name property in core-site.xml, which must point to the new NameNode host.
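
As a sketch of what that check might look like, the following grep shows the relevant entry in the backup copy of core-site.xml; the 192.168.1.110 address and port 9000 are purely illustrative, so substitute your own values:

    $ grep -A1 fs.default.name /share/backup/core-site.xml
        <name>fs.default.name</name>
        <value>hdfs://192.168.1.110:9000</value>

If the value still names the old host, fix the backup copy before going any further.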

We then prepare the new host by first copying across the slaves file and the updated configuration files, and then removing any old NameNode data from the local directory. Remember the earlier caution about being very careful with this step.
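
Before deleting anything, it can be worth confirming which directory the new configuration actually uses for NameNode data. The following is a minimal sketch; dfs.name.dir is the standard Hadoop 1.x property, but the /var/Hadoop/dfs/name value shown is simply the path assumed in these steps:

    $ grep -A1 dfs.name.dir Hadoop/conf/hdfs-site.xml
        <name>dfs.name.dir</name>
        <value>/var/Hadoop/dfs/name</value>

Only once you are sure this path holds stale data, and not the sole copy of a live filesystem image, should you run the rm command from step 6.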

Next, we use the slaves.sh script to get each host in the cluster to copy across the new configuration files. We know our new NameNode host is the only one with 110 in its IP address, so we grep for that in the files to ensure all are up-to-date (obviously, you will need to use a different pattern for your system).
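
The slaves.sh script simply runs the given command over SSH on every host listed in the slaves file, typically prefixing each line of output with the host it came from. A minimal sketch of what the verification pass might look like, where the worker hostnames, address, and port are purely illustrative:

    $ slaves.sh grep 110 Hadoop/conf/*site.xml
    worker1: Hadoop/conf/core-site.xml:    <value>hdfs://192.168.1.110:9000</value>
    worker2: Hadoop/conf/core-site.xml:    <value>hdfs://192.168.1.110:9000</value>
    worker3: Hadoop/conf/core-site.xml:    <value>hdfs://192.168.1.110:9000</value>

Any host that is missing from the output, or that still shows the old address, has stale configuration and should be fixed before the cluster is restarted.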

At this stage, all should be well; we start the cluster and access it via both the command-line tools and the web UI to confirm it is running as expected.

Don't celebrate quite yet!

Remember that even after a successful migration to a new NameNode, you are not done quite yet. You still need to decide how to handle the SecondaryNameNode and which host would become the designated new NameNode should the newly migrated one itself fail. To be ready for that, you will need to run through the "Be prepared" checklist mentioned earlier once more and act accordingly.

Note

Do not forget to consider the chance of correlated failures. Investigate the cause of the NameNode host failure in case it is the start of a bigger problem.

What about MapReduce?

We did not mention moving the JobTracker, as that is a much less painful process, as shown in Chapter 6, When Things Break. If your NameNode and JobTracker run on the same host, you will need to modify the preceding approach by also keeping an updated copy of mapred-site.xml that has the new host's location in the mapred.job.tracker property.
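
In that case, a check mirroring step 3 might look like the following sketch, where the address and port are again illustrative only:

    $ grep -A1 mapred.job.tracker /share/backup/mapred-site.xml
        <name>mapred.job.tracker</name>
        <value>192.168.1.110:9001</value>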

Have a go hero – swapping to a new NameNode host

Perform a migration of both the NameNode and JobTracker from one host to another.
