Time for action – adding an additional fsimage location

Let's now configure our NameNode to simultaneously write multiple copies of fsimage to give us our desired data resilience. To do this, we require an NFS-exported directory.

  1. Ensure the cluster is stopped.
    $ stop-all.sh
    
  2. Add the following property to Hadoop/conf/core-site.xml, modifying the second path to point to an NFS-mounted location to which the additional copy of NameNode data can be written.
    <property>
      <name>dfs.name.dir</name>
      <value>${hadoop.tmp.dir}/dfs/name,/share/backup/namenode</value>
    </property>
  3. Delete any existing contents of the newly added directory.
    $ rm -rf /share/backup/namenode/*
    
  4. Start the cluster.
    $ start-all.sh
    
  5. Verify that fsimage is being written to both the specified locations by running the md5sum command against the fsimage file in each location (adjust the following paths to match your configured locations):
    $ md5sum /var/hadoop/dfs/name/image/fsimage
    a25432981b0ecd6b70da647e9b94304a  /var/hadoop/dfs/name/image/fsimage
    $ md5sum /share/backup/namenode/image/fsimage
    a25432981b0ecd6b70da647e9b94304a  /share/backup/namenode/image/fsimage
    

What just happened?

Firstly, we ensured the cluster was stopped; though changes to the core configuration files are not reread by a running cluster, it's a good habit to get into in case that capability is ever added to Hadoop.

We then added a new property to our cluster configuration, specifying a value for the dfs.name.dir property. This property takes a comma-separated list of locations, and the NameNode writes fsimage to each of them. Note how the hadoop.tmp.dir property discussed earlier is de-referenced, much as a variable would be in a Unix shell. This syntax allows us to base property values on others and to inherit changes when the parent properties are updated.
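
As an illustration of this de-referencing, if hadoop.tmp.dir were set to /var/hadoop (consistent with the paths used in the checksum example above, though your value may differ), the first dfs.name.dir entry would automatically follow it:

    <property>
      <name>hadoop.tmp.dir</name>
      <value>/var/hadoop</value>
    </property>
    <property>
      <name>dfs.name.dir</name>
      <value>${hadoop.tmp.dir}/dfs/name,/share/backup/namenode</value>
    </property>

With this configuration, the local copy of the NameNode data would live under /var/hadoop/dfs/name, and changing hadoop.tmp.dir later would move it accordingly without touching dfs.name.dir.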

Tip

Do not forget all required locations

The default value for this property is ${hadoop.tmp.dir}/dfs/name. When adding an additional value, remember to explicitly include the default one as well, as shown before; otherwise, only the single new value will be used for the property.
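
For contrast, a configuration like the following (shown purely to illustrate the mistake) would cause fsimage to be written only to the NFS directory, silently dropping the default local copy:

    <!-- Incorrect: the default ${hadoop.tmp.dir}/dfs/name location is lost -->
    <property>
      <name>dfs.name.dir</name>
      <value>/share/backup/namenode</value>
    </property>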

Before starting the cluster, we ensure the new directory exists and is empty. If the directory doesn't exist, the NameNode will fail to start as should be expected. If, however, the directory was previously used to store NameNode data, Hadoop will also fail to start as it will identify that both directories contain different NameNode data and it does not know which one is correct.
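
A minimal way of preparing the new location before starting the cluster, assuming the paths used in the earlier steps, might be:

    # Create the NFS-mounted directory if it does not already exist
    $ mkdir -p /share/backup/namenode
    # Confirm it is empty before the NameNode first starts
    $ ls -A /share/backup/namenode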

Be careful here! If you are experimenting with various NameNode data locations or swapping back and forth between hosts, you really do not want to accidentally delete the contents of the wrong directory.

After starting the HDFS cluster, we wait for a moment and then use MD5 cryptographic checksums to verify that both locations contain the identical fsimage.

Where to write the fsimage copies

The recommendation is to write fsimage to at least two locations, one of which should be on a remote (such as an NFS) filesystem, as in the previous example. fsimage is updated only periodically, so the filesystem does not need high performance.
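
How the remote filesystem is made available will vary from site to site. As a rough sketch, assuming an NFS server named nfsserver exporting /export/backup (both names are purely illustrative), the mount might look like this:

    # Mount the NFS export on the NameNode host (server and export path are illustrative)
    $ sudo mount -t nfs nfsserver:/export/backup /share/backup
    # Or add an /etc/fstab entry so the mount survives reboots
    nfsserver:/export/backup  /share/backup  nfs  defaults  0 0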

In our earlier discussion regarding the choice of hardware, we alluded to other considerations for the NameNode host. Because of the criticality of fsimage, it may be worthwhile to write it to more than one local disk, to invest in disks with higher reliability, or even to write fsimage to a RAID array. If the host fails, using the copy written to the remote filesystem will be the easiest option; but just in case that has also experienced problems, it's good to have the option of pulling another disk from the dead host and using it on another machine to recover the data.

Swapping to another NameNode host

We have ensured that fsimage is written to multiple locations and this is the single most important prerequisite for managing a swap to a different NameNode host. Now we need to actually do it.

This is not something you should do on a production cluster, and certainly not when trying it for the first time; even with experience it is not a risk-free process. Do practice on other clusters, though, so you have an idea of what you will do when disaster strikes.

Having things ready before disaster strikes

You don't want to be exploring this topic for the first time when you need to recover the production cluster. There are several things to do in advance that will make disaster recovery much less painful, not to mention possible:

  • Ensure the NameNode is writing the fsimage to multiple locations, as done before.
  • Decide which host will be the new NameNode location. If this is a host currently being used for a DataNode and TaskTracker, ensure it has the right hardware needed to host the NameNode and that the reduction in cluster performance due to the loss of these workers won't be too great.
  • Make a copy of the core-site.xml and hdfs-site.xml files, place them (ideally) on an NFS location, and update them to point to the new host (see the sketch after this list). Any time you modify the current configuration files, remember to make the same changes to these copies.
  • Copy the slaves file from the NameNode onto either the new host or the NFS share. Also, make sure you keep it updated.
  • Know how you will handle a subsequent failure in the new host. How quickly can you likely repair or replace the original failed host? Which host will be the location of the NameNode (and SecondaryNameNode) in the interim?
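
As a rough sketch of the last few points, assuming the NFS share used earlier and an illustrative /share/backup/recovery directory, the standby copies could be maintained along these lines:

    # Keep recovery copies of the key configuration files on the NFS share
    $ mkdir -p /share/backup/recovery
    $ cp Hadoop/conf/core-site.xml Hadoop/conf/hdfs-site.xml Hadoop/conf/slaves /share/backup/recovery
    # Edit the copies so that fs.default.name and any other host-specific
    # settings refer to the planned replacement NameNode host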

Ready? Let's do it!
