NameNode

The NameNode is the primary node of an HDFS cluster. Its main responsibility is to keep track of every file that is part of the distributed filesystem. Think of it as a mega cache of metadata, holding information on each block of each file in the system.
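To make the "cache of metadata" idea concrete, here is a minimal sketch of the kind of bookkeeping described above. It is a toy model, not the Hadoop API: the class `ToyNameNode`, its methods, the path `/logs/app.log`, and the block and DataNode names are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Hypothetical in-memory stand-in for NameNode metadata (not the Hadoop API)."""

    # file path -> ordered list of block IDs that make up the file
    file_blocks: dict[str, list[str]] = field(default_factory=dict)
    # block ID -> names of the nodes holding a copy of that block
    block_locations: dict[str, set[str]] = field(default_factory=dict)

    def add_file(self, path: str, blocks: dict[str, set[str]]) -> None:
        """Record a file as an ordered series of blocks plus their locations."""
        self.file_blocks[path] = list(blocks)
        for block_id, nodes in blocks.items():
            self.block_locations[block_id] = set(nodes)

    def locate(self, path: str) -> list[tuple[str, set[str]]]:
        """Return (block ID, nodes) pairs in file order; metadata only, no file data."""
        if path not in self.file_blocks:
            raise FileNotFoundError(path)
        return [(b, self.block_locations[b]) for b in self.file_blocks[path]]


# Illustrative data: one file split into two blocks, each stored on three nodes.
namenode = ToyNameNode()
namenode.add_file(
    "/logs/app.log",
    {"blk_1": {"dn1", "dn2", "dn3"}, "blk_2": {"dn2", "dn3", "dn4"}},
)
layout = namenode.locate("/logs/app.log")
```

Note that the toy model stores only where the blocks live, never the file contents, which is the sense in which the NameNode holds information about files rather than the files themselves.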

HDFS clients always go through the NameNode whenever they need to locate a specific file or perform CRUD operations on it. In Hadoop versions earlier than 2.x, the NameNode is the single point of failure in the system: if it goes down for any reason, the entire HDFS becomes unusable. To remedy this major flaw, Hadoop 2.0 introduced the concept of a backup (standby) NameNode. As the name suggests, it is an active-passive design in which everything on the NameNode is replicated to the backup NameNode so that, if the NameNode goes down, the backup node can be started and operations resumed. We will not go into the advantages and disadvantages of an active-passive system, as that is outside the scope of this book.
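The active-passive idea can be sketched in a few lines. This is only a toy illustration under simplifying assumptions: the classes `ToyMetadataNode` and `ActivePassivePair`, the node names `nn1` and `nn2`, and the `alive` flag are hypothetical, and real HDFS handles replication and failover with considerably more machinery than this.

```python
class ToyMetadataNode:
    """Hypothetical stand-in for a NameNode process."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.namespace = {}  # file path -> list of block IDs


class ActivePassivePair:
    """Toy active-passive pair: every change is copied to the passive node."""

    def __init__(self, active, passive):
        self.active = active
        self.passive = passive

    def record(self, path, block_ids):
        """Apply a metadata change on the active node and mirror it to the passive."""
        self._ensure_active()
        self.active.namespace[path] = list(block_ids)
        if self.passive.alive:
            self.passive.namespace[path] = list(block_ids)

    def lookup(self, path):
        """Answer a client lookup from whichever node is currently active."""
        self._ensure_active()
        return self.active.namespace[path]

    def _ensure_active(self):
        """Promote the passive node if the active one has gone down."""
        if self.active.alive:
            return
        if not self.passive.alive:
            raise RuntimeError("no metadata node available")
        self.active, self.passive = self.passive, self.active


primary = ToyMetadataNode("nn1")
backup = ToyMetadataNode("nn2")
pair = ActivePassivePair(primary, backup)
pair.record("/logs/app.log", ["blk_1", "blk_2"])

primary.alive = False  # simulate the active NameNode going down
blocks_after_failover = pair.lookup("/logs/app.log")
```

Because the backup node already holds a copy of the metadata, the lookup after the simulated failure is served by `nn2` without losing the record for the file. With a single node and no copy, the same failure would leave the lookup with nothing to answer from, which is the single point of failure described above.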
