The core modules of Hadoop

The core modules of Hadoop consist of:

  • Hadoop Common: Libraries and other common helper utilities required by Hadoop
  • HDFS: A distributed, highly available, fault-tolerant filesystem that stores data across the cluster
  • Hadoop MapReduce: A programming model for distributed computation across commodity servers (or nodes); a minimal word count sketch follows this list
  • Hadoop YARN: A framework for job scheduling and resource management

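To make the MapReduce bullet concrete, the following is a minimal sketch of the classic word count job, written against the Hadoop MapReduce Java API (org.apache.hadoop.mapreduce). The class and method names here are the standard ones from that API; the input and output paths are assumed to be supplied as command-line arguments.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Mapper: emits (word, 1) for every word in its input split
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
          }
        }
      }

      // Reducer: sums the counts emitted for each word
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {

        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);   // optional local pre-aggregation
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // input path (typically on HDFS)
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output path (must not already exist)
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Once packaged into a JAR, a job like this is typically submitted to the cluster with the hadoop jar command, with the input and output directories given as arguments.
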
Of these core components, YARN was introduced in 2012 to address some of the shortcomings of the first release of Hadoop. The first version of Hadoop used HDFS and MapReduce as its main components. As Hadoop grew in popularity, the need for processing capabilities beyond those provided by MapReduce became increasingly important. This, along with other technical considerations, led to the development of YARN.

Let's now look at the salient characteristics of Hadoop as itemized previously.
