DataNode

Unlike the NameNode, which holds only metadata, a DataNode stores the actual contents of files. Each DataNode is connected to the NameNode and regularly reports its health to it through heartbeat messages. As you can see in the preceding diagram, the DataNode stores blocks of files in its local filesystem. Each block is typically 128 MB in size. To read a block, the end user's client first asks the NameNode where the block is stored and then reads it from a DataNode that holds it.
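To make the block layout concrete, here is a minimal sketch of how a file's byte range divides into fixed-size blocks. The function name `split_into_blocks` is hypothetical and is not part of any Hadoop API; the 128 MB default comes from the typical block size mentioned above.

```python
MB = 1024 * 1024

def split_into_blocks(file_size, block_size=128 * MB):
    """Return a list of (offset, length) pairs covering file_size bytes.

    Illustrative helper only (not a Hadoop function). Every block is
    block_size bytes long except possibly the last, which holds the
    remainder.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks

# A 300 MB file needs three blocks: 128 MB, 128 MB, and a final 44 MB block.
for offset, length in split_into_blocks(300 * MB):
    print(offset // MB, length // MB)
```

Under these assumptions, the last block occupies only as much space as the remaining data, so a 300 MB file ends with a 44 MB block rather than a padded 128 MB one.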

Client applications can talk directly to a DataNode once the NameNode (https://wiki.apache.org/hadoop/NameNode) has provided the location of the data. Similarly, MapReduce (https://wiki.apache.org/hadoop/MapReduce) operations farmed out to TaskTracker (https://wiki.apache.org/hadoop/TaskTracker) instances near a DataNode talk directly to that DataNode to access the files. TaskTracker instances can be deployed on the same servers that host DataNode instances, so that MapReduce operations are performed close to the data.
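The read path described above can be sketched as a toy in-memory model. This is not the Hadoop client API: the classes `ToyNameNode` and `ToyDataNode` and the functions `write_file` and `read_file` are hypothetical names invented for illustration, the block size is shrunk to 4 bytes so the flow is easy to trace, and replication is deliberately left out.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # tiny illustrative block size in bytes (assumption)

@dataclass
class ToyDataNode:
    """Holds the actual block bytes, keyed by block id."""
    name: str
    blocks: dict = field(default_factory=dict)

    def write_block(self, block_id, data):
        self.blocks[block_id] = data

    def read_block(self, block_id):
        return self.blocks[block_id]

@dataclass
class ToyNameNode:
    """Holds only metadata: path -> ordered list of (block_id, datanode name)."""
    block_map: dict = field(default_factory=dict)

    def get_block_locations(self, path):
        return self.block_map[path]

def write_file(namenode, datanodes, path, data):
    """Split data into blocks, spread them round-robin, record the locations."""
    locations = []
    for index, start in enumerate(range(0, len(data), BLOCK_SIZE)):
        block_id = f"{path}#blk_{index}"
        node = datanodes[index % len(datanodes)]
        node.write_block(block_id, data[start:start + BLOCK_SIZE])
        locations.append((block_id, node.name))
    namenode.block_map[path] = locations

def read_file(namenode, datanodes_by_name, path):
    """Ask the NameNode where the blocks are, then read each from its DataNode."""
    chunks = []
    for block_id, node_name in namenode.get_block_locations(path):
        chunks.append(datanodes_by_name[node_name].read_block(block_id))
    return b"".join(chunks)

# Usage: two DataNodes, one NameNode, one small file.
nodes = [ToyDataNode("dn1"), ToyDataNode("dn2")]
nn = ToyNameNode()
write_file(nn, nodes, "/demo.txt", b"hello hadoop")
by_name = {n.name: n for n in nodes}
print(read_file(nn, by_name, "/demo.txt"))
```

The point of the sketch is the division of labor: the toy NameNode never touches file bytes, and the client contacts it only to learn which DataNode holds each block before fetching the data from that DataNode directly.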
