Hadoop setup

For the use case in this chapter, we can install Hadoop as a single-node cluster by following the instructions on Apache Hadoop's website at: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html.

After following these steps, we can browse the files stored in HDFS on our local machine at: http://localhost:50070/explorer.html#/ (in Hadoop 3.x, the NameNode web UI moved from port 50070 to port 9870).

Assuming that our signals data is stored in HDFS under the /user/<username>/signals directory, we will use the MongoDB Connector for Hadoop to import it into, and later export it from, MongoDB.
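As a quick sanity check that the data is where we expect it, the following Java sketch lists the contents of that directory through the HDFS FileSystem API. It assumes the single-node configuration from the tutorial above, where fs.defaultFS points at hdfs://localhost:9000; adjust the URI if your core-site.xml differs:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListSignals {
    public static void main(String[] args) throws Exception {
        // fs.defaultFS as set in core-site.xml for the single-node tutorial (an assumption).
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000"), conf);

        // The signals directory used in this chapter; the current OS user stands in for <username>.
        Path signals = new Path("/user/" + System.getProperty("user.name") + "/signals");

        for (FileStatus status : fs.listStatus(signals)) {
            System.out.printf("%s\t%d bytes%n", status.getPath(), status.getLen());
        }
        fs.close();
    }
}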

The MongoDB Connector for Hadoop is the officially supported library that allows live MongoDB collections, or MongoDB backup files in BSON format, to be used as the source or destination of Hadoop MapReduce jobs.
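To make the HDFS-to-MongoDB direction concrete, here is a minimal, map-only MapReduce job that reads the signals files from HDFS and inserts each record into MongoDB through the connector's MongoOutputFormat. This is a sketch rather than the chapter's final pipeline: it assumes the signals are stored as one JSON document per line, and the output URI mongodb://localhost:27017/mongo_book.signals is a placeholder database and collection to replace with your own:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.bson.BSONObject;
import org.bson.types.ObjectId;

import com.mongodb.hadoop.MongoOutputFormat;
import com.mongodb.hadoop.io.BSONWritable;
import com.mongodb.util.JSON;

public class SignalsToMongo {

    // Parses each HDFS line as a JSON document and emits it with a generated _id.
    public static class SignalMapper
            extends Mapper<LongWritable, Text, Text, BSONWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            BSONObject doc = (BSONObject) JSON.parse(value.toString());
            // The output key becomes the document's _id in MongoDB.
            context.write(new Text(new ObjectId().toHexString()), new BSONWritable(doc));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Destination collection for MongoOutputFormat (placeholder URI).
        conf.set("mongo.output.uri", "mongodb://localhost:27017/mongo_book.signals");

        Job job = Job.getInstance(conf, "signals-to-mongo");
        job.setJarByClass(SignalsToMongo.class);
        job.setMapperClass(SignalMapper.class);
        job.setNumReduceTasks(0); // map-only job; records go straight to MongoDB
        job.setOutputFormatClass(MongoOutputFormat.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(BSONWritable.class);

        FileInputFormat.addInputPath(job,
                new Path("/user/" + System.getProperty("user.name") + "/signals"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}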

This means that we can also easily export data to, and import data from, MongoDB when using higher-level Hadoop ecosystem tools such as Pig (a procedural high-level language), Hive (a SQL-like high-level language), and Spark (a cluster computing framework).
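As one illustration of the Spark path, the short sketch below reads the collection back into a Spark pair RDD through the connector's MongoInputFormat. It is a local, single-node example; the local[*] master URL and the mongo_book.signals database and collection names are assumptions to adapt to your environment:

import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.bson.BSONObject;

import com.mongodb.hadoop.MongoInputFormat;

public class ReadSignalsFromMongo {
    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf()
                .setAppName("read-signals-from-mongo")
                .setMaster("local[*]"); // single-node test run
        JavaSparkContext sc = new JavaSparkContext(sparkConf);

        // Tell the mongo-hadoop connector which collection to read from.
        Configuration mongoConf = new Configuration();
        mongoConf.set("mongo.input.uri", "mongodb://localhost:27017/mongo_book.signals");

        // Each element is an (_id, document) pair produced by MongoInputFormat.
        JavaPairRDD<Object, BSONObject> signals = sc.newAPIHadoopRDD(
                mongoConf, MongoInputFormat.class, Object.class, BSONObject.class);

        System.out.println("documents read: " + signals.count());
        sc.stop();
    }
}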
