An alternative to the MongoDB Connector for Hadoop is to use the programming language of our choice to export data from Hadoop and then write it into MongoDB using the low-level driver or an ODM, as described in previous chapters.
For example, in Ruby, there are a few options:
- WebHDFS on GitHub, which uses the WebHDFS or the HttpFS Hadoop API to fetch data from HDFS
- System calls, using the Hadoop command-line tool and Ruby's system() call
In Python, we can use:
- HdfsCLI, which uses the WebHDFS or the HttpFS Hadoop API
- libhdfs, which uses a JNI-based native C wrapper around the HDFS Java client
All of these options require an intermediate server between our Hadoop infrastructure and our MongoDB server but, on the other hand, allow for more flexibility in the extract, transform, load (ETL) process of exporting and importing data.
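As a rough illustration of the Python route, the following sketch reads a newline-delimited JSON file from HDFS with HdfsCLI and writes it into MongoDB in batches with PyMongo. It is a minimal example under several assumptions that are not from the text above: the NameNode URL, the user name, the HDFS path, the database and collection names, and the one-JSON-document-per-line file format are all hypothetical, and the helper names (`iter_documents`, `batched`, `copy_to_mongo`, `main`) are introduced here only for illustration.

```python
import json
from itertools import islice


def iter_documents(lines):
    """Parse newline-delimited JSON, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def batched(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def copy_to_mongo(lines, collection, batch_size=1000):
    """Insert the parsed documents in batches and return how many were inserted."""
    total = 0
    for batch in batched(iter_documents(lines), batch_size):
        collection.insert_many(batch)
        total += len(batch)
    return total


def main():
    # Third-party packages, assumed installed with: pip install hdfs pymongo
    from hdfs import InsecureClient
    from pymongo import MongoClient

    # Hypothetical endpoints, user, and names; adjust them to your environment.
    hdfs_client = InsecureClient("http://namenode.example.com:9870", user="hadoop")
    collection = MongoClient("mongodb://localhost:27017")["exports"]["events"]

    # With a delimiter set, HdfsCLI yields the file piece by piece
    # instead of loading it into memory at once.
    with hdfs_client.read("/data/events.json", encoding="utf-8", delimiter="\n") as reader:
        copied = copy_to_mongo(reader, collection)
    print(f"copied {copied} documents")
```

Calling `main()` on the intermediate server runs the copy. Because `copy_to_mongo` only needs an iterable of lines and an object with an `insert_many` method, any transformation step of the ETL process can be slotted in between parsing and inserting.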