Big data use case

Putting all of this into action, we will develop a fully working system using a data source, a Kafka message broker, an Apache Spark cluster on top of HDFS feeding a Hive table, and a MongoDB database.

Our Kafka message broker will ingest data from an API that streams market data for the XMR/BTC currency pair. This data will be passed on to an Apache Spark job running on top of HDFS, which predicts the price for the next ticker timestamp based on:

  • The corpus of historical prices already stored on HDFS
  • The streaming market data arriving from the API

This predicted price will then be stored in MongoDB using the MongoDB Connector for Hadoop.

MongoDB will also receive data straight from the Kafka message broker, storing it in a dedicated collection whose documents expire after 1 minute. This collection will hold the latest orders so that our system can use them to buy or sell, based on the signal coming from the Spark ML system.
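The one-minute expiration can be expressed as a MongoDB TTL index. The following is a minimal sketch rather than the implementation used in this chapter: the field name created_at and both helper functions are hypothetical, and collection is assumed to be a pymongo collection object (anything exposing create_index would work).

```python
from datetime import datetime, timezone

TTL_SECONDS = 60  # the 1-minute expiration described above


def ensure_ttl_index(collection, field="created_at"):
    """Declare a TTL index so documents expire TTL_SECONDS after `field`.

    `field` is a hypothetical name; it must hold a BSON date for TTL to apply.
    """
    # pymongo declares a TTL index by passing expireAfterSeconds to create_index.
    return collection.create_index(field, expireAfterSeconds=TTL_SECONDS)


def make_order_document(order, now=None):
    """Copy the order and stamp it with the UTC datetime the TTL index reads."""
    doc = dict(order)
    doc["created_at"] = now or datetime.now(timezone.utc)
    return doc
```

Note that MongoDB removes expired documents with a background task that runs roughly once a minute, so a document may stay visible a little longer than 60 seconds.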

So, for example, if the price is currently 10 and I have a bid at 9.5 but I expect the price to go down at the next market tick, then the system would wait. If we expect the price to go up at the next market tick, then the system would raise the bid to 10.01 to match the expected price at the next tick.

Similarly, if the price is 10 and I bid 10.5 but I expect the price to go down, I would adjust my bid to 9.99 to make sure I don't overpay. But if the price is expected to go up, I would buy immediately to make a profit at the next market tick.
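The four cases in these two examples can be written down as a small decision function. This is only a sketch of the rules as stated: the function name, the action labels, and the 0.01 tick size are assumptions, and the text does not say what happens when the bid equals the price, so that case is grouped with the higher bid here.

```python
from decimal import Decimal

TICK = Decimal("0.01")  # assumed minimum price increment


def decide(price, bid, direction, tick=TICK):
    """Return (action, bid) for a buy order given the predicted direction.

    direction is "up" or "down": the signal for the next market tick.
    """
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    if bid < price:
        if direction == "down":
            return ("wait", bid)          # the price is expected to come to us
        return ("raise", price + tick)    # follow the expected rise
    # bid >= price (treating the equal case this way is an assumption)
    if direction == "down":
        return ("lower", price - tick)    # avoid overpaying
    return ("buy", bid)                   # buy now, ahead of the rise


print(decide(Decimal("10"), Decimal("9.5"), "up"))
print(decide(Decimal("10"), Decimal("10.5"), "down"))
```

Decimal is used instead of float so that prices such as 10.01 and 9.99 compare exactly.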

Schematically, our architecture looks like this:

The API is simulated by posting JSON messages to a Kafka topic named xmr_btc. On the other end, we have a Kafka consumer importing real-time data to MongoDB.
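Simulating the API could look roughly like the sketch below. Only the topic name xmr_btc comes from the text: the message fields are invented for illustration, the helper functions are hypothetical, and producer is assumed to behave like kafka-python's KafkaProducer, whose send method takes a topic and a value.

```python
import json

TOPIC = "xmr_btc"  # topic name given in the text


def encode_tick(tick):
    """Serialize one market-data message as UTF-8 JSON bytes."""
    return json.dumps(tick, sort_keys=True).encode("utf-8")


def decode_tick(raw):
    """Inverse of encode_tick, for use on the consumer side."""
    return json.loads(raw.decode("utf-8"))


def publish_tick(producer, tick, topic=TOPIC):
    """Send one message; `producer` is assumed to expose send(topic, value=...)."""
    return producer.send(topic, value=encode_tick(tick))


# Hypothetical message shape; the real API's fields are not given in the text.
sample = {"pair": "XMR/BTC", "price": "0.0123", "timestamp": 1}
assert decode_tick(encode_tick(sample)) == sample
```

With a broker running, producer would be something like KafkaProducer(bootstrap_servers="localhost:9092"), where the address is again an assumption, and the MongoDB consumer would call decode_tick on each message value before inserting it.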

We also have another Kafka consumer importing data into Hadoop, to be picked up by our algorithms, which send recommendation data (signals) to a Hive table.

Finally, we export data from the Hive table into MongoDB.
