Chapter 2. Big Data in Financial Services

Today, financial services organizations generate some of the highest data volumes of any industry. For example, the NYSE creates 1 terabyte of market and reference data per day, and consumers make around 10,000 card transactions per second. So, financial organizations have a great opportunity to exploit big data with Hadoop.

In this chapter, I will extend our big data overview to the financial sector and cover:

  • Big data use cases across industry sectors
  • Why is big data required in the financial sector?
  • Big data use cases in finance
  • Big data evolution in finance
  • Big data tools—what should you learn?
  • Big data implementations in finance

Big data use cases across industry sectors

Hadoop was developed at Yahoo (inspired by papers published by Google) more than 10 years ago, driven by the need to process massive sets of unstructured data such as pictures, comments, videos, web logs, and documents across the Web. It comes as no surprise, then, that it was a massive success with Internet giants such as Yahoo, Google, Facebook, eBay, and LinkedIn.

In the last few years, however, thanks largely to significant improvements in the stability and functionality of the Hadoop framework, it has caught on across industry sectors. We have collected a few real success stories from different industry sectors to give you an idea of the power of Hadoop.

Healthcare

A healthcare IT company had to store 7 years of historical claims and remittance data and process millions of claims every day, which wasn't possible with traditional databases at a reasonable cost.

They implemented a big data solution and now they:

  • Store all 7 years of historical data, plus ongoing claims and remittance data, on the Hadoop platform
  • Use Hadoop analytical tools to perform queries

Human science

NextBio needed to process multiple terabytes of unstructured human genome data, which was not possible using their relational MySQL database at a reasonable cost.

They implemented a big data solution and now they use HBase as the datastore and Hadoop MapReduce jobs to process genome data in batches.
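To give a feel for what a batch MapReduce job does, here is a minimal sketch in plain Python. It is not NextBio's pipeline and it does not use Hadoop or HBase: the k-mer counting task, the function names, and the sample sequences are all assumptions chosen for illustration. It only shows the three phases (map, shuffle, reduce) that a MapReduce job applies to each batch of records.

```python
from collections import defaultdict
from itertools import chain


def map_record(sequence, k=3):
    """Map step: emit (k-mer, 1) for every length-k window in one sequence."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k], 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def run_batch(sequences, k=3):
    """Run map, shuffle, and reduce over one batch of sequences."""
    pairs = chain.from_iterable(map_record(s, k) for s in sequences)
    return reduce_counts(shuffle(pairs))


# Hypothetical batch of two short sequences (illustrative data only).
batch = ["ACGTAC", "GTACGT"]
counts = run_batch(batch)
print(counts)
```

On a real cluster, the map and reduce steps would run in parallel on many nodes and the framework would handle the shuffle; the logic of each step stays this simple.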

Telecom

A large telecom company, China Mobile, had to store billions of call and billing records of its customers, which required very high scalability of data storage and processing.

They implemented a big data solution and now they:

  • Use HBase as a datastore to store the complete data set, and can now easily add 30 TB of data every month
  • Have reached 100+ nodes, and the solution is very economical compared to traditional data processing
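To put the 30 TB per month figure in perspective, the following back-of-the-envelope sketch estimates how much raw disk that growth consumes. Only the 30 TB figure comes from the case study. The replication factor of 3 (the HDFS default), the 24 TB of disk per node, and the 75 percent fill limit are assumptions for illustration, and the sketch ignores compression and HBase overhead.

```python
import math


def raw_storage_tb(logical_tb, replication=3):
    """Raw disk consumed once every block is stored `replication` times."""
    return logical_tb * replication


def nodes_needed(logical_tb, node_capacity_tb, replication=3, max_fill=0.75):
    """Nodes required if each node is only filled up to `max_fill`."""
    usable_per_node = node_capacity_tb * max_fill
    return math.ceil(raw_storage_tb(logical_tb, replication) / usable_per_node)


monthly_growth_tb = 30                       # figure quoted in the case study
yearly_logical_tb = monthly_growth_tb * 12   # 360 TB of new data per year
yearly_raw_tb = raw_storage_tb(yearly_logical_tb)

# Assumed hardware: 24 TB of disk per node (hypothetical value).
yearly_nodes = nodes_needed(yearly_logical_tb, node_capacity_tb=24)

print(yearly_logical_tb, yearly_raw_tb, yearly_nodes)
```

Under these assumptions, a year of growth needs roughly 1 PB of raw disk, which is why being able to add commodity nodes incrementally matters so much for this kind of workload.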

Online retailer

Etsy (https://www.etsy.com/), an online retailer, had to analyze a large amount of Internet log data to understand user behavior and make search recommendations.

They implemented a big data solution in the cloud and now they:

  • Use Hadoop on Amazon Elastic MapReduce to run dozens of Hadoop workflows every night
  • Use MATLAB for predictive analytics and display the results with the visualization tool Tableau