Today, financial services organizations generate a higher volume of data than any other industry. For example, the NYSE creates 1 terabyte of market and reference data per day, and consumers make 10,000 card transactions per second. Financial organizations therefore have a great opportunity to exploit big data with Hadoop.
In this chapter, I will extend our big data overview with the financial sector angle and cover:
Hadoop was developed at Yahoo more than 10 years ago, inspired by Google's papers on MapReduce and the Google File System, and driven by the need to process massive sets of unstructured data such as pictures, comments, videos, web logs, and documents across the Web. It comes as no surprise, then, that it was a massive success with Internet giants such as Yahoo, Google, Facebook, eBay, and LinkedIn.
In the last few years, however, thanks to significant improvements in the stability and functionality of the Hadoop framework, it has caught on in all industry sectors. We have collected a few real success stories from different industry sectors to give you an idea of the power of this awesome technology called Hadoop.
A healthcare IT company had to store 7 years of historical claims and remittance data and process millions of claims every day, which wasn't possible with traditional databases at a reasonable cost.
They implemented a big data solution and now they:
NextBio needed to process multiple terabytes of unstructured human genome data, which was not possible at a reasonable cost using their relational MySQL database.
They implemented a big data solution and now they use HBase as the datastore and Hadoop MapReduce jobs to process genome data in batches.
A large telecom company, China Mobile, had to store billions of call and billing records of its customers, which required very high scalability of data storage and processing.
They implemented a big data solution and now they:
Etsy (https://www.etsy.com/), an online retailer, had to analyze a large amount of Internet log data to understand user behavior and make search recommendations.
They implemented a big data solution in the cloud and now they: