Hadoop in the cloud can be implemented with very low initial investment and is well suited for proofs of concept and data systems with variable IT resource requirements. In this chapter, I will discuss the story of Hadoop in the cloud and how Hadoop can be implemented in the cloud for banks.
I will cover the full data life cycle of a risk simulation project using Hadoop in the cloud.
I recommend you refer to your Hadoop cloud provider documentation if you need to dive deeper.
In the last few years, cloud computing has grown significantly within banks as they strive to improve the performance of their applications, increase agility, and, most importantly, reduce their IT costs. Because moving applications into the cloud reduces operational cost and IT complexity, it helps banks focus on their core business instead of spending resources on technology support.
A Hadoop-based big data platform can be hosted in the cloud like any other computing platform, and a few financial organizations have already implemented projects with Hadoop in the cloud.
As far as banks are concerned, especially investment banks, business fluctuates a lot and is driven by the market. Fluctuating business means fluctuating trade volume and variable IT resource requirements. As shown in the following figure, traditional on-premise implementations will have a fixed number of servers for peak IT capacity, but the actual IT capacity needs are variable:
As shown in the following figure, if a bank plans to have more IT capacity than its maximum usage (a must for banks), there will be wastage, but if it plans to have IT capacity equal to the average of the fluctuating requirement, it will lead to processing queues and customer dissatisfaction:
With cloud computing, financial organizations pay only for the IT capacity they use. This elastic capacity, and thus elastic pricing, is the number-one reason for using Hadoop in the cloud.
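The trade-off between peak, average, and elastic provisioning can be sketched with a little arithmetic. The following Python snippet is only an illustration: the hourly demand figures are made up, the function name is hypothetical, and it ignores real-world factors such as the time needed to add cloud nodes or any difference in price per server-hour between cloud and on-premise hardware.

```python
# Hypothetical demand, in servers needed per hour, over one trading day.
# These numbers are invented purely for illustration.
demand = [20, 25, 40, 90, 100, 70, 35, 20]


def provisioning_summary(demand, capacity):
    """Return (idle_server_hours, unmet_server_hours) for a fixed capacity."""
    # Idle: capacity that is paid for but not used in a given hour.
    idle = sum(max(capacity - d, 0) for d in demand)
    # Unmet: demand above capacity, which ends up as a processing queue.
    unmet = sum(max(d - capacity, 0) for d in demand)
    return idle, unmet


peak = max(demand)                   # size the cluster for the busiest hour
average = sum(demand) / len(demand)  # size the cluster for the average hour

peak_idle, peak_unmet = provisioning_summary(demand, peak)
avg_idle, avg_unmet = provisioning_summary(demand, average)

# With elastic capacity, only the server-hours actually used are consumed.
elastic_used = sum(demand)

print(f"Peak sizing ({peak} servers): idle={peak_idle}, unmet={peak_unmet}")
print(f"Average sizing ({average:.0f} servers): idle={avg_idle:.0f}, unmet={avg_unmet:.0f}")
print(f"Elastic: server-hours used={elastic_used}")
```

In this made-up example, sizing for the peak leaves as many server-hours idle as are actually used, while sizing for the average removes most of the idle capacity but leaves part of the demand waiting in a queue. Elastic capacity avoids both, at least under these simplified assumptions.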
The second reason is proof of concept. For every financial institution, before the adoption of Hadoop technologies, the big dilemma was, "Is it really worth it?" or "Should I really spend on Hadoop hardware and software as it is still not completely mature?" You can simply create Hadoop clusters within minutes, do a small proof of concept, and validate the benefits. Then, either scale up your cloud with more use cases or go on-premise if that is what you prefer.
Have a look at the following questions. If you answer yes to any of these for your big data problem, Hadoop in the cloud could be the way forward:
If the cloud solves all big data problems, why isn't every bank implementing it?
In the next section, I will pick up one of the most popular use cases: implementing Hadoop in the cloud for the risk division of a bank.