Big data implementations in finance

As with any other project, certain prerequisites must be in place before a big data project is kicked off for it to succeed:

  • Business requirements: Work with business users to understand the problems with the current data systems, the new data sources, and the possible opportunities. The big data problem must be clearly defined.
  • Gap analysis: Understand the current and future states and list all the gaps, such as changes to data interfaces, data governance, data architecture, and data visualization.
  • Project plan: Detail the business, big data, and technical architectures, including resource requirements and clear return-on-investment calculations.

The key challenges

As Hadoop is a relatively new technology, its adoption in financial organizations is not easy and faces various obstacles, such as:

  • You go first: No business division wants to be the first to explore the technology, especially when it is not fully confident of the return on investment.
  • Skilled resources: Because the technology is new, experienced people with Hadoop skills are hard to find. Even when financial organizations hire Hadoop practitioners from other industries, those hires usually lack finance domain knowledge.
  • Hype: There is still a lot of hype around the technology, and that leads to unreasonable expectations.
  • Too agile or too rigid: The Hadoop ecosystem is open source, and its components receive major new releases frequently. Financial organizations are slow to upgrade, as there is normally little appetite for downtime or for the risk of technical glitches caused by upgrades. Striking a balance between agility and stability is a key challenge.
  • Security: Financial organizations get a little paranoid when it comes to security, especially if the data is in the cloud. The Hadoop cluster must be installed in a completely secure environment, so that only authorized users can access the data and all data transmissions are encrypted.
  • Hardware standards: IT infrastructure management may object to cheap commodity hardware, as it will generally not meet the standards set by the financial organization.

Overcoming the challenges

Start with small "low-hanging fruit": process data that is already structured, where the cost savings and benefits are easy to quantify.

Although many of us think that successful Hadoop implementations depend only on IT or the technology department, this is not true. For any Hadoop implementation, the changes need to cut through the organization's business culture, operating model, and data architecture, as explained here:

  • Business culture: The business must be convinced of the advantages of analytics on Hadoop, in addition to its existing analytics, and must be kept in confidence throughout the project phases: planning, proof of concept, implementation, and post-implementation.
  • Operating model: The organization must start sharing knowledge across its business and technology divisions. Most large banks have a central, group-level chief data function, and they should also have a central big data advisory group.
  • Data: As Hadoop is still evolving, the data architecture needs to be agile enough to accommodate new technologies as they are proven.

Most successful implementations in financial organizations are done in three steps, discussed in the next three subsections.

Generate interest – play area

Because of the hype around this technology, there may already be a good level of interest within the financial organization, but it needs to be continuously fuelled by providing practical experience and sharing successful results.

Organizations can provide a Hadoop play area for developers and analysts to load and analyze real or test data. The platform may be built either from unused servers or through a strategic purchase of Hadoop servers, with separate environments for development, testing, and production.

What should developers and analysts do?

  • Upload data into Hadoop and start writing MapReduce programs to analyze it (a minimal streaming sketch follows this list)
  • Create your own schema in Hive, or map Hive onto existing files on HDFS (see the Hive sketch below)
  • Create your own schema in HBase or another NoSQL database (see the HBase sketch below)
  • Write queries against HDFS, Hive, or HBase
  • Experiment with statistical languages such as R, and connect your BI tools to Hadoop
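To make the first item concrete, here is a minimal Hadoop Streaming sketch in Python. Everything in it is an assumption for illustration: a hypothetical CSV trade feed with the layout trade_id,symbol,price,volume already uploaded to HDFS under /data/trades, with streaming used instead of a native Java MapReduce job purely to keep the sketch short.

    #!/usr/bin/env python3
    # mapper.py: emits (symbol, volume) pairs, one per trade.
    # Assumes a hypothetical CSV layout: trade_id,symbol,price,volume
    import sys

    for line in sys.stdin:
        fields = line.strip().split(",")
        if len(fields) != 4:
            continue  # skip malformed records
        _, symbol, _, volume = fields
        if not volume.isdigit():
            continue  # skip a header row or bad data
        print(symbol + "\t" + volume)

    #!/usr/bin/env python3
    # reducer.py: sums volume per symbol. Hadoop sorts mapper output
    # by key, so all the lines for one symbol arrive together.
    import sys

    current_symbol, total = None, 0
    for line in sys.stdin:
        symbol, volume = line.rstrip("\n").split("\t")
        if symbol != current_symbol:
            if current_symbol is not None:
                print(current_symbol + "\t" + str(total))
            current_symbol, total = symbol, 0
        total += int(volume)
    if current_symbol is not None:
        print(current_symbol + "\t" + str(total))

A hypothetical invocation, with the streaming JAR path depending on your distribution:

    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
        -input /data/trades -output /data/volume_by_symbol \
        -mapper "python3 mapper.py" -reducer "python3 reducer.py" \
        -file mapper.py -file reducer.py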
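The second and fourth items can be tried together by mapping a Hive schema onto files that already sit on HDFS and then querying them. This is a minimal sketch assuming a HiveServer2 endpoint on localhost:10000, the same hypothetical /data/trades CSV files as above, and the pyhive package (pip install pyhive) purely as a convenient client:

    # hive_external.py: map a Hive external table onto existing HDFS
    # files and query it. Endpoint, table, and path are assumptions.
    from pyhive import hive

    conn = hive.Connection(host="localhost", port=10000, username="analyst")
    cur = conn.cursor()

    # EXTERNAL leaves the files in place; dropping the table keeps the data.
    cur.execute("""
        CREATE EXTERNAL TABLE IF NOT EXISTS trades (
            trade_id STRING, symbol STRING, price DOUBLE, volume BIGINT)
        ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        LOCATION '/data/trades'
    """)

    # Total traded volume per symbol, highest first.
    cur.execute("""
        SELECT symbol, SUM(volume) AS total_volume
        FROM trades GROUP BY symbol ORDER BY total_volume DESC
    """)
    for symbol, total_volume in cur.fetchall():
        print(symbol, total_volume)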
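For the third item, a similarly hedged sketch of a simple HBase schema, assuming an HBase Thrift server on localhost:9090, a hypothetical positions table, and the happybase package (pip install happybase) as the client:

    # hbase_positions.py: one wide column family and a composite row key,
    # which is the usual HBase pattern. All the names here are assumptions.
    import happybase

    conn = happybase.Connection(host="localhost", port=9090)

    if b"positions" not in conn.tables():
        conn.create_table("positions", {"p": dict()})  # one family, 'p'

    table = conn.table("positions")

    # A row key of account:symbol keeps all positions for an account
    # adjacent, so they can be fetched with a single prefix scan.
    table.put(b"ACC001:IBM", {b"p:quantity": b"500", b"p:avg_price": b"121.50"})

    row = table.row(b"ACC001:IBM")
    print(row[b"p:quantity"], row[b"p:avg_price"])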

Most importantly, developers and analysts should share their results with business users and the wider community within the financial organization. This is an important part of moving to the next step: doing a project with real data and real business benefits.

Note

Do check with your manager and your compliance team about what can be shared on your GitHub account. Normally, neither code nor data results can be shared outside your financial organization.

Pilot with a low-cost project

Once the business benefits are documented, the data and technical architectures are designed, and the team is skilled in the Hadoop tools, it is time to do a small project or a proof of concept for a larger project.

What should developers and analysts do?

  • Don't be discouraged, or overly excited, by the new tools and technologies that keep emerging. Your existing tools are still fine; move to new ones only when there is a real need.
  • Learn how Scrum or another agile project methodology works, as Hadoop projects will most probably be run that way.
  • This is the time to deepen your mastery of either development or analytical skills.
  • Master at least a few Hadoop components, such as Hive, Pig, or MapReduce, and a BI or analytics tool such as R, QlikView, or Tableau (a short analytics sketch follows this list).
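For the last item, here is a minimal sketch of pulling a Hive result set into an analytics tool, with pandas standing in for whichever tool you actually use. It reuses the hypothetical trades table and HiveServer2 endpoint from the play-area examples (pip install pyhive pandas):

    # hive_analytics.py: let Hive do the heavy aggregation and pull only
    # the per-symbol summary into memory for exploration.
    import pandas as pd
    from pyhive import hive

    conn = hive.Connection(host="localhost", port=10000, username="analyst")
    df = pd.read_sql(
        "SELECT symbol, AVG(price) AS avg_price, SUM(volume) AS total_volume "
        "FROM trades GROUP BY symbol",
        conn,
    )
    print(df.describe())
    print(df.sort_values("total_volume", ascending=False).head(10))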

Hadoop is live – now scale it up

Once the project has been put into production, the minimum expectation is that:

  • The business is fully aware of the benefits and the users are fully trained with the new tools.
  • Disaster recovery is configured and well tested. It is very unlikely that any data system in a financial organization will make its way into production without disaster recovery.

What should developers and analysts do?

  • Keep exploring new business cases and new tools
  • Add more data sources, as Hadoop is more effective with more data
  • Scale up your Hadoop to support enterprise-level projects