CHAPTER 13

Trend 9—Big Data

Introduction

Big Data is another one of the emerging technology trends receiving a large amount of attention, especially around what it can do, its challenges, and how it could make the world a better place. However, it is important to remember that, while Big Data has a large number of positives, it also has its challenges, which means firms need to take great care over its implementation and ongoing use.

What Is Big Data?

There are a variety of different definitions of Big Data available but, at a general level, Big Data can be defined as follows:

1. There is a massive amount or volume of data. While the volume and size of data are often relative to an organization’s circumstances, we are talking in the range of millions (if not billions) of different data items, where each data item often contains hundreds (if not thousands) of different variables.

2. The data itself is constantly changing, often at a rapid pace. This means that new data items are being added as well as existing data items being changed or deleted.

3. The data is stored across a variety of different formats. This could cover “traditional” structured formats such as databases (relational, No-SQL, or hierarchical), flat files, spreadsheets, XML lists, and others. It also covers many less structured formats such as video recordings, audio recordings, scanned documents, free-format text, and social media postings.

With this data collected, organizations can perform various types of data analysis. Historic analysis can be performed to determine the correlation between variables and data items, patterns of behavior, trends, and other themes. This analysis can then be used to (a) understand historic behavior, (b) predict future events or behaviors, and even (c) try to change future behaviors.

There is a clear link and dependence between Machine Learning (see Chapter 7), Cloud computing (see Chapter 12), and Big Data. Big Data will provide the underlying raw data that will allow Machine Learning algorithms to be developed, tested and executed using Cloud computing.

Generic Uses of Big Data

A later section outlines some of the uses of Big Data within Financial Services, but there are a variety of generic uses that are worth discussing first.

Local and national governments collect a vast amount of data from their citizens regarding their age, health, lifestyle, and so on. Machine Learning techniques can then be used to create algorithms that help identify what causes certain diseases and illnesses, so governments can implement policy or legal changes to try to change people’s behavior.

Political parties have been using Big Data and Machine Learning techniques to try to understand what makes voters vote one way or another. Once these drivers are understood, political parties can run very targeted campaigns to try to alter voter intentions.

Similarly, meteorologists can use Big Data techniques to gather data on weather conditions and use Machine Learning techniques to create algorithms to try and predict weather patterns. These predictions can be provided to a range of areas such as airlines, ships, and so on.

Finally, Big Data (again with Machine Learning techniques) can assist with the “normal” day-to-day functions of running a business. This could cover understanding what drives costs, understanding customer behavior, understanding what triggers selling decisions, and understanding operational efficiencies. This will, in turn, allow organizations to make changes to address any issues identified.


Figure 13.1 Big data overview

How Does Big Data Work?

At a high level, Big Data can be split into three sequential stages:

Data Gathering

This stage covers obtaining or gathering the required data that a firm needs to perform its analysis.

At a high level, there are two types of data:

Primary Data: This is data that a firm has gathered itself, first-hand (typically as part of its normal business operations). Examples include customer details, customer behavior, trading patterns, sales figures, operational costs, and so on. It can come from a variety of sources such as application systems, Internet of Things devices (see Chapter 9), or Natural Language Processing activity (see Chapter 10).

Secondary Data: This is data that another firm or organization has collected which the firm in question has purchased or obtained in another way. This could cover data such as economic data, marketing data, demographic spreads, general market behavior, and so on.

A set of technologies (with supporting business processes and controls) will be developed that will:

Consume the required data. This consumption could be real-time, ad-hoc, or regular.

Clean the data. Even for the most accurate and robust sources, there will always be issues such as missing attributes. Therefore, firms will need to review all the data received against a set of known controls and then fix (or “clean”) the data before it is sent further.

Normalize the data. Because the (now cleansed) data has been received from a variety of different sources, it will be formatted differently. This could range from numeric fields being rounded differently to different date formats to different currencies. Therefore, all this input data will need to be converted (or normalized) into a single format that is suitable for storage. A minimal sketch of the clean and normalize steps is shown below.
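
As an illustration only, the short Python sketch below shows what the clean and normalize steps might look like for a hypothetical feed of customer trade records. The field names, file names, currency rates, and date handling are assumptions made for the example, not a prescribed design.

```python
import pandas as pd

# Hypothetical static FX rates used only for this illustration.
FX_TO_USD = {"USD": 1.0, "GBP": 1.27, "EUR": 1.08}

def clean_and_normalize(raw: pd.DataFrame) -> pd.DataFrame:
    df = raw.copy()

    # Clean: reject records missing mandatory attributes and keep them for review.
    mandatory = ["customer_id", "trade_date", "amount", "currency"]
    rejected = df[df[mandatory].isna().any(axis=1)]
    if not rejected.empty:
        rejected.to_csv("rejected_records.csv", index=False)  # sent back for fixing
    df = df.dropna(subset=mandatory)

    # Normalize: a single date format and a single reporting currency.
    df["trade_date"] = pd.to_datetime(df["trade_date"], errors="coerce").dt.date
    df["amount_usd"] = (df["amount"] * df["currency"].map(FX_TO_USD)).round(2)
    return df

if __name__ == "__main__":
    raw = pd.read_csv("daily_trades.csv")     # the "consume" step (here a simple batch file)
    store_ready = clean_and_normalize(raw)
    store_ready.to_parquet("normalized_trades.parquet", index=False)
```

In practice the consume step could equally be a real-time stream rather than a batch file, but the same clean and normalize logic would still apply before storage.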

Data Storage

Once all the data above has been gathered, cleansed, and normalized, it will need to be held in some type of data storage. Because Big Data allows a wide range of data items to be held (“traditional” data records, videos, social media messages, e-mails, etc.), the designated storage technology must be selected carefully.

As mentioned previously, organizations are now looking to move away from traditional relational-type databases because they cannot support the volume, speed, and variety of data required for Big Data. This means organizations are now looking at technologies such as No-SQL and hierarchical databases. They are also looking to host these databases on the Cloud (see Chapter 12) because this avoids having complex and costly data stores running in-house.
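
As a purely illustrative sketch, the snippet below stores two very different data items (a structured trade record and a free-format social media posting) side by side in the same document-style No-SQL database. It uses the open-source MongoDB Python driver; the connection string, database, collection, and field names are assumptions for the example.

```python
from datetime import datetime, timezone
from pymongo import MongoClient

# Placeholder connection string; in practice this would point at a
# Cloud-hosted cluster provided by the chosen supplier.
client = MongoClient("mongodb://localhost:27017")
db = client["big_data_hub"]

# A "traditional" structured record and an unstructured posting can coexist
# because the document store does not enforce a single fixed schema.
db.events.insert_one({
    "type": "trade",
    "customer_id": "C-1001",
    "instrument": "GB00B16KPT44",
    "amount_usd": 25000.00,
    "loaded_at": datetime.now(timezone.utc),
})
db.events.insert_one({
    "type": "social_post",
    "source": "twitter",
    "text": "Really impressed with the new mobile banking app!",
    "loaded_at": datetime.now(timezone.utc),
})
```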

Data Analysis

Once the data is being gathered and stored, it is possible to analyze this data for business advantage.

At the simplest level, it is possible to develop reports and outputs that perform simple but rather shallow analysis of the data. For example, sales by region, operational errors by product type, revenue lost to fraudulent activities, and so on.
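
To make this kind of simple reporting concrete, a minimal sketch (assuming a hypothetical sales extract with region, product type, and amount columns) might look like this:

```python
import pandas as pd

sales = pd.read_csv("sales_extract.csv")   # assumed columns: region, product_type, amount

# Simple but shallow analysis: totals by region and by product type.
sales_by_region = sales.groupby("region")["amount"].sum().sort_values(ascending=False)
sales_by_product = sales.groupby("product_type")["amount"].sum()

print(sales_by_region)
print(sales_by_product)
```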

However, if Big Data is combined with Machine Learning, it is possible to generate much deeper and more useful analysis. While this is explained in more detail in Chapter 7, Machine Learning will allow organizations to uncover trends, behaviors, causes of issues, and so on that are not obvious to the human eye. This will then allow organizations to improve their operating model, improve their risk controls, improve cross-selling, and so on.

Uses With Financial Services

There are many uses to which Financial Services firms are putting Big Data (and the majority of these work in conjunction with Machine Learning).

Real-Time Stock Market Insights

Big Data (again working with Machine Learning) has helped to provide more accurate real-time stock market insights. Previously, it was only possible to analyze stock price, bond price, and exchange rate movements. Now, with Big Data, it is possible to gather real-time data on political, micro-economic, macro-economic, and social trends which (using Machine Learning techniques) provide more insight into how these factors impact stock prices, bond prices, and exchange rates. This in turn will allow firms to make better investment decisions.

Fraud Detection and Prevention

Firms are using Big Data to improve their fraud detection and prevention capabilities. Big Data has been used to collate vast amounts of information on customer behavior, market behavior, and environmental issues, which allows firms to predict what causes fraud. This allows firms to improve their processes and controls to either (a) prevent fraud and/or (b) be aware when fraud is taking place so suitable action can be taken.
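
One common (though by no means the only) way to turn such data into fraud alerts is anomaly detection. The sketch below uses scikit-learn's IsolationForest over a few hypothetical per-transaction behavioral features; the feature names, file name, and the 1 percent contamination figure are assumptions for illustration only.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical per-transaction features gathered by the Big Data platform.
features = ["amount", "hour_of_day", "distance_from_home_km", "txns_last_24h"]
txns = pd.read_csv("transactions.csv")

model = IsolationForest(contamination=0.01, random_state=42)
model.fit(txns[features])

# A prediction of -1 marks transactions the model considers anomalous
# (i.e., candidate frauds worth routing to a manual review queue).
txns["anomaly"] = model.predict(txns[features])
suspicious = txns[txns["anomaly"] == -1]
print(f"{len(suspicious)} transactions flagged for manual review")
```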

Improved Risk Analysis

One of the biggest issues that firms are challenged with is managing the increased level of risk caused by their customers, the marketplace, regulation, and the general economy or environment. Big Data can help mitigate these risks to a certain extent. Data on customer, market, and environmental behaviors, and how these contribute to or mitigate risks, can be collated. Machine Learning techniques can then be used to see which activities increase or reduce risk, so firms can change their operations accordingly.

Improved Customer Servicing and Selling

Due to the rise of Customer Self Servicing (see Chapter 6) and other automation, firms have a vast amount of data on customer behavior, buying patterns, and demographics. This data can be analyzed to determine what customer servicing improvements are required, what triggers selling decisions, and so on. This will allow firms to make operational improvements, target certain client groups for cross-selling, and develop new products or enhance existing ones to make them more appropriate to the marketplace.
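
A minimal, illustrative form of this kind of analysis is a simple customer segmentation, here sketched with k-means clustering over a few assumed behavioral attributes. The column names and the choice of four segments are assumptions for the example, not recommendations.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

customers = pd.read_csv("customer_behavior.csv")   # assumed columns listed below
attrs = ["age", "products_held", "monthly_logins", "avg_balance"]

# Scale the attributes so no single one dominates the clustering.
scaled = StandardScaler().fit_transform(customers[attrs])

# Group customers into four behavioral segments (an arbitrary choice here).
customers["segment"] = KMeans(n_clusters=4, n_init=10, random_state=1).fit_predict(scaled)

# Profile each segment to decide servicing improvements or cross-selling targets.
print(customers.groupby("segment")[attrs].mean())
```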

Operational Efficiencies

All firms have large, complex, and often disjointed operating models, which leads to inefficiencies, errors, costs, poor customer service, and regulatory issues. Big Data can be used to gather data on operational activity so firms can understand what is causing these problems and take remedial action.

Financial Analysis

Similar to the section above, firms often have a complex set of costs that are hard to understand in detail. Big Data will allow firms to model their cost profile to try to understand what causes costs, which costs are fixed, which are step costs, which are variable, and so on. This understanding then allows firms to make changes to improve their cost profile.
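
As a very rough illustration of separating fixed from variable costs, a simple linear regression of total cost against activity volume can be used: the intercept approximates the fixed cost and the slope the variable cost per unit (step costs would need a more sophisticated model). The sketch below assumes a hypothetical monthly extract of trade volumes and total costs.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

monthly = pd.read_csv("monthly_costs.csv")   # assumed columns: trade_volume, total_cost

model = LinearRegression()
model.fit(monthly[["trade_volume"]], monthly["total_cost"])

fixed_cost = model.intercept_            # cost incurred even at zero volume
variable_cost_per_trade = model.coef_[0] # additional cost for each extra trade

print(f"Estimated fixed cost per month: {fixed_cost:,.0f}")
print(f"Estimated variable cost per trade: {variable_cost_per_trade:,.2f}")
```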

Challenges of Big Data

As noted earlier, Big Data has several great benefits, but it is important to understand that implementing and using Big Data is not an easy activity and firms need to think and plan carefully about any implementation.

What Is the Business Reason to Implement Big Data?

There are many stories of firms (not just in Financial Services) implementing new technology for the sake of the technology as opposed to implementing technology to meet some type of business or strategic need. As with any major change, this means that before embarking on a Big Data implementation, a firm must have a clear business reason to implement the technology.

For example, what sort of analysis or insight is the firm looking to use Big Data for? This could cover items such as trying to improve trading patterns, reducing operational costs, understanding client behavior, reducing fraud, and so on. Each of these is different with different data requirements, challenges, and costs. Therefore, firms need to have a very clear understanding of what their business case is.

Please refer to Appendix B for a list of the variables to be discussed when completing a Business Case.

Clean, Accurate, and Timely Data Is Essential

While it is obvious to say that firms need clean, accurate, and timely data for Big Data, it is often an area that firms neglect because they focus completely on the “glamour” of implementing a Big Data hub with the related analysis and/or Machine Learning tools. However, if the data that is “fed into” the Big Data hub is poor, then any “output” analysis will be flawed.

Poor data can be caused by a variety of reasons, namely:

The data could be taken from a variety of different sources which are completely separate and challenging to link together. For example, the input data could be based on different assumptions, different date ranges, analyzed at a different depth, collected for a different reason, and so on.

The data is of poor quality. This could cover gaps in the data, errors in the data, and data being presented differently (such as clean vs. dirty priced bonds). Apart from causing problems with linking data sets together, this poor quality may result in dubious or even completely wrong outputs.

The data may be gathered over an inappropriate sample size, which means the results could be biased and/or incomplete. For example, if a firm is looking to assess trading patterns then they should aim to gather data across all asset classes, all types of trades, all types of market volatility, and so on.

To be fair, it is somewhat naive to expect to have fully complete data with no gaps or errors in it. This means that firms will need to fully understand any gaps or issues in their input data so (a) relevant data cleansing can be factored into the pre-processing and (b) any suitable adjustments or even disclaimers to the outputs can be made.
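
A simple first step toward understanding those gaps is a data-quality profile of each incoming feed. The sketch below (with assumed file and column names) reports missing values, implausible dates, and duplicate keys so that cleansing rules and any output disclaimers can be based on evidence rather than guesswork.

```python
import pandas as pd

feed = pd.read_csv("incoming_feed.csv", parse_dates=["trade_date"])

# Percentage of missing values per column.
missing_pct = feed.isna().mean().mul(100).round(1)
print("Missing values (% of rows):")
print(missing_pct[missing_pct > 0])

# Records dated in the future, or implausibly far in the past, are probably errors.
bad_dates = feed[(feed["trade_date"] > pd.Timestamp.today()) |
                 (feed["trade_date"] < pd.Timestamp("1990-01-01"))]
print(f"{len(bad_dates)} records with implausible dates")

# Duplicate keys usually indicate a problem linking data sets together.
print(f"{feed.duplicated(subset=['trade_id']).sum()} duplicate trade IDs")
```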

Start Small and Then Grow the Implementation

There is a clear tendency to try to implement new technology too quickly. This is because people get “carried away with the technology.” This is especially true in Financial Services. However, implementing any new technology is a challenging process and it should be approached gradually with regular review points.

Therefore, it is normally advisable to perform some sort of pilot implementation to assess: does the technology work?, does it offer the benefits promised?, and does the firm have the skills to implement the technology? The results of the pilot study can inform the rest of the implementation. For example, how quickly should the roll-out be performed?, what skills are needed?, will Machine Learning provide the benefits promised?, and so on. It is not uncommon for firms to considerably rethink Big Data (and Machine Learning) projects after the pilot stage because the technology is more complex or costly than originally thought.

Even once the pilot study is completed, a gradual rollout plan should be followed. This allows the firm to understand the technology and its associated impacts, as well as build confidence and momentum with senior management and demonstrate a return on investment.

Build a Big Data Technology Capability

This is a key decision because if the wrong or inappropriate technology platform is selected then it will cause issues during implementation and day-to-day working, which will ultimately cause the change to fail.

As discussed above (and repeated below for ease of reading) there are three technology elements to Big Data.


Figure 13.2 Big Data overview

While it is possible to develop an in-house technology platform to support Big Data, most firms will struggle with this due to the sheer volume of data, the different formats of data, and the speed that the data changes. This means that firms will look to an external supplier to provide the required technology. Typically, firms will use a Cloud Computing provider to provide the “Data Gathering” and “Data Storage” capability and use a combination of data analysis and Machine Learning tools for the “Data Analysis” requirements. Therefore, some type of formal technology platform vendor selection process must be followed. Please see Appendix A which contains a list of the checks to be performed when selecting a supplier.

Regardless of approach, Big Data is another operating model or technology component that will need to be plugged into a firm’s existing complex operating models. This extra component will need to be supported by suitably skilled people, processes, and systems in conjunction with possible external suppliers. The result is that the firm’s operating model’s complexity, cost, and risk profile increase.

Ensuring the Firm Has Sufficiently Skilled Staff to Support Big Data Once the Project Completes

A dedicated project team will have been formed to implement the initial Big Data project. This will have consisted of senior management to provide oversight and steering, plus many “on the ground” people performing the required development, integration, and so on. This group of people would have been sourced from in-house staff, contractors, and possibly staff from the platform vendor.

Therefore, firms must develop the necessary skills to be able to support the Big Data platform once the project closes because otherwise, they will be reliant on external contractors and platform vendor staff.

This means that senior management will need to be educated on understanding Big Data and its benefits at a general level.

Also, more junior staff will need to be trained on Big Data, the specific platform selected, the integration with back-end systems, as well as the analysis and models developed and rolled out as part of the project. This training can be done by training existing staff, but it may also be necessary to recruit new permanent staff with the required skillsets.

Data Privacy and Data Security Regulations

Data privacy and security legislation is complex and hard to understand at the best of times (especially if firms operate across many legal jurisdictions). Most jurisdictions will have their own local data protection laws. For example, the European Union and the UK both have the General Data Protection Regulation (GDPR). The U.S. state of California has its California Consumer Privacy Act (CCPA). Japan has its Act on the Protection of Personal Information (APPI). There is similar regulation across the rest of the globe.

However, the level of risk and complexity will increase when firms implement Big Data. Firms must understand these regulations and ensure that they (and their suppliers) fully comply with them. This will add more complexity, risk, and cost.

Ensure There Is Sufficient Governance and Control in Place

While Big Data offers good opportunities, it does create several risks that need to be governed and controlled. Big Data infrastructures can be complex, which is a danger as firms become more and more reliant on them. Therefore, firms need to implement policies, controls, and oversight around Big Data to ensure its usage is controlled and understood. Any problems found should be escalated to senior management in the same way as any other risk or issue.

These controls need to cover four main areas.

Oversight of the daily processes: Controls need to be in place to ensure the data is being gathered, stored, and analyzed as designed. If problems are identified then the cause needs to be discovered so its impact can be assessed and the issue fixed. Any material issues will need escalating to management.

Oversight of the supplier: The firm will likely be reliant on the supplier in some way. This could cover areas such as hosting the platform, providing support, providing consultancy, and so on. Therefore, the firm needs to ensure that the supplier complies with what they have committed to in the contract. For example, are support calls being responded to per schedule?, are there issues with the hosting?, and so on. If there are issues, these need to be escalated to senior management at both the supplier and the firm.

Change control process: Data will need to be added, updated, or removed. This could cover activity as part of the implementation project or part of business-as-usual running. Therefore, there needs to be a process to ensure changes are made safely to ensure existing data and analysis are not impacted adversely. Depending on the involvement of the supplier (such as hosting the platform) then the supplier may need to be involved as part of this process.

Policies for making changes to the Big Data infrastructure: Good standards need to be created around making changes. This will cover data standards, programming standards, integration standards, information security standards, testing standards, version control, and release procedures. Depending on the involvement of the supplier then the supplier may need to be involved as part of this process.

The Dark Side of Big Data and Social Acceptance

Big Data does sometimes receive bad publicity regarding how it has been used to change (or manipulate) people’s behaviors to the advantage of somebody else. For example, making people buy a particular product or vote in a particular manner. (However, this is not necessarily a bad thing: if Big Data is used to make people eat more healthily or to reduce traffic congestion, then one could argue that this is good for society.)

However, this is something that firms need to be aware of because clients, staff, and society in general are nervous about organizations using Big Data techniques to manipulate them. This means that firms need a clear business reason for their use (see “What Is the Business Reason to Implement Big Data?” above) as well as strong governance controls regarding its use (see “Ensure There Is Sufficient Governance and Control in Place” above).

Firms will also need to be reasonably open regarding their use of Big Data in case clients or staff ask questions. If firms appear secretive or obstructive then it creates nervousness and tension.

Finally, while there is no specific Big Data financial regulation at the moment, this could change in the future. If Big Data becomes more mainstream and there are several high-profile issues, then it is expected that the relevant regulators will look to implement rules. These rules could cover areas such as firms clearly and publicly stating what they are using Big Data for and what their governance controls are.

Does the Analysis Produced Sound Sensible?

While this point is covered in the Machine Learning chapter of this book (see Chapter 7), it is important to always double-check, or even be skeptical about, any analysis, trends, or insights generated. Because of the sheer amount of data, the different types of data, the constantly changing nature of the data, as well as the challenges of analyzing it, there is a real chance of issues.

For this reason, it is recommended that multiple different models are developed using different development techniques. This provides several models whose outputs can be compared to help highlight any issues.
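
As an illustration (with hypothetical data and two arbitrary model choices), the sketch below trains two different model types on the same data and reports how often their outputs agree; a low agreement rate is a prompt to investigate before trusting either set of results.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

data = pd.read_csv("labelled_events.csv")    # assumed: feature columns plus a "label" column
X, y = data.drop(columns="label"), data["label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=7)

model_a = LogisticRegression(max_iter=1000).fit(X_train, y_train)
model_b = RandomForestClassifier(n_estimators=200, random_state=7).fit(X_train, y_train)

pred_a, pred_b = model_a.predict(X_test), model_b.predict(X_test)
agreement = (pred_a == pred_b).mean()

print(f"Model A accuracy: {model_a.score(X_test, y_test):.2%}")
print(f"Model B accuracy: {model_b.score(X_test, y_test):.2%}")
print(f"Models agree on {agreement:.2%} of cases")
```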

Future Challenges

Table 13.1 Future challenges of Big Data

Each area below is followed by the details of its expected impact.

Increased regulations

The impact for this area is neutral at the moment.

There is no specific Big Data financial regulation at the moment, but this could change in the future. If Big Data becomes more mainstream and there are some high-profile issues, then it is expected that the relevant regulators will look to implement rules. These rules could cover areas such as firms clearly and publicly stating what they are using Big Data for and what their governance controls are.

Changing nature of clients

The impact on this area is positive.

If used appropriately, Big Data (with Machine Learning) will allow firms to understand their customers better, which will allow them to (a) improve their customer servicing and (b) develop better products for the marketplace.

Evolution of products

The impact on this area is positive.

Similar to the client impacts directly above; if used appropriately, Big Data (with Machine Learning) will allow firms to understand their customers better, which will allow them to develop better products for the marketplace.

Lack of trust

The impact in this area is mixed at the moment.

While Big Data will allow customer behavior to be understood, which will allow improvements in customer servicing and products to be made, customers can be nervous or skeptical about being monitored. In effect, they may see firms as a “big brother” watching over them. Therefore, firms need to be open about what they are using Big Data for and how it will benefit the customer.

Accurate data

The impact on this area is adverse.

The running of Big Data is very reliant on timely, accurate, and complete data. If the data has issues then Big Data will not work. Therefore firms will need to implement technology, process, controls, and oversight to ensure all data used is as correct as possible with any issues being identified.

Poor operating and technology models

The impact on this area is adverse.

Big Data is another operating model or technology component that will need to be plugged into a firm’s existing complex operating models. This extra component will need to be supported by suitably skilled people, processes, and systems in conjunction with possible external suppliers. The result is that the firm’s operating model’s complexity, cost, and risk profile increase.

Profitability/Cost drivers

Big Data should in the long term (at least) support firms in improving their profitability.

Understanding customer behavior should allow better products, cross-selling, and improved customer servicing. This should stop customers from leaving as well as attract new customers. The result should be an increase in revenue.

Likewise using Big Data to reduce risk events (such as fraud) and improve operational efficiency should reduce operating costs.

However, it is important to note that Big Data will require a cost to implement as well as additional running costs. Therefore (as part of the business case), firms need to understand any payback period.

Changing nature of the workforce

The impact on this area is generally positive.

Big Data requires new skills. This covers both understanding Big Data at a conceptual level and understanding the technologies needed to develop and build models. This offers career development possibilities for staff. However, there is one possible downside. If Big Data is being used to improve operating model efficiency, then there is a possibility that some staff could lose their jobs or have their jobs change as a result.

New competition and replacements

The impact on this area is positive to the customer.

This is because new and/or more agile firms may be able to use Big Data to develop more innovative, functionality-rich, and better products for customers. This is good news for customers but a risk for other firms.

Risk profile

The impact on this area is neutral.

Big Data does create risks because it is another piece of technology that needs to be supported by an existing complex operating model. Also, as vendor platforms are often used, the risk of supplier reliance increases as well.

But if it is implemented well, then it will reduce risks around fraud, operating inefficiencies, poor customer service, and lost revenue.

Case Study

The case study relates to a large UK subsidiary of a large U.S. bank that had a very large retail customer base (i.e., several million customers) spread across the UK and mainland Europe.

Despite having a large number of clients, the firm did not feel that they understood their customers: which customers were profitable, which products were profitable, and what their customers’ buying behaviors were. This meant they were not pushing profitable products, were probably supporting unprofitable ones, and were not taking advantage of cross-selling opportunities.

Therefore, this firm implemented a Big Data infrastructure to allow them to deeply assess their client base. The firm carefully selected a Big Data storage provider. They also diligently implemented data gathering interfaces (with cleansing and normalization) from all their client, banking, and trading systems. Finally, they selected and integrated a Machine Learning application which they used to generate analysis, reporting, insights, and trends.

This appeared to be working fine until it was discovered (by one of the firm’s in-house legal team) that the Big Data platform was in breach of various data protection regulations and customer confidentiality clauses. Therefore, all reporting and activity on the platform was immediately halted while the firm investigated what changes needed to be made to address these problems.

At the time of writing, this investigation work is still in progress.

This case study stresses the importance of ensuring all regulatory and data protection laws are understood and taken into account in the design of a Big Data infrastructure. Big Data is not just about implementing technology.

Summary

Big Data is another useful technology that will provide benefits to firms and their customers. However, like all technologies, thought is required around its implementation and day-to-day running. Also, Big Data cannot do anything on its own. In effect, it is a large collection of constantly changing data covering multiple formats. This means Big Data needs to be integrated with other technology (such as Machine Learning) to perform any analysis across the data.

Like all major changes, firms will need a clear business reason for implementing Big Data and it is typically easier to roll out Big Data on a risk-averse phased basis.

Because Big Data is complex, firms need to build an in-house capability to support it. This will cover the required technology, suitably skilled staff, and the governance controls needed to oversee it.

Big Data is very reliant on data in terms of (a) ensuring the data is as timely, complete, and accurate as possible and (b) ensuring compliance with the relevant data protection laws.

Finally, some ethical issues need addressing. While Big Data is normally implemented for the best of reasons, customers and staff will feel uncomfortable if they feel their actions are being monitored. Therefore, firms need to be clear and upfront about their usage of it.
