Introduction

You often hear the term ‘big data’, but do you really know what it is and why it’s important? Can it make a difference in your organization, improving results and bringing competitive advantage, and is it possible that not utilizing big data puts you at a significant competitive disadvantage?

The goal of this book is to demystify the term ‘big data’ and to give you practical ways to leverage this data using data science and machine learning.

The term ‘big data’ refers to a new class of data: vast, rapidly accumulating quantities, which often do not fit a traditional structure. The term ‘big’ is an understatement that simply does not do justice to the complexity of the situation. The data we are dealing with is not only bigger than traditional data; it is fundamentally different, as a motorcycle is more than simply a bigger bicycle and an ocean is more than simply a deeper swimming pool. It brings new challenges, presents new opportunities, blurs traditional competitive boundaries and requires a paradigm shift in how we draw tangible value from data. The ocean of data, combined with the technologies that have been developed to handle it, provides insights at enormous scale and has made possible a new wave of machine learning, enabling computers to drive cars, predict heart attacks better than physicians and master extremely complex games such as Go better than any human.

Why is big data a game-changer? As we will see, it allows us to draw much deeper insights from our data, understanding what motivates our customers and what slows down our production lines. In real time, it enables businesses to simultaneously deliver highly personalized experiences to millions of global customers, and it provides the computational power needed for scientific endeavours to analyse billions of data points in fields such as cancer research, astronomy and particle physics. Big data provides both the data and the computational resources that have enabled the recent resurgence in artificial intelligence, particularly with advances in deep learning, a methodology that has recently been making global headlines.

Beyond the data itself, researchers and engineers have worked over the past two decades to develop an entire ecosystem of hardware and software solutions for collecting, storing, processing and analysing this abundant data. I refer to these hardware and software tools together as the big data ecosystem. This ecosystem allows us to draw immense value from big data for applications in business, science and healthcare. But to use this data, you need to piece together the parts of the big data ecosystem that work best for your applications, and you need to apply appropriate analytic methods to the data – a practice that has come to be known as data science.

All in all, the story of big data is much more than simply a story about data and technology. It is about what is already being done in commerce, science and society and what difference it can make for your business. Your decisions must go further than purchasing a technology. In this book, I will outline tools, applications and processes and explain how to draw value from modern data in its many forms.

Most organizations see big data as an integral part of their digital transformation. Many of the most successful organizations are already well along their way in applying big data and data science techniques, including machine learning. Research has shown a strong correlation between big data usage and revenue growth (50 per cent higher revenue growth¹), and it is not unusual for organizations applying data science techniques to see a 10–20 per cent improvement in key performance indicators (KPIs).

For organizations that have not yet started down the path of leveraging big data and data science, the number one barrier is simply not knowing if the benefits are worth the cost and effort. I hope to make those benefits clear in this book, along the way providing case studies to illustrate the value and risks involved.

In the second half of this book, I’ll describe practical steps for creating a data strategy and for getting data projects done within your organization. I’ll talk about how to bring the right people together and create a plan for collecting and using data. I’ll discuss specific areas in which data science and big data tools can be used within your organization to improve results, and I’ll give advice on finding and hiring the right people to carry out these plans.

I’ll also talk about additional considerations you’ll need to address, such as data governance and privacy protection, with a view to protecting your organization against competitive, reputational and legal risks.

We’ll end with additional practical advice for successfully carrying out data initiatives within your organization.

Overview of chapters

Part 1: Big data demystified

Chapter 1: The story of big data

How big data developed into a phenomenon, why big data has become such an important topic over the past few years, where the data is coming from, who is using it and why, and what has changed to make possible today what was not possible in the past.

Chapter 2: Artificial intelligence, machine learning and big data

A brief history of artificial intelligence (AI), how it relates to machine learning, an introduction to neural networks and deep learning, how AI is used today and how it relates to big data, and some words of caution in working with AI.

Chapter 3: Why is big data useful?

How our data paradigm is changing, how big data opens new opportunities and improves established analytic techniques, and what it means to be data-driven, including success stories and case studies.

Chapter 4: Use cases for (big) data analytics

An overview of 20 common business applications of (big) data, analytics and data science, with an emphasis on ways in which big data improves existing analytic methods.

Chapter 5: Understanding the big data ecosystem

Overview of key concepts related to big data, such as open-source code, distributed computing and cloud computing.

Part 2: Making the big data ecosystem work for your organization

Chapter 6: How big data can help guide your strategy

Using big data to guide strategy based on insights into your customers, your product performance, your competitors and additional external factors.

Chapter 7: Forming your strategy for big data and data science

Step-by-step instructions for scoping your data initiatives based on business goals and broad stakeholder input, assembling a project team, determining the most relevant analytics projects and carrying projects through to completion.

Chapter 8: Implementing data science – analytics, algorithms and machine learning

Overview of the primary types of analytics, how to select models and databases, and the importance of agile methods to realize business value.

Chapter 9: Choosing your technologies

Choosing technologies for your big data solution: which decisions you’ll need to make, what to keep in mind, and what resources are available to help make these choices.

Chapter 10: Building your team

The key roles needed in big data and data science programmes, and considerations for hiring or outsourcing those roles.

Chapter 11: Governance and legal compliance

Principles in privacy, data protection, regulatory compliance and data governance, and their impact from legal, reputational and internal perspectives. Discussions of PII, linkage attacks and Europe’s new privacy regulation (GDPR). Case studies of companies that have got into trouble through inappropriate use of data.

Chapter 12: Launching the ship – successful deployment in the organization

Case study of a high-profile project failure. Best practices for making data initiatives successful in your organization, including advice on making your organization more data-driven, positioning your analytics staff within your organization, consolidating data and using resources efficiently.
