11
Using Big Data and Analytics
to Manage Risk
By now you've probably heard about or have had some experience with
something called "big data." While we may have heard of the concept, taking
advantage of the treasure trove of data that resides at most companies
remains an evolving challenge. With an estimate of more than 15 million
gigabytes of new information collected every day (15 petabytes), which is
eight times the information in all U.S. libraries, it's no wonder most
companies are wondering how to use big data to their advantage.¹
But is using big data going to be that straightforward? A report titled
Big Data Insights and Innovations Report revealed some findings that
relate directly to big data and its uses.² First, many organizations are
challenged by data overload and an abundance of trivial information. And
important data are not reaching practitioners in efficient time frames.
Current technology is also not yet at the level of providing measurable,
reportable, and quantifiable data in areas including production scheduling,
inventory, and customer demand across the entire supply chain.
Furthermore, despite the sophistication of current systems, data are not
always easily accessible to internal users. Finally, noticeable gaps are
present in many end-to-end supply chain flow models. Other than these
"minor" issues, everything is working just fine in the world of big data and
risk management.
In this chapter we'll advance some definitions and an overview of big
data and predictive analytics; talk about the process for successfully
leveraging big data; present barriers and challenges with big data; and present
tools, techniques, and methodologies that support big data and analytics.
The chapter concludes with examples of companies using big data and how
these companies are leveraging their data to help manage supply chain risk.
204 • Supply Chain Risk Management: An Emerging Discipline
WHAT IS BIG DATA AND PREDICTIVE
ANALYTICS, REALLY?
To some observers big data got its start around 2003 with the advent of the
Data Center Program at Massachusetts Institute of Technology (MIT).³
Before this, most of the early research in the late 1990s used the term data
analytics as a key descriptor. It becomes critical to define the terms big
data and predictive analytics.
According to the Leadership Council of Information Advantage, big
data is not a precise term. This group sees it as data sets that are growing
exponentially and that are too large, too raw, or too unstructured for
analysis using relational database techniques. So, where is this unbelievable
amount of unstructured data coming from? According to one source,
the amount of data available is doubling every two years and is emanating
from not only traditional sources but also industrial equipment,
automobiles, electrical meters, and shipping crates, just to name a few. The
information gathered includes parameters such as location, movement,
vibration, temperature, humidity, and chemical changes in the air.⁴
Predictive analytics (PA) encompasses a variety of techniques from
statistics, data mining, and game theory that analyze current and historical
facts to make predictions about the future. In business, predictive models
exploit patterns found in historical and transactional data to identify risks
and opportunities. Models capture relationships among factors to allow
assessment of the risk or potential associated with a particular set of
conditions, guiding decision making for specific transactions.⁵
Predictive analytics has traditionally been used in actuarial science,
financial services, insurance, telecommunications, retail, travel, health
care, and pharmaceuticals. Yet it is barely mentioned in the manufacturing
and supply chain arenas. One of the best-known and earliest applications of
PA is credit scoring, which is used throughout financial services. Scoring
models process customers' credit history, loan applications, customer data,
and so forth, in an effort to rank-order individuals by their likelihood of
making future credit payments on time. A well-known example is the
FICO score.
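As a sketch of how a scoring model rank-orders applicants, consider a toy logistic model in Python. The feature names, weights, and bias below are invented purely for illustration; real scorecards such as FICO rely on proprietary models and far richer data.

```python
import math

# Hypothetical feature weights for a toy credit-scoring model.
# These numbers are illustrative only, not from any real scorecard.
WEIGHTS = {
    "payment_history": 2.0,    # fraction of on-time payments (0-1)
    "utilization": -1.5,       # fraction of credit limit in use (0-1)
    "years_of_history": 0.1,   # length of credit history in years
}
BIAS = -1.0

def probability_of_on_time_payment(applicant):
    """Logistic model: higher output means higher likelihood of paying on time."""
    z = BIAS + sum(WEIGHTS[k] * applicant[k] for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

applicants = [
    {"payment_history": 0.98, "utilization": 0.20, "years_of_history": 12},
    {"payment_history": 0.60, "utilization": 0.90, "years_of_history": 2},
]
# Rank-order applicants by predicted likelihood, as credit scoring does.
scores = [probability_of_on_time_payment(a) for a in applicants]
```

The point is not the particular weights but the pattern: historical factors are combined into a single number that orders individuals by predicted risk.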
IBM, a leading provider of big data systems, maintains that more than
90% of the data that exists in the world today was created within the last
two years. We are in an age where more than 2.5 quintillion bytes of data
are created every day! We are increasingly becoming familiar with terms
such as the following⁶:
• gigabytes (a unit of information equal to one billion (10⁹) or, strictly, 2³⁰ bytes)
• petabytes (2⁵⁰ bytes; 1,024 terabytes, or a million gigabytes)
• exabytes (a unit of information equal to one quintillion (10¹⁸) bytes, or one billion gigabytes)
• zettabytes (a unit of information equal to one sextillion (10²¹) or, strictly, 2⁷⁰ bytes)
• yottabytes (a unit of information equal to one septillion (10²⁴) or, strictly, 2⁸⁰ bytes)
Don't be concerned if these definitions are confusing. They confuse
us also.
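For concreteness, the decimal (SI) and strict binary readings of these units can be tabulated in a few lines of Python. The exabyte's binary value (2⁶⁰ bytes) is included here for completeness, though the text above gives only its decimal form.

```python
# Each unit has a decimal (SI) value and a strict binary value.
units = {
    "gigabyte":  (10**9,  2**30),
    "petabyte":  (10**15, 2**50),
    "exabyte":   (10**18, 2**60),
    "zettabyte": (10**21, 2**70),
    "yottabyte": (10**24, 2**80),
}
for name, (decimal, binary) in units.items():
    print(f"{name}: decimal {decimal:,} bytes, binary {binary:,} bytes")
```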
IBM has been at the forefront of articulating the concept of big data.⁷ In
one of its analyses, the company concludes that big data, which admittedly
means many things to many people, is no longer confined to the realm
of technology. It has become a business priority given its ability to affect
commerce in a globally integrated economy. Organizations are using big
data to target customer-centric outcomes, tap into internal data, and build
a better information ecosystem. IBM has created a typology that looks
at big data in terms of four dimensions that conveniently start with the
letter V.
e rst dimension of big data is volume, which represents the sheer
amount of data. Perhaps the characteristic most associated with big data,
volume refers to the mass quantities of data that organizations are try-
ing to harness to improve decision making. As mentioned, data volumes
continue to increase at an unprecedented rate. However, what consti-
tutes truly high volume varies by industry and even geography and can
be smaller than the petabytes and zettabytes oen referenced in articles
and statistics.
Next, variety refers to the different types of data and data sources. This
dimension is about managing the complexity of multiple data types, including
structured, semistructured, and unstructured data. Organizations
need to integrate and analyze data from a complex array of both traditional
and nontraditional information sources within and outside the enterprise.
With the proliferation of sensors, smart devices, and social collaboration
technologies, data are being generated in countless forms such as text, web
data, tweets, sensor data, audio, video, click streams, log files, and much
more. The bottom line is that data come in many forms.
The third dimension, velocity, refers to data in motion. The speed with
which data are created, processed, and analyzed continues to accelerate.
Contributing to this higher velocity is the real-time nature of data
creation, especially within global supply chains, as well as the need to
incorporate streaming data into business processes and decision making.
Velocity impacts latency, the lag time between when data are created or
captured and when they are accessible and able to be acted upon. Data are
continually being generated at a pace that is impossible for traditional
systems to capture, store, and analyze, resulting in the development of new
technologies with new capabilities.
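One minimal illustration of acting on data in motion rather than at rest: a fixed-size sliding window that updates a rolling statistic as each reading arrives, instead of waiting for a periodic batch load. The window size, alert threshold, and temperature readings below are invented for illustration.

```python
from collections import deque

class SlidingWindowMonitor:
    """Process each reading as it arrives, keeping latency to one reading."""

    def __init__(self, window_size, alert_threshold):
        # deque with maxlen automatically drops the oldest reading.
        self.window = deque(maxlen=window_size)
        self.alert_threshold = alert_threshold

    def ingest(self, reading):
        """Return True if the rolling mean breaches the threshold."""
        self.window.append(reading)
        rolling_mean = sum(self.window) / len(self.window)
        return rolling_mean > self.alert_threshold

# Hypothetical stream of container temperature readings.
monitor = SlidingWindowMonitor(window_size=5, alert_threshold=30.0)
temperatures = [21.0, 22.5, 24.0, 31.0, 35.5, 38.0, 40.5]
alerts = [monitor.ingest(t) for t in temperatures]
# The alert fires only once the rolling mean of the last 5 readings
# exceeds 30.0, i.e. at the sixth and seventh readings.
```

A batch system would have surfaced the same trend only after the next load; the streaming version reduces the lag between data capture and action to a single reading.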
Finally, veracity refers to the level of reliability associated with certain
types of data. Striving for high-quality data is an important big data
requirement and challenge, but even the best data cleansing methods cannot
remove the inherent unpredictability of some data, like the weather,
the global economy, or a customer's future buying decisions. The need to
acknowledge and plan for uncertainty is a dimension of big data that
executives must understand as they confront the uncertain world of
risk around big data. Veracity requires the ability to manage the reliability
and predictability of imprecise data types.
A good portion of the data within global supply chains is inherently
uncertain. The need to acknowledge and embrace this level of uncertainty
is the hallmark of big data and supply chain risk management. An example
is energy production, where the weather is uncertain but a utility
company must still forecast production. In many countries, regulators
require a percentage of production to emanate from renewable sources,
yet neither wind nor clouds can be forecast with precision. So, what to
do? To manage this uncertainty, analysts, whether in energy or supply chain
management, need to create context around the data.
One way to manage data uncertainty is through something called data
fusion, where combining multiple, less-reliable sources creates a more
accurate and useful set of data points, such as social media comments
appended to geospatial location maps. Another way to manage uncertainty
is through advanced mathematics that embraces uncertainty, such
as probabilistic modeling, discrete-event simulation, and multivariate
nonlinear analyses coupled with failure mode and effects analysis (FMEA).
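As a minimal sketch of the data fusion idea, the snippet below combines two unreliable estimates of the same quantity by weighting each one inversely to its variance, a standard way to produce a fused estimate more reliable than either source alone. The sensor values and variances are invented.

```python
def fuse(estimates):
    """Inverse-variance weighting of independent measurements.

    estimates: list of (value, variance) pairs.
    Returns (fused_value, fused_variance); the fused variance is
    always smaller than the smallest input variance.
    """
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, 1.0 / total

# Two hypothetical temperature readings for the same shipping crate:
# the second sensor is more reliable (lower variance), so it dominates.
noisy_sources = [(20.0, 4.0), (24.0, 1.0)]
fused_value, fused_variance = fuse(noisy_sources)
```

Here the fused variance is 1 / (1/4 + 1/1) = 0.8, below either source's variance: combining less-reliable sources yielded a more accurate data point, which is the essence of data fusion.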
Most observers predict a major impact of big data and predictive analytics
on the global economy. In a recent Fortune article, an expert from
Gartner suggested that over a relatively short time period, more than four
million positions worldwide will emerge for analytic talent, of which only
about one third will be filled.⁸ Dice.com has identified the top 10 technical
skills big data will need over the next several years. By a large margin
the first is Hadoop plus Java, which is not surprising since Java powers
Yahoo, Amazon, eBay, Google, and LinkedIn. After that it is Developer,
NoSQL, MapReduce, BigData, Pig, Linux, Python, Hive, and Scala.
The shortage of professional skills in Hadoop and NoSQL has given rise
to higher pay for qualified hires, topping $100K on average. The real winner
here could be the U.S. economy. Anticipating a multiplier effect, one
observer predicts that for every big data-related role in the United States,
employment for three people outside IT will be created.⁹ While the rise of
big data presents opportunities, a shortage of qualified IT professionals
also exposes an organization to risk.
As we conclude this overview of big data and predictive analytics, it
would be appropriate to close this section with some key findings from
IBM's research into big data. First, across multiple industries, the business
case for big data is strongly focused on addressing customer-centric
objectives. Second, a scalable and extensible information management
foundation is a prerequisite for big data advancement. Third, organizations are
beginning their pilot and implementation programs by using existing and
newly accessible internal sources of data. Next, advanced analytics
capabilities are required, yet often lacking, for organizations to get the most
value from big data. And finally, as organizations' awareness of and
involvement in big data grows, four stages of adoption emerge, which the next
section presents.
THE PROCESS OF SUCCESSFULLY LEVERAGING
BIG DATA FOR MAXIMUM BENEFIT
Many of the cases we describe later in the chapter maintain the hallmarks
of supply chain analytic implementations. These hallmarks include a clear
business problem with supporting metrics; a focus on fact-based decision
making and on improving business key performance indicators (KPIs); and
the establishment of an end-to-end, enterprise-wide process that is
championed by C-level management. Other characteristics include forward-looking
scenarios and causal analysis to understand variability and performance