278 | Big Data Simplified
11.1 INTRODUCTION
Over the course of this book, we have seen a number of Big Data applications. From a functional
perspective, there are applications in areas such as manufacturing, retail and finance. We can
also look at applications from a technological perspective: Big Data provides an essential
foundation for a number of emerging technologies, such as Analytics and Data Science,
Recommendation Engines and the Internet of Things (IoT). But why does a particular organization
need a Big Data program in the first place? This question has to be answered before anything else,
because a strong business and technology reason is needed to adopt and sustain a Big Data program.
Once the reason is in place, one needs to define the data lifecycle for the enterprise. This
means identifying where the data originates, who the producers and owners of the data are, what
the touchpoints of the data are as it flows through the enterprise (that is, which enterprise
applications use and modify it) and, finally, who the consumers of the data are. This provides
the basis for defining the data and integration architectures for the enterprise, as well as the
processes around information management.
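As an illustration, the lifecycle questions above can be written down as a simple record per data set. The following is only a minimal sketch: the class names, field names and the example values are hypothetical and do not come from any particular tool or from this book.

```python
from dataclasses import dataclass, field


@dataclass
class Touchpoint:
    """An enterprise application that touches a data set (hypothetical model)."""
    application: str
    modifies: bool = False  # True if the application changes the data, not just reads it


@dataclass
class DataLifecycle:
    """One data set's lifecycle: origin, producers, owners, touchpoints, consumers."""
    dataset: str
    origin: str
    producers: list[str] = field(default_factory=list)
    owners: list[str] = field(default_factory=list)
    touchpoints: list[Touchpoint] = field(default_factory=list)
    consumers: list[str] = field(default_factory=list)

    def modifying_applications(self) -> list[str]:
        """Applications that modify the data as it flows through the enterprise."""
        return [t.application for t in self.touchpoints if t.modifies]

    def unanswered_questions(self) -> list[str]:
        """Lifecycle questions that still have no answer for this data set."""
        checks = {
            "origin": self.origin,
            "producers": self.producers,
            "owners": self.owners,
            "touchpoints": self.touchpoints,
            "consumers": self.consumers,
        }
        return [name for name, value in checks.items() if not value]


# Illustrative example only; every name below is invented.
orders = DataLifecycle(
    dataset="customer_orders",
    origin="web storefront",
    producers=["order-entry service"],
    owners=["sales operations"],
    touchpoints=[
        Touchpoint("billing system", modifies=True),
        Touchpoint("reporting dashboard"),
    ],
)

print(orders.modifying_applications())
print(orders.unanswered_questions())
```

A record of this kind makes gaps visible early: in the example, the consumers of the data set have not yet been identified, so that part of the lifecycle is still undefined.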
Finally, there is the question of which platform to choose. As we have seen, there are a number
of Big Data platforms. We need a clear understanding of the objective of the Big Data program and
of the nature of the data and integration architecture of the enterprise in order to make the
tooling choices that are best suited to that enterprise.
This chapter explores all these concepts. It also examines why Big Data projects fail and which
common pitfalls to avoid.
11.2 TWO TYPICAL BIG DATA USE CASES
There are two broad use cases for Big Data adoption:
a. Big Data adoption for cost optimization
b. Big Data adoption for enhanced value
11.2.1 Big Data Primarily for Cost Reduction
As discussed in the initial chapters of this book, organizations have been generating, storing
and processing huge volumes of data for several years now. Typically, this data is processed and
stored information about their customers, the products manufactured or services offered,
employees, suppliers and vendors, the locations where the companies operate, and the transactions
and business operations involving all these entities; the list is endless. Even before the
emergence of the Hadoop Distributed File System (HDFS), organizations traditionally maintained
large repositories of data.
This data is mostly structured data, and it is stored in relational databases, as explained in
the early chapters of the book where the different types of data were discussed. There are also
huge volumes of structured data in data warehouses. We shall look into what a data warehouse is
and how it differs from the more modern concept of a Data Lake. For now, it is sufficient to
understand that a data warehouse is a storehouse of integrated data pouring in from data sources
across different parts of an enterprise. The data stored in a data warehouse is used primarily
for reporting and analysis, and to support various processes related to making business decisions.
Typically, the data is gathered from transactional and operational sources.