Introduction

Data management and analytic practices have changed dramatically since I entered the industry in 1998. Data volumes are exploding beyond imagination, easily in the petabytes. There are many varieties of data that we are collecting, both structured and semi-structured data. We are acquiring data at much higher velocity, demanding daily renewal, sometimes even hourly. As the Greek philosopher Heraclitus so wisely stated centuries ago, “The only thing that is constant is change.”

WHY YOU SHOULD READ THIS BOOK

The management of data, and how we handle and analyze it, has changed dramatically since the start of the “big data” era. Ultimately, all of the data must deliver information for decision making. It is definitely an exciting time that creates many challenges but also great opportunities for all of us to explore and adopt new and disruptive technologies to help with data management and analytical needs. And, now, the journey of this book begins.

I have attended a number of conferences where I have been able to share with both business and IT audiences the technologies that can help them manage their data more effectively, in turn creating a more streamlined analytical life cycle. I have learned from customers the challenges they encounter and the fascinating things they are doing with agile analytics to drive innovation and gain competitive advantage for their companies. These are the biggest and most common themes:

  • “How can I integrate data management and analytical processes into a unified environment to make my processes run faster?”
  • “I do NOT have days or weeks to prepare my data for analysis.”
  • “My analytical process takes days and weeks to complete, and by the time it is completed, the information is outdated.”
  • “My staff is spending too much time with tactical data management tasks and not enough time focusing on strategic analytical exploration.”
  • “What can I do to keep my staff from leaving because their work is no longer challenging?”
  • “My data is scattered all over. Where do I go to get the most current version of the data for analysis?”

A good friend of mine, who is an editor, approached me about writing a book that combines real-world customer successes with the concepts those customers adopted from presentations and white papers that I have authored over the years. After a few months of developing the abstracts, outlines, and chapters, we agreed to proceed with publishing this book with a focus on customer success stories in each section. My goals for this book are to:

  • Educate on what innovative technologies are available for integrating data management and analytics in a cohesive environment.
  • Inform about what fascinating technologies leading edge companies are adopting and implementing to help them solve some of the big data challenges.
  • Share customer case studies and successes across industries such as retail, banking, telecommunications, e-commerce, and transportation.

Whether you are from business or IT, I believe you will appreciate the real-world best practices and use cases that you can leverage in your profession. These best practices have been proven to help provide faster data-driven insights and decisions.

Writing this book was a privilege and honor. Mixed feelings went through my head as I started writing the book even though I was excited about sharing my experiences and customer successes with other IT and business professionals. The reasons for the mixed feelings were twofold:

  1. Will the technology discussed in this book still be considered innovative or relevant when the book is published?
  2. How can I bring value to the readers who consider themselves to be innovators and leaders in the IT market?

Customer interactions are very important to me and a highlight of my profession. I have talked to many customers globally, tried to understand their business problems, and advised them on the appropriate technologies and solutions to solve their issues. I have also traveled around the world, sharing with customers and prospects the latest technologies and innovations in the market and how some of the leading-edge companies have adopted them to be more competitive and become pioneers in managing data and applying analytics in a unified environment. Before I dive into the details, I believe it is appropriate to set the tone by establishing the definitions that are referenced throughout this book and by describing some industry trends that demand inventive technologies to sustain leadership in a competitive, global economy. This book focuses on data management and analytics and on how to unite these two elements into a single entity for optimal performance, economics, and governance, all of which are key initiatives for business and IT in many corporations.

LET'S START WITH DEFINITIONS

The term data management has been around for a long time and has transformed into many other trendy buzzwords over the years. However, for simplification purposes, I will use the term data management since it is the foundation for this book. I define data management as a process by which data are acquired, integrated, and stored for data users to access. Data management is often associated with the ETL (extraction, transformation, and load) process to prepare the data for the database or warehouse. The ETL process is very much embedded into the data management environment. The ultimate result from the ETL process is to satisfy data users with reliable and timely data for analytics.
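As a rough illustration of the ETL process described above, here is a minimal Python sketch. It is not tied to any particular product: the sample data, the `sales` table, and the function names are assumptions made up for this example, and an in-memory SQLite database stands in for the database or warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract; the columns and values are illustrative only.
RAW_CSV = """customer_id,amount
101, 25.50
102,
103, 40.00
"""


def extract(text):
    """Extract: read rows from the source (here, CSV text held in memory)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and convert field types."""
    cleaned = []
    for row in rows:
        amount = (row["amount"] or "").strip()
        if not amount:
            continue  # skip incomplete records so data users get reliable data
        cleaned.append((int(row["customer_id"]), float(amount)))
    return cleaned


def load(records, conn):
    """Load: store the cleaned records in the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer_id INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run the three steps in order and return (row count, total amount)."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT COUNT(*), SUM(amount) FROM sales").fetchone()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_etl(RAW_CSV, connection))
```

Keeping the three steps as separate functions mirrors the definition: each stage can be checked on its own before the data reach the users who depend on them.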

There are many definitions for analytics, and the focus on analytics has recently been on the rise. Its popularity has reemerged since the 1990s because many companies across industries have recognized the value of analytics and the field of data analysis to analyze the past, present, and future with data. Analytics can be very broad and has become the catch-all term for a variety of different business initiatives. According to Gartner, analytics is used to describe statistical and mathematical data analysis that clusters, segments, scores, and predicts what scenarios have happened, are happening, or are most likely to happen.[1] Analytics have become the link between IT and business to exploit massive mounds of data. Based on my interactions with customers, I define analytics as a process of analyzing large amounts of data to gain knowledge and understanding about your business and deliver data-driven decisions to make business improvements or changes within an organization.

INDUSTRY TRENDS AND CHALLENGES

Now that the definitions have been established, let's examine the state of the IT industry and what customers are sharing with me regarding the challenges they encounter in their organizations:

  • Data as a differentiator and an asset: Forbes Research concurs that data is a differentiator and an asset.[2] As an industry, we are data rich but knowledge poor because organizations are unable to make sense of all the data they collect. We are barely scratching the surface when it comes to analyzing all of the data that we have access to or can acquire. In addition, the ability to analyze the data has become much more complex, and companies may not have the right infrastructure and/or tools to do the job effectively and efficiently. As data volumes continue to grow, it is imperative to have the proper foundation for managing big data and beyond.
  • Analytics for everything: Customers demand real-time analytics to empower data-driven decisions at every level, from the CEO to the factory operator. Based on recent TechRepublic research, 70 percent of the respondents use analytics in some shape or form to drive performance and decisions. Whether it is to open a brand new division or develop another product line, the right decision will have a significant impact on the bottom line and, ultimately, the organization's success. As business becomes more targeted, personalized, and public, it is vital to make precise, fact-based (data-driven), transparent decisions. These decisions need to have an auditable history to show regulatory compliance and risk management.
  • The “now” factor: It seems that the X factor a company should possess is immediate availability of products and services for its consumers. For example, the retail industry is facing the “now” factor challenge. Extremely low prices and great service are no longer enough to attract consumers. Businesses need to have what consumers are looking for, such as the right color, size, and fit, when they need it. That is the key to attracting and retaining customers. Consumers are willing to pay a premium for product availability. Based on a retail survey from Forbes, 58 percent said availability is more important than price, and 92 percent said they will not wait for products to come into stock. Companies must outsmart their competition and be able to share information with customers about the readiness of products and services.

These trends translate into challenges and opportunities for companies in every industry. The customers that I deal with consider these as their top three challenges:

  • Database performance: With a database architecture that may not scale to match the amount of data, it's difficult to process full data sets—or accomplish data discovery, analysis, and visualization activities.
  • Analytical capabilities: Because of inefficient data access and time-consuming data preparation, analysts tend to focus on solving access issues instead of running tactical analytical processes and strategic tasks. In addition, there is an inability to develop and process complex analytic models fast enough to keep up with economic changes.
  • Data quality and integration: Having a multitude of data varieties, siloed data marts, and localized data extracts makes it difficult to get a handle on exactly how much data there is and what kind. When data are not in one location and/or data management is disjointed, data quality is questionable. When quality is questionable, results are uncertain.

Data is every organization's strategic asset. Data provide information for operational and strategic decisions. Because we are collecting many more types of data (from websites, social media, mobile, sensors, etc.) and the speed at which we collect the data has significantly accelerated, data volumes have grown exponentially. Customers that I have spoken to have doubled their data volumes in less than 24 months, which outpaces Moore's law (the observation, made over 50 years ago, that the number of transistors on a chip doubles roughly every 24 months). With the pace of change escalating faster than ever, customers are looking for the latest innovations in technology to satisfy their needs in both IT and business and to transform every challenge into an opportunity that positively impacts profitability and the bottom line. I truly believe that new and innovative technologies such as in-database processing, in-memory analytics, and the emerging Hadoop technology will help tame the challenges of managing big data, uncover new opportunities with analytics, and deliver a higher return on investment by augmenting data management with integrated analytics.
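To put the doubling figure above in annual terms, the short Python sketch below converts a doubling period into a compound annual growth rate and projects a volume forward. The function names and the 100-terabyte starting volume are assumptions chosen for illustration, not customer figures.

```python
def annual_growth_rate(doubling_months):
    """Return the compound annual growth rate implied by a doubling period."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2 ** (12 / doubling_months) - 1


def projected_volume(start_volume, months, doubling_months):
    """Project a volume forward, assuming it doubles every doubling_months."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return start_volume * 2 ** (months / doubling_months)


if __name__ == "__main__":
    # Doubling every 24 months works out to roughly 41 percent growth per year.
    print(f"{annual_growth_rate(24):.1%}")
    # A hypothetical 100 TB environment, four years out, at that pace.
    print(projected_volume(100, 48, 24))
```

Any doubling period shorter than 24 months implies annual growth above that figure, which is why capacity planned around a two-year doubling falls behind.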

WHO SHOULD READ THIS BOOK?

This book is for business and IT professionals who want to learn about new and innovative technologies and learn what their peers have done to be successful in their line of work. It is for the business analysts who want to be smarter at delivering information to different parts of the organization. It is for the data scientists who want to explore new ways to apply analytics. It is for managers, directors, and executives who want to innovate and leverage analytics to make data-driven decisions impacting profitability and the livelihood of their business.

You should read this book if your profession is in one of these groups:

  • Executive managers, including chief executive officers, chief operating officers, chief strategy officers, chief marketing officers, or any other company leader, who want to innovate and drive efficiency or deliver strategic goals
  • Line-of-business managers who oversee existing technologies and want to adopt new technologies for the company
  • Sales managers and account directors who want to introduce new concepts and technologies to their customers
  • Business professionals such as business analysts, program managers, and offer managers who analyze data and deliver information to the leadership team for decision making
  • IT professionals who manage the data, ensuring data quality and integration, so that the data can be available for analytics

This book is ideal for professionals who want to improve the data management and analytical processes of their organization, explore new capabilities by applying analytics directly to the data, and learn from others how to be innovative and become pioneers in their organization.

HOW TO READ THIS BOOK

This book can be read in a linear manner, chapter by chapter. It proceeds very much as a process of crawling, walking, sprinting, then running. However, if you are already familiar with the concepts of in-database processing, in-memory analytics, or Hadoop, you can simply skip to the chapter that is most relevant to your situation. If you are not familiar with any of the topics, I highly suggest starting with Chapter 1, as it highlights the analytical life cycle of the data and the typical journey data take to become information and insights for your organization. You can then proceed to Chapters 2 to 4 (crawl, walk, sprint) to see how specific technologies can be applied directly to the data. Chapter 5 (how to run the relay) brings all of the elements together and shows how each technology can help to manage big data and advanced analytics. Chapter 6 discusses the top five focus areas in data management and analytics as well as possible future technologies.

Table 1 provides a description and focus for each chapter.

Table 1 Outline of the Chapters

Chapter 1. The Analytical Data Life Cycle
Description: The purpose of this chapter is to illustrate the typical life cycle of data and the stages (data exploration, data preparation, model development, and model deployment) involved in transforming data into strategic insights using analytics.
Takeaway:
  • What is the analytical data life cycle?
  • What are the characteristics of each stage of the life cycle?
  • What technologies are best suited for each stage of the data?

Chapter 2. In-Database Processing
Description: The purpose of this chapter is to introduce the reader to the concept of in-database processing. In-database processing refers to the integration of advanced analytics into the database or data warehouse. With this capability, analytic processing is optimized to run where the data reside, in parallel, without having to copy or move the data for analysis.
Takeaway:
  • What is in-database processing?
  • Why in-database processing?
  • What process should leverage in-database?
  • What are some best practices?
  • What are some use cases and success stories?
  • What are the benefits of using in-database analytics?

Chapter 3. In-Memory Analytics
Description: The purpose of this chapter is to introduce the reader to the concept of in-memory analytics. This latest innovation provides an entirely new approach to tackling big data by using an in-memory analytics engine to deliver super-fast responses to complex analytical problems.
Takeaway:
  • What is in-memory analytics?
  • Why in-memory analytics?
  • What process should leverage in-memory analytics?
  • What are some best practices?
  • What are some use cases and success stories?
  • What are the benefits of using in-memory analytics?

Chapter 4. Hadoop and Big Data
Description: The purpose of this chapter is to explain the value of Hadoop. Organizations face unique big data challenges as they collect more data than ever before, both structured and semi-structured. There has never been a greater need for proactive and agile strategies to manage and integrate big data.
Takeaway:
  • What is Hadoop?
  • Why Hadoop in a big data environment?
  • How does Hadoop play in the modern architecture?
  • What are some best practices?
  • What are some use cases and success stories?
  • What are the benefits of using Hadoop in big data?

Chapter 5. End-to-End: Bringing It All Together
Description: The purpose of this chapter is to summarize and bring together the various technologies and concepts shared in Chapters 2 to 4. Combining traditional methods with modern and new approaches can save time and money for any organization.
Takeaway:
  • How are in-database analytics, in-memory analytics, and Hadoop complementary?
  • What are use cases and customer success stories?
  • What are some benefits of an integrated data management and analytic architecture?

Chapter 6. Conclusion and Forward Thoughts
Description: The purpose of this chapter is to conclude the book with the power of having an end-to-end data management and analytics platform for delivering data-driven decisions. It also provides final thoughts about the future of technologies.
Takeaway:
  • What is the future for data management?
  • What is the future for analytics?
  • What are the top five focus areas in data management and analytics?

LET YOUR JOURNEY BEGIN

An organization's most valuable asset is its customers. Yet right next to customers is the precious asset that the enterprise can leverage to attract, retain, and interact with those valuable customers for profitable growth: your data. Every organization that I have encountered has huge tidal waves of data streaming in from every direction, from multiple channels and a variety of sources. Data are everywhere, as far as the eye can see! All day, every day, data flow into and through the business and your database or data warehouse environment. Now, let's examine how all your data can be analyzed in an efficient and effective process to deliver data-driven decisions.

ENDNOTES
