Preface

According to Dr. Genichi Taguchi's quality loss function (QLF), there is an associated loss when a quality characteristic deviates from its target value. The loss function concept can easily be extended to the data quality (DQ) world. If the quality levels associated with the data elements used in various decision-making activities are not at the desired levels (also known as specifications or thresholds), then calculations or decisions based on this data will not be accurate, resulting in huge losses to the organization. The overall loss (referred to as “loss to society” by Dr. Taguchi) includes direct costs, indirect costs, warranty costs, reputation costs, loss due to lost customers, and costs associated with rework and rejection. The results of this loss include system breakdowns, company failures, and company bankruptcies. In this context, everything is considered part of society (customers, organizations, government, etc.). The effect of poor data quality during the global financial crisis that began in 2007 cannot be ignored, because inadequate information technology and data architectures to support the management of risk were considered one of the key factors.

Because of the adverse impacts that poor-quality data can have, organizations have begun to increase their focus on data quality, and they now view data as a critical resource alongside people, capital, raw materials, and facilities. Many companies have started to establish a dedicated data management function in the form of the chief data office (CDO). An important component of the CDO is the data quality team, which is responsible for ensuring high quality levels for the underlying data and ensuring that the data is fit for its intended purpose. The responsibilities of the DQ team should include building an end-to-end DQ program and executing it with appropriate concepts, methods, tools, and techniques.

Much of this book is concerned with describing how to build a DQ program with an operating model that has a four-phase DAIC (Define, Assess, Improve, and Control) approach and showing how various concepts, tools, and techniques can be modified and tailored to solve DQ problems. In addition, discussions on data analytics (including the big data context) and establishing a data quality practices center (DQPC) are also provided.

This book is divided into two sections, Section I: Building a Data Quality program and Section II: Executing a Data Quality program, with 14 chapters covering various aspects of the DQ function. In the first section, the DQ operating model (DQOM) and the four-phase DAIC approach are described. The second section focuses on a wide range of concepts, methodologies, approaches, frameworks, tools, and techniques, all of which are required for successful execution of a DQ program. Wherever possible, case studies or illustrative examples are provided to make the discussion more interesting and provide a practical context. In Chapter 13, which focuses on data analytics, emphasis is given to having good quality data for analytics (even in the big data context) so that benefits can be maximized. The concluding chapter highlights the importance of building an enterprise-wide data quality practices center. This center helps organizations identify common enterprise problems and solve them through a systematic and standardized approach.

I believe that the application of approaches or frameworks provided in this book will help achieve the desired levels of data quality and that such data can be successfully used in the various decision-making activities of an enterprise. I also think that the topics covered in this book strike a balance between rigor and creativity. In many cases, there may be other methods for solving DQ problems. The methods in this book present some perspectives for designing a DQ problem-solving approach. In the coming years, the methods provided in this book may become elementary, with the introduction of newer methods. Before that happens, if the contents of this book help industries solve some important DQ problems, while minimizing the losses to society, then it will have served a fruitful purpose.

I would like to conclude this section with the following quote from Arthur Conan Doyle's The Adventure of the Copper Beeches:

“Data! Data!” I cried impatiently, “I cannot make bricks without clay.”

I venture to modify this quote as follows:

“Good data! Good data!” I cried impatiently, “I cannot make usable bricks without good clay.”

Rajesh Jugulum
