It is often said that Master Data Management enables an enterprise to create and use a “single version of the truth.” As such, Master Data Management applies to almost all industries and covers a broad category of corporate data. Banks, insurance companies and brokerage houses, service companies, hospitality companies, airlines, car manufacturers, publishing houses, healthcare providers, telecommunication companies, retail businesses, high-technology organizations, manufacturing, energy providers, and law firms have at least one common need—the need to have access to complete, accurate, timely, and secure information about their respective businesses’ financial and market performance, partners, customers, prospects, products, patients, providers, competitors, and other important business entities. Similarly, this need to create and use accurate and complete information about individuals and organizational entities applies equally well to the public sector. Governments at different levels across the globe need to know their citizens, patients, and persons of interest (POI) from multiple perspectives. They also need to understand what services are provided to which individuals and locations, where people and organizations reside, and so on.
As stated previously, the quest for a single version of the truth is not specific to any industry, geography, or data domain. Therefore, the scope of Master Data Management is very broad and may cover customer data, product data, supplier data, employee data, reference data, and other key types of data that should be used to consistently manage the entire enterprise in an integrated fashion. The primary vehicle by which Master Data Management enables this consistent and integrated management of the business is the ability to create and maintain an accurate and timely authoritative “system of record” for a given subject domain. In the case of customer data, for example, Master Data Management can support various aspects of customer, partner, and prospect information, including customers (both individuals and business entities), customer profiles, accounts, buying preferences, behavioral characteristics, service requests and complaints, contact information, and other important attributes.
The creation of a single version of truth and managing the quality of its data is a problem for almost every enterprise. The sheer number and variety of data quality issues and cross-system inconsistencies accumulated over long periods of time make it difficult even to list and prioritize the issues across the enterprise. This makes it practically impossible to focus on the data quality overall and concentrate on a single version of truth for every piece of information within one application or data domain. Who wants to boil the ocean? This rhetorical question reveals a great enterprise challenge: What categories of data should be a priority for the enterprise projects around the single version of truth and data quality?
Master Data Management resolves this uncertainty in priorities by clearly stating MDM focus.1 MDM claims that some entities (master entities) are more important than others because they are widely distributed across the enterprise as well as reside and are maintained in multiple systems and application silos.
Master entities are critical for multiple business and operational processes. Although these entities and associated data domains may comprise only a small percentage of an enterprise data model, their significance from the data-quality perspective is disproportionately high.
This point is particularly important when we consider the role of MDM in managing reference data. Reference data entities typically contain far fewer records than traditional master entities. Reference data deals with identifiers, categories, types, classifications, and other “pick lists” defined by the business and/or data governance policies.
Bringing “order” to master data often solves 60–80 percent of the most critical and difficult-to-fix data quality problems. Thus, sound enterprise practices in management and governance of master data directly contribute to the success of the organization. Mismanagement of these practices and lack of master data governance pose the highest risk.
MDM and the “Single Version of Truth”
In a number of cases, “single version” has been interpreted literally and simplistically as a version having only one physical record representing a customer, product, location, and so on, that is equally useful, available, and accessible to all functional groups across the enterprise. In reality, enterprises typically require a holistic panoramic view of master entities that includes multiple (sometimes 10–15) definitions and functional views approved by data owners and maintained according to established data governance policies. These views can have different sets of attributes, different tolerances to false positives or false negatives, different latency requirements, and so on. For instance, the marketing department may have requirements different from those of customer service and different from compliance. A definition of a customer can vary significantly between shipments and billing. “Individual,” “household,” and “institution” can represent a customer in different contexts and scenarios. Similarly, the notion of a product or service can vary broadly based on the context of discussion within an enterprise. The notion of a holistic and panoramic master entity view represents real-life MDM requirements much better.
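The idea that each functional view carries its own definition of a customer and its own tolerances can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `ViewPolicy` structure, the view names, the thresholds, and the latency figures are hypothetical, and the similarity score is presumed to come from some upstream matching step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewPolicy:
    """Hypothetical per-view settings; the numbers are illustrative only."""
    customer_grain: str       # what "customer" means for this view
    match_threshold: float    # minimum similarity score to treat two records as one
    max_latency_seconds: int  # how stale the view is allowed to be


VIEW_POLICIES = {
    # Lower threshold: this view accepts more false positives.
    "marketing": ViewPolicy("household", 0.80, 86_400),
    # Higher threshold: this view avoids merging two distinct people on one bill.
    "billing": ViewPolicy("individual", 0.95, 3_600),
}


def same_customer(score: float, view: str) -> bool:
    """Decide whether two records match under the given view's tolerance."""
    return score >= VIEW_POLICIES[view].match_threshold


score = 0.88  # assumed similarity score from some upstream matcher
for view in VIEW_POLICIES:
    print(view, same_customer(score, view))
```

With these assumed settings, the same pair of records is treated as one customer in the marketing view and as two in the billing view, which is the kind of divergence a holistic, panoramic view has to accommodate.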
In essence, MDM is a great way to prioritize data quality and information development problems and focus resources properly to maximize the return on a data quality effort. Expressed differently, MDM is an efficient approach to fixing enterprise data quality problems (cherry picking in the area of data quality and data management, with master data being the cherry in a huge data quality garden).
In the early stages of MDM, two separate domain-specific streams emerged: CDI (Customer Data Integration) and PIM (Product Information Management). Different software vendors focused on different data domains, and single-domain implementations were predominant. As MDM matured, the value of multidomain MDM became increasingly clear. For instance, it is much easier to build and maintain cross-domain relationships when MDM is implemented on a single multidomain platform.
Party, customer, product, account, organization, and location top the list of choices of what most companies recognize as their master data. The terms “party” and “product” have a variety of industry-specific variants. These entity types are not just widely distributed across application silos but are also represented by high volumes of records, which creates additional challenges.
In the case of customer data, for example, Master Data Management can support various aspects of customer, partner, and prospect information, including customer profiles, accounts, buying preferences, service requests and complaints, contact information, and other important attributes.
MDM strategy, architecture, and enabling technologies dealing with various aspects of customer data constitute the largest segment of the current Master Data Management market. One direct benefit of MDM is its ability to enable customer centricity and reach the often-elusive goal of creating and effectively using a “single and holistic version of the truth about customers.” This single version of the truth is one of the requirements to support the fundamental transformation of an enterprise from an account-centric business to a new, effective, and agile customer-centric business—a transformation that has a profound impact on the way companies conduct business and interact with their customers.
Master Data Management emerged as an area of enterprise data management over the last decade, and now has become a broadly recognized and fast-growing discipline.
Although the aspirations of MDM are not new, the interest in developing MDM solutions has grown significantly. We can now see MDM-related initiatives being implemented across a wide spectrum of industries. This timing is not accidental; there are several key reasons why implementing Master Data Management has become such a universal requirement for almost any business and industry. Some of these reasons are driven by recently adopted and emerging regulations.
A number of well-publicized corporate scandals and class-action shareholder lawsuits gave rise to new pieces of legislation and regulations, such as the Sarbanes-Oxley Act, the Basel II Capital Accord, and numerous Securities and Exchange Commission (SEC) rulings, all of which were focused on companies’ need and requirement to provide, use, and report accurate, verifiable, and relevant data about their financial performance and significant material events that could impact company valuations and shareholder value. A few examples illustrate this point:
• On February 17, 2009, U.S. President Barack Obama signed the American Recovery and Reinvestment Act (ARRA) into law. ARRA funds ($789 billion) are targeted toward rebuilding infrastructure and positioning the country to grow the next-generation economy. The healthcare part of the bill includes health information technology ($19 billion) to stimulate early adopters of Electronic Health Record (EHR) standards and technologies.
• Interoperable Health Infrastructure—a ten-year plan to build a national electronic health infrastructure in the U.S. The idea is that with interoperable electronic health records, always-current medical information could be available wherever and whenever the patient and attending healthcare professional need it across the healthcare ecosystem. The plan improves efficiency and lowers costs in the healthcare system through the adoption of state-of-the-art health information systems and significantly improved data management with the primary focus on patient and provider data.
In addition to overarching reporting regulations such as the Sarbanes-Oxley Act, companies have to comply with a multitude of local, state, federal, and international regulations focused on the following:
• Various aspects related to protecting enterprise data from unauthorized access, use, and compromise
• Capturing and enforcing customer privacy preferences
• Protecting customer data from malicious use and identity theft, the fastest-growing white-collar crime
Regulations such as the Gramm-Leach-Bliley Act, the Health Insurance Portability and Accountability Act (HIPAA), and state regulations such as California’s SB 1386 require that companies implement effective and verifiable security controls designed to protect data, ensure data integrity, and provide appropriate notifications in case of a security breach resulting in data privacy and integrity compromise.
The increased volume and global reach of money-laundering activities, the events of September 11, 2001, and growing awareness of terrorist threats gave rise to regulations such as the USA Patriot Act with its Anti-Money Laundering (AML) and Know Your Customer (KYC) provisions. These regulations require an enterprise not only to maintain accurate and timely data on its customers and their financial transactions, but also to manage this data in such a way that it can be analyzed to detect and prevent money-laundering or other fraudulent activities before these transactions can take place. An attempt to blow up Northwest Airlines Flight 253 on Christmas Day 2009 and the widely publicized failure of U.S. government agencies to take timely action to ensure the safety and security of the flight indicate that significant and complex problems in these areas persist.
These regulations require that an organization maintain integrity, security, accuracy, timeliness, and proper controls over the content and usage of corporate operational and customer data. In effect, this is a requirement to implement Master Data Management for any data subject area that needs to comply with key oversight tenets of Sarbanes-Oxley, Basel II, Gramm-Leach-Bliley, ARRA, and others.
In addition to the nondiscretionary requirements of regulatory compliance, the need for Master Data Management can be easily traced to the more traditional goal of improving customer service and customer experience management. Having an accurate “single version of the truth” allows an organization to understand the factors and trends that may affect the business. Here are some particulars:
• Having an authoritative master data set allows an organization to reduce costs by sunsetting and discontinuing old application systems that create and use various “local” versions of the data.
• Having accurate and complete data about customers and their interactions with the enterprise allows an organization to gain better insight into the customers’ goals, demands, abilities, and their propensity to request additional products and services, thus increasing the cross-sell and up-sell revenue opportunities.
• Having a complete picture of the customer allows an enterprise to offer a rich set of personalized services and appropriate treatments—the factors leading to improved customer experience and reduced customer attrition.
These goals and benefits, in turn, are driven by a number of factors related to the growing complexity and increased velocity of business and government activities, as discussed next:
• Business and government structures are evolving over time, and both the size and complexity of the organizations continue to grow to reflect the global, dynamic, and interconnected nature of the business and regulatory environments. In the case of business, we see increasing need to understand key business facts in a consistent, holistic way that integrates both cross-LOB (line of business) and cross-channel views. For example, a diversified global financial institution may create and maintain views of their customers that are specific to a particular line of business such as retail bank, investment bank, credit card division, and risk management division.
• Creating a panoramic, holistic view of the customer is a real business imperative that can significantly improve customer service, allow for better analysis of customer needs and credit risk exposure, and allow the organization to better understand the totality of the relationships it has or may have with the customer.
• A similar situation arises when an organization creates a specific customer view based on the channel it uses to interface with the customer (bank branch, online channel, broker-driven interactions, and so on). In this case, MDM can become an enabler of cross-channel integration.
• The need for an integrated, accurate, and consistent view of the customers, products, locations, vendors, and suppliers is further exacerbated in cases of industry consolidations via mergers and acquisitions (M&A), an activity that is affecting many industry segments and equally impacting small, medium, and very large enterprises.
These points apply not just to customers but to patients and service providers in the healthcare industry, travelers in the travel and leisure industry, citizens and persons of interest in government and law enforcement areas, and many other data domains, thus covering all the major types of Master Data Management.
Understanding the reasons for embarking on a Master Data Management initiative does not make it easier to accomplish the goals of MDM. Some significant challenges have to be overcome in order to make Master Data Management a reality. As the term “Master Data Management” implies, one of these challenges is centered on how to make data under management a “golden,” authoritative version known as the “master.”
For example, in the case of building Customer Relationship Management (CRM)2 solutions across sales, marketing, and customer service channels, master data may consist of customer personal information (for example, name, address, and tax identification number), their assets and account numbers, service/warranty records, and history of service or product complaints. In the healthcare industry, consider Enterprise Master Patient Index (EMPI), the healthcare-specific variant of MDM for patient data widely used in hospitals, medical centers, and Integrated Delivery Networks (IDNs). There, master data may include not only patient personal information but also some diagnostic and prescription data, data on healthcare providers (such as doctors and hospitals), health insurance information, and similar data points. In the consumer retail business, master data may include information on products, product codes, suppliers, contracts, stores, sales, inventory levels, current and planned sales promotions, and so on. Even within the organization, master data varies from one business unit to another. For example, the scope of the master data subset for the accounting department within a retail enterprise may include information on budgets, cost centers, department codes, company hierarchies, invoices, accounts payable, and accounts receivable. Of course, in this case, the goal of Master Data Management would be to eventually integrate various subsets of department-level master data into an enterprise-wide master data superset.
Whether it is about customers, products, partners, or invoices, having relevant information in the form of master data allows businesses to capture key performance indicators, analyze all business transactions, and measure results to increase business effectiveness and competitiveness.
In order to create domain-specific, complete, accurate, and integrated master data, an organization needs to develop and institutionalize processes that help to discover and resolve inconsistencies, incompleteness, and other data quality issues caused in significant part by the way established enterprises collect, store, and process data. Typically, the data that should be used to build the enterprise master is collected, stored, and processed by different business units, departments, and subsidiaries using different application systems; different definitions for the same data attributes; and different technologies, processes, formats, and transformation rules. The result is disjointed islands of data that manifest data quality issues in a number of ways, including the following:
• Semantic inconsistencies at the data attribute level include the following symptoms:
• Different business units often use the same data attributes to describe different entities. For example, a customer identifier for CRM master data may point to a social security number but could be a Dun & Bradstreet DUNS number for a supply-chain business area.
• Data attributes that describe business entities (product names, total revenue, and so on) often contain different values for the same attributes across different applications and lines of business. For example, a product name in one application may mean a product type in the other, and a product code in a third application.
• Business entities may be identified differently across the enterprise because different application systems may use different reference data sources.
• Inconsistencies in attribute-level data often go hand in hand with the inconsistencies across data-related business rules that define the way the data has to be formatted, translated, and used; these rules often vary from one business unit and application system to another.
• Data relationship inconsistencies impact the ability to identify explicit and/or inferred relationships between business entities (for example, accounts and payments, customers and households, and products and suppliers). These relationships are often defined differently in different applications and across lines of business. This is not a pure technology issue, although it is not unusual to find an organization that over time created various data stores designed strictly to support the business requirements of an individual business unit. This “stovepipe” design approach often results in situations that naturally create inconsistencies in data definitions, content, and structures, such as expressions of how various entities are related to one another.
• Business entities such as products, partners, and suppliers are sometimes inherently organized into hierarchies. For example, the corporate structure of a large supplier may contain a parent company and several levels of subsidiaries. Traversing these hierarchies is one of the requirements for applications that, for example, need to understand and manage intercompany relationships and to measure the total value of the transactions conducted with all business entities of a given corporate structure. Depending on the scope and design of an individual application, these hierarchies may be represented differently across system domains.
And the list can go on and on …
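The hierarchy point in the list above can be made concrete with a minimal sketch. It assumes a hypothetical child-to-parent map for one supplier and invented transaction totals per legal entity, and it rolls up the total value of transactions conducted with all entities of that corporate structure. The company names and figures are illustrative only.

```python
from collections import defaultdict

# Hypothetical supplier hierarchy: child -> parent (None marks the top-level parent).
PARENT_OF = {
    "Acme Holdings": None,
    "Acme Europe": "Acme Holdings",
    "Acme GmbH": "Acme Europe",
    "Acme Asia": "Acme Holdings",
}

# Hypothetical transaction totals recorded against each legal entity.
TRANSACTIONS = {
    "Acme Holdings": 100.0,
    "Acme Europe": 250.0,
    "Acme GmbH": 75.0,
    "Acme Asia": 40.0,
}


def build_children(parent_of):
    """Invert the child -> parent map into parent -> [children]."""
    children = defaultdict(list)
    for child, parent in parent_of.items():
        if parent is not None:
            children[parent].append(child)
    return children


def rolled_up_total(entity, children, transactions):
    """Sum transactions for an entity and all of its descendants."""
    total = 0.0
    stack = [entity]
    seen = set()
    while stack:
        node = stack.pop()
        if node in seen:  # guard against accidental cycles in source data
            continue
        seen.add(node)
        total += transactions.get(node, 0.0)
        stack.extend(children.get(node, []))
    return total


children = build_children(PARENT_OF)
print(rolled_up_total("Acme Holdings", children, TRANSACTIONS))  # 465.0
print(rolled_up_total("Acme Europe", children, TRANSACTIONS))    # 325.0
```

If two applications held different versions of `PARENT_OF` for the same supplier, the same roll-up would yield different totals, which is the inconsistency the list describes.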
This discussion of data quality may appear to be of a more traditional nature and only slightly related to the goals of Master Data Management. In fact, the issues of data quality raised here are the primary factors in making the MDM goal of data integration across the enterprise much harder.3
To put it another way, MDM is much more than traditional data quality initiatives: Whereas most of the data quality initiatives are concerned with improving data quality within the scope of a particular application area or at a level of the specific line of business, MDM is focused on solving data-quality concerns in an integrated fashion across the entire enterprise.
Moreover, MDM is intrinsically linked with enterprise business processes and Business Process Management (BPM) technologies. Indeed, many business processes are designed assuming that accurate, complete data that can be trusted to execute business transactions and make key business decisions is available as and when needed, preferably in the form of a trusted authoritative “system of record” (SOR). As defined in the previous sections, this system of record is created by an MDM solution. Conversely, business process owners need to realize that, more often than not, their processes must be designed not only to access master data from the MDM system but also with the awareness that the data these processes create can be of very questionable quality when it is consumed by an MDM system. In short, MDM and BPM have a profound impact on each other as well as on the overall data quality and the resulting effectiveness and behavior of the business.
This enterprise-wide impact of MDM on data quality of established and newly enabled business processes is a direct consequence of the MDM goal of achieving a “single (holistic) version of the truth.” The way Master Data Management approaches this goal of delivering an integrated data view is by matching key data attributes across different application systems, lines of business, and enterprise entities in order to identify and link similar data records into uniquely identified groups of records, sometimes called “affinity clusters.” For example, a customer-focused MDM solution for a financial institution would attempt to find all records about an individual customer from all available data sources that come from various lines of business, such as banking, credit cards, insurance, and others, and link them into a group of all individuals who comprise that customer’s household. Fundamentally, this matching and linking activity is infeasible or at least unreliable if the data that is being matched displays the properties of inconsistency, inaccuracy, incompleteness, and other data quality issues discussed earlier in this section.
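The matching-and-linking step described above can be sketched in a few lines. This is a simplified illustration, not a description of any particular MDM product: the records, the `name_key` normalization, and the match rule (same tax ID, or same normalized name plus same date of birth) are all assumptions, and union-find is used here simply as a convenient way to form the linked groups.

```python
from itertools import combinations

# Hypothetical records from three lines of business.
RECORDS = [
    {"id": "bank-1", "name": "John A. Smith", "dob": "1970-03-01", "tax_id": "123"},
    {"id": "card-7", "name": "SMITH, JOHN", "dob": "1970-03-01", "tax_id": None},
    {"id": "ins-4", "name": "Jon Smith", "dob": None, "tax_id": "123"},
    {"id": "bank-9", "name": "Mary Jones", "dob": "1982-11-15", "tax_id": "456"},
]


def name_key(name):
    """Crude normalization: lowercase, strip punctuation, drop initials, sort tokens."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    tokens = [t for t in cleaned.split() if len(t) > 1]
    return " ".join(sorted(tokens))


def is_match(a, b):
    """Assumed rule: same tax id, or same normalized name plus same date of birth."""
    if a["tax_id"] and a["tax_id"] == b["tax_id"]:
        return True
    return (
        a["dob"] is not None
        and a["dob"] == b["dob"]
        and name_key(a["name"]) == name_key(b["name"])
    )


def link_records(records):
    """Group matching records into clusters using union-find."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(records, 2):
        if is_match(a, b):
            parent[find(a["id"])] = find(b["id"])

    clusters = {}
    for r in records:
        clusters.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(sorted(c) for c in clusters.values())


clusters = link_records(RECORDS)
print(clusters)  # [['bank-1', 'card-7', 'ins-4'], ['bank-9']]
```

Note that `card-7` and `ins-4` do not match each other directly; they end up in one cluster only because both match `bank-1`. If the tax ID on `bank-1` were missing or mistyped, the cluster would split, which shows how data quality problems undermine linking.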
To sum up, one of the goals and challenges of Master Data Management is to allow organizations to create, manage, and deliver a master data platform that can demonstrate acceptable and measurable levels of data quality and that enables the consistent and effective integration of various data entities into cohesive and complete data views.
We have now reached the point where we can formally define Master Data Management. Although a number of MDM definitions are available, we need to define MDM in a way that is agnostic of the particular data subject area and provides a sufficiently complete description representing both the business and technology views.
This need to achieve a “single version of the truth” is not a particular property of one industry. In the next chapter we will show that MDM has extremely broad applicability not only across industries but also across various types of organizations, including private and public companies as well as government organizations.
In achieving this ambitious goal of creating a “single holistic version of the truth,” Master Data Management helps any organization that has disparate data sources and data stores, various applications, and multiple lines of business. In doing so, MDM can be viewed as an evolutionary, next-generation data management and data quality discipline, and we’ll show some components of that evolution in the example of Master Data Management for Customer Domain (also known as Customer Data Integration or CDI) later in this chapter. At the same time, given the breadth, depth, and profound consequences of implementing Master Data Management, we can see it as a revolutionary, disruptive approach to data management, and we’ll show later in the book that the impact of MDM is reaching deep into the core of many established business processes. These revolutionary properties of MDM require significant financial, time, and organizational commitment across the entire enterprise, including participation from both the business and technology sides of the company. Indeed, Master Data Management is an enterprise-wide data- and system-integration activity that requires a multidisciplinary, extremely well-planned and executed program that involves business process analysis; data analysis and mapping; data cleaning, enrichment, and rationalization; data matching, linking, and integration; data synchronization and reconciliation; data security; and data delivery.
In addition, we’ll show in Part II of the book that an enterprise-class MDM solution should be implemented as an instance of a service-oriented architecture (SOA), and thus the program would include the design, development, testing, and deployment of both business and technical services that enable an MDM platform to function and continuously manage the new “system of record.” Although this is far from a complete list of activities required to implement an MDM solution, many of these activities have to be planned, managed, and executed in a holistic fashion, whether an MDM initiative is focused on a small or large organization and whether its focus is on customer data, product reference data, or other data domains. Moreover, MDM provides significant value not only within an enterprise but also across enterprises when new efficiencies, standards, or regulations demand that two or more companies share data across enterprise boundaries. Examples include government agencies that need to share data about potential threats to national security and financial services companies that need to support global industry initiatives such as Straight Through Processing (STP). On the one hand, when we look at the wide-open field of MDM opportunities, it is hard to imagine any enterprise, large or small, that has only a single source of data that it uses to manage the business. On the other hand, when we talk about initiatives of the scale and impact of MDM, size does matter, and many small to midsize companies have to limit the scope and investment of their MDM initiatives to avoid the challenges of justifying a significant level of commitment and investment.
In general, cleaning, standardizing, rationalizing, matching, linking, and managing records and their relationships are some of the key challenges and key differentiations of Master Data Management solutions.
As previously mentioned, the MDM variant focused on creating and managing an authoritative system of record about customers is known as Customer Data Integration (CDI). This term, however, may be misleading in that it may create an impression that CDI only deals with customer information, where customers are individuals who have predefined, known, and usually account-based relationships with the enterprise.
In fact, even though CDI stands for Customer Data Integration, the word “Customer” is used as a generic term that can be replaced by industry or LOB-specific terms, such as Client, Contact, Party, Counterparty, Patient, Subscriber, Supplier, Prospect, Service Provider, Citizen, Guest, Legal Entity, Trust, or Business Entity. We will use terms such as “customer” and “party” as primary descriptors of the customer master entities interchangeably throughout the book.
In addition to the broad scope of the term “customer,” the CDI architecture and terminology have been evolving to reflect rich MDM capabilities that extend beyond data integration. As a result, the customer-focused MDM solutions known as CDI are sometimes called “Customer MDM” or “Customer Master.”
Once we clearly define the data domain as the one dealing with the generic term “customer,” we can provide a working definition of Customer Data Integration. As follows from the previous discussion, this definition builds on the definition of MDM presented in the preceding section.
To state it slightly differently, a customer-focused MDM (CDI) solution takes customer data from a variety of data sources, discards redundant data, cleanses the data, and then rationalizes and aggregates it together. We can graphically depict such an MDM system as a hub-and-spokes environment. The spokes are information sources that are connected to the central hub as a new “home” for the accurate, aggregated, and timely data that, in the case of CDI, represents customer data (see Figure 1-1). Clearly, this hub-and-spoke topology is generally applicable to all MDM data domains; thus we often use the term “Data Hub” when discussing MDM solution space.
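The hub-and-spokes flow described above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions: the spoke records are invented, they are presumed to be already linked to one customer, and the survivorship rule (the most recently updated non-empty value wins) is a hypothetical choice rather than something the text prescribes. The sketch cleanses before discarding redundant records so that copies differing only in whitespace or letter case collapse into one.

```python
from datetime import date

# Hypothetical spoke feeds; each record is already linked to customer "C1".
SPOKE_RECORDS = [
    {"source": "crm", "customer": "C1", "updated": date(2023, 5, 1),
     "email": " John.Smith@Example.com ", "phone": ""},
    {"source": "billing", "customer": "C1", "updated": date(2024, 1, 10),
     "email": "", "phone": "555-0100"},
    {"source": "crm", "customer": "C1", "updated": date(2023, 5, 1),
     "email": " John.Smith@Example.com ", "phone": ""},  # redundant copy
]

ATTRIBUTES = ("email", "phone")


def cleanse(record):
    """Trim whitespace and lowercase e-mail addresses; returns a new dict."""
    cleaned = dict(record)
    cleaned["email"] = record["email"].strip().lower()
    cleaned["phone"] = record["phone"].strip()
    return cleaned


def discard_redundant(records):
    """Drop records that are identical after cleansing, keeping the first occurrence."""
    seen, unique = set(), []
    for r in records:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def build_golden_record(records):
    """Assumed survivorship rule: the most recently updated non-empty value wins."""
    golden = {"customer": records[0]["customer"]}
    for attr in ATTRIBUTES:
        candidates = [r for r in records if r[attr]]
        if candidates:
            best = max(candidates, key=lambda r: r["updated"])
            golden[attr] = best[attr]
            golden[attr + "_source"] = best["source"]  # keep lineage to the spoke
    return golden


hub_input = discard_redundant([cleanse(r) for r in SPOKE_RECORDS])
golden = build_golden_record(hub_input)
print(golden)
```

The resulting record draws the e-mail address from the CRM spoke and the phone number from the billing spoke, and it keeps the originating source for each attribute so that the hub's view can be traced back to its spokes.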
As stated earlier, CDI is a special, customer-data-focused type of Master Data Management, with the same goals, objectives, and benefits. Because this type of MDM deals with customer information that it collects, cleanses, rationalizes, and aggregates into a holistic customer view, a comprehensive CDI initiative can have a profound impact on the way any enterprise conducts its business and interacts with its customers.
Specifically, a customer-focused MDM solution can allow the enterprise to discover various relationships that the customers may have with one another, relationships that can allow the enterprise to understand and take advantage of potential opportunities offered by customer groups that define households, communities of interest, and professional affiliations. For example, a CDI solution implemented by a financial services firm can indicate that a group of its customers have created a private investment club, and that club could present a very attractive opportunity for the new services and product offerings.
The benefits of discovering and understanding relationships among individuals apply not just to commercial businesses. Understanding the relationships between individuals has direct and profound implications on various government activities, including law enforcement, risk management for global financial transactions, and national security. Indeed, if the full spectrum of MDM capabilities for recognition, identification, and relationships discovery of individuals and their activities had been available globally, the tragic events of September 11, 2001, might not have happened because we now know that there was a sufficient amount of information available about the hijackers and their relationships. Unfortunately, this information was collected and analyzed by different government services but never integrated and delivered to the right people at the right time.
Although discovering and managing the relationships is a very useful capability, CDI benefits don’t stop there. A CDI solution (often referred to as a Customer Data Hub, or CDH) can allow an enterprise to drastically change its business model from a traditional, account-centric approach to a new, more effective, and rewarding customer-centric model that can significantly improve the customer experience, reduce customer attrition, strengthen customer relationships with the enterprise, and even increase the customer’s share of the wallet for the enterprise.
We’ll discuss account-centric-to-customer-centric transformations in more detail in Chapter 2.
Given that MDM enables a near-real-time, accurate, and complete “single holistic version of the truth,” its customer-focused variant—CDI—enables a single version of the truth about the customer. The properties and benefits of having the accurate and complete view of the customer make it abundantly clear that CDI solutions are much more than simply another customer database or a new enterprise CRM platform, even though we can find the genesis of MDM and CDI in these and similar customer management technologies.
Business and government entities have historically striven to create and maintain authoritative and timely information sources. This natural enterprise requirement has been further emphasized by a number of regulatory requirements that include the Sarbanes-Oxley Act and the Basel II Capital Accord (see discussion on these regulations in Part III of the book).
In the case of moving toward a holistic and accurate “system of record” for customer information, organizations had been working on creating customer-centric business models, applications, and enabling infrastructure for a long time. However, as the business complexity, number of customers, number of lines of business, and number of sales and service channels continued to grow, and because this growth often proceeded in a tactical, nonintegrated fashion, many organizations evolved into a state with a wide variety of customer information stores and applications that manage customer data.
The customer data in those “legacy” environments was often incomplete and inconsistent across various data stores, applications, and lines of business. Although in many cases individual applications and lines of business (LOB) were reasonably satisfied with the quality and scope of customer data that they managed, the lack of completeness, accuracy, and consistency of data across LOB prevented organizations from creating a complete, accurate, and up-to-date view of customers and their relationships.
Recognizing this customer data challenge and the resulting inability to transform the business from an account-centric to a customer-centric business model, organizations embarked on developing a variety of strategies and solutions designed to achieve this transformation. These solutions offered new customer insights, new ways to improve customer service, and increased cross-selling and up-selling. Yet they were limited to deployment within the boundaries of the existing organizational units and lines of business and were not positioned to become true enterprise-wide, cross-LOB global solutions.
The continuous discovery of cross-industry opportunities that could leverage customer centricity has helped define Master Data Management in general and its customer-focused CDI strategies and architecture in particular. We can see today that MDM has emerged to become not just a vehicle for creating the authoritative system of key business information; in the case of customer data, it has also become an enabler of customer centricity.
Let’s briefly look at what has been done and what elements, if any, can be and should be leveraged in implementing a CDI solution. The most notable CDI predecessors include Customer Information File (CIF); Extract, Transform, and Load (ETL) technologies; Enterprise Data Warehouse (EDW); Operational Data Store (ODS); data quality (DQ) technologies; Enterprise Information Integration (EII); and Customer Relationship Management (CRM) systems. Many of these CDI predecessors have been described in numerous books [8] and publications [9]:
• Customer Information File (CIF) is typically a legacy environment that holds some basic, static information about the customers. CIF systems have a number of constraints, including limited flexibility and extensibility; they are not well suited to capturing and maintaining real-time customer data, customer privacy preferences, customer behavior traits, and customer relationships. CIF systems are often used to feed the company’s Customer Relationship Management systems.
• Extract, Transform, and Load (ETL) tools are designed to extract data from multiple data sources, perform complex transformations from source formats to the target formats, and efficiently load the transformed and formatted data into a target database such as an MDM Data Hub. Contemporary ETL tools include components that perform data consistency and data quality analysis and that can generate and use metadata definitions for data elements and entities [10]. Although not pure MDM solutions, ETL tools have been a part of the typical MDM framework and represent an important functionality required to build an MDM Data Hub platform, including Customer Data Integration systems.
• Enterprise data warehouse (EDW) [11, 12] is an information system that provides its users with current and historical decision support information that is hard to access or present in traditional operational data stores. An enterprise-wide data warehouse of customer information was considered to be an integration point where most of the customer data can be stored for the purpose of supporting Business Intelligence applications and Customer Relationship Management systems. Classical data warehouses provide historical data views and do not support operational applications that need to access real-time transactional data associated with a given customer; they therefore fall short of delivering on the MDM promise of an accurate and timely system of record for key business data domains, including customer information.
• Operational data store (ODS) [13] is a data technology that allows transaction-level detail data records to be stored in a nonsummarized form suitable for analysis and reporting. A typical ODS does not maintain summarized data, nor does it manage historical information. Similar to the EDW, an ODS of customer data can and should be considered a valuable source of information for building an MDM Customer Data Hub.
• Data quality (DQ) technologies [14, 15], strictly speaking, are not customer data platforms, but they play an important role in making these platforms useful and usable, whether they are built as data warehouses, operational data stores, or customer information files. Given the importance of data quality as one of the key requirements to match and link customer data as well as to build an authoritative source of accurate, consistent, and complete customer information, data quality technologies are often considered not just as predecessors but as key MDM and CDI enablers.
• Enterprise Information Integration (EII) [16, 17] tools are designed to create a virtually federated view of distributed data stores and provide access to that data as if it were coming from a single system. EII tools can isolate applications from the concerns related to the distributed and heterogeneous nature of source data stores and allow users to aggregate distributed data in memory or nonpersistent storage, thus potentially delivering a “just-in-time” customer data view. Depending on the data sources, data quality, and data availability, these EII solutions could be used as components of a full-function MDM Customer Data Hub.
• Customer Relationship Management (CRM) is a set of technologies and business processes designed to understand customers, improve customer experience, and optimize customer-facing business processes across marketing, sales, and servicing channels. As such, CRM solutions are probably the closest to what an MDM Customer Data Hub can offer to its consumers. CRM has been adopted by many enterprises as a strategy to integrate information across various customer data stores and to deliver appropriate analytics and improve customer interactions with the enterprise across all major channels. Unfortunately, experience shows that CRM systems positioned for enterprise-wide deployment could not scale well as an integrated system of complete and timely customer information.
Each of these predecessor technologies has its particular advantages and shortcomings. A detailed discussion of these technologies is beyond the scope of this book, but we offer some additional considerations regarding these technologies in Chapter 4.
In addition to customer-focused MDM, recent years have seen the emergence of several other MDM variants and specialized solutions. Most prominent among them is the Master Data Management solution for product information. This variant is sometimes referred to as Product Information Management, or PIM, and it addresses a number of issues that, by the very nature of the product data domain, tend to have an enterprise-wide scope. PIM solutions support business processes that deal with the definition, creation, introduction, marketing, sales, servicing, supply-chain management, spend and inventory management, and catalog management of products. Often, the MDM Product Master has to manage product definitional or reference data across multiple lines of business, where each LOB may have its own terminology and structure for the things that an organization may call “products,” so reconciling and integrating data about these products is not a trivial task. For example, a retail bank may have a product called Money Market Mutual Fund that it sells to its retail customers. The fund may consist of several components (subfunds or securities), some of which are defined and managed by the investment bank that uses different names and product hierarchies/structures for these underlying security products. When the organization has to review and report on the revenue and expenses associated with each product by category, the reconciliation of the different product views across lines of business is the goal of the MDM PIM solution.
The challenges of product information mastering are significant and often require approaches and capabilities that exceed those available for mastering customer information. Indeed, one of the key MDM capabilities is the ability to match and link data records in order to find and integrate similar records. The natural question of what is similar is the subject of extensive research, and its answer in large part depends on the domain of data that is being matched. For example, matching records about individuals is a relatively well-known problem, and a large number of matching techniques and solutions use a variety of attributes of the individual (for example, name, address, date of birth, social security number, work affiliation, and even physical attributes, if they are known) to deliver a high-confidence matching result. We discuss matching and linking in more detail in Parts II and IV of this book.
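To make the idea of attribute-based matching concrete, the following Python sketch scores two records about individuals on name, address, and date of birth. It is only an illustration under stated assumptions: the attribute weights, the 0.85 threshold, and the sample records are hypothetical, and the string comparison relies on the standard library's `difflib.SequenceMatcher` rather than on any technique prescribed in this book.

```python
from difflib import SequenceMatcher

# Assumed weights; a real solution would tune these per data set.
WEIGHTS = {"name": 0.4, "address": 0.2, "date_of_birth": 0.4}


def similarity(a, b):
    """String similarity in [0, 1], ignoring case and extra whitespace."""
    a = " ".join(a.lower().split())
    b = " ".join(b.lower().split())
    return SequenceMatcher(None, a, b).ratio()


def match_score(rec_a, rec_b):
    """Weighted score: fuzzy comparison for text, exact comparison for dates."""
    score = 0.0
    for attr, weight in WEIGHTS.items():
        if attr == "date_of_birth":
            score += weight * (1.0 if rec_a[attr] == rec_b[attr] else 0.0)
        else:
            score += weight * similarity(rec_a[attr], rec_b[attr])
    return score


def is_match(rec_a, rec_b, threshold=0.85):
    """Treat two records as the same individual above an assumed threshold."""
    return match_score(rec_a, rec_b) >= threshold


# Hypothetical records from two source systems.
a = {"name": "Jonathan Smith", "address": "12 Main Street, Springfield",
     "date_of_birth": "1975-03-14"}
b = {"name": "Jonathon Smith", "address": "12 Main St., Springfield",
     "date_of_birth": "1975-03-14"}
c = {"name": "Joan Smythe", "address": "98 Oak Avenue, Shelbyville",
     "date_of_birth": "1982-11-02"}
```

With these inputs, `is_match(a, b)` holds despite the spelling and abbreviation differences, while `is_match(a, c)` does not, because the differing dates of birth cap the score at 0.6.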
The situation changes when we move to other domains, such as product reference data and hierarchies of entities within various domains. Current published research shows that product matching, for example, can be much more complex than name and/or address matching. This complexity is driven by the fact that, although product attributes may represent a standard set for each product category (for example, consider a TV set as a product whose features and technical characteristics are well known), different manufacturers or suppliers may use different expressions, abbreviations, and attribute values to describe the same feature. Likewise, in the example of the financial products, different lines of business define and manage different aspects of a product, and often organize the products in hierarchies based on established business processes and practices (the structure of a sales team, geopolitical constraints, and so on). From a consumer perspective, if you try to review a detailed description of a product, you may end up with an incomplete or one-sided definition of the product (for example, an LCD HDTV set may have a feature described one way in a manufacturer’s catalog but described differently or simply omitted in a store catalog). Similarly, in financial services, a product called a “High-Yield Bond” may be described differently or called something different, depending on the level of detail available from a given bank or broker.
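One common first step toward matching product records is to map each supplier's attribute names and values onto a canonical vocabulary. The sketch below illustrates that step for the TV example; the synonym tables, attribute names, and sample catalog entries are invented for illustration and do not come from any industry standard.

```python
import re

# Hypothetical synonym tables; in practice these are curated by data stewards.
ATTRIBUTE_SYNONYMS = {
    "screen size": "screen_size_in",
    "display": "screen_size_in",
    "resolution": "resolution",
    "res": "resolution",
}
RESOLUTION_SYNONYMS = {
    "1080p": "1920x1080",
    "fullhd": "1920x1080",
    "1920x1080": "1920x1080",
}


def normalize(raw):
    """Map supplier-specific attribute names and values to canonical ones."""
    out = {}
    for key, value in raw.items():
        canonical = ATTRIBUTE_SYNONYMS.get(key.strip().lower())
        if canonical is None:
            continue  # unknown attribute: left for manual mapping
        value = value.strip().lower()
        if canonical == "screen_size_in":
            number = re.search(r"\d+(\.\d+)?", value)
            out[canonical] = float(number.group()) if number else None
        else:
            compact = value.replace(" ", "")
            out[canonical] = RESOLUTION_SYNONYMS.get(compact, value)
    return out


# The same TV set as described in two hypothetical catalogs.
manufacturer_entry = {"Screen Size": '42"', "Resolution": "1920 x 1080"}
store_entry = {"Display": "42-inch", "Res": "Full HD"}
```

After normalization both entries reduce to the same canonical description, so a downstream matcher can compare them attribute by attribute.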
This product reference incompatibility is magnified at the business level when you have to deal with a variety of suppliers or business partners. That is why a number of industry initiatives are underway to develop a library of common standards that describe entities for business-to-business (B2B) commerce in general (for example, global supplier networks such as 1SYNC, as well as ebXML and radio-frequency identification [RFID]) or for a given domain (for example, RosettaNet for various vertical industries such as electronic components, manufacturing, and supply-chain automation; HL7 for the healthcare industry; ACORD for the insurance industry; FpML, XBRL, and FIXML for the financial services industry; and so on).
Another challenge that an MDM solution should be able to handle is the ability of an MDM Data Hub to define, recognize, and reconcile hierarchical structures of a given entity domain. For example, the process of aggregating and reconciling product expenses and revenue across a large, globally distributed organization is complicated because individual product information coming from a single LOB and/or from a given geographical region may describe a product (for example, Personal Loan) that is not only called something different but also sits at a different level of the hierarchy than a conceptually similar product offered by another LOB of the bank (for example, Credit Management Basket, which may have a component that corresponds to the Personal Loan). Likewise, a large organization may consist of numerous legal entities/subsidiaries, each of which reports its own profit and loss statements to the corporate headquarters for the annual reports. A straightforward aggregation may produce erroneous results because some of the legal entities are subsidiaries of others, thus resulting in an over-counting of the numbers. Cases such as these illustrate that full-function MDM solutions not only have to create and manage master data for a given domain, but also enable understanding and management of master data hierarchies and reconciliation of hierarchy-level–defined data.
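The over-counting problem in the legal-entity example can be shown with a small worked example in Python. The entity names and figures are hypothetical, and the sketch assumes that each entity reports a consolidated figure, that is, a parent's number already includes the numbers of its subsidiaries.

```python
# Hypothetical reported revenue per legal entity (consolidated figures).
reported_revenue = {
    "Bank Group": 500,
    "Retail Bank": 300,
    "Investment Bank": 200,
    "Card Services": 120,
    "Insurance Co": 150,
}
# Hypothetical ownership hierarchy: child -> parent.
parent_of = {
    "Retail Bank": "Bank Group",
    "Investment Bank": "Bank Group",
    "Card Services": "Retail Bank",
}


def naive_total(reported):
    """Straightforward aggregation: counts subsidiaries more than once."""
    return sum(reported.values())


def consolidated_total(reported, parent_of):
    """Sum only top-level entities, whose figures already include subsidiaries."""
    return sum(
        amount
        for entity, amount in reported.items()
        if parent_of.get(entity) not in reported
    )


def own_contribution(reported, parent_of):
    """Each entity's figure net of its direct subsidiaries' reported figures."""
    own = dict(reported)
    for child, parent in parent_of.items():
        if child in reported and parent in own:
            own[parent] -= reported[child]
    return own


print(naive_total(reported_revenue))                    # 1270
print(consolidated_total(reported_revenue, parent_of))  # 650
```

The naive sum of 1270 is almost double the hierarchy-aware total of 650. Summing the values returned by `own_contribution` gives the same 650, which is a useful cross-check when reconciling hierarchy-level data.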
Customer Masters, Product Information Masters, Hierarchy Masters and associated Hierarchy Management, as well as other MDM variants have emerged to deal with these complex issues, sometimes involving the management and integration of not just structured, reference data but also unstructured data such as documents describing investment instruments, features and functions of new or existing products, and the management and navigation of complex entity groupings that may or may not be hierarchical in nature. We’ll discuss many of these issues in Part IV of the book.
The previous sections illustrated the broad and multifaceted nature of MDM. As defined previously, MDM is a framework that addresses complex business and technical problems, and as such it can be described and viewed from various angles. The amount of information about MDM is quite large, and in order to make sense of the various and sometimes contradictory assertions about MDM, we need to apply one of the principles of organizing and managing information: classification, or categorization. This allows us to organize available information and discuss various aspects of MDM according to a well-defined structure. Thus, in this section, we introduce several commonly accepted MDM classification schemes or dimensions. These classification dimensions include the Design and Deployment dimension, the Use Pattern dimension, and the Information Scope or Data Domain dimension. We see these dimensions as persistent characteristics of any MDM solution, regardless of the industry or master data domain. Clearly, there are other approaches to classifying such a complex subject, but we feel that these three dimensions cover most of the major differences between various MDM variants:
• Design and Deployment: This classification describes MDM architecture styles that support a full spectrum of MDM implementation, from a thin MDM reference-only layer to a full master data store that can support all business processes, including online transaction processing. These styles (Registry, Coexistence, and Full Transaction Hub) are discussed in greater detail in Part II of the book.
• Use Pattern: This classification differentiates MDM solutions based on how the master data is used. We see three primary use patterns for MDM data usage:
    • Analytical MDM: This use pattern supports business processes and applications that use master data primarily to analyze business performance and provide appropriate reporting and insight into the data itself, perhaps directly interfacing with Business Intelligence suites. Analytical MDM tends to be read-mostly, in that it does not change/correct source data in the operational systems but does cleanse and enrich data in the MDM store; from the data warehousing perspective, Analytical MDM builds complex data warehousing dimensions.
    • Operational MDM: This use pattern allows master data to be collected, changed, and used to process business transactions. Operational MDM is designed to maintain the semantic consistency of master data affected by the transactional activity. Operational MDM provides a mechanism to improve the quality of data in the operational systems, where the data is usually created.
    • Collaborative MDM: This use pattern allows its users to author master data objects and collaborate in the process of creation and maintenance of master data and associated metadata.
• Information Scope or Data Domain: This dimension describes the primary data domain managed by the MDM solution. In the case of customer MDM, the resulting solution is often called Customer Data Integration, or CDI; in the case of product MDM, the solution is known as Product Information Management, or PIM; other data domains may not have formal names yet, but they could have an impact on how the MDM solution is designed and deployed.
These MDM classification dimensions help us to better understand the relevance and importance of various factors affecting the product selection, architecture, design and deployment choices, impact on the existing and new business processes, and overall MDM strategy and readiness for the enterprise. The latter is based on the MDM Capability Maturity Model (MDM CMM), discussed in Part IV of the book.
An introduction to Master Data Management cannot be complete unless we look at the benefits these types of solutions bring to the enterprise. Companies embark on major MDM initiatives because of their natural need to establish a single, authoritative, accurate, timely, and secured master data system. In turn, they can create more accurate and timely key performance metrics, measure and manage risks, and develop competitive winning strategies. Furthermore, MDM allows enterprises to be compliant with appropriate regulatory requirements, including those defined by the Gramm-Leach-Bliley Act, the Sarbanes-Oxley Act, the Basel II Capital Accord, and many others (a discussion of these regulations can be found in Part III of the book). This compliance allows organizations to avoid costly penalties and bad publicity.
From a financial point of view, having a single authoritative system of record positions the enterprise to gradually sunset a number of legacy systems and applications and thus realize significant cost savings.
In addition to potential cost savings and gaining compliance, Master Data Management offers a number of critical capabilities to enterprises and government agencies alike, including the ability to detect and prevent illegal money-laundering activities and other fraudulent financial transactions in accordance with anti-money laundering (AML) regulations and the Know Your Customer (KYC) provisions of the USA PATRIOT Act. Moreover, the ability of customer-focused MDM (CDI) to discover and expose previously unknown relationships between individuals can be extremely useful in the global fight against terrorist organizations.
Likewise, creating and managing an authoritative system of record for product information and the ability to create and manage complex hierarchies of master data has a profoundly positive effect on all aspects of the enterprise business—from conceptualizing and introducing new, more effective, and highly competitive products to improving supply-chain metrics and efficient, just-in-time product delivery processes that save costs and improve customer satisfaction levels, to reconciling organizational and structural differences between various definitions of products, customers, organizational entities, and the like.
As a vehicle for discovering and addressing data quality issues, MDM both affects and is affected by established business processes. It can improve their effectiveness, efficiency, and timeliness and reduce their complexity, all of which can have a profound positive impact on enterprise agility, cost structure, and competitiveness.
Master Data Management is not just about data accuracy and compliance. MDM helps to create new opportunities and ways to drastically improve customer experience and increase top-line revenue. Indeed, many new and established enterprises are looking to differentiate themselves from the competition by significantly increasing customer satisfaction and improving customer experience. Having an accurate and complete system of record for customer information and for products and services that can be offered and personalized for various customer groups allows enterprises to gain new and more actionable intelligence into the customer’s buying behavior and thus allows companies to create and offer better and more accurate personalized products and services. Master Data Management solutions allow enterprises to capture and enforce the customer’s privacy preferences and ensure the protection of the customer’s confidential data—actions that result in the enterprise’s ability to establish and strengthen trusted relationships with their customers, thus creating an additional competitive advantage.
In any customer-facing business, MDM not only helps to retain profitable customers but also addresses the challenge of any enterprise to grow its customer base. This customer base growth opportunity comes from several different directions:
• Accurate and complete customer data allows the enterprise to better leverage various cross-sell and up-sell opportunities.
• Master data that contains information about prospects allows enterprises to increase their prospect-to-customer conversion ratio, thus increasing the customer base.
• MDM’s ability to discover and understand the complete picture of current and potential customer relationships allows an enterprise to create a targeted set of marketing campaigns and product and service offers that may prove to be more cost effective and demonstrate higher lift than traditional mass marketing.
Finally, any discussion about the benefits of Master Data Management would not be complete without mentioning the disruptive, transformational nature of its CDI variant for customer-facing business, which allows an enterprise to change its core business model and customer-facing products and services by transforming itself from an account-centric to a customer-centric enterprise. This new, transformed enterprise no longer views, recognizes, and services customers by account number.
The old account-centric model does not enable an enterprise to easily and reliably identify individuals who are associated with the account. Moreover, the old model does not enable an enterprise to discover associations and relationships between individuals owning the accounts and other individuals and businesses that own other accounts. For example, an individual may have several accounts with a bank, and some of these accounts may designate another individual as a beneficiary or as a holder of power of attorney; that individual may in turn own another set of accounts, some of which may be managed by another business unit. Ideally, the enterprise would gain a significant competitive advantage if these intra- and inter-LOB relationships were discovered and leveraged to increase the customer base and the corresponding share of the customer wallet.
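The kind of relationship the account-centric model misses can be sketched in a few lines of Python. The account identifiers, party names, and role labels below are hypothetical, and the sketch assumes that party identities have already been resolved by the matching process; it simply links individuals who appear on the same account in any role.

```python
from collections import defaultdict

# Hypothetical (account, party, role) assignments across business units.
account_roles = [
    ("ACC-1", "Alice", "owner"),
    ("ACC-1", "Bob", "beneficiary"),
    ("ACC-2", "Bob", "owner"),
    ("ACC-2", "Dan", "power_of_attorney"),
    ("ACC-3", "Carol", "owner"),
]


def related_parties(account_roles, person):
    """Return everyone who shares at least one account with `person`."""
    parties_by_account = defaultdict(set)
    for account, party, _role in account_roles:
        parties_by_account[account].add(party)
    related = set()
    for parties in parties_by_account.values():
        if person in parties:
            related |= parties - {person}
    return related


print(related_parties(account_roles, "Bob"))
```

Viewed by account number, Bob is just the owner of ACC-2. Viewed by customer, he is also linked to Alice as her beneficiary and to Dan, who holds power of attorney on his account.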
Discovering these relationships may have an extremely high impact on the way the enterprise should treat the individual. Indeed, recognizing the total lifetime value of the customer would allow the enterprise to provide an appropriate set of products, services, and special treatments that are commensurate with the total value of the relationships that the customer may have with the enterprise. For example, an individual who opens a low-value savings account may be treated differently by the bank if it is known that this individual is also a high-net-worth customer of the bank’s wealth management business, or if the customer is also a president of a medium-size company, or if the customer’s spouse has a separate high-value account, or if this customer’s child, who does not yet have an account with the bank, has inherited a multimillion-dollar trust fund.
In short, MDM-enabled transformation from the account-centric to the customer-centric model is revolutionary, and it can drastically impact established business processes and change the way the enterprise treats its customers, competes in the market, and grows its core customer-driven revenue.
1. Dubov, Lawrence. “MDM as a Technique to Prioritize Data Quality Problems.” http://blog.initiate.com/index.php/2009/12/08/mdm-as-a-technique-to-prioritize-data-quality-problems/.
2. Berson, Alex, Stephen Smith, and Kurt Thearling. Building Data Mining Applications for CRM. McGraw-Hill (December 1999).
3. Dubov, Lawrence. “MDM Data Quality Processes.” http://blog.initiate.com/index.php/2009/12/15/mdm-data-quality-processes/.
4. White, Andrew and John Radcliffe. “Mastering Master Data Management.” Gartner Research (May 2008).
5. Dyche, Jill, Evan Levy, Don Peppers, and Martha Rogers. Customer Data Integration: Reaching a Single Version of the Truth (SAS Institute, Inc.) Wiley (August 2006).
6. Loshin, David. Master Data Management. The MK/OMG Press (September 2008).
7. Berson, Alex and Larry Dubov. Master Data Management and Customer Data Integration for a Global Enterprise. McGraw-Hill (May 2007).
8. Berson, Alex, Stephen Smith, and Kurt Thearling. Building Data Mining Applications for CRM. McGraw-Hill (December 1999).
9. Berson, Alex and Stephen J. Smith. Data Warehousing, Data Mining, and OLAP. McGraw-Hill (November 1997).
10. Kimball, Ralph and Joe Caserta. The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data. Wiley (September 2004).
11. Kimball, Ralph, Margy Ross, Bob Becker, and Joy Mundy. Kimball’s Data Warehouse Toolkit Classics: The Data Warehouse Toolkit, 2nd Edition; The Data Warehouse Lifecycle, 2nd Edition; The Data Warehouse ETL Toolkit. Wiley (April 2009).
12. Inmon, William H. Building the Data Warehouse. Wiley (October 2005).
13. Inmon, William H. Building the Operational Data Store, 2nd Edition. Wiley (May 1999).
14. English, Larry P. Improving Data Warehouse and Business Information Quality: Methods for Reducing Costs and Increasing Profits. Wiley (March 1999).
15. McGilvray, Danette. Executing Data Quality Projects: Ten Steps to Quality Data and Trusted Information. Morgan Kaufmann (July 2008).
16. Morgenthal, JP. Enterprise Information Integration: A Pragmatic Approach. Lulu.com (May 2005).
17. Morgenthal, JP. Enterprise Information Integration: A Pragmatic Approach. Lulu.com (January 2009).