Chapter 7

Data Stewardship

Abstract

This chapter covers the discipline and practice of data stewardship in a multi-domain Master Data Management (MDM) model. It discusses the various roles, responsibilities, and investments needed to develop multi-domain data steward models and practices, as well as the need for data stewardship to be effective at the tactical and operational levels.

Keywords

Stewardship, stewards, support, centralized, decentralized, federated, governance, monitoring, expertise

In this chapter, we will discuss many opportunities where data steward roles can be applied, but it will be up to the MDM PMO and the data governance council to clearly understand their company’s business model, the ebb and flow of their master data, and how and where to most effectively position data stewards so that the practice of data stewardship has a broad foothold for support across a multi-domain MDM model. In an MDM model, data stewards cannot just be agents for data governance policy and standards; they also need to be closely aligned to the touch points and consumers of the master data where data entry, usage, management, and quality control can be influenced most.

Throughout this book, there is frequent discussion of the importance of the data steward role. Whether this role is filled by employees with formal data steward job titles or by employees with other job titles who provide data steward–type functions, the most important point is that these roles truly support the concept and practice of data stewardship as an underlying discipline and success factor for MDM. Here is the definition of data stewardship provided by the DAMA Data Management Body of Knowledge (DMBOK 2010):

The formal accountability for business responsibilities ensuring effective control and use of data assets. Some of the responsibilities are data governance responsibilities, but there are significant data stewardship responsibilities within each of the other data management functions.

Data stewardship should run deep through all of MDM, but the data steward focus and requirements can vary with each domain. Therefore, a multi-domain plan needs to closely examine the data steward needs for each domain in scope to determine how and where data steward roles can be best positioned and how the right resources can be identified and engaged. Once the MDM data steward model is defined and aligned with data governance, the successful execution and practice of data stewardship will then come down to how effectively the data steward processes and supporting technologies are able to repeatedly address and control data quality needs over time.

To begin defining the right data steward model and approach for maximum support of master data, several fundamental factors need to be examined:

 What is the current state of data governance?

 Should “data steward” be a formal job role and job title?

 What is the right data steward approach within each MDM domain?

 How to engage data stewards in data access control processes?

 What processes and technologies are needed to support data steward roles?

Let’s take a deeper look at each of these important points.

What Is the Current State of Data Governance?

For a multi-domain MDM program to succeed, data governance and data stewardship practices need to be closely orchestrated. This effort needs to start with an effective data governance process aligned with the MDM strategy and implementation plan. Chapter 6 provided strategies and examples for how to establish a consistent, transparent governance model across MDM domains and indicated that if a sufficient data governance process and structure does not already exist for the MDM program to use, the MDM program itself will need to become the driving force behind the implementation of the necessary data governance processes. Similarly, if an appropriate data steward model does not already exist, the MDM program and data governance will need to become the driving forces behind the implementation of the data steward model. Without well-aligned data governance and data steward models, the MDM program cannot succeed.

Prior to a company having a multi-domain MDM strategy and plan, there are likely to be existing instances of data governance and data steward practices that have resulted from locally or functionally oriented data management initiatives. For example, in one functional area, a data administrator or a support engineer may be acting in a data steward role to control a specific set of master data, such as validating source-to-target data loads according to certain acceptance criteria and monitoring error log activity associated with any data mapping or integration issues.

On the other hand, in another functional area, a data management team holds a broader and more highly visible responsibility for overseeing various change control processes and data quality management activities that include master data. Both are good examples of various information technology (IT) or business functions taking the initiative to implement data steward activity, but these cases are usually independently defined based on data management requirements associated with a specific functional area; they lack a broader alignment with an enterprisewide data governance strategy and plan.

As a company executes a multi-domain MDM strategy, these existing data steward functions need to be identified and aligned with the enterprise strategy. This will pull together a more cohesive network of data steward and quality controls that supports local and enterprisewide requirements. For example, a local or regional order management process requires the customer’s name, address, and payment information to handle an order, but the customer’s email address and telephone number are not required, even though other functions in the company, such as marketing, finance, or customer service, would benefit from having that additional information. In an enterprise MDM and data governance model that has well-positioned data stewards, it will be much easier to identify these data capture needs to satisfy the broader enterprise requirements and demands for this data. A well-structured data governance process aligned with the MDM program model and data architecture strategy will greatly aid in determining the master data priorities and control points where data steward roles can be most effective.

Should “Data Steward” Be a Formal Job Role and Job Title?

If your company is serious about data governance and data quality management, it must get serious about creating formal data steward job titles and roles. As with almost any job, a person with a specific job role and set of responsibilities should have a job title and career path associated with that role and those responsibilities. The purpose and function of data stewards are vital to maintaining the high-quality data that a healthy company needs. If data is in fact one of the most important strategic assets within a company, then data steward roles should be highly valued and formally recognized. This is important from a number of perspectives:

 From a cultural perspective, it is critical to indicate that a company truly values and formally recognizes the data steward function as a necessary component for establishing a high-quality culture.

 From a career development perspective, there is a growing focus in the data management industry on the data steward role and career path. This can be observed through job postings, industry training, and certification offerings, as well as the fact that data steward–oriented topics are often featured at many annual data governance and data management conferences.

 From a management and HR perspective, there is a need to statistically track how and where data steward roles are positioned and performing within a company.

Unfortunately, reaching a situation where the data steward job role and job title are formally recognized in a company can take a long time and be an evolving factor in the maturity of MDM and data governance. Companies will not typically have data stewards until there is a strong internal initiative and business justification that can drive the need and creation of this job role by Human Resources (HR). In the meantime, however, the concept and practice of data stewardship can be addressed without having formal job titles in place. Working cross-functionally, the MDM PMO and data governance process should examine all the options needed to implement data steward roles by leveraging other existing IT and business roles. In parallel, the MDM and data governance programs should build the justification and foundation to get formal data steward job roles and titles established in the company as part of the strategic roadmap.

Consider that the data steward role typically involves one or more of these responsibilities, skill sets, and knowledge areas:

 Training and enforcement of data policy and standards

 Process area expertise regarding data use, flow, and touch points

 Subject matter expertise in data governance activities and decisions

 Process analysis and development of the data management process

 Familiarity with certain data models and data infrastructure

 Data analysis and research

 Monitoring of data quality and error conditions

 Correcting data error conditions

 Supporting specific data management, change control, or maintenance tasks

 Data access management

Many of these responsibilities and areas of expertise already exist in one form or another in job roles in various other IT and business areas. Therefore, there certainly are opportunities early in the MDM program plan to define and begin initiating some data steward practices by leveraging support from these existing roles, discussed in the next section.

Leveraging IT and Business Support

In a typical IT engagement model, there are specific resource request processes that allow projects to identify and secure IT resources on a fixed time and budget basis provided that the resources and funding are available. In this scenario, an MDM or data governance project initiative can plan for and gain certain IT assistance, such as with data mining, profiling and analysis, and application support, or address certain data integration, administration, or corrective action needs that fit into the IT services portfolio. Fixed time and project support arrangements with IT can certainly provide much value for many needs that MDM and data governance programs will have as these programs expand and mature, but this type of IT support will usually only be able to address technical and infrastructure-oriented requirements. Thus, IT support can play an important role, but it cannot be a substitute for creating an ongoing data steward model. In an MDM program, the roles and responsibilities for data stewards reach beyond technical support needs and must penetrate deeply into the business process areas.

And while there are opportunities to leverage IT support for specific technical support needs, the ability to leverage existing business resources and expertise for business support can be far more challenging and will not usually have a specific resource request process that exists in the IT model. Therefore, during the planning and initiation of the MDM program, it is critical that the business areas are engaged, aligned properly, and can help with supporting the build-out of the data steward models needed by the MDM domains in scope.

Otherwise, if the requirements for data stewards were not sufficiently planned for and emerge later—as they will—getting the resources and time commitments needed from the business side at that point will be very difficult. These people may be unavailable (or only partially available) to participate as subject matter experts in data governance teams and often unable to take on tasks outside their current job responsibilities. Business functions or projects that are most affected by master data issues may try to offer some resources on a short-term or part-time basis. In other situations, the use of contractors may have to be considered, but often the cost associated with that has not been planned in any budget. Having only limited business support and trying to use contractors to fill the gap will not be a sufficient or sustainable approach to address the longer-term needs for data steward support across a multi-domain MDM program. Without a more dedicated data steward model in place, it will be difficult to get master data management and quality improvement projects underway or fully completed.

Now contrast that with a scenario where a more focused enterprise data management model and quality culture exists in the company and where an effective data steward model has been implemented. Having recognized data stewards who are sufficiently skilled, empowered, and well-positioned in key functional and regional areas can immediately help support the projects and priorities determined through the MDM and data governance roadmaps. Having well-positioned data stewards demonstrates that there is a commitment within the business to data management and quality improvement. There are some key factors that will drive the more proactive approach described previously:

 Ensure that early in their planning, MDM and data governance programs call out data steward budget and resource needs as being critical to MDM success.

 Work with the data governance council to define enterprise policies to ensure that project development and solution design processes have requirements for data steward and quality management support where master data is involved in the deliverables.

 MDM and data governance programs should initiate efforts with HR to justify the formal creation of data steward job roles within the company if they don’t already exist.

 Minimize the use of IT and consulting help as substitutes for real data stewards.

Too often, projects developing and delivering data models and applications that involve new assets, such as additional master data and reference data, do this without planning sufficiently for quality control and ongoing data management of these assets. This leads to data ownership and stewardship gaps that later can become unwanted wrestling matches between business, IT, and data governance teams as a result of data quality or maintenance problems, information protection issues, internal audits, and other issues. Again, early planning and having project requirements to account for data management and stewardship roles will significantly reduce the occurrence of data ownership and accountability problems.

What Is the Right Data Steward Approach Within Each MDM Domain?

It’s important to recognize that how and where data stewards can be most effectively engaged in a multi-domain model will be highly dependent on the data architecture and master data flow within each domain. Throughout Part 1 of this book, while discussing the planning aspects of MDM, it was pointed out that implementing a successful MDM initiative starts with choosing the right approach—one that will drive the proper fitting of the MDM practices to an enterprise architecture and business model. A company’s enterprise architecture and business model will consist of many specific functions, processes, teams, and job roles that already interact with the master data. By examining the usage and flow of master data in a particular domain (also see Chapter 2), you can determine critical points where the practice of data stewardship can be effectively applied. This may involve master data entry points, data integration points, or other downstream processes where the master data quality and consistency need to be controlled.

This may also require data monitoring to alert the data steward of conditions that will need attention. For example, in a customer data hub environment, there may be rules that require certain types of customers to have certain types of account setups and profiles. A company can have various types of business relationships with its customers that require the use of different types of accounts, contracts, and pricing details to define them. End user customers who interact on a retail basis with a company may have an end user type of account setup, but typically they will not have any special contract or pricing relationships with the company. On the other hand, customers who are wholesale purchasers or product resellers will have specific account, contract, pricing, credit, and payment details associated with that relationship. In each case, it is important that customer account types are correctly identified and set up by the process area responsible for managing the accounts. This doesn’t always happen correctly due to administrative oversight, incorrect customer information, or data integration issues. For instance, in a data merge process involving customer records, the merge logic can create a false-positive or false-negative condition that produces the wrong merge result, leading to an incorrect customer master record.
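The account setup rules described above lend themselves to a simple automated check that feeds a data steward's review queue. The following is a minimal sketch only, assuming a hypothetical rule table (`REQUIRED_DETAILS`), hypothetical account types, and a hypothetical `CustomerAccount` record; none of these names come from a particular MDM product, and real rules would be defined through the data governance process.

```python
from dataclasses import dataclass, field

# Hypothetical rule table: the details each account type must carry.
# Account types and detail names are illustrative assumptions.
REQUIRED_DETAILS = {
    "end_user": set(),
    "wholesale": {"contract", "pricing", "credit", "payment"},
    "reseller": {"contract", "pricing", "credit", "payment"},
}


@dataclass
class CustomerAccount:
    customer_id: str
    account_type: str
    details: set = field(default_factory=set)


def steward_alerts(accounts):
    """Return (customer_id, reason) pairs that need data steward review."""
    alerts = []
    for acct in accounts:
        required = REQUIRED_DETAILS.get(acct.account_type)
        if required is None:
            # An unrecognized account type is itself a condition to review.
            alerts.append((acct.customer_id, f"unknown account type: {acct.account_type}"))
            continue
        missing = sorted(required - acct.details)
        if missing:
            alerts.append((acct.customer_id, "missing details: " + ", ".join(missing)))
    return alerts


accounts = [
    CustomerAccount("C001", "end_user"),
    CustomerAccount("C002", "wholesale", {"contract", "pricing"}),
    CustomerAccount("C003", "reseller", {"contract", "pricing", "credit", "payment"}),
]
alerts = steward_alerts(accounts)
```

In this sketch only the wholesale account with incomplete credit and payment details is flagged. The check does not correct anything itself; it only surfaces the condition so that the responsible data steward can investigate the cause.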

Using another example, in the health care industry, there need to be consistent business definitions, rules, and data attributes that define health care providers at the institutional level (e.g., a provider organization such as a hospital, medical group, pharmacy group, clinic, or emergency care center) and the individual level (e.g., a doctor, pharmacist, nurse, therapist, psychologist, or chiropractor). These definitions and distinctions are critical not only to how contracts, claims, payments, and networks are handled between health insurance companies and service providers, but also to presenting accurate information to health plan members and potential members regarding what services and providers are included in in-network or out-of-network coverage for any given health plan. Therefore, the accuracy and consistency of provider data, definitions, and rules are vital to the success of any health care company and to the industry as a whole. The accuracy and consistency of this data will hinge on having effective data governance standards and steward practices in place to manage the quality and integrity of this data across the many entry and consumption points this data can have.

Scenarios like this are examples of where checks and balances are needed with master data. These checks and balances should be brought to the attention of data stewards who are asked to ensure that the master data is correct and that the usage is consistent with the policies, standards, rules, business requirements, and operational guidelines. It is also important to have data stewards clearly engaged in the applications and data repositories associated with the metadata, reference data, and any external vendor-provided data involved in master data. These additional areas play important roles in the overall consistency, maintenance, and standardization of master data. These aspects are explored further in the next chapters, which cover data integration, data quality management, and metadata.

Chapter 6 mentioned that a multi-domain MDM program will likely require a federated data governance model that can support centralized and decentralized system and data architecture. This also means that different data steward approaches need to be considered and applied in order to address centralized and decentralized architectures in a multi-domain MDM model. Each domain will likely have different architecture and processes that the master data is subject to; therefore, as part of an MDM and data governance execution plan, each domain scenario should be examined to determine the right data steward model and approach. Let’s examine this task from a centralized data hub architecture perspective and a decentralized architecture perspective.

Data Stewardship in a Centralized Data Hub Architecture

In a domain with data hub architecture, there should be great focus on the quality management of the data coming into and out of the data hub. A data hub will typically have multiple internal systems connected as spoke systems and may have one or more sources of external data (such as vendor data) used in the data hub. These inbound and outbound flows of data are all key areas where data stewards should be assigned. These data stewards should be very familiar with the data and the transactional or vendor processes that are connected to the hub. For example, customer and product domains may each have a data hub that interacts with multiple transactional systems involved with sales, quoting, shipping, and customer service processes, and is likely to also use data from one or more external vendors. Data stewards should be positioned to oversee these inbound and outbound interactions, as well as the data quality. Figure 7.1 provides an example of this concept.

Figure 7.1 Data steward in a data hub model

This type of data steward model ensures that there is monitoring and data quality control with the master data flow. As any data flow or quality issues emerge, data stewards can quickly address them through the data governance and quality management processes in that domain or, if needed, with other domains if cross-domain impacts are related to the root cause and correction of the issue.

Data Stewardship in a Decentralized Model

Contrast the scenario involving a data hub architecture with a situation with a data domain that has a more decentralized operational architecture where sales, orders, distribution, and customer services are handled through processes and systems on a regional basis. Each regional system’s transactional data may or may not be consolidated into a central data warehouse, and there can be various quality and consistency issues with the data from each source system. Obviously, this creates a more challenging footprint for data governance and stewardship practices. In this decentralized model, much more mapping and normalization of data are required at various points, such as by extract, transform, and load (ETL) processes, entity resolution processes, checks and balances, and other data quality management processes. This will translate to more data stewards that need to be engaged across source systems, data warehouse, data marts, and other environments.

For example, in this type of model, data stewards should be positioned across various transactional and analytical processes and system areas and with the vendor data entry points. Together, these data stewards can work as a collective community aligned with the data governance process to address master data issues, common standards, business rules, business term definitions, reference data, and data quality management needs. In a decentralized model, this network of data stewards will provide tremendous value with their ability to proactively or reactively address emerging data quality and data governance issues or needs. Figure 7.2 provides an example of this concept.

Figure 7.2 Data stewards in a decentralized model

Without this network of data stewards, data issues in the source systems or data warehouses will be much harder to address in a timely fashion due to delays when trying to assign resources to do the root-cause analysis and address corrective action needs where data ownership and accountability issues exist. These types of issues can be largely avoided or more easily handled with a well-positioned network of data stewards.
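As an illustration of the mapping and normalization work that a decentralized model requires, the sketch below standardizes a country value arriving from several regional systems and routes anything it cannot map to a steward review queue. The mapping table, source system names, and record layout are all hypothetical assumptions made for the example, not a description of any specific ETL tool.

```python
# Hypothetical mapping from regional country values to one enterprise standard.
STANDARD_COUNTRY = {
    "us": "US", "usa": "US", "united states": "US",
    "de": "DE", "deu": "DE", "germany": "DE",
}


def normalize_country(records):
    """Split records into normalized rows and rows needing steward review."""
    normalized, review_queue = [], []
    for rec in records:
        key = rec["country"].strip().lower()
        if key in STANDARD_COUNTRY:
            # Copy the record with the standardized value; the input is not mutated.
            normalized.append({**rec, "country": STANDARD_COUNTRY[key]})
        else:
            # Unmapped values are left untouched and handed to a data steward.
            review_queue.append(rec)
    return normalized, review_queue


regional_rows = [
    {"source": "emea_orders", "customer": "Acme GmbH", "country": "Germany"},
    {"source": "na_orders", "customer": "Acme Inc", "country": " USA "},
    {"source": "apac_orders", "customer": "Acme KK", "country": "Nippon"},
]
normalized, review_queue = normalize_country(regional_rows)
```

Keeping the unmapped records in a separate queue, rather than guessing a value, reflects the point made above: the data steward closest to the source system is usually best placed to decide whether the mapping table or the source data needs to change.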

Data Stewardship Across a Federated Data Governance Model

In a multi-domain MDM model, there is likely to be a mix of data hub and decentralized operational models across the domains, meaning that in multi-domain MDM, there can be many layers of complexity and cross-functional involvement that need flexible data governance and data steward approaches. This is why the federated data governance model described in Chapter 6 will typically be needed within a multi-domain MDM program to coordinate data governance and data steward focus across various domain structures. Figure 7.3 provides an example of this concept.

Figure 7.3 Data stewardship across a federated model

As these models and layers become better understood and reveal the coordination needed to successfully drive data quality improvement initiatives, companies begin to realize that the analysis of a significant data quality issue and the implementation of the action plan to correct that issue will require a number of orchestrated tasks from a number of committed resources from across the business and IT functions. The data stewards will make the difference between poor and successful execution of these initiatives.

How to Engage Data Stewards in Data Access Control Processes?

Many companies today require employees to complete internal training related to information protection and business conduct. In these courses, there are usually various real-life examples cited where fraud, theft, insider trading, information privacy issues, and other types of accidents, violations, or misconduct have occurred in relation to company data and proprietary information. Many of these cases involve employees in business roles who are knowing or unknowing participants in the incident. These cases are used in training to help emphasize how damaging the lack of information protection and control can be.

As MDM practices mature, more and more applications and users are being migrated to increasingly integrated platforms and shared environments where the common use of tools, interfaces, and master data exists. What had been independently controlled applications and groups of users now collectively become a much larger pool of users interacting with a common environment. The user, application, and data separation that physically existed before does not exist any longer. Maintaining information protection and user access control to master data now requires closer examination of and accurate information about employee identification, job roles, data usage intent, and monitoring of cross-functional process interactions with data, to the extent that a simple systems administrator approach for managing access control is no longer adequate.

In data hub or data warehouse environments, there can be a complex variety of process interactions and data access needs from various user groups involving different combinations of data. Because of this, the access management methodology and master data control practices for these types of environments need to be more rigorous and should involve a data steward role acting in an access control gatekeeper function associated with an MDM domain and the data governance process. The primary purpose of this gatekeeper function is to define and manage a more business process–oriented set of checks, controls, and monitoring focus that is needed to augment existing access management and authorization processes in order to more precisely recognize and control who has the authority to access the master data for any given domain. The data steward–oriented gatekeeper process can be an additional function and process layer integrated with an existing access control model, as shown in Figure 7.4.

Figure 7.4 Data steward role in the data access management process

This is all about information protection and the ability to more tightly manage compliance and other risk factors associated with uncontrolled events or any misconduct involving master data. Master data access control can be greatly enhanced if data stewards can be included in a gatekeeper role during the planning and implementation of data access management. This will enable more business involvement and accountability in the process.
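The gatekeeper layer described in this section can be pictured as a second, business-oriented check that sits on top of the existing system authorization. The sketch below is illustrative only: the `AccessRequest` fields, the `STEWARD_RULES` table, and the entitlement lookup are hypothetical stand-ins for whatever access management process a company already has.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    user: str
    job_role: str
    domain: str
    intent: str


# Hypothetical steward-maintained rules: which job roles may use a domain's
# master data, and for which declared purposes.
STEWARD_RULES = {
    "customer": {
        "sales_ops": {"order_entry", "quoting"},
        "marketing_analyst": {"campaign_analysis"},
    },
}


def system_authorized(request, entitlements):
    """Existing access check: does the user hold an entitlement for the domain?"""
    return request.domain in entitlements.get(request.user, set())


def steward_gate(request):
    """Added business check: is this role and intent approved for the domain?"""
    allowed_intents = STEWARD_RULES.get(request.domain, {}).get(request.job_role, set())
    return request.intent in allowed_intents


def grant_access(request, entitlements):
    """Access is granted only when both layers agree."""
    return system_authorized(request, entitlements) and steward_gate(request)


entitlements = {"alice": {"customer"}, "bob": {"customer"}}
ok = grant_access(AccessRequest("alice", "sales_ops", "customer", "quoting"), entitlements)
denied = grant_access(AccessRequest("bob", "sales_ops", "customer", "bulk_export"), entitlements)
```

Both users hold the same system entitlement, yet the second request is refused because its stated intent is not one the steward has approved for that job role. This is the kind of business accountability that a systems administrator approach alone does not provide.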

What Processes and Technologies Are Needed to Support Data Steward Roles?

Similar to data governance practices, data steward practices are primarily people- and process-driven activities that need to be specially fitted within business models and data architectures. There are no business applications or vendor solutions that can deliver a comprehensive data steward model and process. Data stewards, however, can be greatly enabled and work much more effectively when they have training and access to common tools, applications, and reports that allow them to understand the use and flow of data, as well as to analyze, control, and monitor data quality in the areas for which they are accountable.

Chapter 3 mentioned that a certain amount of data quality management capability is expected to already exist or will be planned for as part of a well-designed and executed MDM solution. Some of this functionality may be delivered as part of implementing a data hub solution, while other functionality may be delivered from other vendor solutions or internally built solutions. For example, having standard solutions for data profiling, data quality metrics, and quality dashboards is fundamental to efficient and effective data governance and stewardship activities. More investment and consolidation toward enterprise standard platforms and solutions for data quality management, metadata management, and reference data management will improve the efficiency and effectiveness of data stewards.
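To make the idea of a data quality metric concrete, the following minimal sketch computes field completeness over a handful of customer records and lists the fields that fall below a threshold. The record layout, field names, and the 0.75 threshold are assumptions chosen for illustration; an actual program would take its metrics and targets from the data governance process.

```python
def completeness(records, fields):
    """Return the share of records with a non-empty value, per field."""
    total = len(records)
    scores = {}
    for name in fields:
        # Treat None, empty strings, and whitespace-only values as missing.
        filled = sum(1 for rec in records if str(rec.get(name) or "").strip())
        scores[name] = filled / total if total else 0.0
    return scores


def below_threshold(scores, threshold):
    """List the fields whose completeness falls below the agreed threshold."""
    return sorted(name for name, score in scores.items() if score < threshold)


customers = [
    {"name": "Acme Inc", "email": "ap@acme.example", "phone": ""},
    {"name": "Birch LLC", "email": None, "phone": "555-0100"},
    {"name": "Cedar Co", "email": "info@cedar.example", "phone": None},
    {"name": "Dune Ltd", "email": "hello@dune.example", "phone": "555-0101"},
]
scores = completeness(customers, ["name", "email", "phone"])
flagged = below_threshold(scores, threshold=0.75)
```

A result like this is the raw material for a quality dashboard: the data steward sees which attributes are under-captured and can trace the gap back to the entry point where the data should have been collected.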

Data stewards are a natural user group for the reviews, proofs of concept, and pilot activities involving vendor data management solutions. Many vendors now market data governance– and data steward–oriented products or capabilities in their solution suites. Many of these products and capabilities can be very useful, but of course, they also can be expensive, need careful assessment, and must be able to provide sufficient value and improvement to existing data management practices. Many vendor tools and services are purchased and implemented from an IT perspective without much (if any) consideration of the data governance and data steward perspectives. Therefore, the solutions implemented for data quality management, metadata management, and reference data management can be geared much more toward meeting technical requirements, and as a result they are not very user-friendly from a data governance and steward perspective. When evaluating these solutions, it is important to also consider the requirements and usability aspects from a data governance, data steward, and business user perspective.

Conclusion

This chapter covered the role of data stewardship in a multi-domain MDM program and stressed that data stewards cannot just be agents for data governance policy and standards. Rather, they need to be closely aligned to the touch points and consumers of the master data, where data entry, usage, and quality control can be most influenced.

There is an implicit relationship between data quality management, data governance, and data stewardship. For a multi-domain MDM program to succeed, these practices need to be closely orchestrated. Data stewardship should run deep through all of MDM, so a multi-domain plan needs to develop a firm concept of how a data steward model will look and function, where the data steward roles will be best positioned, and how the right resources can be identified and engaged. This chapter pointed out that a multi-domain MDM model can have many layers of complexity, requiring a great deal of cross-functional involvement and support, and that as the layers become better understood and reveal the coordination needed to successfully drive the quality and control of master data, the data stewards will make the difference between poor and successful execution of these initiatives.

Ideally, there should be formal job titles for data stewards to demonstrate the value and importance this role has in a company, especially in relation to establishing a high-quality culture. But creating those job titles and developing a quality culture are often the result of reaching many prior milestones within the MDM and data governance maturity models. Until such time that formal data steward roles are in place, the concept and practice of data stewardship as an underlying discipline and success factor for MDM will still need to be supported through IT and the business resources that have the knowledge, skills, and focus needed to support and control master data.

It is important that data stewardship quickly starts proving its value and worth. Therefore, choose high-value, visible points to position data steward roles, but also expect that expanding the data steward footprint across the enterprise landscape can be a slow process that will compete with other internal growth and investment areas for resource and budget allocation.
