Chapter 23 Administering the DW 2.0 environment

The DW 2.0 environment is complex and is built over a long period of time. It touches many parts of the corporation: day-to-day operations, management, tactical decisions, strategic decisions, and even the boardroom. It has many facets: technical, business, legal, engineering, human resources, and so forth. As such, the DW 2.0 environment is a long-term concern that requires careful management and administration.

This chapter will touch on some of the many issues of management and administration of the DW 2.0 environment.

THE DATA MODEL

At the intellectual heart of the DW 2.0 environment is the data model. The data model is the description of how technology meets business. The data model is used to direct the efforts of many different developers over a long period of time. When the data model is used properly, one piece of development fits with another piece of development like a giant jigsaw puzzle that is built over a period of years. Stated differently, without a data model, coordinating the long-term development of many projects within the DW 2.0 environment with many different people is an almost impossible task.

The data model is built at different levels—high, middle, and low. The first step (and one of the most difficult) is the definition of the scope of integration for the data model. The reason the scope of integration is so difficult to define is that it never stays still. The scope is constantly changing. And each change affects the data model.

When the scope changes too often and too fast, the organization is afflicted with “scope creep.”

Over time, the high-level data model needs the least amount of maintenance. The long-term changes that the organization experiences have the most effect on the midlevel model and the low-level model. At the middle level of the data model, over time the keys change, relationships of data change, domains of data change, definitions of data change, attributes change, and occasionally even the grouping of attributes changes. And as each of these changes occurs, the underlying related physical data base design also changes.

Part of the job of the administration of the data model is to ensure that it is being followed with each new development and with each new modification of the data warehouse. The biggest challenges here are to ensure that

no new primitive attributes are being developed that are not in the data model, or if new primitive data elements are introduced, they find their way into the data model;
new developers view the data model as an accelerant to progress, not a barrier to progress;
new modifications to DW 2.0 follow the data model.

Note that groupings of attributes and keys/foreign keys are very important for compliance; other aspects of the data model may be less so.

In addition, it is not necessary for the derivatives of primitive data to be in compliance with the data model.
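The compliance check described above can be sketched in a few lines. This is only an illustration: the data model is reduced to a set of primitive attribute names per entity, and every name in it (DATA_MODEL, find_unmodeled_attributes, the sample tables) is hypothetical. Derived attributes are assumed to have been excluded before the check, since they need not comply with the data model.

```python
# Hypothetical, simplified data model: entity name -> primitive attributes.
DATA_MODEL = {
    "customer": {"customer_id", "name", "birth_date"},
    "order": {"order_id", "customer_id", "order_date", "amount"},
}


def find_unmodeled_attributes(data_model, proposed_tables):
    """Return {table: sorted primitive attributes that are not in the data model}."""
    gaps = {}
    for table, attributes in proposed_tables.items():
        modeled = data_model.get(table, set())
        missing = sorted(set(attributes) - modeled)
        if missing:
            gaps[table] = missing
    return gaps


# A new development proposes these primitive attributes (sample data).
proposed = {
    "customer": ["customer_id", "name", "loyalty_tier"],
    "order": ["order_id", "customer_id", "amount"],
}
gaps = find_unmodeled_attributes(DATA_MODEL, proposed)
print(gaps)  # {'customer': ['loyalty_tier']}
```

Each attribute reported here is either removed from the new development or added to the data model, which matches the two outcomes the text allows.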

ARCHITECTURAL ADMINISTRATION

In addition to an administrative organization that tends to compliance with the data model, it is necessary to have a general architectural organization that administers the DW 2.0 architecture. The architectural administration tends to the long-term oversight of the architecture. Some of its concerns are the following.

Defining the moment when an Archival Sector will be needed

In most environments it is not necessary to build the archival environment immediately. Usually some period of time passes before it is necessary to build the archival environment. The architectural administration provides the guidance for when and how the archival environment is to be built. The architectural administration determines many aspects of the archival environment, such as

when data will be passed into the archival environment;
how long data will stay in the archival environment;
the criteria for moving data out of the archival environment;
the platform the archival environment will reside on;
the data base design for the archival environment;
whether passive indexes will be created;
how passive indexes will be created;
the level(s) of granularity for archival data;
and so forth.

Determining whether the Near Line Sector is needed

If the Near Line Sector is needed, the architectural administration determines such important parameters as when data will be moved into the Near Line Sector, when it will be moved back into the Integrated Sector, and when it will be moved to the Archival Sector; what metadata will be stored; what platform the Near Line Sector will reside on; and so forth. Over time the need for a Near Line Sector can change. At the point of initial design it may be apparent that there is no need for a Near Line Sector. But over time the factors that shape the need may change. Therefore, there may come a day when the Near Line Sector is needed. It is the job of the architectural administration to make that decision. Some of the decisions made by the architectural administrator include

whether near-line storage is needed;
the criteria under which data is passed into near-line storage;
the platform that is needed for near-line storage;
the metadata that will be stored;
the criteria under which data passes out of near-line storage.
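As a minimal sketch of how such criteria might be written down, the function below assigns data to a sector by the time since it was last accessed. The use of last-access age as the criterion, the two thresholds, and all names are assumptions made for illustration; the actual criteria are for the architectural administrator to decide.

```python
from datetime import date

# Assumed thresholds, chosen only for illustration.
NEAR_LINE_AFTER_DAYS = 365
ARCHIVE_AFTER_DAYS = 5 * 365


def target_sector(last_accessed, today, near_line_enabled=True):
    """Pick the sector for a unit of data from the days since its last access."""
    age_days = (today - last_accessed).days
    if age_days >= ARCHIVE_AFTER_DAYS:
        return "archival"
    if near_line_enabled and age_days >= NEAR_LINE_AFTER_DAYS:
        return "near_line"
    return "integrated"


today = date(2024, 1, 1)
print(target_sector(date(2023, 11, 1), today))  # integrated
print(target_sector(date(2022, 6, 1), today))   # near_line
print(target_sector(date(2017, 1, 1), today))   # archival
```

The near_line_enabled flag reflects the point that the Near Line Sector may not exist at initial design and may be introduced later without changing the archival rule.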

Another sector of the DW 2.0 environment that is of concern to the architectural administrator is the Interactive Sector. In some organizations, there is no interactive environment. In other organizations there is an interactive environment. The architectural administrator addresses such issues as the following:

Is an interactive environment needed?
If there is an interactive environment, is response time adequate, meeting all SLA requirements? Is availability adequate, meeting all SLA requirements? Is the interactive environment available for any reporting that needs to be done? Are the capacity requirements being met?
Is data being properly integrated as it leaves the interactive environment?
If legacy data is being read into the Interactive Sector, is that data being integrated properly into applications?
What platform is the Interactive Sector being run on?

Another task of the architectural administrator is making sure there is never a flow of data directly from one data mart to another. Should the administrator find such a flow, he/she redirects it so that the data passes from the source data mart to the DW 2.0 environment and then on to the data mart receiving the data.
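The rerouting rule can be illustrated with a small sketch. It assumes that data flows are available as (source, target) pairs and that the set of data marts is known; the function name, the hub label "dw20", and the sample flows are all hypothetical.

```python
def reroute_mart_to_mart(flows, marts, hub="dw20"):
    """Replace each mart-to-mart flow with two legs: mart -> hub, hub -> mart."""
    rerouted = []
    for source, target in flows:
        if source in marts and target in marts:
            legs = [(source, hub), (hub, target)]
        else:
            legs = [(source, target)]
        for leg in legs:
            if leg not in rerouted:  # avoid listing the same flow twice
                rerouted.append(leg)
    return rerouted


marts = {"finance_mart", "sales_mart"}
flows = [("dw20", "sales_mart"), ("sales_mart", "finance_mart")]
print(reroute_mart_to_mart(flows, marts))
# [('dw20', 'sales_mart'), ('sales_mart', 'dw20'), ('dw20', 'finance_mart')]
```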

Yet another ongoing task of the architectural administrator is to ensure that monitoring is occurring properly and that the results of the monitoring are properly interpreted. Various kinds of monitoring activity need to occur in the DW 2.0 environment. There is the monitoring of transactions and response time in the interactive environment and there is the monitoring of data and its usage in the other parts of the DW 2.0 environment.

Some of the questions that should be considered in monitoring the DW 2.0 environment follow:

Are transactions being monitored in the Interactive Sector?
Is availability being monitored in the Interactive Sector?
Is data usage being monitored in the Integrated Sector?
Is dormant data identified?
Are the monitors consuming too many system resources?
When are the results of monitoring being examined?

One of the most important determinations made as a result of monitoring the usage of the Integrated Sector is whether it is time to build a new data mart. The administrator looks for repeated patterns of usage of the Integrated Sector. When enough requests for data structured in the same way appear, it is a clue that a data mart is needed.
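One way to look for such repeated patterns is to group logged requests by their shape. The sketch below assumes a usage monitor that records, for each request, the columns read and the grouping keys used; that log format, the threshold of three requests, and all names are assumptions made for illustration.

```python
from collections import Counter


def data_mart_candidates(query_log, min_requests=3):
    """Return request shapes that recur at least min_requests times."""
    # The shape of a request is the set of columns it reads plus its grouping keys.
    shapes = Counter(
        (frozenset(q["columns"]), frozenset(q["group_by"])) for q in query_log
    )
    return [
        {"columns": sorted(cols), "group_by": sorted(keys), "requests": count}
        for (cols, keys), count in shapes.most_common()
        if count >= min_requests
    ]


query_log = [
    {"user": "ann", "columns": ["region", "revenue"], "group_by": ["region"]},
    {"user": "raj", "columns": ["revenue", "region"], "group_by": ["region"]},
    {"user": "lee", "columns": ["region", "revenue"], "group_by": ["region"]},
    {"user": "ann", "columns": ["sku", "units"], "group_by": ["sku"]},
]
candidates = data_mart_candidates(query_log)
print(candidates)
# [{'columns': ['region', 'revenue'], 'group_by': ['region'], 'requests': 3}]
```

A recurring shape is only a clue, as the text says; whether it justifies a new data mart remains the administrator's judgment.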

These then are some of the activities of the architectural administration of the DW 2.0 environment. But there are other aspects of the DW 2.0 environment that need administration as well.

It should go without saying that one of the skills the architectural administrator needs to have is that of understanding architecture. It is futile for an individual to try to act as an architectural administrator without understanding what is meant by architecture and what the considerations of architecture are.

Another important part of architectural administration is managing the ETL processes found in DW 2.0. The first kind of ETL process found in DW 2.0 is the classical integration of data from application sources. The issues that need to be monitored here include the traffic of data that flows through the ETL process, the accuracy of the transformations, the availability of those transformations to the analytical community, the speed and ease with which the transformations are made, and so forth. The second kind of ETL process is textual transformation, by which unstructured data is transformed and placed into the data warehouses contained in DW 2.0. The issues of administration here include the volume of data that arrives in the DW 2.0 environment, the integration algorithms that are used, the type of data that is placed in the DW 2.0 environment, and so forth. Note that the two forms of ETL transformation are entirely different from each other.

METADATA ADMINISTRATION

One of the most important aspects of the DW 2.0 environment is that of metadata. For a variety of reasons the administration of metadata is its own separate subject. Some of those reasons are:

The tools for metadata capture and management have lagged behind the rest of the industry considerably.
The history of metadata management has not been good. There have been far more failures than successes.
The business case for metadata requires much more attention than the business case for other aspects of the DW 2.0 environment.

And there probably are more reasons metadata management is a sensitive issue.

The problem is that it is metadata that is needed for the many different parts of the DW 2.0 environment to be meaningfully held together. Stated differently, without a cohesive metadata infrastructure, the many different parts of DW 2.0 have no way of coordinating efforts and work.

There are many aspects to metadata administration. Some of these aspects include

the original capture of metadata;
the editing of metadata;
the making of metadata available at the appropriate time and place in the DW 2.0 environment;
the ongoing maintenance of metadata;
the distribution of metadata to different locations in the DW 2.0 environment;
the further extension of metadata;
the archiving of metadata.

In addition to these considerations, the metadata administrator makes such important decisions as

what platform(s) metadata will reside on;
what technologies will be used to capture and store metadata;
what technologies will be used to display and otherwise make available metadata.

One of the issues of metadata is its ephemeral nature. Unlike structured data, metadata comes in many forms and structures. It simply is not as stable or as malleable as other forms of data.

Another major issue relating to metadata is that of the different forms of metadata. There are two basic forms of metadata. Those forms are

business metadata and
technical metadata.

As a rule technical metadata is much easier to recognize and capture than business metadata. The reason technical metadata is easier to find and capture is that it has long been understood. The truth is that business metadata has long been a part of the information landscape, but business metadata has not been formally addressed—by vendors, by products, by technology. Therefore, it is much easier to find and address technical metadata than it is to find and address business metadata.

DATA BASE ADMINISTRATION

Another essential aspect of the DW 2.0 environment is that of data base administration. In data base administration the day-to-day care and tending of data bases is done. Data base administration is a technical job. The data base administrator needs to know such things as how to restore a data base, how to recover lost transactions, how to determine when a transaction has been lost, how to bring a data base back up when the data base goes down, and so forth.

In short, when something malfunctions with a data base, it is the data base administrator who is charged with getting the data base back up and running.

One of the challenges of data base administration is the sheer number of data base administration activities that are required in the DW 2.0 environment. There are so many data bases and tables that the data base administrator cannot possibly devote huge amounts of time to any one data base. There simply are too many data bases, all of them important, for the administrator to focus on any one. Therefore, the administrator needs tools that can look over the many aspects of the many data bases and tables that make up the DW 2.0 environment.

Some of the considerations of data base administration in the DW 2.0 environment are

selecting tools for monitoring the data base administration aspects of the DW 2.0 environment;
selecting tools for repairing data problems and for preventing them in the DW 2.0 environment;
making sure that the tools are applied when needed.

As a rule the data base administration job is a 24/7 job. Someone from data base administration is on call all the time to be able to advise computer operations as to what to do when a problem arises. Especially in the case of the interactive environment, when a data base problem occurs, the data base administrator needs to be as proactive as possible, because malfunctions and down time equal dissatisfaction with the environment. But being proactive is difficult because the vast majority of the tasks facing the data base administrator are reactive.

STEWARDSHIP

In recent years, with governance and compliance becoming large issues, the role of stewardship has become an important topic. In years past it was sufficient simply to get data into and through the system. In today’s world, the quality and accuracy of the data have become important.

It is in this framework that stewardship has been elevated into a position of recognized responsibility.

Stewardship entails the following:

The identification of what data elements make up the system of record.
The specification of the criteria of data quality for those data elements.
The specification of algorithms and formulas pertinent to those data elements.

To differentiate between the role of data base administrator and that of data steward, consider the following. When a data base has “gone down” and is unavailable to the system, the data base administrator is called. When performance suffers and there is a general system slowdown, the data base administrator is called. When there is an incorrect value in a record that the end user has encountered, the data steward is called. And when it comes time to create a new data base design, and the sources of data and their transformation are considered, the data steward is called.

There are, then, related yet different sets of activities that the data base administrator and the data steward have.

As a rule the data base administrator is a technician and the data steward is a business person. Trying to make the job of the data steward technical is usually a mistake.

Some of the aspects of the data steward’s working life include

being available for data base design, especially where transformation and mapping are a part of the design;
being available to answer questions with regard to the contents of a given data element;
teaching business analysts what data there is and how it is best interpreted;
ensuring that mapping and transformation are done accurately;
describing how algorithms and program logic should be done to reflect the true business meaning of data.

As a rule a large corporation will have more than one data steward. There are usually many business people who act as data stewards. Each primitive data element will have exactly one data steward at any moment in time. If a data element has no data steward, or has more than one data steward at any point in time, then there is an issue.
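The one-steward rule is easy to check mechanically. The sketch below assumes that stewardship assignments are kept as (element, steward) pairs; the function name and the sample data are hypothetical.

```python
def stewardship_issues(elements, assignments):
    """Return {element: sorted stewards} for every element without exactly one steward."""
    stewards = {element: set() for element in elements}
    for element, steward in assignments:
        stewards.setdefault(element, set()).add(steward)
    return {
        element: sorted(names)
        for element, names in stewards.items()
        if len(names) != 1
    }


elements = ["customer_id", "order_amount", "ship_date"]
assignments = [
    ("customer_id", "alice"),
    ("order_amount", "bob"),
    ("order_amount", "carol"),
]
issues = stewardship_issues(elements, assignments)
print(issues)  # {'order_amount': ['bob', 'carol'], 'ship_date': []}
```

An empty list marks an element with no steward, and a list of two or more names marks an element with conflicting stewards.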

SYSTEMS AND TECHNOLOGY ADMINISTRATION

An integral part of the DW 2.0 environment is systems and technology. In its final format the world of DW 2.0 resides on one or more platforms. Because of the diverse nature of the data, processes, and requirements that are served by the different parts of the DW 2.0 environment, it is very unusual to have a single platform serve the entire DW 2.0 environment. Instead, many different technologies and many different platforms need to be blended together to satisfy the needs of DW 2.0 processing.

In one place DW 2.0 requires high performance. In another place DW 2.0 focuses on integration of data. In other places DW 2.0 mandates the storage of data for a long period of time. And in yet other places DW 2.0 caters to the needs of the analytical end user. In brief, there are many different kinds of measurement that determine the success of the DW 2.0 environment in different places.

These needs are very different, and it is not surprising that no single platform and no single technology meets all of them at once.

Therefore, the technical and systems administrator of the DW 2.0 environment wears a lot of hats. Some of the aspects of the job of the technical administrator are

ensuring compatibility of technology, such as ensuring that data can be passed from one environment to the next, that the performance needs of one system are not impaired by another system, and that data can be integrated throughout all systems, and ensuring availability throughout the environment;
ensuring that there is a long-term growth plan for all the components of DW 2.0;
ensuring that metadata can be meaningfully exchanged between the components of the DW 2.0 environment;
making sure that it is clear to the end user which component of DW 2.0 is appropriate to be used for different kinds of processing;
network management—making sure that communications are possible and efficient throughout the DW 2.0 environment;
timing—making sure that data that must interface with other data is flowing in a smooth and unimpeded manner;
performance—making sure that performance is acceptable throughout the DW 2.0 environment;
availability—making sure that the components of DW 2.0 that are needed are up and running when needed;
making sure that metadata flows where needed and is available to the end user when needed.

An important aspect of the technical administrator’s job is capacity planning. In many ways the job of the technical administrator is like that of the data base administrator. The technician operates in many cases in a reactive mode. And no person likes to be constantly bombarded with the need to have everything done yesterday. And yet that is exactly the world in which the technician and the data base administrator often find themselves.

One of the most important ways that the technician can get out of being in a reactive mode is to do proper capacity planning. Not all errors and problems in the DW 2.0 environment are related to capacity, but many are. When there is adequate capacity, the system flows normally. But when a system starts to reach the end of its capacity, it starts to fall apart in many different ways.

The sorts of capacity and related measurements that the technician pays attention to in the DW 2.0 environment include

memory, for all types of processing but especially for online transaction processing that occurs in the interactive environment;
queue length and capacity (queue length is usually an indicator of a bottleneck in the system);
caching capacity and cache hits;
hard disk space;
near-line space;
archival space;
archival processors;
networking capacity;
and so forth.

By looking at these various measurements, the technician can preempt many problems before they occur.

Other important measurements include the growth of dormant data in the Integrated Sector, the growth of near-line storage, the growth of archival storage, the measurement of the probability of access of data throughout the environment, network bottlenecks, and so forth. In short, the more places in which the technician can preempt a critical shortage, the better.
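To show how a growth measurement can turn into an early warning, the sketch below estimates how many days remain before a storage pool fills up. It assumes periodic (day, used gigabytes) samples and simple linear growth between the first and last sample; both the method and the names are illustrative assumptions, and real growth is rarely this regular.

```python
def days_until_full(capacity_gb, samples):
    """Estimate days until capacity is reached by linear extrapolation.

    samples is a list of (day_number, used_gb) pairs ordered by day.
    Returns None when usage is flat or shrinking.
    """
    (first_day, first_used), (last_day, last_used) = samples[0], samples[-1]
    growth_per_day = (last_used - first_used) / (last_day - first_day)
    if growth_per_day <= 0:
        return None
    return (capacity_gb - last_used) / growth_per_day


# Sample measurements: 600 GB used on day 0, 660 GB used on day 30.
remaining = days_until_full(1000.0, [(0, 600.0), (30, 660.0)])
print(remaining)  # 170.0
```

The same calculation applies to disk, near-line, and archival space, provided each is sampled separately.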

One of the most important jobs of management of the DW 2.0 environment is that of managing end-user relationships and expectations. If management ignores this aspect of the DW 2.0 environment, then management is at risk. Some of the ways in which the end-user expectations are managed include

the creation and staffing of a help desk;
the publication of a periodic newsletter containing success stories and helpful hints as to how to use the DW 2.0 environment;
occasional in-house classes describing the contents and usage of aspects of the DW 2.0 environment;
the conduct of steering committees in which the end user gets to decide on or at least give input on priorities and schedules;
involvement of the end user in the entire development and design life cycle for the DW 2.0 environment;
corporate “show and tell” sessions conducted in-house;
the occasional day-long seminar by outside professionals that complements the experience or information found in the DW 2.0 environment.

Another important element of the management of end-user relationships is the establishment of an SLA, or service level agreement. The SLA is measured throughout the day-to-day processing that occurs in the DW 2.0 environment. The SLA provides a quantifiable and open record of system performance. The establishment of the SLA benefits both the end user and the technician. As a rule the SLA addresses both online performance and availability. In addition, the SLA for the analytical environment is very different from the SLA for the transactional environment.
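A minimal sketch of an SLA measurement follows. It assumes the SLA is stated as two targets, a share of transactions that must finish within a response-time limit and a minimum availability, and every threshold and name here is a hypothetical example rather than a recommended value.

```python
def sla_report(response_times_s, minutes_up, minutes_scheduled,
               max_response_s=2.0, min_fraction_within=0.95,
               min_availability=0.995):
    """Compare measured response time and availability with assumed SLA targets."""
    within = sum(1 for t in response_times_s if t <= max_response_s)
    fraction_within = within / len(response_times_s)
    availability = minutes_up / minutes_scheduled
    return {
        "fraction_within": fraction_within,
        "availability": availability,
        "response_ok": fraction_within >= min_fraction_within,
        "availability_ok": availability >= min_availability,
    }


# One day of sample measurements: four transactions, ten minutes of downtime.
report = sla_report([0.4, 0.9, 1.5, 3.2], minutes_up=1430, minutes_scheduled=1440)
print(report["fraction_within"], report["response_ok"], report["availability_ok"])
# 0.75 False False
```

Because the SLA for the analytical environment differs from the SLA for the transactional environment, the targets would be set separately for each.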

In the case in which there is statistical processing to be done in the DW 2.0 environment, the technician must carefully monitor the full impact of the statistical processing on resource utilization. There is a point at which a separate facility must be built for research statistical analysis.

MANAGEMENT ADMINISTRATION OF THE DW 2.0 ENVIRONMENT

Sitting over all of these administrative activities is management. It is management’s job to make sure that the goals and objectives of the DW 2.0 environment are being met. And there are many aspects to the management of the DW 2.0 environment. Some of the more important aspects are the following.

Prioritization and prioritization conflicts

The buck stops at the manager’s office when it comes to prioritization. It is almost inevitable that certain parts of the organization want changes and additions to DW 2.0 at the same time that other parts also want changes and additions. It is the job of the manager to resolve (or at least ameliorate) the conflicts. Typical considerations include

which additions to the DW 2.0 environment will achieve the biggest financial payback;
which additions to the DW 2.0 environment are the easiest and fastest to make;
which additions to the DW 2.0 environment can be made in a time frame that is acceptable to the organization;
which additions to the DW 2.0 environment have the greatest strategic payback.

The manager must juggle all of these considerations when determining the order of additions and adjustments to the DW 2.0 environment. But there are other considerations when managing the DW 2.0 environment.

Budget

The primary way in which management influences the organization is through budget. The projects that receive funding continue and flourish; the projects that do not receive funding do not. Some budgetary decisions are long term and some are short term. Nearly everything that is done in the DW 2.0 environment is done iteratively. This means that management has the opportunity to make mid-term and short-term corrections as a normal part of the budgeting process.

Scheduling and determination of milestones

One of the most important parts of management is the setting of milestones and schedules. Usually management does not create the original schedules and milestones. Instead management has the projects that are being managed propose the schedules and milestones. Then management approves those milestones that are acceptable. Because nearly all aspects of DW 2.0 are constructed in an iterative manner, management has ample opportunity to influence the corporate schedule.

Allocation of resources

The manager chooses who gets to lead projects. There is an art to selecting leadership. One school of thought says that when a project is in trouble more resources need to be added. Unfortunately this sends the wrong message to the organization: one sure way to get more resources is to get a project in trouble. Another approach is to remove the project leader of any project that gets in trouble. Unfortunately there often are legitimate circumstances that cause a project to become mired down. The art of management is to determine which of these circumstances is at hand and to make the proper decision. Another way of saying this is that management needs to be able to tell the difference between running over a speed bump and running off a cliff.

Managing consultants

Because the development skills for DW 2.0 are in short supply, it is very normal for an organization to turn to outside consultants for help. Management needs to be able to select a consulting firm objectively and not necessarily select the consulting firm that has always been the preferred supplier. The reason is that the preferred supplier may not have any legitimate experience. In addition management needs to be wary of consulting firms that sell a corporation on their capabilities and then staff the project with newly hired people who are “learning the ropes” at the expense of the client. There are several ways to ensure that a consulting firm does not “sell the goods” to an unsuspecting firm.

Never sign off on a contract longer than 12 months. If a consulting firm is worth its salt, it knows that at the end of 12 months, if the work has been done satisfactorily, the contract will be continued. Conversely, if the work has not been done well, then a new consulting firm can be engaged.
Make sure there are tangible short-term deliverables. This is a good measure of whether progress is really being made.
Make sure the consulting firm tells you specifically who will be staffing the project. The key positions are in design and management.
Place two or three corporate employees in key positions in the project working hand-in-hand with the consultants so that if there is a problem, the corporate employees can judge for themselves whether management should be alerted.
Place key design decisions in writing and make that documentation available to management at any time.
Check the credentials of the consulting firm. Do not take it on faith that the consulting firm can build a DW 2.0 environment just because they are a large and well-known firm.
Be wary of a consulting firm that will not allow the work to be occasionally reviewed by an outside expert. A confident and competent consulting firm will be happy to have its work reviewed by outside experts, especially if there is a problem with design, development, or implementation.
Be wary of consulting firms that are attached to a hardware and/or a software vendor. At the bottom line of the consulting firm’s proposal is often a subtle sales pitch for that vendor’s products.
Be open to sharing experiences with the management of other organizations. You can learn a lot if other managers are open to discussing their experiences.
Be wary of vendors who “showcase” a manager from another company. Oftentimes these managers have agendas you are not aware of. In some circles, managers of consumer companies are de facto employees of the vendor, or at least agents of the vendor.
Be wary of vendors who readily bring forth an independent consulting firm to bolster their position. Many software vendors have favorite consulting firms who have hidden arrangements with the vendor. The work or the evaluation you get will not be unbiased.
Be wary of publicly available consultants who purport to do market evaluations of suites of products. Oftentimes these consultants have “under the table” arrangements with vendors in which their goal is to influence you to buy the vendor’s products rather than to give you an honest evaluation of products in the marketplace.
Be wary of firms that purport to do marketplace research and evaluations of products. You should know that many of these research firms sell services to the vendors that they are supposedly evaluating, and that the sale of those services has a way of influencing the evaluation made of the vendor’s products. If a research firm discloses, along with its evaluations, how much money each vendor spends with it, then its evaluations of the vendor’s products may be valid. But if a research firm will not disclose how much money is being spent by the vendor companies that are being evaluated, then the advice and rankings made by that firm should be discounted.

SUMMARY

In summary, there are many aspects to the management and administration of the DW 2.0 environment. Some of the aspects include the administration of

the data model;
the ETL environment;
data bases;
stewardship;
technology and systems;
network management;
archival processing;
near-line storage;
interactive processing;
metadata management.