Chapter 4. The CIO

The Chief Information Officer (CIO) is responsible for the effective deployment of information technology resources and staff to meet the strategic, decisional, and operational information requirements of the enterprise.

Data warehousing, with its accompanying array of new technologies and its dependence on operational systems, naturally makes strong demands on the technical and human resources under the jurisdiction of the CIO.

For this reason, it is natural for the CIO to be strongly involved in any data warehousing effort. This chapter attempts to answer the typical questions of CIOs who participate in data warehousing initiatives.

How Do I Support the Data Warehouse?

After the data warehouse goes into production, different support services are required to ensure that the implementation is not derailed. These support services fall into the categories described below.

Regular Warehouse Load

The data warehouse needs to be constantly loaded with additional data. The amount of work required to load data into the warehouse on a regular basis depends on the extent to which the extraction, transformation, and loading processes have been automated, as well as the load frequency required by the warehouse.

The frequency of the load depends on the user requirements, as determined during the data warehouse design activity. The most frequent load possible with a data warehouse is once a day, although it is not unusual to find organizations that load their warehouses once a week, or even once a month.

The regular loading activities fall under the responsibilities of the warehouse support team, who almost invariably report to directly or indirectly to the CIO.

Applications

After the data warehouse and related data marts have been deployed, the IT department or division may turn its attention to the development and deployment of Executive Systems or custom applications that run directly against the data warehouse or the data marts. These applications are developed or targeted to meet the needs of specific user groups.

Any in-house application development will likely be handled by internal IT staff; otherwise, such projects should be outsourced under the watchful eye of the CIO.

Warehouse DB Optimization

Apart from the day-to-day database administration support of production systems, the warehouse DBA must also collect and monitor new sets of query statistics with each rollout or phase of the data warehouse.

The data structure of the warehouse is then refined or optimized on the basis of these usage statistics, particularly in the area of stored aggregates and table indexing strategies.

User Assistance or Help Desk

As with any information system in the enterprise, a User Assistance Desk or Help Desk can provide users with general information, assistance, and support. An analysis of the help requests received by the Help Desk provides insight on possible subjects for follow-on training with end users.

In addition, the Help Desk is an ideal site for publicizing the status of the system after every successful load.

Training

Provide more training as more end users gain access to the data warehouse. Aside from covering the standard capabilities, applications, and tools that are available to the users, the warehouse training should also clearly convey what data are available in the warehouse.

Advanced training topics may be appropriate for more advanced users. Specialized work groups or one-on-one training may be appropriate as follow-on training, depending on the type of questions and help requests that the Help Desk receives.

Preparation for Subsequent Rollouts

All internal preparatory work for subsequent rollouts must be performed while support activities for prior rollouts are underway. This activity may create resource contention and therefore should be carefully managed.

How Will My Data Warehouse Evolve?

One of the toughest decisions any data warehouse planner has to make is to decide when to evolve the system with new data and when to wait for the user base, IT organization, and business to catch up with the latest release of the warehouse.

Warehouse evolution is not only a technical and management issue, it is also a political issue. The IT organization must continually either:

  • market or sell the warehouse for continued funding and support of existing capabilities; or

  • attempt to control the demand for new capabilities.

Each new extension of the data warehouse results in increased complexity in terms of data scope, data management, and warehouse optimization. In addition, each rollout of the warehouse is likely to be in different stages and therefore to have different support needs.

For example, an enterprise may find itself busy with the planning and design of the third phase of the warehouse, while deployment and training activities are underway for the second phase, and Help Desk Support is available for the first phase. The CIO will undoubtedly face the unwelcome task of making critical decisions regarding resource assignments.

In general, data warehouse evolution takes place in one or more of the following areas:

  • Data. Evolution in this area typically results in an increase in scope (although a decrease is not impossible). The extraction subsystem will require modification in cases where the source systems are modified or new operational systems are deployed.

  • Users. New users will be given access to the data warehouse, or existing users will be trained on advanced features. This implies new or additional training requirements, the definition of new users and access profiles, and the collection of new usage statistics. New security measures may also be required.

  • IT organization. New skill sets are required to build, manage, and support the data warehouse. New types of support activities will be needed.

  • Business. Changes in the business result in changes in the operations, monitoring needs, and performance measures used by the organization. The business requirements that drive the data warehouse change as the business changes.

  • Application functionality. New functionality can be added to existing OLAP tools, or new tools can be deployed to meet end-user needs.

Who Should Be Involved in a Data Warehouse Project?

Every data warehouse project has a team of people with diverse skills and roles. The involvement of internal staff during the warehouse development is critical to the warehouse maintenance and support tasks once the data warehouse is in production. Not all the roles in a data warehouse project can be outsourced to third parties; of the typical roles listed below, internal enterprise staff should fulfill the roles listed in bold face:

  • Steering Committee

  • User Reference Group

  • Warehouse Driver

  • Warehouse Project Manager

  • Business Analysts

  • Warehouse Data Architect

  • Metadata Administrator

  • Warehouse DBA

  • Source System DBA and System Administrator

  • Project Sponsor (see Chapter 3)

The same person may play more than one role.

Every data warehouse project has a team of people with diverse skills and roles. Below is a list of typical roles in a data warehouse project. Note that the same person may play more than one role.

Steering Committee

The Steering Committee is composed of high-level executives representing each major type of user requiring access to the data warehouse. The Project Sponsor is a member of the committee; in most cases, the sponsor heads the committee. The Steering Committee should already be formed by the time data warehouse implementation starts; however, the existence of a Steering Committee is not a prerequisite for data warehouse planning. During implementation, the Steering Committee receives regular status reports from the project team and intervenes to redirect project efforts whenever appropriate.

User Reference Group

Representatives from the user community (typically middle-level managers and analysts) provide critical inputs to data warehousing projects by specifying detailed data requirements, business rules, predefined queries, and report layouts. User representatives also test the outputs of the data warehousing effort.

It is not unusual for end-user representatives to spend up to 80 percent of their time on the project, particularly during the requirements analysis and data warehouse design activities. Toward the end of a rollout, up to 80 percent of the representatives' time may be required again for testing the usability and correctness of warehouse data.

End users also participate in regular meetings or interviews with the warehousing team throughout the life of each rollout (up to 50 percent involvement).

Warehouse Driver

The Warehouse Driver reports to the steering committee, ensures that the project is moving in the right direction, and is responsible for meeting project deadlines.

The Warehouse Driver is a business manager but is responsible for defining the data warehouse strategy (with the assistance of the warehouse project manager) and for planning and managing the data warehouse implementation from the business side of operations.

The Warehouse Driver also communicates the warehouse objectives to other areas of the enterprise. This individual normally serves as the coordinator in cases where the implementation team has cross-functional team members. It is therefore not unusual for the Warehouse Driver to be the liaison to the User Reference Group.

Warehouse Project Manager

The Project Manager is usually an individual who is very well versed in technology and in managing technology projects. This person's technical skills strongly complement the business acumen of the Warehouse Driver.

The Project Manager normally reports to the Warehouse Driver and jointly defines the data warehouse strategy with the Warehouse Driver. It is not unusual, though, to find organizations where the Warehouse Driver and Project Manager jointly manage the project. In such cases, the Project Manager is actually a Technical Manager.

The Project Manager is responsible for implementing the project plans and acts as coordinator on the technology side of the project, particularly when the project involves several vendors. The warehouse Project Manager keeps the Warehouse Driver updated on the technical aspects of the project but isolates the Warehouse Driver from the technical details.

Business Analyst(s)

The analysts act as liaisons between the User Reference Group and the more technical members of the project team. Through interviews with members of the User Reference Group, the analysts identify, document, and model the current business requirements and usage scenarios.

Analysts play a critical role in managing end-user expectations, since most of the contact between the User Reference Group and the warehousing team takes place through the analysts. Analysts represent the interests of the end users in the project and therefore have the responsibility of ensuring that the resulting implementation will meet end-user needs.

Warehouse Data Architect

The warehouse data architect develops and maintains the warehouse's enterprise-wide view of the data. This individual analyzes the information requirements specified by the user community and designs the data structures of the data warehouse accordingly.

The workload of the architect is heaviest at the start of each rollout, when most of the design decisions are made. The workload tapers off as the rollout gets underway.

The warehouse data architect has an increasingly critical role as the warehouse evolves. Each successive rollout that extends the warehouse must respect an overall integrating architecture—and the responsibility for the integrating architecture falls squarely on the warehouse data architect. Data mart deployments that are fed by the warehouse should likewise be considered part of the architecture to avoid the data administration problems created by multiple, unrelated data marts.

Metadata Administrator

The metadata administrator defines metadata standards and manages the metadata repository of the warehouse. The workload of the Metadata Administrator is quite high both at the start and toward the end of each warehouse rollout. Workload is high at the start primarily due to metadata definition and setup work. Workload toward the end of a rollout increases as the schema, the aggregate strategy, and the metadata repository contents are finalized.

Metadata play an important role in data warehousing projects and therefore warrant the separate discussion in Chapter 13.

Warehouse DBA

The warehouse database administrator works closely with the Warehouse Data Architect. The workload of the warehouse DBA is typically heavy throughout a data warehouse project. Much of this individual's time will be devoted to setting up the warehouse schema at the start of each rollout. As the rollout gets underway, the warehouse DBA takes on the responsibility of loading the data, monitoring the performance of the warehouse, refining the initial schema, and creating dummy data for testing the decision support front-end tools. Toward the end of the rollout, the warehouse DBA will be busy with database optimization tasks as well as aggregate table creation and population.

As expected, the warehouse DBA and the metadata administrator work closely together. The warehouse DBA is responsible for creating and populating metadata tables within the warehouse in compliance with the standards that have been defined by the metadata administrator.

Source System Database Administrators (DBAs) and System Administrators (SAs)

These IT professionals play extremely critical roles in the data warehousing effort. Among their typical responsibilities are:

  • Identify best extraction mechanisms. Given their familiarity with the current computing environment, source system DBAs and SAs are often asked to identify the data transfer and extraction mechanisms best suited for their respective operational systems.

  • Contribute to source-to-target field mapping. These individuals are familiar with the data structures of the operational systems and are therefore the most qualified to contribute to or finalize the mapping of source system fields to warehouse fields.

  • Data quality assessment. In the course of their day-to-day operations, the DBAs and SAs encounter data quality problems and are therefore in a position to highlight areas that require special attention during data cleansing and transformation.

Depending on the status of the operational systems, these individuals may spend the majority of their time on the above activities during the course of a rollout.

Conversion and Extraction Programmer(s)

The programmers write the extraction and conversion programs that pull data from the operational databases. They also write programs that integrate, convert, and summarize the data into the format required by the data warehouse. Their primary resource persons for the extraction programs will be the source system DBAs and SAs.

If data extraction, transformation, and transportation tools are used, these individuals are responsible for setting up and configuring the selected tools and ensuring that the correct data records are retrieved for loading into the warehouse.

Technical and Network Architect

The technical and network architect ensures that the technical architecture defined for the data warehouse rollout is suitable for meeting the stated requirements. This individual also ensures that the technical and network architecture of the data warehouse is consistent with the existing enterprise infrastructure.

The network architect coordinates with the project team on the extensions to the enterprise's network infrastructure required to support the data warehouse and constantly monitors the warehouse's impact on network capacity and throughput.

Trainer

The trainer develops all required training materials and conducts the training courses for the data warehousing project. The warehouse project team will require some data warehousing training, particularly during early or pilot projects. Toward the end of each rollout, end users of the data warehouse will also require training on the warehouse contents and on the tools that will be used for analysis and reporting.

What Is the Team Structure Like?

Figure 4-1 illustrates a typical project team structure for a data warehouse project. Note that there are many other viable alternative team structures. Also, unless the team is quite large and involves many contracted parties, a formal team structure may not even be necessary.

The team structure will evolve once the project has been completed. Day-to-day maintenance and support of the data warehouse will call for a different organizational structure—sometimes one that is more permanent.

Typical Project Team Structure for Development

Figure 4-1. Typical Project Team Structure for Development

Resource contention will arise when a new rollout is underway and resources are required for both warehouse development and support.

What New Skills Will My People Need?

IT professionals and end-users will both require new but different skill sets, as described below.

IT Professionals

Data warehousing places new demands on the IT professionals of an enterprise. New skill sets are required, particularly in the following areas:

  • New database design skills. Traditional database design principles do not work well with a data warehouse. Dimensional modeling concepts break many of the OLTP design rules. Also, the large size of warehouse tables requires database optimization and indexing techniques that are appropriate for very large database (VLDB) implementations.

  • Technical capabilities. New technical skills are required, especially in enterprises where new hardware or software is purchased (e.g., new hardware platform, new RDBMS, etc.). System architecture skills are required for warehouse evolution, and networking management and design skills are required to ensure the availability of network bandwidth for warehousing requirements.

  • Familiarity with tools.  In many cases, data warehousing works better when tools are purchased and integrated into one solution. IT professionals must become familiar with the various warehousing tools that are available and must be able to separate the wheat from the chaff. IT professionals must also learn to use, and learn to work around, the limitations of the tools they select.

  • Knowledge of the business. Thorough understanding of the business and of how the business will utilize data are critical in a data warehouse effort. IT professionals cannot afford to focus on technology only. Business analysts, in particular, have to understand the business well enough to properly represent the interests of end users. Business terms have to be standardized, and the corresponding data items in the operational systems have to be found or derived.

  • End-user support. Although IT professionals have constantly provided end-user support to the rest of the enterprise, data warehousing puts the IT professional in direct contact with a special kind of end user: senior management. Their successful day-to-day use of the data warehouse (and the resulting success of the warehousing effort) depends greatly on the end-user support that they receive.

The IT professional's focus changes from meeting operational user requirements to helping users satisfy their own information needs.

End Users

Gone are the days when end users had to wait for the IT department to release printouts or reports or to respond to requests for information. End users can now directly access the data warehouse and can tap it for required information, looking up data themselves.

This advance assumes that end users have acquired the following skills:

  • Desktop computing. End users must be able to use OLAP tools (under a graphical user interface environment) to gain direct access to the warehouse. Without desktop computing skills, end users will always rely on other parties to obtain the information they require.

  • Business knowledge. The answers that the data warehouse provides are only as good as the questions that it receives. End users will not be able to formulate the correct questions or queries without a sufficient understanding of their own business environment.

  • Data knowledge. End users must understand the data that is available in the warehouse and must be able to relate the warehouse data to their knowledge of the business.

Apart from the above skills, data warehousing is more likely to succeed if end users are willing to make the warehouse an integral part of the management and decision-making process of the organization. The warehouse support team must help end users overcome a natural tendency to revert to "business as usual" after the warehouse is rolled out.

How Does Data Warehousing Fit into My IT Architecture?

As discussed in Chapter 1, the data warehouse is an entirely separate architectural component, distinct from the operational systems. Each time a new architectural component is added or introduced, the enterprise architect must consciously study its impact on the rest of the IT architecture and ensure that

  • the IT architecture does not become brittle as a result of this new component; and

  • the new architectural components are isolated from the obsolescence of legacy applications.

A data warehouse places new demands on the technical infrastructure of the enterprise. The following factors determine the technical environment required.

  • User requirements. The user requirements largely determine the scope of the warehouse, i.e., the requirements are the basis for identifying the source systems and the required data items.

  • Location of the source systems and the warehouse. If the source systems and the data warehouse are not in the same location, the extraction of data from operational systems into the warehouse may present difficulties with regard to logistics or network communications. In actual practice, the initial extraction is rarely 100 percent correct—some fine-tuning will be required because of errors in the source-to-target field mapping, misunderstood requirements, or changes in requirements. Easy, immediate access to both the source systems and the warehouse make it easier to modify and correct the data extraction, transformation, and loading processes. The availability of easy access to both types of computing environments depends on the current technical architecture of the enterprise.

  • Number and location of warehouse users. The number of users that may access the warehouse concurrently implies a certain level of network traffic. The location of each user will also be a factor in determining how users will be granted access to the warehouse data. For example, if the warehouse users are dispersed over several remote locations, the enterprise may decide to use secure connections through the public Internet infrastructure to deliver the warehouse data.

  • Existing enterprise IT architecture. The existing enterprise IT architecture defines or sets the limits on what is technically feasible and practical for the data warehouse team.

  • Budget allocated to the data warehousing effort. The budget for the warehousing effort determines how much can be done to upgrade or improve the current technical infrastructure in preparation for the data warehouse.

It is always prudent to first study and plan the technical architecture (as part of defining the data warehouse strategy) before the start of any warehouse implementation project.

How Many Vendors Do I Need to Talk to?

A warehousing project, like any IT project, will require a combination of hardware, software, and services, which may not all be available from one vendor. Some enterprises choose to isolate themselves from the vendor selection and liaison process by hiring a systems integrator, who subcontracts work to other vendors. Other enterprises prefer to deal with each vendor directly, and therefore assume the responsibility of integrating the various tools and services they acquire.

Vendor Categories

Although some vendors have products or services that allow them to fit in more than one of the vendor categories below, most if not all vendors are particularly strong in only one of the categories discussed below.

  • Hardware or operating system vendors. Data warehouses require powerful server platforms to store the data and to make these data available to multiple users. All the major hardware vendors offer computing environments that can be used for data warehousing.

  • Middleware/data extraction and transformation tool vendors. These vendors provide software products that facilitate or automate the extraction, transportation, and transformation of operational data into the format required for the data warehouse.

  • RDBMS vendors. These vendors provide the relational database management systems that are capable of storing up to terabytes of data for warehousing purposes. These vendors have been introducing more and more features (e.g., advanced indexing features) that support VLDB implementations.

  • Consultancy and integration services supplier. These vendors provide services either by taking on the responsibility of integrating all components of the warehousing solution on behalf of the enterprise, by offering technical assistance on specific areas of expertise, or by accepting outsourcing work for the data warehouse development or maintenance.

  • Front-end/OLAP/decision support/data access and retrieval tool vendors. These vendors offer products that access, retrieve, and present warehouse data in meaningful and attractive formats. Data mining tools, which actively search for previously unrecognized patterns in the data, also fall into this category.

Enterprise Options

The number of vendors that an enterprise will work with depends on the approach the enterprise wishes to take. There are three main alternatives when it comes to building a data warehouse. An enterprise can:

  • Build its own. The enterprise can build the data warehouse, using a custom architecture. A "best of breed" policy is applied to the selection of warehouse components and vendors. The data warehouse team accepts responsibility for integrating all the distinct selected products from multiple vendors.

  • Use a framework. Nearly all data warehousing vendors present a warehousing framework to influence and guide the data warehouse market. Most of these frameworks are similar in scope and substance, with differences greatly influenced by the vendor's core technology or product. Vendors have also opportunistically established alliances with one another and are offering their product combinations as the closest thing to an "off the shelf" warehousing solution.

  • Use an anchor supplier (hardware, software, or service Vendor). Enterprises may also select a supplier for a product or service as its key or anchor vendor. The anchor supplier's products or services are then used to influence the selection of other warehousing products and services.

What Should I Look for in a Data Warehouse Vendor?

The following sections provide evaluation criteria for the different components that make up a data warehouse solution. Different weighting should be applied to each criterion, depending on its importance to the organization.

Solution Framework

The following evaluation criteria can be applied to the overall warehousing solution:

  • Relational data warehouse. The data warehouse resides on a Relational DBMS. (Multidimensional databases are not an appropriate platform for an enterprise data warehouse, although they may be used for data marts with power-user requirements.)

  • Scalability. The warehouse solution can scale up in terms of disk space, processing power, and warehouse design as the warehouse scope increases. This scalability is particularly important if the warehouse is expected to grow at a rapid rate.

  • Front-end independence. The design of the data warehouse is independent of any particular front-end tool. This independence leaves the warehouse team free to mix and match different front-end tools according to the needs and skills of warehouse users. Enterprises can easily add more sophisticated front-ends (such as data mining tools) at a later time.

  • Architectural integrity. The proposed solution does not make the overall system architecture of the enterprise brittle; rather, it contributes to the long-term resiliency of the IT architecture.

  • Preservation of prior investments. The solution leverages as much as possible prior software and hardware investments and existing skill sets within the organization.

Project and Integration Consultancy Services

The following evaluation criteria can be applied to consultants and system integrators:

  • Star join schema. Warehouse designers use a dimensional modeling approach based on a star join schema. This form of modeling results in database designs that are navigable by business users and are resilient to evolving business requirements.

  • Source data audit. A thorough physical and logical data audit is performed prior to implementation, to identify data quality issues within source systems and propose remedial steps. Source system quality issues are the number-one cause of data warehouse project delays. The data audit also serves as a reality check to determine if the required data are available in the operational systems of the enterprise.

  • Decisional requirements analysis. Perform a thorough decisional requirements analysis activity with the appropriate end-user representatives prior to implementation to identify detailed decisional requirements and their priorities. This analysis must serve as the basis for key warehouse design decisions.

  • Methodology. The consultant team specializes in data warehousing and uses a data warehousing methodology based on the current state-of-the-art. Avoid consultants who apply unsuitable OLTP methodologies to the development of data warehouses.

  • Appropriate fact record granularity. The fact records are stored at the lowest granularity necessary to meet current decisional requirements without precluding likely future requirements. The wrong choice of grain can dramatically reduce the usefulness of the warehouse by limiting the degree to which users can slice and dice through data.

  • Operational Data Store. The consultant team is capable of implementing on Operational Data Store layer beneath the data warehouse if one is necessary for operational integrity. The consultant team is cognizant of the key differences between Operational Data Store and warehousing design issues, and it designs the solution accordingly.

  • Knowledge transfer. The consultant team views knowledge transfer as a key component of a data warehousing initiative. The project environment encourages coaching and learning for both IT staff and end users. Business users are encouraged to share their in-depth understanding and knowledge of the business with the rest of the warehousing team.

  • Incremental rollouts. The overall project approach is driven by risks and rewards, with clearly defined phases (rollouts) that incrementally extend the scope of the warehouse. The scope of each rollout is strictly managed to prevent schedule slippage.

Front-End/OLAP/Decision Support/Data Access and Retrieval Tools

The following evaluation criteria can be applied to front-end/OLAP/ decision support/data access and retrieval tools:

  • Multidimensional views. The tool supports pivoting, drill-up, and drill-down and displays query results as spreadsheets, graphs, and charts.

  • Usability. The tool works under the GUI environment and has features that make it user-friendly (e.g., the ability to open and run an existing report with one or two clicks of the mouse).

  • Star schema aware. Applicable only to Relational OLAP tools. The tool recognizes a star schema database and takes advantage of the schema design.

  • Tool sophistication. The tool is appropriate for the intended user. Not all users are comfortable with desktop computing, and Relational OLAP tools can meet most user requirements. Multidimensional databases are highly sophisticated and are more appropriate for power users.

  • Delivery lead time. The product vendor can deliver the product within the required time frame.

  • Planned functionality for future releases. Since this area of data warehousing technologies is constantly evolving and maturing, it is helpful to know the enhancements or features that the tool will eventually have in its future releases. The planned functionality should be consistent with the above evaluation criteria.

Middleware/Data Extraction and Transformation Tools

The following evaluation criteria can be applied to middleware and extraction and transformation tools:

  • Price/performance. The product performs well in a price/performance/maintenance comparison with other vendors of similar products.

  • Extraction and transformation steps supported. The tool supports or automates one or more of the basic steps to extracting and transforming data. These steps are reading source data, transporting source data, remapping keys, creating load images, generating or creating stored aggregates, logging load exceptions, generating indexes, quality assurance checking, alert generation, and backup and recovery.

  • Delivery lead time. The product vendor can deliver the product within the required time frame.

Most tools in this category are very expensive. Seriously consider writing in-house versions of these tools as an alternative, especially if your source and target environments are homogeneous.

Relational Database Management Systems

The following evaluation criteria can be applied to an RDBMS:

  • Preservation of prior investments. The warehouse solution leverages as much as possible prior software and hardware investments and existing skill sets within the organization. Note however, that data warehousing does require additional database management techniques because of the size and scale of the database.

  • Financial stability. The product vendor has proven to be a strong and visible player in the relational database market, and its financial performance indicates growth or stability.

  • Data warehousing features. The product has or will have features that support data warehousing requirements (e.g., bit-mapped indexes for large tables, aggregate navigation).

  • Star schema aware. The product's query optimizer recognizes the star schema and optimizes the query accordingly. Note that most query optimizers strongly support only OLTP-type queries. Unfortunately, although these optimizers are appropriate for transactional environments, they may actually slow down the performance on decisional queries.

  • Warehouse metadata. The tool supports the use of warehouse metadata for aggregate navigation, query statistics collection, etc.

  • Price/performance. The product performs well in a price/performance comparison with other vendors of similar products.

Hardware or Operating System Platforms

The following evaluation criteria can be applied to hardware and operating system platforms:

  • Scalability. The warehouse solution can scale up in terms of space and processing power. This scalability is particularly important if the warehouse is projected to grow at a rapid rate.

  • Financial stability. The product vendor has proven to be a strong and visible player in the hardware segment, and its financial performance indicates growth or stability.

  • Price/performance. The product performs well in a price/performance comparison with other vendors of similar products.

  • Delivery lead time. The product vendor can deliver the hardware or an equivalent service unit within the required time frame. If the unit is not readily available within the same country, there may be delays due to importation logistics.

  • Reference sites. The hardware vendor has a reference site that is using a similar unit for the same purpose. The warehousing team can either arrange a site visit or interview representatives from the site visit. Alternatively, an onsite test of the unit can be conducted, especially if no reference is available.

  • Availability of support. Support for the hardware and its operating system is available, and support response times are within the acceptable down time for the warehouse.

How Does Data Warehousing Affect My Existing Systems?

Existing operational systems are the source of internal warehouse data. Extractions can take place only during the batch windows of the operational systems, typically after office hours. If batch windows are sufficiently large, warehouse-related activities will have little or no disruptive effects on normal, day-to-day operations.

Improvement Areas in Operational Systems

Data warehousing, however, does highlight areas in existing systems where improvements can be made to operational systems, particularly in two areas:

  • Missing data items. Decisional information needs almost always require the collection of data that are currently outside the scope of the existing systems. If possible, the existing system are extended to support the collection of such data. The team will have to study alternatives to data collection if the operational systems cannot be modified (for example, if the operational system is an application package whose warranties will be void if modifications are made).

  • Insufficient data quality. The data warehouse efforts may also identify areas where the data quality of the operational systems can be improved. This is especially true for data items that are used to uniquely identify customers, such as social security numbers.

The data warehouse implementation team should continuously provide constructive feedback regarding the operational systems. Easy improvements can be quickly implemented, and improvements that require significant effort and resources can be prioritized during IT planning.

By ensuring that each rollout of a data warehouse phase is always accompanied by a review of the existing systems, the warehousing team can provide valuable inputs to plans for enhancing operational systems.

Data Warehousing and Its Impact on Other Enterprise Initiatives

By its enterprise-wide nature, a data warehousing initiative will naturally have an impact on other enterprise initiatives, two of which are discussed below.

How Does Data Warehousing Tie In with BPR?

Data warehousing refers to the gamut of activities that support the decisional information requirements of the enterprise. BPR is "the radical redesign of strategic and value-added processes—and the systems, policies, and organizational structures that support them—to optimize the work flows and productivity in an organization."

Most BPR projects have focused on the optimization of operational business processes. Data warehousing, on the other hand, focuses on optimizing the decisional (or decision-making) processes within the enterprise. It can be said that data warehousing is the technology enabler for reengineering decisional processes.

The ready availability of integrated data for corporate decision-making also has implications for the organizational structure of the enterprise. Most organizations are structured or designed to collect, summarize, report, and direct the status of operations (i.e., there is an operational monitoring purpose). The availability of integrated data at different levels of detail may encourage a flattening of the organization structure.

Data warehouses also provide the enterprise with the measures for gauging competitive standing. The use of the warehouse leads to insights as to what drives the enterprise. These insights may quickly lead to business process reengineering initiatives in the operational areas.

How Does Data Warehousing Tie In with Intranets?

The term intranet refers to the use of Internet technologies for internal corporate networks. Intranets have been touched as cost-effective, client/server solutions to enterprise computing needs. Intranets are also popular due to the universal, easy-to-learn, easy-to-use front-end, i.e., the web browser.

The web-publishing nature of the Internet, and the browser's metaphor of searching for information, are consistent with the data warehouse's querying metaphor. The availability of many web-based tools that draw their data from relational database structures has naturally encouraged the use of web technology as a means for delivering warehouse data to end-users.

A data warehouse with a web-enabled front-end therefore provides enterprises with interesting options for intranet-based solutions.

With the introduction of technologies that enable secure connections over the public Internet infrastructure, enterprises now also have a cost-effective way of distributing or delivering warehouse data to users in multiple locations.

When Is a Data Warehouse Not Appropriate?

Not all organizations are ready for a data warehousing initiative. Below are two instances when a data warehouse is simply inappropriate.

When the Operational Systems Are Not Ready

The data warehouse is populated with information primarily from the operational systems of the enterprise. A good indicator of operational system readiness is the amount of IT effort focused on operational systems.

A number of telltale signs indicate a lack of readiness. These include the following:

  • Many new operational systems are planned for development or are in the process of being deployed. Much of the enterprise's IT resources will be assigned to this effort and will therefore not be available for data warehousing projects.

  • Many of the operational systems are legacy applications that require much firefighting. The source systems are brittle or unstable and are candidates for replacement. IT resources are also directed at fighting operational system fires.

  • Many of the operational systems require major enhancements and must be overhauled. If the operational systems require major enhancements, then chances are these systems do not sufficiently support the day-to-day operations of the enterprise. Again, IT resources will be directed to enhancement or replacement efforts. Furthermore, deficient operational systems almost always fail to capture all the data required to meet the decisional information needs of business managers.

Regardless of the reason for a lack of operational system readiness, the bottom line is simple: an enterprise-wide data warehouse is out of the question due to the lack of adequate source systems. However, this does not preclude a phased data warehousing initiative, as illustrated in Figure 4-2.

The enterprise may opt for an interleaved deployment of systems. A series of projects can be conducted, where a project to deploy an operational system is followed by a project that extends the scope of the data warehouse to encompass the newly stabilized operational system.

Data Warehouse Rollout Strategy

Figure 4-2. Data Warehouse Rollout Strategy

The main focus of the majority of IT staff remains on deploying the operational systems. However, a data warehouse scope extension project is initiated as each operational system stabilizes. This project extends the data warehouse scope with data from each new operational system.

Note, however, that this approach may create unrealistic end-user expectations, particularly during earlier rollouts. The scope and strategy should therefore be communicated clearly and consistently to all users. Most, if not all, business users will understand that enterprise-wide views of data are not possible while most of the operational systems are not feeding the warehouse.

When the Need Is Operational Integration

Despite its ability to provide integrated data for decisional information needs, a data warehouse does not in any way contribute to meeting the operational information needs of the enterprise. Data warehouses are refreshed at best on a daily basis. They do not integrate data quickly enough or often enough for operational management purposes.

If the enterprise needs operational integration, then the typical data warehouse deployment (as shown in Figure 4-3) is insufficient.

Instead, the enterprise needs an Operational Data Store and its accompanying front-end applications. As mentioned in Chapter 1, flash monitoring and reporting tools are often likened to a dashboard that is constantly refreshed to provide operational management with the latest information about enterprise operations. Figure 4-4 illustrates the Operational Data Store architecture.

Traditional Data Warehouse Architecture

Figure 4-3. Traditional Data Warehouse Architecture

When the intended users of the system are operational managers and when the requirements are for an integrated view of constantly refreshed operational data, an Operational Data Store is the appropriate solution.

Enterprises that have implemented Operational Data Stores will find it natural to use the Operational Data Store as one of the primary source systems for their data warehouse. Thus, the Data Warehouse contains a series (i.e., layer upon layer) of ODS snapshots, where each layer corresponds to data as of a specific point in time.

How Do I Manage or Control a Data Warehouse Initiative?

There are several ways to manage or control a data warehouse project. Note that most of the techniques described below are useful in any technology project.

The Data Warehouse and the Operational Data Store

Figure 4-4. The Data Warehouse and the Operational Data Store

Milestones. Clearly defined milestones provide project management and the Project Sponsor with regular checkpoints to track the progress of the data warehouse development effort. Milestones should be far enough apart to show real progress, but not so far apart that senior management becomes uneasy or loses focus and commitment. In general, one data warehouse rollout should be treated as one project, lasting anywhere between three to six months.

Incremental Rollouts, Incremental Investments. Avoid biting off more than you can chew; projects that are gigantic leaps forward are more likely to fail. Instead, break up the data warehouse initiative into incremental rollouts. By doing so, you give the warehouse team manageable but ambitious targets and clearly defined deliverables.

Applying a phased approach also has the added benefit of allowing the Project Sponsor and the warehousing team to set priorities and manage end-user expectations. The benefits of each rollout can be measured separately, and the data warehouse is justified on a phase-per-phase basis.

A phased approach, however, requires an overall architect so that each phase also lays the foundation for subsequent warehousing efforts, and earlier investments remain intact.

Clearly Defined Rollout Scopes. To the maximum extent possible, clearly define the scope of each rollout to set the expectations of both senior management and warehouse end-users. Each rollout should deliver useful functionality. As in most development projects, the project manager will be walking the fine line between increasing the scope to better meet user needs and ruthlessly controlling the scope to meet the rollout deadline.

Individually Cost-Justified Rollouts. The scope of each rollout determines the corresponding rollout cost. Each rollout should be cost-justified on its own merits to ensure appropriate return on investment. However, this practice should not preclude long-term architectural investments that do not have an immediate return in the same rollout.

Plan to Have Early Successes. Data warehousing is a long-term effort that must have early and continuous successes that justify the length of the journey. Focus early efforts on areas that can deliver highly visible success, and that success will increase organizational support.

Plan to be Scalable. Initial successes with the data warehouse will result in a sudden demand for increased data scope, increased functionality, or both! The warehousing environment and design must both be scalable to deal with increased demand as needed.

Reward your Team. Data warehousing is hard work, and teams need to know their work is appreciated. A motivated team is always an asset in long-term initiatives.

In Summary

The Chief Information Officer (CIO) has the unenviable task of juggling the limited IT resources of the enterprise. He or she makes the resource assignment decisions that determine the skill sets of the various IT project teams.

Unfortunately, data warehousing is just one of the many projects on the CIO's plate. If the enterprise is still in the process of deploying operational system, data warehousing will naturally be at a lower priority.

CIOs also have the difficult responsibility of evolving the enterprise's IT architecture. They must ensure that the addition of each new system, and the extension of each existing system, contributes to the stability and resiliency of the overall IT architecture.

Fortunately, data warehouse and operational data store technologies allow CIOs to migrate reporting and analytical functionality from legacy or operational environments, thereby creating a more robust and stable computing environment for the enterprise.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.224.44.108