Chapter 1
Data Warehousing Perspectives

The secret of success is changing the way you think.

Jack Welch, CEO, General Electric

When you have completed this chapter, you will be able to:

  Understand Data Warehousing and Business Intelligence (DWBI) terms

  Talk about DWBI benefits

  Assess the maturity of DWBI efforts for your organization.

In this chapter of The Analytical Puzzle, the following topics are explained:

  Definition of Business Intelligence and Analytics

  Definition of Data Warehousing

  Benefits of Data Warehousing

  The Business Case for Business Intelligence, Analytics, and Data Warehousing

  Operational Data Versus Warehouse Data

  Data Timeliness, Consistency and Comparability

  Decision Support Goals

  What Data Warehouse Is and Is Not

  Data Warehousing Trends and Hot Topics.

Business Intelligence (BI) is the set of practices and tools for displaying and exploring data for decision-making. Techniques and tools associated with BI include dashboards, scorecards, drilldown, and slicing and dicing data. Analytics practices take BI to a higher level.

Analytics is the practice of supporting decision-making through number crunching. Analytics supports processes such as customer segmentation, product mix analysis, budgeting, and operations improvements. Some analytic solutions focus on a single domain, rather than the broader domain of the data warehouse. Techniques and tools associated with analytics include data mining, statistical analysis, and regression modeling. Table 01.01 provides examples of BI and analytics.


Table 01-01: Data Warehousing Business Intelligence Scope

Business Intelligence

Analytics

    The system displays average historical temperature.

    Customer data is listed in a report or provided on a dashboard.

    The number of customers who responded to a sales offer is reported.

    A report showing inventory turnover by product is produced.

    The system predicts the weather for next week.

    Customers are organized into segments based on multiple factors.

    The system recommends a sales campaign offer to make to a customer.

    The system recommends actions to optimize inventory levels.

Chapters 12 through 18 of this book provide detailed information about BI and analytics. Data is the raw material needed for BI and analytics, which is where data warehousing comes into play.

Decisions based on BI and Analytics can impact the bottom line by reducing cost and increasing revenues. Costs may be reduced by:

  Avoiding problems such as poor insurance risks – Progressive Insurance learned through analytics that credit score is an indicator of insurance risk, which gave the company a competitive advantage. (Davenport 2007)

  Negotiating improvements in supply – Prices paid to suppliers can be compared to industry benchmarks, which can identify potential savings.

  Dropping unprofitable products – Companies can improve profits by dropping products that cost more to produce and support than they bring in through sales. Analytics identifies those products through a correlation of costs and benefits.

  Identifying and removing the root causes of waste – Waste incidents such as scrap and rework are correlated with business process steps and machine settings to determine preventable root causes.

Revenues may be increased by:

  Understanding and better serving customers – Behavior and preferences of customers are captured and then analyzed on both group and individual levels. This information may be used to improve the design of websites, as well as to provide service tailored to individual preferences.

  Focusing on the most profitable productsHigh profit products are identified by relating cost and benefit streams. Companies can zero in on the products that provide the best long term profit.

  Cross selling to customers – Market basket analytics enables web-based stores like Amazon to recommend additional products. Knowledge of products already purchased helps to avoid making duplicate offers to customers.

  Capitalizing on trends and events – Walmart mobilizes its suppliers to provide goods needed in the event of forecasted events such as hurricanes. (Davenport 2007)

  Growing marketing opportunities – For example, the United States Census provides statistics and projected trends in the demographic composition of the United States. This data might point to opportunities to provide financial products such as annuities.

Improved decision-making can improve organizational results, as shown in Table 01-02. Organizations like Olive Garden, Royal Shakespeare Company, and 3M are great examples of this:

Table 01-02: Decision-Making Examples

Olive Garden

Olive Garden, the Italian restaurant chain, has reduced unplanned staff hours by 40% and wasted food by 10%. (Davenport 2010)

Nike Shoes

Nike uses sensors in running shoes to learn about the habits of its customers. (Davenport 2010)

Royal Shakespeare Company

The Royal Shakespeare Company increased the number of regulars by 70% through customer analytics. (Davenport 2010)

Best Buy

Best Buy found that 7% of their customers account for 43% of their business. (Davenport 2010)

3M

3M is saving $10 million in maintenance and has boosted sales force productivity by 10%. (Gray 2005)

Royal Bank of Scotland (RBS)

The Royal Bank of Scotland has integrated its customer information, creating a single customer view that enables it to better serve and cross sell to its customers. (Friedlos 2007)

See Chapter 17, BI Applications, for numerous examples of the benefits and applications of using data in decision-making.

Enterprise Data Warehousing (EDW) is a process for collecting, storing, and delivering decision support data for an entire enterprise or business unit. Data warehousing is a broad area that is described point by point in this book. A data warehouse is one of the artifacts created by the data warehousing process.

William (Bill) H. Inmon has provided an alternate and useful definition of the data warehouse, which he defines as “a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision-making process.”

As a total architecture, EDW includes people, processes, and technologies to achieve the goal of providing decision support data that is:

  Consistent across the enterprise – Each person accessing the system sees the same values. The customer service department sees the same numbers as the sales department.

  Integrated – Data is consolidated from many systems. Customer data may be gathered from customer service, sales, and subscription systems, for example.

  Standardized – Data is described using common names and definitions. The definition of a customer is the same in each department.

  Easy to access from multiple viewpoints – The direction of data viewing can be shifted from customer to product to geographic area through a point and click portal.

  Easy to understand – Information is presented using recognizable approaches such as lists, gauges, and charts with clear labels and descriptions.

A data warehouse is a database that contains a copy of operational and other data, rather than being a source of original data. This data is often obtained from multiple data sources and is useful for strategic decision-making.

A data warehouse is not another name for a database. Some people incorrectly use the term data warehouse as a generic name for a database. Its purpose is not just to maintain historical data. A data warehouse contains specific data that has been gathered for analytics and reporting. It also does not record transactional data as transactions take place. This is managed in application data stores. (A data warehouse may, however, house transactional data after the fact, for reporting purposes.)

Data warehousing is not the solution for all data analytic problems. Data that supports analytics can be obtained and managed through means other than data warehousing. Challenges with data warehousing include:

  High development costBuilding a data warehouse can be time consuming and costly.

  Timeliness of data – day old data is not good enough for tactical decision-making such as assigning people to incoming work or controlling machines in a factory. Data may be streaming in so fast that it cannot be stored in time to make needed decisions.

  High volume dataThe petabytes and exabytes of Big Data overwhelm data warehouses.

  Specialized application – A specialized application may be better for specific analysis such as risk analysis, actuarial analysis, or investment analysis (Chapter 18).

  Small and one time data The data warehouse may be “overkill” for projects that must be completed quickly and economically.

A data mart may be thought of as a smaller scale data warehouse. A data mart is a database that is part of a data warehouse system where data stored for presentation and user access. Early data mart definitions specified that data marts were targeted to specific subjects and business processes. In practice, the data mart is a database that is organized into facts and dimensions that can cross subjects. Some organizations start with a smaller scale data mart effort and then progress toward the Enterprise Data Warehouse.

Data analysis can provide worthwhile results whether done using data in a spreadsheet, a data mart, a data warehouse or Big Data across many servers!

How Mature is our DWBI Effort?, you may be asking. You can develop your own answers to this question by completing the DWBI Maturity Profile; Table 01-03 provides an example. The Maturity Profile helps you determine where your DWBI efforts need improvement.

DWBI Maturity Profile:

  Organization Background – provide a profile of the organization, including topics such as:

o         Organization name

o         Legal organization

o         Industry

o         Geographic area

o         Revenue

o         Employees.

  History – Describe the history of DWBI at the organization, including prior projects.

  Executive Support – Identify the support provided by executive sponsors such as the CEO and CFO.

  Competencies – Describe the skills and capabilities that are in place, such as:

o        BI Competency Center

o        Data Integration Competency Center

o        Data Governance.

  Usage – Describe the current state of DWBI users and analysts:

o        How many users are there?

o        What percent of employees are using the DWBI?

o        Do users see DWBI as mission critical?

o        Do users trust the data and analysis from DWBI?

o        How is DWBI usage divided by management level?

o        How is DWBI usage divided by functional area?

o         What is the level of analytical capability?

  Data Sources – Describe and enumerate the data sources that feed the DW. How dispersed is the data?

  Data Integration Tools – Identify the data integration tools available and in use.

  Data Warehouse Platform – Describe components of the data warehouse platform (hardware/software/DBMS).

  Data Warehouse Size – Quantify the amount of data storage consumed by the data warehouse. Include frequency of update and growth rate.

  BI Tools – Identify the BI tools and applications that are in place and in use.

  Results – Describe the benefits obtained from the use of DWBI, both quantitative and qualitative.


Table 01-03: Data Maturity Profile Example

DW BI Maturity Profile

Organization Background

First Place Toys Corporation (FPTC) is a fictional corporation headquartered in the state of Minnesota. This 200-person company achieved $24 million in revenue last year by distributing niche lines of toys worldwide.

History

The company developed its first successful data warehouse in 2008 with a focus on sales analysis. The data warehouse has been extended several times, adding support for customer service, manufacturing, and financial analytics.

Executive Support

The senior executives of FPTC are very supportive of the data warehouse. Gary Benedict, the company CEO, is the chief sponsor and has become a believer.

Competency Centers

The company has developed competencies in the areas of business intelligence, data management, and data integration. The company has 2 specialists who focus on business intelligence and 1 specialist who designs databases and develops data integration. In addition, the company has developed relationships with consulting firms who extend these capabilities.

Usage

The data warehouse is widely used at FPTC. Over 25 percent of the company's employees regularly access the system, guiding their activities and decisions using portals, dashboards, and scorecards populated with data warehouse data.

Data Sources

The data warehouse is populated from multiple sources. Accounting data is sourced from a PC-based software package, sales information is sourced from web-based shopping cart software, and manufacturing data is obtained from an open source system.

Data Integration Tools

The company uses Microsoft SQL Server 2008 Integration Services (SSIS) to integrate data.

Data Warehouse Platform

The data warehouse related databases are hosted on SQL Server 2008. Some of the data is stored as cubes using SQL Server Analysis Services (SSAS).

Data Warehouse Size

The data warehouse contains approximately 550 Gigabytes (GB) of information. It is growing at a rate of 15 percent yearly.

BI Tools

The company uses Microsoft SQL Server 2008 Reporting Services (SSRS) to generate reports. It also uses Microsoft Excel 2010 for additional analysis and computation.

Results

The company has achieved a return on investment (ROI) of 215 percent. The investment in data analytics has resulted in lowered costs of inventory and improved customer service.

Proven management disciplines, such as Corporate Performance Management, are enhanced and supported by data warehousing and business intelligence. For example, scorecards for motivating and evaluating employees require data that is often gathered through a data warehousing effort and made available through BI tools, as illustrated in Figure 01-01.

In my experience, people outside of the IT and BI departments are often power users of data and related tools. They may build elaborate systems using desktop tools such as spreadsheets to build models and integrate data. Great benefits can be gained by bringing together the users who are the most knowledgeable about data (the “power users”) and the data warehousing/BI team.

Unfortunately, power users often find themselves spending a large portion of their time gathering and manually manipulating data, rather than analyzing the data and producing value. In addition, numbers produced by independent spreadsheets often produce inconsistent answers, which undermine the credibility of the results. Data warehousing addresses the manual effort by automating the gathering and storage of data. In addition, data warehousing addresses the problem of inconsistent data by providing a source of consistent and cleansed data.

Figure 01-01: Data Warehousing and BI are Management Tools

Operational data is data that many people are familiar with, such as the current checking account balances, customer service call statuses, and today's sales amounts. Operational data emphasizes current values. The databases that store this kind of data are optimized for transaction processing, meaning the design of the database is normalized to avoid duplication of data. Each fact is represented in one place to support fast and frequent updating.

Operational data is different from data warehouse data. Data warehouse data is focused on historic and trend value – what was the data in the past and how is it changing. Examples of historic data include checking account balances as of the first day of the month for the last five years; average customer service call time by month; and total sales amounts by month. Instead of a normalized database design, a multi-dimensional design organized into facts and dimensions is used for part of the data warehouse. Chapter 7 explains how that works and shows you how to create a multidimensional data model.

Data warehouse data is optimized for analytical processing, rather than transactional processing. The analytic data stored in the data warehouse is different from operational data. Operational data is:

  Optimized for Transaction Processing

  Frequently Updated

  Designed using Entity Relationship Modeling.

In contrast, data warehouse / analytic data is:

  Optimized for Analytical Processing

  Not Updated (It is loaded instead)

  Designed using multiple patterns including Entity Relationship Modeling and Multidimensional Modeling.

High quality data is needed for data warehousing and business intelligence in order to produce good answers. Some characteristics of high quality data are:

  Timeliness – Data is up to date as of the same point in time.

  Consistency – The same answers are produced each time and each place. Reports consistently produce the same answers, which makes reporting more dependable based on consistent data and consistent formulas.

  Comparability – Numbers can be added and compared. They can be summed, averaged, analyzed with statistics, etc.

See Chapter 9, Data Sources for Data Warehousing and Business Intelligence, to learn how to assess and improve the quality of your data.

The data warehouse supports the Decision Support function of the organization, which typically has the following goals:

  Make fact based decisions

  Make timely decisions

  Make profitable decisions that reduce costs and increase revenue.

These decisions can support a number of stakeholders

  Customers

  Employees

  Shareholders

  Suppliers

  Community.

Here is another take on decision-support roles. Decision support is the use of business intelligence to make profitable decisions that (1) reduce cost or (2) increase revenue. You want to make sure that business decisions serve your stakeholders, the people affected, such as customers, employees, shareholders, suppliers, and the community, by making fact-based, timely decisions based on an analysis of the numbers.

Solid enterprise architecture is critical to the success of data warehousing efforts. The ANSI/IEEE Standard 1471-2000 defines the architecture of software-intensive systems as “the fundamental organization of a system, embodied in its components, their relationships to each other and the environment, and the principles governing its design and evolution.”

The Enterprise Architect (EA) builds and communicates blueprints that model the enterprise and IT solutions. These models identify the people, processes, and technologies that are needed to enable the enterprise to succeed in its mission. Architecture is organized into a number of domains, each of which provides value to data warehouse architecture. Architecture domains, illustrated in Figure 01-02, include business architecture, information architecture, software architecture, and technology architecture.

Chapters 3 and 4 dive into each architecture domain and show how they support the success of data warehousing and business intelligence. This includes definitions as well as examples of deliverables, such as documents and models. Each domain includes subdomains as listed in Table 01-04. When you have completed Chapters 3 and 4, you will have an overall understanding of enterprise architecture and the specifics of data warehouse architecture.

Figure 01-02: Enterprise Architecture Pyramid

Table 01-04: Enterprise Architecture Sub-Domains

Business

Information

   Business Requirements

   Business Rules

   Organization Structure

   Critical Success Factors

   Business Process Design and Modeling

   Mission / Vision

 

   Data Governance

   Data Integration

   Data Architecture

   Master Data Management

   Data Delivery Architecture

   Data Modeling

   Data Quality

   Content Management

   Dashboards and Analytics

   Business Intelligence

   Data Mining and Analytics

   Enterprise Reporting

   Corporate Performance Management

Application

Technology

   Enterprise Application Integration

   Custom Application Development

   Services Definition

   Process Alignment

   Services / Event Architectures

 

   Servers

   Networks

   Telecom

   Operating Systems

   Desktop

   Middleware

   Database Infrastructure

   Security

   Storage

   Other Hardware

The fields of data warehousing, business intelligence and analytics are continually growing and changing. The industry is buzzing with new developments. In addition, innovations in technology impact data warehousing. It is imperative to know about them in order to solve the Analytical Puzzle. Hot topics are identified here and more fully explained in this book.

  Agile Data Warehousing and Business Intelligence – agile approaches reduce time to value by delivering results at regular intervals

  Big Datamulti-terabytes of typically unstructured data is critical to modern analytics and contributes to the success of companies such Google, Facebook and Amazon

  Data Warehousing in the Cloud a method to rapidly ramp up and manage data warehousing capacity without investing in fixed infrastructure

  Data Warehouse Appliances vendor provided solutions that include both hardware and software greatly increase performance through Massive Parallel Processing (MPP)

  Columnar Databases new database software that stores and retrieves data by columns rather than by rows is dramatically improving data warehouse performance

  In Memory Databases – this software boosts performance by storing data in memory rather than on hard disk

  Federated and Virtual Databases – data distributed in multiple data stores can be made to look like a single data store, avoiding the cost and time of copying data

  Rthis open source analytics language with supporting software is gaining wide acceptance in both the business and academic community

  Predictive and Prescriptive Analytics predicting who will be a profitable customer and prescribing the best offer to make, provides competitive advantage

  Data Mininganalyzing data to find patterns and gain new knowledge using statistical methods is rapidly advancing

  Data Visualization improved tools are enabling data to be better understood through graphical displays

  Mobile Computing – the growing use of smart phones and tablet computers is making business intelligence available to new audiences and requires new ways of thinking.

Key Points

  Business Intelligence (BI) is the practice of supporting decision-making through the presentation and analysis of data.

  Analytics takes BI to a higher level by applying number crunching to support processes such as customer segmentation, product mix analysis, budgeting, and operations improvements.

  Data Warehousing is a total architecture for collecting, storing, and delivering decision support data for an entire enterprise.

  BI improves decision-making, which can impact the bottom line by reducing costs and increasing revenues.

  BI requires high quality data that is timely, accurate and comparable.

Deepen your knowledge of data warehousing and BI through these resources:

Do It Now!

Assess you organization's Data Warehousing and BI Maturity. Download and complete the assessment available at the support website for this book:

http://www.analyticalpuzzle.com

Visit a Website!

Visit The Data Warehousing Institute – a premier organization that provides information about data warehousing and business intelligence:

http://www.tdwi.org/

Wikipedia provides a great overview of data warehousing, including a history starting in the 1960s:

http://en.wikipedia.org/wiki/Data_Warehouse

The BeyeNETWORK is a web portal that brings together knowledge from numerous BI experts. Allen Messerli has written an excellent article series titled “Reinventing Business: Enterprise Data Warehouse Business Opportunities for Manufacturing” that help you to develop a business case: http://www.b-eye-network.com

Read about it!

These books provide excellent overviews of the economics of analytics and EDW:

Ayres, Ian. Super Crunchers: Why Thinking-By-Numbers is the New Way to be Smart. New York NY: Bantam Books, 2007.

Davenport, Thomas H., and Jeanne G. Harris. Competing on Analytics: The New Science of Winning. Boston, MA: Harvard Business School, 2007.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.164.39