Introducing dimensional modeling

So far you have dealt with the Jigsaw puzzles database, a database used for daily operational work. In the real-world, a database like this is maintained by an On-Line Transaction Processing (OLTP) system. The users of an OLTP system perform operational tasks—sell products, process orders, control stock, and so on.

As a counterpart, a datawarehouse is a nonoperational database; it is a specialized database designed for decision support purposes. Users of a datawarehouse analyze the data, and they do it from different points of view.

The most used technique for delivering data to datawarehouse users is dimensional modeling. This technique makes databases simple and understandable.

The primary table in a dimensional model is the fact table. A fact table stores numerical measurements of the business such as quantity of products sold, amount represented by the sold products, discounts, taxes, number of invoices, number of claims, and anything that can be measured. These measurements are referred as facts.

A fact is useless without the dimension tables. Dimension tables contain the textual descriptors of the business. Typical dimensions are product, time, customers, and regions. The fact along with all the surrounding dimension tables make a star-like structure often called a star schema.

Datawarehouse is a very broad concept. In this book we will deal with datamarts. While a datawarehouse represents a global vision of an enterprise, a datamart holds the data from a single business process .

Data stored in datawarehouses and datamarts usually comes from different sources, the operational database being the main. The process that takes the information from the source, transforms it in several ways, and finally loads the data into the datamart or datawarehouse is the already mentioned ETL process. As said, PDI is a perfect tool for accomplishing that task. In the rest of this chapter, you will learn how to load dimension tables with PDI. This will build the basis for the final project of the book: Loading a full datamart.

Through the tutorials you will learn more about this. However, the terminology introduced here constitutes just a preamble to dimensional modeling. There is much more you can learn. If you are really interested in the subject, you should start by reading The Data Warehouse Toolkit (Second Edition) by Ralph Kimball and Margy Ross. The book is undoubtedly the best guide to dimensional modeling.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.137.223.10