Chapter 8. Storage Optimization

Chapter Objectives Estimated time: 1.75 hours
In this chapter, you’ll learn when to use each of the three data storage modes, how aggregations work, and how to
  • Use the Storage Design Wizard to set storage mode and design aggregations.

  • Create aggregations that optimize actual usage patterns.

  • Control aggregation levels for individual dimensions.


As you learned in Chapter 1, “A Data Analysis Foundation,” an OLAP cube appears to contain every possible summarized value at every possible level of detail for every dimension. For example, the Sales cube created in Chapter 2, “Analysis Manager from 500 Feet,” contains Sales Units and Sales Dollars as measures and Product, Employee, State, and Time as dimensions. If you query the cube at the lowest level of detail—for example, Sales Dollars for Bagels by Rebecca Kanagaki in Washington during October 1998—the cube returns the number $3.51 as if it were stored directly in the cube. But at the same time, if you query the cube at a much higher level of detail—for example, Sales Dollars for Bread in the USA by all employees during 1998—the cube returns the number $14,198.45, again as if that number were stored directly in the cube.

The sample Chapter 8 data warehouse on this book’s companion CD contains information to support four dimensions: Product, Employee, State, and Time. At the level of detail stored in the warehouse, there are 200 products, 4 leaf-level employees with data values, 6 states, and 22 months. That means there are 200 × 4 × 6 × 22, or 105,600 possible combinations, which means there could theoretically be 105,600 rows in the fact table. In reality, products, employees, and states were introduced gradually over time and only 6023 combinations actually appear as rows in the fact table. Even though the fact table contains only 6023 detail values, the cube can display up to 105,600 detail level cells—most of which would be empty.

In addition, because a cube appears to contain every possible summarized value at every possible level of detail for every dimension, the cube must appear to contain not only the 105,600 possible detail combinations but also all possible summary values. Counting all the members at all levels of the hierarchy, the Chapter 8 warehouse database has 216 members in the Product dimension, 8 members in the employee dimension, 14 members in the State dimension, 33 members in the Time dimension, and also the 2 measures. The cube must therefore appear to contain 216 × 8 ×14 × 33 × 2, or 1,596,672 values. In this small sample database, the cube appears to contain 265 times as many values as the fact table contains rows! This is called data explosion. Data explosion is a major issue with OLAP cubes, and all OLAP products must deal with it in some way.

The simplest way to avoid data explosion is simply to not store aggregations at all and, instead, to calculate them on demand. But when you have a large data warehouse, this option quickly takes its toll on performance because requesting a single high level summarized value from the cube would require retrieving and summing hundreds or thousands of values from the source data. The challenge of OLAP is to make queries as fast as possible while avoiding data explosion. Microsoft SQL Server 2000 Analysis Services provides several features that allow the database administrator to control and fine-tune the relationship between the physical size of the cube and the speed of the queries. In fact, Analysis Services provides options in many cases that allow both compact data files and responsive queries.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.147.205.154