Chapter 12
Tabular Models
What's in this chapter?
Introduction to PowerPivot
Importing data into PowerPivot
Explaining the PowerPivot window
Analyzing and enriching data
In Chapter 9 you learned about Microsoft Business Intelligence Semantic Model (BISM). You learned that it is made up of two components: multidimensional mode, which corresponds to the previous unified dimensional model (UDM), and tabular modeling, which is a more recent approach first implemented in the initial release of PowerPivot in SQL Server 2008 R2, with an add-in for Excel as the model development tool.
The Analysis Services team received a lot of positive feedback on PowerPivot. It considered different approaches for how to evolve it and make it available for corporate BI developers, as well as enable the underlying model for Power View for self-service business users performing highly interactive, visual analysis. SQL Server 2012 has two development tools for creating tabular models: an enhanced PowerPivot add-in for Excel and a new tabular development environment in Visual Studio for BI applications. This chapter focuses primarily on PowerPivot for Excel and covers these topics:
A key aspect of Microsoft's vision behind self-service business intelligence, as implemented in PowerPivot and Power View, is that your analytical data remains connected to its source. Also, it should be easy to update and refine your BI application and model to frequently evolving requirements. Furthermore, you can easily share data in a controlled way, and share visualizations and analyses built from the data.
PowerPivot is made up of two separate components that work together to accomplish this:
PowerPivot applications are like Excel workbooks, but they include PowerPivot data and metadata embedded in the workbook. This enables PowerPivot workbooks to offer additional functionality over regular Excel workbooks. For example, PowerPivot workbooks can contain tables of hundreds of millions of rows of data; PowerPivot tables are not constrained by Excel tables, which can contain only 1 million rows of data (1,048,576 rows to be exact).
PowerPivot tables can be a source for Excel PivotTables and PivotCharts, as well as a source for Power View (which you'll read more about in Chapter 13) for reporting, analysis, and visualization. PowerPivot can represent relationships between tables and join tables just like a database. Figure 12.1 shows a PowerPivot workbook with multiple tables.
PowerPivot workbooks can be shared using Microsoft Office SharePoint Server. Workgroup members can then browse and interact with the workbook using the Excel client, a web browser (with Excel Services configured), or Power View. There is also PowerPivot Gallery, shown in Figure 12.2. This is a custom SharePoint document library type that previews workbook contents and provides an entry for creating a Power View analysis and visualization from a workbook, as discussed in Chapter 13.
PowerPivot workbooks can reference external data sources, for which you schedule automatic data refresh. Although manual data refresh can be accomplished directly within PowerPivot for Excel when the workbook is loaded, automatic data refresh uses PowerPivot for SharePoint and executes unattended on a specified schedule. As shown in Figure 12.3, you could configure an automatic data refresh for your workbook with the latest data every morning at 6.
To summarize, PowerPivot workbooks provide all the capabilities of Excel, plus additional modeling and analytical capabilities, to deliver self-service BI in conjunction with Excel and Reporting Services Power View.
PowerPivot for Excel allows you to integrate data from various types of external data sources, link to existing data inside the workbook, add relationships between data, and enrich data with custom calculations. The data can then be used directly within Excel through features such as PivotTables and PivotCharts. You also can create highly interactive visualizations and analytical presentations with Power View.
PowerPivot for Excel includes the VertiPaq engine, a local version of the Analysis Services in-memory engine in VertiPaq mode that performs calculations and executes queries with high performance.
When you are working with PowerPivot for Excel, the data resides in memory. When you save the workbook, the data and metadata are stored inside the Excel workbook file.
PowerPivot for Excel is a free download on the web that can be found at http://powerpivot.com. It has the following prerequisites:
The installation of Office Shared Features is required because PowerPivot for Excel is a Visual Studio Tools for Office (VSTO) add-in and requires the VSTO runtime. The latter is installed when Office Shared Features are installed.
After you install these prerequisites and PowerPivot for Excel, start Microsoft Office Excel. A new command called PowerPivot Window appears on the Excel ribbon, as shown in Figure 12.4. This command is the entry point for PowerPivot for Excel. When you click it, the PowerPivot window opens, as shown in Figure 12.5.
The PowerPivot Window provides commands to integrate data from various types of external data sources, linking to existing data inside the workbook, adding relationships between data, and enriching data with custom calculations. It provides a “window” into the PowerPivot data that is stored inside the workbook.
The following sections describe some of the key features of PowerPivot for Excel. You will work through a tutorial based on a sample relational database for SQL Server. It is called AdventureWorksDW_WroxSSRS2012 and is available on the Wrox download page for this book.
If you haven't installed Excel and PowerPivot for Excel, follow these steps:
The PowerPivot window provides several ways to import data from various types of data sources, including the following:
The following steps take you through an example of importing data from a SQL Server database that has the AdventureWorksDW_WroxSSRS2012 relational sample database deployed:
Editing the names in the Friendly Name column allows you to rename them immediately upon import. You can also rename them after the import is completed, directly in the PowerPivot window.
PowerPivot for Excel creates a data store utilizing the VertiPaq engine running in PowerPivot, and retrieves the data from the data source. You see the progress of these operations in the Importing page of the Table Import Wizard.
When importing tables directly, PowerPivot also tries to detect relationships between those tables in the data source system, to add them in the PowerPivot data store.
You can inspect the relationships created by clicking the Details link in the Message column of the import dialog, as shown in Figure 12.10.
If PowerPivot was unable to import relationships, the dialog provides more information on which relationships could not be successfully imported.
The PowerPivot window is now populated with all tables that have been imported. The default view of a model is Data View, where you can see a single table at a time and switch between them using tabs, such as Excel worksheet tabs, as shown in Figure 12.11. For each table, you see the columns of the table and data rows. You can select a cell value, and the Record indicator at the bottom of the window describes which record the cursor is currently positioned on. You can navigate rows within a table using the vertical scrollbar or the arrow buttons next to the Record indicator.
With data imported, we can further enrich and refine the model. Before we do that, though, let's take a look at some of the features of the tabular designer. Figure 12.13 shows the PowerPivot tabular designer after our import is completed. Clicking the button to the left of the Home tab opens the File menu to save, and to toggle between Normal and Advanced mode. Advanced mode provides additional model options that are otherwise hidden in the PowerPivot window.
The Home tab, shown in Figure 12.13, contains commands that apply to the model you are currently working on. The following is a brief description of its commands:
The Design tab, shown in Figure 12.14, contains commands that affect individual model columns, define calculations, relationships, and table properties. The following is a brief description of its commands:
You can use Calculation Options to change to manual calculation mode. In manual mode, the Calculate Now command is enabled, and calculations are updated only when you invoke the Calculate Now command.
The Advanced tab, shown in Figure 12.15, is visible only after PowerPivot switches from normal mode to advanced mode. This can be accomplished by clicking the top-left corner of the PowerPivot window to get to its File menu, which contains options for saving, asking questions, sending feedback, and switching between normal and advanced modes. The following is a brief description of the commands on the Advanced tab:
This section describes some basic operations on tables using PowerPivot. Specifically, you'll learn about filtering and sorting, relationships, calculated columns, and measures.
While importing data, you can apply filters and preview data. After importing data into PowerPivot, you can filter and sort data in a table in the PowerPivot window. For example, you can analyze sales data by following these steps:
Figure 12.16 shows the Filter drop-down in the PowerPivot window for a specific table column. Depending on the data type of the underlying field, custom filter options are available.
For example, fields that are whole numbers would show Number Filters with options for conditions such as greater than. For string columns, you would see Text Filters in the Filter drop-down.
With the sales data filtered to 99 rows for that particular date, you can view minimum and maximum sales by order on that date by sorting the SalesAmount column following these steps:
Sorting is performed over the filtered rows. It is very fast utilizing the VertiPaq engine in PowerPivot, even if the underlying dataset has millions of rows. By sorting, you can see that the maximum sales order on that day was $2,443.35, and the minimum sales order was $2.29.
Finally, on the Home tab, in the Sort and Filter section, select Clear All Filters to see all the data in the model again.
An important aspect of the tabular model is relationships between tables. PowerPivot for Excel supports one-to-many relationships. This means that a column value in a specific table can have multiple instances of the value in another table's related column.
The table import wizard detects and understands the relationships that are present in the source data. It creates relationships in your model based on the relationships defined in the source data. In addition, the tabular designer provides ways to add relationships yourself.
If you are working in Data View, the designer indicates columns that participate in relationships with a glyph in the column header next to the column name. Hovering over the glyph provides details of the relationship. For example, hovering over the ProductKey column's glyph indicates “Related to [ProductKey] in table [DimProduct].” in a tooltip window.
Diagram View provides a richer environment for working with relationships. You can see all the relationships in your model simultaneously in Diagram View. You can also work with them in a graphical way. For example, you can create new relationships by dragging and dropping from one column to the other. Follow these steps to add a new column to the model and manually create a relationship between it and an existing table:
Alternatively, you can accomplish the same task using the Create Relationship command available on the Design tab. You can manage relationships by selecting Manage Relationships, which opens the dialog shown in Figure 12.20.
Relationships are a key part of the tabular model. The calculations that you create with DAX (Data Analysis Expressions) can use relationships to allow calculations that involve columns in different tables. Understanding when and how to create relationships in your model will help you realize your goals when analyzing your business data.
You can add DAX calculations to your model in two ways. The first, calculated columns, allows you to add expressions that define a new column in an existing table. You can refer to columns in the same table in your calculation, and the execution engine uses the value of the column in the current row as the calculation being evaluated. Another way to add calculations to your model is through what are called measures. Measures are calculations that are not done within the context of a table row. Rather, they are evaluated in the contents of the particular cell whose value they are being asked to provide.
Calculated columns and measures are explained in more detail in the following sections.
Similar to performing calculations on an Excel worksheet, PowerPivot allows you to create calculated columns and measures within the PowerPivot window using DAX functions. DAX functions are grouped into eight major categories:
Calculated columns are useful when you need particular data values in your tables but those values are not in the data you have imported. It could be that you want to format data to display in a certain way in your client application. Or you may want to analyze a value that could be calculated from data values that are in the table. Or you may need a calculated column for some other purpose. The following steps show you how to add calculated columns to your model:
=RIGHT(" " & Format([MonthNumberOfYear],"#0"),2) & " - " & [EnglishMonthName]
You can verify the formula is correctly specified based on the values populating the calculated column (for example, “1 - January”).
=[SalesAmount] - [TotalProductCost]
You now know how to create calculated columns in PowerPivot. The columns you added are available in reporting client tools the same as any column that came from the source date. In this way, calculated columns allow you to customize the data for your model beyond the data you imported. This is a powerful capability of tabular mode.
Measures, like calculated columns, are defined by a DAX expression. Unlike calculated columns, they cannot refer to a particular column as a value unless an aggregation function has been applied to the column name. The values for measures are calculated on-the-fly at the time they are evaluated in the context that the value is being asked for. This section shows you how to work with measures.
In the PowerPivot window, you work with measures in an area called the Calculation Area. (In the tabular designer in SQL Server Data Tools, the Calculation Area is called the Measure Grid.) The Calculation Area is the grid area below the splitter bar in the lower half of the Data View grid, as shown in Figure 12.22.
Follow these steps to create a measure in your model using the AutoSum feature:
You have now used the AutoSum function to create two measures. As you learn more about DAX, you will be able to create your own DAX measures to do more sophisticated analysis actions.
As you work with your tabular model, you may want to work with your tabular model in a client tool to verify it. Chapter 13 shows you how to accomplish this with Power View. You can also explore the model directly in Excel using a PivotTable or PivotChart by using the PivotTable menu on the Home tab, as shown in Figure 12.25.
Figure 12.26 shows a PivotTable connected to the current model. Note that the measures you defined in the preceding section are available as values for analysis. The PivotTable shown in Figure 12.26 uses DimGeography.CountryRegionCode, FactInternetSales.DistinctSalesOrder, and FactInternetSales.InternetSalesAmount, which you created earlier.
This chapter gave you your first taste of working with PowerPivot for Excel, creating a tabular model. You learned about the commands available in the tabular designer and the two main model views, Data View and Diagram View. You walked through a simple scenario that showed the main components that make up a tabular model.
In the next chapter you'll learn about analyzing and visualizing your tabular models with Power View and creating exciting presentations in the process.
3.135.219.166