Fact-to-dimension relationships

To create the Data Model relationships identified in the Data Warehouse Bus Matrix image:

  1. Click Manage Relationships from the Modeling tab in Report View.
  2. From the Manage Relationships dialog, click the New command button at the bottom to open the Create relationship interface. Choose the fact table, such as Internet Sales, for the top table via the dropdown and then select the dimension table as shown in the following screenshot:
Creating a relationship for the import mode dataset

If the relationship columns have the same name, such as Currency Key in this example, Power BI will automatically select the columns to define the relationship. Almost all relationships will follow this Many to one(*:1) or fact-to-dimension pattern with the Cross-filter direction property set to Single and the relationship set to active.

The two columns used for defining each relationship should be of the same data type. In most relationships, both columns will be of the whole number data type as only a numeric value can be used with slowly changing dimensions. For example, a Product Key column could use the values 12, 17, and 27 to represent three time periods for a single product as certain attributes of the product changed over time.

Prior to the Mark as Date Table feature described in the previous chapter, a date column stored as a date data type was used for relationships in Power BI datasets as this enables the time intelligence functions of DAX to work correctly. Given this feature, however, whole number (integer) columns stored in YYYYMMDD format (for example, 20180225 for February 25th, 2018) can be used for fact-to-date table relationships in Power BI datasets. Details on utilizing this feature and other considerations for date dimension tables are included within the SQL views section of Chapter 2, Connecting to Sources and Transforming Data with M.

As more relationships are created, it can be helpful to switch to the Relationships view and move or organize the dimension tables around the fact table. Relationships view can make it clear when additional relationships need to be defined and can be useful in explaining the model to report authors and users.

Click OK to create the relationship and repeat this process to build the planned star schema relationships for both the Internet Sales and Reseller Sales fact tables, as shown in the following screenshot of Internet Sales:

Internet Sales relationships

All relationships from Internet Sales to a dimension table are active (solid line) except for two additional relationships to the Date dimension table. In this dataset, the Order Date is used as the active relationship, but two additional inactive (dotted line) relationships are created based on the Due Date and Ship Date columns of the fact table. DAX Measures can be created to invoke these alternative relationships via the USERELATIONSHIP() DAX function, as shown in the following example:

Internet Net Sales (Due Date) = 
CALCULATE([Internet Net Sales], USERELATIONSHIP('Internet Sales'[Due Date Key],'Date'[Date Key]))

Internet Net Sales (Ship Date) =
CALCULATE([Internet Net Sales],USERELATIONSHIP('Internet Sales'[Ship Date Key],'Date'[Date Key]))

The inactive relationships and their corresponding measures enable report visualizations based on a single-date dimension table, such as in the following table:

Measures with active and inactive relationships 

In this scenario, the Internet Net Sales measure uses the active relationship based on Order Date by default, but the other measures override this relationship via the CALCULATE() and USERELATIONSHIP() functions.

A common alternative approach to inactive relationships is to load additional date dimension tables and create active relationships for each additional date column on the fact table (for example, Due Date, Ship Date) to these tables. The columns for these additional date tables can be named to avoid confusion with other date columns (for example, Ship Date Calendar Year) and some teams or organizations are more comfortable with table relationships than DAX Measures. Additionally, this design allows for intuitive matrix-style visualizations with two separate date dimensions (Ship Date, Order Date) on the x and y axis filtering a single measure via active relationships.

For DirectQuery datasets, the Assume referential integrity relationship property is critical for performance as this determines whether inner- or outer-join SQL statements are generated to resolve report queries. When enabled, as shown in the following screenshot, inner-join SQL queries will be passed to the source system when report queries require columns or logic from both tables of the relationship:

Assume referential integrity

If Assume referential integrity is not enabled, outer-join SQL queries will be generated to ensure that all necessary rows from the fact table or many sides of the relationship are retrieved to resolve the report query. The query optimizers within supported DirectQuery sources, such as SQL Server and Oracle, are able to produce much more efficient query execution plans when presented with inner-join SQL statements. Of course, improved performance is of no value if the outer join is necessary to return the correct results, thus it's essential for referential integrity violations in the source system to be addressed.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.227.72.212