Query design per dataset mode

Many common M queries can be written for both import and DirectQuery datasets, but with widely different implications for the source system resources utilized and the performance of the analytical queries from Power BI. It's essential that the mode of the dataset (import or DirectQuery) has been determined in advance of the development of the data access queries and that this decision is reflected in the M queries of the dataset.

The M queries supporting a Power BI dataset import mode should exclude, or possibly split, columns with many unique values, such as a Transaction Number column, as these columns consume relatively high levels of memory. A standard design technique for import mode models is to exclude derived fact table columns with relatively more unique values when these values can be computed via simple DAX measure expressions based on columns of the same table with fewer unique values.

In the following example, the SUMX() DAX function is used to compute the Sales Amount measure based on the Order Quantity and Unit Price columns of the Internet Sales fact table, thus avoiding the need to import the Sales Amount column:

Internet Sales Amount (Import) = 
SUMX('Internet Sales','Internet Sales'[Order Quantity]*'Internet Sales'[Unit Price]) 

Internet Sales Amount (DirectQuery) = 
SUM('Internet Sales'[Sales Amount])

As per the second measure, the Sales Amount column would be included in a DirectQuery data model and the DAX measure for the sales amount would exclusively utilize this column to generate a more efficient SQL query for the data source.

The import mode model is able to efficiently compute similar SUMX() expressions at scale with basic arithmetic operators (+, -, *, /) as these operations are supported by the multithreaded storage engine of the xVelocity in-memory analytics engine. For greater detail on DAX measures for import and DirectQuery datasets, see Chapter 4, Developing DAX Measures and Security Roles.

The M queries supporting a DirectQuery dataset should generally contain minimal to no transformation logic as the complexity of the resulting SQL statement may negatively impact the performance of Power BI report queries as well as increase the resource usage of the data source.

This is especially important for the fact tables and any large dimension tables of the DirectQuery dataset. Given the central role of the data source for query performance and scalability of DirectQuery solutions, the Power BI dataset designer should closely collaborate with the data source owner or subject matter expert, such as a database administrator, to make the best use of available source system resources.

As noted in the To get the most out of this book section of the Preface, an AdventureWorks data warehouse sample database (AdventureWorksDW2016CTP3) hosted on a local instance of the SQL Server 2017 database engine was the primary data source for the examples in this book. The PBIX files included in the code bundle reference ATLAS as the name of the database server and AdventureWorksDW as the name of the database. Therefore, any attempt to refresh the queries within these PBIX files or create new queries against this data source will return errors as the user doesn't have access to this source.

Additionally, certain objects of the AdventureWorksDW database used in this book such as views are not included in the downloadable sample database. For this reason, the exact results depicted in this book cannot be perfectly reproduced via a SQL Server 2017 (or later) database instance and the sample database alone. Moreover, the code examples in the book are intended to highlight essential concepts and use cases. The corresponding code included in the code bundle may, for example, include additional columns not referenced in the book as these columns weren't essential to the given example.

Table of Contents for Query design per dataset mode

Create new playlist

Sign In

Sign Up

Table of Contents for
Query design per dataset mode