Columnstore and HTAP

Business intelligence queries generated from tools such as Power BI are more suited for columnar data stores and most DirectQuery source systems offer a columnar feature to deliver improved query performance. For Microsoft SQL Server, the columnstore index is recommended for large fact tables and this index eliminates the need to maintain traditional B-tree indexes or to apply row or page compression. Additionally, a combination of non-clustered columnstore indexes and in-memory table technologies can be used to support hybrid transactional and analytical processing (HTAP) workloads. For example, the Power BI queries against the DirectQuery dataset would utilize the columnstore index without impacting the OLTP workload of the database.

The details of these features and configurations are outside the scope of this book but at a minimum the owners or experts on the DirectQuery data source should be engaged on the performance of the Power BI dataset. The following URL provides guidance on designing columnstore indexes for SQL Server database services (for example, Azure SQL Database, Azure SQL Data Warehouse) and on-premises SQL Server database environments: http://bit.ly/2EQon0q.

The Related Tasks section of the Columnstore indexes – Design guidance documentation referenced in the preceding URL contains links for the T-SQL DDL statements associated with implementing the columnstore index. In most scenarios, the dataset designer in a Power BI project or the author of an Analysis Services model is not responsible or authorized to optimize data sources such as with the columnstore index. However, the dataset designer can regularly collaborate with this subject matter expert or team as the demands and requirements of the dataset change. For example, the dataset designer can use tools, such as DAX Studio and SQL Server Profiler as described in the Microsoft Power BI Cookbook (https://www.packtpub.com/big-data-and-business-intelligence/microsoft-power-bi-cookbook), to capture the common or important SQL queries generated by Power BI reports and then share this information with the data warehouse team.

Alternatively, the database or data warehouse team can run a trace against a data source system per the DirectQuery report execution section of Chapter 2Connecting to Sources and Transforming Data with M, during a test query workload from Power BI. This trace data could be used to identify the specific columns, tables, or expressions associated with slow queries and thus inform database modification decisions.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.222.164.141