Product dimension integration

The SQL view for the product dimension referenced earlier in this chapter contained the following four operations:

  1. Join the Product, ProductSubcategory, and ProductCategory dimension tables into a single query
  2. Create a custom product category group column (for example, Bikes versus Non-Bikes)
  3. Apply report-friendly column names with spaces and proper casing
  4. Replace any null values in the Product Subcategory and Product Category columns with the 'Undefined' value

Like almost all operations available to SQL SELECT queries, the same query can also be created via M functions. If the SQL view for the product dimension cannot be created within the data source, the following M query produces the same results:

let
Source = AdWorksSQLServer,
//Product Dimension Table Views
Product = Source{[Schema = "BI", Item = "vDim_Products"]}[Data],
ProductSubCat = Source{[Schema = "BI", Item = "vDim_ProductSubcategory"]}[Data],
ProductCat = Source{[Schema = "BI", Item = "vDim_ProductCategory"]}[Data],

//Product Outer Joins
ProductJoinSubCat = Table.NestedJoin(Product,"ProductSubcategoryKey",ProductSubCat,"ProductSubcategoryKey","ProductSubCatTableCol",JoinKind.LeftOuter),
ProductJoinSubCatCol = Table.ExpandTableColumn(ProductJoinSubCat,"ProductSubCatTableCol",{"EnglishProductSubcategoryName","ProductCategoryKey"},{"Product Subcategory", "ProductCategoryKey"}),

ProductJoinCat = Table.NestedJoin(ProductJoinSubCatCol,"ProductCategoryKey",ProductCat,"ProductCategoryKey","ProductCatTableCol",JoinKind.LeftOuter),
ProductJoinCatCol = Table.ExpandTableColumn(ProductJoinCat,"ProductCatTableCol",{"EnglishProductCategoryName"},{"Product Category"}),

//Select and Rename Columns
ProductDimCols = Table.SelectColumns(ProductJoinCatCol,{"ProductKey","ProductAlternateKey","EnglishProductName","Product Subcategory","Product Category"}),
ProductDimRenameCols = Table.RenameColumns(ProductDimCols,{
{"ProductKey", "Product Key"},{"ProductAlternateKey","Product Alternate Key"},{"EnglishProductName","Product Name"}
}),

//Product Category Group Column
ProductCatGroupCol = Table.AddColumn(ProductDimRenameCols,"Product Category Group", each
if [Product Category] = "Bikes" then "Bikes"
else if [Product Category] = null then "Undefined"
else "Non-Bikes"
,type text),

//Remove Null Values
UndefinedCatAndSubcat = Table.ReplaceValue(ProductCatGroupCol,null,"Undefined",Replacer.ReplaceValue,{"Product Subcategory","Product Category"})
in
UndefinedCatAndSubcat

The three product dimension tables in the dbo schema of the data warehouse are referenced from the AdWorksSQLServer staging query described earlier in this chapter.

The Table.NestedJoin() function is used to execute the equivalent of the LEFT JOIN operations from the SQL View, and the Table.ExpandTableColumn() function extracts and renames the required Product Subcategory and Product Category columns. Following the selection and renaming of columns, the Product Category group column is created via a conditional expression within the Table.AddColumn() function. Finally, the Table.ReplaceValue() function replaces any null values in the Product Category and Product Subcategory columns with the 'Undefined' text string. The Power Query Editor provides a preview of the results: 

Power Query Editor preview of Product M Query

Despite the additional steps and complexity of this query relative to the previous M query examples (trailing three years filter, Customer History Segment column), the entire query is translated into a single SQL statement and executed by the source SQL Server database. The View Native Query option in the Applied Steps pane of the Power Query Editor reveals the specific syntax of the SQL statement generated via query folding:

Part of Native Query generated from Product M Query
Note that a dedicated SQL view object in the BI schema (for example, BI.vDim_ProductSubcategory) is accessed for each of the three product dimension tables. Per the SQL views section earlier in this chapter, it's recommended to always access SQL views from Power BI datasets, as this declares a dependency with the source tables.

Note that the Table.Join() function could not be used in this scenario given the requirement for a left outer join and the presence of common column names. With a left outer join, the presence of common column names, such as ProductSubcategoryKey or ProductCategoryKey, for the tables in the join operation would cause an error. Additionally, although a left outer join is the default behavior of the Table.NestedJoin() function, it's recommended to explicitly specify the join kind (for example, JoinKind.Inner, JoinKind.LeftOuter, JoinKind.LeftAnti) as per the ProductJoinSubCat and ProductJoinCat variables of the M query. 

Whenever any unstructured or business-user-owned data sources are used as sources for a Power BI dataset, it's usually appropriate to implement additional data quality and error-handling logic within the M query. For example, a step that invokes the Table.Distinct() function could be added to the Sales and Margin Plan query that retrieves from the Excel workbook to remove any duplicate rows. Additionally, the third parameter of the Table.SelectColumns() function (for example, MissingField.UseNull) can be used to account for scenarios in which source columns have been renamed or removed. 
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.128.200.220