Data in a transactional format is not very useful for spotting trends or identifying outliers in any reasonably sized set of information. Taking data and summarizing it so that a reader can more easily absorb a large amount of information is a highly important last step in the data analysis journey. Recently, as business users have started to experience a deluge of data, trends have been moving toward more visual means of presenting data as a starting point in analysis. A lot of focus has been put on building dashboards and scorecards that enable a user to start analysis at a highly aggregated entry point, and can then be guided through the process of drilling down to details as more “why” questions are asked.
This chapter covers the different ways to present data for end-user consumption. You start by looking at PivotTables to summarize data in a tabular format that enables the classic slicing and dicing of data. You then move to visualizing data using PivotCharts. And finally, in Skill 3-3 we look at the ever-growing and improving integration points between Excel and Power BI for both consuming and presenting data. Power BI offers some tremendous ways to share the work done in an Excel workbook that enables you to get around the traditional sharing challenges that have existed with Excel for some time now.
Skills in this chapter:
PivotTables have long been a staple in the Excel analysts’ toolkit. With the advent of the Excel Data model, the ability to navigate over huge datasets has been made possible to extend the ability for a business user to quickly discover insights in their datasets to enable better decision making. And as we have seen, the Data Model also enables users to centralize logic in the form of Calculated Columns, Measures, Hierarchies, Formatting properties, etc. This enables reuse and consistency, which improves the quality and reliability of data products.
In this section, we take a deeper look at the many facets of a PivotTable that can be configured to help improve presentation and ultimately usability. There are many ways to format and interact with the PivotTable to influence this.
When you create PivotTables, you should always be thinking about what you would like your data product to look like once it is in your users’ hands. As a designer, you have numerous options for formatting a PivotTable so that it is in the most usable state possible for users. Some formatting options can either be applied to the Data Model or made local to the PivotTable, and others can only be applied to the PivotTable. These will be highlighted as we move through the various options.
To begin looking at the options available, create a PivotTable so that you have something to format. There are two ways to create a PivotTable. The first is from within Excel, and the second is from within PowerPivot. The quickest way is to do it from within PowerPivot. If you do it there, Excel detects that you want to base the PivotTable in the Data Model, so you save some configuration options.
As a note, if you are basing the PivotTable on the Excel Data Model, you cannot use the Recommend A PivotTable functionality that is available from within Excel because it is not available when using the Data Model.
Let’s first start by creating an empty PivotTable. To do this, perform the following steps:
Open the Chapter 3CH03 Skill 3-1 Formatting.xlsx workbook.
On the Insert tab, in the Tables group, click PivotTable. You will be presented with the Create PivotTable dialog as in Figure 3-1.
Make the following selections below:
In the Choose The Data That You Want To Analyze section, select Use This Workbook’s Data Model.
In the Choose Where You Want The Pivottable To Be Placed section, select New Worksheet.
Click OK when complete.
Save the workbook and leave it open for the next demo.
Alternatively, if you are already in Power Pivot, you can use the following navigation path:
On the Power Pivot tab, in the Data Model group, click Manage. This will open the Power Pivot window.
On the Home tab, click PivotTable. This option will not present you the Create PivotTable dialog box from Figure 3-1 because it assumes you want to use the Data Model as the basis for the PivotTable. You will, however, be asked if you want to create the PivotTable in a New or Existing Worksheet.
Using either method above for adding a new PivotTable to the existing worksheet will show you Figure 3-2. The PivotTable Fields pane is where you will add and configure many of the elements that you add to a PivotTable. It will appear whenever you click anywhere in the PivotTable. When you click out of the PivotTable, it will disappear since it is only relevant when the PivotTable is open. If you are in the PivotTable and do not want it on, you can toggle it on or off by doing the following. Under the PivotTable Tools, on the Analyze tab, in the Show group, click Field List to toggle it on or off.
The areas of interest within the PivotTable Fields list are shown in this list, and the numbers correspond to the highlighted number areas in Figure 3-2.
PivotTable This is the area where data will be presented as you add items from the other parts of the PivotTable Fields pane.
Table Status This allows you to show either All the tables in the data model or only the ones that are Active in the PivotTable. All is desirable when building a report initially, but after you have it defined the way you want it, you can choose Active, which will only show the tables that are used in the Pivot. Optionally, you can Pin a table to the active tab, even if it’s not currently being used, by right-clicking the table and selecting the Show In Active Tab option from the context menu. Once it is in the active tab you can remove it by doing the reverse of the previous step. Take note that you cannot remove a table from the active tab using this technique if it is being used in the PivotTable.
Search Some data models, no matter how well-designed, can on occasion become large. In these instances, you can use the search function to find the field that you are looking for using the names that are assigned in the Data Model.
Table List What appears in this window depends on the option that you have selected in the highlighted area shown in Figure 3-2. It will list tables that are in the Data Model if you have them All selected or only those in use if you have Active selected.
Filters area Fields here are added to the PivotTable as a Report Filter. This option will be discussed in-depth in the section on Filtering data.
Columns area Fields here are added to the columns axis as labels on the PivotTable. You have a great deal of flexibility, which will be discussed shortly.
Rows area Fields that are added here are added to the rows axis as labels on the PivotTable. As with columns, you have a great deal of flexibility, which will be discussed shortly.
Values area Fields that you add here are the things that you are trying to measure when using a PivotTable. You either use Explicit Measures that you created in the Data Model, or you can choose to add any fields here, and the PivotTable will choose the aggregation method that makes the most sense. If a default aggregation has been set in the Data Model, it will use that. You do have the option to override this within the PivotTable.
Defer Layout Update When you add a PivotTable to a worksheet and begin to interact with it, the values are constantly being refreshed. With a Data Model that has lots of data, if you have an under-resourced machine, these updates can take time. If you know the structure of the PivotTable that you want to build, you can build it from the model definition and then manually refresh the data when you choose to.
PivotTable Tools When you have a PivotTable highlighted, you will get two new context sensitive tabs in the ribbon. These are the Analyze and Design tabs, which are used to perform customizations on your PivotTable.
Now that you have had a walkthrough of the PivotTable Field pane, it is time to add to the empty PivotTable you created earlier so that you can see what can be formatted. There are many options available for formatting, and demos are best to show how to improve your model when you are ready for production consumption.
Open the Chapter 3CH03 Skill 3-1 Formatting.xlsx workbook, which you previously worked on and saved.
Using the empty PivotTable shown in Figure 3-2, add the following columns to the Fields pane:
Columns area CalendarYear from DimDate.
Rows area SalesTerritoryCountry and SalesTerritoryRegion from DimSalesTerritory. Ensure that SalesTerritoryCountry is listed first so that they nest properly.
Values Freight from FactInternetSales.
The first thing is to name your PivotTable something meaningful. This will be helpful when you start working with multiple PivotTables, Slicers, Timelines, and PivotCharts in a Dashboard-like scenario. This enables you to easily find the object you are looking for when connecting objects together.
With the Pivot table highlighted, under PivotTable Tools, on the Analyze tab, in the PivotTable group, click on the PivotTable Name text box. Change the name from PivotTable1 to PTFreightByRegionAndYear.
Your unformatted PivotTable should look like Figure 3-3. When you are done, save your work because you will need the PivotTable in this form for the next exercises. Between now and your next exercise, you will be experimenting with different layout, styling, and formatting options.
When in a PivotTable, Under PivotTable Tools, on the Design tab, you can see the formatting options as shown in Figure 3-4. In the Layout group, notice that you can control how Subtotals and Grand Totals are displayed, in addition to controlling the overall Report Layout, including the ability to insert blank rows between groups. Finally, there are additional options for styling the PivotTable through the style options and PivotTable Styles groupings, and each will be covered shortly. The styling options that are discussed in this section are made locally to the PivotTable and cannot be controlled by the Data Model itself.
In the Layout group, you can configure how Grand Totals and Subtotals are displayed in the PivotTable. To control how Subtotals are displayed, you can experiment with the following steps.
Under PivotTable Tools, on the Design tab, in the Layout group, click Subtotals.
From the command drop-down, you can select from the options list below. Experiment with the options below to view their effects on the PivotTable display.
Do Not Show Subtotals
Show all Subtotals at Bottom of Group
Show all Subtotals at Top of Group
Include Filtered Items in Totals
Next, to control how Grand Totals are displayed, you can experiment with the following steps.
Under PivotTable Tools, on the Design tab, in the Layout group, click Grand Totals.
From the command drop-down, you can select from the options list below. Experiment with these options to view their effects on the PivotTable display.
Off for Rows and Columns
On for Rows and Columns
On for Rows Only
On for Columns Only
There is no need to save these steps.
Grand Totals can also be configured by performing the following actions:
Under PivotTable Tools, on the Analyze tab, in the PivotTable group, click Options.
In the PivotTable Options dialog, click the Totals & Filters tab. Then in the Grand Totals section you can choose to turn Grand Total off for rows and/or columns.
Once you have your PivotTable created, you may want to change the layout of the actual Pivot table based on your users’ needs. There are three different Report Layout Forms that you can use to display a PivotTable. They are listed and described below:
Compact Form This is the default behavior for a PivotTable and can be seen in Figure 3-3. It displays items from different row area fields in one column and uses indentation to differentiate between fields. This takes up less space, which leaves more room for numeric data. Expand and Collapse buttons are displayed (which can also be turned off) so that you can display or hide details.
Tabular Form This displays one column per field, and fields are kept at the same level across the display. This is shown in Figure 3-5.
Outline Form This is like tabular, but it displays subtotals at the top of every group because items in the next column are displayed one row below the current item. Figure 3-6 shows this format.
Often it is helpful to be able to have Blank Rows between groups of data in a PivotTable to help draw emphasis to the data. To do this, you can use the Blank Rows command to toggle the feature on and off. To do this, under PivotTable Tools, on the Design tab, in the Layout group, click Blank Rows after each item.
The PivotTable Style Options group in the Design Tab deals with styling the PivotTable with headers and banding. The following options can be toggled on and off based on preferences. Experiment with each of these to see their effects on the PivotTable.
Row Headers Highlights the row headers.
Column Headers Highlights the columns headers.
Banded Rows Shades every other row to make it easier to distinguish one row from another.
Banded Columns Shades every other column to make it easier to distinguish one column from another.
The final grouping option for formatting the PivotTable on the Design tab is PivotTable Styles group. Here you can change the coloring, formatting, and borders using the predefined Light, Medium, and Dark styles. In addition, you may create your own New PivotTable style as you determine an appropriate corporate standard. Once again, put your PivotTable in focus and try some of the styles out.
The following are a series of general commands that you can apply to your PivotTables. The styling options that are discussed in this section are made locally to the PivotTable and cannot be controlled by the Data Model itself.
Once you have a PivotTable formatted, you may wish to clear all of the formatting that you applied and start the process again. Under PivotTable Tools, on the Analyze tab, in the Actions group, click the Clear drop-down. From here you have two options: Clear All and Clear Filters.
The Clear All command resets your PivotTable, but does not delete it. The data connection, placement, and cache remain the same. If you have a PivotChart associated with the PivotTable, Clear All also removes related PivotChart fields, chart customizations, and formatting.
The other option is to use Clear Filters if you have any filters applied to the PivotTable.
To Move your PivotTable to a new or existing worksheet, you can do the following. Under PivotTable Tools, on the Analyze tab, in the Actions group, click the Move PivotTable. You may need to do this in an instance where you are arranging a series of Dashboards and the flow needs to change within the workbook.
The freshness of data is always important to manage. You can manually refresh your data for all sources or a single PivotTable. If you want to refresh all the sources in the workbook you use the Refresh All command. If you only want to refresh the connections of the current PivotTable, you choose Refresh while the PivotTable that you want to refresh is selected. Both commands can be found under the Analyze > Data > Refresh drop-down. Alternatively, you can set the PivotTable to Refresh data automatically when opening the workbook. This can be done by doing the following:
Under PivotTable Tools > on the Analyze tab > in the PivotTable group > click
PivotTable Options.
In the PivotTable Options dialog, click the Data tab, and select Refresh Data When Opening the File.
When performing refreshes, it is also advisable to review how you want the layout and formatting to respond. This can be done by the following:
Under PivotTable Tools, on the Analyze tab, in the PivotTable group, click PivotTable Options.
In the PivotTable Options dialog, click the Layout & Formatting tab. Review the following two options:
Autofit column widths on update
Preserve cell formatting on update
If the source of your PivotTable needs to change, you can do so by editing the Connection Properties. To change these settings, do the following. Under PivotTable Tools, on the Analyze tab, in the Data group, click Change Data Source and select Connection Properties.
The last General commands you will review are used to change what you see when working with a PivotTable. Under PivotTable Tools, on the Analyze tab, in the Show group, you will see three options as in Figure 3-7. By clicking any of the items in the Show group, the corresponding number that matches within the PivotTable will react to the click. These are also described below and correspond to the numbers in Figure 3-7.
Field List Will toggle the PivotTable Field list on and off.
+/- Buttons Will turn the expand and collapse buttons on or off, should you have data in your PivotTable that enabled them.
Field Headers Turns the Field headers on and off in the PivotTable.
Should you want to format the values that appear in the PivotTable, you have the option of making changes to the Field Settings on both the Rows and Column areas and the Value Setting on the Values area.
First let’s look at the Values settings, in the Values area of the PivotTable Field pane.
With the PivotTable selected, click the drop-down arrow on the right side of the Sum of Freight column in the Values area of the PivotTable Fields pane.
Choose Value Fields Settings.
You will be presented with the Value Fields Settings dialog box as in Figure 3-8.
The first thing that you notice in the Value Fields Settings dialog is the Source Name for the column that you are evaluating. This is important to know when debugging calculation issues, especially if you are using a custom name in your reporting layer.
Users will often want to use a Custom Name for display purposes that may not match the source name. To change the name, you can type in a name of your choice in the text box.
Also notice that you have several ways to summarize the field you are working with on the Summarize Values By tab. These options are summarized and described in Table 3-1.
Function |
Summarizes |
Sum |
The sum of the values. This is the default function for numeric values. |
Count |
The number of values. The Count summary function works the same as the COUNTA worksheet function. Count is the default function for values other than numbers. |
Distinct Count |
The number of distinct values. |
Average |
The average of the values. |
Max |
The largest value. |
Min |
The smallest value. |
StDev |
An estimate of the standard deviation of a population, where the sample is a subset of the entire population. |
StDevp |
The standard deviation of a population, where the population is all of the values to be summarized. |
Var |
An estimate of the variance of a population, where the sample is a subset of the entire population. |
Varp |
The variance of a population, where the population is all the values to be summarized. |
You can also format the actual number by clicking the Number Format button in Figure 3-8. This will give you the standard ways to format data as you would have seen in Excel.
The Show Values As tab in the Value Field Settings dialog lets you choose how you want to display a field’s value relative to others in the PivotTable. The default value for this is no calculation, which will just show the summarized value for that given Row and Column intersection point.
Next, you will look at the options to format the Rows or Columns with the Field Settings option. Perform the following:
Close the dialog boxes from Figure 3-8 if you still have them open.
With the PivotTable selected, click the drop-down arrow to the right of CalendarYear in the Columns area of the PivotTable Fields pane.
Choose Field Settings in the context menu. You will be presented with Figure 3-9.
In the Subtotals & Filters tab of the Fields Settings dialog, you can turn off subtotals for that Field or choose to use the Automatic value set at the PivotTable, as described earlier in this section. On this tab, you also have the option to include new items that come into the data set in any manual filters that have been set. This can be useful when you have a report set to filter data the way that you want, but the underlying values are changing, and you do not want to have them flow into the report.
In the Layout & Print tab of the Fields Settings dialog, you can change the following items:
Layout
Show item labels in outline form
Display labels from the next field in the same column (compact form)
Display subtotals at the top of each group
Show item labels in tabular form
Repeat item labels
Insert blank line after each item
Insert page break after each item
Let’s now go through a demo of Formatting a PivotTable.
Open the Chapter 3CH03 Skill 3-1 Formatting.xlsx workbook, which you previously worked on and saved.
In the Field Settings for SalesTerritoryCountry do the following:
Insert blank line after each item label in the Layout & Print options.
Modify the Custom Name by changing the default value of SalesTerritoryCountry to Sales Territory Country. Click OK when complete.
In the Values Field Settings for Sum of Freight do the following and then click OK when complete.
Format the numeric values by modifying the format of the Freight values. Click the Number format button and then on the Format Cells dialog select Currency which is under the Category grouping. Leave all other selections as defaults.
Rename the Custom Name value from Sum of Freight to Average Freight.
Change the summarization of the field from sum to average by selecting Average from the Summarize Values By tab.
Turn off the Field Headers. Under PivotTable Tools, on the Analyze tab, in the Show group, click Field Headers.
Now let’s change the Report Layout to Tabular from the default of Compact. Under PivotTable Tools, on the Design tab, in the Layout group, click Report Layout and select Show in Tabular Form.
In the rows bucket on the Pivot Table, swap the SalesTerritoryGroup and SalesTerritoryRegion.
Once complete, save your model, which should look like Figure 3-10.
Now that you have looked at some of the formatting options for the PivotTable itself, it is time to review implementing these changes at the Data Model level. As mentioned earlier, making formatting changes as we did directly in the PivotTable are local to that PivotTable and would need to be repeated for each PivotTable that you create that needs similar formatting. This is normal for some formatting options that are only available at the PivotTable level, but other formatting options should be centralized in the Data Model. Some of the things that you centralize will be demonstrated via an exercise where you will do the following:
Remove Columns from Data Model Remove those that are not useful in analysis or the Data Model.
Hide Table and Columns Hide those that are not useful in analysis but needed for the Data Model.
Naming Conventions Provide intuitive naming conventions for tables and columns.
Table and Column Descriptions For ease of maintenance, add table and column descriptions to each.
Data Types Choose the correct data type for columns.
Columns Formats Choose the appropriate formats for the values in a column.
Create Explicit Measures This will be done in the next section.
Column Sorting Enables sorting of columns by using the values in another column.
Data Categorization Adds extra metadata to the columns that helps reporting tools better interpret data values.
Set Summarize By Property This applies a default summary method for a column so that when it is used in a PivotTable, the proper summarization method is chosen by the PivotTable.
Each of these was explained in detail in Skill 2.1. In the following demos, we will drive these changes back to the Data Model so they are globally available to all PivotTables and PivotCharts that use the Data Model. We will go through each of the options above to prepare our model for general use.
Open the Chapter 3CH03 Skill 3-1 Optimizing.xlsx workbook.
Look at the Data Model and note that it is very rudimentary and has not been optimized. You should notice the tasks we outlined above have not been applied to this Data Model. These steps are usually completed as part of modeling, but we will perform them here to emphasize the process of optimization.
The first thing is to Remove columns from the Data Model that are not needed for Analysis to support the Data Model. From Power Pivot, ensure that you have the table you want to modify highlighted in either the Diagram View or Data View. Navigate to Design, and then click Table Properties to open the Edit Table Properties dialog as shown in Figure 3-11. From here, you exclude columns that you don’t want imported into the Data Model by Unchecking the Check box next to each column name that needs to be excluded. Follow this entire procedure for each of the table and column pairings that are listed in Table 3-2, and click Save after you have completed each table. Note that you could have performed these steps when you did your initial import, per Skill 2.1.
Table |
Column to Remove |
FactInternetSales |
DueDate ShipDate |
DimCustomer |
SpanishEducation FrenchEducation SpanishOccupation FrenchOccupation |
DimDate |
SpanishDayNameOfWeek FrenchDayNameOfWeek SpanishMonthName FrenchMonthName |
DimProduct |
SpanishProductName FrenchProductName LargePhoto FrenchDescription ChineseDescription ArabicDescription HebrewDescription ThaiDescription GermanDescription JapaneseDescription TurkishDescription StartDate EndDate Status |
DimPromotion |
SpanishPromotionName FrenchPromotionName SpanishPromotionType FrenchPromotionType SpanishPromotionCategory FrenchPromotionCategory |
Now let’s hide the ProductCategory and ProductSubcategory tables from client tools because they have no use in reporting. The columns that are needed from these two tables to support the H_Product Hierarchy have been brought into the Product table using the DAX RELATED function. To hide a table in Diagram View, right-click on the DimProductCategory table, and choose Hide from Client Tools from the context menu. Repeat this procedure for DimProductSubcategory. You will notice that the tables are still visible but will become grayed-out in the diagram, which signifies that they cannot be seen in client tools but are still usable when performing modeling. If you now navigate back to the PivotTable, you will see that they no longer appear in the Table List on the PivotTable Fields pane. You do not need to do this, but to allow the tables to be seen again, you can go back to the table and choose Unhide from Client Tools. As a note, this can also be done from the Data View.
Now you will hide unneeded columns from each table from client tools. In Diagram View, find all fields that are listed in Table 3-3 and hide these from client tools. In Diagram View, right-click on each of the fields and choose Hide From Client Tools. Once you are done, they will become grayed-out in the model, which signifies that they cannot be seen in client tools, but are visible and usable when forming modeling. As a note, you can choose multiple fields at once using the standard Windows multiselect commands. This can also be performed from the Data View, but is far easier to do in Diagram View.
Table |
Columns |
FactInternetSales |
ProductKey OrderDateKey DueDateKey ShipDateKey CustomerKey PromotionKey CurencyKey SalesTerritoryKey |
DimCustomer |
CustomerKey GeographyKey |
DimPromotion |
PromotionKey PromotionAlternateKey |
DimCurrency |
CurrencyKey CurencyAlternateKey |
DimSalesTerritory |
SalesTerritoryKey SalesTerritoryAlternateKey |
DimDate |
DateKey |
DimProduct |
ProductKey ProductSubcategoryKey |
Now let’s give the tables in the model friendlier names by renaming them in the Data Model. Note that if the tables had been imported using the Query Editor, you would need to perform these operations back in the Query Editor. In this demo, our tables were imported using PowerPivot so you can perform this as directed without error. If you had tried to rename tables that were imported using the Query Editor, you would receive the Error Message as shown in Figure 3-12.
In our case, you can right-click on the Table Name in either the Diagram View or Data View and then choose Rename from the context menu. When done, hit Enter so that the change takes effect. Do this for all the tables listed in Table 3-4.
Original Table Name |
New Table Name |
FactInternetSales |
InternetSales |
DimCurrency |
Currency |
DimCustomer |
Customer |
DimDate |
Dates |
DimPromotion |
Promotion |
DimSalesTerritory |
SalesTerritory |
DimProduct |
Product |
Now let’s give some of the columns names in the model friendlier names by renaming them in the Data Model. Once again, if the tables underlying the columns had been imported using the Query Editor, you would need to perform these operations back in the Query Editor. In this demo, our tables (and hence columns) were imported using PowerPivot, so you can perform this as directed without error. If you had tried to rename columns that were imported using the Query Editor, you would receive the same error message shown in Figure 3-12.
In our case, in either the Diagram View or Data View, you can right-click on the Column Name that you want to change and then choose Rename from the context menu. When done, hit Enter so that the change takes effect. Do this for all the columns listed in Table 3-5.
Table |
Original Column Name |
New Column Name |
Customer |
EnglishEducation |
Education |
EnglishOccupation |
Occupation |
|
Dates |
EnglishDayNameOfWeek |
DayNameOfWeek |
EnglishMonthName |
MonthName |
|
Product |
EnglishDescription |
Description |
EnglishProductName |
ProductName |
|
Promotion |
EnglishPromotionName |
PromotionName |
EnglishPromotionType |
PromotionType |
|
EnglishPromotionCategory |
PromotionCategory |
Let’s now examine the data types in the model and choose the correct data type for each column. The only column that does not have the correct data type (for this demo) is the DiscountAmount in InternetSales. Change this to currency by ensuring the DiscountAmount field is highlighted in the Data View and then navigate to Home > Formatting > Data Type and choose Currency from the drop-down list. You may get the Data might be lost error message as in Figure 3-13. If so, read the warning and understand what it is telling you and then click Yes to continue.
Format the values in the UnitPriceDiscountPct column as a %. To do this, highlight the UnitPriceDiscountPct column in the Data View and navigate to Home > Formatting > Format and choose Percentage from the drop-down box.
If you require Explicit Measures to be created, you can do them here, but our model has enough in the way of Explicit measures. Notice that we have an Average Freight measure that we will format later. This can be used in place of the implicit measure named Average Freight that we have in the PivotTable.
Set Column Sorting on the MonthName column. In Data View, highlight the MonthName column in the Dates table. Now navigate to Home > Sort and Filter > and click Sort by Column. In the Sort by Column dialog, ensure the Sort Column drop-down box has MonthName selected, and then in the By Column drop-down box choose MonthNumberOfYear. Click OK when complete.
Data Categorization adds extra metadata to columns of a certain type. Let’s set two columns up to demonstrate. In Data View, ensure the SalesTerritory table is highlighted and then ensure the SalesTerritoryCountry column is highlighted within the table. Navigate to Advanced > Reporting Properties > Data Category > and choose Country/Region. It is likely that the Data Model has suggested that category for you already. Repeat the previous steps for SalesTerritoryRegion and choose Continent as its category. Setting these properties will help reporting tools understand that the text values in each column are not just any text, but that they represent geographical attributes.
And finally, set the Summarize By Property for our metrics. For each of the columns in Table 3-6, follow this procedure. Highlight the column in the InternetSales table and then navigate to Advanced > Summarized By, and choose the value in the drop-down box that corresponds to the Summarize By value in Table 3-6. Note that some of the values that the Data Model thought could be summarized, like the SalesOrderLineNumber, should never be summarized because it makes no sense to add up these values.
Table |
Column |
Summarize By |
InternetSales |
SalesOrderLineNumber |
Do not summarize |
RevisionNumber |
Do not summarize |
|
OrderQuantity |
Sum |
|
UnitPrice |
Sum |
|
ExtendedAmount |
Sum |
|
UnitPriceDiscountPct |
Do not summarize |
|
DiscountAmount |
Sum |
|
ProductStandardCost |
Do not summarize |
|
TotalProductCost |
Sum |
|
SalesAmount |
Sum |
|
TaxAmt |
Sum |
|
Freight |
Sum |
Go to the PivotTable and add the Average Freight Explicit Measure from the InternetSales table to the Values area. At this point, take note of the fact that it has not been formatted. As opposed to formatting right in the PivotTable, go back to the PowerPivot and proceed with the next step.
Let’s change the format of the measure from within Excel to demonstrate the different places where measures can be managed. Navigate to the Power Pivot tab, Calculations, Measures, and choose Manage Measures. In the Manage Measures dialog, choose Average Freight and then click Edit. This will open the Measure dialog box as shown in Figure 3-14.
In the Category selection at the bottom of the dialog, choose Currency and leave the Symbol and Decimal places values as their defaults. Click OK when complete.
If you now return to the PivotTable, you will see that the new formatting has propagated to the model. If you had applied formatting at the PivotTable level first, the formatting you added to the model would not override what you did in the PivotTable. Despite the fact the formatting has been centrally set, you can override it in the PivotTable when necessary.
Save your work.
Once you create a Measure, it is advisable to provide it with a format you intend users to use when reporting. Formatting is a display setting only and does not affect how the data is stored in the Data Model.
Once the measure is created, you have the option to change the formatting to one of:
Decimal Number
Whole Number
Percentage
Scientific
Currency
Date
TRUE/FALSE
Any Date formats that begin with an asterisk (*) will respond to changes in regional date time settings that are specified at the operating system level. To see which date formats are marked with an asterisk (*), do the following:
In Data View, ensure that the Dates table is highlighted and then highlight the FullDateAlternateKey column.
From the Data Model, navigate to Home > Formatting > Format, and choose More Formats, which will open the Formatting dialog shown in Figure 3-15
In the Category selection box, choose Date, which will then change the drop-down box to be a list of date formats.
In the drop-down, find a value with an asterisk next to it. The two options that are available are General and Short. In Figure 3-15 we have the General date format chosen.
As a reminder, you can also take advantage of the DAX FORMAT function to format a data field in an acceptable manner. To review how to achieve this, refer to Chapter 2 in the section on writing DAX queries where an example for the FORMAT function was presented. Do note that once formatted, numbers and dates become text strings and you can’t perform calculations on them.
Let’s take the measures in the Data Model and ensure that they are formatted properly.
Open the Chapter 3CH03 Skill 3-1 Optimizing.xlsx workbook you had been previously working with and navigate to the Data Model.
Format each of the measures in Table 3-7. Choose each Measure one by one in the
Calculation Area and then choose Home > Formatting > Format and apply each format as indicated.
Measure |
Home Table |
Format |
Most Recent Order Date |
InternetSales |
Date |
Average Sales Amount Rounded |
InternetSales |
Whole Number |
Average Freight |
InternetSales |
Currency |
Sales over one million dollars |
InternetSales |
TRUE/FALSE |
Total Sales |
InternetSales |
Currency |
Total Cost |
InternetSales |
Currency |
Profit Margin |
InternetSales |
Percentage |
Users often do not want to see an entire data set and are usually interested in smaller, more focused slices of data. If you want to narrow the data in a PivotTable based on criteria that has been specified, you can take advantage of filters. There are several ways to filter data that is contained within a PivotTable. You can use the following methods:
Filter data using the PivotTable
Add a PivotTable Filter
Add a Slicer
Add a Timeline
The first thing that we will do is add all of the filter types that we listed above to the Excel sheet so that we can demonstrate how to use them.
Open Chapter 3CH03 Skill 3-1 Filtering.xlsx workbook and ensure that you see the PivotTable that has been created for you.
With your cursor in the PivotTable, ensure that the PivotTable Fields pane is visible.
Next, from the DimCustomer table, drag the MaritalStatus column to the Filter area in the PivotTable Fields pane.
Now let’s create a slicer. With your cursor outside of the PivotTable, navigate to Insert > Filters, and click Slicer.
In the Existing Connections dialog, click the Data Model Tab and ensure that This Workbook Data Model is selected and click Open.
On the Insert Slicers dialog, click the All tab and then find the DimCustomer table and check the Gender checkbox. Click OK when complete.
In the newly created Gender slicer, click M and then F. Notice that nothing happened. This is because the slicer has not yet been set up to interact with the PivotTable. To do this, right-click the Gender Slicer and choose Report Connections from the context menu. As a note, if you had your cursor in the PivotTable in step 4, the association would have been made for you automatically.
In the Report Connections (Gender) dialog, click PivotTable1. This will associate the slicer and PivotTable.
Now add a timeline. This time with your cursor inside the PivotTable, navigate to Insert > Filters, and click Timeline.
In the Insert Timelines dialog, find the FactInternetSales table and check the OrderDate field. Notice that the Insert Timelines dialog has only picked up on columns that have a date data type. Click OK when complete.
And finally, in the newly created timeline, click the drop down in the upper right corner that says MONTHS and choose YEARS.
When you are complete, your screen should look like Figure 3-16.
Save your work as Chapter 3CH03 Skill 3-1 Filtering.xlsx.
In Figure 3-16 you will notice four distinct areas where filtering can be applied:
Callout Bubble 1 on the X and Y axis of the PivotTable contains several areas where filters can be applied. We will cover these options in Filter data within the PivotTable.
Callout Bubble 2 is the Pivot Table Filter.
Callout Bubble 3 is the Slicer.
Callout Bubble 4 is the Timeline.
Let’s discuss the Filter options that can be applied to a PivotTable. These can be found by right-clicking on any of the dimension members in the rows axis (or columns if we had a columns axis) in the PivotTable. Each of these will be shown shortly in an example. For now, right-click either Australia level member in the row axis and choose Filter from the context menu. You will be presented with the following options:
Keep Only Selected Items This option allows you to select any number of members at any level and then only keep those members in the PivotTable.
Hide Selected Items This is the opposite of Keep Only Selected and allows you to only hide what is selected.
Top 10 This allows you to keep the Top or Bottom “x” (not just 10) member values. This is not just the measure that is in the PivotTable, but any measure in the Data Model.
Label Filters This allows you to filter on the member labels.
Value Filters This allows you to filter on measure values and not just the measure that is in the PivotTable, but any measure in the Data Model.
The first place that we will apply a filter is in the PivotTable itself. In this demo we will use the Top 10 filters.
Open the Chapter 3CH03 Skill 3-1 Filtering.xlsx workbook and ensure that the previous PivotTable is visible.
Place the cursor on the Australia member at the SalesTerritoryCountry level (as opposed to the value that is at the SalesTerritoryRegion level) of the row axis.
Right-click and in the context menu, choose Filter, and then navigate to the submenu and choose Top 10.
This will open the Top 10 Filter (SalesTerritoryCountry) dialog box as shown in Figure 3-17. Ensure that the dialog name does not have SalesTerritoryRegion in brackets, as that means you chose Australia at the incorrect level.
In the dialog box you will have the following options that appear from left to right in Figure 3-17:
In the first drop-down box, you can choose the following to decide where to start the ranking:
Top
Bottom
In the second option box, you can choose a number to rank by. Examples are:
Bottom 3 gives you the three worst performers for a metric.
Top 5 gives you the Top five performers.
In the third drop-down, your options are:
Items (default) Gives you the absolute number of items.
Percent Gives you a percentage of data relative to the data set size.
Sum Use to find the items that make up a specific sum.
Choose Top 2 Items by Sum of Freight and click OK.
You should only see the United States and Australia as in Figure 3-18.
Now clear the filter by clicking the Filter Symbol next to Row Labels in the PivotTable, and then select Clear Filter From “SalesTerritoryCountry” from the context menu.
These filters are added by moving one or more columns from the tables list in the PivotTable Fields pane to the Filters area. Note the following:
If you have a field already in one of the other three areas (Rows, Columns, or Values), then the field cannot be placed in the Filter area without removing it from one of the other areas first.
The PivotTable filter can only be used against the PivotTable it belongs to. If you want a filter to apply across multiple PivotTables, you will need to take advantage of Slicers and Timelines.
To use the filter, click the drop-down box next to the column name to be presented with a list of selectable items.
Slicers provide highly visible buttons that you can click to filter data in your PivotTables and PivotCharts. In addition, slicers also indicate the current filtering state, which makes it easy to understand exactly which filters are being applied.
Another option for filtering dates is to take advantage of a Timeline. A timeline enables you to filter by times and easily zoom in on the periods you want to see. Some notes are:
A timeline is only useable with Date data types
You can customize the slider to YEARS, QUARTERS, DAYS, and MONTHS
It can be connected to multiple PivotTables and PivotCharts
You have already seen grouping and summarizing data from within a PivotTable by taking advantage of nesting fields in the Rows and Columns axis of the PivotTable. This is a manual way of grouping and summarizing data that is local to the PivotTable. Also, if you did this without using DAX Measures (explicit measures) then you will have created your own implicit measures that are summarized.
To make this centralized, you can add custom columns to your Data Model tables that can be used to group rows of data together. You can then use this alongside Hierarchies to improve usability.
PivotTables are often a starting point for understanding data, but for general end-user consumption you often want to take advantage of using visuals for displaying data. Typically, when delivering solutions, you organize a collection of related visuals in the form of scorecards or dashboards that are used to tell a story for users in an “at a glance” fashion. If well designed and thought out, these solutions enable users to digest massive amounts of data in a very short period. Excel has many different visuals available, and you will review how to create and customize the major visuals in the remainder of this section. The intention of this book is not to make you an expert in how to design a well thought out and designed dashboard, but to show you the most common visuals and how to use and customize them as you prepare for the exam.
Fact-based decision making is the foundation upon which successful businesses are built. Decision makers need to access data to help them make the right decisions quickly or to spot issues with a process so that corrective actions can be taken. The question on most decision makers’ minds is “Do I need to do anything?”
As organizations grow, more and more data becomes available, so attempting to synthesize it all in a timely fashion can become a challenge. This is where visualizations that are properly selected and delivered can help provide that “at a glance” view of business performance across massive amounts of information. The Art and Science of Data Visualizations has taken off in recent years to help tame the deluge of data that inundates most organizations.
Building a well thought out Data Model that is friendly to use takes a considerable amount of time, but the efforts put in will pay dividends when you and your users are actively using the model in day-to-day analytic situations. The work done to Optimize the Data Model will make it easier to use, will help make the output more consistent between different users, and can help speed up time to delivery.
The Data Model only really becomes useful when you begin to consume the data for decision making purposes. As discussed in the earlier parts of this chapter, PivotTables are often the first stop when it comes to consuming data from the Data Model. Many Excel users have long been comfortable using PivotTables as their sole means of analysis. There comes a point, however, in the Analytics lifecycle where users that need to track multiple metrics or KPIs are better served by better means of visualizing their data. Tables of data and mountains of reports are hard to navigate when trying to understand performance. The current trend has been toward using the power of tools such as Excel to start with highly summarized visuals and providing users the ability to drill to details where needed. This is where Hierarchies that you would have built into the Data Model become useful.
So, the starting point when it comes to dashboard design is to understand the story that needs to be told with data while choosing appropriate visualizations arranged in an intuitive manner. It is beyond the scope of this book to discuss how to design well thought out dashboards that provide an at-a-glance view of data that is easily digestible and navigable, but it is worth pointing out its importance. The goal is to provide consumers of your data products the ability to find answers to questions themselves.
When building visualizations, is it is worth stepping back to see the big picture before selecting appropriate visualizations for what is being measured. There are four basic presentation types that you can use to present your data:
Comparison
Composition
Distribution
Relationship
Unless you are performing Data Science activities, you will primarily use the first two types in this list. In addition, Trending is something that is often broken into its own category as aspects of trending overlay both Comparison and Composition types.
For purposes of the exam, it is more important to get to an appropriate chart type because the specific categories are not likely to be tested. They are provided here to frame your thinking around the selection process.
Below are a few questions that you may ask yourself to help guide you toward selecting a proper visualization for your needs.
Are you trying to compare values? The following charts are useful for comparing one or many categories of data, as they can enable you to identify low and high points in the data:
Column
Bar
Line
Do you want to show the composition of something? The following charts are useful for showing how individual parts make up the whole:
Pie
Bar and variants of Bar
Stacked Column
Waterfall
Are you interested in analyzing trends? The following charts are useful if you want to know more about performance over time:
Line and variants of Line
Column and variants of column
Are you trying to understand the distribution of your data? The following Distribution-based charts are used to highlight outliers, display normal tendency, and show the range of data in your data set:
Column
Bar
Do you want to better understand the relationship between value sets? The following charts are useful for showing how one variable relates to one or more different variables:
Scatter Plot
Bubble
Line and variants of line
Excel has the following chart groupings. These chart types can use the Data Model or a PivotTable directly as a source.
Column
Line
Pie
Bar
Area
Surface
Radar
Combo
The following chart groupings cannot use the Data Model or a PivotTable directly as a source. To create these visuals, if you are using the Data Model as your primary source, you first need to create PivotTables and then convert them to Cube Functions or load the data to a traditional Excel table.
Scatter
Map
Stock
Funnel
The following chart groupings are new in Office 2016 and cannot use the Data Model or a PivotTable directly as a source.
Treemap
Sunburst
Histogram
Pareto
Box and Whisker
Waterfall
A Column Chart typically displays categories along the horizontal (category) axis and values along the vertical (value) axis. They are good for comparison-based analysis and are well suited if you have any negative category values, since negative values are often associated with having a downward direction. Each of the seven charts are shown in Figure 3-19 and Figure 3-20.
Clustered Column and 3D Clustered Column
Stacked Column and 3D Stacked Column
100% Stacked Column and 3D 100% Stacked
3D Column
In a line Chart, category data is distributed evenly across the horizontal axis, and value data is distributed evenly across the vertical axis. Line Charts are good for showing continuous data over time. The charts that have markers are the same as the line charts shown in Figure 3-21 except for the fact that data point markers would be shown on the visual. Marker Charts are not separately shown but the only difference between each is that markers will show up on the visuals for each data point. Excel has the following seven chart styles as shown in Figure 3-21.
Line and Line with Markers
Stacked Line and Stacked Line with Markers
100% Stacked Column and 100% Stacked Column with Markers
3D Line
Pie Charts show the size of items in one data series, and the data points are shown as a percentage of the whole pie. There are five variants of the Pie Chart that also include the Doughnut. Figure 3-22 only shows the Pie and Doughnut Charts.
Pie and 3D Pie
Pie of Pie
Bar of Pie
Doughnut
There are six types of Bar Charts available in Excel. They are a good way to present data with long labels that would be hard to display below a vertical bar. They also work well if you are trying to display many categories on the category axis. The three 2D versions of the charts are shown in Figure 3-23.
Clustered Bar and 3D Clustered Bar
Stacked Bar and 3D Stacked Bar
100% Stacked Bar and 3D 100% Stacked Bar
Combo Charts enable you to combine two or more chart types to make relationships in your data easy to understand. You can combine charts from the groupings of Column, Bar, Line, and Area into a combo chart. You are also able to define a secondary axis, which is useful when plotting data sets with differing scales, such as we did with the chart labeled Clustered Column-Line on Secondary Axis in Figure 3-24. The three charts that are shown in Figure 3-24 are generally your starting point when building a Combo Chart. If any customizations are made to one of these visuals, Excel switches the chart type to a Custom Combination, which then allows you to make further customizations. The following four combo chart types are available:
Clustered Column-Line
Clustered Column-Line on Secondary Axis
Stacked Area-Clustered Column
Custom Combination
Area Charts are useful for visualizing time-series relationships (see Figure 3-25). Unlike line charts, they also visually represent volume by emphasizing the area with color or shading in the area between the Axis. Stacked Area Charts show the trend of the contribution of each value over time or other category data, and 100% Stacked Area Charts show the percentage trend that each value contributes over a category such as time. There are six different chart types that fall into this category in Excel:
Area and 3D Area
Stacked Area and 3D Stacked Area
100% Stacked Area and 3D 100% Stacked Area
Scatter Plots are like line graphs (see Figure 3-26). A line graph uses a line on the X-Y axis to plot continuous data, while a scatter plot uses markers to represent individual data points. Scatter Plots are useful if you want to see how two variables are related to each other. There are seven different styles of Scatter Plots in Excel:
Scatter
Scatter with Smooth Lines and Markers
Scatter with Smooth Lines
Scatter with Straight Lines and Markers
Scatter with Straight Lines
Bubble and 3D Bubble
As the name implies, Stock Charts can show fluctuations in stock prices and other data sets that have high-low variability that you want to visualize. There are four different types of Stock charts:
High-low-close
Open-high-low-close
Volume-high-low-close
Volume-open-high-low-close
Map Charts are useful when you want to plot data on a map to enhance the understanding of a data set across regionality. It is important to have geographical attributes in your data sets, as they are needed to help draw out the map visualization. Figure 3-27 contains a map of the Adventure Works sales across all time. Notice that the default map as created here varies; the color shades go from darkest to lightest, where the highest sales are shown using the darkest colors. This helps to further emphasize patterns in the data.
As defined by Wikipedia, a Radar Chart is “a graphical method of displaying multivariate data in the form of a two-dimensional chart of three or more quantitative variables represented on axes starting from the same point.” In Figure 3-28, we display Total Profit and Total Sales across the Accessories Product Subcategory.
Funnel Charts are typically used to show multiple stages in a process, such as movement through a sales pipeline or a registration process for a school. They use horizontal bars that are arranged in the same order that a process flows, to show progression though the process. They are also useful for displaying data across categories as well, since the bars are sized relative to each other on display. In Figure 3-29 we are displaying the ranked Total Sales of the Accessories Product Subcategory.
A Treemap is a Hierarchical Chart that is useful for comparing parts to a whole, or when several columns of a category form a hierarchy. It provides an easy way to compare different levels of categorization. It displays categories by color and proximity and can display lots of data easily.
Remember that this visual cannot be sourced directly though the data model. If you try, you will see a message that says the following.
You can’t create this chart type with data inside a PivotTable. Please select a different chart type or copy the data outside of the PivotTable.
To solve this, we will first create a PivotTable and then will convert the PivotTable to Cube Function calls. Once we have the data in this format, you will build the PivotChart off that data set. Let’s create a demo around building out the Treemap by performing the following steps:
Open the Chapter 3CH03 Skill 3-3 PivotChart Demo.xlsx workbook.
From Excel > click the Insert tab > Tables > PivotTable.
In the Create PivotTable dialog, in the Choose the data that you want to analyze, Use this workbook’s Data Model, and then in the Choose where you want the PivotTable to be placed, choose New Worksheet. Click OK.
In the PivotTable Fields pane, do the following:
Drag Total Sales from the InternetSales table into the Values area.
Right-Click on the SalesTerritoryGroup in the SalesTerritory table and choose Add to Rows Labels on the context menu. Do the same for SalesTerritoryCountry as well. Once complete, ensure that SalesTerritoryCountry is nested under the SalesTeritoryGroup. Note: Do not use the hierarchy for this exercise but the field itself which is under the More Fields folder.
At this point, your PivotTable should look like Figure 3-30.
We now need to change the Layout of the PivotTable so that the data is in the necessary state to support the visual. With the cursor in the PivotTable, navigate to the Design tab, Layout group, Report Layout, and click Show in Tabular Form.
Now make the data labels repeat down the rows by navigating to the Design tab > Layout group > Report Layout > click Repeat All Item Labels.
Remove the Subtotals by right-clicking one of the subtotal labels in the PivotTable and choose Field Settings from the context menu. On the Field Settings dialog, ensure the Subtotals & Filters tab has focus and then under Subtotals, choose None.
Remove the Grand Totals by navigating to Cube formulas so that it can be used as the source for the Treemap. For this, navigate to Design > Layout > Grand Totals > click Off for Rows and Columns.
Now insert a Slicer so that you can interact with the data. To do this, ensure that the cursor is inside the PivotTable and the navigate to Insert > Filters > click Slicer to open the Insert Slicers dialog. Note that since we said to insert the slicer while we were in the PivotTable, a few things were done for us that we may need to otherwise do manually. First, Excel knew to base the slicer on the Data Model. If you had done this operation outside of the PivotTable, you would have been asked what source to use for the slicer. Secondly, Excel knew to associate the slicer to the PivotTable, which is what signals the slicer values to be passed to the PivotTable when you interact with the slicer.
From within the Insert Slicer dialog, ensure that the All tab is open and then in the Customer table, check the Gender field to create the slicer based on this. Click OK when complete.
Finally, we need to convert the PivotTable to Cube Functions since the Treemap PivotChart type cannot use a PivotTable. To do this, navigate to Analyze > Calculations > OLAP Tools > and click Convert to Formulas.
At this point your data worksheet should look like Figure 3-31. Notice that it is no longer a PivotTable, but has been converted to a series of Cube Function calls that get the equivalent data to the PivotTable.
Use the newly converted set of data as the source for the Treemap chart. To do this, highlight the rows and columns in the worksheet in Figure 3-30, including the column headers.
Next navigate to Insert tab, Charts group, Insert Hierarchy Chart command and then click Treemap in the drop-down.
You should then be presented with the default Treemap chart as in Figure 3-32. Notice that this screenshot was taken immediately after you clicked Treemap so that data points in the table are still highlighted.
Interact with the slicer values and watch the chart and data table change with the different filter values.
Save the file, as you will use what you have done so far to build a Sunburst Chart next. You will also come back to the Treemap in a later demo to format it.
The Sunburst Chart is useful for displaying hierarchical data. Each level of the hierarchy is represented by one ring or circle with the innermost circle as the top of the hierarchy. A Sunburst Chart with only one level of categories resembles a Doughnut Chart. However, a Sunburst Chart with multiple levels of categories shows how the outer rings relate to the inner rings. The Sunburst Chart is most effective at showing how one ring is broken into its contributing pieces.
Even though the data for the previous example may not be the best for display in a Sunburst Chart, let’s use the data as a matter of convenience to show creating an additional visual.
Open the Chapter 3CH03 Skill 3-3 PivotChart Demo.xlsx workbook that you previously saved.
Use the newly converted set of data as the source for the Sunburst Chart. To do this, highlight the rows and columns in the worksheet in Figure 3-31, including the column headers.
Next, navigate to Insert tab > Charts group > Insert Hierarchy Chart command and then click Sunburst in the drop-down.
You should be presented with a Sunburst Chart as shown in Figure 3-33. Take note of how the chart is effective at showing how one ring is broken into its contributing pieces.
Interact with the slicer values and watch the chart and data table change with the different filter values.
Save the file, as will you use what you have done so far. You will also come back to the Treemap in a later demo to format it.
Histograms show distributions of data. They are used to plot data with ranges of the data grouped into bins or intervals. Often Histograms are described as bar charts, which is a mistake. Bar charts are used to plot categorical data and are used to compare values across categories.
Let’s look at creating Histogram by looking at how our customer base is distributed across commute distance. The Customer has a Bin already included in the data set, which is useful. All that we will do is perform a count in the CustomerAlternateKey so that we can see the number of customers in each bin. To do this, follow these steps:
Open the Chapter 3CH03 Skill 3-3 PivotChart Demo.xlsx workbook.
Navigate to the Histogram worksheet. Once this worksheet is open, you will notice that a data set has already been created for this demo to save you the steps of data preparation.
Highlight the rows and columns in the worksheet, including the column headers.
Next navigate to Insert tab > Charts group > Insert Statistic Chart command, and then click Histogram in the drop-down.
You should then be presented with a new Histogram chart as shown in Figure 3-34.
This is a good start, but it is not quite what we had intended. We wanted to see the distribution across the Bins that we had in our data set, which is what we will see in Figure 3-34 after we configure the next step.
To get your Histogram to look like this, do the following steps. Highlight the x-axis as shows by Callout Bubble 1 on Figure 3-35 and then double-click it. Then in the context menu, select Format Axis. When the Format Axis pane appears, ensure that Axis Options (Callout Bubble 2) is highlighted, and choose By Category in the Bins options, which is Callout Bubble 3. This will put the counts in the appropriate bins.
For completeness, you have the following options for configuring bins:
By Category You specify the categories for the bins.
Automatic Excel will automatically determine the number of bins.
Bin Width Use this to specify the range of each bin.
Number Of Bins Use this option if you know the number of bins you want to show in the chart.
Overflow Bin Use this to define an upper limit value for bins. Values that are above this number are put into another bin.
Underflow Bin Use this to define a lower limit value for bins. Values that are above this number are put into another bin.
Box and Whisker plots are used to compare distributions because the center, spread, and overall range are immediately apparent. In contrast to a Histogram, it does not show a distribution in as much detail, but is useful for indicating whether a distribution is skewed or not.
Waterfall Charts are useful for visualizing positive and negative values and how they impact a subtotal or total. They are often used when visualizing financial statements data such as net income or profit, and how accounts, business units, or divisions impact these overall values. Figure 3-36 demonstrates this.
When building PivotCharts, you typically start out by getting the PivotChart functioning properly from a data perspective. Once you have it displaying the data that you want, and perhaps even alongside some other visuals, the time will come to give your PivotChart a professional look and feel.
Each of the Chart Types have many elements in common, such as Titles, Axis Labels, and Legends. There are some elements that are unique to each visual. In this section, we will focus on the common things that can be done to format your PivotCharts to make them look visually appealing. There are many formatting options, and it is beyond the scope of the exam to go into details around all options.
The first thing that you need to be familiar with is the Format Selection pane. There are numerous ways to have it appear. In Figure 3-37, the chart area is highlighted and then we navigated to Format, Current Selection, and clicked Format Selection, which made the Format Chart Area pane appear as shown.
In the Current Selection group, as shown in Figure 3-38, you will notice the drop-down with the word Chart Area selected.
If you click the Chart Elements drop-down box, you will see all the chart elements for a Pie Chart as is shown in Figure 3-39. You will know what can be formatted once you click on an element and then see what is available in the Format Selection pane.
As you have seen, when you have a PivotChart in focus, you will get the following tabs that show up in the Ribbon under PivotChart Tools.
Analyze
Design
Format
The tabs that we are concerned about in this Section are the Design and Format tabs. In the Design tab, as shown in Figure 3-40, you will notice the following options, which are used to change the styles of your PivotChart.
Add Chart Element Lets you add and modify valid elements for a given Chart Type.
Quick Layout Has predefined styles for you to change the overall layout of your chart.
Change Colors Allows you to customize your colors and styles.
Gallery Allows you to choose from many predefined styles that are applied to all the elements of a given visual.
Change Chart Type From here you can change your chart to many different types.
In the Format tab, as shown in Figure 3-41, you will notice the following options, which are used to change the format of your PivotChart. Below are the most commonly used commands on this tab.
Chart Elements As mentioned earlier, this is where you can pick chart elements from the drop-down.
Format Selection Opens the Format Selection pane that has formatting options for each element.
Reset to Match Style Clears the custom formatting of the selected chart element back to the overall style applied to the chart. This helps to ensure that the selected chart elements match the overall theme of the document.
WordArt Styles Gallery Customizes fonts on the visual.
Height Adjusts the height of the selected chart element, if applicable.
Width Adjusts the height of the selected chart element, if applicable.
Listed below are some of the Chart Elements that can be formatted:
Chart Area
Chart Title
Axis Title
Axis
Plot Area
Data Labels
Data Series
Data Point
Legend
Formatting a chart can be a time-consuming task as you work to get every aspect of the design and format just right. So, once you have your chart formatted the way you like, you may find yourself wanting to use a similar set of formatting styles on other charts. To do this, you can save your chart as a template for reuse in other workbooks.
Filtering data can be done on the PivotChart or it can be done using an external object such as a Slicer or a Timeline. This is done in the same manner as we did with PivotTables. You have already seen Slicers and Timelines, so let’s focus the discussion around filtering PivotCharts to the filters on the chart itself. Figure 3-42 shows the three filters that are available to be applied to the PivotChart.
Callout Bubble 1 is the CalendarYear filter that has been designated in the Filter area of the PivotChart Fields pane. This filter can only choose specific values from the list of valid members. Callout Bubble 2 has the filter options that can be applied to the Axis (Categories) and Callout Bubble 3 is for the Legend (Series). Both Callout Bubbles 2 and 3 allow for extended Filtering on Labels and Values in the chart.
Excel has arguably been one of the top, if not the top, BI tool for many years. However, Excel has several shortcomings that needed to be addressed. Some of the largest pain points with Excel have been sharing, collaborating, and maintaining version control of an Excel workbook. For years, users have been creating wonderful insights that they have had a tough time sharing in a consistent and secure manner.
In recent years, Power BI has gained in popularity, largely in part due to its ability to overcome the sharing and collaboration barrier. In conjunction with the evolution of the Office 365 and Azure suite of products, we now have many options that allow for easier centralization of data and sharing of those data products in the form of the Data Model or Excel Visualizations.
Before we dive into the options that you have for interactions between Power BI and Excel, we will do a quick overview of Power BI and its components.
Power BI desktop is the tool where most of the development work to create Power BI models and visualization is done. The tool has capabilities to connect to data sources, load, cleanse, and model data for end-user consumption. It then allows you to build visuals and then consume this content in the Power BI service. Fundamentally it does many of the same things that you have already done in Excel using PivotTable and PivotCharts, using the Excel data model and Get & Transform data. Power BI originally launched as part of Office 365 back in July 2013 and in July 2015 it became its own product that was disconnected (from a product development lifecycle) from the Office suite of products.
It comes in two versions: Power BI Desktop and Power BI Desktop optimized for Power BI Report Server. The latter is meant to be used against Power BI Report server.
The Power BI Service is where content that is developed using Power BI Desktop is published for more widespread sharing and collaboration using the Internet. From here it can be accessed anytime, anywhere, by any device. It is also possible to import data using the service and then build reports from it, but the functionality is not nearly as rich as the desktop tools. The Power BI Service interface is shown in Figure 3-43.
The relevant parts of the service for purposes of this book are shown in the following list. Note that each number in the list corresponds to a numbered Callout Bubbles on Figure 3-43.
Import or Connect to Data We will be navigating to Files, Get to publish data from Excel to the Power BI Service so that content can be shared using this interface.
DASHBOARDS This is where Power BI Dashboards are housed and managed once they are created.
REPORTS This where Power BI Reports are stored, managed, and created.
WORKBOOKS This is where any Excel workbooks that have been published to the Power BI Service will be housed and managed.
DATASETS This is where Power BI stores datasets that are used to build reports.
Once content is published to or built in the Power BI Service, it is made available almost instantaneously in Power BI Mobile. It allows you to securely access and view live any dashboards and reports on any device, with native mobile BI apps for Windows, iOS, and Android.
Microsoft Power BI Embedded is targeted at application developers to use so that they can embed fully interactive reports, dashboards, and tiles into applications without the time and expense of building their own data visualizations and controls from the ground-up. Power BI, on the other hand, is a software-as-a-service analytics solution that gives organizations a single view of their most critical business data.
There are multiple ways to get data and objects from Excel into Power BI. The terminology differs depending if you’re using Excel or Power BI as the starting point.
With Excel 2016, you can publish Excel workbooks to Power BI right from Excel. This enables you to harness the power and skills you have with Excel and then benefit from the interaction and distribution capabilities from within Power BI. For years, one of the largest challenges with Excel has been how to share workbooks while maintaining versions. With Office 365 and Power BI you now have the capabilities to avoid what has traditionally been one of the largest challenges.
You have two options for publishing to Power BI from within Excel and each has differing outcomes. You can:
Upload your workbook to Power BI
Export your workbook data to Power BI
To get to the Publish location from within Excel, navigate to File, to >Publish and you will see Figure 3-44.
First Let’s look at what each option offers by first looking at the Upload Your Workbook To Power BI option, which has the following functionality. As a note, these functions are active once you publish:
Interact with your Excel workbook just as you would in Excel Online.
Pin selections from your workbook to Power BI dashboards.
Share your workbook or selected elements through Power BI.
There a few things that you need to be aware of before you publish:
Before you can publish to Power BI, your workbook must be saved locally, in OneDrive for Business or on a SharePoint Online team site. If the file is saved locally, only Excel 2016 with an Office 365 subscription will see the experience to publish with local files. If you are using an Excel 2016 standalone installation, you will still have the Publish behavior, but this requires that the Excel workbook be saved to OneDrive for Business or SharePoint Online.
The account you use to sign in to Office, OneDrive for Business, and Power BI must be the same account.
You cannot publish an empty workbook or a workbook that doesn’t have any Power BI supported content.
You cannot publish encrypted or password-protected workbooks, or workbooks with Information Protection Management.
Publishing to Power BI requires modern authentication be enabled (default). If disabled, the Publish option is not available from the File menu.
If your Excel work book is stored on OneDrive for Business or on a SharePoint Online team site, you will only be able to publish to your own Power BI workspace per the below message in Figure 3-45.
If your Excel workbook is stored locally, you will able to Select Where You’d Like To Publish To In Power BI. This is any workspace that you have permissions for as shown in Figure 3-46. To generate this screenshot, simply move the Excel file that you want to publish to your desktop.
Let’s now publish a file to Power BI using the Upload option. To do this, perform the following steps:
Open the file named Chapter3CH03 Skill 3-3 Power BI–Upload.xlsx. In this demo, the file was stored in OneDrive for Business.
Observe that the file has the following characteristics:
It has a Data Model that has been populated using the Query Editor
It has an Excel Table named tblLetters on Sheet2
It has several reporting objects such as a Timeline, PivotCharts, and PivotTables
Navigate to File > Publish > and then click Upload. Once you click upload, you should see the status bar at the bottom of the screen as appears in Figure 3-47.
Once the Publish is complete, Excel will display the message in Figure 3-48 below the Ribbon. Click the Go to Power BI button.
In Power BI you should see the following, as in Figure 3-49. You will notice that it is Excel Online and that the name of the workbook CH03 Skill 3-3 Power BI–Upload shows up under the WORKBOOKS area. Here are a few important points:
The entire workbook has been published.
You cannot edit the workbook in Power BI.
You can edit it by clicking Edit and then choose to Edit in Excel Online or Edit in Excel.
No data has been uploaded to Power BI because nothing shows up in DATASETS. This means that the data is still in the Excel Data Model. Any additional visuals need to be created in Excel.
If data changes in the workbook on OneDrive for Business, it can take up to one hour for the changes to be reflected in Power BI.
You can Pin the visuals, including the PivotTable to a Dashboard.
Let’s now publish a file to Power BI using the Export option. To do this, perform the following steps:
Open the file named Chapter3CH03 Skill 3-3 Power BI – Export.xlsx. In this demo, the file was stored in OneDrive for Business.
It has the same characteristics as the Upload file but was given a different name to add clarity to the demo.
Navigate to File > Publish > and then click Export. Once you click upload, you should see the status bar at the bottom of the screen appear as in Figure 3-47.
Once Publish is complete, Excel will display the message in Figure 3-48 below the Ribbon. Click the Go to Power BI button.
In the Power BI service, you will notice that this time the only the Data Model tables have been imported in the DATASETS section as shown in Figure 3-50.
Any supported data in tables and/or the Data Model are exported to a new dataset in Power BI. Note that if you have Data Model data and Excel tables, only the Data Model tables will come over. In our case, the Excel table name tblLetters was not brought over. Had there been no data model in the workbook, the table would have come over.
If you have Power View sheets, they are re-created as Power BI Reports.
When you edit your workbook and save it in OneDrive for Business, your changes are synchronized with the dataset in Power BI within an hour. If you need it faster, you can Publish again.
If you click on the Ellipses next to the CH03 Skill 3-3 Power BI–Export dataset you will notice the following, as in Figure 3-51, which indicates when the last data refresh occurred.
From within Power BI you can effectively perform the same action:
Import a workbook to Power BI (like Export within Excel)
Connect to the workbook to navigate its data and content (like Upload from Excel)
As a note, My Workspace in Power BI was cleaned out completely so that we can demo the same actions from above, but this time using Power BI as the interface to bring in Excel content.
If your current Power BI screen does not look like Figure 3-52, you can click Get Data in the bottom left of the Power BI interface as shown with Callout Bubble 1, because this will show up in all Power BI screens. Once you have the full screen as in Figure 3-52, navigate to Import or Connect to Data, and click Get (Callout Bubble 2).
You are then presented with the following screen as in Figure 3-53. Here you will see that Power BI allows you to bring files in from several other locations as well.
When you select one of the options on Figure 3-53, you get the navigation screen for OneDrive for Business. Here you navigate to the file locations (same as where they were for Excel) and then click Connect. Once you have done this, you will be presented with Figure 3-54, where you decide how you want to connect your Excel workbook.
From this point forward, you will end up with the same result as when we published from Excel. Once again, if you choose to Import a workbook to Power BI, it will function like Export within Excel. If you choose to Connect to the workbook, it will function like Upload from Excel.
Power BI publisher for Excel enables you to take snapshots of Excel objects such as PivotTables and PivotCharts and pin them directly into Power BI.
You first need to download the Power BI publisher for Excel. The easiest way to do this is to navigate to Power BI and click on the download icon in the upper right part of the screen as in Figure 3-55. From here you will be taken to the download site.
Once you have done the install of the Power BI publisher for Excel, you will have a new tab in Excel named Power BI, as shown in Figure 3-56.
Once your data is in Power BI, you can create Reports and build Dashboards as you would if you were building Power BI objects off an existing Power BI model. The tutorial that is in the MORE INFO reader aid below does an excellent job explaining how to:
Get data
Start exploring your dataset
Continue the exploration with Q&A (natural language querying)
In this thought experiment, you will test your knowledge pertaining to visualizing data. As a practice for the exam, eliminate answers that you know are incorrect first as to narrow the smallest set of most correct answers first.
1. Which of the following is not an area in the PivotTable Fields pane?
Values
Slicers
Rows
Filters
2. You have a Data Model with one table in it, and data for a PivotTable you are building is sourced exclusively from the Data Model. If the Data Model gets its data from a SQL Server via the Query Editor, what sequence of steps needs to happen to get refreshed into your PivotTable if you know that data at source has changed? Assume that the Query Property named Refresh this connection on Refresh All is checked. Choose all that are correct.
From the PivotTable Tools menu, click Refresh
From the PivotTable Tools menu, click Refresh All
From the Query Editor, click Home, Query, Refresh All
All the above
3. Which set of steps will connect a slicer to a PivotTable? Assume one PivotTable and Slicer are in the workbook. Choose all that apply.
With a cell in the PivotTable selected, choose Insert, Filters, Slicer
With a cell selected outside of the PivotTable, choose Insert, Filters, Slicer
With a cell selected outside of the PivotTable, choose Insert, Filters, Slicer. Right-click on the Slicer and choose Report Connections from the context menu and then associate it with the PivotTable
All the above
4. Which filter types can be connected to many PivotTables and PivotCharts? Choose all that apply.
Slicer
Timeline
Dicer
Filter in the Filters area of a PivotTable
5. Which Filters are available when filtering on a member that is in the Rows of Columns of a PivotTable? Choose all that apply.
Top 10
Label Filters
Value Filters
All the above
6. Which is not an available chart type in Excel?
Histogram
Combo
Radar
Card
7. Which chart type is best to illustrate comparisons among individual items where there are many individual items? Choose the BEST answers.
Bar
Pie
Column
Waterfall
8. Which chart types cannot connect directly to data that is house in the Data Model?
Pie
Bar
Funnel
Waterfall
This section contains the solutions to the thought experiment.
1. Answer B: The Slicer is a separate object from the PivotTable.
2. Answers A and B: The Refresh preview in the Query Editor only refreshes the data preview for the Query Editor and does not load the data to the Data Model.
3. Answers A and C: Step B is incomplete. You would need to go through the steps of choosing where to source the Slicer from and then you would need to connect it to the PivotTable
4. Answers A and B: A Dicer does not exist. The Filter in the Filters area of a PivotTable is local to one PivotTable.
5. Answer D: All these filter types are available.
6. Answer D: There is no Card Chart type in Excel.
7. Answers A and C: Pie Charts would become too difficult to interpret with many individual items. Waterfall Charts are useful for understanding how an initial value is affected by a series of positive and negative values, such as in Financial Statements.
8. Answers C and D: The Funnel Chart and Waterfall Chart cannot connect to data directly in the Data Model.
PivotTables are a great way to start analysis. They summarize data into a grid format as defined by the PivotTable Fields pane.
It is a good idea to give your PivotTable a name once it is created. This is useful as most reporting solutions end up with multiple reporting objects; this helps keep them organized.
With a Data Model that has lots of data, you can use the Defer Layout Update to defer data updates as you define a PivotTable.
You have options to format aspects of your PivotTable locally or in the Data Model. Making changes to the Data Model allows them to be available to all PivotTables.
The default Report Layout is Compact. There are two other Report Layout types:
Tabular
Outline
Measures that are created in the PivotTable are known as Implicit Measures and are local to the PivotTable they are created in. To create globally available measures, you will create Explicit Measures that become part of the Data Model.
Grand Totals can be removed from PivotTables in the following ways:
Off for Rows and Columns
On for Rows and Columns
On for Rows Only
On for Columns Only
Subtotals can be configured in the following ways:
Do Not Show Subtotals
Show all Subtotals at Bottom of Group
Show all Subtotals at Top of Group
Blank Rows can be added to your PivotTables between groupings to help add clarity to your PivotTables.
Refreshing data in your PivotTables can be done in several ways:
Manually for each PivotTable
Manually for all PivotTables
Automatically when opening the file
Values in a PivotTable can be summarized and formatted locally to a PivotTable using the Value Field Settings set of properties. These override any formatting that has been done in the Data Model.
When done building a Data Model, it should be optimized by doing the following:
Remove Columns from Data Mode that are not useful in analysis
Hide Tables and Columns that are not useful in analysis but are needed in the Data Model
Naming Conventions should be intuitive for tables and columns
Table and Column Descriptions should be added to each
Data Types - choose the correct data type for columns
Columns Formats should be set to appropriate formats for the column
Create Explicit Measures
Column Sorting
Data Categorization, which helps reporting tools better interpret data values
Set Summarize By Property, which applies default summary method for a column
Filtering data the is displayed in your PivotTable can be done in the following locations:
In the PivotTable itself
By adding a PivotTable Filter
By adding a Slicer and connecting it to the PivotTable
By adding a Timeline and connecting it to the PivotTable
Excel Has the following chart groupings. These chart type can use the Data Model or a PivotTable directly as a source:
Column
Line
Pie
Bar
Area
Surface
Radar
Combo
These chart groupings cannot use the Data Model or a PivotTable directly as a source:
Scatter
Map
Stock
Funnel
The following chart groupings are new in Office 2016 and cannot use the Data Model or a PivotTable directly as a source:
Treemap
Sunburst
Histogram
Box and Whisker
Waterfall
When you upload a workbook to Power BI using the Excel Publish to Power BI, you can:
Interact with your Excel workbook just as you would in Excel Online
Pin selections from your workbook to Power BI dashboards
Share your workbook or selected elements through Power BI
When you Export workbook data to Power BI:
Excel table data comes as long as there is no Data Model. If a data model is present, only it is exported into a Power BI dataset
You can create Power BI reports and dashboards from your dataset
With Power BI Publisher for Excel, you can take snapshots of PivotTables, charts, and ranges and pin them to dashboards in Power BI.
Specifically, you can select:
A range of cells from a sheet or table
PivotTables
PivotChart
Illustrations and images
Text
You cannot select:
3D Maps
Power View visualizations
3.147.83.126