© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
J. M. RhodesCreating Business Applications with Microsoft 365https://doi.org/10.1007/978-1-4842-8823-8_18

18. Using Power BI to Find Inconsistent Data Between Excel Spreadsheets

Jeffrey M. Rhodes1  
(1)
Colorado Springs, CO, USA
 

This example solves a challenge that confronted my wife, Sue, in her job in human resources. She had to reconcile large spreadsheets taken from different systems that had conflicting data. Since these had hundreds of rows and dozens of columns, trying to eyeball the differences was time-consuming and prone to errors. Since she has to listen to me drone on about how great Power BI and the rest of the Power Platform is An image of the smiley emoticon., the least I could do is try to help.

A Microsoft Access Solution

The key to a solution is to have a common data element in both spreadsheets and then “join” on that element. In our example, here I use a Position Number column. Before tackling the problem in Power BI , I first wanted to try a Microsoft Access solution. Figure 18-1 shows the first step of importing both spreadsheets as tables.

A cropped screenshot of the access table. It highlights the external data taskbar with excel data to be imported from its new data source option.

Figure 18-1

Importing Excel spreadsheets into Access tables

We use the Import Spreadsheet Wizard and choose to use the first row as headers, as shown in Figure 18-2.

A screenshot of the import spreadsheet wizard window. It has the option for first row contains column headings checked with a table underneath.

Figure 18-2

Import Spreadsheet Wizard using first row as headers

Once we get our tables imported, we create a query, as shown in Figure 18-3.

A screenshot of a compare spreadsheets window. It has 2 spreadsheets with an arrow pointed at the join properties dialog box.

Figure 18-3

Access Query Design View

The key is to do a join1 on our common column, Position Number. We then add our common columns (Position Number and Position Title) and then our potentially non-matching ones for each spreadsheet, naming them Office Symbol 1, Office Symbol 2, Program Code 1, and Program Code 2. We are now ready for the useful part, which are “Match” columns that are True if the columns match and False if they don’t. Listing 18-1 shows the entire Structured Query Language (SQL) syntax with the key elements in bold.
SELECT
Spreadsheet1.[Position Number], Spreadsheet1.[Position Title], Spreadsheet1.[Office Symbol] AS [Office Symbol 1], Spreadsheet2.[Office Symbol] AS [Office Symbol 2], IIf([Spreadsheet1].[Office Symbol]=[Spreadsheet2].[Office Symbol],True,False) AS [Office Symbol Match], Spreadsheet1.[Program Code] AS [Program Code 1], Spreadsheet2.[Program Code] AS [Program Code 2], IIf([Spreadsheet1].[Program Code]=[Spreadsheet2].[Program Code],True,False) AS [Program Code Match]
FROM
Spreadsheet1 INNER JOIN Spreadsheet2 ON Spreadsheet1.[Position Number] = Spreadsheet2.[Position Number];
Listing 18-1

SQL syntax for “Compare Spreadsheets” Access query

We use the IIf function to show True if the two Office Symbol or Program Code columns match and False otherwise. Figure 18-4 shows the results.

A screenshot of a compare spreadsheets window. It has 9 columns, where the sixth column is highlighted and labeled as office symbol match.

Figure 18-4

Access query results showing Office Symbol Match and Program Office Match columns

Notice that True shows up as -1 while False shows up as 0. We would then export this query back out to Excel and then filter on these values to find our problem rows.

While this solution works, it is not very scalable in that the user needs to import new spreadsheets each month (or create a linked table, which is itself fairly complicated) and then redo the query. Worse, there is no good way to share this data with other team members without them having to be able to open and understand an Access database.

Configuring Our Data in Power BI

In Power BI , we link to both spreadsheets as data sources in the normal way and then Transform data . The only change we need to make in the original data is to Change Type for Position Number to Text so that we don’t lose the leading 0’s, as shown in Figure 18-5.

A screenshot of the options overlay on the position number tab with change type and text highlighted.

Figure 18-5

Change Position Number Type to Text

We now have to figure out the best way to join our data. Going to the model screen as in a previous chapter will not work in this case as we want to do more than just show both tables in one visualization. We actually need our new “Match” columns in the same data source. For that, we need a new Merge Queries as New, as shown in Figure 18-6.

A screenshot of a spreadsheet taskbar with an arrow pointed at merge queries as new on the right.

Figure 18-6

Merge Queries as New

This gives us the dialog shown in Figure 18-7.

A screenshot of the merge window. It has spreadsheets 1 and 2 selected with data underneath, and the inner join kind option selected.

Figure 18-7

Inner Join merge on Position Number column

We choose our two data sources and select that the Position Number column is the one that matches in each. We perform an inner join as in our Access implementation. This gives us the result shown in Figure 18-8 after we have renamed Office Symbol to Office Symbol 1 and Program Code to Program Code 1.

A screenshot of the merged data table with 5 columns. The column program code 1 is highlighted.

Figure 18-8

Merged Data After Renaming Spreadsheet 1 columns with Spreadsheet 2 data as a Table

Notice how all of Spreadsheet2 is shown as a Table. We expand that to add its Office Symbol and Program Code columns, as shown in Figure 18-9.

A cropped screenshot of program code 1 and spreadsheet 2 tabs. An options overlay has the office symbol and program code option checked.

Figure 18-9

Selecting desired Spreadsheet2 columns - Office Symbol and Program Code

Since we check the box to “Use original column name as prefix,” that gives us columns like Spreadsheet2.Office Symbol. We rename them and reorder to get to Figure 18-10.

A screenshot of a spreadsheet with 6 columns. The position number column with its data is highlighted.

Figure 18-10

Merged Data with both spreadsheets displayed and columns renamed

We are now ready to implement our “Match” columns. These are a good use of Conditional Columns . Rather than comparing the value as is the default, however, we want to compare Office Symbol 1 to another column (Office Symbol 2), as shown in Figure 18-11.

A cropped screenshot of office symbol match. It has 3 columns of column name, operator, and value. The select a column option is selected from value.

Figure 18-11

Match Column Values in Conditional Column

If these match, we put in True and otherwise False, as shown in Figure 18-12.

A screenshot of the add conditional column window. It has office symbol match written in new column name input box, with 4 columns of column name, operator, value, and output underneath.

Figure 18-12

Office Symbol Match Conditional Column

We do the same thing for a Program Code Match column .

Thinking forward to our desire to color-code our entries (e.g., show True in green and False in red), it turns out that we need a numeric conditional column as well. Figure 18-13 shows how we can set that up.

A screenshot of the add conditional column window. It has office symbol status color written in new column name, with 4 columns of column name, operator, value, and output underneath.

Figure 18-13

Conditional Column for Setting Color

We won’t display this column in our visualization, but rather just use it in our conditional formatting. We are now ready to Close and Apply and start on the report.

Visualizing Our Data in Power BI

We are now ready to display our data. We start with a Table control to show the values from our original spreadsheet. The users can then filter to show non-matches and then export that data if desired. We create slicers for Office Symbol Match (Figure 18-14) and Program Code Match as well as card to show the number of positions affected.

A screenshot depicts how to create slicers for office symbol match and program code match. It has a panel for visualizations and fields on the right.

Figure 18-14

Office Symbol Match Slicer

Our last objective is to color code our “Match” columns so that True values show as green and False ones show as red. To do that, we select our column in the table and apply Conditional formatting, as shown in Figure 18-15.

A screenshot of the columns panel with office symbol match marked as 1 on the left. An options overlay has conditional formatting and font color marked as 2 and 3 on the right.

Figure 18-15

Conditional formatting in Power BI Table visualization

Power BI conditional formatting is not yet as flexible as Excel’s and thus we can’t apply formatting with non-numeric data. That’s why we created the additional conditional column back in Figure 18-13. As we see in Figure 18-16, we change the Format style to Rules, leave the Apply to as Values only, and then choose our What field should we base this on? to be our Office Symbol Status Color column. We leave it as Sum, but Average, Minimum, or Maximum would all have worked the same.

A screenshot of the font color - office symbol match window. It has options for format style, apply to, field, and summarization with rules underneath.

Figure 18-16

Setting Conditional Access Font Color

We edit the first rule to be = 0 as a Number and then choose red as the color. We add a New rule and use = 1 as a Number. We choose a variation of green (#039703 hex value) for its color. Figure 18-17 shows the final result.

A screenshot of the office symbol match and program code match color formatting has false and true checkboxes with 8 columns and data underneath.

Figure 18-17

Visualization with slicers and conditional color formatting

While this is a comparable amount of work to the Access solution, the big advantage is that updated feeds could be uploaded to Teams or another location and processed via Power BI’s scheduled refresh. The report can easily be shared with other staff. They just use the two slicers and, if needed, export the problem data. This is MUCH easier and less prone to error than the Access solution.

Summary

In this chapter, we took a common example of needing to determine if data from multiple systems was consistent. We first looked at a “legacy” technique of using Microsoft Access queries to do the job. We then examined how to create a reusable solution in Power BI via Merged Queries. We added slicers and conditional formatting to easily see unmatched data. Powerful indeed!

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.119.116.43