© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
J. M. RhodesCreating Business Applications with Microsoft 365https://doi.org/10.1007/978-1-4842-8823-8_6

6. Power BI Challenges with JSON, XML, and Yes/No Data

Jeffrey M. Rhodes1  
(1)
Colorado Springs, CO, USA
 

Adding in multiple answers, multiple lines of text, and Yes/No responses leads to some additional challenges in visualizing data. This chapter looks at some of these challenges and how to address them.

JSON and HTML Data Formats

Our first example comes from an IT Demographic Survey of all our personnel. Figure 6-1 shows a section of the survey that reveals problematic data.

A screenshot shows 2 questions. Single choice on I T level with second out of 5 options selected. Multi choice on currently used office networks with second and fourth out of 7 options checked.

Figure 6-1

Form questions that generate JSON and HTML

I used the same technique as the previous chapter to copy each response to a SharePoint list using Power Automate. We see the first inkling of an issue when we look at the data in SharePoint, as shown in Figure 6-2. The extra explanatory text after Level 4 gets copied (since it was part of the answer).

A screenshot of the data in SharePoint with I T level and current networks office.

Figure 6-2

SharePoint display of the data showing the complete answer text and JSON format

More significantly, the multiple answers get represented as:
["EDU (wired)","MIL"]
Those with some web programming experience might recognize this as JavaScript Object Notation (JSON). When we connect to the data in Power BI, we see an additional issue in that the data is displayed as HTML,1 as shown in Figure 6-3.

A screenshot of visual projection capability and current L M S programming data. It features 4 lines of H T M L code in 2 columns.

Figure 6-3

HTML Format in Power BI for multiline text in SharePoint

The value looks like this :
<div class="ExternalClassE89BC5E188494C92998F948FB46F942E">Yes. The current visual projection capability in my classroom is sufficient.</div>

Working with the Data in Power BI

As in the previous chapter, we connect to the SharePoint list as our data source and then Transform data to begin manipulating the data. Our first task is to deal with the HTML format. While we could extract the values as XML (like we will with JSON as follows), that is subject to error if there are stray brackets in the data AND it is more problematic for null data. Instead, we want to use a similar technique as a previous chapter where we connect to the SharePoint list in the 1.0 implementation, as shown in Figure 6-4.

A screenshot shows SharePoint online lists with site U R L. Implementation option 1.0 selected. Advanced options, o k, and cancel are visible at the bottom.

Figure 6-4

Connecting to a SharePoint list with the 1.0 implementation

This gives us additional options when transforming our data. In particular, it exposes the FieldValuesAsText column, which allows us to extract the plain text version of any of our column values. This is shown in Figure 6-5.

A screenshot of field value as text with several options in the center pane. A downward arrow from the o k button points to visual projection capabilities.

Figure 6-5

Retrieving Field Values as Text

Our first task is to deal with null data. We initially sent out the survey requiring answers, but we updated it later for the stragglers with a request to just fill out the first four questions with the option to skip the rest. For those columns that were JSON, this meant we needed put in [] for that data. This is shown in Figure 6-6. Note that we used to enter null for the Value to Find but now we just leave it blank.

A screenshot of replace values with 2 panes for value to find and replace with. Advanced options, o k, and cancel are visible at the bottom.

Figure 6-6

Replacing null JSON values in Power BI

Our next task is to remove all the extraneous text from our IT Level and instead have it show Level 1 through Level 4. As shown in Figure 6-7, we use the Extract option and then use First Characters, which turns out to be seven. This will make our visualizations much cleaner.

A screenshot of an I T level with option extract and first characters highlighted in the right pane.

Figure 6-7

Extracting the first seven characters of the IT Level column

We have two tasks when working with the JSON data. The first is to eliminate the brackets and the quotes. We do this by going to Extract – JSON, as shown in Figure 6-8.

A screenshot of the current office networks with option parse and J S O N highlighted in the top pane.

Figure 6-8

Parsing JSON data in a multiple response column

The second task is more difficult. Notice how the first row contains EDU (wired), MIL while the second contains EDU (wired), EDU (wireless). If we do a visualization “as is,” one “pie slice” will be the EDU (wired), MIL combination and the other will be the EDU (wired), EDU (wireless) combination. What we actually want is to show two EDU (wired) answers and one each for MIL and EDU (wireless). To make this happen, we need to split each response into its own row. As shown in Figure 6-9, we click on the “Expand” button at the top of the column.2 We then select Expand to New Rows. 3 Each entry gets its own row. Notice the circled Id values in Figure 6-9. These are duplicated and are how we will match up these extra rows to the original responses .

A screenshot of the current office networks with option expand to new rows highlighted in the top left and each I D circled in the list below.

Figure 6-9

Expanding multiple responses into separate rows

We do this for each of the multiple response columns. In our example, we started with 270 rows of data. After expanding all the columns, we ended up with 1,227,676 rows! Luckily, Power BI can handle this extra data just fine.

Visualizing the Data in Power BI

After performing these steps on all our JSON columns,4 we are ready to visualize the data. As we have seen in previous examples, we put the desired column (Current Networks (Office) in this case) as the Details and then find a unique value like Id for the Count. Figure 6-10 shows the results . When we look at the tooltip showing the # of Responses for EDU (wired), we see the problem. 308,753 responses reflect all the expanded rows described earlier in this chapter and does not reflect the true data.

A screenshot of current network 308753 with a pie chart in the center pane and a box of visualizations with several options in the right pane.

Figure 6-10

Erroneous results when using Count

We can now take advantage of the fact we noticed in Figure 6-9 that the Id values were duplicated when we expanded the rows. We use Count (Distinct) instead as shown in Figure 6-11. The # of Responses is now reasonable at 261. Notice how the percentage is much different as well.

A screenshot of current network 261 with a pie chart in the center pane and a box of visualizations with several options in the right pane.

Figure 6-11

Correct results when using Count (Distinct)

Another implication of the expanded rows becomes apparent when we want to view the actual data. I normally use a Table for this purpose. If we do that here, however, we see all the duplicate data (name, department, IT Level, etc.). Instead, we can use a Matrix, as shown in Figure 6-12. We add each desired element to Rows and then click the Expand all down one level in the hierarchy button (see arrow in Figure 6-12) until it is disabled. We then go to the Format your visual tab and turn off Stepped Layout under Row headers ➤ Options.

A screenshot of a matrix with 6 pie charts in the center pane and a box of visualizations with several options in the right pane.

Figure 6-12

Using a Matrix without stepped layout instead of a Table

Working with Yes/No Data

Our final task for this chapter is to effectively display Yes/No (or Boolean) data. Figure 6-13 shows a portion of a daily status check form. It is largely a series of Yes/No/Other questions on whether various systems and networks are operational.

A screenshot of C F A M daily check with a part of the daily status check questionnaire in the center pane.

Figure 6-13

Daily Check form showing Yes/No data

While we could easily make a standard pie chart for each question showing the numbers of Yes and No responses, this doesn’t give us the desired “dashboard ” of both the most recent date (Figure 6-14) and the results over a given time frame (Figure 6-15). To do this, we need to make some adjustments to the data.

A screenshot of a dashboard featuring 11 completed tasks with the most recent date of check in the bottom right.

Figure 6-14

Dashboard showing most recent Daily Check data

A screenshot of a dashboard with timeframe. It features 11 tasks. Several pie charts and a date range tracker are visible in the right pane.

Figure 6-15

Dashboard showing Daily Check data over time

As with previous examples, we go to Transform data. Before we get into the Yes/No data, let’s look at how we come up with the “Most Recent Date.” We don’t want to limit this to “today” since the daily check may not have occurred yet and we want to use yesterday’s (or Friday’s if this a Monday) data. For this, I found the easiest way was to first create a MostRecentSubmissionDate column using this formula:
=List.Max(#"Changed Type"[Created])
But where does the #"Changed Type" bit come from? That is actually referencing a prior step. To see that, we need to look at the Advanced Editor , as shown in Figure 6-16. With some study, you will see that each step is named and then the next or other later rows refer to that name (#”Changed Type”, #”Added Custom”, etc.).

A screenshot of advanced editor, C F A M daily check programming data. The changed type is highlighted at two places in the code.

Figure 6-16

Using the Advanced Editor to reference a previous step

This gives us a column that has the same value in every row, for example 8/23/2022 1:34:33 PM. We then add a Conditional Column that we name isMostRecentDate . We put 1 in this column if the Created column equals the MostRecentSubmissionDate and 0 if not, as shown in Figure 6-17.

A screenshot of add conditional column with the option most recent date and others. Option O K and cancel in the bottom right.

Figure 6-17

isMostRecentDate conditional column

We change its type to True/False and then use it as the Page Level Filter , as shown in Figure 6-18. This limits all data on the page/tab to only the most recent.

A screenshot of page level filter with the options: select all, false, and true.

Figure 6-18

isMostRecentDate page-level filter

Getting back to our Yes/No data, we remove all the unneeded columns (Author, Editor, etc. that came over from SharePoint). We then select all the Yes/No columns and choose to Unpivot Only Selected Columns, as shown in Figure 6-19. This gives us a new Attribute column (that was the previous column name) and a new Value column (which is Yes/No/Other). We rename these to System/Task and Operational/Completed, respectively.

A screenshot of attribute and value columns shows 8 wired or wireless attributes. A pop-up in the right pane with Unpivot only selected columns points at the value column.

Figure 6-19

Unpivoting Yes/No columns leading to Attribute and Value columns

We are now ready to make our visualization. We choose Stacked Bar Chart and use System/Task for the Axis, Operational/Completed as the Legend and the Count of Operational/Completed as the Value. This is shown in Figure 6-20.

A screenshot of a stacked bar chart of operational, completed on the left. A screenshot of visualizations with several options on the right.

Figure 6-20

Configuring the stacked bar chart

Since Yes is good and No is bad in our context, we set the Data Colors so that Yes is green and No is red. This is shown in Figure 6-21.

A screenshot of visualizations with several options: y axis, x axis, legend, gridlines, zoom slider, bars, colors.

Figure 6-21

Configuring custom data colors

Summary

This chapter introduced us to several tools and techniques that help us work with some of the more complicated data types and still accurately and effectively display our data and get the insights that we need.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.190.253.222