Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

J. M. RhodesCreating Business Applications with Microsoft 365https://doi.org/10.1007/978-1-4842-8823-8_6

6. Power BI Challenges with JSON, XML, and Yes/No Data

Jeffrey M. Rhodes¹

(1)

Colorado Springs, CO, USA

Adding in multiple answers, multiple lines of text, and Yes/No responses leads to some additional challenges in visualizing data. This chapter looks at some of these challenges and how to address them.

JSON and HTML Data Formats

Our first example comes from an IT Demographic Survey of all our personnel. Figure 6-1 shows a section of the survey that reveals problematic data.

Figure 6-1
Form questions that generate JSON and HTML

I used the same technique as the previous chapter to copy each response to a SharePoint list using Power Automate. We see the first inkling of an issue when we look at the data in SharePoint, as shown in Figure 6-2. The extra explanatory text after Level 4 gets copied (since it was part of the answer).

Figure 6-2
SharePoint display of the data showing the complete answer text and JSON format

More significantly, the multiple answers get represented as:

["EDU (wired)","MIL"]

Those with some web programming experience might recognize this as JavaScript Object Notation (JSON). When we connect to the data in Power BI, we see an additional issue in that the data is displayed as HTML,¹ as shown in Figure 6-3.

Figure 6-3
HTML Format in Power BI for multiline text in SharePoint

The value looks like this :

<div class="ExternalClassE89BC5E188494C92998F948FB46F942E">Yes. The current visual projection capability in my classroom is sufficient.</div>

Working with the Data in Power BI

As in the previous chapter, we connect to the SharePoint list as our data source and then Transform data to begin manipulating the data. Our first task is to deal with the HTML format. While we could extract the values as XML (like we will with JSON as follows), that is subject to error if there are stray brackets in the data AND it is more problematic for null data. Instead, we want to use a similar technique as a previous chapter where we connect to the SharePoint list in the 1.0 implementation, as shown in Figure 6-4.

Figure 6-4
Connecting to a SharePoint list with the 1.0 implementation

This gives us additional options when transforming our data. In particular, it exposes the FieldValuesAsText column, which allows us to extract the plain text version of any of our column values. This is shown in Figure 6-5.

Figure 6-5
Retrieving Field Values as Text

Our first task is to deal with null data. We initially sent out the survey requiring answers, but we updated it later for the stragglers with a request to just fill out the first four questions with the option to skip the rest. For those columns that were JSON, this meant we needed put in [] for that data. This is shown in Figure 6-6. Note that we used to enter null for the Value to Find but now we just leave it blank.

Figure 6-6
Replacing null JSON values in Power BI

Our next task is to remove all the extraneous text from our IT Level and instead have it show Level 1 through Level 4. As shown in Figure 6-7, we use the Extract option and then use First Characters, which turns out to be seven. This will make our visualizations much cleaner.

Figure 6-7
Extracting the first seven characters of the IT Level column

We have two tasks when working with the JSON data. The first is to eliminate the brackets and the quotes. We do this by going to Extract – JSON, as shown in Figure 6-8.

Figure 6-8
Parsing JSON data in a multiple response column

The second task is more difficult. Notice how the first row contains EDU (wired), MIL while the second contains EDU (wired), EDU (wireless). If we do a visualization “as is,” one “pie slice” will be the EDU (wired), MIL combination and the other will be the EDU (wired), EDU (wireless) combination. What we actually want is to show two EDU (wired) answers and one each for MIL and EDU (wireless). To make this happen, we need to split each response into its own row. As shown in Figure 6-9, we click on the “Expand” button at the top of the column.² We then select Expand to New Rows. ³ Each entry gets its own row. Notice the circled Id values in Figure 6-9. These are duplicated and are how we will match up these extra rows to the original responses .

Figure 6-9
Expanding multiple responses into separate rows

We do this for each of the multiple response columns. In our example, we started with 270 rows of data. After expanding all the columns, we ended up with 1,227,676 rows! Luckily, Power BI can handle this extra data just fine.

Visualizing the Data in Power BI

After performing these steps on all our JSON columns,⁴ we are ready to visualize the data. As we have seen in previous examples, we put the desired column (Current Networks (Office) in this case) as the Details and then find a unique value like Id for the Count. Figure 6-10 shows the results . When we look at the tooltip showing the # of Responses for EDU (wired), we see the problem. 308,753 responses reflect all the expanded rows described earlier in this chapter and does not reflect the true data.

Figure 6-10
Erroneous results when using Count

We can now take advantage of the fact we noticed in Figure 6-9 that the Id values were duplicated when we expanded the rows. We use Count (Distinct) instead as shown in Figure 6-11. The # of Responses is now reasonable at 261. Notice how the percentage is much different as well.

Figure 6-11
Correct results when using Count (Distinct)

Another implication of the expanded rows becomes apparent when we want to view the actual data. I normally use a Table for this purpose. If we do that here, however, we see all the duplicate data (name, department, IT Level, etc.). Instead, we can use a Matrix, as shown in Figure 6-12. We add each desired element to Rows and then click the Expand all down one level in the hierarchy button (see arrow in Figure 6-12) until it is disabled. We then go to the Format your visual tab and turn off Stepped Layout under Row headers ➤ Options.

Figure 6-12
Using a Matrix without stepped layout instead of a Table

Working with Yes/No Data

Our final task for this chapter is to effectively display Yes/No (or Boolean) data. Figure 6-13 shows a portion of a daily status check form. It is largely a series of Yes/No/Other questions on whether various systems and networks are operational.

Figure 6-13
Daily Check form showing Yes/No data

While we could easily make a standard pie chart for each question showing the numbers of Yes and No responses, this doesn’t give us the desired “dashboard ” of both the most recent date (Figure 6-14) and the results over a given time frame (Figure 6-15). To do this, we need to make some adjustments to the data.

Figure 6-14
Dashboard showing most recent Daily Check data

Figure 6-15
Dashboard showing Daily Check data over time

As with previous examples, we go to Transform data. Before we get into the Yes/No data, let’s look at how we come up with the “Most Recent Date.” We don’t want to limit this to “today” since the daily check may not have occurred yet and we want to use yesterday’s (or Friday’s if this a Monday) data. For this, I found the easiest way was to first create a MostRecentSubmissionDate column using this formula:

=List.Max(#"Changed Type"[Created])

But where does the #"Changed Type" bit come from? That is actually referencing a prior step. To see that, we need to look at the Advanced Editor , as shown in Figure 6-16. With some study, you will see that each step is named and then the next or other later rows refer to that name (#”Changed Type”, #”Added Custom”, etc.).

Figure 6-16
Using the Advanced Editor to reference a previous step

This gives us a column that has the same value in every row, for example 8/23/2022 1:34:33 PM. We then add a Conditional Column that we name isMostRecentDate . We put 1 in this column if the Created column equals the MostRecentSubmissionDate and 0 if not, as shown in Figure 6-17.

Figure 6-17
isMostRecentDate conditional column

We change its type to True/False and then use it as the Page Level Filter , as shown in Figure 6-18. This limits all data on the page/tab to only the most recent.

Figure 6-18
isMostRecentDate page-level filter

Getting back to our Yes/No data, we remove all the unneeded columns (Author, Editor, etc. that came over from SharePoint). We then select all the Yes/No columns and choose to Unpivot Only Selected Columns, as shown in Figure 6-19. This gives us a new Attribute column (that was the previous column name) and a new Value column (which is Yes/No/Other). We rename these to System/Task and Operational/Completed, respectively.

Figure 6-19
Unpivoting Yes/No columns leading to Attribute and Value columns

We are now ready to make our visualization. We choose Stacked Bar Chart and use System/Task for the Axis, Operational/Completed as the Legend and the Count of Operational/Completed as the Value. This is shown in Figure 6-20.

Figure 6-20
Configuring the stacked bar chart

Since Yes is good and No is bad in our context, we set the Data Colors so that Yes is green and No is red. This is shown in Figure 6-21.

Figure 6-21
Configuring custom data colors

Summary

This chapter introduced us to several tools and techniques that help us work with some of the more complicated data types and still accurately and effectively display our data and get the insights that we need.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 6. Power BI Challenges with JSON, XML, and Yes/No Data

Create new playlist

Sign In