© Leila Etaati 2019
Leila EtaatiMachine Learning with Microsoft Technologieshttps://doi.org/10.1007/978-1-4842-3658-1_4

4. R Visualization in Power BI

Leila Etaati1 
(1)
Aukland, Auckland, New Zealand
 

Power BI is a self-service business intelligence (BI) software. This tool can be used for data visualization, data cleaning, modeling, analysis, and collaboration at enterprise scale. Many books and blogs have been published about how to use Power BI. In this chapter, I am going to show how we can leverage R to create better visualizations and get additional value from Power BI. In this chapter, I will explain how to set up R within Power BI, how to draw charts in Power BI using R scripts, how to set up the Power BI report environment, how to set up Power BI to write R code, and how to draw R charts in Power BI.

Power BI

The Power BI report environment helps us to visualize data. According to Dr. Abela [1], there are four main purposes of data visualization, as follows and represented graphically in Figure 4-1:
  1. 1.

    Data comparison

     
  2. 2.

    Data relationship

     
  3. 3.

    Data composition

     
  4. 4.

    Data distribution

     
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig1_HTML.jpg
Figure 4-1

Data visualization diagram [1]

Power BI Desktop is a free application that you can download from https://powerbi.microsoft.com/en-us/desktop/ . The Power BI Visualizations panel supports most of the charts suggested in Figure 4-1. It is possible to extend the visualization capabilities, by importing custom visuals from the marketplace or other sources.

Power BI Desktop helps users to get data from different sources with the Get Data options (Figure 4-2). To learn more about Power BI, go to http://radacad.com/online-book-power-bi-from-rookie-to-rockstar .
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig2_HTML.jpg
Figure 4-2

Power BI Visualizations environment

Setting Up R in Power BI

To draw an R chart in Power BI, the first step is to install at least one R version or Microsoft R Open. Now we must specify what R version we are going to use. To do this, we click the File menu, then click Options and Settings, then Options. Under the Global option, click R Scripting, to specify the R version (Figure 4-3).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig3_HTML.jpg
Figure 4-3

Setting up the R environment in Power BI

As mentioned in Chapter 3, R is based on packages. Whenever you use a specific package, you must install it on your machine first. (If you are using Power BI Desktop, most of the packages are already installed in the service.) You can install packages using an R IDE, e.g., R Studio. Then, in the Power BI R editor, you must refer to that package. Moreover, you ensure that you are using the right version of the R package that you already installed (Figure 4-4).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig4_HTML.jpg
Figure 4-4

Specifying an R version and environment in Power BI

Writing R Code in Power BI

It is possible to write R code in Power BI’s report area. To do that, we must import data into Power BI report.

A data set that includes car specifications, such as speed in town and along a highway, cylinders, and so forth, is available to download at https://forge.scilab.org/index.php/p/rdataset/source/tree/master/csv/ggplot2/mpg.csv . This data set is free and titled “mpg.csv.” I am going to use this data set to show the speed of the car on the highway and the city, the number of cylinders the car has, its year of production, and type of drive (front wheel, rear wheel, and so on) in one picture. According to Figure 4-1, to show the comparison of the data among the different items, we must use a table chart. We are going to import data into Power BI Desktop via the Get Data option and import a CSV file (Figure 4-5).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig5_HTML.jpg
Figure 4-5

Importing a CSV file into Power BI Desktop

After importing the data, you should be able to see it under Fields. Click the R and drag it into the whitespace area (Figure 4-6).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig6_HTML.jpg
Figure 4-6

Putting R into a whitespace area

Expand the mpg data set (Figure 4-7) and choose cty (speed in the city), hwy (speed on the highway), and cyl (cylinder) options.
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig7_HTML.jpg
Figure 4-7

Selecting related data from the mpg file

The R scripts editor will be enabled. Under R script editor, you will see some R codes. The crosshatch (#) is a symbol for making comments in R and is not executed as script. Power BI automatically puts the selected fields in a variable named dataset, so all fields (cty, hwy, and cyl) will be stored in a data set variable equivalent to using the <- syntax. Also, Power BI automatically removes any duplicated rows. This is explained in the R script editor area (Figure 4-8).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig8_HTML.jpg
Figure 4-8

Active Python environment in Visual Studio 2017

Next, we are going to put R codes for drawing a two-dimensional graph in Power BI. In Power BI, to use any R scripts, install the package in R, using the install.package() function. In our case, we must install ggplot2. We use the function library(ggplot2) to refer to this package. The first line of code in the R editor is library(ggplot2). This package contains some important functions for drawing charts.

To draw a chart, I first use the ggplot function, to configure a two-dimensional chart. The first argument is dataset, which holds our three fields. Then we have another function inside ggplot2, named aes, which identifies which file should be on the x or y axes. Finally, in the chart, I also want to indicate the number of the car’s cylinders. This can be done by adding another layer in the aes function: Size. So, cars with more cylinders will have bigger dots (see Figure 4-9).
t←ggplot(dataset, aes(x=cty, y=hwy,size=cyl))
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig9_HTML.jpg
Figure 4-9

R script editor

However, the preceding figure shows the graph area without any specific charts. We need a scatter chart to indicate the city and highway speeds on x and y axes. ggplot2’s geom_point function draws a scatter chart. This function has value of pch=21, which relates to the shape of the dot in the chart. For example, if I set this value as 20, it becomes a filled cycle; a value of 23 becomes a diamond shape. To run the code, click the run option (Figure 4-10).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig10_HTML.jpg
Figure 4-10

How to run the R code

After clicking the run code option, the charts will be shown in R visual (Figure 4-11).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig11_HTML.jpg
Figure 4-11

Drawing a chart using the ggplot function

In Figure 4-11, you can see that we have three different fields: highway and city speed, along the y and x axes, respectively. A car’s cylinder variable is shown as a different cycle size. However, you might need a bigger cycle, to differentiate cars with 8 and 4 cylinders. We accommodate this need by adding another layer, by adding a function name, as follows:
scale_size_continuous(range=c(1,5))
Following, is the entire code:
t<-ggplot(dataset, aes(x=cty, y=hwy,size=cyl)) + geom_point(pch=23)+scale_size_continuous(range=c(1,5))
in the scale_size_continues(range=c(1,5))
The difference between the lowest and highest value specified is 5. I can increase this difference by changing it from 5 to 10. The resulting scatter chart is shown in Figure 4-12.
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig12_HTML.jpg
Figure 4-12

Scatter chart showing speed in the city and on the highway, plus the number of cylinders

In the next example, I have changed the pch value to 24 and added another code inside the aes function name: fill=Red. This means that the chart will show solid red rectangles (Figure 4-13).
t<-ggplot(dataset, aes(x=cty, y=hwy,size=cyl,fill="Red")) + geom_point(pch=24)+scale_size_continuous(range=c(1,5))
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig13_HTML.jpg
Figure 4-13

The final chart

By using the facet command in R, it is possible to show five different variables in just one chart. This helps us to have more dimension in our chart. For this we are going to show the year a car was produced and its type of drive. Therefore, we must change the code, as follows:
t← ggplot(dataset, aes(x=cty, y=hwy,color = factor(cyl))) + geom_point(size=4)

By changing the aes function argument, we have replaced size argument with color. This indicates that I want to differentiate a car’s cylinder values not just by cycle size, but that I am going to show them by allocating different colors to them. Hence, I change the aes function as in the preceding code snippet.

We must change the code, as follows:
library(ggplot2)
t<-ggplot(dataset, aes(x=cty, y=hwy,color = factor(cyl))) + geom_point(size=4)
t<-t + facet_grid(year ~ drv)
Now, you can see the car’s speed on the highway and in the city in y and x axes. Also, we have cylinders as a color and drive and year of production as a facet (Figure 4-14).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig14_HTML.jpg
Figure 4-14

Drawing a table chart showing five variables in the same chart

R Features in Power BI

R script editor has some features that help us to better use Power BI Desktop, such as slice and dice and edit R code.

Slice and Dice

R visuals are interactive. It is possible to slice and dice them. We are going to use a slicer to slice the facet chart. We also are going to insert a year slicer in the report area, to filter the R visual (Figure 4-15).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig15_HTML.jpg
Figure 4-15

Slicing and dicing the R visual

Edit R Code in RStudio

It is possible to open the code in RStudio with the data. There is also an option to run the R code in RStudio using the data we have (Figure 4-16).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig16_HTML.jpg
Figure 4-16

Running the R code

After clicking the Run option, a page will be open in RStudio that contains the data related to the car. The imported data set contains all changes in data (Figure 4-17).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig17_HTML.jpg
Figure 4-17

Opening R code in RStudio

Custom Visuals

Using custom visuals is another way to extend the capability of visualization in Power BI. There are two ways to access custom visuals:
  1. 1.

    Use existing custom visuals available from the Office store

     
  2. 2.

    Create custom visuals

     

Custom Visuals in the Power BI Office Store

The Office store provides a variety of custom visuals to the Power BI user. To access them, you must first sign in to your work account (Figure 4-18).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig18_HTML.jpg
Figure 4-18

Signing in to a Power BI account

Then, in the Power BI Visualizations panel, click the three dots and select the Import from store option (Figure 4-19).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig19_HTML.jpg
Figure 4-19

Importing visuals from the Power BI store

Next, you will see a page with the title “Power BI Custom Visuals.” There are many categories listed here, such as filters, KPIs, Maps, Advanced Analytics, Time, Gauges, Infographics, and Data Visualizations (Figure 4-20).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig20_HTML.jpg
Figure 4-20

Power BI Custom Visuals pane

As mentioned, Power BI Custom Visuals includes an Advanced Analytics category. By clicking it, you will see such advanced analytics as time series, associative rules, clustering, and more. I am going to use one of these custom visuals to forecast milk production.

The first step is to import a milk data set into Power BI Desktop. The data has two columns: date of milk production and value of the milk (Figure 4-21).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig21_HTML.jpg
Figure 4-21

Milk production data

We must now import custom visuals from Power BI Custom Visuals, to forecast milk production (Figure 4-22).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig22_HTML.jpg
Figure 4-22

Advanced Analytics time series custom visual

After importing Custom Visuals, you will see that it has been added to the standard visual panel (Figure 4-23).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig23_HTML.jpg
Figure 4-23

Forecasting a custom visual

Having imported the forecasting custom visual, we must now choose the date and value for the data field.

This custom visual uses an Exponential Smoothing algorithm for forecasting milk production over the next ten months (Figure 4-24).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig24_HTML.jpg
Figure 4-24

Forecasting milk production

The forecasting chart shows in yellow the actual data related to the production of milk in the previous months, and the forecasting for the later ten months is in red.

The algorithm behind the scenes is not accessible, and we are only able to change the parameters (Figure 4-25).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig25_HTML.jpg
Figure 4-25

Forecast milk production for the next ten months

Custom visuals in Power BI extend the possibility of having different charts. However, the number of custom visuals created by the Microsoft team is limited. To extend visualization capabilities, there is a way to create custom visuals by using R scripts. In the next section, I will explain the process of creating custom visuals.

Creating Custom Visuals

Some of the main Power BI standard visuals in Power BI Desktop are widely used. However, as mentioned in the previous section, it is possible to extend these with custom visuals. To access the custom visuals, you must click Market Place in Power BI Desktop. There are two ways to create custom visuals using Java scripts or R codes.

I will explain briefly the process of creating custom visuals by using R codes.
  1. 1.

    The first step is to install NodeJS from https://nodejs.org/ .

     
  2. 2.

    Next, install Power BI visuals tools using the command prompt (Figure 4-26):

     
npm install -g powerbi-visuals-tools
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig26_HTML.jpg
Figure 4-26

Installing Power BI custom visuals via the command prompt

  1. 3.
    After installing the Power BI custom visuals, to ensure that they are properly installed, we must run the pbiviz command. After running this command, the information about Power BI Custom Visual Tool will be shown in the command prompt console (Figure 4-27).
    ../images/463840_1_En_4_Chapter/463840_1_En_4_Fig27_HTML.jpg
    Figure 4-27

    Confirming Power BI custom visuals

     
  2. 4.

    Next, we must create an rhtml template. To create a template folder, we first must create an empty folder in the C drive. After creating a new folder with the name CustomVisual in the C folder, then, using the command prompt, we must change the directory to C: CustomVisual. From the command prompt, we run the following code:

     
pbiviz new sampleRHTMLVisual -t rhtml
By running the code, a folder with the name sampleRHTMLVisual will be created in CustomVisual folder. In this folder, there is an R file with the name script.r (Figure 4-28).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig28_HTML.jpg
Figure 4-28

A sample folder for custom visuals

This is a folder that provides a template with which to create other R custom visuals. Check the file script. R inside the folder (Figure 4-29).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig29_HTML.jpg
Figure 4-29

R codes inside the script file

Now, as you can see in the code in Figure 4-29, we require two libraries, Plotly and ggplot2, to draw a simple ggplot2 chart with Plotly.

Here, the data set has been hard-coded for iris, which is an open source data set in R. The plot gets the data from the iris data set and shows the petal and the length and species of the flower. Then we use the ggplotly function to show the data.

As a result, we have an R script. Now I first will create a package from this, then I will write my codes to create different charts. Hence, I return to the command prompt and type pbiviz package in the folder (Figure 4-30).
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig30_HTML.jpg
Figure 4-30

Creating a custom visual in the command prompt

  1. 5.
    Open the custom visual in the Power BI file. To do this, we must select the Import from file option in the Visualizations standard panel (Figure 4-31).
    ../images/463840_1_En_4_Chapter/463840_1_En_4_Fig31_HTML.jpg
    Figure 4-31

    Importing a custom visual into Power BI

     
  2. 6.
    After choosing the Import from file option, we must browse the sampleRHTMLVisual folder and look for the dist folder. In this folder, there is a pbiviz file with the folder name. We must import this file into the Power BI Visualizations panel (Figure 4-32).
    ../images/463840_1_En_4_Chapter/463840_1_En_4_Fig32_HTML.jpg
    Figure 4-32

    Imported custom visual

     
  3. 7.
    The last step is to drag and drop the custom visual into the Power BI report area (Figure 4-33). The visual is not related to the specific data.
    ../images/463840_1_En_4_Chapter/463840_1_En_4_Fig33_HTML.jpg
    Figure 4-33

    Custom visual with R code

     

We can change the R code and create a custom visual with a different name and icon. In another example, I am going to create a table chart. (I already explained the R codes for this earlier in the chapter.)

First, we must return to the sampleRHTMLVisual folder we have from the sample Power BI package, then copy the folder, to create a new custom visual. We change the name of the folder to TableChart. There is a pbiviz file here that contains information about the name of the custom visual, the icon, and so forth. We must change the pbiviz.json files content.
  1. 1.

    Change the name of the custom visual to “TableChart.”

     
  2. 2.

    Create an icon for a custom visual with the dimension 20×20 and put it into the assets folder.

     
  3. 3.
    Change the address of the icon in the JSON file (Figure 4-34).
    ../images/463840_1_En_4_Chapter/463840_1_En_4_Fig34_HTML.jpg
    Figure 4-34

    JSON file name must be changed

     
  4. 4.

    After changing the name and creating a new icon for the custom visual, we must save the JSON file and put the created icon in the assets folder.

     
  5. 5.
    Next, we must change the R scripts in the scripts. R file. In the actual code, section, replace the existing code with the following one.
    if(ncol(Values)==5)
    {
      names(Values)[1]<-paste("X")
      names(Values)[2]<-paste("Y")
      names(Values)[3]<-paste("Z")
      names(Values)[4]<-paste("W")
      names(Values)[5]<-paste("M")
        g=ggplot(Values,aes(x=X,y=Y,color=Z))+geom_jitter()+facet_grid( M ~  W,scales = "free")
      }
     
  6. 6.

    Save the file.

     
  7. 7.
    We must now return to the command prompt environment and change the directory to the sampleRHTMLVisual folder, by typing pbiviz package (Figure 4-35).
    ../images/463840_1_En_4_Chapter/463840_1_En_4_Fig35_HTML.jpg
    Figure 4-35

    pbiviz command in the command prompt

     
  8. 8.
    After running the code, we should be able to see a new pbiviz file with name TableChart in the dist folder (Figure 4-36).
    ../images/463840_1_En_4_Chapter/463840_1_En_4_Fig36_HTML.jpg
    Figure 4-36

    Custom visual for a table chart

     
  9. 9.

    Finally, we must import the created custom visual into Power BI Desktop and pass the data fields to the visual.

     
As mentioned, we must change the code, as follows:
if(ncol(Values)==5)
{
  names(Values)[1]<-paste("X")
  names(Values)[2]<-paste("Y")
  names(Values)[3]<-paste("Z")
  names(Values)[4]<-paste("W")
  names(Values)[5]<-paste("M")
    g=ggplot(Values,aes(x=X,y=Y,color=Z))+geom_jitter()+facet_grid( M ~  W,scales = "free")
  }

The main data set is Values. The first item in the Values data set relates to the x axis, the second to the y axis, and so forth.

So, in Power BI Desktop, be careful about the axis you specify in the R code.

As you can see in Figure 4-37, the imported visual has a name and icon. In addition, in the data fields, we have Values as the main data set, and the sequence of the variable we put there is as in the code we specify for each of them. For example, the cty field will be located on the x axis, hwy on the y axis (the second value), and so on.
../images/463840_1_En_4_Chapter/463840_1_En_4_Fig37_HTML.jpg
Figure 4-37

Custom visual with a specific R code, name, and icon

It is possible to change the Value fields to have separate x, y, legend, row, and column data fields. This can be done from capabilities.JSON. Additional information on how to change the data fields and how to add more settings is accessible via other Microsoft sources and the RADACAD web site .

Summary

This chapter explained how to set up Power BI Desktop to write R code. It also described how to draw a simple chart using R packages, such as ggplot2 and Plotly, inside Power BI. A very brief explanation of the available custom visuals for advanced analytics in Power BI was provided. Finally, the process of how we can create R custom visuals by writing R and JSON scripts was addressed briefly. In the next chapter, the main concepts for businesses to understand before initiating machine learning will be explained.

Reference

  1. [1]

    Dr. Abela, The Extreme Presentation ™ Method: Extremely effective communication of complex information, “Choosing a Good Chart,” http://extremepresentation.typepad.com/blog/2006/09/choosing_a_good.html , September 6, 2006.

     
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.195.198