CHAPTER 8

image

Visualizing Your Models with Power BI

Building predictive models is essential. However, explaining the results is just as important. Even an excellent predictive model can be seriously undermined by a failure to effectively communicate the results. Data visualization helps data scientists explain the results of predictive models to their stakeholders and end users. In this chapter, we show how you can share the results of your models through Power BI.

Overview

There is a large body of literature on data visualization that is beyond the scope of this book. This chapter focuses on Microsoft’s Power BI. You will learn how to use it to share the results of your model through visualization. You will explore three approaches for visualizing your results with Power BI.

  1. You’ll score a test dataset in Azure Machine Learning and use Power BI tools in Microsoft Excel for visualization.
  2. In the second approach, you will learn how to score your test dataset in Excel by calling a trained predictive model through the REST API provided by Azure Machine Learning, when you publish an experiment as a web service. This enables you to score and visualize data without leaving Excel.
  3. In the third approach, you will score your test data in Azure Machine Learning and visualize it with Microsoft’s Power BI service at www.powerbi.com. This service enables you to visualize your results without installing Excel 2013 on your machine.

Introducing Power BI

This section presents a brief introduction to Power BI. We provide the information you need to visualize the results of your predictive models in Power BI. For more in-depth coverage of Power BI, please see the resources listed at the end of this section.

Power BI provides exciting new ways to visualize data, share results, and collaborate in new ways. The new experience of Power BI is based on powerbi.com, an online service that you can use to create and share reports and dashboards seamlessly. Both the old and new Power BI experiences are deeply integrated with Excel, which is Microsoft’s leading tool for business analysts. The new Power BI service at powerbi.com was released into general availability on July 24th. Let’s review the key components of Power BI.

The focal point of the new powerbi.com is a dashboard that allows you to visualize all your data in one place. Figure 8-1 shows an example of the dashboard that ships with powerbi.com. The live dashboard allows you to load data from several sources including your Excel workbooks. You can also visualize data from SQL Server Analysis (SSAS) on premises. In addition, powerbi.com ships with out-of-the box connectors for cloud Software as a Service (SaaS) solutions such as Salesforce, Zendesk, Marketo, SendGrid, GitHub, Dynamics CRM Online, and Dynamics AX. With these connectors you can visualize your data from any of these products through dashboards in powerbi.com. The dashboard supports several chart types, from the standard bar charts and pie charts to combo charts, maps, gauges, and new funnel charts.

9781484212011_Fig08-01.jpg

Figure 8-1. A sampe dashboard in powerbi.com

Microsoft ships Power BI Designer, a desktop version of Power BI that you can use to create your reports and dashboards in an offline mode. You can connect to the same data sources as the Power BI service to load data and create reports and dashboards. When you are done, you can share these with others by publishing to the Power BI Service. This allows you to do personal BI in a safe environment before sharing broadly online. For those without Excel 2013 this is a good way to create reports quickly and cheaply. So if you just need to build a BI dashboard that loads data from sources other than Excel, the Power BI Designer is a great choice. However, if you already own a license for Excel 2013 or have a lot of your data in Excel, then you can simply build your data models in Excel and share the results as dashboards in PowerBI.com.

The best authoring experience on premises is in Excel, which is Microsoft’s leading tool for business analysis. Excel has rich BI capabilities such as Power Query, PowerPivot, Power View, and Power Map. You can create rich BI models on your desktop and then share these by publishing to the Power BI service (powerbi.com). Let’s review the key BI features in Excel 2013:

  • Power Query is an Excel add-in that you can use to discover, combine, and refine data from multiple sources of relational and non-relational data. Power Query allows you to pull data from traditional databases such as SQL Server, Oracle, Teradata, IBM DB2, and non-relational sources including Facebook, Hadoop (both HDFS and Microsoft Azure HDInsight), SharePoint lists, Wikipedia, etc. This tool significantly simplifies the process of loading and transforming data from these myriad sources and analyzing it in Excel. You can download the add-in from the Microsoft Download site at www.microsoft.com/en-us/download/details.aspx?id=39379. More details on Power Query are available at https://support.office.com/en-us/article/Introduction-to-Microsoft-Power-Query-for-Excel-6E92E2F4-2079-4E1F-BAD5-89F6269CD605.
  • PowerPivot is an in-memory BI engine that ships natively in Excel 2013. Before 2013, PowerPivot was available as an Excel add-in. However, in Excel 2013, PowerPivot ships natively and is now called the Data Model. With PowerPivot you can create a power BI model that uses data from multiple sources all in Excel. You can add any data you load, including unstructured data, with the Power Query add-in described above. The Data Model in Excel supports the DAX language for intuitive calculations, has a Diagram View for managing relationships in your data, allows you to define calculated fields and key performance indicators, and many more features. Due to a more efficient storage model, the Data Model allows you to load very large datasets into Excel. With enough memory you can load over one billion rows of data into Excel, thanks to this in-memory column store. You are only limited by your computer’s memory.
  • Power View is an interactive tool for exploring and visualizing your data. You can use it to build dynamic and ad hoc reports easily in Excel even with very little BI experience. Power View was an Excel add-in before Excel 2013. However, it now ships natively in Excel 2013. Power View works in tandem with PowerPivot: when you create a data model in memory with PowerPivot, you can use Power View to visualize the data through rich interactive reports. Power View is also a feature of SharePoint 2013.
  • Power Map is a very cool 3D mapping tool that you can use to visualize geospatial data. For example, you can use it to map sales across different countries or cities with very little effort. The key is to have location markers in your data. And the location data does not have to be longitudes and latitudes only. Instead, Power Map can map data using any relevant address attributes such as country, city, ZIP code, etc. Power Map is an add-in for Excel 2013. You can download it from www.microsoft.com/en-us/download/details.aspx?id=38395. More information on Power Map is also available at www.microsoft.com/en-us/powerBI/power-map.aspx.

Image Note  Please refer to http://powerbi.com and www.youtube.com/user/mspowerbi for more details on Power BI.

Three Approaches for Visualizing with Power BI

Now let’s explore the three approaches for visualizing your data with Power BI. Your first visualization approach will be to score a test dataset in Azure Machine Learning Studio and then visualize it with Power BI tools in Excel.

Image Note  In this chapter, you will use the Bike Buyer model from Chapter 2. This model is published as the Buyer Propensity Model in the Azure Machine Learning Gallery. You can access the Gallery at http://gallery.azureml.net/.

Download this experiment to your workspace in Azure Machine Learning for the rest of the experiment. Figure 8-2 shows the Buyer Propensity model in Azure Machine Learning Studio.

9781484212011_Fig08-02.jpg

Figure 8-2. The Buyer Propensity Model in Azure Machine Learning Studio

Scoring Your Data in Azure Machine Learning and Visualizing in Excel

To start with, you will modify the Buyer Propensity Model to include geospatial fields such as latitude, longitude, address, city, and country. These fields were excluded from the model since they were not statistically relevant. Now you are adding them back because you will need them to visualize the results on a map.

Modify the Buyer Propensity Model to include geospatial fields with the following steps.

  1. Go to the Buyer Propensity Model in Azure Machine Learning Studio.
  2. Drag down the dataset box named BikeBuyerWithLocation to the bottom right side of the canvas.
  3. In the first Project Columns module at the top of the experiment, use the Launch Column Selector to add the variable ID. To do this,
    1. Click the Project Columns module, and in the right pane, click Launch column selector. You will see the excluded variables in the text box, as shown in Figure 8-3.

      9781484212011_Fig08-03.jpg

      Figure 8-3. Output of the first Project Columns module showing the list of excluded variables

    2. Drop the variable ID from the list of excluded variables by clicking the X on the ID variable in this list.
  4. Add the following three new modules to the bottom of your experiment: Project Columns, Join, and Convert to CSV. Connect them as shown in the diagram in Figure 8-4

    9781484212011_Fig08-04.jpg

    Figure 8-4. The Buyer Propensity Model modified to add geospatial data

  5. In the new Project Columns module, add the variables ID, Longitude, Longitude, City, ZIP Code, and Country. See Figure 8-5 for details.

    9781484212011_Fig08-05.jpg

    Figure 8-5. Adding ID and geospatial variables to the Project Columns module

  6. Next, in the Join module, use the variable ID to join your two datasets as follows:
    1. In the properties pane, use Launch column selector under Join key columns for L. Set the value to ID.
    2. Use Launch column selector under Join key columns for R. Also, set its value to ID. See Figure 8-6 for the parameters in the Join module.

    9781484212011_Fig08-06.jpg

    Figure 8-6. Parameters in the Join module

  7. Run the experiment. On completion, select the small dot on the Convert to CSV module, right-click it, and choose download from the menu.

Now that you have a scored dataset that contains a geospatial variable, you need to map your data. Next, let’s switch to Excel where you will visualize your scored dataset. To this, follow these instructions.

  1. Open the resulting CSV dataset in Excel.
  2. Save the file as an Excel workbook and name it BikeBuyerwithLocation2_Scored_dataset.xlsx.
  3. Select Power Map from the Insert ribbon in Excel.
  4. Click Launch Power Map.
  5. In the right pane, choose City in the pane named Geography and Map Level, and click Next.
  6. Then drag the field called Scored Labels to the Height box. This plots the scored labels on the map.
  7. Also, drag the field Scored Labels to the Category box. This will use the two categories (Yes and No) in the scored labels. See Figure 8-7 for the parameters in Power Map.

    9781484212011_Fig08-07.jpg

    Figure 8-7. Setting parameters in Power Map

  8. Click Map Labels in the ribbon to label the map.
  9. Click 2D Map from the ribbon, and plot Top 100 locations by YES. This shows a ranking of top cities that are most likely to respond to the campaign.

Congratulations! You have just completed your first visualization in Power BI. Your map should appear as shown in Figure 8-8. This map plots customers’ propensity to buy by city. At each city, you can see the number of customers predicted to buy (shown in blue bars) or not buy (shown in orange bars). The overlaid 2D chart shows a list of cities sorted by their propensity to buy (sorted by Yes). You can see that the top cities by propensity are West Jordan (Utah), Spokane (Washington), and Nampa (Idaho).

9781484212011_Fig08-08.jpg

Figure 8-8. A 3D plot of the Scored Labels in Power Map

Scoring and Visualizing Your Data in Excel

The second way to visualize your results is to score your model and visualize the results in Excel. In Chapter 2, you saw how to publish your experiment to run as a web service in Azure Machine Learning. Follow the steps in Chapter 2 to publish the Buyer Propensity Model as a web service in Azure Machine Learning. Name your experiment BikeTestModelScore. When you publish your model, Azure Machine Learning will automatically create an Excel spreadsheet containing an API key that you will need to access your mode. Now follow these steps to access the Excel spreadsheet containing code for calling the REST API.

  1. In Azure Machine Learning, click Web Services on the left pane to see your web services. One of them should be the BikeTestModelScore that you published.
  2. Click BikeTestModelScore. This opens a new page showing full details of your published service. Figure 8-9 shows an illustration of what you should see. The API key for your new web service is shown in the text box named API key. You will need this key to securely access your web service. There are also two URLs listed here: the first is the URL you will need to call the service interactively to score one row of data at a time. The second URL can be used to score multiple rows in batch mode with a single API call. Also listed on this page is a link to an Excel workbook from where you can also score your data.

    9781484212011_Fig08-09.jpg

    Figure 8-9. Details of the published Buyer Propensity Model in Azure Machine Learning

  3. Click the Excel link and it will download the file. Open the file in Excel. Once you enable macros, Excel will show your input parameters and the model’s output from the web service. You can enter values for the input parameters. Once you are done, Excel will automatically compute the outputs by calling the API of your web service. Figure 8-10 shows an example. In the last section, you downloaded a CSV file containing test data. You can also copy and paste several rows of input data from this CSV file into the new Excel spreadsheet. Excel will automatically compute the output of each row.

9781484212011_Fig08-10.jpg

Figure 8-10. Testing a model from Excel

Scoring Your Data in Azure Machine Learning and Visualizing in powerbi.com

The third approach to visualizing your results is to use powerbi.com. As you saw earlier, powerbi.com is an online service that you can use to visualize your data that resides on premises or in the cloud. The centerpiece of powerbi.com is the dashboard. The dashboard enables you to visualize your data using several chart types in exciting ways. Through its Q&A feature you can also search your data easily using English text. This is great for your users since they do not need to learn SQL to query the data. Also, powerbi.com offers a desktop tool called Power BI Designer that you can use to create your reports and dashboards offline. When you are ready, you can publish it online at powerbi.com. Power BI is also available as an iOS app that runs on iPads and iPhones.

Let’s get started. To use powerbi.com you need to sign up at www.powerbi.com/, which is now available as a free preview service. When you first sign up, you will see the Retail dashboard shown in Figure 8-1. Your goal is to create your own dashboards and reports using the results of the Buyer Propensity Model.

To create your own dashboard in powerbi.com you have to load your results dataset. Let’s learn how to do this.

Loading Data

In the first approach to visualization you saved a scored dataset in the file named BikeBuyerwithLocation2_Scored_dataset.xlsx. You will now load this file from your local filesystem using the following steps.

  1. Click Get Data from the top left pane. You will see that you can load data from several sources including Excel, Power BI Designer (the desktop version of Power BI), SQL Server Analysis Services, Microsoft Dynamics CRM, and non-Microsoft products such as Marketo, Salesforce, and GitHub. This is shown in Figure 8-11.

    9781484212011_Fig08-11.jpg

    Figure 8-11. Showing the data sources for powerbi.com

  2. To load your scored dataset, select Excel Workbook and then click Connect. Choose Computer and then Browse. Select the file named BikeBuyerwithLocation2_Scored_dataset.xlsx from your computer. When the data is loaded, you will see it under the Datasets menu in the left pane. Now you are ready to start building your report.

Building Your Dashboard

When your dataset is loaded, you will see a new dashboard named BikeBuyerwithLocation2_Scored_dataset.xlsx under the Dashboard menu on the left pane. This is a blank canvas you will use to create your new dashboard. Figure 8-12 shows this blank canvas. Now create your own dashboard with this dataset through the following steps.

9781484212011_Fig08-12.jpg

Figure 8-12. A blank canvas ready for your first dashboard

  1. On the right pane is a filter. The filter shows all the fields available to create your charts. Click the field named Country. By default, Power BI will plot a 2D map of the sum of IDs by country. You should choose the right variable to visualize by country.
  2. Next, you might want to see how many prospective customers will respond to the mailing campaign by country. To do that, replace ID with the predicted scores. After you have replaced the ID, unselect ID in the filters list, and instead select Scored Labels. Now the 2D map plots Scored Labels by country, so it shows how many prospective customers will or will not buy bikes by country.
  3. Next, add a second chart to show how the propensity to buy bikes varies with the level of education. To do this, follow these steps.
    1. Click the field named Education in the filters. By default, Power BI will plot a second chart showing the sum of IDs by Education. Again, this is not very informative. So you need to change the Y-axis to a more useful variable than the sum of IDs.
    2. Unselect the sum of IDs field in the Filters pane, and instead drag the field named Scored Labels to the values section of the Filters pane. See Figure 8-13 for details.

      9781484212011_Fig08-13.jpg

      Figure 8-13. Details of the second chart that plots the count of Scored Labels by Education

    3. Finally, convert the chart type to a horizontal stacked bar as follows:
      • Select the new bar chart of Scored Labels by Education.
      • Choose the top small icon on the top left of this chart. It is called the change visualization type.
      • Now select the stacked bar type. The result is shown in Figure 8-13.

At the end of these steps, your dashboard should appear as shown in Figure 8-14. This dashboard has two charts: the first is a 2D map showing propensity to buy by country, while the second shows propensity to buy by level of education.

9781484212011_Fig08-14.jpg

Figure 8-14. Complete dashboard with two charts

Summary

Data visualization is a critical tool for a data scientist because it helps to communicate the results of modeling to stakeholders. Even after creating an excellent predictive model, you need to communicate the insights obtained from the model to your stakeholders. In this chapter, you learned how to visualize the results of your models with Power BI. This chapter started with an introduction to Power BI from Microsoft. You gained essential skills to help you visualize your results with Power BI in three ways.

  1. You scored a test dataset in Azure Machine Learning and used Power BI tools in Microsoft Excel for visualization.
  2. You scored your test dataset in Excel by calling a published predictive model through the REST API of Azure Machine Learning. This enabled you to score and visualize data without leaving Excel.
  3. You scored your test data in Azure Machine Learning and visualized it with Microsoft’s Power BI service at www.powerbi.com.

You are now ready to create rich visualizations of your results with Power BI. With the knowledge you gained from this chapter you can now impress your stakeholders with dazzling dashboards that communicate the findings from your Machine Learning models. Now go forth and impress!

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.138.37.191