Chapter 3. Understanding data refresh

Let’s continue following David, the manager of budgeting at Contoso. While he explores Microsoft Power BI, he now has begun sharing his dashboards and reports with Contoso’s country/region managers around the globe. Everybody likes the idea of being able to view a report on any device and share considerations while looking at the same figures. Nevertheless, the managers are concerned that they are making decisions based on sales as of October 2015, and it is now mid-December. The figures are no longer the best data upon which to forecast sales, a fact that is even more pertinent for the products that show a clear seasonality.

Thus, David needs to retrieve the latest sales data set and refresh the Microsoft Excel model. Urged on by the country/region managers, he ends up doing this each and every morning until he begins to wonder whether this process can be automated in some way. In fact, Power BI offers an option with which he can schedule refreshes.

Introducing data refresh

In Chapter 1 and Chapter 2, you learned the basics of Power BI: how to upload a workbook containing some data, build reports and dashboards, and how to share the content with other users in your organization or outside of it. Those chapters also touched upon the basics of data refresh: uploading a new version of a workbook containing data, and using Microsoft OneDrive for Business to automate the uploading process.

For both of these scenarios, updating data meant that David needed to refresh the content of the Excel file manually and then upload it to Power BI, either via another web service, such as OneDrive for Business, or by using the Power BI user interface (UI). When it comes to refreshing your data, Power BI offers much more control. Learning it requires some attention to details. We suggest that you read this chapter in its entirety because there are some small but important details that you need to know before making any decision about your future data refresh strategy.

More important, in the first two chapters, we kept the presentation deliberately simple, trying to show only the basic concepts. In this chapter, however, we begin to delve into the details of working with Power BI. Hereinafter, the details become important.

Anyway, first, we need to focus on what “refresh” really means. In the context of this chapter, refreshing data does not mean to manually update the Excel workbook and save it as another version of the same file. Instead, we want the workbook to automatically update its content by using a connection to the source database from which we originally query the data.

Let’s review the steps in David’s simple data-processing system:

1. Data is retrieved from the database by IT. IT then gives David an Excel file containing the latest figures of sales.

2. David manually copies the information to his own version of the Sales 2015 workbook.

3. He saves it so that OneDrive uploads the content to the cloud.

4. Power BI loads the content of the file from OneDrive and updates its own internal data model.

As part of step 3, the Excel file automatically computes the forecasts generated by the country/region managers by using formulas. These results are saved in OneDrive by the managers, so the results are immediately available to Power BI, too.

The data refresh mechanisms David learned so far are useful to automate step 4. But now, he wants to automate steps 1 and 2. This requires some more understanding of how Power BI works internally.

Introducing the Power BI refresh architecture

What happens when you upload a workbook to Power BI? Let’s consider the workbook that David uploaded to look more closely at the process. Recall that his workbook contained a table. Figure 3-1 shows the flow of data.

Image

Figure 3-1: The data flow, from the original database up to Power BI.

David’s data moves from SQL Server into Excel (David does that). Next, the Excel file is uploaded to OneDrive where Power BI reads it. After reading the file, Power BI generates a SQL Server Analysis Services (SSAS) database that ultimately computes, through the Power BI UI, the dashboards and reports. Sounds complicated, right? In fact, it is, but luckily, Power BI hides all of this complexity, making it easy for you to generate reports from Excel files. Nevertheless, to understand how data refresh works, you need to have a clear picture of the complete flow of information.

To use data refresh, you need a way to pull data from the data source (SQL Server, in this example) and push it directly into the SSAS model generated by Power BI. In other words, you want to create a data flow that circumvents Excel and OneDrive (both operations are done outside of Power BI) to make it flow as shown in Figure 3-2, in which the steps that we want to remove appear in a blue box.

Image

Figure 3-2: We want to remove the steps enclosed in the blue box to make data refresh work more efficiently.

Here is what you need to do to make this happen:

• The data set cannot be a plain Excel table, because Power BI needs to know how to query the source database (SQL Server, in this example) in order to refresh the data. Obviously, it cannot rely on asking you or asking IT. The method must be formalized and use a language that Power BI can understand.

• The SSAS engine running in Power BI needs a way to access the source database. Such a database is usually located within your company (or on your laptop). Thus, you will need software that implements the connection with Power BI.

Be sure that you understand these architectural requirements well before moving on with the rest of the chapter. As you will see by reading the next sections, we stripped away most of the technical complexities that comprise data refresh, but you need to keep in mind that the aim is to create the scenario depicted in Figure 3-2.

Introducing Power BI Desktop

You might remember from Chapter 1 that Power BI offers you two ways of interacting with it: direct access to the web service, or by using Power BI Desktop. In this section, we are going to show you how to use Power BI Desktop. Power BI Desktop is an application that runs locally on your computer but offers you all of the features available on the web, plus many more options for building a data model.

By using Power BI Desktop, you do not rely on the web service to create the data model for you. Instead, you have the full modeling capabilities of Power BI available at your disposal and control.

Why should you worry about Power BI Desktop at all if the web service is capable of building a model for you? There are several reason for this, but for now, let’s concentrate on one of those reasons: you can describe the details of your dataset, which accomplishes the first requirement of our data-refresh scenarios from the preceding section. But, before we can get to that, first you need to download Power BI Desktop from the Power BI website and install it.

On the Power BI website, on the right side of the menu bar at the top, click the download button (see step 1 in Figure 3-3), and then click Power BI Desktop (see step 2).

Image

Figure 3-3: Downloads in Power BI are available in the download list, on the right side of the menu bar.

After you have downloaded Power BI Desktop, install it by following the instructions provided in the Microsoft Power BI Desktop Setup Wizard, and then start the application. A welcome screen greets you, and then you see the main Power BI Desktop window, as shown in Figure 3-4.

Image

Figure 3-4: The Power BI Desktop UI resembles the Power BI website.

The first thing you will probably notice is that the UI of Power BI Desktop is very similar to that of Power BI. Nevertheless, as you will learn, there is more to Power BI Desktop, and it is a bit more complex to use than its web-based counterpart, but it exploits the full power of the Power BI engine.

As with Power BI, the first step is to provide some data to Power BI Desktop. Because the original data is in Excel, you can begin practicing with Power BI Desktop by using the same Excel file you used earlier for the website. On the ribbon, in the Data group, click Get Data, and then click Excel (see Figure 3-5).

Image

Figure 3-5: Your first step when using Power BI Desktop is to load some data; in this example it’s from Excel.

In the Open dialog box, select an Excel file and then click Open. The Navigator dialog box opens, in which you select the source file. When you load data from Excel, you can choose to load from tables or from worksheets. Figure 3-6 shows the two sample items available. Note that for this exercise we renamed the worksheet “Sales” to “Sales Worksheet” to make the figure clearer. In your models, it is likely they will have the same name, and only the icons adjacent to the names will differentiate tables from worksheets. For this example, let’s work with the table.

Image

Figure 3-6: The Power BI Navigator helps you to choose the source for importing data in Power BI Desktop.

Select Sales, click Load to import the table into Power BI Desktop, and then close the Navigator dialog box. At this point, you will likely feel at home and in familiar surroundings. In fact, by using the Fields and Visualizations panes, you can build a report in Power BI Desktop in the very same way you built the report earlier, on the website.

For example, Figure 3-7 shows that you can build a report similar to the one you built on the website. The main difference is that, now, you are doing the work directly on your PC instead of interacting with a cloud service.

Image

Figure 3-7: This is a sample report that you can build by using Power BI Desktop.


Note

We suggest that you become familiar with the UI of Power BI Desktop by experimenting with some reports. In this chapter, we are not showing you a step-by-step guide to Power BI Desktop. Instead, we focus on the new features available in Power BI Desktop that are not in the cloud service.


Building the report, you can appreciate how user-friendly the Power BI Desktop environment is, with features such as copy-and-paste available. So, for example, if you need a visualization similar to one you have already created, copy it, paste it, et voilà: the job is done. Of course, there is more to it than that, but Power BI Desktop offers little things like this that make life much better.

Publishing to Power BI

Let’s get back to our exercise. When the model is ready, save it on your computer using the name Sales PBD. Of course, at this point, the report is local to your PC, and no one else can view it. However, you ultimately want to publish this model on the Power BI cloud service, to take advantage of all the features of Power BI, including sharing and viewing on a mobile device.

To do this, go to the Power BI Desktop ribbon, and then, on the Home tab, in the Share group, click Publish. Power BI Desktop then asks you to sign in to the Power BI service (and might ask if you want to modify your changes). As previously mentioned, Power BI Desktop is an application that runs locally on your PC, and it can work without a Power BI account, but as soon as you want to publish your data, you need to have an account (or create one) and sign in. After you sign in, Power BI Desktop shows the message depicted in Figure 3-8, confirming the operation and providing a link to the published report.

Image

Figure 3-8: When the model is published, you can immediately see it using the appropriate link.

If you open the file in the Power BI website, you will find a dataset named Sales PBD and a report with the same name. By using Power BI Desktop, you created both a model and a report, and, when publishing it to Power BI, it created both objects.


Note

The model is copied from your local file to Power BI. When the model is in Power BI, you can further enhance it, but the two versions are disconnected. Changes that you apply to the published model from within your browser are not be applied to your local version on your PC, and any subsequent publish operation that you initiate from Power BI Desktop will overwrite the changes that you made via the web browser. Thus, when you begin building models with Power BI Desktop, it is a good practice to continue updating them locally and then republish them. Do not modify the online version. You might want to use different version names for your local files in order to publish different, varied reports in the Power BI cloud service.


After you get used to this method of developing reports, you will find it extremely convenient. In fact, working with Power BI Desktop is more productive because you do not need an Internet connection and you have the full power of a Windows application. When the model is ready, you publish it, overwriting any previous version that you might already have done.

Let’s recap what we have seen so far:

• Power BI Desktop is a Windows application that offers the same features of the cloud service, but it runs on your local PC.

• You can build a model with Power BI Desktop and save it to your PC.

• You can publish a Power BI Desktop model to Power BI, but you’ll need to have or create an account and sign in first.

You might remember that this chapter is about data refresh. Why is Power BI Desktop relevant to data refresh? A Power BI Desktop file contains all the information needed to refresh the model. In fact, in the Power BI Desktop file, we created a link between the original Excel file containing figures for the budget and the Power BI Desktop model.

In Chapter 1, when we uploaded the file to Power BI, we simply copied it. However, using Power BI Desktop, we create a link between the Excel file and the Power BI Desktop file by writing a query. In reality, we do not actually author any query, but as you will learn, Power BI Desktop created the query for us, making the task transparent.

All that is missing at this point is to provide a way for Power BI to access the Excel file that is the ultimate source of our data. In fact, the original Excel file is stored on the local PC, and the Power BI cloud service cannot access it. Solving this problem is the topic of the next section.

Installing the Power BI Personal Gateway

The Power BI Personal Gateway is another piece of software that can connect with the Power BI cloud service and carry out the queries stored in the Power BI Desktop file. You can download it from the same webpage from which you downloaded the Power BI Desktop application, but this time select Power BI Gateways after you click the download button. Power BI then asks you to choose between the two Power BI gateways:

Personal Gateway This version is intended for use with personal datasets. It is simple to use and install but offers limited features regarding monitoring and security for multiple users.

Enterprise Gateway This version offers more functionality but, at the same time, involves more complexity in its setup and usage and usually requires involving your IT department.


Note

In the last chapter of this book, we briefly outline the differences between the personal and the enterprise versions of the gateway. For the purposes of this discussion on how data refresh works, those differences are negligible.


Let’s go back and visit David to see how he is progressing.

David is still experimenting with Power BI, so he has no intention of installing a complex system. For this reason, he chooses the Personal Gateway. Before jumping into the setup of the Personal Gateway, though, he needs to understand how it will be implemented.

If David runs the setup with administrative privileges (that is, he installs it as an administrator), the gateway will run as a service. On the other hand, if he installs it as a standard user, the gateway runs as a normal program. What is the difference? When the gateway runs as a service, it runs even when no user (or another user) is signed in to his PC. Conversely, if it runs as a standard program, the gateway will run only if David is signed in to his PC. This might be relevant if at some point he wants to refresh his data while at home, accessing the Power BI cloud service, but his laptop is still in the office, turned on but without anybody using it. In such a scenario, if the gateway runs as a service, the refresh operation will succeed, whereas if it runs as a normal program, it will fail.

The choice of which to run is up to you. Figure 3-9 shows David choosing to run the installer as an administrator.

Image

Figure 3-9: Right-click the downloaded installer to install the gateway as an administrator.

After you install the gateway, you must start it to complete the configuration. In fact, when you start it, it requires the credentials to access Power BI. The gateway requires them because it must contact the Power BI cloud service and begin answering queries coming from the service.

After David provides the correct sign-in, the Power BI Gateway – Personal dialog box opens, in which he must provide another set of credentials, as illustrated in Figure 3-10.

Image

Figure 3-10: As part of the configuration, the gateway asks for the credentials to use when running as a service.

Why does the gateway require these credentials? Because David wants to run it as a service. Being a service, it will run even if no one is connected to the PC, and it requires credentials to access files and connections from David’s PC. In other words, the gateway will have the same permissions that David has on his PC, and those are required to let it access his files and data sets as if it were him.

When everything is complete, the gateway informs you that one operation still remains: you need to go to powerbi.com to complete the setup of the data sources. When you go there, you will reenter your credentials.


Note

At first sight, it might seem cumbersome to be required to enter the credentials so many times. Indeed, it is important that you follow the procedure in the correct way. The gateway connects the Power BI website to your PC, and it will be able to access all of your files. Security is always important, and Microsoft takes it very seriously.


David is now near the end of the gateway configuration. The last step is to visit the Power BI cloud service. To complete the setup, in Power BI, on the menu bar, he clicks the configuration button (the small gear icon; step 1 in Figure 3-11) and then chooses Settings (step 2). This opens the Settings page on which he can configure the settings for each data set, as demonstrated in Figure 3-11.

Image

Figure 3-11: The final step in the configuration of the gateway is to grant permission for a Power BI dataset to access it.

On the Settings page, David sees an alert with two important pieces of information:

• The gateway is online and running (in this case) on a PC named HARRY (which you can see in Figure 3-11, in the line below the Gateway Status section header).

• This data set is not yet ready. In fact, even if the gateway is set up, you need to specify the credentials specific for the single data source.

Why do you need to enter credentials again for each data source? The reason is that every data source might require different user credentials. In our example, the Sales PBD file is stored on David’s local PC, so it is automatically accessible by using the Windows credentials that David stored in the Personal Gateway, too. In this case, when David clicks Edit Credentials, he is asked to choose an authentication method, as illustrated in Figure 3-12. The Windows authentication method is the only option, and when David clicks Sign In, he will not need to provide his user name and password again. However, for other data sources not using Windows authentication, clicking Sign In will require a user name and password to make it possible for the Personal Gateway to connect to that data source during a refresh operation.

Image

Figure 3-12: You have a choice of authentication methods for each data source.

Configuring automatic refresh

After you set the credentials, the data source is ready to be refreshed. Power BI now has all of the information it needs to refresh the data, both on demand and on a scheduled basis. Expanding the Schedule Refresh section, you can define when Power BI attempts to refresh the dataset. Figure 3-13 shows an example of a data refresh scheduled twice daily, at 9:00 AM and 4:00 PM.

Image

Figure 3-13: You can configure automatic data refresh to run daily or weekly and at different times.


Note

You need a license for Power BI Pro to schedule more than one refresh per day. Using the free Power BI license, you can schedule only one daily refresh.


In case the refresh fails, you have the option to receive an email alert so that you can take the needed remedial actions.

At this point, you can either wait for the refresh to happen or, if you want to be sure that everything is set up correctly, you can force an immediate refresh. To perform this operation, in the navigation pane on the left, in the Datasets section, click the ellipsis to the right of the data source (Sales PBD, in our example; step 1 in Figure 3-14), and then click Refresh Now, shown as step 2 in Figure 3-14.

Image

Figure 3-14: You can refresh a dataset immediately by clicking Refresh Now.

When you ask for a refresh, Power BI prepares for the data refresh and then starts it. Depending on the size of the dataset and the speed of the Internet connection, the time to carry out the refresh can range from a few seconds, as in the case of David, to a much longer duration. You can see when a dataset was last refreshed by looking in the same window where you asked for the immediate refresh (click the ellipsis to the right of the data set).


Note

Although it might be obvious, it is useful to state an important fact: the data refresh happened on the model in the cloud, not in the model on David’s PC. In fact, if David opens Sales PBD by using Power BI Desktop, he will see the data as he saved it. Power BI does not change any file on your PC, it only uses the gateway as a medium to access the data sets upon which the model is based.


Conclusions

In this chapter, you learned the basics of data refresh. Let’s recap them briefly:

• You can upload simple data models, based on an Excel file, to Power BI and refresh them by using the Personal Gateway, which makes it possible for Power BI to access your local datasets.

• For data refresh to work on more complex models—for example those that load data from SQL Server—you need to use Power BI Desktop.

• With Power BI Desktop, you build models containing the needed information to let the cloud service connect to the Personal Gateway and retrieve the dataset.

• You can refresh your data daily with the free license, whereas you need a professional license if you need to refresh your model multiple times each day.

Power BI Desktop offers many more features and will let you move to the next level in the learning path of data modeling. This is the topic for Chapter 4.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.189.188.238