Chapter 8. Deploying and Processing

We've discussed deploying and processing cubes a few times, but it's been a very rote "push this button, hope it works" process for you. In this chapter, we're going to discuss what deploying actually means and the various ways of doing it. We'll talk a bit about synchronizing databases between servers—similar to mirroring for transactional databases. Finally, we'll talk about processing the various Analysis Services objects and how to schedule processing should you need to.

Deploying a Project

After you've designed an OLAP solution, consisting of data sources, data source views, dimensions, and cubes, as well as data-mining structures, roles, and assemblies if you created them, you need to deploy the project to a server before you can actually do anything with it. When you deploy a project, it creates (or updates) an Analysis Services database. Then all the subordinate objects are created in the database.

Generally, you will deploy a project from BIDS to a development server. If everything passes there, you can use the Deployment Wizard to deploy the project to testing and production servers. You could also write code to use Analysis Management Objects (AMO) to manage deployment of SSAS objects. This becomes particularly compelling when there's a series of complex operations necessary in deployment, such as seeding confidential information or connecting a number of disparate systems.

Project Properties

As you saw in earlier exercises, you need to adjust the project properties before you can deploy the project. If nothing else, you'll have to change the deployment server name, because the default value is localhost. To change the project properties, right-click on the solution name and select Properties (Figure 8-1).

Opening the project properties

Figure 8.1. Opening the project properties

After you have the properties dialog box open (Figure 8-2), you'll have access to all the properties that govern deployment of the project to a server. You can choose from multiple configurations by using the Configuration Manager at the top. Each configuration is simply a collection of settings, so you can specify which server to deploy to, whether the database should be processed, and so forth.

The project properties dialog box

Figure 8.2. The project properties dialog box

Following is a list describing each of the properties available from the dialog box in Figure 8-2. The list is divided into three parts, one for each tab in the dialog:

Build Tab Properties
  • Deployment Server (Edition, Version): Indicates the SSAS edition and version of the server the database will be deployed to. When you build, BIDS verifies that every feature the project uses is actually available on the target server. For example, perspectives are available only in the Enterprise Edition of SQL Server Analysis Services, so if you're using perspectives and you set the target server edition to Standard, you'll get an error regarding the use of perspectives when you try to deploy.

  • Output Path: When you build the solution, a number of structures are generated (for example, XML manifests). The output path indicates where these structures should be copied to.

  • Remove Passwords: This option will remove any hard-coded passwords from the data sources you're using as they're deployed. If you are hard-coding passwords, you may not want those stored with the project. Alternatively, you may work against a development database, but ultimately deploy to test or production, where you don't know the passwords and a DBA must go in to enter these separately (via SSMS).

Debugging Tab Properties
  • Start Object: When you start debugging a solution, this is the object that will be started by BIDS. The Currently Active Object option means that whatever object is open and has the focus will be where debugging starts. (This is for debugging .NET stored procedures.)

Deployment Tab Properties
  • Processing Option: This option indicates whether BIDS should process the database after it's deployed. You may instead choose to deploy a database solution and have the processing job run during a later maintenance window.

  • Transactional Deployment: This option indicates whether deployment and processing are wrapped in a single transaction. On occasion you'll have errors when you process a database object (we'll look at this later in the chapter); with transactional deployment, a processing failure rolls back the deployment as well, leaving the server in its previous state.

  • Deployment Mode: The options here are Deploy Changes Only or Deploy All—simply an option to deploy the full project or just the parts that have changed. For very large projects, you may want to deploy only the changes, but deploying the full project ensures that what's on the server matches what you've been working on.

  • Server: This is simply the server name you will deploy to (the default value is localhost).

  • Database: If you need the database to have a different name on the server, you can change it here. The default value is the name of the project.

Deployment Methods

Let's take a closer look at the methods we have of deploying a database. The methods available include the following:

AMO Automation:

The word automation here is a little bit misleading, considering how much work we've been doing in BIDS to date. The concept here is that you can use AMO to create database objects on the server. So in theory, you could use code to create dimensions, build cubes, add members, and so on. Consider a large multinational company that has millions of sales every day—their sales cubes would be huge. You might set up partitions so that the current quarter's data is processed daily, but all historical data is processed only weekly. Instead of setting up new partitions each quarter, you could have an automated process to move the just-completed quarter's data into the current year, and then create a new partition for the current quarter to be processed daily. The work on the server to make the changes to the database would be effectively considered "deploying" the new partition.
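For instance, the "deploy a new partition" step could be expressed as an XMLA Create command like the following sketch (AMO generates equivalent commands under the covers). The object IDs, partition name, and source query here are hypothetical; yours would match the IDs in your own database:

```xml
<Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ParentObject>
    <DatabaseID>Adventure Works DW</DatabaseID>
    <CubeID>Adventure Works</CubeID>
    <MeasureGroupID>Reseller Sales</MeasureGroupID>
  </ParentObject>
  <ObjectDefinition>
    <Partition xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <ID>Reseller_Sales_Current_Quarter</ID>
      <Name>Reseller Sales Current Quarter</Name>
      <!-- Bind the partition to just the current quarter's rows -->
      <Source xsi:type="QueryBinding">
        <DataSourceID>Adventure Works DW</DataSourceID>
        <QueryDefinition>
          SELECT * FROM FactResellerSales
          WHERE OrderDateKey BETWEEN 20080701 AND 20080930
        </QueryDefinition>
      </Source>
      <StorageMode>Molap</StorageMode>
    </Partition>
  </ObjectDefinition>
</Create>
```

An automated quarterly job would generate the ID, name, and date range, run the Create, and then process the new partition.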

Backup/Restore:

This is an easy way to move a cube intact. If you need a cube available on a system but would never need to actually process it on the system (perhaps a demo system that doesn't need to be absolutely current), then you could back up the cube from the production system and restore it to the demo system. So long as the source data wasn't needed again, the cube would function on the demo system as if it were "live."

Deployment Wizard:

SSAS client tools include a deployment wizard that can be scripted to deploy databases. When you run the wizard, you'll be prompted for an Analysis Services database file (*.asdatabase). You'll find this in the folder you specified in the project properties. After you've selected the file, walking through the wizard should seem fairly familiar; you've seen all the deployment options before. When you finish the wizard, it will deploy the database and supporting objects. You will also have the ability to save the deployment as a script that you can run to repeat the deployment.

Synchronize Database Wizard:

This wizard runs similarly to the import/export wizard in SQL Server. Running the Synchronize Database Wizard copies the cube data and database metadata from one server to another. This is best used to move databases between development, testing, and production servers.

XMLA Scripts:

Finally, you can deploy a database (or database objects) by generating XML for Analysis (XMLA) scripts (either from existing objects or by creating your own) and running those scripts on the server. XMLA scripts can be run from SSMS, from an SSIS task, or via code.

Now that you have an understanding of the basics of deploying databases in Analysis Services, we'll dig a bit into using the Deployment Wizard and synchronizing databases. Then we'll take a look at processing databases, cubes, and other database objects.

Using the Deployment Wizard

The SQL Server Analysis Services Deployment Wizard is a stand-alone application. With the Deployment Wizard, you can take the database file generated when you build an SSAS database from BIDS and either deploy the database to a target server or generate a deployment script in XMLA.

Running the Wizard

The Deployment Wizard is installed with the SQL Server client tools, and can be found via Start → All Programs → Microsoft SQL Server 2008 → Analysis Services. It runs just like a standard wizard; when you launch it, you'll get a welcome screen (Figure 8-3).

The welcome screen for the Deployment Wizard

Figure 8.3. The welcome screen for the Deployment Wizard

As I mentioned earlier, the wizard will ask for the location of the asdatabase file generated by BIDS when you build an Analysis Services solution. You'll find this file in the build output directory you specified in the project properties (by default, the bin folder under the project directory). Next you'll be prompted for the server and database names, then partition and role options (Figure 8-4). The options for both roles and partitions cover the same general idea: do you want the deployed database to overwrite or add to existing roles and partitions?

Role and partition options

Figure 8.4. Role and partition options

The next page of the wizard, shown in Figure 8-5, enables you to specify some key configuration properties for the objects in the database. On this page, you can edit the connection strings for each data source, as well as the impersonation information, and indicate the file locations for log files and storage files.

Database object properties

Figure 8.5. Database object properties

After verifying the configuration properties, you can specify the processing method, writeback options, and whether processing should be wrapped in a transaction. Then the final page will verify that you want to deploy the solution, and ask whether you want to generate a script (there's no way to generate a script at this point without actually going through the deployment). The script will be generated in the location you indicate in XMLA format.

The Deployment Wizard shines in two scenarios: it gives developers an easily repeatable method for deploying their databases (just run the script, instead of having to click through the wizard every time), and it gives DBAs a way to deploy a database without needing BIDS installed.

After building the solution, you'll have a number of files that are also the input files for the Deployment Wizard. Each will have the same name as the solution, and an extension indicating the contents. In addition, the wizard will generate a file ending in Script.xmla.

Input Files

The configuration files for an SSAS solution are XML files named after the solution with extensions indicating their use:

asdatabase:

This file is effectively a manifest for the solution, listing all the objects in the database.

deploymenttargets:

This simply lists the server and database names where the objects in the asdatabase file should be created.

configsettings:

This file gives specific information about the objects and connection information about data connections. Settings in this file override any information in the asdatabase file.

deploymentoptions:

This file contains settings from the deployment dialog box indicating whether the deployment is transactional, processing options, and so forth.

These files are what you'll need to run a deployment script generated with the Deployment Wizard, as you'll examine in the next section.

Deployment Scripts

The deployment script is an XML representation of the entire SSAS database (in other words, it has no dependencies on the input files). As a result, you can use the script to deploy and process the database simply by executing the script against the desired Analysis Server.

After you have the XMLA script, you have several options to execute it:

  • Execute it from code via an AMO connection.

  • Run it via PowerShell.

  • Execute it in Management Studio.

  • Run it with an Analysis Services Execute DDL Task in SQL Server Integration Services.

You can automate the generation of the deployment script, either via command-line execution of the Deployment Wizard or by creating the deployment script with another process (after all, it's just XML). So again we can appreciate the options for automating creation, deployment, and processing of Analysis Services databases and their contents.

After a database is deployed, we may want to replicate it on another server. Analysis Services provides a synchronization capability to enable copying an active database from one server to another.

Synchronizing SSAS Databases

You can synchronize two Analysis Services databases, which is effectively an import operation from one Analysis Server to another. The Synchronize Database Wizard is hidden in SQL Server Management Studio. To find it, connect to the SSAS instance you will be importing to, and then right-click the Databases folder and select Synchronize, as shown in Figure 8-6.

Synchronizing SSAS databases

Figure 8.6. Synchronizing SSAS databases

Walking through the wizard is pretty straightforward. First you select a source server and database. The destination server is the server you launched the wizard from (Figure 8-7). The tricky thing about working with the wizard is that you can't synchronize a database to the same server, so you need two Analysis Servers to run the wizard.

Selecting the source server and database

Figure 8.7. Selecting the source server and database

After you've selected a source database, the wizard will poll the database and fetch a list of partitions, as shown in Figure 8-8. This is informative only—you can't change any of the locations (though you can after the database is deployed).

Specifying partition locations

Figure 8.8. Specifying partition locations

Finally, you'll have the option to synchronize roles and members, and to indicate whether you want to run the synchronization now or generate an XMLA script to a file. You can then run that script, using the methods described previously, to create the database wherever you may need it.
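The script the wizard generates is built around the XMLA Synchronize command. A minimal sketch, assuming a source server named ProdServer and the Adventure Works DW database (both names are placeholders), looks something like this:

```xml
<Synchronize xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Source>
    <!-- The server and database to copy FROM; the command runs on the target -->
    <ConnectionString>Provider=MSOLAP;Data Source=ProdServer;Integrated Security=SSPI</ConnectionString>
    <Object>
      <DatabaseID>Adventure Works DW</DatabaseID>
    </Object>
  </Source>
  <!-- CopyAll brings roles and members along; SkipMembership copies roles only -->
  <SynchronizeSecurity>CopyAll</SynchronizeSecurity>
  <ApplyCompression>true</ApplyCompression>
</Synchronize>
```

Because the command executes on the destination server and pulls from the source, you run it while connected to the instance you want the copy to land on.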

We've covered various ways of deploying databases and database objects, basically getting our structures in place on an Analysis Services server. After they're there, they're just metadata. If we want working cubes, we need to process the database and its objects. And that's what we're looking at in the next section.

Processing

We've deployed our project, with data source views mapped to various tables, dimensions, and cubes. However, these are just the structures of the data that make up our cube. Our dimensions have no members, and our cube has no data. We need to process the database if we want our users to be able to browse the data. Also, after the cube is processed, if the structure of an object changes, or data in the object needs to be updated, or the aggregation design is changed, we need to reprocess one or more structures to reflect the changes. (I'll refer to both as processing.)

What Processing Does for Us

Overall, processing a database (or individual cubes or dimensions) consists of loading the fact data, and then running the scripts for aggregation, calculated members, named sets, and so on. Let's take a look at a notional example that will produce output similar to that shown in Figure 8-9. For this example, we'll imagine that the individual fact record is the reported sales for the month by product subcategory.

A cube showing purchase data by month and by product subcategory

Figure 8.9. A cube showing purchase data by month and by product subcategory

When we process the cube, first the dimensions are processed. Analysis Services does this by going through each attribute and running a SELECT query to return the members of the attribute. For example, in the Products dimension, for the Subcategory attribute, SSAS runs the following query:

SELECT DISTINCT
  [dbo_DimProductSubcategory].[ProductSubcategoryKey] AS [dbo_DimProductSubcategoryProductSubcategoryKey0_0],
  [dbo_DimProductSubcategory].[EnglishProductSubcategoryName] AS [dbo_DimProductSubcategoryEnglishProductSubcategoryName0_1],
  [dbo_DimProductSubcategory].[ProductCategoryKey] AS [dbo_DimProductSubcategoryProductCategoryKey0_2]
FROM [dbo].[DimProductSubcategory] AS [dbo_DimProductSubcategory]

This returns results similar to those shown in Table 8-1.

Table 8.1. The Subcategory Attribute

ProductSubcategoryKey0   EnglishProductSubcategoryName0   ProductCategoryKey0
1                        Mountain Bikes                   1
2                        Road Bikes                       1
3                        Touring Bikes                    1
4                        Handlebars                       2
5                        Bottom Brackets                  2
6                        Brakes                           2
7                        Chains                           2

This process continues through each dimension attribute. After all the attributes in a hierarchy are populated, the hierarchy is constructed. Analysis Services also assigns each member a unique name based on its position in the hierarchy. For example, the HL Road Handlebars product may have a unique name of [Components].[Handlebars].[HL Road Handlebars], denoting its full hierarchy. After all the dimensions are completely processed, SSAS starts processing the cube by assembling the dimensions, which can be visualized as in Figure 8-10.

A "cube" with the dimensions in place

Figure 8.10. A "cube" with the dimensions in place

The next step is to load the fact data—the direct transactional data we pull from our database. This will look similar to what you see in Figure 8-11.

A "cube" with just the fact data loaded

Figure 8.11. A "cube" with just the fact data loaded

The next step is the CALCULATE statement. After the fact data is loaded, Analysis Services runs all the scripts in the cube. The first script, the default MDX script, automatically includes the CALCULATE statement. CALCULATE effectively directs every cell in the cube to aggregate its value from the cells below it. (Cells are created based on the structure of the dimensions and their hierarchies.)

Warning

If you ever have a cube that processes without errors but shows no aggregate values, check for the presence of the CALCULATE statement in the default script. There are (very few) reasons to delete the statement or move it, but for most cubes it should be the first statement in the default script. You can "lose" it if you get frustrated with your scripts, open the script window, and just press Ctrl+A, then Delete. Ask me how I know this.

Each subtotal and total is aggregated from the granular data beneath it, similar to the indicators shown in Figure 8-12. After the CALCULATE statement has executed, we'll have our fully populated cube, as shown (again) in Figure 8-12.

Aggregating fact data and completing the cube

Figure 8.12. Aggregating fact data and completing the cube

After the CALCULATE statement, Analysis Services will go on to execute any other scripts, such as calculated members, named sets, and other scripts. Note that because calculated members are calculated after the CALCULATE statement, they are not aggregated (calculations must provide for all levels of a hierarchy, if necessary).

In addition to the database, cubes, dimensions, and measure groups, the server will process (as necessary) partitions, mining models, and mining structures. Processing any higher-level object will force processing of subordinate objects (for example, processing a database will cause processing of everything within it).

How to Initiate Processing from BIDS

There are various ways to invoke processing an object in Analysis Services—via BIDS, Management Studio, XMLA, through AMO with code, or as we covered previously in the chapter, as the final action in a deployment. Let's look at using BIDS first. Then we'll get into the other tools.

In BIDS, you can initiate processing on the database (solution), cubes, dimensions, or mining structures in one of three ways:

  • Right-click on the object in the Solution Explorer and select Process from the context menu. (Note that you can select multiple objects by using the Ctrl key and then process them all.)

  • With the object open, click the Process button on the designer's toolbar.
  • With the object open, select the appropriate object menu (Dimension, Cube, Mining Model) and click Process.

Any of these methods will open the Process dialog box, as shown in Figure 8-13. All objects use the same dialog.

The Process dialog box

Figure 8.13. The Process dialog box

The Object list indicates all the objects that you have selected to be processed; you can remove objects from the list by selecting them and clicking the Remove button.

There are several options for processing the objects in the Object list. You can select the options from a drop-down list in the Process Options column, as shown in Figure 8-14. Each object can be selected separately.

Selecting processing options

Figure 8.14. Selecting processing options

The processing options have the following effects:

Process Default:

Checks the status of an object, and if the object is unprocessed, it is fully processed; if the object is partially processed, the processing is finished.

Process Full:

Drops all data and performs full processing on the object and all objects it contains. This is necessary when there has been a structural change in the object.

Unprocess:

Drops all data in the objects and subordinate objects. The data is not reloaded. Generally, this is a step in troubleshooting, because this will also delete all the binary data store files.

Process Data:

Will drop all data and then reload the data without building aggregations or indexes (see the following Process Index item).

Process Index:

Creates or rebuilds indexes and aggregations on processed partitions.

Process Update:

(Only supported for dimensions.) Will force a reread of data and update of dimension attributes. Will also add new members to a dimension if they have been created in the underlying data.
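As a notional example, a Process Update of a single dimension corresponds to an XMLA command like the one below. The database and dimension IDs are assumptions; yours will match the IDs in your own project:

```xml
<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW</DatabaseID>
    <DimensionID>Dim Product</DimensionID>
  </Object>
  <!-- ProcessUpdate rereads the source and picks up new or changed members -->
  <Type>ProcessUpdate</Type>
</Process>
```

Swapping the Type element (ProcessFull, ProcessData, ProcessIndexes, and so on) selects the other options from the list above.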

After you've set the processing options for the objects you've selected, you can see what the effect of processing them will be by clicking the Impact Analysis button to open the Impact Analysis dialog box, as shown in Figure 8-15.

The Impact Analysis dialog box

Figure 8.15. The Impact Analysis dialog box

The Object list shows objects affected by all the objects you've selected to process and their end state after you run the processing. In Figure 8-15, you can see that processing the AdventureWorks cube with a setting of Structure Only will leave all its partitions in an unprocessed state. Depending on the processing you've selected, you may or may not see a list of impacted objects in the Object list. For example, processing a cube with Process Full will not show any objects in the list, because all subordinate objects are automatically processed. The yellow alert box at the bottom of the dialog alerts you to objects that will automatically be processed along with the objects you've selected.

The Change Settings button in the dialog box will open the Change Settings dialog, as shown in Figure 8-16. In this dialog, you can set whether processing tasks should be performed in parallel or sequentially (with a governing transaction), whether to use existing tables for writeback storage or always create new ones, and whether SSAS should simply always process any objects affected by the current objects and their processing settings.

The Processing Options tab of the Change Settings dialog box

Figure 8.16. The Processing Options tab of the Change Settings dialog box
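Under the covers, the parallel setting corresponds to wrapping multiple Process commands in an XMLA Batch with a Parallel element. A sketch, with hypothetical object IDs, might look like this:

```xml
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <!-- Commands inside Parallel are processed concurrently, in one transaction -->
  <Parallel>
    <Process>
      <Object>
        <DatabaseID>Adventure Works DW</DatabaseID>
        <DimensionID>Dim Product</DimensionID>
      </Object>
      <Type>ProcessFull</Type>
    </Process>
    <Process>
      <Object>
        <DatabaseID>Adventure Works DW</DatabaseID>
        <DimensionID>Dim Customer</DimensionID>
      </Object>
      <Type>ProcessFull</Type>
    </Process>
  </Parallel>
</Batch>
```

Removing the Parallel wrapper runs the commands sequentially instead, which matches the sequential option in the dialog.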

The other tab in the dialog, shown in Figure 8-17, is for how you want dimension key errors handled. This is a very important area, because dimension key errors are a critical aspect of dealing with processing cubes and databases. When a cube is processed, the fact tables are joined to the dimension tables. Now remember that relationships may be defined in the data source view, so they are not enforced on the source tables. As a result, there's a strong chance of having data in the fact tables that doesn't map to a key in a dimension table.

Settings for errors in dimension keys while processing

Figure 8.17. Settings for errors in dimension keys while processing

Your first option is whether to accept the default error configuration, which can be specified at the measure group level or the partition level. Alternatively, you can select to define the error configuration here for this processing run.

The first of these settings is what to do when a record has a key that doesn't match any key in the dimension table. You can either have SSAS convert the value and flag it as "unknown," in which case it will appear under the Unknown member of the dimension, or you can choose to discard the record so it doesn't appear in the cube at all.

The next option allows you to stop processing if there are more than a certain number of errors. You can either ignore the error count—essentially process no matter what happens—or indicate that the object(s) should stop processing (or just stop logging) after a certain number of errors. Finally, there's a section where you can indicate actions based on the type of error, and if you select Report and Continue for any of them, you can indicate a path to the error log where they should be logged as they happen.
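These settings map to an ErrorConfiguration element in the XMLA Batch command. A sketch follows; the log path, error limit, and object IDs are made up for illustration:

```xml
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ErrorConfiguration>
    <!-- Stop processing after 100 key errors; log them as they happen -->
    <KeyErrorLimit>100</KeyErrorLimit>
    <KeyErrorLogFile>C:\Logs\KeyErrors.log</KeyErrorLogFile>
    <!-- Unmatched fact keys go to the dimension's Unknown member -->
    <KeyErrorAction>ConvertToUnknown</KeyErrorAction>
    <KeyNotFound>ReportAndContinue</KeyNotFound>
  </ErrorConfiguration>
  <Process>
    <Object>
      <DatabaseID>Adventure Works DW</DatabaseID>
      <CubeID>Adventure Works</CubeID>
    </Object>
    <Type>ProcessFull</Type>
  </Process>
</Batch>
```

Setting KeyErrorAction to DiscardRecord instead would drop unmatched fact rows, matching the delete option in the dialog.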

Finally, you're ready to process the objects you've selected. Click the Run button to open the Process Progress dialog box, as shown in Figure 8-18 for the AdventureWorks cube. You can see the cube and its measure groups listed first, and then the related cube Mined Customers and its measure groups as well.

Processing a cube

Figure 8.18. Processing a cube

If you open up a measure group, you'll see the processing information for that group. In the case of Reseller Sales for AdventureWorks, the four partitions are processed as shown in Figure 8-19. Under each partition is the processing data for that partition, and then the SQL queries used to build the partition. You can highlight the query, copy it, and run the query in SSMS. This is a great troubleshooting tool when you're getting errors during processing.

Details of processing a measure group

Figure 8.19. Details of processing a measure group

Note that from the Process Progress dialog box, you can also stop the processing while it's in progress, reprocess the selected objects, and view the details or copy either specific lines or errors. If you get errors in processing, often you'll see a long list of errors as SSAS hits several of them simultaneously. The error that's truly indicative of what's going on is often at the bottom of the list, but you may have to hunt around to find it.

Tip

When you're first building a cube, I find that the first time it's processed is often when all your authentication errors surface. If you get a large batch of errors, look for authentication problems in the error reports. Fix those first and then try again.

Processing from SQL Server Management Studio

Processing from SSMS is essentially the same as processing from BIDS, except that you can't multiselect objects from the Object Explorer pane. You can multiselect from the Object Details pane (View menu → Object Explorer Details), but then you can only multiselect objects of the same type by selecting the object type folder (for example, Cubes or Dimensions).

The Process dialog will look somewhat different, as shown in Figure 8-20, but has all the same functionality. Everything else we've discussed remains the same.

The Process object dialog box in SSMS

Figure 8.20. The Process object dialog box in SSMS

Processing via XMLA

Everything you do in BIDS and SSMS is actually executed against Analysis Services as XMLA, so you can also process a database or cube manually the same way. At its simplest, an XMLA script to process a database will look like this:

<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW</DatabaseID>
  </Object>
  <Type>ProcessFull</Type>
  <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
</Process>

Pretty straightforward, right? Of course, XMLA provides all the options you need to perform all the functions and options we've covered so far in this section. The benefit of using XMLA is that you can generate and run the script from an automated process, and you can also perform more-complex actions (process one dimension and use the results to select a table that serves as the data source for another dimension, for example). For more information about processing SSAS objects with XMLA, see the TechNet article at http://technet.microsoft.com/en-us/library/ms187199.aspx.

Processing with Analysis Management Objects (AMO)

Finally, if you don't relish learning yet another dialect of XML, you can process Analysis Services objects by using AMO from a .NET language. In AMO, after you have a reference to the object you want to process, it's simply a matter of calling the Process() method. The Process() method is inherited for each object type, so it has the same signature. It's overloaded, so you can call it with no parameters, or one to five parameters covering various processing options.

For more information on processing SSAS objects via AMO, see the Books Online article also available in TechNet at http://technet.microsoft.com/en-us/library/ms345091.aspx.

Scheduling OLAP Maintenance

We know that if we're using MOLAP, the data and aggregations in cubes (and dimensions and partitions) are static and don't reflect the underlying data until refreshed. In general, there are two approaches to consider, depending on whether you are using a staging database or you have a data source view to connect directly to live systems.

If you're using a staging database, the best approach is to process the OLAP database or cubes from the ETL package that is moving the data into the staging database. For example, if you are using SQL Server Integration Services to extract and clean data before loading it into your staging database, you could add an Analysis Services Processing Task at the end. The SSAS Processing Task in Integration Services gives you a task editor, as shown in Figure 8-21. The options in the dialog should look pretty familiar. In this way, whenever the staging database is loaded with new data, the cubes (or partitions) will be automatically processed to reflect the new data available.

Setting options for the AS Processing Task in Integration Services

Figure 8.21. Setting options for the AS Processing Task in Integration Services

On the other hand, perhaps you don't want the related objects processed whenever you load new data. Maybe you're loading data in an ongoing manner through the week but want to process cubes only over the weekend (the business isn't that time-sensitive). Or if you're not using a staging database, and the cube links directly to the live data, absent ROLAP or proactive caching you may not get an indication of when there's new data. Finally, you may be linked to a transactional database that's very busy during working hours, but you just need to process the cubes every night. In that case, you can schedule cube processing by using the SQL Server Agent.

The SQL Server Agent is a feature of SQL Server that can schedule and run various jobs, including processing SSAS objects. Connect to a SQL Server instance that has the agent installed and running, and you'll see it at the bottom of the Object Explorer. If you open the agent in the tree view, you'll see a folder containing all the jobs for that agent, as shown in Figure 8-22.

SQL Server Agent in SSMS

Figure 8.22. SQL Server Agent in SSMS

From here you can right-click on the Jobs folder and select New Job. Within the New Job Step dialog box, you can create a number of steps; the one we are most interested in is processing our SSAS objects. You can do this with a type of SQL Server Analysis Services command. The downside is that the command text is just a big text box and has to be in XMLA format (Figure 8-23). However, after you have your steps set up properly, you can also add scheduling, alerts, notifications, and so on to make the job more robust.

A SQL Agent job step to process a cube

Figure 8.23. A SQL Agent job step to process a cube

Summary

Hopefully, what happens behind the curtains when you process a cube is now a little bit less of a mystery. You've looked at deploying solutions, processing cubes and objects, and a little bit of scheduling and maintenance.

Over the past eight chapters, I've often mentioned MDX queries as a way of selecting data from OLAP solutions. Let's take a look at how MDX works in Chapter 9.
