Chapter 10: Indexing Tools

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

CHAPTER 10

Indexing Tools

When it comes to indexing, Microsoft has two tools built into SQL Server that can be used to help identify indexes that can improve database performance. These are the missing index dynamic management objects (DMOs) and the Database Engine Tuning Advisor (DTA). Both tools are useful to assist with indexing databases and can provide valuable input when working on tuning a database.

This chapter explains both indexing tools. The chapter begins by explaining what they are and the capabilities they provide. Then it walks you through how the tools can be used to provide assistance with indexing. Throughout the chapter, you will also learn about the pros and cons of using each of these tools.

Missing Index DMOs

The missing index DMOs are a set of management objects that provide feedback from the query optimizer. When the query optimizer compiles an execution plan, it can identify when materializing statistics into a physical index would improve performance. In these situations, the query optimizer will compile the results and store the information in the missing index DMOs.

There are a couple of benefits that the missing index DMOs provide. First, the missing index information is collected from the query optimizer without any action required on your part. Unlike Extended Events and other performance monitoring tools, you don’t need to configure and enable it in order for information to be collected. The other thing to consider is that the missing index information is based on actual activity occurring on the SQL Server instance. The index suggestions aren’t based on a test load you believe might happen in production but rather on the production load itself. As the usage patterns of the data in a database change, so too will the missing index recommendations.

Despite the benefits provided by the missing index DMOs, you must take into account a few considerations when using them. The limitations on the missing index DMOs can be summarized into the following categories:

Size of queue
Depth of analysis
Accuracy
Type of indexes

The size of the queue for missing indexes is one of the limitations that is easy to miss. Regardless of the number of databases on the SQL Server instance, there can be no more than 500 missing index groups. Once 500 missing index groups have been identified, the query optimizer will stop reporting new missing index suggestions. It will not make any determinations to decide whether a new possible missing index is of better quality than items already reported; the information is just not collected.

Note As with other dynamic management objects, the information within the missing index DMOs resets when SQL Server restarts and gets dropped for a database whenever the database is brought offline.

When considering the information in missing indexes, the depth of the analysis is a limitation that needs to be considered whenever you are reviewing the suggestions. The query optimizer considers only the current plan and whether the missing index would benefit the execution plan. Sometimes, adding the missing index to the database will result in a new plan with a new missing index suggestion. These suggestions are only a first pass at improving performance on an execution plan. The other half of this limitation is that the missing index details don’t include tests to determine whether the order of the columns in the missing index suggestion is optimal. When looking at missing index suggestions, it will be necessary to test in order to determine the proper column order.

The third limitation of the missing index suggestion is the accuracy of the information returned with the statistics. There are two things that need to be considered with this limitation. First, when the queries use inequality predicates, the cost information is less accurate than those returned with equality predicates. Second, it is possible to return the same missing index suggestion with multiple cost estimates. How and where the missing index would be leveraged may change the cost estimate that is calculated. For each cost estimate, a missing index suggestion will be logged.

Lastly, the missing index tool is limited in the types of indexes it can suggest. The main limitation is index types and the inability of missing indexes to suggest clustered, XML, spatial, or columnstore indexes. The suggestions also will not include information on when to make an index filtered. Along these same lines, suggestions may, at times, contain only INCLUDE columns. When this happens, one of the INCLUDE columns will need to be designated as the key column.

Note Missing index information for a table will be dropped whenever there are metadata operations made on the table. For instance, when a column is added to a table, the missing index information will be dropped. A less obvious example is when an index on a table changes. In this case as well, the missing index information will be dropped.

Explaining the DMOs

There are four DMOs that can be used to return information on missing indexes. Each DMO provides a portion of the information needed to build indexes that the query optimizer can use to improve the performance of a query. The DMOs for missing indexes are as follows:

sys.dm_db_missing_index_details
sys.dm_db_missing_index_columns
sys.dm_db_missing_index_group_stats
sys.dm_db_missing_index_group

In the next four sections, I’ll review each of the dynamic management objects and look at how each provides information on how to identify missing indexes.

sys.dm_db_missing_index_details

The DMO sys.dm_db_missing_index_details is a dynamic management view that returns a list of missing index suggestions. Each row in the dynamic management view (DMV) provides a single suggested missing index. The columns in Table 10-1 provide information on the database and the table to create the index on. It also includes the columns that should comprise the key and the included columns for the index.

Table 10-1. Columns in sys.dm_db_missing_index_details

Column Name	Data Type	Description
index_handle	int	Unique identifier for each missing index suggestions. This is the key value for this DMV.
database_id	smallint	Identifies the database where the table with the missing index resides.
object_id	int	Identifies the table where the index is missing.
equality_columns	nvarchar(4000)	Comma-separated list of columns that contribute to equality predicates.
inequality_columns	nvarchar(4000)	Comma-separated list of columns that contribute to inequality predicates.
included_columns	nvarchar(4000)	Comma-separated list of columns needed as covering columns for the query.
statement	nvarchar(4000)	Name of the table where the index is missing.

There are two columns in sys.dm_db_missing_index_details that are used to identify key columns on missing index suggestions. These are equality_columns and inequality_columns. The equality_columns are generated when there is a comparison in the query plan that makes a direct comparison. For instance, when the filter for a query is ColumnA = @Parameter, this is an equality predicate. The inequality_columns details are created when any nonequal filter is used in a query plan. Examples of this are when there are greater than, less than, or NOT IN comparisons being used.

When it comes to the included_columns information, this is generated when there are columns that are not part of the filter but that would be used to allow the index to cover the query request using a single index. Included columns are covered in more depth in Chapter 8. Suffice it to say, the use of included columns will help prevent the query plan from having to use a key lookup in the execution plan if the missing index is created.

sys.dm_db_missing_index_columns

The next DMO is sys.dm_db_missing_index_columns, which is a dynamic management function (DMF). This function returns a list of columns for each missing index listed in sys.dm_db_missing_index_details. To use the DMF, an index_handle is passed into the function as a parameter. Each row in the resultset represents a column in the missing index suggestion from sys.dm_db_missing_index_details and repeats the information in equality_columns, inequality_columns, and included_columns. Table 10-2 lists the output for sys.dm_db_missing_index_columns.

Table 10-2. Columns in sys.dm_db_missing_index_columns

Column Name	Data Type	Description
column_id	int	ID of the column
column_name	sysname	Name of the table column
column_usage	varchar(20)	Description of how the column will be used in the index

The primary information in this DMF is the column_usage column. For every row, this column will return one of the following values: EQUALITY, INEQUALITY, or INCLUDE. These values map to equality_columns, inequality_columns, and included_columns in sys.dm_db_missing_index_details. Depending on the type of usage in the former DMV, the use will be the same for this DMF.

sys.dm_db_missing_index_groups

The DMV sys.dm_db_missing_index_groups is the next missing index DMO. The DMV returns a list of missing index groups paired with missing index suggestions. Table 10-3 lists the columns for sys.dm_db_missing_index_groups. Although this DMV supports the ability for many-to-many relationships within missing index suggestions, they are always made in a one-to-one relationship.

Table 10-3. Columns in sys.dm_db_missing_index_groups

Column Name	Data Type	Description
index_group_handle	int	Identifies a missing index group. This value joins to group_handle in sys.dm_db_missing_index_group_stats.
index_handle	int	Identifies a missing index handle. This value joins to index_handle in sys.dm_db_missing_index_details.

sys.dm_db_missing_index_group_stats

The last missing index DMO is the DMV sys.dm_db_missing_index_group_stats. The information in this DMV contains statistics on how the query optimizer would expect to use the missing index if it were built. From this, using the columns in Table 10-4, you can determine which missing indexes would provide the greatest benefit and the scope to which the index will be used.

Table 10-4. Columns in sys.dm_db_missing_index_group_stats

Column Name	Data Type	Description
group_handle	int	Unique identifier for each missing index group. This is the key value for this DMV. All queries that would benefit from using the missing index group are included in this group.
unique_compiles	bigint	Count of the execution plan compilations and recompilations that would benefit from this missing index group.
user_seeks	bigint	Count of seeks in user queries that would have occurred if the missing index had been built.
user_scans	bigint	Count of scans in user queries that would have occurred if the missing index had been built.
last_user_seek	datetime	Date and time of last user seek from user queries that would have occurred if the missing index had been built.
last_user_scan	datetime	Date and time of last user scans from user queries that would have occurred if the missing index had been built.
avg_total_user_cost	float	Average cost of the user queries that could be reduced by the index in the group.
avg_user_impact	float	Average percentage benefit that user queries could experience if this missing index group had been implemented.
system_seeks	bigint	Count of seeks in system queries that would have occurred if the missing index had been built.
system_scans	bigint	Count of scans in system queries that would have occurred if the missing index had been built.
last_system_seek	datetime	Date and time of last system seek from system queries that would have occurred if the missing index had been built.
last_system_scan	datetime	Date and time of last system scans from system queries that would have occurred if the missing index had been built.
avg_total_system_cost	float	Average cost of the system queries that could be reduced by the index in the group.
avg_system_impact	float	Average percentage benefit that system queries could experience if this missing index group had been implemented.

Using the DMOs

Now that the missing index DMOs have been explained, it is time to look at how they can be used together to provide missing index suggestions. You may have noticed that the results of the missing index DMOs have been referred to as suggestions instead of recommendations. This variation in wording is intentional. Typically, when someone receives a recommendation, it is fully thought through and ready to be implemented. This is not so with the missing index DMOs; thus, they are referred to as suggestions.

With the suggestions from the missing index DMOs, you have a starting point to begin looking at and building new indexes. There are two things that are important to consider when looking at missing index suggestions. First, variations of each missing index suggestion may appear multiple times in the results. It is not recommended that each of these variations be implemented. Common patterns within the suggestions should be found. An index that covers a few of the suggestions is usually ideal. Second, when more than one column is suggested, the order of the columns needs to be tested to determine which is optimal.

To help explain how the missing index DMOs work and are related to one another, I’ll walk you through an example that includes a few SQL statements. These statements, shown in Listing 10-1, execute a few queries against the SalesOrderHeader table in the AdventureWorks2014 database. For each of the queries, the filtering is on either the DueDate or OrderDate column, or both.

Listing 10-1. SQL Statements to Generate Missing Index Suggestions

SELECT DueDate FROM Sales.SalesOrderHeader
WHERE DueDate = '2014-07-01 00:00:00.000'
AND OrderDate = '2014-06-19 00:00:00.000'
GO
SELECT DueDate FROM Sales.SalesOrderHeader
WHERE OrderDate Between '20140601' AND '20140630'
AND DueDate Between '20140701' AND '20140731'
GO
SELECT DueDate, OrderDate FROM Sales.SalesOrderHeader
WHERE DueDate Between '20140701' AND '20140731'
GO
SELECT CustomerID, OrderDate FROM Sales.SalesOrderHeader
WHERE OrderDate Between '20140601' AND '20140630'
AND DueDate Between '20140701' AND '20140731'
GO

If you examine the execution plan for any of the example queries, you’ll see that they each use a clustered index scan to satisfy the query. Figure 10-1 shows the execution plan for the first query. In this execution plan, there is an indication that there is a missing index that could help improve the performance of the query.

Figure 10-1. Execution plan for missing index SELECT query

To see more details on this missing index suggestion, you need to look at the missing index DMOs. A query against the missing index DMOs will look similar to Listing 10-2. The query includes the equality, inequality, and included column information that was described earlier. The query includes two calculations not previously described: the calculations for Impact and Score.

The Impact calculation helps identify missing index suggestions that will have the highest overall impact across multiple query executions. This is calculated by adding the potential seeks and scans on the missing index based on the average impact; the resulting value represents the total improvement across all queries that might have used the index. The higher the value, the more improvement the index could provide.

The Score calculation also helps to identify missing index suggestions that will improve query performance. The difference between Impact and Score is the inclusion of the average total user cost. For the Score calculation, the average total user cost is multiplied by the Impact score and divided by 100. The inclusion of the cost value helps differentiate between expensive and inexpensive queries when deciding whether to consider the missing index. For instance, a missing index suggestion that provides an 80 percent improvement on queries with an average cost value of 1,000 would likely provide a better return that a 90 percent improvement for a query with an average cost value of 1.

Listing 10-2. Query for Missing Index DMOs

SELECT
    DB_NAME(database_id) AS database_name
    ,OBJECT_NAME(object_id, database_id) AS table_name
    ,mid.equality_columns
    ,mid.inequality_columns
    ,mid.included_columns
    ,(migs.user_seeks + migs.user_scans) * migs.avg_user_impact AS Impact
    ,migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) AS Score
    ,migs.user_seeks
    ,migs.user_scans
FROM sys.dm_db_missing_index_details mid
        INNER JOIN sys.dm_db_missing_index_groups mig ON mid.index_handle = mig.index_handle
        INNER JOIN sys.dm_db_missing_index_group_stats migs ON mig.index_group_handle = migs.group_handle
WHERE DB_NAME(database_id) = 'AdventureWorks2014'
ORDER BY migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) DESC

Figure 10-2 shows some results from exeucuting this query.

Figure 10-2. Results from missing index query

With the results from the missing index query, shown in Figure 10-2, there are a few items to consider from these suggestions. First, there are quite a few similarities between the suggestions. The predicate columns between each of the suggestions include the OrderDate and DueDate, except for one missing index. Since the column order has not been tested, the optimal column order could go either way. To satisfy the missing index suggestion, one possible index could have the key column DueDate followed by OrderDate. This configuration would create an index that would satisfy all four of the missing index items.

The next item to look at is included_columns. For two of the suggestions, there are included_columns values listed. On the fourth missing index suggestion, it suggests including the column OrderDate. Since it will be one of the key columns of the index, it doesn’t need to be included. The other column, from the third missing index suggestion, is the CustomerID column. While only one index needs this column, as an included column, the addition of this column would likely be negligible since it is a narrow column. You would also want to add this column to the index.

After looking at these results, you’ve seen four missing index suggestions and ended up with a suggestion for one index that can cover all four of the missing index items. If you build the index using a DDL statement similar to that in Listing 10-3, you will end up with an index that solves these missing indexes. If you execute the queries in Listing 10-1 again, you can see this for yourself.

Listing 10-3. Index from Missing Index DMOs

CREATE NONCLUSTERED INDEX missing_index_SalesOrderHeader
ON Sales.SalesOrderHeader([DueDate], [OrderDate])
INCLUDE ([CustomerID])

Database Engine Tuning Advisor

The other indexing tool available in SQL Server is the Database Engine Tuning Advisor. This tool allows SQL Server to analyze a workload from a file, a table, or the plan cache. The output of the DTA can assist in providing recommendations for indexing and configuring partitions for the workload. The chief benefit of using the tool is that it doesn’t require a deep understanding of the underlying databases to make the recommendations.

The following are some of the core capabilities that the DTA can provide:

Recommend the best mix of indexes for databases by using the query optimizer to analyze queries in a workload
Recommend aligned or nonaligned partitions for databases referenced in a workload
Recommend indexed views for databases referenced in a workload
Analyze the effects of the proposed changes, including index usage, query distribution among tables, and query performance in the workload
Recommend ways to tune the database for a small set of problem queries
Allow you to customize the recommendation by specifying advanced options, such as disk space constraints
Provide reports that summarize the effects of implementing the recommendations for a given workload
Consider alternatives in which you supply possible design choices in the form of hypothetical configurations for DTA to evaluate

Although the DTA has quite a few capabilities, there are also a number of limitations on the tools. The following are some of these limitations:

Not able to recommend indexes on system tables.
Cannot add or drop unique indexes or indexes that enforce primary key or unique constraints.
May provide variations in recommendations on some workloads. The DTA samples data while it executes, which will influence the recommendations.
Unable to tune database from SQL Server 7.0 or earlier.
Unable to tune trace tables on remote servers.
Constraints placed on tuning workloads can have a negative impact on suggestions if the tuning session exceeds the constraints.

Note The DTA often suffers a bad rap as an indexing tool. This is mostly because of abuse and misuse by others who have used it. When using the tool, be sure to validate any change that is recommended and test any changes thoroughly before applying them in a production environment.

Explaining the DTA

There are two ways in which users can interact with the DTA. These are the graphical user interface (GUI) and the command-line utility. Both of these methods offer most of the same capabilities. Depending on your comfort level, you can choose either.

The GUI tool , which you will use throughout most of this chapter, provides a wrapper for the DTA. It allows you to select from the available options, and it enables you to view the tuning sessions that were previously executed. If you want to view tuning results, the GUI is well-suited to the task. Tuning sessions can be configured and executed through the GUI.

The command-line utility provides the same capabilities as the GUI when it comes to configuring and executing sessions. The command-line utility can be configured through either switches or an XML configuration file. Both of these options allow database administrators (DBAs) and developers to build processes to automate tuning activities for reviewing and analyzing workloads and to build an index tuning process that allows the DBA to work with results instead of going through the motions of setting up and configuring the tuning sessions. You will learn more about integrating the DTA utility into a performance tuning methodology in Chapter 15.

With both tools, two general areas of configuration need to occur. The first determines how the tuning session will interact and makes suggestions with the physical design structures (PDSs). The second determines which type of partitioning strategy the DTA should employ when trying to tune the database.

There are two parts to the options on how physical design structure suggestions will be generated. The first option you will want to configure is which type of physical design structure can be utilized in the tuning. The following are the options for this:

Indexes and indexed views
Indexes (default option)
Evaluate utilization of existing PDSs only
Indexed views
Nonclustered indexes

Along with the following options, the tuning session can also consider whether the filtered indexes can be included in the suggestions. The other PDS option determines which objects to keep within the database. This option can help ensure that the tuning recommendations do not adversely affect tuning that was previously tested and deployed. The following are the options for PDS items to retain in the database:

Do not keep any existing PDSs
Keep all existing PDSs (default option)
Keep aligned partitioning
Keep indexes only
Keep clustered indexes only

The other general option with when configuring DTA is how the tuning session will consider table partitioning. The following are the options for table partition:

No partitioning (default option)
Aligned partitioning
Full partitioning

Outside these options, there are some advanced options that can be configured. These options configure how long the tuning session will run and how much memory the session can utilize. Also, for the index DDL, there is an option to determine whether the script for indexes will include online index rebuild options.

Note Before following along in the next section, run the code in Listing 10-1. If the index in Listing 10-3 has been created, drop the index using the DROP INDEX statement provided in Listing 10-4.

Listing 10-4. DDL Statement to Drop Index missing_index_SalesOrderHeader

DROP INDEX Sales.SalesOrderHeader.missing_index_SalesOrderHeader;

Using the DTA GUI

As mentioned earlier in the chapter, one of the ways to interact with the DTA is through the GUI. In this section, you’ll look at a scenario demonstrating how to use the DTA for index tuning. There are a few methods for launching the tool. The first option is within SQL Server Management Studio (SSMS). Within SSMS, you can choose Tools Database Engine Tuning Advisor from the menu bar. The other option is to open the SQL Server Database Engine Tuning Advisor from the Start menu.

After launching the DTA, you will be prompted to connect to a SQL Server instance. Once connected, the tool will open a new tuning session for configuration. Figure 10-3 shows a DTA session.

Figure 10-3. General configuration screen from the Database Engine Tuning Advisor

In the session launch screen on the General options tab, there are a few things to configure initially. To start, there is the session name. The session name can be any value you desire. The default value includes your username with the date and time. Next select the type of workload that will be used. There are three options for the workload .

File: A file containing SQL Trace output, an XML configuration, or SQL scripts.
Table: SQL Server database table containing SQL Trace output. Before using the table, be sure the trace populating it has been completed.
Plan Cache: The plan cache of the SQL Server that the tuning session is connected to. This capability is new to SQL Server 2012 and provides a powerful mechanism to tune execution plans that are being used in your SQL Server environment.

Each of the workloads can be used to provide recommendations. Through each of these workload sources, there is an opportunity to tune pretty much any type of workload that is needed. For the purposes of this exercise, select the Plan Cache option.

The next step is to select the database and tables to tune. With large databases, it will be critical to select only the tables that are part of the workload and for which index recommendations are needed. When the DTA executes, it will generate statistics based on information in the table, and the fewer tables that need to be considered, the faster the tuning session can complete. Check the box in the “Select databases and tables to tune” section next to the AdventureWorks2014 database before continuing.

Caution Do not use the DTA in your production SQL Server environment. The tool uses brute-force tactics to identify index recommendations and create hypothetical indexes to support this effort. Running the tool in production can adversely affect the performance of other workloads on the server. Consider running the DTA from a command line and on a remote SQL Server for analyzing production databases. This option will be discussed in Chapter 15.

With the General options configured, the next step is to configure the Tuning Options settings. On the screen shown in Figure 10-4, deselect the “Limit tuning time” option. For the other options, leave them as the default selections . These should be as follows:

Physical Design Structures (PDS) to use in database: Indexes
Partitioning strategy to employ: No partitioning
Physical Design Structures (PDS) to keep in database: Keep all existing PDS

Figure 10-4. Tuning Options configuration screen from Database Engine Tuning Advisor

The next step is to start the Database Engine Tuning Advisor. This can be accomplished through the toolbar or the menu, by selecting Actions Start Analysis. After starting the DTA, the Progress tab will open, as shown in Figure 10-5.

Figure 10-5. Progress screen from the Database Engine Tuning Advisor

After a few minutes the tuning session will complete, though this will depend entirely on your computer’s workload. With the indexes from Listing 10-1, the results should be similar to those in Figure 10-6. In these results, there is one recommendation. While the names will vary in your environment, the recommendation should be as follows:

Index on OrderDate and then DueDate including CustomerID

Figure 10-6. Recommendations from the Database Engine Tuning Advisor

This index is similar to the suggestion previously found with the missing index DMOs. In situations where there are multiple recommendations provided, you will need to go through the same considerations that were part of reviewing the suggestions from the missing index DMOs, such as “Can the recommendations be consolidated?” To remove any item from the list of recommendations, simply deselect the check box, and it will not be included in any of the recommendation outputs.

At this point, there are a few options that can be used to apply the indexes .

Apply the indexes: To apply the indexes, select Actions in the menu bar and select Apply Recommendations. In the Apply Recommendations window that comes up, leave the default, Apply Now, selected and click OK.
Apply the indexes in the future: To apply the indexes in the future, select Actions in the menu bar and select Apply Recommendations. In the Apply Recommendations window that comes up, select “Schedule for later.” Alter the scheduled date as desired and click OK. This will create the SQL Agent job. Ensure the SQL Agent is running and the agent service account has the required permissions to apply the indexes.
Save recommendations: To save recommendations, click the Save Recommendations icon in the menu bar and press the key combination Ctrl+S; or, select Actions Save Recommendations in the menu bar.

If the recommendations are saved, they will create a script like the one in Listing 10-5. Before applying indexes from the DTA, it is recommended that the names of indexes be changed to match your organization’s index naming standards. Also, when it comes to statistics, these are generally not created. SQL Server will create statistics as needed behind the scenes, removing the need for you to build your own statistics.

Listing 10-5. Database Engine Tuning Advisor Index Recommendations

use [AdventureWorks2014]
go

CREATE NONCLUSTERED INDEX [_dta_index_SalesOrderHeader_7_1266103551__K4_K3_11] ON [Sales].[SalesOrderHeader]
(
    [DueDate] ASC,
    [OrderDate] ASC
)
INCLUDE (     [CustomerID]) WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
go

By using the DTA through its GUI, you are able to make quick work of a workload. The recommendations returned provide a level of index tuning above using the missing index DMOs. In essence, they provide a brute-force indexing exercise to improve performance without improving code. Instead of spending many hours on tuning that can be resolved with a few new indexes, you can focus your time on performance tuning issues that are beyond just adding an index.

Note When the DTA is terminated while processing, it will sometimes leave behind hypothetical indexes that were used while it was investigating possible indexes that could improve an environment. A hypothetical index is an index that contains only statistics and no data. These indexes can be identified through the is_hypothetical column in sys.indexes. If they exist in your environment, they should always be dropped.

Using the DTA Utility

The GUI isn’t the only way to use the DTA within your SQL Server environment. The other method is through the command line with the DTA utility. What DTA utility lacks in an interactive interface, it makes up for with the flexibility to leverage the DTA utility in scripts and automation.

The syntax for using the DTA utility, shown in Listing 10-6, includes a number of arguments. These arguments, defined in Table 10-5, allow the DTA utility to contain the same features and flexibility of the GUI. Instead of clicking through a number of screens, the configuration information is passed in through the arguments.

Listing 10-6. DTA Utility Syntax

dta
[ -? ] |
[
      [ -S server_name[ instance ] ]
      { { -U login_id [-P password ] } | –E  }
      { -D database_name [ ,...n ] }
      [ -d database_name ]
      [ -Tl table_list | -Tf table_list_file ]
      { -if workload_file | -it workload_trace_table_name  | -ip | -ipf }
      { -ssession_name | -IDsession_ID }
      [ -F ]
      [ -of output_script_file_name ]
      [ -or output_xml:report_file_name ]
      [ -ox output_XML_file_name ]
      [ -rl analysis_report_list [ ,...n ] ]
      [ -ix input_XML_file_name ]
      [ -A time_for_tuning_in_minutes ]
      [ -n number_of_events ]
      [ -m minimum_improvement ]
      [ -fa physical_design_structures_to_add ]
      [ -fi ]
      [ -fp partitioning_strategy ]
      [ -fk keep_existing_option ]
      [ -fx drop_only_mode ]
      [ -B storage_size ]
      [ -c max_key_columns_in_index ]
      [ -C max_columns_in_index ]
      [ -e | -e tuning_log_name ]
      [ -N online_option]
      [ -q ]
      [ -u ]
      [ -x ]
      [ -a ]
]

Table 10-5. DTA Utility Arguments

Argument	Description
-?	Returns help information, including a list of all arguments.
-A	Provides a time limit, in minutes, in which the DTA utility will spend tuning the workload. The default time limit is 8 hours, or 640 minutes. Setting the limit to 0 will result in an unlimited tuning session.
-a	After the workload is tuned, the recommendations are applied without further prompting.
-B	Specifies the maximum size, in megabytes, that recommended indexes can consume. By default, this value is set to either three times the current raw data size or the free space on attached disk drives plus raw data size, whichever is smaller.
-c	Maximum number of key columns that DTA will recommend in an index. This value defaults to 16. The restriction does not include INCLUDED columns.
-C	Maximum number of columns that DTA will recommend in an index. The value defaults to 16 but can be raised as high as 1024, the maximum columns allowed in an index.
-d	Identifies the database that the DTA session connects to when the session begins. Only a single database can be specified for this argument.
-D	Identifies the databases that the DTA session will tune the workload against. One or more databases can be specified for this argument. To add multiple databases to a session, either include all the database names in a comma-separated list in one argument or add one argument per database.
-e	Identifies the name of the logging table or file where the DTA session will output events that could not be tuned. When specifying a table name, use the three-part naming convention of [database_name].[schema_name].[table_name]. With an output file, the extension for the file should be .xml.
-E	Sets the database connection using a trusted connection. The required argument if -U is not used.
-F	Grants DTA permission to overwrite an output file if it already exists.
-fa	Identifies the types of physical design structures that the DTA session can include in the recommendations. The default value for this argument is IDX. The available values are as follows: IDX_IV: Indexes and indexed views IDX: Indexes only IX: Indexed views only NCL_IDX: Nonclustered indexes only
-fi	Allows the DTA session to include recommendations for filtered indexes.
-fk	Sets the limitations on the existing physical design structures that the DTA session can modify in the recommendations. The available values are as follows: NONE: No existing structures ALL: All existing structures ALIGNED: All partition-aligned structures CL_IDX: All clustered indexes on tables IDX: All clustered and nonclustered indexes on tables
-fp	Determines whether partitioning recommendations can be included in the DTA session recommendations. The default value for this argument is NONE. The available values are as follows: NONE: No partitioning FULL: Full partitioning ALIGNED: Aligned partitioning
-fx	Limits the DTA session to only including recommendations to drop existing physical design structures. Lightly used indexes in the session are evaluated, and recommendations for dropping them are provided. This argument cannot be used with the arguments -fa, -fp, and -fk ALL.
-ID	Sets a numerical identifier for the DTA session. Either this argument or -s must be specified.
-ip	Set the source of the workload for the DTA session to the plan cache. The top –n plan cache events for the databases specified with argument –D are analyzed.
-ipf	Sets the source of the workload for the DTA session to the plan cache. The top –n plan cache events for all databases are analyzed.
-if	Sets the source of the workload for the DTA session to a file source. The path and file name are passed in through this argument. The file must be SQL Server Profiler trace file (trc), SQL file (sql), or SQL Server trace file (log).
-it	Sets the source of the workload for the DTA session to a table. When specifying a table name, use the three-part naming convention of [database_name].dbo.[table_name]. The schema for the table must be dbo.
-ix	Identifies an XML file containing the configuration information for the DTA session. The XML file must conform to the DTASchema.xsd (which is located at http://schemas.microsoft.com/sqlserver/2004/07/dta/dtaschema.xsd).
-m	Sets the minimum percentage of improvement that a recommendation must provide.
-n	Sets the number of events in the workload that the DTA session should tune. When specified for a trace file, the order of the events selected is based on the decreasing order of duration.
-N	Determines whether the physical design structures are created online or offline. The available values are as follows: OFF: No objects are created online. ON: All objects are created online. MIXED: Objects are created where possible.
-of	Configures the DTA session to output the recommendations in a T-SQL format in the path and file specified.
-or	Configures the DTA session to output the recommendations to a report in an XML format. When a file name is not provided, a file name based on the session (-s) name will be used.
-ox	Configures the DTA session to output the recommendations in an XML format in the path and file specified.
-P	Sets the password to be used for the SQL login in the database connection.
-q	Sets the DTA session to execute in quiet mode.
-rl	Configures the reports that will be generated by the DTA session. One or more reports can be selected in a comma-separated list. The available values are as follows: ALL: All analysis reports STMT_COST: Statement cost report EVT_FREQ: Event frequency report STMT_DET: Statement detail report CUR_STMT_IDX: Statement-index relations report (current configuration) REC_STMT_IDX: Statement-index relations report (recommended configuration) STMT_COSTRANGE: Statement cost range report CUR_IDX_USAGE: Index usage report (current configuration) REC_IDX_USAGE: Index usage report (recommended configuration) CUR_IDX_DET: Index detail report (current configuration) REC_IDX_DET: Index detail report (recommended configuration) VIW_TAB: View-table relations report WKLD_ANL: Workload analysis report DB_ACCESS: Database access report TAB_ACCESS: Table access report COL_ACCESS: Column access report
-S	Sets the instance of SQL Server to be used for the DTA session.
-s	Sets the name of the DTA session.
-Tf	Identifies the name of a path and file containing a list of tables to be used for tuning. The file should contain one table per line using the three-part naming convention. After each table name, the number of rows can be specified to tune the workload for a scaled version of the table. If -Tf and -Tl is omitted, the DTA session will default to using all tables.
-Tl	Sets a list of tables to be used for tuning. Each table should be listed using the three-part naming convention, with each table name separated by a comma. If -Tf and -Tl are omitted, the DTA session will default to using all tables.
-U	Sets the username to be used for the SQL login in the database connection. The required argument if -E is not used.
-u	Launches the GUI interface for the DTA with all of the configuration values specified the to the DTA utility.
-x	Starts the DTA session and exists upon completion.

Using the DTA utility is fairly easy. You’ll look at two scenarios of using the tool that provide different outcomes. In the first scenario, you’ll use the DTA utility to recommend indexing changes with allowing only nonclustered indexing changes. For the second scenario, the DTA utility will be configured to recommend any change to the indexing that would improve the performance of the workload. In both scenarios, you’ll use the plan cache for SQL Server as the workload source. To populate the plan cache, execute the query in Listing 10-7.

Listing 10-7. Scenario Setup

USE AdventureWorks2014
GO

IF OBJECT_ID('dbo.SalesOrderDetail') IS NOT NULL
        DROP TABLE dbo.SalesOrderDetail;
SELECT SalesOrderID, SalesOrderDetailID, CarrierTrackingNumber, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal, rowguid, ModifiedDate
INTO dbo.SalesOrderDetail
FROM Sales.SalesOrderDetail;

CREATE CLUSTERED INDEX CL_SalesOrderDetail ON dbo.SalesOrderDetail(SalesOrderDetailID);

CREATE NONCLUSTERED INDEX IX_SalesOrderDetail ON dbo.SalesOrderDetail(SalesOrderID);
GO

SELECT SalesOrderID, CarrierTrackingNumber
INTO #temp
FROM dbo.SalesOrderDetail
WHERE SalesOrderID = 43660;
DROP TABLE #temp;
GO 1000

SELECT SalesOrderID, OrderQty
INTO #temp
FROM dbo.SalesOrderDetail
WHERE SalesOrderID = 43661;
DROP TABLE #temp;
GO 1000

For the first scenario, you’ll build a command-line script similar to the one shown in Listing 10-8. For your environment, the server name (-S) will be different. The rest, however, will be the same. The database (-D and –d arguments) will be AdventureWorks2014. The source of the workload will be the plan cache (-ip argument). The name of the session (-s argument) is "First Scenario".

Listing 10-8. First Scenario DTA Utility Syntax

dta
-S localhostSQL2014 -E
-D AdventureWorks2014
-d AdventureWorks2014
-ip
-s "First Scenario"
-Tl AdventureWorks2014.dbo.SalesOrderDetail
-of "C:TempFirst Scenario.sql"
-fa NCL_IDX
-fp NONE
-fk ALL

With the DTA utility syntax prepared, the next step is to execute the script through the Command Prompt window. Depending on your SQL Server instance and the amount of information in the plan cache, the execution may take a few minutes. When it completes, the output in the Command Prompt window will look similar to the output shown in Figure 10-7. This output indicates that the file C:TempFirst Scenario.sql contains the recommendations for tuning the query in Listing 10-7.

Figure 10-7. Command Prompt window for first scenario

Based on the arguments passed into the DTA utility and the current workload, the recommendation from the first scenario tuning session includes the creation of two nonclustered indexes and statistics on two columns, shown in Listing 10-9. These indexes function as covering indexes for the queries in Listing 10-7; as a result, the key lookup is no longer required as part of the execution plan. The statistics provide information that SQL Server can use to build good plans for queries on the columns used in the query.

Note Listing 10-9 creates the dbo.SalesOrderDetail table.

Listing 10-9. First Scenario DTA Utility Output

use [AdventureWorks2014]
go

SET ANSI_PADDING ON

go

CREATE NONCLUSTERED INDEX [_dta_index_SalesOrderDetail_7_1895677801__K1_K2_3] ON [dbo].[SalesOrderDetail]
(
    [SalesOrderID] ASC,
    [SalesOrderDetailID] ASC
)
INCLUDE (     [CarrierTrackingNumber]) WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
go

CREATE NONCLUSTERED INDEX [_dta_index_SalesOrderDetail_7_1895677801__K1_K2_4] ON [dbo].[SalesOrderDetail]
(
    [SalesOrderID] ASC,
    [SalesOrderDetailID] ASC
)
INCLUDE (     [OrderQty]) WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
go

The downside to the arguments that were selected in the first scenario is that there isn’t any information included that helps determine the value in adding this index and the statistics. For the next scenario, you’ll learn how to obtain that information along with moving deeper into providing recommendations on the physical structure of your databases.

To begin the next scenario, you’ll use the same database and query. The arguments, though, will be modified slightly to accommodate the new goals, as shown in Listing 10-10. First, you’ll change the name of the session (-s) to "Second Scenario". Next, change the allowed physical structure changes (argument –fa) from nonclustered indexes only (NCL_IDX) to indexes and indexed views (IDX_IV). The final change, for the reporting output, is to add the report list (argument –rl) to the script with the all-analysis reports (ALL) option.

Listing 10-10. Second Scenario DTA Utility Syntax

dta
-S localhostSQL2014
-D AdventureWorks2014
-d AdventureWorks2014
-ip
-s "Second Scenario"
-Tl AdventureWorks2014.dbo.SalesOrderDetail
-of "C:TempSecond Scenario.sql"
-fa IDX_IV
-fp NONE
-fk ALL
-rl ALL

Executing the DTA utility using the second scenario produces entirely different results from the first scenario. Instead of recommending nonclustered indexes, the second scenario recommends a change in the clustered key columns. With this solution, the DTA session identified the SalesOrderID column as the column frequently used to access data and recommended that as the clustered index. Listing 10-11 shows these recommendations.

Listing 10-11. Second Scenario DTA Utility Output

use [AdventureWorks2014]
go

SET ANSI_PADDING ON

go

CREATE NONCLUSTERED INDEX [_dta_index_SalesOrderDetail_7_1895677801__K1_K2_3] ON [dbo].[SalesOrderDetail]
(
    [SalesOrderID] ASC,
    [SalesOrderDetailID] ASC
)
INCLUDE (    [CarrierTrackingNumber]) WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
go

CREATE NONCLUSTERED INDEX [_dta_index_SalesOrderDetail_7_1895677801__K1_K2_4] ON [dbo].[SalesOrderDetail]
(
    [SalesOrderID] ASC,
    [SalesOrderDetailID] ASC
)
INCLUDE (    [OrderQty]) WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
go

The one other difference with the second scenario is the creation of an XML report file. The session used the ALL option for the –rl argument, which includes all the reports listed for the argument in Table 10-5. These reports provide information regarding the statements that were tuned, the costs associated with the statements, the amount of improvement the recommendations provide, and much more (Figure 10-8). Through these reports, you are provided the information needed to make decisions about which recommendations to apply to your databases.

Figure 10-8. Sample report output from DTA utility

One thing to remember with the last two scenarios is that the table being tuned was tuned in a vacuum. There were no constraints or foreign key relationships on the table that need to be considered. In the real world, this won’t be the way your database is designed, and foreign key relationships will affect how recommendations are provided. Also, the load for these scenarios contained only two queries. When building your workloads, be sure to use a sample that is representative of your environment.

Through the DTA scenarios provided in this section, you’ve laid a foundation for using tools in your index tuning activities. Not only can the DTA identify missing indexes, but, given a workload, it can also help identify where clustered indexes and partitioning can assist with performance. The physical changes that DTA can provide could be extremely useful when you quickly need to address performance issues with a database.

Summary

This chapter walked you through using the built-in indexing tools available in SQL Server. Each of these tools can be a great addition to your SQL Server tool belt. They allow you to dig in and start making informed indexing decisions without expending a lot of effort.

When it comes to the missing index DMOs, you are working with index suggestions based on existing activity on the SQL Server instance. These are real-world applications, and they represent areas where you can almost immediately begin to build solutions to improve performance.

The DTA, while not as readily available as the missing index DMOs, allows you to tune indexes from a single query to a full workload with minimal effort. The new option to tune the contents of the plan cache allows you to leverage the work currently being done in an environment to build recommendations without the need to create a workload.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Chapter 10: Indexing Tools

Create new playlist

Sign In

Sign Up

Table of Contents for
Chapter 10: Indexing Tools