CHAPTER 17

SQL Server Metadata

Metadata is data that describes other data. SQL Server exposes a vast array of metadata, including structural metadata, which describes every object, and descriptive metadata, which describes the data itself. Metadata is exposed through a series of catalog views, information schema views, dynamic management views and functions, system functions, and stored procedures.

Introducing Metadata Objects

Catalog views reside in the sys schema. There are many catalog views; some of the most useful of them, such as sys.master_files, are explored in this chapter. Listing 17-1 shows an example of how to use a catalog view to produce a list of databases that are in the FULL recovery model.
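A minimal sketch of such a query, based on the sys.databases catalog view (the exact text in Listing 17-1 may differ), looks like this:

SELECT name,
       recovery_model_desc
FROM sys.databases
WHERE recovery_model_desc = 'FULL';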

Information schema views reside in the INFORMATION_SCHEMA schema. They return less detail than catalog views but are based on the ISO standards. This means that you can port your queries between RDBMS (Relational Database Management Systems). Listing 17-2 shows an example of using information schema views to produce a list of principals that have been granted SELECT access to the Chapter9.dbo.SensitiveData table.
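One way to write such a query is against INFORMATION_SCHEMA.TABLE_PRIVILEGES; the following sketch assumes it is run in the context of the Chapter9 database:

USE Chapter9;
GO

SELECT GRANTEE,
       PRIVILEGE_TYPE
FROM INFORMATION_SCHEMA.TABLE_PRIVILEGES
WHERE TABLE_SCHEMA = 'dbo'
      AND TABLE_NAME = 'SensitiveData'
      AND PRIVILEGE_TYPE = 'SELECT';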

Many dynamic management views and functions are available in SQL Server. Collectively, they are known as DMVs, and they provide information about the current state of the instance, which you can use for troubleshooting and performance tuning. The following categories of DMV are exposed in SQL Server 2014:

  • AlwaysOn Availability Groups
  • Change data capture
  • Change tracking
  • Common language runtime (CLR)
  • Database mirroring
  • Databases
  • Execution
  • Extended events
  • FILESTREAM and FileTable
  • Full-text search and semantic search
  • Indexes
  • I/O
  • Memory-optimized tables
  • Objects
  • Query notifications
  • Replication
  • Resource Governor
  • Security
  • Service broker
  • SQL Server operating system
  • Transactions

We demonstrate and discuss how to use DMVs many times throughout this chapter. DMVs always begin with a dm_ prefix, followed by two to four characters that describe the category of the object—for example, os_ for operating system, db_ for database, and exec_ for execution. This is followed by the name of the object. In Listing 17-3, you can see two things: an example of how to use a dynamic management view to find a list of logins that are currently connected to the Chapter16 database, and a dynamic management function you can use to produce details of the pages that store the data relating to the Chapter16.dbo.Customers table.
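A sketch of those two queries follows. The first relies on the sys.dm_exec_sessions view; the second relies on sys.dm_db_database_page_allocations, an undocumented dynamic management function available from SQL Server 2012 onward, which is an assumption here, since the original listing may use a different mechanism.

-- Logins currently connected to the Chapter16 database
SELECT DISTINCT login_name
FROM sys.dm_exec_sessions
WHERE database_id = DB_ID('Chapter16');

-- Pages that store the data of Chapter16.dbo.Customers
SELECT allocated_page_file_id,
       allocated_page_page_id,
       page_type_desc
FROM sys.dm_db_database_page_allocations(
         DB_ID('Chapter16'),
         OBJECT_ID('Chapter16.dbo.Customers'),
         NULL, NULL, 'DETAILED');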

SQL Server also offers many metadata-related system functions, such as DB_ID() and OBJECT_ID(), which we used in Listing 17-3. Another example of a metadata-related system function is DATALENGTH, which we use in Listing 17-4 to return the length of each value in the LastName column of the Chapter16.dbo.Customers table.
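For example, a query along these lines returns the length, in bytes, of each LastName value:

SELECT LastName,
       DATALENGTH(LastName) AS LastNameLengthInBytes
FROM Chapter16.dbo.Customers;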

Server- and Instance-Level Metadata

Many forms of metadata are available for the server and instance. Server-level metadata can be very useful for DBAs who need to find configuration information or troubleshoot an issue when they do not have access to the underlying operating system. For example, the dm_server category of DMVs offers views that allow you to check the status of server audits, view SQL Server’s Registry keys, find the location of memory dump files, and find details of the instance’s services. In the following sections, we discuss how to view the Registry keys associated with the instance, expose details of SQL Server’s services, and view the contents of the buffer cache.

Exposing Registry Values

The sys.dm_server_registry DMV exposes key registry entries pertaining to the instance. The view returns three columns, which are detailed in Table 17-1.

Table 17-1. sys.dm_server_registry Columns

registry_key: The name of the Registry key
value_name: The name of the key's value
value_data: The data contained within the value

A very useful piece of information that you can find in the sys.dm_server_registry DMV is the port number on which SQL Server is currently listening. The query in Listing 17-5 uses the sys.dm_server_registry DMV to return the port on which the instance is listening, assuming the instance is configured to listen on all IP addresses.
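A sketch of such a query follows; it assumes the standard value names used by the SQL Server network configuration (TcpPort and TcpDynamicPorts under the IPAll key):

SELECT registry_key,
       value_name,
       value_data
FROM sys.dm_server_registry
WHERE registry_key LIKE N'%IPAll'
      AND value_name IN (N'TcpPort', N'TcpDynamicPorts');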

Another useful feature of this DMV is its ability to return the startup parameters of the SQL Server service. This is particularly useful if you want to find out if switches such as -E have been configured for the instance. The -E switch increases the number of extents that are allocated to each file in the round-robin algorithm. The query in Listing 17-6 displays the startup parameters configured for the instance.
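One way to write such a query is to filter on the SQLArg value names, which hold the startup parameters:

SELECT value_name,
       value_data
FROM sys.dm_server_registry
WHERE value_name LIKE N'SQLArg%';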

Exposing Service Details

Another useful DMV within the dm_server category is sys.dm_server_services, which exposes details of the services the instance is using. Table 17-2 describes the columns returned.

Table 17-2. sys.dm_server_services Columns

servicename: The name of the service.
startup_type: An integer representing the startup type of the service.
startup_type_desc: A textual description of the startup type of the service.
status: An integer representing the current status of the service.
status_desc: A textual description of the current service state.
process_id: The process ID of the service.
last_startup_time: The date and time that the service last started.
service_account: The account used to run the service.
filename: The file name of the service, including the full file path.
is_clustered: 1 indicates that the service is clustered; 0 indicates that it is standalone.
cluster_nodename: If the service is clustered, this column indicates the name of the node on which the service is running.

The query in Listing 17-7 returns the name of each service, its startup type, its current status, and the name of the service account that runs the service.
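A sketch of this query:

SELECT servicename,
       startup_type_desc,
       status_desc,
       service_account
FROM sys.dm_server_services;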

Analyzing Buffer Cache Usage

The dm_os category of DMV exposes 41 objects that contain information about the current status of SQLOS, although only 31 of these are documented. A particularly useful DMV in the dm_os category, which exposes the contents of the buffer cache, is sys.dm_os_buffer_descriptors. When queried, this object returns the columns detailed in Table 17-3.

Table 17-3. sys.dm_os_buffer_descriptors Columns

database_id: The ID of the database that the page is from
file_id: The ID of the file that the page is from
page_id: The ID of the page
page_level: The index level of the page
allocation_unit_id: The ID of the allocation unit that the page is from
page_type: The type of page, for example, DATA_PAGE, INDEX_PAGE, IAM_PAGE, or PFS_PAGE
row_count: The number of rows stored on the page
free_space_in_bytes: The amount of free space on the page
is_modified: A flag that indicates if the page is dirty
numa_node: The NUMA node for the buffer
read_microsec: The amount of time taken to read the page into cache, specified in microseconds

The script in Listing 17-8 demonstrates how we can use the sys.dm_os_buffer_descriptors DMV to determine the percentage of the buffer cache each database is using on the instance. This can help you during performance tuning, as well as give you valuable insights that you can use during capacity planning or consolidation planning.
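A sketch of this kind of query, grouping the buffer descriptors by database and expressing each database's share of the cache as a percentage, might look like this:

SELECT CASE
           WHEN database_id = 32767 THEN 'ResourceDb'
           ELSE DB_NAME(database_id)
       END AS DatabaseName,
       COUNT(*) * 8 / 1024 AS BufferCacheMB,
       CAST(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER () AS DECIMAL(5, 2)) AS BufferCachePercent
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY COUNT(*) DESC;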

Note  More DMVs within the dm_os category are discussed in the “Metadata for Troubleshooting and Performance Tuning” section of this chapter.

Metadata for Capacity Planning

One of the most useful ways you can use metadata is during your pursuit of proactive capacity management. SQL Server exposes metadata that provides you with information about the current size and usage of your database files, and you can use this information to plan ahead and arrange additional capacity, before your enterprise monitoring software starts generating critical alerts.

Exposing File Stats

The sys.dm_db_file_space_usage DMV returns details of the space used within each data file of the database in which it is run. Before SQL Server 2012, it could only be used against TempDB, but it is now supported against any database. The columns returned by this object are detailed in Table 17-4.

Table 17-4. sys.dm_db_file_space_usage Columns

database_id: The ID of the database to which the file belongs.
file_id: The ID of the file within the database. These IDs are repeated between databases. For example, the primary file always has an ID of 1, and the first log file always has an ID of 2.
filegroup_id: The ID of the filegroup in which the file resides.
total_page_count: The total number of pages within the file.
allocated_extent_page_count: The number of pages within the file that are in extents that have been allocated.
unallocated_extent_page_count: The number of pages within the file that are in extents that have not been allocated.
version_store_reserved_page_count: The number of pages reserved to support transactions using snapshot isolation. Only applicable to TempDB.
user_object_reserved_page_count: The number of pages reserved for user objects. Only applicable to TempDB.
internal_object_reserved_page_count: The number of pages reserved for internal objects. Only applicable to TempDB.
mixed_extent_page_count: The number of extents that have pages allocated to different objects.

The sys.dm_io_virtual_file_stats DMV returns IO statistics for the data and log files of the database. This can help you determine the amount of data being written to each file and warn you of high IO stalls. The object accepts database_id and file_id as parameters and returns the columns detailed in Table 17-5.

Table 17-5. sys.dm_io_virtual_file_stats Columns

database_id: The ID of the database to which the file belongs.
file_id: The ID of the file within the database. These IDs are repeated between databases. For example, the primary file always has an ID of 1 and the first log file always has an ID of 2.
sample_ms: The number of milliseconds since the computer started.
num_of_reads: The total number of reads against the file.
num_of_bytes_read: The total number of bytes read from the file.
io_stall_read_ms: The total time waiting for reads to be issued against the file, specified in milliseconds.
num_of_writes: The total number of write operations performed against the file.
num_of_bytes_written: The total number of bytes written to the file.
io_stall_write_ms: The total time waiting for writes to complete against the file, specified in milliseconds.
io_stall: The total time waiting for all IO requests against the file to be completed, specified in milliseconds.
size_on_disk_bytes: The total space used by the file on disk, specified in bytes.
file_handle: The Windows file handle.
io_stall_queued_read_ms: Total IO latency for read operations against the file, caused by Resource Governor. Resource Governor is discussed in Chapter 23.
io_stall_queued_write_ms: Total IO latency for write operations against the file, caused by Resource Governor. Resource Governor is discussed in Chapter 23.

Tip  IO stalls are the amount of time it takes the IO subsystem to respond to SQL Server.

Unlike the previous two DMVs discussed in this section, the sys.master_files catalog view is a system-wide view, meaning that it returns a record for every file within every database on the instance. The columns returned by this view are described in Table 17-6.

Table 17-6. sys.master_files Columns

database_id: The ID of the database to which the file belongs.
file_id: The ID of the file within the database. These IDs are repeated between databases. For example, the primary file always has an ID of 1 and the first log file always has an ID of 2.
file_guid: The GUID of the file.
type: An integer representing the file type.
type_desc: A textual description of the file type.
data_space_id: The ID of the filegroup in which the file resides.
name: The logical name of the file.
physical_name: The physical path and name of the file.
state: An integer indicating the current state of the file.
state_desc: A textual description of the current state of the file.
size: The current size of the file, specified as a count of pages.
max_size: The maximum size of the file, specified as a count of pages.
growth: The growth setting of the file. 0 indicates autogrowth is disabled. If is_percent_growth is 0, then the value indicates the growth increment as a count of pages. If is_percent_growth is 1, then the value indicates a whole number percentage increment.
is_media_read_only: Specifies if the media on which the file resides is read only.
is_read_only: Specifies if the file is in a read-only filegroup.
is_sparse: Specifies that the file belongs to a database snapshot.
is_percent_growth: Indicates if the growth setting is a percentage or a fixed rate.
is_name_reserved: Specifies if the filename is reusable.
create_lsn: The LSN (Log Sequence Number) at which the file was created.
drop_lsn: The LSN at which the file was dropped (if applicable).
read_only_lsn: The most recent LSN at which the filegroup was marked read only.
read_write_lsn: The most recent LSN at which the filegroup was marked read/write.
differential_base_lsn: The LSN at which changes in the file started being marked in the DIFF pages.
differential_base_guid: The GUID of the full backup on which differential backups for the file are made.
differential_base_time: The time of the full backup on which differential backups for the file are made.
redo_start_lsn: The LSN at which the next roll forward will start.
redo_start_fork_guid: The GUID of the recovery fork.
redo_target_lsn: The LSN at which an online roll forward for the file can stop.
redo_target_fork_guid: The GUID of the recovery fork.
backup_lsn: The most recent LSN at which a full or differential backup was taken.

Using File Stats for Capacity Analysis

When combined, the three metadata objects described in the previous section can produce powerful reports that help you with capacity planning and with diagnosing performance issues. For example, the query in Listing 17-9 provides the file size, the amount of free space remaining, and the IO stalls for each file in the database. Because sys.dm_io_virtual_file_stats is a function, as opposed to a view, we CROSS APPLY the function to the result set, passing in the database_id and the file_id of each row as parameters.
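A sketch of such a query, run in the context of the database you are interested in, might be assembled as follows (the original listing may select additional columns):

SELECT DB_NAME(fsu.database_id) AS DatabaseName,
       df.name AS LogicalFileName,
       fsu.total_page_count * 8 / 1024 AS FileSizeMB,
       fsu.unallocated_extent_page_count * 8 / 1024 AS FreeSpaceMB,
       vfs.io_stall_read_ms,
       vfs.io_stall_write_ms,
       vfs.io_stall
FROM sys.dm_db_file_space_usage fsu
INNER JOIN sys.database_files df
    ON fsu.file_id = df.file_id
CROSS APPLY sys.dm_io_virtual_file_stats(fsu.database_id, fsu.file_id) vfs;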

The script in Listing 17-10 demonstrates how you can use sys.master_files to analyze drive capacity for each volume by detailing the current size of each file, the amount each file will grow by the next time it grows, and the current free capacity of the drive. You can obtain the free space on the drive by using the xp_fixeddrives stored procedure.
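The sketch below captures the output of xp_fixeddrives into a temporary table and joins it to sys.master_files on the drive letter; the growth calculation assumes growth is expressed as a count of pages when is_percent_growth is 0.

-- Capture free space per drive (MB) reported by xp_fixeddrives
CREATE TABLE #FixedDrives
(
    Drive  CHAR(1),
    MBFree INT
);

INSERT INTO #FixedDrives
EXEC xp_fixeddrives;

SELECT fd.Drive,
       fd.MBFree AS DriveFreeMB,
       DB_NAME(mf.database_id) AS DatabaseName,
       mf.name AS LogicalFileName,
       mf.size * 8 / 1024 AS FileSizeMB,
       CASE
           WHEN mf.growth = 0 THEN 0
           WHEN mf.is_percent_growth = 1 THEN (mf.size * mf.growth / 100) * 8 / 1024
           ELSE mf.growth * 8 / 1024
       END AS NextGrowthIncrementMB
FROM sys.master_files mf
INNER JOIN #FixedDrives fd
    ON fd.Drive = LEFT(mf.physical_name, 1)
ORDER BY fd.Drive, DatabaseName;

DROP TABLE #FixedDrives;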

The issue with xp_fixeddrives is that it cannot see mapped drives. Therefore, as an alternative, you can employ the script in Listing 17-11, which uses PowerShell to return the information.
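A simplified sketch of this approach follows; it shells out to PowerShell's Get-PSDrive cmdlet through xp_cmdshell and returns the raw output, which you would then parse into drive letters and free space.

CREATE TABLE #DriveSpace
(
    OutputLine NVARCHAR(500)
);

-- Requires xp_cmdshell to be enabled (see the Caution that follows)
INSERT INTO #DriveSpace (OutputLine)
EXEC xp_cmdshell 'powershell.exe -NoProfile -Command "Get-PSDrive -PSProvider FileSystem | Select-Object Name, Free"';

SELECT OutputLine
FROM #DriveSpace
WHERE OutputLine IS NOT NULL;

DROP TABLE #DriveSpace;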

Caution  The drawback of this approach is that it requires xp_cmdshell to be enabled, which is against security best practice.

Metadata for Troubleshooting and Performance Tuning

You can use many metadata objects to tune performance and troubleshoot issues within SQL Server. In the following sections, we explore how to capture performance counters from within SQL Server, how to analyze waits, and how to use DMVs to troubleshoot issues with expensive queries.

Retrieving Perfmon Counters

Perfmon is a Windows tool that captures performance counters for the operating system, plus many SQL Server–specific counters. DBAs who are trying to diagnose performance issues find this very useful. The problem is that many DBAs do not have administrative access to the underlying operating system, which makes them reliant on Windows administrators to assist with the troubleshooting process. A workaround for this issue is the sys.dm_os_performance_counters DMV, which exposes the SQL Server Perfmon counters within SQL Server. The columns returned by sys.dm_os_performance_counters are described in Table 17-7.

Table 17-7. sys.dm_os_performance_counters Columns

object_name: The category of the counter.
counter_name: The name of the counter.
instance_name: The instance of the counter. For example, database-related counters have an instance for each database.
cntr_value: The value of the counter.
cntr_type: The type of counter. Counter types are described in Table 17-8.

The sys.dm_os_performance_counters DMV exposes different types of counters that can be identified by the cntr_type column, which relates to the underlying WMI performance counter type. You need to handle different counter types in different ways. The counter types exposed are described in Table 17-8.

Table 17-8. Counter Types

1073939712: You will use PERF_LARGE_RAW_BASE as a base value in conjunction with the PERF_LARGE_RAW_FRACTION type to calculate a counter percentage, or with PERF_AVERAGE_BULK to calculate an average.
537003264: Use PERF_LARGE_RAW_FRACTION as a fractional value in conjunction with PERF_LARGE_RAW_BASE to calculate a counter percentage.
1073874176: PERF_AVERAGE_BULK is a cumulative average that you use in conjunction with PERF_LARGE_RAW_BASE to calculate a counter average. The counter, along with the base, is sampled twice to calculate the metric over a period of time.
272696320: PERF_COUNTER_COUNTER is a 32-bit cumulative rate counter. The value should be sampled twice to calculate the metric over a period of time.
272696576: PERF_COUNTER_BULK_COUNT is a 64-bit cumulative rate counter. The value should be sampled twice to calculate the metric over a period of time.
65792: PERF_COUNTER_LARGE_RAWCOUNT returns the last sampled result for the counter.

The query in Listing 17-12 demonstrates how to use sys.dm_os_performance_counters to capture metrics of the PERF_COUNTER_LARGE_RAWCOUNT type, which is the simplest form of counter to capture. The query returns the number of memory grants that are currently pending.
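A sketch of this query:

SELECT object_name,
       counter_name,
       cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Memory Grants Pending';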

The script in Listing 17-13 demonstrates capturing the number of lock requests that are occurring per second over the space of one minute. The lock requests/sec counter uses the PERF_COUNTER_BULK_COUNT counter type, but the same method applies to capturing counters relating to In-Memory OLTP, which uses the PERF_COUNTER_COUNTER counter type.
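The sketch below samples the counter twice, sixty seconds apart, and divides the difference by the elapsed time; the _Total instance of the Locks counter object is assumed.

DECLARE @Sample1 BIGINT,
        @Sample2 BIGINT;

SELECT @Sample1 = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Lock Requests/sec'
      AND instance_name = N'_Total';

WAITFOR DELAY '00:01:00';

SELECT @Sample2 = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Lock Requests/sec'
      AND instance_name = N'_Total';

SELECT (@Sample2 - @Sample1) / 60 AS LockRequestsPerSecond;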

The script in Listing 17-14 demonstrates capturing the plan cache hit ratio for the instance. The Plan Cache Hit Ratio counter is counter type 537003264. Therefore, we need to multiply the value by 100 and then divide by the base counter to calculate the percentage. Before running the script, you should change the instance name to match your own.
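A sketch of this calculation follows. The counter object is SQLServer:Plan Cache on a default instance, or MSSQL$<InstanceName>:Plan Cache on a named instance, so the filter below uses a LIKE pattern rather than a hard-coded instance name.

SELECT (f.cntr_value * 100.0) / b.cntr_value AS PlanCacheHitRatioPercent
FROM sys.dm_os_performance_counters f
INNER JOIN sys.dm_os_performance_counters b
    ON b.object_name = f.object_name
       AND b.instance_name = f.instance_name
WHERE f.object_name LIKE N'%Plan Cache%'   -- object_name includes the instance name
      AND f.counter_name = N'Cache Hit Ratio'
      AND b.counter_name = N'Cache Hit Ratio Base'
      AND f.instance_name = N'_Total';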

The script in Listing 17-15 demonstrates how to capture the Average Latch Wait Time (ms) counter. Because this counter is of type PERF_AVERAGE_BULK, we need to capture the value and its corresponding base counter twice. We then need to deduct the first capture of the counter from the second capture, deduct the first capture of the base counter from the second capture, and then divide the fractional counter value by its base value to calculate the average over the time period. Because it is possible that no latches will be requested within the time period, we have wrapped the SELECT statement in an IF/ELSE block to avoid the possibility of a divide-by-0 error being thrown.
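A sketch of this technique, using a one-minute sampling window, follows; the counter and base names are the standard Latches object counters.

DECLARE @Value1 BIGINT,
        @Base1  BIGINT,
        @Value2 BIGINT,
        @Base2  BIGINT;

SELECT @Value1 = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Average Latch Wait Time (ms)';

SELECT @Base1 = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Average Latch Wait Time Base';

WAITFOR DELAY '00:01:00';

SELECT @Value2 = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Average Latch Wait Time (ms)';

SELECT @Base2 = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Average Latch Wait Time Base';

IF (@Base2 - @Base1) = 0
    SELECT 0 AS AverageLatchWaitMs;   -- no latches were requested in the interval
ELSE
    SELECT (@Value2 - @Value1) * 1.0 / (@Base2 - @Base1) AS AverageLatchWaitMs;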

Analyzing Waits

Waits are a natural aspect of any RDBMS, but they can also indicate a performance bottleneck. A full explanation of all wait types can be found at msdn.microsoft.com, but all wait types break down into three categories: resource waits, queue waits, and external waits.

Resource waits occur when a thread requires access to an object, but that object is already in use, and therefore, the thread has to wait. This can include the thread waiting to take a lock out on an object or waiting for a disk resource to respond. Queue waits occur when a thread is idle and is waiting for a task to be assigned. This does not necessarily indicate a performance bottleneck, since it is often a background task, such as the Deadlock Monitor or Lazy Writer waiting until it is needed. External waits occur when a thread is waiting for an external resource, such as a linked server. The hidden gotcha here is that an external wait does not always mean that the thread is actually waiting. It could be performing an operation external to SQL Server, such as an extended stored procedure running external code.

Any task that has been issued is in one of three states: running, runnable, or suspended. If a task is in the running state, then it is actually being executed on a processor. When a task is in the runnable state, it sits on the processor queue, awaiting its turn to run. This is known as a signal wait. When a task is suspended, it means that the task is waiting for any reason other than a signal wait. In other words, it is experiencing a resource wait, a queue wait, or an external wait. Each query is likely to alternate between the three states as it progresses.

The sys.dm_os_wait_stats DMV returns details of the cumulative waits for each wait type since the instance started, or since the statistics it exposes were last reset. You can reset the statistics by running the command in Listing 17-16.
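The reset is performed with DBCC SQLPERF:

DBCC SQLPERF (N'sys.dm_os_wait_stats', CLEAR);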

The columns returned by sys.dm_os_wait_stats are detailed in Table 17-9.

Table 17-9. sys.dm_os_wait_stats Columns

wait_type: The name of the wait type that has occurred.
waiting_tasks_count: The number of waits that have occurred on this wait type.
wait_time_ms: The cumulative time of all waits against this wait type, displayed in milliseconds. This includes signal wait times.
max_wait_time_ms: The maximum duration of a single wait against this wait type.
signal_wait_time_ms: The cumulative time for all signal waits against this wait type.

To find the wait types that are responsible for the highest cumulative wait time, run the query in Listing 17-17. This query adds a calculated column to the result set, which deducts the signal wait time from the overall wait time to avoid CPU pressure from skewing the results.
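A sketch of such a query follows; production versions typically also filter out benign background wait types, which are omitted here for brevity.

SELECT TOP 10
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms,
       wait_time_ms - signal_wait_time_ms AS resource_wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY resource_wait_time_ms DESC;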

Of course, signal wait time can be a cause for concern in its own right, potentially identifying the processor as a bottleneck, and you should analyze it. Therefore, use the query in Listing 17-18 to calculate the percentage of overall wait time that is due to tasks waiting for their turn on the processor. The value is displayed for each wait type and is followed by a row that displays the overall percentage across all wait types.

To find the highest waits over a defined period, you need to sample the data twice and then deduct the first sample from the second sample. The script in Listing 17-19 samples the data twice with a ten-minute interval and then displays the details of the five highest waits within that interval.

Finding and Tuning Expensive Queries

Although poor performance can be the sign of a hardware bottleneck, more often than not, you can achieve larger performance gains from tuning queries than from throwing hardware at the problem. If developers have written queries using cursors or other suboptimal techniques, then you may need to pass the queries back to the development or technical support team; but in other cases, the DBA may be able to resolve the issue—by adding a missing index, for example.

The first step in the query troubleshooting process is to find the most expensive queries that are frequently being run. You can achieve this by using the sys.dm_exec_query_stats DMV in conjunction with the sys.dm_exec_sql_text and sys.dm_exec_query_plan DMVs. The columns returned by the sys.dm_exec_query_stats dynamic management view are detailed in Table 17-10.

Table 17-10. sys.dm_exec_query_stats Columns

sql_handle: A token that identifies the parent procedure or batch of the query.
statement_start_offset: A zero-based offset, denoting the start of the query within the parent batch or procedure.
statement_end_offset: A zero-based offset, denoting the end of the query within the parent batch or procedure. If statement_end_offset is -1, then this denotes the end of the parent batch or procedure.
plan_generation_num: Identifies the instance of a query plan following a recompile.
plan_handle: A token that identifies the query plan, which contains the query.
creation_time: The time that the query plan was compiled.
last_execution_time: The time that the plan last started to execute. This does not include invocations that are currently running.
execution_count: The number of times the plan has been executed. This does not include invocations that are currently running.
total_worker_time: The cumulative amount of CPU time that has been used by all executions of the plan, specified in microseconds.
last_worker_time: The amount of CPU time used by the plan the last time it was executed, specified in microseconds.
min_worker_time: The minimum amount of CPU time used by any single execution of the plan, specified in microseconds.
max_worker_time: The maximum amount of CPU time used by any single execution of the plan, specified in microseconds.
total_physical_reads: The cumulative number of reads made from disk during executions of the plan.
last_physical_reads: The number of reads from disk made the last time the query was executed.
min_physical_reads: The minimum number of reads from disk made for any single execution of the plan.
max_physical_reads: The maximum number of reads from disk made for any single execution of the plan.
total_logical_writes: The cumulative number of writes made to the buffer cache during all executions of the plan.
last_logical_writes: The number of writes made to the buffer cache the last time the plan was executed.
min_logical_writes: The minimum number of writes made to the buffer cache on any single execution of the plan.
max_logical_writes: The maximum number of writes made to the buffer cache on any single execution of the plan.
total_logical_reads: The cumulative number of reads made to the buffer cache during all executions of the plan.
last_logical_reads: The number of reads made to the buffer cache the last time the plan was executed.
min_logical_reads: The minimum number of reads made to the buffer cache on any single execution of the plan.
max_logical_reads: The maximum number of reads made to the buffer cache on any single execution of the plan.
total_clr_time: The cumulative total amount of CPU time used within CLR objects for all executions of the plan, specified in microseconds.
last_clr_time: The amount of CPU time used inside CLR objects the last time that the plan was executed, specified in microseconds.
min_clr_time: The minimum amount of CPU time used within CLR objects for a single execution of the plan, specified in microseconds.
max_clr_time: The maximum amount of CPU time used within CLR objects for a single execution of the plan, specified in microseconds.
total_elapsed_time: The cumulative total execution time for all executions of the plan, specified in microseconds.
last_elapsed_time: The total execution time for the plan the last time it was run, specified in microseconds.
min_elapsed_time: The minimum execution time for any single execution of the plan, specified in microseconds.
max_elapsed_time: The maximum execution time for any single execution of the plan, specified in microseconds.
query_hash: A binary hash representation of the query that is used by the optimizer to identify similar queries.
query_plan_hash: A binary hash representation of the execution plan that is used by the optimizer to identify similar plans.
total_rows: The cumulative total number of rows returned by all executions of the plan.
last_rows: The number of rows returned the last time that the plan was executed.
min_rows: The minimum number of rows that have been returned by a single execution of the plan.
max_rows: The maximum number of rows that have been returned by a single execution of the plan.

The sys.dm_exec_sql_text dynamic management function accepts a single parameter, which is the sql_handle. This means that the function can be cross-applied to the sys.dm_exec_query_stats results. The function returns the text of the batch or procedure, of which the query is a part. The sys.dm_exec_query_plan accepts a single parameter, which is the plan_handle. This means that it can also be cross-applied to the results of sys.dm_exec_query_stats. It returns the XML representation of the query plan.

Before we demonstrate how to use sys.dm_exec_query_stats, we first create the Chapter17 database, which includes two tables that we populate with data. These tasks are performed in Listing 17-20.

In Listing 17-21, we tear down the plan cache and run some queries against the Customers table in our Chapter17 database before we use the sys.dm_exec_query_stats object to view details of our longest running queries. Because we have reset the plan cache, we know that our queries against the Customers table will be returned. In a production scenario, you may choose to filter the results by execution count, since you may not be interested in ad-hoc queries that have only run once or twice. You may also choose to filter by last execution time to avoid returning information for old queries that are no longer run.
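The query we run has roughly the following shape; this is a sketch rather than the exact text of Listing 17-21, and it extracts each statement from its batch by using the statement offsets:

SELECT TOP 10
       qs.execution_count,
       qs.total_elapsed_time,
       qs.total_worker_time,
       qs.total_logical_reads,
       SUBSTRING(st.text,
                 (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                       WHEN -1 THEN DATALENGTH(st.text)
                       ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text,
       qp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
ORDER BY qs.total_elapsed_time DESC;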

The results of the query against sys.dm_exec_query_stats are displayed in Figure 17-1. You can see that, unsurprisingly, the query to identify duplicates was the longest running query, as well as the most expensive in terms of CPU time and IO.

Figure 17-1. Longest running query results

The right-most column in the results displays the XML execution plan. Clicking this execution plan in SQL Server Management Studio displays the graphical representation of the execution plan. The execution plan for the most expensive query is shown in Figure 17-2.

Figure 17-2. Execution plan

The query optimizer is very clever, and the vast majority of the time, it chooses the best plan for a query, unless statistics are out of date or indexes are fragmented. Occasionally, however, when troubleshooting a performance issue with a query, you may notice that the optimizer has chosen a suboptimal plan. In the case of our query to identify duplicate names, the optimizer chose a Merge join, which requires the data to be sorted first. The Nested Loop algorithm is likely to perform better for this query; although it increases the number of reads, the data pages of the table are warm (already in the buffer cache), so the elapsed time is likely to be lower. We can resolve this issue by asking the development team to add a query hint, but this is not always possible, either because of third-party code or because of delays caused by change control. Instead, you can resolve the issue by using plan freezing.

Plan Freezing

Plan freezing involves creating an object called a plan guide, which influences the optimizer by attaching query hints to a specific query or programmable object. When the optimizer is passed a query, it checks to see if there is a plan guide that matches the query pattern stored in the sys.plan_guides table. If there is, it modifies the query to incorporate the hints before compiling it.

You can create a plan guide by using the sp_create_plan_guide stored procedure, which accepts the parameters detailed in Table 17-11.

Table 17-11. sp_create_plan_guide Parameters

@name: The name of the plan guide that you are creating.
@stmt: The text of the SQL statement that the plan guide is matched against.
@type: Acceptable values are OBJECT, SQL, or TEMPLATE. If OBJECT is specified, the statement must form part of a programmable object, such as a stored procedure or function. If SQL is specified, then the statement must be in an ad-hoc batch. If TEMPLATE is specified, then the statement must match the query after it has been parameterized, for example, WHERE Col1 = @0. TEMPLATE is only compatible with the PARAMETERIZATION query hint.
@module_or_batch: Specifies the name of the programmable object in the format schema.object if the type is OBJECT. This needs to be set to the exact text of the batch if the type is SQL. If NULL is passed, the statement supplied in the @stmt parameter is used.
@params: Specifies the definitions of all parameters that are embedded in the statement.
@hints: Specifies either the hints to apply, in the form N'OPTION(...)', the query plan to use, supplied as an XML showplan, or NULL to indicate that hints embedded within the statement should be ignored.

Caution  The batch text in the plan guide must be identical to the batch being executed, or the plan guide will not be used. This includes white space, case, and accents. However, EXEC, EXECUTE, and execute are all considered identical.

The command in Listing 17-22 creates a plan guide that forces our duplicate identifying query to use a Loop join, as opposed to a Merge join.
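The general shape of such a command is shown below. The @stmt text here is only a placeholder for the duplicate-finding query (the table and column names are hypothetical); in practice, the text must match the executed batch exactly, as the preceding Caution explains.

EXEC sp_create_plan_guide
    @name = N'NestedLoopsForDuplicateSearch',
    @stmt = N'SELECT c1.FirstName, c1.LastName
FROM dbo.Customers c1
INNER JOIN dbo.Customers c2
    ON c1.FirstName = c2.FirstName
       AND c1.LastName = c2.LastName
       AND c1.ID <> c2.ID;',
    @type = N'SQL',
    @module_or_batch = NULL,
    @params = NULL,
    @hints = N'OPTION (LOOP JOIN)';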

After running the duplicates query again, you have two plans in the plan cache. We can run the query in Listing 17-23 to find the details.

Caution  Because of the string function in the WHERE clause, this query may not perform well on systems with a large volume of cached plans.

The results of this query are displayed in Figure 17-3. Notice that there are two plans and that each has been executed once. Although the logical reads for the second plan are substantially higher, the elapsed time of the query is shorter.

Figure 17-3. Find the plan for specific query results

If we view the graphical execution plan of the faster plan shown in Figure 17-4, we can see that a Nested Loop operator was used to join the results set, as per the instruction in our plan guide.

Figure 17-4. Query plan with nested loops

Caution  Exercise caution when using query hints or plan guides. If the distribution of values changes, your hints may no longer be optimal. Also, the Optimizer is smart, so it is best to work with it rather than against it. For example, if you know that a Loop join is always a poor choice, then instead of using OPTION(HASH JOIN), use OPTION(HASH JOIN, MERGE JOIN). This gives the Optimizer the choice of using any join algorithm, except Loop join, and gives you the best chance of generating the most optimal plan.

Metadata Driven Automation

You can use metadata to drive intelligent scripts that you can use to automate routine DBA maintenance tasks, while at the same time incorporating business logic. In the following sections, you see how you can use metadata to generate rolling database snapshots and also to rebuild only those indexes that are fragmented.

Dynamically Cycling Database Snapshots

As discussed in Chapter 16, we can use database snapshots to create a read-only copy of the database that can reduce contention for read-only reporting. The issue is that the data becomes stale, as data in the source database is modified. For this reason, a useful tool for managing snapshots is a stored procedure, which dynamically creates a new snapshot and drops the oldest existing snapshot. You can then schedule this procedure to run periodically, using SQL Server Agent. (SQL Server Agent is discussed in Chapter 21.) The script in Listing 17-24 creates a stored procedure that, when passed the name of the source database, drops the oldest snapshot and creates a new one.

The procedure accepts two parameters. The first specifies the name of the database that should be used to generate the snapshot. The second parameter specifies how many snapshots you should have at any one time. For example, if you pass in a value of Chapter17 to the @DBName parameter and a value of 2 to the @RequiredSnapshots parameter, the procedure creates a snapshot against the Chapter17 database but only removes the oldest snapshot if at least two snapshots already exist against the Chapter17 database.

The procedure builds up the CREATE DATABASE script in three parts (see Listing 17-24). The first part contains the initial CREATE DATABASE statement. The second part creates the file list, based on the files that are recorded as being part of the database in sys.master_files. The third part contains the AS SNAPSHOT OF statement. The three strings are then concatenated together before being executed. The script appends a sequence number to the name of the snapshot, and the name of each file within the snapshot, to ensure uniqueness.

The command in Listing 17-25 runs the DynamicSnapshot procedure against the Chapter17 database specifying that two snapshots should exist at any one time.

Rebuilding Only Fragmented Indexes

When you rebuild all indexes with a maintenance plan, which we discuss in Chapter 21, SQL Server supplies no intelligent logic out of the box. Therefore, all indexes are rebuilt, regardless of their fragmentation level, which requires unnecessary time and resource utilization. A workaround for this issue is to write a custom script that rebuilds indexes only if they are fragmented.

The script in Listing 17-26 demonstrates how you can use SQLCMD to identify indexes that have more than 25-percent fragmentation and then rebuild them dynamically. The reason the code is in a SQLCMD script, as opposed to a stored procedure, is that sys.dm_db_index_physical_stats must be called from within the database that you wish to run it against. Therefore, when you run it via SQLCMD, you can use a scripting variable to specify the database you require; doing so makes the script reusable for all databases. When you run the script from the command line, you can simply pass in the name of the database as a variable.
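A sketch of such a script follows; the 25-percent threshold matches the description above, while the LIMITED scan mode and the exclusion of heaps are assumptions. The $(DBName) token is the SQLCMD scripting variable supplied on the command line.

USE $(DBName);
GO

DECLARE @SQL NVARCHAR(MAX) = N'';

-- Build an ALTER INDEX ... REBUILD statement for every index over the threshold
SELECT @SQL = @SQL
    + N'ALTER INDEX ' + QUOTENAME(i.name)
    + N' ON ' + QUOTENAME(s.name) + N'.' + QUOTENAME(o.name)
    + N' REBUILD;' + CHAR(13)
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ps
INNER JOIN sys.indexes i
    ON ps.object_id = i.object_id
       AND ps.index_id = i.index_id
INNER JOIN sys.objects o
    ON i.object_id = o.object_id
INNER JOIN sys.schemas s
    ON o.schema_id = s.schema_id
WHERE ps.avg_fragmentation_in_percent > 25
      AND ps.index_id > 0        -- exclude heaps
      AND o.is_ms_shipped = 0;

EXEC sp_executesql @SQL;
GO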

When this script is saved in the root of C: as RebuildIndexes.sql, it can be run from the command line. The command in Listing 17-27 demonstrates running it against the Chapter17 database.
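Such a command might look like the following, assuming Windows authentication and a local default instance; substitute your own server and instance name.

sqlcmd -E -S localhost -i C:\RebuildIndexes.sql -v DBName="Chapter17"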

Summary

SQL Server exposes a vast array of metadata, which describes the data structures within SQL Server as well as the data itself. Metadata is exposed through a series of catalog views, dynamic management views and functions, system functions, and the INFORMATION_SCHEMA. Normally you only use the INFORMATION_SCHEMA if you need your scripts to be transferable to other RDBMS products. This is because it provides less detail than SQL Server–specific metadata but conforms to ISO standards, and therefore, it works on all major RDBMS.

This chapter also covered much useful information about the underlying operating system, as well as SQLOS. For example, you can use the dm_server category of DMV to find details of the instance’s Registry keys and expose details of the instance’s services. You can use the dm_os category of DMV to expose many internal details regarding the SQLOS, including the current contents of the buffer cache.

SQL Server also exposes metadata that you can use for capacity planning, such as the usage statistics for all files within a database (for example, IO stalls) and the amount of free space remaining. You can use this information to proactively plan additional capacity requirements before alerts start being triggered and applications are put at risk.

Metadata can also help in the pursuit of troubleshooting and performance tuning. The sys.dm_os_performance_counters DMV allows DBAs to retrieve Perfmon counters, even if they do not have access to the operating system. This can remove inter-team dependencies. You can use sys.dm_os_wait_stats to identify the most common cause of waits within the instance, which can in turn help diagnose hardware bottlenecks, such as memory or CPU pressure. The dm_exec category of DMV can help identify expensive queries, which may be tuned, to improve performance.

DBAs can also use metadata to create intelligent scripts, which can reduce their workload by adding business rules to common maintenance tasks. For example, a DBA can use metadata for tasks such as dynamically rebuilding only indexes that have become fragmented, or dynamically managing the cycling of database snapshots. I encourage you to explore the possibilities of metadata-driven automation further; the possibilities are endless.
