© Tracy Boggiano and Grant Fritchey 2019
T. Boggiano, G. FritcheyQuery Store for SQL Server 2019https://doi.org/10.1007/978-1-4842-5004-4_7

7. Forcing Plans

Tracy Boggiano1  and Grant Fritchey2
(1)
Cary, NC, USA
(2)
Grafton, MA, USA
 

While the information collected in the Query Store and all that it tells you about the performance of your system is the part of Query Store that you’ll use more frequently, the most powerful part of Query Store is the ability to force execution plans. Plan forcing is where the choices made by the optimizer during a compile or recompile event are superceded by a plan you select. Query Store provides enough information between the performance metrics and the execution plan to enable you to decide that, under some circumstances, you can pick a superior execution plan.

This chapter will show you how to use Query Store information and reports to identify a poorly behaving execution plan. With a badly behaving plan identified, we’ll explore how to use the information in Query Store to identify a well-behaved plan, if one exists. With this information in hand, we can then force the good plan to be used.

Identifying Badly Behaved Execution Plans

Execution plans, also called query plans or show plans, are your window into the choices made by the query optimization process. They show how your T-SQL, indexes, statistics, tables, columns, and constraints make up your database, and its queries will be used to retrieve or modify the data stored there. Reading execution plans is a very dense topic. For a detailed look, you should read the book Execution Plans by Grant Fritchey (one of the co-authors of this book). Here we’ll explore the information and reports that may suggest you have a poorly performing query. We’ll then use that to lead us to an execution plan. We’ll cover a few of the guideposts that can help you tell that a plan might not be supporting a query well, but we won’t be covering all the details of how to read an execution plan.

To get started, we have to understand how we can use Query Store to identify a query that has both changed its execution plan and how that plan caused performance to degrade. Another way you can identify poorly performing queries is when you get alerted to issues from the consumers of your database that performance is poor but with the information in the Query Store, we can be more proactive than that.

Identifying Regressed Queries

A regression is something that occurs when a query was behaving well and now performs poorly. You may have a query that runs badly from the start. These are easily identified and dealt with. You may have a query that slowly degrades in performance over time as more and more data is added to the system. These are also easily identified and dealt with. The real issues come when you have queries that were performing well and suddenly, sometimes intermittently, run slowly. Prior to Query Store, these were frequently difficult problems to solve.

The causes for these regressions are varied. You might see them when you migrate from earlier versions of SQL Server to later versions of SQL Server. These are caused by changes to how the query optimizer works and the introduction in SQL Server 2014 of a new cardinality estimation engine. You may see a regression caused by statistics on your database being incorrect or out of date. You could also be suffering from a problem known as bad parameter sniffing.

While incorrect and out-of-date statistics are probably the most common cause of regressions in query performance, the most talked about problem, and the one we’ll use for our examples in this chapter, is bad parameter sniffing. Let’s start by explaining what parameter sniffing is and how it can sometimes cause problems.

When you have a parameterized query, whether that’s a stored procedure, a parameterized query from an ORM tool such as Entity Framework, or parameters in sp_executesql, the query optimizer will use the values passed to that parameter when compiling an execution plan. It takes those exact values and uses them to look at the statistics for the column or index being referenced and creates and execution plan based on those precise values. This sampling of values is what is known as parameter sniffing. The majority of the time, this is a benign practice. A lot of the time, this is a very helpful practice. Sometimes though, you’ll see a situation where a plan gets created for one value that performs well enough for the query immediately called. However, all other queries perform very poorly with that same execution plan. This situation is known as bad parameter sniffing.

When you hit a situation where a query was performing well, but is suddenly performing poorly, you can use the information in Query Store to identify both the query metrics that define the performance and get a look at the execution plans both for when the query was running well and when it was running badly. You can do this one of two ways. You can run T-SQL code against the data collected in Query Store. Alternatively, you can use the Regressed Query report. Since the Regressed Query report may not always identify a query that is giving you problems as a regressed query, it’s a very good idea to know how to retrieve this information from the Query Store data yourself.

Identifying Regressed Queries from Query Store Data

We discussed the information collected by the Query Store in Chapter 2. We then went over the basics of how to retrieve data from the Query Store catalog views in Chapter 5. If you don’t get what we’re doing with the queries below, I’d recommend going back and reviewing those chapters before proceeding with this one.

To start with, we need a query that we can reliably get different execution plans from. The data distribution in the Person.Address table is such that depending on what value is used to filter the City column, you can get a couple of different execution plans in the AdventureWorks database. Here is the query we’ll use to explore behavior in the rest of the chapter:
CREATE OR ALTER PROC [dbo].[AddressByCity] @City NVARCHAR(30)
AS
   SELECT a.AddressID,
      a.AddressLine1,
      a.AddressLine2,
      a.City,
      sp.Name AS StateProvinceName,
      a.PostalCode
   FROM Person.Address AS a
   JOIN Person.StateProvince AS sp
      ON a.StateProvinceID = sp.StateProvinceID
   WHERE a.City = @City;
GO
With this procedure in place, we can execute the query with one of two different values to arrive at two different execution plans. If you run the following script with execution plans enabled, you can see the two plans. Note: For test purposes we’re using a simple way to clear the procedure cache. This may not be the best approach on a production system:
EXEC dbo.AddressByCity @City = N'London';
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;
EXEC dbo.AddressByCity @City = N'Mentor';

This is a classic case of parameter sniffing resulting in different execution plans. Each plan is optimized for the result set it is returning; 434 rows for the value “London” and one row for the value “Mentor.” However, each plan causes poor performance for the opposite data set. What I mean here is that the plan to return 434 rows runs fast when returning 434 rows, but it doesn’t run as fast as the other plan when returning only a few rows. The opposite is also true. In fact, the query will run so much slower when using the plan for “Mentor” when returning 434 rows that it will be marked as a regressed query in the Regressed Query report.

Regressed Query Report

We’re going to be covering the reports in detail in Chapter 4. However, here we want to show how the behavior of the dbo.AddressByCity procedure returns in the report:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig1_HTML.jpg
Figure 7-1

Regressed Query report showing poor performance

The first pane, on the left, is a listing of the queries based on the metric you’re measuring (duration by default). The second pane in the report shows query execution speed and the different plans involved. Clicking on one of the dots will change the plan being displayed in the bottom pane. Hovering over the dot will show how many executions it represents, average runtime, and more. Again, for more details on using the reports, see Chapter 4.

You can’t simply run the code above and arrive at a regressed query. The Regressed Query report compares behavior over time. So to get the report to generate, you have to run the code multiple times over a period of time. To simulate this yourself, you can run the following script. You may have to run it multiple times to get the query to show up as a regressed query because it has to have the data inside the Query Store to get the report to fire and it has to show a substantial difference over time. However, this script will work:
--Establish baseline behavior
EXEC dbo.AddressByCity @City = N'London';
GO 100
--Remove the plan from cache
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;
--Compile a new plan
EXEC dbo.AddressByCity @City = N'Mentor';
GO
--Execute the code to show query regression
EXEC dbo.AddressByCity @City = N'London';
GO 100

You should be able to see the query in the Regressed Query report and it should show two different execution plans the same as Figure 7-1.

Warning Signs in Execution Plans

Regressed query behavior is driven by changes to the execution plans. There isn’t room in this book to cover everything you need to know to read execution plans. For a detailed examination on that topic, we recommend Grant Fritchey’s book, SQL Server Execution Plans (Redgate Press, 2018). However, there a few warning signs we can look for in execution plans as a quick set of guides. Just understand, these will only guide you initially. You’ll have to eventually learn the details to read and understand the information inside execution plans.

Here is a core set of things to look at when examining execution plans for the first time.
  • First Operator: A detailed collection of information about the plan itself

    Most Costly Operator: The highest estimated cost showing a highly likely culprit for poor performance

    Warnings and Errors: Issues experienced by the optimization process, if any

    Fat Pipes: An indication of data flow, with wider pipes showing more data

    Scans: An indicator of larger data movement

    Extra Operators: Operators that you can’t explain readily why they are there or what they are doing

    Estimates vs. Actuals: The comparisons between the estimated number of rows or executions for an operator and the actual number of either

Here’s what you are using these guides to look for. I’ll use the execution plan that is causing us the most trouble, the one from the value of “Mentor” as shown here in Figure 7-2:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig2_HTML.jpg
Figure 7-2

Execution plan for the value “Mentor”

We’ll now walk through the guideposts looking for information about this plan.

First Operator

The first operator is the one all the way on the left side of any plan. It will usually be labeled by the type of operation your T-SQL is performing: SELECT, INSERT, UPDATE, and DELETE. Right-click on that operator and select properties. You’ll see something like Figure 7-3:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig3_HTML.jpg
Figure 7-3

Partial list of first operator properties

This is only a partial of all the properties and information exposed through the first operator of an execution plan. When looking at plan regressions, chances are high that you may be seeing problems caused by changes in data or statistics, or parameter sniffing. In that case, the Parameter List property shows you the parameter or parameters used to compile the execution plan. You can see in Figure 7-3 that this plan used the value “Mentor.”

Most Costly Operator

The plan shown in Figure 7-2 is very simple and easy to read. You can quickly spot that the Index Scan operator is 93% of the estimated cost. These costs are based on calculations within the optimizer and are not reflective of actual behavior. However, since the numbers that drive the costs are derived from your code, objects, and the row counts in the statistics, it’s a value we use to examine execution plans. A high cost operator may indicate where the problem lies.

Warnings and Errors

These will show as a yellow caution sign or a red “X”. They can be indicators of problems with the code that affect how the execution plan behaves. There are none in Figure 7-2.

Fat Pipes

The arrows that connect the operators represent data flow. Big pipes reflect a lot of data flow, whereas small pipes represent small data flow. Look for the fat pipes to understand how data is being moved. You also have to look for transitions where more and more data is created, or where data is moved from disk and then filtered later. These can be strong indicators of problems.

Scans

A scan indicates that the entire table or index was read in order to retrieve data. This is not necessarily a problem; however, it indicates possible large amounts of I/O and therefore is something to consider when evaluating an execution plan. Please note that seeks, the opposite of scans, can also cause problems depending on the rest of the plan. Index scans such as the one in Figure 7-2 should lead us to question the plan. We have a data set that is filtering down to one row yet it’s scanning an entire clustered index to do that.

Extra Operators

This concept is a little harder to explain. If you’re looking at a plan and you can’t explain what an operator is, or why a given operator is being used, that should draw your eye for consideration. Let’s take a look at the plan for the value “London” in Figure 7-4:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig4_HTML.jpg
Figure 7-4

Execution plan for the value “London”

In this plan we have an Index Scan, a Clustered Index Scan, a Merge Join, and a Sort. When you consider the query in dbo.AddressByCity, you’ll note that there is no ORDER BY command. Therefore, in this case, the Sort operator is a mystery, an extra operator. Why do we have a Sort operation? The answer here is because the optimizer decided that the Merge Join was faster for the larger data set. However, a Merge Join requires that all data be ordered. So a Sort operator was added to satisfy the needs of the Merge Join.

Estimated vs. Actual

You can only compare these values when you’re looking at an Actual plan, which looks like the same thing as an Estimated plan but has additional runtime metrics. In short, to capture an Actual plan, you have to execute the query. If we look at the execution plan for the value of “Mentor” after executing the query with the value of “London,” we can see this comparison in action as shown in Figure 7-5:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig5_HTML.jpg
Figure 7-5

Estimated vs. Actual values compared

Here we’re seeing one of the estimated vs. actual comparisons, the number of rows processed by each operator. You can see that the Nested Loops join processed 434 rows of the 2 that it anticipated. In this case, the estimated number of rows was 2, but the actual was 434. This wide disparity, 21,700%, can be an indicator for why performance is poor.

Even with these guides for looking at execution plans, you’re still going to need to drill down into the properties in order to understand better how execution plans work. However, the guideposts will get you started. In order to evaluate if a plan is better, it’s a good idea to learn how to compare one plan to the other.

Comparing Execution Plans

From the Regressed Query report (as well as other Query Store reports), you have the ability to easily, and quickly, compare execution plans. Yes, you can look at the plans graphically and compare them by clicking on the dots. However, there is a utility for comparing execution plans that shows a lot more detail and functionality.

To access this functionality, you’ll need to shift-click on two of the execution plans (you can only compare two plans at a time). Then click on the toolbar of the report as shown in Figure 7-6:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig6_HTML.jpg
Figure 7-6

The toolbar for comparing two execution plans

When you click on this after selecting two plans, a new window will open looking like Figure 7-7:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig7_HTML.jpg
Figure 7-7

Two execution plans being compared

What you’re seeing in Figure 7-7 are two execution plans being compared in the two panes, top and bottom, on the left. On the right are two sets of properties from a selected operator. Those property values are also being compared. The shading (pink in my case) around the operators in both plans is indicated operators, or even sets of operators, that are common between two plans. So, in our plan, the scan of the index on the Person.Address table is essentially the same in both plans. The other operators are different. This information can be used to help troubleshoot performance either by showing you a common problem needing a solution (in this case a scan of all the data in an index) or by showing you the differences in the plan that are causing poor performance.

Clicking on any operator on the left panes will change the property values on the right side, allowing you to further explore the details and differences. Figure 7-8 is the properties from the common Index Scan operator:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig8_HTML.jpg
Figure 7-8

Common operator in both execution plans

The property values that do not match have the yellow “does not equal” icon in front of those values. You can see that overall, the estimated costs, row size, object information, and the rest are all the same. It’s only the estimated number of rows, node ID, and the estimated operator percentage costs that are different. In fact, the percentage estimates are only different because all the other operators in the two plans are different.

Comparing common operators, especially in this case, a scan, can give you indications where the query performance may be improved by changes to the code or structure. However, the whole point of using plan forcing is to avoid having to modify the code or structure. You mainly want to compare plans so that you’re sure the one you pick to force is more likely to succeed.

When you are looking at operators that are not marked as common between the two plans, in order to compare those operators, you would need to select each one individually. As a general rule, this won’t tell you much about the plans because when different operators are performing different functions between plans, comparing them is of little use.

You combine the performance metrics, the wait statistics, and the information you glean from the execution plans in question. Along with this information, you should also take into account things like the importance of the query, the frequency at which it is being called, and other factors that will help you decide that, in fact, one of these plans will outperform the other for your most common needs.

Once you’ve decided which of the plans you want to force, you have several options to force the plan.

Forcing and Unforcing Execution Plans

When a plan regresses for whatever reason, the single best solution is to make changes, whether that is to code, structure, or statistics. However, it’s just not always possible, or desirable, to make those changes. In this case, we’re going to want to force a plan. There are several things you should know about plan forcing before we get on to how it’s done.

Plan forcing overrides the work of the query optimizer. When you choose to force a plan, that is the plan that will always be used until you unforce, or invalidate, that plan. You can only force a plan that is valid for a given query. If you make changes to the code or the structures, such as dropping an index or something along those lines, it invalidates the plan, forcing is no longer possible.

The query optimizer will still generate a plan if one is not in the cache. However, in the event that you have a forced plan, any other plan generated by the optimizer will be discarded. You can sometimes see a situation where a forced plan looks different than the original plan that you forced. Because the optimizer still goes through the optimization process, if a plan that is morally equivalent (Microsoft’s term) is generated, meaning a plan that for all intents and purposes is exactly the same, that morally equivalent plan will be used. However, in most cases, you’ll see the exact plan that you chose to force.

SQL Server has long had a function that is similar in behavior to plan forcing in the Query Store, plan guides. These were a way to apply a query hint or even suggest an entire execution plan for a query, like plan forcing. However, they are very difficult to use and frequently fail. This is one of many reasons why plan forcing in Query Store is so attractive. If you are using plan guides, plan forcing overrides the plan guides. Although, you’ll still see evidence that guide was applied in both Extended Events and the execution plan itself.

When you mark a plan as forced, that is stored within the Query Store catalog views. This means that plan forcing will survive a restart, a detach, or even a failover in a clustered environment. This is because the fact that a plan is forced has been persisted to disk with the database. The only way to stop plan forcing is to invalidate the plan as described earlier, or choose to stop forcing that plan.

Because data and statistics change over time, it’s a good idea to plan to review forced plans on a regular basis. You want to be sure that the reasons for forcing a particular plan still hold true as things change over time. It’s also a very good idea to be very judicious about plan forcing and only do it when absolutely necessary in order to address regression.

Forcing Plans Through the Reports

Choosing to force a plan is never difficult. It’s especially easy inside the Query Store reports. In the right-hand pane where you can see the performance of the queries over time, you’ll see the execution plans for a query. There may be one, or there may be several. In Figure 7-9 you can see that I have two. They are listed on the right of Figure 7-9 in the little box labeled “Plan ID”:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig9_HTML.jpg
Figure 7-9

Two execution plans in the Query Store report

Once we decide which of the plans we wish to force, we just have to click on that Plan ID value on the right, either 269 or 273 in Figure 7-9. With that selected, the toolbar above has an icon for forcing plans as shown in Figure 7-10:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig10_HTML.jpg
Figure 7-10

Button in the report that lets you force plans

Clicking that button opens a confirmation window. You have the opportunity to choose not to force the plan. Figure 7-11 shows the window:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig11_HTML.jpg
Figure 7-11

Confirming that you wish to force a plan

Clicking on the Yes button will immediately force the plan. Clicking on the No button will of course cancel the process. After you click yes, the plan will be forced. In my case, I’m choosing to force plan 273. When I do this, the report then marks that plan with a check mark. You can see this immediately in the Query Store report as shown in Figure 7-12:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig12_HTML.jpg
Figure 7-12

The plans after picking a plan to force

You can see that a check mark as been placed on plan 273 and will remain there until the plan forcing is removed. This is an immediate indication that the plan has been forced. You’ll also see that the “Force Plan” button is marked as depressed when the plan is forced and not depressed when the plan is not forced.

Another way to force the plan is immediately below the pane we’ve been looking at in Figures 7-9 and 7-12. Another set of buttons are visible as shown in Figure 7-13:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig13_HTML.jpg
Figure 7-13

Plan forcing buttons in the Query Store report

This button works the same as before. Pick the plan you wish to force and select the button. A confirmation window will open and you can acknowledge forcing the plan.

Choosing to remove forcing, or unforcing a plan, is very simple. You have to use the button shown in Figure 7-13. When you have a plan selected that has already been forced, the “Unforce Plan” button will be enabled. You can then remove plan forcing for that query. You’ll again be prompted to confirm that you are unforcing the plan.

Forcing Plans Using T-SQL

You can also force a plan programmatically using T-SQL. The actual functionality of plan forcing through T-SQL is extremely easy. The only trick is that you have to have two pieces of information. You have to have the query_id within the Query Store and you have to have the plan_id. As you can see above, you can retrieve these using the Query Store reports. You can also use T-SQL against the Query Store catalog views as outlined in Chapter 5. With T-SQL you can retrieve the query_id and plan_id. With those in hand, forcing the plan is quite simple:
EXEC sys.sp_query_store_force_plan 262,273;

Since we’re now doing this programmatically, there is no confirmation necessary. Assuming that both ID values are accurate and that the plan chosen is a valid plan, the plan_id, in this case 273, is forced for the query_id, in this case 262. If either ID is invalid, you’ll get an error.

Choosing to unforce the plan programmatically is not very difficult either:
EXEC sys.sp_query_store_unforce_plan 262,273;

Again, there is no confirmation needed when forcing and unforcing programmatically through T-SQL.

Determining Which Plans Have Been Forced

As you’ve seen in Figure 7-12, it’s very easy to see if a plan has been forced. There is even a report that lists forced plans. We’ll go over that report in Chapter 4 in some detail. The question should be, is there a way to programmatically tell that a plan has been forced. The answer to this is, of course, yes. You just have to query the Query Store catalog views:
SELECT qsq.query_id,
       qsp.plan_id,
       qsp.is_forced_plan
FROM sys.query_store_query AS qsq
    JOIN sys.query_store_plan AS qsp
        ON qsp.query_id = qsq.query_id
WHERE qsq.query_id = 262;
The results of this query look like Figure 7-14:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig14_HTML.jpg
Figure 7-14

Showing which query plan is forced programmatically

You can quickly see that the plan_id 273 has been forced since it’s is_forced_plan value is set to 1, while the other plan, 269, is not forced.

You can also see if a plan is forced by looking at some execution plans. You won’t see it on plans where you just capture an estimated plan if the plan is not yet in cache. You won’t see it on a plan from the Query Store itself. You will see this on pretty much any other plan. What you’re looking for is a little obscure. We’ll need to go the properties of the first operator and look for the “Use Plan” property value as shown in Figure 7-15:
../images/473933_1_En_7_Chapter/473933_1_En_7_Fig15_HTML.jpg
Figure 7-15

The “Use Plan” property in the first operator

This is only available as a property in the first operator. It’s not visible in the tooltip. You’ll only see this on a plan that has been forced. You won’t even see the property on any other plan, so there is no “Use Plan” that will resolve to False.

Conclusion

Query regression is a real problem. The best way to solve it still remains modifying the code, structure, or statistics to ensure a more appropriate plan is generated. However, when you must, you do have the capability of forcing an execution plan. There are plenty of ways to keep an eye on plan forcing, so you can adjust them as necessary when the structure or code changes or other changes are made to the system that may affect plan generation. As already mentioned, only use this functionality when you have to and have a plan for a regular review of the queries that have forced plans.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.128.31.180