While the information collected in the Query Store and all that it tells you about the performance of your system is the part of Query Store that you’ll use more frequently, the most powerful part of Query Store is the ability to force execution plans. Plan forcing is where the choices made by the optimizer during a compile or recompile event are superceded by a plan you select. Query Store provides enough information between the performance metrics and the execution plan to enable you to decide that, under some circumstances, you can pick a superior execution plan.
This chapter will show you how to use Query Store information and reports to identify a poorly behaving execution plan. With a badly behaving plan identified, we’ll explore how to use the information in Query Store to identify a well-behaved plan, if one exists. With this information in hand, we can then force the good plan to be used.
Identifying Badly Behaved Execution Plans
Execution plans, also called query plans or show plans, are your window into the choices made by the query optimization process. They show how your T-SQL, indexes, statistics, tables, columns, and constraints make up your database, and its queries will be used to retrieve or modify the data stored there. Reading execution plans is a very dense topic. For a detailed look, you should read the book Execution Plans by Grant Fritchey (one of the co-authors of this book). Here we’ll explore the information and reports that may suggest you have a poorly performing query. We’ll then use that to lead us to an execution plan. We’ll cover a few of the guideposts that can help you tell that a plan might not be supporting a query well, but we won’t be covering all the details of how to read an execution plan.
To get started, we have to understand how we can use Query Store to identify a query that has both changed its execution plan and how that plan caused performance to degrade. Another way you can identify poorly performing queries is when you get alerted to issues from the consumers of your database that performance is poor but with the information in the Query Store, we can be more proactive than that.
Identifying Regressed Queries
A regression is something that occurs when a query was behaving well and now performs poorly. You may have a query that runs badly from the start. These are easily identified and dealt with. You may have a query that slowly degrades in performance over time as more and more data is added to the system. These are also easily identified and dealt with. The real issues come when you have queries that were performing well and suddenly, sometimes intermittently, run slowly. Prior to Query Store, these were frequently difficult problems to solve.
The causes for these regressions are varied. You might see them when you migrate from earlier versions of SQL Server to later versions of SQL Server. These are caused by changes to how the query optimizer works and the introduction in SQL Server 2014 of a new cardinality estimation engine. You may see a regression caused by statistics on your database being incorrect or out of date. You could also be suffering from a problem known as bad parameter sniffing.
While incorrect and out-of-date statistics are probably the most common cause of regressions in query performance, the most talked about problem, and the one we’ll use for our examples in this chapter, is bad parameter sniffing. Let’s start by explaining what parameter sniffing is and how it can sometimes cause problems.
When you have a parameterized query, whether that’s a stored procedure, a parameterized query from an ORM tool such as Entity Framework, or parameters in sp_executesql, the query optimizer will use the values passed to that parameter when compiling an execution plan. It takes those exact values and uses them to look at the statistics for the column or index being referenced and creates and execution plan based on those precise values. This sampling of values is what is known as parameter sniffing. The majority of the time, this is a benign practice. A lot of the time, this is a very helpful practice. Sometimes though, you’ll see a situation where a plan gets created for one value that performs well enough for the query immediately called. However, all other queries perform very poorly with that same execution plan. This situation is known as bad parameter sniffing.
When you hit a situation where a query was performing well, but is suddenly performing poorly, you can use the information in Query Store to identify both the query metrics that define the performance and get a look at the execution plans both for when the query was running well and when it was running badly. You can do this one of two ways. You can run T-SQL code against the data collected in Query Store. Alternatively, you can use the Regressed Query report. Since the Regressed Query report may not always identify a query that is giving you problems as a regressed query, it’s a very good idea to know how to retrieve this information from the Query Store data yourself.
Identifying Regressed Queries from Query Store Data
We discussed the information collected by the Query Store in Chapter 2. We then went over the basics of how to retrieve data from the Query Store catalog views in Chapter 5. If you don’t get what we’re doing with the queries below, I’d recommend going back and reviewing those chapters before proceeding with this one.
This is a classic case of parameter sniffing resulting in different execution plans. Each plan is optimized for the result set it is returning; 434 rows for the value “London” and one row for the value “Mentor.” However, each plan causes poor performance for the opposite data set. What I mean here is that the plan to return 434 rows runs fast when returning 434 rows, but it doesn’t run as fast as the other plan when returning only a few rows. The opposite is also true. In fact, the query will run so much slower when using the plan for “Mentor” when returning 434 rows that it will be marked as a regressed query in the Regressed Query report.
Regressed Query Report
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig1_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig1_HTML.jpg)
Regressed Query report showing poor performance
The first pane, on the left, is a listing of the queries based on the metric you’re measuring (duration by default). The second pane in the report shows query execution speed and the different plans involved. Clicking on one of the dots will change the plan being displayed in the bottom pane. Hovering over the dot will show how many executions it represents, average runtime, and more. Again, for more details on using the reports, see Chapter 4.
You should be able to see the query in the Regressed Query report and it should show two different execution plans the same as Figure 7-1.
Warning Signs in Execution Plans
Regressed query behavior is driven by changes to the execution plans. There isn’t room in this book to cover everything you need to know to read execution plans. For a detailed examination on that topic, we recommend Grant Fritchey’s book, SQL Server Execution Plans (Redgate Press, 2018). However, there a few warning signs we can look for in execution plans as a quick set of guides. Just understand, these will only guide you initially. You’ll have to eventually learn the details to read and understand the information inside execution plans.
First Operator: A detailed collection of information about the plan itself
Most Costly Operator: The highest estimated cost showing a highly likely culprit for poor performance
Warnings and Errors: Issues experienced by the optimization process, if any
Fat Pipes: An indication of data flow, with wider pipes showing more data
Scans: An indicator of larger data movement
Extra Operators: Operators that you can’t explain readily why they are there or what they are doing
Estimates vs. Actuals: The comparisons between the estimated number of rows or executions for an operator and the actual number of either
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig2_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig2_HTML.jpg)
Execution plan for the value “Mentor”
We’ll now walk through the guideposts looking for information about this plan.
First Operator
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig3_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig3_HTML.jpg)
Partial list of first operator properties
This is only a partial of all the properties and information exposed through the first operator of an execution plan. When looking at plan regressions, chances are high that you may be seeing problems caused by changes in data or statistics, or parameter sniffing. In that case, the Parameter List property shows you the parameter or parameters used to compile the execution plan. You can see in Figure 7-3 that this plan used the value “Mentor.”
Most Costly Operator
The plan shown in Figure 7-2 is very simple and easy to read. You can quickly spot that the Index Scan operator is 93% of the estimated cost. These costs are based on calculations within the optimizer and are not reflective of actual behavior. However, since the numbers that drive the costs are derived from your code, objects, and the row counts in the statistics, it’s a value we use to examine execution plans. A high cost operator may indicate where the problem lies.
Warnings and Errors
These will show as a yellow caution sign or a red “X”. They can be indicators of problems with the code that affect how the execution plan behaves. There are none in Figure 7-2.
Fat Pipes
The arrows that connect the operators represent data flow. Big pipes reflect a lot of data flow, whereas small pipes represent small data flow. Look for the fat pipes to understand how data is being moved. You also have to look for transitions where more and more data is created, or where data is moved from disk and then filtered later. These can be strong indicators of problems.
Scans
A scan indicates that the entire table or index was read in order to retrieve data. This is not necessarily a problem; however, it indicates possible large amounts of I/O and therefore is something to consider when evaluating an execution plan. Please note that seeks, the opposite of scans, can also cause problems depending on the rest of the plan. Index scans such as the one in Figure 7-2 should lead us to question the plan. We have a data set that is filtering down to one row yet it’s scanning an entire clustered index to do that.
Extra Operators
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig4_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig4_HTML.jpg)
Execution plan for the value “London”
In this plan we have an Index Scan, a Clustered Index Scan, a Merge Join, and a Sort. When you consider the query in dbo.AddressByCity, you’ll note that there is no ORDER BY command. Therefore, in this case, the Sort operator is a mystery, an extra operator. Why do we have a Sort operation? The answer here is because the optimizer decided that the Merge Join was faster for the larger data set. However, a Merge Join requires that all data be ordered. So a Sort operator was added to satisfy the needs of the Merge Join.
Estimated vs. Actual
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig5_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig5_HTML.jpg)
Estimated vs. Actual values compared
Here we’re seeing one of the estimated vs. actual comparisons, the number of rows processed by each operator. You can see that the Nested Loops join processed 434 rows of the 2 that it anticipated. In this case, the estimated number of rows was 2, but the actual was 434. This wide disparity, 21,700%, can be an indicator for why performance is poor.
Even with these guides for looking at execution plans, you’re still going to need to drill down into the properties in order to understand better how execution plans work. However, the guideposts will get you started. In order to evaluate if a plan is better, it’s a good idea to learn how to compare one plan to the other.
Comparing Execution Plans
From the Regressed Query report (as well as other Query Store reports), you have the ability to easily, and quickly, compare execution plans. Yes, you can look at the plans graphically and compare them by clicking on the dots. However, there is a utility for comparing execution plans that shows a lot more detail and functionality.
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig6_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig6_HTML.jpg)
The toolbar for comparing two execution plans
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig7_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig7_HTML.jpg)
Two execution plans being compared
What you’re seeing in Figure 7-7 are two execution plans being compared in the two panes, top and bottom, on the left. On the right are two sets of properties from a selected operator. Those property values are also being compared. The shading (pink in my case) around the operators in both plans is indicated operators, or even sets of operators, that are common between two plans. So, in our plan, the scan of the index on the Person.Address table is essentially the same in both plans. The other operators are different. This information can be used to help troubleshoot performance either by showing you a common problem needing a solution (in this case a scan of all the data in an index) or by showing you the differences in the plan that are causing poor performance.
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig8_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig8_HTML.jpg)
Common operator in both execution plans
The property values that do not match have the yellow “does not equal” icon in front of those values. You can see that overall, the estimated costs, row size, object information, and the rest are all the same. It’s only the estimated number of rows, node ID, and the estimated operator percentage costs that are different. In fact, the percentage estimates are only different because all the other operators in the two plans are different.
Comparing common operators, especially in this case, a scan, can give you indications where the query performance may be improved by changes to the code or structure. However, the whole point of using plan forcing is to avoid having to modify the code or structure. You mainly want to compare plans so that you’re sure the one you pick to force is more likely to succeed.
When you are looking at operators that are not marked as common between the two plans, in order to compare those operators, you would need to select each one individually. As a general rule, this won’t tell you much about the plans because when different operators are performing different functions between plans, comparing them is of little use.
You combine the performance metrics, the wait statistics, and the information you glean from the execution plans in question. Along with this information, you should also take into account things like the importance of the query, the frequency at which it is being called, and other factors that will help you decide that, in fact, one of these plans will outperform the other for your most common needs.
Once you’ve decided which of the plans you want to force, you have several options to force the plan.
Forcing and Unforcing Execution Plans
When a plan regresses for whatever reason, the single best solution is to make changes, whether that is to code, structure, or statistics. However, it’s just not always possible, or desirable, to make those changes. In this case, we’re going to want to force a plan. There are several things you should know about plan forcing before we get on to how it’s done.
Plan forcing overrides the work of the query optimizer. When you choose to force a plan, that is the plan that will always be used until you unforce, or invalidate, that plan. You can only force a plan that is valid for a given query. If you make changes to the code or the structures, such as dropping an index or something along those lines, it invalidates the plan, forcing is no longer possible.
The query optimizer will still generate a plan if one is not in the cache. However, in the event that you have a forced plan, any other plan generated by the optimizer will be discarded. You can sometimes see a situation where a forced plan looks different than the original plan that you forced. Because the optimizer still goes through the optimization process, if a plan that is morally equivalent (Microsoft’s term) is generated, meaning a plan that for all intents and purposes is exactly the same, that morally equivalent plan will be used. However, in most cases, you’ll see the exact plan that you chose to force.
SQL Server has long had a function that is similar in behavior to plan forcing in the Query Store, plan guides. These were a way to apply a query hint or even suggest an entire execution plan for a query, like plan forcing. However, they are very difficult to use and frequently fail. This is one of many reasons why plan forcing in Query Store is so attractive. If you are using plan guides, plan forcing overrides the plan guides. Although, you’ll still see evidence that guide was applied in both Extended Events and the execution plan itself.
When you mark a plan as forced, that is stored within the Query Store catalog views. This means that plan forcing will survive a restart, a detach, or even a failover in a clustered environment. This is because the fact that a plan is forced has been persisted to disk with the database. The only way to stop plan forcing is to invalidate the plan as described earlier, or choose to stop forcing that plan.
Because data and statistics change over time, it’s a good idea to plan to review forced plans on a regular basis. You want to be sure that the reasons for forcing a particular plan still hold true as things change over time. It’s also a very good idea to be very judicious about plan forcing and only do it when absolutely necessary in order to address regression.
Forcing Plans Through the Reports
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig9_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig9_HTML.jpg)
Two execution plans in the Query Store report
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig10_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig10_HTML.jpg)
Button in the report that lets you force plans
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig11_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig11_HTML.jpg)
Confirming that you wish to force a plan
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig12_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig12_HTML.jpg)
The plans after picking a plan to force
You can see that a check mark as been placed on plan 273 and will remain there until the plan forcing is removed. This is an immediate indication that the plan has been forced. You’ll also see that the “Force Plan” button is marked as depressed when the plan is forced and not depressed when the plan is not forced.
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig13_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig13_HTML.jpg)
Plan forcing buttons in the Query Store report
This button works the same as before. Pick the plan you wish to force and select the button. A confirmation window will open and you can acknowledge forcing the plan.
Choosing to remove forcing, or unforcing a plan, is very simple. You have to use the button shown in Figure 7-13. When you have a plan selected that has already been forced, the “Unforce Plan” button will be enabled. You can then remove plan forcing for that query. You’ll again be prompted to confirm that you are unforcing the plan.
Forcing Plans Using T-SQL
Since we’re now doing this programmatically, there is no confirmation necessary. Assuming that both ID values are accurate and that the plan chosen is a valid plan, the plan_id, in this case 273, is forced for the query_id, in this case 262. If either ID is invalid, you’ll get an error.
Again, there is no confirmation needed when forcing and unforcing programmatically through T-SQL.
Determining Which Plans Have Been Forced
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig14_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig14_HTML.jpg)
Showing which query plan is forced programmatically
You can quickly see that the plan_id 273 has been forced since it’s is_forced_plan value is set to 1, while the other plan, 269, is not forced.
![../images/473933_1_En_7_Chapter/473933_1_En_7_Fig15_HTML.jpg](http://images-20200215.ebookreading.net/2/3/3/9781484250044/9781484250044__query-store-for__9781484250044__images__473933_1_En_7_Chapter__473933_1_En_7_Fig15_HTML.jpg)
The “Use Plan” property in the first operator
This is only available as a property in the first operator. It’s not visible in the tooltip. You’ll only see this on a plan that has been forced. You won’t even see the property on any other plan, so there is no “Use Plan” that will resolve to False.
Conclusion
Query regression is a real problem. The best way to solve it still remains modifying the code, structure, or statistics to ensure a more appropriate plan is generated. However, when you must, you do have the capability of forcing an execution plan. There are plenty of ways to keep an eye on plan forcing, so you can adjust them as necessary when the structure or code changes or other changes are made to the system that may affect plan generation. As already mentioned, only use this functionality when you have to and have a plan for a regular review of the queries that have forced plans.